Handling Data Trees in Grasshopper Python Scripts
Decorate your functions swiftly and easily with TreeHandler
Mar 24 / 2020
____________________
Published on Medium
In Grasshopper, all data are stored in Data Trees — a custom data structure which encapsulates information passed between various components. Normally the handling of a Data Tree is done automatically by Grasshopper, but once you start scripting your very own custom components, you may find Data Tree handling a less than intuitive process, especially when dealing with input trees with different shapes.
In the following tutorial, we will introduce treehandler, a lightweight utility library designed to simplify Data Tree handling in a GhPython module. This tutorial targets those familiar with Python scripting in Grasshopper. If you are new to working with Grasshopper Data Trees or the GhPython module, I have included a few pointers in Additional Resources to help you get started. All Grasshopper definitions used in this tutorial can be downloaded from food4Rhino.
Content Overview
1. Motivation
2. Installation
3. Usage
4. Performance
5. Notes
6. Additional Resources
Motivation
Whether you are performing simple tasks or creating complex components with the GhPython module, it is always preferable to stay consistent with how Grasshopper handles Data Trees by default, so that users new to your script can rely on their existing intuition about Data Trees. The following code snippets cover two tasks at different levels of complexity.
# Snippet of a simple task
import rhinoscriptsyntax as rs
import Rhino.Geometry as rg

def standardCone(base, radius):
    """Returns a standard conic surface on the XY plane
    where its radius equals its height"""
    apex = rs.PointAdd(base, rg.Vector3d(0, 0, radius))
    return rs.AddCone(base, apex, radius)

a = standardCone(base, radius)
In this simple task, we’ll create a conic surface by calling the rhinoscriptsyntax function AddCone(), whose inputs base and radius have a one-to-one correspondence. Thus, we will set both inputs to have item access, meaning that data will be piped into the component one item at a time. On the other hand, should our function take a list of objects as input, for instance AddPolyline(points), we will need to set the input to have list access.
So what happens when we change radius to a list of numbers? If we test our script in Grasshopper, we see that Grasshopper performs a number of implicit cycles equal to the length of the radius list. Since we set both inputs to have item access, Grasshopper automatically iterates through the input one item at a time. Although there is only one base point but three radii, Grasshopper automatically “extends” the list of base by repeating its last element until it has the same length as radius. This is Grasshopper’s default handling of inputs with different sizes.
It’s worth interjecting that data do not truly exist as a ‘list’ or as an ‘item’. Data only exist in the form of a Data Tree. A ‘list’ is but a Data Tree with one branch which contains a list of data. Likewise, an ‘item’ is a ‘list’ (actually a Data Tree) that holds a single data entry. If this seems unfamiliar to you, check out Andrew Heumann’s tutorial on Data Trees. In short, as long as we are not performing more complex tasks, we can typically rely on Grasshopper to autopilot data retrieval, provided that we set the component’s inputs to the correct access.
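The “extend by repeating the last element” behavior described above can be mimicked outside Grasshopper. The following is a minimal plain-Python sketch, not Grasshopper’s actual implementation: the helper name match_longest, the dict-based stand-in for a one-branch Data Tree, and the placeholder string "P0" are all assumptions made for illustration, and every list is assumed to be non-empty.

```python
def match_longest(*lists):
    """Pad every list to the longest length by repeating its last element."""
    longest = max(len(lst) for lst in lists)
    padded = [lst + [lst[-1]] * (longest - len(lst)) for lst in lists]
    return list(zip(*padded))

# An 'item' and a 'list' written as one-branch trees: {path: [data]}
base_tree = {(0,): ["P0"]}              # a single base point (placeholder)
radius_tree = {(0,): [1.0, 2.0, 3.0]}   # three radii

# One implicit cycle per pair: the lone base point is reused for each radius
pairs = match_longest(base_tree[(0,)], radius_tree[(0,)])
for base, radius in pairs:
    print(base, radius)
```

Each printed pair corresponds to one implicit cycle of the component.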
But what if it becomes necessary to directly access Data Trees? Let’s take a look at the following code snippet. In this example, we will split a grid of cones into four quadrants, and use four attractor points to influence the sizes of the cones in each quadrant. We will assume that each attractor only affects the cones within its designated quadrant.
# Snippet of a more complex task:
from Grasshopper import DataTree
from Grasshopper.Kernel.Data import GH_Path
import rhinoscriptsyntax as rs
import Rhino.Geometry as rg
import math

def standardCone(base, radius):
    """Returns a standard conic surface on a XY plane
    where its radius equals its height"""
    apex = rs.PointAdd(base, rg.Vector3d(0, 0, radius))
    return rs.AddCone(base, apex, radius)

def makeQuadrants(ptGrid):
    """Divide input point grid into four quadrants"""
    result = DataTree[object]()
    div = int(math.sqrt(ptGrid.DataCount / 4))
    for i, branch in enumerate(ptGrid.Branches):
        for j, item in enumerate(branch):
            result.Add(item, GH_Path(0, i//div, j//div))
    return result

def dist(pt1, pt2, cutoff=15):
    """Returns the distance between two points.
    Returns cutoff if the distance exceeds it."""
    return min(rs.Distance(pt1, pt2), cutoff)

def scale(scalar, val, bounds):
    """Returns a new value scaled by the scalar,
    but capped within bounds"""
    result = scalar*val
    lower, upper = bounds
    return max(min(result, upper), lower)

# split base and attractors into quadrants
# in essence modifying Data Tree topology
base = makeQuadrants(base)
attractor = makeQuadrants(attractor)
scalar = dist(base, attractor) * factor
radius = scale(scalar, radius, bounds=(0.1, radius))
a = standardCone(base, radius)
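The path arithmetic inside makeQuadrants() can be checked without Rhino. The sketch below repeats the same index computation in plain Python, using a dict keyed by path tuples as a stand-in for DataTree and a list of rows as a stand-in for the input branches. Both stand-ins, and the name make_quadrants, are assumptions made for illustration. It also assumes a square grid with an even side length; other shapes would not map cleanly onto four branches.

```python
import math

def make_quadrants(grid):
    """Regroup a square grid (a list of rows) into four quadrant branches.

    Mirrors the path arithmetic of makeQuadrants(), but stores the result in
    a dict keyed by path tuples instead of a Grasshopper DataTree.
    """
    count = sum(len(row) for row in grid)
    div = int(math.sqrt(count // 4))  # side length of one quadrant
    result = {}
    for i, row in enumerate(grid):
        for j, item in enumerate(row):
            result.setdefault((0, i // div, j // div), []).append(item)
    return result

# 6 branches of 6 points each, labeled by (row, column)
grid = [[(i, j) for j in range(6)] for i in range(6)]
quadrants = make_quadrants(grid)
for path in sorted(quadrants):
    print(path, len(quadrants[path]))
```

For the 6 x 6 grid, div evaluates to 3, so row and column indices 0 to 2 map to 0 and indices 3 to 5 map to 1, giving four paths with nine points each.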
To split a grid of base points into four quadrants, the easiest route is to directly manipulate the Data Tree’s topology (from a Data Tree of 6 branches with 6 points per branch to one of 4 branches with 9 points per branch). We can do so by calling makeQuadrants(), which can also be reused to split the attractors into quadrants. To use Data Trees as function arguments, we must set the input access of base and attractor to tree access.
All seems well until we call dist(), at which point the GhPython interpreter raises the following error:
Runtime error (ValueErrorException): Could not convert tree {9;9;9;9} to a Point3d
This is because dist(), scale(), and standardCone() expect inputs with item access, that is, data items piped in one at a time. The Python interpreter therefore has no idea what to do with Data Trees passed as arguments.
There are generally two ways around such a hurdle. We can either move all functions that follow makeQuadrants() into a new GhPython component, set all of its inputs to item access, and let Grasshopper handle data retrieval; or we can manually loop through the input Data Trees and retrieve the data ourselves.
The first method is a quick fix that may turn out to be cumbersome: whenever Data Trees need to be processed directly, that step must be wrapped inside a stand-alone component. A well-encapsulated component may thus explode into a handful of unclear and unjustified GhPython components that are hard to maintain and reuse.
The second method is a rather complex procedure, for we must manually match input Data Trees with different topologies (dimensional depth and breadth). This involves implicitly “grafting” branches with “inferior” dimensions onto ones with higher dimensions (e.g., {0; 1} to {0; 0; 1}), matching branches with different numbers of child branches, and finally, matching branches containing different numbers of items.
While most tutorials advise against manual processing of Data Trees, fear not! We can now use treehandler to swiftly and easily handle Data Trees as input while yielding results consistent with those returned by Grasshopper.
Installation
Option 1:
Git clone ghpythonutil into Rhino’s IronPython library. Choose one of several destination <PATH> candidates by running the following in a GhPython component:
# in a GhPython component
# Choose a <PATH> from sys.path
import sys
for path in sys.path:
    print(path)
# in a linux terminal run the following
$ git clone https://github.com/v-machine/gh_python_util
$ mv -nv gh_python_util/src/ghpythonutil <PATH>/ghpythonutil
$ rm -rf gh_python_util
Option 2:
If you are unfamiliar with git, you can download the library from food4Rhino. As in Option 1, print out the directory paths, choose one, and copy the folder ghpythonutil into it.
Verify the installation by running the following import statement in a GhPython component:
from ghpythonutil.treehandler import TreeHandler
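If the bare import fails, the traceback alone may not make the cause obvious. As a hedged convenience, the check can be wrapped in a small helper that reports the result explicitly. The helper name is_importable is hypothetical and is not part of ghpythonutil; it relies only on the standard importlib module.

```python
import importlib

def is_importable(module_name):
    """Return True if the named module can be imported, False otherwise."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False

if is_importable("ghpythonutil.treehandler"):
    print("treehandler found")
else:
    print("treehandler not found; check that ghpythonutil sits in a sys.path folder")
```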
Usage
treehandler.TreeHandler is a function decorator for handling Data Trees as inputs in user-defined functions. Calls to decorated functions avoid the implicit looping behavior triggered by component inputs with item access or list access. The decorator handles DataTree input in a fashion identical to any default Grasshopper component.
To use TreeHandler, simply decorate the function definition with @TreeHandler. For instance, in the previous complex task, we right-click on every input of the GhPython component and choose tree access. We can then rewrite the function dist() as follows:
from ghpythonutil.treehandler import TreeHandler

@TreeHandler
def dist(pt1, pt2, cutoff=15, access=["item", "item"]):
    """Returns the distance between two points"""
    return min(rs.Distance(pt1, pt2), cutoff)
Notice that we have added a default argument access, a list of access keywords: ["item", "item"]. The definition of a decorated function must include the access default argument. This is because the GhPython component now receives every input as a Data Tree, so we must manually pass the actual access types to TreeHandler for the inputs to be parsed correctly when the decorated function is called.
Specify list access if you know that the function takes a list of data as an argument. For instance:
@TreeHandler
def polyLine(vertices, access=["list"]):
    """Returns a polyline by connecting a list of vertices"""
    return rs.AddPolyline(vertices)
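To build intuition for what the access keywords control, here is a toy decorator written in plain Python. It is a hypothetical, heavily simplified stand-in and not TreeHandler’s implementation. Trees are plain dicts mapping path tuples to lists, all input trees are assumed to share the same paths and to have non-empty branches, and, unlike TreeHandler, the access list is passed to the decorator itself to keep the sketch short.

```python
import functools
import math

def toy_tree_handler(access):
    """Hypothetical, simplified stand-in for a tree-aware decorator.

    Trees are dicts {path tuple: list of data}. Every input tree is assumed
    to share the same paths and to have non-empty branches.
    """
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*trees):
            result = {}
            for path in sorted(trees[0]):
                branches = [tree[path] for tree in trees]
                # Only 'item' inputs drive the loop; 'list' inputs pass whole.
                item_sizes = [len(b) for b, a in zip(branches, access)
                              if a == "item"]
                cycles = max(item_sizes) if item_sizes else 1
                result[path] = []
                for k in range(cycles):
                    # Shorter 'item' branches repeat their last element.
                    args = [b[min(k, len(b) - 1)] if a == "item" else b
                            for b, a in zip(branches, access)]
                    result[path].append(func(*args))
            return result
        return wrapper
    return decorate

@toy_tree_handler(access=["item", "item"])
def dist(pt1, pt2):
    """One-to-one: distance between two (x, y) tuples."""
    return math.hypot(pt1[0] - pt2[0], pt1[1] - pt2[1])

@toy_tree_handler(access=["list"])
def count_vertices(vertices):
    """Many-to-one: consumes a whole branch and returns one value."""
    return len(vertices)

points = {(0, 0): [(0, 0), (3, 4)], (0, 1): [(6, 8)]}
attractor = {(0, 0): [(0, 0)], (0, 1): [(0, 0)]}

print(dist(points, attractor))
print(count_vertices(points))
```

With item access the function runs once per matched item, so the output keeps the shape of the largest input. With list access it runs once per branch, so each output branch holds a single value.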
Note that there’s no need to decorate every function that handles Data Trees. In fact, all the subtasks (dist(), scale(), and standardCone()) can be wrapped inside a main() function, which will be the only decorated function. Thus, we can modify our previous code snippet by replacing the last three function calls with a decorated main() function:
from ghpythonutil.treehandler import TreeHandler

# ...function definitions same as previous code snippet

base = makeQuadrants(base)
attractor = makeQuadrants(attractor)

@TreeHandler
def main(base, radius, attractor, factor=0.1,
         access=["item", "item", "item"]):
    scalar = dist(base, attractor) * factor
    radius = scale(scalar, radius, bounds=(0.1, radius))
    return standardCone(base, radius)

a = main(base, radius, attractor)
Note that makeQuadrants() does not need to be decorated, since transforming the topology of the input Data Tree requires us to manipulate path indices explicitly. For all other functions, we can simply delegate Data Tree handling to TreeHandler.
Normally we can verify that our code runs as expected by inspecting the input and output Data Trees. However, since we have modified the Data Tree topology of our inputs within the component, we need to manually inspect both the arguments and the return value of main().
print("Input Data Trees Topology")
for arg in (base, radius, attractor):
    print(arg.TopologyDescription)

print("Output Data Trees Topology")
print(a.TopologyDescription)
With main() being a function with “one-to-one” mapping (i.e., all inputs have item access), we can verify that the returned Data Tree has the expected topology, which normally coincides with that of the largest input. Note that functions with other mappings, such as “many-to-one” (e.g., AddPolyline()) and “one-to-many” (e.g., DivideCurve()), may yield different output topologies.
Input Data Trees Topology
Tree (Branches = 4)
{0;0;0} (N = 9)
{0;0;1} (N = 9)
{0;1;0} (N = 9)
{0;1;1} (N = 9)
Tree (Branches = 1)
{0} (N = 1)
Tree (Branches = 4)
{0;0;0} (N = 1)
{0;0;1} (N = 1)
{0;1;0} (N = 1)
{0;1;1} (N = 1)
Output Data Trees Topology
Tree (Branches = 4)
{0;0;0} (N = 9)
{0;0;1} (N = 9)
{0;1;0} (N = 9)
{0;1;1} (N = 9)
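Rather than comparing such listings by eye, the same check can be expressed as a comparison of path-to-count mappings. The sketch below works on dict-based stand-ins for Data Trees (an assumption made for illustration, as are the helper names topology and describe), so it runs outside Grasshopper; inside a GhPython component the counts would have to be read from the actual DataTree objects.

```python
def topology(tree):
    """Map each path to its item count, e.g. {(0, 0, 0): 9, ...}."""
    return {path: len(items) for path, items in tree.items()}

def describe(tree):
    """Format a dict-based tree in the style of the listing above."""
    lines = ["Tree (Branches = %d)" % len(tree)]
    for path in sorted(tree):
        label = ";".join(str(index) for index in path)
        lines.append("{%s} (N = %d)" % (label, len(tree[path])))
    return "\n".join(lines)

paths = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)]
base = {path: list(range(9)) for path in paths}    # stand-in for the base input
output = {path: ["cone"] * 9 for path in paths}    # stand-in for the result

print(describe(output))
# For a one-to-one function, output and largest input should agree
print(topology(output) == topology(base))
```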
We can also verify the result by visual inspection. As intended, we see that an attractor only affects cones within the same quadrant. We can also inspect individual Data Tree branches using the default Grasshopper Tree Branch component. By isolating each branch, we observe that the resulting grid of cones is indeed divided into quadrants.
Performance
There have long been discussions regarding the performance of the GhPython module, which has improved drastically with the release of Rhino 6. The tests below compare the runtime of Data Trees processed in a native Grasshopper component, in a GhPython component with implicit looping, with manual looping, and with multithreading, and finally in a GhPython component using the TreeHandler decorator. Runtimes are measured on input sizes N = {2601, 10201, 40401, 90601, 160801, 250001}.
These results are by no means objective or comprehensive, owing to hardware limitations and unavoidable fluctuations in the results even when the same test is run consecutively.
One-to-One Function (with flattened input)
The first test calls the Rhino.Geometry.Circle() function. For flattened input (a single branch), the performance of TreeHandler is on par with the native Grasshopper component and with manual or parallel iteration over Data Trees. The slowest option (3x slower) is implicit looping with type hints enabled on the GhPython component. Prior discussions have noted that type hints seem to have a visible impact on performance; without them, however, certain function calls require typecasting. To control for this variable, we disable type hints for all subsequent tests.
One-to-One Function
Next, we measure runtime on unflattened inputs with multiple branches, while the function call remains unchanged. We observe that even with type hints turned off, GhPython with implicit looping still takes twice as long to finish the same task. Overall, TreeHandler runs on par with the other options, if not marginally faster.
Many-to-One Function
In this test, the task is to connect a list of input points into a polyline (hence many-to-one) using the Rhino.Geometry.Polyline() function. Triggered by vertices with list access, the GhPython module’s implicit looping is still the slowest. However, we also observe that almost all other methods outperform TreeHandler, with the native Grasshopper component running the fastest.
One-to-Many Function
In our last test, we divide a curve and obtain a set of division points using rhinoscriptsyntax.DivideCurve(). I should point out that this is not a fair comparison, for the native Grasshopper Divide Curve component returns three values versus the single value returned by the GhPython components. From this, we can infer that Divide Curve probably executes a more complicated algorithm, hence the longer runtime. Setting aside this outlier, we see that TreeHandler performs squarely in the middle, on par with manual looping and the multi-threaded implementation.
Notes
- In our performance tests, we mostly called functions from rhinoscriptsyntax and Rhino.Geometry. You’re welcome to use TreeHandler to decorate functions from ghpythonlib.components, though be mindful of the performance decrease, for this essentially wraps a Grasshopper component within a GhPython component.
- Functions decorated with TreeHandler handle Data Trees exactly like any native Grasshopper component, regardless of whether the Data Tree has been grafted, flattened, or simplified, and irrespective of whether the input Data Trees have mismatched shapes (topologies). What has yet to be tested is the performance on mismatched Data Trees, which is likely to be slower due to branch matching.
- Should you forget to set the component inputs to tree access, TreeHandler will implicitly cast any item and list into trees. However, this comes without the added benefit of runtime speed-up, for item access and list access inadvertently trigger the component’s implicit looping behavior.
- Parallelization of a decorated function is not supported yet, although it is definitely possible to further optimize the performance of TreeHandler with leaner data structures, faster algorithms, and multi-threading. These are all subjects of experiment before the next release.
Additional Resources
- Intro to Data Tree
- Andrew Heumann’s Rules for Healthy and Happy Data Trees
- Data Tree to Python list and vice versa
- Scripting Geometries in Grasshopper Python