[RBASE-L] - Re: Hierarchical Data ... Redux

Ben Petersen Mon, 14 Apr 2008 16:01:58 -0700

> I agree with Jim Bentley (and perhaps Ben, too) that some of thisstuff is "mind-boggling"


Not "Ben too"... better, "Ben especially" <g>

Your sales tax scenario is very interesting, never would have occurredto me, but there's nothing new about that <sigh>. I'm a little confusedabout the nodelevel consideration?

I'm thinking the next evolutionary step for this, when I do anotherproject using it, is to extract the data into an xml document. Thatwould seem to be the perfect bridge between an Sql database andsomething inherently hierarchical and rather universal.

so I have had my challenges,including the fear of not beingable to understand it.


For me personally, I don't know why it was so difficult to get through my think head 
that the left/right values always bracketed subordinate data. And if those values are 
sequential there is no subordinate data. Oh well <g>.


Thanks for sharing your progress and insights.

Ben





Wills, Steve wrote:

Ben, it took a bit of effort, goin' at it in fits and starts, but I did get it 
"whipped".
Below, for any who may be interested, is the result of my effort, with thanks to Ben (and Celko). If you look at this, you might be able to see why it could be useful for data that is hierarchical in structure, but in a 2-D database table. Although there is more to it, if I were to associate, let's say, Sales Tax Revenues for all rows at level 4, municipality, then I could select Tennessee, at level 2, and run a query with a 'SUM(SalesTaxRevenues) WHERE NodeLeftSide > 3 and NodeRightSide < 14 and NodeLevel=4 (Nodelevel restriction necessary to avoid double-counting if records at other NodeLevels also have SalesTaxRevenue data.).Given this simple/simplistic example, it would be easy to do this without all this hierarchy stuff. However, what if you had all the municipalities and/or counties in the state of Tennessee? There are 95 counties in the state. What this does is allow for the querying of branches/sub-branches of the organizational tree. By using a query, view, (temp_)table that joins/unions whatever data elements you might have with the hierarchical (Level, Left, Right) elements, you can create and execute queries that are difficult (or impossible) to do otherwise.
I'm sure that some slick, cool query forms could be implemented, including the 
use of the DB Tree View control.
I know my explanation is wholly insufficient, but, trust me, if you encounter hierarchical data, ask yourself how you would figure out the total whatever for this sub-unit and it's sub-units ... then, ask me and I'll share the source from my RMD file. I'm not bragging, please believe me. As smart as I think I am, I agree with Jim Bentley (and perhaps Ben, too) that some of this stuff is "mind-boggling", so I have had my challenges, including the fear of not being able to understand it. Now that I have a rough understanding and code that makes my hierarchical data useful, i.e. query-able, I'm a happy camper.
(Let me also add that all this is for simple hierarchies, with 1 parent only.  
I'm not yet ready for hierarchies with 2:N parents ...)

My current employer has some 537 unique (sub-)units in it, from the root down 
(or up, if you prefer) with a total of 8 levels/generations in its 
organizational tree.  I have already been asked about how much whatever does a 
certain non-root-node and its children have?

--...

[MY_UNIT_HIERARCHY_AS_NESTED_SETS]

GPU_ID                          GPU_PARENT_ID                   HasChildren     
NodeLevel  NodeLeftSide  NodeRightSide
----------------------------------- ----------------------------------- 
----------- ---------- ------------- ------------------
USA                             0                               Y               
         1                 1               30
ARKANSAS                        USA                             Y               
         2                 2                9
CRITTENDEN                      ARKANSAS                        N               
         3                 3                4
PULASKI                         ARKANSAS                        N               
         3                 5                6
WHITE                           ARKANSAS                        N               
         3                 7                8
MISSISSIPPI                     USA                             Y               
         2                10               13
DESOTO                          MISSISSIPPI                     N               
         3                11               12
TENNESSEE                       USA                             Y               
         2                14               29
FAYETTE                         TENNESSEE                       N               
         3                15               16
SHELBY                          TENNESSEE                       Y               
         3                17               26
COLLIERVILLE                    SHELBY                          N               
         4                18               19
GERMANTOWN                      SHELBY                          N               
         4                20               21
MEMPHIS                         SHELBY                          N               
         4                22               23
MILLINGTON                      SHELBY                          N               
         4                24               25
TIPTON                          TENNESSEE                       N               
         3                27               28

NOTES:
(1) 'GPU' = Geo-Political Unit, i.e. a Nation, State, County, Municipality, Etc
(2) In this example, the root-node is USA, identified by '0'.  The root is the ultimate 
progenitor, if you will, the "parent-of-parents".  Other means for identifying 
the root could be used.  The root could have itself as its own parent, some other text 
value, or even NULL, although I would probably shy away from NULL, even though it might 
actually be logically justified here.
(3) NodeLevel could be viewed as "generation".  It is the current level in the 
hierarchy.
(4) Left and Right values indicate the breadth of successive generations. See how USA has 1..30, as, being the root-node, it encompasses the entire span of the tree.(5) Left and Right values are independent of how the records are uniquely identified. The Level, Left, Right values are just the numbers that identify the structure of the tree.
[MY_UNIT_HIERARCHY]
GPU_ID GPU_PARENT_ID----------------------------------- -----------------------------------USA 0TENNESSEE USAARKANSAS USAMISSISSIPPI USASHELBY TENNESSEETIPTON TENNESSEEFAYETTE TENNESSEEMEMPHIS SHELBYMILLINGTON SHELBYCOLLIERVILLE SHELBYGERMANTOWN SHELBYPULASKI ARKANSASWHITE ARKANSASCRITTENDEN ARKANSASDESOTO MISSISSIPPI
NOTES:
(1) This is simple, really; it's just a 2-column table with the GPU and its parent.
Again, if anyone wants, I'll elaborate all I can, which is only a bit, on why this is 
"a good thing" (I use that but I'm not much of a Martha Stewart fan) and I'll 
provide my RMD file, too.

Steve in Memphis


















From: [email protected] [mailto:[EMAIL PROTECTED] On Behalf Of Ben Petersen
Sent: Wednesday, March 12, 2008 10:56am 10:56
To: RBASE-L Mailing List
Subject: [RBASE-L] - Re: Hierarchical Data ...
Ben, your message bolstered my courage totackle this - kinda' humorous that a discussionof hierarchical data and a nested-setsimplementation might "inspire" someone.
As a friend of mine once said to a similar comment, "Ben, that's just sad" <g>.I regret to say that when I've used this logic in the past I had the luxury of starting from scratch (no data conversion), and most recently in VB, but with different objectives, so I can't offer much help. Looking at his "push down stack algorithm" just makes my head hurt w/o more time to digest it.
I can offer this *very* modest advice, fwiw.  It became much easier for me to 
internalize when I reflexively knew that if the two indexes were sequential the data 
item had no children, otherwise other data sets were encapsulated. I know it's 
obvious... and I can't explain why it seeing it just that way greased my mental 
skids, but there you are (more sadness, I guess <g>).

I had originally seen this method in a posting on this list long ago. I just Googled 
"celko sql tree" to get something explanatory for you. There are a bunch of 
matches and I recall seeing this applied in a number of different ways when I looked into 
it some years ago, so there may be more help there. But, it sounds like you've almost got 
it whipped.

Ben

[RBASE-L] - Re: Hierarchical Data ... Redux

Reply via email to