[RBASE-L] - Re: Hierarchical Data ... Redux

Wills, Steve Tue, 15 Apr 2008 07:27:19 -0700

Y'all, I should never attempt to send any sort of long/deep/multi-paragraph 
message late on a day when a nap would have benefitted me more than any "help" 
I might have offered anyone else.tried to

So, 2 things:

(1) >> ... but I did get it "whipped" should have been rendered by a Memphis as 
"whooped".
(2) The source from my RMD file follows
(3) Table listings are at end
(2) There is more that could/should be done, if one needs.  I did nothing about 
tree management:
    (a) Moving ...
        (i) existing sub-trees
        (ii) child-nodes from one lineage to another
    (b) Inserting/Updating/Deleting ...
        (i) entire sub-trees
        (ii) portions of sub-trees

If this is useful to someone, cool.  If not, it's still useful to me, cool.

Steve in Memphis

-- ********************************************************
-- *** AdjacencyList2NestedSet.RMD ************************
-- ********************************************************
-- *** 2008.03.11, JSW ************************************
-- *** Taken fm Celko's example(s), ***********************
-- *** (re-)inspired by Ben Petersen. *********************
-- ********************************************************

SET MESSAGES off ; SET ERR MESSAGES off

-- *** INIT Source Table **********************************
-- *** NODE_TREE is a "holding" table for processing    ***
-- *** the hierarchical data.  It should by empty       ***
-- *** after processing, but I like to make sure of it. ***
-- *** All the table needs for processing the data      ***
-- *** is the UnitID and the ParentUnitID.              ***
-- *** In my org', these ID's are TEXT (10).            ***
-- ********************************************************
DELETE ROWS FROM NODE_TREE

-- *** This statement is for my org's actual data ***
--INSERT INTO NODE_TREE SELECT UNIT_NUMBER,PARENT_UNIT_NUMBER FROM 
R_UNIT_HIERARCHY

-- *** This statement is for my dev/test data     ***
INSERT INTO NODE_TREE SELECT GPU_ID,GPU_PARENT_ID FROM STEVE_UNIT_HIERARCHY_xA

-- 
*******************************************************************************
-- ***   NOTE : NODE_STACK contains the hierarchy data structured as nested 
sets.
-- ***        : If it will be re-used, as in my case with the actual data
-- ***        : versus the test data, then the records in NODE_STACK
-- ***        : should be INSERTed/UPDATEd elsewhere for preservation.
-- 
*******************************************************************************

SET VAR vLeftRight INT = 2

-- *** How many records are there in the adjaceny list?   ***
SELECT COUNT(*) INTO vCounter IND viCounter FROM NODE_TREE

-- *** Maximum Left||Right Node will be twice that count. ***
SET VAR vMAXLeftRight INT = (2 * (.vCounter))

-- *** This could be set to 0, too.     ***
-- *** It's the bottom||top of the tree ***
-- *** and the level of the root.       ***
SET VAR vCurrentNodeLevel INT = 1

-- *** My VAR's for RB Implementation ***
SET VAR vChildCount INT = 0
SET VAR vMINNodeID TEXT = NULL

-- *** INIT Target Table **********************************
DELETE ROWS FROM NODE_STACK

-- *** Put ROOT NODE into Target Table.         ***
-- *** Root Node can be id'd in various ways.   ***
-- *** Whatever means is used by an org'        ***
-- *** will dictate content of the WHERE-clause ***
-- *** In my org', the Root has ParentID = 0.   ***
INSERT INTO +
       NODE_STACK (NodeLevel, NodeID, NodeLeftSide, NodeRightSide) +
SELECT 1, NodeID, 1, .vMAXLeftRight +
  FROM NODE_TREE +
 WHERE (INT(NodeParentID)) = 0

-- *** Delete that row from Source Table ***
DELETE ROWS +
  FROM NODE_TREE +
 WHERE (INT(NodeParentID)) = 0

--WHILE (.vLeftRight + 1) <= .vMAXLeftRight THEN
WHILE vLeftRight <= (.vMAXLeftRight - 1) THEN

   -- *** Is this Top-of-Stack NodeID a Parent in Source Table? ***
   SELECT COUNT(*) INTO vChildCount IND viChildCount +
     FROM NODE_STACK ns, NODE_TREE nt +
    WHERE ns.NodeID = nt.NodeParentID +
      AND ns.NodeLevel = .vCurrentNodeLevel

   -- *** No kids found ; NULL value for vTestCount?      ***
   -- *** As indicator var shows; so set vChildCount to 0 ***
   IF viChildCount < 0 THEN ; SET VAR vChildCount = 0 ; ENDIF

   -- *** Kids found ... ***
   IF vChildCount >= 1 THEN
      -- *** Get current "minimal" child,         ***
      -- *** determined here by Alphanumeric sort ***
      -- *** implied by use of MIN()              ***
      SELECT MIN(nt.NodeID) INTO vMINNodeID IND viMINNodeID +
        FROM NODE_STACK ns, NODE_TREE nt +
       WHERE ns.NodeID = nt.NodeParentID +
         AND ns.NodeLevel = .vCurrentNodeLevel

      -- *** Put this minimal kid's values into Target Table ***
      INSERT INTO NODE_STACK (NodeLevel, NodeID, NodeLeftSide, NodeRightSide) +
      VALUES (.vCurrentNodeLevel + 1), .vMINNodeID, .vLeftRight, NULL

      -- *** Delete current minimal kid from Source Table ***
      DELETE ROWS +
        FROM NODE_TREE +
       WHERE NodeID = (SELECT NodeID +
                         FROM NODE_STACK +
                        WHERE NodeLevel = (.vCurrentNodeLevel + 1) +
                          AND NodeRightSide IS NULL)

      -- *** Increment Pointer/Counter Var's ***
      SET VAR vLeftRight = (.vLeftRight + 1)
      SET VAR vCurrentNodeLevel = (.vCurrentNodeLevel + 1)

   ELSE
      -- *** No kids found, so end this lineage, assign value for Right Side ***
      UPDATE NODE_STACK +
         SET NodeRightSide = .vLeftRight +
       WHERE NodeLevel = .vCurrentNodeLevel +
         AND NodeRightSide IS NULL

       -- WHERE ...  AND NodeRightSide IS NULL
       -- SET ...            NodeStackTop = (NodeStackTop - 1) +

      -- *** Increment Pointer/Counter Var's ***
      SET VAR vLeftRight = (.vLeftRight + 1)
      SET VAR vCurrentNodeLevel = (.vCurrentNodeLevel - 1)
   ENDIF

ENDWHILE

RETURN

   Table: STEVE_UNIT_HIERARC   No Lock(s)

 No. Column Name        Attributes
 --- ------------------ ------------------------------------------------------
   1 GPU_ID             Type   : TEXT 30
   2 GPU_PARENT_ID      Type   : TEXT 30
   Current number of rows:     15

   Table: NODE_TREE            No Lock(s)
   Descr: -0- 

 No. Column Name        Attributes
 --- ------------------ ------------------------------------------------------
   1 NodeID             Type   : TEXT 10
   2 NodeParentID       Type   : TEXT 10
   Current number of rows:      0

   Table: NODE_STACK           No Lock(s)
   Descr: -0-

 No. Column Name        Attributes
 --- ------------------ ------------------------------------------------------
   1 NodeLevel          Type   : INTEGER
   2 NodeID             Type   : TEXT 10
   3 NodeLeftSide       Type   : INTEGER
   4 NodeRightSide      Type   : INTEGER
   Current number of rows:     15

From: [email protected] [mailto:[EMAIL PROTECTED] On Behalf Of Ben Petersen
Sent: Monday, April 14, 2008 6:00pm 18:00
To: RBASE-L Mailing List
Subject: [RBASE-L] - Re: Hierarchical Data ... Redux

> I agree with Jim Bentley (and perhaps Ben, too) that some of this stuff is 
> "mind-boggling"

Not "Ben too"... better, "Ben especially" <g>

Your sales tax scenario is very interesting, never would have occurred to me, 
but there's nothing new about that <sigh>. I'm a little confused about the 
nodelevel consideration?

I'm thinking the next evolutionary step for this, when I do another project 
using it, is to extract the data into an xml document. That would seem to be 
the perfect bridge between an Sql database and something inherently 
hierarchical and rather universal. 

> so I have had my challenges, 
> including the fear of not being 
> able to understand it.

For me personally, I don't know why it was so difficult to get through my think 
head that the left/right values always bracketed subordinate data. And if those 
values are sequential there is no subordinate data. Oh well <g>.

Thanks for sharing your progress and insights.

Ben

Wills, Steve wrote: 
Ben, it took a bit of effort, goin' at it in fits and starts, but I did get it 
"whipped".

Below, for any who may be interested, is the result of my effort, with thanks 
to Ben (and Celko).  If you look at this, you might be able to see why it could 
be useful for data that is hierarchical in structure, but in a 2-D database 
table.  Although there is more to it, if I were to associate, let's say, Sales 
Tax Revenues for all rows at level 4, municipality, then I could select 
Tennessee, at level 2, and run a query with a 'SUM(SalesTaxRevenues) WHERE 
NodeLeftSide > 3 and NodeRightSide < 14 and NodeLevel=4 (Nodelevel restriction 
necessary to avoid double-counting if records at other NodeLevels also have 
SalesTaxRevenue data.).  

Given this simple/simplistic example, it would be easy to do this without all 
this hierarchy stuff.  However, what if you had all the municipalities and/or 
counties in the state of Tennessee?  There are 95 counties in the state.  What 
this does is allow for the querying of branches/sub-branches of the 
organizational tree.  By using a query, view, (temp_)table that joins/unions 
whatever data elements you might have with the hierarchical (Level, Left, 
Right) elements, you can create and execute queries that are difficult (or 
impossible) to do otherwise.  

I'm sure that some slick, cool query forms could be implemented, including the 
use of the DB Tree View control.

I know my explanation is wholly insufficient, but, trust me, if you encounter 
hierarchical data, ask yourself how you would figure out the total whatever for 
this sub-unit and it's sub-units ... then, ask me and I'll share the source 
from my RMD file.  I'm not bragging, please believe me.  As smart as I think I 
am, I agree with Jim Bentley (and perhaps Ben, too) that some of this stuff is 
"mind-boggling", so I have had my challenges, including the fear of not being 
able to understand it.  Now that I have a rough understanding and code that 
makes my hierarchical data useful, i.e. query-able, I'm a happy camper.  

(Let me also add that all this is for simple hierarchies, with 1 parent only.  
I'm not yet ready for hierarchies with 2:N parents ...)

My current employer has some 537 unique (sub-)units in it, from the root down 
(or up, if you prefer) with a total of 8 levels/generations in its 
organizational tree.  I have already been asked about how much whatever does a 
certain non-root-node and its children have?

--...

[MY_UNIT_HIERARCHY_AS_NESTED_SETS]

GPU_ID                          GPU_PARENT_ID                   HasChildren     
NodeLevel  NodeLeftSide  NodeRightSide
----------------------------------- ----------------------------------- 
----------- ---------- ------------- ------------------
USA                             0                               Y               
         1                 1               30
ARKANSAS                        USA                             Y               
         2                 2                9
CRITTENDEN                      ARKANSAS                        N               
         3                 3                4
PULASKI                         ARKANSAS                        N               
         3                 5                6
WHITE                           ARKANSAS                        N               
         3                 7                8
MISSISSIPPI                     USA                             Y               
         2                10               13
DESOTO                          MISSISSIPPI                     N               
         3                11               12
TENNESSEE                       USA                             Y               
         2                14               29
FAYETTE                         TENNESSEE                       N               
         3                15               16
SHELBY                          TENNESSEE                       Y               
         3                17               26
COLLIERVILLE                    SHELBY                          N               
         4                18               19
GERMANTOWN                      SHELBY                          N               
         4                20               21
MEMPHIS                         SHELBY                          N               
         4                22               23
MILLINGTON                      SHELBY                          N               
         4                24               25
TIPTON                          TENNESSEE                       N               
         3                27               28

NOTES:
(1) 'GPU' = Geo-Political Unit, i.e. a Nation, State, County, Municipality, Etc
(2) In this example, the root-node is USA, identified by '0'.  The root is the 
ultimate progenitor, if you will, the "parent-of-parents".  Other means for 
identifying the root could be used.  The root could have itself as its own 
parent, some other text value, or even NULL, although I would probably shy away 
from NULL, even though it might actually be logically justified here.
(3) NodeLevel could be viewed as "generation".  It is the current level in the 
hierarchy.
(4) Left and Right values indicate the breadth of successive generations.  See 
how USA has 1..30, as, being the root-node, it encompasses the entire span of 
the tree.  
(5) Left and Right values are independent of how the records are uniquely 
identified.  The Level, Left, Right values are just the numbers that identify 
the structure of the tree.  

[MY_UNIT_HIERARCHY]

GPU_ID                          GPU_PARENT_ID                 
----------------------------------- -----------------------------------
USA                             0                             
TENNESSEE                       USA                           
ARKANSAS                        USA                           
MISSISSIPPI                     USA                           
SHELBY                          TENNESSEE                     
TIPTON                          TENNESSEE                     
FAYETTE                         TENNESSEE                     
MEMPHIS                         SHELBY                        
MILLINGTON                      SHELBY                        
COLLIERVILLE                    SHELBY                        
GERMANTOWN                      SHELBY                        
PULASKI                         ARKANSAS                      
WHITE                           ARKANSAS                      
CRITTENDEN                      ARKANSAS                      
DESOTO                          MISSISSIPPI                    

NOTES:
(1) This is simple, really; it's just a 2-column table with the GPU and its 
parent. 

Again, if anyone wants, I'll elaborate all I can, which is only a bit, on why 
this is "a good thing" (I use that but I'm not much of a Martha Stewart fan) 
and I'll provide my RMD file, too.

Steve in Memphis

From: [email protected] [mailto:[EMAIL PROTECTED] On Behalf Of Ben Petersen
Sent: Wednesday, March 12, 2008 10:56am 10:56
To: RBASE-L Mailing List
Subject: [RBASE-L] - Re: Hierarchical Data ...

Ben, your message bolstered my courage to 
tackle this - kinda' humorous that a discussion 
of hierarchical data and a nested-sets 
implementation might "inspire" someone.  

As a friend of mine once said to a similar comment, "Ben, that's just sad" <g>. 

I regret to say that when I've used this logic in the past I had the luxury of 
starting from scratch (no data conversion), and most recently in VB, but with 
different objectives, so I can't offer much help. Looking at his "push down 
stack algorithm" just makes my head hurt w/o more time to digest it. 

I can offer this *very* modest advice, fwiw.  It became much easier for me to 
internalize when I reflexively knew that if the two indexes were sequential the 
data item had no children, otherwise other data sets were encapsulated. I know 
it's obvious... and I can't explain why it seeing it just that way greased my 
mental skids, but there you are (more sadness, I guess <g>).

I had originally seen this method in a posting on this list long ago. I just 
Googled "celko sql tree" to get something explanatory for you. There are a 
bunch of matches and I recall seeing this applied in a number of different ways 
when I looked into it some years ago, so there may be more help there. But, it 
sounds like you've almost got it whipped.

Ben

[RBASE-L] - Re: Hierarchical Data ... Redux

Reply via email to