[SQL] How to represent a tree-structure in a relational database

2000-12-13 Thread Frank Joerdens

I am just thinking about the data model for a little content management system that I 
am
currently planning. Individual articles are sorted under different categories which 
branch
into subcategories, sub-subcategories etc. up to a depth of about 6 or 7 levels. The
structure should be extensible, i.e. it must be possible to add levels. What I am 
thinking
now is that you would keep the index in a separate index table (linked with the primary
key in the articles table), which would have 6 or 7 fields initially, and that you'd 
add
columns with the alter table command, if need be, to make the structure deeper. Is this
the recommended way to go about it? It feels pretty 'right' to me now but since the
problem should be fairly common, there must be other people who have thought and 
written
about it and there might even be a recognized 'optimal' solution to the problem.

Comments?

- Frank



Re: [SQL] How to represent a tree-structure in a relational database

2000-12-13 Thread Josh Berkus

Frank,

Please look in the list archives.  About 2 months ago this topic came
up and was discussed extensively (including a creative solution by yours
truly).

-Josh Berkus
-- 
__AGLIO DATABASE SOLUTIONS___
Josh Berkus
   Complete information technology  [EMAIL PROTECTED]
and data management solutions   (415) 436-9166
   for law firms, small businesses   fax  436-0137
and non-profit organizations.   pager 338-4078
San Francisco



Re: [SQL] How to represent a tree-structure in a relational database

2000-12-13 Thread Mathijs Brands

On Wed, Dec 13, 2000 at 04:48:47PM +0100, Frank Joerdens allegedly wrote:
> I am just thinking about the data model for a little content management system that 
>I am
> currently planning. Individual articles are sorted under different categories which 
>branch
> into subcategories, sub-subcategories etc. up to a depth of about 6 or 7 levels. The
> structure should be extensible, i.e. it must be possible to add levels. What I am 
>thinking
> now is that you would keep the index in a separate index table (linked with the 
>primary
> key in the articles table), which would have 6 or 7 fields initially, and that you'd 
>add
> columns with the alter table command, if need be, to make the structure deeper. Is 
>this
> the recommended way to go about it? It feels pretty 'right' to me now but since the
> problem should be fairly common, there must be other people who have thought and 
>written
> about it and there might even be a recognized 'optimal' solution to the problem.
> 
> Comments?

Yeah. I've built something similar.

The way I've done it:
  Give each record a unique ID (generated with a sequence) and store
  the records in a table. Create a second table in which you store
  parent id-child id combinations.

  So:

1 - Automotive transport
2 - Cars
3 - Motorcycles

Store in the table:
  1-2
  1-3

  There's one main category (Automotive transport) which has two sub-categories:
Cars & Motorcyles

The way I'd do it if I had to do it again:
  Give each record a unique id, generated by the application. Denote levels with
  extra letters.

  So:

   AA   - Automotive transport
    - Cars
   AAAB - Motorcycles

  The structures has the added bonus of making it very easy to determine all the
  sub-categories of a category, no matter how deep the tree is below the category
  you're looking at. With the first approach it is not possible to do this in a
  single SQL query. You could do this with a function, I guess.

I hope this is of some use to you.

Cheers,

Mathijs
-- 
"Borrowers of books -- those mutilators of collections, spoilers of the
 symmetry of shelves, and creators of odd volumes." 
Charles Lamb (1775-1834) 



Re: [SQL] How to represent a tree-structure in a relational database

2000-12-13 Thread Frank Joerdens

On Wed, Dec 13, 2000 at 11:38:18AM -0800, Stuart Statman wrote:
[ . . . ]
> I would suggest, instead, to create a table that represents your hierarchy
> without adding columns. For example :
> 
> create table Category (
> CategoryID   int4  not null  primary key,
> ParentCategoryID int4  not null  REFERENCES Category (CategoryID),
> CategoryName varchar(100)
> );
> 
> Add a CategoryID with an FK reference to this table, and your work is done.
> 
> Then adding, inserting, removing, or moving layers in the hierarchy becomes
> quite simple. This also preserves hierarchical integrity, where subcategory
> a of subcategory b will also remain a subcategory of category c if
> subcategory b is a subcategory of subcategory c, where I'm not sure your
> model will preserve or guarantee that. (Does that sentence deserve a prize?)

Cool. That looks like my solution. I had actually seen it someplace
before, but didn't make the connection with my problem. 

Ta, Frank



Re: [SQL] How to represent a tree-structure in a relational database

2000-12-13 Thread Frank Joerdens

On Wed, Dec 13, 2000 at 11:04:13AM -0800, Josh Berkus wrote:
> Frank,
> 
>   Please look in the list archives.  About 2 months ago this topic came
> up and was discussed extensively (including a creative solution by yours
> truly).

Hm, neither my archives nor a search on the postgresql.org page turned
up the thread you mention. Do you recall which list it was and what the
title of the thread was?

Thanks, Frank



Re: [SQL] How to represent a tree-structure in a relational database

2000-12-13 Thread clayton cottingham

Frank Joerdens wrote:
> 
> On Wed, Dec 13, 2000 at 11:04:13AM -0800, Josh Berkus wrote:
> > Frank,
> >
> >   Please look in the list archives.  About 2 months ago this topic came
> > up and was discussed extensively (including a creative solution by yours
> > truly).
> 
> Hm, neither my archives nor a search on the postgresql.org page turned
> up the thread you mention. Do you recall which list it was and what the
> title of the thread was?
> 
> Thanks, Frank


yes i  recall!!

i managed to implement something to that effect

a lot was done using postgres and perl

anyone need code fragements?



RE: [SQL] How to represent a tree-structure in a relational database

2000-12-13 Thread Stuart Statman

> The way I'd do it if I had to do it again:
>   Give each record a unique id, generated by the application.
> Denote levels with extra letters.
>
>   So:
>
>AA   - Automotive transport
> - Cars
>AAAB - Motorcycles
>
> The structures has the added bonus of making it very easy to
> determine all the
> sub-categories of a category, no matter how deep the tree is
> below the category
> you're looking at. With the first approach it is not possible
> to do this in a
> single SQL query. You could do this with a function, I guess.

The problem with this method is if you need to insert a category, or move a
category. You'll need to re-id a bunch of categories, and bubble those
changes out to every table that refers to this table.

Stuart Statman
Director of Software Development
Slam Media, Inc.


BEGIN:VCARD
VERSION:2.1
N:Statman;Stuart
FN:Stuart Statman
ORG:Slam Media, Inc.
TITLE:Director of Software Development
TEL;WORK;VOICE:(206) 391-0187
ADR;WORK;ENCODING=QUOTED-PRINTABLE:;;Slam Media, Inc.=0D=0A800 5th Ave. #101-296;Seattle;WA;98104;United States=
 of America
LABEL;WORK;ENCODING=QUOTED-PRINTABLE:Slam Media, Inc.=0D=0A800 5th Ave. #101-296=0D=0ASeattle, WA 98104=0D=0AUnit=
ed States of America
EMAIL;PREF;INTERNET:[EMAIL PROTECTED]
REV:2910T063546Z
END:VCARD



RE: [SQL] How to represent a tree-structure in a relational database

2000-12-13 Thread Stuart Statman

> What I am thinking now is that you would keep the index
> in a separate index table linked with the primary
> key in the articles table), which would have 6 or 7 fields
> initially, and that you'd add columns with the alter table
> command, if need be, to make the structure deeper.

I would suggest, instead, to create a table that represents your hierarchy
without adding columns. For example :

create table Category (
CategoryID   int4  not null  primary key,
ParentCategoryID int4  not null  REFERENCES Category (CategoryID),
CategoryName varchar(100)
);

Add a CategoryID with an FK reference to this table, and your work is done.

Then adding, inserting, removing, or moving layers in the hierarchy becomes
quite simple. This also preserves hierarchical integrity, where subcategory
a of subcategory b will also remain a subcategory of category c if
subcategory b is a subcategory of subcategory c, where I'm not sure your
model will preserve or guarantee that. (Does that sentence deserve a prize?)

In general, if you know that you will need to periodically alter a table to
add columns, you should come up with a different model that doesn't require
adding columns.

Stuart Statman
Director of Software Development
Slam Media, Inc.


BEGIN:VCARD
VERSION:2.1
N:Statman;Stuart
FN:Stuart Statman
ORG:Slam Media, Inc.
TITLE:Director of Software Development
TEL;WORK;VOICE:(206) 391-0187
ADR;WORK;ENCODING=QUOTED-PRINTABLE:;;Slam Media, Inc.=0D=0A800 5th Ave. #101-296;Seattle;WA;98104;United States=
 of America
LABEL;WORK;ENCODING=QUOTED-PRINTABLE:Slam Media, Inc.=0D=0A800 5th Ave. #101-296=0D=0ASeattle, WA 98104=0D=0AUnit=
ed States of America
EMAIL;PREF;INTERNET:[EMAIL PROTECTED]
REV:2910T063546Z
END:VCARD



Re: [SQL] How to represent a tree-structure in a relational database

2000-12-13 Thread Josh Berkus

Frank, etc:

> > create table Category (
> > CategoryID   int4  not null  primary key,
> > ParentCategoryID int4  not null  REFERENCES Category (CategoryID),
> > CategoryName varchar(100)
> > );

That was it.  I also gave an example of a UNION query that would display
the whole category tree in ASCII format:

I've done this before for one project.  Here's what you do:

CREATE TABLE sample_heirarchy (
unique_id   SERIAL CONSTRAINT PRIMARY KEY,
node_linkup INT4,
node_level  INT2,
label   VARCHAR(30)
datawhatever
);

Then you use the unique_id and node_linkup fields to create a heirarchy
of data nodes, with an indefinite number of levels, where the
node_linkup of each lower level equals the id of its parent record.  For
example:

id  linkup  level   label   data
3   0   1   Node1   Node1
4   3   2   Node1.1 Node1.1
6   3   2   Node1.2 Node1.2
7   6   3   Node1.2.1   Node1.2.1
5   0   1   Node2   Node2

etc.

You can then access the whole heirarchy through moderately complex, but
very fast-executing UNION queries.  The one drawback is that you need to
know in advance the maximum number of levels (3 in this example), but
I'm sure someone on this list can find a way around that:

SELECT n1.unique_id, n1.label, n1.data, n1.node_level, n1.unique_id AS
level1,
0 AS level2, 0 AS level3
FROM sample_heirarchy n1
WHERE n1.node_level = 1
UNION ALL
SELECT n2.unique_id, n2.label, n2.data, n2.node_level, n1.unique_id, 
n2.unique_id, 0
FROM sample_heirarchy n2, sample_heirarchy n1
WHERE n1.unique_id = n2.node_linkup
AND n2.node_level = 2
UNION ALL
SELECT n3.unique_id, n3.label, n3.data, n3.node_level, n1.unique_id, 
n2.unique_id, n3.unique_id
FROM sample_heirarchy n1, sample_heirarchy n2, sample_heirarchy n3
WHERE n1.unique_id = n2.node_linkup AND
n2.unique_id = n3.node_linkup
AND n3.node_level = 3
ORDER BY level1, level2, level3

Should produce this output (pardon any parsing errors; I'm not at a
PGSQL terminal right now):

unique_id   label   datalevel   level1  level2  level3
3   Node1   Node1 1 3   0   0
4   Node1.1 Node1.1   2 3   4   0
6   Node1.2 Node1.2   2 3   6   0
7   Node1.2.1   Node1.2.1 3 3   6   7
5   Node2   Node2 1 7   0   0
etc.

This sorts them in numerical (id) order, but one could just as easily
substitute the labels or data for the various levels and sort them
alphabetically (although you do need to allow for NULL sort order on
your database, and any label duplicates).

The advantages of this structure are:
1. It allows you to create, assign, and re-assign nodes freely all over
the heirarchy ... just change the level and/or linkup.
2. Aside from the Union query above, the table structure allows for any
number of levels, unlike a set or relationally linked tables.
3. Because the display query is entirely once table linking to itself on
(hopefully) indexed fields, in my expreience it runs very, very fast.
4. My PHP developer has reprogrammed the easily available PHP Tree
Control to uses this table structure (I don't know if he's giving it
out, but he said it wasn't very difficult).

-Josh Berkus

-- 
__AGLIO DATABASE SOLUTIONS___
Josh Berkus
   Complete information technology  [EMAIL PROTECTED]
and data management solutions   (415) 436-9166
   for law firms, small businesses   fax  436-0137
and non-profit organizations.   pager 338-4078
San Francisco



RE: [SQL] How to represent a tree-structure in a relational database

2000-12-13 Thread Stuart Statman

[Josh Berkus]

> I've done this before for one project.  Here's what you do:
>
> CREATE TABLE sample_heirarchy (
> unique_id   SERIAL CONSTRAINT PRIMARY KEY,
> node_linkup INT4,
> node_level  INT2,
> label   VARCHAR(30)
> datawhatever
> );
>
> Then you use the unique_id and node_linkup fields to create a heirarchy
> of data nodes, with an indefinite number of levels, where the
> node_linkup of each lower level equals the id of its parent record.  For
> example:
>
> id  linkup  level   label   data
> 3   0   1   Node1   Node1
> 4   3   2   Node1.1 Node1.1
> 6   3   2   Node1.2 Node1.2
> 7   6   3   Node1.2.1   Node1.2.1
> 5   0   1   Node2   Node2

I don't think I'd be comfortable with having the node_level column in the
table structure. First, because you can derive that value using a function,
it's duplicate data. Second, if you decide to take an entire segment of your
hierarchy and move it under another node (by changing the value of
node_linkup/ParentCategoryID), you'll need to recalculate all of those
node_level values. And all the node_level values underneath it.

> You can then access the whole heirarchy through moderately complex, but
> very fast-executing UNION queries.  The one drawback is that you need to
> know in advance the maximum number of levels (3 in this example), but
> I'm sure someone on this list can find a way around that:

I can think of another way to do this, though it would be a little complex
and would involve temp tables. Select all of your top level nodes into a
temp table. Create a new table with a new column for the new level. Select
the children of the top level nodes into the temp table, followed by those
top level nodes themselves, with a 0 in the new column and a flag indicating
not to expand again. Create a new temp table just like the last but with
another column for the new level, and repeat the above process from the
first temp table to the second, only expanding the latest children, but
copying all records over. Keep doing it until there are no more new
children.

Alternately, if you didn't need each level to have it's own column, but
didn't mind an x.x.x.x kind of notation, you could use one temp table, and
just append '.0' to the end of every copied-over parent node.

Basically, both methods are simulations of recursing the tree, but you get
to do each level all at once using an insert ... select. If you wanted, you
could even use a counter, to identify which level each node appeared in.

Clearly, this could also be done with cursors and recursive

> 4. My PHP developer has reprogrammed the easily available PHP Tree
> Control to uses this table structure (I don't know if he's giving it
> out, but he said it wasn't very difficult).

We've done a similar thing for Java. It was ridiculously easy to create a
TreeModel wrapped around this data. Almost too easy; it made me feel dirty.

Stuart Statman
Director of Software Development
Slam Media, Inc.



BEGIN:VCARD
VERSION:2.1
N:Statman;Stuart
FN:Stuart Statman
ORG:Slam Media, Inc.
TITLE:Director of Software Development
TEL;WORK;VOICE:(206) 391-0187
ADR;WORK;ENCODING=QUOTED-PRINTABLE:;;Slam Media, Inc.=0D=0A800 5th Ave. #101-296;Seattle;WA;98104;United States=
 of America
LABEL;WORK;ENCODING=QUOTED-PRINTABLE:Slam Media, Inc.=0D=0A800 5th Ave. #101-296=0D=0ASeattle, WA 98104=0D=0AUnit=
ed States of America
EMAIL;PREF;INTERNET:[EMAIL PROTECTED]
REV:2910T063546Z
END:VCARD



Re: [SQL] How to represent a tree-structure in a relational database

2000-12-13 Thread Josh Berkus

Stuart,

> I don't think I'd be comfortable with having the node_level column in the
> table structure. First, because you can derive that value using a function,
> it's duplicate data. Second, if you decide to take an entire segment of your
> hierarchy and move it under another node (by changing the value of
> node_linkup/ParentCategoryID), you'll need to recalculate all of those
> node_level values. And all the node_level values underneath it.

I can see that.  I suppose it depends on the data you're storing.  The
project I was working on tracked grocery inventory for a delivery
service, and thus each item had a fixed "level" in the heirarcy (Food
Class, Food Type, Manufacturer, and Item) and thus while items might get
reassigned *across* the heirarcy, they did not get re-assigned *up and
down* the heirarcy.

Also, I can't think of a way to represent the tree in pure SQL without
having the level identifiers (and a fixed number of levels).

> We've done a similar thing for Java. It was ridiculously easy to create a
> TreeModel wrapped around this data. Almost too easy; it made me feel dirty.

Great.  Maybe I'll buy it from you if I ever need to use Java :-)

-Josh

-- 
__AGLIO DATABASE SOLUTIONS___
Josh Berkus
   Complete information technology  [EMAIL PROTECTED]
and data management solutions   (415) 436-9166
   for law firms, small businesses   fax  436-0137
and non-profit organizations.   pager 338-4078
San Francisco



Re: [SQL] How to represent a tree-structure in a relational database

2000-12-13 Thread Mathijs Brands

On Wed, Dec 13, 2000 at 12:09:06PM -0800, Stuart Statman allegedly wrote:
> > The way I'd do it if I had to do it again:
> >   Give each record a unique id, generated by the application.
> > Denote levels with extra letters.
> >
> >   So:
> >
> >AA   - Automotive transport
> > - Cars
> >AAAB - Motorcycles
> >
> > The structures has the added bonus of making it very easy to
> > determine all the
> > sub-categories of a category, no matter how deep the tree is
> > below the category
> > you're looking at. With the first approach it is not possible
> > to do this in a
> > single SQL query. You could do this with a function, I guess.
> 
> The problem with this method is if you need to insert a category, or move a
> category. You'll need to re-id a bunch of categories, and bubble those
> changes out to every table that refers to this table.

You can solve the last problem by using an extra table that maps unique
record id's (numerical) to hierarchical id's (text).

Inserting a category, in my case, does not require me to start updating
lots of records, since I would only use these strings to store
hierarchical information. I'm sorting categories based on alphabet,
which you can overrule by increasing the 'weight' of a category, which
is a numerical value attached to every category and which normally has a
vaue of 1.

However, changing the level of a category would require me to modify all
categories below that. In my case, this wouldn't be a problem.  We're
using this stuff for a Yahoo style directory which (atm) has about 2500
different categories. I'm generating a complete tree of all categories
and the websites in them once a day, storing them in a souped up DBM
style database. For each record I store the children, not the parent. If
changing the underlying structure takes a couple of minutes, than this
is acceptable.

As you can see my number of categories is rather small. If you're going
to use this for a forum or something similar, you may run into problems.
However, how often do you want to move a thread...

Cheers,

Mathijs
--
"Where is human nature so weak as in a bookstore!" 
Henry Ward Beecher  (1813-1887) 



Re: [SQL] How to represent a tree-structure in a relational database

2000-12-13 Thread Mathijs Brands

On Wed, Dec 13, 2000 at 04:49:51PM -0800, Josh Berkus allegedly wrote:
> Stuart,
> 
> > I don't think I'd be comfortable with having the node_level column in the
> > table structure. First, because you can derive that value using a function,
> > it's duplicate data. Second, if you decide to take an entire segment of your
> > hierarchy and move it under another node (by changing the value of
> > node_linkup/ParentCategoryID), you'll need to recalculate all of those
> > node_level values. And all the node_level values underneath it.
> 
> I can see that.  I suppose it depends on the data you're storing.  The
> project I was working on tracked grocery inventory for a delivery
> service, and thus each item had a fixed "level" in the heirarcy (Food
> Class, Food Type, Manufacturer, and Item) and thus while items might get
> reassigned *across* the heirarcy, they did not get re-assigned *up and
> down* the heirarcy.

Indeed. If the structure 'rarely' changes, having to do an expensive
update may be acceptable, if it increase the overall performance
significantly.

> Also, I can't think of a way to represent the tree in pure SQL without
> having the level identifiers (and a fixed number of levels).

Storing only the parent for a record doesn't require you to keep track
of levels, since this information can be reconstructed by following the
chain of parent id's until you reach the top of your tree.

Storing the children for each record (like I'm doing) works exactly the
same. Just follow the 'path' (for instance 'Automotive Transport/Cars')
to find the category you're looking for.

Cheers,

Mathijs
--
"A book is a fragile creature.  It suffers the wear of time,
 it fears rodents, the elements, clumsy hands." 
Umberto Eco 



RE: [SQL] How to represent a tree-structure in a relational database

2000-12-13 Thread Opec Kemp \( Ozemail \)

Don't if this will help but there is a really good book that discuss this problem in 
details.
The book is called "SQL for Smarties" by Joe Celko. It covers lots of advance topics 
(tree being one of them). Very good book. Check out on Amazon:

http://www.amazon.com/exec/obidos/ASIN/1558605762/qid=976755796/sr=1-1/106-0241434-0557209

Just me $0.02 

> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:  Behalf Of Josh Berkus
> Sent: Thursday, December 14, 2000 10:50 AM
> To: sqllist
> Subject: Re: [SQL] How to represent a tree-structure in a relational
> database
> 
> 
> Stuart,
> 
> > I don't think I'd be comfortable with having the node_level column in the
> > table structure. First, because you can derive that value using a function,
> > it's duplicate data. Second, if you decide to take an entire segment of your
> > hierarchy and move it under another node (by changing the value of
> > node_linkup/ParentCategoryID), you'll need to recalculate all of those
> > node_level values. And all the node_level values underneath it.
> 
> I can see that.  I suppose it depends on the data you're storing.  The
> project I was working on tracked grocery inventory for a delivery
> service, and thus each item had a fixed "level" in the heirarcy (Food
> Class, Food Type, Manufacturer, and Item) and thus while items might get
> reassigned *across* the heirarcy, they did not get re-assigned *up and
> down* the heirarcy.
> 
> Also, I can't think of a way to represent the tree in pure SQL without
> having the level identifiers (and a fixed number of levels).
> 
> > We've done a similar thing for Java. It was ridiculously easy to create a
> > TreeModel wrapped around this data. Almost too easy; it made me feel dirty.
> 
> Great.  Maybe I'll buy it from you if I ever need to use Java :-)
> 
>   -Josh
> 
> -- 
> __AGLIO DATABASE SOLUTIONS___
> Josh Berkus
>Complete information technology  [EMAIL PROTECTED]
> and data management solutions   (415) 436-9166
>for law firms, small businesses   fax  436-0137
> and non-profit organizations.   pager 338-4078
>   San Francisco



Re: [SQL] How to represent a tree-structure in a relational database

2000-12-13 Thread clayton cottingham

Mathijs Brands wrote:
> 
> On Wed, Dec 13, 2000 at 04:49:51PM -0800, Josh Berkus allegedly wrote:
> > Stuart,
> >
> > > I don't think I'd be comfortable with having the node_level column in the
> > > table structure. First, because you can derive that value using a function,
> > > it's duplicate data. Second, if you decide to take an entire segment of your
> > > hierarchy and move it under another node (by changing the value of
> > > node_linkup/ParentCategoryID), you'll need to recalculate all of those
> > > node_level values. And all the node_level values underneath it.
> >
> > I can see that.  I suppose it depends on the data you're storing.  The
> > project I was working on tracked grocery inventory for a delivery
> > service, and thus each item had a fixed "level" in the heirarcy (Food
> > Class, Food Type, Manufacturer, and Item) and thus while items might get
> > reassigned *across* the heirarcy, they did not get re-assigned *up and
> > down* the heirarcy.
> 
> Indeed. If the structure 'rarely' changes, having to do an expensive
> update may be acceptable, if it increase the overall performance
> significantly.
> 
> > Also, I can't think of a way to represent the tree in pure SQL without
> > having the level identifiers (and a fixed number of levels).
> 
> Storing only the parent for a record doesn't require you to keep track
> of levels, since this information can be reconstructed by following the
> chain of parent id's until you reach the top of your tree.
> 
> Storing the children for each record (like I'm doing) works exactly the
> same. Just follow the 'path' (for instance 'Automotive Transport/Cars')
> to find the category you're looking for.
> 
> Cheers,
> 
> Mathijs
> --
> "A book is a fragile creature.  It suffers the wear of time,
>  it fears rodents, the elements, clumsy hands."
> Umberto Eco



this is the way i implemented too!

i used perl and modperl/apache to deploy

so i did use a perl module to store the top level of categories

so they where always avail to the starting page

mostly it ended up looking like the bidders edge horizontal line type
thing

again if anyone needs the code lemme know!



Re: [SQL] How to represent a tree-structure in a relational database

2000-12-13 Thread Robert B. Easter

On Wednesday 13 December 2000 18:05, Josh Berkus wrote:
> Frank, etc:
> > > create table Category (
> > > CategoryID   int4  not null  primary key,
> > > ParentCategoryID int4  not null  REFERENCES Category (CategoryID),
> > > CategoryName varchar(100)
> > > );

I made a message board with a hierarchy for topics/boards under which  
messages go into a post/reply hierarchy at www.comptechnews.com.  I used the 
parent_id idea like in the table above.

It's working OK and, yes, I can easily move nodes around without problems.  I 
can move/reparent Topic/boards and posts/replies (posts and replies are 
treated nearly the same, just that a post has a parent-post-id of 0 while a 
reply message's parent-post-id is non-zero).  This arrangement combined with 
the use of PL/pgSQL trigger functions can go a long way.

I did not use any foreign keys but just had the PL/pgSQL triggers do any 
checks I wanted myself.  So, its easy to make triggers that do things like 
automatically delete an entire hierarchy when a node(message or topic) is 
deleted.  Like, if you delete a post, it deletes all its replies 
automatically.  If you delete a reply, it deletes any child/replies that it 
might have etc.  If you delete a topic/board, it deletes all messages that 
were in that topic and any subtopics and messages in those subtopics 
recursively.  The PL/pgSQL can be used to RAISE EXCEPTION when something you 
don't want to happen is attempted (like deleting the root topic!), which then 
automatically ABORTs the transaction.

You will want to make heavy use of procedures in the database to ensure 
integrity.  You might be able to use FOREIGN KEY triggers also but with 
careful use of BEFORE and AFTER PL/pgSQL triggers. In my case, I had somekind 
of conflict between what my PL/pgSQL triggers where doing and what the 
foreign key triggers were doing. The stuff my PL/pgSQL were doing caused 
referential integrity violations sometimes.  You have to be real careful what 
goes into a BEFORE trigger and what goes into an AFTER trigger.  Trying to 
make a BEFORE trigger or an AFTER trigger do too much will end up in trouble 
so you have to split what you want to do into two triggers for a table.  The 
CONSTRAINT TRIGGERs that get created automatically by FOREIGN KEYs are called 
AFTER INSERT OR UPDATE|DELETE and are NOT DEFERRABLE and INITIALLY IMMEDIATE. 
You have to keep that in mind in writing your procedures.  In the end, I 
judged the contraint triggers to interfere with what I wanted to do and 
removed them.



>
> That was it.  I also gave an example of a UNION query that would display
> the whole category tree in ASCII format:
>
> I've done this before for one project.  Here's what you do:
>
> CREATE TABLE sample_heirarchy (
> unique_id   SERIAL CONSTRAINT PRIMARY KEY,
> node_linkup INT4,
> node_level  INT2,
> label   VARCHAR(30)
> datawhatever
> );
>
> Then you use the unique_id and node_linkup fields to create a heirarchy
> of data nodes, with an indefinite number of levels, where the
> node_linkup of each lower level equals the id of its parent record.  For
> example:
>
> id  linkup  level   label   data
> 3   0   1   Node1   Node1
> 4   3   2   Node1.1 Node1.1
> 6   3   2   Node1.2 Node1.2
> 7   6   3   Node1.2.1   Node1.2.1
> 5   0   1   Node2   Node2
>
> etc.
>
> You can then access the whole heirarchy through moderately complex, but
> very fast-executing UNION queries.  The one drawback is that you need to
> know in advance the maximum number of levels (3 in this example), but
> I'm sure someone on this list can find a way around that:
>
> SELECT n1.unique_id, n1.label, n1.data, n1.node_level, n1.unique_id AS
> level1,
> 0 AS level2, 0 AS level3
> FROM sample_heirarchy n1
> WHERE n1.node_level = 1
> UNION ALL
> SELECT n2.unique_id, n2.label, n2.data, n2.node_level, n1.unique_id,
> n2.unique_id, 0
> FROM sample_heirarchy n2, sample_heirarchy n1
> WHERE n1.unique_id = n2.node_linkup
> AND n2.node_level = 2
> UNION ALL
> SELECT n3.unique_id, n3.label, n3.data, n3.node_level, n1.unique_id,
> n2.unique_id, n3.unique_id
> FROM sample_heirarchy n1, sample_heirarchy n2, sample_heirarchy n3
> WHERE n1.unique_id = n2.node_linkup AND
> n2.unique_id = n3.node_linkup
> AND n3.node_level = 3
> ORDER BY level1, level2, level3
>
> Should produce this output (pardon any parsing errors; I'm not at a
> PGSQL terminal right now):
>
> unique_id   label   datalevel   level1  level2  level3
> 3   Node1   Node1 1 3   0   0
> 4   Node1.1 Node1.1   2 3   4   0
> 6   Node1.2 Node1.2   2 3   6   0
> 7   Node1.2.1   Node1.2.1 3 

Re: [SQL] How to represent a tree-structure in a relational database

2000-12-14 Thread Tulassay Zsolt


There actually is a model of tree structures in SQL databases which is
different from those mentioned earlier in that it represents the tree
as nested sets (ie. nodes are subsets of parent sets (parent nodes)).

There is a huge advantage in this model as it eliminates the need for
recursion. For example, if you want to get a specific node's parents,
grandparents and all other parents up the tree, you can do this in
_one single_ SELECT query. If you just stored the the id of the parent
of each node, you would need (n-1) queries if the node is at the n-th
level.

However, inserting or deleting nodes is more complex this way since you
have to keep the data structure "balanced". But in most cases you won't
insert new nodes all the time so you won't suffer from this overhead. And
on the other hand, the performance gain on queries might be enormous.

You can find the article dealing with this at
http://www.utdt.edu/~mig/sql-trees


Actually what I did was to implement _both_ models at the same time. I
took the nested set structure, and besides stored the parent id of all
nodes as well. The benefit of this is that by accepting a little more
overhead during inserts, I was able to use the advantages of both models.

If somebody is interested in it, I might share the code (a few tables
and some plpgsql functions). But the article is surely worth a read.

Zsolt

ps: I found the link a few months ago in the pgsql-sql archive :-)


On Wed, 13 Dec 2000, Josh Berkus wrote:

> Stuart,
> 
> > I don't think I'd be comfortable with having the node_level column in the
> > table structure. First, because you can derive that value using a function,
> > it's duplicate data. Second, if you decide to take an entire segment of your
> > hierarchy and move it under another node (by changing the value of
> > node_linkup/ParentCategoryID), you'll need to recalculate all of those
> > node_level values. And all the node_level values underneath it.
> 
> I can see that.  I suppose it depends on the data you're storing.  The
> project I was working on tracked grocery inventory for a delivery
> service, and thus each item had a fixed "level" in the heirarcy (Food
> Class, Food Type, Manufacturer, and Item) and thus while items might get
> reassigned *across* the heirarcy, they did not get re-assigned *up and
> down* the heirarcy.
> 
> Also, I can't think of a way to represent the tree in pure SQL without
> having the level identifiers (and a fixed number of levels).
> 
> > We've done a similar thing for Java. It was ridiculously easy to create a
> > TreeModel wrapped around this data. Almost too easy; it made me feel dirty.
> 
> Great.  Maybe I'll buy it from you if I ever need to use Java :-)
> 
>   -Josh
> 
> -- 
> __AGLIO DATABASE SOLUTIONS___
> Josh Berkus
>Complete information technology  [EMAIL PROTECTED]
> and data management solutions   (415) 436-9166
>for law firms, small businesses   fax  436-0137
> and non-profit organizations.   pager 338-4078
>   San Francisco
> 




Re: [SQL] How to represent a tree-structure in a relational database

2000-12-14 Thread hubert depesz lubaczewski

somebody already showed table structure, but i'll ad some more code to this:

table:

CREATE TABLE groups (
  id   INT4 NOT NULL DEFAULT NEXTVAL('groups_seq'),
  parent_idINT4 NOT NULL DEFAULT 0,
  name TEXT NOT NULL DEFAULT '',
  active   BOOL NOT NULL DEFAULT 't'::bool,
  PRIMARY KEY (id)
);
INSERT INTO groups   (id)   VALUES (0);
ALTER TABLE groups   ADD FOREIGN KEY (parent_id  ) REFERENCES groups   (id);
CREATE UNIQUE INDEX groups_pn_u   ON groups   (parent_id, name, active);

at this point it seems to be pretty easy and obvious.
in my case i got to the point that i needed some more info about the branch of
tree. so i wrote:

REATE function getgrouppath(int4, text) returns text as '
  DECLARE
sep ALIAS FOR $2;
aid int4;
wynik TEXT;
temp RECORD;
b BOOL;
  BEGIN
b:=''t'';
wynik:=;
aid:=$1;
while b loop
  SELECT name, parent_id INTO temp FROM groups WHERE id=aid;
  IF NOT FOUND THEN
return wynik;
  END IF;
  if wynik =  THEN
wynik:=temp.name;
  else
wynik:=temp.name||sep||wynik;
  END if;
  IF temp.parent_id = 0 THEN
b:=''f'';
  ELSE
aid:=temp.parent_id;
  END if;
end loop;
return wynik;
  END;
' language 'plpgsql';

(sorry for polish variable names)
this function does one nice trick
when having structure like:
=> select id, parent_id, name, active from groups;
 id | parent_id | name | active
+---+--+
  0 | 0 |  | t
  1 | 0 | RTV  | t
  2 | 0 | AGD  | t
  3 | 0 | MP3  | t
  4 | 1 | Audio| t
  5 | 2 | Lodówki  | t
  6 | 2 | Kuchenki Mikrofalowe | t
  7 | 4 | Sony | t
  8 | 4 | Panasonic| t
(9 rows)

i can:
=> select id, parent_id, name, active, getgrouppath(id, '/') from
groups;
 id | parent_id | name | active |   getgrouppath
+---+--++--
  0 | 0 |  | t  |
  1 | 0 | RTV  | t  | RTV
  2 | 0 | AGD  | t  | AGD
  3 | 0 | MP3  | t  | MP3
  4 | 1 | Audio| t  | RTV/Audio
  5 | 2 | Lodówki  | t  | AGD/Lodówki
  6 | 2 | Kuchenki Mikrofalowe | t  | AGD/Kuchenki Mikrofalowe
  7 | 4 | Sony | t  | RTV/Audio/Sony
  8 | 4 | Panasonic| t  | RTV/Audio/Panasonic


since for some reasons (indenting) i needed the level of branch i wrote:

CREATE FUNCTION grouplevel(int4) returns int4 AS '
  DECLARE
baseid ALIAS FOR $1;
currid INT4;
reply INT4;
  BEGIN
reply:=1;
if baseid = 0 then return 0; END if;
SELECT parent_id INTO currid FROM groups where id=baseid;
while currid>0 loop
  reply:=reply+1;
  SELECT parent_id INTO currid FROM groups where id=currid;
END loop;
return reply;
  END;
' language 'plpgsql';

which also seems pretty obvious.

to be complete i wrote two triggers which made me happy:

CREATE FUNCTION trg_recurs_act_g() RETURNS OPAQUE AS '
  BEGIN
IF NEW.active=''f''::bool and OLD.active=''t''::bool THEN
  UPDATE articles SET active=''f''::bool WHERE group_id=NEW.id;
  UPDATE groups SET active=''f''::bool WHERE parent_id=NEW.id and id<>0;
ELSE
  IF NEW.active=''t''::bool and OLD.active=''f''::bool AND NEW.id<>0 THEN
UPDATE groups SET active=''t''::bool WHERE id=NEW.parent_id;
  END IF;
END IF;
RETURN NEW;
  END;
' LANGUAGE 'plpgsql';

CREATE FUNCTION trg_recurs_act_a() RETURNS OPAQUE AS '
  BEGIN
IF NEW.active=''t''::bool and OLD.active=''f''::bool THEN
  UPDATE groups SET active=''t''::bool WHERE id=NEW.group_id;
END IF;
RETURN NEW;
  END;
' LANGUAGE 'plpgsql';

CREATE TRIGGER groups_update_trg BEFORE UPDATE ON groups FOR EACH ROW EXECUTE 
PROCEDURE trg_recurs_act_g();
CREATE TRIGGER articles_update_trg BEFORE UPDATE ON articles FOR EACH ROW EXECUTE 
PROCEDURE trg_recurs_act_a();

as you can see those triggers use article table which structure is not
important at this moment (let's assume it has id, group_id, name and active).

i hope this code will help you a bit.

depesz

-- 
hubert depesz lubaczewski

 najwspanialszą rzeczą jaką dało nam nowoczesne społeczeństwo,
  jest niesamowita wręcz łatwość unikania kontaktów z nim ...



Re: [SQL] How to represent a tree-structure in a relational database

2000-12-15 Thread Tulassay Zsolt



On Thu, 14 Dec 2000, Tulassay Zsolt wrote:

> 
> You can find the article dealing with this at
> http://www.utdt.edu/~mig/sql-trees
> 

sorry i pasted in the wrong url (this was mentioned in an earlier post)
the correct one is:

A look at SQL Trees (by Joe Celko)
http://www.dbmsmag.com/9603d06.html

Zsolt Tulassay




Re: [SQL] How to represent a tree-structure in a relational database

2000-12-28 Thread Ron Peterson

Ron Peterson wrote:
> 
> This structure is more 'normal' in the sense that nodes without children
> (in a tree, the leaf nodes) don't have records in the edge table.

Phghpth.  Should have had my coffee first.  The first data structure
given would only have a null parent id for the root node, not all the
leaf nodes.  My mistake.  Thought it might be politic to point that out
before someone (correctly) called me an idiot.

-Ron-



Re: [SQL] How to represent a tree-structure in a relational database

2000-12-29 Thread Ron Peterson

Ron Peterson wrote:
> 
> CREATE TABLE category_edge (
> parent  INTEGER
> NOT NULL
> REFERENCES category_node(id),
> 
> child   INTEGER
> NOT NULL
> REFERENCES category_node(id)
> );

Just for the sake of anal-retentive completeness, I'd like to point out
that you'd probably want an id field in this table also.  Plus what the
heck else am I going to do on Christmas break?  ;)

On a completely unrelated topic: getting the PostgreSQL discussions
lists on a news server is great!!!  I was overwhelmed with mail that I
usually don't have time to deal with.  Now when I have a chance, I just
go see what's up on the news server.  Excellent and thanks!

-Ron-



Re: [SQL] How to represent a tree-structure in a relational database

2000-12-29 Thread Ron Peterson

Stuart Statman wrote:
> 
> I would suggest, instead, to create a table that represents your hierarchy
> without adding columns. For example :
> 
> create table Category (
> CategoryID   int4  not null  primary key,
> ParentCategoryID int4  not null  REFERENCES Category (CategoryID),
> CategoryName varchar(100)
> );

Another possibility would be to use two tables to represent the data
structure.

CREATE SEQUENCE category_node_id_seq;
CREATE TABLE category_node (
nameTEXT
NOT NULL,

id  INTEGER
DEFAULT NEXTVAL('category_node_id_seq')
PRIMARY KEY
);

CREATE TABLE category_edge (
parent  INTEGER
NOT NULL
REFERENCES category_node(id),

child   INTEGER
NOT NULL
REFERENCES category_node(id)
);

This structure is more 'normal' in the sense that nodes without children
(in a tree, the leaf nodes) don't have records in the edge table.

What either of these structures allow to do is create directed graph
structures.  If you'd like to constrain this structure to be a tree, you
have to enforce that restriction with procedural code.

-Ron-