Re: [SQL] Summing Grouping in a Hierarchical Structure
Hi Bryan, My current (maybe about to be deprecated now) category table is a recursive table that I join on the line-item table. Your suggestion of using a table for each level of category is probably a bit easier to manage in some respects. However, as you pointed out, it becomes unwieldy when various branches of the category tree have different lengths. One effectively has to have the same number of levels in all branches. If only reality were so refined! :-) That is why I chose the recursive table structure that I gather is fairly common where the depth of a tree is not necessarily even or known. However, I like the view you created - I have not really used views very much so far, so it is good to see this use case for a view. That said, now that I have finally gotten the chance to try ltree, I think I like it a lot. I wonder how portable it is, but it seems to do the trick extremely well. Mind you, I am not necessarily looking to change DBMSes anytime soon, but it is still good to know how to handle recursive/hierarchical structures... ummm... the old-fashioned way(?). ;-) I think using WITH RECURSIVE would probably do the trick for my recursive category table, but - so far - the WITH construct makes my eyes glaze over. I feel like I will have returned the *one ring* if I get that. On Fri, Feb 22, 2013 at 11:54 PM, Bryan L Nuse n...@uga.edu wrote: SNIP Hello Don, Perhaps I am missing something about what your constraints are, or what you're trying to achieve, but is there any reason you could not use a series of joined tables indicating parent-child relationships? The following example follows that in your previous posts. Note that this approach (as given) will not work if branches stemming from the same node are different lengths. That is, if you have costs associated with Transportation.Bicycle.Gear, you could not also have a category Transportation.Bicycle.Gear.Chain_ring. (To add the latter category, you'd have to put costs from the former under something like Transportation.Bicycle.Gear.General -- or modify the approach.) However, lengthening the Alcohol branches, e.g., by tacking on a level5 table would be easy. Notice that level3 and level4 are not true look-up tables, since they may contain duplicate cat values. If I'm off base, by all means specify just how. Regards, Bryan -- CREATE TABLE level1 ( cat text PRIMARY KEY ); CREATE TABLE level2 ( cat text PRIMARY KEY, parent text REFERENCES level1(cat) ); CREATE TABLE level3 ( cat text, parent text REFERENCES level2(cat), cost numeric(6,2) ); CREATE TABLE level4 ( cat text, parent text, cost numeric(6,2) ); INSERT INTO level1 VALUES ('Transportation'), ('Groceries'); INSERT INTO level2 VALUES ('Auto', 'Transportation'), ('Bicycle', 'Transportation'), ('Fares', 'Transportation'), ('Food', 'Groceries'), ('Beverages', 'Groceries'); INSERT INTO level3 VALUES ('Fuel', 'Auto', 50.00), ('Maintenance', 'Auto', 30.00), ('Fuel', 'Auto', 25.00), ('Gear', 'Bicycle', 40.00), ('Gear', 'Bicycle', 20.00), ('Bus', 'Fares', 10.00), ('Train', 'Fares', 5.00), ('Beverages', 'Food', 30.00), ('Fruit_Veg', 'Food', 40.00), ('Meat_Fish', 'Food', 80.00), ('Grains_Cereals', 'Food', 30.00), ('Alcohol', 'Beverages', NULL), ('Juice', 'Beverages', 45.00), ('Other', 'Beverages', 15.00); INSERT INTO level4 VALUES ('Beer', 'Alcohol', 25.00), ('Spirits', 'Alcohol', 10.00), ('Wine', 'Alcohol', 50.00); CREATE VIEW all_cats AS ( SELECT a.cat AS level4, b.cat AS level3, c.cat AS level2, d.cat AS level1, CASE WHEN a.cost IS NULL THEN 0 WHEN a.cost IS NOT NULL THEN a.cost END + CASE WHEN b.cost IS NULL THEN 0 WHEN b.cost IS NOT NULL THEN b.cost END AS cost FROM level4 a FULL JOIN level3 b ON (a.parent = b.cat) FULL JOIN level2 c ON (b.parent = c.cat) FULL JOIN level1 d ON (c.parent = d.cat) ORDER BY level1, level2, level3, level4 ); SELECT * FROM all_cats; level4 | level3 | level2 | level1 | cost -++---++--- Beer| Alcohol| Beverages | Groceries | 25.00 Spirits | Alcohol| Beverages | Groceries | 10.00 Wine| Alcohol| Beverages | Groceries | 50.00 | Juice | Beverages | Groceries | 45.00 | Other | Beverages | Groceries | 15.00 | Beverages | Food | Groceries | 30.00 | Fruit_Veg
Re: [SQL] Summing Grouping in a Hierarchical Structure
That said, now that I have finally gotten the chance to try ltree, I think I like it a lot. Hello Don, Yes, after looking at ltree --which I had not done before-- I have to agree with Misa that it looks like the right solution for your problem. That is not to say that brute force SQL couldn't provide a workable arrangement; but ltree looks very flexible, especially as it allows you to assign cost values to non-terminal nodes. If it were me, though, I'd still make use of VIEWs to report results of the workhorse queries: staring at a list of items like Transportation.Bicycle.Gear.Chain_ring sounds like headache. That's a matter of taste, of course. Bryan -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Summing Grouping in a Hierarchical Structure
Hi Bryan, The detailed category listing is dead simple to create: SELECT cat_id, cat_name, sum(amount) FROM category, line-item GROUP BY cat_name, cat_id; But... as you say, the result is a LOT of numbers over the course of 160-ish categories of all levels. Which is why it is so important to be able to summarize. I want to summarize by the highest-levels of the category tree, precisely so as not to get a headache looking at the details. But it is still important to be able to drill down when I want more details. On Sat, Feb 23, 2013 at 11:15 AM, Bryan L Nuse n...@uga.edu wrote: That said, now that I have finally gotten the chance to try ltree, I think I like it a lot. Hello Don, Yes, after looking at ltree --which I had not done before-- I have to agree with Misa that it looks like the right solution for your problem. That is not to say that brute force SQL couldn't provide a workable arrangement; but ltree looks very flexible, especially as it allows you to assign cost values to non-terminal nodes. If it were me, though, I'd still make use of VIEWs to report results of the workhorse queries: staring at a list of items like Transportation.Bicycle.Gear.Chain_ring sounds like headache. That's a matter of taste, of course. Bryan -- D.C. Parris, FMP, Linux+, ESL Certificate Minister, Security/FM Coordinator, Free Software Advocate http://dcparris.net/ https://www.xing.com/profile/Don_Parrishttp://www.linkedin.com/in/dcparris GPG Key ID: F5E179BE
Re: [SQL] Summing Grouping in a Hierarchical Structure
Misa, Is it possible to use spaces in the ltree path, like so: TOP.Groceries.Food.Herbs Spices Or do the elements of the path have to use underscores and dashes? On Sat, Feb 23, 2013 at 7:19 AM, Misa Simic misa.si...@gmail.com wrote: Hi Don, Yes, its better to use it in category table... Bryan, how many levels there will be - we dont know... With one table - and ltree is solved all cases... To add new subcategory user just picks the parent category... So it is easy to add chain ring to gear... As category... In another transaction table is category_id, amount... Don already posted query for sum... In these case just category and transaction table should be joined sum amount, group by functions on lpath(depending what is the goal...) Kind Regards, Misa On Saturday, February 23, 2013, Bryan L Nuse wrote: This works fine: test_ltree= SELECT path, trans_amt FROM testcat; path | trans_amt -+--- TOP.Transportation.Auto.Fuel| 50.00 TOP.Transportation.Auto.Maintenance | 30.00 TOP.Transportation.Auto.Fuel| 25.00 TOP.Transportation.Bicycle.Gear | 40.00 TOP.Transportation.Bicycle.Gear | 20.00 TOP.Transportation.Fares.Bus| 10.00 TOP.Transportation.Fares.Train | 5.00 TOP.Groceries.Food.Beverages| 30.00 TOP.Groceries.Food.Fruit_Veggies| 40.00 TOP.Groceries.Food.Meat_Fish| 80.00 TOP.Groceries.Food.Grains_Cereals | 30.00 TOP.Groceries.Beverages.Alcohol.Beer| 25.00 TOP.Groceries.Beverages.Alcohol.Spirits | 10.00 TOP.Groceries.Beverages.Alcohol.Wine| 50.00 TOP.Groceries.Beverages.Juice | 45.00 TOP.Groceries.Beverages.Other | 15.00 (16 rows) So if I want to see: TOP.Groceries| 240.00 TOP.Transportation | 180.00 Hello Don, Perhaps I am missing something about what your constraints are, or what you're trying to achieve, but is there any reason you could not use a series of joined tables indicating parent-child relationships? The following example follows that in your previous posts. Note that this approach (as given) will not work if branches stemming from the same node are different lengths. That is, if you have costs associated with Transportation.Bicycle.Gear, you could not also have a category Transportation.Bicycle.Gear.Chain_ring. (To add the latter category, you'd have to put costs from the former under something like Transportation.Bicycle.Gear.General -- or modify the approach.) However, lengthening the Alcohol branches, e.g., by tacking on a level5 table would be easy. Notice that level3 and level4 are not true look-up tables, since they may contain duplicate cat values. If I'm off base, by all means specify just how. Regards, Bryan -- CREATE TABLE level1 ( cat text PRIMARY KEY ); CREATE TABLE level2 ( cat text PRIMARY KEY, parent text REFERENCES level1(cat) ); CREATE TABLE level3 ( cat text, parent text REFERENCES level2(cat), cost numeric(6,2) ); CREATE TABLE level4 ( cat text, parent text, cost numeric(6,2) ); INSERT INTO level1 VALUES ('Transportation'), ('Groceries'); INSERT INTO level2 VALUES ('Auto', 'Transportation'), ('Bicycle', 'Transportation'), ('Fares', 'Transportation'), ('Food', 'Groceries'), ('Beverages', 'Groceries'); INSERT INTO level3 VALUES ('Fuel', 'Auto', 50.00), ('Maintenance', 'Auto', 30.00), ('Fuel', 'Auto', 25.00), ('Gear', 'Bicycle', 40.00), ('Gear', 'Bicycle', 20.00), ('Bus', 'Fares', 10.00), ('Train', 'Fares', 5.00), ('Beverages', 'Food', 30.00), ('Fruit_Veg', 'Food', 40.00), ('Meat_Fish', 'Food', 80.00), ('Grains_Cereals', 'Food', 30.00), ('Alcohol', 'Beverages', NULL), ('Juice', 'Beverages', 45.00), ('Other', 'Beverages', 15.00); INSERT INTO level4 VALUES ('Beer', 'Alcohol', 25.00), ('Spirits', 'Alcohol', 10.00), ('Wine', 'Alcohol', 50.00); CREATE VIEW all_cats AS ( SELECT a.cat AS level4, b.cat AS level3, c.cat AS level2, d.cat AS level1, CASE WHEN a.cost IS NULL THEN 0 WHEN a.cost IS NOT NULL THEN a.cost END + CASE WHEN b.cost IS NULL THEN 0 WHEN b.cost IS NOT NULL THEN b.cost END AS cost FROM level4 a FULL JOIN level3 b ON (a.parent = b.cat) FULL JOIN level2 c ON (b.parent = c.cat) FULL JOIN level1 d ON (c.parent = d.cat) ORDER BY level1, level2, level3, level4 );
Re: [SQL] Summing Grouping in a Hierarchical Structure
Hi Don, To be honest with you - i dont know - but think it is not... We use it to solve the problem with hierarchy relations - but it is nowhere visible to users in the app... Our internal rule is to use ids in ltree structure to solve many others problems, actually to easy get, actual category info... From any point... So if needed, it is easy from 1.2.3.4, get: TOP.Groceries.Food.Herbs Spices if needed... Each of them are actually category names in the table... Kind regards, Misa On Saturday, February 23, 2013, Don Parris wrote: Misa, Is it possible to use spaces in the ltree path, like so: TOP.Groceries.Food.Herbs Spices Or do the elements of the path have to use underscores and dashes? On Sat, Feb 23, 2013 at 7:19 AM, Misa Simic misa.si...@gmail.com wrote: Hi Don, Yes, its better to use it in category table... Bryan, how many levels there will be - we dont know... With one table - and ltree is solved all cases... To add new subcategory user just picks the parent category... So it is easy to add chain ring to gear... As category... In another transaction table is category_id, amount... Don already posted query for sum... In these case just category and transaction table should be joined sum amount, group by functions on lpath(depending what is the goal...) Kind Regards, Misa On Saturday, February 23, 2013, Bryan L Nuse wrote: This works fine: test_ltree= SELECT path, trans_amt FROM testcat; path | trans_amt -+--- TOP.Transportation.Auto.Fuel| 50.00 TOP.Transportation.Auto.Maintenance | 30.00 TOP.Transportation.Auto.Fuel| 25.00 TOP.Transportation.Bicycle.Gear | 40.00 TOP.Transportation.Bicycle.Gear | 20.00 TOP.Transportation.Fares.Bus| 10.00 TOP.Transportation.Fares.Train | 5.00 TOP.Groceries.Food.Beverages| 30.00 TOP.Groceries.Food.Fruit_Veggies| 40.00 TOP.Groceries.Food.Meat_Fish| 80.00 TOP.Groceries.Food.Grains_Cereals | 30.00 TOP.Groceries.Beverages.Alcohol.Beer| 25.00 TOP.Groceries.Beverages.Alcohol.Spirits | 10.00 TOP.Groceries.Beverages.Alcohol.Wine| 50.00 TOP.Groceries.Beverages.Juice | 45.00 TOP.Groceries.Beverages.Other | 15.00 (16 rows) So if I want to see: TOP.Groceries| 240.00 TOP.Transportation | 180.00 Hello Don, Perhaps I am missing something about what your constraints are, or what you're trying to achieve, but is there any reason you could not use a series of joined tables indicating parent-child relationships? The following example follows that in your previous posts. Note that this approach (as given) will not work if branches stemming from the same node are different lengths. That is, if you have costs associated with Transportation.Bicycle.Gear, you could not also have a category Transportation.Bicycle.Gear.Chain_ring. (To add the latter category, you'd have to put costs from the former under something like Transportation.Bicycle.Gear.General -- or modify the approach.) However, lengthening the Alcohol branches, e.g., by tacking on a level5 table would be easy. Notice that level3 and level4 are not true look-up tables, since they may contain duplicate cat values. If I'm off base, by all means specify just how. Regards, Bryan -- CREATE TABLE level1 ( cat text PRIMARY KEY ); CREATE TABLE level2 ( cat text PRIMARY KEY, parent text REFERENCES level1(cat) ); -- D.C. Parris, FMP, Linux+, ESL Certificate Minister, Security/FM Coordinator, Free Software Advocate http://dcparris.net/ https://www.xing.com/profile/Don_Parrishttp://www.linkedin.com/in/dcparris GPG Key ID: F5E179BE
Re: [SQL] Summing Grouping in a Hierarchical Structure
Quoth parri...@gmail.com (Don Parris): Is it possible to use spaces in the ltree path, like so: TOP.Groceries.Food.Herbs Spices Or do the elements of the path have to use underscores and dashes? From the docs: | A label is a sequence of alphanumeric characters and underscores (for | example, in C locale the characters A-Za-z0-9_ are allowed). Labels | must be less than 256 bytes long. [...] | | A label path is a sequence of zero or more labels separated by dots, | for example L1.L2.L3, representing a path from the root of a | hierarchical tree to a particular node. The length of a label path | must be less than 65Kb, but keeping it under 2Kb is preferable. If you need to store non-alphanumeric labels, one answer (as long as they aren't too long) would be to use URL-encoding, like TOP.Groceries.Food.Herbs_20_26_20Spices Of course, you would need to encode _ as well, and you would need to be sure the labels weren't going to come out too long. Another alternative would be to MD5 each label and use (say) the first 10 bytes of that MD5 in hex as the ltree label. (Annoyingly there's only one non-alphanumeric, so you can't use base64.) If you were going to do that you would need to consider the possibility of an attacker arranging a hash collision: I don't know where you're labels come from, so I don't know if this would be an issue. Ben -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Re: [SQL] Summing Grouping in a Hierarchical Structure
Misa Ben, Thanks. I guess I overlooked that in the docs. I did see the part about the 65KB. I don't see my labels getting overly long. I'll be fine with Herbs_Spices. I make up the labels myself, so it's really no big deal. Alexander Bryan - Many thanks for the help! On Sat, Feb 23, 2013 at 3:13 PM, Ben Morrow b...@morrow.me.uk wrote: Quoth parri...@gmail.com (Don Parris): Is it possible to use spaces in the ltree path, like so: TOP.Groceries.Food.Herbs Spices Or do the elements of the path have to use underscores and dashes? From the docs: | A label is a sequence of alphanumeric characters and underscores (for | example, in C locale the characters A-Za-z0-9_ are allowed). Labels | must be less than 256 bytes long. [...] | | A label path is a sequence of zero or more labels separated by dots, | for example L1.L2.L3, representing a path from the root of a | hierarchical tree to a particular node. The length of a label path | must be less than 65Kb, but keeping it under 2Kb is preferable. If you need to store non-alphanumeric labels, one answer (as long as they aren't too long) would be to use URL-encoding, like TOP.Groceries.Food.Herbs_20_26_20Spices Of course, you would need to encode _ as well, and you would need to be sure the labels weren't going to come out too long. Another alternative would be to MD5 each label and use (say) the first 10 bytes of that MD5 in hex as the ltree label. (Annoyingly there's only one non-alphanumeric, so you can't use base64.) If you were going to do that you would need to consider the possibility of an attacker arranging a hash collision: I don't know where you're labels come from, so I don't know if this would be an issue. Ben -- D.C. Parris, FMP, Linux+, ESL Certificate Minister, Security/FM Coordinator, Free Software Advocate http://dcparris.net/ https://www.xing.com/profile/Don_Parrishttp://www.linkedin.com/in/dcparris GPG Key ID: F5E179BE
Re: [SQL] Summing Grouping in a Hierarchical Structure
Hi, Have you considered maybe ltree datatype? http://www.postgresql.org/docs/9.1/static/ltree.html I think it solves a lot of problems in topic Kind regards, Misa On Friday, February 15, 2013, Don Parris wrote: Hi all, I posted to this list some time ago about working with a hierarchical category structure. I had great difficulty with my problem and gave up for a time. I recently returned to it and resolved a big part of it. I have one step left to go, but at least I have solved this part. Here is the original thread (or one of them): http://www.postgresql.org/message-id/CAJ-7yonw4_qDCp-ZNYwEkR2jdLKeL8nfGc+-TLLSW=rmo1v...@mail.gmail.com Here is my recent blog post about how I managed to show my expenses summed and grouped by a mid-level category: http://dcparris.net/2013/02/13/hierarchical-categories-rdbms/ Specifically, I wanted to sum and group expenses according to categories, not just at the bottom tier, but at higher tiers, so as to show more summarized information. A CEO primarily wants to know the sum total for all the business units, yet have the ability to drill down to more detailed levels if something is unusually high or low. In my case, I could see the details, but not the summary. Well now I can summarize by what I refer to as the 2nd-level categories. Anyway, I hope this helps someone, as I have come to appreciate - and I mean really appreciate - the challenge of working with hierarchical structures in a 2-dimensional RDBMS. If anyone sees something I should explain better or in more depth, please let me know. Regards, Don -- D.C. Parris, FMP, Linux+, ESL Certificate Minister, Security/FM Coordinator, Free Software Advocate http://dcparris.net/ https://www.xing.com/profile/Don_Parrishttp://www.linkedin.com/in/dcparris GPG Key ID: F5E179BE
Re: [SQL] Summing Grouping in a Hierarchical Structure
Hi Misa, I decided to try out ltree, and have made some progress with it. If I understand correctly how to use it, I simply insert the 'path' column into my table, using ltree as the data type. That eliminates the need for a category table, if I understand correctly. I just need to ensure the category path is correct for each line item in the transaction details table. However, I have difficulty figuring out how to sum the amounts as I would like: test_ltree= SELECT path, SUM(trans_amt) FROM testcat WHERE path ~ 'TOP.*{2}' GROUP BY path; path | sum --+- (0 rows) This works fine: test_ltree= SELECT path, trans_amt FROM testcat; path | trans_amt -+--- TOP.Transportation.Auto.Fuel| 50.00 TOP.Transportation.Auto.Maintenance | 30.00 TOP.Transportation.Auto.Fuel| 25.00 TOP.Transportation.Bicycle.Gear | 40.00 TOP.Transportation.Bicycle.Gear | 20.00 TOP.Transportation.Fares.Bus| 10.00 TOP.Transportation.Fares.Train | 5.00 TOP.Groceries.Food.Beverages| 30.00 TOP.Groceries.Food.Fruit_Veggies| 40.00 TOP.Groceries.Food.Meat_Fish| 80.00 TOP.Groceries.Food.Grains_Cereals | 30.00 TOP.Groceries.Beverages.Alcohol.Beer| 25.00 TOP.Groceries.Beverages.Alcohol.Spirits | 10.00 TOP.Groceries.Beverages.Alcohol.Wine| 50.00 TOP.Groceries.Beverages.Juice | 45.00 TOP.Groceries.Beverages.Other | 15.00 (16 rows) But I can do that in my sleep with any regular query. This also works great (sum the trans_amt column at level 3 (not counting TOP): test_ltree= SELECT path, SUM(trans_amt) FROM testcat WHERE path ~ 'TOP.*{3}' GROUP BY path; path | sum -+--- TOP.Groceries.Beverages.Juice | 45.00 TOP.Groceries.Beverages.Other | 15.00 TOP.Groceries.Food.Beverages| 30.00 TOP.Groceries.Food.Fruit_Veggies| 40.00 TOP.Groceries.Food.Grains_Cereals | 30.00 TOP.Groceries.Food.Meat_Fish| 80.00 TOP.Transportation.Auto.Fuel| 75.00 TOP.Transportation.Auto.Maintenance | 30.00 TOP.Transportation.Bicycle.Gear | 60.00 TOP.Transportation.Fares.Bus| 10.00 TOP.Transportation.Fares.Train | 5.00 (11 rows) So if I want to see: TOP.Groceries| 240.00 TOP.Transportation | 180.00 How do I get this? Can you help? I am running PostgreSQL 9.1 on Kubuntu 12.10, in case that makes any difference. On Fri, Feb 22, 2013 at 4:00 AM, Misa Simic misa.si...@gmail.com wrote: Hi, Have you considered maybe ltree datatype? http://www.postgresql.org/docs/9.1/static/ltree.html I think it solves a lot of problems in topic Kind regards, Misa On Friday, February 15, 2013, Don Parris wrote: Hi all, I posted to this list some time ago about working with a hierarchical category structure. I had great difficulty with my problem and gave up for a time. I recently returned to it and resolved a big part of it. I have one step left to go, but at least I have solved this part. Here is the original thread (or one of them): http://www.postgresql.org/message-id/CAJ-7yonw4_qDCp-ZNYwEkR2jdLKeL8nfGc+-TLLSW=rmo1v...@mail.gmail.com Here is my recent blog post about how I managed to show my expenses summed and grouped by a mid-level category: http://dcparris.net/2013/02/13/hierarchical-categories-rdbms/ Specifically, I wanted to sum and group expenses according to categories, not just at the bottom tier, but at higher tiers, so as to show more summarized information. A CEO primarily wants to know the sum total for all the business units, yet have the ability to drill down to more detailed levels if something is unusually high or low. In my case, I could see the details, but not the summary. Well now I can summarize by what I refer to as the 2nd-level categories. Anyway, I hope this helps someone, as I have come to appreciate - and I mean really appreciate - the challenge of working with hierarchical structures in a 2-dimensional RDBMS. If anyone sees something I should explain better or in more depth, please let me know. Regards, Don -- D.C. Parris, FMP, Linux+, ESL Certificate Minister, Security/FM Coordinator, Free Software Advocate http://dcparris.net/ https://www.xing.com/profile/Don_Parrishttp://www.linkedin.com/in/dcparris GPG Key ID: F5E179BE -- D.C. Parris, FMP, Linux+, ESL Certificate Minister, Security/FM Coordinator, Free Software Advocate http://dcparris.net/ https://www.xing.com/profile/Don_Parrishttp://www.linkedin.com/in/dcparris GPG Key ID: F5E179BE
Re: [SQL] Summing Grouping in a Hierarchical Structure
Ok, I managed to accomplish my goal with the ltree: test_ltree= SELECT subltree(path,0,2), sum(trans_amt) FROM testcat GROUP BY subltree; subltree | sum + TOP.Groceries | 325.00 TOP.Transportation | 180.00 (2 rows) It took a bit of experimenting with the various functions of ltree, but I got what I wanted. What I did was keep the ltree path column in the same table as the line-item amount, but this requires building the entire path statement in every row. That's ok for a quick test to figure out how ltree works, but I think I do need to keep the category structure in a separate table: Category (category_id, path) Anyway, thanks for suggesting ltree. On Fri, Feb 22, 2013 at 7:15 PM, Don Parris parri...@gmail.com wrote: Hi Misa, I decided to try out ltree, and have made some progress with it. If I understand correctly how to use it, I simply insert the 'path' column into my table, using ltree as the data type. That eliminates the need for a category table, if I understand correctly. I just need to ensure the category path is correct for each line item in the transaction details table. However, I have difficulty figuring out how to sum the amounts as I would like: test_ltree= SELECT path, SUM(trans_amt) FROM testcat WHERE path ~ 'TOP.*{2}' GROUP BY path; path | sum --+- (0 rows) This works fine: test_ltree= SELECT path, trans_amt FROM testcat; path | trans_amt -+--- TOP.Transportation.Auto.Fuel| 50.00 TOP.Transportation.Auto.Maintenance | 30.00 TOP.Transportation.Auto.Fuel| 25.00 TOP.Transportation.Bicycle.Gear | 40.00 TOP.Transportation.Bicycle.Gear | 20.00 TOP.Transportation.Fares.Bus| 10.00 TOP.Transportation.Fares.Train | 5.00 TOP.Groceries.Food.Beverages| 30.00 TOP.Groceries.Food.Fruit_Veggies| 40.00 TOP.Groceries.Food.Meat_Fish| 80.00 TOP.Groceries.Food.Grains_Cereals | 30.00 TOP.Groceries.Beverages.Alcohol.Beer| 25.00 TOP.Groceries.Beverages.Alcohol.Spirits | 10.00 TOP.Groceries.Beverages.Alcohol.Wine| 50.00 TOP.Groceries.Beverages.Juice | 45.00 TOP.Groceries.Beverages.Other | 15.00 (16 rows) But I can do that in my sleep with any regular query. This also works great (sum the trans_amt column at level 3 (not counting TOP): test_ltree= SELECT path, SUM(trans_amt) FROM testcat WHERE path ~ 'TOP.*{3}' GROUP BY path; path | sum -+--- TOP.Groceries.Beverages.Juice | 45.00 TOP.Groceries.Beverages.Other | 15.00 TOP.Groceries.Food.Beverages| 30.00 TOP.Groceries.Food.Fruit_Veggies| 40.00 TOP.Groceries.Food.Grains_Cereals | 30.00 TOP.Groceries.Food.Meat_Fish| 80.00 TOP.Transportation.Auto.Fuel| 75.00 TOP.Transportation.Auto.Maintenance | 30.00 TOP.Transportation.Bicycle.Gear | 60.00 TOP.Transportation.Fares.Bus| 10.00 TOP.Transportation.Fares.Train | 5.00 (11 rows) So if I want to see: TOP.Groceries| 240.00 TOP.Transportation | 180.00 How do I get this? Can you help? I am running PostgreSQL 9.1 on Kubuntu 12.10, in case that makes any difference. On Fri, Feb 22, 2013 at 4:00 AM, Misa Simic misa.si...@gmail.com wrote: Hi, Have you considered maybe ltree datatype? http://www.postgresql.org/docs/9.1/static/ltree.html I think it solves a lot of problems in topic Kind regards, Misa On Friday, February 15, 2013, Don Parris wrote: Hi all, I posted to this list some time ago about working with a hierarchical category structure. I had great difficulty with my problem and gave up for a time. I recently returned to it and resolved a big part of it. I have one step left to go, but at least I have solved this part. Here is the original thread (or one of them): http://www.postgresql.org/message-id/CAJ-7yonw4_qDCp-ZNYwEkR2jdLKeL8nfGc+-TLLSW=rmo1v...@mail.gmail.com Here is my recent blog post about how I managed to show my expenses summed and grouped by a mid-level category: http://dcparris.net/2013/02/13/hierarchical-categories-rdbms/ Specifically, I wanted to sum and group expenses according to categories, not just at the bottom tier, but at higher tiers, so as to show more summarized information. A CEO primarily wants to know the sum total for all the business units, yet have the ability to drill down to more detailed levels if something is unusually high or low. In my case, I could see the details, but not the summary. Well now I can summarize by what I refer to as the 2nd-level categories. Anyway, I hope this helps someone, as I have come to appreciate - and I mean really appreciate - the
Re: [SQL] Summing Grouping in a Hierarchical Structure
This works fine: test_ltree= SELECT path, trans_amt FROM testcat; path | trans_amt -+--- TOP.Transportation.Auto.Fuel| 50.00 TOP.Transportation.Auto.Maintenance | 30.00 TOP.Transportation.Auto.Fuel| 25.00 TOP.Transportation.Bicycle.Gear | 40.00 TOP.Transportation.Bicycle.Gear | 20.00 TOP.Transportation.Fares.Bus| 10.00 TOP.Transportation.Fares.Train | 5.00 TOP.Groceries.Food.Beverages| 30.00 TOP.Groceries.Food.Fruit_Veggies| 40.00 TOP.Groceries.Food.Meat_Fish| 80.00 TOP.Groceries.Food.Grains_Cereals | 30.00 TOP.Groceries.Beverages.Alcohol.Beer| 25.00 TOP.Groceries.Beverages.Alcohol.Spirits | 10.00 TOP.Groceries.Beverages.Alcohol.Wine| 50.00 TOP.Groceries.Beverages.Juice | 45.00 TOP.Groceries.Beverages.Other | 15.00 (16 rows) So if I want to see: TOP.Groceries| 240.00 TOP.Transportation | 180.00 Hello Don, Perhaps I am missing something about what your constraints are, or what you're trying to achieve, but is there any reason you could not use a series of joined tables indicating parent-child relationships? The following example follows that in your previous posts. Note that this approach (as given) will not work if branches stemming from the same node are different lengths. That is, if you have costs associated with Transportation.Bicycle.Gear, you could not also have a category Transportation.Bicycle.Gear.Chain_ring. (To add the latter category, you'd have to put costs from the former under something like Transportation.Bicycle.Gear.General -- or modify the approach.) However, lengthening the Alcohol branches, e.g., by tacking on a level5 table would be easy. Notice that level3 and level4 are not true look-up tables, since they may contain duplicate cat values. If I'm off base, by all means specify just how. Regards, Bryan -- CREATE TABLE level1 ( cat text PRIMARY KEY ); CREATE TABLE level2 ( cat text PRIMARY KEY, parent text REFERENCES level1(cat) ); CREATE TABLE level3 ( cat text, parent text REFERENCES level2(cat), cost numeric(6,2) ); CREATE TABLE level4 ( cat text, parent text, cost numeric(6,2) ); INSERT INTO level1 VALUES ('Transportation'), ('Groceries'); INSERT INTO level2 VALUES ('Auto', 'Transportation'), ('Bicycle', 'Transportation'), ('Fares', 'Transportation'), ('Food', 'Groceries'), ('Beverages', 'Groceries'); INSERT INTO level3 VALUES ('Fuel', 'Auto', 50.00), ('Maintenance', 'Auto', 30.00), ('Fuel', 'Auto', 25.00), ('Gear', 'Bicycle', 40.00), ('Gear', 'Bicycle', 20.00), ('Bus', 'Fares', 10.00), ('Train', 'Fares', 5.00), ('Beverages', 'Food', 30.00), ('Fruit_Veg', 'Food', 40.00), ('Meat_Fish', 'Food', 80.00), ('Grains_Cereals', 'Food', 30.00), ('Alcohol', 'Beverages', NULL), ('Juice', 'Beverages', 45.00), ('Other', 'Beverages', 15.00); INSERT INTO level4 VALUES ('Beer', 'Alcohol', 25.00), ('Spirits', 'Alcohol', 10.00), ('Wine', 'Alcohol', 50.00); CREATE VIEW all_cats AS ( SELECT a.cat AS level4, b.cat AS level3, c.cat AS level2, d.cat AS level1, CASE WHEN a.cost IS NULL THEN 0 WHEN a.cost IS NOT NULL THEN a.cost END + CASE WHEN b.cost IS NULL THEN 0 WHEN b.cost IS NOT NULL THEN b.cost END AS cost FROM level4 a FULL JOIN level3 b ON (a.parent = b.cat) FULL JOIN level2 c ON (b.parent = c.cat) FULL JOIN level1 d ON (c.parent = d.cat) ORDER BY level1, level2, level3, level4 ); SELECT * FROM all_cats; level4 | level3 | level2 | level1 | cost -++---++--- Beer| Alcohol| Beverages | Groceries | 25.00 Spirits | Alcohol| Beverages | Groceries | 10.00 Wine| Alcohol| Beverages | Groceries | 50.00 | Juice | Beverages | Groceries | 45.00 | Other | Beverages | Groceries | 15.00 | Beverages | Food | Groceries | 30.00 | Fruit_Veg | Food | Groceries | 40.00 | Grains_Cereals | Food | Groceries | 30.00 | Meat_Fish | Food | Groceries | 80.00 | Fuel | Auto | Transportation | 50.00 | Fuel | Auto | Transportation | 25.00 | Maintenance| Auto | Transportation | 30.00
Re: [SQL] Summing Grouping in a Hierarchical Structure
Hi Alexander, I appreciate you taking time to reply to my post. I like the idea of the WITH RECURSIVE query, but... The two examples in the link you offered are not so helpful to me. For example, the initial WITH query shown uses a single table, and I wander how that might apply in my case, where the relevant information is actually found in two tables, one of them a recursive table. The second example, which applies the WITH RECURSIVE clause, is even less so. I wonder if there is a good tutorial somewhere on this that shows some other examples? That might help me catch on a little better. I'll search for that today. On Thu, Feb 14, 2013 at 11:30 PM, Alexander Gataric gata...@usa.net wrote: I would try a recursivehttp://www.postgresql.org/docs/8.4/static/queries-with.htmlquery to determine the category structure and aggregate as you go. I had a similar problem with a hierarchical structure for an organization structure. Another thing you might try is to create a separate CTE for each category and then aggregate the individual CTEs. ** ** ** ** *From:* pgsql-sql-ow...@postgresql.org [mailto: pgsql-sql-ow...@postgresql.org] *On Behalf Of *Don Parris *Sent:* Thursday, February 14, 2013 7:58 PM *To:* pgsql-sql@postgresql.org *Subject:* [SQL] Summing Grouping in a Hierarchical Structure ** ** Hi all, I posted to this list some time ago about working with a hierarchical category structure. I had great difficulty with my problem and gave up for a time. I recently returned to it and resolved a big part of it. I have one step left to go, but at least I have solved this part. Here is the original thread (or one of them): http://www.postgresql.org/message-id/CAJ-7yonw4_qDCp-ZNYwEkR2jdLKeL8nfGc+-TLLSW=rmo1v...@mail.gmail.com ** ** Here is my recent blog post about how I managed to show my expenses summed and grouped by a mid-level category: http://dcparris.net/2013/02/13/hierarchical-categories-rdbms/ Specifically, I wanted to sum and group expenses according to categories, not just at the bottom tier, but at higher tiers, so as to show more summarized information. A CEO primarily wants to know the sum total for all the business units, yet have the ability to drill down to more detailed levels if something is unusually high or low. In my case, I could see the details, but not the summary. Well now I can summarize by what I refer to as the 2nd-level categories. Anyway, I hope this helps someone, as I have come to appreciate - and I mean really appreciate - the challenge of working with hierarchical structures in a 2-dimensional RDBMS. If anyone sees something I should explain better or in more depth, please let me know. ** ** Regards, Don -- D.C. Parris, FMP, Linux+, ESL Certificate Minister, Security/FM Coordinator, Free Software Advocate http://dcparris.net/ https://www.xing.com/profile/Don_Parris GPG Key ID: F5E179BE -- D.C. Parris, FMP, Linux+, ESL Certificate Minister, Security/FM Coordinator, Free Software Advocate http://dcparris.net/ https://www.xing.com/profile/Don_Parrishttp://www.linkedin.com/in/dcparris GPG Key ID: F5E179BE
Re: [SQL] Summing Grouping in a Hierarchical Structure
Hi Mike, No Excel here. This is a strictly libre systems environment. I believe LibreOffice Calc has a similar tool though. And your suggestion might be a great one. On Mon, Feb 18, 2013 at 10:24 AM, Relyea, Mike mike.rel...@xerox.comwrote: From: pgsql-sql-ow...@postgresql.org [mailto: pgsql-sql-ow...@postgresql.org] On Behalf Of Don Parris Sent: Thursday, February 14, 2013 8:58 PM To: pgsql-sql@postgresql.org Subject: [SQL] Summing Grouping in a Hierarchical Structure Hi all, I posted to this list some time ago about working with a hierarchical category structure. I had great difficulty with my problem and gave up for a time. I recently returned to it and resolved a big part of it. I have one step left to go, but at least I have solved this part. Here is the original thread (or one of them): http://www.postgresql.org/message-id/CAJ-7yonw4_qDCp-ZNYwEkR2jdLKeL8nfGc+-TLLSW=rmo1v...@mail.gmail.com Here is my recent blog post about how I managed to show my expenses summed and grouped by a mid-level category: http://dcparris.net/2013/02/13/hierarchical-categories-rdbms/ Specifically, I wanted to sum and group expenses according to categories, not just at the bottom tier, but at higher tiers, so as to show more summarized information. A CEO primarily wants to know the sum total for all the business units, yet have the ability to drill down to more detailed levels if something is unusually high or low. In my case, I could see the details, but not the summary. Well now I can summarize by what I refer to as the 2nd-level categories. Anyway, I hope this helps someone, as I have come to appreciate - and I mean really appreciate - the challenge of working with hierarchical structures in a 2-dimensional RDBMS. If anyone sees something I should explain better or in more depth, please let me know. Regards, Don -- D.C. Parris, FMP, Linux+, ESL Certificate Minister, Security/FM Coordinator, Free Software Advocate http://dcparris.net/ GPG Key ID: F5E179BE My two cents would be to actually use a different tool for the job of presenting this data. I'd have used a pivot table in Microsoft Excel. Not sure what your environment or requirements are but pivot tables are widely used in business, easy to share, can be formatted, and give the user the ability to drill down and navigate to the data they want to see. I'd set up a query to pull the raw data you need with all of the categories and associated data you need. Then bring that data to Excel to present and summarize it. Mike -- D.C. Parris, FMP, Linux+, ESL Certificate Minister, Security/FM Coordinator, Free Software Advocate http://dcparris.net/ https://www.xing.com/profile/Don_Parrishttp://www.linkedin.com/in/dcparris GPG Key ID: F5E179BE
Re: [SQL] Summing Grouping in a Hierarchical Structure
I would use the recursive CTE to gather the hierarchical portion of the data you need and then join that CTE to another table or CTE with the other data you need. I had a situation like this at my job were organization info was in a hierarchal table and I needed to join it to two other tables. I created a CTE with the combined data from the non-hierarchical tables and left joined it to the recursive CTE. If you're having trouble with this, I suggest looking into CTEs and the different types of joins. From: pgsql-sql-ow...@postgresql.org [mailto:pgsql-sql-ow...@postgresql.org] On Behalf Of Don Parris Sent: Thursday, February 21, 2013 4:38 AM To: pgsql-sql@postgresql.org Subject: Re: [SQL] Summing Grouping in a Hierarchical Structure Hi Alexander, I appreciate you taking time to reply to my post. I like the idea of the WITH RECURSIVE query, but... The two examples in the link you offered are not so helpful to me. For example, the initial WITH query shown uses a single table, and I wander how that might apply in my case, where the relevant information is actually found in two tables, one of them a recursive table. The second example, which applies the WITH RECURSIVE clause, is even less so. I wonder if there is a good tutorial somewhere on this that shows some other examples? That might help me catch on a little better. I'll search for that today. On Thu, Feb 14, 2013 at 11:30 PM, Alexander Gataric gata...@usa.net wrote: I would try a recursive http://www.postgresql.org/docs/8.4/static/queries-with.html query to determine the category structure and aggregate as you go. I had a similar problem with a hierarchical structure for an organization structure. Another thing you might try is to create a separate CTE for each category and then aggregate the individual CTEs. From: pgsql-sql-ow...@postgresql.org [mailto:pgsql-sql-ow...@postgresql.org] On Behalf Of Don Parris Sent: Thursday, February 14, 2013 7:58 PM To: pgsql-sql@postgresql.org Subject: [SQL] Summing Grouping in a Hierarchical Structure Hi all, I posted to this list some time ago about working with a hierarchical category structure. I had great difficulty with my problem and gave up for a time. I recently returned to it and resolved a big part of it. I have one step left to go, but at least I have solved this part. Here is the original thread (or one of them): http://www.postgresql.org/message-id/CAJ-7yonw4_qDCp-ZNYwEkR2jdLKeL8nfGc+-TL LSW=rmo1v...@mail.gmail.com Here is my recent blog post about how I managed to show my expenses summed and grouped by a mid-level category: http://dcparris.net/2013/02/13/hierarchical-categories-rdbms/ Specifically, I wanted to sum and group expenses according to categories, not just at the bottom tier, but at higher tiers, so as to show more summarized information. A CEO primarily wants to know the sum total for all the business units, yet have the ability to drill down to more detailed levels if something is unusually high or low. In my case, I could see the details, but not the summary. Well now I can summarize by what I refer to as the 2nd-level categories. Anyway, I hope this helps someone, as I have come to appreciate - and I mean really appreciate - the challenge of working with hierarchical structures in a 2-dimensional RDBMS. If anyone sees something I should explain better or in more depth, please let me know. Regards, Don -- D.C. Parris, FMP, Linux+, ESL Certificate Minister, Security/FM Coordinator, Free Software Advocate http://dcparris.net/ https://www.xing.com/profile/Don_Parris GPG Key ID: F5E179BE -- D.C. Parris, FMP, Linux+, ESL Certificate Minister, Security/FM Coordinator, Free Software Advocate http://dcparris.net/ https://www.xing.com/profile/Don_Parris GPG Key ID: F5E179BE
Re: [SQL] Summing Grouping in a Hierarchical Structure
From: pgsql-sql-ow...@postgresql.org [mailto:pgsql-sql-ow...@postgresql.org] On Behalf Of Don Parris Sent: Thursday, February 14, 2013 8:58 PM To: pgsql-sql@postgresql.org Subject: [SQL] Summing Grouping in a Hierarchical Structure Hi all, I posted to this list some time ago about working with a hierarchical category structure. I had great difficulty with my problem and gave up for a time. I recently returned to it and resolved a big part of it. I have one step left to go, but at least I have solved this part. Here is the original thread (or one of them): http://www.postgresql.org/message-id/CAJ-7yonw4_qDCp-ZNYwEkR2jdLKeL8nfGc+-TLLSW=rmo1v...@mail.gmail.com Here is my recent blog post about how I managed to show my expenses summed and grouped by a mid-level category: http://dcparris.net/2013/02/13/hierarchical-categories-rdbms/ Specifically, I wanted to sum and group expenses according to categories, not just at the bottom tier, but at higher tiers, so as to show more summarized information. A CEO primarily wants to know the sum total for all the business units, yet have the ability to drill down to more detailed levels if something is unusually high or low. In my case, I could see the details, but not the summary. Well now I can summarize by what I refer to as the 2nd-level categories. Anyway, I hope this helps someone, as I have come to appreciate - and I mean really appreciate - the challenge of working with hierarchical structures in a 2-dimensional RDBMS. If anyone sees something I should explain better or in more depth, please let me know. Regards, Don -- D.C. Parris, FMP, Linux+, ESL Certificate Minister, Security/FM Coordinator, Free Software Advocate http://dcparris.net/ GPG Key ID: F5E179BE My two cents would be to actually use a different tool for the job of presenting this data. I'd have used a pivot table in Microsoft Excel. Not sure what your environment or requirements are but pivot tables are widely used in business, easy to share, can be formatted, and give the user the ability to drill down and navigate to the data they want to see. I'd set up a query to pull the raw data you need with all of the categories and associated data you need. Then bring that data to Excel to present and summarize it. Mike -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
[SQL] Summing Grouping in a Hierarchical Structure
Hi all, I posted to this list some time ago about working with a hierarchical category structure. I had great difficulty with my problem and gave up for a time. I recently returned to it and resolved a big part of it. I have one step left to go, but at least I have solved this part. Here is the original thread (or one of them): http://www.postgresql.org/message-id/CAJ-7yonw4_qDCp-ZNYwEkR2jdLKeL8nfGc+-TLLSW=rmo1v...@mail.gmail.com Here is my recent blog post about how I managed to show my expenses summed and grouped by a mid-level category: http://dcparris.net/2013/02/13/hierarchical-categories-rdbms/ Specifically, I wanted to sum and group expenses according to categories, not just at the bottom tier, but at higher tiers, so as to show more summarized information. A CEO primarily wants to know the sum total for all the business units, yet have the ability to drill down to more detailed levels if something is unusually high or low. In my case, I could see the details, but not the summary. Well now I can summarize by what I refer to as the 2nd-level categories. Anyway, I hope this helps someone, as I have come to appreciate - and I mean really appreciate - the challenge of working with hierarchical structures in a 2-dimensional RDBMS. If anyone sees something I should explain better or in more depth, please let me know. Regards, Don -- D.C. Parris, FMP, Linux+, ESL Certificate Minister, Security/FM Coordinator, Free Software Advocate http://dcparris.net/ https://www.xing.com/profile/Don_Parrishttp://www.linkedin.com/in/dcparris GPG Key ID: F5E179BE
Re: [SQL] Summing Grouping in a Hierarchical Structure
I would try a recursive http://www.postgresql.org/docs/8.4/static/queries-with.html query to determine the category structure and aggregate as you go. I had a similar problem with a hierarchical structure for an organization structure. Another thing you might try is to create a separate CTE for each category and then aggregate the individual CTEs. From: pgsql-sql-ow...@postgresql.org [mailto:pgsql-sql-ow...@postgresql.org] On Behalf Of Don Parris Sent: Thursday, February 14, 2013 7:58 PM To: pgsql-sql@postgresql.org Subject: [SQL] Summing Grouping in a Hierarchical Structure Hi all, I posted to this list some time ago about working with a hierarchical category structure. I had great difficulty with my problem and gave up for a time. I recently returned to it and resolved a big part of it. I have one step left to go, but at least I have solved this part. Here is the original thread (or one of them): http://www.postgresql.org/message-id/CAJ-7yonw4_qDCp-ZNYwEkR2jdLKeL8nfGc+-TL LSW=rmo1v...@mail.gmail.com Here is my recent blog post about how I managed to show my expenses summed and grouped by a mid-level category: http://dcparris.net/2013/02/13/hierarchical-categories-rdbms/ Specifically, I wanted to sum and group expenses according to categories, not just at the bottom tier, but at higher tiers, so as to show more summarized information. A CEO primarily wants to know the sum total for all the business units, yet have the ability to drill down to more detailed levels if something is unusually high or low. In my case, I could see the details, but not the summary. Well now I can summarize by what I refer to as the 2nd-level categories. Anyway, I hope this helps someone, as I have come to appreciate - and I mean really appreciate - the challenge of working with hierarchical structures in a 2-dimensional RDBMS. If anyone sees something I should explain better or in more depth, please let me know. Regards, Don -- D.C. Parris, FMP, Linux+, ESL Certificate Minister, Security/FM Coordinator, Free Software Advocate http://dcparris.net/ https://www.xing.com/profile/Don_Parris GPG Key ID: F5E179BE