Re: [HACKERS] OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets

2003-12-18 Thread Sailesh Krishnamurthy
We have a rude hack of temping hashed aggs to disk to deal with the case where there is not enough memory. I don't think that's an ideal solution, but it certainly has the code to dump to file. I can post the patch later in the day .. (This is some code for our undergrad db class assignment. I

[HACKERS] OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets

2003-12-17 Thread Robert Bedell
I'm curious if anyone has ever looked into adding OLAP functionality (per the SQL99 specs) into PostGreSQL. I don't actually own the proper SQL99 specifications, and since the newer sql2003 ones are coming out I don't know which to purchase. Could someone point me in the right direction? I've

Re: [HACKERS] OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets

2003-12-17 Thread Tom Lane
Robert Bedell [EMAIL PROTECTED] writes: I'm curious if anyone has ever looked into adding OLAP functionality (per the SQL99 specs) into PostGreSQL. There was a fairly crude CUBE implementation submitted (and rejected) a few months ago, but there's not been any work I thought had a chance of

Re: [HACKERS] OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets

2003-12-17 Thread Joe Conway
Robert Bedell wrote: I'm curious if anyone has ever looked into adding OLAP functionality (per the SQL99 specs) into PostGreSQL. I don't actually own the proper SQL99 specifications, and since the newer sql2003 ones are coming out I don't know which to purchase. Could someone point me in the

Re: [HACKERS] OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets

2003-12-17 Thread Robert Bedell
More specifically I would like to add grouping sets, and the CUBE and ROLLUP operators, into postgresql. Since modifying such the GROUP BY operation would necessitate changing the query structure, wouldn't that affect the query rewrites and genetic optimizer? I don't think either the

Re: [HACKERS] OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets

2003-12-17 Thread Hannu Krosing
Tom Lane kirjutas N, 18.12.2003 kell 00:27: Robert Bedell [EMAIL PROTECTED] writes: I'm curious if anyone has ever looked into adding OLAP functionality (per the SQL99 specs) into PostGreSQL. As a first project one could think of implementing NULLS FIRST/LAST (from 4.14.9) for all ORDER BY

Re: [HACKERS] OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets

2003-12-17 Thread Hannu Krosing
Robert Bedell kirjutas N, 18.12.2003 kell 01:02: These are not simple projects, I know. Might be a tad ambitious for your first venture into backend hacking... I agree completely. I'm not purporting to jump in quite that quickly, but it is something I would like to see added, and am

Re: [HACKERS] OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets

2003-12-17 Thread Robert Bedell
I guess that by adding hash aggregates Tom solved most problems of adding ROLLUP, CUBE and GROUPING SETS. OTOH, I'm not sure if hash aggregates can already spill to disk if not enough memory is available for keeping them all. If not, then adding this capability would be great push towards

Re: [HACKERS] OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets

2003-12-17 Thread Hannu Krosing
Robert Bedell kirjutas N, 18.12.2003 kell 01:55: I guess that by adding hash aggregates Tom solved most problems of adding ROLLUP, CUBE and GROUPING SETS. OTOH, I'm not sure if hash aggregates can already spill to disk if not enough memory is available for keeping them all. If not, then

Re: [HACKERS] OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets

2003-12-17 Thread Robert Bedell
it seems that the longest GROUPING SET and all its left-continuous subsets could be collected from the sorted scan and the rest from hash aggregates. GROUPING SET () will always need a hash ;) To optimise any further would require use of statistics data, and is probably not a good idea

Re: [HACKERS] OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets

2003-12-17 Thread Robert Bedell
For pure ROLLUP one could shortcut the split-into-groups and put-together-again process, as ROLLUP is already doable from single sorted scan. Actually as long as the grouping sets are all left-continuous of the longest grouping set it's doable from a single sorted scan. If done with

Re: [HACKERS] OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets

2003-12-17 Thread Tom Lane
Hannu Krosing [EMAIL PROTECTED] writes: OTOH, I'm not sure if hash aggregates can already spill to disk if not enough memory is available for keeping them all. They do not, which is something it'd be good to fix, since if the planner drastically underestimates the number of groups, you could

Re: [HACKERS] OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets

2003-12-17 Thread Robert Bedell
Hannu Krosing [EMAIL PROTECTED] writes: OTOH, I'm not sure if hash aggregates can already spill to disk if not enough memory is available for keeping them all. They do not, which is something it'd be good to fix, since if the planner drastically underestimates the number of groups, you

Re: [HACKERS] OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets

2003-12-17 Thread Tom Lane
Robert Bedell [EMAIL PROTECTED] writes: 1) When does the optimizer set the nodeAgg plan to HASHED? See grouping_planner() in src/backend/optimizer/plan/planner.c particularly the logic around use_hashed_grouping. 2) What mechanism would be best to use for storing the data on disk? I know

Re: [HACKERS] OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets

2003-12-17 Thread Robert Bedell
Thanks for the pointers! The documentation points to the 'sort_mem' parameter for this, but the code doesn't look to actually implement that yet. Well, yeah, that's sort of exactly the point ... it's considered during planning but the executor code has no fallback if the planner

Re: [HACKERS] OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets

2003-12-17 Thread Tom Lane
Also ... Robert Bedell [EMAIL PROTECTED] writes: ... I think that LookupTupleHashEntry is the best place to put the code to spill over unto disk, since that is used only by the Group, Agg, and Subplan executor nodes. That's the right area to be looking at, but I don't think you can expect to

Re: [HACKERS] OLAP CUBE/ROLLUP Operators and GROUP BY grouping sets

2003-12-17 Thread Robert Bedell
That's the right area to be looking at, but I don't think you can expect to do a decent job with localized hacking in LookupTupleHashEntry. That function's API is predicated on the assumption that you have random access to any entry in the hash table --- which stops being true as soon as you