Re: [HACKERS] Final Patch for GROUPING SETS
On 2015-05-16 00:06:12 +0200, Andres Freund wrote: > Andrew (and I) have been working on this since. Here's the updated and > rebased patch. > > It misses a decent commit message and another beautification > readthrough. I've spent the last hour going through the thing again and > all I hit was a disturbing number of newline "errors" and two minor > comment additions. And committed. Thanks Andrew, everyone. Despite some unhappiness all around I do think the patch has improved due to the discussions in this thread. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
On Thu, May 14, 2015 at 08:59:45AM +0200, Andres Freund wrote: > On 2015-05-14 02:51:42 -0400, Noah Misch wrote: > > Covering hash aggregation might entail a large preparatory refactoring > > of nodeHash.c, but beyond development cost I can't malign that. > > You mean execGrouping.c? Afaics nodeHash.c isn't involved, and it > doesn't look very interesting to make it so? That particular comment of mine was comprehensively wrong.
Re: [HACKERS] Final Patch for GROUPING SETS
On 2015-05-14 09:16:10 +0100, Andrew Gierth wrote: > Andres> A rough sketch of what I'm thinking of is: > > I'm not sure I'd do it quite like that. It was meant as a sketch, so there's lots of things it's probably missing ;) > Rather, have a wrapper function get_outer_tuple that calls > ExecProcNode and, if appropriate, writes the tuple to a tuplesort > before returning it; use that in place of ExecProcNode in > agg_retrieve_direct and when building the hash table. Hm. I'd considered that, but thought it might end up being more complex for hashing support. I'm not exactly sure why I thought that tho.
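The get_outer_tuple idea can be modeled outside the executor. The sketch below is an illustrative Python analogue, not PostgreSQL code (the names mirror the proposal but are hypothetical): a generator pulls rows from the outer plan and, when a later grouping pass needs a different sort order, tees each row into that pass's sort buffer before yielding it unchanged.

```python
def get_outer_tuple(outer_plan, next_sort_buffer):
    """Yield each outer-plan tuple unchanged; if another grouping pass
    needs a re-sort, also capture the tuple into its sort buffer
    (standing in for tuplesort_puttupleslot).  Illustrative model only."""
    for tup in outer_plan:
        if next_sort_buffer is not None:
            next_sort_buffer.append(tup)
        yield tup

rows = [(1, 'a'), (2, 'b'), (3, 'c')]
buf = []
consumed = list(get_outer_tuple(iter(rows), buf))
# every tuple reached the caller unchanged, and was also captured for re-sorting
```

The point of the wrapper is that agg_retrieve_direct and the hash-table build loop would not need to know whether a re-sort is happening; the teeing is hidden behind one call site.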
Re: [HACKERS] Final Patch for GROUPING SETS
> "Andres" == Andres Freund writes: Andres> My problem is that, unless I very much misunderstand something, Andres> the current implementation can end up requiring roughly #sets * Andres> #input of additional space for the "sidechannel tuplestore" in Andres> some bad cases. That happens if you group by a couple clauses Andres> that each lead to a high number of groups. The actual upper bound for the tuplestore size is the size of the _result_ of the grouping, less one or two rows. You get that in cases like grouping sets (unique_col, rollup(constant_col)), which seems sufficiently pathological not to be worth worrying about greatly. In normal cases, the size of the tuplestore is the size of the result minus the rows processed directly by the top node. So the only way the size can be an issue is if the result set size itself is also an issue, and in that case I don't really think that this is going to be a matter of significant concern. Andres> A rough sketch of what I'm thinking of is: I'm not sure I'd do it quite like that. Rather, have a wrapper function get_outer_tuple that calls ExecProcNode and, if appropriate, writes the tuple to a tuplesort before returning it; use that in place of ExecProcNode in agg_retrieve_direct and when building the hash table. The problem with trying to turn agg_retrieve_direct inside-out (to make it look more like agg_retrieve_chained) is that it potentially projects multiple output groups (not just multiple-result projections) from a single input tuple, so it has to have some control over whether a tuple is read or not. (agg_retrieve_chained avoids this problem because it can loop over the projections, since it's writing to the tuplestore rather than returning to the caller.) Andres> I think this is quite doable and seems likely to actually end Andres> up with easier to understand code. But unfortunately it seems Andres> to be big enough of a change to make it unlikely to be done in Andres> sufficient quality until the freeze. 
Andres> I'll nonetheless work a couple hours on it tomorrow. Andres> Andrew, is that a structure you could live with, or not? Well, I still think the opaque-blobness isn't nice, but I retract some of my previous concerns; I can see a way to do it that doesn't significantly impinge on the difficulty of adding hash support. It sounds like I have more time immediately available than you do. As discussed on IRC, I'll take the first shot, and we'll see how far I can get. -- Andrew (irc:RhodiumToad)
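The size argument above (the side-channel tuplestore is bounded by the size of the grouping result, not by #sets * #input) can be demonstrated with a small model. This is a hedged Python sketch of the AGG_CHAINED behaviour, not executor code: each chain node passes every input row through unchanged, aggregates its own grouping set, and writes only one finished row per output group to the shared side channel.

```python
def chain_node(rows, key_col, side_channel):
    """Pass input rows through unchanged; aggregate on key_col and write
    one row per finished group to the shared side channel.  Illustrative
    model of an AGG_CHAINED node -- counts stand in for transition values."""
    counts = {}
    for row in rows:
        counts[row[key_col]] = counts.get(row[key_col], 0) + 1
        yield row
    for key, n in counts.items():
        side_channel.append((key_col, key, n))

side = []
rows = [('a', 1), ('a', 2), ('b', 1)]
# chain two nodes grouping by different columns; the head drains 'side' last
passed = list(chain_node(chain_node(iter(rows), 0, side), 1, side))
```

Note that `side` ends up holding exactly one entry per output group of the chained sets, which is why its size tracks the result size rather than #sets times the input size.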
Re: [HACKERS] Final Patch for GROUPING SETS
On 2015-05-14 02:51:42 -0400, Noah Misch wrote: > Covering hash aggregation might entail a large preparatory refactoring > of nodeHash.c, but beyond development cost I can't malign that. You mean execGrouping.c? Afaics nodeHash.c isn't involved, and it doesn't look very interesting to make it so? Isn't that just calling BuildTupleHashTable() for each to-be-hash-aggregated set, and then making agg_fill_hash_table() target multiple hashtables? This mostly seems to be adding a couple loops and parameters. Greetings, Andres Freund
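The one-hashtable-per-set structure can be illustrated independently of execGrouping.c. In this Python model (hypothetical names, not PostgreSQL APIs), each grouping set gets its own hash table keyed on its subset of columns, and a single pass over the input advances all of them, which is the extra loop the message describes:

```python
from collections import defaultdict

def fill_hash_tables(rows, grouping_sets):
    """One pass over the input; one hash table per grouping set.  Each
    table maps that set's key columns to a running count (standing in
    for real transition values)."""
    tables = [defaultdict(int) for _ in grouping_sets]
    for row in rows:
        for table, cols in zip(tables, grouping_sets):
            key = tuple(row[c] for c in cols)
            table[key] += 1  # advance_aggregates() analogue
    return tables

rows = [('a', 1), ('a', 2), ('b', 1)]
t_col0, t_col1 = fill_hash_tables(rows, [(0,), (1,)])
```

The memory cost is one hash table per hashed set, but the input is read only once, which is the advantage over a UNION ALL of separate HashAggregate nodes.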
Re: [HACKERS] Final Patch for GROUPING SETS
On Thu, May 14, 2015 at 08:38:07AM +0200, Andres Freund wrote: > On 2015-05-14 02:32:04 -0400, Noah Misch wrote: > > On Thu, May 14, 2015 at 07:50:31AM +0200, Andres Freund wrote: > > > Andrew, is that a structure you could live with, or not? > > > > > > Others, what do you think? > > > > Andrew and I discussed that very structure upthread: > > > http://www.postgresql.org/message-id/87d26zd9k8@news-spur.riddles.org.uk > > I don't really believe that that'd necessarily be true. I think if done > like I sketched it'll likely end up being simpler than the currently > proposed code. I also don't see why this would make combining hashing > and sorting any more complex than now. If anything the contrary. > > > http://www.postgresql.org/message-id/20141231085845.ga2148...@tornado.leadboat.com > > http://www.postgresql.org/message-id/20141231210553.gb2159...@tornado.leadboat.com > > > > I still believe the words I wrote in my two messages cited. > > I.e. that you think it's a sane approach, despite the criticism? Yes. I won't warrant that it proves better, but it looks promising. Covering hash aggregation might entail a large preparatory refactoring of nodeHash.c, but beyond development cost I can't malign that.
Re: [HACKERS] Final Patch for GROUPING SETS
On 2015-05-14 02:32:04 -0400, Noah Misch wrote: > On Thu, May 14, 2015 at 07:50:31AM +0200, Andres Freund wrote: > > Andrew, is that a structure you could live with, or not? > > > > Others, what do you think? > > Andrew and I discussed that very structure upthread: > http://www.postgresql.org/message-id/87d26zd9k8@news-spur.riddles.org.uk I don't really believe that that'd necessarily be true. I think if done like I sketched it'll likely end up being simpler than the currently proposed code. I also don't see why this would make combining hashing and sorting any more complex than now. If anything the contrary. > http://www.postgresql.org/message-id/20141231085845.ga2148...@tornado.leadboat.com > http://www.postgresql.org/message-id/20141231210553.gb2159...@tornado.leadboat.com > > I still believe the words I wrote in my two messages cited. I.e. that you think it's a sane approach, despite the criticism? Greetings, Andres Freund
Re: [HACKERS] Final Patch for GROUPING SETS
On Thu, May 14, 2015 at 07:50:31AM +0200, Andres Freund wrote: > I still believe that the general approach of chaining vs. a union or CTE > is correct due to the efficiency arguments upthread. My problem is > that, unless I very much misunderstand something, the current > implementation can end up requiring roughly #sets * #input of additional > space for the "sidechannel tuplestore" in some bad cases. That happens > if you group by a couple clauses that each lead to a high number of > groups. Correct. > Andrew, is that a structure you could live with, or not? > > Others, what do you think? Andrew and I discussed that very structure upthread: http://www.postgresql.org/message-id/20141231085845.ga2148...@tornado.leadboat.com http://www.postgresql.org/message-id/87d26zd9k8@news-spur.riddles.org.uk http://www.postgresql.org/message-id/20141231210553.gb2159...@tornado.leadboat.com I still believe the words I wrote in my two messages cited.
Re: [HACKERS] Final Patch for GROUPING SETS
On 2015-05-13 22:51:15 +0200, Andres Freund wrote: > I'm pretty sure by now that I dislike the introduction of GroupedVar, > and not just tentatively. While I can see why you found its > introduction to be nicer than fiddling with the result tuple, for me the > disadvantages seem to outweigh the advantage. For one it's rather weird > to have Var nodes be changed into GroupedVar in setrefs.c. The number > of places that need to be touched even when it's a 'planned stmt only' > type of node is still pretty large. > > Andrew: I'll work on changing this in a couple hours unless you're > speaking up about doing it yourself. I took a stab at removing it, and it imo definitely ends up looking better. The code for the GroupedVar replacement isn't perfect yet, but I think it'd be possible to clean that up by Friday. Unfortunately, after prolonged staring out of the window, I came to the conclusion that the current tree structure isn't right. I still believe that the general approach of chaining vs. a union or CTE is correct due to the efficiency arguments upthread. My problem is that, unless I very much misunderstand something, the current implementation can end up requiring roughly #sets * #input of additional space for the "sidechannel tuplestore" in some bad cases. That happens if you group by a couple clauses that each lead to a high number of groups. That happens because the aggregated rows produced in the chain nodes can't be returned up-tree, because the next chain (or final group aggregate) node will expect unaggregated tuples. The current solution for that is to move the aggregated rows produced in chain nodes into a tuplestore that's then drained when the top-level aggregate node has done its own job. While that's probably not too bad in many cases, because in most use cases aggregation will be relatively effective, it does seem to be further evidence that the sidechannel tuplestore isn't the perfect idea.
What I think we should/need to do instead is to do the chaining locally inside one aggregation node. That way the aggregated tuples can be returned directly, without the sidechannel. While that will require inlining part of the code from nodeSort.c, it doesn't seem too bad. Besides the advantage of getting rid of that tuplestore, it'll also fix the explain performance problems (as there's no deep tree to traverse via ruleutils.c) and get rid of the preemptive ExecReScan() currently used to control memory usage. I think it might also make combined hashing/sorting easier. A rough sketch of what I'm thinking of is:

ExecAgg()
{
    ...
    while (!aggstate->consumed_input)
    {
        outerslot = ExecProcNode(outerPlanState(aggstate));
        if (TupIsNull(outerslot))
        {
            consumed_input = true;
            break;
        }
        if (aggstate->doing_hashing)
        {
            entry = lookup_hash_entry(aggstate, outerslot);

            /* Advance the aggregates */
            advance_aggregates(aggstate, entry->pergroup);
        }
        if (aggstate->presorted_input || AGG_PLAIN)
        {
            /* handle aggregation, return if done with group */
        }
        if (aggstate->doing_chaining)
        {
            tuplesort_puttupleslot(tuplesortstate, slot);
        }
    }

    if (aggstate->doing_hashing && !done)
        agg_retrieve_hashed();

    /*
     * Feed data from one sort to the next, to compute grouping sets that
     * need differing sort orders.
     */
    last_sort = tuplesortstate[0];
    current_sort = numGroupingSets > 0 ? tuplesortstate[1] : NULL;

    while (aggstate->doing_chaining && !done_sorting)
    {
        tuplesort_gettupleslot(last_sort, tmpslot);

        /* exhausted all tuples from a particular sort order, move on */
        if (TupIsNull(tmpslot))
        {
            finalize_aggregates(...);
            tuplesort_end(last_sort); /* maybe save stats somewhere? */
            last_sort = current_sort;
            current_sort = tuplesortstate[...];
            if (all_sorts_done)
                done_sorting = true;
            return aggregated;
        }

        if (current_sort != NULL)
            tuplesort_puttupleslot(current_sort, slot);

        /* check if we crossed a boundary */
        if (!execTuplesMatch(...))
        {
            finalize_aggregates(...);
            aggstate->grp_firstTuple = ...
            return aggregated;
        }

        advance_aggregates();
        tuplesort_puttupleslot(current_sort, slot);
    }
}

I think this is quite doable and seems likely to actually end up with easier-to-understand code. But unfortunately it seems to be a big enough change that it's unlikely to be done in sufficient quality before the freeze. I'll nonetheless work a couple hours on it tomorrow. Andrew, is that a structure you could live with, or not? Others, what do you think? Greetings, Andres Freund
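The feed-one-sort-into-the-next loop in the sketch can be modeled in miniature. This Python sketch is a hypothetical simplification, far removed from the real executor: it aggregates each grouping column in turn, re-sorting the same rows for the next pass inside one function instead of keeping a chain of plan nodes, so aggregated rows are emitted directly rather than through a side channel.

```python
def agg_sorted_passes(rows, grouping_cols):
    """For each grouping column: sort the rows for it, aggregate group by
    group, and reuse the same row list for the next pass.  Models doing
    the chaining inside one Agg node rather than across a tree of nodes."""
    results = []
    for col in grouping_cols:
        rows = sorted(rows, key=lambda r: r[col])  # the next sort order
        group_key, count = None, 0
        for row in rows:
            if row[col] != group_key:              # execTuplesMatch() analogue
                if count:
                    results.append((col, group_key, count))
                group_key, count = row[col], 0
            count += 1
        if count:
            results.append((col, group_key, count))  # finalize last group
    return results

out = agg_sorted_passes([('a', 2), ('b', 1), ('a', 1)], [0, 1])
```

Because each pass finishes with the rows already in hand, the memory high-water mark is roughly the current sort plus the next one, matching the "twice the usage for a single node" cap discussed elsewhere in the thread.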
Re: [HACKERS] Final Patch for GROUPING SETS
On 2015-05-12 05:36:19 +0200, Andres Freund wrote: > > Another controversial item was the introduction of GroupedVar. The need > > for this can be avoided by explicitly setting to NULL the relevant > > columns of the representative group tuple when evaluating result rows, > > but (a) I don't think that's an especially clean approach (though I'm > > not pushing back very hard on it) and (b) the logic needed in its > > absence is different between the current chaining implementation and a > > possible union implementation, so I decided against making any changes > > on wasted-effort grounds. > > Seems like a fairly minor point to me. I very tentatively lean towards > setting the columns in the group tuple to NULL. I'm pretty sure by now that I dislike the introduction of GroupedVar, and not just tentatively. While I can see why you found its introduction to be nicer than fiddling with the result tuple, for me the disadvantages seem to outweigh the advantage. For one it's rather weird to have Var nodes be changed into GroupedVar in setrefs.c. The number of places that need to be touched even when it's a 'planned stmt only' type of node is still pretty large. Andrew: I'll work on changing this in a couple hours unless you're speaking up about doing it yourself. Greetings, Andres Freund
Re: [HACKERS] Final Patch for GROUPING SETS
> "Andres" == Andres Freund writes: Andres> Andrew, are you going to be working on any of these? As discussed on IRC, current status is: >>> * The increased complexity of grouping_planner. It'd imo be good if some >>> of that could be refactored into a separate function. Specifically the >>> else if (parse->hasAggs || (parse->groupingSets && parse->groupClause)) >>> block. done and pushed at you >>> * The Hopcroft-Karp stuff not being separate done and pushed Andres> * to split agg_retrieve_direct into a version for grouping sets Andres> and one without. I think that'll be a pretty clear win for Andres> clarity. I don't see how this helps given that the grouping sets version will be exactly as complex as the current code. Andres> * to spin out common code between agg_retrieve_direct (in both Andres> the functions it's split into), agg_retrieve_hashed and Andres> agg_retrieve_chained. It should e.g. be fairly simple to spin Andres> out the tail end processing of an input group Andres> (finalize_aggregate loop, ExecQual) into a separate function. This isn't _quite_ as simple as it sounds but I'll have a go. >> * The code in nodeAgg.c isn't pretty in places. Stuff like if >> (node->chain_depth > 0) estate->agg_chain_head = save_chain_head;... Feels like a good bit of cleanup would be possible there. I'll look. -- Andrew (irc:RhodiumToad)
Re: [HACKERS] Final Patch for GROUPING SETS
On 2015-05-12 20:40:49 +0200, Andres Freund wrote: > On 2015-05-12 05:36:19 +0200, Andres Freund wrote: > > What I dislike so far: > > * Minor formatting things. Just going to fix and push the ones I > > dislike. > > * The Hopcroft-Karp stuff not being separate > > * The increased complexity of grouping_planner. It'd imo be good if some > > of that could be refactored into a separate function. Specifically the > > else if (parse->hasAggs || (parse->groupingSets && parse->groupClause)) > > block. > > * I think it'd not hurt to add a rule deparse check for the function in > > the GROUPING SETS case. I didn't see one at least. > > * The code in nodeAgg.c isn't pretty in places. Stuff like if > (node->chain_depth > 0) estate->agg_chain_head = save_chain_head;... > Feels like a good bit of cleanup would be possible there. In the executor I'd further like: * to split agg_retrieve_direct into a version for grouping sets and one without. I think that'll be a pretty clear win for clarity. * to spin out common code between agg_retrieve_direct (in both the functions it's split into), agg_retrieve_hashed and agg_retrieve_chained. It should e.g. be fairly simple to spin out the tail end processing of an input group (finalize_aggregate loop, ExecQual) into a separate function. Andrew, are you going to be working on any of these? Greetings, Andres Freund
Re: [HACKERS] Final Patch for GROUPING SETS
On 2015-05-12 05:36:19 +0200, Andres Freund wrote: > What I dislike so far: > * Minor formatting things. Just going to fix and push the ones I > dislike. > * The Hopcroft-Karp stuff not being separate > * The increased complexity of grouping_planner. It'd imo be good if some > of that could be refactored into a separate function. Specifically the > else if (parse->hasAggs || (parse->groupingSets && parse->groupClause)) > block. > * I think it'd not hurt to add a rule deparse check for the function in > the GROUPING SETS case. I didn't see one at least. * The code in nodeAgg.c isn't pretty in places. Stuff like if (node->chain_depth > 0) estate->agg_chain_head = save_chain_head;... Feels like a good bit of cleanup would be possible there. > I think the problem is "just" that for each variable, in each grouping > set - a very large number in a large cube - we're recursing through the > whole ChainAggregate tree, as each Var just points to a var one level > lower. > > It might be worthwhile to add a little hack that deparses the variables > against the "lowest" relevant node (i.e. the one below the last chain > agg). Since they'll all have the same targetlist that ought to be safe. I've prototype hacked this, and indeed, adding a shortcut from the intermediate chain nodes to the 'leaf chain node' cuts the explain time from 11 to 2 seconds on some arbitrary statement. The remaining time is the equivalent problem in the sort nodes... I'm not terribly bothered by this. We can relatively easily fix this up if required. Greetings, Andres Freund
Re: [HACKERS] Final Patch for GROUPING SETS
>I think the problem is "just" that for each variable, in each grouping >set - a very large number in a large cube - we're recursing through the >whole ChainAggregate tree, as each Var just points to a var one level >lower. For small values of very large, that is. Had a little thinko there. It's still the fault of recursing down all these levels, doing nontrivial work each time. -- Please excuse brevity and formatting - I am writing this on my mobile phone. Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Final Patch for GROUPING SETS
On 2015-04-30 05:35:26 +0100, Andrew Gierth wrote: > > "Andres" == Andres Freund writes: > > Andres> This is not a real review. I'm just scanning through the > Andres> patch, without reading the thread, to understand if I see > Andres> something "worthy" of controversy. While scanning I might have > Andres> a couple observations or questions. > > >> + * A list of grouping sets which is structurally equivalent to a > ROLLUP > >> + * clause (e.g. (a,b,c), (a,b), (a)) can be processed in a > single pass over > >> + * ordered data. We do this by keeping a separate set of > transition values > >> + * for each grouping set being concurrently processed; for each > input tuple > >> + * we update them all, and on group boundaries we reset some > initial subset > >> + * of the states (the list of grouping sets is ordered from most > specific to > >> + * least specific). One AGG_SORTED node thus handles any number > of grouping > >> + * sets as long as they share a sort order. > > Andres> Found "initial subset" not very clear, even if I probably > Andres> guessed the right meaning. > > How about: > > * [...], and on group boundaries we reset those states > * (starting at the front of the list) whose grouping values have > * changed (the list of grouping sets is ordered from most specific to > * least specific). One AGG_SORTED node thus handles any number [...] sounds good. > >> + * To handle multiple grouping sets that _don't_ share a sort > order, we use > >> + * a different strategy. An AGG_CHAINED node receives rows in > sorted order > >> + * and returns them unchanged, but computes transition values > for its own > >> + * list of grouping sets. At group boundaries, rather than > returning the > >> + * aggregated row (which is incompatible with the input rows), > it writes it > >> + * to a side-channel in the form of a tuplestore. 
Thus, a number of > >> + * AGG_CHAINED nodes are associated with a single AGG_SORTED > node (the > >> + * "chain head"), which creates the side channel and, when it > has returned > >> + * all of its own data, returns the tuples from the tuplestore > to its own > >> + * caller. > > Andres> This paragraph deserves to be expanded imo. > > OK, but what in particular needs clarifying? I'm not sure ;). I obviously was a bit tired... > Andres> Are you intending to resolve this before an eventual commit? ... > Andres> Possibly after the 'structural' issues are resolved? Or do you > Andres> think this can safely be put off for another release? > > I think the feature is useful even without AGG_HASHED, even though that > means it can sometimes be beaten on performance by using UNION ALL of > many separate GROUP BYs; but I'd defer to the opinions of others on that > point. I agree. > Andres> * Arguably this converts the execution *tree* into a DAG. Tom > Andres> seems to be rather uncomfortable with that. I am wondering > Andres> whether this really is a big deal - essentially this only > Andres> happens in a relatively 'isolated' part of the tree right? > Andres> I.e. if those chained together nodes were considered one node, > Andres> there would not be any loops? Additionally, the way > Andres> parametrized scans work already essentially "violates" the > Andres> tree paradigm somewhat. > > The major downsides as I see them with the current approach are: > > 1. It makes plans (and hence explain output) nest very deeply if you > have complex grouping sets (especially cubes with high dimensionality). That doesn't concern me overly much. If we feel the need to fudge the explain output we certainly can.
However, this would be potentially > significantly slower than teaching HashAgg to do multiple grouping sets, > and memory usage would be an issue. Your "however" imo pretty much disqualifies that as an argument. > The obvious alternative is this: > > -> CTE x > -> entire input subplan here > -> Append > -> GroupAggregate > -> Sort >-> CTE Scan x > -> GroupAggregate > -> Sort >-> CTE Scan x > -> HashAggregate > -> CTE Scan x > [...] > > which was basically what we expected to do originally. But all of the > existing code to deal with CTE / CTEScan is based on the assumption that > each CTE has a rangetable entry before planning starts, and it is > completely non-obvious how to arrange to generate such CTEs on the fly > while planning. Tom offered in December to look into that aspect for > us, and of course we've heard nothing about it since. I find Noah's argument about this kind of structure pretty convincing. We'd eit
Re: [HACKERS] Final Patch for GROUPING SETS
On Thu, Apr 30, 2015 at 05:35:26AM +0100, Andrew Gierth wrote: > > "Andres" == Andres Freund writes: > >> + * TODO: AGG_HASHED doesn't support multiple grouping sets yet. > > Andres> Are you intending to resolve this before an eventual commit? > > Original plan was to tackle AGG_HASHED after a working implementation > was committed; +1 for that plan. > Andres> Possibly after the 'structural' issues are resolved? Or do you > Andres> think this can safely be put off for another release? > > I think the feature is useful even without AGG_HASHED, even though that > means it can sometimes be beaten on performance by using UNION ALL of > many separate GROUP BYs; but I'd defer to the opinions of others on that > point. It will be a tough call, and PostgreSQL has gone each way on some recent feature. I recommend considering both GroupAggregate and HashAggregate in all design discussion but continuing to work toward a first commit implementing GroupAggregate alone. With that in the tree, we'll be in a better position to decide whether to release a feature paused at that stage in its development. Critical facts are uncertain, so a discussion today would be unproductive. > Andres> So, having quickly scanned through the patch, do I understand > Andres> correctly that the contentious problems are: > > Andres> * Arguably this converts the execution *tree* into a DAG. Tom > Andres> seems to be rather uncomfortable with that. I am wondering > Andres> whether this really is a big deal - essentially this only > Andres> happens in a relatively 'isolated' part of the tree right? > Andres> I.e. if those chained together nodes were considered one node, > Andres> there would not be any loops? Additionally, the way > Andres> parametrized scans work already essentially "violates" the > Andres> tree paradigm somewhat. I agree with your assessment. That has been contentious.
It makes plans (and hence explain output) nest very deeply if you > have complex grouping sets (especially cubes with high dimensionality). > > This can make explain very slow in the most extreme cases I'm not worried about that. If anything, the response is to modify explain to more-quickly/compactly present affected plan trees. > 2. A union-based approach would have a chance of including AGG_HASHED > support without any significant code changes, > -> CTE x > -> entire input subplan here > -> Append > -> GroupAggregate > -> Sort >-> CTE Scan x > -> GroupAggregate > -> Sort >-> CTE Scan x > -> HashAggregate > -> CTE Scan x > [...] This uses 50-67% more I/O than the current strategy, which makes it a dead end from my standpoint. Details: http://www.postgresql.org/message-id/20141221210005.ga1864...@tornado.leadboat.com > Andres> Are those the two bigger controversial areas? Or are there > Andres> others in your respective views? > Another controversial item was the introduction of GroupedVar. I know of no additional controversies to add to this list. Thanks, nm
Re: [HACKERS] Final Patch for GROUPING SETS
> "Andres" == Andres Freund writes: Andres> This is not a real review. I'm just scanning through the Andres> patch, without reading the thread, to understand if I see Andres> something "worthy" of controversy. While scanning I might have Andres> a couple observations or questions. >> + * A list of grouping sets which is structurally equivalent to a ROLLUP >> + * clause (e.g. (a,b,c), (a,b), (a)) can be processed in a single pass >> over >> + * ordered data. We do this by keeping a separate set of transition >> values >> + * for each grouping set being concurrently processed; for each input >> tuple >> + * we update them all, and on group boundaries we reset some initial >> subset >> + * of the states (the list of grouping sets is ordered from most >> specific to >> + * least specific). One AGG_SORTED node thus handles any number of >> grouping >> + * sets as long as they share a sort order. Andres> Found "initial subset" not very clear, even if I probably Andres> guessed the right meaning. How about: * [...], and on group boundaries we reset those states * (starting at the front of the list) whose grouping values have * changed (the list of grouping sets is ordered from most specific to * least specific). One AGG_SORTED node thus handles any number [...] >> + * To handle multiple grouping sets that _don't_ share a sort order, we >> use >> + * a different strategy. An AGG_CHAINED node receives rows in sorted >> order >> + * and returns them unchanged, but computes transition values for its own >> + * list of grouping sets. At group boundaries, rather than returning the >> + * aggregated row (which is incompatible with the input rows), it writes >> it >> + * to a side-channel in the form of a tuplestore. Thus, a number of >> + * AGG_CHAINED nodes are associated with a single AGG_SORTED node (the >> + * "chain head"), which creates the side channel and, when it has >> returned >> + * all of its own data, returns the tuples from the tuplestore to its own >> + * caller. 
Andres> This paragraph deserves to be expanded imo. OK, but what in particular needs clarifying? >> + * In order to avoid excess memory consumption from a chain of >> alternating >> + * Sort and AGG_CHAINED nodes, we reset each child Sort node >> preemptively, >> + * allowing us to cap the memory usage for all the sorts in the chain at >> + * twice the usage for a single node. Andres> What does resetting 'preemptively' mean? Plan nodes are normally not reset (in the sense of calling ExecReScan) just because they finished, but rather it's done before a subsequent new scan is done. Doing the rescan call after all the sorted output has been read means we discard the data from each sort node as soon as it is transferred to the next one. There is a more specific comment in agg_retrieve_chained where this actually happens. >> + * From the perspective of aggregate transition and final functions, the >> + * only issue regarding grouping sets is this: a single call site >> (flinfo) >> + * of an aggregate function may be used for updating several different >> + * transition values in turn. So the function must not cache in the >> flinfo >> + * anything which logically belongs as part of the transition value (most >> + * importantly, the memory context in which the transition value exists). >> + * The support API functions (AggCheckCallContext, AggRegisterCallback) >> are >> + * sensitive to the grouping set for which the aggregate function is >> + * currently being called. Andres> Hm. I've seen a bunch of aggregates do this. Such as? This was discussed about a year ago in the context of WITHIN GROUP: http://www.postgresql.org/message-id/87r424i24w@news-spur.riddles.org.uk >> + * TODO: AGG_HASHED doesn't support multiple grouping sets yet. Andres> Are you intending to resolve this before an eventual commit?
Original plan was to tackle AGG_HASHED after a working implementation was committed; we figured that we'd have two commitfests to get the basics right, and then have a chance to get AGG_HASHED done for the third one. Also, there was talk of other people working on hashagg memory usage issues, and we didn't want to conflict with that. Naturally the extended delays rather put paid to that plan.

Going ahead and writing code for AGG_HASHED anyway wasn't really an option, since with the overall structural questions unresolved there was too much chance of it being wasted effort.

 Andres> Possibly after the 'structural' issues are resolved? Or do you
 Andres> think this can safely be put off for another release?

I think the feature is useful even without AGG_HASHED, even though that means it can sometimes be beaten on performance by using UNION ALL of many separate GROUP BYs; but I'd defer to the opinions of others on that point.

 Andres> Maybe it's just me, but I get twitchy if I se
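To illustrate the AGG_SORTED mechanism under discussion (one transition state per grouping set, with a leading subset of states reset on group boundaries), here is a hedged Python sketch. It is not the patch's C code: it assumes count() as the only aggregate, and the function name `rollup_count` is invented for the example.

```python
def rollup_count(sorted_rows, grouping_sets):
    """One pass over pre-sorted rows for a ROLLUP-shaped list of grouping
    sets, ordered most specific first, e.g. [('a','b'), ('a',), ()].
    Keeps one transition state (a count) per grouping set; on a group
    boundary it emits and resets exactly the leading subset of states
    whose grouping values changed."""
    n = len(grouping_sets)
    states = [0] * n
    current = [None] * n            # grouping values per set; None = no group yet
    out = []
    for row in sorted_rows:
        keys = [tuple(row[c] for c in gs) for gs in grouping_sets]
        # find the least specific set whose key changed; every set before
        # it in the list (the more specific ones) must have changed too
        changed = max((i for i in range(n)
                       if current[i] is not None and keys[i] != current[i]),
                      default=-1)
        for j in range(changed + 1):
            out.append((grouping_sets[j], current[j], states[j]))
            states[j] = 0
        for i in range(n):
            current[i] = keys[i]
            states[i] += 1
    for i in range(n):              # every set hits a boundary at end of input
        if current[i] is not None:
            out.append((grouping_sets[i], current[i], states[i]))
    return out
```

With rows sorted on (a,b), a single call produces results for (a,b), (a) and () together, which is the point of one AGG_SORTED node handling a whole rollup list in a single pass.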
Re: [HACKERS] Final Patch for GROUPING SETS
Hi,

This is not a real review. I'm just scanning through the patch, without reading the thread, to understand if I see something "worthy" of controversy. While scanning I might have a couple observations or questions.

On 2015-03-13 15:46:15 +, Andrew Gierth wrote:
> + * A list of grouping sets which is structurally equivalent to a ROLLUP
> + * clause (e.g. (a,b,c), (a,b), (a)) can be processed in a single pass over
> + * ordered data. We do this by keeping a separate set of transition values
> + * for each grouping set being concurrently processed; for each input tuple
> + * we update them all, and on group boundaries we reset some initial subset
> + * of the states (the list of grouping sets is ordered from most specific to
> + * least specific). One AGG_SORTED node thus handles any number of grouping
> + * sets as long as they share a sort order.

Found "initial subset" not very clear, even if I probably guessed the right meaning.

> + * To handle multiple grouping sets that _don't_ share a sort order, we use
> + * a different strategy. An AGG_CHAINED node receives rows in sorted order
> + * and returns them unchanged, but computes transition values for its own
> + * list of grouping sets. At group boundaries, rather than returning the
> + * aggregated row (which is incompatible with the input rows), it writes it
> + * to a side-channel in the form of a tuplestore. Thus, a number of
> + * AGG_CHAINED nodes are associated with a single AGG_SORTED node (the
> + * "chain head"), which creates the side channel and, when it has returned
> + * all of its own data, returns the tuples from the tuplestore to its own
> + * caller.

This paragraph deserves to be expanded imo.

> + * In order to avoid excess memory consumption from a chain of alternating
> + * Sort and AGG_CHAINED nodes, we reset each child Sort node preemptively,
> + * allowing us to cap the memory usage for all the sorts in the chain at
> + * twice the usage for a single node.
What does resetting 'preemptively' mean?

> + * From the perspective of aggregate transition and final functions, the
> + * only issue regarding grouping sets is this: a single call site (flinfo)
> + * of an aggregate function may be used for updating several different
> + * transition values in turn. So the function must not cache in the flinfo
> + * anything which logically belongs as part of the transition value (most
> + * importantly, the memory context in which the transition value exists).
> + * The support API functions (AggCheckCallContext, AggRegisterCallback) are
> + * sensitive to the grouping set for which the aggregate function is
> + * currently being called.

Hm. I've seen a bunch of aggregates do this.

> + * TODO: AGG_HASHED doesn't support multiple grouping sets yet.

Are you intending to resolve this before an eventual commit? Possibly after the 'structural' issues are resolved? Or do you think this can safely be put off for another release?

> @@ -534,11 +603,13 @@ static void
>  advance_aggregates(AggState *aggstate, AggStatePerGroup pergroup)
>  {
>      int         aggno;
> +    int         setno = 0;
> +    int         numGroupingSets = Max(aggstate->numsets, 1);
> +    int         numAggs = aggstate->numaggs;
>
> -    for (aggno = 0; aggno < aggstate->numaggs; aggno++)
> +    for (aggno = 0; aggno < numAggs; aggno++)
>      {
>          AggStatePerAgg peraggstate = &aggstate->peragg[aggno];
> -        AggStatePerGroup pergroupstate = &pergroup[aggno];
>          ExprState  *filter = peraggstate->aggrefstate->aggfilter;
>          int         numTransInputs = peraggstate->numTransInputs;
>          int         i;
> @@ -582,13 +653,16 @@ advance_aggregates(AggState *aggstate, AggStatePerGroup pergroup)
>              continue;
>          }
>
> -        /* OK, put the tuple into the tuplesort object */
> -        if (peraggstate->numInputs == 1)
> -            tuplesort_putdatum(peraggstate->sortstate,
> -                               slot->tts_values[0],
> -                               slot->tts_isnull[0]);
> -        else
> -            tuplesort_puttupleslot(peraggstate->sortstate, slot);
> +        for (setno = 0; setno < numGroupingSets; setno++)
> +        {
> +            /* OK, put the tuple into the tuplesort object */
> +            if (peraggstate->numInputs == 1)
> +                tuplesort_putdatum(peraggstate->sortstates[setno],
> +                                   slot->tt
Re: [HACKERS] Final Patch for GROUPING SETS
On Wed, Jan 21, 2015 at 6:02 AM, Andrew Gierth wrote: > Updated patch (mostly just conflict resolution): > > - fix explain code to track changes to deparse context handling > > - tiny expansion of some comments (clarify in nodeAgg header >comment that aggcontexts are now EContexts rather than just >memory contexts) > > - declare support for features in sql_features.txt, which had been >previously overlooked > > Patch moved to CF 2015-02. -- Michael
Re: [HACKERS] Final Patch for GROUPING SETS
On Fri, Jan 02, 2015 at 03:55:23PM -0600, Jim Nasby wrote: > On 12/31/14, 3:05 PM, Noah Misch wrote: > >On Wed, Dec 31, 2014 at 05:33:43PM +, Andrew Gierth wrote: > >"Noah" == Noah Misch writes: > >>> > >>> Noah> Suppose one node orchestrated all sorting and aggregation. > >>> > >>>Well, that has the downside of making it into an opaque blob, without > >>>actually gaining much. > >The opaque-blob criticism is valid. As for not gaining much, well, the gain > >I > >sought was to break this stalemate. You and Tom have expressed willingness > >to > >accept the read I/O multiplier of the CTE approach. You and I are willing to > >swallow an architecture disruption, namely a tuplestore acting as a side > >channel between executor nodes. Given your NACK, I agree that it fails to > >move us toward consensus and therefore does not gain much. Alas. > > I haven't read the full discussion in depth, but is what we'd want here is > the ability to feed tuples to more than one node simultaneously? A similar comment appeared shortly upthread. Given a planner and executor capable of that, we would do so here. Changing the planner and executor architecture to support it is its own large, open-ended project. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
On 12/31/14, 3:05 PM, Noah Misch wrote:
> On Wed, Dec 31, 2014 at 05:33:43PM +, Andrew Gierth wrote:
>> > "Noah" == Noah Misch writes:
>>
>>  Noah> Suppose one node orchestrated all sorting and aggregation.
>>
>> Well, that has the downside of making it into an opaque blob, without
>> actually gaining much.
>
> The opaque-blob criticism is valid. As for not gaining much, well, the gain I
> sought was to break this stalemate. You and Tom have expressed willingness to
> accept the read I/O multiplier of the CTE approach. You and I are willing to
> swallow an architecture disruption, namely a tuplestore acting as a side
> channel between executor nodes. Given your NACK, I agree that it fails to
> move us toward consensus and therefore does not gain much. Alas.

I haven't read the full discussion in depth, but is what we'd want here the ability to feed tuples to more than one node simultaneously? That would allow things like

GroupAggregate
  --> Sort(a)   \
                 +--> Sort(a,b) -\
  --> Hash(b)   +                 \--> SeqScan

That would allow the planner to trade off things like total memory consumption vs IO.

-- Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
On Wed, Dec 31, 2014 at 05:33:43PM +, Andrew Gierth wrote: > > "Noah" == Noah Misch writes: > > Noah> Suppose one node orchestrated all sorting and aggregation. > > Well, that has the downside of making it into an opaque blob, without > actually gaining much. The opaque-blob criticism is valid. As for not gaining much, well, the gain I sought was to break this stalemate. You and Tom have expressed willingness to accept the read I/O multiplier of the CTE approach. You and I are willing to swallow an architecture disruption, namely a tuplestore acting as a side channel between executor nodes. Given your NACK, I agree that it fails to move us toward consensus and therefore does not gain much. Alas. > A more serious objection is that this forecloses (or at least makes > much more complex) the future possibility of doing some grouping sets > by sorting and others by hashing. The chained approach specifically > allows for the future possibility of using a HashAggregate as the > chain head, so that for example cube(a,b) can be implemented as a > sorted agg for (a,b) and (a) and a hashed agg for (b) and (), allowing > it to be done with one sort even if the result size for (a,b) is too > big to hash. That's a fair criticism, too. Ingesting nodeSort.c into nodeAgg.c wouldn't be too bad, because nodeSort.c is a thin wrapper around tuplesort.c. Ingesting nodeHash.c is not so tidy; that could entail extracting a module similar in level to tuplesort.c, to be consumed by both executor nodes. This does raise the good point that the GROUPING SETS _design_ ought to consider group and hash aggregation together. Designing one in isolation carries too high of a risk of painting the other into a corner. Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
On Wed, Dec 31, 2014 at 02:45:29PM +0530, Atri Sharma wrote: > > Suppose one node orchestrated all sorting and aggregation. Call it a > > MultiGroupAggregate for now. It wouldn't harness Sort nodes, because it > > performs aggregation between tuplesort_puttupleslot() calls. Instead, it > > would directly manage two Tuplesortstate, CUR and NEXT. The node would > > have > > an initial phase similar to ExecSort(), in which it drains the outer node > > to > > populate the first CUR. After that, it looks more like > > agg_retrieve_direct(), > > except that CUR is the input source, and each tuple drawn is also put into > > NEXT. When done with one CUR, swap CUR with NEXT and reinitialize NEXT. > > This > > design does not add I/O consumption or require a nonstandard communication > > channel between executor nodes. Tom, Andrew, does that look satisfactory? > > > > > So you are essentially proposing merging ChainAggregate and its > corresponding Sort node? > > So the structure would be something like: > > GroupAggregate > --> MultiGroupAgg (a,b) > > MultiGroupAgg (c,d) ... No. > If you are proposing > replacing GroupAggregate node + entire ChainAggregate + Sort nodes stack > with a single MultiGroupAggregate node, I am not able to understand how it > will handle all the multiple sort orders. Yes, I was proposing that. My paragraph that you quoted above was the attempt to explain how the node would manage multiple sort orders. If you have specific questions about it, feel free to ask. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
> "Noah" == Noah Misch writes: Noah> Suppose one node orchestrated all sorting and aggregation. Well, that has the downside of making it into an opaque blob, without actually gaining much. Noah> Call it a MultiGroupAggregate for now. It wouldn't harness Noah> Sort nodes, because it performs aggregation between Noah> tuplesort_puttupleslot() calls. Instead, it would directly Noah> manage two Tuplesortstate, CUR and NEXT. The node would have Noah> an initial phase similar to ExecSort(), in which it drains the Noah> outer node to populate the first CUR. After that, it looks Noah> more like agg_retrieve_direct(), agg_retrieve_direct is already complex enough, and this would be substantially more so, as compared to agg_retrieve_chained which is substantially simpler. A more serious objection is that this forecloses (or at least makes much more complex) the future possibility of doing some grouping sets by sorting and others by hashing. The chained approach specifically allows for the future possibility of using a HashAggregate as the chain head, so that for example cube(a,b) can be implemented as a sorted agg for (a,b) and (a) and a hashed agg for (b) and (), allowing it to be done with one sort even if the result size for (a,b) is too big to hash. Noah> Tom, Andrew, does that look satisfactory? Not to me. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
> ChainAggregate is a bit like a node having two parents, a Sort and a
> GroupAggregate. However, the graph edge between ChainAggregate and its
> GroupAggregate is a tuplestore instead of the usual, synchronous
> ExecProcNode().

Well, I don't buy the two-parents theory. The Sort nodes are intermediately stacked amongst ChainAggregate nodes, so there is still the single edge. However, as you rightly said, there is a shared tuplestore, but note that only the head-of-chain ChainAggregate has the top GroupAggregate as its parent.

> Suppose one node orchestrated all sorting and aggregation. Call it a
> MultiGroupAggregate for now. It wouldn't harness Sort nodes, because it
> performs aggregation between tuplesort_puttupleslot() calls. Instead, it
> would directly manage two Tuplesortstate, CUR and NEXT. The node would have
> an initial phase similar to ExecSort(), in which it drains the outer node to
> populate the first CUR. After that, it looks more like agg_retrieve_direct(),
> except that CUR is the input source, and each tuple drawn is also put into
> NEXT. When done with one CUR, swap CUR with NEXT and reinitialize NEXT. This
> design does not add I/O consumption or require a nonstandard communication
> channel between executor nodes. Tom, Andrew, does that look satisfactory?

So you are essentially proposing merging ChainAggregate and its corresponding Sort node?

So the structure would be something like:

GroupAggregate
--> MultiGroupAgg (a,b)
--> MultiGroupAgg (c,d)
...

I am not sure if I understand you correctly. Only the top-level GroupAggregate node projects the result of the entire operation. The key to ChainAggregate nodes is that each ChainAggregate node handles grouping sets that fit a single ROLLUP list, i.e. can be done by a single sort order. There can be multiple lists of this type in a single GS operation; however, our current design has only a single top GroupAggregate node but a ChainAggregate node + Sort node per sort order.
If you are proposing replacing GroupAggregate node + entire ChainAggregate + Sort nodes stack with a single MultiGroupAggregate node, I am not able to understand how it will handle all the multiple sort orders. If you are proposing replacing only ChainAggregate + Sort node with a single MultiGroupAgg node, that still shares the tuplestore with top level GroupAggregate node. I am pretty sure I have messed up my understanding of your proposal. Please correct me if I am wrong. Regards, Atri -- Regards, Atri *l'apprenant*
Re: [HACKERS] Final Patch for GROUPING SETS
On Tue, Dec 23, 2014 at 02:29:58AM -0500, Noah Misch wrote:
> On Mon, Dec 22, 2014 at 10:46:16AM -0500, Tom Lane wrote:
> > I still find the ChainAggregate approach too ugly at a system structural
> > level to accept, regardless of Noah's argument about number of I/O cycles
> > consumed. We'll be paying for that in complexity and bugs into the
> > indefinite future, and I wonder if it isn't going to foreclose some other
> > "performance opportunities" as well.
>
> Among GROUPING SETS GroupAggregate implementations, I bet there's a nonempty
> intersection between those having maintainable design and those having optimal
> I/O usage, optimal memory usage, and optimal number of sorts. Let's put more
> effort into finding it. I'm hearing that the shared tuplestore is
> ChainAggregate's principal threat to system structure; is that right?

The underlying algorithm, if naively expressed in terms of our executor concepts, would call ExecProcNode() on a SortState, then feed the resulting slot to both a GroupAggregate and to another Sort. That implies a non-tree graph of executor nodes, which isn't going to fly anytime soon. The CTE approach bypasses that problem by eliminating cooperation between sorts, instead reading 2N+1 copies of the source data for N sorts. ChainAggregate is a bit like a node having two parents, a Sort and a GroupAggregate. However, the graph edge between ChainAggregate and its GroupAggregate is a tuplestore instead of the usual, synchronous ExecProcNode().

Suppose one node orchestrated all sorting and aggregation. Call it a MultiGroupAggregate for now. It wouldn't harness Sort nodes, because it performs aggregation between tuplesort_puttupleslot() calls. Instead, it would directly manage two Tuplesortstate, CUR and NEXT. The node would have an initial phase similar to ExecSort(), in which it drains the outer node to populate the first CUR.
After that, it looks more like agg_retrieve_direct(), except that CUR is the input source, and each tuple drawn is also put into NEXT. When done with one CUR, swap CUR with NEXT and reinitialize NEXT. This design does not add I/O consumption or require a nonstandard communication channel between executor nodes. Tom, Andrew, does that look satisfactory? Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
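As a rough illustration of this proposal (not an implementation of it), the CUR/NEXT shuffle can be simulated in Python, standing in for tuplesorts with sorted in-memory lists and using count(*) as the aggregation; the name multi_group_agg and the dict-of-counts output are invented for the sketch.

```python
def multi_group_agg(outer_rows, sort_orders):
    """Sketch of the proposed MultiGroupAggregate: one node managing two
    'tuplesorts', CUR and NEXT.  sort_orders[k] is a tuple of column names
    naming one grouping; CUR holds the input sorted for the current
    grouping, and each tuple drawn from it is also fed into NEXT."""
    results = {}
    # initial phase, similar to ExecSort(): drain the outer plan into CUR
    cur = sorted(outer_rows, key=lambda r: tuple(r[c] for c in sort_orders[0]))
    for k, cols in enumerate(sort_orders):
        nxt = []
        counts = {}
        for row in cur:                    # CUR is the input source...
            key = tuple(row[c] for c in cols)
            counts[key] = counts.get(key, 0) + 1
            nxt.append(row)                # ...and each tuple drawn feeds NEXT
        results[cols] = counts
        if k + 1 < len(sort_orders):       # swap CUR with NEXT, reinitialize
            nxt.sort(key=lambda r: tuple(r[c] for c in sort_orders[k + 1]))
            cur = nxt
    return results
```

In the sketch the sort happens at swap time, corresponding to a performsort on NEXT once the current CUR is exhausted; no extra copy of the input exists beyond the two "tuplesorts", which is the memory property claimed above.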
Re: [HACKERS] Final Patch for GROUPING SETS
On Mon, Dec 22, 2014 at 6:57 PM, Andrew Gierth wrote:
> In the case of cube(a,b,c,d), our code currently gives:
>
> b,d,a,c: (b,d,a,c),(b,d)
> a,b,d:   (a,b,d),(a,b)
> d,a,c:   (d,a,c),(d,a),(d)
> c,d:     (c,d),(c)
> b,c,d:   (b,c,d),(b,c),(b)
> a,c,b:   (a,c,b),(a,c),(a),()

That's pretty cool.

-- Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
On Mon, Dec 22, 2014 at 10:46:16AM -0500, Tom Lane wrote: > I still find the ChainAggregate approach too ugly at a system structural > level to accept, regardless of Noah's argument about number of I/O cycles > consumed. We'll be paying for that in complexity and bugs into the > indefinite future, and I wonder if it isn't going to foreclose some other > "performance opportunities" as well. Among GROUPING SETS GroupAggregate implementations, I bet there's a nonempty intersection between those having maintainable design and those having optimal I/O usage, optimal memory usage, and optimal number of sorts. Let's put more effort into finding it. I'm hearing that the shared tuplestore is ChainAggregate's principal threat to system structure; is that right? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
> "Robert" == Robert Haas writes: >> I would be interested in seeing more good examples of the size and >> type of grouping sets used in typical queries. Robert> From what I have seen, there is interest in being able to do Robert> things like GROUP BY CUBE(a, b, c, d) and have that be Robert> efficient. Yes, but that's not telling me anything I didn't already know. What I'm curious about is things like: - what's the largest cube(...) people actually make use of in practice - do people make much use of the ability to mix cube and rollup, or take the cross product of multiple grouping sets - what's the most complex GROUPING SETS clause anyone has seen in common use Robert> That will require 16 different groupings, and we really want Robert> to minimize the number of times we have to re-sort to get all Robert> of those done. For example, if we start by sorting on (a, b, Robert> c, d), we want to then make a single pass over the data Robert> computing the aggregates with (a, b, c, d), (a, b, c), (a, Robert> b), (a), and () as the grouping columns. In the case of cube(a,b,c,d), our code currently gives: b,d,a,c: (b,d,a,c),(b,d) a,b,d:(a,b,d),(a,b) d,a,c:(d,a,c),(d,a),(d) c,d: (c,d),(c) b,c,d:(b,c,d),(b,c),(b) a,c,b:(a,c,b),(a,c),(a),() There is no solution in less than 6 sorts. (There are many possible solutions in 6 sorts, but we don't attempt to prefer one over another. The minimum number of sorts for a cube of N dimensions is obviously N! / (r! * (N-r)!) where r = floor(N/2).) If you want the theory: the set of grouping sets is a poset ordered by set inclusion; what we want is a minimal partition of this poset into chains (since any chain can be processed in one pass), which happens to be equivalent to the problem of maximum cardinality matching in a bipartite graph, which we solve in polynomial time with the Hopcroft-Karp algorithm. This guarantees us a minimal solution for any combination of grouping sets however specified, not just for cubes. 
-- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
On Tuesday, December 23, 2014, Robert Haas wrote: > On Mon, Dec 22, 2014 at 11:19 AM, Andrew Gierth > > wrote: > > Tom> The other reason that's a bad comparison is that I've not seen > > Tom> many queries that use more than a couple of window frames, > > Tom> whereas we have to expect that the number of grouping sets in > > Tom> typical queries will be significantly more than "a couple". > > > > I would be interested in seeing more good examples of the size and > > type of grouping sets used in typical queries. > > From what I have seen, there is interest in being able to do things > like GROUP BY CUBE(a, b, c, d) and have that be efficient. That will > require 16 different groupings, and we really want to minimize the > number of times we have to re-sort to get all of those done. For > example, if we start by sorting on (a, b, c, d), we want to then make > a single pass over the data computing the aggregates with (a, b, c, > d), (a, b, c), (a, b), (a), and () as the grouping columns. > > > That is what ChainAggregate node does exactly. A set of orders that fit in a single ROLLUP list (like your example) are processed in a single go. -- Regards, Atri *l'apprenant*
Re: [HACKERS] Final Patch for GROUPING SETS
On Mon, Dec 22, 2014 at 11:19 AM, Andrew Gierth wrote: > Tom> The other reason that's a bad comparison is that I've not seen > Tom> many queries that use more than a couple of window frames, > Tom> whereas we have to expect that the number of grouping sets in > Tom> typical queries will be significantly more than "a couple". > > I would be interested in seeing more good examples of the size and > type of grouping sets used in typical queries. >From what I have seen, there is interest in being able to do things like GROUP BY CUBE(a, b, c, d) and have that be efficient. That will require 16 different groupings, and we really want to minimize the number of times we have to re-sort to get all of those done. For example, if we start by sorting on (a, b, c, d), we want to then make a single pass over the data computing the aggregates with (a, b, c, d), (a, b, c), (a, b), (a), and () as the grouping columns. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
> "Tom" == Tom Lane writes: [Noah] >> I caution against using window function performance as the >> template for GROUPING SETS performance goals. The benefit of >> GROUPING SETS compared to its UNION ALL functional equivalent is >> 15% syntactic pleasantness, 85% performance opportunities. >> Contrast that having window functions is great even with naive >> performance, because they enable tasks that are otherwise too hard >> in SQL. Yes, this is a reasonable point. Tom> The other reason that's a bad comparison is that I've not seen Tom> many queries that use more than a couple of window frames, Tom> whereas we have to expect that the number of grouping sets in Tom> typical queries will be significantly more than "a couple". I would be interested in seeing more good examples of the size and type of grouping sets used in typical queries. Tom> So we do have to think about what the performance will be like Tom> with a lot of sort steps. I'm also worried that this use-case Tom> may finally force us to do something about the "one work_mem per Tom> sort node" behavior, unless we can hack things so that only one Tom> or two sorts reach max memory consumption concurrently. Modifying ChainAggregate so that only two sorts reach max memory consumption concurrently seems to have been quite simple to implement, though I'm still testing some aspects of it. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
Noah Misch writes: > On Sat, Dec 13, 2014 at 04:37:48AM +, Andrew Gierth wrote: > "Tom" == Tom Lane writes: >> Tom> That seems pretty grotty from a performance+memory consumption >> Tom> standpoint. At peak memory usage, each one of the Sort nodes >> Tom> will contain every input row, >> Has this objection ever been raised for WindowAgg, which has the same >> issue? > I caution against using window function performance as the template for > GROUPING SETS performance goals. The benefit of GROUPING SETS compared to its > UNION ALL functional equivalent is 15% syntactic pleasantness, 85% performance > opportunities. Contrast that having window functions is great even with naive > performance, because they enable tasks that are otherwise too hard in SQL. The other reason that's a bad comparison is that I've not seen many queries that use more than a couple of window frames, whereas we have to expect that the number of grouping sets in typical queries will be significantly more than "a couple". So we do have to think about what the performance will be like with a lot of sort steps. I'm also worried that this use-case may finally force us to do something about the "one work_mem per sort node" behavior, unless we can hack things so that only one or two sorts reach max memory consumption concurrently. I still find the ChainAggregate approach too ugly at a system structural level to accept, regardless of Noah's argument about number of I/O cycles consumed. We'll be paying for that in complexity and bugs into the indefinite future, and I wonder if it isn't going to foreclose some other "performance opportunities" as well. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
On Sat, Dec 13, 2014 at 04:37:48AM +, Andrew Gierth wrote: > > "Tom" == Tom Lane writes: > > >> I'd already explained in more detail way back when we posted the > >> patch. But to reiterate: the ChainAggregate nodes pass through > >> their input data unchanged, but on group boundaries they write > >> aggregated result rows to a tuplestore shared by the whole > >> chain. The top node returns the data from the tuplestore after its > >> own output is completed. > > Tom> That seems pretty grotty from a performance+memory consumption > Tom> standpoint. At peak memory usage, each one of the Sort nodes > Tom> will contain every input row, > > Has this objection ever been raised for WindowAgg, which has the same > issue? I caution against using window function performance as the template for GROUPING SETS performance goals. The benefit of GROUPING SETS compared to its UNION ALL functional equivalent is 15% syntactic pleasantness, 85% performance opportunities. Contrast that having window functions is great even with naive performance, because they enable tasks that are otherwise too hard in SQL. > Tom> In principle, with the CTE+UNION approach I was suggesting, the > Tom> peak memory consumption would be one copy of the input rows in > Tom> the CTE's tuplestore plus one copy in the active branch's Sort > Tom> node. I think a bit of effort would be needed to get there (ie, > Tom> shut down one branch's Sort node before starting the next, > Tom> something I'm pretty sure doesn't happen today). > > Correct, it doesn't. > > However, I notice that having ChainAggregate shut down its input would > also have the effect of bounding the memory usage (to two copies, > which is as good as the append+sorts+CTE case). Agreed, and I find that more promising than the CTE approach. Both strategies require temporary space covering two copies of the input data. (That, or you accept rescanning the original input.) The chained approach performs less I/O. 
Consider "SELECT count(*) FROM t GROUP BY GROUPING SETS (a, b)", where pg_relation_size(t) >> RAM. I/O consumed with the chained approach: read table write tuplesort 1 read tuplesort 1 write tuplesort 2 read tuplesort 2 I/O consumed with the CTE approach: read table write CTE read CTE write tuplesort 1 read tuplesort 1 read CTE write tuplesort 2 read tuplesort 2 Tom rightly brought up the space requirements for result rows. The CTE approach naturally avoids reserving space for that. However, I find it a safe bet to optimize GROUPING SETS for input >> result. Reserving temporary space for result rows to save input data I/O is a good trade. We don't actually need to compromise; one can imagine a GroupAggregateChain plan node with a sortChain list that exhibits the efficiencies of both. I'm fine moving forward with the cross-node tuplestore, though. The elephant in the performance room is the absence of hash aggregation. I agree with your decision to make that a follow-on patch, but the project would be in an awkward PR situation if 9.5 has GroupAggregate-only GROUPING SETS. I may argue to #ifdef-out the feature rather than release that way. We don't need to debate that prematurely, but keep it in mind while planning. Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
On Mon, Dec 15, 2014 at 12:28 PM, Andrew Gierth wrote: >> "Michael" == Michael Paquier writes: > > Michael> Based on those comments, I am marking this patch as > Michael> "Returned with Feedback" on the CF app for 2014-10. Andrew, > Michael> feel free to move this entry to CF 2014-12 if you are > Michael> planning to continue working on it so as it would get > Michael> additional review. (Note that this patch status was "Waiting > Michael> on Author" when writing this text). > > Moved it to 2014-12 and set it back to "waiting on author". We expect to > submit a revised version, though I have no timescale yet. OK thanks for the update. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
> "Michael" == Michael Paquier writes: Michael> Based on those comments, I am marking this patch as Michael> "Returned with Feedback" on the CF app for 2014-10. Andrew, Michael> feel free to move this entry to CF 2014-12 if you are Michael> planning to continue working on it so as it would get Michael> additional review. (Note that this patch status was "Waiting Michael> on Author" when writing this text). Moved it to 2014-12 and set it back to "waiting on author". We expect to submit a revised version, though I have no timescale yet. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
On Thu, Dec 11, 2014 at 3:36 AM, Tom Lane wrote: > I don't really have any comments on the algorithms yet, having spent too > much time trying to figure out underdocumented data structures to get to > the algorithms. However, noting the addition of list_intersection_int() > made me wonder whether you'd not be better off reducing the integer lists > to bitmapsets a lot sooner, perhaps even at parse analysis. > list_intersection_int() is going to be O(N^2) by nature. Maybe N can't > get large enough to matter in this context, but I do see places that > seem to be concerned about performance. > > I've not spent any real effort looking at gsp2.patch yet, but it seems > even worse off comment-wise: if there's any explanation in there at all > of what a "chained aggregate" is, I didn't find it. I'd also counsel you > to find some other way to do it than putting bool chain_head fields in > Aggref nodes; that looks like a mess, eg, it will break equal() tests > for expression nodes that probably should still be seen as equal. > > I took a quick look at gsp-u.patch. It seems like that approach should > work, with of course the caveat that using CUBE/ROLLUP as function names > in a GROUP BY list would be problematic. I'm not convinced by the > commentary in ruleutils.c suggesting that extra parentheses would help > disambiguate: aren't extra parentheses still going to contain grouping > specs according to the standard? Forcibly schema-qualifying such function > names seems like a less fragile answer on that end. Based on those comments, I am marking this patch as "Returned with Feedback" on the CF app for 2014-10. Andrew, feel free to move this entry to CF 2014-12 if you are planning to continue working on it so as it would get additional review. (Note that this patch status was "Waiting on Author" when writing this text). 
Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
> "Tom" == Tom Lane writes:

With the high-priority questions out of the way, time to tackle the rest:

Tom> My single biggest complaint is about the introduction of struct
Tom> GroupedVar. If we stick with that, we're going to have to teach
Tom> an extremely large number of places that know about Vars to also
Tom> know about GroupedVars. This will result in code bloat and
Tom> errors of omission. If you think the latter concern is
Tom> hypothetical, note that you can't get 40 lines into gsp1.patch
Tom> without finding such an omission, namely the patch fails to
Tom> teach pg_stat_statements.c about GroupedVars. (That also points
Tom> up that some of the errors of omission will be in third-party
Tom> code that we can't fix easily.)

Except that GroupedVar is created only late in planning, and so only a small proportion of places need to know about it (and certainly pg_stat_statements does not). It also can't end up attached to any foreign scan or otherwise potentially third-party plan node.

Tom> I think you should get rid of that concept and instead implement
Tom> the behavior by having nodeAgg.c set the relevant fields of the
Tom> representative tuple slot to NULL, so that a regular Var does
Tom> the right thing.

We did consider that. Messing with the null flags of the slot didn't seem like an especially clean approach. But if that's how you want it...

Tom> I don't really have any comments on the algorithms yet, having
Tom> spent too much time trying to figure out underdocumented data
Tom> structures to get to the algorithms. However, noting the
Tom> addition of list_intersection_int() made me wonder whether you'd
Tom> not be better off reducing the integer lists to bitmapsets a lot
Tom> sooner, perhaps even at parse analysis.
list_intersection_int should not be time-critical; common queries do not call it at all (simple cube or rollup clauses always have an empty grouping set, causing the intersection test to bail immediately), and in pathological worst-case constructions like putting a dozen individually grouped columns in front of a 12-d cube (thus calling it 4096 times on lists at least 12 nodes long) it doesn't account for more than a small percentage even with optimization off and debugging and asserts on.

The code uses the list representation almost everywhere in parsing and planning because in some places the order of elements matters, and I didn't want to keep swapping between a bitmap and a list representation. (We _do_ use bitmapsets where we're potentially going to be doing an O(N^2) number of subset comparisons to build the graph adjacency list for computing the minimal set of sort operations, and at execution time.) I didn't even consider using bitmaps for the output of parse analysis because at that stage we want to preserve most of the original query substructure (otherwise view deparse won't look anything like the original query did).

Tom> list_intersection_int() is going to be O(N^2) by nature. Maybe
Tom> N can't get large enough to matter in this context, but I do see
Tom> places that seem to be concerned about performance.

My main feeling on performance is that simple cube and rollup clauses or short lists of grouping sets should parse and plan very quickly; more complex cases should parse and plan fast enough that execution time on any nontrivial input will swamp the parse/plan time; and the most complex cases that aren't outright rejected should plan in no more than a few seconds extra. (We're limiting to 4096 grouping sets in any query level, which is comparable to other databases and seems excessively high compared to what people are actually likely to need.)

(Don't be fooled by the excessive EXPLAIN time on some queries.
There are performance issues in EXPLAIN output generation that have nothing to do with this patch, and which I've not pinned down.) -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
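The set counts discussed above (a 12-d cube expanding to 4096 grouping sets, each at least 12 columns long once a dozen plainly grouped columns are prepended) can be reproduced with a small expansion sketch. Assumed semantics per the standard — CUBE is the powerset of its columns, ROLLUP the prefixes; the helper names are made up for illustration:

```python
from itertools import chain, combinations

def cube(cols):
    # CUBE(c1..cn) expands to all 2**n subsets of its columns.
    return [list(s) for s in chain.from_iterable(
        combinations(cols, r) for r in range(len(cols) + 1))]

def rollup(cols):
    # ROLLUP(c1..cn) expands to the n+1 prefixes, longest first.
    return [list(cols[:i]) for i in range(len(cols), -1, -1)]

# A dozen individually grouped columns in front of a 12-d cube:
dims = ["d%d" % i for i in range(12)]
plain = ["g%d" % i for i in range(12)]
sets = [plain + s for s in cube(dims)]
print(len(sets), min(len(s) for s in sets))  # 4096 sets, each >= 12 columns
```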
Re: [HACKERS] Final Patch for GROUPING SETS
> "Tom" == Tom Lane writes:

>> I'd already explained in more detail way back when we posted the
>> patch. But to reiterate: the ChainAggregate nodes pass through
>> their input data unchanged, but on group boundaries they write
>> aggregated result rows to a tuplestore shared by the whole
>> chain. The top node returns the data from the tuplestore after its
>> own output is completed.

Tom> That seems pretty grotty from a performance+memory consumption
Tom> standpoint. At peak memory usage, each one of the Sort nodes
Tom> will contain every input row,

Has this objection ever been raised for WindowAgg, which has the same issue?

Tom> and the shared tuplestore will contain every output row.

Every output row except those produced by the top node, and since this is after grouping, that's expected to be smaller than the input.

Tom> That will lead to either a lot of memory eaten, or a lot of
Tom> temp-file I/O, depending on how big work_mem is.

Yes. Though note that this code only kicks in when dealing with grouping sets more complex than a simple rollup. A CUBE of two dimensions uses only one Sort node above whatever is needed to produce sorted input, and a CUBE of three dimensions uses only two. (It does increase quite a lot for large cubes though.)

Tom> In principle, with the CTE+UNION approach I was suggesting, the
Tom> peak memory consumption would be one copy of the input rows in
Tom> the CTE's tuplestore plus one copy in the active branch's Sort
Tom> node. I think a bit of effort would be needed to get there (ie,
Tom> shut down one branch's Sort node before starting the next,
Tom> something I'm pretty sure doesn't happen today).

Correct, it doesn't. However, I notice that having ChainAggregate shut down its input would also have the effect of bounding the memory usage (to two copies, which is as good as the append+sorts+CTE case). Is shutting down and reinitializing parts of the plan really feasible here? Or would it be a case of forcing a rescan?
>> Second was to generate a CTE for the input data. This didn't get
>> very far because everything that already exists to handle CTE
>> nodes assumes that they are explicit in the planner's input (that
>> they have their own Query node, etc.) and I was not able to
>> determine a good solution.

Tom> Seems like restructuring that wouldn't be *that* hard. We
Tom> probably don't want it to be completely like a CTE for planning
Tom> purposes anyway --- that would foreclose passing down any
Tom> knowledge of desired sort order, which we don't want. But it
Tom> seems like we could stick a variant of CtePath atop the chosen
Tom> result path of the scan/join planning phase. If you like I can
Tom> poke into this a bit.

Please do.

That seems to cover the high-priority issues from our point of view. We will continue working on the other issues, on the assumption that when we have some idea how to do it your way, we will rip out the ChainAggregate stuff in favour of an Append-based solution.

-- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
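The data flow being debated — chain nodes pass rows through unchanged while writing their aggregated results to a tuplestore shared with the top node, which emits its own groups and then drains the store — can be sketched with toy data. This is an illustration of the flow only, not the executor code; all names are invented, count(*) stands in for arbitrary aggregates, and each node aggregates up front rather than flushing at group boundaries to keep the sketch short:

```python
from collections import Counter

def chain_agg(rows, group_cols, store):
    # ChainAggregate sketch: write this node's aggregated result rows
    # to the shared tuplestore, then pass every input row through unchanged.
    counts = Counter(tuple(r[c] for c in group_cols) for r in rows)
    store.extend((group_cols, k, n) for k, n in sorted(counts.items()))
    for r in rows:
        yield r

def top_agg(rows, group_cols, store):
    # Top GroupAggregate: emit its own groups first, then drain the store.
    counts = Counter(tuple(r[c] for c in group_cols) for r in rows)
    for k, n in sorted(counts.items()):
        yield (group_cols, k, n)
    yield from store

t = [{"a": 1, "b": 1}, {"a": 1, "b": 2}, {"a": 2, "b": 2}]
store = []  # the tuplestore shared by the whole chain
# GroupAggregate(b) <- Sort(b) <- ChainAggregate(a): GROUPING SETS (a, b).
rows_sorted_by_b = sorted(chain_agg(t, ("a",), store), key=lambda r: r["b"])
result = list(top_agg(rows_sorted_by_b, ("b",), store))
```

The memory point above falls out of the shape: every Sort in the chain holds a full copy of the input, while the shared store holds only grouped rows.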
Re: [HACKERS] Final Patch for GROUPING SETS
Andrew Gierth writes: > "Tom" == Tom Lane writes: > Tom> That seems pretty messy, especially given your further comments > Tom> that these plan nodes are interconnected and know about each > Tom> other (though you failed to say exactly how). > I'd already explained in more detail way back when we posted the > patch. But to reiterate: the ChainAggregate nodes pass through their > input data unchanged, but on group boundaries they write aggregated > result rows to a tuplestore shared by the whole chain. The top node > returns the data from the tuplestore after its own output is > completed. That seems pretty grotty from a performance+memory consumption standpoint. At peak memory usage, each one of the Sort nodes will contain every input row, and the shared tuplestore will contain every output row. That will lead to either a lot of memory eaten, or a lot of temp-file I/O, depending on how big work_mem is. In principle, with the CTE+UNION approach I was suggesting, the peak memory consumption would be one copy of the input rows in the CTE's tuplestore plus one copy in the active branch's Sort node. I think a bit of effort would be needed to get there (ie, shut down one branch's Sort node before starting the next, something I'm pretty sure doesn't happen today). But it's doable whereas I don't see how we dodge the multiple-active-sorts problem with the chained implementation. > Tom> ISTM that maybe what we should do is take a cue from the SQL > Tom> spec, which defines these things in terms of UNION ALL of > Tom> plain-GROUP-BY operations reading from a common CTE. > I looked at that, in fact that was our original plan, but it became > clear quite quickly that it was not going to be easy. > I tried two different approaches. First was to actually re-plan the > input (i.e. running query_planner more than once) for different sort > orders; that crashed and burned quickly thanks to the extent to which > the planner assumes that it'll be run once only on any given input. 
Well, we'd not want to rescan the input multiple times, so I don't think that generating independent plan trees for each sort order would be the thing to do anyway. I suppose ideally it would be nice to check the costs of getting the different sort orders, so that the one Sort we elide is the one that gets the best cost savings. But the WindowAgg code isn't that smart either and no one's really complained, so I think this can wait. (Eventually I'd like to make such cost comparisons possible as part of the upper-planner Pathification that I keep nattering about. But it doesn't seem like a prerequisite for getting GROUPING SETS in.) > Second was to generate a CTE for the input data. This didn't get very > far because everything that already exists to handle CTE nodes assumes > that they are explicit in the planner's input (that they have their > own Query node, etc.) and I was not able to determine a good solution. Seems like restructuring that wouldn't be *that* hard. We probably don't want it to be completely like a CTE for planning purposes anyway --- that would foreclose passing down any knowledge of desired sort order, which we don't want. But it seems like we could stick a variant of CtePath atop the chosen result path of the scan/join planning phase. If you like I can poke into this a bit. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
> "Tom" == Tom Lane writes:

>> What that code does is produce plans that look like this:
>>
>>   GroupAggregate
>>     -> Sort
>>        -> ChainAggregate
>>           -> Sort
>>              -> ChainAggregate
>>
>> in much the same way that WindowAgg nodes are generated.

Tom> That seems pretty messy, especially given your further comments
Tom> that these plan nodes are interconnected and know about each
Tom> other (though you failed to say exactly how).

I'd already explained in more detail way back when we posted the patch. But to reiterate: the ChainAggregate nodes pass through their input data unchanged, but on group boundaries they write aggregated result rows to a tuplestore shared by the whole chain. The top node returns the data from the tuplestore after its own output is completed.

The chain_head pointer in the ChainAggregate nodes is used for:

- obtaining the head node's targetlist and qual, to use to project rows into the tuplestore (the ChainAggregate nodes don't do ordinary projection so they have dummy targetlists like the Sort nodes do)
- obtaining the pointer to the tuplestore itself
- on rescan without parameter change, to inform the parent node whether or not the child nodes are also being rescanned (since the Sort nodes may or may not block this)

Tom> The claimed analogy to WindowAgg therefore seems bogus since
Tom> stacked WindowAggs are independent, AFAIR anyway.

The analogy is only in that they need to see the same input rows but in different sort orders.

Tom> I'm also wondering about performance: doesn't this imply more
Tom> rows passing through some of the plan steps than really
Tom> necessary?

There's no way to cut down the number of rows seen by intermediate nodes unless you implement (and require) associative aggregates, which we do not do in this patch (that's left for possible future optimization efforts). Our approach makes no new demands on the implementation of aggregate functions.
Tom> Also, how would this extend to preferring hashed aggregation in
Tom> some of the grouping steps?

My suggestion for extending it to hashed aggs is to have a (single) HashAggregate node keep multiple hash tables, one per grouping set; then any arbitrary collection of grouping sets can be handled in one node, provided that memory permits and no non-hashable features are used. So the normal plan for CUBE(a,b) under this scheme would be just:

  HashAggregate
    Grouping Sets: (), (a), (b), (a,b)
    -> (input path in unsorted order)

If a mixture of hashable and non-hashable data types is used, for example CUBE(hashable,unhashable), then a plan of this form could be constructed:

  HashAggregate
    Grouping Sets: (), (hashable)
    -> ChainAggregate
         Grouping Sets: (unhashable), (unhashable,hashable)
         -> (input path sorted by (unhashable,hashable))

Likewise, plans of this form could be considered for cases like CUBE(low_card, high_card) where hashed grouping on high_card would require excessive memory:

  HashAggregate
    Grouping Sets: (), (low_card)
    -> ChainAggregate
         Grouping Sets: (high_card), (high_card, low_card)
         -> (input path sorted by (high_card, low_card))

Tom> ISTM that maybe what we should do is take a cue from the SQL
Tom> spec, which defines these things in terms of UNION ALL of
Tom> plain-GROUP-BY operations reading from a common CTE.

I looked at that; in fact that was our original plan, but it became clear quite quickly that it was not going to be easy. I tried two different approaches. First was to actually re-plan the input (i.e. running query_planner more than once) for different sort orders; that crashed and burned quickly thanks to the extent to which the planner assumes that it'll be run once only on any given input. Second was to generate a CTE for the input data. This didn't get very far because everything that already exists to handle CTE nodes assumes that they are explicit in the planner's input (that they have their own Query node, etc.)
and I was not able to determine a good solution. If you have any suggestions for how to approach the problem, I'm happy to have another go at it. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
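The multiple-hash-tables idea sketched above — one HashAggregate keeping a table per grouping set, filled in a single pass over unsorted input — is easy to illustrate with toy data. A minimal Python sketch, not the proposed executor code; names are invented and count(*) stands in for arbitrary transition functions:

```python
from collections import defaultdict

def hash_agg(rows, grouping_sets):
    # One hash table per grouping set; every input row updates
    # every table, so a single unsorted scan covers all sets.
    tables = {cols: defaultdict(int) for cols in grouping_sets}
    for r in rows:
        for cols, table in tables.items():
            table[tuple(r[c] for c in cols)] += 1
    return {cols: dict(tbl) for cols, tbl in tables.items()}

t = [{"a": 1, "b": "x"}, {"a": 1, "b": "y"}, {"a": 2, "b": "x"}]
# CUBE(a,b) expands to these four sets; one scan, four hash tables.
res = hash_agg(t, [(), ("a",), ("b",), ("a", "b")])
```

The memory caveat in the proposal is visible here too: the per-set tables grow with the cardinality of their grouping columns, which is why high-cardinality sets would be pushed down to a sorted ChainAggregate instead.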
Re: [HACKERS] Final Patch for GROUPING SETS
Andrew Gierth writes:
> "Tom" == Tom Lane writes:
> Tom> I've not spent any real effort looking at gsp2.patch yet, but it
> Tom> seems even worse off comment-wise: if there's any explanation in
> Tom> there at all of what a "chained aggregate" is, I didn't find it.

> (Maybe "stacked" would have been a better term.)
> What that code does is produce plans that look like this:
>
>   GroupAggregate
>     -> Sort
>        -> ChainAggregate
>           -> Sort
>              -> ChainAggregate
>
> in much the same way that WindowAgg nodes are generated.

That seems pretty messy, especially given your further comments that these plan nodes are interconnected and know about each other (though you failed to say exactly how). The claimed analogy to WindowAgg therefore seems bogus since stacked WindowAggs are independent, AFAIR anyway. I'm also wondering about performance: doesn't this imply more rows passing through some of the plan steps than really necessary? Also, how would this extend to preferring hashed aggregation in some of the grouping steps?

ISTM that maybe what we should do is take a cue from the SQL spec, which defines these things in terms of UNION ALL of plain-GROUP-BY operations reading from a common CTE. Abstractly, that is, we'd have

  Append
    -> GroupAggregate
       -> Sort
          -> source data
    -> GroupAggregate
       -> Sort
          -> source data
    -> GroupAggregate
       -> Sort
          -> source data
    ...

(or some of the arms could be HashAgg without a sort). Then the question is what exactly the aggregates are reading from. We could do worse than make it a straight CTE, I suppose.

> Tom> I'd also counsel you to find some other way to do it than
> Tom> putting bool chain_head fields in Aggref nodes;

> There are no chain_head fields in Aggref nodes.

Oh, I mistook "struct Agg" for "struct Aggref". (That's another pretty poorly chosen struct name, though I suppose it's far too late to change that choice.) Still, interconnecting plan nodes that aren't adjacent in the plan tree doesn't sound like a great idea to me.
> Tom> I took a quick look at gsp-u.patch. It seems like that approach > Tom> should work, with of course the caveat that using CUBE/ROLLUP as > Tom> function names in a GROUP BY list would be problematic. I'm not > Tom> convinced by the commentary in ruleutils.c suggesting that extra > Tom> parentheses would help disambiguate: aren't extra parentheses > Tom> still going to contain grouping specs according to the standard? > The extra parens do actually disambiguate because CUBE(x) and > (CUBE(x)) are not equivalent anywhere; while CUBE(x) can appear inside > GROUPING SETS (...), it cannot appear inside a (...) list nested inside > a GROUPING SETS list (or anywhere else). Maybe, but this seems very fragile and non-future-proof. I think double-quoting or schema-qualifying such function names would be safer when you think about the use-case of dumping views that may get loaded into future Postgres versions. > The question that needs deciding here is less whether the approach > _could_ work but whether we _want_ it. The objection has been made > that we are in effect introducing a new category of "unreserved almost > everywhere" keyword, which I think has a point; True, but I think that ship has already sailed. We already have similar behavior for PARTITION, RANGE, and ROWS (see the opt_existing_window_name production), and I think PRECEDING, FOLLOWING, and UNBOUNDED are effectively reserved-in-certain-very-specific-contexts as well. And there are similar behaviors in plpgsql's parser. > on the other hand, > reserving CUBE is a seriously painful prospect. Precisely. I think renaming or getting rid of contrib/cube would have to be something done in a staged fashion over multiple release cycles. Waiting several years to get GROUPING SETS doesn't seem appealing at all compared to this alternative. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
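Tom's Append-over-common-CTE shape, with each branch's Sort shut down before the next starts, can be sketched with toy data. A data-flow illustration only, under the assumption that each branch sorts the materialized input and groups it (count(*) only); nothing here is planner or executor code:

```python
from itertools import groupby

def cte_union_agg(scan, grouping_sets):
    # Materialize the input once (the CTE tuplestore), then run each
    # GROUP BY branch in turn; each branch's sort is dead before the
    # next branch starts, bounding peak memory to CTE + one Sort.
    cte = list(scan)
    out = []
    for cols in grouping_sets:
        keyfn = lambda r, c=cols: tuple(r[k] for k in c)
        branch = sorted(cte, key=keyfn)          # this branch's Sort node
        for key, grp in groupby(branch, key=keyfn):
            out.append((cols, key, sum(1 for _ in grp)))  # count(*)
    return out

rows = [{"a": 1, "b": 1}, {"a": 1, "b": 2}, {"a": 2, "b": 2}]
res = cte_union_agg(rows, [("a",), ("b",), ()])
```

Contrast with the chained shape, where every Sort in the stack is live simultaneously.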
Re: [HACKERS] Final Patch for GROUPING SETS
> "Tom" == Tom Lane writes:

More comment on this later, but I want to highlight these specific points since we need clear answers here to avoid wasting time and effort:

Tom> I've not spent any real effort looking at gsp2.patch yet, but it
Tom> seems even worse off comment-wise: if there's any explanation in
Tom> there at all of what a "chained aggregate" is, I didn't find it.

(Maybe "stacked" would have been a better term.) What that code does is produce plans that look like this:

  GroupAggregate
    -> Sort
       -> ChainAggregate
          -> Sort
             -> ChainAggregate

in much the same way that WindowAgg nodes are generated. Where would you consider the best place to comment this? The WindowAgg equivalent seems to be discussed primarily in the header comment of nodeWindowAgg.c.

Tom> I'd also counsel you to find some other way to do it than
Tom> putting bool chain_head fields in Aggref nodes;

There are no chain_head fields in Aggref nodes. Agg.chain_head is true for the Agg node at the top of the chain (the GroupAggregate node in the above example), while AggState.chain_head is set on the ChainAggregate nodes to point to the AggState of the GroupAggregate node.

What we need to know before doing any further work on this is whether this idea of stacking up aggregate and sort nodes is a viable one. (The feedback I've had so far suggests that the performance is acceptable, even if there are still optimization opportunities that can be tackled later, like adding HashAggregate support.)

Tom> I took a quick look at gsp-u.patch. It seems like that approach
Tom> should work, with of course the caveat that using CUBE/ROLLUP as
Tom> function names in a GROUP BY list would be problematic. I'm not
Tom> convinced by the commentary in ruleutils.c suggesting that extra
Tom> parentheses would help disambiguate: aren't extra parentheses
Tom> still going to contain grouping specs according to the standard?
The spec is of minimal help here since it does not allow expressions in GROUP BY at all, last I looked; only column references.

The extra parens do actually disambiguate because CUBE(x) and (CUBE(x)) are not equivalent anywhere; while CUBE(x) can appear inside GROUPING SETS (...), it cannot appear inside a (...) list nested inside a GROUPING SETS list (or anywhere else). As the comments in gram.y explain, the productions used are intended to follow the spec with the exception of using a_expr where the spec requires <column reference>. So CUBE and ROLLUP are recognized as special only as part of a group_by_item (<grouping element> in the spec), and as soon as we see a paren that isn't part of the "GROUPING SETS (" opener, we're forced into parsing an a_expr, in which CUBE() would become a function call.

(The case of upgrading from an old pg version seems to require the use of --quote-all-identifiers in pg_dump.)

Tom> Forcibly schema-qualifying such function names seems like a less
Tom> fragile answer on that end.

That I guess would require keeping more state, unless you applied it everywhere to any function with a keyword for a name? I dunno.

The question that needs deciding here is less whether the approach _could_ work than whether we _want_ it. The objection has been made that we are in effect introducing a new category of "unreserved almost everywhere" keyword, which I think has a point; on the other hand, reserving CUBE is a seriously painful prospect.

-- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
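The disambiguation rule being described — CUBE(...)/ROLLUP(...) is special only at the top level of a GROUP BY item, and anything already inside parentheses parses as an expression where cube(...) can only be a function call — can be caricatured in a few lines. A toy classifier, emphatically not the real grammar; the function name and the regex are invented for illustration:

```python
import re

def classify_group_by_item(item):
    # CUBE/ROLLUP at the very start of a GROUP BY item (before any other
    # paren) opens a grouping spec; otherwise we are inside an expression,
    # where cube(...) can only be a function call.
    if re.match(r"(?i)\s*(cube|rollup)\s*\(", item):
        return "grouping spec"
    return "expression"

print(classify_group_by_item("CUBE(a, b)"))    # grouping spec
print(classify_group_by_item("(CUBE(a, b))"))  # expression (function call)
```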
Re: [HACKERS] Final Patch for GROUPING SETS
Andrew Gierth writes:
> And here is that recut patch set.

I started looking over this patch, but eventually decided that it needs more work to be committable than I'm prepared to put in right now.

My single biggest complaint is about the introduction of struct GroupedVar. If we stick with that, we're going to have to teach an extremely large number of places that know about Vars to also know about GroupedVars. This will result in code bloat and errors of omission. If you think the latter concern is hypothetical, note that you can't get 40 lines into gsp1.patch without finding such an omission, namely the patch fails to teach pg_stat_statements.c about GroupedVars. (That also points up that some of the errors of omission will be in third-party code that we can't fix easily.) I think you should get rid of that concept and instead implement the behavior by having nodeAgg.c set the relevant fields of the representative tuple slot to NULL, so that a regular Var does the right thing.

I'm also not happy about the quality of the internal documentation. The big problem here is the seriously lacking documentation of the new parse node types, eg

+/*
+ * Node representing substructure in GROUPING SETS
+ *
+ * This is not actually executable, but it's used in the raw parsetree
+ * representation of GROUP BY, and in the groupingSets field of Query, to
+ * preserve the original structure of rollup/cube clauses for readability
+ * rather than reducing everything to grouping sets.
+ */
+
+typedef enum
+{
+    GROUPING_SET_EMPTY,
+    GROUPING_SET_SIMPLE,
+    GROUPING_SET_ROLLUP,
+    GROUPING_SET_CUBE,
+    GROUPING_SET_SETS
+} GroupingSetKind;
+
+typedef struct GroupingSet
+{
+    Expr        xpr;
+    GroupingSetKind kind;
+    List       *content;
+    int         location;
+} GroupingSet;

The only actual documentation there is a long-winded excuse for having put the struct declaration in the wrong place. (Since it's not an executable expression, it should be in parsenodes.h not primnodes.h.)
Good luck figuring out what "content" is a list of, or indeed anything at all except that this has got something to do with grouping sets. If one digs around in the patch long enough, some useful information can be found in the header comments for various functions --- but there should be a spec for what this struct means, what its fields are, what the relevant invariants are *in the .h file*. Poking around in parsenodes.h, eg the description of SortGroupClause, should give you an idea of the standard here.

I'm not too happy about struct Grouping either. If one had to guess, one would probably guess that this was part of the representation of a GROUP BY clause; a guess led on by the practice of the patch of dealing with this and struct GroupingSet together, as in eg pg_stat_statements.c and nodes.h. Reading enough of the patch will eventually clue you that this is the representation of a call of the GROUPING() pseudo-function, but that's not exactly clear from either the name of the struct or its random placement between Var and Const in primnodes.h. And the comment is oh so helpful:

+/*
+ * Grouping
+ */

I'd be inclined to call it GroupingFunc and put it after Aggref/WindowFunc. Also please note that there is an attempt throughout the system to order code stanzas that deal with assorted node types in an order matching the order in which they're declared in the *nodes.h files. You should never be flipping a coin to decide where to add such code, and "put it at the end of the existing list" is usually not the best answer either.

Some other random examples of inadequate attention to commenting:

@@ -243,7 +243,7 @@ typedef struct AggStatePerAggData
      * rest.
      */
-    Tuplesortstate *sortstate;  /* sort object, if DISTINCT or ORDER BY */
+    Tuplesortstate **sortstate; /* sort object, if DISTINCT or ORDER BY */

This change didn't even bother to pluralize the comment, let alone explain the length of the array or what it's indexed according to, let alone explain why we now need multiple tuplesort objects in what is still apparently a "per aggregate" state struct. (BTW, as a matter of good engineering I think it's useful to change a field's name when you change its meaning and representation so fundamentally. In this case, renaming to "sortstates" would have been clearer and would have helped ensure that you didn't miss fixing any referencing code.)

@@ -338,81 +339,101 @@ static Datum GetAggInitVal(Datum textInitVal, Oid transtype);
 static void
 initialize_aggregates(AggState *aggstate,
                       AggStatePerAgg peragg,
-                      AggStatePerGroup pergroup)
+                      AggStatePerGroup pergroup,
+                      int numReinitialize)
 {
     int         aggno;

I wonder what numReinitialize is, or why it's
Re: [HACKERS] Final Patch for GROUPING SETS
On Sat, Sep 27, 2014 at 06:37:38AM +0100, Andrew Gierth wrote:
> > "Andrew" == Andrew Gierth writes:
>
> Andrew> I was holding off on posting a recut patch with the latest
> Andrew> EXPLAIN formatting changes (which are basically cosmetic)
> Andrew> until it became clear whether RLS was likely to be reverted
> Andrew> or kept (we have a tiny but irritating conflict with it, in
> Andrew> the regression test schedule file where we both add to the
> Andrew> same list of tests).
>
> And here is that recut patch set.
>
> Changes since last posting (other than conflict removal):
>
> - gsp1.patch: clearer EXPLAIN output as per discussion
>
> Recut patches:
>
> gsp1.patch - phase 1 code patch (full syntax, limited functionality)
> gsp2.patch - phase 2 code patch (adds full functionality using the
>     new chained aggregate mechanism)
> gsp-doc.patch - docs
> gsp-contrib.patch - quote "cube" in contrib/cube and contrib/earthdistance,
>     intended primarily for testing pending a decision on
>     renaming contrib/cube or unreserving keywords
> gsp-u.patch - proposed method to unreserve CUBE and ROLLUP
>
> (the contrib patch is not necessary if the -u patch is used; the
> contrib/pg_stat_statements fixes are in the phase1 patch)
>
> -- Andrew (irc:RhodiumToad)

Tom, any word on this?

Cheers, David.
-- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
> "Heikki" == Heikki Linnakangas writes:

Heikki> There's been a lot of discussion and I haven't followed it in
Heikki> detail. Andrew, there were some open questions, but have you
Heikki> gotten enough feedback so that you know what to do next?

I was holding off on posting a recut patch with the latest EXPLAIN formatting changes (which are basically cosmetic) until it became clear whether RLS was likely to be reverted or kept (we have a tiny but irritating conflict with it, in the regression test schedule file where we both add to the same list of tests).

Other than that there is nothing for Atri and me to do next but wait on a proper review. The feedback and discussion has been almost all about cosmetic details; the only actual issues found have been a trivial omission from pg_stat_statements, and a slightly suboptimal planning of sort steps, both long since fixed.

What we have not had:

- anything more than a superficial review
- any feedback over the acceptability of our chained-sorts approach for doing aggregations with differing sort orders
- any decision about the question of reserved words and/or possibly renaming contrib/cube (and what new name to use if so)

-- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
There's been a lot of discussion and I haven't followed it in detail. Andrew, there were some open questions, but have you gotten enough feedback so that you know what to do next? I'm trying to get this commitfest to an end, and this is still in "Needs Review" state... - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
> "Josh" == Josh Berkus writes: Josh> (b) If we're going to discuss ripping out YAML format, please Josh> let's do that as a *separate* patch and discussion, +infinity >> Grouping Sets: >> - ["two","four"] >> - ["two"] >> - [] >> >> Would that be better? (It's not consistent with other YAML outputs >> like sort/group keys, but it's equally legal as far as I can tell >> and seems more readable.) Josh> That works for me. I prefer that one to any of the others I've come up with, so unless anyone has a major objection, I'll go with it. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
On 09/19/2014 08:52 AM, Andres Freund wrote: >> Until someone decides to dike it out, I think we are obligated to make >> > it produce something resembling correct output. > I vote for ripping it out. There really isn't any justification for it > and it broke more than once. (a) I personally use it all the time to produce human-readable output, sometimes also working via markdown. It's easier to read than the "standard format" or JSON, especially when combined with grep or other selective filtering. Note that this use would not at all preclude having the YAML output look "weird" as long as it was readable. (b) If we're going to discuss ripping out YAML format, please let's do that as a *separate* patch and discussion, and not as a side effect of Grouping Sets. Otherwise this will be one of those things where people pitch a fit during beta because the people who care about YAML aren't necessarily reading this thread. On 09/19/2014 08:52 AM, Andrew Gierth wrote: > Oh, another YAML alternative would be: > > Grouping Sets: > - ["two","four"] > - ["two"] > - [] > > Would that be better? (It's not consistent with other YAML outputs like > sort/group keys, but it's equally legal as far as I can tell and seems > more readable.) That works for me. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
On 19/09/14 17:52, Andres Freund wrote: On 2014-09-19 16:35:52 +0100, Andrew Gierth wrote: Marti> But is anyone actually using YAML output format, or was it Marti> implemented simply "because we can"? Until someone decides to dike it out, I think we are obligated to make it produce something resembling correct output. I vote for ripping it out. There really isn't any justification for it and it broke more than once. Even though I really like YAML I say +1, mainly because any YAML 1.2 parser should be able to parse JSON output without problem... -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
> "Andrew" == Andrew Gierth writes: Andrew> You're telling me. Also, feeding it to an online yaml-to-json Andrew> converter gives the result as [["two","four"],["two"],null] Andrew> which is not quite the same as the json version. An Andrew> alternative would be: Oh, another YAML alternative would be: Grouping Sets: - ["two","four"] - ["two"] - [] Would that be better? (It's not consistent with other YAML outputs like sort/group keys, but it's equally legal as far as I can tell and seems more readable.) -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
On 2014-09-19 16:35:52 +0100, Andrew Gierth wrote: > Marti> But is anyone actually using YAML output format, or was it > Marti> implemented simply "because we can"? > > Until someone decides to dike it out, I think we are obligated to make > it produce something resembling correct output. I vote for ripping it out. There really isn't any justification for it and it broke more than once. Greg: Did you actually ever end up using the yaml output? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
> "Marti" == Marti Raudsepp writes: >> (yaml format) >> Grouping Sets: >> - - "two" >> - "four" >> - - "two" >> - Marti> Now this is weird. You're telling me. Also, feeding it to an online yaml-to-json converter gives the result as [["two","four"],["two"],null] which is not quite the same as the json version. An alternative would be: Grouping Sets: - - "two" - "four" - - "two" - [] or Grouping Sets: - - "two" - "four" - - "two" - [] though I haven't managed to get that second one to work yet. Marti> But is anyone actually using YAML output format, or was it Marti> implemented simply "because we can"? Until someone decides to dike it out, I think we are obligated to make it produce something resembling correct output. Marti> The reason I bring this up is that queries are frequently Marti> dynamically generated by programs. Good point. >> would you want the original syntax preserved in views Marti> Doesn't matter IMO. I think it's fairly consistent for the parser to do this, since we do a number of other normalization steps there (removing excess nesting and so on). This turns out to be quite trivial. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
On Fri, Sep 19, 2014 at 4:45 AM, Andrew Gierth wrote: > GroupAggregate (cost=1122.39..1197.48 rows=9 width=8) >Group Key: two, four >Group Key: two >Group Key: () > "Grouping Sets": [ > ["two", "four"], > ["two"], > [] +1 looks good to me. > (yaml format) > Grouping Sets: > - - "two" > - "four" > - - "two" > - Now this is weird. But is anyone actually using YAML output format, or was it implemented simply "because we can"? > Marti> Do you think it would be reasonable to normalize single-set > Marti> grouping sets into a normal GROUP BY? > It's certainly possible, though it would seem somewhat odd to write > queries that way. The reason I bring this up is that queries are frequently dynamically generated by programs. Coders are unlikely to special-case SQL generation when there's just a single grouping set. And that's the power of relational databases: the optimization work is done in the database pretty much transparently to the coder (when it works, that is). > would you want the original syntax preserved in views Doesn't matter IMO. > Marti> I'd expect GROUP BY () to be fully equivalent to having no > Marti> GROUP BY clause, but there's a difference in explain > Marti> output. The former displays "Grouping Sets: ()" which is odd, > Marti> since none of the grouping set keywords were used. > That's an implementation artifact, in the sense that we preserve the > fact that GROUP BY () was used by using an empty grouping set. Is it > a problem, really, that it shows up that way in explain? No, not really a problem. :) Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
> "Marti" == Marti Raudsepp writes: Marti> Since you were asking for feedback on the EXPLAIN output on Marti> IRC, I'd weigh in and say that having the groups on separate Marti> lines would be significantly more readable. I revisited the explain output a bit and have come up with these (surrounding material trimmed for clarity): (text format) GroupAggregate (cost=1122.39..1197.48 rows=9 width=8) Group Key: two, four Group Key: two Group Key: () -> ... (xml format) Aggregate Sorted 1122.39 1197.48 9 8 two four two ... (json format) "Plan": { "Node Type": "Aggregate", "Strategy": "Sorted", "Startup Cost": 1122.39, "Total Cost": 1197.48, "Plan Rows": 9, "Plan Width": 8, "Grouping Sets": [ ["two", "four"], ["two"], [] ], "Plans": [...] (yaml format) - Plan: Node Type: "Aggregate" Strategy: "Sorted" Startup Cost: 1122.39 Total Cost: 1197.48 Plan Rows: 9 Plan Width: 8 Grouping Sets: - - "two" - "four" - - "two" - Plans: ... Opinions? Any improvements? I'm not entirely happy with what I had to do with the json and (especially) the YAML output code in order to make this work. There seemed no obvious way to generate nested unlabelled structures in either using the existing Explain* functions, and for the YAML case the best output structure to produce was entirely non-obvious (and trying to read the YAML spec made my head explode). Marti> Do you think it would be reasonable to normalize single-set Marti> grouping sets into a normal GROUP BY? It's certainly possible, though it would seem somewhat odd to write queries that way. Either the parser or the planner could do that; would you want the original syntax preserved in views, or wouldn't that matter? Marti> I'd expect GROUP BY () to be fully equivalent to having no Marti> GROUP BY clause, but there's a difference in explain Marti> output. The former displays "Grouping Sets: ()" which is odd, Marti> since none of the grouping set keywords were used. 
That's an implementation artifact, in the sense that we preserve the fact that GROUP BY () was used by using an empty grouping set. Is it a problem, really, that it shows up that way in explain? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
On 09/17/2014 03:02 PM, Marti Raudsepp wrote: > So instead of: > GroupAggregate >Output: four, ten, hundred, count(*) >Grouping Sets: (onek.four, onek.ten, onek.hundred), (onek.four, > onek.ten), (onek.four), () > > Perhaps print: >Grouping Sets: (onek.four, onek.ten, onek.hundred) > (onek.four, onek.ten) > (onek.four) > () So: Grouping Sets: [ [ onek.four, onek.ten, onek.hundred ], [ onek.four, onek.ten ], [ onek.four ], [] ] .. in JSON? Seems to me that we need a better way to display the grand total grouping set. > > Or maybe: >Grouping Set: (onek.four, onek.ten, onek.hundred) >Grouping Set: (onek.four, onek.ten) >Grouping Set: (onek.four) >Grouping Set: () The latter won't work with JSON and YAML output. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
On Fri, Sep 12, 2014 at 9:41 PM, Andrew Gierth wrote: > gsp1.patch - phase 1 code patch (full syntax, limited functionality) > gsp2.patch - phase 2 code patch (adds full functionality using the > new chained aggregate mechanism) I gave these a try by converting my current CTE-based queries into CUBEs and it works as expected; query time is cut in half and lines of code is 1/4 of original. Thanks! I only have a few trivial observations; if I'm getting too nitpicky let me know. :) Since you were asking for feedback on the EXPLAIN output on IRC, I'd weigh in and say that having the groups on separate lines would be significantly more readable. It took me a while to understand what's going on in my queries due to longer table and column names and wrapping; The comma separators between groups are hard to distinguish. If that can be made to work with the EXPLAIN printer without too much trouble. So instead of: GroupAggregate Output: four, ten, hundred, count(*) Grouping Sets: (onek.four, onek.ten, onek.hundred), (onek.four, onek.ten), (onek.four), () Perhaps print: Grouping Sets: (onek.four, onek.ten, onek.hundred) (onek.four, onek.ten) (onek.four) () Or maybe: Grouping Set: (onek.four, onek.ten, onek.hundred) Grouping Set: (onek.four, onek.ten) Grouping Set: (onek.four) Grouping Set: () Both seem to work with the explain.depesz.com parser, although the 1st won't be aligned as nicely. Do you think it would be reasonable to normalize single-set grouping sets into a normal GROUP BY? Such queries would be capable of using HashAggregate, but the current code doesn't allow that. For example: set enable_sort=off; explain select two, count(*) from onek group by grouping sets (two); Could be equivalent to: explain select two, count(*) from onek group by two; I'd expect GROUP BY () to be fully equivalent to having no GROUP BY clause, but there's a difference in explain output. The former displays "Grouping Sets: ()" which is odd, since none of the grouping set keywords were used. 
# explain select count(*) from onek group by (); Aggregate (cost=77.78..77.79 rows=1 width=0) Grouping Sets: () -> Index Only Scan using onek_stringu1 on onek (cost=0.28..75.28 rows=1000 width=0) Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
> "Tomas" == Tomas Vondra writes: Tomas> If we can get rid of the excessive ChainAggregate, that's Tomas> certainly enough for now. I found an algorithm that should provably give the minimal number of sorts (I was afraid that problem would turn out to be NP-hard, but not so - it's solvable in P by reducing it to a problem of maximal matching in bipartite graphs). Updated patch should be forthcoming in a day or two. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
On 7.9.2014 18:52, Andrew Gierth wrote: >> "Tomas" == Tomas Vondra writes: > > Tomas> Maybe preventing this completely (i.e. raising an ERROR with > Tomas> "duplicate columns in CUBE/ROLLUP/... clauses") would be > Tomas> appropriate. Does the standard says anything about this? > > The spec does not say anything explicitly about duplicates, so they > are allowed (and duplicate grouping _sets_ can't be removed, only > duplicate columns within a single GROUP BY clause after the grouping > sets have been eliminated by transformation). I have checked my > reading of the spec against oracle 11 and MSSQL using sqlfiddle. > > The way the spec handles grouping sets is to define a sequence of > syntactic transforms that result in a query which is a UNION ALL of > ordinary GROUP BY queries. (We haven't tried to implement the > additional optional feature of GROUP BY DISTINCT.) Since it's UNION > ALL, any duplicates must be preserved, so a query with GROUPING SETS > ((a),(a)) reduces to: > > SELECT ... GROUP BY a UNION ALL SELECT ... GROUP BY a; > > and therefore has duplicates of all its result rows. > > I'm quite prepared to concede that I may have read the spec wrong > (wouldn't be the first time), but in this case I require any such > claim to be backed up by an example from some other db showing an > actual difference in behavior. I think you read the spec right. Apparently duplicate grouping sets are allowed, and it's supposed to output that grouping set twice. The section on ROLLUP/CUBE does not mention duplicates at all; it only explains how to generate all the possible grouping sets, so if you have duplicate columns there, you'll get duplicate sets (which is allowed). If we can get rid of the excessive ChainAggregate, that's certainly enough for now. Optimizing it could be simple, though - you don't need to keep the duplicate groups, you only need to keep a counter "how many times to output this group". 
But the more I think about it, the more I think we can ignore that. There are far more important pieces to implement, and if you write bad SQL there's no help anyway. regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
> "Tomas" == Tomas Vondra writes: >> As for computing it all twice, there's currently no attempt to >> optimize multiple identical grouping sets into multiple >> projections of a single grouping set result. CUBE(a,b,c,a) has >> twice as many grouping sets as CUBE(a,b,c) does, even though all >> the extra ones are duplicates. Tomas> Shouldn't this be solved by eliminating the excessive Tomas> ChainAggregate? Although it probably changes GROUPING(...), Tomas> so it's not just about removing the duplicate column(s) from Tomas> the CUBE. Eliminating the excess ChainAggregate would not change the number of grouping sets, only where they are computed. Tomas> Maybe preventing this completely (i.e. raising an ERROR with Tomas> "duplicate columns in CUBE/ROLLUP/... clauses") would be Tomas> appropriate. Does the standard says anything about this? The spec does not say anything explicitly about duplicates, so they are allowed (and duplicate grouping _sets_ can't be removed, only duplicate columns within a single GROUP BY clause after the grouping sets have been eliminated by transformation). I have checked my reading of the spec against oracle 11 and MSSQL using sqlfiddle. The way the spec handles grouping sets is to define a sequence of syntactic transforms that result in a query which is a UNION ALL of ordinary GROUP BY queries. (We haven't tried to implement the additional optional feature of GROUP BY DISTINCT.) Since it's UNION ALL, any duplicates must be preserved, so a query with GROUPING SETS ((a),(a)) reduces to: SELECT ... GROUP BY a UNION ALL SELECT ... GROUP BY a; and therefore has duplicates of all its result rows. I'm quite prepared to concede that I may have read the spec wrong (wouldn't be the first time), but in this case I require any such claim to be backed up by an example from some other db showing an actual difference in behavior. 
-- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
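[Editorial aside: the spec transform Andrew describes, GROUPING SETS reducing to a UNION ALL of plain GROUP BY queries, can be modelled outside the database to see the duplicate-preserving behaviour. A toy Python model with hypothetical rows, not PostgreSQL code:]

```python
from collections import Counter

def group_by(rows, keys):
    """A plain GROUP BY: one (key values, count(*)) row per distinct group."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return sorted(counts.items())

def grouping_sets(rows, sets):
    # Per the spec transform: UNION ALL of one plain GROUP BY per set,
    # so duplicate grouping sets duplicate every one of their result rows.
    out = []
    for keys in sets:
        out.extend(group_by(rows, keys))
    return out

rows = [{"a": 1}, {"a": 1}, {"a": 2}]
# GROUPING SETS ((a),(a)) behaves like GROUP BY a UNION ALL GROUP BY a:
print(grouping_sets(rows, [("a",), ("a",)]))
# [((1,), 2), ((2,), 1), ((1,), 2), ((2,), 1)]
```

Every group row appears twice, matching the UNION ALL reading of the spec that Andrew verified against Oracle and MSSQL.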
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
On 7.9.2014 15:11, Andrew Gierth wrote: >> "Tomas" == Tomas Vondra writes: > > >> It's not one sort per grouping set, it's the minimal number of > >> sorts needed to express the result as a union of ROLLUP > >> clauses. The planner code will (I believe) always find the > >> smallest number of sorts needed. > > Tomas> You're probably right. Although when doing GROUP BY CUBE > Tomas> (a,b,c,a) I get one more ChainAggregate than with > Tomas> CUBE(a,b,c). and we seem to compute all the aggregates > Tomas> twice. Not sure if we need to address this though, because > Tomas> it's mostly user's fault. > > Hm. Yeah, you're right that the number of sorts is not optimal > there. We can look into that. I don't think it's very critical, though. I was worried about it because of the sorts, but if that gets tackled in patches following this commitfest it seems OK. > As for computing it all twice, there's currently no attempt to > optimize multiple identical grouping sets into multiple projections > of a single grouping set result. CUBE(a,b,c,a) has twice as many > grouping sets as CUBE(a,b,c) does, even though all the extra ones are > duplicates. Shouldn't this be solved by eliminating the excessive ChainAggregate? Although it probably changes GROUPING(...), so it's not just about removing the duplicate column(s) from the CUBE. Maybe preventing this completely (i.e. raising an ERROR with "duplicate columns in CUBE/ROLLUP/... clauses") would be appropriate. Does the standard says anything about this? But arguably this is a minor issue, happening only when the user uses the same column/expression twice. Hopefully the users don't do that too often. regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
> "Tomas" == Tomas Vondra writes: >> It's not one sort per grouping set, it's the minimal number of >> sorts needed to express the result as a union of ROLLUP >> clauses. The planner code will (I believe) always find the >> smallest number of sorts needed. Tomas> You're probably right. Although when doing GROUP BY CUBE Tomas> (a,b,c,a) I get one more ChainAggregate than with Tomas> CUBE(a,b,c). and we seem to compute all the aggregates Tomas> twice. Not sure if we need to address this though, because Tomas> it's mostly user's fault. Hm. Yeah, you're right that the number of sorts is not optimal there. We can look into that. As for computing it all twice, there's currently no attempt to optimize multiple identical grouping sets into multiple projections of a single grouping set result. CUBE(a,b,c,a) has twice as many grouping sets as CUBE(a,b,c) does, even though all the extra ones are duplicates. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
On 6.9.2014 23:34, Andrew Gierth wrote: >> "Tomas" == Tomas Vondra writes: > > Tomas> I have significant doubts about the whole design, > Tomas> though. Especially the decision not to use HashAggregate, > > There is no "decision not to use HashAggregate". There is simply no > support for HashAggregate yet. > > Having it be able to work with GroupAggregate is essential, because > there are always cases where HashAggregate is simply not permitted > (e.g. when using distinct or sorted aggs; or unhashable types; or with > the current code, when the estimated memory usage exceeds work_mem). > HashAggregate may be a performance improvement, but it's something > that can be added afterwards rather than an essential part of the > feature. Ah, OK. I got confused by the "final patch" subject, and so the possibility of additional optimization somehow didn't occur to me. > Tomas> Now, the chaining only makes this worse, because it > Tomas> effectively forces a separate sort of the whole table for each > Tomas> grouping set. > > It's not one sort per grouping set, it's the minimal number of sorts > needed to express the result as a union of ROLLUP clauses. The planner > code will (I believe) always find the smallest number of sorts needed. You're probably right. Although when doing GROUP BY CUBE (a,b,c,a) I get one more ChainAggregate than with CUBE(a,b,c). and we seem to compute all the aggregates twice. Not sure if we need to address this though, because it's mostly user's fault. > Each aggregate node can process any number of grouping sets as long as > they represent a single rollup list (and therefore share a single sort > order). > > Yes, this is slower than using one hashagg. But it solves the general > problem in a way that does not interfere with future optimization. 
> > (HashAggregate can be added to the current implementation by first > adding executor support for hashagg with multiple grouping sets, then > in the planner, extracting as many hashable grouping sets as possible > from the list before looking for rollup lists. The chained aggregate > code can work just fine with a HashAggregate as the chain head. > > We have not actually tackled this, since I'm not going to waste any > time adding optimizations before the basic idea is accepted.) OK, understood. > > Tomas> What I envisioned when considering hacking on this a few > Tomas> months back, was extending the aggregate API with "merge > Tomas> state" function, > > That's not really on the cards for arbitrary non-trivial aggregate > functions. > > Yes, it can be done for simple ones, and if you want to use that as a > basis for adding optimizations that's fine. But a solution that ONLY > works in simple cases isn't sufficient, IMO. I believe it can be done for most aggregates, assuming you have access to the internal state somehow (not just the). Adding it for in-core aggregates would not be difficult, in most cases. But you're right we don't have this now at all. regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
> "Tomas" == Tomas Vondra writes: Tomas> I have significant doubts about the whole design, Tomas> though. Especially the decision not to use HashAggregate, There is no "decision not to use HashAggregate". There is simply no support for HashAggregate yet. Having it be able to work with GroupAggregate is essential, because there are always cases where HashAggregate is simply not permitted (e.g. when using distinct or sorted aggs; or unhashable types; or with the current code, when the estimated memory usage exceeds work_mem). HashAggregate may be a performance improvement, but it's something that can be added afterwards rather than an essential part of the feature. Tomas> Now, the chaining only makes this worse, because it Tomas> effectively forces a separate sort of the whole table for each Tomas> grouping set. It's not one sort per grouping set, it's the minimal number of sorts needed to express the result as a union of ROLLUP clauses. The planner code will (I believe) always find the smallest number of sorts needed. Each aggregate node can process any number of grouping sets as long as they represent a single rollup list (and therefore share a single sort order). Yes, this is slower than using one hashagg. But it solves the general problem in a way that does not interfere with future optimization. (HashAggregate can be added to the current implementation by first adding executor support for hashagg with multiple grouping sets, then in the planner, extracting as many hashable grouping sets as possible from the list before looking for rollup lists. The chained aggregate code can work just fine with a HashAggregate as the chain head. We have not actually tackled this, since I'm not going to waste any time adding optimizations before the basic idea is accepted.) 
Tomas> What I envisioned when considering hacking on this a few Tomas> months back, was extending the aggregate API with "merge Tomas> state" function, That's not really on the cards for arbitrary non-trivial aggregate functions. Yes, it can be done for simple ones, and if you want to use that as a basis for adding optimizations that's fine. But a solution that ONLY works in simple cases isn't sufficient, IMO. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
On 31.8.2014 22:52, Andrew Gierth wrote: > Recut patches: > > gsp1.patch - phase 1 code patch (full syntax, limited functionality) > gsp2.patch - phase 2 code patch (adds full functionality using the > new chained aggregate mechanism) > gsp-doc.patch - docs > gsp-contrib.patch - quote "cube" in contrib/cube and contrib/earthdistance, > intended primarily for testing pending a decision on > renaming contrib/cube or unreserving keywords > gsp-u.patch- proposed method to unreserve CUBE and ROLLUP > > (the contrib patch is not necessary if the -u patch is used; the > contrib/pg_stat_statements fixes are in the phase1 patch) Hi, I looked at the patch today. The good news is it seems to apply cleanly on HEAD (with some small offsets, but no conflicts). The code generally seems OK to me, although the patch is quite massive. I've also did a considerable amount of testing and I've been unable to cause failures. I have significant doubts about the whole design, though. Especially the decision not to use HashAggregate, and the whole chaining idea. I haven't noticed any discussion about this (at least in this thread), and the chaining idea was not mentioned until 21/8, so I'd appreciate some reasoning behind this choice. I assume the "no HashAggregate" decision was done because of fear of underestimates, and the related OOM issues. I don't see how this is different from the general HashAggregate, though. Or is there another reason for this? Now, the chaining only makes this worse, because it effectively forces a separate sort of the whole table for each grouping set. We're doing a lot of analytics on large tables, where large means tens of GBs and hundreds of millions of rows. What we do now at the moment is basically the usual ROLAP approach - create a cube with aggregated data, which is usually much smaller than the source table, and then compute the rollups for the interesting slices in a second step. 
I was hoping that maybe we could eventually replace this with the GROUP BY CUBE functionality provided by this patch, but these design decisions make it pretty much impossible. I believe most other users processing non-trivial amounts of data (pretty much everyone with just a few million rows) will be in similar position :-( What I envisioned when considering hacking on this a few months back, was extending the aggregate API with "merge state" function, doing the aggregation just like today and merging the groups (for each cell) at the end. Yeah, we don't have this infrastructure, but maybe it'd be a better way than the current chaining approach. And it was repeatedly mentioned as necessary for parallel aggregation (and even mentioned in the memory-bounded hashagg batching discussion). I'm ready to spend some time on this, if it makes the grouping sets useful for us. regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
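[Editorial aside: the "merge state" function Tomas envisions can be illustrated abstractly (PostgreSQL later gained something along these lines in the form of aggregate combine functions): keep a partial transition state per group, and merge partial states instead of re-scanning inputs. A minimal Python sketch with hypothetical names, not the actual aggregate API:]

```python
class AvgState:
    """Transition state for avg(): a (sum, count) pair."""
    def __init__(self):
        self.total = 0.0
        self.count = 0

    def accum(self, value):           # the ordinary transition function
        self.total += value
        self.count += 1

    def merge(self, other):           # the proposed "merge state" function
        self.total += other.total
        self.count += other.count

    def final(self):                  # the final function
        return self.total / self.count if self.count else None

# Aggregate each half of the input into its own partial state, then
# merge; the result matches aggregating the whole input directly.
left, right, whole = AvgState(), AvgState(), AvgState()
data = [1.0, 2.0, 3.0, 4.0]
for v in data[:2]:
    left.accum(v)
    whole.accum(v)
for v in data[2:]:
    right.accum(v)
    whole.accum(v)
left.merge(right)
print(left.final(), whole.final())  # 2.5 2.5
```

This works because the internal state (sum, count) is mergeable, which is Andrew's point in the reply: it holds for simple aggregates but not for arbitrary ones, e.g. ordered-set aggregates whose state is the full input.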
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
On Sunday, August 31, 2014, Andres Freund wrote: > On 2014-08-31 21:09:59 +0530, Atri Sharma wrote: > > On Sun, Aug 31, 2014 at 9:07 PM, Erik Rijkers > wrote: > > > I have found that the "unrecognized node type" error is caused by: > > It's a warning, not an error, right? > > > > shared_preload_libraries = pg_stat_statements > > > > > > in postgresql.conf (as my default compile script was doing). > > > > > > If I disable that line the error goes away. > > > > > > > > I think that's more of a library linking problem rather than a problem > with > > the patch. I couldn't reproduce it, though. > > I think it's vastly more likely that the patch simply didn't add the new > expression types to pg_stat_statements.c:JumbleExpr(). > > > Must have run the above diagnosis in a wrong manner then, I will check. Thanks for the heads up! Regards, Atri -- Regards, Atri *l'apprenant*
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
On 2014-08-31 21:09:59 +0530, Atri Sharma wrote: > On Sun, Aug 31, 2014 at 9:07 PM, Erik Rijkers wrote: > > I have found that the "unrecognized node type" error is caused by: It's a warning, not an error, right? > > shared_preload_libraries = pg_stat_statements > > > > in postgresql.conf (as my default compile script was doing). > > > > If I disable that line the error goes away. > > > I think that's more of a library linking problem rather than a problem with > the patch. I couldn't reproduce it, though. I think it's vastly more likely that the patch simply didn't add the new expression types to pg_stat_statements.c:JumbleExpr(). Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
On Sun, Aug 31, 2014 at 9:07 PM, Erik Rijkers wrote: > On Tue, August 26, 2014 14:24, Andrew Gierth wrote: > >> "Erik" == Erik Rijkers writes: > > > > >> They apply cleanly for me at 2bde297 whether with git apply or > > >> patch, except for the contrib one (which you don't need unless you > > >> want to run the contrib regression tests without applying the > > >> gsp-u patch). > > > > Erik> Ah, I had not realised that. Excluding that contrib-patch and > > Erik> only applying these three: > > > > Erik> gsp1.patch > > Erik> gsp2.patch > > Erik> gsp-doc.patch > > > > Erik> does indeed work (applies, compiles). > > > > I put up a rebased contrib patch anyway (linked off the CF). > > > > Did the "unrecognized node type" error go away, or do we still need to > > look into that? > > > > I have found that the "unrecognized node type" error is caused by: > > shared_preload_libraries = pg_stat_statements > > in postgresql.conf (as my default compile script was doing). > > If I disable that line the error goes away. > > I think that's more of a library linking problem than a problem with the patch. I couldn't reproduce it, though. Regards, Atri -- Regards, Atri *l'apprenant*
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
On Tue, August 26, 2014 14:24, Andrew Gierth wrote: >> "Erik" == Erik Rijkers writes: > > >> They apply cleanly for me at 2bde297 whether with git apply or > >> patch, except for the contrib one (which you don't need unless you > >> want to run the contrib regression tests without applying the > >> gsp-u patch). > > Erik> Ah, I had not realised that. Excluding that contrib-patch and > Erik> only applying these three: > > Erik> gsp1.patch > Erik> gsp2.patch > Erik> gsp-doc.patch > > Erik> does indeed work (applies, compiles). > > I put up a rebased contrib patch anyway (linked off the CF). > > Did the "unrecognized node type" error go away, or do we still need to > look into that? > I have found that the "unrecognized node type" error is caused by: shared_preload_libraries = pg_stat_statements in postgresql.conf (as my default compile script was doing). If I disable that line the error goes away. I don't know exactly what that means for the grouping sets patches but I thought I'd mention it here. Otherwise I've not run into any problems with GROUPING SETS. Erik Rijkers
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
On Tue, August 26, 2014 14:24, Andrew Gierth wrote:
>> "Erik" == Erik Rijkers writes:
>
> >> They apply cleanly for me at 2bde297 whether with git apply or
> >> patch, except for the contrib one (which you don't need unless you
> >> want to run the contrib regression tests without applying the
> >> gsp-u patch).
>
> Erik> Ah, I had not realised that. Excluding that contrib-patch and
> Erik> only applying these three:
>
> Erik> gsp1.patch
> Erik> gsp2.patch
> Erik> gsp-doc.patch
>
> Erik> does indeed work (applies, compiles).
>
> I put up a rebased contrib patch anyway (linked off the CF).
>
> Did the "unrecognized node type" error go away, or do we still need to
> look into that?

Yes, it did go away; looks fine now:

select brand, size, grouping(brand, size), sum(sales)
from items_sold
group by rollup(brand, size);

 brand | size | grouping | sum
-------+------+----------+-----
 Bar   | L    |        0 |   5
 Bar   | M    |        0 |  15
 Bar   |      |        1 |  20
 Foo   | L    |        0 |  10
 Foo   | M    |        0 |  20
 Foo   |      |        1 |  30
       |      |        3 |  50
(7 rows)

I'm a bit unclear why the bottom-row 'grouping' value is 3. Shouldn't that be 2? But I'm still reading the documentation so it's perhaps too early to ask...

Thanks, Erik Rijkers
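For readers puzzled by the same question: GROUPING() returns a bit mask, not a count. The leftmost argument supplies the most significant bit, and a bit is 1 when that column is aggregated away in the current grouping set. On the grand-total row both brand and size are aggregated away, giving binary 11 = 3. A minimal Python sketch of the rule (the function name merely mirrors the SQL one for illustration):

```python
def grouping(*grouped_out):
    # grouping(a, b, ...) builds a bit mask: the leftmost argument
    # is the most significant bit; a bit is 1 when that column is
    # NOT part of the current grouping set (i.e. aggregated away)
    result = 0
    for is_grouped_out in grouped_out:
        result = (result << 1) | (1 if is_grouped_out else 0)
    return result

# ROLLUP(brand, size) subtotal rows (size grouped out):  0b01 = 1
# grand-total row (brand and size both grouped out):     0b11 = 3
```

So 3 is correct for the bottom row; 2 (0b10) would mean brand was aggregated away but size was not.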
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
> "Erik" == Erik Rijkers writes: >> They apply cleanly for me at 2bde297 whether with git apply or >> patch, except for the contrib one (which you don't need unless you >> want to run the contrib regression tests without applying the >> gsp-u patch). Erik> Ah, I had not realised that. Excluding that contrib-patch and Erik> only applying these three: Erik> gsp1.patch Erik> gsp2.patch Erik> gsp-doc.patch Erik> does indeed work (applies, compiles). I put up a rebased contrib patch anyway (linked off the CF). Did the "unrecognized node type" error go away, or do we still need to look into that? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
On Tue, August 26, 2014 11:13, Andrew Gierth wrote: >> "Andrew" == Andrew Gierth writes: > >> "Erik" == Erik Rijkers writes: > > Erik> The patches did not apply anymore so I applied at 73eba19aebe0. > Erik> There they applied OK, and make && make check was OK. > > Andrew> I'll look and rebase if need be. > > They apply cleanly for me at 2bde297 whether with git apply or patch, > except for the contrib one (which you don't need unless you want to > run the contrib regression tests without applying the gsp-u patch). > Ah, I had not realised that. Excluding that contrib-patch and only applying these three: gsp1.patch gsp2.patch gsp-doc.patch does indeed work (applies, compiles). Thank you, Erik Rijkers
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
> "Andrew" == Andrew Gierth writes: > "Erik" == Erik Rijkers writes: Erik> The patches did not apply anymore so I applied at 73eba19aebe0. Erik> There they applied OK, and make && make check was OK. Andrew> I'll look and rebase if need be. They apply cleanly for me at 2bde297 whether with git apply or patch, except for the contrib one (which you don't need unless you want to run the contrib regression tests without applying the gsp-u patch). -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
> "Erik" == Erik Rijkers writes: Erik> The patches did not apply anymore so I applied at 73eba19aebe0. Erik> There they applied OK, and make && make check was OK. I'll look and rebase if need be. --> WARNING: unrecognized node type: 347 Can't reproduce this - are you sure it's not a mis-build? -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS - unrecognized node type: 347
On Mon, August 25, 2014 07:21, Andrew Gierth wrote:
> Here is the new version of our grouping sets patch. This version
> supersedes the previous post.

The patches did not apply anymore so I applied at 73eba19aebe0. There they applied OK, and make && make check was OK.

drop table if exists items_sold;
create table items_sold as
select * from (
  values ('Foo', 'L', 10)
       , ('Foo', 'M', 20)
       , ('Bar', 'M', 15)
       , ('Bar', 'L', 5)
) as f(brand, size, sales);

select brand, size, grouping(brand, size), sum(sales)
from items_sold
group by rollup(brand, size);

--> WARNING: unrecognized node type: 347

I suppose that's not correct.

thanks, Erik Rijkers
Re: [HACKERS] Final Patch for GROUPING SETS
2014-08-26 2:45 GMT+02:00 Andrew Gierth : > > "Pavel" == Pavel Stehule writes: > > Pavel> Hi > Pavel> I checked this patch, and it working very well > > Pavel> I found only two issue - I am not sure if it is issue > > Pavel> It duplicate rows > > Pavel> postgres=# explain select name, place, sum(count), grouping(name), > Pavel> grouping(place) from cars group by rollup(name, place), name; > Pavel>QUERY PLAN > Pavel> > > Pavel> GroupAggregate (cost=101.14..101.38 rows=18 > width=68) > Pavel>Grouping Sets: (name, place), (name), (name) > > I think I can safely claim from the spec that our version is correct. > Following the syntactic transformations given in 7.9 > of sql2008, we have: > > GROUP BY rollup(name,place), name; > > parses as GROUP BY , > > Syntax rule 13 replaces the giving: > > GROUP BY GROUPING SETS ((name,place), (name), ()), name; > > Syntax rule 16b gives: > > GROUP BY GROUPING SETS ((name,place), (name), ()), GROUPING SETS (name); > > Syntax rule 16c takes the cartesian product of the two sets: > > GROUP BY GROUPING SETS ((name,place,name), (name,name), (name)); > > Syntax rule 17 gives: > > SELECT ... GROUP BY name,place,name > UNION ALL > SELECT ... GROUP BY name,name > UNION ALL > SELECT ... GROUP BY name > > Obviously at this point the extra "name" columns become redundant so > we eliminate them (this doesn't correspond to a spec rule, but doesn't > change the semantics). So we're left with: > > SELECT ... GROUP BY name,place > UNION ALL > SELECT ... GROUP BY name > UNION ALL > SELECT ... GROUP BY name > > Running a quick test on sqlfiddle with Oracle 11 suggests that Oracle's > behavior agrees with my interpretation. > > Nothing in the spec that I can find licenses the elimination of > duplicate grouping sets except indirectly via feature T434 (GROUP BY > DISTINCT ...), which we did not attempt to implement. 
ok, I'll try to search my memory for some references indicating that redundant columns should be reduced. Regards Pavel
Re: [HACKERS] Final Patch for GROUPING SETS
> "Pavel" == Pavel Stehule writes: Pavel> Hi Pavel> I checked this patch, and it working very well Pavel> I found only two issue - I am not sure if it is issue Pavel> It duplicate rows Pavel> postgres=# explain select name, place, sum(count), grouping(name), Pavel> grouping(place) from cars group by rollup(name, place), name; Pavel>QUERY PLAN Pavel> Pavel> GroupAggregate (cost=101.14..101.38 rows=18 width=68) Pavel>Grouping Sets: (name, place), (name), (name) I think I can safely claim from the spec that our version is correct. Following the syntactic transformations given in 7.9 of sql2008, we have: GROUP BY rollup(name,place), name; parses as GROUP BY , Syntax rule 13 replaces the giving: GROUP BY GROUPING SETS ((name,place), (name), ()), name; Syntax rule 16b gives: GROUP BY GROUPING SETS ((name,place), (name), ()), GROUPING SETS (name); Syntax rule 16c takes the cartesian product of the two sets: GROUP BY GROUPING SETS ((name,place,name), (name,name), (name)); Syntax rule 17 gives: SELECT ... GROUP BY name,place,name UNION ALL SELECT ... GROUP BY name,name UNION ALL SELECT ... GROUP BY name Obviously at this point the extra "name" columns become redundant so we eliminate them (this doesn't correspond to a spec rule, but doesn't change the semantics). So we're left with: SELECT ... GROUP BY name,place UNION ALL SELECT ... GROUP BY name UNION ALL SELECT ... GROUP BY name Running a quick test on sqlfiddle with Oracle 11 suggests that Oracle's behavior agrees with my interpretation. Nothing in the spec that I can find licenses the elimination of duplicate grouping sets except indirectly via feature T434 (GROUP BY DISTINCT ...), which we did not attempt to implement. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
Hi

I checked this patch, and it is working very well.

I found only two issues - I am not sure if they are issues - with data from https://wiki.postgresql.org/wiki/Grouping_Sets

postgres=# select name, place, sum(count), grouping(name), grouping(place)
           from cars group by rollup(name, place);

 name  |   place    |  sum  | grouping | grouping
-------+------------+-------+----------+----------
 bmw   | czech rep. |   100 |        0 |        0
 bmw   | germany    |  1000 |        0 |        0
 bmw   |            |  1100 |        0 |        1
 opel  | czech rep. |  7000 |        0 |        0
 opel  | germany    |  7000 |        0 |        0
 opel  |            | 14000 |        0 |        1
 skoda | czech rep. |     1 |        0 |        0
 skoda | germany    |  5000 |        0 |        0
 skoda |            | 15000 |        0 |        1
       |            | 30100 |        1 |        1
(10 rows)

* redundant sets should be ignored

postgres=# select name, place, sum(count), grouping(name), grouping(place)
           from cars group by rollup(name, place), name;

 name  |   place    |  sum  | grouping | grouping
-------+------------+-------+----------+----------
 bmw   | czech rep. |   100 |        0 |        0
 bmw   | germany    |  1000 |        0 |        0
 bmw   |            |  1100 |        0 |        1
 bmw   |            |  1100 |        0 |        1
 opel  | czech rep. |  7000 |        0 |        0
 opel  | germany    |  7000 |        0 |        0
 opel  |            | 14000 |        0 |        1
 opel  |            | 14000 |        0 |        1
 skoda | czech rep. |     1 |        0 |        0
 skoda | germany    |  5000 |        0 |        0
 skoda |            | 15000 |        0 |        1
 skoda |            | 15000 |        0 |        1
(12 rows)

It duplicates rows:

postgres=# explain select name, place, sum(count), grouping(name),
           grouping(place) from cars group by rollup(name, place), name;

                          QUERY PLAN
----------------------------------------------------------------
 GroupAggregate  (cost=101.14..101.38 rows=18 width=68)
   Grouping Sets: (name, place), (name), (name)
   ->  Sort  (cost=101.14..101.15 rows=6 width=68)
         Sort Key: name, place
         ->  Seq Scan on cars  (cost=0.00..1.06 rows=6 width=68)
 Planning time: 0.235 ms
(6 rows)

postgres=# select name, place, sum(count), grouping(name), grouping(place)
           from cars group by grouping sets((name, place), (name), (name), (place), ());

 name  |   place    |  sum  | grouping | grouping
-------+------------+-------+----------+----------
 bmw   | czech rep. |   100 |        0 |        0
 bmw   | germany    |  1000 |        0 |        0
 bmw   |            |  1100 |        0 |        1
 bmw   |            |  1100 |        0 |        1
 opel  | czech rep. |  7000 |        0 |        0
 opel  | germany    |  7000 |        0 |        0
 opel  |            | 14000 |        0 |        1
 opel  |            | 14000 |        0 |        1
 skoda | czech rep. |     1 |        0 |        0
 skoda | germany    |  5000 |        0 |        0
 skoda |            | 15000 |        0 |        1
 skoda |            | 15000 |        0 |        1
       |            | 30100 |        1 |        1
       | czech rep. | 17100 |        1 |        0
       | germany    | 13000 |        1 |        0
(15 rows)

Fantastic work

Regards Pavel

2014-08-25 7:21 GMT+02:00 Andrew Gierth :
> Here is the new version of our grouping sets patch. This version
> supersedes the previous post.
>
> We believe the functionality of this version to be substantially
> complete, providing all the standard grouping set features except T434
> (GROUP BY DISTINCT). (Additional tweaks, such as extra variants on
> GROUPING(), could be added for compatibility with other databases.)
>
> Since the debate regarding reserved keywords has not produced any
> useful answer, the main patch here makes CUBE and ROLLUP into
> col_name_reserved keywords, but a separate small patch is attached to
> make them unreserved_keywords instead.
>
> So there are now 5 files:
>
> gsp1.patch - phase 1 code patch (full syntax, limited functionality)
> gsp2.patch - phase 2 code patch (adds full functionality using the
>              new chained aggregate mechanism)
> gsp-doc.patch - docs
> gsp-contrib.patch - quote "cube" in contrib/cube and contrib/earthdistance,
>              intended primarily for testing pending a decision on
>              renaming contrib/cube or unreserving keywords
> gsp-u.patch - proposed method to unreserve CUBE and ROLLUP
>
> -- Andrew (irc:RhodiumToad)