> ChainAggregate is a bit like a node having two parents, a Sort and a
> GroupAggregate. However, the graph edge between ChainAggregate and its
> GroupAggregate is a tuplestore instead of the usual, synchronous
> ExecProcNode().
Well, I don't buy the two-parents theory. The Sort nodes are stacked between the ChainAggregate nodes, so there is still a single edge. However, as you rightly said, there is a shared tuplestore, but note that only the ChainAggregate at the head of the chain has the top GroupAggregate as its parent.

> Suppose one node orchestrated all sorting and aggregation. Call it a
> MultiGroupAggregate for now. It wouldn't harness Sort nodes, because it
> performs aggregation between tuplesort_puttupleslot() calls. Instead, it
> would directly manage two Tuplesortstate, CUR and NEXT. The node would have
> an initial phase similar to ExecSort(), in which it drains the outer node to
> populate the first CUR. After that, it looks more like agg_retrieve_direct(),
> except that CUR is the input source, and each tuple drawn is also put into
> NEXT. When done with one CUR, swap CUR with NEXT and reinitialize NEXT. This
> design does not add I/O consumption or require a nonstandard communication
> channel between executor nodes. Tom, Andrew, does that look satisfactory?

So you are essentially proposing merging each ChainAggregate with its corresponding Sort node?

So the structure would be something like:

GroupAggregate --> MultiGroupAgg (a,b) ----> MultiGroupAgg (c,d) ...

I am not sure I understand you correctly. Only the top-level GroupAggregate node projects the result of the entire operation. The key to ChainAggregate nodes is that each one handles the grouping sets that fit a single ROLLUP list, i.e. those that can be computed with a single sort order. There can be multiple lists of this type in a single grouping sets (GS) operation; however, our current design has only a single top GroupAggregate node, but one ChainAggregate node plus one Sort node per sort order.

If you are proposing replacing the GroupAggregate node and the entire stack of ChainAggregate + Sort nodes with a single MultiGroupAggregate node, I do not understand how it would handle all the multiple sort orders.
If you are proposing replacing only each ChainAggregate + Sort pair with a single MultiGroupAgg node, that node would still share the tuplestore with the top-level GroupAggregate node.

I am pretty sure I have messed up my understanding of your proposal. Please correct me if I am wrong.

Regards,

Atri

--
Regards,
Atri
*l'apprenant*
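
To make the quoted CUR/NEXT proposal easier to discuss, here is a minimal Python sketch of one possible reading of it. It is a toy model, not PostgreSQL executor code: the function name `multi_group_aggregate`, the use of plain lists in place of Tuplesortstate, and the choice of a simple row count as the aggregate are all assumptions made for illustration. It also aggregates only one grouping set per sort order, whereas a real ROLLUP list would cover several prefixes in the same pass.

```python
from itertools import groupby


def multi_group_aggregate(rows, sort_orders):
    """Yield (columns, key, count) for each group under each sort order.

    rows        -- list of dicts, standing in for tuples from the outer node
    sort_orders -- list of column-name tuples, one per required sort order
    """
    def key_for(cols):
        return lambda row: tuple(row[c] for c in cols)

    # Initial phase, similar to ExecSort(): drain the input into the first CUR.
    cur = sorted(rows, key=key_for(sort_orders[0]))

    for i, cols in enumerate(sort_orders):
        has_next = i + 1 < len(sort_orders)
        nxt = []  # hypothetical stand-in for the NEXT Tuplesortstate
        for key, group in groupby(cur, key=key_for(cols)):
            count = 0
            for row in group:
                count += 1
                if has_next:
                    # Each tuple drawn from CUR is also put into NEXT.
                    nxt.append(row)
            yield cols, key, count
        if has_next:
            # Swap CUR with NEXT; the sort stands in for finishing the tuplesort.
            cur = sorted(nxt, key=key_for(sort_orders[i + 1]))
```

Under this reading, a single node walks through the sort orders one after another, re-sorting the same tuples for each pass, so no tuplestore shared between separate nodes appears in the sketch. Whether that matches the intended design is exactly the question raised above.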