On Tue, Oct 27, 2020 at 5:43 PM Robert Haas <robertmh...@gmail.com> wrote:
>
> On Thu, Oct 22, 2020 at 5:08 AM Dilip Kumar <dilipbal...@gmail.com> wrote:
> > Interesting idea. So IIUC, whenever a worker is scanning a tuple it
> > will directly put it into the respective batch (shared tuple store),
> > based on the hash of the grouping column, and once all the workers are
> > done preparing the batches, each worker will pick those batches one
> > by one, perform the sort, and finish the aggregation. I think there is
> > scope for improvement: instead of directly putting the tuple into the
> > batch, what if the worker does the partial aggregation and then
> > places the partially aggregated rows in the shared tuple store based
> > on the hash value, and then the workers can pick them up batch by batch?
> > This way, we can avoid doing large sorts. This approach can also be
> > used with hash aggregate, I mean the data partially aggregated by the
> > hash aggregate can be put into the respective batch.
>
> I am not sure if this would be a win if the typical group size is
> small and the transition state has to be serialized/deserialized.
> Possibly we need multiple strategies, but I guess we'd have to test
> performance to be sure.

+1

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
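
For illustration only, here is a minimal Python sketch of the scheme discussed in the quoted text: each worker partially aggregates its own rows, routes the partial states to a batch chosen by the hash of the grouping key, and the batches are then finalized independently. This is not PostgreSQL code. The names (N_BATCHES, partial_aggregate, route_to_batches, finalize_batch, parallel_avg) are hypothetical, avg() with a (count, sum) transition state is just an assumed example aggregate, and a plain dict of lists stands in for the shared tuple store.

```python
from collections import defaultdict

N_BATCHES = 4  # hypothetical number of shared batches


def partial_aggregate(rows):
    """Per-worker step: fold (key, value) rows into (count, sum) states."""
    states = {}
    for key, value in rows:
        count, total = states.get(key, (0, 0))
        states[key] = (count + 1, total + value)
    return states


def route_to_batches(states, batches):
    """Place each partial state in the batch chosen by the key's hash."""
    for key, state in states.items():
        batches[hash(key) % N_BATCHES].append((key, state))


def finalize_batch(batch):
    """Combine the partial states in one batch and compute avg per group."""
    combined = {}
    for key, (count, total) in batch:
        c, t = combined.get(key, (0, 0))
        combined[key] = (c + count, t + total)
    return {key: t / c for key, (c, t) in combined.items()}


def parallel_avg(worker_inputs):
    batches = defaultdict(list)  # stands in for the shared tuple store
    for rows in worker_inputs:   # each iteration plays the role of one worker
        route_to_batches(partial_aggregate(rows), batches)
    result = {}
    for batch in batches.values():  # a given key lands in exactly one batch
        result.update(finalize_batch(batch))
    return result


if __name__ == "__main__":
    workers = [[(1, 10), (2, 20), (1, 30)], [(1, 50), (3, 5), (2, 40)]]
    print(parallel_avg(workers))
```

The sketch also makes the concern raised above visible: if most groups have only one or two rows per worker, partial_aggregate emits nearly as many states as it read rows, so the batches are barely smaller and every state still has to be written to and read back from the shared store.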