On Tue, 9 Dec 2025 at 04:37, Sergey Soloviev <[email protected]> wrote:
> I would like to introduce a new GROUP BY strategy, called Index Aggregate.
> In a nutshell, we build a B+tree index where the GROUP BY attributes are
> the index keys, and if the memory limit is reached, we build an index for
> each batch and spill it to disk as a sorted run, performing a final
> external merge.
> Mean IndexAgg time is about 1.8 ms vs 2 ms for hash + sort, so the win is
> about 10%.
>
> Also, I have run the TPC-H tests; 2 queries (4 and 5) used the Index Agg
> node, and this gave a gain of about 5% in time.

Interesting. Are you able to provide benchmarks with increasing numbers of
groups, say 100 to 100 million, increasing in multiples of 10, with say
1GB of work_mem and, to be fair, hash_mem_multiplier = 1, for all 3
strategies?

A binary search's performance characteristics differ vastly from
simplehash's hash lookup and linear-probe search. Binary searches become
much less optimal as the array grows large, since there are many more
opportunities for cache misses than with a linear-probing hash table. I
think you're going to have to demonstrate that the window where this is
useful is big enough to warrant the extra code.

Ideally, show a graph with Hash Aggregate as the baseline, always at 1,
then run the same benchmark forcing a Sort -> Group Agg, and then also
your Index Agg. Also, ideally, provide scripts for this so people can
easily run it themselves, to allow us to see how other hardware compares
to yours (a rough sketch of such a script follows below). Doing this may
also help you move forward with your costing code for the planner, but
the main thing to show is that there is a large enough data size range
where this wins.

You might want to repeat the test a few times with different data types:
perhaps int or bigint, then also something varlena, and maybe something
byref, such as UUID. Also, you might want to avoid presorted data, as I
suspect it'll be hard to beat Sort -> Group Agg there. Not causing
performance regressions for presorted data might be quite a tricky aspect
of this patch.

David
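A minimal, untested sketch of the kind of script being asked for, shown
for one group count and an int grouping column. It assumes the patch adds
an enable_indexagg GUC to switch the new node on and off — that name is
hypothetical, so substitute whatever knob the patch actually provides; the
table name and data shape are illustrative only. Everything else is stock
PostgreSQL:

    SET work_mem = '1GB';
    SET hash_mem_multiplier = 1;

    -- 10 million rows with ~1000 distinct groups in random (unsorted)
    -- order; sweep the multiplier from 100 to 100 million for the graph.
    DROP TABLE IF EXISTS aggbench;
    CREATE TABLE aggbench AS
        SELECT (random() * 1000)::int AS g
        FROM generate_series(1, 10000000);
    VACUUM ANALYZE aggbench;

    -- Baseline: Hash Aggregate.
    SET enable_hashagg = on;
    SET enable_indexagg = off;        -- hypothetical GUC from the patch
    EXPLAIN (ANALYZE, TIMING OFF)
    SELECT g, count(*) FROM aggbench GROUP BY g;

    -- Force Sort -> Group Aggregate (Index Agg still disabled above).
    SET enable_hashagg = off;
    EXPLAIN (ANALYZE, TIMING OFF)
    SELECT g, count(*) FROM aggbench GROUP BY g;

    -- Force Index Aggregate (Hash Agg still disabled above).
    SET enable_indexagg = on;         -- hypothetical GUC from the patch
    EXPLAIN (ANALYZE, TIMING OFF)
    SELECT g, count(*) FROM aggbench GROUP BY g;

Normalising each execution time against the Hash Aggregate run would give
the baseline-of-1 graph described above; repeating with a text or uuid
grouping column would cover the varlena and byref cases.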
