On Thu, 19 Feb 2026 at 18:32, Andres Freund <[email protected]> wrote:
> > Interestingly the plan doesn't have partial and final on those hash agg 
> > nodes:
> > ...
> > There are timescale tables involved in the plan, so I think timescale
> > might be behind that.
>
> Hm, so timescale creates a plan that we would not?

No, after poking around in the query, it's actually just that the
aggregate is for implementing DISTINCT, which means there is no
aggregate state.

> > I'm wondering if some way to decorrelate the hashtables would help.
> > For example a hashtable specific (pseudo)random salt.
>
> We do try to add a hash-IV that's different for each worker:
>
>         /*
>          * If parallelism is in use, even if the leader backend is performing the
>          * scan itself, we don't want to create the hashtable exactly the same
>          * way in all workers. As hashtables are iterated over in keyspace-order,
>          * doing so in all processes in the same way is likely to lead to
>          * "unbalanced" hashtables when the table size initially is
>          * underestimated.
>          */
>         if (use_variable_hash_iv)
>                 hash_iv = murmurhash32(ParallelWorkerNumber);
>
>
> I don't remember enough of how the parallel aggregate stuff works. Perhaps the
> issue is that the leader is also building a hashtable and it's being inserted
> into the post-gather hashtable, using the same IV?
>
> In which case parallel_leader_participation=off should make a difference.

After turning leader participation off, the problem no longer
reproduced even after 10 iterations; with it turned back on, it
reproduced on the 4th iteration. Is there any reason why the hash
table couldn't have an unconditional IV that includes the plan node?

Regards,
Ants Aasma
