On Sun, Jul 7, 2024 at 10:45 AM Paul George <[email protected]> wrote:
> Thanks for reviving this patch and for all of your work on it! Eager
> aggregation pushdown will be beneficial for my work and I'm hoping to see it
> land.
Thanks for looking at this patch!
> The output of both the original query and this one match (and the plans with
> eager aggregation and the subquery are nearly identical if you restore the
> LEFT JOIN to a JOIN). I admittedly may be missing a subtlety, but does this
> mean that there are conditions under which eager aggregation can be pushed
> down to the nullable side?
I think it's a very risky thing to push a partial aggregation down to
the nullable side of an outer join, because the NULL-extended rows
produced by the outer join would not be available when we perform the
partial aggregation, while with a non-eager-aggregation plan these
rows are available for the top-level aggregation. This may put the
rows into groups in a different way than expected, or get wrong values
from the aggregate functions. I've managed to compose an example:
create table t (a int, b int);
insert into t select 1, 1;
select t2.a, count(*) from t t1 left join t t2 on t2.b > 1 group by
t2.a having t2.a is null;
a | count
---+-------
| 1
(1 row)
This is the expected result, because after the outer join we have got
a NULL-extended row.
But if we somehow push down the partial aggregation to the nullable
side of this outer join, we would get a wrong result.
explain (costs off)
select t2.a, count(*) from t t1 left join t t2 on t2.b > 1 group by
t2.a having t2.a is null;
QUERY PLAN
-------------------------------------------
Finalize HashAggregate
Group Key: t2.a
-> Nested Loop Left Join
Filter: (t2.a IS NULL)
-> Seq Scan on t t1
-> Materialize
-> Partial HashAggregate
Group Key: t2.a
-> Seq Scan on t t2
Filter: (b > 1)
(10 rows)
select t2.a, count(*) from t t1 left join t t2 on t2.b > 1 group by
t2.a having t2.a is null;
a | count
---+-------
| 0
(1 row)
I believe there are cases where pushing a partial aggregation down to
the nullable side of an outer join can be safe, but I doubt that there
is an easy way to identify these cases and do the push-down for them.
So for now I think we'd better refrain from doing that.
Thanks
Richard