Hi Mr.Vondra, Mr.Pyhalov.

I'm interesied in Mr.Pyhalov's patch due to the following background.
--Background
I develop postgresql's extension such as fdw in my work. 
I'm interested in using postgresql for OLAP. 
I think the function of a previous patch "Push aggregation down to base 
relations and joins"[1] is desiable. I rebased the previous patch and register 
the rebased patch on the next commitfest[2].
And I think it would be more useful if the previous patch works on a foreign 
table of postgres_fdw.
I realized the function of partial aggregation pushdown is necessary  to make 
the previous patch work on a foreign table of postgres_fdw.
--

So I reviewed Mr.Pyhalov's patch and discussions on this thread.
I made a draft of approach to respond to Mr.Vondra's comments.
Would you check whether my draft is right or not?

--My draft
> 1) It's not clear to me how could this get extended to aggregates with 
> more complex aggregate states, to support e.g. avg() and similar 
> fairly common aggregates.
We add a special aggregate function every aggregate function (hereafter we call 
this src)  which supports partial aggregation.
The followings are differences between the src and the special aggregate 
function.
difference1) result type
The result type is same with the src's transtype if the src's transtype is not 
internal.
Otherwise the result type is bytea.

difference2) final func
The final func does not exist if the src's transtype is not internal.
Otherwize the final func returns serialized value.

For example, let me call the special aggregate function of avg(float8) 
avg_p(float8).
The result value of avg_p is a float8 array which consists of count and 
summation.
avg_p does not have finalfunc.

We pushdown the special aggregate function instead of a src.
For example, we issue "select avg_p(c) from t" instead of "select avg(c) from t"
in the above example.

We add a new column partialaggfn to pg_aggregate to get the oid of  the special 
aggregate function from the the src's oid.
This column is the oid of the special aggregate function which corresponds to 
the src.

If an aggregate function does not have any special aggregate function,  then we 
does not pushdown any partial aggregation of the aggregate function.

> 2) I'm not sure relying on aggpartialpushdownsafe without any version 
> checks etc. is sufficient. I mean, how would we know the remote node 
> has the same idea of representing the aggregate state. I wonder how 
> this aligns with assumptions we do e.g. for functions etc.
We add compatible server versions infomation to pg_aggregate and  the set of 
options of postgres_fdw's foreign server.
We check compatibility of an aggregate function using this infomation.

An additional column of pg_aggregate is compatibleversonrange.
This column is a range of postgresql server versions which  has compatible 
aggregate function.
An additional options of postgres_fdw's foreign server are serverversion and 
bwcompatibleverson.
serverversion is remote postgresql server version.
bwcompatibleverson is the maximum version in which any aggregate function is 
compatible with local noed's one.
Our version check passes if and only if at least one of the following 
conditions is true.
condition1) the option value of serverversion is in compatibleversonrange.
condition2) the local postgresql server version is between bwcompatibleverson 
and the option value of serverversion.

We can get the local postgresql server version from PG_VERSION_NUM macro.
We use condition1 if the local postgresql server version is not more than the 
remote one.
and use condition2 if the local postgresql server version is greater than the 
remote one.
--

Sincerely yours,
Yuuki Fujii

[1] https://commitfest.postgresql.org/32/1247/
[2] https://commitfest.postgresql.org/39/3764/

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

Reply via email to