On Mon, Nov 21, 2022 at 5:02 PM fujii.y...@df.mitsubishielectric.co.jp <
fujii.y...@df.mitsubishielectric.co.jp> wrote:

> Hi Mr.Vondra, Mr.Pyhalov, Everyone.
>
> I discussed with Mr.Pyhalov about the above draft by directly sending mail
> to
>  him(outside of pgsql-hackers). Mr.Pyhalov allowed me to update his patch
> along with the above draft. So I update Mr.Pyhalov's patch v10.
>
> I wrote my patch for discussion.
> My patch passes regression tests which contains additional basic
> postgres_fdw tests
> for my patch's feature. But my patch doesn't contain sufficient documents
> and tests.
> If reviewers accept my approach, I will add documents and tests to my
> patch.
>
> The following is a my patch's readme.
> # I simplified the above draft.
>
> --readme of my patch
> 1. interface
> 1) pg_aggregate
> There are the following additional columns.
> a) partialaggfn
>   data type    : regproc.
>   default value: zero(means invalid).
>   description  : This field refers to the special aggregate function(then
> we call
>      this partialaggfunc)
>     corresponding to aggregation function(then we call src) which has
> aggfnoid.
>     partialaggfunc is used for partial aggregation pushdown by
> postgres_fdw.
>     The followings are differences between the src and the special
> aggregate function.
>       difference1) result type
>         The result type is same as the src's transtype if the src's
> transtype
>         is not internal.
>         Otherwise the result type is bytea.
>       difference2) final func
>         The final func does not exist if the src's transtype is not
> internal.
>         Otherwize the final func returns serialized value.
>     For example, there is a partialaggfunc avg_p_int4 which corresponds to
> avg(int4)
>     whose aggtranstype is _int4.
>     The result value of avg_p_int4 is a float8 array which consists of
> count and
>     summation. avg_p_int4 does not have finalfunc.
>     For another example, there is a partialaggfunc avg_p_int8 which
> corresponds to
>     avg(int8) whose aggtranstype is internal.
>     The result value of avg_p_int8 is a bytea serialized array which
> consists of count
>     and summation. avg_p_int8 has finalfunc int8_avg_serialize which is
> serialize function
>     of avg(int8). This field is zero if there is no partialaggfunc.
>
> b) partialagg_minversion
>   data type    : int4.
>   default value: zero(means current version).
>   description  : This field is the minimum PostgreSQL server version which
> has
>     partialaggfunc. This field is used for checking compatibility of
> partialaggfunc.
>
> The above fields are valid in tuples for builtin avg, sum, min, max, count.
> There are additional records which correspond to partialaggfunc for avg,
> sum, min, max,
> count.
>
> 2) pg_proc
> There are additional records which correspond to partialaggfunc for avg,
> sum, min, max,
> count.
>
> 3) postgres_fdw
> postgres_fdw has an additional foreign server option server_version.
> server_version is
> integer value which means remote server version number. Default value of
> server_version
> is zero. server_version is used for checking compatibility of
> partialaggfunc.
>
> 2. feature
> postgres_fdw can pushdown partial aggregation of avg, sum, min, max, count.
> Partial aggregation pushdown is fine when the following two conditions are
> both true.
>   condition1) partialaggfn is valid.
>   condition2) server_version is not less than partialagg_minversion
> postgres_fdw executes pushdown the patialaggfunc instead of a src.
> For example, we issue "select avg_p_int4(c) from t" instead of "select
> avg(c) from t"
> in the above example.
>
> postgres_fdw can pushdown every aggregate function which supports partial
> aggregation
> if you add a partialaggfunc corresponding to the aggregate function by
> create aggregate
> command.
>
> 3. difference between my patch and Mr.Pyhalov's v10 patch.
> 1) In my patch postgres_fdw can pushdown partial aggregation of avg
> 2) In my patch postgres_fdw can pushdown every aggregate function which
> supports partial
>   aggregation if you add a partialaggfunc corresponding to the aggregate
> function.
>
> 4. sample commands in psql
> \c postgres
> drop database tmp;
> create database tmp;
> \c tmp
> create extension postgres_fdw;
> create server server_01 foreign data wrapper postgres_fdw options(host
> 'localhost', dbname 'tmp', server_version '160000', async_capable 'true');
> create user mapping for postgres server server_01 options(user 'postgres',
> password 'postgres');
> create server server_02 foreign data wrapper postgres_fdw options(host
> 'localhost', dbname 'tmp', server_version '160000', async_capable 'true');
> create user mapping for postgres server server_02 options(user 'postgres',
> password 'postgres');
>
> create table t(dt timestamp, id int4, name text, total int4, val float4,
> type int4, span interval) partition by list (type);
>
> create table t1(dt timestamp, id int4, name text, total int4, val float4,
> type int4, span interval);
> create table t2(dt timestamp, id int4, name text, total int4, val float4,
> type int4, span interval);
>
> truncate table t1;
> truncate table t2;
> insert into t1 select timestamp'2020-01-01' + cast(t || ' seconds' as
> interval), t % 100, 'hoge' || t, 1, 1.1, 1, cast('1 seconds' as interval)
> from generate_series(1, 100000, 1) t;
> insert into t1 select timestamp'2020-01-01' + cast(t || ' seconds' as
> interval), t % 100, 'hoge' || t, 2, 2.1, 1, cast('2 seconds' as interval)
> from generate_series(1, 100000, 1) t;
> insert into t2 select timestamp'2020-01-01' + cast(t || ' seconds' as
> interval), t % 100, 'hoge' || t, 1, 1.1, 2, cast('1 seconds' as interval)
> from generate_series(1, 100000, 1) t;
> insert into t2 select timestamp'2020-01-01' + cast(t || ' seconds' as
> interval), t % 100, 'hoge' || t, 2, 2.1, 2, cast('2 seconds' as interval)
> from generate_series(1, 100000, 1) t;
>
> create foreign table f_t1 partition of t for values in (1) server
> server_01 options(table_name 't1');
> create foreign table f_t2 partition of t for values in (2) server
> server_02 options(table_name 't2');
>
> set enable_partitionwise_aggregate = on;
> explain (verbose, costs off) select avg(total::int4), avg(total::int8)
> from t;
> select avg(total::int4), avg(total::int8) from t;
>
> Sincerely yours,
> Yuuki Fujii
> --
> Yuuki Fujii
> Information Technology R&D Center Mitsubishi Electric Corporation
>

Hi,
For partial_agg_compatible :

+ * Check that partial aggregate agg has compatibility

If the `agg` refers to func parameter, the parameter name is aggform

+       int32  partialagg_minversion = PG_VERSION_NUM;
+       if (aggform->partialagg_minversion ==
PARTIALAGG_MINVERSION_DEFAULT) {
+               partialagg_minversion = PG_VERSION_NUM;

I am curious why the same variable is assigned the same value twice. It
seems the if block is redundant.

+       if ((fpinfo->server_version >= partialagg_minversion)) {
+               compatible = true;

The above can be simplified as: return fpinfo->server_version >=
partialagg_minversion;

Cheers

Reply via email to