On Thu, Dec  7, 2023 at 09:56:08AM -0500, Robert Haas wrote:
> On Wed, Dec 6, 2023 at 7:11 PM fujii.y...@df.mitsubishielectric.co.jp
> <fujii.y...@df.mitsubishielectric.co.jp> wrote:
> > I would be grateful if we can resolve this issue gradually. I would also 
> > like to continue the discussion if possible in the future.
> 
> I think that would be good. Thanks for your work on this. It is a hard 
> problem.

Agreed.  First, Robert is right that this feature is long overdue.  It
might not help many of our existing workloads, but it opens us up to
handling new, larger workloads.

Second, the patch already has a mechanism to check the remote server
version to see if it is the same or newer.   Here is the version check
documentation patch:

        check_partial_aggregate_support (boolean)
        
        If this option is false, <filename>postgres_fdw</filename> always
        uses partial aggregate pushdown by assuming that each built-in
        aggregate function has a partial aggregate function defined on
        the remote server.  If this option is true, local aggregates
        whose partial computation function references itself are assumed
        to exist on the remote server.  If not, during query planning,
        <filename>postgres_fdw</filename> will connect to the remote
        server and retrieve the remote server version.  If the remote
        version is the same or newer, partial aggregate functions will be
        assumed to exist.  If older, <filename>postgres_fdw</filename>
        checks that the remote server has a matching partial aggregate
        function before performing partial aggregate pushdown.  The default
        is <literal>false</literal>.

There is also an extension list that specifies which extension-owned
functions can be pushed down;  from the doc patch:

        To reduce the risk of misexecution of queries, WHERE clauses and
        aggregate expressions are not sent to the remote server unless they
        only use data types, operators, and functions that are built-in
        or belong to an extension that is listed in the foreign server's
        <literal>extensions</literal> option.

Third, we already have a way of creating records for tables:

        SELECT pg_language FROM pg_language;
                        pg_language
        -------------------------------------------
         (12,internal,10,f,f,0,0,2246,)
         (13,c,10,f,f,0,0,2247,)
         (14,sql,10,f,t,0,0,2248,)
         (13576,plpgsql,10,t,t,13573,13574,13575,)

And we do have record input functionality:

        CREATE TABLE test (x int, language pg_language);
        
        INSERT INTO test SELECT 0, pg_language FROM pg_language;
        
        SELECT * FROM test;
         x |                 language
        ---+-------------------------------------------
         0 | (12,internal,10,f,f,0,0,2246,)
         0 | (13,c,10,f,f,0,0,2247,)
         0 | (14,sql,10,f,t,0,0,2248,)
         0 | (13576,plpgsql,10,t,t,13573,13574,13575,)
        (4 rows)

However, functions don't have pre-created records, and internal
functions don't see to have an SQL-defined structure, but as I remember
the internal aggregate functions all take the same internal structure,
so I guess we only need one fixed input and one output that would
output/input such records.  Performance might be an issue, but at this
point let's just implement this and measure the overhead since there are
few/any(?) other viable options.

Fourth, going with #2 where we do the pushdown using an SQL keyword also
allows extensions to automatically work, while requiring partial
aggregate functions for every non-partial aggregate will require work
for extensions, and potentially lead to more version mismatch issues.

Finally, I am now concerned that this will not be able to be in PG 17,
which I was hoping for.

-- 
  Bruce Momjian  <br...@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.


Reply via email to