Re: [CAUTION!! freemail] Re: Partial aggregates pushdown

2024-01-26 Thread vignesh C
On Thu, 7 Dec 2023 at 05:41, fujii.y...@df.mitsubishielectric.co.jp
 wrote:
>
> Hi Mr.Haas.
>
> > -Original Message-
> > From: Robert Haas 
> > Sent: Wednesday, December 6, 2023 10:25 PM
> > On Wed, Dec 6, 2023 at 3:41 AM fujii.y...@df.mitsubishielectric.co.jp
> >  wrote:
> > > Are you concerned about the hassle and potential human errors of
> > > manually adding new partial aggregation functions, rather than the 
> > > catalog becoming bloated?
> >
> > I'm concerned about both.
> Understood. Thank you for your response.
>
> > > The process of creating partial aggregation functions from aggregation
> > > functions can be automated, so I believe this issue can be resolved.
> > > However, automating it may increase the size of the patch even more, so 
> > > overall, approach#2 might be better.
> > > To implement approach #2, it would be necessary to investigate how much 
> > > additional code is required.
> >
> > Yes. Unfortunately I fear that there is quite a lot of work left to do here 
> > in order to produce a committable feature. To me it
> > seems necessary to conduct an investigation of approach #2. If the result 
> > of that investigation is that nothing major
> > stands in the way of approach #2, then I think we should adopt it, which is 
> > more work. In addition, the problems around
> > transmitting serialized bytea blobs between machines that can't be assumed 
> > to fully trust each other will need to be
> > addressed in some way, which seems like it will require a good deal of 
> > design work, forming some kind of consensus, and
> > then implementation work to follow. In addition to that there may be some 
> > small problems that need to be solved at a
> > detail level, such as the HAVING issue. I think the last category won't be 
> > too hard to sort out, but that still leaves two really
> > major areas to address.
> Yes, I agree with you. It is clear that further investigation and discussion 
> are still needed.
> I would be grateful if we can resolve this issue gradually. I would also like 
> to continue the discussion if possible in the future.

Thanks for all the efforts on this patch. I have changed the status of
the commitfest entry to "Returned with Feedback" as there is still
some work to get this patch out. Feel free to continue the discussion
and add a new entry when the patch is in a reviewable shape.

Regards,
Vignesh




Re: [CAUTION!! freemail] Re: Partial aggregates pushdown

2023-12-11 Thread Robert Haas
On Thu, Dec 7, 2023 at 4:12 PM Bruce Momjian  wrote:
> Second, the patch already has a mechanism to check the remote server
> version to see if it is the same or newer.   Here is the version check
> documentation patch:

Right. This feature can certainly be implemented in a
backward-compatible way. I'm not sure that we have as much control
over what does and does not get pushed down as we really want here,
but it's completely possible to do this in a way that doesn't break
other use cases.

> However, functions don't have pre-created records, and internal
> functions don't see to have an SQL-defined structure, but as I remember
> the internal aggregate functions all take the same internal structure,
> so I guess we only need one fixed input and one output that would
> output/input such records.  Performance might be an issue, but at this
> point let's just implement this and measure the overhead since there are
> few/any(?) other viable options.

IMHO records will be the easiest approach, but it will be some work to try it.

> Fourth, going with #2 where we do the pushdown using an SQL keyword also
> allows extensions to automatically work, while requiring partial
> aggregate functions for every non-partial aggregate will require work
> for extensions, and potentially lead to more version mismatch issues.

Yeah.

> Finally, I am now concerned that this will not be able to be in PG 17,
> which I was hoping for.

Getting it ready to ship by March seems very difficult. I'm not saying
it couldn't happen, but I think more likely it won't.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [CAUTION!! freemail] Re: Partial aggregates pushdown

2023-12-07 Thread Bruce Momjian
On Thu, Dec  7, 2023 at 09:56:08AM -0500, Robert Haas wrote:
> On Wed, Dec 6, 2023 at 7:11 PM fujii.y...@df.mitsubishielectric.co.jp
>  wrote:
> > I would be grateful if we can resolve this issue gradually. I would also 
> > like to continue the discussion if possible in the future.
> 
> I think that would be good. Thanks for your work on this. It is a hard 
> problem.

Agreed.  First, Robert is right that this feature is long overdue.  It
might not help many of our existing workloads, but it opens us up to
handling new, larger workloads.

Second, the patch already has a mechanism to check the remote server
version to see if it is the same or newer.   Here is the version check
documentation patch:

check_partial_aggregate_support (boolean)

If this option is false, postgres_fdw always
uses partial aggregate pushdown by assuming that each built-in
aggregate function has a partial aggregate function defined on
the remote server.  If this option is true, local aggregates
whose partial computation function references itself are assumed
to exist on the remote server.  If not, during query planning,
postgres_fdw will connect to the remote
server and retrieve the remote server version.  If the remote
version is the same or newer, partial aggregate functions will be
assumed to exist.  If older, postgres_fdw
checks that the remote server has a matching partial aggregate
function before performing partial aggregate pushdown.  The default
is false.

There is also an extension list that specifies which extension-owned
functions can be pushed down;  from the doc patch:

To reduce the risk of misexecution of queries, WHERE clauses and
aggregate expressions are not sent to the remote server unless they
only use data types, operators, and functions that are built-in
or belong to an extension that is listed in the foreign server's
extensions option.

Third, we already have a way of creating records for tables:

SELECT pg_language FROM pg_language;
pg_language
---
 (12,internal,10,f,f,0,0,2246,)
 (13,c,10,f,f,0,0,2247,)
 (14,sql,10,f,t,0,0,2248,)
 (13576,plpgsql,10,t,t,13573,13574,13575,)

And we do have record input functionality:

CREATE TABLE test (x int, language pg_language);

INSERT INTO test SELECT 0, pg_language FROM pg_language;

SELECT * FROM test;
 x | language
---+---
 0 | (12,internal,10,f,f,0,0,2246,)
 0 | (13,c,10,f,f,0,0,2247,)
 0 | (14,sql,10,f,t,0,0,2248,)
 0 | (13576,plpgsql,10,t,t,13573,13574,13575,)
(4 rows)

However, functions don't have pre-created records, and internal
functions don't see to have an SQL-defined structure, but as I remember
the internal aggregate functions all take the same internal structure,
so I guess we only need one fixed input and one output that would
output/input such records.  Performance might be an issue, but at this
point let's just implement this and measure the overhead since there are
few/any(?) other viable options.

Fourth, going with #2 where we do the pushdown using an SQL keyword also
allows extensions to automatically work, while requiring partial
aggregate functions for every non-partial aggregate will require work
for extensions, and potentially lead to more version mismatch issues.

Finally, I am now concerned that this will not be able to be in PG 17,
which I was hoping for.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: [CAUTION!! freemail] Re: Partial aggregates pushdown

2023-12-07 Thread Robert Haas
On Wed, Dec 6, 2023 at 7:11 PM fujii.y...@df.mitsubishielectric.co.jp
 wrote:
> I would be grateful if we can resolve this issue gradually. I would also like 
> to continue the discussion if possible in the future.

I think that would be good. Thanks for your work on this. It is a hard problem.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




RE: [CAUTION!! freemail] Re: Partial aggregates pushdown

2023-12-06 Thread fujii.y...@df.mitsubishielectric.co.jp
Hi Mr.Haas.

> -Original Message-
> From: Robert Haas 
> Sent: Wednesday, December 6, 2023 10:25 PM
> On Wed, Dec 6, 2023 at 3:41 AM fujii.y...@df.mitsubishielectric.co.jp
>  wrote:
> > Are you concerned about the hassle and potential human errors of
> > manually adding new partial aggregation functions, rather than the catalog 
> > becoming bloated?
> 
> I'm concerned about both.
Understood. Thank you for your response.

> > The process of creating partial aggregation functions from aggregation
> > functions can be automated, so I believe this issue can be resolved.
> > However, automating it may increase the size of the patch even more, so 
> > overall, approach#2 might be better.
> > To implement approach #2, it would be necessary to investigate how much 
> > additional code is required.
> 
> Yes. Unfortunately I fear that there is quite a lot of work left to do here 
> in order to produce a committable feature. To me it
> seems necessary to conduct an investigation of approach #2. If the result of 
> that investigation is that nothing major
> stands in the way of approach #2, then I think we should adopt it, which is 
> more work. In addition, the problems around
> transmitting serialized bytea blobs between machines that can't be assumed to 
> fully trust each other will need to be
> addressed in some way, which seems like it will require a good deal of design 
> work, forming some kind of consensus, and
> then implementation work to follow. In addition to that there may be some 
> small problems that need to be solved at a
> detail level, such as the HAVING issue. I think the last category won't be 
> too hard to sort out, but that still leaves two really
> major areas to address.
Yes, I agree with you. It is clear that further investigation and discussion 
are still needed. 
I would be grateful if we can resolve this issue gradually. I would also like 
to continue the discussion if possible in the future.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R Center Mitsubishi Electric Corporation


Re: [CAUTION!! freemail] Re: Partial aggregates pushdown

2022-11-22 Thread Ted Yu
On Tue, Nov 22, 2022 at 1:11 AM fujii.y...@df.mitsubishielectric.co.jp <
fujii.y...@df.mitsubishielectric.co.jp> wrote:

> Hi Mr.Yu.
>
> Thank you for comments.
>
> > + * Check that partial aggregate agg has compatibility
> >
> > If the `agg` refers to func parameter, the parameter name is aggform
> I fixed the above typo and made the above comment easy to understand
> New comment is "Check that partial aggregate function of aggform exsits in
> remote"
>
> > +   int32  partialagg_minversion = PG_VERSION_NUM;
> > +   if (aggform->partialagg_minversion ==
> > PARTIALAGG_MINVERSION_DEFAULT) {
> > +   partialagg_minversion = PG_VERSION_NUM;
> >
> >
> > I am curious why the same variable is assigned the same value twice. It
> seems
> > the if block is redundant.
> >
> > +   if ((fpinfo->server_version >= partialagg_minversion)) {
> > +   compatible = true;
> >
> >
> > The above can be simplified as: return fpinfo->server_version >=
> > partialagg_minversion;
> I fixed according to your comment.
>
> Sincerely yours,
> Yuuki Fujii
>
>
> Hi,
Thanks for the quick response.