Re: estimation problems for DISTINCT ON with FDW

2020-07-25 Thread Jeff Janes
On Fri, Jul 3, 2020 at 5:50 PM Tom Lane wrote: > > OK, I'll go ahead and push the patch I proposed yesterday. > Thank you. I tested 12_STABLE with my real queries on the real data set, and the "hard coded" estimate of 200 distinct rows (when use_remote_estimte is turned back on) is enough to

Re: estimation problems for DISTINCT ON with FDW

2020-07-03 Thread Tom Lane
Etsuro Fujita writes: > On Thu, Jul 2, 2020 at 11:46 PM Tom Lane wrote: >> We could perhaps try to make our own estimate of the selectivity of the >> shippable quals and then back into #1 from the value we got for #2 from >> the remote server. > Actually, that is what I suggested: > ... By "the

Re: estimation problems for DISTINCT ON with FDW

2020-07-03 Thread Etsuro Fujita
On Fri, Jul 3, 2020 at 5:19 AM Tom Lane wrote: > Concretely, I now propose the attached, which seems entirely > safe to back-patch. The patch looks good to me. And +1 for back-patching. Best regards, Etsuro Fujita

Re: estimation problems for DISTINCT ON with FDW

2020-07-02 Thread Etsuro Fujita
On Thu, Jul 2, 2020 at 11:46 PM Tom Lane wrote: > Etsuro Fujita writes: > > On Wed, Jul 1, 2020 at 11:40 PM Tom Lane wrote: > >> Short of sending a whole second query to the remote server, it's > >> not clear to me how we could get the full table size (or equivalently > >> the target query's

Re: estimation problems for DISTINCT ON with FDW

2020-07-02 Thread Tom Lane
Concretely, I now propose the attached, which seems entirely safe to back-patch. regards, tom lane diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml index 6587678af2..1e997c218b 100644 --- a/doc/src/sgml/fdwhandler.sgml +++

Re: estimation problems for DISTINCT ON with FDW

2020-07-02 Thread Tom Lane
Etsuro Fujita writes: > On Wed, Jul 1, 2020 at 11:40 PM Tom Lane wrote: >> Short of sending a whole second query to the remote server, it's >> not clear to me how we could get the full table size (or equivalently >> the target query's selectivity for that table). The best we realistically >>

Re: estimation problems for DISTINCT ON with FDW

2020-07-01 Thread Etsuro Fujita
On Wed, Jul 1, 2020 at 11:40 PM Tom Lane wrote: > Etsuro Fujita writes: > > On Wed, Jul 1, 2020 at 7:21 AM Tom Lane wrote: > >> +baserel->tuples = Max(baserel->tuples, baserel->rows); > > > for consistency, this should be > > baserel->tuples = clamp_row_est(baserel->rows / sel); > > where

Re: estimation problems for DISTINCT ON with FDW

2020-07-01 Thread Tom Lane
Etsuro Fujita writes: > On Wed, Jul 1, 2020 at 7:21 AM Tom Lane wrote: >> +baserel->tuples = Max(baserel->tuples, baserel->rows); > for consistency, this should be > baserel->tuples = clamp_row_est(baserel->rows / sel); > where sel is the selectivity of the baserestrictinfo clauses? If

Re: estimation problems for DISTINCT ON with FDW

2020-07-01 Thread Etsuro Fujita
On Wed, Jul 1, 2020 at 7:21 AM Tom Lane wrote: > Attached are a couple of quick-hack patches along each of those lines. > Either one resolves the crazy number-of-groups estimate for Jeff's > example; neither changes any existing regression test results. > > On the whole I'm not sure I like 0001

Re: estimation problems for DISTINCT ON with FDW

2020-06-30 Thread Tom Lane
I wrote: > I poked into that and found that the problem is in estimate_num_groups, > which effectively just disregards any relation that has rel->tuples = 0. > That is the case for a postgres_fdw foreign table if use_remote_estimate > is true, because postgres_fdw never bothers to set any other

Re: estimation problems for DISTINCT ON with FDW

2020-06-30 Thread Tom Lane
Jeff Janes writes: > It doesn't cost out the plan of pushing the DISTINCT ON down to the foreign > side, which is probably the best way to run the query. I guess it makes > sense that FDW machinery in general doesn't want to try to push a > PostgreSQL specific construct. Well, that's an

Re: estimation problems for DISTINCT ON with FDW

2020-06-30 Thread Etsuro Fujita
On Mon, Jun 29, 2020 at 7:02 PM Bharath Rupireddy wrote: > > It doesn't cost out the plan of pushing the DISTINCT ON down to the foreign > > side, which is probably the best way to run the query. I guess it makes > > sense that FDW machinery in general doesn't want to try to push a > >

Re: estimation problems for DISTINCT ON with FDW

2020-06-29 Thread Bharath Rupireddy
> It doesn't cost out the plan of pushing the DISTINCT ON down to the foreign > side, which is probably the best way to run the query. I guess it makes > sense that FDW machinery in general doesn't want to try to push a PostgreSQL > specific construct. I think you are right, the DISTINCT

estimation problems for DISTINCT ON with FDW

2020-06-28 Thread Jeff Janes
If I use the attached sql file to set up the database with loop-back postgres_fdw, and then turn on use_remote_estimate for this query: distinct on (id) id, z from fgn.priority order by id, priority desc,z It issues two queries for the foreign estimate, one with a sort and one without: EXPLAIN