Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-11-08 Thread Andy Fan
Hi: > cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is > currently underway, this would be an excellent time to update the patch. > Thank you Ian & Andrey for taking care of this! I am planning to start a new thread for this topic in 2 weeks, and will post an update

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-11-03 Thread Ian Lawrence Barwick
2022年7月7日(木) 20:11 Andrey Lepikhov : > > On 17/5/2022 05:00, Andy Fan wrote: > > Thanks. But I will wait to see if anyone will show interest with this. > > Or else > > Moving alone is not a great experience. > To move forward I've rebased your patchset onto new master, removed > annoying tailing

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-05-16 Thread Andy Fan
On Tue, May 17, 2022 at 6:52 AM Thomas Munro wrote: > On Thu, Mar 24, 2022 at 3:22 PM Andy Fan wrote: > > Here is the latest code. I have rebased the code with the latest master > a1bc4d3590b. > > FYI this is failing with an unexpected plan in the partition_join test: > > >

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-05-16 Thread Thomas Munro
On Thu, Mar 24, 2022 at 3:22 PM Andy Fan wrote: > Here is the latest code. I have rebased the code with the latest master > a1bc4d3590b. FYI this is failing with an unexpected plan in the partition_join test:

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-03-21 Thread Andres Freund
Hi, On 2022-03-08 21:44:37 +0800, Andy Fan wrote: > I have finished the PoC for planning timing improvement and joinrel rows > estimation. This currently crashes on cfbot: https://api.cirrus-ci.com/v1/task/6158455839916032/logs/cores.log https://cirrus-ci.com/task/6158455839916032 As this is

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-03-16 Thread Andy Fan
Hi: I just tested more cases for the estimation issue for this feature, and we can find **we get a more accurate/stable estimation than before**. Here is the test cases and result (by comparing the master version and patched version). create table ec_t110 as select i::int as a from

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-03-02 Thread Robert Haas
On Wed, Mar 2, 2022 at 11:09 AM Tom Lane wrote: > Robert Haas writes: > > So the questions in my mind here are all > > about whether we can detect this stuff cheaply and whether anybody > > wants to do the work to make it happen, not whether we'd get a benefit > > in the cases where it kicks in.

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-03-02 Thread Tom Lane
Robert Haas writes: > So the questions in my mind here are all > about whether we can detect this stuff cheaply and whether anybody > wants to do the work to make it happen, not whether we'd get a benefit > in the cases where it kicks in. Right, my worries are mostly about the first point.

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-03-02 Thread Robert Haas
On Tue, Mar 1, 2022 at 9:05 PM Tom Lane wrote: > Robert Haas writes: > > I agree. My question is: why shouldn't every case where we can deduce > > an implied inequality be reasonably likely to show a benefit? > > Maybe it will be, if we can deal with the issue you already mentioned > about not

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-03-01 Thread Andy Fan
> > > I don't think 0001 is right either, although maybe for somewhat > different reasons. First, I think it only considers VAR OP CONST style > clauses, but that is leaving money on the table, because given a.x = > b.x AND mumble(a.x), we can decide to instead test mumble(b.x) if the > equality

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-03-01 Thread Andy Fan
Thanks Tom for joining. > I'm not in favor of complicating the EquivalenceClass > mechanism for this, because (b) what it definitely will do > is make ECs harder to understand and reason about. I'm not willing to show opposition on purpose, and I'm not insist on current strategy, but I

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-03-01 Thread Tom Lane
Robert Haas writes: > I agree. My question is: why shouldn't every case where we can deduce > an implied inequality be reasonably likely to show a benefit? Maybe it will be, if we can deal with the issue you already mentioned about not misestimating the resulting partially-redundant conditions.

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-03-01 Thread Robert Haas
On Tue, Mar 1, 2022 at 5:53 PM Tom Lane wrote: > Robert Haas writes: > > This topic has been discussed a number of times, and Tom has basically > > always said that he thinks this would be expensive to plan (which I > > think is true) and that we wouldn't get much benefit (which I think is > >

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-03-01 Thread Tom Lane
Robert Haas writes: > This topic has been discussed a number of times, and Tom has basically > always said that he thinks this would be expensive to plan (which I > think is true) and that we wouldn't get much benefit (which I think is > false). I think the trick here, as in so many other

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-03-01 Thread Robert Haas
On Mon, Feb 21, 2022 at 2:31 AM Andy Fan wrote: > +1. Just to be more precise, are you also confused about why this > should not be done at all. IIUC, I get 3 reasons from Tom's reply. > a). Planning cost. b). estimation error. c) extra qual execution is bad. This topic has been discussed a

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-02-20 Thread Andy Fan
> > > >> It actually deals with a more general form of this case, because the > >> clauses don't need to reference the same attribute - so for example this > >> would work too, assuming there is extended stats object on the columns > >> on each side: > >> > >> P(A.c = B.d | (A.e < 42) & (B.f <

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-02-20 Thread Andy Fan
Thanks for the detailed explanation. On Sat, Feb 19, 2022 at 2:27 AM Robert Haas wrote: > On Fri, Feb 18, 2022 at 12:56 AM Andy Fan > wrote: > > What do you think about moving on this feature? The items known by me > > are: 1). Make sure the estimation error can be fixed or discuss if my >

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-02-18 Thread Tomas Vondra
On 2/17/22 23:16, Robert Haas wrote: > On Thu, Feb 17, 2022 at 4:17 PM Tomas Vondra > wrote: >> IMHO the whole problem is we're unable to estimate the join clause as a >> conditional probability, i.e. >> >>P(A.x = B.x | (A.x < 42) & (B.x < 42)) >> >> so maybe instead of trying to generate

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-02-18 Thread Robert Haas
On Fri, Feb 18, 2022 at 12:56 AM Andy Fan wrote: > What do you think about moving on this feature? The items known by me > are: 1). Make sure the estimation error can be fixed or discuss if my current > solution is workable. b). Just distribute some selectivity restrictinfo to > RelOptInfo.

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-02-17 Thread Andy Fan
On Fri, Feb 18, 2022 at 4:15 AM Robert Haas wrote: > On Tue, Feb 1, 2022 at 10:08 AM Andy Fan wrote: > > To address the row estimation issue, The most straightforward way to fix > this is to > > ignore the derived clauses when figuring out the RelOptInfo->rows on > base relation. > > To note

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-02-17 Thread Robert Haas
On Thu, Feb 17, 2022 at 4:17 PM Tomas Vondra wrote: > IMHO the whole problem is we're unable to estimate the join clause as a > conditional probability, i.e. > >P(A.x = B.x | (A.x < 42) & (B.x < 42)) > > so maybe instead of trying to generate additional RelOptInfo items we > should think

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-02-17 Thread Tomas Vondra
On 2/17/22 21:15, Robert Haas wrote: > On Tue, Feb 1, 2022 at 10:08 AM Andy Fan wrote: >> To address the row estimation issue, The most straightforward way to fix >> this is to >> ignore the derived clauses when figuring out the RelOptInfo->rows on base >> relation. >> To note which clause is

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-02-17 Thread Robert Haas
On Tue, Feb 1, 2022 at 10:08 AM Andy Fan wrote: > To address the row estimation issue, The most straightforward way to fix this > is to > ignore the derived clauses when figuring out the RelOptInfo->rows on base > relation. > To note which clause is derived from this patch, I added a new field

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-02-17 Thread Finnerty, Jim
So I think knowing what bad it is to have this feature is the key point to discussion now. > While I've only read your description of the patch not the patch itself, This comment applies to me also. Is the join selectivity properly calculated in all cases, e.g. in the n:m join case in

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-02-11 Thread Andy Fan
> >> > So I think knowing what bad it is to have this feature is the key point to >> discussion now. >> >> I re-read the discussion at 2015 [1] and the below topic is added for the above question. Here is the summary for easy discussion. >From planner aspect: > While I've only read your

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-02-05 Thread Andy Fan
Hi, On Sat, Feb 5, 2022 at 9:32 PM Tomas Vondra wrote: > > I'm also not claiming this is 100% worth it - queries with a suitable > combination of clauses (conditions on the join keys) seems rather > uncommon. Thanks for showing interest in this. I want to add some other user cases which seem

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-02-05 Thread Tomas Vondra
Hi, there's been an interesting case [1] of a slow query on pgsql-general, related to the topic discussed in this thread. It causes an order the query to run slower by multiple orders of magnitude, and I think it's interesting, so let me present a simple example demonstrating it.

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-02-01 Thread Andy Fan
Hi Justin: Thanks for your attention. On Wed, Feb 2, 2022 at 1:13 AM Justin Pryzby wrote: > > Subject: [PATCH v1 1/6] Rebaee David's patch against the latest code. > > If you use git-am, then the author/commit information is preserved. > It's probably good to include a link to the patch in

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-02-01 Thread Justin Pryzby
> Subject: [PATCH v1 1/6] Rebaee David's patch against the latest code. If you use git-am, then the author/commit information is preserved. It's probably good to include a link to the patch in any case. > Subject: [PATCH v1 4/6] remove duplicated qual executing.

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2022-02-01 Thread Andy Fan
On Fri, May 14, 2021 at 12:22 PM David Rowley wrote: > On Fri, 14 May 2021 at 11:22, Tom Lane wrote: > > I recall somebody (David Rowley, maybe? Too lazy to check archives.) > > working on this idea awhile ago, but he didn't get to the point of > > a committable patch. > > Yeah. Me. The

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2021-05-19 Thread Andy Fan
On Wed, May 19, 2021 at 8:15 PM David Rowley wrote: > On Mon, 17 May 2021 at 14:52, Andy Fan wrote: > > Would marking the new added RestrictInfo.norm_selec > 1 be OK? > > There would be cases you'd want to not count the additional clauses in > the selectivity estimation and there would be cases

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2021-05-19 Thread David Rowley
On Mon, 17 May 2021 at 14:52, Andy Fan wrote: > Would marking the new added RestrictInfo.norm_selec > 1 be OK? There would be cases you'd want to not count the additional clauses in the selectivity estimation and there would be cases you would want to. For example: SELECT ... FROM t1 INNER

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2021-05-16 Thread Andy Fan
> > > So now you have to figure out > what the selectivity of that is after the application of the partially > redundant IN clauses. > Would marking the new added RestrictInfo.norm_selec > 1 be OK? clause_selectivity_ext /* * If the clause is marked redundant, always return

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2021-05-13 Thread David Rowley
On Fri, 14 May 2021 at 11:22, Tom Lane wrote: > I recall somebody (David Rowley, maybe? Too lazy to check archives.) > working on this idea awhile ago, but he didn't get to the point of > a committable patch. Yeah. Me. The discussion is in [1]. David [1]

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2021-05-13 Thread Tom Lane
Dmitry Astapov writes: > Am I right in thinking that elimination the join condition is actually > quite important part of the process? > Could it possibly be the main reason for =ANY/(x IN (..)) not to be > optimized the same way? Yup. > Is it still hard when one thinks about =ANY or (column in

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2021-05-13 Thread Dmitry Astapov
On Wed, May 12, 2021 at 4:54 PM Tom Lane wrote: > Dmitry Astapov writes: > > I am trying to understand the behaviour of the query planner regarding > the > > push-down of the conditions "through" the join. > > I think your mental model is wrong. What's actually happening here is > that the

Re: Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2021-05-12 Thread Tom Lane
Dmitry Astapov writes: > I am trying to understand the behaviour of the query planner regarding the > push-down of the conditions "through" the join. I think your mental model is wrong. What's actually happening here is that the planner uses equivalence classes to deduce implied conditions.

Condition pushdown: why (=) is pushed down into join, but BETWEEN or >= is not?

2021-05-12 Thread Dmitry Astapov
Hi! I am trying to understand the behaviour of the query planner regarding the push-down of the conditions "through" the join. Lets say that I have tables a(adate date, aval text) and b(bdate date, bval text), and I create a view: create view v as select a.adate, a.aval, b.bval from a join b