On Tue, Sep 17, 2019 at 4:41 PM Richard Guo <ri...@pivotal.io> wrote:
> > On Thu, Sep 12, 2019 at 11:35 PM Antonin Houska <a...@cybertec.at> wrote: > >> Richard Guo <ri...@pivotal.io> wrote: >> >> > On Wed, Sep 11, 2019 at 3:25 PM Antonin Houska <a...@cybertec.at> >> > wrote: >> > >> > >> > Nevertheless, I don't know how to overcome the problems that I >> > mentioned >> > upthread. >> > >> > >> > Do you mean the problem "the WHERE clause of the subquery didn't >> > participate in the SEMI JOIN evaluation"? Good news is it has been >> > fixed >> > by commit 043f6ff0 as I mentioned upthread. >> >> Do you say that my old patch (rebased) no longer breaks the regression >> tests? >> > > I think so. > > >> >> (I noticed your other email in the thread which seems to indicate that >> you're >> no lo longer interested to work on the feature, but asking out of >> curiosity.) >> > > Tom pointed out that even if we pull up the subquery with the help of > LATERAL, we cannot make sure we will end up with a better plan, since > LATERAL pretty much constrains things to use a nestloop. Hmm, I think > what he said makes sense. > > Thanks > Richard > > Even if we can't do this for the general case, I still think we can do something for some special cases, for example: select count(*) from j1 where (i) *in* (select i from j2 where* j2.im5 = j1.im5*); can be converted to select count(*) from t1 where (i, im5) in (select i, im5 from j2); The conversion can happen just before the convert_ANY_sublink_to_join. @@ -399,6 +483,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node, /* Is it a convertible ANY or EXISTS clause? */ if (sublink->subLinkType == ANY_SUBLINK) { + reduce_sublink_correlation_exprs(root, sublink); if ((j = convert_ANY_sublink_to_join(root, sublink, available_rels1)) != NULL) However we have to do lots of pre checking for this, the below is something I can think for now. 1). It must be an in-subquery. 2). The op in correlation_expr must be a mergeable op. 3). no aggregation call in subquery->targetList and subquery->havingQual. 4). no limit/offset cause. 5). No volatile function involved for safety. I can't tell how often it is, I just run into this by my own and search the maillist and get only 1 report [1]. Is it something worth doing or do we have a better strategy to handle it? Thanks! [1] https://www.postgresql.org/message-id/3691.1342650...@sss.pgh.pa.us -- Best Regards Andy Fan