Typo:
We can just add a security guard saying that it is supported.
Should be 
We can just add a security guard saying that it is NOT supported.

On 2020/07/20 19:57:34, Haisheng Yuan <hy...@apache.org> wrote: 
> I am not sure I got your implication by "pollute". If you mean changes, yes, 
> it requires some changes in rules. Do we need to change enumerables? Not 
> necessary. We can just add a security guard saying that it is supported. Not 
> everyone requires the Enumerable operators to support everything. More 
> importantly, currently there is no logic or rules to translate sub-query 
> directly to SEMI/ANTI joins, let alone translating directly to ANTI_NOTIN. 
> Currently NOT IN is expanded to NOT(IN ...) before entering RelNode land. 
> That means we don't even have the chance to generate the NOT IN anti join. Is 
> that still a concern?
> 
> Even if some day, some contributor extends Calcite's parser and 
> SubqueryRemovalRule to be able to  transform NOT_IN subquery into NOT IN anti 
> join, we still have chance to disable it. Is that still a concern?
> 
> There are many ways to play it safe.
> 
> > Brainstorming: maybe we could consider it as a separate logical operator
> > (with its corresponding enumerable implementation)?
> It doesn't sound cool. It requires much more work. You have to duplicate all 
> the rules, metadata handler that deal with LogicalJoin, and for some rule 
> that matches Join base class, you have to check it is a LogicalJoin or the 
> logical operator for ANTI_NOTIN.
> 
> On 2020/07/20 08:28:42, Ruben Q L <rube...@gmail.com> wrote: 
> > I have some concerns that this new type would "pollute" the existing Join
> > logic, rules and enumerable implementations.
> > 
> > Brainstorming: maybe we could consider it as a separate logical operator
> > (with its corresponding enumerable implementation)?
> > 
> > 
> > Le lun. 20 juil. 2020 à 06:08, Haisheng Yuan <h.y...@alibaba-inc.com> a
> > écrit :
> > 
> > > I agree that NOT IN is toxic, and it is error-prone.
> > > But you can't prevent people writing SQL with not in sub-queries, would
> > > you rather let optimizer generate inefficient plan?
> > >
> > > - Haisheng
> > >
> > > ------------------------------------------------------------------
> > > 发件人:Julian Hyde<jh...@apache.org>
> > > 日 期:2020年07月20日 11:56:35
> > > 收件人:dev@calcite.apache.org<dev@calcite.apache.org>
> > > 主 题:Re: [DISCUSS] New Join Type: ANTI_NOTIN
> > >
> > > Yuck!
> > >
> > > NOT IN is toxic. I'd rather keep it out of the algebra.
> > >
> > > On Sun, Jul 19, 2020 at 8:24 PM Haisheng Yuan <hy...@apache.org> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > Currently, JoinRelType.ANTI only represents NOT_EXISTS subquery (thanks
> > > to Ruben for reminding).
> > > > For some simple boolean context NOT_IN subquery, we can't transform it
> > > to ANTI join. e.g.:
> > > >
> > > > SELECT * FROM foo WHERE a NOT IN (SELECT b FROM bar); -- bar.b is
> > > nullable
> > > >
> > > > Because if there is a null value in the results of subquery, the NOT IN
> > > predicate will return false, the whole query returns empty. And in 
> > > Calcite,
> > > the plan for this kind of query is inefficient.
> > > >
> > > > If we have ANTI_NOTIN to represent this kind of join, we can generate
> > > more efficient plan, as long as the query executor support it.
> > > >
> > > > Thoughts?
> > > >
> > > > Haisheng Yuan
> > > >
> > > >
> > >
> > 
> 

Reply via email to