[jira] [Created] (CALCITE-3890) Infer IS NOT NULL predicate from join
Chunwei Lei created CALCITE-3890:

Summary: Infer IS NOT NULL predicate from join
Key: CALCITE-3890
URL: https://issues.apache.org/jira/browse/CALCITE-3890
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Chunwei Lei

We can infer IS NOT NULL predicates from a join whose condition implies that some columns cannot be null. For instance, given

{code:java}
select * from a join b on a.id = b.id;
{code}

we can infer that a.id is not null and b.id is not null.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
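A minimal sketch of the inference proposed in CALCITE-3890, assuming an inner join whose condition is a conjunction of column equalities. It is plain Java and does not use Calcite; the names `JoinNotNullInference`, `EquiCondition`, and `inferNotNull` are hypothetical and only illustrate the reasoning.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; none of these names are Calcite APIs. */
final class JoinNotNullInference {

  /** One conjunct of an inner-join condition: leftColumn = rightColumn. */
  static final class EquiCondition {
    final String leftColumn;
    final String rightColumn;

    EquiCondition(String leftColumn, String rightColumn) {
      this.leftColumn = leftColumn;
      this.rightColumn = rightColumn;
    }
  }

  /**
   * In SQL, "x = y" evaluates to UNKNOWN when either side is NULL, and an
   * inner join keeps only rows whose condition is TRUE. Every column used
   * in an equality conjunct is therefore non-null in the join output.
   */
  static List<String> inferNotNull(List<EquiCondition> conjuncts) {
    List<String> predicates = new ArrayList<>();
    for (EquiCondition c : conjuncts) {
      predicates.add(c.leftColumn + " IS NOT NULL");
      predicates.add(c.rightColumn + " IS NOT NULL");
    }
    return predicates;
  }
}
```

For the query in the issue, passing the single conjunct `a.id = b.id` yields the two predicates `a.id IS NOT NULL` and `b.id IS NOT NULL`. The sketch deliberately ignores outer joins, where the inference does not hold for the output rows in general.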
[jira] [Created] (CALCITE-3889) Add apply(Mappings.Mapping) to RelTrait and RelTraitSet
Haisheng Yuan created CALCITE-3889:

Summary: Add apply(Mappings.Mapping) to RelTrait and RelTraitSet
Key: CALCITE-3889
URL: https://issues.apache.org/jira/browse/CALCITE-3889
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Haisheng Yuan

The RelTraits Collation and Distribution have key indices. When we pass the trait set down to a child operator or propagate it to a parent operator, we have to remap these keys. It would be nice to have {{apply(Mappings.Mapping)}} on RelTrait and RelTraitSet. RelDistribution already has the method, but we may want it on every RelTrait except Convention.
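A rough sketch of what remapping trait keys means, written in plain Java without Calcite. The class `TraitKeyRemapping`, its `apply` method, and the `int[]` encoding of the mapping (old field index to new field index, `-1` for a field that is not carried over) are assumptions made for illustration; they are not the signature proposed in CALCITE-3889.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical illustration only; this is not Calcite's Mappings API. */
final class TraitKeyRemapping {

  /**
   * Remaps key indices through a mapping where mapping[i] is the new
   * position of field i, or -1 if field i is not present in the target.
   * Returns an empty Optional if any key is dropped, because the trait
   * can then no longer be expressed over the target's fields.
   */
  static Optional<List<Integer>> apply(int[] mapping, List<Integer> keys) {
    List<Integer> remapped = new ArrayList<>(keys.size());
    for (int key : keys) {
      int target = mapping[key];
      if (target < 0) {
        return Optional.empty();
      }
      remapped.add(target);
    }
    return Optional.of(remapped);
  }
}
```

With the mapping `{2, -1, 0}`, keys `[0, 2]` become `[2, 0]`, while keys `[0, 1]` cannot be remapped because field 1 is dropped. Giving up on the whole trait is the simplest policy; a collation could arguably keep the prefix of keys that survives, which this sketch does not attempt.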
Re: Split Join condition with CAST which only widening nullability
Thanks Zoltan. I added a test to RelBuilderTest and I can confirm that '_ IS TRUE' is stripped. And I agree that we cannot safely strip '_ IS NOT FALSE'.

On Tue, Mar 31, 2020 at 5:11 AM Zoltan Haindrich wrote:
>
> > we should consider that 'b IS TRUE' and 'b IS NOT FALSE'
>
> I think we already do that... UnknownAs.FALSE essentially means that the expression is enclosed in an "IS TRUE" - since we run filter/join condition simplification in UAF mode; it's allowed to remove "_ IS TRUE"
>
> At the same time I don't see a way how "X IS NOT FALSE" could be removed because in case X is null; the expression should evaluate to true (this expression translates to UnknownAs.TRUE mode) - which could be lost in case of a rewrite. We may consider adding the UnknownAs mode to the filter/join node; but I think that would just cause trouble; is there some other way which I've not considered?
>
> cheers,
> Zoltan
>
> On 3/30/20 7:19 PM, Julian Hyde wrote:
> > If we're going down the path, we should consider that 'b IS TRUE' and 'b IS NOT FALSE' are somewhat like casts. Removing them from join conditions does not affect the result of the join.
> >
> > And the same applies to filter conditions.
> >
> > I don't know whether removing casts, _ IS TRUE and _ IS NOT FALSE from conditions genuinely makes the world "simpler". But let's try it and see.
> >
> > On Mon, Mar 30, 2020 at 8:06 AM Zoltan Haindrich wrote:
> >>
> >> Hey Shuo!
> >>
> >> Thank you for sharing the testcase! I've seen that you were able to fix it by calling the builder instead of copy - right now I think fixing this through ReduceExpressionsRule might be better - as it could also fix up other cases.
> >> I've tried disabling nullability retainment for filters/join conditions - and it seems to be working; I'll submit it under [1].
> >>
> >> Julian: I recommended to try that to provide a quick check to see if at that point the issue could be fixed - I was confident that by disabling "matchNullability" for "simplifyPreservingType()" will do the right thing and it doesn't add an unnecessary cast - instead it safely removes it; however: it still added the cast...and by doing so it didn't help :)
> >>
> >> [1] https://issues.apache.org/jira/browse/CALCITE-3887
> >>
> >> cheers,
> >> Zoltan
> >>
> >>
> >> On 3/26/20 12:22 PM, Shuo Cheng wrote:
> >>> I think we may solve the problem from two aspects:
> >>> 1. Do not try to preserve type (nullability) of Join/Filter condition expression when simplifying or something like pushing down.
> >>> 2. We can do some work (remove unnecessary CAST) right before creating a Join/Filter, as Julian said, something in RelBuilder could be done.
> >>> I've done some fixes in the above link (remove unnecessary CAST when doing pushDownEqualJoinConditions)
> >>>
> >>> On Thu, Mar 26, 2020 at 7:14 PM Shuo Cheng wrote:
> >>>
> >>> > Sorry for the late reply, I've reproduced the problem here
> >>> > https://github.com/cshuo/calcite/commit/b9a7fb5f536825d3a577bf42a5fc6cc7d4df7929
> >>> > .
> >>> >
> >>> > On Wed, Mar 25, 2020 at 12:38 AM Julian Hyde wrote:
> >>> >
> >>> > > It does seem to be something that RelBuilder could do. (RexSimplify can’t really do it, because it doesn’t know how the expression is being used.)
> >>> > >
> >>> > > It’s also worth discovering why the CAST was added in the first place. It doesn’t seem to be helpful. I think we should strive to eliminate all of the slightly unhelpful things that Calcite does; those things can add up and cause major inefficiencies in the planning process and/or sub-optimal plans.
> >>> > >
> >>> > > Julian
> >>> > >
> >>> > >
> >>> > >> On Mar 24, 2020, at 1:47 AM, Zoltan Haindrich wrote:
> >>> > >>
> >>> > >> Hey,
> >>> > >>
> >>> > >> That's a great diagnosis :)
> >>> > >> I would guess that newCondition became non-nullable for some reason (rexSimplify runs under RexProgramBuilder so it might be able to narrow the nullability)
> >>> > >> you could try invoking simplify.simplifyPreservingType() on it to see if that would help.
> >>> > >>
> >>> > >>> I know it's necessary to preserve the nullability when simplifying a boolean expression in project columns, but as for condition in Filter/Calc, maybe we can omit the nullability?
> >>> > >> I think that could probably work - we can't change the nullability on project columns because those could be referenced (and the reference also has the type) ; but for filter/join conditions we don't need to care about it.
> >>> > >> It seems we already have a "matchnullability" in ReduceExpressionsRule ; for FILTER/JOIN we should probably turn that off... :)
> >>> > >>
> >>> > >> cheers,
> >>> > >> Zoltan
> >>> > >>
> >>> > >>
> >>> > >> On 3/24/20 9:15 AM, Shuo Cheng wrote:
> >>> > >>> Hi Zoltan,
> >>> > >>> I encountered
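The point about '_ IS TRUE' versus '_ IS NOT FALSE' can be checked with a small truth-table sketch. It is plain Java, not Calcite code: `null` stands for SQL UNKNOWN, and the class and method names (`UnknownAsDemo`, `filterKeeps`, and so on) are hypothetical. The only assumption is the one stated in the thread, namely that a filter or join keeps a row only when its condition evaluates to TRUE.

```java
/** Hypothetical illustration of SQL three-valued logic; not Calcite code. */
final class UnknownAsDemo {

  /** SQL "x IS TRUE": true only when x is TRUE; never UNKNOWN. */
  static boolean isTrue(Boolean x) {
    return Boolean.TRUE.equals(x);
  }

  /** SQL "x IS NOT FALSE": true when x is TRUE or UNKNOWN (null). */
  static boolean isNotFalse(Boolean x) {
    return !Boolean.FALSE.equals(x);
  }

  /** A filter or join keeps a row only when the condition is TRUE. */
  static boolean filterKeeps(Boolean condition) {
    return Boolean.TRUE.equals(condition);
  }

  /** True if replacing "x IS TRUE" by "x" never changes which rows are kept. */
  static boolean stripIsTrueIsSafe() {
    for (Boolean x : new Boolean[] {true, false, null}) {
      if (filterKeeps(isTrue(x)) != filterKeeps(x)) {
        return false;
      }
    }
    return true;
  }

  /** True if replacing "x IS NOT FALSE" by "x" never changes which rows are kept. */
  static boolean stripIsNotFalseIsSafe() {
    for (Boolean x : new Boolean[] {true, false, null}) {
      if (filterKeeps(isNotFalse(x)) != filterKeeps(x)) {
        return false;
      }
    }
    return true;
  }
}
```

`stripIsTrueIsSafe()` holds for all three values, while `stripIsNotFalseIsSafe()` fails on the UNKNOWN case: 'x IS NOT FALSE' keeps the row when x is null, but the bare condition x does not.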
Re: Split Join condition with CAST which only widening nullability
> we should consider that 'b IS TRUE' and 'b IS NOT FALSE'

I think we already do that... UnknownAs.FALSE essentially means that the expression is enclosed in an "IS TRUE" - since we run filter/join condition simplification in UAF mode; it's allowed to remove "_ IS TRUE"

At the same time I don't see a way how "X IS NOT FALSE" could be removed because in case X is null; the expression should evaluate to true (this expression translates to UnknownAs.TRUE mode) - which could be lost in case of a rewrite. We may consider adding the UnknownAs mode to the filter/join node; but I think that would just cause trouble; is there some other way which I've not considered?

cheers,
Zoltan

On 3/30/20 7:19 PM, Julian Hyde wrote:
> If we're going down the path, we should consider that 'b IS TRUE' and 'b IS NOT FALSE' are somewhat like casts. Removing them from join conditions does not affect the result of the join.
>
> And the same applies to filter conditions.
>
> I don't know whether removing casts, _ IS TRUE and _ IS NOT FALSE from conditions genuinely makes the world "simpler". But let's try it and see.
>
> On Mon, Mar 30, 2020 at 8:06 AM Zoltan Haindrich wrote:
>>
>> Hey Shuo!
>>
>> Thank you for sharing the testcase! I've seen that you were able to fix it by calling the builder instead of copy - right now I think fixing this through ReduceExpressionsRule might be better - as it could also fix up other cases.
>> I've tried disabling nullability retainment for filters/join conditions - and it seems to be working; I'll submit it under [1].
>>
>> Julian: I recommended to try that to provide a quick check to see if at that point the issue could be fixed - I was confident that by disabling "matchNullability" for "simplifyPreservingType()" will do the right thing and it doesn't add an unnecessary cast - instead it safely removes it; however: it still added the cast...and by doing so it didn't help :)
>>
>> [1] https://issues.apache.org/jira/browse/CALCITE-3887
>>
>> cheers,
>> Zoltan
>>
>>
>> On 3/26/20 12:22 PM, Shuo Cheng wrote:
>>> I think we may solve the problem from two aspects:
>>> 1. Do not try to preserve type (nullability) of Join/Filter condition expression when simplifying or something like pushing down.
>>> 2. We can do some work (remove unnecessary CAST) right before creating a Join/Filter, as Julian said, something in RelBuilder could be done.
>>> I've done some fixes in the above link (remove unnecessary CAST when doing pushDownEqualJoinConditions)
>>>
>>> On Thu, Mar 26, 2020 at 7:14 PM Shuo Cheng wrote:
>>>
>>> > Sorry for the late reply, I've reproduced the problem here
>>> > https://github.com/cshuo/calcite/commit/b9a7fb5f536825d3a577bf42a5fc6cc7d4df7929
>>> > .
>>> >
>>> > On Wed, Mar 25, 2020 at 12:38 AM Julian Hyde wrote:
>>> >
>>> > > It does seem to be something that RelBuilder could do. (RexSimplify can’t really do it, because it doesn’t know how the expression is being used.)
>>> > >
>>> > > It’s also worth discovering why the CAST was added in the first place. It doesn’t seem to be helpful. I think we should strive to eliminate all of the slightly unhelpful things that Calcite does; those things can add up and cause major inefficiencies in the planning process and/or sub-optimal plans.
>>> > >
>>> > > Julian
>>> > >
>>> > >
>>> > >> On Mar 24, 2020, at 1:47 AM, Zoltan Haindrich wrote:
>>> > >>
>>> > >> Hey,
>>> > >>
>>> > >> That's a great diagnosis :)
>>> > >> I would guess that newCondition became non-nullable for some reason (rexSimplify runs under RexProgramBuilder so it might be able to narrow the nullability)
>>> > >> you could try invoking simplify.simplifyPreservingType() on it to see if that would help.
>>> > >>
>>> > >>> I know it's necessary to preserve the nullability when simplifying a boolean expression in project columns, but as for condition in Filter/Calc, maybe we can omit the nullability?
>>> > >> I think that could probably work - we can't change the nullability on project columns because those could be referenced (and the reference also has the type) ; but for filter/join conditions we don't need to care about it.
>>> > >> It seems we already have a "matchnullability" in ReduceExpressionsRule ; for FILTER/JOIN we should probably turn that off... :)
>>> > >>
>>> > >> cheers,
>>> > >> Zoltan
>>> > >>
>>> > >>
>>> > >> On 3/24/20 9:15 AM, Shuo Cheng wrote:
>>> > >>> Hi Zoltan,
>>> > >>> I encountered the problem when running TPC tests, and have not reproduced it in Calcite master. But I figured out how the problem is produced:
>>> > >>> There is a semi join with the condition: AND(EXPANDED_INDF1, EXPANDED_INDF2), type of AND is BOOLEAN with nullable `true`
>>> > >>> After JoinPushExpressionsRule -->> join condition: AND(INDF1, INDF2), type of AND is BOOLEAN with nullable `true`
>>> > >>> After SemiJoinProjectTransposeRule --> Join condition: CAST(AND(INDF1, INDF2)), type of AND is BOOLEAN with nullable `false`
>>> > >>> Just as you suspected, it's in `SemiJoinProjectTransposeRule` where forced type correction is added by `RexProgramBuilder#addCondition`, which will call `RexSimplify#simplifyPreservingType` before registering an expression.
>>> > >>> I know it's necessary to preserve the nullability when simplifying a boolean expression in project columns, but as for condition in Filter/Calc, maybe we can omit