Hi Danny,

Thanks a lot for taking this on, it is a great start!

I didn't look thoroughly through the PR but I noticed that there are many
renaming/refactoring of public APIs. I am not sure if we should introduce
so many breaking changes without prior notice. A most conservative approach
would be to keep existing classes/methods, mark them as deprecated, and
then remove them in one of the coming releases. I am not sure if that is
the right way to go so let's see what the others have to say.

Best,
Stamatis

On Fri, Apr 12, 2019 at 9:18 AM Yuzhao Chen <yuzhao....@gmail.com> wrote:

> Hi, @Haisheng Yuan, @Julian Hyde, @Stamatis Zampetakis,
> @Walaa Eldin Moustafa
>
> I have did the work for this discussion, and look forward to your
> suggestions.
>
>
> ### Diff
> - Deprecate SemiJoin, EquiJoin, EnumerableSemiJoin, SemiJoinType,
> EnumerableSemiJoinRule, EnumerableThetaJoin
> - Make EnumerableMergeJoin extends Join instead of EquiJoin
> - Add SEMI and ANTI join type to JoinRelType, add method
> returnsJustFirstInput() to decide if the join only outputs left side
> - Correlate use JoinRelType instead of SemiJoinType
> - Rename EnumerableCorrelate to EnumerableNestedLoopJoin and make it
> exptends Join instead of Correlate
> - Rename EnumerableJoin to EnumerableHashJoin
> - EnumerableJoinRule will convert semi-join to EnumerableNestedLoopJoin
> (EnumerableSemiJoin's function is merged into this rule)
> - Add method isNonCorrelateSemiJoin() in Join.java to make sure if this
> join is a semi-join (Comes from SemiJoinRule) or comes from
> decorrelation(SubqueryRemoveRule or RelDecorrelator), the returns value
> true means the join is a semi-join equivalent to SemiJoin before this patch.
> - Cache the JoinInfo in Join and use it to get leftKeys and rightKeys,
> merge the SemiJoin#computeSelfCost to Join#computeSelfCost
> - RelBuilder removes SemiJoinFactory, method #semiJoin now return a
> LogicalJoin with JoinRelType#SEMI
>
> ### Rules tweak
> - JoinAddRedundantSemiJoinRule now create LogicalJoin with
> JoinRelType#SEMI instead of SemiJoin
> - JoinToCorrelateRule remove SEMI instance and change the matchs condition
> to !join.getJoinType().generatesNullsOnLeft() which also allowed ANTI
> compared before this patch.
> - SemiJoinRule match SEMI join specificlly
>
> ### Metadata tweak
> - RelMdAllPredicates, RelMdExpressionLineage: Add full rowType to
> getAllPredicates(Join) cause semi-join only outputs one side
> - RelMdColumnUniqueness, RelMdSelectivity, RelMdDistinctRowCount,
> RelMdSize, RelMdUniqueKeys: merge semi-join logic to join
>
>
> ### Test cases change
> - MaterializationTest#testJoinMaterialization11 now can materialize
> successfully, cause i allow logical SemiJoin node to match, the original
> matchs SemiJoin as SemiJoin.class.isAssignableFrom(), which i think is
> wrong cause this will only matches subClasses of SemiJoin which is only
> EnumerableSemiJoin before this patch.
> - SortRemoveRuleTest#removeSortOverEnumerableCorrelate, because
> CALCITE-2018, the final EnumerableSort's cost was cache by the previous
> EnumerableSort with logical childs, so i remove the EnumerableSortRule and
> the best plan is correct
> - sub-query.iq has better plan for null correlate
>
>
>
> Best,
> Danny Chan
> 在 2019年3月21日 +0800 AM3:07,Julian Hyde <jh...@apache.org>,写道:
> > I just discovered that Correlate, which is neither a Join nor a
> SemiJoin, uses SemiJoinType, but SemiJoin does not use SemiJoinType.
> >
> > Yuck. The Join/SemiJoin/Correlate type hierarchy needs some thought.
> >
> > Julian
> >
> >
>

Reply via email to