Re: When will the exchange node(Distribution) be added to the execution plan

JiaTao Tao Mon, 03 Feb 2020 04:38:34 -0800

Thank you very much. Now I can see the distribution in RelTrait, but I still
have some questions:
1. In Calcite's main query path (via Prepare#prepareSql) there seems to be
no call to `addRelTraitDef(RelDistributionTraitDef.INSTANCE)`, and no config
option for it either. Does anyone know why?
2. I enabled `useAbstractConvertersForConversion` and registered only the SMJ
rule. When the table has no collation, optimization fails with this error:


Missing conversions are EnumerableTableScan[sort: [] -> [0]] (2 cases)


When the table exposes a collation, it works fine. How can I make Calcite
automatically add sort nodes, like Spark's EnsureRequirements rule does?
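
To illustrate the behaviour I am asking about, here is a minimal sketch in
plain Java. It uses no Calcite or Spark classes: `Node`, `scan`, `mergeJoin`,
`satisfies`, `ensure` and `explain` are hypothetical names made up for this
example, and the sketch only models sort requirements, not distribution. It is
not how either project implements this; it just shows a pass that wraps an
input in a sort when the input's ordering does not satisfy what the parent
requires.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class EnsureRequirementsSketch {

    /** Toy physical operator; a stand-in for a plan node, not a Calcite RelNode. */
    public static final class Node {
        final String name;
        final List<Integer> delivered;       // sort keys the output is ordered by
        final List<Node> inputs;
        final List<List<Integer>> required;  // sort keys required from each input

        Node(String name, List<Integer> delivered,
             List<Node> inputs, List<List<Integer>> required) {
            this.name = name;
            this.delivered = delivered;
            this.inputs = inputs;
            this.required = required;
        }
    }

    public static Node scan(String table, List<Integer> delivered) {
        return new Node("Scan(" + table + ")", delivered,
            Collections.<Node>emptyList(),
            Collections.<List<Integer>>emptyList());
    }

    /** Toy merge join: requires each input sorted on its join key. */
    public static Node mergeJoin(Node left, Node right, int leftKey, int rightKey) {
        List<List<Integer>> required = new ArrayList<>();
        required.add(Collections.singletonList(leftKey));
        required.add(Collections.singletonList(rightKey));
        // Simplification: the join's own output ordering is not tracked.
        return new Node("MergeJoin", Collections.<Integer>emptyList(),
            Arrays.asList(left, right), required);
    }

    /** A delivered ordering satisfies a requirement if the requirement is a prefix of it. */
    static boolean satisfies(List<Integer> delivered, List<Integer> required) {
        return delivered.size() >= required.size()
            && delivered.subList(0, required.size()).equals(required);
    }

    /** Bottom-up pass: wrap any input whose ordering is insufficient in a Sort node. */
    public static Node ensure(Node node) {
        List<Node> fixed = new ArrayList<>();
        for (int i = 0; i < node.inputs.size(); i++) {
            Node input = ensure(node.inputs.get(i));
            List<Integer> need = node.required.get(i);
            if (!satisfies(input.delivered, need)) {
                input = new Node("Sort" + need, need,
                    Collections.singletonList(input),
                    Collections.singletonList(Collections.<Integer>emptyList()));
            }
            fixed.add(input);
        }
        return new Node(node.name, node.delivered, fixed, node.required);
    }

    public static String explain(Node node) {
        if (node.inputs.isEmpty()) {
            return node.name;
        }
        StringBuilder sb = new StringBuilder(node.name).append('(');
        for (int i = 0; i < node.inputs.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(explain(node.inputs.get(i)));
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        Node emps = scan("emps", Collections.<Integer>emptyList()); // no collation
        Node depts = scan("depts", Collections.singletonList(0));   // sorted on field 0
        // prints "MergeJoin(Sort[0](Scan(emps)), Scan(depts))"
        System.out.println(explain(ensure(mergeJoin(emps, depts, 0, 0))));
    }
}
```

In this toy model only the unsorted input gets a sort node, which is the
result I would like the planner to produce when the table exposes no
collation.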

Regards!

Aron Tao


Roman Kondakov <kondako...@mail.ru.invalid> wrote on Sun, Feb 2, 2020 at 7:26 PM:

Hi

If you want the distribution trait to be taken into account by the
optimizer, you need to register it:

VolcanoPlanner planner = ...;
planner.addRelTraitDef(RelDistributionTraitDef.INSTANCE);

See example in [1].

[1]
https://github.com/apache/calcite/blob/a6f544eb48a87f4f71f76ed422584398c0c9baa3/core/src/test/java/org/apache/calcite/test/RelOptRulesTest.java#L6377


-- 
Kind Regards
Roman Kondakov


On 02.02.2020 08:01, JiaTao Tao wrote:
> Hi
> I wonder when the exchange node is added to the execution plan. For
> example, in Spark, if a join is an SMJ (SortMergeJoin), an exchange
> node and a sort node are added to the execution plan:
>
> 3631580619602_.pic.jpg
>
> In Calcite, taking CsvTest#testReadme as an example, I can find a
> sorting trait if the join is an SMJ, but I cannot find an exchange.
>
> The SQL:
>
> SELECT d.name, COUNT(*) cnt
> FROM emps AS e
> JOIN depts AS d ON e.deptno = d.deptno
> GROUP BY d.name;
>
> In the Volcano planner's plan below, see
> `rel#76:EnumerableMergeJoin.ENUMERABLE.[[0], [2]]`: it shows the
> convention and the collation, but no distribution.
>
> appendix
>
> Set#6, type: RecordType(INTEGER DEPTNO, VARCHAR NAME, INTEGER DEPTNO0)
>     rel#51:Subset#6.NONE.[], best=null, importance=0.6561
>         rel#49:LogicalJoin.NONE.[](left=RelSubset#30,right=RelSubset#29,condition==($2, $0),joinType=inner), rowcount=1500.0, cumulative cost={inf}
>         rel#60:LogicalProject.NONE.[](input=RelSubset#32,DEPTNO=$1,NAME=$2,DEPTNO0=$0), rowcount=1500.0, cumulative cost={inf}
>     rel#55:Subset#6.ENUMERABLE.[], best=rel#78, importance=0.7290000000000001
>         rel#70:EnumerableProject.ENUMERABLE.[](input=RelSubset#46,DEPTNO=$1,NAME=$2,DEPTNO0=$0), rowcount=1500.0, cumulative cost={3686.517018598809 rows, 4626.25 cpu, 0.0 io}
>         rel#76:EnumerableMergeJoin.ENUMERABLE.[[0], [2]](left=RelSubset#74,right=RelSubset#75,condition==($2, $0),joinType=inner), rowcount=1500.0, cumulative cost={inf}
>         rel#78:EnumerableHashJoin.ENUMERABLE.[](left=RelSubset#30,right=RelSubset#69,condition==($0, $2),joinType=inner), rowcount=1500.0, cumulative cost={2185.517018598809 rows, 126.25 cpu, 0.0 io}
>
> --
> Regards!
>
> Aron Tao
>
