Github user HeartSaVioR commented on the issue:
https://github.com/apache/storm/pull/1736
@manuzhang
Thanks for showing interest on this. Let me rearrange your questions:
> Has this changed the workflow of StormSQL?
No, we still rely on Trident, and nothing changed in point of workflow.
> Motive behind this big change and benefits
This is started from [Julian's
comment](https://issues.apache.org/jira/browse/STORM-1040?focusedCommentId=15034472&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15034472)
and also [Milinda's
comment](https://issues.apache.org/jira/browse/STORM-1040?focusedCommentId=15035182&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15035182).
For me having own relational algebras (rel) has several advantages,
- We can push operator handling logic to rel itself. Before that we should
traverse Calcite logical rel tree with PostOrderRelNodeVisitor, and visitor
needs to handle Calcite's rel directly. Now the logic how to configure Trident
topology is all handled from separate rels.
- We sometimes want to have more derived rels compared to Calcite logical
operators. One of example is `Join`. There's only one logical rel regarding
join in Calcite - LogicalJoin - but we're now converting LogicalJoin to
EquiJoin if conditions are met. If we have various types of join it will make
the difference. We're not prepared yet, but streaming scan vs table scan, and
streaming insert vs table insert are the other cases.
```
TridentStormAggregateRel(group=[{0}], EXPR$1=[COUNT()])
TridentStormCalcRel(expr#0..4=[{inputs}], expr#5=[0], expr#6=[>($t0,
$t5)], DEPTID=[$t3], EMPID=[$t0], $condition=[$t6])
TridentStormEquiJoinRel(condition=[=($2, $3)], joinType=[inner])
TridentStormStreamScanRel(table=[[EMP]])
TridentStormStreamScanRel(table=[[DEPT]])
```
We can even override the methods how to represent the rel in explain string
if we think Calcite's explain is less informational. For example, showing
initial parallelism (when we support) for Scan.
- This patch starts addressing query optimizations. One of example is Calc,
which is for merging multiple calculations into one. No need to filter and
projection separately. They're still not minimized (as projection is), but it
can be addressed after STORM-2072. Defining derived rels helps further query
optimizations, like filter pushdown. Calcite rels is not aware of data source
characteristic, and we can include it to our own rels.
Please don't hesitate to ask questions. Thanks!
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---