Hi Minseok,
I'm not sure I fully understand your requirements. But I think that
Correlate/EnumerableCorrelate may be useful in your case. It is similar
to nested loop join, but it adds an additional filter for the right side
with the correlation variable (cor0):
EnumerableCorrelate(correlation=[$cor0], joinType=[left],
requiredColumns=[{7}])
EnumerableTableScan(table=[[CATALOG, SALES, EMP]])
EnumerableFilter(condition=[=($cor0.DEPTNO, $0)])
EnumerableTableScan(table=[[CATALOG, SALES, DEPT]])
Thanks,
Roman
On 18.07.2023 01:10, Kim, Min-Seok wrote:
Hello All,
I'm creating a query system using Apache Calcite. My storage method can
only look up and scan by key, and doesn't support full scans without a key.
I've made the storage table using FilterableTable, which requires a
specific key in the filter.
I can do single queries like
*SELECT * FROM emps as e WHERE e.empno = 100*
where '*emps*' is stored with the key 'empno'. But, joins are more
difficult. For example, in
*SELECT * FROM emps as e JOIN depts as d ON e.deptno = d.deptno WHERE
e.empno = 100*
I can use 'e.empno = 100' to scan '*emps*'. Ideally, I'd use the result of
that scan when scanning '*depts*'
But, the current plan is:
EnumerableHashJoin
...
EnumerableInterpreter
BindableTableScan(table=[[SALES, EMPS]], filters=[[=($1, 100)]]) <-
*FilterableTable* is Used, it is feasible.
EnumerableCalc
EnumerableTableScan(table=[[SALES, EMPS]]) <- FilterableTable is
used but the filters in scan are empty so it is infeasible.
Instead, I want to do:
*EnumerableCustomHashJoin* <- custom HashJoin
EnumerableCalc
*EnumerableCustomTableScan*(table=[[SALES, EMPS]]) <- custom Scan
EnumerableInterpreter
BindableTableScan(table=[[SALES, EMPS]],
filters=[[=($1, 100)]])
I need advice on how to make this happen. Any advice or insights you could
provide would be greatly appreciated. Looking forward to hearing from you.
Thanks
Minseok