Hi Minseok,

I'm not sure I fully understand your requirements. But I think that Correlate/EnumerableCorrelate may be useful in your case. It is similar to nested loop join, but it adds an additional filter for the right side with the correlation variable (cor0):

  EnumerableCorrelate(correlation=[$cor0], joinType=[left], requiredColumns=[{7}])
    EnumerableTableScan(table=[[CATALOG, SALES, EMP]])
    EnumerableFilter(condition=[=($cor0.DEPTNO, $0)])
      EnumerableTableScan(table=[[CATALOG, SALES, DEPT]])

Thanks,

Roman

On 18.07.2023 01:10, Kim, Min-Seok wrote:
Hello All,

I'm creating a query system using Apache Calcite. My storage method can
only look up and scan by key, and doesn't support full scans without a key.
I've made the storage table using FilterableTable, which requires a
specific key in the filter.

I can do single queries like

*SELECT * FROM emps as e WHERE e.empno = 100*

where '*emps*' is stored with the key 'empno'. But, joins are more
difficult. For example, in

*SELECT * FROM emps as e JOIN depts as d ON e.deptno = d.deptno WHERE
e.empno = 100*

I can use 'e.empno = 100' to scan '*emps*'. Ideally, I'd use the result of
that scan when scanning '*depts*'

But, the current plan is:
EnumerableHashJoin
     ...
     EnumerableInterpreter
         BindableTableScan(table=[[SALES, EMPS]], filters=[[=($1, 100)]]) <-
*FilterableTable* is Used, it is feasible.
     EnumerableCalc
         EnumerableTableScan(table=[[SALES, EMPS]]) <- FilterableTable is
used but the filters in scan are empty so it is infeasible.

Instead, I want to do:

*EnumerableCustomHashJoin* <- custom HashJoin
     EnumerableCalc
         *EnumerableCustomTableScan*(table=[[SALES, EMPS]])  <- custom Scan
                EnumerableInterpreter
                     BindableTableScan(table=[[SALES, EMPS]],
filters=[[=($1, 100)]])

I need advice on how to make this happen. Any advice or insights you could
provide would be greatly appreciated. Looking forward to hearing from you.

Thanks
Minseok

Reply via email to