[ 
https://issues.apache.org/jira/browse/CALCITE-5559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17697778#comment-17697778
 ] 

Ruben Q L commented on CALCITE-5559:
------------------------------------

Correct [~julianhyde], at this moment RepeatUnion has a control to avoid 
returning duplicates in the "final" result if its flag all=false. The problem 
is that if there are duplicates generated within iteration X, these duplicates 
(even if the are correctly handled by the RepeatUnion) will be fed back as 
input for the iteration X+1, leading to unnecessary extra processing.

To recap the full picture, SQL recursive union functionality in Calcite is 
split in 3 main operators: RepeatUnion, TableSpool and TransientScan:
{noformat}
RepeatUnion(all: true/false)
  TableSpool(table: DELTA)
    ... <seed subplan> ...
  TableSpool(table: DELTA)
    ... <iterative subplan> ...
      TransientScan(table: DELTA)
{noformat}
The RepeatUnion orchestrates the process: calls its left input (seed) once, and 
then its right input (iterative) over and over until no more results are 
produced (or until an optional iterationLimit is reached). The "magic" to make 
the output of iteration X to become the input of iteration X+1 comes from the 
TableSpools+TransientScan who "share" the same table (TableSpool feeds it, then 
TransientScan consumes it in the next iteration). So, even though global 
duplicates are discarded by the RepeatUnion at the top level, the duplicates 
within a certain iteration are not discarded when the TableSpool feeds the 
TransientScan, which may lead to unnecessary work in the following iteration.

In order to optimize that, we have mentioned two approaches:
A) Improve TableSpool (or rather provide a new TableSpool) so that it can deals 
with duplicates in case of RepeatUnion all=false:
{noformat}
RepeatUnion(all: false)
  DistinctTableSpool(table: DELTA)
    ... <seed subplan> ...
  DistinctTableSpool(table: DELTA)
    ... <iterative subplan> ...
      TransientScan(table: DELTA)
{noformat}
I have a PoC with this approach, I could create a PR right away.

B) Inject an Aggregate (distinct) before the TableSpool, with the limitation 
that currently this would not work due to the "enumerator eager evaluation" of 
aggregates, as explained on my last comment (so this would require some 
preliminary work):
{noformat}
RepeatUnion(all: false)
  TableSpool(table: DELTA)
    Aggregate(distinct)
      ... <seed subplan> ...
  TableSpool(table: DELTA)
    Aggregate(distinct)
      ... <iterative subplan> ...
        TransientScan(table: DELTA)
{noformat}

Finally, let's remember that RepeatUnion, TableSpool and TransientScan are 
flagged as "experimental and subject to change without notice", so maybe 
approach B's preliminary refactoring around Aggregate could be an overkill to 
accommodate an optimization on an experimental feature; and A would seem a 
simpler, more localized solution that would not impact other (non-experimental) 
operators.

> Improve RepeatUnion by discarding duplicates at TableSpool level
> ----------------------------------------------------------------
>
>                 Key: CALCITE-5559
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5559
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: Ruben Q L
>            Assignee: Ruben Q L
>            Priority: Major
>
> Currently, RepeatUnion operator with all=false keeps track of the elements 
> that it has returned in order to discard duplicates. However, the TableSpool 
> operators that are right below it do not have such control. In certain 
> scenarios, duplicates are returned by the TableSpool current iteration, 
> discarded by the RepeatUnion, but have been already "fed back" by the 
> TableSpool into the next iteration, causing unnecessary processing.
> We can optimize this scenario by keeping track of the duplicates 
> inside/before the TableSpool too (note: we still need to keep track of 
> duplicates at RepeatUnion level, because that is the only place where we can 
> detect a potential "global duplicate" of an element: returned by the LHS and 
> then also by the RHS, or by two different iterations of the RHS).
> A PoC testing this improvement on a downstream project showed that certain 
> queries can go from ~40s down to ~1s.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to