[ 
https://issues.apache.org/jira/browse/CALCITE-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17273409#comment-17273409
 ] 

Ruben Q L commented on CALCITE-3221:
------------------------------------

[~julianhyde] in terms of complexity, for UNION ALL we are talking about 
O(m*n), where "m" is the number of inputs and "n" the input size. For UNION we 
need to add the duplicate check, which normally should not be very expensive: 
we exploit the fact that the inputs are sorted by a certain key, and we use a 
HashSet that (thanks to [~vladimirsitnikov]'s suggestion) keeps only the 
elements having the current sort key (as soon as we see a new key, the set is 
cleared).
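
To make the idea above concrete, here is a minimal, self-contained sketch of 
the merge with the per-key HashSet. It is not the actual Calcite code: the 
class MergeUnionSketch, the method mergeUnion and its parameters are 
hypothetical names chosen for illustration, and inputs are plain lists 
rather than Enumerables.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

class MergeUnionSketch {
  /**
   * Merges inputs that are each already sorted by the same key.
   * With all == true this behaves like UNION ALL, otherwise like UNION.
   * Hypothetical helper, not part of Calcite.
   */
  static <T, K> List<T> mergeUnion(List<List<T>> inputs, Function<T, K> keyOf,
      Comparator<K> keyOrder, boolean all) {
    int[] pos = new int[inputs.size()];   // read position in each input
    List<T> out = new ArrayList<>();
    Set<T> seenForKey = new HashSet<>();  // rows already emitted for currentKey
    K currentKey = null;
    boolean hasKey = false;
    while (true) {
      // Scan the m input heads for the smallest key: O(m) per emitted row.
      int best = -1;
      for (int i = 0; i < inputs.size(); i++) {
        if (pos[i] >= inputs.get(i).size()) {
          continue;                       // this input is exhausted
        }
        if (best < 0
            || keyOrder.compare(keyOf.apply(inputs.get(i).get(pos[i])),
                keyOf.apply(inputs.get(best).get(pos[best]))) < 0) {
          best = i;
        }
      }
      if (best < 0) {
        break;                            // every input is exhausted
      }
      T row = inputs.get(best).get(pos[best]++);
      if (all) {
        out.add(row);                     // UNION ALL: no duplicate check
        continue;
      }
      K key = keyOf.apply(row);
      if (!hasKey || keyOrder.compare(key, currentKey) != 0) {
        seenForKey.clear();               // new sort key: forget the old rows
        currentKey = key;
        hasKey = true;
      }
      if (seenForKey.add(row)) {
        out.add(row);                     // first time this row is seen
      }
    }
    return out;
  }
}
```

Because equal rows necessarily share the same sort key, the set never has to 
hold more than the rows of one key at a time, which is what keeps the 
duplicate check cheap in this sketch.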

[~amaliujia] is correct. EnumerableMergeUnionRule defines that:
- when the sort + union has {{LIMIT X}}, then {{LIMIT X}} is pushed down to the 
inputs
- when the sort + union has {{LIMIT X OFFSET Y}}, then {{LIMIT X+Y}} is pushed 
down to the inputs
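
A small sketch of why X+Y is needed for the second case, restricted to 
UNION ALL for simplicity: in the worst case all of the Y skipped rows and 
all of the X returned rows come from the same input, so every input has to 
supply up to X+Y rows. The class and method names below are hypothetical and 
the sort call is only a stand-in for the merge step; this is not the code of 
EnumerableMergeUnionRule.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class LimitPushDownSketch {
  /** Rows each input must supply for an outer LIMIT fetch OFFSET offset. */
  static int inputFetch(int fetch, int offset) {
    return fetch + offset;
  }

  /** Sorted UNION ALL with LIMIT/OFFSET, reading only a prefix of each input. */
  static List<Integer> unionAllWithLimit(List<List<Integer>> sortedInputs,
      int fetch, int offset) {
    int perInput = inputFetch(fetch, offset);
    List<Integer> merged = new ArrayList<>();
    for (List<Integer> input : sortedInputs) {
      // Pushed-down limit: keep at most X+Y rows of each sorted input.
      merged.addAll(input.subList(0, Math.min(perInput, input.size())));
    }
    Collections.sort(merged);             // stand-in for the merge step
    // The outer OFFSET and LIMIT are still applied on top of the union.
    int from = Math.min(offset, merged.size());
    int to = Math.min(from + fetch, merged.size());
    return new ArrayList<>(merged.subList(from, to));
  }
}
```

For example, with inputs (1, 2, 3, 4, 5) and (10, 11, 12) and 
{{LIMIT 2 OFFSET 2}}, pushing down only {{LIMIT 2}} would leave (1, 2) and 
(10, 11), so rows 3 and 4 would be lost; with {{LIMIT 4}} on each input they 
are kept.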

> Add MergeUnion operator in Enumerable convention
> ------------------------------------------------
>
>                 Key: CALCITE-3221
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3221
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.19.0
>            Reporter: Stamatis Zampetakis
>            Assignee: Ruben Q L
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: screenshot-1.png
>
>          Time Spent: 12h
>  Remaining Estimate: 0h
>
> Currently, the union operation offered by Calcite (see 
> [EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
>  "breaks" the collation (if any) of its inputs.
> The goal of this issue is to create a new union algorithm in Enumerable 
> convention (EnumerableMergeUnion) that, given the fact that its inputs are 
> sorted by the same collation, will return the union / union all result 
> respecting this collation.
> Most likely the implementation of the merge join can be useful.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
