[ https://issues.apache.org/jira/browse/CALCITE-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17273409#comment-17273409 ]
Ruben Q L commented on CALCITE-3221:
------------------------------------

[~julianhyde] in terms of complexity, in the case of UNION ALL we are talking about O(m*n), where "m" is the number of inputs and "n" the input size. For UNION we need to add the duplicate verification check, which normally should not be very expensive because we exploit the fact that inputs are sorted by a certain key and we use a HashSet that (thanks to [~vladimirsitnikov]'s suggestion) keeps only the elements having the current sort key (as soon as we see a new key, the set is cleared).

[~amaliujia] is correct. EnumerableMergeUnionRule defines that:
- when the sort + union has {{LIMIT X}}, then {{LIMIT X}} is pushed down to the inputs
- when the sort + union has {{LIMIT X OFFSET Y}}, then {{LIMIT X+Y}} is pushed down to the inputs

> Add MergeUnion operator in Enumerable convention
> ------------------------------------------------
>
>                 Key: CALCITE-3221
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3221
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>   Affects Versions: 1.19.0
>            Reporter: Stamatis Zampetakis
>            Assignee: Ruben Q L
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: screenshot-1.png
>
>          Time Spent: 12h
>  Remaining Estimate: 0h
>
> Currently, the union operation offered by Calcite (see
> [EnumerableDefaults.union|https://github.com/apache/calcite/blob/d98856bf1a5f5c151d004b769e14bdd368a67234/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L2747])
> "breaks" the collation (if any) of its inputs.
> The goal of this issue is to create a new union algorithm in Enumerable
> convention (EnumerableMergeUnion) that, given the fact that its inputs are
> sorted by the same collation, will return the union / union all result
> respecting this collation.
> Most likely the implementation of the merge join can be useful.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
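
For readers following the comment above, here is a minimal sketch of the two ideas it describes: a merge over sorted inputs whose duplicate check uses a HashSet that is cleared whenever the sort key changes, and the LIMIT/OFFSET arithmetic for the push-down. This is not the code of EnumerableMergeUnion or EnumerableMergeUnionRule; the class name MergeUnionSketch, the methods mergeUnion and pushedDownFetch, and the rows-as-lists representation are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not the code of Calcite's EnumerableMergeUnion.
public class MergeUnionSketch {

  // Merges inputs that are each already sorted by sortKey.
  // all = true behaves like UNION ALL, all = false like UNION.
  public static <T> List<T> mergeUnion(
      List<List<T>> inputs, Comparator<? super T> sortKey, boolean all) {
    int[] pos = new int[inputs.size()];   // next unread index per input
    List<T> out = new ArrayList<>();
    Set<T> seenForKey = new HashSet<>();  // rows already emitted for the current sort key
    T keyRow = null;                      // a row that carries the current sort key
    boolean hasKey = false;
    while (true) {
      // Scan the m inputs for the smallest next row (ties go to the earlier input).
      int best = -1;
      for (int i = 0; i < inputs.size(); i++) {
        if (pos[i] >= inputs.get(i).size()) {
          continue; // this input is exhausted
        }
        if (best < 0
            || sortKey.compare(inputs.get(i).get(pos[i]),
                               inputs.get(best).get(pos[best])) < 0) {
          best = i;
        }
      }
      if (best < 0) {
        return out; // every input is exhausted
      }
      T row = inputs.get(best).get(pos[best]++);
      if (all) {
        out.add(row); // UNION ALL: no duplicate check
        continue;
      }
      if (!hasKey || sortKey.compare(keyRow, row) != 0) {
        seenForKey.clear(); // new sort key: rows of earlier keys cannot reappear
        keyRow = row;
        hasKey = true;
      }
      if (seenForKey.add(row)) {
        out.add(row); // first occurrence of this row within the current key
      }
    }
  }

  // Fetch to push down to each input for LIMIT fetch OFFSET offset (offset = 0 if absent).
  public static int pushedDownFetch(int fetch, int offset) {
    return fetch + offset;
  }

  public static void main(String[] args) {
    // Rows are (key, value) pairs; both inputs are sorted by the first column.
    Comparator<List<Integer>> byKey = Comparator.comparing(r -> r.get(0));
    List<List<Integer>> a = List.of(List.of(1, 10), List.of(2, 20), List.of(2, 21));
    List<List<Integer>> b = List.of(List.of(1, 10), List.of(2, 20), List.of(3, 30));
    System.out.println(mergeUnion(List.of(a, b), byKey, false));
    // prints [[1, 10], [2, 20], [2, 21], [3, 30]]
  }
}
```

Because the set only ever holds rows sharing the current sort key, its size is bounded by the largest group of equal-key rows rather than by the whole result. The push-down helper reflects that each input must supply at least X+Y rows so that the merged stream can still skip Y and return X.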