[ 
https://issues.apache.org/jira/browse/CALCITE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L resolved CALCITE-5003.
--------------------------------
    Resolution: Fixed

Fixed via 
https://github.com/apache/calcite/commit/2789f5e4c361b052967f42b87447f04cc1ce7896

> MergeUnion on types with different collators produces wrong result
> ------------------------------------------------------------------
>
>                 Key: CALCITE-5003
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5003
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.27.0
>            Reporter: Ruben Q L
>            Assignee: Ruben Q L
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 1.31.0
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> MergeUnion on types with different collators produces a wrong result.
> The problem can be reproduced with the following test (in 
> {{EnumerableStringComparisonTest}}):
> {code}
>   @Test void testMergeUnionOnStringDifferentCollation() {
>     tester()
>         .query("?")
>         .withHook(Hook.PLANNER, (Consumer<RelOptPlanner>) planner ->
>             planner.removeRule(EnumerableRules.ENUMERABLE_UNION_RULE))
>         .withRel(b -> {
>           final RelBuilder builder = b.transform(c -> c.withSimplifyValues(false));
>           return builder
>               .values(builder.getTypeFactory().builder()
>                       .add("name",
>                           builder.getTypeFactory().createSqlType(SqlTypeName.VARCHAR)).build(),
>                   "facilities", "HR", "administration", "Marketing")
>               .values(createRecordVarcharSpecialCollation(builder),
>                   "Marketing", "administration", "presales", "HR")
>               .union(false)
>               .sort(0)
>               .build();
>         })
>         .explainHookMatches("" // It is important that we have MergeUnion in the plan
>             + "EnumerableMergeUnion(all=[false])\n"
>             + "  EnumerableSort(sort0=[$0], dir0=[ASC])\n"
>             + "    EnumerableValues(tuples=[[{ 'facilities' }, { 'HR' }, { 'administration' }, { 'Marketing' }]])\n"
>             + "  EnumerableSort(sort0=[$0], dir0=[ASC])\n"
>             + "    EnumerableValues(tuples=[[{ 'Marketing' }, { 'administration' }, { 'presales' }, { 'HR' }]])\n")
>         .returnsOrdered("name=administration\n"
>             + "name=facilities\n"
>             + "name=HR\n"
>             + "name=Marketing\n"
>             + "name=presales");
>   }
> {code}
> which fails with:
> {noformat}
> java.lang.AssertionError: 
> Expected: "name=administration\nname=facilities\nname=HR\nname=Marketing\nname=presales"
>      but: was "name=administration\nname=HR\nname=Marketing\nname=administration\nname=facilities\nname=Marketing\nname=presales"
> {noformat}
> The problem is that, when the inputs use different collators, the prerequisite 
> of MergeUnion (sorted inputs) is not fulfilled: each input is technically 
> sorted, but not by the same collator, so their orderings are not comparable 
> by the MergeUnion algorithm.
> A possible solution would be to not apply EnumerableMergeUnionRule in this 
> case.
> A cleverer solution would be for the rule to push Sort + Cast + 
> input (and not just Sort + input) whenever the input's key type differs 
> collation-wise from the union's result type.
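
The failure mode described above can be illustrated outside Calcite with a minimal, self-contained sketch. Everything in it is an assumption made for illustration: the class MergeUnionCollationSketch and its methods are hypothetical and are not Calcite code, Comparator.naturalOrder() stands in for the default collation, and String.CASE_INSENSITIVE_ORDER stands in for the special collation. It only shows that a merge-based union needs both inputs sorted by the comparator the merge itself uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration; not the Calcite implementation. */
public class MergeUnionCollationSketch {

  /**
   * Merges two lists that are ASSUMED to be sorted by cmp, dropping
   * duplicates (like UNION without ALL). Duplicates are only detected
   * when they end up adjacent, so a wrongly sorted input breaks both
   * the ordering and the de-duplication.
   */
  static List<String> mergeUnion(List<String> left, List<String> right,
      Comparator<String> cmp) {
    List<String> out = new ArrayList<>();
    int i = 0;
    int j = 0;
    while (i < left.size() || j < right.size()) {
      String next;
      if (j >= right.size()) {
        next = left.get(i++);
      } else if (i >= left.size()) {
        next = right.get(j++);
      } else if (cmp.compare(left.get(i), right.get(j)) <= 0) {
        next = left.get(i++);
      } else {
        next = right.get(j++);
      }
      if (out.isEmpty() || cmp.compare(out.get(out.size() - 1), next) != 0) {
        out.add(next);
      }
    }
    return out;
  }

  /** Returns a sorted copy of the input. */
  static List<String> sorted(List<String> in, Comparator<String> cmp) {
    List<String> copy = new ArrayList<>(in);
    copy.sort(cmp);
    return copy;
  }

  public static void main(String[] args) {
    List<String> a = Arrays.asList("facilities", "HR", "administration", "Marketing");
    List<String> b = Arrays.asList("Marketing", "administration", "presales", "HR");
    // Stand-ins for the two collations (assumption, see lead-in).
    Comparator<String> plain = Comparator.naturalOrder();
    Comparator<String> special = String.CASE_INSENSITIVE_ORDER;

    // Each input is sorted, but by a different comparator: merge goes wrong.
    System.out.println(mergeUnion(sorted(a, plain), sorted(b, special), special));

    // Both inputs re-sorted by the comparator the merge uses: merge is correct.
    System.out.println(mergeUnion(sorted(a, special), sorted(b, special), special));
  }
}
```

With these stand-in comparators, the first call returns administration, HR, Marketing, administration, facilities, Marketing, presales (unordered and with duplicates), while the second returns administration, facilities, HR, Marketing, presales. Re-sorting an input under the merge's comparator is loosely analogous to the Sort + Cast + input idea; skipping the merge entirely is analogous to not applying the rule.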



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
