[ 
https://issues.apache.org/jira/browse/CALCITE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-6338:
-------------------------------
    Summary: RelMdCollation#project can return an incomplete list of collations 
in the presence of aliasing  (was: RelMdCollation#project can return an 
incomplete list of collations)

> RelMdCollation#project can return an incomplete list of collations in the 
> presence of aliasing
> ----------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-6338
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6338
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.36.0
>            Reporter: Ruben Q L
>            Assignee: Ruben Q L
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.37.0
>
>
> {{RelMdCollation#project}} can return an incomplete list of collations.
> Let us say we have a Project that projects the following expressions (notice 
> that $2 will become $1 and $2 after the projection): $0, $2, $2, $3
> The Project's input has collation [2, 3]
> In order to calculate the Project's own collation, {{RelMdCollation#project}} 
> will be called, and a MultiMap targets will be computed because, as in this 
> case, a certain "source field" (e.g. 2) can have multiple project targets 
> (e.g. 1 and 2). However, when the collation is being computed, *only the 
> first target will be considered* (and the rest will be discarded):
> {code}
>   public static @Nullable List<RelCollation> project(RelMetadataQuery mq,
>       RelNode input, List<? extends RexNode> projects) {
>   ...
>       for (RelFieldCollation ifc : ic.getFieldCollations()) {
>         final Collection<Integer> integers = targets.get(ifc.getFieldIndex());
>         if (integers.isEmpty()) {
>           continue loop; // cannot do this collation
>         }
>         fieldCollations.add(ifc.withFieldIndex(integers.iterator().next())); // <-- HERE!!
>       }
> {code}
> Because of this, the Project's collation will be [1, 3], but there is also 
> another valid one ([2, 3]), so the correct (complete) result should be: 
> [1, 3], [2, 3]
> This seems a minor problem, but it can be the root cause of more serious 
> issues. For instance, I currently have a scenario (not easy to reproduce 
> with a unit test) where a certain plan, with a certain combination of rules 
> in a HepPlanner, results in a StackOverflow because SortJoinTransposeRule 
> fires infinitely. The root cause is that, after its first application, the 
> rule does not detect that the Join's left input is already sorted (as a 
> result of that first application). That input contains a "problematic" 
> Project (one that shows the problem described above), which returns only 
> one collation. The second collation (the one being discarded) is the 
> Sort's collation, so returning it would prevent SortJoinTransposeRule from 
> being re-applied over and over.
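
The enumeration that the description above asks for can be sketched without any Calcite dependency. This is only an illustration under simplifying assumptions, not the actual fix: field collations are reduced to bare field indexes (sort direction and null ordering are ignored), every projection is a plain input reference, and the class and method names (ProjectCollationSketch, targets, projectCollations) are hypothetical. Instead of taking only the first target of each source field, the sketch extends every partial collation with every target.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, Calcite-free sketch; not the RelMdCollation implementation.
public class ProjectCollationSketch {

  /** Maps each input field index to every output position that projects it. */
  static Map<Integer, List<Integer>> targets(List<Integer> projectedInputRefs) {
    Map<Integer, List<Integer>> targets = new LinkedHashMap<>();
    for (int out = 0; out < projectedInputRefs.size(); out++) {
      targets.computeIfAbsent(projectedInputRefs.get(out), k -> new ArrayList<>())
          .add(out);
    }
    return targets;
  }

  /** Returns every output collation derivable from one input collation. */
  static List<List<Integer>> projectCollations(List<Integer> inputCollation,
      Map<Integer, List<Integer>> targets) {
    List<List<Integer>> result = new ArrayList<>();
    result.add(new ArrayList<>()); // start from a single empty prefix
    for (int inputField : inputCollation) {
      List<Integer> outs = targets.get(inputField);
      if (outs == null) {
        // Source field is not projected: give up on this collation,
        // mirroring the "continue loop" in the quoted snippet.
        return new ArrayList<>();
      }
      // Extend each partial collation with EVERY target, not just the first.
      List<List<Integer>> extended = new ArrayList<>();
      for (List<Integer> prefix : result) {
        for (int out : outs) {
          List<Integer> next = new ArrayList<>(prefix);
          next.add(out);
          extended.add(next);
        }
      }
      result = extended;
    }
    return result;
  }

  public static void main(String[] args) {
    // Projections $0, $2, $2, $3 over an input with collation [2, 3].
    Map<Integer, List<Integer>> targets = targets(List.of(0, 2, 2, 3));
    System.out.println(projectCollations(List.of(2, 3), targets));
    // prints [[1, 3], [2, 3]]
  }
}
```

The number of results is the product of the target counts of the collation's fields, so a real implementation would probably need to bound that growth when many fields are aliased.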



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
