[ 
https://issues.apache.org/jira/browse/DRILL-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179759#comment-15179759
 ] 

Jacques Nadeau commented on DRILL-4467:
---------------------------------------

This lack of stability is also producing incorrect plans. For example, the plan 
for this regression test is invalid (though it may still execute correctly, 
because Drill resolves columns by name rather than by ordinal):

https://github.com/mapr/drill-test-framework/blob/master/framework/resources/Functional/hbase/hbase_pushdown/plan/pushdown_p3.e_tsv

{code:title=PlanWithoutStability (wrong)}
    Screen
      Project(EXPR$0=[/(CAST($1):INTEGER, CAST($2):FLOAT)])
        Project(row_key=[$1], ITEM=[ITEM($2, 'age')], ITEM2=[ITEM($0, 'gpa')])
          Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec 
[tableName=student, startRow=750\x00, stopRow=800, filter=FilterList AND (2/2): 
[RowFilter (LESS, 800), RowFilter (GREATER, 750)]], columns=[`row_key`, 
`twocf`.`age`, `threecf`.`gpa`]]])
{code}

Once desiredFields is changed to a LinkedHashSet, the ordinals in the Project 
above the Scan are stable and correct ($0, $1, $2 follow the order of the 
Scan columns):
{code:title=PlanWithStability}
00-00    Screen
00-01      Project(EXPR$0=[/(CAST($1):INTEGER, CAST($2):FLOAT)])
00-02        Project(row_key=[$0], ITEM=[ITEM($1, 'age')], ITEM2=[ITEM($2, 
'gpa')])
00-03          Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec 
[tableName=student, startRow=750\x00, stopRow=800, filter=FilterList AND (2/2): 
[RowFilter (LESS, 800), RowFilter (GREATER, 750)]], columns=[`row_key`, 
`twocf`.`age`, `threecf`.`gpa`]]])
{code}
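
The effect can be reproduced outside Drill. The following is a minimal sketch, not Drill code: the class and its methods are hypothetical, plain strings stand in for column references, and the "trivial" check is a simplified identity test rather than the real ProjectRemoveRule logic.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the ordinal bookkeeping; this is not Drill's ProjectPushInfo.
final class OrdinalStabilitySketch {

    // Adds the referenced fields to the given (initially empty) set, then assigns
    // each field the position at which the set iterates it. The set type decides
    // the result: LinkedHashSet iterates in insertion order, HashSet does not
    // guarantee any particular order.
    static Map<String, Integer> ordinals(Set<String> desiredFields, List<String> referenced) {
        desiredFields.addAll(referenced);
        Map<String, Integer> result = new LinkedHashMap<>();
        int next = 0;
        for (String field : desiredFields) {
            result.put(field, next++);
        }
        return result;
    }

    // Rewrites each projected field into the input ordinal it reads from.
    static List<Integer> rewrite(List<String> projected, Map<String, Integer> ordinals) {
        List<Integer> refs = new ArrayList<>();
        for (String field : projected) {
            refs.add(ordinals.get(field));
        }
        return refs;
    }

    // Simplified identity test: output position i must read input ordinal i.
    static boolean isTrivial(List<Integer> refs) {
        for (int i = 0; i < refs.size(); i++) {
            if (refs.get(i) != i) {
                return false;
            }
        }
        return true;
    }
}
```

With the fields row_key, twocf.age, threecf.gpa and a LinkedHashSet, rewrite yields [0, 1, 2] and isTrivial returns true. With a HashSet the iteration order is unspecified, so the same input may yield a permutation such as the $1, $2, $0 seen in the wrong plan above.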


> Invalid projection created using PrelUtil.getColumns
> ----------------------------------------------------
>
>                 Key: DRILL-4467
>                 URL: https://issues.apache.org/jira/browse/DRILL-4467
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Laurent Goujon
>            Assignee: Jacques Nadeau
>            Priority: Critical
>             Fix For: 1.6.0
>
>
> In {{DrillPushProjIntoScan}}, a new scan and a new projection are created 
> using {{PrelUtil#getColumns(RelDataType, List<RexNode>)}}.
> The returned {{ProjectPushInfo}} instance has several fields; one of them, 
> {{desiredFields}}, is the list of projected fields. There is one instance 
> per {{RexNode}}, but because the instances were initially added to a set, 
> they might not come back in the order in which they were created.
> The issue happens in the following code:
> {code:java}
>       List<RexNode> newProjects = Lists.newArrayList();
>       for (RexNode n : proj.getChildExps()) {
>         newProjects.add(n.accept(columnInfo.getInputRewriter()));
>       }
> {code}
> This code creates a new list of projects from the initial ones by mapping 
> the indices of the old projects to the new projects, but the indices of the 
> new RexNode instances might be out of order (because of the ordering of 
> desiredFields). If the indices are out of order, the check 
> {{ProjectRemoveRule.isTrivial(newProj)}} will fail.
> My guess is that desiredFields ordering should be preserved when instances 
> are added, to satisfy the condition above.


