Rahul Challapalli created DRILL-5249:
----------------------------------------

             Summary: Optimizer should remove the sort from the plan when the order by statement does not impact the output of the query
                 Key: DRILL-5249
                 URL: https://issues.apache.org/jira/browse/DRILL-5249
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 1.10.0
            Reporter: Rahul Challapalli


git.commit.id.abbrev=2af709f

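The plan below should be optimized to get rid of the "Sort" operation: the outer count(*) does not depend on row order, so the order by in the subquery does not impact the output of the query.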
{code}
0: jdbc:drill:zk=10.10.100.190:5181> explain plan for select count(*) from 
. . . . . . . . . . . . . . . . . .> (
. . . . . . . . . . . . . . . . . .>   select * from customer_demographics order by cd_marital_status
. . . . . . . . . . . . . . . . . .> ) d1;
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        StreamAgg(group=[{}], EXPR$0=[$SUM0($0)])
00-03          UnionExchange
01-01            StreamAgg(group=[{}], EXPR$0=[COUNT()])
01-02              Project($f0=[0])
01-03                SingleMergeExchange(sort0=[2 ASC])
02-01                  SelectionVectorRemover
02-02                    Sort(sort0=[$2], dir0=[ASC])
02-03                      Project(cd_demo_sk=[$0], cd_gender=[$1], cd_marital_status=[$2], cd_education_status=[$3], cd_purchase_estimate=[$4], cd_credit_rating=[$5], cd_dep_count=[$6], cd_dep_employed_count=[$7], cd_dep_college_count=[$8])
02-04                        HashToRandomExchange(dist0=[[$2]])
03-01                          UnorderedMuxExchange
04-01                            Project(cd_demo_sk=[$0], cd_gender=[$1], cd_marital_status=[$2], cd_education_status=[$3], cd_purchase_estimate=[$4], cd_credit_rating=[$5], cd_dep_count=[$6], cd_dep_employed_count=[$7], cd_dep_college_count=[$8], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($2)])
04-02                              Project(cd_demo_sk=[CAST($0):INTEGER], cd_gender=[CAST($1):VARCHAR(200) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], cd_marital_status=[CAST($2):VARCHAR(200) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], cd_education_status=[CAST($3):VARCHAR(200) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], cd_purchase_estimate=[CAST($4):INTEGER], cd_credit_rating=[CAST($5):VARCHAR(200) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], cd_dep_count=[CAST($6):INTEGER], cd_dep_employed_count=[CAST($7):INTEGER], cd_dep_college_count=[CAST($8):INTEGER])
04-03                                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/tpcds_sf1/parquet/customer_demographics]], selectionRoot=maprfs:/drill/testdata/tpcds_sf1/parquet/customer_demographics, numFiles=1, usedMetadataFile=false, columns=[`cd_demo_sk`, `cd_gender`, `cd_marital_status`, `cd_education_status`, `cd_purchase_estimate`, `cd_credit_rating`, `cd_dep_count`, `cd_dep_employed_count`, `cd_dep_college_count`]]])
{code}
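
For comparison, a sketch of the same query with the order by dropped from the subquery. Since count(*) does not depend on row order, it should return the same result. That its plan would contain no Sort or SingleMergeExchange is an assumption; it was not run against this build.
{code}
-- Hypothetical rewrite, not part of the original report:
-- the same subquery without the order by. count(*) is unaffected by row order.
explain plan for select count(*) from
(
  select * from customer_demographics
) d1;
{code}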



