Rahul Challapalli created DRILL-5249:
----------------------------------------

             Summary: Optimizer should remove the sort from the plan when the order by statement does not impact the output of the query
                 Key: DRILL-5249
                 URL: https://issues.apache.org/jira/browse/DRILL-5249
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 1.10.0
            Reporter: Rahul Challapalli


git.commit.id.abbrev=2af709f

The below should be optimized to get rid of the "sort" operation

{code}
0: jdbc:drill:zk=10.10.100.190:5181> explain plan for select count(*) from
. . . . . . . . . . . . . . . . . .> (
. . . . . . . . . . . . . . . . . .>   select * from customer_demographics order by cd_marital_status
. . . . . . . . . . . . . . . . . .> ) d1;
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        StreamAgg(group=[{}], EXPR$0=[$SUM0($0)])
00-03          UnionExchange
01-01            StreamAgg(group=[{}], EXPR$0=[COUNT()])
01-02              Project($f0=[0])
01-03                SingleMergeExchange(sort0=[2 ASC])
02-01                  SelectionVectorRemover
02-02                    Sort(sort0=[$2], dir0=[ASC])
02-03                      Project(cd_demo_sk=[$0], cd_gender=[$1], cd_marital_status=[$2], cd_education_status=[$3], cd_purchase_estimate=[$4], cd_credit_rating=[$5], cd_dep_count=[$6], cd_dep_employed_count=[$7], cd_dep_college_count=[$8])
02-04                        HashToRandomExchange(dist0=[[$2]])
03-01                          UnorderedMuxExchange
04-01                            Project(cd_demo_sk=[$0], cd_gender=[$1], cd_marital_status=[$2], cd_education_status=[$3], cd_purchase_estimate=[$4], cd_credit_rating=[$5], cd_dep_count=[$6], cd_dep_employed_count=[$7], cd_dep_college_count=[$8], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($2)])
04-02                              Project(cd_demo_sk=[CAST($0):INTEGER], cd_gender=[CAST($1):VARCHAR(200) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], cd_marital_status=[CAST($2):VARCHAR(200) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], cd_education_status=[CAST($3):VARCHAR(200) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], cd_purchase_estimate=[CAST($4):INTEGER], cd_credit_rating=[CAST($5):VARCHAR(200) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], cd_dep_count=[CAST($6):INTEGER], cd_dep_employed_count=[CAST($7):INTEGER], cd_dep_college_count=[CAST($8):INTEGER])
04-03                                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/tpcds_sf1/parquet/customer_demographics]], selectionRoot=maprfs:/drill/testdata/tpcds_sf1/parquet/customer_demographics, numFiles=1, usedMetadataFile=false, columns=[`cd_demo_sk`, `cd_gender`, `cd_marital_status`, `cd_education_status`, `cd_purchase_estimate`, `cd_credit_rating`, `cd_dep_count`, `cd_dep_employed_count`, `cd_dep_college_count`]]])
{code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
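
The reasoning behind the request is that count(*) does not depend on row order, so the ORDER BY inside the subquery cannot change the result. A minimal sketch of that equivalence follows. It is an illustration only: it uses Python's built-in sqlite3 module as a stand-in for Drill, the table has just two of the original columns, and the five rows are invented sample data, not the TPC-DS data set from the report.

```python
import sqlite3

# Assumption: SQLite stands in for Drill here; the rows below are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_demographics ("
    "cd_demo_sk INTEGER, cd_marital_status TEXT)"
)
conn.executemany(
    "INSERT INTO customer_demographics VALUES (?, ?)",
    [(1, "M"), (2, "S"), (3, "D"), (4, "M"), (5, "W")],
)

# The query from the report, with the ORDER BY inside the subquery.
with_sort = (
    "SELECT COUNT(*) FROM ("
    "  SELECT * FROM customer_demographics ORDER BY cd_marital_status"
    ") d1"
)
# The same query with the ORDER BY dropped.
without_sort = (
    "SELECT COUNT(*) FROM ("
    "  SELECT * FROM customer_demographics"
    ") d1"
)

def count_rows(connection, sql):
    """Run a single-value aggregate query and return that value."""
    return connection.execute(sql).fetchone()[0]

count_sorted = count_rows(conn, with_sort)
count_unsorted = count_rows(conn, without_sort)

# Both queries count the same rows, so the sort does no useful work.
print(count_sorted, count_unsorted)
```

Both counts are equal, which is why a plan for the original query would not need the Sort (02-02) or the order-preserving SingleMergeExchange (01-03) to produce the correct answer.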