Khurram Faraaz created DRILL-4106:
-------------------------------------

             Summary: Redundant Project on top of Scan in query plan
                 Key: DRILL-4106
                 URL: https://issues.apache.org/jira/browse/DRILL-4106
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.3.0
            Reporter: Khurram Faraaz
            Priority: Minor

Why do we see two Projects on top of the Scan in the query plan? The table is auto-partitioned by column c1.

4-node cluster on CentOS, Drill 1.3, git.commit.id=a639c51c

The CTAS statement is:

{code}
CREATE TABLE inNstedDirAutoPrtn PARTITION BY(c1) AS
SELECT
  cast(columns[0] AS INT) c1,
  cast(columns[1] AS BIGINT) c2,
  cast(columns[2] AS CHAR(2)) c3,
  cast(columns[3] AS VARCHAR(54)) c4,
  cast(columns[4] AS TIMESTAMP) c5,
  cast(columns[5] AS DATE) c6,
  cast(columns[6] AS BOOLEAN) c7,
  cast(columns[7] AS DOUBLE) c8,
  cast(columns[8] AS TIME) c9
FROM `nested_dirs/data/csv/allData.csv`;
{code}

In the plan for the following query, one of the two Projects on top of the Scan looks redundant.

{code}
0: jdbc:drill:schema=dfs.tmp> explain plan for select * from inNstedDirAutoPrtn where c1 IN (1,2,3,4,-1,0,100,-1710);
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(*=[$0])
00-02        Project(*=[$0])
00-03          Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tmp/inNstedDirAutoPrtn/0_0_48.parquet], ReadEntryWithPath [path=/tmp/inNstedDirAutoPrtn/0_0_31.parquet], ReadEntryWithPath [path=/tmp/inNstedDirAutoPrtn/0_0_50.parquet], ReadEntryWithPath [path=/tmp/inNstedDirAutoPrtn/0_0_47.parquet], ReadEntryWithPath [path=/tmp/inNstedDirAutoPrtn/0_0_49.parquet], ReadEntryWithPath [path=/tmp/inNstedDirAutoPrtn/0_0_46.parquet]], selectionRoot=maprfs:/tmp/inNstedDirAutoPrtn, numFiles=6, usedMetadataFile=false, columns=[`*`]]])
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)