Khurram Faraaz created DRILL-4106:
-------------------------------------

             Summary: Redundant Project on top of Scan in query plan
                 Key: DRILL-4106
                 URL: https://issues.apache.org/jira/browse/DRILL-4106
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 1.3.0
            Reporter: Khurram Faraaz
            Priority: Minor


Why do we see two Projects after the Scan in the query plan?
The table is auto-partitioned by column c1.
4-node cluster on CentOS, Drill 1.3, git.commit.id=a639c51c

The CTAS statement is:

{code}
CREATE TABLE inNstedDirAutoPrtn PARTITION BY(c1) AS SELECT cast(columns[0] AS 
INT) c1, cast(columns[1] AS BIGINT) c2, cast(columns[2] AS CHAR(2)) c3, 
cast(columns[3] AS VARCHAR(54)) c4, cast(columns[4] AS TIMESTAMP) c5, 
cast(columns[5] AS DATE) c6, cast(columns[6] as BOOLEAN) c7, cast(columns[7] as 
DOUBLE) c8, cast(columns[8] as TIME) c9 FROM `nested_dirs/data/csv/allData.csv`;

{code}

Why do we see two Projects on top of the Scan in the query plan? One of them looks redundant.

{code}
0: jdbc:drill:schema=dfs.tmp> explain plan for select * from inNstedDirAutoPrtn 
where c1 IN (1,2,3,4,-1,0,100,-1710);
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(*=[$0])
00-02        Project(*=[$0])
00-03          Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
[path=/tmp/inNstedDirAutoPrtn/0_0_48.parquet], ReadEntryWithPath 
[path=/tmp/inNstedDirAutoPrtn/0_0_31.parquet], ReadEntryWithPath 
[path=/tmp/inNstedDirAutoPrtn/0_0_50.parquet], ReadEntryWithPath 
[path=/tmp/inNstedDirAutoPrtn/0_0_47.parquet], ReadEntryWithPath 
[path=/tmp/inNstedDirAutoPrtn/0_0_49.parquet], ReadEntryWithPath 
[path=/tmp/inNstedDirAutoPrtn/0_0_46.parquet]], 
selectionRoot=maprfs:/tmp/inNstedDirAutoPrtn, numFiles=6, 
usedMetadataFile=false, columns=[`*`]]])

{code}
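
To make the observation easy to re-check against other queries, here is a minimal sketch that scans the text form of an EXPLAIN plan and reports adjacent, identical Project operators. It is not part of Drill: the helper name `find_duplicate_projects` and the regular expression are assumptions made for illustration. It also assumes each operator begins a line with an `NN-NN` id, and that adjacent lines are parent and child, which holds for a linear plan like the one above but not for plans with branches. The Scan line in the sample is abbreviated.

```python
import re

# Assumption: every operator starts a line with an "NN-NN" id, optionally
# preceded by the "|" that sqlline prints at the start of the result cell.
PLAN_LINE = re.compile(r"^\s*\|?\s*(\d{2}-\d{2})\s+(\S.*)$")


def find_duplicate_projects(plan_text):
    """Return (upper_id, lower_id) pairs of adjacent identical Project operators."""
    ops = []
    for line in plan_text.splitlines():
        m = PLAN_LINE.match(line)
        if m:  # wrapped continuation lines (e.g. "[path=...") are skipped
            ops.append((m.group(1), m.group(2).strip()))

    pairs = []
    for (upper_id, upper_op), (lower_id, lower_op) in zip(ops, ops[1:]):
        if upper_op.startswith("Project(") and upper_op == lower_op:
            pairs.append((upper_id, lower_id))
    return pairs


# Plan text from this report, with the Scan arguments abbreviated.
plan = """\
| 00-00    Screen
00-01      Project(*=[$0])
00-02        Project(*=[$0])
00-03          Scan(groupscan=[ParquetGroupScan [...]])
"""

print(find_duplicate_projects(plan))
```

For the plan in this report the sketch flags operators 00-01 and 00-02, the two `Project(*=[$0])` lines.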



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
