[jira] [Updated] (DRILL-5357) Partition pruning information not available in query plan for COUNT aggregate query
[ https://issues.apache.org/jira/browse/DRILL-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chun Chang updated DRILL-5357:
------------------------------
    Reviewer: Khurram Faraaz

> Partition pruning information not available in query plan for COUNT aggregate query
> -----------------------------------------------------------------------------------
>
>                 Key: DRILL-5357
>                 URL: https://issues.apache.org/jira/browse/DRILL-5357
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.10.0
>         Environment: 3 node CentOS cluster
>            Reporter: Khurram Faraaz
>            Assignee: Arina Ielchiieva
>             Fix For: 1.12.0
>
>
> We are not seeing partition pruning information in the query plan for the below, COUNT(*) and COUNT() query
> Drill 1.10.0-SNAPSHOT
> git commit id: b657d44f
> parquet table has 6 columns
> total number of rows = 1638640
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> CREATE TABLE tbl_prtn_prune_01 PARTITION BY (col_state)
> AS
> SELECT CAST(columns[0] AS DATE) col_date,
> CAST(columns[1] AS CHAR(3)) col_state,
> CAST(columns[2] AS INTEGER) col_prime,
> CAST(columns[3] AS VARCHAR(256)) col_varstr,
> CAST(columns[4] AS INTEGER) col_id,
> CAST(columns[5] AS VARCHAR(50)) col_name
> from `partition_prune_data.csv`;
> +-----------+----------------------------+
> | Fragment  | Number of records written  |
> +-----------+----------------------------+
> | 0_0       | 1638640                    |
> +-----------+----------------------------+
> 1 row selected (17.675 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select COUNT(*) from tbl_prtn_prune_01 where col_state = 'CA';
> +---------+
> | EXPR$0  |
> +---------+
> | 35653   |
> +---------+
> 1 row selected (0.471 seconds)
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select COUNT(*) from tbl_prtn_prune_01 where col_state = 'CA';
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(EXPR$0=[$0])
> 00-02        Project(EXPR$0=[$0])
> 00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@1d4bb67d[columns = null, isStarQuery = false, isSkipQuery = false]])
> {noformat}
> And then I did a REFRESH TABLE METADATA on the parquet table
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> refresh table metadata tbl_prtn_prune_01;
> +-------+-------------------------------------------------------------+
> | ok    | summary                                                     |
> +-------+-------------------------------------------------------------+
> | true  | Successfully updated metadata for table tbl_prtn_prune_01.  |
> +-------+-------------------------------------------------------------+
> 1 row selected (0.321 seconds)
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select COUNT(col_state) from tbl_prtn_prune_01 where col_state = 'CA';
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(EXPR$0=[$0])
> 00-02        Project(EXPR$0=[$0])
> 00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@2e0f4be9[columns = null, isStarQuery = false, isSkipQuery = false]])
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select COUNT(*) from tbl_prtn_prune_01 where col_state = 'CA';
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(EXPR$0=[$0])
> 00-02        Project(EXPR$0=[$0])
> 00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@3fc1f8e7[columns = null, isStarQuery = false, isSkipQuery = false]])
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select COUNT(col_date) from tbl_prtn_prune_01 where col_state = 'CA';
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(EXPR$0=[$0])
> 00-02        Project(EXPR$0=[$0])
> 00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@7afc851e[columns = null, isStarQuery = false, isSkipQuery = false]])
> {noformat}

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5357) Partition pruning information not available in query plan for COUNT aggregate query
[ https://issues.apache.org/jira/browse/DRILL-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Khurram Faraaz updated DRILL-5357:
----------------------------------
    Priority: Major  (was: Critical)
[jira] [Updated] (DRILL-5357) Partition pruning information not available in query plan for COUNT aggregate query
[ https://issues.apache.org/jira/browse/DRILL-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Khurram Faraaz updated DRILL-5357:
----------------------------------
    Description:
We are not seeing partition pruning information in the query plan for the below, COUNT(*) and COUNT() query
[...]

  was:
We are not seeing partition pruning information in the query plan for the below, COUNT(*) and COUNT() query ?
[...]