[jira] [Updated] (DRILL-5357) Partition pruning information not available in query plan for COUNT aggregate query

2017-09-20 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang updated DRILL-5357:
--
Reviewer: Khurram Faraaz

> Partition pruning information not available in query plan for COUNT aggregate 
> query
> ---
>
>                 Key: DRILL-5357
>                 URL: https://issues.apache.org/jira/browse/DRILL-5357
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.10.0
>         Environment: 3 node CentOS cluster
>            Reporter: Khurram Faraaz
>            Assignee: Arina Ielchiieva
>             Fix For: 1.12.0
>
>
> The query plan shows no partition pruning information for the COUNT(*) and 
> COUNT(column) queries below.
> Drill 1.10.0-SNAPSHOT
> git commit id: b657d44f
> parquet table has 6 columns
> total number of rows = 1638640
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> CREATE TABLE tbl_prtn_prune_01 PARTITION BY 
> (col_state) 
> AS 
> SELECT CAST(columns[0] AS DATE) col_date, 
> CAST(columns[1] AS CHAR(3)) col_state, 
> CAST(columns[2] AS INTEGER) col_prime, 
> CAST(columns[3] AS VARCHAR(256)) col_varstr, 
> CAST(columns[4] AS INTEGER) col_id, 
> CAST(columns[5] AS VARCHAR(50)) col_name 
> from `partition_prune_data.csv`;
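> +-----------+----------------------------+
> | Fragment  | Number of records written  |
> +-----------+----------------------------+
> | 0_0       | 1638640                    |
> +-----------+----------------------------+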
> 1 row selected (17.675 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select COUNT(*) from tbl_prtn_prune_01 where 
> col_state = 'CA';
> +---------+
> | EXPR$0  |
> +---------+
> | 35653   |
> +---------+
> 1 row selected (0.471 seconds)
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select COUNT(*) from 
> tbl_prtn_prune_01 where col_state = 'CA';
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(EXPR$0=[$0])
> 00-02        Project(EXPR$0=[$0])
> 00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@1d4bb67d[columns = null, isStarQuery = false, isSkipQuery = false]])
> {noformat}
> I then ran REFRESH TABLE METADATA on the Parquet table and checked the plans again:
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> refresh table metadata tbl_prtn_prune_01;
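> +-------+-------------------------------------------------------------+
> |  ok   |                           summary                           |
> +-------+-------------------------------------------------------------+
> | true  | Successfully updated metadata for table tbl_prtn_prune_01.  |
> +-------+-------------------------------------------------------------+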
> 1 row selected (0.321 seconds)
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select COUNT(col_state) from 
> tbl_prtn_prune_01 where col_state = 'CA';
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(EXPR$0=[$0])
> 00-02        Project(EXPR$0=[$0])
> 00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@2e0f4be9[columns = null, isStarQuery = false, isSkipQuery = false]])
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select COUNT(*) from 
> tbl_prtn_prune_01 where col_state = 'CA';
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(EXPR$0=[$0])
> 00-02        Project(EXPR$0=[$0])
> 00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@3fc1f8e7[columns = null, isStarQuery = false, isSkipQuery = false]])
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select COUNT(col_date) from 
> tbl_prtn_prune_01 where col_state = 'CA';
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(EXPR$0=[$0])
> 00-02        Project(EXPR$0=[$0])
> 00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@7afc851e[columns = null, isStarQuery = false, isSkipQuery = false]])
> {noformat}
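
A rough way to check plans like the ones quoted here for pruning evidence is to look at what the Scan operator reports. The sketch below is a heuristic in Python, not part of Drill: `describe_scan` is a hypothetical helper, and it assumes that a file-level Parquet scan prints a `numFiles=` attribute in its text plan, whereas the plans in this report show only a `PojoRecordReader` scan with no file details.

```python
import re


def describe_scan(plan_text: str) -> str:
    """Classify the Scan operator in a Drill text plan (heuristic only)."""
    # Console output may wrap the Scan line, so collapse all whitespace first.
    flat = " ".join(plan_text.split())
    if "PojoRecordReader" in flat:
        # Matches the plans in this report: no file or partition details shown.
        return "direct scan (no file-level details in plan)"
    # Assumption: a file-level scan reports how many files it reads.
    match = re.search(r"numFiles\s*=\s*(\d+)", flat)
    if match:
        return f"file scan over {match.group(1)} file(s)"
    return "unknown scan type"


if __name__ == "__main__":
    reported_plan = (
        "00-00    Screen\n"
        "00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo."
        "PojoRecordReader@1d4bb67d[columns\n"
        " = null, isStarQuery = false, isSkipQuery = false]])"
    )
    print(describe_scan(reported_plan))
```

Comparing the reported file count before and after adding the `col_state = 'CA'` filter would then give a hint of whether pruning took effect, provided the plan exposes that count at all.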



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (DRILL-5357) Partition pruning information not available in query plan for COUNT aggregate query

2017-03-15 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-5357:
--
Priority: Major  (was: Critical)

> Partition pruning information not available in query plan for COUNT aggregate 
> query
> ---
>
>                 Key: DRILL-5357
>                 URL: https://issues.apache.org/jira/browse/DRILL-5357
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.10.0
>         Environment: 3 node CentOS cluster
>            Reporter: Khurram Faraaz
>
> The query plan shows no partition pruning information for the COUNT(*) and 
> COUNT(column) queries below.
> Drill 1.10.0-SNAPSHOT
> git commit id: b657d44f
> parquet table has 6 columns
> total number of rows = 1638640
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> CREATE TABLE tbl_prtn_prune_01 PARTITION BY 
> (col_state) 
> AS 
> SELECT CAST(columns[0] AS DATE) col_date, 
> CAST(columns[1] AS CHAR(3)) col_state, 
> CAST(columns[2] AS INTEGER) col_prime, 
> CAST(columns[3] AS VARCHAR(256)) col_varstr, 
> CAST(columns[4] AS INTEGER) col_id, 
> CAST(columns[5] AS VARCHAR(50)) col_name 
> from `partition_prune_data.csv`;
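> +-----------+----------------------------+
> | Fragment  | Number of records written  |
> +-----------+----------------------------+
> | 0_0       | 1638640                    |
> +-----------+----------------------------+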
> 1 row selected (17.675 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select COUNT(*) from tbl_prtn_prune_01 where 
> col_state = 'CA';
> +---------+
> | EXPR$0  |
> +---------+
> | 35653   |
> +---------+
> 1 row selected (0.471 seconds)
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select COUNT(*) from 
> tbl_prtn_prune_01 where col_state = 'CA';
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(EXPR$0=[$0])
> 00-02        Project(EXPR$0=[$0])
> 00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@1d4bb67d[columns = null, isStarQuery = false, isSkipQuery = false]])
> {noformat}
> I then ran REFRESH TABLE METADATA on the Parquet table and checked the plans again:
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> refresh table metadata tbl_prtn_prune_01;
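> +-------+-------------------------------------------------------------+
> |  ok   |                           summary                           |
> +-------+-------------------------------------------------------------+
> | true  | Successfully updated metadata for table tbl_prtn_prune_01.  |
> +-------+-------------------------------------------------------------+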
> 1 row selected (0.321 seconds)
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select COUNT(col_state) from 
> tbl_prtn_prune_01 where col_state = 'CA';
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(EXPR$0=[$0])
> 00-02        Project(EXPR$0=[$0])
> 00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@2e0f4be9[columns = null, isStarQuery = false, isSkipQuery = false]])
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select COUNT(*) from 
> tbl_prtn_prune_01 where col_state = 'CA';
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(EXPR$0=[$0])
> 00-02        Project(EXPR$0=[$0])
> 00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@3fc1f8e7[columns = null, isStarQuery = false, isSkipQuery = false]])
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select COUNT(col_date) from 
> tbl_prtn_prune_01 where col_state = 'CA';
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(EXPR$0=[$0])
> 00-02        Project(EXPR$0=[$0])
> 00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@7afc851e[columns = null, isStarQuery = false, isSkipQuery = false]])
> {noformat}





[jira] [Updated] (DRILL-5357) Partition pruning information not available in query plan for COUNT aggregate query

2017-03-15 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz updated DRILL-5357:
--
Description: 
The query plan shows no partition pruning information for the COUNT(*) and 
COUNT(column) queries below.

Drill 1.10.0-SNAPSHOT
git commit id: b657d44f

parquet table has 6 columns
total number of rows = 1638640

{noformat}
0: jdbc:drill:schema=dfs.tmp> CREATE TABLE tbl_prtn_prune_01 PARTITION BY 
(col_state) 
AS 
SELECT CAST(columns[0] AS DATE) col_date, 
CAST(columns[1] AS CHAR(3)) col_state, 
CAST(columns[2] AS INTEGER) col_prime, 
CAST(columns[3] AS VARCHAR(256)) col_varstr, 
CAST(columns[4] AS INTEGER) col_id, 
CAST(columns[5] AS VARCHAR(50)) col_name 
from `partition_prune_data.csv`;
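+-----------+----------------------------+
| Fragment  | Number of records written  |
+-----------+----------------------------+
| 0_0       | 1638640                    |
+-----------+----------------------------+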
1 row selected (17.675 seconds)

0: jdbc:drill:schema=dfs.tmp> select COUNT(*) from tbl_prtn_prune_01 where 
col_state = 'CA';
+---------+
| EXPR$0  |
+---------+
| 35653   |
+---------+
1 row selected (0.471 seconds)

0: jdbc:drill:schema=dfs.tmp> explain plan for select COUNT(*) from 
tbl_prtn_prune_01 where col_state = 'CA';
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        Project(EXPR$0=[$0])
00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@1d4bb67d[columns = null, isStarQuery = false, isSkipQuery = false]])
{noformat}

I then ran REFRESH TABLE METADATA on the Parquet table and checked the plans again:

{noformat}
0: jdbc:drill:schema=dfs.tmp> refresh table metadata tbl_prtn_prune_01;
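+-------+-------------------------------------------------------------+
|  ok   |                           summary                           |
+-------+-------------------------------------------------------------+
| true  | Successfully updated metadata for table tbl_prtn_prune_01.  |
+-------+-------------------------------------------------------------+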
1 row selected (0.321 seconds)

0: jdbc:drill:schema=dfs.tmp> explain plan for select COUNT(col_state) from 
tbl_prtn_prune_01 where col_state = 'CA';
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        Project(EXPR$0=[$0])
00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@2e0f4be9[columns = null, isStarQuery = false, isSkipQuery = false]])

0: jdbc:drill:schema=dfs.tmp> explain plan for select COUNT(*) from 
tbl_prtn_prune_01 where col_state = 'CA';
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        Project(EXPR$0=[$0])
00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@3fc1f8e7[columns = null, isStarQuery = false, isSkipQuery = false]])

0: jdbc:drill:schema=dfs.tmp> explain plan for select COUNT(col_date) from 
tbl_prtn_prune_01 where col_state = 'CA';
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        Project(EXPR$0=[$0])
00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@7afc851e[columns = null, isStarQuery = false, isSkipQuery = false]])
{noformat}


  was:
We are not seeing partition pruning information in the query plan for the 
below, COUNT(*) and COUNT() query ?

Drill 1.10.0-SNAPSHOT
git commit id: b657d44f

parquet table has 6 columns
total number of rows = 1638640

{noformat}
0: jdbc:drill:schema=dfs.tmp> CREATE TABLE tbl_prtn_prune_01 PARTITION BY 
(col_state) 
AS 
SELECT CAST(columns[0] AS DATE) col_date, 
CAST(columns[1] AS CHAR(3)) col_state, 
CAST(columns[2] AS INTEGER) col_prime, 
CAST(columns[3] AS VARCHAR(256)) col_varstr, 
CAST(columns[4] AS INTEGER) col_id, 
CAST(columns[5] AS VARCHAR(50)) col_name 
from `partition_prune_data.csv`;
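+-----------+----------------------------+
| Fragment  | Number of records written  |
+-----------+----------------------------+
| 0_0       | 1638640                    |
+-----------+----------------------------+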
1 row selected (17.675 seconds)

0: jdbc:drill:schema=dfs.tmp> select COUNT(*) from tbl_prtn_prune_01 where 
col_state = 'CA';
+---------+
| EXPR$0  |
+---------+
| 35653   |
+---------+
1 row selected (0.471 seconds)

0: jdbc:drill:schema=dfs.tmp> explain plan for select COUNT(*) from 
tbl_prtn_prune_01 where col_state = 'CA';
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(EXPR$0=[$0])
00-02        Project(EXPR$0=[$0])
00-03          Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@1d4bb67d[columns = null, isStarQuery = false, isSkipQuery = false]])
{noformat}

And then I did a REFRESH TABLE METADATA on the parquet table

{noformat}
0: jdbc:drill:schema=dfs.tmp> refresh table metadata