[jira] [Commented] (DRILL-2030) CTAS with SELECT * and expression creates column names with prefix

2015-01-18 Thread Jinfeng Ni (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282190#comment-14282190
 ] 

Jinfeng Ni commented on DRILL-2030:
---

Thanks for finding this problem. Yes, you are right: the top-level Project 
should be below Writer. Originally, I thought both Screen and Writer could be 
root operators: Screen is the root operator for a query, while Writer is the 
root for CTAS. Apparently, we later decided to put Writer under Screen so 
that Screen could report back the result of CTAS, which broke the logic of star 
column handling. 

I'll submit a patch to fix this issue. 
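
For illustration only, here is a minimal sketch of the prefix stripping that the top-level Project is expected to perform on star-expanded columns. The delimiter is taken from the sample output quoted in the issue; the function names are hypothetical and this is not Drill's actual code.

```python
# Assumption: the delimiter is the "¦¦" pair visible in the sample output.
PREFIX_DELIMITER = "\u00a6\u00a6"


def strip_star_prefix(column_name):
    """Return the name without a leading 'T<n>¦¦' style prefix, if present."""
    prefix, sep, rest = column_name.partition(PREFIX_DELIMITER)
    # Strip only when the delimiter was found and a name follows it.
    return rest if sep and rest else column_name


def strip_all(column_names):
    """Apply the stripping to every output column, leaving others untouched."""
    return [strip_star_prefix(name) for name in column_names]


if __name__ == "__main__":
    print(strip_all(["T2\u00a6\u00a6r_regionkey", "T2\u00a6\u00a6r_name", "EXPR$1"]))
```

Expression columns such as EXPR$1 carry no prefix, so they pass through unchanged.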

   

> CTAS with SELECT * and expression creates column names with prefix
> --
>
> Key: DRILL-2030
> URL: https://issues.apache.org/jira/browse/DRILL-2030
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Affects Versions: 0.7.0
> Reporter: Aman Sinha
> Assignee: Jinfeng Ni
>
> Doing a CTAS with the star column and an expression creates columns that 
> contain the table prefix 'T||' .  Looks like the top project did not strip 
> out the prefix. 
> {code}
> 0: jdbc:drill:zk=local> create table region4 as select *, r_regionkey + 1 
> from cp.`tpch/region.parquet`;
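> +------------+---------------------------+
> |  Fragment  | Number of records written |
> +------------+---------------------------+
> | 0_0        | 5                         |
> +------------+---------------------------+
> 0: jdbc:drill:zk=local> select * from region4;
> +-----------------+------------+---------------+------------+
> | T2¦¦r_regionkey | T2¦¦r_name | T2¦¦r_comment |   EXPR$1   |
> +-----------------+------------+---------------+------------+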
> {code}
> The column names are correct if there is a regular column with the star 
> column. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1889) when 'select *' is used along with an order by on length of a column, Drill is adding the computed length to the list of columns

2015-01-18 Thread Jacques Nadeau (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacques Nadeau updated DRILL-1889:
--
Fix Version/s: 0.8.0  (was: 0.9.0)

> when 'select *' is used along with an order by on length of a column, Drill 
> is adding the computed length to the list of columns
> 
>
> Key: DRILL-1889
> URL: https://issues.apache.org/jira/browse/DRILL-1889
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Reporter: Rahul Challapalli
> Assignee: Jinfeng Ni
> Priority: Critical
> Fix For: 0.8.0
>
> Attachments: 
> 0001-DRILL-1889-Fix-star-column-prefix-and-subsume-logic-.patch.3
>
>
> git.commit.id.abbrev=9dfa4a1
> Dataset :
> {code}
> {
>  "col1":1,
>  "col2":"a"
> }
> {
>  "col1":2,
>  "col2":"b"
> }
> {
>  "col1":2,
>  "col2":"abc"
> }
> {code}
> Query :
> {code}
>  select * from `b.json` order by length(col2);
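> +------------+------------+------------+
> |    col1    |    col2    |   EXPR$1   |
> +------------+------------+------------+
> | 1          | a          | 1          |
> | 2          | b          | 1          |
> | 2          | abc        | 3          |
> +------------+------------+------------+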
> {code}
> Drill adds the length column (EXPR$1). Not sure if this is intended behavior, 
> since Postgres does not do this.
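
As a point of comparison, the sketch below runs the same query shape against an in-memory SQLite database. SQLite is used here only as a convenient stand-in (an assumption; it is neither Drill nor Postgres), and the table name `b` mirrors the file name in the report. It shows the result set containing only the table's own columns, with the sort key left out of the projection.

```python
import sqlite3

# In-memory stand-in for the b.json dataset from the report.
conn = sqlite3.connect(":memory:")
conn.execute("create table b (col1 integer, col2 text)")
conn.executemany("insert into b values (?, ?)", [(1, "a"), (2, "b"), (2, "abc")])

# Same query shape as in the report: star projection, order by an expression.
cursor = conn.execute("select * from b order by length(col2)")
column_names = [description[0] for description in cursor.description]
rows = cursor.fetchall()

# The sort expression is used for ordering only and is not projected.
print(column_names)
print(rows)
```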





[jira] [Resolved] (DRILL-1889) when 'select *' is used along with an order by on length of a column, Drill is adding the computed length to the list of columns

2015-01-18 Thread Jacques Nadeau (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacques Nadeau resolved DRILL-1889.
---
Resolution: Fixed

> when 'select *' is used along with an order by on length of a column, Drill 
> is adding the computed length to the list of columns
> 
>
> Key: DRILL-1889
> URL: https://issues.apache.org/jira/browse/DRILL-1889
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Reporter: Rahul Challapalli
> Assignee: Jinfeng Ni
> Priority: Critical
> Fix For: 0.9.0
>
> Attachments: 
> 0001-DRILL-1889-Fix-star-column-prefix-and-subsume-logic-.patch.3
>
>
> git.commit.id.abbrev=9dfa4a1
> Dataset :
> {code}
> {
>  "col1":1,
>  "col2":"a"
> }
> {
>  "col1":2,
>  "col2":"b"
> }
> {
>  "col1":2,
>  "col2":"abc"
> }
> {code}
> Query :
> {code}
>  select * from `b.json` order by length(col2);
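> +------------+------------+------------+
> |    col1    |    col2    |   EXPR$1   |
> +------------+------------+------------+
> | 1          | a          | 1          |
> | 2          | b          | 1          |
> | 2          | abc        | 3          |
> +------------+------------+------------+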
> {code}
> Drill adds the length column (EXPR$1). Not sure if this is intended behavior, 
> since Postgres does not do this.





[jira] [Commented] (DRILL-2019) Filter pushdown into the subquery when the subquery also has a filter is resulting in incorrect results

2015-01-18 Thread Jacques Nadeau (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282050#comment-14282050
 ] 

Jacques Nadeau commented on DRILL-2019:
---

Looks good. +1


> Filter pushdown into the subquery when the subquery also has a filter is 
> resulting in incorrect results
> ---
>
> Key: DRILL-2019
> URL: https://issues.apache.org/jira/browse/DRILL-2019
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Operators, Query Planning & Optimization
> Reporter: Rahul Challapalli
> Assignee: Aman Sinha
> Priority: Critical
> Attachments: 
> 0001-DRILL-2019-Use-correct-index-for-filter-evaluation-w.patch
>
>
> git.commit.id.abbrev=b491cdb
> The query below, run against TPC-H 0.01 data, should return 0 records 
> (verified with Postgres). However, Drill returns an incorrect result:
> {code}
> select count(*) from cp.`tpch/lineitem.parquet` l inner join (select 
> o.o_orderkey, o.o_custkey from cp.`tpch/orders.parquet` o where o.o_custkey < 
> 5) s on l.l_orderkey = s.o_orderkey and s.o_custkey > 5;
> +------------+
> |   EXPR$0   |
> +------------+
> | 189        |
> +------------+
> {code}
> Marked as 'critical' since drill is reporting incorrect results
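
To make the expected answer concrete, here is a small sketch that runs the same query shape against an in-memory SQLite database. SQLite and the handful of rows below are stand-ins invented for illustration (not Drill and not the TPC-H data). Because the subquery keeps only `o_custkey < 5` and the join condition then requires `o_custkey > 5`, no row can satisfy both, so the count has to be 0 whatever the data is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical miniature tables; only the column names follow the report.
conn.executescript("""
    create table orders (o_orderkey integer, o_custkey integer);
    create table lineitem (l_orderkey integer);
    insert into orders values (1, 2), (2, 4), (3, 7), (4, 9);
    insert into lineitem values (1), (1), (2), (3), (4);
""")


def count_matches(connection):
    """Run the reported query shape and return the single count value."""
    row = connection.execute("""
        select count(*)
        from lineitem l
        inner join (select o.o_orderkey, o.o_custkey
                    from orders o
                    where o.o_custkey < 5) s
          on l.l_orderkey = s.o_orderkey and s.o_custkey > 5
    """).fetchone()
    return row[0]


print(count_matches(conn))
```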





[jira] [Updated] (DRILL-2019) Filter pushdown into the subquery when the subquery also has a filter is resulting in incorrect results

2015-01-18 Thread Aman Sinha (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Sinha updated DRILL-2019:
--
Attachment: 0001-DRILL-2019-Use-correct-index-for-filter-evaluation-w.patch

Uploaded patch. The fix is to pass the correct index to doEval() in the filter 
template for SelectionVector2. [~jnadeau] could you please review? 
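
The following is a simplified sketch of why the index matters when a filter runs on top of a selection vector. It is an assumption about how the symptom could arise, not a copy of the patch: the function names and the list-based "selection vector" are hypothetical stand-ins for the real filter template.

```python
def initial_filter(values, predicate):
    """First filter: return a selection vector of row indices that pass."""
    return [i for i, value in enumerate(values) if predicate(value)]


def filter_sv2(values, selection_vector, predicate):
    """Second filter: evaluate at the row index stored in the selection vector."""
    out = []
    for position in range(len(selection_vector)):
        row_index = selection_vector[position]  # dereference before evaluating
        if predicate(values[row_index]):
            out.append(row_index)
    return out


def filter_sv2_wrong_index(values, selection_vector, predicate):
    """Same loop, but evaluates at the loop position instead of the row index."""
    out = []
    for position in range(len(selection_vector)):
        if predicate(values[position]):  # wrong: position is not a row index
            out.append(selection_vector[position])
    return out


if __name__ == "__main__":
    custkeys = [7, 2, 9, 3]  # made-up values for illustration
    sv = initial_filter(custkeys, lambda k: k < 5)
    print(filter_sv2(custkeys, sv, lambda k: k > 5))
    print(filter_sv2_wrong_index(custkeys, sv, lambda k: k > 5))
```

With the wrong index, the second predicate is tested against rows that the first filter already rejected, which is one way rows could leak through two contradictory filters.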

> Filter pushdown into the subquery when the subquery also has a filter is 
> resulting in incorrect results
> ---
>
> Key: DRILL-2019
> URL: https://issues.apache.org/jira/browse/DRILL-2019
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Operators, Query Planning & Optimization
> Reporter: Rahul Challapalli
> Assignee: Aman Sinha
> Priority: Critical
> Attachments: 
> 0001-DRILL-2019-Use-correct-index-for-filter-evaluation-w.patch
>
>
> git.commit.id.abbrev=b491cdb
> The query below, run against TPC-H 0.01 data, should return 0 records 
> (verified with Postgres). However, Drill returns an incorrect result:
> {code}
> select count(*) from cp.`tpch/lineitem.parquet` l inner join (select 
> o.o_orderkey, o.o_custkey from cp.`tpch/orders.parquet` o where o.o_custkey < 
> 5) s on l.l_orderkey = s.o_orderkey and s.o_custkey > 5;
> +------------+
> |   EXPR$0   |
> +------------+
> | 189        |
> +------------+
> {code}
> Marked as 'critical' since drill is reporting incorrect results





[jira] [Updated] (DRILL-2022) Parquet engine falls back to "new" Parquet reader unnecessarily

2015-01-18 Thread Adam Gilmore (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Gilmore updated DRILL-2022:

Priority: Minor  (was: Major)

> Parquet engine falls back to "new" Parquet reader unnecessarily
> ---
>
> Key: DRILL-2022
> URL: https://issues.apache.org/jira/browse/DRILL-2022
> Project: Apache Drill
> Issue Type: Improvement
> Components: Storage - Parquet
> Affects Versions: 0.8.0
> Reporter: Adam Gilmore
> Assignee: Parth Chandra
> Priority: Minor
> Attachments: DRILL-2022.1.patch.txt
>
>
> The Parquet engine falls back to the "new" Parquet reader whenever a Parquet 
> file that is "complex" (i.e. not purely primitive types) is found.
> The engine should still use the faster reader when all the projected columns 
> are primitive types and only fall back to the other reader when columns 
> containing complex types are selected.
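
The proposed selection rule can be sketched roughly as follows. This is a hypothetical model, not the Parquet storage plugin's code: the `Column` type, the reader labels and the use of `None` for a star projection are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Column:
    name: str
    is_primitive: bool


def choose_reader(schema: List[Column], projected: Optional[List[str]]) -> str:
    """Pick the fast reader unless a projected column has a complex type."""
    by_name = {column.name: column for column in schema}
    if projected is None:
        # Star projection: every column in the file is read.
        selected = list(schema)
    else:
        selected = [by_name[name] for name in projected]
    return "fast" if all(column.is_primitive for column in selected) else "complex"


if __name__ == "__main__":
    schema = [Column("id", True), Column("name", True), Column("tags", False)]
    print(choose_reader(schema, ["id", "name"]))
    print(choose_reader(schema, ["id", "tags"]))
    print(choose_reader(schema, None))
```

The decision depends on the projected columns rather than on the file's full schema, which is the change the issue asks for.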





[jira] [Created] (DRILL-2032) For TestStarQueries, migrate to new test framework to improve validation process

2015-01-18 Thread Sean Hsuan-Yi Chu (JIRA)
Sean Hsuan-Yi Chu created DRILL-2032:


Summary: For TestStarQueries, migrate to new test framework to improve validation process
Key: DRILL-2032
URL: https://issues.apache.org/jira/browse/DRILL-2032
Project: Apache Drill
Issue Type: Improvement
Reporter: Sean Hsuan-Yi Chu
Assignee: Sean Hsuan-Yi Chu
Priority: Minor


In TestStarQueries.java, some unit tests do not use testFramework, so the 
returned results are not validated for correctness. 

This issue tracks migrating those tests to the new test framework so that 
their results are validated.





[jira] [Commented] (DRILL-2021) select a, *, a from ... gives wrong result

2015-01-18 Thread Sean Hsuan-Yi Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281980#comment-14281980
 ] 

Sean Hsuan-Yi Chu commented on DRILL-2021:
--

Agreed! I just filed a new JIRA, DRILL-2032, to improve the validation process.

> select a, *, a from ... gives wrong result
> --
>
> Key: DRILL-2021
> URL: https://issues.apache.org/jira/browse/DRILL-2021
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow, Execution - Operators
> Environment: IDE
> Reporter: Sean Hsuan-Yi Chu
> Assignee: Aman Sinha
> Priority: Critical
> Fix For: 0.8.0
>
> Attachments: DRILL-2021.1.patch, DRILL-2021.2.patch, 
> DRILL-2021.3.patch
>
>
> When the select clause contains star(s) and regular columns that appear 
> more than once, some regular columns fail to be printed.
> For example, 
> select n_name, *, n_name from cp.`tpch/nation.parquet` limit 2
> ---------------------------------------------------------------
> n_name     n_nationkey  n_name0    n_regionkey  n_comment
> ALGERIA    0            ALGERIA    0            haggle. carefully final deposits detect slyly agai
> ARGENTINA  1            ARGENTINA  1            al foxes promise slyly according to the regular accounts. bold requests alone
> ---------------------------------------------------------------
> Note that "n_name" should be printed three times. 
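
A rough sketch of the expected expansion, for illustration: the star is replaced in place by the table's columns, and every explicitly listed column is kept. The function name is hypothetical, and the column order is the one defined for the TPC-H nation table.

```python
def expand_select_list(select_items, table_columns):
    """Expand '*' in place while keeping every explicitly listed column."""
    expanded = []
    for item in select_items:
        if item == "*":
            expanded.extend(table_columns)
        else:
            expanded.append(item)
    return expanded


nation_columns = ["n_nationkey", "n_name", "n_regionkey", "n_comment"]
output = expand_select_list(["n_name", "*", "n_name"], nation_columns)

# n_name is listed twice explicitly and once more through the star.
print(output)
print(output.count("n_name"))
```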





[jira] [Commented] (DRILL-2021) select a, *, a from ... gives wrong result

2015-01-18 Thread Aman Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281852#comment-14281852
 ] 

Aman Sinha commented on DRILL-2021:
---

Reviewed patch and manually inspected the test results for all TestStarQueries 
(we really need to add validation for all of those...only the recent ones have 
validation check).   
Fixed in master branch, commit #: 31d764d4f

> select a, *, a from ... gives wrong result
> --
>
> Key: DRILL-2021
> URL: https://issues.apache.org/jira/browse/DRILL-2021
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow, Execution - Operators
> Environment: IDE
> Reporter: Sean Hsuan-Yi Chu
> Assignee: Aman Sinha
> Priority: Critical
> Fix For: 0.8.0
>
> Attachments: DRILL-2021.1.patch, DRILL-2021.2.patch, 
> DRILL-2021.3.patch
>
>
> When the select clause contains star(s) and regular columns that appear 
> more than once, some regular columns fail to be printed.
> For example, 
> select n_name, *, n_name from cp.`tpch/nation.parquet` limit 2
> ---------------------------------------------------------------
> n_name     n_nationkey  n_name0    n_regionkey  n_comment
> ALGERIA    0            ALGERIA    0            haggle. carefully final deposits detect slyly agai
> ARGENTINA  1            ARGENTINA  1            al foxes promise slyly according to the regular accounts. bold requests alone
> ---------------------------------------------------------------
> Note that "n_name" should be printed three times. 


