[jira] [Commented] (DRILL-2030) CTAS with SELECT * and expression creates column names with prefix
[ https://issues.apache.org/jira/browse/DRILL-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282190#comment-14282190 ]

Jinfeng Ni commented on DRILL-2030:
-----------------------------------

Thanks for finding this problem. Yes, you are right. The top-level project should be below Writer.

Originally, I thought both Screen and Writer could be root operators: Screen is the root operator for a query, while Writer is the root for CTAS. Apparently, we later decided to put Writer under Screen so that Screen could report back the result of CTAS, which broke the logic of star column handling.

I'll submit a patch to fix this issue.

> CTAS with SELECT * and expression creates column names with prefix
> ------------------------------------------------------------------
>
> Key: DRILL-2030
> URL: https://issues.apache.org/jira/browse/DRILL-2030
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Affects Versions: 0.7.0
> Reporter: Aman Sinha
> Assignee: Jinfeng Ni
>
> Doing a CTAS with the star column and an expression creates columns that
> contain the table prefix 'T||'. Looks like the top project did not strip
> out the prefix.
> {code}
> 0: jdbc:drill:zk=local> create table region4 as select *, r_regionkey + 1
> from cp.`tpch/region.parquet`;
> +----------+---------------------------+
> | Fragment | Number of records written |
> +----------+---------------------------+
> | 0_0      | 5                         |
> +----------+---------------------------+
> 0: jdbc:drill:zk=local> select * from region4;
> +-----------------+------------+---------------+--------+
> | T2¦¦r_regionkey | T2¦¦r_name | T2¦¦r_comment | EXPR$1 |
> +-----------------+------------+---------------+--------+
> {code}
> The column names are correct if there is a regular column with the star
> column.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
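To make "strip out the prefix" concrete, here is a minimal sketch of the renaming the top project is expected to perform. It is not Drill's planner code: the function name `strip_star_prefix` is hypothetical, and the prefix pattern (`T`, digits, then `¦¦`) is an assumption inferred only from the column names shown in the report above.

```python
import re

# Assumed prefix shape, inferred from the output in the report: "T2¦¦r_name".
# This is an illustration only, not the pattern Drill itself uses.
STAR_PREFIX = re.compile(r"^T\d+¦¦")


def strip_star_prefix(column_names):
    """Return the column names with any leading star-column prefix removed.

    Names without the prefix (such as generated expression names) are
    returned unchanged.
    """
    return [STAR_PREFIX.sub("", name) for name in column_names]


if __name__ == "__main__":
    written = ["T2¦¦r_regionkey", "T2¦¦r_name", "T2¦¦r_comment", "EXPR$1"]
    print(strip_star_prefix(written))
```

In the scenario described in the comment, this step would have to run below Writer so that the table is created with the clean names.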
[jira] [Updated] (DRILL-1889) when 'select *' is used along with an order by on length of a column, Drill is adding the computed length to the list of columns
[ https://issues.apache.org/jira/browse/DRILL-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacques Nadeau updated DRILL-1889:
----------------------------------
    Fix Version/s:     (was: 0.9.0)
                       0.8.0

> when 'select *' is used along with an order by on length of a column, Drill
> is adding the computed length to the list of columns
> ----------------------------------------------------------------------------
>
> Key: DRILL-1889
> URL: https://issues.apache.org/jira/browse/DRILL-1889
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Reporter: Rahul Challapalli
> Assignee: Jinfeng Ni
> Priority: Critical
> Fix For: 0.8.0
>
> Attachments:
> 0001-DRILL-1889-Fix-star-column-prefix-and-subsume-logic-.patch.3
>
>
> git.commit.id.abbrev=9dfa4a1
> Dataset:
> {code}
> {
>   "col1":1,
>   "col2":"a"
> }
> {
>   "col1":2,
>   "col2":"b"
> }
> {
>   "col1":2,
>   "col2":"abc"
> }
> {code}
> Query:
> {code}
> select * from `b.json` order by length(col2);
> +------+------+--------+
> | col1 | col2 | EXPR$1 |
> +------+------+--------+
> | 1    | a    | 1      |
> | 2    | b    | 1      |
> | 2    | abc  | 3      |
> +------+------+--------+
> {code}
> Drill adds the length column (EXPR$1). Not sure if this is intended
> behavior, since Postgres does not do this.
[jira] [Resolved] (DRILL-1889) when 'select *' is used along with an order by on length of a column, Drill is adding the computed length to the list of columns
[ https://issues.apache.org/jira/browse/DRILL-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacques Nadeau resolved DRILL-1889.
-----------------------------------
    Resolution: Fixed

> when 'select *' is used along with an order by on length of a column, Drill
> is adding the computed length to the list of columns
> ----------------------------------------------------------------------------
>
> Key: DRILL-1889
> URL: https://issues.apache.org/jira/browse/DRILL-1889
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Reporter: Rahul Challapalli
> Assignee: Jinfeng Ni
> Priority: Critical
> Fix For: 0.9.0
>
> Attachments:
> 0001-DRILL-1889-Fix-star-column-prefix-and-subsume-logic-.patch.3
>
>
> git.commit.id.abbrev=9dfa4a1
> Dataset:
> {code}
> {
>   "col1":1,
>   "col2":"a"
> }
> {
>   "col1":2,
>   "col2":"b"
> }
> {
>   "col1":2,
>   "col2":"abc"
> }
> {code}
> Query:
> {code}
> select * from `b.json` order by length(col2);
> +------+------+--------+
> | col1 | col2 | EXPR$1 |
> +------+------+--------+
> | 1    | a    | 1      |
> | 2    | b    | 1      |
> | 2    | abc  | 3      |
> +------+------+--------+
> {code}
> Drill adds the length column (EXPR$1). Not sure if this is intended
> behavior, since Postgres does not do this.
[jira] [Commented] (DRILL-2019) Filter pushdown into the subquery when the subquery also has a filter is resulting in incorrect results
[ https://issues.apache.org/jira/browse/DRILL-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282050#comment-14282050 ]

Jacques Nadeau commented on DRILL-2019:
---------------------------------------

Looks good. +1

> Filter pushdown into the subquery when the subquery also has a filter is
> resulting in incorrect results
> -------------------------------------------------------------------------
>
> Key: DRILL-2019
> URL: https://issues.apache.org/jira/browse/DRILL-2019
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Operators, Query Planning & Optimization
> Reporter: Rahul Challapalli
> Assignee: Aman Sinha
> Priority: Critical
> Attachments:
> 0001-DRILL-2019-Use-correct-index-for-filter-evaluation-w.patch
>
>
> git.commit.id.abbrev=b491cdb
> The below query on top of tpch 0.01 data should actually return 0 records.
> (Verified with postgres). However drill returns incorrect result
> {code}
> select count(*) from cp.`tpch/lineitem.parquet` l inner join (select
> o.o_orderkey, o.o_custkey from cp.`tpch/orders.parquet` o where o.o_custkey <
> 5) s on l.l_orderkey = s.o_orderkey and s.o_custkey > 5;
> +--------+
> | EXPR$0 |
> +--------+
> | 189    |
> +--------+
> {code}
> Marked as 'critical' since drill is reporting incorrect results
[jira] [Updated] (DRILL-2019) Filter pushdown into the subquery when the subquery also has a filter is resulting in incorrect results
[ https://issues.apache.org/jira/browse/DRILL-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aman Sinha updated DRILL-2019:
------------------------------
    Attachment: 0001-DRILL-2019-Use-correct-index-for-filter-evaluation-w.patch

Uploaded patch. The fix is to pass the correct index to doEval() in the filter template for SelectionVector2. [~jnadeau] could you please review?

> Filter pushdown into the subquery when the subquery also has a filter is
> resulting in incorrect results
> -------------------------------------------------------------------------
>
> Key: DRILL-2019
> URL: https://issues.apache.org/jira/browse/DRILL-2019
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Operators, Query Planning & Optimization
> Reporter: Rahul Challapalli
> Assignee: Aman Sinha
> Priority: Critical
> Attachments:
> 0001-DRILL-2019-Use-correct-index-for-filter-evaluation-w.patch
>
>
> git.commit.id.abbrev=b491cdb
> The below query on top of tpch 0.01 data should actually return 0 records.
> (Verified with postgres). However drill returns incorrect result
> {code}
> select count(*) from cp.`tpch/lineitem.parquet` l inner join (select
> o.o_orderkey, o.o_custkey from cp.`tpch/orders.parquet` o where o.o_custkey <
> 5) s on l.l_orderkey = s.o_orderkey and s.o_custkey > 5;
> +--------+
> | EXPR$0 |
> +--------+
> | 189    |
> +--------+
> {code}
> Marked as 'critical' since drill is reporting incorrect results
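The index issue named in the update above can be sketched outside Drill. The following is an illustration only, not the generated filter template: `do_eval`, `filter_with_sv2`, `filter_with_sv2_buggy`, and the list-based batch are all hypothetical stand-ins. It assumes that a filter receiving an already-filtered batch must evaluate its predicate at the row index stored in the incoming selection vector, and shows one plausible way a wrong index (the loop counter) would let rows through that should have been rejected.

```python
def do_eval(values, row_index, predicate):
    """Evaluate the predicate against the value stored at row_index."""
    return predicate(values[row_index])


def filter_with_sv2(values, incoming_sv2, predicate):
    """Filter a batch that already carries a selection vector.

    incoming_sv2 lists the row indices that survived an upstream filter.
    The predicate is evaluated at those row indices, and the surviving
    row indices form the outgoing selection vector.
    """
    outgoing = []
    for i in range(len(incoming_sv2)):
        row_index = incoming_sv2[i]  # position of the row in the batch
        if do_eval(values, row_index, predicate):
            outgoing.append(row_index)
    return outgoing


def filter_with_sv2_buggy(values, incoming_sv2, predicate):
    """Hypothetical faulty variant: evaluates at the loop counter i
    instead of the row index taken from the selection vector."""
    outgoing = []
    for i in range(len(incoming_sv2)):
        if do_eval(values, i, predicate):  # wrong index
            outgoing.append(incoming_sv2[i])
    return outgoing


if __name__ == "__main__":
    o_custkey = [7, 3, 9, 2, 8]
    # Upstream filter o_custkey < 5 keeps rows 1 and 3.
    sv2 = [i for i, v in enumerate(o_custkey) if v < 5]
    # Second filter o_custkey > 5 can match none of those rows.
    print(filter_with_sv2(o_custkey, sv2, lambda v: v > 5))
    print(filter_with_sv2_buggy(o_custkey, sv2, lambda v: v > 5))
```

With contradictory predicates like those in the reported query, the correct variant yields an empty selection, while the faulty variant keeps a row because it tested a value that the upstream filter had already discarded.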
[jira] [Updated] (DRILL-2022) Parquet engine falls back to "new" Parquet reader unnecessarily
[ https://issues.apache.org/jira/browse/DRILL-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Gilmore updated DRILL-2022:
--------------------------------
    Priority: Minor  (was: Major)

> Parquet engine falls back to "new" Parquet reader unnecessarily
> ---------------------------------------------------------------
>
> Key: DRILL-2022
> URL: https://issues.apache.org/jira/browse/DRILL-2022
> Project: Apache Drill
> Issue Type: Improvement
> Components: Storage - Parquet
> Affects Versions: 0.8.0
> Reporter: Adam Gilmore
> Assignee: Parth Chandra
> Priority: Minor
> Attachments: DRILL-2022.1.patch.txt
>
>
> The Parquet engine falls back to the "new" Parquet reader whenever a Parquet
> file that is "complex" (i.e. not purely primitive types) is found.
> The engine should still use the faster reader when all the projected columns
> are primitive types and only fall back to the other reader when columns
> containing complex types are selected.
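The requested behaviour amounts to deciding per query rather than per file. A minimal sketch of that decision follows, under stated assumptions: the schema is modelled as a plain dict from column name to type name, the set of primitive type names is illustrative, and `choose_reader` with its return labels `"fast"` and `"new"` is hypothetical rather than anything in Drill's Parquet storage plugin.

```python
# Illustrative set of primitive type names; an assumption, not Drill's list.
PRIMITIVE_TYPES = {"boolean", "int32", "int64", "float", "double", "binary"}


def choose_reader(schema, projected_columns=None):
    """Pick a reader label based on the columns the query actually reads.

    schema maps column name -> type name. projected_columns is the list of
    columns requested by the query, or None for a star query that reads
    every column in the file.
    """
    columns = list(schema) if projected_columns is None else projected_columns
    if all(schema[name] in PRIMITIVE_TYPES for name in columns):
        return "fast"  # every projected column is primitive
    return "new"       # at least one projected column is complex


if __name__ == "__main__":
    schema = {"id": "int64", "name": "binary", "tags": "list"}
    print(choose_reader(schema, ["id", "name"]))
    print(choose_reader(schema, ["id", "tags"]))
    print(choose_reader(schema))
```

Treating a star query as a projection of every column keeps the fallback in place for files that contain complex types when all columns are read.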
[jira] [Created] (DRILL-2032) For TestStarQueries, migrate to new test framework to improve validation process
Sean Hsuan-Yi Chu created DRILL-2032:
-------------------------------------

Summary: For TestStarQueries, migrate to new test framework to improve validation process
Key: DRILL-2032
URL: https://issues.apache.org/jira/browse/DRILL-2032
Project: Apache Drill
Issue Type: Improvement
Reporter: Sean Hsuan-Yi Chu
Assignee: Sean Hsuan-Yi Chu
Priority: Minor

In TestStarQueries.java, some unit tests do not use testFramework, so returned results are not validated for correctness.

This issue was created to add that validation.
[jira] [Commented] (DRILL-2021) select a, *, a from ... gives wrong result
[ https://issues.apache.org/jira/browse/DRILL-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281980#comment-14281980 ]

Sean Hsuan-Yi Chu commented on DRILL-2021:
------------------------------------------

Agree! I just filed a new JIRA, DRILL-2032, to improve the validation process.

> select a, *, a from ... gives wrong result
> ------------------------------------------
>
> Key: DRILL-2021
> URL: https://issues.apache.org/jira/browse/DRILL-2021
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow, Execution - Operators
> Environment: IDE
> Reporter: Sean Hsuan-Yi Chu
> Assignee: Aman Sinha
> Priority: Critical
> Fix For: 0.8.0
>
> Attachments: DRILL-2021.1.patch, DRILL-2021.2.patch,
> DRILL-2021.3.patch
>
>
> When the select clause contains star(s) and regular columns that show up
> more than once, some regular columns fail to be printed out.
> For example,
> select n_name, *, n_name from cp.`tpch/nation.parquet` limit 2
> -----
> n_name     n_nationkey  n_name0    n_regionkey  n_comment
> ALGERIA    0            ALGERIA    0            haggle. carefully final deposits detect slyly agai
> ARGENTINA  1            ARGENTINA  1            al foxes promise slyly according to the regular accounts. bold requests alone
> -----
> Notice "n_name" should be printed out three times.
[jira] [Commented] (DRILL-2021) select a, *, a from ... gives wrong result
[ https://issues.apache.org/jira/browse/DRILL-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281852#comment-14281852 ]

Aman Sinha commented on DRILL-2021:
-----------------------------------

Reviewed the patch and manually inspected the test results for all TestStarQueries. (We really need to add validation for all of those; only the recent ones have a validation check.)

Fixed in master branch, commit #: 31d764d4f

> select a, *, a from ... gives wrong result
> ------------------------------------------
>
> Key: DRILL-2021
> URL: https://issues.apache.org/jira/browse/DRILL-2021
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow, Execution - Operators
> Environment: IDE
> Reporter: Sean Hsuan-Yi Chu
> Assignee: Aman Sinha
> Priority: Critical
> Fix For: 0.8.0
>
> Attachments: DRILL-2021.1.patch, DRILL-2021.2.patch,
> DRILL-2021.3.patch
>
>
> When the select clause contains star(s) and regular columns that show up
> more than once, some regular columns fail to be printed out.
> For example,
> select n_name, *, n_name from cp.`tpch/nation.parquet` limit 2
> -----
> n_name     n_nationkey  n_name0    n_regionkey  n_comment
> ALGERIA    0            ALGERIA    0            haggle. carefully final deposits detect slyly agai
> ARGENTINA  1            ARGENTINA  1            al foxes promise slyly according to the regular accounts. bold requests alone
> -----
> Notice "n_name" should be printed out three times.