[jira] [Commented] (DRILL-4320) Difference in query plan on JDK8 for window function query
[ https://issues.apache.org/jira/browse/DRILL-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15130181#comment-15130181 ]

Khurram Faraaz commented on DRILL-4320:
---------------------------------------

I can repro on MapR Drill 1.5.0 (git.commit.id=6a36a704) with JDK 8 and MapR FS 5.0.0 on a 4-node CentOS cluster.

> Difference in query plan on JDK8 for window function query
> ----------------------------------------------------------
>
>                 Key: DRILL-4320
>                 URL: https://issues.apache.org/jira/browse/DRILL-4320
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.4.0
>         Environment: 4 node cluster CentOS
>            Reporter: Khurram Faraaz
>
> A difference in the query plan is seen for a window function query on JDK 8 in the test environment below. The difference is that a Project is missing after the initial Scan; the new plan looks more optimized. Should we update the expected query plan, or is further investigation required?
> Java 8
> MapR Drill 1.4.0 GA
> JDK8
> MapR FS 5.0.0 GA
> Functional/window_functions/optimization/plan/pp_03.sql
> {noformat}
> Actual plan
> 00-00    Screen
> 00-01      Project(EXPR$0=[$0])
> 00-02        Project($0=[$2])
> 00-03          Window(window#0=[window(partition {1} order by [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($0)])])
> 00-04            SelectionVectorRemover
> 00-05              Sort(sort0=[$1], dir0=[ASC])
> 00-06                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/subqueries/t1]], selectionRoot=maprfs:/drill/testdata/subqueries/t1, numFiles=1, usedMetadataFile=false, columns=[`a1`, `c1`]]])
>
> Expected plan
> Screen
> .*Project.*
> .*Project.*
> .*Window.*range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs.*
> .*SelectionVectorRemover.*
> .*Sort.*
> .*Project.*
> .*Scan.*
> {noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (DRILL-3610) TimestampAdd/Diff (SQL_TSI_) functions
[ https://issues.apache.org/jira/browse/DRILL-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arina Ielchiieva reassigned DRILL-3610:
---------------------------------------

    Assignee: Arina Ielchiieva

> TimestampAdd/Diff (SQL_TSI_) functions
> --------------------------------------
>
>                 Key: DRILL-3610
>                 URL: https://issues.apache.org/jira/browse/DRILL-3610
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Functions - Drill
>            Reporter: Andries Engelbrecht
>            Assignee: Arina Ielchiieva
>             Fix For: Future
>
> Add TIMESTAMPADD and TIMESTAMPDIFF (SQL_TSI) functions for year, quarter, month, week, day, hour, minute, second.
> Examples:
> SELECT CAST(TIMESTAMPADD(SQL_TSI_QUARTER, 1, Date('2013-03-31')) AS SQL_DATE) AS `column_quarter`
> FROM `table_in`
> HAVING (COUNT(1) > 0)
>
> SELECT `table_in`.`datetime` AS `column1`,
> `table`.`Key` AS `column_Key`,
> TIMESTAMPDIFF(SQL_TSI_MINUTE, to_timestamp('2004-07-04', 'yyyy-MM-dd'), `table_in`.`datetime`) AS `sum_datediff_minute`
> FROM `calcs`
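The SQL_TSI_ interval names above come from the JDBC/ODBC escape functions. As an illustration only (not Drill code, and covering only the fixed-length, seconds-based units), a minimal sketch of TIMESTAMPDIFF semantics:

```python
from datetime import datetime

# Seconds-based SQL_TSI_ units; month/quarter/year intervals count
# calendar-boundary crossings and are omitted from this sketch.
SECONDS_PER_UNIT = {
    "SQL_TSI_SECOND": 1,
    "SQL_TSI_MINUTE": 60,
    "SQL_TSI_HOUR": 3600,
    "SQL_TSI_DAY": 86400,
}

def timestampdiff(unit, start, end):
    """Count whole `unit` intervals between start and end."""
    return int((end - start).total_seconds() // SECONDS_PER_UNIT[unit])

# TIMESTAMPDIFF(SQL_TSI_MINUTE, '2004-07-04 00:00', '2004-07-04 01:30')
print(timestampdiff("SQL_TSI_MINUTE",
                    datetime(2004, 7, 4),
                    datetime(2004, 7, 4, 1, 30)))  # -> 90
```

Note that floor division here truncates toward negative infinity for negative intervals; real SQL implementations differ on that edge case.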
[jira] [Commented] (DRILL-4320) Difference in query plan on JDK8 for window function query
[ https://issues.apache.org/jira/browse/DRILL-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15130220#comment-15130220 ]

Khurram Faraaz commented on DRILL-4320:
---------------------------------------

Fourteen tests fail on MapR Drill 1.5.0 with JDK 8 due to a difference in plan: a Project after the initial Scan is missing from the actual query plan.

Project is missing after the initial Scan in:
Functional/filter/pushdown/plan/q1.sql
Functional/filter/pushdown/plan/q2.sql
Functional/filter/pushdown/plan/q3.sql
Functional/filter/pushdown/plan/q4.sql
Functional/filter/pushdown/plan/q5.sql
Functional/filter/pushdown/plan/q6.sql
Functional/filter/pushdown/plan/q7.sql
Functional/filter/pushdown/plan/q8.sql
Functional/filter/pushdown/plan/q9.sql
Functional/filter/pushdown/plan/q10.sql
Functional/filter/pushdown/plan/q11.sql
Functional/filter/pushdown/plan/q13.sql
Functional/filter/pushdown/plan/q14.sql
Functional/filter/pushdown/plan/q16.sql

> Difference in query plan on JDK8 for window function query
> ----------------------------------------------------------
>
>                 Key: DRILL-4320
>                 URL: https://issues.apache.org/jira/browse/DRILL-4320
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.4.0
>         Environment: 4 node cluster CentOS
>            Reporter: Khurram Faraaz
>
> A difference in the query plan is seen for a window function query on JDK 8: a Project is missing after the initial Scan, and the new plan looks more optimized. Should we update the expected query plan, or is further investigation required?
[jira] [Commented] (DRILL-4340) Tableau complains about the ODBC driver capabilities
[ https://issues.apache.org/jira/browse/DRILL-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15130286#comment-15130286 ]

Oscar Morante commented on DRILL-4340:
--------------------------------------

Oops! You're right, I was missing the TDC file. Thanks!

> Tableau complains about the ODBC driver capabilities
> ----------------------------------------------------
>
>                 Key: DRILL-4340
>                 URL: https://issues.apache.org/jira/browse/DRILL-4340
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.5.0
>            Reporter: Oscar Morante
>
> I'm testing Drill with Tableau via ODBC, and when it connects it complains about some missing features:
> {code}
> This ODBC driver does not support important capabilities used by Tableau.
> This unsupported function is required for relative date filters: The date part named 'week' for the date function: DATETRUNC(date_part, date, [start_of_week])
> -
> Tableau identified the following warnings for the ODBC data source named 'test (s3.views.test)':
> This aggregation is unsupported: Attribute
> This aggregation is unsupported: Std. Dev
> This aggregation is unsupported: Std. Dev (Pop.)
> This aggregation is unsupported: Trunc Week Number
> This aggregation is unsupported: Variance
> This aggregation is unsupported: Variance (Pop.)
> This function is unsupported: % with parameter types 'integer, integer'
> This function is unsupported: ABS(number) with parameter types 'float'
> This function is unsupported: ABS(number) with parameter types 'integer'
> This function is unsupported: ACOS(number) with parameter types 'float'
> This function is unsupported: ASIN(number) with parameter types 'float'
> This function is unsupported: ATAN(number) with parameter types 'float'
> This function is unsupported: ATAN2(y number, x number) with parameter types 'float, float'
> This function is unsupported: COS(angle) with parameter types 'float'
> This function is unsupported: COT(angle) with parameter types 'float'
> This function is unsupported: DATEPART_DAYOFWEEK_INTERNAL with parameter types 'date'
> This function is unsupported: DATEPART_WEEK_INTERNAL with parameter types 'date'
> This function is unsupported: DATETIME with parameter types 'integer'
> This function is unsupported: DEGREES(number) with parameter types 'float'
> This function is unsupported: EXP(number) with parameter types 'float'
> This function is unsupported: LN with parameter types 'float'
> This function is unsupported: LN(number) with parameter types 'float'
> This function is unsupported: LOG with parameter types 'float'
> This function is unsupported: LOG(number, [base]) with parameter types 'float'
> This function is unsupported: PI()
> This function is unsupported: POWER with parameter types 'float, integer'
> This function is unsupported: POWER with parameter types 'integer, integer'
> This function is unsupported: POWER(number,power) with parameter types 'float, integer'
> This function is unsupported: POWER(number,power) with parameter types 'integer, integer'
> This function is unsupported: RADIANS(number) with parameter types 'float'
> This function is unsupported: ROUND(number, [decimals]) with parameter types 'float'
> This function is unsupported: ROUND(number, [decimals]) with parameter types 'float, integer'
> This function is unsupported: SIGN(number) with parameter types 'float'
> This function is unsupported: SIN(angle) with parameter types 'float'
> This function is unsupported: SQRT(number) with parameter types 'float'
> This function is unsupported: SQUARE with parameter types 'float'
> This function is unsupported: TAN(angle) with parameter types 'float'
> This function is unsupported: The date part named 'week' for the date function: DATEDIFF(date_part, start_date, end_date, [start_of_week])
> This function is unsupported: The date part named 'week' for the date function: DATEDIFF(date_part, start_date, end_date, [start_of_week])
> This function is unsupported: The date part named 'week' for the date function: DATENAME(date_part, date, [start_of_week])
> This function is unsupported: The date part named 'week' for the date function: DATENAME(date_part, date, [start_of_week])
> This function is unsupported: The date part named 'week' for the date function: DATEPART(date_part, date, [start_of_week])
> This function is unsupported: The date part named 'week' for the date function: DATEPART(date_part, date, [start_of_week])
> This function is unsupported: The date part named 'weekday' for the date function: DATENAME(date_part, date, [start_of_week])
> This function is unsupported: The date part named 'weekday' for the date function: DATENAME(date_part, date, [start_of_week])
> This function is unsupported: The date part named 'weekday' for the date function: DATEPART(date_part, date, [start_of_week])
> This function is
[jira] [Closed] (DRILL-4340) Tableau complains about the ODBC driver capabilities
[ https://issues.apache.org/jira/browse/DRILL-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oscar Morante closed DRILL-4340.
--------------------------------

    Resolution: Not A Problem

> Tableau complains about the ODBC driver capabilities
> ----------------------------------------------------
>
>                 Key: DRILL-4340
>                 URL: https://issues.apache.org/jira/browse/DRILL-4340
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.5.0
>            Reporter: Oscar Morante
>
> I'm testing Drill with Tableau via ODBC, and when it connects it complains about some missing features.
[jira] [Commented] (DRILL-4321) Difference in results count distinct with min max query on JDK8
[ https://issues.apache.org/jira/browse/DRILL-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15130178#comment-15130178 ]

Khurram Faraaz commented on DRILL-4321:
---------------------------------------

I can repro on MapR Drill 1.5.0 (git.commit.id=6a36a704) with JDK 8 and MapR FS 5.0.0 on a 4-node CentOS cluster.

> Difference in results count distinct with min max query on JDK8
> ---------------------------------------------------------------
>
>                 Key: DRILL-4321
>                 URL: https://issues.apache.org/jira/browse/DRILL-4321
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.4.0
>         Environment: 4 node cluster
>            Reporter: Khurram Faraaz
>         Attachments: expected_results.res
>
> A count distinct query with min, max, group by, and order by returns incorrect results on MapR Drill 1.4.0, MapR FS 5.0.0 GA, and JDK 8. The difference is in the way values after the decimal point are rounded off when using JDK 8.
> Expected results file can be found here:
> https://github.com/mapr/drill-test-framework/blob/master/framework/resources/Functional/aggregates/aggregation/count_distinct/with_min_max_c_float_group_by_1_cols.res
> Failing query is Functional/aggregates/aggregation/count_distinct/with_min_max_c_float_group_by_1_cols.sql
> {noformat}
> select count(distinct c_float), max(c_float), min(c_float) from alltypes_with_nulls group by c_date order by c_date;
> {noformat}
[jira] [Updated] (DRILL-4349) parquet reader returns wrong results when reading a nullable column that starts with a large number of nulls (>30k)
[ https://issues.apache.org/jira/browse/DRILL-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deneche A. Hakim updated DRILL-4349:
------------------------------------

    Attachment: drill4349.tar.gz

attached csv file

> parquet reader returns wrong results when reading a nullable column that
> starts with a large number of nulls (>30k)
> -------------------------------------------------------------------------
>
>                 Key: DRILL-4349
>                 URL: https://issues.apache.org/jira/browse/DRILL-4349
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 1.4.0
>            Reporter: Deneche A. Hakim
>            Assignee: Deneche A. Hakim
>            Priority: Critical
>             Fix For: 1.6.0
>
>         Attachments: drill4349.tar.gz
>
> While reading a nullable column, if in a single pass we only read null values, the parquet reader resets the value of pageReader.readPosInBytes, which leads to wrong data being read from the file.
> To reproduce the issue, create a csv file (repro.csv) with 2 columns (id, val) and 50100 rows, where id equals the row number and val is empty for the first 50k rows and equal to id for the remaining rows.
> Create a parquet table from the csv file:
> {noformat}
> CREATE TABLE `repro_parquet` AS SELECT CAST(columns[0] AS INT) AS id,
> CAST(NULLIF(columns[1], '') AS DOUBLE) AS val from `repro.csv`;
> {noformat}
> Now if you query any of the non-null values you will get wrong results:
> {noformat}
> 0: jdbc:drill:zk=local> select * from `repro_parquet` where id>=50000 limit 10;
> +--------+---------------------------+
> |   id   |            val            |
> +--------+---------------------------+
> | 50000  | 9.11337776337441E-309     |
> | 50001  | 3.26044E-319              |
> | 50002  | 1.4916681476489723E-154   |
> | 50003  | 2.18890676                |
> | 50004  | 2.681561588521345E154     |
> | 50005  | -2.1016574E-317           |
> | 50006  | -1.4916681476489723E-154  |
> | 50007  | -2.18890676               |
> | 50008  | -2.681561588521345E154    |
> | 50009  | 2.1016574E-317            |
> +--------+---------------------------+
> 10 rows selected (0.238 seconds)
> {noformat}
> and here are the expected values:
> {noformat}
> 0: jdbc:drill:zk=local> select * from `repro.csv` where cast(columns[0] as int)>=50000 limit 10;
> +--------------------+
> |      columns       |
> +--------------------+
> | ["50000","50000"]  |
> | ["50001","50001"]  |
> | ["50002","50002"]  |
> | ["50003","50003"]  |
> | ["50004","50004"]  |
> | ["50005","50005"]  |
> | ["50006","50006"]  |
> | ["50007","50007"]  |
> | ["50008","50008"]  |
> | ["50009","50009"]  |
> +--------------------+
> {noformat}
> I confirmed that the file is written correctly and the issue is in the parquet reader (I already have a fix for it)
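The repro file described in the issue can be generated with a short script. This is a sketch; the file name `repro.csv` and the row counts come from the issue text, and the exact row at which non-empty values begin is an assumption ("first 50k rows" read literally):

```python
import csv

# Generate repro.csv: 50,100 rows where column 0 (id) is the row number and
# column 1 (val) is empty for the first 50,000 rows, then equal to the id.
def write_repro(path="repro.csv", total=50100, empty_through=50000):
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for i in range(1, total + 1):
            writer.writerow([i, "" if i <= empty_through else i])

write_repro()
```

Pointing the CREATE TABLE statement above at this file should then reproduce the reported misread of the `val` column.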
[jira] [Commented] (DRILL-3880) sqlline does not allow for a password prompt - security issue
[ https://issues.apache.org/jira/browse/DRILL-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131267#comment-15131267 ]

John Omernik commented on DRILL-3880:
-------------------------------------

So I came up with another use case (other than providing a URL and username, but no password, and having it prompt for the password). I would like it so that if a URL is provided and authentication is required (or, if it's hard to tell whether authentication is required, via another flag that requests a prompt for credentials), it behaves like starting sqlline, typing the connect URL with !connect, and going from there. So:

sqlline -u %YOURURL% -n myusername

results in asking for a password and then completing the process, and

sqlline -u %YOURURL% -P    (-P may be "prompt for credentials")

results in asking for username and then password, as if the user had started sqlline and typed:

!connect %YOURURL%

> sqlline does not allow for a password prompt - security issue
> -------------------------------------------------------------
>
>                 Key: DRILL-3880
>                 URL: https://issues.apache.org/jira/browse/DRILL-3880
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Client - CLI
>    Affects Versions: 1.1.0
>            Reporter: John Omernik
>              Labels: security
>             Fix For: Future
>
> When authentication is enabled in Drill and sqlline is used, there is no way to get the sqlline client to prompt for a password. The only option is to specify the password at the command line (-n user -p password) or to log in and then connect.
> This is a security risk, in that .bash_history now contains the user's password, defeating accountability on the system. Hive and MySQL both allow a -p flag with no value to trigger a prompt for the password that is not logged by .bash_history.
> One workaround is to connect after starting sqlline; however, if the sqlline command offers a way to specify the username/password, we should do it in a way that doesn't violate security principles.
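The requested flag behavior can be sketched outside sqlline. This is a hypothetical illustration, not Drill's implementation: with `-n`, prompt only for the password; with a `-P`-style flag, prompt for both username and password. `getpass.getpass` avoids echoing the password and keeps it out of `.bash_history`; the prompt functions are injected as parameters so the sketch is testable:

```python
import getpass

def gather_credentials(url, user=None, prompt_all=False,
                       ask=input, ask_hidden=getpass.getpass):
    """Mimic the requested behavior: prompt_all (a -P style flag) asks for
    both username and password; otherwise only the password is prompted
    when a username was supplied on the command line."""
    if prompt_all or user is None:
        user = ask("username: ")
    password = ask_hidden("password: ")  # never echoed, never in history
    return url, user, password
```

For example, `gather_credentials("jdbc:drill:zk=local", user="alice", ask_hidden=lambda _: "s3cret")` returns `("jdbc:drill:zk=local", "alice", "s3cret")` without the password ever appearing on the command line.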
[jira] [Commented] (DRILL-4349) parquet reader returns wrong results when reading a nullable column that starts with a large number of nulls (>30k)
[ https://issues.apache.org/jira/browse/DRILL-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131132#comment-15131132 ]

Deneche A. Hakim commented on DRILL-4349:
-----------------------------------------

This is a regression introduced by DRILL-3871.

> parquet reader returns wrong results when reading a nullable column that
> starts with a large number of nulls (>30k)
> -------------------------------------------------------------------------
>
>                 Key: DRILL-4349
>                 URL: https://issues.apache.org/jira/browse/DRILL-4349
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 1.4.0
>            Reporter: Deneche A. Hakim
>            Assignee: Deneche A. Hakim
>            Priority: Critical
>             Fix For: 1.6.0
>
>         Attachments: drill4349.tar.gz
>
> While reading a nullable column, if in a single pass we only read null values, the parquet reader resets the value of pageReader.readPosInBytes, which leads to wrong data being read from the file.
[jira] [Created] (DRILL-4349) parquet reader returns wrong results when reading a nullable column that starts with a large number of nulls (>30k)
Deneche A. Hakim created DRILL-4349:
---------------------------------------

             Summary: parquet reader returns wrong results when reading a nullable column that starts with a large number of nulls (>30k)
                 Key: DRILL-4349
                 URL: https://issues.apache.org/jira/browse/DRILL-4349
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - Parquet
    Affects Versions: 1.4.0
            Reporter: Deneche A. Hakim
            Assignee: Deneche A. Hakim
            Priority: Critical
             Fix For: 1.6.0


While reading a nullable column, if in a single pass we only read null values, the parquet reader resets the value of pageReader.readPosInBytes, which leads to wrong data being read from the file.

To reproduce the issue, create a csv file (repro.csv) with 2 columns (id, val) and 50100 rows, where id equals the row number and val is empty for the first 50k rows and equal to id for the remaining rows.

Create a parquet table from the csv file:
{noformat}
CREATE TABLE `repro_parquet` AS SELECT CAST(columns[0] AS INT) AS id,
CAST(NULLIF(columns[1], '') AS DOUBLE) AS val from `repro.csv`;
{noformat}

Now if you query any of the non-null values you will get wrong results:
{noformat}
0: jdbc:drill:zk=local> select * from `repro_parquet` where id>=50000 limit 10;
+--------+---------------------------+
|   id   |            val            |
+--------+---------------------------+
| 50000  | 9.11337776337441E-309     |
| 50001  | 3.26044E-319              |
| 50002  | 1.4916681476489723E-154   |
| 50003  | 2.18890676                |
| 50004  | 2.681561588521345E154     |
| 50005  | -2.1016574E-317           |
| 50006  | -1.4916681476489723E-154  |
| 50007  | -2.18890676               |
| 50008  | -2.681561588521345E154    |
| 50009  | 2.1016574E-317            |
+--------+---------------------------+
10 rows selected (0.238 seconds)
{noformat}

and here are the expected values:
{noformat}
0: jdbc:drill:zk=local> select * from `repro.csv` where cast(columns[0] as int)>=50000 limit 10;
+--------------------+
|      columns       |
+--------------------+
| ["50000","50000"]  |
| ["50001","50001"]  |
| ["50002","50002"]  |
| ["50003","50003"]  |
| ["50004","50004"]  |
| ["50005","50005"]  |
| ["50006","50006"]  |
| ["50007","50007"]  |
| ["50008","50008"]  |
| ["50009","50009"]  |
+--------------------+
{noformat}

I confirmed that the file is written correctly and the issue is in the parquet reader (I already have a fix for it)
[jira] [Commented] (DRILL-3562) Query fails when using flatten on JSON data where some documents have an empty array
[ https://issues.apache.org/jira/browse/DRILL-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15130550#comment-15130550 ]

Dan Osipov commented on DRILL-3562:
-----------------------------------

Facing the same issue, is there any workaround?

> Query fails when using flatten on JSON data where some documents have an
> empty array
> ------------------------------------------------------------------------
>
>                 Key: DRILL-3562
>                 URL: https://issues.apache.org/jira/browse/DRILL-3562
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - JSON
>    Affects Versions: 1.1.0
>            Reporter: Philip Deegan
>             Fix For: Future
>
> Drill query fails when using flatten when some records contain an empty array:
> {noformat}
> SELECT COUNT(*) FROM (SELECT FLATTEN(t.a.b.c) AS c FROM dfs.`flat.json` t) flat WHERE flat.c.d.e = 'f' limit 1;
> {noformat}
> Succeeds on
> { "a": { "b": { "c": [ { "d": { "e": "f" } } ] } } }
> Fails on
> { "a": { "b": { "c": [] } } }
> Error:
> {noformat}
> Error: SYSTEM ERROR: ClassCastException: Cannot cast org.apache.drill.exec.vector.NullableIntVector to org.apache.drill.exec.vector.complex.RepeatedValueVector
> {noformat}
> Is it possible to ignore the empty arrays, or do they need to be populated with dummy data?
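The "ignore empty arrays" behavior asked about in the issue can be illustrated outside Drill with a small sketch of the desired semantics (an illustration only, not a Drill workaround): flattening `t.a.b.c` should let a document with an empty array contribute zero rows instead of failing.

```python
import json

# The two documents from the issue: one with a populated array, one empty.
docs = [
    '{ "a": { "b": { "c": [ { "d": { "e": "f" } } ] } } }',
    '{ "a": { "b": { "c": [] } } }',
]

def flatten_c(lines):
    """Yield one row per element of a.b.c; an empty list simply yields
    nothing, so no dummy data is needed."""
    for line in lines:
        yield from json.loads(line)["a"]["b"]["c"]

matches = [row for row in flatten_c(docs) if row["d"]["e"] == "f"]
print(len(matches))  # -> 1
```

The failing Drill query corresponds to the final filter here: the empty-array document is silently skipped rather than raising an error.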
[jira] [Updated] (DRILL-4291) Ensure the jdbc-all driver jar includes classes required to return VarChar[]
[ https://issues.apache.org/jira/browse/DRILL-4291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Ollala updated DRILL-4291:
---------------------------------

    Reviewer: Khurram Faraaz

> Ensure the jdbc-all driver jar includes classes required to return VarChar[]
> ----------------------------------------------------------------------------
>
>                 Key: DRILL-4291
>                 URL: https://issues.apache.org/jira/browse/DRILL-4291
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Client - JDBC
>    Affects Versions: 0.5.0
>         Environment: Linux / 1.5-SNAPSHOT
>            Reporter: Stefán Baxter
>            Assignee: Jacques Nadeau
>             Fix For: 1.5.0
>
> Hi,
> We are using the 1.5-SNAPSHOT version of the JDBC driver (all) and we seem to be getting this old thing:
> https://issues.apache.org/jira/browse/DRILL-2482
> We are either doing something wrong or this is a regression. Has anyone else experienced not being able to get nested structures via the latest JDBC driver?
> (I'm going to pull the latest from master to be sure this has not been solved)
> The error we get when accessing a field containing a sub-structure is:
> java.lang.NoClassDefFoundError: org/apache/hadoop/io/Text
>     at oadd.org.apache.drill.exec.util.JsonStringArrayList.<init>(JsonStringArrayList.java:35)
>     at oadd.org.apache.drill.exec.vector.RepeatedVarCharVector$Accessor.getObject(RepeatedVarCharVector.java:293)
>     at oadd.org.apache.drill.exec.vector.RepeatedVarCharVector$Accessor.getObject(RepeatedVarCharVector.java:290)
>     at oadd.org.apache.drill.exec.vector.accessor.GenericAccessor.getObject(GenericAccessor.java:44)
>     at oadd.org.apache.drill.exec.vector.accessor.BoundCheckingAccessor.getObject(BoundCheckingAccessor.java:148)
>     at org.apache.drill.jdbc.impl.TypeConvertingSqlAccessor.getObject(TypeConvertingSqlAccessor.java:795)
>     at org.apache.drill.jdbc.impl.AvaticaDrillSqlAccessor.getObject(AvaticaDrillSqlAccessor.java:179)
>     at oadd.net.hydromatic.avatica.AvaticaResultSet.getObject(AvaticaResultSet.java:351)
>     at org.apache.drill.jdbc.impl.DrillResultSetImpl.getObject(DrillResultSetImpl.java:420)
> Regards,
> -Stefan
[jira] [Updated] (DRILL-4256) Performance regression in hive planning
[ https://issues.apache.org/jira/browse/DRILL-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rahul Challapalli updated DRILL-4256:
-------------------------------------

    Reviewer: Dechang Gu  (was: Rahul Challapalli)

> Performance regression in hive planning
> ---------------------------------------
>
>                 Key: DRILL-4256
>                 URL: https://issues.apache.org/jira/browse/DRILL-4256
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Hive, Query Planning & Optimization
>    Affects Versions: 1.5.0
>            Reporter: Rahul Challapalli
>            Assignee: Venki Korukanti
>             Fix For: 1.5.0
>
>         Attachments: jstack.tgz
>
> Commit # : 76f41e18207e3e3e987fef56ee7f1695dd6ddd7a
> The fix for reading hive tables backed by hbase caused a performance regression. The data set used in the below test has ~3700 partitions and the filter in the query would ensure only 1 partition gets selected.
> {code}
> Commit : 76f41e18207e3e3e987fef56ee7f1695dd6ddd7a
> Query  : explain plan for select count(*) from lineitem_partitioned where `year`=2015 and `month`=1 and `day`=1;
> Time   : ~25 seconds
> {code}
> {code}
> Commit : 1ea3d6c3f144614caf460648c1c27c6d0f5b06b8
> Query  : explain plan for select count(*) from lineitem_partitioned where `year`=2015 and `month`=1 and `day`=1;
> Time   : ~6.5 seconds
> {code}
> Since the data is large, I couldn't attach it here. Reach out to me if you need additional information.
[jira] [Updated] (DRILL-4196) some TPCDS queries return wrong result when hash join is disabled
[ https://issues.apache.org/jira/browse/DRILL-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Ollala updated DRILL-4196:
---------------------------------

    Reviewer: Victoria Markman

> some TPCDS queries return wrong result when hash join is disabled
> -----------------------------------------------------------------
>
>                 Key: DRILL-4196
>                 URL: https://issues.apache.org/jira/browse/DRILL-4196
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Victoria Markman
>            Assignee: amit hadke
>             Fix For: 1.5.0
>
>         Attachments: 1.5.0-amit-branch_tpcds_sf1.txt, query40.tar, query52.tar
>
> With hash join disabled, query52.sql and query40.sql returned incorrect results with 1.4.0:
> {noformat}
> version        : 1.4.0-SNAPSHOT
> commit_id      : b9068117177c3b47025f52c00f67938e0c3e4732
> commit_message : DRILL-4165 Add a precondition for size of merge join record batch.
> commit_time    : 08.12.2015 @ 01:25:34 UTC
> build_email    : Unknown
> build_time     : 08.12.2015 @ 03:36:25 UTC
> 1 row selected (2.13 seconds)
> {noformat}
> Setup and options are the same as in DRILL-4190.
> See attached queries (.sql), expected result (.e_tsv) and actual output (.out)
[jira] [Updated] (DRILL-4322) User exception created upon failed DROP TABLE eats the underlying exception
[ https://issues.apache.org/jira/browse/DRILL-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Ollala updated DRILL-4322: - Reviewer: Chun Chang > User exception created upon failed DROP TABLE eats the underlying exception > --- > > Key: DRILL-4322 > URL: https://issues.apache.org/jira/browse/DRILL-4322 > Project: Apache Drill > Issue Type: Bug >Reporter: Jason Altekruse >Assignee: Jason Altekruse > Fix For: 1.5.0 > > > Reported in this thread on the list: > http://mail-archives.apache.org/mod_mbox/drill-user/201601.mbox/%3CCAMpYv7Cd%2BRuj5L5RAOOe4CoVNxjU6HOSuH2m0XTcyzjzuKiadw%40mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-4163) Support schema changes for MergeJoin operator.
[ https://issues.apache.org/jira/browse/DRILL-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Ollala updated DRILL-4163: - Reviewer: Victoria Markman > Support schema changes for MergeJoin operator. > -- > > Key: DRILL-4163 > URL: https://issues.apache.org/jira/browse/DRILL-4163 > Project: Apache Drill > Issue Type: Improvement >Reporter: amit hadke >Assignee: Jason Altekruse > Fix For: 1.5.0 > > > Since the external sort operator supports schema changes, allow use of union > types in merge join to support schema changes. > For now, we assume that merge join always works on record batches from the sort > operator. Thus merging schemas and promoting to union vectors is already > taken care of by the sort operator. > Test Cases: > 1) Only one side changes schema (join on union type and primitive type) > 2) Both sides change schema on all columns. > 3) Join between numeric types and string types. > 4) Missing columns - each batch has different columns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4349) parquet reader returns wrong results when reading a nullable column that starts with a large number of nulls (>30k)
[ https://issues.apache.org/jira/browse/DRILL-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131443#comment-15131443 ] ASF GitHub Bot commented on DRILL-4349: --- GitHub user adeneche opened a pull request: https://github.com/apache/drill/pull/356 DRILL-4349: parquet reader returns wrong results when reading a nulla… …ble column that starts with a large number of nulls (>30k) You can merge this pull request into a Git repository by running: $ git pull https://github.com/adeneche/incubator-drill DRILL-4349 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/356.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #356 commit a1bc0c7dc11a117d18b6ae74b91e6390138be20f Author: adeneche Date: 2016-02-03T23:42:22Z DRILL-4349: parquet reader returns wrong results when reading a nullable column that starts with a large number of nulls (>30k) > parquet reader returns wrong results when reading a nullable column that > starts with a large number of nulls (>30k) > --- > > Key: DRILL-4349 > URL: https://issues.apache.org/jira/browse/DRILL-4349 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Parquet >Affects Versions: 1.4.0 >Reporter: Deneche A. Hakim >Assignee: Deneche A. Hakim >Priority: Critical > Fix For: 1.6.0 > > Attachments: drill4349.tar.gz > > > While reading a nullable column, if in a single pass we only read null > values, the parquet reader resets the value of pageReader.readPosInBytes > which will lead to wrong data read from the file. > To reproduce the issue, create a csv file (repro.csv) with 2 columns (id, > val) with 50100 rows, where id equals the row number and val is empty for > the first 50k rows, and equal to id for the remaining rows. 
> create a parquet table from the csv file: > {noformat} > CREATE TABLE `repro_parquet` AS SELECT CAST(columns[0] AS INT) AS id, > CAST(NULLIF(columns[1], '') AS DOUBLE) AS val from `repro.csv`; > {noformat} > Now if you query any of the non null values you will get wrong results: > {noformat} > 0: jdbc:drill:zk=local> select * from `repro_parquet` where id>=5 limit > 10; > ++---+ > | id |val| > ++---+ > | 5 | 9.11337776337441E-309 | > | 50001 | 3.26044E-319 | > | 50002 | 1.4916681476489723E-154 | > | 50003 | 2.18890676| > | 50004 | 2.681561588521345E154 | > | 50005 | -2.1016574E-317 | > | 50006 | -1.4916681476489723E-154 | > | 50007 | -2.18890676 | > | 50008 | -2.681561588521345E154| > | 50009 | 2.1016574E-317| > ++---+ > 10 rows selected (0.238 seconds) > {noformat} > and here are the expected values: > {noformat} > 0: jdbc:drill:zk=local> select * from `repro.csv` where cast(columns[0] as > int)>=5 limit 10; > ++ > | columns | > ++ > | ["5","5"] | > | ["50001","50001"] | > | ["50002","50002"] | > | ["50003","50003"] | > | ["50004","50004"] | > | ["50005","50005"] | > | ["50006","50006"] | > | ["50007","50007"] | > | ["50008","50008"] | > | ["50009","50009"] | > ++ > {noformat} > I confirmed that the file is written correctly and the issue is in the > parquet reader (already have a fix for it) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
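As a concrete illustration of the repro steps above, a short script could generate the repro.csv described in the report (50100 rows, id equal to the row number, val empty for the first 50k rows). The script itself is not part of the JIRA issue; the file name and row counts are taken from the description.

```python
# Generate repro.csv as described in DRILL-4349: id = row number,
# val empty for the first 50,000 rows, then equal to id.
import csv

TOTAL_ROWS = 50100
NULL_ROWS = 50000

with open("repro.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for row_id in range(1, TOTAL_ROWS + 1):
        val = "" if row_id <= NULL_ROWS else str(row_id)
        writer.writerow([row_id, val])
```

Running the CTAS from the report against this file should then reproduce the wrong DOUBLE values for ids above 50000.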
[jira] [Commented] (DRILL-4323) Hive Native Reader : A simple count(*) throws Incoming batch has an empty schema error
[ https://issues.apache.org/jira/browse/DRILL-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131360#comment-15131360 ] ASF GitHub Bot commented on DRILL-4323: --- Github user jinfengni commented on the pull request: https://github.com/apache/drill/pull/349#issuecomment-179526708 LGTM. +1 > Hive Native Reader : A simple count(*) throws Incoming batch has an empty > schema error > -- > > Key: DRILL-4323 > URL: https://issues.apache.org/jira/browse/DRILL-4323 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Hive >Affects Versions: 1.5.0 >Reporter: Rahul Challapalli >Assignee: Sean Hsuan-Yi Chu >Priority: Critical > Attachments: error.log > > > git.commit.id.abbrev=3d0b4b0 > A simple count(*) query does not work when hive native reader is enabled > {code} > 0: jdbc:drill:zk=10.10.100.190:5181> select count(*) from customer; > +-+ > | EXPR$0 | > +-+ > | 10 | > +-+ > 1 row selected (3.074 seconds) > 0: jdbc:drill:zk=10.10.100.190:5181> alter session set > `store.hive.optimize_scan_with_native_readers` = true; > +---++ > | ok |summary | > +---++ > | true | store.hive.optimize_scan_with_native_readers updated. | > +---++ > 1 row selected (0.2 seconds) > 0: jdbc:drill:zk=10.10.100.190:5181> select count(*) from customer; > Error: SYSTEM ERROR: IllegalStateException: Incoming batch [#1341, > ProjectRecordBatch] has an empty schema. This is not allowed. 
> Fragment 0:0 > [Error Id: 4c867440-0fd3-4eda-922f-0f5eadcb1463 on qa-node191.qa.lab:31010] > (state=,code=0) > {code} > Hive DDL for the table : > {code} > create table customer > ( > c_customer_sk int, > c_customer_id string, > c_current_cdemo_sk int, > c_current_hdemo_sk int, > c_current_addr_sk int, > c_first_shipto_date_sk int, > c_first_sales_date_sk int, > c_salutation string, > c_first_name string, > c_last_name string, > c_preferred_cust_flag string, > c_birth_day int, > c_birth_month int, > c_birth_year int, > c_birth_country string, > c_login string, > c_email_address string, > c_last_review_date string > ) > STORED AS PARQUET > LOCATION '/drill/testdata/customer' > {code} > Attached the log file with the stacktrace -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4256) Performance regression in hive planning
[ https://issues.apache.org/jira/browse/DRILL-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131465#comment-15131465 ] Rahul Challapalli commented on DRILL-4256: -- I manually verified the fix and it looks good! > Performance regression in hive planning > --- > > Key: DRILL-4256 > URL: https://issues.apache.org/jira/browse/DRILL-4256 > Project: Apache Drill > Issue Type: Bug > Components: Functions - Hive, Query Planning & Optimization >Affects Versions: 1.5.0 >Reporter: Rahul Challapalli >Assignee: Venki Korukanti > Fix For: 1.5.0 > > Attachments: jstack.tgz > > > Commit # : 76f41e18207e3e3e987fef56ee7f1695dd6ddd7a > The fix for reading hive tables backed by hbase caused a performance > regression. The data set used in the below test has ~3700 partitions and the > filter in the query would ensure only 1 partition gets selected. > {code} > Commit : 76f41e18207e3e3e987fef56ee7f1695dd6ddd7a > Query : explain plan for select count(*) from lineitem_partitioned where > `year`=2015 and `month`=1 and `day` =1; > Time : ~25 seconds > {code} > {code} > Commit : 1ea3d6c3f144614caf460648c1c27c6d0f5b06b8 > Query : explain plan for select count(*) from lineitem_partitioned where > `year`=2015 and `month`=1 and `day` =1; > Time : ~6.5 seconds > {code} > Since the data is large, I couldn't attach it here. Reach out to me if you > need additional information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-4349) parquet reader returns wrong results when reading a nullable column that starts with a large number of nulls (>30k)
[ https://issues.apache.org/jira/browse/DRILL-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chun Chang updated DRILL-4349: -- Reviewer: Chun Chang > parquet reader returns wrong results when reading a nullable column that > starts with a large number of nulls (>30k) > --- > > Key: DRILL-4349 > URL: https://issues.apache.org/jira/browse/DRILL-4349 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Parquet >Affects Versions: 1.4.0 >Reporter: Deneche A. Hakim >Assignee: Parth Chandra >Priority: Critical > Fix For: 1.6.0 > > Attachments: drill4349.tar.gz > > > While reading a nullable column, if in a single pass we only read null > values, the parquet reader resets the value of pageReader.readPosInBytes > which will lead to wrong data read from the file. > To reproduce the issue, create a csv file (repro.csv) with 2 columns (id, > val) with 50100 rows, where id equals to the row number and val is empty for > the first 50k rows, and equal to id for the remaining rows. 
> create a parquet table from the csv file: > {noformat} > CREATE TABLE `repro_parquet` AS SELECT CAST(columns[0] AS INT) AS id, > CAST(NULLIF(columns[1], '') AS DOUBLE) AS val from `repro.csv`; > {noformat} > Now if you query any of the non null values you will get wrong results: > {noformat} > 0: jdbc:drill:zk=local> select * from `repro_parquet` where id>=5 limit > 10; > ++---+ > | id |val| > ++---+ > | 5 | 9.11337776337441E-309 | > | 50001 | 3.26044E-319 | > | 50002 | 1.4916681476489723E-154 | > | 50003 | 2.18890676| > | 50004 | 2.681561588521345E154 | > | 50005 | -2.1016574E-317 | > | 50006 | -1.4916681476489723E-154 | > | 50007 | -2.18890676 | > | 50008 | -2.681561588521345E154| > | 50009 | 2.1016574E-317| > ++---+ > 10 rows selected (0.238 seconds) > {noformat} > and here are the expected values: > {noformat} > 0: jdbc:drill:zk=local> select * from `repro.csv` where cast(columns[0] as > int)>=5 limit 10; > ++ > | columns | > ++ > | ["5","5"] | > | ["50001","50001"] | > | ["50002","50002"] | > | ["50003","50003"] | > | ["50004","50004"] | > | ["50005","50005"] | > | ["50006","50006"] | > | ["50007","50007"] | > | ["50008","50008"] | > | ["50009","50009"] | > ++ > {noformat} > I confirmed that the file is written correctly and the issue is in the > parquet reader (already have a fix for it) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-4350) Error messages need to be escaped in Web UI
Nathan Smith created DRILL-4350: --- Summary: Error messages need to be escaped in Web UI Key: DRILL-4350 URL: https://issues.apache.org/jira/browse/DRILL-4350 Project: Apache Drill Issue Type: Bug Affects Versions: 1.4.0 Reporter: Nathan Smith Priority: Trivial Some error messages don't display properly in the HTML web UI. I believe that adding the [html built-in|http://freemarker.incubator.apache.org/docs/ref_builtins_string.html#ref_builtin_html] to some of the templates would fix this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
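For illustration of the escaping the FreeMarker `html` built-in performs, Python's standard-library `html.escape` applies the same substitutions (`&`, `<`, `>`, quotes). The sample error message below is an assumption for demonstration, not text from the report.

```python
# Sketch of the escaping that FreeMarker's ?html built-in would apply to
# error messages before they are embedded in the web UI's HTML.
import html

# Hypothetical error message containing characters unsafe in HTML:
error_message = 'Expected <EOF> but found "limit" & more'
escaped = html.escape(error_message)
# escaped can now be embedded in an HTML page without breaking markup
```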
[jira] [Commented] (DRILL-4250) File system directory-based partition pruning does not work when a directory contains both subdirectories and files.
[ https://issues.apache.org/jira/browse/DRILL-4250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131533#comment-15131533 ] Rahul Challapalli commented on DRILL-4250: -- verified and added a testcase > File system directory-based partition pruning does not work when a directory > contains both subdirectories and files. > -- > > Key: DRILL-4250 > URL: https://issues.apache.org/jira/browse/DRILL-4250 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Reporter: Jinfeng Ni >Assignee: Jinfeng Ni > Fix For: 1.5.0 > > > When a directory contains both subdirectories and files, then the > directory-based partition pruning would not work. > For example, I have the following directory structure with nation.parquet > (copied from tpch sample dataset). > .//2001/Q1/nation.parquet > .//2001/Q2/nation.parquet > The following query has the directory-based partition pruning work correctly. > > {code} > explain plan for select * from dfs.tmp.fileAndDir where dir0 = 2001 and dir1 > = 'Q1'; > 00-00Screen > 00-01 Project(*=[$0]) > 00-02Project(*=[$0]) > 00-03 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath > [path=file:/tmp/fileAndDir/2001/Q1/nation.parquet]], > selectionRoot=file:/tmp/fileAndDir, numFiles=1, usedMetadataFile=false, > columns=[`*`]]]) > {code} > However, if I add a nation.parquet file to 2001 directory, like the following: > .//2001/nation.parquet > .//2001/Q1/nation.parquet > .//2001/Q2/nation.parquet > Then, the same query will not have the partition pruning applied. 
> {code} > explain plan for select * from dfs.tmp.fileAndDir where dir0 = 2001 and dir1 > = 'Q1'; > +--+--+ > | text | json | > +--+--+ > | 00-00Screen > 00-01 Project(*=[$0]) > 00-02Project(T0¦¦*=[$0]) > 00-03 SelectionVectorRemover > 00-04Filter(condition=[AND(=($1, 2001), =($2, 'Q1'))]) > 00-05 Project(T0¦¦*=[$0], dir0=[$1], dir1=[$2]) > 00-06Scan(groupscan=[ParquetGroupScan > [entries=[ReadEntryWithPath [path=file:/tmp/fileAndDir/2001/nation.parquet], > ReadEntryWithPath [path=file:/tmp/fileAndDir/2001/Q1/nation.parquet], > ReadEntryWithPath [path=file:/tmp/fileAndDir/2001/Q2/nation.parquet]], > selectionRoot=file:/tmp/fileAndDir, numFiles=3, usedMetadataFile=false, > columns=[`*`]]]) > {code} > I should note that for the second case where partition pruning did not work, > the query did return the correct result. Therefore, this issue only impacts > the query performance, not the query result. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
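The mixed files-and-subdirectories layout that defeats pruning can be recreated with a short sketch. The paths come from the report; the empty placeholder files standing in for real nation.parquet data are an assumption for illustration only.

```python
# Recreate the DRILL-4250 repro layout under /tmp/fileAndDir.
import os

root = "/tmp/fileAndDir"
for sub in ("2001/Q1", "2001/Q2"):
    os.makedirs(os.path.join(root, sub), exist_ok=True)
    # Placeholder standing in for a real nation.parquet file.
    open(os.path.join(root, sub, "nation.parquet"), "w").close()

# Adding a file directly under 2001/, alongside the Q1/ and Q2/
# subdirectories, is what disables partition pruning in the report:
open(os.path.join(root, "2001", "nation.parquet"), "w").close()
```

With only the two Q1/Q2 files present, the explain plan above shows a pruned scan (numFiles=1); after the third file is added, the same query scans all three files.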
[jira] [Closed] (DRILL-4250) File system directory-based partition pruning does not work when a directory contains both subdirectories and files.
[ https://issues.apache.org/jira/browse/DRILL-4250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rahul Challapalli closed DRILL-4250. > File system directory-based partition pruning does not work when a directory > contains both subdirectories and files. > -- > > Key: DRILL-4250 > URL: https://issues.apache.org/jira/browse/DRILL-4250 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Reporter: Jinfeng Ni >Assignee: Jinfeng Ni > Fix For: 1.5.0 > > > When a directory contains both subdirectories and files, then the > directory-based partition pruning would not work. > For example, I have the following directory structure with nation.parquet > (copied from tpch sample dataset). > .//2001/Q1/nation.parquet > .//2001/Q2/nation.parquet > The following query has the directory-based partition pruning work correctly. > > {code} > explain plan for select * from dfs.tmp.fileAndDir where dir0 = 2001 and dir1 > = 'Q1'; > 00-00Screen > 00-01 Project(*=[$0]) > 00-02Project(*=[$0]) > 00-03 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath > [path=file:/tmp/fileAndDir/2001/Q1/nation.parquet]], > selectionRoot=file:/tmp/fileAndDir, numFiles=1, usedMetadataFile=false, > columns=[`*`]]]) > {code} > However, if I add a nation.parquet file to 2001 directory, like the following: > .//2001/nation.parquet > .//2001/Q1/nation.parquet > .//2001/Q2/nation.parquet > Then, the same query will not have the partition pruning applied. 
> {code} > explain plan for select * from dfs.tmp.fileAndDir where dir0 = 2001 and dir1 > = 'Q1'; > +--+--+ > | text | json | > +--+--+ > | 00-00Screen > 00-01 Project(*=[$0]) > 00-02Project(T0¦¦*=[$0]) > 00-03 SelectionVectorRemover > 00-04Filter(condition=[AND(=($1, 2001), =($2, 'Q1'))]) > 00-05 Project(T0¦¦*=[$0], dir0=[$1], dir1=[$2]) > 00-06Scan(groupscan=[ParquetGroupScan > [entries=[ReadEntryWithPath [path=file:/tmp/fileAndDir/2001/nation.parquet], > ReadEntryWithPath [path=file:/tmp/fileAndDir/2001/Q1/nation.parquet], > ReadEntryWithPath [path=file:/tmp/fileAndDir/2001/Q2/nation.parquet]], > selectionRoot=file:/tmp/fileAndDir, numFiles=3, usedMetadataFile=false, > columns=[`*`]]]) > {code} > I should note that for the second case where partition pruning did not work, > the query did return the correct result. Therefore, this issue only impacts > the query performance, not the query result. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-4350) Error messages need to be escaped in Web UI
[ https://issues.apache.org/jira/browse/DRILL-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nathan Smith updated DRILL-4350: Flags: Patch > Error messages need to be escaped in Web UI > --- > > Key: DRILL-4350 > URL: https://issues.apache.org/jira/browse/DRILL-4350 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.4.0 >Reporter: Nathan Smith >Priority: Trivial > Attachments: DRILL-4350.patch > > > Some error messages don't display properly in the HTML web UI. I believe that > adding the [html > built-in|http://freemarker.incubator.apache.org/docs/ref_builtins_string.html#ref_builtin_html] > to some of the templates. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-4350) Error messages need to be escaped in Web UI
[ https://issues.apache.org/jira/browse/DRILL-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nathan Smith updated DRILL-4350: Attachment: DRILL-4350.patch > Error messages need to be escaped in Web UI > --- > > Key: DRILL-4350 > URL: https://issues.apache.org/jira/browse/DRILL-4350 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.4.0 >Reporter: Nathan Smith >Priority: Trivial > Attachments: DRILL-4350.patch > > > Some error messages don't display properly in the HTML web UI. I believe that > adding the [html > built-in|http://freemarker.incubator.apache.org/docs/ref_builtins_string.html#ref_builtin_html] > to some of the templates. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4347) Planning time for query64 from TPCDS test suite has increased 10 times compared to 1.4 release
[ https://issues.apache.org/jira/browse/DRILL-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131583#comment-15131583 ] Jinfeng Ni commented on DRILL-4347: --- This might be caused by the patch for DRILL-2517, although I have not figured out the reason. Judging from the jstack output, the planner seems to run into a loop when it uses the metadata provider: at org.apache.calcite.rel.metadata.ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(ChainedRelMetadataProvider.java:109) at com.sun.proxy.$Proxy62.getRowCount(Unknown Source) at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.calcite.rel.metadata.ChainedRelMetadataProvider$ChainedInvocationHandler.invoke(ChainedRelMetadataProvider.java:109) at com.sun.proxy.$Proxy62.getRowCount(Unknown Source) at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.calcite.rel.metadata.CachingRelMetadataProvider$CachingInvocationHandler.invoke(CachingRelMetadataProvider.java:132) at com.sun.proxy.$Proxy62.getRowCount(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:86) at org.apache.calcite.rel.SingleRel.getRows(SingleRel.java:68) at org.apache.calcite.rel.core.Aggregate.getRows(Aggregate.java:286) at org.apache.drill.exec.planner.physical.visitor.SwapHashJoinVisitor.visitJoin(SwapHashJoinVisitor.java:70) at org.apache.drill.exec.planner.physical.visitor.SwapHashJoinVisitor.visitJoin(SwapHashJoinVisitor.java:41) at org.apache.drill.exec.planner.physical.JoinPrel.accept(JoinPrel.java:60) at org.apache.drill.exec.planner.physical.visitor.SwapHashJoinVisitor.visitPrel(SwapHashJoinVisitor.java:57) at 
org.apache.drill.exec.planner.physical.visitor.SwapHashJoinVisitor.visitPrel(SwapHashJoinVisitor.java:41) at org.apache.drill.exec.planner.physical.FilterPrel.accept(FilterPrel.java:65) at org.apache.drill.exec.planner.physical.visitor.SwapHashJoinVisitor.visitPrel(SwapHashJoinVisitor.java:57) at org.apache.drill.exec.planner.physical.visitor.SwapHashJoinVisitor.visitPrel(SwapHashJoinVisitor.java:41) at org.apache.drill.exec.planner.physical.visitor.BasePrelVisitor.visitProject(BasePrelVisitor.java:48) at org.apache.drill.exec.planner.physical.ProjectPrel.accept(ProjectPrel.java:69) at org.apache.drill.exec.planner.physical.visitor.SwapHashJoinVisitor.visitPrel(SwapHashJoinVisitor.java:57) at org.apache.drill.exec.planner.physical.visitor.SwapHashJoinVisitor.visitPrel(SwapHashJoinVisitor.java:41) at org.apache.drill.exec.planner.physical.visitor.BasePrelVisitor.visitExchange(BasePrelVisitor.java:38) at org.apache.drill.exec.planner.physical.ExchangePrel.accept(ExchangePrel.java:34) at org.apache.drill.exec.planner.physical.visitor.SwapHashJoinVisitor.visitPrel(SwapHashJoinVisitor.java:57) at org.apache.drill.exec.planner.physical.visitor.SwapHashJoinVisitor.visitPrel(SwapHashJoinVisitor.java:41) at org.apache.drill.exec.planner.physical.SortPrel.accept(SortPrel.java:104) at org.apache.drill.exec.planner.physical.visitor.SwapHashJoinVisitor.visitPrel(SwapHashJoinVisitor.java:57) at org.apache.drill.exec.planner.physical.visitor.SwapHashJoinVisitor.visitPrel(SwapHashJoinVisitor.java:41) at org.apache.drill.exec.planner.physical.visitor.BasePrelVisitor.visitExchange(BasePrelVisitor.java:38) at org.apache.drill.exec.planner.physical.ExchangePrel.accept(ExchangePrel.java:34) at org.apache.drill.exec.planner.physical.visitor.SwapHashJoinVisitor.visitPrel(SwapHashJoinVisitor.java:57) at org.apache.drill.exec.planner.physical.visitor.SwapHashJoinVisitor.visitPrel(SwapHashJoinVisitor.java:41) at 
org.apache.drill.exec.planner.physical.visitor.BasePrelVisitor.visitProject(BasePrelVisitor.java:48) at org.apache.drill.exec.planner.physical.ProjectPrel.accept(ProjectPrel.java:69) at org.apache.drill.exec.planner.physical.visitor.SwapHashJoinVisitor.visitPrel(SwapHashJoinVisitor.java:57) at org.apache.drill.exec.planner.physical.visitor.SwapHashJoinVisitor.visitPrel(SwapHashJoinVisitor.java:41) at org.apache.drill.exec.planner.physical.visitor.BasePrelVisitor.visitScreen(BasePrelVisitor.java:53) at org.apache.drill.exec.planner.physical.ScreenPrel.accept(ScreenPrel.java:64) at
[jira] [Updated] (DRILL-4328) Fix for backward compatibility regression caused by DRILL-4198
[ https://issues.apache.org/jira/browse/DRILL-4328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Ollala updated DRILL-4328: - Reviewer: Rahul Challapalli > Fix for backward compatibility regression caused by DRILL-4198 > -- > > Key: DRILL-4328 > URL: https://issues.apache.org/jira/browse/DRILL-4328 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Other >Reporter: Venki Korukanti >Assignee: Venki Korukanti > Fix For: 1.5.0 > > > Revert updates made to StoragePlugin interface in DRILL-4198. Instead add the > new methods to AbstractStoragePlugin. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (DRILL-4322) User exception created upon failed DROP TABLE eats the underlying exception
[ https://issues.apache.org/jira/browse/DRILL-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chun Chang closed DRILL-4322. - I guess the following stack trace shows we do not eat underlying exception anymore. The best I can do to verify the fix. {noformat} 2016-01-26 05:22:06,292 [29588d02-6fc1-3e49-4e4b-de4cc6205538:foreman] ERROR o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR: NullPointerException [Error Id: ac0f28f9-b569-4eac-af54-bedbce33278a on atsqa6c61.qa.lab:31010] org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: NullPointerException [Error Id: ac0f28f9-b569-4eac-af54-bedbce33278a on atsqa6c61.qa.lab:31010] at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543) ~[drill-common-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT] at org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:746) [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT] at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:858) [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT] at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:790) [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT] at org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) [drill-common-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT] at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:792) [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT] at org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:909) [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT] at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:261) [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45] at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45] Caused by: 
org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception during fragment initialization: Internal error: Error while applying rule DrillTableRule, args [rel#6431439:EnumerableTableScan.ENUMERABLE.ANY([]).[](table=[hive, lineitem_text_partitioned_hive_hier_intstring])] ... 4 common frames omitted Caused by: java.lang.AssertionError: Internal error: Error while applying rule DrillTableRule, args [rel#6431439:EnumerableTableScan.ENUMERABLE.ANY([]).[](table=[hive, lineitem_text_partitioned_hive_hier_intstring])] at org.apache.calcite.util.Util.newInternal(Util.java:792) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10] at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:251) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10] at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10] at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10] at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:313) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10] at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.doLogicalPlanning(DefaultSqlHandler.java:542) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT] at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:218) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT] at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:252) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT] at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:172) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT] at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:199) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT] at 
org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:924) [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT] at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:250) [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT] ... 3 common frames omitted Caused by: java.lang.AssertionError: Internal error: Error occurred while applying rule DrillTableRule at org.apache.calcite.util.Util.newInternal(Util.java:792) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10] at org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:150) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10] at org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:213)
[jira] [Commented] (DRILL-4349) parquet reader returns wrong results when reading a nullable column that starts with a large number of nulls (>30k)
[ https://issues.apache.org/jira/browse/DRILL-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131444#comment-15131444 ] ASF GitHub Bot commented on DRILL-4349: --- Github user adeneche commented on the pull request: https://github.com/apache/drill/pull/356#issuecomment-179551062 @parthchandra can you please review ? thanks > parquet reader returns wrong results when reading a nullable column that > starts with a large number of nulls (>30k) > --- > > Key: DRILL-4349 > URL: https://issues.apache.org/jira/browse/DRILL-4349 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Parquet >Affects Versions: 1.4.0 >Reporter: Deneche A. Hakim >Assignee: Parth Chandra >Priority: Critical > Fix For: 1.6.0 > > Attachments: drill4349.tar.gz > > > While reading a nullable column, if in a single pass we only read null > values, the parquet reader resets the value of pageReader.readPosInBytes > which will lead to wrong data read from the file. > To reproduce the issue, create a csv file (repro.csv) with 2 columns (id, > val) with 50100 rows, where id equals to the row number and val is empty for > the first 50k rows, and equal to id for the remaining rows. 
> create a parquet table from the csv file: > {noformat} > CREATE TABLE `repro_parquet` AS SELECT CAST(columns[0] AS INT) AS id, > CAST(NULLIF(columns[1], '') AS DOUBLE) AS val from `repro.csv`; > {noformat} > Now if you query any of the non null values you will get wrong results: > {noformat} > 0: jdbc:drill:zk=local> select * from `repro_parquet` where id>=5 limit > 10; > ++---+ > | id |val| > ++---+ > | 5 | 9.11337776337441E-309 | > | 50001 | 3.26044E-319 | > | 50002 | 1.4916681476489723E-154 | > | 50003 | 2.18890676| > | 50004 | 2.681561588521345E154 | > | 50005 | -2.1016574E-317 | > | 50006 | -1.4916681476489723E-154 | > | 50007 | -2.18890676 | > | 50008 | -2.681561588521345E154| > | 50009 | 2.1016574E-317| > ++---+ > 10 rows selected (0.238 seconds) > {noformat} > and here are the expected values: > {noformat} > 0: jdbc:drill:zk=local> select * from `repro.csv` where cast(columns[0] as > int)>=5 limit 10; > ++ > | columns | > ++ > | ["5","5"] | > | ["50001","50001"] | > | ["50002","50002"] | > | ["50003","50003"] | > | ["50004","50004"] | > | ["50005","50005"] | > | ["50006","50006"] | > | ["50007","50007"] | > | ["50008","50008"] | > | ["50009","50009"] | > ++ > {noformat} > I confirmed that the file is written correctly and the issue is in the > parquet reader (already have a fix for it) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-4349) parquet reader returns wrong results when reading a nullable column that starts with a large number of nulls (>30k)
[ https://issues.apache.org/jira/browse/DRILL-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deneche A. Hakim updated DRILL-4349: Assignee: Parth Chandra (was: Deneche A. Hakim) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-4351) SQL keyword CURRENT is used as column name in sys.drillbits
[ https://issues.apache.org/jira/browse/DRILL-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Girish updated DRILL-4351: --- Priority: Minor (was: Major) > SQL keyword CURRENT is used as column name in sys.drillbits > --- > > Key: DRILL-4351 > URL: https://issues.apache.org/jira/browse/DRILL-4351 > Project: Apache Drill > Issue Type: Bug > Components: SQL Parser >Affects Versions: 1.5.0 >Reporter: Khurram Faraaz >Priority: Minor > > current is a SQL keyword, we will have to rename it to a different word, say > foreman, in sys.drillbits table. We hit this issue when user tries to project > the column named current in sys.drillbits > Drill git commit ID : 6a36a704 (1.5.0-SNAPSHOT) > JDK8 > current is a keyword in these DBMSs > MS SQL Server > Oracle > DB2 > {noformat} > 0: jdbc:drill:schema=dfs.tmp> select * from sys.drillbits; > +---++---++--+ > | hostname | user_port | control_port | data_port | current | > +---++---++--+ > | centos-03.qa.lab | 31010 | 31011 | 31012 | true | > | centos-01.qa.lab | 31010 | 31011 | 31012 | false| > | centos-02.qa.lab | 31010 | 31011 | 31012 | false| > | centos-04.qa.lab | 31010 | 31011 | 31012 | false| > +---++---++--+ > 4 rows selected (0.238 seconds) > 0: jdbc:drill:schema=dfs.tmp> select hostname from sys.drillbits; > +---+ > | hostname | > +---+ > | centos-03.qa.lab | > | centos-01.qa.lab | > | centos-02.qa.lab | > | centos-04.qa.lab | > +---+ > 4 rows selected (0.178 seconds) > 0: jdbc:drill:schema=dfs.tmp> select current from sys.drillbits; > Error: PARSE ERROR: Encountered "current from" at line 1, column 8. > Was expecting one of: > "UNION" ... > "INTERSECT" ... > "EXCEPT" ... > ... > "EXISTS" ... > "(" ... 
> while parsing SQL query: > select current from sys.drillbits >^ > [Error Id: 07e0161c-9c36-44bb-b584-40fef11fa9e8 on centos-03.qa.lab:31010] > (state=,code=0) > > {noformat} > From drillbit.log > > {noformat} > [Error Id: 07e0161c-9c36-44bb-b584-40fef11fa9e8 ] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543) > ~[drill-common-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT] > at > org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:164) > [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:924) > [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT] > at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:250) > [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > [na:1.8.0_65] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_65] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_65] > Caused by: org.apache.calcite.sql.parser.SqlParseException: Encountered > "current from" at line 1, column 8. > Was expecting one of: > "UNION" ... > "INTERSECT" ... > "EXCEPT" ... > "ORDER" ... > "LIMIT" ... > "OFFSET" ... > "FETCH" ... > "STREAM" ... > "DISTINCT" ... > "ALL" ... > "*" ... > "+" ... > "-" ... > ... > ... > ... > ... > ... > ... > ... > "TRUE" ... > "FALSE" ... > "UNKNOWN" ... > "NULL" ... > ... > ... > ... > "DATE" ... > "TIME" ... > "TIMESTAMP" ... > "INTERVAL" ... > "?" ... > "CAST" ... > "EXTRACT" ... > "POSITION" ... > "CONVERT" ... > "TRANSLATE" ... > "OVERLAY" ... > "FLOOR" ... > "CEIL" ... > "CEILING" ... > "SUBSTRING" ... > "TRIM" ... > ... > "MULTISET" ... > "ARRAY" ... > "SPECIFIC" ... > ... > ... > ... > ... > ... > "ABS" ... > "AVG" ... > "CARDINALITY" ... > "CHAR_LENGTH" ... > "CHARACTER_LENGTH" ... > "COALESCE" ... > "COLLECT" ... > "COVAR_POP" ... > "COVAR_SAMP" ... > "CUME_DIST" ... 
> "COUNT" ... > "CURRENT_DATE" ... > "CURRENT_TIME" ... > "CURRENT_TIMESTAMP" ... > "DENSE_RANK" ... > "ELEMENT" ... > "EXP" ... > "FIRST_VALUE" ... > "FUSION" ... > "GROUPING" ... > "LAST_VALUE" ... > "LN" ... > "LOCALTIME" ... > "LOCALTIMESTAMP" ... > "LOWER" ... > "MAX" ... > "MIN" ... > "MOD" ... >
[jira] [Commented] (DRILL-4351) SQL keyword CURRENT is used as column name in sys.drillbits
[ https://issues.apache.org/jira/browse/DRILL-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131764#comment-15131764 ] Khurram Faraaz commented on DRILL-4351: --- Enclosing the column name in back-quotes is a workaround. Still, it is not advisable to use reserved words/keywords as column names in tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
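For reference, the back-quote workaround mentioned in the comments looks like this in sqlline (standard Drill identifier quoting):
{code}
-- Back-quotes make the parser treat the reserved word as an ordinary identifier.
SELECT `current` FROM sys.drillbits;
{code}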
[jira] [Commented] (DRILL-4347) Planning time for query64 from TPCDS test suite has increased 10 times compared to 1.4 release
[ https://issues.apache.org/jira/browse/DRILL-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131721#comment-15131721 ] Zelaine Fong commented on DRILL-4347: - [~jni] - that was my initial thought as well. If so, then this regression should not be the 1.5 release candidate, since that was branched earlier than your change for DRILL-2517. Correct? > Planning time for query64 from TPCDS test suite has increased 10 times > compared to 1.4 release > -- > > Key: DRILL-4347 > URL: https://issues.apache.org/jira/browse/DRILL-4347 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Affects Versions: 1.5.0 >Reporter: Victoria Markman > Attachments: 294e9fb9-cdda-a89f-d1a7-b852878926a1.sys.drill_1.4.0, > 294ea418-9fb8-3082-1725-74e3cfe38fe9.sys.drill_1.5.0 > > > mapr-drill-1.5.0.201602012001-1.noarch.rpm > {code} > 0: jdbc:drill:schema=dfs> WITH cs_ui > . . . . . . . . . . . . > AS (SELECT cs_item_sk, > . . . . . . . . . . . . > Sum(cs_ext_list_price) AS sale, > . . . . . . . . . . . . > Sum(cr_refunded_cash + > cr_reversed_charge > . . . . . . . . . . . . > + cr_store_credit) AS refund > . . . . . . . . . . . . > FROM catalog_sales, > . . . . . . . . . . . . > catalog_returns > . . . . . . . . . . . . > WHERE cs_item_sk = cr_item_sk > . . . . . . . . . . . . > AND cs_order_number = > cr_order_number > . . . . . . . . . . . . > GROUP BY cs_item_sk > . . . . . . . . . . . . > HAVING Sum(cs_ext_list_price) > 2 * Sum( > . . . . . . . . . . . . > cr_refunded_cash + > cr_reversed_charge > . . . . . . . . . . . . > + cr_store_credit)), > . . . . . . . . . . . . > cross_sales > . . . . . . . . . . . . > AS (SELECT i_product_name product_name, > . . . . . . . . . . . . > i_item_sk item_sk, > . . . . . . . . . . . . > s_store_name store_name, > . . . . . . . . . . . . > s_zip store_zip, > . . . . . . . . . . . . > ad1.ca_street_number > b_street_number, > . . . . . . . . . . . . > ad1.ca_street_name > b_streen_name, > . 
. . . . . . . . . . . > ad1.ca_cityb_city, > . . . . . . . . . . . . > ad1.ca_zip b_zip, > . . . . . . . . . . . . > ad2.ca_street_number > c_street_number, > . . . . . . . . . . . . > ad2.ca_street_name > c_street_name, > . . . . . . . . . . . . > ad2.ca_cityc_city, > . . . . . . . . . . . . > ad2.ca_zip c_zip, > . . . . . . . . . . . . > d1.d_year AS syear, > . . . . . . . . . . . . > d2.d_year AS fsyear, > . . . . . . . . . . . . > d3.d_year s2year, > . . . . . . . . . . . . > Count(*) cnt, > . . . . . . . . . . . . > Sum(ss_wholesale_cost) s1, > . . . . . . . . . . . . > Sum(ss_list_price) s2, > . . . . . . . . . . . . > Sum(ss_coupon_amt) s3 > . . . . . . . . . . . . > FROM store_sales, > . . . . . . . . . . . . > store_returns, > . . . . . . . . . . . . > cs_ui, > . . . . . . . . . . . . > date_dim d1, > . . . . . . . . . . . . > date_dim d2, > . . . . . . . . . . . . > date_dim d3, > . . . . . . . . . . . . > store, > . . . . . . . . . . . . > customer, > . . . . . . . . . . . . > customer_demographics cd1, > . . . . . . . . . . . . > customer_demographics cd2, > . . . . . . . . . . . . > promotion, > . . . . . . . . . . . . > household_demographics hd1, > . . . . . . . . . . . . > household_demographics hd2, > . . . . . . . . . . . . > customer_address ad1, > . . . . . . . . . . . . > customer_address ad2, > . . . . . . . . . . . . > income_band ib1, > . . . . . . . . . . . . > income_band ib2, > . . . . . . . . . . . . > item > . . . . . . . . . . . . > WHERE ss_store_sk = s_store_sk > . . . . . . . . . . . . > AND ss_sold_date_sk = d1.d_date_sk > . . . . . . . . . . . . > AND ss_customer_sk = c_customer_sk > . . . . . . . . . . . . > AND ss_cdemo_sk = cd1.cd_demo_sk > . . . . . . . . . . . . > AND ss_hdemo_sk = hd1.hd_demo_sk > . . . .
[jira] [Created] (DRILL-4351) SQL keyword CURRENT is used as column name in sys.drillbits
Khurram Faraaz created DRILL-4351: - Summary: SQL keyword CURRENT is used as column name in sys.drillbits Key: DRILL-4351 URL: https://issues.apache.org/jira/browse/DRILL-4351 Project: Apache Drill Issue Type: Bug Components: SQL Parser Affects Versions: 1.5.0 Reporter: Khurram Faraaz current is a SQL keyword; we will have to rename the column to a different word, say foreman, in the sys.drillbits table. We hit this issue when a user tries to project the column named current in sys.drillbits. Drill git commit ID : 6a36a704 (1.5.0-SNAPSHOT), JDK8. current is a keyword in these DBMSs: MS SQL Server, Oracle, DB2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4347) Planning time for query64 from TPCDS test suite has increased 10 times compared to 1.4 release
[ https://issues.apache.org/jira/browse/DRILL-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131732#comment-15131732 ] Jinfeng Ni commented on DRILL-4347: --- [~zelaine], that's correct. The patch for DRILL-2517 is not included in the 1.5 release candidate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4351) SQL keyword CURRENT is used as column name in sys.drillbits
[ https://issues.apache.org/jira/browse/DRILL-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131752#comment-15131752 ] Abhishek Girish commented on DRILL-4351: I don't think this is a problem. You should be able to enclose the column name in back-quotes and get it working. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-4346) NumberFormatException when casting empty string to int in hbase/maprdb
Vitalii Diravka created DRILL-4346: -- Summary: NumberFormatException when casting empty string to int in hbase/maprdb Key: DRILL-4346 URL: https://issues.apache.org/jira/browse/DRILL-4346 Project: Apache Drill Issue Type: Bug Components: Functions - Drill Affects Versions: 1.2.0 Reporter: Vitalii Diravka Assignee: Vitalii Diravka Querying HBase values that contain no data and casting them to INTEGER results in a NumberFormatException:
{code}
Data
row1,1,2
row2,,4
row3,5,6
row4,7,8

Create table
$ maprcli table create -path /user/cmatta/projects/cmatta_test
$ maprcli table cf create -path /user/cmatta/projects/cmatta_test -cfname a

Load into HBase table:
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator=',' -Dimporttsv.columns=HBASE_ROW_KEY,a:c1,a:c2 /user/cmatta/projects/cmatta_test maprfs:///user/cmatta/projects/testdata_hbase_null
{code}
{code}
0: jdbc:drill:> select cast(x.`row_key` as varchar(128)) as `row_key`, CAST(x.`a`.`c1` as INTEGER) from maprfs.cmatta.`cmatta_test` x;
Error: SYSTEM ERROR: NumberFormatException:
Fragment 0:0
[Error Id: 05a0e5ed-d830-4926-a442-569c9d70d0b4 on se-node11.se.lab:31010] (state=,code=0)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
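A possible workaround, sketched here against the table above (the CONVERT_FROM/NULLIF combination is an editor's suggestion, not a fix from this issue), is to map empty values to NULL before the cast:
{code}
-- Decode the HBase bytes to a string, turn empty strings into NULL,
-- then cast; a NULL input stays NULL instead of failing the cast.
SELECT CAST(x.`row_key` AS VARCHAR(128)) AS `row_key`,
       CAST(NULLIF(CONVERT_FROM(x.`a`.`c1`, 'UTF8'), '') AS INTEGER) AS c1
FROM maprfs.cmatta.`cmatta_test` x;
{code}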
[jira] [Commented] (DRILL-3161) Drill JDBC driver not visible/auto-registered via Service Provider Mechanism
[ https://issues.apache.org/jira/browse/DRILL-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130644#comment-15130644 ] Vincent Uribe commented on DRILL-3161: -- I tried adding the java.sql.Driver file myself to the drill-jdbc-all-1.4.0.jar with no luck. The file is present but the driver does not seem to load. Any inputs? > Drill JDBC driver not visible/auto-registered via Service Provider Mechanism > > > Key: DRILL-3161 > URL: https://issues.apache.org/jira/browse/DRILL-3161 > Project: Apache Drill > Issue Type: Bug > Components: Client - JDBC >Reporter: Daniel Barclay (Drill) > Fix For: Future > > > Drill's JDBC driver is not automatically made visible to JDBC's DriverManager > and auto-registered, because it does not use Java's Service Provider > Mechanism as specified by JDBC 4.0. > This usually means that instead of just having to put the Drill JDBC driver > Jar file on the class path and use a Drill JDBC URL (one starting with > "{{jdbc:drill:}}"), users also have to configure their tools or code with the > name of the Drill driver class. > > The Drill JDBC driver's Jar file should contain a > {{META-INF/services/java.sql.Driver}} file that contains a line consisting of > the fully qualified name of the Drill JDBC driver class > ({{org.apache.drill.jdbc.Driver}}). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
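For context, JDBC 4.0 auto-registration relies on a service descriptor inside the driver jar, as the issue describes. An illustrative sketch of the required entry (not the contents of any shipped jar):
{noformat}
# inside the driver jar (e.g. drill-jdbc-all-1.4.0.jar):
META-INF/services/java.sql.Driver
# contents of that file (a single line):
org.apache.drill.jdbc.Driver
{noformat}
Until the descriptor ships, the pre-JDBC-4.0 fallback is to load the class explicitly with Class.forName("org.apache.drill.jdbc.Driver") before calling DriverManager.getConnection(...). Note that even with the descriptor present, the ServiceLoader lookup can fail if the jar is not on the application class path when DriverManager is first initialized.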
[jira] [Commented] (DRILL-3547) IndexOutOfBoundsException on directory with ~20 subdirectories
[ https://issues.apache.org/jira/browse/DRILL-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130694#comment-15130694 ] Zelaine Fong commented on DRILL-3547: - Not sure if this is related, but [~hgunes] made a fix in DRILL-2618 related to handling of queries over empty directories. However, it sounds like this problem relates to empty files, and not empty directories. > IndexOutOfBoundsException on directory with ~20 subdirectories > -- > > Key: DRILL-3547 > URL: https://issues.apache.org/jira/browse/DRILL-3547 > Project: Apache Drill > Issue Type: Bug > Components: Functions - Drill >Affects Versions: 1.1.0 > Environment: RHEL 7 >Reporter: Philip Deegan >Assignee: Arina Ielchiieva > Fix For: 1.6.0 > > > Tested on 1.1 with commit id: > {noformat} > 0: jdbc:drill:zk=local> select commit_id from sys.version; > +---+ > | commit_id | > +---+ > | e3fc7e97bfe712dc09d43a8a055a5135c96b7344 | > +---+ > {noformat} > Directory has child directories "a" to "u", each contain json files. > Running the query on each subdirectory indivudually does not cause an error. > {noformat} > java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: > IndexOutOfBoundsException: index: 0, length: 1 (expected: range(0, 0)) > Fragment 1:2 > [Error Id: 69a0879f-f718-4930-ae6f-c526de05528c on > ip-172-31-29-60.eu-central-1.compute.internal:31010] > at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73) > at > sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87) > at sqlline.TableOutputFormat.print(TableOutputFormat.java:118) > at sqlline.SqlLine.print(SqlLine.java:1583) > at sqlline.Commands.execute(Commands.java:852) > at sqlline.Commands.sql(Commands.java:751) > at sqlline.SqlLine.dispatch(SqlLine.java:738) > at sqlline.SqlLine.begin(SqlLine.java:612) > at sqlline.SqlLine.start(SqlLine.java:366) > at sqlline.SqlLine.main(SqlLine.java:259) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4122) Create unit test suite for checking quality of hashing for hash based operators
[ https://issues.apache.org/jira/browse/DRILL-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130729#comment-15130729 ] Sudheesh Katkam commented on DRILL-4122: [Skewness|http://mathworld.wolfram.com/Skewness.html] is another measure. This method could be implemented using a Drill UDAF based on the [online skewness algorithm|https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Higher-order_statistics]. The following query checks the skewness of 10^7 pseudorandom double values (between 0 inclusive and 1 exclusive): {code} WITH x AS (SELECT * FROM (VALUES 1,1,1,1,1,1,1,1,1,1) AS MyTable(key)) SELECT skewness(mod(abs(hash(random())), 10)) FROM x AS x1 INNER JOIN x AS x2 ON x2.key = x1.key INNER JOIN x AS x3 ON x3.key = x2.key INNER JOIN x AS x4 ON x4.key = x3.key INNER JOIN x AS x5 ON x5.key = x4.key INNER JOIN x AS x6 ON x6.key = x5.key INNER JOIN x AS x7 ON x7.key = x6.key; {code} > Create unit test suite for checking quality of hashing for hash based > operators > --- > > Key: DRILL-4122 > URL: https://issues.apache.org/jira/browse/DRILL-4122 > Project: Apache Drill > Issue Type: Bug > Components: Functions - Drill >Affects Versions: 1.3.0 >Reporter: Aman Sinha >Assignee: Sudheesh Katkam > > We have encountered substantial skew in the hash based operators (hash > distribution, hash aggregation, hash join) for certain data sets. Two such > issues are DRILL-2803, DRILL-4119. > It would be very useful to have a unit test suite to test the quality of > hashing. > The number of combinations is large: num_data_types x nullability x > num_hash_function_types (32bit, 64bit, AsDouble variations). Plus, the nature > of the data itself. We would have to be judicious about picking a > reasonable subset of this space. We should also look at open source test > suites in this area. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-4347) Planning time for query64 from TPCDS test suite has increased 10 times compared to 1.4 release
[ https://issues.apache.org/jira/browse/DRILL-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Altekruse updated DRILL-4347: --- Affects Version/s: (was: 0.5.0) 1.5.0
> Planning time for query64 from TPCDS test suite has increased 10 times
> compared to 1.4 release
> --
>
> Key: DRILL-4347
> URL: https://issues.apache.org/jira/browse/DRILL-4347
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Affects Versions: 1.5.0
> Reporter: Victoria Markman
> Attachments: 294e9fb9-cdda-a89f-d1a7-b852878926a1.sys.drill_1.4.0,
> 294ea418-9fb8-3082-1725-74e3cfe38fe9.sys.drill_1.5.0
[jira] [Created] (DRILL-4347) Planning time for query64 from TPCDS test suite has increased 10 times compared to 1.4 release
Victoria Markman created DRILL-4347: --- Summary: Planning time for query64 from TPCDS test suite has increased 10 times compared to 1.4 release Key: DRILL-4347 URL: https://issues.apache.org/jira/browse/DRILL-4347 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 0.5.0 Reporter: Victoria Markman mapr-drill-1.5.0.201602012001-1.noarch.rpm {code} 0: jdbc:drill:schema=dfs> WITH cs_ui . . . . . . . . . . . . > AS (SELECT cs_item_sk, . . . . . . . . . . . . > Sum(cs_ext_list_price) AS sale, . . . . . . . . . . . . > Sum(cr_refunded_cash + cr_reversed_charge . . . . . . . . . . . . > + cr_store_credit) AS refund . . . . . . . . . . . . > FROM catalog_sales, . . . . . . . . . . . . > catalog_returns . . . . . . . . . . . . > WHERE cs_item_sk = cr_item_sk . . . . . . . . . . . . > AND cs_order_number = cr_order_number . . . . . . . . . . . . > GROUP BY cs_item_sk . . . . . . . . . . . . > HAVING Sum(cs_ext_list_price) > 2 * Sum( . . . . . . . . . . . . > cr_refunded_cash + cr_reversed_charge . . . . . . . . . . . . > + cr_store_credit)), . . . . . . . . . . . . > cross_sales . . . . . . . . . . . . > AS (SELECT i_product_name product_name, . . . . . . . . . . . . > i_item_sk item_sk, . . . . . . . . . . . . > s_store_name store_name, . . . . . . . . . . . . > s_zip store_zip, . . . . . . . . . . . . > ad1.ca_street_number b_street_number, . . . . . . . . . . . . > ad1.ca_street_name b_streen_name, . . . . . . . . . . . . > ad1.ca_cityb_city, . . . . . . . . . . . . > ad1.ca_zip b_zip, . . . . . . . . . . . . > ad2.ca_street_number c_street_number, . . . . . . . . . . . . > ad2.ca_street_name c_street_name, . . . . . . . . . . . . > ad2.ca_cityc_city, . . . . . . . . . . . . > ad2.ca_zip c_zip, . . . . . . . . . . . . > d1.d_year AS syear, . . . . . . . . . . . . > d2.d_year AS fsyear, . . . . . . . . . . . . > d3.d_year s2year, . . . . . . . . . . . . > Count(*) cnt, . . . . . . . . . . . . > Sum(ss_wholesale_cost) s1, . . . 
. . . . . . . . . > Sum(ss_list_price) s2, . . . . . . . . . . . . > Sum(ss_coupon_amt) s3 . . . . . . . . . . . . > FROM store_sales, . . . . . . . . . . . . > store_returns, . . . . . . . . . . . . > cs_ui, . . . . . . . . . . . . > date_dim d1, . . . . . . . . . . . . > date_dim d2, . . . . . . . . . . . . > date_dim d3, . . . . . . . . . . . . > store, . . . . . . . . . . . . > customer, . . . . . . . . . . . . > customer_demographics cd1, . . . . . . . . . . . . > customer_demographics cd2, . . . . . . . . . . . . > promotion, . . . . . . . . . . . . > household_demographics hd1, . . . . . . . . . . . . > household_demographics hd2, . . . . . . . . . . . . > customer_address ad1, . . . . . . . . . . . . > customer_address ad2, . . . . . . . . . . . . > income_band ib1, . . . . . . . . . . . . > income_band ib2, . . . . . . . . . . . . > item . . . . . . . . . . . . > WHERE ss_store_sk = s_store_sk . . . . . . . . . . . . > AND ss_sold_date_sk = d1.d_date_sk . . . . . . . . . . . . > AND ss_customer_sk = c_customer_sk . . . . . . . . . . . . > AND ss_cdemo_sk = cd1.cd_demo_sk . . . . . . . . . . . . > AND ss_hdemo_sk = hd1.hd_demo_sk . . . . . . . . . . . . > AND ss_addr_sk = ad1.ca_address_sk . . . . . . . . . . . . > AND ss_item_sk = i_item_sk . . . . . . . . . . . . > AND ss_item_sk = sr_item_sk . . . . . . . . . . . . > AND ss_ticket_number = sr_ticket_number . . . . . . . . . . . . > AND ss_item_sk = cs_ui.cs_item_sk . . . . . . . . . . . . > AND c_current_cdemo_sk = cd2.cd_demo_sk . . . . . . . . . . . . > AND c_current_hdemo_sk = hd2.hd_demo_sk . . . . . . . . . . . . > AND c_current_addr_sk = ad2.ca_address_sk . . . . . . . . . . . . > AND c_first_sales_date_sk = d2.d_date_sk . . . .
[jira] [Updated] (DRILL-4347) Planning time for query64 from TPCDS test suite has increased 10 times compared to 1.4 release
[ https://issues.apache.org/jira/browse/DRILL-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Victoria Markman updated DRILL-4347: Attachment: 294ea418-9fb8-3082-1725-74e3cfe38fe9.sys.drill_1.5.0 294e9fb9-cdda-a89f-d1a7-b852878926a1.sys.drill_1.4.0
> Planning time for query64 from TPCDS test suite has increased 10 times
> compared to 1.4 release
> --
>
> Key: DRILL-4347
> URL: https://issues.apache.org/jira/browse/DRILL-4347
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Affects Versions: 0.5.0
> Reporter: Victoria Markman
> Attachments: 294e9fb9-cdda-a89f-d1a7-b852878926a1.sys.drill_1.4.0,
> 294ea418-9fb8-3082-1725-74e3cfe38fe9.sys.drill_1.5.0
[jira] [Created] (DRILL-4348) Date arithmetic issues
Nathan Smith created DRILL-4348: --- Summary: Date arithmetic issues Key: DRILL-4348 URL: https://issues.apache.org/jira/browse/DRILL-4348 Project: Apache Drill Issue Type: Bug Components: Execution - Data Types, Execution - Relational Operators Affects Versions: 1.4.0 Reporter: Nathan Smith
I have been encountering issues while trying to do date(time) arithmetic. According to https://issues.apache.org/jira/browse/DRILL-549, the subtraction operator should work with DATE types, but I get the following error when executing this type of query:
{code}
SELECT COUNT((CAST(tran_dttm AS DATE) - DATE '2012-07-01') < 0) AS before_july,
       COUNT((CAST(tran_dttm AS DATE) - DATE '2012-07-01') > 0) AS after_july
FROM dfs.root.ldg_tran_parquet
WHERE EXTRACT(year FROM tran_dttm) = 2012
{code}
{code}
2016-02-03 15:02:39,891 [qtp254153532-52] ERROR o.a.d.e.server.rest.QueryResources - Query from Web UI Failed
org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: From line 2, column 9 to line 2, column 59: Cannot apply '-' to arguments of type '<DATE> - <DATE>'. Supported form(s): '<NUMERIC> - <NUMERIC>' '<DATETIME_INTERVAL> - <DATETIME_INTERVAL>' '<DATETIME> - <DATETIME_INTERVAL>'
[Error Id: 73b5df2c-14f6-424c-83c7-af4a6158247c on drill-standalone.aunsight.office:31010]
(org.apache.calcite.tools.ValidationException) org.apache.calcite.runtime.CalciteContextException: From line 2, column 9 to line 2, column 59: Cannot apply '-' to arguments of type '<DATE> - <DATE>'. Supported form(s): '<NUMERIC> - <NUMERIC>' '<DATETIME_INTERVAL> - <DATETIME_INTERVAL>' '<DATETIME> - <DATETIME_INTERVAL>'
org.apache.calcite.prepare.PlannerImpl.validate():189 org.apache.calcite.prepare.PlannerImpl.validateAndGetType():198 org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode():451 org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():198 org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():167 org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():197 org.apache.drill.exec.work.foreman.Foreman.runSQL():909 org.apache.drill.exec.work.foreman.Foreman.run():244 java.util.concurrent.ThreadPoolExecutor.runWorker():1145 java.util.concurrent.ThreadPoolExecutor$Worker.run():615 java.lang.Thread.run():745
Caused By (org.apache.calcite.runtime.CalciteContextException) From line 2, column 9 to line 2, column 59: Cannot apply '-' to arguments of type '<DATE> - <DATE>'. Supported form(s): '<NUMERIC> - <NUMERIC>' '<DATETIME_INTERVAL> - <DATETIME_INTERVAL>' '<DATETIME> - <DATETIME_INTERVAL>'
sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2 sun.reflect.NativeConstructorAccessorImpl.newInstance():57 sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45 java.lang.reflect.Constructor.newInstance():526 org.apache.calcite.runtime.Resources$ExInstWithCause.ex():405 org.apache.calcite.sql.SqlUtil.newContextException():714 org.apache.calcite.sql.SqlUtil.newContextException():702 org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError():3931 org.apache.calcite.sql.SqlCallBinding.newValidationSignatureError():275 org.apache.calcite.sql.type.FamilyOperandTypeChecker.checkSingleOperandType():92 org.apache.calcite.sql.type.FamilyOperandTypeChecker.checkOperandTypes():109 org.apache.calcite.sql.type.CompositeOperandTypeChecker.checkOperandTypes():245 org.apache.calcite.sql.SqlOperator.checkOperandTypes():563 org.apache.calcite.sql.SqlOperator.validateOperands():420 org.apache.calcite.sql.SqlOperator.deriveType():487 org.apache.calcite.sql.SqlBinaryOperator.deriveType():143 org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit():4268
org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit():4255 org.apache.calcite.sql.SqlCall.accept():130 org.apache.calcite.sql.validate.SqlValidatorImpl.deriveTypeImpl():1495 org.apache.calcite.sql.validate.SqlValidatorImpl.deriveType():1478 org.apache.calcite.sql.SqlOperator.deriveType():483 org.apache.calcite.sql.SqlBinaryOperator.deriveType():143 org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit():4268 org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit():4255 org.apache.calcite.sql.SqlCall.accept():130 org.apache.calcite.sql.validate.SqlValidatorImpl.deriveTypeImpl():1495 org.apache.calcite.sql.validate.SqlValidatorImpl.deriveType():1478 org.apache.calcite.sql.SqlFunction.deriveType():288 org.apache.calcite.sql.SqlFunction.deriveType():230 org.apache.calcite.sql.fun.SqlCountAggFunction.deriveType():88 org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit():4268 org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit():4255 org.apache.calcite.sql.SqlCall.accept():130 org.apache.calcite.sql.validate.SqlValidatorImpl.deriveTypeImpl():1495 org.apache.calcite.sql.validate.SqlValidatorImpl.deriveType():1478
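The validation failure comes down to typing: under standard SQL semantics, subtracting two dates yields an interval, not a number, so comparing the difference against the integer literal 0 is ill-typed. Python's datetime module follows the same rule and can serve as a small illustration (this is an analogy, not Drill or Calcite code; the dates are taken from the query in the report):

```python
from datetime import date, timedelta

cutoff = date(2012, 7, 1)
d = date(2012, 6, 15)

# Subtracting two dates yields an interval (timedelta), not a number --
# the analogue of the "Cannot apply '-'" typing complaint above.
diff = d - cutoff
print(type(diff).__name__)  # timedelta

# Comparing the interval against a zero-length interval is well-typed:
print(diff < timedelta(0))  # True: d falls before the cutoff

# Or sidestep subtraction entirely and compare the dates directly:
print(d < cutoff)  # True
```

Comparing the dates directly (the last line) is the usual rewrite for queries like the one reported. Separately, note that COUNT over a boolean expression counts non-null values rather than true ones, so a conditional count is typically expressed as SUM over a CASE expression instead.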
[jira] [Updated] (DRILL-4348) Date arithmetic issues
[ https://issues.apache.org/jira/browse/DRILL-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nathan Smith updated DRILL-4348: Description: I have been encountering issues while trying to do date(time) arithmetic. According to https://issues.apache.org/jira/browse/DRILL-549, the subtraction operator should work with DATE types, but I get the following error when executing this type of query:
{code}
SELECT COUNT((CAST(tran_dttm AS DATE) - DATE '2012-07-01') < 0) AS before_july,
       COUNT((CAST(tran_dttm AS DATE) - DATE '2012-07-01') >= 0) AS after_july
FROM dfs.root.ldg_tran_parquet
WHERE EXTRACT(year FROM tran_dttm) = 2012
{code}
{code}
2016-02-03 15:02:39,891 [qtp254153532-52] ERROR o.a.d.e.server.rest.QueryResources - Query from Web UI Failed
org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: From line 2, column 9 to line 2, column 59: Cannot apply '-' to arguments of type '<DATE> - <DATE>'. Supported form(s): '<NUMERIC> - <NUMERIC>' '<DATETIME_INTERVAL> - <DATETIME_INTERVAL>' '<DATETIME> - <DATETIME_INTERVAL>'
[Error Id: 73b5df2c-14f6-424c-83c7-af4a6158247c on drill-standalone.aunsight.office:31010]
(org.apache.calcite.tools.ValidationException) org.apache.calcite.runtime.CalciteContextException: From line 2, column 9 to line 2, column 59: Cannot apply '-' to arguments of type '<DATE> - <DATE>'.
Supported form(s): '<NUMERIC> - <NUMERIC>' '<DATETIME_INTERVAL> - <DATETIME_INTERVAL>' '<DATETIME> - <DATETIME_INTERVAL>'