[jira] [Updated] (DRILL-3167) When a query fails, Foreman should wait for all fragments to finish cleaning up before sending a FAILED state to the client
[ https://issues.apache.org/jira/browse/DRILL-3167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deneche A. Hakim updated DRILL-3167:
    Attachment: (was: DRILL-3267.3.patch.txt)

Key: DRILL-3167
URL: https://issues.apache.org/jira/browse/DRILL-3167
Project: Apache Drill
Issue Type: Bug
Reporter: Deneche A. Hakim
Assignee: Jacques Nadeau
Fix For: 1.2.0
Attachments: DRILL-3167.1.patch.txt, DRILL-3167.5.patch.txt

TestDrillbitResilience.foreman_runTryEnd() exposes this problem intermittently. The query fails and the Foreman reports the failure to the client, which removes the results listener associated with the failed query. Sometimes a data batch reaches the client after the FAILED state has already arrived; the client doesn't handle this properly, and the corresponding buffer is never released. Making the Foreman wait for all fragments to finish before sending the final state should help avoid such scenarios.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3167) When a query fails, Foreman should wait for all fragments to finish cleaning up before sending a FAILED state to the client
[ https://issues.apache.org/jira/browse/DRILL-3167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deneche A. Hakim updated DRILL-3167:
    Attachment: (was: DRILL-3267.2.patch.txt)

Key: DRILL-3167
URL: https://issues.apache.org/jira/browse/DRILL-3167
Project: Apache Drill
Issue Type: Bug
Reporter: Deneche A. Hakim
Assignee: Jacques Nadeau
Fix For: 1.2.0
Attachments: DRILL-3167.1.patch.txt, DRILL-3167.5.patch.txt

TestDrillbitResilience.foreman_runTryEnd() exposes this problem intermittently. The query fails and the Foreman reports the failure to the client, which removes the results listener associated with the failed query. Sometimes a data batch reaches the client after the FAILED state has already arrived; the client doesn't handle this properly, and the corresponding buffer is never released. Making the Foreman wait for all fragments to finish before sending the final state should help avoid such scenarios.
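The fix described above — do not send the terminal state until every fragment has finished cleaning up — can be sketched with a countdown latch. This is a hypothetical illustration, not Drill's actual Foreman code; the class and method names are invented for the sketch.

```java
import java.util.concurrent.CountDownLatch;

// Sketch: the Foreman tracks outstanding fragments and only transitions to
// the terminal FAILED state once every fragment has reported completion, so
// no late data batch can reach a client that already dropped its listener.
class ForemanSketch {
    private final CountDownLatch outstandingFragments;
    private volatile String finalState = null;

    ForemanSketch(int fragmentCount) {
        outstandingFragments = new CountDownLatch(fragmentCount);
    }

    // Called by each fragment when it has finished cleaning up.
    void fragmentFinished() {
        outstandingFragments.countDown();
    }

    // Called when the query fails; blocks until all fragments are done,
    // then records the terminal state that would be sent to the client.
    String failQuery() {
        try {
            outstandingFragments.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        finalState = "FAILED";
        return finalState;
    }
}
```

With this ordering, the FAILED message is guaranteed to be the last thing the client receives for the query.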
[jira] [Updated] (DRILL-3209) [Umbrella] Plan reads of Hive tables as native Drill reads when a native reader for the underlying table format exists
[ https://issues.apache.org/jira/browse/DRILL-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Westin updated DRILL-3209:
    Fix Version/s: (was: 1.2.0) 1.3.0

Key: DRILL-3209
URL: https://issues.apache.org/jira/browse/DRILL-3209
Project: Apache Drill
Issue Type: Improvement
Components: Query Planning & Optimization, Storage - Hive
Reporter: Jason Altekruse
Assignee: Jason Altekruse
Fix For: 1.3.0

All reads against Hive are currently done through the Hive SerDe interface. While this provides the most flexibility, the API is not optimized for maximum performance while reading the data into Drill's native data structures. For Parquet- and text-file-backed tables, we can plan these reads as Drill native reads. Currently, reads of these file types provide untyped data: while Parquet has metadata in the file, we do not make use of the type information while planning, and for text files we read all of the files as lists of varchars. In both of these cases, casts will need to be injected to provide the same data types provided by the reads through the SerDe interface.
[jira] [Resolved] (DRILL-3056) Numeric literal in an IN list is casted to decimal even when decimal type is disabled
[ https://issues.apache.org/jira/browse/DRILL-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mehant Baid resolved DRILL-3056.
    Resolution: Fixed

Even though the record type indicates Decimal type, when the IN list is converted we still use the double data type.

Key: DRILL-3056
URL: https://issues.apache.org/jira/browse/DRILL-3056
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Affects Versions: 1.0.0
Reporter: Victoria Markman
Assignee: Mehant Baid
Fix For: 1.2.0

{code}
0: jdbc:drill:schema=dfs> select * from sys.options where name like '%decimal%';
+-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
| name                              | kind     | type    | status   | num_val  | string_val  | bool_val  | float_val  |
+-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
| planner.enable_decimal_data_type  | BOOLEAN  | SYSTEM  | DEFAULT  | null     | null        | false     | null       |
+-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
1 row selected (0.212 seconds)
{code}

For an IN list that contains more than 20 numeric literals, we are casting a number with a decimal point to the decimal type even though the decimal type is disabled:

{code}
0: jdbc:drill:schema=dfs> explain plan including all attributes for select * from t1 where a1 in (1,2,3,4,5,6,7,8,9,0,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25.0);
+------+------+
| text | json |
+------+------+
| 00-00    Screen : rowType = RecordType(ANY *): rowcount = 10.0, cumulative cost = {24.0 rows, 158.0 cpu, 0.0 io, 0.0 network, 35.2 memory}, id = 4921
00-01      Project(*=[$0]) : rowType = RecordType(ANY *): rowcount = 10.0, cumulative cost = {23.0 rows, 157.0 cpu, 0.0 io, 0.0 network, 35.2 memory}, id = 4920
00-02        Project(T7¦¦*=[$0]) : rowType = RecordType(ANY T7¦¦*): rowcount = 10.0, cumulative cost = {23.0 rows, 157.0 cpu, 0.0 io, 0.0 network, 35.2 memory}, id = 4919
00-03          HashJoin(condition=[=($2, $3)], joinType=[inner]) : rowType = RecordType(ANY T7¦¦*, ANY a1, ANY a10, DECIMAL(11, 1) ROW_VALUE): rowcount = 10.0, cumulative cost = {23.0 rows, 157.0 cpu, 0.0 io, 0.0 network, 35.2 memory}, id = 4918
00-05            Project(T7¦¦*=[$0], a1=[$1], a10=[$1]) : rowType = RecordType(ANY T7¦¦*, ANY a1, ANY a10): rowcount = 10.0, cumulative cost = {10.0 rows, 20.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 4915
00-07              Project(T7¦¦*=[$0], a1=[$1]) : rowType = RecordType(ANY T7¦¦*, ANY a1): rowcount = 10.0, cumulative cost = {10.0 rows, 20.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 4914
00-08                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/subqueries/t1]], selectionRoot=/drill/testdata/subqueries/t1, numFiles=1, columns=[`*`]]]) : rowType = (DrillRecordRow[*, a1]): rowcount = 10.0, cumulative cost = {10.0 rows, 20.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 4913
00-04            HashAgg(group=[{0}]) : rowType = RecordType(DECIMAL(11, 1) ROW_VALUE): rowcount = 1.0, cumulative cost = {2.0 rows, 9.0 cpu, 0.0 io, 0.0 network, 17.6 memory}, id = 4917
00-06              Values : rowType = RecordType(DECIMAL(11, 1) ROW_VALUE): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 4916
{code}
[jira] [Updated] (DRILL-2029) All readers should show the filename where it encountered error. If possible, also position
[ https://issues.apache.org/jira/browse/DRILL-2029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Westin updated DRILL-2029:
    Fix Version/s: (was: 1.2.0) 1.3.0

Key: DRILL-2029
URL: https://issues.apache.org/jira/browse/DRILL-2029
Project: Apache Drill
Issue Type: Improvement
Components: Storage - Parquet
Affects Versions: 0.7.0
Reporter: Aman Sinha
Assignee: Steven Phillips
Fix For: 1.3.0

The Parquet reader (and possibly other file system readers) may encounter an error (e.g. an IndexOutOfBounds) while reading one out of hundreds or thousands of files in a directory. The stack trace does not show the exact file where this error occurred, which makes diagnosing the problem much harder. We should show the filename in the error message.
[jira] [Updated] (DRILL-1478) The order of query results for the selected fields seems to be different from sqlline vs Web UI
[ https://issues.apache.org/jira/browse/DRILL-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Westin updated DRILL-1478:
    Fix Version/s: (was: 1.2.0) 1.3.0

Key: DRILL-1478
URL: https://issues.apache.org/jira/browse/DRILL-1478
Project: Apache Drill
Issue Type: Bug
Components: Client - HTTP
Environment: I executed a query with aggregation; it seems the order of results is different from sqlline vs the Web UI.
Reporter: B Anil Kumar
Assignee: Sudheesh Katkam
Fix For: 1.3.0

Here the order means the order of the selected columns. For example:

{noformat}
0: jdbc:drill:zk=localhost:2181> select state,city,avg(pop) from mongo.test.`zips` zipcodes group by state, city limit 5;
+--------+--------------+----------+
| state  | city         | EXPR$2   |
+--------+--------------+----------+
| MA     | AGAWAM       | 15338.0  |
| MA     | CUSHMAN      | 36963.0  |
| MA     | BARRE        | 4546.0   |
| MA     | BELCHERTOWN  | 10579.0  |
| MA     | BLANDFORD    | 1240.0   |
+--------+--------------+----------+
{noformat}

The above is as expected, but for the same query in the Web UI:

{noformat}
EXPR$2   state   city
1,240    MA      BLANDFORD
4,546    MA      BARRE
10,579   MA      BELCHERTOWN
15,338   MA      AGAWAM
36,963   MA      CUSHMAN
{noformat}
[jira] [Updated] (DRILL-2663) Better error handling of storage plugin configuration
[ https://issues.apache.org/jira/browse/DRILL-2663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Westin updated DRILL-2663:
    Fix Version/s: (was: 1.2.0) 1.3.0

Key: DRILL-2663
URL: https://issues.apache.org/jira/browse/DRILL-2663
Project: Apache Drill
Issue Type: Bug
Components: Client - HTTP
Affects Versions: 0.9.0
Reporter: Krystal
Assignee: Sudheesh Katkam
Fix For: 1.3.0

First, when there is an invalid entry in the configuration, an error pops up for about 1 second and then disappears. So if you happen to blink during this time, you will not be able to make out what the error message is. We should extend the duration of the error message display time. For the case when there is an invalid JSON mapping error, it would be nice if we could tell the user the invalid entries in the error message.
[jira] [Updated] (DRILL-3287) Changing session level parameter back to the default value does not change its status back to DEFAULT
[ https://issues.apache.org/jira/browse/DRILL-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Westin updated DRILL-3287:
    Fix Version/s: (was: 1.2.0) 1.3.0

Key: DRILL-3287
URL: https://issues.apache.org/jira/browse/DRILL-3287
Project: Apache Drill
Issue Type: Bug
Components: Execution - Flow
Reporter: Victoria Markman
Assignee: Sudheesh Katkam
Fix For: 1.3.0

Initial state:
{code}
0: jdbc:drill:schema=dfs> select * from sys.options where status like '%CHANGED%';
+-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
| name                              | kind     | type    | status   | num_val  | string_val  | bool_val  | float_val  |
+-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
| planner.enable_decimal_data_type  | BOOLEAN  | SYSTEM  | CHANGED  | null     | null        | true      | null       |
+-----------------------------------+----------+---------+----------+----------+-------------+-----------+------------+
1 row selected (0.247 seconds)
{code}

I changed a session parameter:
{code}
0: jdbc:drill:schema=dfs> alter session set `planner.enable_hashjoin` = false;
+-------+-----------------------------------+
| ok    | summary                           |
+-------+-----------------------------------+
| true  | planner.enable_hashjoin updated.  |
+-------+-----------------------------------+
1 row selected (0.1 seconds)
{code}

So far, so good: it appears on the changed options list:
{code}
0: jdbc:drill:schema=dfs> select * from sys.options where status like '%CHANGED%';
+-----------------------------------+----------+----------+----------+----------+-------------+-----------+------------+
| name                              | kind     | type     | status   | num_val  | string_val  | bool_val  | float_val  |
+-----------------------------------+----------+----------+----------+----------+-------------+-----------+------------+
| planner.enable_decimal_data_type  | BOOLEAN  | SYSTEM   | CHANGED  | null     | null        | true      | null       |
| planner.enable_hashjoin           | BOOLEAN  | SESSION  | CHANGED  | null     | null        | false     | null       |
+-----------------------------------+----------+----------+----------+----------+-------------+-----------+------------+
2 rows selected (0.133 seconds)
{code}

I changed the session parameter back to its default value:
{code}
0: jdbc:drill:schema=dfs> alter session set `planner.enable_hashjoin` = true;
+-------+-----------------------------------+
| ok    | summary                           |
+-------+-----------------------------------+
| true  | planner.enable_hashjoin updated.  |
+-------+-----------------------------------+
1 row selected (0.096 seconds)
{code}

{color:red}It still appears on the changed list, even though it has the default value:{color}
{code}
0: jdbc:drill:schema=dfs> select * from sys.options where status like '%CHANGED%';
+-----------------------------------+----------+----------+----------+----------+-------------+-----------+------------+
| name                              | kind     | type     | status   | num_val  | string_val  | bool_val  | float_val  |
+-----------------------------------+----------+----------+----------+----------+-------------+-----------+------------+
| planner.enable_decimal_data_type  | BOOLEAN  | SYSTEM   | CHANGED  | null     | null        | true      | null       |
| planner.enable_hashjoin           | BOOLEAN  | SESSION  | CHANGED  | null     | null        | true      | null       |
+-----------------------------------+----------+----------+----------+----------+-------------+-----------+------------+
2 rows selected (0.124 seconds)
{code}
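One way to get the behavior the report asks for is to derive an option's status by comparing its current value against the default, instead of storing a sticky CHANGED flag. The sketch below is illustrative only — the class and method names are invented and do not reflect Drill's actual OptionManager API.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: status is computed from a comparison with the default value, so
// setting an option back to its default automatically reports DEFAULT again.
class OptionSketch {
    private final Map<String, Boolean> defaults = new HashMap<>();
    private final Map<String, Boolean> values = new HashMap<>();

    OptionSketch() {
        defaults.put("planner.enable_hashjoin", true);  // assumed default
    }

    void set(String name, boolean value) {
        values.put(name, value);
    }

    String status(String name) {
        Boolean v = values.get(name);
        // Never set, or set to the same value as the default: DEFAULT.
        if (v == null || v.equals(defaults.get(name))) {
            return "DEFAULT";
        }
        return "CHANGED";
    }
}
```

Under this design, the final query in the report would no longer list `planner.enable_hashjoin` as CHANGED after it is restored to `true`.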
[jira] [Updated] (DRILL-3168) Char overflow in LimitRecordBatch
[ https://issues.apache.org/jira/browse/DRILL-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Westin updated DRILL-3168:
    Fix Version/s: (was: 1.2.0) 1.4.0

Key: DRILL-3168
URL: https://issues.apache.org/jira/browse/DRILL-3168
Project: Apache Drill
Issue Type: Bug
Components: Execution - Relational Operators
Affects Versions: 0.9.0
Reporter: 徐波
Assignee: Sudheesh Katkam
Fix For: 1.4.0

The variable named 'i' in 'limitWithNoSV' may overflow when fetch - offset > Character.MAX_VALUE - offset, e.g. offset=0, fetch=65536. Code in limitWithNoSV:

{code}
int svIndex = 0;
for (char i = (char) offset; i < fetch; i++) {
  outgoingSv.setIndex(svIndex, i);
  svIndex++;
}
{code}
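The overflow above is easy to reproduce: a `char` loop variable wraps around at Character.MAX_VALUE (65535), so with fetch = 65536 the condition `i < fetch` never becomes false. A minimal sketch of the fix — iterating with an `int` index instead — follows; the helper below is illustrative and is not the actual LimitRecordBatch code.

```java
// Sketch of the fix: use an int loop variable so the index cannot wrap.
// The buggy shape, for comparison, was:
//   for (char i = (char) offset; i < fetch; i++) { ... }  // infinite loop at fetch = 65536
class LimitSketch {
    // Counts how many indices the loop visits; with the int index this is
    // simply fetch - offset, even past Character.MAX_VALUE.
    static int countIndices(int offset, int fetch) {
        int svIndex = 0;
        for (int i = offset; i < fetch; i++) {
            svIndex++;
        }
        return svIndex;
    }
}
```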
[jira] [Reopened] (DRILL-3194) TestDrillbitResilience#memoryLeaksWhenFailed hangs
[ https://issues.apache.org/jira/browse/DRILL-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sudheesh Katkam reopened DRILL-3194:

Key: DRILL-3194
URL: https://issues.apache.org/jira/browse/DRILL-3194
Project: Apache Drill
Issue Type: Bug
Components: Execution - Flow
Reporter: Sudheesh Katkam
Assignee: Deneche A. Hakim
Fix For: 1.2.0

TestDrillbitResilience#memoryLeaksWhenFailed hangs and fails when run multiple times. This might be related to DRILL-3163.
[jira] [Updated] (DRILL-3461) Need to add javadocs to class where they are missing
[ https://issues.apache.org/jira/browse/DRILL-3461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steven Phillips updated DRILL-3461:
    Summary: Need to add javadocs to class where they are missing (was: Need to meet basic coding standards)

Key: DRILL-3461
URL: https://issues.apache.org/jira/browse/DRILL-3461
Project: Apache Drill
Issue Type: Bug
Reporter: Ted Dunning
Attachments: no-comments.txt, no-javadoc-no-comments.txt, no-javadocs.txt

1220 classes in Drill have no Javadocs whatsoever. I will attach a detailed list. Some kind of expression of intent and basic place in the architecture should be included in all classes. The good news is that at least there are 1838 (1868 in the 1.1.0 branch) classes that have at least some kind of javadocs. I would be happy to help write comments, but I can't figure out what these classes do.
[jira] [Updated] (DRILL-3158) Add result verification to tests that currently run a query and just check to make sure no exception occurs
[ https://issues.apache.org/jira/browse/DRILL-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Westin updated DRILL-3158:
    Fix Version/s: (was: 1.2.0) 1.3.0

Key: DRILL-3158
URL: https://issues.apache.org/jira/browse/DRILL-3158
Project: Apache Drill
Issue Type: Test
Components: Tools, Build & Test
Affects Versions: 1.0.0
Reporter: Jason Altekruse
Assignee: Jason Altekruse
Fix For: 1.3.0

Many of the early unit tests written for Drill only run a query to make sure it can execute without an exception. These tests should all be enhanced to include result verification using the new unit test framework.
[jira] [Updated] (DRILL-3128) LENGTH(..., CAST(... AS VARCHAR(0) ) ) yields ClassCastException
[ https://issues.apache.org/jira/browse/DRILL-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mehant Baid updated DRILL-3128:
    Fix Version/s: (was: 1.2.0) 1.4.0

Key: DRILL-3128
URL: https://issues.apache.org/jira/browse/DRILL-3128
Project: Apache Drill
Issue Type: Bug
Components: Functions - Drill
Reporter: Daniel Barclay (Drill)
Assignee: Mehant Baid
Fix For: 1.4.0

Trying to make a function call with a function name of {{LENGTH}}, with two arguments, and with the second argument being a cast expression having a target type of {{VARCHAR(0)}} yields a {{ClassCastException}} (at least for several cases of source expression):

{noformat}
0: jdbc:drill:zk=local> SELECT LENGTH(1, CAST('x' AS VARCHAR(0) ) ) FROM INFORMATION_SCHEMA.CATALOGS;
Error: SYSTEM ERROR: java.lang.ClassCastException: org.apache.drill.common.expression.CastExpression cannot be cast to org.apache.drill.common.expression.ValueExpressions$QuotedString
[Error Id: 1860730b-b69b-4400-bb2c-935a56aa456e on dev-linux2:31010] (state=,code=0)

0: jdbc:drill:zk=local> SELECT LENGTH(1, CAST(1 AS VARCHAR(0) ) ) FROM INFORMATION_SCHEMA.CATALOGS;
Error: SYSTEM ERROR: java.lang.ClassCastException: org.apache.drill.common.expression.CastExpression cannot be cast to org.apache.drill.common.expression.ValueExpressions$QuotedString
[Error Id: 476c4848-4b53-4c1e-9005-2bab3a2a91a4 on dev-linux2:31010] (state=,code=0)

0: jdbc:drill:zk=local> SELECT LENGTH(1, CAST(NULL AS VARCHAR(0) ) ) FROM INFORMATION_SCHEMA.CATALOGS;
Error: SYSTEM ERROR: java.lang.ClassCastException: org.apache.drill.common.expression.TypedNullConstant cannot be cast to org.apache.drill.common.expression.ValueExpressions$QuotedString
[Error Id: d888a336-2b18-45d9-a5e8-f4c2406a292e on dev-linux2:31010] (state=,code=0)
{noformat}

This case (not with {{VARCHAR(0)}}) also yields a {{ClassCastException}}:

{noformat}
0: jdbc:drill:zk=local> SELECT LENGTH(1, CAST(1 AS VARCHAR(2) ) ) FROM INFORMATION_SCHEMA.CATALOGS;
Error: SYSTEM ERROR: java.lang.ClassCastException: org.apache.drill.common.expression.CastExpression cannot be cast to org.apache.drill.common.expression.ValueExpressions$QuotedString
[Error Id: 04bd6cb1-2dd7-4938-ab9b-4d460aaaf05f on dev-linux2:31010] (state=,code=0)
{noformat}
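The errors above all come from the same shape of bug: code that expects a string literal (`ValueExpressions$QuotedString`) receives some other expression node (a `CastExpression` or `TypedNullConstant`) and performs an unchecked downcast. A hedged sketch of the defensive pattern follows — the types here are simplified stand-ins, not Drill's actual expression classes — checking the runtime type and raising a meaningful error instead of letting the ClassCastException escape as a SYSTEM ERROR.

```java
// Sketch: guard the downcast and fail with a user-facing message.
class ExprSketch {
    interface LogicalExpression {}

    // Stand-in for ValueExpressions$QuotedString.
    static class QuotedString implements LogicalExpression {
        final String value;
        QuotedString(String value) { this.value = value; }
    }

    // Stand-in for CastExpression.
    static class CastExpression implements LogicalExpression {}

    // Instead of "(QuotedString) e", check the type first so callers get a
    // meaningful validation error rather than a ClassCastException.
    static String requireQuotedString(LogicalExpression e) {
        if (!(e instanceof QuotedString)) {
            throw new IllegalArgumentException(
                "Expected a string literal but got " + e.getClass().getSimpleName());
        }
        return ((QuotedString) e).value;
    }
}
```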
[jira] [Updated] (DRILL-2040) Fix currently ignored parquet tests to pull files from the web or generate the files at the time of the test
[ https://issues.apache.org/jira/browse/DRILL-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Westin updated DRILL-2040:
    Fix Version/s: (was: 1.2.0) 1.4.0

Key: DRILL-2040
URL: https://issues.apache.org/jira/browse/DRILL-2040
Project: Apache Drill
Issue Type: Improvement
Components: Storage - Parquet
Reporter: Jason Altekruse
Assignee: Jason Altekruse
Fix For: 1.4.0

Many tests in the parquet library rely on binary files not in version control or attached to JIRA. This should be corrected to allow external verification of the tests.
[jira] [Updated] (DRILL-2726) Display Drill version in sys.version
[ https://issues.apache.org/jira/browse/DRILL-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Westin updated DRILL-2726:
    Fix Version/s: (was: 1.2.0) 1.4.0

Key: DRILL-2726
URL: https://issues.apache.org/jira/browse/DRILL-2726
Project: Apache Drill
Issue Type: Improvement
Components: Storage - Other
Reporter: Andries Engelbrecht
Assignee: Sudheesh Katkam
Fix For: 1.4.0

Include the Drill version information in sys.version, so it is easy to determine the exact version of Drill being used for support purposes. Adding a version column to sys.version to show the exact version, i.e. mapr-drill-0.8.0.31168-1 or apache-drill-0.8.0.31168-1, will make it easier for users to quickly identify the Drill version being used and provide that information.
[jira] [Updated] (DRILL-1951) Can't cast numeric value with decimal point read from CSV file into integer data type
[ https://issues.apache.org/jira/browse/DRILL-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mehant Baid updated DRILL-1951:
    Fix Version/s: (was: 1.2.0) 1.4.0

Key: DRILL-1951
URL: https://issues.apache.org/jira/browse/DRILL-1951
Project: Apache Drill
Issue Type: Bug
Components: Execution - Data Types
Affects Versions: 0.8.0
Reporter: Victoria Markman
Assignee: Mehant Baid
Fix For: 1.4.0

sales.csv file:
{code}
997,Ford,ME350,3000.00, comment#1
1999,Chevy,Venture,4900.00, comment#2
1999,Chevy,Venture,5000.00, comment#3
1996,Jeep,Cherokee,1.01, comment#4
{code}

-- Can cast to decimal
{code}
0: jdbc:drill:schema=dfs> select cast(columns[3] as decimal(18,2)) from `sales.csv`;
+----------+
| EXPR$0   |
+----------+
| 3000.00  |
| 4900.00  |
| 5000.00  |
| 1.01     |
+----------+
4 rows selected (0.095 seconds)
{code}

-- Can cast to float
{code}
0: jdbc:drill:schema=dfs> select cast(columns[3] as float) from `sales.csv`;
+----------+
| EXPR$0   |
+----------+
| 3000.0   |
| 4900.0   |
| 5000.0   |
| 1.01     |
+----------+
4 rows selected (0.112 seconds)
{code}

-- Can't cast to INT/BIGINT
{code}
0: jdbc:drill:schema=dfs> select cast(columns[3] as bigint) from `sales.csv`;
Query failed: Query failed: Failure while running fragment., 3000.00
[ 4818451a-c731-48a9-9992-1e81ab1d520d on atsqa4-134.qa.lab:31010 ]
[ 4818451a-c731-48a9-9992-1e81ab1d520d on atsqa4-134.qa.lab:31010 ]
Error: exception while executing query: Failure while executing query. (state=,code=0)
{code}

-- Same works with json/parquet files
{code}
0: jdbc:drill:schema=dfs> select a1 from `t1.json`;
+---------+
| a1      |
+---------+
| 10.01   |
+---------+
1 row selected (0.077 seconds)

0: jdbc:drill:schema=dfs> select cast(a1 as int) from `t1.json`;
+---------+
| EXPR$0  |
+---------+
| 10      |
+---------+

0: jdbc:drill:schema=dfs> select * from test_cast;
+----------+
| a1       |
+----------+
| 10.0100  |
+----------+
1 row selected (0.06 seconds)

0: jdbc:drill:schema=dfs> select cast(a1 as int) from test_cast;
+---------+
| EXPR$0  |
+---------+
| 10      |
+---------+
1 row selected (0.094 seconds)
{code}
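The failing case comes down to parsing: the text reader hands back strings like "3000.00", and a direct integer parse (the equivalent of `Long.parseLong`) rejects the decimal point, while the JSON/Parquet paths start from typed numeric values. A hedged sketch of what the CSV-to-BIGINT cast would need to do — go through a decimal intermediate, as the working `decimal(18,2)` cast effectively does — is below; the helper name is invented for illustration.

```java
import java.math.BigDecimal;

// Sketch: cast a numeric string that may contain a decimal point to a long.
class CastSketch {
    static long castToBigint(String raw) {
        // Long.parseLong("3000.00") would throw NumberFormatException;
        // parse as a decimal first, then narrow to a long (truncating).
        return new BigDecimal(raw.trim()).longValue();
    }
}
```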
[jira] [Updated] (DRILL-2563) Improvements for the fragment graph in the profile UI
[ https://issues.apache.org/jira/browse/DRILL-2563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Westin updated DRILL-2563:
    Fix Version/s: (was: 1.2.0) Future

Key: DRILL-2563
URL: https://issues.apache.org/jira/browse/DRILL-2563
Project: Apache Drill
Issue Type: Improvement
Components: Client - HTTP
Affects Versions: 0.9.0
Reporter: Krystal
Assignee: Sudheesh Katkam
Fix For: Future

git.commit.id=8493713cafe6e5d1f56f2dffc9d8bea294a6e013

The overview of the major fragment graph is not intuitive in figuring out what each color line represents. We should provide some kind of legend, or color the major fragments in the table underneath the graph to correspond to the lines in the graph. Also, we should make the graph bigger so it would be easier to read.
[jira] [Updated] (DRILL-2657) Should force a storage plugin to be disabled first before allow to delete
[ https://issues.apache.org/jira/browse/DRILL-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Westin updated DRILL-2657:
    Fix Version/s: (was: 1.2.0) 1.3.0

Key: DRILL-2657
URL: https://issues.apache.org/jira/browse/DRILL-2657
Project: Apache Drill
Issue Type: Improvement
Components: Client - HTTP
Affects Versions: 0.9.0
Reporter: Krystal
Assignee: Sudheesh Katkam
Fix For: 1.3.0

Improvement request: To avoid having users accidentally delete a live storage plugin configuration, we should only allow deletion of a disabled configuration. So to delete, the user has to first disable the configuration.
[jira] [Updated] (DRILL-2665) Should break the Completed Queries list under the profile UI into different sections
[ https://issues.apache.org/jira/browse/DRILL-2665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Westin updated DRILL-2665:
    Fix Version/s: (was: 1.2.0) Future

Key: DRILL-2665
URL: https://issues.apache.org/jira/browse/DRILL-2665
Project: Apache Drill
Issue Type: Improvement
Components: Client - HTTP
Affects Versions: 0.9.0
Reporter: Krystal
Assignee: Sudheesh Katkam
Fix For: Future

To be more user friendly, the executed queries under the Profile UI page should be broken down into 3 different sections. The successfully completed queries should go under the Completed Queries section, the failed queries should go under the Failed Queries section, and the cancelled queries should go under the Cancelled Queries section. The Completed Queries section should be listed first.
[jira] [Updated] (DRILL-3201) Drill UI Authentication
[ https://issues.apache.org/jira/browse/DRILL-3201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Altekruse updated DRILL-3201:
    Fix Version/s: (was: 1.2.0) 1.3.0

Key: DRILL-3201
URL: https://issues.apache.org/jira/browse/DRILL-3201
Project: Apache Drill
Issue Type: Improvement
Components: Client - HTTP
Affects Versions: 1.0.0
Environment: Drill 1.0.0
Reporter: Rajkumar Singh
Assignee: Jason Altekruse
Labels: features
Fix For: 1.3.0

The Drill UI doesn't have an authentication feature, so any user can cancel a running query or change the storage plugin configuration.
[jira] [Updated] (DRILL-1992) Add more stats for HashJoinBatch and HashAggBatch
[ https://issues.apache.org/jira/browse/DRILL-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Westin updated DRILL-1992:
    Fix Version/s: (was: 1.2.0) 1.4.0

Key: DRILL-1992
URL: https://issues.apache.org/jira/browse/DRILL-1992
Project: Apache Drill
Issue Type: Improvement
Components: Execution - Relational Operators
Affects Versions: 0.8.0
Reporter: Venki Korukanti
Assignee: Venki Korukanti
Fix For: 1.4.0
Attachments: DRILL-1992-2.patch, DRILL-1992.patch

Adding more stats to analyze the memory usage of HashJoinBatch and HashAggBatch.

HashJoinBatch:
+ HASHTABLE_MEMORY_ALLOCATION
+ HASHTABLE_NUM_BATCHHOLDERS
+ HASHJOINHELPER_MEMORY

HashAgg:
+ HASHTABLE_MEMORY_ALLOCATION
+ HASHTABLE_NUM_BATCHHOLDERS
+ HASHAGG_MEMORY
+ HASHAGG_NUM_BATCHHOLDERS

Cleanup:
+ Prefix HASHTABLE_ to existing HashTable metrics such as NUM_BUCKETS.
[jira] [Updated] (DRILL-3068) Query fails and creates profile .drill file with size 0
[ https://issues.apache.org/jira/browse/DRILL-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Altekruse updated DRILL-3068:
    Fix Version/s: (was: 1.2.0) 1.3.0

Key: DRILL-3068
URL: https://issues.apache.org/jira/browse/DRILL-3068
Project: Apache Drill
Issue Type: Bug
Components: Client - HTTP
Reporter: Victoria Markman
Assignee: Jason Altekruse
Fix For: 1.3.0

This causes the Web UI to fail with:

HTTP ERROR 500
Problem accessing /profiles. Reason: Request failed.

Reproduction for the profile with size 0 is in DRILL-3067.
[jira] [Updated] (DRILL-2293) CTAS does not clean up when it fails
[ https://issues.apache.org/jira/browse/DRILL-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Westin updated DRILL-2293:
    Fix Version/s: (was: 1.2.0) 1.3.0

Key: DRILL-2293
URL: https://issues.apache.org/jira/browse/DRILL-2293
Project: Apache Drill
Issue Type: Bug
Components: Storage - Parquet
Reporter: Rahul Challapalli
Assignee: Steven Phillips
Fix For: 1.3.0

git.commit.id.abbrev=6676f2d

Data set:
{code}
{
  id: 1,
  map: {
    rm: [
      { mapid: "m1", mapvalue: { col1: 1, col2: [0,1,2,3,4,5] }, rptd: [ { a: "foo" }, { b: "boo" } ] },
      { mapid: "m2", mapvalue: { col1: 0, col2: [] }, rptd: [ { a: "bar" }, { c: 1 }, { d: 4.5 } ] }
    ]
  }
}
{code}

The below query fails:
{code}
create table rep_map as select d.map from `temp.json` d;
Query failed: Query stopped., index: -4, length: 4 (expected: range(0, 16384)) [ d76e3f74-7e2c-406f-a7fd-5efc68227e75 on qa-node190.qa.lab:31010 ]
{code}

However, Drill created a folder 'rep_map' and the folder contained a broken parquet file:
{code}
create table rep_map as select d.map from `temp.json` d;
+--------+----------------------------------+
| ok     | summary                          |
+--------+----------------------------------+
| false  | Table 'rep_map' already exists.  |
+--------+----------------------------------+
{code}

Drill should clean up properly in case of a failure. I raised a different issue for the actual failure.
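The requested behavior — remove the partially written table when CTAS fails, so a retry does not hit "Table 'rep_map' already exists" — can be sketched as a try/catch around the write that deletes the output directory on failure. This is a hedged illustration, not Drill's actual writer code; the class, interface, and paths are invented for the sketch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Sketch: clean up the partially written output directory if the table
// write fails, then rethrow so the caller still sees the failure.
class CtasCleanupSketch {
    interface TableWriter {
        void write(Path dir) throws IOException;
    }

    static void createTable(Path outputDir, TableWriter writer) throws IOException {
        Files.createDirectories(outputDir);
        try {
            writer.write(outputDir);
        } catch (IOException e) {
            deleteRecursively(outputDir);  // remove the partial table
            throw e;
        }
    }

    // Delete children before parents by walking in reverse order.
    static void deleteRecursively(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
        }
    }
}
```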
[jira] [Updated] (DRILL-2100) Drill not deleting spooling files
[ https://issues.apache.org/jira/browse/DRILL-2100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Westin updated DRILL-2100:
    Fix Version/s: (was: 1.2.0) 1.3.0

Key: DRILL-2100
URL: https://issues.apache.org/jira/browse/DRILL-2100
Project: Apache Drill
Issue Type: Bug
Components: Execution - Relational Operators
Affects Versions: 0.8.0
Reporter: Abhishek Girish
Assignee: Steven Phillips
Fix For: 1.3.0

Currently, forcing queries to use an external sort by switching off hash join/agg causes spill-to-disk files to accumulate. This causes issues with disk space availability when the spill is configured to be on the local file system (/tmp/drill). It is also not optimal when configured to use DFS (custom). Drill must clean up all temporary files created after a query completes or after a drillbit restart.
[jira] [Commented] (DRILL-2293) CTAS does not clean up when it fails
[ https://issues.apache.org/jira/browse/DRILL-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615606#comment-14615606 ]

Chris Westin commented on DRILL-2293:
    Under the assumption that this can be done manually, as a workaround, pushing this further out into the future.

Key: DRILL-2293
URL: https://issues.apache.org/jira/browse/DRILL-2293
Project: Apache Drill
Issue Type: Bug
Components: Storage - Parquet
Reporter: Rahul Challapalli
Assignee: Steven Phillips
Fix For: 1.3.0

git.commit.id.abbrev=6676f2d

Data set:
{code}
{
  id: 1,
  map: {
    rm: [
      { mapid: "m1", mapvalue: { col1: 1, col2: [0,1,2,3,4,5] }, rptd: [ { a: "foo" }, { b: "boo" } ] },
      { mapid: "m2", mapvalue: { col1: 0, col2: [] }, rptd: [ { a: "bar" }, { c: 1 }, { d: 4.5 } ] }
    ]
  }
}
{code}

The below query fails:
{code}
create table rep_map as select d.map from `temp.json` d;
Query failed: Query stopped., index: -4, length: 4 (expected: range(0, 16384)) [ d76e3f74-7e2c-406f-a7fd-5efc68227e75 on qa-node190.qa.lab:31010 ]
{code}

However, Drill created a folder 'rep_map' and the folder contained a broken parquet file:
{code}
create table rep_map as select d.map from `temp.json` d;
+--------+----------------------------------+
| ok     | summary                          |
+--------+----------------------------------+
| false  | Table 'rep_map' already exists.  |
+--------+----------------------------------+
{code}

Drill should clean up properly in case of a failure. I raised a different issue for the actual failure.
[jira] [Updated] (DRILL-3461) Need to meet basic coding standards
[ https://issues.apache.org/jira/browse/DRILL-3461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning updated DRILL-3461: --- Attachment: no-javadocs.txt This is a list of the 1220 classes with no javadocs Need to meet basic coding standards --- Key: DRILL-3461 URL: https://issues.apache.org/jira/browse/DRILL-3461 Project: Apache Drill Issue Type: Bug Reporter: Ted Dunning Attachments: no-javadocs.txt 1220 classes in Drill have no Javadocs whatsoever. I will attach a detailed list. Some kind of expression of intent and basic place in the architecture should be included in all classes. The good news is that at least there are 1838 (1868 in 1.1.0 branch) classes that have at least some kind of javadocs. I would be happy to help write comments, but I can't figure out what these classes do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-3461) Need to meet basic coding standards
Ted Dunning created DRILL-3461: -- Summary: Need to meet basic coding standards Key: DRILL-3461 URL: https://issues.apache.org/jira/browse/DRILL-3461 Project: Apache Drill Issue Type: Bug Reporter: Ted Dunning 1220 classes in Drill have no Javadocs whatsoever. I will attach a detailed list. Some kind of expression of intent and basic place in the architecture should be included in all classes. The good news is that at least there are 1838 (1868 in 1.1.0 branch) classes that have at least some kind of javadocs. I would be happy to help write comments, but I can't figure out what these classes do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-2800) Performance regression introduced with commit: a6df26a (Patch for DRILL-2512)
[ https://issues.apache.org/jira/browse/DRILL-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615617#comment-14615617 ] Chris Westin commented on DRILL-2800: - Or, if the numbers are still like this, then move this forward to 1.3; it's less egregious than some other bugs. Performance regression introduced with commit: a6df26a (Patch for DRILL-2512) -- Key: DRILL-2800 URL: https://issues.apache.org/jira/browse/DRILL-2800 Project: Apache Drill Issue Type: Bug Affects Versions: 0.9.0 Environment: RHEL 6.4 TPCH Data Set: SF100 (Uncompressed Parquet) Reporter: Kunal Khatua Assignee: Sudheesh Katkam Fix For: 1.2.0 TPCH 06 (Cached Run) was used as a reference to identify the regressive commit. DRILL-2613: 2-Core: Impl. ResultSet.getXxx(...) number-to-number data [fe11e86] 3,902 msec DRILL-2668: Fix: CAST(1.1 AS FLOAT) was yielding DOUBLE. [49042bc] 5,606 msec DRILL-2512: Shuffle the list of Drill endpoints before connecting [a6df26a] 10,506 msec (Rerun 9,678 msec) Here are comparisons from the last complete run (Cached runs):
Commit   | d7e37f4  | a6df26a
tpch 01  | 12,232   | 16,693
tpch 03  | 23,374   | 30,062
tpch 04  | 42,144   | 23,749
tpch 05  | 32,247   | 41,648
tpch 06  | 4,665    | 10,506
tpch 07  | 29,322   | 34,315
tpch 08  | 35,478   | 42,120
tpch 09  | 43,959   | 49,262
tpch 10  | 24,439   | 26,136
tpch 12  | Timeout  | 18,866
tpch 13  | 18,226   | 20,863
tpch 14  | 11,760   | 11,884
tpch 16  | 10,676   | 15,032
tpch 18  | 34,153   | 39,058
tpch 19  | Timeout  | 32,909
tpch 20  | 99,788   | 22,890
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3461) Need to meet basic coding standards
[ https://issues.apache.org/jira/browse/DRILL-3461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning updated DRILL-3461: --- Attachment: no-javadoc-no-comments.txt no-comments.txt Here are other views of the situation. A quick summary is that only one file that has no javadoc has any // comments. Need to meet basic coding standards --- Key: DRILL-3461 URL: https://issues.apache.org/jira/browse/DRILL-3461 Project: Apache Drill Issue Type: Bug Reporter: Ted Dunning Attachments: no-comments.txt, no-javadoc-no-comments.txt, no-javadocs.txt 1220 classes in Drill have no Javadocs whatsoever. I will attach a detailed list. Some kind of expression of intent and basic place in the architecture should be included in all classes. The good news is that at least there are 1838 (1868 in 1.1.0 branch) classes that have at least some kind of javadocs. I would be happy to help write comments, but I can't figure out what these classes do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-3462) There appears to be no way to have complex intermediate state
Ted Dunning created DRILL-3462: -- Summary: There appears to be no way to have complex intermediate state Key: DRILL-3462 URL: https://issues.apache.org/jira/browse/DRILL-3462 Project: Apache Drill Issue Type: Bug Reporter: Ted Dunning After spending several frustrating days on the problem (see also DRILL-3461), it appears that there is no viable idiom for building an aggregator whose internal state is anything more than a scalar. What is needed is:
1) The ability to allocate a Repeated* type for use in Workspace variables. Currently, new works to get the basic structure, but there is no good way to allocate the corresponding vector.
2) The ability to use and to allocate a ComplexWriter in the Workspace variables.
3) The ability to write a UDAF that supports multi-phase aggregation. It would be just fine if I simply have to write a combine method on my UDAF class. I don't think that there is any way to infer such a combiner from the parameters and workspace variables. An alternative API would be to have a form of the output function that is given an IterableOutputClass, but that is probably much less efficient than simply having a combine method that is called repeatedly.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
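The combine method asked for in point 3 is the standard two-phase aggregation shape: each fragment builds a partial state with add(), and the merge phase folds partials together with combine() instead of re-reading rows. A minimal Python model of that shape (illustrative only; this is not Drill's UDAF interface):

```python
class SumOfSquaresUDAF:
    """Sketch of a multi-phase aggregator: per-fragment partial states
    are merged with combine(), called repeatedly, one partial at a time."""

    def __init__(self):
        self.state = 0  # workspace variable; could be a list/vector

    def add(self, value):
        # Phase 1: accumulate within one fragment.
        self.state += value * value

    def combine(self, other):
        # Phase 2: merge another fragment's partial state into this one.
        self.state += other.state

    def output(self):
        return self.state
```

The point of the request is that combine() operates on states, not rows, so the merging fragment never needs access to the original input.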
[jira] [Updated] (DRILL-3194) TestDrillbitResilience#memoryLeaksWhenFailed hangs
[ https://issues.apache.org/jira/browse/DRILL-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deneche A. Hakim updated DRILL-3194: Component/s: Execution - Flow TestDrillbitResilience#memoryLeaksWhenFailed hangs -- Key: DRILL-3194 URL: https://issues.apache.org/jira/browse/DRILL-3194 Project: Apache Drill Issue Type: Bug Components: Execution - Flow Reporter: Sudheesh Katkam Assignee: Deneche A. Hakim Fix For: 1.2.0 TestDrillbitResilience#memoryLeaksWhenFailed hangs and fails when run multiple times. This might be related to DRILL-3163. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3167) When a query fails, Foreman should wait for all fragments to finish cleaning up before sending a FAILED state to the client
[ https://issues.apache.org/jira/browse/DRILL-3167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deneche A. Hakim updated DRILL-3167: Attachment: (was: DRILL-3267.4.patch.txt) When a query fails, Foreman should wait for all fragments to finish cleaning up before sending a FAILED state to the client --- Key: DRILL-3167 URL: https://issues.apache.org/jira/browse/DRILL-3167 Project: Apache Drill Issue Type: Bug Reporter: Deneche A. Hakim Assignee: Jacques Nadeau Fix For: 1.2.0 Attachments: DRILL-3167.1.patch.txt, DRILL-3167.5.patch.txt TestDrillbitResilience.foreman_runTryEnd() exposes this problem intermittently. The query fails and the Foreman reports the failure to the client, which removes the results listener associated with the failed query. Sometimes a data batch reaches the client after the FAILED state has already arrived; the client doesn't handle this properly and the corresponding buffer is never released. Making the Foreman wait for all fragments to finish before sending the final state should help avoid such scenarios. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
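The proposed fix is essentially a countdown latch: the Foreman defers the terminal state until every fragment has reported back, so no data batch can trail the final state to the client. A small Python sketch of that ordering (hypothetical names, not Drill's Foreman implementation):

```python
import threading

class Foreman:
    """Sketch: count outstanding fragments and defer the terminal
    FAILED message until every fragment has finished cleaning up."""

    def __init__(self, num_fragments, send_to_client):
        self.remaining = num_fragments
        self.lock = threading.Lock()
        self.all_done = threading.Condition(self.lock)
        self.send_to_client = send_to_client
        self.final_state = None

    def fragment_finished(self):
        # Called once per fragment when its cleanup completes.
        with self.all_done:
            self.remaining -= 1
            if self.remaining == 0:
                self.all_done.notify_all()

    def fail(self):
        # Wait for every fragment before reporting FAILED to the client.
        with self.all_done:
            self.all_done.wait_for(lambda: self.remaining == 0)
            self.final_state = "FAILED"
        self.send_to_client(self.final_state)
```

In Java this is the role a CountDownLatch plays; the essential property is that send_to_client happens-after every fragment_finished.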
[jira] [Created] (DRILL-3460) Implement function validation in Drill
Mehant Baid created DRILL-3460: -- Summary: Implement function validation in Drill Key: DRILL-3460 URL: https://issues.apache.org/jira/browse/DRILL-3460 Project: Apache Drill Issue Type: Improvement Reporter: Mehant Baid Assignee: Mehant Baid Fix For: 1.3.0 Since the schema of the table is not known during the validation phase of Calcite, Drill ends up skipping most of the validation checks in Calcite. This causes certain problems at execution time, for example when function resolution or function execution fails due to incorrect types provided to the function. The worst manifestation of this problem is when Drill tries to apply implicit casting and produces incorrect results: there are cases when it's fine to apply the implicit cast in general, but it doesn't make sense for a particular function. This JIRA is aimed at providing a new approach to be able to perform validation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-2860) Unable to cast integer column from parquet file to interval day
[ https://issues.apache.org/jira/browse/DRILL-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mehant Baid updated DRILL-2860: --- Fix Version/s: (was: 1.2.0) 1.3.0 Unable to cast integer column from parquet file to interval day --- Key: DRILL-2860 URL: https://issues.apache.org/jira/browse/DRILL-2860 Project: Apache Drill Issue Type: Bug Components: Execution - Data Types Reporter: Victoria Markman Assignee: Mehant Baid Fix For: 1.3.0 Attachments: t1.parquet I can cast a numeric literal to interval day: {code}
0: jdbc:drill:schema=dfs> select cast(1 as interval day) from t1;
+---------+
| EXPR$0  |
+---------+
| P1D     |
| P1D     |
| P1D     |
| P1D     |
| P1D     |
| P1D     |
| P1D     |
| P1D     |
| P1D     |
| P1D     |
+---------+
10 rows selected (0.122 seconds)
{code} I get an error when I'm trying to do the same from a parquet file: {code}
0: jdbc:drill:schema=dfs> select cast(a1 as interval day) from t1 where a1 = 1;
Query failed: SYSTEM ERROR: Invalid format: 1
Fragment 0:0
[6a4adf04-f3db-4feb-8010-ebc3bfced1e3 on atsqa4-134.qa.lab:31010]
(java.lang.IllegalArgumentException) Invalid format: 1
  org.joda.time.format.PeriodFormatter.parseMutablePeriod():326
  org.joda.time.format.PeriodFormatter.parsePeriod():304
  org.joda.time.Period.parse():92
  org.joda.time.Period.parse():81
  org.apache.drill.exec.test.generated.ProjectorGen180.doEval():77
  org.apache.drill.exec.test.generated.ProjectorGen180.projectRecords():62
  org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork():170
  org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():93
  org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():130
  org.apache.drill.exec.record.AbstractRecordBatch.next():144
  org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():118
  org.apache.drill.exec.physical.impl.BaseRootExec.next():74
  org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():80
  org.apache.drill.exec.physical.impl.BaseRootExec.next():64
  org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():198
  org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():192
  java.security.AccessController.doPrivileged():-2
  javax.security.auth.Subject.doAs():415
  org.apache.hadoop.security.UserGroupInformation.doAs():1469
  org.apache.drill.exec.work.fragment.FragmentExecutor.run():192
  org.apache.drill.common.SelfCleaningRunnable.run():38
  java.util.concurrent.ThreadPoolExecutor.runWorker():1145
  java.util.concurrent.ThreadPoolExecutor$Worker.run():615
  java.lang.Thread.run():745
Error: exception while executing query: Failure while executing query. (state=,code=0)
{code} If I try casting a1 to an integer I run into DRILL-2859. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-2456) regexp_replace using hex codes fails on larger JSON data sets
[ https://issues.apache.org/jira/browse/DRILL-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mehant Baid updated DRILL-2456: --- Fix Version/s: (was: 1.2.0) 1.3.0 regexp_replace using hex codes fails on larger JSON data sets - Key: DRILL-2456 URL: https://issues.apache.org/jira/browse/DRILL-2456 Project: Apache Drill Issue Type: Bug Components: Functions - Drill Affects Versions: 0.7.0 Environment: Drill 0.7 MapR 4.0.1 CentOS Reporter: Andries Engelbrecht Assignee: Mehant Baid Fix For: 1.3.0 Attachments: drillbit.log This query works with only 1 file:
select regexp_replace(`text`, '[^\x20-\xad]', '°'), count(id) from dfs.twitter.`/feed/2015/03/13/17/FlumeData.1426267859699.json` group by `text` order by count(id) desc limit 10;
This one fails with multiple files:
select regexp_replace(`text`, '[^\x20-\xad]', '°'), count(id) from dfs.twitter.`/feed/2015/03/13` group by `text` order by count(id) desc limit 10;
Query failed: Query failed: Failure while trying to start remote fragment, Encountered an illegal char on line 1, column 31: '' [ 43ff1aa4-4a71-455d-b817-ec5eb8d179bb on twitternode:31010 ]
Using text in regexp_replace does work for the same dataset; this query works fine on the full data set:
select regexp_replace(`text`, '[^ -~¡-ÿ]', '°'), count(id) from dfs.twitter.`/feed/2015/03/13` group by `text` order by count(id) desc limit 10;
Attached is a snippet of drillbit.log for the error. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
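The character class in the failing query is itself well-formed; the same pattern works in Python's re engine, which suggests the failure is in how the pattern (with its hex escapes) survives being shipped to the remote fragment, not in the regex semantics. A quick reproduction of what the query intends, replacing every character outside \x20-\xad with '°':

```python
import re

# Same character class as the Drill query: anything outside the
# \x20-\xad range (control characters, emoji, etc.) is replaced.
pattern = re.compile(r'[^\x20-\xad]')

def scrub(text):
    return pattern.sub('°', text)
```

This only demonstrates the pattern is valid; it does not reproduce the multi-file fragment-startup failure.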
[jira] [Updated] (DRILL-3430) CAST to interval type doesn't accept standard-format strings
[ https://issues.apache.org/jira/browse/DRILL-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mehant Baid updated DRILL-3430: --- Fix Version/s: (was: 1.2.0) 1.3.0 CAST to interval type doesn't accept standard-format strings Key: DRILL-3430 URL: https://issues.apache.org/jira/browse/DRILL-3430 Project: Apache Drill Issue Type: Bug Components: Functions - Drill Reporter: Daniel Barclay (Drill) Assignee: Mehant Baid Fix For: 1.3.0 Cast specification evaluation is not compliant with the SQL standard. Mainly, it yields errors for standard-format strings that are specified to successfully yield interval values. In ISO/IEC 9075-2:2011(E) section 6.13 <cast specification>, General Rule 19 case b says that, in a cast specification casting to an interval type, a character string value that is a valid interval literal (<interval literal>) or unquoted interval string yields an interval value. (<interval literal> is the INTERVAL '1-6' YEAR TO MONTH syntax; <unquoted interval string> is the 1-6 syntax.) Drill currently rejects both of those syntaxes. Note the casts to type INTERVAL HOUR and the resulting error messages in the following: {noformat}
0: jdbc:drill:zk=local> SELECT CAST( CAST( 'INTERVAL ''1'' HOUR' AS VARCHAR(100) ) AS INTERVAL HOUR) FROM INFORMATION_SCHEMA.CATALOGS;
Error: SYSTEM ERROR: IllegalArgumentException: Invalid format: INTERVAL '1' HOUR
Fragment 0:0
[Error Id: b4bed61a-1efe-4e06-86d4-fff8f9829d50 on dev-linux2:31010] (state=,code=0)
0: jdbc:drill:zk=local> SELECT CAST( CAST( '1' AS VARCHAR(100) ) AS INTERVAL HOUR) FROM INFORMATION_SCHEMA.CATALOGS;
Error: SYSTEM ERROR: IllegalArgumentException: Invalid format: 1
Fragment 0:0
[Error Id: 91dec1ed-5cac-4235-93d7-49a2a0f03a1a on dev-linux2:31010] (state=,code=0)
0: jdbc:drill:zk=local>
{noformat} (The extra cast to VARCHAR is a workaround for a CHAR-vs.-VARCHAR bug.) Drill should accept the standard formats or at least document the non-compliance for users.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
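The standard behavior the report asks for (accept both the <interval literal> form and the unquoted interval string form when casting to INTERVAL HOUR) can be sketched like this. Illustrative Python only, not Drill's cast implementation, and covering just the single-field HOUR case:

```python
import re
from datetime import timedelta

def cast_to_interval_hour(s):
    """Sketch: accept both INTERVAL '1' HOUR (interval literal) and
    plain '1' (unquoted interval string) when casting to INTERVAL HOUR,
    per SQL standard 6.13 General Rule 19 case b."""
    s = s.strip()
    m = re.fullmatch(r"INTERVAL\s+'(\d+)'\s+HOUR", s, re.IGNORECASE)
    if m:                        # interval literal form
        return timedelta(hours=int(m.group(1)))
    if re.fullmatch(r"\d+", s):  # unquoted interval string form
        return timedelta(hours=int(s))
    raise ValueError("Invalid format: " + s)
```

A full implementation would dispatch on the target interval qualifier (YEAR TO MONTH, DAY TO SECOND, and so on) instead of hard-coding HOUR.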
[jira] [Updated] (DRILL-2558) Four different errors are returned when running one COALESCE with multiple arguments of incompatible data types
[ https://issues.apache.org/jira/browse/DRILL-2558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mehant Baid updated DRILL-2558: --- Fix Version/s: (was: 1.2.0) 1.3.0 Four different errors are returned when running one COALESCE with multiple arguments of incompatible data types --- Key: DRILL-2558 URL: https://issues.apache.org/jira/browse/DRILL-2558 Project: Apache Drill Issue Type: Bug Components: Functions - Drill Affects Versions: 0.8.0 Reporter: Victoria Markman Assignee: Mehant Baid Fix For: 1.3.0 Drill tries to do implicit casts at runtime, and we agreed that if a particular cast is not implemented, a runtime exception is fine for now. However, it feels weird to get four of them at the same time. I think we should throw an exception on the first incompatibility and stop processing the rest of the arguments. {code}
0: jdbc:drill:schema=dfs> select coalesce(c_varchar, c_integer, c_bigint, c_float, c_double, c_date, c_time, c_timestamp, c_boolean) from j2;
Query failed: Query stopped., Failure while trying to materialize incoming schema. Errors:
Error in expression at index -1. Error: Missing function implementation: [castBIT(TIMESTAMP-OPTIONAL)]. Full expression: --UNKNOWN EXPRESSION--.
Error in expression at index -1. Error: Missing function implementation: [castBIT(TIME-OPTIONAL)]. Full expression: --UNKNOWN EXPRESSION--.
Error in expression at index -1. Error: Missing function implementation: [castBIT(DATE-OPTIONAL)]. Full expression: --UNKNOWN EXPRESSION--.
Error in expression at index -1. Error: Missing function implementation: [castFLOAT8(BIT-OPTIONAL)]. Full expression: --UNKNOWN EXPRESSION--..
[ e2effaf7-6fcc-4d7d-b408-2031aab2a344 on atsqa4-133.qa.lab:31010 ]
Error: exception while executing query: Failure while executing query. (state=,code=0)
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
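The fail-fast behavior suggested above is a left-to-right type check that raises on the first incompatible implicit cast instead of accumulating all four errors. A toy Python sketch (the compatibility table here is a stand-in, not Drill's actual cast rules):

```python
# Hypothetical subset of implicit cast rules: (from_type, to_type).
IMPLICIT_CASTS = {
    ("integer", "varchar"), ("bigint", "varchar"),
    ("float", "varchar"), ("double", "varchar"),
}

def check_coalesce(arg_types):
    """Check COALESCE argument types left to right and stop at the
    FIRST incompatibility, as the report proposes."""
    target = arg_types[0]
    for i, t in enumerate(arg_types[1:], start=1):
        if t != target and (t, target) not in IMPLICIT_CASTS:
            # One error, at the first problem; later arguments unchecked.
            raise TypeError(
                "Missing function implementation: cast %s to %s "
                "(argument %d)" % (t, target, i))
    return target
```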
[jira] [Updated] (DRILL-2792) Killing the drillbit which is the foreman results in direct memory being held on
[ https://issues.apache.org/jira/browse/DRILL-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Altekruse updated DRILL-2792: --- Fix Version/s: (was: 1.2.0) 1.3.0 Killing the drillbit which is the foreman results in direct memory being held on Key: DRILL-2792 URL: https://issues.apache.org/jira/browse/DRILL-2792 Project: Apache Drill Issue Type: Bug Components: Execution - Flow Affects Versions: 0.8.0 Reporter: Ramana Inukonda Nagaraj Assignee: Jason Altekruse Fix For: 1.3.0 Killed one of the drillbits, which was the foreman for the query. The Profiles page reports that the query has been cancelled. Due to bug DRILL-2778, sqlline hangs. However, after killing sqlline, the current direct memory used does not go down to pre-query levels. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-2003) bootstrap-storage-plugins.json is not merged properly
[ https://issues.apache.org/jira/browse/DRILL-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Westin updated DRILL-2003: Fix Version/s: (was: 1.2.0) 1.4.0 bootstrap-storage-plugins.json is not merged properly - Key: DRILL-2003 URL: https://issues.apache.org/jira/browse/DRILL-2003 Project: Apache Drill Issue Type: Bug Components: Metadata Affects Versions: 0.7.0 Reporter: Rahul Challapalli Assignee: Jason Altekruse Fix For: 1.4.0 Drill is not picking up the bootstrap-storage-plugins.json from the conf directory. I made sure that Zookeeper's drill directory is empty and that the conf directory is in the classpath. It looks like there is an issue with the merge with the same file in the jar file provided with Drill. This worked with 0.6.0 but seems to be broken with the current 0.7.0 release. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3292) SUM(constant) OVER(...) returns wrong results
[ https://issues.apache.org/jira/browse/DRILL-3292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615600#comment-14615600 ] Sean Hsuan-Yi Chu commented on DRILL-3292: -- https://reviews.apache.org/r/36219/ SUM(constant) OVER(...) returns wrong results - Key: DRILL-3292 URL: https://issues.apache.org/jira/browse/DRILL-3292 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators, Query Planning & Optimization Affects Versions: 1.0.0 Reporter: Deneche A. Hakim Assignee: Sean Hsuan-Yi Chu Priority: Critical Labels: window_function Fix For: 1.2.0 The following query returns wrong results: {noformat}
0: jdbc:drill:> select sum(1) over w sum1, sum(5) over w sum5 from cp.`employee.json` where position_id = 2 window w as (partition by position_id);
+-------+-------+
| sum1  | sum5  |
+-------+-------+
| 6     | 6     |
| 6     | 6     |
| 6     | 6     |
| 6     | 6     |
| 6     | 6     |
| 6     | 6     |
+-------+-------+
{noformat} The second column should display 30 (5 x 6) instead of 6. Here is the plan for the query: {noformat}
00-00    Screen
00-01      Project(sum1=[$0], sum5=[$1])
00-02        Project(sum1=[$0], sum5=[$1])
00-03          Project($0=[$1], $1=[$2])
00-04            Window(window#0=[window(partition {0} order by [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1), SUM($2)])])
00-05              SelectionVectorRemover
00-06                Sort(sort0=[$0], dir0=[ASC])
00-07                  Filter(condition=[=($0, 2)])
00-08                    Scan(groupscan=[EasyGroupScan [selectionRoot=/employee.json, numFiles=1, columns=[`position_id`], files=[classpath:/employee.json]]])
{noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
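The expected semantics are easy to state: SUM(constant) OVER (PARTITION BY key) gives every row in a partition of size n the value constant * n. A Python model showing the expected output for the failing query (n = 6, so sum(1) is 6 and sum(5) is 30):

```python
from collections import defaultdict

def window_sum(rows, partition_key, constant):
    """Model of SUM(constant) OVER (PARTITION BY key) with an
    unbounded frame: each row gets constant * (partition size)."""
    counts = defaultdict(int)
    for row in rows:
        counts[row[partition_key]] += 1
    return [constant * counts[row[partition_key]] for row in rows]
```

The bug is consistent with the planner collapsing both constants onto the same window aggregate input, so sum(5) is computed as if the constant were 1.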
[jira] [Updated] (DRILL-3419) Handle scans optimally when all files are pruned out
[ https://issues.apache.org/jira/browse/DRILL-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Westin updated DRILL-3419: Fix Version/s: (was: 1.2.0) 1.4.0 Handle scans optimally when all files are pruned out Key: DRILL-3419 URL: https://issues.apache.org/jira/browse/DRILL-3419 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 1.1.0 Reporter: Khurram Faraaz Assignee: Steven Phillips Fix For: 1.4.0 Note that in case (1) and case (2) we prune; however, it is not clear if we prune in case (3), because we see a FILTER in the query plan in case (3). CTAS {code}
0: jdbc:drill:schema=dfs.tmp> CREATE TABLE CTAS_ONE_MILN_RWS_PER_GROUP(col1, col2) PARTITION BY (col2) AS select cast(columns[0] as bigint) col1, cast(columns[1] as char(2)) col2 from `millionValGroup.csv`;
+-----------+----------------------------+
| Fragment  | Number of records written  |
+-----------+----------------------------+
| 1_1       | 21932064                   |
| 1_0       | 28067936                   |
+-----------+----------------------------+
2 rows selected (73.661 seconds)
{code} case 1) {code}
explain plan for select col1, col2 from CTAS_ONE_MILN_RWS_PER_GROUP where col2 LIKE '%Z%';
00-00    Screen
00-01      Project(col1=[$0], col2=[$1])
00-02        UnionExchange
01-01          Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_3.parquet], ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_1_3.parquet]], selectionRoot=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP, numFiles=2, columns=[`col2`, `col1`]]])
{code} case 2) {code}
explain plan for select col1, col2 from CTAS_ONE_MILN_RWS_PER_GROUP where col2 LIKE 'A%';
00-00    Screen
00-01      Project(col1=[$0], col2=[$1])
00-02        UnionExchange
01-01          Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_3.parquet], ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_2.parquet], ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_1_1.parquet], ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_1_2.parquet], ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_1_3.parquet], ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_1.parquet]], selectionRoot=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP, numFiles=6, columns=[`col2`, `col1`]]])
{code} case 3) we are NOT pruning here. {code}
explain plan for select col1, col2 from CTAS_ONE_MILN_RWS_PER_GROUP where col2 LIKE 'Z%';
00-00    Screen
00-01      Project(col1=[$1], col2=[$0])
00-02        SelectionVectorRemover
00-03          Filter(condition=[LIKE($0, 'Z%')])
00-04            Project(col2=[$1], col1=[$0])
00-05              Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_1_48.parquet]], selectionRoot=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP, numFiles=1, columns=[`col2`, `col1`]]])
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3417) Filter present in query plan when we should not see one
[ https://issues.apache.org/jira/browse/DRILL-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Westin updated DRILL-3417: Fix Version/s: (was: 1.2.0) 1.4.0 Filter present in query plan when we should not see one Key: DRILL-3417 URL: https://issues.apache.org/jira/browse/DRILL-3417 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 1.1.0 Environment: 4 node cluster CentOS Reporter: Khurram Faraaz Assignee: Steven Phillips Fix For: 1.4.0 We are seeing a FILTER in the query plan; in this case we should not see one. {code}
0: jdbc:drill:schema=dfs.tmp> CREATE TABLE CTAS_ONE_MILN_RWS_PER_GROUP(col1, col2) PARTITION BY (col2) AS select cast(columns[0] as bigint) col1, cast(columns[1] as char(2)) col2 from `millionValGroup.csv`;
+-----------+----------------------------+
| Fragment  | Number of records written  |
+-----------+----------------------------+
| 1_1       | 21932064                   |
| 1_0       | 28067936                   |
+-----------+----------------------------+
2 rows selected (73.661 seconds)
{code} Total number of rows in CTAS output {code}
0: jdbc:drill:schema=dfs.tmp> select count(*) from CTAS_ONE_MILN_RWS_PER_GROUP;
+---------+
| EXPR$0  |
+---------+
| 5000 |
+---------+
1 row selected (0.197 seconds)
{code} {code}
explain plan for select col1, col2 from CTAS_ONE_MILN_RWS_PER_GROUP where col2 = 'AK';
00-00    Screen
00-01      Project(col1=[$0], col2=[$1])
00-02        UnionExchange
01-01          Project(col1=[$1], col2=[$0])
01-02            SelectionVectorRemover
01-03              Filter(condition=[=($0, 'AK')])
01-04                Project(col2=[$1], col1=[$0])
01-05                  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tmp/CTAS_ONE_MILN_RWS_PER_GROUP]], selectionRoot=/tmp/CTAS_ONE_MILN_RWS_PER_GROUP, numFiles=1, columns=[`col2`, `col1`]]])
{code} Number of files created by CTAS {code}
[root@centos-01 ~]# hadoop fs -ls /tmp/CTAS_ONE_MILN_RWS_PER_GROUP
Found 98 items
-rwxr-xr-x 3 mapr mapr 2907957 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_1.parquet
-rwxr-xr-x 3 mapr mapr 2902189 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_10.parquet
-rwxr-xr-x 3 mapr mapr 2910365 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_11.parquet
-rwxr-xr-x 3 mapr mapr 2906479 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_12.parquet
-rwxr-xr-x 3 mapr mapr 2900842 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_13.parquet
-rwxr-xr-x 3 mapr mapr 2901196 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_14.parquet
-rwxr-xr-x 3 mapr mapr 2909687 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_15.parquet
-rwxr-xr-x 3 mapr mapr 2908603 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_16.parquet
-rwxr-xr-x 3 mapr mapr 2903334 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_17.parquet
-rwxr-xr-x 3 mapr mapr 2906378 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_18.parquet
-rwxr-xr-x 3 mapr mapr 2904710 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_19.parquet
-rwxr-xr-x 3 mapr mapr 2903170 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_2.parquet
-rwxr-xr-x 3 mapr mapr 2908703 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_20.parquet
-rwxr-xr-x 3 mapr mapr 2903634 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_21.parquet
-rwxr-xr-x 3 mapr mapr 2898076 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_22.parquet
-rwxr-xr-x 3 mapr mapr 2899426 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_23.parquet
-rwxr-xr-x 3 mapr mapr 2903914 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_24.parquet
-rwxr-xr-x 3 mapr mapr 2906561 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_25.parquet
-rwxr-xr-x 3 mapr mapr 2899655 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_26.parquet
-rwxr-xr-x 3 mapr mapr 2902479 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_27.parquet
-rwxr-xr-x 3 mapr mapr 2905985 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_28.parquet
-rwxr-xr-x 3 mapr mapr 2901645 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_29.parquet
-rwxr-xr-x 3 mapr mapr 2901653 2015-06-29 17:55 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_3.parquet
-rwxr-xr-x 3 mapr mapr 2903008 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_30.parquet
-rwxr-xr-x 3 mapr mapr 2898135 2015-06-29 17:56 /tmp/CTAS_ONE_MILN_RWS_PER_GROUP/1_0_31.parquet
-rwxr-xr-x 3 mapr mapr 2908631
[jira] [Updated] (DRILL-3082) Safeguards against dropping rows in StreamingAggBatch
[ https://issues.apache.org/jira/browse/DRILL-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Westin updated DRILL-3082: Fix Version/s: (was: 1.2.0) 1.4.0 Safeguards against dropping rows in StreamingAggBatch - Key: DRILL-3082 URL: https://issues.apache.org/jira/browse/DRILL-3082 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Reporter: Mehant Baid Assignee: Steven Phillips Fix For: 1.4.0 As I was debugging DRILL-3069, Steven mentioned he had a patch to safeguard against dropping rows in StreamingAggBatch. It might be useful to get this patch as it would cause asserts in such scenarios. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-2483) Configuration parameter to change default record batch size for scanners
[ https://issues.apache.org/jira/browse/DRILL-2483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Westin updated DRILL-2483: Fix Version/s: (was: 1.2.0) 1.3.0 Configuration parameter to change default record batch size for scanners Key: DRILL-2483 URL: https://issues.apache.org/jira/browse/DRILL-2483 Project: Apache Drill Issue Type: Wish Components: Storage - Other Reporter: Victoria Markman Assignee: Jason Altekruse Fix For: 1.3.0 We recently found a bug where, if a table had multiple duplicate rows and the duplicates spanned multiple buffers, merge join returned wrong results. The test case had a table with 10,000 rows. The same problem could be reproduced on a much smaller data set if the buffer size were configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-2035) Add ability to cancel multiple queries
[ https://issues.apache.org/jira/browse/DRILL-2035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Westin updated DRILL-2035: Fix Version/s: (was: 1.2.0) Future Add ability to cancel multiple queries -- Key: DRILL-2035 URL: https://issues.apache.org/jira/browse/DRILL-2035 Project: Apache Drill Issue Type: New Feature Components: Client - HTTP Reporter: Neeraja Assignee: Jason Altekruse Fix For: Future Currently Drill UI allows canceling one query at a time. This could be cumbersome to manage for scenarios using with BI tools which generate multiple queries for a single action in the UI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-1751) Drill must indicate JSON All Text Mode needs to be turned on when such queries fail
[ https://issues.apache.org/jira/browse/DRILL-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Westin updated DRILL-1751: Fix Version/s: (was: 1.2.0) 1.3.0 Drill must indicate JSON All Text Mode needs to be turned on when such queries fail --- Key: DRILL-1751 URL: https://issues.apache.org/jira/browse/DRILL-1751 Project: Apache Drill Issue Type: Improvement Components: Storage - JSON Reporter: Abhishek Girish Assignee: Jason Altekruse Priority: Critical Labels: error_message_must_fix Fix For: 1.3.0 Attachments: drillbit.log Although JSON All Text Mode is a documented option, it may not be obvious to turn this option ON on encountering an error. Query:
select * from dfs.`/data/json/lastfm/lastfm_test/A/A/A/TRAAAEA128F935A30D.json` limit 1;
Query failed: Failure while running fragment. [ 4331e9a7-c5b4-4e52-bece-214ffa5d06dd on abhi7.qa.lab:31010 ]
Error: exception while executing query: Failure while executing query. (state=,code=0)
Resolution: I tried setting JSON All Text Mode and queries began to work.
alter system set `store.json.all_text_mode` = true;
+-------+------------------------------------+
|  ok   |              summary               |
+-------+------------------------------------+
| true  | store.json.all_text_mode updated.  |
+-------+------------------------------------+
1 row selected (0.136 seconds)
0: jdbc:drill:zk=10.10.103.34:5181> select * from dfs.`/data/json/lastfm/lastfm_test/A/A/A/TRAAAEA128F935A30D.json` limit 1;
++++++ results ++++++
1 row selected (0.169 seconds)
A clear message must be included in the logs and must be displayed on SQLline. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3426) DROP Table
[ https://issues.apache.org/jira/browse/DRILL-3426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Westin updated DRILL-3426: Fix Version/s: (was: 1.2.0) 1.3.0 DROP Table -- Key: DRILL-3426 URL: https://issues.apache.org/jira/browse/DRILL-3426 Project: Apache Drill Issue Type: New Feature Components: Execution - Flow Reporter: Soumendra Kumar Mishra Assignee: Chris Westin Fix For: 1.3.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-2974) Make OutOfMemoryException an unchecked exception and remove OutOfMemoryRuntimeException
[ https://issues.apache.org/jira/browse/DRILL-2974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Westin updated DRILL-2974: Fix Version/s: (was: 1.2.0) 1.3.0 Make OutOfMemoryException an unchecked exception and remove OutOfMemoryRuntimeException --- Key: DRILL-2974 URL: https://issues.apache.org/jira/browse/DRILL-2974 Project: Apache Drill Issue Type: Improvement Components: Execution - Flow Reporter: Deneche A. Hakim Assignee: Deneche A. Hakim Priority: Minor Fix For: 1.3.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3323) Flatten planning rule creates unneeded copy of the list being flattened, causes execution/allocation issues with large lists
[ https://issues.apache.org/jira/browse/DRILL-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Altekruse updated DRILL-3323: --- Fix Version/s: (was: 1.2.0) 1.3.0 Flatten planning rule creates unneeded copy of the list being flattened, causes execution/allocation issues with large lists - Key: DRILL-3323 URL: https://issues.apache.org/jira/browse/DRILL-3323 Project: Apache Drill Issue Type: Bug Components: Execution - Flow Reporter: Jason Altekruse Assignee: Jason Altekruse Fix For: 1.3.0 The planning rule for flatten was written not only to handle the flatten operator, but also to address some shortcomings in expression evaluation involving complex types. The rule currently plans inefficiently to try to cover some of these more advanced cases, but there is no thorough test coverage to even demonstrate the benefits of doing so. We should disable the behavior of copying complex data an extra time when it is not needed, because it is causing flatten queries to fail with allocation issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-2691) Source files with Windows line endings
[ https://issues.apache.org/jira/browse/DRILL-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Altekruse updated DRILL-2691: --- Fix Version/s: (was: 1.2.0) 1.3.0 Source files with Windows line endings -- Key: DRILL-2691 URL: https://issues.apache.org/jira/browse/DRILL-2691 Project: Apache Drill Issue Type: Bug Components: Tools, Build Test Affects Versions: 0.6.0 Reporter: Deneche A. Hakim Assignee: Jason Altekruse Fix For: 1.3.0 Attachments: DRILL-2691.1.patch.txt The following files: {noformat} common/src/main/java/org/apache/drill/common/util/DrillStringUtils.java contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestHBaseCFAsJSONString.java {noformat} Have Windows line endings in them. Trying to apply a patch that contains changes in one of those files will fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-2911) Queries fail with connection error when some Drillbit processes are down
[ https://issues.apache.org/jira/browse/DRILL-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Westin updated DRILL-2911: Fix Version/s: (was: 1.2.0) 1.3.0 Queries fail with connection error when some Drillbit processes are down Key: DRILL-2911 URL: https://issues.apache.org/jira/browse/DRILL-2911 Project: Apache Drill Issue Type: Bug Components: Execution - Flow Affects Versions: 0.9.0 Reporter: Abhishek Girish Assignee: Chris Westin Fix For: 1.3.0 Attachments: drillbit_node1.log, drillbit_node2.log, drillbit_node3.log, drillbit_node4.log Queries fail with a connection error even though the Drill web UI shows all drillbits to be up. However, some nodes do not list the Drillbit process. Looks like an inconsistent state. Queries with simple scans execute successfully: {code:sql} select i_item_sk from item limit 5;
+------------+
| i_item_sk  |
+------------+
| 1          |
| 2          |
| 3          |
| 4          |
| 5          |
+------------+
5 rows selected (0.112 seconds) {code} Any query which might span multiple drillbits fails with a connection error: {code:sql} SELECT * FROM item i, inventory inv WHERE inv.inv_item_sk = i.i_item_sk LIMIT 10; Query failed: CONNECTION ERROR: Exceeded timeout while waiting send intermediate work fragments to remote nodes. Sent 4 and only heard response back from 3 nodes. [5ada1a3e-d198-478b-941d-3c9bb917e494 on abhi7.qa.lab:31010] Error: exception while executing query: Failure while executing query. (state=,code=0) {code} The issue could possibly be due to a previous failed query. Couldn't find the error code in the logs. Have attached logs from all nodes for reference. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-2863) Slow code generation/compilation(/scalar replacement?) for getColumns(...) query
[ https://issues.apache.org/jira/browse/DRILL-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Westin updated DRILL-2863: Fix Version/s: (was: 1.2.0) 1.4.0 Slow code generation/compilation(/scalar replacement?) for getColumns(...) query Key: DRILL-2863 URL: https://issues.apache.org/jira/browse/DRILL-2863 Project: Apache Drill Issue Type: Bug Components: Execution - Codegen Reporter: Daniel Barclay (Drill) Assignee: Chris Westin Fix For: 1.4.0 Calling Drill's JDBC driver's DatabaseMetaData.getColumns(...) method seems to take an unusually long time to execute. Unit tests TestJdbcMetadata and Drill2128GetColumnsDataTypeNotTypeCodeIntBugsTest have gotten slower recently, seemingly in several increments: they needed their timeouts increased, from around 50 s to 90 s, and then to 120 s, and that 120 s timeout is not long enough for reliable runs (at least on my machine). From looking at the logs (with sufficiently verbose logging), it seems that the large SQL query in the implementation of getColumns() (currently in org.apache.drill.jdbc.MetaImpl) leads to 513 kB of generated code. That half a megabyte of generated Java code frequently takes around 110 seconds to compile (on my machine). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-2125) Add input template file in the source files generated by freemarker
[ https://issues.apache.org/jira/browse/DRILL-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mehant Baid updated DRILL-2125: --- Fix Version/s: (was: 1.2.0) 1.4.0 Add input template file in the source files generated by freemarker --- Key: DRILL-2125 URL: https://issues.apache.org/jira/browse/DRILL-2125 Project: Apache Drill Issue Type: Improvement Components: Tools, Build Test Reporter: Mehant Baid Assignee: Mehant Baid Fix For: 1.4.0 Attachments: DRILL-2125.patch Currently only some generated source files include information about which template was used to create them. For better readability and easier template modification, every generated file should name the template it was generated from. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-2304) Case sensitivity - system and session options are case sensitive
[ https://issues.apache.org/jira/browse/DRILL-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Altekruse updated DRILL-2304: --- Assignee: Sudheesh Katkam (was: Jason Altekruse) Case sensitivity - system and session options are case sensitive Key: DRILL-2304 URL: https://issues.apache.org/jira/browse/DRILL-2304 Project: Apache Drill Issue Type: Bug Components: Storage - Information Schema Affects Versions: 0.8.0 Reporter: Ramana Inukonda Nagaraj Assignee: Sudheesh Katkam Priority: Minor Fix For: 1.2.0 Attachments: DRILL-2304.1.patch.txt, DRILL-2304.2.patch.txt TBH I am not sure if this is a bug. When I try to set a session option and specify the name in a different case, the alter command fails. Considering the way we store session options this might be an invalid bug, but considering how typical database hints and options work, this is a bug. {code} 0: jdbc:drill:> alter SESSION set `STORE.PARQUET.COMPRESSION`='GZIP'; Query failed: SetOptionException: Unknown option: STORE.PARQUET.COMPRESSION Error: exception while executing query: Failure while executing query. (state=,code=0) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
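One way to address this class of problem is to normalize option names to a single case before storing and looking them up. A minimal sketch of that idea, using a hypothetical OptionRegistry class (not Drill's actual OptionManager):

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class OptionRegistry {
    private final Map<String, String> options = new HashMap<>();

    // Normalize option names to lower case so lookups are case-insensitive.
    public void set(String name, String value) {
        options.put(name.toLowerCase(Locale.ROOT), value);
    }

    public String get(String name) {
        String v = options.get(name.toLowerCase(Locale.ROOT));
        if (v == null) {
            throw new IllegalArgumentException("Unknown option: " + name);
        }
        return v;
    }
}
```

With this normalization, `STORE.PARQUET.COMPRESSION` and `store.parquet.compression` resolve to the same entry, matching how most databases treat option and hint names.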
[jira] [Updated] (DRILL-2120) Bringing up multiple drillbits at same time results in synchronization failure
[ https://issues.apache.org/jira/browse/DRILL-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Westin updated DRILL-2120: Fix Version/s: (was: 1.2.0) 1.3.0 Bringing up multiple drillbits at same time results in synchronization failure -- Key: DRILL-2120 URL: https://issues.apache.org/jira/browse/DRILL-2120 Project: Apache Drill Issue Type: Bug Components: Execution - Flow Affects Versions: 0.8.0 Reporter: Ramana Inukonda Nagaraj Assignee: Steven Phillips Fix For: 1.3.0 Repro: With a fresh ZK install, bring up 4 drillbits at the same time using something like: clush -g ats /opt/drill/bin/drillbit.sh start Looks like all 4 nodes query ZK to see if the node exists and all of them try to create it at the same time. Some succeed, others don't. The ones which fail have incorrect information about the state of ZK, which would explain the stacktrace below. {code} log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Exception in thread main org.apache.drill.exec.exception.DrillbitStartupException: Failure during initial startup of Drillbit. at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:76) at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:60) at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:83) Caused by: java.lang.RuntimeException: Failure while accessing Zookeeper at org.apache.drill.exec.store.sys.zk.ZkAbstractStore.putIfAbsent(ZkAbstractStore.java:135) at org.apache.drill.exec.store.StoragePluginRegistry.createPlugins(StoragePluginRegistry.java:150) at org.apache.drill.exec.store.StoragePluginRegistry.init(StoragePluginRegistry.java:130) at org.apache.drill.exec.server.Drillbit.run(Drillbit.java:155) at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:73) ... 
2 more Caused by: java.lang.RuntimeException: Failure while accessing Zookeeper at org.apache.drill.exec.store.sys.zk.ZkPStore.createNodeInZK(ZkPStore.java:53) at org.apache.drill.exec.store.sys.zk.ZkAbstractStore.putIfAbsent(ZkAbstractStore.java:129) ... 6 more Caused by: org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /drill-ats-build/sys.storage_plugins/cp at org.apache.zookeeper.KeeperException.create(KeeperException.java:119) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783) at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:676) at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:660) at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107) at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:656) at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:441) at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:431) at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:44) at org.apache.drill.exec.store.sys.zk.ZkPStore.createNodeInZK(ZkPStore.java:51) ... 7 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
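The race in the stack trace is benign: every drillbit attempts to create the same znode, and only the losers see NodeExistsException. A fix along the lines this ticket suggests is for putIfAbsent to treat "already exists" as a lost race rather than a startup failure. A minimal sketch of that desired semantics, modeled with an in-memory concurrent map instead of ZooKeeper/Curator (in the real store, the catch would be on KeeperException.NodeExistsException):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class PStore {
    private final ConcurrentMap<String, byte[]> store = new ConcurrentHashMap<>();

    // Returns true if this caller created the entry; false if another node
    // won the race. Either way, the entry exists afterwards -- the "already
    // exists" outcome is an expected result, not an error to rethrow.
    public boolean putIfAbsent(String path, byte[] value) {
        return store.putIfAbsent(path, value) == null;
    }
}
```

Under this contract, four drillbits racing to create `/sys.storage_plugins/cp` all proceed normally; exactly one gets `true` and the rest get `false`.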
[jira] [Commented] (DRILL-3189) Disable ALLOW PARTIAL/DISALLOW PARTIAL in window function grammar
[ https://issues.apache.org/jira/browse/DRILL-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615612#comment-14615612 ] Sean Hsuan-Yi Chu commented on DRILL-3189: -- We might need to disable only DISALLOW PARTIAL because, by default (i.e., when neither allow nor disallow partial is specified), Calcite assumes partial is allowed. More information is below (from Calcite's javadoc): Returns whether partial windows are allowed. If false, a partial window (for example, a window of size 1 hour which has only 45 minutes of data in it) will appear to windowed aggregate functions to be empty. Disable ALLOW PARTIAL/DISALLOW PARTIAL in window function grammar - Key: DRILL-3189 URL: https://issues.apache.org/jira/browse/DRILL-3189 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 1.0.0 Reporter: Victoria Markman Assignee: Sean Hsuan-Yi Chu Priority: Critical Labels: window_function Fix For: 1.2.0 It does not seem to be implemented on the Drill side. Looks like Calcite-specific grammar. Don't see it in the SQL standard. 
Looks like wrong result: {code} 0: jdbc:drill:schema=dfs> select a2, sum(a2) over(partition by a2 order by a2 rows between 1 preceding and 1 following disallow partial) from t2 order by a2;
+-----+---------+
| a2  | EXPR$1  |
+-----+---------+
| 0   | null    |
| 1   | null    |
| 2   | 6       |
| 2   | 6       |
| 2   | 6       |
| 3   | null    |
| 4   | null    |
| 5   | null    |
| 6   | null    |
| 7   | 14      |
| 7   | 14      |
| 8   | null    |
| 9   | null    |
+-----+---------+
13 rows selected (0.213 seconds) {code} {code} 0: jdbc:drill:schema=dfs> select a2, sum(a2) over(partition by a2 order by a2 rows between 1 preceding and 1 following allow partial) from t2 order by a2;
+-----+---------+
| a2  | EXPR$1  |
+-----+---------+
| 0   | 0       |
| 1   | 1       |
| 2   | 6       |
| 2   | 6       |
| 2   | 6       |
| 3   | 3       |
| 4   | 4       |
| 5   | 5       |
| 6   | 6       |
| 7   | 14      |
| 7   | 14      |
| 8   | 8       |
| 9   | 9       |
+-----+---------+
13 rows selected (0.208 seconds) {code} {code} 0: jdbc:drill:schema=dfs> select a2, sum(a2) over(partition by a2 order by a2 disallow partial) from t2 order by a2; Error: PARSE ERROR: From line 1, column 53 to line 1, column 68: Cannot use DISALLOW PARTIAL with window based on RANGE [Error Id: 984c4b81-9eb0-401d-b36a-9580640b4a78 on atsqa4-133.qa.lab:31010] (state=,code=0) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
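The Calcite semantics quoted above can be illustrated with a tiny windowed sum over one partition: with ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING, frames truncated at the partition edge are "partial", and under DISALLOW PARTIAL a partial frame appears empty to the aggregate (yielding null). A sketch of those semantics only; this is illustrative code, not Drill's or Calcite's implementation:

```java
import java.util.ArrayList;
import java.util.List;

public class WindowSum {
    // Sum over ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING within one partition.
    // With allowPartial=false, frames truncated at a partition edge yield null,
    // per the Calcite javadoc quoted above.
    public static List<Integer> sum(int[] part, boolean allowPartial) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < part.length; i++) {
            boolean full = (i - 1 >= 0) && (i + 1 < part.length);
            if (!full && !allowPartial) {
                out.add(null); // partial frame treated as empty
                continue;
            }
            int s = 0;
            for (int j = Math.max(0, i - 1); j <= Math.min(part.length - 1, i + 1); j++) {
                s += part[j];
            }
            out.add(s);
        }
        return out;
    }
}
```

For a three-row partition of 2s, only the middle row has a full frame, so DISALLOW PARTIAL produces null for the edge rows while ALLOW PARTIAL sums whatever rows the truncated frame contains.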
[jira] [Updated] (DRILL-2478) Validating values assigned to SYSTEM/SESSION configuration parameters
[ https://issues.apache.org/jira/browse/DRILL-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Westin updated DRILL-2478: Fix Version/s: (was: 1.2.0) 1.4.0 Validating values assigned to SYSTEM/SESSION configuration parameters - Key: DRILL-2478 URL: https://issues.apache.org/jira/browse/DRILL-2478 Project: Apache Drill Issue Type: Bug Components: Execution - Flow Affects Versions: 0.8.0 Environment: {code} 0: jdbc:drill:> select * from sys.version;
+-------------------------------------------+-----------------------------------------------------+----------------------------+--------------+----------------------------+
| commit_id                                 | commit_message                                      | commit_time                | build_email  | build_time                 |
+-------------------------------------------+-----------------------------------------------------+----------------------------+--------------+----------------------------+
| f658a3c513ddf7f2d1b0ad7aa1f3f65049a594fe  | DRILL-2209 Insert ProjectOperator with MuxExchange  | 09.03.2015 @ 01:49:18 EDT  | Unknown      | 09.03.2015 @ 04:50:05 EDT  |
+-------------------------------------------+-----------------------------------------------------+----------------------------+--------------+----------------------------+
1 row selected (0.046 seconds) {code} Reporter: Khurram Faraaz Assignee: Sudheesh Katkam Fix For: 1.4.0 Values that are assigned to configuration parameters of type SYSTEM and SESSION must be validated. Currently any value can be assigned to some of the SYSTEM/SESSION type parameters. Here are two examples where assignment of invalid values to store.format does not result in any error. {code} 0: jdbc:drill:> alter session set `store.format`='1';
+-------+------------------------+
| ok    | summary                |
+-------+------------------------+
| true  | store.format updated.  |
+-------+------------------------+
1 row selected (0.02 seconds) {code} {code} 0: jdbc:drill:> alter session set `store.format`='foo';
+-------+------------------------+
| ok    | summary                |
+-------+------------------------+
| true  | store.format updated.  |
+-------+------------------------+
1 row selected (0.039 seconds) {code} In some cases values are validated: for example, trying to assign an invalid value to store.parquet.compression results in an error, which is correct. However, this kind of validation is not performed for every configuration parameter of SYSTEM/SESSION type. Values assigned to these parameters must be validated, with errors reported when users assign incorrect values. 
{code} 0: jdbc:drill:> alter session set `store.parquet.compression`='anything'; Query failed: ExpressionParsingException: Option store.parquet.compression must be one of: [snappy, gzip, none] Error: exception while executing query: Failure while executing query. (state=,code=0) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
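The error above comes from an option that declares an allowed-value list; this ticket is effectively asking for every option to carry such a validator. A minimal sketch of that kind of check, using a hypothetical EnumValidator class (not Drill's actual OptionValidator hierarchy):

```java
import java.util.Arrays;
import java.util.List;

public class EnumValidator {
    private final String name;
    private final List<String> allowed;

    public EnumValidator(String name, String... allowed) {
        this.name = name;
        this.allowed = Arrays.asList(allowed);
    }

    // Reject any value outside the declared set, mirroring the
    // "must be one of: [snappy, gzip, none]" error shown above.
    public void validate(String value) {
        if (!allowed.contains(value)) {
            throw new IllegalArgumentException(
                "Option " + name + " must be one of: " + allowed);
        }
    }
}
```

Attaching a validator like this to store.format would make `alter session set \`store.format\`='foo'` fail instead of silently succeeding.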
[jira] [Updated] (DRILL-3455) If a drillbit, that contains fragments for the current query, dies the QueryManager will fail the query even if those fragments already finished successfully
[ https://issues.apache.org/jira/browse/DRILL-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sudheesh Katkam updated DRILL-3455: --- Attachment: DRILL-3455.2.patch.txt If a drillbit, that contains fragments for the current query, dies the QueryManager will fail the query even if those fragments already finished successfully - Key: DRILL-3455 URL: https://issues.apache.org/jira/browse/DRILL-3455 Project: Apache Drill Issue Type: Bug Components: Execution - Flow Reporter: Deneche A. Hakim Assignee: Jacques Nadeau Fix For: 1.2.0 Attachments: DRILL-3455.1.patch.txt, DRILL-3455.2.patch.txt Once DRILL-3448 is fixed we need to update QueryManager.DrillbitStatusListener to check that no fragment is still running on the dead node before failing the query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3194) TestDrillbitResilience#memoryLeaksWhenFailed hangs
[ https://issues.apache.org/jira/browse/DRILL-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14615644#comment-14615644 ] Sudheesh Katkam commented on DRILL-3194: I removed the timeout and set the repeat count to 30. The test hangs. TestDrillbitResilience#memoryLeaksWhenFailed hangs -- Key: DRILL-3194 URL: https://issues.apache.org/jira/browse/DRILL-3194 Project: Apache Drill Issue Type: Bug Components: Execution - Flow Reporter: Sudheesh Katkam Assignee: Deneche A. Hakim Fix For: 1.2.0 TestDrillbitResilience#memoryLeaksWhenFailed hangs and fails when run multiple times. This might be related to DRILL-3163. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3341) Move OperatorWrapper list and FragmentWrapper list creation to ProfileWrapper ctor
[ https://issues.apache.org/jira/browse/DRILL-3341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14615338#comment-14615338 ] Sudheesh Katkam commented on DRILL-3341: RB: https://reviews.apache.org/r/36210/ Move OperatorWrapper list and FragmentWrapper list creation to ProfileWrapper ctor -- Key: DRILL-3341 URL: https://issues.apache.org/jira/browse/DRILL-3341 Project: Apache Drill Issue Type: Improvement Reporter: Sudheesh Katkam Assignee: Sudheesh Katkam Priority: Minor Fix For: 1.2.0 Attachments: DRILL-3341.1.patch.txt + avoid re-computation in some cases + consistent comparator names -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3341) Move OperatorWrapper list and FragmentWrapper list creation to ProfileWrapper ctor
[ https://issues.apache.org/jira/browse/DRILL-3341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sudheesh Katkam updated DRILL-3341: --- Assignee: Jason Altekruse (was: Sudheesh Katkam) Move OperatorWrapper list and FragmentWrapper list creation to ProfileWrapper ctor -- Key: DRILL-3341 URL: https://issues.apache.org/jira/browse/DRILL-3341 Project: Apache Drill Issue Type: Improvement Reporter: Sudheesh Katkam Assignee: Jason Altekruse Priority: Minor Fix For: 1.2.0 Attachments: DRILL-3341.1.patch.txt + avoid re-computation in some cases + consistent comparator names -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3096) State change requested from ... -- ... for blank after for
[ https://issues.apache.org/jira/browse/DRILL-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sudheesh Katkam updated DRILL-3096: --- Assignee: Parth Chandra (was: Sudheesh Katkam) State change requested from ... -- ... for blank after for Key: DRILL-3096 URL: https://issues.apache.org/jira/browse/DRILL-3096 Project: Apache Drill Issue Type: Bug Components: Execution - Flow Reporter: Daniel Barclay (Drill) Assignee: Parth Chandra Fix For: 1.2.0 Attachments: DRILL-3096.1.patch.txt Something seems to be missing (or sometimes blank) in state-change log messages. See the "for" with nothing after it in these messages: {noformat} 18:27:44.578 [2aaab4ed-0f43-9290-b480-35aa0ec77cb0:frag:0:0] INFO o.apache.drill.exec.work.fragment.FragmentExecutor - 2aaab4ed-0f43-9290-b480-35aa0ec77cb0:0:0 : State change requested from CANCELLATION_REQUESTED -- FAILED for 18:27:44.587 [2aaab4ed-0f43-9290-b480-35aa0ec77cb0:frag:0:0] INFO o.apache.drill.exec.work.fragment.FragmentExecutor - 2aaab4ed-0f43-9290-b480-35aa0ec77cb0:0:0 : State change requested from FAILED -- FAILED for 18:27:44.588 [2aaab4ed-0f43-9290-b480-35aa0ec77cb0:frag:0:0] INFO o.apache.drill.exec.work.fragment.FragmentExecutor - 2aaab4ed-0f43-9290-b480-35aa0ec77cb0:0:0 : State change requested from FAILED -- FINISHED for {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3455) If a drillbit, that contains fragments for the current query, dies the QueryManager will fail the query even if those fragments already finished successfully
[ https://issues.apache.org/jira/browse/DRILL-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615330#comment-14615330 ] Sudheesh Katkam commented on DRILL-3455: RB: https://reviews.apache.org/r/36208/ If a drillbit, that contains fragments for the current query, dies the QueryManager will fail the query even if those fragments already finished successfully - Key: DRILL-3455 URL: https://issues.apache.org/jira/browse/DRILL-3455 Project: Apache Drill Issue Type: Bug Components: Execution - Flow Reporter: Deneche A. Hakim Assignee: Sudheesh Katkam Fix For: 1.2.0 Attachments: DRILL-3455.1.patch.txt Once DRILL-3448 is fixed we need to update QueryManager.DrillbitStatusListener to check that no fragment is still running on the dead node before failing the query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-2735) Broadcast plan gets lost when the same query is used in UNION ALL
[ https://issues.apache.org/jira/browse/DRILL-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Barclay (Drill) updated DRILL-2735: -- Summary: Broadcast plan gets lost when the same query is used in UNION ALL (was: Broadcast plan get's lost when the same query is used in UNION ALL) Broadcast plan gets lost when the same query is used in UNION ALL --- Key: DRILL-2735 URL: https://issues.apache.org/jira/browse/DRILL-2735 Project: Apache Drill Issue Type: Bug Components: Query Planning Optimization Affects Versions: 0.9.0 Reporter: Victoria Markman Assignee: Jinfeng Ni Fix For: 1.2.0 Attachments: j1_j2_tables.tar I get a broadcast plan for simple inner join query. {code} 0: jdbc:drill:schema=dfs explain plan for select j1.c_integer from j1, j2 where j1.c_integer = j2.c_integer; +++ |text|json| +++ | 00-00Screen 00-01 UnionExchange 01-01Project(c_integer=[$0]) 01-02 HashJoin(condition=[=($0, $1)], joinType=[inner]) 01-04Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/ctas/j1]], selectionRoot=/drill/testdata/ctas/j1, numFiles=1, columns=[`c_integer`]]]) 01-03Project(c_integer0=[$0]) 01-05 BroadcastExchange 02-01Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/ctas/j2]], selectionRoot=/drill/testdata/ctas/j2, numFiles=1, columns=[`c_integer`]]]) | { head : { version : 1, generator : { type : ExplainHandler, info : }, type : APACHE_DRILL_PHYSICAL, options : [ { name : planner.broadcast_factor, kind : DOUBLE, type : SESSION, float_val : 0.0 }, { name : planner.slice_target, kind : LONG, type : SESSION, num_val : 1 } ], {code} Create table succeeds and multiple fragments are executed: {code} 0: jdbc:drill:schema=dfs create table test(a1) as select j1.c_integer from j1, j2 where j1.c_integer = j2.c_integer; ++---+ | Fragment | Number of records written | ++---+ | 1_1| 0 | | 1_3| 0 | | 1_31 | 0 | | 1_43 | 0 | | 1_35 | 0 | | 1_21 | 0 | | 1_19 | 0 | | 1_27 | 1 | | 
1_17 | 1 | | 1_13 | 0 | | 1_29 | 0 | | 1_33 | 0 | | 1_25 | 0 | | 1_7| 0 | | 1_11 | 0 | | 1_37 | 0 | | 1_45 | 0 | | 1_9| 0 | | 1_23 | 1 | | 1_15 | 0 | | 1_41 | 0 | | 1_39 | 0 | | 1_5| 0 | | 1_10 | 0 | | 1_14 | 0 | | 1_24 | 0 | | 1_16 | 0 | | 1_12 | 0 | | 1_36 | 0 | | 1_20 | 0 | | 1_34 | 1 | | 1_40 | 0 | | 1_22 | 0 | | 1_26 | 0 | | 1_32 | 1 | | 1_8| 0 | | 1_18 | 0 | | 1_42 | 0 | | 1_44 | 0 | | 1_38 | 0 | | 1_30 | 0 | | 1_28 | 1 | | 1_4| 10| | 1_2| 1 | | 1_6| 0 | | 1_0| 0 | ++---+ 46 rows selected (2.337 seconds) {code} 8 parquet files are written: {code} [Wed Apr 08 11:41:10 root@/mapr/vmarkman.cluster.com/drill/testdata/ctas/test ] # ls -ltr total 4 -rwxr-xr-x 1 mapr mapr 146 Apr 8 11:40 1_17_0.parquet -rwxr-xr-x 1 mapr mapr 146 Apr 8 11:40 1_27_0.parquet -rwxr-xr-x 1 mapr mapr 146 Apr 8 11:40 1_23_0.parquet -rwxr-xr-x 1 mapr mapr 146 Apr 8 11:40 1_34_0.parquet -rwxr-xr-x 1 mapr
[jira] [Updated] (DRILL-3459) Umbrella JIRA for missing cast and convert_from/convert_to functions
[ https://issues.apache.org/jira/browse/DRILL-3459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mehant Baid updated DRILL-3459: --- Fix Version/s: 1.3.0 Umbrella JIRA for missing cast and convert_from/convert_to functions Key: DRILL-3459 URL: https://issues.apache.org/jira/browse/DRILL-3459 Project: Apache Drill Issue Type: Bug Reporter: Mehant Baid Assignee: Mehant Baid Fix For: 1.3.0 We have a handful of cast functions and convert_from/convert_to functions that need to be implemented. Will link all related issues to this umbrella JIRA so that we have a consolidated view of what needs to be implemented. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-2192) DrillScanRel should differentiate skip-all scan-all scan-some semantics while creating a GroupScan [umbrella]
[ https://issues.apache.org/jira/browse/DRILL-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Phillips updated DRILL-2192: --- Fix Version/s: (was: 1.2.0) 1.3.0 DrillScanRel should differentiate skip-all scan-all scan-some semantics while creating a GroupScan [umbrella] - Key: DRILL-2192 URL: https://issues.apache.org/jira/browse/DRILL-2192 Project: Apache Drill Issue Type: Improvement Components: Query Planning & Optimization Reporter: Hanifi Gunes Assignee: Steven Phillips Fix For: 1.3.0 DrillScanRel passes a list of columns to be read into GroupScan. Currently the logic is to scan all of the columns even if the planner asks to skip them all. Skipping all of the columns is particularly beneficial for the case of count(star), which is translated to count(constant), where we just need the row count but not the actual data. The idea is to distinguish three separate states depending on the output coming from the planner, as follows:
| list of columns from planner       | scan semantics |
| null                               | scan-all       |
| empty list of columns              | skip-all       |
| non-empty list of columns w/o star | scan-some      |
| list of columns with star          | scan-all       |
As part of this umbrella, we should make readers understand skip-all semantics. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
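The three-state mapping in the table above can be sketched as a simple classifier over the projected column list. Names here are hypothetical (this is not Drill's GroupScan API), but the branch logic matches the table:

```java
import java.util.List;

public class ScanSemantics {
    // Classify the planner's projected-column list into the three scan modes
    // described in DRILL-2192.
    public static String classify(List<String> columns) {
        if (columns == null) {
            return "scan-all";   // planner supplied no projection at all
        }
        if (columns.isEmpty()) {
            return "skip-all";   // e.g. count(*): only the row count is needed
        }
        if (columns.contains("*")) {
            return "scan-all";   // star column requested explicitly
        }
        return "scan-some";      // a concrete subset of columns
    }
}
```

Readers that understand the skip-all case can then return empty batches with only a row count, instead of materializing column data nobody will read.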
[jira] [Updated] (DRILL-2879) Drill extended json's support $oid
[ https://issues.apache.org/jira/browse/DRILL-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Phillips updated DRILL-2879: --- Fix Version/s: (was: 1.2.0) 1.3.0 Drill extended json's support $oid -- Key: DRILL-2879 URL: https://issues.apache.org/jira/browse/DRILL-2879 Project: Apache Drill Issue Type: Improvement Components: Storage - JSON Reporter: Bhallamudi Venkata Siva Kamesh Assignee: Steven Phillips Fix For: 1.3.0 Attachments: DRILL-2879_1.patch, extended.json, extendedjson.patch Enhancing JSON reader to parse $oid (from mongo). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (DRILL-1611) The SQL fails Query failed: Failure while running fragment. Queue closed due to channel closure
[ https://issues.apache.org/jira/browse/DRILL-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Victoria Markman resolved DRILL-1611. - Resolution: Fixed Fix Version/s: (was: 1.2.0) 1.1.0 The SQL fails Query failed: Failure while running fragment. Queue closed due to channel closure --- Key: DRILL-1611 URL: https://issues.apache.org/jira/browse/DRILL-1611 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 0.6.0 Reporter: Amol Assignee: Victoria Markman Fix For: 1.1.0 Some of the SQLs execute fine. When we execute the SQL below we get an error. We doubled the memory and it does not respond. Requesting guidance to correct it. SQL USED: SELECT c_customer_id,c_current_cdemo_sk,c_current_hdemo_sk,c_current_addr_sk, c_salutation,c_first_name,c_last_name,c_preferred_cust_flag, c_login,c_email_address,c_last_review_date,Sum(ss_quantity), Sum(ss_wholesale_cost),Sum(ss_list_price),Sum(ss_sales_price), Sum(ss_ext_discount_amt),Sum(ss_ext_sales_price),Sum(ss_ext_wholesale_cost), Sum(ss_ext_list_price),Sum(ss_ext_tax),Sum(ss_coupon_amt),Sum(ss_net_paid), Sum(ss_net_paid_inc_tax),Sum(ss_net_profit) from customer A , store_sales B where A.c_customer_sk = B.ss_customer_sk Group by c_customer_id,c_current_cdemo_sk,c_current_hdemo_sk,c_current_addr_sk, c_salutation,c_first_name,c_last_name,c_preferred_cust_flag, c_login,c_email_address,c_last_review_date limit 100; ERROR: Query failed: Failure while running fragment. Queue closed due to channel closure. [9a718ee0-2f6b-401f-b4d6-b861ef4769da] Error: exception while executing query: Failure while trying to get next result batch. (state=,code=0) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-2044) Filter not being pushed down when we join tables with wide records
[ https://issues.apache.org/jira/browse/DRILL-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615475#comment-14615475 ] Sean Hsuan-Yi Chu commented on DRILL-2044: -- This depends on our costing model for JOIN. For the sake of stability, we can come back to this later. Filter not being pushed down when we join tables with wide records -- Key: DRILL-2044 URL: https://issues.apache.org/jira/browse/DRILL-2044 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Reporter: Rahul Challapalli Assignee: Sean Hsuan-Yi Chu Fix For: 1.4.0 Attachments: widestrings_small.parquet git.commit.id.abbrev=a418af1 The filter is not being pushed down according to the plan. This could either be a bug or expected behavior based on the optimization rules, so someone needs to verify that it is at least not a bug. {code} explain plan for select count(ws1.str_var) from widestrings_small ws1 INNER JOIN widestrings_small ws2 on ws1.str_fixed_null_empty=ws2.str_var_null_empty where ws1.tinyint_var > 120;
00-00  Screen
00-01    StreamAgg(group=[{}], EXPR$0=[COUNT($0)])
00-02      Project(str_var=[$2])
00-03        SelectionVectorRemover
00-04          Filter(condition=[>($1, 120)])
00-05            HashJoin(condition=[=($0, $3)], joinType=[inner])
00-07              Project(str_fixed_null_empty=[$2], tinyint_var=[$1], str_var=[$0])
00-08                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/data-shapes/wide-columns/5000/1000rows/parquet/widestrings_small]], selectionRoot=/drill/testdata/data-shapes/wide-columns/5000/1000rows/parquet/widestrings_small, numFiles=1, columns=[`str_fixed_null_empty`, `tinyint_var`, `str_var`]]])
00-06              Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/data-shapes/wide-columns/5000/1000rows/parquet/widestrings_small]], selectionRoot=/drill/testdata/data-shapes/wide-columns/5000/1000rows/parquet/widestrings_small, numFiles=1, columns=[`str_var_null_empty`]]])
{code} I attached the data file used. Let me know if you have any questions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (DRILL-2332) Drill should be consistent with Implicit casting rules across data formats
[ https://issues.apache.org/jira/browse/DRILL-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Hsuan-Yi Chu resolved DRILL-2332. -- Resolution: Invalid Drill should be consistent with Implicit casting rules across data formats -- Key: DRILL-2332 URL: https://issues.apache.org/jira/browse/DRILL-2332 Project: Apache Drill Issue Type: Improvement Components: Query Planning Optimization Reporter: Abhishek Girish Assignee: Sean Hsuan-Yi Chu Fix For: 1.2.0 Currently, the outcome of a query with a filter on a column comparing it with a literal, depends on the underlying data format. *Parquet* {code:sql} select * from date_dim where d_month_seq ='1193' limit 1; [Succeeds] select * from date_dim where d_date in ('1999-06-30') limit 1; [Succeeds] {code} *View on top of text:* {code:sql} select * from date_dim where d_date in ('1999-06-30') limit 1; Query failed: SqlValidatorException: Values passed to IN operator must have compatible types Error: exception while executing query: Failure while executing query. (state=,code=0) select * from date_dim where d_month_seq ='1193' limit 1; Query failed: SqlValidatorException: Cannot apply '=' to arguments of type 'INTEGER = CHAR(4)'. Supported form(s): 'COMPARABLE_TYPE = COMPARABLE_TYPE' Error: exception while executing query: Failure while executing query. (state=,code=0) {code} I understand that in the case of View on Text, SQL validation fails at the Optiq layer. But from the perspective of an end-user, Drill's behavior must be consistent across data formats. Also having a view by definition should abstract out this information. Here, both the view and parquet were created with type information. 
*Parquet-meta* {code} parquet-schema /mapr/abhi311/data/parquet/tpcds/scale1/date_dim/0_0_0.parquet message root { optional int32 d_date_sk; optional binary d_date_id (UTF8); optional binary d_date (UTF8); optional int32 d_month_seq; optional int32 d_week_seq; optional int32 d_quarter_seq; optional int32 d_year; optional int32 d_dow; optional int32 d_moy; optional int32 d_dom; optional int32 d_qoy; optional int32 d_fy_year; optional int32 d_fy_quarter_seq; optional int32 s_fy_week_seq; optional binary d_day_name (UTF8); optional binary d_quarter_name (UTF8); optional binary d_holiday (UTF8); optional binary d_weekend (UTF8); optional binary d_following_holiday (UTF8); optional int32 d_first_dom; optional int32 d_last_dom; optional int32 d_same_day_ly; optional int32 d_same_day_lq; optional binary d_current_day (UTF8); optional binary d_current_week (UTF8); optional binary d_current_month (UTF8); optional binary d_current_quarter (UTF8); optional binary d_current_year (UTF8); } {code} *Describe View* {code:sql} describe date_dim; +-++-+ | COLUMN_NAME | DATA_TYPE | IS_NULLABLE | +-++-+ | d_date_sk | INTEGER| NO | | d_date_id | VARCHAR| NO | | d_date | DATE | NO | | d_month_seq | INTEGER| NO | | d_week_seq | INTEGER| NO | | d_quarter_seq | INTEGER| NO | | d_year | INTEGER| NO | | d_dow | INTEGER| NO | | d_moy | INTEGER| NO | | d_dom | INTEGER| NO | | d_qoy | INTEGER| NO | | d_fy_year | INTEGER| NO | | d_fy_quarter_seq | INTEGER| NO | | s_fy_week_seq | INTEGER| NO | | d_day_name | VARCHAR| NO | | d_quarter_name | VARCHAR| NO | | d_holiday | VARCHAR| NO | | d_weekend | VARCHAR| NO | | d_following_holiday | VARCHAR| NO | | d_first_dom | INTEGER| NO | | d_last_dom | INTEGER| NO | | d_same_day_ly | INTEGER| NO | | d_same_day_lq | INTEGER| NO | | d_current_day | VARCHAR| NO | | d_current_week | VARCHAR| NO | | d_current_month | VARCHAR| NO | | d_current_quarter | VARCHAR| NO | | d_current_year | VARCHAR| NO | +-++-+ 28 rows selected (0.137 seconds) {code} For an end -- This 
message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-2044) Filter not being pushed down when we join tables with wide records
[ https://issues.apache.org/jira/browse/DRILL-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Hsuan-Yi Chu updated DRILL-2044: - Fix Version/s: (was: 1.2.0) 1.4.0 Filter not being pushed down when we join tables with wide records -- Key: DRILL-2044 URL: https://issues.apache.org/jira/browse/DRILL-2044 Project: Apache Drill Issue Type: Bug Components: Query Planning Optimization Reporter: Rahul Challapalli Assignee: Sean Hsuan-Yi Chu Fix For: 1.4.0 Attachments: widestrings_small.parquet git.commit.id.abbrev=a418af1 The filter is not being pushed down according to the plan. This could either be a bug or expected behavior based on the optimization rules, so someone needs to verify that it is at least not a bug.
{code}
explain plan for select count(ws1.str_var) from widestrings_small ws1 INNER JOIN widestrings_small ws2 on ws1.str_fixed_null_empty=ws2.str_var_null_empty where ws1.tinyint_var > 120;
00-00    Screen
00-01      StreamAgg(group=[{}], EXPR$0=[COUNT($0)])
00-02        Project(str_var=[$2])
00-03          SelectionVectorRemover
00-04            Filter(condition=[>($1, 120)])
00-05              HashJoin(condition=[=($0, $3)], joinType=[inner])
00-07                Project(str_fixed_null_empty=[$2], tinyint_var=[$1], str_var=[$0])
00-08                  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/data-shapes/wide-columns/5000/1000rows/parquet/widestrings_small]], selectionRoot=/drill/testdata/data-shapes/wide-columns/5000/1000rows/parquet/widestrings_small, numFiles=1, columns=[`str_fixed_null_empty`, `tinyint_var`, `str_var`]]])
00-06                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/data-shapes/wide-columns/5000/1000rows/parquet/widestrings_small]], selectionRoot=/drill/testdata/data-shapes/wide-columns/5000/1000rows/parquet/widestrings_small, numFiles=1, columns=[`str_var_null_empty`]]])
{code}
I attached the data file used. Let me know if you have any questions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
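What pushdown would buy in the plan above can be illustrated with a small stand-in (plain Python, not Drill's planner; the table contents are made up):

```python
# Illustrative sketch: applying ws1.tinyint_var > 120 before the join
# shrinks the join input, instead of filtering the joined result as the
# reported plan does (Filter above HashJoin).

ws1 = [{"k": "a", "tinyint_var": 100}, {"k": "b", "tinyint_var": 127}]
ws2 = [{"k": "a"}, {"k": "b"}]

def join(left, right):
    # Naive inner join on key "k".
    return [dict(l, **r) for l in left for r in right if l["k"] == r["k"]]

# Plan in the report: filter evaluated above the join.
late = [r for r in join(ws1, ws2) if r["tinyint_var"] > 120]

# Pushed-down plan: filter first, then join fewer rows.
pushed = join([r for r in ws1 if r["tinyint_var"] > 120], ws2)

assert late == pushed  # same answer, less work on the join for wide records
```

Both plans return the same rows; the pushed-down variant joins fewer left-side rows, which is what matters for wide records like these.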
[jira] [Commented] (DRILL-2332) Drill should be consistent with Implicit casting rules across data formats
[ https://issues.apache.org/jira/browse/DRILL-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14615469#comment-14615469 ] Sean Hsuan-Yi Chu commented on DRILL-2332: -- Although both parquet and views are created with type information, when you query from:
1. a view: the data type is exposed to our planner/parser
2. parquet: the data type is NOT exposed to our planner/parser
In general, no matter how you created the parquet files, the data types for all the columns are treated as the ANY type {color:red}at planning{color}, and implicit casting kicks in later, at execution. However, in the case of querying views, the data types have been exposed to the planner/parser, and if there are non-compatible types, Drill-Calcite will block the query at the planning phase. By and large, when a schema is available, Drill tries to converge to the behavior of a schema-based DB. Drill should be consistent with Implicit casting rules across data formats -- Key: DRILL-2332 URL: https://issues.apache.org/jira/browse/DRILL-2332 Project: Apache Drill Issue Type: Improvement Components: Query Planning Optimization Reporter: Abhishek Girish Assignee: Sean Hsuan-Yi Chu Fix For: 1.2.0 Currently, the outcome of a query with a filter on a column comparing it with a literal depends on the underlying data format. *Parquet* {code:sql} select * from date_dim where d_month_seq ='1193' limit 1; [Succeeds] select * from date_dim where d_date in ('1999-06-30') limit 1; [Succeeds] {code} *View on top of text:* {code:sql} select * from date_dim where d_date in ('1999-06-30') limit 1; Query failed: SqlValidatorException: Values passed to IN operator must have compatible types Error: exception while executing query: Failure while executing query. (state=,code=0) select * from date_dim where d_month_seq ='1193' limit 1; Query failed: SqlValidatorException: Cannot apply '=' to arguments of type 'INTEGER = CHAR(4)'.
Supported form(s): 'COMPARABLE_TYPE = COMPARABLE_TYPE' Error: exception while executing query: Failure while executing query. (state=,code=0) {code} I understand that in the case of View on Text, SQL validation fails at the Optiq layer. But from the perspective of an end-user, Drill's behavior must be consistent across data formats. Also having a view by definition should abstract out this information. Here, both the view and parquet were created with type information. *Parquet-meta* {code} parquet-schema /mapr/abhi311/data/parquet/tpcds/scale1/date_dim/0_0_0.parquet message root { optional int32 d_date_sk; optional binary d_date_id (UTF8); optional binary d_date (UTF8); optional int32 d_month_seq; optional int32 d_week_seq; optional int32 d_quarter_seq; optional int32 d_year; optional int32 d_dow; optional int32 d_moy; optional int32 d_dom; optional int32 d_qoy; optional int32 d_fy_year; optional int32 d_fy_quarter_seq; optional int32 s_fy_week_seq; optional binary d_day_name (UTF8); optional binary d_quarter_name (UTF8); optional binary d_holiday (UTF8); optional binary d_weekend (UTF8); optional binary d_following_holiday (UTF8); optional int32 d_first_dom; optional int32 d_last_dom; optional int32 d_same_day_ly; optional int32 d_same_day_lq; optional binary d_current_day (UTF8); optional binary d_current_week (UTF8); optional binary d_current_month (UTF8); optional binary d_current_quarter (UTF8); optional binary d_current_year (UTF8); } {code} *Describe View* {code:sql} describe date_dim; +-++-+ | COLUMN_NAME | DATA_TYPE | IS_NULLABLE | +-++-+ | d_date_sk | INTEGER| NO | | d_date_id | VARCHAR| NO | | d_date | DATE | NO | | d_month_seq | INTEGER| NO | | d_week_seq | INTEGER| NO | | d_quarter_seq | INTEGER| NO | | d_year | INTEGER| NO | | d_dow | INTEGER| NO | | d_moy | INTEGER| NO | | d_dom | INTEGER| NO | | d_qoy | INTEGER| NO | | d_fy_year | INTEGER| NO | | d_fy_quarter_seq | INTEGER| NO | | s_fy_week_seq | INTEGER| NO | | d_day_name | VARCHAR| NO | | 
d_quarter_name | VARCHAR| NO | | d_holiday | VARCHAR| NO | | d_weekend | VARCHAR| NO | | d_following_holiday | VARCHAR| NO | | d_first_dom | INTEGER| NO | | d_last_dom | INTEGER| NO | | d_same_day_ly | INTEGER| NO | | d_same_day_lq | INTEGER| NO | |
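The distinction the comment draws (view column types visible at planning versus ANY-typed Parquet columns coerced at execution) can be sketched as a toy model; this is illustrative Python, not Drill code, and the function names are invented:

```python
# Toy model of the two paths: a typed planner rejects INTEGER = CHAR up
# front, while an ANY-typed column defers the decision and relies on an
# implicit cast at execution time.

def plan_filter(column_type, literal):
    """Validate a `col = literal` predicate the way a typed planner would."""
    if column_type == "ANY":
        return "deferred"  # Parquet path: type unknown, decide at execution
    if column_type == "INTEGER" and isinstance(literal, str):
        raise TypeError("Cannot apply '=' to INTEGER = CHAR")
    return "validated"

def execute_filter(value, literal):
    """Execution-time implicit cast: coerce the literal, then compare."""
    return value == int(literal)

# Parquet-style column: planning defers, execution casts '1193' -> 1193.
assert plan_filter("ANY", "1193") == "deferred"
assert execute_filter(1193, "1193") is True

# View-style column with a declared INTEGER type fails at planning,
# mirroring the SqlValidatorException in the report.
try:
    plan_filter("INTEGER", "1193")
except TypeError:
    pass
```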
[jira] [Updated] (DRILL-2026) Consider: speed up dev. tests using Maven integration-test phase
[ https://issues.apache.org/jira/browse/DRILL-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Phillips updated DRILL-2026: --- Fix Version/s: (was: 1.2.0) Future Consider: speed up dev. tests using Maven integration-test phase -- Key: DRILL-2026 URL: https://issues.apache.org/jira/browse/DRILL-2026 Project: Apache Drill Issue Type: Improvement Components: Tools, Build Test Reporter: Daniel Barclay (Drill) Assignee: Steven Phillips Fix For: Future Because many of our unit test classes (unit tests in the sense of being run by Surefire in Maven's test phase) need a running Drillbit, and because they are unit tests, each such test class starts up its own Drillbit, which takes quite a while, especially considering the aggregate time. Consider moving Drillbit-needing tests to be Maven integration tests (run in Maven's integration-test phase by Failsafe). That would allow starting up a Drillbit (or multiple Drillbits and other servers) in Maven's pre-integration-test phase, using those servers for all tests, and shutting down the servers in Maven's post-integration-test phase. That should save quite a lot of time: the product of the time per Drillbit startup plus shutdown and the number of test classes changed to integration tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
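A minimal sketch of the Maven wiring this proposal implies, assuming standard Failsafe defaults (the `*IT.java` naming convention and the `integration-test`/`verify` goals); the exact configuration for Drill's build would still need to be worked out:

```xml
<!-- Hypothetical pom fragment: run Drillbit-dependent tests via Failsafe
     in the integration-test phase instead of Surefire's test phase. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-failsafe-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <goal>integration-test</goal>
        <goal>verify</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

Shared server startup and shutdown would then bind to the `pre-integration-test` and `post-integration-test` phases, so a single Drillbit (or cluster) serves every migrated test class.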
[jira] [Commented] (DRILL-1611) The SQL fails Query failed: Failure while running fragment. Queue closed due to channel closure
[ https://issues.apache.org/jira/browse/DRILL-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14615504#comment-14615504 ] Victoria Markman commented on DRILL-1611: - This query now runs successfully: version 1.1 of drill: {code} 0: jdbc:drill:schema=dfs SELECT A.c_customer_id, . . . . . . . . . . . . A.c_current_cdemo_sk, . . . . . . . . . . . . A.c_current_hdemo_sk, . . . . . . . . . . . . A.c_current_addr_sk, . . . . . . . . . . . . A.c_salutation, . . . . . . . . . . . . A.c_first_name, . . . . . . . . . . . . A.c_last_name, . . . . . . . . . . . . A.c_preferred_cust_flag, . . . . . . . . . . . . A.c_login, . . . . . . . . . . . . A.c_email_address, . . . . . . . . . . . . A.c_last_review_date, . . . . . . . . . . . . Sum(B.ss_quantity), . . . . . . . . . . . . Sum(B.ss_wholesale_cost), . . . . . . . . . . . . Sum(B.ss_list_price), . . . . . . . . . . . . Sum(B.ss_sales_price), . . . . . . . . . . . . Sum(B.ss_ext_discount_amt), . . . . . . . . . . . . Sum(B.ss_ext_sales_price), . . . . . . . . . . . . Sum(B.ss_ext_wholesale_cost), . . . . . . . . . . . . Sum(B.ss_ext_list_price), . . . . . . . . . . . . Sum(B.ss_ext_tax), . . . . . . . . . . . . Sum(B.ss_coupon_amt), . . . . . . . . . . . . Sum(B.ss_net_paid), . . . . . . . . . . . . Sum(B.ss_net_paid_inc_tax), . . . . . . . . . . . . Sum(B.ss_net_profit) . . . . . . . . . . . . FROM customer A, . . . . . . . . . . . . store_sales B . . . . . . . . . . . . WHERE . . . . . . . . . . . . A.c_customer_sk = B.ss_customer_sk . . . . . . . . . . . . GROUP BY . . . . . . . . . . . .A.c_customer_id, . . . . . . . . . . . .A.c_current_cdemo_sk, . . . . . . . . . . . .A.c_current_hdemo_sk, . . . . . . . . . . . .A.c_current_addr_sk, . . . . . . . . . . . .A.c_salutation, . . . . . . . . . . . .A.c_first_name, . . . . . . . . . . . .A.c_last_name, . . . . . . . . . . . .A.c_preferred_cust_flag, . . . . . . . . . . . .A.c_login, . . . . . . . . . . . .A.c_email_address, . . . . . 
. . . . . . .A.c_last_review_date . . . . . . . . . . . . LIMIT 100; ++-+-++---+---+--++--+--+-+--+-+-+-+-+-+-+-+-+-+-+-+--+ | c_customer_id | c_current_cdemo_sk | c_current_hdemo_sk | c_current_addr_sk | c_salutation | c_first_name | c_last_name | c_preferred_cust_flag | c_login | c_email_address | c_last_review_date | EXPR$11 | EXPR$12 | EXPR$13 | EXPR$14 | EXPR$15 | EXPR$16 | EXPR$17 | EXPR$18 | EXPR$19 | EXPR$20 | EXPR$21 | EXPR$22 | EXPR$23| ++-+-++---+---+--++--+--+-+--+-+-+-+-+-+-+-+-+-+-+-+--+ | [B@1997053a| 1740699 | 6283| 8667 | [B@9d82ff6| [B@5034c486 | [B@3852628d | [B@28c433de| null | [B@692948ba | [B@1565e146 | 1451 | 1636.3499875068665 | 2735.4400119781494 | 1026.380006607622 | 13208.909889936447 | 58845.53004384041 | 94184.37959289551 | 148164.82012939453 | 1023.8200042340904 | 13208.909889936447 | 42804.30076660216 | 46858.699785619974 | -48243.00975036621 | | [B@53a8c5a3| 1802568 | 1029| 15490 | [B@6d477b3d | [B@6fec19d7 | [B@7594aba8 | [B@7391dee6| null | [B@6dc10950 | [B@1c4a9b7b | 1162 | 1148.9599850177765 | 1852.3299944400787 | 777.4500068426132 | 759.4600238800049 | 35477.37022399902 | 51913.2008447 | 84324.75047302246 | 1640.790023803711 | 759.4600238800049 | 34717.91014003754 | 36358.69979476929 | -17195.299812316895 | | [B@6625b6bf| 1744347 | 2368| 42742
[jira] [Updated] (DRILL-2686) Move writeJson() methods from PhysicalPlanReader to corresponding classes
[ https://issues.apache.org/jira/browse/DRILL-2686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Phillips updated DRILL-2686: --- Fix Version/s: (was: 1.2.0) Future Move writeJson() methods from PhysicalPlanReader to corresponding classes - Key: DRILL-2686 URL: https://issues.apache.org/jira/browse/DRILL-2686 Project: Apache Drill Issue Type: Improvement Components: Query Planning Optimization Affects Versions: 0.7.0 Reporter: Sudheesh Katkam Assignee: Steven Phillips Fix For: Future From Chris's comment https://reviews.apache.org/r/32795/ It would have been better to have a writeJson(ObjectMapper) method added to each of OptionList, PhysicalOperator, -and ExecutionControls-, and for PhysicalPlanReader just to have a getMapper() that is used to get the argument needed for those. In that form, we don't have to add a new method to PhysicalPlanReader for each thing that we want to add to it. We just get its mapper and write whatever it is to it. We'd have {code} final ObjectMapper mapper = reader.getMapper(); options.writeJson(mapper); executionControls.writeJson(mapper); {code} So as we add more things to the plan, we don't have to add more methods to it. Each object knows how to write itself, given the mapper. And if we ever need to add them to anything else, that object just needs to expose its mapper in a similar way, rather than having a method per item. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-1681) select with limit on directory with csv files takes quite long to terminate
[ https://issues.apache.org/jira/browse/DRILL-1681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Phillips updated DRILL-1681: --- Fix Version/s: (was: 1.2.0) 1.3.0 select with limit on directory with csv files takes quite long to terminate --- Key: DRILL-1681 URL: https://issues.apache.org/jira/browse/DRILL-1681 Project: Apache Drill Issue Type: Bug Components: Storage - Text CSV Reporter: Suresh Ollala Assignee: Steven Phillips Priority: Minor Fix For: 1.3.0 query like select * from `/drill/data` limit 100 takes quite long to terminate, about 20+ seconds. /drill/data includes overall 1100 csv files, all in single directory. select * from `/drill/data/d2.csv` limit 100; terminates in 0.2 seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-2033) JSON 'All Text mode' should support schema change from scalar to complex types
[ https://issues.apache.org/jira/browse/DRILL-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Phillips updated DRILL-2033: --- Fix Version/s: (was: 1.2.0) 1.3.0 JSON 'All Text mode' should support schema change from scalar to complex types -- Key: DRILL-2033 URL: https://issues.apache.org/jira/browse/DRILL-2033 Project: Apache Drill Issue Type: Improvement Components: Storage - JSON Reporter: Neeraja Assignee: Steven Phillips Fix For: 1.3.0 A scalar/simple field turning into a complex field is a very common scenario in many JSON documents. For example: an integer turning into an array of integers, or a string turning into a map of multiple strings. Drill already provides the ability to query scalar data with datatype changes by setting all text mode to true. This should be expanded to support querying scalar-to-complex type changes as well. This gives users the power to look at the data without fixing/modifying the JSON upfront. Here is a quick example from a public dataset: {code} { "data": { "games": { "game": [ { "home_runs": { "player": { "first": "Jason" } } }, { "home_runs": { "player": [ { "first": "Kosuke" }, { "first": "Alfonso" }, { "first": "Jeff" }, { "first": "Brandon" } ] } } ] } } } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
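The requested promotion can be sketched in a few lines; this is illustrative only, not Drill's JSON reader, and the `unify` helper is invented:

```python
# Sketch of the requested behavior: under all-text mode, when a field
# changes from scalar to array across records, promote earlier scalar
# values to one-element arrays of text so the column has one shape.

def unify(values):
    """Coerce a column's values to a common shape: text everywhere,
    listified if any record supplied an array."""
    needs_list = any(isinstance(v, list) for v in values)
    out = []
    for v in values:
        if isinstance(v, list):
            out.append([str(x) for x in v])
        elif needs_list:
            out.append([str(v)])  # scalar promoted to a 1-element array
        else:
            out.append(str(v))
    return out

# An integer followed by an array of integers reads back as arrays of text.
assert unify([7, [1, 2]]) == [["7"], ["1", "2"]]
```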
[jira] [Updated] (DRILL-1627) Writer needs to be transactional
[ https://issues.apache.org/jira/browse/DRILL-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Phillips updated DRILL-1627: --- Fix Version/s: (was: 1.2.0) 1.3.0 Writer needs to be transactional Key: DRILL-1627 URL: https://issues.apache.org/jira/browse/DRILL-1627 Project: Apache Drill Issue Type: Bug Components: Storage - Writer Environment: embedded drill invoked via sqlline under Eclipse on OSX Reporter: Chris Westin Assignee: Steven Phillips Fix For: 1.3.0 Tried to do a CTAS which failed for unknown reasons. Output starts out looking OK, but then gets an error: 0: jdbc:drill:zk=local create table donuts_parquet as select * from `donuts.json`; create table donuts_parquet as select * from `donuts.jso n`; ++---+ | Fragment | Number of records written | ++---+ | 0_0| 5 | Query failed: Failure while running fragment. java.lang.RuntimeException: java.sql.SQLException: Failure while executing query. at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514) at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148) at sqlline.SqlLine.print(SqlLine.java:1809) at sqlline.SqlLine$Commands.execute(SqlLine.java:3766) at sqlline.SqlLine$Commands.sql(SqlLine.java:3663) at sqlline.SqlLine.dispatch(SqlLine.java:889) at sqlline.SqlLine.begin(SqlLine.java:763) at sqlline.SqlLine.start(SqlLine.java:498) at sqlline.SqlLine.main(SqlLine.java:460) 0: jdbc:drill:zk=local No indication of what caused the failure. But the non-zero Number of records written would seem to imply success. I checked the directory this workspace is configured to use, and while it did create the parquet file, it is zero sized: wormsign:json cwestin$ ls donuts.json donuts_parquet/ wormsign:json cwestin$ ls donuts_parquet 0_0_0.parquet wormsign:json cwestin$ ls -l donuts_parquet total 0 -rw-r--r-- 1 cwestin staff 0 Oct 31 16:06 0_0_0.parquet wormsign:json cwestin$ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3096) State change requested from ... -- ... for blank after for
[ https://issues.apache.org/jira/browse/DRILL-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14615725#comment-14615725 ] Parth Chandra commented on DRILL-3096: -- +1. LGTM State change requested from ... -- ... for blank after for Key: DRILL-3096 URL: https://issues.apache.org/jira/browse/DRILL-3096 Project: Apache Drill Issue Type: Bug Components: Execution - Flow Reporter: Daniel Barclay (Drill) Assignee: Parth Chandra Fix For: 1.2.0 Attachments: DRILL-3096.1.patch.txt Something seems to be missing (or sometimes blank) in state-change log messages. See the "for" with nothing after it in these messages: {noformat} 18:27:44.578 [2aaab4ed-0f43-9290-b480-35aa0ec77cb0:frag:0:0] INFO o.apache.drill.exec.work.fragment.FragmentExecutor - 2aaab4ed-0f43-9290-b480-35aa0ec77cb0:0:0 : State change requested from CANCELLATION_REQUESTED --> FAILED for 18:27:44.587 [2aaab4ed-0f43-9290-b480-35aa0ec77cb0:frag:0:0] INFO o.apache.drill.exec.work.fragment.FragmentExecutor - 2aaab4ed-0f43-9290-b480-35aa0ec77cb0:0:0 : State change requested from FAILED --> FAILED for 18:27:44.588 [2aaab4ed-0f43-9290-b480-35aa0ec77cb0:frag:0:0] INFO o.apache.drill.exec.work.fragment.FragmentExecutor - 2aaab4ed-0f43-9290-b480-35aa0ec77cb0:0:0 : State change requested from FAILED --> FINISHED for {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3443) Flatten function raise exception when JSON files have different schema
[ https://issues.apache.org/jira/browse/DRILL-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14615784#comment-14615784 ] Jason Altekruse commented on DRILL-3443: We have a number of known issues around handling changing schemas. Unfortunately, due to some current design limitations, a few of these evolving-schema cases, where a field doesn't exist in some files but does in others, are also known to have issues. We will be trying to fix the error messages in these cases (there are a number of JIRAs related to this root problem) and are looking into ways to solve the problem more generally soon. Flatten function raise exception when JSON files have different schema -- Key: DRILL-3443 URL: https://issues.apache.org/jira/browse/DRILL-3443 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 1.0.0 Environment: DRILL 1.0 Embedded (running on OSX with Java 8) DRILL 1.0 Deployed on MapR 4.1 Sandbox Reporter: Tugdual Grall Assignee: Jason Altekruse Priority: Critical Fix For: 1.3.0 I have 2 JSON documents: {code} { "name": "PPRODUCT_002", "price": 200.00, "tags": ["sports", "cool", "ocean"] } { "name": "PPRODUCT_001", "price": 100.00 } {code} And I execute this query: {code} SELECT name, flatten(tags) FROM dfs.`data/json_array/*.json` {code} If the JSON documents are located in 2 different files and the first file does not contain the tags (product 001 in 001.json), the following exception is raised: {code} org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: java.lang.ClassCastException: Cannot cast org.apache.drill.exec.vector.NullableIntVector to org.apache.drill.exec.vector.RepeatedValueVector Fragment 0:0 [Error Id: 4bb5b9e4-0de1-48e9-a0f3-956339608903 on 192.168.99.13:31010] {code} It works if: * all the JSON documents are in a single JSON file (order is not important) * the product with the tags attribute is first on the file system, for example if you put product 002 in 000.json (which will be read before 001.json). This is similar to bug [DRILL-3334]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
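The flatten semantics the query expects when a record lacks the flattened field can be sketched as follows (plain Python, not Drill internals; treating a missing field as an empty array is one reasonable resolution, shown here as an assumption):

```python
# Sketch of flatten semantics: one output row per array element. A record
# missing the flattened field contributes no rows, which is the behavior
# the reported ClassCastException currently prevents.

def flatten(records, field):
    for rec in records:
        for item in rec.get(field) or []:  # missing field -> no rows
            yield {"name": rec["name"], field: item}

docs = [
    {"name": "PPRODUCT_001", "price": 100.00},  # no tags field
    {"name": "PPRODUCT_002", "price": 200.00,
     "tags": ["sports", "cool", "ocean"]},
]
rows = list(flatten(docs, "tags"))
assert [r["tags"] for r in rows] == ["sports", "cool", "ocean"]
```

Note the result is the same regardless of which document comes first, which is the consistency the reporter is asking for across file orderings.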
[jira] [Updated] (DRILL-1970) Hive views must not be listed with the show tables command
[ https://issues.apache.org/jira/browse/DRILL-1970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Altekruse updated DRILL-1970: --- Fix Version/s: (was: 1.2.0) 1.3.0 Hive views must not be listed with the show tables command -- Key: DRILL-1970 URL: https://issues.apache.org/jira/browse/DRILL-1970 Project: Apache Drill Issue Type: Bug Components: Storage - Hive Reporter: Abhishek Girish Assignee: Jason Altekruse Fix For: 1.3.0 Attachments: DRILL-1970.1.patch.txt, DRILL-1970.2.patch.txt This is related to DRILL-1969. Until Drill can support querying of Hive Views, hive views metadata must not be visible upon issuing the show tables command. use hive; +++ | ok | summary | +++ | true | Default schema changed to 'hive' | +++ Currently Observed: show tables ; +--++ | TABLE_SCHEMA | TABLE_NAME | +--++ | hive.default | table1 | | hive.default | table2 | | hive.default | table1_view1 | | hive.default | table2_view1 | ... +--++ Expected: show tables ; +--++ | TABLE_SCHEMA | TABLE_NAME | +--++ | hive.default | table1 | | hive.default | table2 | +--++ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3461) Need to add javadocs to class where they are missing
[ https://issues.apache.org/jira/browse/DRILL-3461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning updated DRILL-3461: --- Attachment: (was: no-javadoc-no-comments.txt) Need to add javadocs to class where they are missing Key: DRILL-3461 URL: https://issues.apache.org/jira/browse/DRILL-3461 Project: Apache Drill Issue Type: Bug Reporter: Ted Dunning Attachments: no-javadocs-templates.txt, no-javadocs.txt, no-javadocs.txt 1220 classes in Drill have no Javadocs whatsoever. I will attach a detailed list. Some kind of expression of intent and basic place in the architecture should be included in all classes. The good news is that at least there are 1838 (1868 in 1.1.0 branch) classes that have at least some kind of javadocs. I would be happy to help write comments, but I can't figure out what these classes do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (DRILL-2274) Unable to allocate sv2 buffer after repeated attempts : JOIN, Order by used in query
[ https://issues.apache.org/jira/browse/DRILL-2274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deneche A. Hakim reassigned DRILL-2274: --- Assignee: Deneche A. Hakim (was: Jason Altekruse) Unable to allocate sv2 buffer after repeated attempts : JOIN, Order by used in query Key: DRILL-2274 URL: https://issues.apache.org/jira/browse/DRILL-2274 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Reporter: Rahul Challapalli Assignee: Deneche A. Hakim Fix For: 1.2.0 Attachments: data.json git.commit.id.abbrev=6676f2d The below query fails : {code} select sub1.uid from `data.json` sub1 inner join `data.json` sub2 on sub1.uid = sub2.uid order by sub1.uid; {code} Error from the logs : {code} 2015-02-20 00:24:08,431 [2b1981b0-149e-981b-f83f-512c587321d7:frag:1:2] ERROR o.a.d.e.w.f.AbstractStatusReporter - Error 66dba4ff-644c-4400-ab84-203256dc2600: Failure while running fragment. java.lang.RuntimeException: org.apache.drill.exec.memory.OutOfMemoryException: Unable to allocate sv2 buffer after repeated attempts at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:307) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) 
~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:96) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:67) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:97) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:57) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:116) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT] at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:303) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71] Caused by: org.apache.drill.exec.memory.OutOfMemoryException: Unable to allocate sv2 buffer after repeated attempts at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.newSV2(ExternalSortBatch.java:516) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:305) 
~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT] ... 16 common frames omitted {code} On a different drillbit in the cluster, I found the below message for the same run {code} 2015-02-20 00:24:08,435 [BitServer-6] WARN o.a.d.exec.rpc.control.WorkEventBus - A fragment message arrived but there was no registered listener for that message: profile { state: FAILED error { error_id: 66dba4ff-644c-4400-ab84-203256dc2600 endpoint { address: qa-node191.qa.lab user_port: 31010 control_port: 31011 data_port:
[jira] [Updated] (DRILL-2745) Query returns IOB Exception when JSON data with empty arrays is input to flatten function
[ https://issues.apache.org/jira/browse/DRILL-2745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Khurram Faraaz updated DRILL-2745: -- Assignee: Jason Altekruse (was: Khurram Faraaz) Query returns IOB Exception when JSON data with empty arrays is input to flatten function - Key: DRILL-2745 URL: https://issues.apache.org/jira/browse/DRILL-2745 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 0.9.0 Environment: | 9d92b8e319f2d46e8659d903d355450e15946533 | DRILL-2580: Exit early from HashJoinBatch if build side is empty | 26.03.2015 @ 16:13:53 EDT Reporter: Khurram Faraaz Assignee: Jason Altekruse Fix For: 1.2.0 IOB Exception is returned when JSON file that has many empty arrays and arrays with different types of data is passed to flatten function. Tested on 4 node cluster on CentOS {code} 0: jdbc:drill: select flatten(outkey) from `nestedJArry.json` ; Query failed: RemoteRpcException: Failure while running fragment., index: 176, length: 4 (expected: range(0, 176)) [ 2627cf84-9dfb-4077-8531-9955ecdbdec7 on centos-02.qa.lab:31010 ] [ 2627cf84-9dfb-4077-8531-9955ecdbdec7 on centos-02.qa.lab:31010 ] Error: exception while executing query: Failure while executing query. 
(state=,code=0) 0: jdbc:drill: select outkey from `nestedJArry.json`; ++ | outkey | ++ | [[100,1000,200,99,1,0,-1,10],[a,b,c,d,e,p,o,f,m,q,d,s,v],[2012-04-01,1998-02-20,2011-08-05,1992-01-01],[10:30:29.123,12:29:21.999],[sdfklgjsdlkjfghlsidhfgopiuesrtoipuertoiurtyoiurotuiydkfjlbn,bfn;waokefpqowertoipuwergklnjdfbpdsiofgoigiuewqrqiugkjehgjksdhbvkjshdfkjsdfbnlkfbkljrghljrelkhbdlkfjbgkdfjbgkndfbnkldfgklbhjdflkghjlnkoiurty984756897345609782-3458745uiyoheirluht7895e6y],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[null],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[test string,hello world!,just do it!,houston we have a problem],[1,2,3,4,5,6,7,8,9,0]] | ++ 1 row selected (0.088 seconds) Stack trace from drillbit.log 2015-04-09 23:54:41,965 [2ad8eebd-adb6-6f7e-469e-4bb8ca276984:frag:0:0] WARN o.a.d.e.w.fragment.FragmentExecutor - Error while initializing or executing fragment java.lang.IndexOutOfBoundsException: index: 176, length: 4 (expected: range(0, 176)) at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:187) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final] at io.netty.buffer.DrillBuf.chk(DrillBuf.java:209) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final] at io.netty.buffer.DrillBuf.setInt(DrillBuf.java:513) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final] at org.apache.drill.exec.vector.UInt4Vector$Mutator.set(UInt4Vector.java:363) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] at org.apache.drill.exec.vector.RepeatedVarCharVector.splitAndTransferTo(RepeatedVarCharVector.java:173) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] at org.apache.drill.exec.vector.RepeatedVarCharVector$TransferImpl.splitAndTransfer(RepeatedVarCharVector.java:200) 
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] at org.apache.drill.exec.test.generated.FlattenerGen1107.flattenRecords(FlattenTemplate.java:106) ~[na:na] at org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.doWork(FlattenRecordBatch.java:156) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:93) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.innerNext(FlattenRecordBatch.java:122) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
[jira] [Updated] (DRILL-3463) Unit test of project pushdown in TestUnionAll should put more precisely plan attribute in plan verification.
[ https://issues.apache.org/jira/browse/DRILL-3463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinfeng Ni updated DRILL-3463: -- Fix Version/s: 1.2.0 Unit test of project pushdown in TestUnionAll should put more precisely plan attribute in plan verification. -- Key: DRILL-3463 URL: https://issues.apache.org/jira/browse/DRILL-3463 Project: Apache Drill Issue Type: Bug Components: Query Planning Optimization Reporter: Jinfeng Ni Assignee: Jinfeng Ni Fix For: 1.2.0 As part of the fix for DRILL-2802, it was discovered that several unit test cases for project pushdown in TestUnionAll did not put the desired plan attributes into the expected plan result. To verify that project pushdown is working properly, one simple way is to check that the column list in the Scan operator contains the desired columns; this should be part of plan verification. However, the unit test cases in TestUnionAll did not do that. Instead, they try to match a Project -- Scan pattern, which does not serve the intended purpose. For instance, {code} final String[] expectedPlan = {"UnionAll.*\n" + ".*Project.*\n" + ".*Scan.*\n"}; {code} should be replaced by {code} final String[] expectedPlan = {"UnionAll.*\n" + ".*Project.*\n" + ".*Scan.*columns=\\[`n_comment`, `n_nationkey`, `n_name`\\].*\n"}; {code} if we want to verify that the columns 'n_comment', 'n_nationkey' and 'n_name' are pushed into the Scan operator. To fix this, modify the expected plan result so that it contains plan attributes that can verify whether project pushdown is working. This will help catch project pushdown failures and avoid false alarms in plan verification. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
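The difference between the loose and tightened patterns can be sketched with plain java.util.regex (the class name and sample plan text below are hypothetical illustrations, not Drill's actual test utilities):

```java
import java.util.regex.Pattern;

public class PlanPatternCheck {
    // Returns true if the plan text contains a match for the multi-line
    // pattern. DOTALL lets '.' cross line boundaries, mirroring how
    // plan-verification regexes are written against plan dumps.
    public static boolean planMatches(String plan, String pattern) {
        return Pattern.compile(pattern, Pattern.DOTALL).matcher(plan).find();
    }

    public static void main(String[] args) {
        // Hypothetical plan fragment with the columns pushed into Scan.
        String plan = "UnionAll\n"
                + "  Project(n_comment=[$0])\n"
                + "  Scan(columns=[`n_comment`, `n_nationkey`, `n_name`])\n";
        // Loose pattern: passes whether or not pushdown happened.
        String loose = "UnionAll.*\\n.*Project.*\\n.*Scan.*\\n";
        // Tight pattern: also pins the pushed-down column list.
        String tight = "UnionAll.*\\n.*Project.*\\n"
                + ".*Scan.*columns=\\[`n_comment`, `n_nationkey`, `n_name`\\].*\\n";
        System.out.println(planMatches(plan, loose));
        System.out.println(planMatches(plan, tight));
    }
}
```

The loose pattern matches any Project over Scan, so it cannot catch a pushdown regression; the tight pattern fails as soon as the Scan's column list changes.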
[jira] [Updated] (DRILL-2561) Profile UI: Metrics displayed incorrectly for failed query
[ https://issues.apache.org/jira/browse/DRILL-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sudheesh Katkam updated DRILL-2561: --- Fix Version/s: (was: Future) 1.3.0 Profile UI: Metrics displayed incorrectly for failed query -- Key: DRILL-2561 URL: https://issues.apache.org/jira/browse/DRILL-2561 Project: Apache Drill Issue Type: Bug Components: Client - HTTP Affects Versions: 0.9.0 Reporter: Krystal Assignee: Sudheesh Katkam Fix For: 1.3.0 git.commit.id=8493713cafe6e5d1f56f2dffc9d8bea294a6e013 I have a query that failed to execute. The profile UI for this query displayed wrong metrics in columns. Here is the url for that profile: http://10.10.100.115:8047/profiles/2aed1b79-17a0-312d-42a5-161a1c2c66a4 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-2409) Drill profile page mishandles statistics from long running queries
[ https://issues.apache.org/jira/browse/DRILL-2409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sudheesh Katkam updated DRILL-2409: --- Fix Version/s: (was: 1.2.0) 1.3.0 Drill profile page mishandles statistics from long running queries -- Key: DRILL-2409 URL: https://issues.apache.org/jira/browse/DRILL-2409 Project: Apache Drill Issue Type: Bug Components: Client - HTTP Reporter: Jacques Nadeau Assignee: Sudheesh Katkam Fix For: 1.3.0 We recently ran a 72-hour query that joined several trillion records for a customer. While the query completed successfully, the presentation on the profile page had a number of problems. These included times not being correctly reported (they were truncated) and the Gantt timeline being unreadable (since it doesn't scale the axes from seconds). We should correct these. (For durations specifically, we should present them as 4m 2s, 7h 4m or 7d 4h 4m instead of 07:04:02, since we're talking about durations and not times.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
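The duration rendering suggested above ("4m 2s", "7h 4m", "7d 4h 4m") can be sketched as follows; this is a hypothetical helper, not Drill's actual profile-page code:

```java
public class DurationFormat {
    // Formats a duration given in seconds using the most significant
    // units only, e.g. "4m 2s", "7h 4m", "7d 4h 4m".
    public static String format(long totalSeconds) {
        long days = totalSeconds / 86_400;
        long hours = (totalSeconds % 86_400) / 3_600;
        long minutes = (totalSeconds % 3_600) / 60;
        long seconds = totalSeconds % 60;
        if (days > 0)    return days + "d " + hours + "h " + minutes + "m";
        if (hours > 0)   return hours + "h " + minutes + "m";
        if (minutes > 0) return minutes + "m " + seconds + "s";
        return seconds + "s";
    }

    public static void main(String[] args) {
        System.out.println(format(242));      // 4m 2s
        System.out.println(format(25_440));   // 7h 4m
        System.out.println(format(619_440));  // 7d 4h 4m
    }
}
```

Dropping trailing units (seconds once the query runs for hours, minutes once it runs for days) keeps long-running-query durations readable at a glance, which is the issue's complaint about 07:04:02-style output.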
[jira] [Updated] (DRILL-2325) conf/drill-override-example.conf is outdated
[ https://issues.apache.org/jira/browse/DRILL-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sudheesh Katkam updated DRILL-2325: --- Fix Version/s: (was: 1.2.0) 1.3.0 conf/drill-override-example.conf is outdated Key: DRILL-2325 URL: https://issues.apache.org/jira/browse/DRILL-2325 Project: Apache Drill Issue Type: Bug Components: Tools, Build Test Reporter: Zhiyong Liu Assignee: Sudheesh Katkam Fix For: 1.3.0 The conf/drill-override-example.conf file is outdated. Properties have been added (e.g., compile), removed (e.g., cache.hazel.subnets) or otherwise modified. The file is statically tracked in distribution/src/resources/drill-override-example.conf. Ideally there should be a way to update the file programmatically when things change. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3461) Need to add javadocs to class where they are missing
[ https://issues.apache.org/jira/browse/DRILL-3461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning updated DRILL-3461: --- Attachment: (was: no-comments.txt) Need to add javadocs to class where they are missing Key: DRILL-3461 URL: https://issues.apache.org/jira/browse/DRILL-3461 Project: Apache Drill Issue Type: Bug Reporter: Ted Dunning Attachments: no-javadocs-templates.txt, no-javadocs.txt, no-javadocs.txt 1220 classes in Drill have no Javadocs whatsoever. I will attach a detailed list. Some kind of expression of intent and basic place in the architecture should be included in all classes. The good news is that at least there are 1838 (1868 in 1.1.0 branch) classes that have at least some kind of javadocs. I would be happy to help write comments, but I can't figure out what these classes do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3461) Need to add javadocs to class where they are missing
[ https://issues.apache.org/jira/browse/DRILL-3461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Dunning updated DRILL-3461: --- Attachment: no-javadocs-templates.txt no-javadocs.txt Updated lists of files. Current count is 1239 java files with no javadoc and 67 templates. The previous count was possibly distorted by generated files and seeing the Apache license as a comment. Need to add javadocs to class where they are missing Key: DRILL-3461 URL: https://issues.apache.org/jira/browse/DRILL-3461 Project: Apache Drill Issue Type: Bug Reporter: Ted Dunning Attachments: no-comments.txt, no-javadoc-no-comments.txt, no-javadocs-templates.txt, no-javadocs.txt, no-javadocs.txt 1220 classes in Drill have no Javadocs whatsoever. I will attach a detailed list. Some kind of expression of intent and basic place in the architecture should be included in all classes. The good news is that at least there are 1838 (1868 in 1.1.0 branch) classes that have at least some kind of javadocs. I would be happy to help write comments, but I can't figure out what these classes do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
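The correction in the update — earlier counts were distorted by treating the Apache license header as a comment — suggests checking specifically for javadoc (/**) rather than any comment. A rough sketch of such an audit (class name and heuristic are hypothetical, not the script actually used for the attached lists):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class JavadocAudit {
    // Heuristic: a file "has javadoc" only if it contains a /** comment.
    // License headers are plain /* comments and therefore do not count.
    public static boolean hasJavadoc(String source) {
        return source.contains("/**");
    }

    // Walks a source tree and counts .java files with no javadoc at all.
    public static long countWithoutJavadoc(Path root) throws IOException {
        try (Stream<Path> files = Files.walk(root)) {
            return files.filter(p -> p.toString().endsWith(".java"))
                        .filter(p -> {
                            try {
                                return !hasJavadoc(new String(Files.readAllBytes(p)));
                            } catch (IOException e) {
                                return false; // skip unreadable files
                            }
                        })
                        .count();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countWithoutJavadoc(Paths.get(args[0])));
    }
}
```

This still misses generated files (the other distortion mentioned above); excluding build output directories from the walk would address that.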
[jira] [Commented] (DRILL-2745) Query returns IOB Exception when JSON data with empty arrays is input to flatten function
[ https://issues.apache.org/jira/browse/DRILL-2745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14615819#comment-14615819 ] Khurram Faraaz commented on DRILL-2745: --- I see this error message now, I do not see IOB {code} 0: jdbc:drill:schema=dfs.tmp select flatten(outkey) from `nestedJArry.json`; Error: UNSUPPORTED_OPERATION ERROR: In a list of type BIGINT, encountered a value of type VARCHAR. Drill does not support lists of different types. File /tmp/nestedJArry.json Record 1 Line 1 Column 64 Field outkey Line 1 Column 64 Field outkey Fragment 0:0 [Error Id: 60ca5348-d7be-4443-9c0c-77e1c99cb430 on centos-04.qa.lab:31010] (state=,code=0) {code} Stack trace from drillbit.log {code} org.apache.drill.common.exceptions.UserRemoteException: UNSUPPORTED_OPERATION ERROR: In a list of type BIGINT, encountered a value of type VARCHAR. Drill does not support lists of different types. File /tmp/nestedJArry.json Record 1 Line 1 Column 64 Field outkey Line 1 Column 64 Field outkey Fragment 0:0 [Error Id: 60ca5348-d7be-4443-9c0c-77e1c99cb430 on centos-04.qa.lab:31010] at org.apache.drill.exec.work.foreman.QueryManager$1.statusUpdate(QueryManager.java:475) [drill-java-exec-1.1.0.jar:1.1.0] at org.apache.drill.exec.rpc.control.WorkEventBus.statusUpdate(WorkEventBus.java:71) [drill-java-exec-1.1.0.jar:1.1.0] at org.apache.drill.exec.work.batch.ControlMessageHandler.handle(ControlMessageHandler.java:79) [drill-java-exec-1.1.0.jar:1.1.0] at org.apache.drill.exec.rpc.control.ControlServer.handle(ControlServer.java:61) [drill-java-exec-1.1.0.jar:1.1.0] at org.apache.drill.exec.rpc.control.ControlServer.handle(ControlServer.java:38) [drill-java-exec-1.1.0.jar:1.1.0] at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:61) [drill-java-exec-1.1.0.jar:1.1.0] at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:233) [drill-java-exec-1.1.0.jar:1.1.0] at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:205) 
[drill-java-exec-1.1.0.jar:1.1.0] at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89) [netty-codec-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.handler.timeout.ReadTimeoutHandler.channelRead(ReadTimeoutHandler.java:150) [netty-handler-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) [netty-codec-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242) [netty-codec-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847) [netty-transport-4.0.27.Final.jar:4.0.27.Final] at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:618) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na] at
[jira] [Resolved] (DRILL-2745) Query returns IOB Exception when JSON data with empty arrays is input to flatten function
[ https://issues.apache.org/jira/browse/DRILL-2745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Altekruse resolved DRILL-2745. Resolution: Not A Problem Query returns IOB Exception when JSON data with empty arrays is input to flatten function - Key: DRILL-2745 URL: https://issues.apache.org/jira/browse/DRILL-2745 Project: Apache Drill Issue Type: Bug Components: Execution - Relational Operators Affects Versions: 0.9.0 Environment: | 9d92b8e319f2d46e8659d903d355450e15946533 | DRILL-2580: Exit early from HashJoinBatch if build side is empty | 26.03.2015 @ 16:13:53 EDT Reporter: Khurram Faraaz Assignee: Khurram Faraaz Fix For: 1.2.0 IOB Exception is returned when JSON file that has many empty arrays and arrays with different types of data is passed to flatten function. Tested on 4 node cluster on CentOS {code} 0: jdbc:drill: select flatten(outkey) from `nestedJArry.json` ; Query failed: RemoteRpcException: Failure while running fragment., index: 176, length: 4 (expected: range(0, 176)) [ 2627cf84-9dfb-4077-8531-9955ecdbdec7 on centos-02.qa.lab:31010 ] [ 2627cf84-9dfb-4077-8531-9955ecdbdec7 on centos-02.qa.lab:31010 ] Error: exception while executing query: Failure while executing query. 
(state=,code=0) 0: jdbc:drill: select outkey from `nestedJArry.json`; ++ | outkey | ++ | [[100,1000,200,99,1,0,-1,10],[a,b,c,d,e,p,o,f,m,q,d,s,v],[2012-04-01,1998-02-20,2011-08-05,1992-01-01],[10:30:29.123,12:29:21.999],[sdfklgjsdlkjfghlsidhfgopiuesrtoipuertoiurtyoiurotuiydkfjlbn,bfn;waokefpqowertoipuwergklnjdfbpdsiofgoigiuewqrqiugkjehgjksdhbvkjshdfkjsdfbnlkfbkljrghljrelkhbdlkfjbgkdfjbgkndfbnkldfgklbhjdflkghjlnkoiurty984756897345609782-3458745uiyoheirluht7895e6y],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[null],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[test string,hello world!,just do it!,houston we have a problem],[1,2,3,4,5,6,7,8,9,0]] | ++ 1 row selected (0.088 seconds) Stack trace from drillbit.log 2015-04-09 23:54:41,965 [2ad8eebd-adb6-6f7e-469e-4bb8ca276984:frag:0:0] WARN o.a.d.e.w.fragment.FragmentExecutor - Error while initializing or executing fragment java.lang.IndexOutOfBoundsException: index: 176, length: 4 (expected: range(0, 176)) at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:187) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final] at io.netty.buffer.DrillBuf.chk(DrillBuf.java:209) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final] at io.netty.buffer.DrillBuf.setInt(DrillBuf.java:513) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final] at org.apache.drill.exec.vector.UInt4Vector$Mutator.set(UInt4Vector.java:363) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] at org.apache.drill.exec.vector.RepeatedVarCharVector.splitAndTransferTo(RepeatedVarCharVector.java:173) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] at org.apache.drill.exec.vector.RepeatedVarCharVector$TransferImpl.splitAndTransfer(RepeatedVarCharVector.java:200) 
~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] at org.apache.drill.exec.test.generated.FlattenerGen1107.flattenRecords(FlattenTemplate.java:106) ~[na:na] at org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.doWork(FlattenRecordBatch.java:156) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:93) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.innerNext(FlattenRecordBatch.java:122) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99) ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
[jira] [Created] (DRILL-3463) Unit test of project pushdown in TestUnionAll should put more precisely plan attribute in plan verification.
Jinfeng Ni created DRILL-3463: - Summary: Unit test of project pushdown in TestUnionAll should put more precisely plan attribute in plan verification. Key: DRILL-3463 URL: https://issues.apache.org/jira/browse/DRILL-3463 Project: Apache Drill Issue Type: Bug Components: Query Planning Optimization Reporter: Jinfeng Ni Assignee: Jinfeng Ni As part of the fix for DRILL-2802, it was discovered that several unit test cases for project pushdown in TestUnionAll did not put the desired plan attributes into the expected plan result. To verify that project pushdown is working properly, one simple way is to check that the column list in the Scan operator contains the desired columns; this should be part of plan verification. However, the unit test cases in TestUnionAll did not do that. Instead, they try to match a Project -- Scan pattern, which does not serve the intended purpose. For instance, {code} final String[] expectedPlan = {"UnionAll.*\n" + ".*Project.*\n" + ".*Scan.*\n"}; {code} should be replaced by {code} final String[] expectedPlan = {"UnionAll.*\n" + ".*Project.*\n" + ".*Scan.*columns=\\[`n_comment`, `n_nationkey`, `n_name`\\].*\n"}; {code} if we want to verify that the columns 'n_comment', 'n_nationkey' and 'n_name' are pushed into the Scan operator. To fix this, modify the expected plan result so that it contains plan attributes that can verify whether project pushdown is working. This will help catch project pushdown failures and avoid false alarms in plan verification. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (DRILL-3243) Need a better error message - Use of alias in window function definition
[ https://issues.apache.org/jira/browse/DRILL-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deneche A. Hakim closed DRILL-3243. --- Resolution: Fixed Fixed in 1c9093e0f34daaeb1ad6661bb4d4115bc573ed78 Need a better error message - Use of alias in window function definition Key: DRILL-3243 URL: https://issues.apache.org/jira/browse/DRILL-3243 Project: Apache Drill Issue Type: Bug Components: Execution - Flow Affects Versions: 1.0.0 Reporter: Khurram Faraaz Assignee: Deneche A. Hakim Priority: Minor Fix For: 1.2.0 Attachments: DRILL-3243.1.patch.txt, DRILL-3243.2.patch.txt We need a better error message when an alias is used for a window definition in a query that uses window functions. For example, with OVER(PARTITION BY columns[0] ORDER BY columns[1]) tmp, if the alias tmp is used in the predicate we need a message that says column tmp does not exist, as Postgres 9.3 does. Postgres 9.3 {code} postgres=# select count(*) OVER(partition by type order by id) `tmp` from airports where tmp is not null; ERROR: column tmp does not exist LINE 1: ...ect count(*) OVER(partition by type order by id) `tmp` from ... ^ {code} Drill 1.0 {code} 0: jdbc:drill:schema=dfs.tmp select count(*) OVER(partition by columns[2] order by columns[0]) tmp from `airports.csv` where tmp is not null; Error: SYSTEM ERROR: java.lang.IllegalArgumentException: Selected column(s) must have name 'columns' or must be plain '*' Fragment 0:0 [Error Id: 66987b81-fe50-422d-95e4-9ce61c873584 on centos-02.qa.lab:31010] (state=,code=0) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-1750) Querying directories with JSON files returns incomplete results
[ https://issues.apache.org/jira/browse/DRILL-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Phillips updated DRILL-1750: --- Attachment: (was: DRILL-1750.patch) Querying directories with JSON files returns incomplete results --- Key: DRILL-1750 URL: https://issues.apache.org/jira/browse/DRILL-1750 Project: Apache Drill Issue Type: Bug Components: Storage - JSON Reporter: Abhishek Girish Assignee: Steven Phillips Priority: Critical Fix For: 1.2.0 Attachments: 1.json, 2.json, 3.json, 4.json, DRILL-1750_2015-07-06_16:39:04.patch I happened to observe that querying (select *) a directory with json files displays only fields common to all json files. All corresponding fields are displayed while querying each of the json files individually. And in some scenarios, querying the directory crashes sqlline. The example below may help make the issue clear: select * from dfs.`/data/json/tmp/1.json`; ++++ | artist | track_id | title| ++++ | Jonathan King | TRAAAEA128F935A30D | I'll Slap Your Face (Entertainment USA Theme) | ++++ 1 row selected (1.305 seconds) select * from dfs.`/data/json/tmp/2.json`; +++++ | artist | timestamp | track_id | title| +++++ | Supersuckers | 2011-08-01 20:30:17.991134 | TRAAAQN128F9353BA0 | Double Wide | +++++ 1 row selected (0.105 seconds) select * from dfs.`/data/json/tmp/3.json`; ++++ | timestamp | track_id | title| ++++ | 2011-08-01 20:30:17.991134 | TRAAAQN128F9353BA0 | Double Wide | ++++ 1 row selected (0.083 seconds) select * from dfs.`/data/json/tmp/4.json`; +++ | track_id | title| +++ | TRAAAQN128F9353BA0 | Double Wide | +++ 1 row selected (0.076 seconds) select * from dfs.`/data/json/tmp`; +++ | track_id | title| +++ | TRAAAQN128F9353BA0 | Double Wide | | TRAAAQN128F9353BA0 | Double Wide | | TRAAAEA128F935A30D | I'll Slap Your Face (Entertainment USA Theme) | | TRAAAQN128F9353BA0 | Double Wide | +++ 4 rows selected (0.121 seconds) JVM Crash occurs at times: select * from dfs.`/data/json/tmp`; ++++ | 
timestamp | track_id | title| ++++ | 2011-08-01 20:30:17.991134 | TRAAAQN128F9353BA0 | Double Wide | # # A fatal error has been detected by the Java Runtime Environment: # # SIGSEGV (0xb) at pc=0x7f3cb99be053, pid=13943, tid=139898808436480 # # JRE version: OpenJDK Runtime Environment (7.0_65-b17) (build 1.7.0_65-mockbuild_2014_07_16_06_06-b00) # Java VM: OpenJDK 64-Bit Server VM (24.65-b04 mixed mode linux-amd64 compressed oops) # Problematic frame: # V [libjvm.so+0x932053] # # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try ulimit -c unlimited before starting Java again # # An error report file with more information is saved as: # /tmp/jvm-13943/hs_error.log # # If you would like to submit a bug report, please include # instructions on how to reproduce the bug and visit: # http://icedtea.classpath.org/bugzilla # Aborted -- This message was sent by Atlassian JIRA (v6.3.4#6332)
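The buggy behavior above looks like the directory scan surfacing only the fields common to all files, whereas `select *` over a directory should surface the union of every file's fields. A minimal sketch of the expected union semantics (class and method names are hypothetical, not Drill's schema-merging code):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class SchemaUnion {
    // Merges per-file field sets into the union, preserving first-seen order.
    public static Set<String> unionOfFields(List<Set<String>> perFileFields) {
        Set<String> all = new LinkedHashSet<>();
        for (Set<String> fields : perFileFields) {
            all.addAll(fields);
        }
        return all;
    }

    public static void main(String[] args) {
        // Field sets of the four JSON files from the report.
        Set<String> f1 = new HashSet<>(Arrays.asList("artist", "track_id", "title"));
        Set<String> f2 = new HashSet<>(Arrays.asList("artist", "timestamp", "track_id", "title"));
        Set<String> f3 = new HashSet<>(Arrays.asList("timestamp", "track_id", "title"));
        Set<String> f4 = new HashSet<>(Arrays.asList("track_id", "title"));
        // Union has all four fields; the bug returned only {track_id, title}.
        System.out.println(unionOfFields(Arrays.asList(f1, f2, f3, f4)));
    }
}
```

Files missing a field would then report null for it in the combined result, rather than the field disappearing entirely.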
[jira] [Commented] (DRILL-1750) Querying directories with JSON files returns incomplete results
[ https://issues.apache.org/jira/browse/DRILL-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14615898#comment-14615898 ] Steven Phillips commented on DRILL-1750: Updated reviewboard https://reviews.apache.org/r/36229/ Querying directories with JSON files returns incomplete results --- Key: DRILL-1750 URL: https://issues.apache.org/jira/browse/DRILL-1750 Project: Apache Drill Issue Type: Bug Components: Storage - JSON Reporter: Abhishek Girish Assignee: Steven Phillips Priority: Critical Fix For: 1.2.0 Attachments: 1.json, 2.json, 3.json, 4.json, DRILL-1750_2015-07-06_16:39:04.patch I happened to observe that querying (select *) a directory with json files displays only fields common to all json files. All corresponding fields are displayed while querying each of the json files individually. And in some scenarios, querying the directory crashes sqlline. The example below may help make the issue clear: select * from dfs.`/data/json/tmp/1.json`; ++++ | artist | track_id | title| ++++ | Jonathan King | TRAAAEA128F935A30D | I'll Slap Your Face (Entertainment USA Theme) | ++++ 1 row selected (1.305 seconds) select * from dfs.`/data/json/tmp/2.json`; +++++ | artist | timestamp | track_id | title| +++++ | Supersuckers | 2011-08-01 20:30:17.991134 | TRAAAQN128F9353BA0 | Double Wide | +++++ 1 row selected (0.105 seconds) select * from dfs.`/data/json/tmp/3.json`; ++++ | timestamp | track_id | title| ++++ | 2011-08-01 20:30:17.991134 | TRAAAQN128F9353BA0 | Double Wide | ++++ 1 row selected (0.083 seconds) select * from dfs.`/data/json/tmp/4.json`; +++ | track_id | title| +++ | TRAAAQN128F9353BA0 | Double Wide | +++ 1 row selected (0.076 seconds) select * from dfs.`/data/json/tmp`; +++ | track_id | title| +++ | TRAAAQN128F9353BA0 | Double Wide | | TRAAAQN128F9353BA0 | Double Wide | | TRAAAEA128F935A30D | I'll Slap Your Face (Entertainment USA Theme) | | TRAAAQN128F9353BA0 | Double Wide | +++ 4 rows selected (0.121 seconds) JVM Crash 
occurs at times: select * from dfs.`/data/json/tmp`; ++++ | timestamp | track_id | title| ++++ | 2011-08-01 20:30:17.991134 | TRAAAQN128F9353BA0 | Double Wide | # # A fatal error has been detected by the Java Runtime Environment: # # SIGSEGV (0xb) at pc=0x7f3cb99be053, pid=13943, tid=139898808436480 # # JRE version: OpenJDK Runtime Environment (7.0_65-b17) (build 1.7.0_65-mockbuild_2014_07_16_06_06-b00) # Java VM: OpenJDK 64-Bit Server VM (24.65-b04 mixed mode linux-amd64 compressed oops) # Problematic frame: # V [libjvm.so+0x932053] # # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try ulimit -c unlimited before starting Java again # # An error report file with more information is saved as: # /tmp/jvm-13943/hs_error.log # # If you would like to submit a bug report, please include # instructions on how to reproduce the bug and visit: # http://icedtea.classpath.org/bugzilla # Aborted -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3076) USING clause should not be supported in drill
[ https://issues.apache.org/jira/browse/DRILL-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14615897#comment-14615897 ] Sean Hsuan-Yi Chu commented on DRILL-3076: -- Why do we disable it? Besides, in the TestJoinNullable unit test class, we have been accepting USING in a few tests. USING clause should not be supported in drill -- Key: DRILL-3076 URL: https://issues.apache.org/jira/browse/DRILL-3076 Project: Apache Drill Issue Type: Bug Components: Query Planning Optimization Affects Versions: 1.0.0 Reporter: Victoria Markman Assignee: Sean Hsuan-Yi Chu Fix For: 1.2.0 For the same reason natural join is not supported. See https://issues.apache.org/jira/browse/DRILL-1986 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-3464) Index out of bounds exception while performing concat()
Mehant Baid created DRILL-3464: -- Summary: Index out of bounds exception while performing concat() Key: DRILL-3464 URL: https://issues.apache.org/jira/browse/DRILL-3464 Project: Apache Drill Issue Type: Bug Reporter: Mehant Baid Assignee: Mehant Baid Fix For: 1.2.0 We hit IOOB while performing concat() on a single input in DrillOptiq. Below is the stack trace: at java.util.ArrayList.rangeCheck(ArrayList.java:635) ~[na:1.7.0_67] at java.util.ArrayList.get(ArrayList.java:411) ~[na:1.7.0_67] at org.apache.drill.exec.planner.logical.DrillOptiq$RexToDrill.getDrillFunctionFromOptiqCall(DrillOptiq.java:373) ~[classes/:na] at org.apache.drill.exec.planner.logical.DrillOptiq$RexToDrill.visitCall(DrillOptiq.java:106) ~[classes/:na] at org.apache.drill.exec.planner.logical.DrillOptiq$RexToDrill.visitCall(DrillOptiq.java:77) ~[classes/:na] at org.apache.calcite.rex.RexCall.accept(RexCall.java:107) ~[classes/:na] at org.apache.drill.exec.planner.logical.DrillOptiq.toDrill(DrillOptiq.java:74) ~[classes/:na] at org.apache.drill.exec.planner.common.DrillProjectRelBase.getProjectExpressions(DrillProjectRelBase.java:111) ~[classes/:na] at org.apache.drill.exec.planner.physical.ProjectPrel.getPhysicalOperator(ProjectPrel.java:57) ~[classes/:na] at org.apache.drill.exec.planner.physical.ScreenPrel.getPhysicalOperator(ScreenPrel.java:51) ~[classes/:na] at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToPop(DefaultSqlHandler.java:392) ~[classes/:na] at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:167) ~[classes/:na] at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178) ~[classes/:na] at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:903) [classes/:na] at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:242) [classes/:na] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
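The failure mode — ArrayList.rangeCheck inside getDrillFunctionFromOptiqCall — is what happens when code handling a call's operand list assumes at least two operands. A hypothetical illustration (not Drill's actual DrillOptiq code) of why a single-input concat() trips the IOOB:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SingleOperandIoob {
    // Hypothetical rewrite that assumes concat always has two operands.
    // For a one-argument call, operands.get(1) throws
    // IndexOutOfBoundsException, matching the stack trace above.
    public static String rewriteConcat(List<String> operands) {
        return "concat(" + operands.get(0) + ", " + operands.get(1) + ")";
    }

    public static void main(String[] args) {
        System.out.println(rewriteConcat(Arrays.asList("a", "b"))); // fine
        try {
            rewriteConcat(Collections.singletonList("col1"));       // throws
        } catch (IndexOutOfBoundsException e) {
            System.out.println("IOOB on single-input concat");
        }
    }
}
```

The fix direction would be to branch on the operand count (or fold a single-operand concat into the operand itself) rather than indexing unconditionally.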
[jira] [Updated] (DRILL-2561) Profile UI: Metrics displayed incorrectly for failed query
[ https://issues.apache.org/jira/browse/DRILL-2561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sudheesh Katkam updated DRILL-2561: --- Fix Version/s: (was: 1.2.0) Future Profile UI: Metrics displayed incorrectly for failed query -- Key: DRILL-2561 URL: https://issues.apache.org/jira/browse/DRILL-2561 Project: Apache Drill Issue Type: Bug Components: Client - HTTP Affects Versions: 0.9.0 Reporter: Krystal Assignee: Sudheesh Katkam Fix For: Future git.commit.id=8493713cafe6e5d1f56f2dffc9d8bea294a6e013 I have a query that failed to execute. The profile UI for this query displayed wrong metrics in columns. Here is the url for that profile: http://10.10.100.115:8047/profiles/2aed1b79-17a0-312d-42a5-161a1c2c66a4 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (DRILL-2800) Performance regression introduced with commit: a6df26a (Patch for DRILL-2512)
[ https://issues.apache.org/jira/browse/DRILL-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sudheesh Katkam closed DRILL-2800. -- Performance regression introduced with commit: a6df26a (Patch for DRILL-2512) -- Key: DRILL-2800 URL: https://issues.apache.org/jira/browse/DRILL-2800 Project: Apache Drill Issue Type: Bug Affects Versions: 0.9.0 Environment: RHEL 6.4 TPCH Data Set: SF100 (Uncompressed Parquet) Reporter: Kunal Khatua Assignee: Sudheesh Katkam Fix For: 1.2.0 TPCH 06 (Cached Run) was used as a reference to identify the regressive commit.
DRILL-2613: 2-Core: Impl. ResultSet.getXxx(...) number-to-number data [fe11e86] 3,902 msec
DRILL-2668: Fix: CAST(1.1 AS FLOAT) was yielding DOUBLE. [49042bc] 5,606 msec
DRILL-2512: Shuffle the list of Drill endpoints before connecting [a6df26a] 10,506 msec (Rerun 9,678 msec)
Here are comparisons from the last complete run (Cached runs):
Commit    d7e37f4   a6df26a
tpch 01   12,232    16,693
tpch 03   23,374    30,062
tpch 04   42,144    23,749
tpch 05   32,247    41,648
tpch 06    4,665    10,506
tpch 07   29,322    34,315
tpch 08   35,478    42,120
tpch 09   43,959    49,262
tpch 10   24,439    26,136
tpch 12   Timeout   18,866
tpch 13   18,226    20,863
tpch 14   11,760    11,884
tpch 16   10,676    15,032
tpch 18   34,153    39,058
tpch 19   Timeout   32,909
tpch 20   99,788    22,890
-- This message was sent by Atlassian JIRA (v6.3.4#6332)