[jira] [Assigned] (DRILL-4972) Drillbit shuts down immediately after starting if embedded web server is disabled
[ https://issues.apache.org/jira/browse/DRILL-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zelaine Fong reassigned DRILL-4972:
-----------------------------------

Assignee: Sorabh Hamirwasia (was: Sudheesh Katkam)

Assigning to [~shamirwasia] for review.

> Drillbit shuts down immediately after starting if embedded web server is disabled
> ---------------------------------------------------------------------------------
>
> Key: DRILL-4972
> URL: https://issues.apache.org/jira/browse/DRILL-4972
> Project: Apache Drill
> Issue Type: Bug
> Components: Server
> Affects Versions: 1.8.0
> Reporter: Sudheesh Katkam
> Assignee: Sorabh Hamirwasia
> Priority: Critical
> Fix For: 1.9.0
>
> Disable embedded web server by setting "drill.exec.http.enabled" to false.
> Now when drillbit is started, it shuts down immediately after starting.
> JVM exits when the only threads running are all daemon threads. Turns out all
> threads in a drillbit, other than the thread pool started by the web server,
> are daemon. So I suggest WorkManager#StatusThread be made non-daemon.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
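The daemon-thread behaviour described in the report can be illustrated with a small standalone sketch. This is not Drill code: the class name, the thread name, and the latch-based shutdown signal are all hypothetical, and the only point is that a thread created with `setDaemon(false)` keeps the JVM alive after `main` returns, while a daemon thread does not.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical illustration; not the WorkManager#StatusThread implementation.
final class DaemonThreadSketch {

  /**
   * Builds a thread that blocks until the latch is released.
   * The daemon flag decides whether this thread alone can keep the JVM running.
   */
  static Thread newStatusThread(CountDownLatch shutdown, boolean daemon) {
    Thread t = new Thread(() -> {
      try {
        shutdown.await(); // stand-in for periodic status work
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "status-thread-sketch");
    t.setDaemon(daemon); // must be set before start()
    return t;
  }

  public static void main(String[] args) throws InterruptedException {
    CountDownLatch shutdown = new CountDownLatch(1);
    Thread status = newStatusThread(shutdown, false);
    status.start();
    System.out.println("daemon = " + status.isDaemon());
    // With daemon = false, the process would stay up here even if main returned.
    // Releasing the latch lets the thread finish so this demo can exit.
    shutdown.countDown();
    status.join();
  }
}
```

Passing `true` instead of `false` and removing the `countDown()` and `join()` calls shows the reported symptom: the process ends as soon as `main` returns.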
[jira] [Created] (DRILL-4973) Sqlline history
Andries Engelbrecht created DRILL-4973:
---------------------------------------

Summary: Sqlline history
Key: DRILL-4973
URL: https://issues.apache.org/jira/browse/DRILL-4973
Project: Apache Drill
Issue Type: Improvement
Components: Client - CLI
Reporter: Andries Engelbrecht
Priority: Minor

Currently the history on sqlline stops working after 500 queries have been logged in the user's .sqlline/history file.
[jira] [Created] (DRILL-4974) NPE in FindPartitionConditions.analyzeCall() for 'holistic' expressions
Karthikeyan Manivannan created DRILL-4974:
------------------------------------------

Summary: NPE in FindPartitionConditions.analyzeCall() for 'holistic' expressions
Key: DRILL-4974
URL: https://issues.apache.org/jira/browse/DRILL-4974
Project: Apache Drill
Issue Type: Bug
Components: Execution - Relational Operators
Affects Versions: 1.8.0, 1.7.0, 1.6.0
Reporter: Karthikeyan Manivannan
Assignee: Karthikeyan Manivannan
Fix For: 1.8.0

The following query can cause an NPE in FindPartitionConditions.analyzeCall() if the fileSize column is a partitioned column.

SELECT fileSize FROM dfs.`/drill-data/data/` WHERE compoundId LIKE 'FOO-1234567%'

This is because the LIKE is treated as a holistic expression in FindPartitionConditions.analyzeCall(), which leaves opStack empty, so opStack.peek() returns a NULL value.
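The failure mode in this report (peeking at an empty stack and then dereferencing the result) can be sketched in isolation. The sketch assumes opStack is a `java.util.Deque`, whose `peek()` returns null when empty; that matches the description here but is not confirmed by the message. The class, the method name, and the use of strings as stack elements are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration of a null-checked peek; not the Drill source.
final class OpStackPeekSketch {

  /**
   * Returns true only when the stack has a top element equal to {@code expected}.
   * Deque.peek() returns null on an empty deque, so the null check comes first.
   */
  static boolean parentIs(Deque<String> opStack, String expected) {
    String parent = opStack.peek();
    return parent != null && parent.equals(expected);
  }

  public static void main(String[] args) {
    Deque<String> opStack = new ArrayDeque<>();
    // Empty stack: without the null check, parent.equals(...) would throw an NPE.
    System.out.println(parentIs(opStack, "AND"));
    opStack.push("AND");
    System.out.println(parentIs(opStack, "AND"));
  }
}
```

Note that `java.util.Stack.peek()` behaves differently: it throws `EmptyStackException` instead of returning null.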
[jira] [Created] (DRILL-4975) Null equality join over non existing columns returns nulls forever
Khurram Faraaz created DRILL-4975:
----------------------------------

Summary: Null equality join over non existing columns returns nulls forever
Key: DRILL-4975
URL: https://issues.apache.org/jira/browse/DRILL-4975
Project: Apache Drill
Issue Type: Bug
Components: Execution - Flow
Affects Versions: 1.9.0
Reporter: Khurram Faraaz

A null equality join query whose join predicate references non-existent columns runs forever and returns nulls. Since the columns are not part of the parquet file, the join should fail gracefully and not return an infinite number of nulls.

{noformat}
SELECT t1.col_blah , t2.col_blah
FROM typeall_l t1, typeall_r t2
WHERE t1.col_blah = t2.col_blah OR ( t1.col_blah IS NULL AND t2.col_blah IS NULL );
...
| null | null |
| null | null |
| null | null |
| null | null |
| null | null |
| null | null |
...
{noformat}

Upon removing OR ( t1.col_blah IS NULL AND t2.col_blah IS NULL ) from the query, no results are returned, as expected.
[jira] [Commented] (DRILL-4369) Database driver fails to report any major or minor version information
[ https://issues.apache.org/jira/browse/DRILL-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613378#comment-15613378 ]

ASF GitHub Bot commented on DRILL-4369:
---------------------------------------

Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/622

Thanks for the responses! Sorry I posted my comments just after the commit. The issues are "nice-to-haves" for the original PR and do not justify another PR.

> Database driver fails to report any major or minor version information
> ----------------------------------------------------------------------
>
> Key: DRILL-4369
> URL: https://issues.apache.org/jira/browse/DRILL-4369
> Project: Apache Drill
> Issue Type: Bug
> Components: Client - JDBC
> Affects Versions: 1.4.0
> Reporter: N Campbell
> Assignee: Laurent Goujon
> Fix For: 1.9.0
>
> Using Apache Drill 1.4.
> The DatabaseMetaData getters that obtain the major and minor versions of the
> server or JDBC driver return 0 instead of 1.4.
> This prevents an application from dynamically adjusting how it interacts
> based on which version of Drill a connection is accessing.
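The getters in question are the standard JDBC methods `DatabaseMetaData.getDatabaseMajorVersion()`, `getDatabaseMinorVersion()`, `getDriverMajorVersion()` and `getDriverMinorVersion()`. As a rough sketch of what a driver needs in order to answer them, the helper below splits a version string such as "1.4.0" into major and minor numbers. The class and method names are hypothetical, and nothing here is taken from the fix in pull request 622.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not the Drill JDBC driver's implementation.
final class VersionSketch {
  // Leading "major" with an optional ".minor"; anything after that is ignored.
  private static final Pattern LEADING_VERSION = Pattern.compile("(\\d+)(?:\\.(\\d+))?");

  final int major;
  final int minor;

  private VersionSketch(int major, int minor) {
    this.major = major;
    this.minor = minor;
  }

  /** Parses strings like "1.4.0" or "1.9.0-SNAPSHOT"; falls back to 0.0 if no digits lead. */
  static VersionSketch parse(String version) {
    Matcher m = LEADING_VERSION.matcher(version);
    if (!m.lookingAt()) {
      return new VersionSketch(0, 0);
    }
    int major = Integer.parseInt(m.group(1));
    int minor = m.group(2) == null ? 0 : Integer.parseInt(m.group(2));
    return new VersionSketch(major, minor);
  }

  public static void main(String[] args) {
    VersionSketch v = parse("1.4.0");
    System.out.println(v.major + "." + v.minor);
  }
}
```

An application can then branch on the reported numbers, which is the use case the reporter describes.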
[jira] [Commented] (DRILL-4968) Add column size information to ColumnMetadata
[ https://issues.apache.org/jira/browse/DRILL-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613634#comment-15613634 ]

ASF GitHub Bot commented on DRILL-4968:
---------------------------------------

Github user adeneche commented on the issue:

https://github.com/apache/drill/pull/631

+1, LGTM

> Add column size information to ColumnMetadata
> ---------------------------------------------
>
> Key: DRILL-4968
> URL: https://issues.apache.org/jira/browse/DRILL-4968
> Project: Apache Drill
> Issue Type: Sub-task
> Components: Metadata
> Reporter: Laurent Goujon
> Assignee: Laurent Goujon
>
> Both ODBC and JDBC need column size information for the column metadata.
> Instead of duplicating the logic between C++ and Java (and having to keep
> them in sync), column size should be computed on the server so that the value
> is kept consistent across clients.
[jira] [Commented] (DRILL-4373) Drill and Hive have incompatible timestamp representations in parquet
[ https://issues.apache.org/jira/browse/DRILL-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613668#comment-15613668 ]

ASF GitHub Bot commented on DRILL-4373:
---------------------------------------

Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/600#discussion_r85449218

--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java ---
@@ -739,30 +741,76 @@ public void runTestAndValidate(String selection, String validationSelection, Str
   }

   /*
-  Test the reading of an int96 field. Impala encodes timestamps as int96 fields
+    Impala encodes timestamp values as int96 fields. Test the reading of an int96 field with two converters:
+    the first one converts parquet INT96 into drill VARBINARY and the second one (works while
+    store.parquet.reader.int96_as_timestamp option is enabled) converts parquet INT96 into drill TIMESTAMP.
    */
   @Test
   public void testImpalaParquetInt96() throws Exception {
     compareParquetReadersColumnar("field_impala_ts", "cp.`parquet/int96_impala_1.parquet`");
+    try {
+      test("alter session set %s = true", ExecConstants.PARQUET_READER_INT96_AS_TIMESTAMP);
+      compareParquetReadersColumnar("field_impala_ts", "cp.`parquet/int96_impala_1.parquet`");
+    } finally {
+      test("alter session reset %s", ExecConstants.PARQUET_READER_INT96_AS_TIMESTAMP);
+    }
   }

   /*
-  Test the reading of a binary field where data is in dicationary _and_ non-dictionary encoded pages
+  Test the reading of a binary field as drill varbinary where data is in dicationary _and_ non-dictionary encoded pages
    */
   @Test
-  public void testImpalaParquetVarBinary_DictChange() throws Exception {
+  public void testImpalaParquetBinaryAsVarBinary_DictChange() throws Exception {
     compareParquetReadersColumnar("field_impala_ts", "cp.`parquet/int96_dict_change.parquet`");
   }

   /*
+  Test the reading of a binary field as drill timestamp where data is in dicationary _and_ non-dictionary encoded pages
+   */
+  @Test
+  public void testImpalaParquetBinaryAsTimeStamp_DictChange() throws Exception {
+    final String WORKING_PATH = TestTools.getWorkingPath();
+    final String TEST_RES_PATH = WORKING_PATH + "/src/test/resources";
+    try {
+      testBuilder()
+          .sqlQuery("select int96_ts from dfs_test.`%s/parquet/int96_dict_change`", TEST_RES_PATH)
+          .optionSettingQueriesForTestQuery(
+              "alter session set `%s` = true", ExecConstants.PARQUET_READER_INT96_AS_TIMESTAMP)
+          .ordered()
+          .csvBaselineFile("testframework/testParquetReader/testInt96DictChange/q1.tsv")
+          .baselineTypes(TypeProtos.MinorType.TIMESTAMP)
+          .baselineColumns("int96_ts")
+          .build().run();
+    } finally {
+      test("alter system reset `%s`", ExecConstants.PARQUET_READER_INT96_AS_TIMESTAMP);
+    }
+  }
+
+  /*
   Test the conversion from int96 to impala timestamp
    */
   @Test
-  public void testImpalaParquetTimestampAsInt96() throws Exception {
+  public void testTimestampImpalaConvertFrom() throws Exception {
     compareParquetReadersColumnar("convert_from(field_impala_ts, 'TIMESTAMP_IMPALA')", "cp.`parquet/int96_impala_1.parquet`");
   }

   /*
+  Test reading parquet Int96 as TimeStamp and comparing obtained values with the
+  old results (reading the same values as VarBinary and convert_fromTIMESTAMP_IMPALA function using)
+   */
+  @Test
+  public void testImpalaParquetTimestampInt96AsTimeStamp() throws Exception {
--- End diff --

The test testImpalaParquetTimestampInt96AsTimeStamp fails when run in a different timezone. Can you mark this as @Ignore unless you can fix the test to run across different timezones?

> Drill and Hive have incompatible timestamp representations in parquet
> ---------------------------------------------------------------------
>
> Key: DRILL-4373
> URL: https://issues.apache.org/jira/browse/DRILL-4373
> Project: Apache Drill
> Issue Type: Improvement
> Components: Storage - Hive, Storage - Parquet
> Affects Versions: 1.8.0
> Reporter: Rahul Challapalli
> Assignee: Karthikeyan Manivannan
> Labels: doc-impacting
> Fix For: 1.9.0
>
> git.commit.id.abbrev=83d460c
> I created a parquet file with a timestamp type using Drill. Now if I define a
> hive table on top of the parquet file and use "timestamp" as the column type,
> drill fails to read the hive table through the hive storage plugin
> Implementation:
[jira] [Commented] (DRILL-4373) Drill and Hive have incompatible timestamp representations in parquet
[ https://issues.apache.org/jira/browse/DRILL-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613669#comment-15613669 ]

ASF GitHub Bot commented on DRILL-4373:
---------------------------------------

Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/600

Changing this to -1 until unit test failure is addressed.

> Drill and Hive have incompatible timestamp representations in parquet
> ---------------------------------------------------------------------
>
> Key: DRILL-4373
> URL: https://issues.apache.org/jira/browse/DRILL-4373
> Project: Apache Drill
> Issue Type: Improvement
> Components: Storage - Hive, Storage - Parquet
> Affects Versions: 1.8.0
> Reporter: Rahul Challapalli
> Assignee: Karthikeyan Manivannan
> Labels: doc-impacting
> Fix For: 1.9.0
>
> git.commit.id.abbrev=83d460c
> I created a parquet file with a timestamp type using Drill. Now if I define a
> hive table on top of the parquet file and use "timestamp" as the column type,
> drill fails to read the hive table through the hive storage plugin
> Implementation:
> Added an int96 to timestamp converter for both parquet readers, controlled
> by the system / session option "store.parquet.reader.int96_as_timestamp".
> The option is false by default so that old query scripts using the
> "convert_from TIMESTAMP_IMPALA" function keep working.
> When the option is true, using that function is unnecessary and can cause
> the query to fail.
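For readers unfamiliar with the INT96 layout mentioned above, the following sketch decodes one 12-byte value into milliseconds since the Unix epoch. It assumes the layout commonly written by Impala and Hive: 8 little-endian bytes of nanoseconds within the day, followed by a 4-byte little-endian Julian day number, with Julian day 2440588 corresponding to 1970-01-01. The class and method names are hypothetical, and this is not the converter added in the pull request.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration of the INT96 timestamp layout; not Drill's converter.
final class Int96TimestampSketch {
  static final long JULIAN_DAY_OF_EPOCH = 2_440_588L; // Julian day number of 1970-01-01
  static final long MILLIS_PER_DAY = 86_400_000L;
  static final long NANOS_PER_MILLI = 1_000_000L;

  /** Decodes 12 bytes (nanos of day, then Julian day) into epoch milliseconds. */
  static long toEpochMillis(byte[] int96) {
    if (int96.length != 12) {
      throw new IllegalArgumentException("INT96 value must be 12 bytes");
    }
    ByteBuffer buf = ByteBuffer.wrap(int96).order(ByteOrder.LITTLE_ENDIAN);
    long nanosOfDay = buf.getLong(); // first 8 bytes
    long julianDay = buf.getInt();   // last 4 bytes
    return (julianDay - JULIAN_DAY_OF_EPOCH) * MILLIS_PER_DAY + nanosOfDay / NANOS_PER_MILLI;
  }

  /** Builds a 12-byte value in the same layout, for experimenting with the decoder. */
  static byte[] encode(int julianDay, long nanosOfDay) {
    return ByteBuffer.allocate(12)
        .order(ByteOrder.LITTLE_ENDIAN)
        .putLong(nanosOfDay)
        .putInt(julianDay)
        .array();
  }

  public static void main(String[] args) {
    byte[] oneDayAfterEpoch = encode(2_440_589, 0L);
    System.out.println(toEpochMillis(oneDayAfterEpoch));
  }
}
```

Sub-millisecond precision is dropped by the integer division, which is a simplification made for this sketch.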
[jira] [Commented] (DRILL-4974) NPE in FindPartitionConditions.analyzeCall() for 'holistic' expressions
[ https://issues.apache.org/jira/browse/DRILL-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613677#comment-15613677 ]

ASF GitHub Bot commented on DRILL-4974:
---------------------------------------

GitHub user bitblender opened a pull request:

https://github.com/apache/drill/pull/634

DRILL-4974: NPE in FindPartitionConditions.analyzeCall() for 'holistic' expressions

Changes: Added a missing null check in FindPartitionConditions.analyzeCall(), to ensure that the opStack.peek() value is dereferenced only after a null check. Without this check, if the expression is holistic, opStack can be empty, so using the value of opStack.peek() without a check can cause an NPE.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bitblender/drill DRILL-4974

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/634.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #634

commit a519a0987280abeb00e33a8088d2f7d6c9809eed
Author: karthik
Date: 2016-10-20T20:43:17Z

DRILL-4974: Add missing null check in FindPartitionConditions.analyzeCall()

> NPE in FindPartitionConditions.analyzeCall() for 'holistic' expressions
> -----------------------------------------------------------------------
>
> Key: DRILL-4974
> URL: https://issues.apache.org/jira/browse/DRILL-4974
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Relational Operators
> Affects Versions: 1.6.0, 1.7.0, 1.8.0
> Reporter: Karthikeyan Manivannan
> Assignee: Karthikeyan Manivannan
> Fix For: 1.8.0
>
> Original Estimate: 2h
> Remaining Estimate: 2h
>
> The following query can cause an NPE in FindPartitionConditions.analyzeCall()
> if the fileSize column is a partitioned column.
> SELECT fileSize FROM dfs.`/drill-data/data/` WHERE compoundId LIKE
> 'FOO-1234567%'
> This is because the LIKE is treated as a holistic expression in
> FindPartitionConditions.analyzeCall(), causing opStack to be empty, thus
> causing opStack.peek() to return a NULL value.
[jira] [Commented] (DRILL-4968) Add column size information to ColumnMetadata
[ https://issues.apache.org/jira/browse/DRILL-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613776#comment-15613776 ]

ASF GitHub Bot commented on DRILL-4968:
---------------------------------------

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/631

> Add column size information to ColumnMetadata
> ---------------------------------------------
>
> Key: DRILL-4968
> URL: https://issues.apache.org/jira/browse/DRILL-4968
> Project: Apache Drill
> Issue Type: Sub-task
> Components: Metadata
> Reporter: Laurent Goujon
> Assignee: Laurent Goujon
>
> Both ODBC and JDBC need column size information for the column metadata.
> Instead of duplicating the logic between C++ and Java (and having to keep
> them in sync), column size should be computed on the server so that the value
> is kept consistent across clients.
[jira] [Commented] (DRILL-4974) NPE in FindPartitionConditions.analyzeCall() for 'holistic' expressions
[ https://issues.apache.org/jira/browse/DRILL-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613792#comment-15613792 ]

ASF GitHub Bot commented on DRILL-4974:
---------------------------------------

Github user bitblender commented on the issue:

https://github.com/apache/drill/pull/634

@amansinha100 Can you please review this change?

> NPE in FindPartitionConditions.analyzeCall() for 'holistic' expressions
> -----------------------------------------------------------------------
>
> Key: DRILL-4974
> URL: https://issues.apache.org/jira/browse/DRILL-4974
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Relational Operators
> Affects Versions: 1.6.0, 1.7.0, 1.8.0
> Reporter: Karthikeyan Manivannan
> Assignee: Karthikeyan Manivannan
> Fix For: 1.8.0
>
> Original Estimate: 2h
> Remaining Estimate: 2h
>
> The following query can cause an NPE in FindPartitionConditions.analyzeCall()
> if the fileSize column is a partitioned column.
> SELECT fileSize FROM dfs.`/drill-data/data/` WHERE compoundId LIKE
> 'FOO-1234567%'
> This is because the LIKE is treated as a holistic expression in
> FindPartitionConditions.analyzeCall(), causing opStack to be empty, thus
> causing opStack.peek() to return a NULL value.
[jira] [Commented] (DRILL-4826) Query against INFORMATION_SCHEMA.TABLES degrades as the number of views increases
[ https://issues.apache.org/jira/browse/DRILL-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613960#comment-15613960 ]

ASF GitHub Bot commented on DRILL-4826:
---------------------------------------

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/592

> Query against INFORMATION_SCHEMA.TABLES degrades as the number of views increases
> ---------------------------------------------------------------------------------
>
> Key: DRILL-4826
> URL: https://issues.apache.org/jira/browse/DRILL-4826
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Parth Chandra
> Assignee: Padma Penumarthy
>
> Queries against INFORMATION_SCHEMA.TABLES and INFORMATION_SCHEMA.VIEWS slow
> down as the number of views increases.
> BI tools like Tableau issue a query like the following at connection time:
> {code}
> select TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE from
> INFORMATION_SCHEMA.`TABLES` WHERE TABLE_CATALOG LIKE 'DRILL' ESCAPE '\' AND
> TABLE_SCHEMA <> 'sys' AND TABLE_SCHEMA <> 'INFORMATION_SCHEMA' ORDER BY
> TABLE_TYPE, TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME
> {code}
> The time to query the information schema tables degrades as the number of
> views increases. On a test system:
> || Views || Time(secs) ||
> | 500 | 6 |
> | 1000 | 19 |
> | 1500 | 33 |
> This can result in a single connection taking more than a minute to establish.
> The problem occurs because we read the view file for every view and this
> appears to take most of the time.
> Querying information_schema.tables does not, in fact, need to open the view
> file at all; it merely needs to get a listing of the view files. Eliminating
> the view file read will speed up the query tremendously.
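The idea of listing view files instead of opening each one can be sketched as follows. The sketch uses `java.nio.file` on a local directory purely for illustration, whereas Drill works against its own file system abstraction; it also assumes that view definitions are stored in files ending in ".view.drill". All class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; not the change made in pull request 592.
final class ViewListingSketch {
  static final String VIEW_SUFFIX = ".view.drill"; // assumed view file extension

  /** Returns the view name for a view file name, or null if the file is not a view file. */
  static String viewNameOf(String fileName) {
    if (fileName.length() <= VIEW_SUFFIX.length() || !fileName.endsWith(VIEW_SUFFIX)) {
      return null;
    }
    return fileName.substring(0, fileName.length() - VIEW_SUFFIX.length());
  }

  /** Lists view names from a directory listing alone; no view file is opened or parsed. */
  static List<String> listViewNames(Path workspaceDir) {
    List<String> names = new ArrayList<>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(workspaceDir, "*" + VIEW_SUFFIX)) {
      for (Path p : stream) {
        String name = viewNameOf(p.getFileName().toString());
        if (name != null) {
          names.add(name);
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    Collections.sort(names);
    return names;
  }

  public static void main(String[] args) {
    Path dir = Paths.get(args.length > 0 ? args[0] : ".");
    System.out.println(listViewNames(dir));
  }
}
```

The cost of such a listing grows with the number of directory entries rather than with the size or content of each view definition, which is the saving the report argues for.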
[jira] [Assigned] (DRILL-4974) NPE in FindPartitionConditions.analyzeCall() for 'holistic' expressions
[ https://issues.apache.org/jira/browse/DRILL-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zelaine Fong reassigned DRILL-4974:
-----------------------------------

Assignee: Gautam Kumar Parai (was: Karthikeyan Manivannan)

Assigning to [~gparai] for review.

> NPE in FindPartitionConditions.analyzeCall() for 'holistic' expressions
> -----------------------------------------------------------------------
>
> Key: DRILL-4974
> URL: https://issues.apache.org/jira/browse/DRILL-4974
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Relational Operators
> Affects Versions: 1.6.0, 1.7.0, 1.8.0
> Reporter: Karthikeyan Manivannan
> Assignee: Gautam Kumar Parai
> Fix For: 1.8.0
>
> Original Estimate: 2h
> Remaining Estimate: 2h
>
> The following query can cause an NPE in FindPartitionConditions.analyzeCall()
> if the fileSize column is a partitioned column.
> SELECT fileSize FROM dfs.`/drill-data/data/` WHERE compoundId LIKE
> 'FOO-1234567%'
> This is because the LIKE is treated as a holistic expression in
> FindPartitionConditions.analyzeCall(), causing opStack to be empty, thus
> causing opStack.peek() to return a NULL value.