[jira] [Assigned] (DRILL-4972) Drillbit shuts down immediately after starting if embedded web server is disabled

2016-10-27 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-4972:
---

Assignee: Sorabh Hamirwasia  (was: Sudheesh Katkam)

Assigning to [~shamirwasia] for review.

> Drillbit shuts down immediately after starting if embedded web server is 
> disabled
> -
>
> Key: DRILL-4972
> URL: https://issues.apache.org/jira/browse/DRILL-4972
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.8.0
>Reporter: Sudheesh Katkam
>Assignee: Sorabh Hamirwasia
>Priority: Critical
> Fix For: 1.9.0
>
>
> Disable the embedded web server by setting "drill.exec.http.enabled" to false. 
> Now when the drillbit is started, it shuts down immediately after starting.
> The JVM exits when the only threads still running are daemon threads. It turns 
> out that all threads in a drillbit, other than the thread pool started by the 
> web server, are daemon threads. So I suggest WorkManager#StatusThread be made 
> non-daemon.
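
The daemon-thread behaviour described above can be reproduced outside Drill. The following is a minimal sketch, not Drill code: the class DaemonDemo, the method statusThread, and the thread name are hypothetical and exist only for illustration. It shows that the daemon flag must be set before start(), and that a single non-daemon thread is enough to keep the JVM alive after main() returns.

```java
import java.util.concurrent.CountDownLatch;

// Illustrative only: DaemonDemo and its names are hypothetical, not Drill code.
final class DaemonDemo {
  // Builds (but does not start) a thread that blocks until the latch is released.
  static Thread statusThread(CountDownLatch stop, boolean daemon) {
    Thread t = new Thread(() -> {
      try {
        stop.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "status-thread");
    // The daemon flag must be set before start().
    t.setDaemon(daemon);
    return t;
  }

  public static void main(String[] args) throws InterruptedException {
    CountDownLatch stop = new CountDownLatch(1);
    // Non-daemon: the JVM keeps running after main() returns, until the latch opens.
    // With daemon=true the JVM would exit as soon as main() returned.
    Thread t = statusThread(stop, false);
    t.start();
    Thread.sleep(1000);
    stop.countDown();
  }
}
```

Passing true instead of false to statusThread in main() makes the process exit as soon as main() returns, which matches the symptom reported here.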



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4973) Sqlline history

2016-10-27 Thread Andries Engelbrecht (JIRA)
Andries Engelbrecht created DRILL-4973:
--

 Summary: Sqlline history
 Key: DRILL-4973
 URL: https://issues.apache.org/jira/browse/DRILL-4973
 Project: Apache Drill
  Issue Type: Improvement
  Components: Client - CLI
Reporter: Andries Engelbrecht
Priority: Minor


Currently, the sqlline history stops working after 500 queries have been 
logged in the user's .sqlline/history file.





[jira] [Created] (DRILL-4974) NPE in FindPartitionConditions.analyzeCall() for 'holistic' expressions

2016-10-27 Thread Karthikeyan Manivannan (JIRA)
Karthikeyan Manivannan created DRILL-4974:
-

 Summary: NPE in FindPartitionConditions.analyzeCall() for 
'holistic' expressions
 Key: DRILL-4974
 URL: https://issues.apache.org/jira/browse/DRILL-4974
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Affects Versions: 1.8.0, 1.7.0, 1.6.0
Reporter: Karthikeyan Manivannan
Assignee: Karthikeyan Manivannan
 Fix For: 1.8.0


The following query can cause an NPE in FindPartitionConditions.analyzeCall() 
if the fileSize column is a partitioned column. 

SELECT  fileSize FROM dfs.`/drill-data/data/` WHERE compoundId LIKE 
'FOO-1234567%'

This is because the LIKE is treated as a holistic expression in 
FindPartitionConditions.analyzeCall(), causing opStack to be empty, thus 
causing opStack.peek() to return a NULL value.





[jira] [Created] (DRILL-4975) Null equality join over non existing columns returns nulls forever

2016-10-27 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-4975:
-

 Summary: Null equality join over non existing columns returns 
nulls forever
 Key: DRILL-4975
 URL: https://issues.apache.org/jira/browse/DRILL-4975
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.9.0
Reporter: Khurram Faraaz



A null equality join query that involves non-existent columns in the join 
predicate runs forever and returns nulls. 
Since the columns are not part of the parquet file, the join should fail 
gracefully and not return an infinite number of nulls.

{noformat}
SELECT t1.col_blah , t2.col_blah FROM typeall_l t1, typeall_r t2 WHERE 
t1.col_blah = t2.col_blah OR ( t1.col_blah IS NULL AND t2.col_blah IS NULL );
...
| null | null  |
| null | null  |
| null | null  |
| null | null  |
| null | null  |
| null | null  |
...
{noformat}

Upon removing OR ( t1.col_blah IS NULL AND t2.col_blah IS NULL ) from the 
query, no results are returned, as expected.





[jira] [Commented] (DRILL-4369) Database driver fails to report any major or minor version information

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613378#comment-15613378
 ] 

ASF GitHub Bot commented on DRILL-4369:
---

Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/622
  
Thanks for the responses! Sorry I posted my comments just after the commit. 
The issues are "nice-to-haves" for the original PR and do not justify another PR.


> Database driver fails to report any major or minor version information
> --
>
> Key: DRILL-4369
> URL: https://issues.apache.org/jira/browse/DRILL-4369
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.4.0
>Reporter: N Campbell
>Assignee: Laurent Goujon
> Fix For: 1.9.0
>
>
> Using Apache Drill 1.4.
> The DatabaseMetaData getters for the major and minor versions of the 
> server or JDBC driver return 0 instead of 1 and 4.
> This prevents an application from dynamically adjusting how it interacts 
> based on which version of Drill a connection is accessing.  
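
To illustrate why a reported version of 0.0 matters, here is a minimal sketch of the kind of version check an application might perform. The class VersionGate and its methods are hypothetical; only the java.sql.DatabaseMetaData getters (getDatabaseMajorVersion, getDatabaseMinorVersion) are standard JDBC.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.SQLException;

// Illustrative only: VersionGate is a hypothetical application-side helper.
final class VersionGate {
  // True when (major, minor) is at least (minMajor, minMinor).
  static boolean atLeast(int major, int minor, int minMajor, int minMinor) {
    return major > minMajor || (major == minMajor && minor >= minMinor);
  }

  // Reads the server version through the standard JDBC metadata getters.
  static boolean serverAtLeast(Connection conn, int minMajor, int minMinor)
      throws SQLException {
    DatabaseMetaData md = conn.getMetaData();
    return atLeast(md.getDatabaseMajorVersion(), md.getDatabaseMinorVersion(),
        minMajor, minMinor);
  }
}
```

With a driver that reports 0 for both values, atLeast(0, 0, 1, 4) is false, so any feature gated on "Drill 1.4 or later" would be disabled even when the server supports it.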





[jira] [Commented] (DRILL-4968) Add column size information to ColumnMetadata

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613634#comment-15613634
 ] 

ASF GitHub Bot commented on DRILL-4968:
---

Github user adeneche commented on the issue:

https://github.com/apache/drill/pull/631
  
+1, LGTM


> Add column size information to ColumnMetadata
> -
>
> Key: DRILL-4968
> URL: https://issues.apache.org/jira/browse/DRILL-4968
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Metadata
>Reporter: Laurent Goujon
>Assignee: Laurent Goujon
>
> Both ODBC and JDBC need column size information for the column metadata. 
> Instead of duplicating the logic between C++ and Java (and having to keep 
> them in sync), column size should be computed on the server so that the value is 
> kept consistent across clients.





[jira] [Commented] (DRILL-4373) Drill and Hive have incompatible timestamp representations in parquet

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613668#comment-15613668
 ] 

ASF GitHub Bot commented on DRILL-4373:
---

Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/600#discussion_r85449218
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java
 ---
@@ -739,30 +741,76 @@ public void runTestAndValidate(String selection, 
String validationSelection, Str
   }
 
   /*
-  Test the reading of an int96 field. Impala encodes timestamps as int96 
fields
+Impala encodes timestamp values as int96 fields. Test the reading of 
an int96 field with two converters:
+the first one converts parquet INT96 into drill VARBINARY and the 
second one (works while
+store.parquet.reader.int96_as_timestamp option is enabled) converts 
parquet INT96 into drill TIMESTAMP.
*/
   @Test
   public void testImpalaParquetInt96() throws Exception {
 compareParquetReadersColumnar("field_impala_ts", 
"cp.`parquet/int96_impala_1.parquet`");
+try {
+  test("alter session set %s = true", 
ExecConstants.PARQUET_READER_INT96_AS_TIMESTAMP);
+  compareParquetReadersColumnar("field_impala_ts", 
"cp.`parquet/int96_impala_1.parquet`");
+} finally {
+  test("alter session reset %s", 
ExecConstants.PARQUET_READER_INT96_AS_TIMESTAMP);
+}
   }
 
   /*
-  Test the reading of a binary field where data is in dicationary _and_ 
non-dictionary encoded pages
+  Test the reading of a binary field as drill varbinary where data is in 
dicationary _and_ non-dictionary encoded pages
*/
   @Test
-  public void testImpalaParquetVarBinary_DictChange() throws Exception {
+  public void testImpalaParquetBinaryAsVarBinary_DictChange() throws 
Exception {
 compareParquetReadersColumnar("field_impala_ts", 
"cp.`parquet/int96_dict_change.parquet`");
   }
 
   /*
+  Test the reading of a binary field as drill timestamp where data is in 
dicationary _and_ non-dictionary encoded pages
+   */
+  @Test
+  public void testImpalaParquetBinaryAsTimeStamp_DictChange() throws 
Exception {
+final String WORKING_PATH = TestTools.getWorkingPath();
+final String TEST_RES_PATH = WORKING_PATH + "/src/test/resources";
+try {
+  testBuilder()
+  .sqlQuery("select int96_ts from 
dfs_test.`%s/parquet/int96_dict_change`", TEST_RES_PATH)
+  .optionSettingQueriesForTestQuery(
+  "alter session set `%s` = true", 
ExecConstants.PARQUET_READER_INT96_AS_TIMESTAMP)
+  .ordered()
+  
.csvBaselineFile("testframework/testParquetReader/testInt96DictChange/q1.tsv")
+  .baselineTypes(TypeProtos.MinorType.TIMESTAMP)
+  .baselineColumns("int96_ts")
+  .build().run();
+} finally {
+  test("alter system reset `%s`", 
ExecConstants.PARQUET_READER_INT96_AS_TIMESTAMP);
+}
+  }
+
+  /*
  Test the conversion from int96 to impala timestamp
*/
   @Test
-  public void testImpalaParquetTimestampAsInt96() throws Exception {
+  public void testTimestampImpalaConvertFrom() throws Exception {
 compareParquetReadersColumnar("convert_from(field_impala_ts, 
'TIMESTAMP_IMPALA')", "cp.`parquet/int96_impala_1.parquet`");
   }
 
   /*
+ Test reading parquet Int96 as TimeStamp and comparing obtained values 
with the
+ old results (reading the same values as VarBinary and 
convert_fromTIMESTAMP_IMPALA function using)
+   */
+  @Test
+  public void testImpalaParquetTimestampInt96AsTimeStamp() throws 
Exception {
--- End diff --

The test testImpalaParquetTimestampInt96AsTimeStamp fails when run in a 
different timezone. Can you mark this as @Ignore unless you can fix the test to 
run across different timezones?
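
One common way a timestamp test becomes timezone-sensitive is that the same instant renders as a different wall-clock value in each zone, so a baseline written in one zone does not match elsewhere. The sketch below illustrates that effect only; it does not claim to be the cause of this particular failure, and it is not Drill's converter. Int96Demo and its methods are hypothetical. It assumes the commonly described Impala INT96 layout: 8 bytes of nanoseconds-of-day followed by 4 bytes of Julian day, little-endian.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

// Illustrative only: Int96Demo is hypothetical and is not Drill's converter.
final class Int96Demo {
  private static final long JULIAN_DAY_OF_EPOCH = 2_440_588L; // 1970-01-01
  private static final long MILLIS_PER_DAY = 86_400_000L;

  // Decodes 12 bytes: nanos-of-day (8 bytes), then Julian day (4 bytes), little-endian.
  static long toEpochMillis(byte[] int96) {
    ByteBuffer buf = ByteBuffer.wrap(int96).order(ByteOrder.LITTLE_ENDIAN);
    long nanosOfDay = buf.getLong();
    long julianDay = buf.getInt();
    return (julianDay - JULIAN_DAY_OF_EPOCH) * MILLIS_PER_DAY
        + nanosOfDay / 1_000_000L;
  }

  // Renders the same instant as a wall-clock value in the given zone.
  static LocalDateTime render(long epochMillis, String zone) {
    return LocalDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), ZoneId.of(zone));
  }
}
```

A test that compares render(millis, "UTC") against a baseline produced with a zone of "+05:30" would differ by five and a half hours, which is why pinning the zone in the test (or comparing epoch values) tends to make such tests portable.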


> Drill and Hive have incompatible timestamp representations in parquet
> -
>
> Key: DRILL-4373
> URL: https://issues.apache.org/jira/browse/DRILL-4373
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Hive, Storage - Parquet
>Affects Versions: 1.8.0
>Reporter: Rahul Challapalli
>Assignee: Karthikeyan Manivannan
>  Labels: doc-impacting
> Fix For: 1.9.0
>
>
> git.commit.id.abbrev=83d460c
> I created a parquet file with a timestamp type using Drill. Now if I define a 
> hive table on top of the parquet file and use "timestamp" as the column type, 
> drill fails to read the hive table through the hive storage plugin
> Implementation: 

[jira] [Commented] (DRILL-4373) Drill and Hive have incompatible timestamp representations in parquet

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613669#comment-15613669
 ] 

ASF GitHub Bot commented on DRILL-4373:
---

Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/600
  
Changing this to -1 until unit test failure is addressed.


> Drill and Hive have incompatible timestamp representations in parquet
> -
>
> Key: DRILL-4373
> URL: https://issues.apache.org/jira/browse/DRILL-4373
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Hive, Storage - Parquet
>Affects Versions: 1.8.0
>Reporter: Rahul Challapalli
>Assignee: Karthikeyan Manivannan
>  Labels: doc-impacting
> Fix For: 1.9.0
>
>
> git.commit.id.abbrev=83d460c
> I created a parquet file with a timestamp type using Drill. Now if I define a 
> hive table on top of the parquet file and use "timestamp" as the column type, 
> drill fails to read the hive table through the hive storage plugin
> Implementation: 
> Added an int96-to-timestamp converter for both parquet readers, controlled 
> by the system / session option "store.parquet.int96_as_timestamp".
> The option is false by default so that old query scripts that use the 
> "convert_from TIMESTAMP_IMPALA" function keep working.
> When the option is true, using that function is unnecessary and can cause 
> the query to fail.





[jira] [Commented] (DRILL-4974) NPE in FindPartitionConditions.analyzeCall() for 'holistic' expressions

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613677#comment-15613677
 ] 

ASF GitHub Bot commented on DRILL-4974:
---

GitHub user bitblender opened a pull request:

https://github.com/apache/drill/pull/634

DRILL-4974: NPE in FindPartitionConditions.analyzeCall() for 'holistic' 
expressions

Changes: Added a missing null check in 
FindPartitionConditions.analyzeCall() so that the value of opStack.peek() is 
dereferenced only after a null check. Without this check, if the expression is 
holistic, opStack can be empty, so opStack.peek() returns null and using 
its value can cause an NPE.
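
The pattern behind this kind of fix can be sketched as follows. This is not the code from the pull request: PeekGuard, parentIsAnd, and the operator strings are hypothetical, and the sketch assumes only that opStack behaves like a java.util.Deque, whose peek() returns null when the deque is empty.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: PeekGuard and the operator names are hypothetical.
final class PeekGuard {
  // Returns true only when an enclosing operator exists and equals "AND".
  static boolean parentIsAnd(Deque<String> opStack) {
    String parent = opStack.peek(); // null when the deque is empty
    // Check for null before dereferencing the peeked value.
    return parent != null && parent.equals("AND");
  }

  public static void main(String[] args) {
    Deque<String> opStack = new ArrayDeque<>();
    System.out.println(parentIsAnd(opStack)); // empty stack: no NPE
    opStack.push("AND");
    System.out.println(parentIsAnd(opStack));
  }
}
```

Without the null check, the call on the empty deque would throw a NullPointerException at parent.equals(...).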

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bitblender/drill DRILL-4974

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/634.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #634


commit a519a0987280abeb00e33a8088d2f7d6c9809eed
Author: karthik 
Date:   2016-10-20T20:43:17Z

DRILL-4974: Add missing null check in FindPartitionConditions.analyzeCall()




> NPE in FindPartitionConditions.analyzeCall() for 'holistic' expressions
> ---
>
> Key: DRILL-4974
> URL: https://issues.apache.org/jira/browse/DRILL-4974
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.6.0, 1.7.0, 1.8.0
>Reporter: Karthikeyan Manivannan
>Assignee: Karthikeyan Manivannan
> Fix For: 1.8.0
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> The following query can cause an NPE in FindPartitionConditions.analyzeCall() 
> if the fileSize column is a partitioned column. 
> SELECT  fileSize FROM dfs.`/drill-data/data/` WHERE compoundId LIKE 
> 'FOO-1234567%'
> This is because the LIKE is treated as a holistic expression in 
> FindPartitionConditions.analyzeCall(), causing opStack to be empty, thus 
> causing opStack.peek() to return a NULL value.





[jira] [Commented] (DRILL-4968) Add column size information to ColumnMetadata

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613776#comment-15613776
 ] 

ASF GitHub Bot commented on DRILL-4968:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/631


> Add column size information to ColumnMetadata
> -
>
> Key: DRILL-4968
> URL: https://issues.apache.org/jira/browse/DRILL-4968
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Metadata
>Reporter: Laurent Goujon
>Assignee: Laurent Goujon
>
> Both ODBC and JDBC need column size information for the column metadata. 
> Instead of duplicating the logic between C++ and Java (and having to keep 
> them in sync), column size should be computed on the server so that the value is 
> kept consistent across clients.





[jira] [Commented] (DRILL-4974) NPE in FindPartitionConditions.analyzeCall() for 'holistic' expressions

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613792#comment-15613792
 ] 

ASF GitHub Bot commented on DRILL-4974:
---

Github user bitblender commented on the issue:

https://github.com/apache/drill/pull/634
  
@amansinha100 Can you please review this change?


> NPE in FindPartitionConditions.analyzeCall() for 'holistic' expressions
> ---
>
> Key: DRILL-4974
> URL: https://issues.apache.org/jira/browse/DRILL-4974
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.6.0, 1.7.0, 1.8.0
>Reporter: Karthikeyan Manivannan
>Assignee: Karthikeyan Manivannan
> Fix For: 1.8.0
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> The following query can cause an NPE in FindPartitionConditions.analyzeCall() 
> if the fileSize column is a partitioned column. 
> SELECT  fileSize FROM dfs.`/drill-data/data/` WHERE compoundId LIKE 
> 'FOO-1234567%'
> This is because the LIKE is treated as a holistic expression in 
> FindPartitionConditions.analyzeCall(), causing opStack to be empty, thus 
> causing opStack.peek() to return a NULL value.





[jira] [Commented] (DRILL-4826) Query against INFORMATION_SCHEMA.TABLES degrades as the number of views increases

2016-10-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613960#comment-15613960
 ] 

ASF GitHub Bot commented on DRILL-4826:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/592


> Query against INFORMATION_SCHEMA.TABLES degrades as the number of views 
> increases
> -
>
> Key: DRILL-4826
> URL: https://issues.apache.org/jira/browse/DRILL-4826
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Parth Chandra
>Assignee: Padma Penumarthy
>
> Queries against INFORMATION_SCHEMA.TABLES and INFORMATION_SCHEMA.VIEWS slow 
> down as the number of views increases. 
> BI tools like Tableau issue a query like the following at connection time:
> {code}
> select TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE from 
> INFORMATION_SCHEMA.`TABLES` WHERE TABLE_CATALOG LIKE 'DRILL' ESCAPE '\' AND 
> TABLE_SCHEMA <> 'sys' AND TABLE_SCHEMA <> 'INFORMATION_SCHEMA' ORDER BY 
> TABLE_TYPE, TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME
> {code}
> The time to query the information schema tables degrades as the number of 
> views increases. On a test system:
> || Views || Time(secs) ||
> |500 | 6 |
> |1000 | 19 |
> |1500 | 33 |
> This can result in a single connection taking more than a minute to establish.
> The problem occurs because we read the view file for every view and this 
> appears to take most of the time.
> Querying information_schema.tables does not, in fact, need to open the view 
> file at all; it merely needs to get a listing of the view files. Eliminating 
> the view file read will speed up the query tremendously.
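
The idea of deriving view names from a directory listing alone can be sketched as below. This is not Drill's implementation: ViewLister and listViewNames are hypothetical, the sketch uses the local filesystem through java.nio.file rather than Drill's own file system layer, and the ".view.drill" suffix is an assumption about how view files are named.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: ViewLister is hypothetical; the suffix is an assumption.
final class ViewLister {
  static final String VIEW_SUFFIX = ".view.drill";

  // Derives view names from the directory listing; no view file is opened.
  static List<String> listViewNames(Path workspaceDir) throws IOException {
    List<String> names = new ArrayList<>();
    try (DirectoryStream<Path> files =
        Files.newDirectoryStream(workspaceDir, "*" + VIEW_SUFFIX)) {
      for (Path file : files) {
        String fileName = file.getFileName().toString();
        names.add(fileName.substring(0, fileName.length() - VIEW_SUFFIX.length()));
      }
    }
    Collections.sort(names); // directory order is unspecified
    return names;
  }
}
```

The cost of this approach is one directory listing per workspace, regardless of how many views it contains, instead of one file read per view.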





[jira] [Assigned] (DRILL-4974) NPE in FindPartitionConditions.analyzeCall() for 'holistic' expressions

2016-10-27 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-4974:
---

Assignee: Gautam Kumar Parai  (was: Karthikeyan Manivannan)

Assigning to [~gparai] for review.

> NPE in FindPartitionConditions.analyzeCall() for 'holistic' expressions
> ---
>
> Key: DRILL-4974
> URL: https://issues.apache.org/jira/browse/DRILL-4974
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.6.0, 1.7.0, 1.8.0
>Reporter: Karthikeyan Manivannan
>Assignee: Gautam Kumar Parai
> Fix For: 1.8.0
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> The following query can cause an NPE in FindPartitionConditions.analyzeCall() 
> if the fileSize column is a partitioned column. 
> SELECT  fileSize FROM dfs.`/drill-data/data/` WHERE compoundId LIKE 
> 'FOO-1234567%'
> This is because the LIKE is treated as a holistic expression in 
> FindPartitionConditions.analyzeCall(), causing opStack to be empty, thus 
> causing opStack.peek() to return a NULL value.


