[jira] [Updated] (DRILL-5943) Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism
[ https://issues.apache.org/jira/browse/DRILL-5943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5943: Affects Version/s: 1.12.0 > Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism > --- > > Key: DRILL-5943 > URL: https://issues.apache.org/jira/browse/DRILL-5943 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.12.0 >Reporter: Sorabh Hamirwasia >Assignee: Sorabh Hamirwasia > Fix For: 1.12.0 > > > For the PLAIN mechanism we will weaken the strong check introduced with > DRILL-5582 to keep forward compatibility between a Drill 1.12 client and a > Drill 1.9 server. This is fine since, with or without this strong check, the PLAIN > mechanism is still vulnerable to MITM during the handshake itself, unlike mutual > authentication protocols such as Kerberos. > Also, to keep forward compatibility with respect to SASL, we will treat > UNKNOWN_SASL_SUPPORT as a valid value. For a handshake message received from a > client running on a later version (let's say 1.13) than the Drillbit (1.12) > and carrying a new value for the SaslSupport field that is unknown to the server, this > field will be decoded as UNKNOWN_SASL_SUPPORT. In this scenario the client will > be treated as one that is aware of the SASL protocol, but the server doesn't know the exact > capabilities of the client. Hence the SASL handshake will still be required from > the server side. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
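The decoding rule in the description above can be sketched in a few lines of Java. This is a minimal illustration, not Drill's actual RPC code; the enum constants and method names are assumptions used only to show how an unrecognized SaslSupport wire value falls back to UNKNOWN_SASL_SUPPORT while the server still drives the SASL handshake.

```java
// A minimal sketch, not Drill's actual RPC code: the enum constants and method
// names here are assumptions used only to illustrate the decoding rule above.
enum SaslSupport { UNKNOWN_SASL_SUPPORT, SASL_SUPPORT, SASL_PRIVACY }

final class SaslSupportDecoder {

  /** Map the wire value from the handshake to an enum constant this server knows. */
  static SaslSupport decode(int wireValue) {
    SaslSupport[] known = SaslSupport.values();
    // A 1.13+ client may send a value this 1.12 server has never heard of;
    // decode it as UNKNOWN_SASL_SUPPORT instead of failing the handshake.
    return (wireValue >= 0 && wireValue < known.length)
        ? known[wireValue]
        : SaslSupport.UNKNOWN_SASL_SUPPORT;
  }

  /** Even with unknown client capabilities, the server still drives the SASL handshake. */
  static boolean saslHandshakeRequired(SaslSupport clientSupport, boolean authEnabled) {
    return authEnabled
        && (clientSupport == SaslSupport.SASL_SUPPORT
            || clientSupport == SaslSupport.SASL_PRIVACY
            || clientSupport == SaslSupport.UNKNOWN_SASL_SUPPORT);
  }
}
```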
[jira] [Commented] (DRILL-4286) Have an ability to put server in quiescent mode of operation
[ https://issues.apache.org/jira/browse/DRILL-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243166#comment-16243166 ] ASF GitHub Bot commented on DRILL-4286: --- Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r149541807 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitStateManager.java --- @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.server; +/* + State manager to manage the state of drillbit. + */ +public class DrillbitStateManager { + + + public DrillbitStateManager(DrillbitState currentState) { +this.currentState = currentState; + } + + public enum DrillbitState { +STARTUP, ONLINE, GRACE, DRAINING, OFFLINE, SHUTDOWN + } + + public DrillbitState getState() { +return currentState; + } + + private DrillbitState currentState; --- End diff -- I think Drillbit.quiescentMode and Drillbit.forceful_shutdown also need NOT be volatile given the way they are used now. You don't have to enforce happens-before (by preventing re-ordering) here and even if these variables are volatile, the read of these variables in close() can anyway race with the setting of these variables in another thread doing a stop/gracefulShutdown. Let me know if I am missing anything. That said, adding volatiles can only makes the code more correct (and slower). Since this code is not critical you can let it be as it is. > Have an ability to put server in quiescent mode of operation > > > Key: DRILL-4286 > URL: https://issues.apache.org/jira/browse/DRILL-4286 > Project: Apache Drill > Issue Type: New Feature > Components: Execution - Flow >Reporter: Victoria Markman >Assignee: Venkata Jyothsna Donapati > > I think drill will benefit from mode of operation that is called "quiescent" > in some databases. > From IBM Informix server documentation: > {code} > Change gracefully from online to quiescent mode > Take the database server gracefully from online mode to quiescent mode to > restrict access to the database server without interrupting current > processing. After you perform this task, the database server sets a flag that > prevents new sessions from gaining access to the database server. The current > sessions are allowed to finish processing. After you initiate the mode > change, it cannot be canceled. During the mode change from online to > quiescent, the database server is considered to be in Shutdown mode. > {code} > This is different from shutdown, when processes are terminated. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
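The race described in the review comment above can be made concrete with a small toy class. This is an illustrative sketch only, not the actual Drillbit code; the field and method names are assumptions.

```java
// Illustrative toy, not the actual Drillbit class. Even if the two flags are
// volatile, a close() running concurrently with gracefulShutdown()/stop() can
// observe them before or after they are set: volatile gives visibility of each
// individual write, not mutual exclusion over the whole check-then-act sequence.
class ShutdownFlagsSketch {
  private volatile boolean quiescentMode;
  private volatile boolean forcefulShutdown;

  void gracefulShutdown() {          // invoked from one thread
    quiescentMode = true;
  }

  void stop() {                      // invoked from another thread
    forcefulShutdown = true;
  }

  void close() {
    // This read still races with the writers above regardless of volatile.
    if (quiescentMode && !forcefulShutdown) {
      // drain in-flight queries before releasing resources
    }
  }
}
```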
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243197#comment-16243197 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149550418 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java --- @@ -96,6 +105,14 @@ private void throwIfClosed() throws AlreadyClosedSqlException, throw new AlreadyClosedSqlException( "ResultSet is already closed." ); } } + +//Implicit check for whether timeout is set +if (elapsedTimer != null) { --- End diff -- ```yes, pausing before execute would totally work!``` Current here is what the test does (_italics indicate what we're doing under the covers_): 1. Init Statement 2. Set timeout on statement (_validating the timeout value_) 3. Calling `execute()` and fetching ResultSet instance (_starting the clock_) 4. Fetching a row using ResultSet.next() 5. Pausing briefly 6. Repeat step 4 onwards (_enough pause to trigger timeout_) I was intending to pause between step 3 and 4 as an additional step. You believe that we are not exercising any tests for timeout within the `execute()` call? (Ref: https://github.com/kkhatua/drill/blob/9c4e3f3f727e70ca058facd4767556087a1876e1/exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java#L1908 ) > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
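A rough sketch of the additional case being discussed: pause after `execute()` but before the first `ResultSet.next()`, so the timeout fires on the result-set access rather than inside `execute()`. The class name, query, and `connection` fixture are assumptions; this is not the test actually added in the PR.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLTimeoutException;
import java.sql.Statement;
import org.junit.Test;

// Sketch of the extra test case discussed above; `connection` is assumed to be
// provided by the surrounding JDBC test fixture.
public class QueryTimeoutPauseSketch {
  private Connection connection;   // assumed fixture, e.g. set up in @BeforeClass

  @Test(expected = SQLTimeoutException.class)
  public void timeoutTriggersBeforeFirstNext() throws Exception {
    try (Statement stmt = connection.createStatement()) {
      stmt.setQueryTimeout(1);                                        // 1-second budget
      ResultSet rs = stmt.executeQuery("SELECT * FROM sys.version");  // clock starts at execute()
      Thread.sleep(2_000L);                                           // pause long enough to exceed the budget
      rs.next();                                                      // expected to raise SQLTimeoutException
    }
  }
}
```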
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243187#comment-16243187 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149548759 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -333,8 +368,14 @@ void close() { final int batchQueueThrottlingThreshold = client.getConfig().getInt( ExecConstants.JDBC_BATCH_QUEUE_THROTTLING_THRESHOLD ); -resultsListener = new ResultsListener(batchQueueThrottlingThreshold); +resultsListener = new ResultsListener(this, batchQueueThrottlingThreshold); currentBatchHolder = new RecordBatchLoader(client.getAllocator()); +try { + setTimeout(this.statement.getQueryTimeout()); +} catch (SQLException e) { + // Printing any unexpected SQLException stack trace + e.printStackTrace(); --- End diff -- I agree. Thankfully, the _caller_ does handle any thrown `SQLException`s, so I'm going to pass this off to that. IMO, I don't think we'll have an issue because the `Statement.setQueryTimeout()` would have handled any corner cases before this is invoked via `execute()` > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243158#comment-16243158 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149546105 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -260,6 +288,10 @@ void close() { // when the main thread is blocked waiting for the result. In that case // we want to unblock the main thread. firstMessageReceived.countDown(); // TODO: Why not call releaseIfFirst as used elsewhere? + //Stopping timeout clock --- End diff -- Nothing is actually running in Stopwatch (it's just a state to indicate if elapsed time should use the current time or the time when Stopwatch was stopped...) > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
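The point about Guava's Stopwatch can be verified with a few lines: stop() does not halt anything, it only freezes the reference point that elapsed() reads from. A small stand-alone illustration, not Drill code:

```java
import java.util.concurrent.TimeUnit;
import com.google.common.base.Stopwatch;

// Small stand-alone illustration (not Drill code): stop() does not shut anything
// down, it only freezes the reference point that elapsed() reads from.
class StopwatchSketch {
  public static void main(String[] args) throws InterruptedException {
    Stopwatch elapsedTimer = Stopwatch.createStarted();
    Thread.sleep(50);
    elapsedTimer.stop();                                  // no thread or resource is involved
    long frozen = elapsedTimer.elapsed(TimeUnit.MILLISECONDS);
    Thread.sleep(50);
    // elapsed() keeps returning the time measured up to stop(); it does not grow.
    assert frozen == elapsedTimer.elapsed(TimeUnit.MILLISECONDS);
  }
}
```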
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243159#comment-16243159 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149546258 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -139,8 +147,22 @@ private boolean stopThrottlingIfSo() { return stopped; } -public void awaitFirstMessage() throws InterruptedException { - firstMessageReceived.await(); +public void awaitFirstMessage() throws InterruptedException, SQLTimeoutException { + //Check if a non-zero timeout has been set + if ( parent.timeoutInMilliseconds > 0 ) { +//Identifying remaining in milliseconds to maintain a granularity close to integer value of timeout +long timeToTimeout = (parent.timeoutInMilliseconds) - parent.elapsedTimer.elapsed(TimeUnit.MILLISECONDS); +if ( timeToTimeout > 0 ) { --- End diff -- Affects readability, but I think comments can convey the intent. +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
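Condensed, the remaining-time logic in the diff above looks roughly like the sketch below: convert the budget to milliseconds once, subtract what the stopwatch has already consumed, and bound the wait on the first-message latch by the remainder. The field names are assumptions and this is not the PR's exact code.

```java
import java.sql.SQLTimeoutException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import com.google.common.base.Stopwatch;

// Condensed sketch of the remaining-time logic in the diff above. The field
// names are assumptions and this is not the PR's exact code.
class AwaitFirstMessageSketch {
  private final CountDownLatch firstMessageReceived = new CountDownLatch(1);
  private final Stopwatch elapsedTimer = Stopwatch.createStarted();
  private final long timeoutInMilliseconds;              // 0 means "no timeout set"

  AwaitFirstMessageSketch(long timeoutInMilliseconds) {
    this.timeoutInMilliseconds = timeoutInMilliseconds;
  }

  void awaitFirstMessage() throws InterruptedException, SQLTimeoutException {
    if (timeoutInMilliseconds > 0) {
      // Wait only for the part of the budget the query has not already used.
      long remaining = timeoutInMilliseconds - elapsedTimer.elapsed(TimeUnit.MILLISECONDS);
      if (remaining <= 0
          || !firstMessageReceived.await(remaining, TimeUnit.MILLISECONDS)) {
        throw new SQLTimeoutException("Query timed out after " + timeoutInMilliseconds + " ms");
      }
    } else {
      firstMessageReceived.await();                       // previous behaviour: block indefinitely
    }
  }
}
```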
[jira] [Commented] (DRILL-5717) change some date time unit cases with specific timezone or Local
[ https://issues.apache.org/jira/browse/DRILL-5717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243155#comment-16243155 ] ASF GitHub Bot commented on DRILL-5717: --- Github user weijietong commented on the issue: https://github.com/apache/drill/pull/904 Applied the review comments. > change some date time unit cases with specific timezone or Local > > > Key: DRILL-5717 > URL: https://issues.apache.org/jira/browse/DRILL-5717 > Project: Apache Drill > Issue Type: Bug > Components: Tools, Build & Test >Affects Versions: 1.9.0, 1.11.0 >Reporter: weijie.tong > > Some date time test cases, such as JodaDateValidatorTest, are not Locale > independent. This causes the test phase to fail for users in other Locales. We > should make these test cases Locale-independent. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243153#comment-16243153 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149545534 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -239,6 +261,11 @@ QueryDataBatch getNext() throws UserException, InterruptedException { } return qdb; } + + // Check and throw SQLTimeoutException + if ( parent.timeoutInMilliseconds > 0 && parent.elapsedTimer.elapsed(TimeUnit.SECONDS) >= parent.timeoutInMilliseconds ) { --- End diff -- Darn! +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
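The unit mismatch being acknowledged above (seconds read from the stopwatch compared against a millisecond budget) is fixed by comparing like units. A small sketch with assumed accessor names, not the PR's final code:

```java
import java.sql.SQLTimeoutException;
import java.util.concurrent.TimeUnit;
import com.google.common.base.Stopwatch;

// Sketch of the corrected check with assumed names: both sides of the
// comparison are now in milliseconds.
final class TimeoutCheckSketch {
  static void throwIfTimedOut(Stopwatch elapsedTimer, long timeoutInMilliseconds)
      throws SQLTimeoutException {
    if (timeoutInMilliseconds > 0
        && elapsedTimer.elapsed(TimeUnit.MILLISECONDS) >= timeoutInMilliseconds) {
      throw new SQLTimeoutException("Query timed out after " + timeoutInMilliseconds + " ms");
    }
  }
}
```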
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243106#comment-16243106 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on the issue: https://github.com/apache/drill/pull/1024 @laurentgo Done the changes... ready for review. > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243134#comment-16243134 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149542720 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -260,6 +288,10 @@ void close() { // when the main thread is blocked waiting for the result. In that case // we want to unblock the main thread. firstMessageReceived.countDown(); // TODO: Why not call releaseIfFirst as used elsewhere? + //Stopping timeout clock --- End diff -- since we are closing, do we need to care about the stopwatch? > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243163#comment-16243163 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149546431 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -376,6 +417,19 @@ synchronized void cleanup() { currentBatchHolder.clear(); } + long getTimeoutInMilliseconds() { +return timeoutInMilliseconds; + } + + //Set the cursor's timeout in seconds + void setTimeout(int timeoutDurationInSeconds){ +this.timeoutInMilliseconds = timeoutDurationInSeconds*1000L; --- End diff -- +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243183#comment-16243183 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149548337 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java --- @@ -96,6 +105,14 @@ private void throwIfClosed() throws AlreadyClosedSqlException, throw new AlreadyClosedSqlException( "ResultSet is already closed." ); } } + +//Implicit check for whether timeout is set +if (elapsedTimer != null) { --- End diff -- yes, pausing before execute would totally work! After execute, likely not since injection is done when query is executed on the server side. > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243154#comment-16243154 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149545640 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -260,6 +288,10 @@ void close() { // when the main thread is blocked waiting for the result. In that case // we want to unblock the main thread. firstMessageReceived.countDown(); // TODO: Why not call releaseIfFirst as used elsewhere? + //Stopping timeout clock --- End diff -- Just wrapping up any 'running' resources. > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5943) Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism
[ https://issues.apache.org/jira/browse/DRILL-5943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sorabh Hamirwasia updated DRILL-5943: - Reviewer: Parth Chandra > Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism > --- > > Key: DRILL-5943 > URL: https://issues.apache.org/jira/browse/DRILL-5943 > Project: Apache Drill > Issue Type: Improvement >Reporter: Sorabh Hamirwasia >Assignee: Sorabh Hamirwasia > Fix For: 1.12.0 > > > For the PLAIN mechanism we will weaken the strong check introduced with > DRILL-5582 to keep forward compatibility between a Drill 1.12 client and a > Drill 1.9 server. This is fine since, with or without this strong check, the PLAIN > mechanism is still vulnerable to MITM during the handshake itself, unlike mutual > authentication protocols such as Kerberos. > Also, to keep forward compatibility with respect to SASL, we will treat > UNKNOWN_SASL_SUPPORT as a valid value. For a handshake message received from a > client running on a later version (let's say 1.13) than the Drillbit (1.12) > and carrying a new value for the SaslSupport field that is unknown to the server, this > field will be decoded as UNKNOWN_SASL_SUPPORT. In this scenario the client will > be treated as one that is aware of the SASL protocol, but the server doesn't know the exact > capabilities of the client. Hence the SASL handshake will still be required from > the server side. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5923) State of a successfully completed query shown as "COMPLETED"
[ https://issues.apache.org/jira/browse/DRILL-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243210#comment-16243210 ] ASF GitHub Bot commented on DRILL-5923: --- Github user paul-rogers commented on the issue: https://github.com/apache/drill/pull/1021 @arina-ielchiieva, @prasadns14, here is my two cents. The names and numbers used in the protobuf definitions are part of Drill's public network API. This API is not versioned, so we can't really change it. If we changed the names, then, say, C-code or Java code that expects the old names will break. Being part of the public API, that code may not even be in the Drill source tree; perhaps someone has generated, say, a Python binding. So, can't change the public API. For purely aesthetic reasons, the contributor wishes to change the message displayed in the UI. This is purely a UI decision (the user is not expected to map the display names to the Protobuf enums.) And, the display name is subject to change. Maybe other UIs want to use other names. Maybe we want to show icons, or abbreviate the names ("Fail", "OK", etc.) And, of course, what if the display name should have spaces other characters: "In Progress", "In Queue" or "Didn't Work!". Can't put those in enum names. You get the idea. For this reason, the mapping from enum values to display names should be part of the UI, not the network protocol definition. The present change provides a UI-specific mapping from API Protobuf enum values to display strings, which seems like a good idea. So, the key questions are: * Should we use display strings other than the Protobuf constants (seems a good idea.) * Should we do the mapping in Java or in Freemarker? (Java seems simpler.) Thoughts? > State of a successfully completed query shown as "COMPLETED" > > > Key: DRILL-5923 > URL: https://issues.apache.org/jira/browse/DRILL-5923 > Project: Apache Drill > Issue Type: Bug > Components: Client - HTTP >Affects Versions: 1.11.0 >Reporter: Prasad Nagaraj Subramanya >Assignee: Prasad Nagaraj Subramanya > Fix For: 1.12.0 > > > Drill UI currently lists a successfully completed query as "COMPLETED". > Successfully completed, failed and canceled queries are all grouped as > Completed queries. > It would be better to list the state of a successfully completed query as > "Succeeded" to avoid confusion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
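The UI-side mapping being proposed can be sketched as below. The enum here only mirrors a subset of the real QueryState protobuf values, and the display strings are examples rather than the PR's final choices; the point is that the Protobuf definition (public API) stays untouched while the web UI owns the translation.

```java
import java.util.EnumMap;
import java.util.Map;

// Minimal sketch of a UI-side mapping: the Protobuf enum stays untouched and
// only the web UI translates it to a display string. The enum constants here
// mirror a subset of QueryState and are assumptions for illustration.
final class QueryStateDisplaySketch {
  enum QueryState { STARTING, RUNNING, COMPLETED, CANCELED, FAILED, ENQUEUED }

  private static final Map<QueryState, String> DISPLAY_NAMES = new EnumMap<>(QueryState.class);
  static {
    DISPLAY_NAMES.put(QueryState.COMPLETED, "Succeeded");   // aesthetic rename, UI only
    DISPLAY_NAMES.put(QueryState.ENQUEUED, "In Queue");     // display names may contain spaces
    DISPLAY_NAMES.put(QueryState.RUNNING, "In Progress");
  }

  static String displayName(QueryState state) {
    // Fall back to the raw enum name for states that were not renamed.
    return DISPLAY_NAMES.getOrDefault(state, state.name());
  }
}
```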
[jira] [Commented] (DRILL-5899) Simple pattern matchers can work with DrillBuf directly
[ https://issues.apache.org/jira/browse/DRILL-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243315#comment-16243315 ] ASF GitHub Bot commented on DRILL-5899: --- Github user ppadma commented on the issue: https://github.com/apache/drill/pull/1015 @paul-rogers updated with latest review comments taken care of. Please take a look. > Simple pattern matchers can work with DrillBuf directly > --- > > Key: DRILL-5899 > URL: https://issues.apache.org/jira/browse/DRILL-5899 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Flow >Reporter: Padma Penumarthy >Assignee: Padma Penumarthy >Priority: Critical > > For the 4 simple patterns we have i.e. startsWith, endsWith, contains and > constant,, we do not need the overhead of charSequenceWrapper. We can work > with DrillBuf directly. This will save us from doing isAscii check and UTF8 > decoding for each row. > UTF-8 encoding ensures that no UTF-8 character is a prefix of any other valid > character. So, instead of decoding varChar from each row we are processing, > encode the patternString once during setup and do raw byte comparison. > Instead of bounds checking and reading one byte at a time, we get the whole > buffer in one shot and use that for comparison. > This improved overall performance for filter operator by around 20%. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
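The idea in the issue description, encoding the pattern once at setup and then comparing raw UTF-8 bytes per row, can be illustrated without DrillBuf as follows. This is a plain-array sketch, not the patch itself; it relies on the property quoted above that no valid UTF-8 encoding is a prefix of another character's encoding, so a byte-level match cannot cut through a multi-byte character.

```java
import java.nio.charset.StandardCharsets;

// Plain-array illustration of the approach described above (not Drill's
// DrillBuf-based code): encode the pattern to UTF-8 once, then match each row
// by raw byte comparison, with no per-row decoding.
final class RawByteContainsSketch {
  private final byte[] patternBytes;

  RawByteContainsSketch(String pattern) {
    this.patternBytes = pattern.getBytes(StandardCharsets.UTF_8);  // encoded once at setup
  }

  boolean contains(byte[] row, int start, int end) {
    int length = end - start;
    for (int i = 0; i <= length - patternBytes.length; i++) {
      int j = 0;
      while (j < patternBytes.length && row[start + i + j] == patternBytes[j]) {
        j++;
      }
      if (j == patternBytes.length) {
        return true;                                               // full pattern matched
      }
    }
    return false;
  }
}
```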
[jira] [Commented] (DRILL-4286) Have an ability to put server in quiescent mode of operation
[ https://issues.apache.org/jira/browse/DRILL-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243167#comment-16243167 ] ASF GitHub Bot commented on DRILL-4286: --- Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r149542196 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java --- @@ -348,6 +354,21 @@ public void run() { */ } + /* +Check if the foreman is ONLINE. If not dont accept any new queries. + */ + public void checkForemanState() throws ForemanException{ +DrillbitEndpoint foreman = drillbitContext.getEndpoint(); +Collection dbs = drillbitContext.getAvailableBits(); --- End diff -- Maybe add it to the DrillbitContext class ? > Have an ability to put server in quiescent mode of operation > > > Key: DRILL-4286 > URL: https://issues.apache.org/jira/browse/DRILL-4286 > Project: Apache Drill > Issue Type: New Feature > Components: Execution - Flow >Reporter: Victoria Markman >Assignee: Venkata Jyothsna Donapati > > I think drill will benefit from mode of operation that is called "quiescent" > in some databases. > From IBM Informix server documentation: > {code} > Change gracefully from online to quiescent mode > Take the database server gracefully from online mode to quiescent mode to > restrict access to the database server without interrupting current > processing. After you perform this task, the database server sets a flag that > prevents new sessions from gaining access to the database server. The current > sessions are allowed to finish processing. After you initiate the mode > change, it cannot be canceled. During the mode change from online to > quiescent, the database server is considered to be in Shutdown mode. > {code} > This is different from shutdown, when processes are terminated. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
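One possible shape for the suggestion above, moving the "is this Drillbit online?" check into DrillbitContext so Foreman does not scan the endpoint collection itself, is sketched below. The method name is an assumption and the endpoint comparison is simplified; it is not code from the PR.

```java
// Hypothetical helper on DrillbitContext (the name isForemanOnline is an
// assumption): lets Foreman ask the context instead of iterating itself.
public boolean isForemanOnline() {
  DrillbitEndpoint self = getEndpoint();
  for (DrillbitEndpoint online : getClusterCoordinator().getOnlineEndPoints()) {
    // Compare by address and user port; good enough for this sketch.
    if (online.getAddress().equals(self.getAddress())
        && online.getUserPort() == self.getUserPort()) {
      return true;
    }
  }
  return false;
}
```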
[jira] [Commented] (DRILL-4286) Have an ability to put server in quiescent mode of operation
[ https://issues.apache.org/jira/browse/DRILL-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243168#comment-16243168 ] ASF GitHub Bot commented on DRILL-4286: --- Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r149544267 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/TestGracefulShutdown.java --- @@ -0,0 +1,323 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.test; + +import ch.qos.logback.classic.Level; +import org.apache.commons.io.FileUtils; +import org.apache.drill.exec.ExecConstants; +import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint; +import org.apache.drill.exec.proto.UserBitShared.QueryResult.QueryState; +import org.apache.drill.exec.server.Drillbit; +import org.junit.AfterClass; +import org.junit.Assert; +import org.junit.BeforeClass; +import org.junit.Test; +import org.omg.PortableServer.THREAD_POLICY_ID; + +import java.io.File; +import java.io.FileWriter; +import java.io.IOException; +import java.io.PrintWriter; +import java.net.HttpURLConnection; +import java.net.URL; +import java.util.Collection; +import java.util.Properties; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotEquals; +import static org.junit.Assert.fail; + +public class TestGracefulShutdown { + + @BeforeClass + public static void setUpTestData() { +for( int i = 0; i < 1000; i++) { + setupFile(i); +} + } + + + public static final Properties WEBSERVER_CONFIGURATION = new Properties() { +{ + put(ExecConstants.HTTP_ENABLE, true); +} + }; + + public FixtureBuilder enableWebServer(FixtureBuilder builder) { +Properties props = new Properties(); +props.putAll(WEBSERVER_CONFIGURATION); +builder.configBuilder.configProps(props); +return builder; + } + + + /* + Start multiple drillbits and then shutdown a drillbit. Query the online + endpoints and check if the drillbit still exists. 
+ */ + @Test + public void testOnlineEndPoints() throws Exception { + +String[] drillbits = {"db1" ,"db2","db3", "db4", "db5", "db6"}; +FixtureBuilder builder = ClusterFixture.builder().withBits(drillbits).withLocalZk(); + + +try ( ClusterFixture cluster = builder.build(); + ClientFixture client = cluster.clientFixture()) { + + Drillbit drillbit = cluster.drillbit("db2"); + DrillbitEndpoint drillbitEndpoint = drillbit.getRegistrationHandle().getEndPoint(); + int grace_period = drillbit.getContext().getConfig().getInt("drill.exec.grace_period"); + new Thread(new Runnable() { +public void run() { + try { +cluster.close_drillbit("db2"); + } catch (Exception e) { +e.printStackTrace(); + } +} + }).start(); + //wait for graceperiod + Thread.sleep(grace_period); + Collection drillbitEndpoints = cluster.drillbit().getContext() + .getClusterCoordinator() + .getOnlineEndPoints(); + Assert.assertFalse(drillbitEndpoints.contains(drillbitEndpoint)); +} + } + /* +Test if the drillbit transitions from ONLINE state when a shutdown +request is initiated + */ + @Test + public void testStateChange() throws Exception { + +String[] drillbits = {"db1" ,"db2", "db3", "db4", "db5", "db6"}; +FixtureBuilder builder = ClusterFixture.builder().withBits(drillbits).withLocalZk(); + +try ( ClusterFixture cluster = builder.build(); + ClientFixture client = cluster.clientFixture()) { + Drillbit drillbit = cluster.drillbit("db2"); +
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243175#comment-16243175 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149547712 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java --- @@ -96,6 +105,14 @@ private void throwIfClosed() throws AlreadyClosedSqlException, throw new AlreadyClosedSqlException( "ResultSet is already closed." ); } } + +//Implicit check for whether timeout is set +if (elapsedTimer != null) { --- End diff -- I was originally wondering as to when should we trigger the countdown on the timer. Creating a `[Prepared]Statement` object should not be the basis for the starting the clock, but only when you actually call execute(). The `DrillCursor` is initialized in this method and is what starts the clock. I could create a clone of the `testTriggeredQueryTimeout` method and simply have the client pause after `execute()` but before fetching the `ResultSet` instance or invoking `ResultSet.next()` . Would that work ? > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243199#comment-16243199 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149550602 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -260,6 +288,10 @@ void close() { // when the main thread is blocked waiting for the result. In that case // we want to unblock the main thread. firstMessageReceived.countDown(); // TODO: Why not call releaseIfFirst as used elsewhere? + //Stopping timeout clock --- End diff -- Ok. Guess we'll do away with it. +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5917) Ban org.json:json library in Drill
[ https://issues.apache.org/jira/browse/DRILL-5917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Rozov updated DRILL-5917: -- Summary: Ban org.json:json library in Drill (was: Ban json.org library in Drill) > Ban org.json:json library in Drill > -- > > Key: DRILL-5917 > URL: https://issues.apache.org/jira/browse/DRILL-5917 > Project: Apache Drill > Issue Type: Task >Affects Versions: 1.11.0 >Reporter: Arina Ielchiieva >Assignee: Vlad Rozov > Fix For: 1.12.0 > > > Apache Drill has dependencies on json.org lib indirectly from two libraries: > com.mapr.hadoop:maprfs:jar:5.2.1-mapr > com.mapr.fs:mapr-hbase:jar:5.2.1-mapr > {noformat} > [INFO] org.apache.drill.contrib:drill-format-mapr:jar:1.12.0-SNAPSHOT > [INFO] +- com.mapr.hadoop:maprfs:jar:5.2.1-mapr:compile > [INFO] | \- org.json:json:jar:20080701:compile > [INFO] \- com.mapr.fs:mapr-hbase:jar:5.2.1-mapr:compile > [INFO]\- (org.json:json:jar:20080701:compile - omitted for duplicate) > {noformat} > Need to make sure we won't have any dependencies from these libs to json.org > lib and ban this lib in main pom.xml file. > Issue is critical since Apache release won't happen until we make sure > json.org lib is not used (https://www.apache.org/legal/resolved.html). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5943) Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism
[ https://issues.apache.org/jira/browse/DRILL-5943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243208#comment-16243208 ] ASF GitHub Bot commented on DRILL-5943: --- GitHub user sohami opened a pull request: https://github.com/apache/drill/pull/1028 DRILL-5943: Avoid the strong check introduced by DRILL-5582 for PLAIN… … mechanism You can merge this pull request into a Git repository by running: $ git pull https://github.com/sohami/drill DRILL-5943 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1028.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1028 commit 708dbc203b63700fb520445e585826a5c1e911e4 Author: Sorabh Hamirwasia Date: 2017-11-07T23:27:45Z DRILL-5943: Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism > Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism > --- > > Key: DRILL-5943 > URL: https://issues.apache.org/jira/browse/DRILL-5943 > Project: Apache Drill > Issue Type: Improvement >Reporter: Sorabh Hamirwasia >Assignee: Sorabh Hamirwasia > Fix For: 1.12.0 > > > For the PLAIN mechanism we will weaken the strong check introduced with > DRILL-5582 to keep forward compatibility between a Drill 1.12 client and a > Drill 1.9 server. This is fine since, with or without this strong check, the PLAIN > mechanism is still vulnerable to MITM during the handshake itself, unlike mutual > authentication protocols such as Kerberos. > Also, to keep forward compatibility with respect to SASL, we will treat > UNKNOWN_SASL_SUPPORT as a valid value. For a handshake message received from a > client running on a later version (let's say 1.13) than the Drillbit (1.12) > and carrying a new value for the SaslSupport field that is unknown to the server, this > field will be decoded as UNKNOWN_SASL_SUPPORT. In this scenario the client will > be treated as one that is aware of the SASL protocol, but the server doesn't know the exact > capabilities of the client. Hence the SASL handshake will still be required from > the server side. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5943) Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism
[ https://issues.apache.org/jira/browse/DRILL-5943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243211#comment-16243211 ] ASF GitHub Bot commented on DRILL-5943: --- Github user sohami commented on the issue: https://github.com/apache/drill/pull/1028 @parthchandra & @laurentgo - Please help review this PR. > Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism > --- > > Key: DRILL-5943 > URL: https://issues.apache.org/jira/browse/DRILL-5943 > Project: Apache Drill > Issue Type: Improvement >Reporter: Sorabh Hamirwasia >Assignee: Sorabh Hamirwasia > Fix For: 1.12.0 > > > For the PLAIN mechanism we will weaken the strong check introduced with > DRILL-5582 to keep forward compatibility between a Drill 1.12 client and a > Drill 1.9 server. This is fine since, with or without this strong check, the PLAIN > mechanism is still vulnerable to MITM during the handshake itself, unlike mutual > authentication protocols such as Kerberos. > Also, to keep forward compatibility with respect to SASL, we will treat > UNKNOWN_SASL_SUPPORT as a valid value. For a handshake message received from a > client running on a later version (let's say 1.13) than the Drillbit (1.12) > and carrying a new value for the SaslSupport field that is unknown to the server, this > field will be decoded as UNKNOWN_SASL_SUPPORT. In this scenario the client will > be treated as one that is aware of the SASL protocol, but the server doesn't know the exact > capabilities of the client. Hence the SASL handshake will still be required from > the server side. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242897#comment-16242897 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149506412 --- Diff: exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java --- @@ -237,6 +245,127 @@ public String toString() { } } + /** + * Test for reading of default query timeout + */ + @Test + public void testDefaultGetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +int timeoutValue = stmt.getQueryTimeout(); +assert( 0 == timeoutValue ); + } + + /** + * Test Invalid parameter by giving negative timeout + */ + @Test ( expected = InvalidParameterSqlException.class ) + public void testInvalidSetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +//Setting negative value +int valueToSet = -10; +if (0L == valueToSet) { + valueToSet--; +} +try { + stmt.setQueryTimeout(valueToSet); +} catch ( final Exception e) { + // TODO: handle exception + assertThat( e.getMessage(), containsString( "illegal timeout value") ); + //Converting this to match expected Exception + throw new InvalidParameterSqlException(e.getMessage()); --- End diff -- Yep. Agree. Was trying to make use of the large number of `###SqlException`s defined within the Drill JDBC package. Will fix this. +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
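A possible reshaping of the negative-timeout test discussed above, asserting on the thrown SQLException directly instead of re-wrapping it in a different exception type. This is a sketch reusing the `connection` and `SYS_VERSION_SQL` fixtures visible in the diff, not the final test, and it assumes the driver reports the illegal value via a plain SQLException.

```java
// Sketch only: expects the JDBC driver to reject a negative timeout with an
// SQLException whose message mentions the illegal value. Lives inside a test
// class like PreparedStatementTest, which supplies `connection` and
// SYS_VERSION_SQL, plus the usual JUnit/Hamcrest static imports.
@Test
public void testInvalidSetQueryTimeout() throws SQLException {
  try (PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL)) {
    try {
      stmt.setQueryTimeout(-10);
      fail("Expected an exception for a negative query timeout");
    } catch (SQLException e) {
      assertThat(e.getMessage(), containsString("illegal timeout value"));
    }
  }
}
```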
[jira] [Updated] (DRILL-5943) Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism
[ https://issues.apache.org/jira/browse/DRILL-5943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sorabh Hamirwasia updated DRILL-5943: - Fix Version/s: 1.12.0 > Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism > --- > > Key: DRILL-5943 > URL: https://issues.apache.org/jira/browse/DRILL-5943 > Project: Apache Drill > Issue Type: Improvement >Reporter: Sorabh Hamirwasia >Assignee: Sorabh Hamirwasia > Fix For: 1.12.0 > > > For the PLAIN mechanism we will weaken the strong check introduced with > DRILL-5582 to keep forward compatibility between a Drill 1.12 client and a > Drill 1.9 server. This is fine since, with or without this strong check, the PLAIN > mechanism is still vulnerable to MITM during the handshake itself, unlike mutual > authentication protocols such as Kerberos. > Also, to keep forward compatibility with respect to SASL, we will treat > UNKNOWN_SASL_SUPPORT as a valid value. For a handshake message received from a > client running on a later version (let's say 1.13) than the Drillbit (1.12) > and carrying a new value for the SaslSupport field that is unknown to the server, this > field will be decoded as UNKNOWN_SASL_SUPPORT. In this scenario the client will > be treated as one that is aware of the SASL protocol, but the server doesn't know the exact > capabilities of the client. Hence the SASL handshake will still be required from > the server side. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5899) Simple pattern matchers can work with DrillBuf directly
[ https://issues.apache.org/jira/browse/DRILL-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243238#comment-16243238 ] ASF GitHub Bot commented on DRILL-5899: --- Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1015#discussion_r149554195 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/SqlPatternContainsMatcher.java --- @@ -17,37 +17,48 @@ */ package org.apache.drill.exec.expr.fn.impl; -public class SqlPatternContainsMatcher implements SqlPatternMatcher { - final String patternString; - CharSequence charSequenceWrapper; - final int patternLength; - - public SqlPatternContainsMatcher(String patternString, CharSequence charSequenceWrapper) { -this.patternString = patternString; -this.charSequenceWrapper = charSequenceWrapper; -patternLength = patternString.length(); +import io.netty.buffer.DrillBuf; + +public class SqlPatternContainsMatcher extends AbstractSqlPatternMatcher { + + public SqlPatternContainsMatcher(String patternString) { +super(patternString); } @Override - public int match() { -final int txtLength = charSequenceWrapper.length(); -int patternIndex = 0; -int txtIndex = 0; - -// simplePattern string has meta characters i.e % and _ and escape characters removed. -// so, we can just directly compare. -while (patternIndex < patternLength && txtIndex < txtLength) { - if (patternString.charAt(patternIndex) != charSequenceWrapper.charAt(txtIndex)) { -// Go back if there is no match -txtIndex = txtIndex - patternIndex; -patternIndex = 0; - } else { -patternIndex++; + public int match(int start, int end, DrillBuf drillBuf) { + +if (patternLength == 0) { // Everything should match for null pattern string + return 1; +} + +final int txtLength = end - start; + +// no match if input string length is less than pattern length +if (txtLength < patternLength) { + return 0; +} + +outer: +for (int txtIndex = 0; txtIndex < txtLength; txtIndex++) { + + // boundary check + if (txtIndex + patternLength > txtLength) { --- End diff -- Better: ``` int end = txtLength - patternLength; for (int txtIndex = 0; txtIndex < end; txtIndex++) { ``` And omit the boundary check on every iteration. That is, no reason to iterate past the last possible match, then use an if-statement to shorten the loop. Just shorten the loop. > Simple pattern matchers can work with DrillBuf directly > --- > > Key: DRILL-5899 > URL: https://issues.apache.org/jira/browse/DRILL-5899 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Flow >Reporter: Padma Penumarthy >Assignee: Padma Penumarthy >Priority: Critical > > For the 4 simple patterns we have i.e. startsWith, endsWith, contains and > constant,, we do not need the overhead of charSequenceWrapper. We can work > with DrillBuf directly. This will save us from doing isAscii check and UTF8 > decoding for each row. > UTF-8 encoding ensures that no UTF-8 character is a prefix of any other valid > character. So, instead of decoding varChar from each row we are processing, > encode the patternString once during setup and do raw byte comparison. > Instead of bounds checking and reading one byte at a time, we get the whole > buffer in one shot and use that for comparison. > This improved overall performance for filter operator by around 20%. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
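Applied to the match(int start, int end, DrillBuf drillBuf) body under review, the restructuring suggested above might look like the sketch below, with the bound made inclusive so the last feasible start position is still compared. patternLength and patternByteBuffer are assumed to be the fields set up in the abstract base class; this is not the PR's final code.

```java
// Sketch of the hoisted loop bound; txtLength = end - start as in the diff.
final int lastStart = txtLength - patternLength;   // last feasible start position
outer:
for (int txtIndex = 0; txtIndex <= lastStart; txtIndex++) {
  for (int patternIndex = 0; patternIndex < patternLength; patternIndex++) {
    if (patternByteBuffer.get(patternIndex) != drillBuf.getByte(start + txtIndex + patternIndex)) {
      continue outer;                              // mismatch: advance the start position
    }
  }
  return 1;                                        // every pattern byte matched
}
return 0;
```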
[jira] [Commented] (DRILL-5899) Simple pattern matchers can work with DrillBuf directly
[ https://issues.apache.org/jira/browse/DRILL-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243237#comment-16243237 ] ASF GitHub Bot commented on DRILL-5899: --- Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1015#discussion_r149552453 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/AbstractSqlPatternMatcher.java --- @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.expr.fn.impl; + +import com.google.common.base.Charsets; +import org.apache.drill.common.exceptions.UserException; +import java.nio.ByteBuffer; +import java.nio.CharBuffer; +import java.nio.charset.CharacterCodingException; +import java.nio.charset.CharsetEncoder; +import static org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.logger; + +// To get good performance for most commonly used pattern matches --- End diff -- Javadoc? ``` /** * This is a Javadoc comment and appears in generated documentation. */ // This is a plain comment and does not appear in documentation. ``` > Simple pattern matchers can work with DrillBuf directly > --- > > Key: DRILL-5899 > URL: https://issues.apache.org/jira/browse/DRILL-5899 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Flow >Reporter: Padma Penumarthy >Assignee: Padma Penumarthy >Priority: Critical > > For the 4 simple patterns we have i.e. startsWith, endsWith, contains and > constant,, we do not need the overhead of charSequenceWrapper. We can work > with DrillBuf directly. This will save us from doing isAscii check and UTF8 > decoding for each row. > UTF-8 encoding ensures that no UTF-8 character is a prefix of any other valid > character. So, instead of decoding varChar from each row we are processing, > encode the patternString once during setup and do raw byte comparison. > Instead of bounds checking and reading one byte at a time, we get the whole > buffer in one shot and use that for comparison. > This improved overall performance for filter operator by around 20%. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5899) Simple pattern matchers can work with DrillBuf directly
[ https://issues.apache.org/jira/browse/DRILL-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243235#comment-16243235 ] ASF GitHub Bot commented on DRILL-5899: --- Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1015#discussion_r149554356 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/SqlPatternEndsWithMatcher.java --- @@ -17,33 +17,30 @@ */ package org.apache.drill.exec.expr.fn.impl; -public class SqlPatternEndsWithMatcher implements SqlPatternMatcher { - final String patternString; - CharSequence charSequenceWrapper; - final int patternLength; - - public SqlPatternEndsWithMatcher(String patternString, CharSequence charSequenceWrapper) { -this.charSequenceWrapper = charSequenceWrapper; -this.patternString = patternString; -this.patternLength = patternString.length(); +import io.netty.buffer.DrillBuf; + +public class SqlPatternEndsWithMatcher extends AbstractSqlPatternMatcher { + + public SqlPatternEndsWithMatcher(String patternString) { +super(patternString); } @Override - public int match() { -int txtIndex = charSequenceWrapper.length(); -int patternIndex = patternLength; -boolean matchFound = true; // if pattern is empty string, we always match. + public int match(int start, int end, DrillBuf drillBuf) { + +if ( (end - start) < patternLength) { // No match if input string length is less than pattern length. --- End diff -- `( (end - start)` --> `(end - start` > Simple pattern matchers can work with DrillBuf directly > --- > > Key: DRILL-5899 > URL: https://issues.apache.org/jira/browse/DRILL-5899 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Flow >Reporter: Padma Penumarthy >Assignee: Padma Penumarthy >Priority: Critical > > For the 4 simple patterns we have i.e. startsWith, endsWith, contains and > constant,, we do not need the overhead of charSequenceWrapper. We can work > with DrillBuf directly. This will save us from doing isAscii check and UTF8 > decoding for each row. > UTF-8 encoding ensures that no UTF-8 character is a prefix of any other valid > character. So, instead of decoding varChar from each row we are processing, > encode the patternString once during setup and do raw byte comparison. > Instead of bounds checking and reading one byte at a time, we get the whole > buffer in one shot and use that for comparison. > This improved overall performance for filter operator by around 20%. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5899) Simple pattern matchers can work with DrillBuf directly
[ https://issues.apache.org/jira/browse/DRILL-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243239#comment-16243239 ] ASF GitHub Bot commented on DRILL-5899: --- Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1015#discussion_r149555002 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/SqlPatternEndsWithMatcher.java --- @@ -17,33 +17,30 @@ */ package org.apache.drill.exec.expr.fn.impl; -public class SqlPatternEndsWithMatcher implements SqlPatternMatcher { - final String patternString; - CharSequence charSequenceWrapper; - final int patternLength; - - public SqlPatternEndsWithMatcher(String patternString, CharSequence charSequenceWrapper) { -this.charSequenceWrapper = charSequenceWrapper; -this.patternString = patternString; -this.patternLength = patternString.length(); +import io.netty.buffer.DrillBuf; + +public class SqlPatternEndsWithMatcher extends AbstractSqlPatternMatcher { + + public SqlPatternEndsWithMatcher(String patternString) { +super(patternString); } @Override - public int match() { -int txtIndex = charSequenceWrapper.length(); -int patternIndex = patternLength; -boolean matchFound = true; // if pattern is empty string, we always match. + public int match(int start, int end, DrillBuf drillBuf) { + +if ( (end - start) < patternLength) { // No match if input string length is less than pattern length. + return 0; +} // simplePattern string has meta characters i.e % and _ and escape characters removed. // so, we can just directly compare. -while (patternIndex > 0 && txtIndex > 0) { - if (charSequenceWrapper.charAt(--txtIndex) != patternString.charAt(--patternIndex)) { -matchFound = false; -break; +for (int index = 1; index <= patternLength; index++) { --- End diff -- ``` int txtStart = end - patternLength; if (txtStart < start) { return 0; } for (int index = 0; index < patternLength; index++) { ... patternByteBuffer.get(index) ... drillBuf.getByte(txtStart + index) ... ``` > Simple pattern matchers can work with DrillBuf directly > --- > > Key: DRILL-5899 > URL: https://issues.apache.org/jira/browse/DRILL-5899 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Flow >Reporter: Padma Penumarthy >Assignee: Padma Penumarthy >Priority: Critical > > For the 4 simple patterns we have i.e. startsWith, endsWith, contains and > constant,, we do not need the overhead of charSequenceWrapper. We can work > with DrillBuf directly. This will save us from doing isAscii check and UTF8 > decoding for each row. > UTF-8 encoding ensures that no UTF-8 character is a prefix of any other valid > character. So, instead of decoding varChar from each row we are processing, > encode the patternString once during setup and do raw byte comparison. > Instead of bounds checking and reading one byte at a time, we get the whole > buffer in one shot and use that for comparison. > This improved overall performance for filter operator by around 20%. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5899) Simple pattern matchers can work with DrillBuf directly
[ https://issues.apache.org/jira/browse/DRILL-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243236#comment-16243236 ] ASF GitHub Bot commented on DRILL-5899: --- Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1015#discussion_r149552506 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/AbstractSqlPatternMatcher.java --- @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.expr.fn.impl; + +import com.google.common.base.Charsets; +import org.apache.drill.common.exceptions.UserException; +import java.nio.ByteBuffer; +import java.nio.CharBuffer; +import java.nio.charset.CharacterCodingException; +import java.nio.charset.CharsetEncoder; +import static org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.logger; + +// To get good performance for most commonly used pattern matches +// i.e. CONSTANT('ABC'), STARTSWITH('%ABC'), ENDSWITH('ABC%') and CONTAINS('%ABC%'), +// we have simple pattern matchers. +// Idea is to have our own implementation for simple pattern matchers so we can +// avoid heavy weight regex processing, skip UTF-8 decoding and char conversion. +// Instead, we encode the pattern string and do byte comparison against native memory. +// Overall, this approach +// gives us orders of magnitude performance improvement for simple pattern matches. +// Anything that is not simple is considered +// complex pattern and we use Java regex for complex pattern matches. + +public abstract class AbstractSqlPatternMatcher implements SqlPatternMatcher { + final String patternString; --- End diff -- `protected final` > Simple pattern matchers can work with DrillBuf directly > --- > > Key: DRILL-5899 > URL: https://issues.apache.org/jira/browse/DRILL-5899 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Flow >Reporter: Padma Penumarthy >Assignee: Padma Penumarthy >Priority: Critical > > For the 4 simple patterns we have i.e. startsWith, endsWith, contains and > constant,, we do not need the overhead of charSequenceWrapper. We can work > with DrillBuf directly. This will save us from doing isAscii check and UTF8 > decoding for each row. > UTF-8 encoding ensures that no UTF-8 character is a prefix of any other valid > character. So, instead of decoding varChar from each row we are processing, > encode the patternString once during setup and do raw byte comparison. > Instead of bounds checking and reading one byte at a time, we get the whole > buffer in one shot and use that for comparison. > This improved overall performance for filter operator by around 20%. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
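A sketch of the "encode the pattern once during setup" idea described in the comment block above: the pattern string is converted to UTF-8 bytes a single time, so per-row work reduces to raw byte comparison. The class and field names here are illustrative, not Drill's, and the match signature takes a plain byte array instead of DrillBuf to stay self-contained.
{code}
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

abstract class PatternMatcherSketch {
  protected final String patternString;
  protected final ByteBuffer patternByteBuffer;
  protected final int patternLength;

  PatternMatcherSketch(String patternString) {
    this.patternString = patternString;
    final CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
    try {
      // Encode the pattern exactly once; match() can then index patternByteBuffer directly.
      this.patternByteBuffer = encoder.encode(CharBuffer.wrap(patternString));
    } catch (CharacterCodingException e) {
      throw new IllegalArgumentException("Cannot encode pattern: " + patternString, e);
    }
    this.patternLength = patternByteBuffer.limit();
  }

  /** Per-row entry point; concrete matchers compare bytes against the input buffer. */
  abstract int match(int start, int end, byte[] input);
}
{code}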
[jira] [Created] (DRILL-5943) Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism
Sorabh Hamirwasia created DRILL-5943: Summary: Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism Key: DRILL-5943 URL: https://issues.apache.org/jira/browse/DRILL-5943 Project: Apache Drill Issue Type: Improvement Reporter: Sorabh Hamirwasia Assignee: Sorabh Hamirwasia For PLAIN mechanism we will weaken the strong check introduced with DRILL-5582 to keep the forward compatibility between Drill 1.12 client and Drill 1.9 server. This is fine since with and without this strong check PLAIN mechanism is still vulnerable to MITM during handshake itself unlike mutual authentication protocols like Kerberos. Also for keeping forward compatibility with respect to SASL we will treat UNKNOWN_SASL_SUPPORT as valid value. For handshake message received from a client which is running on later version (let say 1.13) then Drillbit (1.12) and having a new value for SaslSupport field which is unknown to server, this field will be decoded as UNKNOWN_SASL_SUPPORT. In this scenario client will be treated as one aware about SASL protocol but server doesn't know exact capabilities of client. Hence the SASL handshake will still be required from server side. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
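A hedged sketch of the forward-compatibility rule described in this issue: a client whose SaslSupport value the server cannot decode is folded into UNKNOWN_SASL_SUPPORT and still put through the SASL handshake. The enum and method below are invented stand-ins for illustration only, not Drill's actual protobuf types or server logic.
{code}
public class SaslSupportSketch {

  enum SaslSupport { UNKNOWN_SASL_SUPPORT, SASL_AUTH }

  /** Decides whether the server must run the SASL handshake for this client. */
  static boolean requiresSaslHandshake(SaslSupport clientSupport, boolean authEnabled) {
    if (!authEnabled) {
      return false;                  // server not configured for authentication
    }
    switch (clientSupport) {
      case SASL_AUTH:
        return true;                 // client explicitly advertises SASL support
      case UNKNOWN_SASL_SUPPORT:
      default:
        // Newer client sent a value this server does not know: treat it as
        // SASL-aware with unknown capabilities, so still require the handshake.
        return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(requiresSaslHandshake(SaslSupport.UNKNOWN_SASL_SUPPORT, true)); // true
  }
}
{code}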
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243135#comment-16243135 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149543163 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -376,6 +417,19 @@ synchronized void cleanup() { currentBatchHolder.clear(); } + long getTimeoutInMilliseconds() { +return timeoutInMilliseconds; + } + + //Set the cursor's timeout in seconds + void setTimeout(int timeoutDurationInSeconds){ +this.timeoutInMilliseconds = timeoutDurationInSeconds*1000L; --- End diff -- Preferably use `TimeUnit.SECONDS.toMillis(timeoutDurationInSeconds)` to avoid magic constants > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
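A minimal illustration of the suggestion above: let TimeUnit do the unit conversion instead of multiplying by a 1000L magic constant. The field and method names mirror the diff, but the surrounding class is omitted, so this is only a sketch.
{code}
import java.util.concurrent.TimeUnit;

class TimeoutSketch {
  private long timeoutInMilliseconds;

  // Set the cursor's timeout in seconds, storing it internally in milliseconds.
  void setTimeout(int timeoutDurationInSeconds) {
    this.timeoutInMilliseconds = TimeUnit.SECONDS.toMillis(timeoutDurationInSeconds);
  }

  long getTimeoutInMilliseconds() {
    return timeoutInMilliseconds;
  }
}
{code}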
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243133#comment-16243133 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149543078 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -333,8 +368,14 @@ void close() { final int batchQueueThrottlingThreshold = client.getConfig().getInt( ExecConstants.JDBC_BATCH_QUEUE_THROTTLING_THRESHOLD ); -resultsListener = new ResultsListener(batchQueueThrottlingThreshold); +resultsListener = new ResultsListener(this, batchQueueThrottlingThreshold); currentBatchHolder = new RecordBatchLoader(client.getAllocator()); +try { + setTimeout(this.statement.getQueryTimeout()); +} catch (SQLException e) { + // Printing any unexpected SQLException stack trace + e.printStackTrace(); --- End diff -- two choices here: - we don't think it's important if we cannot get the value, so we should log it properly and not simply dump the exception - we think this is important, and we propagate the exception to the caller (I think it is important: the most likely reason why we could not get the value if that the statement was closed, and we should probably notify the user about it). > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243136#comment-16243136 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149542468 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -139,8 +147,22 @@ private boolean stopThrottlingIfSo() { return stopped; } -public void awaitFirstMessage() throws InterruptedException { - firstMessageReceived.await(); +public void awaitFirstMessage() throws InterruptedException, SQLTimeoutException { + //Check if a non-zero timeout has been set + if ( parent.timeoutInMilliseconds > 0 ) { +//Identifying remaining in milliseconds to maintain a granularity close to integer value of timeout +long timeToTimeout = (parent.timeoutInMilliseconds) - parent.elapsedTimer.elapsed(TimeUnit.MILLISECONDS); +if ( timeToTimeout > 0 ) { --- End diff -- maybe a style issue, but to avoid code duplication both conditions could be checked together? ``` if ( timeToTimeout <= 0 || !firstMessageReceived.await(timeToTimeout, TimeUnit.MILLISECONDS) ) { throw new SqlTimeoutException(TimeUnit.MILLISECONDS.toSeconds(parent.timeoutInMilliseconds)); } ``` > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
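A self-contained sketch of the combined timeout check proposed above: compute the remaining budget once and treat "budget already spent" and "await timed out" as the same failure. A CountDownLatch stands in for firstMessageReceived, the elapsed-time bookkeeping is simplified to System.currentTimeMillis(), and java.sql.SQLTimeoutException stands in for Drill's SqlTimeoutException.
{code}
import java.sql.SQLTimeoutException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class AwaitSketch {
  private final CountDownLatch firstMessageReceived = new CountDownLatch(1);
  private final long timeoutInMilliseconds;
  private final long startMillis = System.currentTimeMillis();

  AwaitSketch(long timeoutInMilliseconds) {
    this.timeoutInMilliseconds = timeoutInMilliseconds;
  }

  void awaitFirstMessage() throws InterruptedException, SQLTimeoutException {
    if (timeoutInMilliseconds > 0) {
      long timeToTimeout = timeoutInMilliseconds - (System.currentTimeMillis() - startMillis);
      // One condition covers both "no time left" and "waited and nothing arrived".
      if (timeToTimeout <= 0
          || !firstMessageReceived.await(timeToTimeout, TimeUnit.MILLISECONDS)) {
        throw new SQLTimeoutException("Query timed out after "
            + TimeUnit.MILLISECONDS.toSeconds(timeoutInMilliseconds) + " seconds");
      }
    } else {
      firstMessageReceived.await();   // no timeout configured: wait indefinitely
    }
  }
}
{code}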
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243137#comment-16243137 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149543581 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java --- @@ -96,6 +105,14 @@ private void throwIfClosed() throws AlreadyClosedSqlException, throw new AlreadyClosedSqlException( "ResultSet is already closed." ); } } + +//Implicit check for whether timeout is set +if (elapsedTimer != null) { --- End diff -- maybe an helper method from the cursor to see if we timed out instead of exposing elapsedTimer? I'm not sure if this is really necessary (posted another comment about it previously), except maybe because of unit tests where it's hard to time out inside the cursor? I did a prototype too and used control injection to pause the screen operator: the test would look like this: ``` /** * Test setting timeout for a query that actually times out */ @Test ( expected = SqlTimeoutException.class ) public void testTriggeredQueryTimeout() throws SQLException { // Prevent the server to complete the query to trigger a timeout final String controls = Controls.newBuilder() .addPause(ScreenCreator.class, "send-complete", 0) .build(); try(Statement statement = connection.createStatement()) { assertThat( statement.execute(String.format( "ALTER session SET `%s` = '%s'", ExecConstants.DRILLBIT_CONTROL_INJECTIONS, controls)), equalTo(true)); } String queryId = null; try(Statement statement = connection.createStatement()) { int timeoutDuration = 3; //Setting to a very low value (3sec) statement.setQueryTimeout(timeoutDuration); ResultSet rs = statement.executeQuery(SYS_VERSION_SQL); queryId = ((DrillResultSet) rs).getQueryId(); //Fetch rows while (rs.next()) { rs.getBytes(1); } } catch (SQLException sqlEx) { if (sqlEx instanceof SqlTimeoutException) { throw (SqlTimeoutException) sqlEx; } } finally { // Do not forget to unpause to avoid memory leak. if (queryId != null) { DrillClient drillClient = ((DrillConnection) connection).getClient(); drillClient.resumeQuery(QueryIdHelper.getQueryIdFromString(queryId)); } } ``` Works for PreparedStatementTest too, need to make sure you pause after prepared statement is created but before it is executed. > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243132#comment-16243132 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149542622 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -239,6 +261,11 @@ QueryDataBatch getNext() throws UserException, InterruptedException { } return qdb; } + + // Check and throw SQLTimeoutException + if ( parent.timeoutInMilliseconds > 0 && parent.elapsedTimer.elapsed(TimeUnit.SECONDS) >= parent.timeoutInMilliseconds ) { --- End diff -- wrong unit for the comparison (should be millis) > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
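A tiny sketch of the corrected check flagged above: the elapsed time must be read in the same unit as timeoutInMilliseconds before comparing. It uses Guava's Stopwatch, which the driver code in these diffs already relies on; the helper-method packaging is illustrative.
{code}
import java.util.concurrent.TimeUnit;
import com.google.common.base.Stopwatch;

class TimeoutCheckSketch {
  private final Stopwatch elapsedTimer = Stopwatch.createStarted();
  private final long timeoutInMilliseconds;

  TimeoutCheckSketch(long timeoutInMilliseconds) {
    this.timeoutInMilliseconds = timeoutInMilliseconds;
  }

  // Compare milliseconds to milliseconds; reading elapsed() in SECONDS here
  // would make the timeout fire far too late.
  boolean hasTimedOut() {
    return timeoutInMilliseconds > 0
        && elapsedTimer.elapsed(TimeUnit.MILLISECONDS) >= timeoutInMilliseconds;
  }
}
{code}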
[jira] [Commented] (DRILL-5899) Simple pattern matchers can work with DrillBuf directly
[ https://issues.apache.org/jira/browse/DRILL-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243050#comment-16243050 ] ASF GitHub Bot commented on DRILL-5899: --- Github user ppadma commented on the issue: https://github.com/apache/drill/pull/1015 @paul-rogers Thanks a lot for the review. Updated the PR with code review comments. Please take a look. Overall, good improvement with this change. Here are the numbers. select count(*) from `/Users/ppenumarthy/MAPRTECH/padma/testdata` where l_comment like '%a' 1.4 sec vs 7 sec select count(*) from `/Users/ppenumarthy/MAPRTECH/padma/testdata` where l_comment like '%a%' 6.5 sec vs 13.5 sec select count(*) from `/Users/ppenumarthy/MAPRTECH/padma/testdata` where l_comment like 'a%' 1.4 sec vs 5.8 sec select count(*) from `/Users/ppenumarthy/MAPRTECH/padma/testdata` where l_comment like 'a' 1.1.65 sec vs 5.8 sec I think for "contains", improvement is not as much as others, probably because of nested for loops. @sachouche changes on top of these changes can improve further. > Simple pattern matchers can work with DrillBuf directly > --- > > Key: DRILL-5899 > URL: https://issues.apache.org/jira/browse/DRILL-5899 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Flow >Reporter: Padma Penumarthy >Assignee: Padma Penumarthy >Priority: Critical > > For the 4 simple patterns we have i.e. startsWith, endsWith, contains and > constant,, we do not need the overhead of charSequenceWrapper. We can work > with DrillBuf directly. This will save us from doing isAscii check and UTF8 > decoding for each row. > UTF-8 encoding ensures that no UTF-8 character is a prefix of any other valid > character. So, instead of decoding varChar from each row we are processing, > encode the patternString once during setup and do raw byte comparison. > Instead of bounds checking and reading one byte at a time, we get the whole > buffer in one shot and use that for comparison. > This improved overall performance for filter operator by around 20%. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242705#comment-16242705 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149477222 --- Diff: exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java --- @@ -237,6 +245,127 @@ public String toString() { } } + /** + * Test for reading of default query timeout + */ + @Test + public void testDefaultGetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +int timeoutValue = stmt.getQueryTimeout(); +assert( 0 == timeoutValue ); + } + + /** + * Test Invalid parameter by giving negative timeout + */ + @Test ( expected = InvalidParameterSqlException.class ) + public void testInvalidSetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +//Setting negative value +int valueToSet = -10; +if (0L == valueToSet) { + valueToSet--; +} +try { + stmt.setQueryTimeout(valueToSet); +} catch ( final Exception e) { + // TODO: handle exception + assertThat( e.getMessage(), containsString( "illegal timeout value") ); + //Converting this to match expected Exception + throw new InvalidParameterSqlException(e.getMessage()); --- End diff -- but you did since you catch the exception and do a check on the message. Rewrapping it so that the test framework can check the new type has no value. > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
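One way to address the review point above without rewrapping the exception: JUnit 4's ExpectedException rule asserts both the exception type and its message in one place. The connection field and query text below are placeholders for the fixtures the real PreparedStatementTest sets up, and the expected type assumes the driver surfaces the invalid value as a SQLException, as the message check in the original test suggests.
{code}
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.ExpectedException;

public class InvalidTimeoutTestSketch {

  private static final String SYS_VERSION_SQL = "SELECT * FROM sys.version"; // placeholder query
  private Connection connection;   // initialized by the test fixture in the real class

  @Rule
  public ExpectedException thrown = ExpectedException.none();

  @Test
  public void testInvalidSetQueryTimeout() throws SQLException {
    // Assert the type and message directly; no need to catch and rethrow a new type.
    thrown.expect(SQLException.class);
    thrown.expectMessage("illegal timeout value");
    try (PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL)) {
      stmt.setQueryTimeout(-10);   // a negative timeout must be rejected
    }
  }
}
{code}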
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242719#comment-16242719 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149477955 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -376,6 +415,19 @@ synchronized void cleanup() { currentBatchHolder.clear(); } + //Set the cursor's timeout in seconds --- End diff -- you just need to get the value when the query is executed (in DrillCursor) once to make sure the timeout doesn't change (that and StopWatch being managed by DrillCursor too. Also, it is subject to interpretation but it seems the intent of the API is to time bound how much time it takes the query to complete. I'm not sure it is necessary to make the extra work of having a slow client reading the result set data although all data has already been read by the driver from the server (and from the server point of view, the query is completed). > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5138) TopN operator on top of ~110 GB data set is very slow
[ https://issues.apache.org/jira/browse/DRILL-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242709#comment-16242709 ] Chun Chang commented on DRILL-5138: --- I ran the query against MapR Drill 1.11.0 and query returned in 81 seconds. [root@perfnode166 catalog_sales]# sqlline --maxWidth=1 -u "jdbc:drill:zk=10.10.30.166:5181" OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0 apache drill 1.11.0-mapr "drill baby drill" 0: jdbc:drill:zk=10.10.30.166:5181> select * from dfs.`/drill/testdata/tpcds_sf100/parquet/catalog_sales` order by cs_quantity, cs_wholesale_cost limit 1; +--+---+--+---++-++--++-+---+-++-++--+---+---+--++--+--+--+-+--+---+--+--+---+--+--+--+--++ | cs_bill_addr_sk | cs_bill_cdemo_sk | cs_bill_customer_sk | cs_bill_hdemo_sk | cs_call_center_sk | cs_catalog_page_sk | cs_coupon_amt | cs_ext_discount_amt | cs_ext_list_price | cs_ext_sales_price | cs_ext_ship_cost | cs_ext_tax | cs_ext_wholesale_cost | cs_item_sk | cs_list_price | cs_net_paid | cs_net_paid_inc_ship | cs_net_paid_inc_ship_tax | cs_net_paid_inc_tax | cs_net_profit | cs_order_number | cs_promo_sk | cs_quantity | cs_sales_price | cs_ship_addr_sk | cs_ship_cdemo_sk | cs_ship_customer_sk | cs_ship_date_sk | cs_ship_hdemo_sk | cs_ship_mode_sk | cs_sold_date_sk | cs_sold_time_sk | cs_warehouse_sk | cs_wholesale_cost | +--+---+--+---++-++--++-+---+-++-++--+---+---+--++--+--+--+-+--+---+--+--+---+--+--+--+--++ | 184649 | 555979| 1796891 | 1114 | 24 | 14393 | 0.00 | 0.02 | 1.82 | 1.80| 0.25 | 0.00 | 1.00 | 108618 | 1.82 | 1.80 | 2.05 | 2.05 | 1.80 | 0.80 | 15928478 | 540 | 1| 1.80| 184649 | 555979| 1796891 | 2452671 | 1114 | 9| 2452640 | 38871| 1| 1.00 | +--+---+--+---++-++--++-+---+-++-++--+---+---+--++--+--+--+-+--+---+--+--+---+--+--+--+--++ 1 row selected (81.577 seconds) > TopN operator on top of ~110 GB data set is very slow > - > > Key: DRILL-5138 > URL: https://issues.apache.org/jira/browse/DRILL-5138 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Reporter: Rahul Challapalli >Assignee: Timothy Farkas > > git.commit.id.abbrev=cf2b7c7 > No of cores : 23 > No of disks : 5 > DRILL_MAX_DIRECT_MEMORY="24G" > DRILL_MAX_HEAP="12G" > The below query ran for more than 4 hours and did not complete. The table is > ~110 GB > {code} > select * from catalog_sales order by cs_quantity, cs_wholesale_cost limit 1; > {code} > Physical Plan : > {code} > 00-00Screen : rowType = RecordType(ANY *): rowcount = 1.0, cumulative > cost = {1.00798629141E10 rows, 4.1
[jira] [Resolved] (DRILL-5138) TopN operator on top of ~110 GB data set is very slow
[ https://issues.apache.org/jira/browse/DRILL-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Farkas resolved DRILL-5138. --- Resolution: Fixed > TopN operator on top of ~110 GB data set is very slow > - > > Key: DRILL-5138 > URL: https://issues.apache.org/jira/browse/DRILL-5138 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Reporter: Rahul Challapalli >Assignee: Timothy Farkas > > git.commit.id.abbrev=cf2b7c7 > No of cores : 23 > No of disks : 5 > DRILL_MAX_DIRECT_MEMORY="24G" > DRILL_MAX_HEAP="12G" > The below query ran for more than 4 hours and did not complete. The table is > ~110 GB > {code} > select * from catalog_sales order by cs_quantity, cs_wholesale_cost limit 1; > {code} > Physical Plan : > {code} > 00-00Screen : rowType = RecordType(ANY *): rowcount = 1.0, cumulative > cost = {1.00798629141E10 rows, 4.17594320691E10 cpu, 0.0 io, > 4.1287118487552E13 network, 0.0 memory}, id = 352 > 00-01 Project(*=[$0]) : rowType = RecordType(ANY *): rowcount = 1.0, > cumulative cost = {1.0079862914E10 rows, 4.1759432069E10 cpu, 0.0 io, > 4.1287118487552E13 network, 0.0 memory}, id = 351 > 00-02Project(T0¦¦*=[$0]) : rowType = RecordType(ANY T0¦¦*): rowcount > = 1.0, cumulative cost = {1.0079862914E10 rows, 4.1759432069E10 cpu, 0.0 io, > 4.1287118487552E13 network, 0.0 memory}, id = 350 > 00-03 SelectionVectorRemover : rowType = RecordType(ANY T0¦¦*, ANY > cs_quantity, ANY cs_wholesale_cost): rowcount = 1.0, cumulative cost = > {1.0079862914E10 rows, 4.1759432069E10 cpu, 0.0 io, 4.1287118487552E13 > network, 0.0 memory}, id = 349 > 00-04Limit(fetch=[1]) : rowType = RecordType(ANY T0¦¦*, ANY > cs_quantity, ANY cs_wholesale_cost): rowcount = 1.0, cumulative cost = > {1.0079862913E10 rows, 4.1759432068E10 cpu, 0.0 io, 4.1287118487552E13 > network, 0.0 memory}, id = 348 > 00-05 SingleMergeExchange(sort0=[1 ASC], sort1=[2 ASC]) : > rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, ANY cs_wholesale_cost): > rowcount = 1.439980416E9, cumulative cost = {1.0079862912E10 rows, > 4.1759432064E10 cpu, 0.0 io, 4.1287118487552E13 network, 0.0 memory}, id = 347 > 01-01SelectionVectorRemover : rowType = RecordType(ANY T0¦¦*, > ANY cs_quantity, ANY cs_wholesale_cost): rowcount = 1.439980416E9, cumulative > cost = {8.639882496E9 rows, 3.0239588736E10 cpu, 0.0 io, 2.3592639135744E13 > network, 0.0 memory}, id = 346 > 01-02 TopN(limit=[1]) : rowType = RecordType(ANY T0¦¦*, ANY > cs_quantity, ANY cs_wholesale_cost): rowcount = 1.439980416E9, cumulative > cost = {7.19990208E9 rows, 2.879960832E10 cpu, 0.0 io, 2.3592639135744E13 > network, 0.0 memory}, id = 345 > 01-03Project(T0¦¦*=[$0], cs_quantity=[$1], > cs_wholesale_cost=[$2]) : rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, > ANY cs_wholesale_cost): rowcount = 1.439980416E9, cumulative cost = > {5.759921664E9 rows, 2.879960832E10 cpu, 0.0 io, 2.3592639135744E13 network, > 0.0 memory}, id = 344 > 01-04 HashToRandomExchange(dist0=[[$1]], dist1=[[$2]]) : > rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, ANY cs_wholesale_cost, ANY > E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 1.439980416E9, cumulative cost = > {5.759921664E9 rows, 2.879960832E10 cpu, 0.0 io, 2.3592639135744E13 network, > 0.0 memory}, id = 343 > 02-01UnorderedMuxExchange : rowType = RecordType(ANY > T0¦¦*, ANY cs_quantity, ANY cs_wholesale_cost, ANY > E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 1.439980416E9, cumulative cost = > {4.319941248E9 rows, 1.1519843328E10 cpu, 0.0 io, 0.0 network, 0.0 memory}, > id = 342 > 03-01 
Project(T0¦¦*=[$0], cs_quantity=[$1], > cs_wholesale_cost=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($2, > hash32AsDouble($1))]) : rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, ANY > cs_wholesale_cost, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 1.439980416E9, > cumulative cost = {2.879960832E9 rows, 1.0079862912E10 cpu, 0.0 io, 0.0 > network, 0.0 memory}, id = 341 > 03-02Project(T0¦¦*=[$0], cs_quantity=[$1], > cs_wholesale_cost=[$2]) : rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, > ANY cs_wholesale_cost): rowcount = 1.439980416E9, cumulative cost = > {1.439980416E9 rows, 4.319941248E9 cpu, 0.0 io, 0.0 network, 0.0 memory}, id > = 340 > 03-03 Scan(groupscan=[ParquetGroupScan > [entries=[ReadEntryWithPath > [path=maprfs:///drill/testdata/tpcds/parquet/sf1000/catalog_sales]], > selectionRoot=maprfs:/drill/testdata/tpcds/parquet/sf1000/catalog_sales, > numFiles=1, usedMetadataFile=false, columns=[`*`]]]) : rowType = > (DrillRecordRow[*, cs_quantity, cs_wholesale_cost]):
[jira] [Commented] (DRILL-5138) TopN operator on top of ~110 GB data set is very slow
[ https://issues.apache.org/jira/browse/DRILL-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242773#comment-16242773 ] Timothy Farkas commented on DRILL-5138: --- Since this is working for Chun it looks like the config change, which is already merged to master, was sufficient to fix this issue. > TopN operator on top of ~110 GB data set is very slow > - > > Key: DRILL-5138 > URL: https://issues.apache.org/jira/browse/DRILL-5138 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Reporter: Rahul Challapalli >Assignee: Timothy Farkas > > git.commit.id.abbrev=cf2b7c7 > No of cores : 23 > No of disks : 5 > DRILL_MAX_DIRECT_MEMORY="24G" > DRILL_MAX_HEAP="12G" > The below query ran for more than 4 hours and did not complete. The table is > ~110 GB > {code} > select * from catalog_sales order by cs_quantity, cs_wholesale_cost limit 1; > {code} > Physical Plan : > {code} > 00-00Screen : rowType = RecordType(ANY *): rowcount = 1.0, cumulative > cost = {1.00798629141E10 rows, 4.17594320691E10 cpu, 0.0 io, > 4.1287118487552E13 network, 0.0 memory}, id = 352 > 00-01 Project(*=[$0]) : rowType = RecordType(ANY *): rowcount = 1.0, > cumulative cost = {1.0079862914E10 rows, 4.1759432069E10 cpu, 0.0 io, > 4.1287118487552E13 network, 0.0 memory}, id = 351 > 00-02Project(T0¦¦*=[$0]) : rowType = RecordType(ANY T0¦¦*): rowcount > = 1.0, cumulative cost = {1.0079862914E10 rows, 4.1759432069E10 cpu, 0.0 io, > 4.1287118487552E13 network, 0.0 memory}, id = 350 > 00-03 SelectionVectorRemover : rowType = RecordType(ANY T0¦¦*, ANY > cs_quantity, ANY cs_wholesale_cost): rowcount = 1.0, cumulative cost = > {1.0079862914E10 rows, 4.1759432069E10 cpu, 0.0 io, 4.1287118487552E13 > network, 0.0 memory}, id = 349 > 00-04Limit(fetch=[1]) : rowType = RecordType(ANY T0¦¦*, ANY > cs_quantity, ANY cs_wholesale_cost): rowcount = 1.0, cumulative cost = > {1.0079862913E10 rows, 4.1759432068E10 cpu, 0.0 io, 4.1287118487552E13 > network, 0.0 memory}, id = 348 > 00-05 SingleMergeExchange(sort0=[1 ASC], sort1=[2 ASC]) : > rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, ANY cs_wholesale_cost): > rowcount = 1.439980416E9, cumulative cost = {1.0079862912E10 rows, > 4.1759432064E10 cpu, 0.0 io, 4.1287118487552E13 network, 0.0 memory}, id = 347 > 01-01SelectionVectorRemover : rowType = RecordType(ANY T0¦¦*, > ANY cs_quantity, ANY cs_wholesale_cost): rowcount = 1.439980416E9, cumulative > cost = {8.639882496E9 rows, 3.0239588736E10 cpu, 0.0 io, 2.3592639135744E13 > network, 0.0 memory}, id = 346 > 01-02 TopN(limit=[1]) : rowType = RecordType(ANY T0¦¦*, ANY > cs_quantity, ANY cs_wholesale_cost): rowcount = 1.439980416E9, cumulative > cost = {7.19990208E9 rows, 2.879960832E10 cpu, 0.0 io, 2.3592639135744E13 > network, 0.0 memory}, id = 345 > 01-03Project(T0¦¦*=[$0], cs_quantity=[$1], > cs_wholesale_cost=[$2]) : rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, > ANY cs_wholesale_cost): rowcount = 1.439980416E9, cumulative cost = > {5.759921664E9 rows, 2.879960832E10 cpu, 0.0 io, 2.3592639135744E13 network, > 0.0 memory}, id = 344 > 01-04 HashToRandomExchange(dist0=[[$1]], dist1=[[$2]]) : > rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, ANY cs_wholesale_cost, ANY > E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 1.439980416E9, cumulative cost = > {5.759921664E9 rows, 2.879960832E10 cpu, 0.0 io, 2.3592639135744E13 network, > 0.0 memory}, id = 343 > 02-01UnorderedMuxExchange : rowType = RecordType(ANY > T0¦¦*, ANY cs_quantity, ANY cs_wholesale_cost, ANY > 
E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 1.439980416E9, cumulative cost = > {4.319941248E9 rows, 1.1519843328E10 cpu, 0.0 io, 0.0 network, 0.0 memory}, > id = 342 > 03-01 Project(T0¦¦*=[$0], cs_quantity=[$1], > cs_wholesale_cost=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($2, > hash32AsDouble($1))]) : rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, ANY > cs_wholesale_cost, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 1.439980416E9, > cumulative cost = {2.879960832E9 rows, 1.0079862912E10 cpu, 0.0 io, 0.0 > network, 0.0 memory}, id = 341 > 03-02Project(T0¦¦*=[$0], cs_quantity=[$1], > cs_wholesale_cost=[$2]) : rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, > ANY cs_wholesale_cost): rowcount = 1.439980416E9, cumulative cost = > {1.439980416E9 rows, 4.319941248E9 cpu, 0.0 io, 0.0 network, 0.0 memory}, id > = 340 > 03-03 Scan(groupscan=[ParquetGroupScan > [entries=[ReadEntryWithPath > [path=maprfs:///drill/testdata/tpcds/parquet/sf1000/catalog_sales]], > selectionRoot=maprfs:/drill/t
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242706#comment-16242706 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149477233 --- Diff: exec/jdbc/src/test/java/org/apache/drill/jdbc/StatementTest.java --- @@ -61,55 +71,129 @@ public static void tearDownStatement() throws SQLException { // // getQueryTimeout(): - /** Tests that getQueryTimeout() indicates no timeout set. */ + /** + * Test for reading of default query timeout + */ @Test - public void testGetQueryTimeoutSaysNoTimeout() throws SQLException { -assertThat( statement.getQueryTimeout(), equalTo( 0 ) ); + public void testDefaultGetQueryTimeout() throws SQLException { +Statement stmt = connection.createStatement(); +int timeoutValue = stmt.getQueryTimeout(); +assert( 0 == timeoutValue ); --- End diff -- +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
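For context on why the review flags the bare assert: the Java assert keyword is silently skipped unless the JVM runs with -ea, while a JUnit assertion always executes and reports expected versus actual on failure. A standalone example, with the timeout value hard-coded in place of stmt.getQueryTimeout():
{code}
import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class AssertStyleSketch {
  @Test
  public void defaultTimeoutIsZero() {
    int timeoutValue = 0;            // stands in for stmt.getQueryTimeout()
    // assert (0 == timeoutValue);   // may never run, and gives only a bare AssertionError
    assertEquals(0, timeoutValue);   // always runs, reports expected vs actual on failure
  }
}
{code}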
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242701#comment-16242701 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149476820 --- Diff: exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java --- @@ -237,6 +245,127 @@ public String toString() { } } + /** + * Test for reading of default query timeout + */ + @Test + public void testDefaultGetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +int timeoutValue = stmt.getQueryTimeout(); +assert( 0 == timeoutValue ); + } + + /** + * Test Invalid parameter by giving negative timeout + */ + @Test ( expected = InvalidParameterSqlException.class ) + public void testInvalidSetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +//Setting negative value +int valueToSet = -10; +if (0L == valueToSet) { + valueToSet--; +} +try { + stmt.setQueryTimeout(valueToSet); +} catch ( final Exception e) { + // TODO: handle exception + assertThat( e.getMessage(), containsString( "illegal timeout value") ); + //Converting this to match expected Exception + throw new InvalidParameterSqlException(e.getMessage()); +} + } + + /** + * Test setting a valid timeout + */ + @Test + public void testValidSetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +//Setting positive value +int valueToSet = new Random(System.currentTimeMillis()).nextInt(60); +if (0L == valueToSet) { + valueToSet++; +} +stmt.setQueryTimeout(valueToSet); +assert( valueToSet == stmt.getQueryTimeout() ); + } + + /** + * Test setting timeout as zero and executing + */ + @Test + public void testSetQueryTimeoutAsZero() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_RANDOM_SQL); +stmt.setQueryTimeout(0); +stmt.executeQuery(); +ResultSet rs = stmt.getResultSet(); +int rowCount = 0; +while (rs.next()) { + rs.getBytes(1); + rowCount++; +} +stmt.close(); +assert( 3 == rowCount ); + } + + /** + * Test setting timeout for a query that actually times out + */ + @Test ( expected = SQLTimeoutException.class ) + public void testTriggeredQueryTimeout() throws SQLException { +PreparedStatement stmt = null; +//Setting to a very low value (3sec) +int timeoutDuration = 3; +int rowsCounted = 0; +try { + stmt = connection.prepareStatement(SYS_RANDOM_SQL); + stmt.setQueryTimeout(timeoutDuration); + System.out.println("Set a timeout of "+ stmt.getQueryTimeout() +" seconds"); --- End diff -- I think I previously came across some unit tests that are using System.out instead of logger, so i figured there wasn't any preference. Logger is probably the cleaner way of doing things. +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. 
> at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
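On the logger-versus-System.out point agreed above, a minimal slf4j sketch of what the test output line could look like; the class name and message are illustrative, and slf4j is assumed only because Drill's code base already logs through it.
{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class TimeoutTestLoggingSketch {
  private static final Logger logger = LoggerFactory.getLogger(TimeoutTestLoggingSketch.class);

  // Parameterized logging instead of string concatenation on System.out.
  void reportTimeout(int seconds) {
    logger.info("Set a timeout of {} seconds", seconds);
  }
}
{code}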
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242697#comment-16242697 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149476642 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillPreparedStatementImpl.java --- @@ -61,8 +65,14 @@ protected DrillPreparedStatementImpl(DrillConnectionImpl connection, if (preparedStatementHandle != null) { ((DrillColumnMetaDataList) signature.columns).updateColumnMetaData(preparedStatementHandle.getColumnsList()); } +//Implicit query timeout +this.queryTimeoutInSeconds = 0; +this.elapsedTimer = Stopwatch.createUnstarted(); --- End diff -- not even true for a statement: you can execute multiple queries, but the previous resultset will be closed and a new cursor created... > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242700#comment-16242700 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149476740 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java --- @@ -66,11 +70,27 @@ private final DrillConnectionImpl connection; private volatile boolean hasPendingCancelationNotification = false; + private Stopwatch elapsedTimer; + + private int queryTimeoutInSeconds; + DrillResultSetImpl(AvaticaStatement statement, Meta.Signature signature, ResultSetMetaData resultSetMetaData, TimeZone timeZone, Meta.Frame firstFrame) { super(statement, signature, resultSetMetaData, timeZone, firstFrame); connection = (DrillConnectionImpl) statement.getConnection(); +try { + if (statement.getQueryTimeout() > 0) { +queryTimeoutInSeconds = statement.getQueryTimeout(); + } +} catch (Exception e) { + e.printStackTrace(); --- End diff -- I think so > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5942) Drill Resource Management
[ https://issues.apache.org/jira/browse/DRILL-5942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242689#comment-16242689 ] Timothy Farkas commented on DRILL-5942: --- My initial analysis of this issue is as follows: This is a very complex issue which will take a considerable amount of time to solve correctly. Rehashing the points that Paul has already mentioned in various discussions I think there are two main Phases this would need to be tackled in. h2. Phase 1: Running Queries Non-Concurrently Without Running Out of Memory h3. Goal The goal here would be to run one query at a time successfully in all cases. I think this is possible to achieve with incremental improvements to the existing architecture. *Note:* I think achieving *Phase 2* will require significant changes to Drill's architecture. h3. Tasks In order to avoid out of memory exceptions for the single query case, it is necessary and sufficient to have solutions for all of these sub-tasks. * Make each operator memory aware. Given a specific memory budget each operator must be capable of obeying it. All the operators need to be analyze and made memory aware. * *Relevant Pending Work:* The HashJoin work Boaz is doing. * Account for the memory used by Drillbit to Drillbit communication. Currently exchanges use buffers on both the sending and recieving drill bits. These buffers can use a significant amount of memory. We would have to be able to set a limit on the amount of memory used by these buffers and make the exchange operations smart enough to obey the limit. * *Relevant Pending Work:* The exchange operator work Vlad is doing. * Make drill aware of the amount of direct memory allocated to the jvm. This is necessary because the Drillibit needs to know how much memory it has available to allocate to operators, and buffers. * Control batch sizes. We cannot effectively obey memory limits in the operators and buffers unless we can bound the size of batches. * *Relevant Pending Work:* The work Paul is doing to limit batch sizes. * Once everything above is satisfied, we need to test the Parallelizer code to make sure it doesn't overallocate memory. I believe there are cases where incorrect configuration can cause the Parallelizers to grossly over allocate memory. h2. Phase 2: Running Multiple Concurrently Without Running Out of Memory h3. Goal The goal here would be to run multiple concurrent queries successfully in all cases. However, before we can even think about *Phase 2* we must first solve the issues outlined in *Phase 1*. h3. Theory This will require significant changes to Drill's architecture. This is because we will have to solve the problem of distributed resource management in order to effectively allocate resources to concurrent queries without exceeding the resources we have in our cluster. h4. Desired State The good news is that existing cluster managers like YARN already solve this problem for batch jobs by doing the following: # The amount of available memory and cpu cores is reported to a resource manager. # New jobs are submitted to the resource manager. A job includes a description of all the containers it needs to run. Each container description also includes the amount of memory and cpu it will need. # The resource manager places the job in a Queue. # The resource manager uses a scheduler to prioritize jobs in the Queue. # A job with high priority is scheduled to run on the cluster if and only if the cluster has enough resources to execute the job. 
# When a job is deployed to the cluster, the remaining unused resources on the cluster are updated appropriately. h4. Current State Currently Drill does not do any of this. Instead it does the following: # A query is sent to Drill. # A foreman is created for the query. # The query is planned. # During planning, each query assumes it has access to all of the cluster resources and is oblivious to any other queries running on the cluster. Because of this the following issues can occur: * Queries sporadically run out of memory when too many concurrent queries are run. For example, assume we have a cluster with 100 Gb of memory. Let's run *Query A* and assume it consumes 80Gb of memory on the cluster. While *Query A* is running let's try to run *Query B*. During the planning of *Query B* Drill is completely unaware that *Query A* is running and assumes that *Query B* has the full 100 GB at it's disposal. So Drill may launch *Query B* and give it 80 GB. Now there are two queries with 80 GB allocated to each of them for a total of 160 GB when the cluster only has 100 GB. * Even if we try to have some smart heuristics to avoid the first issue, a resource deadlock can occur. For example *Query A* could be partially deployed to the cluster and take up half of the cluster resources, similarly *Query B* could be partially deplo
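An illustrative sketch of the YARN-style admission control described in the comment above: a query is admitted only if its declared memory and CPU needs fit in what the cluster still has free, otherwise it waits in a queue. All names and the resource model here are invented for the sketch and are not part of Drill; a real scheduler would also handle priorities, preemption, and per-node accounting.
{code}
import java.util.ArrayDeque;
import java.util.Queue;

public class AdmissionControlSketch {

  static final class ResourceRequest {
    final String queryId;
    final long memoryBytes;
    final int cpuCores;
    ResourceRequest(String queryId, long memoryBytes, int cpuCores) {
      this.queryId = queryId;
      this.memoryBytes = memoryBytes;
      this.cpuCores = cpuCores;
    }
  }

  private long freeMemoryBytes;
  private int freeCpuCores;
  private final Queue<ResourceRequest> pending = new ArrayDeque<>();

  AdmissionControlSketch(long totalMemoryBytes, int totalCpuCores) {
    this.freeMemoryBytes = totalMemoryBytes;
    this.freeCpuCores = totalCpuCores;
  }

  /** Admit the request now if it fits; otherwise queue it instead of over-committing memory. */
  synchronized boolean submit(ResourceRequest request) {
    if (request.memoryBytes <= freeMemoryBytes && request.cpuCores <= freeCpuCores) {
      freeMemoryBytes -= request.memoryBytes;   // reserve before the query starts
      freeCpuCores -= request.cpuCores;
      return true;                              // caller may launch the query
    }
    pending.add(request);
    return false;
  }

  /** Called when a query finishes: release its reservation and admit the queue head if it now fits. */
  synchronized void release(ResourceRequest request) {
    freeMemoryBytes += request.memoryBytes;
    freeCpuCores += request.cpuCores;
    ResourceRequest next = pending.peek();
    if (next != null
        && next.memoryBytes <= freeMemoryBytes && next.cpuCores <= freeCpuCores) {
      pending.remove();
      freeMemoryBytes -= next.memoryBytes;
      freeCpuCores -= next.cpuCores;
      // In a real scheduler the dequeued query would now be launched.
    }
  }
}
{code}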
[jira] [Updated] (DRILL-5942) Drill Resource Management
[ https://issues.apache.org/jira/browse/DRILL-5942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Farkas updated DRILL-5942: -- Description: OutOfMemoryExceptions still occur in Drill. This ticket addresses a plan for what it would take to ensure all queries are able to execute on Drill without running out of memory. (was: OutOfMemoryExceptions still occur in Drill. This ticket address a plan for what it would take to ensure all queries are able to execute on Drill without running out of memory.) > Drill Resource Management > - > > Key: DRILL-5942 > URL: https://issues.apache.org/jira/browse/DRILL-5942 > Project: Apache Drill > Issue Type: Bug >Reporter: Timothy Farkas > > OutOfMemoryExceptions still occur in Drill. This ticket addresses a plan for > what it would take to ensure all queries are able to execute on Drill without > running out of memory. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242695#comment-16242695 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149476190 --- Diff: exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java --- @@ -237,6 +245,127 @@ public String toString() { } } + /** + * Test for reading of default query timeout + */ + @Test + public void testDefaultGetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +int timeoutValue = stmt.getQueryTimeout(); +assert( 0 == timeoutValue ); + } + + /** + * Test Invalid parameter by giving negative timeout + */ + @Test ( expected = InvalidParameterSqlException.class ) + public void testInvalidSetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +//Setting negative value +int valueToSet = -10; +if (0L == valueToSet) { + valueToSet--; +} +try { + stmt.setQueryTimeout(valueToSet); +} catch ( final Exception e) { + // TODO: handle exception + assertThat( e.getMessage(), containsString( "illegal timeout value") ); + //Converting this to match expected Exception + throw new InvalidParameterSqlException(e.getMessage()); +} + } + + /** + * Test setting a valid timeout + */ + @Test + public void testValidSetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +//Setting positive value +int valueToSet = new Random(System.currentTimeMillis()).nextInt(60); --- End diff -- I am trying to add some randomness to the test parameters, since the expected behaviour should be the same. I'll fix this up and get rid of that check. +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242681#comment-16242681 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149474798 --- Diff: exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java --- @@ -237,6 +245,127 @@ public String toString() { } } + /** + * Test for reading of default query timeout + */ + @Test + public void testDefaultGetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +int timeoutValue = stmt.getQueryTimeout(); +assert( 0 == timeoutValue ); + } + + /** + * Test Invalid parameter by giving negative timeout + */ + @Test ( expected = InvalidParameterSqlException.class ) + public void testInvalidSetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +//Setting negative value +int valueToSet = -10; +if (0L == valueToSet) { --- End diff -- My bad. The original code would assign the negation of a random integer.,.. hence the check for 0L and followed by a decrement. +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242680#comment-16242680 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149474455 --- Diff: exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java --- @@ -237,6 +245,127 @@ public String toString() { } } + /** + * Test for reading of default query timeout + */ + @Test + public void testDefaultGetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); --- End diff -- +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242688#comment-16242688 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149475535 --- Diff: exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java --- @@ -237,6 +245,127 @@ public String toString() { } } + /** + * Test for reading of default query timeout + */ + @Test + public void testDefaultGetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +int timeoutValue = stmt.getQueryTimeout(); +assert( 0 == timeoutValue ); + } + + /** + * Test Invalid parameter by giving negative timeout + */ + @Test ( expected = InvalidParameterSqlException.class ) + public void testInvalidSetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +//Setting negative value +int valueToSet = -10; +if (0L == valueToSet) { + valueToSet--; +} +try { + stmt.setQueryTimeout(valueToSet); +} catch ( final Exception e) { + // TODO: handle exception + assertThat( e.getMessage(), containsString( "illegal timeout value") ); + //Converting this to match expected Exception + throw new InvalidParameterSqlException(e.getMessage()); --- End diff -- Wanted to make sure that the unit test also reports the correct exception. This only rewraps the thrown SQLException to an InvalidParameterSqlException for JUnit to confirm. > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242686#comment-16242686 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149475115 --- Diff: exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java --- @@ -237,6 +245,127 @@ public String toString() { } } + /** + * Test for reading of default query timeout + */ + @Test + public void testDefaultGetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +int timeoutValue = stmt.getQueryTimeout(); +assert( 0 == timeoutValue ); + } + + /** + * Test Invalid parameter by giving negative timeout + */ + @Test ( expected = InvalidParameterSqlException.class ) + public void testInvalidSetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +//Setting negative value +int valueToSet = -10; +if (0L == valueToSet) { + valueToSet--; +} +try { + stmt.setQueryTimeout(valueToSet); +} catch ( final Exception e) { --- End diff -- Yes, it should be. Might be a legacy code. Will fix it. +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (DRILL-5942) Drill Resource Management
Timothy Farkas created DRILL-5942: - Summary: Drill Resource Management Key: DRILL-5942 URL: https://issues.apache.org/jira/browse/DRILL-5942 Project: Apache Drill Issue Type: Bug Reporter: Timothy Farkas OutOfMemoryExceptions still occur in Drill. This ticket addresses a plan for what it would take to ensure all queries are able to execute on Drill without running out of memory. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242671#comment-16242671 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149473521 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java --- @@ -66,11 +70,27 @@ private final DrillConnectionImpl connection; private volatile boolean hasPendingCancelationNotification = false; + private Stopwatch elapsedTimer; + + private int queryTimeoutInSeconds; + DrillResultSetImpl(AvaticaStatement statement, Meta.Signature signature, ResultSetMetaData resultSetMetaData, TimeZone timeZone, Meta.Frame firstFrame) { super(statement, signature, resultSetMetaData, timeZone, firstFrame); connection = (DrillConnectionImpl) statement.getConnection(); +try { + if (statement.getQueryTimeout() > 0) { +queryTimeoutInSeconds = statement.getQueryTimeout(); + } +} catch (Exception e) { + e.printStackTrace(); --- End diff -- I guess I was not sure what I should do if `getQueryTimeout()` threw an Exception. I didn't want to lose the stack trace. Should I just ignore it? > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
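A minimal sketch of the alternative discussed in the comment above: instead of calling e.printStackTrace() in the constructor, the timeout read could be logged and defaulted to "no timeout". This is not the actual DrillResultSetImpl code; the class, logger, and method names below are illustrative assumptions.
{code}
import java.sql.SQLException;
import java.sql.Statement;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class QueryTimeoutInit {
  private static final Logger logger = LoggerFactory.getLogger(QueryTimeoutInit.class);

  /** Returns the statement's timeout in seconds, or 0 (meaning "no timeout") if it cannot be read. */
  static int readTimeoutSeconds(Statement statement) {
    try {
      return statement.getQueryTimeout();   // may throw SQLException
    } catch (SQLException e) {
      // Keep the stack trace by logging it instead of printing to stderr.
      logger.warn("Could not read the query timeout; defaulting to no timeout", e);
      return 0;
    }
  }
}
{code}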
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242674#comment-16242674 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149473999 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java --- @@ -96,6 +117,13 @@ private void throwIfClosed() throws AlreadyClosedSqlException, throw new AlreadyClosedSqlException( "ResultSet is already closed." ); } } + +//Query Timeout Check. The timer has already been started by the DrillCursor at this point --- End diff -- This code block gets touched even if there is no timeout set, hence the check to implicitly confirm if there is a timeout set. > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242667#comment-16242667 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149472887 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java --- @@ -66,11 +70,27 @@ private final DrillConnectionImpl connection; private volatile boolean hasPendingCancelationNotification = false; + private Stopwatch elapsedTimer; --- End diff -- +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242666#comment-16242666 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149472806 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillPreparedStatementImpl.java --- @@ -61,8 +65,14 @@ protected DrillPreparedStatementImpl(DrillConnectionImpl connection, if (preparedStatementHandle != null) { ((DrillColumnMetaDataList) signature.columns).updateColumnMetaData(preparedStatementHandle.getColumnsList()); } +//Implicit query timeout +this.queryTimeoutInSeconds = 0; +this.elapsedTimer = Stopwatch.createUnstarted(); --- End diff -- I thought the Statement and Cursor had a 1:1 relationship, so they can share the timer. I guess for a PreparedStatement I cannot make that assumption. Will fix this. +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242655#comment-16242655 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149471937 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillPreparedStatementImpl.java --- @@ -46,6 +48,8 @@ DrillRemoteStatement { private final PreparedStatement preparedStatementHandle; + int queryTimeoutInSeconds = 0; --- End diff -- Ah.. i missed this during the clean up. Thanks! +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242645#comment-16242645 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149471249 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -100,13 +103,17 @@ final LinkedBlockingDeque batchQueue = Queues.newLinkedBlockingDeque(); +private final DrillCursor parent; --- End diff -- Stopwatch seemed a convenient way of visualizing a timer object that is passed between different JDBC entities, and also provides a clean way of specifying elapsed time, etc. > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242617#comment-16242617 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149467572 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -376,6 +415,19 @@ synchronized void cleanup() { currentBatchHolder.clear(); } + //Set the cursor's timeout in seconds --- End diff -- We do get the timeout value from the Statement (Ref: https://github.com/kkhatua/drill/blob/a008707c7b97ea95700ab0f2eb5182d779a9bcb3/exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java#L372 ) However, the Statement is referred to by the ResultSet object as well, to get a handle on the timer object. During testing, I found that there is a possibility that the DrillCursor completes fetching all batches, but a slow client would call ResultSet.next() slowly and time out. The ResultSet object has no reference to the timer, except via the Statement object. There is a bigger problem that this block of code fixes. During iteration, we don't want to be able to change the timeout period. Hence, the DrillCursor (invoked by the _first_ `ResultSet.next()` call) will be initialized and will start the timer ticking. Thereafter, any attempt to change the timeout is ignored. > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
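A minimal sketch, under assumed names, of the latching behaviour described above: the timer starts on the first ResultSet.next() call and later timeout changes are ignored. This is not the actual DrillCursor implementation, only an illustration of the design using Guava's Stopwatch.
{code}
import java.util.concurrent.TimeUnit;
import com.google.common.base.Stopwatch;

class TimeoutTracker {
  private final Stopwatch elapsedTimer = Stopwatch.createUnstarted();
  private int timeoutSeconds;                 // 0 means "no timeout"

  /** Called on the first next(); once the timer is running, later timeout changes are ignored. */
  void startIfNeeded(int requestedTimeoutSeconds) {
    if (!elapsedTimer.isRunning()) {
      timeoutSeconds = requestedTimeoutSeconds;
      elapsedTimer.start();
    }
  }

  /** True once the latched timeout (if any) has elapsed. */
  boolean expired() {
    return timeoutSeconds > 0
        && elapsedTimer.elapsed(TimeUnit.SECONDS) >= timeoutSeconds;
  }
}
{code}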
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242606#comment-16242606 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149465309 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -239,6 +259,11 @@ QueryDataBatch getNext() throws UserException, InterruptedException { } return qdb; } + + // Check and throw SQLTimeoutException + if ( parent.timeoutInSeconds > 0 && parent.elapsedTimer.elapsed(TimeUnit.SECONDS) >= parent.timeoutInSeconds ) { --- End diff -- you don't really need a check after the poll: if the result is not null, it means it completed before the timeout and you can proceed forward. If it is null, then you would loop, redo the check based on the current time, and might be able to throw a timeout exception > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242596#comment-16242596 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149463638 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -239,6 +259,11 @@ QueryDataBatch getNext() throws UserException, InterruptedException { } return qdb; } + + // Check and throw SQLTimeoutException + if ( parent.timeoutInSeconds > 0 && parent.elapsedTimer.elapsed(TimeUnit.SECONDS) >= parent.timeoutInSeconds ) { --- End diff -- Good point, and I thought it might help avoid going into polling altogether. However, the granularity of the timeout is in seconds, so 50ms is insignificant. If I do a check before the poll, I'd need to do one after the poll as well, over a 50ms window. So a post-poll check works fine, because we'll exceed the timeout by at most 50ms. A timeout of 1sec would thus occur in 1.05sec; for any larger timeout values, the 50ms is of diminishing significance. > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
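A minimal sketch of the post-poll check discussed in this thread, using assumed names and a generic queue rather than Drill's internal batch queue. Because the poll slice is 50 ms and the timeout granularity is seconds, the overshoot is bounded by roughly one slice.
{code}
import java.sql.SQLTimeoutException;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;
import com.google.common.base.Stopwatch;

final class PollWithTimeout {
  /** Polls in 50 ms slices; the timeout is checked only after an empty poll. */
  static <T> T pollUntilTimeout(LinkedBlockingDeque<T> queue,
                                Stopwatch elapsedTimer,
                                int timeoutSeconds)
      throws InterruptedException, SQLTimeoutException {
    while (true) {
      T batch = queue.poll(50, TimeUnit.MILLISECONDS);
      if (batch != null) {
        return batch;                  // arrived before the timeout fired
      }
      if (timeoutSeconds > 0
          && elapsedTimer.elapsed(TimeUnit.SECONDS) >= timeoutSeconds) {
        throw new SQLTimeoutException("Query timed out after " + timeoutSeconds + " seconds");
      }
    }
  }
}
{code}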
[jira] [Commented] (DRILL-5717) change some date time unit cases with specific timezone or Local
[ https://issues.apache.org/jira/browse/DRILL-5717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242447#comment-16242447 ] ASF GitHub Bot commented on DRILL-5717: --- Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/904#discussion_r149440392 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/fn/impl/testing/TestDateConversions.java --- @@ -225,4 +243,16 @@ public void testPostgresDateFormatError() throws Exception { throw e; } } + + /** + * mock current locale to US + */ + private void mockUSLocale() { --- End diff -- Please move this method into ExecTest class and make it public and static. > change some date time unit cases with specific timezone or Local > > > Key: DRILL-5717 > URL: https://issues.apache.org/jira/browse/DRILL-5717 > Project: Apache Drill > Issue Type: Bug > Components: Tools, Build & Test >Affects Versions: 1.9.0, 1.11.0 >Reporter: weijie.tong > > Some date time test cases like JodaDateValidatorTest is not Local > independent .This will cause other Local's users's test phase to fail. We > should let these test cases to be Local env independent. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5717) change some date time unit cases with specific timezone or Local
[ https://issues.apache.org/jira/browse/DRILL-5717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242445#comment-16242445 ] ASF GitHub Bot commented on DRILL-5717: --- Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/904#discussion_r149440034 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/fn/interp/TestConstantFolding.java --- @@ -117,6 +123,13 @@ public void createFiles(int smallFileLines, int bigFileLines) throws Exception{ @Test public void testConstantFolding_allTypes() throws Exception { +new MockUp() { --- End diff -- I meant to use the same method in all tests where Locale should be mocked. Please replace it. > change some date time unit cases with specific timezone or Local > > > Key: DRILL-5717 > URL: https://issues.apache.org/jira/browse/DRILL-5717 > Project: Apache Drill > Issue Type: Bug > Components: Tools, Build & Test >Affects Versions: 1.9.0, 1.11.0 >Reporter: weijie.tong > > Some date time test cases like JodaDateValidatorTest is not Local > independent .This will cause other Local's users's test phase to fail. We > should let these test cases to be Local env independent. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5717) change some date time unit cases with specific timezone or Local
[ https://issues.apache.org/jira/browse/DRILL-5717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242446#comment-16242446 ] ASF GitHub Bot commented on DRILL-5717: --- Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/904#discussion_r149440814 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/fn/impl/TestCastFunctions.java --- @@ -77,16 +82,23 @@ public void testCastByConstantFolding() throws Exception { @Test // DRILL-3769 public void testToDateForTimeStamp() throws Exception { -final String query = "select to_date(to_timestamp(-1)) as col \n" + -"from (values(1))"; +new MockUp() { --- End diff -- This mock is used twice. So let's also move the code into the separate method. > change some date time unit cases with specific timezone or Local > > > Key: DRILL-5717 > URL: https://issues.apache.org/jira/browse/DRILL-5717 > Project: Apache Drill > Issue Type: Bug > Components: Tools, Build & Test >Affects Versions: 1.9.0, 1.11.0 >Reporter: weijie.tong > > Some date time test cases like JodaDateValidatorTest is not Local > independent .This will cause other Local's users's test phase to fail. We > should let these test cases to be Local env independent. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
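A minimal sketch, assuming JMockit (which the MockUp usages in the diffs above suggest), of what a shared, public static locale-mocking helper could look like once moved into a common test base class. The class and method names are illustrative assumptions, not the actual ExecTest code.
{code}
import java.util.Locale;
import mockit.Mock;
import mockit.MockUp;

public class LocaleMockSketch {
  /** Forces Locale.getDefault() to return Locale.US for the duration of a test. */
  public static void mockUsLocale() {
    new MockUp<Locale>() {
      @Mock
      Locale getDefault() {
        return Locale.US;
      }
    };
  }
}
{code}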
[jira] [Commented] (DRILL-5106) Refactor SkipRecordsInspector to exclude check for predefined file formats
[ https://issues.apache.org/jira/browse/DRILL-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242279#comment-16242279 ] Arina Ielchiieva commented on DRILL-5106: - The following improvements will be implemented in the scope of DRILL-5941: a. fileFormats will be removed from skip records inspector; b. skip header count logic will be applied only once during reader initialization; c. when skip footer won't be required, default processing will be done without buffering data in queue. > Refactor SkipRecordsInspector to exclude check for predefined file formats > -- > > Key: DRILL-5106 > URL: https://issues.apache.org/jira/browse/DRILL-5106 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Hive >Affects Versions: 1.9.0 >Reporter: Arina Ielchiieva >Assignee: Arina Ielchiieva >Priority: Minor > > After changes introduced in DRILL-4982, SkipRecordInspector is used only for > predefined formats (using hasHeaderFooter: false / true). But > SkipRecordInspector has its own check for formats where skip strategy can be > applied. Acceptable file formats are stored in private final Set > fileFormats and initialized in constructor, currently it contains only one > format - TextInputFormat. Now this check is redundant and may lead to > ignoring hasHeaderFooter setting to true for any other format except of Text. > To do: > 1. remove private final Set fileFormats > 2. remove if block from SkipRecordsInspector.retrievePositiveIntProperty: > {code} > if > (!fileFormats.contains(tableProperties.get(hive_metastoreConstants.FILE_INPUT_FORMAT))) > { > return propertyIntValue; > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
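A minimal sketch of point (b) above, skipping the header only once during reader initialization; the names are assumptions and this is not the DRILL-5941 patch itself.
{code}
import java.io.BufferedReader;
import java.io.IOException;

final class HeaderSkipOnInit {
  /** Skips the configured number of header lines exactly once, at reader setup time. */
  static void skipHeader(BufferedReader reader, int headerLineCount) throws IOException {
    for (int i = 0; i < headerLineCount; i++) {
      if (reader.readLine() == null) {
        break;                         // file shorter than the declared header
      }
    }
  }
}
{code}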
[jira] [Created] (DRILL-5941) Skip header / footer logic works incorrectly for Hive tables when file has several input splits
Arina Ielchiieva created DRILL-5941: --- Summary: Skip header / footer logic works incorrectly for Hive tables when file has several input splits Key: DRILL-5941 URL: https://issues.apache.org/jira/browse/DRILL-5941 Project: Apache Drill Issue Type: Bug Components: Storage - Hive Affects Versions: 1.11.0 Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.12.0 *To reproduce* 1. Create a csv file with two columns (key, value) and 329 rows, where the first row is a header. The data file size should be greater than the chunk size of 256 MB. Copy the file to the distributed file system. 2. Create a table in Hive: {noformat} CREATE EXTERNAL TABLE `h_table`( `key` bigint, `value` string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 'maprfs:/tmp/h_table' TBLPROPERTIES ( 'skip.header.line.count'='1'); {noformat} 3. Execute query {{select * from hive.h_table}} in Drill (query data using Hive plugin). The result will return fewer rows than expected. The expected result is 328 (total count minus one header row). *The root cause* Since the file is greater than the default chunk size, it's split into several fragments, known as input splits. For example: {noformat} maprfs:/tmp/h_table/h_table.csv:0+268435456 maprfs:/tmp/h_table/h_table.csv:268435457+492782112 {noformat} TextHiveReader is responsible for handling skip header and / or footer logic. Currently Drill creates a reader [for each input split|https://github.com/apache/drill/blob/master/contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveScanBatchCreator.java#L84] and skip header and / or footer logic is applied to each input split, though ideally the above mentioned input splits should be read by one reader, so that skip header / footer logic is applied correctly. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
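A minimal sketch of the direction implied by the root cause analysis above: group the input splits that belong to one file so that a single reader (and a single application of the skip header / footer logic) can cover them. The class and method names are illustrative assumptions, not the actual fix.
{code}
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SplitGrouper {
  /** Groups split descriptions of the form "path:start+length" by their file path. */
  static Map<String, List<String>> groupByFile(List<String> splits) {
    Map<String, List<String>> byFile = new LinkedHashMap<>();
    for (String split : splits) {
      // The last ':' separates the file path from the "start+length" part.
      String path = split.substring(0, split.lastIndexOf(':'));
      byFile.computeIfAbsent(path, k -> new ArrayList<>()).add(split);
    }
    return byFile;
  }
}
{code}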
[jira] [Commented] (DRILL-4779) Kafka storage plugin support
[ https://issues.apache.org/jira/browse/DRILL-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242170#comment-16242170 ] B Anil Kumar commented on DRILL-4779: - For Avro support we have raised a separate ticket https://issues.apache.org/jira/browse/DRILL-5940 > Kafka storage plugin support > > > Key: DRILL-4779 > URL: https://issues.apache.org/jira/browse/DRILL-4779 > Project: Apache Drill > Issue Type: New Feature > Components: Storage - Other >Affects Versions: 1.11.0 >Reporter: B Anil Kumar >Assignee: B Anil Kumar > Labels: doc-impacting > Fix For: 1.12.0 > > > Implement Kafka storage plugin will enable the strong SQL support for Kafka. > Initially implementation can target for supporting json and avro message types -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (DRILL-5940) Avro with schema registry support for Kafka
B Anil Kumar created DRILL-5940: --- Summary: Avro with schema registry support for Kafka Key: DRILL-5940 URL: https://issues.apache.org/jira/browse/DRILL-5940 Project: Apache Drill Issue Type: New Feature Components: Storage - Other Reporter: B Anil Kumar Assignee: Bhallamudi Venkata Siva Kamesh Support Avro messages with Schema registry for Kafka storage plugin -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-4779) Kafka storage plugin support
[ https://issues.apache.org/jira/browse/DRILL-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242157#comment-16242157 ] ASF GitHub Bot commented on DRILL-4779: --- GitHub user akumarb2010 opened a pull request: https://github.com/apache/drill/pull/1027 DRILL-4779 : Kafka storage plugin This PR contains Kafka support with JSON message format. You can merge this pull request into a Git repository by running: $ git pull https://github.com/akumarb2010/incubator-drill master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1027.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1027 commit f3397816ad07f85a08f53b964213ecf9f96b56b8 Author: Batchu Date: 2016-12-20T16:28:40Z Starting on kafka module commit f5c9ff3f0863cf743e922fc6c39ccfebc4b7df4d Author: Batchu Date: 2016-12-26T21:16:19Z Initial Kafka integration code commit 8d7403f49f1acb601e6d90d1a2cb640a4df0e05c Author: Batchu Date: 2016-12-26T23:41:05Z Kafka plugin src clean up commit 08992ea73c239e8b6996436b7984e87ee12b534d Author: Anil Kumar Batchu Date: 2016-12-26T23:45:19Z Initial Kafka plugin code base commit 8ec1e187efcdd3123db03fda7c0e038503246a46 Author: Anil Kumar Batchu Date: 2017-01-08T06:07:06Z Initial Kafka plugin code base commit a11af5542079b891975c95d4b2c6b562e7a61386 Author: Anil Kumar Batchu Date: 2017-01-08T06:22:12Z Initial Kafka plugin code base commit e22d10e82857888e7fb9a723b7781a20d6581d69 Author: Anil Kumar Batchu Date: 2017-03-04T06:40:25Z Initial Kafka plugin code base issues commit 990f479db43156dad296033ca57ee4cf0f497b69 Author: Anil Kumar Batchu Date: 2017-03-04T06:48:00Z Initial Kafka plugin code base issues commit 5d565c4d843373f8c168c9255d108140677e58b0 Author: Venkata Siva Kamesh Date: 2017-03-04T11:00:30Z Updating Kafka storage plugin and its affected classes, adding schema related classes commit aec67b863da49293d96a53d5332998ffaf890115 Author: Venkata Siva Kamesh Date: 2017-03-04T11:07:17Z Adding license commit 84e7eaa0de6ae90244df1fb4960d97876076ffc6 Author: Venkata Siva Kamesh Date: 2017-03-04T11:34:40Z Adding bootstrap-storage-plugins.json and drill-module.conf commit 0fa26c856996474442c2deb038226e1765c67ac4 Author: Venkata Siva Kamesh Date: 2017-03-04T13:18:49Z Cleaning pom and adding kafka-storage in bin.xml commit 2c96e9ff9e0c68ae76e9b9b16b712b0f5a7fcea8 Author: Venkata Siva Kamesh Date: 2017-03-04T13:25:24Z fixing drill-module.conf by updating package to kafka commit a30640c60586b12b3c4ec6a6316fc6dab84a6d3f Author: Venkata Siva Kamesh Date: 2017-03-04T14:04:40Z fixing spelling mistakes commit ccd8de3f86e407852ea9ce9e032caf86542ea482 Author: Venkata Siva Kamesh Date: 2017-03-05T09:41:15Z Formatting the code using Apache eclipse formatter commit dcc018acac3cb07f2fe0d5c33a6c50c03e2ebc59 Author: Venkata Siva Kamesh Date: 2017-03-05T09:44:28Z moving Kafka schema factory to schema package commit 2987afbcfeb47cb7d7a29643015979d293ebcc72 Author: Venkata Siva Kamesh Date: 2017-03-05T15:20:03Z Adding message format in storage plugin config commit b6bc9c34aa7b2c9299a593e2138e0e9dc30edeb9 Author: Venkata Siva Kamesh Date: 2017-03-05T16:00:53Z updating storage-plugins.json, drill-module.conf and adding loggers commit 97f62d43da48a284de318cd21543f03c3f10e73f Author: Venkata Siva Kamesh Date: 2017-03-05T16:19:34Z Adding debug message commit 30ac2ae6b623ba5279e3085549e5fa5ad2a44a23 Author: Venkata Siva Kamesh Date: 
2017-03-05T16:26:53Z updating debug message commit fdd84d19dcfa03913a307d041c0cdf7421095e71 Author: akumarb2010 Date: 2017-03-05T16:29:19Z Adding avro support to kafka plugin commit d301f3ee07d16800ec54316d5dd48c398c09a1ec Author: akumarb2010 Date: 2017-03-05T16:29:44Z Merge branch 'master' of https://github.com/akumarb2010/incubator-drill commit c922ea9ea7ea31b252150f357e153f64edd8fbb5 Author: akumarb2010 Date: 2017-03-25T18:16:05Z Adding avro support to kafka plugin DRILL-4779 commit d00aa38cf51bec1dcdefa7f6e44f918ce761f912 Author: akumarb2010 Date: 2017-03-25T20:17:06Z KafkaRecordReader implementation DRILL-4779 commit 318ebc120bd0f2d4790a920806110db2494e1668 Author: akumarb2010 Date: 2017-03-26T05:15:29Z KafkaRecordReader implementation DRILL-4779 commit 0e142a9758223761124632dc844ba8c75bca913f Author: Venkata Siva Kamesh Date: 2017-04-02T06:33:06Z Adding storage config in Groupscan and fixing rat issues commit e797a221b799a154a4c8f55a590b12b18ed9f31c Author: akumarb2010 Date: 2017-04-02T06:43:12Z Checkstyle issues commit f700b4b1074fe7dfefc82c9895e7ad4041762d4b Author: akumarb2010 Date: 2017-04-02T
[jira] [Updated] (DRILL-4779) Kafka storage plugin support
[ https://issues.apache.org/jira/browse/DRILL-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] B Anil Kumar updated DRILL-4779: Description: Implement Kafka storage plugin will enable the strong SQL support for Kafka. Initially implementation can target for supporting json and avro message types was: Implement Kafka storage plugin will enable the strong SQL support for Kafka. Initially implementation can target for supporting text, json and avro message types > Kafka storage plugin support > > > Key: DRILL-4779 > URL: https://issues.apache.org/jira/browse/DRILL-4779 > Project: Apache Drill > Issue Type: New Feature > Components: Storage - Other >Affects Versions: 1.11.0 >Reporter: B Anil Kumar >Assignee: B Anil Kumar > Labels: doc-impacting > Fix For: 1.12.0 > > > Implement Kafka storage plugin will enable the strong SQL support for Kafka. > Initially implementation can target for supporting json and avro message types -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (DRILL-5939) NullPointerException in convert_toJSON function
Volodymyr Tkach created DRILL-5939: -- Summary: NullPointerException in convert_toJSON function Key: DRILL-5939 URL: https://issues.apache.org/jira/browse/DRILL-5939 Project: Apache Drill Issue Type: Bug Reporter: Volodymyr Tkach Priority: Minor The query: `select convert_toJSON(convert_fromJSON('\{"key": "value"\}')) from (values(1));` fails with exception. Although, when we apply it for the data from the file from disk it succeeds. select convert_toJSON(convert_fromJSON(columns\[0\])) from dfs.tmp.`some.csv`; Some.csv \{"key":"val"\},val2 {noformat} Fragment 0:0 [Error Id: 016ca995-16f9-4eab-83c2-7679071faad4 on userf206-pc:31010] org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: NullPointerException Fragment 0:0 [Error Id: 016ca995-16f9-4eab-83c2-7679071faad4 on userf206-pc:31010] at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586) ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:298) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:267) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_80] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_80] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80] Caused by: java.lang.NullPointerException: null at org.apache.drill.exec.expr.fn.DrillFuncHolder.addProtectedBlock(DrillFuncHolder.java:183) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.fn.DrillFuncHolder.generateBody(DrillFuncHolder.java:169) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.fn.DrillSimpleFuncHolder.renderEnd(DrillSimpleFuncHolder.java:86) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.EvaluationVisitor$EvalVisitor.visitFunctionHolderExpression(EvaluationVisitor.java:205) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.EvaluationVisitor$ConstantFilter.visitFunctionHolderExpression(EvaluationVisitor.java:1089) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.EvaluationVisitor$CSEFilter.visitFunctionHolderExpression(EvaluationVisitor.java:827) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.EvaluationVisitor$CSEFilter.visitFunctionHolderExpression(EvaluationVisitor.java:807) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.common.expression.FunctionHolderExpression.accept(FunctionHolderExpression.java:53) ~[drill-logical-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.EvaluationVisitor$EvalVisitor.visitValueVectorWriteExpression(EvaluationVisitor.java:362) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.EvaluationVisitor$EvalVisitor.visitUnknown(EvaluationVisitor.java:344) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.EvaluationVisitor$ConstantFilter.visitUnknown(EvaluationVisitor.java:1339) 
~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.EvaluationVisitor$CSEFilter.visitUnknown(EvaluationVisitor.java:1038) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.EvaluationVisitor$CSEFilter.visitUnknown(EvaluationVisitor.java:807) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.ValueVectorWriteExpression.accept(ValueVectorWriteExpression.java:64) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.EvaluationVisitor.addExpr(EvaluationVisitor.java:104) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.ClassGenerator.addExpr(ClassGenerator.java:335) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchemaFromInput(ProjectRecordBatch.java:476) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema(ProjectReco
[jira] [Created] (DRILL-5938) Write unit tests for math function with NaN and Infinity numbers
Volodymyr Tkach created DRILL-5938: -- Summary: Write unit tests for math function with NaN and Infinity numbers Key: DRILL-5938 URL: https://issues.apache.org/jira/browse/DRILL-5938 Project: Apache Drill Issue Type: Test Reporter: Volodymyr Tkach Priority: Minor Fix For: Future Drill math functions need to be covered with test cases where the input contains non-numeric numbers: NaN or Infinity. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242040#comment-16242040 ] Volodymyr Tkach edited comment on DRILL-5919 at 11/7/17 2:36 PM: - 1. Added two session options `store.json.reader.non_numeric_numbers` and `store.json.writer.non_numeric_numbers` that allow reading/writing `NaN` and `Infinity` as numbers. By default these options are set to false; 2. Extended the signature of `convert_toJSON` and `convert_fromJSON` functions by adding a second optional parameter that enables read/write of `NaN` and `Infinity`. For example: `select convert_fromJSON('\{"key": NaN\}') from (values(1));` will result in a JsonParseException, but `select convert_fromJSON('\{"key": NaN\}', true) from (values(1));` will parse `NaN` as a number. was (Author: volodymyr.tkach): Added two session options `store.json.reader.non_numeric_numbers` and `store.json.reader.non_numeric_numbers` that allow to read/write `NaN` and `Infinity` as numbers. By default these options are set to false; Also extended signature of `convert_toJSON` and `convert_fromJSON` functions by adding second optional parameter that enables read/write `NaN` and `Infinity`. For example: `select convert_fromJSON('\{"key": NaN\}') from (values(1));` will result with JsonParseException, but `select convert_fromJSON('\{"key": NaN\}', true) from (values(1));` will parse `NaN` as a number. > Add non-numeric support for JSON processing > --- > > Key: DRILL-5919 > URL: https://issues.apache.org/jira/browse/DRILL-5919 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - JSON >Affects Versions: 1.11.0 >Reporter: Volodymyr Tkach >Assignee: Volodymyr Tkach > Fix For: Future > > > Add session options to allow drill working with non standard json strings > number literals like: NaN, Infinity, -Infinity. By default these options will > be switched off, the user will be able to toggle them during working session. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
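A minimal JDBC usage sketch based on the comment above. The session option names and the optional second parameter of convert_fromJSON are taken from that comment and may differ in the released version; the connection URL is the usual local-ZooKeeper example and is only an assumption about the deployment.
{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class NonNumericJsonExample {
  public static void main(String[] args) throws Exception {
    // Assumes the Apache Drill JDBC driver is on the classpath and a local
    // Drillbit is reachable through ZooKeeper.
    try (Connection conn = DriverManager.getConnection("jdbc:drill:zk=local");
         Statement stmt = conn.createStatement()) {
      // Option names as described in the comment above (reader and writer side).
      stmt.execute("ALTER SESSION SET `store.json.reader.non_numeric_numbers` = true");
      stmt.execute("ALTER SESSION SET `store.json.writer.non_numeric_numbers` = true");
      // Per the comment above, the second argument enables NaN/Infinity parsing.
      try (ResultSet rs = stmt.executeQuery(
          "SELECT convert_fromJSON('{\"key\": NaN}', true) FROM (VALUES(1))")) {
        while (rs.next()) {
          System.out.println(rs.getObject(1));
        }
      }
    }
  }
}
{code}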
[jira] [Updated] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Volodymyr Tkach updated DRILL-5919: --- Summary: Add non-numeric support for JSON processing (was: Add session option to allow json reader/writer to work with NaN,INF) > Add non-numeric support for JSON processing > --- > > Key: DRILL-5919 > URL: https://issues.apache.org/jira/browse/DRILL-5919 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - JSON >Affects Versions: 1.11.0 >Reporter: Volodymyr Tkach >Assignee: Volodymyr Tkach > Fix For: Future > > > Add session options to allow drill working with non standard json strings > number literals like: NaN, Infinity, -Infinity. By default these options will > be switched off, the user will be able to toggle them during working session. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (DRILL-5919) Add session option to allow json reader/writer to work with NaN,INF
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242040#comment-16242040 ] Volodymyr Tkach edited comment on DRILL-5919 at 11/7/17 2:02 PM: - Added two session options `store.json.reader.non_numeric_numbers` and `store.json.reader.non_numeric_numbers` that allow to read/write `NaN` and `Infinity` as numbers. By default these options are set to false; Also extended signature of `convert_toJSON` and `convert_fromJSON` functions by adding second optional parameter that enables read/write `NaN` and `Infinity`. For example: `select convert_fromJSON('\{"key": NaN\}') from (values(1));` will result with JsonParseException, but `select convert_fromJSON('\{"key": NaN\}', true) from (values(1));` will parse `NaN` as a number. was (Author: volodymyr.tkach): Added two session options `store.json.reader.non_numeric_numbers` and `store.json.reader.non_numeric_numbers` that allow to read/write `NaN` and `Infinity` as numbers. By default these options are set to false; Also extended signature of `convert_toJSON` and `convert_fromJSON` functions by adding second optional parameter that enables read/write `NaN` and `Infinity`. For example: `select convert_fromJSON('{"key": NaN}') from (values(1));` will result with JsonParseException, but `select convert_fromJSON('{"key": NaN}', true) from (values(1));` will parse `NaN` as a number. > Add session option to allow json reader/writer to work with NaN,INF > --- > > Key: DRILL-5919 > URL: https://issues.apache.org/jira/browse/DRILL-5919 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - JSON >Affects Versions: 1.11.0 >Reporter: Volodymyr Tkach >Assignee: Volodymyr Tkach > Fix For: Future > > > Add session options to allow drill working with non standard json strings > number literals like: NaN, Infinity, -Infinity. By default these options will > be switched off, the user will be able to toggle them during working session. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5919) Add session option to allow json reader/writer to work with NaN,INF
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242040#comment-16242040 ] Volodymyr Tkach commented on DRILL-5919: Added two session options `store.json.reader.non_numeric_numbers` and `store.json.reader.non_numeric_numbers` that allow to read/write `NaN` and `Infinity` as numbers. By default these options are set to false; Also extended signature of `convert_toJSON` and `convert_fromJSON` functions by adding second optional parameter that enables read/write `NaN` and `Infinity`. For example: `select convert_fromJSON('{"key": NaN}') from (values(1));` will result with JsonParseException, but `select convert_fromJSON('{"key": NaN}', true) from (values(1));` will parse `NaN` as a number. > Add session option to allow json reader/writer to work with NaN,INF > --- > > Key: DRILL-5919 > URL: https://issues.apache.org/jira/browse/DRILL-5919 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - JSON >Affects Versions: 1.11.0 >Reporter: Volodymyr Tkach >Assignee: Volodymyr Tkach > Fix For: Future > > > Add session options to allow drill working with non standard json strings > number literals like: NaN, Infinity, -Infinity. By default these options will > be switched off, the user will be able to toggle them during working session. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5919) Add session option to allow json reader/writer to work with NaN,INF
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242031#comment-16242031 ] ASF GitHub Bot commented on DRILL-5919: --- GitHub user vladimirtkach opened a pull request: https://github.com/apache/drill/pull/1026 DRILL-5919: Add session option to allow json reader/writer to work with NaN,INF Added two session options `store.json.reader.non_numeric_numbers` and `store.json.reader.non_numeric_numbers` that allow to read/write `NaN` and `Infinity` as numbers You can merge this pull request into a Git repository by running: $ git pull https://github.com/vladimirtkach/drill DRILL-5919 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1026.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1026 commit 0e972bac9d472f6681e6f16d232f61e6d0bfcb44 Author: Volodymyr Tkach Date: 2017-11-03T16:13:29Z DRILL-5919: Add session option to allow json reader/writer to work with NaN,INF > Add session option to allow json reader/writer to work with NaN,INF > --- > > Key: DRILL-5919 > URL: https://issues.apache.org/jira/browse/DRILL-5919 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - JSON >Affects Versions: 1.11.0 >Reporter: Volodymyr Tkach >Assignee: Volodymyr Tkach > Fix For: Future > > > Add session options to allow drill working with non standard json strings > number literals like: NaN, Infinity, -Infinity. By default these options will > be switched off, the user will be able to toggle them during working session. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5921) Counters metrics should be listed in table
[ https://issues.apache.org/jira/browse/DRILL-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16241917#comment-16241917 ] ASF GitHub Bot commented on DRILL-5921: --- Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1020#discussion_r149349985 --- Diff: exec/java-exec/src/main/resources/rest/metrics/metrics.ftl --- @@ -138,21 +154,14 @@ }); }; -function updateOthers(metrics) { - $.each(["counters", "meters"], function(i, key) { -if(! $.isEmptyObject(metrics[key])) { - $("#" + key + "Val").html(JSON.stringify(metrics[key], null, 2)); -} - }); -}; - var update = function() { $.get("/status/metrics", function(metrics) { updateGauges(metrics.gauges); updateBars(metrics.gauges); if(! $.isEmptyObject(metrics.timers)) createTable(metrics.timers, "timers"); if(! $.isEmptyObject(metrics.histograms)) createTable(metrics.histograms, "histograms"); -updateOthers(metrics); +if(! $.isEmptyObject(metrics.counters)) createCountersTable(metrics.counters); +if(! $.isEmptyObject(metrics.meters)) $("#metersVal").html(JSON.stringify(metrics.meters, null, 2)); --- End diff -- @prasadns14 1. Please add two screenshots before and after the changes. 2. Can you please think of the way to make create table generic so can be used for timers, histograms and counters? 3. What about meters? How they are displayed right now? Maybe we need to display them in table as well? Ideally, we can display all metrics in the same way. > Counters metrics should be listed in table > -- > > Key: DRILL-5921 > URL: https://issues.apache.org/jira/browse/DRILL-5921 > Project: Apache Drill > Issue Type: Bug > Components: Client - HTTP >Affects Versions: 1.11.0 >Reporter: Prasad Nagaraj Subramanya >Assignee: Prasad Nagaraj Subramanya >Priority: Minor > Fix For: 1.12.0 > > > Counter metrics are currently displayed as json string in the Drill UI. They > should be listed in a table similar to other metrics. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5923) State of a successfully completed query shown as "COMPLETED"
[ https://issues.apache.org/jira/browse/DRILL-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16241833#comment-16241833 ] ASF GitHub Bot commented on DRILL-5923: --- Github user arina-ielchiieva commented on the issue: https://github.com/apache/drill/pull/1021 @prasadns14 as far as I understood, you made all these changes to replace `completed` with `succeeded`. What if you just make changes in State enum itself, refactor some code and thus no changes in rest part will be required? From UserBitShared.proto ``` enum QueryState { STARTING = 0; // query has been scheduled for execution. This is post-enqueued. RUNNING = 1; COMPLETED = 2; // query has completed successfully CANCELED = 3; // query has been cancelled, and all cleanup is complete FAILED = 4; CANCELLATION_REQUESTED = 5; // cancellation has been requested, and is being processed ENQUEUED = 6; // query has been enqueued. this is pre-starting. } ``` After the renaming, please don't forget to regenerate protobuf. > State of a successfully completed query shown as "COMPLETED" > > > Key: DRILL-5923 > URL: https://issues.apache.org/jira/browse/DRILL-5923 > Project: Apache Drill > Issue Type: Bug > Components: Client - HTTP >Affects Versions: 1.11.0 >Reporter: Prasad Nagaraj Subramanya >Assignee: Prasad Nagaraj Subramanya > Fix For: 1.12.0 > > > Drill UI currently lists a successfully completed query as "COMPLETED". > Successfully completed, failed and canceled queries are all grouped as > Completed queries. > It would be better to list the state of a successfully completed query as > "Succeeded" to avoid confusion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (DRILL-5746) Pcap PR manually edited Protobuf files, values lost on next build
[ https://issues.apache.org/jira/browse/DRILL-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-5746. - Resolution: Fixed In the scope of DRILL-5716. > Pcap PR manually edited Protobuf files, values lost on next build > - > > Key: DRILL-5746 > URL: https://issues.apache.org/jira/browse/DRILL-5746 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.10.0 >Reporter: Paul Rogers >Assignee: Paul Rogers > Fix For: 1.12.0 > > > Drill recently accepted the pcap format plugin. As part of that work, the > author added a new operator type, {{PCAP_SUB_SCAN_VALUE}}. > But, apparently this was done by editing the generated Protobuf files to add > the values, rather than modifying the protobuf definitions and rebuilding the > generated files. The result is, on the next build of the Protobuf sources, > the following compile error appears: > {code} > [ERROR] > /Users/paulrogers/git/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/pcap/PcapFormatPlugin.java:[80,41] > error: cannot find symbol > [ERROR] symbol: variable PCAP_SUB_SCAN_VALUE > [ERROR] location: class CoreOperatorType > {code} > The solution is to properly edit the Protobuf definitions with the required > symbol. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5270: Reviewer: Arina Ielchiieva > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5688) Add repeated map support to column accessors
[ https://issues.apache.org/jira/browse/DRILL-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5688: Fix Version/s: (was: 1.12.0) 1.13.0 > Add repeated map support to column accessors > > > Key: DRILL-5688 > URL: https://issues.apache.org/jira/browse/DRILL-5688 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.12.0 >Reporter: Paul Rogers >Assignee: Paul Rogers > Fix For: 1.13.0 > > > DRILL-5211 describes how Drill runs into OOM issues due to Drill's two > allocators: Netty and Unsafe. That JIRA also describes the solution: limit > vectors to 16 MB in length (with the eventual goal of limiting overall batch > size.) DRILL-5517 added "size-aware" support to the column accessors created > to parallel Drill's existing readers and writers. (The parallel > implementation ensures that we don't break existing code that uses the > existing mechanism; same as we did for the external sort.) > This ticket describes work to extend the column accessors to handle repeated > maps and lists. Key themes: > * Define a common metadata schema for use in this layer and the "result set > loader" of DRILL-5657. This schema layer builds on top of the existing schema > to add the kind of metadata needed here and by the "sizer" created for the > external sort. > * Define a JSON-like reader and writer structure that supports the full Drill > data model semantics. (The earlier version focused on the scalar types and > arrays of scalars to prove the concept of limiting vector sizes.) > * Revising test code to use the revised column writer structure. > Implementation details appear in the PR. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5822) The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 doesn't preserve column order
[ https://issues.apache.org/jira/browse/DRILL-5822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5822: Reviewer: Paul Rogers > The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 > doesn't preserve column order > --- > > Key: DRILL-5822 > URL: https://issues.apache.org/jira/browse/DRILL-5822 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.11.0 >Reporter: Prasad Nagaraj Subramanya >Assignee: Vitalii Diravka > Labels: ready-to-commit > Fix For: 1.12.0 > > > Columns ordering doesn't preserve for the star query with sorting when this > is planned into multiple fragments. > Repro steps: > 1) {code}alter session set `planner.slice_target`=1;{code} > 2) ORDER BY clause in the query. > Scenarios: > {code} > 0: jdbc:drill:zk=local> alter session reset `planner.slice_target`; > +---++ > | ok |summary | > +---++ > | true | planner.slice_target updated. | > +---++ > 1 row selected (0.082 seconds) > 0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` order by > n_name limit 1; > +--+--+--+--+ > | n_nationkey | n_name | n_regionkey | n_comment > | > +--+--+--+--+ > | 0| ALGERIA | 0| haggle. carefully final deposits > detect slyly agai | > +--+--+--+--+ > 1 row selected (0.141 seconds) > 0: jdbc:drill:zk=local> alter session set `planner.slice_target`=1; > +---++ > | ok |summary | > +---++ > | true | planner.slice_target updated. | > +---++ > 1 row selected (0.091 seconds) > 0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` order by > n_name limit 1; > +--+--+--+--+ > | n_comment | n_name | > n_nationkey | n_regionkey | > +--+--+--+--+ > | haggle. carefully final deposits detect slyly agai | ALGERIA | 0 >| 0| > +--+--+--+--+ > 1 row selected (0.201 seconds) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5769) IndexOutOfBoundsException when querying JSON files
[ https://issues.apache.org/jira/browse/DRILL-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5769: Fix Version/s: (was: 1.12.0) (was: 1.11.0) (was: 1.10.0) Future > IndexOutOfBoundsException when querying JSON files > -- > > Key: DRILL-5769 > URL: https://issues.apache.org/jira/browse/DRILL-5769 > Project: Apache Drill > Issue Type: Bug > Components: Server, Storage - JSON >Affects Versions: 1.10.0 > Environment: *jdk_8u45_x64* > *single drillbit running on zookeeper* > *Following options set to TRUE:* > drill.exec.functions.cast_empty_string_to_null > store.json.all_text_mode > store.parquet.enable_dictionary_encoding > store.parquet.use_new_reader >Reporter: David Lee >Assignee: Jinfeng Ni > Fix For: Future > > Attachments: 001.json, 100.json, 111.json > > > *Running the following SQL on these three JSON files fails: * > 001.json 100.json 111.json > select t.id > from dfs.`/tmp/???.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > *Error:* > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > IndexOutOfBoundsException: index: 1024, length: 1 (expected: range(0, 1024)) > Fragment 0:0 [Error Id: .... > *However, running the same SQL on two out of three files works:* > select t.id > from dfs.`/tmp/1??.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > select t.id > from dfs.`/tmp/?1?.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > select t.id > from dfs.`/tmp/??1.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > *Changing the selected column from t.id to t.* also works: * > select * > from dfs.`/tmp/???.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5896) Handle vector creation in HbaseRecordReader to avoid NullableInt vectors later
[ https://issues.apache.org/jira/browse/DRILL-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5896: Labels: ready-to-commit (was: ) > Handle vector creation in HbaseRecordReader to avoid NullableInt vectors later > -- > > Key: DRILL-5896 > URL: https://issues.apache.org/jira/browse/DRILL-5896 > Project: Apache Drill > Issue Type: Bug > Components: Storage - HBase >Affects Versions: 1.11.0 >Reporter: Prasad Nagaraj Subramanya >Assignee: Prasad Nagaraj Subramanya > Labels: ready-to-commit > Fix For: 1.12.0 > > > When an HBase query projects both a column family and a column in that column > family, the vector for the column is not created in the HbaseRecordReader. > So, in cases where the scan batch is empty, a NullableInt vector is created for > this column. We need to handle column creation in the reader. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
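A toy illustration of the problem described above, with entirely hypothetical names rather than Drill's HBaseRecordReader code: if vectors are created only when data arrives, an empty scan batch carries no type information and the projected column degenerates to a nullable-int placeholder, whereas declaring the projected column up front keeps its type stable.
{code}
// Hypothetical illustration only; names do not correspond to Drill's HBase reader.
import java.util.LinkedHashMap;
import java.util.Map;

public class ProjectionSetupDemo {
  enum ColumnType { MAP, VARBINARY, NULLABLE_INT }

  // Vectors created lazily, only when data is actually seen.
  static Map<String, ColumnType> lazySetup(boolean sawData) {
    Map<String, ColumnType> vectors = new LinkedHashMap<>();
    if (sawData) {
      vectors.put("f", ColumnType.MAP);          // column family
      vectors.put("f.c", ColumnType.VARBINARY);  // column inside the family
    }
    // Empty batch: the column's type is unknown, so a NullableInt placeholder appears.
    vectors.putIfAbsent("f.c", ColumnType.NULLABLE_INT);
    return vectors;
  }

  // Vectors declared up front from the projection list, so the type survives empty batches.
  static Map<String, ColumnType> eagerSetup() {
    Map<String, ColumnType> vectors = new LinkedHashMap<>();
    vectors.put("f", ColumnType.MAP);
    vectors.put("f.c", ColumnType.VARBINARY);
    return vectors;
  }

  public static void main(String[] args) {
    System.out.println("lazy setup, empty batch : " + lazySetup(false));
    System.out.println("eager setup, empty batch: " + eagerSetup());
  }
}
{code}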
[jira] [Updated] (DRILL-5670) Varchar vector throws an assertion error when allocating a new vector
[ https://issues.apache.org/jira/browse/DRILL-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5670: Fix Version/s: (was: 1.12.0) 1.13.0 > Varchar vector throws an assertion error when allocating a new vector > - > > Key: DRILL-5670 > URL: https://issues.apache.org/jira/browse/DRILL-5670 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.11.0 >Reporter: Rahul Challapalli >Assignee: Paul Rogers > Fix For: 1.13.0 > > Attachments: 26478262-f0a7-8fc1-1887-4f27071b9c0f.sys.drill, > 26498995-bbad-83bc-618f-914c37a84e1f.sys.drill, > 26555749-4d36-10d2-6faf-e403db40c370.sys.drill, > 266290f3-5fdc-5873-7372-e9ee053bf867.sys.drill, > 269969ca-8d4d-073a-d916-9031e3d3fbf0.sys.drill, drill-override.conf, > drillbit.log, drillbit.log, drillbit.log, drillbit.log, drillbit.log, > drillbit.log.exchange, drillbit.log.sort, drillbit.out > > > I am running this test on a private branch of [paul's > repository|https://github.com/paul-rogers/drill]. Below is the commit info > {code} > git.commit.id.abbrev=d86e16c > git.commit.user.email=prog...@maprtech.com > git.commit.message.full=DRILL-5601\: Rollup of external sort fixes an > improvements\n\n- DRILL-5513\: Managed External Sort \: OOM error during the > merge phase\n- DRILL-5519\: Sort fails to spill and results in an OOM\n- > DRILL-5522\: OOM during the merge and spill process of the managed external > sort\n- DRILL-5594\: Excessive buffer reallocations during merge phase of > external sort\n- DRILL-5597\: Incorrect "bits" vector allocation in nullable > vectors allocateNew()\n- DRILL-5602\: Repeated List Vector fails to > initialize the offset vector\n\nAll of the bugs have to do with handling > low-memory conditions, and with\ncorrectly estimating the sizes of vectors, > even when those vectors come\nfrom the spill file or from an exchange. Hence, > the changes for all of\nthe above issues are interrelated.\n > git.commit.id=d86e16c551e7d3553f2cde748a739b1c5a7a7659 > git.commit.message.short=DRILL-5601\: Rollup of external sort fixes an > improvements > git.commit.user.name=Paul Rogers > git.build.user.name=Rahul Challapalli > git.commit.id.describe=0.9.0-1078-gd86e16c > git.build.user.email=challapallira...@gmail.com > git.branch=d86e16c551e7d3553f2cde748a739b1c5a7a7659 > git.commit.time=05.07.2017 @ 20\:34\:39 PDT > git.build.time=12.07.2017 @ 14\:27\:03 PDT > git.remote.origin.url=g...@github.com\:paul-rogers/drill.git > {code} > Below query fails with an Assertion Error > {code} > 0: jdbc:drill:zk=10.10.100.190:5181> ALTER SESSION SET > `exec.sort.disable_managed` = false; > +---+-+ > | ok | summary | > +---+-+ > | true | exec.sort.disable_managed updated. | > +---+-+ > 1 row selected (1.044 seconds) > 0: jdbc:drill:zk=10.10.100.190:5181> alter session set > `planner.memory.max_query_memory_per_node` = 482344960; > +---++ > | ok | summary | > +---++ > | true | planner.memory.max_query_memory_per_node updated. | > +---++ > 1 row selected (0.372 seconds) > 0: jdbc:drill:zk=10.10.100.190:5181> alter session set > `planner.width.max_per_node` = 1; > +---+--+ > | ok | summary| > +---+--+ > | true | planner.width.max_per_node updated. | > +---+--+ > 1 row selected (0.292 seconds) > 0: jdbc:drill:zk=10.10.100.190:5181> alter session set > `planner.width.max_per_query` = 1; > +---+---+ > | ok |summary| > +---+---+ > | true | planner.width.max_per_query updated. 
| > +---+---+ > 1 row selected (0.25 seconds) > 0: jdbc:drill:zk=10.10.100.190:5181> select count(*) from (select * from > dfs.`/drill/testdata/resource-manager/3500cols.tbl` order by > columns[450],columns[330],columns[230],columns[220],columns[110],columns[90],columns[80],columns[70],columns[40],columns[10],columns[20],columns[30],columns[40],columns[50], > > columns[454],columns[413],columns[940],columns[834],columns[73],columns[140],columns[104],columns[],columns[30],columns[2420],columns[1520], > columns[1410], > columns[1110],columns[1290],columns[2380],
[jira] [Updated] (DRILL-5265) External Sort consumes more memory than allocated
[ https://issues.apache.org/jira/browse/DRILL-5265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5265: Fix Version/s: (was: 1.12.0) 1.13.0 > External Sort consumes more memory than allocated > - > > Key: DRILL-5265 > URL: https://issues.apache.org/jira/browse/DRILL-5265 > Project: Apache Drill > Issue Type: Bug >Reporter: Rahul Challapalli >Assignee: Paul Rogers > Fix For: 1.13.0 > > > git.commit.id.abbrev=300e934 > Based on the profile for the below query, the external sort has a peak memory > usage of ~126MB when only ~100MB was allocated > {code} > alter session set `planner.memory.max_query_memory_per_node` = 104857600; > alter session set `planner.width.max_per_node` = 1; > select * from dfs.`/drill/testdata/md1362` order by c_email_address; > {code} > I attached the profile and the log files -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5310) Memory leak in managed sort if OOM during sv2 allocation
[ https://issues.apache.org/jira/browse/DRILL-5310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5310: Fix Version/s: (was: 1.12.0) 1.13.0 > Memory leak in managed sort if OOM during sv2 allocation > > > Key: DRILL-5310 > URL: https://issues.apache.org/jira/browse/DRILL-5310 > Project: Apache Drill > Issue Type: Sub-task >Affects Versions: 1.10.0 >Reporter: Paul Rogers >Assignee: Paul Rogers > Fix For: 1.13.0 > > > See the "identical1" test case in DRILL-5266. Due to misconfiguration, the > sort was given too little memory to make progress. An OOM error occurred when > allocating an SV2. > In this scenario, the "converted" record batch is leaked. > Normally, a converted batch is added to the list of in-memory batches, then > released on {{close()}}. But, in this case, the batch is only a local > variable, and so leaks. > The code must release this batch in this condition. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
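A minimal sketch of the cleanup the last sentence calls for, using stand-in classes rather than Drill's sort code: because the converted batch is held only in a local variable, the allocation-failure path has to release it explicitly before rethrowing, otherwise nothing ever closes it.
{code}
// Illustrative only; ConvertedBatch and AllocationFailedException are stand-ins,
// not Drill's classes.
import java.util.ArrayList;
import java.util.List;

public class SortBatchIntakeDemo {

  static class ConvertedBatch implements AutoCloseable {
    @Override public void close() { System.out.println("converted batch released"); }
  }

  static class AllocationFailedException extends RuntimeException {
    AllocationFailedException(String msg) { super(msg); }
  }

  private final List<ConvertedBatch> inMemoryBatches = new ArrayList<>();

  // Simulates the sv2 allocation failing under a too-small memory budget.
  private void allocateSv2(ConvertedBatch batch) {
    throw new AllocationFailedException("OOM while allocating sv2");
  }

  public void addBatch() {
    ConvertedBatch converted = new ConvertedBatch();   // held only in a local variable
    try {
      allocateSv2(converted);
      inMemoryBatches.add(converted);                  // normally released later in close()
    } catch (AllocationFailedException e) {
      converted.close();                               // without this, the batch leaks
      throw e;
    }
  }

  public static void main(String[] args) {
    try {
      new SortBatchIntakeDemo().addBatch();
    } catch (AllocationFailedException e) {
      System.out.println("propagated: " + e.getMessage());
    }
  }
}
{code}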
[jira] [Updated] (DRILL-5822) The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 doesn't preserve column order
[ https://issues.apache.org/jira/browse/DRILL-5822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5822: Labels: ready-to-commit (was: ) > The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 > doesn't preserve column order > --- > > Key: DRILL-5822 > URL: https://issues.apache.org/jira/browse/DRILL-5822 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.11.0 >Reporter: Prasad Nagaraj Subramanya >Assignee: Vitalii Diravka > Labels: ready-to-commit > Fix For: 1.12.0 > > > Column ordering is not preserved for a star query with sorting when it > is planned into multiple fragments. > Repro steps: > 1) {code}alter session set `planner.slice_target`=1;{code} > 2) Run the query with an ORDER BY clause. > Scenarios: > {code} > 0: jdbc:drill:zk=local> alter session reset `planner.slice_target`; > +---++ > | ok |summary | > +---++ > | true | planner.slice_target updated. | > +---++ > 1 row selected (0.082 seconds) > 0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` order by > n_name limit 1; > +--+--+--+--+ > | n_nationkey | n_name | n_regionkey | n_comment > | > +--+--+--+--+ > | 0| ALGERIA | 0| haggle. carefully final deposits > detect slyly agai | > +--+--+--+--+ > 1 row selected (0.141 seconds) > 0: jdbc:drill:zk=local> alter session set `planner.slice_target`=1; > +---++ > | ok |summary | > +---++ > | true | planner.slice_target updated. | > +---++ > 1 row selected (0.091 seconds) > 0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` order by > n_name limit 1; > +--+--+--+--+ > | n_comment | n_name | > n_nationkey | n_regionkey | > +--+--+--+--+ > | haggle. carefully final deposits detect slyly agai | ALGERIA | 0 >| 0| > +--+--+--+--+ > 1 row selected (0.201 seconds) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5909) need new JMX metrics for (FAILED and CANCELED) queries
[ https://issues.apache.org/jira/browse/DRILL-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5909: Reviewer: Paul Rogers > need new JMX metrics for (FAILED and CANCELED) queries > -- > > Key: DRILL-5909 > URL: https://issues.apache.org/jira/browse/DRILL-5909 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Monitoring >Affects Versions: 1.11.0, 1.12.0 >Reporter: Khurram Faraaz >Assignee: Prasad Nagaraj Subramanya > Labels: ready-to-commit > Fix For: 1.12.0 > > > we have these JMX metrics today > {noformat} > drill.queries.running > drill.queries.completed > {noformat} > we need these new JMX metrics > {noformat} > drill.queries.failed > drill.queries.canceled > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
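Drill's JMX metrics are published through the Dropwizard metrics library, so the two requested counters would look roughly like the sketch below; the class, registry wiring, and increment points are illustrative assumptions, not the actual DRILL-5909 patch (which would register on the drillbit's shared registry rather than a fresh one).
{code}
// Standalone Dropwizard Metrics sketch; not the actual DRILL-5909 change.
import com.codahale.metrics.Counter;
import com.codahale.metrics.MetricRegistry;

public class QueryStateMetricsDemo {
  private final MetricRegistry registry = new MetricRegistry();

  // Existing-style counters plus the two requested by this ticket.
  private final Counter running   = registry.counter("drill.queries.running");
  private final Counter completed = registry.counter("drill.queries.completed");
  private final Counter failed    = registry.counter("drill.queries.failed");
  private final Counter canceled  = registry.counter("drill.queries.canceled");

  public void onQueryStart()    { running.inc(); }
  public void onQueryComplete() { running.dec(); completed.inc(); }
  public void onQueryFailed()   { running.dec(); failed.inc(); }
  public void onQueryCanceled() { running.dec(); canceled.inc(); }

  public static void main(String[] args) {
    QueryStateMetricsDemo metrics = new QueryStateMetricsDemo();
    metrics.onQueryStart();
    metrics.onQueryFailed();
    System.out.println("drill.queries.failed = "
        + metrics.registry.counter("drill.queries.failed").getCount());
  }
}
{code}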
[jira] [Updated] (DRILL-5909) need new JMX metrics for (FAILED and CANCELED) queries
[ https://issues.apache.org/jira/browse/DRILL-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5909: Labels: ready-to-commit (was: ) > need new JMX metrics for (FAILED and CANCELED) queries > -- > > Key: DRILL-5909 > URL: https://issues.apache.org/jira/browse/DRILL-5909 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Monitoring >Affects Versions: 1.11.0, 1.12.0 >Reporter: Khurram Faraaz >Assignee: Prasad Nagaraj Subramanya > Labels: ready-to-commit > Fix For: 1.12.0 > > > we have these JMX metrics today > {noformat} > drill.queries.running > drill.queries.completed > {noformat} > we need these new JMX metrics > {noformat} > drill.queries.failed > drill.queries.canceled > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5896) Handle vector creation in HbaseRecordReader to avoid NullableInt vectors later
[ https://issues.apache.org/jira/browse/DRILL-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5896: Reviewer: Paul Rogers > Handle vector creation in HbaseRecordReader to avoid NullableInt vectors later > -- > > Key: DRILL-5896 > URL: https://issues.apache.org/jira/browse/DRILL-5896 > Project: Apache Drill > Issue Type: Bug > Components: Storage - HBase >Affects Versions: 1.11.0 >Reporter: Prasad Nagaraj Subramanya >Assignee: Prasad Nagaraj Subramanya > Labels: ready-to-commit > Fix For: 1.12.0 > > > When an HBase query projects both a column family and a column in that column > family, the vector for the column is not created in the HbaseRecordReader. > So, in cases where the scan batch is empty, a NullableInt vector is created for > this column. We need to handle column creation in the reader. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5921) Counters metrics should be listed in table
[ https://issues.apache.org/jira/browse/DRILL-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5921: Reviewer: Arina Ielchiieva > Counters metrics should be listed in table > -- > > Key: DRILL-5921 > URL: https://issues.apache.org/jira/browse/DRILL-5921 > Project: Apache Drill > Issue Type: Bug > Components: Client - HTTP >Affects Versions: 1.11.0 >Reporter: Prasad Nagaraj Subramanya >Assignee: Prasad Nagaraj Subramanya >Priority: Minor > Fix For: 1.12.0 > > > Counter metrics are currently displayed as a JSON string in the Drill UI. They > should be listed in a table, similar to the other metrics. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5923) State of a successfully completed query shown as "COMPLETED"
[ https://issues.apache.org/jira/browse/DRILL-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5923: Reviewer: Arina Ielchiieva > State of a successfully completed query shown as "COMPLETED" > > > Key: DRILL-5923 > URL: https://issues.apache.org/jira/browse/DRILL-5923 > Project: Apache Drill > Issue Type: Bug > Components: Client - HTTP >Affects Versions: 1.11.0 >Reporter: Prasad Nagaraj Subramanya >Assignee: Prasad Nagaraj Subramanya > Fix For: 1.12.0 > > > Drill UI currently lists a successfully completed query as "COMPLETED". > Successfully completed, failed and canceled queries are all grouped as > Completed queries. > It would be better to list the state of a successfully completed query as > "Succeeded" to avoid confusion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5829) RecordBatchLoader schema comparison is not case insensitive
[ https://issues.apache.org/jira/browse/DRILL-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5829: Fix Version/s: (was: 1.12.0) 1.13.0 > RecordBatchLoader schema comparison is not case insensitive > --- > > Key: DRILL-5829 > URL: https://issues.apache.org/jira/browse/DRILL-5829 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.11.0 >Reporter: Paul Rogers >Assignee: Paul Rogers > Fix For: 1.13.0 > > > The class {{RecordBatchLoader}} decodes batches received over the wire. It > determines if the schema of the new batch matches that of the old one. To do > that, it uses a map of existing columns. > In Drill, column names follow SQL rules: they are case insensitive. Yet, the > implementation of {{RecordBatchLoader}} uses a case sensitive map: > {code} > final Map oldFields = Maps.newHashMap(); > {code} > This should be: > {code} > final Map oldFields = > CaseInsensitiveMap.newHashMap(); > {code} > Without this change, the receivers will report schema changes if a column > differs only in name case. However, Drill semantics say that names that > differ in case are identical, and so no schema change should be issued. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
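The effect is easy to reproduce outside Drill: a plain {{HashMap}} treats {{N_NAME}} and {{n_name}} as different columns, while a case-insensitive map (a {{TreeMap}} with {{String.CASE_INSENSITIVE_ORDER}} is used below as a stand-in for Drill's {{CaseInsensitiveMap}}) finds the existing column and so reports no schema change.
{code}
// Demonstrates why the schema comparison must be case-insensitive;
// TreeMap is a stand-in for Drill's CaseInsensitiveMap.
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class SchemaCompareDemo {
  public static void main(String[] args) {
    Map<String, String> caseSensitive = new HashMap<>();
    caseSensitive.put("n_name", "VarCharVector");

    Map<String, String> caseInsensitive = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
    caseInsensitive.put("n_name", "VarCharVector");

    // Incoming batch names the same column with different case.
    String incoming = "N_NAME";

    // HashMap misses the existing column, so the loader would report a schema change.
    System.out.println("case-sensitive lookup:   " + caseSensitive.get(incoming));
    // The case-insensitive map finds it, so no spurious schema change is reported.
    System.out.println("case-insensitive lookup: " + caseInsensitive.get(incoming));
  }
}
{code}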
[jira] [Updated] (DRILL-5690) RepeatedDecimal18Vector does not pass scale, precision to data vector
[ https://issues.apache.org/jira/browse/DRILL-5690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5690: Fix Version/s: (was: 1.12.0) 1.13.0 > RepeatedDecimal18Vector does not pass scale, precision to data vector > - > > Key: DRILL-5690 > URL: https://issues.apache.org/jira/browse/DRILL-5690 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.10.0 >Reporter: Paul Rogers >Assignee: Paul Rogers > Fix For: 1.13.0 > > > Decimal types require not just the type (Decimal9, Decimal18, etc.) but also > a precision and scale. The triple of (minor type, precision, scale) appears > in the {{MaterializedField}} for the nullable or required vectors. > A repeated vector has three parts: the {{RepeatedDecimal18Vector}} itself, which is > composed of a {{UInt4Vector}} offset vector and a {{Decimal18Vector}} that > holds the values. > When {{RepeatedDecimal18Vector}} creates the {{Decimal18Vector}} to hold the > values, it clones the {{MaterializedField}}. But, it *does not* clone the > scale and precision, resulting in the loss of critical information. > {code} > public RepeatedDecimal18Vector(MaterializedField field, BufferAllocator > allocator) { > super(field, allocator); > > addOrGetVector(VectorDescriptor.create(Types.required(field.getType().getMinorType()))); > } > {code} > This is normally not a problem because most code accesses the data via the > repeated vector. But, for code that needs to work with the values, the > results are wrong given that the types are wrong. (Values stored with one > scale, 123.45 (scale 2), will be retrieved with scale 0, as 123, say.) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
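A self-contained illustration of the metadata loss, with a hypothetical value type standing in for Drill's {{MaterializedField}} metadata: rebuilding the child field from the minor type alone drops scale and precision, whereas copying the full type preserves them.
{code}
// Hypothetical stand-in types; illustrates the scale/precision loss, not Drill's vector code.
public class DecimalMetadataDemo {

  record ColumnType(String minorType, int precision, int scale) {}

  public static void main(String[] args) {
    ColumnType parent = new ColumnType("DECIMAL18", 10, 2);

    // What the constructor quoted above effectively does: keep only the minor type.
    ColumnType childFromMinorTypeOnly = new ColumnType(parent.minorType(), 0, 0);

    // What is needed: carry precision and scale through to the data vector.
    ColumnType childFullCopy =
        new ColumnType(parent.minorType(), parent.precision(), parent.scale());

    System.out.println("minor-type-only child: " + childFromMinorTypeOnly);  // precision/scale dropped
    System.out.println("full-copy child:       " + childFullCopy);
  }
}
{code}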
[jira] [Closed] (DRILL-5872) Deserialization of profile JSON fails due to totalCost being reported as "NaN"
[ https://issues.apache.org/jira/browse/DRILL-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva closed DRILL-5872. --- Resolution: Won't Fix > Deserialization of profile JSON fails due to totalCost being reported as "NaN" > -- > > Key: DRILL-5872 > URL: https://issues.apache.org/jira/browse/DRILL-5872 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.12.0 >Reporter: Kunal Khatua >Assignee: Paul Rogers >Priority: Blocker > Fix For: 1.12.0 > > > With DRILL-5716, there is a change in the protobuf that introduces a new > attribute in the JSON document that Drill uses to interpret and render the > profile's details. > The totalCost attribute, used as part of showing the query cost (to > understand how it was assigned to the small/large queue), sometimes returns a > non-numeric text value {{"NaN"}}. > This breaks the UI with the message: > {code} > Failed to get profiles: > unable to deserialize value at key 2620698f-295e-f8d3-3ab7-01792b0f2669 > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
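For context on the failure mode (the issue was closed as Won't Fix, so this is not the adopted solution): Jackson rejects the non-numeric token {{NaN}} by default and only accepts it when {{ALLOW_NON_NUMERIC_NUMBERS}} is enabled. The {{QueryProfile}} POJO below is a made-up stand-in for the real profile class.
{code}
// Standalone Jackson sketch; QueryProfile is a made-up stand-in, not Drill's class.
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.ObjectMapper;

public class NaNCostDemo {

  public static class QueryProfile {
    public double totalCost;
  }

  public static void main(String[] args) throws Exception {
    String json = "{\"totalCost\": NaN}";

    ObjectMapper strict = new ObjectMapper();
    try {
      strict.readValue(json, QueryProfile.class);
    } catch (Exception e) {
      System.out.println("default mapper fails: " + e.getClass().getSimpleName());
    }

    ObjectMapper lenient = new ObjectMapper()
        .configure(JsonParser.Feature.ALLOW_NON_NUMERIC_NUMBERS, true);
    QueryProfile profile = lenient.readValue(json, QueryProfile.class);
    System.out.println("lenient mapper reads totalCost = " + profile.totalCost);
  }
}
{code}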
[jira] [Updated] (DRILL-5377) Five-digit year dates are displayed incorrectly via jdbc
[ https://issues.apache.org/jira/browse/DRILL-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5377: Fix Version/s: (was: 1.12.0) 1.13.0 > Five-digit year dates are displayed incorrectly via jdbc > > > Key: DRILL-5377 > URL: https://issues.apache.org/jira/browse/DRILL-5377 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Parquet >Affects Versions: 1.10.0 >Reporter: Rahul Challapalli >Assignee: Vitalii Diravka >Priority: Minor > Fix For: 1.13.0 > > > git.commit.id.abbrev=38ef562 > The issue is connected to displaying five-digit year dates via JDBC. > Below is the output I get from the test framework when I disable auto-correction > for date fields: > {code} > select l_shipdate from table(cp.`tpch/lineitem.parquet` (type => 'parquet', > autoCorrectCorruptDates => false)) order by l_shipdate limit 10; > ^@356-03-19 > ^@356-03-21 > ^@356-03-21 > ^@356-03-23 > ^@356-03-24 > ^@356-03-24 > ^@356-03-26 > ^@356-03-26 > ^@356-03-26 > ^@356-03-26 > {code} > Or a simpler case: > {code} > 0: jdbc:drill:> select cast('11356-02-16' as date) as FUTURE_DATE from > (VALUES(1)); > +--+ > | FUTURE_DATE | > +--+ > | 356-02-16 | > +--+ > 1 row selected (0.293 seconds) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)