[jira] [Updated] (DRILL-5943) Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism
[ https://issues.apache.org/jira/browse/DRILL-5943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5943: Affects Version/s: 1.12.0 > Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism > --- > > Key: DRILL-5943 > URL: https://issues.apache.org/jira/browse/DRILL-5943 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.12.0 >Reporter: Sorabh Hamirwasia >Assignee: Sorabh Hamirwasia > Fix For: 1.12.0 > > > For the PLAIN mechanism we will weaken the strong check introduced with > DRILL-5582 to keep forward compatibility between a Drill 1.12 client and a > Drill 1.9 server. This is fine since, with or without this strong check, the PLAIN > mechanism is still vulnerable to MITM during the handshake itself, unlike mutual > authentication protocols such as Kerberos. > Also, to keep forward compatibility with respect to SASL, we will treat > UNKNOWN_SASL_SUPPORT as a valid value. For a handshake message received from a > client running on a later version (let's say 1.13) than the Drillbit (1.12) > and carrying a new value for the SaslSupport field that is unknown to the server, this > field will be decoded as UNKNOWN_SASL_SUPPORT. In this scenario the client will > be treated as one that is aware of the SASL protocol, but the server doesn't know the exact > capabilities of the client. Hence the SASL handshake will still be required from > the server side. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
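The decoding rule in the description above can be sketched in a few lines of Java. This is a minimal illustration, not Drill's actual RPC code; the enum constants and method names are assumptions used only to show how an unrecognized SaslSupport wire value falls back to UNKNOWN_SASL_SUPPORT while the server still drives the SASL handshake.

```java
// A minimal sketch, not Drill's actual RPC code: the enum constants and method
// names here are assumptions used only to illustrate the decoding rule above.
enum SaslSupport { UNKNOWN_SASL_SUPPORT, SASL_SUPPORT, SASL_PRIVACY }

final class SaslSupportDecoder {

  /** Map the wire value from the handshake to an enum constant this server knows. */
  static SaslSupport decode(int wireValue) {
    SaslSupport[] known = SaslSupport.values();
    // A 1.13+ client may send a value this 1.12 server has never heard of;
    // decode it as UNKNOWN_SASL_SUPPORT instead of failing the handshake.
    return (wireValue >= 0 && wireValue < known.length)
        ? known[wireValue]
        : SaslSupport.UNKNOWN_SASL_SUPPORT;
  }

  /** Even with unknown client capabilities, the server still drives the SASL handshake. */
  static boolean saslHandshakeRequired(SaslSupport clientSupport, boolean authEnabled) {
    return authEnabled
        && (clientSupport == SaslSupport.SASL_SUPPORT
            || clientSupport == SaslSupport.SASL_PRIVACY
            || clientSupport == SaslSupport.UNKNOWN_SASL_SUPPORT);
  }
}
```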
[jira] [Commented] (DRILL-4286) Have an ability to put server in quiescent mode of operation
[ https://issues.apache.org/jira/browse/DRILL-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243166#comment-16243166 ] ASF GitHub Bot commented on DRILL-4286: --- Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r149541807 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitStateManager.java --- @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.server; +/* + State manager to manage the state of drillbit. + */ +public class DrillbitStateManager { + + + public DrillbitStateManager(DrillbitState currentState) { +this.currentState = currentState; + } + + public enum DrillbitState { +STARTUP, ONLINE, GRACE, DRAINING, OFFLINE, SHUTDOWN + } + + public DrillbitState getState() { +return currentState; + } + + private DrillbitState currentState; --- End diff -- I think Drillbit.quiescentMode and Drillbit.forceful_shutdown also need NOT be volatile given the way they are used now. You don't have to enforce happens-before (by preventing re-ordering) here and even if these variables are volatile, the read of these variables in close() can anyway race with the setting of these variables in another thread doing a stop/gracefulShutdown. Let me know if I am missing anything. That said, adding volatiles can only makes the code more correct (and slower). Since this code is not critical you can let it be as it is. > Have an ability to put server in quiescent mode of operation > > > Key: DRILL-4286 > URL: https://issues.apache.org/jira/browse/DRILL-4286 > Project: Apache Drill > Issue Type: New Feature > Components: Execution - Flow >Reporter: Victoria Markman >Assignee: Venkata Jyothsna Donapati > > I think drill will benefit from mode of operation that is called "quiescent" > in some databases. > From IBM Informix server documentation: > {code} > Change gracefully from online to quiescent mode > Take the database server gracefully from online mode to quiescent mode to > restrict access to the database server without interrupting current > processing. After you perform this task, the database server sets a flag that > prevents new sessions from gaining access to the database server. The current > sessions are allowed to finish processing. After you initiate the mode > change, it cannot be canceled. During the mode change from online to > quiescent, the database server is considered to be in Shutdown mode. > {code} > This is different from shutdown, when processes are terminated. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
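The race described in the review comment above can be made concrete with a small toy class. This is an illustrative sketch only, not the actual Drillbit code; the field and method names are assumptions.

```java
// Illustrative toy, not the actual Drillbit class. Even if the two flags are
// volatile, a close() running concurrently with gracefulShutdown()/stop() can
// observe them before or after they are set: volatile gives visibility of each
// individual write, not mutual exclusion over the whole check-then-act sequence.
class ShutdownFlagsSketch {
  private volatile boolean quiescentMode;
  private volatile boolean forcefulShutdown;

  void gracefulShutdown() {          // invoked from one thread
    quiescentMode = true;
  }

  void stop() {                      // invoked from another thread
    forcefulShutdown = true;
  }

  void close() {
    // This read still races with the writers above regardless of volatile.
    if (quiescentMode && !forcefulShutdown) {
      // drain in-flight queries before releasing resources
    }
  }
}
```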
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243197#comment-16243197 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149550418 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java --- @@ -96,6 +105,14 @@ private void throwIfClosed() throws AlreadyClosedSqlException, throw new AlreadyClosedSqlException( "ResultSet is already closed." ); } } + +//Implicit check for whether timeout is set +if (elapsedTimer != null) { --- End diff -- ```yes, pausing before execute would totally work!``` Current here is what the test does (_italics indicate what we're doing under the covers_): 1. Init Statement 2. Set timeout on statement (_validating the timeout value_) 3. Calling `execute()` and fetching ResultSet instance (_starting the clock_) 4. Fetching a row using ResultSet.next() 5. Pausing briefly 6. Repeat step 4 onwards (_enough pause to trigger timeout_) I was intending to pause between step 3 and 4 as an additional step. You believe that we are not exercising any tests for timeout within the `execute()` call? (Ref: https://github.com/kkhatua/drill/blob/9c4e3f3f727e70ca058facd4767556087a1876e1/exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java#L1908 ) > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
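A rough sketch of the additional case being discussed: pause after `execute()` but before the first `ResultSet.next()`, so the timeout fires on the result-set access rather than inside `execute()`. The class name, query, and `connection` fixture are assumptions; this is not the test actually added in the PR.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLTimeoutException;
import java.sql.Statement;
import org.junit.Test;

// Sketch of the extra test case discussed above; `connection` is assumed to be
// provided by the surrounding JDBC test fixture.
public class QueryTimeoutPauseSketch {
  private Connection connection;   // assumed fixture, e.g. set up in @BeforeClass

  @Test(expected = SQLTimeoutException.class)
  public void timeoutTriggersBeforeFirstNext() throws Exception {
    try (Statement stmt = connection.createStatement()) {
      stmt.setQueryTimeout(1);                                        // 1-second budget
      ResultSet rs = stmt.executeQuery("SELECT * FROM sys.version");  // clock starts at execute()
      Thread.sleep(2_000L);                                           // pause long enough to exceed the budget
      rs.next();                                                      // expected to raise SQLTimeoutException
    }
  }
}
```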
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243187#comment-16243187 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149548759 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -333,8 +368,14 @@ void close() { final int batchQueueThrottlingThreshold = client.getConfig().getInt( ExecConstants.JDBC_BATCH_QUEUE_THROTTLING_THRESHOLD ); -resultsListener = new ResultsListener(batchQueueThrottlingThreshold); +resultsListener = new ResultsListener(this, batchQueueThrottlingThreshold); currentBatchHolder = new RecordBatchLoader(client.getAllocator()); +try { + setTimeout(this.statement.getQueryTimeout()); +} catch (SQLException e) { + // Printing any unexpected SQLException stack trace + e.printStackTrace(); --- End diff -- I agree. Thankfully, the _caller_ does handle any thrown `SQLException`s, so I'm going to pass this off to that. IMO, I don't think we'll have an issue because the `Statement.setQueryTimeout()` would have handled any corner cases before this is invoked via `execute()` > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243158#comment-16243158 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149546105 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -260,6 +288,10 @@ void close() { // when the main thread is blocked waiting for the result. In that case // we want to unblock the main thread. firstMessageReceived.countDown(); // TODO: Why not call releaseIfFirst as used elsewhere? + //Stopping timeout clock --- End diff -- Nothing is actually running in Stopwatch (it's just a state to indicate if elapsed time should use the current time or the time when Stopwatch was stopped...) > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
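The point about Guava's Stopwatch can be verified with a few lines: stop() does not halt anything, it only freezes the reference point that elapsed() reads from. A small stand-alone illustration, not Drill code:

```java
import java.util.concurrent.TimeUnit;
import com.google.common.base.Stopwatch;

// Small stand-alone illustration (not Drill code): stop() does not shut anything
// down, it only freezes the reference point that elapsed() reads from.
class StopwatchSketch {
  public static void main(String[] args) throws InterruptedException {
    Stopwatch elapsedTimer = Stopwatch.createStarted();
    Thread.sleep(50);
    elapsedTimer.stop();                                  // no thread or resource is involved
    long frozen = elapsedTimer.elapsed(TimeUnit.MILLISECONDS);
    Thread.sleep(50);
    // elapsed() keeps returning the time measured up to stop(); it does not grow.
    assert frozen == elapsedTimer.elapsed(TimeUnit.MILLISECONDS);
  }
}
```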
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243159#comment-16243159 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149546258 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -139,8 +147,22 @@ private boolean stopThrottlingIfSo() { return stopped; } -public void awaitFirstMessage() throws InterruptedException { - firstMessageReceived.await(); +public void awaitFirstMessage() throws InterruptedException, SQLTimeoutException { + //Check if a non-zero timeout has been set + if ( parent.timeoutInMilliseconds > 0 ) { +//Identifying remaining in milliseconds to maintain a granularity close to integer value of timeout +long timeToTimeout = (parent.timeoutInMilliseconds) - parent.elapsedTimer.elapsed(TimeUnit.MILLISECONDS); +if ( timeToTimeout > 0 ) { --- End diff -- Affects readability, but I think comments can convey the intent. +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
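Condensed, the remaining-time logic in the diff above looks roughly like the sketch below: convert the budget to milliseconds once, subtract what the stopwatch has already consumed, and bound the wait on the first-message latch by the remainder. The field names are assumptions and this is not the PR's exact code.

```java
import java.sql.SQLTimeoutException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import com.google.common.base.Stopwatch;

// Condensed sketch of the remaining-time logic in the diff above. The field
// names are assumptions and this is not the PR's exact code.
class AwaitFirstMessageSketch {
  private final CountDownLatch firstMessageReceived = new CountDownLatch(1);
  private final Stopwatch elapsedTimer = Stopwatch.createStarted();
  private final long timeoutInMilliseconds;              // 0 means "no timeout set"

  AwaitFirstMessageSketch(long timeoutInMilliseconds) {
    this.timeoutInMilliseconds = timeoutInMilliseconds;
  }

  void awaitFirstMessage() throws InterruptedException, SQLTimeoutException {
    if (timeoutInMilliseconds > 0) {
      // Wait only for the part of the budget the query has not already used.
      long remaining = timeoutInMilliseconds - elapsedTimer.elapsed(TimeUnit.MILLISECONDS);
      if (remaining <= 0
          || !firstMessageReceived.await(remaining, TimeUnit.MILLISECONDS)) {
        throw new SQLTimeoutException("Query timed out after " + timeoutInMilliseconds + " ms");
      }
    } else {
      firstMessageReceived.await();                       // previous behaviour: block indefinitely
    }
  }
}
```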
[jira] [Commented] (DRILL-5717) change some date time unit cases with specific timezone or Local
[ https://issues.apache.org/jira/browse/DRILL-5717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243155#comment-16243155 ] ASF GitHub Bot commented on DRILL-5717: --- Github user weijietong commented on the issue: https://github.com/apache/drill/pull/904 Applied the review comments. > change some date time unit cases with specific timezone or Local > > > Key: DRILL-5717 > URL: https://issues.apache.org/jira/browse/DRILL-5717 > Project: Apache Drill > Issue Type: Bug > Components: Tools, Build & Test >Affects Versions: 1.9.0, 1.11.0 >Reporter: weijie.tong > > Some date time test cases, such as JodaDateValidatorTest, are not Locale > independent. This causes the test phase to fail for users in other Locales. We > should make these test cases Locale-independent. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243153#comment-16243153 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149545534 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -239,6 +261,11 @@ QueryDataBatch getNext() throws UserException, InterruptedException { } return qdb; } + + // Check and throw SQLTimeoutException + if ( parent.timeoutInMilliseconds > 0 && parent.elapsedTimer.elapsed(TimeUnit.SECONDS) >= parent.timeoutInMilliseconds ) { --- End diff -- Darn! +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
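The unit mismatch being acknowledged above (seconds read from the stopwatch compared against a millisecond budget) is fixed by comparing like units. A small sketch with assumed accessor names, not the PR's final code:

```java
import java.sql.SQLTimeoutException;
import java.util.concurrent.TimeUnit;
import com.google.common.base.Stopwatch;

// Sketch of the corrected check with assumed names: both sides of the
// comparison are now in milliseconds.
final class TimeoutCheckSketch {
  static void throwIfTimedOut(Stopwatch elapsedTimer, long timeoutInMilliseconds)
      throws SQLTimeoutException {
    if (timeoutInMilliseconds > 0
        && elapsedTimer.elapsed(TimeUnit.MILLISECONDS) >= timeoutInMilliseconds) {
      throw new SQLTimeoutException("Query timed out after " + timeoutInMilliseconds + " ms");
    }
  }
}
```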
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243106#comment-16243106 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on the issue: https://github.com/apache/drill/pull/1024 @laurentgo Done the changes... ready for review. > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243134#comment-16243134 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149542720 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -260,6 +288,10 @@ void close() { // when the main thread is blocked waiting for the result. In that case // we want to unblock the main thread. firstMessageReceived.countDown(); // TODO: Why not call releaseIfFirst as used elsewhere? + //Stopping timeout clock --- End diff -- since we are closing, do we need to care about the stopwatch? > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243163#comment-16243163 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149546431 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -376,6 +417,19 @@ synchronized void cleanup() { currentBatchHolder.clear(); } + long getTimeoutInMilliseconds() { +return timeoutInMilliseconds; + } + + //Set the cursor's timeout in seconds + void setTimeout(int timeoutDurationInSeconds){ +this.timeoutInMilliseconds = timeoutDurationInSeconds*1000L; --- End diff -- +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243183#comment-16243183 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149548337 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java --- @@ -96,6 +105,14 @@ private void throwIfClosed() throws AlreadyClosedSqlException, throw new AlreadyClosedSqlException( "ResultSet is already closed." ); } } + +//Implicit check for whether timeout is set +if (elapsedTimer != null) { --- End diff -- yes, pausing before execute would totally work! After execute, likely not since injection is done when query is executed on the server side. > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243154#comment-16243154 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149545640 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -260,6 +288,10 @@ void close() { // when the main thread is blocked waiting for the result. In that case // we want to unblock the main thread. firstMessageReceived.countDown(); // TODO: Why not call releaseIfFirst as used elsewhere? + //Stopping timeout clock --- End diff -- Just wrapping up any 'running' resources. > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5943) Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism
[ https://issues.apache.org/jira/browse/DRILL-5943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sorabh Hamirwasia updated DRILL-5943: - Reviewer: Parth Chandra > Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism > --- > > Key: DRILL-5943 > URL: https://issues.apache.org/jira/browse/DRILL-5943 > Project: Apache Drill > Issue Type: Improvement >Reporter: Sorabh Hamirwasia >Assignee: Sorabh Hamirwasia > Fix For: 1.12.0 > > > For the PLAIN mechanism we will weaken the strong check introduced with > DRILL-5582 to keep forward compatibility between a Drill 1.12 client and a > Drill 1.9 server. This is fine since, with or without this strong check, the PLAIN > mechanism is still vulnerable to MITM during the handshake itself, unlike mutual > authentication protocols such as Kerberos. > Also, to keep forward compatibility with respect to SASL, we will treat > UNKNOWN_SASL_SUPPORT as a valid value. For a handshake message received from a > client running on a later version (let's say 1.13) than the Drillbit (1.12) > and carrying a new value for the SaslSupport field that is unknown to the server, this > field will be decoded as UNKNOWN_SASL_SUPPORT. In this scenario the client will > be treated as one that is aware of the SASL protocol, but the server doesn't know the exact > capabilities of the client. Hence the SASL handshake will still be required from > the server side. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5923) State of a successfully completed query shown as "COMPLETED"
[ https://issues.apache.org/jira/browse/DRILL-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243210#comment-16243210 ] ASF GitHub Bot commented on DRILL-5923: --- Github user paul-rogers commented on the issue: https://github.com/apache/drill/pull/1021 @arina-ielchiieva, @prasadns14, here is my two cents. The names and numbers used in the protobuf definitions are part of Drill's public network API. This API is not versioned, so we can't really change it. If we changed the names, then, say, C-code or Java code that expects the old names will break. Being part of the public API, that code may not even be in the Drill source tree; perhaps someone has generated, say, a Python binding. So, can't change the public API. For purely aesthetic reasons, the contributor wishes to change the message displayed in the UI. This is purely a UI decision (the user is not expected to map the display names to the Protobuf enums.) And, the display name is subject to change. Maybe other UIs want to use other names. Maybe we want to show icons, or abbreviate the names ("Fail", "OK", etc.) And, of course, what if the display name should have spaces other characters: "In Progress", "In Queue" or "Didn't Work!". Can't put those in enum names. You get the idea. For this reason, the mapping from enum values to display names should be part of the UI, not the network protocol definition. The present change provides a UI-specific mapping from API Protobuf enum values to display strings, which seems like a good idea. So, the key questions are: * Should we use display strings other than the Protobuf constants (seems a good idea.) * Should we do the mapping in Java or in Freemarker? (Java seems simpler.) Thoughts? > State of a successfully completed query shown as "COMPLETED" > > > Key: DRILL-5923 > URL: https://issues.apache.org/jira/browse/DRILL-5923 > Project: Apache Drill > Issue Type: Bug > Components: Client - HTTP >Affects Versions: 1.11.0 >Reporter: Prasad Nagaraj Subramanya >Assignee: Prasad Nagaraj Subramanya > Fix For: 1.12.0 > > > Drill UI currently lists a successfully completed query as "COMPLETED". > Successfully completed, failed and canceled queries are all grouped as > Completed queries. > It would be better to list the state of a successfully completed query as > "Succeeded" to avoid confusion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
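The UI-side mapping being proposed can be sketched as below. The enum here only mirrors a subset of the real QueryState protobuf values, and the display strings are examples rather than the PR's final choices; the point is that the Protobuf definition (public API) stays untouched while the web UI owns the translation.

```java
import java.util.EnumMap;
import java.util.Map;

// Minimal sketch of a UI-side mapping: the Protobuf enum stays untouched and
// only the web UI translates it to a display string. The enum constants here
// mirror a subset of QueryState and are assumptions for illustration.
final class QueryStateDisplaySketch {
  enum QueryState { STARTING, RUNNING, COMPLETED, CANCELED, FAILED, ENQUEUED }

  private static final Map<QueryState, String> DISPLAY_NAMES = new EnumMap<>(QueryState.class);
  static {
    DISPLAY_NAMES.put(QueryState.COMPLETED, "Succeeded");   // aesthetic rename, UI only
    DISPLAY_NAMES.put(QueryState.ENQUEUED, "In Queue");     // display names may contain spaces
    DISPLAY_NAMES.put(QueryState.RUNNING, "In Progress");
  }

  static String displayName(QueryState state) {
    // Fall back to the raw enum name for states that were not renamed.
    return DISPLAY_NAMES.getOrDefault(state, state.name());
  }
}
```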
[jira] [Commented] (DRILL-5899) Simple pattern matchers can work with DrillBuf directly
[ https://issues.apache.org/jira/browse/DRILL-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243315#comment-16243315 ] ASF GitHub Bot commented on DRILL-5899: --- Github user ppadma commented on the issue: https://github.com/apache/drill/pull/1015 @paul-rogers updated with latest review comments taken care of. Please take a look. > Simple pattern matchers can work with DrillBuf directly > --- > > Key: DRILL-5899 > URL: https://issues.apache.org/jira/browse/DRILL-5899 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Flow >Reporter: Padma Penumarthy >Assignee: Padma Penumarthy >Priority: Critical > > For the 4 simple patterns we have i.e. startsWith, endsWith, contains and > constant,, we do not need the overhead of charSequenceWrapper. We can work > with DrillBuf directly. This will save us from doing isAscii check and UTF8 > decoding for each row. > UTF-8 encoding ensures that no UTF-8 character is a prefix of any other valid > character. So, instead of decoding varChar from each row we are processing, > encode the patternString once during setup and do raw byte comparison. > Instead of bounds checking and reading one byte at a time, we get the whole > buffer in one shot and use that for comparison. > This improved overall performance for filter operator by around 20%. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
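The idea in the issue description, encoding the pattern once at setup and then comparing raw UTF-8 bytes per row, can be illustrated without DrillBuf as follows. This is a plain-array sketch, not the patch itself; it relies on the property quoted above that no valid UTF-8 encoding is a prefix of another character's encoding, so a byte-level match cannot cut through a multi-byte character.

```java
import java.nio.charset.StandardCharsets;

// Plain-array illustration of the approach described above (not Drill's
// DrillBuf-based code): encode the pattern to UTF-8 once, then match each row
// by raw byte comparison, with no per-row decoding.
final class RawByteContainsSketch {
  private final byte[] patternBytes;

  RawByteContainsSketch(String pattern) {
    this.patternBytes = pattern.getBytes(StandardCharsets.UTF_8);  // encoded once at setup
  }

  boolean contains(byte[] row, int start, int end) {
    int length = end - start;
    for (int i = 0; i <= length - patternBytes.length; i++) {
      int j = 0;
      while (j < patternBytes.length && row[start + i + j] == patternBytes[j]) {
        j++;
      }
      if (j == patternBytes.length) {
        return true;                                               // full pattern matched
      }
    }
    return false;
  }
}
```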
[jira] [Commented] (DRILL-4286) Have an ability to put server in quiescent mode of operation
[ https://issues.apache.org/jira/browse/DRILL-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243167#comment-16243167 ] ASF GitHub Bot commented on DRILL-4286: --- Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r149542196 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java --- @@ -348,6 +354,21 @@ public void run() { */ } + /* +Check if the foreman is ONLINE. If not dont accept any new queries. + */ + public void checkForemanState() throws ForemanException{ +DrillbitEndpoint foreman = drillbitContext.getEndpoint(); +Collection dbs = drillbitContext.getAvailableBits(); --- End diff -- Maybe add it to the DrillbitContext class ? > Have an ability to put server in quiescent mode of operation > > > Key: DRILL-4286 > URL: https://issues.apache.org/jira/browse/DRILL-4286 > Project: Apache Drill > Issue Type: New Feature > Components: Execution - Flow >Reporter: Victoria Markman >Assignee: Venkata Jyothsna Donapati > > I think drill will benefit from mode of operation that is called "quiescent" > in some databases. > From IBM Informix server documentation: > {code} > Change gracefully from online to quiescent mode > Take the database server gracefully from online mode to quiescent mode to > restrict access to the database server without interrupting current > processing. After you perform this task, the database server sets a flag that > prevents new sessions from gaining access to the database server. The current > sessions are allowed to finish processing. After you initiate the mode > change, it cannot be canceled. During the mode change from online to > quiescent, the database server is considered to be in Shutdown mode. > {code} > This is different from shutdown, when processes are terminated. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
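One possible shape for the suggestion above, moving the "is this Drillbit online?" check into DrillbitContext so Foreman does not scan the endpoint collection itself, is sketched below. The method name is an assumption and the endpoint comparison is simplified; it is not code from the PR.

```java
// Hypothetical helper on DrillbitContext (the name isForemanOnline is an
// assumption): lets Foreman ask the context instead of iterating itself.
public boolean isForemanOnline() {
  DrillbitEndpoint self = getEndpoint();
  for (DrillbitEndpoint online : getClusterCoordinator().getOnlineEndPoints()) {
    // Compare by address and user port; good enough for this sketch.
    if (online.getAddress().equals(self.getAddress())
        && online.getUserPort() == self.getUserPort()) {
      return true;
    }
  }
  return false;
}
```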
[jira] [Commented] (DRILL-4286) Have an ability to put server in quiescent mode of operation
[ https://issues.apache.org/jira/browse/DRILL-4286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243168#comment-16243168 ] ASF GitHub Bot commented on DRILL-4286: --- Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r149544267 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/TestGracefulShutdown.java --- @@ -0,0 +1,323 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.test; + +import ch.qos.logback.classic.Level; +import org.apache.commons.io.FileUtils; +import org.apache.drill.exec.ExecConstants; +import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint; +import org.apache.drill.exec.proto.UserBitShared.QueryResult.QueryState; +import org.apache.drill.exec.server.Drillbit; +import org.junit.AfterClass; +import org.junit.Assert; +import org.junit.BeforeClass; +import org.junit.Test; +import org.omg.PortableServer.THREAD_POLICY_ID; + +import java.io.File; +import java.io.FileWriter; +import java.io.IOException; +import java.io.PrintWriter; +import java.net.HttpURLConnection; +import java.net.URL; +import java.util.Collection; +import java.util.Properties; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotEquals; +import static org.junit.Assert.fail; + +public class TestGracefulShutdown { + + @BeforeClass + public static void setUpTestData() { +for( int i = 0; i < 1000; i++) { + setupFile(i); +} + } + + + public static final Properties WEBSERVER_CONFIGURATION = new Properties() { +{ + put(ExecConstants.HTTP_ENABLE, true); +} + }; + + public FixtureBuilder enableWebServer(FixtureBuilder builder) { +Properties props = new Properties(); +props.putAll(WEBSERVER_CONFIGURATION); +builder.configBuilder.configProps(props); +return builder; + } + + + /* + Start multiple drillbits and then shutdown a drillbit. Query the online + endpoints and check if the drillbit still exists. 
+ */ + @Test + public void testOnlineEndPoints() throws Exception { + +String[] drillbits = {"db1" ,"db2","db3", "db4", "db5", "db6"}; +FixtureBuilder builder = ClusterFixture.builder().withBits(drillbits).withLocalZk(); + + +try ( ClusterFixture cluster = builder.build(); + ClientFixture client = cluster.clientFixture()) { + + Drillbit drillbit = cluster.drillbit("db2"); + DrillbitEndpoint drillbitEndpoint = drillbit.getRegistrationHandle().getEndPoint(); + int grace_period = drillbit.getContext().getConfig().getInt("drill.exec.grace_period"); + new Thread(new Runnable() { +public void run() { + try { +cluster.close_drillbit("db2"); + } catch (Exception e) { +e.printStackTrace(); + } +} + }).start(); + //wait for graceperiod + Thread.sleep(grace_period); + Collection drillbitEndpoints = cluster.drillbit().getContext() + .getClusterCoordinator() + .getOnlineEndPoints(); + Assert.assertFalse(drillbitEndpoints.contains(drillbitEndpoint)); +} + } + /* +Test if the drillbit transitions from ONLINE state when a shutdown +request is initiated + */ + @Test + public void testStateChange() throws Exception { + +String[] drillbits = {"db1" ,"db2", "db3", "db4", "db5", "db6"}; +FixtureBuilder builder = ClusterFixture.builder().withBits(drillbits).withLocalZk(); + +try ( ClusterFixture cluster = builder.build(); + ClientFixture client = cluster.clientFixture()) { + Drillbit drillbit = cluster.drillbit("db2"); +
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243175#comment-16243175 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149547712 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java --- @@ -96,6 +105,14 @@ private void throwIfClosed() throws AlreadyClosedSqlException, throw new AlreadyClosedSqlException( "ResultSet is already closed." ); } } + +//Implicit check for whether timeout is set +if (elapsedTimer != null) { --- End diff -- I was originally wondering as to when should we trigger the countdown on the timer. Creating a `[Prepared]Statement` object should not be the basis for the starting the clock, but only when you actually call execute(). The `DrillCursor` is initialized in this method and is what starts the clock. I could create a clone of the `testTriggeredQueryTimeout` method and simply have the client pause after `execute()` but before fetching the `ResultSet` instance or invoking `ResultSet.next()` . Would that work ? > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243199#comment-16243199 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149550602 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -260,6 +288,10 @@ void close() { // when the main thread is blocked waiting for the result. In that case // we want to unblock the main thread. firstMessageReceived.countDown(); // TODO: Why not call releaseIfFirst as used elsewhere? + //Stopping timeout clock --- End diff -- Ok. Guess we'll do away with it. +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5917) Ban org.json:json library in Drill
[ https://issues.apache.org/jira/browse/DRILL-5917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Rozov updated DRILL-5917: -- Summary: Ban org.json:json library in Drill (was: Ban json.org library in Drill) > Ban org.json:json library in Drill > -- > > Key: DRILL-5917 > URL: https://issues.apache.org/jira/browse/DRILL-5917 > Project: Apache Drill > Issue Type: Task >Affects Versions: 1.11.0 >Reporter: Arina Ielchiieva >Assignee: Vlad Rozov > Fix For: 1.12.0 > > > Apache Drill has dependencies on json.org lib indirectly from two libraries: > com.mapr.hadoop:maprfs:jar:5.2.1-mapr > com.mapr.fs:mapr-hbase:jar:5.2.1-mapr > {noformat} > [INFO] org.apache.drill.contrib:drill-format-mapr:jar:1.12.0-SNAPSHOT > [INFO] +- com.mapr.hadoop:maprfs:jar:5.2.1-mapr:compile > [INFO] | \- org.json:json:jar:20080701:compile > [INFO] \- com.mapr.fs:mapr-hbase:jar:5.2.1-mapr:compile > [INFO]\- (org.json:json:jar:20080701:compile - omitted for duplicate) > {noformat} > Need to make sure we won't have any dependencies from these libs to json.org > lib and ban this lib in main pom.xml file. > Issue is critical since Apache release won't happen until we make sure > json.org lib is not used (https://www.apache.org/legal/resolved.html). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5943) Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism
[ https://issues.apache.org/jira/browse/DRILL-5943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243208#comment-16243208 ] ASF GitHub Bot commented on DRILL-5943: --- GitHub user sohami opened a pull request: https://github.com/apache/drill/pull/1028 DRILL-5943: Avoid the strong check introduced by DRILL-5582 for PLAIN… … mechanism You can merge this pull request into a Git repository by running: $ git pull https://github.com/sohami/drill DRILL-5943 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1028.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1028 commit 708dbc203b63700fb520445e585826a5c1e911e4 Author: Sorabh Hamirwasia Date: 2017-11-07T23:27:45Z DRILL-5943: Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism > Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism > --- > > Key: DRILL-5943 > URL: https://issues.apache.org/jira/browse/DRILL-5943 > Project: Apache Drill > Issue Type: Improvement >Reporter: Sorabh Hamirwasia >Assignee: Sorabh Hamirwasia > Fix For: 1.12.0 > > > For the PLAIN mechanism we will weaken the strong check introduced with > DRILL-5582 to keep forward compatibility between a Drill 1.12 client and a > Drill 1.9 server. This is fine since, with or without this strong check, the PLAIN > mechanism is still vulnerable to MITM during the handshake itself, unlike mutual > authentication protocols such as Kerberos. > Also, to keep forward compatibility with respect to SASL, we will treat > UNKNOWN_SASL_SUPPORT as a valid value. For a handshake message received from a > client running on a later version (let's say 1.13) than the Drillbit (1.12) > and carrying a new value for the SaslSupport field that is unknown to the server, this > field will be decoded as UNKNOWN_SASL_SUPPORT. In this scenario the client will > be treated as one that is aware of the SASL protocol, but the server doesn't know the exact > capabilities of the client. Hence the SASL handshake will still be required from > the server side. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5943) Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism
[ https://issues.apache.org/jira/browse/DRILL-5943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243211#comment-16243211 ] ASF GitHub Bot commented on DRILL-5943: --- Github user sohami commented on the issue: https://github.com/apache/drill/pull/1028 @parthchandra & @laurentgo - Please help review this PR. > Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism > --- > > Key: DRILL-5943 > URL: https://issues.apache.org/jira/browse/DRILL-5943 > Project: Apache Drill > Issue Type: Improvement >Reporter: Sorabh Hamirwasia >Assignee: Sorabh Hamirwasia > Fix For: 1.12.0 > > > For the PLAIN mechanism we will weaken the strong check introduced with > DRILL-5582 to keep forward compatibility between a Drill 1.12 client and a > Drill 1.9 server. This is fine since, with or without this strong check, the PLAIN > mechanism is still vulnerable to MITM during the handshake itself, unlike mutual > authentication protocols such as Kerberos. > Also, to keep forward compatibility with respect to SASL, we will treat > UNKNOWN_SASL_SUPPORT as a valid value. For a handshake message received from a > client running on a later version (let's say 1.13) than the Drillbit (1.12) > and carrying a new value for the SaslSupport field that is unknown to the server, this > field will be decoded as UNKNOWN_SASL_SUPPORT. In this scenario the client will > be treated as one that is aware of the SASL protocol, but the server doesn't know the exact > capabilities of the client. Hence the SASL handshake will still be required from > the server side. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242897#comment-16242897 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149506412 --- Diff: exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java --- @@ -237,6 +245,127 @@ public String toString() { } } + /** + * Test for reading of default query timeout + */ + @Test + public void testDefaultGetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +int timeoutValue = stmt.getQueryTimeout(); +assert( 0 == timeoutValue ); + } + + /** + * Test Invalid parameter by giving negative timeout + */ + @Test ( expected = InvalidParameterSqlException.class ) + public void testInvalidSetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +//Setting negative value +int valueToSet = -10; +if (0L == valueToSet) { + valueToSet--; +} +try { + stmt.setQueryTimeout(valueToSet); +} catch ( final Exception e) { + // TODO: handle exception + assertThat( e.getMessage(), containsString( "illegal timeout value") ); + //Converting this to match expected Exception + throw new InvalidParameterSqlException(e.getMessage()); --- End diff -- Yep. Agree. Was trying to make use of the large number of `###SqlException`s defined within the Drill JDBC package. Will fix this. +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
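A possible reshaping of the negative-timeout test discussed above, asserting on the thrown SQLException directly instead of re-wrapping it in a different exception type. This is a sketch reusing the `connection` and `SYS_VERSION_SQL` fixtures visible in the diff, not the final test, and it assumes the driver reports the illegal value via a plain SQLException.

```java
// Sketch only: expects the JDBC driver to reject a negative timeout with an
// SQLException whose message mentions the illegal value. Lives inside a test
// class like PreparedStatementTest, which supplies `connection` and
// SYS_VERSION_SQL, plus the usual JUnit/Hamcrest static imports.
@Test
public void testInvalidSetQueryTimeout() throws SQLException {
  try (PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL)) {
    try {
      stmt.setQueryTimeout(-10);
      fail("Expected an exception for a negative query timeout");
    } catch (SQLException e) {
      assertThat(e.getMessage(), containsString("illegal timeout value"));
    }
  }
}
```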
[jira] [Updated] (DRILL-5943) Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism
[ https://issues.apache.org/jira/browse/DRILL-5943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sorabh Hamirwasia updated DRILL-5943: - Fix Version/s: 1.12.0 > Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism > --- > > Key: DRILL-5943 > URL: https://issues.apache.org/jira/browse/DRILL-5943 > Project: Apache Drill > Issue Type: Improvement >Reporter: Sorabh Hamirwasia >Assignee: Sorabh Hamirwasia > Fix For: 1.12.0 > > > For the PLAIN mechanism we will weaken the strong check introduced with > DRILL-5582 to keep forward compatibility between a Drill 1.12 client and a > Drill 1.9 server. This is fine since, with or without this strong check, the PLAIN > mechanism is still vulnerable to MITM during the handshake itself, unlike mutual > authentication protocols such as Kerberos. > Also, to keep forward compatibility with respect to SASL, we will treat > UNKNOWN_SASL_SUPPORT as a valid value. For a handshake message received from a > client running on a later version (let's say 1.13) than the Drillbit (1.12) > and carrying a new value for the SaslSupport field that is unknown to the server, this > field will be decoded as UNKNOWN_SASL_SUPPORT. In this scenario the client will > be treated as one that is aware of the SASL protocol, but the server doesn't know the exact > capabilities of the client. Hence the SASL handshake will still be required from > the server side. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5899) Simple pattern matchers can work with DrillBuf directly
[ https://issues.apache.org/jira/browse/DRILL-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243238#comment-16243238 ] ASF GitHub Bot commented on DRILL-5899: --- Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1015#discussion_r149554195 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/SqlPatternContainsMatcher.java --- @@ -17,37 +17,48 @@ */ package org.apache.drill.exec.expr.fn.impl; -public class SqlPatternContainsMatcher implements SqlPatternMatcher { - final String patternString; - CharSequence charSequenceWrapper; - final int patternLength; - - public SqlPatternContainsMatcher(String patternString, CharSequence charSequenceWrapper) { -this.patternString = patternString; -this.charSequenceWrapper = charSequenceWrapper; -patternLength = patternString.length(); +import io.netty.buffer.DrillBuf; + +public class SqlPatternContainsMatcher extends AbstractSqlPatternMatcher { + + public SqlPatternContainsMatcher(String patternString) { +super(patternString); } @Override - public int match() { -final int txtLength = charSequenceWrapper.length(); -int patternIndex = 0; -int txtIndex = 0; - -// simplePattern string has meta characters i.e % and _ and escape characters removed. -// so, we can just directly compare. -while (patternIndex < patternLength && txtIndex < txtLength) { - if (patternString.charAt(patternIndex) != charSequenceWrapper.charAt(txtIndex)) { -// Go back if there is no match -txtIndex = txtIndex - patternIndex; -patternIndex = 0; - } else { -patternIndex++; + public int match(int start, int end, DrillBuf drillBuf) { + +if (patternLength == 0) { // Everything should match for null pattern string + return 1; +} + +final int txtLength = end - start; + +// no match if input string length is less than pattern length +if (txtLength < patternLength) { + return 0; +} + +outer: +for (int txtIndex = 0; txtIndex < txtLength; txtIndex++) { + + // boundary check + if (txtIndex + patternLength > txtLength) { --- End diff -- Better: ``` int end = txtLength - patternLength; for (int txtIndex = 0; txtIndex < end; txtIndex++) { ``` And omit the boundary check on every iteration. That is, no reason to iterate past the last possible match, then use an if-statement to shorten the loop. Just shorten the loop. > Simple pattern matchers can work with DrillBuf directly > --- > > Key: DRILL-5899 > URL: https://issues.apache.org/jira/browse/DRILL-5899 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Flow >Reporter: Padma Penumarthy >Assignee: Padma Penumarthy >Priority: Critical > > For the 4 simple patterns we have i.e. startsWith, endsWith, contains and > constant,, we do not need the overhead of charSequenceWrapper. We can work > with DrillBuf directly. This will save us from doing isAscii check and UTF8 > decoding for each row. > UTF-8 encoding ensures that no UTF-8 character is a prefix of any other valid > character. So, instead of decoding varChar from each row we are processing, > encode the patternString once during setup and do raw byte comparison. > Instead of bounds checking and reading one byte at a time, we get the whole > buffer in one shot and use that for comparison. > This improved overall performance for filter operator by around 20%. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
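Applied to the match(int start, int end, DrillBuf drillBuf) body under review, the restructuring suggested above might look like the sketch below, with the bound made inclusive so the last feasible start position is still compared. patternLength and patternByteBuffer are assumed to be the fields set up in the abstract base class; this is not the PR's final code.

```java
// Sketch of the hoisted loop bound; txtLength = end - start as in the diff.
final int lastStart = txtLength - patternLength;   // last feasible start position
outer:
for (int txtIndex = 0; txtIndex <= lastStart; txtIndex++) {
  for (int patternIndex = 0; patternIndex < patternLength; patternIndex++) {
    if (patternByteBuffer.get(patternIndex) != drillBuf.getByte(start + txtIndex + patternIndex)) {
      continue outer;                              // mismatch: advance the start position
    }
  }
  return 1;                                        // every pattern byte matched
}
return 0;
```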
[jira] [Commented] (DRILL-5899) Simple pattern matchers can work with DrillBuf directly
[ https://issues.apache.org/jira/browse/DRILL-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243237#comment-16243237 ] ASF GitHub Bot commented on DRILL-5899: --- Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1015#discussion_r149552453 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/AbstractSqlPatternMatcher.java --- @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.expr.fn.impl; + +import com.google.common.base.Charsets; +import org.apache.drill.common.exceptions.UserException; +import java.nio.ByteBuffer; +import java.nio.CharBuffer; +import java.nio.charset.CharacterCodingException; +import java.nio.charset.CharsetEncoder; +import static org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.logger; + +// To get good performance for most commonly used pattern matches --- End diff -- Javadoc? ``` /** * This is a Javadoc comment and appears in generated documentation. */ // This is a plain comment and does not appear in documentation. ``` > Simple pattern matchers can work with DrillBuf directly > --- > > Key: DRILL-5899 > URL: https://issues.apache.org/jira/browse/DRILL-5899 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Flow >Reporter: Padma Penumarthy >Assignee: Padma Penumarthy >Priority: Critical > > For the 4 simple patterns we have i.e. startsWith, endsWith, contains and > constant,, we do not need the overhead of charSequenceWrapper. We can work > with DrillBuf directly. This will save us from doing isAscii check and UTF8 > decoding for each row. > UTF-8 encoding ensures that no UTF-8 character is a prefix of any other valid > character. So, instead of decoding varChar from each row we are processing, > encode the patternString once during setup and do raw byte comparison. > Instead of bounds checking and reading one byte at a time, we get the whole > buffer in one shot and use that for comparison. > This improved overall performance for filter operator by around 20%. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5899) Simple pattern matchers can work with DrillBuf directly
[ https://issues.apache.org/jira/browse/DRILL-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243235#comment-16243235 ] ASF GitHub Bot commented on DRILL-5899: --- Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1015#discussion_r149554356 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/SqlPatternEndsWithMatcher.java --- @@ -17,33 +17,30 @@ */ package org.apache.drill.exec.expr.fn.impl; -public class SqlPatternEndsWithMatcher implements SqlPatternMatcher { - final String patternString; - CharSequence charSequenceWrapper; - final int patternLength; - - public SqlPatternEndsWithMatcher(String patternString, CharSequence charSequenceWrapper) { -this.charSequenceWrapper = charSequenceWrapper; -this.patternString = patternString; -this.patternLength = patternString.length(); +import io.netty.buffer.DrillBuf; + +public class SqlPatternEndsWithMatcher extends AbstractSqlPatternMatcher { + + public SqlPatternEndsWithMatcher(String patternString) { +super(patternString); } @Override - public int match() { -int txtIndex = charSequenceWrapper.length(); -int patternIndex = patternLength; -boolean matchFound = true; // if pattern is empty string, we always match. + public int match(int start, int end, DrillBuf drillBuf) { + +if ( (end - start) < patternLength) { // No match if input string length is less than pattern length. --- End diff -- `( (end - start)` --> `(end - start` > Simple pattern matchers can work with DrillBuf directly > --- > > Key: DRILL-5899 > URL: https://issues.apache.org/jira/browse/DRILL-5899 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Flow >Reporter: Padma Penumarthy >Assignee: Padma Penumarthy >Priority: Critical > > For the 4 simple patterns we have i.e. startsWith, endsWith, contains and > constant,, we do not need the overhead of charSequenceWrapper. We can work > with DrillBuf directly. This will save us from doing isAscii check and UTF8 > decoding for each row. > UTF-8 encoding ensures that no UTF-8 character is a prefix of any other valid > character. So, instead of decoding varChar from each row we are processing, > encode the patternString once during setup and do raw byte comparison. > Instead of bounds checking and reading one byte at a time, we get the whole > buffer in one shot and use that for comparison. > This improved overall performance for filter operator by around 20%. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5899) Simple pattern matchers can work with DrillBuf directly
[ https://issues.apache.org/jira/browse/DRILL-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243239#comment-16243239 ] ASF GitHub Bot commented on DRILL-5899: --- Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1015#discussion_r149555002 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/SqlPatternEndsWithMatcher.java --- @@ -17,33 +17,30 @@ */ package org.apache.drill.exec.expr.fn.impl; -public class SqlPatternEndsWithMatcher implements SqlPatternMatcher { - final String patternString; - CharSequence charSequenceWrapper; - final int patternLength; - - public SqlPatternEndsWithMatcher(String patternString, CharSequence charSequenceWrapper) { -this.charSequenceWrapper = charSequenceWrapper; -this.patternString = patternString; -this.patternLength = patternString.length(); +import io.netty.buffer.DrillBuf; + +public class SqlPatternEndsWithMatcher extends AbstractSqlPatternMatcher { + + public SqlPatternEndsWithMatcher(String patternString) { +super(patternString); } @Override - public int match() { -int txtIndex = charSequenceWrapper.length(); -int patternIndex = patternLength; -boolean matchFound = true; // if pattern is empty string, we always match. + public int match(int start, int end, DrillBuf drillBuf) { + +if ( (end - start) < patternLength) { // No match if input string length is less than pattern length. + return 0; +} // simplePattern string has meta characters i.e % and _ and escape characters removed. // so, we can just directly compare. -while (patternIndex > 0 && txtIndex > 0) { - if (charSequenceWrapper.charAt(--txtIndex) != patternString.charAt(--patternIndex)) { -matchFound = false; -break; +for (int index = 1; index <= patternLength; index++) { --- End diff -- ``` int txtStart = end - patternLength; if (txtStart < start) { return 0; } for (int index = 0; index < patternLength; index++) { ... patternByteBuffer.get(index) ... drillBuf.getByte(txtStart + index) ... ``` > Simple pattern matchers can work with DrillBuf directly > --- > > Key: DRILL-5899 > URL: https://issues.apache.org/jira/browse/DRILL-5899 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Flow >Reporter: Padma Penumarthy >Assignee: Padma Penumarthy >Priority: Critical > > For the 4 simple patterns we have i.e. startsWith, endsWith, contains and > constant,, we do not need the overhead of charSequenceWrapper. We can work > with DrillBuf directly. This will save us from doing isAscii check and UTF8 > decoding for each row. > UTF-8 encoding ensures that no UTF-8 character is a prefix of any other valid > character. So, instead of decoding varChar from each row we are processing, > encode the patternString once during setup and do raw byte comparison. > Instead of bounds checking and reading one byte at a time, we get the whole > buffer in one shot and use that for comparison. > This improved overall performance for filter operator by around 20%. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5899) Simple pattern matchers can work with DrillBuf directly
[ https://issues.apache.org/jira/browse/DRILL-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243236#comment-16243236 ] ASF GitHub Bot commented on DRILL-5899: --- Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1015#discussion_r149552506 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/AbstractSqlPatternMatcher.java --- @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.expr.fn.impl; + +import com.google.common.base.Charsets; +import org.apache.drill.common.exceptions.UserException; +import java.nio.ByteBuffer; +import java.nio.CharBuffer; +import java.nio.charset.CharacterCodingException; +import java.nio.charset.CharsetEncoder; +import static org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.logger; + +// To get good performance for most commonly used pattern matches +// i.e. CONSTANT('ABC'), STARTSWITH('%ABC'), ENDSWITH('ABC%') and CONTAINS('%ABC%'), +// we have simple pattern matchers. +// Idea is to have our own implementation for simple pattern matchers so we can +// avoid heavy weight regex processing, skip UTF-8 decoding and char conversion. +// Instead, we encode the pattern string and do byte comparison against native memory. +// Overall, this approach +// gives us orders of magnitude performance improvement for simple pattern matches. +// Anything that is not simple is considered +// complex pattern and we use Java regex for complex pattern matches. + +public abstract class AbstractSqlPatternMatcher implements SqlPatternMatcher { + final String patternString; --- End diff -- `protected final` > Simple pattern matchers can work with DrillBuf directly > --- > > Key: DRILL-5899 > URL: https://issues.apache.org/jira/browse/DRILL-5899 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Flow >Reporter: Padma Penumarthy >Assignee: Padma Penumarthy >Priority: Critical > > For the 4 simple patterns we have i.e. startsWith, endsWith, contains and > constant,, we do not need the overhead of charSequenceWrapper. We can work > with DrillBuf directly. This will save us from doing isAscii check and UTF8 > decoding for each row. > UTF-8 encoding ensures that no UTF-8 character is a prefix of any other valid > character. So, instead of decoding varChar from each row we are processing, > encode the patternString once during setup and do raw byte comparison. > Instead of bounds checking and reading one byte at a time, we get the whole > buffer in one shot and use that for comparison. > This improved overall performance for filter operator by around 20%. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
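A sketch of the "encode the pattern once during setup" idea described in the comment block above: the pattern string is converted to UTF-8 bytes a single time, so per-row work reduces to raw byte comparison. The class and field names here are illustrative, not Drill's, and the match signature takes a plain byte array instead of DrillBuf to stay self-contained.
{code}
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

abstract class PatternMatcherSketch {
  protected final String patternString;
  protected final ByteBuffer patternByteBuffer;
  protected final int patternLength;

  PatternMatcherSketch(String patternString) {
    this.patternString = patternString;
    final CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
    try {
      // Encode the pattern exactly once; match() can then index patternByteBuffer directly.
      this.patternByteBuffer = encoder.encode(CharBuffer.wrap(patternString));
    } catch (CharacterCodingException e) {
      throw new IllegalArgumentException("Cannot encode pattern: " + patternString, e);
    }
    this.patternLength = patternByteBuffer.limit();
  }

  /** Per-row entry point; concrete matchers compare bytes against the input buffer. */
  abstract int match(int start, int end, byte[] input);
}
{code}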
[jira] [Created] (DRILL-5943) Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism
Sorabh Hamirwasia created DRILL-5943: Summary: Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism Key: DRILL-5943 URL: https://issues.apache.org/jira/browse/DRILL-5943 Project: Apache Drill Issue Type: Improvement Reporter: Sorabh Hamirwasia Assignee: Sorabh Hamirwasia For PLAIN mechanism we will weaken the strong check introduced with DRILL-5582 to keep the forward compatibility between Drill 1.12 client and Drill 1.9 server. This is fine since with and without this strong check PLAIN mechanism is still vulnerable to MITM during handshake itself unlike mutual authentication protocols like Kerberos. Also for keeping forward compatibility with respect to SASL we will treat UNKNOWN_SASL_SUPPORT as valid value. For handshake message received from a client which is running on later version (let say 1.13) then Drillbit (1.12) and having a new value for SaslSupport field which is unknown to server, this field will be decoded as UNKNOWN_SASL_SUPPORT. In this scenario client will be treated as one aware about SASL protocol but server doesn't know exact capabilities of client. Hence the SASL handshake will still be required from server side. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
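A hedged sketch of the forward-compatibility rule described in this issue: a client whose SaslSupport value the server cannot decode is folded into UNKNOWN_SASL_SUPPORT and still put through the SASL handshake. The enum and method below are invented stand-ins for illustration only, not Drill's actual protobuf types or server logic.
{code}
public class SaslSupportSketch {

  enum SaslSupport { UNKNOWN_SASL_SUPPORT, SASL_AUTH }

  /** Decides whether the server must run the SASL handshake for this client. */
  static boolean requiresSaslHandshake(SaslSupport clientSupport, boolean authEnabled) {
    if (!authEnabled) {
      return false;                  // server not configured for authentication
    }
    switch (clientSupport) {
      case SASL_AUTH:
        return true;                 // client explicitly advertises SASL support
      case UNKNOWN_SASL_SUPPORT:
      default:
        // Newer client sent a value this server does not know: treat it as
        // SASL-aware with unknown capabilities, so still require the handshake.
        return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(requiresSaslHandshake(SaslSupport.UNKNOWN_SASL_SUPPORT, true)); // true
  }
}
{code}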
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243135#comment-16243135 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149543163 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -376,6 +417,19 @@ synchronized void cleanup() { currentBatchHolder.clear(); } + long getTimeoutInMilliseconds() { +return timeoutInMilliseconds; + } + + //Set the cursor's timeout in seconds + void setTimeout(int timeoutDurationInSeconds){ +this.timeoutInMilliseconds = timeoutDurationInSeconds*1000L; --- End diff -- Preferably use `TimeUnit.SECONDS.toMillis(timeoutDurationInSeconds)` to avoid magic constants > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
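A minimal illustration of the suggestion above: let TimeUnit do the unit conversion instead of multiplying by a 1000L magic constant. The field and method names mirror the diff, but the surrounding class is omitted, so this is only a sketch.
{code}
import java.util.concurrent.TimeUnit;

class TimeoutSketch {
  private long timeoutInMilliseconds;

  // Set the cursor's timeout in seconds, storing it internally in milliseconds.
  void setTimeout(int timeoutDurationInSeconds) {
    this.timeoutInMilliseconds = TimeUnit.SECONDS.toMillis(timeoutDurationInSeconds);
  }

  long getTimeoutInMilliseconds() {
    return timeoutInMilliseconds;
  }
}
{code}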
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243133#comment-16243133 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149543078 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -333,8 +368,14 @@ void close() { final int batchQueueThrottlingThreshold = client.getConfig().getInt( ExecConstants.JDBC_BATCH_QUEUE_THROTTLING_THRESHOLD ); -resultsListener = new ResultsListener(batchQueueThrottlingThreshold); +resultsListener = new ResultsListener(this, batchQueueThrottlingThreshold); currentBatchHolder = new RecordBatchLoader(client.getAllocator()); +try { + setTimeout(this.statement.getQueryTimeout()); +} catch (SQLException e) { + // Printing any unexpected SQLException stack trace + e.printStackTrace(); --- End diff -- two choices here: - we don't think it's important if we cannot get the value, so we should log it properly and not simply dump the exception - we think this is important, and we propagate the exception to the caller (I think it is important: the most likely reason why we could not get the value if that the statement was closed, and we should probably notify the user about it). > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243136#comment-16243136 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149542468 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -139,8 +147,22 @@ private boolean stopThrottlingIfSo() { return stopped; } -public void awaitFirstMessage() throws InterruptedException { - firstMessageReceived.await(); +public void awaitFirstMessage() throws InterruptedException, SQLTimeoutException { + //Check if a non-zero timeout has been set + if ( parent.timeoutInMilliseconds > 0 ) { +//Identifying remaining in milliseconds to maintain a granularity close to integer value of timeout +long timeToTimeout = (parent.timeoutInMilliseconds) - parent.elapsedTimer.elapsed(TimeUnit.MILLISECONDS); +if ( timeToTimeout > 0 ) { --- End diff -- maybe a style issue, but to avoid code duplication both conditions could be checked together? ``` if ( timeToTimeout <= 0 || !firstMessageReceived.await(timeToTimeout, TimeUnit.MILLISECONDS) ) { throw new SqlTimeoutException(TimeUnit.MILLISECONDS.toSeconds(parent.timeoutInMilliseconds)); } ``` > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
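A self-contained sketch of the combined timeout check proposed above: compute the remaining budget once and treat "budget already spent" and "await timed out" as the same failure. A CountDownLatch stands in for firstMessageReceived, the elapsed-time bookkeeping is simplified to System.currentTimeMillis(), and java.sql.SQLTimeoutException stands in for Drill's SqlTimeoutException.
{code}
import java.sql.SQLTimeoutException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class AwaitSketch {
  private final CountDownLatch firstMessageReceived = new CountDownLatch(1);
  private final long timeoutInMilliseconds;
  private final long startMillis = System.currentTimeMillis();

  AwaitSketch(long timeoutInMilliseconds) {
    this.timeoutInMilliseconds = timeoutInMilliseconds;
  }

  void awaitFirstMessage() throws InterruptedException, SQLTimeoutException {
    if (timeoutInMilliseconds > 0) {
      long timeToTimeout = timeoutInMilliseconds - (System.currentTimeMillis() - startMillis);
      // One condition covers both "no time left" and "waited and nothing arrived".
      if (timeToTimeout <= 0
          || !firstMessageReceived.await(timeToTimeout, TimeUnit.MILLISECONDS)) {
        throw new SQLTimeoutException("Query timed out after "
            + TimeUnit.MILLISECONDS.toSeconds(timeoutInMilliseconds) + " seconds");
      }
    } else {
      firstMessageReceived.await();   // no timeout configured: wait indefinitely
    }
  }
}
{code}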
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243137#comment-16243137 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149543581 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java --- @@ -96,6 +105,14 @@ private void throwIfClosed() throws AlreadyClosedSqlException, throw new AlreadyClosedSqlException( "ResultSet is already closed." ); } } + +//Implicit check for whether timeout is set +if (elapsedTimer != null) { --- End diff -- maybe an helper method from the cursor to see if we timed out instead of exposing elapsedTimer? I'm not sure if this is really necessary (posted another comment about it previously), except maybe because of unit tests where it's hard to time out inside the cursor? I did a prototype too and used control injection to pause the screen operator: the test would look like this: ``` /** * Test setting timeout for a query that actually times out */ @Test ( expected = SqlTimeoutException.class ) public void testTriggeredQueryTimeout() throws SQLException { // Prevent the server to complete the query to trigger a timeout final String controls = Controls.newBuilder() .addPause(ScreenCreator.class, "send-complete", 0) .build(); try(Statement statement = connection.createStatement()) { assertThat( statement.execute(String.format( "ALTER session SET `%s` = '%s'", ExecConstants.DRILLBIT_CONTROL_INJECTIONS, controls)), equalTo(true)); } String queryId = null; try(Statement statement = connection.createStatement()) { int timeoutDuration = 3; //Setting to a very low value (3sec) statement.setQueryTimeout(timeoutDuration); ResultSet rs = statement.executeQuery(SYS_VERSION_SQL); queryId = ((DrillResultSet) rs).getQueryId(); //Fetch rows while (rs.next()) { rs.getBytes(1); } } catch (SQLException sqlEx) { if (sqlEx instanceof SqlTimeoutException) { throw (SqlTimeoutException) sqlEx; } } finally { // Do not forget to unpause to avoid memory leak. if (queryId != null) { DrillClient drillClient = ((DrillConnection) connection).getClient(); drillClient.resumeQuery(QueryIdHelper.getQueryIdFromString(queryId)); } } ``` Works for PreparedStatementTest too, need to make sure you pause after prepared statement is created but before it is executed. > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243132#comment-16243132 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149542622 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -239,6 +261,11 @@ QueryDataBatch getNext() throws UserException, InterruptedException { } return qdb; } + + // Check and throw SQLTimeoutException + if ( parent.timeoutInMilliseconds > 0 && parent.elapsedTimer.elapsed(TimeUnit.SECONDS) >= parent.timeoutInMilliseconds ) { --- End diff -- wrong unit for the comparison (should be millis) > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
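A tiny sketch of the corrected check flagged above: the elapsed time must be read in the same unit as timeoutInMilliseconds before comparing. It uses Guava's Stopwatch, which the driver code in these diffs already relies on; the helper-method packaging is illustrative.
{code}
import java.util.concurrent.TimeUnit;
import com.google.common.base.Stopwatch;

class TimeoutCheckSketch {
  private final Stopwatch elapsedTimer = Stopwatch.createStarted();
  private final long timeoutInMilliseconds;

  TimeoutCheckSketch(long timeoutInMilliseconds) {
    this.timeoutInMilliseconds = timeoutInMilliseconds;
  }

  // Compare milliseconds to milliseconds; reading elapsed() in SECONDS here
  // would make the timeout fire far too late.
  boolean hasTimedOut() {
    return timeoutInMilliseconds > 0
        && elapsedTimer.elapsed(TimeUnit.MILLISECONDS) >= timeoutInMilliseconds;
  }
}
{code}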
[jira] [Commented] (DRILL-5899) Simple pattern matchers can work with DrillBuf directly
[ https://issues.apache.org/jira/browse/DRILL-5899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16243050#comment-16243050 ] ASF GitHub Bot commented on DRILL-5899: --- Github user ppadma commented on the issue: https://github.com/apache/drill/pull/1015 @paul-rogers Thanks a lot for the review. Updated the PR with code review comments. Please take a look. Overall, good improvement with this change. Here are the numbers. select count(*) from `/Users/ppenumarthy/MAPRTECH/padma/testdata` where l_comment like '%a' 1.4 sec vs 7 sec select count(*) from `/Users/ppenumarthy/MAPRTECH/padma/testdata` where l_comment like '%a%' 6.5 sec vs 13.5 sec select count(*) from `/Users/ppenumarthy/MAPRTECH/padma/testdata` where l_comment like 'a%' 1.4 sec vs 5.8 sec select count(*) from `/Users/ppenumarthy/MAPRTECH/padma/testdata` where l_comment like 'a' 1.1.65 sec vs 5.8 sec I think for "contains", improvement is not as much as others, probably because of nested for loops. @sachouche changes on top of these changes can improve further. > Simple pattern matchers can work with DrillBuf directly > --- > > Key: DRILL-5899 > URL: https://issues.apache.org/jira/browse/DRILL-5899 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Flow >Reporter: Padma Penumarthy >Assignee: Padma Penumarthy >Priority: Critical > > For the 4 simple patterns we have i.e. startsWith, endsWith, contains and > constant,, we do not need the overhead of charSequenceWrapper. We can work > with DrillBuf directly. This will save us from doing isAscii check and UTF8 > decoding for each row. > UTF-8 encoding ensures that no UTF-8 character is a prefix of any other valid > character. So, instead of decoding varChar from each row we are processing, > encode the patternString once during setup and do raw byte comparison. > Instead of bounds checking and reading one byte at a time, we get the whole > buffer in one shot and use that for comparison. > This improved overall performance for filter operator by around 20%. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242705#comment-16242705 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149477222 --- Diff: exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java --- @@ -237,6 +245,127 @@ public String toString() { } } + /** + * Test for reading of default query timeout + */ + @Test + public void testDefaultGetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +int timeoutValue = stmt.getQueryTimeout(); +assert( 0 == timeoutValue ); + } + + /** + * Test Invalid parameter by giving negative timeout + */ + @Test ( expected = InvalidParameterSqlException.class ) + public void testInvalidSetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +//Setting negative value +int valueToSet = -10; +if (0L == valueToSet) { + valueToSet--; +} +try { + stmt.setQueryTimeout(valueToSet); +} catch ( final Exception e) { + // TODO: handle exception + assertThat( e.getMessage(), containsString( "illegal timeout value") ); + //Converting this to match expected Exception + throw new InvalidParameterSqlException(e.getMessage()); --- End diff -- but you did since you catch the exception and do a check on the message. Rewrapping it so that the test framework can check the new type has no value. > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
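One way to address the review point above without rewrapping the exception: JUnit 4's ExpectedException rule asserts both the exception type and its message in one place. The connection field and query text below are placeholders for the fixtures the real PreparedStatementTest sets up, and the expected type assumes the driver surfaces the invalid value as a SQLException, as the message check in the original test suggests.
{code}
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.ExpectedException;

public class InvalidTimeoutTestSketch {

  private static final String SYS_VERSION_SQL = "SELECT * FROM sys.version"; // placeholder query
  private Connection connection;   // initialized by the test fixture in the real class

  @Rule
  public ExpectedException thrown = ExpectedException.none();

  @Test
  public void testInvalidSetQueryTimeout() throws SQLException {
    // Assert the type and message directly; no need to catch and rethrow a new type.
    thrown.expect(SQLException.class);
    thrown.expectMessage("illegal timeout value");
    try (PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL)) {
      stmt.setQueryTimeout(-10);   // a negative timeout must be rejected
    }
  }
}
{code}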
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242719#comment-16242719 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149477955 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -376,6 +415,19 @@ synchronized void cleanup() { currentBatchHolder.clear(); } + //Set the cursor's timeout in seconds --- End diff -- you just need to get the value when the query is executed (in DrillCursor) once to make sure the timeout doesn't change (that and StopWatch being managed by DrillCursor too. Also, it is subject to interpretation but it seems the intent of the API is to time bound how much time it takes the query to complete. I'm not sure it is necessary to make the extra work of having a slow client reading the result set data although all data has already been read by the driver from the server (and from the server point of view, the query is completed). > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5138) TopN operator on top of ~110 GB data set is very slow
[ https://issues.apache.org/jira/browse/DRILL-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242709#comment-16242709 ] Chun Chang commented on DRILL-5138: --- I ran the query against MapR Drill 1.11.0 and query returned in 81 seconds. [root@perfnode166 catalog_sales]# sqlline --maxWidth=1 -u "jdbc:drill:zk=10.10.30.166:5181" OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0 apache drill 1.11.0-mapr "drill baby drill" 0: jdbc:drill:zk=10.10.30.166:5181> select * from dfs.`/drill/testdata/tpcds_sf100/parquet/catalog_sales` order by cs_quantity, cs_wholesale_cost limit 1; +--+---+--+---++-++--++-+---+-++-++--+---+---+--++--+--+--+-+--+---+--+--+---+--+--+--+--++ | cs_bill_addr_sk | cs_bill_cdemo_sk | cs_bill_customer_sk | cs_bill_hdemo_sk | cs_call_center_sk | cs_catalog_page_sk | cs_coupon_amt | cs_ext_discount_amt | cs_ext_list_price | cs_ext_sales_price | cs_ext_ship_cost | cs_ext_tax | cs_ext_wholesale_cost | cs_item_sk | cs_list_price | cs_net_paid | cs_net_paid_inc_ship | cs_net_paid_inc_ship_tax | cs_net_paid_inc_tax | cs_net_profit | cs_order_number | cs_promo_sk | cs_quantity | cs_sales_price | cs_ship_addr_sk | cs_ship_cdemo_sk | cs_ship_customer_sk | cs_ship_date_sk | cs_ship_hdemo_sk | cs_ship_mode_sk | cs_sold_date_sk | cs_sold_time_sk | cs_warehouse_sk | cs_wholesale_cost | +--+---+--+---++-++--++-+---+-++-++--+---+---+--++--+--+--+-+--+---+--+--+---+--+--+--+--++ | 184649 | 555979| 1796891 | 1114 | 24 | 14393 | 0.00 | 0.02 | 1.82 | 1.80| 0.25 | 0.00 | 1.00 | 108618 | 1.82 | 1.80 | 2.05 | 2.05 | 1.80 | 0.80 | 15928478 | 540 | 1| 1.80| 184649 | 555979| 1796891 | 2452671 | 1114 | 9| 2452640 | 38871| 1| 1.00 | +--+---+--+---++-++--++-+---+-++-++--+---+---+--++--+--+--+-+--+---+--+--+---+--+--+--+--++ 1 row selected (81.577 seconds) > TopN operator on top of ~110 GB data set is very slow > - > > Key: DRILL-5138 > URL: https://issues.apache.org/jira/browse/DRILL-5138 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Reporter: Rahul Challapalli >Assignee: Timothy Farkas > > git.commit.id.abbrev=cf2b7c7 > No of cores : 23 > No of disks : 5 > DRILL_MAX_DIRECT_MEMORY="24G" > DRILL_MAX_HEAP="12G" > The below query ran for more than 4 hours and did not complete. The table is > ~110 GB > {code} > select * from catalog_sales order by cs_quantity, cs_wholesale_cost limit 1; > {code} > Physical Plan : > {code} > 00-00Screen : rowType = RecordType(ANY *): rowcount = 1.0, cumulative > cost = {1.00798629141E10 rows, 4.1
[jira] [Resolved] (DRILL-5138) TopN operator on top of ~110 GB data set is very slow
[ https://issues.apache.org/jira/browse/DRILL-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Farkas resolved DRILL-5138. --- Resolution: Fixed > TopN operator on top of ~110 GB data set is very slow > - > > Key: DRILL-5138 > URL: https://issues.apache.org/jira/browse/DRILL-5138 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Reporter: Rahul Challapalli >Assignee: Timothy Farkas > > git.commit.id.abbrev=cf2b7c7 > No of cores : 23 > No of disks : 5 > DRILL_MAX_DIRECT_MEMORY="24G" > DRILL_MAX_HEAP="12G" > The below query ran for more than 4 hours and did not complete. The table is > ~110 GB > {code} > select * from catalog_sales order by cs_quantity, cs_wholesale_cost limit 1; > {code} > Physical Plan : > {code} > 00-00Screen : rowType = RecordType(ANY *): rowcount = 1.0, cumulative > cost = {1.00798629141E10 rows, 4.17594320691E10 cpu, 0.0 io, > 4.1287118487552E13 network, 0.0 memory}, id = 352 > 00-01 Project(*=[$0]) : rowType = RecordType(ANY *): rowcount = 1.0, > cumulative cost = {1.0079862914E10 rows, 4.1759432069E10 cpu, 0.0 io, > 4.1287118487552E13 network, 0.0 memory}, id = 351 > 00-02Project(T0¦¦*=[$0]) : rowType = RecordType(ANY T0¦¦*): rowcount > = 1.0, cumulative cost = {1.0079862914E10 rows, 4.1759432069E10 cpu, 0.0 io, > 4.1287118487552E13 network, 0.0 memory}, id = 350 > 00-03 SelectionVectorRemover : rowType = RecordType(ANY T0¦¦*, ANY > cs_quantity, ANY cs_wholesale_cost): rowcount = 1.0, cumulative cost = > {1.0079862914E10 rows, 4.1759432069E10 cpu, 0.0 io, 4.1287118487552E13 > network, 0.0 memory}, id = 349 > 00-04Limit(fetch=[1]) : rowType = RecordType(ANY T0¦¦*, ANY > cs_quantity, ANY cs_wholesale_cost): rowcount = 1.0, cumulative cost = > {1.0079862913E10 rows, 4.1759432068E10 cpu, 0.0 io, 4.1287118487552E13 > network, 0.0 memory}, id = 348 > 00-05 SingleMergeExchange(sort0=[1 ASC], sort1=[2 ASC]) : > rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, ANY cs_wholesale_cost): > rowcount = 1.439980416E9, cumulative cost = {1.0079862912E10 rows, > 4.1759432064E10 cpu, 0.0 io, 4.1287118487552E13 network, 0.0 memory}, id = 347 > 01-01SelectionVectorRemover : rowType = RecordType(ANY T0¦¦*, > ANY cs_quantity, ANY cs_wholesale_cost): rowcount = 1.439980416E9, cumulative > cost = {8.639882496E9 rows, 3.0239588736E10 cpu, 0.0 io, 2.3592639135744E13 > network, 0.0 memory}, id = 346 > 01-02 TopN(limit=[1]) : rowType = RecordType(ANY T0¦¦*, ANY > cs_quantity, ANY cs_wholesale_cost): rowcount = 1.439980416E9, cumulative > cost = {7.19990208E9 rows, 2.879960832E10 cpu, 0.0 io, 2.3592639135744E13 > network, 0.0 memory}, id = 345 > 01-03Project(T0¦¦*=[$0], cs_quantity=[$1], > cs_wholesale_cost=[$2]) : rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, > ANY cs_wholesale_cost): rowcount = 1.439980416E9, cumulative cost = > {5.759921664E9 rows, 2.879960832E10 cpu, 0.0 io, 2.3592639135744E13 network, > 0.0 memory}, id = 344 > 01-04 HashToRandomExchange(dist0=[[$1]], dist1=[[$2]]) : > rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, ANY cs_wholesale_cost, ANY > E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 1.439980416E9, cumulative cost = > {5.759921664E9 rows, 2.879960832E10 cpu, 0.0 io, 2.3592639135744E13 network, > 0.0 memory}, id = 343 > 02-01UnorderedMuxExchange : rowType = RecordType(ANY > T0¦¦*, ANY cs_quantity, ANY cs_wholesale_cost, ANY > E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 1.439980416E9, cumulative cost = > {4.319941248E9 rows, 1.1519843328E10 cpu, 0.0 io, 0.0 network, 0.0 memory}, > id = 342 > 03-01 
Project(T0¦¦*=[$0], cs_quantity=[$1], > cs_wholesale_cost=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($2, > hash32AsDouble($1))]) : rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, ANY > cs_wholesale_cost, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 1.439980416E9, > cumulative cost = {2.879960832E9 rows, 1.0079862912E10 cpu, 0.0 io, 0.0 > network, 0.0 memory}, id = 341 > 03-02Project(T0¦¦*=[$0], cs_quantity=[$1], > cs_wholesale_cost=[$2]) : rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, > ANY cs_wholesale_cost): rowcount = 1.439980416E9, cumulative cost = > {1.439980416E9 rows, 4.319941248E9 cpu, 0.0 io, 0.0 network, 0.0 memory}, id > = 340 > 03-03 Scan(groupscan=[ParquetGroupScan > [entries=[ReadEntryWithPath > [path=maprfs:///drill/testdata/tpcds/parquet/sf1000/catalog_sales]], > selectionRoot=maprfs:/drill/testdata/tpcds/parquet/sf1000/catalog_sales, > numFiles=1, usedMetadataFile=false, columns=[`*`]]]) : rowType = > (DrillRecordRow[*, cs_quantity, cs_wholesale_cost]):
[jira] [Commented] (DRILL-5138) TopN operator on top of ~110 GB data set is very slow
[ https://issues.apache.org/jira/browse/DRILL-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242773#comment-16242773 ] Timothy Farkas commented on DRILL-5138: --- Since this is working for Chun it looks like the config change, which is already merged to master, was sufficient to fix this issue. > TopN operator on top of ~110 GB data set is very slow > - > > Key: DRILL-5138 > URL: https://issues.apache.org/jira/browse/DRILL-5138 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Reporter: Rahul Challapalli >Assignee: Timothy Farkas > > git.commit.id.abbrev=cf2b7c7 > No of cores : 23 > No of disks : 5 > DRILL_MAX_DIRECT_MEMORY="24G" > DRILL_MAX_HEAP="12G" > The below query ran for more than 4 hours and did not complete. The table is > ~110 GB > {code} > select * from catalog_sales order by cs_quantity, cs_wholesale_cost limit 1; > {code} > Physical Plan : > {code} > 00-00Screen : rowType = RecordType(ANY *): rowcount = 1.0, cumulative > cost = {1.00798629141E10 rows, 4.17594320691E10 cpu, 0.0 io, > 4.1287118487552E13 network, 0.0 memory}, id = 352 > 00-01 Project(*=[$0]) : rowType = RecordType(ANY *): rowcount = 1.0, > cumulative cost = {1.0079862914E10 rows, 4.1759432069E10 cpu, 0.0 io, > 4.1287118487552E13 network, 0.0 memory}, id = 351 > 00-02Project(T0¦¦*=[$0]) : rowType = RecordType(ANY T0¦¦*): rowcount > = 1.0, cumulative cost = {1.0079862914E10 rows, 4.1759432069E10 cpu, 0.0 io, > 4.1287118487552E13 network, 0.0 memory}, id = 350 > 00-03 SelectionVectorRemover : rowType = RecordType(ANY T0¦¦*, ANY > cs_quantity, ANY cs_wholesale_cost): rowcount = 1.0, cumulative cost = > {1.0079862914E10 rows, 4.1759432069E10 cpu, 0.0 io, 4.1287118487552E13 > network, 0.0 memory}, id = 349 > 00-04Limit(fetch=[1]) : rowType = RecordType(ANY T0¦¦*, ANY > cs_quantity, ANY cs_wholesale_cost): rowcount = 1.0, cumulative cost = > {1.0079862913E10 rows, 4.1759432068E10 cpu, 0.0 io, 4.1287118487552E13 > network, 0.0 memory}, id = 348 > 00-05 SingleMergeExchange(sort0=[1 ASC], sort1=[2 ASC]) : > rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, ANY cs_wholesale_cost): > rowcount = 1.439980416E9, cumulative cost = {1.0079862912E10 rows, > 4.1759432064E10 cpu, 0.0 io, 4.1287118487552E13 network, 0.0 memory}, id = 347 > 01-01SelectionVectorRemover : rowType = RecordType(ANY T0¦¦*, > ANY cs_quantity, ANY cs_wholesale_cost): rowcount = 1.439980416E9, cumulative > cost = {8.639882496E9 rows, 3.0239588736E10 cpu, 0.0 io, 2.3592639135744E13 > network, 0.0 memory}, id = 346 > 01-02 TopN(limit=[1]) : rowType = RecordType(ANY T0¦¦*, ANY > cs_quantity, ANY cs_wholesale_cost): rowcount = 1.439980416E9, cumulative > cost = {7.19990208E9 rows, 2.879960832E10 cpu, 0.0 io, 2.3592639135744E13 > network, 0.0 memory}, id = 345 > 01-03Project(T0¦¦*=[$0], cs_quantity=[$1], > cs_wholesale_cost=[$2]) : rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, > ANY cs_wholesale_cost): rowcount = 1.439980416E9, cumulative cost = > {5.759921664E9 rows, 2.879960832E10 cpu, 0.0 io, 2.3592639135744E13 network, > 0.0 memory}, id = 344 > 01-04 HashToRandomExchange(dist0=[[$1]], dist1=[[$2]]) : > rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, ANY cs_wholesale_cost, ANY > E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 1.439980416E9, cumulative cost = > {5.759921664E9 rows, 2.879960832E10 cpu, 0.0 io, 2.3592639135744E13 network, > 0.0 memory}, id = 343 > 02-01UnorderedMuxExchange : rowType = RecordType(ANY > T0¦¦*, ANY cs_quantity, ANY cs_wholesale_cost, ANY > 
E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 1.439980416E9, cumulative cost = > {4.319941248E9 rows, 1.1519843328E10 cpu, 0.0 io, 0.0 network, 0.0 memory}, > id = 342 > 03-01 Project(T0¦¦*=[$0], cs_quantity=[$1], > cs_wholesale_cost=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($2, > hash32AsDouble($1))]) : rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, ANY > cs_wholesale_cost, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 1.439980416E9, > cumulative cost = {2.879960832E9 rows, 1.0079862912E10 cpu, 0.0 io, 0.0 > network, 0.0 memory}, id = 341 > 03-02Project(T0¦¦*=[$0], cs_quantity=[$1], > cs_wholesale_cost=[$2]) : rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, > ANY cs_wholesale_cost): rowcount = 1.439980416E9, cumulative cost = > {1.439980416E9 rows, 4.319941248E9 cpu, 0.0 io, 0.0 network, 0.0 memory}, id > = 340 > 03-03 Scan(groupscan=[ParquetGroupScan > [entries=[ReadEntryWithPath > [path=maprfs:///drill/testdata/tpcds/parquet/sf1000/catalog_sales]], > selectionRoot=maprfs:/drill/t
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242706#comment-16242706 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149477233 --- Diff: exec/jdbc/src/test/java/org/apache/drill/jdbc/StatementTest.java --- @@ -61,55 +71,129 @@ public static void tearDownStatement() throws SQLException { // // getQueryTimeout(): - /** Tests that getQueryTimeout() indicates no timeout set. */ + /** + * Test for reading of default query timeout + */ @Test - public void testGetQueryTimeoutSaysNoTimeout() throws SQLException { -assertThat( statement.getQueryTimeout(), equalTo( 0 ) ); + public void testDefaultGetQueryTimeout() throws SQLException { +Statement stmt = connection.createStatement(); +int timeoutValue = stmt.getQueryTimeout(); +assert( 0 == timeoutValue ); --- End diff -- +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
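For context on why the review flags the bare assert: the Java assert keyword is silently skipped unless the JVM runs with -ea, while a JUnit assertion always executes and reports expected versus actual on failure. A standalone example, with the timeout value hard-coded in place of stmt.getQueryTimeout():
{code}
import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class AssertStyleSketch {
  @Test
  public void defaultTimeoutIsZero() {
    int timeoutValue = 0;            // stands in for stmt.getQueryTimeout()
    // assert (0 == timeoutValue);   // may never run, and gives only a bare AssertionError
    assertEquals(0, timeoutValue);   // always runs, reports expected vs actual on failure
  }
}
{code}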
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242701#comment-16242701 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149476820 --- Diff: exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java --- @@ -237,6 +245,127 @@ public String toString() { } } + /** + * Test for reading of default query timeout + */ + @Test + public void testDefaultGetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +int timeoutValue = stmt.getQueryTimeout(); +assert( 0 == timeoutValue ); + } + + /** + * Test Invalid parameter by giving negative timeout + */ + @Test ( expected = InvalidParameterSqlException.class ) + public void testInvalidSetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +//Setting negative value +int valueToSet = -10; +if (0L == valueToSet) { + valueToSet--; +} +try { + stmt.setQueryTimeout(valueToSet); +} catch ( final Exception e) { + // TODO: handle exception + assertThat( e.getMessage(), containsString( "illegal timeout value") ); + //Converting this to match expected Exception + throw new InvalidParameterSqlException(e.getMessage()); +} + } + + /** + * Test setting a valid timeout + */ + @Test + public void testValidSetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +//Setting positive value +int valueToSet = new Random(System.currentTimeMillis()).nextInt(60); +if (0L == valueToSet) { + valueToSet++; +} +stmt.setQueryTimeout(valueToSet); +assert( valueToSet == stmt.getQueryTimeout() ); + } + + /** + * Test setting timeout as zero and executing + */ + @Test + public void testSetQueryTimeoutAsZero() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_RANDOM_SQL); +stmt.setQueryTimeout(0); +stmt.executeQuery(); +ResultSet rs = stmt.getResultSet(); +int rowCount = 0; +while (rs.next()) { + rs.getBytes(1); + rowCount++; +} +stmt.close(); +assert( 3 == rowCount ); + } + + /** + * Test setting timeout for a query that actually times out + */ + @Test ( expected = SQLTimeoutException.class ) + public void testTriggeredQueryTimeout() throws SQLException { +PreparedStatement stmt = null; +//Setting to a very low value (3sec) +int timeoutDuration = 3; +int rowsCounted = 0; +try { + stmt = connection.prepareStatement(SYS_RANDOM_SQL); + stmt.setQueryTimeout(timeoutDuration); + System.out.println("Set a timeout of "+ stmt.getQueryTimeout() +" seconds"); --- End diff -- I think I previously came across some unit tests that are using System.out instead of logger, so i figured there wasn't any preference. Logger is probably the cleaner way of doing things. +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. 
> at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
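On the logger-versus-System.out point agreed above, a minimal slf4j sketch of what the test output line could look like; the class name and message are illustrative, and slf4j is assumed only because Drill's code base already logs through it.
{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class TimeoutTestLoggingSketch {
  private static final Logger logger = LoggerFactory.getLogger(TimeoutTestLoggingSketch.class);

  // Parameterized logging instead of string concatenation on System.out.
  void reportTimeout(int seconds) {
    logger.info("Set a timeout of {} seconds", seconds);
  }
}
{code}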
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242697#comment-16242697 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149476642 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillPreparedStatementImpl.java --- @@ -61,8 +65,14 @@ protected DrillPreparedStatementImpl(DrillConnectionImpl connection, if (preparedStatementHandle != null) { ((DrillColumnMetaDataList) signature.columns).updateColumnMetaData(preparedStatementHandle.getColumnsList()); } +//Implicit query timeout +this.queryTimeoutInSeconds = 0; +this.elapsedTimer = Stopwatch.createUnstarted(); --- End diff -- not even true for a statement: you can execute multiple queries, but the previous resultset will be closed and a new cursor created... > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242700#comment-16242700 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149476740 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java --- @@ -66,11 +70,27 @@ private final DrillConnectionImpl connection; private volatile boolean hasPendingCancelationNotification = false; + private Stopwatch elapsedTimer; + + private int queryTimeoutInSeconds; + DrillResultSetImpl(AvaticaStatement statement, Meta.Signature signature, ResultSetMetaData resultSetMetaData, TimeZone timeZone, Meta.Frame firstFrame) { super(statement, signature, resultSetMetaData, timeZone, firstFrame); connection = (DrillConnectionImpl) statement.getConnection(); +try { + if (statement.getQueryTimeout() > 0) { +queryTimeoutInSeconds = statement.getQueryTimeout(); + } +} catch (Exception e) { + e.printStackTrace(); --- End diff -- I think so > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5942) Drill Resource Management
[ https://issues.apache.org/jira/browse/DRILL-5942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242689#comment-16242689 ] Timothy Farkas commented on DRILL-5942: --- My initial analysis of this issue is as follows: This is a very complex issue which will take a considerable amount of time to solve correctly. Rehashing the points that Paul has already mentioned in various discussions I think there are two main Phases this would need to be tackled in. h2. Phase 1: Running Queries Non-Concurrently Without Running Out of Memory h3. Goal The goal here would be to run one query at a time successfully in all cases. I think this is possible to achieve with incremental improvements to the existing architecture. *Note:* I think achieving *Phase 2* will require significant changes to Drill's architecture. h3. Tasks In order to avoid out of memory exceptions for the single query case, it is necessary and sufficient to have solutions for all of these sub-tasks. * Make each operator memory aware. Given a specific memory budget each operator must be capable of obeying it. All the operators need to be analyze and made memory aware. * *Relevant Pending Work:* The HashJoin work Boaz is doing. * Account for the memory used by Drillbit to Drillbit communication. Currently exchanges use buffers on both the sending and recieving drill bits. These buffers can use a significant amount of memory. We would have to be able to set a limit on the amount of memory used by these buffers and make the exchange operations smart enough to obey the limit. * *Relevant Pending Work:* The exchange operator work Vlad is doing. * Make drill aware of the amount of direct memory allocated to the jvm. This is necessary because the Drillibit needs to know how much memory it has available to allocate to operators, and buffers. * Control batch sizes. We cannot effectively obey memory limits in the operators and buffers unless we can bound the size of batches. * *Relevant Pending Work:* The work Paul is doing to limit batch sizes. * Once everything above is satisfied, we need to test the Parallelizer code to make sure it doesn't overallocate memory. I believe there are cases where incorrect configuration can cause the Parallelizers to grossly over allocate memory. h2. Phase 2: Running Multiple Concurrently Without Running Out of Memory h3. Goal The goal here would be to run multiple concurrent queries successfully in all cases. However, before we can even think about *Phase 2* we must first solve the issues outlined in *Phase 1*. h3. Theory This will require significant changes to Drill's architecture. This is because we will have to solve the problem of distributed resource management in order to effectively allocate resources to concurrent queries without exceeding the resources we have in our cluster. h4. Desired State The good news is that existing cluster managers like YARN already solve this problem for batch jobs by doing the following: # The amount of available memory and cpu cores is reported to a resource manager. # New jobs are submitted to the resource manager. A job includes a description of all the containers it needs to run. Each container description also includes the amount of memory and cpu it will need. # The resource manager places the job in a Queue. # The resource manager uses a scheduler to prioritize jobs in the Queue. # A job with high priority is scheduled to run on the cluster if and only if the cluster has enough resources to execute the job. 
# When a job is deployed to the cluster, the remaining unused resources on the cluster are updated appropriately. h4. Current State Currently Drill does not do any of this. Instead it does the following: # A query is sent to Drill. # A foreman is created for the query. # The query is planned. # During planning, each query assumes it has access to all of the cluster resources and is oblivious to any other queries running on the cluster. Because of this the following issues can occur: * Queries sporadically run out of memory when too many concurrent queries are run. For example, assume we have a cluster with 100 Gb of memory. Let's run *Query A* and assume it consumes 80Gb of memory on the cluster. While *Query A* is running let's try to run *Query B*. During the planning of *Query B* Drill is completely unaware that *Query A* is running and assumes that *Query B* has the full 100 GB at it's disposal. So Drill may launch *Query B* and give it 80 GB. Now there are two queries with 80 GB allocated to each of them for a total of 160 GB when the cluster only has 100 GB. * Even if we try to have some smart heuristics to avoid the first issue, a resource deadlock can occur. For example *Query A* could be partially deployed to the cluster and take up half of the cluster resources, similarly *Query B* could be partially deplo
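An illustrative sketch of the YARN-style admission control described in the comment above: a query is admitted only if its declared memory and CPU needs fit in what the cluster still has free, otherwise it waits in a queue. All names and the resource model here are invented for the sketch and are not part of Drill; a real scheduler would also handle priorities, preemption, and per-node accounting.
{code}
import java.util.ArrayDeque;
import java.util.Queue;

public class AdmissionControlSketch {

  static final class ResourceRequest {
    final String queryId;
    final long memoryBytes;
    final int cpuCores;
    ResourceRequest(String queryId, long memoryBytes, int cpuCores) {
      this.queryId = queryId;
      this.memoryBytes = memoryBytes;
      this.cpuCores = cpuCores;
    }
  }

  private long freeMemoryBytes;
  private int freeCpuCores;
  private final Queue<ResourceRequest> pending = new ArrayDeque<>();

  AdmissionControlSketch(long totalMemoryBytes, int totalCpuCores) {
    this.freeMemoryBytes = totalMemoryBytes;
    this.freeCpuCores = totalCpuCores;
  }

  /** Admit the request now if it fits; otherwise queue it instead of over-committing memory. */
  synchronized boolean submit(ResourceRequest request) {
    if (request.memoryBytes <= freeMemoryBytes && request.cpuCores <= freeCpuCores) {
      freeMemoryBytes -= request.memoryBytes;   // reserve before the query starts
      freeCpuCores -= request.cpuCores;
      return true;                              // caller may launch the query
    }
    pending.add(request);
    return false;
  }

  /** Called when a query finishes: release its reservation and admit the queue head if it now fits. */
  synchronized void release(ResourceRequest request) {
    freeMemoryBytes += request.memoryBytes;
    freeCpuCores += request.cpuCores;
    ResourceRequest next = pending.peek();
    if (next != null
        && next.memoryBytes <= freeMemoryBytes && next.cpuCores <= freeCpuCores) {
      pending.remove();
      freeMemoryBytes -= next.memoryBytes;
      freeCpuCores -= next.cpuCores;
      // In a real scheduler the dequeued query would now be launched.
    }
  }
}
{code}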
[jira] [Updated] (DRILL-5942) Drill Resource Management
[ https://issues.apache.org/jira/browse/DRILL-5942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Farkas updated DRILL-5942: -- Description: OutOfMemoryExceptions still occur in Drill. This ticket addresses a plan for what it would take to ensure all queries are able to execute on Drill without running out of memory. (was: OutOfMemoryExceptions still occur in Drill. This ticket address a plan for what it would take to ensure all queries are able to execute on Drill without running out of memory.) > Drill Resource Management > - > > Key: DRILL-5942 > URL: https://issues.apache.org/jira/browse/DRILL-5942 > Project: Apache Drill > Issue Type: Bug >Reporter: Timothy Farkas > > OutOfMemoryExceptions still occur in Drill. This ticket addresses a plan for > what it would take to ensure all queries are able to execute on Drill without > running out of memory. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242695#comment-16242695 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149476190 --- Diff: exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java --- @@ -237,6 +245,127 @@ public String toString() { } } + /** + * Test for reading of default query timeout + */ + @Test + public void testDefaultGetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +int timeoutValue = stmt.getQueryTimeout(); +assert( 0 == timeoutValue ); + } + + /** + * Test Invalid parameter by giving negative timeout + */ + @Test ( expected = InvalidParameterSqlException.class ) + public void testInvalidSetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +//Setting negative value +int valueToSet = -10; +if (0L == valueToSet) { + valueToSet--; +} +try { + stmt.setQueryTimeout(valueToSet); +} catch ( final Exception e) { + // TODO: handle exception + assertThat( e.getMessage(), containsString( "illegal timeout value") ); + //Converting this to match expected Exception + throw new InvalidParameterSqlException(e.getMessage()); +} + } + + /** + * Test setting a valid timeout + */ + @Test + public void testValidSetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +//Setting positive value +int valueToSet = new Random(System.currentTimeMillis()).nextInt(60); --- End diff -- I am trying to add some randomness to the test parameters, since the expected behaviour should be the same. I'll fix this up and get rid of that check. +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242681#comment-16242681 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149474798 --- Diff: exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java --- @@ -237,6 +245,127 @@ public String toString() { } } + /** + * Test for reading of default query timeout + */ + @Test + public void testDefaultGetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +int timeoutValue = stmt.getQueryTimeout(); +assert( 0 == timeoutValue ); + } + + /** + * Test Invalid parameter by giving negative timeout + */ + @Test ( expected = InvalidParameterSqlException.class ) + public void testInvalidSetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +//Setting negative value +int valueToSet = -10; +if (0L == valueToSet) { --- End diff -- My bad. The original code would assign the negation of a random integer.,.. hence the check for 0L and followed by a decrement. +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242680#comment-16242680 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149474455 --- Diff: exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java --- @@ -237,6 +245,127 @@ public String toString() { } } + /** + * Test for reading of default query timeout + */ + @Test + public void testDefaultGetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); --- End diff -- +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242688#comment-16242688 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149475535 --- Diff: exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java --- @@ -237,6 +245,127 @@ public String toString() { } } + /** + * Test for reading of default query timeout + */ + @Test + public void testDefaultGetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +int timeoutValue = stmt.getQueryTimeout(); +assert( 0 == timeoutValue ); + } + + /** + * Test Invalid parameter by giving negative timeout + */ + @Test ( expected = InvalidParameterSqlException.class ) + public void testInvalidSetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +//Setting negative value +int valueToSet = -10; +if (0L == valueToSet) { + valueToSet--; +} +try { + stmt.setQueryTimeout(valueToSet); +} catch ( final Exception e) { + // TODO: handle exception + assertThat( e.getMessage(), containsString( "illegal timeout value") ); + //Converting this to match expected Exception + throw new InvalidParameterSqlException(e.getMessage()); --- End diff -- Wanted to make sure that the unit test also reports the correct exception. This only rewraps the thrown SQLException to an InvalidParameterSqlException for JUnit to confirm. > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242686#comment-16242686 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149475115 --- Diff: exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java --- @@ -237,6 +245,127 @@ public String toString() { } } + /** + * Test for reading of default query timeout + */ + @Test + public void testDefaultGetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +int timeoutValue = stmt.getQueryTimeout(); +assert( 0 == timeoutValue ); + } + + /** + * Test Invalid parameter by giving negative timeout + */ + @Test ( expected = InvalidParameterSqlException.class ) + public void testInvalidSetQueryTimeout() throws SQLException { +PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL); +//Setting negative value +int valueToSet = -10; +if (0L == valueToSet) { + valueToSet--; +} +try { + stmt.setQueryTimeout(valueToSet); +} catch ( final Exception e) { --- End diff -- Yes, it should be. Might be a legacy code. Will fix it. +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (DRILL-5942) Drill Resource Management
Timothy Farkas created DRILL-5942: - Summary: Drill Resource Management Key: DRILL-5942 URL: https://issues.apache.org/jira/browse/DRILL-5942 Project: Apache Drill Issue Type: Bug Reporter: Timothy Farkas OutOfMemoryExceptions still occur in Drill. This ticket addresses a plan for what it would take to ensure all queries are able to execute on Drill without running out of memory. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242671#comment-16242671 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149473521 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java --- @@ -66,11 +70,27 @@ private final DrillConnectionImpl connection; private volatile boolean hasPendingCancelationNotification = false; + private Stopwatch elapsedTimer; + + private int queryTimeoutInSeconds; + DrillResultSetImpl(AvaticaStatement statement, Meta.Signature signature, ResultSetMetaData resultSetMetaData, TimeZone timeZone, Meta.Frame firstFrame) { super(statement, signature, resultSetMetaData, timeZone, firstFrame); connection = (DrillConnectionImpl) statement.getConnection(); +try { + if (statement.getQueryTimeout() > 0) { +queryTimeoutInSeconds = statement.getQueryTimeout(); + } +} catch (Exception e) { + e.printStackTrace(); --- End diff -- I guess I was not sure what I should do if `getQueryTimeout()` threw an Exception. I didn't want to lose the stack trace. Should I just ignore it? > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
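A minimal sketch of the alternative discussed in the comment above: instead of calling e.printStackTrace() in the constructor, the timeout read could be logged and defaulted to "no timeout". This is not the actual DrillResultSetImpl code; the class, logger, and method names below are illustrative assumptions.
{code}
import java.sql.SQLException;
import java.sql.Statement;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class QueryTimeoutInit {
  private static final Logger logger = LoggerFactory.getLogger(QueryTimeoutInit.class);

  /** Returns the statement's timeout in seconds, or 0 (meaning "no timeout") if it cannot be read. */
  static int readTimeoutSeconds(Statement statement) {
    try {
      return statement.getQueryTimeout();   // may throw SQLException
    } catch (SQLException e) {
      // Keep the stack trace by logging it instead of printing to stderr.
      logger.warn("Could not read the query timeout; defaulting to no timeout", e);
      return 0;
    }
  }
}
{code}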
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242674#comment-16242674 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149473999 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java --- @@ -96,6 +117,13 @@ private void throwIfClosed() throws AlreadyClosedSqlException, throw new AlreadyClosedSqlException( "ResultSet is already closed." ); } } + +//Query Timeout Check. The timer has already been started by the DrillCursor at this point --- End diff -- This code block gets touched even if there is no timeout set, hence the check to implicitly confirm if there is a timeout set. > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242667#comment-16242667 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149472887 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java --- @@ -66,11 +70,27 @@ private final DrillConnectionImpl connection; private volatile boolean hasPendingCancelationNotification = false; + private Stopwatch elapsedTimer; --- End diff -- +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242666#comment-16242666 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149472806 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillPreparedStatementImpl.java --- @@ -61,8 +65,14 @@ protected DrillPreparedStatementImpl(DrillConnectionImpl connection, if (preparedStatementHandle != null) { ((DrillColumnMetaDataList) signature.columns).updateColumnMetaData(preparedStatementHandle.getColumnsList()); } +//Implicit query timeout +this.queryTimeoutInSeconds = 0; +this.elapsedTimer = Stopwatch.createUnstarted(); --- End diff -- I thought the Statement and Cursor had a 1:1 relationship, so they can share the timer. I guess for a PreparedStatement I cannot make that assumption. Will fix this. +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242655#comment-16242655 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149471937 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillPreparedStatementImpl.java --- @@ -46,6 +48,8 @@ DrillRemoteStatement { private final PreparedStatement preparedStatementHandle; + int queryTimeoutInSeconds = 0; --- End diff -- Ah.. i missed this during the clean up. Thanks! +1 > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242645#comment-16242645 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149471249 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -100,13 +103,17 @@ final LinkedBlockingDeque batchQueue = Queues.newLinkedBlockingDeque(); +private final DrillCursor parent; --- End diff -- Stopwatch seemed a convenient way of visualizing a timer object that is passed between different JDBC entities, and also provides a clean way of specifying elapsed time, etc. > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242617#comment-16242617 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149467572 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -376,6 +415,19 @@ synchronized void cleanup() { currentBatchHolder.clear(); } + //Set the cursor's timeout in seconds --- End diff -- We do get the timeout value from the Statement (Ref: https://github.com/kkhatua/drill/blob/a008707c7b97ea95700ab0f2eb5182d779a9bcb3/exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java#L372 ) However, the Statement is referred to by the ResultSet object as well, to get a handle on the timer object. During testing, I found that there is a possibility that the DrillCursor completes fetching all batches, but a slow client would call ResultSet.next() slowly and time out. The ResultSet object has no reference to the timer, except via the Statement object. There is a bigger problem that this block of code fixes. During iteration, we don't want to be able to change the timeout period. Hence, the DrillCursor (invoked by the _first_ `ResultSet.next()` call) will be initialized and will start the timer ticking. Thereafter, any attempt to change the timeout is ignored. > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
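A minimal sketch, under assumed names, of the latching behaviour described above: the timer starts on the first ResultSet.next() call and later timeout changes are ignored. This is not the actual DrillCursor implementation, only an illustration of the design using Guava's Stopwatch.
{code}
import java.util.concurrent.TimeUnit;
import com.google.common.base.Stopwatch;

class TimeoutTracker {
  private final Stopwatch elapsedTimer = Stopwatch.createUnstarted();
  private int timeoutSeconds;                 // 0 means "no timeout"

  /** Called on the first next(); once the timer is running, later timeout changes are ignored. */
  void startIfNeeded(int requestedTimeoutSeconds) {
    if (!elapsedTimer.isRunning()) {
      timeoutSeconds = requestedTimeoutSeconds;
      elapsedTimer.start();
    }
  }

  /** True once the latched timeout (if any) has elapsed. */
  boolean expired() {
    return timeoutSeconds > 0
        && elapsedTimer.elapsed(TimeUnit.SECONDS) >= timeoutSeconds;
  }
}
{code}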
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242606#comment-16242606 ] ASF GitHub Bot commented on DRILL-3640: --- Github user laurentgo commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149465309 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -239,6 +259,11 @@ QueryDataBatch getNext() throws UserException, InterruptedException { } return qdb; } + + // Check and throw SQLTimeoutException + if ( parent.timeoutInSeconds > 0 && parent.elapsedTimer.elapsed(TimeUnit.SECONDS) >= parent.timeoutInSeconds ) { --- End diff -- you don't really need a check after the poll: if the result is not null, it means it completed before the timeout and you can proceed forward. If it is null, then you would loop, redo the check based on the current time, and might be able to throw a timeout exception > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)
[ https://issues.apache.org/jira/browse/DRILL-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242596#comment-16242596 ] ASF GitHub Bot commented on DRILL-3640: --- Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1024#discussion_r149463638 --- Diff: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java --- @@ -239,6 +259,11 @@ QueryDataBatch getNext() throws UserException, InterruptedException { } return qdb; } + + // Check and throw SQLTimeoutException + if ( parent.timeoutInSeconds > 0 && parent.elapsedTimer.elapsed(TimeUnit.SECONDS) >= parent.timeoutInSeconds ) { --- End diff -- Good point, and I thought it might help avoid going into polling altogether. However, the granularity of the timeout is in seconds, so 50ms is insignificant. If I do a check before the poll, I'd need to do one after the poll as well, over a 50ms window. So a post-poll check works fine, because we'll exceed the timeout by at most 50ms. A timeout of 1sec would thus occur in 1.05sec; for any larger timeout values, the 50ms is of diminishing significance. > Drill JDBC driver support Statement.setQueryTimeout(int) > > > Key: DRILL-3640 > URL: https://issues.apache.org/jira/browse/DRILL-3640 > Project: Apache Drill > Issue Type: New Feature > Components: Client - JDBC >Affects Versions: 1.2.0 >Reporter: Chun Chang >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > It would be nice if we have this implemented. Run away queries can be > automatically canceled by setting the timeout. > java.sql.SQLFeatureNotSupportedException: Setting network timeout is not > supported. > at > org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
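A minimal sketch of the post-poll check discussed in this thread, using assumed names and a generic queue rather than Drill's internal batch queue. Because the poll slice is 50 ms and the timeout granularity is seconds, the overshoot is bounded by roughly one slice.
{code}
import java.sql.SQLTimeoutException;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;
import com.google.common.base.Stopwatch;

final class PollWithTimeout {
  /** Polls in 50 ms slices; the timeout is checked only after an empty poll. */
  static <T> T pollUntilTimeout(LinkedBlockingDeque<T> queue,
                                Stopwatch elapsedTimer,
                                int timeoutSeconds)
      throws InterruptedException, SQLTimeoutException {
    while (true) {
      T batch = queue.poll(50, TimeUnit.MILLISECONDS);
      if (batch != null) {
        return batch;                  // arrived before the timeout fired
      }
      if (timeoutSeconds > 0
          && elapsedTimer.elapsed(TimeUnit.SECONDS) >= timeoutSeconds) {
        throw new SQLTimeoutException("Query timed out after " + timeoutSeconds + " seconds");
      }
    }
  }
}
{code}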
[jira] [Commented] (DRILL-5717) change some date time unit cases with specific timezone or Local
[ https://issues.apache.org/jira/browse/DRILL-5717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242447#comment-16242447 ] ASF GitHub Bot commented on DRILL-5717: --- Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/904#discussion_r149440392 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/fn/impl/testing/TestDateConversions.java --- @@ -225,4 +243,16 @@ public void testPostgresDateFormatError() throws Exception { throw e; } } + + /** + * mock current locale to US + */ + private void mockUSLocale() { --- End diff -- Please move this method into ExecTest class and make it public and static. > change some date time unit cases with specific timezone or Local > > > Key: DRILL-5717 > URL: https://issues.apache.org/jira/browse/DRILL-5717 > Project: Apache Drill > Issue Type: Bug > Components: Tools, Build & Test >Affects Versions: 1.9.0, 1.11.0 >Reporter: weijie.tong > > Some date time test cases like JodaDateValidatorTest is not Local > independent .This will cause other Local's users's test phase to fail. We > should let these test cases to be Local env independent. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5717) change some date time unit cases with specific timezone or Local
[ https://issues.apache.org/jira/browse/DRILL-5717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242445#comment-16242445 ] ASF GitHub Bot commented on DRILL-5717: --- Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/904#discussion_r149440034 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/fn/interp/TestConstantFolding.java --- @@ -117,6 +123,13 @@ public void createFiles(int smallFileLines, int bigFileLines) throws Exception{ @Test public void testConstantFolding_allTypes() throws Exception { +new MockUp() { --- End diff -- I meant to use the same method in all tests where Locale should be mocked. Please replace it. > change some date time unit cases with specific timezone or Local > > > Key: DRILL-5717 > URL: https://issues.apache.org/jira/browse/DRILL-5717 > Project: Apache Drill > Issue Type: Bug > Components: Tools, Build & Test >Affects Versions: 1.9.0, 1.11.0 >Reporter: weijie.tong > > Some date time test cases like JodaDateValidatorTest is not Local > independent .This will cause other Local's users's test phase to fail. We > should let these test cases to be Local env independent. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5717) change some date time unit cases with specific timezone or Local
[ https://issues.apache.org/jira/browse/DRILL-5717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242446#comment-16242446 ] ASF GitHub Bot commented on DRILL-5717: --- Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/904#discussion_r149440814 --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/fn/impl/TestCastFunctions.java --- @@ -77,16 +82,23 @@ public void testCastByConstantFolding() throws Exception { @Test // DRILL-3769 public void testToDateForTimeStamp() throws Exception { -final String query = "select to_date(to_timestamp(-1)) as col \n" + -"from (values(1))"; +new MockUp() { --- End diff -- This mock is used twice. So let's also move the code into the separate method. > change some date time unit cases with specific timezone or Local > > > Key: DRILL-5717 > URL: https://issues.apache.org/jira/browse/DRILL-5717 > Project: Apache Drill > Issue Type: Bug > Components: Tools, Build & Test >Affects Versions: 1.9.0, 1.11.0 >Reporter: weijie.tong > > Some date time test cases like JodaDateValidatorTest is not Local > independent .This will cause other Local's users's test phase to fail. We > should let these test cases to be Local env independent. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
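A minimal sketch, assuming JMockit (which the MockUp usages in the diffs above suggest), of what a shared, public static locale-mocking helper could look like once moved into a common test base class. The class and method names are illustrative assumptions, not the actual ExecTest code.
{code}
import java.util.Locale;
import mockit.Mock;
import mockit.MockUp;

public class LocaleMockSketch {
  /** Forces Locale.getDefault() to return Locale.US for the duration of a test. */
  public static void mockUsLocale() {
    new MockUp<Locale>() {
      @Mock
      Locale getDefault() {
        return Locale.US;
      }
    };
  }
}
{code}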
[jira] [Commented] (DRILL-5106) Refactor SkipRecordsInspector to exclude check for predefined file formats
[ https://issues.apache.org/jira/browse/DRILL-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242279#comment-16242279 ] Arina Ielchiieva commented on DRILL-5106: - The following improvements will be implemented in the scope of DRILL-5941: a. fileFormats will be removed from skip records inspector; b. skip header count logic will be applied only once during reader initialization; c. when skip footer won't be required, default processing will be done without buffering data in queue. > Refactor SkipRecordsInspector to exclude check for predefined file formats > -- > > Key: DRILL-5106 > URL: https://issues.apache.org/jira/browse/DRILL-5106 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Hive >Affects Versions: 1.9.0 >Reporter: Arina Ielchiieva >Assignee: Arina Ielchiieva >Priority: Minor > > After changes introduced in DRILL-4982, SkipRecordInspector is used only for > predefined formats (using hasHeaderFooter: false / true). But > SkipRecordInspector has its own check for formats where skip strategy can be > applied. Acceptable file formats are stored in private final Set > fileFormats and initialized in constructor, currently it contains only one > format - TextInputFormat. Now this check is redundant and may lead to > ignoring hasHeaderFooter setting to true for any other format except of Text. > To do: > 1. remove private final Set fileFormats > 2. remove if block from SkipRecordsInspector.retrievePositiveIntProperty: > {code} > if > (!fileFormats.contains(tableProperties.get(hive_metastoreConstants.FILE_INPUT_FORMAT))) > { > return propertyIntValue; > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
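A minimal sketch of point (b) above, skipping the header only once during reader initialization; the names are assumptions and this is not the DRILL-5941 patch itself.
{code}
import java.io.BufferedReader;
import java.io.IOException;

final class HeaderSkipOnInit {
  /** Skips the configured number of header lines exactly once, at reader setup time. */
  static void skipHeader(BufferedReader reader, int headerLineCount) throws IOException {
    for (int i = 0; i < headerLineCount; i++) {
      if (reader.readLine() == null) {
        break;                         // file shorter than the declared header
      }
    }
  }
}
{code}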
[jira] [Created] (DRILL-5941) Skip header / footer logic works incorrectly for Hive tables when file has several input splits
Arina Ielchiieva created DRILL-5941: --- Summary: Skip header / footer logic works incorrectly for Hive tables when file has several input splits Key: DRILL-5941 URL: https://issues.apache.org/jira/browse/DRILL-5941 Project: Apache Drill Issue Type: Bug Components: Storage - Hive Affects Versions: 1.11.0 Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.12.0 *To reproduce* 1. Create a csv file with two columns (key, value) and 329 rows, where the first row is a header. The data file size should be greater than the chunk size of 256 MB. Copy the file to the distributed file system. 2. Create a table in Hive: {noformat} CREATE EXTERNAL TABLE `h_table`( `key` bigint, `value` string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' LOCATION 'maprfs:/tmp/h_table' TBLPROPERTIES ( 'skip.header.line.count'='1'); {noformat} 3. Execute query {{select * from hive.h_table}} in Drill (query data using Hive plugin). The result will return fewer rows than expected. The expected result is 328 (total count minus one header row). *The root cause* Since the file is greater than the default chunk size, it's split into several fragments, known as input splits. For example: {noformat} maprfs:/tmp/h_table/h_table.csv:0+268435456 maprfs:/tmp/h_table/h_table.csv:268435457+492782112 {noformat} TextHiveReader is responsible for handling skip header and / or footer logic. Currently Drill creates a reader [for each input split|https://github.com/apache/drill/blob/master/contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveScanBatchCreator.java#L84] and skip header and / or footer logic is applied to each input split, though ideally the above mentioned input splits should be read by one reader, so that skip header / footer logic is applied correctly. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
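A minimal sketch of the direction implied by the root cause analysis above: group the input splits that belong to one file so that a single reader (and a single application of the skip header / footer logic) can cover them. The class and method names are illustrative assumptions, not the actual fix.
{code}
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SplitGrouper {
  /** Groups split descriptions of the form "path:start+length" by their file path. */
  static Map<String, List<String>> groupByFile(List<String> splits) {
    Map<String, List<String>> byFile = new LinkedHashMap<>();
    for (String split : splits) {
      // The last ':' separates the file path from the "start+length" part.
      String path = split.substring(0, split.lastIndexOf(':'));
      byFile.computeIfAbsent(path, k -> new ArrayList<>()).add(split);
    }
    return byFile;
  }
}
{code}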
[jira] [Commented] (DRILL-4779) Kafka storage plugin support
[ https://issues.apache.org/jira/browse/DRILL-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242170#comment-16242170 ] B Anil Kumar commented on DRILL-4779: - For Avro support we have raised a separate ticket https://issues.apache.org/jira/browse/DRILL-5940 > Kafka storage plugin support > > > Key: DRILL-4779 > URL: https://issues.apache.org/jira/browse/DRILL-4779 > Project: Apache Drill > Issue Type: New Feature > Components: Storage - Other >Affects Versions: 1.11.0 >Reporter: B Anil Kumar >Assignee: B Anil Kumar > Labels: doc-impacting > Fix For: 1.12.0 > > > Implement Kafka storage plugin will enable the strong SQL support for Kafka. > Initially implementation can target for supporting json and avro message types -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (DRILL-5940) Avro with schema registry support for Kafka
B Anil Kumar created DRILL-5940: --- Summary: Avro with schema registry support for Kafka Key: DRILL-5940 URL: https://issues.apache.org/jira/browse/DRILL-5940 Project: Apache Drill Issue Type: New Feature Components: Storage - Other Reporter: B Anil Kumar Assignee: Bhallamudi Venkata Siva Kamesh Support Avro messages with Schema registry for Kafka storage plugin -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-4779) Kafka storage plugin support
[ https://issues.apache.org/jira/browse/DRILL-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242157#comment-16242157 ] ASF GitHub Bot commented on DRILL-4779: --- GitHub user akumarb2010 opened a pull request: https://github.com/apache/drill/pull/1027 DRILL-4779 : Kafka storage plugin This PR contains Kafka support with JSON message format. You can merge this pull request into a Git repository by running: $ git pull https://github.com/akumarb2010/incubator-drill master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1027.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1027 commit f3397816ad07f85a08f53b964213ecf9f96b56b8 Author: Batchu Date: 2016-12-20T16:28:40Z Starting on kafka module commit f5c9ff3f0863cf743e922fc6c39ccfebc4b7df4d Author: Batchu Date: 2016-12-26T21:16:19Z Initial Kafka integration code commit 8d7403f49f1acb601e6d90d1a2cb640a4df0e05c Author: Batchu Date: 2016-12-26T23:41:05Z Kafka plugin src clean up commit 08992ea73c239e8b6996436b7984e87ee12b534d Author: Anil Kumar Batchu Date: 2016-12-26T23:45:19Z Initial Kafka plugin code base commit 8ec1e187efcdd3123db03fda7c0e038503246a46 Author: Anil Kumar Batchu Date: 2017-01-08T06:07:06Z Initial Kafka plugin code base commit a11af5542079b891975c95d4b2c6b562e7a61386 Author: Anil Kumar Batchu Date: 2017-01-08T06:22:12Z Initial Kafka plugin code base commit e22d10e82857888e7fb9a723b7781a20d6581d69 Author: Anil Kumar Batchu Date: 2017-03-04T06:40:25Z Initial Kafka plugin code base issues commit 990f479db43156dad296033ca57ee4cf0f497b69 Author: Anil Kumar Batchu Date: 2017-03-04T06:48:00Z Initial Kafka plugin code base issues commit 5d565c4d843373f8c168c9255d108140677e58b0 Author: Venkata Siva Kamesh Date: 2017-03-04T11:00:30Z Updating Kafka storage plugin and its affected classes, adding schema related classes commit aec67b863da49293d96a53d5332998ffaf890115 Author: Venkata Siva Kamesh Date: 2017-03-04T11:07:17Z Adding license commit 84e7eaa0de6ae90244df1fb4960d97876076ffc6 Author: Venkata Siva Kamesh Date: 2017-03-04T11:34:40Z Adding bootstrap-storage-plugins.json and drill-module.conf commit 0fa26c856996474442c2deb038226e1765c67ac4 Author: Venkata Siva Kamesh Date: 2017-03-04T13:18:49Z Cleaning pom and adding kafka-storage in bin.xml commit 2c96e9ff9e0c68ae76e9b9b16b712b0f5a7fcea8 Author: Venkata Siva Kamesh Date: 2017-03-04T13:25:24Z fixing drill-module.conf by updating package to kafka commit a30640c60586b12b3c4ec6a6316fc6dab84a6d3f Author: Venkata Siva Kamesh Date: 2017-03-04T14:04:40Z fixing spelling mistakes commit ccd8de3f86e407852ea9ce9e032caf86542ea482 Author: Venkata Siva Kamesh Date: 2017-03-05T09:41:15Z Formatting the code using Apache eclipse formatter commit dcc018acac3cb07f2fe0d5c33a6c50c03e2ebc59 Author: Venkata Siva Kamesh Date: 2017-03-05T09:44:28Z moving Kafka schema factory to schema package commit 2987afbcfeb47cb7d7a29643015979d293ebcc72 Author: Venkata Siva Kamesh Date: 2017-03-05T15:20:03Z Adding message format in storage plugin config commit b6bc9c34aa7b2c9299a593e2138e0e9dc30edeb9 Author: Venkata Siva Kamesh Date: 2017-03-05T16:00:53Z updating storage-plugins.json, drill-module.conf and adding loggers commit 97f62d43da48a284de318cd21543f03c3f10e73f Author: Venkata Siva Kamesh Date: 2017-03-05T16:19:34Z Adding debug message commit 30ac2ae6b623ba5279e3085549e5fa5ad2a44a23 Author: Venkata Siva Kamesh Date: 
2017-03-05T16:26:53Z updating debug message commit fdd84d19dcfa03913a307d041c0cdf7421095e71 Author: akumarb2010 Date: 2017-03-05T16:29:19Z Adding avro support to kafka plugin commit d301f3ee07d16800ec54316d5dd48c398c09a1ec Author: akumarb2010 Date: 2017-03-05T16:29:44Z Merge branch 'master' of https://github.com/akumarb2010/incubator-drill commit c922ea9ea7ea31b252150f357e153f64edd8fbb5 Author: akumarb2010 Date: 2017-03-25T18:16:05Z Adding avro support to kafka plugin DRILL-4779 commit d00aa38cf51bec1dcdefa7f6e44f918ce761f912 Author: akumarb2010 Date: 2017-03-25T20:17:06Z KafkaRecordReader implementation DRILL-4779 commit 318ebc120bd0f2d4790a920806110db2494e1668 Author: akumarb2010 Date: 2017-03-26T05:15:29Z KafkaRecordReader implementation DRILL-4779 commit 0e142a9758223761124632dc844ba8c75bca913f Author: Venkata Siva Kamesh Date: 2017-04-02T06:33:06Z Adding storage config in Groupscan and fixing rat issues commit e797a221b799a154a4c8f55a590b12b18ed9f31c Author: akumarb2010 Date: 2017-04-02T06:43:12Z Checkstyle issues commit f700b4b1074fe7dfefc82c9895e7ad4041762d4b Author: akumarb2010 Date: 2017-04-02T
[jira] [Updated] (DRILL-4779) Kafka storage plugin support
[ https://issues.apache.org/jira/browse/DRILL-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] B Anil Kumar updated DRILL-4779: Description: Implement Kafka storage plugin will enable the strong SQL support for Kafka. Initially implementation can target for supporting json and avro message types was: Implement Kafka storage plugin will enable the strong SQL support for Kafka. Initially implementation can target for supporting text, json and avro message types > Kafka storage plugin support > > > Key: DRILL-4779 > URL: https://issues.apache.org/jira/browse/DRILL-4779 > Project: Apache Drill > Issue Type: New Feature > Components: Storage - Other >Affects Versions: 1.11.0 >Reporter: B Anil Kumar >Assignee: B Anil Kumar > Labels: doc-impacting > Fix For: 1.12.0 > > > Implement Kafka storage plugin will enable the strong SQL support for Kafka. > Initially implementation can target for supporting json and avro message types -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (DRILL-5939) NullPointerException in convert_toJSON function
Volodymyr Tkach created DRILL-5939: -- Summary: NullPointerException in convert_toJSON function Key: DRILL-5939 URL: https://issues.apache.org/jira/browse/DRILL-5939 Project: Apache Drill Issue Type: Bug Reporter: Volodymyr Tkach Priority: Minor The query: `select convert_toJSON(convert_fromJSON('\{"key": "value"\}')) from (values(1));` fails with exception. Although, when we apply it for the data from the file from disk it succeeds. select convert_toJSON(convert_fromJSON(columns\[0\])) from dfs.tmp.`some.csv`; Some.csv \{"key":"val"\},val2 {noformat} Fragment 0:0 [Error Id: 016ca995-16f9-4eab-83c2-7679071faad4 on userf206-pc:31010] org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: NullPointerException Fragment 0:0 [Error Id: 016ca995-16f9-4eab-83c2-7679071faad4 on userf206-pc:31010] at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586) ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:298) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:267) [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_80] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_80] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80] Caused by: java.lang.NullPointerException: null at org.apache.drill.exec.expr.fn.DrillFuncHolder.addProtectedBlock(DrillFuncHolder.java:183) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.fn.DrillFuncHolder.generateBody(DrillFuncHolder.java:169) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.fn.DrillSimpleFuncHolder.renderEnd(DrillSimpleFuncHolder.java:86) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.EvaluationVisitor$EvalVisitor.visitFunctionHolderExpression(EvaluationVisitor.java:205) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.EvaluationVisitor$ConstantFilter.visitFunctionHolderExpression(EvaluationVisitor.java:1089) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.EvaluationVisitor$CSEFilter.visitFunctionHolderExpression(EvaluationVisitor.java:827) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.EvaluationVisitor$CSEFilter.visitFunctionHolderExpression(EvaluationVisitor.java:807) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.common.expression.FunctionHolderExpression.accept(FunctionHolderExpression.java:53) ~[drill-logical-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.EvaluationVisitor$EvalVisitor.visitValueVectorWriteExpression(EvaluationVisitor.java:362) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.EvaluationVisitor$EvalVisitor.visitUnknown(EvaluationVisitor.java:344) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.EvaluationVisitor$ConstantFilter.visitUnknown(EvaluationVisitor.java:1339) 
~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.EvaluationVisitor$CSEFilter.visitUnknown(EvaluationVisitor.java:1038) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.EvaluationVisitor$CSEFilter.visitUnknown(EvaluationVisitor.java:807) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.ValueVectorWriteExpression.accept(ValueVectorWriteExpression.java:64) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.EvaluationVisitor.addExpr(EvaluationVisitor.java:104) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.expr.ClassGenerator.addExpr(ClassGenerator.java:335) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchemaFromInput(ProjectRecordBatch.java:476) ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema(ProjectReco
[jira] [Created] (DRILL-5938) Write unit tests for math function with NaN and Infinity numbers
Volodymyr Tkach created DRILL-5938: -- Summary: Write unit tests for math function with NaN and Infinity numbers Key: DRILL-5938 URL: https://issues.apache.org/jira/browse/DRILL-5938 Project: Apache Drill Issue Type: Test Reporter: Volodymyr Tkach Priority: Minor Fix For: Future Drill math functions need to be covered with test cases where the input contains non-numeric numbers: NaN or Infinity. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242040#comment-16242040 ] Volodymyr Tkach edited comment on DRILL-5919 at 11/7/17 2:36 PM: - 1. Added two session options `store.json.reader.non_numeric_numbers` and `store.json.writer.non_numeric_numbers` that allow reading/writing `NaN` and `Infinity` as numbers. By default these options are set to false; 2. Extended the signature of `convert_toJSON` and `convert_fromJSON` functions by adding a second optional parameter that enables read/write of `NaN` and `Infinity`. For example: `select convert_fromJSON('\{"key": NaN\}') from (values(1));` will result in a JsonParseException, but `select convert_fromJSON('\{"key": NaN\}', true) from (values(1));` will parse `NaN` as a number. was (Author: volodymyr.tkach): Added two session options `store.json.reader.non_numeric_numbers` and `store.json.reader.non_numeric_numbers` that allow to read/write `NaN` and `Infinity` as numbers. By default these options are set to false; Also extended signature of `convert_toJSON` and `convert_fromJSON` functions by adding second optional parameter that enables read/write `NaN` and `Infinity`. For example: `select convert_fromJSON('\{"key": NaN\}') from (values(1));` will result with JsonParseException, but `select convert_fromJSON('\{"key": NaN\}', true) from (values(1));` will parse `NaN` as a number. > Add non-numeric support for JSON processing > --- > > Key: DRILL-5919 > URL: https://issues.apache.org/jira/browse/DRILL-5919 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - JSON >Affects Versions: 1.11.0 >Reporter: Volodymyr Tkach >Assignee: Volodymyr Tkach > Fix For: Future > > > Add session options to allow drill working with non standard json strings > number literals like: NaN, Infinity, -Infinity. By default these options will > be switched off, the user will be able to toggle them during working session. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
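A minimal JDBC usage sketch based on the comment above. The session option names and the optional second parameter of convert_fromJSON are taken from that comment and may differ in the released version; the connection URL is the usual local-ZooKeeper example and is only an assumption about the deployment.
{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class NonNumericJsonExample {
  public static void main(String[] args) throws Exception {
    // Assumes the Apache Drill JDBC driver is on the classpath and a local
    // Drillbit is reachable through ZooKeeper.
    try (Connection conn = DriverManager.getConnection("jdbc:drill:zk=local");
         Statement stmt = conn.createStatement()) {
      // Option names as described in the comment above (reader and writer side).
      stmt.execute("ALTER SESSION SET `store.json.reader.non_numeric_numbers` = true");
      stmt.execute("ALTER SESSION SET `store.json.writer.non_numeric_numbers` = true");
      // Per the comment above, the second argument enables NaN/Infinity parsing.
      try (ResultSet rs = stmt.executeQuery(
          "SELECT convert_fromJSON('{\"key\": NaN}', true) FROM (VALUES(1))")) {
        while (rs.next()) {
          System.out.println(rs.getObject(1));
        }
      }
    }
  }
}
{code}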
[jira] [Updated] (DRILL-5919) Add non-numeric support for JSON processing
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Volodymyr Tkach updated DRILL-5919: --- Summary: Add non-numeric support for JSON processing (was: Add session option to allow json reader/writer to work with NaN,INF) > Add non-numeric support for JSON processing > --- > > Key: DRILL-5919 > URL: https://issues.apache.org/jira/browse/DRILL-5919 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - JSON >Affects Versions: 1.11.0 >Reporter: Volodymyr Tkach >Assignee: Volodymyr Tkach > Fix For: Future > > > Add session options to allow drill working with non standard json strings > number literals like: NaN, Infinity, -Infinity. By default these options will > be switched off, the user will be able to toggle them during working session. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (DRILL-5919) Add session option to allow json reader/writer to work with NaN,INF
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242040#comment-16242040 ] Volodymyr Tkach edited comment on DRILL-5919 at 11/7/17 2:02 PM: - Added two session options `store.json.reader.non_numeric_numbers` and `store.json.reader.non_numeric_numbers` that allow to read/write `NaN` and `Infinity` as numbers. By default these options are set to false; Also extended signature of `convert_toJSON` and `convert_fromJSON` functions by adding second optional parameter that enables read/write `NaN` and `Infinity`. For example: `select convert_fromJSON('\{"key": NaN\}') from (values(1));` will result with JsonParseException, but `select convert_fromJSON('\{"key": NaN\}', true) from (values(1));` will parse `NaN` as a number. was (Author: volodymyr.tkach): Added two session options `store.json.reader.non_numeric_numbers` and `store.json.reader.non_numeric_numbers` that allow to read/write `NaN` and `Infinity` as numbers. By default these options are set to false; Also extended signature of `convert_toJSON` and `convert_fromJSON` functions by adding second optional parameter that enables read/write `NaN` and `Infinity`. For example: `select convert_fromJSON('{"key": NaN}') from (values(1));` will result with JsonParseException, but `select convert_fromJSON('{"key": NaN}', true) from (values(1));` will parse `NaN` as a number. > Add session option to allow json reader/writer to work with NaN,INF > --- > > Key: DRILL-5919 > URL: https://issues.apache.org/jira/browse/DRILL-5919 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - JSON >Affects Versions: 1.11.0 >Reporter: Volodymyr Tkach >Assignee: Volodymyr Tkach > Fix For: Future > > > Add session options to allow drill working with non standard json strings > number literals like: NaN, Infinity, -Infinity. By default these options will > be switched off, the user will be able to toggle them during working session. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5919) Add session option to allow json reader/writer to work with NaN,INF
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242040#comment-16242040 ] Volodymyr Tkach commented on DRILL-5919: Added two session options `store.json.reader.non_numeric_numbers` and `store.json.reader.non_numeric_numbers` that allow to read/write `NaN` and `Infinity` as numbers. By default these options are set to false; Also extended signature of `convert_toJSON` and `convert_fromJSON` functions by adding second optional parameter that enables read/write `NaN` and `Infinity`. For example: `select convert_fromJSON('{"key": NaN}') from (values(1));` will result with JsonParseException, but `select convert_fromJSON('{"key": NaN}', true) from (values(1));` will parse `NaN` as a number. > Add session option to allow json reader/writer to work with NaN,INF > --- > > Key: DRILL-5919 > URL: https://issues.apache.org/jira/browse/DRILL-5919 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - JSON >Affects Versions: 1.11.0 >Reporter: Volodymyr Tkach >Assignee: Volodymyr Tkach > Fix For: Future > > > Add session options to allow drill working with non standard json strings > number literals like: NaN, Infinity, -Infinity. By default these options will > be switched off, the user will be able to toggle them during working session. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5919) Add session option to allow json reader/writer to work with NaN,INF
[ https://issues.apache.org/jira/browse/DRILL-5919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242031#comment-16242031 ] ASF GitHub Bot commented on DRILL-5919: --- GitHub user vladimirtkach opened a pull request: https://github.com/apache/drill/pull/1026 DRILL-5919: Add session option to allow json reader/writer to work with NaN,INF Added two session options `store.json.reader.non_numeric_numbers` and `store.json.reader.non_numeric_numbers` that allow to read/write `NaN` and `Infinity` as numbers You can merge this pull request into a Git repository by running: $ git pull https://github.com/vladimirtkach/drill DRILL-5919 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1026.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1026 commit 0e972bac9d472f6681e6f16d232f61e6d0bfcb44 Author: Volodymyr Tkach Date: 2017-11-03T16:13:29Z DRILL-5919: Add session option to allow json reader/writer to work with NaN,INF > Add session option to allow json reader/writer to work with NaN,INF > --- > > Key: DRILL-5919 > URL: https://issues.apache.org/jira/browse/DRILL-5919 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - JSON >Affects Versions: 1.11.0 >Reporter: Volodymyr Tkach >Assignee: Volodymyr Tkach > Fix For: Future > > > Add session options to allow drill working with non standard json strings > number literals like: NaN, Infinity, -Infinity. By default these options will > be switched off, the user will be able to toggle them during working session. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5921) Counters metrics should be listed in table
[ https://issues.apache.org/jira/browse/DRILL-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16241917#comment-16241917 ] ASF GitHub Bot commented on DRILL-5921: --- Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1020#discussion_r149349985 --- Diff: exec/java-exec/src/main/resources/rest/metrics/metrics.ftl --- @@ -138,21 +154,14 @@ }); }; -function updateOthers(metrics) { - $.each(["counters", "meters"], function(i, key) { -if(! $.isEmptyObject(metrics[key])) { - $("#" + key + "Val").html(JSON.stringify(metrics[key], null, 2)); -} - }); -}; - var update = function() { $.get("/status/metrics", function(metrics) { updateGauges(metrics.gauges); updateBars(metrics.gauges); if(! $.isEmptyObject(metrics.timers)) createTable(metrics.timers, "timers"); if(! $.isEmptyObject(metrics.histograms)) createTable(metrics.histograms, "histograms"); -updateOthers(metrics); +if(! $.isEmptyObject(metrics.counters)) createCountersTable(metrics.counters); +if(! $.isEmptyObject(metrics.meters)) $("#metersVal").html(JSON.stringify(metrics.meters, null, 2)); --- End diff -- @prasadns14 1. Please add two screenshots before and after the changes. 2. Can you please think of the way to make create table generic so can be used for timers, histograms and counters? 3. What about meters? How they are displayed right now? Maybe we need to display them in table as well? Ideally, we can display all metrics in the same way. > Counters metrics should be listed in table > -- > > Key: DRILL-5921 > URL: https://issues.apache.org/jira/browse/DRILL-5921 > Project: Apache Drill > Issue Type: Bug > Components: Client - HTTP >Affects Versions: 1.11.0 >Reporter: Prasad Nagaraj Subramanya >Assignee: Prasad Nagaraj Subramanya >Priority: Minor > Fix For: 1.12.0 > > > Counter metrics are currently displayed as json string in the Drill UI. They > should be listed in a table similar to other metrics. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (DRILL-5923) State of a successfully completed query shown as "COMPLETED"
[ https://issues.apache.org/jira/browse/DRILL-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16241833#comment-16241833 ] ASF GitHub Bot commented on DRILL-5923: --- Github user arina-ielchiieva commented on the issue: https://github.com/apache/drill/pull/1021 @prasadns14 as far as I understood, you made all these changes to replace `completed` with `succeeded`. What if you just make changes in State enum itself, refactor some code and thus no changes in rest part will be required? From UserBitShared.proto ``` enum QueryState { STARTING = 0; // query has been scheduled for execution. This is post-enqueued. RUNNING = 1; COMPLETED = 2; // query has completed successfully CANCELED = 3; // query has been cancelled, and all cleanup is complete FAILED = 4; CANCELLATION_REQUESTED = 5; // cancellation has been requested, and is being processed ENQUEUED = 6; // query has been enqueued. this is pre-starting. } ``` After the renaming, please don't forget to regenerate protobuf. > State of a successfully completed query shown as "COMPLETED" > > > Key: DRILL-5923 > URL: https://issues.apache.org/jira/browse/DRILL-5923 > Project: Apache Drill > Issue Type: Bug > Components: Client - HTTP >Affects Versions: 1.11.0 >Reporter: Prasad Nagaraj Subramanya >Assignee: Prasad Nagaraj Subramanya > Fix For: 1.12.0 > > > Drill UI currently lists a successfully completed query as "COMPLETED". > Successfully completed, failed and canceled queries are all grouped as > Completed queries. > It would be better to list the state of a successfully completed query as > "Succeeded" to avoid confusion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (DRILL-5746) Pcap PR manually edited Protobuf files, values lost on next build
[ https://issues.apache.org/jira/browse/DRILL-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva resolved DRILL-5746. - Resolution: Fixed In the scope of DRILL-5716. > Pcap PR manually edited Protobuf files, values lost on next build > - > > Key: DRILL-5746 > URL: https://issues.apache.org/jira/browse/DRILL-5746 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.10.0 >Reporter: Paul Rogers >Assignee: Paul Rogers > Fix For: 1.12.0 > > > Drill recently accepted the pcap format plugin. As part of that work, the > author added a new operator type, {{PCAP_SUB_SCAN_VALUE}}. > But, apparently this was done by editing the generated Protobuf files to add > the values, rather than modifying the protobuf definitions and rebuilding the > generated files. The result is, on the next build of the Protobuf sources, > the following compile error appears: > {code} > [ERROR] > /Users/paulrogers/git/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/pcap/PcapFormatPlugin.java:[80,41] > error: cannot find symbol > [ERROR] symbol: variable PCAP_SUB_SCAN_VALUE > [ERROR] location: class CoreOperatorType > {code} > The solution is to properly edit the Protobuf definitions with the required > symbol. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5270) Improve loading of profiles listing in the WebUI
[ https://issues.apache.org/jira/browse/DRILL-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5270: Reviewer: Arina Ielchiieva > Improve loading of profiles listing in the WebUI > > > Key: DRILL-5270 > URL: https://issues.apache.org/jira/browse/DRILL-5270 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.9.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua > Fix For: 1.12.0 > > > Currently, as the number of profiles increase, we reload the same list of > profiles from the FS. > An ideal improvement would be to detect if there are any new profiles and > only reload from the disk then. Otherwise, a cached list is sufficient. > For a directory of 280K profiles, the load time is close to 6 seconds on a 32 > core server. With the caching, we can get it down to as much as a few > milliseconds. > To render the cache as invalid, we inspect the last modified time of the > directory to confirm whether a reload is needed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5688) Add repeated map support to column accessors
[ https://issues.apache.org/jira/browse/DRILL-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5688: Fix Version/s: (was: 1.12.0) 1.13.0 > Add repeated map support to column accessors > > > Key: DRILL-5688 > URL: https://issues.apache.org/jira/browse/DRILL-5688 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.12.0 >Reporter: Paul Rogers >Assignee: Paul Rogers > Fix For: 1.13.0 > > > DRILL-5211 describes how Drill runs into OOM issues due to Drill's two > allocators: Netty and Unsafe. That JIRA also describes the solution: limit > vectors to 16 MB in length (with the eventual goal of limiting overall batch > size.) DRILL-5517 added "size-aware" support to the column accessors created > to parallel Drill's existing readers and writers. (The parallel > implementation ensures that we don't break existing code that uses the > existing mechanism; same as we did for the external sort.) > This ticket describes work to extend the column accessors to handle repeated > maps and lists. Key themes: > * Define a common metadata schema for use in this layer and the "result set > loader" of DRILL-5657. This schema layer builds on top of the existing schema > to add the kind of metadata needed here and by the "sizer" created for the > external sort. > * Define a JSON-like reader and writer structure that supports the full Drill > data model semantics. (The earlier version focused on the scalar types and > arrays of scalars to prove the concept of limiting vector sizes.) > * Revising test code to use the revised column writer structure. > Implementation details appear in the PR. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5822) The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 doesn't preserve column order
[ https://issues.apache.org/jira/browse/DRILL-5822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5822: Reviewer: Paul Rogers > The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 > doesn't preserve column order > --- > > Key: DRILL-5822 > URL: https://issues.apache.org/jira/browse/DRILL-5822 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.11.0 >Reporter: Prasad Nagaraj Subramanya >Assignee: Vitalii Diravka > Labels: ready-to-commit > Fix For: 1.12.0 > > > Columns ordering doesn't preserve for the star query with sorting when this > is planned into multiple fragments. > Repro steps: > 1) {code}alter session set `planner.slice_target`=1;{code} > 2) ORDER BY clause in the query. > Scenarios: > {code} > 0: jdbc:drill:zk=local> alter session reset `planner.slice_target`; > +---++ > | ok |summary | > +---++ > | true | planner.slice_target updated. | > +---++ > 1 row selected (0.082 seconds) > 0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` order by > n_name limit 1; > +--+--+--+--+ > | n_nationkey | n_name | n_regionkey | n_comment > | > +--+--+--+--+ > | 0| ALGERIA | 0| haggle. carefully final deposits > detect slyly agai | > +--+--+--+--+ > 1 row selected (0.141 seconds) > 0: jdbc:drill:zk=local> alter session set `planner.slice_target`=1; > +---++ > | ok |summary | > +---++ > | true | planner.slice_target updated. | > +---++ > 1 row selected (0.091 seconds) > 0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` order by > n_name limit 1; > +--+--+--+--+ > | n_comment | n_name | > n_nationkey | n_regionkey | > +--+--+--+--+ > | haggle. carefully final deposits detect slyly agai | ALGERIA | 0 >| 0| > +--+--+--+--+ > 1 row selected (0.201 seconds) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5769) IndexOutOfBoundsException when querying JSON files
[ https://issues.apache.org/jira/browse/DRILL-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5769: Fix Version/s: (was: 1.12.0) (was: 1.11.0) (was: 1.10.0) Future > IndexOutOfBoundsException when querying JSON files > -- > > Key: DRILL-5769 > URL: https://issues.apache.org/jira/browse/DRILL-5769 > Project: Apache Drill > Issue Type: Bug > Components: Server, Storage - JSON >Affects Versions: 1.10.0 > Environment: *jdk_8u45_x64* > *single drillbit running on zookeeper* > *Following options set to TRUE:* > drill.exec.functions.cast_empty_string_to_null > store.json.all_text_mode > store.parquet.enable_dictionary_encoding > store.parquet.use_new_reader >Reporter: David Lee >Assignee: Jinfeng Ni > Fix For: Future > > Attachments: 001.json, 100.json, 111.json > > > *Running the following SQL on these three JSON files fails: * > 001.json 100.json 111.json > select t.id > from dfs.`/tmp/???.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > *Error:* > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > IndexOutOfBoundsException: index: 1024, length: 1 (expected: range(0, 1024)) > Fragment 0:0 [Error Id: .... > *However, running the same SQL on two out of three files works:* > select t.id > from dfs.`/tmp/1??.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > select t.id > from dfs.`/tmp/?1?.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > select t.id > from dfs.`/tmp/??1.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' > *Changing the selected column from t.id to t.* also works: * > select * > from dfs.`/tmp/???.json` t > where t.assetData.debt.couponPaymentFeature.interestBasis = '5' -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5896) Handle vector creation in HbaseRecordReader to avoid NullableInt vectors later
[ https://issues.apache.org/jira/browse/DRILL-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5896: Labels: ready-to-commit (was: ) > Handle vector creation in HbaseRecordReader to avoid NullableInt vectors later > -- > > Key: DRILL-5896 > URL: https://issues.apache.org/jira/browse/DRILL-5896 > Project: Apache Drill > Issue Type: Bug > Components: Storage - HBase >Affects Versions: 1.11.0 >Reporter: Prasad Nagaraj Subramanya >Assignee: Prasad Nagaraj Subramanya > Labels: ready-to-commit > Fix For: 1.12.0 > > > When an HBase query projects both a column family and a column in that column > family, the vector for the column is not created in the HbaseRecordReader. > So, in cases where the scan batch is empty, a NullableInt vector is created for > this column. We need to handle column creation in the reader. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
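A toy illustration of the problem described above, with entirely hypothetical names rather than Drill's HBaseRecordReader code: if vectors are created only when data arrives, an empty scan batch carries no type information and the projected column degenerates to a nullable-int placeholder, whereas declaring the projected column up front keeps its type stable.
{code}
// Hypothetical illustration only; names do not correspond to Drill's HBase reader.
import java.util.LinkedHashMap;
import java.util.Map;

public class ProjectionSetupDemo {
  enum ColumnType { MAP, VARBINARY, NULLABLE_INT }

  // Vectors created lazily, only when data is actually seen.
  static Map<String, ColumnType> lazySetup(boolean sawData) {
    Map<String, ColumnType> vectors = new LinkedHashMap<>();
    if (sawData) {
      vectors.put("f", ColumnType.MAP);          // column family
      vectors.put("f.c", ColumnType.VARBINARY);  // column inside the family
    }
    // Empty batch: the column's type is unknown, so a NullableInt placeholder appears.
    vectors.putIfAbsent("f.c", ColumnType.NULLABLE_INT);
    return vectors;
  }

  // Vectors declared up front from the projection list, so the type survives empty batches.
  static Map<String, ColumnType> eagerSetup() {
    Map<String, ColumnType> vectors = new LinkedHashMap<>();
    vectors.put("f", ColumnType.MAP);
    vectors.put("f.c", ColumnType.VARBINARY);
    return vectors;
  }

  public static void main(String[] args) {
    System.out.println("lazy setup, empty batch : " + lazySetup(false));
    System.out.println("eager setup, empty batch: " + eagerSetup());
  }
}
{code}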
[jira] [Updated] (DRILL-5670) Varchar vector throws an assertion error when allocating a new vector
[ https://issues.apache.org/jira/browse/DRILL-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5670: Fix Version/s: (was: 1.12.0) 1.13.0 > Varchar vector throws an assertion error when allocating a new vector > - > > Key: DRILL-5670 > URL: https://issues.apache.org/jira/browse/DRILL-5670 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators >Affects Versions: 1.11.0 >Reporter: Rahul Challapalli >Assignee: Paul Rogers > Fix For: 1.13.0 > > Attachments: 26478262-f0a7-8fc1-1887-4f27071b9c0f.sys.drill, > 26498995-bbad-83bc-618f-914c37a84e1f.sys.drill, > 26555749-4d36-10d2-6faf-e403db40c370.sys.drill, > 266290f3-5fdc-5873-7372-e9ee053bf867.sys.drill, > 269969ca-8d4d-073a-d916-9031e3d3fbf0.sys.drill, drill-override.conf, > drillbit.log, drillbit.log, drillbit.log, drillbit.log, drillbit.log, > drillbit.log.exchange, drillbit.log.sort, drillbit.out > > > I am running this test on a private branch of [paul's > repository|https://github.com/paul-rogers/drill]. Below is the commit info > {code} > git.commit.id.abbrev=d86e16c > git.commit.user.email=prog...@maprtech.com > git.commit.message.full=DRILL-5601\: Rollup of external sort fixes an > improvements\n\n- DRILL-5513\: Managed External Sort \: OOM error during the > merge phase\n- DRILL-5519\: Sort fails to spill and results in an OOM\n- > DRILL-5522\: OOM during the merge and spill process of the managed external > sort\n- DRILL-5594\: Excessive buffer reallocations during merge phase of > external sort\n- DRILL-5597\: Incorrect "bits" vector allocation in nullable > vectors allocateNew()\n- DRILL-5602\: Repeated List Vector fails to > initialize the offset vector\n\nAll of the bugs have to do with handling > low-memory conditions, and with\ncorrectly estimating the sizes of vectors, > even when those vectors come\nfrom the spill file or from an exchange. Hence, > the changes for all of\nthe above issues are interrelated.\n > git.commit.id=d86e16c551e7d3553f2cde748a739b1c5a7a7659 > git.commit.message.short=DRILL-5601\: Rollup of external sort fixes an > improvements > git.commit.user.name=Paul Rogers > git.build.user.name=Rahul Challapalli > git.commit.id.describe=0.9.0-1078-gd86e16c > git.build.user.email=challapallira...@gmail.com > git.branch=d86e16c551e7d3553f2cde748a739b1c5a7a7659 > git.commit.time=05.07.2017 @ 20\:34\:39 PDT > git.build.time=12.07.2017 @ 14\:27\:03 PDT > git.remote.origin.url=g...@github.com\:paul-rogers/drill.git > {code} > Below query fails with an Assertion Error > {code} > 0: jdbc:drill:zk=10.10.100.190:5181> ALTER SESSION SET > `exec.sort.disable_managed` = false; > +---+-+ > | ok | summary | > +---+-+ > | true | exec.sort.disable_managed updated. | > +---+-+ > 1 row selected (1.044 seconds) > 0: jdbc:drill:zk=10.10.100.190:5181> alter session set > `planner.memory.max_query_memory_per_node` = 482344960; > +---++ > | ok | summary | > +---++ > | true | planner.memory.max_query_memory_per_node updated. | > +---++ > 1 row selected (0.372 seconds) > 0: jdbc:drill:zk=10.10.100.190:5181> alter session set > `planner.width.max_per_node` = 1; > +---+--+ > | ok | summary| > +---+--+ > | true | planner.width.max_per_node updated. | > +---+--+ > 1 row selected (0.292 seconds) > 0: jdbc:drill:zk=10.10.100.190:5181> alter session set > `planner.width.max_per_query` = 1; > +---+---+ > | ok |summary| > +---+---+ > | true | planner.width.max_per_query updated. 
| > +---+---+ > 1 row selected (0.25 seconds) > 0: jdbc:drill:zk=10.10.100.190:5181> select count(*) from (select * from > dfs.`/drill/testdata/resource-manager/3500cols.tbl` order by > columns[450],columns[330],columns[230],columns[220],columns[110],columns[90],columns[80],columns[70],columns[40],columns[10],columns[20],columns[30],columns[40],columns[50], > > columns[454],columns[413],columns[940],columns[834],columns[73],columns[140],columns[104],columns[],columns[30],columns[2420],columns[1520], > columns[1410], > columns[1110],columns[1290],columns[2380],
[jira] [Updated] (DRILL-5265) External Sort consumes more memory than allocated
[ https://issues.apache.org/jira/browse/DRILL-5265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5265: Fix Version/s: (was: 1.12.0) 1.13.0 > External Sort consumes more memory than allocated > - > > Key: DRILL-5265 > URL: https://issues.apache.org/jira/browse/DRILL-5265 > Project: Apache Drill > Issue Type: Bug >Reporter: Rahul Challapalli >Assignee: Paul Rogers > Fix For: 1.13.0 > > > git.commit.id.abbrev=300e934 > Based on the profile for the below query, the external sort has a peak memory > usage of ~126MB when only ~100MB was allocated > {code} > alter session set `planner.memory.max_query_memory_per_node` = 104857600; > alter session set `planner.width.max_per_node` = 1; > select * from dfs.`/drill/testdata/md1362` order by c_email_address; > {code} > I attached the profile and the log files -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5310) Memory leak in managed sort if OOM during sv2 allocation
[ https://issues.apache.org/jira/browse/DRILL-5310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5310: Fix Version/s: (was: 1.12.0) 1.13.0 > Memory leak in managed sort if OOM during sv2 allocation > > > Key: DRILL-5310 > URL: https://issues.apache.org/jira/browse/DRILL-5310 > Project: Apache Drill > Issue Type: Sub-task >Affects Versions: 1.10.0 >Reporter: Paul Rogers >Assignee: Paul Rogers > Fix For: 1.13.0 > > > See the "identical1" test case in DRILL-5266. Due to misconfiguration, the > sort was given too little memory to make progress. An OOM error occurred when > allocating an SV2. > In this scenario, the "converted" record batch is leaked. > Normally, a converted batch is added to the list of in-memory batches, then > released on {{close()}}. But, in this case, the batch is only a local > variable, and so leaks. > The code must release this batch in this condition. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
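A minimal sketch of the cleanup the last sentence calls for, using stand-in classes rather than Drill's sort code: because the converted batch is held only in a local variable, the allocation-failure path has to release it explicitly before rethrowing, otherwise nothing ever closes it.
{code}
// Illustrative only; ConvertedBatch and AllocationFailedException are stand-ins,
// not Drill's classes.
import java.util.ArrayList;
import java.util.List;

public class SortBatchIntakeDemo {

  static class ConvertedBatch implements AutoCloseable {
    @Override public void close() { System.out.println("converted batch released"); }
  }

  static class AllocationFailedException extends RuntimeException {
    AllocationFailedException(String msg) { super(msg); }
  }

  private final List<ConvertedBatch> inMemoryBatches = new ArrayList<>();

  // Simulates the sv2 allocation failing under a too-small memory budget.
  private void allocateSv2(ConvertedBatch batch) {
    throw new AllocationFailedException("OOM while allocating sv2");
  }

  public void addBatch() {
    ConvertedBatch converted = new ConvertedBatch();   // held only in a local variable
    try {
      allocateSv2(converted);
      inMemoryBatches.add(converted);                  // normally released later in close()
    } catch (AllocationFailedException e) {
      converted.close();                               // without this, the batch leaks
      throw e;
    }
  }

  public static void main(String[] args) {
    try {
      new SortBatchIntakeDemo().addBatch();
    } catch (AllocationFailedException e) {
      System.out.println("propagated: " + e.getMessage());
    }
  }
}
{code}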
[jira] [Updated] (DRILL-5822) The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 doesn't preserve column order
[ https://issues.apache.org/jira/browse/DRILL-5822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5822: Labels: ready-to-commit (was: ) > The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 > doesn't preserve column order > --- > > Key: DRILL-5822 > URL: https://issues.apache.org/jira/browse/DRILL-5822 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.11.0 >Reporter: Prasad Nagaraj Subramanya >Assignee: Vitalii Diravka > Labels: ready-to-commit > Fix For: 1.12.0 > > > Column ordering is not preserved for a star query with sorting when it > is planned into multiple fragments. > Repro steps: > 1) {code}alter session set `planner.slice_target`=1;{code} > 2) Run the query with an ORDER BY clause. > Scenarios: > {code} > 0: jdbc:drill:zk=local> alter session reset `planner.slice_target`; > +---++ > | ok |summary | > +---++ > | true | planner.slice_target updated. | > +---++ > 1 row selected (0.082 seconds) > 0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` order by > n_name limit 1; > +--+--+--+--+ > | n_nationkey | n_name | n_regionkey | n_comment > | > +--+--+--+--+ > | 0| ALGERIA | 0| haggle. carefully final deposits > detect slyly agai | > +--+--+--+--+ > 1 row selected (0.141 seconds) > 0: jdbc:drill:zk=local> alter session set `planner.slice_target`=1; > +---++ > | ok |summary | > +---++ > | true | planner.slice_target updated. | > +---++ > 1 row selected (0.091 seconds) > 0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` order by > n_name limit 1; > +--+--+--+--+ > | n_comment | n_name | > n_nationkey | n_regionkey | > +--+--+--+--+ > | haggle. carefully final deposits detect slyly agai | ALGERIA | 0 >| 0| > +--+--+--+--+ > 1 row selected (0.201 seconds) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5909) need new JMX metrics for (FAILED and CANCELED) queries
[ https://issues.apache.org/jira/browse/DRILL-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5909: Reviewer: Paul Rogers > need new JMX metrics for (FAILED and CANCELED) queries > -- > > Key: DRILL-5909 > URL: https://issues.apache.org/jira/browse/DRILL-5909 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Monitoring >Affects Versions: 1.11.0, 1.12.0 >Reporter: Khurram Faraaz >Assignee: Prasad Nagaraj Subramanya > Labels: ready-to-commit > Fix For: 1.12.0 > > > we have these JMX metrics today > {noformat} > drill.queries.running > drill.queries.completed > {noformat} > we need these new JMX metrics > {noformat} > drill.queries.failed > drill.queries.canceled > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
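Drill's JMX metrics are published through the Dropwizard metrics library, so the two requested counters would look roughly like the sketch below; the class, registry wiring, and increment points are illustrative assumptions, not the actual DRILL-5909 patch (which would register on the drillbit's shared registry rather than a fresh one).
{code}
// Standalone Dropwizard Metrics sketch; not the actual DRILL-5909 change.
import com.codahale.metrics.Counter;
import com.codahale.metrics.MetricRegistry;

public class QueryStateMetricsDemo {
  private final MetricRegistry registry = new MetricRegistry();

  // Existing-style counters plus the two requested by this ticket.
  private final Counter running   = registry.counter("drill.queries.running");
  private final Counter completed = registry.counter("drill.queries.completed");
  private final Counter failed    = registry.counter("drill.queries.failed");
  private final Counter canceled  = registry.counter("drill.queries.canceled");

  public void onQueryStart()    { running.inc(); }
  public void onQueryComplete() { running.dec(); completed.inc(); }
  public void onQueryFailed()   { running.dec(); failed.inc(); }
  public void onQueryCanceled() { running.dec(); canceled.inc(); }

  public static void main(String[] args) {
    QueryStateMetricsDemo metrics = new QueryStateMetricsDemo();
    metrics.onQueryStart();
    metrics.onQueryFailed();
    System.out.println("drill.queries.failed = "
        + metrics.registry.counter("drill.queries.failed").getCount());
  }
}
{code}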
[jira] [Updated] (DRILL-5909) need new JMX metrics for (FAILED and CANCELED) queries
[ https://issues.apache.org/jira/browse/DRILL-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5909: Labels: ready-to-commit (was: ) > need new JMX metrics for (FAILED and CANCELED) queries > -- > > Key: DRILL-5909 > URL: https://issues.apache.org/jira/browse/DRILL-5909 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Monitoring >Affects Versions: 1.11.0, 1.12.0 >Reporter: Khurram Faraaz >Assignee: Prasad Nagaraj Subramanya > Labels: ready-to-commit > Fix For: 1.12.0 > > > we have these JMX metrics today > {noformat} > drill.queries.running > drill.queries.completed > {noformat} > we need these new JMX metrics > {noformat} > drill.queries.failed > drill.queries.canceled > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5896) Handle vector creation in HbaseRecordReader to avoid NullableInt vectors later
[ https://issues.apache.org/jira/browse/DRILL-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5896: Reviewer: Paul Rogers > Handle vector creation in HbaseRecordReader to avoid NullableInt vectors later > -- > > Key: DRILL-5896 > URL: https://issues.apache.org/jira/browse/DRILL-5896 > Project: Apache Drill > Issue Type: Bug > Components: Storage - HBase >Affects Versions: 1.11.0 >Reporter: Prasad Nagaraj Subramanya >Assignee: Prasad Nagaraj Subramanya > Labels: ready-to-commit > Fix For: 1.12.0 > > > When an HBase query projects both a column family and a column in that column > family, the vector for the column is not created in the HbaseRecordReader. > So, in cases where the scan batch is empty, a NullableInt vector is created for > this column. We need to handle column creation in the reader. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5921) Counters metrics should be listed in table
[ https://issues.apache.org/jira/browse/DRILL-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5921: Reviewer: Arina Ielchiieva > Counters metrics should be listed in table > -- > > Key: DRILL-5921 > URL: https://issues.apache.org/jira/browse/DRILL-5921 > Project: Apache Drill > Issue Type: Bug > Components: Client - HTTP >Affects Versions: 1.11.0 >Reporter: Prasad Nagaraj Subramanya >Assignee: Prasad Nagaraj Subramanya >Priority: Minor > Fix For: 1.12.0 > > > Counter metrics are currently displayed as a JSON string in the Drill UI. They > should be listed in a table, similar to the other metrics. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5923) State of a successfully completed query shown as "COMPLETED"
[ https://issues.apache.org/jira/browse/DRILL-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5923: Reviewer: Arina Ielchiieva > State of a successfully completed query shown as "COMPLETED" > > > Key: DRILL-5923 > URL: https://issues.apache.org/jira/browse/DRILL-5923 > Project: Apache Drill > Issue Type: Bug > Components: Client - HTTP >Affects Versions: 1.11.0 >Reporter: Prasad Nagaraj Subramanya >Assignee: Prasad Nagaraj Subramanya > Fix For: 1.12.0 > > > Drill UI currently lists a successfully completed query as "COMPLETED". > Successfully completed, failed and canceled queries are all grouped as > Completed queries. > It would be better to list the state of a successfully completed query as > "Succeeded" to avoid confusion. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (DRILL-5829) RecordBatchLoader schema comparison is not case insensitive
[ https://issues.apache.org/jira/browse/DRILL-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5829: Fix Version/s: (was: 1.12.0) 1.13.0 > RecordBatchLoader schema comparison is not case insensitive > --- > > Key: DRILL-5829 > URL: https://issues.apache.org/jira/browse/DRILL-5829 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.11.0 >Reporter: Paul Rogers >Assignee: Paul Rogers > Fix For: 1.13.0 > > > The class {{RecordBatchLoader}} decodes batches received over the wire. It > determines if the schema of the new batch matches that of the old one. To do > that, it uses a map of existing columns. > In Drill, column names follow SQL rules: they are case insensitive. Yet, the > implementation of {{RecordBatchLoader}} uses a case sensitive map: > {code} > final Map oldFields = Maps.newHashMap(); > {code} > This should be: > {code} > final Map oldFields = > CaseInsensitiveMap.newHashMap(); > {code} > Without this change, the receivers will report schema changes if a column > differs only in name case. However, Drill semantics say that names that > differ in case are identical, and so no schema change should be issued. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
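The effect is easy to reproduce outside Drill: a plain {{HashMap}} treats {{N_NAME}} and {{n_name}} as different columns, while a case-insensitive map (a {{TreeMap}} with {{String.CASE_INSENSITIVE_ORDER}} is used below as a stand-in for Drill's {{CaseInsensitiveMap}}) finds the existing column and so reports no schema change.
{code}
// Demonstrates why the schema comparison must be case-insensitive;
// TreeMap is a stand-in for Drill's CaseInsensitiveMap.
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class SchemaCompareDemo {
  public static void main(String[] args) {
    Map<String, String> caseSensitive = new HashMap<>();
    caseSensitive.put("n_name", "VarCharVector");

    Map<String, String> caseInsensitive = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
    caseInsensitive.put("n_name", "VarCharVector");

    // Incoming batch names the same column with different case.
    String incoming = "N_NAME";

    // HashMap misses the existing column, so the loader would report a schema change.
    System.out.println("case-sensitive lookup:   " + caseSensitive.get(incoming));
    // The case-insensitive map finds it, so no spurious schema change is reported.
    System.out.println("case-insensitive lookup: " + caseInsensitive.get(incoming));
  }
}
{code}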
[jira] [Updated] (DRILL-5690) RepeatedDecimal18Vector does not pass scale, precision to data vector
[ https://issues.apache.org/jira/browse/DRILL-5690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5690: Fix Version/s: (was: 1.12.0) 1.13.0 > RepeatedDecimal18Vector does not pass scale, precision to data vector > - > > Key: DRILL-5690 > URL: https://issues.apache.org/jira/browse/DRILL-5690 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.10.0 >Reporter: Paul Rogers >Assignee: Paul Rogers > Fix For: 1.13.0 > > > Decimal types require not just the type (Decimal9, Decimal18, etc.) but also > a precision and scale. The triple of (minor type, precision, scale) appears > in the {{MaterializedField}} for the nullable or required vectors. > A repeated vector has three parts: the {{RepeatedDecimal18Vector}} itself, which is > composed of a {{UInt4Vector}} offset vector and a {{Decimal18Vector}} that > holds the values. > When {{RepeatedDecimal18Vector}} creates the {{Decimal18Vector}} to hold the > values, it clones the {{MaterializedField}}. But, it *does not* clone the > scale and precision, resulting in the loss of critical information. > {code} > public RepeatedDecimal18Vector(MaterializedField field, BufferAllocator > allocator) { > super(field, allocator); > > addOrGetVector(VectorDescriptor.create(Types.required(field.getType().getMinorType()))); > } > {code} > This is normally not a problem because most code accesses the data via the > repeated vector. But, for code that needs to work with the values, the > results are wrong given that the types are wrong. (Values stored with one > scale, 123.45 (scale 2), will be retrieved with scale 0, as 123, say.) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
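A self-contained illustration of the metadata loss, with a hypothetical value type standing in for Drill's {{MaterializedField}} metadata: rebuilding the child field from the minor type alone drops scale and precision, whereas copying the full type preserves them.
{code}
// Hypothetical stand-in types; illustrates the scale/precision loss, not Drill's vector code.
public class DecimalMetadataDemo {

  record ColumnType(String minorType, int precision, int scale) {}

  public static void main(String[] args) {
    ColumnType parent = new ColumnType("DECIMAL18", 10, 2);

    // What the constructor quoted above effectively does: keep only the minor type.
    ColumnType childFromMinorTypeOnly = new ColumnType(parent.minorType(), 0, 0);

    // What is needed: carry precision and scale through to the data vector.
    ColumnType childFullCopy =
        new ColumnType(parent.minorType(), parent.precision(), parent.scale());

    System.out.println("minor-type-only child: " + childFromMinorTypeOnly);  // precision/scale dropped
    System.out.println("full-copy child:       " + childFullCopy);
  }
}
{code}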
[jira] [Closed] (DRILL-5872) Deserialization of profile JSON fails due to totalCost being reported as "NaN"
[ https://issues.apache.org/jira/browse/DRILL-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva closed DRILL-5872. --- Resolution: Won't Fix > Deserialization of profile JSON fails due to totalCost being reported as "NaN" > -- > > Key: DRILL-5872 > URL: https://issues.apache.org/jira/browse/DRILL-5872 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.12.0 >Reporter: Kunal Khatua >Assignee: Paul Rogers >Priority: Blocker > Fix For: 1.12.0 > > > With DRILL-5716, there is a change in the protobuf that introduces a new > attribute in the JSON document that Drill uses to interpret and render the > profile's details. > The totalCost attribute, used as part of showing the query cost (to > understand how it was assigned to the small/large queue), sometimes returns a > non-numeric text value {{"NaN"}}. > This breaks the UI with the message: > {code} > Failed to get profiles: > unable to deserialize value at key 2620698f-295e-f8d3-3ab7-01792b0f2669 > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
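For context on the failure mode (the issue was closed as Won't Fix, so this is not the adopted solution): Jackson rejects the non-numeric token {{NaN}} by default and only accepts it when {{ALLOW_NON_NUMERIC_NUMBERS}} is enabled. The {{QueryProfile}} POJO below is a made-up stand-in for the real profile class.
{code}
// Standalone Jackson sketch; QueryProfile is a made-up stand-in, not Drill's class.
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.ObjectMapper;

public class NaNCostDemo {

  public static class QueryProfile {
    public double totalCost;
  }

  public static void main(String[] args) throws Exception {
    String json = "{\"totalCost\": NaN}";

    ObjectMapper strict = new ObjectMapper();
    try {
      strict.readValue(json, QueryProfile.class);
    } catch (Exception e) {
      System.out.println("default mapper fails: " + e.getClass().getSimpleName());
    }

    ObjectMapper lenient = new ObjectMapper()
        .configure(JsonParser.Feature.ALLOW_NON_NUMERIC_NUMBERS, true);
    QueryProfile profile = lenient.readValue(json, QueryProfile.class);
    System.out.println("lenient mapper reads totalCost = " + profile.totalCost);
  }
}
{code}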
[jira] [Updated] (DRILL-5377) Five-digit year dates are displayed incorrectly via jdbc
[ https://issues.apache.org/jira/browse/DRILL-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-5377: Fix Version/s: (was: 1.12.0) 1.13.0 > Five-digit year dates are displayed incorrectly via jdbc > > > Key: DRILL-5377 > URL: https://issues.apache.org/jira/browse/DRILL-5377 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Parquet >Affects Versions: 1.10.0 >Reporter: Rahul Challapalli >Assignee: Vitalii Diravka >Priority: Minor > Fix For: 1.13.0 > > > git.commit.id.abbrev=38ef562 > The issue is connected to displaying five-digit year dates via JDBC. > Below is the output I get from the test framework when I disable auto-correction > for date fields: > {code} > select l_shipdate from table(cp.`tpch/lineitem.parquet` (type => 'parquet', > autoCorrectCorruptDates => false)) order by l_shipdate limit 10; > ^@356-03-19 > ^@356-03-21 > ^@356-03-21 > ^@356-03-23 > ^@356-03-24 > ^@356-03-24 > ^@356-03-26 > ^@356-03-26 > ^@356-03-26 > ^@356-03-26 > {code} > Or a simpler case: > {code} > 0: jdbc:drill:> select cast('11356-02-16' as date) as FUTURE_DATE from > (VALUES(1)); > +--+ > | FUTURE_DATE | > +--+ > | 356-02-16 | > +--+ > 1 row selected (0.293 seconds) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)