[jira] [Commented] (DRILL-7107) Unable to connect to Drill 1.15 through ZK

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16801211#comment-16801211
 ] 

ASF GitHub Bot commented on DRILL-7107:
---

bitblender commented on pull request #1702: DRILL-7107 Unable to connect to 
Drill 1.15 through ZK
URL: https://github.com/apache/drill/pull/1702#discussion_r268885320
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java
 ##
 @@ -81,23 +78,20 @@
   private ConcurrentHashMap endpointsMap = new ConcurrentHashMap();
   private static final Pattern ZK_COMPLEX_STRING = Pattern.compile("(^.*?)/(.*)/([^/]*)$");
 
-  public ZKClusterCoordinator(DrillConfig config, String connect)
-      throws IOException, DrillbitStartupException {
-    this(config, connect, null);
+  public ZKClusterCoordinator(DrillConfig config, String connect) {
+    this(config, connect, new DefaultACLProvider());
   }
 
-  public ZKClusterCoordinator(DrillConfig config, BootStrapContext context)
-      throws IOException, DrillbitStartupException {
-    this(config, null, context);
+  public ZKClusterCoordinator(DrillConfig config, ACLProvider aclProvider) {
+    this(config, null, aclProvider);
 
 Review comment:
   I tried writing a test where the Drillbits (inside ClusterFixture) are set up 
with ZK_APPLY_SECURE_ACL=false (to avoid the need to set up a secure ZK server 
within the unit test) and the ClientFixture is set up with 
ZK_APPLY_SECURE_ACL=true (to simulate the failure). Starting a test with 
different values for the same property turns out to be quite hard because the 
ClusterFixture internally instantiates a ClientFixture. Changing this behavior 
might affect other tests.
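
The constructor change in the diff above supplies a `DefaultACLProvider` where `null` was previously passed, which is what avoids the NPE reported below. A minimal, self-contained sketch of that null-object pattern follows; the interface and class names here are simplified stand-ins, not Drill's actual types:

```java
// Simplified stand-ins for illustration -- not Drill's real ACLProvider types.
interface AclProvider {
    String aclFor(String path);
}

class DefaultAclProvider implements AclProvider {
    // Placeholder default ACL; Drill's real default differs.
    public String aclFor(String path) {
        return "world:anyone";
    }
}

class Coordinator {
    private final AclProvider aclProvider;

    // Accepting a concrete default instead of null means downstream code
    // can call aclProvider.aclFor(...) without risking an NPE.
    Coordinator(AclProvider aclProvider) {
        this.aclProvider = (aclProvider != null) ? aclProvider : new DefaultAclProvider();
    }

    String aclFor(String path) {
        return aclProvider.aclFor(path);
    }
}
```

With this shape, a caller that previously passed `null` transparently gets the default behavior instead of a crash.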
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unable to connect to Drill 1.15 through ZK
> --
>
> Key: DRILL-7107
> URL: https://issues.apache.org/jira/browse/DRILL-7107
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Karthikeyan Manivannan
>Assignee: Karthikeyan Manivannan
>Priority: Major
> Fix For: 1.16.0
>
>
> After upgrading to Drill 1.15, users report that they are no longer able to 
> connect to Drill using a ZK quorum. They get the following "Unable to 
> setup ZK for client" error.
> [~]$ sqlline -u "jdbc:drill:zk=172.16.2.165:5181;auth=maprsasl"
> Error: Failure in connecting to Drill: 
> org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for client. 
> (state=,code=0)
> java.sql.SQLNonTransientConnectionException: Failure in connecting to Drill: 
> org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for client.
>  at 
> org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:174)
>  at 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:67)
>  at 
> org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:67)
>  at 
> org.apache.calcite.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:138)
>  at org.apache.drill.jdbc.Driver.connect(Driver.java:72)
>  at sqlline.DatabaseConnection.connect(DatabaseConnection.java:130)
>  at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:179)
>  at sqlline.Commands.connect(Commands.java:1247)
>  at sqlline.Commands.connect(Commands.java:1139)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
>  at sqlline.SqlLine.dispatch(SqlLine.java:722)
>  at sqlline.SqlLine.initArgs(SqlLine.java:416)
>  at sqlline.SqlLine.begin(SqlLine.java:514)
>  at sqlline.SqlLine.start(SqlLine.java:264)
>  at sqlline.SqlLine.main(SqlLine.java:195)
> Caused by: org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for 
> client.
>  at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:340)
>  at 
> org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:165)
>  ... 18 more
> Caused by: java.lang.NullPointerException
>  at 
> org.apache.drill.exec.coord.zk.ZKACLProviderFactory.findACLProvider(ZKACLProviderFactory.java:68)
>  at 
> org.apache.drill.exec.coord.zk.ZKACLProviderFactory.getACLProvider(ZKACLProviderFactory.java:47)
>  at 
> org.apache.drill.exec.coord.zk.ZKClusterCoordinator.<init>(ZKClusterCoordinator

[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16801044#comment-16801044
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

kkhatua commented on issue #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#issuecomment-476368650
 
 
   @vvysotskyi  made changes and verified that tests passed.
 



> Implement JDBC Statement.setMaxRows() with System Option
> 
>
> Key: DRILL-7048
> URL: https://issues.apache.org/jira/browse/DRILL-7048
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Client - JDBC, Query Planning & Optimization
>Affects Versions: 1.15.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.17.0
>
>
> With DRILL-6960, the webUI will get an auto-limit on the number of results 
> fetched.
> Since more of the plumbing is already there, it makes sense to provide the 
> same for the JDBC client.
> In addition, it would be nice if the server could have a pre-defined value as 
> well (default 0, i.e. no limit), so that an _admin_ can enforce a maximum 
> limit on the result-set size as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7108) With statistics enabled TPCH 16 has two additional exchange operators

2019-03-25 Thread Robert Hou (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16801040#comment-16801040
 ] 

Robert Hou commented on DRILL-7108:
---

I have verified this fix.

> With statistics enabled TPCH 16 has two additional exchange operators
> -
>
> Key: DRILL-7108
> URL: https://issues.apache.org/jira/browse/DRILL-7108
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.16.0
>Reporter: Robert Hou
>Assignee: Gautam Parai
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> TPCH 16 with sf 100 runs 14% slower.  Here is the query:
> {noformat}
> select
>   p.p_brand,
>   p.p_type,
>   p.p_size,
>   count(distinct ps.ps_suppkey) as supplier_cnt
> from
>   partsupp ps,
>   part p
> where
>   p.p_partkey = ps.ps_partkey
>   and p.p_brand <> 'Brand#21'
>   and p.p_type not like 'MEDIUM PLATED%'
>   and p.p_size in (38, 2, 8, 31, 44, 5, 14, 24)
>   and ps.ps_suppkey not in (
> select
>   s.s_suppkey
> from
>   supplier s
> where
>   s.s_comment like '%Customer%Complaints%'
>   )
> group by
>   p.p_brand,
>   p.p_type,
>   p.p_size
> order by
>   supplier_cnt desc,
>   p.p_brand,
>   p.p_type,
>   p.p_size;
> {noformat}





[jira] [Closed] (DRILL-7108) With statistics enabled TPCH 16 has two additional exchange operators

2019-03-25 Thread Robert Hou (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou closed DRILL-7108.
-






[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800974#comment-16800974
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

kkhatua commented on pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r268800813
 
 

 ##
 File path: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillStatementImpl.java
 ##
 @@ -270,4 +271,10 @@ public void setResultSet(AvaticaResultSet resultSet) {
   public void setUpdateCount(int value) {
 updateCount = value;
   }
+
+  @Override
+  public void setLargeMaxRows(long maxRowCount) throws SQLException {
+    execute("ALTER SESSION SET `" + ExecConstants.QUERY_MAX_ROWS + "`=" + maxRowCount);
+    this.maxRowCount = maxRowCount;
 
 Review comment:
   We need this here to ensure that when `getLargeMaxRows()` is called, we read 
back the value that was set using `setLargeMaxRows()`. Avatica only holds the 
value that has been set.
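
The caching described here can be sketched independently of Drill. The class below is a hypothetical stand-in (not the actual `DrillStatementImpl`) that keeps the value locally so the getter returns what the setter stored:

```java
// Hypothetical sketch: cache the row limit locally because the getter
// cannot read it back from the server-side session option.
class MaxRowsCachingStatement {
    private long maxRowCount;

    public void setLargeMaxRows(long maxRowCount) {
        // In Drill this would also execute:
        //   ALTER SESSION SET `<QUERY_MAX_ROWS option>` = <maxRowCount>
        this.maxRowCount = maxRowCount;
    }

    public long getLargeMaxRows() {
        // Read back the locally cached value.
        return maxRowCount;
    }
}
```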
 








[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800919#comment-16800919
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

kkhatua commented on pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r268772086
 
 

 ##
 File path: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillPreparedStatement.java
 ##
 @@ -32,4 +33,25 @@
  */
 public interface DrillPreparedStatement extends PreparedStatement {
 
+  /**
+   * @throws SQLException
+   *           Any SQL exception
+   */
+  @Override
+  int getMaxRows() throws SQLException;
 
 Review comment:
   I think I was declaring it because it seemed Avatica preferred using 
`setLargeMaxRows() / getLargeMaxRows()`. In any case, I cannot really override 
this (all calls to `setMaxRows() / getMaxRows()` are redirected to the 
`...Large...` methods), so I'll remove it.
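
The redirection mentioned here, with the int-based `setMaxRows()`/`getMaxRows()` forwarding to the long-based `...Large...` variants, can be sketched as below. This is an illustrative stand-in, not Avatica's actual code:

```java
// Illustrative delegation: the int variants only forward, so overriding
// them separately would have no effect -- hence removing the declaration.
class LargeDelegatingStatement {
    private long largeMaxRows;

    public void setLargeMaxRows(long max) {
        this.largeMaxRows = max;
    }

    public long getLargeMaxRows() {
        return largeMaxRows;
    }

    public void setMaxRows(int max) {
        setLargeMaxRows(max); // widen to long and delegate
    }

    public int getMaxRows() {
        // Clamp so a large long limit still fits in an int.
        return (int) Math.min(Integer.MAX_VALUE, getLargeMaxRows());
    }
}
```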
 








[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800917#comment-16800917
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

kkhatua commented on pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r268771066
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/ops/QueryContext.java
 ##
 @@ -107,6 +108,20 @@ public QueryContext(final UserSession session, final DrillbitContext drillbitCon
       this.table = drillbitContext.getOperatorTable();
     }
 
+    // Checking for limit on ResultSet rowcount and if user attempting to override the system value
+    int sessionMaxRowCount = queryOptions.getOption(ExecConstants.QUERY_MAX_ROWS).num_val.intValue();
+    int defaultMaxRowCount = queryOptions.getOptionManager(OptionScope.SYSTEM).getOption(ExecConstants.QUERY_MAX_ROWS).num_val.intValue();
+    if (sessionMaxRowCount > 0 && defaultMaxRowCount > 0) {
+      this.autoLimitRowCount = Math.min(sessionMaxRowCount, defaultMaxRowCount);
+    } else {
+      this.autoLimitRowCount = Math.max(sessionMaxRowCount, defaultMaxRowCount);
+    }
+    if (autoLimitRowCount == defaultMaxRowCount && defaultMaxRowCount != sessionMaxRowCount) {
+      // Required to indicate via OptionScope=QueryLevel that session limit is overridden by system limit
+      queryOptions.setLocalOption(ExecConstants.QUERY_MAX_ROWS, autoLimitRowCount);
+    }
+    logger.debug("ResultSet size is auto-limited to {} rows [Session: {} / Default: {}]", this.autoLimitRowCount, sessionMaxRowCount, defaultMaxRowCount);
 
 Review comment:
   Good point. Will mark it for when autoLimit is non-zero.
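
The session-vs-system resolution in the diff above boils down to a small pure function: when both limits are positive the stricter (smaller) one wins; otherwise whichever is non-zero applies, with 0 meaning no limit. A sketch with an illustrative helper name (not Drill's API):

```java
class AutoLimitResolver {
    // Mirrors the min/max logic in the diff; 0 means "no limit".
    static int resolve(int sessionMax, int systemMax) {
        if (sessionMax > 0 && systemMax > 0) {
            return Math.min(sessionMax, systemMax); // both set: stricter wins
        }
        return Math.max(sessionMax, systemMax); // at most one is set (or none)
    }
}
```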
 








[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800913#comment-16800913
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

kkhatua commented on pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r268769689
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/ops/QueryContext.java
 ##
 @@ -283,6 +298,21 @@ public RemoteFunctionRegistry getRemoteFunctionRegistry() {
     return drillbitContext.getRemoteFunctionRegistry();
   }
 
+  /**
+   * Returns the maximum size of auto-limited resultset
+   * @return Maximum size of auto-limited resultSet
+   */
+  public int getAutoLimitRowCount() {
 
 Review comment:
   OK. Since I was using it in 4 other places, it seemed cleaner to have a 
single API instead of resolving it through `getOptions().getOption...`.
 








[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800911#comment-16800911
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

kkhatua commented on pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r268768631
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
 ##
 @@ -283,7 +283,9 @@
   new OptionDefinition(ExecConstants.NDV_BLOOM_FILTER_FPOS_PROB_VALIDATOR),
   new OptionDefinition(ExecConstants.RM_QUERY_TAGS_VALIDATOR,
     new OptionMetaData(OptionValue.AccessibleScopes.SESSION_AND_QUERY, false, false)),
-  new OptionDefinition(ExecConstants.RM_QUEUES_WAIT_FOR_PREFERRED_NODES_VALIDATOR)
+  new OptionDefinition(ExecConstants.RM_QUEUES_WAIT_FOR_PREFERRED_NODES_VALIDATOR),
+  new OptionDefinition(ExecConstants.RETURN_RESULT_SET_FOR_DDL_VALIDATOR),
 
 Review comment:
   Looks like this slipped through when resolving merge conflicts with the 
latest master branch. 
 








[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800909#comment-16800909
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

kkhatua commented on pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r268767703
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java
 ##
 @@ -72,12 +73,18 @@
 public class PreparedStatementTest extends JdbcTestBase {
 
   private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(PreparedStatementTest.class);
+  private static final Random RANDOMIZER = new Random(20150304);
 
   private static final String SYS_VERSION_SQL = "select * from sys.version";
   private static final String SYS_RANDOM_SQL =
   "SELECT cast(random() as varchar) as myStr FROM (VALUES(1)) " +
   "union SELECT cast(random() as varchar) as myStr FROM (VALUES(1)) " +
   "union SELECT cast(random() as varchar) as myStr FROM (VALUES(1)) ";
+  private static final String SYS_OPTIONS_SQL = "SELECT * FROM sys.options";
+  private static final String SYS_OPTIONS_SQL_LIMIT_10 = "SELECT * FROM sys.options LIMIT 12";
+  private static final String ALTER_SYS_OPTIONS_MAX_ROWS_LIMIT_X = "ALTER SYSTEM SET `" + ExecConstants.QUERY_MAX_ROWS + "`=";
 
 Review comment:
   👍  Will fix for `StatementTest` as well
 








[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800904#comment-16800904
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

kkhatua commented on pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r268766977
 
 

 ##
 File path: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillPreparedStatementImpl.java
 ##
 @@ -259,4 +261,17 @@ public void setObject(int parameterIndex, Object x, SQLType targetSqlType) throw
 checkOpen();
 super.setObject(parameterIndex, x, targetSqlType);
   }
+
+  @Override
+  public void setLargeMaxRows(long maxRowCount) throws SQLException {
+    Statement setMaxStmt = this.connection.createStatement();
+    setMaxStmt.execute("ALTER SESSION SET `" + ExecConstants.QUERY_MAX_ROWS + "`=" + maxRowCount);
 
 Review comment:
   👍 
 








[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800905#comment-16800905
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

kkhatua commented on pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r268767018
 
 

 ##
 File path: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillStatementImpl.java
 ##
 @@ -270,4 +271,10 @@ public void setResultSet(AvaticaResultSet resultSet) {
   public void setUpdateCount(int value) {
 updateCount = value;
   }
+
+  @Override
+  public void setLargeMaxRows(long maxRowCount) throws SQLException {
+    execute("ALTER SESSION SET `" + ExecConstants.QUERY_MAX_ROWS + "`=" + maxRowCount);
 
 Review comment:
   👍 
 








[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800903#comment-16800903
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

kkhatua commented on pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r268766549
 
 

 ##
 File path: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java
 ##
 @@ -361,6 +361,9 @@ void close() {
 ExecConstants.JDBC_BATCH_QUEUE_THROTTLING_THRESHOLD );
 resultsListener = new ResultsListener(this, batchQueueThrottlingThreshold);
 currentBatchHolder = new RecordBatchLoader(client.getAllocator());
+
+// Set Query Timeout and MaxRows
 
 Review comment:
   Actually, we don't need it here any more. I'll fix the comment.
 








[jira] [Commented] (DRILL-7051) Upgrade jetty

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800870#comment-16800870
 ] 

ASF GitHub Bot commented on DRILL-7051:
---

sohami commented on pull request #1681: DRILL-7051: Upgrade jetty
URL: https://github.com/apache/drill/pull/1681#discussion_r268740814
 
 

 ##
 File path: pom.xml
 ##
 @@ -2553,7 +2548,13 @@
         <version>4.0.1</version>
         <scope>provided</scope>
       </dependency>
-
+
+      <dependency>
+        <groupId>javax.ws.rs</groupId>
+        <artifactId>javax.ws.rs-api</artifactId>
 
 Review comment:
   Why do we need this and the other Javax Servlet dependencies?
 



> Upgrade jetty 
> --
>
> Key: DRILL-7051
> URL: https://issues.apache.org/jira/browse/DRILL-7051
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.15.0
>Reporter: Veera Naranammalpuram
>Assignee: Vitalii Diravka
>Priority: Major
> Fix For: 1.16.0
>
>
> Is Drill using a really old version of the Jetty web server? The jars 
> suggest it's using Jetty 9.1, built sometime in 2014.
> {noformat}
> -rw-r--r-- 1 veeranaranammalpuram staff 15988 Nov 20 2017 
> jetty-continuation-9.1.1.v20140108.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 103288 Nov 20 2017 
> jetty-http-9.1.5.v20140505.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 101519 Nov 20 2017 
> jetty-io-9.1.5.v20140505.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 95906 Nov 20 2017 
> jetty-security-9.1.5.v20140505.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 401593 Nov 20 2017 
> jetty-server-9.1.5.v20140505.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 110992 Nov 20 2017 
> jetty-servlet-9.1.5.v20140505.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 119215 Nov 20 2017 
> jetty-servlets-9.1.5.v20140505.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 341683 Nov 20 2017 
> jetty-util-9.1.5.v20140505.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 38707 Dec 21 15:42 
> jetty-util-ajax-9.3.19.v20170502.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 111466 Nov 20 2017 
> jetty-webapp-9.1.1.v20140108.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 41763 Nov 20 2017 
> jetty-xml-9.1.1.v20140108.jar {noformat}
> This version is shown as deprecated: 
> [https://www.eclipse.org/jetty/documentation/current/what-jetty-version.html#d0e203]
> Opening this issue to upgrade Jetty to the latest stable supported version. 





[jira] [Updated] (DRILL-7077) Add Function to Facilitate Time Series Analysis

2019-03-25 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-7077:

Labels: doc-impacting ready-to-commit  (was: doc-impacting)

> Add Function to Facilitate Time Series Analysis
> ---
>
> Key: DRILL-7077
> URL: https://issues.apache.org/jira/browse/DRILL-7077
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.16.0
>
>
> When analyzing time-based data, you will often have to aggregate by time 
> grains. While some time grains are easy to calculate, others, such as 
> quarter, can be quite difficult. These functions enable a user to quickly and 
> easily aggregate data by various units of time. Usage is as follows:
> {code:java}
> SELECT <fields>
> FROM <data>
> GROUP BY nearestDate(<date_field>, <time_grain>){code}
> So let's say that a user wanted to count the number of hits on a web server 
> per 15-minute interval; the query might look like this:
> {code:java}
> SELECT nearestDate(`eventDate`, '15MINUTE' ) AS eventDate,
> COUNT(*) AS hitCount
> FROM dfs.`log.httpd`
> GROUP BY nearestDate(`eventDate`, '15MINUTE'){code}
> Currently supports the following time units:
>  * YEAR
>  * QUARTER
>  * MONTH
>  * WEEK_SUNDAY
>  * WEEK_MONDAY
>  * DAY
>  * HOUR
>  * HALF_HOUR / 30MIN
>  * QUARTER_HOUR / 15MIN
>  * MINUTE
>  * 30SECOND
>  * 15SECOND
>  * SECOND
>  
>  
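As a rough sketch of the truncation semantics described above (illustrative only; names and behavior here are assumptions, not Drill's actual implementation):

```java
import java.time.LocalDateTime;

public class NearestDateSketch {
    // Floor a timestamp to the preceding quarter-hour boundary
    // (minute 0, 15, 30, or 45), mirroring the QUARTER_HOUR / 15MIN grain.
    static LocalDateTime toQuarterHour(LocalDateTime d) {
        int minute = (d.getMinute() / 15) * 15;
        return LocalDateTime.of(d.getYear(), d.getMonthValue(), d.getDayOfMonth(),
                d.getHour(), minute, 0);
    }

    public static void main(String[] args) {
        // 10:37:22 floors to 10:30:00
        System.out.println(toQuarterHour(LocalDateTime.of(2018, 5, 4, 10, 37, 22)));
    }
}
```

Every other grain follows the same pattern: zero out the fields finer than the grain, then snap the coarsest varying field down to its bucket boundary.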





[jira] [Commented] (DRILL-7077) Add Function to Facilitate Time Series Analysis

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800832#comment-16800832
 ] 

ASF GitHub Bot commented on DRILL-7077:
---

arina-ielchiieva commented on issue #1680: DRILL-7077: Add Function to 
Facilitate Time Series Analysis
URL: https://github.com/apache/drill/pull/1680#issuecomment-476264054
 
 
   +1, LGTM.
   Please squash the commits.
 



> Add Function to Facilitate Time Series Analysis
> ---
>
> Key: DRILL-7077
> URL: https://issues.apache.org/jira/browse/DRILL-7077
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.16.0
>
>
> When analyzing time-based data, you will often have to aggregate by time 
> grains. While some time grains are easy to calculate, others, such as 
> quarter, can be quite difficult. These functions enable a user to quickly and 
> easily aggregate data by various units of time. Usage is as follows:
> {code:java}
> SELECT <fields>
> FROM <data>
> GROUP BY nearestDate(<date_field>, <time_grain>){code}
> So let's say that a user wanted to count the number of hits on a web server 
> per 15-minute interval; the query might look like this:
> {code:java}
> SELECT nearestDate(`eventDate`, '15MINUTE' ) AS eventDate,
> COUNT(*) AS hitCount
> FROM dfs.`log.httpd`
> GROUP BY nearestDate(`eventDate`, '15MINUTE'){code}
> Currently supports the following time units:
>  * YEAR
>  * QUARTER
>  * MONTH
>  * WEEK_SUNDAY
>  * WEEK_MONDAY
>  * DAY
>  * HOUR
>  * HALF_HOUR / 30MIN
>  * QUARTER_HOUR / 15MIN
>  * MINUTE
>  * 30SECOND
>  * 15SECOND
>  * SECOND
>  
>  





[jira] [Commented] (DRILL-7077) Add Function to Facilitate Time Series Analysis

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800825#comment-16800825
 ] 

ASF GitHub Bot commented on DRILL-7077:
---

ihuzenko commented on pull request #1680: DRILL-7077: Add Function to 
Facilitate Time Series Analysis
URL: https://github.com/apache/drill/pull/1680#discussion_r268716085
 
 

 ##
 File path: 
contrib/udfs/src/test/java/org/apache/drill/exec/udfs/TestNearestDateFunctions.java
 ##
 @@ -0,0 +1,158 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.categories.SqlFunctionTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterFixtureBuilder;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.time.LocalDateTime;
+
+import static org.junit.Assert.assertTrue;
+
+@Category({UnlikelyTest.class, SqlFunctionTest.class})
+public class TestNearestDateFunctions extends ClusterTest {
+
+  @BeforeClass
+  public static void setup() throws Exception {
+ClusterFixtureBuilder builder = ClusterFixture.builder(dirTestWatcher);
+startCluster(builder);
+  }
+
+  @Test
+  public void testNearestDate() throws Exception {
 
 Review comment:
   ok, cool)  
 



> Add Function to Facilitate Time Series Analysis
> ---
>
> Key: DRILL-7077
> URL: https://issues.apache.org/jira/browse/DRILL-7077
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.16.0
>
>
> When analyzing time-based data, you will often have to aggregate by time 
> grains. While some time grains are easy to calculate, others, such as 
> quarter, can be quite difficult. These functions enable a user to quickly and 
> easily aggregate data by various units of time. Usage is as follows:
> {code:java}
> SELECT <fields>
> FROM <data>
> GROUP BY nearestDate(<date_field>, <time_grain>){code}
> So let's say that a user wanted to count the number of hits on a web server 
> per 15-minute interval; the query might look like this:
> {code:java}
> SELECT nearestDate(`eventDate`, '15MINUTE' ) AS eventDate,
> COUNT(*) AS hitCount
> FROM dfs.`log.httpd`
> GROUP BY nearestDate(`eventDate`, '15MINUTE'){code}
> Currently supports the following time units:
>  * YEAR
>  * QUARTER
>  * MONTH
>  * WEEK_SUNDAY
>  * WEEK_MONDAY
>  * DAY
>  * HOUR
>  * HALF_HOUR / 30MIN
>  * QUARTER_HOUR / 15MIN
>  * MINUTE
>  * 30SECOND
>  * 15SECOND
>  * SECOND
>  
>  





[jira] [Commented] (DRILL-7077) Add Function to Facilitate Time Series Analysis

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800823#comment-16800823
 ] 

ASF GitHub Bot commented on DRILL-7077:
---

ihuzenko commented on pull request #1680: DRILL-7077: Add Function to 
Facilitate Time Series Analysis
URL: https://github.com/apache/drill/pull/1680#discussion_r268714887
 
 

 ##
 File path: 
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/NearestDateUtils.java
 ##
 @@ -0,0 +1,149 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.udfs;
+
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+
+import java.time.temporal.TemporalAdjusters;
+import java.time.LocalDateTime;
+import java.time.DayOfWeek;
+import java.time.temporal.ChronoUnit;
+import java.util.Arrays;
+
+public class NearestDateUtils {
+  /**
+   * Specifies the time grouping to be used with the nearest date function
+   */
+  private enum TimeInterval {
+YEAR,
+QUARTER,
+MONTH,
+WEEK_SUNDAY,
+WEEK_MONDAY,
+DAY,
+HOUR,
+HALF_HOUR,
+QUARTER_HOUR,
+MINUTE,
+HALF_MINUTE,
+QUARTER_MINUTE,
+SECOND
+  }
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(NearestDateUtils.class);
+
+  /**
+   * This function takes a Java LocalDateTime object and an interval string, and returns
+   * the nearest date closest to that time. For instance, if you specified the date
+   * 2018-05-04 and the interval YEAR, the function will return 2018-01-01.
+   *
+   * @param d the original datetime before adjustments
+   * @param interval the interval string used to truncate the supplied date
+   * @return the modified LocalDateTime
+   */
+  public static final java.time.LocalDateTime getDate(java.time.LocalDateTime d, String interval) {
+java.time.LocalDateTime newDate = d;
+int year = d.getYear();
+int month = d.getMonth().getValue();
+int day = d.getDayOfMonth();
+int hour = d.getHour();
+int minute = d.getMinute();
+int second = d.getSecond();
+TimeInterval adjustmentAmount;
+try {
+  adjustmentAmount = TimeInterval.valueOf(interval.toUpperCase());
+} catch (IllegalArgumentException e) {
+  throw new DrillRuntimeException(String.format("[%s] is not a valid time statement. Expecting: %s", interval, Arrays.asList(TimeInterval.values())));
+}
+switch (adjustmentAmount) {
+  case YEAR:
+newDate = LocalDateTime.of(year, 1, 1, 0, 0, 0);
+break;
+  case QUARTER:
+newDate = LocalDateTime.of(year, ((month - 1) / 3) * 3 + 1, 1, 0, 0, 0);
+break;
+  case MONTH:
+newDate = LocalDateTime.of(year, month, 1, 0, 0, 0);
+break;
+  case WEEK_SUNDAY:
+newDate = 
newDate.with(TemporalAdjusters.previousOrSame(DayOfWeek.SUNDAY))
+.truncatedTo(ChronoUnit.DAYS);
+break;
+  case WEEK_MONDAY:
+newDate = 
newDate.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY))
+.truncatedTo(ChronoUnit.DAYS);
+break;
+  case DAY:
+newDate = LocalDateTime.of(year, month, day, 0, 0, 0);
+break;
+  case HOUR:
+newDate = LocalDateTime.of(year, month, day, hour, 0, 0);
+break;
+  case HALF_HOUR:
+if (minute >= 30) {
+  minute = 30;
+} else {
+  minute = 0;
+}
+newDate = LocalDateTime.of(year, month, day, hour, minute, 0);
+break;
+  case QUARTER_HOUR:
+if (minute >= 45) {
+  minute = 45;
+} else if (minute >= 30) {
+  minute = 30;
+} else if (minute >= 15) {
+  minute = 15;
+} else {
+  minute = 0;
+}
+newDate = LocalDateTime.of(year, month, day, hour, minute, 0);
+break;
+  case MINUTE:
+newDate = LocalDateTime.of(year, month, day, hour, minute, 0);
+break;
+  case HALF_MINUTE:
+if (second >= 30) {
+  second = 30;
+} else {
+  second = 0;
+}
+newDate = LocalDateTime.of(year, month, day, hour, minute, second);
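The month arithmetic in the QUARTER branch above is easy to get wrong (an unadjusted `(month / 3) * 3 + 1` maps March to April and December to an invalid month 13). A minimal sketch of a quarter-start computation for 1-based months (illustrative, not necessarily the code that was merged):

```java
public class QuarterStart {
    // First month of the quarter containing a 1-based month:
    // 1-3 -> 1, 4-6 -> 4, 7-9 -> 7, 10-12 -> 10.
    static int quarterStartMonth(int month) {
        return ((month - 1) / 3) * 3 + 1;
    }

    public static void main(String[] args) {
        for (int m = 1; m <= 12; m++) {
            System.out.println(m + " -> " + quarterStartMonth(m));
        }
    }
}
```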

[jira] [Commented] (DRILL-7032) Ignore corrupt rows in a PCAP file

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800813#comment-16800813
 ] 

ASF GitHub Bot commented on DRILL-7032:
---

cgivre commented on issue #1637: DRILL-7032: Ignore corrupt rows in a PCAP file
URL: https://github.com/apache/drill/pull/1637#issuecomment-476254011
 
 
   Commits squashed.   Thanks!
 



> Ignore corrupt rows in a PCAP file
> --
>
> Key: DRILL-7032
> URL: https://issues.apache.org/jira/browse/DRILL-7032
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.15.0
> Environment: OS: Ubuntu 18.04
> Drill version: 1.15.0
> Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
>Reporter: Giovanni Conte
>Assignee: Charles Givre
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> It would be useful for Drill to have some ability to ignore corrupt rows in a 
> PCAP file instead of throwing a Java exception.
> This is because there are many PCAP files with corrupted lines, and this 
> functionality avoids having to pre-fix the packet captures (example 
> file attached).





[jira] [Commented] (DRILL-7032) Ignore corrupt rows in a PCAP file

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800807#comment-16800807
 ] 

ASF GitHub Bot commented on DRILL-7032:
---

arina-ielchiieva commented on issue #1637: DRILL-7032: Ignore corrupt rows in a 
PCAP file
URL: https://github.com/apache/drill/pull/1637#issuecomment-476251224
 
 
   +1, LGTM. @cgivre please squash the commits.
 



> Ignore corrupt rows in a PCAP file
> --
>
> Key: DRILL-7032
> URL: https://issues.apache.org/jira/browse/DRILL-7032
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.15.0
> Environment: OS: Ubuntu 18.04
> Drill version: 1.15.0
> Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
>Reporter: Giovanni Conte
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.16.0
>
>
> It would be useful for Drill to have some ability to ignore corrupt rows in a 
> PCAP file instead of throwing a Java exception.
> This is because there are many PCAP files with corrupted lines, and this 
> functionality avoids having to pre-fix the packet captures (example 
> file attached).





[jira] [Updated] (DRILL-7032) Ignore corrupt rows in a PCAP file

2019-03-25 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-7032:

Labels: ready-to-commit  (was: )

> Ignore corrupt rows in a PCAP file
> --
>
> Key: DRILL-7032
> URL: https://issues.apache.org/jira/browse/DRILL-7032
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.15.0
> Environment: OS: Ubuntu 18.04
> Drill version: 1.15.0
> Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
>Reporter: Giovanni Conte
>Assignee: Charles Givre
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> It would be useful for Drill to have some ability to ignore corrupt rows in a 
> PCAP file instead of throwing a Java exception.
> This is because there are many PCAP files with corrupted lines, and this 
> functionality avoids having to pre-fix the packet captures (example 
> file attached).





[jira] [Commented] (DRILL-7032) Ignore corrupt rows in a PCAP file

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800779#comment-16800779
 ] 

ASF GitHub Bot commented on DRILL-7032:
---

cgivre commented on pull request #1637: DRILL-7032: Ignore corrupt rows in a 
PCAP file
URL: https://github.com/apache/drill/pull/1637#discussion_r268688621
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/pcap/decoder/Packet.java
 ##
 @@ -324,7 +333,12 @@ public int getDst_port() {
 byte[] data = null;
 if (packetLength >= payloadDataStart) {
   data = new byte[packetLength - payloadDataStart];
-  System.arraycopy(raw, ipOffset + payloadDataStart, data, 0, data.length);
+  try {
+System.arraycopy(raw, ipOffset + payloadDataStart, data, 0, 
data.length);
+  } catch (Exception e) {
+isCorrupt = true;
+logger.info("Error while parsing PCAP data: ", e.getMessage());
 
 Review comment:
   Thanks @arina-ielchiieva.  Fixed.
 



> Ignore corrupt rows in a PCAP file
> --
>
> Key: DRILL-7032
> URL: https://issues.apache.org/jira/browse/DRILL-7032
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.15.0
> Environment: OS: Ubuntu 18.04
> Drill version: 1.15.0
> Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
>Reporter: Giovanni Conte
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.16.0
>
>
> It would be useful for Drill to have some ability to ignore corrupt rows in a 
> PCAP file instead of throwing a Java exception.
> This is because there are many PCAP files with corrupted lines, and this 
> functionality avoids having to pre-fix the packet captures (example 
> file attached).



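The failure mode that the try/catch in the diff above guards against, a declared payload length extending past the raw capture buffer, can also be handled with an explicit bounds check. A hypothetical sketch (not the PR's code; returning null stands in for setting the isCorrupt flag):

```java
public class PayloadCopy {
    // Copy the payload bytes out of the raw capture buffer, returning null
    // when the declared payload runs past the buffer, so the caller can
    // mark the packet as corrupt instead of letting an exception propagate.
    static byte[] copyPayload(byte[] raw, int offset, int length) {
        if (offset < 0 || length < 0 || offset + length > raw.length) {
            return null;
        }
        byte[] data = new byte[length];
        System.arraycopy(raw, offset, data, 0, length);
        return data;
    }
}
```

Checking bounds up front avoids paying for exception construction on every corrupt packet in a large capture.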


[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800735#comment-16800735
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

vvysotskyi commented on pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r268648485
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/ops/QueryContext.java
 ##
 @@ -107,6 +108,20 @@ public QueryContext(final UserSession session, final 
DrillbitContext drillbitCon
   this.table = drillbitContext.getOperatorTable();
 }
 
+// Check the limit on the ResultSet row count, and whether the user is 
attempting to override the system value
+int sessionMaxRowCount = 
queryOptions.getOption(ExecConstants.QUERY_MAX_ROWS).num_val.intValue();
+int defaultMaxRowCount = 
queryOptions.getOptionManager(OptionScope.SYSTEM).getOption(ExecConstants.QUERY_MAX_ROWS).num_val.intValue();
+if (sessionMaxRowCount > 0 && defaultMaxRowCount > 0) {
+  this.autoLimitRowCount = Math.min(sessionMaxRowCount, 
defaultMaxRowCount);
+} else {
+  this.autoLimitRowCount = Math.max(sessionMaxRowCount, 
defaultMaxRowCount);
+}
+if (autoLimitRowCount == defaultMaxRowCount && defaultMaxRowCount != 
sessionMaxRowCount) {
+  // Required to indicate via OptionScope=QueryLevel that session limit is 
overridden by system limit
+  queryOptions.setLocalOption(ExecConstants.QUERY_MAX_ROWS, 
autoLimitRowCount);
+}
+logger.debug("ResultSet size is auto-limited to {} rows [Session: {} / 
Default: {}]", this.autoLimitRowCount, sessionMaxRowCount, defaultMaxRowCount);
 
 Review comment:
   This message will be logged even for the case when auto limit does not 
happen.
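   The session/system combination rule in the diff above can be sketched as a pure function (the method name is illustrative):
   
   ```java
   public class AutoLimit {
       // Combine session- and system-level max-row settings: when both are
       // positive take the stricter (smaller) one; otherwise take whichever
       // is set, since 0 means "no limit".
       static int effectiveLimit(int sessionMax, int systemMax) {
           return (sessionMax > 0 && systemMax > 0)
                   ? Math.min(sessionMax, systemMax)
                   : Math.max(sessionMax, systemMax);
       }
   }
   ```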
 



> Implement JDBC Statement.setMaxRows() with System Option
> 
>
> Key: DRILL-7048
> URL: https://issues.apache.org/jira/browse/DRILL-7048
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Client - JDBC, Query Planning & Optimization
>Affects Versions: 1.15.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.17.0
>
>
> With DRILL-6960, the webUI will get an auto-limit on the number of results 
> fetched.
> Since more of the plumbing is already there, it makes sense to provide the 
> same for the JDBC client.
> In addition, it would be nice if the server could have a pre-defined value as 
> well (default 0, i.e. no limit) so that an _admin_ would be able to enforce a 
> max limit on the resultset size.





[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800743#comment-16800743
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

vvysotskyi commented on pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r268658212
 
 

 ##
 File path: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillPreparedStatementImpl.java
 ##
 @@ -259,4 +261,17 @@ public void setObject(int parameterIndex, Object x, 
SQLType targetSqlType) throw
 checkOpen();
 super.setObject(parameterIndex, x, targetSqlType);
   }
+
+  @Override
+  public void setLargeMaxRows(long maxRowCount) throws SQLException {
+Statement setMaxStmt = this.connection.createStatement();
+setMaxStmt.execute("ALTER SESSION SET `" + ExecConstants.QUERY_MAX_ROWS + 
"`="+maxRowCount);
 
 Review comment:
   ```suggestion
   setMaxStmt.execute("ALTER SESSION SET `" + ExecConstants.QUERY_MAX_ROWS 
+ "`=" + maxRowCount);
   ```
 

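The suggestion above only adjusts spacing; the statement being built pushes the JDBC max-rows value down to the server as a session option. A sketch of the string construction (the option name below is a placeholder, not necessarily the real value of ExecConstants.QUERY_MAX_ROWS):

```java
public class MaxRowsSql {
    // Build the ALTER SESSION statement used to propagate
    // Statement.setLargeMaxRows() to a server-side session option.
    static String alterSessionMaxRows(String optionName, long maxRowCount) {
        return "ALTER SESSION SET `" + optionName + "`=" + maxRowCount;
    }

    public static void main(String[] args) {
        System.out.println(alterSessionMaxRows("exec.query.max_rows", 100L));
    }
}
```

Executing a side statement on the shared connection does mean the limit outlives the prepared statement, which is why the session option scope matters here.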


> Implement JDBC Statement.setMaxRows() with System Option
> 
>
> Key: DRILL-7048
> URL: https://issues.apache.org/jira/browse/DRILL-7048
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Client - JDBC, Query Planning & Optimization
>Affects Versions: 1.15.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.17.0
>
>
> With DRILL-6960, the webUI will get an auto-limit on the number of results 
> fetched.
> Since more of the plumbing is already there, it makes sense to provide the 
> same for the JDBC client.
> In addition, it would be nice if the server could have a pre-defined value as 
> well (default 0, i.e. no limit) so that an _admin_ would be able to enforce a 
> max limit on the resultset size.





[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800737#comment-16800737
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

vvysotskyi commented on pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r268650777
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/SystemOptionManager.java
 ##
 @@ -283,7 +283,9 @@
   new OptionDefinition(ExecConstants.NDV_BLOOM_FILTER_FPOS_PROB_VALIDATOR),
   new OptionDefinition(ExecConstants.RM_QUERY_TAGS_VALIDATOR,
 new OptionMetaData(OptionValue.AccessibleScopes.SESSION_AND_QUERY, 
false, false)),
-  new 
OptionDefinition(ExecConstants.RM_QUEUES_WAIT_FOR_PREFERRED_NODES_VALIDATOR)
+  new 
OptionDefinition(ExecConstants.RM_QUEUES_WAIT_FOR_PREFERRED_NODES_VALIDATOR),
+  new OptionDefinition(ExecConstants.RETURN_RESULT_SET_FOR_DDL_VALIDATOR),
 
 Review comment:
   This one is already specified above.
 



> Implement JDBC Statement.setMaxRows() with System Option
> 
>
> Key: DRILL-7048
> URL: https://issues.apache.org/jira/browse/DRILL-7048
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Client - JDBC, Query Planning & Optimization
>Affects Versions: 1.15.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.17.0
>
>
> With DRILL-6960, the webUI will get an auto-limit on the number of results 
> fetched.
> Since more of the plumbing is already there, it makes sense to provide the 
> same for the JDBC client.
> In addition, it would be nice if the server could have a pre-defined value as 
> well (default 0, i.e. no limit) so that an _admin_ would be able to enforce a 
> max limit on the resultset size.





[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800739#comment-16800739
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

vvysotskyi commented on pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r268656547
 
 

 ##
 File path: exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java
 ##
 @@ -361,6 +361,9 @@ void close() {
 ExecConstants.JDBC_BATCH_QUEUE_THROTTLING_THRESHOLD );
 resultsListener = new ResultsListener(this, batchQueueThrottlingThreshold);
 currentBatchHolder = new RecordBatchLoader(client.getAllocator());
+
+// Set Query Timeout and MaxRows
 
 Review comment:
   Looks like `MaxRows` wasn't set here.
 



> Implement JDBC Statement.setMaxRows() with System Option
> 
>
> Key: DRILL-7048
> URL: https://issues.apache.org/jira/browse/DRILL-7048
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Client - JDBC, Query Planning & Optimization
>Affects Versions: 1.15.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.17.0
>
>
> With DRILL-6960, the webUI will get an auto-limit on the number of results 
> fetched.
> Since more of the plumbing is already there, it makes sense to provide the 
> same for the JDBC client.
> In addition, it would be nice if the server could have a pre-defined value as 
> well (default 0, i.e. no limit) so that an _admin_ would be able to enforce a 
> max limit on the resultset size.





[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800740#comment-16800740
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

vvysotskyi commented on pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r268656219
 
 

 ##
 File path: exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillStatement.java
 ##
 @@ -53,6 +53,28 @@ void setQueryTimeout( int seconds )
  JdbcApiSqlException,
  SQLException;
 
+  /**
+   * @throws  SQLException
+   *Any SQL exception
+   */
+  @Override
+  int getMaxRows() throws SQLException;
 
 Review comment:
   The same question as above.
 



> Implement JDBC Statement.setMaxRows() with System Option
> 
>
> Key: DRILL-7048
> URL: https://issues.apache.org/jira/browse/DRILL-7048
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Client - JDBC, Query Planning & Optimization
>Affects Versions: 1.15.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.17.0
>
>
> With DRILL-6960, the webUI will get an auto-limit on the number of results 
> fetched.
> Since more of the plumbing is already there, it makes sense to provide the 
> same for the JDBC client.
> In addition, it would be nice if the server could have a pre-defined value as 
> well (default 0, i.e. no limit) so that an _admin_ would be able to enforce a 
> max limit on the resultset size.





[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800744#comment-16800744
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

vvysotskyi commented on pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r268660625
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java
 ##
 @@ -72,12 +73,18 @@
 public class PreparedStatementTest extends JdbcTestBase {
 
   private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(PreparedStatementTest.class);
+  private static final Random RANDOMIZER = new Random(20150304);
 
   private static final String SYS_VERSION_SQL = "select * from sys.version";
   private static final String SYS_RANDOM_SQL =
   "SELECT cast(random() as varchar) as myStr FROM (VALUES(1)) " +
   "union SELECT cast(random() as varchar) as myStr FROM (VALUES(1)) " +
   "union SELECT cast(random() as varchar) as myStr FROM (VALUES(1)) ";
+  private static final String SYS_OPTIONS_SQL = "SELECT * FROM sys.options";
+  private static final String SYS_OPTIONS_SQL_LIMIT_10 = "SELECT * FROM sys.options LIMIT 12";
+  private static final String ALTER_SYS_OPTIONS_MAX_ROWS_LIMIT_X = "ALTER SYSTEM SET `"+ExecConstants.QUERY_MAX_ROWS+"`=";
 
 Review comment:
   ```suggestion
 private static final String ALTER_SYS_OPTIONS_MAX_ROWS_LIMIT_X = "ALTER SYSTEM SET `" + ExecConstants.QUERY_MAX_ROWS + "`=";
   ```
 



[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800738#comment-16800738
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

vvysotskyi commented on pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r268655998
 
 

 ##
 File path: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillPreparedStatement.java
 ##
 @@ -32,4 +33,25 @@
  */
 public interface DrillPreparedStatement extends PreparedStatement {
 
+  /**
+   * @throws  SQLException
+   *Any SQL exception
+   */
+  @Override
+  int getMaxRows() throws SQLException;
 
 Review comment:
   What was the reason for declaring this method and the method below? They are both already available in the `Statement` interface.
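
For context, re-declaring an inherited method in a sub-interface is behaviorally inert in Java; it only serves as a place to attach narrower Javadoc or annotations. A minimal sketch of the point (names are illustrative stand-ins, not the actual Drill or JDBC types):

```java
import java.sql.SQLException;

// Stand-ins for java.sql.Statement and DrillPreparedStatement.
interface BaseStatement {
    int getMaxRows() throws SQLException;
}

interface DerivedStatement extends BaseStatement {
    // Redundant re-declaration: same abstract method; only Javadoc
    // attached here could differ from the inherited one.
    @Override
    int getMaxRows() throws SQLException;
}

public class Redeclare {
    // A single implementation satisfies both declarations.
    static class Impl implements DerivedStatement {
        @Override
        public int getMaxRows() {
            return 0;
        }
    }

    public static void main(String[] args) throws SQLException {
        BaseStatement base = new Impl();
        DerivedStatement derived = new Impl();
        // Both calls resolve to the one implementation.
        System.out.println(base.getMaxRows() == derived.getMaxRows()); // true
    }
}
```

Unless the re-declaration carries driver-specific documentation, the inherited declaration already suffices.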
 



[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800742#comment-16800742
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

vvysotskyi commented on pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r268658531
 
 

 ##
 File path: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillStatementImpl.java
 ##
 @@ -270,4 +271,10 @@ public void setResultSet(AvaticaResultSet resultSet) {
   public void setUpdateCount(int value) {
 updateCount = value;
   }
+
+  @Override
+  public void setLargeMaxRows(long maxRowCount) throws SQLException {
+    execute("ALTER SESSION SET `" + ExecConstants.QUERY_MAX_ROWS + "`="+maxRowCount);
 
 Review comment:
   And please fix spaces here
 



[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800736#comment-16800736
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

vvysotskyi commented on pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r268647133
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/ops/QueryContext.java
 ##
 @@ -283,6 +298,21 @@ public RemoteFunctionRegistry getRemoteFunctionRegistry() 
{
 return drillbitContext.getRemoteFunctionRegistry();
   }
 
+  /**
+   * Returns the maximum size of auto-limited resultset
+   * @return Maximum size of auto-limited resultSet
+   */
+  public int getAutoLimitRowCount() {
 
 Review comment:
   Please remove this method and the method below, and replace their usages with `getOptions().getOption...`
 



[jira] [Commented] (DRILL-7048) Implement JDBC Statement.setMaxRows() with System Option

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800741#comment-16800741
 ] 

ASF GitHub Bot commented on DRILL-7048:
---

vvysotskyi commented on pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r268659959
 
 

 ##
 File path: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillStatementImpl.java
 ##
 @@ -270,4 +271,10 @@ public void setResultSet(AvaticaResultSet resultSet) {
   public void setUpdateCount(int value) {
 updateCount = value;
   }
+
+  @Override
+  public void setLargeMaxRows(long maxRowCount) throws SQLException {
+    execute("ALTER SESSION SET `" + ExecConstants.QUERY_MAX_ROWS + "`="+maxRowCount);
+    this.maxRowCount = maxRowCount;
 
 Review comment:
   It looks like `maxRowCount` is taken from Avatica. Does Avatica already provide functionality similar to what we want to implement?
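
For reference, the contract being discussed is the standard JDBC one: `Statement.setMaxRows(n)` makes the driver silently drop rows beyond `n`, with `0` meaning no limit, and the review suggests Avatica already tracks that limit client-side in `maxRowCount`. A toy model of the contract (illustrative only, not Drill or Avatica code):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class MaxRowsContract {
    private int maxRows = 0; // 0 = no limit, mirroring java.sql.Statement

    void setMaxRows(int max) { maxRows = max; }

    // A driver honoring setMaxRows drops rows beyond the limit silently.
    List<Integer> execute(List<Integer> rows) {
        return maxRows == 0
            ? rows
            : rows.stream().limit(maxRows).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        MaxRowsContract stmt = new MaxRowsContract();
        List<Integer> all = IntStream.range(0, 100).boxed().collect(Collectors.toList());
        stmt.setMaxRows(10);
        System.out.println(stmt.execute(all).size()); // prints 10
    }
}
```

The question for the PR is whether pushing the limit to the server via `ALTER SESSION` adds anything beyond this client-side truncation.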
 



[jira] [Commented] (DRILL-7011) Allow hybrid model in the Row set-based scan framework

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800645#comment-16800645
 ] 

ASF GitHub Bot commented on DRILL-7011:
---

arina-ielchiieva commented on pull request #1711: DRILL-7011: Support schema in 
scan framework
URL: https://github.com/apache/drill/pull/1711#discussion_r268603780
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/easy/text/compliant/TestCsvWithSchema.java
 ##
 @@ -82,6 +167,468 @@ public void testSchema() throws Exception {
   .addRow(10, new LocalDate(2019, 3, 20), "it works!", 1234.5D, 20L, "")
   .build();
   RowSetUtilities.verify(expected, actual);
+} finally {
+  resetV3();
+  resetSchema();
+}
+  }
+
+
+  /**
+   * Use a schema with explicit projection to get a consistent view
+   * of the table schema, even if columns are missing, rows are ragged,
+   * and column order changes.
+   * 
+   * Force the scans to occur in distinct fragments so the order of the
+   * file batches is random.
+   */
+  @Test
+  public void testMultiFileSchema() throws Exception {
+RowSet expected1 = null;
+RowSet expected2 = null;
+try {
+  enableV3(true);
+  enableSchema(true);
+  enableMultiScan();
+  String tablePath = buildTwoFileTable("multiFileSchema", raggedMulti1Contents, reordered2Contents);
+  run(SCHEMA_SQL, tablePath);
+
+  // Wildcard expands to union of schema + table. In this case
+  // all table columns appear in the schema (though not all schema
+  // columns appear in the table.)
+
+  String sql = "SELECT id, `name`, `date`, gender, comment FROM " + tablePath;
+  TupleMetadata expectedSchema = new SchemaBuilder()
+  .add("id", MinorType.INT)
+  .add("name", MinorType.VARCHAR)
+  .addNullable("date", MinorType.DATE)
+  .add("gender", MinorType.VARCHAR)
+  .add("comment", MinorType.VARCHAR)
+  .buildSchema();
+  expected1 = new RowSetBuilder(client.allocator(), expectedSchema)
+  .addRow(1, "arina", new LocalDate(2019, 1, 18), "female", "ABC")
+  .addRow(2, "javan", new LocalDate(2019, 1, 19), "male", "ABC")
+  .addRow(4, "albert", new LocalDate(2019, 5, 4), "", "ABC")
+  .build();
+  expected2 = new RowSetBuilder(client.allocator(), expectedSchema)
+  .addRow(3, "bob", new LocalDate(2001, 1, 16), "NA", "ABC")
+  .build();
+
+  // Loop 10 times so that, as the two reader fragments read the two
+  // files, we end up with (acceptable) races that read the files in
+  // random order.
+
+  for (int i = 0; i < 10; i++) {
+boolean sawSchema = false;
+boolean sawFile1 = false;
+boolean sawFile2 = false;
+Iterator iter = client.queryBuilder().sql(sql).rowSetIterator();
+while (iter.hasNext()) {
+  RowSet result = iter.next();
+  if (result.rowCount() == 3) {
+sawFile1 = true;
+new RowSetComparison(expected1).verifyAndClear(result);
+  } else if (result.rowCount() == 1) {
+sawFile2 = true;
+new RowSetComparison(expected2).verifyAndClear(result);
+  } else {
+assertEquals(0, result.rowCount());
+sawSchema = true;
+  }
+}
+assertTrue(sawSchema);
+assertTrue(sawFile1);
+assertTrue(sawFile2);
+  }
+} finally {
+  expected1.clear();
+  expected2.clear();
+  client.resetSession(ExecConstants.ENABLE_V3_TEXT_READER_KEY);
+  client.resetSession(ExecConstants.STORE_TABLE_USE_SCHEMA_FILE);
+  client.resetSession(ExecConstants.MIN_READER_WIDTH_KEY);
+}
+  }
+
+  /**
+   * Test the schema we get in V2 when the table read order is random.
+   * Worst-case: the two files have different column counts and
+   * column orders.
+   * 
+   * Though the results are random, we iterate 10 times which, in most runs,
+   * shows the random variation in schemas:
+   * 
+   * Sometimes the first batch has three columns, sometimes four.
+   * Sometimes the column `id` is in position 0, sometimes in position 1
+   * (correlated with the above).
+   * Due to the fact that sometimes the first file (with four columns)
+   * is returned first, sometimes the second file (with three columns) is
+   * returned first.
+   * 
+   */
+  @Test
+  public void testSchemaRaceV2() throws Exception {
+try {
+  enableV3(false);
+  enableSchema(false);
+  enableMultiScan();
+  String tablePath = buildTwoFileTable("schemaRaceV2", multi1Contents, reordered2Contents);
+  boolean sawFile1First = false;
+  boolean sawFile2First = false;
+  boolean sawFullSchema = false;
+  boolean sawPartialSchema = false;
+  boolean sawIdAsCol0 = false;
+  boolean sawIdAsCol1 = false;
+  String sql = "SELECT * FRO

[jira] [Commented] (DRILL-7011) Allow hybrid model in the Row set-based scan framework

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800618#comment-16800618
 ] 

ASF GitHub Bot commented on DRILL-7011:
---

arina-ielchiieva commented on issue #1711: DRILL-7011: Support schema in scan 
framework
URL: https://github.com/apache/drill/pull/1711#issuecomment-476166889
 
 
   @paul-rogers 
   Actually, when I was presenting the schema provisioning design, there was a proposal to add a schema property `drill.is_full_schema`. By default it is `false`, so we assume the schema is partial.
   If a user wants to indicate that the schema is strict and that all columns except those indicated in the schema should be ignored, they need to create the schema as follows:
`create schema (col int) for table dfs.tmp.t. properties ('drill.is_full_schema' = 'true')`
   
   Since most of the `default` property problems are related to star queries, we can state the following:
   1. For queries with a defined list of columns (aka projection queries: `select id, name from t`), we apply the schema consistently.
   2. For star queries, when the schema property `drill.is_full_schema` is set to `false`, we might get inconsistent results with default values, but that is acceptable since we discover the schema on read.
   3. For star queries, when the schema property `drill.is_full_schema` is set to `true`, we project only those columns indicated in the schema.
   What do you think?
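
The three cases above can be sketched as a single projection-resolution rule. This is an illustrative toy, with all names hypothetical rather than taken from the Drill code:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class SchemaProjection {
    // projection == null stands for a star query (SELECT *).
    static Set<String> resolve(List<String> projection,
                               Set<String> schemaCols,
                               Set<String> tableCols,
                               boolean isFullSchema) {
        if (projection != null) {
            // Case 1: explicit column list -> apply the schema consistently.
            return new LinkedHashSet<>(projection);
        }
        if (isFullSchema) {
            // Case 3: strict schema -> project only schema columns.
            return schemaCols;
        }
        // Case 2: partial schema -> discover on read (schema plus table columns).
        Set<String> union = new LinkedHashSet<>(schemaCols);
        union.addAll(tableCols);
        return union;
    }

    public static void main(String[] args) {
        Set<String> schema = new LinkedHashSet<>(Arrays.asList("id", "name"));
        Set<String> table = new LinkedHashSet<>(Arrays.asList("id", "name", "comment"));
        System.out.println(resolve(null, schema, table, true));  // [id, name]
        System.out.println(resolve(null, schema, table, false)); // [id, name, comment]
    }
}
```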
 


> Allow hybrid model in the Row set-based scan framework
> --
>
> Key: DRILL-7011
> URL: https://issues.apache.org/jira/browse/DRILL-7011
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.15.0
>Reporter: Arina Ielchiieva
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.16.0
>
>
> As part of schema provisioning project we want to allow hybrid model for Row 
> set-based scan framework, namely to allow to pass custom schema metadata 
> which can be partial.
> Currently schema provisioning has SchemaContainer class that contains the 
> following information (can be obtained from metastore, schema file, table 
> function):
> 1. schema represented by org.apache.drill.exec.record.metadata.TupleMetadata
> 2. properties represented by Map, can contain information if 
> schema is strict or partial (default is partial) etc.







[jira] [Commented] (DRILL-7049) REST API returns the toString of byte arrays (VARBINARY types)

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800538#comment-16800538
 ] 

ASF GitHub Bot commented on DRILL-7049:
---

vdiravka commented on pull request #1672: DRILL-7049 return VARBINARY as a 
string with escaped non printable bytes
URL: https://github.com/apache/drill/pull/1672#discussion_r268565353
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/util/ValueVectorElementFormatter.java
 ##
 @@ -52,28 +52,47 @@ public ValueVectorElementFormatter(OptionManager options) {
    * @return the formatted value, null if failed
    */
   public String format(Object value, TypeProtos.MinorType minorType) {
+    boolean handled = false;
+    String str = null;
     switch (minorType) {
       case TIMESTAMP:
         if (value instanceof LocalDateTime) {
-          return format((LocalDateTime) value,
+          handled = true;
+          str = format((LocalDateTime) value,
               options.getString(ExecConstants.WEB_DISPLAY_FORMAT_TIMESTAMP),
               (v, p) -> v.format(getTimestampFormatter(p)));
         }
+        break;
       case DATE:
         if (value instanceof LocalDate) {
-          return format((LocalDate) value,
+          handled = true;
+          str = format((LocalDate) value,
               options.getString(ExecConstants.WEB_DISPLAY_FORMAT_DATE),
               (v, p) -> v.format(getDateFormatter(p)));
         }
+        break;
       case TIME:
         if (value instanceof LocalTime) {
-          return format((LocalTime) value,
+          handled = true;
+          str = format((LocalTime) value,
               options.getString(ExecConstants.WEB_DISPLAY_FORMAT_TIME),
               (v, p) -> v.format(getTimeFormatter(p)));
         }
-      default:
-        return value.toString();
+        break;
+      case VARBINARY:
+        if (value instanceof byte[]) {
+          handled = true;
+          byte[] bytes = (byte[]) value;
+          str = org.apache.drill.common.util.DrillStringUtils.toBinaryString(bytes);
+        }
+        break;
+    }
+
+    if (!handled) {
 
 Review comment:
   It looks like the current code executes the same way as the code in your PR, but the PR's logic is more complex: an additional `handled` flag, `break`s in the switch statement, and so on.
   I think we can keep the current code from Drill master and just add your `case VARBINARY`.
   That should be enough.
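
The suggested shape, keeping master's fall-through-to-`toString()` structure and only adding the `VARBINARY` branch, might look roughly like this. The enum and `formatBinary` are simplified stand-ins (the real code uses `TypeProtos.MinorType` and `DrillStringUtils.toBinaryString`):

```java
public class BinaryFormat {
    enum MinorType { TIMESTAMP, DATE, TIME, VARBINARY, INT }

    static String format(Object value, MinorType minorType) {
        switch (minorType) {
            case VARBINARY:
                if (value instanceof byte[]) {
                    // Escape non-printable bytes instead of Object.toString(),
                    // which would render the array's identity hash.
                    return formatBinary((byte[]) value);
                }
                // falls through to the generic default when not a byte[]
            default:
                return value.toString();
        }
    }

    // Simplified stand-in: escape non-printable bytes as \xNN.
    static String formatBinary(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            int v = b & 0xFF;
            if (v >= 0x20 && v < 0x7F) {
                sb.append((char) v);
            } else {
                sb.append(String.format("\\x%02X", v));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = {'a', 'b', 0x00, (byte) 0xFF};
        System.out.println(format(data, MinorType.VARBINARY)); // ab\x00\xFF
        System.out.println(format(42, MinorType.INT));         // 42
    }
}
```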
 



> REST API returns the toString of byte arrays (VARBINARY types)
> --
>
> Key: DRILL-7049
> URL: https://issues.apache.org/jira/browse/DRILL-7049
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server, Web Server
>Affects Versions: 1.15.0
>Reporter: jean-claude
>Priority: Minor
> Fix For: 1.16.0
>
>
> Doing a query using the REST API will return VARBINARY columns as a Java byte 
> array hashcode instead of the actual data of the VARBINARY.





[jira] [Commented] (DRILL-7051) Upgrade jetty

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800519#comment-16800519
 ] 

ASF GitHub Bot commented on DRILL-7051:
---

vdiravka commented on pull request #1681: DRILL-7051: Upgrade jetty
URL: https://github.com/apache/drill/pull/1681#discussion_r268551237
 
 

 ##
 File path: pom.xml
 ##
 @@ -85,6 +85,7 @@
 0.9.10
 1.8.2
 4.0.2
+9.4.15.v20190215
 
 Review comment:
   Jetty 9.3 was finally chosen for Drill.
   The Jetty dependencies are used in the `java-exec` pom.xml, but I've left version control in the `dependencyManagement` block of the root POM, so that Maven does not pick an invalid Jetty version when some libraries bring in a different one.
   For instance, we can't exclude Jetty from the `hadoop-common` and `hbase` dependencies, and they use different Jetty minor versions:
   [9.3.24.v20180605](https://github.com/apache/hadoop/blob/trunk/hadoop-project/pom.xml#L38) for Hadoop,
   [9.3.19.v20170502](https://github.com/apache/hbase/blob/rel/2.1.0/pom.xml#L1352) for HBase 2.1, and [9.3.25.v20180904](https://github.com/apache/hbase/blob/master/pom.xml#L1529) for the master HBase version.
   I didn't find any API incompatibilities between these Jetty minor versions (only 9.4 has them).
   In the future we could consider shading the Jetty version in Drill ([DRILL-7135](https://issues.apache.org/jira/browse/DRILL-7135)), but I'm not sure that is necessary for now.
   
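
As an illustration of the approach described, not the actual Drill `pom.xml`, pinning the version once in `dependencyManagement` looks like this (the artifact and version shown are examples taken from the versions mentioned above):

```xml
<!-- Sketch only: declaring the Jetty version in the root POM's
     dependencyManagement so transitive dependencies such as hadoop-common
     and hbase cannot pull in a different minor version. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>org.eclipse.jetty</groupId>
      <artifactId>jetty-server</artifactId>
      <version>9.3.25.v20180904</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```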
 



> Upgrade jetty 
> --
>
> Key: DRILL-7051
> URL: https://issues.apache.org/jira/browse/DRILL-7051
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.15.0
>Reporter: Veera Naranammalpuram
>Assignee: Vitalii Diravka
>Priority: Major
> Fix For: 1.16.0
>
>
> Is Drill using a really old version of the Jetty web server? The jars 
> suggest it's using Jetty 9.1, built sometime in 2014.
> {noformat}
> -rw-r--r-- 1 veeranaranammalpuram staff 15988 Nov 20 2017 
> jetty-continuation-9.1.1.v20140108.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 103288 Nov 20 2017 
> jetty-http-9.1.5.v20140505.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 101519 Nov 20 2017 
> jetty-io-9.1.5.v20140505.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 95906 Nov 20 2017 
> jetty-security-9.1.5.v20140505.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 401593 Nov 20 2017 
> jetty-server-9.1.5.v20140505.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 110992 Nov 20 2017 
> jetty-servlet-9.1.5.v20140505.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 119215 Nov 20 2017 
> jetty-servlets-9.1.5.v20140505.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 341683 Nov 20 2017 
> jetty-util-9.1.5.v20140505.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 38707 Dec 21 15:42 
> jetty-util-ajax-9.3.19.v20170502.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 111466 Nov 20 2017 
> jetty-webapp-9.1.1.v20140108.jar
> -rw-r--r-- 1 veeranaranammalpuram staff 41763 Nov 20 2017 
> jetty-xml-9.1.1.v20140108.jar {noformat}
> This version is shown as deprecated: 
> [https://www.eclipse.org/jetty/documentation/current/what-jetty-version.html#d0e203]
> Opening this to upgrade jetty to the latest stable supported version. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7032) Ignore corrupt rows in a PCAP file

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800459#comment-16800459
 ] 

ASF GitHub Bot commented on DRILL-7032:
---

arina-ielchiieva commented on pull request #1637: DRILL-7032: Ignore corrupt 
rows in a PCAP file
URL: https://github.com/apache/drill/pull/1637#discussion_r268515281
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/pcap/decoder/Packet.java
 ##
 @@ -324,7 +333,12 @@ public int getDst_port() {
 byte[] data = null;
 if (packetLength >= payloadDataStart) {
   data = new byte[packetLength - payloadDataStart];
-  System.arraycopy(raw, ipOffset + payloadDataStart, data, 0, data.length);
+  try {
+System.arraycopy(raw, ipOffset + payloadDataStart, data, 0, 
data.length);
+  } catch (Exception e) {
+isCorrupt = true;
+logger.info("Error while parsing PCAP data: ", e.getMessage());
 
 Review comment:
   Logging at info level will produce an entry for each corrupt row, so the log 
file can grow enormously. I think this should be debug; you can also include a 
trace entry with the full exception:
   
   ```
   String message = "Error while parsing PCAP data: {}";
   logger.debug(message, e.getMessage());
   logger.trace(message, e);
   ```
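   An alternative to catching the exception is checking the bounds before the 
copy. A minimal self-contained sketch (the offsets and field names here are 
hypothetical, loosely mirroring the variables in `Packet.java`, not the actual 
Drill implementation):

   ```java
public class PcapCopyGuard {
  public static void main(String[] args) {
    byte[] raw = new byte[64];   // captured bytes (assumed size)
    int ipOffset = 14;           // assumed Ethernet header offset
    int payloadDataStart = 40;   // assumed IP + TCP header size
    int packetLength = 200;      // corrupt: claims more data than was captured

    boolean isCorrupt = false;
    byte[] data = null;
    if (packetLength >= payloadDataStart) {
      int len = packetLength - payloadDataStart;
      // The copy would read past raw.length, so detect that up front
      // instead of letting System.arraycopy throw.
      if (ipOffset + payloadDataStart + len > raw.length) {
        isCorrupt = true;
      } else {
        data = new byte[len];
        System.arraycopy(raw, ipOffset + payloadDataStart, data, 0, len);
      }
    }
    System.out.println(isCorrupt ? "corrupt" : "ok"); // prints "corrupt" here
  }
}
   ```

   With a guard like this, no exception is raised at all, and the debug/trace 
logging above is only needed for genuinely unexpected failures.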
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Ignore corrupt rows in a PCAP file
> --
>
> Key: DRILL-7032
> URL: https://issues.apache.org/jira/browse/DRILL-7032
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.15.0
> Environment: OS: Ubuntu 18.04
> Drill version: 1.15.0
> Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
>Reporter: Giovanni Conte
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.16.0
>
>
> It would be useful for Drill to have some ability to ignore corrupt rows in a 
> PCAP file instead of throwing a Java exception.
> There are many PCAP files with corrupted rows, and this 
> functionality would avoid having to pre-fix the packet captures (example 
> attached).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Updated] (DRILL-6970) Issue with LogRegex format plugin where drillbuf was overflowing

2019-03-25 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-6970:

Labels: ready-to-commit  (was: )

> Issue with LogRegex format plugin where drillbuf was overflowing 
> -
>
> Key: DRILL-6970
> URL: https://issues.apache.org/jira/browse/DRILL-6970
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: jean-claude
>Assignee: jean-claude
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> The log format plugin does not re-allocate the drillbuf when it fills up. You 
> can query small log files, but larger ones will fail with this error:
> 0: jdbc:drill:zk=local> select * from dfs.root.`/prog/test.log`;
> Error: INTERNAL_ERROR ERROR: index: 32724, length: 108 (expected: range(0, 
> 32768))
> Fragment 0:0
> Please, refer to logs for more information.
>  
> I'm running drill-embedded. The log storage plugin is configured like so:
> {code:java}
> "log": {
> "type": "logRegex",
> "regex": "(.+)",
> "extension": "log",
> "maxErrors": 10,
> "schema": [
> {
> "fieldName": "line"
> }
> ]
> },
> {code}
> The log file is very simple:
> {code:java}
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> ...{code}
>  
>  
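
The error `index: 32724, length: 108 (expected: range(0, 32768))` is a bounds 
failure: a 108-byte write would cross the end of a 32 KB buffer. The usual fix 
is to grow the buffer before writing. A minimal sketch of that pattern using a 
plain byte array (DrillBuf is not available here, and the class and field names 
are hypothetical; the numbers are taken from the error message above):

```java
import java.util.Arrays;

public class GrowBeforeWrite {
  static byte[] buf = new byte[32768];
  static int writeIndex = 32724;      // write offset from the error message

  // Double the capacity until the pending write fits, then copy over.
  static void ensureCapacity(int length) {
    int needed = writeIndex + length;
    int cap = buf.length;
    while (cap < needed) {
      cap *= 2;
    }
    if (cap != buf.length) {
      buf = Arrays.copyOf(buf, cap);
    }
  }

  public static void main(String[] args) {
    ensureCapacity(108);              // the write that used to overflow
    System.out.println(buf.length);   // prints 65536
  }
}
```

Calling `ensureCapacity` before every field write keeps small files on the fast 
path (no reallocation) while letting larger ones grow the buffer as needed.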



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)