date:20190327

[GitHub] [drill] paul-rogers commented on a change in pull request #1711: DRILL-7011: Support schema in scan framework

2019-03-27 Thread GitBox

paul-rogers commented on a change in pull request #1711: DRILL-7011: Support 
schema in scan framework
URL: https://github.com/apache/drill/pull/1711#discussion_r269437772
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/easy/text/compliant/TestCsvWithSchema.java
 ##
 @@ -82,6 +167,468 @@ public void testSchema() throws Exception {
   .addRow(10, new LocalDate(2019, 3, 20), "it works!", 1234.5D, 20L, 
"")
   .build();
   RowSetUtilities.verify(expected, actual);
+} finally {
+  resetV3();
+  resetSchema();
+}
+  }
+
+
+  /**
+   * Use a schema with explicit projection to get a consistent view
+   * of the table schema, even if columns are missing, rows are ragged,
+   * and column order changes.
+   * 
+   * Force the scans to occur in distinct fragments so the order of the
+   * file batches is random.
+   */
+  @Test
+  public void testMultiFileSchema() throws Exception {
+RowSet expected1 = null;
+RowSet expected2 = null;
+try {
+  enableV3(true);
+  enableSchema(true);
+  enableMultiScan();
+  String tablePath = buildTwoFileTable("multiFileSchema", 
raggedMulti1Contents, reordered2Contents);
+  run(SCHEMA_SQL, tablePath);
+
+  // Wildcard expands to union of schema + table. In this case
+  // all table columns appear in the schema (though not all schema
+  // columns appear in the table.)
+
+  String sql = "SELECT id, `name`, `date`, gender, comment FROM " + 
tablePath;
+  TupleMetadata expectedSchema = new SchemaBuilder()
+  .add("id", MinorType.INT)
+  .add("name", MinorType.VARCHAR)
+  .addNullable("date", MinorType.DATE)
+  .add("gender", MinorType.VARCHAR)
+  .add("comment", MinorType.VARCHAR)
+  .buildSchema();
+  expected1 = new RowSetBuilder(client.allocator(), expectedSchema)
+  .addRow(1, "arina", new LocalDate(2019, 1, 18), "female", "ABC")
+  .addRow(2, "javan", new LocalDate(2019, 1, 19), "male", "ABC")
+  .addRow(4, "albert", new LocalDate(2019, 5, 4), "", "ABC")
+  .build();
+  expected2 = new RowSetBuilder(client.allocator(), expectedSchema)
+  .addRow(3, "bob", new LocalDate(2001, 1, 16), "NA", "ABC")
+  .build();
+
+  // Loop 10 times so that, as the two reader fragments read the two
+  // files, we end up with (acceptable) races that read the files in
+  // random order.
+
+  for (int i = 0; i < 10; i++) {
+boolean sawSchema = false;
+boolean sawFile1 = false;
+boolean sawFile2 = false;
+Iterator iter = 
client.queryBuilder().sql(sql).rowSetIterator();
+while (iter.hasNext()) {
+  RowSet result = iter.next();
+  if (result.rowCount() == 3) {
+sawFile1 = true;
+new RowSetComparison(expected1).verifyAndClear(result);
+  } else if (result.rowCount() == 1) {
+sawFile2 = true;
+new RowSetComparison(expected2).verifyAndClear(result);
+  } else {
+assertEquals(0, result.rowCount());
+sawSchema = true;
+  }
+}
+assertTrue(sawSchema);
+assertTrue(sawFile1);
+assertTrue(sawFile2);
+  }
+} finally {
+  expected1.clear();
+  expected2.clear();
+  client.resetSession(ExecConstants.ENABLE_V3_TEXT_READER_KEY);
+  client.resetSession(ExecConstants.STORE_TABLE_USE_SCHEMA_FILE);
+  client.resetSession(ExecConstants.MIN_READER_WIDTH_KEY);
+}
+  }
+
+  /**
+   * Test the schema we get in V2 when the table read order is random.
+   * Worst-case: the two files have different column counts and
+   * column orders.
+   * 
+   * Though the results are random, we iterate 10 times which, in most runs,
+   * shows the random variation in schemas:
+   * 
+   * Sometimes the first batch has three columns, sometimes four.
+   * Sometimes the column `id` is in position 0, sometimes in position 1
+   * (correlated with the above).
+   * Due to the fact that sometimes the first file (with four columns)
+   * is returned first, sometimes the second file (with three columns) is
+   * returned first.
+   * 
+   */
+  @Test
+  public void testSchemaRaceV2() throws Exception {
+try {
+  enableV3(false);
+  enableSchema(false);
+  enableMultiScan();
+  String tablePath = buildTwoFileTable("schemaRaceV2", multi1Contents, 
reordered2Contents);
+  boolean sawFile1First = false;
+  boolean sawFile2First = false;
+  boolean sawFullSchema = false;
+  boolean sawPartialSchema = false;
+  boolean sawIdAsCol0 = false;
+  boolean sawIdAsCol1 = false;
+  String sql = "SELECT * FROM " + tablePath;
+  for (int i = 0; i < 10; i++) {
+Iterator iter = 
client.queryBuilder().sql(sql).rowSetIterator();
+int batchCount = 0;
+while(iter.hasNext()) {
+  batchCount++;
+  RowSet result = iter.n

[GitHub] [drill] paul-rogers commented on a change in pull request #1711: DRILL-7011: Support schema in scan framework

2019-03-27 Thread GitBox

paul-rogers commented on a change in pull request #1711: DRILL-7011: Support 
schema in scan framework
URL: https://github.com/apache/drill/pull/1711#discussion_r269438194
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/easy/text/compliant/TestCsvWithSchema.java
 ##
 @@ -82,6 +167,468 @@ public void testSchema() throws Exception {
   .addRow(10, new LocalDate(2019, 3, 20), "it works!", 1234.5D, 20L, 
"")
   .build();
   RowSetUtilities.verify(expected, actual);
+} finally {
+  resetV3();
+  resetSchema();
+}
+  }
+
+
+  /**
+   * Use a schema with explicit projection to get a consistent view
+   * of the table schema, even if columns are missing, rows are ragged,
+   * and column order changes.
+   * 
+   * Force the scans to occur in distinct fragments so the order of the
+   * file batches is random.
+   */
+  @Test
+  public void testMultiFileSchema() throws Exception {
+RowSet expected1 = null;
+RowSet expected2 = null;
+try {
+  enableV3(true);
+  enableSchema(true);
+  enableMultiScan();
+  String tablePath = buildTwoFileTable("multiFileSchema", 
raggedMulti1Contents, reordered2Contents);
+  run(SCHEMA_SQL, tablePath);
+
+  // Wildcard expands to union of schema + table. In this case
+  // all table columns appear in the schema (though not all schema
+  // columns appear in the table.)
+
+  String sql = "SELECT id, `name`, `date`, gender, comment FROM " + 
tablePath;
+  TupleMetadata expectedSchema = new SchemaBuilder()
+  .add("id", MinorType.INT)
+  .add("name", MinorType.VARCHAR)
+  .addNullable("date", MinorType.DATE)
+  .add("gender", MinorType.VARCHAR)
+  .add("comment", MinorType.VARCHAR)
+  .buildSchema();
+  expected1 = new RowSetBuilder(client.allocator(), expectedSchema)
+  .addRow(1, "arina", new LocalDate(2019, 1, 18), "female", "ABC")
+  .addRow(2, "javan", new LocalDate(2019, 1, 19), "male", "ABC")
+  .addRow(4, "albert", new LocalDate(2019, 5, 4), "", "ABC")
+  .build();
+  expected2 = new RowSetBuilder(client.allocator(), expectedSchema)
+  .addRow(3, "bob", new LocalDate(2001, 1, 16), "NA", "ABC")
+  .build();
+
+  // Loop 10 times so that, as the two reader fragments read the two
+  // files, we end up with (acceptable) races that read the files in
+  // random order.
+
+  for (int i = 0; i < 10; i++) {
+boolean sawSchema = false;
+boolean sawFile1 = false;
+boolean sawFile2 = false;
+Iterator iter = 
client.queryBuilder().sql(sql).rowSetIterator();
+while (iter.hasNext()) {
+  RowSet result = iter.next();
+  if (result.rowCount() == 3) {
+sawFile1 = true;
+new RowSetComparison(expected1).verifyAndClear(result);
+  } else if (result.rowCount() == 1) {
+sawFile2 = true;
+new RowSetComparison(expected2).verifyAndClear(result);
+  } else {
+assertEquals(0, result.rowCount());
+sawSchema = true;
+  }
+}
+assertTrue(sawSchema);
+assertTrue(sawFile1);
+assertTrue(sawFile2);
+  }
+} finally {
+  expected1.clear();
+  expected2.clear();
+  client.resetSession(ExecConstants.ENABLE_V3_TEXT_READER_KEY);
+  client.resetSession(ExecConstants.STORE_TABLE_USE_SCHEMA_FILE);
+  client.resetSession(ExecConstants.MIN_READER_WIDTH_KEY);
+}
+  }
+
+  /**
+   * Test the schema we get in V2 when the table read order is random.
+   * Worst-case: the two files have different column counts and
+   * column orders.
+   * 
+   * Though the results are random, we iterate 10 times which, in most runs,
+   * shows the random variation in schemas:
+   * 
+   * Sometimes the first batch has three columns, sometimes four.
+   * Sometimes the column `id` is in position 0, sometimes in position 1
+   * (correlated with the above).
+   * Due to the fact that sometimes the first file (with four columns)
+   * is returned first, sometimes the second file (with three columns) is
+   * returned first.
+   * 
+   */
+  @Test
+  public void testSchemaRaceV2() throws Exception {
+try {
+  enableV3(false);
+  enableSchema(false);
+  enableMultiScan();
+  String tablePath = buildTwoFileTable("schemaRaceV2", multi1Contents, 
reordered2Contents);
+  boolean sawFile1First = false;
+  boolean sawFile2First = false;
+  boolean sawFullSchema = false;
+  boolean sawPartialSchema = false;
+  boolean sawIdAsCol0 = false;
+  boolean sawIdAsCol1 = false;
+  String sql = "SELECT * FROM " + tablePath;
+  for (int i = 0; i < 10; i++) {
+Iterator iter = 
client.queryBuilder().sql(sql).rowSetIterator();
+int batchCount = 0;
+while(iter.hasNext()) {
+  batchCount++;
+  RowSet result = iter.n

[GitHub] [drill] paul-rogers commented on issue #1711: DRILL-7011: Support schema in scan framework

2019-03-27 Thread GitBox

paul-rogers commented on issue #1711: DRILL-7011: Support schema in scan 
framework
URL: https://github.com/apache/drill/pull/1711#issuecomment-477031299
 
 
   Added wildcard projection handling for both "lenient" and "strict" schemas. 
Added tests for the scan framework. Added a basic tests for "lenient" wildcard 
projection with CSV files. Have not yet added CSV tests for strict schema 
because I need to know how to pass a table-level property with the new CREATE 
SCHEMA command.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vdiravka commented on a change in pull request #1681: DRILL-7051: Upgrade to Jetty 9.3

2019-03-27 Thread GitBox

vdiravka commented on a change in pull request #1681: DRILL-7051: Upgrade to 
Jetty 9.3 
URL: https://github.com/apache/drill/pull/1681#discussion_r269085173
 
 

 ##
 File path: pom.xml
 ##
 @@ -2553,7 +2548,13 @@
 4.0.1
 provided
   
-  
+
+  
+javax.ws.rs
+javax.ws.rs-api
 
 Review comment:
   This is versions control for `javax` transitive dependencies from `jersey` 
dependencies, since they are also widely used in Drill.
   I have updated them to the latest versions along with `jersey` dependencies.
   But after reading some good practices, I rethought it. It is better to 
control only `jersey` dependencies. Since `javax.ws` can have an interfaces, 
which are not implemented in Jersey yet.
   So I removed this `javax` versions control mechanism 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] arina-ielchiieva opened a new pull request #1717: DRILL-6989: Upgrade to SqlLine 1.7

2019-03-27 Thread GitBox

arina-ielchiieva opened a new pull request #1717: DRILL-6989: Upgrade to 
SqlLine 1.7
URL: https://github.com/apache/drill/pull/1717
 
 
   Details in [DRILL-6989](https://issues.apache.org/jira/browse/DRILL-6989).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] arina-ielchiieva commented on a change in pull request #1711: DRILL-7011: Support schema in scan framework

2019-03-27 Thread GitBox

arina-ielchiieva commented on a change in pull request #1711: DRILL-7011: 
Support schema in scan framework
URL: https://github.com/apache/drill/pull/1711#discussion_r269512455
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/scan/project/ScanLevelProjection.java
 ##
 @@ -128,6 +129,25 @@
 
 public class ScanLevelProjection {
 
+  public enum ScanProjectionType {
 
 Review comment:
   Please add java doc for each option if possible.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] arina-ielchiieva commented on a change in pull request #1711: DRILL-7011: Support schema in scan framework

2019-03-27 Thread GitBox

arina-ielchiieva commented on a change in pull request #1711: DRILL-7011: 
Support schema in scan framework
URL: https://github.com/apache/drill/pull/1711#discussion_r269542539
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/scan/file/MetadataColumn.java
 ##
 @@ -25,6 +25,8 @@
 
 public abstract class MetadataColumn extends ResolvedColumn implements 
ConstantColumnSpec {
 
+
+  public static final int ID = 15;
 
 Review comment:
   Do we still need ID?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] arina-ielchiieva commented on a change in pull request #1711: DRILL-7011: Support schema in scan framework

2019-03-27 Thread GitBox

arina-ielchiieva commented on a change in pull request #1711: DRILL-7011: 
Support schema in scan framework
URL: https://github.com/apache/drill/pull/1711#discussion_r269512903
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/scan/project/WildcardProjection.java
 ##
 @@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.physical.impl.scan.project;
+
+import java.util.List;
+
+import 
org.apache.drill.exec.physical.impl.scan.project.AbstractUnresolvedColumn.UnresolvedWildcardColumn;
+import org.apache.drill.exec.record.MaterializedField;
+import org.apache.drill.exec.record.metadata.TupleMetadata;
+
+/**
+ * Perform a wildcard projection. In this case, the query wants all
+ * columns in the source table, so the table drives the final projection.
+ * Since we include only those columns in the table, there is no need
+ * to create null columns. Example: SELECT *
+ */
+
+public class WildcardProjection extends ReaderLevelProjection {
+
+  public WildcardProjection(ScanLevelProjection scanProj,
+  TupleMetadata tableSchema,
+  ResolvedTuple rootTuple,
+  List resolvers) {
+super(resolvers);
+for (ColumnProjection col : scanProj.columns()) {
+  if (col instanceof UnresolvedWildcardColumn) {
+projectAllColumns(rootTuple, tableSchema);
+  } else {
+resolveSpecial(rootTuple, col, tableSchema);
+  }
+}
+  }
+
+  /**
+   * Project all columns from table schema to the output, in table
+   * schema order. Since we accept any map columns as-is, no need
+   * to do recursive projection.
+   *
+   * @param tableSchema
 
 Review comment:
   Please add column description to avoid warning in the IDE.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] arina-ielchiieva commented on a change in pull request #1711: DRILL-7011: Support schema in scan framework

2019-03-27 Thread GitBox

arina-ielchiieva commented on a change in pull request #1711: DRILL-7011: 
Support schema in scan framework
URL: https://github.com/apache/drill/pull/1711#discussion_r269513816
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/scan/BaseScanOperatorExecTest.java
 ##
 @@ -0,0 +1,180 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.physical.impl.scan;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.physical.impl.scan.framework.ManagedScanFramework;
+import 
org.apache.drill.exec.physical.impl.scan.framework.ManagedScanFramework.ScanFrameworkBuilder;
+import org.apache.drill.exec.physical.impl.scan.ScanTestUtils.ScanFixture;
+import 
org.apache.drill.exec.physical.impl.scan.ScanTestUtils.ScanFixtureBuilder;
+import org.apache.drill.exec.physical.impl.scan.framework.BasicScanFactory;
+import org.apache.drill.exec.physical.impl.scan.framework.ManagedReader;
+import org.apache.drill.exec.physical.impl.scan.framework.SchemaNegotiator;
+import org.apache.drill.exec.physical.rowSet.ResultSetLoader;
+import org.apache.drill.exec.physical.rowSet.RowSetLoader;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.record.metadata.SchemaBuilder;
+import org.apache.drill.exec.record.metadata.TupleMetadata;
+import org.apache.drill.test.SubOperatorTest;
+import org.apache.drill.test.rowSet.RowSet.SingleRowSet;
+import org.apache.drill.test.rowSet.RowSetUtilities;
+
+/**
+ * Test of the scan operator framework. Here the focus is on the
+ * implementation of the scan operator itself. This operator is
+ * based on a number of lower-level abstractions, each of which has
+ * its own unit tests. To make this more concrete: review the scan
+ * operator code paths. Each path should be exercised by one or more
+ * of the tests here. If, however, the code path depends on the
+ * details of another, supporting class, then tests for that class
+ * appear elsewhere.
+ */
+
+public class BaseScanOperatorExecTest extends SubOperatorTest {
+  static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(BaseScanOperatorExecTest.class);
+
+  /**
+   * Base class for the "mock" readers used in this test. The mock readers
+   * follow the normal (enhanced) reader API, but instead of actually reading
+   * from a data source, they just generate data with a known schema.
+   * They also expose internal state such as identifying which methods
+   * were actually called.
+   */
+
+  protected static abstract class BaseMockBatchReader implements 
ManagedReader {
+public boolean openCalled;
 
 Review comment:
   Can fields be private / protected?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] arina-ielchiieva commented on issue #1711: DRILL-7011: Support schema in scan framework

2019-03-27 Thread GitBox

arina-ielchiieva commented on issue #1711: DRILL-7011: Support schema in scan 
framework
URL: https://github.com/apache/drill/pull/1711#issuecomment-477160154
 
 
   @paul-rogers command syntax is the following:
   ```
   CREATE [OR REPLACE] SCHEMA
   [LOAD 'file:///path/to/file']
   [(column_name data_type nullability,...)]
   [FOR TABLE `table_name`]
   [PATH 'file:///schema_file_path/schema_file_name'] 
   [PROPERTIES ('key1'='value1', 'key2'='value2', ...)]
   ```
   `PROPERTIES` should be provided in parenthesis, in a form of key / value 
pairs where value follows after the key and equal sign, each enclosed the 
single quotes. Key / value groups should be separated by commas.
   ```
   create schema
   (col1 int, col2 int)
   for table t
   properties (
   'drill.strict' = 'true',
   'some_other_prop' = 'val')
   ```
   In `.drill.schema` JSON file this would look the following way:
   ```
   {
 "table" : "dfs.tmp.`t`",
 "schema" : {
   "columns" : [
 {
   "name" : "col1",
   "type" : "INT",
   "mode" : "OPTIONAL"
 },
 {
   "name" : "col2",
   "type" : "INT",
   "mode" : "OPTIONAL"
 }
   ],
   "properties" : {
 "drill.strict" : "true",
 "some_other_prop" : "val"
   }
 },
 "version" : 1
   }
   ```
   During deserialization they will be stored in `TupleMetadata` class (use 
`property(key)`, `properties` methods to extract them).
   
   If you want to add column properties, similar syntax will be used, except 
instead of parenthesis you need to use curly braces:
   ```
   create schema
   (col1 int, col2 int properties {'drill.strict' = 'true'})
   for table t
   properties (
   'drill.strict' = 'true',
   'some_other_prop' = 'val')
   ```
   JSON output:
   ```
   {
 "table" : "dfs.tmp.`t`",
 "schema" : {
   "columns" : [
 {
   "name" : "col1",
   "type" : "INT",
   "mode" : "OPTIONAL",
   "properties" : {
 "drill.strict" : "true"
   }
 },
 {
   "name" : "col2",
   "type" : "INT",
   "mode" : "OPTIONAL"
 }
   ],
   "properties" : {
 "drill.is_strict_schema" : "true",
 "some_other_prop" : "val"
   }
 },
 "version" : 1
   }
   ```
   Please let me know if there are any other syntax related questions.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] arina-ielchiieva edited a comment on issue #1711: DRILL-7011: Support schema in scan framework

2019-03-27 Thread GitBox

arina-ielchiieva edited a comment on issue #1711: DRILL-7011: Support schema in 
scan framework
URL: https://github.com/apache/drill/pull/1711#issuecomment-477160154
 
 
   @paul-rogers command syntax is the following:
   ```
   CREATE [OR REPLACE] SCHEMA
   [LOAD 'file:///path/to/file']
   [(column_name data_type nullability,...)]
   [FOR TABLE `table_name`]
   [PATH 'file:///schema_file_path/schema_file_name'] 
   [PROPERTIES ('key1'='value1', 'key2'='value2', ...)]
   ```
   `PROPERTIES` should be provided in parenthesis, in a form of key / value 
pairs where value follows after the key and equal sign, each enclosed the 
single quotes. Key / value groups should be separated by commas.
   ```
   create schema
   (col1 int, col2 int)
   for table t
   properties (
   'drill.strict' = 'true',
   'some_other_prop' = 'val')
   ```
   In `.drill.schema` JSON file this would look the following way:
   ```
   {
 "table" : "dfs.tmp.`t`",
 "schema" : {
   "columns" : [
 {
   "name" : "col1",
   "type" : "INT",
   "mode" : "OPTIONAL"
 },
 {
   "name" : "col2",
   "type" : "INT",
   "mode" : "OPTIONAL"
 }
   ],
   "properties" : {
 "drill.strict" : "true",
 "some_other_prop" : "val"
   }
 },
 "version" : 1
   }
   ```
   During deserialization they will be stored in `TupleMetadata` class (use 
`property(key)`, `properties` methods to extract them).
   
   If you want to add column properties, similar syntax will be used, except 
instead of parenthesis you need to use curly braces:
   ```
   create schema
   (col1 int, col2 int properties {'drill.strict' = 'true'})
   for table t
   properties (
   'drill.strict' = 'true',
   'some_other_prop' = 'val')
   ```
   JSON output:
   ```
   {
 "table" : "dfs.tmp.`t`",
 "schema" : {
   "columns" : [
 {
   "name" : "col1",
   "type" : "INT",
   "mode" : "OPTIONAL",
   "properties" : {
 "drill.strict" : "true"
   }
 },
 {
   "name" : "col2",
   "type" : "INT",
   "mode" : "OPTIONAL"
 }
   ],
   "properties" : {
 "drill.strict" : "true",
 "some_other_prop" : "val"
   }
 },
 "version" : 1
   }
   ```
   Please let me know if there are any other syntax related questions.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] arina-ielchiieva edited a comment on issue #1711: DRILL-7011: Support schema in scan framework

2019-03-27 Thread GitBox

arina-ielchiieva edited a comment on issue #1711: DRILL-7011: Support schema in 
scan framework
URL: https://github.com/apache/drill/pull/1711#issuecomment-477160154
 
 
   @paul-rogers command syntax is the following:
   ```
   CREATE [OR REPLACE] SCHEMA
   [LOAD 'file:///path/to/file']
   [(column_name data_type nullability,...)]
   [FOR TABLE `table_name`]
   [PATH 'file:///schema_file_path/schema_file_name'] 
   [PROPERTIES ('key1'='value1', 'key2'='value2', ...)]
   ```
   `PROPERTIES` should be provided in parenthesis, in a form of key / value 
pairs where value follows after the key and equal sign, each enclosed the 
single quotes. Key / value groups should be separated by commas.
   ```
   create schema
   (col1 int, col2 int)
   for table t
   properties (
   'drill.strict' = 'true',
   'some_other_prop' = 'val')
   ```
   In `.drill.schema` JSON file this would look the following way:
   ```
   {
 "table" : "dfs.tmp.`t`",
 "schema" : {
   "columns" : [
 {
   "name" : "col1",
   "type" : "INT",
   "mode" : "OPTIONAL"
 },
 {
   "name" : "col2",
   "type" : "INT",
   "mode" : "OPTIONAL"
 }
   ],
   "properties" : {
 "drill.strict" : "true",
 "some_other_prop" : "val"
   }
 },
 "version" : 1
   }
   ```
   During deserialization they will be stored in `TupleMetadata` class (use 
`property(key)`, `properties` methods to extract them).
   
   If you want to add column properties, similar syntax will be used, except 
instead of parenthesis you need to use curly braces:
   ```
   create schema
   (col1 int properties {'drill.strict' = 'true'}, col2 int)
   for table t
   properties (
   'drill.strict' = 'true',
   'some_other_prop' = 'val')
   ```
   JSON output:
   ```
   {
 "table" : "dfs.tmp.`t`",
 "schema" : {
   "columns" : [
 {
   "name" : "col1",
   "type" : "INT",
   "mode" : "OPTIONAL",
   "properties" : {
 "drill.strict" : "true"
   }
 },
 {
   "name" : "col2",
   "type" : "INT",
   "mode" : "OPTIONAL"
 }
   ],
   "properties" : {
 "drill.strict" : "true",
 "some_other_prop" : "val"
   }
 },
 "version" : 1
   }
   ```
   Please let me know if there are any other syntax related questions.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] ihuzenko commented on issue #1706: DRILL-7115: Improve Hive schema show tables performance

2019-03-27 Thread GitBox

ihuzenko commented on issue #1706: DRILL-7115: Improve Hive schema show tables 
performance
URL: https://github.com/apache/drill/pull/1706#issuecomment-477196081
 
 
   Hi @vdiravka , according to the [PR 
discussion](https://github.com/apache/drill/pull/461#discussion_r58753354) , 
show tables authorization was left in inconsistent state (all tables shown for 
SQL Standard mode, only accessible shown for Storage Based mode). I have spent 
some time investigating the issues and even found how to fix listing for SQL 
Standard mode. But performance acceptable solution for Storage Based mode is 
still unknown. I think that listing of only accessible tables wasn't Drill 
users requirement, and in case of such Jira appearance it should be implemented 
as finalized feature with acceptable performance and consistency for both 
authorization modes. Until Drill users require the feature, it'll be much 
better to return all tables as fast as possible. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] ihuzenko commented on issue #1706: DRILL-7115: Improve Hive schema show tables performance

2019-03-27 Thread GitBox

ihuzenko commented on issue #1706: DRILL-7115: Improve Hive schema show tables 
performance
URL: https://github.com/apache/drill/pull/1706#issuecomment-477219702
 
 
   @vdiravka could you please review again from scratch ? Because a lot of 
things were changed since last review. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269711718
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java
 ##
 @@ -410,6 +417,155 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  int maxRowsValue = pStmt.getMaxRows();
+  assertEquals(0, maxRowsValue);
+}
+  }
+
+  /**
+   * Test Invalid parameter by giving negative maxRows value
+   */
+  @Test
+  public void testInvalidSetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  //Setting negative value
+  int valueToSet = -10;
+  try {
+pStmt.setMaxRows(valueToSet);
+  } catch ( final SQLException e) {
+assertThat( e.getMessage(), containsString("VALIDATION ERROR: Option 
exec.query.max_rows must be between 0 and ") );
+  }
+}
+  }
+
+  /**
+   * Test setting a valid maxRows value
+   */
+  @Test
+  public void testValidSetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  //Setting positive value
+  int valueToSet = RANDOMIZER.nextInt(59)+1;
+  pStmt.setMaxRows(valueToSet);
+  assertEquals( valueToSet, pStmt.getMaxRows() );
+}
+  }
+
+  /**
+   * Test setting maxSize as zero (i.e. no Limit)
+   */
+  @Test
+  public void testSetMaxRowsAsZero() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  pStmt.setMaxRows(0);
+  pStmt.execute();
+  ResultSet rs = pStmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertTrue(rowCount > 0);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value lower than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsLowerThanQueryLimit() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL_LIMIT_10)) {
+  int valueToSet = RANDOMIZER.nextInt(9)+1; //range:[1-9]
+  pStmt.setMaxRows(valueToSet);
+  pStmt.executeQuery();
+  ResultSet rs = pStmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertEquals(valueToSet, rowCount);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value higher than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsHigherThanQueryLimit() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL_LIMIT_10)) {
+  int valueToSet = RANDOMIZER.nextInt(10)+11; //range:[11-20]
+  pStmt.setMaxRows(valueToSet);
+  pStmt.executeQuery();
+  ResultSet rs = pStmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertTrue(valueToSet > rowCount);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value lower than existing SYSTEM limit
+   */
+  @Test
+  public void testSetMaxRowsLowerThanSystemLimit() throws SQLException {
+int sysValueToSet = RANDOMIZER.nextInt(5)+6; //range:[6-10]
+setSystemMaxRows(sysValueToSet);
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  int valueToSet = RANDOMIZER.nextInt(5)+1; //range:[1-5]
+  pStmt.setMaxRows(valueToSet);
+  pStmt.executeQuery();
+  ResultSet rs = pStmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertEquals(valueToSet, rowCount);
+}
+setSystemMaxRows(0); //RESET
+  }
+
+  /**
+   * Test setting maxSize at a value higher than existing SYSTEM limit
+   */
+  @Test
+  public void testSetMaxRowsHigherThanSystemLimit() throws SQLException {
+int sysValueToSet = RANDOMIZER.nextInt(5)+6; //range:[6-10]
 
 Review comment:
   ```suggestion
   int sysValueToSet = RANDOMIZER.nextInt(5) + 6; // range: [6-10]
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269710545
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java
 ##
 @@ -410,6 +417,155 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
 
 Review comment:
   ```suggestion
 public void testDefaultGetMaxRows() throws SQLException {
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269719993
 
 

 ##
 File path: exec/jdbc/src/test/java/org/apache/drill/jdbc/StatementTest.java
 ##
 @@ -239,4 +246,174 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  int maxRowsValue = stmt.getMaxRows();
+  assertEquals(0, maxRowsValue);
+}
+  }
+
+  /**
+   * Test Invalid parameter by giving negative maxRows value
+   */
+  @Test
+  public void testInvalidSetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  // Setting negative value
+  int valueToSet = -10;
+  try {
+stmt.setMaxRows(valueToSet);
+  } catch ( final SQLException e) {
+assertThat( e.getMessage(), containsString("VALIDATION ERROR: Option 
exec.query.max_rows must be between 0 and ") );
+  }
+}
+  }
+
+  /**
+   * Test setting a valid maxRows value
+   */
+  @Test
+  public void testValidSetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  // Setting positive value
+  int valueToSet = RANDOMIZER.nextInt(59)+1;
+  stmt.setMaxRows(valueToSet);
+  assertEquals( valueToSet, stmt.getMaxRows() );
 
 Review comment:
   ```suggestion
 assertEquals(valueToSet, stmt.getMaxRows());
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269711132
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java
 ##
 @@ -410,6 +417,155 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  int maxRowsValue = pStmt.getMaxRows();
+  assertEquals(0, maxRowsValue);
+}
+  }
+
+  /**
+   * Test Invalid parameter by giving negative maxRows value
+   */
+  @Test
+  public void testInvalidSetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  //Setting negative value
+  int valueToSet = -10;
+  try {
+pStmt.setMaxRows(valueToSet);
+  } catch ( final SQLException e) {
+assertThat( e.getMessage(), containsString("VALIDATION ERROR: Option 
exec.query.max_rows must be between 0 and ") );
+  }
+}
+  }
+
+  /**
+   * Test setting a valid maxRows value
+   */
+  @Test
+  public void testValidSetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  //Setting positive value
+  int valueToSet = RANDOMIZER.nextInt(59)+1;
+  pStmt.setMaxRows(valueToSet);
+  assertEquals( valueToSet, pStmt.getMaxRows() );
 
 Review comment:
   ```suggestion
 assertEquals(valueToSet, pStmt.getMaxRows());
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269720137
 
 

 ##
 File path: exec/jdbc/src/test/java/org/apache/drill/jdbc/StatementTest.java
 ##
 @@ -239,4 +246,174 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  int maxRowsValue = stmt.getMaxRows();
+  assertEquals(0, maxRowsValue);
+}
+  }
+
+  /**
+   * Test Invalid parameter by giving negative maxRows value
+   */
+  @Test
+  public void testInvalidSetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  // Setting negative value
+  int valueToSet = -10;
+  try {
+stmt.setMaxRows(valueToSet);
+  } catch ( final SQLException e) {
+assertThat( e.getMessage(), containsString("VALIDATION ERROR: Option 
exec.query.max_rows must be between 0 and ") );
+  }
+}
+  }
+
+  /**
+   * Test setting a valid maxRows value
+   */
+  @Test
+  public void testValidSetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  // Setting positive value
+  int valueToSet = RANDOMIZER.nextInt(59)+1;
+  stmt.setMaxRows(valueToSet);
+  assertEquals( valueToSet, stmt.getMaxRows() );
+}
+  }
+
+  /**
+   * Test setting maxSize as zero (i.e. no Limit)
+   */
+  @Test
+  public void testSetMaxRowsAsZero() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  stmt.setMaxRows(0);
+  stmt.executeQuery(SYS_OPTIONS_SQL);
+  ResultSet rs = stmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertTrue(rowCount > 0);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value lower than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsLowerThanQueryLimit() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  int valueToSet = RANDOMIZER.nextInt(9)+1; // range:[1-9]
 
 Review comment:
   ```suggestion
 int valueToSet = RANDOMIZER.nextInt(9) + 1; // range: [1-9]
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269720874
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/test/Drill2489CallsAfterCloseThrowExceptionsTest.java
 ##
 @@ -537,7 +547,14 @@ public void 
testclosedPreparedStmtOfOpenConnMethodsThrowRight() {
 new ClosedPreparedStatementChecker(PreparedStatement.class,
closedPreparedStmtOfOpenConn);
 
-checker.testAllMethods();
+//List of methods now supported
+Set methodsToSkip = new LinkedHashSet() {{
 
 Review comment:
   I meant to revert all these changes, they are not needed for now.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269710453
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java
 ##
 @@ -285,7 +292,7 @@ public void testInvalidSetQueryTimeout() throws 
SQLException {
   public void testValidSetQueryTimeout() throws SQLException {
 try (PreparedStatement stmt = 
connection.prepareStatement(SYS_VERSION_SQL)) {
   //Setting positive value
-  int valueToSet = new Random(20150304).nextInt(59)+1;
+  int valueToSet = RANDOMIZER.nextInt(59)+1;
 
 Review comment:
   ```suggestion
 int valueToSet = RANDOMIZER.nextInt(59) + 1;
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269719449
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java
 ##
 @@ -462,4 +618,25 @@ public void 
testParamSettingWhenUnsupportedTypeSaysUnsupported() throws SQLExcep
 }
   }
 
+
+  // Sets the SystemMaxRows option
+  private void setSystemMaxRows(int sysValueToSet) throws SQLException {
+// Acquire lock if Non-Zero Value (i.e. a test is in progress)
+if (sysValueToSet != 0) {
+  maxRowsSysOptionLock.acquireUninterruptibly();
+}
+// Setting the value
+try (Statement stmt = connection.createStatement()) {
+  stmt.executeQuery(ALTER_SYS_OPTIONS_MAX_ROWS_LIMIT_X + sysValueToSet);
+  ResultSet rs = stmt.getResultSet();
 
 Review comment:
   Ok, thanks for the clarification.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269719305
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java
 ##
 @@ -462,4 +618,25 @@ public void 
testParamSettingWhenUnsupportedTypeSaysUnsupported() throws SQLExcep
 }
   }
 
+
+  // Sets the SystemMaxRows option
+  private void setSystemMaxRows(int sysValueToSet) throws SQLException {
 
 Review comment:
   It looks like a hack to release the lock for one value and acquire for 
others. Except adding locks into tests directly, the more clear way may be to 
add a boolean flag into method argument whether to acquire or release the lock.
   
   Also, I'm wondering whether these tests which change autoLimit does not 
affect other tests, i.e. when one test sets autoLimit to 3, but another test 
(which does not contain `setSystemMaxRows()` call, so it is not locked) expects 
result with 5 rows. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269711273
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java
 ##
 @@ -410,6 +417,155 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  int maxRowsValue = pStmt.getMaxRows();
+  assertEquals(0, maxRowsValue);
+}
+  }
+
+  /**
+   * Test Invalid parameter by giving negative maxRows value
+   */
+  @Test
+  public void testInvalidSetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  //Setting negative value
+  int valueToSet = -10;
+  try {
+pStmt.setMaxRows(valueToSet);
+  } catch ( final SQLException e) {
+assertThat( e.getMessage(), containsString("VALIDATION ERROR: Option 
exec.query.max_rows must be between 0 and ") );
+  }
+}
+  }
+
+  /**
+   * Test setting a valid maxRows value
+   */
+  @Test
+  public void testValidSetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  //Setting positive value
+  int valueToSet = RANDOMIZER.nextInt(59)+1;
+  pStmt.setMaxRows(valueToSet);
+  assertEquals( valueToSet, pStmt.getMaxRows() );
+}
+  }
+
+  /**
+   * Test setting maxSize as zero (i.e. no Limit)
+   */
+  @Test
+  public void testSetMaxRowsAsZero() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  pStmt.setMaxRows(0);
+  pStmt.execute();
+  ResultSet rs = pStmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertTrue(rowCount > 0);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value lower than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsLowerThanQueryLimit() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL_LIMIT_10)) {
+  int valueToSet = RANDOMIZER.nextInt(9)+1; //range:[1-9]
 
 Review comment:
   ```suggestion
 int valueToSet = RANDOMIZER.nextInt(9) + 1; // range: [1-9]
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269720328
 
 

 ##
 File path: exec/jdbc/src/test/java/org/apache/drill/jdbc/StatementTest.java
 ##
 @@ -239,4 +246,174 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  int maxRowsValue = stmt.getMaxRows();
+  assertEquals(0, maxRowsValue);
+}
+  }
+
+  /**
+   * Test Invalid parameter by giving negative maxRows value
+   */
+  @Test
+  public void testInvalidSetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  // Setting negative value
+  int valueToSet = -10;
+  try {
+stmt.setMaxRows(valueToSet);
+  } catch ( final SQLException e) {
+assertThat( e.getMessage(), containsString("VALIDATION ERROR: Option 
exec.query.max_rows must be between 0 and ") );
+  }
+}
+  }
+
+  /**
+   * Test setting a valid maxRows value
+   */
+  @Test
+  public void testValidSetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  // Setting positive value
+  int valueToSet = RANDOMIZER.nextInt(59)+1;
+  stmt.setMaxRows(valueToSet);
+  assertEquals( valueToSet, stmt.getMaxRows() );
+}
+  }
+
+  /**
+   * Test setting maxSize as zero (i.e. no Limit)
+   */
+  @Test
+  public void testSetMaxRowsAsZero() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  stmt.setMaxRows(0);
+  stmt.executeQuery(SYS_OPTIONS_SQL);
+  ResultSet rs = stmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertTrue(rowCount > 0);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value lower than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsLowerThanQueryLimit() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  int valueToSet = RANDOMIZER.nextInt(9)+1; // range:[1-9]
+  stmt.setMaxRows(valueToSet);
+  stmt.executeQuery(SYS_OPTIONS_SQL_LIMIT_10);
+  ResultSet rs = stmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertEquals(valueToSet, rowCount);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value higher than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsHigherThanQueryLimit() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  int valueToSet = RANDOMIZER.nextInt(10)+11; // range:[11-20]
+  stmt.setMaxRows(valueToSet);
+  stmt.executeQuery(SYS_OPTIONS_SQL_LIMIT_10);
+  ResultSet rs = stmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertTrue(valueToSet > rowCount);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value lower than existing SYSTEM limit
+   */
+  @Test
+  public void testSetMaxRowsLowerThanSystemLimit() throws SQLException {
+int sysValueToSet = RANDOMIZER.nextInt(5)+6; // range:[6-10]
 
 Review comment:
   ```suggestion
   int sysValueToSet = RANDOMIZER.nextInt(5) + 6; // range: [6-10]
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269710761
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java
 ##
 @@ -410,6 +417,155 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  int maxRowsValue = pStmt.getMaxRows();
+  assertEquals(0, maxRowsValue);
+}
+  }
+
+  /**
+   * Test Invalid parameter by giving negative maxRows value
+   */
+  @Test
+  public void testInvalidSetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  //Setting negative value
+  int valueToSet = -10;
+  try {
+pStmt.setMaxRows(valueToSet);
+  } catch ( final SQLException e) {
 
 Review comment:
   ```suggestion
 } catch (final SQLException e) {
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269711570
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java
 ##
 @@ -410,6 +417,155 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  int maxRowsValue = pStmt.getMaxRows();
+  assertEquals(0, maxRowsValue);
+}
+  }
+
+  /**
+   * Test Invalid parameter by giving negative maxRows value
+   */
+  @Test
+  public void testInvalidSetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  //Setting negative value
+  int valueToSet = -10;
+  try {
+pStmt.setMaxRows(valueToSet);
+  } catch ( final SQLException e) {
+assertThat( e.getMessage(), containsString("VALIDATION ERROR: Option 
exec.query.max_rows must be between 0 and ") );
+  }
+}
+  }
+
+  /**
+   * Test setting a valid maxRows value
+   */
+  @Test
+  public void testValidSetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  //Setting positive value
+  int valueToSet = RANDOMIZER.nextInt(59)+1;
+  pStmt.setMaxRows(valueToSet);
+  assertEquals( valueToSet, pStmt.getMaxRows() );
+}
+  }
+
+  /**
+   * Test setting maxSize as zero (i.e. no Limit)
+   */
+  @Test
+  public void testSetMaxRowsAsZero() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  pStmt.setMaxRows(0);
+  pStmt.execute();
+  ResultSet rs = pStmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertTrue(rowCount > 0);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value lower than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsLowerThanQueryLimit() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL_LIMIT_10)) {
+  int valueToSet = RANDOMIZER.nextInt(9)+1; //range:[1-9]
+  pStmt.setMaxRows(valueToSet);
+  pStmt.executeQuery();
+  ResultSet rs = pStmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertEquals(valueToSet, rowCount);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value higher than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsHigherThanQueryLimit() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL_LIMIT_10)) {
+  int valueToSet = RANDOMIZER.nextInt(10)+11; //range:[11-20]
+  pStmt.setMaxRows(valueToSet);
+  pStmt.executeQuery();
+  ResultSet rs = pStmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertTrue(valueToSet > rowCount);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value lower than existing SYSTEM limit
+   */
+  @Test
+  public void testSetMaxRowsLowerThanSystemLimit() throws SQLException {
+int sysValueToSet = RANDOMIZER.nextInt(5)+6; //range:[6-10]
+setSystemMaxRows(sysValueToSet);
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  int valueToSet = RANDOMIZER.nextInt(5)+1; //range:[1-5]
 
 Review comment:
   ```suggestion
 int valueToSet = RANDOMIZER.nextInt(5) + 1; // range: [1-5]
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269720581
 
 

 ##
 File path: exec/jdbc/src/test/java/org/apache/drill/jdbc/StatementTest.java
 ##
 @@ -239,4 +246,174 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  int maxRowsValue = stmt.getMaxRows();
+  assertEquals(0, maxRowsValue);
+}
+  }
+
+  /**
+   * Test Invalid parameter by giving negative maxRows value
+   */
+  @Test
+  public void testInvalidSetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  // Setting negative value
+  int valueToSet = -10;
+  try {
+stmt.setMaxRows(valueToSet);
+  } catch ( final SQLException e) {
+assertThat( e.getMessage(), containsString("VALIDATION ERROR: Option 
exec.query.max_rows must be between 0 and ") );
+  }
+}
+  }
+
+  /**
+   * Test setting a valid maxRows value
+   */
+  @Test
+  public void testValidSetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  // Setting positive value
+  int valueToSet = RANDOMIZER.nextInt(59)+1;
+  stmt.setMaxRows(valueToSet);
+  assertEquals( valueToSet, stmt.getMaxRows() );
+}
+  }
+
+  /**
+   * Test setting maxSize as zero (i.e. no Limit)
+   */
+  @Test
+  public void testSetMaxRowsAsZero() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  stmt.setMaxRows(0);
+  stmt.executeQuery(SYS_OPTIONS_SQL);
+  ResultSet rs = stmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertTrue(rowCount > 0);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value lower than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsLowerThanQueryLimit() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  int valueToSet = RANDOMIZER.nextInt(9)+1; // range:[1-9]
+  stmt.setMaxRows(valueToSet);
+  stmt.executeQuery(SYS_OPTIONS_SQL_LIMIT_10);
+  ResultSet rs = stmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertEquals(valueToSet, rowCount);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value higher than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsHigherThanQueryLimit() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  int valueToSet = RANDOMIZER.nextInt(10)+11; // range:[11-20]
+  stmt.setMaxRows(valueToSet);
+  stmt.executeQuery(SYS_OPTIONS_SQL_LIMIT_10);
+  ResultSet rs = stmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertTrue(valueToSet > rowCount);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value lower than existing SYSTEM limit
+   */
+  @Test
+  public void testSetMaxRowsLowerThanSystemLimit() throws SQLException {
+int sysValueToSet = RANDOMIZER.nextInt(5)+6; // range:[6-10]
+setSystemMaxRows(sysValueToSet);
+try (Statement stmt = connection.createStatement()) {
+  int valueToSet = RANDOMIZER.nextInt(5)+1; // range:[1-5]
+  stmt.setMaxRows(valueToSet);
+  stmt.executeQuery(SYS_OPTIONS_SQL);
+  ResultSet rs = stmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertEquals(valueToSet, rowCount);
+}
+setSystemMaxRows(0); // Reset value
+  }
+
+  /**
+   * Test setting maxSize at a value higher than existing SYSTEM limit
+   */
+  @Test
+  public void testSetMaxRowsHigherThanSystemLimit() throws SQLException {
+int sysValueToSet = RANDOMIZER.nextInt(5)+6; // range:[6-10]
+setSystemMaxRows(sysValueToSet);
+try (Statement stmt = connection.createStatement()) {
+  int valueToSet = RANDOMIZER.nextInt(5)+11; // range:[11-15]
 
 Review comment:
   ```suggestion
 int valueToSet = RANDOMIZER.nextInt(5) + 11; // range: [11-15]
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269711477
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java
 ##
 @@ -410,6 +417,155 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  int maxRowsValue = pStmt.getMaxRows();
+  assertEquals(0, maxRowsValue);
+}
+  }
+
+  /**
+   * Test Invalid parameter by giving negative maxRows value
+   */
+  @Test
+  public void testInvalidSetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  //Setting negative value
+  int valueToSet = -10;
+  try {
+pStmt.setMaxRows(valueToSet);
+  } catch ( final SQLException e) {
+assertThat( e.getMessage(), containsString("VALIDATION ERROR: Option 
exec.query.max_rows must be between 0 and ") );
+  }
+}
+  }
+
+  /**
+   * Test setting a valid maxRows value
+   */
+  @Test
+  public void testValidSetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  //Setting positive value
+  int valueToSet = RANDOMIZER.nextInt(59)+1;
+  pStmt.setMaxRows(valueToSet);
+  assertEquals( valueToSet, pStmt.getMaxRows() );
+}
+  }
+
+  /**
+   * Test setting maxSize as zero (i.e. no Limit)
+   */
+  @Test
+  public void testSetMaxRowsAsZero() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  pStmt.setMaxRows(0);
+  pStmt.execute();
+  ResultSet rs = pStmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertTrue(rowCount > 0);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value lower than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsLowerThanQueryLimit() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL_LIMIT_10)) {
+  int valueToSet = RANDOMIZER.nextInt(9)+1; //range:[1-9]
+  pStmt.setMaxRows(valueToSet);
+  pStmt.executeQuery();
+  ResultSet rs = pStmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertEquals(valueToSet, rowCount);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value higher than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsHigherThanQueryLimit() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL_LIMIT_10)) {
+  int valueToSet = RANDOMIZER.nextInt(10)+11; //range:[11-20]
+  pStmt.setMaxRows(valueToSet);
+  pStmt.executeQuery();
+  ResultSet rs = pStmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertTrue(valueToSet > rowCount);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value lower than existing SYSTEM limit
+   */
+  @Test
+  public void testSetMaxRowsLowerThanSystemLimit() throws SQLException {
+int sysValueToSet = RANDOMIZER.nextInt(5)+6; //range:[6-10]
 
 Review comment:
   ```suggestion
   int sysValueToSet = RANDOMIZER.nextInt(5) + 6; // range: [6-10]
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269711070
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java
 ##
 @@ -410,6 +417,155 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  int maxRowsValue = pStmt.getMaxRows();
+  assertEquals(0, maxRowsValue);
+}
+  }
+
+  /**
+   * Test Invalid parameter by giving negative maxRows value
+   */
+  @Test
+  public void testInvalidSetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  //Setting negative value
+  int valueToSet = -10;
+  try {
+pStmt.setMaxRows(valueToSet);
+  } catch ( final SQLException e) {
+assertThat( e.getMessage(), containsString("VALIDATION ERROR: Option 
exec.query.max_rows must be between 0 and ") );
+  }
+}
+  }
+
+  /**
+   * Test setting a valid maxRows value
+   */
+  @Test
+  public void testValidSetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  //Setting positive value
+  int valueToSet = RANDOMIZER.nextInt(59)+1;
 
 Review comment:
   ```suggestion
 int valueToSet = RANDOMIZER.nextInt(59) + 1;
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269720507
 
 

 ##
 File path: exec/jdbc/src/test/java/org/apache/drill/jdbc/StatementTest.java
 ##
 @@ -239,4 +246,174 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  int maxRowsValue = stmt.getMaxRows();
+  assertEquals(0, maxRowsValue);
+}
+  }
+
+  /**
+   * Test Invalid parameter by giving negative maxRows value
+   */
+  @Test
+  public void testInvalidSetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  // Setting negative value
+  int valueToSet = -10;
+  try {
+stmt.setMaxRows(valueToSet);
+  } catch ( final SQLException e) {
+assertThat( e.getMessage(), containsString("VALIDATION ERROR: Option 
exec.query.max_rows must be between 0 and ") );
+  }
+}
+  }
+
+  /**
+   * Test setting a valid maxRows value
+   */
+  @Test
+  public void testValidSetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  // Setting positive value
+  int valueToSet = RANDOMIZER.nextInt(59)+1;
+  stmt.setMaxRows(valueToSet);
+  assertEquals( valueToSet, stmt.getMaxRows() );
+}
+  }
+
+  /**
+   * Test setting maxSize as zero (i.e. no Limit)
+   */
+  @Test
+  public void testSetMaxRowsAsZero() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  stmt.setMaxRows(0);
+  stmt.executeQuery(SYS_OPTIONS_SQL);
+  ResultSet rs = stmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertTrue(rowCount > 0);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value lower than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsLowerThanQueryLimit() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  int valueToSet = RANDOMIZER.nextInt(9)+1; // range:[1-9]
+  stmt.setMaxRows(valueToSet);
+  stmt.executeQuery(SYS_OPTIONS_SQL_LIMIT_10);
+  ResultSet rs = stmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertEquals(valueToSet, rowCount);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value higher than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsHigherThanQueryLimit() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  int valueToSet = RANDOMIZER.nextInt(10)+11; // range:[11-20]
+  stmt.setMaxRows(valueToSet);
+  stmt.executeQuery(SYS_OPTIONS_SQL_LIMIT_10);
+  ResultSet rs = stmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertTrue(valueToSet > rowCount);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value lower than existing SYSTEM limit
+   */
+  @Test
+  public void testSetMaxRowsLowerThanSystemLimit() throws SQLException {
+int sysValueToSet = RANDOMIZER.nextInt(5)+6; // range:[6-10]
+setSystemMaxRows(sysValueToSet);
+try (Statement stmt = connection.createStatement()) {
+  int valueToSet = RANDOMIZER.nextInt(5)+1; // range:[1-5]
+  stmt.setMaxRows(valueToSet);
+  stmt.executeQuery(SYS_OPTIONS_SQL);
+  ResultSet rs = stmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertEquals(valueToSet, rowCount);
+}
+setSystemMaxRows(0); // Reset value
+  }
+
+  /**
+   * Test setting maxSize at a value higher than existing SYSTEM limit
+   */
+  @Test
+  public void testSetMaxRowsHigherThanSystemLimit() throws SQLException {
+int sysValueToSet = RANDOMIZER.nextInt(5)+6; // range:[6-10]
 
 Review comment:
   ```suggestion
   int sysValueToSet = RANDOMIZER.nextInt(5) + 6; // range: [6-10]
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269710983
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java
 ##
 @@ -410,6 +417,155 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  int maxRowsValue = pStmt.getMaxRows();
+  assertEquals(0, maxRowsValue);
+}
+  }
+
+  /**
+   * Test Invalid parameter by giving negative maxRows value
+   */
+  @Test
+  public void testInvalidSetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  //Setting negative value
+  int valueToSet = -10;
+  try {
+pStmt.setMaxRows(valueToSet);
+  } catch ( final SQLException e) {
+assertThat( e.getMessage(), containsString("VALIDATION ERROR: Option 
exec.query.max_rows must be between 0 and ") );
 
 Review comment:
   ```suggestion
   assertThat(e.getMessage(), containsString("VALIDATION ERROR: Option 
exec.query.max_rows must be between 0 and "));
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269710348
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java
 ##
 @@ -72,12 +73,18 @@
 public class PreparedStatementTest extends JdbcTestBase {
 
   private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(PreparedStatementTest.class);
+  private static final Random RANDOMIZER = new Random(20150304);
 
   private static final String SYS_VERSION_SQL = "select * from sys.version";
   private static final String SYS_RANDOM_SQL =
   "SELECT cast(random() as varchar) as myStr FROM (VALUES(1)) " +
   "union SELECT cast(random() as varchar) as myStr FROM (VALUES(1)) " +
   "union SELECT cast(random() as varchar) as myStr FROM (VALUES(1)) ";
+  private static final String SYS_OPTIONS_SQL = "SELECT * FROM sys.options";
+  private static final String SYS_OPTIONS_SQL_LIMIT_10 = "SELECT * FROM 
sys.options LIMIT 12";
+  private static final String ALTER_SYS_OPTIONS_MAX_ROWS_LIMIT_X = "ALTER 
SYSTEM SET `" + ExecConstants.QUERY_MAX_ROWS + "`=";
+  //Locks used across StatementTest and PreparedStatementTest
+  private static Semaphore maxRowsSysOptionLock = 
StatementTest.maxRowsSysOptionLock;
 
 Review comment:
   Looks like these tests are connected to the same drillbit. In this case, the 
same `Semaphore` instance should be used. Thanks for the clarification.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269720216
 
 

 ##
 File path: exec/jdbc/src/test/java/org/apache/drill/jdbc/StatementTest.java
 ##
 @@ -239,4 +246,174 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  int maxRowsValue = stmt.getMaxRows();
+  assertEquals(0, maxRowsValue);
+}
+  }
+
+  /**
+   * Test Invalid parameter by giving negative maxRows value
+   */
+  @Test
+  public void testInvalidSetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  // Setting negative value
+  int valueToSet = -10;
+  try {
+stmt.setMaxRows(valueToSet);
+  } catch ( final SQLException e) {
+assertThat( e.getMessage(), containsString("VALIDATION ERROR: Option 
exec.query.max_rows must be between 0 and ") );
+  }
+}
+  }
+
+  /**
+   * Test setting a valid maxRows value
+   */
+  @Test
+  public void testValidSetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  // Setting positive value
+  int valueToSet = RANDOMIZER.nextInt(59)+1;
+  stmt.setMaxRows(valueToSet);
+  assertEquals( valueToSet, stmt.getMaxRows() );
+}
+  }
+
+  /**
+   * Test setting maxSize as zero (i.e. no Limit)
+   */
+  @Test
+  public void testSetMaxRowsAsZero() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  stmt.setMaxRows(0);
+  stmt.executeQuery(SYS_OPTIONS_SQL);
+  ResultSet rs = stmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertTrue(rowCount > 0);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value lower than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsLowerThanQueryLimit() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  int valueToSet = RANDOMIZER.nextInt(9)+1; // range:[1-9]
+  stmt.setMaxRows(valueToSet);
+  stmt.executeQuery(SYS_OPTIONS_SQL_LIMIT_10);
+  ResultSet rs = stmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertEquals(valueToSet, rowCount);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value higher than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsHigherThanQueryLimit() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  int valueToSet = RANDOMIZER.nextInt(10)+11; // range:[11-20]
 
 Review comment:
   ```suggestion
 int valueToSet = RANDOMIZER.nextInt(10) + 11; // range: [11-20]
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269720404
 
 

 ##
 File path: exec/jdbc/src/test/java/org/apache/drill/jdbc/StatementTest.java
 ##
 @@ -239,4 +246,174 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  int maxRowsValue = stmt.getMaxRows();
+  assertEquals(0, maxRowsValue);
+}
+  }
+
+  /**
+   * Test Invalid parameter by giving negative maxRows value
+   */
+  @Test
+  public void testInvalidSetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  // Setting negative value
+  int valueToSet = -10;
+  try {
+stmt.setMaxRows(valueToSet);
+  } catch ( final SQLException e) {
+assertThat( e.getMessage(), containsString("VALIDATION ERROR: Option 
exec.query.max_rows must be between 0 and ") );
+  }
+}
+  }
+
+  /**
+   * Test setting a valid maxRows value
+   */
+  @Test
+  public void testValidSetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  // Setting positive value
+  int valueToSet = RANDOMIZER.nextInt(59)+1;
+  stmt.setMaxRows(valueToSet);
+  assertEquals( valueToSet, stmt.getMaxRows() );
+}
+  }
+
+  /**
+   * Test setting maxSize as zero (i.e. no Limit)
+   */
+  @Test
+  public void testSetMaxRowsAsZero() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  stmt.setMaxRows(0);
+  stmt.executeQuery(SYS_OPTIONS_SQL);
+  ResultSet rs = stmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertTrue(rowCount > 0);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value lower than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsLowerThanQueryLimit() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  int valueToSet = RANDOMIZER.nextInt(9)+1; // range:[1-9]
+  stmt.setMaxRows(valueToSet);
+  stmt.executeQuery(SYS_OPTIONS_SQL_LIMIT_10);
+  ResultSet rs = stmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertEquals(valueToSet, rowCount);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value higher than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsHigherThanQueryLimit() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  int valueToSet = RANDOMIZER.nextInt(10)+11; // range:[11-20]
+  stmt.setMaxRows(valueToSet);
+  stmt.executeQuery(SYS_OPTIONS_SQL_LIMIT_10);
+  ResultSet rs = stmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertTrue(valueToSet > rowCount);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value lower than existing SYSTEM limit
+   */
+  @Test
+  public void testSetMaxRowsLowerThanSystemLimit() throws SQLException {
+int sysValueToSet = RANDOMIZER.nextInt(5)+6; // range:[6-10]
+setSystemMaxRows(sysValueToSet);
+try (Statement stmt = connection.createStatement()) {
+  int valueToSet = RANDOMIZER.nextInt(5)+1; // range:[1-5]
 
 Review comment:
   ```suggestion
 int valueToSet = RANDOMIZER.nextInt(5) + 1; // range: [1-5]
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269719774
 
 

 ##
 File path: exec/jdbc/src/test/java/org/apache/drill/jdbc/StatementTest.java
 ##
 @@ -239,4 +246,174 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  int maxRowsValue = stmt.getMaxRows();
+  assertEquals(0, maxRowsValue);
+}
+  }
+
+  /**
+   * Test Invalid parameter by giving negative maxRows value
+   */
+  @Test
+  public void testInvalidSetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  // Setting negative value
+  int valueToSet = -10;
+  try {
+stmt.setMaxRows(valueToSet);
+  } catch ( final SQLException e) {
 
 Review comment:
   ```suggestion
 } catch (final SQLException e) {
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269719943
 
 

 ##
 File path: exec/jdbc/src/test/java/org/apache/drill/jdbc/StatementTest.java
 ##
 @@ -239,4 +246,174 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  int maxRowsValue = stmt.getMaxRows();
+  assertEquals(0, maxRowsValue);
+}
+  }
+
+  /**
+   * Test Invalid parameter by giving negative maxRows value
+   */
+  @Test
+  public void testInvalidSetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  // Setting negative value
+  int valueToSet = -10;
+  try {
+stmt.setMaxRows(valueToSet);
+  } catch ( final SQLException e) {
+assertThat( e.getMessage(), containsString("VALIDATION ERROR: Option 
exec.query.max_rows must be between 0 and ") );
+  }
+}
+  }
+
+  /**
+   * Test setting a valid maxRows value
+   */
+  @Test
+  public void testValidSetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  // Setting positive value
+  int valueToSet = RANDOMIZER.nextInt(59)+1;
 
 Review comment:
   ```suggestion
 int valueToSet = RANDOMIZER.nextInt(59) + 1;
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269709745
 
 

 ##
 File path: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillPreparedStatementImpl.java
 ##
 @@ -259,4 +261,13 @@ public void setObject(int parameterIndex, Object x, 
SQLType targetSqlType) throw
 checkOpen();
 super.setObject(parameterIndex, x, targetSqlType);
   }
+
+  @Override
+  public void setLargeMaxRows(long maxRowCount) throws SQLException {
+super.setLargeMaxRows(maxRowCount);
+try (Statement statement = this.connection.createStatement()) {
+  statement.execute("ALTER SESSION SET `" + ExecConstants.QUERY_MAX_ROWS + 
"`=" + maxRowCount);
+  statement.close();
 
 Review comment:
   Please remove this line, `statement` will be closed automatically since it 
is used in `try with resources`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269711382
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java
 ##
 @@ -410,6 +417,155 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  int maxRowsValue = pStmt.getMaxRows();
+  assertEquals(0, maxRowsValue);
+}
+  }
+
+  /**
+   * Test Invalid parameter by giving negative maxRows value
+   */
+  @Test
+  public void testInvalidSetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  //Setting negative value
+  int valueToSet = -10;
+  try {
+pStmt.setMaxRows(valueToSet);
+  } catch ( final SQLException e) {
+assertThat( e.getMessage(), containsString("VALIDATION ERROR: Option 
exec.query.max_rows must be between 0 and ") );
+  }
+}
+  }
+
+  /**
+   * Test setting a valid maxRows value
+   */
+  @Test
+  public void testValidSetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  //Setting positive value
+  int valueToSet = RANDOMIZER.nextInt(59)+1;
+  pStmt.setMaxRows(valueToSet);
+  assertEquals( valueToSet, pStmt.getMaxRows() );
+}
+  }
+
+  /**
+   * Test setting maxSize as zero (i.e. no Limit)
+   */
+  @Test
+  public void testSetMaxRowsAsZero() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  pStmt.setMaxRows(0);
+  pStmt.execute();
+  ResultSet rs = pStmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertTrue(rowCount > 0);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value lower than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsLowerThanQueryLimit() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL_LIMIT_10)) {
+  int valueToSet = RANDOMIZER.nextInt(9)+1; //range:[1-9]
+  pStmt.setMaxRows(valueToSet);
+  pStmt.executeQuery();
+  ResultSet rs = pStmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertEquals(valueToSet, rowCount);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value higher than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsHigherThanQueryLimit() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL_LIMIT_10)) {
+  int valueToSet = RANDOMIZER.nextInt(10)+11; //range:[11-20]
 
 Review comment:
   ```suggestion
 int valueToSet = RANDOMIZER.nextInt(10) + 11; // range: [11-20]
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269719854
 
 

 ##
 File path: exec/jdbc/src/test/java/org/apache/drill/jdbc/StatementTest.java
 ##
 @@ -239,4 +246,174 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  int maxRowsValue = stmt.getMaxRows();
+  assertEquals(0, maxRowsValue);
+}
+  }
+
+  /**
+   * Test Invalid parameter by giving negative maxRows value
+   */
+  @Test
+  public void testInvalidSetMaxRows() throws SQLException {
+try (Statement stmt = connection.createStatement()) {
+  // Setting negative value
+  int valueToSet = -10;
+  try {
+stmt.setMaxRows(valueToSet);
+  } catch ( final SQLException e) {
+assertThat( e.getMessage(), containsString("VALIDATION ERROR: Option 
exec.query.max_rows must be between 0 and ") );
 
 Review comment:
   ```suggestion
   assertThat(e.getMessage(), containsString("VALIDATION ERROR: Option 
exec.query.max_rows must be between 0 and "));
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

vvysotskyi commented on a change in pull request #1714: DRILL-7048: Implement 
JDBC Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269711807
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java
 ##
 @@ -410,6 +417,155 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  int maxRowsValue = pStmt.getMaxRows();
+  assertEquals(0, maxRowsValue);
+}
+  }
+
+  /**
+   * Test Invalid parameter by giving negative maxRows value
+   */
+  @Test
+  public void testInvalidSetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  //Setting negative value
+  int valueToSet = -10;
+  try {
+pStmt.setMaxRows(valueToSet);
+  } catch ( final SQLException e) {
+assertThat( e.getMessage(), containsString("VALIDATION ERROR: Option 
exec.query.max_rows must be between 0 and ") );
+  }
+}
+  }
+
+  /**
+   * Test setting a valid maxRows value
+   */
+  @Test
+  public void testValidSetMaxRows() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  //Setting positive value
+  int valueToSet = RANDOMIZER.nextInt(59)+1;
+  pStmt.setMaxRows(valueToSet);
+  assertEquals( valueToSet, pStmt.getMaxRows() );
+}
+  }
+
+  /**
+   * Test setting maxSize as zero (i.e. no Limit)
+   */
+  @Test
+  public void testSetMaxRowsAsZero() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  pStmt.setMaxRows(0);
+  pStmt.execute();
+  ResultSet rs = pStmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertTrue(rowCount > 0);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value lower than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsLowerThanQueryLimit() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL_LIMIT_10)) {
+  int valueToSet = RANDOMIZER.nextInt(9)+1; //range:[1-9]
+  pStmt.setMaxRows(valueToSet);
+  pStmt.executeQuery();
+  ResultSet rs = pStmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertEquals(valueToSet, rowCount);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value higher than existing query limit
+   */
+  @Test
+  public void testSetMaxRowsHigherThanQueryLimit() throws SQLException {
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL_LIMIT_10)) {
+  int valueToSet = RANDOMIZER.nextInt(10)+11; //range:[11-20]
+  pStmt.setMaxRows(valueToSet);
+  pStmt.executeQuery();
+  ResultSet rs = pStmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertTrue(valueToSet > rowCount);
+}
+  }
+
+  /**
+   * Test setting maxSize at a value lower than existing SYSTEM limit
+   */
+  @Test
+  public void testSetMaxRowsLowerThanSystemLimit() throws SQLException {
+int sysValueToSet = RANDOMIZER.nextInt(5)+6; //range:[6-10]
+setSystemMaxRows(sysValueToSet);
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  int valueToSet = RANDOMIZER.nextInt(5)+1; //range:[1-5]
+  pStmt.setMaxRows(valueToSet);
+  pStmt.executeQuery();
+  ResultSet rs = pStmt.getResultSet();
+  int rowCount = 0;
+  while (rs.next()) {
+rs.getBytes(1);
+rowCount++;
+  }
+  rs.close();
+  assertEquals(valueToSet, rowCount);
+}
+setSystemMaxRows(0); //RESET
+  }
+
+  /**
+   * Test setting maxSize at a value higher than existing SYSTEM limit
+   */
+  @Test
+  public void testSetMaxRowsHigherThanSystemLimit() throws SQLException {
+int sysValueToSet = RANDOMIZER.nextInt(5)+6; //range:[6-10]
+setSystemMaxRows(sysValueToSet);
+try (PreparedStatement pStmt = 
connection.prepareStatement(SYS_OPTIONS_SQL)) {
+  int valueToSet = RANDOMIZER.nextInt(5)+11; //range:[11-15]
 
 Review comment:
   ```suggestion
 int valueToSet = RANDOMIZER.nextInt(5) + 11; // range: [11-15]
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this servi

[GitHub] [drill] kkhatua commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

kkhatua commented on a change in pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269751346
 
 

 ##
 File path: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillPreparedStatementImpl.java
 ##
 @@ -259,4 +261,13 @@ public void setObject(int parameterIndex, Object x, 
SQLType targetSqlType) throw
 checkOpen();
 super.setObject(parameterIndex, x, targetSqlType);
   }
+
+  @Override
+  public void setLargeMaxRows(long maxRowCount) throws SQLException {
+super.setLargeMaxRows(maxRowCount);
+try (Statement statement = this.connection.createStatement()) {
+  statement.execute("ALTER SESSION SET `" + ExecConstants.QUERY_MAX_ROWS + 
"`=" + maxRowCount);
+  statement.close();
 
 Review comment:
   👍 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] kkhatua commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

kkhatua commented on a change in pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269753040
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java
 ##
 @@ -410,6 +417,155 @@ public void testNonTriggeredQueryTimeout() throws 
SQLException {
 }
   }
 
+  
+  // Query maxRows methods:
+
+  /**
+   * Test for reading of default max rows
+   */
+  @Test
+  public void   testDefaultGetMaxRows() throws SQLException {
 
 Review comment:
   👍 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] kkhatua commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

kkhatua commented on a change in pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269753146
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java
 ##
 @@ -285,7 +292,7 @@ public void testInvalidSetQueryTimeout() throws 
SQLException {
   public void testValidSetQueryTimeout() throws SQLException {
 try (PreparedStatement stmt = 
connection.prepareStatement(SYS_VERSION_SQL)) {
   //Setting positive value
-  int valueToSet = new Random(20150304).nextInt(59)+1;
+  int valueToSet = RANDOMIZER.nextInt(59)+1;
 
 Review comment:
   👍 
   Will check across all the tests.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] gparai commented on a change in pull request #1715: DRILL-7117: Support creation of equi-depth histogram for selected dat…

2019-03-27 Thread GitBox

gparai commented on a change in pull request #1715: DRILL-7117: Support 
creation of equi-depth histogram for selected dat…
URL: https://github.com/apache/drill/pull/1715#discussion_r269754472
 
 

 ##
 File path: exec/java-exec/src/main/resources/drill-module.conf
 ##
 @@ -685,5 +685,6 @@ drill.exec.options: {
 exec.query.return_result_set_for_ddl: true,
 # = rm related options ===
 exec.rm.queryTags: "",
-exec.rm.queues.wait_for_preferred_nodes: true
+exec.rm.queues.wait_for_preferred_nodes: true,
+exec.statistics.tdigest_compression: 100
 
 Review comment:
   Do we want to expose the algorithm name to the user? We had a similar 
discussion for HLL where we decided it would be better to have a generic name.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] gparai commented on a change in pull request #1715: DRILL-7117: Support creation of equi-depth histogram for selected dat…

2019-03-27 Thread GitBox

gparai commented on a change in pull request #1715: DRILL-7117: Support 
creation of equi-depth histogram for selected dat…
URL: https://github.com/apache/drill/pull/1715#discussion_r269753784
 
 

 ##
 File path: exec/jdbc-all/pom.xml
 ##
 @@ -452,6 +452,7 @@

org/apache/drill/shaded/guava/com/google/common/graph/**

org/apache/drill/shaded/guava/com/google/common/collect/Tree*

org/apache/drill/shaded/guava/com/google/common/collect/Standard*
+   
org/apache/drill/shaded/guava/com/google/common/io/BaseEncoding*
 
 Review comment:
   Why was this needed?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] gparai commented on a change in pull request #1715: DRILL-7117: Support creation of equi-depth histogram for selected dat…

2019-03-27 Thread GitBox

gparai commented on a change in pull request #1715: DRILL-7117: Support 
creation of equi-depth histogram for selected dat…
URL: https://github.com/apache/drill/pull/1715#discussion_r269755101
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/NumericEquiDepthHistogram.java
 ##
 @@ -0,0 +1,106 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.common;
+
+
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+
+import org.apache.calcite.rex.RexNode;
+import com.clearspring.analytics.stream.quantile.TDigest;
+
+/**
+ * A column specific equi-depth histogram which is meant for numeric data types
+ */
+@JsonTypeName("numeric-equi-depth")
+public class NumericEquiDepthHistogram implements Histogram {
+
+  // For equi-depth, all buckets will have same (approx) number of rows
+  @JsonProperty("numRowsPerBucket")
+  private long numRowsPerBucket;
+
+  // An array of buckets arranged in increasing order of their start boundaries
+  // Note that the buckets only maintain the start point of the bucket range.
+  // End point is assumed to be the same as the start point of next bucket.
 
 Review comment:
   I thought end-point was less than the start point of the next bucket?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] gparai commented on a change in pull request #1715: DRILL-7117: Support creation of equi-depth histogram for selected dat…

2019-03-27 Thread GitBox

gparai commented on a change in pull request #1715: DRILL-7117: Support 
creation of equi-depth histogram for selected dat…
URL: https://github.com/apache/drill/pull/1715#discussion_r269756979
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/NumericEquiDepthHistogram.java
 ##
 @@ -0,0 +1,106 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.common;
+
+
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+
+import org.apache.calcite.rex.RexNode;
+import com.clearspring.analytics.stream.quantile.TDigest;
+
+/**
+ * A column specific equi-depth histogram which is meant for numeric data types
+ */
+@JsonTypeName("numeric-equi-depth")
+public class NumericEquiDepthHistogram implements Histogram {
+
+  // For equi-depth, all buckets will have same (approx) number of rows
+  @JsonProperty("numRowsPerBucket")
+  private long numRowsPerBucket;
+
+  // An array of buckets arranged in increasing order of their start boundaries
+  // Note that the buckets only maintain the start point of the bucket range.
+  // End point is assumed to be the same as the start point of next bucket.
+  @JsonProperty("buckets")
+  private Double[] buckets;
+
+  // Default constructor for deserializer
+  public NumericEquiDepthHistogram() {}
+
+  public NumericEquiDepthHistogram(int numBuckets) {
+// If numBuckets = N, we are keeping N + 1 entries since the (N+1)th 
bucket's
+// starting value is the MAX value for the column and it becomes the end 
point of the
+// Nth bucket.
+buckets = new Double[numBuckets + 1];
+for (int i = 0; i < buckets.length; i++) {
+  buckets[i] = new Double(0.0);
+}
+numRowsPerBucket = -1;
+  }
+
+  public long getNumRowsPerBucket() {
+return numRowsPerBucket;
+  }
+
+  public void setNumRowsPerBucket(long numRows) {
+this.numRowsPerBucket = numRows;
+  }
+
+  public Double[] getBuckets() {
+return buckets;
+  }
+
+  @Override
+  public Double estimatedSelectivity(RexNode filter) {
+if (numRowsPerBucket >= 0) {
+  return 1.0;
+} else {
+  return null;
+}
+  }
+
+  /**
+   * Utility method to build a Numeric Equi-Depth Histogram from a t-digest 
byte array
+   * @param tdigest_array
+   * @return An instance of NumericEquiDepthHistogram
+   */
+  public static NumericEquiDepthHistogram buildFromTDigest(byte[] 
tdigest_array,
+   int numBuckets) {
+TDigest tdigest = 
TDigest.fromBytes(java.nio.ByteBuffer.wrap(tdigest_array));
+
+NumericEquiDepthHistogram histogram = new 
NumericEquiDepthHistogram(numBuckets);
+
+double q = 1.0/numBuckets;
+int i = 0;
+for (; i < numBuckets; i++) {
+  // get the starting point of the i-th quantile
+  double start = tdigest.quantile(q * i);
+  histogram.buckets[i] = start;
+}
+// for the N-th bucket, the end point corresponds to the 1.0 quantile but 
we don't keep the end
+// points; only the start point, so this is stored as the start point of 
the (N+1)th bucket
+histogram.buckets[i] = tdigest.quantile(1.0);
+
+// each bucket stores approx equal number of rows
+histogram.setNumRowsPerBucket(tdigest.size()/numBuckets);
 
 Review comment:
   `tdigest.size()` would be equal to the sampled rows? In that case we should 
use the rowcount (which would have done the scaling to account for sampling).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] gparai commented on a change in pull request #1715: DRILL-7117: Support creation of equi-depth histogram for selected dat…

2019-03-27 Thread GitBox

gparai commented on a change in pull request #1715: DRILL-7117: Support 
creation of equi-depth histogram for selected dat…
URL: https://github.com/apache/drill/pull/1715#discussion_r269731108
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/TDigestFunctions.java
 ##
 @@ -0,0 +1,1126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.expr.fn.impl;
+
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.exec.expr.DrillAggFunc;
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate.FunctionScope;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate.NullHandling;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.annotations.Workspace;
+import org.apache.drill.exec.expr.holders.BigIntHolder;
+import org.apache.drill.exec.expr.holders.NullableBigIntHolder;
+import org.apache.drill.exec.expr.holders.BitHolder;
+import org.apache.drill.exec.expr.holders.NullableBitHolder;
+import org.apache.drill.exec.expr.holders.NullableIntHolder;
+import org.apache.drill.exec.expr.holders.NullableFloat8Holder;
+import org.apache.drill.exec.expr.holders.NullableFloat4Holder;
+import org.apache.drill.exec.expr.holders.Float8Holder;
+import org.apache.drill.exec.expr.holders.Float4Holder;
+import org.apache.drill.exec.expr.holders.DateHolder;
+import org.apache.drill.exec.expr.holders.TimeHolder;
+import org.apache.drill.exec.expr.holders.TimeStampHolder;
+import org.apache.drill.exec.expr.holders.NullableDateHolder;
+import org.apache.drill.exec.expr.holders.NullableTimeHolder;
+import org.apache.drill.exec.expr.holders.NullableTimeStampHolder;
+import org.apache.drill.exec.expr.holders.ObjectHolder;
+import org.apache.drill.exec.expr.holders.IntHolder;
+import org.apache.drill.exec.expr.holders.NullableVarBinaryHolder;
+import org.apache.drill.exec.expr.holders.VarCharHolder;
+import org.apache.drill.exec.expr.holders.VarBinaryHolder;
+import org.apache.drill.exec.expr.holders.NullableVarCharHolder;
+import org.apache.drill.exec.server.options.OptionManager;
+import org.apache.drill.exec.vector.AllocationHelper;
+import org.apache.drill.exec.vector.RepeatedFloat8Vector;
+
+import javax.inject.Inject;
+
+@SuppressWarnings("unused")
+public class TDigestFunctions {
+  static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(TDigestFunctions.class);
+
+  private TDigestFunctions(){}
+
+  @FunctionTemplate(name = "tdigest", scope = 
FunctionTemplate.FunctionScope.POINT_AGGREGATE)
+  public static class BigIntTDigestFunction implements DrillAggFunc {
+@Param BigIntHolder in;
+@Workspace ObjectHolder work;
+@Output NullableVarBinaryHolder out;
+@Inject DrillBuf buffer;
+@Inject OptionManager options;
+@Workspace IntHolder compression;
+
+@Override
+public void setup() {
+  work = new ObjectHolder();
+  compression.value = (int) 
options.getLong(org.apache.drill.exec.ExecConstants.TDIGEST_COMPRESSION);
+  work.obj = new 
com.clearspring.analytics.stream.quantile.TDigest(compression.value);
+}
+
+@Override
+public void add() {
+  if (work.obj != null) {
+com.clearspring.analytics.stream.quantile.TDigest tdigest = 
(com.clearspring.analytics.stream.quantile.TDigest) work.obj;
+tdigest.add(in.value);
+  }
+}
+
+@Override
+public void output() {
+  if (work.obj != null) {
+com.clearspring.analytics.stream.quantile.TDigest tdigest = 
(com.clearspring.analytics.stream.quantile.TDigest) work.obj;
+try {
+  int size = tdigest.smallByteSize();
+  java.nio.ByteBuffer byteBuf = java.nio.ByteBuffer.allocate(size);
+  tdigest.asSmallBytes(byteBuf);
+  out.buffer = buffer.reallocIfNeeded(size);
+  out.start = 0;
+  out.end = size;
+  out.buffer.setBytes(0, byteBuf.array());
+  out.isSet = 1;
+} catch (Exception e) {
+  throw new 
org.apache.drill.common.exceptions.DrillRunt

[GitHub] [drill] gparai commented on a change in pull request #1715: DRILL-7117: Support creation of equi-depth histogram for selected dat…

2019-03-27 Thread GitBox

gparai commented on a change in pull request #1715: DRILL-7117: Support 
creation of equi-depth histogram for selected dat…
URL: https://github.com/apache/drill/pull/1715#discussion_r269748728
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/statistics/TDigestMergedStatistic.java
 ##
 @@ -0,0 +1,135 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.physical.impl.statistics;
+
+// Library implementing TDigest algorithm to derive approximate quantiles. 
Please refer to:
+// 'Computing Extremely Accurate Quantiles using t-Digests' by Ted Dunning and 
Otmar Ertl
+
+import com.clearspring.analytics.stream.quantile.TDigest;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.server.options.OptionManager;
+import org.apache.drill.exec.vector.NullableVarBinaryVector;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.complex.MapVector;
+
+import java.util.HashMap;
+import java.util.Map;
+import java.nio.ByteBuffer;
+
+public class TDigestMergedStatistic extends AbstractMergedStatistic {
+  private Map tdigestHolder;
+  private int compression;
+
+  public TDigestMergedStatistic() {
+this.tdigestHolder = new HashMap<>();
+state = State.INIT;
+  }
+
+  @Override
+  public void initialize(String inputName, double samplePercent) {
+super.initialize(Statistic.TDIGEST_MERGE, inputName, samplePercent);
+state = State.CONFIG;
+  }
+
+  @Override
+  public String getName() {
+return name;
+  }
+
+  @Override
+  public String getInput() {
+return inputName;
+  }
+
+  @Override
+  public void merge(MapVector input) {
+// Check the input is a Map Vector
+assert (input.getField().getType().getMinorType() == 
TypeProtos.MinorType.MAP);
+for (ValueVector vv : input) {
+  String colName = vv.getField().getName();
+  TDigest colTdigestHolder = null;
+  if (tdigestHolder.get(colName) != null) {
+colTdigestHolder = tdigestHolder.get(colName);
+  }
+  NullableVarBinaryVector tdigestVector = (NullableVarBinaryVector) vv;
+  NullableVarBinaryVector.Accessor accessor = tdigestVector.getAccessor();
+
+  try {
+if (!accessor.isNull(0)) {
+  TDigest other = TDigest.fromBytes(ByteBuffer.wrap(accessor.get(0)));
+  if (colTdigestHolder != null) {
+colTdigestHolder.add(other);
+tdigestHolder.put(colName, colTdigestHolder);
+  } else {
+tdigestHolder.put(colName, other);
+  }
+}
+  } catch (Exception ex) {
+//TODO: Catch IOException/CardinalityMergeException
 
 Review comment:
   Missing?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] gparai commented on a change in pull request #1715: DRILL-7117: Support creation of equi-depth histogram for selected dat…

2019-03-27 Thread GitBox

gparai commented on a change in pull request #1715: DRILL-7117: Support 
creation of equi-depth histogram for selected dat…
URL: https://github.com/apache/drill/pull/1715#discussion_r269730355
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/TDigestFunctions.java
 ##
 @@ -0,0 +1,1126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.expr.fn.impl;
+
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.exec.expr.DrillAggFunc;
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate.FunctionScope;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate.NullHandling;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.annotations.Workspace;
+import org.apache.drill.exec.expr.holders.BigIntHolder;
+import org.apache.drill.exec.expr.holders.NullableBigIntHolder;
+import org.apache.drill.exec.expr.holders.BitHolder;
+import org.apache.drill.exec.expr.holders.NullableBitHolder;
+import org.apache.drill.exec.expr.holders.NullableIntHolder;
+import org.apache.drill.exec.expr.holders.NullableFloat8Holder;
+import org.apache.drill.exec.expr.holders.NullableFloat4Holder;
+import org.apache.drill.exec.expr.holders.Float8Holder;
+import org.apache.drill.exec.expr.holders.Float4Holder;
+import org.apache.drill.exec.expr.holders.DateHolder;
+import org.apache.drill.exec.expr.holders.TimeHolder;
+import org.apache.drill.exec.expr.holders.TimeStampHolder;
+import org.apache.drill.exec.expr.holders.NullableDateHolder;
+import org.apache.drill.exec.expr.holders.NullableTimeHolder;
+import org.apache.drill.exec.expr.holders.NullableTimeStampHolder;
+import org.apache.drill.exec.expr.holders.ObjectHolder;
+import org.apache.drill.exec.expr.holders.IntHolder;
+import org.apache.drill.exec.expr.holders.NullableVarBinaryHolder;
+import org.apache.drill.exec.expr.holders.VarCharHolder;
+import org.apache.drill.exec.expr.holders.VarBinaryHolder;
+import org.apache.drill.exec.expr.holders.NullableVarCharHolder;
+import org.apache.drill.exec.server.options.OptionManager;
+import org.apache.drill.exec.vector.AllocationHelper;
+import org.apache.drill.exec.vector.RepeatedFloat8Vector;
+
+import javax.inject.Inject;
+
+@SuppressWarnings("unused")
+public class TDigestFunctions {
+  static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(TDigestFunctions.class);
+
+  private TDigestFunctions(){}
+
+  @FunctionTemplate(name = "tdigest", scope = 
FunctionTemplate.FunctionScope.POINT_AGGREGATE)
+  public static class BigIntTDigestFunction implements DrillAggFunc {
+@Param BigIntHolder in;
+@Workspace ObjectHolder work;
+@Output NullableVarBinaryHolder out;
+@Inject DrillBuf buffer;
+@Inject OptionManager options;
+@Workspace IntHolder compression;
+
+@Override
+public void setup() {
+  work = new ObjectHolder();
+  compression.value = (int) 
options.getLong(org.apache.drill.exec.ExecConstants.TDIGEST_COMPRESSION);
+  work.obj = new 
com.clearspring.analytics.stream.quantile.TDigest(compression.value);
+}
+
+@Override
+public void add() {
+  if (work.obj != null) {
+com.clearspring.analytics.stream.quantile.TDigest tdigest = 
(com.clearspring.analytics.stream.quantile.TDigest) work.obj;
+tdigest.add(in.value);
+  }
+}
+
+@Override
+public void output() {
+  if (work.obj != null) {
+com.clearspring.analytics.stream.quantile.TDigest tdigest = 
(com.clearspring.analytics.stream.quantile.TDigest) work.obj;
+try {
+  int size = tdigest.smallByteSize();
+  java.nio.ByteBuffer byteBuf = java.nio.ByteBuffer.allocate(size);
+  tdigest.asSmallBytes(byteBuf);
+  out.buffer = buffer.reallocIfNeeded(size);
+  out.start = 0;
+  out.end = size;
+  out.buffer.setBytes(0, byteBuf.array());
+  out.isSet = 1;
+} catch (Exception e) {
+  throw new 
org.apache.drill.common.exceptions.DrillRunt

[GitHub] [drill] gparai commented on a change in pull request #1715: DRILL-7117: Support creation of equi-depth histogram for selected dat…

2019-03-27 Thread GitBox

gparai commented on a change in pull request #1715: DRILL-7117: Support 
creation of equi-depth histogram for selected dat…
URL: https://github.com/apache/drill/pull/1715#discussion_r269757452
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/StringEquiDepthHistogram.java
 ##
 @@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.common;
+
+
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.calcite.rex.RexNode;
+
+/**
+ * A column specific histogram which is meant for string columns
+ */
+@JsonTypeName("string-equi-depth")
+public class StringEquiDepthHistogram implements Histogram {
 
 Review comment:
   Do we want to remove this implementation for now?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] gparai commented on a change in pull request #1715: DRILL-7117: Support creation of equi-depth histogram for selected dat…

2019-03-27 Thread GitBox

gparai commented on a change in pull request #1715: DRILL-7117: Support 
creation of equi-depth histogram for selected dat…
URL: https://github.com/apache/drill/pull/1715#discussion_r269733494
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/TDigestFunctions.java
 ##
 @@ -0,0 +1,1126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.expr.fn.impl;
+
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.exec.expr.DrillAggFunc;
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate.FunctionScope;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate.NullHandling;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.annotations.Workspace;
+import org.apache.drill.exec.expr.holders.BigIntHolder;
+import org.apache.drill.exec.expr.holders.NullableBigIntHolder;
+import org.apache.drill.exec.expr.holders.BitHolder;
+import org.apache.drill.exec.expr.holders.NullableBitHolder;
+import org.apache.drill.exec.expr.holders.NullableIntHolder;
+import org.apache.drill.exec.expr.holders.NullableFloat8Holder;
+import org.apache.drill.exec.expr.holders.NullableFloat4Holder;
+import org.apache.drill.exec.expr.holders.Float8Holder;
+import org.apache.drill.exec.expr.holders.Float4Holder;
+import org.apache.drill.exec.expr.holders.DateHolder;
+import org.apache.drill.exec.expr.holders.TimeHolder;
+import org.apache.drill.exec.expr.holders.TimeStampHolder;
+import org.apache.drill.exec.expr.holders.NullableDateHolder;
+import org.apache.drill.exec.expr.holders.NullableTimeHolder;
+import org.apache.drill.exec.expr.holders.NullableTimeStampHolder;
+import org.apache.drill.exec.expr.holders.ObjectHolder;
+import org.apache.drill.exec.expr.holders.IntHolder;
+import org.apache.drill.exec.expr.holders.NullableVarBinaryHolder;
+import org.apache.drill.exec.expr.holders.VarCharHolder;
+import org.apache.drill.exec.expr.holders.VarBinaryHolder;
+import org.apache.drill.exec.expr.holders.NullableVarCharHolder;
+import org.apache.drill.exec.server.options.OptionManager;
+import org.apache.drill.exec.vector.AllocationHelper;
+import org.apache.drill.exec.vector.RepeatedFloat8Vector;
+
+import javax.inject.Inject;
+
+@SuppressWarnings("unused")
+public class TDigestFunctions {
+  static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(TDigestFunctions.class);
+
+  private TDigestFunctions(){}
+
+  @FunctionTemplate(name = "tdigest", scope = 
FunctionTemplate.FunctionScope.POINT_AGGREGATE)
+  public static class BigIntTDigestFunction implements DrillAggFunc {
+@Param BigIntHolder in;
+@Workspace ObjectHolder work;
+@Output NullableVarBinaryHolder out;
+@Inject DrillBuf buffer;
+@Inject OptionManager options;
+@Workspace IntHolder compression;
+
+@Override
+public void setup() {
+  work = new ObjectHolder();
+  compression.value = (int) 
options.getLong(org.apache.drill.exec.ExecConstants.TDIGEST_COMPRESSION);
+  work.obj = new 
com.clearspring.analytics.stream.quantile.TDigest(compression.value);
+}
+
+@Override
+public void add() {
+  if (work.obj != null) {
+com.clearspring.analytics.stream.quantile.TDigest tdigest = 
(com.clearspring.analytics.stream.quantile.TDigest) work.obj;
+tdigest.add(in.value);
+  }
+}
+
+@Override
+public void output() {
+  if (work.obj != null) {
+com.clearspring.analytics.stream.quantile.TDigest tdigest = 
(com.clearspring.analytics.stream.quantile.TDigest) work.obj;
+try {
+  int size = tdigest.smallByteSize();
+  java.nio.ByteBuffer byteBuf = java.nio.ByteBuffer.allocate(size);
+  tdigest.asSmallBytes(byteBuf);
+  out.buffer = buffer.reallocIfNeeded(size);
+  out.start = 0;
+  out.end = size;
+  out.buffer.setBytes(0, byteBuf.array());
+  out.isSet = 1;
+} catch (Exception e) {
+  throw new 
org.apache.drill.common.exceptions.DrillRunt

[GitHub] [drill] gparai commented on a change in pull request #1715: DRILL-7117: Support creation of equi-depth histogram for selected dat…

2019-03-27 Thread GitBox

gparai commented on a change in pull request #1715: DRILL-7117: Support 
creation of equi-depth histogram for selected dat…
URL: https://github.com/apache/drill/pull/1715#discussion_r269752998
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/statistics/TDigestMergedStatistic.java
 ##
 @@ -0,0 +1,135 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.physical.impl.statistics;
+
+// Library implementing TDigest algorithm to derive approximate quantiles. 
Please refer to:
+// 'Computing Extremely Accurate Quantiles using t-Digests' by Ted Dunning and 
Otmar Ertl
+
+import com.clearspring.analytics.stream.quantile.TDigest;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.server.options.OptionManager;
+import org.apache.drill.exec.vector.NullableVarBinaryVector;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.complex.MapVector;
+
+import java.util.HashMap;
+import java.util.Map;
+import java.nio.ByteBuffer;
+
+public class TDigestMergedStatistic extends AbstractMergedStatistic {
+  private Map tdigestHolder;
+  private int compression;
+
+  public TDigestMergedStatistic() {
+this.tdigestHolder = new HashMap<>();
+state = State.INIT;
+  }
+
+  @Override
+  public void initialize(String inputName, double samplePercent) {
+super.initialize(Statistic.TDIGEST_MERGE, inputName, samplePercent);
+state = State.CONFIG;
+  }
+
+  @Override
+  public String getName() {
+return name;
+  }
+
+  @Override
+  public String getInput() {
+return inputName;
+  }
+
+  @Override
+  public void merge(MapVector input) {
+// Check the input is a Map Vector
+assert (input.getField().getType().getMinorType() == 
TypeProtos.MinorType.MAP);
+for (ValueVector vv : input) {
+  String colName = vv.getField().getName();
+  TDigest colTdigestHolder = null;
+  if (tdigestHolder.get(colName) != null) {
+colTdigestHolder = tdigestHolder.get(colName);
+  }
+  NullableVarBinaryVector tdigestVector = (NullableVarBinaryVector) vv;
+  NullableVarBinaryVector.Accessor accessor = tdigestVector.getAccessor();
+
+  try {
+if (!accessor.isNull(0)) {
+  TDigest other = TDigest.fromBytes(ByteBuffer.wrap(accessor.get(0)));
+  if (colTdigestHolder != null) {
+colTdigestHolder.add(other);
+tdigestHolder.put(colName, colTdigestHolder);
+  } else {
+tdigestHolder.put(colName, other);
+  }
+}
+  } catch (Exception ex) {
+//TODO: Catch IOException/CardinalityMergeException
+//TODO: logger
+  }
+}
+  }
+
+  public TDigest getStat(String colName) {
+if (state != State.COMPLETE) {
+  throw new IllegalStateException(String.format("Statistic `%s` has not 
completed merging statistics",
+  name));
+}
+return tdigestHolder.get(colName);
+  }
+
+  @Override
+  public void setOutput(MapVector output) {
+// Check the input is a Map Vector
+assert (output.getField().getType().getMinorType() == 
TypeProtos.MinorType.MAP);
+// Dependencies have been configured correctly
+assert (state == State.MERGE);
+for (ValueVector outMapCol : output) {
+  String colName = outMapCol.getField().getName();
+  TDigest colTdigestHolder = tdigestHolder.get(colName);
+  NullableVarBinaryVector vv = (NullableVarBinaryVector) outMapCol;
+  vv.allocateNewSafe();
+  try {
+if (colTdigestHolder != null) {
+  int size = colTdigestHolder.smallByteSize();
+  java.nio.ByteBuffer byteBuf = java.nio.ByteBuffer.allocate(size);
+  colTdigestHolder.asSmallBytes(byteBuf);
+  // NOTE: in setting the VV below, we are using the byte[] instead of 
the ByteBuffer because the
+  // latter was producing incorrect output (after re-reading the data 
from the VV).  It is
+  // unclear whether the setSafe() api for ByteBuffer has a bug, so to 
be safe we are using the
+  // byte[] directly which works.
+  vv.getMutator().setSafe(0, byteBuf.array(), 0, 
by

[GitHub] [drill] gparai commented on a change in pull request #1715: DRILL-7117: Support creation of equi-depth histogram for selected dat…

2019-03-27 Thread GitBox

gparai commented on a change in pull request #1715: DRILL-7117: Support 
creation of equi-depth histogram for selected dat…
URL: https://github.com/apache/drill/pull/1715#discussion_r269752646
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillStatsTable.java
 ##
 @@ -62,10 +62,15 @@
   public enum STATS_VERSION {V0, V1};
   // The current version
   public static final STATS_VERSION CURRENT_VERSION = STATS_VERSION.V1;
+  // 10 histogram buckets (TODO: can make this configurable later)
+  public static final int NUM_HISTOGRAM_BUCKETS = 10;
 
 Review comment:
   Maybe we can look at other implementations to figure out a good default 
value for histogram buckets.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] gparai commented on a change in pull request #1715: DRILL-7117: Support creation of equi-depth histogram for selected dat…

2019-03-27 Thread GitBox

gparai commented on a change in pull request #1715: DRILL-7117: Support 
creation of equi-depth histogram for selected dat…
URL: https://github.com/apache/drill/pull/1715#discussion_r269727107
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/TDigestFunctions.java
 ##
 @@ -0,0 +1,1126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.expr.fn.impl;
+
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.exec.expr.DrillAggFunc;
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate.FunctionScope;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate.NullHandling;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.annotations.Workspace;
+import org.apache.drill.exec.expr.holders.BigIntHolder;
+import org.apache.drill.exec.expr.holders.NullableBigIntHolder;
+import org.apache.drill.exec.expr.holders.BitHolder;
+import org.apache.drill.exec.expr.holders.NullableBitHolder;
+import org.apache.drill.exec.expr.holders.NullableIntHolder;
+import org.apache.drill.exec.expr.holders.NullableFloat8Holder;
+import org.apache.drill.exec.expr.holders.NullableFloat4Holder;
+import org.apache.drill.exec.expr.holders.Float8Holder;
+import org.apache.drill.exec.expr.holders.Float4Holder;
+import org.apache.drill.exec.expr.holders.DateHolder;
+import org.apache.drill.exec.expr.holders.TimeHolder;
+import org.apache.drill.exec.expr.holders.TimeStampHolder;
+import org.apache.drill.exec.expr.holders.NullableDateHolder;
+import org.apache.drill.exec.expr.holders.NullableTimeHolder;
+import org.apache.drill.exec.expr.holders.NullableTimeStampHolder;
+import org.apache.drill.exec.expr.holders.ObjectHolder;
+import org.apache.drill.exec.expr.holders.IntHolder;
+import org.apache.drill.exec.expr.holders.NullableVarBinaryHolder;
+import org.apache.drill.exec.expr.holders.VarCharHolder;
+import org.apache.drill.exec.expr.holders.VarBinaryHolder;
+import org.apache.drill.exec.expr.holders.NullableVarCharHolder;
+import org.apache.drill.exec.server.options.OptionManager;
+import org.apache.drill.exec.vector.AllocationHelper;
+import org.apache.drill.exec.vector.RepeatedFloat8Vector;
+
+import javax.inject.Inject;
+
+@SuppressWarnings("unused")
+public class TDigestFunctions {
+  static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(TDigestFunctions.class);
+
+  private TDigestFunctions(){}
+
+  @FunctionTemplate(name = "tdigest", scope = 
FunctionTemplate.FunctionScope.POINT_AGGREGATE)
+  public static class BigIntTDigestFunction implements DrillAggFunc {
+@Param BigIntHolder in;
+@Workspace ObjectHolder work;
+@Output NullableVarBinaryHolder out;
+@Inject DrillBuf buffer;
 
 Review comment:
   `@Inject DrillBuf buffer;` is not used in the code? Please remove.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] kkhatua commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

kkhatua commented on a change in pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269759078
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/test/Drill2489CallsAfterCloseThrowExceptionsTest.java
 ##
 @@ -537,7 +547,14 @@ public void 
testclosedPreparedStmtOfOpenConnMethodsThrowRight() {
 new ClosedPreparedStatementChecker(PreparedStatement.class,
closedPreparedStmtOfOpenConn);
 
-checker.testAllMethods();
+//List of methods now supported
+Set methodsToSkip = new LinkedHashSet() {{
 
 Review comment:
   Ok. These tests were failing because it we had implemented the 
`setMaxRows()` and `getMaxRows()` methods. These tests, it seems, expected them 
to throw exceptions.
   With the `super.setLargeMaxRows()`, this should not be necessary anymore.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[GitHub] [drill] kkhatua commented on a change in pull request #1714: DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option

2019-03-27 Thread GitBox

kkhatua commented on a change in pull request #1714: DRILL-7048: Implement JDBC 
Statement.setMaxRows() with System Option
URL: https://github.com/apache/drill/pull/1714#discussion_r269763048
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java
 ##
 @@ -462,4 +618,25 @@ public void 
testParamSettingWhenUnsupportedTypeSaysUnsupported() throws SQLExcep
 }
   }
 
+
+  // Sets the SystemMaxRows option
+  private void setSystemMaxRows(int sysValueToSet) throws SQLException {
 
 Review comment:
   There are _getter-setter_ method tests that don't rely on what the `SYSTEM` 
value is, so we don't need to acquire explicit locks. Only when a query is 
being executed, do we need to ensure that changes to the `SYSTEM` value are 
synchronized because that is when `QueryContext` will apply a `SESSION` or 
`SYSTEM` value.  Using the lock to synchronize the change, indirectly 
synchronizes these specific tests as well.
   We don't have any other control on the parallelism or order of the tests 
within the 2 suites (`PreparedStatementTest` and `StatementTest`)... so using a 
common lock across both tests is the safest way to run such tests with an 
element of randomness (using the `RANDOMIZER` object). 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

[jira] [Created] (DRILL-7136) Num_buckets for HashAgg in profile may be inaccurate

2019-03-27 Thread Robert Hou (JIRA)

Robert Hou created DRILL-7136:
-

 Summary: Num_buckets for HashAgg in profile may be inaccurate
 Key: DRILL-7136
 URL: https://issues.apache.org/jira/browse/DRILL-7136
 Project: Apache Drill
  Issue Type: Bug
  Components: Tools, Build & Test
Affects Versions: 1.16.0
Reporter: Robert Hou
Assignee: Pritesh Maker
 Fix For: 1.16.0
 Attachments: 23650ee5-6721-8a8f-7dd3-f5dd09a3a7b0.sys.drill

I ran TPCH query 17 with sf 1000.  Here is the query:
{noformat}
select
  sum(l.l_extendedprice) / 7.0 as avg_yearly
from
  lineitem l,
  part p
where
  p.p_partkey = l.l_partkey
  and p.p_brand = 'Brand#13'
  and p.p_container = 'JUMBO CAN'
  and l.l_quantity < (
select
  0.2 * avg(l2.l_quantity)
from
  lineitem l2
where
  l2.l_partkey = p.p_partkey
  );
{noformat}

One of the hash agg operators has resized 6 times.  It should have 4M buckets.  
But the profile shows it has 64K buckets.



I have attached a sample profile.  In this profile, the hash agg operator is 
(04-02).
{noformat}
Operator Metrics
Minor Fragment  NUM_BUCKETS NUM_ENTRIES NUM_RESIZING
RESIZING_TIME_MSNUM_PARTITIONS  SPILLED_PARTITIONS  SPILL_MB
SPILL_CYCLE INPUT_BATCH_COUNT   AVG_INPUT_BATCH_BYTES   
AVG_INPUT_ROW_BYTES INPUT_RECORD_COUNT  OUTPUT_BATCH_COUNT  
AVG_OUTPUT_BATCH_BYTES  AVG_OUTPUT_ROW_BYTESOUTPUT_RECORD_COUNT
04-00-0265,536  748,746 6   364 1   582 0   
813 582,653 18  26,316,456  401 1,631,943   25  
26,176,350
{noformat}





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (DRILL-7137) Implement unit test case to test Drill client <-> interaction in a secure setup

2019-03-27 Thread Karthikeyan Manivannan (JIRA)

Karthikeyan Manivannan created DRILL-7137:
-

 Summary: Implement unit test case to test Drill client <-> 
interaction in a secure setup 
 Key: DRILL-7137
 URL: https://issues.apache.org/jira/browse/DRILL-7137
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Karthikeyan Manivannan


Implement a unit testcase fo DRILL-7101

>From the PR https://github.com/apache/drill/pull/1702

"Writing a test where the Drillbits (inside ClusterFixture) are setup with 
ZK_APPLY_SECURE_ACL=false (to avoid the need to setup a secure ZK server within 
the unit test) and the ClientFixture is setup with ZK_APPLY_SECURE_ACL=true (to 
simulate the failure). Starting a test with different values for the same 
property turns out to be quite hard because the ClusterFixture internally 
instantiates a ClientFixure. Changing this behavior might affect other tests."



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[GitHub] [drill] gparai opened a new pull request #1718: DRILL-7121: Use correct ndv when statistics is disabled

2019-03-27 Thread GitBox

gparai opened a new pull request #1718: DRILL-7121: Use correct ndv when 
statistics is disabled
URL: https://github.com/apache/drill/pull/1718
 
 
   @amansinha100 can you please review the PR? Thanks!


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

61 matches

Mail list logo