[jira] [Commented] (DRILL-5387) TestBitBitKerberos and TestUserBitKerberos cause sporadic unit test failures

2017-04-05 Thread Abhishek Girish (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958270#comment-15958270
 ] 

Abhishek Girish commented on DRILL-5387:


I'm observing these failures consistently in my environment. 

> TestBitBitKerberos and TestUserBitKerberos cause sporadic unit test failures
> 
>
> Key: DRILL-5387
> URL: https://issues.apache.org/jira/browse/DRILL-5387
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Sudheesh Katkam
>Assignee: Sudheesh Katkam
>
> TestOptionsAuthEnabled and TestInboundImpersonation sporadically fail. There 
> is a [Java 
> trick|https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/hadoop/security/UgiTestUtil.java#L29]
>  to reset some static state in TestUserBitKerberos and TestBitBitKerberos to 
> ensure the JVM is reusable for other tests as done in the [Hadoop auth 
> tests|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/security/TestUGIWithMiniKdc.java#L53],
>  but this trick does not always seem to work. So disable these tests. In the 
> future, maybe the tests can be run separately through surefire but not as 
> part of the default build?
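
The problem described above, static state left behind by one test and seen by the next test in a reused JVM, can be illustrated with a minimal sketch. `LoginState`, `login` and `resetForTest` are hypothetical names chosen for illustration; they are not Hadoop or Drill APIs, and the linked `UgiTestUtil` may work differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a class that caches login state in static fields.
final class LoginState {
  // Static fields survive across test classes that share one JVM.
  private static String currentUser;
  private static final List<String> history = new ArrayList<>();

  static void login(String user) {
    currentUser = user;
    history.add(user);
  }

  static String currentUser() {
    return currentUser;
  }

  static int loginCount() {
    return history.size();
  }

  // Test-only hook: restore the state a fresh JVM would start with.
  static void resetForTest() {
    currentUser = null;
    history.clear();
  }

  public static void main(String[] args) {
    login("kerberos-test-principal");
    // Without a reset, the next test in this JVM would see the stale user.
    System.out.println("before reset: " + currentUser());
    resetForTest();
    System.out.println("after reset: " + currentUser());
  }
}
```

If any static field is missed by the reset, later tests can still observe it, which is one way such a trick can work only some of the time.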



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (DRILL-5417) Unit test failures in TestBitBitKerberos and TestUserBitKerberos

2017-04-05 Thread Abhishek Girish (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish resolved DRILL-5417.

Resolution: Duplicate

> Unit test failures in TestBitBitKerberos and TestUserBitKerberos
> 
>
> Key: DRILL-5417
> URL: https://issues.apache.org/jira/browse/DRILL-5417
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
> Environment: CentOS Linux release 7.3.1611 (Core)
> Commit ID: 06e1522b5ddf7e15d49921be1d9323f1e09273b0
>Reporter: Abhishek Girish
>
> Tests in error: 
> {code}
>   TestBitBitKerberos.setupKdc:100 » Runtime Unable to parse:includedir /etc/krb5...
>   TestUserBitKerberos.setupKdc:73 » Runtime Unable to parse:includedir /etc/krb5...
> {code}
> TestBitBitKerberos details:
> {code}
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.033 sec <<< FAILURE! - in org.apache.drill.exec.rpc.data.TestBitBitKerberos
> org.apache.drill.exec.rpc.data.TestBitBitKerberos  Time elapsed: 4.033 sec  <<< ERROR!
> java.lang.RuntimeException: Unable to parse:includedir /etc/krb5.conf.d/
>   at org.apache.kerby.kerberos.kerb.common.Krb5Parser.load(Krb5Parser.java:72)
>   at org.apache.kerby.kerberos.kerb.common.Krb5Conf.addKrb5Config(Krb5Conf.java:47)
>   at org.apache.kerby.kerberos.kerb.client.ClientUtil.getDefaultConfig(ClientUtil.java:94)
>   at org.apache.kerby.kerberos.kerb.client.KrbClientBase.<init>(KrbClientBase.java:51)
>   at org.apache.kerby.kerberos.kerb.client.KrbClient.<init>(KrbClient.java:38)
>   at org.apache.kerby.kerberos.kerb.server.SimpleKdcServer.<init>(SimpleKdcServer.java:54)
>   at org.apache.drill.exec.rpc.data.TestBitBitKerberos.setupKdc(TestBitBitKerberos.java:100)
> {code}
> TestUserBitKerberos details:
> {code}
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 2.41 sec <<< FAILURE! - in org.apache.drill.exec.rpc.user.security.TestUserBitKerberos
> org.apache.drill.exec.rpc.user.security.TestUserBitKerberos  Time elapsed: 2.41 sec  <<< ERROR!
> java.lang.RuntimeException: Unable to parse:includedir /etc/krb5.conf.d/
>   at org.apache.kerby.kerberos.kerb.common.Krb5Parser.load(Krb5Parser.java:72)
>   at org.apache.kerby.kerberos.kerb.common.Krb5Conf.addKrb5Config(Krb5Conf.java:47)
>   at org.apache.kerby.kerberos.kerb.client.ClientUtil.getDefaultConfig(ClientUtil.java:94)
>   at org.apache.kerby.kerberos.kerb.client.KrbClientBase.<init>(KrbClientBase.java:51)
>   at org.apache.kerby.kerberos.kerb.client.KrbClient.<init>(KrbClient.java:38)
>   at org.apache.kerby.kerberos.kerb.server.SimpleKdcServer.<init>(SimpleKdcServer.java:54)
>   at org.apache.drill.exec.rpc.user.security.TestUserBitKerberos.setupKdc(TestUserBitKerberos.java:73)
> {code}





[jira] [Created] (DRILL-5418) Unit test failures in TestMergeJoinWithSchemaChanges

2017-04-05 Thread Abhishek Girish (JIRA)
Abhishek Girish created DRILL-5418:
--

 Summary: Unit test failures in TestMergeJoinWithSchemaChanges
 Key: DRILL-5418
 URL: https://issues.apache.org/jira/browse/DRILL-5418
 Project: Apache Drill
  Issue Type: Bug
  Components: Tools, Build & Test
Affects Versions: 1.11.0
 Environment: CentOS Linux release 7.3.1611 (Core)
Commit ID: 06e1522b5ddf7e15d49921be1d9323f1e09273b0
Reporter: Abhishek Girish


The following test fails intermittently
{code}
TestMergeJoinWithSchemaChanges.testMissingAndNewColumns:265->BaseTestQuery.testRunAndReturn:331 » Rpc
{code}

Details:
{code}
org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges
testMissingAndNewColumns(org.apache.drill.exec.physical.impl.join.TestMergeJoinWithSchemaChanges)
  Time elapsed: 0.541 sec  <<< ERROR!
org.apache.drill.exec.rpc.RpcException: org.apache.drill.common.exceptions.UserRemoteException: UNSUPPORTED_OPERATION ERROR: Sort doesn't currently support sorts with changing schemas

Fragment 0:0

[Error Id: 669a7a03-4c61-453f-80cf-67303ad21cad on cv6:31010]

  (org.apache.drill.exec.exception.SchemaChangeException) Sort currently only supports a single schema.
org.apache.drill.exec.physical.impl.sort.SortRecordBatchBuilder.build():146
org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext():478
org.apache.drill.exec.record.AbstractRecordBatch.next():162

org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():215
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51

org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():93
org.apache.drill.exec.record.AbstractRecordBatch.next():162

org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():215
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.RecordIterator.nextBatch():99
org.apache.drill.exec.record.RecordIterator.next():185
org.apache.drill.exec.record.RecordIterator.prepare():169
org.apache.drill.exec.physical.impl.join.JoinStatus.prepare():87
org.apache.drill.exec.physical.impl.join.MergeJoinBatch.innerNext():160
org.apache.drill.exec.record.AbstractRecordBatch.next():162

org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():215
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51

org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135
org.apache.drill.exec.record.AbstractRecordBatch.next():162

org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():215
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51

org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135
org.apache.drill.exec.record.AbstractRecordBatch.next():162

org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():215
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51

org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135
org.apache.drill.exec.record.AbstractRecordBatch.next():162

org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():215
org.apache.drill.exec.physical.impl.BaseRootExec.next():104
org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
org.apache.drill.exec.physical.impl.BaseRootExec.next():94
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():232
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():226
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():421
org.apache.hadoop.security.UserGroupInformation.doAs():1657
org.apache.drill.exec.work.fragment.FragmentExecutor.run():226
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1145
java.util.concurrent.ThreadPoolExecutor$Worker.run():615
java.lang.Thread.run():745

at org.apache.drill.exec.rpc.RpcException.mapException(RpcException.java:60)
at org.apache.drill.exec.client.DrillClient$ListHoldingResultsListener.getResults(DrillClient.java:854)
at org.apache.drill.exec.client.DrillClient.runQuery(DrillClient.java:556)

[jira] [Created] (DRILL-5417) Unit test failures in TestBitBitKerberos and TestUserBitKerberos

2017-04-05 Thread Abhishek Girish (JIRA)
Abhishek Girish created DRILL-5417:
--

 Summary: Unit test failures in TestBitBitKerberos and 
TestUserBitKerberos
 Key: DRILL-5417
 URL: https://issues.apache.org/jira/browse/DRILL-5417
 Project: Apache Drill
  Issue Type: Bug
  Components: Tools, Build & Test
Affects Versions: 1.11.0
 Environment: CentOS Linux release 7.3.1611 (Core)
Commit ID: 06e1522b5ddf7e15d49921be1d9323f1e09273b0
Reporter: Abhishek Girish


Tests in error: 
{code}
  TestBitBitKerberos.setupKdc:100 » Runtime Unable to parse:includedir /etc/krb5...
  TestUserBitKerberos.setupKdc:73 » Runtime Unable to parse:includedir /etc/krb5...
{code}
TestBitBitKerberos details:
{code}
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.033 sec <<< FAILURE! - in org.apache.drill.exec.rpc.data.TestBitBitKerberos
org.apache.drill.exec.rpc.data.TestBitBitKerberos  Time elapsed: 4.033 sec  <<< ERROR!
java.lang.RuntimeException: Unable to parse:includedir /etc/krb5.conf.d/
at org.apache.kerby.kerberos.kerb.common.Krb5Parser.load(Krb5Parser.java:72)
at org.apache.kerby.kerberos.kerb.common.Krb5Conf.addKrb5Config(Krb5Conf.java:47)
at org.apache.kerby.kerberos.kerb.client.ClientUtil.getDefaultConfig(ClientUtil.java:94)
at org.apache.kerby.kerberos.kerb.client.KrbClientBase.<init>(KrbClientBase.java:51)
at org.apache.kerby.kerberos.kerb.client.KrbClient.<init>(KrbClient.java:38)
at org.apache.kerby.kerberos.kerb.server.SimpleKdcServer.<init>(SimpleKdcServer.java:54)
at org.apache.drill.exec.rpc.data.TestBitBitKerberos.setupKdc(TestBitBitKerberos.java:100)
{code}
TestUserBitKerberos details:
{code}
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 2.41 sec <<< FAILURE! - in org.apache.drill.exec.rpc.user.security.TestUserBitKerberos
org.apache.drill.exec.rpc.user.security.TestUserBitKerberos  Time elapsed: 2.41 sec  <<< ERROR!
java.lang.RuntimeException: Unable to parse:includedir /etc/krb5.conf.d/
at org.apache.kerby.kerberos.kerb.common.Krb5Parser.load(Krb5Parser.java:72)
at org.apache.kerby.kerberos.kerb.common.Krb5Conf.addKrb5Config(Krb5Conf.java:47)
at org.apache.kerby.kerberos.kerb.client.ClientUtil.getDefaultConfig(ClientUtil.java:94)
at org.apache.kerby.kerberos.kerb.client.KrbClientBase.<init>(KrbClientBase.java:51)
at org.apache.kerby.kerberos.kerb.client.KrbClient.<init>(KrbClient.java:38)
at org.apache.kerby.kerberos.kerb.server.SimpleKdcServer.<init>(SimpleKdcServer.java:54)
at org.apache.drill.exec.rpc.user.security.TestUserBitKerberos.setupKdc(TestUserBitKerberos.java:73)
{code}
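
The traces above suggest that the parser was handed a krb5 configuration containing an `includedir` directive and rejected that line. The following sketch shows how a strict line-oriented parser can fail in this way, and how skipping include directives would avoid the error. `StrictKrb5ConfSketch` and its `skipIncludes` flag are hypothetical and are not Kerby's implementation; the sketch also ignores the nested `{ ... }` blocks that real krb5.conf files may contain.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical strict parser: accepts only "[section]" headers,
// "key = value" pairs, comments and blank lines.
final class StrictKrb5ConfSketch {

  static List<String> parse(List<String> lines, boolean skipIncludes) {
    List<String> accepted = new ArrayList<>();
    for (String raw : lines) {
      String line = raw.trim();
      if (line.isEmpty() || line.startsWith("#")) {
        continue; // nothing to parse on blank or comment lines
      }
      boolean isInclude = line.startsWith("includedir ") || line.startsWith("include ");
      if (isInclude && skipIncludes) {
        continue; // tolerant mode: ignore directives this parser cannot follow
      }
      boolean isSection = line.startsWith("[") && line.endsWith("]");
      boolean isKeyValue = line.contains("=");
      if (!isSection && !isKeyValue) {
        // Same message shape as in the reported stack trace.
        throw new RuntimeException("Unable to parse:" + line);
      }
      accepted.add(line);
    }
    return accepted;
  }

  public static void main(String[] args) {
    List<String> conf = Arrays.asList(
        "includedir /etc/krb5.conf.d/",
        "[libdefaults]",
        "default_realm = EXAMPLE.COM");
    System.out.println(parse(conf, true));
    try {
      parse(conf, false);
    } catch (RuntimeException e) {
      System.out.println(e.getMessage());
    }
  }
}
```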





[jira] [Updated] (DRILL-5395) Query on MapR-DB table fails with NPE due to an issue with assignment logic

2017-04-05 Thread Padma Penumarthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Padma Penumarthy updated DRILL-5395:

Labels: MapR-DB-Binary ready-to-commit  (was: MapR-DB-Binary)

> Query on MapR-DB table fails with NPE due to an issue with assignment logic
> ---
>
> Key: DRILL-5395
> URL: https://issues.apache.org/jira/browse/DRILL-5395
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization, Storage - MapRDB
>Affects Versions: 1.9.0, 1.10.0
>Reporter: Abhishek Girish
>Assignee: Padma Penumarthy
>  Labels: MapR-DB-Binary, ready-to-commit
> Fix For: 1.11.0
>
> Attachments: drillbit.log.txt
>
>
> We uncovered this issue when working on DRILL-5394. 
> The MapR-DB table in question had 5 tablets with skewed data distribution (~6 
> million rows). A partial WIP fix for DRILL-5394 caused the number of rows to 
> be reported incorrectly (~300,000). 2 minor fragments were created (due to 
> filter selectivity) for scanning the 5 tablets. And this resulted in an NPE, 
> possibly related to an issue with assignment logic, that was now exposed. 
> Representative query:
> {code}
> SELECT Convert_from(avail.customer, 'UTF8') AS ABC, 
>Convert_from(prop.customer, 'UTF8')  AS PQR 
> FROM   (SELECT Convert_from(a.row_key, 'UTF8') 
>AS customer, 
>Cast(Convert_from(a.data .` l_discount ` , 'double_be') AS 
> FLOAT) 
>AS availability 
> FROM   db.tpch_maprdb.lineitem_1 a 
> WHERE  Convert_from(a.row_key, 'UTF8') = '%004%') AS avail 
>join 
>   (SELECT Convert_from(b.row_key, 'UTF8') 
>   AS customer, 
>Cast( 
>Convert_from(b.data .` l_discount ` , 'double_be') AS FLOAT) AS 
>availability 
> FROM   db.tpch_maprdb.lineitem_1 b 
> WHERE  Convert_from(b.row_key, 'UTF8') LIKE '%003%') AS prop 
>  ON avail.customer = prop.customer; 
> {code}
> Error:
> {code}
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> NullPointerException
> {code}
> Log attached. 
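
For readers unfamiliar with the scenario above (5 tablets scanned by 2 minor fragments), here is a minimal round-robin sketch of assigning scan units to fragments. It is not Drill's assignment code, and `RoundRobinAssignmentSketch` is a hypothetical name. It only shows the invariant such logic typically needs: every fragment ends up with a non-null, possibly empty, list of work.

```java
import java.util.ArrayList;
import java.util.List;

final class RoundRobinAssignmentSketch {

  // Distribute work-unit ids 0..unitCount-1 over fragmentCount fragments.
  static List<List<Integer>> assign(int unitCount, int fragmentCount) {
    if (fragmentCount <= 0) {
      throw new IllegalArgumentException("fragmentCount must be positive");
    }
    List<List<Integer>> perFragment = new ArrayList<>();
    for (int f = 0; f < fragmentCount; f++) {
      perFragment.add(new ArrayList<>()); // never leave a fragment slot null
    }
    for (int u = 0; u < unitCount; u++) {
      perFragment.get(u % fragmentCount).add(u);
    }
    return perFragment;
  }

  public static void main(String[] args) {
    // 5 tablets over 2 minor fragments, as in the report.
    System.out.println(assign(5, 2)); // prints [[0, 2, 4], [1, 3]]
  }
}
```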





[jira] [Resolved] (DRILL-4907) Wrong default value mentioned in documentation for "planner.width.max_per_node"

2017-04-05 Thread Bridget Bevens (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bridget Bevens resolved DRILL-4907.
---
Resolution: Fixed
  Assignee: Bridget Bevens

Updated the default value description as requested.
Setting the status to Resolved.
Thanks,
Bridget

> Wrong default value mentioned in documentation for 
> "planner.width.max_per_node"
> ---
>
> Key: DRILL-4907
> URL: https://issues.apache.org/jira/browse/DRILL-4907
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.8.0
>Reporter: Rahul Challapalli
>Assignee: Bridget Bevens
>Priority: Minor
>
> From the documentation of config options at [1], the default value for 
> "planner.width.max_per_node" is mentioned as 3. This should be updated to 70% 
> of the total processors on a node.
> [1] https://drill.apache.org/docs/configuration-options-introduction/
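
As a worked example of the "70% of the total processors" rule, the sketch below computes the value for a given core count. Rounding up is an assumption made here, since the issue text does not say how fractions are handled, and `MaxWidthPerNodeSketch` is a hypothetical helper rather than Drill code.

```java
final class MaxWidthPerNodeSketch {

  // ceil(0.70 * processors), computed with integer arithmetic to avoid
  // floating-point rounding surprises. Rounding up is an assumption.
  static int defaultWidth(int processors) {
    if (processors < 1) {
      throw new IllegalArgumentException("processors must be >= 1");
    }
    return (processors * 70 + 99) / 100;
  }

  public static void main(String[] args) {
    int cores = Runtime.getRuntime().availableProcessors();
    System.out.println(cores + " processors -> width " + defaultWidth(cores));
  }
}
```

Under this assumption, 8 processors give a width of 6 and 4 processors give 3.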





[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958084#comment-15958084
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r109563707
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/AbstractSingleRowSet.java
 ---
@@ -0,0 +1,158 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import org.apache.drill.common.types.TypeProtos.MajorType;
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.expr.TypeHelper;
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.physical.impl.spill.RecordBatchSizer;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.complex.MapVector;
+import org.apache.drill.test.rowSet.RowSet.SingleRowSet;
+import org.apache.drill.test.rowSet.RowSetSchema.LogicalColumn;
+import org.apache.drill.test.rowSet.RowSetSchema.PhysicalSchema;
+
+public abstract class AbstractSingleRowSet extends AbstractRowSet implements SingleRowSet {
+
+  public abstract static class StructureBuilder {
+protected final PhysicalSchema schema;
+protected final BufferAllocator allocator;
+protected final ValueVector[] valueVectors;
+protected final MapVector[] mapVectors;
+protected int vectorIndex;
+protected int mapIndex;
+
+public StructureBuilder(BufferAllocator allocator, RowSetSchema schema) {
+  this.allocator = allocator;
+  this.schema = schema.physical();
+  valueVectors = new ValueVector[schema.access().count()];
+  if (schema.access().mapCount() == 0) {
+mapVectors = null;
+  } else {
+mapVectors = new MapVector[schema.access().mapCount()];
+  }
+}
+  }
+
+  public static class VectorBuilder extends StructureBuilder {
+
+public VectorBuilder(BufferAllocator allocator, RowSetSchema schema) {
+  super(allocator, schema);
+}
+
+public ValueVector[] buildContainer(VectorContainer container) {
+  for (int i = 0; i < schema.count(); i++) {
+LogicalColumn colSchema = schema.column(i);
+@SuppressWarnings("resource")
+ValueVector v = TypeHelper.getNewVector(colSchema.field, allocator, null);
+container.add(v);
+if (colSchema.field.getType().getMinorType() == MinorType.MAP) {
+  MapVector mv = (MapVector) v;
+  mapVectors[mapIndex++] = mv;
+  buildMap(mv, colSchema.mapSchema);
+} else {
+  valueVectors[vectorIndex++] = v;
+}
+  }
+  container.buildSchema(SelectionVectorMode.NONE);
+  return valueVectors;
+}
+
+private void buildMap(MapVector mapVector, PhysicalSchema mapSchema) {
+  for (int i = 0; i < mapSchema.count(); i++) {
+LogicalColumn colSchema = mapSchema.column(i);
+MajorType type = colSchema.field.getType();
+Class vectorClass = TypeHelper.getValueVectorClass(type.getMinorType(), type.getMode());
+@SuppressWarnings("resource")
+ValueVector v = mapVector.addOrGet(colSchema.field.getName(), type, vectorClass);
+if (type.getMinorType() == MinorType.MAP) {
+  MapVector mv = (MapVector) v;
+  mapVectors[mapIndex++] = mv;
+  buildMap(mv, colSchema.mapSchema);
+} else {
+  valueVectors[vectorIndex++] = v;
+}
+  }
+}
+  }
+
+  

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958083#comment-15958083
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r110056242
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/SchemaBuilder.java ---
@@ -0,0 +1,138 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.drill.common.types.TypeProtos.DataMode;
+import org.apache.drill.common.types.TypeProtos.MajorType;
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.record.MaterializedField;
+
+/**
+ * Builder of a row set schema expressed as a list of materialized
+ * fields. Optimized for use when creating schemas by hand in tests.
+ * 
+ * Example usage to create the following schema: 
+ * (c: INT, a: MAP(c: VARCHAR, d: INT, e: MAP(f: VARCHAR), g: INT), h: BIGINT)
--- End diff --

Fixed.


> Provide test tools to create, populate and compare row sets
> ---
>
> Key: DRILL-5323
> URL: https://issues.apache.org/jira/browse/DRILL-5323
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Operators work with individual row sets. A row set is a collection of records 
> stored as column vectors. (Drill uses various terms for this concept. A 
> record batch is a row set with an operator implementation wrapped around it. 
> A vector container is a row set, but with much functionality left as an 
> exercise for the developer. And so on.)
> To simplify tests, we need a {{TestRowSet}} concept that wraps a 
> {{VectorContainer}} and provides easy ways to:
> * Define a schema for the row set.
> * Create a set of vectors that implement the schema.
> * Populate the row set with test data via code.
> * Add an SV2 to the row set.
> * Pass the row set to operator components (such as generated code blocks.)
> * Compare the results of the operation with an expected result set.
> * Dispose of the underlying direct memory when work is done.
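
The workflow in the list above (define a schema, populate rows, compare against an expected result, release resources) can be sketched with plain Java collections. `MiniRowSet` is a hypothetical toy that stores rows on the heap; it does not use Drill's value vectors or direct memory and is not the API proposed in the pull request.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy row set: column names plus rows of boxed values.
final class MiniRowSet {
  private final List<String> columns;
  private final List<Object[]> rows = new ArrayList<>();

  MiniRowSet(String... columns) {
    this.columns = Arrays.asList(columns.clone());
  }

  // Append one row; returns this so calls can be chained.
  MiniRowSet add(Object... values) {
    if (values.length != columns.size()) {
      throw new IllegalArgumentException("expected " + columns.size() + " values");
    }
    rows.add(values.clone());
    return this;
  }

  // True when both row sets have the same columns and the same rows in order.
  boolean sameAs(MiniRowSet other) {
    if (!columns.equals(other.columns) || rows.size() != other.rows.size()) {
      return false;
    }
    for (int i = 0; i < rows.size(); i++) {
      if (!Arrays.equals(rows.get(i), other.rows.get(i))) {
        return false;
      }
    }
    return true;
  }

  // Stands in for releasing the underlying memory.
  void clear() {
    rows.clear();
  }

  public static void main(String[] args) {
    MiniRowSet actual = new MiniRowSet("id", "name").add(1, "x").add(2, "y");
    MiniRowSet expected = new MiniRowSet("id", "name").add(1, "x").add(2, "y");
    System.out.println(actual.sameAs(expected)); // prints true
    actual.clear();
    expected.clear();
  }
}
```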





[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958082#comment-15958082
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r110055747
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetPrinter.java ---
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.io.PrintStream;
+
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.test.rowSet.RowSet.RowSetReader;
+import org.apache.drill.test.rowSet.RowSetSchema.AccessSchema;
+
+public class RowSetPrinter {
+  private RowSet rowSet;
+
+  public RowSetPrinter(RowSet rowSet) {
+this.rowSet = rowSet;
+  }
+
+  public void print() {
+print(System.out);
+  }
+
+  public void print(PrintStream out) {
+SelectionVectorMode selectionMode = rowSet.getIndirectionType();
+RowSetReader reader = rowSet.reader();
+int colCount = reader.width();
+printSchema(out, selectionMode);
+while (reader.next()) {
+  printHeader(out, reader, selectionMode);
+  for (int i = 0; i < colCount; i++) {
+if (i > 0) {
+  out.print(", ");
+}
+out.print(reader.getAsString(i));
+  }
+  out.println();
+}
+  }
+
+  private void printSchema(PrintStream out, SelectionVectorMode selectionMode) {
+out.print("#");
+switch (selectionMode) {
+case FOUR_BYTE:
+  out.print(" (batch #, row #)");
+  break;
+case TWO_BYTE:
+  out.print(" (row #)");
+  break;
+default:
+  break;
+}
+out.print(": ");
+AccessSchema schema = rowSet.schema().access();
+for (int i = 0; i < schema.count(); i++) {
+  if (i > 0) {
--- End diff --

Access schema is a flattened view of the row set: the set of columns whose 
values can be set. If we have a schema like (a, b(c, d)) (where b is a map), 
the access schema is (a, b.c, b.d).
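
The flattening described above can be sketched as a small recursive walk. `AccessSchemaSketch` and its `Column` class are hypothetical illustrations, not the `RowSetSchema` classes from the pull request.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AccessSchemaSketch {

  // A column is a scalar when it has no children, otherwise a map.
  static final class Column {
    final String name;
    final List<Column> children;

    Column(String name, Column... children) {
      this.name = name;
      this.children = Arrays.asList(children);
    }
  }

  // Return the dotted paths of all scalar (settable) columns, in schema order.
  static List<String> flatten(List<Column> columns) {
    List<String> out = new ArrayList<>();
    collect("", columns, out);
    return out;
  }

  private static void collect(String prefix, List<Column> columns, List<String> out) {
    for (Column col : columns) {
      String path = prefix.isEmpty() ? col.name : prefix + "." + col.name;
      if (col.children.isEmpty()) {
        out.add(path);                     // scalar: part of the access schema
      } else {
        collect(path, col.children, out);  // map: descend into its members
      }
    }
  }

  public static void main(String[] args) {
    // Schema (a, b(c, d)), where b is a map.
    List<Column> schema = Arrays.asList(
        new Column("a"),
        new Column("b", new Column("c"), new Column("d")));
    System.out.println(flatten(schema)); // prints [a, b.c, b.d]
  }
}
```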







[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958081#comment-15958081
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r110056124
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetWriterImpl.java 
---
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.math.BigDecimal;
+
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.accessor.AbstractColumnWriter;
+import org.apache.drill.exec.vector.accessor.ColumnAccessorFactory;
+import org.apache.drill.exec.vector.accessor.ColumnWriter;
+import org.apache.drill.test.rowSet.RowSet.RowSetWriter;
+import org.joda.time.Period;
+
+/**
+ * Implements a row set writer on top of a {@link RowSet}
+ * container.
+ */
+
+public class RowSetWriterImpl extends AbstractRowSetAccessor implements RowSetWriter {
+
+  private final AbstractColumnWriter writers[];
+
+  public RowSetWriterImpl(AbstractSingleRowSet recordSet, AbstractRowIndex rowIndex) {
+super(rowIndex, recordSet.schema().access());
+ValueVector[] valueVectors = recordSet.vectors();
+writers = new AbstractColumnWriter[valueVectors.length];
+int posn = 0;
+for (int i = 0; i < writers.length; i++) {
+  writers[posn] = ColumnAccessorFactory.newWriter(valueVectors[i].getField().getType());
--- End diff --

Fixed.







[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958085#comment-15958085
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r110055632
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/HyperRowSetImpl.java 
---
@@ -0,0 +1,221 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.record.BatchSchema.SelectionVectorMode;
+import org.apache.drill.exec.record.HyperVectorWrapper;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.record.selection.SelectionVector4;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.accessor.AccessorUtilities;
+import org.apache.drill.exec.vector.complex.AbstractMapVector;
+import org.apache.drill.test.rowSet.AbstractRowSetAccessor.BoundedRowIndex;
+import org.apache.drill.test.rowSet.RowSet.HyperRowSet;
+import org.apache.drill.test.rowSet.RowSetSchema.LogicalColumn;
+import org.apache.drill.test.rowSet.RowSetSchema.PhysicalSchema;
+
+public class HyperRowSetImpl extends AbstractRowSet implements HyperRowSet {
+
+  public static class HyperRowIndex extends BoundedRowIndex {
+
+    private final SelectionVector4 sv4;
+
+    public HyperRowIndex(SelectionVector4 sv4) {
+      super(sv4.getCount());
+      this.sv4 = sv4;
+    }
+
+    @Override
+    public int index() {
+      return AccessorUtilities.sv4Index(sv4.get(rowIndex));
+    }
+
+    @Override
+    public int batch() {
+      return AccessorUtilities.sv4Batch(sv4.get(rowIndex));
+    }
+  }
+
+  /**
+   * Build a hyper row set by restructuring a hyper vector bundle into a uniform
+   * shape. Consider this schema:
+   * { a: 10, b: { c: 20, d: { e: 30 } } }
+   *
+   * The hyper container, with two batches, has this structure:
+   *
+   * Batch | a          | b
+   * 0     | Int vector | Map Vector(Int vector, Map Vector(Int vector))
+   * 1     | Int vector | Map Vector(Int vector, Map Vector(Int vector))
+   *
+   * The above table shows that top-level scalar vectors (such as the Int Vector for column
+   * a) appear "end-to-end" as a hyper-vector. Maps also appear end-to-end. But the
+   * contents of the map (column c) do not appear end-to-end. Instead, they appear as
+   * contents in the map vector. To get to c, one indexes into the map vector, steps inside
+   * the map to find c, and indexes to the right row.
+   *
+   * Similarly, the maps for d do not appear end-to-end; one must step to the right batch
+   * in b, then step to d.
+   *
+   * Finally, to get to e, one must step into the hyper vector for b, then step to the
+   * proper batch, step to d, step to e, and finally step to the row within e. This is a
+   * very complex, costly indexing scheme that differs depending on map nesting depth.
+   *
+   * To simplify access, this class restructures the maps to flatten the scalar vectors
+   * into end-to-end hyper vectors. For example, for the above:
+   *
+   * Batch | a          | c          | d
+   * 0     | Int vector | Int vector | Int vector
+   * 1     | Int vector | Int vector | Int vector
+   *
+   * The maps are still available as hyper vectors, but separated into map fields.
+   * (Scalar access no longer needs to access the maps.) The result is a uniform
+   * addressing scheme for both top-level and nested vectors.
+   */
+
+  public static class HyperVectorBuilder {
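
The `HyperRowIndex` in the diff above turns one SV4 entry into a (batch, row) pair. A minimal sketch of that decoding follows, assuming each entry packs the batch number into the upper 16 bits and the row offset into the lower 16 bits. That layout is an assumption made here for illustration, and `Sv4Sketch` is a hypothetical class, not `AccessorUtilities`.

```java
/** Hypothetical helper showing one way to split an SV4 entry; not Drill code. */
final class Sv4Sketch {
  private Sv4Sketch() { }

  /** Pack a batch number and a row offset into one int (assumed layout). */
  static int pack(int batch, int row) {
    return (batch << 16) | (row & 0xFFFF);
  }

  /** Upper 16 bits: which batch of the hyper vector to look in. */
  static int batch(int sv4Entry) {
    return sv4Entry >>> 16;
  }

  /** Lower 16 bits: the row offset within that batch. */
  static int index(int sv4Entry) {
    return sv4Entry & 0xFFFF;
  }
}
```

With flattened hyper vectors, the same (batch, row) pair addresses a top-level column and a nested column alike, which is the uniform scheme the Javadoc describes.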
  

[jira] [Commented] (DRILL-5323) Provide test tools to create, populate and compare row sets

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958086#comment-15958086
 ] 

ASF GitHub Bot commented on DRILL-5323:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/785#discussion_r110055928
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/rowSet/RowSetUtilities.java 
---
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.test.rowSet;
+
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.record.selection.SelectionVector2;
+import org.apache.drill.exec.vector.accessor.AccessorUtilities;
+import org.apache.drill.exec.vector.accessor.ColumnAccessor.ValueType;
+import org.apache.drill.exec.vector.accessor.ColumnWriter;
+import org.apache.drill.test.rowSet.RowSet.RowSetWriter;
+import org.joda.time.Duration;
+import org.joda.time.Period;
+
+public class RowSetUtilities {
+
+  private RowSetUtilities() { }
+
+  public static void reverse(SelectionVector2 sv2) {
+    int count = sv2.getCount();
+    for (int i = 0; i < count / 2; i++) {
+      char temp = sv2.getIndex(i);
+      int dest = count - 1 - i;
+      sv2.setIndex(i, sv2.getIndex(dest));
+      sv2.setIndex(dest, temp);
+    }
+  }
+
+  /**
+   * Set a test data value from an int. Uses the type information of the
+   * column to handle interval types. Else, uses the value type of the
+   * accessor. The value set here is purely for testing; the mapping
+   * from ints to intervals has no real meaning.
+   *
+   * @param rowWriter
+   * @param index
+   * @param value
+   */
+
+  public static void setFromInt(RowSetWriter rowWriter, int index, int value) {
+    ColumnWriter writer = rowWriter.column(index);
+    if (writer.valueType() == ValueType.PERIOD) {
+      setPeriodFromInt(writer, rowWriter.schema().column(index).getType().getMinorType(), value);
+    } else {
+      AccessorUtilities.setFromInt(writer, value);
+    }
+  }
+
+  public static void setPeriodFromInt(ColumnWriter writer, MinorType minorType,
+      int value) {
+    switch (minorType) {
+    case INTERVAL:
+      writer.setPeriod(Duration.millis(value).toPeriod());
+      break;
+    case INTERVALYEAR:
+      writer.setPeriod(Period.years(value / 12).withMonths(value % 12));
+      break;
+    case INTERVALDAY:
+      int sec = value % 60;
+      value = value / 60;
+      int min = value % 60;
+      value = value / 60;
+      writer.setPeriod(Period.days(value).withMinutes(min).withSeconds(sec));
--- End diff --

This is a data generator. The int has no real meaning; it is just a 
convenient way to populate a field. So, here we just slice off some values to 
put into each field. Not pretty, but convenient for testing.
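
To make the slicing concrete, here is a standalone sketch of the same arithmetic as the `INTERVALDAY` branch. It uses `java.time.Duration` rather than the Joda-Time `Period` in the diff so that it runs without extra libraries; `IntervalSlices` is a hypothetical name.

```java
import java.time.Duration;

/** Hypothetical, standalone version of the int-slicing used to generate test intervals. */
final class IntervalSlices {
  private IntervalSlices() { }

  /** Slice an arbitrary int into {days, minutes, seconds}, as the INTERVALDAY branch does. */
  static int[] slice(int value) {
    int sec = value % 60;   // lowest "digit" becomes seconds
    value /= 60;
    int min = value % 60;   // next "digit" becomes minutes
    value /= 60;
    return new int[] { value, min, sec };  // whatever is left is used as days
  }

  /** Build a duration from the sliced fields. */
  static Duration toDuration(int value) {
    int[] parts = slice(value);
    return Duration.ofDays(parts[0]).plusMinutes(parts[1]).plusSeconds(parts[2]);
  }
}
```

For an input of 3725 the slices are 1 day, 2 minutes, and 5 seconds, which is not 3725 of any real unit. That is the point of the comment: the int only needs to produce distinct, repeatable field values.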


> Provide test tools to create, populate and compare row sets
> ---
>
> Key: DRILL-5323
> URL: https://issues.apache.org/jira/browse/DRILL-5323
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Operators work with individual row sets. A row set is a collection of records 
> stored as column vectors. (Drill uses various terms for this concept. A 
> record batch is a row set with an operator implementation wrapped around it. 
> A vector container is a row set, but with much functionality left as an 
> exercise for the developer. And so on.)
> To simplify tests, we need a {{TestRowSet}} concept that wraps a 
> {{VectorContainer}} and provides easy ways to:
> * Define a schema for the row set.
> * Create a set of vectors that implement 

[jira] [Commented] (DRILL-5416) Vectors read from disk report incorrect memory sizes

2017-04-05 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958012#comment-15958012
 ] 

Paul Rogers commented on DRILL-5416:


The code here is complex. Since any change would be made only to improve a 
memory estimate, this is not a good investment of time. Let's just live with 
the incorrect memory estimates.

The good news is that the data, after being read, takes less memory than before 
being read. (The vectors share a single "dead space" rather than a dead space 
per vector.) 

> Vectors read from disk report incorrect memory sizes
> 
>
> Key: DRILL-5416
> URL: https://issues.apache.org/jira/browse/DRILL-5416
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.8.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>
> The external sort and revised hash agg operators spill to disk using a vector 
> serialization mechanism. This mechanism serializes each vector as a (length, 
> bytes) pair.
> Before spilling, if we check the memory used for a vector (using the new 
> {{RecordBatchSizer}} class), we learn of the actual memory consumed by the 
> vector, including any unused space in the vector.
> If we spill the vector, then reread it, the reported storage size is wrong.
> On reading, the code allocates a buffer, based on the saved length, rounded 
> up to the next power of two. Then, when building the vector, we "slice" the 
> read buffer, setting the memory size to the data size.
> For example, suppose we save 20 1-byte fields. The size on disk is 20. The 
> read buffer is rounded to 32 bytes (the size of the original, pre-spill 
> buffer.) We read the 20 bytes and create a vector. Creating the vector 
> reports the memory size as 20, "hiding" the extra, unused 12 bytes.
> As a result, when computing memory sizes, we receive incorrect numbers. 
> Working with false numbers means that the code cannot safely operate within a 
> memory budget, causing the user to receive an unexpected OOM error.
> As it turns out, the code path that does the slicing is used only for reads 
> from disk. This ticket asks to remove the slicing step: just use the 
> allocated buffer directly so that the after-read vector reports the correct 
> memory usage; same as the before-spill vector.





[jira] [Updated] (DRILL-5416) Vectors read from disk report incorrect memory sizes

2017-04-05 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5416:
---
Fix Version/s: (was: 1.11.0)

> Vectors read from disk report incorrect memory sizes
> 
>
> Key: DRILL-5416
> URL: https://issues.apache.org/jira/browse/DRILL-5416
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.8.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>


[jira] [Commented] (DRILL-5416) Vectors read from disk report incorrect memory sizes

2017-04-05 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957996#comment-15957996
 ] 

Paul Rogers commented on DRILL-5416:


The solution seems to be to change the serialization. Today we treat the vector 
as the unit of serialization (creating composite buffers as needed during read 
and write.) Note, however, that maps are treated as special; the map vector 
actually contains serialization code to write its composite vectors.

The revision is to use the map pattern for all vectors. Serializing vectors 
becomes a tree-walk: visit each vector. If the vector is simple (has only a 
buffer) serialize that. If it is composite (has more than one buffer or 
vector), visit each to serialize.

Then, reverse the process on read so that each vector is backed by its own 
buffer and free space is owned by a single vector. We can then correctly 
account for the memory needs of each vector.
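
A rough sketch of the tree-walk described above, with a hypothetical `VectorNode` type standing in for Drill's value vectors. Each node holds zero or more buffers and zero or more child vectors; none of these names come from the Drill code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical vector tree: a node has its own buffers plus child vectors. */
final class VectorNode {
  final String name;
  final List<byte[]> buffers = new ArrayList<>();
  final List<VectorNode> children = new ArrayList<>();

  VectorNode(String name, byte[]... buffers) {
    this.name = name;
    this.buffers.addAll(Arrays.asList(buffers));
  }

  VectorNode child(VectorNode c) {
    children.add(c);
    return this;
  }

  /** Depth-first walk: emit each buffer separately, then recurse into child vectors. */
  void serialize(List<byte[]> out) {
    out.addAll(buffers);
    for (VectorNode c : children) {
      c.serialize(out);
    }
  }
}
```

Because every buffer is written, and would be read back, as its own unit, any unused space stays attached to exactly one buffer instead of being hidden in a shared read buffer.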

> Vectors read from disk report incorrect memory sizes
> 
>
> Key: DRILL-5416
> URL: https://issues.apache.org/jira/browse/DRILL-5416
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.8.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
> Fix For: 1.11.0
>
>


[jira] [Comment Edited] (DRILL-5416) Vectors read from disk report incorrect memory sizes

2017-04-05 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957987#comment-15957987
 ] 

Paul Rogers edited comment on DRILL-5416 at 4/5/17 11:17 PM:
-

The original design for serialization is that each vector serializes to a 
buffer. This is simple for single-buffer vectors (required int, say). For 
composite vectors (nullable int, Varchar), the serialization process results in 
all buffers being combined into a single write buffer, and the corresponding 
read buffer being sliced into the individual composite vectors.

For a Varchar:
{code}
Data: [FredBarneyWilma_]
Offsets: [01041015]
Output buffer:  [01041015FredBarneyWilma_]
Input buffer:   [01041015FredBarneyWilma_]
New Offsets:[]
New Data[^^^]
{code}

Notice that, in the original, the empty space (denoted with "_") is allocated 
per vector. After serialization, free space is in a buffer shared by two 
vectors and is not "owned" by (or visible to) either.
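
For the three names in the example, and assuming the usual variable-width layout (a leading zero followed by the running end position of each value), the offsets and the remaining free space can be computed as below. `VarcharLayout` and the 16-byte capacity in the usage note are illustrative, not Drill code.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that derives an offsets vector and free space for string values. */
final class VarcharLayout {
  private VarcharLayout() { }

  /** One entry per value plus a leading 0; entry i+1 is the end position of value i. */
  static int[] offsets(String... values) {
    int[] offsets = new int[values.length + 1];
    for (int i = 0; i < values.length; i++) {
      offsets[i + 1] = offsets[i] + values[i].getBytes(StandardCharsets.UTF_8).length;
    }
    return offsets;
  }

  /** Unused bytes at the end of a data buffer of the given capacity. */
  static int freeSpace(int capacity, String... values) {
    int[] o = offsets(values);
    return capacity - o[o.length - 1];
  }
}
```

With "Fred", "Barney", and "Wilma" in a 16-byte data buffer, the data ends at byte 15 and one byte is free. Once both buffers are read back as slices of one shared buffer, neither vector reports that free byte.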


was (Author: paul-rogers):
The original design for serialization is that each vector serializes to a 
buffer. This is simple for single-buffer vectors (required int, say). For 
composite vectors (nullable int, Varchar), the serialization process results in 
all buffers being combined into a single write buffer, and the corresponding 
read buffer being sliced into the individual composite vectors.

For a Varchar:
{code}
Data: [FredBarneyWilma_]
Offsets: [01041015]
Output buffer:  [01041015FredBarneyWilma_]
Input buffer:   [01041015FredBarneyWilma_]
New Offsets:[]
New Data[^^^]
{code}


> Vectors read from disk report incorrect memory sizes
> 
>
> Key: DRILL-5416
> URL: https://issues.apache.org/jira/browse/DRILL-5416
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.8.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
> Fix For: 1.11.0
>
>


[jira] [Commented] (DRILL-5416) Vectors read from disk report incorrect memory sizes

2017-04-05 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957987#comment-15957987
 ] 

Paul Rogers commented on DRILL-5416:


The original design for serialization is that each vector serializes to a 
buffer. This is simple for single-buffer vectors (required int, say). For 
composite vectors (nullable int, Varchar), the serialization process results in 
all buffers being combined into a single write buffer, and the corresponding 
read buffer being sliced into the individual composite vectors.

For a Varchar:
{code}
Data: [FredBarneyWilma_]
Offsets: [01041015]
Output buffer:  [01041015FredBarneyWilma_]
Input buffer:   [01041015FredBarneyWilma_]
New Offsets:[]
New Data[^^^]
{code}


> Vectors read from disk report incorrect memory sizes
> 
>
> Key: DRILL-5416
> URL: https://issues.apache.org/jira/browse/DRILL-5416
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.8.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
> Fix For: 1.11.0
>
>


[jira] [Commented] (DRILL-5413) DrillConnectionImpl.isReadOnly() throws NullPointerException

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957984#comment-15957984
 ] 

ASF GitHub Bot commented on DRILL-5413:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/806


> DrillConnectionImpl.isReadOnly() throws NullPointerException
> 
>
> Key: DRILL-5413
> URL: https://issues.apache.org/jira/browse/DRILL-5413
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.10.0
> Environment: jboss 7.0.1 final version
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
> Fix For: 1.11.0
>
>
> According to 
> [CALCITE-843|https://issues.apache.org/jira/browse/CALCITE-843], every call to 
> "isReadOnly()" throws a NullPointerException. 
> For example, JBoss calls the DrillConnectionImpl.isReadOnly() method while 
> connecting to Drill as a datasource.
> The fix for CALCITE-843 should be added to the Drill Calcite fork.





[jira] [Created] (DRILL-5416) Vectors read from disk report incorrect memory sizes

2017-04-05 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5416:
--

 Summary: Vectors read from disk report incorrect memory sizes
 Key: DRILL-5416
 URL: https://issues.apache.org/jira/browse/DRILL-5416
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.8.0
Reporter: Paul Rogers
Assignee: Paul Rogers
Priority: Minor
 Fix For: 1.11.0


The external sort and revised hash agg operators spill to disk using a vector 
serialization mechanism. This mechanism serializes each vector as a (length, 
bytes) pair.

Before spilling, if we check the memory used for a vector (using the new 
{{RecordBatchSizer}} class), we learn of the actual memory consumed by the 
vector, including any unused space in the vector.

If we spill the vector, then reread it, the reported storage size is wrong.

On reading, the code allocates a buffer, based on the saved length, rounded up 
to the next power of two. Then, when building the vector, we "slice" the read 
buffer, setting the memory size to the data size.

For example, suppose we save 20 1-byte fields. The size on disk is 20. The read 
buffer is rounded to 32 bytes (the size of the original, pre-spill buffer.) We 
read the 20 bytes and create a vector. Creating the vector reports the memory 
size as 20, "hiding" the extra, unused 12 bytes.

As a result, when computing memory sizes, we receive incorrect numbers. Working 
with false numbers means that the code cannot safely operate within a memory 
budget, causing the user to receive an unexpected OOM error.

As it turns out, the code path that does the slicing is used only for reads 
from disk. This ticket asks to remove the slicing step: just use the allocated 
buffer directly so that the after-read vector reports the correct memory usage; 
same as the before-spill vector.
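
The arithmetic in the 20-byte example can be checked with a small sketch. `SpillSizes` is a hypothetical helper; it only models the power-of-two rounding described above, not Drill's allocator.

```java
/** Hypothetical helper modelling the rounding described in this ticket. */
final class SpillSizes {
  private SpillSizes() { }

  /** Round a positive size up to the next power of two (supports sizes up to 2^30). */
  static int roundUpToPowerOfTwo(int size) {
    if (size <= 0 || size > (1 << 30)) {
      throw new IllegalArgumentException("size out of range: " + size);
    }
    int highest = Integer.highestOneBit(size);
    return highest == size ? size : highest << 1;
  }

  /** Bytes that are allocated but invisible when the vector reports only its data size. */
  static int hiddenBytes(int dataSize) {
    return roundUpToPowerOfTwo(dataSize) - dataSize;
  }
}
```

For 20 saved bytes the buffer is 32 bytes and 12 bytes go unreported, matching the numbers in the description. A size that is already a power of two hides nothing.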





[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957904#comment-15957904
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109960926
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/security/ServerAuthenticationHandler.java
 ---
@@ -251,25 +256,62 @@ void process(SaslResponseContext context) 
throws Exception {
   private static , T extends EnumLite>
   void handleSuccess(final SaslResponseContext context, final 
SaslMessage.Builder challenge,
  final SaslServer saslServer) throws IOException {
-context.connection.changeHandlerTo(context.requestHandler);
-context.connection.finalizeSaslSession();
-context.sender.send(new Response(context.saslResponseType, 
challenge.build()));
 
-// setup security layers here..
+final S connection = context.connection;
+connection.changeHandlerTo(context.requestHandler);
+connection.finalizeSaslSession();
+context.sender.send(new Response(context.saslResponseType, 
challenge.build()));
 
 if (logger.isTraceEnabled()) {
-  logger.trace("Authenticated {} successfully using {} from {}", 
saslServer.getAuthorizationID(),
-  saslServer.getMechanismName(), context.remoteAddress);
+  logger.trace("Authenticated {} successfully using {} from {} with 
encryption context {}",
+saslServer.getAuthorizationID(), saslServer.getMechanismName(), 
connection.getRemoteAddress().toString(),
+connection.getEncryptionString());
+}
+
+if (connection.isEncrypted()) {
+  try {
+// Check if connection was marked for being secure then verify for 
negotiated QOP value for correctness.
+final String negotiatedQOP = 
saslServer.getNegotiatedProperty(Sasl.QOP).toString();
+assert 
(negotiatedQOP.equals(SaslProperties.QualityOfProtection.PRIVACY.getSaslQop()));
+
+// Update the rawWrapSendSize with the negotiated rawSendSize 
since we cannot call encode with more than the
+// negotiated size of buffer
+final int negotiatedRawSendSize = Integer.parseInt(saslServer
+  
.getNegotiatedProperty(SaslProperties.WRAP_RAW_SEND_SIZE)
+  .toString());
+if(negotiatedRawSendSize <= 0) {
+  throw new SaslException(String.format("Negotiated rawSendSize: 
%d is invalid. Please check the configured " +
+"value of sasl.encryption.encodesize. It might be configured 
to a very small value.",
+negotiatedRawSendSize));
+}
+connection.setRawWrapSendSize(negotiatedRawSendSize);
+connection.addSecurityHandlers();
+  } catch (IllegalStateException | NumberFormatException e) {
+throw new SaslException(String.format("Unexpected failure while 
retrieving negotiated property values (%s)",
+  e.getMessage()), e);
+  }
+} else {
+  // Encryption is not required hence we don't need to hold on to 
saslServer object.
+  if (saslServer != null) {
+try {
+  saslServer.dispose();
--- End diff --

Same comment as for `saslClient`.


> Apache Drill should support network encryption
> --
>
> Key: DRILL-4335
> URL: https://issues.apache.org/jira/browse/DRILL-4335
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Keys Botzum
>Assignee: Sorabh Hamirwasia
>  Labels: security
> Attachments: ApacheDrillEncryptionUsingSASLDesign.pdf
>
>
> This is clearly related to Drill-291 but wanted to make explicit that this 
> needs to include network level encryption and not just authentication. This 
> is particularly important for the client connection to Drill which will often 
> be sending passwords in the clear until there is encryption.





[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957886#comment-15957886
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109736477
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/BitConnectionConfig.java 
---
@@ -46,16 +47,40 @@ protected BitConnectionConfig(BufferAllocator 
allocator, BootStrapContext contex
 super(allocator, context);
 
 final DrillConfig config = context.getConfig();
+final AuthenticatorProvider authProvider = getAuthProvider();
+
 if (config.getBoolean(ExecConstants.BIT_AUTHENTICATION_ENABLED)) {
   this.authMechanismToUse = 
config.getString(ExecConstants.BIT_AUTHENTICATION_MECHANISM);
   try {
-getAuthProvider().getAuthenticatorFactory(authMechanismToUse);
+authProvider.getAuthenticatorFactory(authMechanismToUse);
   } catch (final SaslException e) {
 throw new DrillbitStartupException(String.format(
 "'%s' mechanism not found for bit-to-bit authentication. 
Please check authentication configuration.",
 authMechanismToUse));
   }
-  logger.info("Configured bit-to-bit connections to require 
authentication using: {}", authMechanismToUse);
+
+  // Update encryption related configurations
+  
encryptionContext.setEncryption(config.getBoolean(ExecConstants.BIT_SASL_ENCRYPTION_ENABLED));
+
+  int maxEncodeSize = 
config.getInt(ExecConstants.BIT_SASL_ENCRYPTION_ENCODESIZE);
+
+  if(maxEncodeSize > RpcConstants.MAX_WRAP_SIZE) {
--- End diff --

* Fix the spacing (`if (`).
* Also check that the value is non-negative.
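
One way the requested check could look, as a standalone sketch. `EncodeSizeCheck` and `ASSUMED_MAX_WRAP_SIZE` are hypothetical; the real value of `RpcConstants.MAX_WRAP_SIZE` does not appear in the diff, so the cap below is a placeholder only.

```java
/** Hypothetical validation of a configured encode size; not the code in the pull request. */
final class EncodeSizeCheck {
  /** Placeholder cap; the actual RpcConstants.MAX_WRAP_SIZE is not shown in the diff. */
  static final int ASSUMED_MAX_WRAP_SIZE = 16 * 1024 * 1024;

  private EncodeSizeCheck() { }

  /** Return the size unchanged if it is within range, otherwise fail fast. */
  static int validate(int maxEncodeSize) {
    if (maxEncodeSize < 0) {
      throw new IllegalArgumentException(
          "Encode size must be non-negative, got " + maxEncodeSize);
    }
    if (maxEncodeSize > ASSUMED_MAX_WRAP_SIZE) {
      throw new IllegalArgumentException(
          "Encode size " + maxEncodeSize + " exceeds the maximum " + ASSUMED_MAX_WRAP_SIZE);
    }
    return maxEncodeSize;
  }
}
```

In the constructor under review, a failure of this kind would more likely surface as a `DrillbitStartupException`, as the mechanism check a few lines earlier does.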


> Apache Drill should support network encryption
> --
>
> Key: DRILL-4335
> URL: https://issues.apache.org/jira/browse/DRILL-4335
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Keys Botzum
>Assignee: Sorabh Hamirwasia
>  Labels: security
> Attachments: ApacheDrillEncryptionUsingSASLDesign.pdf
>
>


[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957893#comment-15957893
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r110021847
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/security/SaslProperties.java
 ---
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.rpc.security;
+
+import javax.security.sasl.Sasl;
+import java.util.HashMap;
+import java.util.Map;
+
+public final class SaslProperties {
+
+  /**
+   * All supported Quality of Protection value which can be negotiated for
+   */
+  enum QualityOfProtection {
+AUTHENTICATION("auth"),
+INTEGRITY("auth-int"),
+PRIVACY("auth-conf");
+
+public final String saslQop;
+
+QualityOfProtection(String saslQop) {
+  this.saslQop = saslQop;
+}
+
+public String getSaslQop() {
+  return saslQop;
+}
+  }
+
+  static final String WRAP_RAW_SEND_SIZE = 
"javax.security.sasl.rawsendsize";
+
+  /**
+   * Get's the map of minimum set of SaslProperties required during 
negotiation process either for encryption
+   * or authentication
+   * @param encryptionEnabled - Flag to determine if property needed is 
for encryption or authentication
+   * @param wrappedChunkSize  - Configured wrappedChunkSize to negotiate 
for.
+   *Default is {@link 
org.apache.drill.exec.rpc.RpcConstants.MAX_WRAP_SIZE}
--- End diff --

When is the value set to default?




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957876#comment-15957876
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109967293
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/user/UserClient.java ---
@@ -135,20 +137,33 @@ public void submitQuery(UserResultsListener 
resultsListener, RunQuery query) {
* @param credentials credentials
* @throws RpcException if either connection or authentication fails
*/
-  public void connect(final DrillbitEndpoint endpoint, final 
DrillProperties properties,
-  final UserCredentials credentials) throws 
RpcException {
-final UserToBitHandshake handshake = UserToBitHandshake.newBuilder()
+  public void connect(final DrillbitEndpoint endpoint, final 
DrillProperties properties, final UserCredentials credentials) throws 
RpcException {
--- End diff --

nit: undo formatting change?




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957880#comment-15957880
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109967593
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/user/UserClient.java ---
@@ -135,20 +137,33 @@ public void submitQuery(UserResultsListener 
resultsListener, RunQuery query) {
* @param credentials credentials
* @throws RpcException if either connection or authentication fails
*/
-  public void connect(final DrillbitEndpoint endpoint, final 
DrillProperties properties,
-  final UserCredentials credentials) throws 
RpcException {
-final UserToBitHandshake handshake = UserToBitHandshake.newBuilder()
+  public void connect(final DrillbitEndpoint endpoint, final 
DrillProperties properties, final UserCredentials credentials) throws 
RpcException {
+final UserToBitHandshake.Builder hsBuilder = 
UserToBitHandshake.newBuilder()
 .setRpcVersion(UserRpcConfig.RPC_VERSION)
 .setSupportListening(true)
 .setSupportComplexTypes(supportComplexTypes)
 .setSupportTimeout(true)
 .setCredentials(credentials)
 .setClientInfos(UserRpcUtils.getRpcEndpointInfos(clientName))
-.setSaslSupport(SaslSupport.SASL_AUTH)
-.setProperties(properties.serializeForServer())
-.build();
+.setSaslSupport(SaslSupport.SASL_PRIVACY)
+.setProperties(properties.serializeForServer());
+
+// Only used for testing purpose
+if (properties.containsKey(DrillProperties.TEST_OLD_CLIENT)) {
+  hsBuilder.setSaslSupport(SaslSupport.valueOf(
+
Integer.parseInt(properties.getProperty(DrillProperties.TEST_OLD_CLIENT))));
+}
+
+connect(hsBuilder.build(), endpoint).checkedGet();
+
+// Check if client needs encryption and server is not configured for 
encryption.
+final boolean clientNeedEncryption = 
properties.containsKey(DrillProperties.ENCRYPTION)
--- End diff --

`clientNeedsEncryption`




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957878#comment-15957878
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109971454
  
--- Diff: 
exec/rpc/src/main/java/org/apache/drill/exec/rpc/AbstractRemoteConnection.java 
---
@@ -34,16 +37,26 @@
   private final WriteManager writeManager;
   private final RequestIdMap requestIdMap = new RequestIdMap();
   private final String clientName;
-
   private String name;
 
+  // Encryption related parameters
+  private EncryptionContext encryptionContext;
+  // SaslBackendWrapper to hold instance of SaslClient/SaslServer
+  protected SaslBackendWrapper saslBackend;
+
   public AbstractRemoteConnection(SocketChannel channel, String name) {
 this.channel = channel;
 this.clientName = name;
 this.writeManager = new WriteManager();
+this.encryptionContext = new EncryptionContext();
 channel.pipeline().addLast(new BackPressureHandler());
   }
 
+  public AbstractRemoteConnection(SocketChannel channel, String name, 
EncryptionContext encryptionContext) {
+this(channel, name);
+this.encryptionContext.initialize(encryptionContext);
--- End diff --

To avoid `initialize`, call this ctor from the other one and move other 
inits here.
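
A rough sketch of the chaining suggested above, using simplified stand-ins rather than the real classes. `ConnectionSketch` and `EncryptionContextSketch` are hypothetical, and the copy constructor is an assumption about how `initialize` could be replaced:

```java
final class EncryptionContextSketch {
  private final boolean encryptionEnabled;

  EncryptionContextSketch() {
    this(false);
  }

  EncryptionContextSketch(boolean encryptionEnabled) {
    this.encryptionEnabled = encryptionEnabled;
  }

  // Copy constructor, standing in for the initialize(...) call in the diff.
  EncryptionContextSketch(EncryptionContextSketch other) {
    this(other.encryptionEnabled);
  }

  boolean isEncryptionEnabled() {
    return encryptionEnabled;
  }
}

class ConnectionSketch {
  private final String clientName;
  private final EncryptionContextSketch encryptionContext;

  // The shorter constructor delegates to the full one.
  ConnectionSketch(String name) {
    this(name, new EncryptionContextSketch());
  }

  // All field initialization lives here, so the context can be final.
  ConnectionSketch(String name, EncryptionContextSketch encryptionContext) {
    this.clientName = name;
    this.encryptionContext = new EncryptionContextSketch(encryptionContext);
  }

  boolean isEncryptionEnabled() {
    return encryptionContext.isEncryptionEnabled();
  }
}
```

With this shape the field can be `final`, and no partially initialized context is ever visible.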




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957887#comment-15957887
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109744300
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/ConnectionConfig.java ---
@@ -31,4 +31,11 @@
 
   AuthenticatorProvider getAuthProvider();
 
+  void incConnectionCounter(boolean secure);
--- End diff --

Not sure there should be mutable state in a config object. Is there a 
way to move the counters elsewhere?
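
One possible shape, purely as a sketch: keep the counts in a small dedicated holder that connections update, so the config stays read-only. `ConnectionCountersSketch` is a hypothetical name, and `AtomicLong` stands in for the metrics `Counter` used in the PR:

```java
import java.util.concurrent.atomic.AtomicLong;

final class ConnectionCountersSketch {
  // AtomicLong is a stand-in for the metrics Counter type used in the PR.
  private final AtomicLong encrypted = new AtomicLong();
  private final AtomicLong unencrypted = new AtomicLong();

  // Mirrors incConnectionCounter(boolean secure) from the diff.
  void increment(boolean secure) {
    (secure ? encrypted : unencrypted).incrementAndGet();
  }

  long encryptedCount() {
    return encrypted.get();
  }

  long unencryptedCount() {
    return unencrypted.get();
  }
}
```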




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957888#comment-15957888
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109743585
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/ConnectionConfig.java ---
@@ -21,7 +21,7 @@
 import org.apache.drill.exec.rpc.security.AuthenticatorProvider;
 import org.apache.drill.exec.server.BootStrapContext;
 
-public interface ConnectionConfig {
+interface ConnectionConfig {
--- End diff --

public




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957895#comment-15957895
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r110034083
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/security/AuthenticationOutcomeListener.java
 ---
@@ -243,4 +247,43 @@ public SaslMessage process(SaslChallengeContext 
context) throws Exception {
   }
 }
   }
+
+  private static void handleSuccess(SaslChallengeContext context) throws 
SaslException {
+final ClientConnection connection = context.connection;
+final SaslClient saslClient = connection.getSaslClient();
+
+if (connection.isEncrypted()) {
+  try {
+// Check if connection was marked for being secure then verify for 
negotiated QOP value for
+// correctness.
+final String negotiatedQOP = 
saslClient.getNegotiatedProperty(Sasl.QOP).toString();
--- End diff --

`getNegotiatedProperty` could return null here and below?

Please correct other occurrences too.
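
A minimal sketch of a null-safe lookup, assuming a missing value should surface as a `SaslException` rather than a `NullPointerException`. `NegotiatedQopSketch` and its method names are hypothetical; `Sasl.QOP` and `SaslClient.getNegotiatedProperty` are the standard `javax.security.sasl` API:

```java
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslException;

final class NegotiatedQopSketch {
  private NegotiatedQopSketch() {}

  // Converts a negotiated property value to a String, failing with a
  // SaslException if nothing was negotiated for that property.
  static String requireNegotiated(String propertyName, Object value) throws SaslException {
    if (value == null) {
      throw new SaslException("No value negotiated for " + propertyName);
    }
    return value.toString();
  }

  // Null-safe replacement for getNegotiatedProperty(Sasl.QOP).toString().
  static String negotiatedQop(SaslClient saslClient) throws SaslException {
    return requireNegotiated(Sasl.QOP, saslClient.getNegotiatedProperty(Sasl.QOP));
  }
}
```

The same helper could cover the other negotiated properties read further down in the method.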




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957903#comment-15957903
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109744921
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/ControlConnectionConfig.java
 ---
@@ -30,6 +33,14 @@
 
   private final ControlMessageHandler handler;
 
+  // Total number of control connection's as client and server for a 
DrillBit.
+  // i.e. Sum of incoming and outgoing control connections.
+  private static final Counter secureControlConnections = 
DrillMetrics.getRegistry()
+.counter("drill.control.encrypted.connections");
--- End diff --

Change to "drill.connections.control.encrypted", and similarly for other 
connections.




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957869#comment-15957869
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109735413
  
--- Diff: 
exec/rpc/src/main/java/org/apache/drill/exec/rpc/SaslBackendWrapper.java ---
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.rpc;
+
+import javax.security.sasl.SaslException;
+
+/*
+ * The helper interface to wrap SaslClient and SaslServer instances for 
use in Security Handlers.
+ */
+public interface SaslBackendWrapper{
--- End diff --

How about "SaslCodec", as in coder-decoder?




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957897#comment-15957897
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r110035581
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/user/UserServer.java ---
@@ -335,8 +350,27 @@ public BitToUserHandshake 
getHandshakeResponse(UserToBitHandshake inbound) throw
 }
   }
 
-  // mention server's authentication capabilities
-  
respBuilder.addAllAuthenticationMechanisms(config.getAuthProvider().getAllFactoryNames());
+  // We are checking in UserConnectionConfig that if SASL 
encryption is enabled then mechanisms other
+  // than PLAIN are also configured otherwise throw exception
+  final Set<String> configuredMech = 
config.getAuthProvider().getAllFactoryNames();
+
+  if (!config.isEncryptionEnabled()) {
+
+respBuilder.addAllAuthenticationMechanisms(configuredMech);
+  } else {
--- End diff --

A few things to note:
+ If encryption is enabled, PLAIN will fail negotiation anyway. So is the 
special handling (and this block itself) unnecessary?
+ An implication of this is that even if the Drillbit starts up with PLAIN 
configured correctly, the mechanism will not be offered to clients.
+ Consider a custom mechanism that does not support encryption: PLAIN will 
not be offered, but that mechanism will be?
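
To make the last point concrete, here is a small sketch that mirrors the name-based filter in the diff. `MechanismFilterSketch` and the mechanism name `custom-noenc` are hypothetical, and the literal `"plain"` assumes that is what `PlainFactory.SIMPLE_NAME.toLowerCase()` evaluates to:

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class MechanismFilterSketch {
  private MechanismFilterSketch() {}

  // Mirrors the diff: when encryption is enabled, drop only the mechanism
  // named "plain" (assumed value of PlainFactory.SIMPLE_NAME.toLowerCase()).
  static Set<String> offeredMechanisms(Set<String> configured, boolean encryptionEnabled) {
    if (!encryptionEnabled) {
      return new LinkedHashSet<>(configured);
    }
    final Set<String> offered = new LinkedHashSet<>();
    for (String mechanism : configured) {
      if (!mechanism.equals("plain")) {
        offered.add(mechanism);
      }
    }
    return offered;
  }
}
```

With `plain`, `kerberos` and `custom-noenc` configured and encryption enabled, this filter drops `plain` but still offers `custom-noenc`, even though it would fail negotiation in the same way.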




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957901#comment-15957901
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r110039074
  
--- Diff: 
exec/rpc/src/main/java/org/apache/drill/exec/rpc/RpcConstants.java ---
@@ -22,6 +22,15 @@
 
   private RpcConstants(){}
 
-  public static final boolean SOME_DEBUGGING = false;
+  static final boolean SOME_DEBUGGING = false;
--- End diff --

make constants public




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957885#comment-15957885
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r110033564
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/user/UserServer.java ---
@@ -335,8 +350,27 @@ public BitToUserHandshake 
getHandshakeResponse(UserToBitHandshake inbound) throw
 }
   }
 
-  // mention server's authentication capabilities
-  
respBuilder.addAllAuthenticationMechanisms(config.getAuthProvider().getAllFactoryNames());
+  // We are checking in UserConnectionConfig that if SASL 
encryption is enabled then mechanisms other
+  // than PLAIN are also configured otherwise throw exception
--- End diff --

Where is the exception thrown?




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957867#comment-15957867
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109743453
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/BitConnectionConfig.java 
---
@@ -46,16 +47,40 @@ protected BitConnectionConfig(BufferAllocator 
allocator, BootStrapContext contex
 super(allocator, context);
 
 final DrillConfig config = context.getConfig();
+final AuthenticatorProvider authProvider = getAuthProvider();
+
 if (config.getBoolean(ExecConstants.BIT_AUTHENTICATION_ENABLED)) {
   this.authMechanismToUse = 
config.getString(ExecConstants.BIT_AUTHENTICATION_MECHANISM);
   try {
-getAuthProvider().getAuthenticatorFactory(authMechanismToUse);
+authProvider.getAuthenticatorFactory(authMechanismToUse);
   } catch (final SaslException e) {
 throw new DrillbitStartupException(String.format(
 "'%s' mechanism not found for bit-to-bit authentication. 
Please check authentication configuration.",
 authMechanismToUse));
   }
-  logger.info("Configured bit-to-bit connections to require 
authentication using: {}", authMechanismToUse);
+
+  // Update encryption related configurations
+  
encryptionContext.setEncryption(config.getBoolean(ExecConstants.BIT_SASL_ENCRYPTION_ENABLED));
+
+  int maxEncodeSize = 
config.getInt(ExecConstants.BIT_SASL_ENCRYPTION_ENCODESIZE);
+
+  if(maxEncodeSize > RpcConstants.MAX_WRAP_SIZE) {
+logger.warn("Setting bit.sasl.encryption.encodesize to maximum 
allowed value of 16MB");
+maxEncodeSize = RpcConstants.MAX_WRAP_SIZE;
+  }
+  encryptionContext.setWrappedChunkSize(maxEncodeSize);
+
+  if (encryptionContext.isEncryptionEnabled() && 
authProvider.isOnlyPlainConfigured()) {
+throw new DrillbitStartupException("Encryption is enabled but only 
PLAIN mechanism is configured." +
+  " Please check the security.bit configurations.");
+  }
+
+  logger.info("Configured bit-to-bit connections to require 
authentication using: {} with encryption: {}",
+authMechanismToUse, encryptionContext.getEncryptionString());
+
+} else if 
(config.getBoolean(ExecConstants.BIT_SASL_ENCRYPTION_ENABLED)) {
+  throw new DrillbitStartupException("Invalid security configuration. 
Encryption is enabled with authentication " +
--- End diff --

How about "... Encryption **using SASL** is enabled... "




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957899#comment-15957899
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r110038275
  
--- Diff: 
exec/rpc/src/main/java/org/apache/drill/exec/rpc/AbstractRemoteConnection.java 
---
@@ -224,4 +237,67 @@ public void close() {
 }
   }
 
+  /**
+   * Helps to add all the required security handler's after negotiation 
for encryption is completed.
+   * 
+   *  Handler's that are added are:
+   *  SaslDecryptionHandler
+   *  LengthFieldBasedFrameDecoder Handler
+   *  SaslEncryptionHandler
+   *  ChunkCreationHandler
+   * 
+   * 
+   *  If encryption is enabled ChunkCreationHandler is always added 
irrespective of chunkMode enabled or not.
+   *  This helps to make a generic encryption handler.
+   * 
+   */
+  @Override
+  public void addSecurityHandlers() {
+
+final ChannelPipeline channelPipeline = getChannel().pipeline();
+channelPipeline.addFirst("SaslDecryptionHandler", new 
SaslDecryptionHandler(saslBackend, getWrappedChunkSize(),
--- End diff --

Define handler names as class constants, especially "message-decoder", which 
is from another class.
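
Something along these lines, as a sketch only. `HandlerNamesSketch` is a hypothetical holder; the two string values are the ones that appear in the diff and in this comment, and where the constants should actually live is left to the author:

```java
final class HandlerNamesSketch {
  // Name passed to addFirst(...) in the diff for the decryption handler.
  static final String SASL_DECRYPTION_HANDLER = "SaslDecryptionHandler";

  // Name of the handler registered by another class; ideally that class would
  // expose the constant so both sides share one definition.
  static final String MESSAGE_DECODER = "message-decoder";

  private HandlerNamesSketch() {}
}
```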




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957879#comment-15957879
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109988945
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/security/ClientAuthenticatorProvider.java
 ---
@@ -101,4 +101,12 @@ public void close() throws Exception {
 AutoCloseables.close(authFactories.values());
 authFactories.clear();
   }
+
+  /**
+   * By default Kerberos and Plain factories are provided by Drill.
+   */
+  @Override
+  public boolean isOnlyPlainConfigured() {
+return false;
--- End diff --

nit: add the above comment here (since it's not method doc)




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957906#comment-15957906
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r110035650
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/user/UserServer.java ---
@@ -335,8 +350,27 @@ public BitToUserHandshake 
getHandshakeResponse(UserToBitHandshake inbound) throw
 }
   }
 
-  // mention server's authentication capabilities
-  
respBuilder.addAllAuthenticationMechanisms(config.getAuthProvider().getAllFactoryNames());
+  // We are checking in UserConnectionConfig that if SASL 
encryption is enabled then mechanisms other
+  // than PLAIN are also configured otherwise throw exception
+  final Set<String> configuredMech = 
config.getAuthProvider().getAllFactoryNames();
+
+  if (!config.isEncryptionEnabled()) {
+
+respBuilder.addAllAuthenticationMechanisms(configuredMech);
+  } else {
+final Set<String> saslEncryptMech = new HashSet<>();
+
+for (String mechanism : configuredMech) {
+  if 
(!mechanism.equals(PlainFactory.SIMPLE_NAME.toLowerCase())) {
+saslEncryptMech.add(mechanism);
+  }
+}
+respBuilder.addAllAuthenticationMechanisms(saslEncryptMech);
+  }
+
+  // set the encrypted flag in handshake message. For older 
clients this field is optional so will be ignored
+  respBuilder.setEncrypted(connection.isEncrypted());
--- End diff --

Shouldn't these be set inside the above `else` block? The values are 
invalid otherwise.




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957874#comment-15957874
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109968323
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/user/UserClient.java ---
@@ -135,20 +137,33 @@ public void submitQuery(UserResultsListener 
resultsListener, RunQuery query) {
* @param credentials credentials
* @throws RpcException if either connection or authentication fails
*/
-  public void connect(final DrillbitEndpoint endpoint, final 
DrillProperties properties,
-  final UserCredentials credentials) throws 
RpcException {
-final UserToBitHandshake handshake = UserToBitHandshake.newBuilder()
+  public void connect(final DrillbitEndpoint endpoint, final 
DrillProperties properties, final UserCredentials credentials) throws 
RpcException {
+final UserToBitHandshake.Builder hsBuilder = 
UserToBitHandshake.newBuilder()
 .setRpcVersion(UserRpcConfig.RPC_VERSION)
 .setSupportListening(true)
 .setSupportComplexTypes(supportComplexTypes)
 .setSupportTimeout(true)
 .setCredentials(credentials)
 .setClientInfos(UserRpcUtils.getRpcEndpointInfos(clientName))
-.setSaslSupport(SaslSupport.SASL_AUTH)
-.setProperties(properties.serializeForServer())
-.build();
+.setSaslSupport(SaslSupport.SASL_PRIVACY)
+.setProperties(properties.serializeForServer());
+
+// Only used for testing purpose
+if (properties.containsKey(DrillProperties.TEST_OLD_CLIENT)) {
--- End diff --

Although only VisibleForTesting, this property is still available to any 
user. Could this cause trouble?




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957900#comment-15957900
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109958662
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/user/UserConnectionConfig.java
 ---
@@ -34,32 +39,81 @@
 
   private final UserServerRequestHandler handler;
 
+  // Total number of external DrillClient connection's on this server.
+  private static final Counter secureUserConnections = 
DrillMetrics.getRegistry()
+.counter("drill.user.encrypted.connections");
+
+  private static final Counter insecureUserConnections = 
DrillMetrics.getRegistry()
+.counter("drill.user.unencrypted.connections");
+
   UserConnectionConfig(BufferAllocator allocator, BootStrapContext 
context, UserServerRequestHandler handler)
-  throws DrillbitStartupException {
+throws DrillbitStartupException {
 super(allocator, context);
 this.handler = handler;
 
-if 
(context.getConfig().getBoolean(ExecConstants.USER_AUTHENTICATION_ENABLED)) {
-  if (getAuthProvider().getAllFactoryNames().isEmpty()) {
+final DrillConfig config = context.getConfig();
+final AuthenticatorProvider authProvider = getAuthProvider();
+
+if (config.getBoolean(ExecConstants.USER_AUTHENTICATION_ENABLED)) {
+  if (authProvider.getAllFactoryNames().isEmpty()) {
 throw new DrillbitStartupException("Authentication enabled, but no 
mechanisms found. Please check " +
-"authentication configuration.");
+  "authentication configuration.");
   }
   authEnabled = true;
-  logger.info("Configured all user connections to require 
authentication using: {}",
-  getAuthProvider().getAllFactoryNames());
+
+  // Update encryption related parameters.
+  encryptionContext.setEncryption(config.getBoolean(ExecConstants.USER_SASL_ENCRYPTION_ENABLED));
+
+  int maxEncodeSize = config.getInt(ExecConstants.USER_SASL_ENCRYPTION_ENCODESIZE);
+
+  if(maxEncodeSize > RpcConstants.MAX_WRAP_SIZE) {
+logger.warn("Setting user.sasl.encryption.encodesize to maximum allowed value of 16MB");
+maxEncodeSize = RpcConstants.MAX_WRAP_SIZE;
+  }
+  encryptionContext.setWrappedChunkSize(maxEncodeSize);
+
+  if (encryptionContext.isEncryptionEnabled() && authProvider.isOnlyPlainConfigured()) {
--- End diff --

There may be other mechanisms that do not support encryption, so this check 
(`isOnlyPlainConfigured`) may not be sufficient.
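
One possible shape for a broader check, sketched under assumptions: the set of encryption-capable mechanisms is passed in explicitly, and `EncryptionCapabilityCheck` with its methods is a hypothetical name. `IllegalStateException` stands in for the startup exception used in the diff.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical startup validation; names are illustrative only. */
final class EncryptionCapabilityCheck {
  private EncryptionCapabilityCheck() {}

  /** True if at least one configured mechanism is in the encryption-capable set. */
  static boolean hasEncryptionCapableMechanism(Set<String> configured, Set<String> capable) {
    for (String mechanism : configured) {
      if (capable.contains(mechanism.toLowerCase(Locale.ROOT))) {
        return true;
      }
    }
    return false;
  }

  /** Fails fast when encryption is requested but no configured mechanism can provide it. */
  static void validate(boolean encryptionEnabled, Set<String> configured, Set<String> capable) {
    if (encryptionEnabled && !hasEncryptionCapableMechanism(configured, capable)) {
      throw new IllegalStateException(
          "Encryption is enabled, but none of the configured mechanisms " + configured
              + " supports it. Please check the security configuration.");
    }
  }
}
```

Unlike a PLAIN-only test, this rejects any configuration in which every mechanism lacks encryption support, including custom ones.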




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957898#comment-15957898
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109742481
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/AbstractServerConnection.java
 ---
@@ -110,7 +130,31 @@ public void changeHandlerTo(final RequestHandler 
handler) {
   }
 
   @Override
-  public void close() {
+  public void setEncrypted(boolean encrypted){
+throw new UnsupportedOperationException("Changing encryption setting 
on server connection is not permitted.");
+  }
+
+  @Override
+  public void setWrappedChunkSize(int chunkSize) {
+throw new UnsupportedOperationException("Changing WrappedChunkSize 
setting on server connection is not permitted.");
+  }
+
+
+  @Override
+  public void incConnectionCounter() {
+config.incConnectionCounter(isEncrypted());
+  }
+
+  @Override
+  public void decConnectionCounter() {
+config.decConnectionCounter(isEncrypted());
+  }
+
+  @Override
+  public void channelClosed(RpcException ex) {
+// This will be triggered from Netty when a channel is closed. We 
should cleanup here
--- End diff --

nice catch




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957905#comment-15957905
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r110038965
  
--- Diff: 
exec/rpc/src/main/java/org/apache/drill/exec/rpc/RemoteConnection.java ---
@@ -51,6 +51,22 @@
 
   SocketAddress getRemoteAddress();
 
+  void addSecurityHandlers();
+
+  boolean isEncrypted();
--- End diff --

Maybe `getEncryptionContext` to avoid adding setters and getters to this 
class if `EncryptionContext` is modified?
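
A rough sketch of what a single accessor could look like. All type names here (`EncryptionSettings`, `EncryptionAwareConnection`, `EncryptionDescriptions`) are hypothetical, and the two getters are assumptions modelled on the calls visible elsewhere in this pull request.

```java
/** Hypothetical read-only view of the encryption settings. */
interface EncryptionSettings {
  boolean isEncryptionEnabled();

  int getWrappedChunkSize();
}

/** Hypothetical connection type exposing one accessor instead of one getter per setting. */
interface EncryptionAwareConnection {
  EncryptionSettings getEncryptionContext();
}

final class EncryptionDescriptions {
  private EncryptionDescriptions() {}

  /** Callers reach every setting through the context object. */
  static String describe(EncryptionAwareConnection connection) {
    final EncryptionSettings settings = connection.getEncryptionContext();
    return settings.isEncryptionEnabled()
        ? "encrypted (chunk size " + settings.getWrappedChunkSize() + ")"
        : "unencrypted";
  }
}
```

With this shape, adding a field to the context would not require touching the connection interface.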




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957872#comment-15957872
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109960614
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/security/ServerAuthenticationHandler.java
 ---
@@ -208,8 +207,12 @@ void process(SaslResponseContext context) throws 
Exception {
 
 handleSuccess(context, challenge, saslServer);
   } else {
-logger.info("Failed to authenticate client from {}", 
context.remoteAddress);
-throw new SaslException("Client allegedly succeeded 
authentication, but server did not. Suspicious?");
+final S connection = context.connection;
+logger.info("Failed to authenticate client from {} with encryption 
context:{}",
+  connection.getRemoteAddress().toString(),
--- End diff --

Leave the `context.remoteAddress` as is (here and below); I happened to 
notice that `.toString()` is expensive.

nit: why evaluate `getEncryptionString` every time? Cache that too.
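
A minimal sketch of the caching idea, assuming the description does not change for the lifetime of the connection. `CachedConnectionInfo` and `describe` are hypothetical names, and the lazy initialization shown is not thread-safe.

```java
import java.net.SocketAddress;

/** Hypothetical holder that builds the log description at most once. */
final class CachedConnectionInfo {
  private final SocketAddress remoteAddress;
  private final boolean encrypted;
  // Built lazily on first use; a real connection class would need to
  // consider thread safety, which this sketch ignores.
  private String description;

  CachedConnectionInfo(SocketAddress remoteAddress, boolean encrypted) {
    this.remoteAddress = remoteAddress;
    this.encrypted = encrypted;
  }

  /** Returns the same String instance on every call after the first. */
  String describe() {
    if (description == null) {
      description = remoteAddress + " [encrypted=" + encrypted + "]";
    }
    return description;
  }
}
```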




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957890#comment-15957890
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109745391
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/security/AuthenticationOutcomeListener.java
 ---
@@ -120,19 +121,22 @@ public void success(SaslMessage value, ByteBuf 
buffer) {
   new SaslException("Server sent a corrupt message.")));
 } else {
   try {
-final SaslChallengeContext context = new 
SaslChallengeContext(value, connection.getSaslClient(), ugi);
-
+final SaslChallengeContext context = new 
SaslChallengeContext<>(value, ugi, connection);
 final SaslMessage saslResponse = processor.process(context);
 
 if (saslResponse != null) {
   client.send(new AuthenticationOutcomeListener<>(client, 
connection, saslRpcType, ugi, completionListener),
   connection, saslRpcType, saslResponse, SaslMessage.class,
-  true /** the connection will not be backed up at this point 
*/);
+  true /* the connection will not be backed up at this point 
*/);
 } else {
   // success
   completionListener.success(null, null);
+  logger.trace("Successfully authenticated to server using {} 
mechanism and encryption context: {}",
--- End diff --

`if (logger.isTraceEnabled())`
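
The effect of such a guard can be sketched as follows. To stay self-contained, this uses `java.util.logging` as a stand-in for the logger in the diff, with `isLoggable(Level.FINEST)` playing the role of `isTraceEnabled()`. `GuardedTraceLogging` and its counter are hypothetical and exist only to show that the arguments are not evaluated when tracing is off.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical demonstration of guarding an expensive log argument. */
final class GuardedTraceLogging {
  // Counts how often the expensive description is actually built.
  static final AtomicInteger EXPENSIVE_CALLS = new AtomicInteger();

  private GuardedTraceLogging() {}

  static String expensiveDescription() {
    EXPENSIVE_CALLS.incrementAndGet();
    return "mechanism=KERBEROS, encrypted=true";
  }

  static void logSuccess(Logger logger) {
    // Without the guard, expensiveDescription() would run even when
    // the message is going to be discarded.
    if (logger.isLoggable(Level.FINEST)) {
      logger.log(Level.FINEST, "Successfully authenticated: {0}", expensiveDescription());
    }
  }
}
```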




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957894#comment-15957894
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r110022378
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/security/ServerAuthenticationHandler.java
 ---
@@ -251,25 +256,62 @@ void process(SaslResponseContext context) 
throws Exception {
   private static , T extends EnumLite>
   void handleSuccess(final SaslResponseContext context, final 
SaslMessage.Builder challenge,
  final SaslServer saslServer) throws IOException {
-context.connection.changeHandlerTo(context.requestHandler);
-context.connection.finalizeSaslSession();
-context.sender.send(new Response(context.saslResponseType, 
challenge.build()));
 
-// setup security layers here..
+final S connection = context.connection;
+connection.changeHandlerTo(context.requestHandler);
+connection.finalizeSaslSession();
+context.sender.send(new Response(context.saslResponseType, 
challenge.build()));
 
 if (logger.isTraceEnabled()) {
-  logger.trace("Authenticated {} successfully using {} from {}", 
saslServer.getAuthorizationID(),
-  saslServer.getMechanismName(), context.remoteAddress);
+  logger.trace("Authenticated {} successfully using {} from {} with 
encryption context {}",
+saslServer.getAuthorizationID(), saslServer.getMechanismName(), 
connection.getRemoteAddress().toString(),
+connection.getEncryptionString());
+}
+
+if (connection.isEncrypted()) {
+  try {
+// Check if connection was marked for being secure then verify for negotiated QOP value for correctness.
+final String negotiatedQOP = saslServer.getNegotiatedProperty(Sasl.QOP).toString();
+assert (negotiatedQOP.equals(SaslProperties.QualityOfProtection.PRIVACY.getSaslQop()));
--- End diff --

This should be an exception. Please correct other occurrences too.
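
Java assertions are disabled unless the JVM is started with `-ea`, so an `assert` would normally be skipped in production. A minimal sketch of the exception-based alternative follows; `QopVerifier` and `verifyPrivacy` are hypothetical names, and the `"auth-conf"` literal mirrors the `PRIVACY` value shown in `SaslProperties`.

```java
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslException;

/** Hypothetical check that fails with an exception instead of an assert. */
final class QopVerifier {
  // SASL quality-of-protection value for authentication with confidentiality.
  static final String PRIVACY_QOP = "auth-conf";

  private QopVerifier() {}

  /**
   * @param negotiatedQop value returned by getNegotiatedProperty(Sasl.QOP); may be null
   * @throws SaslException if the negotiated QOP is missing or is not auth-conf
   */
  static void verifyPrivacy(Object negotiatedQop) throws SaslException {
    if (negotiatedQop == null || !PRIVACY_QOP.equals(negotiatedQop.toString())) {
      throw new SaslException(String.format(
          "Expected negotiated %s to be '%s' but was '%s'", Sasl.QOP, PRIVACY_QOP, negotiatedQop));
    }
  }
}
```

The null check also covers the case where the property was never negotiated, which would otherwise surface as a `NullPointerException` from `.toString()`.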




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957902#comment-15957902
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109742969
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/BitConnectionConfig.java 
---
@@ -46,16 +47,40 @@ protected BitConnectionConfig(BufferAllocator 
allocator, BootStrapContext contex
 super(allocator, context);
 
 final DrillConfig config = context.getConfig();
+final AuthenticatorProvider authProvider = getAuthProvider();
+
 if (config.getBoolean(ExecConstants.BIT_AUTHENTICATION_ENABLED)) {
   this.authMechanismToUse = 
config.getString(ExecConstants.BIT_AUTHENTICATION_MECHANISM);
   try {
-getAuthProvider().getAuthenticatorFactory(authMechanismToUse);
+authProvider.getAuthenticatorFactory(authMechanismToUse);
   } catch (final SaslException e) {
 throw new DrillbitStartupException(String.format(
 "'%s' mechanism not found for bit-to-bit authentication. 
Please check authentication configuration.",
 authMechanismToUse));
   }
-  logger.info("Configured bit-to-bit connections to require 
authentication using: {}", authMechanismToUse);
+
+  // Update encryption related configurations
+  encryptionContext.setEncryption(config.getBoolean(ExecConstants.BIT_SASL_ENCRYPTION_ENABLED));
+
+  int maxEncodeSize = config.getInt(ExecConstants.BIT_SASL_ENCRYPTION_ENCODESIZE);
+
+  if(maxEncodeSize > RpcConstants.MAX_WRAP_SIZE) {
+logger.warn("Setting bit.sasl.encryption.encodesize to maximum allowed value of 16MB");
+maxEncodeSize = RpcConstants.MAX_WRAP_SIZE;
+  }
+  encryptionContext.setWrappedChunkSize(maxEncodeSize);
--- End diff --

I have difficulty understanding what these sizes mean. I could at least 
classify them as related, but how are they related? Better names, maybe?
+ "maxEncodeSize", "ENCODESIZE", "WrappedChunkSize", "MAX_WRAP_SIZE"
+ "RawSendSize", "RawWrapSendSize", "MaxRawWrapSendSize", "WRAP_RAW_SEND_SIZE"

Sometimes "max" is not necessarily a maximum, e.g. `maxEncodeSize`, and 
`setRawWrapSendSize` sets `MaxRawWrapSendSize`.

I noticed only "ENCODESIZE" is configurable through drill-override.conf. 
Are the others not configurable, at connection time, for example?
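
As I read the hunk, the configured encode size is an upper-bounded request rather than a maximum in its own right. A small sketch of that relationship follows, with hypothetical names. The 16 MB value is an assumption taken from the warning message in the diff; the actual value of `RpcConstants.MAX_WRAP_SIZE` is not shown here.

```java
/** Hypothetical illustration of how the configured size is capped. */
final class EncodeSizeClamp {
  // Assumed cap of 16 MB, inferred from the log message in the diff.
  static final int MAX_WRAP_SIZE = 16 * 1024 * 1024;

  private EncodeSizeClamp() {}

  /** The effective wrapped chunk size is the configured value, capped at MAX_WRAP_SIZE. */
  static int effectiveWrappedChunkSize(int configuredEncodeSize) {
    return Math.min(configuredEncodeSize, MAX_WRAP_SIZE);
  }
}
```

A name such as `configuredEncodeSize` for the input and `effectiveWrappedChunkSize` for the result might make the direction of the relationship clearer.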




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957892#comment-15957892
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r110036482
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/DrillRoot.java 
---
@@ -61,6 +63,10 @@ public ClusterInfo getClusterInfoJSON() {
 final DrillbitEndpoint currentDrillbit = 
work.getContext().getEndpoint();
 final String currentVersion = currentDrillbit.getVersion();
 
+final DrillConfig config = work.getContext().getConfig();
+final boolean clientEncryptionEnabled = 
config.getBoolean(ExecConstants.USER_SASL_ENCRYPTION_ENABLED);
--- End diff --

For consistency, use `userEncryptionEnabled` and `bitEncryptionEnabled`




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957896#comment-15957896
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109957575
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/security/AuthenticationOutcomeListener.java
 ---
@@ -243,4 +247,43 @@ public SaslMessage process(SaslChallengeContext 
context) throws Exception {
   }
 }
   }
+
+  private static void handleSuccess(SaslChallengeContext context) throws 
SaslException {
+final ClientConnection connection = context.connection;
+final SaslClient saslClient = connection.getSaslClient();
+
+if (connection.isEncrypted()) {
+  try {
+// Check if connection was marked for being secure then verify for 
negotiated QOP value for
+// correctness.
+final String negotiatedQOP = 
saslClient.getNegotiatedProperty(Sasl.QOP).toString();
+assert 
(negotiatedQOP.equals(SaslProperties.QualityOfProtection.PRIVACY.getSaslQop()));
+
+// Update the rawWrapChunkSize with the negotiated buffer size 
since we cannot call encode with more than
+// negotiated size of buffer.
+final int negotiatedRawSendSize = Integer.parseInt(saslClient
+
.getNegotiatedProperty(SaslProperties.WRAP_RAW_SEND_SIZE)
+.toString());
+if(negotiatedRawSendSize <= 0) {
+  throw new SaslException(String.format("Negotiated rawSendSize: 
%d is invalid. Please check the configured " +
+  "value of sasl.encryption.encodesize. It might be configured 
to a very small value.",
+negotiatedRawSendSize));
+}
+connection.setRawWrapSendSize(negotiatedRawSendSize);
+connection.addSecurityHandlers();
+  } catch (Exception e) {
+throw new SaslException(String.format("Unexpected failure while 
retrieving negotiated property values (%s)",
+  e.getMessage()), e);
+  }
+} else {
+  // Encryption is not required hence we don't need to hold on to 
saslClient object.
+  if (saslClient != null) {
--- End diff --

Although `dispose` is documented to be idempotent, should `dispose` be 
called only once (at connection closure) to keep this logic in one place?
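
A sketch of keeping disposal in one place, under assumptions: `SaslSessionHolder` is a hypothetical name, and a `Runnable` stands in for `saslClient.dispose()` (the real method declares `SaslException`, which a caller would have to handle).

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical owner of a SASL session that disposes it exactly once, on close. */
final class SaslSessionHolder implements AutoCloseable {
  private final Runnable disposer; // stand-in for saslClient.dispose()
  private final AtomicBoolean closed = new AtomicBoolean(false);

  SaslSessionHolder(Runnable disposer) {
    this.disposer = disposer;
  }

  /** Safe to call repeatedly; the disposer runs only on the first call. */
  @Override
  public void close() {
    if (closed.compareAndSet(false, true)) {
      disposer.run();
    }
  }
}
```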




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957884#comment-15957884
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109729986
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java ---
@@ -116,6 +116,11 @@
   String BIT_AUTHENTICATION_ENABLED = 
"drill.exec.security.bit.auth.enabled";
   String BIT_AUTHENTICATION_MECHANISM = 
"drill.exec.security.bit.auth.mechanism";
   String USE_LOGIN_PRINCIPAL = 
"drill.exec.security.bit.auth.use_login_principal";
+  String USER_SASL_ENCRYPTION_ENABLED = 
"drill.exec.security.user.sasl.encryption.enabled";
--- End diff --

How about using "drill.exec.security.user.**encryption.sasl**.enabled" 
instead of "drill.exec.security.user.**sasl.encryption**.enabled" to be 
consistent with "auth" (and similarly below)?




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957882#comment-15957882
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109989384
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/security/SaslProperties.java
 ---
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.rpc.security;
+
+import javax.security.sasl.Sasl;
+import java.util.HashMap;
+import java.util.Map;
+
+public final class SaslProperties {
+
+  /**
+   * All supported Quality of Protection value which can be negotiated for
--- End diff --

+ value**s**
+ negotiated ~~for~~.




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957868#comment-15957868
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109956559
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/security/AuthenticationOutcomeListener.java
 ---
@@ -243,4 +247,43 @@ public SaslMessage process(SaslChallengeContext 
context) throws Exception {
   }
 }
   }
+
+  private static void handleSuccess(SaslChallengeContext context) throws 
SaslException {
+final ClientConnection connection = context.connection;
+final SaslClient saslClient = connection.getSaslClient();
+
+if (connection.isEncrypted()) {
+  try {
+// Check if connection was marked for being secure then verify for 
negotiated QOP value for
+// correctness.
+final String negotiatedQOP = 
saslClient.getNegotiatedProperty(Sasl.QOP).toString();
+assert 
(negotiatedQOP.equals(SaslProperties.QualityOfProtection.PRIVACY.getSaslQop()));
+
+// Update the rawWrapChunkSize with the negotiated buffer size 
since we cannot call encode with more than
+// negotiated size of buffer.
+final int negotiatedRawSendSize = Integer.parseInt(saslClient
+
.getNegotiatedProperty(SaslProperties.WRAP_RAW_SEND_SIZE)
--- End diff --

Why not use `Sasl.RAW_SEND_SIZE`?
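
For reference, the JDK constant `Sasl.RAW_SEND_SIZE` holds the same string, `"javax.security.sasl.rawsendsize"`, so the custom constant appears redundant. A minimal sketch of parsing the negotiated value with the JDK constant follows; `NegotiatedSendSize` and `parse` are hypothetical names.

```java
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslException;

/** Hypothetical parser for the negotiated raw send size. */
final class NegotiatedSendSize {
  private NegotiatedSendSize() {}

  /**
   * @param negotiatedValue value returned by getNegotiatedProperty(Sasl.RAW_SEND_SIZE); may be null
   * @return the negotiated size as a positive int
   * @throws SaslException if the value is missing, not a number, or not positive
   */
  static int parse(Object negotiatedValue) throws SaslException {
    if (negotiatedValue == null) {
      throw new SaslException(Sasl.RAW_SEND_SIZE + " was not negotiated");
    }
    final int size;
    try {
      size = Integer.parseInt(negotiatedValue.toString());
    } catch (NumberFormatException e) {
      throw new SaslException(
          "Negotiated " + Sasl.RAW_SEND_SIZE + " is not a number: " + negotiatedValue, e);
    }
    if (size <= 0) {
      throw new SaslException("Negotiated " + Sasl.RAW_SEND_SIZE + " is invalid: " + size);
    }
    return size;
  }
}
```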




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957883#comment-15957883
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109989548
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/security/SaslProperties.java
 ---
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.rpc.security;
+
+import javax.security.sasl.Sasl;
+import java.util.HashMap;
+import java.util.Map;
+
+public final class SaslProperties {
+
+  /**
+   * All supported Quality of Protection value which can be negotiated for
+   */
+  enum QualityOfProtection {
+AUTHENTICATION("auth"),
+INTEGRITY("auth-int"),
+PRIVACY("auth-conf");
+
+public final String saslQop;
+
+QualityOfProtection(String saslQop) {
+  this.saslQop = saslQop;
+}
+
+public String getSaslQop() {
+  return saslQop;
+}
+  }
+
+  static final String WRAP_RAW_SEND_SIZE = "javax.security.sasl.rawsendsize";
--- End diff --

Not required. Use `Sasl.RAW_SEND_SIZE`
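
To illustrate the point, here is a minimal sketch (not code from the PR) that compares the literal from the diff with the JDK constant. The class name `RawSendSizeCheck` is hypothetical; `Sasl.RAW_SEND_SIZE` is the standard constant in `javax.security.sasl.Sasl`.

```java
import javax.security.sasl.Sasl;

// Illustrative check only: the JDK constant already carries the property
// name that WRAP_RAW_SEND_SIZE hard-codes in the diff.
final class RawSendSizeCheck {
  // Copied from the diff for comparison.
  static final String WRAP_RAW_SEND_SIZE = "javax.security.sasl.rawsendsize";

  static boolean sameKey() {
    return Sasl.RAW_SEND_SIZE.equals(WRAP_RAW_SEND_SIZE);
  }

  public static void main(String[] args) {
    System.out.println(Sasl.RAW_SEND_SIZE + " matches: " + sameKey());
  }
}
```

If the two strings match, callers can read the negotiated value with `getNegotiatedProperty(Sasl.RAW_SEND_SIZE)` and the extra constant can be dropped.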




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957873#comment-15957873
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109969009
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/user/UserConnectionConfig.java
 ---
@@ -34,32 +39,81 @@
 
   private final UserServerRequestHandler handler;
 
+  // Total number of external DrillClient connection's on this server.
+  private static final Counter secureUserConnections = DrillMetrics.getRegistry()
+.counter("drill.user.encrypted.connections");
+
+  private static final Counter insecureUserConnections = DrillMetrics.getRegistry()
+.counter("drill.user.unencrypted.connections");
+
   UserConnectionConfig(BufferAllocator allocator, BootStrapContext context, UserServerRequestHandler handler)
-  throws DrillbitStartupException {
+throws DrillbitStartupException {
 super(allocator, context);
 this.handler = handler;
 
-if (context.getConfig().getBoolean(ExecConstants.USER_AUTHENTICATION_ENABLED)) {
-  if (getAuthProvider().getAllFactoryNames().isEmpty()) {
+final DrillConfig config = context.getConfig();
+final AuthenticatorProvider authProvider = getAuthProvider();
+
+if (config.getBoolean(ExecConstants.USER_AUTHENTICATION_ENABLED)) {
+  if (authProvider.getAllFactoryNames().isEmpty()) {
 throw new DrillbitStartupException("Authentication enabled, but no mechanisms found. Please check " +
-"authentication configuration.");
+  "authentication configuration.");
--- End diff --

(nit: continuing lines have four spaces, just so your IDE formatter does 
not conflict with others' in the future; I faced the same problem)




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957877#comment-15957877
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109975904
  
--- Diff: 
exec/rpc/src/main/java/org/apache/drill/exec/rpc/SaslDecryptionHandler.java ---
@@ -0,0 +1,158 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.rpc;
+
+import io.netty.buffer.ByteBuf;
+import io.netty.channel.ChannelHandlerContext;
+import io.netty.handler.codec.MessageToMessageDecoder;
+
+import org.apache.drill.exec.exception.OutOfMemoryException;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.ByteOrder;
+import java.util.List;
+/**
+ * Handler to Decrypt the input ByteBuf. It expects input to be in format where it has length of the bytes to
+ * decode in network order and actual encrypted bytes. The handler reads the length and then reads the
+ * required bytes to pass it to unwrap function for decryption. The decrypted buffer is copied to a new
+ * ByteBuf and added to out list.
+ * 
+ * Example:
+ * Input - [EBLN1, EB1, EBLN2, EB2] --> ByteBuf with repeated combination of encrypted byte length
+ * in network order (EBLNx) and encrypted bytes (EB)
+ * Output - [DB1] --> Decrypted ByteBuf of first chunk.(EB1)
+ * 
+ */
+class SaslDecryptionHandler extends MessageToMessageDecoder<ByteBuf> {
+
+  final org.slf4j.Logger logger;
--- End diff --

static?
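
For readers following the Javadoc in the diff above, the input format it describes (a length in network order followed by that many encrypted bytes, repeated) can be sketched with plain `java.nio`. This is only an illustration, not the handler from the PR: the class name is hypothetical, a 4-byte length prefix is assumed, and the SASL unwrap step is left out.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Splits [len1, bytes1, len2, bytes2, ...] into the individual byte chunks.
// Assumption: each length prefix is a 4-byte big-endian (network order) int.
final class LengthPrefixedChunksSketch {
  static List<byte[]> split(byte[] input) {
    ByteBuffer buf = ByteBuffer.wrap(input).order(ByteOrder.BIG_ENDIAN);
    List<byte[]> chunks = new ArrayList<>();
    while (buf.remaining() >= Integer.BYTES) {
      int length = buf.getInt();
      if (length < 0 || length > buf.remaining()) {
        throw new IllegalArgumentException("bad chunk length: " + length);
      }
      byte[] chunk = new byte[length];
      buf.get(chunk);
      // A real handler would pass each chunk to the SASL unwrap call here.
      chunks.add(chunk);
    }
    if (buf.hasRemaining()) {
      throw new IllegalArgumentException("trailing bytes without a full length prefix");
    }
    return chunks;
  }
}
```

In the real pipeline the `LengthFieldBasedFrameDecoder` added in this PR would presumably do the framing, so the decryption handler only sees complete chunks.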




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957875#comment-15957875
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109974828
  
--- Diff: 
exec/rpc/src/main/java/org/apache/drill/exec/rpc/AbstractRemoteConnection.java 
---
@@ -224,4 +237,67 @@ public void close() {
 }
   }
 
+  /**
+   * Helps to add all the required security handler's after negotiation for encryption is completed.
+   * 
+   *  Handler's that are added are:
+   *  SaslDecryptionHandler
+   *  LengthFieldBasedFrameDecoder Handler
+   *  SaslEncryptionHandler
+   *  ChunkCreationHandler
+   * 
+   * 
+   *  If encryption is enabled ChunkCreationHandler is always added irrespective of chunkMode enabled or not.
+   *  This helps to make a generic encryption handler.
+   * 
+   */
+  @Override
+  public void addSecurityHandlers() {
+
+final ChannelPipeline channelPipeline = getChannel().pipeline();
+channelPipeline.addFirst("SaslDecryptionHandler", new SaslDecryptionHandler(saslBackend, getWrappedChunkSize(),
+  OutOfMemoryHandler.DEFAULT_INSTANCE));
+
+channelPipeline.addFirst("Length-Decoder",
+  new LengthFieldBasedFrameDecoder(ByteOrder.BIG_ENDIAN, Integer.MAX_VALUE,
+RpcConstants.LENGTH_FIELD_OFFSET, RpcConstants.LENGTH_FIELD_LENGTH, RpcConstants.LENGTH_ADJUSTMENT,
+RpcConstants.INITIAL_BYTES_TO_STRIP, true));
+
+channelPipeline.addAfter("message-decoder", "SaslEncryptionHandler",
+  new SaslEncryptionHandler(saslBackend, encryptionContext.getMaxRawWrapSendSize(),
+OutOfMemoryHandler.DEFAULT_INSTANCE));
+
+channelPipeline.addAfter("SaslEncryptionHandler", "ChunkCreationHandler",
+  new ChunkCreationHandler("ChunkCreatorHandler", encryptionContext.getMaxRawWrapSendSize()));
+  }
+
+  public void setEncrypted(boolean encrypted) {
--- End diff --

maybe `getEncryptionContext` to avoid delegating setters and getters?
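
One possible shape for this suggestion, as a rough sketch rather than the PR's actual classes: the connection exposes a single `getEncryptionContext()` accessor and callers work with the context directly. Only `getMaxRawWrapSendSize()` appears in the diff; every other name below is assumed for illustration.

```java
// Hypothetical, simplified types. Only getMaxRawWrapSendSize() is taken from
// the diff; the remaining members are assumptions.
final class EncryptionContextSketch {
  static final class EncryptionContext {
    private boolean encrypted;
    private int maxRawWrapSendSize;

    boolean isEncrypted() { return encrypted; }
    void setEncrypted(boolean encrypted) { this.encrypted = encrypted; }
    int getMaxRawWrapSendSize() { return maxRawWrapSendSize; }
    void setMaxRawWrapSendSize(int size) { this.maxRawWrapSendSize = size; }
  }

  static final class Connection {
    private final EncryptionContext encryptionContext = new EncryptionContext();

    // Single accessor instead of one delegating setter/getter pair per field.
    EncryptionContext getEncryptionContext() { return encryptionContext; }
  }
}
```

A caller would then write `connection.getEncryptionContext().setEncrypted(true)`, and new encryption settings would not need matching pass-through methods on the connection.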




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957889#comment-15957889
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r110038091
  
--- Diff: 
exec/rpc/src/main/java/org/apache/drill/exec/rpc/AbstractRemoteConnection.java 
---
@@ -224,4 +237,67 @@ public void close() {
 }
   }
 
+  /**
+   * Helps to add all the required security handler's after negotiation for encryption is completed.
+   * 
+   *  Handler's that are added are:
--- End diff --

Please document the order of handlers before and after this method is 
invoked.
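
As an aid for writing that documentation, the ordering rules of `addFirst` and `addAfter` can be modelled with a plain list of handler names. This sketch does not use Netty. The four calls replay the ones in `addSecurityHandlers()` from the diff, but the initial handler names other than `message-decoder` are assumptions, as is the class name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

// Models only the ordering effect of addFirst/addAfter on handler names.
final class PipelineOrderSketch {
  private final List<String> names = new ArrayList<>();

  PipelineOrderSketch(List<String> initial) {
    names.addAll(initial);
  }

  void addFirst(String name) {
    names.add(0, name);
  }

  void addAfter(String base, String name) {
    int i = names.indexOf(base);
    if (i < 0) {
      throw new NoSuchElementException(base);
    }
    names.add(i + 1, name);
  }

  List<String> order() {
    return new ArrayList<>(names);
  }

  // Replays the four calls made by addSecurityHandlers() in the diff.
  static List<String> afterSecurityHandlers(List<String> initial) {
    PipelineOrderSketch p = new PipelineOrderSketch(initial);
    p.addFirst("SaslDecryptionHandler");
    p.addFirst("Length-Decoder");
    p.addAfter("message-decoder", "SaslEncryptionHandler");
    p.addAfter("SaslEncryptionHandler", "ChunkCreationHandler");
    return p.order();
  }
}
```

Starting from an assumed pipeline of `protocol-decoder`, `message-decoder`, `protocol-encoder`, the model yields `Length-Decoder`, `SaslDecryptionHandler`, `protocol-decoder`, `message-decoder`, `SaslEncryptionHandler`, `ChunkCreationHandler`, `protocol-encoder`. Note that the second `addFirst` places `Length-Decoder` ahead of `SaslDecryptionHandler`.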




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957870#comment-15957870
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109967124
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/security/kerberos/KerberosFactory.java
 ---
@@ -93,6 +94,7 @@ public UserGroupInformation createAndLoginUser(final Map properties)
   @Override
   public SaslServer createSaslServer(final UserGroupInformation ugi, final Map properties)
   throws SaslException {
+final String qopValue = properties.containsKey(Sasl.QOP) ? properties.get(Sasl.QOP).toString() : "null";
--- End diff --

change `null` to `auth`, here and below?
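
A minimal sketch of what that change could look like, assuming the properties map is keyed by the standard SASL property names. The helper name `qopOrDefault` is hypothetical. `"auth"` is the authentication-only QOP value and is documented as the default for `Sasl.QOP`.

```java
import java.util.Map;
import javax.security.sasl.Sasl;

// Falls back to "auth" (authentication only) when no QOP is configured,
// instead of the literal string "null" used in the diff.
final class QopDefaultSketch {
  static String qopOrDefault(Map<String, ?> properties) {
    Object qop = properties.get(Sasl.QOP);
    return qop != null ? qop.toString() : "auth";
  }
}
```

Compared with the `containsKey` check in the diff, this variant also covers a key that is present but mapped to `null`.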




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957881#comment-15957881
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109977762
  
--- Diff: 
common/src/main/java/org/apache/drill/common/config/DrillProperties.java ---
@@ -59,6 +60,12 @@
 
   public static final String KEYTAB = "keytab";
 
+  public static final String ENCRYPTION = "encryption";
--- End diff --

Maybe `... SASL_ENCRYPT = "sasl.encrypt"`?




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957891#comment-15957891
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r110036262
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/user/security/UserAuthenticatorFactory.java
 ---
@@ -46,6 +46,12 @@
*/
   public static UserAuthenticator createAuthenticator(final DrillConfig config, ScanResult scan)
   throws DrillbitStartupException {
+
+if(!config.hasPath(USER_AUTHENTICATOR_IMPL)) {
--- End diff --

good catch




[jira] [Commented] (DRILL-4335) Apache Drill should support network encryption

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957871#comment-15957871
 ] 

ASF GitHub Bot commented on DRILL-4335:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/773#discussion_r109963000
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/security/ServerAuthenticationHandler.java
 ---
@@ -251,25 +256,62 @@ void process(SaslResponseContext context) throws Exception {
   private static , T extends EnumLite>
   void handleSuccess(final SaslResponseContext context, final SaslMessage.Builder challenge,
  final SaslServer saslServer) throws IOException {
-context.connection.changeHandlerTo(context.requestHandler);
-context.connection.finalizeSaslSession();
-context.sender.send(new Response(context.saslResponseType, challenge.build()));
 
-// setup security layers here..
+final S connection = context.connection;
+connection.changeHandlerTo(context.requestHandler);
+connection.finalizeSaslSession();
+context.sender.send(new Response(context.saslResponseType, challenge.build()));
 
 if (logger.isTraceEnabled()) {
-  logger.trace("Authenticated {} successfully using {} from {}", saslServer.getAuthorizationID(),
-  saslServer.getMechanismName(), context.remoteAddress);
+  logger.trace("Authenticated {} successfully using {} from {} with encryption context {}",
+saslServer.getAuthorizationID(), saslServer.getMechanismName(), connection.getRemoteAddress().toString(),
+connection.getEncryptionString());
+}
+
+if (connection.isEncrypted()) {
+  try {
+// Check if connection was marked for being secure then verify for negotiated QOP value for correctness.
+final String negotiatedQOP = saslServer.getNegotiatedProperty(Sasl.QOP).toString();
+assert (negotiatedQOP.equals(SaslProperties.QualityOfProtection.PRIVACY.getSaslQop()));
+
+// Update the rawWrapSendSize with the negotiated rawSendSize since we cannot call encode with more than the
+// negotiated size of buffer
+final int negotiatedRawSendSize = Integer.parseInt(saslServer
+  .getNegotiatedProperty(SaslProperties.WRAP_RAW_SEND_SIZE)
+  .toString());
+if(negotiatedRawSendSize <= 0) {
+  throw new SaslException(String.format("Negotiated rawSendSize: %d is invalid. Please check the configured " +
+"value of sasl.encryption.encodesize. It might be configured to a very small value.",
+negotiatedRawSendSize));
+}
+connection.setRawWrapSendSize(negotiatedRawSendSize);
+connection.addSecurityHandlers();
+  } catch (IllegalStateException | NumberFormatException e) {
+throw new SaslException(String.format("Unexpected failure while retrieving negotiated property values (%s)",

This method does not follow [this contract](https://github.com/apache/drill/blob/master/exec/rpc/src/main/java/org/apache/drill/exec/rpc/RequestHandler.java#L34).
Somehow the order of `context.sender.send(...)` and exceptions thrown (here 
and below) needs to be fixed.
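
One way to read this comment, assuming the contract means a handler should either send a response or throw, but not both: run every check that can throw first, and send the success response last. The sketch below uses hypothetical stand-in types (`RecordingSender`, a string response) rather than Drill's classes, and it leaves open how the security handlers get installed around the send.

```java
import java.util.ArrayList;
import java.util.List;
import javax.security.sasl.SaslException;

final class SendAfterValidationSketch {
  // Hypothetical stand-in for the response sender; it records what was sent.
  static final class RecordingSender {
    final List<String> sent = new ArrayList<>();

    void send(String response) {
      sent.add(response);
    }
  }

  // Validates the negotiated size before sending, so a failure never follows
  // a response that has already gone out.
  static void handleSuccess(String negotiatedRawSendSize, RecordingSender sender)
      throws SaslException {
    final int size;
    try {
      size = Integer.parseInt(negotiatedRawSendSize);
    } catch (NumberFormatException e) {
      throw new SaslException("Unexpected negotiated rawSendSize: " + negotiatedRawSendSize, e);
    }
    if (size <= 0) {
      throw new SaslException("Negotiated rawSendSize: " + size + " is invalid");
    }
    sender.send("SASL_SUCCESS");
  }
}
```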




[jira] [Commented] (DRILL-5319) Refactor FragmentContext and OptionManager for unit testing

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957593#comment-15957593
 ] 

ASF GitHub Bot commented on DRILL-5319:
---

Github user gparai commented on a diff in the pull request:

https://github.com/apache/drill/pull/787#discussion_r109986447
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/TypeValidators.java
 ---
@@ -90,19 +90,19 @@ public void validate(final OptionValue v, final OptionManager manager) {
   }
 
   public static class MinRangeDoubleValidator extends RangeDoubleValidator {
-private final double min;
-private final double max;
+//private final double min;
--- End diff --

Please remove the comments here and below


> Refactor FragmentContext and OptionManager for unit testing
> ---
>
> Key: DRILL-5319
> URL: https://issues.apache.org/jira/browse/DRILL-5319
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Roll-up task for two refactorings, see the sub-tasks for details. This ticket 
> allows a single PR for the two different refactorings since the work heavily 
> overlaps. See DRILL-5320 and DRILL-5321 for details.





[jira] [Commented] (DRILL-5319) Refactor FragmentContext and OptionManager for unit testing

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957592#comment-15957592
 ] 

ASF GitHub Bot commented on DRILL-5319:
---

Github user gparai commented on a diff in the pull request:

https://github.com/apache/drill/pull/787#discussion_r109982950
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/OptionSet.java
 ---
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.server.options;
+
+/**
+ * Immutable set of options accessible by name or validator.
+ */
+
+public interface OptionSet {
--- End diff --

Maybe rename OptionSet to OptionReader. An OptionManager would then 
extend both the OptionReader and OptionModifier interfaces.
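
A rough sketch of the proposed split, for illustration only. The interface names come from this comment; the method names and the in-memory implementation are assumptions, and the `OptionManager` here is a simplified stand-in, not Drill's actual interface.

```java
import java.util.HashMap;
import java.util.Map;

// Read-only view of the options (method name is hypothetical).
interface OptionReader {
  Object getOption(String name);
}

// Write-side operations (method name is hypothetical).
interface OptionModifier {
  void setOption(String name, Object value);
}

// Simplified stand-in: the full manager is both reader and modifier.
interface OptionManager extends OptionReader, OptionModifier {
}

final class InMemoryOptionManager implements OptionManager {
  private final Map<String, Object> values = new HashMap<>();

  @Override
  public Object getOption(String name) {
    return values.get(name);
  }

  @Override
  public void setOption(String name, Object value) {
    values.put(name, value);
  }
}
```

Code that only needs to look up options, such as operators under unit test, could then depend on `OptionReader` alone.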




[jira] [Commented] (DRILL-5319) Refactor FragmentContext and OptionManager for unit testing

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957594#comment-15957594
 ] 

ASF GitHub Bot commented on DRILL-5319:
---

Github user gparai commented on a diff in the pull request:

https://github.com/apache/drill/pull/787#discussion_r110018766
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentExecContext.java 
---
@@ -0,0 +1,49 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.ops;
+
+import java.io.IOException;
+import java.util.List;
+
+import org.apache.drill.common.config.DrillConfig;
+import org.apache.drill.exec.exception.ClassTransformationException;
+import org.apache.drill.exec.expr.ClassGenerator;
+import org.apache.drill.exec.expr.CodeGenerator;
+import org.apache.drill.exec.expr.fn.FunctionImplementationRegistry;
+import org.apache.drill.exec.server.options.OptionSet;
+import org.apache.drill.exec.testing.ExecutionControls;
+
+public interface FragmentExecContext {
+  FunctionImplementationRegistry getFunctionRegistry();
+  OptionSet getOptionSet();
+
+  <T> T getImplementationClass(final ClassGenerator<T> cg)
--- End diff --

Good to have comments describing the interface methods




[jira] [Commented] (DRILL-5415) Improve Fixture Builder to configure client properties and keep collection type properties for server

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957348#comment-15957348
 ] 

ASF GitHub Bot commented on DRILL-5415:
---

GitHub user sohami opened a pull request:

https://github.com/apache/drill/pull/807

DRILL-5415: Improve Fixture Builder to configure client properties an…

…d keep collection type properties for server

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sohami/drill DRILL-5415

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/807.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #807


commit 1608487142f6d1f22e70eed00b508b8006200325
Author: Sorabh Hamirwasia 
Date:   2017-04-05T18:04:58Z

DRILL-5415: Improve Fixture Builder to configure client properties and keep 
collection type properties for server




> Improve Fixture Builder to configure client properties and keep collection 
> type properties for server
> -
>
> Key: DRILL-5415
> URL: https://issues.apache.org/jira/browse/DRILL-5415
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Sorabh Hamirwasia
>Assignee: Sorabh Hamirwasia
>Priority: Minor
> Fix For: 1.11.0
>
>
> This pull request makes two improvements.
> 1) The Fixture Builder framework converts all Drillbit config properties to 
> strings. However, some authentication settings (like auth.mechanism) are 
> expected to be lists, so the type check fails. The change keeps 
> collection-type config values as they are and inserts them after the string 
> values have been inserted.
> 2) When the Fixture Builder framework builds, it tries to apply any system 
> or session options that were set, and it creates a default client to do so. 
> On a cluster with authentication enabled, this default client provides no 
> connection parameters for authentication and fails to connect. The change 
> lets the Fixture Builder accept client properties as well, so that they can 
> be used when creating the default client that connects to the cluster.





[jira] [Created] (DRILL-5415) Improve Fixture Builder to configure client properties and keep collection type properties for server

2017-04-05 Thread Sorabh Hamirwasia (JIRA)
Sorabh Hamirwasia created DRILL-5415:


 Summary: Improve Fixture Builder to configure client properties 
and keep collection type properties for server
 Key: DRILL-5415
 URL: https://issues.apache.org/jira/browse/DRILL-5415
 Project: Apache Drill
  Issue Type: Improvement
  Components: Tools, Build & Test
Affects Versions: 1.11.0
Reporter: Sorabh Hamirwasia
Assignee: Sorabh Hamirwasia
Priority: Minor
 Fix For: 1.11.0


This pull request makes two improvements.
1) The Fixture Builder framework converts all Drillbit config properties to 
strings. However, some authentication settings (like auth.mechanism) are 
expected to be lists, so the type check fails. The change keeps 
collection-type config values as they are and inserts them after the string 
values have been inserted.
2) When the Fixture Builder framework builds, it tries to apply any system or 
session options that were set, and it creates a default client to do so. On a 
cluster with authentication enabled, this default client provides no 
connection parameters for authentication and fails to connect. The change lets 
the Fixture Builder accept client properties as well, so that they can be used 
when creating the default client that connects to the cluster.
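
The first improvement can be sketched as follows. This is an illustration under assumptions, not the Fixture Builder code: the class and method names are hypothetical, and the property map stands in for whatever structure the builder actually uses.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

final class ConfigPropsSketch {
  // Converts scalar values to strings, keeps Collection values untouched, and
  // inserts the collections after the string-typed entries.
  static Map<String, Object> normalize(Map<String, Object> props) {
    Map<String, Object> out = new LinkedHashMap<>();
    Map<String, Object> collections = new LinkedHashMap<>();
    for (Map.Entry<String, Object> e : props.entrySet()) {
      if (e.getValue() instanceof Collection) {
        collections.put(e.getKey(), e.getValue());
      } else {
        out.put(e.getKey(), String.valueOf(e.getValue()));
      }
    }
    out.putAll(collections);
    return out;
  }
}
```

With this shape, a list-valued setting such as the authentication mechanisms stays a list and passes a list type check, while scalar settings are still stringified as before.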





[jira] [Commented] (DRILL-3562) Query fails when using flatten on JSON data where some documents have an empty array

2017-04-05 Thread Rahul Challapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957331#comment-15957331
 ] 

Rahul Challapalli commented on DRILL-3562:
--

Thanks for the analysis [~arina]

> Query fails when using flatten on JSON data where some documents have an 
> empty array
> 
>
> Key: DRILL-3562
> URL: https://issues.apache.org/jira/browse/DRILL-3562
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Affects Versions: 1.1.0
>Reporter: Philip Deegan
>Assignee: Serhii Harnyk
> Fix For: 1.10.0
>
>
> Drill query fails when using flatten when some records contain an empty array 
> {noformat}
> SELECT COUNT(*) FROM (SELECT FLATTEN(t.a.b.c) AS c FROM dfs.`flat.json` t) 
> flat WHERE flat.c.d.e = 'f' limit 1;
> {noformat}
> Succeeds on 
> { "a": { "b": { "c": [  { "d": {  "e": "f" } } ] } } }
> Fails on
> { "a": { "b": { "c": [] } } }
> Error
> {noformat}
> Error: SYSTEM ERROR: ClassCastException: Cannot cast 
> org.apache.drill.exec.vector.NullableIntVector to 
> org.apache.drill.exec.vector.complex.RepeatedValueVector
> {noformat}
> Is it possible to ignore the empty arrays, or do they need to be populated 
> with dummy data?
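
To make the question concrete, here is a small Python sketch of one plausible flatten semantics, in which a record with an empty array simply contributes no output rows. This is an assumption for illustration, not a statement of how Drill implements FLATTEN; the flatten helper below is hypothetical, and the two records are the ones from the report.

```python
def flatten(records, path):
    """Yield one output row per array element found at `path`.

    A record whose array is empty contributes no rows instead of failing.
    """
    for record in records:
        value = record
        for key in path:
            value = value[key]
        for element in value:
            yield element


records = [
    {"a": {"b": {"c": [{"d": {"e": "f"}}]}}},
    {"a": {"b": {"c": []}}},
]

# Equivalent of: SELECT COUNT(*) ... WHERE flat.c.d.e = 'f'
rows = [c for c in flatten(records, ["a", "b", "c"]) if c["d"]["e"] == "f"]
count = len(rows)
```

Under this reading no dummy data is needed: the empty array yields nothing and the count covers only the first record.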





[jira] [Commented] (DRILL-5032) Drill query on hive parquet table failed with OutOfMemoryError: Java heap space

2017-04-05 Thread Rahul Challapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957320#comment-15957320
 ] 

Rahul Challapalli commented on DRILL-5032:
--

Verified and automated a testcase

> Drill query on hive parquet table failed with OutOfMemoryError: Java heap 
> space
> ---
>
> Key: DRILL-5032
> URL: https://issues.apache.org/jira/browse/DRILL-5032
> Project: Apache Drill
> Issue Type: Bug
> Components: Functions - Hive
> Affects Versions: 1.8.0
> Reporter: Serhii Harnyk
> Assignee: Serhii Harnyk
> Fix For: 1.10.0
>
> Attachments: plan, plan with fix
>
>
> Following query on hive parquet table failed with OOM Java heap space:
> {code}
> select distinct(businessdate) from vmdr_trades where trade_date='2016-04-12'
> 2016-08-31 08:02:03,597 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 283938c3-fde8-0fc6-37e1-9a568c7f5913: select distinct(businessdate) from 
> vmdr_trades where trade_date='2016-04-12'
> 2016-08-31 08:05:58,502 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] INFO  
> o.a.d.e.p.l.partition.PruneScanRule - Beginning partition pruning, pruning 
> class: 
> org.apache.drill.exec.planner.sql.logical.HivePushPartitionFilterIntoScan$2
> 2016-08-31 08:05:58,506 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] INFO  
> o.a.d.e.p.l.partition.PruneScanRule - Total elapsed time to build and analyze 
> filter tree: 1 ms
> 2016-08-31 08:05:58,506 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] INFO  
> o.a.d.e.p.l.partition.PruneScanRule - No conditions were found eligible for 
> partition pruning.Total pruning elapsed time: 3 ms
> 2016-08-31 08:05:58,663 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] INFO  
> o.a.d.e.p.l.partition.PruneScanRule - Beginning partition pruning, pruning 
> class: 
> org.apache.drill.exec.planner.sql.logical.HivePushPartitionFilterIntoScan$2
> 2016-08-31 08:05:58,663 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] INFO  
> o.a.d.e.p.l.partition.PruneScanRule - Total elapsed time to build and analyze 
> filter tree: 0 ms
> 2016-08-31 08:05:58,663 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] INFO  
> o.a.d.e.p.l.partition.PruneScanRule - No conditions were found eligible for 
> partition pruning.Total pruning elapsed time: 0 ms
> 2016-08-31 08:05:58,664 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] INFO  
> o.a.d.e.p.l.partition.PruneScanRule - Beginning partition pruning, pruning 
> class: 
> org.apache.drill.exec.planner.sql.logical.HivePushPartitionFilterIntoScan$1
> 2016-08-31 08:05:58,665 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] INFO  
> o.a.d.e.p.l.partition.PruneScanRule - Total elapsed time to build and analyze 
> filter tree: 0 ms
> 2016-08-31 08:05:58,665 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] INFO  
> o.a.d.e.p.l.partition.PruneScanRule - No conditions were found eligible for 
> partition pruning.Total pruning elapsed time: 0 ms
> 2016-08-31 08:09:42,355 [283938c3-fde8-0fc6-37e1-9a568c7f5913:foreman] ERROR 
> o.a.drill.common.CatastrophicFailure - Catastrophic Failure Occurred, 
> exiting. Information message: Unable to handle out of memory condition in 
> Foreman.
> java.lang.OutOfMemoryError: Java heap space
> at java.util.Arrays.copyOf(Arrays.java:3332) ~[na:1.8.0_74]
> at 
> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
>  ~[na:1.8.0_74]
> at 
> java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
>  ~[na:1.8.0_74]
> at 
> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:421) 
> ~[na:1.8.0_74]
> at java.lang.StringBuilder.append(StringBuilder.java:136) 
> ~[na:1.8.0_74]
> at java.lang.StringBuilder.append(StringBuilder.java:76) 
> ~[na:1.8.0_74]
> at 
> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:457) 
> ~[na:1.8.0_74]
> at java.lang.StringBuilder.append(StringBuilder.java:166) 
> ~[na:1.8.0_74]
> at java.lang.StringBuilder.append(StringBuilder.java:76) 
> ~[na:1.8.0_74]
> at 
> com.google.protobuf.TextFormat$TextGenerator.write(TextFormat.java:538) 
> ~[protobuf-java-2.5.0.jar:na]
> at 
> com.google.protobuf.TextFormat$TextGenerator.print(TextFormat.java:526) 
> ~[protobuf-java-2.5.0.jar:na]
> at 
> com.google.protobuf.TextFormat$Printer.printFieldValue(TextFormat.java:389) 
> ~[protobuf-java-2.5.0.jar:na]
> at 
> com.google.protobuf.TextFormat$Printer.printSingleField(TextFormat.java:327) 
> ~[protobuf-java-2.5.0.jar:na]
> at 
> com.google.protobuf.TextFormat$Printer.printField(TextFormat.java:286) 
> ~[protobuf-java-2.5.0.jar:na]
> at 

[jira] [Commented] (DRILL-4847) Window function query results in OOM Exception.

2017-04-05 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957187#comment-15957187
 ] 

Paul Rogers commented on DRILL-4847:


If the new sort is in your build (are you using the correct branch?) and the 
settings are correct, it should work.

Note that you have to enable the sort both in the config file AND as a session 
option. The session option has no effect if the sort is turned off at the 
config level. This was a safety feature in 1.10, to be removed at some future 
time.
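
The gating described above can be sketched as follows. This is only a Python model of the rule, not Drill code; the function and parameter names are hypothetical, and the real option names are not given in this thread.

```python
def managed_sort_enabled(config_enabled, session_enabled):
    """The session option only takes effect when the config-level switch is on."""
    return config_enabled and session_enabled


# Truth table for the two switches (names are illustrative).
table = {
    (config, session): managed_sort_enabled(config, session)
    for config in (False, True)
    for session in (False, True)
}
```

Only the (True, True) combination selects the new sort; setting the session option alone leaves the old sort in use.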

> Window function query results in OOM Exception.
> ---
>
> Key: DRILL-4847
> URL: https://issues.apache.org/jira/browse/DRILL-4847
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow
> Affects Versions: 1.8.0
> Environment: 4 node cluster CentOS
> Reporter: Khurram Faraaz
> Assignee: Paul Rogers
> Priority: Critical
> Labels: window_function
> Attachments: drillbit.log
>
>
> Window function query results in OOM Exception.
> Drill version 1.8.0-SNAPSHOT git commit ID: 38ce31ca
> MapRBuildVersion 5.1.0.37549.GA
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> SELECT clientname, audiencekey, spendprofileid, 
> postalcd, provincecd, provincename, postalcode_json, country_json, 
> province_json, town_json, dma_json, msa_json, ROW_NUMBER() OVER (PARTITION BY 
> spendprofileid  ORDER BY (CASE WHEN postalcd IS NULL THEN 9 ELSE 0 END) ASC, 
> provincecd ASC) as rn FROM `MD593.parquet` limit 3;
> Error: RESOURCE ERROR: One or more nodes ran out of memory while executing 
> the query.
> Failure while allocating buffer.
> Fragment 0:0
> [Error Id: 2287fe71-f0cb-469a-a563-11580fceb1c5 on centos-01.qa.lab:31010] 
> (state=,code=0)
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 2016-08-16 07:25:44,590 [284d4006-9f9d-b893-9352-4f54f9b1d52a:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 284d4006-9f9d-b893-9352-4f54f9b1d52a: SELECT clientname, audiencekey, 
> spendprofileid, postalcd, provincecd, provincename, postalcode_json, 
> country_json, province_json, town_json, dma_json, msa_json, ROW_NUMBER() OVER 
> (PARTITION BY spendprofileid  ORDER BY (CASE WHEN postalcd IS NULL THEN 9 
> ELSE 0 END) ASC, provincecd ASC) as rn FROM `MD593.parquet` limit 3
> ...
> 2016-08-16 07:25:46,273 [284d4006-9f9d-b893-9352-4f54f9b1d52a:frag:0:0] INFO  
> o.a.d.e.p.i.xsort.ExternalSortBatch - Completed spilling to 
> /tmp/drill/spill/284d4006-9f9d-b893-9352-4f54f9b1d52a_majorfragment0_minorfragment0_operator8/2
> 2016-08-16 07:25:46,283 [284d4006-9f9d-b893-9352-4f54f9b1d52a:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - User Error Occurred
> org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more 
> nodes ran out of memory while executing the query.
> Failure while allocating buffer.
> [Error Id: 2287fe71-f0cb-469a-a563-11580fceb1c5 ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543)
>  ~[drill-common-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:242)
>  [drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_101]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_101]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_101]
> Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Failure 
> while allocating buffer.
> at 
> org.apache.drill.exec.vector.NullableVarCharVector.allocateNew(NullableVarCharVector.java:187)
>  ~[vector-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.vector.complex.RepeatedMapVector$RepeatedMapTransferPair.<init>(RepeatedMapVector.java:331)
>  ~[vector-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.vector.complex.RepeatedMapVector$RepeatedMapTransferPair.<init>(RepeatedMapVector.java:307)
>  ~[vector-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.vector.complex.RepeatedMapVector.getTransferPair(RepeatedMapVector.java:161)
>  ~[vector-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.SimpleVectorWrapper.cloneAndTransfer(SimpleVectorWrapper.java:66)
>  ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.VectorContainer.cloneAndTransfer(VectorContainer.java:204)
>  ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> 

[jira] [Commented] (DRILL-5377) Drill returns weird characters when parquet date auto-correction is turned off

2017-04-05 Thread Vitalii Diravka (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957116#comment-15957116
 ] 

Vitalii Diravka commented on DRILL-5377:


[~rkins] Could you verify the above use case?

> Drill returns weird characters when parquet date auto-correction is turned off
> --
>
> Key: DRILL-5377
> URL: https://issues.apache.org/jira/browse/DRILL-5377
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - Parquet
> Affects Versions: 1.10.0
> Reporter: Rahul Challapalli
>
> git.commit.id.abbrev=38ef562
> Below is the output I get from the test framework when I disable 
> auto-correction for date fields
> {code}
> select l_shipdate from table(cp.`tpch/lineitem.parquet` (type => 'parquet', 
> autoCorrectCorruptDates => false)) order by l_shipdate limit 10;
> ^@356-03-19
> ^@356-03-21
> ^@356-03-21
> ^@356-03-23
> ^@356-03-24
> ^@356-03-24
> ^@356-03-26
> ^@356-03-26
> ^@356-03-26
> ^@356-03-26
> {code}





[jira] [Updated] (DRILL-5385) Vector serializer fails to read saved SV2

2017-04-05 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong updated DRILL-5385:

Labels:   (was: ready-to-commit)

> Vector serializer fails to read saved SV2
> -
>
> Key: DRILL-5385
> URL: https://issues.apache.org/jira/browse/DRILL-5385
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.8.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Fix For: 1.11.0
>
>
> Drill provides the {{VectorAccessibleSerializable}} class to write a record 
> batch to a stream, and to read that batch from a stream. Record batches can 
> carry an indirection vector (a so-called selection vector 2 or SV2).
> The code to write batches writes the SV2 to the stream. But, the code to 
> deserialize batches initializes, but does not read, the SV2 from the stream.
> The result is that vector deserialization reads the wrong bytes and the saved 
> values are corrupted on read.
> Note that this issue was found via unit testing. At present, the only 
> production use of this code is in the external sort, which serializes batches 
> without an indirection vector.
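
The failure mode described above can be reproduced with a toy format. The Python sketch below is not Drill's wire format or the VectorAccessibleSerializable code; the layout (two counts, then 2-byte SV2 entries, then 4-byte values) and the function names are assumptions chosen only to show why skipping the SV2 on read shifts every later value.

```python
import io
import struct


def write_batch(stream, sv2, values):
    # Header: number of SV2 entries and number of values.
    stream.write(struct.pack("<II", len(sv2), len(values)))
    # The writer emits the SV2 (2-byte indexes) before the value data.
    stream.write(struct.pack(f"<{len(sv2)}H", *sv2))
    stream.write(struct.pack(f"<{len(values)}i", *values))


def read_batch(stream, read_sv2):
    sv2_count, value_count = struct.unpack("<II", stream.read(8))
    sv2 = []
    if read_sv2:
        sv2 = list(struct.unpack(f"<{sv2_count}H", stream.read(2 * sv2_count)))
    # If the SV2 bytes were not consumed, they are misread here as value data.
    values = list(struct.unpack(f"<{value_count}i", stream.read(4 * value_count)))
    return sv2, values


buf = io.BytesIO()
write_batch(buf, sv2=[2, 0, 1, 3], values=[10, 20, 30, 40])

buf.seek(0)
_, buggy = read_batch(buf, read_sv2=False)   # mimics the reported behaviour
buf.seek(0)
sv2, fixed = read_batch(buf, read_sv2=True)  # reader consumes the SV2 first
```

In the buggy read the first values are really SV2 bytes, and the true values only start appearing partway through the list.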





[jira] [Updated] (DRILL-5413) DrillConnectionImpl.isReadOnly() throws NullPointerException

2017-04-05 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong updated DRILL-5413:

Reviewer: Jinfeng Ni

Assigned Reviewer to [~jni]

> DrillConnectionImpl.isReadOnly() throws NullPointerException
> 
>
> Key: DRILL-5413
> URL: https://issues.apache.org/jira/browse/DRILL-5413
> Project: Apache Drill
> Issue Type: Bug
> Components: Client - JDBC
> Affects Versions: 1.10.0
> Environment: jboss 7.0.1 final version
> Reporter: Vitalii Diravka
> Assignee: Vitalii Diravka
> Fix For: 1.11.0
>
>
> According to 
> [CALCITE-843|https://issues.apache.org/jira/browse/CALCITE-843], every call to 
> "isReadOnly()" throws a NullPointerException. 
> For example, JBoss calls the DrillConnectionImpl.isReadOnly() method while 
> connecting to Drill as a datasource.
> The fix for CALCITE-843 should be added to the Drill Calcite fork.





[jira] [Commented] (DRILL-4847) Window function query results in OOM Exception.

2017-04-05 Thread Zelaine Fong (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957024#comment-15957024
 ] 

Zelaine Fong commented on DRILL-4847:
-

[~khfaraaz] - those lines, I believe, are still the current sort.  
[~paul-rogers] - any idea why the old sort is still being used even though 
Khurram is using the new, managed sort?  Are there places where the planner is 
still generating plans using the old sort even with the new setting?

> Window function query results in OOM Exception.
> ---
>
> Key: DRILL-4847
> URL: https://issues.apache.org/jira/browse/DRILL-4847
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow
> Affects Versions: 1.8.0
> Environment: 4 node cluster CentOS
> Reporter: Khurram Faraaz
> Assignee: Paul Rogers
> Priority: Critical
> Labels: window_function
> Attachments: drillbit.log
>
>
> Window function query results in OOM Exception.
> Drill version 1.8.0-SNAPSHOT git commit ID: 38ce31ca
> MapRBuildVersion 5.1.0.37549.GA
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> SELECT clientname, audiencekey, spendprofileid, 
> postalcd, provincecd, provincename, postalcode_json, country_json, 
> province_json, town_json, dma_json, msa_json, ROW_NUMBER() OVER (PARTITION BY 
> spendprofileid  ORDER BY (CASE WHEN postalcd IS NULL THEN 9 ELSE 0 END) ASC, 
> provincecd ASC) as rn FROM `MD593.parquet` limit 3;
> Error: RESOURCE ERROR: One or more nodes ran out of memory while executing 
> the query.
> Failure while allocating buffer.
> Fragment 0:0
> [Error Id: 2287fe71-f0cb-469a-a563-11580fceb1c5 on centos-01.qa.lab:31010] 
> (state=,code=0)
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 2016-08-16 07:25:44,590 [284d4006-9f9d-b893-9352-4f54f9b1d52a:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Query text for query id 
> 284d4006-9f9d-b893-9352-4f54f9b1d52a: SELECT clientname, audiencekey, 
> spendprofileid, postalcd, provincecd, provincename, postalcode_json, 
> country_json, province_json, town_json, dma_json, msa_json, ROW_NUMBER() OVER 
> (PARTITION BY spendprofileid  ORDER BY (CASE WHEN postalcd IS NULL THEN 9 
> ELSE 0 END) ASC, provincecd ASC) as rn FROM `MD593.parquet` limit 3
> ...
> 2016-08-16 07:25:46,273 [284d4006-9f9d-b893-9352-4f54f9b1d52a:frag:0:0] INFO  
> o.a.d.e.p.i.xsort.ExternalSortBatch - Completed spilling to 
> /tmp/drill/spill/284d4006-9f9d-b893-9352-4f54f9b1d52a_majorfragment0_minorfragment0_operator8/2
> 2016-08-16 07:25:46,283 [284d4006-9f9d-b893-9352-4f54f9b1d52a:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - User Error Occurred
> org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more 
> nodes ran out of memory while executing the query.
> Failure while allocating buffer.
> [Error Id: 2287fe71-f0cb-469a-a563-11580fceb1c5 ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543)
>  ~[drill-common-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:242)
>  [drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_101]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_101]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_101]
> Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Failure 
> while allocating buffer.
> at 
> org.apache.drill.exec.vector.NullableVarCharVector.allocateNew(NullableVarCharVector.java:187)
>  ~[vector-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.vector.complex.RepeatedMapVector$RepeatedMapTransferPair.<init>(RepeatedMapVector.java:331)
>  ~[vector-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.vector.complex.RepeatedMapVector$RepeatedMapTransferPair.<init>(RepeatedMapVector.java:307)
>  ~[vector-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.vector.complex.RepeatedMapVector.getTransferPair(RepeatedMapVector.java:161)
>  ~[vector-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.SimpleVectorWrapper.cloneAndTransfer(SimpleVectorWrapper.java:66)
>  ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.VectorContainer.cloneAndTransfer(VectorContainer.java:204)
>  ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.VectorContainer.getTransferClone(VectorContainer.java:157)
>  

[jira] [Updated] (DRILL-5375) Nested loop join: return correct result for left join

2017-04-05 Thread Arina Ielchiieva (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-5375:

Labels: doc-impacting ready-to-commit  (was: doc-impacting)

> Nested loop join: return correct result for left join
> -
>
> Key: DRILL-5375
> URL: https://issues.apache.org/jira/browse/DRILL-5375
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.8.0
> Reporter: Arina Ielchiieva
> Assignee: Arina Ielchiieva
> Labels: doc-impacting, ready-to-commit
>
> Mini repro:
> 1. Create 2 Hive tables with data
> {code}
> CREATE TABLE t1 (
>   FYQ varchar(999),
>   dts varchar(999),
>   dte varchar(999)
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
> 2016-Q1,2016-06-01,2016-09-30
> 2016-Q2,2016-09-01,2016-12-31
> 2016-Q3,2017-01-01,2017-03-31
> 2016-Q4,2017-04-01,2017-06-30
> CREATE TABLE t2 (
>   who varchar(999),
>   event varchar(999),
>   dt varchar(999)
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
> aperson,did somthing,2017-01-06
> aperson,did somthing else,2017-01-12
> aperson,had chrsitmas,2016-12-26
> aperson,went wild,2016-01-01
> {code}
> 2. Impala Query shows correct result
> {code}
> select t2.dt, t1.fyq, t2.who, t2.event
> from t2
> left join t1 on t2.dt between t1.dts and t1.dte
> order by t2.dt;
> +------------+---------+---------+-------------------+
> | dt         | fyq     | who     | event             |
> +------------+---------+---------+-------------------+
> | 2016-01-01 | NULL    | aperson | went wild         |
> | 2016-12-26 | 2016-Q2 | aperson | had chrsitmas     |
> | 2017-01-06 | 2016-Q3 | aperson | did somthing      |
> | 2017-01-12 | 2016-Q3 | aperson | did somthing else |
> +------------+---------+---------+-------------------+
> {code}
> 3. Drill query shows wrong results:
> {code}
> alter session set planner.enable_nljoin_for_scalar_only=false;
> use hive;
> select t2.dt, t1.fyq, t2.who, t2.event
> from t2
> left join t1 on t2.dt between t1.dts and t1.dte
> order by t2.dt;
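> +-------------+----------+----------+--------------------+
> | dt          | fyq      | who      | event              |
> +-------------+----------+----------+--------------------+
> | 2016-12-26  | 2016-Q2  | aperson  | had chrsitmas      |
> | 2017-01-06  | 2016-Q3  | aperson  | did somthing       |
> | 2017-01-12  | 2016-Q3  | aperson  | did somthing else  |
> +-------------+----------+----------+--------------------+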
> 3 rows selected (2.523 seconds)
> {code}
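
For reference, the expected left-join result can be checked independently of either engine. The Python sketch below models the join by hand using the rows from the repro above; it is not how Drill's nested loop join is implemented, and the function name left_join_between is hypothetical. ISO-formatted date strings compare correctly as plain strings, which is why BETWEEN works on the varchar columns.

```python
# Rows copied from the repro (typos in the event text are part of the data).
t1 = [
    ("2016-Q1", "2016-06-01", "2016-09-30"),
    ("2016-Q2", "2016-09-01", "2016-12-31"),
    ("2016-Q3", "2017-01-01", "2017-03-31"),
    ("2016-Q4", "2017-04-01", "2017-06-30"),
]
t2 = [
    ("aperson", "did somthing", "2017-01-06"),
    ("aperson", "did somthing else", "2017-01-12"),
    ("aperson", "had chrsitmas", "2016-12-26"),
    ("aperson", "went wild", "2016-01-01"),
]


def left_join_between(left, right):
    rows = []
    for who, event, dt in left:
        matches = [fyq for fyq, dts, dte in right if dts <= dt <= dte]
        # A left join keeps the left row even when nothing matches (fyq = NULL).
        for fyq in matches or [None]:
            rows.append((dt, fyq, who, event))
    return sorted(rows, key=lambda row: row[0])


result = left_join_between(t2, t1)
```

The model returns four rows, including the 2016-01-01 row with a NULL fyq, which is the row missing from the Drill output above.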





[jira] [Created] (DRILL-5414) Issue with Querying Directories

2017-04-05 Thread Paul Makkar (JIRA)
Paul Makkar created DRILL-5414:
--

 Summary: Issue with Querying Directories
 Key: DRILL-5414
 URL: https://issues.apache.org/jira/browse/DRILL-5414
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.10.0
 Environment: Kubernetes running Debian GNU/Linux 8 containers.
openjdk version "1.8.0_111".
AWS.
Using s3 buckets
Reporter: Paul Makkar


Hi,

Thanks for Apache Drill - it's pretty awesome :)

I'm hoping to exploit Drill's directory querying and have structured my data 
archive in s3 to test this. However, I've hit an issue with directory querying.

My directory structure in s3 looks like:
s3/devices_by_id/device_id/2016/11/12/.json.gz

From the documentation I figured the following queries were equivalent:

select count(*) from `s3`.`/deviceid/xyz/2016/11/` ;
+---------+
| EXPR$0  |
+---------+
| 286049  |
+---------+
1 row selected (10.351 seconds)

select count(*) from `s3`.`/deviceid/` where dir0='xyz' and dir1='2016' and 
dir2='11';

But this latter query just hangs. There is no profile in the UI. I press 
Ctrl-C and get:

+--+
|  |
+--+
+--+
No rows selected (1481.727 seconds)

If I try to run an explain plan, that also hangs.

There are a total of 13283 compressed json files in the 2016/11 s3 bucket. 

The log doesn't show much information.

Can anyone help with this, please? I can provide more information as 
required. Hopefully this is not user error.
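
To spell out why the two queries are expected to select the same files, here is a small Python model of the dir0/dir1/dir2 mapping. It is only a sketch of the path-to-column correspondence, not of how Drill plans or prunes the query; the helper names and the file names are hypothetical.

```python
def dir_columns(root, file_path):
    """Map the directories between `root` and the file to dir0, dir1, ..."""
    relative = file_path[len(root):].strip("/")
    parts = relative.split("/")[:-1]  # drop the file name
    return {f"dir{i}": part for i, part in enumerate(parts)}


def matches(root, file_path, **filters):
    columns = dir_columns(root, file_path)
    return all(columns.get(name) == value for name, value in filters.items())


# Hypothetical file names under the layout described above.
files = [
    "/deviceid/xyz/2016/11/12/a.json.gz",
    "/deviceid/xyz/2016/12/01/b.json.gz",
    "/deviceid/abc/2016/11/12/c.json.gz",
]

by_filter = [f for f in files
             if matches("/deviceid/", f, dir0="xyz", dir1="2016", dir2="11")]
by_prefix = [f for f in files if f.startswith("/deviceid/xyz/2016/11/")]
```

Both selections pick the same files; the difference is that the filtered form starts from the top-level directory, so everything under it is a candidate until the dirN filters are applied.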








[jira] [Commented] (DRILL-5413) DrillConnectionImpl.isReadOnly() throws NullPointerException

2017-04-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15956987#comment-15956987
 ] 

ASF GitHub Bot commented on DRILL-5413:
---

GitHub user vdiravka opened a pull request:

https://github.com/apache/drill/pull/806

DRILL-5413: DrillConnectionImpl.isReadOnly() throws NullPointerException



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vdiravka/drill DRILL-5413

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/806.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #806


commit 01ec92ccb2774045c818f1fc1733772c947b09f5
Author: Vitalii Diravka 
Date:   2017-04-05T17:59:32Z

DRILL-5413: DrillConnectionImpl.isReadOnly() throws NullPointerException




> DrillConnectionImpl.isReadOnly() throws NullPointerException
> 
>
> Key: DRILL-5413
> URL: https://issues.apache.org/jira/browse/DRILL-5413
> Project: Apache Drill
> Issue Type: Bug
> Components: Client - JDBC
> Affects Versions: 1.10.0
> Environment: jboss 7.0.1 final version
> Reporter: Vitalii Diravka
> Assignee: Vitalii Diravka
> Fix For: 1.11.0
>
>
> According to 
> [CALCITE-843|https://issues.apache.org/jira/browse/CALCITE-843], every call to 
> "isReadOnly()" throws a NullPointerException. 
> For example, JBoss calls the DrillConnectionImpl.isReadOnly() method while 
> connecting to Drill as a datasource.
> The fix for CALCITE-843 should be added to the Drill Calcite fork.





[jira] [Created] (DRILL-5413) DrillConnectionImpl.isReadOnly() throws NullPointerException

2017-04-05 Thread Vitalii Diravka (JIRA)
Vitalii Diravka created DRILL-5413:
--

 Summary: DrillConnectionImpl.isReadOnly() throws 
NullPointerException
 Key: DRILL-5413
 URL: https://issues.apache.org/jira/browse/DRILL-5413
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - JDBC
Affects Versions: 1.10.0
 Environment: jboss 7.0.1 final version
Reporter: Vitalii Diravka
Assignee: Vitalii Diravka
 Fix For: 1.11.0


According to 
[CALCITE-843|https://issues.apache.org/jira/browse/CALCITE-843], every call to 
"isReadOnly()" throws a NullPointerException. 

For example, JBoss calls the DrillConnectionImpl.isReadOnly() method while 
connecting to Drill as a datasource.

The fix for CALCITE-843 should be added to the Drill Calcite fork.





[jira] [Commented] (DRILL-3562) Query fails when using flatten on JSON data where some documents have an empty array

2017-04-05 Thread Arina Ielchiieva (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15956645#comment-15956645
 ] 

Arina Ielchiieva commented on DRILL-3562:
-

I see, one more point then. DRILL-3562 changed the JsonReader and 
FlattenRecordBatch classes. If the JSON files contain no empty arrays, only 
the changes in FlattenRecordBatch could have had any effect.

The error message from DRILL-5399, "Flatten does not support inputs of 
non-list values.", can be thrown in two places:
1. In 
https://github.com/apache/drill/blob/ddcf89548bd33c0cd3e062f1f6d5027eed822372/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/flatten/FlattenRecordBatch.java#L282
 but that comes before the code changed in DRILL-3562.
2. In 
https://github.com/apache/drill/blob/ddcf89548bd33c0cd3e062f1f6d5027eed822372/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/flatten/FlattenRecordBatch.java#L139
 but this one concerns the value vector taken from the incoming batch, not 
from the FlattenRecordBatch where the changes were made.

Also, there is [a 
check|https://github.com/apache/drill/blob/ddcf89548bd33c0cd3e062f1f6d5027eed822372/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/flatten/FlattenRecordBatch.java#L321]
 in FlattenRecordBatch that keeps data from the queries in DRILL-5399 from 
reaching the changes made in DRILL-3562. So far I don't see any relation 
between DRILL-3562 and DRILL-5399.

> Query fails when using flatten on JSON data where some documents have an 
> empty array
> 
>
> Key: DRILL-3562
> URL: https://issues.apache.org/jira/browse/DRILL-3562
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - JSON
> Affects Versions: 1.1.0
> Reporter: Philip Deegan
> Assignee: Serhii Harnyk
> Fix For: 1.10.0
>
>
> Drill query fails when using flatten when some records contain an empty array 
> {noformat}
> SELECT COUNT(*) FROM (SELECT FLATTEN(t.a.b.c) AS c FROM dfs.`flat.json` t) 
> flat WHERE flat.c.d.e = 'f' limit 1;
> {noformat}
> Succeeds on 
> { "a": { "b": { "c": [  { "d": {  "e": "f" } } ] } } }
> Fails on
> { "a": { "b": { "c": [] } } }
> Error
> {noformat}
> Error: SYSTEM ERROR: ClassCastException: Cannot cast 
> org.apache.drill.exec.vector.NullableIntVector to 
> org.apache.drill.exec.vector.complex.RepeatedValueVector
> {noformat}
> Is it possible to ignore the empty arrays, or do they need to be populated 
> with dummy data?


