Re: Speeding Up Unit Tests

2017-08-28 Thread Paul Rogers
Tests run in parallel when run from the command line in Maven. We have had 
issues, especially when some test changes global state. Can’t remember the 
details, however.

- Paul

> On Aug 28, 2017, at 6:04 PM, Timothy Farkas  wrote:
> 
> Hi All,
> 
> I will be working on 
> DRILL-5730  and see some 
> other areas for potential improvement with regard to testing:
> 
>  1.  Add caching of maven artifacts to the Travis build. This should 
> significantly speed up compiling the code in travis.
>  2.  Running unit tests in parallel.
>  3.  Make the Travis build actually run unit tests (currently it does not).
> 
> Points (1) and (3) are easy to do, but I was wondering if anyone has 
> experience with some of the potential pitfalls of running unit tests in 
> parallel? Are there certain tests which will have race conditions? and are 
> there any foreseeable issues with running multiple embedded drill bits in 
> parallel?
> 
> Thanks,
> Tim
> 
> 



[GitHub] drill issue #919: DRILL-5721: Query with only root fragment and no non-root ...

2017-08-28 Thread sohami
Github user sohami commented on the issue:

https://github.com/apache/drill/pull/919
  
Rebased the branch to latest master and updated test's included in 
DRILL-5701


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Speeding Up Unit Tests

2017-08-28 Thread Timothy Farkas
Hi All,

I will be working on 
DRILL-5730  and see some 
other areas for potential improvement with regard to testing:

  1.  Add caching of maven artifacts to the Travis build. This should 
significantly speed up compiling the code in travis.
  2.  Running unit tests in parallel.
  3.  Make the Travis build actually run unit tests (currently it does not).

Points (1) and (3) are easy to do, but I was wondering if anyone has experience 
with some of the potential pitfalls of running unit tests in parallel? Are 
there certain tests which will have race conditions? and are there any 
foreseeable issues with running multiple embedded drill bits in parallel?

Thanks,
Tim




[GitHub] drill issue #923: DRILL-5723: Added System Internal Options That can be Modi...

2017-08-28 Thread ilooner
Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/923
  
@paul-rogers Applied your comments and cleaned up the PR, it should be 
ready for review now. If additional work needs to be done to make this easier 
to review please let me know.

Thanks,
Tim


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #913: - DRILL-5729 Fix Travis Build

2017-08-28 Thread ilooner
Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/913
  
@parthchandra I think the PR still needs a +1 from a committer. If Vlad's 
review is sufficient, could you give this a +1?

Thanks,
Tim


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (DRILL-5748) NPE with invalid ALTER SYSTEM

2017-08-28 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5748:
--

 Summary: NPE with invalid ALTER SYSTEM
 Key: DRILL-5748
 URL: https://issues.apache.org/jira/browse/DRILL-5748
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.11.0
Reporter: Paul Rogers
Priority: Minor


Launched a local Drillbit, using ZK. Entered the following query:

{code}
ALTER SYSTEM SET `exec.queue.enable` = true
{code}

Result:

{code}
Query Failed: An Error Occurred
java.lang.NullPointerException
{code}




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (DRILL-5747) Drill should put directory name field in same sequence w.r.t regular column for select * query

2017-08-28 Thread Jinfeng Ni (JIRA)
Jinfeng Ni created DRILL-5747:
-

 Summary: Drill should put directory name field in same sequence 
w.r.t regular column for select * query
 Key: DRILL-5747
 URL: https://issues.apache.org/jira/browse/DRILL-5747
 Project: Apache Drill
  Issue Type: Bug
Reporter: Jinfeng Ni
Assignee: Jinfeng Ni


Today,  star column * in Drill would expand into a list of regular columns, and 
the directory name field such as dir0, dir1.  However, Drill does not put the 
directory name field with respect to regular field in a consistent way.

For instance, for parquet files, dir0 is put behind the list of regular columns.

{code}
select * from dfs.tmp.parquetTbl where dir0 = 1990;
+--+--+--+--+---+
| N_NATIONKEY  |N_NAME| N_REGIONKEY  |  N_COMMENT   | dir0  |
+--+--+--+--+---+
| 0| [B@5527446   | 0| [B@684fa264  | 1990  |
| 1| [B@442e88bc  | 1| [B@4b13119c  | 1990  |
| 2| [B@50e93f45  | 1| [B@138f483   | 1990  |
| 3| [B@423cc515  | 1| [B@23af07ac  | 1990  |
| 4| [B@3820bf81  | 4| [B@6dfccaf0  | 1990  |
| 5| [B@6f6f8af9  | 0| [B@40d1a97   | 1990  |
| 6| [B@784cb194  | 3| [B@731ea93f  | 1990  |
| 7| [B@61f9a224  | 3| [B@4c041bbc  | 1990  |
| 8| [B@21b8faa1  | 2| [B@774e7152  | 1990  |
| 9| [B@3ef1fbaf  | 2| [B@c2be72| 1990  |
| 10   | [B@71652ec1  | 4| [B@29e0bb10  | 1990  |
| 11   | [B@61192cea  | 4| [B@3bd3e873  | 1990  |
| 12   | [B@5541f4b4  | 2| [B@5d288126  | 1990  |
| 13   | [B@e371592   | 4| [B@42692b88  | 1990  |
| 14   | [B@6a90fc8   | 0| [B@454b16e2  | 1990  |
| 15   | [B@44cb72f8  | 0| [B@8e91b11   | 1990  |
| 16   | [B@7feffda8  | 0| [B@64f66236  | 1990  |
| 17   | [B@6ba9fb02  | 1| [B@649e7786  | 1990  |
| 18   | [B@5fb93205  | 2| [B@7783175b  | 1990  |
| 19   | [B@3f7294a9  | 3| [B@7b7e03c9  | 1990  |
| 20   | [B@e2ac076   | 4| [B@18c18a3e  | 1990  |
| 21   | [B@4a5af924  | 2| [B@1a9ad09f  | 1990  |
| 22   | [B@29f6845e  | 3| [B@776c4cd7  | 1990  |
| 23   | [B@6728f481  | 3| [B@31cc7610  | 1990  |
| 24   | [B@665b2dfa  | 1| [B@6c27ac95  | 1990  |
+--+--+--+--+---+
{code}
Notice in the above output, dir0 = 1990 is the last column.

However, for JSON, dir0 is put in front of the list of regular columns.

{code}
select * from dfs.tmp.jsonTbl where dir0 = 1990;
+---+--+
| dir0  |  a   |
+---+--+
| 1990  | 100  |
| 1990  | 200  |
+---+--+
{code}

It would be good to present the directory name field in the same sequence 
regardless of file format, storage plugin. IMHO, it makes sense to put the 
directory name field in front of the list of regular columns ( the behavior 
that JSON format present today).

This ticket is opened to modify Drill's ScanBatch code for the above explained 
purpose.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (DRILL-5746) Pcap PR manually edited Protobuf files, values lost on next build

2017-08-28 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5746:
--

 Summary: Pcap PR manually edited Protobuf files, values lost on 
next build
 Key: DRILL-5746
 URL: https://issues.apache.org/jira/browse/DRILL-5746
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.10.0
Reporter: Paul Rogers
Assignee: Paul Rogers
 Fix For: 1.12.0


Drill recently accepted the pcap format plugin. As part of that work, the 
author added a new operator type, {{PCAP_SUB_SCAN_VALUE}}.

But, apparently this was done by editing the generated Protobuf files to add 
the values, rather than modifying the protobuf definitions and rebuilding the 
generated files. The result is, on the next build of the Protobuf sources, the 
following compile error appears:

{code}
[ERROR] 
/Users/paulrogers/git/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/pcap/PcapFormatPlugin.java:[80,41]
 error: cannot find symbol
[ERROR] symbol:   variable PCAP_SUB_SCAN_VALUE
[ERROR] location: class CoreOperatorType
{code}

The solution is to properly edit the Protobuf definitions with the required 
symbol.

In addition, the name {{PCAP_SUB_SCAN_VALUE}} does not follow Drill's naming 
conventions. It should be: {{PCAP_SUB_SCAN}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (DRILL-5745) Invalid "location" information in Drill web server

2017-08-28 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5745:
--

 Summary: Invalid "location" information in Drill web server
 Key: DRILL-5745
 URL: https://issues.apache.org/jira/browse/DRILL-5745
 Project: Apache Drill
  Issue Type: Bug
  Components: Web Server
Affects Versions: 1.11.0
Reporter: Paul Rogers
Priority: Minor


The file {{ProfileResources.java}} has the following incorrect code line:

{code}
  this.location = "http://localhost:8047/profile/; + queryId + ".json";
{code}

This code makes three errors.

1. The "http" prefix ignores the fact that the Drillbit can have SSL enabled 
for the web server.
2. In a browser, "localhost" refers to the the machine running the browser. 
This is valid only if the browser runs on the same machine as the Drillbit, 
which is not, in general, true.
3. The port number is hardcoded to 8047, but it can be customized in the config 
file.

Therefore, most of the time, the link won't work on a production server.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill issue #922: DRILL-5741: During startup Drill should not exceed the ava...

2017-08-28 Thread paul-rogers
Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/922
  
@kkhatua, thanks for the note. No doubt the proposed model helps in _some_ 
cases. The problem is, it adds complexity for other cases.

Perhaps offer the change as an additional script that can be called from 
`drill_env.sh` for those sites that need it, rather than making it part of the 
core, common script. That way, this solution is orthogonal to, rather than part 
of, a solution driven by a resource manager such as YARN, Mesos or Warden.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #922: DRILL-5741: During startup Drill should not exceed the ava...

2017-08-28 Thread kkhatua
Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/922
  
@paul-rogers 
The reason this would be needed is because the system admin might not be 
the one managing the specifics of the Drill memory allocations. This allows a 
node admin to define the limits within which Drill will work, while the 
specifications can be delegated to the drill-env.sh script which is managed by 
a power user. 

There isn't a conflict with the way a service like Warden would allocate 
memory, because it does not need to manage the internal allocations, only the 
overall. The Drill service will continue to own its configuration's specifics. 

Your solution is pretty much in line with what the PR proposes. The 
exception being that, in the presence of the env variable defined, the 
specifications **cannot** oversubscribe the available (or max permissible) 
memory. In case of oversubscribing, Drill will not startup.

If there are uber resource managers (for e.g. YARN), they have the option 
of either 

- providing this limit and relying on files like ```drill-env.sh``` to 
specify the limits... 

or 

- provide the explicit parameters for Heap, DirectMemory, etc to the JVM 
and **not** have to specify ```DRILLBIT_MAX_PROC_MEM``` . 




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (DRILL-5744) External sort fails with OOM error

2017-08-28 Thread Robert Hou (JIRA)
Robert Hou created DRILL-5744:
-

 Summary: External sort fails with OOM error
 Key: DRILL-5744
 URL: https://issues.apache.org/jira/browse/DRILL-5744
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Affects Versions: 1.10.0
Reporter: Robert Hou
Assignee: Paul Rogers
 Fix For: 1.12.0


Query is:
{noformat}
ALTER SESSION SET `exec.sort.disable_managed` = false;
alter session set `planner.width.max_per_node` = 1;
alter session set `planner.disable_exchanges` = true;
alter session set `planner.width.max_per_query` = 1;
alter session set `planner.memory.max_query_memory_per_node` = 152428800;
select count(*) from (
  select * from (
select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid 
from (
  select d.type type, d.uid uid, flatten(d.map.rm) rms from 
dfs.`/drill/testdata/resource-manager/nested-large.json` d order by d.uid
) s1
  ) s2
  order by s2.rms.mapid
);
ALTER SESSION SET `exec.sort.disable_managed` = true;
alter session set `planner.width.max_per_node` = 17;
alter session set `planner.disable_exchanges` = false;
alter session set `planner.width.max_per_query` = 1000;
alter session set `planner.memory.max_query_memory_per_node` = 2147483648;
{noformat}

Stack trace is:
{noformat}
2017-08-23 06:59:42,763 [266275e5-ebdb-14ae-d52d-00fa3a154f6d:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - User Error Occurred: One or more nodes
 ran out of memory while executing the query. (Unable to allocate buffer of 
size 4194304 (rounded from 3276750) due to memory limit. Current allocation: 7
9986944)
org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more 
nodes ran out of memory while executing the query.

Unable to allocate buffer of size 4194304 (rounded from 3276750) due to memory 
limit. Current allocation: 79986944

[Error Id: 4f4959df-0921-4a50-b75e-56488469ab10 ]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550)
 ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:244)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) 
[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_51]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_51]
at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to 
allocate buffer of size 4194304 (rounded from 3276750) due to memory limit. Cur
rent allocation: 79986944
at 
org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:238) 
~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:213) 
~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.VarCharVector.allocateNew(VarCharVector.java:402) 
~[vector-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.NullableVarCharVector.allocateNew(NullableVarCharVector.java:236)
 ~[vector-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.AllocationHelper.allocatePrecomputedChildCount(AllocationHelper.java:33)
 ~[vector-1.12.0-SNAPSHOT.jar:1.12.0-SNAPS
HOT]
at 
org.apache.drill.exec.vector.AllocationHelper.allocate(AllocationHelper.java:46)
 ~[vector-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.record.VectorInitializer.allocateVector(VectorInitializer.java:113)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT
]
at 
org.apache.drill.exec.record.VectorInitializer.allocateVector(VectorInitializer.java:95)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.record.VectorInitializer.allocateMap(VectorInitializer.java:130)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.record.VectorInitializer.allocateVector(VectorInitializer.java:93)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.record.VectorInitializer.allocateBatch(VectorInitializer.java:85)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.xsort.managed.PriorityQueueCopierWrapper$BatchMerger.next(PriorityQueueCopierWrapper.java:262)
 ~[drill-java
-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load(ExternalSortBatch.java:374)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12
.0-SNAPSHOT]
at 

[GitHub] drill issue #910: DRILL-5726: Support Impersonation without authentication f...

2017-08-28 Thread sohami
Github user sohami commented on the issue:

https://github.com/apache/drill/pull/910
  
+1 LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #910: DRILL-5726: Support Impersonation without authentic...

2017-08-28 Thread sohami
Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/910#discussion_r135599220
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/DrillRestServer.java
 ---
@@ -230,6 +230,27 @@ public WebUserConnection provide() {
 public void dispose(WebUserConnection instance) {
 
 }
+
+/**
+ * Creates session user principal. If impersonation is enabled without 
authentication and User-Name header is present and valid,
+ * will create session user principal with provided user name, 
otherwise anonymous user name will be used.
+ * In both cases session user principal will have admin rights.
+ *
+ * @param config drill config
+ * @param request client request
+ * @return session user principal
+ */
+private Principal createSessionUserPrincipal(DrillConfig config, 
HttpServletRequest request) {
+  final boolean checkForUserName = 
!config.getBoolean(ExecConstants.USER_AUTHENTICATION_ENABLED) && 
config.getBoolean(ExecConstants.IMPERSONATION_ENABLED);
+  if (checkForUserName) {
--- End diff --

Makes sense. Thank You for explanation


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #906: DRILL-5546: Handle schema change exception failure ...

2017-08-28 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/906#discussion_r135578936
  
--- Diff: 
exec/vector/src/main/java/org/apache/drill/exec/vector/UntypedNullHolder.java 
---
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.vector;
+
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.types.Types;
+import org.apache.drill.exec.expr.holders.ValueHolder;
+
+public class UntypedNullHolder implements ValueHolder {
+  public static final TypeProtos.MajorType TYPE = 
Types.optional(TypeProtos.MinorType.NULL);
+
+  public TypeProtos.MajorType getType() {return TYPE;}
+
+  public static final int WIDTH = 0;
+
+  public int isSet = 1;
+
+  @Deprecated
+  public int hashCode(){
+throw new UnsupportedOperationException();
+  }
--- End diff --

See note for `toString()`. This has the same issues.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #906: DRILL-5546: Handle schema change exception failure ...

2017-08-28 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/906#discussion_r135580061
  
--- Diff: 
exec/vector/src/main/java/org/apache/drill/exec/vector/UntypedNullVector.java 
---
@@ -0,0 +1,267 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ 
**/
+
+package org.apache.drill.exec.vector;
+
+
+import com.google.common.base.Preconditions;
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.proto.UserBitShared.SerializedField;
+import org.apache.drill.exec.record.MaterializedField;
+import org.apache.drill.exec.record.TransferPair;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+
+/** UntypedNullVector is to represent a value vector with {@link 
org.apache.drill.common.types.MinorType.NULL}
+ *  All values in the vector represent two semantic implications: 1) the 
value is unknown, 2) the type is unknown.
+ *  Because of this, we only have to keep track of the number of values in 
value vector,
+ *  and there is no allocated buffer to back up this value vector. 
Therefore, the majority of
+ *  methods in this class is either no-op, or throws {@link 
UnsupportedOperationException}.
+ *
+ */
+public final class UntypedNullVector extends BaseDataValueVector 
implements FixedWidthVector {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(UntypedNullVector.class);
+
+  /**
+   * Width of each fixed-width value.
+   */
+  public static final int VALUE_WIDTH = 0;
+
+  private final Accessor accessor = new Accessor();
+  private final Mutator mutator = new Mutator();
+  private int valueCount ;
+
+  public UntypedNullVector(MaterializedField field, BufferAllocator 
allocator) {
+super(field, allocator);
+valueCount = 0;
+  }
+
+  @Override
+  public FieldReader getReader() { throw new 
UnsupportedOperationException(); }
+
+  @Override
+  public int getBufferSizeFor(final int valueCount) {
+return 0;
+  }
+
+  @Override
+  public int getValueCapacity(){
+return Character.MAX_VALUE;
+  }
+
+  @Override
+  public Accessor getAccessor() { return accessor; }
+
+  @Override
+  public Mutator getMutator() { return mutator; }
+
+  @Override
+  public void setInitialCapacity(final int valueCount) {
+  }
+
+  @Override
+  public void allocateNew() {
+  }
+
+  @Override
+  public boolean allocateNewSafe() {
+return true;
+  }
+
+  @Override
+  public void allocateNew(final int valueCount) {
+  }
+
+  @Override
+  public void reset() {
+  }
+
+  /**
+   * {@inheritDoc}
+   */
+  @Override
+  public void zeroVector() {
+  }
+
+  @Override
+  public void load(SerializedField metadata, DrillBuf buffer) {
+
Preconditions.checkArgument(this.field.getPath().equals(metadata.getNamePart().getName()),
+"The field %s doesn't match the provided metadata %s.", 
this.field, metadata);
+final int actualLength = metadata.getBufferLength();
+final int valueCount = metadata.getValueCount();
+final int expectedLength = valueCount * VALUE_WIDTH;
+assert actualLength == expectedLength : String.format("Expected to 
load %d bytes but actually loaded %d bytes", expectedLength, actualLength);
+
+this.valueCount = valueCount;
+  }
+
+  @Override
+  public TransferPair getTransferPair(BufferAllocator allocator){
+return new TransferImpl(getField(), allocator);
+  }
+
+  @Override
+  public TransferPair getTransferPair(String ref, BufferAllocator 
allocator){
+return new TransferImpl(getField().withPath(ref), allocator);
+  }
+
+  @Override
+  public TransferPair makeTransferPair(ValueVector to) {
+   

[GitHub] drill pull request #906: DRILL-5546: Handle schema change exception failure ...

2017-08-28 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/906#discussion_r135577415
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/AbstractSingleRecordBatch.java
 ---
@@ -125,4 +131,19 @@ public BatchSchema getSchema() {
 
   protected abstract boolean setupNewSchema() throws SchemaChangeException;
   protected abstract IterOutcome doWork();
+
+  /**
+   * Default behavior to handle fast NONE (incoming's first next() return 
NONE, in stead of OK_NEW_SCHEMA):
+   * FAST NONE could happen when the underneath Scan operators do not 
produce any batch with schema.
+   *
+   * This behavior could be override in each individual operator, if the 
operator's semantics is to
+   * inject a batch with schema.
+   *
+   * @return IterOutcome.NONE.
+   */
+  protected IterOutcome handleFastNone() {
+container.buildSchema(SelectionVectorMode.NONE);
+return IterOutcome.NONE;
+  };
--- End diff --

Delete the semi-colon.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #906: DRILL-5546: Handle schema change exception failure ...

2017-08-28 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/906#discussion_r135578735
  
--- Diff: 
exec/vector/src/main/java/org/apache/drill/exec/vector/UntypedNullHolder.java 
---
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.vector;
+
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.types.Types;
+import org.apache.drill.exec.expr.holders.ValueHolder;
+
+public class UntypedNullHolder implements ValueHolder {
+  public static final TypeProtos.MajorType TYPE = 
Types.optional(TypeProtos.MinorType.NULL);
+
+  public TypeProtos.MajorType getType() {return TYPE;}
--- End diff --

The typical Java convention is to list fields first, then methods, so maybe 
move this after the fields.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #906: DRILL-5546: Handle schema change exception failure ...

2017-08-28 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/906#discussion_r135579157
  
--- Diff: 
exec/vector/src/main/java/org/apache/drill/exec/vector/UntypedNullVector.java 
---
@@ -0,0 +1,267 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ 
**/
+
+package org.apache.drill.exec.vector;
+
+
+import com.google.common.base.Preconditions;
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.proto.UserBitShared.SerializedField;
+import org.apache.drill.exec.record.MaterializedField;
+import org.apache.drill.exec.record.TransferPair;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+
+/** UntypedNullVector is to represent a value vector with {@link 
org.apache.drill.common.types.MinorType.NULL}
+ *  All values in the vector represent two semantic implications: 1) the 
value is unknown, 2) the type is unknown.
+ *  Because of this, we only have to keep track of the number of values in 
value vector,
+ *  and there is no allocated buffer to back up this value vector. 
Therefore, the majority of
+ *  methods in this class is either no-op, or throws {@link 
UnsupportedOperationException}.
+ *
+ */
+public final class UntypedNullVector extends BaseDataValueVector 
implements FixedWidthVector {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(UntypedNullVector.class);
+
+  /**
+   * Width of each fixed-width value.
+   */
+  public static final int VALUE_WIDTH = 0;
+
+  private final Accessor accessor = new Accessor();
+  private final Mutator mutator = new Mutator();
+  private int valueCount ;
--- End diff --

Remove space before ;


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #906: DRILL-5546: Handle schema change exception failure ...

2017-08-28 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/906#discussion_r135574741
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillScanRel.java
 ---
@@ -160,12 +161,7 @@ public RelOptCost computeSelfCost(final RelOptPlanner 
planner, RelMetadataQuery
 final ScanStats stats = groupScan.getScanStats(settings);
 int columnCount = getRowType().getFieldCount();
 double ioCost = 0;
-boolean isStarQuery = Iterables.tryFind(getRowType().getFieldNames(), 
new Predicate() {
-  @Override
-  public boolean apply(String input) {
-return Preconditions.checkNotNull(input).equals("*");
-  }
-}).isPresent();
+boolean isStarQuery = AbstractRecordReader.isStarQuery(columns);
--- End diff --

Should this planning-time class depend on a specific execution-time class? 
Or, would it be better to move the method from the execution class onto 
something common between plan and execution?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #906: DRILL-5546: Handle schema change exception failure ...

2017-08-28 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/906#discussion_r135579971
  
--- Diff: 
exec/vector/src/main/java/org/apache/drill/exec/vector/UntypedNullVector.java 
---
@@ -0,0 +1,267 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ 
**/
+
+package org.apache.drill.exec.vector;
+
+
+import com.google.common.base.Preconditions;
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.proto.UserBitShared.SerializedField;
+import org.apache.drill.exec.record.MaterializedField;
+import org.apache.drill.exec.record.TransferPair;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+
+/** UntypedNullVector is to represent a value vector with {@link 
org.apache.drill.common.types.MinorType.NULL}
+ *  All values in the vector represent two semantic implications: 1) the 
value is unknown, 2) the type is unknown.
+ *  Because of this, we only have to keep track of the number of values in 
value vector,
+ *  and there is no allocated buffer to back up this value vector. 
Therefore, the majority of
+ *  methods in this class is either no-op, or throws {@link 
UnsupportedOperationException}.
+ *
+ */
+public final class UntypedNullVector extends BaseDataValueVector 
implements FixedWidthVector {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(UntypedNullVector.class);
+
+  /**
+   * Width of each fixed-width value.
+   */
+  public static final int VALUE_WIDTH = 0;
+
+  private final Accessor accessor = new Accessor();
+  private final Mutator mutator = new Mutator();
+  private int valueCount ;
+
+  public UntypedNullVector(MaterializedField field, BufferAllocator 
allocator) {
+super(field, allocator);
+valueCount = 0;
+  }
+
+  @Override
+  public FieldReader getReader() { throw new 
UnsupportedOperationException(); }
+
+  @Override
+  public int getBufferSizeFor(final int valueCount) {
+return 0;
+  }
+
+  @Override
+  public int getValueCapacity(){
+return Character.MAX_VALUE;
+  }
+
+  @Override
+  public Accessor getAccessor() { return accessor; }
+
+  @Override
+  public Mutator getMutator() { return mutator; }
+
+  @Override
+  public void setInitialCapacity(final int valueCount) {
+  }
+
+  @Override
+  public void allocateNew() {
+  }
+
+  @Override
+  public boolean allocateNewSafe() {
+return true;
+  }
+
+  @Override
+  public void allocateNew(final int valueCount) {
+  }
+
+  @Override
+  public void reset() {
+  }
+
+  /**
+   * {@inheritDoc}
+   */
+  @Override
+  public void zeroVector() {
+  }
+
+  @Override
+  public void load(SerializedField metadata, DrillBuf buffer) {
+
Preconditions.checkArgument(this.field.getPath().equals(metadata.getNamePart().getName()),
+"The field %s doesn't match the provided metadata %s.", 
this.field, metadata);
+final int actualLength = metadata.getBufferLength();
+final int valueCount = metadata.getValueCount();
+final int expectedLength = valueCount * VALUE_WIDTH;
+assert actualLength == expectedLength : String.format("Expected to 
load %d bytes but actually loaded %d bytes", expectedLength, actualLength);
+
+this.valueCount = valueCount;
+  }
+
+  @Override
+  public TransferPair getTransferPair(BufferAllocator allocator){
+return new TransferImpl(getField(), allocator);
+  }
+
+  @Override
+  public TransferPair getTransferPair(String ref, BufferAllocator 
allocator){
+return new TransferImpl(getField().withPath(ref), allocator);
+  }
+
+  @Override
+  public TransferPair makeTransferPair(ValueVector to) {
+   

[GitHub] drill pull request #906: DRILL-5546: Handle schema change exception failure ...

2017-08-28 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/906#discussion_r135575060
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrel.java
 ---
@@ -35,18 +35,43 @@
 import org.apache.calcite.rex.RexNode;
 import org.apache.calcite.sql.SqlKind;
 
+/**
+ * A physical Prel node for Project operator.
+ */
 public class ProjectPrel extends DrillProjectRelBase implements Prel{
   static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(ProjectPrel.class);
 
+  private final boolean outputProj;
 
   public ProjectPrel(RelOptCluster cluster, RelTraitSet traits, RelNode 
child, List exps,
   RelDataType rowType) {
+this(cluster, traits, child, exps, rowType, false);
+  }
+
+  /**
+   * Constructor for ProjectPrel.
+   * @param cluster
+   * @param traits traits of ProjectPrel node
+   * @param child  input
+   * @param exps   list of RexNode, representing expressions of projection.
+   * @param rowType output rowType of projection expression.
+   * @param outputProj true if ProjectPrel is inserted by {@link 
org.apache.drill.exec.planner.physical.visitor.TopProjectVisitor}
+   *   Such top Project operator does the following 
processing, before the result was presented to Screen/Writer
+   *   1) ensure final output field names are preserved,
+   *   2) handle cases where input does not return any 
batch (a fast NONE) (see ProjectRecordBatch.handleFastNone() method)
+   *   3) handle cases where expressions in upstream 
operator were evaluated to NULL type
--- End diff --

This is Javadoc, so rather than a pre-formatted numeric list, consider 
\\...\\.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #906: DRILL-5546: Handle schema change exception failure ...

2017-08-28 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/906#discussion_r135578594
  
--- Diff: 
exec/vector/src/main/java/org/apache/drill/exec/vector/UntypedNullHolder.java 
---
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.vector;
+
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.types.Types;
+import org.apache.drill.exec.expr.holders.ValueHolder;
+
+public class UntypedNullHolder implements ValueHolder {
+  public static final TypeProtos.MajorType TYPE = 
Types.optional(TypeProtos.MinorType.NULL);
+
+  public TypeProtos.MajorType getType() {return TYPE;}
+
+  public static final int WIDTH = 0;
+
+  public int isSet = 1;
+
+  @Deprecated
+  public int hashCode(){
+throw new UnsupportedOperationException();
+  }
+
+  /*
+   * Reason for deprecation is that ValueHolders are potential scalar 
replacements
+   * and hence we don't want any methods to be invoked on them.
+   */
+  @Deprecated
+  public String toString(){
+throw new UnsupportedOperationException();
--- End diff --

Why unsupported? According to Java, all classes should implement toString. 
Else, if we are in the debugger, and try to look at an instance of this class, 
the debugger will encounter an exception.

Also, it is not a good idea to deprecate a standard Java method.


Either leave the default implementation, or simply return 
`getClass().getSimpleName()`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #906: DRILL-5546: Handle schema change exception failure ...

2017-08-28 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/906#discussion_r135577968
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/TestEmptyBatchSql.java ---
@@ -0,0 +1,124 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill;
+
+import com.google.common.collect.Lists;
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.util.FileUtils;
+import org.junit.Test;
+
+import java.util.List;
+
+public class TestEmptyBatchSql extends  BaseTestQuery {
--- End diff --

Nice set of tests for JSON. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #906: DRILL-5546: Handle schema change exception failure ...

2017-08-28 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/906#discussion_r135577279
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/AbstractSingleRecordBatch.java
 ---
@@ -61,6 +63,10 @@ public IterOutcome innerNext() {
 }
 switch (upstream) {
 case NONE:
+  if (state == BatchState.FIRST) {
+return handleFastNone();
--- End diff --

Again, consider a more descriptive name, such as `handleEmptyInput()`. 
Otherwise, everyone has to remember that "fast none" == "empty input".


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #906: DRILL-5546: Handle schema change exception failure ...

2017-08-28 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/906#discussion_r135579027
  
--- Diff: 
exec/vector/src/main/java/org/apache/drill/exec/vector/UntypedNullHolder.java 
---
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.vector;
+
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.types.Types;
+import org.apache.drill.exec.expr.holders.ValueHolder;
+
+public class UntypedNullHolder implements ValueHolder {
+  public static final TypeProtos.MajorType TYPE = 
Types.optional(TypeProtos.MinorType.NULL);
+
+  public TypeProtos.MajorType getType() {return TYPE;}
+
+  public static final int WIDTH = 0;
+
+  public int isSet = 1;
+
+  @Deprecated
+  public int hashCode(){
+throw new UnsupportedOperationException();
+  }
+
+  /*
+   * Reason for deprecation is that ValueHolders are potential scalar 
replacements
+   * and hence we don't want any methods to be invoked on them.
+   */
+  @Deprecated
+  public String toString(){
+throw new UnsupportedOperationException();
+  }
+
+
+
--- End diff --

Remove extra blank lines.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #906: DRILL-5546: Handle schema change exception failure ...

2017-08-28 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/906#discussion_r135577493
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/AbstractSingleRecordBatch.java
 ---
@@ -125,4 +131,19 @@ public BatchSchema getSchema() {
 
   protected abstract boolean setupNewSchema() throws SchemaChangeException;
   protected abstract IterOutcome doWork();
+
+  /**
+   * Default behavior to handle fast NONE (incoming's first next() return 
NONE, in stead of OK_NEW_SCHEMA):
+   * FAST NONE could happen when the underneath Scan operators do not 
produce any batch with schema.
+   *
--- End diff --

Javadoc is HTML, so insert \


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #906: DRILL-5546: Handle schema change exception failure ...

2017-08-28 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/906#discussion_r135579508
  
--- Diff: 
exec/vector/src/main/java/org/apache/drill/exec/vector/UntypedNullVector.java 
---
@@ -0,0 +1,267 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ 
**/
+
+package org.apache.drill.exec.vector;
+
+
+import com.google.common.base.Preconditions;
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.proto.UserBitShared.SerializedField;
+import org.apache.drill.exec.record.MaterializedField;
+import org.apache.drill.exec.record.TransferPair;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+
+/** UntypedNullVector is to represent a value vector with {@link 
org.apache.drill.common.types.MinorType.NULL}
+ *  All values in the vector represent two semantic implications: 1) the 
value is unknown, 2) the type is unknown.
+ *  Because of this, we only have to keep track of the number of values in 
value vector,
+ *  and there is no allocated buffer to back up this value vector. 
Therefore, the majority of
+ *  methods in this class is either no-op, or throws {@link 
UnsupportedOperationException}.
+ *
+ */
+public final class UntypedNullVector extends BaseDataValueVector 
implements FixedWidthVector {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(UntypedNullVector.class);
+
+  /**
+   * Width of each fixed-width value.
+   */
+  public static final int VALUE_WIDTH = 0;
+
+  private final Accessor accessor = new Accessor();
+  private final Mutator mutator = new Mutator();
+  private int valueCount ;
+
+  public UntypedNullVector(MaterializedField field, BufferAllocator 
allocator) {
+super(field, allocator);
+valueCount = 0;
+  }
+
+  @Override
+  public FieldReader getReader() { throw new 
UnsupportedOperationException(); }
+
+  @Override
+  public int getBufferSizeFor(final int valueCount) {
+return 0;
+  }
+
+  @Override
+  public int getValueCapacity(){
+return Character.MAX_VALUE;
--- End diff --

Please use `ValueVector.MAX_ROW_COUNT`. It is a coincidence that the 
maximum row count happens to be (one greater than) `Character.MAX_VALUE`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #906: DRILL-5546: Handle schema change exception failure ...

2017-08-28 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/906#discussion_r135576938
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/AbstractBinaryRecordBatch.java
 ---
@@ -0,0 +1,75 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.record;
+
+import org.apache.drill.exec.exception.OutOfMemoryException;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+
+public abstract class AbstractBinaryRecordBatch extends  AbstractRecordBatch {
+  protected final RecordBatch left;
+  protected final RecordBatch right;
+
+  // state (IterOutcome) of the left input
+  protected IterOutcome leftUpstream = IterOutcome.NONE;
+
+  // state (IterOutcome) of the right input
+  protected IterOutcome rightUpstream = IterOutcome.NONE;
--- End diff --

Nice improvement!

Consider wrapping the two sides in a new class, say, BatchIterator, that 
holds the batch and iter outcome. Then, the code here is just a bit cleaner 
because we don't have to repeat variables for left and right.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #910: DRILL-5726: Support Impersonation without authentication f...

2017-08-28 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/910
  
@sohami addressed code review comments. Please review when possible.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #910: DRILL-5726: Support Impersonation without authentic...

2017-08-28 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/910#discussion_r135571864
  
--- Diff: exec/java-exec/src/main/resources/rest/query/query.ftl ---
@@ -47,8 +58,32 @@
   Query
   
 
-Submit
+
+"button" 
onclick="doSubmit()"<#else>"submit">
+  Submit
+
   
+
+<#if model?? && model>
+  

[GitHub] drill pull request #910: DRILL-5726: Support Impersonation without authentic...

2017-08-28 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/910#discussion_r135572514
  
--- Diff: exec/java-exec/src/main/resources/rest/query/query.ftl ---
@@ -21,8 +24,16 @@
 
 Sample SQL query: SELECT * FROM cp.`employee.json` LIMIT 
20
   
-  
-
+
+  <#if model?? && model>
+ 
+   User Name
+   
--- End diff --

Done. Did not use `required` attribute since it's not working with 
`onclick`, instead check is using js.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #910: DRILL-5726: Support Impersonation without authentic...

2017-08-28 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/910#discussion_r135492293
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/QueryResources.java
 ---
@@ -54,7 +56,10 @@
   @Path("/query")
   @Produces(MediaType.TEXT_HTML)
   public Viewable getQuery() {
-return ViewableWithPermissions.create(authEnabled.get(), 
"/rest/query/query.ftl", sc);
+final DrillConfig config = work.getContext().getConfig();
+// if impersonation is enabled without impersonation, will provide 
mechanism to add user name to request header from Web UI
--- End diff --

Fixed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #910: DRILL-5726: Support Impersonation without authentic...

2017-08-28 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/910#discussion_r135571777
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/DrillRestServer.java
 ---
@@ -230,6 +230,27 @@ public WebUserConnection provide() {
 public void dispose(WebUserConnection instance) {
 
 }
+
+/**
+ * Creates session user principal. If impersonation is enabled without 
authentication and User-Name header is present and valid,
+ * will create session user principal with provided user name, 
otherwise anonymous user name will be used.
+ * In both cases session user principal will have admin rights.
+ *
+ * @param config drill config
+ * @param request client request
+ * @return session user principal
+ */
+private Principal createSessionUserPrincipal(DrillConfig config, 
HttpServletRequest request) {
+  final boolean checkForUserName = 
!config.getBoolean(ExecConstants.USER_AUTHENTICATION_ENABLED) && 
config.getBoolean(ExecConstants.IMPERSONATION_ENABLED);
+  if (checkForUserName) {
--- End diff --

We should not require `User-Name` for each request since it is needed only 
when we need to query the data. I will leave this method but also add 
`UserNameFilter` that would be applied to all `/query` and `/query.json` 
requests and check if header is provided, i.e. we would allow using `anonymous` 
user if `User-Name` is not provided since we'll know that `UserNameFilter` will 
filter out request where `User-Name` is required but not set.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: Test cases that require a UTC timezone.

2017-08-28 Thread Muhammad Gelbana
​​
In the "Drill developer guide or code organization" thread
​ you asked me what am I struggling with, well let's discuss this in this
thread as it provides more context.


Frankly, I can't claim I fully understood what you said so I tried to
understand the problem on my own, but I haven't succeeded yet. Let me break
down what you said that I don't understand

"
which is then stored using an offset from the epoch UTC
​"
Stored where ? The query used by the test case is querying a literal.

What I'm struggling with is getting the test cases to complete %100
successfully. I thought I need to set the timezone for the server (Drill
instance ?) started by the test cases but I couldn't reach the code that
actually starts the server.

Also since other test cases are ignored (i.e. Marked with @Ignore) because
they depend on timezones, then why are the test cases failing for me aren't
? I find them depending on timezones as well.

​

Thanks,
Gelbana

On Sun, Jul 9, 2017 at 6:39 PM, Paul Rogers  wrote:

> Hi Muhammad,
>
> While I can’t comment on the specific test cases, I can say that Drill
> always uses the server’s own timezone to hold dates and times. Not sure how
> this is affecting the tests, but the same date/time will have a different
> numeric value in each time zone. That is, “2 PM on July 9, 2017” is
> interpreted as “2 PM on July 9, 2017 in your server's time zone”, which is
> then stored using an offset from the epoch UTC, but with that value
> reinterpreted as an offset from the epoch in your time zone.
>
> The reinterpreted UTC value is then sent to the client where it is
> reinterpreted again as an offset from the epoch in the client’s own
> timezone. So, on your server, “2 PM on July 9, 2017” is interpreted as “2
> PM on July 9, 2017 GMT+2”, but when I connect to the server, do a query,
> and obtain time data, it is reinterpreted as “2 PM on July 9, 2017 GMT-8."
>
> This mostly works, but does lead to the well known issues that Joda time
> (and, later, the JDK 8 time library) were designed to resolve.
>
> So, to run time tests, you may have to understand our somewhat convoluted
> time mapping to make things work.
>
> Thanks,
>
> - Paul
>
> > On Jul 9, 2017, at 7:47 AM, Muhammad Gelbana 
> wrote:
> >
> > While trying to run Drill's test cases
> > , I found that one of
> the
> > failing tests would succeed
> >  focusedCommentId=16079131=com.atlassian.jira.
> plugin.system.issuetabpanels:comment-tabpanel#comment-16079131>
> > if the timezone was set to UTC (Mine is GMT+2).
> >
> > When I looked around for other test cases that may require timezones, I
> > found a couple of tests ignored (Marked with @Ignore) because they depend
> > on timezones !
> >
> > Would someone please tell me how can I set the timezone for a test case ?
> > Also sharing a guide about Drill's tests classes, packages,
> > architecture...etc, would be very helpful.
> >
> > -Gelbana
>
>


Re: Drill developer guide or code organization

2017-08-28 Thread Aditya Allamraju
Thank you Paul and Muhammad.
This is very valuable information.

Thanks
Aditya

On Sun, Aug 27, 2017 at 1:50 AM, Paul Rogers  wrote:

> Hi Aditya,
>
> Drill does not have a good overview at present. The Wiki pages that
> Muhammad pointed out are about all that we can offer.
>
> Some general guidelines: almost everything you’ll want to explore is in
> the “java-exec” package. This includes the planner, the networking layer,
> the execution framework, etc.
>
> The planner is a bit hard to follow unless you learn Apache Calcite:
> Drill’s code is just a series of extensions to Calcite.
>
> Drill is columnar. Value Vectors are the internal representation, and are
> defined (via code generation) in the “vector” project.
>
> A number of storage and format plugins exist in the “contrib” projects.
>
> Please post specific questions here and we can help you. Then, I’ll adapt
> the answers to extend my own Wiki pages (the first item on the list below.)
>
> BTW: We want to move some of the more “fully baked” posts into Apache
> Drill at some point, perhaps in the Apache Drill wiki or as markdown files
> within a new Maven project.
>
> Also, as you learn about Drill, please consider creating your own summary
> of what you learn to benefit others. We can eventually pull that material
> into Drill as well.
>
> Finally, Muhammad, what challenges are you facing with the test framework?
> It is supposed to be easy, so if it is not, we’d sure like to learn about
> the challenges and fix them (or add better documentation.)
>
> Thanks,
>
> - Paul
>
>
> > On Aug 26, 2017, at 6:47 AM, Muhammad Gelbana 
> wrote:
> >
> > I agree to that. Having a documentation guiding potential committers
> > through the code can help many achieve their tasks and grow the
> community.
> > I my self am struggling a bit with the test cases framework but I'm not
> > giving my full time though.
> >
> > Anyway, here is a list of the all the Github wikis for Drill forks:
> >
> > https://github.com/paul-rogers/drill/wiki
> > https://github.com/parthchandra/drill/wiki
> > https://github.com/kkhatua/drill/wiki
> > https://github.com/bitblender/drill/wiki
> > https://github.com/chunhui-shi/drill/wiki
> > https://github.com/xiaom/drill/wiki
> > https://github.com/jacques-n/drill/wiki
> > https://github.com/XingCloud/incubator-drill/wiki (Chinese)
> >
> > Thanks,
> > Gelbana
> >
> > On Sat, Aug 26, 2017 at 3:07 PM, Aditya Allamraju <
> > aditya.allamr...@gmail.com> wrote:
> >
> >> Team,
> >>
> >> Is there a place where we have documented different Code components of
> >> Drill?
> >> What i am looking for is something similar to
> >> https://cwiki.apache.org/confluence/display/Hive/DeveloperGuide (mainly
> >> the
> >> part with code organization)
> >> I looked at apache docs. But could not find the above info in "developer
> >> information".
> >>
> >> I request the active members of the group to share such info. If it is
> not
> >> yet there, can someone please put up a doc for a start briefly
> mentioning
> >> different components and problem they are solving.
> >> Such information will greatly help the newcomers to this community.
> >>
> >> Appreciate all the efforts going on in this group.
> >>
> >> Thanks
> >> Aditya
> >>
>
>