[GitHub] drill pull request: DRILL-4215: Transfer ownership in TransferPair

2015-12-21 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/308

DRILL-4215: Transfer ownership in TransferPair

Only commit 79bbfcb is part of this commit. But since the other commits 
haven't been merged to master, they are also showing up in this PR.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/drill drill-4215

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/308.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #308


commit ea7bbc58f3e4bed5dc8093381d6c7dc4911f520e
Author: Chris Westin 
Date:   2015-11-11T22:57:16Z

DRILL-4144: Clean up close behaviors.

commit da4f9821644520bf2c4eb698770ed87713b1a039
Author: Chris Westin 
Date:   2015-11-11T22:57:47Z

DRILL-4144: Add new allocator

commit 728687f8d7c28d398ac993705dc0e4fb189b01d4
Author: Jacques Nadeau 
Date:   2015-11-16T01:26:02Z

DRILL-4144: Allocator Improvements

- make Allocator mostly lockless
- change BaseAllocator maps to direct references
- add documentation around memory management model
- move transfer and ownership methods to DrillBuf
- Improve debug messaging.
- Fix/revert sort changes
- Remove unused fragment limit flag
- Add time to HistoricalLog events
- Remove reservation amount from RootAllocator constructor (since not 
allowed)
- Fix concurrency issue where allocator is closing at same moment as 
incoming batch transfer, causing leaked memory and/or query failure.
- Add new AutoCloseables.close(Iterable)
- Remove extraneous DataResponseHandler and Impl (and update TestBitRpc to 
use smarter mock of FragmentManager)
- Remove the concept of poison pill record batches, using instead 
FragmentContext.isOverMemoryLimit()
- Update incoming data batches so that they are transferred under 
protection of a close lock
- Improve field names in IncomingBuffers and move synchronization to 
collectors as opposed to IncomingBuffers (also change decrementing to 
decrementToZero rather than two part check).

commit 79bbfcb018429ad3a02daf2ac4dea119126041e1
Author: Steven Phillips 
Date:   2015-12-01T08:34:41Z

DRILL-4215: Transfer buffer ownership in TransferPair




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4241: Add Kudu reader

2016-01-03 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/314#issuecomment-168588400
  
+1

As long as this doesn't break the build, I say we go ahead and merge it. We 
can add tests and fixes later.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: Drill 4236: ExternalSort should use the new al...

2016-01-05 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/317#issuecomment-169122198
  
+1 LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: Drill 4236: ExternalSort should use the new al...

2016-01-05 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/317#discussion_r48900539
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/ExternalSortBatch.java
 ---
@@ -588,36 +571,29 @@ public BatchGroup 
mergeAndSpill(LinkedList batchGroups) throws Schem
 } finally {
   hyperBatch.clear();
 }
-long bufSize = getBufferSize(c1);
-totalSizeInMemory += bufSize;
-logger.debug("mergeAndSpill: final total size in memory = {}", 
totalSizeInMemory);
+logger.debug("mergeAndSpill: final total size in memory = {}", 
oAllocator.getAllocatedMemory());
 logger.info("Completed spilling to {}", outputFile);
 return newGroup;
   }
 
-  private long getBufferSize(VectorAccessible batch) {
-long size = 0;
-for (VectorWrapper w : batch) {
-  DrillBuf[] bufs = w.getValueVector().getBuffers(false);
-  for (DrillBuf buf : bufs) {
-size += buf.getPossibleMemoryConsumed();
-  }
-}
-return size;
-  }
-
   private SelectionVector2 newSV2() throws OutOfMemoryException, 
InterruptedException {
-SelectionVector2 sv2 = new SelectionVector2(oContext.getAllocator());
+SelectionVector2 sv2 = new SelectionVector2(oAllocator);
 if (!sv2.allocateNewSafe(incoming.getRecordCount())) {
   try {
-// Not being able to allocate sv2 means this operator's allocator 
reached it's maximum capacity.
-// Spilling this.batchGroups won't help here as they are owned by 
upstream operator,
-// but spilling spilledBatchGroups may free enough memory
-final BatchGroup merged = mergeAndSpill(spilledBatchGroups);
+final BatchGroup merged = mergeAndSpill(batchGroups);
 if (merged != null) {
-  spilledBatchGroups.addFirst(merged);
+  spilledBatchGroups.add(merged);
--- End diff --

The ordering in the queue only matters for determining which BatchGroups 
will be merged the next time mergeAndSpill is called. Since mergeAndSpill takes 
batches off the the end of the queue, we want to put the outcome at the front 
of the queue, since the outcome BatchGroup has a much larger spill size than 
the ones at the end of the queue, and is thus more expensive to merge.

So I think you should revert this change, and leave it the way it is. It 
might be a good idea to take more extensive look at the spilling/merging 
strategy, but that can be a different task.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: Drill 4236: ExternalSort should use the new al...

2016-01-05 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/317#discussion_r48902637
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/ExternalSortBatch.java
 ---
@@ -588,36 +571,29 @@ public BatchGroup 
mergeAndSpill(LinkedList batchGroups) throws Schem
 } finally {
   hyperBatch.clear();
 }
-long bufSize = getBufferSize(c1);
-totalSizeInMemory += bufSize;
-logger.debug("mergeAndSpill: final total size in memory = {}", 
totalSizeInMemory);
+logger.debug("mergeAndSpill: final total size in memory = {}", 
oAllocator.getAllocatedMemory());
 logger.info("Completed spilling to {}", outputFile);
 return newGroup;
   }
 
-  private long getBufferSize(VectorAccessible batch) {
-long size = 0;
-for (VectorWrapper w : batch) {
-  DrillBuf[] bufs = w.getValueVector().getBuffers(false);
-  for (DrillBuf buf : bufs) {
-size += buf.getPossibleMemoryConsumed();
-  }
-}
-return size;
-  }
-
   private SelectionVector2 newSV2() throws OutOfMemoryException, 
InterruptedException {
-SelectionVector2 sv2 = new SelectionVector2(oContext.getAllocator());
+SelectionVector2 sv2 = new SelectionVector2(oAllocator);
 if (!sv2.allocateNewSafe(incoming.getRecordCount())) {
   try {
-// Not being able to allocate sv2 means this operator's allocator 
reached it's maximum capacity.
-// Spilling this.batchGroups won't help here as they are owned by 
upstream operator,
-// but spilling spilledBatchGroups may free enough memory
-final BatchGroup merged = mergeAndSpill(spilledBatchGroups);
+final BatchGroup merged = mergeAndSpill(batchGroups);
 if (merged != null) {
-  spilledBatchGroups.addFirst(merged);
+  spilledBatchGroups.add(merged);
--- End diff --

It depends on which queue we are merging from. When batches first come in 
to the ExternalSort, they are added to the batchGroup queue. Usually when 
spilling, we spill some number of batches from the batchGroup queue, and then 
put the result at the end of the spilledBatchGroup queue. In the cases where we 
need to re-merge the batchGroups, we take from the end of the spilledBatchGroup 
queue, and add them to the front of the queue.

If you are seeing a place where we mergeAndSpill the spilledBatchGroup 
queue and add to the end of the queue, that is a mistake and should be fixed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: Drill 4236: ExternalSort should use the new al...

2016-01-05 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/317#discussion_r48904994
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/ExternalSortBatch.java
 ---
@@ -588,36 +571,29 @@ public BatchGroup 
mergeAndSpill(LinkedList batchGroups) throws Schem
 } finally {
   hyperBatch.clear();
 }
-long bufSize = getBufferSize(c1);
-totalSizeInMemory += bufSize;
-logger.debug("mergeAndSpill: final total size in memory = {}", 
totalSizeInMemory);
+logger.debug("mergeAndSpill: final total size in memory = {}", 
oAllocator.getAllocatedMemory());
 logger.info("Completed spilling to {}", outputFile);
 return newGroup;
   }
 
-  private long getBufferSize(VectorAccessible batch) {
-long size = 0;
-for (VectorWrapper w : batch) {
-  DrillBuf[] bufs = w.getValueVector().getBuffers(false);
-  for (DrillBuf buf : bufs) {
-size += buf.getPossibleMemoryConsumed();
-  }
-}
-return size;
-  }
-
   private SelectionVector2 newSV2() throws OutOfMemoryException, 
InterruptedException {
-SelectionVector2 sv2 = new SelectionVector2(oContext.getAllocator());
+SelectionVector2 sv2 = new SelectionVector2(oAllocator);
 if (!sv2.allocateNewSafe(incoming.getRecordCount())) {
   try {
-// Not being able to allocate sv2 means this operator's allocator 
reached it's maximum capacity.
-// Spilling this.batchGroups won't help here as they are owned by 
upstream operator,
-// but spilling spilledBatchGroups may free enough memory
-final BatchGroup merged = mergeAndSpill(spilledBatchGroups);
+final BatchGroup merged = mergeAndSpill(batchGroups);
 if (merged != null) {
-  spilledBatchGroups.addFirst(merged);
+  spilledBatchGroups.add(merged);
--- End diff --

Yeah, since you also changed it so that we are spilling the batchGroup 
queue, this change does make sense.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4257: Fix StoragePluginRegistry clean-up...

2016-01-10 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/321#issuecomment-170419014
  
+1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4238: Add a custom RPC interface on the ...

2016-01-10 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/313#issuecomment-170433313
  
+1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3581: Upgrade Guava, HPPC and Jackson

2016-02-04 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/361#issuecomment-180172955
  
+1

Might as well go ahead and fix the misspellings I noticed, but other than 
that looks good to merge.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3623: Optimize limit 0 queries

2016-02-05 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/364

DRILL-3623: Optimize limit 0 queries

This pulls some patches that Sudheesh had previously worked on but not 
committed. In addition, fixes some problems when the type is not one of the 
json ExtendTypes.

There are still problems with float, decimal, and interval_day/year.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/drill limit_0

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/364.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #364


commit 8c3f2e8911dabc93f7b3f381600ccfaccc008938
Author: Sudheesh Katkam 
Date:   2015-11-17T23:12:48Z

DRILL-3623: Use shorter exec path for LIMIT 0 queries when schema of the 
root logical node is known

+ DRILL-4043: Perform LIMIT 0 optimization before logical transformation

commit 3be63cd5b95891cbfee1bdb6cfa2baaaf18f31c8
Author: Sudheesh Katkam 
Date:   2015-12-22T19:07:39Z

WIP

commit b4061e46fe91b747bc2afe82e66cd6d0bc060d1b
Author: Steven Phillips 
Date:   2016-01-31T01:49:04Z

Make limit 0 return correct type for VARCHAR

Still does not work correctly for float4, decimal, or interval types




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4382: Remove dependency on drill-logical...

2016-02-10 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/373

DRILL-4382: Remove dependency on drill-logical from vector package



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/drill arrow

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/373.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #373


commit e2be7853aa49467c6db0ab960fe2b11a24ccb84b
Author: Steven Phillips 
Date:   2016-02-05T01:43:17Z

DRILL-4382: Remove dependency on drill-logical from vector package




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4382: Remove dependency on drill-logical...

2016-02-10 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/373#issuecomment-182692303
  
There are two main changes made in this commit:

1. Removed SchemaPath from MaterializedField. Now, a MaterializedField 
contains a name (String), and a list of children (MaterializedField). Each 
MaterializedField instance knows its own name, but has no knowledge of its 
parents. While making this change, I also got rid of MaterializedField.Key, and 
made sure that MaterializedField was not used as a Map key anywhere in the code.

2. TransferPair no longer takes a FieldReference, but instead will take a 
String for the field name.

With those two changes, I was able to remove the dependency on 
drill-logical.

The rest of the changes in the patch are simply making the rest of the code 
conform to this new interface.

I should not that this will break external StoragePlugins. They will need 
to be modified and recompiled.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3623: Optimize limit 0 queries

2016-02-11 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/364#issuecomment-183008840
  
@hhsuanyi I did not open any calcite jira. The new calcite version was 
there from Sudheesh's commit which I pulled in.

@sudheeshkatkam Does your update handle limit 0 for all types?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4441: Fix varchar data read out of Avro ...

2016-02-29 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/393#issuecomment-190458332
  
+1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4455: Depend on Apache Arrow

2016-02-29 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/398

DRILL-4455: Depend on Apache Arrow

Remove the vector and memory modules
Replace most instances of TypeProtos.MajorType, TypeProtos.MinorType etc
  with corresponding org.apache.arrow.types.Types class.
The code for serializing and deserializing vectors is still in the Drill 
code-base
  and still uses the old TypeProtos classes. The class MajorTypeHelper is 
used
  to convert between the two sets of classes.
The old load and getMetadata classes in ValueVector are now found in a 
corresponding
  ValueVectorHelper class

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/drill arrow9

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/398.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #398


commit 91790c43f2939a028e05f8c67093d43e96fe9caa
Author: Steven Phillips 
Date:   2016-03-01T01:05:23Z

DRILL-4455: Depend on Apache Arrow

Remove the vector and memory modules
Replace most instances of TypeProtos.MajorType, TypeProtos.MinorType etc
  with corresponding org.apache.arrow.types.Types class.
The code for serializing and deserializing vectors is still in the Drill 
code-base
  and still uses the old TypeProtos classes. The class MajorTypeHelper is 
used
  to convert between the two sets of classes.
The old load and getMetadata classes in ValueVector are now found in a 
corresponding
  ValueVectorHelper class




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4486: Fix expression serialization escap...

2016-03-07 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/412

DRILL-4486: Fix expression serialization escaping



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/drill parser

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/412.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #412


commit 26c6f97e08698f91298186d9c545ea1fe3102552
Author: Steven Phillips 
Date:   2016-03-05T04:29:26Z

DRILL-4486: Fix expression serialization escaping




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4486: Fix expression serialization escap...

2016-03-07 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/412#discussion_r55305190
  
--- Diff: 
logical/src/main/java/org/apache/drill/common/expression/ExpressionStringBuilder.java
 ---
@@ -119,14 +126,14 @@ public Void visitSchemaPath(SchemaPath path, 
StringBuilder sb) throws RuntimeExc
   throw new IllegalStateException("Drill doesn't currently support top 
level arrays");
 }
 sb.append('`');
-sb.append(seg.getNameSegment().getPath());
+sb.append(escapeBackTick(seg.getNameSegment().getPath()));
--- End diff --

I think getPath() can be used outside the context of serialization, so best 
to leave that method unchanged.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4482: Avro subselection broken by 4382

2016-03-08 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/419#issuecomment-194060258
  
+1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3623: For limit 0 queries, use a shorter...

2016-03-21 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/405#discussion_r56918321
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/limit/TestLimit0.java
 ---
@@ -0,0 +1,677 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.physical.impl.limit;
+
+import com.google.common.collect.Lists;
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.drill.BaseTestQuery;
+import org.apache.drill.PlanTestBase;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.types.Types;
+import org.joda.time.DateTime;
+import org.junit.AfterClass;
+import org.junit.BeforeClass;
+import org.junit.Ignore;
+import org.junit.Test;
+
+import java.util.List;
+
+public class TestLimit0 extends BaseTestQuery {
+
+  private static final String viewName = "limitZeroEmployeeView";
+
+  private static String wrapLimit0(final String query) {
+return "SELECT * FROM (" + query + ") LZT LIMIT 0";
+  }
+
+  @BeforeClass
+  public static void createView() throws Exception {
+test("USE dfs_test.tmp");
+test(String.format("CREATE OR REPLACE VIEW %s AS SELECT " +
+"CAST(employee_id AS INT) AS employee_id, " +
+"CAST(full_name AS VARCHAR(25)) AS full_name, " +
+"CAST(position_id AS INTEGER) AS position_id, " +
+"CAST(department_id AS BIGINT) AS department_id," +
+"CAST(birth_date AS DATE) AS birth_date, " +
+"CAST(hire_date AS TIMESTAMP) AS hire_date, " +
+"CAST(salary AS DOUBLE) AS salary, " +
+"CAST(salary AS FLOAT) AS fsalary, " +
+"CAST((CASE WHEN marital_status = 'S' THEN true ELSE false END) AS 
BOOLEAN) AS single, " +
+"CAST(education_level AS VARCHAR(60)) AS education_level," +
+"CAST(gender AS CHAR) AS gender " +
+"FROM cp.`employee.json` " +
+"ORDER BY employee_id " +
+"LIMIT 1;", viewName));
+// { "employee_id":1,"full_name":"Sheri 
Nowmer","first_name":"Sheri","last_name":"Nowmer","position_id":1,
+// 
"position_title":"President","store_id":0,"department_id":1,"birth_date":"1961-08-26",
+// "hire_date":"1994-12-01 
00:00:00.0","end_date":null,"salary":8.,"supervisor_id":0,
+// "education_level":"Graduate 
Degree","marital_status":"S","gender":"F","management_role":"Senior Management" 
}
+  }
+
+  @AfterClass
+  public static void tearDownView() throws Exception {
+test("DROP VIEW " + viewName + ";");
+  }
+
+  //  SIMPLE QUERIES 
+
+  @Test
+  public void infoSchema() throws Exception {
+testBuilder()
+.sqlQuery(String.format("DESCRIBE %s", viewName))
+.unOrdered()
+.baselineColumns("COLUMN_NAME", "DATA_TYPE", "IS_NULLABLE")
+.baselineValues("employee_id", "INTEGER", "YES")
+.baselineValues("full_name", "CHARACTER VARYING", "YES")
+.baselineValues("position_id", "INTEGER", "YES")
+.baselineValues("department_id", "BIGINT", "YES")
+.baselineValues("birth_date", "DATE", "YES")
+   

[GitHub] drill pull request: DRILL-3623: For limit 0 queries, use a shorter...

2016-03-22 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/405#issuecomment-199910572
  
Sudheesh, could you respond to the comments made by Jin Feng? If you have 
already discussed it with him in person, could you post a summary here.

I am in favor of merging this PR, but just the last commit. I think the 
second commit should be separate, and should be done as an inserted operator 
(similar to IteratorValidator), rather than modifying the constructor for 
Screen.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3623: For limit 0 queries, use a shorter...

2016-03-22 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/405#issuecomment-199964093
  
+1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4562: Fix bug with nested union expressi...

2016-03-30 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/455

DRILL-4562: Fix bug with nested union expressions



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/drill drill-4562

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/455.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #455


commit d432e193492e4b81242c6d60b976d0fdef6904aa
Author: Steven Phillips 
Date:   2016-03-30T02:55:29Z

DRILL-4562: Fix bug with nested union expressions




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4566: TDigest, median, and quantile func...

2016-03-30 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/456

DRILL-4566: TDigest, median, and quantile functions

Adds tdigest, tdigest_merge, median, and quantile UDFS
Adds reduce aggregates rule to reduce:
  median(x) -> tdigest_median(tdigest(x))
  quantile(q, x) -> tdigest_quantile(q, tdigest(x))
Adds two-phase aggregate rule for tdigest

Also adds ability to specify tolerances in the TestBuilder

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/drill drill-4566

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/456.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #456


commit b6a90053c7451d92b15807ba50057f5790985efc
Author: Steven Phillips 
Date:   2016-03-30T01:55:26Z

DRILL-4566: TDigest, median, and quantile functions

Adds tdigest, tdigest_merge, median, and quantile UDFS
Adds reduce aggregates rule to reduce:
  median(x) -> tdigest_median(tdigest(x))
  quantile(q, x) -> tdigest_quantile(q, tdigest(x))
Adds two-phase aggregate rule for tdigest




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4521: Map variance() and stddev() to var...

2016-03-30 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/437#issuecomment-203657641
  
+1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-2100: Added deleting temporary spill dir...

2016-03-31 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/454#discussion_r58107846
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/ExternalSortBatch.java
 ---
@@ -110,11 +111,12 @@
   private LinkedList batchGroups = Lists.newLinkedList();
   private LinkedList spilledBatchGroups = 
Lists.newLinkedList();
   private SelectionVector4 sv4;
-  private FileSystem fs;
+  private static FileSystem fs;
--- End diff --

I agree with Jacques, except I would probably have one directory per sort 
operator in a fragment, rather one per fragment. This way the ExternalSort 
operator would be responsible for cleaning it up, and we wouldn't have to 
manage it at the Fragment level.

To clarify, right now the spill files for a sort go in a directory with the 
path:

/

Instead we should make it:

/___/

That way, the sort operator itself will be able to cleanup its own spill 
directory, and there won't be any empty directories left over.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-2100: Added deleting temporary spill dir...

2016-03-31 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/454#discussion_r58109276
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/ExternalSortBatch.java
 ---
@@ -110,11 +111,12 @@
   private LinkedList batchGroups = Lists.newLinkedList();
   private LinkedList spilledBatchGroups = 
Lists.newLinkedList();
   private SelectionVector4 sv4;
-  private FileSystem fs;
+  private static FileSystem fs;
--- End diff --

Thanks Jason, that looks better.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4593: Remove OldAssignmentCreator in Fil...

2016-04-13 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/473#issuecomment-209580405
  
+1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4258: Add threads, fragments, and querie...

2016-04-14 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/479

DRILL-4258: Add threads, fragments, and queries system tables

Here are the pojos that represent the data for the new system tables:

```java
  public static class FragmentInfo {
public String hostname;
public String queryId;
public int majorFragmentId;
public int minorFragmentId;
public Long memoryUsage;
/**
 * The maximum number of input records across all Operators in fragment
 */
public Long rowsProcessed;
public Timestamp startTime;
  }
```

```java
  public static class ThreadSummary {
/**
 * The Drillbit hostname
 */
public String hostname;

/**
 * The Drillbit user port
 */
public long user_port;
public String threadName;
public long threadId;
public boolean inNative;
public boolean suspended;
public String threadState;
/**
 * Thread cpu time during last second. Between 0 and 100
 */
public Integer cpuTime;
/**
 * Thread user cpu time during last second. Between 0 and 100
 */
public Integer userTime;
public String stackTrace;
  }
```

```java
  public static class QueryInfo {
/**
 * The host where foreman is running
 */
public String foreman;
/**
 * User who submitted query
 */
public String user;
public String queryId;
/**
 * Query sql string
 */
public String query;
public Timestamp startTime;
  }
```

I did not include data in the query table which can be obtained from the 
fragments table and doing a join on the queryId.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/drill drill-4258

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/479.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #479


commit c99fffd5f091bcea17d8a0248c22094ed90ca86d
Author: Steven Phillips 
Date:   2016-04-14T18:27:46Z

DRILL-4258: Add threads, fragments, and queries system tables




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4455: Depend on Apache Arrow

2016-04-21 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/398#issuecomment-213149221
  
I just rebased the patch and broke it up into 5 different patches.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4618: Fix hive function loader not corre...

2016-05-27 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/509#discussion_r64940411
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/EvaluationVisitor.java 
---
@@ -811,7 +811,7 @@ public HoldingContainer visitFunctionCall(FunctionCall 
call, ClassGenerator g
 @Override
 public HoldingContainer 
visitFunctionHolderExpression(FunctionHolderExpression holder, 
ClassGenerator generator) throws RuntimeException {
   HoldingContainer hc = getPrevious(holder, generator.getMappingSet());
-  if (hc == null) {
+  if (hc == null || holder.isRandom()) {
--- End diff --

I think this change is not sufficient to cover all cases. If random() is 
just part of the expression, then this won't help. For example, the expression 
2 * random().

I think a better way to fix this would be to modify 
EqualityVisitor.visitFunctionHolderExpression() to return false if the function 
is random. That way any expression which contains a random function will never 
be considered equal to another expression.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3476: Merge paths in FieldSelection

2015-07-08 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/83

DRILL-3476: Merge paths in FieldSelection



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/incubator-drill drill-3476

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/83.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #83






---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3477: Use varbinary for null vectors

2015-07-08 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/84

DRILL-3477: Use varbinary for null vectors



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/incubator-drill drill-3477

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/84.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #84


commit 5dc85777fa9826a9e72a428f9abfda1e8d066123
Author: Steven Phillips 
Date:   2015-07-09T00:39:32Z

DRILL-3477: Use varbinary for null vectors




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3353: Fix dropping nested fields

2015-07-09 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/86

DRILL-3353: Fix dropping nested fields

Use the SchemaChangeCallBack in more places to track schema changes
Reset the ephemeral transfer pair when making a new transfer pair for Map 
or RepeatedMap

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/incubator-drill drill-3353

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/86.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #86


commit 6598d5efa99e7516a882ea17582d3014e13d3ca6
Author: Steven Phillips 
Date:   2015-07-09T00:35:09Z

DRILL-3353: Fix dropping nested fields

Use the SchemaChangeCallBack in more places to track schema changes
Reset the ephemeral transfer pair when making a new transfer pair for Map 
or RepeatedMap




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-2743: Parquet file metadata caching

2015-08-14 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/114

DRILL-2743: Parquet file metadata caching



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/incubator-drill meta2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/114.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #114


commit 371153d7dfb4cd0ad5b0042f6ca2df255e81c52f
Author: Steven Phillips 
Date:   2015-03-13T23:12:34Z

DRILL-2743: Parquet file metadata caching




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: Update 2014-12-09-running-sql-queries-on-amazo...

2015-09-22 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/164

Update 2014-12-09-running-sql-queries-on-amazon-s3.md



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/drill patch-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/164.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #164


commit bd9d73b56091e9c97e819b85af59727847d95e55
Author: Steven Phillips 
Date:   2015-09-22T21:09:22Z

Update 2014-12-09-running-sql-queries-on-amazon-s3.md




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3229: Implement Union type vector

2015-10-01 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/180

DRILL-3229: Implement Union type vector



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/drill drill-3229

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/180.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #180


commit e5a529d7a276597f2b62cdcb9a1cab2fae8bc52f
Author: Steven Phillips 
Date:   2015-10-01T10:26:34Z

DRILL-3229: Implement Union type vector




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: Parquet meta

2015-10-01 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/186

Parquet meta



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/drill parquet_meta

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/186.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #186


commit c3cf974db0fe437f893fa9c1fa907620d93c994f
Author: Steven Phillips 
Date:   2015-10-01T23:54:18Z

DRILL-3887: Fix bug where metadata file not being used

commit 96534e3a5502524171df569373cc47e9860de466
Author: Steven Phillips 
Date:   2015-10-02T03:36:15Z

DRILL-3867: Store relative paths in metadata file

commit 728d6e6b12eec11d52e9f106deecaf311ebd7958
Author: Steven Phillips 
Date:   2015-10-02T03:51:29Z

DRILL-3820: Set 700 permission on metadata file




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request:

2015-10-01 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:


https://github.com/apache/drill/commit/728d6e6b12eec11d52e9f106deecaf311ebd7958#commitcomment-13556275
  
I think this will create problems with impersonation. I will need to 
rethink this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: Parquet meta

2015-10-02 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/186#discussion_r41051093
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetGroupScan.java
 ---
@@ -716,6 +726,7 @@ public String toString() {
 return "ParquetGroupScan [entries=" + entries
 + ", selectionRoot=" + selectionRoot
 + ", numFiles=" + getEntries().size()
++ ", usedCache=" + usedMetadataCache
--- End diff --

We only use the cache if there is a cache file for the root directory of 
the query.

I can change the name to useMetadataFile


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3912: Common subexpression elimination

2015-10-07 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/189

DRILL-3912: Common subexpression elimination

In EvaluationVisitor, map each logical expression subtree to the 
HoldingContainer produced when the subtree is evaluated, and store it in a 
HashMap.

Tow new classes, HashVisitor and EqualityVisitor are used to determine the 
hashcode and equality for the map. Two logical expression trees are equal if 
all child nodes are equal, and the root nodes are the same type and have the 
same fields.

Only expressions that are in scope are available in the map. When entering 
a new scope, the current map is copied into a new map. The current map is 
pushed onto a stack, and the new map becomes the current map. When that scope 
is left, the previous map is popped off the stack and becomes the current 
scope. Thus previously evaluated expression in the outer scope are available 
while in the inner scope, but any new expressions will not be available when in 
the outer scope.

A new scope is entered when evaluating a boolean expression, or when 
evaluating either branch of an If Expression.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/drill drill-3912

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/189.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #189


commit f13b4eab7ea1cb61efcda0e137006bc983c9a339
Author: Steven Phillips 
Date:   2015-10-07T10:44:10Z

DRILL-3912: Common subexpression elimination




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3917: During file selection expansion, g...

2015-10-10 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/195#issuecomment-147108592
  
+1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3918: During expansion save the metadata...

2015-10-11 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/196#issuecomment-147230673
  
+1

This is basically the same fix I was planning.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: Drill-3232: Handle promoting writers and vecto...

2015-10-16 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/207

Drill-3232: Handle promoting writers and vectors from primitive to union 
type

The PromotableFieldWriter delegates all FieldWriter API calls to an inner 
FieldWriter. This inner field writer can start as a specific type, and this 
class will promote the writer to a UnionWriter if a call is made that the 
specifically typed writer cannot handle. A new UnionVector is created, wrapping 
the original vector, and replaces the original vector in the parent vector, 
which can be either an AbstractMapVector or a ListVector.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/drill drill-3232

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/207.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #207


commit 81a4e714a868634592bb449eb490b7a30ac990fe
Author: Steven Phillips 
Date:   2015-10-01T10:26:34Z

DRILL-3229: Implement Union type vector

commit 53aaacaebff923094dee4e4d3c5fc52f112c12d4
Author: Steven Phillips 
Date:   2015-10-05T04:32:34Z

DRILL-3233: Expression handling for Union types

commit 3807ff05f8232586a83b4895ea5216ebdf4c60d1
Author: Steven Phillips 
Date:   2015-10-09T19:59:37Z

Drill-3232: Promotable writer




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: Drill-3232: Handle promoting writers and vecto...

2015-10-16 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/207#discussion_r42285176
  
--- Diff: protocol/src/main/protobuf/Types.proto ---
@@ -73,6 +74,7 @@ message MajorType {
   optional int32 precision = 4; // used for decimal types
   optional int32 scale = 5; // used for decimal types
   optional int32 timeZone = 6; // used by TimeStamp type
+  repeated MinorType sub_type = 7; // used by Union type
--- End diff --

We basically don't allow a non-leaf/non-root node of an expression tree to 
have a complex type. Expressions where an intermediate node are complex, the 
expression is split apart into separate operators. So we only care about the 
subtypes of a map or list vector when reading from the vector.

In the case of a Union vector, it is possible that intermediate nodes might 
have Union type. We pass the possible subtypes up the tree so that the code 
generated only includes the code needed for the specific types that are 
present, rather than having to generate code for every type at every node.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3912: Common subexpression elimination

2015-10-19 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/189#issuecomment-149390304
  
I actually have done some additional work on this branch to extend it to 
work with other operators. But I ran into the Scalar Replacement bug that was 
mentioned in the JIRA. So I didn't add the changes to the pull request.

The tests fail because the current PR assumes there is only a single 
MappingSet, and thus the generated code is all going into the same methods. My 
solution was to clear the evaluated expression cache whenever the 
setMappingSet() method is called.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3912: Common subexpression elimination

2015-10-19 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/189#discussion_r42445808
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/EqualityVisitor.java ---
@@ -0,0 +1,322 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.expr;
+
+import com.google.common.collect.Lists;
+import org.apache.drill.common.expression.BooleanOperator;
+import org.apache.drill.common.expression.CastExpression;
+import org.apache.drill.common.expression.ConvertExpression;
+import org.apache.drill.common.expression.FunctionCall;
+import org.apache.drill.common.expression.FunctionHolderExpression;
+import org.apache.drill.common.expression.IfExpression;
+import org.apache.drill.common.expression.LogicalExpression;
+import org.apache.drill.common.expression.NullExpression;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.expression.TypedNullConstant;
+import 
org.apache.drill.common.expression.ValueExpressions.BooleanExpression;
+import org.apache.drill.common.expression.ValueExpressions.DateExpression;
+import 
org.apache.drill.common.expression.ValueExpressions.Decimal18Expression;
+import 
org.apache.drill.common.expression.ValueExpressions.Decimal28Expression;
+import 
org.apache.drill.common.expression.ValueExpressions.Decimal38Expression;
+import 
org.apache.drill.common.expression.ValueExpressions.Decimal9Expression;
+import 
org.apache.drill.common.expression.ValueExpressions.DoubleExpression;
+import org.apache.drill.common.expression.ValueExpressions.FloatExpression;
+import org.apache.drill.common.expression.ValueExpressions.IntExpression;
+import 
org.apache.drill.common.expression.ValueExpressions.IntervalDayExpression;
+import 
org.apache.drill.common.expression.ValueExpressions.IntervalYearExpression;
+import org.apache.drill.common.expression.ValueExpressions.LongExpression;
+import org.apache.drill.common.expression.ValueExpressions.QuotedString;
+import org.apache.drill.common.expression.ValueExpressions.TimeExpression;
+import 
org.apache.drill.common.expression.ValueExpressions.TimeStampExpression;
+import org.apache.drill.common.expression.visitors.AbstractExprVisitor;
+
+import java.util.List;
+
+public class EqualityVisitor extends 
AbstractExprVisitor {
+  @Override
+  public Boolean visitFunctionCall(FunctionCall call, LogicalExpression 
value) throws RuntimeException {
+if (!(value instanceof FunctionCall)) {
+  return false;
+}
+if (!checkType(call, value)) {
+  return false;
+}
+if (!call.getName().equals(((FunctionCall) value).getName())) {
+  return false;
+}
+return checkChildren(call, value);
+  }
+
+  @Override
+  public Boolean visitFunctionHolderExpression(FunctionHolderExpression 
holder, LogicalExpression value) throws RuntimeException {
+if (!(value instanceof FunctionHolderExpression)) {
+  return false;
+}
+if (!checkType(holder, value)) {
+  return false;
+}
+if (!holder.getName().equals(((FunctionHolderExpression) 
value).getName())) {
+  return false;
+}
+return checkChildren(holder, value);
+  }
+
+  @Override
+  public Boolean visitIfExpression(IfExpression ifExpr, LogicalExpression 
value) throws RuntimeException {
+if (!(value instanceof IfExpression)) {
+  return false;
+}
+return checkChildren(ifExpr, value);
+  }
+
+  @Override
+  public Boolean visitBooleanOperator(BooleanOperator call, 
LogicalExpression value) throws RuntimeException {
+if (!(value instanceof BooleanOperator)) {
+  return false;
+}
+if (!call.getName().equals(((BooleanOperator) value).getName())) {
+  return false;
+}
+return checkChildren(call, value);
+  }
+
+  @Override
 

[GitHub] drill pull request: DRILL-3912: Common subexpression elimination

2015-10-19 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/189#discussion_r42445825
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/EvaluationVisitor.java 
---
@@ -106,19 +163,32 @@ public HoldingContainer 
visitFunctionCall(FunctionCall call, ClassGenerator g
 @Override
 public HoldingContainer visitBooleanOperator(BooleanOperator op,
 ClassGenerator generator) throws RuntimeException {
+  HoldingContainer hc = getPrevious(op);
+  if (hc != null) {
+return hc;
+  }
+  newScope();
--- End diff --

I think this should be possible, and probably less error prone.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3912: Common subexpression elimination

2015-10-19 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/189#issuecomment-149391542
  
Yeah, I think that makes sense. I will go ahead and do that.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3912: Common subexpression elimination

2015-10-20 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/189#discussion_r42537221
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/EqualityVisitor.java ---
@@ -0,0 +1,322 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.expr;
+
+import com.google.common.collect.Lists;
+import org.apache.drill.common.expression.BooleanOperator;
+import org.apache.drill.common.expression.CastExpression;
+import org.apache.drill.common.expression.ConvertExpression;
+import org.apache.drill.common.expression.FunctionCall;
+import org.apache.drill.common.expression.FunctionHolderExpression;
+import org.apache.drill.common.expression.IfExpression;
+import org.apache.drill.common.expression.LogicalExpression;
+import org.apache.drill.common.expression.NullExpression;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.expression.TypedNullConstant;
+import 
org.apache.drill.common.expression.ValueExpressions.BooleanExpression;
+import org.apache.drill.common.expression.ValueExpressions.DateExpression;
+import 
org.apache.drill.common.expression.ValueExpressions.Decimal18Expression;
+import 
org.apache.drill.common.expression.ValueExpressions.Decimal28Expression;
+import 
org.apache.drill.common.expression.ValueExpressions.Decimal38Expression;
+import 
org.apache.drill.common.expression.ValueExpressions.Decimal9Expression;
+import 
org.apache.drill.common.expression.ValueExpressions.DoubleExpression;
+import org.apache.drill.common.expression.ValueExpressions.FloatExpression;
+import org.apache.drill.common.expression.ValueExpressions.IntExpression;
+import 
org.apache.drill.common.expression.ValueExpressions.IntervalDayExpression;
+import 
org.apache.drill.common.expression.ValueExpressions.IntervalYearExpression;
+import org.apache.drill.common.expression.ValueExpressions.LongExpression;
+import org.apache.drill.common.expression.ValueExpressions.QuotedString;
+import org.apache.drill.common.expression.ValueExpressions.TimeExpression;
+import 
org.apache.drill.common.expression.ValueExpressions.TimeStampExpression;
+import org.apache.drill.common.expression.visitors.AbstractExprVisitor;
+
+import java.util.List;
+
+public class EqualityVisitor extends 
AbstractExprVisitor {
+  @Override
+  public Boolean visitFunctionCall(FunctionCall call, LogicalExpression 
value) throws RuntimeException {
+if (!(value instanceof FunctionCall)) {
+  return false;
+}
+if (!checkType(call, value)) {
+  return false;
+}
+if (!call.getName().equals(((FunctionCall) value).getName())) {
+  return false;
+}
+return checkChildren(call, value);
+  }
+
+  @Override
+  public Boolean visitFunctionHolderExpression(FunctionHolderExpression 
holder, LogicalExpression value) throws RuntimeException {
+if (!(value instanceof FunctionHolderExpression)) {
+  return false;
+}
+if (!checkType(holder, value)) {
+  return false;
+}
+if (!holder.getName().equals(((FunctionHolderExpression) 
value).getName())) {
+  return false;
+}
+return checkChildren(holder, value);
+  }
+
+  @Override
+  public Boolean visitIfExpression(IfExpression ifExpr, LogicalExpression 
value) throws RuntimeException {
+if (!(value instanceof IfExpression)) {
+  return false;
+}
+return checkChildren(ifExpr, value);
+  }
+
+  @Override
+  public Boolean visitBooleanOperator(BooleanOperator call, 
LogicalExpression value) throws RuntimeException {
+if (!(value instanceof BooleanOperator)) {
+  return false;
+}
+if (!call.getName().equals(((BooleanOperator) value).getName())) {
+  return false;
+}
+return checkChildren(call, value);
+  }
+
+  @Override
 

[GitHub] drill pull request: Add Sequence file support.

2015-10-21 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/214#discussion_r42687575
  
--- Diff: 
protocol/src/main/java/org/apache/drill/exec/proto/beans/CoreOperatorType.java 
---
@@ -58,7 +58,8 @@
 HBASE_SUB_SCAN(33),
 WINDOW(34),
 NESTED_LOOP_JOIN(35),
-AVRO_SUB_SCAN(36);
+AVRO_SUB_SCAN(36),
--- End diff --

I agree that the situation is less than ideal, and it has always kind of 
bothered me. But I think this is a separate issue and really has no bearing on 
the current PR. We should move this discussion to the mailing list or a new 
JIRA.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3975: Make sure to strip scheme and auth...

2015-10-25 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/217#issuecomment-151026505
  
+1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3912: Common subexpression elimination

2015-10-28 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/189#issuecomment-152066327
  
@jinfengni I updated the PR. Could you take a look?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3232: Handle promoting writers and vecto...

2015-10-30 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/207#discussion_r43479065
  
--- Diff: exec/java-exec/src/main/codegen/templates/MapWriters.java ---
@@ -52,9 +52,18 @@
   private final Map fields = Maps.newHashMap();
   <#if mode == "Repeated">private int currentChildIndex = 0;
 
-  public ${mode}MapWriter(${containerClass} container, FieldWriter parent) 
{
+  private final boolean unionEnabled;
+  private final boolean unionInternalMap;
--- End diff --

It's actually not needed, it is a relic of an older version. I removed it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3232: Handle promoting writers and vecto...

2015-10-30 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/207#discussion_r43479145
  
--- Diff: exec/java-exec/src/main/codegen/templates/NullReader.java ---
@@ -56,11 +56,17 @@ public void copyAsValue(MapWriter writer) {}
 
   public void copyAsValue(ListWriter writer) {}
 
-  <#list vv.types as type><#list type.minor as minor><#assign name = 
minor.class?cap_first /> 
+  public void copyAsValue(UnionWriter writer) {}
+
+  <#list vv.types as type><#list type.minor as minor><#assign name = 
minor.class?cap_first />
+  public void read(${name}Holder holder){
+throw new UnsupportedOperationException("NullReader cannot read into 
non-nullable holder");
--- End diff --

This is an internal error message, not the result of a potential user error.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3232: Handle promoting writers and vecto...

2015-10-30 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/207#discussion_r43479369
  
--- Diff: exec/java-exec/src/main/codegen/templates/UnionFunctions.java ---
@@ -0,0 +1,121 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+<@pp.dropOutputFile />
+<@pp.changeOutputFile 
name="/org/apache/drill/exec/expr/fn/impl/GUnionFunctions.java" />
+
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.drill.exec.expr.fn.impl;
+
+<#include "/@includes/vv_imports.ftl" />
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import 
org.apache.drill.exec.expr.annotations.FunctionTemplate.NullHandling;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.holders.*;
+import javax.inject.Inject;
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.exec.record.RecordBatch;
+
+/*
+ * This class is generated using freemarker and the ${.template_name} 
template.
+ */
+
+@SuppressWarnings("unused")
+public class GUnionFunctions {
+
+  <#list vv.types as type><#list type.minor as minor><#assign name = 
minor.class?cap_first />
+  <#assign fields = minor.fields!type.fields />
+  <#assign uncappedName = name?uncap_first/>
+
+  <#if !minor.class?starts_with("Decimal")>
+
+  @SuppressWarnings("unused")
+  @FunctionTemplate(name = "is_${name?upper_case}", scope = 
FunctionTemplate.FunctionScope.SIMPLE, nulls=NullHandling.INTERNAL)
+  public static class UnionIs${name} implements DrillSimpleFunc {
+
+@Param UnionHolder in;
+@Output BitHolder out;
+
+public void setup() {}
+
+public void eval() {
+  if (in.isSet == 1) {
+out.value = in.getType().getMinorType() == 
org.apache.drill.common.types.TypeProtos.MinorType.${name?upper_case} ? 1 : 0;
+  } else {
+out.value = 0;
+  }
+}
+  }
+
+  @SuppressWarnings("unused")
+  @FunctionTemplate(name = "assert_${name?upper_case}", scope = 
FunctionTemplate.FunctionScope.SIMPLE, nulls=NullHandling.INTERNAL)
+  public static class CastUnion${name} implements DrillSimpleFunc {
+
+@Param UnionHolder in;
+@Output Nullable${name}Holder out;
+
+public void setup() {}
+
+public void eval() {
+  if (in.isSet == 1) {
+in.reader.read(out);
--- End diff --

It throws a message like:
"You tried to read a [%s] type when you are using a field reader of type 
[%s]."

I think these functions are internal and not for general use, so this type 
of message is problem fine.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3232: Handle promoting writers and vecto...

2015-10-30 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/207#discussion_r43479756
  
--- Diff: exec/java-exec/src/main/codegen/templates/UnionVector.java ---
@@ -0,0 +1,479 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import org.apache.drill.common.types.TypeProtos.MinorType;
+
+<@pp.dropOutputFile />
+<@pp.changeOutputFile 
name="/org/apache/drill/exec/vector/complex/impl/UnionVector.java" />
+
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.drill.exec.vector.complex.impl;
+
+<#include "/@includes/vv_imports.ftl" />
+import java.util.Iterator;
+import org.apache.drill.exec.vector.complex.impl.ComplexCopier;
+import org.apache.drill.exec.util.CallBack;
+
+/*
+ * This class is generated using freemarker and the ${.template_name} 
template.
+ */
+@SuppressWarnings("unused")
+
+
+public class UnionVector implements ValueVector {
+
+  private MaterializedField field;
+  private BufferAllocator allocator;
+  private Accessor accessor = new Accessor();
+  private Mutator mutator = new Mutator();
+  private int valueCount;
+
+  private MapVector internalMap;
+  private SingleMapWriter internalMapWriter;
+  private UInt1Vector typeVector;
+
+  private MapVector mapVector;
+  private ListVector listVector;
+  private NullableBigIntVector bigInt;
+  private NullableVarCharVector varChar;
+
+  private FieldReader reader;
+  private NullableBitVector bit;
+
+  private State state = State.INIT;
+  private int singleType = 0;
+  private ValueVector singleVector;
+  private MajorType majorType;
+
+  private final CallBack callBack;
+
+  private enum State {
+INIT, SINGLE, MULTI
+  }
+
+  public UnionVector(MaterializedField field, BufferAllocator allocator, 
CallBack callBack) {
+this.field = field.clone();
+this.allocator = allocator;
+internalMap = new MapVector("internal", allocator, callBack);
+internalMapWriter = new SingleMapWriter(internalMap, null, true, true);
+this.typeVector = internalMap.addOrGet("types", 
Types.required(MinorType.UINT1), UInt1Vector.class);
+this.field.addChild(internalMap.getField().clone());
+this.majorType = field.getType();
+this.callBack = callBack;
+  }
+
+  private void updateState(ValueVector v) {
+if (state == State.INIT) {
+  state = State.SINGLE;
+  singleVector = v;
+  singleType = v.getField().getType().getMinorType().getNumber();
+} else {
+  state = State.MULTI;
+  singleVector = null;
+}
+  }
+
+  public List getSubTypes() {
+return majorType.getSubTypeList();
+  }
+
+  private void addSubType(MinorType type) {
+majorType =  
MajorType.newBuilder(this.majorType).addSubType(type).build();
+if (callBack != null) {
+  callBack.doWork();
+}
+  }
+
+  public boolean isSingleType() {
+return state == State.SINGLE && singleType != MinorType.LIST_VALUE;
+  }
+
+  public ValueVector getSingleVector() {
+assert state != State.MULTI : "Cannot get single vector when there are 
multiple types";
--- End diff --

This State was originally designed to eliminate some of the overhead when 
there was only a single subtype, but with PromotableWriter, this is really not 
necessary, and this State is no longer used. So I removed this code.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3232: Handle promoting writers and vecto...

2015-10-30 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/207#discussion_r43480218
  
--- Diff: exec/java-exec/src/main/codegen/templates/UnionVector.java ---
@@ -0,0 +1,479 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import org.apache.drill.common.types.TypeProtos.MinorType;
+
+<@pp.dropOutputFile />
+<@pp.changeOutputFile 
name="/org/apache/drill/exec/vector/complex/impl/UnionVector.java" />
+
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.drill.exec.vector.complex.impl;
+
+<#include "/@includes/vv_imports.ftl" />
+import java.util.Iterator;
+import org.apache.drill.exec.vector.complex.impl.ComplexCopier;
+import org.apache.drill.exec.util.CallBack;
+
+/*
+ * This class is generated using freemarker and the ${.template_name} 
template.
+ */
+@SuppressWarnings("unused")
+
+
+public class UnionVector implements ValueVector {
+
+  private MaterializedField field;
+  private BufferAllocator allocator;
+  private Accessor accessor = new Accessor();
+  private Mutator mutator = new Mutator();
+  private int valueCount;
+
+  private MapVector internalMap;
+  private SingleMapWriter internalMapWriter;
+  private UInt1Vector typeVector;
+
+  private MapVector mapVector;
+  private ListVector listVector;
+  private NullableBigIntVector bigInt;
+  private NullableVarCharVector varChar;
+
+  private FieldReader reader;
+  private NullableBitVector bit;
+
+  private State state = State.INIT;
+  private int singleType = 0;
+  private ValueVector singleVector;
+  private MajorType majorType;
+
+  private final CallBack callBack;
+
+  private enum State {
+INIT, SINGLE, MULTI
+  }
+
+  public UnionVector(MaterializedField field, BufferAllocator allocator, 
CallBack callBack) {
+this.field = field.clone();
+this.allocator = allocator;
+internalMap = new MapVector("internal", allocator, callBack);
+internalMapWriter = new SingleMapWriter(internalMap, null, true, true);
+this.typeVector = internalMap.addOrGet("types", 
Types.required(MinorType.UINT1), UInt1Vector.class);
+this.field.addChild(internalMap.getField().clone());
+this.majorType = field.getType();
+this.callBack = callBack;
+  }
+
+  private void updateState(ValueVector v) {
+if (state == State.INIT) {
+  state = State.SINGLE;
+  singleVector = v;
+  singleType = v.getField().getType().getMinorType().getNumber();
+} else {
+  state = State.MULTI;
+  singleVector = null;
+}
+  }
+
+  public List getSubTypes() {
+return majorType.getSubTypeList();
+  }
+
+  private void addSubType(MinorType type) {
+majorType =  
MajorType.newBuilder(this.majorType).addSubType(type).build();
+if (callBack != null) {
+  callBack.doWork();
+}
+  }
+
+  public boolean isSingleType() {
+return state == State.SINGLE && singleType != MinorType.LIST_VALUE;
+  }
+
+  public ValueVector getSingleVector() {
+assert state != State.MULTI : "Cannot get single vector when there are 
multiple types";
+assert state != State.INIT : "Cannot get single vector when there are 
no types";
+return singleVector;
+  }
+
+  private static final MajorType MAP_TYPE = Types.optional(MinorType.MAP);
+
+  public MapVector getMap() {
+if (mapVector == null) {
+  int vectorCount = internalMap.size();
+  mapVector = internalMap.addOrGet("map", MAP_TYPE, MapVector.class);
+  updateState(mapVector);
+  addSubType(MinorType.MAP);
+  if (internalMap.size() > vectorCount) {
+mapVector.allocateNew();
+  }

[GitHub] drill pull request: DRILL-3232: Handle promoting writers and vecto...

2015-10-30 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/207#discussion_r43482191
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java ---
@@ -242,7 +244,7 @@
   Long.MAX_VALUE, 60 * 1000 * 5);
 
   public static final String ENABLE_VERBOSE_ERRORS_KEY = 
"exec.errors.verbose";
-  public static final OptionValidator ENABLE_VERBOSE_ERRORS = new 
BooleanValidator(ENABLE_VERBOSE_ERRORS_KEY, false);
+  public static final OptionValidator ENABLE_VERBOSE_ERRORS = new 
BooleanValidator(ENABLE_VERBOSE_ERRORS_KEY, true);
--- End diff --

reverted


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3232: Handle promoting writers and vecto...

2015-10-30 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/207#discussion_r43482198
  
--- Diff: exec/java-exec/src/main/codegen/templates/UnionVector.java ---
@@ -0,0 +1,479 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import org.apache.drill.common.types.TypeProtos.MinorType;
+
+<@pp.dropOutputFile />
+<@pp.changeOutputFile 
name="/org/apache/drill/exec/vector/complex/impl/UnionVector.java" />
+
+
+<#include "/@includes/license.ftl" />
+
+package org.apache.drill.exec.vector.complex.impl;
+
+<#include "/@includes/vv_imports.ftl" />
+import java.util.Iterator;
+import org.apache.drill.exec.vector.complex.impl.ComplexCopier;
+import org.apache.drill.exec.util.CallBack;
+
+/*
+ * This class is generated using freemarker and the ${.template_name} 
template.
+ */
+@SuppressWarnings("unused")
+
+
+public class UnionVector implements ValueVector {
+
+  private MaterializedField field;
+  private BufferAllocator allocator;
+  private Accessor accessor = new Accessor();
+  private Mutator mutator = new Mutator();
+  private int valueCount;
+
+  private MapVector internalMap;
+  private SingleMapWriter internalMapWriter;
+  private UInt1Vector typeVector;
+
+  private MapVector mapVector;
+  private ListVector listVector;
+  private NullableBigIntVector bigInt;
+  private NullableVarCharVector varChar;
+
+  private FieldReader reader;
+  private NullableBitVector bit;
+
+  private State state = State.INIT;
+  private int singleType = 0;
+  private ValueVector singleVector;
+  private MajorType majorType;
+
+  private final CallBack callBack;
+
+  private enum State {
+INIT, SINGLE, MULTI
+  }
+
+  public UnionVector(MaterializedField field, BufferAllocator allocator, 
CallBack callBack) {
+this.field = field.clone();
+this.allocator = allocator;
+internalMap = new MapVector("internal", allocator, callBack);
+internalMapWriter = new SingleMapWriter(internalMap, null, true, true);
+this.typeVector = internalMap.addOrGet("types", 
Types.required(MinorType.UINT1), UInt1Vector.class);
+this.field.addChild(internalMap.getField().clone());
+this.majorType = field.getType();
+this.callBack = callBack;
+  }
+
+  private void updateState(ValueVector v) {
+if (state == State.INIT) {
+  state = State.SINGLE;
+  singleVector = v;
+  singleType = v.getField().getType().getMinorType().getNumber();
+} else {
+  state = State.MULTI;
+  singleVector = null;
+}
+  }
+
+  public List getSubTypes() {
+return majorType.getSubTypeList();
+  }
+
+  private void addSubType(MinorType type) {
+majorType =  
MajorType.newBuilder(this.majorType).addSubType(type).build();
+if (callBack != null) {
+  callBack.doWork();
+}
+  }
+
+  public boolean isSingleType() {
+return state == State.SINGLE && singleType != MinorType.LIST_VALUE;
+  }
+
+  public ValueVector getSingleVector() {
+assert state != State.MULTI : "Cannot get single vector when there are 
multiple types";
+assert state != State.INIT : "Cannot get single vector when there are 
no types";
+return singleVector;
+  }
+
+  private static final MajorType MAP_TYPE = Types.optional(MinorType.MAP);
+
+  public MapVector getMap() {
+if (mapVector == null) {
+  int vectorCount = internalMap.size();
+  mapVector = internalMap.addOrGet("map", MAP_TYPE, MapVector.class);
+  updateState(mapVector);
+  addSubType(MinorType.MAP);
+  if (internalMap.size() > vectorCount) {
+mapVector.allocateNew();
+  }

[GitHub] drill pull request: DRILL-3232: Handle promoting writers and vecto...

2015-10-30 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/207#discussion_r43546226
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/AbstractMapVector.java
 ---
@@ -173,7 +173,7 @@ public ValueVector getChildByOrdinal(int id) {
*
* Note that this method does not enforce any vector type check nor 
throws a schema change exception.
*/
-  protected void putChild(String name, ValueVector vector) {
+  public void putChild(String name, ValueVector vector) {
 putVector(name, vector);
--- End diff --

I'm not sure what it means for what is loaded to be right. But I went ahead 
and moved the UnionVector into the same package as AbstractMapVector, so I can 
use this method without making it public.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3232: Handle promoting writers and vecto...

2015-10-30 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/207#discussion_r43546612
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/ListVector.java
 ---
@@ -0,0 +1,390 @@

+/***
+
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ 
**/
+package org.apache.drill.exec.vector.complex;
+
+import com.google.common.collect.ObjectArrays;
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.common.expression.FieldReference;
+import org.apache.drill.common.expression.PathSegment;
+import org.apache.drill.common.types.TypeProtos.DataMode;
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.common.types.Types;
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.memory.OutOfMemoryRuntimeException;
+import org.apache.drill.exec.proto.UserBitShared;
+import org.apache.drill.exec.record.MaterializedField;
+import org.apache.drill.exec.record.TransferPair;
+import org.apache.drill.exec.record.TypedFieldId;
+import org.apache.drill.exec.util.CallBack;
+import org.apache.drill.exec.util.JsonStringArrayList;
+import org.apache.drill.exec.vector.AddOrGetResult;
+import org.apache.drill.exec.vector.BaseValueVector;
+import org.apache.drill.exec.vector.UInt1Vector;
+import org.apache.drill.exec.vector.UInt4Vector;
+import org.apache.drill.exec.vector.ValueVector;
+import org.apache.drill.exec.vector.VectorDescriptor;
+import org.apache.drill.exec.vector.ZeroVector;
+import org.apache.drill.exec.vector.complex.impl.ComplexCopier;
+import org.apache.drill.exec.vector.complex.impl.UnionListReader;
+import org.apache.drill.exec.vector.complex.impl.UnionListWriter;
+import org.apache.drill.exec.vector.complex.impl.UnionVector;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+import org.apache.drill.exec.vector.complex.writer.FieldWriter;
+
+import java.util.List;
+
+public class ListVector extends BaseRepeatedValueVector {
+
+  private UInt4Vector offsets;
+  private final UInt1Vector bits;
+  private Mutator mutator = new Mutator();
+  private Accessor accessor = new Accessor();
+  private UnionListWriter writer;
+  private UnionListReader reader;
+  private CallBack callBack;
+
+  public ListVector(MaterializedField field, BufferAllocator allocator, 
CallBack callBack) {
+super(field, allocator);
+this.bits = new UInt1Vector(MaterializedField.create("$bits$", 
Types.required(MinorType.UINT1)), allocator);
+offsets = getOffsetVector();
+this.field.addChild(getDataVector().getField());
+this.writer = new UnionListWriter(this);
+this.reader = new UnionListReader(this);
+this.callBack = callBack;
+  }
+
+  public UnionListWriter getWriter() {
+return writer;
+  }
+
+  @Override
+  public void allocateNew() throws OutOfMemoryRuntimeException {
+super.allocateNewSafe();
+  }
+
+  public void transferTo(ListVector target) {
+offsets.makeTransferPair(target.offsets).transfer();
+bits.makeTransferPair(target.bits).transfer();
+if (target.getDataVector() instanceof ZeroVector) {
+  target.addOrGetVector(new 
VectorDescriptor(vector.getField().getType()));
+}
+getDataVector().makeTransferPair(target.getDataVector()).transfer();
+  }
+
+  public void copyFromSafe(int inIndex, int outIndex, ListVector from) {
+copyFrom(inIndex, outIndex, from);
+  }
+
+  public void copyFrom(int inIndex, int outIndex, ListVector from) {
+FieldReader in = from.getReader();
+in.setPosition(inIndex);
+FieldWriter out = getWriter();
+out.setPosition(outIndex);
+ComplexCopier copi

[GitHub] drill pull request: DRILL-3232: Handle promoting writers and vecto...

2015-10-30 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/207#discussion_r43550594
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/JsonReader.java
 ---
@@ -275,83 +275,86 @@ private void consumeEntireNextValue() throws 
IOException {
   private void writeData(MapWriter map, FieldSelection selection, boolean 
moveForward) throws IOException {
 //
 map.start();
-outside: while (true) {
-
-  JsonToken t;
-  if(moveForward){
-t = parser.nextToken();
-  }else{
-t = parser.getCurrentToken();
-moveForward = true;
-  }
+try {
--- End diff --

Yes, I am pretty sure that is the case. I don't remember why exactly I 
needed to put this in a try-finally block.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3232: Handle promoting writers and vecto...

2015-10-30 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/207#discussion_r43550818
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java ---
@@ -26,12 +26,23 @@
 import org.apache.drill.common.util.FileUtils;
 import org.apache.drill.common.util.TestTools;
 import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.planner.physical.PlannerSettings;
+import org.junit.After;
 import org.junit.Ignore;
 import org.junit.Test;
 
 public class TestExampleQueries extends BaseTestQuery {
 //  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(TestExampleQueries.class);
 
+  @After
--- End diff --

I've removed this code altogether. It's not needed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3232: Handle promoting writers and vecto...

2015-10-30 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/207#discussion_r43563065
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/ExpressionTreeMaterializer.java
 ---
@@ -323,23 +454,50 @@ public LogicalExpression 
visitIfExpression(IfExpression ifExpr, FunctionLookupCo
 
   MinorType thenType = 
conditions.expression.getMajorType().getMinorType();
   MinorType elseType = newElseExpr.getMajorType().getMinorType();
-
-  // Check if we need a cast
-  if (thenType != elseType && !(thenType == MinorType.NULL || elseType 
== MinorType.NULL)) {
-
-MinorType leastRestrictive = 
TypeCastRules.getLeastRestrictiveType((Arrays.asList(thenType, elseType)));
-if (leastRestrictive != thenType) {
-  // Implicitly cast the then expression
+  boolean hasUnion = thenType == MinorType.UNION || elseType == 
MinorType.UNION;
+  if (unionTypeEnabled) {
+if (thenType != elseType && !(thenType == MinorType.NULL || 
elseType == MinorType.NULL)) {
+
+  MinorType leastRestrictive = MinorType.UNION;
+  MajorType.Builder builder = 
MajorType.newBuilder().setMinorType(MinorType.UNION).setMode(DataMode.OPTIONAL);
+  if (thenType == MinorType.UNION) {
+for (MinorType subType : 
conditions.expression.getMajorType().getSubTypeList()) {
+  builder.addSubType(subType);
+}
+  } else {
+builder.addSubType(thenType);
+  }
+  if (elseType == MinorType.UNION) {
+for (MinorType subType : 
newElseExpr.getMajorType().getSubTypeList()) {
+  builder.addSubType(subType);
+}
+  } else {
+builder.addSubType(elseType);
+  }
+  MajorType unionType = builder.build();
   conditions = new IfExpression.IfCondition(newCondition,
-  addCastExpression(conditions.expression, 
newElseExpr.getMajorType(), functionLookupContext, errorCollector));
-} else if (leastRestrictive != elseType) {
-  // Implicitly cast the else expression
-  newElseExpr = addCastExpression(newElseExpr, 
conditions.expression.getMajorType(), functionLookupContext, errorCollector);
-} else {
-  /* Cannot cast one of the two expressions to make the output 
type of if and else expression
-   * to be the same. Raise error.
-   */
-  throw new DrillRuntimeException("Case expression should have 
similar output type on all its branches");
+  addCastExpression(conditions.expression, unionType, 
functionLookupContext, errorCollector));
+  newElseExpr = addCastExpression(newElseExpr, unionType, 
functionLookupContext, errorCollector);
+}
+
+  } else {
+// Check if we need a cast
+if (thenType != elseType && !(thenType == MinorType.NULL || 
elseType == MinorType.NULL)) {
+
+  MinorType leastRestrictive = 
TypeCastRules.getLeastRestrictiveType((Arrays.asList(thenType, elseType)));
+  if (leastRestrictive != thenType) {
+// Implicitly cast the then expression
+conditions = new IfExpression.IfCondition(newCondition,
+addCastExpression(conditions.expression, 
newElseExpr.getMajorType(), functionLookupContext, errorCollector));
+  } else if (leastRestrictive != elseType) {
+// Implicitly cast the else expression
+newElseExpr = addCastExpression(newElseExpr, 
conditions.expression.getMajorType(), functionLookupContext, errorCollector);
+  } else {
+/* Cannot cast one of the two expressions to make the output 
type of if and else expression
+ * to be the same. Raise error.
+ */
+throw new DrillRuntimeException("Case expression should have 
similar output type on all its branches");
--- End diff --

I don't think this is a problem. In this case, the case statement will not 
coerce to a type unless one of the branches output is that type. So as long as 
there is no Union type in the input, there won't be a union in the output.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3232: Handle promoting writers and vecto...

2015-10-30 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/207#discussion_r43563231
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionGenerationHelper.java
 ---
@@ -127,7 +133,7 @@ protected static boolean isComparableType(MajorType 
type) {
 
   private static String formatCanNotCompareMsg(MajorType left, MajorType 
right) {
 StringBuilder sb = new StringBuilder();
-sb.append("Map, Array or repeated scalar type should not be used in 
group by, order by or in a comparison operator. Drill does not support compare 
between ");
+sb.append("Map, Array, Union or repeated scalar type should not be 
used in group by, order by or in a comparison operator. Drill does not support 
compare between ");
--- End diff --

Yes, they could be comparable, but that functionality has not been 
implemented yet.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3232: Handle promoting writers and vecto...

2015-10-30 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/207#discussion_r43563951
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/vector/accessor/UnionSqlAccessor.java
 ---
@@ -0,0 +1,129 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.vector.accessor;
+
+import org.apache.drill.common.types.TypeProtos.MajorType;
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.common.types.Types;
+import org.apache.drill.exec.vector.complex.impl.UnionVector;
+import org.apache.drill.exec.vector.complex.impl.UnionWriter;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+
+import java.io.InputStream;
+import java.io.Reader;
+import java.math.BigDecimal;
+import java.sql.Date;
+import java.sql.Time;
+import java.sql.Timestamp;
+
+public class UnionSqlAccessor extends AbstractSqlAccessor {
+
+  FieldReader reader;
--- End diff --

JDBC functionality wasn't really a goal here, just bare minimum so that the 
results will display in sqlline


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4000: Ensure storage plugins are not nee...

2015-11-01 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/227#issuecomment-152887889
  
+1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3912: Common subexpression elimination

2015-11-02 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/189#discussion_r43715223
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/EvaluationVisitor.java 
---
@@ -106,19 +177,30 @@ public HoldingContainer 
visitFunctionCall(FunctionCall call, ClassGenerator g
 @Override
 public HoldingContainer visitBooleanOperator(BooleanOperator op,
 ClassGenerator generator) throws RuntimeException {
+  HoldingContainer hc = getPrevious(op, generator.getMappingSet());
--- End diff --

Okay, I think I can make that change.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3912: Common subexpression elimination

2015-11-02 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/189#discussion_r43715208
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/ClassGenerator.java ---
@@ -218,11 +220,16 @@ public HoldingContainer addExpr(LogicalExpression ex) 
{
   }
 
   public HoldingContainer addExpr(LogicalExpression ex, boolean rotate) {
+return addExpr(ex, rotate, false);
--- End diff --

I actually think I will just remove the boolean flag altogether. Originally 
I wanted to limit the scope, but I think it makes sense to always have it 
enabled.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4032: Check if a new vector was created ...

2015-11-04 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/241

DRILL-4032: Check if a new vector was created to determine if all…

…ocation is necessary in MapWriter

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/drill drill-4032

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/241.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #241


commit b4b4372f0eaea76efe44d752a19947d2b1e5ed33
Author: Steven Phillips 
Date:   2015-11-04T22:17:39Z

DRILL-4032: Check for if a new vector was created to determine if 
allocation is necessary in MapWriter




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4032: Check if a new vector was created ...

2015-11-04 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/241#issuecomment-153909104
  
Okay, added a unit test.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4006: Ensure the position of MapWriters ...

2015-11-04 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/242#issuecomment-153924675
  
I think the fix is in the wrong place. We should fix the underlying vector. 
I was able to fix the problem with a small change to 
BaseRepeatedValueVector.BaseRepeatedMutator:

```java
@Override
public void startNewValue(int index) {
  while (offsets.getValueCapacity() <= index) {
offsets.reAlloc();
  }
  offsets.getMutator().setSafe(index+1, 
offsets.getAccessor().get(index));
  setValueCount(index+1);
}
```

We shouldn't need to make any change in the writer.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4006: Ensure the position of MapWriters ...

2015-11-04 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/242#issuecomment-153933449
  
No, I'm talking about this issue. The problem is not really with the 
writer, its in the underlying vector.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4006: Reallocate offset vector in repeat...

2015-11-04 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/243

DRILL-4006: Reallocate offset vector in repeated vectors when index is 
beyond the current capacity



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/drill drill-4006

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/243.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #243


commit e1d1dfefc000f7602971d602aa94d949117d8fa2
Author: Hsuan-Yi Chu 
Date:   2015-11-01T00:14:34Z

DRILL-4006: Ensure the position of MapWriters is set

This can help eliminate potential IOOB exception

commit e997ecfd3fefac9c42c89f3b8aaeb2aa30a407ca
Author: Steven Phillips 
Date:   2015-11-05T04:01:31Z

DRILL-4006: Reallocate offset vector in repeated vectors when index is 
beyond the current capacity




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4053: Reduce metadata cache file size. S...

2015-11-13 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/254#issuecomment-156576322
  
Do you have any numbers to illustrate how much of a performance improvement 
this change provides?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4053: Reduce metadata cache file size. S...

2015-11-13 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/254#issuecomment-156602190
  
Why do you need to know the version before reading the file?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4053: Reduce metadata cache file size. S...

2015-11-13 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/254#issuecomment-156603625
  
Yes, that is true, it is able to deserialize based on version, similar to 
what we do when deserializing physical plans or storage plugin configurations. 
See StoragePluginConfig.java for an example. It uses the @JsonTypeInfo 
annotation.
 
I wasn't sure if some other part of the code needs to know the version 
before deserializing.

This was designed to support new versions.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4081: Handle schema changes in ExternalS...

2015-11-15 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/257

DRILL-4081: Handle schema changes in ExternalSort



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/drill drill-4081

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/257.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #257


commit db573ffe2c15caad4a9d12dab9d28cc3e0b2f51d
Author: Steven Phillips 
Date:   2015-11-13T19:27:16Z

DRILL-4081: Handle schema changes in ExternalSort




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-3854: For convert_from, re-point readerI...

2015-11-17 Thread StevenMPhillips
Github user StevenMPhillips commented on a diff in the pull request:

https://github.com/apache/drill/pull/262#discussion_r45124172
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/WritableBatch.java ---
@@ -149,6 +149,7 @@ public static WritableBatch getBatchNoHV(int 
recordCount, Iterable
   }
 
   for (DrillBuf b : vv.getBuffers(true)) {
+b.readerIndex(0);
--- End diff --

Yes, I think that is correct. It should be outside the if (clear) block.

On Tue, Nov 17, 2015 at 1:07 PM, Jacques Nadeau 
wrote:

> In
> 
exec/java-exec/src/main/java/org/apache/drill/exec/record/WritableBatch.java
> <https://github.com/apache/drill/pull/262#discussion_r45120761>:
>
> > @@ -149,6 +149,7 @@ public static WritableBatch getBatchNoHV(int 
recordCount, Iterable
> >}
> >
> >for (DrillBuf b : vv.getBuffers(true)) {
> > +b.readerIndex(0);
>
> I believe the contract of getBuffers() is that buffers are returned in a
> reader appropriate state. As such, you should figure out which buffers are
> failing to guarantee this. It should be easy as there are only a small
> amount of implementations of this. In other words, where are we failing to
> ensure this?
>
> Given the code I looked at before, I think the problem may be that the
> readerIndex behavior is only inside the clear statement. @StevenMPhillips
> <https://github.com/StevenMPhillips> , it seems like this line:
> 
https://github.com/apache/drill/blame/master/exec/vector/src/main/java/org/apache/drill/exec/vector/BaseDataValueVector.java#L63
> should be outside the if(clear). Thoughts?
>
> —
> Reply to this email directly or view it on GitHub
> <https://github.com/apache/drill/pull/262/files#r45120761>.
>



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4053: Reduce metadata cache file size. S...

2015-12-01 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/254#issuecomment-161160748
  
+1 LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4145: Handle empty final field in Text r...

2015-12-02 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/287

DRILL-4145: Handle empty final field in Text reader correctly



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/drill drill-4145

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/287.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #287


commit 8f56250aeb29d5d21bcdc6c727cec89607150224
Author: Steven Phillips 
Date:   2015-12-02T10:09:20Z

DRILL-4145: Handle empty final field in Text reader correctly




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4109 Fix NPE in RecordIterator.

2015-12-02 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/282#issuecomment-161478598
  
+1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4159: Use test framework in TestCsvHeade...

2015-12-03 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/289

DRILL-4159: Use test framework in TestCsvHeader



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/drill master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/289.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #289


commit f4e34ccac6add649df469f85e38f064962eae90a
Author: Steven Phillips 
Date:   2015-12-03T21:39:03Z

DRILL-4159: Use test framework in TestCsvHeader




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4159: Use test framework in TestCsvHeade...

2015-12-03 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/289#issuecomment-161800836
  
I didn't do a full build with tests before merging, and it looks like there 
is a checkstyle problem.
I fixed it, and am now doing a full build to verify before merging.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4159: Use test framework in TestCsvHeade...

2015-12-03 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/289#issuecomment-161816994
  
Which method?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill issue #456: DRILL-4566: TDigest, median, and quantile functions

2017-03-02 Thread StevenMPhillips
Github user StevenMPhillips commented on the issue:

https://github.com/apache/drill/pull/456
  
There was a brief discussion on the drill-dev mailing list a few days after 
this PR was posted. Unfortunately the discussion did not culminate in any 
decision. The discussion was mainly around what syntax should we use for these 
functions, since they are actually approximate functions.

If you want to revive the discussion, or propose a resolution, feel free to 
pick up this PR and make sure it can rebase on latest drill master. I probably 
won't be getting to it for at least a few weeks. But if someone else takes it 
up and gets it into a mergeable state, and others in the community are in 
agreement, I think we can merge it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---