[GitHub] drill pull request #853: DRILL-5130: Fix explainTerms method for Values node...

2017-06-20 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/853#discussion_r123136893 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillValuesRelBase.java --- @@ -0,0 +1,44 @@ +/* + * Licensed to the

[GitHub] drill pull request #840: DRILL-5517: Size-aware set methods in value vectors

2017-06-20 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/840#discussion_r123134097 --- Diff: exec/vector/src/main/codegen/templates/VariableLengthVectors.java --- @@ -548,6 +567,23 @@ public void setSafe(int index, ByteBuffer bytes, int

[GitHub] drill pull request #840: DRILL-5517: Size-aware set methods in value vectors

2017-06-20 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/840#discussion_r123131374 --- Diff: exec/vector/src/main/codegen/templates/FixedValueVectors.java --- @@ -806,10 +998,32 @@ public void generateTestDataAlt(int size) { }

[GitHub] drill pull request #840: DRILL-5517: Size-aware set methods in value vectors

2017-06-20 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/840#discussion_r123129202 --- Diff: exec/memory/base/src/main/java/io/netty/buffer/UnsafeDirectLittleEndian.java --- @@ -174,6 +175,40 @@ public ByteBuf setDouble(int index,

[GitHub] drill pull request #840: DRILL-5517: Size-aware set methods in value vectors

2017-06-20 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/840#discussion_r123129149 --- Diff: exec/memory/base/src/main/java/io/netty/buffer/UnsafeDirectLittleEndian.java --- @@ -174,6 +175,40 @@ public ByteBuf setDouble(int index,

[GitHub] drill pull request #858: DRILL-3640: Support JDBC Statement.setQueryTimeout(...

2017-06-20 Thread kkhatua
GitHub user kkhatua opened a pull request: https://github.com/apache/drill/pull/858 DRILL-3640: Support JDBC Statement.setQueryTimeout(int) Allow for queries to be cancelled if they don't complete within the stipulated time. Tests added to test different query timeout

[GitHub] drill pull request #857: DRILL-5599: Notify StatusHandler that batch sending...

2017-06-20 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/857#discussion_r123135163 --- Diff: exec/rpc/src/main/java/org/apache/drill/exec/rpc/RequestIdMap.java --- @@ -47,7 +47,7 @@ private final IntObjectHashMap map;

[GitHub] drill pull request #857: DRILL-5599: Notify StatusHandler that batch sending...

2017-06-20 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/857#discussion_r123135561 --- Diff: exec/rpc/src/main/java/org/apache/drill/exec/rpc/RequestIdMap.java --- @@ -111,13 +111,16 @@ public RpcListener(RpcOutcomeListener handler,

[GitHub] drill pull request #822: DRILL-5457: Spill implementation for Hash Aggregate

2017-06-20 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/drill/pull/822 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is

[jira] [Created] (DRILL-5600) Using convert_to function on top of a map gives random errors

2017-06-20 Thread Rahul Challapalli (JIRA)
Rahul Challapalli created DRILL-5600: Summary: Using convert_to function on top of a map gives random errors Key: DRILL-5600 URL: https://issues.apache.org/jira/browse/DRILL-5600 Project: Apache

Re: Performance issue with 2 phase hash-agg design

2017-06-20 Thread rahul challapalli
Thanks for sharing the link, Aman. On Tue, Jun 20, 2017 at 3:26 PM, Aman Sinha wrote: > See [1] which talks about this behavior for unique keys and suggests > manually setting the single phase agg. > We would need NDV statistics on the group-by keys to have the optimizer >

[GitHub] drill pull request #857: DRILL-5599: Notify StatusHandler that batch sending...

2017-06-20 Thread ppadma
Github user ppadma commented on a diff in the pull request: https://github.com/apache/drill/pull/857#discussion_r123118364 --- Diff: exec/rpc/src/main/java/org/apache/drill/exec/rpc/RequestIdMap.java --- @@ -47,7 +47,7 @@ private final IntObjectHashMap map;

[GitHub] drill pull request #857: DRILL-5599: Notify StatusHandler that batch sending...

2017-06-20 Thread ppadma
Github user ppadma commented on a diff in the pull request: https://github.com/apache/drill/pull/857#discussion_r123118764 --- Diff: exec/rpc/src/main/java/org/apache/drill/exec/rpc/RequestIdMap.java --- @@ -111,13 +111,16 @@ public RpcListener(RpcOutcomeListener handler, Class

[GitHub] drill pull request #857: DRILL-5599: Notify StatusHandler that batch sending...

2017-06-20 Thread ppadma
Github user ppadma commented on a diff in the pull request: https://github.com/apache/drill/pull/857#discussion_r123118252 --- Diff: exec/rpc/src/main/java/org/apache/drill/exec/rpc/RequestIdMap.java --- @@ -111,13 +111,16 @@ public RpcListener(RpcOutcomeListener handler, Class

Timeline for the 1.11.0 release?

2017-06-20 Thread Abhishek Girish
Hey all, A couple of questions have come up on mailing lists and elsewhere asking about the next release of Apache Drill. Do we have a timeline for the same? -Abhishek

Re: Performance issue with 2 phase hash-agg design

2017-06-20 Thread Aman Sinha
See [1] which talks about this behavior for unique keys and suggests manually setting the single phase agg. We would need NDV statistics on the group-by keys to have the optimizer pick the more efficient scheme. [1] https://drill.apache.org/docs/guidelines-for-optimizing-aggregation/ On Tue, Jun

Re: Performance issue with 2 phase hash-agg design

2017-06-20 Thread Chun Chang
I also noticed that if the keys are mostly unique, the first-phase aggregation effort is mostly wasted. This can and should be improved. One idea is to detect unique keys while processing. When the percentage of unique keys exceeds a certain threshold after processing a certain percentage of the data,
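The detection step described in this message could be sketched roughly as follows. This is a minimal illustration, not Drill code: the class name `UniqueKeyDetector`, its methods, and both constructor parameters are hypothetical, and the sample is expressed as a fixed row count rather than a percentage of the data, which assumes the operator does not know the total row count up front. The preview is cut off before it says what should happen once the threshold is crossed, so the sketch only reports the outcome of the check.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper (not part of Drill): samples group-by keys and reports whether they look mostly unique. */
final class UniqueKeyDetector {
  private final long sampleSize;         // assumed: number of rows to observe before deciding
  private final double uniqueThreshold;  // assumed: distinct/rows ratio above which keys count as mostly unique
  private final Set<Object> seen = new HashSet<>();
  private long rows;

  UniqueKeyDetector(long sampleSize, double uniqueThreshold) {
    this.sampleSize = sampleSize;
    this.uniqueThreshold = uniqueThreshold;
  }

  /** Records one group-by key; keys arriving after the sample window is full are ignored. */
  void observe(Object key) {
    if (rows < sampleSize) {
      rows++;
      seen.add(key);
    }
  }

  /** True once the configured number of rows has been observed. */
  boolean sampleComplete() {
    return rows >= sampleSize;
  }

  /** Fraction of observed rows that carried a previously unseen key. */
  double uniqueRatio() {
    return rows == 0 ? 0.0 : (double) seen.size() / rows;
  }

  /** True only when the sample is complete and the unique ratio exceeds the threshold. */
  boolean mostlyUnique() {
    return sampleComplete() && uniqueRatio() > uniqueThreshold;
  }
}
```

A caller would feed each incoming key to `observe` and consult `mostlyUnique()` once `sampleComplete()` returns true. Keeping the sample bounded limits the memory spent on the `seen` set.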

Performance issue with 2 phase hash-agg design

2017-06-20 Thread rahul challapalli
During the first phase, the hash-agg operator is not protected from skew in the data (e.g., the data contains 2 files where the number of records in one file is very large compared to the other). Assuming there are only 2 fragments, the hash-agg operator in one fragment handles more records and it

[jira] [Created] (DRILL-5599) Notify StatusHandlerListener that batch sending has failed even if channel is still open

2017-06-20 Thread Arina Ielchiieva (JIRA)
Arina Ielchiieva created DRILL-5599: --- Summary: Notify StatusHandlerListener that batch sending has failed even if channel is still open Key: DRILL-5599 URL: https://issues.apache.org/jira/browse/DRILL-5599

[GitHub] drill pull request #857: DRILL-5599: Notify StatusHandlerListener that batch...

2017-06-20 Thread arina-ielchiieva
GitHub user arina-ielchiieva opened a pull request: https://github.com/apache/drill/pull/857 DRILL-5599: Notify StatusHandlerListener that batch sending has failed even if channel is still open Details in [DRILL-5599](https://issues.apache.org/jira/browse/DRILL-5599) You can merge

[GitHub] drill pull request #856: DRILL-3640: Support JDBC Statement.setQueryTimeout(...

2017-06-20 Thread kkhatua
Github user kkhatua closed the pull request at: https://github.com/apache/drill/pull/856