[jira] [Commented] (DRILL-5694) hash agg spill to disk, second phase OOM

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172758#comment-16172758
 ] 

ASF GitHub Bot commented on DRILL-5694:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/938#discussion_r139877947
  
--- Diff: common/src/main/java/org/apache/drill/common/exceptions/UserException.java ---
@@ -536,6 +542,33 @@ public Builder pushContext(final String name, final double value) {
   * @return user exception
   */
 public UserException build(final Logger logger) {
+
+  // To allow for debugging:
+  // A spinner code to make the execution stop here while the file '/tmp/drillspin' exists
+  // Can be used to attach a debugger, use jstack, etc
+  // The processID of the spinning thread should be in a file like /tmp/spin4148663301172491613.tmp
+  // along with the error message.
+  File spinFile = new File("/tmp/drillspin");
+  if ( spinFile.exists() ) {
+    File tmpDir = new File("/tmp");
+    File outErr = null;
+    try {
+      outErr = File.createTempFile("spin", ".tmp", tmpDir);
+      BufferedWriter bw = new BufferedWriter(new FileWriter(outErr));
+      bw.write("Spinning process: " + ManagementFactory.getRuntimeMXBean().getName()
+          /* After upgrading to JDK 9 - replace with: ProcessHandle.current().getPid() */);
+      bw.write("\nError cause: " +
+          (errorType == DrillPBError.ErrorType.SYSTEM ? ("SYSTEM ERROR: " + ErrorHelper.getRootMessage(cause)) : message));
+      bw.close();
+    } catch (Exception ex) {
+      logger.warn("Failed creating a spinner tmp message file: {}", ex);
+    }
+    while (spinFile.exists()) {
+      try { sleep(1_000); } catch (Exception ex) { /* ignore interruptions */ }
--- End diff --

What happens if the fragment executor tries to kill the query? Do we want 
the spinner to ignore that request here?
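
For reference, here is a minimal sketch (not part of the PR) of a spin loop that also honors interruption, so a cancel request from the fragment executor is not silently swallowed; the handling shown is an assumption, not the author's implementation:

{code}
import java.io.File;

public class SpinLoopSketch {
  /**
   * Spin while the marker file exists, but stop as soon as the thread is
   * interrupted (for example when the fragment executor cancels the query).
   */
  static void spinWhileExists(File spinFile) {
    while (spinFile.exists()) {
      try {
        Thread.sleep(1_000);
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt(); // preserve the interrupt status
        break;                              // stop spinning so the cancel can proceed
      }
    }
  }
}
{code}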


> hash agg spill to disk, second phase OOM
> 
>
> Key: DRILL-5694
> URL: https://issues.apache.org/jira/browse/DRILL-5694
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.11.0
>Reporter: Chun Chang
>Assignee: Boaz Ben-Zvi
>
> | 1.11.0-SNAPSHOT  | d622f76ee6336d97c9189fc589befa7b0f4189d6  | DRILL-5165: 
> For limit all case, no need to push down limit to scan  | 21.07.2017 @ 
> 10:36:29 PDT
> Second phase agg ran out of memory. Not supposed to. Test data currently only 
> accessible locally.
> /root/drill-test-framework/framework/resources/Advanced/hash-agg/spill/hagg15.q
> Query:
> select row_count, sum(row_count), avg(double_field), max(double_rand), 
> count(float_rand) from parquet_500m_v1 group by row_count order by row_count 
> limit 30
> Failed with exception
> java.sql.SQLException: RESOURCE ERROR: One or more nodes ran out of memory 
> while executing the query.
> HT was: 534773760 OOM at Second Phase. Partitions: 32. Estimated batch size: 
> 4849664. Planned batches: 0. Rows spilled so far: 6459928 Memory limit: 
> 536870912 so far allocated: 534773760.
> Fragment 1:6
> [Error Id: a193babd-f783-43da-a476-bb8dd4382420 on 10.10.30.168:31010]
>   (org.apache.drill.exec.exception.OutOfMemoryException) HT was: 534773760 
> OOM at Second Phase. Partitions: 32. Estimated batch size: 4849664. Planned 
> batches: 0. Rows spilled so far: 6459928 Memory limit: 536870912 so far 
> allocated: 534773760.
> 
> org.apache.drill.exec.test.generated.HashAggregatorGen1823.checkGroupAndAggrValues():1175
> org.apache.drill.exec.test.generated.HashAggregatorGen1823.doWork():539
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():168
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():133
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.physical.impl.TopN.TopNBatch.innerNext():191
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> 

[jira] [Commented] (DRILL-5694) hash agg spill to disk, second phase OOM

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172761#comment-16172761
 ] 

ASF GitHub Bot commented on DRILL-5694:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/938#discussion_r139877011
  
--- Diff: common/src/main/java/org/apache/drill/common/exceptions/UserException.java ---
@@ -536,6 +542,33 @@ public Builder pushContext(final String name, final double value) {
   * @return user exception
   */
 public UserException build(final Logger logger) {
+
+  // To allow for debugging:
+  // A spinner code to make the execution stop here while the file '/tmp/drillspin' exists
--- End diff --

Would recommend `/tmp/drill/spin`. We already use `/tmp/drill` for other 
items, so this keeps things tidy.


> hash agg spill to disk, second phase OOM
> 
>
> Key: DRILL-5694
> URL: https://issues.apache.org/jira/browse/DRILL-5694
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.11.0
>Reporter: Chun Chang
>Assignee: Boaz Ben-Zvi
>
> | 1.11.0-SNAPSHOT  | d622f76ee6336d97c9189fc589befa7b0f4189d6  | DRILL-5165: 
> For limit all case, no need to push down limit to scan  | 21.07.2017 @ 
> 10:36:29 PDT
> Second phase agg ran out of memory. Not supposed to. Test data currently only 
> accessible locally.
> /root/drill-test-framework/framework/resources/Advanced/hash-agg/spill/hagg15.q
> Query:
> select row_count, sum(row_count), avg(double_field), max(double_rand), 
> count(float_rand) from parquet_500m_v1 group by row_count order by row_count 
> limit 30
> Failed with exception
> java.sql.SQLException: RESOURCE ERROR: One or more nodes ran out of memory 
> while executing the query.
> HT was: 534773760 OOM at Second Phase. Partitions: 32. Estimated batch size: 
> 4849664. Planned batches: 0. Rows spilled so far: 6459928 Memory limit: 
> 536870912 so far allocated: 534773760.
> Fragment 1:6
> [Error Id: a193babd-f783-43da-a476-bb8dd4382420 on 10.10.30.168:31010]
>   (org.apache.drill.exec.exception.OutOfMemoryException) HT was: 534773760 
> OOM at Second Phase. Partitions: 32. Estimated batch size: 4849664. Planned 
> batches: 0. Rows spilled so far: 6459928 Memory limit: 536870912 so far 
> allocated: 534773760.
> 
> org.apache.drill.exec.test.generated.HashAggregatorGen1823.checkGroupAndAggrValues():1175
> org.apache.drill.exec.test.generated.HashAggregatorGen1823.doWork():539
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():168
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():133
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.physical.impl.TopN.TopNBatch.innerNext():191
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():93
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.physical.impl.BaseRootExec.next():105
> 
> org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():92
> org.apache.drill.exec.physical.impl.BaseRootExec.next():95
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():234
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():227
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():415
> org.apache.hadoop.security.UserGroupInformation.doAs():1595
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():227
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1145
> java.util.concurrent.ThreadPoolExecutor$Worker.run():615
> java.lang.Thread.run():745
>   Caused By (org.apache.drill.exec.exception.OutOfMemoryException) Unable to 
> allocate buffer of size 4194304 due to memory limit. Current allocation: 
> 534773760
> org.apache.drill.exec.memory.BaseAllocator.buffer():238
> org.apache.drill.exec.memory.BaseAllocator.buffer():213
>  

[jira] [Commented] (DRILL-5694) hash agg spill to disk, second phase OOM

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172759#comment-16172759
 ] 

ASF GitHub Bot commented on DRILL-5694:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/938#discussion_r139877079
  
--- Diff: common/src/main/java/org/apache/drill/common/exceptions/UserException.java ---
@@ -536,6 +542,33 @@ public Builder pushContext(final String name, final double value) {
   * @return user exception
   */
 public UserException build(final Logger logger) {
+
+  // To allow for debugging:
+  // A spinner code to make the execution stop here while the file '/tmp/drillspin' exists
+  // Can be used to attach a debugger, use jstack, etc
+  // The processID of the spinning thread should be in a file like /tmp/spin4148663301172491613.tmp
--- End diff --

Would also recommend `/tmp/drill/spin...`.


> hash agg spill to disk, second phase OOM
> 
>
> Key: DRILL-5694
> URL: https://issues.apache.org/jira/browse/DRILL-5694
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.11.0
>Reporter: Chun Chang
>Assignee: Boaz Ben-Zvi
>
> | 1.11.0-SNAPSHOT  | d622f76ee6336d97c9189fc589befa7b0f4189d6  | DRILL-5165: 
> For limit all case, no need to push down limit to scan  | 21.07.2017 @ 
> 10:36:29 PDT
> Second phase agg ran out of memory. Not supposed to. Test data currently only 
> accessible locally.
> /root/drill-test-framework/framework/resources/Advanced/hash-agg/spill/hagg15.q
> Query:
> select row_count, sum(row_count), avg(double_field), max(double_rand), 
> count(float_rand) from parquet_500m_v1 group by row_count order by row_count 
> limit 30
> Failed with exception
> java.sql.SQLException: RESOURCE ERROR: One or more nodes ran out of memory 
> while executing the query.
> HT was: 534773760 OOM at Second Phase. Partitions: 32. Estimated batch size: 
> 4849664. Planned batches: 0. Rows spilled so far: 6459928 Memory limit: 
> 536870912 so far allocated: 534773760.
> Fragment 1:6
> [Error Id: a193babd-f783-43da-a476-bb8dd4382420 on 10.10.30.168:31010]
>   (org.apache.drill.exec.exception.OutOfMemoryException) HT was: 534773760 
> OOM at Second Phase. Partitions: 32. Estimated batch size: 4849664. Planned 
> batches: 0. Rows spilled so far: 6459928 Memory limit: 536870912 so far 
> allocated: 534773760.
> 
> org.apache.drill.exec.test.generated.HashAggregatorGen1823.checkGroupAndAggrValues():1175
> org.apache.drill.exec.test.generated.HashAggregatorGen1823.doWork():539
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():168
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():133
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.physical.impl.TopN.TopNBatch.innerNext():191
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():93
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.physical.impl.BaseRootExec.next():105
> 
> org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():92
> org.apache.drill.exec.physical.impl.BaseRootExec.next():95
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():234
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():227
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():415
> org.apache.hadoop.security.UserGroupInformation.doAs():1595
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():227
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1145
> java.util.concurrent.ThreadPoolExecutor$Worker.run():615
> java.lang.Thread.run():745
>   Caused By (org.apache.drill.exec.exception.OutOfMemoryException) Unable to 
> allocate buffer of size 4194304 due to memory limit. Current allocation: 
> 534773760
> 

[jira] [Commented] (DRILL-5694) hash agg spill to disk, second phase OOM

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172760#comment-16172760
 ] 

ASF GitHub Bot commented on DRILL-5694:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/938#discussion_r139877726
  
--- Diff: common/src/main/java/org/apache/drill/common/exceptions/UserException.java ---
@@ -536,6 +542,33 @@ public Builder pushContext(final String name, final double value) {
   * @return user exception
   */
 public UserException build(final Logger logger) {
+
+  // To allow for debugging:
+  // A spinner code to make the execution stop here while the file '/tmp/drillspin' exists
+  // Can be used to attach a debugger, use jstack, etc
+  // The processID of the spinning thread should be in a file like /tmp/spin4148663301172491613.tmp
+  // along with the error message.
+  File spinFile = new File("/tmp/drillspin");
--- End diff --

Should this be a config setting? Probably the config is not visible here, 
but can we set a static variable at start-up time? And, since this code will 
check the file system on every exception, should we have a config variable to 
turn on the check?

Feel free to tell me I'm being overly paranoid...
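
For illustration, a minimal sketch of the start-up-gated check suggested above; the property name `drill.exec.debug.spin` is hypothetical and only stands in for whatever option Drill's config system would actually expose:

{code}
import java.io.File;

public class SpinGateSketch {
  // Read once at class-load time, so the file system is not probed on every
  // exception unless spinning was explicitly enabled at start-up.
  // "drill.exec.debug.spin" is a hypothetical property name, for illustration only.
  private static final boolean SPIN_ENABLED =
      Boolean.getBoolean("drill.exec.debug.spin");

  static boolean shouldSpin(File spinFile) {
    return SPIN_ENABLED && spinFile.exists();
  }
}
{code}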


> hash agg spill to disk, second phase OOM
> 
>
> Key: DRILL-5694
> URL: https://issues.apache.org/jira/browse/DRILL-5694
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.11.0
>Reporter: Chun Chang
>Assignee: Boaz Ben-Zvi
>
> | 1.11.0-SNAPSHOT  | d622f76ee6336d97c9189fc589befa7b0f4189d6  | DRILL-5165: 
> For limit all case, no need to push down limit to scan  | 21.07.2017 @ 
> 10:36:29 PDT
> Second phase agg ran out of memory. Not supposed to. Test data currently only 
> accessible locally.
> /root/drill-test-framework/framework/resources/Advanced/hash-agg/spill/hagg15.q
> Query:
> select row_count, sum(row_count), avg(double_field), max(double_rand), 
> count(float_rand) from parquet_500m_v1 group by row_count order by row_count 
> limit 30
> Failed with exception
> java.sql.SQLException: RESOURCE ERROR: One or more nodes ran out of memory 
> while executing the query.
> HT was: 534773760 OOM at Second Phase. Partitions: 32. Estimated batch size: 
> 4849664. Planned batches: 0. Rows spilled so far: 6459928 Memory limit: 
> 536870912 so far allocated: 534773760.
> Fragment 1:6
> [Error Id: a193babd-f783-43da-a476-bb8dd4382420 on 10.10.30.168:31010]
>   (org.apache.drill.exec.exception.OutOfMemoryException) HT was: 534773760 
> OOM at Second Phase. Partitions: 32. Estimated batch size: 4849664. Planned 
> batches: 0. Rows spilled so far: 6459928 Memory limit: 536870912 so far 
> allocated: 534773760.
> 
> org.apache.drill.exec.test.generated.HashAggregatorGen1823.checkGroupAndAggrValues():1175
> org.apache.drill.exec.test.generated.HashAggregatorGen1823.doWork():539
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():168
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():133
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.physical.impl.TopN.TopNBatch.innerNext():191
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():93
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.physical.impl.BaseRootExec.next():105
> 
> org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():92
> org.apache.drill.exec.physical.impl.BaseRootExec.next():95
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():234
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():227
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():415
> org.apache.hadoop.security.UserGroupInformation.doAs():1595
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():227
> 

[jira] [Commented] (DRILL-5002) Using hive's date functions on top of date column gives wrong results for local time-zone

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172754#comment-16172754
 ] 

ASF GitHub Bot commented on DRILL-5002:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/937


> Using hive's date functions on top of date column gives wrong results for 
> local time-zone
> -
>
> Key: DRILL-5002
> URL: https://issues.apache.org/jira/browse/DRILL-5002
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Hive, Storage - Parquet
>Reporter: Rahul Challapalli
>Assignee: Vitalii Diravka
>Priority: Critical
>  Labels: ready-to-commit
> Attachments: 0_0_0.parquet
>
>
> git.commit.id.abbrev=190d5d4
> Wrong Result 1 :
> {code}
> select l_shipdate, `month`(l_shipdate) from cp.`tpch/lineitem.parquet` where 
> l_shipdate = date '1994-02-01' limit 2;
> +-------------+---------+
> | l_shipdate  | EXPR$1  |
> +-------------+---------+
> | 1994-02-01  | 1       |
> | 1994-02-01  | 1       |
> +-------------+---------+
> {code}
> Wrong Result 2 : 
> {code}
> select l_shipdate, `day`(l_shipdate) from cp.`tpch/lineitem.parquet` where 
> l_shipdate = date '1998-06-02' limit 2;
> +-------------+---------+
> | l_shipdate  | EXPR$1  |
> +-------------+---------+
> | 1998-06-02  | 1       |
> | 1998-06-02  | 1       |
> +-------------+---------+
> {code}
> Correct Result :
> {code}
> select l_shipdate, `month`(l_shipdate) from cp.`tpch/lineitem.parquet` where 
> l_shipdate = date '1998-06-02' limit 2;
> +-------------+---------+
> | l_shipdate  | EXPR$1  |
> +-------------+---------+
> | 1998-06-02  | 6       |
> | 1998-06-02  | 6       |
> +-------------+---------+
> {code}
> It looks like we are getting wrong results when the 'day' is '01'. I only 
> tried the month and day hive functions, but wouldn't be surprised if others have 
> similar issues too.
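
These results are consistent with a date stored as UTC midnight being rendered in a local time zone west of UTC, which rolls it back to the previous day. A small java.time sketch of that effect (illustration only, not Drill's or Hive's actual code path):

{code}
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class DateShiftSketch {
  public static void main(String[] args) {
    // A DATE value of 1994-02-01 stored as midnight UTC.
    Instant utcMidnight = LocalDate.of(1994, 2, 1)
        .atStartOfDay(ZoneOffset.UTC).toInstant();

    // Rendered in a zone west of UTC it becomes the previous day,
    // so month() reports 1 (January) instead of 2.
    LocalDate local = utcMidnight.atZone(ZoneId.of("America/Los_Angeles")).toLocalDate();
    System.out.println(local);                 // 1994-01-31
    System.out.println(local.getMonthValue()); // 1
  }
}
{code}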



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (DRILL-5493) Managed External Sort + CTAS creates batches too large for sort

2017-09-19 Thread Robert Hou (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou reassigned DRILL-5493:
-

Assignee: Paul Rogers

> Managed External Sort + CTAS creates batches too large for sort
> ---
>
> Key: DRILL-5493
> URL: https://issues.apache.org/jira/browse/DRILL-5493
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.10.0
>Reporter: Rahul Challapalli
>Assignee: Paul Rogers
> Attachments: 26ee07bb-81ff-1c10-9003-90510f4b8e1d.sys.drill, 
> drillbit.log
>
>
> Config :
> {code}
> git.commit.id.abbrev=1e0a14c
> No of nodes : 1
> DRILL_MAX_DIRECT_MEMORY="32G"
> DRILL_MAX_HEAP="4G"
> Assertions Enabled : true
> {code}
> The below query fails during the CTAS phase (the explicit order by in the 
> query runs fine)
> {code}
> ALTER SESSION SET `exec.sort.disable_managed` = false;
> alter session set `planner.width.max_per_query` = 17;
> create table dfs.drillTestDir.xsort_ctas4 partition by (col1) as select 
> columns[0] as col1 from (select * from 
> dfs.`/drill/testdata/resource-manager/wide-to-zero` order by columns[0]);
> Error: RESOURCE ERROR: Unable to allocate sv2 buffer
> Fragment 0:0
> [Error Id: 24ae2ec8-ac2a-45c3-b550-43c12764165d on qa-node190.qa.lab:31010] 
> (state=,code=0)
> {code}
> I attached the logs and profiles. The data is too large to attach to a jira.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (DRILL-5795) Filter pushdown for parquet handles multi rowgroup file

2017-09-19 Thread Damien Profeta (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damien Profeta reassigned DRILL-5795:
-

Assignee: Damien Profeta

> Filter pushdown for parquet handles multi rowgroup file
> ---
>
> Key: DRILL-5795
> URL: https://issues.apache.org/jira/browse/DRILL-5795
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: Damien Profeta
>Assignee: Damien Profeta
>  Labels: doc-impacting
>
> DRILL-1950 implemented filter pushdown for parquet files, but only for the 
> case of one rowgroup per parquet file. With multiple rowgroups per 
> file, it detects that a rowgroup can be pruned but then tells the 
> drillbit to read the whole file, which leads to a performance issue.
> Having multiple rowgroups per file helps handle partitioned datasets while 
> still reading only the relevant subset of the data, without ending up with 
> more files than really needed.
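
A sketch of the intended behavior (hypothetical types, not the actual Drill or parquet-mr API): evaluate the filter against each row group's column statistics and hand the reader only the surviving row groups, instead of falling back to reading the whole file:

{code}
import java.util.ArrayList;
import java.util.List;

public class RowGroupPruningSketch {
  /** Hypothetical per-row-group statistics for one filter column. */
  static class RowGroupStats {
    final int index;
    final long min;
    final long max;
    RowGroupStats(int index, long min, long max) {
      this.index = index;
      this.min = min;
      this.max = max;
    }
  }

  /**
   * Keep only the row groups whose [min, max] range can contain the predicate
   * value; the reader would then scan exactly these row groups rather than
   * the whole file.
   */
  static List<Integer> prune(List<RowGroupStats> rowGroups, long filterValue) {
    List<Integer> keep = new ArrayList<>();
    for (RowGroupStats rg : rowGroups) {
      if (filterValue >= rg.min && filterValue <= rg.max) {
        keep.add(rg.index);
      }
    }
    return keep;
  }
}
{code}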



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (DRILL-5795) Filter pushdown for parquet handles multi rowgroup file

2017-09-19 Thread Damien Profeta (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damien Profeta updated DRILL-5795:
--
Labels: doc-impacting  (was: )

> Filter pushdown for parquet handles multi rowgroup file
> ---
>
> Key: DRILL-5795
> URL: https://issues.apache.org/jira/browse/DRILL-5795
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: Damien Profeta
>  Labels: doc-impacting
>
> DRILL-1950 implemented filter pushdown for parquet files, but only for the 
> case of one rowgroup per parquet file. With multiple rowgroups per 
> file, it detects that a rowgroup can be pruned but then tells the 
> drillbit to read the whole file, which leads to a performance issue.
> Having multiple rowgroups per file helps handle partitioned datasets while 
> still reading only the relevant subset of the data, without ending up with 
> more files than really needed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5795) Filter pushdown for parquet handles multi rowgroup file

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172509#comment-16172509
 ] 

ASF GitHub Bot commented on DRILL-5795:
---

Github user dprofeta commented on the issue:

https://github.com/apache/drill/pull/949
  
@parthchandra Can you please review?


> Filter pushdown for parquet handles multi rowgroup file
> ---
>
> Key: DRILL-5795
> URL: https://issues.apache.org/jira/browse/DRILL-5795
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: Damien Profeta
>
> DRILL-1950 implemented filter pushdown for parquet files, but only for the 
> case of one rowgroup per parquet file. With multiple rowgroups per 
> file, it detects that a rowgroup can be pruned but then tells the 
> drillbit to read the whole file, which leads to a performance issue.
> Having multiple rowgroups per file helps handle partitioned datasets while 
> still reading only the relevant subset of the data, without ending up with 
> more files than really needed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (DRILL-5806) DrillRuntimeException: Interrupted but context.shouldContinue() is true

2017-09-19 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-5806:
-

 Summary: DrillRuntimeException: Interrupted but 
context.shouldContinue() is true
 Key: DRILL-5806
 URL: https://issues.apache.org/jira/browse/DRILL-5806
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.12.0
 Environment: Drill 1.12.0 commit : 
aaff1b35b7339fb4e6ab480dd517994ff9f0a5c5
Reporter: Khurram Faraaz



On a three-node cluster:
1. Run concurrent queries (TPC-DS query 11) from a Java program.
2. Stop the foreman drillbit this way: 
/opt/mapr/drill/drill-1.12.0/bin/drillbit.sh stop
3. InterruptedException: null is written to the drillbit.log

Stack trace from drillbit.log
{noformat}
2017-09-19 21:49:20,867 [263e6f48-0ace-0c0d-4f90-55ae2f0d778b:frag:5:0] ERROR 
o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: InterruptedException

Fragment 5:0

[Error Id: 63ce8c18-040a-47f9-9643-e826de9a1a27 on centos-01.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
InterruptedException

Fragment 5:0

[Error Id: 63ce8c18-040a-47f9-9643-e826de9a1a27 on centos-01.qa.lab:31010]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550)
 ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:298)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:264)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) 
[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[na:1.8.0_91]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_91]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: 
Interrupted but context.shouldContinue() is true
at 
org.apache.drill.exec.work.batch.BaseRawBatchBuffer.getNext(BaseRawBatchBuffer.java:178)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.unorderedreceiver.UnorderedReceiverBatch.getNextBatch(UnorderedReceiverBatch.java:141)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.unorderedreceiver.UnorderedReceiverBatch.next(UnorderedReceiverBatch.java:164)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:225)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:141)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:164)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:225)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.test.generated.HashAggregatorGen498.doWork(HashAggTemplate.java:581)
 ~[na:na]
at 
org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext(HashAggBatch.java:168)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:164)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:225)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
 

[jira] [Reopened] (DRILL-5564) IllegalStateException: allocator[op:21:1:5:HashJoinPOP]: buffer space (16674816) + prealloc space (0) + child space (0) != allocated (16740352)

2017-09-19 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz reopened DRILL-5564:
---

> IllegalStateException: allocator[op:21:1:5:HashJoinPOP]: buffer space 
> (16674816) + prealloc space (0) + child space (0) != allocated (16740352)
> ---
>
> Key: DRILL-5564
> URL: https://issues.apache.org/jira/browse/DRILL-5564
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.11.0
> Environment: 3 node CentOS cluster
>Reporter: Khurram Faraaz
>
> Run a concurrent Java program that executes TPC-DS query 11.
> While that program is still executing, stop the foreman Drillbit from another shell:
> ./bin/drillbit.sh stop
> You will then see the IllegalStateException: allocator[op:21:1:5:HashJoinPOP] 
> and another assertion error in the drillbit.log:
> AssertionError: Failure while stopping processing for operator id 10. 
> Currently have states of processing:false, setup:false, waiting:true.   
> Drill 1.11.0 git commit ID: d11aba2 (with assertions enabled)
>  
> details from drillbit.log from the foreman Drillbit node.
> {noformat}
> 2017-06-05 18:38:33,838 [26ca5afa-7f6d-991b-1fdf-6196faddc229:frag:23:1] INFO 
>  o.a.d.e.w.fragment.FragmentExecutor - 
> 26ca5afa-7f6d-991b-1fdf-6196faddc229:23:1: State change requested RUNNING --> 
> FAILED
> 2017-06-05 18:38:33,849 [26ca5afa-7f6d-991b-1fdf-6196faddc229:frag:23:1] INFO 
>  o.a.d.e.w.fragment.FragmentExecutor - 
> 26ca5afa-7f6d-991b-1fdf-6196faddc229:23:1: State change requested FAILED --> 
> FINISHED
> 2017-06-05 18:38:33,852 [26ca5afa-7f6d-991b-1fdf-6196faddc229:frag:23:1] 
> ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: AssertionError: 
> Failure while stopping processing for operator id 10. Currently have states 
> of processing:false, setup:false, waiting:true.
> Fragment 23:1
> [Error Id: a116b326-43ed-4569-a20e-a10ba03d215e on centos-01.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> AssertionError: Failure while stopping processing for operator id 10. 
> Currently have states of processing:false, setup:false, waiting:true.
> Fragment 23:1
> [Error Id: a116b326-43ed-4569-a20e-a10ba03d215e on centos-01.qa.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544)
>  ~[drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:295)
>  [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160)
>  [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:264)
>  [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_91]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_91]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
> Caused by: java.lang.RuntimeException: java.lang.AssertionError: Failure 
> while stopping processing for operator id 10. Currently have states of 
> processing:false, setup:false, waiting:true.
> at 
> org.apache.drill.common.DeferredException.addThrowable(DeferredException.java:101)
>  ~[drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.fail(FragmentExecutor.java:409)
>  [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:250)
>  [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
> ... 4 common frames omitted
> Caused by: java.lang.AssertionError: Failure while stopping processing for 
> operator id 10. Currently have states of processing:false, setup:false, 
> waiting:true.
> at 
> org.apache.drill.exec.ops.OperatorStats.stopProcessing(OperatorStats.java:167)
>  ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:255) 
> ~[drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215)
>  

[jira] [Commented] (DRILL-5564) IllegalStateException: allocator[op:21:1:5:HashJoinPOP]: buffer space (16674816) + prealloc space (0) + child space (0) != allocated (16740352)

2017-09-19 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172474#comment-16172474
 ] 

Khurram Faraaz commented on DRILL-5564:
---

The same issue is reproducible; it did not show up in the first run, but it does 
show up in later runs. Hence reopening this issue.

test setup was
{noformat}
1. three node cluster Drill 1.12.0 commit : aaff1b3
2. start drillbits
3. run a concurrent Java program that runs TPC-DS query 11
4. stop the foreman drillbit $DRILL_HOME/bin/drillbit.sh stop
5. you should see the same AssertionError

{noformat}

Stack trace from drillbit.log
{noformat}
2017-09-19 20:51:21,902 [263e7cd2-56e8-dd4c-d6c8-64de0a4626b0:frag:21:1] DEBUG 
o.a.d.exec.ops.OperatorContextImpl - Closing context for 
org.apache.drill.exec.physical.config.SingleSender
2017-09-19 20:51:21,902 [263e7cd2-56e8-dd4c-d6c8-64de0a4626b0:frag:21:1] DEBUG 
o.a.drill.exec.memory.BaseAllocator - closed allocator[op:21:1:0:SingleSender].
2017-09-19 20:51:21,902 [263e7cd2-56e8-dd4c-d6c8-64de0a4626b0:frag:21:1] DEBUG 
o.a.drill.exec.memory.BaseAllocator - closed allocator[frag:21:1].
2017-09-19 20:51:21,902 [263e7cd2-56e8-dd4c-d6c8-64de0a4626b0:frag:21:1] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 
263e7cd2-56e8-dd4c-d6c8-64de0a4626b0:21:1: State change requested FAILED --> 
FINISHED
2017-09-19 20:51:21,904 [263e7cd2-56e8-dd4c-d6c8-64de0a4626b0:frag:21:1] ERROR 
o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: AssertionError: Failure 
while stopping processing for operator id 10. Currently have states of 
processing:false, setup:false, waiting:true.

Fragment 21:1

[Error Id: f14abd2f-d8c3-466c-a2c1-fdf622b7cee6 on centos-01.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: AssertionError: 
Failure while stopping processing for operator id 10. Currently have states of 
processing:false, setup:false, waiting:true.

Fragment 21:1

[Error Id: f14abd2f-d8c3-466c-a2c1-fdf622b7cee6 on centos-01.qa.lab:31010]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550)
 ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:298)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:264)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) 
[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[na:1.8.0_91]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_91]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
Caused by: java.lang.RuntimeException: java.lang.AssertionError: Failure while 
stopping processing for operator id 10. Currently have states of 
processing:false, setup:false, waiting:true.
at 
org.apache.drill.common.DeferredException.addThrowable(DeferredException.java:101)
 ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.fail(FragmentExecutor.java:409)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:250)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
... 4 common frames omitted

Caused by: java.lang.AssertionError: Failure while stopping processing for 
operator id 10. Currently have states of processing:false, setup:false, 
waiting:true.
 at 
org.apache.drill.exec.ops.OperatorStats.stopProcessing(OperatorStats.java:167) 
~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:220) 
~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:225)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:141)
 

[jira] [Commented] (DRILL-5478) Spill file size parameter is not honored by the managed external sort

2017-09-19 Thread Robert Hou (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172380#comment-16172380
 ] 

Robert Hou commented on DRILL-5478:
---

If this is exposed to Support, then it would be good to have some documentation 
about how this parameter works.  Maybe even just a comment that the parameter 
is not precise, and depends on the memory.
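
For reference, the boot options quoted in the issue correspond to entries like the following in drill-override.conf (shown only to illustrate where the parameter lives; values are taken from the sys.boot output below, and the exact effect of file_size is, as noted, approximate):

{code}
# drill-override.conf (illustrative; option names taken from the sys.boot output in this issue)
drill.exec.sort.external.spill.fs: "maprfs:///"
drill.exec.sort.external.spill.directories: [ "/tmp/test" ]
drill.exec.sort.external.spill.file_size: "256M"
{code}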

> Spill file size parameter is not honored by the managed external sort
> -
>
> Key: DRILL-5478
> URL: https://issues.apache.org/jira/browse/DRILL-5478
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.10.0
>Reporter: Rahul Challapalli
>Assignee: Paul Rogers
> Fix For: 1.12.0
>
>
> git.commit.id.abbrev=1e0a14c
> Query:
> {code}
> ALTER SESSION SET `exec.sort.disable_managed` = false;
> alter session set `planner.width.max_per_node` = 1;
> alter session set `planner.disable_exchanges` = true;
> alter session set `planner.width.max_per_query` = 1;
> alter session set `planner.memory.max_query_memory_per_node` = 1052428800;
> alter session set `planner.enable_decimal_data_type` = true;
> select count(*) from (
>   select * from dfs.`/drill/testdata/resource-manager/all_types_large` d1
>   order by d1.map.missing
> ) d;
> {code}
> Boot Options (spill file size is set to 256MB)
> {code}
> 0: jdbc:drill:zk=10.10.100.190:5181> select * from sys.boot where name like '%spill%';
> +---------------------------------------------------+---------+-------+---------+----------+--------------------------------------------+-----------+------------+
> |                        name                        |  kind   | type  | status  | num_val  |                 string_val                 | bool_val  | float_val  |
> +---------------------------------------------------+---------+-------+---------+----------+--------------------------------------------+-----------+------------+
> | drill.exec.sort.external.spill.directories         | STRING  | BOOT  | BOOT    | null     | ["/tmp/test"]  # drill-override.conf: 26   | null      | null       |
> | drill.exec.sort.external.spill.file_size           | STRING  | BOOT  | BOOT    | null     | "256M"                                     | null      | null       |
> | drill.exec.sort.external.spill.fs                  | STRING  | BOOT  | BOOT    | null     | "maprfs:///"                               | null      | null       |
> | drill.exec.sort.external.spill.group.size          | LONG    | BOOT  | BOOT    | 4        | null                                       | null      | null       |
> | drill.exec.sort.external.spill.merge_batch_size    | STRING  | BOOT  | BOOT    | null     | "16M"                                      | null      | null       |
> | drill.exec.sort.external.spill.spill_batch_size    | STRING  | BOOT  | BOOT    | null     | "8M"                                       | null      | null       |
> | drill.exec.sort.external.spill.threshold           | LONG    | BOOT  | BOOT    | 4        | null                                       | null      | null       |
> +---------------------------------------------------+---------+-------+---------+----------+--------------------------------------------+-----------+------------+
> {code}
> Below are the spill files while the query is still executing. The size of the 
> spill files is ~34MB
> {code}
> -rwxr-xr-x   3 root root   34957815 2017-05-05 11:26 
> /tmp/test/26f33c36-4235-3531-aeaa-2c73dc4ddeb5_major0_minor0_op5_sort/run1
> -rwxr-xr-x   3 root root   34957815 2017-05-05 11:27 
> /tmp/test/26f33c36-4235-3531-aeaa-2c73dc4ddeb5_major0_minor0_op5_sort/run2
> -rwxr-xr-x   3 root root  0 2017-05-05 11:27 
> /tmp/test/26f33c36-4235-3531-aeaa-2c73dc4ddeb5_major0_minor0_op5_sort/run3
> {code}
> The data set is too large to attach here. Reach out to me if you need anything



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5786) A query that includes sort encounters Exception in RPC communication

2017-09-19 Thread Robert Hou (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172375#comment-16172375
 ] 

Robert Hou commented on DRILL-5786:
---

Updated title.

> A query that includes sort encounters Exception in RPC communication
> 
>
> Key: DRILL-5786
> URL: https://issues.apache.org/jira/browse/DRILL-5786
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Assignee: Paul Rogers
> Fix For: 1.12.0
>
> Attachments: 2647d2b0-69bf-5a2b-0e23-81e8d49e464e.sys.drill, 
> drillbit.log
>
>
> Query is:
> {noformat}
> select count(*) from (select * from 
> dfs.`/drill/testdata/resource-manager/3500cols.tbl` order by 
> columns[450],columns[330],columns[230],columns[220],columns[110],columns[90],columns[80],columns[70],columns[40],columns[10],columns[20],columns[30],columns[40],columns[50],
>  
> columns[454],columns[413],columns[940],columns[834],columns[73],columns[140],columns[104],columns[],columns[30],columns[2420],columns[1520],
>  columns[1410], 
> columns[1110],columns[1290],columns[2380],columns[705],columns[45],columns[1054],columns[2430],columns[420],columns[404],columns[3350],
>  
> columns[],columns[153],columns[356],columns[84],columns[745],columns[1450],columns[103],columns[2065],columns[343],columns[3420],columns[530],
>  columns[3210] ) d where d.col433 = 'sjka skjf'
> {noformat}
> This is the same query as DRILL-5670 but no session variables are set.
> Here is the stack trace:
> {noformat}
> 2017-09-12 13:14:57,584 [BitServer-5] ERROR 
> o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  
> Connection: /10.10.100.190:31012 <--> /10.10.100.190:46230 (data server).  
> Closing connection.
> io.netty.handler.codec.DecoderException: 
> org.apache.drill.exec.exception.OutOfMemoryException: Failure allocating 
> buffer.
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:233)
>  ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) 
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) 
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) 
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_111]
> Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Failure 
> allocating buffer.
> at 
> io.netty.buffer.PooledByteBufAllocatorL.allocate(PooledByteBufAllocatorL.java:64)
>  ~[drill-memory-base-1.12.0-SNAPSHOT.jar:4.0.27.Final]
> at 
> org.apache.drill.exec.memory.AllocationManager.(AllocationManager.java:81)
>  ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.memory.BaseAllocator.bufferWithoutReservation(BaseAllocator.java:260)
>  ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:243) 
> ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> 

[jira] [Updated] (DRILL-5786) A query that includes sort encounters Exception in RPC communication

2017-09-19 Thread Robert Hou (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou updated DRILL-5786:
--
Summary: A query that includes sort encounters Exception in RPC 
communication  (was: External Sort encounters Exception in RPC communication 
during Sort)

> A query that includes sort encounters Exception in RPC communication
> 
>
> Key: DRILL-5786
> URL: https://issues.apache.org/jira/browse/DRILL-5786
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Assignee: Paul Rogers
> Fix For: 1.12.0
>
> Attachments: 2647d2b0-69bf-5a2b-0e23-81e8d49e464e.sys.drill, 
> drillbit.log
>
>
> Query is:
> {noformat}
> select count(*) from (select * from 
> dfs.`/drill/testdata/resource-manager/3500cols.tbl` order by 
> columns[450],columns[330],columns[230],columns[220],columns[110],columns[90],columns[80],columns[70],columns[40],columns[10],columns[20],columns[30],columns[40],columns[50],
>  
> columns[454],columns[413],columns[940],columns[834],columns[73],columns[140],columns[104],columns[],columns[30],columns[2420],columns[1520],
>  columns[1410], 
> columns[1110],columns[1290],columns[2380],columns[705],columns[45],columns[1054],columns[2430],columns[420],columns[404],columns[3350],
>  
> columns[],columns[153],columns[356],columns[84],columns[745],columns[1450],columns[103],columns[2065],columns[343],columns[3420],columns[530],
>  columns[3210] ) d where d.col433 = 'sjka skjf'
> {noformat}
> This is the same query as DRILL-5670 but no session variables are set.
> Here is the stack trace:
> {noformat}
> 2017-09-12 13:14:57,584 [BitServer-5] ERROR 
> o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  
> Connection: /10.10.100.190:31012 <--> /10.10.100.190:46230 (data server).  
> Closing connection.
> io.netty.handler.codec.DecoderException: 
> org.apache.drill.exec.exception.OutOfMemoryException: Failure allocating 
> buffer.
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:233)
>  ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) 
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) 
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) 
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_111]
> Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Failure 
> allocating buffer.
> at 
> io.netty.buffer.PooledByteBufAllocatorL.allocate(PooledByteBufAllocatorL.java:64)
>  ~[drill-memory-base-1.12.0-SNAPSHOT.jar:4.0.27.Final]
> at 
> org.apache.drill.exec.memory.AllocationManager.(AllocationManager.java:81)
>  ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.memory.BaseAllocator.bufferWithoutReservation(BaseAllocator.java:260)
>  ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:243) 

[jira] [Commented] (DRILL-5804) External Sort times out, may be infinite loop

2017-09-19 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172353#comment-16172353
 ] 

Paul Rogers commented on DRILL-5804:


I've seen something similar when investigating sort bugs. Work was done for the 
sort to pre-allocate vectors to avoid vector doubling, which is what the 
messages indicate. However, even after that, I saw many such messages emitted 
by readers and other operators.

We'll need to track down which part of Drill is causing the problems in this 
case.
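
For background, a generic illustration (plain arrays, not Drill's vector code) of the difference the pre-allocation work targets: growth by doubling re-allocates and copies repeatedly as values arrive, while pre-allocating to the known batch size does a single allocation:

{code}
import java.util.Arrays;

public class PreallocationSketch {
  /** Grows by doubling: several allocations and copies for a large batch. */
  static int[] fillWithDoubling(int recordCount) {
    int[] values = new int[16];
    for (int i = 0; i < recordCount; i++) {
      if (i == values.length) {
        values = Arrays.copyOf(values, values.length * 2); // re-allocate and copy
      }
      values[i] = i;
    }
    return values;
  }

  /** Pre-allocates to the known record count: one allocation, no copies. */
  static int[] fillPreallocated(int recordCount) {
    int[] values = new int[recordCount];
    for (int i = 0; i < recordCount; i++) {
      values[i] = i;
    }
    return values;
  }
}
{code}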

> External Sort times out, may be infinite loop
> -
>
> Key: DRILL-5804
> URL: https://issues.apache.org/jira/browse/DRILL-5804
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Assignee: Paul Rogers
> Fix For: 1.12.0
>
> Attachments: drillbit.log
>
>
> Query is:
> {noformat}
> ALTER SESSION SET `exec.sort.disable_managed` = false;
> select count(*) from (
>   select * from (
> select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid 
> from (
>   select d.type type, d.uid uid, flatten(d.map.rm) rms from 
> dfs.`/drill/testdata/resource-manager/nested_large` d order by d.uid
> ) s1
>   ) s2
>   order by s2.rms.mapid, s2.rptds.a, s2.rptds.do_not_exist
> );
> {noformat}
> Plan is:
> {noformat}
> | 00-00Screen
> 00-01  Project(EXPR$0=[$0])
> 00-02StreamAgg(group=[{}], EXPR$0=[$SUM0($0)])
> 00-03  UnionExchange
> 01-01StreamAgg(group=[{}], EXPR$0=[COUNT()])
> 01-02  Project($f0=[0])
> 01-03SingleMergeExchange(sort0=[4 ASC], sort1=[5 ASC], 
> sort2=[6 ASC])
> 02-01  SelectionVectorRemover
> 02-02Sort(sort0=[$4], sort1=[$5], sort2=[$6], dir0=[ASC], 
> dir1=[ASC], dir2=[ASC])
> 02-03  Project(type=[$0], rptds=[$1], rms=[$2], uid=[$3], 
> EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6])
> 02-04HashToRandomExchange(dist0=[[$4]], dist1=[[$5]], 
> dist2=[[$6]])
> 03-01  UnorderedMuxExchange
> 04-01Project(type=[$0], rptds=[$1], rms=[$2], 
> uid=[$3], EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6], 
> E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($6, hash32AsDouble($5, 
> hash32AsDouble($4, 1301011)))])
> 04-02  Project(type=[$0], rptds=[$1], rms=[$2], 
> uid=[$3], EXPR$4=[ITEM($2, 'mapid')], EXPR$5=[ITEM($1, 'a')], 
> EXPR$6=[ITEM($1, 'do_not_exist')])
> 04-03Flatten(flattenField=[$1])
> 04-04  Project(type=[$0], rptds=[ITEM($2, 
> 'rptd')], rms=[$2], uid=[$1])
> 04-05SingleMergeExchange(sort0=[1 ASC])
> 05-01  SelectionVectorRemover
> 05-02Sort(sort0=[$1], dir0=[ASC])
> 05-03  Project(type=[$0], uid=[$1], 
> rms=[$2])
> 05-04
> HashToRandomExchange(dist0=[[$1]])
> 06-01  UnorderedMuxExchange
> 07-01Project(type=[$0], 
> uid=[$1], rms=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($1, 1301011)])
> 07-02  
> Flatten(flattenField=[$2])
> 07-03Project(type=[$0], 
> uid=[$1], rms=[ITEM($2, 'rm')])
> 07-04  
> Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
> [path=maprfs:///drill/testdata/resource-manager/nested_large]], 
> selectionRoot=maprfs:/drill/testdata/resource-manager/nested_large, 
> numFiles=1, usedMetadataFile=false, columns=[`type`, `uid`, `map`.`rm`]]])
> {noformat}
> Here is a segment of the drillbit.log, starting at line 55890:
> {noformat}
> 2017-09-19 04:22:56,258 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG 
> o.a.d.e.t.g.SingleBatchSorterGen44 - Took 142 us to sort 1023 records
> 2017-09-19 04:22:56,265 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:4] DEBUG 
> o.a.d.e.t.g.SingleBatchSorterGen44 - Took 105 us to sort 1023 records
> 2017-09-19 04:22:56,268 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG 
> o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record 
> batch with status OK
> 2017-09-19 04:22:56,275 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:7] DEBUG 
> o.a.d.e.t.g.SingleBatchSorterGen44 - Took 145 us to sort 1023 records
> 2017-09-19 04:22:56,354 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG 
> o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record 
> batch with status OK
> 2017-09-19 

[jira] [Commented] (DRILL-5786) External Sort encounters Exception in RPC communication during Sort

2017-09-19 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172350#comment-16172350
 ] 

Paul Rogers commented on DRILL-5786:


The "external sort" cannot run out of memory in the RPC layer. The external 
sort is a specific module of code that does not directly interact with the RPC 
layer.

The correct title here is "A query that contains a sort runs into an RPC 
exception".

Clearly, we have issues with RPC and/or exchanges. These bugs affect many 
queries, including those with sorts. However, for proper categorization, this 
is not a "sort bug" per se.

> External Sort encounters Exception in RPC communication during Sort
> ---
>
> Key: DRILL-5786
> URL: https://issues.apache.org/jira/browse/DRILL-5786
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Assignee: Paul Rogers
> Fix For: 1.12.0
>
> Attachments: 2647d2b0-69bf-5a2b-0e23-81e8d49e464e.sys.drill, 
> drillbit.log
>
>
> Query is:
> {noformat}
> select count(*) from (select * from 
> dfs.`/drill/testdata/resource-manager/3500cols.tbl` order by 
> columns[450],columns[330],columns[230],columns[220],columns[110],columns[90],columns[80],columns[70],columns[40],columns[10],columns[20],columns[30],columns[40],columns[50],
>  
> columns[454],columns[413],columns[940],columns[834],columns[73],columns[140],columns[104],columns[],columns[30],columns[2420],columns[1520],
>  columns[1410], 
> columns[1110],columns[1290],columns[2380],columns[705],columns[45],columns[1054],columns[2430],columns[420],columns[404],columns[3350],
>  
> columns[],columns[153],columns[356],columns[84],columns[745],columns[1450],columns[103],columns[2065],columns[343],columns[3420],columns[530],
>  columns[3210] ) d where d.col433 = 'sjka skjf'
> {noformat}
> This is the same query as DRILL-5670 but no session variables are set.
> Here is the stack trace:
> {noformat}
> 2017-09-12 13:14:57,584 [BitServer-5] ERROR 
> o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  
> Connection: /10.10.100.190:31012 <--> /10.10.100.190:46230 (data server).  
> Closing connection.
> io.netty.handler.codec.DecoderException: 
> org.apache.drill.exec.exception.OutOfMemoryException: Failure allocating 
> buffer.
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:233)
>  ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) 
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) 
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) 
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_111]
> Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Failure 
> allocating buffer.
> at 
> io.netty.buffer.PooledByteBufAllocatorL.allocate(PooledByteBufAllocatorL.java:64)
>  ~[drill-memory-base-1.12.0-SNAPSHOT.jar:4.0.27.Final]
> at 
> org.apache.drill.exec.memory.AllocationManager.<init>(AllocationManager.java:81)
>  

[jira] [Commented] (DRILL-5721) Query with only root fragment and no non-root fragment hangs when Drillbit to Drillbit Control Connection has network issues

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172340#comment-16172340
 ] 

ASF GitHub Bot commented on DRILL-5721:
---

Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/919
  
Attempt to merge into master failed:

```
CONFLICT (content): Merge conflict in 
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentStatusReporter.java
```

Please rebase onto latest master. While at it, please also squash commits.


> Query with only root fragment and no non-root fragment hangs when Drillbit to 
> Drillbit Control Connection has network issues
> 
>
> Key: DRILL-5721
> URL: https://issues.apache.org/jira/browse/DRILL-5721
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Sorabh Hamirwasia
>Assignee: Sorabh Hamirwasia
>  Labels: ready-to-commit
> Fix For: 1.12.0
>
>
> Recently I found an issue (thanks to [~knguyen] for creating this scenario) 
> related to fragment status reporting and would like some feedback on it. 
> When a client submits a query to the Foreman, the query is planned by the 
> Foreman and fragments are then scheduled to root and non-root nodes. The 
> Foreman creates a DrillbitStatusListener and a FragmentStatusListener to track 
> the health of a Drillbit node and of a fragment, respectively. Root and 
> non-root fragments are set up by the Foreman differently: 
> Root fragments are set up without any communication over the control channel 
> (since they execute locally on the Foreman).
> Non-root fragments are set up by sending a control message 
> (REQ_INITIALIZE_FRAGMENTS_VALUE) over the wire. If sending any such control 
> message fails (for example, due to a network hiccup) during query setup, the 
> query is failed and the client is notified. 
> Each fragment is executed on its node by a fragment executor, which holds an 
> instance of FragmentStatusReporter. FragmentStatusReporter reports the status 
> of a fragment to the Foreman node over a control tunnel or connection using an 
> RPC message (REQ_FRAGMENT_STATUS), for both root and non-root fragments. 
> In short, a root fragment is set up locally without any RPC communication, but 
> its status is reported by the fragment executor over the control connection by 
> sending an RPC message. For a non-root fragment, both setup and status updates 
> happen via RPC messages over the control connection.
> *Issue 1:*
> For a simple query that has only one root fragment running on the Foreman 
> node, setup works fine. But if, during the status update, the fragment tries 
> to create a control connection and fails to establish it, the query hangs. 
> The root fragment completes execution but fails to notify the Foreman, so the 
> Foreman thinks the query is running forever. 
> *Proposed Solution:*
> Setup of a root fragment already happens locally without an RPC message, so we 
> can do the same for status updates of root fragments. This avoids RPC 
> communication for status updates of fragments running locally on the Foreman 
> and hence resolves issue 1.
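
Below is a minimal, hypothetical sketch of the proposed fix. The interfaces and 
class names are illustrative, not Drill's actual FragmentStatusReporter API: 
status updates for a root fragment are delivered in-process on the Foreman, 
while non-root fragments keep using the control-tunnel RPC path.
{code}
// Hypothetical sketch only; names below are illustrative, not the real Drill classes.
interface StatusDestination {
  void statusUpdate(FragmentStatus status);
}

class FragmentStatus { /* fragment id, state, metrics, ... */ }

// Remote path: send a REQ_FRAGMENT_STATUS RPC message over the control tunnel.
// This is the path that can hang the query if the connection cannot be established.
class ControlTunnelDestination implements StatusDestination {
  @Override
  public void statusUpdate(FragmentStatus status) {
    // controlTunnel.sendFragmentStatus(status);  // RPC over the control connection
  }
}

// Local path: hand the status directly to the Foreman in-process, no RPC involved.
class LocalForemanDestination implements StatusDestination {
  @Override
  public void statusUpdate(FragmentStatus status) {
    // foremanWorkBus.statusUpdate(status);       // in-process call on the Foreman node
  }
}

class FragmentStatusReporterSketch {
  // Root fragments run on the Foreman itself, so their status can bypass the
  // control connection entirely, mirroring how their setup already works.
  static StatusDestination destinationFor(boolean isRootFragment) {
    return isRootFragment ? new LocalForemanDestination() : new ControlTunnelDestination();
  }
}
{code}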



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (DRILL-5805) External Sort runs out of memory

2017-09-19 Thread Robert Hou (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou updated DRILL-5805:
--
Attachment: drillbit.log.gz

> External Sort runs out of memory
> 
>
> Key: DRILL-5805
> URL: https://issues.apache.org/jira/browse/DRILL-5805
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Assignee: Paul Rogers
> Fix For: 1.12.0
>
> Attachments: 2645d135-4222-d752-2609-c95568ff6e93.sys.drill, 
> drillbit.log.gz
>
>
> Query is:
> {noformat}
> ALTER SESSION SET `exec.sort.disable_managed` = false;
> alter session set `planner.width.max_per_node` = 5;
> alter session set `planner.disable_exchanges` = true;
> alter session set `planner.width.max_per_query` = 100;
> select count(*) from (select * from (select id, flatten(str_list) str from 
> dfs.`/drill/testdata/resource-manager/flatten-large-small.json`) d order by 
> d.str) d1 where d1.id=0;
> {noformat}
> Plan is:
> {noformat}
> | 00-00Screen
> 00-01  Project(EXPR$0=[$0])
> 00-02StreamAgg(group=[{}], EXPR$0=[COUNT()])
> 00-03  Project($f0=[0])
> 00-04SelectionVectorRemover
> 00-05  Filter(condition=[=($0, 0)])
> 00-06SelectionVectorRemover
> 00-07  Sort(sort0=[$1], dir0=[ASC])
> 00-08Flatten(flattenField=[$1])
> 00-09  Project(id=[$0], str=[$1])
> 00-10Scan(groupscan=[EasyGroupScan 
> [selectionRoot=maprfs:/drill/testdata/resource-manager/flatten-large-small.json,
>  numFiles=1, columns=[`id`, `str_list`], 
> files=[maprfs:///drill/testdata/resource-manager/flatten-large-small.json]]])
> {noformat}
> sys.version is:
> {noformat}
> | 1.12.0-SNAPSHOT  | c4211d3b545b0d1996b096a8e1ace35376a63977  | Fix for 
> DRILL-5670  | 09.09.2017 @ 14:38:25 PDT  | r...@qa-node190.qa.lab  | 
> 11.09.2017 @ 14:27:16 PDT  |
> {noformat}
> mult drill5447_1



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (DRILL-5805) External Sort runs out of memory

2017-09-19 Thread Robert Hou (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou updated DRILL-5805:
--
Attachment: 2645d135-4222-d752-2609-c95568ff6e93.sys.drill

> External Sort runs out of memory
> 
>
> Key: DRILL-5805
> URL: https://issues.apache.org/jira/browse/DRILL-5805
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Assignee: Paul Rogers
> Fix For: 1.12.0
>
> Attachments: 2645d135-4222-d752-2609-c95568ff6e93.sys.drill
>
>
> Query is:
> {noformat}
> ALTER SESSION SET `exec.sort.disable_managed` = false;
> alter session set `planner.width.max_per_node` = 5;
> alter session set `planner.disable_exchanges` = true;
> alter session set `planner.width.max_per_query` = 100;
> select count(*) from (select * from (select id, flatten(str_list) str from 
> dfs.`/drill/testdata/resource-manager/flatten-large-small.json`) d order by 
> d.str) d1 where d1.id=0;
> {noformat}
> Plan is:
> {noformat}
> | 00-00Screen
> 00-01  Project(EXPR$0=[$0])
> 00-02StreamAgg(group=[{}], EXPR$0=[COUNT()])
> 00-03  Project($f0=[0])
> 00-04SelectionVectorRemover
> 00-05  Filter(condition=[=($0, 0)])
> 00-06SelectionVectorRemover
> 00-07  Sort(sort0=[$1], dir0=[ASC])
> 00-08Flatten(flattenField=[$1])
> 00-09  Project(id=[$0], str=[$1])
> 00-10Scan(groupscan=[EasyGroupScan 
> [selectionRoot=maprfs:/drill/testdata/resource-manager/flatten-large-small.json,
>  numFiles=1, columns=[`id`, `str_list`], 
> files=[maprfs:///drill/testdata/resource-manager/flatten-large-small.json]]])
> {noformat}
> sys.version is:
> {noformat}
> | 1.12.0-SNAPSHOT  | c4211d3b545b0d1996b096a8e1ace35376a63977  | Fix for 
> DRILL-5670  | 09.09.2017 @ 14:38:25 PDT  | r...@qa-node190.qa.lab  | 
> 11.09.2017 @ 14:27:16 PDT  |
> {noformat}
> mult drill5447_1



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (DRILL-5805) External Sort runs out of memory

2017-09-19 Thread Robert Hou (JIRA)
Robert Hou created DRILL-5805:
-

 Summary: External Sort runs out of memory
 Key: DRILL-5805
 URL: https://issues.apache.org/jira/browse/DRILL-5805
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Affects Versions: 1.11.0
Reporter: Robert Hou
Assignee: Paul Rogers
 Fix For: 1.12.0


Query is:
{noformat}
ALTER SESSION SET `exec.sort.disable_managed` = false;
alter session set `planner.width.max_per_node` = 5;
alter session set `planner.disable_exchanges` = true;
alter session set `planner.width.max_per_query` = 100;
select count(*) from (select * from (select id, flatten(str_list) str from 
dfs.`/drill/testdata/resource-manager/flatten-large-small.json`) d order by 
d.str) d1 where d1.id=0;
{noformat}

Plan is:
{noformat}
| 00-00Screen
00-01  Project(EXPR$0=[$0])
00-02StreamAgg(group=[{}], EXPR$0=[COUNT()])
00-03  Project($f0=[0])
00-04SelectionVectorRemover
00-05  Filter(condition=[=($0, 0)])
00-06SelectionVectorRemover
00-07  Sort(sort0=[$1], dir0=[ASC])
00-08Flatten(flattenField=[$1])
00-09  Project(id=[$0], str=[$1])
00-10Scan(groupscan=[EasyGroupScan 
[selectionRoot=maprfs:/drill/testdata/resource-manager/flatten-large-small.json,
 numFiles=1, columns=[`id`, `str_list`], 
files=[maprfs:///drill/testdata/resource-manager/flatten-large-small.json]]])
{noformat}

sys.version is:
{noformat}
| 1.12.0-SNAPSHOT  | c4211d3b545b0d1996b096a8e1ace35376a63977  | Fix for 
DRILL-5670  | 09.09.2017 @ 14:38:25 PDT  | r...@qa-node190.qa.lab  | 11.09.2017 
@ 14:27:16 PDT  |
{noformat}

mult drill5447_1



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (DRILL-5804) External Sort times out, may be infinite loop

2017-09-19 Thread Robert Hou (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou updated DRILL-5804:
--
Description: 
Query is:
{noformat}
ALTER SESSION SET `exec.sort.disable_managed` = false;
select count(*) from (
  select * from (
select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid 
from (
  select d.type type, d.uid uid, flatten(d.map.rm) rms from 
dfs.`/drill/testdata/resource-manager/nested_large` d order by d.uid
) s1
  ) s2
  order by s2.rms.mapid, s2.rptds.a, s2.rptds.do_not_exist
);
{noformat}

Plan is:
{noformat}
| 00-00Screen
00-01  Project(EXPR$0=[$0])
00-02StreamAgg(group=[{}], EXPR$0=[$SUM0($0)])
00-03  UnionExchange
01-01StreamAgg(group=[{}], EXPR$0=[COUNT()])
01-02  Project($f0=[0])
01-03SingleMergeExchange(sort0=[4 ASC], sort1=[5 ASC], sort2=[6 
ASC])
02-01  SelectionVectorRemover
02-02Sort(sort0=[$4], sort1=[$5], sort2=[$6], dir0=[ASC], 
dir1=[ASC], dir2=[ASC])
02-03  Project(type=[$0], rptds=[$1], rms=[$2], uid=[$3], 
EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6])
02-04HashToRandomExchange(dist0=[[$4]], dist1=[[$5]], 
dist2=[[$6]])
03-01  UnorderedMuxExchange
04-01Project(type=[$0], rptds=[$1], rms=[$2], 
uid=[$3], EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6], 
E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($6, hash32AsDouble($5, 
hash32AsDouble($4, 1301011)))])
04-02  Project(type=[$0], rptds=[$1], rms=[$2], 
uid=[$3], EXPR$4=[ITEM($2, 'mapid')], EXPR$5=[ITEM($1, 'a')], EXPR$6=[ITEM($1, 
'do_not_exist')])
04-03Flatten(flattenField=[$1])
04-04  Project(type=[$0], rptds=[ITEM($2, 
'rptd')], rms=[$2], uid=[$1])
04-05SingleMergeExchange(sort0=[1 ASC])
05-01  SelectionVectorRemover
05-02Sort(sort0=[$1], dir0=[ASC])
05-03  Project(type=[$0], uid=[$1], 
rms=[$2])
05-04
HashToRandomExchange(dist0=[[$1]])
06-01  UnorderedMuxExchange
07-01Project(type=[$0], 
uid=[$1], rms=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($1, 1301011)])
07-02  
Flatten(flattenField=[$2])
07-03Project(type=[$0], 
uid=[$1], rms=[ITEM($2, 'rm')])
07-04  
Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
[path=maprfs:///drill/testdata/resource-manager/nested_large]], 
selectionRoot=maprfs:/drill/testdata/resource-manager/nested_large, numFiles=1, 
usedMetadataFile=false, columns=[`type`, `uid`, `map`.`rm`]]])
{noformat}

Here is a segment of the drillbit.log, starting at line 55890:
{noformat}
2017-09-19 04:22:56,258 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG 
o.a.d.e.t.g.SingleBatchSorterGen44 - Took 142 us to sort 1023 records
2017-09-19 04:22:56,265 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:4] DEBUG 
o.a.d.e.t.g.SingleBatchSorterGen44 - Took 105 us to sort 1023 records
2017-09-19 04:22:56,268 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG 
o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record 
batch with status OK
2017-09-19 04:22:56,275 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:7] DEBUG 
o.a.d.e.t.g.SingleBatchSorterGen44 - Took 145 us to sort 1023 records
2017-09-19 04:22:56,354 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG 
o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record 
batch with status OK
2017-09-19 04:22:56,357 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG 
o.a.d.e.t.g.SingleBatchSorterGen44 - Took 143 us to sort 1023 records
2017-09-19 04:22:56,361 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG 
o.a.d.exec.compile.ClassTransformer - Compiled and merged 
PriorityQueueCopierGen50: bytecode size = 11.0 KiB, time = 124 ms.
2017-09-19 04:22:56,365 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:4] DEBUG 
o.a.d.e.t.g.SingleBatchSorterGen44 - Took 108 us to sort 1023 records
2017-09-19 04:22:56,367 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG 
o.a.d.e.p.i.x.m.PriorityQueueCopierWrapper - Copier setup complete
2017-09-19 04:22:56,375 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:7] DEBUG 
o.a.d.e.t.g.SingleBatchSorterGen44 - Took 144 us to sort 1023 records
2017-09-19 04:22:56,396 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG 
o.a.drill.exec.vector.BigIntVector - Reallocating vector 
[$data$(BIGINT:REQUIRED)]. # of bytes: [0] -> [0]
2017-09-19 04:22:56,396 

[jira] [Commented] (DRILL-5804) External Sort times out, may be infinite loop

2017-09-19 Thread Robert Hou (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172274#comment-16172274
 ] 

Robert Hou commented on DRILL-5804:
---

The profile is missing.  I suspect it was not created.

> External Sort times out, may be infinite loop
> -
>
> Key: DRILL-5804
> URL: https://issues.apache.org/jira/browse/DRILL-5804
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Assignee: Paul Rogers
> Fix For: 1.12.0
>
> Attachments: drillbit.log
>
>
> Query is:
> {noformat}
> ALTER SESSION SET `exec.sort.disable_managed` = false;
> select count(*) from (
>   select * from (
> select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid 
> from (
>   select d.type type, d.uid uid, flatten(d.map.rm) rms from 
> dfs.`/drill/testdata/resource-manager/nested_large` d order by d.uid
> ) s1
>   ) s2
>   order by s2.rms.mapid, s2.rptds.a, s2.rptds.do_not_exist
> );
> {noformat}
> Plan is:
> {noformat}
> | 00-00Screen
> 00-01  Project(EXPR$0=[$0])
> 00-02StreamAgg(group=[{}], EXPR$0=[$SUM0($0)])
> 00-03  UnionExchange
> 01-01StreamAgg(group=[{}], EXPR$0=[COUNT()])
> 01-02  Project($f0=[0])
> 01-03SingleMergeExchange(sort0=[4 ASC], sort1=[5 ASC], 
> sort2=[6 ASC])
> 02-01  SelectionVectorRemover
> 02-02Sort(sort0=[$4], sort1=[$5], sort2=[$6], dir0=[ASC], 
> dir1=[ASC], dir2=[ASC])
> 02-03  Project(type=[$0], rptds=[$1], rms=[$2], uid=[$3], 
> EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6])
> 02-04HashToRandomExchange(dist0=[[$4]], dist1=[[$5]], 
> dist2=[[$6]])
> 03-01  UnorderedMuxExchange
> 04-01Project(type=[$0], rptds=[$1], rms=[$2], 
> uid=[$3], EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6], 
> E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($6, hash32AsDouble($5, 
> hash32AsDouble($4, 1301011)))])
> 04-02  Project(type=[$0], rptds=[$1], rms=[$2], 
> uid=[$3], EXPR$4=[ITEM($2, 'mapid')], EXPR$5=[ITEM($1, 'a')], 
> EXPR$6=[ITEM($1, 'do_not_exist')])
> 04-03Flatten(flattenField=[$1])
> 04-04  Project(type=[$0], rptds=[ITEM($2, 
> 'rptd')], rms=[$2], uid=[$1])
> 04-05SingleMergeExchange(sort0=[1 ASC])
> 05-01  SelectionVectorRemover
> 05-02Sort(sort0=[$1], dir0=[ASC])
> 05-03  Project(type=[$0], uid=[$1], 
> rms=[$2])
> 05-04
> HashToRandomExchange(dist0=[[$1]])
> 06-01  UnorderedMuxExchange
> 07-01Project(type=[$0], 
> uid=[$1], rms=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($1, 1301011)])
> 07-02  
> Flatten(flattenField=[$2])
> 07-03Project(type=[$0], 
> uid=[$1], rms=[ITEM($2, 'rm')])
> 07-04  
> Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
> [path=maprfs:///drill/testdata/resource-manager/nested_large]], 
> selectionRoot=maprfs:/drill/testdata/resource-manager/nested_large, 
> numFiles=1, usedMetadataFile=false, columns=[`type`, `uid`, `map`.`rm`]]])
> {noformat}
> Here is a segment of the drillbit.log, starting at line 55890:
> {noformat}
> 2017-09-19 04:22:56,258 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG 
> o.a.d.e.t.g.SingleBatchSorterGen44 - Took 142 us to sort 1023 records
> 2017-09-19 04:22:56,265 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:4] DEBUG 
> o.a.d.e.t.g.SingleBatchSorterGen44 - Took 105 us to sort 1023 records
> 2017-09-19 04:22:56,268 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG 
> o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record 
> batch with status OK
> 2017-09-19 04:22:56,275 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:7] DEBUG 
> o.a.d.e.t.g.SingleBatchSorterGen44 - Took 145 us to sort 1023 records
> 2017-09-19 04:22:56,354 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG 
> o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record 
> batch with status OK
> 2017-09-19 04:22:56,357 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG 
> o.a.d.e.t.g.SingleBatchSorterGen44 - Took 143 us to sort 1023 records
> 2017-09-19 04:22:56,361 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG 
> o.a.d.exec.compile.ClassTransformer - Compiled and merged 
> 

[jira] [Updated] (DRILL-5804) External Sort times out, may be infinite loop

2017-09-19 Thread Robert Hou (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou updated DRILL-5804:
--
Attachment: drillbit.log

> External Sort times out, may be infinite loop
> -
>
> Key: DRILL-5804
> URL: https://issues.apache.org/jira/browse/DRILL-5804
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Assignee: Paul Rogers
> Fix For: 1.12.0
>
> Attachments: drillbit.log
>
>
> Query is:
> {noformat}
> ALTER SESSION SET `exec.sort.disable_managed` = false;
> select count(*) from (
>   select * from (
> select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid 
> from (
>   select d.type type, d.uid uid, flatten(d.map.rm) rms from 
> dfs.`/drill/testdata/resource-manager/nested_large` d order by d.uid
> ) s1
>   ) s2
>   order by s2.rms.mapid, s2.rptds.a, s2.rptds.do_not_exist
> );
> {noformat}
> Plan is:
> {noformat}
> | 00-00Screen
> 00-01  Project(EXPR$0=[$0])
> 00-02StreamAgg(group=[{}], EXPR$0=[$SUM0($0)])
> 00-03  UnionExchange
> 01-01StreamAgg(group=[{}], EXPR$0=[COUNT()])
> 01-02  Project($f0=[0])
> 01-03SingleMergeExchange(sort0=[4 ASC], sort1=[5 ASC], 
> sort2=[6 ASC])
> 02-01  SelectionVectorRemover
> 02-02Sort(sort0=[$4], sort1=[$5], sort2=[$6], dir0=[ASC], 
> dir1=[ASC], dir2=[ASC])
> 02-03  Project(type=[$0], rptds=[$1], rms=[$2], uid=[$3], 
> EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6])
> 02-04HashToRandomExchange(dist0=[[$4]], dist1=[[$5]], 
> dist2=[[$6]])
> 03-01  UnorderedMuxExchange
> 04-01Project(type=[$0], rptds=[$1], rms=[$2], 
> uid=[$3], EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6], 
> E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($6, hash32AsDouble($5, 
> hash32AsDouble($4, 1301011)))])
> 04-02  Project(type=[$0], rptds=[$1], rms=[$2], 
> uid=[$3], EXPR$4=[ITEM($2, 'mapid')], EXPR$5=[ITEM($1, 'a')], 
> EXPR$6=[ITEM($1, 'do_not_exist')])
> 04-03Flatten(flattenField=[$1])
> 04-04  Project(type=[$0], rptds=[ITEM($2, 
> 'rptd')], rms=[$2], uid=[$1])
> 04-05SingleMergeExchange(sort0=[1 ASC])
> 05-01  SelectionVectorRemover
> 05-02Sort(sort0=[$1], dir0=[ASC])
> 05-03  Project(type=[$0], uid=[$1], 
> rms=[$2])
> 05-04
> HashToRandomExchange(dist0=[[$1]])
> 06-01  UnorderedMuxExchange
> 07-01Project(type=[$0], 
> uid=[$1], rms=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($1, 1301011)])
> 07-02  
> Flatten(flattenField=[$2])
> 07-03Project(type=[$0], 
> uid=[$1], rms=[ITEM($2, 'rm')])
> 07-04  
> Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
> [path=maprfs:///drill/testdata/resource-manager/nested_large]], 
> selectionRoot=maprfs:/drill/testdata/resource-manager/nested_large, 
> numFiles=1, usedMetadataFile=false, columns=[`type`, `uid`, `map`.`rm`]]])
> {noformat}
> Here is a segment of the drillbit.log, starting at line 55890:
> {noformat}
> 2017-09-19 04:22:56,258 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG 
> o.a.d.e.t.g.SingleBatchSorterGen44 - Took 142 us to sort 1023 records
> 2017-09-19 04:22:56,265 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:4] DEBUG 
> o.a.d.e.t.g.SingleBatchSorterGen44 - Took 105 us to sort 1023 records
> 2017-09-19 04:22:56,268 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG 
> o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record 
> batch with status OK
> 2017-09-19 04:22:56,275 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:7] DEBUG 
> o.a.d.e.t.g.SingleBatchSorterGen44 - Took 145 us to sort 1023 records
> 2017-09-19 04:22:56,354 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG 
> o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record 
> batch with status OK
> 2017-09-19 04:22:56,357 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG 
> o.a.d.e.t.g.SingleBatchSorterGen44 - Took 143 us to sort 1023 records
> 2017-09-19 04:22:56,361 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG 
> o.a.d.exec.compile.ClassTransformer - Compiled and merged 
> PriorityQueueCopierGen50: bytecode size = 11.0 KiB, time = 124 ms.
> 2017-09-19 

[jira] [Updated] (DRILL-5786) External Sort encounters Exception in RPC communication during Sort

2017-09-19 Thread Robert Hou (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou updated DRILL-5786:
--
Summary: External Sort encounters Exception in RPC communication during 
Sort  (was: Query encounters Exception in RPC communication during Sort)

> External Sort encounters Exception in RPC communication during Sort
> ---
>
> Key: DRILL-5786
> URL: https://issues.apache.org/jira/browse/DRILL-5786
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Assignee: Paul Rogers
> Fix For: 1.12.0
>
> Attachments: 2647d2b0-69bf-5a2b-0e23-81e8d49e464e.sys.drill, 
> drillbit.log
>
>
> Query is:
> {noformat}
> select count(*) from (select * from 
> dfs.`/drill/testdata/resource-manager/3500cols.tbl` order by 
> columns[450],columns[330],columns[230],columns[220],columns[110],columns[90],columns[80],columns[70],columns[40],columns[10],columns[20],columns[30],columns[40],columns[50],
>  
> columns[454],columns[413],columns[940],columns[834],columns[73],columns[140],columns[104],columns[],columns[30],columns[2420],columns[1520],
>  columns[1410], 
> columns[1110],columns[1290],columns[2380],columns[705],columns[45],columns[1054],columns[2430],columns[420],columns[404],columns[3350],
>  
> columns[],columns[153],columns[356],columns[84],columns[745],columns[1450],columns[103],columns[2065],columns[343],columns[3420],columns[530],
>  columns[3210] ) d where d.col433 = 'sjka skjf'
> {noformat}
> This is the same query as DRILL-5670 but no session variables are set.
> Here is the stack trace:
> {noformat}
> 2017-09-12 13:14:57,584 [BitServer-5] ERROR 
> o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  
> Connection: /10.10.100.190:31012 <--> /10.10.100.190:46230 (data server).  
> Closing connection.
> io.netty.handler.codec.DecoderException: 
> org.apache.drill.exec.exception.OutOfMemoryException: Failure allocating 
> buffer.
> at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:233)
>  ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) 
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>  [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) 
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) 
> [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>  [netty-common-4.0.27.Final.jar:4.0.27.Final]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_111]
> Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Failure 
> allocating buffer.
> at 
> io.netty.buffer.PooledByteBufAllocatorL.allocate(PooledByteBufAllocatorL.java:64)
>  ~[drill-memory-base-1.12.0-SNAPSHOT.jar:4.0.27.Final]
> at 
> org.apache.drill.exec.memory.AllocationManager.<init>(AllocationManager.java:81)
>  ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.memory.BaseAllocator.bufferWithoutReservation(BaseAllocator.java:260)
>  ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:243) 
> 

[jira] [Updated] (DRILL-5804) External Sort times out, may be infinite loop

2017-09-19 Thread Robert Hou (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou updated DRILL-5804:
--
Summary: External Sort times out, may be infinite loop  (was: Query times 
out, may be infinite loop)

> External Sort times out, may be infinite loop
> -
>
> Key: DRILL-5804
> URL: https://issues.apache.org/jira/browse/DRILL-5804
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Assignee: Paul Rogers
> Fix For: 1.12.0
>
>
> Query is:
> {noformat}
> ALTER SESSION SET `exec.sort.disable_managed` = false;
> select count(*) from (
>   select * from (
> select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid 
> from (
>   select d.type type, d.uid uid, flatten(d.map.rm) rms from 
> dfs.`/drill/testdata/resource-manager/nested_large` d order by d.uid
> ) s1
>   ) s2
>   order by s2.rms.mapid, s2.rptds.a, s2.rptds.do_not_exist
> );
> {noformat}
> Plan is:
> {noformat}
> | 00-00Screen
> 00-01  Project(EXPR$0=[$0])
> 00-02StreamAgg(group=[{}], EXPR$0=[$SUM0($0)])
> 00-03  UnionExchange
> 01-01StreamAgg(group=[{}], EXPR$0=[COUNT()])
> 01-02  Project($f0=[0])
> 01-03SingleMergeExchange(sort0=[4 ASC], sort1=[5 ASC], 
> sort2=[6 ASC])
> 02-01  SelectionVectorRemover
> 02-02Sort(sort0=[$4], sort1=[$5], sort2=[$6], dir0=[ASC], 
> dir1=[ASC], dir2=[ASC])
> 02-03  Project(type=[$0], rptds=[$1], rms=[$2], uid=[$3], 
> EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6])
> 02-04HashToRandomExchange(dist0=[[$4]], dist1=[[$5]], 
> dist2=[[$6]])
> 03-01  UnorderedMuxExchange
> 04-01Project(type=[$0], rptds=[$1], rms=[$2], 
> uid=[$3], EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6], 
> E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($6, hash32AsDouble($5, 
> hash32AsDouble($4, 1301011)))])
> 04-02  Project(type=[$0], rptds=[$1], rms=[$2], 
> uid=[$3], EXPR$4=[ITEM($2, 'mapid')], EXPR$5=[ITEM($1, 'a')], 
> EXPR$6=[ITEM($1, 'do_not_exist')])
> 04-03Flatten(flattenField=[$1])
> 04-04  Project(type=[$0], rptds=[ITEM($2, 
> 'rptd')], rms=[$2], uid=[$1])
> 04-05SingleMergeExchange(sort0=[1 ASC])
> 05-01  SelectionVectorRemover
> 05-02Sort(sort0=[$1], dir0=[ASC])
> 05-03  Project(type=[$0], uid=[$1], 
> rms=[$2])
> 05-04
> HashToRandomExchange(dist0=[[$1]])
> 06-01  UnorderedMuxExchange
> 07-01Project(type=[$0], 
> uid=[$1], rms=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($1, 1301011)])
> 07-02  
> Flatten(flattenField=[$2])
> 07-03Project(type=[$0], 
> uid=[$1], rms=[ITEM($2, 'rm')])
> 07-04  
> Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
> [path=maprfs:///drill/testdata/resource-manager/nested_large]], 
> selectionRoot=maprfs:/drill/testdata/resource-manager/nested_large, 
> numFiles=1, usedMetadataFile=false, columns=[`type`, `uid`, `map`.`rm`]]])
> {noformat}
> Here is a segment of the drillbit.log, starting at line 55890:
> {noformat}
> 2017-09-19 04:22:56,258 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG 
> o.a.d.e.t.g.SingleBatchSorterGen44 - Took 142 us to sort 1023 records
> 2017-09-19 04:22:56,265 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:4] DEBUG 
> o.a.d.e.t.g.SingleBatchSorterGen44 - Took 105 us to sort 1023 records
> 2017-09-19 04:22:56,268 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG 
> o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record 
> batch with status OK
> 2017-09-19 04:22:56,275 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:7] DEBUG 
> o.a.d.e.t.g.SingleBatchSorterGen44 - Took 145 us to sort 1023 records
> 2017-09-19 04:22:56,354 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG 
> o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record 
> batch with status OK
> 2017-09-19 04:22:56,357 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG 
> o.a.d.e.t.g.SingleBatchSorterGen44 - Took 143 us to sort 1023 records
> 2017-09-19 04:22:56,361 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG 
> o.a.d.exec.compile.ClassTransformer - Compiled and merged 
> PriorityQueueCopierGen50: bytecode size = 

[jira] [Created] (DRILL-5804) Query times out, may be infinite loop

2017-09-19 Thread Robert Hou (JIRA)
Robert Hou created DRILL-5804:
-

 Summary: Query times out, may be infinite loop
 Key: DRILL-5804
 URL: https://issues.apache.org/jira/browse/DRILL-5804
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Affects Versions: 1.11.0
Reporter: Robert Hou
Assignee: Paul Rogers
 Fix For: 1.12.0


Query is:
{noformat}
ALTER SESSION SET `exec.sort.disable_managed` = false;
select count(*) from (
  select * from (
select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid 
from (
  select d.type type, d.uid uid, flatten(d.map.rm) rms from 
dfs.`/drill/testdata/resource-manager/nested_large` d order by d.uid
) s1
  ) s2
  order by s2.rms.mapid, s2.rptds.a, s2.rptds.do_not_exist
);
{noformat}

Plan is:
{noformat}
| 00-00Screen
00-01  Project(EXPR$0=[$0])
00-02StreamAgg(group=[{}], EXPR$0=[$SUM0($0)])
00-03  UnionExchange
01-01StreamAgg(group=[{}], EXPR$0=[COUNT()])
01-02  Project($f0=[0])
01-03SingleMergeExchange(sort0=[4 ASC], sort1=[5 ASC], sort2=[6 
ASC])
02-01  SelectionVectorRemover
02-02Sort(sort0=[$4], sort1=[$5], sort2=[$6], dir0=[ASC], 
dir1=[ASC], dir2=[ASC])
02-03  Project(type=[$0], rptds=[$1], rms=[$2], uid=[$3], 
EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6])
02-04HashToRandomExchange(dist0=[[$4]], dist1=[[$5]], 
dist2=[[$6]])
03-01  UnorderedMuxExchange
04-01Project(type=[$0], rptds=[$1], rms=[$2], 
uid=[$3], EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6], 
E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($6, hash32AsDouble($5, 
hash32AsDouble($4, 1301011)))])
04-02  Project(type=[$0], rptds=[$1], rms=[$2], 
uid=[$3], EXPR$4=[ITEM($2, 'mapid')], EXPR$5=[ITEM($1, 'a')], EXPR$6=[ITEM($1, 
'do_not_exist')])
04-03Flatten(flattenField=[$1])
04-04  Project(type=[$0], rptds=[ITEM($2, 
'rptd')], rms=[$2], uid=[$1])
04-05SingleMergeExchange(sort0=[1 ASC])
05-01  SelectionVectorRemover
05-02Sort(sort0=[$1], dir0=[ASC])
05-03  Project(type=[$0], uid=[$1], 
rms=[$2])
05-04
HashToRandomExchange(dist0=[[$1]])
06-01  UnorderedMuxExchange
07-01Project(type=[$0], 
uid=[$1], rms=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($1, 1301011)])
07-02  
Flatten(flattenField=[$2])
07-03Project(type=[$0], 
uid=[$1], rms=[ITEM($2, 'rm')])
07-04  
Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
[path=maprfs:///drill/testdata/resource-manager/nested_large]], 
selectionRoot=maprfs:/drill/testdata/resource-manager/nested_large, numFiles=1, 
usedMetadataFile=false, columns=[`type`, `uid`, `map`.`rm`]]])
{noformat}

Here is a segment of the drillbit.log, starting at line 55890:
{noformat}
2017-09-19 04:22:56,258 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG 
o.a.d.e.t.g.SingleBatchSorterGen44 - Took 142 us to sort 1023 records
2017-09-19 04:22:56,265 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:4] DEBUG 
o.a.d.e.t.g.SingleBatchSorterGen44 - Took 105 us to sort 1023 records
2017-09-19 04:22:56,268 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG 
o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record 
batch with status OK
2017-09-19 04:22:56,275 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:7] DEBUG 
o.a.d.e.t.g.SingleBatchSorterGen44 - Took 145 us to sort 1023 records
2017-09-19 04:22:56,354 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG 
o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record 
batch with status OK
2017-09-19 04:22:56,357 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG 
o.a.d.e.t.g.SingleBatchSorterGen44 - Took 143 us to sort 1023 records
2017-09-19 04:22:56,361 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG 
o.a.d.exec.compile.ClassTransformer - Compiled and merged 
PriorityQueueCopierGen50: bytecode size = 11.0 KiB, time = 124 ms.
2017-09-19 04:22:56,365 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:4] DEBUG 
o.a.d.e.t.g.SingleBatchSorterGen44 - Took 108 us to sort 1023 records
2017-09-19 04:22:56,367 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG 
o.a.d.e.p.i.x.m.PriorityQueueCopierWrapper - Copier setup complete
2017-09-19 04:22:56,375 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:7] DEBUG 
o.a.d.e.t.g.SingleBatchSorterGen44 - Took 

[jira] [Resolved] (DRILL-5710) drill-config.sh incorrectly exits with Java 1.7 or later is required to run Apache Drill

2017-09-19 Thread Arina Ielchiieva (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva resolved DRILL-5710.
-
Resolution: Fixed

Fixed in the scope of DRILL-5698.

> drill-config.sh incorrectly exits with Java 1.7 or later is required to run 
> Apache Drill
> 
>
> Key: DRILL-5710
> URL: https://issues.apache.org/jira/browse/DRILL-5710
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.11.0
> Environment: java version "1.8.0_144"
> OSX
>Reporter: Angel Aray
> Fix For: 1.12.0
>
>
> drill-config fails to recognize 1.8.0_144 as Java 1.7 or later.
> The script validates the Java version using the following code:
> "$JAVA" -version 2>&1 | grep "version" | egrep -e "1.4|1.5|1.6" 
> this should be replaced by:
> "$JAVA" -version 2>&1 | grep "version" | egrep -e "1\.4|1\.5|1\.6" 
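
A small illustration of why the escaping matters: with unescaped dots, the 
pattern "1.4" also matches the substring "144" inside "1.8.0_144", so Java 8 is 
wrongly treated as an old version. The Java snippet below only demonstrates the 
regex behavior; the actual fix is the escaping change in drill-config.sh quoted 
above.
{code}
import java.util.regex.Pattern;

public class JavaVersionRegexDemo {
  public static void main(String[] args) {
    String version = "1.8.0_144";

    // Unescaped dots: '.' matches any character, so "1.4" matches the "144" substring.
    boolean buggy = Pattern.compile("1.4|1.5|1.6").matcher(version).find();

    // Escaped dots: only the literal strings "1.4", "1.5" or "1.6" match.
    boolean fixed = Pattern.compile("1\\.4|1\\.5|1\\.6").matcher(version).find();

    System.out.println("unescaped pattern flags Java 8 as old: " + buggy);  // true
    System.out.println("escaped pattern flags Java 8 as old:   " + fixed);  // false
  }
}
{code}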



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (DRILL-5706) Select * on hbase table having multiple regions(one or more empty) returns wrong result intermittently

2017-09-19 Thread Vitalii Diravka (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka reassigned DRILL-5706:
--

Assignee: Vitalii Diravka

> Select * on hbase table having multiple regions(one or more empty) returns 
> wrong result intermittently
> --
>
> Key: DRILL-5706
> URL: https://issues.apache.org/jira/browse/DRILL-5706
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase
>Affects Versions: 1.11.0
>Reporter: Prasad Nagaraj Subramanya
>Assignee: Vitalii Diravka
>
> 1) Create a hbase table with 4 regions
> {code}
> create 'myhbase', 'cf1', {SPLITS => ['a', 'b', 'c']}
> put 'myhbase','a','cf1:col1','somedata'
> put 'myhbase','b','cf1:col1','somedata'
> put 'myhbase','c','cf1:col1','somedata'
> {code}
> 2) Run select * on the hbase table
> {code}
> select * from hbase.myhbase;
> {code}
> The query returns wrong result intermittently



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5706) Select * on hbase table having multiple regions(one or more empty) returns wrong result intermittently

2017-09-19 Thread Vitalii Diravka (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172202#comment-16172202
 ] 

Vitalii Diravka commented on DRILL-5706:


[~prasadns14] Looks like the result is correct. Please verify:
{code}
0: jdbc:drill:> select * from hbase.myhbase;
+--+--+
|   row_key|   cf1|
+--+--+
| [B@3486f312  | {"col1":"c29tZWRhdGE="}  |
| [B@12e0f043  | {"col1":"c29tZWRhdGE="}  |
| [B@6dbdc863  | {"col1":"c29tZWRhdGE="}  |
+--+--+
3 rows selected (0.322 seconds)
0: jdbc:drill:> select convert_from(row_key, 'UTF8') as id, 
convert_from(myhbase.cf1.col1, 'UTF8') col1 from hbase.myhbase;
+-+---+
| id  |   col1|
+-+---+
| c   | somedata  |
| a   | somedata  |
| b   | somedata  |
+-+---+
3 rows selected (0.262 seconds)
{code}

Note: the above result is the same for Drill 1.10 and the Drill 1.12 master version.

> Select * on hbase table having multiple regions(one or more empty) returns 
> wrong result intermittently
> --
>
> Key: DRILL-5706
> URL: https://issues.apache.org/jira/browse/DRILL-5706
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase
>Affects Versions: 1.11.0
>Reporter: Prasad Nagaraj Subramanya
>Assignee: Vitalii Diravka
>
> 1) Create a hbase table with 4 regions
> {code}
> create 'myhbase', 'cf1', {SPLITS => ['a', 'b', 'c']}
> put 'myhbase','a','cf1:col1','somedata'
> put 'myhbase','b','cf1:col1','somedata'
> put 'myhbase','c','cf1:col1','somedata'
> {code}
> 2) Run select * on the hbase table
> {code}
> select * from hbase.myhbase;
> {code}
> The query returns wrong result intermittently



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (DRILL-5710) drill-config.sh incorrectly exits with Java 1.7 or later is required to run Apache Drill

2017-09-19 Thread Arina Ielchiieva (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-5710:

Fix Version/s: 1.12.0

> drill-config.sh incorrectly exits with Java 1.7 or later is required to run 
> Apache Drill
> 
>
> Key: DRILL-5710
> URL: https://issues.apache.org/jira/browse/DRILL-5710
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.11.0
> Environment: java version "1.8.0_144"
> OSX
>Reporter: Angel Aray
> Fix For: 1.12.0
>
>
> drill-config fails to recognize 1.8.0_144 as Java 1.7 or later.
> The script validates the Java version using the following code:
> "$JAVA" -version 2>&1 | grep "version" | egrep -e "1.4|1.5|1.6" 
> this should be replaced by:
> "$JAVA" -version 2>&1 | grep "version" | egrep -e "1\.4|1\.5|1\.6" 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5798) Fix Flakey Tests

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172072#comment-16172072
 ] 

ASF GitHub Bot commented on DRILL-5798:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/945


> Fix Flakey Tests
> 
>
> Key: DRILL-5798
> URL: https://issues.apache.org/jira/browse/DRILL-5798
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5795) Filter pushdown for parquet handles multi rowgroup file

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172040#comment-16172040
 ] 

ASF GitHub Bot commented on DRILL-5795:
---

GitHub user dprofeta opened a pull request:

https://github.com/apache/drill/pull/949

DRILL-5795: Parquet Filter push down at rowgroup level

Before this commit, the filter was pruning complete files. When a file
is composed of multiple rowgroups, it was not able to prune a single
rowgroup from the file. Now, when the filter finds that a rowgroup
doesn't match, that rowgroup is removed from the scan.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dprofeta/drill drill-5795

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/949.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #949


commit eed3395647b10d06edf86ba4378995e9fd8da83d
Author: Damien Profeta 
Date:   2017-09-15T18:01:58Z

Parquet Filter push down now works at rowgroup level

Before this commit, the filter was pruning complete files. When a file
is composed of multiple rowgroups, it was not able to prune a single
rowgroup from the file. Now, when the filter finds that a rowgroup
doesn't match, that rowgroup is removed from the scan.




> Filter pushdown for parquet handles multi rowgroup file
> ---
>
> Key: DRILL-5795
> URL: https://issues.apache.org/jira/browse/DRILL-5795
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: Damien Profeta
>
> DRILL-1950 implemented filter pushdown for parquet files, but only in the 
> case of one rowgroup per parquet file. In the case of multiple rowgroups per 
> file, it detects that a rowgroup can be pruned but then tells the drillbit to 
> read the whole file, which leads to performance issues.
> Having multiple rowgroups per file helps to handle partitioned datasets and 
> still read only the relevant subset of data without ending up with more files 
> than really needed.
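
A minimal sketch of what rowgroup-level pruning looks like, assuming a simple 
equality predicate and per-rowgroup min/max statistics; the class and method 
names (RowGroupInfo, canDrop, prune) are illustrative, not Drill's actual 
Parquet metadata API.
{code}
import java.util.ArrayList;
import java.util.List;

class ColumnStats {
  final long min, max;                 // statistics for the filtered column
  ColumnStats(long min, long max) { this.min = min; this.max = max; }
}

class RowGroupInfo {
  final String filePath;
  final int rowGroupIndex;
  final ColumnStats stats;
  RowGroupInfo(String filePath, int rowGroupIndex, ColumnStats stats) {
    this.filePath = filePath; this.rowGroupIndex = rowGroupIndex; this.stats = stats;
  }
}

class RowGroupPruner {
  // For a predicate "col = value", a rowgroup can be dropped when the value
  // falls outside the rowgroup's [min, max] range.
  static boolean canDrop(RowGroupInfo rg, long value) {
    return value < rg.stats.min || value > rg.stats.max;
  }

  // Keep only the rowgroups that might contain matching rows, instead of
  // falling back to reading every rowgroup of a file that survives pruning.
  static List<RowGroupInfo> prune(List<RowGroupInfo> rowGroups, long value) {
    List<RowGroupInfo> kept = new ArrayList<>();
    for (RowGroupInfo rg : rowGroups) {
      if (!canDrop(rg, value)) {
        kept.add(rg);
      }
    }
    return kept;
  }
}
{code}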



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5431) Support SSL

2017-09-19 Thread Parth Chandra (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172013#comment-16172013
 ] 

Parth Chandra commented on DRILL-5431:
--

[~laurentgo] I did put in the support to read from the Windows certificate 
store for the C++ client, and support for the Mac Keychain and Windows 
certificate store for the Java client which makes it a little more palatable 
for organizations that want to have their CA at the OS level. It is not too 
hard to add the hooks to make the libraries accept a trust store verifier, but 
if we have the ability to read the system trust store then it may not be too 
useful any more.
Re the hostname verifier, I feel that we should probably stick with the 
implementations written by the professionals (i.e., Boost, Netty), rather than 
let end users write their own. Many software projects don't even do hostname 
verification [http://dl.acm.org/citation.cfm?id=2382204], and many get the name 
verification wrong (the RFC is hard to read). 
It is probably just as easy (or hard) to add a hook to let users override 
hostname verification but I'd like to get a PR out for this one, and then add 
this as an enhancement.


> Support SSL
> ---
>
> Key: DRILL-5431
> URL: https://issues.apache.org/jira/browse/DRILL-5431
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Client - Java, Client - ODBC
>Reporter: Sudheesh Katkam
>Assignee: Sudheesh Katkam
>
> Support SSL between Drillbit and JDBC/ODBC drivers. Drill already supports 
> HTTPS for web traffic.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5745) Invalid "location" information in Drill web server

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16171210#comment-16171210
 ] 

ASF GitHub Bot commented on DRILL-5745:
---

GitHub user prasadns14 opened a pull request:

https://github.com/apache/drill/pull/948

DRILL-5745: Corrected 'location' information in Drill web server

Updated the 'location' information in the Drill web server to:
1) consider whether HTTPS is enabled
2) use the foreman address rather than "localhost"
3) use the configured HTTP port

@paul-rogers, please review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/prasadns14/drill DRILL-5745

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/948.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #948


commit a8858d278eaf5ba648ce783cc7f1c6d6ba53689a
Author: Prasad Subramanya 
Date:   2017-09-19T06:59:01Z

DRILL-5745: Corrected 'location' information in Drill web server




> Invalid "location" information in Drill web server
> --
>
> Key: DRILL-5745
> URL: https://issues.apache.org/jira/browse/DRILL-5745
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Web Server
>Affects Versions: 1.11.0
>Reporter: Paul Rogers
>Priority: Minor
>
> The file {{ProfileResources.java}} has the following incorrect code line:
> {code}
>   this.location = "http://localhost:8047/profile/" + queryId + ".json";
> {code}
> This code makes three errors.
> 1. The "http" prefix ignores the fact that the Drillbit can have SSL enabled 
> for the web server.
> 2. In a browser, "localhost" refers to the machine running the browser. 
> This is valid only if the browser runs on the same machine as the Drillbit, 
> which is not, in general, true.
> 3. The port number is hardcoded to 8047, but it can be customized in the 
> config file.
> Therefore, most of the time, the link won't work on a production server.
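
A minimal sketch of how the link could be assembled instead, choosing the 
scheme from the SSL setting, using the Foreman's address, and taking the port 
from configuration; the method and parameter names here are illustrative, not 
the actual ProfileResources code or Drill config keys.
{code}
class ProfileLocationSketch {
  static String profileLocation(boolean httpsEnabled, String foremanHost,
                                int webServerPort, String queryId) {
    String scheme = httpsEnabled ? "https" : "http";   // respect SSL on the web server
    return scheme + "://" + foremanHost + ":" + webServerPort
        + "/profile/" + queryId + ".json";
  }

  public static void main(String[] args) {
    // Hypothetical host, for illustration only.
    System.out.println(profileLocation(true, "drill-node1.example.com", 8047,
        "2645d135-4222-d752-2609-c95568ff6e93"));
  }
}
{code}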



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (DRILL-5657) Implement size-aware result set loader

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16171154#comment-16171154
 ] 

ASF GitHub Bot commented on DRILL-5657:
---

Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/914
  
This commit introduces a feature to limit memory consumed by a batch.

### Batch Size Limits

With this change, the code now has three overlapping limits:

* The traditional row-count limit.
* A maximum limit of 16 MB per vector.
* The new memory-per-batch limit.

### Overall Flow for Limiting Batch Memory Usage

The batch size limit builds on the work already done for overflow.

* The column metadata allows the client to specify allocation hints such as 
expected Varchar width and array cardinality.
* The result set loader allocates a batch using the hints and target row 
count.
* The result set loader measures the memory allocated above. This is the 
initial batch size.
* As the writers find the need to extend a vector, the writer calls a 
listener to ask if the extension is allowed, passing in the amount of growth 
expected.
* The result set loader adds the delta to the accumulated total, compares 
this against the size limit, and returns whether the resize is allowed.
* If the resize is not allowed, an overflow is triggered.

Note that the above reuses the overflow mechanism, allowing the size limit 
to be handled even if reached in the middle of a row.

### Implementation Details

To make the above work:

* A new batch size limit is added to the result set loader options.
* The batch size tracking code is added. This required a new method in the 
value vectors to report actual allocated memory.
* The scalar accessors are refactored to add in the batch size limitation 
without introducing duplicated code. Code moved from the template to base 
classes to factor out redundancy.
* General code clean-up in the vector limit code, found while doing the above 
work.
* Unit tests for the new mechanism.
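
A minimal sketch of the "ask before you grow" flow described above, assuming a 
writer that consults a listener before extending a vector; class and method 
names (BatchSizeLimiter, canExpand) are illustrative, not the actual result set 
loader API.
```
// Hypothetical sketch of the batch-size check; not the real Drill classes.
class BatchSizeLimiter {
  private final long batchSizeLimit;   // the new memory-per-batch limit
  private long allocatedBytes;         // initial allocation plus approved growth

  BatchSizeLimiter(long batchSizeLimit, long initialAllocation) {
    this.batchSizeLimit = batchSizeLimit;
    this.allocatedBytes = initialAllocation;
  }

  // Called by a writer before extending a vector by deltaBytes.
  boolean canExpand(long deltaBytes) {
    if (allocatedBytes + deltaBytes > batchSizeLimit) {
      return false;                    // caller triggers overflow, even mid-row
    }
    allocatedBytes += deltaBytes;
    return true;
  }
}

class WriterSketch {
  static void writeValue(BatchSizeLimiter limiter, byte[] value, long neededBytes) {
    if (!limiter.canExpand(neededBytes)) {
      // rollOverToNewBatch();         // reuse the existing overflow mechanism
      return;
    }
    // appendToVector(value);          // enough room: extend the vector and write
  }
}
```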


> Implement size-aware result set loader
> --
>
> Key: DRILL-5657
> URL: https://issues.apache.org/jira/browse/DRILL-5657
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: Future
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: Future
>
>
> A recent extension to Drill's set of test tools created a "row set" 
> abstraction to allow us to create, and verify, record batches with very few 
> lines of code. Part of this work involved creating a set of "column 
> accessors" in the vector subsystem. Column readers provide a uniform API to 
> obtain data from columns (vectors), while column writers provide a uniform 
> writing interface.
> DRILL-5211 discusses a set of changes to limit value vectors to 16 MB in size 
> (to avoid memory fragmentation due to Drill's two memory allocators.) The 
> column accessors have proven to be so useful that they will be the basis for 
> the new, size-aware writers used by Drill's record readers.
> A step in that direction is to retrofit the column writers to use the 
> size-aware {{setScalar()}} and {{setArray()}} methods introduced in 
> DRILL-5517.
> Since the test framework row set classes are (at present) the only consumer 
> of the accessors, those classes must also be updated with the changes.
> This then allows us to add a new "row mutator" class that handles size-aware 
> vector writing, including the case in which a vector fills in the middle of a 
> row.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)