[jira] [Comment Edited] (DRILL-6517) IllegalStateException: Record count not set for this vector container

2019-01-30 Thread Robert Hou (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756773#comment-16756773
 ] 

Robert Hou edited comment on DRILL-6517 at 1/31/19 1:38 AM:


I am unable to reproduce this problem with sf1. I ran the query for 2 hours and 
12 hours, and then successfully canceled the query. I spoke with Khurram and 
added "-ea" to DRILL_JAVA_OPTS. I also added "alter system set 
`drill.exec.hashjoin.fallback.enabled` = true;" because the query was running 
out of memory. I am able to cancel the query. I am running Drill 1.14, commit 
35a1ae23c9b280b9e73cb0f6f01808c996515454. The commit message is "NPE for nested 
EAND scenario.".  The query can be canceled with Drill 1.15.


was (Author: rhou):
 am unable to reproduce this problem with sf1. I ran the query for 2 hours and 
12 hours, and then successfully canceled the query. I spoke with Khurram and 
added "-ea" to DRILL_JAVA_OPTS. I also added "alter system set 
`drill.exec.hashjoin.fallback.enabled` = true;" because the query was running 
out of memory. I am able to cancel the query. I am running Drill 1.14, commit 
35a1ae23c9b280b9e73cb0f6f01808c996515454. The commit message is "NPE for nested 
EAND scenario.".  The query can be canceled with Drill 1.15.

> IllegalStateException: Record count not set for this vector container
> -
>
> Key: DRILL-6517
> URL: https://issues.apache.org/jira/browse/DRILL-6517
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow
> Affects Versions: 1.14.0
> Reporter: Khurram Faraaz
> Assignee: Boaz Ben-Zvi
> Priority: Critical
> Labels: ready-to-commit
> Fix For: 1.15.0
>
> Attachments: 24d7b377-7589-7928-f34f-57d02061acef.sys.drill
>
>
> TPC-DS query is Canceled after 2 hrs and 47 mins and we see an 
> IllegalStateException: Record count not set for this vector container, in 
> drillbit.log
> Steps to reproduce the problem, query profile 
> (24d7b377-7589-7928-f34f-57d02061acef) is attached here.
> {noformat}
> In drill-env.sh set max direct memory to 12G on all 4 nodes in cluster
> export DRILL_MAX_DIRECT_MEMORY=${DRILL_MAX_DIRECT_MEMORY:-"12G"}
> and set these options from sqlline,
> alter system set `planner.memory.max_query_memory_per_node` = 10737418240;
> alter system set `drill.exec.hashagg.fallback.enabled` = true;
> To run the query (replace IP-ADDRESS with your foreman node's IP address)
> cd /opt/mapr/drill/drill-1.14.0/bin
> ./sqlline -u "jdbc:drill:schema=dfs.tpcds_sf1_parquet_views;drillbit=<IP-ADDRESS>" -f /root/query72.sql
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 2018-06-18 20:08:51,912 [24d7b377-7589-7928-f34f-57d02061acef:frag:4:49] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: Record count not set for this vector container
> Fragment 4:49
> [Error Id: 73177a1c-f7aa-4c9e-99e1-d6e1280e3f27 on qa102-45.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IllegalStateException: Record count not set for this vector container
> Fragment 4:49
> [Error Id: 73177a1c-f7aa-4c9e-99e1-d6e1280e3f27 on qa102-45.qa.lab:31010]
>  at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) ~[drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:361) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:216) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:327) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_161]
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_161]
>  at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]
> Caused by: java.lang.IllegalStateException: Record count not set for this vector container
>  at com.google.common.base.Preconditions.checkState(Preconditions.java:173) ~[guava-18.0.jar:na]
>  at org.apache.drill.exec.record.VectorContainer.getRecordCount(VectorContainer.java:394) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.getRecordCount(RemovingRecordBatch.java:49) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> 

[jira] [Comment Edited] (DRILL-6517) IllegalStateException: Record count not set for this vector container

2019-01-30 Thread Robert Hou (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756773#comment-16756773
 ] 

Robert Hou edited comment on DRILL-6517 at 1/31/19 1:38 AM:


I am unable to reproduce this problem with sf1. I ran the query for 2 hours and 
12 hours, and then successfully canceled the query. I spoke with Khurram and 
added "-ea" to DRILL_JAVA_OPTS. I also added "alter system set 
`drill.exec.hashjoin.fallback.enabled` = true;" because the query was running 
out of memory. I am able to cancel the query. I am running Drill 1.14, commit 
35a1ae23c9b280b9e73cb0f6f01808c996515454. The commit message is "NPE for nested 
EAND scenario.".  The query can be canceled with Drill 1.15.


was (Author: rhou):
 am unable to reproduce this problem with sf1. I ran the query for 2 hours and 
12 hours, and then successfully canceled the query. I spoke with Khurram and 
added "-ea" to DRILL_JAVA_OPTS. I also added "alter system set 
`drill.exec.hashjoin.fallback.enabled` = true;" because the query was running 
out of memory. I am able to cancel the query. I am running Drill 1.14, commit 
35a1ae23c9b280b9e73cb0f6f01808c996515454. The commit message is "NPE for nested 
EAND scenario.".

> IllegalStateException: Record count not set for this vector container
> -
>
> Key: DRILL-6517
> URL: https://issues.apache.org/jira/browse/DRILL-6517
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow
> Affects Versions: 1.14.0
> Reporter: Khurram Faraaz
> Assignee: Boaz Ben-Zvi
> Priority: Critical
> Labels: ready-to-commit
> Fix For: 1.15.0
>
> Attachments: 24d7b377-7589-7928-f34f-57d02061acef.sys.drill
>
>
> TPC-DS query is Canceled after 2 hrs and 47 mins and we see an 
> IllegalStateException: Record count not set for this vector container, in 
> drillbit.log
> Steps to reproduce the problem, query profile 
> (24d7b377-7589-7928-f34f-57d02061acef) is attached here.
> {noformat}
> In drill-env.sh set max direct memory to 12G on all 4 nodes in cluster
> export DRILL_MAX_DIRECT_MEMORY=${DRILL_MAX_DIRECT_MEMORY:-"12G"}
> and set these options from sqlline,
> alter system set `planner.memory.max_query_memory_per_node` = 10737418240;
> alter system set `drill.exec.hashagg.fallback.enabled` = true;
> To run the query (replace IP-ADDRESS with your foreman node's IP address)
> cd /opt/mapr/drill/drill-1.14.0/bin
> ./sqlline -u "jdbc:drill:schema=dfs.tpcds_sf1_parquet_views;drillbit=<IP-ADDRESS>" -f /root/query72.sql
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 2018-06-18 20:08:51,912 [24d7b377-7589-7928-f34f-57d02061acef:frag:4:49] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: Record count not set for this vector container
> Fragment 4:49
> [Error Id: 73177a1c-f7aa-4c9e-99e1-d6e1280e3f27 on qa102-45.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IllegalStateException: Record count not set for this vector container
> Fragment 4:49
> [Error Id: 73177a1c-f7aa-4c9e-99e1-d6e1280e3f27 on qa102-45.qa.lab:31010]
>  at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) ~[drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:361) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:216) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:327) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_161]
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_161]
>  at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]
> Caused by: java.lang.IllegalStateException: Record count not set for this vector container
>  at com.google.common.base.Preconditions.checkState(Preconditions.java:173) ~[guava-18.0.jar:na]
>  at org.apache.drill.exec.record.VectorContainer.getRecordCount(VectorContainer.java:394) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.getRecordCount(RemovingRecordBatch.java:49) ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at org.apache.drill.exec.record.RecordBatchSizer.<init>(RecordBatchSizer.java:690)
>  

[jira] [Comment Edited] (DRILL-6517) IllegalStateException: Record count not set for this vector container

2018-07-10 Thread Boaz Ben-Zvi (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539284#comment-16539284
 ] 

Boaz Ben-Zvi edited comment on DRILL-6517 at 7/10/18 10:14 PM:
---

From running instrumented code (on latest master), it looks like the error 
that triggered the cancellation was a full disk while spilling:

{code}
2018-07-10 13:58:41,318 [24bb00cd-1a38-f06b-19a7-cc44d83bde59:frag:4:4] INFO  o.a.d.e.p.impl.common.HashPartition - User Error Occurred: Hash Join failed to write to output file: /tmp/drill/spill/24bb00cd-1a38-f06b-19a7-cc44d83bde59_HashJoin_4-22-4/spill6_outer (null)
org.apache.drill.common.exceptions.UserException: DATA_WRITE ERROR: Hash Join failed to write to output file: /tmp/drill/spill/24bb00cd-1a38-f06b-19a7-cc44d83bde59_HashJoin_4-22-4/spill6_outer
{code}

After that, operators started returning batches with the STOP outcome, for 
which the vector container was not initialized.
The original code ignored the outcome and went on to check those records, 
invoking the *batch* method `getRecordCount()`, which normally just returns an 
internal field of the batch. For `RemovingRecordBatch`, however, the 
implementation invokes the container's `getRecordCount()`, which failed because 
the container was not initialized.
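
The failure mode described above can be illustrated with a small, 
self-contained sketch. It is only a model under stated assumptions: `Outcome`, 
`Container`, `Batch` and both `count...` helpers are hypothetical stand-ins 
written for this example and are not Drill's actual classes, and the guard 
shown is one possible way to avoid the call, not necessarily the change that 
was committed for this issue.

```java
/** Hypothetical model of the problem; none of these types are Drill classes. */
final class StopOutcomeSketch {

    /** Simplified stand-in for the outcome an operator returns with a batch. */
    enum Outcome { OK, STOP }

    /** Stand-in for a container whose record count must be set before it is read. */
    static final class Container {
        private int recordCount = -1; // -1 means "not set yet"

        void setRecordCount(int count) {
            recordCount = count;
        }

        int getRecordCount() {
            if (recordCount < 0) {
                throw new IllegalStateException(
                    "Record count not set for this vector container");
            }
            return recordCount;
        }
    }

    /** Stand-in for a batch that delegates its record count to the container. */
    static final class Batch {
        final Outcome outcome;
        final Container container = new Container();

        Batch(Outcome outcome) {
            this.outcome = outcome;
        }

        int getRecordCount() {
            return container.getRecordCount();
        }
    }

    /** Reads the count regardless of the outcome; throws for an uninitialized STOP batch. */
    static int countIgnoringOutcome(Batch batch) {
        return batch.getRecordCount();
    }

    /** Checks the outcome first and never touches the container of a STOP batch. */
    static int countCheckingOutcome(Batch batch) {
        if (batch.outcome == Outcome.STOP) {
            return 0; // assumption for this sketch: a STOP batch carries no records
        }
        return batch.getRecordCount();
    }

    public static void main(String[] args) {
        Batch stopped = new Batch(Outcome.STOP); // container never initialized
        System.out.println("guarded count: " + countCheckingOutcome(stopped));
        try {
            countIgnoringOutcome(stopped);
        } catch (IllegalStateException e) {
            System.out.println("unguarded call failed: " + e.getMessage());
        }
    }
}
```

In this model the unguarded helper reproduces the exception message from the 
stack trace, while the guarded helper returns without reading the container.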



was (Author: ben-zvi):
  From running instrumented code (on latest master), it looks like the error 
that triggered the cancellation was a disk full while spilling:

{code}
2018-07-10 13:58:41,318 [24bb00cd-1a38-f06b-19a7-cc44d83bde59:frag:4:4] INFO  o.a.d.e.p.impl.common.HashPartition - User Error Occurred: Hash Join failed to write to output file: /tmp/drill/spill/24bb00cd-1a38-f06b-19a7-cc44d83bde59_HashJoin_4-22-4/spill6_outer (null)
org.apache.drill.common.exceptions.UserException: DATA_WRITE ERROR: Hash Join failed to write to output file: /tmp/drill/spill/24bb00cd-1a38-f06b-19a7-cc44d83bde59_HashJoin_4-22-4/spill6_outer
{code}

Then operators started returning batches with the STOP outcome, where the 
vector container was not initialized.
The original code went on to check those records (ignoring the outcome), 
invoking the *batch* method `getRecordCount()` (which normally just returns the 
internal field in the batch. However for `RemovingRecordBatch` -- the 
implementation invokes the container's `getRecordCount()` , which failed as it 
was not initialized.


> IllegalStateException: Record count not set for this vector container
> -
>
> Key: DRILL-6517
> URL: https://issues.apache.org/jira/browse/DRILL-6517
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow
> Affects Versions: 1.14.0
> Reporter: Khurram Faraaz
> Assignee: Boaz Ben-Zvi
> Priority: Critical
> Fix For: 1.14.0
>
> Attachments: 24d7b377-7589-7928-f34f-57d02061acef.sys.drill
>
>
> TPC-DS query is Canceled after 2 hrs and 47 mins and we see an 
> IllegalStateException: Record count not set for this vector container, in 
> drillbit.log
> Steps to reproduce the problem, query profile 
> (24d7b377-7589-7928-f34f-57d02061acef) is attached here.
> {noformat}
> In drill-env.sh set max direct memory to 12G on all 4 nodes in cluster
> export DRILL_MAX_DIRECT_MEMORY=${DRILL_MAX_DIRECT_MEMORY:-"12G"}
> and set these options from sqlline,
> alter system set `planner.memory.max_query_memory_per_node` = 10737418240;
> alter system set `drill.exec.hashagg.fallback.enabled` = true;
> To run the query (replace IP-ADDRESS with your foreman node's IP address)
> cd /opt/mapr/drill/drill-1.14.0/bin
> ./sqlline -u "jdbc:drill:schema=dfs.tpcds_sf1_parquet_views;drillbit=<IP-ADDRESS>" -f /root/query72.sql
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 2018-06-18 20:08:51,912 [24d7b377-7589-7928-f34f-57d02061acef:frag:4:49] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: Record count not set for this vector container
> Fragment 4:49
> [Error Id: 73177a1c-f7aa-4c9e-99e1-d6e1280e3f27 on qa102-45.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IllegalStateException: Record count not set for this vector container
> Fragment 4:49
> [Error Id: 73177a1c-f7aa-4c9e-99e1-d6e1280e3f27 on qa102-45.qa.lab:31010]
>  at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) ~[drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:361) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:216) [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
>