[ https://issues.apache.org/jira/browse/DRILL-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dechang Gu updated DRILL-5669:
------------------------------
    Attachment: 26999476-174e-98fd-e21e-fd53f79284c7.sys.drill

> Multiple TPCH queries failed due to OOM
> ---------------------------------------
>
>                 Key: DRILL-5669
>                 URL: https://issues.apache.org/jira/browse/DRILL-5669
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>         Environment: RHEL 6.4 2.6.32-358.el6.x86_64, 10+1 nodes cluster
>            Reporter: Dechang Gu
>            Assignee: Boaz Ben-Zvi
>             Fix For: 1.11.0
>
>         Attachments: 26999476-174e-98fd-e21e-fd53f79284c7.sys.drill
>
>
> Running TPCH SF100 Parquet (and CSV) tests, multiple queries failed due to OOM. For example, Q16 hit the following error:
> {code}
> java.sql.SQLException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
> Unable to allocate sv2 for 65536 records, and not enough batchGroups to spill.
> batchGroups.size 1
> spilledBatchGroups.size 0
> allocated memory 23500416
> allocator limit 20000000
> Fragment 1:11
> [Error Id: e58161a6-2383-48b1-a350-50db1b5408c6 on ucs-node10.perf.lab:31010]
>     at org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:489)
>     at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:593)
>     at org.apache.calcite.avatica.AvaticaResultSet.next(AvaticaResultSet.java:215)
>     at org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:140)
>     at PipSQueak.fetchRows(PipSQueak.java:420)
>     at PipSQueak.runTest(PipSQueak.java:116)
>     at PipSQueak.main(PipSQueak.java:556)
> Caused by: org.apache.drill.common.exceptions.UserRemoteException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
> Unable to allocate sv2 for 65536 records, and not enough batchGroups to spill.
> batchGroups.size 1
> spilledBatchGroups.size 0
> allocated memory 23500416
> allocator limit 20000000
> Fragment 1:11
> {code}
> And in drillbit.log:
> {code}
> 2017-07-12 11:34:11,670 ucs-node10.perf.lab [26999476-174e-98fd-e21e-fd53f79284c7:frag:1:11] INFO o.a.d.e.p.i.xsort.ExternalSortBatch - User Error Occurred: One or more nodes ran out of memory while executing the query.
> org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
> Unable to allocate sv2 for 65536 records, and not enough batchGroups to spill.
> batchGroups.size 1
> spilledBatchGroups.size 0
> allocated memory 23500416
> allocator limit 20000000
> [Error Id: e58161a6-2383-48b1-a350-50db1b5408c6 ]
>     at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:550) ~[drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>     at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.newSV2(ExternalSortBatch.java:639) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>     at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:381) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>     at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>     at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>     at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>     at org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.innerNext(StreamingAggBatch.java:140) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>     at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>     at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:105) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>     at org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext(PartitionSenderRootExec.java:144) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>     at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:95) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>     at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:234) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>     at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:227) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>     at java.security.AccessController.doPrivileged(Native Method) [na:1.7.0_65]
>     at javax.security.auth.Subject.doAs(Subject.java:415) [na:1.7.0_65]
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595) [hadoop-common-2.7.0-mapr-1607.jar:na]
>     at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:227) [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>     at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_65]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_65]
>     at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
> {code}

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
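
For readers checking the numbers in the error message, the arithmetic can be sketched as below. This is an illustration only, not Drill's allocator logic: the class `Sv2Budget` and its methods are hypothetical, and the 2-bytes-per-record size of an sv2 entry is an assumption that is not stated in the report. The memory figures are copied from the log above.

```java
// Hypothetical helper, not part of Drill: reproduces the arithmetic
// behind the figures reported in the error message.
public class Sv2Budget {
    // Assumption: each sv2 (two-byte selection vector) entry takes 2 bytes.
    static final int SV2_BYTES_PER_RECORD = 2;

    // Bytes needed for an sv2 covering the given number of records.
    static long sv2Bytes(int records) {
        return (long) records * SV2_BYTES_PER_RECORD;
    }

    // Remaining bytes under the limit; negative when already over it.
    static long headroom(long limit, long allocated) {
        return limit - allocated;
    }

    // True when the request fits in the remaining headroom.
    static boolean fits(long limit, long allocated, long request) {
        return request <= headroom(limit, allocated);
    }

    public static void main(String[] args) {
        long allocated = 23_500_416L; // "allocated memory" from the log
        long limit = 20_000_000L;     // "allocator limit" from the log
        long request = sv2Bytes(65_536);

        System.out.println("sv2 request bytes: " + request);
        System.out.println("headroom bytes:    " + headroom(limit, allocated));
        System.out.println("request fits:      " + fits(limit, allocated, request));
    }
}
```

Under these assumptions the sv2 request itself is small (131072 bytes), while the reported allocation already exceeds the reported limit by 3500416 bytes, and with batchGroups.size 1 the sort reports that it does not have enough batch groups to spill.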