[ https://issues.apache.org/jira/browse/DRILL-5470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16051392#comment-16051392 ]
Paul Rogers commented on DRILL-5470:
------------------------------------

The user enabled the "vector validator" (DRILL-5504) and found this output in the logs:

{code}
IteratorValidatorCreator - Iterator validation enabled for ScanBatch with vector validation
...
BatchValidator - Found one or more vector errors from ScanBatch
BatchValidator - Column columns-offsets of type UInt4Vector: Invalid offset at index 2731 = 8193 exceeds maximum of 8192
BatchValidator - Column columns-offsets of type UInt4Vector: Invalid offset at index 2732 = 8196 exceeds maximum of 8192
{code}

At the same time, another user found a similar error, reported as DRILL-5590. See notes in that JIRA for more details. Unfortunately, so far, it seems that the bug in that case may not be the same one that caused this one. Still, now that we know where to look, perhaps we can use what was learned from that one to track down this one.

> Offset vector data corruption with CSV data
> -------------------------------------------
>
>                 Key: DRILL-5470
>                 URL: https://issues.apache.org/jira/browse/DRILL-5470
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Server
>    Affects Versions: 1.10.0
>         Environment: - ubuntu 14.04
> - r3.8xl (32 CPU/240GB Mem)
> - openjdk version "1.8.0_111"
> - drill 1.10.0 with 8656c83b00f8ab09fb6817e4e9943b2211772541 cherry-picked
>            Reporter: Nathan Butler
>            Assignee: Paul Rogers
>            Priority: Critical
>
> Per the mailing list discussion and Rahul's and Paul's suggestion I'm filing this Jira issue. Drill seems to be running out of memory when doing an External Sort. Per Zelaine's suggestion I enabled sort.external.disable_managed in drill-override.conf and in the sqlline session. This caused the query to run for longer but it still would fail with the same message.
> Per Paul's suggestion, I enabled debug logging for the org.apache.drill.exec.physical.impl.xsort.managed package and re-ran the query.
> Here's the initial DEBUG line for ExternalSortBatch for our query:
> bq. 2017-05-03 12:02:56,095 [26f600f1-17b3-d649-51be-2ca0c9bf7606:frag:2:15] DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Config: memory limit = 10737418240, spill file size = 268435456, spill batch size = 8388608, merge limit = 2147483647, merge batch size = 16777216
> And here's the last DEBUG line before the stack trace:
> bq. 2017-05-03 12:37:44,249 [26f600f1-17b3-d649-51be-2ca0c9bf7606:frag:2:4] DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Available memory: 10737418240, buffer memory = 10719535268, merge memory = 10707140978
> And the stacktrace:
> {quote}
> 2017-05-03 12:38:02,927 [26f600f1-17b3-d649-51be-2ca0c9bf7606:frag:2:6] INFO o.a.d.e.p.i.x.m.ExternalSortBatch - User Error Occurred: External Sort encountered an error while spilling to disk (Unable to allocate buffer of size 268435456 due to memory limit. Current allocation: 10579849472)
> org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: External Sort encountered an error while spilling to disk
> [Error Id: 5d53c677-0cd9-4c01-a664-c02089670a1c ]
>     at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544) ~[drill-common-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.doMergeAndSpill(ExternalSortBatch.java:1447) [drill-java-exec-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.mergeAndSpill(ExternalSortBatch.java:1376) [drill-java-exec-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.spillFromMemory(ExternalSortBatch.java:1339) [drill-java-exec-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.processBatch(ExternalSortBatch.java:831) [drill-java-exec-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.loadBatch(ExternalSortBatch.java:618) [drill-java-exec-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load(ExternalSortBatch.java:660) [drill-java-exec-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.innerNext(ExternalSortBatch.java:559) [drill-java-exec-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) [drill-java-exec-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) [drill-java-exec-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.innerNext(StreamingAggBatch.java:137) [drill-java-exec-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) [drill-java-exec-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) [drill-java-exec-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext(PartitionSenderRootExec.java:144) [drill-java-exec-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) [drill-java-exec-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:232) [drill-java-exec-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:226) [drill-java-exec-1.10.0.jar:1.10.0]
>     at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_111]
>     at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_111]
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) [hadoop-common-2.7.1.jar:na]
>     at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:226) [drill-java-exec-1.10.0.jar:1.10.0]
>     at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.10.0.jar:1.10.0]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_111]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
> Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to allocate buffer of size 268435456 due to memory limit. Current allocation: 10579849472
>     at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:220) ~[drill-memory-base-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:195) ~[drill-memory-base-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.vector.VarCharVector.reAlloc(VarCharVector.java:425) ~[vector-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.vector.VarCharVector.copyFromSafe(VarCharVector.java:278) ~[vector-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.vector.NullableVarCharVector.copyFromSafe(NullableVarCharVector.java:379) ~[vector-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.test.generated.PriorityQueueCopierGen140.doCopy(PriorityQueueCopierTemplate.java:22) ~[na:na]
>     at org.apache.drill.exec.test.generated.PriorityQueueCopierGen140.next(PriorityQueueCopierTemplate.java:76) ~[na:na]
>     at org.apache.drill.exec.physical.impl.xsort.managed.CopierHolder$BatchMerger.next(CopierHolder.java:234) ~[drill-java-exec-1.10.0.jar:1.10.0]
>     at org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.doMergeAndSpill(ExternalSortBatch.java:1408) [drill-java-exec-1.10.0.jar:1.10.0]
>     ... 24 common frames omitted
> {quote}
> I'm in communication with Paul and will send him the full log file.
> Thanks,
> Nathan
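
For readers unfamiliar with the validator messages quoted in the comment above: an offsets vector holds positions into some underlying data, so an entry larger than the size of that data (8192 in the log) points past the end. The sketch below illustrates that kind of bounds check in plain Java. It is not Drill's BatchValidator code; the class `OffsetCheck`, the method `findInvalidOffsets`, the extra non-decreasing check, and the sample values are all hypothetical, chosen only to mirror the shape of the logged errors.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not Drill's BatchValidator.
class OffsetCheck {

    /**
     * Scans an offsets array and reports entries that decrease or that
     * exceed maxOffset (assumed here to be the size of the data the
     * offsets point into).
     */
    static List<String> findInvalidOffsets(int[] offsets, int maxOffset) {
        List<String> errors = new ArrayList<>();
        int previous = 0;
        for (int i = 0; i < offsets.length; i++) {
            int offset = offsets[i];
            if (offset < previous) {
                errors.add("Invalid offset at index " + i + " = " + offset
                        + " is less than previous offset " + previous);
            } else if (offset > maxOffset) {
                errors.add("Invalid offset at index " + i + " = " + offset
                        + " exceeds maximum of " + maxOffset);
            }
            previous = offset;
        }
        return errors;
    }

    public static void main(String[] args) {
        // Made-up sample: the last two offsets run past a data size of 8192.
        int[] offsets = {0, 3, 8190, 8193, 8196};
        for (String error : findInvalidOffsets(offsets, 8192)) {
            System.out.println(error);
        }
    }
}
```

A check like this only detects that an offset is out of range after the fact; it does not say which writer produced the bad value, which is why the comment above still treats the root cause as open.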