[ https://issues.apache.org/jira/browse/HIVE-23477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Prasanth Jayachandran updated HIVE-23477:
-----------------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed to master. Thanks Gopal for the review!

> LLAP: mmap allocation interruption fails to notify other threads
> -----------------------------------------------------------------
>
>                 Key: HIVE-23477
>                 URL: https://issues.apache.org/jira/browse/HIVE-23477
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-23477.1.patch, HIVE-23477.2.patch, HIVE-23477.3.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> BuddyAllocator always uses lazy arena allocation when mmap is enabled. If a query
> fragment is interrupted while an arena is being allocated, a ClosedByInterruptException
> is thrown. This exception artificially triggers an allocator OutOfMemoryError and fails
> to notify the other threads that are waiting for the arena to be allocated.
> {code:java}
> 2020-05-15 00:03:23.254 WARN [TezTR-128417_1_3_1_1_0] LlapIoImpl: Failed trying to allocate memory mapped arena
> java.nio.channels.ClosedByInterruptException
>   at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>   at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:970)
>   at org.apache.hadoop.hive.llap.cache.BuddyAllocator.preallocateArenaBuffer(BuddyAllocator.java:867)
>   at org.apache.hadoop.hive.llap.cache.BuddyAllocator.access$1100(BuddyAllocator.java:69)
>   at org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.init(BuddyAllocator.java:900)
>   at org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.allocateWithExpand(BuddyAllocator.java:1458)
>   at org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.access$800(BuddyAllocator.java:884)
>   at org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateWithExpand(BuddyAllocator.java:740)
>   at org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateMultiple(BuddyAllocator.java:330)
>   at org.apache.hadoop.hive.llap.io.metadata.MetadataCache.wrapBbForFile(MetadataCache.java:257)
>   at org.apache.hadoop.hive.llap.io.metadata.MetadataCache.putFileMetadata(MetadataCache.java:216)
>   at org.apache.hadoop.hive.llap.io.metadata.MetadataCache.putFileMetadata(MetadataCache.java:49)
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.readSplitFooter(VectorizedParquetRecordReader.java:343)
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.initialize(VectorizedParquetRecordReader.java:238)
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.<init>(VectorizedParquetRecordReader.java:160)
>   at org.apache.hadoop.hive.ql.io.parquet.VectorizedParquetInputFormat.getRecordReader(VectorizedParquetInputFormat.java:50)
>   at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:87)
>   at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:427)
>   at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203)
>   at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.<init>(TezGroupedSplitsInputFormat.java:145)
>   at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:111)
>   at org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:156)
>   at org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:82)
>   at org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:703)
>   at org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:662)
>   at org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:150)
>   at org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:114)
>   at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:532)
>   at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:178)
>   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266)
>   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>   at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
>   at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>   at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62)
>   at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> 2020-05-15 00:03:23.254 ERROR [TezTR-128417_1_3_1_1_0] vector.VectorizedParquetRecordReader: Failed to create the vectorized reader due to exception java.lang.OutOfMemoryError: Cannot allocate 1073741824 bytes: Failed trying to allocate memory mapped arena: null; make sure your xmx and process size are set correctly.
> {code}
>
> {code:java}
> "TezTR-128417_1_3_1_18_0" #319 daemon prio=5 os_prio=0 tid=0x00007f5880004000 nid=0x3c8 runnable [0x00007f57a1846000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>   at java.lang.Throwable.fillInStackTrace(Native Method)
>   at java.lang.Throwable.fillInStackTrace(Throwable.java:784)
>   - locked <0x00007f5c93915e98> (a java.lang.InterruptedException)
>   at java.lang.Throwable.<init>(Throwable.java:251)
>   at java.lang.Exception.<init>(Exception.java:54)
>   at java.lang.InterruptedException.<init>(InterruptedException.java:57)
>   at java.lang.Object.wait(Native Method)
>   at org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.allocateWithExpand(BuddyAllocator.java:1443)
>   - locked <0x00007f598859f188> (a org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena)
>   at org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.access$800(BuddyAllocator.java:884)
>   at org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateWithExpand(BuddyAllocator.java:740)
>   at org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateMultiple(BuddyAllocator.java:330)
>   at org.apache.hadoop.hive.llap.io.metadata.MetadataCache.wrapBbForFile(MetadataCache.java:257)
>   at org.apache.hadoop.hive.llap.io.metadata.MetadataCache.putFileMetadata(MetadataCache.java:216)
>   at org.apache.hadoop.hive.llap.io.metadata.MetadataCache.putFileMetadata(MetadataCache.java:49)
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.readSplitFooter(VectorizedParquetRecordReader.java:343)
> {code}
>
> {code:java}
> "TezTR-128417_1_4_1_18_0" #588 daemon prio=5 os_prio=0 tid=0x00007f57d0004000 nid=0x43a3 in Object.wait() [0x00007f56f8681000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.allocateWithExpand(BuddyAllocator.java:1443)
>   - locked <0x00007f598859f188> (a org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena)
> {code}
>
> TezTR-128417_1_3_1_18_0 was interrupted; it threw the OutOfMemoryError but never notified the other threads. TezTR-128417_1_4_1_18_0 is stuck forever waiting to allocate. A sketch of the notification pattern missing from the failure path follows below.
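The two jstack samples show the hang directly: the interrupted thread owns the arena monitor and leaves the allocation path via OutOfMemoryError without ever calling notifyAll(), so the second thread's Object.wait() at BuddyAllocator.java:1443 never returns. The sketch below is illustrative only, assuming simplified stand-in names (ArenaSketch, initArena, awaitArena, and a placeholder preallocateArenaBuffer); it is not the real BuddyAllocator code or the committed patch. It shows the shape of behavior the description calls for: record the failure and wake every waiter even when the mmap call is interrupted.

{code:java}
// Minimal illustrative sketch (not the actual BuddyAllocator code or the
// committed HIVE-23477 patch): the thread that maps the arena must wake the
// threads parked on the arena monitor even when the mapping is interrupted.
import java.nio.channels.ClosedByInterruptException;

class ArenaSketch {
  private final Object lock = new Object();
  private volatile boolean initialized = false;
  private volatile boolean failed = false;

  /** Run by the thread that wins the race to initialize the arena. */
  void initArena() {
    try {
      preallocateArenaBuffer();   // stand-in for FileChannel.map(); throws under interrupt
      initialized = true;
    } catch (ClosedByInterruptException e) {
      failed = true;              // record the failure instead of only surfacing an OOM
      throw new RuntimeException("Arena allocation interrupted", e);
    } finally {
      synchronized (lock) {
        lock.notifyAll();         // the missing step: wake every waiter on success and failure
      }
    }
  }

  /** Run by threads that need the arena and have to wait for its initialization. */
  void awaitArena() throws InterruptedException {
    synchronized (lock) {
      while (!initialized && !failed) {
        lock.wait(1000);          // bounded wait; re-check state on every wakeup
      }
    }
    if (failed) {
      throw new IllegalStateException("Arena allocation failed; retry or fail the fragment");
    }
  }

  private void preallocateArenaBuffer() throws ClosedByInterruptException {
    // placeholder for the mmap of the cache-backed file
  }
}
{code}

Whether the waiters then retry the allocation or fail their fragments is a separate policy question; the essential property is that no thread is left blocked on the arena monitor after the allocating thread has bailed out.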