[jira] [Comment Edited] (HIVE-16233) llap: Query failed with AllocatorOutOfMemoryException

Sergey Shelukhin (JIRA) Wed, 05 Apr 2017 13:51:54 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-16233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15957639#comment-15957639
 ]


Sergey Shelukhin edited comment on HIVE-16233 at 4/5/17 8:50 PM:
-----------------------------------------------------------------

I fixed some more things, the patch seems to have some MTT issues. I am not 
sure if the locking model used for headers is sufficient with the new code that 
looks at headers directly... previously headers worked without explicit sync 
due to interplay of different freelist locks and how things were accessed. Now 
there may be some stale-data races when headers are accessed outside of 
freelist locks and then re-checked inside a wrong free list lock.  May need 
some extra sync sadly. Although these may just be ST logic bugs, like all the 
previous things that looked like races. Will likely update later today.


was (Author: sershe):
I fixed some more things, the patch seems to have some MTT issues. I am not 
sure if the locking model used for headers is sufficient... there may be some 
stale-data races when headers are accessed outside of freelist locks and then 
re-checked inside a wrong free list lock.  May need some extra sync sadly. 
Although they may just be ST logic bugs, like all the previous things that 
looked like races. Will likely update later today.

> llap: Query failed with AllocatorOutOfMemoryException
> -----------------------------------------------------
>
>                 Key: HIVE-16233
>                 URL: https://issues.apache.org/jira/browse/HIVE-16233
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>            Reporter: Siddharth Seth
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-16233.WIP.patch, HIVE-16233.WIP.patch
>
>
> {code}
> TaskAttempt 5 failed, info=[Error: Error while running task ( failure ) : 
> attempt_1488231257387_2288_25_05_000056_5:java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
> java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
>         at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:211)
>         at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:168)
>         at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>         at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>         at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
>         at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>         at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>         at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>         at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
>         at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:74)
>         at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419)
>         at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:185)
>         ... 15 more
> Caused by: java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
>         at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>         at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>         at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
>         at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
>         at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
>         at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>         at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
>         at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
>         at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:62)
>         ... 17 more
> Caused by: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
>         at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:425)
>         at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:413)
>         at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:235)
>         at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:232)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
>         at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:232)
>         at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:94)
>         ... 6 more
> Caused by: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 262144; at 0 out of 1
>         at 
> org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateMultiple(BuddyAllocator.java:275)
>         at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedStream(EncodedReaderImpl.java:675)
>         at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:412)
>         ... 14 more
> {code}
> Also reported by [~selinazh] while using older bits.
> This JIRA will be used to track a long-term fix. 
> We were initially planning to write a different allocator with compaction, 
> but now we are stuck with buddyallocator.
> There are two approaches - compaction on top of buddy allocator (should be 
> fairly simple actually, aside from indirection... luckily for the latter, we 
> can take the buffer out in most places and we can reuse the existing 
> locking-to-prevent-eviction mechanism to also prevent reallocation); or 
> writing a better allocator. Probably the former is preferred, need to think 
> about the latter.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Comment Edited] (HIVE-16233) llap: Query failed with AllocatorOutOfMemoryException

Reply via email to