Hi Xin,

Can you provide these:

  1.  Output of EXPLAIN for the failing query
  2.  Output of set -v (this lists all the configs, so you might want to 
anonymize them)
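
If it helps with the anonymizing, here is a minimal Python sketch that masks 
values in saved set -v output. It assumes the output is one key=value pair per 
line; the list of sensitive key fragments is only a guess on my part, so adjust 
it for your cluster before sharing anything.

```python
# Hypothetical list of key fragments treated as sensitive (an assumption,
# not an official Hive list); extend it to match your environment.
SENSITIVE_FRAGMENTS = (
    "password", "secret", "keytab", "principal",
    "metastore.uris", "connectionurl", "host", "address",
)

def anonymize(lines):
    """Return the lines with the value masked for every sensitive-looking key."""
    out = []
    for line in lines:
        line = line.rstrip("\n")
        key, sep, _value = line.partition("=")
        # Only key=value lines are candidates; everything else passes through.
        if sep and any(frag in key.lower() for frag in SENSITIVE_FRAGMENTS):
            out.append(f"{key}=<redacted>")
        else:
            out.append(line)
    return out

if __name__ == "__main__":
    sample = [
        "hive.tez.container.size=4096",
        "javax.jdo.option.ConnectionPassword=hunter2",
        "hive.metastore.uris=thrift://metastore.example.com:9083",
    ]
    print("\n".join(anonymize(sample)))
```

Matching on fragments will over-redact some harmless keys, which seems the 
safer direction for a public list.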

In addition, it looks like vertex vertex_1495595408051_21107_2_03 (Map 3) 
failed with OOM. The Tez counters will show how much data was input to this 
vertex, which can help you narrow down the root cause.
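
To see at a glance which vertex actually failed and which ones were only killed 
as a consequence, the diagnostics string can be summarized with a few lines of 
Python. This is just a sketch: the regular expression is my assumption based on 
the message format in the trace quoted below, and the sample string is 
abbreviated from it.

```python
import re

# Matches fragments such as:
#   Vertex failed, vertexName=Map 3, vertexId=vertex_1495595408051_21107_2_03,
VERTEX_RE = re.compile(
    r"Vertex (failed|killed),\s*vertexName=([^,]+),"
    r"\s*vertexId=(vertex_\d+_\d+_\d+_\d+)"
)

def summarize(diagnostics):
    """Return (vertex_id, vertex_name, status) for each vertex in the text."""
    # Mail clients hard-wrap long lines, sometimes in the middle of an ID,
    # so drop the newlines before matching.
    flat = diagnostics.replace("\n", "")
    return [
        (m.group(3), m.group(2).strip(), m.group(1))
        for m in VERTEX_RE.finditer(flat)
    ]

if __name__ == "__main__":
    sample = (
        "Vertex failed, vertexName=Map 3, "
        "vertexId=vertex_1495595408051_21107_2_03, diagnostics=[...]"
        "Vertex killed, vertexName=Reducer 7, vertexId=ve\n"
        "rtex_1495595408051_21107_2_06, diagnostics=[...]"
    )
    for vertex_id, name, status in summarize(sample):
        print(f"{status:6} {name:10} {vertex_id}")
```

The vertex reported as failed is the one whose counters are worth looking at 
first; the killed ones were stopped because of it.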

Hope this helps,
—Vaibhav

From: "Yang, Xin" <xiy...@visa.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Thursday, July 6, 2017 at 10:37 AM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Re: Tez query failed with OutOfMemoryError: Java heap space

Here is the version information:

Hive: 1.2.1
Tez: 0.8.5
Hadoop: 2.6.0-cdh5.8.3

Please let me know if you need more information.

Regards,
Xin

From: "Yang, Xin" <xiy...@visa.com>
Date: Thursday, June 29, 2017 at 11:48 AM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Tez query failed with OutOfMemoryError: Java heap space

Hi,

We ran a Tez query and it failed with OOM. We then computed stats and reran 
it, but it still failed with OOM.

Settings:

set hive.tez.container.size=4096;
set tez.am.resource.memory.mb=1024;
set hive.tez.java.opts=-Xmx3276m;

set hive.tez.dynamic.partition.pruning=false;
set hive.tez.dynamic.partition.pruning.max.event.size=1048576;
set hive.tez.dynamic.partition.pruning.max.data.size=104857600;

set hive.prewarm.enabled=true;
set hive.prewarm.numcontainers=10;

set tez.am.container.reuse.enabled=true;

set hive.cbo.enable=true;
set hive.compute.query.using.stats=true;
set hive.stats.fetch.column.stats=true;
set hive.stats.fetch.partition.stats=true;

set hive.auto.convert.join=true;
set hive.auto.convert.join.noconditionaltask=true;
set hive.auto.convert.join.noconditionaltask.size=20971520;
set hive.mapjoin.hybridgrace.hashtable=false;
set hive.optimize.bucketmapjoin.sortedmerge=false;
set hive.map.aggr.hash.percentmemory=0.5;
set hive.map.aggr=true;

set hive.vectorized.execution.enabled=false;
set hive.vectorized.execution.reduce.enabled=false;
set hive.vectorized.execution.reduce.groupby.enabled=false;

set hive.exec.parallel=true;
set hive.exec.parallel.thread.number=16;

set hive.exec.reducers.max=800;
set hive.optimize.reducededuplication=true;
set hive.optimize.reducededuplication.min.reducer=4;

set hive.merge.mapfiles=true;
set hive.merge.mapredfiles=false;
set hive.merge.smallfiles.avgsize=16000000;
set hive.merge.size.per.task=256000000;
set hive.smbjoin.cache.rows=10000;
set hive.fetch.task.conversion=more;
set hive.optimize.sort.dynamic.partition=true;

set hive.tez.auto.reducer.parallelism=true;
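
As a rough cross-check of the memory-related settings above, the Python sketch 
below works out how much of the container is given to the Java heap and what 
the map-join conversion threshold is in MiB. It assumes the settings are 
written as "set key=value;" lines and that the java opts contain a single 
-Xmx<N>m flag, as they do here. Note that the threshold is compared against 
the estimated size of the small tables, and the in-memory hash table can be 
larger than that estimate, so this is only a starting point.

```python
import re

SETTINGS = """
set hive.tez.container.size=4096;
set hive.tez.java.opts=-Xmx3276m;
set hive.auto.convert.join.noconditionaltask.size=20971520;
"""

def parse_settings(text):
    """Parse 'set key=value;' lines into a dict of strings."""
    settings = {}
    for line in text.splitlines():
        line = line.strip().rstrip(";")
        if line.startswith("set ") and "=" in line:
            # Split on the first '=' only, so values may contain '='.
            key, _, value = line[4:].partition("=")
            settings[key.strip()] = value.strip()
    return settings

def heap_summary(settings):
    """Summarize heap share, non-heap headroom and map-join threshold."""
    container_mb = int(settings["hive.tez.container.size"])
    # Assumes a single -Xmx<N>m flag, as in the settings above.
    xmx_mb = int(re.search(r"-Xmx(\d+)m", settings["hive.tez.java.opts"]).group(1))
    threshold_bytes = int(settings["hive.auto.convert.join.noconditionaltask.size"])
    return {
        "heap_fraction": round(xmx_mb / container_mb, 2),
        "non_heap_mb": container_mb - xmx_mb,
        "mapjoin_threshold_mb": threshold_bytes / (1024 * 1024),
    }

if __name__ == "__main__":
    print(heap_summary(parse_settings(SETTINGS)))
```

With the values above, the heap is about 80% of the 4096 MB container and the 
map-join threshold is 20 MiB.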

Stacktrace:

Status: Failed
Vertex failed, vertexName=Map 3, vertexId=vertex_1495595408051_21107_2_03, 
diagnostics=[Task failed, taskId=task_1495595408051_21107_2_03_000000, 
diagnostics=[TaskAttempt 0 failed, info=[Error: 
exceptionThrown=java.lang.OutOfMemoryError: Java heap space
        at 
org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:56)
        at 
org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:46)
        at 
org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput.<init>(MemoryFetchedInput.java:38)
        at 
org.apache.tez.runtime.library.common.shuffle.impl.SimpleFetchedInputAllocator.allocate(SimpleFetchedInputAllocator.java:141)
        at 
org.apache.tez.runtime.library.common.shuffle.Fetcher.fetchInputs(Fetcher.java:717)
        at 
org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:489)
        at 
org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:398)
        at 
org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:195)
        at 
org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:70)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
, errorMessage=Fetch failed:java.lang.OutOfMemoryError: Java heap space
        at 
org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:56)
        at 
org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:46)
        at 
org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput.<init>(MemoryFetchedInput.java:38)
        at 
org.apache.tez.runtime.library.common.shuffle.impl.SimpleFetchedInputAllocator.allocate(SimpleFetchedInputAllocator.java:141)
        at 
org.apache.tez.runtime.library.common.shuffle.Fetcher.fetchInputs(Fetcher.java:717)
        at 
org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:489)
        at 
org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:398)
        at 
org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:195)
        at 
org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:70)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
], TaskAttempt 1 failed, info=[Error: Failure while running 
task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator 
initialization failed
        at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
        at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
        at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
        at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
        at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
        at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
        at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
], TaskAttempt 1 failed, info=[Error: Failure while running 
task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator 
initialization failed
        at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
        at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
        at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
        at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
        at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
        at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
        at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Map operator initialization failed
        at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:229)
        at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
        ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap 
space
        at 
org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:388)
        at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:378)
        at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
        at 
org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
        at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
        at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:214)
        ... 15 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: 
Java heap space
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at 
org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:386)
        ... 20 more
Caused by: java.lang.OutOfMemoryError: Java heap space
        at 
org.apache.hadoop.hive.serde2.WriteBuffers.nextBufferToWrite(WriteBuffers.java:241)
        at 
org.apache.hadoop.hive.serde2.WriteBuffers.write(WriteBuffers.java:217)
        at 
org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$LazyBinaryKvWriter.writeKey(MapJoinBytesTableContainer.java:235)
        at 
org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.put(BytesBytesMultiHashMap.java:445)
        at 
org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer.putRow(MapJoinBytesTableContainer.java:365)
        at 
org.apache.hadoop.hive.ql.exec.tez.HashTableLoader.load(HashTableLoader.java:191)
        at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:288)
        at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:173)
        at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:169)
        at 
org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:75)
        at 
org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:92)
        ... 4 more
], TaskAttempt 2 failed, info=[Error: Failure while running 
task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator 
initialization failed
        at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
        at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
        at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
        at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
        at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
        at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
        at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Map operator initialization failed
        at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:229)
        at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
        ... 14 more
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 
killedTasks:0, Vertex vertex_1495595408051_21107_2_03 [Map 3] killed/failed due 
to:null]
Vertex killed, vertexName=Reducer 7, vertexId=vertex_1495595408051_21107_2_06, 
diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not 
succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:2, Vertex 
vertex_1495595408051_21107_2_06 [Reducer 7] killed/failed due to:null]
Vertex killed, vertexName=Map 6, vertexId=vertex_1495595408051_21107_2_05, 
diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not 
succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:1, Vertex 
vertex_1495595408051_21107_2_05 [Map 6] killed/failed due to:null]
Vertex killed, vertexName=Map 5, vertexId=vertex_1495595408051_21107_2_04, 
diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not 
succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:1, Vertex 
vertex_1495595408051_21107_2_04 [Map 5] killed/failed due to:null]
Vertex killed, vertexName=Map 1, vertexId=vertex_1495595408051_21107_2_02, 
diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not 
succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:41, Vertex 
vertex_1495595408051_21107_2_02 [Map 1] killed/failed due to:null]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:4

Please take a look. Thanks.

Regards,
Xin
