Hi all,

I have a fairly large Hive query. I'm joining three Hive tables, each with
thousands of rows, and grouping the join result by several columns. In the
Hive shell this query only reaches about 80%. After about 1400 seconds it is
cancelled with the following error:

 

Status: Failed

Vertex failed, vertexName=Map 2, vertexId=vertex_1434357133795_0008_1_01,
diagnostics=[Task failed, taskId=task_1434357133795_0008_1_01_000033,
diagnostics=[TaskAttempt 0 failed,
info=[Container container_1434357133795_0008_01_000039 finished while trying
to launch. Diagnostics: [Container failed. Container expired since it was
unused]], TaskAttempt 1 failed,
info=[Container container_1434357133795_0008_01_000055 finished while trying
to launch. Diagnostics: [Container failed. Container expired since it was
unused]], TaskAttempt 2 failed,
info=[Container container_1434357133795_0008_01_000072 finished while trying
to launch. Diagnostics: [Container failed. Container expired since it was
unused]], TaskAttempt 3 failed,
info=[Container container_1434357133795_0008_01_000101 finished while trying
to launch. Diagnostics: [Container failed. Container expired since it was
unused]]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex
vertex_1434357133795_0008_1_01 [Map 2] killed/failed due to:null]

DAG failed due to vertex failure. failedVertices:1 killedVertices:0

FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.tez.TezTask

 

The YARN ResourceManager shows 100% utilization during the whole execution
(all of the 300 GB of memory is in use). I tried to extend the lifetime of
my containers with the following setting in yarn-site.xml, but without
success:

 

yarn.resourcemanager.rm.container-allocation.expiry-interval-ms = 1200000
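
Written out as a yarn-site.xml entry, the setting above would look roughly
like this. This is only a sketch in the usual Hadoop property layout; the
property name and the value (1200000 ms, i.e. 20 minutes) are the only parts
taken from my setup:

```xml
<!-- yarn-site.xml: name and value as given above; the layout is the standard Hadoop <property> form -->
<property>
  <name>yarn.resourcemanager.rm.container-allocation.expiry-interval-ms</name>
  <value>1200000</value>
</property>
```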

 

After this change the query stays at 0% for thousands of seconds. The query
itself works (tested with less data). How can I solve this problem?

 

Thanks for your help.

 

Greetz

DK
