[
https://issues.apache.org/jira/browse/MAPREDUCE-7206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tarek Abouzeid resolved MAPREDUCE-7206.
---------------------------------------
Resolution: Not A Bug
> ShuffleHandler cannot access file.out
> --------------------------------------
>
> Key: MAPREDUCE-7206
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7206
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 3.1.1
> Environment: HDP 3.1 (3.1.0.0-78)
> Reporter: Tarek Abouzeid
> Priority: Critical
>
> I am running HDP 3.1 (3.1.0.0-78) with 10 data nodes, and the Hive execution
> engine is Tez. When I run a query, I get this error:
> {code:java}
> ERROR : FAILED: Execution Error, return code 2 from
> org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex re-running, vertexName=Map
> 1, vertexId=vertex_1557754551780_1091_2_00Vertex re-running, vertexName=Map
> 1, vertexId=vertex_1557754551780_1091_2_00Vertex re-running, vertexName=Map
> 1, vertexId=vertex_1557754551780_1091_2_00Vertex failed, vertexName=Map 1,
> vertexId=vertex_1557754551780_1091_2_00, diagnostics=[Vertex
> vertex_1557754551780_1091_2_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE,
> Vertex vertex_1557754551780_1091_2_00 [Map 1] failed as task
> task_1557754551780_1091_2_00_000001 failed after vertex succeeded.]DAG did
> not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
> INFO : Completed executing
> command(queryId=hive_20190516161715_09090e6d-e513-4fcc-9c96-0b48e9b43822);
> Time taken: 17.935 seconds
> Error: Error while processing statement: FAILED: Execution Error, return code
> 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex re-running,
> vertexName=Map 1, vertexId=vertex_1557754551780_1091_2_00Vertex re-running,
> vertexName=Map 1, vertexId=vertex_1557754551780_1091_2_00Vertex re-running,
> vertexName=Map 1, vertexId=vertex_1557754551780_1091_2_00Vertex failed,
> vertexName=Map 1, vertexId=vertex_1557754551780_1091_2_00,
> diagnostics=[Vertex vertex_1557754551780_1091_2_00 [Map 1] killed/failed due
> to:OWN_TASK_FAILURE, Vertex vertex_1557754551780_1091_2_00 [Map 1] failed as
> task task_1557754551780_1091_2_00_000001 failed after vertex succeeded.]DAG
> did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
> (state=08S01,code=2)
> {code}
> I then traced the logs. The application ID in this example is
> *application_1557754551780_1091*.
> The NodeManager logs show:
> {code:java}
> 2019-05-16 16:19:05,801 INFO mapred.ShuffleHandler
> (ShuffleHandler.java:sendMapOutput(1268)) -
> /var/lib/hadoop/yarn/local/usercache/hive/appcache/application_1557754551780_1091/output/attempt_1557754551780_1091_2_00_000000_0_10003/file.out
> not found
> 2019-05-16 16:19:05,818 INFO mapred.ShuffleHandler
> (ShuffleHandler.java:sendMapOutput(1268)) -
> /var/lib/hadoop/yarn/local/usercache/hive/appcache/application_1557754551780_1091/output/attempt_1557754551780_1091_2_00_000000_0_10003/file.out
> not found
> 2019-05-16 16:19:05,821 INFO mapred.ShuffleHandler
> (ShuffleHandler.java:sendMapOutput(1268)) -
> /var/lib/hadoop/yarn/local/usercache/hive/appcache/application_1557754551780_1091/output/attempt_1557754551780_1091_2_00_000000_0_10003/file.out
> not found
> 2019-05-16 16:19:05,822 INFO mapred.ShuffleHandler
> (ShuffleHandler.java:sendMapOutput(1268)) -
> /var/lib/hadoop/yarn/local/usercache/hive/appcache/application_1557754551780_1091/output/attempt_1557754551780_1091_2_00_000000_0_10003/file.out
> not found
> 2019-05-16 16:19:05,824 INFO mapred.ShuffleHandler
> (ShuffleHandler.java:sendMapOutput(1268)) -
> /var/lib/hadoop/yarn/local/usercache/hive/appcache/application_1557754551780_1091/output/attempt_1557754551780_1091_2_00_000000_0_10003/file.out
> not found
> 2019-05-16 16:19:05,826 INFO mapred.ShuffleHandler
> (ShuffleHandler.java:sendMapOutput(1268)) -
> /var/lib/hadoop/yarn/local/usercache/hive/appcache/application_1557754551780_1091/output/attempt_1557754551780_1091_2_00_000000_0_10003/file.out
> not found
> {code}
> I checked the path where the map output should be (
> */var/lib/hadoop/yarn/local/usercache/hive/appcache/application_1557754551780_1091/output/attempt_1557754551780_1091_2_00_000000_0_10003*
> ):
>
> {code:java}
> drwx--x---. 3 hive hadoop 16 May 16 16:16 filecache
> drwxr-s---. 3 hive hadoop 60 May 16 16:16 output
> {code}
> Inside the output directory:
>
> {code:java}
> -rw-------. 1 hive hadoop 28 May 16 16:17 file.out
> -rw-r-----. 1 hive hadoop 32 May 16 16:17 file.out.index
> {code}
>
> So *file.out* is not readable by other users in the same group. I switched to
> the yarn user, tried to open the file, and got "permission denied".
>
>
>
>
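The permission comparison in the report can be checked mechanically. The
following is a minimal sketch in Python, assuming the permission strings are
copied verbatim from the ls listing quoted above and, as the report implies,
that the yarn user can reach the files only through the hadoop group. The
helper names mode_from_ls and group_readable are hypothetical and are not part
of Hadoop or any Hadoop tooling.

```python
import stat

# Permission flags in the order they appear in an ls-style string (rwxrwxrwx).
_FLAGS = [
    stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR,
    stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP,
    stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH,
]


def mode_from_ls(perm: str) -> int:
    """Convert an ls-style string such as '-rw-r-----.' to rwx mode bits.

    Simplification: only the nine rwx bits are decoded. A lowercase 's' or 't'
    counts as execute being set, an uppercase 'S' or 'T' as execute being
    clear, and the setuid/setgid/sticky bits themselves are not returned.
    """
    bits = perm[1:10]  # skip the file-type character, drop a trailing SELinux '.'
    mode = 0
    for ch, flag in zip(bits, _FLAGS):
        if ch not in "-ST":
            mode |= flag
    return mode


def group_readable(perm: str) -> bool:
    """Return True if the group-read bit is set in the ls-style string."""
    return bool(mode_from_ls(perm) & stat.S_IRGRP)


# Permission strings taken from the listing in the report.
listing = {"file.out": "-rw-------.", "file.out.index": "-rw-r-----."}
for name, perm in listing.items():
    print(name, oct(mode_from_ls(perm)), group_readable(perm))
```

For the two files in the listing this reports mode 0o600 for file.out (not
group-readable) and 0o640 for file.out.index (group-readable), which matches
the observation that only file.out is unreadable through the group.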