Junping Du created MAPREDUCE-6361: ------------------------------------- Summary: NPE issue in shuffle caused by concurrent issue between copySucceeded() in one thread and copyFailed() in another thread on the same host Key: MAPREDUCE-6361 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6361 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Junping Du Assignee: Junping Du
The failure in log: 2015-05-08 21:00:00,513 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#25 at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.lang.NullPointerException at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:267) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:308) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193) -- This message was sent by Atlassian JIRA (v6.3.4#6332)