[ https://issues.apache.org/jira/browse/MAPREDUCE-6024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102649#comment-14102649 ]

Hudson commented on MAPREDUCE-6024:
-----------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk #1842 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1842/])
MAPREDUCE-6024. Shortened the time when Fetcher is stuck in retrying before concluding the failure by configuration. Contributed by Yunjiong Zhao. (zjshen: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1618677)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/ShuffleSchedulerImpl.java


> java.net.SocketTimeoutException in Fetcher caused jobs stuck for more than 1 hour
> ---------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6024
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6024
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mr-am, task
>            Reporter: zhaoyunjiong
>            Assignee: zhaoyunjiong
>            Priority: Critical
>             Fix For: 2.6.0
>
>         Attachments: MAPREDUCE-6024.1.patch, MAPREDUCE-6024.2.patch, MAPREDUCE-6024.3.patch, MAPREDUCE-6024.4.patch, MAPREDUCE-6024.5.patch, MAPREDUCE-6024.patch
>
>
> 2014-08-04 21:09:42,356 WARN fetcher#33 org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to fake.host.name:13562 with 2 map outputs
> java.net.SocketTimeoutException: Read timed out
>     at java.net.SocketInputStream.socketRead0(Native Method)
>     at java.net.SocketInputStream.read(SocketInputStream.java:129)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>     at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:697)
>     at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:640)
>     at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1195)
>     at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:289)
>     at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)
> 2014-08-04 21:09:42,360 INFO fetcher#33 org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: fake.host.name:13562 freed by fetcher#33 in 180024ms
> 2014-08-04 21:09:55,360 INFO fetcher#33 org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning fake.host.name:13562 with 3 to fetcher#33
> 2014-08-04 21:09:55,360 INFO fetcher#33 org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 3 of 3 to fake.host.name:13562 to fetcher#33
> 2014-08-04 21:12:55,463 WARN fetcher#33 org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to fake.host.name:13562 with 3 map outputs
> java.net.SocketTimeoutException: Read timed out
>     at java.net.SocketInputStream.socketRead0(Native Method)
>     at java.net.SocketInputStream.read(SocketInputStream.java:129)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>     at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:697)
>     at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:640)
>     at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1195)
>     at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:289)
>     at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)
> ...
> 2014-08-04 22:03:13,416 INFO fetcher#33 org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: fake.host.name:13562 freed by fetcher#33 in 271081ms
> 2014-08-04 22:04:13,417 INFO fetcher#33 org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: Assigning fake.host.name:13562 with 3 to fetcher#33
> 2014-08-04 22:04:13,417 INFO fetcher#33 org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl: assigned 3 of 3 to fake.host.name:13562 to fetcher#33
> 2014-08-04 22:07:13,449 WARN fetcher#33 org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to connect to fake.host.name:13562 with 3 map outputs
> java.net.SocketTimeoutException: Read timed out
>     at java.net.SocketInputStream.socketRead0(Native Method)
>     at java.net.SocketInputStream.read(SocketInputStream.java:129)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>     at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:697)
>     at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:640)
>     at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1195)
>     at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:289)
>     at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)



--
This message was sent by Atlassian JIRA
(v6.2#6252)