[ https://issues.apache.org/jira/browse/HADOOP-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12555310#action_12555310 ]
Hudson commented on HADOOP-2492:
--------------------------------
Integrated in Hadoop-Nightly #353 (See
[http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/353/])
> ConcurrentModificationException in org.apache.hadoop.ipc.Server.Responder
> -------------------------------------------------------------------------
>
> Key: HADOOP-2492
> URL: https://issues.apache.org/jira/browse/HADOOP-2492
> Project: Hadoop
> Issue Type: Bug
> Components: ipc
> Affects Versions: 0.16.0
> Reporter: Devaraj Das
> Assignee: dhruba borthakur
> Fix For: 0.16.0
>
> Attachments: rpcexception.patch
>
>
> I was running Hadoop on 800 machines. After a couple of jobs had run, and
> after the current job had run 100% of its maps, the JobTracker stopped
> responding and *all* tasktrackers were lost. When I looked at the JT logs,
> these entries seemed alarming:
> 2007-12-26 19:18:30,185 WARN org.apache.hadoop.ipc.Server: Exception in
> Responder java.util.ConcurrentModificationException
> Following the above exception, I saw a whole lot of exceptions like:
> 2007-12-26 19:23:10,926 WARN org.apache.hadoop.ipc.Server: Call queue
> overflow discarding oldest call heartbeat([EMAIL PROTECTED], false, true,
> 1758) from 1.2.3.4:1234
> Judging by the number of call queue overflow warnings, it seemed that the
> JobTracker was not processing RPCs after it got the
> ConcurrentModificationException, and around that time the tasktrackers
> started getting timeouts on RPCs.
> There were two occurrences of the ConcurrentModificationException, but the
> first one did not seem to have any effect on the call queue.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.