[ https://issues.apache.org/jira/browse/HADOOP-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554653 ]
Devaraj Das commented on HADOOP-2492:
-------------------------------------
There is no stack trace because Responder.run catches the exception and logs
only its string form (_LOG.warn("Exception in Responder " + e)_), so the
stack trace is never printed...
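
To illustrate why the log contains only the exception's class name, here is a minimal, self-contained sketch. It is not the Hadoop source: the class and method names (ResponderLoggingSketch, concatenated, withStackTrace) are hypothetical, and it uses plain JDK classes instead of the server's logger. Concatenating a Throwable into a string only calls its toString(), whereas rendering the stack trace has to be done explicitly (logging APIs such as Commons Logging typically do this when the Throwable is passed as a separate argument, e.g. LOG.warn(message, e)).

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.ConcurrentModificationException;

public class ResponderLoggingSketch {

    // Mirrors the message style quoted above: string concatenation calls
    // e.toString(), which yields only the class name (plus the message, if any).
    static String concatenated(Throwable e) {
        return "Exception in Responder " + e;
    }

    // Renders the full stack trace into the message, which is roughly what a
    // logger emits when it receives the Throwable as a separate argument.
    static String withStackTrace(Throwable e) {
        StringWriter buffer = new StringWriter();
        e.printStackTrace(new PrintWriter(buffer, true));
        return "Exception in Responder" + System.lineSeparator() + buffer;
    }

    public static void main(String[] args) {
        Throwable e = new ConcurrentModificationException();
        System.out.println(concatenated(e));   // one line, no frames
        System.out.println(withStackTrace(e)); // includes the "at ..." frames
    }
}
```

The first form reproduces the shape of the WARN line in the JobTracker log quoted below; the second shows the information that is missing from it.
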
> ConcurrentModificationException in org.apache.hadoop.ipc.Server.Responder
> -------------------------------------------------------------------------
>
> Key: HADOOP-2492
> URL: https://issues.apache.org/jira/browse/HADOOP-2492
> Project: Hadoop
> Issue Type: Bug
> Components: ipc
> Affects Versions: 0.16.0
> Reporter: Devaraj Das
> Assignee: dhruba borthakur
> Fix For: 0.16.0
>
>
> I was running Hadoop on 800 machines. After a couple of jobs had run, and
> 100% of the maps of the current job had completed, the JobTracker stopped
> responding and *all* tasktrackers were lost. When I looked at the JT logs,
> this warning seemed alarming:
> 2007-12-26 19:18:30,185 WARN org.apache.hadoop.ipc.Server: Exception in
> Responder java.util.ConcurrentModificationException
> Following the above exception, I saw a large number of warnings like:
> 2007-12-26 19:23:10,926 WARN org.apache.hadoop.ipc.Server: Call queue
> overflow discarding oldest call heartbeat([EMAIL PROTECTED], false, true,
> 1758) from 1.2.3.4:1234
> Judging by the number of call queue overflow warnings, it seemed that the
> jobtracker was not processing RPCs after it got the
> ConcurrentModificationException, and around that time the tasktrackers
> started getting timeouts on RPCs...
> There were two occurrences of the ConcurrentModificationException, but the
> first instance seemed not to have any effect on the call queue...
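
As general background on the exception itself, and not as a diagnosis of this bug (the root cause in Responder is not stated here), the following sketch shows the usual way a java.util.ConcurrentModificationException arises: a collection is structurally modified while an iterator over it is in use. The class, method, and element names are hypothetical and chosen only for illustration. The same fail-fast check can also fire when another thread modifies the collection during iteration.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

public class CmeSketch {

    // Removes through the list while a for-each iterator is active.
    // Returns true if the iterator detected the change and threw.
    static boolean removeDuringForEach(List<String> calls) {
        try {
            for (String call : calls) {
                if (call.startsWith("stale")) {
                    calls.remove(call); // structural change behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes through the iterator itself, which keeps its bookkeeping consistent.
    static void removeViaIterator(List<String> calls) {
        Iterator<String> it = calls.iterator();
        while (it.hasNext()) {
            if (it.next().startsWith("stale")) {
                it.remove();
            }
        }
    }

    // Small hypothetical data set: one entry to drop, two to keep.
    static List<String> sample() {
        List<String> calls = new ArrayList<>();
        calls.add("stale-1");
        calls.add("live-1");
        calls.add("live-2");
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(removeDuringForEach(sample()));
        List<String> calls = sample();
        removeViaIterator(calls);
        System.out.println(calls);
    }
}
```
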