[ https://issues.apache.org/jira/browse/HDFS-1203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jakob Homan updated HDFS-1203:
------------------------------

     Hadoop Flags: [Reviewed]
       Issue Type: Improvement  (was: Bug)
    Fix Version/s: 0.22.0

+1. This sounds reasonable.

> DataNode should sleep before reentering service loop after an exception
> -----------------------------------------------------------------------
>
>                 Key: HDFS-1203
>                 URL: https://issues.apache.org/jira/browse/HDFS-1203
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.22.0
>
>         Attachments: hdfs-1203.txt
>
>
> When the DN gets an exception in response to a heartbeat, it logs it and
> continues, but there is no sleep. I've occasionally seen bugs produce a case
> where heartbeats continuously produce exceptions, and thus the DN floods the
> NN with bad heartbeats. Adding a 1 second sleep at least throttles the error
> messages for easier debugging and error isolation.
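
For readers without the attachment at hand, the idea in the description can be sketched roughly as follows. This is not the contents of hdfs-1203.txt and not the actual DataNode code: the class ThrottledServiceLoop, the Heartbeat interface, and the run() method are hypothetical names made up for illustration. Only the 1 second pause after a failed heartbeat comes from the description above.

```java
import java.io.IOException;

/** Hypothetical stand-in for a DataNode-style service loop; not the real HDFS class. */
final class ThrottledServiceLoop {

    /** Hypothetical interface representing one heartbeat sent to the NameNode. */
    interface Heartbeat {
        void send() throws Exception;
    }

    /** The 1 second pause suggested in the issue description. */
    static final long DEFAULT_ERROR_SLEEP_MILLIS = 1000L;

    private final Heartbeat heartbeat;
    private final long errorSleepMillis;

    ThrottledServiceLoop(Heartbeat heartbeat, long errorSleepMillis) {
        this.heartbeat = heartbeat;
        this.errorSleepMillis = errorSleepMillis;
    }

    /**
     * Sends up to maxIterations heartbeats and returns how many of them failed.
     * A bounded loop is used here so the sketch terminates; a real service
     * loop would run until shutdown is requested.
     */
    int run(int maxIterations) {
        int failures = 0;
        for (int i = 0; i < maxIterations; i++) {
            try {
                heartbeat.send();
            } catch (Exception e) {
                failures++;
                System.err.println("Exception in heartbeat: " + e);
                try {
                    // Pause before reentering the loop so a persistent error
                    // cannot turn into a tight loop of failing heartbeats.
                    Thread.sleep(errorSleepMillis);
                } catch (InterruptedException ie) {
                    // Restore the interrupt flag and stop, so shutdown is not delayed.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // A heartbeat that always fails, to show the throttling behaviour.
        ThrottledServiceLoop loop = new ThrottledServiceLoop(
                () -> { throw new IOException("bad heartbeat"); },
                DEFAULT_ERROR_SLEEP_MILLIS);
        System.out.println("failures: " + loop.run(3));
    }
}
```

With a heartbeat that fails every time, the loop above produces at most about one error message per second instead of failing as fast as the CPU allows, which is the throttling the description asks for. Whether a fixed sleep or some form of backoff is preferable is a separate design question that this sketch does not try to answer.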