Re: Heartbeat interval and timeout: why 3 secs and 10 min?

2013-03-13 Thread Suresh Srinivas
You are right, in the heartbeat response the namenode sends commands to the datanode. Commands sent this way include deletion of blocks, replication, block recovery, secret key updates, etc. Increasing the heartbeat interval results in the namenode not being able to quickly act on the events in the cluster and

Re: Heartbeat interval and timeout: why 3 secs and 10 min?

2013-03-13 Thread Colin McCabe
My understanding is that the 10 minute timeout helps to avoid replication storms, especially during startup. You might be interested in HDFS-3703, which adds a stale state which datanodes are placed into after 30 seconds of missing heartbeats. (This is an optional feature controlled by

Re: Heartbeat interval and timeout: why 3 secs and 10 min?

2013-03-13 Thread André Oriani
Thanks Colin and Suresh! On Wed, Mar 13, 2013 at 3:08 PM, Colin McCabe cmcc...@alumni.cmu.edu wrote: My understanding is that the 10 minute timeout helps to avoid replication storms, especially during startup. You might be interested in HDFS-3703, which adds a stale state which datanodes are

Re: Heartbeat interval and timeout: why 3 secs and 10 min?

2013-03-12 Thread André Oriani
No take on this one? In ZooKeeper the heartbeats happen at every third of the timeout. If I am not mistaken, the recommended timeout is more than 2 minutes to avoid false positives. But I still cannot see the relationship in HDFS between heartbeat interval and timeout. Okay, 10 minutes seems to be a

Heartbeat interval and timeout: why 3 secs and 10 min?

2013-03-07 Thread André Oriani
Hi, Is there any particular reason why the default heartbeat interval is 3 seconds and the timeout is 10 minutes? Everywhere I looked (code, Google, ...) only mentions the values, with no clue as to why those values were chosen. Thanks in advance, André Oriani
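
For readers looking for how the two values in the question relate: as far as I know, the namenode does not take the dead-node timeout from a single setting but derives it from the heartbeat interval and a separate recheck interval. The sketch below reproduces that arithmetic in Python. The function and parameter names are hypothetical, and the property names and defaults in the comments (dfs.heartbeat.interval = 3 s, dfs.namenode.heartbeat.recheck-interval = 5 min) are assumptions that should be checked against the hdfs-default.xml of your release.

```python
# Rough sketch, not Hadoop code: how the namenode's dead-node timeout is
# believed to be derived from the heartbeat settings.
# Assumed defaults (verify against hdfs-default.xml for your release):
#   dfs.heartbeat.interval                  = 3 seconds
#   dfs.namenode.heartbeat.recheck-interval = 300000 ms (5 minutes)

def heartbeat_expire_interval_ms(heartbeat_interval_s=3,
                                 recheck_interval_ms=5 * 60 * 1000):
    """Return the time in ms after which a silent datanode is declared dead.

    Assumed formula: 2 * recheck interval + 10 * heartbeat interval.
    """
    return 2 * recheck_interval_ms + 10 * 1000 * heartbeat_interval_s


if __name__ == "__main__":
    timeout_ms = heartbeat_expire_interval_ms()
    # With the assumed defaults this is 630000 ms, i.e. 10.5 minutes,
    # which matches the "10 min" figure asked about above.
    print(timeout_ms, "ms =", timeout_ms / 60000, "minutes")
```

Under this formula the recheck interval dominates: doubling the heartbeat interval to 6 seconds only moves the timeout from 10.5 to 11 minutes, which may be why the two values look unrelated at first glance.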