Actually, it is not complex. o2cb timeouts: if you are not using multipathing or network bonding, leave the timeouts as is. If you are using multipathing, double the disk heartbeat to 120 secs. If you are using network bonding, double the network idle timeout to 60 secs. Ensure your private network has no loops, to prevent the spanning tree protocol from interfering. Leave the reconnect/keepalive timeouts as is.
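In 1.4 these settings land in /etc/sysconfig/o2cb (normally written by "service o2cb configure"). A minimal sketch of the doubled values described above, assuming the 1.4 variable names; note the heartbeat threshold is counted in 2-second iterations, so a 120-sec disk heartbeat corresponds to a threshold of 61:

```shell
# Hypothetical /etc/sysconfig/o2cb fragment for a multipath + bonded-NIC
# setup, assuming the OCFS2 1.4 o2cb init-script variable names.
# Normally set via "service o2cb configure", not edited by hand.

O2CB_HEARTBEAT_THRESHOLD=61    # disk heartbeat doubled for multipath
O2CB_IDLE_TIMEOUT_MS=60000     # network idle doubled for bonding
O2CB_KEEPALIVE_DELAY_MS=2000   # left at default
O2CB_RECONNECT_DELAY_MS=2000   # left at default

# Sanity check: the threshold is in 2-second heartbeat iterations, so the
# effective disk heartbeat dead time is (threshold - 1) * 2 seconds.
echo "disk heartbeat dead time: $(( (O2CB_HEARTBEAT_THRESHOLD - 1) * 2 )) secs"
```

Running the check prints "disk heartbeat dead time: 120 secs", i.e. the doubled value.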
Next, match the node numbers between the CSS and O2CB clusters. This is documented in the 1.4 user's guide. Configure the CSS cluster first, then edit the o2cb cluster.conf so that the nodes in both stacks are numbered the same.

Lastly, do not place the CRS home on OCFS2; keep it on local volumes. The DB home can be on OCFS2.

When a node dies, most of the time is spent in node death detection; the actual recovery is fairly quick. During node death detection, the fs does not block any I/Os unless it has to. By that I mean: say an I/O requires the node to take a lock that the supposedly dead node held. In that case, the I/O will be blocked until after the recovery. But the node will continue to do I/O if it already holds all the locks it needs.

We use this to our advantage with the voting disk. The CSS voting disk I/Os are non-extending O_DIRECT writes, so they are not blocked during detection. They are only blocked during the actual recovery, which is fairly short. The default CSS timeouts are much larger than the recovery time.

But no one is saying you have to have the voting disk on OCFS2. It could be on a separate raw device too. If that is the case, then the closest timeout that I am aware of is the default 15 mins for the database controlfile lock. The o2cb timeouts are much shorter than that.

Sunil

Schmitter, Martin wrote:
> Hi Devender,
>
> this is a very complex question.
>
> Timeouts must be set in conjunction with your infrastructure. What type of
> storage? What OCFS2 version? Etc. ...
>
> The major problem is to synchronize the timeouts with the CRS timeouts to
> prevent conflicting decisions. In fact, I am pretty sure you won't get a
> default value or suggestion.
>
> In general you have to do a lot of tests!
>
> Good practice for me:
>
> Heartbeat dead threshold = around 61
> network idle timeout = around 70000
> network keepalive delay in ms = 5000
> network reconnect delay in ms = 5000
>
> in a multipath environment with a virtual SAN.
> As I already mentioned, timeouts have to be set in conjunction with your
> infrastructure and SAN system. This could be totally different for your
> needs. Do not take OCFS2 with CRS lightly. It is very difficult, and make
> sure you are using the latest releases.
>
> Everything without warranty! Good luck.
>
> Regards,
>
> Martin
>
> ________________________________________
> From: ocfs2-users-boun...@oss.oracle.com [ocfs2-users-boun...@oss.oracle.com]
> on behalf of Devender Narula [devendernar...@yahoo.com]
> Sent: Friday, June 5, 2009 12:59
> To: ocfs2-users@oss.oracle.com
> Subject: [Ocfs2-users] Default Values of heartbeat dead threshold
>
> Hi Guys,
>
> I have a two-node RAC cluster running on RHEL 5.0. I just want to know the
> Oracle-recommended default values for the below-mentioned parameters.
>
> Thanks for your help.
>
> Heartbeat dead threshold
> network idle timeout
> network keepalive delay in ms
> network reconnect delay in ms
> kernel.panic_on_oops
> kernel.panic
>
> Regards,
>
> Devender

_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users