There are two heartbeats in OCFS2: one on disk and the other on the
network.
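
To make the distinction concrete, here is a toy model of how two independent heartbeats could be combined to describe a peer's state. This is only a sketch and is not OCFS2 code: the names (PeerState, classify) and both timeout values are assumptions made up for illustration, not the actual defaults or internals.

```python
from dataclasses import dataclass

# Hypothetical thresholds for illustration only; these are NOT OCFS2 defaults.
DISK_TIMEOUT_S = 10.0
NET_TIMEOUT_S = 10.0


@dataclass
class PeerState:
    """What one node knows about a peer (hypothetical structure)."""
    last_disk_stamp: float  # when the peer last wrote its timestamp to shared disk
    last_net_msg: float     # when we last heard from the peer over the network


def classify(peer: PeerState, now: float) -> str:
    """Report which of the two heartbeats, if any, has gone quiet."""
    disk_ok = (now - peer.last_disk_stamp) <= DISK_TIMEOUT_S
    net_ok = (now - peer.last_net_msg) <= NET_TIMEOUT_S
    if disk_ok and net_ok:
        return "healthy"
    if disk_ok:
        return "network heartbeat lost"
    if net_ok:
        return "disk heartbeat lost"
    return "both heartbeats lost"


if __name__ == "__main__":
    # The peer is still stamping the disk but has been silent on the network.
    print(classify(PeerState(last_disk_stamp=100.0, last_net_msg=80.0), now=105.0))
```

The point of the sketch is only that the two checks are independent, so a node can look alive on disk while being unreachable over the network, which is the situation described in this thread.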
Randy Ramsdell wrote:
Sunil Mushran wrote:
Means there was a network hiccup that caused Node 1 to fence itself.
The problem is that our default timeout is too low. We have already
addressed this in mainline and are looking to add that patch into 1.2.5.
I am unclear as to your last question.
Randy Ramsdell wrote:
Hi,
OK, I'll try this again since there seem to be more people reading this
list.
I don't quite understand the log messages regarding fencing. Should the
other nodes in the cluster that lost network connectivity state
something about quorum/fencing etc...?
Is it true that the network timeout parameter can be set in 1.2.4? If
not, can I change the setting myself before compiling?
What will we see in the logs if a node cannot write to the clusterfs but
the heartbeat still works?
<snip>
I see we had a network hiccup (actually it was the load), but I was
really trying to "iron out" the reason why our logs don't mention the
fencing. As a matter of fact, I have never seen the other nodes log
a node fencing. Although I know it may be a small detail, I am curious
why I never see that message when many others do in this type
of situation.
The third question I asked was: What will we see in the logs if a node
cannot write to the clusterfs but the heartbeat still works?
As I understand it, there are two ways the cluster notifies nodes about
cluster connectivity:
1. Heartbeat on port 7777.
2. Each node writes timestamps to the clusterfs.
I'm just a little fuzzy on this area.
RCR
_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users