Hi Alain,

On 11/09/2010 04:49 PM, Alain.Moulle wrote:
> Hi,
> The three cluster.conf are exactly the same on the 3 nodes.
> The error messages are:
>
> -nodes1:
> o2net: accepted connection from node selfxl-5 (num 1) at
> 10.197.189.218:7777
> o2net: no longer connected to node selfxl-5 (num 1) at
> 10.197.189.218:7777
>
> -nodes2:
> (1457,1):o2net_connect_expired:1656 ERROR: no connection established
> with node 1 after 30.0 seconds, giving up and returning errors.
>
> Note that once a mount is refused, for example on node3, if
> I umount the FS on node1 for example, then I can mount it
> on node3.
Oh, so do you have enough slots for all these 3 nodes to mount?
What's the output of the command below?
echo 'stats'|debugfs.ocfs2 /dev/sdx|grep Slots

Regards,
Tao
> Note also that when the mount is refused, for example on node3,
> I've checked that node3 "pings" both other nodes successfully
> on the IP addresses given in cluster.conf.
>
> Alain
>
>
> Tao Ma wrote:
>> Hi Alain,
>>
>> On 11/08/2010 11:08 PM, Alain.Moulle wrote:
>>
>>> Hi,
>>>
>>> I have a problem on Fedora13 with releases:
>>> ocfs2 1.4.3-5.fc13.x86_64
>>> dlm_tool 3.0.17
>>>
>>> With a 3-node ocfs2 cluster, I can't mount the FS on the three nodes
>>> at the same time, but only on two nodes among the 3, whatever the
>>> two nodes are.
>>>
>>> The errors are:
>>> (1475,0):o2net_connect_expired:1656 ERROR: no connection established
>>> with node 2 after 30.0 seconds, giving up and returning errors.
>>> (2175,0):dlm_request_join:1035 ERROR: status = -107
>>> (2175,0):dlm_try_to_join_domain:1209 ERROR: status = -107
>>> (2175,0):dlm_join_domain:1487 ERROR: status = -107
>>> (2175,0):dlm_register_domain:1753 ERROR: status = -107
>>> (2175,0):o2cb_cluster_connect:313 ERROR: status = -107
>>> (2175,0):ocfs2_dlm_init:2995 ERROR: status = -107
>>> (2175,0):ocfs2_mount_volume:1789 ERROR: status = -107
>>> ocfs2: Unmounting device (8,16) on (node 0)
>>> o2net: no longer connected to node selfxl-4 (num 0) at
>>> 10.197.189.204:7777
>>> o2net: connected to node selfxl-4 (num 0) at 10.197.189.204:7777
>>>
>>> It seems to be a lock management problem.
>>> Is it an already known issue?
>>> Is there an available patch?
>>>
>> It doesn't look like a dlm problem, but a network problem. ;)
>> So your first error is o2net_connect_expired.
>> So it seems that the 3rd node can't connect with node 2.
>> Could you please check the error message in node 2?
>>
>> btw, I would deem that the cluster.conf is the same among the 3 nodes,
>> and that you can connect to 7777 (which is used by ocfs2) of node 2
>> from node 3.
>>
>> Regards,
>> Tao
>>
>> _______________________________________________
>> Ocfs2-users mailing list
>> Ocfs2-users@oss.oracle.com
>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
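
A successful ping only shows that ICMP gets through; the check asked for earlier in the thread is a TCP connection to port 7777 on the other nodes. Below is a minimal sketch of that check, assuming Python 3 is available on the node where the mount is refused. The function name can_connect and the 3-second timeout are illustrative choices, not part of ocfs2-tools.

```python
import socket
import sys


def can_connect(host: str, port: int = 7777, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection performs the full TCP handshake, which ping does not.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Hosts are passed on the command line; nothing is probed by default.
    for host in sys.argv[1:]:
        state = "reachable" if can_connect(host) else "NOT reachable"
        print(f"{host}:7777 {state}")
```

Saved under a hypothetical name such as check_o2net.py, it could be run on node3 with the addresses of the other nodes from cluster.conf, for example `python3 check_o2net.py 10.197.189.204 10.197.189.218`. A "NOT reachable" result while ping works would point to a firewall rule or to o2cb not listening on that node.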