Re: [Ocfs2-users] O2CB heartbeat not active on 2nd node
Thanks for your push on this. I don't know why I didn't check this before.
It looks like we have a problem with our interconnect. Our SA is checking
the hardware now.

[r...@nyclx1 ~]# more /etc/ocfs2/cluster.conf
node:
    ip_port =
    ip_address = 192.168.0.218
    number = 0
    name = nyclx1
    cluster = tiaa

node:
    ip_port =
    ip_address = 192.168.0.217
    number = 1
    name = nyclx2
    cluster = tiaa

cluster:
    node_count = 2
    name = tiaa

[r...@nyclx1 ~]# ping 192.168.0.217
PING 192.168.0.217 (192.168.0.217) 56(84) bytes of data.
From 192.168.0.218 icmp_seq=2 Destination Host Unreachable
From 192.168.0.218 icmp_seq=3 Destination Host Unreachable
From 192.168.0.218 icmp_seq=4 Destination Host Unreachable

-----Original Message-----
From: Sunil Mushran [mailto:sunil.mush...@oracle.com]
Sent: Wednesday, June 03, 2009 6:19 PM
To: McKinley, Reid
Cc: ocfs2-users@oss.oracle.com
Subject: Re: [Ocfs2-users] O2CB heartbeat not active on 2nd node

Do on both nodes:
$ netstat -ta --numeric-ports

Maybe port is already in use. Check your setup again. Ensure cluster.conf
is the same on both nodes. And that the ips are correct. That tcpdump was
capturing the traffic on the correct interface. etc. etc.

This message, including any attachments, contains confidential information
intended for a specific individual and purpose, and is protected by law.
If you are not the intended recipient, please contact the sender
immediately by reply e-mail and destroy all copies. You are hereby notified
that any disclosure, copying, or distribution of this message, or the
taking of any action based on it, is strictly prohibited.

TIAA-CREF

___
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users
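The failed ping above points at the interconnect itself. As a complementary check at the TCP level, here is a minimal Python sketch that is not part of the original exchange. `can_connect` is a hypothetical helper name, and the port is left as a parameter because the thread does not show the value of `ip_port`.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

Run from nyclx1 as `can_connect("192.168.0.217", port)`, with `port` set to the `ip_port` value from cluster.conf, this would be expected to return False for as long as the interconnect is down. A True result only shows that something is listening on that port, not that it is o2cb.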
Re: [Ocfs2-users] O2CB heartbeat not active on 2nd node
Do on both nodes:
$ netstat -ta --numeric-ports

Maybe port is already in use. Check your setup again. Ensure cluster.conf
is the same on both nodes. And that the ips are correct. That tcpdump was
capturing the traffic on the correct interface. etc. etc.

McKinley, Reid wrote:
> Yes, I had tcpdump running in separate sessions on both servers.
>
> The port is correct. Here is the cluster.conf.
>
> node:
>     ip_port =
>     ip_address = 192.168.0.218
>     number = 0
>     name = nyclx1
>     cluster = tiaa
>
> node:
>     ip_port =
>     ip_address = 192.168.0.217
>     number = 1
>     name = nyclx2
>     cluster = tiaa
>
> cluster:
>     node_count = 2
>     name = tiaa
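The advice to make sure cluster.conf is the same on both nodes can be checked mechanically. The following is a rough Python sketch under stated assumptions: it treats the file as stanza headers ending in a colon followed by `key = value` lines, which is the layout shown in this thread, and the function names `parse_cluster_conf` and `conf_differences` are made up for illustration.

```python
def parse_cluster_conf(text: str) -> list:
    """Parse cluster.conf text into a list of (stanza_type, {key: value}) pairs."""
    stanzas = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":") and "=" not in line:
            # A stanza header such as "node:" or "cluster:".
            stanzas.append((line[:-1], {}))
        elif "=" in line and stanzas:
            # A "key = value" line; the value may be empty.
            key, _, value = line.partition("=")
            stanzas[-1][1][key.strip()] = value.strip()
    return stanzas


def conf_differences(conf_a: str, conf_b: str) -> list:
    """Describe how two cluster.conf texts differ; an empty list means no difference."""
    a, b = parse_cluster_conf(conf_a), parse_cluster_conf(conf_b)
    problems = []
    if len(a) != len(b):
        problems.append(f"stanza count differs: {len(a)} vs {len(b)}")
    for index, ((kind_a, vals_a), (kind_b, vals_b)) in enumerate(zip(a, b)):
        if kind_a != kind_b:
            problems.append(f"stanza {index}: type {kind_a!r} vs {kind_b!r}")
            continue
        for key in sorted(set(vals_a) | set(vals_b)):
            if vals_a.get(key) != vals_b.get(key):
                problems.append(
                    f"stanza {index} ({kind_a}): {key} = "
                    f"{vals_a.get(key)!r} vs {vals_b.get(key)!r}"
                )
    return problems
```

Feeding it the contents of /etc/ocfs2/cluster.conf from each node and printing the returned list gives a quick answer; comparing stanzas in order is a deliberate simplification, so two files that list the same nodes in a different order would be reported as different.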
Re: [Ocfs2-users] O2CB heartbeat not active on 2nd node
Yes, I had tcpdump running in separate sessions on both servers.

The port is correct. Here is the cluster.conf.

node:
    ip_port =
    ip_address = 192.168.0.218
    number = 0
    name = nyclx1
    cluster = tiaa

node:
    ip_port =
    ip_address = 192.168.0.217
    number = 1
    name = nyclx2
    cluster = tiaa

cluster:
    node_count = 2
    name = tiaa

-----Original Message-----
From: Sunil Mushran [mailto:sunil.mush...@oracle.com]
Sent: Wednesday, June 03, 2009 5:35 PM
To: McKinley, Reid
Cc: ocfs2-users@oss.oracle.com
Subject: Re: [Ocfs2-users] O2CB heartbeat not active on 2nd node

Did you have tcpdump running on a terminal when you attempted the mount on
another terminal? Is the interface and port correct?

It is one thing to not see the packets on the nyclx2. But what confuses me
is that there is no traffic on nyclx1 too.
Re: [Ocfs2-users] O2CB heartbeat not active on 2nd node
Did you have tcpdump running on a terminal when you attempted the mount on
another terminal? Is the interface and port correct?

It is one thing to not see the packets on the nyclx2. But what confuses me
is that there is no traffic on nyclx1 too.

McKinley, Reid wrote:
> We can bring up the ocfs2 cluster on 1 of 2 nodes only. So, it appears
> that it's not specific to only one specific node. Right now we have the
> ocfs2 heartbeat operational on node2 (node1 in the cluster.conf).
>
> Here are the results of the mount and tcpdump.
>
> [r...@nyclx1 ~]# mount -t ocfs2 /dev/mapper/mpath1 /oragrid
> mount.ocfs2: Transport endpoint is not connected while mounting
> /dev/mapper/mpath1 on /oragrid. Check 'dmesg' for more information on
> this error.
> [r...@nyclx1 ~]#
>
> [r...@nyclx2 ~]# tcpdump -i eth1 -s 2500 -ttt 'port '
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on eth1, link-type EN10MB (Ethernet), capture size 2500 bytes
>
> 0 packets captured
> 0 packets received by filter
> 0 packets dropped by kernel
>
> [r...@nyclx2 ~]# /etc/init.d/o2cb status
> Driver for "configfs": Loaded
> Filesystem "configfs": Mounted
> Driver for "ocfs2_dlmfs": Loaded
> Filesystem "ocfs2_dlmfs": Mounted
> Checking O2CB cluster tiaa: Online
> Heartbeat dead threshold = 31
> Network idle timeout: 3
> Network keepalive delay: 2000
> Network reconnect delay: 2000
> Checking O2CB heartbeat: Active
>
> [r...@nyclx1 ~]# tcpdump -i eth1 -s 2500 -ttt 'port '
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on eth1, link-type EN10MB (Ethernet), capture size 2500 bytes
>
> 0 packets captured
> 0 packets received by filter
> 0 packets dropped by kernel
>
> [r...@nyclx1 ~]# /etc/init.d/o2cb status
> Driver for "configfs": Loaded
> Filesystem "configfs": Mounted
> Driver for "ocfs2_dlmfs": Loaded
> Filesystem "ocfs2_dlmfs": Mounted
> Checking O2CB cluster tiaa: Online
> Heartbeat dead threshold = 31
> Network idle timeout: 3
> Network keepalive delay: 2000
> Network reconnect delay: 2000
> Checking O2CB heartbeat: Not active
Re: [Ocfs2-users] O2CB heartbeat not active on 2nd node
We can bring up the ocfs2 cluster on 1 of 2 nodes only. So, it appears
that it's not specific to only one specific node. Right now we have the
ocfs2 heartbeat operational on node2 (node1 in the cluster.conf).

Here are the results of the mount and tcpdump.

[r...@nyclx1 ~]# mount -t ocfs2 /dev/mapper/mpath1 /oragrid
mount.ocfs2: Transport endpoint is not connected while mounting
/dev/mapper/mpath1 on /oragrid. Check 'dmesg' for more information on
this error.
[r...@nyclx1 ~]#

[r...@nyclx2 ~]# tcpdump -i eth1 -s 2500 -ttt 'port '
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 2500 bytes

0 packets captured
0 packets received by filter
0 packets dropped by kernel

[r...@nyclx2 ~]# /etc/init.d/o2cb status
Driver for "configfs": Loaded
Filesystem "configfs": Mounted
Driver for "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster tiaa: Online
Heartbeat dead threshold = 31
Network idle timeout: 3
Network keepalive delay: 2000
Network reconnect delay: 2000
Checking O2CB heartbeat: Active

[r...@nyclx1 ~]# tcpdump -i eth1 -s 2500 -ttt 'port '
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 2500 bytes

0 packets captured
0 packets received by filter
0 packets dropped by kernel

[r...@nyclx1 ~]# /etc/init.d/o2cb status
Driver for "configfs": Loaded
Filesystem "configfs": Mounted
Driver for "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster tiaa: Online
Heartbeat dead threshold = 31
Network idle timeout: 3
Network keepalive delay: 2000
Network reconnect delay: 2000
Checking O2CB heartbeat: Not active

-----Original Message-----
From: Sunil Mushran [mailto:sunil.mush...@oracle.com]
Sent: Wednesday, June 03, 2009 4:55 PM
To: McKinley, Reid
Cc: ocfs2-users@oss.oracle.com
Subject: Re: [Ocfs2-users] O2CB heartbeat not active on 2nd node

Do:
$ tcpdump -i ethX -s 2500 -ttt 'port '

on both nodes. Replace ethX with the appropriate interface.
Then issue the mount command on node 1. Do you see the traffic
on node 0?
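The two `o2cb status` listings above differ only in the heartbeat line, which is easy to miss by eye. As an illustration only, this Python sketch parses the status text into fields and reports the ones that differ between nodes. It assumes each status line is a label followed by `:` or `=` and a value, as in the output pasted in this thread; `parse_o2cb_status` and `status_differences` are hypothetical names.

```python
import re

# A label is separated from its value by ":" or "=", with optional spaces.
_SEPARATOR = re.compile(r"\s*(?::|=)\s*")


def parse_o2cb_status(output: str) -> dict:
    """Turn '/etc/init.d/o2cb status' text into a {label: value} dict."""
    fields = {}
    for line in output.splitlines():
        parts = _SEPARATOR.split(line.strip(), maxsplit=1)
        if len(parts) == 2 and parts[0]:
            fields[parts[0]] = parts[1]
    return fields


def status_differences(node_a: str, node_b: str) -> dict:
    """Return {label: (value_on_a, value_on_b)} for every label that differs."""
    a, b = parse_o2cb_status(node_a), parse_o2cb_status(node_b)
    return {
        label: (a.get(label), b.get(label))
        for label in sorted(set(a) | set(b))
        if a.get(label) != b.get(label)
    }
```

Given the nyclx2 and nyclx1 listings from this message, the only entry returned should be the `Checking O2CB heartbeat` label, with `Active` on one side and `Not active` on the other.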
Re: [Ocfs2-users] O2CB heartbeat not active on 2nd node
Do:
$ tcpdump -i ethX -s 2500 -ttt 'port '

on both nodes. Replace ethX with the appropriate interface.
Then issue the mount command on node 1. Do you see the traffic
on node 0?

McKinley, Reid wrote:
> No, iptables is shutdown and disabled. No firewalls.
>
> [r...@nyclx1 ~]# service iptables status
> Firewall is stopped.
Re: [Ocfs2-users] O2CB heartbeat not active on 2nd node
No, iptables is shutdown and disabled. No firewalls.

[r...@nyclx1 ~]# service iptables status
Firewall is stopped.

-----Original Message-----
From: Sunil Mushran [mailto:sunil.mush...@oracle.com]
Sent: Wednesday, June 03, 2009 12:57 PM
To: McKinley, Reid
Cc: ocfs2-users@oss.oracle.com
Subject: Re: [Ocfs2-users] O2CB heartbeat not active on 2nd node

The connect requests are not getting through. Do you have any firewalls
setup? Is iptables running? If so, either shut it down or allow traffic on
the o2cb port.
Re: [Ocfs2-users] O2CB heartbeat not active on 2nd node
The connect requests are not getting through. Do you have any firewalls
setup? Is iptables running? If so, either shut it down or allow traffic on
the o2cb port.

McKinley, Reid wrote:
> We are having trouble getting the 2nd node in our 2 node RAC
> configuration to have an active O2CB heartbeat. We have our OCR and
> voting disks on an OCFS2 mount point, so we cannot bring up
> Clusterware on this node.
>
> I'm at a loss as to what the issue is. It was running fine for a few
> weeks, then we had a reboot and we cannot get the heartbeat active and
> we cannot mount any OCFS2 filesystems on the 2nd node.
>
> Any ideas are greatly appreciated.
[Ocfs2-users] O2CB heartbeat not active on 2nd node
We are having trouble getting the 2nd node in our 2 node RAC
configuration to have an active O2CB heartbeat. We have our OCR and
voting disks on an OCFS2 mount point, so we cannot bring up
Clusterware on this node.

I'm at a loss as to what the issue is. It was running fine for a few
weeks, then we had a reboot and we cannot get the heartbeat active and
we cannot mount any OCFS2 filesystems on the 2nd node.

Any ideas are greatly appreciated.

Dmesg errors are at the bottom.

Here are the rpm and status details:

[r...@nyclx1 ~]# rpm -qa | grep ocfs2
ocfs2-tools-1.4.1-1.el5
ocfs2console-1.4.1-1.el5
ocfs2-2.6.18-92.el5-1.4.1-1.el5
ocfs2-2.6.18-92.el5debug-1.2.8-2.el5
ocfs2-2.6.18-92.el5-debuginfo-1.4.1-1.el5
ocfs2-tools-debuginfo-1.4.1-1.el5

[r...@nyclx1 ~]# /etc/init.d/o2cb status
Driver for "configfs": Loaded
Filesystem "configfs": Mounted
Driver for "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster tiaa: Online
Heartbeat dead threshold = 31
Network idle timeout: 3
Network keepalive delay: 2000
Network reconnect delay: 2000
Checking O2CB heartbeat: Active

[r...@nyclx2 ~]# rpm -qa | grep ocfs2
ocfs2-tools-1.4.1-1.el5
ocfs2-2.6.18-92.el5-1.4.1-1.el5
ocfs2-tools-debuginfo-1.4.1-1.el5
ocfs2console-1.4.1-1.el5
ocfs2-2.6.18-92.el5debug-1.2.8-2.el5
ocfs2-2.6.18-92.el5-debuginfo-1.4.1-1.el5

[r...@nyclx2 ~]# /etc/init.d/o2cb status
Driver for "configfs": Loaded
Filesystem "configfs": Mounted
Driver for "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster tiaa: Online
Heartbeat dead threshold = 31
Network idle timeout: 3
Network keepalive delay: 2000
Network reconnect delay: 2000
Checking O2CB heartbeat: Not active

OCFS2 1.4.1 Wed Jul 23 12:05:34 PDT 2008 (build 3fc82af4b5669945497b322b6aabd031)
(11296,1):o2net_connect_expired:1637 ERROR: no connection established with node 0 after 30.0 seconds, giving up and returning errors.
(14212,1):dlm_request_join:1033 ERROR: status = -107
(14212,1):dlm_try_to_join_domain:1207 ERROR: status = -107
(14212,1):dlm_join_domain:1485 ERROR: status = -107
(14212,1):dlm_register_domain:1732 ERROR: status = -107
(14212,1):ocfs2_dlm_init:2662 ERROR: status = -107
(14212,1):ocfs2_mount_volume:1251 ERROR: status = -107
ocfs2: Unmounting device (253,2) on (node 1)
(11296,1):o2net_connect_expired:1637 ERROR: no connection established with node 0 after 30.0 seconds, giving up and returning errors.
(14350,1):dlm_request_join:1033 ERROR: status = -107
(14350,1):dlm_try_to_join_domain:1207 ERROR: status = -107
(14350,1):dlm_join_domain:1485 ERROR: status = -107
(14350,1):dlm_register_domain:1732 ERROR: status = -107
(14350,1):ocfs2_dlm_init:2662 ERROR: status = -107
(14350,1):ocfs2_mount_volume:1251 ERROR: status = -107
ocfs2: Unmounting device (253,3) on (node 1)
(11296,1):o2net_connect_expired:1637 ERROR: no connection established with node 0 after 30.0 seconds, giving up and returning errors.
(4347,1):dlm_request_join:1033 ERROR: status = -107
(4347,1):dlm_try_to_join_domain:1207 ERROR: status = -107
(4347,1):dlm_join_domain:1485 ERROR: status = -107
(4347,1):dlm_register_domain:1732 ERROR: status = -107
(4347,1):ocfs2_dlm_init:2662 ERROR: status = -107
(4347,1):ocfs2_mount_volume:1251 ERROR: status = -107
ocfs2: Unmounting device (253,3) on (node 1)
(11296,1):o2net_connect_expired:1637 ERROR: no connection established with node 0 after 30.0 seconds, giving up and returning errors.
(4948,1):dlm_request_join:1033 ERROR: status = -107
(4948,1):dlm_try_to_join_domain:1207 ERROR: status = -107
(4948,1):dlm_join_domain:1485 ERROR: status = -107
(4948,1):dlm_register_domain:1732 ERROR: status = -107
(4948,1):ocfs2_dlm_init:2662 ERROR: status = -107
(4948,1):ocfs2_mount_volume:1251 ERROR: status = -107
ocfs2: Unmounting device (253,3) on (node 1)
OCFS2 Node Manager 1.4.1 Wed Jul 23 12:05:37 PDT 2008 (build 0f78045c75c0174e50e4cf0934bf9eae)
OCFS2 DLM 1.4.1 Wed Jul 23 12:05:37 PDT 2008 (build 4ce8fae327880c466761f40fb7619490)
OCFS2 DLMFS 1.4.1 Wed Jul 23 12:05:37 PDT 2008 (build 4ce8fae327880c466761f40fb7619490)
OCFS2 User DLM kernel interface loaded
[r...@nyclx2 ~]#

Reid McKinley
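The repeated `status = -107` in the dmesg output is a negated errno value. On Linux, errno 107 is ENOTCONN, whose message is "Transport endpoint is not connected", the same text mount.ocfs2 prints elsewhere in this thread. The Python sketch below is an illustration rather than anything from the OCFS2 tools: it pulls the failing function names and status codes out of log text in the format shown above and looks each code up in the errno table of the machine it runs on. The helper names are invented, and the lookup is only meaningful on the same platform as the kernel that wrote the log.

```python
import errno
import os
import re

# Matches entries such as "(14212,1):dlm_request_join:1033 ERROR: status = -107".
_STATUS = re.compile(r"\):(\w+):\d+ ERROR: status = (-?\d+)")


def failing_calls(dmesg_text: str) -> list:
    """Return (function_name, status) pairs for each 'ERROR: status = N' entry."""
    return [(m.group(1), int(m.group(2))) for m in _STATUS.finditer(dmesg_text)]


def describe_status(status: int) -> tuple:
    """Map a negative kernel status to (errno name, message) on this platform."""
    code = abs(status)
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)
```

Every function in the chain from `dlm_request_join` up to `ocfs2_mount_volume` reports the same code, which is consistent with a single cause, the o2net connection to node 0 never being established, propagating up the mount path.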