Hi Florian, florian.engelm...@bt.com wrote: > Hi Tao, > >> Hi florian, >> >> florian.engelm...@bt.com wrote: >>>> Florian, >>>> the problem here seems to be with network. The nodes are running > into >>>> network heartbeat timeout and hence second node is getting fenced. > Do >>>> you see o2net thread consuming 100% cpu on any node? if not then >>>> probably check your network >>>> thanks, >>>> --Srini >>> I forgot to post my /etc/ocfs2/cluster.conf >>> node: >>> ip_port = 7777 >>> ip_address = 192.168.0.101 >>> number = 0 >>> name = defr1elcbtd01 >>> cluster = ocfs2 >>> >>> node: >>> ip_port = 7777 >>> ip_address = 192.168.0.102 >>> number = 1 >>> name = defr1elcbtd02 >>> cluster = ocfs2 >>> >>> cluster: >>> node_count = 2 >>> name = ocfs2 >>> >>> >>> 192.168.0.10x is eth3 on both nodes and connected with a cross over >>> cable. No active network component is involved here. >>> >>> defr1elcbtd02:~# traceroute 192.168.0.101 >>> traceroute to 192.168.0.101 (192.168.0.101), 30 hops max, 52 byte >>> packets >>> 1 node1 (192.168.0.101) 0.220 ms 0.142 ms 0.223 ms >>> defr1elcbtd02:~# >>> >>> The error message looks like a network problem but why should there > be a >>> network problem if I shutdown a FC port?! I testet it about 20 times > and >>> got about 16 kernel panics starting with the same error message: >>> >>> kernel: o2net: no longer connected to node defr1elcbtd01 (num 0) at >>> 192.168.0.101:7777 >> It isn't an error message, just a status report that we can't connect > to >> that node now. That node may be rebooted or something else, but this >> node don't know, and it only knows the connection is down. > > But node defr1elcbtd01 was never down and also the network link (eth3) > wasn't down. I was able to ping from each node to the other. > Node 1 is hosting all services and never was faulted while I was > testing. > > All I have to do to panic node 2 is to disable one of two fibre channel > ports or pull one fibre channel cable or delete node 2 from the cisco > SAN zoning. > If I apply one of those 3 "errors" I get the message about o2net is no > longer connected to node 1 and 60 seconds later the 2nd node panics > because of ocfs2 fencing (but this happens only in about 80% of cases - > in the other 20% of cases o2net does not disconnect and there are no > messages about ocfs2 at all - like it should be...). > Everything else is working fine in these 60 seconds. The filesystem is > still writable from both nodes and both nodes can ping each other (via > the cluster interconnect). I just checked your log. The error why node 2 get the message is that node 1 get the message that node 2 stopped disk heartbeat for quite a long time so it stop the connection intentionally. So node 2 get this message.
See the log in node 1: Jun 8 09:46:26 defr1elcbtd01 kernel: (3804,0):o2quo_hb_down:224 node 1, 1 total Jun 8 09:46:26 defr1elcbtd01 kernel: (3804,0):o2net_set_nn_state:382 node 1 sc: ffff81007ddf4400 -> 0000000000000000, valid 1 -> 0, err 0 -> -107 Jun 8 09:46:26 defr1elcbtd01 kernel: (3804,0):o2quo_conn_err:296 node 1, 1 total Jun 8 09:46:26 defr1elcbtd01 kernel: o2net: no longer connected to node defr1elcbtd02 (num 1) at 192.168.0.102:7777 And I guess the reason why you see this log sometimes(80%) is that the time interval. You know ocfs2 disk heartbeat try every 2 secs so sometimes node 2 panic before node 1 call o2quo_hb_down and sometimes node2 panic after node 1 call o2quo_hb_down(which will put something like "no longer..." in node 2's log). So would you please give your timeout configuration(o2cb)? Regards, Tao > > Here are the logs with debug logging: > > Node 2: > > Jun 8 09:46:11 defr1elcbtd02 kernel: qla2xxx 0000:04:00.0: LOOP DOWN > detected (2). > Jun 8 09:46:11 defr1elcbtd02 kernel: (3463,0):sc_put:289 [sc > ffff81007c2f0800 refs 3 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] put > Jun 8 09:46:11 defr1elcbtd02 kernel: (0,0):o2net_data_ready:452 [sc > ffff81007c2f0800 refs 2 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] data_ready hit > Jun 8 09:46:11 defr1elcbtd02 kernel: (0,0):sc_get:294 [sc > ffff81007c2f0800 refs 2 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] get > Jun 8 09:46:11 defr1elcbtd02 kernel: (3463,0):o2net_advance_rx:1129 [sc > ffff81007c2f0800 refs 3 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] receiving > Jun 8 09:46:11 defr1elcbtd02 kernel: (3463,0):o2net_advance_rx:1170 > [mag 64088 len 0 typ 0 stat 0 sys_stat 0 key 00000000 num 0] at page_off > 24 > Jun 8 09:46:11 defr1elcbtd02 kernel: > (3463,0):o2net_process_message:1015 [mag 64088 len 0 typ 0 stat 0 > sys_stat 0 key 00000000 num 0] processing message > Jun 8 09:46:11 defr1elcbtd02 kernel: (3463,0):sc_get:294 [sc > ffff81007c2f0800 refs 3 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 24] get > Jun 8 09:46:11 defr1elcbtd02 kernel: (3463,0):o2net_advance_rx:1196 [sc > ffff81007c2f0800 refs 4 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] ret = 1 > Jun 8 09:46:11 defr1elcbtd02 kernel: (3463,0):o2net_advance_rx:1129 [sc > ffff81007c2f0800 refs 4 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] receiving > Jun 8 09:46:11 defr1elcbtd02 kernel: (3463,0):o2net_advance_rx:1196 [sc > ffff81007c2f0800 refs 4 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] ret = -11 > Jun 8 09:46:11 defr1elcbtd02 kernel: (3463,0):sc_put:289 [sc > ffff81007c2f0800 refs 4 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] put > Jun 8 09:46:16 defr1elcbtd02 kernel: (3463,0):sc_put:289 [sc > ffff81007c2f0800 refs 3 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] put > Jun 8 09:46:16 defr1elcbtd02 kernel: (0,0):o2net_data_ready:452 [sc > ffff81007c2f0800 refs 2 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] data_ready hit > Jun 8 09:46:16 defr1elcbtd02 kernel: (0,0):sc_get:294 [sc > ffff81007c2f0800 refs 2 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] get > Jun 8 09:46:16 defr1elcbtd02 kernel: (0,0):o2net_data_ready:452 [sc > ffff81007c2f0800 refs 3 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] data_ready hit > Jun 8 09:46:16 defr1elcbtd02 kernel: (0,0):sc_get:294 [sc > ffff81007c2f0800 refs 3 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] get > Jun 8 09:46:16 defr1elcbtd02 kernel: (0,0):sc_put:289 [sc > ffff81007c2f0800 refs 4 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] put > Jun 8 09:46:16 defr1elcbtd02 kernel: (3463,0):o2net_advance_rx:1129 [sc > ffff81007c2f0800 refs 3 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] receiving > Jun 8 09:46:16 defr1elcbtd02 kernel: (3463,0):o2net_advance_rx:1170 > [mag 64088 len 0 typ 0 stat 0 sys_stat 0 key 00000000 num 0] at page_off > 24 > Jun 8 09:46:16 defr1elcbtd02 kernel: > (3463,0):o2net_process_message:1015 [mag 64088 len 0 typ 0 stat 0 > sys_stat 0 key 00000000 num 0] processing message > Jun 8 09:46:16 defr1elcbtd02 kernel: (3463,0):sc_get:294 [sc > ffff81007c2f0800 refs 3 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 24] get > Jun 8 09:46:16 defr1elcbtd02 kernel: (3463,0):o2net_advance_rx:1196 [sc > ffff81007c2f0800 refs 4 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] ret = 1 > Jun 8 09:46:16 defr1elcbtd02 kernel: (3463,0):o2net_advance_rx:1129 [sc > ffff81007c2f0800 refs 4 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] receiving > Jun 8 09:46:16 defr1elcbtd02 kernel: (3463,0):o2net_advance_rx:1170 > [mag 64087 len 0 typ 0 stat 0 sys_stat 0 key 00000000 num 0] at page_off > 24 > Jun 8 09:46:16 defr1elcbtd02 kernel: > (3463,0):o2net_process_message:1015 [mag 64087 len 0 typ 0 stat 0 > sys_stat 0 key 00000000 num 0] processing message > Jun 8 09:46:16 defr1elcbtd02 kernel: (3463,0):sc_put:289 [sc > ffff81007c2f0800 refs 4 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 24] put > Jun 8 09:46:16 defr1elcbtd02 kernel: (3463,0):sc_get:294 [sc > ffff81007c2f0800 refs 3 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 24] get > Jun 8 09:46:16 defr1elcbtd02 kernel: (3463,0):o2net_advance_rx:1196 [sc > ffff81007c2f0800 refs 4 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] ret = 1 > Jun 8 09:46:16 defr1elcbtd02 kernel: (3463,0):o2net_advance_rx:1129 [sc > ffff81007c2f0800 refs 4 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] receiving > Jun 8 09:46:16 defr1elcbtd02 kernel: (3463,0):o2net_advance_rx:1196 [sc > ffff81007c2f0800 refs 4 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] ret = -11 > Jun 8 09:46:16 defr1elcbtd02 kernel: (3463,0):sc_put:289 [sc > ffff81007c2f0800 refs 4 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] put > Jun 8 09:46:21 defr1elcbtd02 kernel: (3463,0):sc_put:289 [sc > ffff81007c2f0800 refs 3 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] put > Jun 8 09:46:21 defr1elcbtd02 kernel: (0,0):o2net_data_ready:452 [sc > ffff81007c2f0800 refs 2 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] data_ready hit > Jun 8 09:46:21 defr1elcbtd02 kernel: (0,0):sc_get:294 [sc > ffff81007c2f0800 refs 2 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] get > Jun 8 09:46:21 defr1elcbtd02 kernel: (3463,0):o2net_advance_rx:1129 [sc > ffff81007c2f0800 refs 3 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] receiving > Jun 8 09:46:21 defr1elcbtd02 kernel: (3463,0):o2net_advance_rx:1170 > [mag 64088 len 0 typ 0 stat 0 sys_stat 0 key 00000000 num 0] at page_off > 24 > Jun 8 09:46:21 defr1elcbtd02 kernel: > (3463,0):o2net_process_message:1015 [mag 64088 len 0 typ 0 stat 0 > sys_stat 0 key 00000000 num 0] processing message > Jun 8 09:46:21 defr1elcbtd02 kernel: (3463,0):sc_get:294 [sc > ffff81007c2f0800 refs 3 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 24] get > Jun 8 09:46:21 defr1elcbtd02 kernel: (3463,0):o2net_advance_rx:1196 [sc > ffff81007c2f0800 refs 4 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] ret = 1 > Jun 8 09:46:21 defr1elcbtd02 kernel: (3463,0):o2net_advance_rx:1129 [sc > ffff81007c2f0800 refs 4 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] receiving > Jun 8 09:46:21 defr1elcbtd02 kernel: (3463,0):o2net_advance_rx:1196 [sc > ffff81007c2f0800 refs 4 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] ret = -11 > Jun 8 09:46:21 defr1elcbtd02 kernel: (3463,0):sc_put:289 [sc > ffff81007c2f0800 refs 4 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] put > Jun 8 09:46:26 defr1elcbtd02 kernel: (0,0):o2net_state_change:477 [sc > ffff81007c2f0800 refs 3 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] state_change to 8 > Jun 8 09:46:26 defr1elcbtd02 kernel: (0,0):sc_get:294 [sc > ffff81007c2f0800 refs 3 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] get > Jun 8 09:46:26 defr1elcbtd02 kernel: (0,0):o2net_data_ready:452 [sc > ffff81007c2f0800 refs 4 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] data_ready hit > Jun 8 09:46:26 defr1elcbtd02 kernel: (0,0):sc_get:294 [sc > ffff81007c2f0800 refs 4 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] get > Jun 8 09:46:26 defr1elcbtd02 kernel: (3463,0):o2net_shutdown_sc:572 [sc > ffff81007c2f0800 refs 5 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] shutting down > Jun 8 09:46:26 defr1elcbtd02 kernel: (3463,0):sc_put:289 [sc > ffff81007c2f0800 refs 5 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] put > Jun 8 09:46:26 defr1elcbtd02 kernel: (3463,0):sc_put:289 [sc > ffff81007c2f0800 refs 4 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] put > Jun 8 09:46:26 defr1elcbtd02 kernel: (3463,0):o2net_set_nn_state:382 > node 0 sc: ffff81007c2f0800 -> 0000000000000000, valid 1 -> 0, err 0 -> > -107 > Jun 8 09:46:26 defr1elcbtd02 kernel: (3463,0):o2quo_conn_err:296 node > 0, 1 total > Jun 8 09:46:26 defr1elcbtd02 kernel: (3463,0):o2quo_set_hold:162 node > 0, 1 total > Jun 8 09:46:26 defr1elcbtd02 kernel: o2net: no longer connected to node > defr1elcbtd01 (num 0) at 192.168.0.101:7777 > Jun 8 09:46:26 defr1elcbtd02 kernel: > (3463,0):o2net_complete_nodes_nsw:255 completed 0 messages for node 0 > Jun 8 09:46:26 defr1elcbtd02 kernel: (3463,0):o2net_set_nn_state:431 > queueing conn attempt in 0 jiffies > Jun 8 09:46:26 defr1elcbtd02 kernel: (3463,0):sc_get:294 [sc > ffff81007c2f0800 refs 3 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] get > Jun 8 09:46:26 defr1elcbtd02 kernel: (3463,0):sc_put:289 [sc > ffff81007c2f0800 refs 4 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] put > Jun 8 09:46:26 defr1elcbtd02 kernel: (3463,0):sc_put:289 [sc > ffff81007c2f0800 refs 3 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] put > Jun 8 09:46:26 defr1elcbtd02 kernel: (3463,0):o2net_advance_rx:1129 [sc > ffff81007c2f0800 refs 2 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] receiving > Jun 8 09:46:26 defr1elcbtd02 kernel: (3463,0):o2net_advance_rx:1196 [sc > ffff81007c2f0800 refs 2 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] ret = 0 > Jun 8 09:46:26 defr1elcbtd02 kernel: (3463,0):o2net_rx_until_empty:1215 > [sc ffff81007c2f0800 refs 2 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] saw error 0, closing > Jun 8 09:46:26 defr1elcbtd02 kernel: (3463,0):sc_put:289 [sc > ffff81007c2f0800 refs 2 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] put > Jun 8 09:46:26 defr1elcbtd02 kernel: (3463,0):o2net_shutdown_sc:572 [sc > ffff81007c2f0800 refs 1 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] shutting down > Jun 8 09:46:26 defr1elcbtd02 kernel: (3463,0):sc_put:289 [sc > ffff81007c2f0800 refs 1 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] put > Jun 8 09:46:26 defr1elcbtd02 kernel: (3463,0):sc_kref_release:274 [sc > ffff81007c2f0800 refs 1 sock ffff8100694138c0 node 0 page > ffff81007fafbb00 pg_off 0] releasing > Jun 8 09:46:46 defr1elcbtd02 kernel: scsi 2:0:1:1: SCSI error: return > code = 0x00010000 > Jun 8 09:46:46 defr1elcbtd02 kernel: end_request: I/O error, dev sdd, > sector 1672 > Jun 8 09:46:46 defr1elcbtd02 kernel: device-mapper: multipath: Failing > path 8:48. > Jun 8 09:46:46 defr1elcbtd02 kernel: device-mapper: multipath: Failing > path 8:16. > Jun 8 09:46:46 defr1elcbtd02 kernel: device-mapper: multipath emc: long > trespass command will be send > Jun 8 09:46:46 defr1elcbtd02 kernel: device-mapper: multipath emc: > honor reservation bit will not be set (default) > Jun 8 09:46:46 defr1elcbtd02 kernel: device-mapper: ioctl: error adding > target to table > Jun 8 09:46:46 defr1elcbtd02 kernel: device-mapper: multipath emc: long > trespass command will be send > Jun 8 09:46:46 defr1elcbtd02 kernel: device-mapper: multipath emc: > honor reservation bit will not be set (default) > Jun 8 09:46:48 defr1elcbtd02 kernel: device-mapper: multipath emc: > emc_pg_init: sending switch-over command > > > > > Node 1: > Jun 8 09:46:21 defr1elcbtd01 kernel: (16528,0):o2net_data_ready:452 [sc > ffff81007ddf4400 refs 3 sock ffff810002977d00 node 1 page > ffff81007f75b3e8 pg_off 0] data_ready hit > Jun 8 09:46:21 defr1elcbtd01 kernel: (16528,0):sc_get:294 [sc > ffff81007ddf4400 refs 3 sock ffff810002977d00 node 1 page > ffff81007f75b3e8 pg_off 0] get > Jun 8 09:46:21 defr1elcbtd01 kernel: (3505,0):o2net_advance_rx:1129 [sc > ffff81007ddf4400 refs 4 sock ffff810002977d00 node 1 page > ffff81007f75b3e8 pg_off 0] receiving > Jun 8 09:46:21 defr1elcbtd01 kernel: (3505,0):o2net_advance_rx:1170 > [mag 64087 len 0 typ 0 stat 0 sys_stat 0 key 00000000 num 0] at page_off > 24 > Jun 8 09:46:21 defr1elcbtd01 kernel: > (3505,0):o2net_process_message:1015 [mag 64087 len 0 typ 0 stat 0 > sys_stat 0 key 00000000 num 0] processing message > Jun 8 09:46:21 defr1elcbtd01 kernel: (3505,0):sc_put:289 [sc > ffff81007ddf4400 refs 4 sock ffff810002977d00 node 1 page > ffff81007f75b3e8 pg_off 24] put > Jun 8 09:46:21 defr1elcbtd01 kernel: (3505,0):sc_get:294 [sc > ffff81007ddf4400 refs 3 sock ffff810002977d00 node 1 page > ffff81007f75b3e8 pg_off 24] get > Jun 8 09:46:21 defr1elcbtd01 kernel: (3505,0):o2net_advance_rx:1196 [sc > ffff81007ddf4400 refs 4 sock ffff810002977d00 node 1 page > ffff81007f75b3e8 pg_off 0] ret = 1 > Jun 8 09:46:21 defr1elcbtd01 kernel: (3505,0):o2net_advance_rx:1129 [sc > ffff81007ddf4400 refs 4 sock ffff810002977d00 node 1 page > ffff81007f75b3e8 pg_off 0] receiving > Jun 8 09:46:21 defr1elcbtd01 kernel: (3505,0):o2net_advance_rx:1196 [sc > ffff81007ddf4400 refs 4 sock ffff810002977d00 node 1 page > ffff81007f75b3e8 pg_off 0] ret = -11 > Jun 8 09:46:21 defr1elcbtd01 kernel: (3505,0):sc_put:289 [sc > ffff81007ddf4400 refs 4 sock ffff810002977d00 node 1 page > ffff81007f75b3e8 pg_off 0] put > Jun 8 09:46:26 defr1elcbtd01 kernel: (3804,0):o2quo_hb_down:224 node 1, > 1 total > Jun 8 09:46:26 defr1elcbtd01 kernel: (3804,0):o2net_set_nn_state:382 > node 1 sc: ffff81007ddf4400 -> 0000000000000000, valid 1 -> 0, err 0 -> > -107 > Jun 8 09:46:26 defr1elcbtd01 kernel: (3804,0):o2quo_conn_err:296 node > 1, 1 total > Jun 8 09:46:26 defr1elcbtd01 kernel: o2net: no longer connected to node > defr1elcbtd02 (num 1) at 192.168.0.102:7777 > Jun 8 09:46:26 defr1elcbtd01 kernel: > (3804,0):o2net_complete_nodes_nsw:255 completed 0 messages for node 1 > Jun 8 09:46:26 defr1elcbtd01 kernel: (3804,0):o2net_set_nn_state:431 > queueing conn attempt in 0 jiffies > Jun 8 09:46:26 defr1elcbtd01 kernel: (3804,0):sc_get:294 [sc > ffff81007ddf4400 refs 3 sock ffff810002977d00 node 1 page > ffff81007f75b3e8 pg_off 0] get > Jun 8 09:46:26 defr1elcbtd01 kernel: (3804,0):sc_put:289 [sc > ffff81007ddf4400 refs 4 sock ffff810002977d00 node 1 page > ffff81007f75b3e8 pg_off 0] put > Jun 8 09:46:26 defr1elcbtd01 kernel: (3505,0):o2net_shutdown_sc:572 [sc > ffff81007ddf4400 refs 3 sock ffff810002977d00 node 1 page > ffff81007f75b3e8 pg_off 0] shutting down > Jun 8 09:46:26 defr1elcbtd01 kernel: (3505,0):sc_put:289 [sc > ffff81007ddf4400 refs 3 sock ffff810002977d00 node 1 page > ffff81007f75b3e8 pg_off 0] put > Jun 8 09:46:26 defr1elcbtd01 kernel: (3505,0):sc_put:289 [sc > ffff81007ddf4400 refs 2 sock ffff810002977d00 node 1 page > ffff81007f75b3e8 pg_off 0] put > Jun 8 09:46:26 defr1elcbtd01 kernel: (3505,0):sc_put:289 [sc > ffff81007ddf4400 refs 1 sock ffff810002977d00 node 1 page > ffff81007f75b3e8 pg_off 0] put > Jun 8 09:46:26 defr1elcbtd01 kernel: (3505,0):sc_kref_release:274 [sc > ffff81007ddf4400 refs 1 sock ffff810002977d00 node 1 page > ffff81007f75b3e8 pg_off 0] releasing > Jun 8 09:46:26 defr1elcbtd01 kernel: > (16541,0):ocfs2_replay_journal:1184 Recovering node 1 from slot 0 on > device (253,7) > Jun 8 09:46:28 defr1elcbtd01 kernel: kjournald starting. Commit > interval 5 seconds > Jun 8 09:46:48 defr1elcbtd01 kernel: Please double check your > configuration values for 'O2CB_HEARTBEAT_THRESHOLD' > Jun 8 09:46:48 defr1elcbtd01 kernel: (3804,0):o2quo_hb_up:199 node 1, 2 > total > Jun 8 09:46:48 defr1elcbtd01 kernel: (3804,0):o2quo_set_hold:162 node > 1, 1 total > Jun 8 09:46:48 defr1elcbtd01 kernel: (3804,0):o2net_set_nn_state:382 > node 1 sc: 0000000000000000 -> 0000000000000000, valid 0 -> 0, err -107 > -> 0 > Jun 8 09:46:48 defr1elcbtd01 kernel: (3804,0):o2net_set_nn_state:431 > queueing conn attempt in 0 jiffies > Jun 8 09:46:58 defr1elcbtd01 kernel: (3505,0):o2net_set_nn_state:382 > node 1 sc: 0000000000000000 -> 0000000000000000, valid 0 -> 0, err 0 -> > -107 > Jun 8 09:46:58 defr1elcbtd01 kernel: (3505,0):o2quo_conn_err:296 node > 1, 1 total > Jun 8 09:46:58 defr1elcbtd01 kernel: (3505,0):o2net_set_nn_state:431 > queueing conn attempt in 0 jiffies > Jun 8 09:47:16 defr1elcbtd01 kernel: (3505,0):o2quo_hb_still_up:242 > node 1 > Jun 8 09:47:16 defr1elcbtd01 kernel: (3505,0):o2quo_clear_hold:171 node > 1, 0 total > Jun 8 09:47:16 defr1elcbtd01 kernel: (8,0):o2quo_make_decision:105 > heartbeating: 2, connected: 1, lowest: 0 (reachable) > Jun 8 09:47:44 defr1elcbtd01 kernel: (3804,0):o2quo_hb_down:224 node 1, > 1 total > Jun 8 09:47:44 defr1elcbtd01 kernel: (3804,0):o2net_set_nn_state:382 > node 1 sc: 0000000000000000 -> 0000000000000000, valid 0 -> 0, err -107 > -> -107 > Jun 8 09:47:44 defr1elcbtd01 kernel: (3804,0):o2net_set_nn_state:431 > queueing conn attempt in 0 jiffies > Jun 8 09:48:33 defr1elcbtd01 kernel: e1000: eth3: e1000_watchdog: NIC > Link is Down > Jun 8 09:48:45 defr1elcbtd01 kernel: e1000: eth3: e1000_watchdog: NIC > Link is Up 1000 Mbps Full Duplex > Jun 8 09:50:05 defr1elcbtd01 kernel: e1000: eth3: e1000_watchdog: NIC > Link is Down > Jun 8 09:50:08 defr1elcbtd01 kernel: e1000: eth3: e1000_watchdog: NIC > Link is Up 1000 Mbps Full Duplex > Jun 8 09:50:20 defr1elcbtd01 kernel: Please double check your > configuration values for 'O2CB_HEARTBEAT_THRESHOLD' > Jun 8 09:50:20 defr1elcbtd01 kernel: (3804,0):o2quo_hb_up:199 node 1, 2 > total > Jun 8 09:50:20 defr1elcbtd01 kernel: (3804,0):o2quo_set_hold:162 node > 1, 1 total > Jun 8 09:50:20 defr1elcbtd01 kernel: (3804,0):o2net_set_nn_state:382 > node 1 sc: 0000000000000000 -> 0000000000000000, valid 0 -> 0, err -107 > -> 0 > Jun 8 09:50:20 defr1elcbtd01 kernel: (3804,0):o2net_set_nn_state:431 > queueing conn attempt in 0 jiffies > Jun 8 09:50:22 defr1elcbtd01 kernel: (0,0):o2net_listen_data_ready:1670 > bytes: 0 > Jun 8 09:50:22 defr1elcbtd01 kernel: (3505,0):sc_alloc:320 [sc > ffff81007dddc400 refs 1 sock 0000000000000000 node 1 page > 0000000000000000 pg_off 0] alloced > Jun 8 09:50:22 defr1elcbtd01 kernel: (3505,0):o2net_set_nn_state:382 > node 1 sc: 0000000000000000 -> ffff81007dddc400, valid 0 -> 0, err 0 -> > 0 > Jun 8 09:50:22 defr1elcbtd01 kernel: (3505,0):o2net_set_nn_state:431 > queueing conn attempt in 0 jiffies > Jun 8 09:50:22 defr1elcbtd01 kernel: (3505,0):sc_get:294 [sc > ffff81007dddc400 refs 1 sock ffff81005232e700 node 1 page > ffff81007f272590 pg_off 0] get > Jun 8 09:50:22 defr1elcbtd01 kernel: (3505,0):sc_get:294 [sc > ffff81007dddc400 refs 2 sock ffff81005232e700 node 1 page > ffff81007f272590 pg_off 0] get > Jun 8 09:50:22 defr1elcbtd01 kernel: (3505,0):sc_get:294 [sc > ffff81007dddc400 refs 3 sock ffff81005232e700 node 1 page > ffff81007f272590 pg_off 0] get > Jun 8 09:50:22 defr1elcbtd01 kernel: (3505,0):sc_put:289 [sc > ffff81007dddc400 refs 4 sock ffff81005232e700 node 1 page > ffff81007f272590 pg_off 0] put > Jun 8 09:50:22 defr1elcbtd01 kernel: (3505,0):o2net_advance_rx:1129 [sc > ffff81007dddc400 refs 3 sock ffff81005232e700 node 1 page > ffff81007f272590 pg_off 0] receiving > Jun 8 09:50:22 defr1elcbtd01 kernel: (3505,0):sc_get:294 [sc > ffff81007dddc400 refs 3 sock ffff81005232e700 node 1 page > ffff81007f272590 pg_off 16] get > Jun 8 09:50:22 defr1elcbtd01 kernel: (3505,0):o2net_set_nn_state:382 > node 1 sc: ffff81007dddc400 -> ffff81007dddc400, valid 0 -> 1, err 0 -> > 0 > Jun 8 09:50:22 defr1elcbtd01 kernel: (3505,0):o2quo_conn_up:267 node 1, > 2 total > Jun 8 09:50:22 defr1elcbtd01 kernel: (3505,0):o2quo_clear_hold:171 node > 1, 0 total > Jun 8 09:50:22 defr1elcbtd01 kernel: o2net: accepted connection from > node defr1elcbtd02 (num 1) at 192.168.0.102:7777 > Jun 8 09:50:22 defr1elcbtd01 kernel: (3505,0):o2net_advance_rx:1196 [sc > ffff81007dddc400 refs 4 sock ffff81005232e700 node 1 page > ffff81007f272590 pg_off 0] ret = 16 > Jun 8 09:50:22 defr1elcbtd01 kernel: (3505,0):o2net_advance_rx:1129 [sc > ffff81007dddc400 refs 4 sock ffff81005232e700 node 1 page > ffff81007f272590 pg_off 0] receiving > Jun 8 09:50:22 defr1elcbtd01 kernel: (3505,0):o2net_advance_rx:1196 [sc > ffff81007dddc400 refs 4 sock ffff81005232e700 node 1 page > ffff81007f272590 pg_off 0] ret = -11 > Jun 8 09:50:22 defr1elcbtd01 kernel: (3505,0):sc_put:289 [sc > ffff81007dddc400 refs 4 sock ffff81005232e700 node 1 page > ffff81007f272590 pg_off 0] put > >> If there is some problem in FC port, the node will reboot and the > other >> node will get this message. >>> The cluster is running fine if there is no problem with the SAN >>> connection. >>> >>> How to enable verbose logging with ofcs2? >> debugfs.ocfs2 -l will show current logging status. >> >> If you want to enable some log, use e.g >> debugfs.ocfs2 -l DISK_ALLOC allow >> and >> debugfs.ocfs2 -l DISK_ALLOC off >> will turn it off. >> >> Regards, >> Tao >>> Regards, >>> Florian >>> >>>> florian.engelm...@bt.com wrote: >>>>> Hello, >>>>> our Debian etch cluster nodes are panicing because of ocfs2 > fencing >>> if >>>>> one SAN path fails. >>>>> >>>>> modinfo ocfs2 >>>>> filename: > /lib/modules/2.6.18-6-amd64/kernel/fs/ocfs2/ocfs2.ko >>>>> author: Oracle >>>>> license: GPL >>>>> description: OCFS2 1.3.3 >>>>> version: 1.3.3 >>>>> vermagic: 2.6.18-6-amd64 SMP mod_unload gcc-4.1 >>>>> depends: ocfs2_dlm,ocfs2_nodemanager,jbd >>>>> srcversion: 0798424846E68F10172C203 >>>>> >>>>> modinfo ocfs2_dlmfs >>>>> filename: >>>>> /lib/modules/2.6.18-6-amd64/kernel/fs/ocfs2/dlm/ocfs2_dlmfs.ko >>>>> author: Oracle >>>>> license: GPL >>>>> description: OCFS2 DLMFS 1.3.3 >>>>> version: 1.3.3 >>>>> vermagic: 2.6.18-6-amd64 SMP mod_unload gcc-4.1 >>>>> depends: ocfs2_dlm,ocfs2_nodemanager >>>>> srcversion: E3780E12396118282B3C1AD >>>>> >>>>> defr1elcbtd02:~# modinfo ocfs2_dlm >>>>> filename: >>>>> /lib/modules/2.6.18-6-amd64/kernel/fs/ocfs2/dlm/ocfs2_dlm.ko >>>>> author: Oracle >>>>> license: GPL >>>>> description: OCFS2 DLM 1.3.3 >>>>> version: 1.3.3 >>>>> vermagic: 2.6.18-6-amd64 SMP mod_unload gcc-4.1 >>>>> depends: ocfs2_nodemanager >>>>> srcversion: 7DC395EA08AE4CE826C5B92 >>>>> >>>>> modinfo ocfs2_nodemanager >>>>> filename: >>>>> > /lib/modules/2.6.18-6-amd64/kernel/fs/ocfs2/cluster/ocfs2_nodemanager.ko >>>>> author: Oracle >>>>> license: GPL >>>>> description: OCFS2 Node Manager 1.3.3 >>>>> version: 1.3.3 >>>>> vermagic: 2.6.18-6-amd64 SMP mod_unload gcc-4.1 >>>>> depends: configfs >>>>> srcversion: C4C9871302E1910B78DAE40 >>>>> >>>>> modinfo qla2xxx >>>>> filename: >>>>> /lib/modules/2.6.18-6-amd64/kernel/drivers/scsi/qla2xxx/qla2xxx.ko >>>>> author: QLogic Corporation >>>>> description: QLogic Fibre Channel HBA Driver >>>>> license: GPL >>>>> version: 8.01.07-k1 >>>>> vermagic: 2.6.18-6-amd64 SMP mod_unload gcc-4.1 >>>>> depends: scsi_mod,scsi_transport_fc,firmware_class >>>>> alias: pci:v00001077d00002100sv*sd*bc*sc*i* >>>>> alias: pci:v00001077d00002200sv*sd*bc*sc*i* >>>>> alias: pci:v00001077d00002300sv*sd*bc*sc*i* >>>>> alias: pci:v00001077d00002312sv*sd*bc*sc*i* >>>>> alias: pci:v00001077d00002322sv*sd*bc*sc*i* >>>>> alias: pci:v00001077d00006312sv*sd*bc*sc*i* >>>>> alias: pci:v00001077d00006322sv*sd*bc*sc*i* >>>>> alias: pci:v00001077d00002422sv*sd*bc*sc*i* >>>>> alias: pci:v00001077d00002432sv*sd*bc*sc*i* >>>>> alias: pci:v00001077d00005422sv*sd*bc*sc*i* >>>>> alias: pci:v00001077d00005432sv*sd*bc*sc*i* >>>>> srcversion: B8E1608E257391DCAFD9224 >>>>> parm: ql2xfdmienable:Enables FDMI registratons Default > is >>> 0 - >>>>> no FDMI. 1 - perfom FDMI. (int) >>>>> parm: extended_error_logging:Option to enable extended >>> error >>>>> logging, Default is 0 - no logging. 1 - log errors. (int) >>>>> parm: ql2xallocfwdump:Option to enable allocation of >>> memory >>>>> for a firmware dump during HBA initialization. Memory allocation >>>>> requirements vary by ISP type. Default is 1 - allocate memory. >>> (int) >>>>> parm: ql2xloginretrycount:Specify an alternate value for >>> the >>>>> NVRAM login retry count. (int) >>>>> parm: ql2xplogiabsentdevice:Option to enable PLOGI to >>> devices >>>>> that are not present after a Fabric scan. This is needed for >>> several >>>>> broken switches. Default is 0 - no PLOGI. 1 - perfom PLOGI. (int) >>>>> parm: qlport_down_retry:Maximum number of command > retries >>> to a >>>>> port that returns a PORT-DOWN status. (int) >>>>> parm: ql2xlogintimeout:Login timeout value in seconds. >>> (int) >>>>> modinfo dm_multipath >>>>> filename: >>>>> /lib/modules/2.6.18-6-amd64/kernel/drivers/md/dm-multipath.ko >>>>> description: device-mapper multipath target >>>>> author: Sistina Software <dm-de...@redhat.com> >>>>> license: GPL >>>>> vermagic: 2.6.18-6-amd64 SMP mod_unload gcc-4.1 >>>>> depends: dm-mod >>>>> >>>>> modinfo dm_mod >>>>> filename: >>> /lib/modules/2.6.18-6-amd64/kernel/drivers/md/dm-mod.ko >>>>> description: device-mapper driver >>>>> author: Joe Thornber <dm-de...@redhat.com> >>>>> license: GPL >>>>> vermagic: 2.6.18-6-amd64 SMP mod_unload gcc-4.1 >>>>> depends: >>>>> parm: major:The major number of the device mapper (uint) >>>>> >>>>> modinfo dm_round_robin >>>>> filename: >>>>> /lib/modules/2.6.18-6-amd64/kernel/drivers/md/dm-round-robin.ko >>>>> description: device-mapper round-robin multipath path selector >>>>> author: Sistina Software <dm-de...@redhat.com> >>>>> license: GPL >>>>> vermagic: 2.6.18-6-amd64 SMP mod_unload gcc-4.1 >>>>> depends: dm-multipath >>>>> >>>>> There is no self compiled software just the official repository > was >>>>> used. >>>>> The nodes are connected to our two independent SANs. The storage >>> systems >>>>> are EMC Clariion CX3-20f and EMC Clariion CX500. >>>>> >>>>> multipath.conf: >>>>> defaults { >>>>> rr_min_io 1000 >>>>> polling_interval 2 >>>>> no_path_retry 5 >>>>> user_friendly_names yes >>>>> } >>>>> >>>>> blacklist { >>>>> devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*" >>>>> devnode "^hd[a-z][[0-9]*]" >>>>> devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]" >>>>> device { >>>>> vendor "DGC" >>>>> product "LUNZ" # EMC Management LUN >>>>> } >>>>> device { >>>>> vendor "ATA" #We do not need mutlipathing for > local >>>>> drives >>>>> product "*" >>>>> } >>>>> device { >>>>> vendor "AMI" # No multipathing for SUN Virtual >>> devices >>>>> product "*" >>>>> } >>>>> device { >>>>> vendor "HITACHI" # No multipathing for local scsi >>> disks >>>>> product "H101414SCSUN146G" >>>>> } >>>>> } >>>>> >>>>> devices { >>>>> ## Device attributes for EMC CLARiiON >>>>> device { >>>>> vendor "DGC" >>>>> product "*" >>>>> path_grouping_policy group_by_prio >>>>> getuid_callout "/sbin/scsi_id -g -u -s >>>>> /block/%n" >>>>> prio_callout "/sbin/mpath_prio_emc >>> /dev/%n" >>>>> hardware_handler "1 emc" >>>>> features "1 queue_if_no_path" >>>>> no_path_retry fail >>>>> path_checker emc_clariion >>>>> path_selector "round-robin 0" >>>>> failback immediate >>>>> user_friendly_names yes >>>>> } >>>>> } >>>>> >>>>> multipaths { >>>>> multipath { >>>>> wwid >>>>> 3600601603ac511001c7c92fec775dd11 >>>>> alias stosan01_lun070 >>>>> } >>>>> } >>>>> >>>>> multipath -ll: >>>>> stosan01_lun070 (3600601603ac511001c7c92fec775dd11) dm-7 DGC,RAID > 5 >>>>> [size=133G][features=0][hwhandler=1 emc] >>>>> \_ round-robin 0 [prio=2][active] >>>>> \_ 0:0:1:1 sdd 8:48 [active][ready] >>>>> \_ 1:0:1:1 sdh 8:112 [active][ready] >>>>> \_ round-robin 0 [prio=0][enabled] >>>>> \_ 0:0:0:1 sdb 8:16 [active][ready] >>>>> \_ 1:0:0:1 sdf 8:80 [active][ready] >>>>> >>>>> >>>>> As we use lvm2 we added /dev/sd* to the filter: >>>>> filter = [ "r|/dev/cdrom|", "r|/dev/sd.*|" ] >>>>> >>>>> Here is what happened and what we did to reconstruct the situation >>> to >>>>> find a solution: >>>>> >>>>> On 02.06.2009 we did something wrong with the zoning on one of our >>> two >>>>> SANs and all servers (about 40) lost one path to the SAN. Only two >>>>> servers crashed. Those two are our Debian etch heartbeat cluster >>>>> described above. >>>>> The console showed a kernel panic because of ocfs2 was fencing > both >>>>> nodes. >>>>> >>>>> This was the message: >>>>> O2hb_write_timeout: 165 ERROR: Heartbeat write timeout to device >>> dm-7 >>>>> after 12000 milliseconds >>>>> >>>>> So we decided to change the o2cb settings to: >>>>> O2CB_HEARTBEAT_THRESHOLD=31 >>>>> O2CB_IDLE_TIMEOUT_MS=30000 >>>>> O2CB_KEEPALIVE_DELAY_MS=2000 >>>>> O2CB_RECONNECT_DELAY_MS=2000 >>>>> >>>>> We switched all cluster resources to the 1st node to test the new >>>>> settings on the second node. We removed the 2nd node from the > zoning >>> (we >>>>> also tested shutting down the port with the same result) and got a >>>>> different error but still ending up with a kernel panic: >>>>> >>>>> Jun 4 16:41:05 defr1elcbtd02 kernel: o2net: no longer connected > to >>> node >>>>> defr1elcbtd01 (num 0) at 192.168.0.101:7777 >>>>> Jun 4 16:41:27 defr1elcbtd02 kernel: rport-0:0-0: blocked FC >>> remote >>>>> port time out: removing target and saving binding >>>>> Jun 4 16:41:27 defr1elcbtd02 kernel: rport-0:0-1: blocked FC >>> remote >>>>> port time out: removing target and saving binding >>>>> Jun 4 16:41:27 defr1elcbtd02 kernel: sd 0:0:1:1: SCSI error: > return >>>>> code = 0x00010000 >>>>> Jun 4 16:41:27 defr1elcbtd02 kernel: end_request: I/O error, dev >>> sdd, >>>>> sector 1672 >>>>> Jun 4 16:41:27 defr1elcbtd02 kernel: device-mapper: multipath: >>> Failing >>>>> path 8:48. >>>>> Jun 4 16:41:27 defr1elcbtd02 kernel: device-mapper: multipath: >>> Failing >>>>> path 8:16. >>>>> Jun 4 16:41:27 defr1elcbtd02 kernel: scsi 0:0:1:1: rejecting I/O > to >>>>> device being removed >>>>> Jun 4 16:41:27 defr1elcbtd02 kernel: device-mapper: multipath > emc: >>> long >>>>> trespass command will be send >>>>> Jun 4 16:41:27 defr1elcbtd02 kernel: device-mapper: multipath > emc: >>>>> honor reservation bit will not be set (default) >>>>> Jun 4 16:41:27 defr1elcbtd02 kernel: device-mapper: table: 253:7: >>>>> multipath: error getting device >>>>> Jun 4 16:41:27 defr1elcbtd02 kernel: device-mapper: ioctl: error >>> adding >>>>> target to table >>>>> Jun 4 16:41:27 defr1elcbtd02 kernel: device-mapper: multipath > emc: >>> long >>>>> trespass command will be send >>>>> Jun 4 16:41:27 defr1elcbtd02 kernel: device-mapper: multipath > emc: >>>>> honor reservation bit will not be set (default) >>>>> Jun 4 16:41:29 defr1elcbtd02 kernel: device-mapper: multipath > emc: >>>>> emc_pg_init: sending switch-over command >>>>> Jun 4 16:42:01 defr1elcbtd02 kernel: >>>>> (10751,1):dlm_send_remote_convert_request:395 ERROR: status = -107 >>>>> Jun 4 16:42:01 defr1elcbtd02 kernel: >>>>> (10751,1):dlm_wait_for_node_death:374 >>> 5EE89BC01EFC405E9197C198DEEAE678: >>>>> waiting 5000ms for notification of death of node 0 >>>>> Jun 4 16:42:07 defr1elcbtd02 kernel: >>>>> (10751,1):dlm_send_remote_convert_request:395 ERROR: status = -107 >>>>> Jun 4 16:42:07 defr1elcbtd02 kernel: >>>>> (10751,1):dlm_wait_for_node_death:374 >>> 5EE89BC01EFC405E9197C198DEEAE678: >>>>> waiting 5000ms for notification of death of node 0 >>>>> [...] >>>>> After 60 seconds: >>>>> >>>>> (8,0): o2quo_make_decision:143 ERROR: fending this node because it >>> is >>>>> connected to a half-quorum of 1 out of 2 nodes which doesn't > include >>> the >>>>> lowest active node 0 >>>>> >>>>> >>>>> multipath -ll changed to: >>>>> stosan01_lun070 (3600601603ac511001c7c92fec775dd11) dm-7 DGC,RAID > 5 >>>>> [size=133G][features=0][hwhandler=1 emc] >>>>> \_ round-robin 0 [prio=1][active] >>>>> \_ 0:0:1:1 sdd 8:48 [active][ready] >>>>> \_ round-robin 0 [prio=0][enabled] >>>>> \_ 0:0:0:1 sdb 8:16 [active][ready] >>>>> >>>>> The ocfs2 filesystem is still mounted an writable. Even if I > enable >>> the >>>>> zoneing (or the FC port) again within the 60 seconds ocfs2 does > not >>>>> reconnect to node 1 and panics the kernel after 60 seconds while >>>>> multipath -ll shows both path again. >>>>> >>>>> I do not understand at all what the Ethernet heartbeat connection > of >>>>> ocfs2 has to do with the SAN connection. >>>>> >>>>> The strangest thing at all is - this does not happen always! After >>> some >>>>> reboots the system keeps running stable even if I shutdown a FC > port >>> and >>>>> enable it again many times. There is no constant behaviour... It >>> happens >>>>> most of the times, but at about 10% it does not happen and >>> everything is >>>>> working as intended. >>>>> >>>>> Any explanations or ideas what causes this behaviour? >>>>> >>>>> I will test this on Debian lenny to see if the Debian version > makes >>> a >>>>> difference. >>>>> >>>>> Best regards, >>>>> Florian >>>>> >>>>> _______________________________________________ >>>>> Ocfs2-users mailing list >>>>> Ocfs2-users@oss.oracle.com >>>>> http://oss.oracle.com/mailman/listinfo/ocfs2-users >>>>> >>> _______________________________________________ >>> Ocfs2-users mailing list >>> Ocfs2-users@oss.oracle.com >>> http://oss.oracle.com/mailman/listinfo/ocfs2-users _______________________________________________ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users