Hi Emmanuel,

Yes, I'm running gfs2. I'm also trying this out on RHEL 6.2 with three nodes to see if this happens upstream. It looks like I may have to open a BZ to get more info on this.
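Before I file it, I'll grab the group/fence daemon state from a couple of the affected nodes. This is just a rough sketch of what I plan to collect (assuming the stock RHEL 5 cluster tooling; the 6.2 daemons differ a bit, so adjust as needed):

  group_tool -v            # group list, including the stuck LEAVE_START_WAIT entry
  group_tool dump          # groupd debug buffer
  group_tool dump gfs      # gfs_controld debug buffer
  group_tool dump fence    # fenced debug buffer
  cman_tool services       # cman's view of the same service groups
  cman_tool nodes -f       # membership, including last-fenced times
  dmesg | tail -n 100      # any dlm/gfs2 withdraw or fencing messages

For reference, here are the gfs2 mounts on the node I've been poking at: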
root@bl13-node13:~# gfs2_tool list
253:15 cluster3:cluster3_disk6
253:16 cluster3:cluster3_disk3
253:18 cluster3:disk10
253:17 cluster3:cluster3_disk9
253:19 cluster3:cluster3_disk8
253:21 cluster3:cluster3_disk7
253:22 cluster3:cluster3_disk2
253:23 cluster3:cluster3_disk1

thanks,
-Cedric

On Sun, Jun 3, 2012 at 1:17 PM, emmanuel segura <emi2f...@gmail.com> wrote:
> Hello Cedric,
>
> Are you using gfs or gfs2? If you are using gfs, I recommend using gfs2.
>
> 2012/6/3 Cedric Kimaru <rhel_clus...@ckimaru.com>
>
>> Fellow Cluster Compatriots,
>> I'm looking for some guidance here. Whenever my RHEL 5.7 cluster gets
>> into "*LEAVE_START_WAIT*" on a given iSCSI volume, the following occurs:
>>
>> 1. I can't perform read/write I/O to the volume.
>> 2. I can't unmount it from any node.
>> 3. In-flight/pending I/Os are impossible to determine or kill, since
>>    lsof on the mount fails. Basically, all I/O operations stall or fail.
>>
>> So my questions are:
>>
>> 1. What does the output from group_tool -v really indicate:
>>    "*00030005 LEAVE_START_WAIT 12 c000b0002 1*"? The group_tool man page
>>    doesn't list these fields.
>> 2. Does anyone have a list of what these fields represent?
>> 3. Corrective actions: how do I get out of this state without rebooting
>>    the entire cluster?
>> 4. Is it possible to determine the offending node?
>>
>> thanks,
>> -Cedric
>>
>>
>> //misc output
>>
>> root@bl13-node13:~# clustat
>> Cluster Status for cluster3 @ Sat Jun 2 20:47:08 2012
>> Member Status: Quorate
>>
>>  Member Name      ID   Status
>>  ------ ----      ---- ------
>>  bl01-node01       1   Online, rgmanager
>>  bl04-node04       4   Online, rgmanager
>>  bl05-node05       5   Online, rgmanager
>>  bl06-node06       6   Online, rgmanager
>>  bl07-node07       7   Online, rgmanager
>>  bl08-node08       8   Online, rgmanager
>>  bl09-node09       9   Online, rgmanager
>>  bl10-node10      10   Online, rgmanager
>>  bl11-node11      11   Online, rgmanager
>>  bl12-node12      12   Online, rgmanager
>>  bl13-node13      13   Online, Local, rgmanager
>>  bl14-node14      14   Online, rgmanager
>>  bl15-node15      15   Online, rgmanager
>>
>>  Service Name           Owner (Last)    State
>>  ------- ----           ----- ------    -----
>>  service:httpd          bl05-node05     started
>>  service:nfs_disk2      bl08-node08     started
>>
>>
>> root@bl13-node13:~# group_tool -v
>> type  level name            id       state  node id local_done
>> fence 0     default         0001000d none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> dlm   1     clvmd           0001000c none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> dlm   1     cluster3_disk1  00020005 none
>> [4 5 6 7 8 9 10 11 12 13 14 15]
>> dlm   1     cluster3_disk2  00040005 none
>> [4 5 6 7 8 9 10 11 13 14 15]
>> dlm   1     cluster3_disk7  00060005 none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> dlm   1     cluster3_disk8  00080005 none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> dlm   1     cluster3_disk9  000a0005 none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> dlm   1     disk10          000c0005 none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> dlm   1     rgmanager       0001000a none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> dlm   1     cluster3_disk3  00020001 none
>> [1 5 6 7 8 9 10 11 12 13]
>> dlm   1     cluster3_disk6  00020008 none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> gfs   2     cluster3_disk1  00010005 none
>> [4 5 6 7 8 9 10 11 12 13 14 15]
>> *gfs  2     cluster3_disk2  00030005 LEAVE_START_WAIT 12 c000b0002 1
>> [4 5 6 7 8 9 10 11 13 14 15]*
>> gfs   2     cluster3_disk7  00050005 none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> gfs   2     cluster3_disk8  00070005 none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> gfs   2     cluster3_disk9  00090005 none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> gfs   2     disk10          000b0005 none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> gfs   2     cluster3_disk3  00010001 none
>> [1 5 6 7 8 9 10 11 12 13]
>> gfs   2     cluster3_disk6  00010008 none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>>
>> root@bl13-node13:~# gfs2_tool list
>> 253:15 cluster3:cluster3_disk6
>> 253:16 cluster3:cluster3_disk3
>> 253:18 cluster3:disk10
>> 253:17 cluster3:cluster3_disk9
>> 253:19 cluster3:cluster3_disk8
>> 253:21 cluster3:cluster3_disk7
>> 253:22 cluster3:cluster3_disk2
>> 253:23 cluster3:cluster3_disk1
>>
>> root@bl13-node13:~# lvs
>>   Logging initialised at Sat Jun 2 20:50:03 2012
>>   Set umask from 0022 to 0077
>>   Finding all logical volumes
>>   LV                             VG                             Attr    LSize   Origin Snap%  Move Log Copy%  Convert
>>   lv_cluster3_Disk7              vg_Cluster3_Disk7              -wi-ao    3.00T
>>   lv_cluster3_Disk9              vg_Cluster3_Disk9              -wi-ao  200.01G
>>   lv_Cluster3_libvert            vg_Cluster3_libvert            -wi-a-  100.00G
>>   lv_cluster3_disk1              vg_cluster3_disk1              -wi-ao  100.00G
>>   lv_cluster3_disk10             vg_cluster3_disk10             -wi-ao   15.00T
>>   lv_cluster3_disk2              vg_cluster3_disk2              -wi-ao  220.00G
>>   lv_cluster3_disk3              vg_cluster3_disk3              -wi-ao  330.00G
>>   lv_cluster3_disk4_1T-kvm-thin  vg_cluster3_disk4_1T-kvm-thin  -wi-a-    1.00T
>>   lv_cluster3_disk5              vg_cluster3_disk5              -wi-a-  555.00G
>>   lv_cluster3_disk6              vg_cluster3_disk6              -wi-ao    2.00T
>>   lv_cluster3_disk8              vg_cluster3_disk8              -wi-ao    2.00T
>
>
> --
> this is my life, and I live it for as long as God wills
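
One more note, going purely off the header row in my own group_tool -v output above ("type level name id state node id local_done"), so treat it as my guess rather than a documented answer: the trailing fields on the stuck line would read as node = 12, event id = c000b0002, local_done = 1. Node 12 (bl12-node12) also shows up in every other gfs mount group but is absent from cluster3_disk2's member list, which makes it look like its leave is what never completed. A quick way to eyeball this from any node:

  group_tool -v | grep -A1 LEAVE_START_WAIT   # stuck group plus its current member list
  cman_tool nodes                             # full cluster membership, for comparison

If that reading is right, bl12-node12 is where I'd start digging.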
--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster