Hello William, sorry to bother you, but I would like to know if you can show me your /etc/cluster/cluster.conf.
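In particular I'd like to see the <clusternodes> and <fencedevices> sections. For a two-node cman+pacemaker cluster that only exists to provide membership and fencing for clvmd, I would expect something roughly like the sketch below. It is only a sketch: the cluster name is made up, the node names are guessed from your posts, and the fence_pcmk redirection is an assumption on my part (it is the usual way to hand cman's fencing requests over to pacemaker), so adjust it to whatever you really use:

    <?xml version="1.0"?>
    <cluster name="nevis" config_version="1">
      <!-- two_node="1" lets cman keep quorum with a single surviving node -->
      <cman two_node="1" expected_votes="1"/>
      <clusternodes>
        <clusternode name="hypatia-tb.nevis.columbia.edu" nodeid="1">
          <fence>
            <method name="pcmk-redirect">
              <!-- redirect fencing of this node to pacemaker's stonith -->
              <device name="pcmk" port="hypatia-tb.nevis.columbia.edu"/>
            </method>
          </fence>
        </clusternode>
        <clusternode name="orestes-corosync.nevis.columbia.edu" nodeid="2">
          <fence>
            <method name="pcmk-redirect">
              <device name="pcmk" port="orestes-corosync.nevis.columbia.edu"/>
            </method>
          </fence>
        </clusternode>
      </clusternodes>
      <fencedevices>
        <!-- fence_pcmk asks pacemaker/stonith to do the actual power cut -->
        <fencedevice name="pcmk" agent="fence_pcmk"/>
      </fencedevices>
    </cluster>

If the <fence> sections for your nodes are empty, that by itself could explain why dlm/clvmd recovery blocks after a node crash.
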
On 23 March 2012 at 21:50, William Seligman <selig...@nevis.columbia.edu> wrote:

> On 3/22/12 2:43 PM, William Seligman wrote:
> > On 3/20/12 4:55 PM, Lars Ellenberg wrote:
> >> On Fri, Mar 16, 2012 at 05:06:04PM -0400, William Seligman wrote:
> >>> On 3/16/12 12:12 PM, William Seligman wrote:
> >>>> On 3/16/12 7:02 AM, Andreas Kurz wrote:
> >>>>> On 03/15/2012 11:50 PM, William Seligman wrote:
> >>>>>> On 3/15/12 6:07 PM, William Seligman wrote:
> >>>>>>> On 3/15/12 6:05 PM, William Seligman wrote:
> >>>>>>>> On 3/15/12 4:57 PM, emmanuel segura wrote:
> >>>>>>>>
> >>>>>>>>> We can try to understand what happens when clvmd hangs.
> >>>>>>>>>
> >>>>>>>>> Edit /etc/lvm/lvm.conf, change level = 7 in the log section, and
> >>>>>>>>> uncomment this line:
> >>>>>>>>>
> >>>>>>>>> file = "/var/log/lvm2.log"
> >>>>>>>>
> >>>>>>>> Here's the tail end of the file (the original is 1.6M). Because there are
> >>>>>>>> no timestamps in the log, it's hard for me to point you to the point where
> >>>>>>>> I crashed the other system. I think (though I'm not sure) that the crash
> >>>>>>>> happened after the last occurrence of
> >>>>>>>>
> >>>>>>>> cache/lvmcache.c:1484  Wiping internal VG cache
> >>>>>>>>
> >>>>>>>> Honestly, it looks like a wall of text to me. Does it suggest anything to you?
> >>>>>>>
> >>>>>>> Maybe it would help if I included the link to the pastebin where I put the
> >>>>>>> output: <http://pastebin.com/8pgW3Muw>
> >>>>>>
> >>>>>> Could the problem be with lvm+drbd?
> >>>>>>
> >>>>>> In lvm2.log, I see this sequence of lines pre-crash:
> >>>>>>
> >>>>>> device/dev-io.c:535  Opened /dev/md0 RO O_DIRECT
> >>>>>> device/dev-io.c:271  /dev/md0: size is 1027968 sectors
> >>>>>> device/dev-io.c:137  /dev/md0: block size is 1024 bytes
> >>>>>> device/dev-io.c:588  Closed /dev/md0
> >>>>>> device/dev-io.c:271  /dev/md0: size is 1027968 sectors
> >>>>>> device/dev-io.c:535  Opened /dev/md0 RO O_DIRECT
> >>>>>> device/dev-io.c:137  /dev/md0: block size is 1024 bytes
> >>>>>> device/dev-io.c:588  Closed /dev/md0
> >>>>>> filters/filter-composite.c:31  Using /dev/md0
> >>>>>> device/dev-io.c:535  Opened /dev/md0 RO O_DIRECT
> >>>>>> device/dev-io.c:137  /dev/md0: block size is 1024 bytes
> >>>>>> label/label.c:186  /dev/md0: No label detected
> >>>>>> device/dev-io.c:588  Closed /dev/md0
> >>>>>> device/dev-io.c:535  Opened /dev/drbd0 RO O_DIRECT
> >>>>>> device/dev-io.c:271  /dev/drbd0: size is 5611549368 sectors
> >>>>>> device/dev-io.c:137  /dev/drbd0: block size is 4096 bytes
> >>>>>> device/dev-io.c:588  Closed /dev/drbd0
> >>>>>> device/dev-io.c:271  /dev/drbd0: size is 5611549368 sectors
> >>>>>> device/dev-io.c:535  Opened /dev/drbd0 RO O_DIRECT
> >>>>>> device/dev-io.c:137  /dev/drbd0: block size is 4096 bytes
> >>>>>> device/dev-io.c:588  Closed /dev/drbd0
> >>>>>>
> >>>>>> I interpret this: look at /dev/md0, get some info, close; look at /dev/drbd0,
> >>>>>> get some info, close.
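A side note on that scan sequence: on clustered VGs that sit on DRBD it is common to restrict LVM's filter so that only the DRBD device (plus whatever holds your local, non-clustered PVs) is ever scanned, and the DRBD backing disk is not. Purely as an illustration, and assuming /dev/drbd0 is your only clustered PV (you would have to add accept patterns for any local PVs yourself), the devices section of /etc/lvm/lvm.conf could look like:

    devices {
        # accept DRBD devices, reject everything else -- extend the "a|...|"
        # patterns for any local PVs before using this on a real system
        filter = [ "a|^/dev/drbd.*|", "r|.*|" ]
        write_cache_state = 0
    }

That does not explain the hang by itself, but it keeps the scan from touching the backing device underneath drbd0.
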
> >>>>>>
> >>>>>> Post-crash, I see:
> >>>>>>
> >>>>>> device/dev-io.c:535  Opened /dev/md0 RO O_DIRECT
> >>>>>> device/dev-io.c:271  /dev/md0: size is 1027968 sectors
> >>>>>> device/dev-io.c:137  /dev/md0: block size is 1024 bytes
> >>>>>> device/dev-io.c:588  Closed /dev/md0
> >>>>>> device/dev-io.c:271  /dev/md0: size is 1027968 sectors
> >>>>>> device/dev-io.c:535  Opened /dev/md0 RO O_DIRECT
> >>>>>> device/dev-io.c:137  /dev/md0: block size is 1024 bytes
> >>>>>> device/dev-io.c:588  Closed /dev/md0
> >>>>>> filters/filter-composite.c:31  Using /dev/md0
> >>>>>> device/dev-io.c:535  Opened /dev/md0 RO O_DIRECT
> >>>>>> device/dev-io.c:137  /dev/md0: block size is 1024 bytes
> >>>>>> label/label.c:186  /dev/md0: No label detected
> >>>>>> device/dev-io.c:588  Closed /dev/md0
> >>>>>> device/dev-io.c:535  Opened /dev/drbd0 RO O_DIRECT
> >>>>>> device/dev-io.c:271  /dev/drbd0: size is 5611549368 sectors
> >>>>>> device/dev-io.c:137  /dev/drbd0: block size is 4096 bytes
> >>>>>>
> >>>>>> ... and then it hangs. Comparing the two, it looks like it can't close /dev/drbd0.
> >>>>>>
> >>>>>> If I look at /proc/drbd when I crash one node, I see this:
> >>>>>>
> >>>>>> # cat /proc/drbd
> >>>>>> version: 8.3.12 (api:88/proto:86-96)
> >>>>>> GIT-hash: e2a8ef4656be026bbae540305fcb998a5991090f build by
> >>>>>> r...@hypatia-tb.nevis.columbia.edu, 2012-02-28 18:01:34
> >>>>>>  0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C s-----
> >>>>>>     ns:7000064 nr:0 dw:0 dr:7049728 al:0 bm:516 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
> >>>>>
> >>>>> s----- ... DRBD suspended I/O, most likely because of its fencing policy. For
> >>>>> valid dual-primary setups you have to use the "resource-and-stonith" policy and
> >>>>> a working "fence-peer" handler. In this mode I/O is suspended until fencing of
> >>>>> the peer is successful. The question is why the peer does _not_ also suspend
> >>>>> its I/O, because obviously fencing was not successful ...
> >>>>>
> >>>>> So with a correct DRBD configuration one of your nodes should already have
> >>>>> been fenced because of the connection loss between the nodes (on the drbd
> >>>>> replication link).
> >>>>>
> >>>>> You can use e.g. that nice fencing script:
> >>>>>
> >>>>> http://goo.gl/O4N8f
> >>>>
> >>>> This is the output of "drbdadm dump admin": <http://pastebin.com/kTxvHCtx>
> >>>>
> >>>> So I've got resource-and-stonith. I gather from an earlier thread that
> >>>> obliterate-peer.sh is more-or-less equivalent in functionality to
> >>>> stonith_admin_fence_peer.sh:
> >>>>
> >>>> <http://www.gossamer-threads.com/lists/linuxha/users/78504#78504>
> >>>>
> >>>> At the moment I'm pursuing the possibility that I'm returning the wrong
> >>>> return codes from my fencing agent:
> >>>>
> >>>> <http://www.gossamer-threads.com/lists/linuxha/users/78572>
> >>>
> >>> I cleaned up my fencing agent, making sure its return codes matched those
> >>> returned by other agents in /usr/sbin/fence_, and allowing for some delay
> >>> issues in reading the UPS status. But...
> >>>
> >>>> After that, I'll look at another suggestion with lvm.conf:
> >>>>
> >>>> <http://www.gossamer-threads.com/lists/linuxha/users/78796#78796>
> >>>>
> >>>> Then I'll try DRBD 8.4.1. Hopefully one of these is the source of the issue.
> >>>
> >>> Failure on all three counts.
> >>
> >> May I suggest you double check the permissions on your fence-peer script?
> >> I suspect you may simply have forgotten the "chmod +x".
> >>
> >> Test with "drbdadm fence-peer minor-0" from the command line.
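For reference, the pieces Andreas and Lars are talking about live in the DRBD resource definition. With DRBD 8.3 it would look roughly like this -- the resource name "admin" is taken from the "drbdadm dump admin" above, and the script path is only a placeholder for wherever the fence-peer handler is actually installed:

    resource admin {
      disk {
        # suspend I/O on replication-link loss until the peer has been fenced
        fencing resource-and-stonith;
      }
      handlers {
        # handler invoked on connection loss (and by "drbdadm fence-peer")
        fence-peer "/usr/local/sbin/stonith_admin-fence-peer.sh";
      }
      # (device/disk/address sections as you already have them)
    }

The handler also has to exit with the return codes DRBD expects (7 for "peer has been fenced", if I remember the table in the DRBD documentation correctly); otherwise DRBD keeps I/O suspended even though the peer was actually powered off.
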
> >
> > I still haven't solved the problem, but this advice has gotten me further than
> > before.
> >
> > First, Lars was correct: I did not have execute permissions set on my fence-peer
> > scripts. (D'oh!) I turned them on, but that did not change anything: cman+clvmd
> > still hung on the vgdisplay command if I crashed the peer node.
> >
> > I started up both nodes again (cman+pacemaker+drbd+clvmd) and tried Lars'
> > suggested command. I didn't save the response for this message (d'oh again!) but
> > it said that the fence-peer script had failed.
> >
> > Hmm. The peer was definitely shutting down, so my fencing script is working. I
> > went over it, comparing the return codes to those of the existing scripts, and
> > made some changes. Here's my current script: <http://pastebin.com/nUnYVcBK>.
> >
> > Up until now my fence-peer scripts had been either Lon Hohberger's
> > obliterate-peer.sh or Digimer's rhcs_fence. I decided to try the
> > stonith_admin-fence-peer.sh that Andreas Kurz recommended; unlike the first two
> > scripts, which fence using fence_node, the latter script just calls stonith_admin.
> >
> > When I tried the stonith_admin-fence-peer.sh script, it worked:
> >
> > # drbdadm fence-peer minor-0
> > stonith_admin-fence-peer.sh[10886]: stonith_admin successfully fenced peer
> > orestes-corosync.nevis.columbia.edu.
> >
> > Power was cut on the peer, and the remaining node stayed up. Then I brought up
> > the peer with:
> >
> > stonith_admin -U orestes-corosync.nevis.columbia.edu
> >
> > BUT: when the restored peer came up and started to run cman, clvmd hung on the
> > main node again.
> >
> > After cycling through some more tests, I found that if I brought down the peer
> > with drbdadm, then brought up the peer with no HA services, then started drbd
> > and then cman, the cluster remained intact.
> >
> > If I crashed the peer, the scheme in the previous paragraph didn't work. I bring
> > up drbd, check that the disks are both UpToDate, then bring up cman. At that
> > point the vgdisplay on the main node takes so long to run that clvmd will time
> > out:
> >
> > vgdisplay
> > Error locking on node orestes-corosync.nevis.columbia.edu: Command timed out
> >
> > I timed how long it took vgdisplay to run. I might be able to work around this
> > by setting the timeout on my clvmd resource to 300s, but that seems to be a
> > band-aid for an underlying problem. Any suggestions on what else I could check?
>
> I've done some more tests. Still no solution, just an observation. The "death
> mode" appears to be:
>
> - Two nodes running cman+pacemaker+drbd+clvmd.
> - Take one node down = one remaining node with cman+pacemaker+drbd+clvmd.
> - Start up the dead node. If it ever gets into a state in which it's running cman
>   but not clvmd, clvmd on the uncrashed node hangs.
> - Conversely, if I bring up drbd, make it primary, and then start cman+clvmd,
>   there's no problem on the uncrashed node.
>
> My guess is that clvmd is getting the number of nodes it expects from cman. When
> the formerly dead node starts running cman, the number of cluster nodes goes to
> 2 (I checked with 'cman_tool status'), but the number of nodes running clvmd is
> still 1, hence the hang.
>
> Does this guess make sense?
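It would help to see what the cluster stack itself thinks at that moment. clvmd takes its locks through the DLM, and a DLM lockspace stuck waiting for fencing to complete can look exactly like what you describe: cman sees two nodes, but every LVM command hangs. Next time vgdisplay hangs, could you run something like the following on the surviving node and post the output:

    cman_tool status    # node/vote count and quorum as cman sees it
    cman_tool nodes     # per-node membership state
    fence_tool ls       # fence domain: has the crashed node really been fenced?
    dlm_tool ls         # DLM lockspaces -- look at the state of the "clvmd" lockspace
    group_tool ls       # fence/dlm groups, shows whether any group is stuck in a transition

And of course the /etc/cluster/cluster.conf I asked for above, so we can see how fenced is configured.
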
>
> --
> Bill Seligman              | Phone: (914) 591-2823
> Nevis Labs, Columbia Univ  | mailto://selig...@nevis.columbia.edu
> PO Box 137                 |
> Irvington NY 10533 USA     | http://www.nevis.columbia.edu/~seligman/

--
this is my life and I live it as long as God wills

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems