Darn it, forgot versions:

Red Hat Linux 6.2 (kernel 2.6.32)
cman-3.0.12.1
corosync-1.4.1
pacemaker-1.1.6

On 3/2/12 6:12 PM, William Seligman wrote:
> One step forward, two steps back.
> 
> I'm working on a two-node primary-primary cluster. I'm debugging problems I have
> with the ocf:heartbeat:exportfs resource. For some reason, pacemaker sometimes
> appears to ignore ordering I put on the resources.
> 
> Florian Haas recommended pastebin in another thread, so let's give it a try.
> Here's my complete current output of "crm configure show":
> 
> <http://pastebin.com/bbSsqyeu>
> 
> Here's a quick sketch: The sequence of events is supposed to be DRBD (ms) ->
> clvmd (clone) -> gfs2 (clone) -> exportfs (clone).
> 
> But that's not what happens. What happens is that pacemaker tries to start up
> the exportfs resource immediately. This fails, because what it's exporting
> doesn't exist until after gfs2 runs. Because the cloned resource can't run on
> either node, the cluster goes into a state in which one node is fenced and the
> other node refuses to run anything.
> 
> Here's a quick snapshot I was able to take of the output of crm_mon that shows
> the problem:
> 
> <http://pastebin.com/CiZvS4Fh>
> 
> This shows that pacemaker is still trying to start the exportfs resources,
> before it has run the chain drbd->clvmd->gfs2.
> 
> Just to confirm the obvious, I have the ordering constraints in the full
> configuration linked above ("Admin" is my DRBD resource):
> 
> order Admin_Before_Clvmd inf: AdminClone:promote ClvmdClone:start
> order Clvmd_Before_Gfs2 inf: ClvmdClone Gfs2Clone
> order Gfs2_Before_Exports inf: Gfs2Clone ExportsClone
> 
> This is not the only time I've observed this behavior in pacemaker. Here's a
> lengthy log file excerpt from the same time I took the crm_mon snapshot:
> 
> <http://pastebin.com/HwMUCmcX>
> 
> I can see that other resources, the symlink ones in particular, are being probed
> and started before the drbd Admin resource has a chance to be promoted. In
> looking at the log file, it may help to know that /mail and /var/nevis are gfs2
> partitions that aren't mounted until the Gfs2 resource starts.
> 
> So this isn't the first time I've seen this happen. This is just the first time
> I've been able to reproduce this reliably and capture a snapshot.
> 
> Any ideas?
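
To spell out what I expect the three order constraints quoted above to mean,
here is a minimal Python sketch. It is not anything pacemaker itself runs; it
just parses the constraint lines and topologically sorts them. The helper names
parse_order and intended_sequence are made up for illustration, it assumes
Python 3.9+ (for graphlib), and it only handles the simple two-resource form,
ignoring the :promote/:start actions.

```python
from graphlib import TopologicalSorter

# The three order constraints quoted above, copied verbatim.
CONSTRAINTS = """\
order Admin_Before_Clvmd inf: AdminClone:promote ClvmdClone:start
order Clvmd_Before_Gfs2 inf: ClvmdClone Gfs2Clone
order Gfs2_Before_Exports inf: Gfs2Clone ExportsClone
"""


def parse_order(line):
    """Return (first, then) resource names from one 'order' line.

    Hypothetical helper: handles only the two-resource form used above
    and drops any ':action' suffix such as ':promote'.
    """
    fields = line.split()
    # fields: ['order', <constraint id>, '<score>:', <first>, <then>]
    first, then = fields[3], fields[4]
    return first.split(":")[0], then.split(":")[0]


def intended_sequence(text):
    """Topologically sort the resources named in the 'order' lines."""
    sorter = TopologicalSorter()
    for line in text.splitlines():
        if line.startswith("order "):
            first, then = parse_order(line)
            sorter.add(then, first)  # 'then' must come after 'first'
    return list(sorter.static_order())


if __name__ == "__main__":
    print(" -> ".join(intended_sequence(CONSTRAINTS)))
```

This prints AdminClone -> ClvmdClone -> Gfs2Clone -> ExportsClone, which is the
chain I sketched in the original message. It only models the sequence the
constraints are meant to express; it says nothing about why the exportfs
resources are being started early.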


-- 
Bill Seligman             | Phone: (914) 591-2823
Nevis Labs, Columbia Univ | mailto://selig...@nevis.columbia.edu
PO Box 137                |
Irvington NY 10533 USA    | http://www.nevis.columbia.edu/~seligman/

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
