Hi list-denizens.

I'm sorry if this has been posted to the list before, but my search didn't turn up anything that seemed applicable beyond the first few results I found.

I've got a two-node cluster, nfs1 and nfs2. It's running drbd 0.7, drbdlinks, an IPaddr resource, a filesystem resource, and an LSB nfs-kernel-server resource. I've got co-location constraints that keep everything on the same node as the drbd resource, and ordering constraints like so:

drbd -> Filesystem -> drbdlinks -> IPaddr -> nfs-kernel-server

Now, this is (mostly) working. It correctly discovers that drbd is already in primary state, mounts the filesystem, starts drbdlinks, creates the interface alias... and then does nothing else. nfs-kernel-server doesn't come up unless I start it myself. nfs2 wins the crm election process, so in its syslog, I find a number of these:

tengine: [22991]: info: mask(tengine.c:cib_action_updated): Initiating action 5: start nfs_kernel_server on nfs1

And nfs1 soon after shows:

crmd: [7092]: WARN: lrm_get_rsc(653): got a return code HA_FAIL from a reply message of getrsc with function get_ret_from_msg.
crmd: [7092]: ERROR: mask(lrm.c:do_lrm_rsc_op): Triggered dev assert at lrm.c:746 : type != NULL

A bit of Googling led me to a bug thread about something similar, which basically said that the nfs-kernel-server init script wasn't lsb-compliant because it didn't contain a status function. I patched the init script to fix this, but nfs-kernel-server still isn't being started or managed by heartbeat. Can anyone tell me where I'm going wrong/what else I need to try/what other information I need to provide in order for help to be given? Please?
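For reference, here is a minimal sketch of the kind of status action I mean. It is not the exact patch I applied. It assumes that `pidof` can see the nfsd threads on the system, and the function name `nfs_status` is just a name I made up for illustration. The exit codes follow the LSB convention: 0 when the service is running, 3 when it is not.

```shell
#!/bin/sh
# Hypothetical sketch of an LSB-style "status" action for an init script.
# Assumption: the service counts as running if pidof finds a process with
# the given name (nfsd by default).

nfs_status() {
    # $1: process name to look for; defaults to nfsd (an assumption)
    proc="${1:-nfsd}"
    if pidof "$proc" >/dev/null 2>&1; then
        echo "$proc running"
        return 0    # LSB: program is running
    fi
    echo "$proc not running"
    return 3        # LSB: program is not running
}

# Inside the init script this would be wired into the usual dispatch, e.g.:
#   case "$1" in
#     status) nfs_status; exit $? ;;
#   esac
```

As far as I understand it, the cluster decides what to do from the exit code rather than the printed text, so running the script by hand with `status` and then checking `echo $?` in both the started and the stopped state should show whether it returns 0 and 3 as expected.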

Regards,

Adrian Overbury
Inomial Pty Ltd
[EMAIL PROTECTED]



_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
