On Monday 04 February 2008, Mike Toler wrote:
> I'm finally able to run my DRBD/HA NFS server on a V1 setup without
> serious issue. My failovers work correctly and NFS service takes only a
> minor interruption when a server is lost. The only thing I'm still
> having problems using V1 with is SNMP.
>
> Now, as an exercise in masochism, I'm trying to convert it over to V2 so
> that I can use all the nifty new HA V2 functions. (We also already are
> using SNMP with a V2 HA setup for some of our other components, so I'm
> hoping this will also fix my last issue there.)
>
> My problem:
>
> Using the information from the "DRBD/HowTov2: Linux HA" page I should be
> able to easily set up the DRBD portion. However, my config fails to pass
> the "crm_verify" command.
>
> crm_verify -L -V
> crm_verify[5272]: 2008/02/04_15:58:52 WARN: unpack_rsc_op: Processing failed op drbd0:0_start_0 on nfs_server1.prodea.local.lab: Error
> crm_verify[5272]: 2008/02/04_15:58:52 WARN: unpack_rsc_op: Compatability handling for failed op drbd0:0_start_0 on nfs_server1.prodea.local.lab
> crm_verify[5272]: 2008/02/04_15:58:52 WARN: unpack_rsc_op: Processing failed op drbd0:1_start_0 on nfs_server1.prodea.local.lab: Error
> crm_verify[5272]: 2008/02/04_15:58:52 WARN: unpack_rsc_op: Compatability handling for failed op drbd0:1_start_0 on nfs_server1.prodea.local.lab
> crm_verify[5272]: 2008/02/04_15:58:52 WARN: native_color: Resource drbd0:0 cannot run anywhere
> crm_verify[5272]: 2008/02/04_15:58:52 WARN: native_color: Resource drbd0:1 cannot run anywhere
> crm_verify[5272]: 2008/02/04_15:58:52 WARN: native_color: Resource fs0 cannot run anywhere
> Warnings found during check: config may not be valid
>
> My nodes seem to be named correctly (when viewed through uname -a):
>
> [EMAIL PROTECTED] ha.d]# uname -a
> Linux nfs_server1.prodea.local.lab 2.6.9-55.ELsmp #1 SMP Fri Apr 20 17:03:35 EDT 2007 i686 i686 i386 GNU/Linux
>
> Why would the DRBD resource not be able to run anywhere? I followed
> the instructions from the setup page pretty much to the letter, with the
> only change being that the DRBD resource name on my system is "r0"
> instead of "drbd0".
As I recall, I worked around this a week or two ago by adding a location
constraint for each group of resources; the DRBD resource is in the
group, along with a floating IP and a few other things. The constraint
snippet is below (see the P.S. for one way to load it).

<constraints>
  <rsc_location id="location_iscsi" rsc="group_iscsi">
    <rule id="prefered_location_iscsi" score="100">
      <expression attribute="#uname" id="33825ff5-9614-462f-bc25-d371d863a155" operation="eq" value="host1"/>
    </rule>
  </rsc_location>
  <rsc_location id="location_mysql" rsc="group_mysql">
    <rule id="prefered_location_mysql" score="100">
      <expression attribute="#uname" id="1be8da01-4dc8-495c-b6bd-ecf013534a72" operation="eq" value="host2"/>
    </rule>
  </rsc_location>
</constraints>

> Here are my CIB file and the important parts of the drbd.conf file,
> along with a snippet from the /var/log/messages file.
>
> /var/log/messages
>
> Feb 4 15:46:11 nfs_server1 crmd: [4573]: info: do_lrm_rsc_op: Performing op=drbd0:0_start_0 key=5:1:bffa1a55-8ea4-4c1c-91bc-599bf9e6d49e)
> Feb 4 15:46:11 nfs_server1 lrmd: [4570]: info: rsc:drbd0:0: start
> Feb 4 15:46:11 nfs_server1 drbd[4850]: INFO: r0: Using hostname node_0
> Feb 4 15:46:11 nfs_server1 lrmd: [4570]: info: RA output: (drbd0:0:start:stdout) /etc/drbd.conf:395: in resource r0, on nfs_server1.prodea.local.lab { ... } ... on nfs_server2.prodea.local.lab { ... }: There are multiple host sections for the peer. Maybe misspelled local host name 'node_0'? /etc/drbd.conf:395: in resource r0, there is no host section for this host. Missing 'on node_0 {...}' ?
> Feb 4 15:46:11 nfs_server1 drbd[4850]: ERROR: r0 start: not in Secondary mode after start.
> Feb 4 15:46:11 nfs_server1 crmd: [4573]: ERROR: process_lrm_event: LRM operation drbd0:0_start_0 (call=7, rc=1) Error unknown error
> Feb 4 15:46:11 nfs_server1 tengine: [4575]: WARN: status_from_rc: Action start on nfs_server1.prodea.local.lab failed (target: <null> vs. rc: 1): Error
> Feb 4 15:46:11 nfs_server1 tengine: [4575]: WARN: update_failcount: Updating failcount for drbd0:0 on 1d040f02-a506-4c46-b661-319c5e024e10 after failed start: rc=1
>
> cib.xml
>
> <cib generated="false" admin_epoch="0" have_quorum="true" ignore_dtd="false" num_peers="0" cib_feature_revision="2.0" epoch="14" num_updates="1" cib-last-written="Mon Feb 4 15:45:54 2008" ccm_transition="1">
>   <configuration>
>     <crm_config>
>       <cluster_property_set id="cib-bootstrap-options">
>         <attributes>
>           <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="2.1.3-node: 552305612591183b1628baa5bc6e903e0f1e26a3"/>
>           <nvpair id="cib-bootstrap-options-last-lrm-refresh" name="last-lrm-refresh" value="1202136349"/>
>         </attributes>
>       </cluster_property_set>
>     </crm_config>
>     <nodes>
>       <node id="20f292a2-876b-4b71-a3c1-5802d4af9b2d" uname="nfs_server2.prodea.local.lab" type="normal">
>         <instance_attributes id="nodes-20f292a2-876b-4b71-a3c1-5802d4af9b2d">
>           <attributes>
>             <nvpair id="standby-20f292a2-876b-4b71-a3c1-5802d4af9b2d" name="standby" value="off"/>
>           </attributes>
>         </instance_attributes>
>       </node>
>       <node id="1d040f02-a506-4c46-b661-319c5e024e10" uname="nfs_server1.prodea.local.lab" type="normal"/>
>     </nodes>
>     <resources>
>       <master_slave id="ms-drbd0">
>         <meta_attributes id="ma-ms-drbd0">
>           <attributes>
>             <nvpair id="ma-ms-drbd0-1" name="clone_max" value="2"/>
>             <nvpair id="ma-ms-drbd0-2" name="clone_node_max" value="1"/>
>             <nvpair id="ma-ms-drbd0-3" name="master_max" value="1"/>
>             <nvpair id="ma-ms-drbd0-4" name="master_node_max" value="1"/>
>             <nvpair id="ma-ms-drbd0-5" name="notify" value="yes"/>
>             <nvpair id="ma-ms-drbd0-6" name="globally_unique" value="false"/>
>             <nvpair id="ma-ms-drbd0-7" name="target_role" value="stopped"/>
>           </attributes>
>         </meta_attributes>
>         <primitive class="ocf" provider="heartbeat" type="drbd" id="drbd0">
>           <instance_attributes id="ia-drbd0">
>             <attributes>
>               <nvpair name="drbd_resource" id="ia-drbd0-1" value="r0"/>
>               <nvpair id="ia-drbd0-2" name="clone_overrides_hostname" value="yes"/>
>               <nvpair id="drbd0:0_target_role" name="target_role" value="started"/>
>             </attributes>
>           </instance_attributes>
>         </primitive>
>       </master_slave>
>       <primitive class="ocf" provider="heartbeat" type="Filesystem" id="fs0">
>         <meta_attributes id="ma-fs0">
>           <attributes>
>             <nvpair name="target_role" id="ma-fs0-1" value="stopped"/>
>           </attributes>
>         </meta_attributes>
>         <instance_attributes id="ia-fs0">
>           <attributes>
>             <nvpair id="ia-fs0-1" name="fstype" value="ext3"/>
>             <nvpair id="ia-fs0-2" name="directory" value="/mnt/share1"/>
>             <nvpair id="ia-fs0-3" name="device" value="/dev/drbd0"/>
>           </attributes>
>         </instance_attributes>
>       </primitive>
>       <primitive class="ocf" provider="heartbeat" type="IPaddr" id="ip0">
>         <instance_attributes id="ia-ip0">
>           <attributes>
>             <nvpair id="ia-ip0-1" name="ip" value="172.24.1.167"/>
>           </attributes>
>         </instance_attributes>
>       </primitive>
>     </resources>
>     <constraints>
>       <rsc_location id="location-ip0" rsc="ip0">
>         <rule id="ip0-rule-1" score="-INFINITY">
>           <expression id="exp-ip0-1" value="a" attribute="site" operation="eq"/>
>         </rule>
>       </rsc_location>
>       <rsc_order id="order_drbd0_ip0" to="ip0" from="ms-drbd0"/>
>       <rsc_order id="drbd0_before_fs0" from="fs0" action="start" to="ms-drbd0" to_action="promote"/>
>       <rsc_colocation id="fs0_on_drbd0" to="ms-drbd0" to_role="master" from="fs0" score="infinity"/>
>       <rsc_colocation id="colo_drbd0_ip0" to="ip0" from="drbd0:0" score="infinity"/>
>     </constraints>
>   </configuration>
> </cib>
>
> drbd.conf
>
> resource r0 {
>   protocol C;
>
>   . . .
>
>   on nfs_server1.prodea.local.lab {
>     device    /dev/drbd0;
>     disk      /dev/sdc1;
>     address   172.24.1.160:7788;
>     meta-disk /dev/sdb1[0];
>   }
>   on nfs_server2.prodea.local.lab {
>     device    /dev/drbd0;
>     disk      /dev/sdc1;
>     address   172.24.1.159:7788;
>     meta-disk /dev/sdb1[0];
>   }
> }
>
> Michael Toler
> System Test Engineer
> Prodea Systems, Inc.
> 214-278-1834 (office)
> 972-816-7790 (mobile)

--
Michael
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems