On Monday 04 February 2008, Mike Toler wrote:
> I'm finally able to run my DRBD/HA NFS server on a V1 setup without
> serious issue. My failovers work correctly and NFS service takes only a
> minor interruption when a server is lost. The only thing I'm still
> having problems using V1 with is SNMP.
>
> Now, as an exercise in masochism, I'm trying to convert it over to V2 so
> that I can use all the nifty new HA V2 functions. (We also already are
> using SNMP with a V2 HA setup for some of our other components, so I'm
> hoping this will also fix my last issue there.)
>
> My problem:
>
> Using the information from the "DRBD/HowTov2: Linux HA" page I should be
> able to easily set up the DRBD portion. However, my config fails to pass
> the "crm_verify" command.
>
> crm_verify -L -V
> crm_verify[5272]: 2008/02/04_15:58:52 WARN: unpack_rsc_op: Processing failed op drbd0:0_start_0 on nfs_server1.prodea.local.lab: Error
> crm_verify[5272]: 2008/02/04_15:58:52 WARN: unpack_rsc_op: Compatability handling for failed op drbd0:0_start_0 on nfs_server1.prodea.local.lab
> crm_verify[5272]: 2008/02/04_15:58:52 WARN: unpack_rsc_op: Processing failed op drbd0:1_start_0 on nfs_server1.prodea.local.lab: Error
> crm_verify[5272]: 2008/02/04_15:58:52 WARN: unpack_rsc_op: Compatability handling for failed op drbd0:1_start_0 on nfs_server1.prodea.local.lab
> crm_verify[5272]: 2008/02/04_15:58:52 WARN: native_color: Resource drbd0:0 cannot run anywhere
> crm_verify[5272]: 2008/02/04_15:58:52 WARN: native_color: Resource drbd0:1 cannot run anywhere
> crm_verify[5272]: 2008/02/04_15:58:52 WARN: native_color: Resource fs0 cannot run anywhere
> Warnings found during check: config may not be valid
>
> My nodes seem to be named correctly (when viewed through uname -a):
>
> [EMAIL PROTECTED] ha.d]# uname -a
> Linux nfs_server1.prodea.local.lab 2.6.9-55.ELsmp #1 SMP Fri Apr 20 17:03:35 EDT 2007 i686 i686 i386 GNU/Linux
>
> Why would the DRBD resource not be able to run anywhere? I followed
> the instructions from the setup page pretty much to the letter, with the
> only change being that the DRBD resource name on my system is "r0"
> instead of "drbd0".
As I recall, I worked around this a week or two ago by adding a location
constraint for each group of resources; the DRBD resource is in the
group, along with a floating IP and a few other things. The constraint
snippet is below (see the P.S. for one way to load it).

<constraints>
  <rsc_location id="location_iscsi" rsc="group_iscsi">
    <rule id="prefered_location_iscsi" score="100">
      <expression attribute="#uname" id="33825ff5-9614-462f-bc25-d371d863a155" operation="eq" value="host1"/>
    </rule>
  </rsc_location>
  <rsc_location id="location_mysql" rsc="group_mysql">
    <rule id="prefered_location_mysql" score="100">
      <expression attribute="#uname" id="1be8da01-4dc8-495c-b6bd-ecf013534a72" operation="eq" value="host2"/>
    </rule>
  </rsc_location>
</constraints>

> Here are my CIB file and the important parts of the drbd.conf file,
> along with a snippet from the /var/log/messages file.
>
> /var/log/messages
>
> Feb 4 15:46:11 nfs_server1 crmd: [4573]: info: do_lrm_rsc_op: Performing op=drbd0:0_start_0 key=5:1:bffa1a55-8ea4-4c1c-91bc-599bf9e6d49e)
> Feb 4 15:46:11 nfs_server1 lrmd: [4570]: info: rsc:drbd0:0: start
> Feb 4 15:46:11 nfs_server1 drbd[4850]: INFO: r0: Using hostname node_0
> Feb 4 15:46:11 nfs_server1 lrmd: [4570]: info: RA output: (drbd0:0:start:stdout) /etc/drbd.conf:395: in resource r0, on nfs_server1.prodea.local.lab { ... } ... on nfs_server2.prodea.local.lab { ... }: There are multiple host sections for the peer. Maybe misspelled local host name 'node_0'? /etc/drbd.conf:395: in resource r0, there is no host section for this host. Missing 'on node_0 {...}' ?
> Feb 4 15:46:11 nfs_server1 drbd[4850]: ERROR: r0 start: not in Secondary mode after start.
> Feb 4 15:46:11 nfs_server1 crmd: [4573]: ERROR: process_lrm_event: LRM operation drbd0:0_start_0 (call=7, rc=1) Error unknown error
> Feb 4 15:46:11 nfs_server1 tengine: [4575]: WARN: status_from_rc: Action start on nfs_server1.prodea.local.lab failed (target: <null> vs. rc: 1): Error
> Feb 4 15:46:11 nfs_server1 tengine: [4575]: WARN: update_failcount: Updating failcount for drbd0:0 on 1d040f02-a506-4c46-b661-319c5e024e10 after failed start: rc=1
>
> cib.xml
>
> <cib generated="false" admin_epoch="0" have_quorum="true" ignore_dtd="false" num_peers="0" cib_feature_revision="2.0" epoch="14" num_updates="1" cib-last-written="Mon Feb 4 15:45:54 2008" ccm_transition="1">
>   <configuration>
>     <crm_config>
>       <cluster_property_set id="cib-bootstrap-options">
>         <attributes>
>           <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="2.1.3-node: 552305612591183b1628baa5bc6e903e0f1e26a3"/>
>           <nvpair id="cib-bootstrap-options-last-lrm-refresh" name="last-lrm-refresh" value="1202136349"/>
>         </attributes>
>       </cluster_property_set>
>     </crm_config>
>     <nodes>
>       <node id="20f292a2-876b-4b71-a3c1-5802d4af9b2d" uname="nfs_server2.prodea.local.lab" type="normal">
>         <instance_attributes id="nodes-20f292a2-876b-4b71-a3c1-5802d4af9b2d">
>           <attributes>
>             <nvpair id="standby-20f292a2-876b-4b71-a3c1-5802d4af9b2d" name="standby" value="off"/>
>           </attributes>
>         </instance_attributes>
>       </node>
>       <node id="1d040f02-a506-4c46-b661-319c5e024e10" uname="nfs_server1.prodea.local.lab" type="normal"/>
>     </nodes>
>     <resources>
>       <master_slave id="ms-drbd0">
>         <meta_attributes id="ma-ms-drbd0">
>           <attributes>
>             <nvpair id="ma-ms-drbd0-1" name="clone_max" value="2"/>
>             <nvpair id="ma-ms-drbd0-2" name="clone_node_max" value="1"/>
>             <nvpair id="ma-ms-drbd0-3" name="master_max" value="1"/>
>             <nvpair id="ma-ms-drbd0-4" name="master_node_max" value="1"/>
>             <nvpair id="ma-ms-drbd0-5" name="notify" value="yes"/>
>             <nvpair id="ma-ms-drbd0-6" name="globally_unique" value="false"/>
>             <nvpair id="ma-ms-drbd0-7" name="target_role" value="stopped"/>
>           </attributes>
>         </meta_attributes>
>         <primitive class="ocf" provider="heartbeat" type="drbd" id="drbd0">
>           <instance_attributes id="ia-drbd0">
>             <attributes>
>               <nvpair name="drbd_resource" id="ia-drbd0-1" value="r0"/>
>               <nvpair id="ia-drbd0-2" name="clone_overrides_hostname" value="yes"/>
>               <nvpair id="drbd0:0_target_role" name="target_role" value="started"/>
>             </attributes>
>           </instance_attributes>
>         </primitive>
>       </master_slave>
>       <primitive class="ocf" provider="heartbeat" type="Filesystem" id="fs0">
>         <meta_attributes id="ma-fs0">
>           <attributes>
>             <nvpair name="target_role" id="ma-fs0-1" value="stopped"/>
>           </attributes>
>         </meta_attributes>
>         <instance_attributes id="ia-fs0">
>           <attributes>
>             <nvpair id="ia-fs0-1" name="fstype" value="ext3"/>
>             <nvpair id="ia-fs0-2" name="directory" value="/mnt/share1"/>
>             <nvpair id="ia-fs0-3" name="device" value="/dev/drbd0"/>
>           </attributes>
>         </instance_attributes>
>       </primitive>
>       <primitive class="ocf" provider="heartbeat" type="IPaddr" id="ip0">
>         <instance_attributes id="ia-ip0">
>           <attributes>
>             <nvpair id="ia-ip0-1" name="ip" value="172.24.1.167"/>
>           </attributes>
>         </instance_attributes>
>       </primitive>
>     </resources>
>     <constraints>
>       <rsc_location id="location-ip0" rsc="ip0">
>         <rule id="ip0-rule-1" score="-INFINITY">
>           <expression id="exp-ip0-1" value="a" attribute="site" operation="eq"/>
>         </rule>
>       </rsc_location>
>       <rsc_order id="order_drbd0_ip0" to="ip0" from="ms-drbd0"/>
>       <rsc_order id="drbd0_before_fs0" from="fs0" action="start" to="ms-drbd0" to_action="promote"/>
>       <rsc_colocation id="fs0_on_drbd0" to="ms-drbd0" to_role="master" from="fs0" score="infinity"/>
>       <rsc_colocation id="colo_drbd0_ip0" to="ip0" from="drbd0:0" score="infinity"/>
>     </constraints>
>   </configuration>
> </cib>
>
> drbd.conf
>
> resource r0 {
>   protocol C;
>
>   . . .
>
>   on nfs_server1.prodea.local.lab {
>     device    /dev/drbd0;
>     disk      /dev/sdc1;
>     address   172.24.1.160:7788;
>     meta-disk /dev/sdb1[0];
>   }
>   on nfs_server2.prodea.local.lab {
>     device    /dev/drbd0;
>     disk      /dev/sdc1;
>     address   172.24.1.159:7788;
>     meta-disk /dev/sdb1[0];
>   }
> }
>
> Michael Toler
> System Test Engineer
> Prodea Systems, Inc.
> 214-278-1834 (office)
> 972-816-7790 (mobile)

--
Michael
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems