Hi Jason

Jason Fitzpatrick wrote:
> I have disabled the services and run
>
> drbdadm secondary all
> drbdadm detach all
> drbdadm down all
> service drbd stop
>
> before testing. As far as I can see (cat /proc/drbd on both nodes) drbd is
> shut down:
>
> cat: /proc/drbd: No such file or directory
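[Editor's note: the check quoted above -- no /proc/drbd at all once the
module is unloaded -- can be wrapped in a small helper. This is only a
sketch; drbd_is_down is a made-up name, not part of any DRBD tooling, and
the optional path argument exists purely so the check can be exercised
without DRBD installed.]

```shell
#!/bin/sh
# Sketch: confirm DRBD is completely down before handing the device to
# the cluster. When the drbd kernel module is unloaded, /proc/drbd is
# absent -- the "No such file or directory" state quoted above.
# The optional argument overrides the path for testing.
drbd_is_down() {
  [ ! -e "${1:-/proc/drbd}" ]
}

if drbd_is_down; then
  echo "drbd is down"
else
  echo "drbd still active -- run the drbdadm shutdown sequence first"
fi
```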
Good.

> I have taken the command that heartbeat is running (drbdsetup /dev/drbd0
> disk /dev/sdb /dev/sdb internal --set-defaults --create-device
> --on-io-error=pass_on)

The RA actually runs "drbdadm up", which translates into this.

> and run it against the nodes when heartbeat is not in control, and this
> command will bring the resources online, but re-running this command will
> generate the error, so I am kind of leaning towards the command being run
> twice?

Never seen the cluster do that. Please post your configuration and logs.
hb_report should gather everything needed and put it into a nice .bz2
archive :)

Regards
Dominik

> Thanks
>
> Jason
>
> 2009/2/11 Dominik Klein <d...@in-telegence.net>
>
>> Hi Jason
>>
>> Any chance you started drbd at boot, or the drbd device was active at the
>> time you started the cluster resource? If so, read the introduction of
>> the howto again and correct your setup.
>>
>> Jason Fitzpatrick wrote:
>>> Hi Dominik
>>>
>>> I have upgraded to HB 2.9xx and have been following the instructions
>>> that you provided (thanks for those) and have added a resource as
>>> follows:
>>>
>>> crm
>>> configure
>>> primitive Storage1 ocf:heartbeat:drbd \
>>>   params drbd_resource=Storage1 \
>>>   op monitor role=Master interval=59s timeout=30s \
>>>   op monitor role=Slave interval=60s timeout=30s
>>> ms DRBD_Storage Storage1 \
>>>   meta clone-max=2 notify=true globally-unique=false target-role=stopped
>>> commit
>>> exit
>>>
>>> No errors are reported and the resource is visible from within the
>>> hb_gui.
>>>
>>> When I try to bring the resource online with
>>>
>>> crm resource start DRBD_Storage
>>>
>>> I see the resource attempt to come online and then fail. It seems to be
>>> starting the services, changing the status of the devices to attached
>>> (from detached) but not setting any device to master.
>>>
>>> The following is from the ha-log:
>>>
>>> crmd[8020]: 2009/02/10_17:22:32 info: do_lrm_rsc_op: Performing
>>> key=7:166:0:b57f7f7c-4e2d-4134-9c14-b1a2b7db11a7 op=Storage1:1_start_0 )
>>> lrmd[8016]: 2009/02/10_17:22:32 info: rsc:Storage1:1: start
>>> lrmd[8016]: 2009/02/10_17:22:32 info: RA output: (Storage1:1:start:stdout)
>>> /dev/drbd0: Failure: (124) Device is attached to a disk (use detach first)
>>> Command
>>> 'drbdsetup /dev/drbd0 disk /dev/sdb /dev/sdb internal --set-defaults
>>> --create-device --on-io-error=pass_on' terminated with exit code 10

>> This looks like "drbdadm up" is failing because the device is already
>> attached to the lower level storage device.
>>
>> Regards
>> Dominik
>>
>>> drbd[22270]: 2009/02/10_17:22:32 ERROR: Storage1 start: not in Secondary
>>> mode after start.
>>> crmd[8020]: 2009/02/10_17:22:32 info: process_lrm_event: LRM operation
>>> Storage1:1_start_0 (call=189, rc=1, cib-update=380, confirmed=true)
>>> complete unknown error.
>>>
>>> I have checked the DRBD device Storage1 and it is in secondary mode
>>> after the start, and should I choose, I can make it primary on either
>>> node.
>>>
>>> Thanks
>>>
>>> Jason
>>>
>>> 2009/2/10 Jason Fitzpatrick <jayfitzpatr...@gmail.com>
>>>
>>>> Thanks,
>>>>
>>>> This was the latest version in the Fedora repos; I will upgrade and see
>>>> what happens.
>>>>
>>>> Jason
>>>>
>>>> 2009/2/10 Dominik Klein <d...@in-telegence.net>
>>>>
>>>>> Jason Fitzpatrick wrote:
>>>>>>> Hi All
>>>>>>>
>>>>>>> I am having a hell of a time trying to get heartbeat to fail over
>>>>>>> my DRBD hard disk and am hoping for some help.
>>>>>>>
>>>>>>> I have a 2 node cluster. Heartbeat is working, as I am able to fail
>>>>>>> over IP addresses and services successfully, but when I try to fail
>>>>>>> over my DRBD resource from secondary to primary I am hitting a
>>>>>>> brick wall. I can fail over the DRBD resource manually, so I know
>>>>>>> that it does work at some level.
>>>>>>>
>>>>>>> DRBD version 8.3, Heartbeat version heartbeat-2.1.3-1.fc9.i386
>>>>> Please upgrade.
>>>>> That's too old for reliable master/slave behaviour.
>>>>> Preferably upgrade to pacemaker and ais, or heartbeat 2.99. Read
>>>>> http://www.clusterlabs.org/wiki/Install for install notes.
>>>>>
>>>>>>> and using heartbeat-gui to configure
>>>>> Don't use the gui to configure complex (ie clone or master/slave)
>>>>> resources.
>>>>>
>>>>> Once you upgraded to the latest pacemaker, please refer to
>>>>> http://www.clusterlabs.org/wiki/DRBD_HowTo_1.0 for drbd's cluster
>>>>> configuration.
>>>>>
>>>>> Regards
>>>>> Dominik
>>>>>
>>>>>>> The DRBD resource is called Storage1; the 2 nodes are connected via
>>>>>>> 2 x-over cables (1 heartbeat, 1 replication).
>>>>>>>
>>>>>>> I have stripped down my config to the bare bones and tried every
>>>>>>> option that I can think of, but know that I am missing something
>>>>>>> simple.
>>>>>>>
>>>>>>> I have attached my cib.xml but have removed domain names from the
>>>>>>> systems for privacy reasons.
>>>>>>>
>>>>>>> Thanks in advance
>>>>>>>
>>>>>>> Jason
>>>>>>>
>>>>>>> <cib admin_epoch="0" have_quorum="true" ignore_dtd="false"
>>>>>>>     cib_feature_revision="2.0" num_peers="2" generated="true"
>>>>>>>     ccm_transition="22" dc_uuid="9d8abc28-4fa3-408a-a695-fb36b0d67a48"
>>>>>>>     epoch="733" num_updates="1" cib-last-written="Mon Feb 9 18:31:19 2009">
>>>>>>>   <configuration>
>>>>>>>     <crm_config>
>>>>>>>       <cluster_property_set id="cib-bootstrap-options">
>>>>>>>         <attributes>
>>>>>>>           <nvpair id="cib-bootstrap-options-dc-version" name="dc-version"
>>>>>>>               value="2.1.3-node: 552305612591183b1628baa5bc6e903e0f1e26a3"/>
>>>>>>>           <nvpair name="last-lrm-refresh"
>>>>>>>               id="cib-bootstrap-options-last-lrm-refresh" value="1234204278"/>
>>>>>>>         </attributes>
>>>>>>>       </cluster_property_set>
>>>>>>>     </crm_config>
>>>>>>>     <nodes>
>>>>>>>       <node id="df707752-d5fb-405a-8ca7-049e25a227b7" uname="lpissan1001"
>>>>>>>           type="normal">
>>>>>>>         <instance_attributes
>>>>>>>             id="nodes-df707752-d5fb-405a-8ca7-049e25a227b7">
>>>>>>>           <attributes>
>>>>>>>             <nvpair id="standby-df707752-d5fb-405a-8ca7-049e25a227b7"
>>>>>>>                 name="standby" value="off"/>
>>>>>>>           </attributes>
>>>>>>>         </instance_attributes>
>>>>>>>       </node>
>>>>>>>       <node id="9d8abc28-4fa3-408a-a695-fb36b0d67a48" uname="lpissan1002"
>>>>>>>           type="normal">
>>>>>>>         <instance_attributes
>>>>>>>             id="nodes-9d8abc28-4fa3-408a-a695-fb36b0d67a48">
>>>>>>>           <attributes>
>>>>>>>             <nvpair id="standby-9d8abc28-4fa3-408a-a695-fb36b0d67a48"
>>>>>>>                 name="standby" value="off"/>
>>>>>>>           </attributes>
>>>>>>>         </instance_attributes>
>>>>>>>       </node>
>>>>>>>     </nodes>
>>>>>>>     <resources>
>>>>>>>       <master_slave id="Storage1">
>>>>>>>         <meta_attributes id="Storage1_meta_attrs">
>>>>>>>           <attributes>
>>>>>>>             <nvpair id="Storage1_metaattr_target_role" name="target_role"
>>>>>>>                 value="started"/>
>>>>>>>             <nvpair id="Storage1_metaattr_clone_max" name="clone_max"
>>>>>>>                 value="2"/>
>>>>>>>             <nvpair id="Storage1_metaattr_clone_node_max"
>>>>>>>                 name="clone_node_max" value="1"/>
>>>>>>>             <nvpair id="Storage1_metaattr_master_max" name="master_max"
>>>>>>>                 value="1"/>
>>>>>>>             <nvpair id="Storage1_metaattr_master_node_max"
>>>>>>>                 name="master_node_max" value="1"/>
>>>>>>>             <nvpair id="Storage1_metaattr_notify" name="notify"
>>>>>>>                 value="true"/>
>>>>>>>             <nvpair id="Storage1_metaattr_globally_unique"
>>>>>>>                 name="globally_unique" value="false"/>
>>>>>>>           </attributes>
>>>>>>>         </meta_attributes>
>>>>>>>         <primitive id="Storage1" class="ocf" type="drbd"
>>>>>>>             provider="heartbeat">
>>>>>>>           <instance_attributes id="Storage1_instance_attrs">
>>>>>>>             <attributes>
>>>>>>>               <nvpair id="273a1bb2-4867-42dd-a9e5-7cebbf48ef3b"
>>>>>>>                   name="drbd_resource" value="Storage1"/>
>>>>>>>             </attributes>
>>>>>>>           </instance_attributes>
>>>>>>>           <operations>
>>>>>>>             <op id="9ddc0ce9-4090-4546-a7d5-787fe47de872" name="monitor"
>>>>>>>                 description="master" interval="29" timeout="10"
>>>>>>>                 start_delay="1m" role="Master"/>
>>>>>>>             <op
>>>>>>>                 id="56a7508f-fa42-46f8-9924-3b284cdb97f0" name="monitor"
>>>>>>>                 description="slave" interval="29" timeout="10"
>>>>>>>                 start_delay="1m" role="Slave"/>
>>>>>>>           </operations>
>>>>>>>         </primitive>
>>>>>>>       </master_slave>
>>>>>>>     </resources>
>>>>>>>     <constraints/>
>>>>>>>   </configuration>
>>>>>>> </cib>

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
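[Editor's note: for readers hitting the same exit code 10, the failure in
this thread boils down to the attach step not being idempotent: running it
against an already-attached /dev/drbd0 always fails. A minimal sketch of a
guard follows; need_up is a made-up helper, and the dstate strings are
typical `drbdadm dstate` output -- verify them against your DRBD version.]

```shell
#!/bin/sh
# Sketch: "drbdadm up" is not idempotent -- attaching an already-attached
# device produces the exit code 10 / "Device is attached to a disk" error
# shown in the log. need_up() decides from `drbdadm dstate` output
# whether the attach step is still safe to run.
need_up() {
  case $1 in
    Diskless*|"") return 0 ;;   # nothing attached yet: up is safe
    *)            return 1 ;;   # already attached: up would fail
  esac
}

# Example decisions against typical dstate strings:
need_up "Diskless/DUnknown"  && echo "would run: drbdadm up Storage1"
need_up "UpToDate/UpToDate"  || echo "would skip: already attached"
```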