I have disabled the services and run the following on both nodes before testing:

  drbdadm secondary all
  drbdadm detach all
  drbdadm down all
  service drbd stop
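In script form, together with the check I use afterwards, that sequence looks
roughly like this on each node:

  #!/bin/sh
  # Take DRBD out of service cleanly: demote any Primary device, detach
  # from the backing disks, tear the devices down, then stop the service.
  drbdadm secondary all
  drbdadm detach all
  drbdadm down all
  service drbd stop

  # With the module unloaded, /proc/drbd disappears entirely:
  cat /proc/drbd   # expect: cat: /proc/drbd: No such file or directory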
As far as I can see (cat /proc/drbd on both nodes), DRBD is shut down:

  cat: /proc/drbd: No such file or directory

I have taken the command that heartbeat is running (drbdsetup /dev/drbd0 disk
/dev/sdb /dev/sdb internal --set-defaults --create-device
--on-io-error=pass_on) and run it against the nodes when heartbeat is not in
control. Run once, this command will bring the resource online, but re-running
it generates the error, so I am kind of leaning towards the command being run
twice?

Thanks

Jason

2009/2/11 Dominik Klein <d...@in-telegence.net>

> Hi Jason
>
> Any chance you started drbd at boot, or the drbd device was active at the
> time you started the cluster resource? If so, read the introduction of
> the howto again and correct your setup.
>
> Jason Fitzpatrick wrote:
> > Hi Dominik
> >
> > I have upgraded to HB 2.9xx and have been following the instructions
> > that you provided (thanks for those), and have added a resource as
> > follows:
> >
> > crm
> > configure
> > primitive Storage1 ocf:heartbeat:drbd \
> >   params drbd_resource=Storage1 \
> >   op monitor role=Master interval=59s timeout=30s \
> >   op monitor role=Slave interval=60s timeout=30s
> > ms DRBD_Storage Storage1 \
> >   meta clone-max=2 notify=true globally-unique=false target-role=stopped
> > commit
> > exit
> >
> > No errors are reported, and the resource is visible from within the
> > hb_gui.
> >
> > When I try to bring the resource online with
> >
> > crm resource start DRBD_Storage
> >
> > I see the resource attempt to come online and then fail. It seems to be
> > starting the services and changing the status of the devices to attached
> > (from detached), but not setting any device to master.
> >
> > The following is from the ha-log:
> >
> > crmd[8020]: 2009/02/10_17:22:32 info: do_lrm_rsc_op: Performing
> > key=7:166:0:b57f7f7c-4e2d-4134-9c14-b1a2b7db11a7 op=Storage1:1_start_0 )
> > lrmd[8016]: 2009/02/10_17:22:32 info: rsc:Storage1:1: start
> > lrmd[8016]: 2009/02/10_17:22:32 info: RA output: (Storage1:1:start:stdout)
> > /dev/drbd0: Failure: (124) Device is attached to a disk (use detach first)
> > Command 'drbdsetup /dev/drbd0 disk /dev/sdb /dev/sdb internal
> > --set-defaults --create-device --on-io-error=pass_on' terminated with
> > exit code 10
>
> This looks like "drbdadm up" failing because the device is already
> attached to the lower-level storage device.
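>
> A quick way to check that on the node in question is roughly this (using
> your resource name Storage1; you want to see Diskless before the cluster
> starts it):
>
>   cat /proc/drbd            # the "ds:" field shows the disk state
>   drbdadm dstate Storage1   # disk state for just this resource
>   drbdadm detach Storage1   # detach so the cluster's attach can succeed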
>
> Regards
> Dominik
>
> > drbd[22270]: 2009/02/10_17:22:32 ERROR: Storage1 start: not in Secondary
> > mode after start.
> > crmd[8020]: 2009/02/10_17:22:32 info: process_lrm_event: LRM operation
> > Storage1:1_start_0 (call=189, rc=1, cib-update=380, confirmed=true)
> > complete unknown error
> >
> > I have checked the DRBD device Storage1, and it is in Secondary mode
> > after the start; should I choose, I can make it Primary on either node.
> >
> > Thanks
> >
> > Jason
> >
> > 2009/2/10 Jason Fitzpatrick <jayfitzpatr...@gmail.com>
> >
> >> Thanks,
> >>
> >> This was the latest version in the Fedora repos. I will upgrade and see
> >> what happens.
> >>
> >> Jason
> >>
> >> 2009/2/10 Dominik Klein <d...@in-telegence.net>
> >>
> >>> Jason Fitzpatrick wrote:
> >>>>> Hi All
> >>>>>
> >>>>> I am having a hell of a time trying to get heartbeat to fail over my
> >>>>> DRBD hard disk and am hoping for some help.
> >>>>>
> >>>>> I have a 2-node cluster. Heartbeat is working, as I am able to fail
> >>>>> over IP addresses and services successfully, but when I try to fail
> >>>>> over my DRBD resource from secondary to primary I am hitting a brick
> >>>>> wall. I can fail over the DRBD resource manually, so I know that it
> >>>>> does work at some level.
> >>>>>
> >>>>> DRBD version 8.3, Heartbeat version heartbeat-2.1.3-1.fc9.i386
> >>>
> >>> Please upgrade. That's too old for reliable master/slave behaviour.
> >>> Preferably upgrade to pacemaker and ais, or heartbeat 2.99. Read
> >>> http://www.clusterlabs.org/wiki/Install for install notes.
> >>>
> >>>>> and using heartbeat-gui to configure
> >>>
> >>> Don't use the gui to configure complex (ie clone or master/slave)
> >>> resources.
> >>>
> >>> Once you have upgraded to the latest pacemaker, please refer to
> >>> http://www.clusterlabs.org/wiki/DRBD_HowTo_1.0 for drbd's cluster
> >>> configuration.
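> >>>
> >>> To give you a rough idea of where that howto ends up (names such as
> >>> drbd0, ms-drbd0 and fs0 are only examples), the crm shell configuration
> >>> is along these lines:
> >>>
> >>>   primitive drbd0 ocf:heartbeat:drbd \
> >>>     params drbd_resource=r0 \
> >>>     op monitor role=Master interval=59s timeout=30s \
> >>>     op monitor role=Slave interval=60s timeout=30s
> >>>   ms ms-drbd0 drbd0 \
> >>>     meta clone-max=2 notify=true globally-unique=false
> >>>   primitive fs0 ocf:heartbeat:Filesystem \
> >>>     params device=/dev/drbd0 directory=/mnt/data fstype=ext3
> >>>   colocation fs0-on-ms-drbd0 inf: fs0 ms-drbd0:Master
> >>>   order ms-drbd0-before-fs0 inf: ms-drbd0:promote fs0:start
> >>>
> >>> The colocation and order lines matter: they make sure the filesystem
> >>> only runs where drbd has been promoted to Master, and only after the
> >>> promotion has happened.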
<nvpair id="Storage1_metaattr_master_node_max" > >>>>> name="master_node_max" value="1"/> > >>>>> <nvpair id="Storage1_metaattr_notify" name="notify" > >>>>> value="true"/> > >>>>> <nvpair id="Storage1_metaattr_globally_unique" > >>>>> name="globally_unique" value="false"/> > >>>>> </attributes> > >>>>> </meta_attributes> > >>>>> <primitive id="Storage1" class="ocf" type="drbd" > >>>>> provider="heartbeat"> > >>>>> <instance_attributes id="Storage1_instance_attrs"> > >>>>> <attributes> > >>>>> <nvpair id="273a1bb2-4867-42dd-a9e5-7cebbf48ef3b" > >>>>> name="drbd_resource" value="Storage1"/> > >>>>> </attributes> > >>>>> </instance_attributes> > >>>>> <operations> > >>>>> <op id="9ddc0ce9-4090-4546-a7d5-787fe47de872" > >>> name="monitor" > >>>>> description="master" interval="29" timeout="10" start_delay="1m" > >>>>> role="Master"/> > >>>>> <op id="56a7508f-fa42-46f8-9924-3b284cdb97f0" > >>> name="monitor" > >>>>> description="slave" interval="29" timeout="10" start_delay="1m" > >>>>> role="Slave"/> > >>>>> </operations> > >>>>> </primitive> > >>>>> </master_slave> > >>>>> </resources> > >>>>> <constraints/> > >>>>> </configuration> > >>>>> </cib> > >>>>> > >>>>> > >>>> _______________________________________________ > >>>> Linux-HA mailing list > >>>> Linux-HA@lists.linux-ha.org > >>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha > >>>> See also: http://linux-ha.org/ReportingProblems > >>>> > >>> _______________________________________________ > >>> Linux-HA mailing list > >>> Linux-HA@lists.linux-ha.org > >>> http://lists.linux-ha.org/mailman/listinfo/linux-ha > >>> See also: http://linux-ha.org/ReportingProblems > >>> > >> > > _______________________________________________ > > Linux-HA mailing list > > Linux-HA@lists.linux-ha.org > > http://lists.linux-ha.org/mailman/listinfo/linux-ha > > See also: http://linux-ha.org/ReportingProblems > > > > _______________________________________________ > Linux-HA mailing list > Linux-HA@lists.linux-ha.org > http://lists.linux-ha.org/mailman/listinfo/linux-ha > See also: http://linux-ha.org/ReportingProblems > _______________________________________________ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems