Re: [Linux-HA] DRBD in a 2 node cluster

Jason Fitzpatrick Wed, 11 Feb 2009 05:06:50 -0800

Hi Dominik

Thanks for the follow up, please find the file attached


Jason


2009/2/11 Dominik Klein <d...@in-telegence.net>

> Hi Jason
>
> Jason Fitzpatrick wrote:
> > I have disabled the services and run
> > drbdadm secondary all
> > drbdadm detach all
> > drbdadm down all
> > service drbd stop
> >
> > before testing as far as I can see (cat /proc/drbd on both nodes) drbd is
> > shutdown
> >
> > cat: /proc/drbd: No such file or directory
>
> Good.
>
> > I have taken the command that heartbeat is running (drbdsetup /dev/drbd0
> > disk /dev/sdb /dev/sdb internal --set-defaults --create-device
> > --on-io-error=pass_on')
>
> The RA actually runs "drbdadm up", which translates into this.
>
> > and run it against the nodes when heartbeat is not
> > in control and this command will bring the resources online, but
> re-running
> > this command will generate the error, so I am kind of leaning twords the
> > command being run twice?
>
> Never seen the cluster do that.
>
> Please post your configuration and logs. hb_report should gather
> everything needed and put it into a nice .bz2 archive :)
>
> Regards
> Dominik
>
> > Thanks
> >
> > Jason
> >
> > 2009/2/11 Dominik Klein <d...@in-telegence.net>
> >
> >> Hi Jason
> >>
> >> any chance you started drbd at boot or the drbd device was active at the
> >> time you started the cluster resource? If so, read the introduction of
> >> the howto again and correct your setup.
> >>
> >> Jason Fitzpatrick wrote:
> >>> Hi Dominik
> >>>
> >>> I have upgraded to HB 2.9xx and have been following the instructions
> that
> >>> you provided (thanks for those) and have added a resource as follows
> >>>
> >>> crm
> >>> configure
> >>> primitive Storage1 ocf:heartbeat:drbd \
> >>> params drbd_resource=Storage1 \
> >>> op monitor role=Master interval=59s timeout=30s \
> >>> op monitor role=Slave interval=60s timeout=30s
> >>> ms DRBD_Storage Storage1 \
> >>> meta clone-max=2 notify=true globally-unique=false target-role=stopped
> >>> commit
> >>> exit
> >>>
> >>> no errors are reported and the resource is visable from within the
> hb_gui
> >>>
> >>> when I try to bring the resource online with
> >>>
> >>> crm resource start DRBD_Storage
> >>>
> >>> I see the resource attempt to come online and then fail, it seems to be
> >>> starting the services, changing the status of the devices to attached
> >> (from
> >>> detached) but not setting any device to master
> >>>
> >>> the following is from the ha-log
> >>>
> >>> crmd[8020]: 2009/02/10_17:22:32 info: do_lrm_rsc_op: Performing
> >>> key=7:166:0:b57f7f7c-4e2d-4134-9c14-b1a2b7db11a7 op=Storage1:1_start_0
> )
> >>> lrmd[8016]: 2009/02/10_17:22:32 info: rsc:Storage1:1: start
> >>> lrmd[8016]: 2009/02/10_17:22:32 info: RA output:
> >> (Storage1:1:start:stdout)
> >>> /dev/drbd0: Failure: (124) Device is attached to a disk (use detach
> >> first)
> >>> Command
> >>>  'drbdsetup /dev/drbd0 disk /dev/sdb /dev/sdb internal --set-defaults
> >>> --create-device --on-io-error=pass_on' terminated with exit code 10
> >> This looks like "drbdadm up" is failing because the device is already
> >> attached to the lower level storage device.
> >>
> >> Regards
> >> Dominik
> >>
> >>> drbd[22270]:    2009/02/10_17:22:32 ERROR: Storage1 start: not in
> >> Secondary
> >>> mode after start.
> >>> crmd[8020]: 2009/02/10_17:22:32 info: process_lrm_event: LRM operation
> >>> Storage1:1_start_0 (call=189, rc=1, cib-update=380, confirmed=true)
> >> complete
> >>> unknown e
> >>> rror
> >>> .
> >>>
> >>> I have checked the DRBD device Storage1 and it is in secondary mode
> after
> >>> the start, and should I choose I can make it primary on either node
> >>>
> >>> Thanks
> >>>
> >>> Jason
> >>>
> >>> 2009/2/10 Jason Fitzpatrick <jayfitzpatr...@gmail.com>
> >>>
> >>>> Thanks,
> >>>>
> >>>> This was the latest version in the Fedora Repos, I will upgrade and
> see
> >>>> what happens
> >>>>
> >>>> Jason
> >>>>
> >>>> 2009/2/10 Dominik Klein <d...@in-telegence.net>
> >>>>
> >>>> Jason Fitzpatrick wrote:
> >>>>>>> Hi All
> >>>>>>>
> >>>>>>> I am having a hell of a time trying to get heartbeat to fail over
> my
> >>>>> DRBD
> >>>>>>> harddisk and am hoping for some help.
> >>>>>>>
> >>>>>>> I have a 2 node cluster, heartbeat is working as I am able to fail
> >> over
> >>>>> IP
> >>>>>>> Addresses and services successfully, but when I try to fail over my
> >>>>> DRBD
> >>>>>>> resource from secondary to primary I am hitting a brick wall, I can
> >>>>> fail
> >>>>>>> over the DRBD resource manually so I know that it does work at some
> >>>>> level
> >>>>>>> DRBD version 8.3 Heartbeat version heartbeat-2.1.3-1.fc9.i386
> >>>>> Please upgrade. Thats too old for reliable master/slave behaviour.
> >>>>> Preferrably upgrade to pacemaker and ais or heartbeat 2.99. Read
> >>>>> http://www.clusterlabs.org/wiki/Install for install notes.
> >>>>>
> >>>>>>> and using
> >>>>>>> heartbeat-gui to configure
> >>>>> Don't use the gui to configure complex (ie clone or master/slave)
> >>>>> resources.
> >>>>>
> >>>>> Once you upgraded to the latest pacemaker, please refer to
> >>>>> http://www.clusterlabs.org/wiki/DRBD_HowTo_1.0 for drbd's cluster
> >>>>> configuration.
> >>>>>
> >>>>> Regards
> >>>>> Dominik
> >>>>>
> >>>>>>> DRBD Resource is called Storage1, the 2 nodes are connected via 2
> >>>>> x-over
> >>>>>>> cables (1 heartbeat 1 Replication)
> >>>>>>>
> >>>>>>> I have stripped down my config to the bare bones and tried every
> >> option
> >>>>>>> that I can think off but know that I am missing something simple,
> >>>>>>>
> >>>>>>> I have attached my cib.xml but have removed domain names from the
> >>>>> systems
> >>>>>>> for privacy reasons
> >>>>>>>
> >>>>>>> Thanks in advance
> >>>>>>>
> >>>>>>> Jason
> >>>>>>>
> >>>>>>>  <cib admin_epoch="0" have_quorum="true" ignore_dtd="false"
> >>>>>>> cib_feature_revision="2.0" num_peers="2" generated="true"
> >>>>>>> ccm_transition="22" dc_uuid="9d8abc28-4fa3-408a-a695-fb36b0d67a48"
> >>>>>>> epoch="733" num_updates="1" cib-last-written="Mon Feb  9 18:31:19
> >>>>> 2009">
> >>>>>>>    <configuration>
> >>>>>>>      <crm_config>
> >>>>>>>        <cluster_property_set id="cib-bootstrap-options">
> >>>>>>>          <attributes>
> >>>>>>>            <nvpair id="cib-bootstrap-options-dc-version"
> >>>>> name="dc-version"
> >>>>>>> value="2.1.3-node: 552305612591183b1628baa5bc6e903e0f1e26a3"/>
> >>>>>>>            <nvpair name="last-lrm-refresh"
> >>>>>>> id="cib-bootstrap-options-last-lrm-refresh" value="1234204278"/>
> >>>>>>>          </attributes>
> >>>>>>>        </cluster_property_set>
> >>>>>>>      </crm_config>
> >>>>>>>      <nodes>
> >>>>>>>        <node id="df707752-d5fb-405a-8ca7-049e25a227b7"
> >>>>> uname="lpissan1001"
> >>>>>>> type="normal">
> >>>>>>>          <instance_attributes
> >>>>>>> id="nodes-df707752-d5fb-405a-8ca7-049e25a227b7">
> >>>>>>>            <attributes>
> >>>>>>>              <nvpair
> >> id="standby-df707752-d5fb-405a-8ca7-049e25a227b7"
> >>>>>>> name="standby" value="off"/>
> >>>>>>>            </attributes>
> >>>>>>>          </instance_attributes>
> >>>>>>>        </node>
> >>>>>>>        <node id="9d8abc28-4fa3-408a-a695-fb36b0d67a48"
> >>>>> uname="lpissan1002"
> >>>>>>> type="normal">
> >>>>>>>          <instance_attributes
> >>>>>>> id="nodes-9d8abc28-4fa3-408a-a695-fb36b0d67a48">
> >>>>>>>            <attributes>
> >>>>>>>              <nvpair
> >> id="standby-9d8abc28-4fa3-408a-a695-fb36b0d67a48"
> >>>>>>> name="standby" value="off"/>
> >>>>>>>            </attributes>
> >>>>>>>          </instance_attributes>
> >>>>>>>        </node>
> >>>>>>>      </nodes>
> >>>>>>>      <resources>
> >>>>>>>        <master_slave id="Storage1">
> >>>>>>>          <meta_attributes id="Storage1_meta_attrs">
> >>>>>>>            <attributes>
> >>>>>>>              <nvpair id="Storage1_metaattr_target_role"
> >>>>> name="target_role"
> >>>>>>> value="started"/>
> >>>>>>>              <nvpair id="Storage1_metaattr_clone_max"
> >> name="clone_max"
> >>>>>>> value="2"/>
> >>>>>>>              <nvpair id="Storage1_metaattr_clone_node_max"
> >>>>>>> name="clone_node_max" value="1"/>
> >>>>>>>              <nvpair id="Storage1_metaattr_master_max"
> >>>>> name="master_max"
> >>>>>>> value="1"/>
> >>>>>>>              <nvpair id="Storage1_metaattr_master_node_max"
> >>>>>>> name="master_node_max" value="1"/>
> >>>>>>>              <nvpair id="Storage1_metaattr_notify" name="notify"
> >>>>>>> value="true"/>
> >>>>>>>              <nvpair id="Storage1_metaattr_globally_unique"
> >>>>>>> name="globally_unique" value="false"/>
> >>>>>>>            </attributes>
> >>>>>>>          </meta_attributes>
> >>>>>>>          <primitive id="Storage1" class="ocf" type="drbd"
> >>>>>>> provider="heartbeat">
> >>>>>>>            <instance_attributes id="Storage1_instance_attrs">
> >>>>>>>              <attributes>
> >>>>>>>                <nvpair id="273a1bb2-4867-42dd-a9e5-7cebbf48ef3b"
> >>>>>>> name="drbd_resource" value="Storage1"/>
> >>>>>>>              </attributes>
> >>>>>>>            </instance_attributes>
> >>>>>>>            <operations>
> >>>>>>>              <op id="9ddc0ce9-4090-4546-a7d5-787fe47de872"
> >>>>> name="monitor"
> >>>>>>> description="master" interval="29" timeout="10" start_delay="1m"
> >>>>>>> role="Master"/>
> >>>>>>>              <op id="56a7508f-fa42-46f8-9924-3b284cdb97f0"
> >>>>> name="monitor"
> >>>>>>> description="slave" interval="29" timeout="10" start_delay="1m"
> >>>>>>> role="Slave"/>
> >>>>>>>            </operations>
> >>>>>>>          </primitive>
> >>>>>>>        </master_slave>
> >>>>>>>      </resources>
> >>>>>>>      <constraints/>
> >>>>>>>    </configuration>
> >>>>>>>  </cib>
> _______________________________________________
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>

report_1.tar.bz2
Description: BZip2 compressed data

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

Re: [Linux-HA] DRBD in a 2 node cluster

Reply via email to