Hi Jason

Jason Fitzpatrick wrote:
> I have disabled the services and run
>
> drbdadm secondary all
> drbdadm detach all
> drbdadm down all
> service drbd stop
>
> before testing. As far as I can see (cat /proc/drbd on both nodes) drbd is
> shut down:
>
> cat: /proc/drbd: No such file or directory
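[Editor's note: the check quoted above -- no /proc/drbd at all once the
module is unloaded -- can be wrapped in a small helper. This is only a
sketch; drbd_is_down is a made-up name, not part of any DRBD tooling, and
the optional path argument exists purely so the check can be exercised
without DRBD installed.]

```shell
#!/bin/sh
# Sketch: confirm DRBD is completely down before handing the device to
# the cluster. When the drbd kernel module is unloaded, /proc/drbd is
# absent -- the "No such file or directory" state quoted above.
# The optional argument overrides the path for testing.
drbd_is_down() {
  [ ! -e "${1:-/proc/drbd}" ]
}

if drbd_is_down; then
  echo "drbd is down"
else
  echo "drbd still active -- run the drbdadm shutdown sequence first"
fi
```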
Good.

> I have taken the command that heartbeat is running (drbdsetup /dev/drbd0
> disk /dev/sdb /dev/sdb internal --set-defaults --create-device
> --on-io-error=pass_on)

The RA actually runs "drbdadm up", which translates into this.

> and run it against the nodes when heartbeat is not in control, and this
> command will bring the resources online, but re-running this command will
> generate the error, so I am kind of leaning towards the command being run
> twice?

Never seen the cluster do that. Please post your configuration and logs.
hb_report should gather everything needed and put it into a nice .bz2
archive :)

Regards
Dominik

> Thanks
>
> Jason
>
> 2009/2/11 Dominik Klein <d...@in-telegence.net>
>
>> Hi Jason
>>
>> Any chance you started drbd at boot, or the drbd device was active at the
>> time you started the cluster resource? If so, read the introduction of
>> the howto again and correct your setup.
>>
>> Jason Fitzpatrick wrote:
>>> Hi Dominik
>>>
>>> I have upgraded to HB 2.9xx and have been following the instructions
>>> that you provided (thanks for those) and have added a resource as
>>> follows:
>>>
>>> crm
>>> configure
>>> primitive Storage1 ocf:heartbeat:drbd \
>>>   params drbd_resource=Storage1 \
>>>   op monitor role=Master interval=59s timeout=30s \
>>>   op monitor role=Slave interval=60s timeout=30s
>>> ms DRBD_Storage Storage1 \
>>>   meta clone-max=2 notify=true globally-unique=false target-role=stopped
>>> commit
>>> exit
>>>
>>> No errors are reported and the resource is visible from within the
>>> hb_gui.
>>>
>>> When I try to bring the resource online with
>>>
>>> crm resource start DRBD_Storage
>>>
>>> I see the resource attempt to come online and then fail. It seems to be
>>> starting the services, changing the status of the devices to attached
>>> (from detached) but not setting any device to master.
>>>
>>> The following is from the ha-log:
>>>
>>> crmd[8020]: 2009/02/10_17:22:32 info: do_lrm_rsc_op: Performing
>>> key=7:166:0:b57f7f7c-4e2d-4134-9c14-b1a2b7db11a7 op=Storage1:1_start_0 )
>>> lrmd[8016]: 2009/02/10_17:22:32 info: rsc:Storage1:1: start
>>> lrmd[8016]: 2009/02/10_17:22:32 info: RA output: (Storage1:1:start:stdout)
>>> /dev/drbd0: Failure: (124) Device is attached to a disk (use detach first)
>>> Command
>>> 'drbdsetup /dev/drbd0 disk /dev/sdb /dev/sdb internal --set-defaults
>>> --create-device --on-io-error=pass_on' terminated with exit code 10

>> This looks like "drbdadm up" is failing because the device is already
>> attached to the lower level storage device.
>>
>> Regards
>> Dominik
>>
>>> drbd[22270]: 2009/02/10_17:22:32 ERROR: Storage1 start: not in Secondary
>>> mode after start.
>>> crmd[8020]: 2009/02/10_17:22:32 info: process_lrm_event: LRM operation
>>> Storage1:1_start_0 (call=189, rc=1, cib-update=380, confirmed=true)
>>> complete unknown error.
>>>
>>> I have checked the DRBD device Storage1 and it is in secondary mode
>>> after the start, and should I choose, I can make it primary on either
>>> node.
>>>
>>> Thanks
>>>
>>> Jason
>>>
>>> 2009/2/10 Jason Fitzpatrick <jayfitzpatr...@gmail.com>
>>>
>>>> Thanks,
>>>>
>>>> This was the latest version in the Fedora repos; I will upgrade and see
>>>> what happens.
>>>>
>>>> Jason
>>>>
>>>> 2009/2/10 Dominik Klein <d...@in-telegence.net>
>>>>
>>>>> Jason Fitzpatrick wrote:
>>>>>>> Hi All
>>>>>>>
>>>>>>> I am having a hell of a time trying to get heartbeat to fail over
>>>>>>> my DRBD hard disk and am hoping for some help.
>>>>>>>
>>>>>>> I have a 2 node cluster. Heartbeat is working, as I am able to fail
>>>>>>> over IP addresses and services successfully, but when I try to fail
>>>>>>> over my DRBD resource from secondary to primary I am hitting a
>>>>>>> brick wall. I can fail over the DRBD resource manually, so I know
>>>>>>> that it does work at some level.
>>>>>>>
>>>>>>> DRBD version 8.3, Heartbeat version heartbeat-2.1.3-1.fc9.i386
>>>>> Please upgrade.
>>>>> That's too old for reliable master/slave behaviour.
>>>>> Preferably upgrade to pacemaker and ais, or heartbeat 2.99. Read
>>>>> http://www.clusterlabs.org/wiki/Install for install notes.
>>>>>
>>>>>>> and using heartbeat-gui to configure
>>>>> Don't use the gui to configure complex (ie clone or master/slave)
>>>>> resources.
>>>>>
>>>>> Once you upgraded to the latest pacemaker, please refer to
>>>>> http://www.clusterlabs.org/wiki/DRBD_HowTo_1.0 for drbd's cluster
>>>>> configuration.
>>>>>
>>>>> Regards
>>>>> Dominik
>>>>>
>>>>>>> The DRBD resource is called Storage1; the 2 nodes are connected via
>>>>>>> 2 x-over cables (1 heartbeat, 1 replication).
>>>>>>>
>>>>>>> I have stripped down my config to the bare bones and tried every
>>>>>>> option that I can think of, but know that I am missing something
>>>>>>> simple.
>>>>>>>
>>>>>>> I have attached my cib.xml but have removed domain names from the
>>>>>>> systems for privacy reasons.
>>>>>>>
>>>>>>> Thanks in advance
>>>>>>>
>>>>>>> Jason
>>>>>>>
>>>>>>> <cib admin_epoch="0" have_quorum="true" ignore_dtd="false"
>>>>>>>     cib_feature_revision="2.0" num_peers="2" generated="true"
>>>>>>>     ccm_transition="22" dc_uuid="9d8abc28-4fa3-408a-a695-fb36b0d67a48"
>>>>>>>     epoch="733" num_updates="1" cib-last-written="Mon Feb 9 18:31:19 2009">
>>>>>>>   <configuration>
>>>>>>>     <crm_config>
>>>>>>>       <cluster_property_set id="cib-bootstrap-options">
>>>>>>>         <attributes>
>>>>>>>           <nvpair id="cib-bootstrap-options-dc-version" name="dc-version"
>>>>>>>               value="2.1.3-node: 552305612591183b1628baa5bc6e903e0f1e26a3"/>
>>>>>>>           <nvpair name="last-lrm-refresh"
>>>>>>>               id="cib-bootstrap-options-last-lrm-refresh" value="1234204278"/>
>>>>>>>         </attributes>
>>>>>>>       </cluster_property_set>
>>>>>>>     </crm_config>
>>>>>>>     <nodes>
>>>>>>>       <node id="df707752-d5fb-405a-8ca7-049e25a227b7" uname="lpissan1001"
>>>>>>>           type="normal">
>>>>>>>         <instance_attributes
>>>>>>>             id="nodes-df707752-d5fb-405a-8ca7-049e25a227b7">
>>>>>>>           <attributes>
>>>>>>>             <nvpair id="standby-df707752-d5fb-405a-8ca7-049e25a227b7"
>>>>>>>                 name="standby" value="off"/>
>>>>>>>           </attributes>
>>>>>>>         </instance_attributes>
>>>>>>>       </node>
>>>>>>>       <node id="9d8abc28-4fa3-408a-a695-fb36b0d67a48" uname="lpissan1002"
>>>>>>>           type="normal">
>>>>>>>         <instance_attributes
>>>>>>>             id="nodes-9d8abc28-4fa3-408a-a695-fb36b0d67a48">
>>>>>>>           <attributes>
>>>>>>>             <nvpair id="standby-9d8abc28-4fa3-408a-a695-fb36b0d67a48"
>>>>>>>                 name="standby" value="off"/>
>>>>>>>           </attributes>
>>>>>>>         </instance_attributes>
>>>>>>>       </node>
>>>>>>>     </nodes>
>>>>>>>     <resources>
>>>>>>>       <master_slave id="Storage1">
>>>>>>>         <meta_attributes id="Storage1_meta_attrs">
>>>>>>>           <attributes>
>>>>>>>             <nvpair id="Storage1_metaattr_target_role" name="target_role"
>>>>>>>                 value="started"/>
>>>>>>>             <nvpair id="Storage1_metaattr_clone_max" name="clone_max"
>>>>>>>                 value="2"/>
>>>>>>>             <nvpair id="Storage1_metaattr_clone_node_max"
>>>>>>>                 name="clone_node_max" value="1"/>
>>>>>>>             <nvpair id="Storage1_metaattr_master_max" name="master_max"
>>>>>>>                 value="1"/>
>>>>>>>             <nvpair id="Storage1_metaattr_master_node_max"
>>>>>>>                 name="master_node_max" value="1"/>
>>>>>>>             <nvpair id="Storage1_metaattr_notify" name="notify"
>>>>>>>                 value="true"/>
>>>>>>>             <nvpair id="Storage1_metaattr_globally_unique"
>>>>>>>                 name="globally_unique" value="false"/>
>>>>>>>           </attributes>
>>>>>>>         </meta_attributes>
>>>>>>>         <primitive id="Storage1" class="ocf" type="drbd"
>>>>>>>             provider="heartbeat">
>>>>>>>           <instance_attributes id="Storage1_instance_attrs">
>>>>>>>             <attributes>
>>>>>>>               <nvpair id="273a1bb2-4867-42dd-a9e5-7cebbf48ef3b"
>>>>>>>                   name="drbd_resource" value="Storage1"/>
>>>>>>>             </attributes>
>>>>>>>           </instance_attributes>
>>>>>>>           <operations>
>>>>>>>             <op id="9ddc0ce9-4090-4546-a7d5-787fe47de872" name="monitor"
>>>>>>>                 description="master" interval="29" timeout="10"
>>>>>>>                 start_delay="1m" role="Master"/>
>>>>>>>             <op
>>>>>>>                 id="56a7508f-fa42-46f8-9924-3b284cdb97f0" name="monitor"
>>>>>>>                 description="slave" interval="29" timeout="10"
>>>>>>>                 start_delay="1m" role="Slave"/>
>>>>>>>           </operations>
>>>>>>>         </primitive>
>>>>>>>       </master_slave>
>>>>>>>     </resources>
>>>>>>>     <constraints/>
>>>>>>>   </configuration>
>>>>>>> </cib>

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
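[Editor's note: for readers hitting the same exit code 10, the failure in
this thread boils down to the attach step not being idempotent: running it
against an already-attached /dev/drbd0 always fails. A minimal sketch of a
guard follows; need_up is a made-up helper, and the dstate strings are
typical `drbdadm dstate` output -- verify them against your DRBD version.]

```shell
#!/bin/sh
# Sketch: "drbdadm up" is not idempotent -- attaching an already-attached
# device produces the exit code 10 / "Device is attached to a disk" error
# shown in the log. need_up() decides from `drbdadm dstate` output
# whether the attach step is still safe to run.
need_up() {
  case $1 in
    Diskless*|"") return 0 ;;   # nothing attached yet: up is safe
    *)            return 1 ;;   # already attached: up would fail
  esac
}

# Example decisions against typical dstate strings:
need_up "Diskless/DUnknown"  && echo "would run: drbdadm up Storage1"
need_up "UpToDate/UpToDate"  || echo "would skip: already attached"
```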