Hi Dominik,

Thanks for the follow-up, please find the file attached.
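For context, the attached archive was presumably produced with hb_report, as
suggested below; a minimal invocation (the start time here is illustrative,
not necessarily the one actually used) looks something like:

    # collect cluster configuration and logs from the given start time
    # into report_1.tar.bz2
    hb_report -f "2009-02-10 17:00" report_1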
Jason

2009/2/11 Dominik Klein <d...@in-telegence.net>

> Hi Jason
>
> Jason Fitzpatrick wrote:
>> I have disabled the services and run
>>
>>   drbdadm secondary all
>>   drbdadm detach all
>>   drbdadm down all
>>   service drbd stop
>>
>> before testing. As far as I can see (cat /proc/drbd on both nodes), drbd
>> is shut down:
>>
>>   cat: /proc/drbd: No such file or directory
>
> Good.
>
>> I have taken the command that heartbeat is running (drbdsetup /dev/drbd0
>> disk /dev/sdb /dev/sdb internal --set-defaults --create-device
>> --on-io-error=pass_on)
>
> The RA actually runs "drbdadm up", which translates into this.
>
>> and run it against the nodes when heartbeat is not in control, and this
>> command will bring the resources online, but re-running this command will
>> generate the error, so I am kind of leaning towards the command being run
>> twice?
>
> Never seen the cluster do that.
>
> Please post your configuration and logs. hb_report should gather
> everything needed and put it into a nice .bz2 archive :)
>
> Regards
> Dominik
>
>> Thanks
>>
>> Jason
>>
>> 2009/2/11 Dominik Klein <d...@in-telegence.net>
>>
>>> Hi Jason
>>>
>>> any chance you started drbd at boot, or the drbd device was active at
>>> the time you started the cluster resource? If so, read the introduction
>>> of the howto again and correct your setup.
>>>
>>> Jason Fitzpatrick wrote:
>>>> Hi Dominik
>>>>
>>>> I have upgraded to HB 2.9xx and have been following the instructions
>>>> that you provided (thanks for those) and have added a resource as
>>>> follows:
>>>>
>>>>   crm
>>>>   configure
>>>>   primitive Storage1 ocf:heartbeat:drbd \
>>>>     params drbd_resource=Storage1 \
>>>>     op monitor role=Master interval=59s timeout=30s \
>>>>     op monitor role=Slave interval=60s timeout=30s
>>>>   ms DRBD_Storage Storage1 \
>>>>     meta clone-max=2 notify=true globally-unique=false target-role=stopped
>>>>   commit
>>>>   exit
>>>>
>>>> No errors are reported and the resource is visible from within the
>>>> hb_gui.
>>>>
>>>> When I try to bring the resource online with
>>>>
>>>>   crm resource start DRBD_Storage
>>>>
>>>> I see the resource attempt to come online and then fail. It seems to
>>>> be starting the services and changing the status of the devices to
>>>> attached (from detached), but not setting any device to master.
>>>>
>>>> The following is from the ha-log:
>>>>
>>>>   crmd[8020]: 2009/02/10_17:22:32 info: do_lrm_rsc_op: Performing
>>>>   key=7:166:0:b57f7f7c-4e2d-4134-9c14-b1a2b7db11a7 op=Storage1:1_start_0 )
>>>>   lrmd[8016]: 2009/02/10_17:22:32 info: rsc:Storage1:1: start
>>>>   lrmd[8016]: 2009/02/10_17:22:32 info: RA output: (Storage1:1:start:stdout)
>>>>   /dev/drbd0: Failure: (124) Device is attached to a disk (use detach first)
>>>>   Command 'drbdsetup /dev/drbd0 disk /dev/sdb /dev/sdb internal
>>>>   --set-defaults --create-device --on-io-error=pass_on' terminated with
>>>>   exit code 10
>>>
>>> This looks like "drbdadm up" is failing because the device is already
>>> attached to the lower level storage device.
>>>
>>> Regards
>>> Dominik
>>>
>>>>   drbd[22270]: 2009/02/10_17:22:32 ERROR: Storage1 start: not in
>>>>   Secondary mode after start.
>>>>   crmd[8020]: 2009/02/10_17:22:32 info: process_lrm_event: LRM operation
>>>>   Storage1:1_start_0 (call=189, rc=1, cib-update=380, confirmed=true)
>>>>   complete: unknown error
>>>>
>>>> I have checked the DRBD device Storage1 and it is in secondary mode
>>>> after the start, and should I choose I can make it primary on either
>>>> node.
>>>>
>>>> Thanks
>>>>
>>>> Jason
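To double-check that state outside the cluster, drbd 8.3 can report the role
and disk state per resource; a quick check to run on both nodes (resource
name Storage1 as used in the thread):

    cat /proc/drbd            # one status line per configured device
    drbdadm state Storage1    # roles, e.g. Secondary/Secondary
    drbdadm dstate Storage1   # disk states, e.g. UpToDate/UpToDate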
>>>> 2009/2/10 Jason Fitzpatrick <jayfitzpatr...@gmail.com>
>>>>
>>>>> Thanks,
>>>>>
>>>>> This was the latest version in the Fedora repos, I will upgrade and
>>>>> see what happens.
>>>>>
>>>>> Jason
>>>>>
>>>>> 2009/2/10 Dominik Klein <d...@in-telegence.net>
>>>>>
>>>>>> Jason Fitzpatrick wrote:
>>>>>>> Hi All
>>>>>>>
>>>>>>> I am having a hell of a time trying to get heartbeat to fail over
>>>>>>> my DRBD hard disk and am hoping for some help.
>>>>>>>
>>>>>>> I have a 2 node cluster. Heartbeat is working, as I am able to fail
>>>>>>> over IP addresses and services successfully, but when I try to fail
>>>>>>> over my DRBD resource from secondary to primary I am hitting a
>>>>>>> brick wall. I can fail over the DRBD resource manually, so I know
>>>>>>> that it does work at some level.
>>>>>>>
>>>>>>> DRBD version 8.3, Heartbeat version heartbeat-2.1.3-1.fc9.i386
>>>>>>
>>>>>> Please upgrade. That's too old for reliable master/slave behaviour.
>>>>>> Preferably upgrade to pacemaker and ais or heartbeat 2.99. Read
>>>>>> http://www.clusterlabs.org/wiki/Install for install notes.
>>>>>>
>>>>>>> and using heartbeat-gui to configure
>>>>>>
>>>>>> Don't use the gui to configure complex (ie clone or master/slave)
>>>>>> resources.
>>>>>>
>>>>>> Once you have upgraded to the latest pacemaker, please refer to
>>>>>> http://www.clusterlabs.org/wiki/DRBD_HowTo_1.0 for drbd's cluster
>>>>>> configuration.
>>>>>>
>>>>>> Regards
>>>>>> Dominik
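One practical consequence of that howto's introduction (and of the "drbd at
boot" question above): drbd must be left entirely to the cluster, so the init
script should not start it. On a Fedora system of this vintage that would be
roughly:

    service drbd stop     # make sure drbd is not running now
    chkconfig drbd off    # and is not started at boot anymore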
>>>>>>> type="normal"> > >>>>>>> <instance_attributes > >>>>>>> id="nodes-9d8abc28-4fa3-408a-a695-fb36b0d67a48"> > >>>>>>> <attributes> > >>>>>>> <nvpair > >> id="standby-9d8abc28-4fa3-408a-a695-fb36b0d67a48" > >>>>>>> name="standby" value="off"/> > >>>>>>> </attributes> > >>>>>>> </instance_attributes> > >>>>>>> </node> > >>>>>>> </nodes> > >>>>>>> <resources> > >>>>>>> <master_slave id="Storage1"> > >>>>>>> <meta_attributes id="Storage1_meta_attrs"> > >>>>>>> <attributes> > >>>>>>> <nvpair id="Storage1_metaattr_target_role" > >>>>> name="target_role" > >>>>>>> value="started"/> > >>>>>>> <nvpair id="Storage1_metaattr_clone_max" > >> name="clone_max" > >>>>>>> value="2"/> > >>>>>>> <nvpair id="Storage1_metaattr_clone_node_max" > >>>>>>> name="clone_node_max" value="1"/> > >>>>>>> <nvpair id="Storage1_metaattr_master_max" > >>>>> name="master_max" > >>>>>>> value="1"/> > >>>>>>> <nvpair id="Storage1_metaattr_master_node_max" > >>>>>>> name="master_node_max" value="1"/> > >>>>>>> <nvpair id="Storage1_metaattr_notify" name="notify" > >>>>>>> value="true"/> > >>>>>>> <nvpair id="Storage1_metaattr_globally_unique" > >>>>>>> name="globally_unique" value="false"/> > >>>>>>> </attributes> > >>>>>>> </meta_attributes> > >>>>>>> <primitive id="Storage1" class="ocf" type="drbd" > >>>>>>> provider="heartbeat"> > >>>>>>> <instance_attributes id="Storage1_instance_attrs"> > >>>>>>> <attributes> > >>>>>>> <nvpair id="273a1bb2-4867-42dd-a9e5-7cebbf48ef3b" > >>>>>>> name="drbd_resource" value="Storage1"/> > >>>>>>> </attributes> > >>>>>>> </instance_attributes> > >>>>>>> <operations> > >>>>>>> <op id="9ddc0ce9-4090-4546-a7d5-787fe47de872" > >>>>> name="monitor" > >>>>>>> description="master" interval="29" timeout="10" start_delay="1m" > >>>>>>> role="Master"/> > >>>>>>> <op id="56a7508f-fa42-46f8-9924-3b284cdb97f0" > >>>>> name="monitor" > >>>>>>> description="slave" interval="29" timeout="10" start_delay="1m" > >>>>>>> role="Slave"/> > >>>>>>> </operations> > >>>>>>> </primitive> > >>>>>>> </master_slave> > >>>>>>> </resources> > >>>>>>> <constraints/> > >>>>>>> </configuration> > >>>>>>> </cib> > _______________________________________________ > Linux-HA mailing list > Linux-HA@lists.linux-ha.org > http://lists.linux-ha.org/mailman/listinfo/linux-ha > See also: http://linux-ha.org/ReportingProblems >
report_1.tar.bz2
Description: BZip2 compressed data
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems