Re: [Linux-HA] how to get group members
Hello, Dejan

> Not without extra text processing, but that shouldn't be too
> difficult.

Yes, I use extra text processing to get group members.

> Alternatively, if you're using pacemaker 1.0, then you
> can grep output of "crm configure show" for the group name.

I use heartbeat.

Best wishes,
Ivan
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
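The extra text processing Ivan mentions might look something like this (a minimal sketch; the sample XML below is hypothetical, standing in for real `crm_resource -x -t group -r group_Name` output, and the group/primitive ids are invented):

```shell
# Hypothetical sample of what "crm_resource -x -t group -r my_group"
# might print; in practice this would come from the live CIB.
xml='<group id="my_group">
  <primitive id="ip_1" class="ocf" type="IPaddr"/>
  <primitive id="fs_1" class="ocf" type="Filesystem"/>
</group>'

# Strip the XML part, keeping just the member (primitive) ids
members=$(printf '%s\n' "$xml" | sed -n 's/.*<primitive id="\([^"]*\)".*/\1/p')
echo "$members"
```

On pacemaker 1.0 the same information falls out of grepping "crm configure show" for the group name, as suggested above.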
Re: [Linux-HA] how to forbid ptest logs in /var/log/messages
Dear Dejan and Andrew,

Thanks a lot for your help.

> No, I don't think so, unless you disable syslog somehow.

I think I'll use syslog to do it. Thanks.

> Why does that bother you?

I use ptest very often in my scripts, so I don't want this information written to /var/log/messages.

Best wishes,
Ivan

* Dejan Muhamedagic [Tue, 18 Aug 2009 17:05:26 +0200]:
> Hi,
>
> On Tue, Aug 18, 2009 at 06:25:40PM +0400, Ivan Gromov wrote:
> > Hi, all
> >
> > I use ptest -LsVV 2>&1 to determine resource and group scores. I have
> > noticed that ptest writes information to /var/log/messages. Is it
> > possible to forbid ptest to write to the messages file?
>
> No, I don't think so, unless you disable syslog somehow. Why does
> that bother you?
>
> Thanks,
>
> Dejan
>
> Try the following patch (which I'm about to commit):
>
> diff -r 33da369d36e3 pengine/ptest.c
> --- a/pengine/ptest.c Tue Aug 18 14:38:16 2009 +0200
> +++ b/pengine/ptest.c Tue Aug 18 17:00:06 2009 +0200
> @@ -178,7 +178,7 @@ main(int argc, char **argv)
>   crm_log_init("ptest", LOG_CRIT, FALSE, FALSE, 0, NULL);
>   crm_set_options("V?$XD:G:I:Lwx:d:aSs", "[-?Vv] -[Xxp] {other options}", long_options,
>   "Calculate the cluster's response to the supplied cluster state\n");
> - cl_log_set_facility(LOG_USER);
> + cl_log_set_facility(-1);
>
>   while (1) {
>   int option_index = 0;
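If you go the syslog route before the patch lands, a filter along these lines might do it (a sketch assuming rsyslog; classic syslogd cannot filter by program name, and the discard syntax varies between rsyslog versions):

```
# /etc/rsyslog.conf -- drop ptest's messages before they reach
# /var/log/messages ('~' is rsyslog's classic discard action)
:programname, isequal, "ptest"    ~
*.info;mail.none;authpriv.none    /var/log/messages
```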
[Linux-HA] Heartbeat daily statistics collection and service interruption
Hi,

How can I stop/restrict heartbeat from collecting daily statistics? My service reliability is configured using heartbeat, and it stops for a couple of seconds every day at the time it collects the statistics. How can I get rid of this peculiar behaviour? I am worried I will lose some traffic during that period.

My logs:

heartbeat[7344]: 2009/08/17_23:45:57 info: RealMalloc stats: 623356 total malloc bytes. pid [7344/MST_CONTROL]
heartbeat[7344]: 2009/08/17_23:45:57 info: RealMalloc stats: 31444 total malloc bytes. pid [7346/HBFIFO]
heartbeat[7344]: 2009/08/17_23:45:57 info: RealMalloc stats: 40644 total malloc bytes. pid [7347/HBWRITE]
heartbeat[7344]: 2009/08/17_23:45:57 info: RealMalloc stats: 33136 total malloc bytes. pid [7348/HBREAD]

It occurs daily at 23:45.

-- 
Kiran Sarvabhotla
MS(Research), OBH-50, IIIT Gachibowli, Hyderabad-500032
Samvedana's blog: http://teamsamvedana.wordpress.com
[Linux-HA] WARN: Gmain_timeout_dispatch:
I get the following messages constantly on both of my two nodes. Both are Linux 2.6.26.8-2 running Heartbeat 2.1.4 without crm (for now). Attached are my conf files...

heartbeat[6685]: 2009/08/19_09:54:46 WARN: Gmain_timeout_dispatch: Dispatch function for send local status was delayed 1650 ms (> 1010 ms) before being called (GSource: 0x811de70)
heartbeat[6685]: 2009/08/19_09:54:46 info: Gmain_timeout_dispatch: started at 1724344686 should have started at 1724344521
heartbeat[6685]: 2009/08/19_09:54:46 WARN: Gmain_timeout_dispatch: Dispatch function for check for signals was delayed 1650 ms (> 1010 ms) before being called (GSource: 0x811e178)
heartbeat[6685]: 2009/08/19_09:54:46 info: Gmain_timeout_dispatch: started at 1724344686 should have started at 1724344521

[Attachments: ha.cf, haresources]
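These warnings generally mean the machine was too busy (or too I/O-starved) to run heartbeat's internal timers on schedule. They matter mostly in relation to the ha.cf timers: if the dispatch delay ever approaches deadtime, a node can be declared dead spuriously. For reference, a sketch of the relevant ha.cf timing block, with hypothetical values to tune against your own worst observed delay:

```
# ha.cf -- hypothetical timing values, tune to your own cluster
keepalive 2    # seconds between heartbeat packets
warntime 5     # warn when a heartbeat arrives later than this
deadtime 30    # declare a node dead after this long without heartbeats;
               # keep this comfortably above the worst delay you see
```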
Re: [Linux-HA] stonith question
Chris Card wrote:
>> Date: Wed, 19 Aug 2009 06:52:55 -0500
>> From: deb...@us.ibm.com
>> To: linux-ha@lists.linux-ha.org
>> Subject: Re: [Linux-HA] stonith question
>>
>> Chris Card wrote:
>>> I'm planning to use stonith with heartbeat (v1), and I understand that to
>>> do this I need a "stonith" or "stonith_host" line in my ha.cf to get
>>> heartbeat to call the stonith plugin with the appropriate parameters.
>>>
>>> What is not clear to me is under what conditions heartbeat will call the
>>> stonith plugin to power off a machine - is this documented anywhere?
>>
>> Not sure if a complete or even partially complete list exists, but in
>> general a node is STONITHed whenever heartbeat needs to make sure that
>> it is not using cluster resources. Here is a nice little article that
>> includes some discussion on fencing:
>>
>> http://techthoughts.typepad.com/managing_computers/2007/10/split-brain-quo.html
>>
>>> Chris
>
> Thanks Dave, that's a useful document, though I'd still like some definite
> info about what heartbeat actually does.
>
> I've been looking at the source, and one thing I noticed is that heartbeat
> itself only seems to send a "reset" command to the stonith device, whereas
> the stonith command line tool also accepts "on" and "off" as separate
> actions. Is this the case, or am I missing something? Is there a way to make
> heartbeat power off a machine and not automatically power it on again?

You are not missing something; they are indeed separate actions, but reset is far and away the main one. In fact, for v2 external STONITH plugins, reset is the only required operation, while on and off are optional. I'm not aware of a way to do a power-off in v1; v2 includes the following CIB directive:

You may be able to play around with the meatware device, which waits for manual intervention, or the external device, which *I believe* issues a system() call, if you can call your device via the shell.
> Chris
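To illustrate the reset/on/off distinction mentioned above: a v2 external STONITH plugin is essentially a script dispatched on its first argument, with reset required and on/off optional. A minimal sketch of that dispatch structure (the host names are invented, and the echo lines stand in for calls to a real power-switch CLI):

```shell
# Sketch of an external STONITH plugin's command dispatch.
# A real plugin would drive the power device in each branch;
# here the branches just echo so the structure is visible.
stonith_plugin() {
    cmd=$1; host=$2
    case "$cmd" in
        gethosts) echo "node1 node2" ;;       # hosts this device controls
        reset)    echo "power-cycle $host" ;; # the one required action
        off)      echo "power-off $host" ;;   # optional
        on)       echo "power-on $host" ;;    # optional
        status)   echo "device OK" ;;
        *)        return 1 ;;                 # unknown command
    esac
}

stonith_plugin reset node2
```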
Re: [Linux-HA] stonith question
> Date: Wed, 19 Aug 2009 06:52:55 -0500
> From: deb...@us.ibm.com
> To: linux-ha@lists.linux-ha.org
> Subject: Re: [Linux-HA] stonith question
>
> Chris Card wrote:
>> I'm planning to use stonith with heartbeat (v1), and I understand that to
>> do this I need a "stonith" or "stonith_host" line in my ha.cf to get
>> heartbeat to call the stonith plugin with the appropriate parameters.
>>
>> What is not clear to me is under what conditions heartbeat will call the
>> stonith plugin to power off a machine - is this documented anywhere?
>
> Not sure if a complete or even partially complete list exists, but in
> general a node is STONITHed whenever heartbeat needs to make sure that
> it is not using cluster resources. Here is a nice little article that
> includes some discussion on fencing:
>
> http://techthoughts.typepad.com/managing_computers/2007/10/split-brain-quo.html
>
>> Chris

Thanks Dave, that's a useful document, though I'd still like some definite info about what heartbeat actually does.

I've been looking at the source, and one thing I noticed is that heartbeat itself only seems to send a "reset" command to the stonith device, whereas the stonith command line tool also accepts "on" and "off" as separate actions. Is this the case, or am I missing something? Is there a way to make heartbeat power off a machine and not automatically power it on again?

Chris
[Linux-HA] stonith failed to start
List,

I've got a dev cluster up and running with Xen/DRBD/heartbeat working. After a day or so of running, I saw that stonith had failed to start on node2 (it initially started just fine). I have seen this behavior before with this cluster. What would cause the stonith 'start' operation to fail after it initially had succeeded?

crm_mon output:
---
Refresh in 10s...

Last updated: Wed Aug 19 06:33:12 2009
Current DC: node1 (47d563cc-f8ec-4b6d-8092-d80ceb64dbbd)
2 Nodes configured.
4 Resources configured.

Node: node2 (c95ba6f0-5dcf-41d3-abb0-25e55ae313eb): online
Node: node1 (47d563cc-f8ec-4b6d-8092-d80ceb64dbbd): online

xen1 (heartbeat::ocf:Xen): Started node2
xen2 (heartbeat::ocf:Xen): Started node1
xen3 (heartbeat::ocf:Xen): Started node2
Clone Set: Stonith_Clone_Group
    stonithclone:0 (stonith:external/ssh): Started node1
    stonithclone:1 (stonith:external/ssh): Stopped
Failed actions:
    stonithclone:1_start_0 (node=node2, call=14, rc=1): complete

At first look, it appears that the monitor operation fails. Heartbeat then tries to start stonith on the failed node, and then the 'start' operation fails as well.

Aug 18 11:02:37 node1 tengine: [3950]: WARN: update_failcount: Updating failcount for stonithclone:1 on c95ba6f0-5dcf-41d3-abb0-25e55ae313eb after failed monitor: rc=14
Aug 18 11:02:37 node1 crmd: [3859]: info: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_IPC_MESSAGE origin=route_message ]
Aug 18 11:02:37 node1 crmd: [3859]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
Aug 18 11:02:37 node1 pengine: [3951]: info: determine_online_status: Node node1 is online
Aug 18 11:02:37 node1 pengine: [3951]: info: determine_online_status: Node node2 is online
Aug 18 11:02:37 node1 pengine: [3951]: info: unpack_find_resource: Internally renamed stonithclone:0 on node2 to stonithclone:1
Aug 18 11:02:37 node1 pengine: [3951]: WARN: unpack_rsc_op: Processing failed op stonithclone:1_monitor_5000 on node2: Error
Aug 18 11:02:37 node1 pengine: [3951]: notice: native_print: romulus#011(heartbeat::ocf:Xen):#011Started node2
Aug 18 11:02:37 node1 pengine: [3951]: notice: native_print: remus#011(heartbeat::ocf:Xen):#011Started node1
Aug 18 11:02:37 node1 pengine: [3951]: notice: native_print: fortuna#011(heartbeat::ocf:Xen):#011Started node2
Aug 18 11:02:37 node1 pengine: [3951]: notice: clone_print: Clone Set: Stonith_Clone_Group
Aug 18 11:02:37 node1 pengine: [3951]: notice: native_print: stonithclone:0#011(stonith:external/ssh):#011Started node1
Aug 18 11:02:37 node1 pengine: [3951]: notice: native_print: stonithclone:1#011(stonith:external/ssh):#011Started node2 FAILED
Aug 18 11:02:37 node1 pengine: [3951]: notice: NoRoleChange: Leave resource xen2#011(node2)
Aug 18 11:02:37 node1 pengine: [3951]: notice: NoRoleChange: Leave resource xen1#011(node1)
Aug 18 11:02:37 node1 pengine: [3951]: notice: NoRoleChange: Leave resource xen3#011(node2)
Aug 18 11:02:37 node1 pengine: [3951]: notice: NoRoleChange: Leave resource stonithclone:0#011(node1)
Aug 18 11:02:37 node1 pengine: [3951]: notice: NoRoleChange: Recover resource stonithclone:1#011(node2)
Aug 18 11:02:37 node1 pengine: [3951]: notice: StopRsc: node2#011Stop stonithclone:1
Aug 18 11:02:37 node1 pengine: [3951]: notice: StartRsc: node2#011Start stonithclone:1
Aug 18 11:02:37 node1 pengine: [3951]: notice: RecurringOp: node2#011 stonithclone:1_monitor_5000
Aug 18 11:02:37 node1 tengine: [3950]: info: extract_event: Aborting on transient_attributes changes for c95ba6f0-5dcf-41d3-abb0-25e55ae313eb
Aug 18 11:02:37 node1 pengine: [3951]: info: process_pe_message: Transition 3: PEngine Input stored in: /var/lib/heartbeat/pengine/pe-input-31.bz2
Aug 18 11:02:37 node1 pengine: [3951]: info: determine_online_status: Node node1 is online
Aug 18 11:02:37 node1 pengine: [3951]: info: determine_online_status: Node node2 is online
Aug 18 11:02:37 node1 pengine: [3951]: info: unpack_find_resource: Internally renamed stonithclone:0 on node2 to stonithclone:1
Aug 18 11:02:37 node1 pengine: [3951]: WARN: unpack_rsc_op: Processing failed op stonithclone:1_monitor_5000 on node2: Error
Aug 18 11:02:37 node1 pengine: [3951]: notice: native_print: xen2#011(heartbeat::ocf:Xen):#011Started node2
Aug 18 11:02:37 node1 pengine: [3951]: notice: native_print: xen1#011(heartbeat::ocf:Xen):#011Started node1
Aug 18 11:02:37 node1 pengine: [3951]: notice: native_print: xen3#011(heartbeat::ocf:Xen):#011Started node2
Aug 18 11:02:37 node1 pengine: [3951]: notice: clone_print: Clone Set: Stonith_Clone_Group
Aug 18 11:02:37 node1 pengine: [3951]: notice: native_print: stonithclone:0#011(stonith:external/ssh):#011Started node1
Aug 18 11:02:37 node1 pengine: [3951]: notice: native_print: stonithclone:1#011(stonith:external/ssh):#011Started node2 FAILED

If the node gets rebooted, it comes b
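Once the underlying cause of the monitor failure is fixed, the failed clone instance usually needs its failure history cleared before the policy engine will restart it cleanly. Something along these lines might help (a sketch only; these commands run against the live cluster, and the node/resource names are taken from the output above):

```
# Inspect, then clear, the failcount for the failed clone instance
crm_failcount -G -U node2 -r stonithclone:1
crm_failcount -D -U node2 -r stonithclone:1

# Clean up the resource's operation history on that node
crm_resource -C -r stonithclone:1 -H node2
```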
Re: [Linux-HA] stonith question
Chris Card wrote:
> [Apologies if this is a FAQ]
>
> I'm planning to use stonith with heartbeat (v1), and I understand that to do
> this I need a "stonith" or "stonith_host" line in my ha.cf to get heartbeat
> to call the stonith plugin with the appropriate parameters.
>
> What is not clear to me is under what conditions heartbeat will call the
> stonith plugin to power off a machine - is this documented anywhere?

Not sure if a complete or even partially complete list exists, but in general a node is STONITHed whenever heartbeat needs to make sure that it is not using cluster resources. Here is a nice little article that includes some discussion on fencing:

http://techthoughts.typepad.com/managing_computers/2007/10/split-brain-quo.html

> Chris
Re: [Linux-HA] Adding a resource to a running Cluster
>> clones have only meta_attributes, so move the two nvpairs over there. << Oh. That's the bit I missed in the documentation, then. Yes, once that's sorted the cib swallows this update without the barest hint of a complaint. Thank you!! (I'm sure you hear it so often that you can't hear it anymore, but I still have to say: this clustering software *rocks*; I'm not sure there's a cooler, more significant enterprise class project, short of the kernel itself, anywhere.) Be well, Karl On Wed, Aug 19, 2009 at 6:56 AM, Dejan Muhamedagic wrote: > Hi, > > On Wed, Aug 19, 2009 at 06:27:32AM -0400, Karl W. Lewis wrote: > > On Mon, Aug 10, 2009 at 2:20 PM, Karl W. Lewis >wrote: > > > > > D'oh! > > > > > > Yes, it's Pacemaker 1.0.3 I should have said so. > > > > > > I am sorry for what turns out to be a[nother] the stupid question. > Thank > > > you for your kind patience. > > > > > > Be well, > > > > > > Karl > > > > > > > > > On Mon, Aug 10, 2009 at 1:50 PM, Dejan Muhamedagic < > deja...@fastmail.fm>wrote: > > > > > >> Hi, > > >> > > >> On Mon, Aug 10, 2009 at 12:54:44PM -0400, Karl W. Lewis wrote: > > >> > I have used cibadmin to add contraints to a running cluster, but now > I > > >> wish > > >> > to add another resource, a stonith configuration, to my cluster. > > >> > > >> I guess that this is pacemaker 1.0. > > >> > > >> > The snippet of xml I am trying to feed the cluster looks like this: > > >> > > > >> > > > >> > > > >> > > > >> > > >> value="false"/> > > >> > > > >> > > > >> > > > >> > > > >> > > >> Both are missing id. Also, replace "_" with "-". > > >> > > >> > > > >> >> >> type="external/egenera" > > >> > provider="heartbeat"> > > >> > > > >> >> >> prereq="nothing"/> > > >> > > >> Missing id. "prereq" is now named "requires". 
> > >> > > >> Thanks, > > >> > > >> Dejan > > >> > > >> > > > >> > > > >> > > >> value="wsc-voo-205, > > >> > wsc-voo-206, wsc-voo-207"/> > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > >>> > > >> > cibadmin -V -V -V -V -V -V -V -C -o resources -x > stonith_production.xml > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug2: main: Option o > => > > >> > resources > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug2: main: Option x > => > > >> > stonith_production.xml > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > > >> [admin > > >> > input] > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > > >> [admin > > >> > input] > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > > >> [admin > > >> > input] > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > > >> [admin > > >> > input]> >> > value="false" /> > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > > >> [admin > > >> > input] > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > > >> [admin > > >> > input] > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > > >> [admin > > >> > input] > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > > >> [admin > > >> > input] > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > > >> [admin > > >> > input] > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > > >> [admin > > >> > input] > >> > type="external/egenera" provider="heartbeat" > > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > > >> [admin > > >> > input] > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > > >> [admin > > >> > input] > >> > prereq="nothing" /> > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > > >> [admin > > >> > 
input] > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > > >> [admin > > >> > input] > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > > >> [admin > > >> > input] > >> > value="wsc-voo-205, wsc-voo-206, wsc-voo-207" /> > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > > >> [admin > > >> > input] > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > > >> [admin > > >> > input] > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > > >> [admin > > >> > input] > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > > >> [admin > > >> > input] > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: > > >> > init_client_ipc_comms_nodispatch: Attempting to talk on: > > >> /var/run/crm/cib_rw > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3: > > >> > init_client_ipc_comms_nodispatch: Processing of /var/run/crm/cib_rw > > >> complete > > >> > cibadmin[18807]:
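Putting Dejan's corrections together, the fixed snippet would look roughly like this. This is a hypothetical reconstruction: the archive stripped the original XML, so the ids, nvpair names, and operation interval below are illustrative, not the poster's actual values. The structural points it demonstrates are the ones from the thread: ids on every element, the nvpairs moved into the clone's meta_attributes, and "prereq" renamed to "requires".

```xml
<!-- Illustrative reconstruction; attribute names other than those
     discussed in the thread are invented for the example. -->
<clone id="stonith-clone">
  <meta_attributes id="stonith-clone-meta">
    <nvpair id="stonith-clone-unique" name="globally-unique" value="false"/>
  </meta_attributes>
  <primitive id="stonith-egenera" class="stonith"
             type="external/egenera" provider="heartbeat">
    <operations>
      <op id="stonith-egenera-monitor" name="monitor"
          interval="60s" requires="nothing"/>
    </operations>
    <instance_attributes id="stonith-egenera-ia">
      <nvpair id="stonith-egenera-hosts" name="hostlist"
              value="wsc-voo-205, wsc-voo-206, wsc-voo-207"/>
    </instance_attributes>
  </primitive>
</clone>
```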
Re: [Linux-HA] Adding a resource to a running Cluster
Hi, On Wed, Aug 19, 2009 at 06:27:32AM -0400, Karl W. Lewis wrote: > On Mon, Aug 10, 2009 at 2:20 PM, Karl W. Lewis wrote: > > > D'oh! > > > > Yes, it's Pacemaker 1.0.3 I should have said so. > > > > I am sorry for what turns out to be a[nother] the stupid question. Thank > > you for your kind patience. > > > > Be well, > > > > Karl > > > > > > On Mon, Aug 10, 2009 at 1:50 PM, Dejan Muhamedagic > > wrote: > > > >> Hi, > >> > >> On Mon, Aug 10, 2009 at 12:54:44PM -0400, Karl W. Lewis wrote: > >> > I have used cibadmin to add contraints to a running cluster, but now I > >> wish > >> > to add another resource, a stonith configuration, to my cluster. > >> > >> I guess that this is pacemaker 1.0. > >> > >> > The snippet of xml I am trying to feed the cluster looks like this: > >> > > >> > > >> > > >> > > >> > >> value="false"/> > >> > > >> > > >> > > >> > > >> > >> Both are missing id. Also, replace "_" with "-". > >> > >> > > >> >>> type="external/egenera" > >> > provider="heartbeat"> > >> > > >> >>> prereq="nothing"/> > >> > >> Missing id. "prereq" is now named "requires". 
> >> > >> Thanks, > >> > >> Dejan > >> > >> > > >> > > >> > >> value="wsc-voo-205, > >> > wsc-voo-206, wsc-voo-207"/> > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > >>> > >> > cibadmin -V -V -V -V -V -V -V -C -o resources -x stonith_production.xml > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug2: main: Option o => > >> > resources > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug2: main: Option x => > >> > stonith_production.xml > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > >> [admin > >> > input] > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > >> [admin > >> > input] > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > >> [admin > >> > input] > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > >> [admin > >> > input]>> > value="false" /> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > >> [admin > >> > input] > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > >> [admin > >> > input] > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > >> [admin > >> > input] > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > >> [admin > >> > input] > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > >> [admin > >> > input] > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > >> [admin > >> > input] >> > type="external/egenera" provider="heartbeat" > > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > >> [admin > >> > input] > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > >> [admin > >> > input] >> > prereq="nothing" /> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > >> [admin > >> > input] > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > >> [admin > >> > input] > >> > cibadmin[18807]: 
2009/08/10_12:44:17 debug: log_data_element: main: > >> [admin > >> > input] >> > value="wsc-voo-205, wsc-voo-206, wsc-voo-207" /> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > >> [admin > >> > input] > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > >> [admin > >> > input] > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > >> [admin > >> > input] > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: > >> [admin > >> > input] > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: > >> > init_client_ipc_comms_nodispatch: Attempting to talk on: > >> /var/run/crm/cib_rw > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3: > >> > init_client_ipc_comms_nodispatch: Processing of /var/run/crm/cib_rw > >> complete > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: > >> > init_client_ipc_comms_nodispatch: Attempting to talk on: > >> > /var/run/crm/cib_callback > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3: > >> > init_client_ipc_comms_nodispatch: Processing of > >> /var/run/crm/cib_callback > >> > complete > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: cib_native_signon_raw: > >> > Connection to CIB successful > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3: > >> cib_native_perform_op: > >> > Sending cib_create message to CIB service > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3: > >> cib_native_perform_op: > >> > Message sent > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3: > >> cib_native_perform_op: > >> > Async call, returning > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3: main: cibadmin > >> waiting > >> > for reply from the
Re: [Linux-HA] how to get group members
Ivan Gromov wrote:
> Hi, all
>
> How to get group members?
> I use crm_resource -x -t group -r group_Name. Can I get members without
> the xml part?

What about "crm configure show"?

Regards
Dominik
Re: [Linux-HA] Adding a resource to a running Cluster
On Mon, Aug 10, 2009 at 2:20 PM, Karl W. Lewis wrote: > D'oh! > > Yes, it's Pacemaker 1.0.3 I should have said so. > > I am sorry for what turns out to be a[nother] the stupid question. Thank > you for your kind patience. > > Be well, > > Karl > > > On Mon, Aug 10, 2009 at 1:50 PM, Dejan Muhamedagic wrote: > >> Hi, >> >> On Mon, Aug 10, 2009 at 12:54:44PM -0400, Karl W. Lewis wrote: >> > I have used cibadmin to add contraints to a running cluster, but now I >> wish >> > to add another resource, a stonith configuration, to my cluster. >> >> I guess that this is pacemaker 1.0. >> >> > The snippet of xml I am trying to feed the cluster looks like this: >> > >> > >> > >> > >> > > value="false"/> >> > >> > >> > >> > >> >> Both are missing id. Also, replace "_" with "-". >> >> > >> > > type="external/egenera" >> > provider="heartbeat"> >> > >> > > prereq="nothing"/> >> >> Missing id. "prereq" is now named "requires". >> >> Thanks, >> >> Dejan >> >> > >> > >> > > value="wsc-voo-205, >> > wsc-voo-206, wsc-voo-207"/> >> > >> > >> > >> > >> > >> > >> > >> > >>> >> > cibadmin -V -V -V -V -V -V -V -C -o resources -x stonith_production.xml >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug2: main: Option o => >> > resources >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug2: main: Option x => >> > stonith_production.xml >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: >> [admin >> > input] >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: >> [admin >> > input] >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: >> [admin >> > input] >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: >> [admin >> > input] > > value="false" /> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: >> [admin >> > input] >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: >> [admin >> > input] >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: 
main: >> [admin >> > input] >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: >> [admin >> > input] >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: >> [admin >> > input] >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: >> [admin >> > input] > > type="external/egenera" provider="heartbeat" > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: >> [admin >> > input] >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: >> [admin >> > input] > > prereq="nothing" /> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: >> [admin >> > input] >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: >> [admin >> > input] >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: >> [admin >> > input] > > value="wsc-voo-205, wsc-voo-206, wsc-voo-207" /> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: >> [admin >> > input] >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: >> [admin >> > input] >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: >> [admin >> > input] >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main: >> [admin >> > input] >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: >> > init_client_ipc_comms_nodispatch: Attempting to talk on: >> /var/run/crm/cib_rw >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3: >> > init_client_ipc_comms_nodispatch: Processing of /var/run/crm/cib_rw >> complete >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: >> > init_client_ipc_comms_nodispatch: Attempting to talk on: >> > /var/run/crm/cib_callback >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3: >> > init_client_ipc_comms_nodispatch: Processing of >> /var/run/crm/cib_callback >> > complete >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: cib_native_signon_raw: >> > Connection to CIB successful >> > cibadmin[18807]: 
2009/08/10_12:44:17 debug: debug3: >> cib_native_perform_op: >> > Sending cib_create message to CIB service >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3: >> cib_native_perform_op: >> > Message sent >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3: >> cib_native_perform_op: >> > Async call, returning >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3: main: cibadmin >> waiting >> > for reply from the local CIB >> > cibadmin[18807]: 2009/08/10_12:44:17 info: main: Starting mainloop >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug2: cib_native_callback: >> > Invoking callback cibadmin_op_callback for call 2 >> > cibadmin[18807]: 2009/08/10_12:44:17 WARN: cibadmin_op_callback: Call >> > cib_create failed (-47): Update does not conform to the configured >> > schema/DTD >> > Call cib
Re: [Linux-HA] Pacemaker 1.0.4 & HBv2 1.99 // Question about ucast
On Wed, Aug 19, 2009 at 9:51 AM, Alain.Moulle wrote:
> Hi,
> Thanks again. And this leads me to another linked question:
> if we want to have redundancy for the heartbeat, that means that
> we could set 8 lines like this:
>
> ucast eth0 139.111.12.1
> ucast eth0 139.111.12.2
> ucast eth0 139.111.12.3
> ucast eth0 139.111.12.4
>
> ucast eth1 139.222.12.1
> ucast eth1 139.222.12.2
> ucast eth1 139.222.12.3
> ucast eth1 139.222.12.4
>
> Right?

Yes.

> But in this case, suppose we lose eth1 on node 3, so that
> ucast eth1 139.222.12.3 fails from the other nodes: does that lead
> to stonith of node 3?

No, because (as you said below) we can still see it via eth0, so there is no need to shoot anyone.

> Or does the fact that ucast eth0 139.111.12.3 is still ok prevent
> node3 from being killed by the others?
>
> Thanks
> Alain

>>> Hi,
>>> I wonder if we can use ucast for heartbeat in a cluster of more than
>>> two nodes, and if so, in case of for example 4 nodes, I suppose we have
>>> to set 4 lines in ha.cf, one line per node in the cluster:
>>> ucast eth0 139.111.12.1
>>> ucast eth0 139.111.12.2
>>> ucast eth0 139.111.12.3
>>> ucast eth0 139.111.12.4
>>> but in this case, does it matter to have a "ucast on myself" as ha.cf has
>>> to be the same on the 4 nodes?
>>
>> nope, perfectly fine to do this
>>
>>> Thanks for your response.
>>> Alain
[Linux-HA] stonith question
[Apologies if this is a FAQ]

I'm planning to use stonith with heartbeat (v1), and I understand that to do this I need a "stonith" or "stonith_host" line in my ha.cf to get heartbeat to call the stonith plugin with the appropriate parameters.

What is not clear to me is under what conditions heartbeat will call the stonith plugin to power off a machine - is this documented anywhere?

Chris
[Linux-HA] Pacemaker 1.0.4 & HBv2 1.99 // Question about ucast
Hi,

Thanks again. And this leads me to another linked question: if we want to have redundancy for the heartbeat, that means that we could set 8 lines like this:

ucast eth0 139.111.12.1
ucast eth0 139.111.12.2
ucast eth0 139.111.12.3
ucast eth0 139.111.12.4

ucast eth1 139.222.12.1
ucast eth1 139.222.12.2
ucast eth1 139.222.12.3
ucast eth1 139.222.12.4

Right?

But in this case, suppose we lose eth1 on node 3, so that ucast eth1 139.222.12.3 fails from the other nodes: does that lead to stonith of node 3? Or does the fact that ucast eth0 139.111.12.3 is still ok prevent node3 from being killed by the others?

Thanks
Alain

> Hi,
>> I wonder if we can use ucast for heartbeat in a cluster of more than
>> two nodes, and if so, in case of for example 4 nodes, I suppose we have
>> to set 4 lines in ha.cf, one line per node in the cluster:
>> ucast eth0 139.111.12.1
>> ucast eth0 139.111.12.2
>> ucast eth0 139.111.12.3
>> ucast eth0 139.111.12.4
>> but in this case, does it matter to have a "ucast on myself" as ha.cf has
>> to be the same on the 4 nodes?
>
> nope, perfectly fine to do this
>
>> Thanks for your response.
>> Alain