Re: [Linux-HA] how to get group members

2009-08-19 Thread Ivan Gromov
Hello, Dejan

> Not without extra text processing, but that shouldn't be too
> difficult.
Yes, I use extra text processing to get group members.
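
For the archives, a minimal sketch of one such text-processing approach,
assuming the group is called group_Name and that crm_resource prints each
member as a <primitive ... id="..."> child element (adjust the grep/sed
patterns to the exact XML your version emits):

  crm_resource -x -t group -r group_Name \
    | grep '<primitive' \
    | sed 's/.*id="\([^"]*\)".*/\1/'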

>Alternatively, if you're using pacemaker 1.0, then you
> can grep output of "crm configure show" for the group name.
 I use heartbeat.

Best wishes,
Ivan
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] how to forbid ptest logs in /var/log/messages

2009-08-19 Thread Ivan Gromov
Dear Dejan and Andrew
Thanks a lot for your help.
> No, I don't think so, unless you disable syslog somehow.
I think I'll use syslog to do it. Thanks
>Why does that bother you?
I use ptest very often in my script, so I don't want this information
written to /var/log/messages.
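
A minimal sketch of the syslog-side filter mentioned above, assuming rsyslog
is in use and that these messages are logged under the program name "ptest"
(plain syslogd cannot filter by program; check the property name against your
rsyslog version):

  # /etc/rsyslog.conf -- discard everything logged by ptest (assumed ident)
  :programname, isequal, "ptest"    ~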

__
Best wishes,
Ivan
* Dejan Muhamedagic  [Tue, 18 Aug 2009 17:05:26 
+0200]:
> Hi,
>
> On Tue, Aug 18, 2009 at 06:25:40PM +0400, Ivan Gromov wrote:
> > Hi, all
> >
> > I use ptest -LsVV 2>&1 to determine resource and group scores. I have
> > noticed that ptest writes information to /var/log/messages. Is it
> > possible to prevent ptest from writing to the messages file?
>
> No, I don't think so, unless you disable syslog somehow. Why does
> that bother you?
>
> Thanks,
>
> Dejan
>
>
> Try the following patch (which I'm about to commit)

>diff -r 33da369d36e3 pengine/ptest.c
>--- a/pengine/ptest.c Tue Aug 18 14:38:16 2009 +0200
>+++ b/pengine/ptest.c Tue Aug 18 17:00:06 2009 +0200
>@@ -178,7 +178,7 @@ main(int argc, char **argv)
> crm_log_init("ptest", LOG_CRIT, FALSE, FALSE, 0, NULL);
> crm_set_options("V?$XD:G:I:Lwx:d:aSs", "[-?Vv] -[Xxp] {other options}", long_options,
> "Calculate the cluster's response to the supplied cluster state\n");
>- cl_log_set_facility(LOG_USER);
>+ cl_log_set_facility(-1);
>
> while (1) {
> int option_index = 0;
>
> >
> >
> > ___
> > Linux-HA mailing list
> > Linux-HA@lists.linux-ha.org
> > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> > See also: http://linux-ha.org/ReportingProblems
> ___
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] Heartbeat daily statistics collection and service interruption

2009-08-19 Thread kiran
Hi,

How can I stop or restrict heartbeat from collecting daily statistics? My
service reliability is provided by heartbeat, and it stalls for a couple of
seconds each day while it is collecting the statistics. How can I get rid of
this peculiar behaviour? I am worried I will lose some traffic during that
period.

My logs:

heartbeat[7344]: 2009/08/17_23:45:57 info: RealMalloc stats: 623356 total
malloc bytes. pid [7344/MST_CONTROL]
heartbeat[7344]: 2009/08/17_23:45:57 info: RealMalloc stats: 31444 total
malloc bytes. pid [7346/HBFIFO]
heartbeat[7344]: 2009/08/17_23:45:57 info: RealMalloc stats: 40644 total
malloc bytes. pid [7347/HBWRITE]
heartbeat[7344]: 2009/08/17_23:45:57 info: RealMalloc stats: 33136 total
malloc bytes. pid [7348/HBREAD]

It occurs daily at 23:45, as the timestamps above show.
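
As a first diagnostic step (only a sketch, not a known fix), it may be worth
checking whether something on the system pokes heartbeat once a day around
that time, e.g. a cron job or logrotate script (paths below are the usual
Debian/Red Hat locations):

  grep -ril heartbeat /etc/cron.d /etc/cron.daily /etc/logrotate.d 2>/dev/null
  crontab -l | grep -i heartbeat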

-- 
Kiran Sarvabhotla
MS(Research)
OBH-50,IIIT
Gachibowli
Hyderabad-500032

Samvedana's blog: http://teamsamvedana.wordpress.com
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] WARN: Gmain_timeout_dispatch:

2009-08-19 Thread Cantwell, Bryan
I get the following messages constantly on both of my two nodes. Both run Linux
2.6.26.8-2 with Heartbeat 2.1.4, without crm (for now).
Attached are my conf files...

heartbeat[6685]: 2009/08/19_09:54:46 WARN: Gmain_timeout_dispatch: Dispatch 
function for send local status was delayed 1650 ms (> 1010 ms) before being 
called (GSource: 0x811de70)
heartbeat[6685]: 2009/08/19_09:54:46 info: Gmain_timeout_dispatch: started at 
1724344686 should have started at 1724344521
heartbeat[6685]: 2009/08/19_09:54:46 WARN: Gmain_timeout_dispatch: Dispatch 
function for check for signals was delayed 1650 ms (> 1010 ms) before being 
called (GSource: 0x811e178)
heartbeat[6685]: 2009/08/19_09:54:46 info: Gmain_timeout_dispatch: started at 
1724344686 should have started at 1724344521


 


ha.cf
Description: ha.cf


haresources
Description: haresources
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

Re: [Linux-HA] stonith question

2009-08-19 Thread Dave Blaschke
Chris Card wrote:
>
>   
>> Date: Wed, 19 Aug 2009 06:52:55 -0500
>> From: deb...@us.ibm.com
>> To: linux-ha@lists.linux-ha.org
>> Subject: Re: [Linux-HA] stonith question
>>
>> Chris Card wrote:
>> 
>
>   
>>> I'm planning to use stonith with heartbeat (v1), and I understand that to 
>>> do this I need a "stonith" or "stonith_host" line in my ha.cf to get 
>>> heartbeat to call the stonith plugin with the appropriate parameters. 
>>>
>>> What is not clear to me is under what conditions heartbeat will call the 
>>> stonith plugin to power off a machine - is this documented anywhere?
>>>   
>>>   
>> Not sure if a complete or even partially complete list exists, but in 
>> general a node is STONITHed whenever heartbeat needs to make sure that 
>> it is not using cluster resources.  Here is a nice little article that 
>> includes some discussion on fencing:
>>
>> http://techthoughts.typepad.com/managing_computers/2007/10/split-brain-quo.html
>> 
>>> Chris
>>>   
>
> Thanks Dave, that's a useful document, though I'd still like some definite 
> info about what heartbeat actually does.
>
> I've been looking at the source, and one thing I noticed is that heartbeat 
> itself only seems to send a "reset" command to the stonith device, whereas 
> the stonith command line tool also accepts "on" and "off" as separate 
> actions. Is this the case, or am I missing something? Is there a way to make 
> heartbeat power off a machine and not automatically power it on again?
>   
You are not missing anything; they are indeed separate actions, but 
reset is by far the main one.  In fact, for v2 external STONITH 
plugins, reset is the only required operation, while on and off are 
optional.  I'm not aware of a way to have heartbeat power a machine off 
rather than reset it on v1; v2 includes the following CIB directive:



You may be able to play around with the meatware device, which waits for 
manual intervention, or with an external device, which *I believe* issues a 
system() call, if you can drive your device via the shell.
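
(The directive itself did not survive the mail formatting. The most likely
candidate is the cluster option that selects the fencing action --
stonith-action in Pacemaker, with reboot/poweroff values; the v2 heartbeat
spelling may differ. As an nvpair it would look roughly like this, with a
made-up id; check the documentation for your exact release:

  <nvpair id="cib-bootstrap-options-stonith-action"
          name="stonith-action" value="poweroff"/>
)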
> Chris
>
> ___
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>   


___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] stonith question

2009-08-19 Thread Chris Card



> Date: Wed, 19 Aug 2009 06:52:55 -0500
> From: deb...@us.ibm.com
> To: linux-ha@lists.linux-ha.org
> Subject: Re: [Linux-HA] stonith question
> 
> Chris Card wrote:

> > I'm planning to use stonith with heartbeat (v1), and I understand that to 
> > do this I need a "stonith" or "stonith_host" line in my ha.cf to get 
> > heartbeat to call the stonith plugin with the appropriate parameters. 
> >
> > What is not clear to me is under what conditions heartbeat will call the 
> > stonith plugin to power off a machine - is this documented anywhere?
> >   
> Not sure if a complete or even partially complete list exists, but in 
> general a node is STONITHed whenever heartbeat needs to make sure that 
> it is not using cluster resources.  Here is a nice little article that 
> includes some discussion on fencing:
> 
> http://techthoughts.typepad.com/managing_computers/2007/10/split-brain-quo.html
> > Chris

Thanks Dave, that's a useful document, though I'd still like some definite info 
about what heartbeat actually does.

I've been looking at the source, and one thing I noticed is that heartbeat 
itself only seems to send a "reset" command to the stonith device, whereas the 
stonith command line tool also accepts "on" and "off" as separate actions. Is 
this the case, or am I missing something? Is there a way to make heartbeat 
power off a machine and not automatically power it on again?
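
For a one-off manual power-off, the stonith CLI can ask the plugin for the
off action directly -- a sketch only, assuming your plugin implements the
optional off operation and that your build supports selecting the action with
-T (check stonith's usage text / man page):

  # generic shape; substitute a real plugin type, its parameters and the node name
  stonith -t <plugin-type> -p "<plugin parameters>" -T off <nodename>

heartbeat itself, as observed above, only ever asks for reset.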

Chris

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] stonith failed to start

2009-08-19 Thread Terry L. Inzauro
List,

I've got a dev cluster up and running with Xen/DRBD/heartbeat working.  After a
day or so of running, I saw that stonith had failed to start on node2 (it
initially started just fine).  I have seen this behavior before with this cluster.

What would cause the stonith 'start' operation to fail after it had initially
succeeded?


crm_mon output:
---
Refresh in 10s...


Last updated: Wed Aug 19 06:33:12 2009
Current DC: node1 (47d563cc-f8ec-4b6d-8092-d80ceb64dbbd)
2 Nodes configured.
4 Resources configured.


Node: node2 (c95ba6f0-5dcf-41d3-abb0-25e55ae313eb): online
Node: node1 (47d563cc-f8ec-4b6d-8092-d80ceb64dbbd): online

xen1 (heartbeat::ocf:Xen):   Started node2
xen2 (heartbeat::ocf:Xen):   Started node1
xen3 (heartbeat::ocf:Xen):   Started node2
Clone Set: Stonith_Clone_Group
stonithclone:0  (stonith:external/ssh): Started node1
stonithclone:1  (stonith:external/ssh): Stopped

Failed actions:
stonithclone:1_start_0 (node=node2, call=14, rc=1): complete


At first glance, it appears that the monitor operation fails; heartbeat then
tries to restart stonith on that node, and the 'start' operation fails as well.
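
Once the underlying cause is fixed, a recovery sketch (tool names as in the
heartbeat 2.x CRM; double-check the options on your version -- the
resource/node names below are simply taken from the output above) is to
inspect and clear the failcount and the failed op so the PE retries the start:

  # show the failcount the policy engine is acting on
  crm_failcount -G -U node2 -r stonithclone:1
  # clean up the failed op history so the resource is re-probed and restarted
  crm_resource -C -H node2 -r stonithclone:1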

Aug 18 11:02:37 node1 tengine: [3950]: WARN: update_failcount: Updating 
failcount for stonithclone:1 on
c95ba6f0-5dcf-41d3-abb0-25e55ae313eb after failed monitor: rc=14
Aug 18 11:02:37 node1 crmd: [3859]: info: do_state_transition: State transition 
S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC
cause=C_IPC_MESSAGE origin=route_message ]
Aug 18 11:02:37 node1 crmd: [3859]: info: do_state_transition: All 2 cluster 
nodes are eligible to run resources.
Aug 18 11:02:37 node1 pengine: [3951]: info: determine_online_status: Node 
node1 is online
Aug 18 11:02:37 node1 pengine: [3951]: info: determine_online_status: Node 
node2 is online
Aug 18 11:02:37 node1 pengine: [3951]: info: unpack_find_resource: Internally 
renamed stonithclone:0 on node2 to stonithclone:1
Aug 18 11:02:37 node1 pengine: [3951]: WARN: unpack_rsc_op: Processing failed 
op stonithclone:1_monitor_5000 on node2: Error
Aug 18 11:02:37 node1 pengine: [3951]: notice: native_print: 
romulus#011(heartbeat::ocf:Xen):#011Started node2
Aug 18 11:02:37 node1 pengine: [3951]: notice: native_print: 
remus#011(heartbeat::ocf:Xen):#011Started node1
Aug 18 11:02:37 node1 pengine: [3951]: notice: native_print: 
fortuna#011(heartbeat::ocf:Xen):#011Started node2
Aug 18 11:02:37 node1 pengine: [3951]: notice: clone_print: Clone Set: 
Stonith_Clone_Group
Aug 18 11:02:37 node1 pengine: [3951]: notice: native_print: 
stonithclone:0#011(stonith:external/ssh):#011Started node1
Aug 18 11:02:37 node1 pengine: [3951]: notice: native_print: 
stonithclone:1#011(stonith:external/ssh):#011Started node2
FAILED
Aug 18 11:02:37 node1 pengine: [3951]: notice: NoRoleChange: Leave resource 
xen2#011(node2)
Aug 18 11:02:37 node1 pengine: [3951]: notice: NoRoleChange: Leave resource 
xen1#011(node1)
Aug 18 11:02:37 node1 pengine: [3951]: notice: NoRoleChange: Leave resource 
xen3#011(node2)
Aug 18 11:02:37 node1 pengine: [3951]: notice: NoRoleChange: Leave resource 
stonithclone:0#011(node1)
Aug 18 11:02:37 node1 pengine: [3951]: notice: NoRoleChange: Recover resource 
stonithclone:1#011(node2)
Aug 18 11:02:37 node1 pengine: [3951]: notice: StopRsc:   node2#011Stop 
stonithclone:1
Aug 18 11:02:37 node1 pengine: [3951]: notice: StartRsc:  node2#011Start 
stonithclone:1
Aug 18 11:02:37 node1 pengine: [3951]: notice: RecurringOp: node2#011   
stonithclone:1_monitor_5000
Aug 18 11:02:37 node1 tengine: [3950]: info: extract_event: Aborting on 
transient_attributes changes for
c95ba6f0-5dcf-41d3-abb0-25e55ae313eb
Aug 18 11:02:37 node1 pengine: [3951]: info: process_pe_message: Transition 3: 
PEngine Input stored in:
/var/lib/heartbeat/pengine/pe-input-31.bz2
Aug 18 11:02:37 node1 pengine: [3951]: info: determine_online_status: Node 
node1 is online
Aug 18 11:02:37 node1 pengine: [3951]: info: determine_online_status: Node 
node2 is online
Aug 18 11:02:37 node1 pengine: [3951]: info: unpack_find_resource: Internally 
renamed stonithclone:0 on node2 to stonithclone:1
Aug 18 11:02:37 node1 pengine: [3951]: WARN: unpack_rsc_op: Processing failed 
op stonithclone:1_monitor_5000 on node2: Error
Aug 18 11:02:37 node1 pengine: [3951]: notice: native_print: 
xen2#011(heartbeat::ocf:Xen):#011Started node2
Aug 18 11:02:37 node1 pengine: [3951]: notice: native_print: 
xen1#011(heartbeat::ocf:Xen):#011Started node1
Aug 18 11:02:37 node1 pengine: [3951]: notice: native_print: 
xen3#011(heartbeat::ocf:Xen):#011Started node2
Aug 18 11:02:37 node1 pengine: [3951]: notice: clone_print: Clone Set: 
Stonith_Clone_Group
Aug 18 11:02:37 node1 pengine: [3951]: notice: native_print: 
stonithclone:0#011(stonith:external/ssh):#011Started node1
Aug 18 11:02:37 node1 pengine: [3951]: notice: native_print: 
stonithclone:1#011(stonith:external/ssh):#011Started node2
FAILED


If the node gets rebooted, it comes b

Re: [Linux-HA] stonith question

2009-08-19 Thread Dave Blaschke
Chris Card wrote:
> [Apologies if this is a FAQ]
>
> I'm planning to use stonith with heartbeat (v1), and I understand that to do 
> this I need a "stonith" or "stonith_host" line in my ha.cf to get heartbeat 
> to call the stonith plugin with the appropriate parameters. 
>
> What is not clear to me is under what conditions heartbeat will call the 
> stonith plugin to power off a machine - is this documented anywhere?
>   
Not sure if a complete or even partially complete list exists, but in 
general a node is STONITHed whenever heartbeat needs to make sure that 
it is not using cluster resources.  Here is a nice little article that 
includes some discussion on fencing:

http://techthoughts.typepad.com/managing_computers/2007/10/split-brain-quo.html
> Chris
>
> ___
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>   


___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Adding a resource to a running Cluster

2009-08-19 Thread Karl W. Lewis
>> clones have only meta_attributes, so move the two nvpairs over
there.  <<

Oh.  That's the bit I missed in the documentation, then.  Yes, once that's
sorted the cib swallows this update without the barest hint of a complaint.
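
For anyone hitting the same schema error, a rough sketch of the shape Dejan
describes (Pacemaker 1.0 syntax; the ids and the two attribute names here are
made up for illustration -- use your own):

  <clone id="fencing-clone">
    <meta_attributes id="fencing-clone-meta">
      <nvpair id="fencing-clone-meta-gu" name="globally-unique" value="false"/>
      <nvpair id="fencing-clone-meta-max" name="clone-max" value="3"/>
    </meta_attributes>
    <primitive id="fencing-rsc" class="stonith" type="external/egenera">
      <!-- instance_attributes, operations, ... as before -->
    </primitive>
  </clone>

i.e. the nvpairs hang off meta_attributes on the clone rather than off an
instance_attributes block.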

Thank you!!

(I'm sure you hear it so often that you can't hear it anymore, but I still
have to say: this clustering software *rocks*; I'm not sure there's a
cooler, more significant enterprise class project, short of the kernel
itself, anywhere.)

Be well,

Karl


On Wed, Aug 19, 2009 at 6:56 AM, Dejan Muhamedagic wrote:

> Hi,
>
> On Wed, Aug 19, 2009 at 06:27:32AM -0400, Karl W. Lewis wrote:
> > On Mon, Aug 10, 2009 at 2:20 PM, Karl W. Lewis  >wrote:
> >
> > > D'oh!
> > >
> > > Yes, it's Pacemaker 1.0.3  I should have said so.
> > >
> > > I am sorry for what turns out to be a[nother]  the stupid question.
>  Thank
> > > you for your kind patience.
> > >
> > > Be well,
> > >
> > > Karl
> > >
> > >
> > > On Mon, Aug 10, 2009 at 1:50 PM, Dejan Muhamedagic <
> deja...@fastmail.fm>wrote:
> > >
> > >> Hi,
> > >>
> > >> On Mon, Aug 10, 2009 at 12:54:44PM -0400, Karl W. Lewis wrote:
> > >> > I have used cibadmin to add contraints to a running cluster, but now
> I
> > >> wish
> > >> > to add another resource, a stonith configuration, to my cluster.
> > >>
> > >> I guess that this is pacemaker 1.0.
> > >>
> > >> > The snippet of xml I am trying to feed the cluster looks like this:
> > >> > 
> > >> > 
> > >> >  
> > >> >   
> > >> >  > >> value="false"/>
> > >> >   
> > >> >   
> > >> >   
> > >> >   
> > >>
> > >> Both are missing id. Also, replace "_" with "-".
> > >>
> > >> >   
> > >> >> >> type="external/egenera"
> > >> > provider="heartbeat">
> > >> > 
> > >> >> >> prereq="nothing"/>
> > >>
> > >> Missing id. "prereq" is now named "requires".
> > >>
> > >> Thanks,
> > >>
> > >> Dejan
> > >>
> > >> > 
> > >> > 
> > >> >  > >> value="wsc-voo-205,
> > >> > wsc-voo-206, wsc-voo-207"/>
> > >> > 
> > >> >   
> > >> >  
> > >> > 
> > >> >
> > >> > 
> > >> >
> > >> > >>>
> > >> >  cibadmin -V -V -V -V -V -V -V -C -o resources -x
> stonith_production.xml
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug2: main: Option o
> =>
> > >> > resources
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug2: main: Option x
> =>
> > >> > stonith_production.xml
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> > >> [admin
> > >> > input] 
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> > >> [admin
> > >> > input]   
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> > >> [admin
> > >> > input] 
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> > >> [admin
> > >> > input]> >> > value="false" />
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> > >> [admin
> > >> > input] 
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> > >> [admin
> > >> > input] 
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> > >> [admin
> > >> > input]   
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> > >> [admin
> > >> > input]   
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> > >> [admin
> > >> > input] 
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> > >> [admin
> > >> > input]  > >> > type="external/egenera" provider="heartbeat" >
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> > >> [admin
> > >> > input]   
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> > >> [admin
> > >> > input]  > >> > prereq="nothing" />
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> > >> [admin
> > >> > input]   
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> > >> [admin
> > >> > input]   
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> > >> [admin
> > >> > input]  > >> > value="wsc-voo-205, wsc-voo-206, wsc-voo-207" />
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> > >> [admin
> > >> > input]   
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> > >> [admin
> > >> > input] 
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> > >> [admin
> > >> > input]   
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> > >> [admin
> > >> > input] 
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug:
> > >> > init_client_ipc_comms_nodispatch: Attempting to talk on:
> > >> /var/run/crm/cib_rw
> > >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3:
> > >> > init_client_ipc_comms_nodispatch: Processing of /var/run/crm/cib_rw
> > >> complete
> > >> > cibadmin[18807]:

Re: [Linux-HA] Adding a resource to a running Cluster

2009-08-19 Thread Dejan Muhamedagic
Hi,

On Wed, Aug 19, 2009 at 06:27:32AM -0400, Karl W. Lewis wrote:
> On Mon, Aug 10, 2009 at 2:20 PM, Karl W. Lewis wrote:
> 
> > D'oh!
> >
> > Yes, it's Pacemaker 1.0.3  I should have said so.
> >
> > I am sorry for what turns out to be a[nother]  the stupid question.  Thank
> > you for your kind patience.
> >
> > Be well,
> >
> > Karl
> >
> >
> > On Mon, Aug 10, 2009 at 1:50 PM, Dejan Muhamedagic 
> > wrote:
> >
> >> Hi,
> >>
> >> On Mon, Aug 10, 2009 at 12:54:44PM -0400, Karl W. Lewis wrote:
> >> > I have used cibadmin to add contraints to a running cluster, but now I
> >> wish
> >> > to add another resource, a stonith configuration, to my cluster.
> >>
> >> I guess that this is pacemaker 1.0.
> >>
> >> > The snippet of xml I am trying to feed the cluster looks like this:
> >> > 
> >> > 
> >> >  
> >> >   
> >> >  >> value="false"/>
> >> >   
> >> >   
> >> >   
> >> >   
> >>
> >> Both are missing id. Also, replace "_" with "-".
> >>
> >> >   
> >> >>> type="external/egenera"
> >> > provider="heartbeat">
> >> > 
> >> >>> prereq="nothing"/>
> >>
> >> Missing id. "prereq" is now named "requires".
> >>
> >> Thanks,
> >>
> >> Dejan
> >>
> >> > 
> >> > 
> >> >  >> value="wsc-voo-205,
> >> > wsc-voo-206, wsc-voo-207"/>
> >> > 
> >> >   
> >> >  
> >> > 
> >> >
> >> > 
> >> >
> >> > >>>
> >> >  cibadmin -V -V -V -V -V -V -V -C -o resources -x stonith_production.xml
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug2: main: Option o =>
> >> > resources
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug2: main: Option x =>
> >> > stonith_production.xml
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> >> [admin
> >> > input] 
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> >> [admin
> >> > input]   
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> >> [admin
> >> > input] 
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> >> [admin
> >> > input]>> > value="false" />
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> >> [admin
> >> > input] 
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> >> [admin
> >> > input] 
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> >> [admin
> >> > input]   
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> >> [admin
> >> > input]   
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> >> [admin
> >> > input] 
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> >> [admin
> >> > input]  >> > type="external/egenera" provider="heartbeat" >
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> >> [admin
> >> > input]   
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> >> [admin
> >> > input]  >> > prereq="nothing" />
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> >> [admin
> >> > input]   
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> >> [admin
> >> > input]   
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> >> [admin
> >> > input]  >> > value="wsc-voo-205, wsc-voo-206, wsc-voo-207" />
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> >> [admin
> >> > input]   
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> >> [admin
> >> > input] 
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> >> [admin
> >> > input]   
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
> >> [admin
> >> > input] 
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug:
> >> > init_client_ipc_comms_nodispatch: Attempting to talk on:
> >> /var/run/crm/cib_rw
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3:
> >> > init_client_ipc_comms_nodispatch: Processing of /var/run/crm/cib_rw
> >> complete
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug:
> >> > init_client_ipc_comms_nodispatch: Attempting to talk on:
> >> > /var/run/crm/cib_callback
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3:
> >> > init_client_ipc_comms_nodispatch: Processing of
> >> /var/run/crm/cib_callback
> >> > complete
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: cib_native_signon_raw:
> >> > Connection to CIB successful
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3:
> >> cib_native_perform_op:
> >> > Sending cib_create message to CIB service
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3:
> >> cib_native_perform_op:
> >> > Message sent
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3:
> >> cib_native_perform_op:
> >> > Async call, returning
> >> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3: main: cibadmin
> >> waiting
> >> > for reply from the 

Re: [Linux-HA] how to get group members

2009-08-19 Thread Dominik Klein
Ivan Gromov wrote:
> Hi, all
> 
> How to get group members?
> I use crm_resource -x -t group -r group_Name. Can I get members without 
> xml part?

What about

crm configure show group_Name ?

Regards
Dominik
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] Adding a resource to a running Cluster

2009-08-19 Thread Karl W. Lewis
On Mon, Aug 10, 2009 at 2:20 PM, Karl W. Lewis wrote:

> D'oh!
>
> Yes, it's Pacemaker 1.0.3  I should have said so.
>
> I am sorry for what turns out to be a[nother] stupid question.  Thank
> you for your kind patience.
>
> Be well,
>
> Karl
>
>
> On Mon, Aug 10, 2009 at 1:50 PM, Dejan Muhamedagic wrote:
>
>> Hi,
>>
>> On Mon, Aug 10, 2009 at 12:54:44PM -0400, Karl W. Lewis wrote:
>> > I have used cibadmin to add contraints to a running cluster, but now I
>> wish
>> > to add another resource, a stonith configuration, to my cluster.
>>
>> I guess that this is pacemaker 1.0.
>>
>> > The snippet of xml I am trying to feed the cluster looks like this:
>> > 
>> > 
>> >  
>> >   
>> > > value="false"/>
>> >   
>> >   
>> >   
>> >   
>>
>> Both are missing id. Also, replace "_" with "-".
>>
>> >   
>> >   > type="external/egenera"
>> > provider="heartbeat">
>> > 
>> >   > prereq="nothing"/>
>>
>> Missing id. "prereq" is now named "requires".
>>
>> Thanks,
>>
>> Dejan
>>
>> > 
>> > 
>> > > value="wsc-voo-205,
>> > wsc-voo-206, wsc-voo-207"/>
>> > 
>> >   
>> >  
>> > 
>> >
>> > 
>> >
>> > >>>
>> >  cibadmin -V -V -V -V -V -V -V -C -o resources -x stonith_production.xml
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug2: main: Option o =>
>> > resources
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug2: main: Option x =>
>> > stonith_production.xml
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
>> [admin
>> > input] 
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
>> [admin
>> > input]   
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
>> [admin
>> > input] 
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
>> [admin
>> > input]   > > value="false" />
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
>> [admin
>> > input] 
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
>> [admin
>> > input] 
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
>> [admin
>> > input]   
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
>> [admin
>> > input]   
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
>> [admin
>> > input] 
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
>> [admin
>> > input] > > type="external/egenera" provider="heartbeat" >
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
>> [admin
>> > input]   
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
>> [admin
>> > input] > > prereq="nothing" />
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
>> [admin
>> > input]   
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
>> [admin
>> > input]   
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
>> [admin
>> > input] > > value="wsc-voo-205, wsc-voo-206, wsc-voo-207" />
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
>> [admin
>> > input]   
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
>> [admin
>> > input] 
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
>> [admin
>> > input]   
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: log_data_element: main:
>> [admin
>> > input] 
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug:
>> > init_client_ipc_comms_nodispatch: Attempting to talk on:
>> /var/run/crm/cib_rw
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3:
>> > init_client_ipc_comms_nodispatch: Processing of /var/run/crm/cib_rw
>> complete
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug:
>> > init_client_ipc_comms_nodispatch: Attempting to talk on:
>> > /var/run/crm/cib_callback
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3:
>> > init_client_ipc_comms_nodispatch: Processing of
>> /var/run/crm/cib_callback
>> > complete
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: cib_native_signon_raw:
>> > Connection to CIB successful
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3:
>> cib_native_perform_op:
>> > Sending cib_create message to CIB service
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3:
>> cib_native_perform_op:
>> > Message sent
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3:
>> cib_native_perform_op:
>> > Async call, returning
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug3: main: cibadmin
>> waiting
>> > for reply from the local CIB
>> > cibadmin[18807]: 2009/08/10_12:44:17 info: main: Starting mainloop
>> > cibadmin[18807]: 2009/08/10_12:44:17 debug: debug2: cib_native_callback:
>> > Invoking callback cibadmin_op_callback for call 2
>> > cibadmin[18807]: 2009/08/10_12:44:17 WARN: cibadmin_op_callback: Call
>> > cib_create failed (-47): Update does not conform to the configured
>> > schema/DTD
>> > Call cib

Re: [Linux-HA] Pacemaker 1.0.4 & HBv2 1.99 // Question about ucast

2009-08-19 Thread Andrew Beekhof
On Wed, Aug 19, 2009 at 9:51 AM, Alain.Moulle wrote:
> Hi,
> Thanks again. And this leads me to another linked question :
> if we want to have redundancy for the heartbeat, that means that
> we could set 8 lines likewise :
> ucast eth0 139.111.12.1
> ucast eth0 139.111.12.2
> ucast eth0 139.111.12.3
> ucast eth0 139.111.12.4
>
> ucast eth1 139.222.12.1
> ucast eth1 139.222.12.2
> ucast eth1 139.222.12.3
> ucast eth1 139.222.12.4
>
> Right ?

Yes

>
> But in this case, suppose we lost the eth1 on node 3, so that :
> ucast eth1 139.222.12.3 fails from other nodes, does that lead
> to stonith of node 3 ?

no, because (as you said below) we can still see it via eth0
so there is no need to shoot anyone

> or the fact that :
> ucast eth0 139.111.12.3 is always ok prevents node3 to be kill by others ?
>
> Thanks
> Alain
>
>
>> Hi,
>> > I wonder if we can use ucast for heartbeat in a cluster of more than
>> > two-nodes,
>> > and if so, in case of for example 4 nodes , I suppose we have to set 4 
>> > lines
>> > in ha.cf, one line per node in the cluster :
>> > ucast eth0 139.111.12.1
>> > ucast eth0 139.111.12.2
>> > ucast eth0 139.111.12.3
>> > ucast eth0 139.111.12.4
>> > but in this case, does it matter to have a "ucast on myself" as ha.cf has
>> > to be the same on the 4 nodes ?
>>
>>
>> nope, perfectly fine to do this
>>
>>
>>> > Thanks for your response.
>>> > Alain
> ___
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] stonith question

2009-08-19 Thread Chris Card

[Apologies if this is a FAQ]

I'm planning to use stonith with heartbeat (v1), and I understand that to do 
this I need a "stonith" or "stonith_host" line in my ha.cf to get heartbeat to 
call the stonith plugin with the appropriate parameters. 
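
For reference, the sample ha.cf that ships with heartbeat carries commented-out
examples of both forms; roughly (the baytech device and its parameters are just
the sample's placeholders -- see the documentation for your plugin's real
parameter list):

  # one stonith device, configured from a separate file, able to fence any node
  stonith baytech /etc/ha.d/conf/stonith.baytech
  # per-host form: parameters given inline; '*' means it can fence any node
  stonith_host * baytech 10.0.0.3 mylogin mysecretpassword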

What is not clear to me is under what conditions heartbeat will call the 
stonith plugin to power off a machine - is this documented anywhere?

Chris

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


[Linux-HA] Pacemaker 1.0.4 & HBv2 1.99 // Question about ucast

2009-08-19 Thread Alain.Moulle
Hi,
Thanks again. And this leads me to another related question:
if we want to have redundancy for the heartbeat, that means that
we could set 8 lines like this:
ucast eth0 139.111.12.1
ucast eth0 139.111.12.2
ucast eth0 139.111.12.3
ucast eth0 139.111.12.4

ucast eth1 139.222.12.1
ucast eth1 139.222.12.2
ucast eth1 139.222.12.3
ucast eth1 139.222.12.4

Right ?

But in this case, suppose we lose eth1 on node 3, so that
ucast eth1 139.222.12.3 fails from the other nodes. Does that lead
to stonith of node 3? Or does the fact that
ucast eth0 139.111.12.3 is still OK prevent node3 from being killed by the others?

Thanks
Alain


> Hi,
> > I wonder if we can use ucast for heartbeat in a cluster of more than
> > two-nodes,
> > and if so, in case of for example 4 nodes , I suppose we have to set 4 lines
> > in ha.cf, one line per node in the cluster :
> > ucast eth0 139.111.12.1
> > ucast eth0 139.111.12.2
> > ucast eth0 139.111.12.3
> > ucast eth0 139.111.12.4
> > but in this case, does it matter to have a "ucast on myself" as ha.cf has
> > to be the same on the 4 nodes ?
>   
>
> nope, perfectly fine to do this
>
>   
>> > Thanks for your response.
>> > Alain
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems