Re: [Pacemaker] Announce: Pacemaker 1.0.7 (stable) Released

2010-01-20 Thread Andrew Beekhof
On Wed, Jan 20, 2010 at 12:30 AM, Thomas Guthmann wrote:
> Hey,
>
>> Pre-built packages for Pacemaker and its immediate dependencies are
>> currently building and will be available for openSUSE, SLES, Fedora, RHEL,
>> CentOS from the ClusterLabs Build Area (http://www.clusterlabs.org/rpm)
>> shortly.
>
> Thanks Andrew. I'll move from 1.0.6 + patch to 1.0.7 at the end of the week and
> I will give you feedback if I find anything weird.
>
> Is it also possible to have corosync 1.2.0 in the repository?
>

It should be there later today

___
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker


Re: [Pacemaker] Announce: Pacemaker 1.0.7 (stable) Released

2010-01-20 Thread Andreas Mock
> -----Original Message-----
> From: "Andrew Beekhof"
> Sent: 20.01.10 09:31:00
> To: pacemaker@oss.clusterlabs.org
> Subject: Re: [Pacemaker] Announce: Pacemaker 1.0.7 (stable) Released

> > Is it also possible to have corosync 1.2.0 in the repository ?
> >
> 
> It should be there later today

Hi all,

One question on that: if I want to work with cLVM,
do I need corosync or openais together with Pacemaker?

Best regards
Andreas Mock




[Pacemaker] cLVM and openSuSE

2010-01-20 Thread Andreas Mock
Hi all,

Does anyone have experience with cLVM on openSuSE?
Currently version 2.02.45 is part of the openSuSE 11.2
distribution. Looking at the cLVM project page you can find
version 2.02.58. What's more, the changelog gives
the impression that some of the changes made in the meantime
are not merely cosmetic.

Is cLVM production ready?
Any experiences?

Or is it saner to use LVM and be careful within the
cluster environment (as we have done for years now)?

Thanks in advance
Andreas Mock




Re: [Pacemaker] Announce: Pacemaker 1.0.7 (stable) Released

2010-01-20 Thread Andrew Beekhof
On Wed, Jan 20, 2010 at 9:43 AM, Andreas Mock  wrote:
>> -----Original Message-----
>> From: "Andrew Beekhof"
>> Sent: 20.01.10 09:31:00
>> To: pacemaker@oss.clusterlabs.org
>> Subject: Re: [Pacemaker] Announce: Pacemaker 1.0.7 (stable) Released
>
>> > Is it also possible to have corosync 1.2.0 in the repository ?
>> >
>>
>> It should be there later today
>
> Hi all,
>
> One question on that: if I want to work with cLVM,
> do I need corosync or openais together with Pacemaker?

Both.
Pacemaker only needs corosync, but cLVM needs the DLM, which needs
pieces from openais.



Re: [Pacemaker] Announce: Pacemaker 1.0.7 (stable) Released

2010-01-20 Thread Andreas Mock
> -----Original Message-----
> From: "Andrew Beekhof"
> Sent: 18.01.10 21:30:32
> To: pacemaker@oss.clusterlabs.org
> Subject: Re: [Pacemaker] Announce: Pacemaker 1.0.7 (stable) Released


> 
> Done. Please let me know how it goes.

Hi Andrew,

first feedback on package management.

Experience with the ncurses YaST installer on openSuSE 11.2:
a) Adding the repository with
zypper ar http://clusterlabs.org/rpm/opensuse-11.2/clusterlabs.repo
zypper refresh

works fine.

b) Installation with ncurses YaST
* Installing "pacemaker" seems to pull in corosync and heartbeat
but NOT openais. Yes, now I know that this is enough for pacemaker only.

* The picture 
http://clusterlabs.org/mediawiki/images/thumb/c/c7/Install_Dependancies.png/400px-Install_Dependancies.png
is misleading at this point.
I would recommend updating this picture so that pacemaker depends on
(pulls in) corosync with its libraries BUT not openais. Add another bubble
on the same level as pacemaker, with openais being dependent on corosync.

* Please add a hint to the documentation about what you told me in the other
mail; IMHO it's helpful:
- pacemaker needs corosync
- openais needs corosync
- cLVM and others (upper-level cluster services) need DLM, and that needs
openais

More feedback following...  ;-)

Best regards
Andreas Mock




Re: [Pacemaker] 1.0.7 upgraded, restarting resources problem

2010-01-20 Thread Martin Gombač

Should I file a bug for the problem (restarting resources) described below?
Regards,
M.

Dejan Muhamedagic wrote:

Hi,

On Mon, Jan 18, 2010 at 10:00:39PM +0100, Martin Gombač wrote:
  

Hi,

I have one m/s drbd resource and one Xen instance on top. Both m/s
instances are primary.
When I restart the node that's _not_ hosting the Xen instance (ibm1),
Pacemaker restarts the running Xen instance on the other node (ibm2).
There is no need to do that. I thought it got fixed
(http://developerbugs.linux-foundation.org/show_bug.cgi?id=2153).
Didn't it?

Here is my config once more. Please note the WARNING showed up only
after the upgrade.
(BTW, setting the drbd0predHosting score to 0 doesn't restart it. But it
doesn't help resource ordering either.)

[r...@ibm1 etc]# crm configure show
WARNING: notify: operation name not recognized



That's from the shell, please ignore it. Strange, the operation
list should've been updated a long time ago.

Thanks,

Dejan

  

node $id="3d430f49-b915-4d52-a32b-b0799fa17ae7" ibm2
node $id="4b2047c8-f3a0-4935-84a2-967b548598c9" ibm1
primitive Hosting ocf:heartbeat:Xen \
   params xmfile="/etc/xen/Hosting.cfg" shutdown_timeout="303" \
   meta target-role="Started" allow-migrate="true" is-managed="true" \
   op monitor interval="120s" timeout="506s" start-delay="5s" \
   op migrate_to interval="0s" timeout="304s" \
   op migrate_from interval="0s" timeout="304s" \
   op stop interval="0s" timeout="304s" \
   op start interval="0s" timeout="202s"
primitive drbd_r0 ocf:linbit:drbd \
   params drbd_resource="r0" \
   op monitor interval="15s" role="Master" timeout="30s" \
   op monitor interval="30s" role="Slave" timeout="30s" \
   op stop interval="0s" timeout="501s" \
   op notify interval="0s" timeout="90s" \
   op demote interval="0s" timeout="90s" \
   op promote interval="0s" timeout="90s" \
   op start interval="0s" timeout="255s"
ms ms_drbd_r0 drbd_r0 \
   meta notify="true" master-max="2" inteleave="true" \
   is-managed="true" target-role="Started"
order drbd0predHosting inf: ms_drbd_r0:promote Hosting:start
property $id="cib-bootstrap-options" \
   dc-version="1.0.7-b1191b11d4b56dcae8f34715d52532561b875cd5" \
   cluster-infrastructure="Heartbeat" \
   stonith-enabled="false" \
   no-quorum-policy="ignore" \
   default-resource-stickiness="10" \
   last-lrm-refresh="1263845352"

All I want is to have just the one resource, Hosting, started after drbd
has been promoted (primary) on the node where it is starting.
Please advise me if you can.

Thank you,
regards,
M.
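
A minimal sketch of how that goal is commonly expressed in the crm shell, assuming a colocation constraint is added next to the existing order constraint. The constraint id Hosting_on_drbd0 is hypothetical. The ms definition is shown with the meta attribute spelled interleave, whereas the configuration above has inteleave. Whether either point is related to the restarts is not established in this thread.

```
ms ms_drbd_r0 drbd_r0 \
   meta notify="true" master-max="2" interleave="true" \
   is-managed="true" target-role="Started"
# Hypothetical constraint: run Hosting only on a node where drbd is Master
colocation Hosting_on_drbd0 inf: Hosting ms_drbd_r0:Master
# Order constraint as already present in the configuration above
order drbd0predHosting inf: ms_drbd_r0:promote Hosting:start
```

The colocation says where Hosting may run and the order says when it may start; the two are independent, so the existing order constraint alone does not tie Hosting to a node.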



Re: [Pacemaker] Announce: Pacemaker 1.0.7 (stable) Released

2010-01-20 Thread Andrew Beekhof
On Wed, Jan 20, 2010 at 10:29 AM, Andreas Mock  wrote:
>> -----Original Message-----
>> From: "Andrew Beekhof"
>> Sent: 18.01.10 21:30:32
>> To: pacemaker@oss.clusterlabs.org
>> Subject: Re: [Pacemaker] Announce: Pacemaker 1.0.7 (stable) Released
>
>
>>
>> Done. Please let me know how it goes.
>
> Hi Andrew,
>
> first feedback on package management.
>
> Experience with the ncurses YaST installer on openSuSE 11.2:
> a) Adding the repository with
> zypper ar http://clusterlabs.org/rpm/opensuse-11.2/clusterlabs.repo
> zypper refresh
>
> works fine.
>
> b) Installation with ncurses YaST
> * Installing "pacemaker" seems to pull in corosync and heartbeat
> but NOT openais.

Correct - that is intentional.

>  Yes, now I know that this is enough for pacemaker only.
>
> * The picture 
> http://clusterlabs.org/mediawiki/images/thumb/c/c7/Install_Dependancies.png/400px-Install_Dependancies.png
> is misleading at this point.

Yeah, needs to be updated one of these days :-(

> I would recommend updating this picture so that pacemaker depends on
> (pulls in) corosync with its libraries BUT not openais. Add another bubble
> on the same level as pacemaker, with openais being dependent on corosync.
>
> * Please add a hint to the documentation about what you told me in the other
> mail; IMHO it's helpful:
> - pacemaker needs corosync
> - openais needs corosync
> - cLVM and others (upper-level cluster services) need DLM, and that needs
> openais
>
> More feedback following...  ;-)
>
> Best regards
> Andreas Mock



Re: [Pacemaker] APC Master Stonith

2010-01-20 Thread Dejan Muhamedagic
Hi,

On Tue, Jan 19, 2010 at 04:37:05PM -0500, Errol Neal wrote:
> On Tue, Jan 19, 2010 04:19  PM, Sander van Vugt  wrote:
> > Hi,
> > 
> > I hope someone has configured the APC Master Stonith resource (which you
> > would use to have Pacemaker talk to a device like the APC switched rack PDU),
> > as I have a - probably extremely stupid - conceptual question about it.
> > 
> > When I look at the options the resource has, it allows me to enter
> > username, password and IP address. What I would also expect is to give
> > it something like the name of the node that it should do STONITH on, as
> > well as the port on the device that it should power cycle. Am I missing
> > something? Or do I have to specify this information as additional
> > attributes? And if so, what exactly would be the syntax?
> > 
> What type of device are you trying to get the plugin to work with?
> I'm using APC rack PDUs and this plugin did not work by default
> for me. I had to hack it to get it to work for me, but it works
> exactly how I wanted it to. By the way, I'm not using SNMP
> - I'm using telnet.

It would be good to hear why it didn't work and what you did
to make it work. Incidentally, there is a bugzilla with a patch
which should help handling certain device releases, but for lack
of devices to try it out, it never got applied. See
http://developerbugs.linux-foundation.org/show_bug.cgi?id=1891

Thanks,

Dejan

> So here is how mine is configured:
> 
> primitive stonith-apcmaster-axigen2 stonith:apcmaster \
>     params ipaddr="x.x.x.x" login="axigen2" password="x.x.x.x" \
>     op monitor interval="120s" timeout="20s" \
>     op startup interval="0" timeout="60s"
> 
> Then I have a constraint that prohibits a node from committing suicide.
> 
> I'll describe what I did to get it going in my environment. 
> 
> I created a user account for each node on its respective PDU
> and only allowed it to control its own power.
> 
> As I mentioned, I hacked the plugin's source code and recompiled. My changes
> were #1 to make it work and #2 to make it work plainly and simply: log in and
> shut 'er down. I can provide you my changes if you think it will work for you.
> 
> -Errol



Re: [Pacemaker] APC Master Stonith

2010-01-20 Thread Dejan Muhamedagic
Hi,

On Tue, Jan 19, 2010 at 10:19:10PM +0100, Sander van Vugt wrote:
> Hi,
> 
> I hope someone has configured the APC Master Stonith resource (which you
> would use to have Pacemaker talk to a device like the APC switched rack PDU),
> as I have a - probably extremely stupid - conceptual question about it.
> 
> When I look at the options the resource has, it allows me to enter
> username, password and IP address. What I would also expect is to give
> it something like the name of the node that it should do STONITH on, as
> well as the port on the device that it should power cycle. Am I missing
> something? Or do I have to specify this information as additional
> attributes? And if so, what exactly would be the syntax?

Never worked with the plugin, but apparently it gets the list of
nodes it can manage from the device itself. This list is compiled
from the outlet names, so they should match node names. Then, on
fence request, the plugin will find the right outlet based on the
node name.

BTW, the device can't handle more than one connection at a
time, so you should create only one resource (no clones).
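
A minimal sketch of a single (non-cloned) apcmaster resource along those lines, assuming the outlet names on the device have been set to the node names. The resource id, address, login and password are placeholders, not values from this thread; the parameter names ipaddr, login and password are the ones the plugin offers according to the question quoted above.

```
# One stonith resource for the whole cluster, deliberately not cloned
# (id, address and credentials are placeholders)
primitive stonith-apc stonith:apcmaster \
    params ipaddr="192.0.2.10" login="apc" password="secret" \
    op monitor interval="120s" timeout="20s"
```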

Thanks,

Dejan

> Thanks for enlightening me,
> Sander



Re: [Pacemaker] 1.0.7 upgraded, restarting resources problem

2010-01-20 Thread Dejan Muhamedagic
Hi,

On Wed, Jan 20, 2010 at 10:29:54AM +0100, Martin Gombač wrote:
> Should I file a bug for the problem (restarting resources) described below?
> Regards,
> M.
> 
> Dejan Muhamedagic wrote:
> >Hi,
> >
> >On Mon, Jan 18, 2010 at 10:00:39PM +0100, Martin Gombač wrote:
> >>Hi,
> >>
> >>I have one m/s drbd resource and one Xen instance on top. Both m/s
> >>instances are primary.
> >>When I restart the node that's _not_ hosting the Xen instance (ibm1),
> >>Pacemaker restarts the running Xen instance on the other node (ibm2).
> >>There is no need to do that. I thought it got fixed
> >>(http://developerbugs.linux-foundation.org/show_bug.cgi?id=2153).
> >>Didn't it?

If you can reproduce the issue described in this bugzilla, then
please reopen it and attach a hb_report tarball.

Thanks,

Dejan

> >>Here is my config once more. Please note the WARNING showed up only
> >>after the upgrade.
> >>(BTW, setting the drbd0predHosting score to 0 doesn't restart it. But it
> >>doesn't help resource ordering either.)
> >>
> >>[r...@ibm1 etc]# crm configure show
> >>WARNING: notify: operation name not recognized
> >
> >That's from the shell, please ignore it. Strange, the operation
> >list should've been updated a long time ago.
> >
> >Thanks,
> >
> >Dejan
> >
> >>node $id="3d430f49-b915-4d52-a32b-b0799fa17ae7" ibm2
> >>node $id="4b2047c8-f3a0-4935-84a2-967b548598c9" ibm1
> >>primitive Hosting ocf:heartbeat:Xen \
> >>   params xmfile="/etc/xen/Hosting.cfg" shutdown_timeout="303" \
> >>   meta target-role="Started" allow-migrate="true" is-managed="true" \
> >>   op monitor interval="120s" timeout="506s" start-delay="5s" \
> >>   op migrate_to interval="0s" timeout="304s" \
> >>   op migrate_from interval="0s" timeout="304s" \
> >>   op stop interval="0s" timeout="304s" \
> >>   op start interval="0s" timeout="202s"
> >>primitive drbd_r0 ocf:linbit:drbd \
> >>   params drbd_resource="r0" \
> >>   op monitor interval="15s" role="Master" timeout="30s" \
> >>   op monitor interval="30s" role="Slave" timeout="30s" \
> >>   op stop interval="0s" timeout="501s" \
> >>   op notify interval="0s" timeout="90s" \
> >>   op demote interval="0s" timeout="90s" \
> >>   op promote interval="0s" timeout="90s" \
> >>   op start interval="0s" timeout="255s"
> >>ms ms_drbd_r0 drbd_r0 \
> >>   meta notify="true" master-max="2" inteleave="true" \
> >>   is-managed="true" target-role="Started"
> >>order drbd0predHosting inf: ms_drbd_r0:promote Hosting:start
> >>property $id="cib-bootstrap-options" \
> >>   dc-version="1.0.7-b1191b11d4b56dcae8f34715d52532561b875cd5" \
> >>   cluster-infrastructure="Heartbeat" \
> >>   stonith-enabled="false" \
> >>   no-quorum-policy="ignore" \
> >>   default-resource-stickiness="10" \
> >>   last-lrm-refresh="1263845352"
> >>
> >>All I want is to have just the one resource, Hosting, started after drbd
> >>has been promoted (primary) on the node where it is starting.
> >>Please advise me if you can.
> >>
> >>Thank you,
> >>regards,
> >>M.



Re: [Pacemaker] cLVM and openSuSE

2010-01-20 Thread Sander van Vugt
I'm using it in Pacemaker, with clone resources for dlm and clvm and
individual resources for each of the logical volumes and it works
perfectly. Some cluster related stuff in my environment worked on cLVM
only, such as pvcreate on multipath devices (might be my ignorance
though ;-)

In some environments you just need to be sure that no other server is
currently touching your storage, like EVMS did in the old days.
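
A minimal sketch of what such a setup can look like in the crm shell, assuming the resource agents ocf:pacemaker:controld (for the DLM) and ocf:lvm2:clvmd as packaged on SLES 11, plus ocf:heartbeat:LVM for a volume group. All resource ids and the volume group name vg1 are hypothetical, and this is not a configuration taken from this thread.

```
# DLM control daemon and clvmd, assumed agent names as shipped on SLES 11
primitive dlm ocf:pacemaker:controld \
    op monitor interval="60s" timeout="60s"
primitive clvm ocf:lvm2:clvmd \
    op monitor interval="60s" timeout="60s"
# Hypothetical clustered volume group activated on every node
primitive vg1 ocf:heartbeat:LVM \
    params volgrpname="vg1" \
    op monitor interval="60s" timeout="60s"
group base-group dlm clvm vg1
clone base-clone base-group \
    meta interleave="true"
```

Grouping keeps the start order dlm, clvm, vg1 on each node, and cloning the group runs that stack on every node.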

HTH,
Sander


On Wed, 2010-01-20 at 09:49 +0100, Andreas Mock wrote:
> Hi all,
> 
> Does anyone have experience with cLVM on openSuSE?
> Currently version 2.02.45 is part of the openSuSE 11.2
> distribution. Looking at the cLVM project page you can find
> version 2.02.58. What's more, the changelog gives
> the impression that some of the changes made in the meantime
> are not merely cosmetic.
> 
> Is cLVM production ready?
> Any experiences?
> 
> Or is it saner to use LVM and be careful within the
> cluster environment (as we have done for years now)?
> 
> Thanks in advance
> Andreas Mock





Re: [Pacemaker] cLVM and openSuSE

2010-01-20 Thread Sander van Vugt
Hi Andreas,

> Hi Sander,
> 
> thank you for the fast answer.
> 
> >I'm using it in Pacemaker, with clone resources for dlm and clvm and
> >individual resources for each of the logical volumes and it works
> >perfectly. Some cluster related stuff in my environment worked on cLVM
> >only, such as pvcreate on multipath devices (might be my ignorance
> >though ;-)
> 
> 
> a) Do you use the distribution packages of cLVM and DLM or are you
> building more up to date software from source?
> 
I'm a lazy person, using distribution packages.

> b) May I ask you for a pacemaker config  snippet concerning the
> cloned services for cLVM and DLM?

Will get one for you tonight, no access now.
> 
> c) Are you using it with openSuSE?

SLES

Best regards,
Sander




[Pacemaker] metadata (timeout) ignored?

2010-01-20 Thread Markus M.

Hello,

I have a question about the metadata returned by an OCF resource agent using
the "meta-data" command and the behaviour of the cluster.


When checking the resource agent's metadata using crm I get this:

# crm
crm(live)# ra
crm(live)ra#  meta cluster_oracle ocf
bla (ocf:heartbeat:cluster_oracle)

Master/Slave OCF Resource Agent for Oracle (clustered)

Parameters (* denotes required, [] the default):

oracle_role* (string): Ora role
Required to assign the Oracle role. Must be "master" or "slave"

Operations' defaults (advisory minimum):

start    timeout=240
promote  timeout=90
demote   timeout=90
notify   timeout=90
stop     timeout=100
monitor  timeout=20 interval=20 depth=0
monitor  timeout=20 interval=10 depth=0

So it seems that for the "stop" action there is a timeout of 100 seconds
defined. But at cluster shutdown I can see this in the ha-debug log:


...
Jan 18 14:31:35 node1 crmd: [12844]: info: te_rsc_command: Initiating 
action 5: stop oracle_primary_stop_0 on node1 (local)
Jan 18 14:31:35 node11 pengine: [12848]: notice: LogActions: Leave 
resource oracle_secondary  (Stopped)

Jan 18 14:31:35 node1 lrmd: [12841]: info: rsc:oracle_primary:7: stop
Jan 18 14:31:35 node1 crmd: [12844]: info: do_lrm_rsc_op: Performing 
key=5:10:0:40ea1f42-c929-40d6-a0ed-569a7c8944bc op=oracle_primary_stop_0 )
Jan 18 14:31:35 node1 lrmd: [12841]: info: RA output: 
(oracle_primary:stop:stderr) 
/usr/lib/ocf/resource.d//heartbeat/cluster_oracle[247]:
Jan 18 14:31:35 node1 pengine: [12848]: WARN: process_pe_message: 
Transition 10: WARNINGs found during PE processing. PEngine Input stored 
in: /var/lib/pengine/pe-warn-2220.bz2
Jan 18 14:31:35 node1 pengine: [12848]: info: process_pe_message: 
Configuration WARNINGs found during PE processing.  Please run 
"crm_verify -L" to identify issues.
Jan 18 14:31:55 node1 lrmd: [12841]: WARN: oracle_primary:stop process 
(PID 14386) timed out (try 1).  Killing with signal SIGTERM (15).
Jan 18 14:31:55 node1 lrmd: [12841]: info: RA output: 
(oracle_primary:stop:stderr)

Session terminated, killing shell...
Jan 18 14:31:57 node1 lrmd: [12841]: info: RA output: 
(oracle_primary:stop:stderr)  ...killed.


Apparently a timeout occurred at the stop action after 20 seconds. But
why, if the resource defined 100 secs?


With kind regards
Markus



[Pacemaker] Pacemaker stopping all resources when any change is made

2010-01-20 Thread Harrison James
Hello,

I am controlling two instances of WebSphere MQ on two machines. The
idea is that normally one instance runs on one machine. However, if
anything should happen to one of the instances then it fails over to the
other machine.

It is working, except that if one of the instances fails over to the
other node then every resource is automatically shut down and brought
back up. I don't want the existing resources to stop running, only the
failed resources.

I have the resources grouped, so one group has a floating IP address,
shared file system, and WebSphere MQ init script. There are two such groups.

If anyone can help I would be grateful.

Many thanks,

James Harrison



Re: [Pacemaker] metadata (timeout) ignored?

2010-01-20 Thread Dejan Muhamedagic
Hi,

On Wed, Jan 20, 2010 at 04:28:49PM +0100, Markus M. wrote:
> Hello,
> 
> I have a question about the metadata returned by an OCF resource agent
> using the "meta-data" command and the behaviour of the cluster.
> 
> When checking the resource agent's metadata using crm I get this:
> 
> # crm
> crm(live)# ra
> crm(live)ra#  meta cluster_oracle ocf
> bla (ocf:heartbeat:cluster_oracle)
> 
> Master/Slave OCF Resource Agent for Oracle (clustered)
> 
> Parameters (* denotes required, [] the default):
> 
> oracle_role* (string): Ora role
> Required to assign the Oracle role. Must be "master" or "slave"
> 
> Operations' defaults (advisory minimum):
> 
> start    timeout=240
> promote  timeout=90
> demote   timeout=90
> notify   timeout=90
> stop     timeout=100
> monitor  timeout=20 interval=20 depth=0
> monitor  timeout=20 interval=10 depth=0
> 
> So it seems that for the "stop" action there is a timeout of 100 seconds
> defined. But at cluster shutdown I can see this in the ha-debug log:

It says above that it's "advisory minimum" (the wording should
probably be changed). You have to set the timeouts yourself.
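
A minimal sketch of setting them explicitly in the crm shell, assuming the resource is named oracle_primary as in the log below and reusing the advisory values from the metadata listing. The oracle_role value is only an example.

```
# Timeouts copied from the agent's advisory values; adjust as needed
primitive oracle_primary ocf:heartbeat:cluster_oracle \
    params oracle_role="master" \
    op start interval="0" timeout="240s" \
    op stop interval="0" timeout="100s" \
    op monitor interval="20s" timeout="20s"
```

Operations without an explicit timeout fall back to the cluster-wide default action timeout, which is 20s unless it has been changed; that would match the 20 seconds seen in the log.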

Thanks,

Dejan

> Jan 18 14:31:35 node1 crmd: [12844]: info: te_rsc_command:
> Initiating action 5: stop oracle_primary_stop_0 on node1 (local)
> Jan 18 14:31:35 node11 pengine: [12848]: notice: LogActions: Leave
> resource oracle_secondary  (Stopped)
> Jan 18 14:31:35 node1 lrmd: [12841]: info: rsc:oracle_primary:7: stop
> Jan 18 14:31:35 node1 crmd: [12844]: info: do_lrm_rsc_op: Performing
> key=5:10:0:40ea1f42-c929-40d6-a0ed-569a7c8944bc
> op=oracle_primary_stop_0 )
> Jan 18 14:31:35 node1 lrmd: [12841]: info: RA output:
> (oracle_primary:stop:stderr)
> /usr/lib/ocf/resource.d//heartbeat/cluster_oracle[247]:
> Jan 18 14:31:35 node1 pengine: [12848]: WARN: process_pe_message:
> Transition 10: WARNINGs found during PE processing. PEngine Input
> stored in: /var/lib/pengine/pe-warn-2220.bz2
> Jan 18 14:31:35 node1 pengine: [12848]: info: process_pe_message:
> Configuration WARNINGs found during PE processing.  Please run
> "crm_verify -L" to identify issues.
> Jan 18 14:31:55 node1 lrmd: [12841]: WARN: oracle_primary:stop
> process (PID 14386) timed out (try 1).  Killing with signal SIGTERM
> (15).
> Jan 18 14:31:55 node1 lrmd: [12841]: info: RA output:
> (oracle_primary:stop:stderr)
> Session terminated, killing shell...
> Jan 18 14:31:57 node1 lrmd: [12841]: info: RA output:
> (oracle_primary:stop:stderr)  ...killed.
> 
> Apparently a timeout occurred at the stop action after 20 seconds.
> But why, if the resource defined 100 secs?
> 
> With kind regards
> Markus



Re: [Pacemaker] APC Master Stonith

2010-01-20 Thread Errol Neal
> It would be good to hear why it didn't work and what did you do
> to make it work. Incidentally, there is a bugzilla with a patch
> which should help handling certain device releases, but for lack
> of devices to try it out, it never got applied. See
> http://developerbugs.linux-foundation.org/show_bug.cgi?id=1891
> 
> Thanks,
> 

I've long since forgotten why it didn't work. I just realized after looking at
the code that my device simply wasn't supported
by it, and I modified it to work for me.

I have 7930s running 3.3 codebase

--- cluster-glue/lib/plugins/stonith/apcmaster.c   2010-01-11 04:20:07.0 -0500
+++ /root/apcmaster.c   2010-01-20 07:55:19.0 -0500
@@ -47,6 +47,10 @@
  *private subnet.
  */
 
+/*
+ * Version string that is filled in by CVS
+ */
+static const char *version __attribute__ ((unused)) = "$Revision: 1.27 $"; 
 #include 
 
 #defineDEVICE  "APC MasterSwitch"
@@ -143,8 +147,11 @@ static const char * NOTpluginID = "APCMS
 
 #define APCMSSTR   "American Power Conversion"
 
-static struct Etoken EscapeChar[] ={ {"Escape character is '^]'.", 0, 0}
-   ,   {NULL,0,0}};
+static struct Etoken EscapeChar[] = { {"Escape character is '^]'.", 0, 0},
+  {"User Name :", 1, 0},
+  {NULL,0,0}
+  };
+
 static struct Etoken login[] = { {"User Name :", 0, 0}, 
{NULL,0,0}};
 static struct Etoken password[] =  { {"Password  :", 0, 0} ,{NULL,0,0}};
 static struct Etoken Prompt[] ={ {"> ", 0, 0} ,{NULL,0,0}};
@@ -153,9 +160,6 @@ static struct Etoken LoginOK[] ={ {APCM
 static struct Etoken Separator[] = { {"-", 0, 0} ,{NULL,0,0}};
 
 /* We may get a notice about rebooting, or a request for confirmation */
-static struct Etoken Processing[] ={ {"Press  to continue", 0, 0}
-   ,   {"Enter 'YES' to continue", 1, 0}
-   ,   {NULL,0,0}};
 
 #include "stonith_config_xml.h"
 
@@ -173,23 +177,32 @@ static intMSNametoOutlet(struct pluginD
 static int MSReset(struct pluginDevice*, int outletNum, const char * host);
 static int MSLogout(struct pluginDevice * ms);
 
-#if defined(ST_POWERON) && defined(ST_POWEROFF)
-static int apcmaster_onoff(struct pluginDevice*, int outletnum, const char 
* unitid
-,  int request);
-#endif
 
 /* Login to the APC Master Switch */
 
 static int
 MSLogin(struct pluginDevice * ms)
 {
-EXPECT(ms->rdfd, EscapeChar, 10);
+   int rc;
 
-   /* 
-* We should be looking at something like this:
- * User Name :
+   /* Patch from Dave Blaschke
+* Apparently some telnet apps display the escape character while
+* others don't, so we need to handle both possibilities...
+*
+* rc == 0 : "Escape character is '^]'." found
+* rc == 1 : "User Name :" found
+* rc <  0 : Neither found or timeout
 */
-   EXPECT(ms->rdfd, login, 10);
+   if ((rc = StonithLookFor(ms->rdfd, EscapeChar, 10)) < 0) {
+   return(errno == ETIMEDOUT ? S_TIMEOUT : S_OOPS);
+   } else if (rc == 0) {
+   /*
+* We should be looking at something like this:
+*  User Name :
+*/
+   EXPECT(ms->rdfd, login, 10);
+   }
+
SEND(ms->wrfd, ms->user);   
SEND(ms->wrfd, "\r");
 
@@ -275,7 +288,6 @@ int MSLogout(struct pluginDevice* ms)
 static int
 MSReset(struct pluginDevice* ms, int outletNum, const char *host)
 {
-   charunum[32];
 
/* Make sure we're in the top level menu */
 SEND(ms->wrfd, "\033");
@@ -293,13 +305,12 @@ MSReset(struct pluginDevice* ms, int out
 
/* Request menu 1 (Device Control) */
SEND(ms->wrfd, "1\r");
-
-   /* Select requested outlet */
EXPECT(ms->rdfd, Prompt, 5);
-   snprintf(unum, sizeof(unum), "%i\r", outletNum);
-   SEND(ms->wrfd, unum);
+   SEND(ms->wrfd, "2\r");
+   EXPECT(ms->rdfd, Prompt, 5);
+   SEND(ms->wrfd, "1\r");
 
-   /* Select menu 1 (Control Outlet) */
+   /* Select requested outlet */
EXPECT(ms->rdfd, Prompt, 5);
SEND(ms->wrfd, "1\r");
 
@@ -307,21 +318,7 @@ MSReset(struct pluginDevice* ms, int out
EXPECT(ms->rdfd, Prompt, 5);
SEND(ms->wrfd, "3\r");
 
-   /* Expect "Press  " or "Enter 'YES'" (if confirmation turned on) 
*/
-   retry:
-   switch (StonithLookFor(ms->rdfd, Processing, 5)) {
-   case 0: /* Got "Press " Do so */
-   SEND(ms->wrfd, "\r");
-   break;
-
-   case 1: /* Got that annoying command confirmation :-( */
-   SEND(ms->wrfd, "YES\r");
-   goto retry;
-
-   default: 
-   return(errno == ETIM

Re: [Pacemaker] APC Master Stonith

2010-01-20 Thread Sander van Vugt
On Wed, 2010-01-20 at 11:47 -0500, Errol Neal wrote:
> > It would be good to hear why it didn't work and what did you do
> > to make it work. Incidentally, there is a bugzilla with a patch
> > which should help handling certain device releases, but for lack
> > of devices to try it out, it never got applied. See
> > http://developerbugs.linux-foundation.org/show_bug.cgi?id=1891
> > 
> > Thanks,
> > 
> 
> I've long since forgotten why it didn't work. I just realized after looking
> at the code that my device simply wasn't supported
> by it, and I modified it to work for me.
> 
> I have 7930s running 3.3 codebase
> 
I've got a 7920 and I would like to see Pacemaker supporting it. If I
can help by testing this patch, or the one on bugzilla, I'd be more than
happy to, but for the moment I'm too ignorant to apply it to my code and,
not being a C programmer, the short explanation below doesn't clarify it
for me. I would welcome some directions on how to proceed with this (and
how to become a good patch tester).

Thanks,
Sander



> --- cluster-glue/lib/plugins/stonith/apcmaster.c   2010-01-11 04:20:07.0 -0500
> +++ /root/apcmaster.c 2010-01-20 07:55:19.0 -0500
> @@ -47,6 +47,10 @@
>   *private subnet.
>   */
>  
> +/*
> + * Version string that is filled in by CVS
> + */
> +static const char *version __attribute__ ((unused)) = "$Revision: 1.27 $"; 
>  #include 
>  
>  #define  DEVICE  "APC MasterSwitch"
> @@ -143,8 +147,11 @@ static const char * NOTpluginID = "APCMS
>  
>  #define APCMSSTR "American Power Conversion"
>  
> -static struct Etoken EscapeChar[] =  { {"Escape character is '^]'.", 0, 0}
> - ,   {NULL,0,0}};
> +static struct Etoken EscapeChar[] = { {"Escape character is '^]'.", 0, 0},
> +  {"User Name :", 1, 0},
> +  {NULL,0,0}
> +  };
> +
>  static struct Etoken login[] =   { {"User Name :", 0, 0}, 
> {NULL,0,0}};
>  static struct Etoken password[] ={ {"Password  :", 0, 0} ,{NULL,0,0}};
>  static struct Etoken Prompt[] =  { {"> ", 0, 0} ,{NULL,0,0}};
> @@ -153,9 +160,6 @@ static struct Etoken LoginOK[] =  { {APCM
>  static struct Etoken Separator[] =   { {"-", 0, 0} ,{NULL,0,0}};
>  
>  /* We may get a notice about rebooting, or a request for confirmation */
> -static struct Etoken Processing[] =  { {"Press  to continue", 0, 0}
> - ,   {"Enter 'YES' to continue", 1, 0}
> - ,   {NULL,0,0}};
>  
>  #include "stonith_config_xml.h"
>  
> @@ -173,23 +177,32 @@ static int  MSNametoOutlet(struct pluginD
>  static int   MSReset(struct pluginDevice*, int outletNum, const char * host);
>  static int   MSLogout(struct pluginDevice * ms);
>  
> -#if defined(ST_POWERON) && defined(ST_POWEROFF)
> -static int   apcmaster_onoff(struct pluginDevice*, int outletnum, const char * unitid
> -,int request);
> -#endif
>  
>  /* Login to the APC Master Switch */
>  
>  static int
>  MSLogin(struct pluginDevice * ms)
>  {
> -EXPECT(ms->rdfd, EscapeChar, 10);
> + int rc;
>  
> - /* 
> -  * We should be looking at something like this:
> - *   User Name :
> + /* Patch from Dave Blaschke
> +  * Apparently some telnet apps display the escape character while
> +  * others don't, so we need to handle both possibilities...
> +  *
> +  * rc == 0 : "Escape character is '^]'." found
> +  * rc == 1 : "User Name :" found
> +  * rc <  0 : Neither found or timeout
>*/
> - EXPECT(ms->rdfd, login, 10);
> + if ((rc = StonithLookFor(ms->rdfd, EscapeChar, 10)) < 0) {
> + return(errno == ETIMEDOUT ? S_TIMEOUT : S_OOPS);
> + } else if (rc == 0) {
> + /*
> +  * We should be looking at something like this:
> +  *  User Name :
> +  */
> + EXPECT(ms->rdfd, login, 10);
> + }
> +
>   SEND(ms->wrfd, ms->user);   
>   SEND(ms->wrfd, "\r");
>  
> @@ -275,7 +288,6 @@ int MSLogout(struct pluginDevice* ms)
>  static int
>  MSReset(struct pluginDevice* ms, int outletNum, const char *host)
>  {
> - char unum[32];
>  
>   /* Make sure we're in the top level menu */
>  SEND(ms->wrfd, "\033");
> @@ -293,13 +305,12 @@ MSReset(struct pluginDevice* ms, int out
>  
>   /* Request menu 1 (Device Control) */
>   SEND(ms->wrfd, "1\r");
> -
> - /* Select requested outlet */
>   EXPECT(ms->rdfd, Prompt, 5);
> - snprintf(unum, sizeof(unum), "%i\r", outletNum);
> - SEND(ms->wrfd, unum);
> + SEND(ms->wrfd, "2\r");
> + EXPECT(ms->rdfd, Prompt, 5);
> + SEND(ms->wrfd, "1\r");
>  
> - /* Select menu 1 (Control Outlet) */
> + /* Select requested outlet */
>   EXPECT(ms->rdfd, Prompt, 5);
>   SEND(ms->wrfd, "1\r");
>  
> @@ -307,21 +318,7 @@ MSReset(struct pluginD

Re: [Pacemaker] APC Master Stonith

2010-01-20 Thread Sander van Vugt
Hi,

On Wed, 2010-01-20 at 07:56 +0100, Dominik Klein wrote:
> Errol Neal wrote:
> > On Tue, Jan 19, 2010 04:19  PM, Sander van Vugt  
> > wrote:
> >> Hi,
> >>
> >> I hope someone has configured the APC Master Stonith resource (which you
> >> would use to have Pacemaker talk to a device like the APC switched rack PDU),
> >> as I have a - probably extremely stupid - conceptual question about it. 
> >>
> >> When I look at the options the resource has, it allows me to enter
> >> username, password and IP address. What I would also expect, is to give
> >> it something like a name of the node that it should do STONITH on, as
> >> well as the port on the device that it should power cycle. Am I missing
> >> something? Or do I have to specify this information as additional
> >> attributes? And if so, what exactly would be the syntax?
> >>
> > What type of device are you trying to get the plugin to work with?
> > I'm using APC rack PDUs and this plugin did not work by default for me. I 
> > had to hack it to get it to work, but
> > it works exactly how I wanted it to. By the way, I'm not using SNMP; 
> > I'm using telnet.
> > 
> > So here is how mine is configured:
> > 
> > primitive stonith-apcmaster-axigen2 stonith:apcmaster \
> > params ipaddr="x.x.x.x" login="axigen2" password="x.x.x.x" \
> > op monitor interval="120s" timeout="20s" \
> > op startup interval="0" timeout="60s"
> > 
> > Then I have a constraint that prohibits a node from committing suicide.

I'm interested in knowing how you configured this constraint. As I
understand it now, there should be one primitive only in the cluster to
address the PDU, which would be used to perform STONITH on all nodes
that need one. You probably used a location constraint, but how exactly?
> > 
> > I'll describe what I did to get it going in my environment. 
> > 
> > I created a user account for each node on its respective PDU and only 
> > allowed it to control its own power.
> > 
> > As I mentioned, I hacked the plugin's source code and recompiled. My 
> > changes were #1, to make it work, and #2, to make it plain and simple: log 
> > in and shut 'er down. I can provide you my changes if you think they will 
> > work for you.
> 
> I'd also be interested in the changes.
> 
> The apcmastersnmp plugin does work for me with APC7920 though.
> 
> Thanks,
> Dominik
> 
Dominik, I'd be interested to know how you configured the apcmastersnmp
plugin. Would you mind sending me your configuration?

Thanks.
Sander



Re: [Pacemaker] APC Master Stonith

2010-01-20 Thread Errol Neal
On Wed, Jan 20, 2010 01:45  PM, Sander van Vugt  wrote:
> I've got a 7920 and I would like to see Pacemaker supporting it. If I
> can help by testing this patch, or the one on bugzilla, I'd be more than
> happy to, but for the moment I'm too ignorant to apply it to my code, and
> since I'm not a C programmer, the short explanation below doesn't clarify
> it for me. I would welcome some directions on how to proceed with this
> (and how to become a good patch tester).
> 
> Thanks,
> Sander
> 

Well, as I said, I wanted it to work in a specific way. It was very easy for me 
to create accounts on the PDU and assign each account permission to manage an 
outlet. That being said, I removed the code in apcmaster.c that looks up 
device names and so on. My hack assumes that if we've logged in, we can reset 
the outlet. 
Each node gets an account on its respective PDU. I have a constraint that 
does not permit a node to run its own stonith resource.

What distro and arch are you running? 

Earlier in this thread, someone wanted to see the constraint:

location never-stonith-apcmaster-axigen1 stonith-apcmaster-axigen1 -inf: axigen1
location never-stonith-apcmaster-axigen2 stonith-apcmaster-axigen2 -inf: axigen2
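
With more than a couple of nodes, the two constraints above follow a pattern that is easy to generate. Here is a minimal sketch in shell. It assumes every node has a stonith primitive named stonith-apcmaster-<node>, as in the configuration quoted earlier; the function name never_on_self is made up for illustration.

```shell
#!/bin/sh
# Print one "-inf" location constraint per node so that a node never runs
# the stonith resource that is meant to fence it.
# Assumption: resources are named stonith-apcmaster-<node>.
never_on_self() {
    for node in "$@"; do
        printf 'location never-stonith-apcmaster-%s stonith-apcmaster-%s -inf: %s\n' \
            "$node" "$node" "$node"
    done
}

never_on_self axigen1 axigen2
```

The script only prints the constraints, so the output can be reviewed before it is entered in crm configure.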





Re: [Pacemaker] metadata (timeout) ignored?

2010-01-20 Thread Markus M.

Dejan Muhamedagic wrote:

>> Operations' defaults (advisory minimum):
>>
>> stop timeout=100
>>
>> So it seems for the "stop" action there is a timeout of 100 seconds
>> defined. But at cluster shutdown I can see this in the ha-debug log:
>
> It says above that it's "advisory minimum" (the wording should
> probably be changed). You have to set the timeouts yourself.

Sorry, maybe I've misunderstood something... I thought _I had set the 
timeout_ by having the OCF resource agent's meta-data function return 
the value of 100 seconds for the stop action. Is there another place to 
set the timeout for the stop action of this RA?

The timeout is occurring after 20 seconds:

>> Jan 18 14:31:35 node1 crmd: [12844]: info: te_rsc_command:
>> Initiating action 5: stop oracle_primary_stop_0 on node1 (local)
...
>> Jan 18 14:31:55 node1 lrmd: [12841]: WARN: oracle_primary:stop
>> process (PID 14386) timed out (try 1).  Killing with signal SIGTERM
>> (15).


Regards
Markus



Re: [Pacemaker] Question on resource groups

2010-01-20 Thread Ken Dechick
Thanks very much Dejan! That's exactly the information I needed!

Kenneth M DeChick
Linux Systems Administrator
Community Computer Service, Inc.
(315)-255-1751 ext154
http://www.medent.com
k...@medent.com
Registered Linux User #497318
-- -- -- -- -- -- -- -- -- -- --
"You canna change the laws of physics, Captain; I've got to have thirty minutes!"

-- Original Message ---
From: Dejan Muhamedagic  
To: pacema...@clusterlabs.org, k...@medent.com 
Sent: Tue, 19 Jan 2010 18:18:47 +0100 
Subject: Re: [Pacemaker] Question on resource groups

> Hi, 
> 
> On Tue, Jan 19, 2010 at 11:38:06AM -0500, Ken Dechick wrote: 
> > Hello all, 
> > 
> > Quick question here today. Please forgive me if this has been 
> > answered, I have searched for a couple days and not been able 
> > to come up with the answer. I am working on a standard 2 node 
> > cluster using DRBD and I have my resources in a group. All is 
> > working well, but my question has to do with what happens when 
> > there is a problem with an individual service. Consider the 
> > following example using heartbeat (3.0.1-1) drbd (8.3.6) and 
> > pacemaker (1.0.6): 
> > 
> > Cluster with one resource group which contains these resources in this 
> > order: 
> >    
> >    -drbd master/slave 
> >    -virtual file system 
> >    -openvpn 
> >    -samba 
> >    -apache webserver 
> >    -cupsd 
> > 
> > The problem I am running into is this: if there is a problem with openvpn 
> > in this example (VPN goes down and keys are missing so it 
> > CANNOT restart without intervention), watching the cluster with 
> > crm_mon, I see that all the services after openvpn in the group order 
> > (samba, apache, cupsd) start a "rolling restart". In 
> > other words, I see openvpn fail, then samba goes down, then 
> > apache goes down, then cups goes down. Next cups comes up, 
> > apache comes up, samba comes up, then openvpn tries to start 
> > but fails, so the process starts over: samba, apache and cups 
> > stop then start again. What I end up with is a system where 
> > those last 3 services, which run fine alone, keep coming up then 
> > going down again, over and over. The only way I can change this is 
> > to fix the openvpn issue; then things restart and stay 
> > restarted. 
> > 
> > My question is: is this normal (expected) behavior? 
> 
> Yes. 
> 
> > If so how 
> > do I change this? 
> 
> Reconfigure. Your group doesn't properly represent the relations 
> between resources. I guess that all four services depend on 
> drbd and the filesystem, but not on each other. You can then create 
> a non-ordered group with those four resources and collocate/order 
> that group with the drbd/fs group. 
> 
> Thanks, 
> 
> Dejan 
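
For what it's worth, the layout suggested above could look roughly like the output of the sketch below. The IDs g_storage, g_services and the two constraint names are invented for illustration, and the sketch assumes the group meta attribute ordered="false" is honoured by the Pacemaker version in use. Members of a group are normally still colocated with each other, so whether the remaining services keep running when one of them cannot start at all should be tested before relying on this.

```shell
#!/bin/sh
# Emit crm configuration for a group whose members are not ordered among
# themselves but are placed with, and started after, a storage group.
# $1 = id of the storage group (hypothetical); remaining args = services.
emit_unordered_group() {
    storage=$1
    shift
    printf 'group g_services %s meta ordered="false"\n' "$*"
    printf 'colocation services-with-storage inf: g_services %s\n' "$storage"
    printf 'order storage-before-services inf: %s g_services\n' "$storage"
}

emit_unordered_group g_storage openvpn samba apache cupsd
```

The script only prints text; nothing is changed in the cluster until the lines are entered in crm configure.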
> 
> > I have tried several on-fail options in the 
> > monitors for those services (tried: stop, restart, and block) 
> > but this doesn't change the behavior. I would like to just have 
> > the one service stop without affecting the others. Do I need to 
> > re-think using a resource group? Any assistance would be 
> > greatly appreciated. The pacemaker site has a lot of 
> > documentation, but the explanations are not always the clearest. 
> > 
> > -Thanks 
> > 
--- End of Original Message ---
 

Re: [Pacemaker] metadata (timeout) ignored?

2010-01-20 Thread Dejan Muhamedagic
Hi,

On Wed, Jan 20, 2010 at 09:45:46PM +0100, Markus M. wrote:
> Dejan Muhamedagic wrote:
> 
> >>Operations' defaults (advisory minimum):
> >>
> >>stop timeout=100
> >>
> >>So it seems for the "stop" action there is a timeout of 100 seconds
> >>defined. But at cluster shutdown i can see this in the ha-debug log:
> >
> >It says above that it's "advisory minimum" (the wording should
> >probably be changed). You have to set the timeouts yourself.
> 
> Sorry, maybe I've misunderstood something... I thought _I had set the
> timeout_ by having the OCF resource agent's meta-data function
> return the value of 100 seconds for the stop action. Is there
> another place to set the timeout for the stop action of this RA?

Yes, in the cluster configuration. Like this:

primitive rsc_c001n07 ocf:heartbeat:IPaddr \
params ip="127.0.0.16" cidr_netmask="32" \
op stop timeout="100s"
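
Since the values in the meta-data are only advisory, they have to be copied into the configuration by hand or by a script. Here is a rough sketch in shell. It assumes the agent's meta-data XML arrives on stdin and that each <action> element sits on a single line; advisory_ops is a made-up helper name, not part of any cluster tool, and the sample XML is illustrative only.

```shell
#!/bin/sh
# Read OCF meta-data XML on stdin and print a crm "op" clause for every
# <action> element that carries a timeout attribute.
# Assumption: each <action .../> tag is on one line.
advisory_ops() {
    grep -o '<action [^>]*>' | while IFS= read -r tag; do
        name=$(printf '%s\n' "$tag" | sed -n 's/.* name="\([^"]*\)".*/\1/p')
        timeout=$(printf '%s\n' "$tag" | sed -n 's/.* timeout="\([^"]*\)".*/\1/p')
        if [ -n "$name" ] && [ -n "$timeout" ]; then
            printf 'op %s timeout="%s"\n' "$name" "$timeout"
        fi
    done
}

# Sample input for illustration only, not taken from a real agent.
advisory_ops <<'EOF'
<actions>
<action name="start" timeout="120" />
<action name="stop" timeout="100" />
<action name="meta-data" />
</actions>
EOF
```

The printed clauses use the values exactly as the agent advertises them; they still need to be reviewed and added to the primitive definition, as in the example above.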

Thanks,

Dejan

> The timeout is occurring after 20 seconds:
> 
> >>Jan 18 14:31:35 node1 crmd: [12844]: info: te_rsc_command:
> >>Initiating action 5: stop oracle_primary_stop_0 on node1 (local)
> ...
> >>Jan 18 14:31:55 node1 lrmd: [12841]: WARN: oracle_primary:stop
> >>process (PID 14386) timed out (try 1).  Killing with signal SIGTERM
> >>(15).
> 
> Regards
> Markus
> 


[Pacemaker] Heartbeat GUI Client Compiling

2010-01-20 Thread Ruiyuan Jiang
Hi, all

I downloaded the v1.4 GUI source code, 4d3288030a84. 
I compiled it with the option:

# ./bootstrap --with-heartbeat-support
# make

The bootstrap step passed without problems, but make failed:

...
gcc -DHAVE_CONFIG_H -I. -I. -I../../include -I../../include -I../../include 
-I../../include -I../../libltdl -I../../libltdl -I../../linux-ha 
-I../../linux-ha -I../.. -I../.. -I/usr/include/glib-2.0 
-I/usr/lib64/glib-2.0/include -I/usr/include/libxml2 -I../../include 
-I../../include -I../../libltdl -I../../libltdl -I../../linux-ha 
-I../../linux-ha -I../.. -I../.. -g -O2 -I/usr/local/include/heartbeat 
-I/usr/local/include/pacemaker -fgnu89-inline -Wall -Wmissing-prototypes 
-Wmissing-declarations -Wstrict-prototypes -Wdeclaration-after-statement 
-Wpointer-arith -Wwrite-strings -Wcast-qual -Wcast-align -Wbad-function-cast 
-Winline -Wmissing-format-attribute -Wformat=2 -Wformat-security 
-Wformat-nonliteral -Wno-long-long -Wno-strict-aliasing -Werror -ansi 
-D_GNU_SOURCE -DANSI_ONLY -ggdb3 -funsigned-char -MT libhbmgmt_la-mgmt_lib.lo 
-MD -MP -MF .deps/libhbmgmt_la-mgmt_lib.Tpo -c mgmt_lib.c  -fPIC -DPIC -o 
.libs/libhbmgmt_la-mgmt_lib.o
mgmt_lib.c:40:34: error: clplumbing/cl_malloc.h: No such file or directory
In file included from /usr/local/include/pacemaker/crm/common/util.h:32,
 from mgmt_lib.c:51:
/usr/local/include/heartbeat/heartbeat.h:144:1: error: "HADEBUGVAL" redefined
In file included from mgmt_lib.c:49:
mgmt_internal.h:30:1: error: this is the location of the previous definition
In file included from mgmt_lib.c:51:
/usr/local/include/pacemaker/crm/common/util.h:156: error: expected ')' before 
'*' token
/usr/local/include/pacemaker/crm/common/util.h:157: error: expected ')' before 
'*' token
/usr/local/include/pacemaker/crm/common/util.h:254: error: expected '=', ',', 
';', 'asm' or '__attribute__' before '*' token
/usr/local/include/pacemaker/crm/common/util.h:255: error: expected declaration 
specifiers or '...' before 'xmlNode'
mgmt_lib.c: In function 'init_mgmt_lib':
mgmt_lib.c:75: error: 'cl_free' undeclared (first use in this function)
mgmt_lib.c:75: error: (Each undeclared identifier is reported only once
mgmt_lib.c:75: error: for each function it appears in.)
mgmt_lib.c:79: error: 'cl_malloc' undeclared (first use in this function)
mgmt_lib.c:79: error: 'cl_realloc' undeclared (first use in this function)
cc1: warnings being treated as errors
mgmt_lib.c: In function 'reg_msg':
mgmt_lib.c:131: warning: implicit declaration of function 'cl_strdup'
mgmt_lib.c:131: warning: passing argument 2 of 'g_hash_table_insert' makes 
pointer from integer without a cast
mgmt_lib.c: In function 'reg_event':
mgmt_lib.c:175: warning: passing argument 2 of 'g_hash_table_replace' makes 
pointer from integer without a cast
gmake[2]: *** [libhbmgmt_la-mgmt_lib.lo] Error 1
gmake[2]: Leaving directory 
`/home/rc6/Pacemaker-Python-GUI-4d3288030a84/mgmt/daemon'
gmake[1]: *** [all-recursive] Error 1
gmake[1]: Leaving directory `/home/rc6/Pacemaker-Python-GUI-4d3288030a84/mgmt'
make: *** [all-recursive] Error 1

Am I missing something here? I also tried the option 
--enable-fatal-warnings=no, but it did not help. Thanks in advance.

Ryan





Re: [Pacemaker] Heartbeat GUI Client Compiling

2010-01-20 Thread Yan Gao
Hi,

Ruiyuan Jiang wrote:
> Hi, all
> 
> I downloaded the v1.4 GUI source code, 4d3288030a84. 
> I compiled it with the option:
> 
> # ./bootstrap --with-heartbeat-support
> # make
> 
> The bootstrap step passed without problems, but make failed:
> 
> ...
> gcc -DHAVE_CONFIG_H -I. -I. -I../../include -I../../include -I../../include 
> -I../../include -I../../libltdl -I../../libltdl -I../../linux-ha 
> -I../../linux-ha -I../.. -I../.. -I/usr/include/glib-2.0 
> -I/usr/lib64/glib-2.0/include -I/usr/include/libxml2 -I../../include 
> -I../../include -I../../libltdl -I../../libltdl -I../../linux-ha 
> -I../../linux-ha -I../.. -I../.. -g -O2 -I/usr/local/include/heartbeat 
> -I/usr/local/include/pacemaker -fgnu89-inline -Wall -Wmissing-prototypes 
> -Wmissing-declarations -Wstrict-prototypes -Wdeclaration-after-statement 
> -Wpointer-arith -Wwrite-strings -Wcast-qual -Wcast-align -Wbad-function-cast 
> -Winline -Wmissing-format-attribute -Wformat=2 -Wformat-security 
> -Wformat-nonliteral -Wno-long-long -Wno-strict-aliasing -Werror -ansi 
> -D_GNU_SOURCE -DANSI_ONLY -ggdb3 -funsigned-char -MT libhbmgmt_la-mgmt_lib.lo 
> -MD -MP -MF .deps/libhbmgmt_la-mgmt_lib.Tpo -c mgmt_lib.c  -fPIC -DPIC -o 
> .libs/libhbmgmt_la-mgmt_lib.o
> mgmt_lib.c:40:34: error: clplumbing/cl_malloc.h: No such file or directory
> In file included from /usr/local/include/pacemaker/crm/common/util.h:32,
>  from mgmt_lib.c:51:
> /usr/local/include/heartbeat/heartbeat.h:144:1: error: "HADEBUGVAL" redefined
> In file included from mgmt_lib.c:49:
> mgmt_internal.h:30:1: error: this is the location of the previous definition
> In file included from mgmt_lib.c:51:
> /usr/local/include/pacemaker/crm/common/util.h:156: error: expected ')' 
> before '*' token
> /usr/local/include/pacemaker/crm/common/util.h:157: error: expected ')' 
> before '*' token
> /usr/local/include/pacemaker/crm/common/util.h:254: error: expected '=', ',', 
> ';', 'asm' or '__attribute__' before '*' token
> /usr/local/include/pacemaker/crm/common/util.h:255: error: expected 
> declaration specifiers or '...' before 'xmlNode'
> mgmt_lib.c: In function 'init_mgmt_lib':
> mgmt_lib.c:75: error: 'cl_free' undeclared (first use in this function)
> mgmt_lib.c:75: error: (Each undeclared identifier is reported only once
> mgmt_lib.c:75: error: for each function it appears in.)
> mgmt_lib.c:79: error: 'cl_malloc' undeclared (first use in this function)
> mgmt_lib.c:79: error: 'cl_realloc' undeclared (first use in this function)
> cc1: warnings being treated as errors
> mgmt_lib.c: In function 'reg_msg':
> mgmt_lib.c:131: warning: implicit declaration of function 'cl_strdup'
> mgmt_lib.c:131: warning: passing argument 2 of 'g_hash_table_insert' makes 
> pointer from integer without a cast
> mgmt_lib.c: In function 'reg_event':
> mgmt_lib.c:175: warning: passing argument 2 of 'g_hash_table_replace' makes 
> pointer from integer without a cast
> gmake[2]: *** [libhbmgmt_la-mgmt_lib.lo] Error 1
> gmake[2]: Leaving directory 
> `/home/rc6/Pacemaker-Python-GUI-4d3288030a84/mgmt/daemon'
> gmake[1]: *** [all-recursive] Error 1
> gmake[1]: Leaving directory `/home/rc6/Pacemaker-Python-GUI-4d3288030a84/mgmt'
> make: *** [all-recursive] Error 1
> 
> Am I missing something here? I also tried the option 
> --enable-fatal-warnings=no, but it did not help. Thanks in advance.

Please use the tip of the repo. It's stable enough.

-- 
Yan Gao 
Software Engineer
China Server Team, OPS Engineering, Novell, Inc.
