Re: [Pacemaker] crm resource restart fails to restart the service

2010-11-17 Thread Vadym Chepkov

On Nov 17, 2010, at 7:26 AM, Dan Frincu wrote:

> Hi,
> 
> r...@cluster1:/# pgrep mysql
> 961
> 1127
> r...@cluster1:/# crm resource restart mysqld
> r...@cluster1:/# pgrep -fl mysql
> 961
> 1127
> 
> The restart command doesn't actually restart the process, I have tried this 
> with another custom built OCF compliant RA and have the same issue.
> 
> # rpm -qa '(pacemaker|corosync|resource-agents)'
> pacemaker-1.0.9.1-1.el5
> resource-agents-1.0.3-2.el5
> corosync-1.2.7-1.1.el5
> 
> # crm configure show mysqld
> primitive mysqld ocf:heartbeat:mysql \
>   params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" 
> enable_creation="0" datadir="/mysql/database" user="root" test_user="monitor" 
> test_passwd="monitor" test_table="cluster.monitor" \
>   op monitor interval="10s" timeout="5s" \
>   op start interval="0s" \
>   op stop interval="0s" \
>   meta target-role="Started"
> 
> Ideas?


The RA doesn't support the restart action? Most likely you get OCF_ERR_UNIMPLEMENTED
in the log.

Vadym


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


Re: [Pacemaker] crm resource restart fails to restart the service

2010-11-17 Thread Pavlos Parissis
On 17 November 2010 13:26, Dan Frincu  wrote:
> Hi,
>
> r...@cluster1:/# pgrep mysql
> 961
> 1127
> r...@cluster1:/# crm resource restart mysqld
> r...@cluster1:/# pgrep -fl mysql
> 961
> 1127
>
> The restart command doesn't actually restart the process, I have tried this
> with another custom built OCF compliant RA and have the same issue.
>
> # rpm -qa '(pacemaker|corosync|resource-agents)'
> pacemaker-1.0.9.1-1.el5
> resource-agents-1.0.3-2.el5
> corosync-1.2.7-1.1.el5
>
> # crm configure show mysqld
> primitive mysqld ocf:heartbeat:mysql \
>       params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf"
> enable_creation="0" datadir="/mysql/database" user="root"
> test_user="monitor" test_passwd="monitor" test_table="cluster.monitor" \
>       op monitor interval="10s" timeout="5s" \
>       op start interval="0s" \
>       op stop interval="0s" \
>       meta target-role="Started"
>
> Ideas?
>
> Regards,
> Dan
>
> --
> Dan FRINCU
> Systems Engineer
> CCNA, RHCE
> Streamwide Romania

I have experienced the same issue and created a bug report
http://developerbugs.linux-foundation.org/show_bug.cgi?id=2516.

In my case I have a group [1], and if I run crm resource restart pbx_01,
the last resource of the group (mailAlert-01) is restarted.

Cheers,
Pavlos


[1] group pbx_service_01 ip_01 fs_01 pbx_01 sshd_01 mailAlert-01



Re: [Pacemaker] crm resource restart fails to restart the service

2010-11-17 Thread Pavlos Parissis
On 17 November 2010 13:35, Vadym Chepkov  wrote:
>
> On Nov 17, 2010, at 7:26 AM, Dan Frincu wrote:
>
>> Hi,
>>
>> r...@cluster1:/# pgrep mysql
>> 961
>> 1127
>> r...@cluster1:/# crm resource restart mysqld
>> r...@cluster1:/# pgrep -fl mysql
>> 961
>> 1127
>>
>> The restart command doesn't actually restart the process, I have tried this 
>> with another custom built OCF compliant RA and have the same issue.
>>
>> # rpm -qa '(pacemaker|corosync|resource-agents)'
>> pacemaker-1.0.9.1-1.el5
>> resource-agents-1.0.3-2.el5
>> corosync-1.2.7-1.1.el5
>>
>> # crm configure show mysqld
>> primitive mysqld ocf:heartbeat:mysql \
>>       params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" 
>> enable_creation="0" datadir="/mysql/database" user="root" 
>> test_user="monitor" test_passwd="monitor" test_table="cluster.monitor" \
>>       op monitor interval="10s" timeout="5s" \
>>       op start interval="0s" \
>>       op stop interval="0s" \
>>       meta target-role="Started"
>>
>> Ideas?
>
>
> RA doesn't support restart action? Most luckily you get OCF_ERR_UNIMPLEMENTED 
> in the log
>
> Vadym
>
That is correct:
[r...@node-01 heartbeat]# pwd
/usr/lib/ocf/resource.d/heartbeat
[r...@node-01 heartbeat]# grep usage mysql
# An example usage in /etc/ha.d/haresources:
# See usage() function below for more details...
usage() {
usage: $0 (start|stop|validate-all|meta-data|monitor)
  usage|help)   usage
 *) usage


But in my case the RA supports restart.
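
The grep check above can be scripted. This is a minimal sketch, assuming the agent's usage line lists its actions between parentheses separated by "|", as in the mysql RA output shown; the function name advertised_actions is hypothetical. It only reports what the usage text advertises, nothing more.

```python
import re


def advertised_actions(usage_line):
    """Return the actions listed between parentheses in an RA usage line."""
    match = re.search(r"\(([^)]*)\)", usage_line)
    if match is None:
        return []
    return [a.strip() for a in match.group(1).split("|") if a.strip()]


# Usage line copied from the grep output of the mysql RA.
usage = "usage: $0 (start|stop|validate-all|meta-data|monitor)"
actions = advertised_actions(usage)
print("restart" in actions)
```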

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


Re: [Pacemaker] crm resource restart fails to restart the service

2010-11-17 Thread Dejan Muhamedagic
Hi,

On Wed, Nov 17, 2010 at 07:35:46AM -0500, Vadym Chepkov wrote:
> 
> On Nov 17, 2010, at 7:26 AM, Dan Frincu wrote:
> 
> > Hi,
> > 
> > r...@cluster1:/# pgrep mysql
> > 961
> > 1127
> > r...@cluster1:/# crm resource restart mysqld
> > r...@cluster1:/# pgrep -fl mysql
> > 961
> > 1127
> > 
> > The restart command doesn't actually restart the process, I have tried this 
> > with another custom built OCF compliant RA and have the same issue.
> > 
> > # rpm -qa '(pacemaker|corosync|resource-agents)'
> > pacemaker-1.0.9.1-1.el5
> > resource-agents-1.0.3-2.el5
> > corosync-1.2.7-1.1.el5
> > 
> > # crm configure show mysqld
> > primitive mysqld ocf:heartbeat:mysql \
> >   params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" 
> > enable_creation="0" datadir="/mysql/database" user="root" 
> > test_user="monitor" test_passwd="monitor" test_table="cluster.monitor" \
> >   op monitor interval="10s" timeout="5s" \
> >   op start interval="0s" \
> >   op stop interval="0s" \
> >   meta target-role="Started"
> > 
> > Ideas?
> 
> 
> RA doesn't support restart action? Most luckily you get OCF_ERR_UNIMPLEMENTED 
> in the log

It's actually a resource stop followed by start. It says so in
the help too. Perhaps the start precludes the stop action. The
logs should give a hint. We need a sleep in between.
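
To illustrate the suspected race, here is a toy model. It is a sketch under stated assumptions, not Pacemaker's actual scheduling: ToyCluster, reconcile and the restart helper are all hypothetical names. The assumption is that the cluster only acts on whatever target role is set at the moment it gets around to looking.

```python
class ToyCluster:
    """Hypothetical stand-in for the cluster; not Pacemaker's real logic."""

    def __init__(self):
        self.target_role = "Started"
        self.running = True
        self.starts = 0  # how many times the resource was started again

    def reconcile(self):
        """Act on whatever target role is set at the moment this runs."""
        if self.target_role == "Stopped" and self.running:
            self.running = False
        elif self.target_role == "Started" and not self.running:
            self.running = True
            self.starts += 1


def restart(cluster, pause):
    """Request a stop, optionally let the cluster act, then request a start."""
    cluster.target_role = "Stopped"
    if pause:
        cluster.reconcile()  # stands in for the sleep between stop and start
    cluster.target_role = "Started"
    cluster.reconcile()
    return cluster.starts


no_pause = restart(ToyCluster(), pause=False)
with_pause = restart(ToyCluster(), pause=True)
print(no_pause, with_pause)
```

In this model, the stop request is overwritten before anything acts on it unless there is a pause, so the resource is never actually stopped.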

Thanks,

Dejan

> Vadym


Re: [Pacemaker] crm resource restart fails to restart the service

2010-11-17 Thread Vadym Chepkov
On Wed, Nov 17, 2010 at 8:01 AM, Dejan Muhamedagic  wrote:

>> RA doesn't support restart action? Most luckily you get 
>> OCF_ERR_UNIMPLEMENTED in the log
>
> It's actually a resource stop followed by start. It says so in
> the help too. Perhaps the start precludes the stop action. The
> logs should give a hint. We need a sleep in between.
>

In that case the command is not working at all: I have tried it in
the past on many resources and it never worked, so I just assumed it
had to be implemented by the RA.

To test it just now, I issued the command
# crm resource restart xen_vbuild

where xen_vbuild is a Xen VM, with no result whatsoever.

Here is the log output:

Nov 17 13:04:13 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
Nov 17 13:05:14 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
Nov 17 13:06:14 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
Nov 17 13:07:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
+   
Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
+ 
Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
+   
Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
-   
Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
- 
Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
-   
Nov 17 13:07:44 xen-11 pengine: [22958]: debug: unpack_lrm_rsc_state:
xen_vbuild: Overwriting calculated next role Unknown with requested
next role Stopped
Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
+   
Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
+ 
Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
+   
Nov 17 13:07:45 xen-11 pengine: [22958]: debug: unpack_lrm_rsc_state:
xen_vbuild: Overwriting calculated next role Unknown with requested
next role Stopped
Nov 17 13:07:45 xen-11 pengine: [22958]: notice: native_print:
xen_vbuild  (ocf::heartbeat:Xen):   Started xen-11
Nov 17 13:07:45 xen-11 pengine: [22958]: debug: native_assign_node:
All nodes for resource xen_vbuild are unavailable, unclean or shutting
down (xen-11: 1, -100)
Nov 17 13:07:45 xen-11 pengine: [22958]: debug: native_assign_node:
Could not allocate a node for xen_vbuild
Nov 17 13:07:45 xen-11 pengine: [22958]: info: native_color: Resource
xen_vbuild cannot run anywhere
Nov 17 13:07:45 xen-11 pengine: [22958]: notice: LogActions: Stop
resource xen_vbuild (xen-11)
Nov 17 13:07:46 xen-11 pengine: [22958]: notice: native_print:
xen_vbuild  (ocf::heartbeat:Xen):   Started xen-11
Nov 17 13:07:46 xen-11 pengine: [22958]: debug: native_assign_node:
Assigning xen-11 to xen_vbuild
Nov 17 13:07:46 xen-11 pengine: [22958]: notice: LogActions: Leave
resource xen_vbuild (Started xen-11)
Nov 17 13:08:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
Nov 17 13:09:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
Nov 17 13:10:16 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
Nov 17 13:11:16 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
Nov 17 13:12:17 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
Nov 17 13:13:17 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
Nov 17 13:14:17 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
Nov 17 13:15:18 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
Nov 17 13:16:18 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
Nov 17 13:17:19 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
Nov 17 13:17:47 xen-11 pengine: [22958]: notice: native_print:
xen_vbuild  (ocf::heartbeat:Xen):   Started xen-11
Nov 17 13:17:47 xen-11 pengine: [22958]: debug: native_assign_node:
Assigning xen-11 to xen_vbuild
Nov 17 13:17:47 xen-11 pengine: [22958]: notice: LogActions: Leave
resource xen_vbuild (Started xen-11)
Nov 17 13:18:19 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
Nov 17 13:19:19 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
Nov 17 13:20:20 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor

But the VM never stopped:


[r...@xen-11 ~]# xm list|grep vbuild
vbuild 3  511 2 -b352.4


still ID 3 as it was before

Vadym



Re: [Pacemaker] crm resource restart fails to restart the service

2010-11-17 Thread Dejan Muhamedagic
Hi,

On Wed, Nov 17, 2010 at 08:30:36AM -0500, Vadym Chepkov wrote:
> On Wed, Nov 17, 2010 at 8:01 AM, Dejan Muhamedagic  
> wrote:
> 
> >> RA doesn't support restart action? Most luckily you get 
> >> OCF_ERR_UNIMPLEMENTED in the log
> >
> > It's actually a resource stop followed by start. It says so in
> > the help too. Perhaps the start precludes the stop action. The
> > logs should give a hint. We need a sleep in between.
> >
> 
> In this case this command is not working at all, because I tried in
> the past for many resources and it never worked, so I just assumed it
> has to be implemented by RA.
> 
> To test it right now I issued a command
> # crm resource restart xen_vbuild
> 
> where xen_vbuild is a Xen VM and no results whatsoever.
> 
> Here is output of the log
> 
> Nov 17 13:04:13 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
> Nov 17 13:05:14 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
> Nov 17 13:06:14 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
> Nov 17 13:07:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
> +   
> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
> +  __crm_diff_marker__="added:top" >
> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
> +name="target-role" value="Stopped" />
> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
> -   
> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
> - 
> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
> -id="xen_vbuild-meta_attributes-target-role" />
> Nov 17 13:07:44 xen-11 pengine: [22958]: debug: unpack_lrm_rsc_state:
> xen_vbuild: Overwriting calculated next role Unknown with requested
> next role Stopped
> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
> +   
> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
> + 
> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
> +id="xen_vbuild-meta_attributes-target-role" />
> Nov 17 13:07:45 xen-11 pengine: [22958]: debug: unpack_lrm_rsc_state:
> xen_vbuild: Overwriting calculated next role Unknown with requested
> next role Stopped
> Nov 17 13:07:45 xen-11 pengine: [22958]: notice: native_print:
> xen_vbuild(ocf::heartbeat:Xen):   Started xen-11
> Nov 17 13:07:45 xen-11 pengine: [22958]: debug: native_assign_node:
> All nodes for resource xen_vbuild are unavailable, unclean or shutting
> down (xen-11: 1, -100)
> Nov 17 13:07:45 xen-11 pengine: [22958]: debug: native_assign_node:
> Could not allocate a node for xen_vbuild
> Nov 17 13:07:45 xen-11 pengine: [22958]: info: native_color: Resource
> xen_vbuild cannot run anywhere
> Nov 17 13:07:45 xen-11 pengine: [22958]: notice: LogActions: Stop
> resource xen_vbuild   (xen-11)
> Nov 17 13:07:46 xen-11 pengine: [22958]: notice: native_print:
> xen_vbuild(ocf::heartbeat:Xen):   Started xen-11
> Nov 17 13:07:46 xen-11 pengine: [22958]: debug: native_assign_node:
> Assigning xen-11 to xen_vbuild
> Nov 17 13:07:46 xen-11 pengine: [22958]: notice: LogActions: Leave
> resource xen_vbuild   (Started xen-11)
> Nov 17 13:08:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
> Nov 17 13:09:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
> Nov 17 13:10:16 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
> Nov 17 13:11:16 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
> Nov 17 13:12:17 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
> Nov 17 13:13:17 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
> Nov 17 13:14:17 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
> Nov 17 13:15:18 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
> Nov 17 13:16:18 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
> Nov 17 13:17:19 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
> Nov 17 13:17:47 xen-11 pengine: [22958]: notice: native_print:
> xen_vbuild(ocf::heartbeat:Xen):   Started xen-11
> Nov 17 13:17:47 xen-11 pengine: [22958]: debug: native_assign_node:
> Assigning xen-11 to xen_vbuild
> Nov 17 13:17:47 xen-11 pengine: [22958]: notice: LogActions: Leave
> resource xen_vbuild   (Started xen-11)
> Nov 17 13:18:19 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
> Nov 17 13:19:19 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
> Nov 17 13:20:20 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
> 
> but VM never stopped:
> 
> 
> [r...@xen-11 ~]# xm list|grep vbuild
> vbuild 3  511 2 -b352.4
> 
> 
> still ID 3 as it was before

I'll take a look.

Thanks,

Dejan



Re: [Pacemaker] crm resource restart fails to restart the service

2010-11-17 Thread Dejan Muhamedagic
On Wed, Nov 17, 2010 at 08:30:36AM -0500, Vadym Chepkov wrote:
> On Wed, Nov 17, 2010 at 8:01 AM, Dejan Muhamedagic  
> wrote:
> 
> >> RA doesn't support restart action? Most luckily you get 
> >> OCF_ERR_UNIMPLEMENTED in the log
> >
> > It's actually a resource stop followed by start. It says so in
> > the help too. Perhaps the start precludes the stop action. The
> > logs should give a hint. We need a sleep in between.
> >
> 
> In this case this command is not working at all, because I tried in
> the past for many resources and it never worked, so I just assumed it
> has to be implemented by RA.

Funny, it has worked for me every time with apache, Dummy,
Delay, and stonith resources, on both pacemaker 1.0 and 1.1.

> To test it right now I issued a command
> # crm resource restart xen_vbuild

Can you try inserting a sleep to see if that helps? It's in
/usr/lib64/python2.6/site-packages/crm/ui.py:

    def restart(self,cmd,rsc):
        "usage: restart "
        if not is_name_sane(rsc):
            return False
        if not self.stop("stop",rsc):
            return False
        time.sleep(1)  # the sleep to insert between stop and start
        return self.start("start",rsc)

Thanks,

Dejan


Re: [Pacemaker] crm resource restart fails to restart the service

2010-11-17 Thread Vadym Chepkov

On Nov 17, 2010, at 9:46 AM, Dejan Muhamedagic wrote:

> On Wed, Nov 17, 2010 at 08:30:36AM -0500, Vadym Chepkov wrote:
>> On Wed, Nov 17, 2010 at 8:01 AM, Dejan Muhamedagic  
>> wrote:
>> 
>>>> RA doesn't support restart action? Most luckily you get 
>>>> OCF_ERR_UNIMPLEMENTED in the log
>>> 
>>> It's actually a resource stop followed by start. It says so in
>>> the help too. Perhaps the start precludes the stop action. The
>>> logs should give a hint. We need a sleep in between.
>>> 
>> 
>> In this case this command is not working at all, because I tried in
>> the past for many resources and it never worked, so I just assumed it
>> has to be implemented by RA.
> 
> Funny, it worked here for me every time for apache, Dummy,
> Delay, stonith resources. With both pacemaker 1.0 and 1.1.
> 
>> To test it right now I issued a command
>> # crm resource restart xen_vbuild
> 
> Can you try to insert a sleep and see if that helps. It's in
> /usr/lib64/python2.6/site-packages/crm/ui.py:
> 
> 802 def restart(self,cmd,rsc):
> 803 "usage: restart "
> 804 if not is_name_sane(rsc):
> 805 return False
> 806 if not self.stop("stop",rsc):
> 807 return False
> 808 time.sleep(1)
> 809 return self.start("start",rsc)
> 
> Thanks,
> 
> Dejan


Yep, that did the trick

Now I see this:

Nov 17 14:52:39 xen-11 Xen[1]: INFO: Xen domain vbuild will be stopped (timeout: 220s)
Nov 17 14:52:40 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
Nov 17 14:52:44 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
Nov 17 14:52:45 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
Nov 17 14:52:47 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
Nov 17 14:52:48 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
Nov 17 14:52:50 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
Nov 17 14:52:54 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
Nov 17 14:52:55 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
Nov 17 14:53:00 xen-11 Xen[1]: INFO: Xen domain vbuild stopped.

[r...@xen-11 ~]# xm list|grep build
vbuild18  511 2 -b 12.0




Re: [Pacemaker] crm resource restart fails to restart the service

2010-11-17 Thread Dan Frincu

Hi,

  
In my case the custom OCF RA works after some tweaks. Now I'm stuck
with the mysql RA; I think this is the issue:


/usr/lib/ocf/resource.d/heartbeat# ./mysql stop
./mysql: line 523: (/1000)-5: syntax error: operand expected (error token is "/1000)-5")


At first I thought it was because I had set the monitor, start and stop
timeouts to non-default values, but even after restoring the defaults
I get the same error.


primitive mysqld ocf:heartbeat:mysql \
   params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" 
enable_creation="0" datadir="/mysql/database" user="root" 
test_user="monitor" test_passwd=

Re: [Pacemaker] crm resource restart fails to restart the service

2010-11-17 Thread Dejan Muhamedagic
On Wed, Nov 17, 2010 at 09:56:25AM -0500, Vadym Chepkov wrote:
> 
> On Nov 17, 2010, at 9:46 AM, Dejan Muhamedagic wrote:
> 
> > On Wed, Nov 17, 2010 at 08:30:36AM -0500, Vadym Chepkov wrote:
> >> On Wed, Nov 17, 2010 at 8:01 AM, Dejan Muhamedagic  
> >> wrote:
> >> 
> >>>> RA doesn't support restart action? Most likely you get 
> >>>> OCF_ERR_UNIMPLEMENTED in the log
> >>> 
> >>> It's actually a resource stop followed by start. It says so in
> >>> the help too. Perhaps the start precludes the stop action. The
> >>> logs should give a hint. We need a sleep in between.
> >>> 
> >> 
> >> In this case this command is not working at all, because I tried in
> >> the past for many resources and it never worked, so I just assumed it
> >> has to be implemented by RA.
> > 
> > Funny, it worked here for me every time for apache, Dummy,
> > Delay, stonith resources. With both pacemaker 1.0 and 1.1.
> > 
> >> To test it right now I issued a command
> >> # crm resource restart xen_vbuild
> > 
> > Can you try to insert a sleep and see if that helps. It's in
> > /usr/lib64/python2.6/site-packages/crm/ui.py:
> > 
> > > 802     def restart(self,cmd,rsc):
> > > 803         "usage: restart "
> > > 804         if not is_name_sane(rsc):
> > > 805             return False
> > > 806         if not self.stop("stop",rsc):
> > > 807             return False
> > > 808         time.sleep(1)
> > > 809         return self.start("start",rsc)
> > 
> > Thanks,
> > 
> > Dejan
> 
> 
> Yep, that did the trick

OK. These nodes are faster than what I have (or the other way
around), i.e. this seems to be a timing issue.

Thanks,

Dejan

> Now I see this:
> 
> Nov 17 14:52:39 xen-11 Xen[1]: INFO: Xen domain vbuild will be stopped 
> (timeout: 220s)
> Nov 17 14:52:40 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
> Nov 17 14:52:44 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
> Nov 17 14:52:45 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
> Nov 17 14:52:47 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
> Nov 17 14:52:48 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
> Nov 17 14:52:50 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
> Nov 17 14:52:54 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
> Nov 17 14:52:55 xen-11 Xen[1]: DEBUG: vbuild still not stopped. Waiting...
> Nov 17 14:53:00 xen-11 Xen[1]: INFO: Xen domain vbuild stopped.
> 
> [r...@xen-11 ~]# xm list|grep build
> vbuild 18  511 2 -b 12.0
> 
> 
> 
> > 
> >> where xen_vbuild is a Xen VM and no results whatsoever.
> >> 
> >> Here is output of the log
> >> 
> >> Nov 17 13:04:13 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
> >> Nov 17 13:05:14 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
> >> Nov 17 13:06:14 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
> >> Nov 17 13:07:15 xen-11 lrmd: [4295]: debug: rsc:xen_vbuild:101: monitor
> >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
> >> +   
> >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
> >> +  >> __crm_diff_marker__="added:top" >
> >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
> >> +>> name="target-role" value="Stopped" />
> >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
> >> -   
> >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
> >> - 
> >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
> >> ->> id="xen_vbuild-meta_attributes-target-role" />
> >> Nov 17 13:07:44 xen-11 pengine: [22958]: debug: unpack_lrm_rsc_state:
> >> xen_vbuild: Overwriting calculated next role Unknown with requested
> >> next role Stopped
> >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
> >> +   
> >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
> >> + 
> >> Nov 17 13:07:44 xen-11 cib: [4294]: info: log_data_element: cib:diff:
> >> +>> id="xen_vbuild-meta_attributes-target-role" />
> >> Nov 17 13:07:45 xen-11 pengine: [22958]: debug: unpack_lrm_rsc_state:
> >> xen_vbuild: Overwriting calculated next role Unknown with requested
> >> next role Stopped
> >> Nov 17 13:07:45 xen-11 pengine: [22958]: notice: native_print:
> >> xen_vbuild (ocf::heartbeat:Xen):   Started xen-11
> >> Nov 17 13:07:45 xen-11 pengine: [22958]: debug: native_assign_node:
> >> All nodes for resource xen_vbuild are unavailable, unclean or shutting
> >> down (xen-11: 1, -100)
> >> Nov 17 13:07:45 xen-11 pengine: [22958]: debug: native_assign_node:
> >> Could not allocate a node for xen_vbuild
> >> Nov 17 13:07:45 xen-11 pengine: [22958]: info: native_color: Resource
> >> xen_vbuild cannot run anywhere
> >> Nov 17 13:07:45 xen-11 pengine: [22958]: notice: LogActions: Stop
> >> resource xen_vbuild(xen-11)
> >> Nov 17 13:07:46 xen-11

Re: [Pacemaker] crm resource restart fails to restart the service

2010-11-18 Thread Vadym Chepkov
On Wed, Nov 17, 2010 at 1:03 PM, Dejan Muhamedagic  wrote:
>> >
>> > Funny, it worked here for me every time for apache, Dummy,
>> > Delay, stonith resources. With both pacemaker 1.0 and 1.1.
>> >
>> >> To test it right now I issued a command
>> >> # crm resource restart xen_vbuild
>> >
>> > Can you try to insert a sleep and see if that helps. It's in
>> > /usr/lib64/python2.6/site-packages/crm/ui.py:
>> >
>> > 802     def restart(self,cmd,rsc):
>> > 803         "usage: restart "
>> > 804         if not is_name_sane(rsc):
>> > 805             return False
>> > 806         if not self.stop("stop",rsc):
>> > 807             return False
>> > 808         time.sleep(1)
>> > 809         return self.start("start",rsc)
>> >
>> > Thanks,
>> >
>> > Dejan
>>
>>
>> Yep, that did the trick
>
> OK. These nodes are faster than what I have (or the other way
> around), i.e. this seems to be timing issue.
>
> Thanks,
>
> Dejan
>

Well, I would say it's not normal, right? Are you going to include
this "sleep" in the stable-1.0 branch? Or maybe some op_defaults
reset_delay?

Thanks,
Vadym



Re: [Pacemaker] crm resource restart fails to restart the service

2010-11-19 Thread Dejan Muhamedagic
Hi,

On Thu, Nov 18, 2010 at 01:35:24PM -0500, Vadym Chepkov wrote:
> On Wed, Nov 17, 2010 at 1:03 PM, Dejan Muhamedagic  
> wrote:
> >> >
> >> > Funny, it worked here for me every time for apache, Dummy,
> >> > Delay, stonith resources. With both pacemaker 1.0 and 1.1.
> >> >
> >> >> To test it right now I issued a command
> >> >> # crm resource restart xen_vbuild
> >> >
> >> > Can you try to insert a sleep and see if that helps. It's in
> >> > /usr/lib64/python2.6/site-packages/crm/ui.py:
> >> >
> >> > 802     def restart(self,cmd,rsc):
> >> > 803         "usage: restart "
> >> > 804         if not is_name_sane(rsc):
> >> > 805             return False
> >> > 806         if not self.stop("stop",rsc):
> >> > 807             return False
> >> > 808         time.sleep(1)
> >> > 809         return self.start("start",rsc)
> >> >
> >> > Thanks,
> >> >
> >> > Dejan
> >>
> >>
> >> Yep, that did the trick
> >
> > OK. These nodes are faster than what I have (or the other way
> > around), i.e. this seems to be timing issue.
> >
> > Thanks,
> >
> > Dejan
> >
> 
> well, I would say it's not normal, right?

I guess not, but what do you really mean? :)

> Are you going to include
> this "sleep" in the stable-1.0 branch ?

The sleep is currently included in the 1.1 branch, but it's not
a proper fix. If there are dependencies which take time to stop
then the restart will fail. In that case we'd need to wait for
the transition to finish. Right now, the shell doesn't have such
a facility, but should get one.
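
For illustration only, here is a minimal Python 3 sketch of replacing the
fixed sleep with a bounded wait. It is not the crm shell's code:
wait_until, restart and the stop/start/is_stopped callables are
hypothetical names, and how is_stopped would actually ask the cluster
whether the transition has finished is left open.

```python
import time


def wait_until(predicate, timeout=60.0, interval=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll predicate() until it returns True or `timeout` seconds elapse.

    Returns True if the predicate became true, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def restart(stop, start, is_stopped, timeout=60.0, interval=1.0,
            sleep=time.sleep):
    """Stop, wait until the stop has really taken effect, then start.

    stop, start and is_stopped are caller-supplied callables (hypothetical
    here); each returns a truthy value on success.
    """
    if not stop():
        return False
    # Instead of a fixed one-second sleep, wait for the stop to complete,
    # however long dependent resources take, up to `timeout` seconds.
    if not wait_until(is_stopped, timeout, interval, sleep=sleep):
        return False
    return start()
```

Unlike the one-second sleep, this gives up only after the timeout, so a
resource with slow dependencies still gets restarted, and a stop that
never completes is reported as a failure instead of being followed by a
start.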

> or maybe some op_defaults
> reset_delay ?

That's still not general enough.

Thanks,

Dejan

> 
> Thanks,
> Vadym
> 
