Re: [Pacemaker] [Openais] Unusual exit code with /etc/init.d/corosync stop

2010-03-25 Thread Andrew Beekhof
On Tue, Mar 23, 2010 at 12:42 AM, Andreas Mock andreas.m...@web.de wrote:
 Hi all,

 I'm using corosync 1.2.0 from the packages of clusterlabs.org on openSuSE 
 11.2.
 A correct /etc/init.d/corosync stop issues a return code of 1

The return code isn't coming from corosync at all.
It's coming from the last command in stop(), which is echo.

Please run the following and report the result:
   echo ; echo $?

On Fedora it produces:

[09:14 AM] r...@f12 ~/tmp # echo ; echo $?

0
[09:14 AM] r...@f12 ~/tmp #


 which definitely hurts
 the Cluster Test Suite when stopping the cluster stack, assuming (IMHO correctly)
 that a problem-free execution of the rc script should return 0 and not 1.



 The problem is, indirectly, the setting of the return code variable $rtrn in
 the while loop waiting for corosync to die. The while loop exits exactly when
 the status call returns 1, meaning that the process isn't there any more. This
 rc of 1 is then delivered as the return code of the stop call.



 Here's the patch just to show the little change.

 ---8<---

 --- /etc/init.d/corosync 2010-01-20 21:23:53.0 +0100
 +++ /tmp/corosync 2010-03-23 00:25:12.794065102 +0100
 @@ -138,6 +138,7 @@
          ;;
    stop)
          stop
 +        rtrn=0
          ;;
    *)
          echo "usage: $0 {start|stop|restart|reload|force-reload|condrestart|try-restart|status}"
 ---8<---



 Best regards

 Andreas Mock








 ___
 Openais mailing list
 open...@lists.linux-foundation.org
 https://lists.linux-foundation.org/mailman/listinfo/openais

___
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker


Re: [Pacemaker] [Openais] Unusual exit code with /etc/init.d/corosync stop

2010-03-25 Thread Andreas Mock
-----Original Message-----
From: Andrew Beekhof and...@beekhof.net
Sent: 25.03.2010 09:15:11
To: Andreas Mock andreas.m...@web.de
Subject: Re: [Openais] Unusual exit code with /etc/init.d/corosync stop

On Tue, Mar 23, 2010 at 12:42 AM, Andreas Mock wrote:
 Hi all,

 I'm using corosync 1.2.0 from the packages of clusterlabs.org on openSuSE 
 11.2.
 A correct /etc/init.d/corosync stop issues a return code of 1

The return code isn't coming from corosync at all.
It's coming from the last command in stop(), which is echo.

Where in my original post did I say that the return code comes from the
corosync binary?

Please read the whole mail. In the first sentence I only described the
version and platform I'm using, and that the script /etc/init.d/corosync
returns 1 even when stopping worked correctly.

A few lines further down (you can see them in your quoted post) I explain,
probably in bad English, the reason for this return code, which I found by
debugging the script /etc/init.d/corosync.

Read the rest of my mail carefully and you will see the reason for that behaviour:
a) The very last line is: exit $rtrn
b) Where is the global variable $rtrn initialized and set?
c) It gets set in the shell function status!
d) When you do a stop and the stop works, status is called for the last time
in the while loop, setting $rtrn to 1.
e) This variable is never changed afterwards.
f) It is returned by the last statement; see a).
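
The chain a) to f) can be reproduced without corosync installed. What follows is only a minimal sketch, not the real init script: status() and stop() are simplified stand-ins, and the counter `running` as well as the names `stop_rc` and `leaked_rtrn` are hypothetical, introduced just to simulate a daemon that needs a couple of polls to disappear.

```shell
#!/bin/sh
# Sketch only: simulates the structure described in a) to f).
# "running" is a hypothetical counter standing in for the real pidof check.
running=2

# Stand-in for status(): stores its result in the GLOBAL variable rtrn.
status() {
    if [ "$running" -gt 0 ]; then
        running=$((running - 1))
        rtrn=0      # process still found
    else
        rtrn=1      # process gone
    fi
    return $rtrn
}

# Stand-in for stop(): poll until status reports that the process is gone.
stop() {
    while status; do
        :           # the real script would wait here before polling again
    done
    echo            # last command in stop(), succeeds
}

rtrn=0
stop
stop_rc=$?          # exit status of stop() itself (that of the trailing echo)
leaked_rtrn=$rtrn   # value a final "exit $rtrn" would hand back to the caller

echo "stop() returned $stop_rc, but rtrn is now $leaked_rtrn"
```

Under these assumptions stop() itself succeeds, yet rtrn is left at 1 by the last status call, which is what a trailing `exit $rtrn` would then report.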


Best regards
Andreas Mock





Re: [Pacemaker] [Openais] Unusual exit code with /etc/init.d/corosync stop (Steve - Please ack new patch)

2010-03-25 Thread Andrew Beekhof

Do try to calm down a little.
I made a mistake, it happens when one tries responding to 40-50
conversations a day.

Patching after stop is wrong, though; the root cause is that status() does
not use a local variable.

--- ./etc/init.d/corosync.old   2010-03-25 10:21:19.673779309 +0100
+++ ./etc/init.d/corosync   2010-03-25 10:23:47.318779319 +0100
@@ -40,13 +40,13 @@ failure()
 status()
 {
     pid=$(pidof $1 2>/dev/null)
-    rtrn=$?
-    if [ $rtrn -ne 0 ]; then
+    rc=$?
+    if [ $rc -ne 0 ]; then
         echo "$1 is stopped"
     else
         echo "$1 (pid $pid) is running..."
     fi
-    return $rtrn
+    return $rc
 }
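
To illustrate the effect of the patch above, here is a minimal sketch with a simulated status(). It is not the real init script: the counter `running` and the names `stop_rc` and `final_rtrn` are hypothetical. The patch itself only renames the variable to rc; declaring it with `local` is an extra assumption of this sketch (`local` is not part of POSIX sh, but bash, dash and busybox ash support it).

```shell
#!/bin/sh
# Sketch only: status() keeps its result out of the script-wide rtrn.
# "running" is a hypothetical counter standing in for the real pidof check.
running=2

status() {
    # Assumption of this sketch: rc is declared local so it cannot leak.
    local rc
    if [ "$running" -gt 0 ]; then
        running=$((running - 1))
        rc=0        # process still found
    else
        rc=1        # process gone
    fi
    return $rc
}

stop() {
    while status; do
        :           # the real script would wait here before polling again
    done
    echo
}

rtrn=0
stop
stop_rc=$?
final_rtrn=$rtrn    # untouched by status(), so "exit $rtrn" would exit with 0

echo "stop() returned $stop_rc, rtrn is still $final_rtrn"
```

With status() no longer writing to rtrn, the value set before the stop call survives the polling loop, so no extra `rtrn=0` after stop is needed.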



Re: [Pacemaker] [Openais] Unusual exit code with /etc/init.d/corosync stop (Steve - Please ack new patch)

2010-03-25 Thread Andreas Mock
-----Original Message-----
From: Andrew Beekhof and...@beekhof.net
Sent: 25.03.2010 10:29:50
To: Andreas Mock andreas.m...@web.de
Subject: Re: [Openais] Unusual exit code with /etc/init.d/corosync stop (Steve - Please ack new patch)

Do try to calm down a little.

Sorry if I sounded upset. That was not my intention.

In fact I thought that the way I had expressed myself was the
reason you didn't understand me, so I tried to sum it
up in a different way. It seems to have worked.

I made a mistake, it happens when one tries responding to 40-50
conversations a day.

As it happens to every one of us from time to time.  :-)

Rest assured that I have the greatest respect for your fast replies to any
question sent to the mailing list. I'm sure I'm not the only one who is
thankful for that.

So, in fact, I'm totally relaxed, all the more so since I read that we
will get corosync 1.2.1 from you guys soon.

Best regards
Andreas Mock
