Re: [Pacemaker] [Openais] Unusual exit code with /etc/init.d/corosync stop
On Tue, Mar 23, 2010 at 12:42 AM, Andreas Mock andreas.m...@web.de wrote:
> Hi all,
>
> I'm using corosync 1.2.0 from the packages of clusterlabs.org on openSuSE 11.2.
> A correct /etc/init.d/corosync stop issues a return code of 1

The rc code isn't coming from corosync at all.
It's coming from the last command in stop(), which is echo.

Please run the following and report the result:

    echo ; echo $?

On Fedora it produces:

    [09:14 AM] r...@f12 ~/tmp # echo ; echo $?

    0
    [09:14 AM] r...@f12 ~/tmp #

> which definitely hurts the Cluster Test Suite when stopping the cluster stack,
> assuming (IMHO correctly) that a problem-free execution of the rc script
> should return 0 and not 1.
>
> The problem is indirectly the setting of the return code variable $rtrn in the
> while loop waiting for corosync to die. The while loop is exited exactly when
> the status call delivers a 1, meaning that the process isn't there any more.
> This rc of 1 will then be delivered as the return code of the stop call.
>
> Here's the patch, just to show the little change.
>
> ---8<---
> --- /etc/init.d/corosync 2010-01-20 21:23:53.0 +0100
> +++ /tmp/corosync 2010-03-23 00:25:12.794065102 +0100
> @@ -138,6 +138,7 @@
>          ;;
>      stop)
>          stop
> +        rtrn=0
>          ;;
>      *)
>          echo "usage: $0 {start|stop|restart|reload|force-reload|condrestart|try-restart|status}"
> ---8<---
>
> Best regards
> Andreas Mock
>
> ___
> Openais mailing list
> open...@lists.linux-foundation.org
> https://lists.linux-foundation.org/mailman/listinfo/openais

___
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Re: [Pacemaker] [Openais] Unusual exit code with /etc/init.d/corosync stop
-----Original Message-----
From: Andrew Beekhof and...@beekhof.net
Sent: 25.03.2010 09:15:11
To: Andreas Mock andreas.m...@web.de
Subject: Re: [Openais] Unusual exit code with /etc/init.d/corosync stop

> On Tue, Mar 23, 2010 at 12:42 AM, Andreas Mock wrote:
>> Hi all,
>>
>> I'm using corosync 1.2.0 from the packages of clusterlabs.org on openSuSE 11.2.
>> A correct /etc/init.d/corosync stop issues a return code of 1
>
> The rc code isn't coming from corosync at all.
> It's coming from the last command in stop(), which is echo.

Where in my original post did I say that the return code comes from corosync (the binary)?
Please read the mail completely.

In the first sentence I just described the version and platform I'm using, and that the script /etc/init.d/corosync issues a return code of 1 when stopping worked correctly.

Some lines further down (you can see them in your quoted post) I explain, probably in bad English, what the reason for this return code is, as I investigated this problem by debugging the script /etc/init.d/corosync.

Read the rest of my mail carefully and you will get the reason for that behaviour:

a) The very last line is: exit $rtrn
b) Where is the global variable $rtrn initialized and set?
c) It gets set in the shell function status!
d) When you do a stop and the stop works, status is called one last time in the while loop, setting $rtrn to 1.
e) This variable is never changed afterwards.
f) It is returned by the last statement; look at a).

Best regards
Andreas Mock
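
The sequence a) to f) can be reproduced with a minimal sketch. Everything in it is a hypothetical stand-in rather than the real init script: the `alive` counter fakes a daemon that survives two more polls, and the bodies of `status` and `stop` only mimic the behaviour described above, namely that `status` writes its result into the global `rtrn` and that `stop` polls until `status` fails.

```shell
#!/bin/sh
# Hypothetical stand-in for the logic described in a) to f); not the real script.

rtrn=0      # global exit status, used by the final "exit $rtrn" in the real script
alive=2     # assumption: the fake daemon survives two more polls

# Stand-in for status(): reports failure (1) once the fake daemon is gone
# and stores the result in the GLOBAL variable rtrn.
status() {
    if [ "$alive" -gt 0 ]; then
        alive=$((alive - 1))
        rtrn=0
    else
        rtrn=1
    fi
    return $rtrn
}

# Stand-in for stop(): poll until status reports "not running".
stop() {
    while status; do
        :   # the real script would wait here
    done
}

stop
echo "script would exit with: $rtrn"
```

The loop can only end after `status` has failed once, so the successful stop always leaves `rtrn` at 1, which is what the final `exit $rtrn` would then report.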
Re: [Pacemaker] [Openais] Unusual exit code with /etc/init.d/corosync stop (Steve - Please ack new patch)
On Thu, Mar 25, 2010 at 9:32 AM, Andreas Mock andreas.m...@web.de wrote:
> -----Original Message-----
> From: Andrew Beekhof and...@beekhof.net
> Sent: 25.03.2010 09:15:11
> To: Andreas Mock andreas.m...@web.de
> Subject: Re: [Openais] Unusual exit code with /etc/init.d/corosync stop
>
>> On Tue, Mar 23, 2010 at 12:42 AM, Andreas Mock wrote:
>>> Hi all,
>>>
>>> I'm using corosync 1.2.0 from the packages of clusterlabs.org on openSuSE 11.2.
>>> A correct /etc/init.d/corosync stop issues a return code of 1
>>
>> The rc code isn't coming from corosync at all.
>> It's coming from the last command in stop(), which is echo.
>
> Where in my original post did I say that the return code comes from corosync (the binary)?
> Please read the mail completely.
>
> In the first sentence I just described the version and platform I'm using, and
> that the script /etc/init.d/corosync issues a return code of 1 when stopping
> worked correctly.
>
> Some lines further down (you can see them in your quoted post) I explain,
> probably in bad English, what the reason for this return code is, as I
> investigated this problem by debugging the script /etc/init.d/corosync.
>
> Read the rest of my mail carefully and you will get the reason for that behaviour:
>
> a) The very last line is: exit $rtrn
> b) Where is the global variable $rtrn initialized and set?
> c) It gets set in the shell function status!
> d) When you do a stop and the stop works, status is called one last time in
>    the while loop, setting $rtrn to 1.
> e) This variable is never changed afterwards.
> f) It is returned by the last statement; look at a).

Do try to calm down a little.
I made a mistake; it happens when one tries responding to 40-50 conversations a day.

Patching after stop is wrong though; the root cause is status() not using a local variable.

--- ./etc/init.d/corosync.old 2010-03-25 10:21:19.673779309 +0100
+++ ./etc/init.d/corosync 2010-03-25 10:23:47.318779319 +0100
@@ -40,13 +40,13 @@ failure()
 status()
 {
 	pid=$(pidof $1 2>/dev/null)
-	rtrn=$?
-	if [ $rtrn -ne 0 ]; then
+	rc=$?
+	if [ $rc -ne 0 ]; then
 		echo "$1 is stopped"
 	else
 		echo "$1 (pid $pid) is running..."
 	fi
-	return $rtrn
+	return $rc
 }
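
The effect of this second patch can be checked with a small sketch. As an assumption for illustration, the `alive` counter fakes a daemon that survives two more polls, and the function bodies are hypothetical stand-ins for the real `status` and `stop`; only the variable names `rtrn` and `rc` come from the patch.

```shell
#!/bin/sh
# Hypothetical stand-in for the patched init script; not the real script.

rtrn=0      # global exit status of the script
alive=2     # assumption: the fake daemon survives two more polls

# Patched status(): the result goes into rc, so rtrn is never touched.
status() {
    if [ "$alive" -gt 0 ]; then
        alive=$((alive - 1))
        rc=0
    else
        rc=1
    fi
    return $rc
}

# Stand-in for stop(): poll until status reports "not running".
stop() {
    while status; do
        :   # the real script would wait here
    done
}

stop
echo "script would exit with: $rtrn"
```

Here `rtrn` keeps its initial value of 0 after a successful stop. Note that in plain sh a bare assignment such as `rc=$?` still creates a global variable; the patch works because `status` no longer overwrites `rtrn`. Shells such as bash and dash also accept `local rc` inside a function, although `local` is not part of POSIX.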
Re: [Pacemaker] [Openais] Unusual exit code with /etc/init.d/corosync stop (Steve - Please ack new patch)
-----Original Message-----
From: Andrew Beekhof and...@beekhof.net
Sent: 25.03.2010 10:29:50
To: Andreas Mock andreas.m...@web.de
Subject: Re: [Openais] Unusual exit code with /etc/init.d/corosync stop (Steve - Please ack new patch)

> Do try to calm down a little.

Sorry if I sounded upset. That was not my intention.
In fact, I thought that the way I expressed myself was the reason you didn't understand me, and therefore I tried to sum it up in a different way. It seems to have worked.

> I made a mistake, it happens when one tries responding to 40-50 conversations a day.

As it happens to every one of us from time to time. :-)
Be sure I have the greatest respect for your fast replies to any questions sent to the mailing list. I'm sure I'm not the only one who is thankful for that.

So, in fact, I'm totally relaxed. All the more so since I read that we will get corosync 1.2.1 from you guys soon.

Best regards
Andreas Mock