On Wed, 2013-02-20 at 16:26 +0900, Yuichi SEINO wrote:
> Hi Jiaju,
>
> I am testing this patch.
> When a lockfile was removed, it seems that the stop of RA isn't an
> intended behavior.

I'm just curious how the lockfile was removed. Basically, the existence of
the lockfile shows that one boothd has been started, and prevents it from
being wrongly started again. So the lockfile should not be removed
intentionally by the admin.

Thanks,
Jiaju

> Currently, if "pidnum" is empty, the RA runs "cat /proc//cmdline".
> /proc/cmdline is the boot parameter file. So, I added a check for the
> existence of the lockfile.
>
> diff --git a/script/ocf/booth-site b/script/ocf/booth-site
> index 2575643..7c775dc 100755
> --- a/script/ocf/booth-site
> +++ b/script/ocf/booth-site
> @@ -116,6 +116,10 @@ booth_check_daemon_state(){
>
>         case $rc in
>         $OCF_SUCCESS)
> +               if [ ! -f $lockfile ]; then
> +                       ocf_log err "lockfile not exists.(${lockfile})"
> +                       return $BOOTH_DAEMON_EXIST;
> +               fi
>                 pidnum=$(cat $lockfile |awk '{print $1}')
>                 daemonstate=$(cat $lockfile |awk '{print $2}')
>                 if cat /proc/$pidnum/cmdline |grep $OCF_RESKEY_type >/dev/null 2>&1; then
>
> When this happened, I got the following from "crm resource trace booth":
>
> + 21:09:48: 223: '[' '!' ']'
> + 21:09:48: 224: OCF_RESKEY_daemon=boothd
> + 21:09:48: 227: '[' '!' ']'
> + 21:09:48: 228: OCF_RESKEY_type=site
> + 21:09:48: 231: case $__OCF_ACTION in
> + 21:09:48: 236: booth_stop
> + 21:09:48: booth_stop:166: booth_check_daemon_state
> + 21:09:48: booth_check_daemon_state:115: booth_check_daemon_exist
> + 21:09:48: booth_check_daemon_exist:105: killall -0 boothd
> + 21:09:48: booth_check_daemon_exist:105: rc=0
> + 21:09:48: booth_check_daemon_exist:107: case $rc in
> + 21:09:48: booth_check_daemon_exist:108: return 0
> + 21:09:48: booth_check_daemon_state:115: rc=0
> + 21:09:48: booth_check_daemon_state:117: case $rc in
> + 21:09:48: booth_check_daemon_state:117: case $rc in
> ++ 21:09:48: booth_check_daemon_state:119: awk '{print $1}'
> + 21:09:48: booth_check_daemon_state:117: case $rc in
> + 21:09:48: booth_check_daemon_state:117: case $rc in
> ++ 21:09:48: booth_check_daemon_state:119: cat /var/run/booth.pid
> + 21:09:48: booth_check_daemon_state:117: case $rc in
> + 21:09:48: booth_check_daemon_state:117: case $rc in
> + 21:09:48: booth_check_daemon_state:119: pidnum=
> ++ 21:09:48: booth_check_daemon_state:120: awk '{print $2}'
> ++ 21:09:48: booth_check_daemon_state:120: cat /var/run/booth.pid
> + 21:09:48: booth_check_daemon_state:120: daemonstate=
> + 21:09:48: booth_check_daemon_state:121: grep site
> + 21:09:48: booth_check_daemon_state:121: cat /proc//cmdline
> + 21:09:48: booth_check_daemon_state:122: case $daemonstate in
> + 21:09:48: booth_check_daemon_state:125: return 4
> + 21:09:48: booth_stop:166: rc=4
> + 21:09:48: booth_stop:168: case $rc in
> + 21:09:48: booth_stop:173: return 1
> + 21:09:48: 246: rc=1
> + 21:09:48: 248: exit 1
>
>
> 2013/2/19 Jiaju Zhang <jjzh...@suse.de>:
> > Hi Yuichi,
> >
> > On Tue, 2013-02-19 at 10:27 +0900, Yuichi SEINO wrote:
> >> Hi Xia,
> >>
> >> I have a question about the following part. The write man page explains
> >> that "errno" is set appropriately if write returns -1. So, if "rv" is
> >> equal to 0, strerror(errno) may not output the correct message. What
> >> do you think about it?
> >
> > Good catch, I think we should differentiate the cases of rv == -1 and
> > rv == 0. Maybe setting errno to ENOSPC when rv == 0.
> >
> > BTW, apart from that, does this patch fix your original issue?
> >
> > Thanks,
> > Jiaju
> >
>
> --
> Yuichi SEINO
> METROSYSTEMS CORPORATION
> E-mail:seino.clust...@gmail.com
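
To illustrate the empty "pidnum" case discussed in this thread, here is a minimal sketch of a guard that validates the lockfile contents before anything touches /proc. The helper name read_lock_pid is hypothetical and is not part of booth-site; it assumes the lockfile holds the PID as the first field of its first line, as the awk calls in the patch suggest.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of booth-site: validate the lockfile
# before its contents are used to build a /proc path.
# Prints the PID on success; returns 1 if the lockfile is missing or
# its first field is empty or not a number.
read_lock_pid() {
    lockfile=$1

    # Missing lockfile: there is nothing to read.
    [ -f "$lockfile" ] || return 1

    # Assumption: the PID is the first field of the first line.
    pid=$(awk 'NR == 1 { print $1 }' "$lockfile")

    # An empty PID would turn /proc/$pid/cmdline into /proc//cmdline,
    # which is the kernel boot parameter file, so reject it here.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac

    printf '%s\n' "$pid"
}
```

A caller could then write something like "if pid=$(read_lock_pid /var/run/booth.pid); then ...; fi" and only read /proc/$pid/cmdline inside the "then" branch. Which return code the RA should report when the check fails is a separate question and is left open here.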