Re: [Linux-ha-dev] RFC: pidfile handling; current worst case: stop failure and node level fencing

Dejan Muhamedagic Wed, 22 Oct 2014 02:31:43 -0700

Hi Lars,

On Mon, Oct 20, 2014 at 09:17:29PM +0200, Lars Ellenberg wrote:
> 
> Recent discussions with Dejan made me again more prominently aware of a
> few issues we probably all know about, but usually dismiss as having
> little relevance in the real world.
> 
> The facts:
> 
>  * a pidfile typically only stores a pid
>  * a pidfile may be "stale", not properly cleaned up
>    when the pid it references died.
>  * pids are recycled
> 
>    This is more an issue if kernel.pid_max is small
>    wrt the number of processes created per unit time,
>    for example on some embedded systems,
>    or on some very busy systems.
> 
>    But it may be an issue on any system,
>    even a mostly idle one, given "bad luck^W timing",
>    see below.
> 
> A common idiom in resource agents is to
> 
> kill_that_pid_and_wait_until_dead()
> {
>       local pid=$1
>       is_alive $pid || return 0
>       kill -TERM $pid
>       while is_alive $pid ; do sleep 1; done
>       return 0
> }
> 
> The naïve implementation of is_alive() is
> is_alive() { kill -0 $1 ; }
> 
> This is the main issue:
> -----------------------
> 
> If the last-used-pid is just a bit smaller than $pid,
> during the sleep 1, $pid may die,
> and the OS may already have created a new process with that exact pid.
> 
> Using above "is_alive", kill_that_pid() will not notice that the
> to-be-killed pid has actually terminated while that new process runs.
> Which may be a very long time if that is some other long running daemon.
> 
> This may result in stop failure and resulting node level fencing.
> 
> The question is, which better way do we have to detect if some pid died
> after we killed it. Or, related, and even better: how to detect if the
> process currently running with some pid is in fact still the process
> referenced by the pidfile.
> 
> I have two suggestions.
> 
> (I am trying to avoid bashisms in here.
>  But maybe I overlook some.
>  Also, the code is typed, not sourced from some working script,
>  so there may be logic bugs and typos.
>  My intent should be obvious enough, though.)
> 
> using "cd /proc/$pid; stat ."
> -----------------------------
> 
> # this is most likely linux specific


Apparently not. According to Wikipedia at least, most UNIX
platforms (including BSD and Solaris) support /proc/$pid.

> kill_that_pid_and_wait_until_dead()
> {
>       local pid=$1
>       (
>               cd /proc/$pid || return 0
>               kill -TERM $pid
>               while stat . ; do sleep 1; done

I'd rather use "test -d ." (it's more common in shell scripts and
runs faster). BTW, on my laptop, test -d is so fast that the
process hasn't been removed yet when it first runs, so the loop
body always gets executed at least once. In that respect, "stat"
or "ls -d" performs better.

>       )
>       return 0
> }
> 
> Once pid dies, /proc/$pid will become stale (but not completely go away,
> because it is our cwd), and stat . will return "No such process".

This seems to be a very elegant solution and I cannot find fault
with it. Short and easy to understand too.
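
A minimal sketch of the "cd /proc/$pid" variant with the loop syntax
completed, assuming a Linux-style /proc and a stat(1) binary. The
function name is made up for illustration, and the behaviour of
"stat ." after the process exits is taken from the description above.

```shell
# Sketch only: wait for a pid to exit while holding /proc/$pid as cwd.
# Assumes a Linux-style /proc; the function name is hypothetical.
kill_pid_and_wait_via_proc()
{
	pid=$1
	(
		# If the directory is already gone, there is nothing to kill.
		cd "/proc/$pid" 2>/dev/null || exit 0
		kill -TERM "$pid" 2>/dev/null
		# "." still refers to the entry of the original process, so
		# stat keeps succeeding only while that process exists, even
		# if the pid number is reused by a new process meanwhile.
		while stat . >/dev/null 2>&1; do
			sleep 1
		done
	)
	return 0
}
```

Note that a process which ignores TERM keeps this loop running
forever, just like the original; a bounded wait would need a counter.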

[... Skipping other proposals, some of which are quite exotic :) ]

> kill_using_pidfile()
> {
>       local pidfile=$1
>       local pid starttime proc_pid_starttime
> 
>       test -e $pidfile                || return # already dead
>       read pid starttime <$pidfile    || return # unreadable

I'd assume that we (the caller) know what the process should
look like in the process table, say its command and arguments.
We could also test for that, in case the process has exited but
the PID file was somehow left behind.

>       # check pid and starttime are both present, numeric only, ...
>       # I have a version that distinguishes 16 distinct error

Wow!

>       # conditions; this is the short version only...
> 
>       local i=0
>       while
>               get_proc_pid_starttime &&
>               [ "$starttime" = "$proc_pid_starttime" ]
>       do
>               : $(( i+=1 ))
>               [ $i =  1 ] && kill -TERM $pid
>               # MAYBE # [ $i = 30 ] && kill -KILL $pid
>               sleep 1
>       done
> 
>       # it's not (anymore) the process we were looking for
>       # remove that pidfile.
> 
>       rm -f "$pidfile"
> }
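
The helper get_proc_pid_starttime is not shown in the quoted short
version. One possible sketch, assuming Linux, where field 22 of
/proc/$pid/stat is the process start time in clock ticks since boot.
It reads the global $pid and sets $proc_pid_starttime, which is how
the loop above appears to use it; the body is an illustration, not
the original code.

```shell
# Hypothetical helper: set proc_pid_starttime from /proc/$pid/stat.
# Assumes Linux; returns non-zero if the process cannot be inspected.
get_proc_pid_starttime()
{
	proc_pid_starttime=
	[ -r "/proc/$pid/stat" ] || return 1
	{ read -r stat_line <"/proc/$pid/stat"; } 2>/dev/null || return 1
	# Field 2 (the command name) is wrapped in parentheses and may
	# contain spaces, so drop everything up to the last ") " first.
	stat_line=${stat_line##*") "}
	# What remains starts at field 3, so field 22 (starttime) is
	# the 20th remaining word.
	set -- $stat_line
	[ $# -ge 20 ] || return 1
	proc_pid_starttime=${20}
}
```

For the comparison to work, whoever writes the pidfile would have to
record the same value next to the pid at daemon start.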
> 
> In other OSes, ps may be able to give a good enough equivalent?
> 
> Any comments?

I'd just go with the "cd /proc/$pid" thing. Perhaps add a test
for "ps -o cmd $pid" output.
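
A sketch of such a command check, under the assumption that the
caller knows a string the daemon's command line must contain. It uses
the POSIX spelling "ps -o args= -p $pid" rather than "cmd", which as
far as I know is a procps-specific name for the same column. The
helper name is made up for illustration.

```shell
# Hypothetical helper: succeed only if $1 is a running pid whose
# command line contains the string $2.
pid_matches_command()
{
	pid=$1
	expected=$2
	# "args=" prints the full command line without a header line;
	# ps exits non-zero if no such process exists.
	cmdline=$(ps -o args= -p "$pid" 2>/dev/null) || return 1
	case $cmdline in
		*"$expected"*) return 0 ;;
	esac
	return 1
}
```

A resource agent could call this before sending TERM, for example
pid_matches_command "$pid" "mydaemon" || rm -f "$pidfile", where
"mydaemon" stands for whatever the agent actually started.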

And thanks for giving this such a thorough analysis!

Thanks,

Dejan

> Thanks,
>       Lars
> 
> _______________________________________________________
> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/
