>On Mon, Oct 20, 2014 at 11:21:36PM +0200, Lars Ellenberg wrote:
>> On Mon, Oct 20, 2014 at 03:04:31PM -0600, Alan Robertson wrote:
>> > On 10/20/2014 02:52 PM, Alan Robertson wrote:
>> > > For the Assimilation code I use the full pathname of the binary from
>> > > /proc to tell if it's "one of mine".  That's not perfect if you're using
>> > > an interpreted language.  It works quite well for compiled languages.
>> 
>> It works just as well (or as bad) from interpreted languages:
>> readlink /proc/$pid/exe
>> (very old linux has a fsid:inode encoding there, but I digress)
>> 
>> But that does solve a different subset of problems,
>> has race conditions in itself, and breaks if you have updated the binary
>> since start of that service (which does happen).

Sorry, I lost the original.
Alan then wrote:

> It only breaks if you change the *name* of the binary.  Updating the
> binary contents has no effect.  Changing the name of the binary is
> pretty unusual - or so it seems to me.  Did I miss something?
> 
> And if you do, you should stop with the binary with the old version and
> start it with the new one.  Very few methods are going to deal well with
> radical changes in the service without stopping it with the old script,
> updating, and starting with the new script.

Well, the "pid starttime" method does...

> I don't believe I see the race condition.

Does not matter.

> It won't loop, and it's not fooled by pid wraparound.  What else are you
> looking for? [Guess I missed something else here]

pid + exe is certainly better than the pid alone.
It may even be "good enough".

But it still has shortcomings.

/proc/pid/exe is not stable
(the link target gets " (deleted)" appended if the binary is deleted),
though that could be accounted for.
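
To illustrate "could be accounted for", here is a minimal Python sketch.
The helper names (normalize_exe_target, exe_of) are made up for this
example and are not part of any resource agent; it assumes a Linux /proc.

```python
import os

# Suffix the kernel appends to the link target once the binary is unlinked.
DELETED_SUFFIX = " (deleted)"

def normalize_exe_target(target):
    """Split a /proc/<pid>/exe link target into (path, was_deleted)."""
    if target.endswith(DELETED_SUFFIX):
        return target[:-len(DELETED_SUFFIX)], True
    return target, False

def exe_of(pid):
    """Return (path, was_deleted) for pid, or None if the link is unreadable."""
    try:
        return normalize_exe_target(os.readlink("/proc/%d/exe" % pid))
    except OSError:
        # No such process, no permission, or no /proc at all.
        return None
```

Note that a binary whose real name ends in " (deleted)" would be
misread by this, so it is a heuristic, not a proof.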

For interpreted services, /proc/pid/exe links to the interpreter
(python, bash, java, whatever), not to the script.

Even if it is a "real" binary, (pid, /proc/pid/exe) is
still NOT unique for pid re-use after wrap around:
think different instances of mysql or whatever.
(yes, it gets increasingly unlikely...)

However, (pid, starttime) *is* unique (for the lifetime of the pidfile,
as long as that is stored on tmpfs, or is otherwise cleared after reboot).
(unless you tell me you can eat through pid_max, or at least the
currently unused pids, within the granularity of starttime...)

So that's why I propose to use the (pid, starttime) tuple.
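
A rough Python sketch of what such a check might look like, assuming
Linux, where starttime is field 22 of /proc/<pid>/stat (clock ticks
since boot, which is why the pidfile must not survive a reboot). The
function names and the "pid starttime" pidfile layout are assumptions
made for this example only, not an existing format.

```python
def parse_starttime(stat_line):
    """Extract starttime (field 22) from the contents of /proc/<pid>/stat."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain
    # spaces or ')', so split after the LAST ')' rather than on whitespace.
    _, _, rest = stat_line.rpartition(")")
    fields = rest.split()
    # 'rest' starts at field 3 (state), so field 22 is index 22 - 3 = 19.
    return int(fields[19])

def read_starttime(pid):
    """Return the starttime of pid, or None if it cannot be determined."""
    try:
        with open("/proc/%d/stat" % pid) as f:
            return parse_starttime(f.read())
    except (OSError, IndexError, ValueError):
        return None

def format_pidfile(pid, starttime):
    """Pidfile content in the hypothetical 'pid starttime' layout."""
    return "%d %d\n" % (pid, starttime)

def pidfile_matches(content, lookup=read_starttime):
    """True only if the recorded pid still has the recorded starttime."""
    pid_s, start_s = content.split()
    return lookup(int(pid_s)) == int(start_s)
```

A service would write format_pidfile(pid, read_starttime(pid)) at
startup; a later status or stop action treats the pidfile as stale
whenever pidfile_matches() returns False, which covers both "process
gone" and "pid re-used by something else".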

If you see problems with (pid, starttime), please speak up.
If you have something *better*, please speak up.
If you just have something "different",
feel free to tell us anyways :-)

        Lars

_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
