I started off with what I thought was a simple question, but googling,
searching mailing list archives, reading man pages, and testing haven't
turned up anything I'm happy with and have raised some new issues...

In a past life, on a non-Unix system, I was able to set up simple and
effective mutual exclusion in the equivalent of shell scripts by
opening a file for write access (which created an exclusive lock on the
file) at the start of the protected section and not closing it until
the end of that section.  This had no race conditions and had no
problem of stale locks since the lock was automatically released if the
process holding it terminated abnormally.

My original question was "What is the equivalent idiom for OpenBSD
shell scripts, or is there none?"

The best approximation I've found so far is (assuming that the details
of the semantics of "ln" and "kill -0" under OpenBSD's /bin/sh are as
the author expects; I haven't yet checked this):

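my_lockfile ()
{
        TEMPFILE="$1.$$"
        LOCKFILE="$1.lock"
        # Write our PID to a private temp file; fail if we can't create it.
        { echo $$ > "$TEMPFILE"; } 2> /dev/null || {
                echo "You don't have permission to access $(dirname "$TEMPFILE")"
                return 1
        }
        # ln fails if the lock file already exists, so this is the
        # atomic test-and-set step.
        ln "$TEMPFILE" "$LOCKFILE" > /dev/null 2>&1 && {
                rm -f "$TEMPFILE"
                return 0
        }
        # The lock exists; if the PID recorded in it is still alive, give up.
        kill -0 "$(cat "$LOCKFILE")" > /dev/null 2>&1 && {
                rm -f "$TEMPFILE"
                return 1
        }
        echo "Removing stale lock file"
        rm -f "$LOCKFILE"
        # Try once more; another process may win the race here.
        ln "$TEMPFILE" "$LOCKFILE" > /dev/null 2>&1 && {
                rm -f "$TEMPFILE"
                return 0
        }
        rm -f "$TEMPFILE"
        return 1
}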

but this is more complicated than I like and has the intrinsic problem
that one can't be sure of detecting a stale lock file (the process
creating the lock file may have died and a new process with the same
process id been created; this seems rather unlikely in practice but
AFAIK is definitely possible).
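
To make the limitation concrete, here is a minimal sketch of the
"kill -0" liveness test on its own, assuming a POSIX-style /bin/sh.
The helper name pid_alive is made up for illustration.  The test can
only say whether *some* process with that number currently exists (and
that we are allowed to signal it), not whether it is the process that
created the lock.

```shell
#!/bin/sh
# pid_alive is a hypothetical helper: succeeds if a process with the
# given PID exists and we have permission to signal it.
pid_alive ()
{
        kill -0 "$1" 2> /dev/null
}

# Start a throwaway background process to test against.
sleep 30 &
child=$!

if pid_alive "$child"; then alive_before=yes; else alive_before=no; fi

# Terminate it and reap it so the PID is really gone.
kill "$child"
wait "$child" 2> /dev/null

if pid_alive "$child"; then alive_after=yes; else alive_after=no; fi

echo "before=$alive_before after=$alive_after"
```

If an unrelated process were later handed the same PID, pid_alive would
succeed again and the stale lock would be taken for a live one.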

It also, at least under OpenBSD, has the serious problem that "$$"
isn't the PID of the shell running the script but rather the PID of the
"original" shell (whatever exactly that means; some testing suggests
that it's the last process on the PPID chain which is still in this
process group) and I haven't yet found any straightforward way of
getting the PID of the "bottom-level" shell, which is what is needed
for the stale-lock testing to work at all when the exclusion needed is
among scripts run in subshells of the same shell.  (I realize that I
could create a trivial program which writes its PPID to stdout, or hack
/bin/sh to add a new variable which contains the PID I want -- but I'd
prefer to use the tools which come as part of the base system.  This
has also left me rather curious as to *why* the PID and PPID of the
"original" shell are easily accessible in scripts but those of the
subshell actually running the script aren't.)
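
A small sketch of the behaviour I'm describing, assuming a POSIX-style
/bin/sh (the variable names are only for illustration): "$$" expanded
inside a subshell gives the same value as in the shell that started it,
so two subshells of one shell would write the same PID into their lock
files.

```shell
#!/bin/sh
# PID as seen by the shell running this script.
outer=$$

# Command substitution runs its body in a subshell, yet $$ still
# expands to the PID of the original shell.
inner=$(echo $$)

# Same again with an explicit ( ... ) subshell inside the substitution.
nested=$( (echo $$) )

echo "outer=$outer inner=$inner nested=$nested"
```

All three values come out identical, which is why the PID stored in the
lock file doesn't identify the subshell that actually took the lock.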

Another obvious possibility is to use something other than a shell
script (probably perl, which I strongly suspect is capable of doing
this), but I'm not at all sure it makes sense to stop and learn yet
another language *right* *now*.  If this *is* the way to go,
recommendations as to the "best" language for general sysadmin-type
scripting would be appreciated.

Thanks in advance for any advice,

        Dave

-- 
Dave Anderson
<[EMAIL PROTECTED]>
