> If you're not reading/writing the file (via with-open-file or open or
> whatever) then why lock it?  If you *are* reading/writing the file,
> why not use that file descriptor?

The requirements are a bit unusual: we have two different applications
(two distinct processes) launched at different times by cron (this is
running on a Linux server, btw).

Those two processes read from and write to the same set of files.

Even though they've been scheduled to run at different times (to avoid
running on top of each other), it *is* possible (though unlikely, given
the schedule) that those two processes will try to access the same file
at the same time.

> Your description makes me think you've implemented an interface
> conceptually similar to this:
>
>   (defun lock (fname)
>     (let ((fd (open ...)))
>       (when fd
>         (fcntl fd LOCK))))
>
>   (defun unlock (fname)
>     (let ((fd (open ...)))
>       (when fd
>         (fcntl fd UNLOCK))))
>
> and that just won't work.


You're right, the moment you exit either of those functions, the lock is gone.
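
For anyone following along, here is a small C sketch of one POSIX rule that bites this kind of interface: a process loses all of its fcntl() locks on a file as soon as it closes *any* descriptor for that file. The function names (try_write_lock, other_process_can_lock, lock_lost_after_close) are made up for this illustration and are not part of our code. It assumes the file already exists and is writable.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Try to take an exclusive lock on the whole file without blocking.
   Returns 0 on success, -1 if the lock could not be taken. */
static int try_write_lock(int fd)
{
    struct flock fl = {0};
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "through end of file" */
    return fcntl(fd, F_SETLK, &fl);
}

/* Fork a child and let it try to lock the file.
   Returns 1 if the child got the lock, 0 if it was refused, -1 on error. */
static int other_process_can_lock(const char *path)
{
    int status, code;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int fd = open(path, O_RDWR);
        if (fd < 0)
            _exit(2);
        _exit(try_write_lock(fd) == 0 ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    code = WEXITSTATUS(status);
    return code == 0 ? 1 : (code == 1 ? 0 : -1);
}

/* Lock the file through one descriptor, then open and close a second one.
   Returns 1 if that close dropped the lock, 0 if the lock survived,
   -1 on error. */
int lock_lost_after_close(const char *path)
{
    int held, after, fd2;
    int fd1 = open(path, O_RDWR);
    if (fd1 < 0)
        return -1;
    if (try_write_lock(fd1) != 0) {
        close(fd1);
        return -1;
    }
    held = other_process_can_lock(path);   /* lock held: child is refused */
    fd2 = open(path, O_RDWR);
    if (fd2 >= 0)
        close(fd2);                        /* this close releases the lock */
    after = other_process_can_lock(path);  /* child can now take the lock */
    close(fd1);
    if (held != 0 || after < 0)
        return -1;
    return after;
}
```

The practical consequence is that whatever holds the lock should avoid closing another descriptor or stream on the same file while the lock is still needed.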

Here's how we did it instead -- this is still just the first attempt,
subject to change, but it's passed all the tests we've done so far:

(defun with-file-lock (file-pathname fn &rest args)
  "Perform file I/O on file-pathname (i.e. apply fn to args) while
holding a POSIX fcntl() lock on file-pathname, to prevent simultaneous
writes/race conditions.  Retries every 0.2 seconds until the lock is
acquired, then returns whatever fn returns."
  (loop
    ;; lockfile returns a file descriptor, or -1 if we couldn't get the lock.
    (let ((fd (lockfile (namestring file-pathname))))
      (when (> fd -1)
        (return
          ;; Release the lock even if fn exits non-locally.
          (unwind-protect
               (apply fn args)
            (unlockfile fd)))))
    ;; No lock yet: pause and try again.
    (sleep 0.2)))

So basically, we attempt to place a lock on the file designated by the
file-pathname object, but if we can't (because the other process has
it), we pause and try again.

Once we do get a lock, and the fd is valid, we apply fn (whatever
read/write action we need to do on the file), remove the lock, and
return.

Note that (lockfile) and (unlockfile) correspond to the fcntl()
wrapper functions in lockfile.o, as defined by (alien:def-alien-routine)
after we load the object file using (alien:load-foreign).
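
For illustration only, here is roughly what a pair of C wrappers with that interface might look like. This is a sketch, not the actual contents of lockfile.o: the signatures are inferred from how the Lisp side calls them (lockfile takes a path and returns a descriptor or -1, unlockfile takes that descriptor), and the whole_file helper, the exclusive whole-file lock, and opening with O_RDWR are all assumptions.

```c
#include <fcntl.h>
#include <unistd.h>

/* Fill in a struct flock that covers the whole file. */
static void whole_file(struct flock *fl, short type)
{
    fl->l_type = type;          /* F_WRLCK or F_UNLCK */
    fl->l_whence = SEEK_SET;
    fl->l_start = 0;
    fl->l_len = 0;              /* 0 means "through end of file" */
}

/* Open path and try to take an exclusive lock without blocking.
   Returns the open descriptor on success, -1 if the file could not be
   opened or another process already holds a conflicting lock. */
int lockfile(const char *path)
{
    struct flock fl = {0};
    int fd = open(path, O_RDWR);    /* a write lock needs a writable fd */
    if (fd < 0)
        return -1;
    whole_file(&fl, F_WRLCK);
    if (fcntl(fd, F_SETLK, &fl) == -1) {
        close(fd);                  /* no lock taken, so just clean up */
        return -1;
    }
    return fd;
}

/* Release the lock taken by lockfile and close the descriptor.
   Returns 0 on success, -1 if either step failed. */
int unlockfile(int fd)
{
    struct flock fl = {0};
    int rc;
    whole_file(&fl, F_UNLCK);
    rc = fcntl(fd, F_SETLK, &fl);
    if (close(fd) == -1)
        rc = -1;
    return rc;
}
```

Using F_SETLK rather than F_SETLKW makes the call return immediately when the lock is taken, which is what lets the Lisp side sleep and retry instead of blocking inside the foreign call.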

So that's a long way of saying: I think we're doing everything
correctly now, but I know how tricky these things can be (hence the
long battery of tests we're putting it through), so if you have any
comments or suggestions, please let me know.
