On Mon, Aug 17, 2009 at 01:24:30PM -0700, Udo Grabowski wrote:
> We see this problem during fast write access to small files (a couple
> of jobs on different files on the same filesystem) through the
> automounter (layout and options as described above).  Files sometimes
> do exist before and sometimes not, so no clue there.

It's a race condition.  If the mirror mount hasn't been done yet, or if
it has been unmounted (due to not being in use for a while), and two or
more threads step into the mount, then you have a race, and only one
thread will win; the others get EBUSY.

This is trivial to reproduce when running "dmake" in an ONNV source
directory in an NFS-mounted workspace with a separate "proto" area
filesystem.  Yes, that's ONNV engineer speak.  There are other ways to
trigger it and, IIRC, I provided enough details about this a long time
ago.

If any process holds a reference to a file or directory in that
mirror-mounted filesystem, then that filesystem will not be unmounted,
and then the EBUSY won't happen (since no threads can be racing to mount
the thing that's already mounted).  So that's your workaround: keep a
reference to every affected mirror mount open so that they don't get
unmounted.  (But I've not tested this workaround, or at least not
recently.)
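
A minimal sketch of that workaround, under the same caveat that it is
untested against the actual bug: a small user-space helper that opens each
mirror-mounted directory and holds the descriptor for as long as the jobs
run.  The function names and any paths you pass in are hypothetical, not
part of any existing tool.

```python
import os


def pin_mounts(paths):
    """Open each directory and return {path: fd}.

    Holding an open descriptor keeps a reference on the filesystem,
    which (per the reasoning above) should stop it being unmounted.
    """
    fds = {}
    try:
        for path in paths:
            fds[path] = os.open(path, os.O_RDONLY)
    except OSError:
        # Do not leak descriptors opened before the failure.
        unpin_mounts(fds)
        raise
    return fds


def unpin_mounts(fds):
    """Close the descriptors returned by pin_mounts()."""
    for fd in fds.values():
        os.close(fd)
```

Call pin_mounts() with the affected mount points before starting the jobs
and unpin_mounts() once they have finished.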

This is a general bug in domount() in the kernel.  The automounter works
around it by retrying calls to domount() on failure (there's a bit more
to it, but you get the picture).  The fix here is to do more or less
what the automounter does.  Tom had a fix, but that fix introduced some
other bug (I forget the details), and since it was a low-priority bug
(in my case it was easy enough to live with it) it has languished.
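
Until the kernel is fixed, the same retry idea can be applied in the
affected jobs.  The following is only a user-space illustration of "retry
on EBUSY", not the domount() fix or the automounter's actual logic (which,
as noted, has a bit more to it); retry_on_ebusy and its parameters are
made-up names.

```python
import errno
import time


def retry_on_ebusy(op, attempts=5, delay=0.05):
    """Call op() and retry while it fails with EBUSY.

    Any other OSError, or EBUSY on the last attempt, is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return op()
        except OSError as e:
            if e.errno != errno.EBUSY or attempt == attempts - 1:
                raise
            # Give the thread that won the mount race time to finish.
            time.sleep(delay)
```

For example, a job could wrap its file creation as
retry_on_ebusy(lambda: open(path, "w")), where path is whatever file the
job writes.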

Nico
-- 
