Re: [PATCH] au_proc_getmntent: optimize reading, retry

hooanon05g Wed, 22 May 2019 21:16:04 -0700

Kirill Kolyshkin:
> Can you tell me what is your reason to not retry reading by default? The
> code
> has just checked that this is an aufs mount so it should definitely be
> present in
> /proc/mounts. Unless, of course, this mount was unmounted by someone in
> between statfs() and reading. If you have this exact case in mind (I can't
> think
> of anything else) and don't want to retry because of efficiency, you can add
> another statfs() to after reading /proc/mounts and not finding the mount --
> that way you can be sure that the mount is still there but it eluded the
> /proc/mounts.


Yes, such race was in my mind.
In other words, it is hard to identify the reason why /proc/mounts
doesn't show the entry.  The problem of /proc/mounts, or someone else
unmounted?  Additionally I guess(hope) such parallel mount/unmounts are
rare.  And I wonder "2" is the absolute correct solution?  "3" cannot be
happen? Never?

Statfs(), you say, won't help I am afraid.  Even if it tells us that the
dir is aufs, it is not the proof of the aufs mountpoint.  It can be a
subdir of another aufs mount.
An extra stat(2) call may help in this point.  It will tell us the inode
number, and if it is AUFS_ROOT_INO, then that path is the aufs mountpoint.
But I wonder do we really have to issue stat(2) and statfs(2) just to
make sure the aufs mount is still there?  Isn't it rather heavy and racy?


> I have also took a deeper look at that other error I mentioned earlier.
> Found out
> it's a race in au_xino_create(). In case xino mount option is provided (we
> use
> xino=/dev/shm/aufs.xino in Docker), and multiple mounts (note: for different
> mount points) are performed in parallel, one mount can reach
>
> >  file = vfsub_filp_open(fpath, O_RDWR | O_CREAT | O_EXCL | O_LARGEFILE,
> 0666);
>
> line of code, while another mount already created that file, but haven't
> unlinked it yet.
>
> As a result, we have an error like these in the kernel log:
>
> [2233986.956753] aufs au_xino_create:767:dockerd[17144]: open
> /dev/shm/aufs.xino(-17)
> [2233988.732636] aufs au_xino_create:767:dockerd[17518]: open
> /dev/shm/aufs.xino(-17)

Thank you very much for the report.
Here -17 means EEXIST "File exists" error.
It is an expected behaviour (and I am glad that I know it is working
expectedly).  As you might know, the default path of XINO files are the
top dir of the first writable branch, and a writable branch is not
sharable between the multiple aufs mounts.  So by default XINO files are
dedicated to a single aufs mount. Not shared, no confliction happens.


> Currently, I am working around this unfortunate issue by calling mount(2)
> under
> an exclusive lock, to make sure no two aufs mounts (again, for different
> mount
> points) are performed in parallel, but perhaps there is a better way?
>
> I am going to mitigate this race by adding a random suffix to xino file
> name; do you think
> it is a decent workaround?

If your first writable branch is somewhere on /dev/shm, then you can
remove "xino=" option.  In this case, the XINO files will be created
under /dev/shm and not shared.  Moreover "xino=" option is something
like a last resort generally.  As long as the filesystem of your first
writable branch doesn't support XINO, or you want a little gain around
the aufs internal XINO handling, you may want "xino=".  Otherwise you
can omit it.
Of course adding a random/unique suffix is a good idea.  If I were you,
I'd use $$ in shell script manner such like
        mount -t aufs -obr=...,xino=/dev/shm/aufs.xino.$$ ...


J. R. Okajima

Re: [PATCH] au_proc_getmntent: optimize reading, retry

Reply via email to