Kirill Kolyshkin:
> Can you tell me what is your reason to not retry reading by default?
> The code has just checked that this is an aufs mount so it should
> definitely be present in /proc/mounts. Unless, of course, this mount
> was unmounted by someone in between statfs() and reading. If you have
> this exact case in mind (I can't think of anything else) and don't
> want to retry because of efficiency, you can add another statfs() to
> after reading /proc/mounts and not finding the mount -- that way you
> can be sure that the mount is still there but it eluded the
> /proc/mounts.

Yes, that race is what I had in mind. In other words, it is hard to
identify why /proc/mounts doesn't show the entry: is it a problem in
/proc/mounts, or did someone else unmount it? Additionally, I guess
(hope) such parallel mounts/unmounts are rare. And I wonder whether "2"
is the absolutely correct solution. Can "3" never happen? Never?

I am afraid the statfs() you suggest won't help. Even if it tells us
that the dir is on aufs, that is not proof that the dir is an aufs
mountpoint. It can be a subdir of another aufs mount. An extra stat(2)
call may help on this point. It will tell us the inode number, and if it
is AUFS_ROOT_INO, then that path is the aufs mountpoint. But I wonder, do
we really have to issue both stat(2) and statfs(2) just to make sure the
aufs mount is still there? Isn't that rather heavy, and still racy?

> I have also took a deeper look at that other error I mentioned
> earlier. Found out it's a race in au_xino_create(). In case xino mount
> option is provided (we use xino=/dev/shm/aufs.xino in Docker), and
> multiple mounts (note: for different mount points) are performed in
> parallel, one mount can reach
>
>     file = vfsub_filp_open(fpath, O_RDWR | O_CREAT | O_EXCL | O_LARGEFILE,
>                            0666);
>
> line of code, while another mount already created that file, but
> haven't unlinked it yet.
>
> As a result, we have an error like these in the kernel log:
>
> [2233986.956753] aufs au_xino_create:767:dockerd[17144]: open /dev/shm/aufs.xino(-17)
> [2233988.732636] aufs au_xino_create:767:dockerd[17518]: open /dev/shm/aufs.xino(-17)

Thank you very much for the report.
Here -17 means the EEXIST ("File exists") error. This is expected
behaviour (and I am glad to know that it is working as expected). As you
might know, the default location of the XINO files is the top dir of the
first writable branch, and a writable branch cannot be shared between
multiple aufs mounts. So by default the XINO files are dedicated to a
single aufs mount. They are not shared, so no conflict happens.

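As a rough userspace illustration of where the -17 comes from (this is not the aufs kernel code), the sketch below opens the same path twice with O_CREAT | O_EXCL. The second attempt fails with EEXIST because the file from the first attempt has not been unlinked yet. The helper names `create_exclusive` and `demo` and the temporary directory are assumptions made for the example only.

```python
import errno
import os
import tempfile


def create_exclusive(path):
    """Create path with O_CREAT | O_EXCL; return 0 or a negative errno."""
    try:
        fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o666)
    except FileExistsError as exc:
        # Mirror the kernel log style, where errors appear as -errno.
        return -exc.errno
    os.close(fd)
    return 0


def demo():
    """Two creators using one shared path, as with a shared xino= value."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for /dev/shm/aufs.xino.
        shared = os.path.join(tmp, "aufs.xino")
        first = create_exclusive(shared)   # creates the file
        second = create_exclusive(shared)  # file still exists
        return first, second
```

Calling `demo()` gives 0 for the first creator and `-errno.EEXIST` for the second, which matches the error code in the log lines quoted above.
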
> Currently, I am working around this unfortunate issue by calling
> mount(2) under an exclusive lock, to make sure no two aufs mounts
> (again, for different mount points) are performed in parallel, but
> perhaps there is a better way?
>
> I am going to mitigate this race by adding a random suffix to xino
> file name; do you think it is a decent workaround?

If your first writable branch is somewhere on /dev/shm, then you can
remove the "xino=" option. In this case, the XINO files will be created
under /dev/shm and will not be shared.
Moreover, the "xino=" option is generally something like a last resort.
If the filesystem of your first writable branch doesn't support XINO
files, or if you want a small gain in the aufs internal XINO handling,
you may want "xino=". Otherwise you can omit it.

Of course, adding a random/unique suffix is a good idea. If I were you,
I'd use $$ in the shell script manner, such as

    mount -t aufs -obr=...,xino=/dev/shm/aufs.xino.$$ ...


J. R. Okajima
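
For a caller that is not a shell script, the same idea as the $$ suffix can be sketched as below. The function name `xino_option` and its parameters are hypothetical, and only the "xino=" part of the option string is built; the branch list is left to the caller. Note that a process ID is unique per process only, so a single daemon issuing several mounts in parallel would presumably need a per-mount suffix instead (for example a random one, as proposed in the quoted message).

```python
import os


def xino_option(base="/dev/shm/aufs.xino", suffix=None):
    """Build an "xino=" mount option with a unique suffix.

    By default the suffix is the current process ID, like $$ in a shell.
    Pass an explicit suffix (e.g. a random string) for per-mount uniqueness.
    """
    if suffix is None:
        suffix = os.getpid()
    return f"xino={base}.{suffix}"


# Example: a fixed suffix makes the result easy to see.
print(xino_option(suffix=1234))
```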