On Wed, Jun 16, 2021 at 11:06:01AM PDT, Matthew Dillon wrote:
> If you are not seeing any actual I/O errors in the dmesg output, then there
> is probably no issue with the filesystem.
>
> The dotdot warnings might be some edge-case being caused by the null-mounts
> (because the null-mount has a mount point, but its being mounted on top of
> a sub-directory in an underlying filesystem). If you can track down the
> operation that is causing message, it might just wind up being a patch to
> the kernel to get rid of the console warning for that particular case. If
> you can find a simple configuration that I can throw onto a test box to get
> the same error, I can track down the issue and fix it.
>
> -Matt

Thanks for looking into this. Unfortunately the dotdot warnings *are*
coinciding with I/O errors - just on the client side, not on the server.
They seem to be causing stat() to fail on the client, not just on
directories but on regular files and symlinks as well (although doing
that probably causes a lookupdotdot on the containing directory for all
I know).

Typically, my shell will inform me of I/O errors when attempting to
check my local email in /var/mail, which is an NFS mount (yeah, I know
we don't have NFS file locking on the client side and I shouldn't be
doing it this way). I've also seen find(1) suddenly begin to throw I/O
errors on literally *every* node it encounters on my NFS mounts.
Statting anything in that hierarchy will continue to fail if
reattempted.

Curiously, ssh'ing into the server and doing something like
"find $exported_mount >/dev/null" and just letting it sit there and
traverse the PFS on the server side seems to re-enable client access to
the affected hierarchy.

FWIW, the server is running GENERIC at v6.0.0.5.g53d41-RELEASE and
affected PFS's range from a dozen-odd files of less than a meg total, to
millions of files totalling a few terabytes. Clients I've seen this
happen on include FreeBSD 13, DragonFly DEVELOPMENT, and Linux 5.4.

I'll try to monitor the circumstances in which these errors occur and
see if I can find any correlation between them. So far, it just seems
completely random to me. If there's any other info I can provide, I'd be
happy to oblige.

--
A Dog
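
A minimal sketch of one way the client-side monitoring mentioned above
could be done, assuming Python 3 is available on the client. The names
probe() and monitor() and the script name are hypothetical and not from
this thread. It only lstat()s a list of paths at a fixed interval and
logs a timestamped line with the errno name for each failure, so the
failures can later be lined up against the server's dmesg output.

```python
import errno
import os
import sys
import time


def probe(paths):
    """Return a list of (path, errno_name) for every path whose lstat() fails."""
    failures = []
    for path in paths:
        try:
            os.lstat(path)  # lstat, so a symlink itself is checked, not its target
        except OSError as exc:
            name = errno.errorcode.get(exc.errno, str(exc.errno))
            failures.append((path, name))
    return failures


def monitor(paths, interval=60.0, rounds=None, out=sys.stdout):
    """Probe paths every `interval` seconds; log one timestamped line per failure.

    rounds=None means run until interrupted.
    """
    done = 0
    while rounds is None or done < rounds:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        for path, name in probe(paths):
            print(f"{stamp} stat failed: {path}: {name}", file=out, flush=True)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)


if __name__ == "__main__":
    # Hypothetical usage: python3 nfsprobe.py /var/mail /some/nfs/path ...
    monitor(sys.argv[1:])
```

Redirecting the output to a file on local (non-NFS) storage keeps the
log itself from being affected when the mounts start failing.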
