>>>>> "JLT" == Jason L Tibbitts <ti...@math.uh.edu> writes:

JLT> Certainly a server reboot, or maybe even just
JLT> unmounting and remounting the filesystem or copying the data to
JLT> another filesystem would tell me that.  In any case, as soon as I
JLT> am able to mess with that server, I'll know more.

Rebooting the server did not make any difference, and now more users are
seeing the problem.  At this point I'm in a state where NFS simply isn't
reliable at all, and I'm not sure what to do.  If CentOS 8 were out,
I'd work on moving to that just so that the server was a little more
modern.  (Currently the server is CentOS 7.)  I guess I could try using
Fedora, or installing one of the upstream kernels, just in case this has
to do with some interaction between the client and the old RHEL7 kernel.

I do have a packet capture of a directory listing that fails with EIO,
but I'm not sure if it's safe to simply post it, and I'm not sure what
tshark options would be useful in decoding it.

I do know that I can rsync one of the problematic directories to a
different server (running the same kernel) and it doesn't have the same
problem.  What I'll try next is rsyncing to a different filesystem on
the same server, but again I'll have to wait until people log off to do
proper testing.
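
For comparing the original export against an rsynced copy, a small
script along these lines might help record which directories fail to
list.  This is only a rough sketch: the function name find_eio_dirs is
made up for illustration, and it assumes the failure shows up on the
client as an OSError with errno EIO when the directory is read.

```python
import errno
import os
import sys


def find_eio_dirs(root):
    """Walk root and return the directories whose listing raises EIO."""
    failures = []
    stack = [root]
    while stack:
        path = stack.pop()
        try:
            with os.scandir(path) as entries:
                for entry in entries:
                    # Queue real subdirectories; do not follow symlinks.
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError as exc:
            if exc.errno == errno.EIO:
                # The symptom of interest: the listing failed with EIO.
                failures.append(path)
            else:
                # Other errors (permissions, vanished paths) are only noted.
                print(f"skipped {path}: {exc}", file=sys.stderr)
    return sorted(failures)
```

Calling find_eio_dirs() on each mount point in turn (the paths are
whatever the client has mounted) and comparing the two returned lists
would show whether the same directories fail on both filesystems.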

 - J<
