"chas williams - CONTRACTOR" <[EMAIL PROTECTED]> writes: > this mountpoint wouldn't happen to be mounted multiple times would it? > this really upsets the linux fs stack.

Not in this case, no. However, I was accessing it through both the RO and
the RW paths.

> i could see a problem if you lookup the mountpoint first in different
> parent dir, then you move it from a different directory to a new
> directory, the mountpoint's parent is not going to be the directory its
> currently in. i cant seem to duplicate this though. i think
> check_bad_parent() should catch this (except for when the parent volid
> dont change).

> what afs version are you running?

1.4.0.

> can you be more specific about duplicating this problem?

This came up when I was cleaning tripwire reports. The way we do tripwire
is to have one AFS volume per machine that holds the machine configuration
and the current tripwire database, all of which are mounted in a single
replicated directory. I had been running tripwire on different machines
and copying new databases over into AFS, and in the process I ran across
various systems where the tripwire directory was mounted with the wrong
name (since the mount point has to match the hostname for the way that we
use tripwire).

Whenever I found a system where the mount point didn't match the hostname,
I'd switch to the read/write path and mv the mount point to the right
name, then release the volume. When I did that, I got this kernel BUG and
a segfault from mv. (Note that the AFS client had been running for some
time at this point, and I'd unloaded it and reloaded it to upgrade to a
new build at one point in the past.)

After that happened, anything else that touched the directory that holds
all the mount points would block in disk wait and I couldn't unload the
AFS kernel module. I rebooted the system, which cleared that up, and I
could work in that directory again. But then, I went back to doing the
same thing, and while the first two or three times I mv'd a mount point
everything was fine, the next time I got the segfault and the BUG again.
I was then very careful not to touch that directory with any other
process, and I have no processes in disk wait, but even though lsof
reports no processes with open files in AFS, I can't unload the kernel
module again (it has three references in lsmod).

Note that the directory I was working in has several hundred mount points,
a few symlinks, and no other files (and it's the only directory in its
volume). It's possible that the size of the directory may have something
to do with this, or switching from RO to RW accesses of the same volume.
The individual tripwire directories are not replicated, only the volume
that holds their mount points.

-- 
Russ Allbery ([EMAIL PROTECTED]) <http://www.eyrie.org/~eagle/>
_______________________________________________
OpenAFS-devel mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-devel
