Hi!

A. SUMMARY

Long story short: I have a file name on my zfs without a file behind it.
ls will include it in the directory listing, but stat-ing that file
results in an ENOENT error: "No such file or directory".

B. HISTORY

So how did I get into this situation? I recently had to kill the sending
side of an rsync, with the receiving side on FreeBSD. For reasons yet
unknown, the next run of rsync started deleting stuff it shouldn't.
Details on this are in PR 162318 [1], but quoting the most important
things:

> Logging into the receiving FreeBSD as root, I found that large parts
> of the user's home directory content had disappeared, even outside the
> subdirectory used as the rsync destination!
> - All the .* config files in the home directory were gone
> - The .ssh directory was still present, but its content was gone as
>   well
> - Both the home dir and the .ssh subdir contained a file
>   "rsync.%stat", which should be the name of an extattr instead, used
>   to implement the rsync --fake-super command line option.

[1] http://www.freebsd.org/cgi/query-pr.cgi?pr=162318

C. SYMPTOMS

I first assumed a problem in the binary rsync build for FreeBSD, but
devs on the above bug report favored RAM failure or an upstream source
code bug. So I gave it another try, and paid closer attention to the
error messages. Among them was the following:

> rsync: stat "/home/name/backup/etc/ca-certificates" failed: No such file or directory (2)

The strange thing is, this isn't specific to rsync at all. It can be
reproduced using simple command line tools like ls:

> # ls /home/name/backup/etc/ | grep ca-cert
> ca-certificates
> ca-certificates.conf
> ca-certificates.conf~
> # ls /home/name/backup/etc/ca-*
> ls: /home/name/backup/etc/ca-certificates: No such file or directory
> /home/name/backup/etc/ca-certificates.conf
> /home/name/backup/etc/ca-certificates.conf~

So as you see, the name is returned by readdir(3): both ls on the
directory and the shell's wildcard expansion find it.
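
To look for further entries of this kind in a directory, the same
comparison (names from the directory listing versus what can actually be
stat-ed) could be scripted. The following is only a rough sketch in
Python, not something taken from the PR; the function name
find_orphaned_names is made up for illustration, and the script assumes
that nothing else modifies the directory while it runs.

```python
import os
import sys


def find_orphaned_names(directory):
    """Return names that the directory listing reports but lstat() cannot find."""
    orphans = []
    # os.listdir() reads the directory entries, like readdir(3) does.
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        try:
            # lstat() instead of stat(), so that a dangling symlink
            # (which legitimately exists as an entry) is not reported.
            os.lstat(path)
        except FileNotFoundError:
            # ENOENT for a name the listing just returned. Caveat: a file
            # removed concurrently by another process would look the same.
            orphans.append(name)
    return sorted(orphans)


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for orphan in find_orphaned_names(target):
        print(orphan)
```

Run with a directory as its only argument, it prints one suspicious name
per line. On a healthy file system it should print nothing.
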
Anything that stat(2)s the file, however, will encounter an ENOENT
error. "zpool status" says everything's fine, so zfs isn't aware of any
corruption.

I believe that no matter what errors user space programs might make, the
kernel zfs driver should never allow the above to happen. Either a file
is there, or it isn't; there should be no such mixture. So what do you
think, is this likely to be a bug in the zfs implementation?

I found one other person describing problems like this: in threads
titled "file lose inode in Memory-Based file system.", lisen1001
described pretty much the same thing, except on a ramdisk on 8.2 instead
of my own hdd-based raidz on 9.0-RC1 [2,3].

[2] http://thread.gmane.org/gmane.os.freebsd.questions/280183
[3] http://thread.gmane.org/gmane.os.freebsd.devel.file-systems/13153

D. NEXT STEPS

As I'm new to FreeBSD, I'm not yet sure how bug reports are handled
around here. As I said, I've filed a bug report against rsync, and it
has been closed on the grounds that this appears to be an upstream
problem.

Would it make sense to include the above information in the bug report
for reference? Would replying to the gnats address be enough to
accomplish that?

Should the bug be reopened, as I assume all my problems to be related,
and as the zfs corruption at least is specific to FreeBSD? If so, how
does one reopen a report? Or who can do that?

Do you agree that this looks like a problem in the ZFS implementation?
Should I file a new problem report for that?

Can you suggest any way I could resolve the corruption on my local ZFS
pool, short of destroying and recreating the whole file system? "rm" for
the file doesn't work, as it, too, encounters the ENOENT. Is there any
tool to check or rebuild the inode data structures of zfs? "zpool scrub"
doesn't seem to fit the bill, as its manpage indicates a computation of
file content checksums.

Greetings,
 Martin von Gagern