On Wed, Apr 18, 2018 at 1:02 AM, Jayashree Mohan
<jayashree2...@gmail.com> wrote:
> Hi,
>
> A gentle reminder on the crash consistency bug that we found on btrfs:

Why do you call it a consistency bug?
The filesystem does not end up in an inconsistent state. The link count
stays 1 and the dentry used for fsync (foo) is persisted.
An inconsistency would be if we ended up with a link count of 2 and
only one of the dentries was persisted, or if we ended up with a link
count of 1 and both dentries were persisted. Those cases would be
detected by an fsck and would cause fs operations to fail unexpectedly
(attempting to remove a dentry, etc).

> Link count of a file is not persisted even after a fsync. We believe a
> filesystem that ensures strictly ordered metadata behaviour should be
> able to persist the hard link after a fsync on the original file.

The thing is there's no written specification about what's the
expected and correct behavior. Yes, I'm aware of Dave's explanation on
strictly ordered metadata on the other thread and transaction details,
but things on btrfs work very differently from xfs (I'm not saying
he's wrong).

For me it makes more sense to persist the hard link, but again,
there's a lack of a specification that demands such behaviour, and I'm
not aware of applications needing that behaviour nor user reports
about it.

> Could you comment on why btrfs does not exhibit this behavior, and if
> it's something you'd want to fix?

I don't know the "why", as I am not the original author of the log
tree (what is used to implement fsync on btrfs), so either it's
accidental behavior (the most likely) or intentional.

Since it's not something that causes any corruption, fs inconsistency,
crash, etc, nor are there user reports complaining about this (afaik),
for me it would be far from a priority at the moment as I'm trying to
fix corruptions and similar issues (not necessarily caused by fsync).
Plus, adding such behaviour would have to be done carefully to not
impact performance, as checking if the inode has multiple hard links
and which ones need to be persisted (created in the current,
uncommitted, transaction) may have a measurable impact.

The current behaviour is to only guarantee persisting the dentry used
for the fsync call.

>
> Thanks,
> Jayashree Mohan
>
>
>
> On Mon, Apr 16, 2018 at 9:35 AM, Jayashree Mohan
> <jayashree2...@gmail.com> wrote:
>> Hi,
>>
>> The following seems to be a crash consistency bug on btrfs, where in
>> the link count is not persisted even after a fsync on the original
>> file.
>>
>> Consider the following workload :
>> creat foo
>> link (foo, A/bar)
>> fsync(foo)
>> ---Crash---
>>
>> Now, on recovery we expect the metadata of foo to be persisted i.e
>> have a link count of 2. However in btrfs, the link count is 1 and file
>> A/bar is not persisted. The expected behaviour would be to persist the
>> dependencies of inode foo. That is to say, shouldn't fsync of foo
>> persist A/bar and correctly update the link count?
>> Note that ext4, xfs and f2fs recover to the correct link count of 2
>> for the above workload.
>>
>> Let us know what you think about this behavior.
>>
>> Thanks,
>> Jayashree Mohan
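
For reference, the quoted workload and the notion of inconsistency
used in the reply can be sketched in Python roughly as follows. This
is only an illustration under stated assumptions: the helper names
run_workload and is_consistent are hypothetical, the crash itself
cannot be reproduced from user space this way (a real test needs a
power cut or a crash-simulation layer), and the script only shows the
sequence of calls and the pre-crash link count.

```python
import os
import tempfile


def run_workload(root):
    """Run: creat foo; link(foo, A/bar); fsync(foo) under `root`.

    Returns foo's link count as seen before any crash. What survives a
    crash depends on the filesystem and is NOT determined here.
    """
    foo = os.path.join(root, "foo")
    bar = os.path.join(root, "A", "bar")
    os.mkdir(os.path.join(root, "A"))  # directory A is assumed to exist

    # creat foo
    fd = os.open(foo, os.O_CREAT | os.O_WRONLY, 0o644)
    try:
        # link(foo, A/bar)
        os.link(foo, bar)
        # fsync(foo): the fsync is issued through the dentry "foo"
        os.fsync(fd)
        return os.fstat(fd).st_nlink
    finally:
        os.close(fd)


def is_consistent(link_count, persisted_dentries):
    """Consistency in the sense of the reply (hypothetical helper):
    the recovered link count must equal the number of persisted
    dentries that point at the inode."""
    return link_count == len(persisted_dentries)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        print("link count before crash:", run_workload(root))
    # State reported for btrfs after recovery: nlink 1, only foo.
    print("btrfs outcome consistent:", is_consistent(1, ["foo"]))
    # State reported for ext4, xfs and f2fs: nlink 2, foo and A/bar.
    print("ext4/xfs/f2fs outcome consistent:",
          is_consistent(2, ["foo", "A/bar"]))
```

Both recovered states pass the check, which is the point made in the
reply: the two outcomes differ in what fsync persisted, not in whether
the filesystem is left consistent.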