On 2019/3/27 10:07 PM, Adam Borowski wrote:
> On Wed, Mar 27, 2019 at 05:46:50PM +0800, Qu Wenruo wrote:
>> This urgent patchset can be fetched from github:
>> https://github.com/adam900710/btrfs-progs/tree/flush_super
>> Which is based on v4.20.2.
>>
>> Before this patch, btrfs-progs writes to the fs has no barrier at all.
>> All metadata and superblock are just buffered write, no barrier between
>> super blocks and metadata writes at all.
>>
>> No wonder why even clear space cache can cause serious transid
>> corruption to the originally good fs.
>>
>> Please merge this fix as soon as possible as I really don't want to see
>> btrfs-progs corrupting any fs any more.
>
> How often does this happen in practice?

As long as some BUG_ON() triggers, it's highly possible that some transid
error will happen.

> I'm slightly incredulous about
> btrfs-progs crashing often.

We're making progress enhancing btrfs-progs, but just check the recent
mailing list: there is a report of clearing free space cache v1 causing a
transid error:
https://lore.kernel.org/linux-btrfs/c59ce3ee-b0cd-f195-9dfa-11abd362d...@gmx.com/

And in that case, clearing the cache made the transid problem more serious.

Adding to this, we still have a case where bad em caching could lead to a
BUG_ON (*). I think btrfs-progs is currently only safe for RO operations,
not heavy write operations.

*: The fix is already submitted:
https://patchwork.kernel.org/patch/10840313/

> Especially that pwrite() is buffered on the
> kernel side, so we'd need a _kernel_ crash (usually a power loss) to break
> consistency. Obviously, a potential data loss bug is always something that
> needs fixing, I'm just wondering about severity.

Oh, I see the point.

But there is one case that is still very concerning:
- Trans 1 gets committed, writing the following ems:
  em at 16K (fs root, gen = 1)
  em at 32K
  em at 48K

- Trans 2 gets committed:
  em at 64K (fs root, gen = 2)
  em at 80K

- Trans 3 gets half committed:
  em at 16K (fs root, gen = 3)

Only trans 2 gets its super block written to the kernel; trans 3 gets
aborted before writing its super block, for whatever reason.

In that case, the kernel will write:
  em at 16K (newer gen)
  em at 32K
  em at 48K
  em at 64K
  em at 80K
  sb at 4K (gen = 2)

Then sb 2 will point to the older fs root (gen = 1), but at that location
we have the fs root with gen = 3, leaving the fs unable to be mounted.

>
> Or do I understand this wrong?
>
> Asking because Dimitri John Ledkov stepped down as Debian's maintainer of
> this package, and I'm taking up the mantle (with Nicholas D Steeves being
> around) -- modulo any updates other than important bug fixes being on hold
> because of Debian's freeze.
> Thus, I wonder if this is important enough to
> ask for a freeze exception.

I can't help with packaging at all, as I'm an Arch user, just like a lot of
reporters here. (And "I'm using Arch" here is not a meme.)

Personally, I understand Debian has its policy, but for btrfs-progs, we
really like the upstream version.

Thanks,
Qu

>
>
> Meow!
>
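
For illustration only, here is a minimal sketch of the write ordering that a barrier between metadata and super block writes is meant to enforce, using the offsets from the three-transaction case in this message. It is not the code from the patchset: the `commit` helper, the byte payloads, and the use of a temporary file in place of a block device are all assumptions, and `os.pwrite` requires a Unix-like system.

```python
import os
import tempfile

K = 1024
SB_OFFSET = 4 * K  # "sb at 4K" from the example in this message


def commit(fd, blocks, super_bytes):
    """Hypothetical commit helper: tree blocks first, super block last."""
    for offset, data in blocks.items():
        os.pwrite(fd, data, offset)   # buffered write, no ordering guarantee yet
    os.fsync(fd)                      # barrier: blocks are flushed before the sb is written
    os.pwrite(fd, super_bytes, SB_OFFSET)
    os.fsync(fd)                      # barrier: sb is flushed before any later transaction writes


def demo():
    # A temporary file stands in for the device (assumption for the sketch).
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        # Trans 1: placeholder payloads, not real btrfs tree blocks.
        commit(fd,
               {16 * K: b"fs root gen=1", 32 * K: b"leaf", 48 * K: b"leaf"},
               b"sb gen=1")
        # Trans 2
        root2, sb2 = b"fs root gen=2", b"sb gen=2"
        commit(fd, {64 * K: root2, 80 * K: b"leaf"}, sb2)
        # Read back what the "device" now holds.
        return os.pread(fd, len(sb2), SB_OFFSET), os.pread(fd, len(root2), 64 * K)
```

With the two flushes in place, a super block of a given generation is only written after the blocks of that generation have been flushed, and blocks of a later transaction are only written after that super block has been flushed. Without them, the kernel is free to write back the buffered data in any order, which is the situation the example above describes.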