On 2019/3/27 10:39 PM, Qu Wenruo wrote:
>
>
> On 2019/3/27 10:07 PM, Adam Borowski wrote:
>> On Wed, Mar 27, 2019 at 05:46:50PM +0800, Qu Wenruo wrote:
>>> This urgent patchset can be fetched from github:
>>> https://github.com/adam900710/btrfs-progs/tree/flush_super
>>> Which is based on v4.20.2.
>>>
>>> Before this patch, btrfs-progs writes to the fs has no barrier at all.
>>> All metadata and superblock are just buffered write, no barrier between
>>> super blocks and metadata writes at all.
>>>
>>> No wonder why even clear space cache can cause serious transid
>>> corruption to the originally good fs.
>>>
>>> Please merge this fix as soon as possible as I really don't want to see
>>> btrfs-progs corrupting any fs any more.
>>
>> How often does this happen in practice?
>
> As long as some BUG_ON() triggers, it's highly possible some transid
> error will happen.
>
>> I'm slightly incredulous about
>> btrfs-progs crashing often.
> We're making progress enhancing btrfs-progs, but just check the recent
> mail list, there is a report of clear free space cache v1 causing
> transid error:
> https://lore.kernel.org/linux-btrfs/c59ce3ee-b0cd-f195-9dfa-11abd362d...@gmx.com/
>
> And that's clear cache making the transid problem more serious.
>
> Adding to this, we still have a case where bad cacheing em could lead to
> BUG_ON (*), I think btrfs-progs currently is only safe for RO operation,
> not heavy write operations.
>
> *: The fix is already submitted:
> https://patchwork.kernel.org/patch/10840313/
>
>
>> Especially that pwrite() is buffered on the
>> kernel side, so we'd need a _kernel_ crash (usually a power loss) to break
>> consistency.  Obviously, a potential data loss bug is always something that
>> needs fixing, I'm just wondering about severity.
>
> Oh, I see the point.
> But there is some case still very concerning:
>
> - Trans 1 get committed, write the following ems:
>   em at 16K (fs root, gen = 1)
>   em at 32K
>   em at 48K
>
> - trans 2 get committed
>   em at 64K (fs root, gen = 2)

Slightly wrong: in trans 2, the fs root is not updated.
So please discard this mail; I'll resend a better version.

Thanks,
Qu

>   em at 80K
>
> - trans 3 get half committed
>   em at 16K (fs root, gen = 3)
>
> only trans 2 get its super block written to kernel, trans 3 get aborted
> before writing super block due to whatever the reason is.
>
> And you can see in that case, kernel will write:
>   em at 16K (newer gen)
>   em at 32K
>   em at 48K
>   em at 64K
>   em at 80K
>   sb at 4K (gen = 2)
>
> Then sb 2 will points to older fs root (gen = 1), but at that location,
> we have fs root with gen = 3.
>
> Causing the fs unable to be mounted.
>
>>
>> Or do I understand this wrong?
>>
>> Asking because Dimitri John Ledkov stepped down as Debian's maintainer of
>> this package, and I'm taking up the mantle (with Nicholas D Steeves being
>> around) -- modulo any updates other than important bug fixes being on hold
>> because of Debian's freeze.  Thus, I wonder if this is important enough to
>> ask for a freeze exception.
>
> I can't help for packaging at all.
> As I'm an Arch user, just like a lot of reporters here. (And "I'm using
> Arch" here is not a meme).
>
> Personally I understand Debian has its policy, but really for
> btrfs-progs, we really like the upstream version.
>
> Thanks,
> Qu
>
>>
>>
>> Meow!
>>
>
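
For reference, the write ordering discussed in the quoted text (metadata first, a flush, then the super block, then another flush) can be sketched as below. This is only a minimal Python sketch under stated assumptions, not the btrfs-progs code and not the patchset itself: the names `commit_transaction`, `pwrite_all` and `SUPER_OFFSET` are hypothetical, and the 4K super block offset is simply the one used in the example above.

```python
import os

# Hypothetical offset, taken from the "sb at 4K" line in the example above.
SUPER_OFFSET = 4096


def pwrite_all(fd, data, offset):
    """Write all of `data` at `offset`, retrying on short writes."""
    view = memoryview(data)
    while len(view) > 0:
        written = os.pwrite(fd, view, offset)
        view = view[written:]
        offset += written


def commit_transaction(fd, metadata, super_block):
    """Write metadata, flush, then write the super block and flush again.

    metadata: dict mapping byte offset -> bytes for each tree block.
    """
    for offset, data in sorted(metadata.items()):
        pwrite_all(fd, data, offset)
    # First flush: every tree block must be stable on disk before any
    # super block that points to it is written.
    os.fsync(fd)
    pwrite_all(fd, super_block, SUPER_OFFSET)
    # Second flush: the transaction only counts as committed once the
    # super block itself is stable.
    os.fsync(fd)
```

Without the first `os.fsync()`, the kernel is free to write the super block back before the tree blocks it refers to, so an interruption in between can leave a super block pointing at blocks that never reached the disk.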