On 2019/3/27 10:07 PM, Adam Borowski wrote:
> On Wed, Mar 27, 2019 at 05:46:50PM +0800, Qu Wenruo wrote:
>> This urgent patchset can be fetched from github:
>> https://github.com/adam900710/btrfs-progs/tree/flush_super
>> Which is based on v4.20.2.
>>
>> Before this patch, btrfs-progs writes to the fs have no barrier at all.
>> All metadata and super block writes are just buffered writes, with no
>> barrier between super block and metadata writes at all.
>>
>> No wonder even clearing the space cache can cause serious transid
>> corruption on an originally good fs.
>>
>> Please merge this fix as soon as possible as I really don't want to see
>> btrfs-progs corrupting any fs any more.
> 
> How often does this happen in practice?

Whenever some BUG_ON() triggers, it's quite likely that a transid
error will follow.

>  I'm slightly incredulous about
> btrfs-progs crashing often.
We're making progress enhancing btrfs-progs, but just check the recent
mailing list traffic: there is a report of clearing the v1 free space
cache causing a transid error:
https://lore.kernel.org/linux-btrfs/c59ce3ee-b0cd-f195-9dfa-11abd362d...@gmx.com/

And that's a case of clearing the cache making the transid problem more serious.

On top of this, we still have a case where bad em caching could lead to
a BUG_ON (*). I think btrfs-progs is currently only safe for RO
operations, not heavy write operations.

*: The fix is already submitted:
https://patchwork.kernel.org/patch/10840313/


> Especially that pwrite() is buffered on the
> kernel side, so we'd need a _kernel_ crash (usually a power loss) to break
> consistency.  Obviously, a potential data loss bug is always something that
> needs fixing, I'm just wondering about severity.

Oh, I see the point.
But there is still one case that is very concerning:

- Trans 1 gets committed, writing the following ems:
  em at 16K (fs root, gen = 1)
  em at 32K
  em at 48K

- Trans 2 gets committed
  em at 64K (fs root, gen = 2)
  em at 80K

- Trans 3 gets half committed
  em at 16K (fs root, gen = 3)

Only trans 2 gets its super block written to the kernel; trans 3 gets
aborted before writing its super block, for whatever reason.

In that case, the kernel will write:
 em at 16K (newer gen)
 em at 32K
 em at 48K
 em at 64K
 em at 80K
 sb at 4K (gen = 2)

Then sb 2 will point to the older fs root (gen = 1), but at that location
we now have the fs root with gen = 3.

That leaves the fs unable to be mounted.
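
To illustrate what a barrier between metadata and super block writes
means, here is a minimal Python sketch. It is not the btrfs-progs code
(which is C) and not the actual patch; commit(), SUPER_OFFSET and the
block contents are made up for illustration, and os.fsync() stands in
for whatever flush the real fix uses:

```python
import os
import tempfile

# Hypothetical super block location, matching "sb at 4K" in the example above.
SUPER_OFFSET = 4 * 1024


def commit(fd, metadata, super_block):
    """Write metadata blocks, flush, then write the super block and flush again.

    metadata maps byte offsets to the bytes stored at that offset.
    """
    for offset, data in metadata.items():
        os.pwrite(fd, data, offset)  # buffered write, no ordering guarantee yet
    os.fsync(fd)  # barrier 1: metadata is flushed before the super block is written
    os.pwrite(fd, super_block, SUPER_OFFSET)
    os.fsync(fd)  # barrier 2: the super block itself is flushed


if __name__ == "__main__":
    # A temporary file stands in for the block device.
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        commit(fd, {16 * 1024: b"fs root gen=1", 32 * 1024: b"leaf"}, b"sb gen=1")
        print(os.pread(fd, len(b"sb gen=1"), SUPER_OFFSET))
```

The first fsync() is the important one: without it, nothing stops the
super block from reaching the device before the metadata it refers to.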

> 
> Or do I understand this wrong?
> 
> Asking because Dimitri John Ledkov stepped down as Debian's maintainer of
> this package, and I'm taking up the mantle (with Nicholas D Steeves being
> around) -- modulo any updates other than important bug fixes being on hold
> because of Debian's freeze.  Thus, I wonder if this is important enough to
> ask for a freeze exception.

I can't help with packaging at all, as I'm an Arch user, just like a lot
of reporters here. (And "I'm using Arch" is not a meme here.)

Personally, I understand Debian has its policy, but for btrfs-progs
we really prefer the upstream version.

Thanks,
Qu

> 
> 
> Meow!
> 
