Re: Massive filesystem corruption since kernel 5.2 (ARCH)

Filipe Manana Thu, 12 Sep 2019 07:28:53 -0700

On Thu, Sep 12, 2019 at 2:09 PM Christoph Anton Mitterer
<cales...@scientia.net> wrote:
>
> Hi.
>
> First, thanks for finding&fixing this :-)
>
>
> On Thu, 2019-09-12 at 08:50 +0100, Filipe Manana wrote:
> > 1) either a hang when committing a transaction, reported by several
> > users recently and hit it myself too twice when running fstests (test
> > case generic/475 and generic/561) after I upgradaded my development
> > branch from a 5.1.x kernel to a 5.3-rcX kernel. If this happens you
> > risk no corruption, still the hang is very inconvenient of course, as
> > you have to reboot.
>
> Okay inconvenient, but not so bad if there is no corruption risk.
>
>
> > 2) writeback for some btree nodes may never be started and we end up
> > committing a transaction without noticing that. This is really
> > serious
> > and that will lead to the "parent transid verify failed on ..."
> > messages.
>
> As some people have already pointed out, it will be infeasible for many
> end users to downgrade (no security updates) or manually patch (well,
> end-users).


Yes, but I can't do anything about that. I'm not skilled to build a
time machine to go back in time :)

>
> Can you elaborate under which circumstances this problem occurs,
> whether there are any intermediate workarounds, and whether it's always
> noticed (i.e. no silence corruption)?

It can happen whenever a transaction is being committed (or committing
the fsync log).
Every fs is at risk, unless it's always mounted in read-only and with
-o nologreplay.

A btree node/leaf (extent buffer) is dirty in memory, needs to be
written to disk, this always happens at transaction commit time,
but can also happen before that, if for some reason writeback on the
btree inode happens (due to reclaim, system under memory pressure,
etc).

If the writeback happens only at the transaction commit time, and if
one the node's pages is locked (not necessarily by btrfs,
it can happen everywhere in the memory management subsystem, page
migration for example), we ended up skipping the
writeback (start the process of writing what's in memory to disk) of a
node. This is case 2), the corruption with the error messages
"parent transid verify failed ..." in dmesg/syslog after mounting the
filesystem again.
This is very likely (as we can never rule out other bugs, be it in
btrfs or some other layer, or even hardware/firmware) what
Swâmi ran into, since he never had problems with 5.1 and older kernel
versions and has been using the same hardware for a long time.

For case 1), the hang, it happens if writeback happened before the
transaction commit as well. At transaction commit we trigger
writeback again for the same node(s), and here we hang because of the
previous attempt.
Two people reported the hang yesterday here on the list, plus at least
one more some weeks ago.
I hit it myself once last week and once 2 evenings ago with test cases
from fstests after changing my development branch from 5.1 to 5.3-rcX.

To hit any of the problems, sure, you still need to have some bad
luck, but it's impossible to tell how likely to run into it.
It depends on so many things, from workloads, system configuration, etc.
No matter how likely (and how likely will not be the same for
everyone), it's serious because if it happens you can get a corrupt
filesystem.

>
>
> Thanks,
> Chris.
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

Re: Massive filesystem corruption since kernel 5.2 (ARCH)

Reply via email to