On Wed, Sep 25, 2019 at 1:34 PM Pallissard, Matthew <m...@pallissard.net> wrote:
>
>
> Chris,
>
> Thank you for your reply.  Responses in-line.
>
> On 2019-09-25T13:08:34, Chris Murphy wrote:
> > On Wed, Sep 25, 2019 at 8:50 AM Pallissard, Matthew <m...@pallissard.net> 
> > wrote:
> > >
> > > Version:
> > > Kernel: 5.2.2-arch1-1-ARCH #1 SMP PREEMPT Sun Jul 21 19:18:34 UTC 2019 
> > > x86_64 GNU/Linux
> >
> > You need to upgrade to Arch kernel 5.2.14 or newer (they backported the fix 
> > first appearing in stable 5.2.15). Or you need to downgrade to the 5.1 series.
> > https://lore.kernel.org/linux-btrfs/20190911145542.1125-1-fdman...@kernel.org/T/#u
> >
> > That's a nasty bug. I don't offhand see evidence that you've hit this bug, 
> > but I'm not certain. So the first thing to do is use a different kernel.
>
> Interesting, I'll go ahead with a kernel upgrade as that's easy enough.
>
> However, that looks like it's related to a stack trace regarding a hung 
> process, which is not the original problem I had.
>
> Based on the output in my previous email, I've been working under the 
> assumption that there is a problem on-disk.  Is that not correct?

That bug does cause filesystem corruption that is not repairable.
Whether you have that problem or a different problem, I'm not sure.
But it's best to avoid combining problems.
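
For example, to confirm what you're running and get onto a fixed
kernel (assuming you're still on Arch; adjust for your setup):

    # show the running kernel version
    uname -r

    # full system upgrade, then reboot into the new kernel
    pacman -Syu

Do that before touching the file system any further.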

The file system mounts rw now? Or still only mounts ro?
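
A quick way to check (assuming the volume is mounted at /mnt; adjust
the path):

    findmnt -o TARGET,OPTIONS /mnt

The options column will show rw or ro.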

I think most of the errors reported by btrfs check, if they still
exist after doing a scrub, should be repaired by 'btrfs check
--repair', but I don't advise that until later. I'm not a developer;
maybe Qu can offer some advice on those errors.
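
If you want to see whether those errors persist without risking any
changes, a read-only check is safe (run it against an unmounted
device; /dev/sdX is a placeholder):

    btrfs check --readonly /dev/sdX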


> > Next, anytime there is a crash or power failure with Btrfs raid56, you need 
> > to do a complete scrub of the volume. Obviously it will take time, but that's 
> > what needs to be done first.
>
> I'm using raid 10, not 5 or 6.

Same advice, but it's not as important for raid10 because it doesn't
have the write hole problem.
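
Something like this should do it (assuming the volume is mounted at
/mnt; scrub runs against a mounted file system):

    # start the scrub in the background
    btrfs scrub start /mnt

    # re-run to watch progress and see error counts
    btrfs scrub status /mnt

Any corruptions found and fixed will also show up in dmesg.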


> > OK actually, before the scrub you need to confirm that each drive's SCT ERC 
> > time is *less* than the kernel's SCSI command timer. e.g.
>
> I gather that I should probably do this before any scrub, be it raid 5, 6, or 
> 10.  But is a scrub the operation I should attempt on this raid 10 array to 
> repair the specific errors mentioned in my previous email?
>

Definitely deal with the timing issue first. If by chance there are
bad sectors on any of the drives, they must be properly reported by
the drive with a discrete read error in order for Btrfs to do a proper
fixup. If the times are mismatched, Linux can give up waiting and do a
link reset on the drive before the drive ever reports the read error.
At that point the whole command queue is lost and the problem isn't
fixed.
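
To illustrate, for each member drive (/dev/sda and sda are
placeholders):

    # drive's error recovery timeout, in deciseconds
    smartctl -l scterc /dev/sda

    # kernel's SCSI command timer, in seconds
    cat /sys/block/sda/device/timeout

If the drive supports SCT ERC, set it below the 30 second kernel
default, e.g. 7 seconds:

    smartctl -l scterc,70,70 /dev/sda

If it doesn't support SCT ERC, raise the kernel timer well above the
drive's internal recovery time instead:

    echo 180 > /sys/block/sda/device/timeout

Note that neither setting persists across reboots.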

There are myriad errors, and the advice I'm giving to scrub first is a
safe step to make sure the storage stack is sane - or at least to find
where the simpler problems are - before moving on to the less simple
ones that carry higher risk. A scrub also changes the volume the
least. Everything else - balance, chunk recovery, and btrfs check
--repair - makes substantial changes to the file system and has a
higher risk of making things worse.

In theory, if the storage stack does exactly what Btrfs asks, then at
worst you should lose some data, but the file system itself should
remain consistent - and that includes power failures. The fact that
problems are reported suggests a bug somewhere: it could be Btrfs, it
could be device mapper, it could be controller or drive firmware.

--
Chris Murphy
