Re: errors found in extent allocation tree or chunk allocation after power failure

Pallissard, Matthew Wed, 25 Sep 2019 14:32:46 -0700

On 2019-09-25T15:05:44, Chris Murphy wrote:
> On Wed, Sep 25, 2019 at 1:34 PM Pallissard, Matthew <m...@pallissard.net> 
> wrote:
> > On 2019-09-25T13:08:34, Chris Murphy wrote:
> > > On Wed, Sep 25, 2019 at 8:50 AM Pallissard, Matthew <m...@pallissard.net> 
> > > wrote:
> > > >
> > > > Version:
> > > > Kernel: 5.2.2-arch1-1-ARCH #1 SMP PREEMPT Sun Jul 21 19:18:34 UTC 2019 
> > > > x86_64 GNU/Linux
> > >
> > > You need to upgrade to arch kernel 5.2.14 or newer (they backported the 
> > > fix first appearing in stable 5.2.15). Or you need to downgrade to 5.1 
> > > series.
> > > https://lore.kernel.org/linux-btrfs/20190911145542.1125-1-fdman...@kernel.org/T/#u
> > >
> > > That's a nasty bug. I don't offhand see evidence that you've hit this 
> > > bug. But I'm not certain. So first thing should be to use a different 
> > > kernel.
> >
> > Interesting, I'll go ahead with a kernel upgrade as that's easy enough.
> > However, that looks like it's related to a stacktrace regarding a hung 
> > process, which is not the original problem I had.
> > Based on the output in my previous email, I've been working under the 
> > assumption that there is a problem on-disk.  Is that not correct?
>
> That bug does cause filesystem corruption that is not repairable.
> Whether you have that problem or a different problem, I'm not sure.
> But it's best to avoid combining problems.
>
> The file system mounts rw now? Or still only mounts ro?


It mounts RW, but I have yet to attempt an actual write.


> I think most of the errors reported by btrfs check, if they still exist after 
> doing a scrub, should be repaired by 'btrfs check --repair' but I don't 
> advise that until later. I'm not a developer; maybe Qu can offer some advice 
> on those errors.


> > > Next, anytime there is a crash or power failure with Btrfs raid56, you need 
> > > to do a complete scrub of the volume. Obviously will take time but that's 
> > > what needs to be done first.
> >
> > I'm using raid 10, not 5 or 6.
>
> Same advice, but it's not as important to raid10 because it doesn't have the 
> write hole problem.


> > > OK actually, before the scrub you need to confirm that each drive's SCT 
> > > ERC time is *less* than the kernel's SCSI command timer. e.g.
> >
> > I gather that I should probably do this before any scrub, be it raid 5, 6, 
> > or 10.  But is a scrub the operation I should attempt on this raid 10 
> > array to repair the specific errors mentioned in my previous email?
>
> Definitely deal with the timing issue first. If by chance there are bad 
> sectors on any of the drives, they must be properly reported by the drive 
> with a discrete read error in order for Btrfs to do a proper fixup. If the 
> times are mismatched, then Linux can get tired waiting, and do a link reset 
> on the drive before the read error happens. And now the whole command queue 
> is lost and the problem isn't fixed.

Good to know, that seems like a critical piece of information.  A few searches 
turned up this page, https://wiki.debian.org/Btrfs#FAQ.

Should this be noted on the 'gotchas' or 'getting started' page as well?  I'd be 
happy to make edits should the powers that be allow it.
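
For anyone finding this thread later, the timing check described above can be
sketched roughly as below.  This is only an illustration in Python, not
something posted in the thread.  It assumes the drive's SCT ERC values are
taken from the text printed by `smartctl -l scterc /dev/sdX` (reported in
tenths of a second) and that the kernel's command timer is the value in
/sys/block/sdX/device/timeout (in seconds).  The function names are made up
for the example.

```python
import re
from pathlib import Path


def parse_scterc_seconds(smartctl_output):
    """Return (read_s, write_s) parsed from `smartctl -l scterc` text.

    Returns None when no numeric values are present, e.g. when ERC is
    reported as "Disabled" or the drive does not support the command.
    """
    values = {}
    for kind in ("Read", "Write"):
        # Matches lines such as "   Read:     70 (7.0 seconds)".
        m = re.search(rf"^\s*{kind}:\s+(\d+)\s+\(", smartctl_output, re.MULTILINE)
        if m is None:
            return None
        values[kind] = int(m.group(1)) / 10.0  # deciseconds -> seconds
    return values["Read"], values["Write"]


def read_kernel_timeout(device):
    """Read the SCSI command timer (seconds) for e.g. device='sda'."""
    return int(Path(f"/sys/block/{device}/device/timeout").read_text().strip())


def timing_is_sane(erc_seconds, kernel_timeout_seconds):
    """True only if the drive gives up before the kernel resets the link."""
    if erc_seconds is None:
        # Unknown or disabled ERC: the drive may retry longer than the timer.
        return False
    return max(erc_seconds) < kernel_timeout_seconds
```

In practice one would feed the smartctl output for each member drive to
parse_scterc_seconds() and compare it with read_kernel_timeout() for the same
device.  Treating a disabled or unsupported ERC as "not sane" is a conservative
choice made for this sketch.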


> There are myriad errors and the advice I'm giving to scrub is a safe first 
> step to make sure the storage stack is sane - or at least we know where the 
> simpler problems are. And then move to the less simple ones that have higher 
> risk.  It also changes the volume the least. Everything else, like balance 
> and chunk recover and btrfs check --repair - all make substantial changes to 
> the file system and have higher risk of making things worse.

This sounds sensible.
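
As a rough sketch of that first step, a scrub could be started from a small
Python wrapper like the one below.  This is an illustration only: the mount
point is a placeholder and the helper names are made up.  It relies on
`btrfs scrub start` with -B (stay in the foreground), -d (per-device
statistics) and optionally -r (read-only, report without repairing).

```python
import subprocess


def scrub_command(mount_point, read_only=False):
    """Build the argument list for a foreground scrub of mount_point."""
    cmd = ["btrfs", "scrub", "start", "-B", "-d"]
    if read_only:
        cmd.append("-r")  # only report errors, do not attempt repairs
    cmd.append(mount_point)
    return cmd


def run_scrub(mount_point, read_only=False):
    """Run the scrub (needs root) and return the exit status of btrfs."""
    result = subprocess.run(scrub_command(mount_point, read_only), check=False)
    return result.returncode
```

Calling run_scrub("/mnt/array") would block until the scrub finishes, where
/mnt/array stands in for wherever the volume is mounted.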


> In theory if the storage stack does exactly what Btrfs says, then at worst 
> you should lose some data, but the file system itself should be consistent. 
> And that includes power failures. The fact there are problems reported suggests 
> a bug somewhere - it could be Btrfs, it could be device mapper, it could be 
> controller or drive firmware.

I'll go ahead with a kernel upgrade and make sure the timing issues are squared 
away.  Then I'll kick off a scrub.

I'll report back when the scrub is complete or something interesting happens.  
Whichever comes first.
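
For the kernel part, a quick check along these lines should tell whether a
given Arch kernel falls in the range to avoid.  It is a sketch under
assumptions: the upper bound (fixed in Arch 5.2.14) comes from the advice
quoted above, while the lower bound of 5.2.0 is inferred from the suggestion
that the 5.1 series is safe.  The constant and function names are made up.

```python
import re

# Assumption: every 5.2.x Arch kernel before 5.2.14 is affected.
FIRST_AFFECTED = (5, 2, 0)
FIRST_FIXED_ARCH = (5, 2, 14)


def parse_release(release):
    """Turn a string like '5.2.2-arch1-1-ARCH' into (5, 2, 2)."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # A missing patch level (e.g. '5.2') is treated as 0.
    return tuple(int(part or 0) for part in m.groups())


def needs_kernel_change(release):
    """True if the release lies in the assumed affected range."""
    return FIRST_AFFECTED <= parse_release(release) < FIRST_FIXED_ARCH
```

Passing platform.release() from the standard library gives the running
kernel's release string in the same form as the examples here.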

Thanks again.


Matt Pallissard
