On 2019-09-27T17:01:27, Pallissard, Matthew wrote:
> 
> On 2019-09-25T14:32:31, Pallissard, Matthew wrote:
> > On 2019-09-25T15:05:44, Chris Murphy wrote:
> > > On Wed, Sep 25, 2019 at 1:34 PM Pallissard, Matthew <m...@pallissard.net> 
> > > wrote:
> > > > On 2019-09-25T13:08:34, Chris Murphy wrote:
> > > > > On Wed, Sep 25, 2019 at 8:50 AM Pallissard, Matthew 
> > > > > <m...@pallissard.net> wrote:
> > > > > >
> > > > > > Version:
> > > > > > Kernel: 5.2.2-arch1-1-ARCH #1 SMP PREEMPT Sun Jul 21 19:18:34 UTC 
> > > > > > 2019 x86_64 GNU/Linux
> > > > >
> > > > > You need to upgrade to arch kernel 5.2.14 or newer (they backported 
> > > > > the fix first appearing in stable 5.2.15). Or you need to downgrade 
> > > > > to 5.1 series.
> > > > > https://lore.kernel.org/linux-btrfs/20190911145542.1125-1-fdman...@kernel.org/T/#u
> > > > >
> > > > > That's a nasty bug. I don't offhand see evidence that you've hit this 
> > > > > bug. But I'm not certain. So first thing should be to use a different 
> > > > > kernel.
> > > >
> > > > Interesting, I'll go ahead with a kernel upgrade, as that's easy enough.
> > > > However, that looks like it's related to a stacktrace for a hung 
> > > > process, which is not the original problem I had.
> > > > Based on the output in my previous email, I've been working under the 
> > > > assumption that there is a problem on-disk.  Is that not correct?
> > >
> > > That bug does cause filesystem corruption that is not repairable.
> > > Whether you have that problem or a different problem, I'm not sure.
> > > But it's best to avoid combining problems.
> > >
> > > The file system mounts rw now? Or still only mounts ro?
> > 
> > It mounts RW, but I have yet to attempt an actual write.
> > 
> > 
> > > I think most of the errors reported by btrfs check, if they still exist 
> > > after doing a scrub, should be repaired by 'btrfs check --repair' but I 
> > > don't advise that until later. I'm not a developer, maybe Qu can offer 
> > > some advice on those errors.
> > 
> > 
> > > > > Next, any time there is a crash or power failure with Btrfs raid56, 
> > > > > you need to do a complete scrub of the volume. Obviously that will 
> > > > > take time, but it's what needs to be done first.
> > > >
> > > > I'm using raid 10, not 5 or 6.
> > >
> > > Same advice, but it's not as important for raid10 because it doesn't have 
> > > the write hole problem.
> > 
> > 
> > > > > OK actually, before the scrub you need to confirm that each drive's 
> > > > > SCT ERC time is *less* than the kernel's SCSI command timer. e.g.
> > > >
> > > > I gather that I should probably do this before any scrub, be it raid 5, 
> > > > 6, or 10.  But is a scrub the operation I should attempt on this raid 
> > > > 10 array to repair the specific errors mentioned in my previous email?
> > >
> > > Definitely deal with the timing issue first. If by chance there are bad 
> > > sectors on any of the drives, they must be properly reported by the drive 
> > > with a discrete read error in order for Btrfs to do a proper fixup. If 
> > > the times are mismatched, then Linux can get tired waiting, and do a link 
> > > reset on the drive before the read error happens. And now the whole 
> > > command queue is lost and the problem isn't fixed.
> > 
> > Good to know; that seems like a critical piece of information.  A few 
> > searches turned up this page: https://wiki.debian.org/Btrfs#FAQ.
> > 
> > Should this be noted on the 'gotchas' or 'getting started' pages as well?  
> > I'd be happy to make edits should the powers that be allow it.
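[The timing check discussed above can be sketched roughly as follows. The device name and sysfs path are examples, not taken from this thread; the usual trap is that smartctl reports SCT ERC in deciseconds while the kernel timer is in seconds.]

```shell
# Read the drive's SCT ERC setting (reported in deciseconds), e.g.:
#   smartctl -l scterc /dev/sda
# Read the kernel's SCSI command timer for that drive (in seconds):
#   cat /sys/block/sda/device/timeout

# Helper: ERC must be enabled (non-zero) and strictly below the kernel timer.
# erc_ds is in deciseconds (smartctl's unit), timer_s is in seconds.
erc_ok() {
  erc_ds=$1
  timer_s=$2
  [ "$erc_ds" -gt 0 ] && [ $((erc_ds / 10)) -lt "$timer_s" ]
}

if erc_ok 70 30; then   # 7.0 s ERC vs the default 30 s kernel timer
  echo "ok: the drive reports read errors before the kernel resets the link"
else
  # An ERC of 0 usually means "retry indefinitely"; raise the kernel
  # timer instead, e.g.:  echo 180 > /sys/block/sda/device/timeout
  echo "mismatch: adjust ERC or the kernel command timer"
fi
```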
> > 
> > 
> > > There are myriad errors and the advice I'm giving to scrub is a safe 
> > > first step to make sure the storage stack is sane - or at least we know 
> > > where the simpler problems are. And then move to the less simple ones 
> > > that have higher risk.  It also changes the volume the least. Everything 
> > > else, like balance and chunk recover and btrfs check --repair - all make 
> > > substantial changes to the file system and have higher risk of making 
> > > things worse.
> > 
> > This sounds sensible.
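[For reference, the safe first step being recommended amounts to the commands below; the mount point is a placeholder, and the flags are the stock btrfs-progs ones.]

```shell
# Sketch of the scrub step; /mnt/array stands in for the raid10 mount point.
MNT=/mnt/array

# -B keeps the scrub in the foreground so the exit status is meaningful;
# -d prints per-device statistics, handy for spotting a misbehaving drive.
scrub_cmd="btrfs scrub start -Bd $MNT"

# A backgrounded scrub (without -B) can be polled with:
status_cmd="btrfs scrub status $MNT"

echo "$scrub_cmd"
echo "$status_cmd"
```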
> > 
> > 
> > > In theory if the storage stack does exactly what Btrfs says, then at 
> > > worst you should lose some data, but the file system itself should be 
> > > consistent. And that includes power failures. The fact that problems are 
> > > reported suggests a bug somewhere - it could be Btrfs, it could be device 
> > > mapper, it could be controller or drive firmware.
> > 
> > I'll go ahead with a kernel upgrade/make sure the timing issues are squared 
> > away.  Then I'll kick off a scrub.
> > 
> > I'll report back when the scrub is complete or something interesting 
> > happens.  Whichever comes first.
> 
> As a follow-up:
> 1. Took care of the timing issues.
> 2. Ran a scrub.
> 3. Ran a balance; it kept failing with about 20% left.
>   - stacktraces in dmesg showed spinlock stuff
> 
> 4. Got I/O errors on one file during my final backup.
>   - post-backup hashsums of everything else checked out
>   - the errors during the copy were csum mismatches, should anyone care
> 
> 5. Ran a bunch of potentially disruptive btrfs check commands in alphabetical 
> order because "why not at this point?"
>   - they had zero effect as far as I can tell; all the same files were 
> readable, and the btrfs check errors looked identical (admittedly I didn't 
> put them side by side)
> 
> 6. Re-provisioned the array and restored from backups.
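[The post-backup hash verification in step 4 can be sketched as below. The directories here are throwaway stand-ins created for illustration, not the real array or backup target.]

```shell
# Stand-in source (the array) and destination (the backup target).
src=$(mktemp -d)
dst=$(mktemp -d)
manifest=$(mktemp)

echo "important data" > "$src/file.txt"
cp "$src/file.txt" "$dst/file.txt"

# Hash every file on the source into a manifest...
(cd "$src" && find . -type f -exec sha256sum {} + > "$manifest")

# ...then verify the copy against it. A csum mismatch on the source
# tends to surface as a read (I/O) error during the copy itself, while
# silent corruption of the copy would show up here as a hash mismatch.
(cd "$dst" && sha256sum -c "$manifest")
```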
> 
> As I thought about it, it may not have been an issue caused by the original 
> power outage.  I only ran a check after the power outage, so my array could 
> have had a problem from a previous bug; I was on a 5.2.x kernel for several 
> weeks under high load.  Anyway, there are enough unknowns to make a root 
> cause analysis not worth my time.
> 
> Marking this as unresolved for folks in the future who may be looking for 
> answers.
> 

Man, I should have read that over one more time for typos. Oh well.

Matt Pallissard
