> On Jun 16, 2015, at 7:58 PM, Duncan <1i5t5.dun...@cox.net> wrote:
>
> Vincent Olivier posted on Tue, 16 Jun 2015 09:34:29 -0400 as excerpted:
>
>>> On Jun 16, 2015, at 8:25 AM, Hugo Mills <h...@carfax.org.uk> wrote:
>>>
>>> On Tue, Jun 16, 2015 at 08:09:17AM -0400, Vincent Olivier wrote:
>>>>
>>>> My first question is this: is it normal to have “single” blocks?
>>>> Why not only RAID10? I don’t remember the exact mkfs options I used,
>>>> but I certainly didn’t ask for “single”, so this is unexpected.
>>>
>>> Yes. It's an artefact of the way that mkfs works. If you run a
>>> balance on those chunks, they'll go away. (btrfs balance start
>>> -dusage=0 -musage=0 /mountpoint)
>>
>> Thanks! I did and it did go away, except for the “GlobalReserve,
>> single: total=512.00MiB, used=0.00B”. But I suppose this is a
>> permanent fixture, right?
>
> Yes. GlobalReserve is for short-term btrfs-internal use, reserved for
> times when btrfs needs to (temporarily) allocate some space in order
> to free space, etc. It's always single, and you'll rarely see anything
> but 0 used, except perhaps in the middle of a balance or something.
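For the archives, the sequence I ran was essentially the following
(/mnt/pool is a placeholder for my mountpoint; substitute your own):

    # before: stray "single" chunks show up in the per-profile breakdown
    btrfs filesystem df /mnt/pool

    # usage=0 rewrites only completely empty chunks, so this is quick
    btrfs balance start -dusage=0 -musage=0 /mnt/pool

    # after: only RAID10 data/metadata plus the GlobalReserve line remain
    btrfs filesystem df /mnt/pool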
Got it, thanks. Is there any way to put the GlobalReserve on another
device, say an SSD? I am thinking of backing up this RAID10 to a 2x8TB
device-managed SMR RAID1, and I want to minimize random write
operations (noatime et al.). I will maybe start a new thread for that,
but first: is there anything substantial I can read about btrfs+SMR?
Or should I avoid SMR+btrfs altogether?

>>> For maintenance, I would suggest running a scrub regularly, to
>>> check for various forms of bitrot. Typical frequencies for a scrub
>>> are once a week or once a month -- opinions vary (as do runtimes).
>>
>> Yes. I cronned it weekly for now. Takes about 5 hours. Is it
>> automatically corrected on RAID10, since a copy exists within the
>> filesystem? What happens for RAID0?
>
> For raid10 (and the raid1 I use), yes, it's corrected from the other
> existing copy, assuming it's good, tho if there are metadata checksum
> errors, there may be corresponding unverified checksums as well, where
> the verification couldn't be done because the metadata containing the
> checksums was bad. Thus, if there are errors found and corrected, and
> you see unverified errors as well, rerun the scrub, so the newly
> corrected metadata can be used to verify the previously unverified
> errors.

OK then, rule of thumb: re-run the scrub on “unverified checksum
error(s)”. I have yet to see a checksum error, but I will keep it in
mind.
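In case it's useful to anyone, my weekly scrub is just a crontab entry
along these lines (the mountpoint and log path are specific to my
setup):

    # Sundays at 03:00: scrub the pool; -B keeps scrub in the
    # foreground so cron captures the exit status and the summary
    0 3 * * 0 /sbin/btrfs scrub start -B /mnt/pool >> /var/log/btrfs-scrub.log 2>&1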
> I'm presently getting a lot of experience with this, as one of the
> ssds in my raid1 is gradually failing and rewriting sectors.
> Generally what happens is that the ssd will take too long, triggering
> a SATA reset (30-second timeout), and btrfs will call that an error.
> The scrub then rewrites the bad copy on the unreliable device with
> the good copy from the more reliable device, with the write
> triggering a sector relocation on the bad device. The newly written
> copy then checks out good, but if it was metadata, it very likely
> contained checksums for several other blocks, which couldn't be
> verified because the block containing their checksums was itself bad.
> Typically I'll see dozens to a couple hundred unverified errors for
> every bad metadata block rewritten in this way. Rerunning the scrub
> then either verifies or fixes the previously unverified blocks, tho
> sometimes one of those in turn ends up bad, and if it's a metadata
> block, I may end up rerunning the scrub another time or two, until
> everything checks out.
>
> FWIW, on the bad device, smartctl -A reports (excerpted):
>
> ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
>   5 Reallocated_Sector_Ct   0x0032 098   098   036    Old_age  Always  -           259
> 182 Erase_Fail_Count_Total  0x0032 100   100   000    Old_age  Always  -           132
>
> While on the paired good device:
>
>   5 Reallocated_Sector_Ct   0x0032 253   253   036    Old_age  Always  -           0
> 182 Erase_Fail_Count_Total  0x0032 253   253   000    Old_age  Always  -           0
>
> Meanwhile, smartctl -H has already warned once that the device is
> failing, tho it went back to passing status again, but as of now it's
> saying failing, again. The attribute that actually registers as
> failing, again from the bad device followed by the good, is:
>
>   1 Raw_Read_Error_Rate     0x000f 001   001   006    Pre-fail Always  FAILING_NOW 3081
>
>   1 Raw_Read_Error_Rate     0x000f 160   159   006    Pre-fail Always  -           41
>
> When it's not actually reporting failing, the FAILING_NOW status is
> replaced with IN_THE_PAST.
>
> 250 Read_Error_Retry_Rate is the other attribute of interest, with
> values of 100 current and worst for both devices, threshold 0, but a
> raw value of 2488 for the good device and over 17,000,000 for the
> failing device. But with the "cooked" value never moving from 100 and
> with no real guidance on how to interpret the raw values, while it's
> interesting, I am left relying on the others for indicators I can
> actually understand.
>
> The 5 and 182 raw counts have been increasing gradually over time, and
> I scrub every time I do a major update, with another reallocated
> sector or two often appearing. But as long as the paired good device
> keeps its zero count and I have backups (as I do!), btrfs is actually
> allowing me to continue using the unreliable device, relying on btrfs
> checksums and scrubbing to keep it usable. And FWIW, I do have another
> device ready to go in when I decide I've had enough of this, but as
> long as I have backups and btrfs scrub keeps things fixed up, there's
> no real hurry unless I decide I'm tired of dealing with it. Meanwhile,
> I'm having a bit of morbid fun watching as it slowly decays, getting
> experience of the process in a reasonably controlled setting without
> serious danger to my data, since it is backed up.

You sure have morbid inclinations! ;-) Out of curiosity, what are the
frequency and sequence of your smartctl long/short tests plus btrfs
scrubs? Is it all automated?

> As for raid0 (and single), there's only one copy. Btrfs detects
> checksum failure as it does above, but since there's only the one
> copy, if it's bad, well, for data you simply can't access that file
> any longer. For metadata, you can no longer access whatever
> directories and files it referenced. (FWIW, for the truly desperate
> who hope that at least some of it can be recovered even if it's not a
> bit-perfect match, there's a btrfs command that wipes the checksum
> tree, which will let you access the previously bad-checksum files
> again, but it works on the entire filesystem, so it's all or nothing,
> and of course with known corruption, there are no guarantees.)

But is it possible to manually correct the corruption by overwriting
the corrupted files with a copy from a backup? I mean, is there enough
information reported to do that?

thanks!

v
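P.S. On that last question, what I have in mind is something like the
following, assuming (as I gather from the list archives) that the
kernel log's checksum-error messages include the affected path; the
paths below are placeholders for my setup:

    # scrub flags unrecoverable files in the kernel log; look for the
    # "(path: ...)" part of the BTRFS checksum-error messages
    dmesg | grep -i 'BTRFS.*checksum error'

    # then overwrite the damaged file with a known-good copy; the
    # rewrite generates fresh checksums for the new data
    cp -a /backup/some/file /mnt/pool/some/file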