Re: [OmniOS-discuss] (no subject)

2015-09-15 Thread Andy Fiddaman

On Mon, 14 Sep 2015, Paul B. Henson wrote:

; > From: Omen Wild
; > Sent: Monday, September 14, 2015 3:10 PM
; >
; > Mostly we are wondering how to clear the corruption off disk and worried
; > what else might be corrupt since the scrub turns up no issues.
;
; While looking into possible corruption from the recent L2 cache bug it seems
; that running 'zdb -bbccsv' is a good test for finding corruption as it looks
; at all of the blocks and verifies all of the checksums.

zpool scrub reports no errors, but I get lots of messages like this when I
run zdb -bbccsv:

zdb_blkptr_cb: Got error 50 reading <3077, 212, 0, 52>
DVA[0]=<0:14d528f8200:1ce00> [L0 ZFS plain file] fletcher4 lz4 LE
contiguous unique single size=2L/13200P birth=3708038L/3708038P fill=1
cksum=1717c7d38f62:374184e099ada9b:a86cf60db2f68605:2be4a1817f9f4b1d --
skipping

Is this an indicator of corruption in the pool?
It's going to be a right royal pain to rebuild them if I need to!

Thanks,

Andy
-- 
Citrus IT Limited | +44 (0)870 199 8000 | enquir...@citrus-it.co.uk
Rock House Farm | Green Moor | Wortley | Sheffield | S35 7DQ
Registered in England and Wales | Company number 4899123

___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] (no subject)

2015-09-15 Thread Paul B. Henson
> From: Andy Fiddaman
> Sent: Tuesday, September 15, 2015 1:41 AM
>
> zdb_blkptr_cb: Got error 50 reading <3077, 212, 0, 52>
> DVA[0]=<0:14d528f8200:1ce00> [L0 ZFS plain file] fletcher4 lz4 LE
> contiguous unique single size=2L/13200P birth=3708038L/3708038P fill=1
> cksum=1717c7d38f62:374184e099ada9b:a86cf60db2f68605:2be4a1817f9f4b1d
> --
> skipping
> 
> Is this an indicator of corruption in the pool?
> It's going to be a right royal pain to rebuild them if I need to!

That certainly doesn't look good :(. I'd recommend posting this output on
the zfs mailing list and asking for feedback.
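
When there are lots of these messages, it may help to summarise them before posting. Below is a minimal Python sketch that pulls the error number and the four bracketed values out of each "Got error" line. The field names (objset, object, level, blkid) are an assumption based on how zdb bookmarks are usually described, not something stated in this thread, and the last value is kept as text because it may be printed in hexadecimal.

```python
import re
from collections import Counter

# Line shape taken from the message quoted above:
#   zdb_blkptr_cb: Got error 50 reading <3077, 212, 0, 52>
ERROR_RE = re.compile(
    r"zdb_blkptr_cb: Got error (\d+) reading "
    r"<(\d+), (\d+), (-?\d+), ([0-9a-fA-F]+)>"
)


def parse_zdb_errors(text):
    """Return one dict per 'Got error' line found in zdb output."""
    errors = []
    for m in ERROR_RE.finditer(text):
        errors.append({
            "errno": int(m.group(1)),
            # Field names below are an assumption (see the lead-in).
            "objset": int(m.group(2)),
            "object": int(m.group(3)),
            "level": int(m.group(4)),
            "blkid": m.group(5),  # kept as text; may be hexadecimal
        })
    return errors


def summarize(errors):
    """Count errors per (objset, object) pair."""
    return Counter((e["objset"], e["object"]) for e in errors)


sample = (
    "zdb_blkptr_cb: Got error 50 reading <3077, 212, 0, 52>\n"
    "DVA[0]=<0:14d528f8200:1ce00> [L0 ZFS plain file] fletcher4 lz4 LE\n"
)
print(summarize(parse_zdb_errors(sample)))
```

Grouping by the first two values gives a rough idea of whether the errors are spread across the pool or concentrated in a few objects, which is probably the first thing the zfs list would ask.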




Re: [OmniOS-discuss] (no subject)

2015-09-15 Thread Paul B. Henson
> From: Stephan Budach
> Sent: Monday, September 14, 2015 10:00 PM
>
> As George Wilson wrote on the ZFS mailing list: "Unfortunately, if the
> corruption impacts a data block then we won't be able to detect it."
> So I am afraid that, apart from corruption of metadata and indirect blocks,
> there's no way to even detect corruption inside a data block, as the
> checksum still matches.

Yes, that's true, assuming you have no external source of verification.
However, Arne said he didn't think this bug would result in data corruption,
only metadata corruption. I was mostly worried about pool corruption that
would cause panics or failure to import, which data level corruption would
not cause. Most of the data on the pool I was worried about is media; a bad
data block here or there wouldn't be too tragic.

> from that pool, e.g. from a backup prior to 6214 having been introduced,
> but depending on the sheer amount of data or the type of it, that might
> not be even possible.

Yup. This was a sucky bug :(.



Re: [OmniOS-discuss] (no subject)

2015-09-14 Thread Paul B. Henson
> From: Omen Wild
> Sent: Monday, September 14, 2015 3:10 PM
> 
> Mostly we are wondering how to clear the corruption off disk and worried
> what else might be corrupt since the scrub turns up no issues.

While looking into possible corruption from the recent L2 cache bug it seems
that running 'zdb -bbccsv' is a good test for finding corruption as it looks
at all of the blocks and verifies all of the checksums.
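
Since a full zdb pass over a large pool can take a long time and produce a lot of output, it may be worth capturing everything to a file. Here is a minimal Python sketch of that, using only the 'zdb -bbccsv' invocation mentioned above. The pool name "tank", the log file name, and the helper names are placeholders, not anything from this thread.

```python
import subprocess


def build_zdb_command(pool):
    """Build the zdb invocation mentioned above for one pool."""
    return ["zdb", "-bbccsv", pool]


def run_zdb_check(pool, log_path):
    """Run zdb, writing stdout and stderr to log_path; return the exit status."""
    with open(log_path, "w") as log:
        result = subprocess.run(
            build_zdb_command(pool),
            stdout=log,
            stderr=subprocess.STDOUT,  # keep errors in the same log
            check=False,               # let the caller inspect the status
        )
    return result.returncode


# Example call ("tank" is a placeholder pool name):
#   status = run_zdb_check("tank", "zdb-tank.log")
```

Keeping stderr in the same file means any error messages stay next to the block information they refer to.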



Re: [OmniOS-discuss] (no subject)

2015-09-14 Thread Stephan Budach

On 15.09.15 at 03:46, Paul B. Henson wrote:

> > From: Omen Wild
> > Sent: Monday, September 14, 2015 3:10 PM
> >
> > Mostly we are wondering how to clear the corruption off disk and worried
> > what else might be corrupt since the scrub turns up no issues.
>
> While looking into possible corruption from the recent L2 cache bug it seems
> that running 'zdb -bbccsv' is a good test for finding corruption as it looks
> at all of the blocks and verifies all of the checksums.

As George Wilson wrote on the ZFS mailing list: "Unfortunately, if the
corruption impacts a data block then we won't be able to detect it."
So I am afraid that, apart from corruption of metadata and indirect blocks,
there's no way to even detect corruption inside a data block, as the
checksum still matches.


I think the best one can do is to run a scrub and act on its results.
If the scrub reports no errors, one can either live with that or look for
ways to compare the data against known-good copies of the data
from that pool, e.g. from a backup prior to 6214 having been introduced,
but depending on the sheer amount of data or the type of it, that might
not even be possible.
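
For anyone who does have such a backup mounted somewhere, the comparison could look roughly like the Python sketch below. It hashes every file under a live directory and the file at the same relative path in the backup, and lists the ones that differ. The function names and directory layout are assumptions for illustration only. Files changed on purpose since the backup will also show up as mismatches, so the list is only a starting point.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(live_root, backup_root):
    """Return relative paths whose contents differ between the two trees.

    Files that do not exist in the backup are skipped, since there is
    nothing to compare them against.
    """
    mismatches = []
    for dirpath, _dirnames, filenames in os.walk(live_root):
        for name in filenames:
            live_path = os.path.join(dirpath, name)
            rel = os.path.relpath(live_path, live_root)
            backup_path = os.path.join(backup_root, rel)
            if not os.path.isfile(backup_path):
                continue
            if file_digest(live_path) != file_digest(backup_path):
                mismatches.append(rel)
    return sorted(mismatches)
```

Reading in chunks keeps memory use flat even for large media files, which matters when the pool holds many of them.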


Cheers,
Stephan