Re: Perc 6/i Puncturing bad blocks
I'm hoping someone will know about this. I'm finding this again on a different server, same issue.

Thanks!

On 11/16/2010 11:31 AM, J. Epperson wrote:
>
> That post says the 6/i is the first with a feature to fix a punctured
> stripe, but doesn't say how to use it (probably because the OP was about a
> Perc 5). Patrick Fisher: are you still reading? Can you elaborate on
> the 6/i punctured stripe repair capability?
>

--
Thanks!

Jacob Perkins
Level II Linux Systems Administrator
Systems Monitoring
HostGator.com LLC
http://support.hostgator.com

___
Linux-PowerEdge mailing list
Linux-PowerEdge@dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
Re: Perc 6/i Puncturing bad blocks
On Mon, November 15, 2010 23:43, Eugene Vilensky wrote:
>
> On Nov 15, 2010, at 9:09 AM, J. Epperson wrote:
>
>> On Thu, November 11, 2010 10:25, Jacob P wrote:
>>> Hello,
>>>
>>> I'm having an issue with a PERC 6/i card and I was hoping to get some
>>> guidance from the gurus on this list.
>>>
>>> We're having an issue with the controller 'puncturing bad blocks',
>>> but it's 'remembering' the sectors after swapping out with a
>>> hotspare.
>>>
>>
>> Looks like the infamous and somewhat nebulous "punctured stripe". Your
>> PERC is presenting the RAID volume as an emulated hard disk, and when
>> you end up with emulated bad blocks, they'll move around WRT the actual
>> physical drives. The only sure way to get rid of them is said to be
>> recreating the volume from scratch and reloading the files from backup.
>>
>>
>> There are people participating on this list who understand this far
>> better than I do, and I hope one or more of them elaborate for us.
>
>
> This was a memorable one, hope it helps:
>
> http://lists.us.dell.com/pipermail/linux-poweredge/2010-May/042140.html
>

That post says the 6/i is the first with a feature to fix a punctured
stripe, but doesn't say how to use it (probably because the OP was about a
Perc 5). Patrick Fisher: are you still reading? Can you elaborate on
the 6/i punctured stripe repair capability?
Re: Perc 6/i Puncturing bad blocks
On Nov 15, 2010, at 9:09 AM, J. Epperson wrote:

> On Thu, November 11, 2010 10:25, Jacob P wrote:
>> Hello,
>>
>> I'm having an issue with a PERC 6/i card and I was hoping to get some
>> guidance from the gurus on this list.
>>
>> We're having an issue with the controller 'puncturing bad blocks', but
>> it's 'remembering' the sectors after swapping out with a hotspare.
>>
>
> Looks like the infamous and somewhat nebulous "punctured stripe". Your
> PERC is presenting the RAID volume as an emulated hard disk, and when you
> end up with emulated bad blocks, they'll move around WRT the actual
> physical drives. The only sure way to get rid of them is said to be
> recreating the volume from scratch and reloading the files from backup.
>
> There are people participating on this list who understand this far better
> than I do, and I hope one or more of them elaborate for us.

This was a memorable one, hope it helps:

http://lists.us.dell.com/pipermail/linux-poweredge/2010-May/042140.html
Re: Perc 6/i Puncturing bad blocks
On Thu, November 11, 2010 10:25, Jacob P wrote:
> Hello,
>
> I'm having an issue with a PERC 6/i card and I was hoping to get some
> guidance from the gurus on this list.
>
> We're having an issue with the controller 'puncturing bad blocks', but
> it's 'remembering' the sectors after swapping out with a hotspare.
>

Looks like the infamous and somewhat nebulous "punctured stripe". Your
PERC is presenting the RAID volume as an emulated hard disk, and when you
end up with emulated bad blocks, they'll move around WRT the actual
physical drives. The only sure way to get rid of them is said to be
recreating the volume from scratch and reloading the files from backup.

There are people participating on this list who understand this far better
than I do, and I hope one or more of them elaborate for us.
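If the bad blocks really belong to the emulated volume rather than to a physical disk, the same media error count should show up on drives with different serial numbers. The sketch below is one way to check that against the status snapshots in the original post. The drive lines are transcribed from that post (slots s3 and s4 only); the helper names `parse_pd` and `drives_reporting` are hypothetical, and the parser is only an assumption about lines shaped like these, not a general parser for the tool that produced them.

```python
import re
from collections import defaultdict

# Physical-drive lines for slots s3 and s4, copied from the original post.
SNAPSHOTS = {
    "Nov 1st": [
        "a0e32s3 SEAGATE ST31000640SS rev:MS0A s/n:9QJ5NDVC 931GiB a0d1 online errs: media:76 other:2",
        "a0e32s4 SEAGATE ST31000640SS rev:MS0A s/n:9QJ6BWRR 931GiB a0d1 online",
    ],
    "Nov 3rd": [
        "a0e32s3 SEAGATE ST31000640SS rev:0004 s/n:9QJ1WP3P 931GiB a0d1 online errs: media:0 other:1",
        "a0e32s4 SEAGATE ST31000640SS rev:MS0A s/n:9QJ6BWRR 931GiB a0d1 online errs: media:76 other:0",
    ],
    "Nov 4th": [
        "a0e32s3 SEAGATE ST31000640SS rev:0004 s/n:9QJ1WP3P 931GiB a0d1 online errs: media:76 other:1 predictive-failure",
        "a0e32s4 SEAGATE ST31000640SS rev:MS0A s/n:9QJ63CGY 931GiB a0d1 online",
    ],
    "Today": [
        "a0e32s3 SEAGATE ST31000640SS rev:MS04 s/n:9QJ636RE 931GiB a0d1 online errs: media:76 other:0",
        "a0e32s4 SEAGATE ST31000640SS rev:MS0A s/n:9QJ6C4XE 931GiB a0d1 online",
    ],
}


def parse_pd(line):
    """Return (slot, serial, media_errors or None) for one drive line."""
    head = re.match(r"(a\d+e\d+s\d+)\s+.*?s/n:(\S+)", line)
    if head is None:
        return None
    media = re.search(r"media:(\d+)", line)
    # None means the line carried no "errs:" field at all.
    return head.group(1), head.group(2), int(media.group(1)) if media else None


def drives_reporting(snapshots, count):
    """Map serial number -> [(snapshot label, slot)] for drives showing `count` media errors."""
    seen = defaultdict(list)
    for label, lines in snapshots.items():
        for line in lines:
            parsed = parse_pd(line)
            if parsed is not None and parsed[2] == count:
                slot, serial, _ = parsed
                seen[serial].append((label, slot))
    return dict(seen)


hits = drives_reporting(SNAPSHOTS, 76)
for serial, where in hits.items():
    print(serial, where)
```

On the transcribed data the count of 76 is reported by more than one serial number and in both slots, which is consistent with the errors being tracked against the volume rather than against any single disk.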
Perc 6/i Puncturing bad blocks
Hello,

I'm having an issue with a PERC 6/i card and I was hoping to get some guidance from the gurus on this list.

We're having an issue with the controller 'puncturing bad blocks', but it's 'remembering' the sectors after swapping out with a hotspare.

Example:

*Nov 1st -> hotswapped S3*
a0 PERC 6/i Integrated bios:2.04.00 fw:1.22.02-0612 encl:1 ldrv:2 rbld:30% mem:256MiB batt:good/4054mV/26C
a0d0 136GiB RAID 1 1x2 optimal
row 0: a0e32s0 a0e32s1
a0d1 2TiB RAID 5 1x4 optimal
row 0: a0e32s2 a0e32s3 a0e32s4 a0e32s5
a0e32s0 SEAGATE ST3146356SS rev:HS0F s/n:3QN23Y2J 136GiB a0d0 online
a0e32s1 SEAGATE ST3146356SS rev:HS0F s/n:3QN260CF 136GiB a0d0 online
a0e32s2 SEAGATE ST31000640SS rev:MS0A s/n:9QJ5NX2F 931GiB a0d1 online
a0e32s3 SEAGATE ST31000640SS rev:MS0A s/n:9QJ5NDVC 931GiB a0d1 online errs: media:76 other:2
a0e32s4 SEAGATE ST31000640SS rev:MS0A s/n:9QJ6BWRR 931GiB a0d1 online
a0e32s5 SEAGATE ST31000640SS rev:MS0A s/n:9QJ5NX2V 931GiB a0d1 online

*Nov 3rd -> hotswapped s4*
a0 PERC 6/i Integrated bios:2.04.00 fw:1.22.02-0612 encl:1 ldrv:2 rbld:30% mem:256MiB batt:good/4044mV/26C
a0d0 136GiB RAID 1 1x2 optimal
row 0: a0e32s0 a0e32s1
a0d1 2TiB RAID 5 1x4 optimal
row 0: a0e32s2 a0e32s3 a0e32s4 a0e32s5
a0e32s0 SEAGATE ST3146356SS rev:HS0F s/n:3QN23Y2J 136GiB a0d0 online
a0e32s1 SEAGATE ST3146356SS rev:HS0F s/n:3QN260CF 136GiB a0d0 online
a0e32s2 SEAGATE ST31000640SS rev:MS0A s/n:9QJ5NX2F 931GiB a0d1 online
a0e32s3 SEAGATE ST31000640SS rev:0004 s/n:9QJ1WP3P 931GiB a0d1 online errs: media:0 other:1
a0e32s4 SEAGATE ST31000640SS rev:MS0A s/n:9QJ6BWRR 931GiB a0d1 online errs: media:76 other:0

*Nov 4th -> hotswapped s3*
a0 PERC 6/i Integrated bios:2.04.00 fw:1.22.02-0612 encl:1 ldrv:2 rbld:30% mem:256MiB batt:good/4038mV/26C
a0d0 136GiB RAID 1 1x2 optimal
row 0: a0e32s0 a0e32s1
a0d1 2TiB RAID 5 1x4 optimal
row 0: a0e32s2 a0e32s3 a0e32s4 a0e32s5
a0e32s0 SEAGATE ST3146356SS rev:HS0F s/n:3QN23Y2J 136GiB a0d0 online
a0e32s1 SEAGATE ST3146356SS rev:HS0F s/n:3QN260CF 136GiB a0d0 online
a0e32s2 SEAGATE ST31000640SS rev:MS0A s/n:9QJ5NX2F 931GiB a0d1 online
a0e32s3 SEAGATE ST31000640SS rev:0004 s/n:9QJ1WP3P 931GiB a0d1 online errs: media:76 other:1 predictive-failure
a0e32s4 SEAGATE ST31000640SS rev:MS0A s/n:9QJ63CGY 931GiB a0d1 online
a0e32s5 SEAGATE ST31000640SS rev:MS0A s/n:9QJ5NX2V 931GiB a0d1 online

*Today:*
a0 PERC 6/i Integrated bios:2.04.00 fw:1.22.02-0612 encl:1 ldrv:2 rbld:30% mem:256MiB batt:good/4019mV/26C
a0d0 136GiB RAID 1 1x2 optimal
row 0: a0e32s0 a0e32s1
a0d1 2TiB RAID 5 1x4 optimal
row 0: a0e32s2 a0e32s3 a0e32s4 a0e32s5
a0e32s0 SEAGATE ST3146356SS rev:HS0F s/n:3QN23Y2J 136GiB a0d0 online
a0e32s1 SEAGATE ST3146356SS rev:HS0F s/n:3QN260CF 136GiB a0d0 online
a0e32s2 SEAGATE ST31000640SS rev:MS0A s/n:9QJ5NX2F 931GiB a0d1 online
a0e32s3 SEAGATE ST31000640SS rev:MS04 s/n:9QJ636RE 931GiB a0d1 online errs: media:76 other:0
a0e32s4 SEAGATE ST31000640SS rev:MS0A s/n:9QJ6C4XE 931GiB a0d1 online
a0e32s5 SEAGATE ST31000640SS rev:MS0A s/n:9QJ5NX2V 931GiB a0d1 online

As you can see, the '76' errors seem to be moving from pd to pd, even though they are different drives. This is leading me to believe that the controller is taking these sectors offline because it's remembering them somehow.

Has anyone seen this before, or perhaps have any suggestions on how I should proceed? Wall of text logs below.

Thanks again for any help!!
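The first entries of the controller log below carry a SCSI CDB and raw sense data for the failing read. As a reading aid, here is a minimal sketch that decodes those two byte strings, assuming the standard READ(10) CDB layout and fixed-format sense data; the function names are hypothetical and nothing here comes from Dell or LSI tooling.

```python
def decode_read10(cdb_hex):
    """Decode a READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: starting LBA
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length
    return lba, blocks


def decode_fixed_sense(sense_hex):
    """Decode fixed-format sense data (response code 0x70 or 0x71)."""
    s = bytes.fromhex(sense_hex)
    if len(s) < 14 or (s[0] & 0x7F) not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    return {
        "valid": bool(s[0] & 0x80),                     # INFORMATION field is valid
        "sense_key": s[2] & 0x0F,                       # 3 = MEDIUM ERROR
        "information": int.from_bytes(s[3:7], "big"),   # failing LBA when valid
        "asc": s[12],                                   # additional sense code
        "ascq": s[13],                                  # additional sense code qualifier
    }


# Byte strings copied from the first two log entries in the post.
lba, blocks = decode_read10("28 00 74 2f e9 00 00 00 80 00")
sense = decode_fixed_sense("f0 00 03 74 2f e9 37 0a 00 00 00 00 11 00 81 80 00 96")
offset = sense["information"] - lba

print(f"read of {blocks} blocks starting at LBA {lba:#x}")
print(f"sense {sense['sense_key']:x}/{sense['asc']:02x}/{sense['ascq']:02x}, "
      f"failing LBA {sense['information']:#x}, offset {offset:#x}")
```

Decoded this way, the sense key, ASC and ASCQ line up with the "Sense: 3/11/00" summary in the log, and the failing LBA and its offset into the read line up with the "ErrLBAOffset (37) LBA(742fe900) BadLba=742fe937" entry.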
11/07/10 19:16:55: EVT#00372-11/07/10 19:16:55: 113=Unexpected sense: PD 03(e0x20/s3) Path 5000c50010157171, CDB: 28 00 74 2f e9 00 00 00 80 00, Sense: 3/11/00
11/07/10 19:16:55: Raw Sense for PD 3: f0 00 03 74 2f e9 37 0a 00 00 00 00 11 00 81 80 00 96
11/07/10 19:16:55: DEV_REC:Medium Error DevId[3] Tgt 3 RDM=a05caa00 retires=0
11/07/10 19:16:55: MedErr is for: cmdId=422, ld=1, src=4, cmd=1, lba=e85fd2, cnt=80, rmwOp=0
11/07/10 19:16:55: -> recoveryChild: ld=1 orgLi=0 recPhysArm=2 badPhysArm=ff doneFun=a0c02268 sRef=0 eRef=7f recFlags=0
11/07/10 19:16:55: -> RecParent: cmdId=422, src=4, cmd=1, lba=e85fd2, cnt=80, rmwOp=0, refs=0/7f
11/07/10 19:16:55: ErrLBAOffset (37) LBA(742fe900) BadLba=742fe937
11/07/10 19:16:55: EVT#00373-11/07/10 19:16:55: 111=Unrecoverable medium erro