On 16 October 2010 00:51, Charles Owens <cow...@greatbaysoftware.com> wrote:
> Hmm... the problem appears to have resolved itself.  After a few hours the
> new drive seems to have gone back into the array, and the original hot spare
> drive put back into hot-spare state.
>
> So I'm interpreting state 0x0020 to therefore mean something like "hang on
> while I use this new drive to automatically put everything back as it was
> before the failure".  Is this correct?
>
> Thanks,
> Charles
>
> [r...@bsvr ~]# mfiutil show drives
> mfi0 Physical Drives:
>  ( 149G) ONLINE <ST9160511NS SN04 serial=9SM236JR> SATA enclosure 1, slot 0
>  ( 149G) ONLINE <ST9160511NS SN04 serial=9SM237KF> SATA enclosure 1, slot 1
>  ( 149G) ONLINE <ST9160511NS SN04 serial=9SM236N8> SATA enclosure 1, slot 2
>  ( 149G) HOT SPARE <ST9160511NS SN04 serial=9SM237EK> SATA enclosure 1, slot 3
>  ( 149G) ONLINE <ST9160511NS SN04 serial=9SM238AG> SATA enclosure 1, slot 4
>
>
> On 10/15/10 3:05 PM, Charles Owens wrote:
>>
>> Hello,
>>
>> We have a mfi-based RAID array with a failed drive.  When replacing the
>> failed drive with a brand new one 'mfiutil' reports it having status of
>> "PSTATE 0x0020".  Attempts to work with the drive to make it a hot spare
>> are unsuccessful (eg. using "good" and/or "add" subcommands of mfiutil).
>> We've tested procedures for replacing failed drives in the past and
>> haven't run into this.
>>
>> Looking at the code for mfiutil it appears that this is happening because
>> the mfi controller is reporting a drive status code that mfiutil doesn't
>> know about.  The system is remote and in production, so booting into the
>> LSI in-BIOS RAID-management-tool is not an attractive option.
>>
>> Any help with understanding the situation and potential next steps would
>> be greatly appreciated.  More background information follows below.
>>
>> Thanks,
>>
>> Charles
>>
>>
>> Storage configuration:  4-drive RAID 10 array plus one hot spare
>>
>> [r...@svr ~]# mfiutil show config
>> mfi0 Configuration: 2 arrays, 1 volumes, 0 spares
>>     array 0 of 2 drives:
>>         drive 0 ( 149G) ONLINE <ST9160511NS SN04 serial=9SM236JR> SATA enclosure 1, slot 0
>>         drive 1 ( 149G) ONLINE <ST9160511NS SN04 serial=9SM237KF> SATA enclosure 1, slot 1
>>     array 1 of 2 drives:
>>         drive 4 ( 149G) ONLINE <ST9160511NS SN04 serial=9SM237EK> SATA enclosure 1, slot 3
>>         drive 3 ( 149G) ONLINE <ST9160511NS SN04 serial=9SM236N8> SATA enclosure 1, slot 2
>>     volume mfid0 (296G) RAID-1 256K OPTIMAL spans:
>>         array 0
>>         array 1
>>
>> [r...@svr ~]# mfiutil show drives
>> mfi0 Physical Drives:
>>  ( 149G) ONLINE <ST9160511NS SN04 serial=9SM236JR> SATA enclosure 1, slot 0
>>  ( 149G) ONLINE <ST9160511NS SN04 serial=9SM237KF> SATA enclosure 1, slot 1
>>  ( 149G) ONLINE <ST9160511NS SN04 serial=9SM236N8> SATA enclosure 1, slot 2
>>  ( 149G) ONLINE <ST9160511NS SN04 serial=9SM237EK> SATA enclosure 1, slot 3
>>  ( 149G) PSTATE 0x0020 <ST9160511NS SN04 serial=9SM238AG> SATA enclosure 1, slot 4
>>
>> mfi0: <LSI MegaSAS 1078> port 0x1000-0x10ff mem
>> ...
Hi, Charles Owens.

0x20 is most likely the copyback physical-drive state, which is missing from
enum mfi_pd_state, so what you've experienced is the copyback feature in
action :) Your array was rebuilt with the hot spare (HSP) as an ordinary PD;
then you swapped the failed drive for a good one, and the HSP went into
copyback mode to move all of its data back onto the new disk. That avoids
reordering the disk numbers in the array and rebuilding twice.

--
wbr,
pluknet
_______________________________________________
freebsd-hardware@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hardware
To unsubscribe, send any mail to "freebsd-hardware-unsubscr...@freebsd.org"
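
For anyone hitting the same "PSTATE 0x...." output with an older mfiutil, here is a
minimal, self-contained sketch of the kind of lookup involved. It is not the actual
FreeBSD source: the enum values are assumed to mirror the MFI physical-drive states in
sys/dev/mfi/mfireg.h, MFI_PD_STATE_COPYBACK = 0x20 is the addition discussed above, and
pd_state_name() is an illustrative helper rather than mfiutil's real function.

/*
 * Illustrative sketch only -- not the FreeBSD sources.  The state values
 * below are assumed to follow the LSI MFI firmware interface as declared
 * in sys/dev/mfi/mfireg.h; COPYBACK (0x20) is the state mfiutil did not
 * know about here, so it fell back to printing the raw "PSTATE 0x0020".
 */
#include <stdio.h>

enum mfi_pd_state {
	MFI_PD_STATE_UNCONFIGURED_GOOD = 0x00,
	MFI_PD_STATE_UNCONFIGURED_BAD = 0x01,
	MFI_PD_STATE_HOT_SPARE = 0x02,
	MFI_PD_STATE_OFFLINE = 0x10,
	MFI_PD_STATE_FAILED = 0x11,
	MFI_PD_STATE_REBUILD = 0x14,
	MFI_PD_STATE_ONLINE = 0x18,
	MFI_PD_STATE_COPYBACK = 0x20	/* hypothetical addition: copyback in progress */
};

/* Map a physical-drive state to a label, falling back to the raw value. */
static const char *
pd_state_name(unsigned int state)
{
	static char buf[32];

	switch (state) {
	case MFI_PD_STATE_UNCONFIGURED_GOOD:
		return ("UNCONFIGURED GOOD");
	case MFI_PD_STATE_UNCONFIGURED_BAD:
		return ("UNCONFIGURED BAD");
	case MFI_PD_STATE_HOT_SPARE:
		return ("HOT SPARE");
	case MFI_PD_STATE_OFFLINE:
		return ("OFFLINE");
	case MFI_PD_STATE_FAILED:
		return ("FAILED");
	case MFI_PD_STATE_REBUILD:
		return ("REBUILD");
	case MFI_PD_STATE_ONLINE:
		return ("ONLINE");
	case MFI_PD_STATE_COPYBACK:
		return ("COPYBACK");
	default:
		/* Unknown firmware state: show the raw hex, as mfiutil does. */
		snprintf(buf, sizeof(buf), "PSTATE 0x%04x", state);
		return (buf);
	}
}

int
main(void)
{
	/* 0x20 is what the controller reported; with the extra entry it reads COPYBACK. */
	printf("%s\n", pd_state_name(0x20));
	return (0);
}

Note that the default case still prints the raw value rather than hiding it, which is
exactly where the "PSTATE 0x0020" line in the 'mfiutil show drives' output above comes
from when the tool does not recognize a state.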