On 2011-03-29 16:16, Jeff wrote:
halt for 0.5-2 seconds, then resume. The fix we're going to do is replace each drive in order, with a rebuild occurring between each. Then we do a security erase to reset the drive back to completely empty (including the "spare" blocks kept around for writes).
Are you replacing the drives with new ones, or just secure-erasing them and putting them back in?
What kind of usage figures are you getting out of smartmontools? (I'm also seeing some write stalls here, on 24 RAID50 volumes of X25-Ms, and have been planning to cycle drives for quite some time without actually getting to it.)
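For comparing figures, here is a minimal sketch of how the attribute table printed by `smartctl -A` could be collected per drive. It assumes the usual ten-column table layout (ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE); the function names are made up for illustration, and which attributes are worth watching (for example Media_Wearout_Indicator on Intel drives) depends on the drive model and firmware.

```python
import subprocess


def parse_smart_attributes(text):
    """Parse `smartctl -A` output into {attribute_name: (normalized_value, raw_value)}."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header row and the surrounding banner text do not match this.
        if len(fields) >= 10 and fields[0].isdigit():
            name = fields[1]
            normalized = int(fields[3])     # VALUE column, e.g. "097" -> 97
            raw = " ".join(fields[9:])      # RAW_VALUE may contain spaces
            attrs[name] = (normalized, raw)
    return attrs


def read_drive(device):
    """Run smartctl against one device (hypothetical helper, typically needs root)."""
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True,
        text=True,
    )
    return parse_smart_attributes(result.stdout)
```

Calling `read_drive("/dev/sda")` for each member of an array and comparing the normalized values side by side should make it easier to see which drives are furthest along.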
Now that all sounds awful and horrible until you get to overall performance, especially with reads - you are looking at 20k random reads per second with a few disks. Adding in writes does kick it down a notch, but you're still looking at 10k+ IOPS. That is the current trade-off.
That's also my experience. -- Jesper