Izak Burger added the comment:

I worked on it quite a bit yesterday. I pushed the load average to 36 at times, which was almost entirely down to disk trouble (Linux counts processes waiting in the disk queue as runnable, so a high load average often points to a long disk queue). Things have been better for about 7 hours now. I'm still writing zeroes over the remaining free space, in an attempt to force the disk to reallocate the bad sectors (which it can usually only do on a write).

The RAID array rebuilt to 95% last night before the rebuild failed, so we're close to getting redundancy back. I strongly suspect a disk swap might still be necessary, but at the moment it seems the disk that is not in the array is the better one, so I want the array back in sync first.
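For reference, a rough sketch of the zero-fill and rebuild-monitoring steps, assuming Linux software RAID managed by mdadm; the file path and /dev/md0 are placeholders rather than the actual names on this machine:

    # Fill the free space with zeroes; dd stops with "No space left
    # on device" once the filesystem is full, which is the point.
    dd if=/dev/zero of=/var/tmp/zerofill bs=1M
    sync
    rm /var/tmp/zerofill

    # Watch the rebuild progress of the software RAID array.
    cat /proc/mdstat
    mdadm --detail /dev/md0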
At the moment I'm not having much joy with smartmontools. The initial stats showed some 28 bad sectors pending reallocation, which isn't too bad, but a full offline scan (which, despite its name, can be done while the disk is online) will take a full day.

The files we lost were almost all log files, even in the other virtual hosts on that machine. One of PostgreSQL's WAL segments also failed, but I could recover it from a previous copy. A few successive rsyncs got all the data back.

regards,
Izak
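For reference, a rough sketch of the smartmontools checks and the rsync recovery described above, assuming a plain SATA disk at /dev/sda; the device name, host, and paths are placeholders:

    # Show the SMART attributes, including Current_Pending_Sector.
    smartctl -a /dev/sda

    # Start the extended ("offline") self-test; the disk stays usable
    # while it runs, but expect it to take many hours.
    smartctl -t long /dev/sda

    # Check the result once the estimated run time has passed.
    smartctl -l selftest /dev/sda

    # Pull the lost files back from a previous copy; repeated runs are
    # cheap because rsync only transfers what has changed.
    rsync -av backuphost:/srv/previous-copy/ /srv/restore/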