Alan Stern wrote:
> On Wed, 6 Dec 2006, Matthias Schniedermeyer wrote:
>
>> Hi
>>
>> I'm using a bunch of HDDs in USB enclosures for storing files
>> (currently 38 HDDs with a total capacity of 9.5 TB, of which 8.5 TB is
>> used).
>>
>> After I realised about a year(!) ago that the files copied to the HDDs
>> sometimes aren't identical to the "original" files, I changed my
>> procedure so that each file is MD5-summed before and after the copy,
>> and deleted and copied again if an error is detected.
>>
>> My average file size is about 1 GB, with files ranging from about
>> 400 MB to 5000 MB. I estimate the average error rate at about one
>> damaged file in about 10 GB of data.
>>
>> I'm not sure, and haven't checked, whether the files are wrongly
>> written or "only" wrongly read back, as I delete the defective files
>> and copy them again.
>>
>> Today I copied a few files back and checked them against the stored
>> MD5 sums, and 5 files of 86 (each about 700 MB) had errors. So I copied
>> the 5 files again. 4 of the files were OK after that, and copying the
>> last file a third time also resulted in the correct MD5.
>>
>> This time I kept the defective files and used "vbindiff" to show me
>> the difference. Strangely, in EVERY case the difference is a single
>> bit, in a sequence of 0xff bytes inside a block of varying bit values,
>> that changed a 0xff into a 0xf7.
>> Also interesting is that each error is at an offset of the form
>> 0xXXXXXXX5.
>>
>> Attached is a file with 5 of the 6 differences, named 1-5. Each of the
>> 5 blocks consists of 2x3 lines: the first 3 lines are the original,
>> the following 3 lines contain the error in the 6th value of the middle
>> row.
>>
>> NEVER did I see any messages in syslog regarding errors, or a program
>> aborting due to errors passed down from the kernel or something like
>> that.
>
> This was almost certainly caused by hardware flaws in the USB interface
> chips of the enclosures. There's nothing the kernel can do about it
> because the errors aren't reported; all that happens is that incorrect
> data is sent to or from the drive.
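
For reference, the copy-and-verify procedure quoted above, plus the byte-level comparison that vbindiff was used for, could be sketched roughly as below. This is only an illustration under my own assumptions: the function names (md5_of, copy_verified, byte_diffs), the retry limit and the chunk size are all made up for the example and are not the actual scripts in use.

```python
import hashlib
import shutil


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, reading in chunks to keep memory flat."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_verified(src, dst, max_tries=3):
    """Copy src to dst, re-copying until the MD5 sums match (hypothetical helper).

    Returns the number of attempts that were needed.
    """
    want = md5_of(src)
    for attempt in range(1, max_tries + 1):
        shutil.copyfile(src, dst)
        if md5_of(dst) == want:
            return attempt
    raise IOError(f"{dst}: MD5 mismatch after {max_tries} tries")


def byte_diffs(path_a, path_b, chunk_size=1 << 20):
    """Yield (offset, byte_a, byte_b, xor_mask) for every differing byte.

    Only the common length of the two files is compared; a size difference
    is not reported here.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a, b = fa.read(chunk_size), fb.read(chunk_size)
            if not a and not b:
                break
            if a != b:
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        yield offset + i, x, y, x ^ y
            offset += max(len(a), len(b))
```

For the corruption described above, the XOR mask would come out as 0xff ^ 0xf7 = 0x08, i.e. bit 3 cleared, and `offset & 0xf` would be 5 for every hit. One caveat: a read-back done right after the copy may be served from the page cache instead of the drive, so a check like this is more meaningful after the cache has been dropped or the device has been re-plugged.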

So pretty much all I can do is pray that the errors don't corrupt the
filesystem metadata (XFS).

So I should definitely consider writing myself a "NO-FS", where the
"filesystem" part is stored elsewhere and the HDD contains 100% content
(minus a dummy MBR block for sector 0).

On the plus side, such a filesystem wouldn't have any overhead at all;
on the flip side, you lose pretty much the whole content if you lose the
metadata. But I guess in my case it would considerably lower the risk of
losing data.

See you

--
Real Programmers consider "what you see is what you get" to be just as
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated,
cryptic, powerful, unforgiving, dangerous.
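
P.S. A very rough sketch of what such a "NO-FS" might look like, purely to make the idea concrete. Everything here is an assumption made for the example: the class name NoFS, the 512-byte sector size, the JSON index format and the per-file MD5 are hypothetical, and whole files are held in memory, which would not do for 1 GB files. The device is just any seekable binary file object; sector 0 is left alone for the dummy MBR, and the index is meant to be stored somewhere other than the HDD.

```python
import hashlib
import io
import json

SECTOR = 512  # assumed sector size; sector 0 is reserved for the dummy MBR


class NoFS:
    """Hypothetical 'NO-FS': raw content on the device, index kept elsewhere."""

    def __init__(self, device, index=None):
        self.device = device  # any seekable binary file object
        # name -> {"start": sector, "length": bytes, "md5": hex digest}
        self.index = index if index is not None else {}

    def _next_free_sector(self):
        end = 1  # first usable sector, right after the dummy MBR
        for entry in self.index.values():
            sectors = -(-entry["length"] // SECTOR)  # ceiling division
            end = max(end, entry["start"] + sectors)
        return end

    def store(self, name, data):
        """Append data at the next free sector and record it in the index."""
        start = self._next_free_sector()
        self.device.seek(start * SECTOR)
        self.device.write(data)
        self.index[name] = {
            "start": start,
            "length": len(data),
            "md5": hashlib.md5(data).hexdigest(),
        }

    def load(self, name):
        """Read a file back and verify it against the MD5 in the index."""
        entry = self.index[name]
        self.device.seek(entry["start"] * SECTOR)
        data = self.device.read(entry["length"])
        if hashlib.md5(data).hexdigest() != entry["md5"]:
            raise IOError(f"{name}: MD5 mismatch on read-back")
        return data

    def dump_index(self):
        """Serialise the index so it can be stored away from the device."""
        return json.dumps(self.index, indent=2, sort_keys=True)
```

With something like `fs = NoFS(io.BytesIO())` the layout can be tried out without touching a real disk. The point of the design is that a flipped bit can only ever damage file content, which the stored MD5 then detects on read-back; the price, as said above, is that losing the index means losing track of everything on the drive.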