Hi Alan :)

 * Alan Stern <[EMAIL PROTECTED]> dixit:
> On Wed, 22 Nov 2006, Andrew Morton wrote:
> > The problem is the following: whenever I copy a lot of data to
> > the usb-storage device (more than a few GB), the copy goes OK,
> > without an error, but when I compare the copied files with the
> > original files, sometimes a copied file is different. This does not
> > happen if I copy the files one by one, and it doesn't happen all the
> > time; sometimes the copy is perfect.
>
> Intermittent problems like this are very hard to track down. It
> sounds like a hardware problem of some sort, but without more
> information it's impossible to say if the problem lies in your
> computer, the USB cable, the USB-storage adapters, or the hard disk
> drives. Have you tried using different cables?

Yes, and the error disappears (or at least it hasn't shown up yet)
when using a very short cable (less than 0.5m), while a USB memory
stick works OK with a 1m long cable at the same speed! The cables
causing problems are of different brands, and their only common
"feature" is their length: about 1m. The same cables work OK (no
detectable problems) in other computers; I tested that this morning.
The USB card also works OK in another computer I've tested it in, but
that computer runs Windows and I cannot install Linux there, so... :(
The problem really is very difficult to track down.

> > In addition to this, from time to time the usb-storage
> > adapters (any of them, with any of the USB cards and any kernel)
> > report a read error, telling that some sector could not be read.
> > This is false because if I repeat the operation, the sector is
> > correctly retrieved.
>
> No, the messages are not false. They definitely indicate a
> problem; you mustn't dismiss them so easily. With borderline
> hardware it's entirely possible that an operation can fail at
> one moment and then succeed a few moments later.

I've tested the hard disk with a destructive badblocks run and with
a Seagate diagnostic tool, and all the disks are OK. In fact they
work reliably (and SMART doesn't show any problem) when used directly.
That leaves the usb-storage adapters as the cause of those failures,
but why should they fail on a sector that can be read from the hard
disk a moment later? Probably the motherboard is the culprit, I don't
really know :(

I've tested the same set (USB card + USB cable + usb-storage adapter
+ hard disk) in a Windows computer and it works OK, but I don't know
whether it really works or whether Windows is ignoring IO errors and
silently retrying :??? Unfortunately, I cannot carry out under Windows
the same tests I can carry out under Linux (including modifying the
kernel to add traces or any other debugging helper).

> > The fact is that I cannot reproduce the problem reliably, so I
> > cannot give you a "recipe", except that it happens when I copy a lot
> > of data at a time.
> >
> > Any suggestion about how to narrow the problem down? Any more
> > data that you may need? A known bug? Am I doing any stupidity?
>
> Well, you could start by posting some of the error messages!

Yep, sorry O:)) Here they are:

kernel: SCSI disk error : host 0 channel 0 id 0 lun 0 return code = 8000002
kernel: Current sd08:01: sns = 70 4
kernel: ASC=4b ASCQ= 0
kernel: Raw sense data:0x70 0x00 0x04 0x00 0x00 0x00 0x00 0x0a 0x00 0x00 0x00 0x00 0x4b 0x00 0x00 0x00 0x00 0x00
kernel: I/O error: dev 08:01, sector 1804512

The sector number varies on each IO error, and the numbers are
completely unrelated. Here is the list of "bad sectors" so far:

6133384
1804512
18490944
31794768
31177200

Again, neither badblocks nor the Seagate diagnostic program has
revealed any error when the disk is connected directly to the IDE bus.

> Also, it wouldn't hurt to turn on CONFIG_USB_DEBUG and rebuild the
> USB drivers in your kernel, then post the entire kernel log
> starting from when you plug in the drive.

I cannot rebuild the kernel because I cannot reboot the machine
right now, but I already have a kernel prepared with CONFIG_USB_DEBUG.

> It's impossible to tell whether your problem is due to a known bug
> without more information. About all you've told us so far is that
> from time to time something goes wrong. That's not much to go on.

Well, I just wanted some advice to further investigate the problem
and then provide more information. Now that I've ruled out part of
the problem (1m cables seem too long for the USB card *in this
computer*), I'll concentrate on the IO error.

Thanks a lot for the advice :) I'll try to provide more and better
data, and more tests. I'm going to perform some kind of differential
analysis to discover the exact combination (if any) or conditions
(again, if any) that lead to the error.

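Each round of that analysis boils down to "copy a lot of data, then
compare every copied file with its original". A minimal sketch of the
comparison step in Python follows. The function names `file_digest`
and `compare_trees` are made up for illustration; nothing like them
is mentioned in the thread, and SHA-256 is just one reasonable choice
of checksum.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(source_root, copy_root):
    """Return the relative paths whose copy is missing or differs.

    Hypothetical helper: walks every regular file under source_root
    and checks the file with the same relative path under copy_root.
    """
    source_root = Path(source_root)
    copy_root = Path(copy_root)
    bad = []
    for source_file in sorted(source_root.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_root)
        copied_file = copy_root / relative
        if not copied_file.is_file():
            bad.append(relative.as_posix())       # copy is missing
        elif file_digest(source_file) != file_digest(copied_file):
            bad.append(relative.as_posix())       # contents differ
    return bad
```

Calling `compare_trees("/path/to/originals", "/mnt/usbdisk/copy")`
(both paths are placeholders) returns the list of files to look at.
It is probably worth unmounting and remounting the USB disk before
comparing, since otherwise the copied files may be read back from the
page cache rather than from the device.
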
Anyway, this is going to be slow, because each test takes half an
hour or even more. Again, thanks a lot :)

    Raúl Núñez de Arenas Coronado

-- 
Linux Registered User 88736 | http://www.dervishd.net
It's my PC and I'll cry if I want to... RAmen!
_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel
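
A note on the kernel messages quoted in the message: the raw sense
data is in the SCSI fixed format (first byte 0x70), so the sense key
sits in the low nibble of byte 2 and the ASC/ASCQ pair in bytes 12
and 13. The Python sketch below decodes it. The function names are
made up for illustration, the lookup tables are deliberately partial,
and it assumes the "return code" is printed in hexadecimal with the
usual Linux layout (status, message, host and driver bytes, lowest
byte first).

```python
# Partial table of SCSI sense keys (names as in the SPC standard).
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0xB: "ABORTED COMMAND",
}

# Only the ASC/ASCQ pair seen in the log; the real table is much longer.
ADDITIONAL_SENSE = {(0x4B, 0x00): "DATA PHASE ERROR"}


def decode_fixed_sense(raw):
    """Decode fixed-format sense data (response code 0x70 or 0x71)."""
    if raw[0] & 0x7F not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    sense_key = raw[2] & 0x0F          # low nibble of byte 2
    asc, ascq = raw[12], raw[13]       # additional sense code + qualifier
    return {
        "sense_key": sense_key,
        "sense_key_name": SENSE_KEYS.get(sense_key, "unknown"),
        "asc": asc,
        "ascq": ascq,
        "asc_name": ADDITIONAL_SENSE.get((asc, ascq), "unknown"),
    }


def split_result(code):
    """Split a 32-bit SCSI result word into its four bytes (assumed layout)."""
    return {
        "status": code & 0xFF,          # 0x02 is CHECK CONDITION
        "msg": (code >> 8) & 0xFF,
        "host": (code >> 16) & 0xFF,
        "driver": (code >> 24) & 0xFF,  # 0x08 is DRIVER_SENSE
    }


# The bytes exactly as they appear in the "Raw sense data" log line.
LOG_BYTES = ("0x70 0x00 0x04 0x00 0x00 0x00 0x00 0x0a 0x00 "
             "0x00 0x00 0x00 0x4b 0x00 0x00 0x00 0x00 0x00")
raw_sense = bytes(int(token, 16) for token in LOG_BYTES.split())
decoded = decode_fixed_sense(raw_sense)
result = split_result(0x08000002)
```

For the logged values this gives sense key 4 (HARDWARE ERROR) with
ASC 0x4b / ASCQ 0x00 (DATA PHASE ERROR), and a result word that says
"CHECK CONDITION, sense data available". That looks more like trouble
on the transfer path than like a bad sector on the disk, which would
at least fit with the disks testing clean on the IDE bus, but it does
not by itself say which component is at fault.
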