Hi Alan :)

 * Alan Stern <[EMAIL PROTECTED]> dixit:
> On Wed, 22 Nov 2006, Andrew Morton wrote:
> >     The problem is the following: whenever I copy a lot of data to
> > the usb-storage device (more than a few GB's), the copy goes OK,
> > without an error, but when I compare the copied files with the
> > original files, sometimes a copied file is different. This does not
> > happen if I copy the files one by one, and it doesn't happen all the
> > time, sometimes the copy is perfect.
> 
> Intermittent problems like this are very hard to track down.  It
> sounds like a hardware problem of some sort, but without more
> information it's impossible to say if the problem lies in your
> computer, the USB cable, the USB-storage adapters, or the hard disk
> drives.  Have you tried using different cables?

    Yes, and the error disappears (or at least it hasn't shown up
yet) when using a very short cable (less than 0.5m), while a
USB memory stick works OK with a 1m long cable at the same speed! The
cables causing problems are of different brands, and their
only common "feature" is their length: about 1m. The same cables work
OK (no detectable problems) in other computers; I tested that this
morning.

    The fact is that the USB card works OK in another computer I've
tested on, but that computer uses Windows and I cannot install Linux
there, so... :(

    Really the problem is very difficult to track down.

> >     In addition to this, from time to time the usb-storage
> > adapters (any of them, with any of the USB cards and any kernel)
> > report a read error, telling that some sector could not be read.
> > This is false because if I repeat the operation, the sector is
> > correctly retrieved.
> 
> No, the messages are not false.  They definitely indicate a
> problem; you mustn't dismiss them so easily.  With borderline
> hardware it's entirely possible that an operation can fail at one
> moment and then succeed a few moments later.

    I've tested the hard disks with a destructive badblocks run and
with a Seagate diagnostic tool, and all the disks are OK. In fact
they work reliably (and SMART doesn't show any problem) if used
directly. That leaves us with the usb-storage adapters as the cause
of those failures, but why should they fail on a sector that can be
read from the hard disk a moment later?

    Probably the motherboard is the culprit, I don't really know :(
I've tested the same set (USB card + USB cable + usb-storage
adapter + hard disk) in a Windows computer and it works OK, but I
don't know if it really works or if Windows is ignoring IO errors and
silently retrying :??? Unfortunately, I cannot carry out under Windows
the same tests I can carry out under Linux (including modifying the
kernel to add traces or any other debugging helper).

> >     The fact is that I cannot reproduce the problem reliably, so I
> > cannot give you a "recipe", except that it happens when I copy a lot
> > of data at a time.
> > 
> >     Any suggestion about how to narrow the problem down? Any more
> > data that you may need? A known bug? Am I doing any stupidity?
> 
> Well, you could start by posting some of the error messages!

    Yep, sorry O:)) Here they are:

kernel: SCSI disk error : host 0 channel 0 id 0 lun 0 return code = 8000002
kernel: Current sd08:01: sns = 70  4
kernel: ASC=4b ASCQ= 0
kernel: Raw sense data:0x70 0x00 0x04 0x00 0x00 0x00 0x00 0x0a 0x00
                       0x00 0x00 0x00 0x4b 0x00 0x00 0x00 0x00 0x00
kernel:  I/O error: dev 08:01, sector 1804512
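
    For reference, the raw sense bytes in the log above can be decoded
by hand. The following is only a minimal sketch, assuming the 18 bytes
are fixed-format SCSI sense data (response code 0x70); the function
name and the small lookup tables are illustrative and cover only a
few standard values. Sense key 4 is HARDWARE ERROR and ASC/ASCQ 4b/00
is DATA PHASE ERROR in the standard tables, but what that implies for
this particular adapter is not established here.

```python
# Sketch: decode fixed-format SCSI sense data as printed in the kernel log.
# The tables are deliberately partial; unknown values are reported in hex.

SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0xB: "ABORTED COMMAND",
}

ASC_ASCQ = {
    (0x4B, 0x00): "DATA PHASE ERROR",
}


def decode_fixed_sense(raw):
    """Return sense key and ASC/ASCQ from fixed-format sense bytes."""
    if len(raw) < 14:
        raise ValueError("need at least 14 bytes of sense data")
    response_code = raw[0] & 0x7F
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    key = raw[2] & 0x0F          # low nibble of byte 2
    asc, ascq = raw[12], raw[13]  # additional sense code and qualifier
    return {
        "sense_key": SENSE_KEYS.get(key, hex(key)),
        "asc": asc,
        "ascq": ascq,
        "meaning": ASC_ASCQ.get((asc, ascq), "unknown (not in this table)"),
    }


# Bytes copied from the "Raw sense data" lines in the log.
RAW = bytes([0x70, 0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x0A, 0x00,
             0x00, 0x00, 0x00, 0x4B, 0x00, 0x00, 0x00, 0x00, 0x00])

if __name__ == "__main__":
    print(decode_fixed_sense(RAW))
```

    The decoded values agree with the "sns = 70  4" and "ASC=4b ASCQ= 0"
lines that the kernel printed itself.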

    The sector number varies with each IO error, and the numbers are
completely unrelated. Here is the list of "bad sectors" so far:

6133384
1804512
18490944
31794768
31177200

    Again, neither badblocks nor the Seagate diagnostic program has
revealed any error when connecting the disk to the IDE bus directly.

> Also, it wouldn't hurt to turn on CONFIG_USB_DEBUG and rebuild the
> USB drivers in your kernel, then post the entire kernel log
> starting from when you plug in the drive.

    I cannot rebuild the kernel because I cannot reboot the machine
right now, but I already have a kernel prepared with CONFIG_USB_DEBUG.
 
> It's impossible to tell whether your problem is due to a known bug
> without more information.  About all you've told us so far is that
> from time to time something goes wrong.  That's not much to go on.

    Well, I just wanted some advice to further investigate the
problem and then provide more information. Now that I've ruled out
part of the problem (1m cables seem too long for the USB card *in
this computer*), I'll concentrate on the IO error.

    Thanks a lot for the advice :) I'll try to provide more and
better data, and more tests. I'm going to perform some kind of
differential analysis to discover the exact combination (if any) or
conditions (again, if any) that lead to the error. Anyway this is
going to be slow, because each test takes half an hour or even more.
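
    To make each test run repeatable, the comparison of the copied
files against the originals could be scripted. This is just a rough
sketch, assuming both trees are mounted and readable; the function
names are hypothetical and SHA-256 is an arbitrary choice of checksum.
Note that a freshly written copy may be read back from the page cache
rather than from the USB device, so unmounting and remounting the
device before comparing is probably wise.

```python
# Sketch: report files whose copy differs from (or is missing in) the
# destination tree, comparing SHA-256 digests of the file contents.
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(src_root, dst_root):
    """Return relative paths that are missing or differ under dst_root."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    mismatches = []
    for src in sorted(src_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(src_root)
        dst = dst_root / rel
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            mismatches.append(rel.as_posix())
    return mismatches


if __name__ == "__main__":
    import sys
    for name in find_mismatches(sys.argv[1], sys.argv[2]):
        print("MISMATCH:", name)
```

    Logging the mismatching file names for each cable and adapter
combination should make it easier to see which conditions matter.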

    Again, thanks a lot :)

    Raúl Núñez de Arenas Coronado

-- 
Linux Registered User 88736 | http://www.dervishd.net
It's my PC and I'll cry if I want to... RAmen!

_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel