Thanks for taking the time to look at our problem.
Please send us the patch so we can test with that version.
Maybe this is a hardware problem. Meanwhile I have seen the same thing when
using a USB 1.1 memory stick (which definitely works in a lot of other
computers) on this particular server.
Is there a slight possibility that the trouble comes from the really high
speed machine we are using? It is a hyperthreaded dual processor XEON 3GHz (4
CPUs reported) with 4 GByte RAM installed. When doing the tests no other load
is on the machine.
I almost don't dare to say it, but when we install Windows 2000 / SP4 on that
server everything works just fine. But we definitely have to get Linux running
stably on that machine, using all the peripherals that are in there (we have
similar problems with a DVD-RAM drive attached via the IDE bus in this server,
but that's another story).
Is there a possibility that "the others" have some workarounds included in
their drivers to overcome such hardware quirks?
Is it possible that most PC hardware works only because software drivers
tolerate its faults and retry in some way?
I agree that there might be a hardware problem. But sometimes software has to
correct where hardware fails. Maybe we can implement some of these workarounds
here, too?
Thomas
Alan Stern wrote:
On Wed, 22 Feb 2006, Thomas Thanner wrote:
Thank you for analyzing the log.
We bought some new CF cards and initially it worked well. We thought the
problem was solved. But unfortunately the error occurs again.
When we test with large files containing only zeroes, we have no problems.
But as soon as we try to copy real data we get similar errors again.
That certainly indicates a hardware problem of some sort. The software
doesn't care what values are in the data stream. It doesn't even look at
the data stream!
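
A comparison like the one described above (zero-filled files versus real
data) can be made repeatable with a small script. The following is only a
minimal sketch and not something from this thread: the function names and the
mount point /mnt/cf in the usage note are made up for illustration. It writes
a test file of either zeroes or random bytes, copies it, and compares SHA-256
checksums.

```python
import hashlib
import os
import shutil

CHUNK = 1024 * 1024  # work in 1 MiB blocks so large files fit in memory


def write_test_file(path, size_bytes, random_data):
    """Create a file of size_bytes filled with zeroes or random bytes."""
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(CHUNK, remaining)
            f.write(os.urandom(n) if random_data else bytes(n))
            remaining -= n


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read block by block."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            h.update(block)
    return h.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst, force it out to the device, compare checksums."""
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        shutil.copyfileobj(fsrc, fdst, CHUNK)
        fdst.flush()
        os.fsync(fdst.fileno())  # ask the kernel to write the data out
    return sha256_of(src) == sha256_of(dst)
```

For example, write_test_file("/tmp/rand.bin", 2 * 1024**3, True) followed by
copy_and_verify("/tmp/rand.bin", "/mnt/cf/rand.bin") would exercise the
random-data case. Note that the read-back may be served from the page cache,
so for a meaningful check the device should be unmounted and remounted (or
replugged) before the destination checksum is computed.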
Now we wanted to generate the log files as requested, and magically no more
errors occur! What shall we do in this case?
That's a bad situation. If you want, I could send you a patch to log
only the error codes and nothing else. It probably wouldn't help, though,
since the codes would be the same as what we've already seen.
These errors are really nasty: when they occur while copying large files
(>2 GByte), the servers cannot be shut down any more, because the copy process
cannot be terminated. We waited for about 2 days, but no end was in sight!
Unplugging the USB cable should terminate the copy process. But there's a
combination of hardware and software problems that causes the EHCI driver
to hang on some systems under certain circumstances.
Interestingly, we get the same error no matter whether we use the "ub" or
"usb-storage" module for access (using different kernels).
That's because this is a hardware problem, not a software problem.
Everything works fine when we unload EHCI and use only USB 1.1 over UHCI. But
in this case all our USB ports run at USB 1.1, and that is not good. Is there
a possibility to force some ports to USB 1.1 and leave the others on USB 2.0?
I don't think so, but it might be a nice feature to solve our problem :-)
Like I told you before: Get a USB 1.1 hub and attach it to the computer.
Any devices you plug into the hub will automatically run at full speed
instead of high speed.
We have also run tests on different servers of the same kind and on completely
different machines. The error remains the same.
We have tested four different CF cards, all with the same error.
We are more than willing to take part in USB stack debugging and hacking, but
we might need some hints on where to start looking for our problem. At the
moment we only have experience in hacking xfree and xorg servers.
It sounds like you should start looking at the EHCI driver:
drivers/usb/host/ehci*. I doubt you'll find anything wrong, but go ahead
and look.
A better approach might be to buy an add-on PCI USB controller card.
There's a good chance it would work better than the USB controller on your
motherboard.
Alan Stern
--
Thomas Thanner
Manager System Development
Citron GmbH
Anwaltinger Str. 14
D-86165 Augsburg
Tel. : ++49-821-74945-0
Fax. : ++49-821-74945-99
Email: [EMAIL PROTECTED]
http://www.citron.de