On Sun, 27 Oct 2013, Chris McClelland wrote:

> Hello libusbx-devel,
> 
> Firstly, apologies for the long email!
> 
> Today I discovered a weird problem with my FPGALink library on my Raspberry
> Pi, running Raspbian 2013-09-25 (with the stock armhf/armv6l libusbx-1.0.11
> package provided by Raspbian):
> 
> $ uname -a
> Linux raspberrypi 3.6.11+ #538 PREEMPT Fri Aug 30 20:42:08 BST 2013 armv6l
> GNU/Linux
> 
> I reproduced the problem with kernel USB sniffing and libusbx-1.0.17 built
> from source with debug enabled.
> 
> At this stage I would like someone with more experience to have a look at
> my log-files:
> 
> https://dl.dropboxusercontent.com/u/80983693/pi.tar.bz2

It seems clear from the traces that this isn't a problem in libusb.  
The RPi really did not receive the missing data.
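
One way to double-check this from a capture is to add up the byte counts on
the completion lines for the IN endpoint and compare the total with the
16 MiB requested.  The following is only a sketch: it assumes the capture is
in the kernel's usbmon text format, and the helper name received_bytes is
made up for illustration.

```python
def received_bytes(lines, endpoint=6):
    """Sum the lengths of successfully completed bulk-IN URBs on one endpoint.

    Assumes usbmon text format, whose leading fields are:
      <urb tag> <timestamp> <event> <type+dir:bus:dev:ep> <status> <length> ...
    """
    total = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue
        event, address, status, length = fields[2:6]
        parts = address.split(":")
        if len(parts) != 4:
            continue
        # "C" marks a completion callback, "Bi" a bulk IN transfer.
        if (event == "C" and parts[0] == "Bi"
                and parts[3] == str(endpoint) and status == "0"):
            total += int(length)
    return total
```

With a real capture, a total short of 16777216 bytes would confirm that the
missing data never reached the host.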

> Basically, the test involves a Xilinx FPGA and a 16MiB SDRAM connected to
> the RPi via a Cypress FX2, connected either directly or via an
> Akasa-branded powered hub. I write 16MiB of random binary data over USB to
> the SDRAM, then read it back and compare the "in" and "out" data. I control
> the FX2 firmware, the FPGA design and all the code above libusbx. I cannot
> get it to fail (with or without the hub) on my main development machine
> (Linux x64), nor my laptop (Windows 8 x64). Similarly I cannot get it to
> fail if I connect the FX2 directly to the RPi. However, if I connect the
> FX2 to the RPi via the hub, it fails about 95% of the time. The hub
> enumerates as a pair of "058F:6254 Alcor Micro Corp. USB Hub" devices.

Clearly the hub's presence is what triggers the problem.

> The code is single-threaded, and sits in a loop, using the libusbx async
> API: libusb_submit_transfer() and libusb_handle_events_timeout_completed().
> It reads the entire 16MiB block in 64KiB chunks. First submit an
> EP2OUT-transfer: send a five-byte "read" command (i.e. logical channel
> number & length in bytes) to the FPGA. Then submit an EP6IN-transfer:
> receive the 64KiB of data sent back by the FPGA, and write it to a file. Do
> that in a loop, maintaining a "submit depth" of two. Something like:
> 
> libusb_submit_transfer(OUT: 5-byte command to bulk endpoint 2)
> libusb_submit_transfer(IN: 64 KiB data from bulk endpoint 6)
> while ( not yet requested the full 16 MiB ) {
>   libusb_submit_transfer(OUT: 5-byte command to bulk endpoint 2)
>   libusb_submit_transfer(IN: 64 KiB data from bulk endpoint 6)
>   libusb_handle_events_timeout_completed(EP2OUT ack)
>   libusb_handle_events_timeout_completed(EP6IN data)
>   write chunk N to file
> }
> libusb_handle_events_timeout_completed(EP2OUT ack)
> libusb_handle_events_timeout_completed(EP6IN data)
> write last chunk to file
> 
> When connected via the hub, on rare occasions the readback
> completes successfully, but most of the time, some (~4) 1KiB-blocks out of
> the 16MiB total are silently dropped, with no apparent error, and the final
> handle_events() call hangs waiting for completion of the final read, which
> never comes. The 1KiB blocks are dropped from random positions, but always
> on 512-byte boundaries, so if I use hexdump to get text files with 512
> bytes per line from the "in" and "out" binary data, I can diff them and get
> something like this:
> 
> 11866,11867d11865 (1024 bytes dropped)
> <
> 6609879B30036BA387D2CF1C2B913B80A4E3102E69BA93D5F67E96E02F29C385251DBDC670...
> <
> 0FBD8CBD064F704278A7241E3772E0F79F0E7B12DC0BE9951D2CC01EDB083AADEB682490C3...
> 17155,17156d17152 (1024 bytes dropped)
> <
> A50012D2ED25FC586E2C510E0276F80226A7114A0E17A906C40AF45092028CB41569F10A5D...
> <
> E85EFB1F7EF6B32F988DA8147608682420E7096C87F650530A4F4D3AD733ED30C375F0EA67...
> 28881,28882d28876 (1024 bytes dropped)
> <
> A58544550EDDF84FBB8CD6B5652DE1261789DD83B5AB886A0BD6B5FF8CCAF88BE4FF290F3F...
> <
> 203F0E6EC12A52DA5B667E1CC2F479FB997FC9E7DDDC65BC0C4B12E1273958D0BD7C59C35B...
> 31594,31595d31587 (1024 bytes dropped)
> <
> 80C3E7C1FF61B2A10D0019030C363FD6F2892CE9FD47B95932140B731847330E853873696C...
> <
> 0418845C72CE8B2253D215101263866FA75EBE1F608F82D169418C6C31F477D0745885F574...
> 32649,32768d32640 (the final 64KiB read never completes)
> <
> A29EF552E1BC31D47B8AEBE28A26B661B1D04D3E542E3330D94D6D0393FBCD0C802B83D188...
> <
> F49281244CCD19BC6D88B7F35F9942D97833F9D76E349EDB3D23A454D5575D4AF2FFA80250...
> <
> DA3108E152D852BFBDB6BB43C3C7B63A04636F4FE003A88210254F9DC2BBBFEF6A4F243813...
> <
> FFE2EF6EDF9480E4C7BCC31BACCA0BEF3E1079D6BAA9A98B1DF5A86130E020D120A4647FB3...
> <
> 8BCA6D2900829DCD47BF4367D58F098B0F4C4F28599D9F5401BF93A0FB90B7B971B8934323...
> <
> 0475764406C1B925627BE3D61A99D88D9485D34FB8C688F7A590E7AF5397CBB0CDB468656A...
> <
> E8B006C277A8C9F9BA2DAF7F6B460A20D50F182DBFADFF3C1ABA5FF510D3C27C41B7B01175...
> <
> B3D8A7842D521B79C4DE45FB4D741FA761EB73309A0F968895F0C131DDE4310D14008D8578...
>   :

1024 bytes is two USB packets (a high-speed bulk packet carries at most
512 bytes).  This suggests that packets are being dropped, and that the
data-toggle check fails to catch it when two packets in a row are lost,
since the toggle then looks correct again.  However, toggle mismatches
for incoming data would cause data to be duplicated, not lost -- I
can't think how it ends up working out in your case.
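
To make the packet arithmetic concrete, here is a small sketch that turns
the diff hunk headers quoted above into byte offsets and lengths.  It
assumes one hexdump line per 512 bytes, as you described; the names
deleted_span and BYTES_PER_LINE are only illustrative.

```python
import re

BYTES_PER_LINE = 512  # one hexdump line per 512-byte packet

# Matches delete hunks such as "11866,11867d11865" or "42d41".
HUNK = re.compile(r"^(\d+)(?:,(\d+))?d\d+")


def deleted_span(hunk_header, bytes_per_line=BYTES_PER_LINE):
    """Return (byte_offset, byte_length) for a diff 'd' (delete) hunk header."""
    match = HUNK.match(hunk_header)
    if match is None:
        raise ValueError(f"not a delete hunk: {hunk_header!r}")
    first = int(match.group(1))
    last = int(match.group(2) or first)
    return (first - 1) * bytes_per_line, (last - first + 1) * bytes_per_line


headers = ["11866,11867d11865", "17155,17156d17152", "28881,28882d28876",
           "31594,31595d31587", "32649,32768d32640"]
spans = [deleted_span(h) for h in headers]
```

For these hunks the result is four 1024-byte gaps (two packets each) plus a
61440-byte tail, 65536 bytes in all, which is exactly one 64KiB read.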

Have you noticed any difference in the time it takes the program to run
with the hub present compared to a direct connection?  If it takes
longer with the hub, that's a good indication you're getting a lot of
failures and retries.
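
A rough way to make that comparison is to time the whole readback in each
configuration and compare average throughput.  This is only a sketch; the
helper names are made up, and read_back in the usage note is a hypothetical
stand-in for whatever performs the 16 MiB readback.

```python
import time


def timed(transfer, *args):
    """Run transfer(*args) and return (result, elapsed_seconds)."""
    start = time.monotonic()
    result = transfer(*args)
    return result, time.monotonic() - start


def throughput_mib_s(num_bytes, seconds):
    """Average throughput in MiB/s."""
    return num_bytes / (1024 * 1024) / seconds


# Usage (read_back is hypothetical):
#   data, elapsed = timed(read_back, 16 * 1024 * 1024)
#   print(throughput_mib_s(len(data), elapsed))
```

Run it once with the hub and once with a direct connection; a markedly
lower figure with the hub would point to retries on the bus.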

> The last diff is always the missing final 64KiB chunk: since some data
> has been dropped in the middle, libusbx just hangs on the final
> libusb_handle_events_timeout_completed(), waiting for the final 16 KiB URB
> which will never arrive.
> 
> The code is public, but fairly big; I could probably reduce it to something
> minimal involving libusbx, but without the FPGA as well, the prospects for
> someone else reproducing the behaviour are not great:
> 
> https://github.com/makestuff/flcli/tree/20131026
> https://github.com/makestuff/libfpgalink/tree/20131026
> https://github.com/makestuff/libusbwrap/tree/20131026
> 
> My main goal is to eliminate the possibility that it's me doing something
> silly. Maybe there's something up with the RPi's driver for my hub, but

If the hub driver weren't working, you wouldn't be able to communicate 
with the FX2 chip at all.

> maybe there's something wrong with my code, and it's just pure luck that it
> doesn't manifest itself on any other platform.
> 
> I'd appreciate any clues.

Maybe it's not the hub itself but one of the USB cables or connections.  
Have you tried using a different brand of hub?  Maybe a powered hub?

Alan Stern

