Re: usbserial / ftdi_sio (+ others) bug?
On Tue, 4 Nov 2014 09:14:49 +0100 Johan Hovold jo...@kernel.org wrote:

> > 2. The chip responds with a single correct character followed by a few hundred or so replies containing only the overrun status (no data) which are then converted to a bunch of binary zeroes by the ldisc because of the bug I mentioned earlier. After that the chip starts responding with proper data again and works until closed.
>
> Note that the only bug is that the application cannot disable the overrun reporting, but why would you want that?

The merits of doing so may be debatable, but if using the quotes around bug is supposed to indicate that it isn't one, I have to respectfully disagree. I know it is not the most important thing in the world and without the hardware fault I probably would not have seen it at all, but I would still call it a bug.

> What's on the other side of the FTDI chip?

Some kind of an optical receiver circuit (the link is optically isolated). On the other side of that is the device that sends periodic data packets (a couple of times per second, 17 bytes each) to the computer. The computer doesn't send anything, i.e. the tx functionality of the chip is not used at all.

> It still sounds like your hardware is broken, but at least you seem to have found a work-around.

Like I said, the hw is the real culprit here, there's no doubt about it. But I also doubt that it's just the individual chip in my device that has this issue. The device is practically brand new and while that is no guarantee that there won't be any faults, I find it much more likely that what I am seeing here is a quirk of the implementation and that there are lots of these chips with the same issue out there.

The real questions that remain are then:

1. is the chip real or counterfeit, and how am I supposed to know,
2. how much the driver can or even should try to accommodate the quirks of the hw, and
3. does the answer to #2 depend on the answer to #1.

> Perhaps you can report it to the logging-device (?) manufacturer or FTDI.

Sure, if I can find someone that cares, which is doubtful.

> What is the lsusb -v output for your device by the way.

Bus 002 Device 006: ID 0403:6001 Future Technology Devices International, Ltd FT232 USB-Serial (UART) IC
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0         8
  idVendor           0x0403 Future Technology Devices International, Ltd
  idProduct          0x6001 FT232 USB-Serial (UART) IC
  bcdDevice            6.00
  iManufacturer           1 FTDI
  iProduct                2 FT232R USB UART
  iSerial                 3 A400EJPK
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           32
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0xa0
      (Bus Powered)
      Remote Wakeup
    MaxPower               90mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              2 FT232R USB UART
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               0
Device Status:     0x
  (Bus Powered)

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: usbserial / ftdi_sio (+ others) bug?
On Wed, 29 Oct 2014 09:51:28 +0100 Johan Hovold jo...@kernel.org wrote:

> Having the driver not reporting overrun (and other) errors will obviously not fix the underlying issue with your device, which is generating all these errors in the first place.

Ok, I did take a closer look at this (mostly with usbmon) and it seems to be caused by the hardware. When the application opens the device and the driver submits the first bulk reads, there are basically three possibilities for what happens next:

1. The chip responds with correct data and everything works fine from there until the device is closed.

2. The chip responds with a single correct character followed by a few hundred or so replies containing only the overrun status (no data) which are then converted to a bunch of binary zeroes by the ldisc because of the bug I mentioned earlier. After that the chip starts responding with proper data again and works until closed.

3. The chip hangs forever without ever responding with anything on the bulk endpoint.

As a rough estimate I'd say that at least one out of ten opens currently exhibits either behavior 2 or 3. Also, it doesn't seem to have anything to do with any real buffering inside the chip, i.e. if I close a working connection and immediately open it again, it may hang the chip.

After some poking around, it seems that the chip really doesn't like a latency timer value of 1 when it is reset. Once it gets the data going it doesn't seem to mind, i.e. I have not seen the chip hang or report superfluous overruns during normal operation even with a latency timer value of 1. With timer value 2 I got something like 300 opens before hitting the issue, and with value 3 I have not seen the device misbehave (yet) in a thousand or so opens. I do think that more testing is still needed before saying anything definite, but a larger timer value at least seems to mitigate the issue significantly.
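For anyone wanting to repeat this kind of open/close counting, here is a minimal sketch of how the three outcomes listed above could be told apart from the bytes received shortly after open(). The function names and the zero-run threshold are hypothetical, picked only for illustration; real payload data may legitimately contain zeros, so the threshold would need tuning for a given device.

```python
from collections import Counter

def classify_open(first_bytes, min_zero_run=16):
    """Classify the data received shortly after open().

    first_bytes:  bytes read within some timeout after opening the port.
    min_zero_run: assumed threshold; a run of at least this many NUL
                  bytes is treated as the overrun flood (behavior 2).
    """
    if not first_bytes:
        return "hang"            # behavior 3: nothing arrived at all
    longest = run = 0
    for b in first_bytes:
        run = run + 1 if b == 0 else 0   # length of current NUL run
        longest = max(longest, run)
    if longest >= min_zero_run:
        return "overrun-flood"   # behavior 2: many extraneous zeros
    return "ok"                  # behavior 1: looks like normal data

def tally(samples):
    """Count outcomes over many opens (one bytes object per open)."""
    return Counter(classify_open(s) for s in samples)
```

Feeding one captured read per open into `tally()` gives a per-outcome count that can be compared across latency timer values.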

BTW, in case nobody else is ever experiencing this issue, please note that I cannot guarantee in any way that the FT232RL in my device is actually authentic. If it is counterfeit, it is a different one than the one that was having the issue with the Windows driver lately. My device doesn't seem to have that bug, but that is no guarantee that it is the real deal. And obviously, real or not, it *does* have some bug that causes it to misbehave during open().

So, tentatively, it seems that in order to get rid of the issue with at least this FT232 variant (whatever it may happen to be), either the minimum latency timer value should be increased, or alternatively the chip could be reset with a higher value and the actual value set later, once the chip has started properly. I don't yet know for sure which latency value would work 100% of the time, or whether the alternative idea would actually work at all (I just thought about trying something like that).
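For experimenting with different latency timer values from user space, a minimal sketch follows. It assumes the ftdi_sio driver exposes a `latency_timer` attribute under /sys/bus/usb-serial/devices/<tty>/ and that valid values are 1 to 255 ms; both are assumptions to check on the system at hand, and the helper names are hypothetical. Writing the attribute normally requires root.

```python
from pathlib import Path

def latency_timer_path(tty_name):
    # Assumed sysfs location of the ftdi_sio latency timer attribute.
    return Path("/sys/bus/usb-serial/devices") / tty_name / "latency_timer"

def read_latency_timer(path):
    """Return the current latency timer value (milliseconds)."""
    return int(Path(path).read_text().strip())

def write_latency_timer(path, ms):
    """Write a new latency timer value; 1..255 is an assumed valid range."""
    if not 1 <= ms <= 255:
        raise ValueError("latency timer must be in 1..255 ms")
    Path(path).write_text(f"{ms}\n")
```

For example, `write_latency_timer(latency_timer_path("ttyUSB0"), 3)` would request the value 3 that has not shown the problem so far in my testing; `ttyUSB0` is just a placeholder device name.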
usbserial / ftdi_sio (+ others) bug?
I own a device that implements a data logging interface using the FT232 USB-serial chip. Very often, connecting the associated software to the device requires multiple attempts. There seem to be two kinds of issues: either the program reports that it did not receive any data, or it reports reading lots of data that was all invalid. I haven't yet looked at the former, but I did spend some time investigating the latter.

A simple strace of the program startup showed that when connecting fails, the program gets a lot (hundreds) of binary zeros while reading the device. I used usbmon to capture the traffic between the host and the device, and the zeros are not, strictly speaking, coming from the device. However, when this problem happens the device seems to report quite a lot of overruns for a while, which was a clue.

After a somewhat successful attempt to understand the operation of the tty code in Linux, I have a theory. The usbserial driver sets the TTY_DRIVER_REAL_RAW flag. Based on the comment in tty_driver.h this implies that the driver is not supposed to report any statuses (including overruns) to the ldisc if they are ignored by the application (like they are in this case). It's just that AFAICS the ftdi_sio subdriver (and many others) doesn't quite seem to honor this, but reports any status unconditionally. Also AFAICS this then means that every overrun gets converted into a single binary zero delivered to the application(?).

If so, this probably isn't what is supposed to happen, and it would explain the flood of extraneous zeros the application was seeing when connecting failed. I haven't yet had the time to test this theory, but at least it seems plausible to me. Any thoughts, anybody?
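To make the condition concrete, here is a minimal sketch of the check as I read the TTY_DRIVER_REAL_RAW comment in tty_driver.h: the driver may skip status reporting when ((IGNBRK || (!BRKINT && !PARMRK)) && (IGNPAR || !INPCK)). The expression is reproduced from my reading of that comment and should be verified against the header; the function name is hypothetical. As far as I can tell the comment is worded in terms of break and parity only, so whether overruns are covered is part of the question.

```python
import termios

def driver_may_skip_status(iflag):
    """Evaluate the REAL_RAW condition on a termios c_iflag value.

    Returns True when, per the assumed reading of the tty_driver.h
    comment, the application has no interest in break/parity status.
    """
    # IGNBRK || (!BRKINT && !PARMRK)
    ignores_break = bool(iflag & termios.IGNBRK) or not (
        iflag & (termios.BRKINT | termios.PARMRK)
    )
    # IGNPAR || !INPCK
    ignores_parity = bool(iflag & termios.IGNPAR) or not (
        iflag & termios.INPCK
    )
    return ignores_break and ignores_parity
```

The `iflag` argument would be the first element of the list returned by `termios.tcgetattr(fd)` for the open port.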
Re: usbserial / ftdi_sio (+ others) bug?
On Wed, Oct 29, 2014 at 10:51 AM, Johan Hovold jo...@kernel.org wrote:

> Having the driver not reporting overrun (and other) errors will obviously not fix the underlying issue with your device, which is generating all these errors in the first place.

Yes, although that might be related to the other fault I have been seeing where the program reports receiving no data whatsoever. I'll have to take a look at that too when I have the time.