Re: serial flow control appears broken
2.2.5 is using the same UART setup (trigger level of 8) as the current code. There is no obvious difference in the interrupt setup (same devices on the same interrupts). So I have no helpful suggestions :-( -- Paul - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: serial flow control appears broken
Paul Fulghum wrote:
> Lee Howard wrote:
>> And in repeat tests it is quite evident that IDE disk activity is,
>> indeed, at least part of the problem. As IDE disk activity
>> increases an increased amount of data coming in on the serial port
>> goes missing.
>
> Lee, you mentioned 2.2.x kernels did not exhibit this problem.
> Was this on the same hardware you are currently testing?

Yes it was... except for the hard drive. I have different installs of
different operating systems on different hard drives. I change the
hard drive when switching between 2.2.5 and 2.6.5.

> Which 2.2.x version were you using?

The default 2.2.5 kernel that comes with RedHat 6.0.

> Was the 2.2.x serial driver also identifying the UART as a 16550A?

Yes it does.

> Can you get /proc/interrupts output from both the current
> setup and the 2.2.x setup?

Current (2.6.5):

           CPU0
  0:   14660696    XT-PIC  timer
  1:          8    XT-PIC  i8042
  2:          0    XT-PIC  cascade
  3:    1240314    XT-PIC  serial
  4:     778901    XT-PIC  serial
  8:          1    XT-PIC  rtc
 10:     111647    XT-PIC  eth0
 14:     221202    XT-PIC  ide0
 15:         34    XT-PIC  ide1
NMI:          0
ERR:          5

(2.2.5):

           CPU0
  0:       5908    XT-PIC  timer
  1:         88    XT-PIC  i8042
  2:          0    XT-PIC  cascade
  8:          2    XT-PIC  rtc
 10:         38    XT-PIC  Intel EtherExpress Pro 10/100 Ethernet
 13:          1    XT-PIC  fpu
 14:      36637    XT-PIC  ide0
 15:          4    XT-PIC  ide1
NMI:          0

Thanks,

Lee.
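For anyone who wants to compare the two /proc/interrupts listings above mechanically, here is a minimal sketch in C that pulls the IRQ label and the first CPU's count out of one line. The function name parse_irq_line is made up for illustration, and the sketch assumes a single-CPU layout like the one shown in this message; it ignores the controller and device columns.

```c
#include <assert.h>
#include <stdio.h>

/* Parse one line of /proc/interrupts (single-CPU layout assumed).
 * On success, stores the text before the colon (e.g. "3" or "NMI")
 * in label and the first counter in *count, and returns 1.
 * Returns 0 for lines without a "label: count" shape, such as the
 * "CPU0" header. parse_irq_line is a hypothetical helper name. */
int parse_irq_line(const char *line, char label[16], unsigned long *count)
{
    assert(line != NULL && label != NULL && count != NULL);
    /* " %15[^:]" skips leading blanks and reads up to the colon;
     * "%lu" then skips blanks and reads the decimal counter. */
    return sscanf(line, " %15[^:]:%lu", label, count) == 2;
}
```

Feeding it each line returned by fgets() on /proc/interrupts, before and after a test run, gives per-IRQ deltas for the serial (3, 4) and IDE (14, 15) lines.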
Re: serial flow control appears broken
Lee Howard wrote:
> And in repeat tests it is quite evident that IDE disk activity is,
> indeed, at least part of the problem. As IDE disk activity
> increases an increased amount of data coming in on the serial port
> goes missing.

Lee, you mentioned 2.2.x kernels did not exhibit this problem.
Was this on the same hardware you are currently testing?

Which 2.2.x version were you using?

Was the 2.2.x serial driver also identifying the UART as a 16550A?

Can you get /proc/interrupts output from both the current
setup and the 2.2.x setup?

It would be interesting to compare the interrupt assignment
and UART setup between the versions.

-- 
Paul
Re: serial flow control appears broken
Mark Lord wrote:
> The "fix" could be to have the serial IRQ handler never unmask
> interrupts, but that's a bit unsociable to others.
>
> The IDE stuff really needs to not do so much during the actual IRQ
> handler. Ingo's RT patches would probably fix all of this.

I did a Fedora 7 installation and installed Ingo's kernel from here:

http://people.redhat.com/mingo/realtime-preempt/yum-testing/yum/i686/kernel-rt-2.6.21-0182.rt11cfsv17.i686.rpm

Even then, the problem still occurs, unfortunately.

Thanks,

Lee.
Re: serial flow control appears broken
Ray Lee wrote:
> On 7/27/07, Lee Howard <[EMAIL PROTECTED]> wrote:
>> Curiously, the session at 38400 bps that skipped 858 bytes...
>> coincided, not just in sequence but also in precise timing within
>> the session, with a small but noticeable disk load that I caused by
>> grepping through a hundred session logs. (I can't reproduce it
>> easily, though, because of disk caching.)
>
> `echo 1 > /proc/sys/vm/drop_caches` will clear out most (all?) of
> what the kernel has cached from the drive. It's there just for this
> kind of repeatability of tests...

And in repeat tests it is quite evident that IDE disk activity is,
indeed, at least part of the problem. As IDE disk activity increases
an increased amount of data coming in on the serial port goes missing.

Thanks,

Lee.
Re: serial flow control appears broken
Maciej W. Rozycki wrote:
> On Fri, 27 Jul 2007, Lee Howard wrote:
>> Okay, so let's say we've got a loop around a blocking read on the
>> modem file descriptor...
>>
>>     for (;;) {
>>         read some data from modem
>>         process data from modem
>>         if (end-of-data detected) break;
>>     }
>>
>> Are you suggesting that the application should be deasserting RTS
>> after the read and asserting it before?
>
> It certainly could -- you were asking how it would know. ;-)

So, to test... I put this in the application before every read:

    int flags;
    ioctl(modemFd, TIOCMGET, &flags);
    flags |= TIOCM_RTS;
    ioctl(modemFd, TIOCMSET, &flags);

and this after:

    int flags;
    ioctl(modemFd, TIOCMGET, &flags);
    flags &= ~TIOCM_RTS;
    ioctl(modemFd, TIOCMSET, &flags);

Now I can see the RTS light blink on the modem (and during heavy
communication it merely "dims" depending on the amount of delay in
the processing). However, it does not help. Data still goes missing.

Thanks,

Lee.
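The RTS test described in this message can also be written with the TIOCMBIS and TIOCMBIC ioctls, which set or clear individual modem-control bits without a separate TIOCMGET step. The sketch below is only an illustration of that variant: set_rts and read_with_rts are hypothetical helper names, not functions from the application discussed in the thread, and the fd is assumed to be an already-opened serial port.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Assert (on != 0) or deassert (on == 0) RTS on an open serial port.
 * TIOCMBIS sets and TIOCMBIC clears the bits given in the mask.
 * Returns 0 on success, -1 with errno set on failure. */
int set_rts(int fd, int on)
{
    int bits = TIOCM_RTS;
    return ioctl(fd, on ? TIOCMBIS : TIOCMBIC, &bits);
}

/* Raise RTS, do one blocking read, then drop RTS again.
 * Returns what read() returned, or -1 if RTS could not be raised. */
ssize_t read_with_rts(int fd, void *buf, size_t len)
{
    assert(buf != NULL);
    if (set_rts(fd, 1) < 0)
        return -1;
    ssize_t n = read(fd, buf, len);
    int saved = errno;      /* keep read()'s errno across the ioctl */
    (void)set_rts(fd, 0);   /* best effort; the read result matters more */
    errno = saved;
    return n;
}
```

As the message reports, toggling RTS from the application did not stop the data loss, so this is a way to reproduce the test rather than a fix.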
Re: serial flow control appears broken
On Thu, 2 Aug 2007, Alan Cox wrote:
> Currently libata PIO is mostly done in the IRQ path. Albert Lee was
> doing some work on that but it's actually very hard to fix without
> doing polled PIO.

Hmm, when the drive signals it is ready for a PIO data transfer,
can't the interrupt handler just mask the originating interrupt and
post a softirq to handle the case? That should be rather
straightforward.

  Maciej
Re: serial flow control appears broken
> That's what "hdparm -u1" (or -u0) controls.

Only some of the time.

> Ingo's RT patches would probably fix all of this.

The worst case IDE times we've seen for executing a single
indivisible un-interruptible I/O cycle with a drive are around 1 ms.
That's a hardware limit.

Alan
Re: serial flow control appears broken
Alan Cox wrote:
>> I think that PIO transfers only have to be done with interrupts
>> disabled on really old, evil controllers (without unmask set). I
>> don't think libata ever disables interrupts during transfers(?)
>
> Currently libata PIO is mostly done in the IRQ path. Albert Lee was
> doing some work on that but it's actually very hard to fix without
> doing polled PIO.

Ah, right. Misread the code.
Re: serial flow control appears broken
> I think that PIO transfers only have to be done with interrupts
> disabled on really old, evil controllers (without unmask set). I
> don't think libata ever disables interrupts during transfers(?)

Currently libata PIO is mostly done in the IRQ path. Albert Lee was
doing some work on that but it's actually very hard to fix without
doing polled PIO.
Re: serial flow control appears broken
Mark Lord wrote:
>> I think that PIO transfers only have to be done with interrupts
>> disabled on really old, evil controllers (without unmask set). I
>> don't think libata ever disables interrupts during transfers(?)
>
> That's what "hdparm -u1" (or -u0) controls.
>
> But it doesn't matter a whit here. The problem is that the IDE
> interrupt handling can take a long time, regardless of whether it
> unmasks IRQs or not. And if that IDE interrupt interrupts a serial
> interrupt, then the serial stuff won't get handled until the IDE
> stuff completes. Thus the problem.
>
> The "fix" could be to have the serial IRQ handler never unmask
> interrupts, but that's a bit unsociable to others.
>
> The IDE stuff really needs to not do so much during the actual IRQ
> handler. Ingo's RT patches would probably fix all of this.

libata also doesn't do the actual PIO transfer from the interrupt
handler like old IDE does, either, and it only disables interrupts
for the transfer if it's transferring to/from high memory.

-- 
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
Re: serial flow control appears broken
Robert Hancock wrote:
> Mark Lord wrote:
>> I don't believe the speed of the machine has much to do with it,
>> as IDE PIO is always at pretty much the same speed (or slower)
>> regardless of the CPU speed.
>>
>> Best case is about .120 usec per 16-bit word, but that doesn't
>> often pan out in practice. More typical is something closer to
>> 1 usec per 16-bit word.
>>
>> So, for multcount=16 (very common), best case is
>> 16 * 256 * .120 = 491 usec, plus extra overhead for reading the
>> IDE status register (another usec or so), and other stuff.
>> Figure maybe 500 usec total per interrupt for multcount=16 in the
>> best case, or 4000 usec in the worst case.
>>
>> At 115200bps, we get a byte every 86 usec or so. Assuming the
>> UART FIFO is set to interrupt (warn) us at 12/16 full, we have
>> 4*86 = 344 usec to respond and de-assert RTS. Less than that in
>> practice.
>>
>> Conclusion: using IDE multisector PIO is not a good idea with
>> high speed serial transfers happening, since we cannot respond
>> quickly enough. It might be possible to set the buffer underrun
>> threshold lower in the UART (?).
>>
>> All that said, I doubt that his system is using IDE PIO in the
>> first place. Dunno how long IDE DMA interrupts take, but it's
>> probably in the 20-50 usec range.
>
> I think that PIO transfers only have to be done with interrupts
> disabled on really old, evil controllers (without unmask set). I
> don't think libata ever disables interrupts during transfers(?)

That's what "hdparm -u1" (or -u0) controls.

But it doesn't matter a whit here. The problem is that the IDE
interrupt handling can take a long time, regardless of whether it
unmasks IRQs or not. And if that IDE interrupt interrupts a serial
interrupt, then the serial stuff won't get handled until the IDE
stuff completes. Thus the problem.

The "fix" could be to have the serial IRQ handler never unmask
interrupts, but that's a bit unsociable to others.

The IDE stuff really needs to not do so much during the actual IRQ
handler. Ingo's RT patches would probably fix all of this.
Re: serial flow control appears broken
Mark Lord wrote:
> I don't believe the speed of the machine has much to do with it,
> as IDE PIO is always at pretty much the same speed (or slower)
> regardless of the CPU speed.
>
> Best case is about .120 usec per 16-bit word, but that doesn't
> often pan out in practice. More typical is something closer to
> 1 usec per 16-bit word.
>
> So, for multcount=16 (very common), best case is
> 16 * 256 * .120 = 491 usec, plus extra overhead for reading the
> IDE status register (another usec or so), and other stuff.
> Figure maybe 500 usec total per interrupt for multcount=16 in the
> best case, or 4000 usec in the worst case.
>
> At 115200bps, we get a byte every 86 usec or so. Assuming the
> UART FIFO is set to interrupt (warn) us at 12/16 full, we have
> 4*86 = 344 usec to respond and de-assert RTS. Less than that in
> practice.
>
> Conclusion: using IDE multisector PIO is not a good idea with
> high speed serial transfers happening, since we cannot respond
> quickly enough. It might be possible to set the buffer underrun
> threshold lower in the UART (?).
>
> All that said, I doubt that his system is using IDE PIO in the
> first place. Dunno how long IDE DMA interrupts take, but it's
> probably in the 20-50 usec range.

I think that PIO transfers only have to be done with interrupts
disabled on really old, evil controllers (without unmask set). I
don't think libata ever disables interrupts during transfers(?)

-- 
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
Re: serial flow control appears broken
Maciej W. Rozycki wrote:
> On Sat, 28 Jul 2007, Russell King wrote:
>> Essentially, any complex interrupt handler (such as an IDE
>> interrupt doing a multi-sector PIO transfer _in interrupt
>> context_) can cause this kind of starvation. That's why Linux 1.x
>> had bottom halves - so that the time consuming work could be moved
>> out of the interrupt handler, thereby causing minimal blockage of
>> other interrupts.
>>
>> Unfortunately, that kind of design has been long since forgotten.
>> Apparently modern machines are fast enough that it doesn't have to
>> be worried about anymore... Or are they?
>
> I would guess it is not that the machines are fast enough, but that
> this two-level processing makes things more complicated. Enough
> that most people would not bother digging into it unless really
> forced. Only occasional latency problems are probably not enough of
> a force.

I don't believe the speed of the machine has much to do with it,
as IDE PIO is always at pretty much the same speed (or slower)
regardless of the CPU speed.

Best case is about .120 usec per 16-bit word, but that doesn't
often pan out in practice. More typical is something closer to
1 usec per 16-bit word.

So, for multcount=16 (very common), best case is
16 * 256 * .120 = 491 usec, plus extra overhead for reading the
IDE status register (another usec or so), and other stuff.
Figure maybe 500 usec total per interrupt for multcount=16 in the
best case, or 4000 usec in the worst case.

At 115200bps, we get a byte every 86 usec or so. Assuming the
UART FIFO is set to interrupt (warn) us at 12/16 full, we have
4*86 = 344 usec to respond and de-assert RTS. Less than that in
practice.

Conclusion: using IDE multisector PIO is not a good idea with
high speed serial transfers happening, since we cannot respond
quickly enough. It might be possible to set the buffer underrun
threshold lower in the UART (?).

All that said, I doubt that his system is using IDE PIO in the
first place. Dunno how long IDE DMA interrupts take, but it's
probably in the 20-50 usec range.
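The arithmetic in the message above can be checked with a few lines of C. This is a rough sketch, not a model of any driver: the function names are made up for illustration, and the byte time assumes 8N1 framing (10 bits on the wire per byte), which the message does not state. With that assumption the byte time comes out at about 86.8 usec, so the headroom is about 347 usec rather than the rounded 344 usec quoted.

```c
#include <assert.h>

/* Time to move `sectors` 512-byte sectors by PIO, in microseconds.
 * A sector is 256 16-bit words; usec_per_word is the per-word cost
 * from the message (.120 best case, about 1.0 typical). */
double pio_burst_usec(unsigned sectors, double usec_per_word)
{
    return sectors * 256.0 * usec_per_word;
}

/* Time one byte occupies the serial line, in microseconds.
 * Assumes 8N1 framing: start bit + 8 data bits + stop bit. */
double byte_time_usec(unsigned baud)
{
    assert(baud > 0);
    return 10.0 * 1e6 / baud;
}

/* Approximate time between the receive-FIFO trigger interrupt and
 * the FIFO filling up, given the FIFO depth and trigger level. */
double fifo_headroom_usec(unsigned fifo_depth, unsigned trigger, unsigned baud)
{
    assert(trigger <= fifo_depth);
    return (fifo_depth - trigger) * byte_time_usec(baud);
}
```

For example, pio_burst_usec(16, 0.120) is about 491.5 usec and pio_burst_usec(16, 1.0) is 4096 usec, while fifo_headroom_usec(16, 12, 115200) is about 347 usec. With the trigger level of 8 mentioned elsewhere in the thread, the headroom roughly doubles to about 694 usec, which covers the best-case burst but not the typical one.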
Re: serial flow control appears broken
On Mon, Jul 30, 2007 at 10:45:19AM +0100, Maciej W. Rozycki wrote:
> On Sat, 28 Jul 2007, Russell King wrote:
>
> > Essentially, any complex interrupt handler (such as an IDE interrupt
> > doing a multi-sector PIO transfer _in interrupt context_) can cause this
> > kind of starvation. That's why Linux 1.x had bottom halves - so that
> > the time consuming work could be moved out of the interrupt handler,
> > thereby causing minimal blockage of other interrupts.
> >
> > Unfortunately, that kind of design has been long since forgotten.
> > Apparently modern machines are fast enough that it doesn't have to be
> > worried about anymore... Or are they?
>
> I would guess it is not that the machines are fast enough, but that this
> two-level processing makes things more complicated. Enough that most
> people would not bother digging into it unless really forced. Only
> occasional latency problems are probably not enough of a force.

It's a shame we don't have a way to measure IRQ latency - it would be very useful to flag up problems. I think the best we could do is to arrange for the timer interrupt to complain if it's delayed by more than 1ms or so - but some architectures already run their timers with IRQF_DISABLED as a workaround for some of the latency issues.

--
Russell King
 Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
 maintainer of:
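The timer-based check suggested above boils down to comparing the gap between consecutive ticks against the nominal period plus a tolerance. The following is only a userspace sketch of that comparison, not kernel code: count_late_ticks() and its parameters are hypothetical names, the timestamps are assumed to be in microseconds and non-decreasing, and nothing here reflects how any architecture's timer handler is actually written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: given n tick timestamps (microseconds, assumed
 * non-decreasing), report every tick whose gap to the previous tick
 * exceeds the nominal period by more than limit_us.
 * Returns the number of late ticks found.
 */
static size_t count_late_ticks(const uint64_t *stamp_us, size_t n,
                               uint64_t period_us, uint64_t limit_us)
{
    size_t late = 0;

    assert(stamp_us != NULL || n == 0);

    for (size_t i = 1; i < n; i++) {
        uint64_t gap = stamp_us[i] - stamp_us[i - 1];

        if (gap > period_us + limit_us) {
            /* Keep the report minimal, as one would in a real handler. */
            printf("tick %zu late by %llu us\n", i,
                   (unsigned long long)(gap - period_us));
            late++;
        }
    }
    return late;
}
```

With a 1000 us period and a 1000 us limit, a tick that arrives 2500 us after its predecessor is reported as 1500 us late, while ticks spaced exactly one period apart are not reported.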
Re: serial flow control appears broken
On Sat, 28 Jul 2007, Russell King wrote:

> Essentially, any complex interrupt handler (such as an IDE interrupt
> doing a multi-sector PIO transfer _in interrupt context_) can cause this
> kind of starvation. That's why Linux 1.x had bottom halves - so that
> the time consuming work could be moved out of the interrupt handler,
> thereby causing minimal blockage of other interrupts.
>
> Unfortunately, that kind of design has been long since forgotten.
> Apparently modern machines are fast enough that it doesn't have to be
> worried about anymore... Or are they?

I would guess it is not that the machines are fast enough, but that this two-level processing makes things more complicated. Enough that most people would not bother digging into it unless really forced. Only occasional latency problems are probably not enough of a force.

  Maciej
Re: serial flow control appears broken
On Fri, 27 Jul 2007, Paul Fulghum wrote:

> I can't see anyplace in serial_core.c or 8250.c that sets TTY_OVERRUN.

Look for UART_LSR_OE in 8250.c -- the serial core accepts any bit that has been defined by the low-level driver and sets TTY_OVERRUN in uart_insert_char().

  Maciej
Re: serial flow control appears broken
On Fri, 27 Jul 2007, Lee Howard wrote:

> >The serial drivers have nothing to do about it -- all they can do is pushing
> >data upstream, to the discipline driver. They can provide an interface to
> >hardware flow control features though, if implemented by a given UART.
> >
>
> Thank you for this clarification. So I should have more correctly been saying
> that "tty flow control appears broken". Right?

Probably. It might be, as Alan suggested, that it is meant to work, but the latency kills it.

  Maciej
Re: serial flow control appears broken
On Fri, 27 Jul 2007, Robert Hancock wrote:

> > The TTY line discipline driver could do that based on the amount of received
> > data present in its buffer. And it should if asked to (a brief look at
> > drivers/char/n_tty.c reveals it does; obviously there may be a bug
>
> Really, where? In my look through the code I haven't found any mechanism that
> would result in RTS being lowered based on TTY buffers filling up, at least
> not in the 8250 case.

Look for calls to ->throttle() and ->unthrottle(). XON and XOFF might be used instead as a result of these calls though, depending on terminal settings.

> In this situation, though, it appears it's not the TTY buffers that are
> filling but the UART's own buffer. I would think this must be caused by some
> kind of interrupt latency that results in not draining the FIFO in time.

Well, the UART only has its FIFO which is rather small, so automatic flow control would be useful. Though, admittedly, tty_insert_flip_char() might return some kind of a status related to how much space is left in the receive buffer which would indicate that there is a lag in data stream processing -- which in turn may relate to the system being loaded, so that the receive ISR could decide whether to negate RTS itself for the less capable UARTs (i.e. ones with no autoflow and a tiny or no FIFO).

  Maciej
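The idea of deciding on flow control from the amount of space left in the receive buffer can be illustrated with a watermark scheme. This is only a sketch under stated assumptions: struct rx_gate and its functions are hypothetical names invented for illustration, the high/low marks are arbitrary, and the code models neither n_tty nor the 8250 driver. The "throttled" flag stands in for whatever the real ->throttle()/->unthrottle() calls would do (negate RTS or send XOFF).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a receive buffer with throttle watermarks. */
struct rx_gate {
    size_t capacity;   /* total buffer space in bytes */
    size_t high_mark;  /* throttle once this many bytes are queued */
    size_t low_mark;   /* unthrottle once the queue drains to this level */
    size_t used;       /* bytes currently queued */
    bool throttled;    /* true while the sender should be held off */
};

static void rx_gate_init(struct rx_gate *g, size_t capacity,
                         size_t high_mark, size_t low_mark)
{
    assert(g != NULL);
    assert(low_mark < high_mark && high_mark <= capacity);
    g->capacity = capacity;
    g->high_mark = high_mark;
    g->low_mark = low_mark;
    g->used = 0;
    g->throttled = false;
}

/* Receive side: queue one byte. Returns false if the byte had to be dropped. */
static bool rx_gate_push(struct rx_gate *g)
{
    if (g->used == g->capacity)
        return false;                 /* buffer overrun: data is lost */
    g->used++;
    if (!g->throttled && g->used >= g->high_mark)
        g->throttled = true;          /* here one would negate RTS or send XOFF */
    return true;
}

/* Reader side: consume up to n bytes and release the sender when drained. */
static void rx_gate_pop(struct rx_gate *g, size_t n)
{
    if (n > g->used)
        n = g->used;
    g->used -= n;
    if (g->throttled && g->used <= g->low_mark)
        g->throttled = false;         /* here one would assert RTS or send XON */
}
```

Using two marks rather than one avoids toggling the flow-control line on every byte when the queue hovers around a single threshold. Note that this only protects the software buffer; it does nothing for an overrun of the UART's own FIFO, which is the case being discussed in this thread.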
Re: serial flow control appears broken
On 7/27/07, Lee Howard <[EMAIL PROTECTED]> wrote:
> Curiously, the session at 38400 bps that skipped 858 bytes... coincided,
> not just in sequence but also in precise timing within the session, with
> a small but noticeable disk load that I caused by grepping through a
> hundred session logs. (I can't reproduce it easily, though, because of
> disk caching.)

`echo 1 > /proc/sys/vm/drop_caches` will clear out most (all?) of what the kernel has cached from the drive. It's there just for this kind of repeatability of tests...

Ray
Re: serial flow control appears broken
Alan Cox wrote:
> > Curiously, the session at 38400 bps that skipped 858 bytes... coincided,
> > not just in sequence but also in precice timing within the session, with
> > a small but noticeable disk load that I caused by grepping through a
> > hundred session logs. (I can't reproduce it easily, though, because of
> > disk caching.)
>
> Can you send me a dmesg, there are some cases when high disk load can
> cause high interrupt latency in both 2.2 and 2.6 depending upon what is
> configured.

I've attached dmesg output. The os version I used yesterday to run those tests was Debian 4.0r0 (kernel 2.6.18-4-686). It's still running, and that's where I give you this dmesg output from.

> I don't think thats related to the main problem but it is worth knowing
> about hdparm -u1

# hdparm -u1 /dev/hda

/dev/hda:
 setting unmaskirq to 1 (on)
 unmaskirq = 1 (on)
#

After doing this I re-ran the 5 test sends at 115200 bps. The number of lost bytes were: 0, 14, 8, 0, and 3. Compared with yesterday's 63, 5, 44, 48, and 2 this may indicate an improvement. Note also that in the 4th session where no bytes were lost there was still one element of corrupt data as detected by the image decoder.

Thanks,

Lee.

Linux version 2.6.18-4-686 (Debian 2.6.18.dfsg.1-12) ([EMAIL PROTECTED]) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #1 SMP Mon Mar 26 17:17:36 UTC 2007
BIOS-provided physical RAM map:
BIOS-e820: - 000a (usable)
BIOS-e820: 000f - 0010 (reserved)
BIOS-e820: 0010 - 0e00 (usable)
BIOS-e820: - 0001 (reserved)
0MB HIGHMEM available.
224MB LOWMEM available.
On node 0 totalpages: 57344
DMA zone: 4096 pages, LIFO batch:0
Normal zone: 53248 pages, LIFO batch:15
DMI 2.0 present.
ACPI: Unable to locate RSDP
Allocating PCI resources starting at 1000 (gap: 0e00:f1ff)
Detected 400.953 MHz processor.
Built 1 zonelists. Total pages: 57344
Kernel command line: root=/dev/hda3 ro
Local APIC disabled by BIOS -- you can enable it with "lapic"
mapped APIC to d000 (011c9000)
Enabling fast FPU save and restore... done.
Initializing CPU#0
PID hash table entries: 1024 (order: 10, 4096 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Memory: 219828k/229376k available (1544k kernel code, 9052k reserved, 577k data, 196k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 802.59 BogoMIPS (lpj=1605193)
Security Framework v1.0.0 initialized
SELinux: Disabled at boot.
Capability LSM initialized
Mount-cache hash table entries: 512
CPU: After generic identify, caps: 0183f9ff
CPU: After vendor identify, caps: 0183f9ff
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 512K
CPU: After all inits, caps: 0183f9ff 0040
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Compat vDSO mapped to e000.
Checking 'hlt' instruction... OK.
SMP alternatives: switching to UP code
Freeing SMP alternatives: 16k freed
CPU0: Intel Pentium II (Deschutes) stepping 03
SMP motherboard not detected.
Local APIC not detected. Using dummy APIC emulation.
Brought up 1 CPUs
migration_cost=0
checking if image is initramfs... it is
Freeing initrd memory: 4375k freed
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xfb4a0, last bus=1
PCI: Using configuration type 1
Setting up standard PCI resources
ACPI: Interpreter disabled.
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI: disabled
PnPBIOS: Scanning system for PnP BIOS support...
PnPBIOS: Found PnP BIOS installation structure at 0xc00fc0f0
PnPBIOS: PnP BIOS version 1.0, entry 0xf:0xc118, dseg 0xf
PnPBIOS: 14 nodes reported by PnP BIOS; 14 recorded by driver
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
PCI: Firmware left :00:0b.0 e100 interrupts enabled, disabling
Boot video device is :01:00.0
PCI: Bridge: :00:01.0
IO window: c000-cfff
MEM window: e400-e5ff
PREFETCH window: e700-e77f
PCI: Setting latency timer of device :00:01.0 to 64
NET: Registered protocol family 2
IP route cache hash table entries: 2048 (order: 1, 8192 bytes)
TCP established hash table entries: 8192 (order: 4, 65536 bytes)
TCP bind hash table entries: 4096 (order: 3, 32768 bytes)
TCP: Hash tables configured (established 8192 bind 4096)
TCP reno registered
audit: initializing netlink socket (disabled)
audit(1185563327.604:1): initialized
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
Initializing Cryptographic API
io scheduler noop registered
io scheduler anticipatory
Re: serial flow control appears broken
> Curiously, the session at 38400 bps that skipped 858 bytes... coincided,
> not just in sequence but also in precise timing within the session, with
> a small but noticeable disk load that I caused by grepping through a
> hundred session logs. (I can't reproduce it easily, though, because of
> disk caching.)

Can you send me a dmesg? There are some cases when high disk load can cause high interrupt latency in both 2.2 and 2.6 depending upon what is configured. I don't think that's related to the main problem but it is worth knowing about hdparm -u1.

> as it leaves the DCE. I mention this in case there is any limitation to
> how the 8250 driver performs when two modems are being run simultaneously.

It means more load but that shouldn't matter much, and the transmit side, if under load with asynchronous traffic, will not lose bytes sending.

Alan
Re: serial flow control appears broken
On Fri, Jul 27, 2007 at 09:51:25PM -0700, Lee Howard wrote:
> Curiously, the session at 38400 bps that skipped 858 bytes... coincided,
> not just in sequence but also in precise timing within the session, with
> a small but noticeable disk load that I caused by grepping through a
> hundred session logs. (I can't reproduce it easily, though, because of
> disk caching.)

If you have other parts of the system which run with IRQs disabled for a significant time period, then you will get serial corruption. That's not the serial driver's fault - that's a problem with the other device drivers/rest of the system.

You may be able to track down where IRQs are being held off for too long by hooking into the 8250 interrupt handler, and when an overrun error is reported, printk a _minimal_ message reporting the instruction pointer obtained via get_irq_regs().

Note, however, that I don't actively maintain serial anymore.

--
Russell King
 Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
 maintainer of:
Re: serial flow control appears broken
On Fri, Jul 27, 2007 at 12:22:57PM -0600, Robert Hancock wrote:
> Maciej W. Rozycki wrote:
> > The TTY line discipline driver could do that based on the amount of
> >received data present in its buffer. And it should if asked to (a brief
> >look at drivers/char/n_tty.c reveals it does; obviously there may be a bug
>
> Really, where? In my look through the code I haven't found any mechanism
> that would result in RTS being lowered based on TTY buffers filling up,
> at least not in the 8250 case.

That's something for the line discipline to decide.

> In this situation, though, it appears it's not the TTY buffers that are
> filling but the UART's own buffer. I would think this must be caused by
> some kind of interrupt latency that results in not draining the FIFO in
> time.

Correct, and a suggested approach to tracking down the culprit has been mentioned in a previous email.

Also note that there's nothing the serial driver can do to detect this condition before it occurs. The problem occurs because the serial driver is starved of CPU time due to other parts of the system, and the driver has precisely zero knowledge as to when that's going to happen.

There are two possible scenarios when such starvation can occur:

1. interrupts are disabled for a long period.
2. the serial interrupt has started to run, but has been interrupted by _another_ interrupt which runs for a long period.

Essentially, any complex interrupt handler (such as an IDE interrupt doing a multi-sector PIO transfer _in interrupt context_) can cause this kind of starvation. That's why Linux 1.x had bottom halves - so that the time consuming work could be moved out of the interrupt handler, thereby causing minimal blockage of other interrupts.

Unfortunately, that kind of design has been long since forgotten. Apparently modern machines are fast enough that it doesn't have to be worried about anymore... Or are they?

--
Russell King
 Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
 maintainer of:
Re: serial flow control appears broken
Paul Fulghum wrote:
> So this seems to be a latency issue reading the receive FIFO in the ISR. The current rx FIFO trigger level should be 8 bytes (UART_FCR_R_TRIG_10) which gives the ISR 694usec to get the data at 115200bps. IIRC, in 2.2.X kernels this defaulted to 4 bytes (TRIG_01) which gave a little more time to service the interrupt.
>
> How does the data rate affect the frequency of the overrun errors? Does 57600bps make them go away?

The overrun error message does not occur on every instance of data corruption. (I only just became aware of this, as I've been paying less attention to the error messages than to the corrupt data.) The data gets far more corrupted than the error messages would lead me to believe.

Since the data being sent from the fax modem to the host is identical (same image data) every time, it's easier for me to measure the effect of one bitrate over another by counting the bytes missing from the data. The image has a total of 140465 bytes. Just now I sent it 5 times each at 115200, 57600, 38400, and 19200 bps.

At 115200 bps the numbers of bytes skipped were: 63, 5, 44, 48, and 2.
At 57600 bps the numbers of bytes skipped were: 0, 1, 13, 9, and 12.
At 38400 bps the numbers of bytes skipped were: 858, 0, 0, 0, and 8.
At 19200 bps the numbers of bytes skipped were: 0, 0, 0, 0, and 0.

Curiously, the session at 38400 bps that skipped 858 bytes... coincided, not just in sequence but also in precise timing within the session, with a small but noticeable disk load that I caused by grepping through a hundred session logs. (I can't reproduce it easily, though, because of disk caching.)

And, perhaps this is relevant... the fax modem sends the data to the host by receiving it from another fax modem which is sending it. Thus, the modem on ttyS0 is sending a fax to the modem on ttyS1. Due to the error correction protocol that is performed between the two fax endpoints I can guarantee that the data is correct as it leaves the DCE. I mention this in case there is any limitation to how the 8250 driver performs when two modems are being run simultaneously.

Thanks,

Lee.
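To compare the rates on one scale, the per-session counts can be totalled and expressed as a loss rate. A small sketch, using only the figures reported in this message (5 sessions of 140465 bytes per rate); the function and array names are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

#define RUNS 5
#define IMAGE_BYTES 140465L   /* size of the test image, from the message */

/* Bytes skipped in each of the five sessions, per bit rate. */
static const int at_115200[RUNS] = { 63, 5, 44, 48, 2 };
static const int at_57600[RUNS]  = { 0, 1, 13, 9, 12 };
static const int at_38400[RUNS]  = { 858, 0, 0, 0, 8 };
static const int at_19200[RUNS]  = { 0, 0, 0, 0, 0 };

/* Sum the bytes skipped over the RUNS sessions at one bit rate. */
static long total_skipped(const int skipped[RUNS])
{
    long sum = 0;
    for (size_t i = 0; i < RUNS; i++) {
        assert(skipped[i] >= 0);
        sum += skipped[i];
    }
    return sum;
}

/* Loss in parts per million of all bytes sent at that rate (rounded down). */
static long loss_ppm(const int skipped[RUNS])
{
    return total_skipped(skipped) * 1000000L / (IMAGE_BYTES * RUNS);
}
```

The totals come to 162, 35, 866 and 0 bytes for 115200, 57600, 38400 and 19200 bps. The 38400 figure is dominated by the single 858-byte session that coincided with the disk load; without it the loss falls steadily as the rate drops.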
Re: serial flow control appears broken
On Fri, 2007-07-27 at 13:48 -0700, Lee Howard wrote:
> Here's the output:
>
> type: 4
> line: 1
> line: 760
> irq: 3
> flags: 1358954688
> xmit_fifo_size: 16
> custom_divisor: 0
> baud_base: 115200

OK, the FIFO should be enabled.

What is known:

* The error is a hardware FIFO overrun.
  - observed message is in n_tty due to driver setting TTY_OVERRUN

* The RTS/CTS flow control is not involved
  - this is done only by the ldisc in response to buffer levels
  - you verified crtscts is set
  - you did not observe RTS change when 'overflow error' logged
  - you did observe RTS change when application stopped reading

So this seems to be a latency issue reading the receive FIFO in the ISR. The current rx FIFO trigger level should be 8 bytes (UART_FCR_R_TRIG_10) which gives the ISR 694usec to get the data at 115200bps. IIRC, in 2.2.X kernels this defaulted to 4 bytes (TRIG_01) which gave a little more time to service the interrupt.

How does the data rate affect the frequency of the overrun errors? Does 57600bps make them go away?

--
Paul
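The 694usec figure can be reproduced with a little arithmetic. The sketch below assumes a 16-byte receive FIFO and 10 bits on the wire per byte (8N1: start, 8 data, stop); both are assumptions about the setup rather than something stated in the message, and the function name is made up.

```c
#include <assert.h>

/*
 * Microseconds between the receive interrupt firing at `trigger` bytes
 * and a FIFO of `fifo_size` bytes becoming full, at `baud` bits/second.
 * Assumes 10 bits on the wire per byte (start + 8 data + stop).
 */
static double isr_deadline_us(long baud, int fifo_size, int trigger)
{
    assert(baud > 0 && trigger > 0 && trigger <= fifo_size);

    const double bits_per_byte = 10.0;
    double us_per_byte = bits_per_byte * 1e6 / (double)baud;

    /* Only the bytes of headroom left after the trigger count. */
    return (fifo_size - trigger) * us_per_byte;
}
```

With a trigger level of 8 at 115200 bps this gives about 694 us; a trigger level of 4 would give about 1042 us, and a trigger level of 8 at 19200 bps about 4.2 ms.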
Re: serial flow control appears broken
Paul Fulghum wrote:
> Tilman Schmidt wrote:
> > Could this be related?
> > http://lkml.org/lkml/2007/7/18/245
> > Quote: "I've recently found (using 2.6.21.4) that configuring a serial ports (ST16654) which use the 8250 driver using setserial results in the UART's FIFOs being disabled (unless you specify autoconfig)."
>
> That would make sense. Lee's error is a hardware FIFO overrun which could occur if the FIFO is being disabled as described in your link (by trying to set the uart type with setserial).

I'm not using setserial on this port, myself. If something in init is calling setserial then I don't know about it.

That said, tests on the serial port from within the application show that xmit_fifo_size is set to 16 as it should be. I wrote up a little test app:

    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <linux/serial.h>

    struct serial_struct serial;
    ioctl(modemFd, TIOCGSERIAL, &serial);
    printf("type: %d\n", serial.type);
    printf("line: %d\n", serial.line);
    /* labelled "line", but this is the I/O port (760 = 0x2F8) */
    printf("line: %u\n", serial.port);
    printf(" irq: %d\n", serial.irq);
    printf(" flags: %d\n", serial.flags);
    printf(" xmit_fifo_size: %d\n", serial.xmit_fifo_size);
    printf(" custom_divisor: %d\n", serial.custom_divisor);
    printf(" baud_base: %d\n", serial.baud_base);
    printf(" close_delay: %u\n", serial.close_delay);
    printf(" io_type: 0x%X\n", serial.io_type);
    printf("reserved_char[0]: 0x%X\n", serial.reserved_char[0]);
    printf("hub6: %d\n", serial.hub6);
    printf("closing_wait: %u\n", serial.closing_wait);
    printf(" closing_wait2: %u\n", serial.closing_wait2);
    printf(" iomem_reg_shift: %u\n", serial.iomem_reg_shift);
    printf(" port_high: %u\n", serial.port_high);
    printf(" reserved[0]: %d\n", serial.reserved[0]);

Here's the output:

    type: 4
    line: 1
    line: 760
    irq: 3
    flags: 1358954688
    xmit_fifo_size: 16
    custom_divisor: 0
    baud_base: 115200
    close_delay: 500
    io_type: 0x0
    reserved_char[0]: 0x0
    hub6: 0
    closing_wait: 3
    closing_wait2: 0
    iomem_reg_shift: 0
    port_high: 0
    reserved[0]: 0

Thanks,

Lee.
Re: serial flow control appears broken
Tilman Schmidt wrote:
> Lee Howard wrote:
> > So, does this explain why I wouldn't have a problem at 115200 bps with kernel 2.2.5 but why I do with 2.6.5 and 2.6.18? Both hardware and software flow control work fine with 2.2.5 (meaning I don't see any error message and I don't have any data corruption), but neither works to avoid the "kernel: ttyS1: 1 input overrun(s)" and consequent data corruption issue in 2.6.5 nor 2.6.18.
> >
> > Was there some associated application change in tty handling that needed to occur between the 2.2 and 2.6 kernels to properly implement flow control?
>
> Could this be related?
> http://lkml.org/lkml/2007/7/18/245
> Quote: "I've recently found (using 2.6.21.4) that configuring a serial ports (ST16654) which use the 8250 driver using setserial results in the UART's FIFOs being disabled (unless you specify autoconfig)."

I'm not running setserial on the port, myself. But to test whether it is related, I included this code in the application:

    #include <linux/serial.h>

    struct serial_struct serial;
    ioctl(modemFd, TIOCGSERIAL, &serial);
    traceModemOp("modem xmit_fifo_size: %u", serial.xmit_fifo_size);

And I get this resulting logging:

    "MODEM modem xmit_fifo_size: 16"

So it's clear from here that the xmit_fifo_size is set correctly on this system.

Thanks,

Lee.
Re: serial flow control appears broken
Tilman Schmidt wrote:
> Could this be related?
> http://lkml.org/lkml/2007/7/18/245
> Quote: "I've recently found (using 2.6.21.4) that configuring a serial ports (ST16654) which use the 8250 driver using setserial results in the UART's FIFOs being disabled (unless you specify autoconfig)."

That would make sense. Lee's error is a hardware FIFO overrun which could occur if the FIFO is being disabled as described in your link (by trying to set the uart type with setserial).

Since the tty flow control is only triggered by the line discipline in response to ldisc buffer levels and not hardware FIFO overruns, you would never see any flow control action as reported by Lee.

--
Paul Fulghum
Microgate Systems, Ltd.
Re: serial flow control appears broken
OK, I see where TTY_OVERRUN is set:

    include/linux/serial_core.h:uart_insert_char()

--
Paul Fulghum
Microgate Systems, Ltd
Re: serial flow control appears broken
On Fri, 2007-07-27 at 12:22 -0600, Robert Hancock wrote:
> In this situation, though, it appears it's not the TTY buffers that are
> filling but the UART's own buffer. I would think this must be caused by
> some kind of interrupt latency that results in not draining the FIFO in
> time.

You are right, this error is output when the character flag TTY_OVERRUN is encountered by n_tty.c which should be set by the driver in response to a hardware FIFO overrun (not an ldisc buffer overrun).

I can't see anyplace in serial_core.c or 8250.c that sets TTY_OVERRUN.

--
Paul Fulghum
Microgate Systems, Ltd
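From user space, the two kinds of overrun can be told apart without watching the kernel log, by reading the driver's counters with the TIOCGICOUNT ioctl. A sketch, assuming the port's driver implements that ioctl (not every driver does); the helper name is made up:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/serial.h>

/*
 * Read the driver's error counters for an open serial port.
 * On success returns 0 and stores:
 *   *fifo_overruns - characters lost in the UART (hardware overrun)
 *   *buf_overruns  - characters lost because the tty buffer was full
 * Returns -1 if the ioctl fails (bad fd, or no TIOCGICOUNT support).
 */
static int read_overruns(int fd, int *fifo_overruns, int *buf_overruns)
{
    struct serial_icounter_struct ic;

    assert(fifo_overruns != NULL && buf_overruns != NULL);

    if (ioctl(fd, TIOCGICOUNT, &ic) < 0)
        return -1;

    *fifo_overruns = ic.overrun;
    *buf_overruns = ic.buf_overrun;
    return 0;
}
```

Sampling these before and after a fax session would show whether the missing bytes track the hardware counter, the buffer counter, or neither.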
Re: serial flow control appears broken
On Fri, 2007-07-27 at 12:22 -0600, Robert Hancock wrote:
> Maciej W. Rozycki wrote:
> > The TTY line discipline driver could do that based on the amount of
> > received data present in its buffer. And it should if asked to (a brief
> > look at drivers/char/n_tty.c reveals it does; obviously there may be a bug
>
> Really, where? In my look through the code I haven't found any mechanism
> that would result in RTS being lowered based on TTY buffers filling up,
> at least not in the 8250 case.

    serial_core.c:uart_throttle()
    serial_core.c:uart_unthrottle()

These are called by N_TTY in response to buffer levels.

--
Paul Fulghum
Microgate Systems, Ltd
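The idea of throttling on buffer levels can be illustrated with a toy user-space model. This is not the kernel's code: the structure, function names and all three sizes are hypothetical, chosen only to show the hysteresis. The calls that would correspond to uart_throttle() and uart_unthrottle() are marked in comments.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sizes for illustration; not the kernel's actual values. */
#define LDISC_BUF_SIZE   4096
#define THROTTLE_ROOM     128   /* throttle when less room than this remains */
#define UNTHROTTLE_FILL   128   /* unthrottle when at most this many bytes are queued */

struct ldisc_model {
    int queued;       /* bytes waiting for the application to read */
    bool throttled;   /* true once the driver was asked to drop RTS */
};

/* Driver pushed `n` bytes up; returns true if this call triggers a throttle. */
static bool model_receive(struct ldisc_model *m, int n)
{
    assert(n >= 0);
    m->queued += n;
    if (m->queued > LDISC_BUF_SIZE)
        m->queued = LDISC_BUF_SIZE;      /* excess would be dropped */
    if (!m->throttled && LDISC_BUF_SIZE - m->queued < THROTTLE_ROOM) {
        m->throttled = true;             /* stands in for uart_throttle() */
        return true;
    }
    return false;
}

/* Application read `n` bytes; returns true if this call triggers an unthrottle. */
static bool model_read(struct ldisc_model *m, int n)
{
    assert(n >= 0);
    m->queued = n > m->queued ? 0 : m->queued - n;
    if (m->throttled && m->queued <= UNTHROTTLE_FILL) {
        m->throttled = false;            /* stands in for uart_unthrottle() */
        return true;
    }
    return false;
}
```

In this model an application that keeps up with its reads never lets `queued` approach the limit, so the throttle never fires, whatever is happening in the UART's own FIFO further down.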
Re: serial flow control appears broken
Maciej W. Rozycki wrote:
> On Fri, 27 Jul 2007, Lee Howard wrote:
> > Okay, so let's say we've got a loop around a blocking read on the modem file descriptor...
> >
> > for (;;) {
> >     read some data from modem
> >     process data from modem
> >     if (end-of-data detected) break;
> > }
> >
> > Are you suggesting that the application should be deasserting RTS after the read and asserting it before?
>
> It certainly could -- you were asking how it would know. ;-)
>
> > I had previously thought that the control of RTS was something that the serial/tty driver was supposed to do independently based on the buffer fill.
>
> The TTY line discipline driver could do that based on the amount of received data present in its buffer. And it should if asked to (a brief look at drivers/char/n_tty.c reveals it does; obviously there may be a bug

Really, where? In my look through the code I haven't found any mechanism that would result in RTS being lowered based on TTY buffers filling up, at least not in the 8250 case.

In this situation, though, it appears it's not the TTY buffers that are filling but the UART's own buffer. I would think this must be caused by some kind of interrupt latency that results in not draining the FIFO in time.

> somewhere though). So could e.g. the SLIP and PPP line discipline drivers, though the criteria might be different (apparently they do not, which is a shame). The serial drivers have nothing to do about it -- all they can do is pushing data upstream, to the discipline driver. They can provide an interface to hardware flow control features though, if implemented by a given UART.
>
> Maciej
Re: serial flow control appears broken
Maciej W. Rozycki wrote:
> The TTY line discipline driver could do that based on the amount of received data present in its buffer. And it should if asked to (a brief look at drivers/char/n_tty.c reveals it does; obviously there may be a bug somewhere though). So could e.g. the SLIP and PPP line discipline drivers, though the criteria might be different (apparently they do not, which is a shame). The serial drivers have nothing to do about it -- all they can do is pushing data upstream, to the discipline driver. They can provide an interface to hardware flow control features though, if implemented by a given UART.

Thank you for this clarification. So I should more correctly have been saying that "tty flow control appears broken". Right?

I've asked the manufacturer to take a look at drivers/char/n_tty.c to see if they can spot anything obvious.

Thanks,

Lee.
Re: serial flow control appears broken
Alan Cox wrote:
> As the flow control is driven by software on most 16x50 chips (there are a couple of exceptions) if we fail to empty the fifo fast enough then any flow control will be asserted too late to save the day.
>
> If you stop the application and do the following
>
>     cat /dev/ttywhatever
>     ^Z
>     [stopped]
>
> (so you are asking the OS to buffer data but not ever reading it) and then fire data at it does the flow control eventually occur ?

Yes it does appear to. I told the application to simply sleep(300) at the appropriate moment, and I watched the application and when it began the sleep I ran:

    cat /dev/ttyS1
    (lots of "garbage" began spewing forth)
    ^Z
    (about 2 or 3 seconds and the RTS light goes dark)

Thanks,

Lee.
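Instead of watching the LED, the moment RTS drops can be timed in software by polling the modem-control lines with the TIOCMGET ioctl. A rough sketch; the helper name and the 10 ms polling step are arbitrary choices, and the elapsed time is only approximate because it counts sleeps rather than reading a clock:

```c
#include <assert.h>
#include <sys/ioctl.h>
#include <time.h>

/*
 * Poll the modem-control lines until RTS is deasserted.
 * Returns the approximate elapsed milliseconds when RTS dropped,
 * -1 if the ioctl fails, or -2 if RTS is still asserted after timeout_ms.
 */
static long wait_for_rts_drop(int fd, long timeout_ms)
{
    const struct timespec step = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    long elapsed = 0;

    assert(timeout_ms >= 0);

    for (;;) {
        int bits = 0;

        if (ioctl(fd, TIOCMGET, &bits) < 0)
            return -1;
        if (!(bits & TIOCM_RTS))
            return elapsed;
        if (elapsed >= timeout_ms)
            return -2;
        nanosleep(&step, NULL);
        elapsed += 10;
    }
}
```

Run from a second process on the same port while the reader is stopped, this should report a figure in the region of the 2 or 3 seconds observed above.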
Re: serial flow control appears broken
On Fri, 27 Jul 2007, Lee Howard wrote:
> Okay, so let's say we've got a loop around a blocking read on the modem file
> descriptor...
>
> for (;;) {
>     read some data from modem
>     process data from modem
>     if (end-of-data detected) break;
> }
>
> Are you suggesting that the application should be deasserting RTS after
> the read and asserting it before?

It certainly could -- you were asking how it would know. ;-)

> I had previously thought that the control of RTS was something that the
> serial/tty driver was supposed to do independently based on the buffer fill.

The TTY line discipline driver could do that based on the amount of received data present in its buffer. And it should if asked to (a brief look at drivers/char/n_tty.c reveals it does; obviously there may be a bug somewhere though). So could e.g. the SLIP and PPP line discipline drivers, though the criteria might be different (apparently they do not, which is a shame). The serial drivers have nothing to do about it -- all they can do is pushing data upstream, to the discipline driver. They can provide an interface to hardware flow control features though, if implemented by a given UART.

Maciej
Re: serial flow control appears broken
> I had previously thought that the control of RTS was something that the
> serial/tty driver was supposed to do independently based on the buffer
> fill. Was I wrong?

If the kernel is asked to do CRTSCTS then the kernel handles the flow control. It uses it when the internal buffers are nearly full. The direct access to the lines is normally only used by special drivers such as half duplex radio modem drivers.
Re: serial flow control appears broken
Maciej W. Rozycki wrote:
> On Thu, 26 Jul 2007, Lee Howard wrote:
> > If the application were to use TIOCM_RTS how would it know when to apply it or not? Is there some approach that the application could take to manage flow control on the serial port? What about software flow control? Does the
>
> Well, an application could negate RTS when it receives a character and is running out of resources for further processing of incoming data. Smarter UARTs may be able to negate RTS themselves based on the amount of data in their receive FIFO. The threshold may be configurable.

Okay, so let's say we've got a loop around a blocking read on the modem file descriptor...

    for (;;) {
        read some data from modem
        process data from modem
        if (end-of-data detected) break;
    }

Are you suggesting that the application should be deasserting RTS after the read and asserting it before?

I had previously thought that the control of RTS was something that the serial/tty driver was supposed to do independently based on the buffer fill. Was I wrong?

Thanks,

Lee.
Re: serial flow control appears broken
Lee Howard wrote:
>
> So, does this explain why I wouldn't have a problem at 115200 bps with
> kernel 2.2.5 but why I do with 2.6.5 and 2.6.18? Both hardware and
> software flow control work fine with 2.2.5 (meaning I don't see any
> error message and I don't have any data corruption), but neither works
> to avoid the "kernel: ttyS1: 1 input overrun(s)" and consequent data
> corruption issue in 2.6.5 nor 2.6.18.
>
> Was there some associated application change in tty handling that needed
> to occur between the 2.2 and 2.6 kernels to properly implement flow control?

Could this be related?
http://lkml.org/lkml/2007/7/18/245
Quote: "I've recently found (using 2.6.21.4) that configuring a serial ports (ST16654) which use the 8250 driver using setserial results in the UART's FIFOs being disabled (unless you specify autoconfig)."

--
Tilman Schmidt
E-Mail: [EMAIL PROTECTED]
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)
Re: serial flow control appears broken
> -parenb -parodd cs8 -hupcl -cstopb cread clocal crtscts

Ok so crtscts is set, but you have clocal set too. That shouldn't matter

> Using software flow control this is what stty tells me about the port
> set up done by the application:

This also looks fine

> They seem correct to me, but I am certainly willing to be wrong.

clocal set as well is unusual but if I remember the spec right then clocal would not interfere with rts/cts handshake and certainly not with xon/xoff

Looks correct, two boards so it's unlikely both didn't wire it.

> A quick google on "input overrun(s)" may lend some credence (although,
> certainly this is not in any way conclusive) that I'm not the only one
> who may be seeking a solution on this matter.
>
> http://www.google.com/search?hl=en&q=%2B%22input+overrun%28s%29%22

Those look different on the whole - there are two reasons you'll get an input overrun with a 16x50 UART. The first is because we ran out of buffers to empty the chip, in which case we would have asserted flow control in software. The second is if we cannot keep up and fail to empty the on chip FIFO within the required time (about 1 ms).

As the flow control is driven by software on most 16x50 chips (there are a couple of exceptions) if we fail to empty the fifo fast enough then any flow control will be asserted too late to save the day.

If you stop the application and do the following

    cat /dev/ttywhatever
    ^Z
    [stopped]

(so you are asking the OS to buffer data but not ever reading it) and then fire data at it does the flow control eventually occur ?

Alan
Re: serial flow control appears broken
On Thu, 26 Jul 2007, Lee Howard wrote:
> If the application were to use TIOCM_RTS how would it know when to apply it or
> not? Is there some approach that the application could take to manage flow
> control on the serial port? What about software flow control? Does the

Well, an application could negate RTS when it receives a character and is running out of resources for further processing of incoming data. Smarter UARTs may be able to negate RTS themselves based on the amount of data in their receive FIFO. The threshold may be configurable.

Maciej
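For reference, an application that wants to drive RTS itself would do it with the TIOCMBIS and TIOCMBIC ioctls. A minimal sketch; the helper name `set_rts` is made up, and whether manual control coexists cleanly with crtscts on a given driver is not something this thread establishes:

```c
#include <assert.h>
#include <sys/ioctl.h>

/*
 * Assert (on = 1) or negate (on = 0) RTS on an open serial port.
 * Returns 0 on success, -1 if the ioctl fails.
 */
static int set_rts(int fd, int on)
{
    int bits = TIOCM_RTS;

    assert(on == 0 || on == 1);

    /* TIOCMBIS sets the given modem-control bits, TIOCMBIC clears them. */
    return ioctl(fd, on ? TIOCMBIS : TIOCMBIC, &bits) < 0 ? -1 : 0;
}
```

In the read loop discussed in this thread, the application would call `set_rts(fd, 0)` when it is running short of resources and `set_rts(fd, 1)` once it has caught up.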
Re: serial flow control appears broken
Alan Cox wrote:
> > The manufacturer is using a scope to look for RTS and they're not seeing it, either. I just use my eyes to look at the LED, but I can see the CTS, DTR, DCD, RD, and TD lights blink, flicker, or dim... (and TD, RD, and CTS tend to go on and off rather quickly).
>
> And you have
>
> 1. The port set up correctly for flow control options in the kernel ?

I suppose that you mean that the application has properly set up the port using termios/tcsetattr/ioctl and the like... rather than whether the kernel build/config options were set to permit flow control (I know of no relevant flow-control-enabling kernel build options).

Using hardware flow control this is what stty tells me about the port set up done by the application:

    # stty -F /dev/ttyS1 -a
    speed 115200 baud; rows 0; columns 0; line = 0;
    intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>; eol2 = <undef>; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R; werase = ^W; lnext = ^V; flush = ^O; min = 1; time = 0;
    -parenb -parodd cs8 -hupcl -cstopb cread clocal crtscts
    -ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr -icrnl -ixon -ixoff -iuclc -ixany -imaxbel
    -opost -olcuc -ocrnl -onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0
    -isig -icanon -iexten -echo -echoe -echok -echonl -noflsh -xcase -tostop -echoprt -echoctl -echoke
    #

Using software flow control this is what stty tells me about the port set up done by the application:

    # stty -F /dev/ttyS1 -a
    speed 115200 baud; rows 0; columns 0; line = 0;
    intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>; eol2 = <undef>; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R; werase = ^W; lnext = ^V; flush = ^O; min = 1; time = 0;
    -parenb -parodd cs8 -hupcl -cstopb cread clocal -crtscts
    -ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr -icrnl ixon ixoff -iuclc -ixany -imaxbel
    -opost -olcuc -ocrnl -onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0
    -isig -icanon -iexten -echo -echoe -echok -echonl -noflsh -xcase -tostop -echoprt -echoctl -echoke
    #

They seem correct to me, but I am certainly willing to be wrong.

> 2. Verified that the board vendor remembered to wire it ?

I don't know how to verify directly that the board manufacturer wired the serial port correctly. I've tested this on two different motherboards made several years apart (but, yes, both were made by the same manufacturer). However, when using RedHat 6.0 (kernel 2.2.5) I have no problems with data corruption occurring in the data coming from the DCE. So that tells me that *something* was working before that isn't working now... and I'm trying to determine what the difference is... whether it be a problem in modern kernels or whether it be something that the application (HylaFAX) is not doing to accommodate whatever changes occurred in modern kernels.

A quick google on "input overrun(s)" may lend some credence (although, certainly this is not in any way conclusive) that I'm not the only one who may be seeking a solution on this matter.

http://www.google.com/search?hl=en&q=%2B%22input+overrun%28s%29%22

Thanks,

Lee.
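The only differences between the two stty listings are the crtscts, ixon and ixoff flags. In termios terms that corresponds to something like the following sketch. The helper names are made up; this is not HylaFAX's code, just one way an application might select each mode.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* CRTSCTS is not in POSIX; glibc needs this for it */
#endif
#include <assert.h>
#include <stddef.h>
#include <termios.h>

/* Select RTS/CTS handshaking: "crtscts -ixon -ixoff" in stty terms. */
static void use_hw_flow(struct termios *t)
{
    assert(t != NULL);
    t->c_cflag |= CRTSCTS;
    t->c_iflag &= ~(tcflag_t)(IXON | IXOFF);
}

/* Select XON/XOFF handshaking: "-crtscts ixon ixoff" in stty terms. */
static void use_sw_flow(struct termios *t)
{
    assert(t != NULL);
    t->c_cflag &= ~(tcflag_t)CRTSCTS;
    t->c_iflag |= IXON | IXOFF;
}
```

A caller would fetch the current settings with `tcgetattr(fd, &t)`, apply one of the helpers, and write them back with `tcsetattr(fd, TCSANOW, &t)`.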
Re: serial flow control appears broken
Alan Cox wrote: The manufacturer is using a scope to look for RTS and they're not seeing it, either. I just use my eyes to look at the LED, but I can see the CTS, DTR, DCD, RD, and TD lights blink, flicker, or dim... (and TD, RD, and CTS tend to go on and off rather quickly). And you have 1. The port set up correctly for flow control options in the kernel ? I suppose that you mean that the application has properly set up the port using termios/tcsetattr/ioctl and the like... rather than if the kernel build/config options were set to permit flow control (I know of no relevant flow-control-enabling kernel build options). Using hardware flow control this is what stty tells me about the port set up done by the application: # stty -F /dev/ttyS1 -a speed 115200 baud; rows 0; columns 0; line = 0; intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = undef; eol2 = undef; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R; werase = ^W; lnext = ^V; flush = ^O; min = 1; time = 0; -parenb -parodd cs8 -hupcl -cstopb cread clocal crtscts -ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr -icrnl -ixon -ixoff -iuclc -ixany -imaxbel -opost -olcuc -ocrnl -onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0 -isig -icanon -iexten -echo -echoe -echok -echonl -noflsh -xcase -tostop -echoprt -echoctl -echoke # Using software flow control this is what stty tells me about the port set up done by the application: # stty -F /dev/ttyS1 -a speed 115200 baud; rows 0; columns 0; line = 0; intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = undef; eol2 = undef; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R; werase = ^W; lnext = ^V; flush = ^O; min = 1; time = 0; -parenb -parodd cs8 -hupcl -cstopb cread clocal -crtscts -ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr -icrnl ixon ixoff -iuclc -ixany -imaxbel -opost -olcuc -ocrnl -onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0 -isig -icanon -iexten -echo -echoe -echok -echonl -noflsh -xcase -tostop 
-echoprt -echoctl -echoke # They seem correct to me, but I am certainly willing to be wrong. 2. Verified that the board vendor remembered to wire it ? I don't know how to verify directly that the board manufacturer wired the serial port correctly. I've tested this on two different motherboards made several years apart (but, yes, both were made by the same manufacturer). However, when using RedHat 6.0 (kernel 2.2.5) I have no problems with data corruption occurring in the data coming from the DCE. So that tells me that *something* was working before that isn't working now... and I'm trying to determine what the difference is... whether it be a problem in modern kernels or whether it be something that the application (HylaFAX) is not doing to accomodate whatever changes occurred in modern kernels. A quick google on input overrun(s) may lend some credence (although, certainly this is not in any way conclusive) that I'm not the only one who may be seeking a solution on this matter. http://www.google.com/search?hl=enq=%2B%22input+overrun%28s%29%22 Thanks, Lee. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: serial flow control appears broken
-parenb -parodd cs8 -hupcl -cstopb cread clocal crtscts Ok so crtscts is set, but you have clocal set too. That shouldn't matter Using software flow control this is what stty tells me about the port set up done by the application: This also looks fine They seem correct to me, but I am certainly willing to be wrong. clocal set as well is unusual but if I remember the spec right then clocal would not interfere with rts/cts handshake and certainly not with xon/xoff Looks correct, two boards so its unlikely both didnt wire it. A quick google on input overrun(s) may lend some credence (although, certainly this is not in any way conclusive) that I'm not the only one who may be seeking a solution on this matter. http://www.google.com/search?hl=enq=%2B%22input+overrun%28s%29%22 Those look different on the whole - there are two reasons you'll get an input overrun with a 16x50 UART. The first is because we ran out of buffers to empty the chip, in which case we would have asserted flow control in software. The second is if we cannot keep up and fail to empty the on chip FIFO within the required time (about 1mS) As the flow control is driven by software on most 16x50 chips (there are a couple of exceptions) if we fail to empty the fifo fast enough then any flow control will be asserted too late to save the day. If you stop the application and do the following cat /dev/ttywhatever ^Z [stopped] (so you are asking the OS to buffer data but not ever reading it) and then fire data at it does the flow control eventually occur ? Alan - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: serial flow control appears broken
On Thu, 26 Jul 2007, Lee Howard wrote: If the application were to use TIOCM_RTS how would it know when to apply it or not? Is there some approach that the application could take to manage flow control on the serial port? What about software flow control? Does the Well, an application could negate RTS when it receives a character and is running out of resources for further processing of incoming data. Smarter UARTs may be able to negate RTS themselves based on the amount of data in their receive FIFO. The threshold may be configurable. Maciej - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: serial flow control appears broken
Lee Howard schrieb: So, does this explain why I wouldn't have a problem at 115200 bps with kernel 2.2.5 but why I do with 2.6.5 and 2.6.18? Both hardware and software flow control work fine with 2.2.5 (meaning I don't see any error message and I don't have any data corruption), but neither works to avoid the kernel: ttyS1: 1 input overrun(s) and consequent data corruption issue in 2.6.5 nor 2.6.18. Was there some associated application change in tty handling that needed to occur between the 2.2 and 2.6 kernels to properly implement flow control? Could this be related? http://lkml.org/lkml/2007/7/18/245 Quote: I've recently found (using 2.6.21.4) that configuring a serial ports (ST16654) which use the 8250 driver using setserial results in the UART's FIFOs being disabled (unless you specify autoconfig). -- Tilman SchmidtE-Mail: [EMAIL PROTECTED] Bonn, Germany Diese Nachricht besteht zu 100% aus wiederverwerteten Bits. Ungeöffnet mindestens haltbar bis: (siehe Rückseite) signature.asc Description: OpenPGP digital signature
Re: serial flow control appears broken
Maciej W. Rozycki wrote: On Thu, 26 Jul 2007, Lee Howard wrote: If the application were to use TIOCM_RTS how would it know when to apply it or not? Is there some approach that the application could take to manage flow control on the serial port? What about software flow control? Does the Well, an application could negate RTS when it receives a character and is running out of resources for further processing of incoming data. Smarter UARTs may be able to negate RTS themselves based on the amount of data in their receive FIFO. The threshold may be configurable. Okay, so let's say we've got a loop around a blocking read on the modem file descriptor... for (;;) { read some data from modem process data from modem if (end-of-data detected) break; } Are you suggesting that the application should be using deasserting RTS after the read and asserting it before? I had previously thought that the control of RTS was something that the serial/tty driver was supposed to do independently based on the buffer fill. Was I wrong? Thanks, Lee. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: serial flow control appears broken
I had previously thought that the control of RTS was something that the serial/tty driver was supposed to do independently based on the buffer fill. Was I wrong? If the kernel is asked to do CRTSCTS then the kernel handles the flow control. It uses it when the internal buffers are nearly full. The direct access to the lines is normally only used by special drivers such as half duplex radio modem drivers. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: serial flow control appears broken
On Fri, 27 Jul 2007, Lee Howard wrote:
> Okay, so let's say we've got a loop around a blocking read on the modem file descriptor...
>
> for (;;) {
>     read some data from modem
>     process data from modem
>     if (end-of-data detected) break;
> }
>
> Are you suggesting that the application should be deasserting RTS after the read and asserting it before?

It certainly could -- you were asking how it would know. ;-)

> I had previously thought that the control of RTS was something that the serial/tty driver was supposed to do independently based on the buffer fill.

The TTY line discipline driver could do that based on the amount of received data present in its buffer. And it should if asked to (a brief look at drivers/char/n_tty.c reveals it does; obviously there may be a bug somewhere though). So could e.g. the SLIP and PPP line discipline drivers, though the criteria might be different (apparently they do not, which is a shame).

The serial drivers have nothing to do about it -- all they can do is pushing data upstream, to the discipline driver. They can provide an interface to hardware flow control features though, if implemented by a given UART.

  Maciej
Re: serial flow control appears broken
Maciej W. Rozycki wrote:
> On Fri, 27 Jul 2007, Lee Howard wrote:
>> Okay, so let's say we've got a loop around a blocking read on the modem file descriptor...
>>
>> for (;;) {
>>     read some data from modem
>>     process data from modem
>>     if (end-of-data detected) break;
>> }
>>
>> Are you suggesting that the application should be deasserting RTS after the read and asserting it before?
>
> It certainly could -- you were asking how it would know. ;-)
>
>> I had previously thought that the control of RTS was something that the serial/tty driver was supposed to do independently based on the buffer fill.
>
> The TTY line discipline driver could do that based on the amount of received data present in its buffer. And it should if asked to (a brief look at drivers/char/n_tty.c reveals it does; obviously there may be a bug

Really, where? In my look through the code I haven't found any mechanism that would result in RTS being lowered based on TTY buffers filling up, at least not in the 8250 case.

In this situation, though, it appears it's not the TTY buffers that are filling but the UART's own buffer. I would think this must be caused by some kind of interrupt latency that results in not draining the FIFO in time.

> somewhere though). So could e.g. the SLIP and PPP line discipline drivers, though the criteria might be different (apparently they do not, which is a shame).
>
> The serial drivers have nothing to do about it -- all they can do is pushing data upstream, to the discipline driver. They can provide an interface to hardware flow control features though, if implemented by a given UART.
>
>   Maciej
Re: serial flow control appears broken
Alan Cox wrote:
> As the flow control is driven by software on most 16x50 chips (there are a couple of exceptions) if we fail to empty the fifo fast enough then any flow control will be asserted too late to save the day.
>
> If you stop the application and do the following
>
> cat /dev/ttywhatever
> ^Z
> [stopped]
>
> (so you are asking the OS to buffer data but not ever reading it) and then fire data at it does the flow control eventually occur ?

Yes it does appear to. I told the application to simply sleep(300) at the appropriate moment, and I watched the application and when it began the sleep I ran:

cat /dev/ttyS1
(lots of garbage began spewing forth)
^Z
(about 2 or 3 seconds and the RTS light goes dark)

Thanks,
Lee.
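Instead of watching the modem's LED, the same experiment could be observed from software by polling the modem-control bits. A minimal sketch using the standard `TIOCMGET` ioctl; the function name `rts_is_asserted` is hypothetical, and the value it reports is the driver's view of the line rather than a measurement at the connector pin.

```c
#include <sys/ioctl.h>

/* Returns 1 if the driver reports RTS asserted, 0 if negated,
 * -1 if the ioctl fails. */
static int rts_is_asserted(int fd)
{
    int bits = 0;

    if (ioctl(fd, TIOCMGET, &bits) < 0)
        return -1;
    return (bits & TIOCM_RTS) ? 1 : 0;
}
```

Calling this in a loop from a second process while the first one sleeps would show when, if ever, the kernel negates RTS.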
Re: serial flow control appears broken
Maciej W. Rozycki wrote:
> The TTY line discipline driver could do that based on the amount of received data present in its buffer. And it should if asked to (a brief look at drivers/char/n_tty.c reveals it does; obviously there may be a bug somewhere though). So could e.g. the SLIP and PPP line discipline drivers, though the criteria might be different (apparently they do not, which is a shame).
>
> The serial drivers have nothing to do about it -- all they can do is pushing data upstream, to the discipline driver. They can provide an interface to hardware flow control features though, if implemented by a given UART.

Thank you for this clarification. So I should have more correctly been saying that "tty flow control appears broken". Right?

I've asked the manufacturer to take a look at drivers/char/n_tty.c to see if they can't see anything obvious.

Thanks,
Lee.
Re: serial flow control appears broken
On Fri, 2007-07-27 at 12:22 -0600, Robert Hancock wrote:
> Maciej W. Rozycki wrote:
>> The TTY line discipline driver could do that based on the amount of received data present in its buffer. And it should if asked to (a brief look at drivers/char/n_tty.c reveals it does; obviously there may be a bug
>
> Really, where? In my look through the code I haven't found any mechanism that would result in RTS being lowered based on TTY buffers filling up, at least not in the 8250 case.

serial_core.c:uart_throttle()
serial_core.c:uart_unthrottle()

These are called by N_TTY in response to buffer levels.

--
Paul Fulghum
Microgate Systems, Ltd
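The idea of throttling "in response to buffer levels" can be illustrated with a toy model. This is not the kernel's code: the structure, the function names, and the watermark fields are all invented for illustration, and the comments only mark where a line discipline would be expected to call the driver's throttle and unthrottle hooks.

```c
#include <stddef.h>

/* Toy model of a receive buffer with throttle watermarks. */
struct rx_model {
    size_t capacity;   /* total buffer size (hypothetical) */
    size_t level;      /* bytes currently queued */
    size_t high_room;  /* throttle when free room drops below this */
    size_t low_level;  /* unthrottle when level falls to this or less */
    int throttled;     /* 1 while the sender is being held off */
};

/* The driver pushes n bytes upstream; returns the bytes accepted. */
static size_t rx_push(struct rx_model *m, size_t n)
{
    size_t room = m->capacity - m->level;

    if (n > room)
        n = room;
    m->level += n;
    if (!m->throttled && m->capacity - m->level < m->high_room)
        m->throttled = 1;   /* a uart_throttle()-style hook would run here */
    return n;
}

/* The application reads n bytes; returns the bytes consumed. */
static size_t rx_pull(struct rx_model *m, size_t n)
{
    if (n > m->level)
        n = m->level;
    m->level -= n;
    if (m->throttled && m->level <= m->low_level)
        m->throttled = 0;   /* a uart_unthrottle()-style hook would run here */
    return n;
}
```

The model makes one point from the thread concrete: the throttle decision depends only on how full this software buffer is, so it says nothing about whether the UART's hardware FIFO was drained in time.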
Re: serial flow control appears broken
On Fri, 2007-07-27 at 12:22 -0600, Robert Hancock wrote:
> In this situation, though, it appears it's not the TTY buffers that are filling but the UART's own buffer. I would think this must be caused by some kind of interrupt latency that results in not draining the FIFO in time.

You are right, this error is output when the character flag TTY_OVERRUN is encountered by n_tty.c which should be set by the driver in response to a hardware FIFO overrun (not an ldisc buffer overrun).

I can't see anyplace in serial_core.c or 8250.c that sets TTY_OVERRUN.

--
Paul Fulghum
Microgate Systems, Ltd
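The two kinds of overrun Paul distinguishes can, on drivers that support it, be read separately from user space with the `TIOCGICOUNT` ioctl, which fills a `struct serial_icounter_struct` from `<linux/serial.h>`. The sketch below assumes the driver implements that ioctl; the function name `read_overruns` is hypothetical. As far as I understand the fields, `overrun` counts receiver overruns reported by the UART and `buf_overrun` counts characters dropped because the tty buffer was full.

```c
#include <linux/serial.h>
#include <sys/ioctl.h>

/* Fetch the cumulative overrun counters for an open serial port.
 * Returns 0 on success, -1 if the ioctl fails. */
static int read_overruns(int fd, int *hw_overrun, int *buf_overrun)
{
    struct serial_icounter_struct ic;

    if (ioctl(fd, TIOCGICOUNT, &ic) < 0)
        return -1;
    *hw_overrun = ic.overrun;        /* UART receiver overruns */
    *buf_overrun = ic.buf_overrun;   /* tty buffer overflows */
    return 0;
}
```

Sampling these counters before and after a transfer would show whether lost bytes line up with hardware overruns even when no message reaches the kernel log.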
Re: serial flow control appears broken
Tilman Schmidt wrote:
> Could this be related?
> http://lkml.org/lkml/2007/7/18/245
>
> Quote:
> I've recently found (using 2.6.21.4) that configuring a serial ports (ST16654) which use the 8250 driver using setserial results in the UART's FIFOs being disabled (unless you specify autoconfig).

That would make sense. Lee's error is a hardware FIFO overrun which could occur if the FIFO is being disabled as described in your link (by trying to set the uart type with setserial).

Since the tty flow control is only triggered by the line discipline in response to ldisc buffer levels and not hardware FIFO overruns, you would never see any flow control action as reported by Lee.

--
Paul Fulghum
Microgate Systems, Ltd.
Re: serial flow control appears broken
OK, I see where TTY_OVERRUN is set:
include/linux/serial_core.h:uart_insert_char()

--
Paul Fulghum
Microgate Systems, Ltd
Re: serial flow control appears broken
Tilman Schmidt wrote:
> Lee Howard wrote:
>> So, does this explain why I wouldn't have a problem at 115200 bps with kernel 2.2.5 but why I do with 2.6.5 and 2.6.18? Both hardware and software flow control work fine with 2.2.5 (meaning I don't see any error message and I don't have any data corruption), but neither works to avoid the "kernel: ttyS1: 1 input overrun(s)" and consequent data corruption issue in 2.6.5 nor 2.6.18. Was there some associated application change in tty handling that needed to occur between the 2.2 and 2.6 kernels to properly implement flow control?
>
> Could this be related?
> http://lkml.org/lkml/2007/7/18/245
>
> Quote:
> I've recently found (using 2.6.21.4) that configuring a serial ports (ST16654) which use the 8250 driver using setserial results in the UART's FIFOs being disabled (unless you specify autoconfig).

I'm not running setserial on the port, myself. But to test to see if it is related, I included this code in the application:

#include <linux/serial.h>

struct serial_struct serial;
ioctl(modemFd, TIOCGSERIAL, &serial);
traceModemOp("modem xmit_fifo_size: %u", serial.xmit_fifo_size);

And I get this resulting logging:

MODEM modem xmit_fifo_size: 16

So it's clear from here that the xmit_fifo_size is set correctly on this system.

Thanks,
Lee.
Re: serial flow control appears broken
Paul Fulghum wrote:
> Tilman Schmidt wrote:
>> Could this be related?
>> http://lkml.org/lkml/2007/7/18/245
>>
>> Quote:
>> I've recently found (using 2.6.21.4) that configuring a serial ports (ST16654) which use the 8250 driver using setserial results in the UART's FIFOs being disabled (unless you specify autoconfig).
>
> That would make sense. Lee's error is a hardware FIFO overrun which could occur if the FIFO is being disabled as described in your link (by trying to set the uart type with setserial).

I'm not using setserial on this port, myself. If something in init is calling on setserial then I don't know about it. That said, tests on the serial port from within the application show that xmit_fifo_size is set to 16 as it should be.

I wrote up a little test app:

struct serial_struct serial;
ioctl(modemFd, TIOCGSERIAL, &serial);
printf("type: %d\n", serial.type);
printf("line: %d\n", serial.line);
printf("line: %u\n", serial.port);    /* labelled "line", but this is the I/O port */
printf(" irq: %d\n", serial.irq);
printf(" flags: %d\n", serial.flags);
printf(" xmit_fifo_size: %d\n", serial.xmit_fifo_size);
printf(" custom_divisor: %d\n", serial.custom_divisor);
printf(" baud_base: %d\n", serial.baud_base);
printf(" close_delay: %u\n", serial.close_delay);
printf(" io_type: 0x%X\n", serial.io_type);
printf("reserved_char[0]: 0x%X\n", serial.reserved_char[0]);
printf("hub6: %d\n", serial.hub6);
printf("closing_wait: %u\n", serial.closing_wait);
printf(" closing_wait2: %u\n", serial.closing_wait2);
printf(" iomem_reg_shift: %u\n", serial.iomem_reg_shift);
printf(" port_high: %u\n", serial.port_high);
printf(" reserved[0]: %d\n", serial.reserved[0]);

Here's the output:

type: 4
line: 1
line: 760
irq: 3
flags: 1358954688
xmit_fifo_size: 16
custom_divisor: 0
baud_base: 115200
close_delay: 500
io_type: 0x0
reserved_char[0]: 0x0
hub6: 0
closing_wait: 3
closing_wait2: 0
iomem_reg_shift: 0
port_high: 0
reserved[0]: 0

Thanks,
Lee.
Re: serial flow control appears broken
On Fri, 2007-07-27 at 13:48 -0700, Lee Howard wrote:
> Here's the output:
>
> type: 4
> line: 1
> line: 760
> irq: 3
> flags: 1358954688
> xmit_fifo_size: 16
> custom_divisor: 0
> baud_base: 115200

OK, the FIFO should be enabled.

What is known:

* The error is a hardware FIFO overrun.
  - observed message is in n_tty due to driver setting TTY_OVERRUN

* The RTS/CTS flow control is not involved
  - this is done only by the ldisc in response to buffer levels
  - you verified crtscts is set
  - you did not observe RTS change when 'overflow error' logged
  - you did observe RTS change when application stopped reading

So this seems to be a latency issue reading the receive FIFO in the ISR. The current rx FIFO trigger level should be 8 bytes (UART_FCR_R_TRIG_10) which gives the ISR 694usec to get the data at 115200bps. IIRC, in 2.2.X kernels this defaulted to 4 bytes (TRIG_01) which gave a little more time to service the interrupt.

How does the data rate affect the frequency of the overrun errors? Does 57600bps make them go away?

--
Paul
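The 694 usec figure can be reproduced with simple arithmetic: once the trigger level of 8 is reached in a 16-byte FIFO, 8 more characters fit before an overrun, and at 115200 bps with 10 bits per character (start bit, 8 data bits, stop bit, which is an assumption about the framing) each character takes about 86.8 usec. A small sketch, with a hypothetical function name, that ignores the UART's receive shift register and assumes characters arrive back to back:

```c
/* Approximate time, in microseconds, between the receive trigger
 * interrupt and a FIFO overrun. Integer division truncates. */
static long fifo_slack_usec(long baud, int fifo_size, int trigger,
                            int bits_per_char)
{
    long chars_left = fifo_size - trigger;

    return chars_left * bits_per_char * 1000000L / baud;
}
```

With these assumptions, `fifo_slack_usec(115200, 16, 8, 10)` gives 694, a trigger level of 4 gives 1041, and dropping to 57600 bps at trigger level 8 gives 1388.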
Re: serial flow control appears broken
Paul Fulghum wrote:
> So this seems to be a latency issue reading the receive FIFO in the ISR. The current rx FIFO trigger level should be 8 bytes (UART_FCR_R_TRIG_10) which gives the ISR 694usec to get the data at 115200bps. IIRC, in 2.2.X kernels this defaulted to 4 bytes (TRIG_01) which gave a little more time to service the interrupt.
>
> How does the data rate affect the frequency of the overrun errors? Does 57600bps make them go away?

The overrun error message does not occur on every instance of data corruption. (I just became aware of this, as I have been paying less attention to the error messages than to the corrupt data.) The data gets far more corrupted than the error messages would lead me to believe.

Since the data being sent from the fax modem to the host is identical (same image data) every time it's easier for me to measure the effect of one bitrate over another by examining the number of missing bytes from the data. The image has a total of 140465 bytes.

Just now I sent it 5 times each at 115200, 57600, 38400, and 19200 bps.

At 115200 bps the number of bytes skipped were: 63, 5, 44, 48, and 2.
At 57600 bps the number of bytes skipped were: 0, 1, 13, 9, and 12.
At 38400 bps the number of bytes skipped were: 858, 0, 0, 0, and 8.
At 19200 bps the number of bytes skipped were: 0, 0, 0, 0, and 0.

Curiously, the session at 38400 bps that skipped 858 bytes coincided, not just in sequence but also in precise timing within the session, with a small but noticeable disk load that I caused by grepping through a hundred session logs. (I can't reproduce it easily, though, because of disk caching.)

And, perhaps this is relevant... the way that I have the fax modem sending the data to the host is by receiving it from another fax modem which is sending it. Thus, the modem on ttyS0 is sending a fax to the modem on ttyS1. Due to the error correction protocol that is performed between the two fax endpoints I can guarantee that the data is correct as it leaves the DCE. I mention this in case there is any limitation to how the 8250 driver performs when two modems are being run simultaneously.

Thanks,
Lee.
Re: serial flow control appears broken
Alan Cox wrote:
>> Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing enabled
>> ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
>> ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
>>
>> It's a Shuttle HOT-661 motherboard (VIA Apollo Pro Plus mainboard chipset). Both FreeBSD and Linux identify the serial chipset type as 16550A.
>
> So you've got 16bytes of buffering. That ought to be enough on a modern PC. The older kernels use quite limited internal buffers which may be a factor, the current ones have a rewritten tty buffering layer which may improve matters enormously.

So, does this explain why I wouldn't have a problem at 115200 bps with kernel 2.2.5 but why I do with 2.6.5 and 2.6.18? Both hardware and software flow control work fine with 2.2.5 (meaning I don't see any error message and I don't have any data corruption), but neither works to avoid the "kernel: ttyS1: 1 input overrun(s)" and consequent data corruption issue in 2.6.5 nor 2.6.18. Was there some associated application change in tty handling that needed to occur between the 2.2 and 2.6 kernels to properly implement flow control?

Thanks,
Lee.
Re: serial flow control appears broken
> The manufacturer is using a scope to look for RTS and they're not seeing > it, either. I just use my eyes to look at the LED, but I can see the > CTS, DTR, DCD, RD, and TD lights blink, flicker, or dim... (and TD, RD, > and CTS tend to go on and off rather quickly). And you have 1. The port set up correctly for flow control options in the kernel ? 2. Verified that the board vendor remembered to wire it ? - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: serial flow control appears broken
Uwe Kleine-König wrote:
> Hello,
>
>> This is evidenced in hardware flow control by a little LED labeled "RTS" that is on the external modem. This LED lights up when pin 7 of the DB9 serial connection is given +12Vdc current (signalling "RTS" is on - that the host can accept data). The LED goes dark when the current is removed (signalling that the host cannot accept data). This "RTS" LED never flickers at all, as it should, when receiving these bursts of data - the LED stays lit as long as the serial cable is connected to the host... and yet I will see those "input overrun" messages. Thus, it seems quite clear that the Linux serial tty driver is not deasserting RTS as it should in hardware flow control. (And probably the analogous problem exists in software flow control, too.)
>
> I don't know the relevant timings for problem, but just to be sure that your prerequisites are correct: How did you check that the LED stays lit all the time? Just from looking might not be accurate. You might want to measure the signal with an oscilloscope.

The manufacturer is using a scope to look for RTS and they're not seeing it, either. I just use my eyes to look at the LED, but I can see the CTS, DTR, DCD, RD, and TD lights blink, flicker, or dim... (and TD, RD, and CTS tend to go on and off rather quickly).

All of that said... even though I don't see RTS flicker or blink or dim when using kernel 2.2.5 (RedHat 6.0) I don't have any problems using 115200 bps DTE-DCE communication rate.

Thanks,
Lee.
Re: serial flow control appears broken
> Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing enabled > ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A > ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A > > It's a Shuttle HOT-661 motherboard (VIA Apollo Pro Plus mainboard > chipset). Both FreeBSD and Linux identify the serial chipset type as > 16550A. So you've got 16bytes of buffering. That ought to be enough on a modern PC. The older kernels use quite limited internal buffers which may be a factor, the current ones have a rewritten tty buffering layer which may improve matters enormously. Alan - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: serial flow control appears broken
Robert Hancock wrote:
> Lee Howard wrote:
>> Hello.
>>
>> I have fax modems that will, in their proper behavior with certain features, send up to 64 kilobytes of data to the host DTE all at once. (So, the fax modem handles an incoming fax and periodically will send between 256 bytes and 64 kilobytes of data in bursts.)
>>
>> When the DCE-DTE (modem-to-host) communication rate is established at 115200 bps data loss occurs on systems using at least Linux kernels 2.6.5 and 2.6.18 (and probably everything in-between and then some more). This is because the modem overflows the host's buffer. This is evidenced in kernel logging:
>>
>> Jul 23 14:01:30 gollum kernel: ttyS1: 1 input overrun(s)
>> Jul 23 17:09:45 gollum kernel: ttyS1: 1 input overrun(s)
>>
>> Normally I would blame the modem itself for not honoring the host's flow control signals. However, I have worked with the modem manufacturer closely on this matter for over three months now. In that process they have improved the responsiveness of the modem and have fixed other problems, but the end result is that it truly does appear that the serial tty driver is not using flow control. Whether software flow control (XON/XOFF) or hardware flow control (RTS/CTS) is used the result is the same.
>>
>> This is evidenced in hardware flow control by a little LED labeled "RTS" that is on the external modem. This LED lights up when pin 7 of the DB9 serial connection is given +12Vdc current (signalling "RTS" is on - that the host can accept data). The LED goes dark when the current is removed (signalling that the host cannot accept data). This "RTS" LED never flickers at all, as it should, when receiving these bursts of data - the LED stays lit as long as the serial cable is connected to the host... and yet I will see those "input overrun" messages. Thus, it seems quite clear that the Linux serial tty driver is not deasserting RTS as it should in hardware flow control. (And probably the analogous problem exists in software flow control, too.)
>>
>> Please tell me what I can do to help you resolve and/or remedy this matter. Also, please let me know if I have contacted the wrong people. (I have cross-posted to linux-kernel as a catch-all. I am not subscribed to either linux-serial or linux-kernel mailing lists. So please CC me in any list responses.)
>>
>> If it is of any value to know (perhaps they have common code?), the same error occurs on FreeBSD 6.2 as well. The problem does not occur on Windows. The problem does not occur on RedHat 6.0 (kernel 2.2.5).
>
> What kind of serial port and machine is this on? From what I can see, a standard 16550 UART (not a special variant) just doesn't have any support for clearing RTS on its own when its input FIFO gets too full. The kernel would have to do it in that case. I'm not seeing where it would be controlling that automatically (as opposed to manually from the application with TIOCM_RTS). I'm also not sure if the UART gives the kernel enough information for it to even be able to control this line properly automatically. That's assuming it actually is a 16550 or similar with a 16-byte FIFO at all, which assuming it's a non-ancient PC it should be, but who knows.

Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing enabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A

It's a Shuttle HOT-661 motherboard (VIA Apollo Pro Plus mainboard chipset). Both FreeBSD and Linux identify the serial chipset type as 16550A.

If the application were to use TIOCM_RTS how would it know when to apply it or not? Is there some approach that the application could take to manage flow control on the serial port?

What about software flow control? Does the application (and not the driver) need to be managing the DC1/DC3 signalling on the host-side?

Thanks,
Lee.
Re: serial flow control appears broken
Hello,

> This is evidenced in hardware flow control by a little LED labeled "RTS"
> that is on the external modem. This LED lights up when pin 7 of the DB9
> serial connection is given +12Vdc current (signalling "RTS" is on - that
> the host can accept data). The LED goes dark when the current is
> removed (signalling that the host cannot accept data). This "RTS" LED
> never flickers at all, as it should, when receiving these bursts of data
> - the LED stays lit as long as the serial cable is connected to the
> host... and yet I will see those "input overrun" messages. Thus, it
> seems quite clear that the Linux serial tty driver is not deasserting
> RTS as it should in hardware flow control. (And probably the analogous
> problem exists in software flow control, too.)

I don't know the relevant timings for problem, but just to be sure that your prerequisites are correct: How did you check that the LED stays lit all the time? Just from looking might not be accurate. You might want to measure the signal with an oscilloscope.

Just my 0.02¢
Uwe

--
Uwe Kleine-König

fib where fib = 0 : 1 : zipWith (+) fib (tail fib)
Re: serial flow control appears broken
Lee Howard wrote:
> Hello.
>
> I have fax modems that will, in their proper behavior with certain features, send up to 64 kilobytes of data to the host DTE all at once. (So, the fax modem handles an incoming fax and periodically will send between 256 bytes and 64 kilobytes of data in bursts.)
>
> When the DCE-DTE (modem-to-host) communication rate is established at 115200 bps data loss occurs on systems using at least Linux kernels 2.6.5 and 2.6.18 (and probably everything in-between and then some more). This is because the modem overflows the host's buffer. This is evidenced in kernel logging:
>
> Jul 23 14:01:30 gollum kernel: ttyS1: 1 input overrun(s)
> Jul 23 17:09:45 gollum kernel: ttyS1: 1 input overrun(s)
>
> Normally I would blame the modem itself for not honoring the host's flow control signals. However, I have worked with the modem manufacturer closely on this matter for over three months now. In that process they have improved the responsiveness of the modem and have fixed other problems, but the end result is that it truly does appear that the serial tty driver is not using flow control. Whether software flow control (XON/XOFF) or hardware flow control (RTS/CTS) is used the result is the same.
>
> This is evidenced in hardware flow control by a little LED labeled "RTS" that is on the external modem. This LED lights up when pin 7 of the DB9 serial connection is given +12Vdc current (signalling "RTS" is on - that the host can accept data). The LED goes dark when the current is removed (signalling that the host cannot accept data). This "RTS" LED never flickers at all, as it should, when receiving these bursts of data - the LED stays lit as long as the serial cable is connected to the host... and yet I will see those "input overrun" messages. Thus, it seems quite clear that the Linux serial tty driver is not deasserting RTS as it should in hardware flow control. (And probably the analogous problem exists in software flow control, too.)
>
> Please tell me what I can do to help you resolve and/or remedy this matter. Also, please let me know if I have contacted the wrong people. (I have cross-posted to linux-kernel as a catch-all. I am not subscribed to either linux-serial or linux-kernel mailing lists. So please CC me in any list responses.)
>
> If it is of any value to know (perhaps they have common code?), the same error occurs on FreeBSD 6.2 as well. The problem does not occur on Windows. The problem does not occur on RedHat 6.0 (kernel 2.2.5).

What kind of serial port and machine is this on? From what I can see, a standard 16550 UART (not a special variant) just doesn't have any support for clearing RTS on its own when its input FIFO gets too full. The kernel would have to do it in that case. I'm not seeing where it would be controlling that automatically (as opposed to manually from the application with TIOCM_RTS). I'm also not sure if the UART gives the kernel enough information for it to even be able to control this line properly automatically.

That's assuming it actually is a 16550 or similar with a 16-byte FIFO at all, which assuming it's a non-ancient PC it should be, but who knows.

--
Robert Hancock
Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
Re: serial flow control appears broken
Lee Howard wrote: Hello. I have fax modems that will, in their proper behavior with certain features, send up to 64 kilobytes of data to the host DTE all at once. (So, the fax modem handles an incoming fax and periodically will send between 256 bytes and 64 kilobytes of data in bursts.) When the DCE-DTE (modem-to-host) communication rate is established at 115200 bps data loss occurs systems using at least Linux kernels 2.6.5 and 2.6.18 (and probably everything in-beween and then some more). This is because the modem overflows the host's buffer. This is evidenced in kernel logging: Jul 23 14:01:30 gollum kernel: ttyS1: 1 input overrun(s) Jul 23 17:09:45 gollum kernel: ttyS1: 1 input overrun(s) Normally I would blame the modem itself for not honoring the host's flow control signals. However, I have worked with the modem manufacturer closely on this matter for over three months now. In that process they have improved the responsiveness of the modem and have fixed other problems, but the end result is that it truly does appear that the serial tty driver is not using flow control. Whether software flow control (XON/XOFF) or hardware flow control (RTS/CTS) is used the result is the same. This is evidenced in hardware flow control by a little LED labeled RTS that is on the external modem. This LED lights up when pin 7 of the DB9 serial connection is given +12Vdc current (signalling RTS is on - that the host can accept data). The LED goes dark when the current is removed (signalling that the host cannot accept data). This RTS LED never flickers at all, as it should, when receiving these bursts of data - the LED stays lit as long as the serial cable is connected to the host... and yet I will see those input overrun messages. Thus, it seems quite clear that the Linux serial tty driver is not deasserting RTS as it should in hardware flow control. (And probably the analogous problem exists in software flow control, too.) 
Please tell me what I can do to help you resove and/or remedy this matter. Also, please let me know if I have contacted the wrong people. (I have cross-posted to linux-kernel as a catch-all. I am not subscribed to either linux-serial or linux-kernel mailing lists. So please CC me in any list responses.) If it is of any value to know (perhaps they have common code?), the same error occurs on FreeBSD 6.2 as well. The problem does not occur on Windows. The problem does not occur on RedHat 6.0 (kernel 2.2.5). What kind of serial port and machine is this on? From what I can see, a standard 16550 UART (not a special variant) just doesn't have any support for clearing RTS on its own when its input FIFO gets too full. The kernel would have to do it in that case. I'm not seeing where it would be controlling that automatically (as opposed to manually from the application with TIOCM_RTS). I'm also not sure if the UART gives the kernel enough information for it to even be able to control this line properly automatically. That's assuming it actually is a 16550 or similar with a 16-byte FIFO at all, which assuming it's a non-ancient PC it should be, but who knows. -- Robert Hancock Saskatoon, SK, Canada To email, remove nospam from [EMAIL PROTECTED] Home Page: http://www.roberthancock.com/ - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: serial flow control appears broken
The manufacturer is using a scope to look for RTS and they're not seeing it, either. I just use my eyes to look at the LED, but I can see the CTS, DTR, DCD, RD, and TD lights blink, flicker, or dim... (and TD, RD, and CTS tend to go on and off rather quickly).

And you have

1. The port set up correctly for flow control options in the kernel?
2. Verified that the board vendor remembered to wire it?
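One way to check the first question from user space is to set the RTS/CTS flag on the port and read the settings back. The sketch below is a hedged illustration in Python, assuming a Linux system where `termios.CRTSCTS` is available; the function names are hypothetical and were not part of the thread.

```python
import termios

# Indices into the list returned by termios.tcgetattr().
IFLAG, OFLAG, CFLAG, LFLAG, ISPEED, OSPEED, CC = range(7)


def with_hw_flow_control(attrs):
    """Return a copy of `attrs` with RTS/CTS enabled and XON/XOFF disabled."""
    new = list(attrs)
    new[CC] = list(attrs[CC])
    new[CFLAG] |= termios.CRTSCTS
    new[IFLAG] &= ~(termios.IXON | termios.IXOFF)
    return new


def enable_hw_flow_control(fd):
    """Request RTS/CTS on an open tty and report whether the flag stuck."""
    wanted = with_hw_flow_control(termios.tcgetattr(fd))
    termios.tcsetattr(fd, termios.TCSANOW, wanted)
    # tcsetattr() can report success even when only some of the requested
    # changes were applied, so read the settings back to confirm.
    return bool(termios.tcgetattr(fd)[CFLAG] & termios.CRTSCTS)
```

If `enable_hw_flow_control(fd)` returns False on the port in question, the driver did not accept the setting, which would point at configuration before wiring.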
Re: serial flow control appears broken
Uwe Kleine-König wrote:

Hello,

This is evidenced in hardware flow control by a little LED labeled RTS that is on the external modem. This LED lights up when pin 7 of the DB9 serial connection is given +12Vdc current (signalling RTS is on - that the host can accept data). The LED goes dark when the current is removed (signalling that the host cannot accept data). This RTS LED never flickers at all, as it should, when receiving these bursts of data - the LED stays lit as long as the serial cable is connected to the host... and yet I will see those input overrun messages. Thus, it seems quite clear that the Linux serial tty driver is not deasserting RTS as it should in hardware flow control. (And probably the analogous problem exists in software flow control, too.)

I don't know the relevant timings for the problem, but just to be sure that your prerequisites are correct: How did you check that the LED stays lit all the time? Just from looking might not be accurate. You might want to measure the signal with an oscilloscope.

The manufacturer is using a scope to look for RTS and they're not seeing it, either. I just use my eyes to look at the LED, but I can see the CTS, DTR, DCD, RD, and TD lights blink, flicker, or dim... (and TD, RD, and CTS tend to go on and off rather quickly).

All of that said... even though I don't see RTS flicker or blink or dim when using kernel 2.2.5 (RedHat 6.0) I don't have any problems using 115200 bps DTE-DCE communication rate.

Thanks,

Lee.
Re: serial flow control appears broken
Robert Hancock wrote:

Lee Howard wrote:

Hello. I have fax modems that will, in their proper behavior with certain features, send up to 64 kilobytes of data to the host DTE all at once. (So, the fax modem handles an incoming fax and periodically will send between 256 bytes and 64 kilobytes of data in bursts.) When the DCE-DTE (modem-to-host) communication rate is established at 115200 bps, data loss occurs on systems using at least Linux kernels 2.6.5 and 2.6.18 (and probably everything in between and then some more). This is because the modem overflows the host's buffer. This is evidenced in kernel logging:

Jul 23 14:01:30 gollum kernel: ttyS1: 1 input overrun(s)
Jul 23 17:09:45 gollum kernel: ttyS1: 1 input overrun(s)

Normally I would blame the modem itself for not honoring the host's flow control signals. However, I have worked with the modem manufacturer closely on this matter for over three months now. In that process they have improved the responsiveness of the modem and have fixed other problems, but the end result is that it truly does appear that the serial tty driver is not using flow control. Whether software flow control (XON/XOFF) or hardware flow control (RTS/CTS) is used the result is the same.

This is evidenced in hardware flow control by a little LED labeled RTS that is on the external modem. This LED lights up when pin 7 of the DB9 serial connection is given +12Vdc current (signalling RTS is on - that the host can accept data). The LED goes dark when the current is removed (signalling that the host cannot accept data). This RTS LED never flickers at all, as it should, when receiving these bursts of data - the LED stays lit as long as the serial cable is connected to the host... and yet I will see those input overrun messages. Thus, it seems quite clear that the Linux serial tty driver is not deasserting RTS as it should in hardware flow control. (And probably the analogous problem exists in software flow control, too.)
Please tell me what I can do to help you resolve and/or remedy this matter. Also, please let me know if I have contacted the wrong people. (I have cross-posted to linux-kernel as a catch-all. I am not subscribed to either linux-serial or linux-kernel mailing lists. So please CC me in any list responses.)

If it is of any value to know (perhaps they have common code?), the same error occurs on FreeBSD 6.2 as well. The problem does not occur on Windows. The problem does not occur on RedHat 6.0 (kernel 2.2.5).

What kind of serial port and machine is this on? From what I can see, a standard 16550 UART (not a special variant) just doesn't have any support for clearing RTS on its own when its input FIFO gets too full. The kernel would have to do it in that case. I'm not seeing where it would be controlling that automatically (as opposed to manually from the application with TIOCM_RTS). I'm also not sure if the UART gives the kernel enough information for it to even be able to control this line properly automatically.

That's assuming it actually is a 16550 or similar with a 16-byte FIFO at all, which assuming it's a non-ancient PC it should be, but who knows.

Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing enabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A

It's a Shuttle HOT-661 motherboard (VIA Apollo Pro Plus mainboard chipset). Both FreeBSD and Linux identify the serial chipset type as 16550A.

If the application were to use TIOCM_RTS how would it know when to apply it or not? Is there some approach that the application could take to manage flow control on the serial port?

What about software flow control? Does the application (and not the driver) need to be managing the DC1/DC3 signalling on the host-side?

Thanks,

Lee.
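On the DC1/DC3 question, POSIX termios separates the two directions: IXON makes the tty pause its own output when it receives the STOP character, while IXOFF asks the tty to send STOP/START to the remote end as its input queue fills and drains. For modem-to-host bursts, IXOFF is the relevant flag, and with it set the application does not normally send DC1/DC3 itself. The Python sketch below, with a hypothetical helper name, reports what a port is actually configured for; it assumes a Linux `termios` module.

```python
import termios


def describe_flow_control(attrs):
    """Summarise flow-control settings from a termios.tcgetattr() result."""
    iflag, cflag, cc = attrs[0], attrs[2], attrs[6]
    return {
        # Driver-managed RTS/CTS hardware handshaking.
        "rts_cts": bool(cflag & termios.CRTSCTS),
        # Local output pauses when the remote end sends STOP (DC3).
        "xonxoff_output": bool(iflag & termios.IXON),
        # The tty sends STOP/START to throttle the remote sender.
        "xonxoff_input": bool(iflag & termios.IXOFF),
        # Characters used as START (normally DC1, 0x11) and STOP (DC3, 0x13).
        "start_char": cc[termios.VSTART],
        "stop_char": cc[termios.VSTOP],
    }
```

Calling `describe_flow_control(termios.tcgetattr(fd))` on the open modem port shows whether the fax application really left the kernel in the mode it was asked for.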
Re: serial flow control appears broken
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing enabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A

It's a Shuttle HOT-661 motherboard (VIA Apollo Pro Plus mainboard chipset). Both FreeBSD and Linux identify the serial chipset type as 16550A.

So you've got 16 bytes of buffering. That ought to be enough on a modern PC. The older kernels use quite limited internal buffers which may be a factor, the current ones have a rewritten tty buffering layer which may improve matters enormously.

Alan
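To put rough numbers on "16 bytes of buffering", the following Python sketch works out the timing budget. It is a back-of-the-envelope estimate under stated assumptions: 8N1 framing (10 bits on the wire per byte), the receive trigger level of 8 mentioned elsewhere in this thread, and no allowance for the UART's receive shift register. The function names are hypothetical.

```python
def byte_time_us(baud, bits_per_byte=10):
    """Microseconds needed to receive one byte (8N1 assumed: 10 bits)."""
    return 1_000_000 * bits_per_byte / baud


def overrun_headroom_us(baud, fifo_depth=16, trigger_level=8, bits_per_byte=10):
    """Time between the receive interrupt firing and the FIFO filling up."""
    free_slots = fifo_depth - trigger_level
    return free_slots * byte_time_us(baud, bits_per_byte)


def burst_duration_s(n_bytes, baud, bits_per_byte=10):
    """Seconds needed to transfer a burst of n_bytes at the given rate."""
    return n_bytes * bits_per_byte / baud


if __name__ == "__main__":
    print(f"byte time:  {byte_time_us(115200):.1f} us")
    print(f"headroom:   {overrun_headroom_us(115200):.1f} us")
    print(f"64 KB burst: {burst_duration_s(64 * 1024, 115200):.2f} s")
```

Under these assumptions a byte arrives about every 87 microseconds, so the interrupt handler has roughly 0.7 ms to start draining the FIFO, and it must keep meeting that deadline for the roughly 5.7 seconds a full 64 KB burst lasts.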
Re: serial flow control appears broken
Hello,

This is evidenced in hardware flow control by a little LED labeled RTS that is on the external modem. This LED lights up when pin 7 of the DB9 serial connection is given +12Vdc current (signalling RTS is on - that the host can accept data). The LED goes dark when the current is removed (signalling that the host cannot accept data). This RTS LED never flickers at all, as it should, when receiving these bursts of data - the LED stays lit as long as the serial cable is connected to the host... and yet I will see those input overrun messages. Thus, it seems quite clear that the Linux serial tty driver is not deasserting RTS as it should in hardware flow control. (And probably the analogous problem exists in software flow control, too.)

I don't know the relevant timings for the problem, but just to be sure that your prerequisites are correct: How did you check that the LED stays lit all the time? Just from looking might not be accurate. You might want to measure the signal with an oscilloscope.

Just my 0.02¢

Uwe

--
Uwe Kleine-König
fib where fib = 0 : 1 : zipWith (+) fib (tail fib)
Re: serial flow control appears broken
Alan Cox wrote:

Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing enabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A

It's a Shuttle HOT-661 motherboard (VIA Apollo Pro Plus mainboard chipset). Both FreeBSD and Linux identify the serial chipset type as 16550A.

So you've got 16 bytes of buffering. That ought to be enough on a modern PC. The older kernels use quite limited internal buffers which may be a factor, the current ones have a rewritten tty buffering layer which may improve matters enormously.

So, does this explain why I wouldn't have a problem at 115200 bps with kernel 2.2.5 but why I do with 2.6.5 and 2.6.18?

Both hardware and software flow control work fine with 2.2.5 (meaning I don't see any error message and I don't have any data corruption), but neither prevents the "kernel: ttyS1: 1 input overrun(s)" message and the consequent data corruption in 2.6.5 or 2.6.18.

Was there some associated application change in tty handling that needed to occur between the 2.2 and 2.6 kernels to properly implement flow control?

Thanks,

Lee.