Re: serial flow control appears broken

2007-08-05 Thread Paul Fulghum
2.2.5 is using the same UART setup (trigger level of 8) as
the current code. There is no obvious difference in the
interrupt setup (same devices on the same interrupts).

So I have no helpful suggestions :-(
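
For reference, the "trigger level of 8" mentioned above is one of the four receive-FIFO interrupt thresholds a 16550A offers (1, 4, 8 or 14 bytes), selected by the top two bits of its FIFO Control Register. The following is only a sketch of that encoding and of the slack each level leaves in the 16-byte FIFO; the helper names are made up for illustration and are not the kernel's.

```c
/* Hypothetical helper: FCR bits 7:6 for a 16550A receive trigger level.
 * Returns -1 for a level the chip does not offer. */
int fcr_rx_trigger_bits(int level)
{
    switch (level) {
    case 1:  return 0x00;
    case 4:  return 0x40;
    case 8:  return 0x80;
    case 14: return 0xC0;
    default: return -1;
    }
}

/* Bytes that can still arrive after the trigger interrupt fires
 * before the 16-byte receive FIFO is full. */
int fifo_slack_bytes(int level)
{
    return 16 - level;
}
```

With a trigger level of 8 the handler has 8 byte-times of slack; at 14 it has only 2.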

--
Paul


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: serial flow control appears broken

2007-08-04 Thread Lee Howard

Paul Fulghum wrote:
> Lee Howard wrote:
>> And in repeat tests it is quite evident that IDE disk activity is,
>> indeed, at least part of the problem.  As IDE disk activity increases
>> an increased amount of data coming in on the serial port goes missing.
>
> Lee, you mentioned 2.2.x kernels did not exhibit this problem.
>
> Was this on the same hardware you are currently testing?

Yes it was... except for the hard drive.  I have different installs of
different operating systems on different hard drives.  I change the hard
drive when switching between 2.2.5 and 2.6.5.

> Which 2.2.x version were you using?

The default 2.2.5 kernel that comes with RedHat 6.0.

> Was the 2.2.x serial driver also identifying the UART as a 16550A?

Yes it does.

> Can you get /proc/interrupts output
> from both the current setup and the 2.2.x setup?

Current (2.6.5):

           CPU0
  0:   14660696          XT-PIC  timer
  1:          8          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  3:    1240314          XT-PIC  serial
  4:     778901          XT-PIC  serial
  8:          1          XT-PIC  rtc
 10:     111647          XT-PIC  eth0
 14:     221202          XT-PIC  ide0
 15:         34          XT-PIC  ide1
NMI:          0
ERR:          5

(2.2.5):

           CPU0
  0:       5908          XT-PIC  timer
  1:         88          XT-PIC  i8042
  2:          0          XT-PIC  cascade
  8:          2          XT-PIC  rtc
 10:         38          XT-PIC  Intel EtherExpress Pro 10/100 Ethernet
 13:          1          XT-PIC  fpu
 14:      36637          XT-PIC  ide0
 15:          4          XT-PIC  ide1
NMI:          0
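
To compare two such dumps mechanically, each numbered line can be split into IRQ number, count, controller and device name. This is a minimal sketch that assumes the single-CPU layout shown above (one count column); the struct and function names are hypothetical. Note that the 2.2.5 listing contains no serial entries at all, so the serial IRQ lines cannot be compared directly from these two dumps.

```c
#include <stdio.h>

struct irq_entry {
    int irq;              /* IRQ number, e.g. 3 or 4 for the serial ports */
    unsigned long count;  /* interrupts counted on CPU0 */
    char chip[32];        /* interrupt controller, e.g. "XT-PIC" */
    char name[64];        /* device name(s) registered on the line */
};

/* Parse one single-CPU /proc/interrupts line.
 * Returns 1 on success, 0 for lines that do not start with a numeric
 * IRQ (the CPU0 header, NMI, ERR) or that are otherwise incomplete. */
int parse_irq_line(const char *line, struct irq_entry *e)
{
    return sscanf(line, " %d: %lu %31s %63[^\n]",
                  &e->irq, &e->count, e->chip, e->name) == 4;
}
```

Feeding every line of both files through `parse_irq_line` and matching on `irq` gives a side-by-side view of which devices share which lines.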

Thanks,

Lee.


Re: serial flow control appears broken

2007-08-04 Thread Paul Fulghum

Lee Howard wrote:
> And in repeat tests it is quite evident that IDE disk activity is,
> indeed, at least part of the problem.  As IDE disk activity increases an
> increased amount of data coming in on the serial port goes missing.


Lee, you mentioned 2.2.x kernels did not exhibit this problem.

Was this on the same hardware you are currently testing?
Which 2.2.x version were you using?
Was the 2.2.x serial driver also identifying the UART as a 16550A?
Can you get /proc/interrupts output
from both the current setup and the 2.2.x setup?

It would be interesting to compare the interrupt
assignment and UART setup between the versions.

--
Paul


Re: serial flow control appears broken

2007-08-04 Thread Lee Howard

Mark Lord wrote:
> The "fix" could be to have the serial IRQ handler never unmask interrupts,
> but that's a bit unsociable to others.  The IDE stuff really needs to not
> do so much during the actual IRQ handler.
>
> Ingo's RT patches would probably fix all of this.

I did a Fedora 7 installation and installed Ingo's kernel from here:

http://people.redhat.com/mingo/realtime-preempt/yum-testing/yum/i686/kernel-rt-2.6.21-0182.rt11cfsv17.i686.rpm

Even then, the problem still occurs, unfortunately.

Thanks,

Lee.


Re: serial flow control appears broken

2007-08-04 Thread Lee Howard

Ray Lee wrote:
> On 7/27/07, Lee Howard <[EMAIL PROTECTED]> wrote:
>> Curiously, the session at 38400 bps that skipped 858 bytes... coincided,
>> not just in sequence but also in precise timing within the session, with
>> a small but noticeable disk load that I caused by grepping through a
>> hundred session logs.  (I can't reproduce it easily, though, because of
>> disk caching.)
>
> `echo 1 > /proc/sys/vm/drop_caches` will clear out most (all?) of what
> the kernel has cached from the drive. It's there just for this kind of
> repeatability of tests...

And in repeat tests it is quite evident that IDE disk activity is, 
indeed, at least part of the problem.  As IDE disk activity increases an 
increased amount of data coming in on the serial port goes missing.
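
For scripted repeat tests, the `drop_caches` step quoted above can also be done from a small C helper. This is a sketch only: the path is passed in as a parameter (an assumption made so the helper can be tried on a scratch file), the function name is hypothetical, and on a real system the target would be `/proc/sys/vm/drop_caches`, which needs root and exists on 2.6.16 and later kernels.

```c
#include <stdio.h>
#include <unistd.h>

/* Flush dirty pages, then write `mode` to the drop_caches control file.
 * mode 1 = page cache, 2 = dentries and inodes, 3 = both.
 * Returns 0 on success, -1 on failure. */
int drop_caches(const char *path, int mode)
{
    FILE *f;

    if (mode < 1 || mode > 3)
        return -1;
    sync();                     /* only clean pages can be dropped */
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    if (fprintf(f, "%d\n", mode) < 0) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}
```

Calling `drop_caches("/proc/sys/vm/drop_caches", 1)` before each run should make the grep hit the disk again instead of the page cache.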


Thanks,

Lee.


Re: serial flow control appears broken

2007-08-04 Thread Lee Howard

Maciej W. Rozycki wrote:
> On Fri, 27 Jul 2007, Lee Howard wrote:
>> Okay, so let's say we've got a loop around a blocking read on the modem file
>> descriptor...
>>
>> for (;;) {
>>     read some data from modem
>>     process data from modem
>>     if (end-of-data detected) break;
>> }
>>
>> Are you suggesting that the application should be deasserting RTS after
>> the read and asserting it before?
>
> It certainly could -- you were asking how it would know. ;-)


So, to test... I put this in the application before every read:

   int flags;
   ioctl(modemFd, TIOCMGET, &flags);
   flags |= TIOCM_RTS;
   ioctl(modemFd, TIOCMSET, &flags);

and this after:

   int flags;
   ioctl(modemFd, TIOCMGET, &flags);
   flags &= ~TIOCM_RTS;
   ioctl(modemFd, TIOCMSET, &flags);

Now I can see the RTS light blink on the modem (and during heavy
communication it merely "dims", depending on the amount of delay in the
processing).


However, it does not help.  Data still goes missing.
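
The two snippets above can be folded into one helper that also checks the ioctl return values, which the test code does not. This is a sketch under the assumption of a Linux tty file descriptor; `with_rts` and `set_rts` are illustrative names, not part of any existing API.

```c
#include <sys/ioctl.h>

/* Pure helper: return `flags` with the RTS bit set or cleared. */
int with_rts(int flags, int on)
{
    return on ? (flags | TIOCM_RTS) : (flags & ~TIOCM_RTS);
}

/* Read-modify-write the modem control lines of `fd`.
 * Returns 0 on success, -1 if either ioctl fails (errno is set). */
int set_rts(int fd, int on)
{
    int flags;

    if (ioctl(fd, TIOCMGET, &flags) == -1)
        return -1;
    flags = with_rts(flags, on);
    return ioctl(fd, TIOCMSET, &flags) == -1 ? -1 : 0;
}
```

In the read loop this would be `set_rts(modemFd, 1)` before each read and `set_rts(modemFd, 0)` after it. Linux also provides `TIOCMBIS` and `TIOCMBIC`, which set or clear individual bits without the separate `TIOCMGET` step.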

Thanks,

Lee.



Re: serial flow control appears broken

2007-08-03 Thread Maciej W. Rozycki
On Thu, 2 Aug 2007, Alan Cox wrote:

> Currently libata PIO is mostly done in the IRQ path. Albert Lee was doing
> some work on that but it's actually very hard to fix without doing polled
> PIO.

 Hmm, when the drive signals it is ready for a PIO data transfer, can't
the interrupt handler just mask the originating interrupt and post a
softirq to handle the case?  That should be rather straightforward.
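
The split suggested here can be pictured with a toy user-space model. It is not kernel code and does not use any kernel API; every name in it is hypothetical and the "transfer" is just a counter. It only shows the shape of the idea: the part that runs in interrupt context masks the source and queues the work, and the slow transfer happens later.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a PIO device; all fields are illustrative. */
struct pio_dev {
    bool irq_masked;     /* device interrupt masked at the controller */
    bool work_pending;   /* deferred transfer has been queued */
    size_t words_left;   /* 16-bit words the drive wants transferred */
    size_t words_done;   /* words transferred so far */
};

/* "Top half": runs while other interrupts are held off, so it only
 * masks the originating interrupt and queues the real work. */
void pio_top_half(struct pio_dev *d, size_t words)
{
    d->irq_masked = true;
    d->words_left = words;
    d->work_pending = true;
}

/* "Bottom half": runs later with interrupts enabled, performs the slow
 * transfer, then unmasks the device so it can interrupt again. */
void pio_bottom_half(struct pio_dev *d)
{
    if (!d->work_pending)
        return;
    while (d->words_left > 0) {
        d->words_left--;          /* stand-in for one port read/write */
        d->words_done++;
    }
    d->work_pending = false;
    d->irq_masked = false;
}
```

In this model a serial interrupt arriving between `pio_top_half` and the end of `pio_bottom_half` would not have to wait for the whole transfer, which is the point of the proposal.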

  Maciej


Re: serial flow control appears broken

2007-08-02 Thread Alan Cox
> That's what "hdparm -u1" (or -u0) controls.

Only some of the time.

> Ingo's RT patches would probably fix all of this.

The worst case IDE times we've seen for executing a single indivisible
un-interruptible I/O cycle with a drive are around 1 ms. That's a hardware
limit.

Alan


Re: serial flow control appears broken

2007-08-02 Thread Robert Hancock

Alan Cox wrote:
>> I think that PIO transfers only have to be done with interrupts disabled
>> on really old, evil controllers (without unmask set). I don't think
>> libata ever disables interrupts during transfers(?)
>
> Currently libata PIO is mostly done in the IRQ path. Albert Lee was doing
> some work on that but it's actually very hard to fix without doing polled
> PIO.

Ah, right. Misread the code.


Re: serial flow control appears broken

2007-08-02 Thread Alan Cox
> I think that PIO transfers only have to be done with interrupts disabled 
> on really old, evil controllers (without unmask set). I don't think 
> libata ever disables interrupts during transfers(?)

Currently libata PIO is mostly done in the IRQ path. Albert Lee was doing
some work on that but it's actually very hard to fix without doing polled
PIO.



Re: serial flow control appears broken

2007-08-02 Thread Robert Hancock

Mark Lord wrote:
>> I think that PIO transfers only have to be done with interrupts
>> disabled on really old, evil controllers (without unmask set). I don't
>> think libata ever disables interrupts during transfers(?)
>
> That's what "hdparm -u1" (or -u0) controls.
>
> But it doesn't matter a whit here.  The problem is that the IDE interrupt
> handling can take a long time, regardless of whether it unmasks IRQs or not.
>
> And if that IDE interrupt interrupts a serial interrupt, then the serial
> stuff won't get handled until the IDE stuff completes.  Thus the problem.
>
> The "fix" could be to have the serial IRQ handler never unmask interrupts,
> but that's a bit unsociable to others.  The IDE stuff really needs to not
> do so much during the actual IRQ handler.
>
> Ingo's RT patches would probably fix all of this.

libata also doesn't do the actual PIO transfer from the interrupt
handler like old IDE does, either, and it only disables interrupts for
the transfer if it's transferring to/from high memory.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: serial flow control appears broken

2007-08-02 Thread Mark Lord

Robert Hancock wrote:
> Mark Lord wrote:
>> I don't believe the speed of the machine has much to do with it,
>> as IDE PIO is always at pretty much the same speed (or slower)
>> regardless of the CPU speed.
>>
>> Best case is about .120 usec per 16-bit word, but that doesn't often pan out
>> in practice.  More typical is something closer to 1 usec per 16-bit word.
>>
>> So, for multcount=16 (very common), best case is 16 * 256 * .120 = 491 usec,
>> plus extra overhead for reading the IDE status register (another usec or so),
>> and other stuff.  Figure maybe 500usec total per interrupt for multcount=16
>> in the best case, or 4000usec in the worst case.
>>
>> At 115200bps, we get a byte every 86 usec or so.  Assuming the UART FIFO
>> is set to interrupt (warn) us at 12/16 full, we have 4*86 = 344 usec to
>> respond and de-assert RTS.  Less than that in practice.
>>
>> Conclusion:  using IDE multisector PIO is not a good idea with high speed
>> serial transfers happening, since we cannot respond quickly enough.
>>
>> It might be possible to set the buffer underrun threshold lower in the
>> UART (?).
>>
>> All that said, I doubt that his system is using IDE PIO in the first place.
>> Dunno how long IDE DMA interrupts take, but it's probably in the 20-50
>> usec range.
>
> I think that PIO transfers only have to be done with interrupts disabled
> on really old, evil controllers (without unmask set). I don't think
> libata ever disables interrupts during transfers(?)


That's what "hdparm -u1" (or -u0) controls.

But it doesn't matter a whit here.  The problem is that the IDE interrupt
handling can take a long time, regardless of whether it unmasks IRQs or not.
And if that IDE interrupt interrupts a serial interrupt, then the serial
stuff won't get handled until the IDE stuff completes.  Thus the problem.

The "fix" could be to have the serial IRQ handler never unmask interrupts,
but that's a bit unsociable to others.  The IDE stuff really needs to not
do so much during the actual IRQ handler.

Ingo's RT patches would probably fix all of this.




Re: serial flow control appears broken

2007-08-02 Thread Robert Hancock

Mark Lord wrote:
> I don't believe the speed of the machine has much to do with it,
> as IDE PIO is always at pretty much the same speed (or slower)
> regardless of the CPU speed.
>
> Best case is about .120 usec per 16-bit word, but that doesn't often pan out
> in practice.  More typical is something closer to 1 usec per 16-bit word.
>
> So, for multcount=16 (very common), best case is 16 * 256 * .120 = 491 usec,
> plus extra overhead for reading the IDE status register (another usec or so),
> and other stuff.  Figure maybe 500usec total per interrupt for multcount=16
> in the best case, or 4000usec in the worst case.
>
> At 115200bps, we get a byte every 86 usec or so.  Assuming the UART FIFO
> is set to interrupt (warn) us at 12/16 full, we have 4*86 = 344 usec to
> respond and de-assert RTS.  Less than that in practice.
>
> Conclusion:  using IDE multisector PIO is not a good idea with high speed
> serial transfers happening, since we cannot respond quickly enough.
>
> It might be possible to set the buffer underrun threshold lower in the
> UART (?).
>
> All that said, I doubt that his system is using IDE PIO in the first place.
> Dunno how long IDE DMA interrupts take, but it's probably in the 20-50
> usec range.


I think that PIO transfers only have to be done with interrupts disabled 
on really old, evil controllers (without unmask set). I don't think 
libata ever disables interrupts during transfers(?)


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: serial flow control appears broken

2007-08-02 Thread Mark Lord

Maciej W. Rozycki wrote:
> On Sat, 28 Jul 2007, Russell King wrote:
>> Essentially, any complex interrupt handler (such as an IDE interrupt
>> doing a multi-sector PIO transfer _in interrupt context_) can cause this
>> kind of starvation.  That's why Linux 1.x had bottom halves - so that
>> the time consuming work could be moved out of the interrupt handler,
>> thereby causing minimal blockage of other interrupts.
>>
>> Unfortunately, that kind of design has been long since forgotten.
>> Apparently modern machines are fast enough that it doesn't have to be
>> worried about anymore...  Or are they?
>
>  I would guess it is not that the machines are fast enough, but that this
> two-level processing makes things more complicated.  Enough that most
> people would not bother digging into it unless really forced.  Only
> occasional latency problems are probably not enough of a force.


I don't believe the speed of the machine has much to do with it,
as IDE PIO is always at pretty much the same speed (or slower)
regardless of the CPU speed.

Best case is about .120 usec per 16-bit word, but that doesn't often pan out
in practice.  More typical is something closer to 1 usec per 16-bit word.

So, for multcount=16 (very common), best case is 16 * 256 * .120 = 491 usec,
plus extra overhead for reading the IDE status register (another usec or so),
and other stuff.  Figure maybe 500usec total per interrupt for multcount=16
in the best case, or 4000usec in the worst case.

At 115200bps, we get a byte every 86 usec or so.  Assuming the UART FIFO
is set to interrupt (warn) us at 12/16 full, we have 4*86 = 344 usec to
respond and de-assert RTS.  Less than that in practice.

Conclusion:  using IDE multisector PIO is not a good idea with high speed
serial transfers happening, since we cannot respond quickly enough.

It might be possible to set the buffer underrun threshold lower in the UART (?).

All that said, I doubt that his system is using IDE PIO in the first place.
Dunno how long IDE DMA interrupts take, but it's probably in the 20-50 usec 
range.
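
The arithmetic in this message can be checked with a few lines of C. The sketch below assumes 10 bits on the wire per byte (8N1 framing) and a 16-byte receive FIFO, and it ignores the receive shift register and any handler entry latency; the function names are illustrative only.

```c
/* Microseconds per received byte, assuming 10 bits on the wire per byte
 * (start + 8 data + stop, i.e. 8N1). */
double byte_time_us(double bps)
{
    return 10.0 * 1e6 / bps;
}

/* Microseconds until the 16-byte FIFO is full, counted from the moment
 * the receive interrupt fires at `trigger` bytes. */
double fifo_headroom_us(int trigger, double bps)
{
    return (16 - trigger) * byte_time_us(bps);
}

/* Microseconds for one multi-sector PIO burst: `multcount` sectors of
 * 256 16-bit words each. */
double pio_burst_us(int multcount, double us_per_word)
{
    return multcount * 256 * us_per_word;
}
```

At 115200 bps a byte takes about 86.8 usec, so a trigger of 12 leaves roughly 347 usec of headroom (the message rounds this to 344), which is less than the 491.52 usec best-case burst for multcount=16. With the trigger level of 8 reported elsewhere in the thread, the headroom is roughly 694 usec: enough for the best case, but far short of the 4096 usec worst case.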


Re: serial flow control appears broken

2007-08-02 Thread Mark Lord

Maciej W. Rozycki wrote:

On Sat, 28 Jul 2007, Russell King wrote:


Essentially, any complex interrupt handler (such as an IDE interrupt
doing a multi-sector PIO transfer _in interrupt context_) can cause this
kind of starvation.  That's why Linux 1.x had bottom halves - so that
the time consuming work could be moved out of the interrupt handler,
thereby causing minimal the blockage of other interrupts.

Unfortunately, that kind of design has been long since forgotten.
Apparantly modern machines are fast enough that it doesn't have to be
worried about anymore...  Or are they?


 I would guess it is not that the machines are fast enough, but that this 
two-level processing makes things more complicated.  Enough that most 
people would not bother digging into it unless really forced.  Only 
occasional latency problems are probably not enough of a force.


I don't believe the speed of the machine has much to do with it,
as IDE PIO is always at pretty much the same speed (or slower)
regardless of the CPU speed.

Best case is about .120 usec per 16-bit word, but that doesn't often pan out
in practice.  More typical is something closer to 1 usec per 16-bit word.

So, for multcount=16 (very common), best case is 16 * 256 * .120 = 491 usec,
plus extra overhead for reading the IDE status register (another usec or so),
and other stuff.  Figure maybe 500usec total per interrupt for multcount=16
in the best case, or 4000usec in the worst case.

At 115200bps, we get a byte every 86 usec or so.  Assuming the UART FIFO
is set to interrupt (warn) us at 12/16 full, we have 4*86 = 344 usec to
respond and de-assert RTS.  Less than that in practice.

Conclusion:  using IDE multisector PIO is not a good idea with high speed
serial transfers happening, since we cannot respond quickly enough.

It might be possible to set the buffer underrun threshold lower in the UART (?).

All that said, I doubt that his system is using IDE PIO in the first place.
Dunno how long IDE DMA interrupts take, but it's probably in the 20-50 usec 
range.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: serial flow control appears broken

2007-08-02 Thread Robert Hancock

Mark Lord wrote:

I don't believe the speed of the machine has much to do with it,
as IDE PIO is always at pretty much the same speed (or slower)
regardless of the CPU speed.

Best case is about .120 usec per 16-bit word, but that doesn't often pan 
out

in practice.  More typical is something closer to 1 usec per 16-bit word.

So, for multcount=16 (very common), best case is 16 * 256 * .120 = 491 
usec,
plus extra overhead for reading the IDE status register (another usec or 
so),

and other stuff.  Figure maybe 500usec total per interrupt for multcount=16
in the best case, or 4000usec in the worst case.

At 115200bps, we get a byte every 86 usec or so.  Assuming the UART FIFO
is set to interrupt (warn) us at 12/16 full, we have 4*86 = 344 usec to
respond and de-assert RTS.  Less than that in practice.

Conclusion:  using IDE multisector PIO is not a good idea with high speed
serial transfers happening, since we cannot respond quickly enough.

It might be possible to set the buffer underrun threshold lower in the 
UART (?).


All that said, I doubt that his system is using IDE PIO in the first place.
Dunno how long IDE DMA interrupts take, but it's probably in the 20-50 
usec range.


I think that PIO transfers only have to be done with interrupts disabled 
on really old, evil controllers (without unmask set). I don't think 
libata ever disables interrupts during transfers(?)


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove nospam from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: serial flow control appears broken

2007-08-02 Thread Alan Cox
 I think that PIO transfers only have to be done with interrupts disabled 
 on really old, evil controllers (without unmask set). I don't think 
 libata ever disables interrupts during transfers(?)

Currently libata PIO is mostly done in the IRQ path. Albert Lee was doing
some work on that but its actually very hard to fix without doing polled
PIO.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: serial flow control appears broken

2007-08-02 Thread Mark Lord

Robert Hancock wrote:

Mark Lord wrote:

I don't believe the speed of the machine has much to do with it,
as IDE PIO is always at pretty much the same speed (or slower)
regardless of the CPU speed.

Best case is about .120 usec per 16-bit word, but that doesn't often 
pan out

in practice.  More typical is something closer to 1 usec per 16-bit word.

So, for multcount=16 (very common), best case is 16 * 256 * .120 = 491 
usec,
plus extra overhead for reading the IDE status register (another usec 
or so),
and other stuff.  Figure maybe 500usec total per interrupt for 
multcount=16

in the best case, or 4000usec in the worst case.

At 115200bps, we get a byte every 86 usec or so.  Assuming the UART FIFO
is set to interrupt (warn) us at 12/16 full, we have 4*86 = 344 usec to
respond and de-assert RTS.  Less than that in practice.

Conclusion:  using IDE multisector PIO is not a good idea with high speed
serial transfers happening, since we cannot respond quickly enough.

It might be possible to set the buffer underrun threshold lower in the 
UART (?).


All that said, I doubt that his system is using IDE PIO in the first 
place.
Dunno how long IDE DMA interrupts take, but it's probably in the 20-50 
usec range.


I think that PIO transfers only have to be done with interrupts disabled 
on really old, evil controllers (without unmask set). I don't think 
libata ever disables interrupts during transfers(?)


That's what hdparm -u1 (or -u0) controls.

But it doesn't matter a whit here.  The problem is that the IDE interrupt
handling can take a long time, regardless of whether it unmasks IRQs or not.
And if that IDE interrupt interrupts a serial interrupt, then the serial
stuff won't get handled until the IDE stuff completes.  Thus the problem.

The fix could be to have the serial IRQ handler never unmask interrupts,
but that's a bit unsociable to others.  The IDE stuff really needs to not
do so much during the actual IRQ handler.

Ingo's RT patches would probably fix all of this.


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: serial flow control appears broken

2007-08-02 Thread Robert Hancock

Mark Lord wrote:
I think that PIO transfers only have to be done with interrupts 
disabled on really old, evil controllers (without unmask set). I don't 
think libata ever disables interrupts during transfers(?)


That's what hdparm -u1 (or -u0) controls.

But it doesn't matter a whit here.  The problem is that the IDE interrupt
handling can take a long time, regardless of whether it unmasks IRQs or 
not.

And if that IDE interrupt interrupts a serial interrupt, then the serial
stuff won't get handled until the IDE stuff completes.  Thus the problem.

The fix could be to have the serial IRQ handler never unmask interrupts,
but that's a bit unsociable to others.  The IDE stuff really needs to not
do so much during the actual IRQ handler.

Ingo's RT patches would probably fix all of this.


libata also doesn't do the actual PIO transfer from the interrupt
handler like old IDE does, and it only disables interrupts for
the transfer if it's transferring to/from high memory.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove nospam from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: serial flow control appears broken

2007-08-02 Thread Alan Cox
> That's what hdparm -u1 (or -u0) controls.

Only some of the time.

> Ingo's RT patches would probably fix all of this.

The worst case IDE times we've seen for executing a single indivisible
un-interruptible I/O cycle with a drive are around 1 ms. That's a hardware
limit.
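
To put the 1 ms figure next to the UART's margin, here is a rough
back-of-the-envelope sketch. It assumes 8N1 framing (10 bits on the wire
per byte), a 16550A with a 16-byte receive FIFO, and the trigger level of 8
mentioned earlier in the thread. The function name is made up for
illustration, and the extra byte time that the receive shift register
provides is ignored.

```python
def fifo_headroom_us(baud, fifo_size=16, trigger_level=8, bits_per_byte=10):
    """Microseconds between the RX trigger interrupt and a full FIFO.

    Assumes 8N1 framing (start + 8 data + stop = 10 bits per byte).
    """
    byte_time_us = bits_per_byte * 1_000_000 / baud
    # Bytes that can still arrive after the interrupt fires before the
    # FIFO is full.
    spare_bytes = fifo_size - trigger_level
    return spare_bytes * byte_time_us


if __name__ == "__main__":
    for baud in (38400, 115200):
        print(f"{baud} bps: {fifo_headroom_us(baud):.0f} us of headroom")
```

Under these assumptions the margin is about 694 us at 115200 bps and about
2083 us at 38400 bps, so a single 1 ms uninterruptible cycle would already
exceed the margin at the higher rate.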

Alan


Re: serial flow control appears broken

2007-08-02 Thread Robert Hancock

Alan Cox wrote:
> > I think that PIO transfers only have to be done with interrupts disabled
> > on really old, evil controllers (without unmask set). I don't think
> > libata ever disables interrupts during transfers(?)
>
> Currently libata PIO is mostly done in the IRQ path. Albert Lee was doing
> some work on that but it's actually very hard to fix without doing polled
> PIO.


Ah, right. Misread the code.


Re: serial flow control appears broken

2007-07-30 Thread Russell King
On Mon, Jul 30, 2007 at 10:45:19AM +0100, Maciej W. Rozycki wrote:
> On Sat, 28 Jul 2007, Russell King wrote:
> 
> > Essentially, any complex interrupt handler (such as an IDE interrupt
> > doing a multi-sector PIO transfer _in interrupt context_) can cause this
> > kind of starvation.  That's why Linux 1.x had bottom halves - so that
> > the time consuming work could be moved out of the interrupt handler,
> > thereby causing minimal blockage of other interrupts.
> > 
> > Unfortunately, that kind of design has been long since forgotten.
> > Apparently modern machines are fast enough that it doesn't have to be
> > worried about anymore...  Or are they?
> 
>  I would guess it is not that the machines are fast enough, but that this 
> two-level processing makes things more complicated.  Enough that most 
> people would not bother digging into it unless really forced.  Only 
> occasional latency problems are probably not enough of a force.

It's a shame we don't have a way to measure IRQ latency - it would be
very useful to flag up problems.

I think the best we could do is to arrange for the timer interrupt to
complain if it's delayed by more than 1ms or so - but some architectures
already run their timers with IRQF_DISABLED as a workaround for some of
the latency issues.
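
A crude user-space analogue of that idea, as a sketch only: sleep on a fixed
period and flag wakeups that arrive more than 1 ms late. This measures
scheduling latency as seen by a process, not IRQ latency inside the kernel,
and the function names and the 5 ms period are assumptions made for
illustration.

```python
import time


def late_wakeups(timestamps, period_s, threshold_s=0.001):
    """Return (index, lateness) for intervals that overshoot the period.

    timestamps are monotonic-clock readings taken once per tick.
    """
    late = []
    for i in range(1, len(timestamps)):
        lateness = (timestamps[i] - timestamps[i - 1]) - period_s
        if lateness > threshold_s:
            late.append((i, lateness))
    return late


def sample_ticks(count=200, period_s=0.005):
    """Sleep in a loop and record when each tick actually happened."""
    stamps = [time.monotonic()]
    for _ in range(count):
        time.sleep(period_s)
        stamps.append(time.monotonic())
    return stamps


if __name__ == "__main__":
    period = 0.005
    for index, lateness in late_wakeups(sample_ticks(period_s=period), period):
        print(f"tick {index}: {lateness * 1e6:.0f} us late")
```

Running this while generating disk load may show whether late ticks line up
with the disk activity, though a late wakeup here can have many causes other
than a slow interrupt handler.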

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:


Re: serial flow control appears broken

2007-07-30 Thread Maciej W. Rozycki
On Sat, 28 Jul 2007, Russell King wrote:

> Essentially, any complex interrupt handler (such as an IDE interrupt
> doing a multi-sector PIO transfer _in interrupt context_) can cause this
> kind of starvation.  That's why Linux 1.x had bottom halves - so that
> the time consuming work could be moved out of the interrupt handler,
> thereby causing minimal blockage of other interrupts.
> 
> Unfortunately, that kind of design has been long since forgotten.
> Apparently modern machines are fast enough that it doesn't have to be
> worried about anymore...  Or are they?

 I would guess it is not that the machines are fast enough, but that this 
two-level processing makes things more complicated.  Enough that most 
people would not bother digging into it unless really forced.  Only 
occasional latency problems are probably not enough of a force.

  Maciej


Re: serial flow control appears broken

2007-07-30 Thread Maciej W. Rozycki
On Fri, 27 Jul 2007, Paul Fulghum wrote:

> I can't see anyplace in serial_core.c or 8250.c that sets TTY_OVERRUN.

 Look for UART_LSR_OE in 8250.c -- the serial core accepts any bit that 
has been defined by the low-level driver and sets TTY_OVERRUN in 
uart_insert_char().
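
The overruns counted this way can also be watched from user space. A minimal
sketch, assuming the per-port lines in /proc/tty/driver/serial carry
name:value counters and that an oe:N field appears once overruns have been
recorded. The sample line below is illustrative only and is not taken from
any machine in this thread, and the function name is hypothetical.

```python
import re

# Matches the "name:value" counters on a /proc/tty/driver/serial line.
_COUNTER = re.compile(r"\b(tx|rx|fe|pe|brk|oe):(\d+)")


def parse_serial_counters(line):
    """Return the counters found on one port line as a dict of ints.

    Counters that are absent (error counters that are still zero are
    assumed to be omitted) are reported as 0.
    """
    counters = dict.fromkeys(("tx", "rx", "fe", "pe", "brk", "oe"), 0)
    for name, value in _COUNTER.findall(line):
        counters[name] = int(value)
    return counters


if __name__ == "__main__":
    # Illustrative line only; real values come from /proc/tty/driver/serial.
    sample = "0: uart:16550A port:000003F8 irq:4 tx:1024 rx:90211 oe:17 RTS|CTS|DTR"
    print(parse_serial_counters(sample)["oe"])
```

Comparing the oe count before and after a test send gives a direct count of
UART overruns, independent of what the application sees.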

  Maciej


Re: serial flow control appears broken

2007-07-30 Thread Maciej W. Rozycki
On Fri, 27 Jul 2007, Lee Howard wrote:

> >The serial drivers can do nothing about it -- all they can do is push
> >data upstream, to the discipline driver.  They can provide an interface to
> >hardware flow control features though, if implemented by a given UART.
> >
> 
> Thank you for this clarification.  So I should have more correctly been saying
> that "tty flow control appears broken".  Right?

 Probably.  It might be, as Alan suggested, that it is meant to work, but 
the latency kills it.

  Maciej


Re: serial flow control appears broken

2007-07-30 Thread Maciej W. Rozycki
On Fri, 27 Jul 2007, Robert Hancock wrote:

> > The TTY line discipline driver could do that based on the amount of received
> > data present in its buffer.  And it should if asked to (a brief look at
> > drivers/char/n_tty.c reveals it does; obviously there may be a bug 
> 
> Really, where? In my look through the code I haven't found any mechanism that
> would result in RTS being lowered based on TTY buffers filling up, at least
> not in the 8250 case.

 Look for calls to ->throttle() and ->unthrottle().  XON and XOFF might be 
used instead as a result of these calls though, depending on terminal 
settings.
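
Which of the two mechanisms a throttle would use depends on the termios
flags of the port. A small sketch for checking them from user space, using
Python's standard termios module; the helper name and the device path in the
usage comment are examples, not anything taken from the thread.

```python
import os
import sys
import termios


def flow_control_modes(iflag, cflag):
    """Describe which flow control the termios flags ask for."""
    modes = []
    if cflag & termios.CRTSCTS:
        modes.append("RTS/CTS (hardware)")
    if iflag & termios.IXOFF:
        modes.append("XON/XOFF sent on input (software)")
    if iflag & termios.IXON:
        modes.append("XON/XOFF honoured on output (software)")
    return modes or ["none"]


if __name__ == "__main__":
    # Usage: python flowctl.py /dev/ttyS0   (example device path)
    # O_NONBLOCK avoids waiting for carrier on open.
    fd = os.open(sys.argv[1], os.O_RDONLY | os.O_NOCTTY | os.O_NONBLOCK)
    try:
        iflag, oflag, cflag, lflag, ispeed, ospeed, cc = termios.tcgetattr(fd)
    finally:
        os.close(fd)
    print(", ".join(flow_control_modes(iflag, cflag)))
```

If the output is "none" while the application believes it asked for hardware
flow control, the throttle path has nothing to act on.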

> In this situation, though, it appears it's not the TTY buffers that are
> filling but the UART's own buffer. I would think this must be caused by some
> kind of interrupt latency that results in not draining the FIFO in time.

 Well, the UART only has its FIFO which is rather small, so automatic flow 
control would be useful.  Though, admittedly, tty_insert_flip_char() might 
return some kind of a status related to how much space is left in the 
receive buffer which would indicate that there is a lag in data stream 
processing -- which in turn may relate to the system being loaded, so that 
the receive ISR could decide whether to negate RTS itself for the less 
capable UARTs (i.e. ones with no autoflow and a tiny or no FIFO).

  Maciej


Re: serial flow control appears broken

2007-07-28 Thread Ray Lee
On 7/27/07, Lee Howard <[EMAIL PROTECTED]> wrote:
> Curiously, the session at 38400 bps that skipped 858 bytes... coincided,
> not just in sequence but also in precise timing within the session, with
> a small but noticeable disk load that I caused by grepping through a
> hundred session logs.  (I can't reproduce it easily, though, because of
> disk caching.)

`echo 1 > /proc/sys/vm/drop_caches` will clear out most (all?) of what
the kernel has cached from the drive. It's there just for this kind of
repeatability of tests...
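
For scripted test runs the same step can be wrapped in a helper. A minimal
sketch: the function name is hypothetical, it needs root when pointed at the
real /proc file, and it calls sync first on the assumption that dirty pages
are not dropped until they have been written back.

```python
import os


def drop_page_cache(path="/proc/sys/vm/drop_caches", mode="1"):
    """Flush dirty data, then ask the kernel to drop clean cached pages.

    mode "1" matches the echo in the message above.
    """
    os.sync()  # write dirty pages back first so they become droppable
    with open(path, "w") as f:
        f.write(mode + "\n")


if __name__ == "__main__":
    drop_page_cache()
    # ... start the disk load and the serial transfer here ...
```

Calling this before each session should make the grep hit the disk again
instead of the cache, so the disk load is comparable between runs.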

Ray


Re: serial flow control appears broken

2007-07-28 Thread Lee Howard

Alan Cox wrote:

> > Curiously, the session at 38400 bps that skipped 858 bytes... coincided,
> > not just in sequence but also in precise timing within the session, with
> > a small but noticeable disk load that I caused by grepping through a
> > hundred session logs.  (I can't reproduce it easily, though, because of
> > disk caching.)
>
> Can you send me a dmesg, there are some cases when high disk load can
> cause high interrupt latency in both 2.2 and 2.6 depending upon what is
> configured.

I've attached dmesg output.  The OS version I used yesterday to run
those tests was Debian 4.0r0 (kernel 2.6.18-4-686).  It's still running,
and that is where this dmesg output comes from.

> I don't think that's related to the main problem but it is
> worth knowing about hdparm -u1



# hdparm -u1 /dev/hda

/dev/hda:
 setting unmaskirq to 1 (on)
 unmaskirq    =  1 (on)
#

After doing this I re-ran the 5 test sends at 115200 bps.  The lost-byte
counts were 0, 14, 8, 0, and 3.  Compared with yesterday's 63, 5, 44, 48,
and 2, this may indicate an improvement.  Note also that the 4th session,
where no bytes were lost, still had one element of corrupt data as detected
by the image decoder.
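
For reference, the two sets of counts summarized side by side. This is plain
arithmetic on the numbers reported above; the variable and function names
are made up, and five runs each is a small sample.

```python
from statistics import mean

# Lost-byte counts per session, as reported above.
before_u1 = [63, 5, 44, 48, 2]   # previous day's runs
after_u1 = [0, 14, 8, 0, 3]      # after hdparm -u1 /dev/hda


def summarize(counts):
    """Return (total, mean, worst) for a list of per-session losses."""
    return sum(counts), mean(counts), max(counts)


if __name__ == "__main__":
    print("before:", summarize(before_u1))
    print("after: ", summarize(after_u1))
```

That is 162 bytes lost in total (32.4 per session) before, against 25 in
total (5 per session) after.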


Thanks,

Lee.
Linux version 2.6.18-4-686 (Debian 2.6.18.dfsg.1-12) ([EMAIL PROTECTED]) (gcc 
version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #1 SMP Mon Mar 26 
17:17:36 UTC 2007
BIOS-provided physical RAM map:
 BIOS-e820:  - 000a (usable)
 BIOS-e820: 000f - 0010 (reserved)
 BIOS-e820: 0010 - 0e00 (usable)
 BIOS-e820:  - 0001 (reserved)
0MB HIGHMEM available.
224MB LOWMEM available.
On node 0 totalpages: 57344
  DMA zone: 4096 pages, LIFO batch:0
  Normal zone: 53248 pages, LIFO batch:15
DMI 2.0 present.
ACPI: Unable to locate RSDP
Allocating PCI resources starting at 1000 (gap: 0e00:f1ff)
Detected 400.953 MHz processor.
Built 1 zonelists.  Total pages: 57344
Kernel command line: root=/dev/hda3 ro 
Local APIC disabled by BIOS -- you can enable it with "lapic"
mapped APIC to d000 (011c9000)
Enabling fast FPU save and restore... done.
Initializing CPU#0
PID hash table entries: 1024 (order: 10, 4096 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Memory: 219828k/229376k available (1544k kernel code, 9052k reserved, 577k 
data, 196k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 802.59 BogoMIPS (lpj=1605193)
Security Framework v1.0.0 initialized
SELinux:  Disabled at boot.
Capability LSM initialized
Mount-cache hash table entries: 512
CPU: After generic identify, caps: 0183f9ff     
 
CPU: After vendor identify, caps: 0183f9ff     
 
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 512K
CPU: After all inits, caps: 0183f9ff   0040  
 
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Compat vDSO mapped to e000.
Checking 'hlt' instruction... OK.
SMP alternatives: switching to UP code
Freeing SMP alternatives: 16k freed
CPU0: Intel Pentium II (Deschutes) stepping 03
SMP motherboard not detected.
Local APIC not detected. Using dummy APIC emulation.
Brought up 1 CPUs
migration_cost=0
checking if image is initramfs... it is
Freeing initrd memory: 4375k freed
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xfb4a0, last bus=1
PCI: Using configuration type 1
Setting up standard PCI resources
ACPI: Interpreter disabled.
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI: disabled
PnPBIOS: Scanning system for PnP BIOS support...
PnPBIOS: Found PnP BIOS installation structure at 0xc00fc0f0
PnPBIOS: PnP BIOS version 1.0, entry 0xf:0xc118, dseg 0xf
PnPBIOS: 14 nodes reported by PnP BIOS; 14 recorded by driver
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
PCI: Firmware left :00:0b.0 e100 interrupts enabled, disabling
Boot video device is :01:00.0
PCI: Bridge: :00:01.0
  IO window: c000-cfff
  MEM window: e400-e5ff
  PREFETCH window: e700-e77f
PCI: Setting latency timer of device :00:01.0 to 64
NET: Registered protocol family 2
IP route cache hash table entries: 2048 (order: 1, 8192 bytes)
TCP established hash table entries: 8192 (order: 4, 65536 bytes)
TCP bind hash table entries: 4096 (order: 3, 32768 bytes)
TCP: Hash tables configured (established 8192 bind 4096)
TCP reno registered
audit: initializing netlink socket (disabled)
audit(1185563327.604:1): initialized
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
Initializing Cryptographic API
io scheduler noop registered
io scheduler anticipatory 

Re: serial flow control appears broken

2007-07-28 Thread Alan Cox
> Curiously, the session at 38400 bps that skipped 858 bytes... coincided, 
> not just in sequence but also in precise timing within the session, with 
> a small but noticeable disk load that I caused by grepping through a 
> hundred session logs.  (I can't reproduce it easily, though, because of 
> disk caching.)

Can you send me a dmesg? There are some cases where high disk load can
cause high interrupt latency in both 2.2 and 2.6, depending upon what is
configured. I don't think that's related to the main problem, but it is
worth knowing about hdparm -u1.

> as it leaves the DCE.  I mention this in case there is any limitation to 
> how the 8250 driver performs when two modems are being run simultaneously.

It means more load but that shouldn't matter much, and the transmit side
if under load with asynchronous traffic will not lose bytes sending.

Alan


Re: serial flow control appears broken

2007-07-28 Thread Russell King
On Fri, Jul 27, 2007 at 09:51:25PM -0700, Lee Howard wrote:
> Curiously, the session at 38400 bps that skipped 858 bytes... coincided, 
> not just in sequence but also in precise timing within the session, with 
> a small but noticeable disk load that I caused by grepping through a 
> hundred session logs.  (I can't reproduce it easily, though, because of 
> disk caching.)

If you have other parts of the system which run with IRQs disabled for
a significant time period, then you will get serial corruption.  That's
not the serial driver's fault - that's a problem with the other device
drivers/rest of the system.

You may be able to track down where IRQs are being held off for too long
by hooking into the 8250 interrupt handler, and when an overrun error is
reported, printk a _minimal_ message reporting the instruction pointer
obtained via get_irq_regs().

Note, however, that I don't actively maintain serial anymore.

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:


Re: serial flow control appears broken

2007-07-28 Thread Russell King
On Fri, Jul 27, 2007 at 12:22:57PM -0600, Robert Hancock wrote:
> Maciej W. Rozycki wrote:
> > The TTY line discipline driver could do that based on the amount of 
> >received data present in its buffer.  And it should if asked to (a brief 
> >look at drivers/char/n_tty.c reveals it does; obviously there may be a bug 
> 
> Really, where? In my look through the code I haven't found any mechanism 
> that would result in RTS being lowered based on TTY buffers filling up, 
> at least not in the 8250 case.

That's something for the line discipline to decide.

> In this situation, though, it appears it's not the TTY buffers that are 
> filling but the UART's own buffer. I would think this must be caused by 
> some kind of interrupt latency that results in not draining the FIFO in 
> time.

Correct, and a suggested approach to tracking down the culprit has been
mentioned in a previous email.

Also note that there's nothing the serial driver can do to detect this
condition before it occurs.  The problem occurs because the serial driver
is starved of CPU time due to other parts of the system, and the driver
has precisely zero knowledge as to when that's going to happen.

There are two possible scenarios when such starvation can occur:

1. interrupts are disabled for a long period.
2. the serial interrupt has started to run, but has been interrupted
   by _another_ interrupt which runs for a long period.

Essentially, any complex interrupt handler (such as an IDE interrupt
doing a multi-sector PIO transfer _in interrupt context_) can cause this
kind of starvation.  That's why Linux 1.x had bottom halves - so that
the time consuming work could be moved out of the interrupt handler,
thereby causing minimal blockage of other interrupts.
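
The effect of that split can be put into a toy calculation. This is a sketch
with made-up numbers, not a model of any real driver: each handler is
described by a short acknowledge step and a longer work step, and the
question is how long other interrupts are held off in the worst case.

```python
def worst_block_us(handlers, deferred=False):
    """Longest stretch during which other interrupts are held off.

    handlers: list of (ack_us, work_us), one entry per interrupt source.
    If deferred is True, only the acknowledge step runs in interrupt
    context and the work step is pushed out to a bottom half.
    """
    return max(ack if deferred else ack + work for ack, work in handlers)


if __name__ == "__main__":
    # Illustrative figures only: a disk handler doing about 1 ms of
    # in-handler work, and a serial handler draining a small FIFO.
    handlers = [(5, 1000), (5, 20)]
    print("all work in handler:", worst_block_us(handlers), "us")
    print("work deferred:      ", worst_block_us(handlers, deferred=True), "us")
```

With these assumed figures the worst case drops from 1005 us to 5 us once
the slow part is moved out of interrupt context, which is the point of the
two-level design.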

Unfortunately, that kind of design has been long since forgotten.
Apparently modern machines are fast enough that it doesn't have to be
worried about anymore...  Or are they?

-- 
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:


Re: serial flow control appears broken

2007-07-28 Thread Ray Lee
On 7/27/07, Lee Howard [EMAIL PROTECTED] wrote:
> Curiously, the session at 38400 bps that skipped 858 bytes... coincided,
> not just in sequence but also in precise timing within the session, with
> a small but noticeable disk load that I caused by grepping through a
> hundred session logs.  (I can't reproduce it easily, though, because of
> disk caching.)

`echo 1 > /proc/sys/vm/drop_caches` will clear out most (all?) of what
the kernel has cached from the drive. It's there just for this kind of
repeatability of tests...

Ray


Re: serial flow control appears broken

2007-07-27 Thread Lee Howard

Paul Fulghum wrote:

> So this seems to be a latency issue reading the receive
> FIFO in the ISR. The current rx FIFO trigger level
> should be 8 bytes (UART_FCR_R_TRIG_10) which gives the
> ISR 694usec to get the data at 115200bps.
>
> IIRC, in 2.2.X kernels this defaulted to 4 bytes
> (TRIG_01) which gave a little more time to service the interrupt.
>
> How does the data rate affect the frequency of the overrun errors?
> Does 57600bps make them go away?

The overrun error message does not occur on every instance of data 
corruption.  (I just became aware of this as I've not been paying so 
much attention to the error messages as I have been to the corrupt 
data.)  The data gets far more corrupted than the error messages would 
lead me to believe.  Since the data being sent from the fax modem to the 
host is identical (same image data) every time it's easier for me to 
measure the effect of one bitrate over another by examining the number 
of missing bytes from the data.


The image has a total of 140465 bytes.  Just now I sent it 5 times each 
at 115200, 57600, 38400, and 19200 bps.


At 115200 bps the number of bytes skipped were:  63, 5, 44, 48, and 2.

At 57600 bps the number of bytes skipped were:  0, 1, 13, 9, and 12.

At 38400 bps the number of bytes skipped were 858, 0, 0, 0, and 8.

At 19200 bps the number of bytes skipped were 0, 0, 0, 0, and 0.
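For comparison, the per-rate totals can be tallied with a few lines of C. This is only a sketch: the function names `total_skipped` and `loss_fraction` are made up for illustration, and the inputs simply restate the counts listed above.

```c
#include <stddef.h>

/* Sum the bytes skipped across the sessions run at one bit rate. */
unsigned total_skipped(const unsigned *skipped, size_t sessions)
{
    unsigned total = 0;
    for (size_t i = 0; i < sessions; i++)
        total += skipped[i];
    return total;
}

/* Fraction of the data lost, given the size of the image sent per session. */
double loss_fraction(unsigned total, size_t sessions, unsigned image_bytes)
{
    return (double)total / ((double)sessions * image_bytes);
}
```

With the counts above, `total_skipped` gives 162 bytes at 115200 bps, 35 at 57600 bps, 866 at 38400 bps, and 0 at 19200 bps, out of 5 x 140465 = 702325 bytes sent at each rate.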

Curiously, the session at 38400 bps that skipped 858 bytes... coincided, 
not just in sequence but also in precise timing within the session, with 
a small but noticeable disk load that I caused by grepping through a 
hundred session logs.  (I can't reproduce it easily, though, because of 
disk caching.)


And, perhaps this is relevant... the way that I have the fax modem 
sending the data to the host is by receiving it from another fax modem 
which is sending it.  Thus, the modem on ttyS0 is sending a fax to the 
modem on ttyS1.  Due to the error correction protocol that is performed 
between the two fax endpoints I can guarantee that the data is correct 
as it leaves the DCE.  I mention this in case there is any limitation to 
how the 8250 driver performs when two modems are being run simultaneously.


Thanks,

Lee.



Re: serial flow control appears broken

2007-07-27 Thread Paul Fulghum
On Fri, 2007-07-27 at 13:48 -0700, Lee Howard wrote:
> Here's the output:
> 
> type: 4
> line: 1
> line: 760
>  irq: 3
>flags: 1358954688
>   xmit_fifo_size: 16
>   custom_divisor: 0
>baud_base: 115200

OK, the FIFO should be enabled.

What is known:

* The error is a hardware FIFO overrun.
  - observed message is in n_tty due to driver setting TTY_OVERRUN

* The RTS/CTS flow control is not involved
  - this is done only by the ldisc in response to buffer levels
  - you verified crtscts is set
  - you did not observe RTS change when 'overflow error' logged
  - you did observe RTS change when application stopped reading

So this seems to be a latency issue reading the receive
FIFO in the ISR. The current rx FIFO trigger level
should be 8 bytes (UART_FCR_R_TRIG_10) which gives the
ISR 694usec to get the data at 115200bps.

IIRC, in 2.2.X kernels this defaulted to 4 bytes
(TRIG_01) which gave a little more time to service the interrupt.
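The arithmetic behind the 694usec figure can be sketched as follows. This assumes a 16-byte receive FIFO (as on a 16550A) and 10 bits on the wire per character (8N1: start bit, 8 data bits, stop bit); the function name `rx_fifo_slack_usec` is hypothetical and not part of any driver.

```c
/*
 * Time in microseconds between the receive interrupt firing at the
 * trigger level and the FIFO becoming full.
 * Assumes 10 bits per character on the wire (8N1).
 */
double rx_fifo_slack_usec(unsigned baud, unsigned fifo_depth, unsigned trigger_level)
{
    const double bits_per_char = 10.0;
    unsigned chars_left = fifo_depth - trigger_level;

    return chars_left * bits_per_char * 1e6 / baud;
}
```

Under these assumptions, `rx_fifo_slack_usec(115200, 16, 8)` is about 694.4, a trigger level of 4 at the same rate leaves about 1041.7, and a trigger level of 8 at 57600bps leaves about 1388.9.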

How does the data rate affect the frequency of the overrun errors?
Does 57600bps make them go away?

--
Paul






Re: serial flow control appears broken

2007-07-27 Thread Lee Howard

Paul Fulghum wrote:

> Tilman Schmidt wrote:
>
>> Could this be related?
>>
>> http://lkml.org/lkml/2007/7/18/245
>>
>> Quote:
>> "I've recently found (using 2.6.21.4) that configuring a serial ports
>> (ST16654) which use the 8250 driver using setserial results in the
>> UART's FIFOs being disabled (unless you specify autoconfig)."
>
> That would make sense.
>
> Lee's error is a hardware FIFO overrun which could occur
> if the FIFO is being disabled as described in your
> link (by trying to set the uart type with setserial).


I'm not using setserial on this port, myself.  If something in init is 
calling on setserial then I don't know about it.


That said, tests on the serial port from within the application show 
that xmit_fifo_size is set to 16 as it should be.


I wrote up a little test app:

   /* needs <stdio.h>, <sys/ioctl.h> and <linux/serial.h> */
   struct serial_struct serial;
   ioctl(modemFd, TIOCGSERIAL, &serial);
   printf("type: %d\n", serial.type);
   printf("line: %d\n", serial.line);
   printf("line: %u\n", serial.port);   /* I/O port, printed under the label "line" */
   printf(" irq: %d\n", serial.irq);
   printf("   flags: %d\n", serial.flags);
   printf("  xmit_fifo_size: %d\n", serial.xmit_fifo_size);
   printf("  custom_divisor: %d\n", serial.custom_divisor);
   printf("   baud_base: %d\n", serial.baud_base);
   printf(" close_delay: %u\n", serial.close_delay);
   printf(" io_type: 0x%X\n", serial.io_type);
   printf("reserved_char[0]: 0x%X\n", serial.reserved_char[0]);
   printf("hub6: %d\n", serial.hub6);
   printf("closing_wait: %u\n", serial.closing_wait);
   printf("   closing_wait2: %u\n", serial.closing_wait2);
   printf(" iomem_reg_shift: %u\n", serial.iomem_reg_shift);
   printf("   port_high: %u\n", serial.port_high);
   printf(" reserved[0]: %d\n", serial.reserved[0]);

Here's the output:

   type: 4
   line: 1
   line: 760
irq: 3
  flags: 1358954688
 xmit_fifo_size: 16
 custom_divisor: 0
  baud_base: 115200
close_delay: 500
io_type: 0x0
reserved_char[0]: 0x0
   hub6: 0
   closing_wait: 3
  closing_wait2: 0
iomem_reg_shift: 0
  port_high: 0
reserved[0]: 0

Thanks,

Lee.


Re: serial flow control appears broken

2007-07-27 Thread Lee Howard

Tilman Schmidt wrote:

> Lee Howard wrote:
>
>> So, does this explain why I wouldn't have a problem at 115200 bps with 
>> kernel 2.2.5 but why I do with 2.6.5 and 2.6.18?  Both hardware and 
>> software flow control work fine with 2.2.5 (meaning I don't see any 
>> error message and I don't have any data corruption), but neither works 
>> to avoid the "kernel: ttyS1: 1 input overrun(s)" and consequent data 
>> corruption issue in 2.6.5 nor 2.6.18.
>>
>> Was there some associated application change in tty handling that needed 
>> to occur between the 2.2 and 2.6 kernels to properly implement flow control?
>
> Could this be related?
>
> http://lkml.org/lkml/2007/7/18/245
>
> Quote:
> "I've recently found (using 2.6.21.4) that configuring a serial ports
> (ST16654) which use the 8250 driver using setserial results in the
> UART's FIFOs being disabled (unless you specify autoconfig)."



I'm not running setserial on the port, myself.  But to test to see if it 
is related, I included this code in the application:


#include <linux/serial.h>

struct serial_struct serial;
ioctl(modemFd, TIOCGSERIAL, &serial);
traceModemOp("modem xmit_fifo_size: %u", serial.xmit_fifo_size);

And I get this resulting logging:

"MODEM modem xmit_fifo_size: 16"

So it's clear from here that the xmit_fifo_size is set correctly on this 
system.


Thanks,

Lee.



Re: serial flow control appears broken

2007-07-27 Thread Paul Fulghum

Tilman Schmidt wrote:

> Could this be related?
>
> http://lkml.org/lkml/2007/7/18/245
>
> Quote:
> "I've recently found (using 2.6.21.4) that configuring a serial ports
> (ST16654) which use the 8250 driver using setserial results in the
> UART's FIFOs being disabled (unless you specify autoconfig)."


That would make sense.

Lee's error is a hardware FIFO overrun which could occur
if the FIFO is being disabled as described in your
link (by trying to set the uart type with setserial).

Since the tty flow control is only triggered
by the line discipline in response to ldisc
buffer levels and not hardware FIFO overruns,
you would never see any flow control action
as reported by Lee.


--
Paul Fulghum
Microgate Systems, Ltd.


Re: serial flow control appears broken

2007-07-27 Thread Paul Fulghum
OK, I see where TTY_OVERRUN is set:
include/linux/serial_core.h:uart_insert_char()


--
Paul Fulghum
Microgate Systems, Ltd



Re: serial flow control appears broken

2007-07-27 Thread Paul Fulghum
On Fri, 2007-07-27 at 12:22 -0600, Robert Hancock wrote:

> In this situation, though, it appears it's not the TTY buffers that are 
> filling but the UART's own buffer. I would think this must be caused by 
> some kind of interrupt latency that results in not draining the FIFO in 
> time.

You are right, this error is output when the character flag TTY_OVERRUN
is encountered by n_tty.c which should be set by the driver
in response to a hardware FIFO overrun (not an ldisc buffer overrun).

I can't see anyplace in serial_core.c or 8250.c that sets TTY_OVERRUN.

 
--
Paul Fulghum
Microgate Systems, Ltd



Re: serial flow control appears broken

2007-07-27 Thread Paul Fulghum
On Fri, 2007-07-27 at 12:22 -0600, Robert Hancock wrote:
> Maciej W. Rozycki wrote:
> >  The TTY line discipline driver could do that based on the amount of 
> > received data present in its buffer.  And it should if asked to (a brief 
> > look at drivers/char/n_tty.c reveals it does; obviously there may be a bug 
> 
> Really, where? In my look through the code I haven't found any mechanism 
> that would result in RTS being lowered based on TTY buffers filling up, 
> at least not in the 8250 case.

serial_core.c:uart_throttle()
serial_core.c:uart_unthrottle()

These are called by N_TTY in response to buffer levels.
 
--
Paul Fulghum
Microgate Systems, Ltd



Re: serial flow control appears broken

2007-07-27 Thread Robert Hancock

Maciej W. Rozycki wrote:

> On Fri, 27 Jul 2007, Lee Howard wrote:
>
>> Okay, so let's say we've got a loop around a blocking read on the modem file
>> descriptor...
>>
>>  for (;;) {
>>  read some data from modem
>>  process data from modem
>>  if (end-of-data detected) break;
>>  }
>>
>> Are you suggesting that the application should be using deasserting RTS after
>> the read and asserting it before?
>
>  It certainly could -- you were asking how it would know. ;-)
>
>> I had previously thought that the control of RTS was something that the
>> serial/tty driver was supposed to do independently based on the buffer fill.
>
>  The TTY line discipline driver could do that based on the amount of
> received data present in its buffer.  And it should if asked to (a brief
> look at drivers/char/n_tty.c reveals it does; obviously there may be a bug

Really, where? In my look through the code I haven't found any mechanism 
that would result in RTS being lowered based on TTY buffers filling up, 
at least not in the 8250 case.

In this situation, though, it appears it's not the TTY buffers that are 
filling but the UART's own buffer. I would think this must be caused by 
some kind of interrupt latency that results in not draining the FIFO in 
time.

> somewhere though).  So could e.g. the SLIP and PPP line discipline
> drivers, though the criteria might be different (apparently they do not,
> which is a shame).
>
>  The serial drivers have nothing to do about it -- all they can do is
> pushing data upstream, to the discipline driver.  They can provide an
> interface to hardware flow control features though, if implemented by a
> given UART.
>
>   Maciej




Re: serial flow control appears broken

2007-07-27 Thread Lee Howard

Maciej W. Rozycki wrote:

> The TTY line discipline driver could do that based on the amount of
> received data present in its buffer.  And it should if asked to (a brief
> look at drivers/char/n_tty.c reveals it does; obviously there may be a bug
> somewhere though).  So could e.g. the SLIP and PPP line discipline
> drivers, though the criteria might be different (apparently they do not,
> which is a shame).
>
> The serial drivers have nothing to do about it -- all they can do is
> pushing data upstream, to the discipline driver.  They can provide an
> interface to hardware flow control features though, if implemented by a
> given UART.




Thank you for this clarification.  So I should have more correctly been 
saying that "tty flow control appears broken".  Right?


I've asked the manufacturer to take a look at drivers/char/n_tty.c to 
see if they can't see anything obvious.


Thanks,

Lee.


Re: serial flow control appears broken

2007-07-27 Thread Lee Howard

Alan Cox wrote:

> As the flow control is driven by software on most 16x50 chips (there are
> a couple of exceptions) if we fail to empty the fifo fast enough then any
> flow control will be asserted too late to save the day.
>
> If you stop the application and do the following
>
> cat /dev/ttywhatever
> ^Z
> [stopped]
>
> (so you are asking the OS to buffer data but not ever reading it)
>
> and then fire data at it does the flow control eventually occur ?



Yes it does appear to.  I told the application to simply sleep(300) at 
the appropriate moment, and I watched the application and when it began 
the sleep I ran:


 cat /dev/ttyS1
 (lots of "garbage" began spewing forth)
 ^Z
 (about 2 or 3 seconds and the RTS light goes dark)
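Rather than watching the LED, the state of RTS can also be read back from software. A minimal sketch using the `TIOCMGET` ioctl follows; the helper name `rts_is_asserted` is hypothetical, and `fd` is assumed to be an already-open descriptor for the serial port.

```c
#include <sys/ioctl.h>

/* Returns 1 if RTS is asserted, 0 if it is not, -1 if the ioctl fails. */
int rts_is_asserted(int fd)
{
    int lines = 0;

    /* TIOCMGET fills in the modem-control line bits (TIOCM_RTS, TIOCM_CTS, ...). */
    if (ioctl(fd, TIOCMGET, &lines) < 0)
        return -1;
    return (lines & TIOCM_RTS) ? 1 : 0;
}
```

Polling this from a second process while the reader is stopped should show the same transition as the light, though it reports the line state as the driver sees it, not what is measured on the wire.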

Thanks,

Lee.



Re: serial flow control appears broken

2007-07-27 Thread Maciej W. Rozycki
On Fri, 27 Jul 2007, Lee Howard wrote:

> Okay, so let's say we've got a loop around a blocking read on the modem file
> descriptor...
> 
>  for (;;) {
>  read some data from modem
>  process data from modem
>  if (end-of-data detected) break;
>  }
> 
> Are you suggesting that the application should be using deasserting RTS after
> the read and asserting it before?

 It certainly could -- you were asking how it would know. ;-)

> I had previously thought that the control of RTS was something that the
> serial/tty driver was supposed to do independently based on the buffer fill.

 The TTY line discipline driver could do that based on the amount of 
received data present in its buffer.  And it should if asked to (a brief 
look at drivers/char/n_tty.c reveals it does; obviously there may be a bug 
somewhere though).  So could e.g. the SLIP and PPP line discipline 
drivers, though the criteria might be different (apparently they do not, 
which is a shame).

 The serial drivers have nothing to do about it -- all they can do is 
pushing data upstream, to the discipline driver.  They can provide an 
interface to hardware flow control features though, if implemented by a 
given UART.

  Maciej


Re: serial flow control appears broken

2007-07-27 Thread Alan Cox
> I had previously thought that the control of RTS was something that the 
> serial/tty driver was supposed to do independently based on the buffer 
> fill.  Was I wrong?

If the kernel is asked to do CRTSCTS then the kernel handles the flow
control. It uses it when the internal buffers are nearly full.
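Asking the kernel for CRTSCTS from an application might look like the sketch below. The helper name `enable_rtscts` is hypothetical; `fd` is assumed to be an open serial port, and a real application would normally set speed, character size and the other termios fields in the same call.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* CRTSCTS is not in POSIX; glibc may hide it otherwise */
#endif
#include <termios.h>

/* Turn on RTS/CTS hardware flow control; returns 0 on success, -1 on error. */
int enable_rtscts(int fd)
{
    struct termios tio;

    if (tcgetattr(fd, &tio) < 0)
        return -1;
    tio.c_cflag |= CRTSCTS;
    return tcsetattr(fd, TCSANOW, &tio);
}
```

The result can be checked with `stty -F /dev/ttyS1 -a`, which lists `crtscts` without a leading minus sign when the flag is set.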

The direct access to the lines is normally only used by special drivers
such as half duplex radio modem drivers.


Re: serial flow control appears broken

2007-07-27 Thread Lee Howard

Maciej W. Rozycki wrote:

> On Thu, 26 Jul 2007, Lee Howard wrote:
>
>> If the application were to use TIOCM_RTS how would it know when to apply it or
>> not?  Is there some approach that the application could take to manage flow
>> control on the serial port?  What about software flow control?  Does the
>
>  Well, an application could negate RTS when it receives a character and
> is running out of resources for further processing of incoming data.
>
>  Smarter UARTs may be able to negate RTS themselves based on the amount of
> data in their receive FIFO.  The threshold may be configurable.




Okay, so let's say we've got a loop around a blocking read on the modem 
file descriptor...


 for (;;) {
 read some data from modem
 process data from modem
 if (end-of-data detected) break;
 }
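As a concrete version of the loop above, here is a minimal C sketch. The name `read_until_marker` and the single-byte end marker are placeholders, not what the application actually uses. While the "process" step runs, nothing is reading the descriptor, so incoming data has to wait in the kernel's buffers.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Blocking read loop: consume bytes from fd until end_marker is seen
 * or EOF is reached. Returns the number of bytes consumed (including
 * the marker), or -1 on a read error. Bytes that arrive after the
 * marker in the same read() are discarded in this sketch.
 */
long read_until_marker(int fd, unsigned char end_marker)
{
    unsigned char buf[256];
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);   /* read some data */
        if (n < 0) {
            if (errno == EINTR)
                continue;                        /* interrupted, retry */
            return -1;
        }
        if (n == 0)
            return total;                        /* EOF */
        for (ssize_t i = 0; i < n; i++) {        /* process the data */
            total++;
            if (buf[i] == end_marker)            /* end-of-data detected */
                return total;
        }
    }
}
```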

Are you suggesting that the application should be deasserting RTS 
after the read and asserting it before?


I had previously thought that the control of RTS was something that the 
serial/tty driver was supposed to do independently based on the buffer 
fill.  Was I wrong?


Thanks,

Lee.


Re: serial flow control appears broken

2007-07-27 Thread Tilman Schmidt
Lee Howard wrote:
> 
> So, does this explain why I wouldn't have a problem at 115200 bps with 
> kernel 2.2.5 but why I do with 2.6.5 and 2.6.18?  Both hardware and 
> software flow control work fine with 2.2.5 (meaning I don't see any 
> error message and I don't have any data corruption), but neither works 
> to avoid the "kernel: ttyS1: 1 input overrun(s)" and consequent data 
> corruption issue in 2.6.5 nor 2.6.18.
> 
> Was there some associated application change in tty handling that needed 
> to occur between the 2.2 and 2.6 kernels to properly implement flow control?

Could this be related?

http://lkml.org/lkml/2007/7/18/245

Quote:
"I've recently found (using 2.6.21.4) that configuring a serial ports
(ST16654) which use the 8250 driver using setserial results in the
UART's FIFOs being disabled (unless you specify autoconfig)."

-- 
Tilman Schmidt                    E-Mail: [EMAIL PROTECTED]
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)


Re: serial flow control appears broken

2007-07-27 Thread Alan Cox
> -parenb -parodd cs8 -hupcl -cstopb cread clocal crtscts

Ok so crtscts is set, but you have clocal set too. That shouldn't matter

> Using software flow control this is what stty tells me about the port 
> set up done by the application:

This also looks fine

> They seem correct to me, but I am certainly willing to be wrong.

clocal set as well is unusual but if I remember the spec right then
clocal would not interfere with rts/cts handshake and certainly not with
xon/xoff

Looks correct, two boards so it's unlikely both didn't wire it.

> A quick google on "input overrun(s)" may lend some credence (although, 
> certainly this is not in any way conclusive) that I'm not the only one 
> who may be seeking a solution on this matter.
> 
>   http://www.google.com/search?hl=en=%2B%22input+overrun%28s%29%22

Those look different on the whole - there are two reasons you'll get an
input overrun with a 16x50 UART. The first is because we ran out of
buffers to empty the chip, in which case we would have asserted flow
control in software. The second is if we cannot keep up and fail to empty
the on chip FIFO within the required time (about 1 ms)

As the flow control is driven by software on most 16x50 chips (there are
a couple of exceptions) if we fail to empty the fifo fast enough then any
flow control will be asserted too late to save the day.

If you stop the application and do the following

cat /dev/ttywhatever
^Z
[stopped]

(so you are asking the OS to buffer data but not ever reading it)

and then fire data at it does the flow control eventually occur ?

Alan


Re: serial flow control appears broken

2007-07-27 Thread Maciej W. Rozycki
On Thu, 26 Jul 2007, Lee Howard wrote:

> If the application were to use TIOCM_RTS how would it know when to apply it or
> not?  Is there some approach that the application could take to manage flow
> control on the serial port?  What about software flow control?  Does the

 Well, an application could negate RTS when it receives a character and 
is running out of resources for further processing of incoming data.
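Negating and re-asserting RTS from an application could be done along these lines. This is a sketch only: the helper name `set_rts` is hypothetical, and `fd` is assumed to be an open serial port. If CRTSCTS is also enabled, the kernel drives RTS too, so the two may interfere.

```c
#include <sys/ioctl.h>

/*
 * Assert RTS when on is non-zero, negate it when on is zero.
 * Returns 0 on success, -1 if the ioctl fails.
 */
int set_rts(int fd, int on)
{
    int bit = TIOCM_RTS;

    /* TIOCMBIS sets the given modem-control bits, TIOCMBIC clears them. */
    return ioctl(fd, on ? TIOCMBIS : TIOCMBIC, &bit);
}
```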

 Smarter UARTs may be able to negate RTS themselves based on the amount of 
data in their receive FIFO.  The threshold may be configurable.

  Maciej


Re: serial flow control appears broken

2007-07-27 Thread Lee Howard

Alan Cox wrote:

>> The manufacturer is using a scope to look for RTS and they're not seeing
>> it, either.  I just use my eyes to look at the LED, but I can see the
>> CTS, DTR, DCD, RD, and TD lights blink, flicker, or dim... (and TD, RD,
>> and CTS tend to go on and off rather quickly).
>
> And you have
>
> 1.  The port set up correctly for flow control options in the
> kernel ?



I suppose that you mean that the application has properly set up the 
port using termios/tcsetattr/ioctl and the like... rather than if the 
kernel build/config options were set to permit flow control (I know of 
no relevant flow-control-enabling kernel build options).  Using hardware 
flow control this is what stty tells me about the port set up done by 
the application:


# stty -F /dev/ttyS1 -a
speed 115200 baud; rows 0; columns 0; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>; 
eol2 = <undef>; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R; werase = 
^W; lnext = ^V; flush = ^O;

min = 1; time = 0;
-parenb -parodd cs8 -hupcl -cstopb cread clocal crtscts
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr -icrnl 
-ixon -ixoff -iuclc -ixany -imaxbel
-opost -olcuc -ocrnl -onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 
bs0 vt0 ff0
-isig -icanon -iexten -echo -echoe -echok -echonl -noflsh -xcase -tostop 
-echoprt -echoctl -echoke

#

Using software flow control this is what stty tells me about the port 
set up done by the application:


# stty -F /dev/ttyS1 -a
speed 115200 baud; rows 0; columns 0; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>; 
eol2 = <undef>; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R; werase = 
^W; lnext = ^V; flush = ^O;

min = 1; time = 0;
-parenb -parodd cs8 -hupcl -cstopb cread clocal -crtscts
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr -icrnl ixon 
ixoff -iuclc -ixany -imaxbel
-opost -olcuc -ocrnl -onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 
bs0 vt0 ff0
-isig -icanon -iexten -echo -echoe -echok -echonl -noflsh -xcase -tostop 
-echoprt -echoctl -echoke

#

They seem correct to me, but I am certainly willing to be wrong.


> 2.  Verified that the board vendor remembered to wire it ?



I don't know how to verify directly that the board manufacturer wired 
the serial port correctly.  I've tested this on two different 
motherboards made several years apart (but, yes, both were made by the 
same manufacturer).  However, when using RedHat 6.0 (kernel 2.2.5) I 
have no problems with data corruption occurring in the data coming from 
the DCE.  So that tells me that *something* was working before that 
isn't working now... and I'm trying to determine what the difference 
is... whether it be a problem in modern kernels or whether it be 
something that the application (HylaFAX) is not doing to accommodate 
whatever changes occurred in modern kernels.


A quick google on "input overrun(s)" may lend some credence (although, 
certainly this is not in any way conclusive) that I'm not the only one 
who may be seeking a solution on this matter.


 http://www.google.com/search?hl=en&q=%2B%22input+overrun%28s%29%22

Thanks,

Lee.



Re: serial flow control appears broken

2007-07-27 Thread Lee Howard

Alan Cox wrote:

The manufacturer is using a scope to look for RTS and they're not seeing 
it, either.  I just use my eyes to look at the LED, but I can see the 
CTS, DTR, DCD, RD, and TD lights blink, flicker, or dim... (and TD, RD, 
and CTS tend to go on and off rather quickly).
   



And you have

1.  The port set up correctly for flow control options in the
kernel ?
 



I suppose that you mean that the application has properly set up the 
port using termios/tcsetattr/ioctl and the like... rather than if the 
kernel build/config options were set to permit flow control (I know of 
no relevant flow-control-enabling kernel build options).  Using hardware 
flow control this is what stty tells me about the port set up done by 
the application:


# stty -F /dev/ttyS1 -a
speed 115200 baud; rows 0; columns 0; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = undef; 
eol2 = undef; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R; werase = 
^W; lnext = ^V; flush = ^O;

min = 1; time = 0;
-parenb -parodd cs8 -hupcl -cstopb cread clocal crtscts
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr -icrnl 
-ixon -ixoff -iuclc -ixany -imaxbel
-opost -olcuc -ocrnl -onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 
bs0 vt0 ff0
-isig -icanon -iexten -echo -echoe -echok -echonl -noflsh -xcase -tostop 
-echoprt -echoctl -echoke

#

Using software flow control this is what stty tells me about the port 
set up done by the application:


# stty -F /dev/ttyS1 -a
speed 115200 baud; rows 0; columns 0; line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = undef; 
eol2 = undef; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R; werase = 
^W; lnext = ^V; flush = ^O;

min = 1; time = 0;
-parenb -parodd cs8 -hupcl -cstopb cread clocal -crtscts
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr -icrnl ixon 
ixoff -iuclc -ixany -imaxbel
-opost -olcuc -ocrnl -onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 
bs0 vt0 ff0
-isig -icanon -iexten -echo -echoe -echok -echonl -noflsh -xcase -tostop 
-echoprt -echoctl -echoke

#

They seem correct to me, but I am certainly willing to be wrong.


2.  Verified that the board vendor remembered to wire it ?



I don't know how to verify directly that the board manufacturer wired 
the serial port correctly.  I've tested this on two different 
motherboards made several years apart (but, yes, both were made by the 
same manufacturer).  However, when using RedHat 6.0 (kernel 2.2.5) I 
have no problems with data corruption occurring in the data coming from 
the DCE.  So that tells me that *something* was working before that 
isn't working now... and I'm trying to determine what the difference 
is... whether it be a problem in modern kernels or whether it be 
something that the application (HylaFAX) is not doing to accomodate 
whatever changes occurred in modern kernels.


A quick google on input overrun(s) may lend some credence (although, 
certainly this is not in any way conclusive) that I'm not the only one 
who may be seeking a solution on this matter.


 http://www.google.com/search?hl=en&q=%2B%22input+overrun%28s%29%22

Thanks,

Lee.



Re: serial flow control appears broken

2007-07-27 Thread Alan Cox
> -parenb -parodd cs8 -hupcl -cstopb cread clocal crtscts

Ok so crtscts is set, but you have clocal set too. That shouldn't matter

> Using software flow control this is what stty tells me about the port 
> set up done by the application:

This also looks fine

> They seem correct to me, but I am certainly willing to be wrong.

clocal set as well is unusual but if I remember the spec right then
clocal would not interfere with rts/cts handshake and certainly not with
xon/xoff

Looks correct; with two boards it's unlikely both didn't wire it.

> A quick google on input overrun(s) may lend some credence (although, 
> certainly this is not in any way conclusive) that I'm not the only one 
> who may be seeking a solution on this matter.
> 
>   http://www.google.com/search?hl=en&q=%2B%22input+overrun%28s%29%22

Those look different on the whole - there are two reasons you'll get an
input overrun with a 16x50 UART. The first is because we ran out of
buffers to empty the chip, in which case we would have asserted flow
control in software. The second is if we cannot keep up and fail to empty
the on-chip FIFO within the required time (about 1 ms).

As the flow control is driven by software on most 16x50 chips (there are
a couple of exceptions) if we fail to empty the fifo fast enough then any
flow control will be asserted too late to save the day.

If you stop the application and do the following

cat /dev/ttywhatever
^Z
[stopped]

(so you are asking the OS to buffer data but not ever reading it)

and then fire data at it does the flow control eventually occur ?

Alan


Re: serial flow control appears broken

2007-07-27 Thread Maciej W. Rozycki
On Thu, 26 Jul 2007, Lee Howard wrote:

> If the application were to use TIOCM_RTS how would it know when to apply it or
> not?  Is there some approach that the application could take to manage flow
> control on the serial port?  What about software flow control?  Does the

 Well, an application could negate RTS when it receives a character and 
is running out of resources for further processing of incoming data.

 Smarter UARTs may be able to negate RTS themselves based on the amount of 
data in their receive FIFO.  The threshold may be configurable.
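
As a rough illustration of what application-driven RTS could look like on Linux, the modem-control ioctls TIOCMBIS and TIOCMBIC set and clear individual lines. This is a sketch only: rts_request() and set_rts() are hypothetical helper names, and how such calls interact with a port that also has crtscts set is not something this thread establishes.

```c
#include <sys/ioctl.h>

/* TIOCMBIS sets the given modem-control bits, TIOCMBIC clears them. */
static unsigned long rts_request(int assert_rts)
{
    return assert_rts ? TIOCMBIS : TIOCMBIC;
}

/* Assert (1) or negate (0) RTS on an open serial fd.
 * Returns the ioctl() result: 0 on success, -1 on error. */
static int set_rts(int fd, int assert_rts)
{
    int bits = TIOCM_RTS;
    return ioctl(fd, rts_request(assert_rts), &bits);
}
```

An application that is running short of resources would call set_rts(fd, 0), and set_rts(fd, 1) once it can accept data again.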

  Maciej


Re: serial flow control appears broken

2007-07-27 Thread Tilman Schmidt
Lee Howard wrote:
> 
> So, does this explain why I wouldn't have a problem at 115200 bps with 
> kernel 2.2.5 but why I do with 2.6.5 and 2.6.18?  Both hardware and 
> software flow control work fine with 2.2.5 (meaning I don't see any 
> error message and I don't have any data corruption), but neither works 
> to avoid the "kernel: ttyS1: 1 input overrun(s)" and consequent data 
> corruption issue in 2.6.5 nor 2.6.18.
> 
> Was there some associated application change in tty handling that needed 
> to occur between the 2.2 and 2.6 kernels to properly implement flow control?

Could this be related?

http://lkml.org/lkml/2007/7/18/245

Quote:
I've recently found (using 2.6.21.4) that configuring a serial ports
(ST16654) which use the 8250 driver using setserial results in the
UART's FIFOs being disabled (unless you specify autoconfig).

-- 
Tilman Schmidt    E-Mail: [EMAIL PROTECTED]
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)





Re: serial flow control appears broken

2007-07-27 Thread Lee Howard

Maciej W. Rozycki wrote:


On Thu, 26 Jul 2007, Lee Howard wrote:

 


If the application were to use TIOCM_RTS how would it know when to apply it or
not?  Is there some approach that the application could take to manage flow
control on the serial port?  What about software flow control?  Does the
   



Well, an application could negate RTS when it receives a character and 
is running out of resources for further processing of incoming data.


Smarter UARTs may be able to negate RTS themselves based on the amount of 
data in their receive FIFO.  The threshold may be configurable.




Okay, so let's say we've got a loop around a blocking read on the modem 
file descriptor...


  for (;;) {
      read some data from modem
      process data from modem
      if (end-of-data detected) break;
  }

Are you suggesting that the application should be deasserting RTS 
after the read and asserting it before?


I had previously thought that the control of RTS was something that the 
serial/tty driver was supposed to do independently based on the buffer 
fill.  Was I wrong?


Thanks,

Lee.


Re: serial flow control appears broken

2007-07-27 Thread Alan Cox
> I had previously thought that the control of RTS was something that the 
> serial/tty driver was supposed to do independently based on the buffer 
> fill.  Was I wrong?

If the kernel is asked to do CRTSCTS then the kernel handles the flow
control. It uses it when the internal buffers are nearly full.
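
A minimal sketch of how an application asks the kernel for one flow-control mode or the other through termios follows. The enum and function names are hypothetical; the termios flags themselves (CRTSCTS, IXON, IXOFF, IXANY) are the standard ones shown in the stty output earlier in the thread.

```c
#define _DEFAULT_SOURCE   /* exposes CRTSCTS in glibc's <termios.h> */
#include <termios.h>

enum flow_mode { FLOW_NONE, FLOW_RTSCTS, FLOW_XONXOFF };

/* Clear every flow-control bit, then set only the requested mode. */
static void apply_flow_mode(struct termios *t, enum flow_mode mode)
{
    t->c_cflag &= ~CRTSCTS;
    t->c_iflag &= ~(IXON | IXOFF | IXANY);
    if (mode == FLOW_RTSCTS)
        t->c_cflag |= CRTSCTS;          /* kernel drives RTS, honors CTS */
    else if (mode == FLOW_XONXOFF)
        t->c_iflag |= IXON | IXOFF;     /* kernel sends and honors DC1/DC3 */
}

/* Read, modify, and write back the settings of an open tty.
 * Returns 0 on success, -1 if tcgetattr() or tcsetattr() fails. */
static int set_flow_mode(int fd, enum flow_mode mode)
{
    struct termios t;
    if (tcgetattr(fd, &t) < 0)
        return -1;
    apply_flow_mode(&t, mode);
    return tcsetattr(fd, TCSANOW, &t);
}
```

Once the mode is set this way, the application does not touch RTS or send XOFF itself; the kernel is expected to do so when its buffers are nearly full.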

The direct access to the lines is normally only used by special drivers
such as half duplex radio modem drivers.


Re: serial flow control appears broken

2007-07-27 Thread Maciej W. Rozycki
On Fri, 27 Jul 2007, Lee Howard wrote:

> Okay, so let's say we've got a loop around a blocking read on the modem file
> descriptor...
> 
>   for (;;) {
>       read some data from modem
>       process data from modem
>       if (end-of-data detected) break;
>   }
> 
> Are you suggesting that the application should be deasserting RTS after
> the read and asserting it before?

 It certainly could -- you were asking how it would know. ;-)

> I had previously thought that the control of RTS was something that the
> serial/tty driver was supposed to do independently based on the buffer fill.

 The TTY line discipline driver could do that based on the amount of 
received data present in its buffer.  And it should if asked to (a brief 
look at drivers/char/n_tty.c reveals it does; obviously there may be a bug 
somewhere though).  So could e.g. the SLIP and PPP line discipline 
drivers, though the criteria might be different (apparently they do not, 
which is a shame).

 The serial drivers have nothing to do about it -- all they can do is 
pushing data upstream, to the discipline driver.  They can provide an 
interface to hardware flow control features though, if implemented by a 
given UART.

  Maciej


Re: serial flow control appears broken

2007-07-27 Thread Robert Hancock

Maciej W. Rozycki wrote:

On Fri, 27 Jul 2007, Lee Howard wrote:


Okay, so let's say we've got a loop around a blocking read on the modem file
descriptor...

 for (;;) {
 read some data from modem
 process data from modem
 if (end-of-data detected) break;
 }

Are you suggesting that the application should be deasserting RTS after
the read and asserting it before?


 It certainly could -- you were asking how it would know. ;-)


I had previously thought that the control of RTS was something that the
serial/tty driver was supposed to do independently based on the buffer fill.


 The TTY line discipline driver could do that based on the amount of 
received data present in its buffer.  And it should if asked to (a brief 
look at drivers/char/n_tty.c reveals it does; obviously there may be a bug 


Really, where? In my look through the code I haven't found any mechanism 
that would result in RTS being lowered based on TTY buffers filling up, 
at least not in the 8250 case.


In this situation, though, it appears it's not the TTY buffers that are 
filling but the UART's own buffer. I would think this must be caused by 
some kind of interrupt latency that results in not draining the FIFO in 
time.


somewhere though).  So could e.g. the SLIP and PPP line discipline 
drivers, though the criteria might be different (apparently they do not, 
which is a shame).


 The serial drivers have nothing to do about it -- all they can do is 
pushing data upstream, to the discipline driver.  They can provide an 
interface to hardware flow control features though, if implemented by a 
given UART.


  Maciej




Re: serial flow control appears broken

2007-07-27 Thread Lee Howard

Alan Cox wrote:


As the flow control is driven by software on most 16x50 chips (there are
a couple of exceptions) if we fail to empty the fifo fast enough then any
flow control will be asserted too late to save the day.

If you stop the application and do the following

cat /dev/ttywhatever
^Z
[stopped]

(so you are asking the OS to buffer data but not ever reading it)

and then fire data at it does the flow control eventually occur ?



Yes it does appear to.  I told the application to simply sleep(300) at 
the appropriate moment, and I watched the application and when it began 
the sleep I ran:


 cat /dev/ttyS1
 (lots of garbage began spewing forth)
 ^Z
 (about 2 or 3 seconds and the RTS light goes dark)

Thanks,

Lee.



Re: serial flow control appears broken

2007-07-27 Thread Lee Howard

Maciej W. Rozycki wrote:

The TTY line discipline driver could do that based on the amount of 
received data present in its buffer.  And it should if asked to (a brief 
look at drivers/char/n_tty.c reveals it does; obviously there may be a bug 
somewhere though).  So could e.g. the SLIP and PPP line discipline 
drivers, though the criteria might be different (apparently they do not, 
which is a shame).


The serial drivers have nothing to do about it -- all they can do is 
pushing data upstream, to the discipline driver.  They can provide an 
interface to hardware flow control features though, if implemented by a 
given UART.




Thank you for this clarification.  So I should have more correctly been 
saying that tty flow control appears broken.  Right?


I've asked the manufacturer to take a look at drivers/char/n_tty.c to 
see if they can't see anything obvious.


Thanks,

Lee.


Re: serial flow control appears broken

2007-07-27 Thread Paul Fulghum
On Fri, 2007-07-27 at 12:22 -0600, Robert Hancock wrote:
> Maciej W. Rozycki wrote:
> >  The TTY line discipline driver could do that based on the amount of 
> > received data present in its buffer.  And it should if asked to (a brief 
> > look at drivers/char/n_tty.c reveals it does; obviously there may be a bug 
> 
> Really, where? In my look through the code I haven't found any mechanism 
> that would result in RTS being lowered based on TTY buffers filling up, 
> at least not in the 8250 case.

serial_core.c:uart_throttle()
serial_core.c:uart_unthrottle()

These are called by N_TTY in response to buffer levels.
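
To picture the mechanism, here is a toy user-space model of watermark-based throttling. It is not kernel code: the buffer size and thresholds are illustrative assumptions, and the toy_* names are hypothetical. The throttled flag stands in for the point where the line discipline would call uart_throttle() or uart_unthrottle().

```c
#include <stddef.h>

/* Illustrative numbers only, not taken from any particular kernel. */
#define BUF_SIZE        4096
#define THROTTLE_ROOM    128   /* throttle when free space drops below this */
#define UNTHROTTLE_FILL  128   /* unthrottle when queued data drops to this */

struct toy_ldisc {
    size_t queued;     /* bytes waiting for the reader */
    int throttled;     /* 1 while the sender is being held off */
};

/* Driver pushes n received bytes upward; returns how many were accepted. */
static size_t toy_receive(struct toy_ldisc *ld, size_t n)
{
    size_t room = BUF_SIZE - ld->queued;
    if (n > room)
        n = room;
    ld->queued += n;
    if (!ld->throttled && BUF_SIZE - ld->queued < THROTTLE_ROOM)
        ld->throttled = 1;      /* stands in for uart_throttle() */
    return n;
}

/* Application reads up to n bytes; returns how many were consumed. */
static size_t toy_read(struct toy_ldisc *ld, size_t n)
{
    if (n > ld->queued)
        n = ld->queued;
    ld->queued -= n;
    if (ld->throttled && ld->queued <= UNTHROTTLE_FILL)
        ld->throttled = 0;      /* stands in for uart_unthrottle() */
    return n;
}
```

Note that in this model the decision depends only on how full the line discipline buffer is. Bytes lost in the UART before they ever reach toy_receive() never influence it.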
 
--
Paul Fulghum
Microgate Systems, Ltd



Re: serial flow control appears broken

2007-07-27 Thread Paul Fulghum
On Fri, 2007-07-27 at 12:22 -0600, Robert Hancock wrote:

> In this situation, though, it appears it's not the TTY buffers that are 
> filling but the UART's own buffer. I would think this must be caused by 
> some kind of interrupt latency that results in not draining the FIFO in 
> time.

You are right, this error is output when the character flag TTY_OVERRUN
is encountered by n_tty.c which should be set by the driver
in response to a hardware FIFO overrun (not an ldisc buffer overrun).

I can't see anyplace in serial_core.c or 8250.c that sets TTY_OVERRUN.
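
One way to tell the two kinds of overrun apart from user space is the TIOCGICOUNT ioctl, which fills a struct serial_icounter_struct with separate overrun and buf_overrun counters. The sketch below assumes the driver in use implements that ioctl (not verified here for the kernels discussed in this thread); overrun_delta, overrun_diff and read_icount are hypothetical names.

```c
#include <sys/ioctl.h>
#include <linux/serial.h>

/* Hypothetical result type: counter changes between two snapshots. */
struct overrun_delta {
    int fifo;     /* receiver overruns reported by the UART */
    int buffer;   /* overruns of the tty-layer buffer */
};

/* Compare two TIOCGICOUNT snapshots taken before and after a transfer. */
static struct overrun_delta overrun_diff(const struct serial_icounter_struct *before,
                                         const struct serial_icounter_struct *after)
{
    struct overrun_delta d;
    d.fifo   = after->overrun - before->overrun;
    d.buffer = after->buf_overrun - before->buf_overrun;
    return d;
}

/* Take a snapshot from an open serial fd; returns -1 on error. */
static int read_icount(int fd, struct serial_icounter_struct *ic)
{
    return ioctl(fd, TIOCGICOUNT, ic);
}
```

Taking one snapshot before a fax session and one after, a non-zero fifo delta with a zero buffer delta would point at the hardware FIFO rather than the line discipline buffer.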

 
--
Paul Fulghum
Microgate Systems, Ltd



Re: serial flow control appears broken

2007-07-27 Thread Paul Fulghum

Tilman Schmidt wrote:

Could this be related?

http://lkml.org/lkml/2007/7/18/245

Quote:
I've recently found (using 2.6.21.4) that configuring a serial ports
(ST16654) which use the 8250 driver using setserial results in the
UART's FIFOs being disabled (unless you specify autoconfig).


That would make sense.

Lee's error is a hardware FIFO overrun which could occur
if the FIFO is being disabled as described in your
link (by trying to set the uart type with setserial).

Since the tty flow control is only triggered
by the line discipline in response to ldisc
buffer levels and not hardware FIFO overruns,
you would never see any flow control action
as reported by Lee.


--
Paul Fulghum
Microgate Systems, Ltd.


Re: serial flow control appears broken

2007-07-27 Thread Paul Fulghum
OK, I see where TTY_OVERRUN is set:
include/linux/serial_core.h:uart_insert_char()


--
Paul Fulghum
Microgate Systems, Ltd



Re: serial flow control appears broken

2007-07-27 Thread Lee Howard

Tilman Schmidt wrote:


Lee Howard wrote:
 

So, does this explain why I wouldn't have a problem at 115200 bps with 
kernel 2.2.5 but why I do with 2.6.5 and 2.6.18?  Both hardware and 
software flow control work fine with 2.2.5 (meaning I don't see any 
error message and I don't have any data corruption), but neither works 
to avoid the "kernel: ttyS1: 1 input overrun(s)" and consequent data 
corruption issue in 2.6.5 nor 2.6.18.


Was there some associated application change in tty handling that needed 
to occur between the 2.2 and 2.6 kernels to properly implement flow control?
   



Could this be related?

http://lkml.org/lkml/2007/7/18/245

Quote:
I've recently found (using 2.6.21.4) that configuring a serial ports
(ST16654) which use the 8250 driver using setserial results in the
UART's FIFOs being disabled (unless you specify autoconfig).
 



I'm not running setserial on the port, myself.  But to test to see if it 
is related, I included this code in the application:


#include <sys/ioctl.h>
#include <linux/serial.h>

struct serial_struct serial;
ioctl(modemFd, TIOCGSERIAL, &serial);
traceModemOp("modem xmit_fifo_size: %u", serial.xmit_fifo_size);

And I get this resulting logging:

MODEM modem xmit_fifo_size: 16

So it's clear from here that the xmit_fifo_size is set correctly on this 
system.


Thanks,

Lee.



Re: serial flow control appears broken

2007-07-27 Thread Lee Howard

Paul Fulghum wrote:


Tilman Schmidt wrote:


Could this be related?

http://lkml.org/lkml/2007/7/18/245

Quote:
I've recently found (using 2.6.21.4) that configuring a serial ports
(ST16654) which use the 8250 driver using setserial results in the
UART's FIFOs being disabled (unless you specify autoconfig).



That would make sense.

Lee's error is a hardware FIFO overrun which could occur
if the FIFO is being disabled as described in your
link (by trying to set the uart type with setserial).



I'm not using setserial on this port, myself.  If something in init is 
calling on setserial then I don't know about it.


That said, tests on the serial port from within the application show 
that xmit_fifo_size is set to 16 as it should be.


I wrote up a little test app:

   struct serial_struct serial;
   ioctl(modemFd, TIOCGSERIAL, &serial);
   printf("            type: %d\n", serial.type);
   printf("            line: %d\n", serial.line);
   /* labelled "line" in this run, but the value is serial.port (760 = 0x2f8) */
   printf("            line: %u\n", serial.port);
   printf("             irq: %d\n", serial.irq);
   printf("           flags: %d\n", serial.flags);
   printf("  xmit_fifo_size: %d\n", serial.xmit_fifo_size);
   printf("  custom_divisor: %d\n", serial.custom_divisor);
   printf("       baud_base: %d\n", serial.baud_base);
   printf("     close_delay: %u\n", serial.close_delay);
   printf("         io_type: 0x%X\n", serial.io_type);
   printf("reserved_char[0]: 0x%X\n", serial.reserved_char[0]);
   printf("            hub6: %d\n", serial.hub6);
   printf("    closing_wait: %u\n", serial.closing_wait);
   printf("   closing_wait2: %u\n", serial.closing_wait2);
   printf(" iomem_reg_shift: %u\n", serial.iomem_reg_shift);
   printf("       port_high: %u\n", serial.port_high);
   printf("     reserved[0]: %d\n", serial.reserved[0]);

Here's the output:

               type: 4
               line: 1
               line: 760
                irq: 3
              flags: 1358954688
     xmit_fifo_size: 16
     custom_divisor: 0
          baud_base: 115200
        close_delay: 500
            io_type: 0x0
   reserved_char[0]: 0x0
               hub6: 0
       closing_wait: 3
      closing_wait2: 0
    iomem_reg_shift: 0
          port_high: 0
        reserved[0]: 0
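
To read the first few values: type 4 corresponds to a 16550A in the port type numbering of linux/serial.h, and 760 is 0x2f8 in hex, which matches the "ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A" boot message. A small sketch of that decoding follows; uart_type_name is a hypothetical helper and only covers the first five type values.

```c
#include <stddef.h>

/* Names for serial_struct.type values 0..4; anything else is "other". */
static const char *uart_type_name(int type)
{
    static const char *const names[] = {
        "unknown", "8250", "16450", "16550", "16550A"
    };
    if (type < 0 || (size_t)type >= sizeof names / sizeof names[0])
        return "other";
    return names[type];
}
```

For example, printf("%s at I/O 0x%x\n", uart_type_name(4), 760u) prints "16550A at I/O 0x2f8".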

Thanks,

Lee.


Re: serial flow control appears broken

2007-07-27 Thread Paul Fulghum
On Fri, 2007-07-27 at 13:48 -0700, Lee Howard wrote:
> Here's the output:
> 
>             type: 4
>             line: 1
>             line: 760
>              irq: 3
>            flags: 1358954688
>   xmit_fifo_size: 16
>   custom_divisor: 0
>        baud_base: 115200

OK, the FIFO should be enabled.

What is known:

* The error is a hardware FIFO overrun.
  - observed message is in n_tty due to driver setting TTY_OVERRUN

* The RTS/CTS flow control is not involved
  - this is done only by the ldisc in response to buffer levels
  - you verified crtscts is set
  - you did not observe RTS change when 'overflow error' logged
  - you did observe RTS change when application stopped reading

So this seems to be a latency issue reading the receive
FIFO in the ISR. The current rx FIFO trigger level
should be 8 bytes (UART_FCR_R_TRIG_10) which gives the
ISR 694usec to get the data at 115200bps.

IIRC, in 2.2.X kernels this defaulted to 4 bytes
(TRIG_01) which gave a little more time to service the interrupt.

How does the data rate affect the frequency of the overrun errors?
Does 57600bps make them go away?
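
The 694usec figure can be reproduced with a little arithmetic. The sketch below assumes 8N1 framing (10 bit times per byte on the wire) and a 16-byte receive FIFO, and it ignores the receiver shift register; isr_slack_usec is a hypothetical helper name.

```c
/* Assumes 8N1 framing: 1 start + 8 data + 1 stop = 10 bit times per byte. */
#define BITS_PER_BYTE_ON_WIRE 10.0
#define FIFO_DEPTH 16            /* 16550A receive FIFO size in bytes */

/* Microseconds between the rx-trigger interrupt and the FIFO filling up. */
static double isr_slack_usec(int trigger_level, double bps)
{
    int headroom = FIFO_DEPTH - trigger_level;   /* bytes still free */
    return headroom * BITS_PER_BYTE_ON_WIRE / bps * 1e6;
}
```

With a trigger level of 8 at 115200 bps this gives about 694 microseconds. A trigger level of 4 gives about 1042 microseconds, and halving the rate to 57600 bps with a trigger level of 8 gives about 1389 microseconds.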

--
Paul






Re: serial flow control appears broken

2007-07-27 Thread Lee Howard

Paul Fulghum wrote:


So this seems to be a latency issue reading the receive
FIFO in the ISR. The current rx FIFO trigger level
should be 8 bytes (UART_FCR_R_TRIG_10) which gives the
ISR 694usec to get the data at 115200bps.

IIRC, in 2.2.X kernels this defaulted to 4 bytes
(TRIG_01) which gave a little more time to service the interrupt.

How does the data rate affect the frequency of the overrun errors?
Does 57600bps make them go away?
 



The overrun error message does not occur on every instance of data 
corruption.  (I just became aware of this as I've not been paying so 
much attention to the error messages as I have been to the corrupt 
data.)  The data gets far more corrupted than the error messages would 
lead me to believe.  Since the data being sent from the fax modem to the 
host is identical (same image data) every time it's easier for me to 
measure the effect of one bitrate over another by examining the number 
of missing bytes from the data.


The image has a total of 140465 bytes.  Just now I sent it 5 times each 
at 115200, 57600, 38400, and 19200 bps.


At 115200 bps the number of bytes skipped were:  63, 5, 44, 48, and 2.

At 57600 bps the number of bytes skipped were:  0, 1, 13, 9, and 12.

At 38400 bps the number of bytes skipped were 858, 0, 0, 0, and 8.

At 19200 bps the number of bytes skipped were 0, 0, 0, 0, and 0.
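
Totalled per bitrate, the figures above can be tabulated as follows. This is only a convenience sketch for summing the reported numbers; the struct and function names are hypothetical, and the data is copied from the four lines above.

```c
#define RUNS 5
#define IMAGE_BYTES 140465L      /* size of the test image in bytes */

struct rate_result {
    long bps;
    int skipped[RUNS];           /* bytes missing in each of the 5 runs */
};

static const struct rate_result results[] = {
    {115200, {63, 5, 44, 48, 2}},
    { 57600, {0, 1, 13, 9, 12}},
    { 38400, {858, 0, 0, 0, 8}},
    { 19200, {0, 0, 0, 0, 0}},
};

/* Sum of skipped bytes over all runs at one bitrate. */
static long total_skipped(const struct rate_result *r)
{
    long sum = 0;
    for (int i = 0; i < RUNS; i++)
        sum += r->skipped[i];
    return sum;
}

/* Skipped bytes as a percentage of all bytes sent at that bitrate. */
static double loss_percent(const struct rate_result *r)
{
    return 100.0 * (double)total_skipped(r) / (double)(IMAGE_BYTES * RUNS);
}
```

The totals come to 162 bytes at 115200 bps, 35 at 57600 bps, 866 at 38400 bps (858 of them in the single run discussed below), and 0 at 19200 bps.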

Curiously, the session at 38400 bps that skipped 858 bytes... coincided, 
not just in sequence but also in precise timing within the session, with 
a small but noticeable disk load that I caused by grepping through a 
hundred session logs.  (I can't reproduce it easily, though, because of 
disk caching.)


And, perhaps this is relevant... the way that I have the fax modem 
sending the data to the host is by receiving it from another fax modem 
which is sending it.  Thus, the modem on ttyS0 is sending a fax to the 
modem on ttyS1.  Due to the error correction protocol that is performed 
between the two fax endpoints I can guarantee that the data is correct 
as it leaves the DCE.  I mention this in case there is any limitation to 
how the 8250 driver performs when two modems are being run simultaneously.


Thanks,

Lee.



Re: serial flow control appears broken

2007-07-26 Thread Lee Howard

Alan Cox wrote:


Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing enabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A

It's a Shuttle HOT-661 motherboard (VIA Apollo Pro Plus mainboard 
chipset).  Both FreeBSD and Linux identify the serial chipset type as 
16550A.
   



So you've got 16bytes of buffering. That ought to be enough on a modern
PC. The older kernels use quite limited internal buffers which may be a
factor, the current ones have a rewritten tty buffering layer which may
improve matters enormously.



So, does this explain why I wouldn't have a problem at 115200 bps with 
kernel 2.2.5 but why I do with 2.6.5 and 2.6.18?  Both hardware and 
software flow control work fine with 2.2.5 (meaning I don't see any 
error message and I don't have any data corruption), but neither works 
to avoid the "kernel: ttyS1: 1 input overrun(s)" and consequent data 
corruption issue in 2.6.5 nor 2.6.18.


Was there some associated application change in tty handling that needed 
to occur between the 2.2 and 2.6 kernels to properly implement flow control?


Thanks,

Lee.


Re: serial flow control appears broken

2007-07-26 Thread Alan Cox
> The manufacturer is using a scope to look for RTS and they're not seeing 
> it, either.  I just use my eyes to look at the LED, but I can see the 
> CTS, DTR, DCD, RD, and TD lights blink, flicker, or dim... (and TD, RD, 
> and CTS tend to go on and off rather quickly).

And you have

1.  The port set up correctly for flow control options in the
kernel ?
2.  Verified that the board vendor remembered to wire it ?


Re: serial flow control appears broken

2007-07-26 Thread Lee Howard

Uwe Kleine-König wrote:


Hello,

 

This is evidenced in hardware flow control by a little LED labeled "RTS" 
that is on the external modem.  This LED lights up when pin 7 of the DB9 
serial connection is given +12Vdc current (signalling "RTS" is on - that 
the host can accept data).  The LED goes dark when the current is 
removed (signalling that the host cannot accept data).  This "RTS" LED 
never flickers at all, as it should, when receiving these bursts of data 
- the LED stays lit as long as the serial cable is connected to the 
host... and yet I will see those "input overrun" messages.  Thus, it 
seems quite clear that the Linux serial tty driver is not deasserting 
RTS as it should in hardware flow control.  (And probably the analogous 
problem exists in software flow control, too.)
   


I don't know the relevant timings for this problem, but just to be sure that
your prerequisites are correct:  How did you check that the LED stays
lit all the time?  Just from looking might not be accurate.  You might
want to measure the signal with an oscilloscope.



The manufacturer is using a scope to look for RTS and they're not seeing 
it, either.  I just use my eyes to look at the LED, but I can see the 
CTS, DTR, DCD, RD, and TD lights blink, flicker, or dim... (and TD, RD, 
and CTS tend to go on and off rather quickly).


All of that said... even though I don't see RTS flicker or blink or dim 
when using kernel 2.2.5 (RedHat 6.0) I don't have any problems using 
115200 bps DTE-DCE communication rate.


Thanks,

Lee.


Re: serial flow control appears broken

2007-07-26 Thread Alan Cox
> Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing enabled
> ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
> 
> It's a Shuttle HOT-661 motherboard (VIA Apollo Pro Plus mainboard 
> chipset).  Both FreeBSD and Linux identify the serial chipset type as 
> 16550A.

So you've got 16bytes of buffering. That ought to be enough on a modern
PC. The older kernels use quite limited internal buffers which may be a
factor, the current ones have a rewritten tty buffering layer which may
improve matters enormously.

Alan


Re: serial flow control appears broken

2007-07-26 Thread Lee Howard

Robert Hancock wrote:


Lee Howard wrote:


Hello.

I have fax modems that will, in their proper behavior with certain 
features, send up to 64 kilobytes of data to the host DTE all at 
once.  (So, the fax modem handles an incoming fax and periodically 
will send between 256 bytes and 64 kilobytes of data in bursts.)


When the DCE-DTE (modem-to-host) communication rate is established at 
115200 bps data loss occurs on systems using at least Linux kernels 
2.6.5 and 2.6.18 (and probably everything in-between and then some 
more).  This is because the modem overflows the host's buffer.  This 
is evidenced in kernel logging:


Jul 23 14:01:30 gollum kernel: ttyS1: 1 input overrun(s)
Jul 23 17:09:45 gollum kernel: ttyS1: 1 input overrun(s)

Normally I would blame the modem itself for not honoring the host's 
flow control signals.  However, I have worked with the modem 
manufacturer closely on this matter for over three months now.  In 
that process they have improved the responsiveness of the modem and 
have fixed other problems, but the end result is that it truly does 
appear that the serial tty driver is not using flow control.  Whether 
software flow control (XON/XOFF) or hardware flow control (RTS/CTS) 
is used the result is the same.


This is evidenced in hardware flow control by a little LED labeled 
"RTS" that is on the external modem.  This LED lights up when pin 7 
of the DB9 serial connection is given +12Vdc current (signalling 
"RTS" is on - that the host can accept data).  The LED goes dark when 
the current is removed (signalling that the host cannot accept 
data).  This "RTS" LED never flickers at all, as it should, when 
receiving these bursts of data - the LED stays lit as long as the 
serial cable is connected to the host... and yet I will see those 
"input overrun" messages.  Thus, it seems quite clear that the Linux 
serial tty driver is not deasserting RTS as it should in hardware 
flow control.  (And probably the analogous problem exists in software 
flow control, too.)


Please tell me what I can do to help you resolve and/or remedy this 
matter.  Also, please let me know if I have contacted the wrong 
people.  (I have cross-posted to linux-kernel as a catch-all.  I am 
not subscribed to either linux-serial or linux-kernel mailing lists.  
So please CC me in any list responses.)


If it is of any value to know (perhaps they have common code?), the 
same error occurs on FreeBSD 6.2 as well.   The problem does not 
occur on Windows.  The problem does not occur on RedHat 6.0 (kernel 
2.2.5).



What kind of serial port and machine is this on? From what I can see, 
a standard 16550 UART (not a special variant) just doesn't have any 
support for clearing RTS on its own when its input FIFO gets too full. 
The kernel would have to do it in that case. I'm not seeing where it 
would be controlling that automatically (as opposed to manually from 
the application with TIOCM_RTS). I'm also not sure if the UART gives 
the kernel enough information for it to even be able to control this 
line properly automatically.


That's assuming it actually is a 16550 or similar with a 16-byte FIFO 
at all, which assuming it's a non-ancient PC it should be, but who knows.



Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing enabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A

It's a Shuttle HOT-661 motherboard (VIA Apollo Pro Plus mainboard 
chipset).  Both FreeBSD and Linux identify the serial chipset type as 
16550A.


If the application were to use TIOCM_RTS how would it know when to apply 
it or not?  Is there some approach that the application could take to 
manage flow control on the serial port?  What about software flow 
control?  Does the application (and not the driver) need to be managing 
the DC1/DC3 signalling on the host-side?


Thanks,

Lee.


Re: serial flow control appears broken

2007-07-26 Thread Uwe Kleine-König
Hello,

> This is evidenced in hardware flow control by a little LED labeled "RTS" 
> that is on the external modem.  This LED lights up when pin 7 of the DB9 
> serial connection is given +12Vdc current (signalling "RTS" is on - that 
> the host can accept data).  The LED goes dark when the current is 
> removed (signalling that the host cannot accept data).  This "RTS" LED 
> never flickers at all, as it should, when receiving these bursts of data 
> - the LED stays lit as long as the serial cable is connected to the 
> host... and yet I will see those "input overrun" messages.  Thus, it 
> seems quite clear that the Linux serial tty driver is not deasserting 
> RTS as it should in hardware flow control.  (And probably the analogous 
> problem exists in software flow control, too.)
I don't know the relevant timings for this problem, but just to be sure that
your prerequisites are correct:  How did you check that the LED stays
lit all the time?  Just looking at it might not be accurate.  You might
want to measure the signal with an oscilloscope.

Just my 0.02¢
Uwe

-- 
Uwe Kleine-König

fib where fib = 0 : 1 : zipWith (+) fib (tail fib)


Re: serial flow control appears broken

2007-07-26 Thread Robert Hancock

Lee Howard wrote:

> Hello.
>
> I have fax modems that will, in their proper behavior with certain
> features, send up to 64 kilobytes of data to the host DTE all at once.
> (So, the fax modem handles an incoming fax and periodically will send
> between 256 bytes and 64 kilobytes of data in bursts.)
>
> When the DCE-DTE (modem-to-host) communication rate is established at
> 115200 bps data loss occurs on systems using at least Linux kernels 2.6.5
> and 2.6.18 (and probably everything in-between and then some more).  This
> is because the modem overflows the host's buffer.  This is evidenced in
> kernel logging:
>
> Jul 23 14:01:30 gollum kernel: ttyS1: 1 input overrun(s)
> Jul 23 17:09:45 gollum kernel: ttyS1: 1 input overrun(s)
>
> Normally I would blame the modem itself for not honoring the host's flow
> control signals.  However, I have worked with the modem manufacturer
> closely on this matter for over three months now.  In that process they
> have improved the responsiveness of the modem and have fixed other
> problems, but the end result is that it truly does appear that the
> serial tty driver is not using flow control.  Whether software flow
> control (XON/XOFF) or hardware flow control (RTS/CTS) is used the result
> is the same.
>
> This is evidenced in hardware flow control by a little LED labeled "RTS"
> that is on the external modem.  This LED lights up when pin 7 of the DB9
> serial connection is given +12Vdc current (signalling "RTS" is on - that
> the host can accept data).  The LED goes dark when the current is
> removed (signalling that the host cannot accept data).  This "RTS" LED
> never flickers at all, as it should, when receiving these bursts of data
> - the LED stays lit as long as the serial cable is connected to the
> host... and yet I will see those "input overrun" messages.  Thus, it
> seems quite clear that the Linux serial tty driver is not deasserting
> RTS as it should in hardware flow control.  (And probably the analogous
> problem exists in software flow control, too.)
>
> Please tell me what I can do to help you resolve and/or remedy this
> matter.  Also, please let me know if I have contacted the wrong people.
> (I have cross-posted to linux-kernel as a catch-all.  I am not
> subscribed to either linux-serial or linux-kernel mailing lists.  So
> please CC me in any list responses.)
>
> If it is of any value to know (perhaps they have common code?), the same
> error occurs on FreeBSD 6.2 as well.  The problem does not occur on
> Windows.  The problem does not occur on RedHat 6.0 (kernel 2.2.5).


What kind of serial port and machine is this on? From what I can see, a 
standard 16550 UART (not a special variant) just doesn't have any 
support for clearing RTS on its own when its input FIFO gets too full. 
The kernel would have to do it in that case. I'm not seeing where it 
would be controlling that automatically (as opposed to manually from the 
application with TIOCM_RTS). I'm also not sure if the UART gives the 
kernel enough information for it to even be able to control this line 
properly automatically.


That's assuming it actually is a 16550 or similar with a 16-byte FIFO at 
all, which assuming it's a non-ancient PC it should be, but who knows.
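
For reference, manual control of RTS from an application looks roughly like the sketch below. It assumes the usual TIOCMBIS/TIOCMBIC/TIOCMGET modem-control ioctls; the helper names set_rts and get_rts are made up for illustration. It only shows the mechanism and says nothing about when an application should drop the line.

```c
#include <sys/ioctl.h>

/* Raise (on != 0) or drop (on == 0) the RTS modem-control line by hand.
 * Returns 0 on success, -1 on failure with errno set by ioctl().
 * (Helper names are hypothetical.) */
int set_rts(int fd, int on)
{
    int bits = TIOCM_RTS;

    return ioctl(fd, on ? TIOCMBIS : TIOCMBIC, &bits) == 0 ? 0 : -1;
}

/* Report whether RTS is currently asserted: 1 = yes, 0 = no, -1 = error. */
int get_rts(int fd)
{
    int status;

    if (ioctl(fd, TIOCMGET, &status) != 0)
        return -1;
    return (status & TIOCM_RTS) ? 1 : 0;
}
```

Calling set_rts(fd, 0) on an open port and watching the modem's RTS LED would also be a quick way to confirm that the line is wired through at all.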


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: serial flow control appears broken

2007-07-26 Thread Alan Cox
> The manufacturer is using a scope to look for RTS and they're not seeing 
> it, either.  I just use my eyes to look at the LED, but I can see the 
> CTS, DTR, DCD, RD, and TD lights blink, flicker, or dim... (and TD, RD, 
> and CTS tend to go on and off rather quickly).

And you have

1.  The port set up correctly for flow control options in the kernel?
2.  Verified that the board vendor remembered to wire it?
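
The first check can be sketched as below, assuming a termios implementation that provides the CRTSCTS flag (glibc and the BSDs do; it is not in POSIX). The helper names configure_rtscts and rtscts_enabled are made up for illustration.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* CRTSCTS is an extension, not POSIX */
#endif
#include <termios.h>

/* Request RTS/CTS hardware flow control and switch XON/XOFF off.
 * (Helper names are hypothetical.) */
void configure_rtscts(struct termios *tio)
{
    tio->c_cflag |= CRTSCTS;
    tio->c_iflag &= ~(IXON | IXOFF);
}

/* Report whether the port currently has CRTSCTS set:
 * 1 = enabled, 0 = disabled, -1 = tcgetattr() failed. */
int rtscts_enabled(int fd)
{
    struct termios tio;

    if (tcgetattr(fd, &tio) != 0)
        return -1;
    return (tio.c_cflag & CRTSCTS) ? 1 : 0;
}
```

Running rtscts_enabled() on the open port while the fax application is active would show whether the setting is still in place at the time the overruns occur.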


Re: serial flow control appears broken

2007-07-26 Thread Lee Howard

Uwe Kleine-König wrote:


> Hello,
>
>> This is evidenced in hardware flow control by a little LED labeled "RTS"
>> that is on the external modem.  This LED lights up when pin 7 of the DB9
>> serial connection is given +12Vdc current (signalling "RTS" is on - that
>> the host can accept data).  The LED goes dark when the current is
>> removed (signalling that the host cannot accept data).  This "RTS" LED
>> never flickers at all, as it should, when receiving these bursts of data
>> - the LED stays lit as long as the serial cable is connected to the
>> host... and yet I will see those "input overrun" messages.  Thus, it
>> seems quite clear that the Linux serial tty driver is not deasserting
>> RTS as it should in hardware flow control.  (And probably the analogous
>> problem exists in software flow control, too.)
>
> I don't know the relevant timings for this problem, but just to be sure that
> your prerequisites are correct:  How did you check that the LED stays
> lit all the time?  Just looking at it might not be accurate.  You might
> want to measure the signal with an oscilloscope.



The manufacturer is using a scope to look for RTS and they're not seeing 
it, either.  I just use my eyes to look at the LED, but I can see the 
CTS, DTR, DCD, RD, and TD lights blink, flicker, or dim... (and TD, RD, 
and CTS tend to go on and off rather quickly).


All of that said... even though I don't see RTS flicker or blink or dim 
when using kernel 2.2.5 (RedHat 6.0) I don't have any problems using 
115200 bps DTE-DCE communication rate.


Thanks,

Lee.


Re: serial flow control appears broken

2007-07-26 Thread Alan Cox
> Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing enabled
> ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
>
> It's a Shuttle HOT-661 motherboard (VIA Apollo Pro Plus mainboard 
> chipset).  Both FreeBSD and Linux identify the serial chipset type as 
> 16550A.

So you've got 16 bytes of buffering. That ought to be enough on a modern
PC. The older kernels use quite limited internal buffers, which may be a
factor; the current ones have a rewritten tty buffering layer, which may
improve matters enormously.
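
To put a number on "enough", here is a rough worked estimate. It assumes 8N1 framing (10 bits on the wire per byte) and the receive trigger level of 8 reported elsewhere in this thread; the function name fifo_headroom_us is made up. At 115200 bps one byte takes about 86.8 microseconds, so the 8 bytes of FIFO left after the interrupt fires give the host roughly 0.7 ms to service it. The estimate ignores the receiver shift register, which adds about one more character time.

```c
/* Time, in microseconds, between the receive interrupt firing at
 * trigger_level bytes and the FIFO filling to fifo_depth bytes.
 * Returns 0 for nonsensical input (zero baud rate, trigger >= depth).
 * (Function name is hypothetical.) */
double fifo_headroom_us(unsigned baud, unsigned bits_per_char,
                        unsigned fifo_depth, unsigned trigger_level)
{
    double char_time_us;

    if (baud == 0 || trigger_level >= fifo_depth)
        return 0.0;
    char_time_us = 1e6 * bits_per_char / baud;
    return (fifo_depth - trigger_level) * char_time_us;
}
```

For the setup discussed here the call would be fifo_headroom_us(115200, 10, 16, 8). Anything that keeps interrupts from being serviced for longer than that window, heavy disk activity for instance, could then lose data regardless of flow control settings.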

Alan


Re: serial flow control appears broken

2007-07-26 Thread Lee Howard

Alan Cox wrote:


>> Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing enabled
>> ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
>> ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
>>
>> It's a Shuttle HOT-661 motherboard (VIA Apollo Pro Plus mainboard
>> chipset).  Both FreeBSD and Linux identify the serial chipset type as
>> 16550A.
>
> So you've got 16 bytes of buffering. That ought to be enough on a modern
> PC. The older kernels use quite limited internal buffers, which may be a
> factor; the current ones have a rewritten tty buffering layer, which may
> improve matters enormously.



So, does this explain why I wouldn't have a problem at 115200 bps with 
kernel 2.2.5 but why I do with 2.6.5 and 2.6.18?  Both hardware and 
software flow control work fine with 2.2.5 (meaning I don't see any 
error message and I don't have any data corruption), but neither works 
to avoid the "kernel: ttyS1: 1 input overrun(s)" message and consequent 
data corruption in 2.6.5 or 2.6.18.
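
One way to watch these overruns from user space, rather than waiting for the log message, is the Linux-specific TIOCGICOUNT ioctl. This is a sketch only: the helper name read_overruns is made up, and it assumes a kernel and serial driver that fill in struct serial_icounter_struct from <linux/serial.h>. As I understand it, the overrun field counts bytes lost in the UART itself, while buf_overrun counts bytes dropped because the tty buffer was full.

```c
#include <sys/ioctl.h>
#include <linux/serial.h>

/* Fetch the driver's cumulative receive-overrun counters for a port.
 * Returns 0 on success, -1 on failure with errno set by ioctl().
 * (Helper name is hypothetical.) */
int read_overruns(int fd, int *uart_overruns, int *buffer_overruns)
{
    struct serial_icounter_struct ic;

    if (ioctl(fd, TIOCGICOUNT, &ic) != 0)
        return -1;
    *uart_overruns = ic.overrun;        /* lost in the UART FIFO   */
    *buffer_overruns = ic.buf_overrun;  /* lost in the tty buffer  */
    return 0;
}
```

Sampling the counters before and after a fax reception would show which of the two is increasing.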


Was there some associated application change in tty handling that needed 
to occur between the 2.2 and 2.6 kernels to properly implement flow control?


Thanks,

Lee.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/