Re: TCP Spike

2007-09-27 Thread Ilpo Järvinen
On Thu, 27 Sep 2007, Majumder, Rajib wrote:

> We have observed 40ms latency spikes in TCP connections in "burst" type of 
> traffic.
> This affects regular TCP sockets.

Are segments being sent full-sized, or is there perhaps some Nagle 
component in it as well? I.e., are the applications using TCP_NODELAY?

> We observed this issue in kernels of 2.4.21 and kernel 2.6.5.  
> 
> Aparently, this seems to be fixed in 2.6.19.  
>
> Can someone throw some light on this? 

I think somebody, probably Alexey, enabled sending of ACK on every 2nd 
segment. Previously small segment senders playing with Nagle were 
complaining every now and then about performance because two small 
segments did not generate ACKs but one had to accumulate, IIRC, half MSS 
worth of data before ACK was sent. Could this be related to your case?

...In case you're having too much time, you can always try bisecting it
which finds out the causing commit... :-)

> Is this a congestion control/avoidance issue?

Congestion control is basically ACK clocked math for cwnd, ssthresh, etc.
state, which then results in permission to send new segments out etc. 
(except for RTO part of course, which I'll ignore in the next statement). 
Any delay gaps to sent packet after ACK receival, which triggered the 
state changing math, isn't there due to congestion control but due to 
other factors! 40ms is much below MIN_RTO (200ms), so it shouldn't be due 
to RTO either... Note that also delayed ACKs are exception to the general 
rule.

Congestion control is controlled like your CPU is. In your CPU there's 
this whatever GHz clock which determines when the state changing events 
take place, state changes don't just happen arbitarily but are _clocked_ 
(ACK _clocked_ in case of congestion control). Of course there will be 
some propagation delay after the change to put in effect all the state 
changes that are result of what occurred at clock edge (and this delay 
assimilating to processing delay in the context of congestion control).


-- 
 i.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: TCP Spike

2007-09-27 Thread Stephen Hemminger
On Thu, 27 Sep 2007 11:58:01 +0800
"Majumder, Rajib" <[EMAIL PROTECTED]> wrote:

> Hi,
> 
> We have observed 40ms latency spikes in TCP connections in "burst" type of 
> traffic. This affects regular TCP sockets. We observed this issue in kernels 
> of 2.4.21 and kernel 2.6.5.

Unfortunately, 2.6.5 is out of my short term memory at this point. I do 
remember that 2.6.5
used BIC for congestion control, and there were some math errors in the 
congestion control
logic that caused it to be way to aggressive.
  
> 
> Aparently, this seems to be fixed in 2.6.19.  
> 
> Can someone throw some light on this? 

My guess is that the addition of the SACK hinting might be the major win.  The 
code
takes 3 passes over the SACK list, so with large outstanding data that was a 
major
bottleneck, not sure if it was 4ms worth though.

> 
> Is this a congestion control/avoidance issue? What congestion control 
> algorithm is used before 2.6.8?   

Default congestion control in early 2.6 was BIC, then after CUBIC stabilized it 
was made the default in 2.6.19.

Another thing that may cause changes in latency is Appropriate Byte Counting 
(ABC).
It was added in 2.6.14, but then turned off by default in 2.6.18.  The problem 
is
that ABC caused performance problems with some applications that sent messages
as many small writes.


-- 
Stephen Hemminger <[EMAIL PROTECTED]>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: TCP Spike

2007-09-27 Thread Stephen Hemminger
On Thu, 27 Sep 2007 11:58:01 +0800
Majumder, Rajib [EMAIL PROTECTED] wrote:

 Hi,
 
 We have observed 40ms latency spikes in TCP connections in burst type of 
 traffic. This affects regular TCP sockets. We observed this issue in kernels 
 of 2.4.21 and kernel 2.6.5.

Unfortunately, 2.6.5 is out of my short term memory at this point. I do 
remember that 2.6.5
used BIC for congestion control, and there were some math errors in the 
congestion control
logic that caused it to be way to aggressive.
  
 
 Aparently, this seems to be fixed in 2.6.19.  
 
 Can someone throw some light on this? 

My guess is that the addition of the SACK hinting might be the major win.  The 
code
takes 3 passes over the SACK list, so with large outstanding data that was a 
major
bottleneck, not sure if it was 4ms worth though.

 
 Is this a congestion control/avoidance issue? What congestion control 
 algorithm is used before 2.6.8?   

Default congestion control in early 2.6 was BIC, then after CUBIC stabilized it 
was made the default in 2.6.19.

Another thing that may cause changes in latency is Appropriate Byte Counting 
(ABC).
It was added in 2.6.14, but then turned off by default in 2.6.18.  The problem 
is
that ABC caused performance problems with some applications that sent messages
as many small writes.


-- 
Stephen Hemminger [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: TCP Spike

2007-09-27 Thread Ilpo Järvinen
On Thu, 27 Sep 2007, Majumder, Rajib wrote:

 We have observed 40ms latency spikes in TCP connections in burst type of 
 traffic.
 This affects regular TCP sockets.

Are segments being sent full-sized, or is there perhaps some Nagle 
component in it as well? I.e., are the applications using TCP_NODELAY?

 We observed this issue in kernels of 2.4.21 and kernel 2.6.5.  
 
 Aparently, this seems to be fixed in 2.6.19.  

 Can someone throw some light on this? 

I think somebody, probably Alexey, enabled sending of ACK on every 2nd 
segment. Previously small segment senders playing with Nagle were 
complaining every now and then about performance because two small 
segments did not generate ACKs but one had to accumulate, IIRC, half MSS 
worth of data before ACK was sent. Could this be related to your case?

...In case you're having too much time, you can always try bisecting it
which finds out the causing commit... :-)

 Is this a congestion control/avoidance issue?

Congestion control is basically ACK clocked math for cwnd, ssthresh, etc.
state, which then results in permission to send new segments out etc. 
(except for RTO part of course, which I'll ignore in the next statement). 
Any delay gaps to sent packet after ACK receival, which triggered the 
state changing math, isn't there due to congestion control but due to 
other factors! 40ms is much below MIN_RTO (200ms), so it shouldn't be due 
to RTO either... Note that also delayed ACKs are exception to the general 
rule.

Congestion control is controlled like your CPU is. In your CPU there's 
this whatever GHz clock which determines when the state changing events 
take place, state changes don't just happen arbitarily but are _clocked_ 
(ACK _clocked_ in case of congestion control). Of course there will be 
some propagation delay after the change to put in effect all the state 
changes that are result of what occurred at clock edge (and this delay 
assimilating to processing delay in the context of congestion control).


-- 
 i.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


TCP Spike

2007-09-26 Thread Majumder, Rajib
Hi,

We have observed 40ms latency spikes in TCP connections in "burst" type of 
traffic. This affects regular TCP sockets. We observed this issue in kernels of 
2.4.21 and kernel 2.6.5.  

Aparently, this seems to be fixed in 2.6.19.  

Can someone throw some light on this? 

Is this a congestion control/avoidance issue? What congestion control algorithm 
is used before 2.6.8?   

Thanks

Rajib



==
Please access the attached hyperlink for an important electronic communications 
disclaimer: 

http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html
==

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


TCP Spike

2007-09-26 Thread Majumder, Rajib
Hi,

We have observed 40ms latency spikes in TCP connections in burst type of 
traffic. This affects regular TCP sockets. We observed this issue in kernels of 
2.4.21 and kernel 2.6.5.  

Aparently, this seems to be fixed in 2.6.19.  

Can someone throw some light on this? 

Is this a congestion control/avoidance issue? What congestion control algorithm 
is used before 2.6.8?   

Thanks

Rajib



==
Please access the attached hyperlink for an important electronic communications 
disclaimer: 

http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html
==

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/