[PATCH][TCP] tcp_vegas.c: (tcp_vegas_cong_avoid) fix a bug in disabling slow start by gamma parameter
TCP Vegas implementation has a bug in the process of disabling slow-start with gamma parameter. The bug may lead to extreme unfairness in the presence of early packet loss. See details in:
http://www.cs.caltech.edu/~weixl/technical/ns2linux/known_linux/index.html#vegas

Switch the order of if (tp->snd_cwnd <= tp->snd_ssthresh) statement and if (diff > gamma) statement to eliminate the problem.

Signed-off-by: Xiaoliang (David) Wei http://www.davidwei.org
---
 linux-2.6.22.9-fixedvegas/net/ipv4/tcp_vegas.c | 35 +++++++++++++++++------------------
 1 files changed, 17 insertions(+), 18 deletions(-)

diff -uprN linux-2.6.22.9/net/ipv4/tcp_vegas.c linux-2.6.22.9-fixedvegas/net/ipv4/tcp_vegas.c
--- linux-2.6.22.9/net/ipv4/tcp_vegas.c 2007-09-26 11:03:01.0 -0700
+++ linux-2.6.22.9-fixedvegas/net/ipv4/tcp_vegas.c 2007-10-08 22:44:46.0 -0700
@@ -266,26 +266,25 @@ static void tcp_vegas_cong_avoid(struct
              */
             diff = (old_wnd << V_PARAM_SHIFT) - target_cwnd;
 
-            if (tp->snd_cwnd <= tp->snd_ssthresh) {
-                /* Slow start. */
-                if (diff > gamma) {
-                    /* Going too fast. Time to slow down
-                     * and switch to congestion avoidance.
-                     */
-                    tp->snd_ssthresh = 2;
+            if (diff > gamma && tp->snd_ssthresh > 2 ) {
+                /* Going too fast. Time to slow down
+                 * and switch to congestion avoidance.
+                 */
+                tp->snd_ssthresh = 2;
 
-                    /* Set cwnd to match the actual rate
-                     * exactly:
-                     *   cwnd = (actual rate) * baseRTT
-                     * Then we add 1 because the integer
-                     * truncation robs us of full link
-                     * utilization.
-                     */
-                    tp->snd_cwnd = min(tp->snd_cwnd,
-                                       (target_cwnd >>
-                                        V_PARAM_SHIFT)+1);
+                /* Set cwnd to match the actual rate
+                 * exactly:
+                 *   cwnd = (actual rate) * baseRTT
+                 * Then we add 1 because the integer
+                 * truncation robs us of full link
+                 * utilization.
+                 */
+                tp->snd_cwnd = min(tp->snd_cwnd,
+                                   (target_cwnd >>
+                                    V_PARAM_SHIFT)+1);
 
-                }
+            } else if (tp->snd_cwnd <= tp->snd_ssthresh) {
+                /* Slow start. */
                 tcp_slow_start(tp);
             } else {
                 /* Congestion avoidance. */
-
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
A potential bug in tcp_vegas.c
Hi gurus,

I am testing tcp_vegas.c in Linux with the NS-2 TCP-Linux. It seems that ssthresh is not correctly reset in this file, and the problem might lead to very unfair behavior among Vegas flows when some flows exit slow start due to loss rather than delay. Please see the details at:
http://www.cs.caltech.edu/%7Eweixl/technical/ns2linux/known_linux/index.html#vegas
(the section "Setting of Slow-Start-Threshold")

A fix I wrote (for Linux 2.6.19.2) is as follows:

--- tcp_vegas.c.old 2007-02-01 00:33:55.0 -0800
+++ tcp_vegas.c 2007-02-01 00:39:49.0 -0800
@@ -265,26 +265,25 @@
              */
             diff = (old_wnd << V_PARAM_SHIFT) - target_cwnd;
 
-            if (tp->snd_cwnd <= tp->snd_ssthresh) {
-                /* Slow start. */
-                if (diff > gamma) {
-                    /* Going too fast. Time to slow down
-                     * and switch to congestion avoidance.
-                     */
-                    tp->snd_ssthresh = 2;
+            if (diff > gamma && tp->snd_ssthresh > 2 ) {
+                /* Going too fast. Time to slow down
+                 * and switch to congestion avoidance.
+                 */
+                tp->snd_ssthresh = 2;
 
-                    /* Set cwnd to match the actual rate
-                     * exactly:
-                     *   cwnd = (actual rate) * baseRTT
-                     * Then we add 1 because the integer
-                     * truncation robs us of full link
-                     * utilization.
-                     */
-                    tp->snd_cwnd = min(tp->snd_cwnd,
-                                       (target_cwnd >>
-                                        V_PARAM_SHIFT)+1);
+                /* Set cwnd to match the actual rate
+                 * exactly:
+                 *   cwnd = (actual rate) * baseRTT
+                 * Then we add 1 because the integer
+                 * truncation robs us of full link
+                 * utilization.
+                 */
+                tp->snd_cwnd = min(tp->snd_cwnd,
+                                   (target_cwnd >>
+                                    V_PARAM_SHIFT)+1);
 
-                }
+            } else if (tp->snd_cwnd <= tp->snd_ssthresh) {
+                /* Slow start. */
                 tcp_slow_start(tp);
             } else {
                 /* Congestion avoidance. */

Thanks.
-David

--
Xiaoliang (David) Wei
Graduate Student, [EMAIL PROTECTED]
http://davidwei.org
A potential bug in tcp_vegas.c
Hi gurus,

I am testing tcp_vegas.c in Linux with the NS-2 TCP-Linux. It seems that ssthresh is not correctly reset to 2 in the tcp_vegas_cong_avoid function. The problem might lead to very unfair behavior among Vegas flows when some flows exit slow start due to loss rather than delay. Please see the detailed effect at:
http://www.cs.caltech.edu/%7Eweixl/technical/ns2linux/known_linux/#vegas
(the section "Setting of Slow-Start-Threshold")

A patch I wrote (for Linux 2.6.19-2) is attached. Basically, the branch that sets ssthresh to 2 should be outside the condition (cwnd <= ssthresh).

--- tcp_vegas.c.old 2007-02-01 00:33:55.0 -0800
+++ tcp_vegas.c 2007-02-01 00:39:49.0 -0800
@@ -265,26 +265,25 @@
              */
             diff = (old_wnd << V_PARAM_SHIFT) - target_cwnd;
 
-            if (tp->snd_cwnd <= tp->snd_ssthresh) {
-                /* Slow start. */
-                if (diff > gamma) {
-                    /* Going too fast. Time to slow down
-                     * and switch to congestion avoidance.
-                     */
-                    tp->snd_ssthresh = 2;
+            if (diff > gamma && tp->snd_ssthresh > 2 ) {
+                /* Going too fast. Time to slow down
+                 * and switch to congestion avoidance.
+                 */
+                tp->snd_ssthresh = 2;
 
-                    /* Set cwnd to match the actual rate
-                     * exactly:
-                     *   cwnd = (actual rate) * baseRTT
-                     * Then we add 1 because the integer
-                     * truncation robs us of full link
-                     * utilization.
-                     */
-                    tp->snd_cwnd = min(tp->snd_cwnd,
-                                       (target_cwnd >>
-                                        V_PARAM_SHIFT)+1);
+                /* Set cwnd to match the actual rate
+                 * exactly:
+                 *   cwnd = (actual rate) * baseRTT
+                 * Then we add 1 because the integer
+                 * truncation robs us of full link
+                 * utilization.
+                 */
+                tp->snd_cwnd = min(tp->snd_cwnd,
+                                   (target_cwnd >>
+                                    V_PARAM_SHIFT)+1);
 
-                }
+            } else if (tp->snd_cwnd <= tp->snd_ssthresh) {
+                /* Slow start. */
                 tcp_slow_start(tp);
             } else {
                 /* Congestion avoidance. */

Thanks.
-David

--
Xiaoliang (David) Wei
Graduate Student, [EMAIL PROTECTED]
http://davidwei.org
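To make the effect of the reordering concrete, here is a minimal user-space sketch of the two branch orderings from the patch. It is not the kernel code: struct flow, old_order and new_order are hypothetical names, and the cwnd clamp, slow-start growth and congestion-avoidance steps are left out so that only the ssthresh reset remains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two struct tcp_sock fields used here. */
struct flow {
	uint32_t snd_cwnd;
	uint32_t snd_ssthresh;
};

/* Original ordering: the gamma test is nested inside the slow-start
 * test, so the reset to 2 is reachable only while cwnd <= ssthresh. */
void old_order(struct flow *f, uint32_t diff, uint32_t gamma)
{
	assert(f != NULL);
	if (f->snd_cwnd <= f->snd_ssthresh) {
		if (diff > gamma)
			f->snd_ssthresh = 2;
		/* slow-start growth omitted in this sketch */
	}
	/* congestion-avoidance adjustments omitted in this sketch */
}

/* Patched ordering: the gamma test comes first and does not depend on
 * how cwnd compares with ssthresh. */
void new_order(struct flow *f, uint32_t diff, uint32_t gamma)
{
	assert(f != NULL);
	if (diff > gamma && f->snd_ssthresh > 2)
		f->snd_ssthresh = 2;
	/* slow start and congestion avoidance omitted in this sketch */
}
```

With cwnd = 40 and ssthresh = 20 (a flow that is already past slow start), old_order leaves ssthresh at 20 even when diff exceeds gamma, while new_order resets it to 2.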
Re: TCP congestion graphs (2.6.19-rc4)
On 10/31/06, Stephen Hemminger [EMAIL PROTECTED] wrote:
> I reran the tests, this time using my DSL line to home (80ms 1Mbit/sec).
> http://developer.osdl.org/shemminger/tcp/2.6.19-rc4/dsl
>
> The only obvious problem is Vegas looks broken.

Thanks, Stephen. It seems that the default Vegas alpha parameter in rc4 is 1. I observed a similar situation with the NS2Linux simulator (with 2.6.16 code) and found that with alpha=1, delayed ACKs break Vegas: cwnd stays very low even without real congestion. See details at
http://www.cs.caltech.edu/%7Eweixl/technical/ns2linux/known_linux/index.html#vegas

(Basically, alpha==1 means Vegas aims for a delay of about one packet's worth. With delayed ACKs, one packet's worth of delay is common even with no congestion.)

To make Vegas work, I'd suggest raising alpha to at least 2 or 3 (and beta has to be at least as large as alpha).

-David

--
Xiaoliang (David) Wei
Graduate Student, [EMAIL PROTECTED]
http://davidwei.org
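The alpha argument above can be illustrated with a small sketch of the Vegas grow/hold/shrink decision. It follows the same target/diff comparison as tcp_vegas_cong_avoid, but the function name vegas_decision, the SCALE constant and the microsecond RTT arguments are assumptions made for this example, not kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

#define SCALE 1000u  /* hypothetical fixed-point scale: 1000 = one packet */

/* Returns +1 to grow cwnd, -1 to shrink it, 0 to hold.
 * alpha and beta are in whole packets; RTTs are in microseconds. */
int vegas_decision(uint32_t cwnd, uint32_t base_rtt_us, uint32_t rtt_us,
		   uint32_t alpha, uint32_t beta)
{
	assert(rtt_us > 0 && rtt_us >= base_rtt_us && alpha <= beta);

	/* Window that would just fill the path with no queueing (scaled). */
	uint64_t target = (uint64_t)cwnd * SCALE * base_rtt_us / rtt_us;
	/* Estimated number of packets sitting in queues (scaled). */
	uint64_t diff = (uint64_t)cwnd * SCALE - target;

	if (diff > (uint64_t)beta * SCALE)
		return -1;
	if (diff < (uint64_t)alpha * SCALE)
		return 1;
	return 0;
}
```

As an illustration with made-up numbers: for cwnd = 10, a base RTT of 80 ms and an RTT sample of 90 ms, diff works out to a little over one packet. With alpha = 1 the function returns 0 (cwnd stops growing), whereas with alpha = 2 it returns +1.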
Re: Weird TCP SACK problem. in Linux...
Hi Oumer,

Your result is interesting. Just a few questions (inline with your text):

> So I looked further into the results, and what I found was that when SACK (when I refer to SACK here, I mean SACK only, without FACK and DSACK) is used, the retransmissions seem to happen earlier. At www.kom.auc.dk/~oumer/first_transmission_times.pdf you can find a plot of the CDF of the time when the first TCP retransmission occurred for the four combinations of SACK and timestamps, after hundreds of downloads of a 100K file for the different conditions under network reordering...

Could you give a little more detail on the scenarios? For example: What are your RTT, capacity, etc.? Linux versions? Is the packet size 1.5K? Then 100K is about 66 packets. Do flows finish slow start or not? Also, what is the reordering level? Are you using Dummynet or a real network?

> ...but I couldn't figure out why the retransmissions occur earlier for SACK than for non-SACK TCP. As far as I know, for both the SACK and non-SACK cases, we need three (or more, according to the setting) duplicate ACKs to enter the fast retransmit/recovery state, which would have resulted in the same behaviour up to the first occurrence of a retransmission. Or is there some undocumented enhancement in Linux TCP when using SACK that makes it enter fast retransmit earlier... The only explanation I could imagine is something like this

Are you sure FACK is turned OFF? FACK might retransmit earlier if you have packet reordering, I think.

> non-SACK case: 1 2 3 4 5 6 7 8 9 10 were sent and 2 was reordered... and assume we are using delayed ACKs... and we get a triple duplicate ACK after pkt #8 is received (i.e. 3,4 -- first duplicate ACK, 5,6 -- second duplicate ACK and 7,8 -- third duplicate ACK)...
>
> so if SACK behaved like this...
> 3,4 SACKed: 2 packets out of order received
> 5,6 SACKed: 4 packets out of order received, start fast retransmission as reordering is greater than 3 (this is true when it comes to marking packets as lost during fast recovery, but is it true also for the first retransmission?)

I guess delayed ACK is turned off when there is packet reordering. The receiver will send one ACK for each data packet whenever there are out-of-order packets in its queue. So we will get duplicate ACKs earlier than what you explain above...

> One more thing, say I have FRTO, DSACK and timestamps enabled, which algorithm takes precedence? If FRTO is enabled, is all spurious timeout detection done through FRTO, or a combination?

They are compatible, I think? When the retransmission timer expires, it first tries to go through FRTO. If FRTO finds it's a real loss, then it goes to the traditional timeout process as specified in the FRTO algorithm.

-David

--
Xiaoliang (David) Wei
Graduate Student, [EMAIL PROTECTED]
http://davidwei.org
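The point about delayed ACKs can be checked with a toy receiver model. This is only a sketch under stated assumptions: the receiver sends an immediate duplicate ACK for every out-of-order arrival (as the reply suggests), everything before the hole has already been acknowledged, and third_dupack_trigger and MAX_SEQ are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SEQ 64  /* hypothetical bound for this toy model */

/* Returns the packet number whose arrival makes the receiver emit its
 * third duplicate ACK, or 0 if that never happens. Packets are numbered
 * from 1; every out-of-order arrival is ACKed immediately. */
int third_dupack_trigger(const int *arrivals, size_t n)
{
	int received[MAX_SEQ + 2] = {0};
	int expected = 1;  /* next in-order packet */
	int dupacks = 0;

	assert(arrivals != NULL);
	for (size_t i = 0; i < n; i++) {
		int seq = arrivals[i];

		assert(seq >= 1 && seq <= MAX_SEQ);
		received[seq] = 1;
		if (seq == expected) {
			/* Hole filled: advance past everything buffered. */
			while (received[expected])
				expected++;
			dupacks = 0;
		} else if (seq > expected) {
			/* Out of order: immediate duplicate ACK. */
			if (++dupacks == 3)
				return seq;
		}
	}
	return 0;
}
```

For the arrival order 1 3 4 5 6 7 8 9 10 2 from the example, this model reports the third duplicate ACK on the arrival of packet 5, rather than packet 8 as in the delayed-ACK reasoning quoted above.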
The AI parameter of tcp_highspeed.c (in 2.6.18)
Hi gurus,

I am not sure if there is a problem in the additive increase (AI) parameter of tcp_highspeed.c: when snd_cwnd is smaller than 38 and the connection is in the congestion avoidance phase (snd_cwnd > snd_ssthresh), snd_cwnd seems to stop growing.

I guess the problem is in the function hstcp_cong_avoid (lines 126-138 of tcp_highspeed.c):

        if (tp->snd_cwnd > hstcp_aimd_vals[ca->ai].cwnd) {
            while (tp->snd_cwnd > hstcp_aimd_vals[ca->ai].cwnd &&
                   ca->ai < HSTCP_AIMD_MAX - 1)
                ca->ai++;
        } else if (tp->snd_cwnd < hstcp_aimd_vals[ca->ai].cwnd) {
            while (tp->snd_cwnd > hstcp_aimd_vals[ca->ai].cwnd &&
                   ca->ai > 0)
                ca->ai--;
        }

        /* Do additive increase */
        if (tp->snd_cwnd < tp->snd_cwnd_clamp) {
            tp->snd_cwnd_cnt += ca->ai;

When snd_cwnd > snd_ssthresh, this part of the code is in effect. Since snd_cwnd is smaller than 38, which is hstcp_aimd_vals[0].cwnd, ca->ai will be equal to 0. Hence snd_cwnd_cnt will stay unchanged for the rest of the connection, and as a result snd_cwnd also freezes.

I guess we can change line 138 to make the algorithm more reliable?

-           tp->snd_cwnd_cnt += ca->ai;
+           tp->snd_cwnd_cnt += (ca->ai+1);

Thanks.
-David

--
Xiaoliang (David) Wei
Graduate Student, Netlab, Caltech
http://davidwei.org
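The freeze described above can be reproduced with a small stand-alone sketch. The function cwnd_after_acks is hypothetical, and the step that turns the counter into a window increase (grow cwnd by one once the counter reaches cwnd) is an assumption about the code that follows the quoted lines, which the email does not show. The table index is held fixed, which matches the case of cwnd staying below the first table entry.

```c
#include <assert.h>
#include <stdint.h>

/* Feed `acks` ACKs through the additive-increase step and return the
 * resulting cwnd. `inc` is the amount added to the counter per ACK:
 * ca->ai in the original code, ca->ai + 1 with the proposed change. */
uint32_t cwnd_after_acks(uint32_t cwnd, uint32_t inc, uint32_t acks)
{
	uint32_t cnt = 0;  /* stands in for tp->snd_cwnd_cnt */

	assert(cwnd > 0);
	for (uint32_t i = 0; i < acks; i++) {
		cnt += inc;
		if (cnt >= cwnd) {  /* assumed: one window's worth of credit */
			cnt -= cwnd;
			cwnd++;
		}
	}
	return cwnd;
}
```

Starting from cwnd = 20 with ca->ai = 0, the original increment (inc = 0) leaves cwnd at 20 no matter how many ACKs arrive, while the proposed increment (inc = 1) lets cwnd grow by one packet per window of ACKs.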