Re: [PATCH 0/9]: tcp-2.6 patchset
On Mon, 28 May 2007 13:27:03 +0300 (EEST) Ilpo Järvinen [EMAIL PROTECTED] wrote: On Sun, 27 May 2007, Ilpo Järvinen wrote: On Sun, 27 May 2007, Baruch Even wrote: * Ilpo J?rvinen [EMAIL PROTECTED] [070527 14:16]: Thus, my original question basically culminates in this: should cc modules be passed number of packets acked or number of skbs acked? ...The latter makes no sense to me unless the value is intented to be interpreted as number of timestamps acked or something along those lines. ...I briefly tried looking up for documentation for cc module interface but didn't find anything useful about this, and thus asked in the first place... At least the htcp module that I wrote assumes that the number is actual number of tcp packets so GSO should be considered. Thanks for the info! It is what I suspected... ...I'll write a patch for it tomorrow against net-2.6... Dave, beware that it will partially overlap with the changes made in the patch 8, so you might choose to put the patch 8 on hold until this issue is first resolved... The consequences of this bug are not too large but it does make all congestion control algorithms a lot less aggressive. On my machines GSO is disabled by default (e1000 at 100mbps Tigon3 @ 1Gbps). Agreed, that's my impression too. However, some algorithms do things like 0 checks for it, so it might disturb their dynamics even more than in the too small value cases... Hmm, there seems to be another case that I'm not too sure of... Please check the alternative I choose for SYN handling below... ...hmm... While exploring this SYN thingie, I noticed that commit 164891aadf1721fca4dce473bb0e0998181537c6 drops !FLAG_RETRANS_DATA_ACKED check from rtt_sample call (when combining it with pkts_acked call). I hope that's intentional?!? ...the commit message didn't say anything about it nor was anything in cc modules changed to accomodate that. [PATCH] [TCP]: Fix GSO ignorance of pkts_acked arg (cong.cntrl modules) The code used to ignore GSO completely, passing either way too small or zero pkts_acked when GSO skb or part of it got ACKed. In addition, there is no need to calculate the value in the loop but simple arithmetics after the loop is sufficient. Yes, thanks for fixing this. Wonder how it affects measurements. It is not very clear how SYN segments should be handled, so I choose to follow the previous implementation in this respect. Since we don't invoke congestion control modules until after the SYN handshake this is not a problem. -- Stephen Hemminger [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/9]: tcp-2.6 patchset
On Tue, 29 May 2007, Stephen Hemminger wrote: On Mon, 28 May 2007 13:27:03 +0300 (EEST) Ilpo Järvinen [EMAIL PROTECTED] wrote: On Sun, 27 May 2007, Ilpo Järvinen wrote: [PATCH] [TCP]: Fix GSO ignorance of pkts_acked arg (cong.cntrl modules) Yes, thanks for fixing this. Wonder how it affects measurements. ...It's a bit hard to tell since dynamics change so dramatically in 0 check cases, the resulting behavior in too small value cases may be easier to predict... It's possible that this could explain some anomalities you've been seeing in your measurements. It is not very clear how SYN segments should be handled, so I choose to follow the previous implementation in this respect. Since we don't invoke congestion control modules until after the SYN handshake this is not a problem. Just curious, do you mean that cc modules cannot measure, e.g., initial RTT through this mechanism (though they could do that in init() cb then I suppose)... Or do you mean that they are called already for the ACK that completes the SYN handshake and therefore its skb is being cleaned from the queue right now (this is the case I above refer to)? In the first case the decrementer code is NOP. If the latter, then it is just interface specification question, i.e., if SYNs are treated as zero or one in num_acked for the pkts_acked callback (I have no opinion on this but was just trying to make sure cc modules get what they expect :-)). -- i.
Re: [PATCH 0/9]: tcp-2.6 patchset
On Tue, 29 May 2007 23:07:00 +0300 (EEST) Ilpo Järvinen [EMAIL PROTECTED] wrote: On Tue, 29 May 2007, Stephen Hemminger wrote: On Mon, 28 May 2007 13:27:03 +0300 (EEST) Ilpo Järvinen [EMAIL PROTECTED] wrote: On Sun, 27 May 2007, Ilpo Järvinen wrote: [PATCH] [TCP]: Fix GSO ignorance of pkts_acked arg (cong.cntrl modules) Yes, thanks for fixing this. Wonder how it affects measurements. ...It's a bit hard to tell since dynamics change so dramatically in 0 check cases, the resulting behavior in too small value cases may be easier to predict... It's possible that this could explain some anomalities you've been seeing in your measurements. It is not very clear how SYN segments should be handled, so I choose to follow the previous implementation in this respect. Since we don't invoke congestion control modules until after the SYN handshake this is not a problem. Just curious, do you mean that cc modules cannot measure, e.g., initial RTT through this mechanism (though they could do that in init() cb then I suppose)... Or do you mean that they are called already for the ACK that completes the SYN handshake and therefore its skb is being cleaned from the queue right now (this is the case I above refer to)? In the first case the decrementer code is NOP. If the latter, then it is just interface specification question, i.e., if SYNs are treated as zero or one in num_acked for the pkts_acked callback (I have no opinion on this but was just trying to make sure cc modules get what they expect :-)). We don't switch a socket out of Reno until after the initial handshake. It evolved that way, and has a couple of advantages: * uncomplete connections don't up module refcount so it is easier to debug * in future, we can choose congestion control to use based on route metric. As an interface, it makes sense to keep the API with the SYN counting as a packet. -- Stephen Hemminger [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/9]: tcp-2.6 patchset
On Tue, 29 May 2007, Stephen Hemminger wrote: On Tue, 29 May 2007 23:07:00 +0300 (EEST) Ilpo Järvinen [EMAIL PROTECTED] wrote: On Tue, 29 May 2007, Stephen Hemminger wrote: Since we don't invoke congestion control modules until after the SYN handshake this is not a problem. Just curious, do you mean that cc modules cannot measure, e.g., initial RTT through this mechanism (though they could do that in init() cb then I suppose)... Or do you mean that they are called already for the ACK that completes the SYN handshake and therefore its skb is being cleaned from the queue right now (this is the case I above refer to)? In the first case the decrementer code is NOP. If the latter, then it is just interface specification question, i.e., if SYNs are treated as zero or one in num_acked for the pkts_acked callback (I have no opinion on this but was just trying to make sure cc modules get what they expect :-)). We don't switch a socket out of Reno until after the initial handshake. ...It's still not very clear to me what exactly your after means (both here and in your earlier description), i.e., whether clean_rtx_queue call that cleans SYN skb from the queue happens before or after the switch out of reno or not... If I understand the code correctly, this specific clean_rtx_queue call happens after your after but I could be misunderstanding the current 3-way handshake code. :-) As an interface, it makes sense to keep the API with the SYN counting as a packet. Ok, this one answers the remaining question concerning the patch, here is it without the decrementer for SYN case (which IMHO wasn't very beautiful looking anyway :-)). Dave, please consider this to net-2.6. It could be a stable candidate as well, haven't tested yet if it applies cleanly to stable: [PATCH] [TCP]: Fix GSO ignorance of pkts_acked arg (cong.cntrl modules) The code used to ignore GSO completely, passing either way too small or zero pkts_acked when GSO skb or part of it got ACKed. In addition, there is no need to calculate the value in the loop but simple arithmetics after the loop is sufficient. Pkts_acked callback of congestion control modules include SYN as a packet for now on. Signed-off-by: Ilpo Järvinen [EMAIL PROTECTED] --- net/ipv4/tcp_input.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 38cb25b..74683d8 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -2407,8 +2407,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, __s32 *seq_rtt_p) struct sk_buff *skb; __u32 now = tcp_time_stamp; int acked = 0; + int prior_packets = tp-packets_out; __s32 seq_rtt = -1; - u32 pkts_acked = 0; ktime_t last_ackt = ktime_set(0,0); while ((skb = tcp_write_queue_head(sk)) @@ -2437,7 +2437,6 @@ static int tcp_clean_rtx_queue(struct sock *sk, __s32 *seq_rtt_p) */ if (!(scb-flags TCPCB_FLAG_SYN)) { acked |= FLAG_DATA_ACKED; - ++pkts_acked; } else { acked |= FLAG_SYN_ACKED; tp-retrans_stamp = 0; @@ -2481,6 +2480,7 @@ static int tcp_clean_rtx_queue(struct sock *sk, __s32 *seq_rtt_p) } if (ackedFLAG_ACKED) { + u32 pkts_acked = prior_packets - tp-packets_out; const struct tcp_congestion_ops *ca_ops = inet_csk(sk)-icsk_ca_ops; -- 1.5.0.6
Re: [PATCH 0/9]: tcp-2.6 patchset
On Sun, 27 May 2007, Ilpo Järvinen wrote: On Sun, 27 May 2007, Baruch Even wrote: * Ilpo J?rvinen [EMAIL PROTECTED] [070527 14:16]: Thus, my original question basically culminates in this: should cc modules be passed number of packets acked or number of skbs acked? ...The latter makes no sense to me unless the value is intented to be interpreted as number of timestamps acked or something along those lines. ...I briefly tried looking up for documentation for cc module interface but didn't find anything useful about this, and thus asked in the first place... At least the htcp module that I wrote assumes that the number is actual number of tcp packets so GSO should be considered. Thanks for the info! It is what I suspected... ...I'll write a patch for it tomorrow against net-2.6... Dave, beware that it will partially overlap with the changes made in the patch 8, so you might choose to put the patch 8 on hold until this issue is first resolved... The consequences of this bug are not too large but it does make all congestion control algorithms a lot less aggressive. On my machines GSO is disabled by default (e1000 at 100mbps Tigon3 @ 1Gbps). Agreed, that's my impression too. However, some algorithms do things like 0 checks for it, so it might disturb their dynamics even more than in the too small value cases... Hmm, there seems to be another case that I'm not too sure of... Please check the alternative I choose for SYN handling below... ...hmm... While exploring this SYN thingie, I noticed that commit 164891aadf1721fca4dce473bb0e0998181537c6 drops !FLAG_RETRANS_DATA_ACKED check from rtt_sample call (when combining it with pkts_acked call). I hope that's intentional?!? ...the commit message didn't say anything about it nor was anything in cc modules changed to accomodate that. [PATCH] [TCP]: Fix GSO ignorance of pkts_acked arg (cong.cntrl modules) The code used to ignore GSO completely, passing either way too small or zero pkts_acked when GSO skb or part of it got ACKed. In addition, there is no need to calculate the value in the loop but simple arithmetics after the loop is sufficient. It is not very clear how SYN segments should be handled, so I choose to follow the previous implementation in this respect. Signed-off-by: Ilpo Järvinen [EMAIL PROTECTED] --- net/ipv4/tcp_input.c |7 +-- 1 files changed, 5 insertions(+), 2 deletions(-) diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 38cb25b..7dfaabb 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -2407,8 +2407,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, __s32 *seq_rtt_p) struct sk_buff *skb; __u32 now = tcp_time_stamp; int acked = 0; + int prior_packets = tp-packets_out; __s32 seq_rtt = -1; - u32 pkts_acked = 0; ktime_t last_ackt = ktime_set(0,0); while ((skb = tcp_write_queue_head(sk)) @@ -2437,7 +2437,6 @@ static int tcp_clean_rtx_queue(struct sock *sk, __s32 *seq_rtt_p) */ if (!(scb-flags TCPCB_FLAG_SYN)) { acked |= FLAG_DATA_ACKED; - ++pkts_acked; } else { acked |= FLAG_SYN_ACKED; tp-retrans_stamp = 0; @@ -2481,9 +2480,13 @@ static int tcp_clean_rtx_queue(struct sock *sk, __s32 *seq_rtt_p) } if (ackedFLAG_ACKED) { + u32 pkts_acked = prior_packets - tp-packets_out; const struct tcp_congestion_ops *ca_ops = inet_csk(sk)-icsk_ca_ops; + if (acked FLAG_SYN_ACKED) + pkts_acked--; + tcp_ack_update_rtt(sk, acked, seq_rtt); tcp_ack_packets_out(sk); -- 1.5.0.6
Re: [PATCH 0/9]: tcp-2.6 patchset
On Sat, 26 May 2007, David Miller wrote: From: Ilpo_Järvinen [EMAIL PROTECTED] Date: Sat, 26 May 2007 11:35:53 +0300 Dave, you could consider applying other than the last one if they seem ok to you too (you'll need to rebase your tcp-2.6 in that case first to apply cleanly those that touch tcp_sync_left_out :-)). Absolutely, I'll do the quick rebase of tcp-2.6 and review + applying of your patches (except the last one) over the weekend. While you're in the right context (reviewing patch 8), you could also look if tcp_clean_rtx_queue does a right thing when passing a strange pkts_acked to congestion control modules. I wonder if it really should ignore GSO the way it does currently... I read some cc module code and some was adding it to snd_cwnd_cnt, etc. which is a strong indication that GSO should be considered... Also if the head is GSO skb that is not completely acked, the loop breaks with pkts_acked being zero, I doubt that can be correct... ...added Stephen H. Herbert X. to this thread since also they might have some idea about it. ...Anyway, if it's wrong, the fix should go to net-2.6 rather than to tcp-2.6... -- i.
Re: [PATCH 0/9]: tcp-2.6 patchset
From: Ilpo_Järvinen [EMAIL PROTECTED] Date: Sun, 27 May 2007 10:58:27 +0300 (EEST) On Sat, 26 May 2007, David Miller wrote: From: Ilpo_Järvinen [EMAIL PROTECTED] Date: Sat, 26 May 2007 11:35:53 +0300 Dave, you could consider applying other than the last one if they seem ok to you too (you'll need to rebase your tcp-2.6 in that case first to apply cleanly those that touch tcp_sync_left_out :-)). Absolutely, I'll do the quick rebase of tcp-2.6 and review + applying of your patches (except the last one) over the weekend. While you're in the right context (reviewing patch 8), you could also look if tcp_clean_rtx_queue does a right thing when passing a strange pkts_acked to congestion control modules. I wonder if it really should ignore GSO the way it does currently... I read some cc module code and some was adding it to snd_cwnd_cnt, etc. which is a strong indication that GSO should be considered... Also if the head is GSO skb that is not completely acked, the loop breaks with pkts_acked being zero, I doubt that can be correct... I rebased tcp-2.6, applied your patches 1-7 (which I'll comment a bit on tomorrow) and will likely take a look at these issues wrt. patch 8 tomorrow. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/9]: tcp-2.6 patchset
On Sun, 27 May 2007, David Miller wrote: From: Ilpo_Järvinen [EMAIL PROTECTED] Date: Sun, 27 May 2007 10:58:27 +0300 (EEST) While you're in the right context (reviewing patch 8), you could also look if tcp_clean_rtx_queue does a right thing when passing a strange pkts_acked to congestion control modules. I wonder if it really should ignore GSO the way it does currently... I read some cc module code and some was adding it to snd_cwnd_cnt, etc. which is a strong indication that GSO should be considered... Also if the head is GSO skb that is not completely acked, the loop breaks with pkts_acked being zero, I doubt that can be correct... [...snip...] will likely take a look at these issues wrt. patch 8 tomorrow. ...I hope I got myself understood correctly (with my non-native English :-))... ...My intention was to say that there might be a bug in tcp_clean_rtx_queue too, which on logical level is unrelated to the patch 8 itself and to the change it makes (but caught my attention while I was doing that patch as it resides in the same function). ...There could have been confusion since I'm too changing things in the same function and because my article usage is usually far from perfect... :-) Also the fastretrans_alert arg rename to the same name (i.e., pkts_acked) in patch 8 is unrelated to the potential bug. Thus, my original question basically culminates in this: should cc modules be passed number of packets acked or number of skbs acked? ...The latter makes no sense to me unless the value is intented to be interpreted as number of timestamps acked or something along those lines. ...I briefly tried looking up for documentation for cc module interface but didn't find anything useful about this, and thus asked in the first place... -- i.
Re: [PATCH 0/9]: tcp-2.6 patchset
* Ilpo J?rvinen [EMAIL PROTECTED] [070527 14:16]: On Sun, 27 May 2007, David Miller wrote: From: Ilpo_J?rvinen [EMAIL PROTECTED] Date: Sun, 27 May 2007 10:58:27 +0300 (EEST) While you're in the right context (reviewing patch 8), you could also look if tcp_clean_rtx_queue does a right thing when passing a strange pkts_acked to congestion control modules. I wonder if it really should ignore GSO the way it does currently... I read some cc module code and some was adding it to snd_cwnd_cnt, etc. which is a strong indication that GSO should be considered... Also if the head is GSO skb that is not completely acked, the loop breaks with pkts_acked being zero, I doubt that can be correct... [...snip...] will likely take a look at these issues wrt. patch 8 tomorrow. [...snip...] Thus, my original question basically culminates in this: should cc modules be passed number of packets acked or number of skbs acked? ...The latter makes no sense to me unless the value is intented to be interpreted as number of timestamps acked or something along those lines. ...I briefly tried looking up for documentation for cc module interface but didn't find anything useful about this, and thus asked in the first place... At least the htcp module that I wrote assumes that the number is actual number of tcp packets so GSO should be considered. The consequences of this bug are not too large but it does make all congestion control algorithms a lot less aggressive. On my machines GSO is disabled by default (e1000 at 100mbps Tigon3 @ 1Gbps). Cheers, Baruch - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/9]: tcp-2.6 patchset
On Sun, 27 May 2007, Baruch Even wrote: * Ilpo J?rvinen [EMAIL PROTECTED] [070527 14:16]: Thus, my original question basically culminates in this: should cc modules be passed number of packets acked or number of skbs acked? ...The latter makes no sense to me unless the value is intented to be interpreted as number of timestamps acked or something along those lines. ...I briefly tried looking up for documentation for cc module interface but didn't find anything useful about this, and thus asked in the first place... At least the htcp module that I wrote assumes that the number is actual number of tcp packets so GSO should be considered. Thanks for the info! It is what I suspected... ...I'll write a patch for it tomorrow against net-2.6... Dave, beware that it will partially overlap with the changes made in the patch 8, so you might choose to put the patch 8 on hold until this issue is first resolved... The consequences of this bug are not too large but it does make all congestion control algorithms a lot less aggressive. On my machines GSO is disabled by default (e1000 at 100mbps Tigon3 @ 1Gbps). Agreed, that's my impression too. However, some algorithms do things like 0 checks for it, so it might disturb their dynamics even more than in the too small value cases... -- i. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/9]: tcp-2.6 patchset
From: Ilpo_Järvinen [EMAIL PROTECTED] Date: Sat, 26 May 2007 11:35:53 +0300 Dave, you could consider applying other than the last one if they seem ok to you too (you'll need to rebase your tcp-2.6 in that case first to apply cleanly those that touch tcp_sync_left_out :-)). Absolutely, I'll do the quick rebase of tcp-2.6 and review + applying of your patches (except the last one) over the weekend. Thanks! - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html