Re: [PATCH 8/9] [TCP]: Reduce sacked_out with reno when purging write_queue
On Mon, 2 Jul 2007, David Miller wrote:

> From: "Ilpo Järvinen" <[EMAIL PROTECTED]>
> Date: Sat, 26 May 2007 11:36:01 +0300
>
> > Previously TCP had a transitional state during which reno
> > counted segments that are already below the current window into
> > sacked_out, which is now prevented. Re-try now unconditional
> > S+L catching (I wonder if we could get that BUG_ON place
> > pinpointed more exactly in the oops, because inlining now makes
> > it lose its original context as they always seem to be in
> > tcp_ack; would a #define help?).
> >
> > Beware, this change is not a trivial one and might have some
> > unexpected side-effects under obscure conditions, since state
> > tracking is to happen much later on and the reno sack counting
> > was highly dependent on the current state.
> >
> > This approach conservatively calls just remove_sack and leaves
> > reset_sack() calls alone. The best solution to the whole problem
> > would be to first calculate the new sacked_out fully (this patch
> > does not move reno_sack_reset calls from their original sites
> > and thus does not implement this). However, that would require a
> > very invasive change to fastretrans_alert (perhaps even slicing
> > it into two halves). Alternatively, all callers of
> > tcp_packets_in_flight (i.e., users that depend on sacked_out)
> > would have to be postponed until the new sacked_out has been
> > calculated, but that isn't any simpler an alternative.
> >
> > Signed-off-by: Ilpo Järvinen <[EMAIL PROTECTED]>
>
> So basically the idea behind this patch is to do the update of
> the fake RENO sacks in clean_rtx_queue instead of fixing it up
> at the very last moment right before we invoke
> tcp_try_undo_partial().

Yeah, it would change the sacked_out value that things see before the
tcp_try_undo_partial point (if there were any such users), but IMHO
that would qualify as a fix for those cases too rather than as a bug.
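[Editor's sketch: for readers following along, the "fake RENO sacks" bookkeeping being discussed can be modelled in user space roughly as below. The struct and function names are made up for illustration; they only mirror the semantics of tcp_add_reno_sack()/tcp_remove_reno_sacks(), not the kernel code itself.]

```c
#include <assert.h>

/* Illustrative model of NewReno's fake SACK accounting.  Without real
 * SACK blocks, a duplicate ACK implicitly "SACKs" one segment, so the
 * stack counts dupacks in sacked_out.  When a cumulative ACK removes
 * segments from the retransmit queue, the fake SACKs must shrink
 * accordingly, preserving the S+L invariant that the patch's BUG_ON
 * enforces: sacked_out + lost_out <= packets_out. */
struct reno_state {
	unsigned int packets_out;	/* segments still in flight */
	unsigned int sacked_out;	/* dupacks counted as fake SACKs */
};

/* A duplicate ACK arrived: one more segment is presumed delivered. */
static void reno_add_sack(struct reno_state *tp)
{
	tp->sacked_out++;
}

/* A cumulative ACK covered `acked` segments.  One of them satisfied
 * the ACK itself; the rest consume outstanding fake SACKs. */
static void reno_remove_sacks(struct reno_state *tp, unsigned int acked)
{
	if (acked > 0) {
		if (acked - 1 >= tp->sacked_out)
			tp->sacked_out = 0;
		else
			tp->sacked_out -= acked - 1;
	}
	tp->packets_out -= acked;
	assert(tp->sacked_out <= tp->packets_out);	/* S+L holds */
}
```

Doing this removal where packets_out itself is updated (clean_rtx_queue) is what keeps sacked_out from transiently counting segments that are already below the window.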
I checked that F-RTO (with reno) cannot ever access a stale sacked_out
because of other constraints based on the ACK's snd_una, and that's
pretty much all that happens in between. TCP might still do another
update later on through tcp_reset_reno_sack, but that's always going
to be a more aggressive update (or at least an equal one).

> I like this patch and I can't find any holes in the idea.
>
> But some things have changed in the meantime and this patch
> (and probably 9/9 too) don't apply cleanly. Could you respin
> these against current tcp-2.6 so I can apply them?

I knew that :-), it was postponed due to other, more important issues,
and because the trees had picked up some dependencies at that point I
decided it was better to wait a while until tcp-2.6 gets all the stuff
that's to be in net-2.6... But now that things seem to have settled,
I'll provide the updated one later on.

The 9/9 should be dropped; I've noticed a CPU processing vulnerability
in it which I previously didn't see there (I describe it in my other
post).

--
 i.
Re: [PATCH 8/9] [TCP]: Reduce sacked_out with reno when purging write_queue
From: "Ilpo Järvinen" <[EMAIL PROTECTED]>
Date: Sat, 26 May 2007 11:36:01 +0300

> Previously TCP had a transitional state during which reno
> counted segments that are already below the current window into
> sacked_out, which is now prevented. Re-try now unconditional
> S+L catching (I wonder if we could get that BUG_ON place
> pinpointed more exactly in the oops, because inlining now makes
> it lose its original context as they always seem to be in
> tcp_ack; would a #define help?).
>
> Beware, this change is not a trivial one and might have some
> unexpected side-effects under obscure conditions, since state
> tracking is to happen much later on and the reno sack counting
> was highly dependent on the current state.
>
> This approach conservatively calls just remove_sack and leaves
> reset_sack() calls alone. The best solution to the whole problem
> would be to first calculate the new sacked_out fully (this patch
> does not move reno_sack_reset calls from their original sites
> and thus does not implement this). However, that would require a
> very invasive change to fastretrans_alert (perhaps even slicing
> it into two halves). Alternatively, all callers of
> tcp_packets_in_flight (i.e., users that depend on sacked_out)
> would have to be postponed until the new sacked_out has been
> calculated, but that isn't any simpler an alternative.
>
> Signed-off-by: Ilpo Järvinen <[EMAIL PROTECTED]>

So basically the idea behind this patch is to do the update of
the fake RENO sacks in clean_rtx_queue instead of fixing it up
at the very last moment right before we invoke
tcp_try_undo_partial().

I like this patch and I can't find any holes in the idea.

But some things have changed in the meantime and this patch
(and probably 9/9 too) don't apply cleanly. Could you respin
these against current tcp-2.6 so I can apply them?

Thanks.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
[PATCH 8/9] [TCP]: Reduce sacked_out with reno when purging write_queue
From: Ilpo Järvinen <[EMAIL PROTECTED]>

Previously TCP had a transitional state during which reno
counted segments that are already below the current window into
sacked_out, which is now prevented. Re-try now unconditional
S+L catching (I wonder if we could get that BUG_ON place
pinpointed more exactly in the oops, because inlining now makes
it lose its original context as they always seem to be in
tcp_ack; would a #define help?).

Beware, this change is not a trivial one and might have some
unexpected side-effects under obscure conditions, since state
tracking is to happen much later on and the reno sack counting
was highly dependent on the current state.

This approach conservatively calls just remove_sack and leaves
reset_sack() calls alone. The best solution to the whole problem
would be to first calculate the new sacked_out fully (this patch
does not move reno_sack_reset calls from their original sites
and thus does not implement this). However, that would require a
very invasive change to fastretrans_alert (perhaps even slicing
it into two halves). Alternatively, all callers of
tcp_packets_in_flight (i.e., users that depend on sacked_out)
would have to be postponed until the new sacked_out has been
calculated, but that isn't any simpler an alternative.
Signed-off-by: Ilpo Järvinen <[EMAIL PROTECTED]>
---
 include/net/tcp.h    |    3 +--
 net/ipv4/tcp_input.c |   19 +++++++++++++++--------
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 6f88d13..5255d51 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -747,8 +747,7 @@ static inline __u32 tcp_current_ssthresh(const struct sock *sk)
 
 static inline void tcp_sync_left_out(struct tcp_sock *tp)
 {
-	BUG_ON(tp->rx_opt.sack_ok &&
-	       (tp->sacked_out + tp->lost_out > tp->packets_out));
+	BUG_ON(tp->sacked_out + tp->lost_out > tp->packets_out);
 }
 
 extern void tcp_enter_cwr(struct sock *sk, const int set_ssthresh);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 8c1e38a..9c12c90 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2295,7 +2295,7 @@ static void tcp_mtup_probe_success(struct sock *sk, struct sk_buff *skb)
  */
 static void
 tcp_fastretrans_alert(struct sock *sk, u32 prior_snd_una,
-		      int prior_packets, int flag, u32 mark_lost_entry_seq)
+		      int pkts_acked, int flag, u32 mark_lost_entry_seq)
 {
 	struct inet_connection_sock *icsk = inet_csk(sk);
 	struct tcp_sock *tp = tcp_sk(sk);
@@ -2380,12 +2380,8 @@ tcp_fastretrans_alert(struct sock *sk, u32 prior_snd_una,
 		if (prior_snd_una == tp->snd_una) {
 			if (IsReno(tp) && is_dupack)
 				tcp_add_reno_sack(sk);
-		} else {
-			int acked = prior_packets - tp->packets_out;
-			if (IsReno(tp))
-				tcp_remove_reno_sacks(sk, acked);
-			is_dupack = tcp_try_undo_partial(sk, acked);
-		}
+		} else
+			is_dupack = tcp_try_undo_partial(sk, pkts_acked);
 		break;
 	case TCP_CA_Loss:
 		if (flag&FLAG_DATA_ACKED)
@@ -2602,6 +2598,7 @@ static int tcp_clean_rtx_queue(struct sock *sk, __s32 *seq_rtt_p)
 	struct sk_buff *skb;
 	__u32 now = tcp_time_stamp;
 	int acked = 0;
+	int prior_packets = tp->packets_out;
 	__s32 seq_rtt = -1;
 	u32 pkts_acked = 0;
 	ktime_t last_ackt = ktime_set(0,0);
@@ -2682,6 +2679,11 @@ static int tcp_clean_rtx_queue(struct sock *sk, __s32 *seq_rtt_p)
 		tcp_ack_update_rtt(sk, acked, seq_rtt);
 		tcp_ack_packets_out(sk);
 
+		if (IsReno(tp)) {
+			int acked = prior_packets - tp->packets_out;
+			tcp_remove_reno_sacks(sk, acked);
+		}
+
 		if (ca_ops->pkts_acked)
 			ca_ops->pkts_acked(sk, pkts_acked, last_ackt);
 	}
@@ -3017,7 +3019,8 @@ static int tcp_ack(struct sock *sk, struct sk_buff *skb, int flag)
 		if ((flag & FLAG_DATA_ACKED) && !frto_cwnd &&
 		    tcp_may_raise_cwnd(sk, flag))
 			tcp_cong_avoid(sk, ack, seq_rtt, prior_in_flight, 0);
-		tcp_fastretrans_alert(sk, prior_snd_una, prior_packets, flag,
+		tcp_fastretrans_alert(sk, prior_snd_una,
+				      prior_packets - tp->packets_out, flag,
 				      mark_lost_entry_seq);
 	} else {
 		if ((flag & FLAG_DATA_ACKED) && !frto_cwnd)
-- 
1.5.0.6
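[Editor's sketch: the control-flow change at the heart of the patch — snapshot packets_out before cleaning the retransmit queue, then pass the difference down instead of recomputing it inside the fast-retransmit logic — can be condensed as below. All names are illustrative stand-ins, not the kernel's.]

```c
#include <assert.h>

struct sk_sketch {
	unsigned int packets_out;
	unsigned int sacked_out;
};

/* Stand-in for tcp_clean_rtx_queue(): the cumulative ACK freed
 * `newly_acked` segments, and the Reno fake-SACK fix-up now happens
 * right here, before any later user of tcp_packets_in_flight() could
 * observe a stale sacked_out. */
static void clean_rtx_queue_sketch(struct sk_sketch *sk,
				   unsigned int newly_acked)
{
	sk->packets_out -= newly_acked;
	if (newly_acked > 0) {
		unsigned int eaten = newly_acked - 1;
		sk->sacked_out = eaten >= sk->sacked_out ?
				 0 : sk->sacked_out - eaten;
	}
}

/* Stand-in for the tcp_ack() path: compute pkts_acked from a snapshot
 * and hand that to the fast-retransmit logic, which no longer adjusts
 * sacked_out itself. */
static unsigned int ack_path_sketch(struct sk_sketch *sk,
				    unsigned int newly_acked)
{
	unsigned int prior_packets = sk->packets_out;

	clean_rtx_queue_sketch(sk, newly_acked);
	return prior_packets - sk->packets_out;	/* pkts_acked */
}
```

In the real patch this difference is what tcp_ack() passes to tcp_fastretrans_alert() in place of the old prior_packets argument.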