Re: [PATCH 8/9] [TCP]: Reduce sacked_out with reno when purging write_queue

2007-07-03 Thread Ilpo Järvinen
On Mon, 2 Jul 2007, David Miller wrote:

> From: "Ilpo_Järvinen" <[EMAIL PROTECTED]>
> Date: Sat, 26 May 2007 11:36:01 +0300
> 
> > Previously TCP had a transitional state during which reno
> > counted segments that were already below the current window
> > into sacked_out; this is now prevented. Re-try the
> > now-unconditional S+L catching (I wonder if we could get that
> > BUG_ON pinpointed more exactly in the oops, because inlining
> > now makes it lose its original context, as it always seems to
> > end up in tcp_ack; would a #define help?).
> > 
> > Beware, this change is not a trivial one and might have some
> > unexpected side-effects under obscure conditions, since the
> > state tracking now happens much later on and the reno sack
> > counting depended heavily on the current state.
> > 
> > This approach conservatively calls just remove_sack and leaves
> > the reset_sack() calls alone. The best solution to the whole
> > problem would be to first calculate the new sacked_out fully
> > (this patch does not move the reno_sack_reset calls from their
> > original sites and thus does not implement this). However, that
> > would require a very invasive change to fastretrans_alert
> > (perhaps even splitting it into two halves). Alternatively, all
> > callers of tcp_packets_in_flight (i.e., users that depend on
> > sacked_out) could be postponed until the new sacked_out has
> > been calculated, but that isn't any simpler an alternative.
> > 
> > Signed-off-by: Ilpo Järvinen <[EMAIL PROTECTED]>
> 
> So basically the idea behind this patch is to do the update of
> the fake RENO sacks in clean_rtx_queue instead of fixing them up
> at the very last moment, right before we invoke tcp_try_to_undo_partial().

Yeap, it would change the sacked_out that things see before the 
undo_partial point (if there were any), but IMHO that would qualify 
as a fix for those cases too rather than a bug. I checked that F-RTO 
(with reno) can never access a stale sacked_out because of other 
constraints based on the ACK's snd_una, and that's pretty much all 
that is done in between.

TCP might still do another update later on using tcp_reset_reno_sack, 
but that's always going to be a more aggressive update (or at least an 
equal one).
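
(If it helps to see why: a minimal standalone toy sketch below, with an 
illustrative struct and simplified helpers rather than the real kernel 
functions. The remove-style update only subtracts the newly acked count, 
while the reset-style update zeroes the fake SACKs outright, so the reset 
can never be the weaker of the two.)

#include <stdio.h>

/* Toy model only; the struct and helpers are illustrative, not the
 * kernel's tcp_remove_reno_sacks()/tcp_reset_reno_sack(). */
struct reno_state {
	unsigned int sacked_out;	/* fake SACKs counted for reno */
};

/* remove-style update: subtract only what was just acked */
static void remove_reno_sacks(struct reno_state *tp, unsigned int acked)
{
	if (acked >= tp->sacked_out)
		tp->sacked_out = 0;
	else
		tp->sacked_out -= acked;
}

/* reset-style update: drop every fake SACK at once */
static void reset_reno_sack(struct reno_state *tp)
{
	tp->sacked_out = 0;
}

int main(void)
{
	struct reno_state a = { .sacked_out = 5 };
	struct reno_state b = { .sacked_out = 5 };

	remove_reno_sacks(&a, 2);	/* 3 fake SACKs remain */
	reset_reno_sack(&b);		/* 0 remain, never more than above */
	printf("remove: %u, reset: %u\n", a.sacked_out, b.sacked_out);
	return 0;
}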

> I like this patch and I can't find any holes in the idea.
> 
> But some things have changed in the meantime and this patch
> (and probably 9/9 too) don't apply cleanly.  Could you respin
> these against current tcp-2.6 so I can apply them?

I knew that :-). It was postponed due to other, more important issues, 
and because the trees had some dependencies at that point I decided it 
was better to wait a while until tcp-2.6 gets all the stuff that's headed 
for net-2.6... But now that things seem to have settled, I'll provide the 
updated one later on. The 9/9 should be dropped; I've noticed a CPU 
processing vulnerability in it which I previously didn't see (I describe 
it in my other post).


-- 
 i.

Re: [PATCH 8/9] [TCP]: Reduce sacked_out with reno when purging write_queue

2007-07-02 Thread David Miller
From: "Ilpo_Järvinen" <[EMAIL PROTECTED]>
Date: Sat, 26 May 2007 11:36:01 +0300

> Previously TCP had a transitional state during which reno
> counted segments that were already below the current window
> into sacked_out; this is now prevented. Re-try the
> now-unconditional S+L catching (I wonder if we could get that
> BUG_ON pinpointed more exactly in the oops, because inlining
> now makes it lose its original context, as it always seems to
> end up in tcp_ack; would a #define help?).
> 
> Beware, this change is not a trivial one and might have some
> unexpected side-effects under obscure conditions, since the
> state tracking now happens much later on and the reno sack
> counting depended heavily on the current state.
> 
> This approach conservatively calls just remove_sack and leaves
> the reset_sack() calls alone. The best solution to the whole
> problem would be to first calculate the new sacked_out fully
> (this patch does not move the reno_sack_reset calls from their
> original sites and thus does not implement this). However, that
> would require a very invasive change to fastretrans_alert
> (perhaps even splitting it into two halves). Alternatively, all
> callers of tcp_packets_in_flight (i.e., users that depend on
> sacked_out) could be postponed until the new sacked_out has
> been calculated, but that isn't any simpler an alternative.
> 
> Signed-off-by: Ilpo Järvinen <[EMAIL PROTECTED]>

So basically the idea behind this patch is to do the update of
the fake RENO sacks in clean_rtx_queue instead of fixing them up
at the very last moment, right before we invoke tcp_try_to_undo_partial().
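
(As a toy illustration of why doing it earlier matters: the struct, the 
values and the simplified decrement below are made up for the example, 
not the kernel code. Anything computing in-flight packets between the 
queue purge and the old fixup point would otherwise see a sacked_out 
that still counts the purged segments.)

#include <stdio.h>

/* Toy model; field names echo the kernel ones but nothing here is
 * kernel code. */
struct toy_tp {
	unsigned int packets_out;
	unsigned int sacked_out;	/* fake reno SACKs */
	unsigned int lost_out;
};

/* rough analogue of the tcp_packets_in_flight() idea */
static unsigned int in_flight(const struct toy_tp *tp)
{
	return tp->packets_out - tp->sacked_out - tp->lost_out;
}

int main(void)
{
	struct toy_tp tp = { .packets_out = 10, .sacked_out = 3, .lost_out = 1 };
	unsigned int prior_packets = tp.packets_out;
	unsigned int acked;

	tp.packets_out -= 4;	/* clean_rtx_queue purges 4 acked segments */
	printf("before removal: %u in flight\n", in_flight(&tp));

	/* with the patch, the fake SACKs are trimmed right here */
	acked = prior_packets - tp.packets_out;
	tp.sacked_out = acked >= tp.sacked_out ? 0 : tp.sacked_out - acked;
	printf("after removal : %u in flight\n", in_flight(&tp));
	return 0;
}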

I like this patch and I can't find any holes in the idea.

But some things have changed in the meantime and this patch
(and probably 9/9 too) don't apply cleanly.  Could you respin
these against current tcp-2.6 so I can apply them?

Thanks.


[PATCH 8/9] [TCP]: Reduce sacked_out with reno when purging write_queue

2007-05-26 Thread Ilpo Järvinen
From: Ilpo Järvinen <[EMAIL PROTECTED]>

Previously TCP had a transitional state during which reno
counted segments that were already below the current window
into sacked_out; this is now prevented. Re-try the
now-unconditional S+L catching (I wonder if we could get that
BUG_ON pinpointed more exactly in the oops, because inlining
now makes it lose its original context, as it always seems to
end up in tcp_ack; would a #define help?).

Beware, this change is not a trivial one and might have some
unexpected side-effects under obscure conditions, since the
state tracking now happens much later on and the reno sack
counting depended heavily on the current state.

This approach conservatively calls just remove_sack and leaves
the reset_sack() calls alone. The best solution to the whole
problem would be to first calculate the new sacked_out fully
(this patch does not move the reno_sack_reset calls from their
original sites and thus does not implement this). However, that
would require a very invasive change to fastretrans_alert
(perhaps even splitting it into two halves). Alternatively, all
callers of tcp_packets_in_flight (i.e., users that depend on
sacked_out) could be postponed until the new sacked_out has
been calculated, but that isn't any simpler an alternative.

Signed-off-by: Ilpo Järvinen <[EMAIL PROTECTED]>
---
 include/net/tcp.h    |    3 +--
 net/ipv4/tcp_input.c |   19 +++
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 6f88d13..5255d51 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -747,8 +747,7 @@ static inline __u32 tcp_current_ssthresh(const struct sock *sk)
 
 static inline void tcp_sync_left_out(struct tcp_sock *tp)
 {
-   BUG_ON(tp->rx_opt.sack_ok &&
-  (tp->sacked_out + tp->lost_out > tp->packets_out));
+   BUG_ON(tp->sacked_out + tp->lost_out > tp->packets_out);
 }
 
 extern void tcp_enter_cwr(struct sock *sk, const int set_ssthresh);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 8c1e38a..9c12c90 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2295,7 +2295,7 @@ static void tcp_mtup_probe_success(struct sock *sk, struct sk_buff *skb)
  */
 static void
 tcp_fastretrans_alert(struct sock *sk, u32 prior_snd_una,
- int prior_packets, int flag, u32 mark_lost_entry_seq)
+ int pkts_acked, int flag, u32 mark_lost_entry_seq)
 {
struct inet_connection_sock *icsk = inet_csk(sk);
struct tcp_sock *tp = tcp_sk(sk);
@@ -2380,12 +2380,8 @@ tcp_fastretrans_alert(struct sock *sk, u32 prior_snd_una,
if (prior_snd_una == tp->snd_una) {
if (IsReno(tp) && is_dupack)
tcp_add_reno_sack(sk);
-   } else {
-   int acked = prior_packets - tp->packets_out;
-   if (IsReno(tp))
-   tcp_remove_reno_sacks(sk, acked);
-   is_dupack = tcp_try_undo_partial(sk, acked);
-   }
+   } else
+   is_dupack = tcp_try_undo_partial(sk, pkts_acked);
break;
case TCP_CA_Loss:
if (flag&FLAG_DATA_ACKED)
@@ -2602,6 +2598,7 @@ static int tcp_clean_rtx_queue(struct sock *sk, __s32 *seq_rtt_p)
struct sk_buff *skb;
__u32 now = tcp_time_stamp;
int acked = 0;
+   int prior_packets = tp->packets_out;
__s32 seq_rtt = -1;
u32 pkts_acked = 0;
ktime_t last_ackt = ktime_set(0,0);
@@ -2682,6 +2679,11 @@ static int tcp_clean_rtx_queue(struct sock *sk, __s32 *seq_rtt_p)
tcp_ack_update_rtt(sk, acked, seq_rtt);
tcp_ack_packets_out(sk);
 
+   if (IsReno(tp)) {
+   int acked = prior_packets - tp->packets_out;
+   tcp_remove_reno_sacks(sk, acked);
+   }
+
if (ca_ops->pkts_acked)
ca_ops->pkts_acked(sk, pkts_acked, last_ackt);
}
@@ -3017,7 +3019,8 @@ static int tcp_ack(struct sock *sk, struct sk_buff *skb, int flag)
if ((flag & FLAG_DATA_ACKED) && !frto_cwnd &&
tcp_may_raise_cwnd(sk, flag))
tcp_cong_avoid(sk, ack,  seq_rtt, prior_in_flight, 0);
-   tcp_fastretrans_alert(sk, prior_snd_una, prior_packets, flag,
+   tcp_fastretrans_alert(sk, prior_snd_una,
+ prior_packets - tp->packets_out, flag,
  mark_lost_entry_seq);
} else {
if ((flag & FLAG_DATA_ACKED) && !frto_cwnd)
-- 
1.5.0.6
