Re: [PATCH 0/9]: tcp-2.6 patchset

2007-05-29 Thread Stephen Hemminger
On Mon, 28 May 2007 13:27:03 +0300 (EEST)
Ilpo Järvinen [EMAIL PROTECTED] wrote:

 On Sun, 27 May 2007, Ilpo Järvinen wrote:
 
  On Sun, 27 May 2007, Baruch Even wrote:
  
   * Ilpo J?rvinen [EMAIL PROTECTED] [070527 14:16]:

Thus, my original question basically culminates in this: should cc
modules be passed number of packets acked or number of skbs acked?
...The latter makes no sense to me unless the value is intented to
be interpreted as number of timestamps acked or something along those 
lines. ...I briefly tried looking up for documentation for cc module 
interface but didn't find anything useful about this, and thus asked in 
the first place...
   
   At least the htcp module that I wrote assumes that the number is actual
   number of tcp packets so GSO should be considered.
  
  Thanks for the info! It is what I suspected... ...I'll write a patch for 
  it tomorrow against net-2.6... Dave, beware that it will partially 
  overlap with the changes made in the patch 8, so you might choose to put 
  the patch 8 on hold until this issue is first resolved...
  
   The consequences of this bug are not too large but it does make all
   congestion control algorithms a lot less aggressive. On my machines GSO
   is disabled by default (e1000 at 100mbps  Tigon3 @ 1Gbps).
  
  Agreed, that's my impression too. However, some algorithms do things
  like  0 checks for it, so it might disturb their dynamics even more
  than in the too small value cases...
 
 Hmm, there seems to be another case that I'm not too sure of...
 Please check the alternative I choose for SYN handling below...
 
 ...hmm... While exploring this SYN thingie, I noticed that commit 
 164891aadf1721fca4dce473bb0e0998181537c6 drops !FLAG_RETRANS_DATA_ACKED 
 check from rtt_sample call (when combining it with pkts_acked call).
 I hope that's intentional?!? ...the commit message didn't say anything 
 about it nor was anything in cc modules changed to accomodate that.
 
 
 [PATCH] [TCP]: Fix GSO ignorance of pkts_acked arg (cong.cntrl modules)
 
 The code used to ignore GSO completely, passing either way too
 small or zero pkts_acked when GSO skb or part of it got ACKed.
 In addition, there is no need to calculate the value in the loop
 but simple arithmetics after the loop is sufficient.

Yes, thanks for fixing this. Wonder how it affects measurements.
 
 It is not very clear how SYN segments should be handled, so I
 choose to follow the previous implementation in this respect.

Since we don't invoke congestion control modules until after the SYN
handshake this is not a problem.


-- 
Stephen Hemminger [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/9]: tcp-2.6 patchset

2007-05-29 Thread Ilpo Järvinen
On Tue, 29 May 2007, Stephen Hemminger wrote:

 On Mon, 28 May 2007 13:27:03 +0300 (EEST)
 Ilpo Järvinen [EMAIL PROTECTED] wrote:
 
  On Sun, 27 May 2007, Ilpo Järvinen wrote:
  
  [PATCH] [TCP]: Fix GSO ignorance of pkts_acked arg (cong.cntrl modules)
 
 Yes, thanks for fixing this. Wonder how it affects measurements.

...It's a bit hard to tell since dynamics change so dramatically 
in  0 check cases, the resulting behavior in too small value cases
may be easier to predict... It's possible that this could explain some 
anomalities you've been seeing in your measurements.

  It is not very clear how SYN segments should be handled, so I
  choose to follow the previous implementation in this respect.
 
 Since we don't invoke congestion control modules until after the SYN
 handshake this is not a problem.

Just curious, do you mean that cc modules cannot measure, e.g., initial
RTT through this mechanism (though they could do that in init() cb then
I suppose)... Or do you mean that they are called already for the ACK
that completes the SYN handshake and therefore its skb is being cleaned 
from the queue right now (this is the case I above refer to)? 
In the first case the decrementer code is NOP. If the latter, then it
is just interface specification question, i.e., if SYNs are treated as 
zero or one in num_acked for the pkts_acked callback (I have no opinion
on this but was just trying to make sure cc modules get what they
expect :-)). 


-- 
 i.

Re: [PATCH 0/9]: tcp-2.6 patchset

2007-05-29 Thread Stephen Hemminger
On Tue, 29 May 2007 23:07:00 +0300 (EEST)
Ilpo Järvinen [EMAIL PROTECTED] wrote:

 On Tue, 29 May 2007, Stephen Hemminger wrote:
 
  On Mon, 28 May 2007 13:27:03 +0300 (EEST)
  Ilpo Järvinen [EMAIL PROTECTED] wrote:
  
   On Sun, 27 May 2007, Ilpo Järvinen wrote:
   
   [PATCH] [TCP]: Fix GSO ignorance of pkts_acked arg (cong.cntrl modules)
  
  Yes, thanks for fixing this. Wonder how it affects measurements.
 
 ...It's a bit hard to tell since dynamics change so dramatically 
 in  0 check cases, the resulting behavior in too small value cases
 may be easier to predict... It's possible that this could explain some 
 anomalities you've been seeing in your measurements.
 
   It is not very clear how SYN segments should be handled, so I
   choose to follow the previous implementation in this respect.
  
  Since we don't invoke congestion control modules until after the SYN
  handshake this is not a problem.
 
 Just curious, do you mean that cc modules cannot measure, e.g., initial
 RTT through this mechanism (though they could do that in init() cb then
 I suppose)... Or do you mean that they are called already for the ACK
 that completes the SYN handshake and therefore its skb is being cleaned 
 from the queue right now (this is the case I above refer to)? 
 In the first case the decrementer code is NOP. If the latter, then it
 is just interface specification question, i.e., if SYNs are treated as 
 zero or one in num_acked for the pkts_acked callback (I have no opinion
 on this but was just trying to make sure cc modules get what they
 expect :-)). 

We don't switch a socket out of Reno until after the initial handshake.
It evolved that way, and has a couple of advantages:
 * uncomplete connections don't up module refcount so it is easier to
   debug
 * in future, we can choose congestion control to use based on route
   metric.
 
As an interface, it makes sense to keep the API with the SYN counting
as a packet.

-- 
Stephen Hemminger [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/9]: tcp-2.6 patchset

2007-05-29 Thread Ilpo Järvinen
On Tue, 29 May 2007, Stephen Hemminger wrote:

 On Tue, 29 May 2007 23:07:00 +0300 (EEST)
 Ilpo Järvinen [EMAIL PROTECTED] wrote:
 
  On Tue, 29 May 2007, Stephen Hemminger wrote:
  
   Since we don't invoke congestion control modules until after the SYN
   handshake this is not a problem.
  
  Just curious, do you mean that cc modules cannot measure, e.g., initial
  RTT through this mechanism (though they could do that in init() cb then
  I suppose)... Or do you mean that they are called already for the ACK
  that completes the SYN handshake and therefore its skb is being cleaned 
  from the queue right now (this is the case I above refer to)? 
  In the first case the decrementer code is NOP. If the latter, then it
  is just interface specification question, i.e., if SYNs are treated as 
  zero or one in num_acked for the pkts_acked callback (I have no opinion
  on this but was just trying to make sure cc modules get what they
  expect :-)). 
 
 We don't switch a socket out of Reno until after the initial handshake.

...It's still not very clear to me what exactly your after means (both 
here and in your earlier description), i.e., whether clean_rtx_queue call 
that cleans SYN skb from the queue happens before or after the switch out 
of reno or not... If I understand the code correctly, this specific 
clean_rtx_queue call happens after your after but I could be 
misunderstanding the current 3-way handshake code. :-)

 As an interface, it makes sense to keep the API with the SYN counting
 as a packet.

Ok, this one answers the remaining question concerning the patch, here
is it without the decrementer for SYN case (which IMHO wasn't very 
beautiful looking anyway :-)).

Dave, please consider this to net-2.6. It could be a stable candidate 
as well, haven't tested yet if it applies cleanly to stable:


[PATCH] [TCP]: Fix GSO ignorance of pkts_acked arg (cong.cntrl modules)

The code used to ignore GSO completely, passing either way too
small or zero pkts_acked when GSO skb or part of it got ACKed.
In addition, there is no need to calculate the value in the loop
but simple arithmetics after the loop is sufficient. Pkts_acked
callback of congestion control modules include SYN as a packet
for now on.

Signed-off-by: Ilpo Järvinen [EMAIL PROTECTED]
---
 net/ipv4/tcp_input.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 38cb25b..74683d8 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2407,8 +2407,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, __s32 
*seq_rtt_p)
struct sk_buff *skb;
__u32 now = tcp_time_stamp;
int acked = 0;
+   int prior_packets = tp-packets_out;
__s32 seq_rtt = -1;
-   u32 pkts_acked = 0;
ktime_t last_ackt = ktime_set(0,0);
 
while ((skb = tcp_write_queue_head(sk)) 
@@ -2437,7 +2437,6 @@ static int tcp_clean_rtx_queue(struct sock *sk, __s32 
*seq_rtt_p)
 */
if (!(scb-flags  TCPCB_FLAG_SYN)) {
acked |= FLAG_DATA_ACKED;
-   ++pkts_acked;
} else {
acked |= FLAG_SYN_ACKED;
tp-retrans_stamp = 0;
@@ -2481,6 +2480,7 @@ static int tcp_clean_rtx_queue(struct sock *sk, __s32 
*seq_rtt_p)
}
 
if (ackedFLAG_ACKED) {
+   u32 pkts_acked = prior_packets - tp-packets_out;
const struct tcp_congestion_ops *ca_ops
= inet_csk(sk)-icsk_ca_ops;
 
-- 
1.5.0.6


Re: [PATCH 0/9]: tcp-2.6 patchset

2007-05-28 Thread Ilpo Järvinen
On Sun, 27 May 2007, Ilpo Järvinen wrote:

 On Sun, 27 May 2007, Baruch Even wrote:
 
  * Ilpo J?rvinen [EMAIL PROTECTED] [070527 14:16]:
   
   Thus, my original question basically culminates in this: should cc
   modules be passed number of packets acked or number of skbs acked?
   ...The latter makes no sense to me unless the value is intented to
   be interpreted as number of timestamps acked or something along those 
   lines. ...I briefly tried looking up for documentation for cc module 
   interface but didn't find anything useful about this, and thus asked in 
   the first place...
  
  At least the htcp module that I wrote assumes that the number is actual
  number of tcp packets so GSO should be considered.
 
 Thanks for the info! It is what I suspected... ...I'll write a patch for 
 it tomorrow against net-2.6... Dave, beware that it will partially 
 overlap with the changes made in the patch 8, so you might choose to put 
 the patch 8 on hold until this issue is first resolved...
 
  The consequences of this bug are not too large but it does make all
  congestion control algorithms a lot less aggressive. On my machines GSO
  is disabled by default (e1000 at 100mbps  Tigon3 @ 1Gbps).
 
 Agreed, that's my impression too. However, some algorithms do things
 like  0 checks for it, so it might disturb their dynamics even more
 than in the too small value cases...

Hmm, there seems to be another case that I'm not too sure of...
Please check the alternative I choose for SYN handling below...

...hmm... While exploring this SYN thingie, I noticed that commit 
164891aadf1721fca4dce473bb0e0998181537c6 drops !FLAG_RETRANS_DATA_ACKED 
check from rtt_sample call (when combining it with pkts_acked call).
I hope that's intentional?!? ...the commit message didn't say anything 
about it nor was anything in cc modules changed to accomodate that.


[PATCH] [TCP]: Fix GSO ignorance of pkts_acked arg (cong.cntrl modules)

The code used to ignore GSO completely, passing either way too
small or zero pkts_acked when GSO skb or part of it got ACKed.
In addition, there is no need to calculate the value in the loop
but simple arithmetics after the loop is sufficient.

It is not very clear how SYN segments should be handled, so I
choose to follow the previous implementation in this respect.

Signed-off-by: Ilpo Järvinen [EMAIL PROTECTED]
---
 net/ipv4/tcp_input.c |7 +--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 38cb25b..7dfaabb 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2407,8 +2407,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, __s32 
*seq_rtt_p)
struct sk_buff *skb;
__u32 now = tcp_time_stamp;
int acked = 0;
+   int prior_packets = tp-packets_out;
__s32 seq_rtt = -1;
-   u32 pkts_acked = 0;
ktime_t last_ackt = ktime_set(0,0);
 
while ((skb = tcp_write_queue_head(sk)) 
@@ -2437,7 +2437,6 @@ static int tcp_clean_rtx_queue(struct sock *sk, __s32 
*seq_rtt_p)
 */
if (!(scb-flags  TCPCB_FLAG_SYN)) {
acked |= FLAG_DATA_ACKED;
-   ++pkts_acked;
} else {
acked |= FLAG_SYN_ACKED;
tp-retrans_stamp = 0;
@@ -2481,9 +2480,13 @@ static int tcp_clean_rtx_queue(struct sock *sk, __s32 
*seq_rtt_p)
}
 
if (ackedFLAG_ACKED) {
+   u32 pkts_acked = prior_packets - tp-packets_out;
const struct tcp_congestion_ops *ca_ops
= inet_csk(sk)-icsk_ca_ops;
 
+   if (acked  FLAG_SYN_ACKED)
+   pkts_acked--;
+
tcp_ack_update_rtt(sk, acked, seq_rtt);
tcp_ack_packets_out(sk);
 
-- 
1.5.0.6


Re: [PATCH 0/9]: tcp-2.6 patchset

2007-05-27 Thread Ilpo Järvinen
On Sat, 26 May 2007, David Miller wrote:

 From: Ilpo_Järvinen [EMAIL PROTECTED]
 Date: Sat, 26 May 2007 11:35:53 +0300
 
  Dave, you could consider applying other than the last one if they
  seem ok to you too (you'll need to rebase your tcp-2.6 in that case
  first to apply cleanly those that touch tcp_sync_left_out :-)).
 
 Absolutely, I'll do the quick rebase of tcp-2.6 and review +
 applying of your patches (except the last one) over the weekend.

While you're in the right context (reviewing patch 8), you could also
look if tcp_clean_rtx_queue does a right thing when passing a strange 
pkts_acked to congestion control modules. I wonder if it really should 
ignore GSO the way it does currently... I read some cc module code and 
some was adding it to snd_cwnd_cnt, etc. which is a strong indication 
that GSO should be considered... Also if the head is GSO skb that is not 
completely acked, the loop breaks with pkts_acked being zero, I doubt
that can be correct... 

...added Stephen H.  Herbert X. to this thread since also they might have 
some idea about it.

...Anyway, if it's wrong, the fix should go to net-2.6 rather than to 
tcp-2.6...


-- 
 i.

Re: [PATCH 0/9]: tcp-2.6 patchset

2007-05-27 Thread David Miller
From: Ilpo_Järvinen [EMAIL PROTECTED]
Date: Sun, 27 May 2007 10:58:27 +0300 (EEST)

 On Sat, 26 May 2007, David Miller wrote:
 
  From: Ilpo_Järvinen [EMAIL PROTECTED]
  Date: Sat, 26 May 2007 11:35:53 +0300
  
   Dave, you could consider applying other than the last one if they
   seem ok to you too (you'll need to rebase your tcp-2.6 in that case
   first to apply cleanly those that touch tcp_sync_left_out :-)).
  
  Absolutely, I'll do the quick rebase of tcp-2.6 and review +
  applying of your patches (except the last one) over the weekend.
 
 While you're in the right context (reviewing patch 8), you could also
 look if tcp_clean_rtx_queue does a right thing when passing a strange 
 pkts_acked to congestion control modules. I wonder if it really should 
 ignore GSO the way it does currently... I read some cc module code and 
 some was adding it to snd_cwnd_cnt, etc. which is a strong indication 
 that GSO should be considered... Also if the head is GSO skb that is not 
 completely acked, the loop breaks with pkts_acked being zero, I doubt
 that can be correct... 

I rebased tcp-2.6, applied your patches 1-7 (which I'll comment
a bit on tomorrow) and will likely take a look at these issues
wrt. patch 8 tomorrow.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/9]: tcp-2.6 patchset

2007-05-27 Thread Ilpo Järvinen
On Sun, 27 May 2007, David Miller wrote:

 From: Ilpo_Järvinen [EMAIL PROTECTED]
 Date: Sun, 27 May 2007 10:58:27 +0300 (EEST)
 
  While you're in the right context (reviewing patch 8), you could also
  look if tcp_clean_rtx_queue does a right thing when passing a strange 
  pkts_acked to congestion control modules. I wonder if it really should 
  ignore GSO the way it does currently... I read some cc module code and 
  some was adding it to snd_cwnd_cnt, etc. which is a strong indication 
  that GSO should be considered... Also if the head is GSO skb that is not 
  completely acked, the loop breaks with pkts_acked being zero, I doubt
  that can be correct... 
 
 [...snip...]
 will likely take a look at these issues wrt. patch 8 tomorrow.

...I hope I got myself understood correctly (with my non-native
English :-))... ...My intention was to say that there might be
a bug in tcp_clean_rtx_queue too, which on logical level is
unrelated to the patch 8 itself and to the change it makes (but
caught my attention while I was doing that patch as it resides in
the same function). ...There could have been confusion since I'm too 
changing things in the same function and because my article usage is 
usually far from perfect... :-) Also the fastretrans_alert arg rename
to the same name (i.e., pkts_acked) in patch 8 is unrelated to the 
potential bug.

Thus, my original question basically culminates in this: should cc
modules be passed number of packets acked or number of skbs acked?
...The latter makes no sense to me unless the value is intented to
be interpreted as number of timestamps acked or something along those 
lines. ...I briefly tried looking up for documentation for cc module 
interface but didn't find anything useful about this, and thus asked in 
the first place...


-- 
 i.

Re: [PATCH 0/9]: tcp-2.6 patchset

2007-05-27 Thread Baruch Even
* Ilpo J?rvinen [EMAIL PROTECTED] [070527 14:16]:
 On Sun, 27 May 2007, David Miller wrote:
 
  From: Ilpo_J?rvinen [EMAIL PROTECTED]
  Date: Sun, 27 May 2007 10:58:27 +0300 (EEST)
  
   While you're in the right context (reviewing patch 8), you could also
   look if tcp_clean_rtx_queue does a right thing when passing a strange 
   pkts_acked to congestion control modules. I wonder if it really should 
   ignore GSO the way it does currently... I read some cc module code and 
   some was adding it to snd_cwnd_cnt, etc. which is a strong indication 
   that GSO should be considered... Also if the head is GSO skb that is not 
   completely acked, the loop breaks with pkts_acked being zero, I doubt
   that can be correct... 
  
  [...snip...]
  will likely take a look at these issues wrt. patch 8 tomorrow.
 
 [...snip...]
 
 Thus, my original question basically culminates in this: should cc
 modules be passed number of packets acked or number of skbs acked?
 ...The latter makes no sense to me unless the value is intented to
 be interpreted as number of timestamps acked or something along those 
 lines. ...I briefly tried looking up for documentation for cc module 
 interface but didn't find anything useful about this, and thus asked in 
 the first place...

At least the htcp module that I wrote assumes that the number is actual
number of tcp packets so GSO should be considered.

The consequences of this bug are not too large but it does make all
congestion control algorithms a lot less aggressive. On my machines GSO
is disabled by default (e1000 at 100mbps  Tigon3 @ 1Gbps).

Cheers,
Baruch
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/9]: tcp-2.6 patchset

2007-05-27 Thread Ilpo Järvinen
On Sun, 27 May 2007, Baruch Even wrote:

 * Ilpo J?rvinen [EMAIL PROTECTED] [070527 14:16]:
  
  Thus, my original question basically culminates in this: should cc
  modules be passed number of packets acked or number of skbs acked?
  ...The latter makes no sense to me unless the value is intented to
  be interpreted as number of timestamps acked or something along those 
  lines. ...I briefly tried looking up for documentation for cc module 
  interface but didn't find anything useful about this, and thus asked in 
  the first place...
 
 At least the htcp module that I wrote assumes that the number is actual
 number of tcp packets so GSO should be considered.

Thanks for the info! It is what I suspected... ...I'll write a patch for 
it tomorrow against net-2.6... Dave, beware that it will partially 
overlap with the changes made in the patch 8, so you might choose to put 
the patch 8 on hold until this issue is first resolved...

 The consequences of this bug are not too large but it does make all
 congestion control algorithms a lot less aggressive. On my machines GSO
 is disabled by default (e1000 at 100mbps  Tigon3 @ 1Gbps).

Agreed, that's my impression too. However, some algorithms do things
like  0 checks for it, so it might disturb their dynamics even more
than in the too small value cases...


-- 
 i.
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/9]: tcp-2.6 patchset

2007-05-26 Thread David Miller
From: Ilpo_Järvinen [EMAIL PROTECTED]
Date: Sat, 26 May 2007 11:35:53 +0300

 Dave, you could consider applying other than the last one if they
 seem ok to you too (you'll need to rebase your tcp-2.6 in that case
 first to apply cleanly those that touch tcp_sync_left_out :-)).

Absolutely, I'll do the quick rebase of tcp-2.6 and review +
applying of your patches (except the last one) over the weekend.

Thanks!
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html