Re: [PATCH][RFC] Re: high latency with TCP connections

2006-09-22 Thread Rick Jones

Alexey Kuznetsov wrote:

> Hello!
>
> > transactions to data segments is fubar.  That issue is also why I wonder
> > about the setting of tcp_abc.
>
> Yes, switching ABC on/off has a visible impact on the number of segments.
> When ABC is off, the number of segments is almost the same as the number
> of transactions. When it is on, ~1.5% are merged. But this is invisible
> in the throughput/CPU numbers.


Hmm, that would seem to suggest that for "new" the netperf/netserver 
were fast enough that the code didn't perceive the receipt of 
back-to-back sub-MSS segments? (Is that even possible once -b is fairly 
large?)  Otherwise, with "new" I would have expected the segment count to 
be meaningfully greater than the transaction count.




> Here are the numbers:
>
> 1Gig link. The first column is "b". A "-" separates runs of netperf
> in the backward direction.
>
> Run #1. One host is slower.
>
> old,abc=0
>  new,abc=0
>   new,abc=1
>    old,abc=1
>
> 2   23652.00  6.31   21.11  10.665  8.924
>  23622.16  6.47   21.01  10.951  8.893
>   23625.05  6.21   21.01  10.512  8.891
>    23725.12  6.46   20.31  10.898  8.559
> -
> 23594.87  21.90  6.44   9.283   10.912
>  23631.52  20.30  6.36   8.592   10.766
>   23609.55  21.00  6.26   8.896   10.599
>    23633.75  21.10  5.44   8.929   9.206
>
> 4   36349.11  8.71   31.21  9.584   8.585
>  36461.37  8.65   30.81  9.492   8.449
>   36723.72  8.22   31.31  8.949   8.526
>    35801.24  8.58   30.51  9.589   8.521
> -
> 35127.34  33.80  8.43   9.621   9.605
>  36165.50  30.90  8.48   8.545   9.381
>   36201.45  31.10  8.31   8.592   9.185
>    35269.76  30.00  8.58   8.507   9.732
>
> 8   41148.23  10.39  42.30  10.101  10.281
>  41270.06  11.04  31.31  10.698  7.585
>   41181.56  5.66   48.61  5.496   11.803
>    40372.37  9.68   56.50  9.591   13.996
> -
> 40392.14  47.00  11.89  11.637  11.775
>  40613.80  36.90  9.16   9.086   9.019
>   40504.66  53.60  7.73   13.234  7.639
>    40388.99  48.70  11.93  12.058  11.814
>
> 16  67952.27  16.27  43.70  9.576   6.432
>  68031.40  10.56  53.70  6.206   7.894
>   6.95  12.81  46.90  7.559   6.920
>    67814.41  16.13  46.50  9.517   6.857
> -
> 68031.46  51.30  11.53  7.541   6.781
>  68044.57  40.70  8.48   5.982   4.986
>   67808.13  39.60  15.86  5.840   9.355
>    67818.32  52.90  11.51  7.801   6.791
>
> 32  90445.09  15.41  99.90  6.817   11.045
>  90210.34  16.11  100.00 7.143   11.085
>   90221.84  17.31  98.90  7.676   10.962
>    90712.78  18.41  99.40  8.120   10.958
> -
> 89155.51  99.90  12.89  11.205  5.782
>  90058.54  99.90  16.16  11.093  7.179
>   90092.31  98.60  15.41  10.944  6.840
>    88688.96  99.00  17.59  11.163  7.933
>
> 64  89983.76  13.66  100.00 6.071   11.113
>  90504.24  17.54  100.00 7.750   11.049
>   92043.36  17.44  99.70  7.580   10.832
>    90979.29  16.01  99.90  7.038   10.981
> -
> 88615.27  99.90  14.91  11.273  6.729
>  89316.13  99.90  17.28  11.185  7.740
>   90622.85  99.90  16.81  11.024  7.420
>    89084.85  99.90  17.51  11.214  7.861
>
> Run #2. Slower host is replaced with a better one. ABC=0.
> No runs in the backward direction.
>
> new
>  old
>
> 2   24009.73  8.80   6.49   3.667   10.806
>  24008.43  8.00   6.32   3.334   10.524
> 4   40012.53  18.30  8.79   4.574   8.783
>  3.84  19.40  8.86   4.851   8.857
> 8   60500.29  26.30  12.78  4.348   8.452
>  60397.79  26.30  11.73  4.355   7.769
> 16  69619.95  39.80  14.03  5.717   8.063
>  70528.72  24.90  14.43  3.531   8.184
> 32  132522.01  53.20  21.28  4.015   6.424
>  132602.93  57.70  22.59  4.351   6.813
> 64  145738.83  60.30  25.01  4.138   6.865
>  143129.55  73.20  24.19  5.114   6.759
> 128 148184.21  69.70  24.96  4.704   6.739
>  148143.47  71.00  25.01  4.793   6.753
> 256 144798.91  69.40  25.01  4.793   6.908
>  144086.01  73.00  24.61  5.067   6.832
>
> Frankly, I do not see any statistically valid correlations.


It does look like it jumps around quite a bit - for example, run #2 with 
-b 16 had the CPU utilization all over the map on the netperf side.  That 
wasn't by any chance an SMP system?
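
(Aside, assuming netperf's usual definition of service demand as CPU time
per transaction:

    demand [usec/tran] ~= CPU_util/100 * num_CPUs * 10^6 / trans_per_sec

e.g. Run #2 at -b 2: 0.0880 * 1 * 10^6 / 24009.73 ~= 3.67, matching the
3.667 column, while the remote side's 6.49% only reproduces its 10.806
if that host has four CPUs.)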


that "linux" didn't seem to be doing the same thing. Hence my tweaking 
when seeing this patch come along...]



netperf does not catch this. :-)


Nope :(  One of these days I need to teach netperf how to extract 
TCP statistics from as many platforms as possible.  Meantime it relies, 
as always, on the kindness of benchmarkers :) (My apologies to Tennessee 
Williams :)



> Even with this patch linux does not ACK every second segment dumbly;
> it waits for some conditions, mostly read() emptying the receive queue.


Good.  HP-UX is indeed dumb about this, but I'm assured it will be 
changing.

Re: [PATCH][RFC] Re: high latency with TCP connections

2006-09-22 Thread Alexey Kuznetsov
Hello!

> transactions to data segments is fubar.  That issue is also why I wonder 
> about the setting of tcp_abc.

Yes, switching ABC on/off has a visible impact on the number of segments.
When ABC is off, the number of segments is almost the same as the number
of transactions. When it is on, ~1.5% are merged. But this is invisible
in the throughput/CPU numbers.

Here are the numbers:

1Gig link. The first column is "b". A "-" separates runs of netperf
in the backward direction.

Run #1. One host is slower.

old,abc=0
 new,abc=0
  new,abc=1
   old,abc=1

2   23652.00  6.31   21.11  10.665  8.924
 23622.16  6.47   21.01  10.951  8.893
  23625.05  6.21   21.01  10.512  8.891
   23725.12  6.46   20.31  10.898  8.559
-
23594.87  21.90  6.44   9.283   10.912
 23631.52  20.30  6.36   8.592   10.766
  23609.55  21.00  6.26   8.896   10.599
   23633.75  21.10  5.44   8.929   9.206

4   36349.11  8.71   31.21  9.584   8.585
 36461.37  8.65   30.81  9.492   8.449
  36723.72  8.22   31.31  8.949   8.526
   35801.24  8.58   30.51  9.589   8.521
-
35127.34  33.80  8.43   9.621   9.605
 36165.50  30.90  8.48   8.545   9.381
  36201.45  31.10  8.31   8.592   9.185
   35269.76  30.00  8.58   8.507   9.732

8   41148.23  10.39  42.30  10.101  10.281
 41270.06  11.04  31.31  10.698  7.585
  41181.56  5.66   48.61  5.496   11.803
   40372.37  9.68   56.50  9.591   13.996
-
40392.14  47.00  11.89  11.637  11.775
 40613.80  36.90  9.16   9.086   9.019
  40504.66  53.60  7.73   13.234  7.639
   40388.99  48.70  11.93  12.058  11.814

16  67952.27  16.27  43.70  9.576   6.432
 68031.40  10.56  53.70  6.206   7.894
  6.95  12.81  46.90  7.559   6.920
   67814.41  16.13  46.50  9.517   6.857
-
68031.46  51.30  11.53  7.541   6.781
 68044.57  40.70  8.48   5.982   4.986
  67808.13  39.60  15.86  5.840   9.355
   67818.32  52.90  11.51  7.801   6.791

32  90445.09  15.41  99.90  6.817   11.045
 90210.34  16.11  100.00 7.143   11.085
  90221.84  17.31  98.90  7.676   10.962
   90712.78  18.41  99.40  8.120   10.958
-
89155.51  99.90  12.89  11.205  5.782
 90058.54  99.90  16.16  11.093  7.179
  90092.31  98.60  15.41  10.944  6.840
   88688.96  99.00  17.59  11.163  7.933

64  89983.76  13.66  100.00 6.071   11.113
 90504.24  17.54  100.00 7.750   11.049
  92043.36  17.44  99.70  7.580   10.832
   90979.29  16.01  99.90  7.038   10.981
-
88615.27  99.90  14.91  11.273  6.729
 89316.13  99.90  17.28  11.185  7.740
  90622.85  99.90  16.81  11.024  7.420
   89084.85  99.90  17.51  11.214  7.861

Run #2. Slower host is replaced with a better one. ABC=0.
No runs in the backward direction.

new
 old

2   24009.73  8.80   6.49   3.667   10.806
 24008.43  8.00   6.32   3.334   10.524
4   40012.53  18.30  8.79   4.574   8.783
 3.84  19.40  8.86   4.851   8.857
8   60500.29  26.30  12.78  4.348   8.452
 60397.79  26.30  11.73  4.355   7.769
16  69619.95  39.80  14.03  5.717   8.063
 70528.72  24.90  14.43  3.531   8.184
32  132522.01  53.20  21.28  4.015   6.424
 132602.93  57.70  22.59  4.351   6.813
64  145738.83  60.30  25.01  4.138   6.865
 143129.55  73.20  24.19  5.114   6.759
128 148184.21  69.70  24.96  4.704   6.739
 148143.47  71.00  25.01  4.793   6.753
256 144798.91  69.40  25.01  4.793   6.908
 144086.01  73.00  24.61  5.067   6.832

Frankly, I do not see any statistically valid correlations.



> that "linux" didn't seem to be doing the same thing. Hence my tweaking 
> when seeing this patch come along...]

netperf does not catch this. :-)

Even with this patch linux does not ACK every second segment dumbly;
it waits for some conditions, mostly read() emptying the receive queue.
To model this it is necessary to insert some gaps between the
bursted segments or to use a slow network.

I have no doubt it is easy to model a situation where we send
lots of useless ACKs, e.g. by inserting 20ms gaps between requests.
To see the effect on throughput/CPU, we could start enough
connections doing the same thing.
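
A minimal sketch of that experiment, assuming an already-connected socket
fd and made-up request/response sizes (not from the original mail):

#include <string.h>
#include <unistd.h>

/* One small transaction every 20 ms, so the receiver's delayed-ACK
 * machinery, rather than back-to-back segments, decides when ACKs
 * go out. */
void paced_transactions(int fd, int n)
{
	char req[64], resp[64];

	memset(req, 'x', sizeof(req));
	for (int i = 0; i < n; i++) {
		write(fd, req, sizeof(req));	/* small sub-MSS request */
		read(fd, resp, sizeof(resp));	/* small sub-MSS response */
		usleep(20 * 1000);		/* 20 ms gap between requests */
	}
}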

Alexey


Re: [PATCH][RFC] Re: high latency with TCP connections

2006-09-20 Thread Stephen Hemminger
On Wed, 20 Sep 2006 15:47:56 -0700 (PDT)
David Miller <[EMAIL PROTECTED]> wrote:

> From: Stephen Hemminger <[EMAIL PROTECTED]>
> Date: Wed, 20 Sep 2006 15:44:06 -0700
> 
> > On Mon, 18 Sep 2006 06:56:55 -0700 (PDT)
> > David Miller <[EMAIL PROTECTED]> wrote:
> > 
> > > Ok, I'll put this into net-2.6.19 for now.  Thanks.
> > 
> > Did you try this on a desktop system?  Something is wrong with net-2.6.19;
> > basic web browsing seems slower.
> 
> It might be due to other changes, please verify that it's
> truly caused by Alexey's change by backing it out and
> retesting.
> 
> Note that I had to use an updated version of Alexey's change,
> which he sent me privately, because the first version didn't
> compile :)

It might be something else... there are a lot of changes from 2.6.18 to 
net-2.6.19.



-- 
Stephen Hemminger <[EMAIL PROTECTED]>


Re: [PATCH][RFC] Re: high latency with TCP connections

2006-09-20 Thread David Miller
From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Wed, 20 Sep 2006 15:44:06 -0700

> On Mon, 18 Sep 2006 06:56:55 -0700 (PDT)
> David Miller <[EMAIL PROTECTED]> wrote:
> 
> > Ok, I'll put this into net-2.6.19 for now.  Thanks.
> 
> Did you try this on a desktop system?  Something is wrong with net-2.6.19;
> basic web browsing seems slower.

It might be due to other changes, please verify that it's
truly caused by Alexey's change by backing it out and
retesting.

Note that I had to use an updated version of Alexey's change,
which he sent me privately, because the first version didn't
compile :)


Re: [PATCH][RFC] Re: high latency with TCP connections

2006-09-20 Thread Stephen Hemminger
On Mon, 18 Sep 2006 06:56:55 -0700 (PDT)
David Miller <[EMAIL PROTECTED]> wrote:

> From: Alexey Kuznetsov <[EMAIL PROTECTED]>
> Date: Mon, 18 Sep 2006 14:37:05 +0400
> 
> > > It looks perfectly fine to me, would you like me to apply it
> > > Alexey?
> > 
> > Yes, I think it is safe.
> 
> Ok, I'll put this into net-2.6.19 for now.  Thanks.

Did you try this on a desktop system?  Something is wrong with net-2.6.19;
basic web browsing seems slower.

-- 
Stephen Hemminger <[EMAIL PROTECTED]>


Re: [PATCH][RFC] Re: high latency with TCP connections

2006-09-18 Thread Rick Jones
> > Regardless, kudos for running the test.  The only thing missing is the
> > -c and -C options to enable the CPU utilization measurements which will
> > then give the service demand on a CPU time per transaction basis.  Or
> > was this a UP system that was taken to CPU saturation?
>
> It is my notebook. :-) Of course, CPU consumption is 100%.
> (Actually, netperf shows 100.10 :-))


Gotta love the accuracy. :)



> I will redo the test on a real network. What range of -b should I test?



I suppose that depends on your patience :) In theory, as you increase 
(e.g. double) the -b setting you should reach a point of diminishing 
returns wrt transaction rate.  If you see that, and see the service 
demand flattening out, I'd say it is probably time to stop.


I'm also not quite sure if "abc" needs to be disabled or not.

I do know that I left out one very important netperf option.  The 
command line should be:


netperf -t TCP_RR -H foo -- -b N -D

where "-D" is added to set TCP_NODELAY.  Otherwise, the ratio of 
transactions to data segments is fubar.  That issue is also why I wonder 
about the setting of tcp_abc.
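
For reference, -D corresponds to enabling TCP_NODELAY on the data socket;
a minimal sketch in plain sockets C (hypothetical helper name, error
handling trimmed):

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* What netperf's -D amounts to: disable Nagle so each sub-MSS
 * request/response goes out as its own segment instead of being
 * coalesced while an ACK is outstanding. */
int make_nodelay_socket(void)
{
	int one = 1;
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}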


[I have this quixotic pipedream about being able to --enable-burst, set 
-D and say that the number of TCP segments exchanged on the network is 
2X the transaction count when request and response size are < MSS.  The 
raison d'etre for this pipe dream is maximizing PPS with TCP_RR tests 
without _having_ to have hundreds if not thousands of simultaneous 
netperfs/connections - say with just as many netperfs/connections as 
there are CPUs or threads/strands in the system. It was while trying to 
make this pipe dream a reality that I first noticed that HP-UX 11i, which 
normally has a very nice ACK-avoidance heuristic, would send an 
immediate ACK if it received back-to-back sub-MSS segments - thus 
ruining my pipe dream when it came to HP-UX testing.  Happily, I noticed 
that "linux" didn't seem to be doing the same thing. Hence my tweaking 
when seeing this patch come along...]



> > What i'm thinking about isn't so much about the latency
>
> I understand.
>
> Actually, I did those tests ages ago for a pure throughput case,
> when nothing goes in the opposite direction. I did not find a difference
> that time. And nobody even noticed that Linux had been sending ACKs for
> _each_ small segment on unidirectional connections all those years. :-)


Not everyone looks very closely (alas, sometimes myself included).

If all anyone does is look at throughput, they won't notice until they 
saturate the CPU.  Heck, before netperf and TCP_RR tests, and sadly even 
still today, most people just look at how fast a single-connection, 
unidirectional data transfer goes and leave it at that :(


Thankfully, the set of "most people" and "netdev" aren't completely 
overlapping.


rick jones


Re: [PATCH][RFC] Re: high latency with TCP connections

2006-09-18 Thread Alexey Kuznetsov
Hello!

> There isn't any sort of clever short-circuiting in loopback is there?

Not that I know of.


> I do like the convenience of testing things over loopback, but always fret
> about not including drivers and actual hardware interrupts etc.

Well, if the test is right, it should show the cost of redundant ACKs.


> Regardless, kudos for running the test.  The only thing missing is the 
> -c and -C options to enable the CPU utilization measurements which will 
> then give the service demand on a CPU time per transaction basis.  Or 
> was this a UP system that was taken to CPU saturation?

It is my notebook. :-) Of course, CPU consumption is 100%.
(Actually, netperf shows 100.10 :-))

I will redo the test on a real network. What range of -b should I test?


> What i'm thinking about isn't so much about the latency

I understand.

Actually, I did those tests ages ago for a pure throughput case,
when nothing goes in the opposite direction. I did not find a difference
that time. And nobody even noticed that Linux had been sending ACKs for
_each_ small segment on unidirectional connections all those years. :-)

Alexey


Re: [PATCH][RFC] Re: high latency with TCP connections

2006-09-18 Thread Rick Jones

Alexey Kuznetsov wrote:

> Hello!
>
> Of course, the number of ACKs increases. It is the goal. :-)
>
> > unpleasant increase in service demands on something like a "burst
> > enabled" (./configure --enable-burst) netperf TCP_RR test:
> >
> > netperf -t TCP_RR -H foo -- -b N   # N > 1
>
> foo=localhost


There isn't any sort of clever short-circuiting in loopback is there?  I 
do like the convenience of testing things over loopback, but always fret 
about not including drivers and actual hardware interrupts etc.



> b   patched orig
> 2   105874.83   105143.71
> 3   114208.53   114023.07
> 4   120493.99   120851.27
> 5   128087.48   128573.33
> 10  151328.48   151056.00
>
> Probably, the test is done wrong. But I see no difference.


Regardless, kudos for running the test.  The only thing missing is the 
-c and -C options to enable the CPU utilization measurements which will 
then give the service demand on a CPU time per transaction basis.  Or 
was this a UP system that was taken to CPU saturation?


> > to increase as a result.   Pipelined HTTP would be like that, some NFS
> > over TCP stuff too, maybe X traffic,
>
> X will be excited about better latency.
>
> As for protocols not interested in latency, they will be a little
> happier if transactions are processed asynchronously.


What i'm thinking about isn't so much about the latency as it is the 
aggregate throughput a system can do with lots of these 
protocols/connections going at the same time.  Hence the concern about 
increases in service demand.



> But actually, it is not about increasing/decreasing the number of ACKs.
> It is about killing that pain in the ass which we used to have because
> we pretended to be too smart.


:)

rick jones


Re: [PATCH][RFC] Re: high latency with TCP connections

2006-09-18 Thread Alexey Kuznetsov
Hello!

Of course, the number of ACKs increases. It is the goal. :-)


> unpleasant increase in service demands on something like a "burst 
> enabled" (./configure --enable-burst) netperf TCP_RR test:
> 
> netperf -t TCP_RR -H foo -- -b N   # N > 1

foo=localhost

b   patched orig
2   105874.83   105143.71
3   114208.53   114023.07
4   120493.99   120851.27
5   128087.48   128573.33
10  151328.48   151056.00

Probably, the test is done wrong. But I see no difference.


> to increase as a result.   Pipelined HTTP would be like that, some NFS 
> over TCP stuff too, maybe X traffic,

X will be excited about better latency.

As for protocols not interested in latency, they will be a little
happier if transactions are processed asynchronously.

But actually, it is not about increasing/decreasing the number of ACKs.
It is about killing that pain in the ass which we used to have because
we pretended to be too smart.

Alexey


Re: [PATCH][RFC] Re: high latency with TCP connections

2006-09-18 Thread Rick Jones

David Miller wrote:

> From: Rick Jones <[EMAIL PROTECTED]>
> Date: Tue, 05 Sep 2006 10:55:16 -0700
>
> > Is this really necessary?  I thought that the problems with ABC were in
> > trying to apply byte-based heuristics from the RFC(s) to a
> > packet-oriented cwnd in the stack?
>
> This is receiver side, and helps a sender who does congestion
> control based upon packet counting like Linux does.  It really
> is less related to ABC than Alexey implies; we've always had
> this kind of problem, as I mentioned in previous talks in the
> past on this issue.


For a connection receiving nothing but sub-MSS segments this is going to 
non-trivially increase the number of ACKs sent, no?  I would expect an 
unpleasant increase in service demands on something like a "burst 
enabled" (./configure --enable-burst) netperf TCP_RR test:


netperf -t TCP_RR -H foo -- -b N   # N > 1

to increase as a result.   Pipelined HTTP would be like that, some NFS 
over TCP stuff too, maybe X traffic, other "transactional" workloads as 
well - maybe Tuxedo.


rick jones


Re: [PATCH][RFC] Re: high latency with TCP connections

2006-09-18 Thread David Miller
From: Alexey Kuznetsov <[EMAIL PROTECTED]>
Date: Mon, 18 Sep 2006 14:37:05 +0400

> > It looks perfectly fine to me, would you like me to apply it
> > Alexey?
> 
> Yes, I think it is safe.

Ok, I'll put this into net-2.6.19 for now.  Thanks.


Re: [PATCH][RFC] Re: high latency with TCP connections

2006-09-18 Thread Alexey Kuznetsov
Hello!

> It looks perfectly fine to me, would you like me to apply it
> Alexey?

Yes, I think it is safe.


Theoretically, there is one place where it can be not so good.
A good nagling TCP connection, which makes lots of small write()s,
will send MSS-sized frames thanks to delayed ACKs. But if we ACK
every second segment, more segments will come out incomplete,
which could result in some decrease of throughput.

But the trap for this case was set 6 years ago. For unidirectional sessions,
ACKs were sent not even for each second segment, but for each small segment. :-)
This did not show any problems in those 6 years. I guess it means
that the problem does not exist.
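
A minimal sketch of the nagling case being described, assuming a
hypothetical connected socket fd; with Nagle left on (no TCP_NODELAY),
the kernel may hold these sub-MSS writes back and merge them into
MSS-sized frames while an ACK is outstanding:

#include <unistd.h>

/* Many small write()s on a socket with Nagle enabled (the default).
 * Delayed ACKs give Nagle time to coalesce the chunks into full-sized
 * segments; ACKing every second segment releases data sooner, i.e.
 * in smaller pieces. */
void small_writes(int fd, int chunks)
{
	const char buf[32] = { 0 };
	int i;

	for (i = 0; i < chunks; i++)
		write(fd, buf, sizeof(buf));
}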

Alexey


Re: [PATCH][RFC] Re: high latency with TCP connections

2006-09-18 Thread David Miller
From: Rick Jones <[EMAIL PROTECTED]>
Date: Tue, 05 Sep 2006 10:55:16 -0700

> Is this really necessary?  I thought that the problems with ABC were in 
> trying to apply byte-based heuristics from the RFC(s) to a 
> packet-oriented cwnd in the stack?

This is receiver side, and helps a sender who does congestion
control based upon packet counting like Linux does.  It really
is less related to ABC than Alexey implies; we've always had
this kind of problem, as I mentioned in previous talks in the
past on this issue.


Re: [PATCH][RFC] Re: high latency with TCP connections

2006-09-18 Thread David Miller
From: Alexey Kuznetsov <[EMAIL PROTECTED]>
Date: Mon, 4 Sep 2006 20:00:45 +0400

> Try the enclosed patch. I have no idea why 9.997 sec is so magic, but I
> get exactly this number on my notebook. :-)
> 
> =
> 
> This patch enables sending ACKs for each 2nd received segment.
> It does not affect either MSS-sized connections (obviously) or connections
> controlled by Nagle (because there is only one small segment in flight).
>
> The idea is to record the fact that a small segment arrives
> on a connection where one small segment has already been received
> and is still un-ACKed. In this case an ACK is forced after tcp_recvmsg()
> drains the receive buffer.
>
> In other words, it is a "soft" each-2nd-segment ACK, which is enough
> to preserve the ACK clock even when ABC is enabled.
>
> Signed-off-by: Alexey Kuznetsov <[EMAIL PROTECTED]>

This looks exactly like the kind of patch I tried to formulate,
very unsuccessfully, last time this topic came up a year or
so ago.

It looks perfectly fine to me, would you like me to apply it
Alexey?


Re: [PATCH][RFC] Re: high latency with TCP connections

2006-09-05 Thread Alexey Kuznetsov
Hello!

> Is this really necessary?

No, of course not. We lived for ages without this and would live for another age.



> I thought that the problems with ABC were in
> trying to apply byte-based heuristics from the RFC(s) to a
> packet-oriented cwnd in the stack?

It was just the last drop.

Even with ABC disabled, that test shows gaps in latency summing
to ~300 msec. Almost invisible, but not good.

Too-aggressive delack has many other issues. Even without ABC
we have a quadratically suppressed cwnd on TCP_NODELAY connections
compared to BSD: at the sender side we suppress it by counting
cwnd in packets, at the receiver side by ACKing by byte counter.

Each time another victim sees the artificial latencies introduced
by aggressive delayed ACKs, even though he requested TCP_NODELAY,
our best argument is "Stupid, you do it all wrong, how could you get
decent performance?" :-).

Probably, we stand for a feature which really is not worth
standing for and causes nothing but permanent pain in the ass.

Alexey


Re: [PATCH][RFC] Re: high latency with TCP connections

2006-09-05 Thread Rick Jones

Alexey Kuznetsov wrote:

> Hello!
>
> > Some people reported that this program runs in 9.997 sec when run on
> > FreeBSD.
>
> Try the enclosed patch. I have no idea why 9.997 sec is so magic, but I
> get exactly this number on my notebook. :-)
>
> Alexey
>
> =
>
> This patch enables sending ACKs for each 2nd received segment.
> It does not affect either MSS-sized connections (obviously) or connections
> controlled by Nagle (because there is only one small segment in flight).
>
> The idea is to record the fact that a small segment arrives
> on a connection where one small segment has already been received
> and is still un-ACKed. In this case an ACK is forced after tcp_recvmsg()
> drains the receive buffer.
>
> In other words, it is a "soft" each-2nd-segment ACK, which is enough
> to preserve the ACK clock even when ABC is enabled.


Is this really necessary?  I thought that the problems with ABC were in 
trying to apply byte-based heuristics from the RFC(s) to a 
packet-oriented cwnd in the stack?


rick jones


[PATCH][RFC] Re: high latency with TCP connections

2006-09-04 Thread Alexey Kuznetsov
Hello!

> Some people reported that this program runs in 9.997 sec when run on
> FreeBSD.

Try the enclosed patch. I have no idea why 9.997 sec is so magic, but I
get exactly this number on my notebook. :-)

Alexey

=

This patch enables sending ACKs for each 2nd received segment.
It does not affect either MSS-sized connections (obviously) or connections
controlled by Nagle (because there is only one small segment in flight).

The idea is to record the fact that a small segment arrives
on a connection where one small segment has already been received
and is still un-ACKed. In this case an ACK is forced after tcp_recvmsg()
drains the receive buffer.

In other words, it is a "soft" each-2nd-segment ACK, which is enough
to preserve the ACK clock even when ABC is enabled.

Signed-off-by: Alexey Kuznetsov <[EMAIL PROTECTED]>



diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
index 9bf73fe..de4e83b 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -147,7 +147,8 @@ extern struct sock *inet_csk_clone(struc
 enum inet_csk_ack_state_t {
 	ICSK_ACK_SCHED	= 1,
 	ICSK_ACK_TIMER	= 2,
-	ICSK_ACK_PUSHED	= 4
+	ICSK_ACK_PUSHED	= 4,
+	ICSK_ACK_PUSHED2 = 8
 };
 
 extern void inet_csk_init_xmit_timers(struct sock *sk,
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 934396b..4f3b76f 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -955,8 +955,11 @@ #endif
 		 * receive buffer and there was a small segment
 		 * in queue.
 		 */
-		    (copied > 0 && (icsk->icsk_ack.pending & ICSK_ACK_PUSHED) &&
-		     !icsk->icsk_ack.pingpong && !atomic_read(&sk->sk_rmem_alloc)))
+		    (copied > 0 &&
+		     ((icsk->icsk_ack.pending & ICSK_ACK_PUSHED2) ||
+		      ((icsk->icsk_ack.pending & ICSK_ACK_PUSHED) &&
+		       !icsk->icsk_ack.pingpong)) &&
+		      !atomic_read(&sk->sk_rmem_alloc)))
 			time_to_ack = 1;
 	}
 
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 111ff39..5877920 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -156,6 +156,8 @@ static void tcp_measure_rcv_mss(struct s
 				return;
 			}
 		}
+		if (icsk->icsk_ack.pending & ICSK_ACK_PUSHED)
+			icsk->icsk_ack.pending |= ICSK_ACK_PUSHED2;
 		icsk->icsk_ack.pending |= ICSK_ACK_PUSHED;
 	}
 }
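
A condensed userland restatement of the patch's logic, with simplified
types (a sketch, not the kernel code itself):

enum { ACK_PUSHED = 4, ACK_PUSHED2 = 8 };

struct conn {
	unsigned ack_pending;	/* bitmask of ACK_* flags */
	int	 pingpong;	/* interactive-mode heuristic */
	int	 rmem;		/* bytes still queued for the application */
};

/* On each arriving small (sub-MSS) segment, cf. tcp_measure_rcv_mss():
 * the second un-ACKed small segment escalates PUSHED to PUSHED2. */
static void on_small_segment(struct conn *c)
{
	if (c->ack_pending & ACK_PUSHED)
		c->ack_pending |= ACK_PUSHED2;
	c->ack_pending |= ACK_PUSHED;
}

/* After read() drains the receive queue, cf. the tcp_recvmsg() cleanup:
 * PUSHED2 forces an ACK even in pingpong (interactive) mode, giving the
 * "soft" each-2nd-segment ACK that keeps the ACK clock ticking. */
static int should_ack_now(const struct conn *c)
{
	if (c->rmem != 0)
		return 0;
	if (c->ack_pending & ACK_PUSHED2)
		return 1;
	return (c->ack_pending & ACK_PUSHED) && !c->pingpong;
}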