Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?)

2001-12-03 Thread Darryl Okahata

Nate Williams [EMAIL PROTECTED] wrote:

 Unfortunately, I'm unable to run tcpdump on the client, since it's
 running NT and we're not allowed to install any 3rd party apps on it
 (such as the WinDump package).

 NT???  You wouldn't happen to be seeing performance problems with
Samba, I hope?  There are some known Samba/FreeBSD issues that can cause
abysmal performance (~30-40KB/sec -- yes, kilobytes/sec), even with
100BT cards.  No hangs or aborts, though.  I've only seen this problem
with DOS network stacks, but it wouldn't surprise me if it also affected
some version of NT.

-- 
Darryl Okahata
[EMAIL PROTECTED]

DISCLAIMER: this message is the author's personal opinion and does not
constitute the support, opinion, or policy of Agilent Technologies, or
of the little green men that have been following him all day.


To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?)

2001-12-03 Thread Nate Williams

  Unfortunately, I'm unable to run tcpdump on the client, since it's
  running NT and we're not allowed to install any 3rd party apps on it
  (such as the WinDump package).
 
  NT???  You wouldn't happen to be seeing performance problems with
 Samba, I hope?

We're not using Samba over 100's of miles. :)

 There are some known Samba/FreeBSD issues that can cause
 abysmal performance (~30-40KB/sec -- yes, kilobytes/sec), even with
 100BT cards.

This may be due to problems that Matt Dillon fixed just this past
weekend in FreeBSD's TCP/IP stack.



Nate




Re: Patch #2 (was Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?))

2001-12-02 Thread Danny Braniss

hum, do i get a speed ticket?
i did some tests before applying your patches:

lizard ./tbench 1 dev
.1 clients started
..+*
Throughput 6.10567 MB/sec (NB=7.63209 MB/sec  61.0567 MBit/sec)
lizard ./tbench 2 dev
..2 clients started
++**
Throughput 7.71796 MB/sec (NB=9.64745 MB/sec  77.1796 MBit/sec)
lizard ./tbench 3 dev
...3 clients started
.++.+***
Throughput 9.17012 MB/sec (NB=11.4627 MB/sec  91.7012 MBit/sec)
lizard ./tbench 4 dev
4 clients started

Throughput 10.3365 MB/sec (NB=12.9207 MB/sec  103.365 MBit/sec)
lizard ./tbench 5 dev
.5 clients started
.+...++.+.+
*
Throughput 10.5219 MB/sec (NB=13.1523 MB/sec  105.219 MBit/sec)

 Richard (and others), please try this patch.  With this patch I
 get the following between two machines connected via a 100BaseTX
 switch (full duplex):
 
   
 
 test1:/home/dillon/dbench ./tbench 1 test2
 .1 clients started
 ..+*
 Throughput 6.13925 MB/sec (NB=7.67406 MB/sec  61.3925 MBit/sec)  1 procs
 test1:/home/dillon/dbench ./tbench 2 test2
 ..2 clients started
 ++**
 Throughput 8.37795 MB/sec (NB=10.4724 MB/sec  83.7795 MBit/sec)  2 procs
 
   
 
  On localhost I get:
 
   
 
 test1:/home/dillon/dbench ./tbench 1 localhost
 .1 clients started
 ..+*
 Throughput 25.7156 MB/sec (NB=32.1445 MB/sec  257.156 MBit/sec)  1 procs
 test1:/home/dillon/dbench ./tbench 2 localhost
 ..2 clients started
 ++**
 Throughput 36.5428 MB/sec (NB=45.6785 MB/sec  365.428 MBit/sec)  2 procs
 test1:/home/dillon/dbench 
 
   
 
 This is WITHOUT changing the default send and receive tcp buffers..
 they're both 16384.
 
  The bug I found is that when recv() is used with MSG_WAITALL, 
  which is what tbench does, soreceive() will block waiting for all
  available input WITHOUT ever calling pr->pr_usrreqs->pru_rcvd(),
  which means that if the sender filled up the receive buffer (16K default)
  the receiver will never ack the 0 window... that is until the idle code
  takes over after 5 seconds.
 
   -Matt
 
 Index: uipc_socket.c
 ===================================================================
 RCS file: /home/ncvs/src/sys/kern/uipc_socket.c,v
 retrieving revision 1.68.2.16
 diff -u -r1.68.2.16 uipc_socket.c
 --- uipc_socket.c	2001/06/14 20:46:06	1.68.2.16
 +++ uipc_socket.c	2001/12/01 21:09:13
 @@ -910,6 +910,14 @@
  	    !sosendallatonce(so) && !nextrecord) {
  		if (so->so_error || so->so_state & SS_CANTRCVMORE)
  			break;
 +		/*
 +		 * The window might have closed to zero, make
 +		 * sure we send an ack now that we've drained
 +		 * the buffer or we might end up blocking until
 +		 * the idle takes over (5 seconds).
 +		 */
 +		if (pr->pr_flags & PR_WANTRCVD && so->so_pcb)
 +			(*pr->pr_usrreqs->pru_rcvd)(so, flags);
  		error = sbwait(&so->so_rcv);
  		if (error) {
  			sbunlock(&so->so_rcv);
 
 






Re: Patch #2 (was Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?))

2001-12-02 Thread Matthew Dillon

That's reasonable with patch #2 and delayed acks turned off (assuming
this is a 100BaseTX network you are testing on).  You should get 
slightly better results with Patch #3 and delayed acks turned on.

-Matt

:hum, do i get a speed ticket?
:i did some tests before applying your patches:
:
:lizard ./tbench 1 dev
:.1 clients started
:..+*
:Throughput 6.10567 MB/sec (NB=7.63209 MB/sec  61.0567 MBit/sec)
:lizard ./tbench 2 dev
:..2 clients started
:++**
:Throughput 7.71796 MB/sec (NB=9.64745 MB/sec  77.1796 MBit/sec)
:lizard ./tbench 3 dev
:...3 clients started
:.++.+***
:Throughput 9.17012 MB/sec (NB=11.4627 MB/sec  91.7012 MBit/sec)
:lizard ./tbench 4 dev
:4 clients started
:
:Throughput 10.3365 MB/sec (NB=12.9207 MB/sec  103.365 MBit/sec)
:lizard ./tbench 5 dev
:.5 clients started
:.+...++.+.+
:*
:Throughput 10.5219 MB/sec (NB=13.1523 MB/sec  105.219 MBit/sec)




Re: Patch #2 (was Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?))

2001-12-02 Thread Danny Braniss

 That's reasonable with patch #2 and delayed acks turned off (assuming
 this is a 100BaseTX network you are testing on).  You should get 
 slightly better results with Patch #3 and delayed acks turned on.
 
   -Matt
i did not apply any patches - yet -; net.inet.tcp.newreno is 1 on both hosts.
they are connected at 100mb, so the strange numbers are when the number of
clients is > 3!  (at 100Mbit/sec, how come i get > 100?)

	danny
PS: just forgot, congrats on a terrific job!!! I'm still amazed

 
 :hum, do i get a speed ticket?
 :i did some tests before applying your patches:
 :
 :lizard ./tbench 1 dev
 :.1 clients started
 :..+*
 :Throughput 6.10567 MB/sec (NB=7.63209 MB/sec  61.0567 MBit/sec)
 :lizard ./tbench 2 dev
 :..2 clients started
 :++**
 :Throughput 7.71796 MB/sec (NB=9.64745 MB/sec  77.1796 MBit/sec)
 :lizard ./tbench 3 dev
 :...3 clients started
 :.++.+***
 :Throughput 9.17012 MB/sec (NB=11.4627 MB/sec  91.7012 MBit/sec)
 :lizard ./tbench 4 dev
 :4 clients started
 :
 :Throughput 10.3365 MB/sec (NB=12.9207 MB/sec  103.365 MBit/sec)
 :lizard ./tbench 5 dev
 :.5 clients started
 :.+...++.+.+
 :*
 :Throughput 10.5219 MB/sec (NB=13.1523 MB/sec  105.219 MBit/sec)
 
 






Re: Patch #2 (was Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?))

2001-12-02 Thread Matthew Dillon

: slightly better results with Patch #3 and delayed acks turned on.
: 
:  -Matt
:i did not apply any patches - yet -; net.inet.tcp.newreno is 1 on both hosts.
:they are connected at 100mb, so the strange numbers are when the number of
:clients is > 3! (at 100Mbit/sec, how come i get > 100?)

   It's a synchronous protocol, so the number of bits per byte is not
   necessarily 10... it can be 8, 9, or 10.  Approximately.

:   danny
:PS: just forgot, congrat on a terrific job!!! im still amazed

-Matt





Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?)

2001-12-01 Thread Richard Sharpe

Matthew Dillon wrote:

 Index: tcp_output.c
 ===================================================================
 RCS file: /home/ncvs/src/sys/netinet/tcp_output.c,v
 retrieving revision 1.39.2.10
 diff -u -r1.39.2.10 tcp_output.c
 --- tcp_output.c	2001/07/07 04:30:38	1.39.2.10
 +++ tcp_output.c	2001/11/30 21:18:10
 @@ -912,7 +912,14 @@
  		tp->t_flags &= ~TF_ACKNOW;
  		if (tcp_delack_enabled)
  			callout_stop(&tp->tt_delack);
 +#if 0
 +	/*
 +	 * This completely breaks TCP if newreno is turned on
 +	 */
  	if (sendalot && (!tcp_do_newreno || --maxburst))
 +		goto again;
 +#endif
 +	if (sendalot)
  		goto again;
  	return (0);
  }
 
 
OK, I have applied this patch, and FreeBSD 4.4-STABLE now seems to behave 
approximately the same as Linux. There are no extra ACKs, and FreeBSD now coalesces 
pairs of ACKs.


However, performance for one client is still at 25Mb/s with the tbench run, while 
Linux provides around 68Mb/s.


So, it is back to staring at traces. Perhaps I will get a full trace now.


-- 
Richard Sharpe, [EMAIL PROTECTED], LPIC-1
www.samba.org, www.ethereal.com, SAMS Teach Yourself Samba
in 24 Hours, Special Edition, Using Samba





Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?)

2001-12-01 Thread Matthew Dillon


!@#%@#^%^$#^@#$%#@!$%!@%@#@# Who's the blasted bozo that broke TCP!  :-)

:-(

I believe I have found another problem.  It may not be related to this
one but I sure hope it is.

When the receiver's window contracts to zero, causing the sender to
stop transmitting, TCP stalls for 5 seconds.  In other words, some bozo
broke the window restart code: when the receiver drains its buffer,
NO ACK IS SENT INDICATING THAT THE WINDOW HAS OPENED UP!  It's only
luck that the TCP idle code picks it up and acks it after 5 seconds.

Somebody broke something horrendously in the TCP stack.

!$#%@#$^%#@^@#%!@#$@!#$@#$@#$  blast it!  I'm going to fry the responsible
party when I get this tracked down.

-Matt

test1# tail x
12:44:25.629324 216.240.41.12.7003 > 216.240.41.11.1056: . ack 10417344 win 16060 (DF)
12:44:25.629344 216.240.41.11.1056 > 216.240.41.12.7003: P 10430484:10431944(1460) ack 47215 win 17520 (DF)
12:44:25.629374 216.240.41.11.1056 > 216.240.41.12.7003: . 10431944:10433404(1460) ack 47215 win 17520 (DF)
12:44:25.629566 216.240.41.12.7003 > 216.240.41.11.1056: . ack 10420264 win 14600 (DF)
12:44:25.629587 216.240.41.11.1056 > 216.240.41.12.7003: P 10433404:10434864(1460) ack 47215 win 17520 (DF)
12:44:25.629810 216.240.41.12.7003 > 216.240.41.11.1056: . ack 10423184 win 11680 (DF)
12:44:25.630056 216.240.41.12.7003 > 216.240.41.11.1056: . ack 10426104 win 8760 (DF)
12:44:25.630302 216.240.41.12.7003 > 216.240.41.11.1056: . ack 10429024 win 5840 (DF)
12:44:25.630548 216.240.41.12.7003 > 216.240.41.11.1056: . ack 10431944 win 2920 (DF)
12:44:25.630794 216.240.41.12.7003 > 216.240.41.11.1056: . ack 10434864 win 0 (DF)
(stall)





Patch #2 (was Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?))

2001-12-01 Thread Matthew Dillon

Richard (and others), please try this patch.  With this patch I
get the following between two machines connected via a 100BaseTX
switch (full duplex):



test1:/home/dillon/dbench ./tbench 1 test2
.1 clients started
..+*
Throughput 6.13925 MB/sec (NB=7.67406 MB/sec  61.3925 MBit/sec)  1 procs
test1:/home/dillon/dbench ./tbench 2 test2
..2 clients started
++**
Throughput 8.37795 MB/sec (NB=10.4724 MB/sec  83.7795 MBit/sec)  2 procs



 On localhost I get:



test1:/home/dillon/dbench ./tbench 1 localhost
.1 clients started
..+*
Throughput 25.7156 MB/sec (NB=32.1445 MB/sec  257.156 MBit/sec)  1 procs
test1:/home/dillon/dbench ./tbench 2 localhost
..2 clients started
++**
Throughput 36.5428 MB/sec (NB=45.6785 MB/sec  365.428 MBit/sec)  2 procs
test1:/home/dillon/dbench 



This is WITHOUT changing the default send and receive tcp buffers..
they're both 16384.

The bug I found is that when recv() is used with MSG_WAITALL, 
which is what tbench does, soreceive() will block waiting for all
available input WITHOUT ever calling pr->pr_usrreqs->pru_rcvd(),
which means that if the sender filled up the receive buffer (16K default)
the receiver will never ack the 0 window... that is until the idle code
takes over after 5 seconds.

-Matt

Index: uipc_socket.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/uipc_socket.c,v
retrieving revision 1.68.2.16
diff -u -r1.68.2.16 uipc_socket.c
--- uipc_socket.c	2001/06/14 20:46:06	1.68.2.16
+++ uipc_socket.c	2001/12/01 21:09:13
@@ -910,6 +910,14 @@
 	    !sosendallatonce(so) && !nextrecord) {
 		if (so->so_error || so->so_state & SS_CANTRCVMORE)
 			break;
+		/*
+		 * The window might have closed to zero, make
+		 * sure we send an ack now that we've drained
+		 * the buffer or we might end up blocking until
+		 * the idle takes over (5 seconds).
+		 */
+		if (pr->pr_flags & PR_WANTRCVD && so->so_pcb)
+			(*pr->pr_usrreqs->pru_rcvd)(so, flags);
 		error = sbwait(&so->so_rcv);
 		if (error) {
 			sbunlock(&so->so_rcv);




Re: Patch #2 (was Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?))

2001-12-01 Thread Matthew Dillon


Oh, I forgot to mention...  there's still a bug.  You have to turn off
delayed acks to get the below numbers.  Otherwise the window 0 recovery
ack ... gets delayed.  Joy.  It's easy to fix, but I'm off to see 
Potter so it will be tonight.

-Matt

:Richard (and others), please try this patch.  With this patch I
:get the following between two machines connected via a 100BaseTX
:switch (full duplex):
:
:   
:
:test1:/home/dillon/dbench ./tbench 1 test2
:.1 clients started
:..+*
:Throughput 6.13925 MB/sec (NB=7.67406 MB/sec  61.3925 MBit/sec)  1 procs
:test1:/home/dillon/dbench ./tbench 2 test2
:..2 clients started
:++**
:Throughput 8.37795 MB/sec (NB=10.4724 MB/sec  83.7795 MBit/sec)  2 procs
:
:   
:
: On localhost I get:
:
:   
:
:test1:/home/dillon/dbench ./tbench 1 localhost
:.1 clients started
:..+*
:Throughput 25.7156 MB/sec (NB=32.1445 MB/sec  257.156 MBit/sec)  1 procs
:test1:/home/dillon/dbench ./tbench 2 localhost
:..2 clients started
:++**
:Throughput 36.5428 MB/sec (NB=45.6785 MB/sec  365.428 MBit/sec)  2 procs
:test1:/home/dillon/dbench 
:
:   
:
:This is WITHOUT changing the default send and receive tcp buffers..
:they're both 16384.
:
:The bug I found is that when recv() is used with MSG_WAITALL, 
:which is what tbench does, soreceive() will block waiting for all
:available input WITHOUT ever calling pr->pr_usrreqs->pru_rcvd(),
:which means that if the sender filled up the receive buffer (16K default)
:the receiver will never ack the 0 window... that is until the idle code
:takes over after 5 seconds.
:
:   -Matt




Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?)

2001-11-30 Thread Matthew Dillon

I believe I have found the problem.  The transmit side has a maximum
burst count imposed by newreno.  As far as I can tell, if this maxburst
is hit (it defaults to 4 packets), the transmitter just stops - presumably
until it receives an ack.

Now, theoretically this should work just fine... send four packets,
receive the first ack and send the next four packets... it should allow
us to fill the window geometrically.   I believe the idea is to give
transmit packets a chance to include acks for received data in a 
reasonable period of time... I'm not sure, it's J Lemon's commit (from
the original newreno commits) so maybe he can work it out.  However, if
the receiver has delayed-acks turned on only one ack is returned for all
four packets.  The next four are then sent and one ack is returned.

I believe this is the cause of the problem.  It effectively destroys the
TCP window, forcing it to around 1.5Kx4 = 6K.  This also explains 
why performance is so weird... if more than one delayed ack
happens to occur per burst you get 'bumps' in the performance.

Without the patch, two things will solve or partially solve the problem:

* Turn off delayed acks on the receiver (performance 80K -> 6.8MB/sec)

OR

* Turn off newreno on the transmitter.  (performance 80K -> 7.9MB/sec)

The patch below kills the burst limit on the transmit side and appears
to solve the problem permanently.  I'm sure I'm breaking something in
the newreno RFC, but I am going to commit it to both branches now because
our current situation is horrible.

-Matt

Index: tcp_output.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/tcp_output.c,v
retrieving revision 1.39.2.10
diff -u -r1.39.2.10 tcp_output.c
--- tcp_output.c	2001/07/07 04:30:38	1.39.2.10
+++ tcp_output.c	2001/11/30 21:18:10
@@ -912,7 +912,14 @@
 		tp->t_flags &= ~TF_ACKNOW;
 		if (tcp_delack_enabled)
 			callout_stop(&tp->tt_delack);
+#if 0
+	/*
+	 * This completely breaks TCP if newreno is turned on
+	 */
 	if (sendalot && (!tcp_do_newreno || --maxburst))
+		goto again;
+#endif
+	if (sendalot)
 		goto again;
 	return (0);
 }




Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?)

2001-11-30 Thread Nate Williams


 I believe I have found the problem.  The transmit side has a
 maximum burst count imposed by newreno.  As far as I can tell, if
 this maxburst is hit (it defaults to 4 packets), the transmitter
 just stops - presumably until it receives an ack.

Note, my experiences (and John Capo's) are showing degraded performance
when *NOT* on a LAN segment.  In other words, when packet loss enters
the mix, performance tends to fall off rather quickly.

This is with or without newreno (which should theoretically help with
packet loss).  John claims that disabling delayed_ack doesn't seem to
affect his performance, and I've not been able to verify if delayed_ack
helps/hurts in my situation, since the testers have been pressed for
time so I can't get them to iterate through the different settings.

I do however have some packet dumps, although I'm not sure they will
tell anything. :(



Nate




Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?)

2001-11-30 Thread Matthew Dillon

:Note, my experiences (and John Capo's) are showing degraded performance
:when *NOT* on a LAN segment.  In other words, when packet loss enters
:the mix, performance tends to fall off rather quickly.
:
:This is with or without newreno (which should theoretically help with
:packet loss).  John claims that disabling delayed_ack doesn't seem to
:affect his performance, and I've not been able to verify if delayed_ack
:helps/hurts in my situation, since the testers have been pressed for
:time so I can't get them to iterate through the different settings.
:
:I do however have some packet dumps, although I'm not sure they will
:tell anything. :(
:
:Nate

Packet loss will screw up TCP performance no matter what you do.  
NewReno, assuming it is working properly, can improve performance
for that case but it will not completely solve the problem (nothing will).
Remember that our timers are only good to around 20ms by default, so 
even the best retransmission case is going to create a serious hiccup.

The question here is... is it actually packet loss that is creating
this issue for you and John, or is it something else?  The only way
to tell for sure is to run tcpdump on BOTH the client and server
and then observe whether packet loss is occurring by comparing the dumps.

I would guess that turning off delayed-acks will improve performance
in the face of packet loss, since a lost ack packet in that case will
not be as big an issue.

-Matt





Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?)

2001-11-30 Thread Nate Williams

 :Note, my experiences (and John Capo's) are showing degraded performance
 :when *NOT* on a LAN segment.  In other words, when packet loss enters
 :the mix, performance tends to fall off rather quickly.
 :
 :This is with or without newreno (which should theoretically help with
 :packet loss).  John claims that disabling delayed_ack doesn't seem to
 :affect his performance, and I've not been able to verify if delayed_ack
 :helps/hurts in my situation, since the testers have been pressed for
 :time so I can't get them to iterate through the different settings.
 :
 :I do however have some packet dumps, although I'm not sure they will
 :tell anything. :(
 :
 :Nate
 
 Packet loss will screw up TCP performance no matter what you do.  

I know, dealing with that issue is my day job. :)

My point is that older FreeBSD releases (and newer Linux releases) seem
to be dealing with it in a more sane manner.  At least, it didn't affect
performance nearly as much as it does in newer releases.

 NewReno, assuming it is working properly, can improve performance
 for that case but it will not completely solve the problem
 (nothing will).  Remember that our timers are only good to around
 20ms by default, so even the best retransmission case is going to
 create a serious hiccup.

See above.

 The question here is... is it actually packet loss that is creating
 this issue for you and John, or is it something else?

In my opinion, it's how the TCP stack recovers from packet loss that is
the problem.

 The only way
 to tell for sure is to run tcpdump on BOTH the client and server
 and then observe whether packet loss is occurring by comparing the dumps.

Unfortunately, I'm unable to run tcpdump on the client, since it's
running NT and we're not allowed to install any 3rd party apps on it
(such as the WinDump package).

I'm not saying that I expect the same results as I get on the LAN
segment, but I *am* expecting results that are equivalent to what we
were seeing with FreeBSD 3.x, and in the same ballpark as (or better
than) the Linux systems sitting next to it.

Given that I get great LAN results, I no longer suspect an Ethernet
autonegotiation problem, since I can get almost wire-speed with local
nodes, and close to maximum performance with our wireless products when
the network segment the FreeBSD server is on is relatively idle.

 I would guess that turning off delayed-acks will improve performance
 in the face of packet loss, since a lost ack packet in that case will
 not be as big an issue.

I'm not sure I agree.  I wouldn't expect it would help/hinder the
performance assuming a correctly performing stack, *UNLESS* the packet
loss was completely due to congestion.  In that case, delayed-acks *may*
improve things, but I doubt it would help much with TCP backoff and
such.




Nate




Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?)

2001-11-30 Thread Jonathan Lemon

On Fri, Nov 30, 2001 at 04:28:32PM -0600, Alfred Perlstein wrote:
 * Matthew Dillon [EMAIL PROTECTED] [011130 16:02] wrote:
  
  Packet loss will screw up TCP performance no matter what you do.  
  NewReno, assuming it is working properly, can improve performance
  for that case but it will not completely solve the problem (nothing will).
  Remember that our timers are only good to around 20ms by default, so 
  even the best retransmission case is going to create a serious hiccup.
  
  The question here is... is it actually packet loss that is creating
  this issue for you and John, or is it something else?  The only way
  to tell for sure is to run tcpdump on BOTH the client and server
  and then observe whether packet loss is occurring by comparing the dumps.
  
  I would guess that turning off delayed-acks will improve performance
  in the face of packet loss, since a lost ack packet in that case will
  not be as big an issue.
 
 I have an odd theory that makes use of my waning remembrance of the
 stack behavior; this may be totally off base, but I'd appreciate it
 if you guys would consider this scenario, if only to put my mind
 at ease.
 
 I seem to remember several places in the stack that detect what looks
 like a hiccup and immediately begin sending a sequence of ACKs in
 order to trigger the other side's fast retransmit code.  One of the
 things that I don't remember seeing is whether that state is persistent.

There isn't anything in the receiver side that does this; ACKs are sent
in response to incoming packets.  However, state is maintained on the
sender side as to whether we are performing fast retransmit or not.
-- 
Jonathan




Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?)

2001-11-30 Thread Alfred Perlstein

* Jonathan Lemon [EMAIL PROTECTED] [011130 17:00] wrote:
 On Fri, Nov 30, 2001 at 04:28:32PM -0600, Alfred Perlstein wrote:
  
  I have an odd theory that makes use of my waning remembrance of the
  stack behavior; this may be totally off base, but I'd appreciate it
  if you guys would consider this scenario, if only to put my mind
  at ease.
  
  I seem to remember several places in the stack that detect what looks
  like a hiccup and immediately begin sending a sequence of ACKs in
  order to trigger the other side's fast retransmit code.  One of the
  things that I don't remember seeing is whether that state is persistent.
 
 There isn't anything in the receiver side that does this; ACKs are sent
 in response to incoming packets.  However, state is maintained on the
 sender side as to whether we are performing fast retransmit or not.

Either you don't follow or my concept of what happens is off.

What I'm saying is this; consider each pair to be one step
in time:

h1 send: p1 p2 p3
h2 recv: p1 p3

h1 recv: (nothing acks lost)
h2 send: ack1 ack1 ack1 (dude, i missed a packet)

h1 send: (nothing, waiting for ack)
h2 send: (nothing, waiting for retransmit)

h1 send: p1 p2 p3 (ack timed out)
h2 send: (nothing, waiting for retransmit)

what should happen is this: 

h1 send: p1 p2 p3
h2 recv: p1 p3

h1 recv: (nothing acks lost)
h2 send: ack1 ack1 ack1 (dude, i missed a packet)

h2 send: ack1 ack1 ack1 (dude, i missed a packet)
h1 recv: ack1 ack1 ack1
h1 send: p2 p3

Basically, will the receiver keep acking not just when 'it detects packet
loss', but rather 'as long as packets are lost'?

-- 
-Alfred Perlstein [[EMAIL PROTECTED]]
'Instead of asking why a piece of software is using 1970s technology,
 start asking why software is ignoring 30 years of accumulated wisdom.'
   http://www.morons.org/rants/gpl-harmful.php3




Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?)

2001-11-30 Thread Matthew Dillon

:what should happen is this: 
:
:h1 send: p1 p2 p3
:h2 recv: p1 p3
:
:h1 recv: (nothing acks lost)
:h2 send: ack1 ack1 ack1 (dude, i missed a packet)
:
:h2 send: ack1 ack1 ack1 (dude, i missed a packet)
:h1 recv: ack1 ack1 ack1
:h1 send: p2 p3
:
:Basically, will the receiver keep acking not just when 'it detects packet loss',
:but rather 'as long as packets are lost'?
:
:-- 
:-Alfred Perlstein [[EMAIL PROTECTED]]

Yuch.  That won't help.   Basically you are taking a brute-force
approach: send the ack a whole bunch of times in case some of them
get lost.  Such an approach does not typically work very well.  For
example, if the packet loss occurred due to link congestion your
solution will actually make the link more congested rather than less.
If there is significant latency in the path the acks can get into
a following run with the transmitter, making the transmitter believe
that the packet loss is worse than it actually is and responding in kind,
resulting in even more incorrect acks.

-Matt
Matthew Dillon 
[EMAIL PROTECTED]




Re: Found the problem, w/patch (was Re: FreeBSD performing worse than Linux?)

2001-11-30 Thread John Capo

Quoting Matthew Dillon ([EMAIL PROTECTED]):
 
 The question here is... is it actually packet loss that is creating
 this issue for you and John, or is it something else?  The only way
 to tell for sure is to run tcpdump on BOTH the client and server
 and then observe whether packet loss is occurring by comparing the dumps.
 

Packet loss is the problem for sure.  I am dumping on the server and
client side:

http://www.irbs.net/server-dump.html
http://www.irbs.net/client-dump.html

In 60ms the server pushed out about 200 segments.  My test writes
1 byte at a time on an existing ssh connection, so the payload per
segment is small, 48 bytes.  (48 + IP + TCP) * 200 is around 17KB
in 60ms, which probably overflowed the frame switch's queue.  The
client is on a fractional T1; the server is on a 10Mb - OC3
connection 1200 network miles away.

Jonathan Lemon pointed out in the TCP Anomalies thread that slow
start seems to be broken.

John Capo

