Hi,

I am not sure whether my last email was filtered by the mailing list.

After disabling TSO, the throughput became even worse.
Here is the packet capture; please see Google Drive.

tcp_with_tso_off.pcapng.gz
<https://docs.google.com/file/d/0By8sTL79ob4tYXQ0N0lZN0FUNVE/edit?usp=drive_web>

Regards,
Niu Zhixiong
---------------
kaia...@gmail.com

On Sun, Aug 10, 2014 at 1:24 PM, Niu Zhixiong <kaia...@gmail.com> wrote:

> Hi,
> After disabled tso, the speed become even poorer.
> This is the packets captures. Plz see google drive.
>
> tcp_with_tso_off.pcapng.gz
> <https://docs.google.com/file/d/0By8sTL79ob4tYXQ0N0lZN0FUNVE/edit?usp=drive_web>
>
> On Sunday, August 10, 2014, John-Mark Gurney <j...@funkthat.com> wrote:
>
>> Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 11:48 +0800:
>> > I am using Intel I350-T4 NIC. The LRO is closed by default. And by the way,
>> > when I am using KVM-based virtual machine(virtio NIC) do the exactly same
>> > test. The results are same.
>>
>> Have you tried disabling tso?  I asked that in an earlier email, but
>> never heard from you if that changed anything...
>>
>> a lot of the trace looks like:
>> 19:29:57.223574 IP 10.0.10.2.61010 > 10.0.10.3.9000: . 251521:257313(5792) ack 1 win 32783 <nop,nop,timestamp 51563557 1047294279>
>> 19:29:57.223798 IP 10.0.10.3.9000 > 10.0.10.2.61010: . ack 257313 win 32745 <nop,nop,timestamp 1047294690 51563557>
>> 19:29:57.225570 IP 10.0.10.2.61010 > 10.0.10.3.9000: . 257313:263105(5792) ack 1 win 32783 <nop,nop,timestamp 51563557 1047294279>
>>
>> Notice how the ack comes back immediately, but for some reason, we decide to
>> wait almost 2ms before sending out the next frame...
>>
>> For some reason, we just aren't filling our window out...  tcptrace's
>> graphs shows the window at 2MB, but we only ever have 4 segments
>> outstanding at once...
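
The timing in the three tcpdump lines quoted above can be checked with a short script. This is only a rough sketch: the timestamps and sequence numbers are copied from the trace, while the 1448-byte segment size is an assumption (a 1500-byte MTU minus IP and TCP headers and the timestamp option), not something stated in the thread.

```python
from datetime import datetime

# Timestamps copied from the three tcpdump lines quoted in the thread.
DATA_1 = "19:29:57.223574"  # sender transmits 251521:257313
ACK_1 = "19:29:57.223798"   # receiver acks 257313
DATA_2 = "19:29:57.225570"  # sender transmits 257313:263105

# Payload carried by one captured frame, from the sequence numbers.
FRAME_BYTES = 257313 - 251521

# Assumption: 1500-byte MTU with TCP timestamps gives a 1448-byte segment.
ASSUMED_MSS = 1448


def seconds_between(start: str, end: str) -> float:
    """Return the elapsed seconds between two HH:MM:SS.ffffff timestamps."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


ack_latency = seconds_between(DATA_1, ACK_1)  # data sent -> ack seen
send_gap = seconds_between(ACK_1, DATA_2)     # ack seen -> next data sent
segments_per_frame = FRAME_BYTES // ASSUMED_MSS

print(f"ack latency:          {ack_latency * 1e6:.0f} us")
print(f"gap before next send: {send_gap * 1e3:.3f} ms")
print(f"bytes per frame:      {FRAME_BYTES}")
print(f"segments per frame:   {segments_per_frame} (assuming MSS {ASSUMED_MSS})")
```

Under that assumption each captured frame corresponds to four segments, which matches the "4 segments outstanding" remark, and the gap between the ack and the next transmission is just under 2 ms.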
>>
>> > ifconfig igb0
>> > igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>> >     options=403bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,VLAN_HWTSO>
>> >     ether a0:36:9f:38:27:d0
>> >     inet 10.0.10.3 netmask 0xffffff00 broadcast 10.0.10.255
>> >     inet6 fe80::a236:9fff:fe38:27d0%igb0 prefixlen 64 scopeid 0x1
>> >     nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>> >     media: Ethernet autoselect (1000baseT <full-duplex>)
>> >     status: active
>> >
>> > Regards,
>> > Niu Zhixiong
>> > ---------------
>> > kaia...@gmail.com
>> >
>> > On Sun, Aug 10, 2014 at 11:32 AM, John-Mark Gurney <j...@funkthat.com> wrote:
>> >
>> > > Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 10:50 +0800:
>> > > > I am sorry that I upload a WRONG SCTP capture. But, the throughput is same.
>> > > > SCTP is double than TCP, about 18Mbps.
>> > > >
>> > > > sctp_2.pcapng.gz
>> > > > <https://docs.google.com/file/d/0By8sTL79ob4tMlh4WDlTSndHX0k/edit?usp=drive_web>
>> > >
>> > > Ok, the owin graph is very interesting...  We do have a full 2MB window
>> > > on the receiver side, but for some reason, we only ever have just under
>> > > 6k outstanding on the connection...
>> > >
>> > > So, it looks like we send for a short period of time, and then stop
>> > > sending...  Do you have LRO enabled?  I think it might be related to:
>> > > https://svnweb.freebsd.org/changeset/base/r256920
>> > >
>> > > As I'm seeing >100ms gaps where the sender doesn't send any data, and
>> > > as soon as more than one ack comes in, the next segment goes out...  If
>> > > we only receive a single ack, then we wait for a timeout before sending
>> > > the next segment..
>> > >
>> > > Can you try to disable LRO on the receiving host?
>> > >
>> > > ifconfig <iface> -lro
>> > >
>> > > And see if that helps...  If it does...
>> > > Applying the patch, or compiling
>> > > a more recent kernel from stable/10 that is after r257367 as that is was
>> > > the date that the change was merged...
>> > >
>> > > > On Sun, Aug 10, 2014 at 10:42 AM, Niu Zhixiong <kaia...@gmail.com> wrote:
>> > > >
>> > > > > I am sure that wnd is about 2MB all the time.
>> > > > > This is my latest capture, plz see Google Drive.
>> > > > > In the latest test, TCP(0s-120s) is about 9Mbps and SCTP(0s-120s) is about
>> > > > > 18Mbps.
>> > > > > (The bandwidth(20Mbps) and delay(200ms) is set by dummynet)
>> > > > > The SCTP and TCP are tested in same environment.
>> > > > >
>> > > > > sctp.pcapng.gz
>> > > > > <https://docs.google.com/file/d/0By8sTL79ob4tYl9sM2V5a19iNVU/edit?usp=drive_web>
>> > > > >
>> > > > > tcp.pcapng.gz
>> > > > > <https://docs.google.com/file/d/0By8sTL79ob4tV0NMR1FYLUQ3MWs/edit?usp=drive_web>
>> > > > >
>> > > > > Regards,
>> > > > > Niu Zhixiong
>> > > > > ---------------
>> > > > > kaia...@gmail.com
>> > > > >
>> > > > > On Sun, Aug 10, 2014 at 10:23 AM, John-Mark Gurney <j...@funkthat.com> wrote:
>> > > > >
>> > > > >> Niu Zhixiong wrote this message on Sun, Aug 10, 2014 at 10:12 +0800:
>> > > > >> > During the TCP4 transmission.
>> > > > >> > Proto Recv-Q Send-Q  Local Address      Foreign Address    (state)
>> > > > >> > tcp4       0 2097346 10.0.10.2.13504    10.0.10.3.9000     ESTABLISHED
>> > > > >>
>> > > > >> Ok, so you are getting a full 2MB in there, and w/ that, you should
>> > > > >> easily be saturating your pipe...
>> > > > >>
>> > > > >> The next thing would be to get a tcpdump, and take a look at the
>> > > > >> window size..  Wireshark has lots of neat tools to make this analysis
>> > > > >> easy...  Another tool that is good is tcptrace..
>> > > > >> It can output a
>> > > > >> variety of different graphs that will help you track down, and see
>> > > > >> what part of the system is the problem...
>> > > > >>
>> > > > >> You probably only need a few tens of seconds of the tcpdump...
>> > > > >>
>> > > > >> > On Sun, Aug 10, 2014 at 4:58 AM, Michael Tuexen <michael.tue...@lurchi.franken.de> wrote:
>> > > > >> >
>> > > > >> > > On 09 Aug 2014, at 22:45, John-Mark Gurney <j...@funkthat.com> wrote:
>> > > > >> > >
>> > > > >> > > > Michael Tuexen wrote this message on Sat, Aug 09, 2014 at 21:51 +0200:
>> > > > >> > > >>
>> > > > >> > > >> On 09 Aug 2014, at 20:42, John-Mark Gurney <j...@funkthat.com> wrote:
>> > > > >> > > >>
>> > > > >> > > >>> Niu Zhixiong wrote this message on Fri, Aug 08, 2014 at 20:34 +0800:
>> > > > >> > > >>>> Dear all,
>> > > > >> > > >>>>
>> > > > >> > > >>>> Last month, I send problems related to FTP/TCP in a high RTT environment.
>> > > > >> > > >>>> After that, I setup a simulation environment(Dummynet) to test TCP and SCTP
>> > > > >> > > >>>> in high delay environment. After finishing the test, I can see TCP is
>> > > > >> > > >>>> always slower than SCTP. But, I think it is not possible. (Plz see the
>> > > > >> > > >>>> figure in the attachment). When the delay is 200ms(means RTT=400ms).
>> > > > >> > > >>>> Besides, the TCP is extremely slow.
>> > > > >> > > >>>>
>> > > > >> > > >>>> ALL BW=20Mbps, DELAY= 0 ~ 200MS, Packet LOSS = 0 (by dummynet)
>> > > > >> > > >>>>
>> > > > >> > > >>>> This is my parameters:
>> > > > >> > > >>>> FreeBSD vfreetest0 10.0-RELEASE FreeBSD 10.0-RELEASE #0: Thu Aug 7
>> > > > >> > > >>>> 11:04:15 HKT 2014
>> > > > >> > > >>>>
>> > > > >> > > >>>> sysctl net.inet.tcp
>> > > > >> > > >>>
>> > > > >> > > >>> [...]
>> > > > >> > > >>>
>> > > > >> > > >>>> net.inet.tcp.recvbuf_auto: 0
>> > > > >> > > >>>
>> > > > >> > > >>> [...]
>> > > > >> > > >>>
>> > > > >> > > >>>> net.inet.tcp.sendbuf_auto: 0
>> > > > >> > > >>>
>> > > > >> > > >>> Try enabling this...  This should allow the buffer to grow large enough
>> > > > >> > > >>> to deal w/ the higher latency...
>> > > > >> > > >>>
>> > > > >> > > >>> Also, make sure your program isn't setting the recv buffer size as that
>> > > > >> > > >>> will disable the auto growing...
>> > > > >> > > >> I think the program sets the buffer to 2MB, which it also does for SCTP.
>> > > > >> > > >> So having both statically at the same size makes sense for the comparison.
>> > > > >> > > >> I remember that there was a bug in the combination of LRO and delayed ACK,
>> > > > >> > > >> which was fixed, but I don't remember it was fixed before 10.0...
>> > > > >> > > >
>> > > > >> > > > Sounds like disabling LRO and TSO would be a useful test to see if that
>> > > > >> > > > improves things...  But hiren said that the fix made it, so...
>> > > > >> > > >
>> > > > >> > > >>> If you use netstat -a, you should be able to see the send-q on the
>> > > > >> > > >>> sender grow as necessary...
>> > > > >> > > >
>> > > > >> > > > Also, getting the send-q output while it's running would let us know
>> > > > >> > > > if the buffer is getting to 2MB or not...
>> > > > >> > > That is correct. Niu: Can you provide this?
>> > >
>> > > --
>> > > John-Mark Gurney                              Voice: +1 415 225 5579
>> > >
>> > >      "All that I will do, has been done, All that I have, has not."
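
As a side note on the buffer-size discussion in the thread, the bandwidth-delay product for the dummynet settings can be worked out with a few lines of Python. This is a minimal sketch, not a measurement: the 20 Mbit/s and 400 ms RTT figures come from the thread, while reading "2MB" as 2 MiB is an assumption (it is close to the Send-Q value of 2097346 reported by netstat), and the function names are only illustrative.

```python
# Values from the thread: dummynet pipe of 20 Mbit/s and 200 ms delay in
# each direction, i.e. RTT = 400 ms.
LINK_BPS = 20_000_000
RTT_S = 0.400

# Assumption: the "2MB" socket buffer is 2 MiB.
SOCKBUF_BYTES = 2 * 1024 * 1024


def bdp_bytes(link_bps: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep the link busy for one RTT."""
    return link_bps * rtt_s / 8


def window_limited_bps(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on throughput when at most window_bytes are in flight."""
    return window_bytes * 8 / rtt_s


bdp = bdp_bytes(LINK_BPS, RTT_S)
buffer_covers_bdp = SOCKBUF_BYTES >= bdp
buffer_limit_mbps = window_limited_bps(SOCKBUF_BYTES, RTT_S) / 1e6

print(f"bandwidth-delay product: {bdp:.0f} bytes")
print(f"2 MiB buffer covers BDP: {buffer_covers_bdp}")
print(f"rate a 2 MiB window allows at this RTT: {buffer_limit_mbps:.1f} Mbit/s")
```

On these numbers the statically configured buffer is larger than the bandwidth-delay product, which is consistent with the remark in the thread that a full 2MB send queue should be enough to saturate the pipe.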
_______________________________________________
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"