Network anomalies after update from 11.2 STABLE to 12.1 STABLE
Our current version is:

FreeBSD 11.2-STABLE #0 r340725

New version that we have problems with:

FreeBSD 12.1-STABLE #5 r352893

After updating to the new version we started to observe a very large number of errors in HTTP requests between various services in our system. The problem appeared on all the servers that were upgraded and does not seem to be specific to a particular network card: we use different models, and all are affected.

During various tests we observed a lot of spontaneous TCP stream aborts, including at the establishment stage (SYN), in cases that were 100% issue-free on 11.2-STABLE. Concrete test cases are shown below.

We also want to highlight that, on numerous occasions, we have observed random, huge ACK numbers in the first response to a SYN packet, instead of 1, as expected. This forces the client to abort the connection via RST.

At first glance it looks like a race in the kernel, because the problem disappears when:
* we use `dev.ixl.0.iflib.override_nrxqs=1` and `dev.ixl.0.iflib.override_ntxqs=1`
* we use `dev.ixl.0.iflib.override_nrxqs=0` and `dev.ixl.0.iflib.override_ntxqs=0`, but don't issue concurrent TCP streams

These are some debug log messages emitted by 12.1-STABLE:

Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16304 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16326 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16402 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16652 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16686 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:18562 to [10.10.10.92]:80 tcpflags 0x4; tcp_do_segment: Timestamp missing, no action
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:18918 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19331 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19340 to [10.10.10.92]:80 tcpflags 0x4; tcp_do_segment: Timestamp missing, no action
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19340 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19340 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19489 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19580 to [10.10.10.92]:80 tcpflags 0x4; tcp_do_segment: Timestamp missing, no action
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19580 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19580 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:17705 to [10.10.10.92]:80; syncache_timer: Response timeout, retransmitting (1) SYN|ACK
Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:18066 to [10.10.10.92]:80; syncache_timer: Response timeout, retransmitting (1) SYN|ACK
Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:18066 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Our SYN|ACK was rejected, connection attempt aborted by remote endpoint
Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:17705 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Our SYN|ACK was rejected, connection attempt aborted by remote endpoint

Here, 10.10.10.92 runs 12.1-STABLE, while 10.10.10.39 is a client that runs 11.2-STABLE.

In our test case we use nginx and wrk with a minimal config, where nginx always returns the 404 error page. nginx runs on 12.1-STABLE, while wrk runs on 11.2-STABLE. We run wrk like so:

wrk -c 10 --header "Connection: close" -d 10 -t 1 --latency http://10.10.10.92:80/missing

and often see e
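For readers who want to summarise a larger log of this kind, the following is a minimal sketch (not part of the original report) that tallies the kernel TCP debug lines by reporting function and by client source port. The regular expression is an assumption derived only from the line shapes quoted above, and the names `LINE_RE`, `tally` and `SAMPLE` are hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted in the report (an assumption,
# not an official format description). The "tcpflags" field is optional
# because the syncache_timer lines do not carry it.
LINE_RE = re.compile(
    r"TCP: \[(?P<src>[^\]]+)\]:(?P<sport>\d+) to \[(?P<dst>[^\]]+)\]:(?P<dport>\d+)"
    r"(?: tcpflags (?P<flags>0x[0-9a-fA-F]+))?; (?P<func>\w+): (?P<msg>.*)$"
)

# A few lines copied from the report, used as sample input.
SAMPLE = [
    "Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19340 to [10.10.10.92]:80 tcpflags 0x4; tcp_do_segment: Timestamp missing, no action",
    "Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19340 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored",
    "Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:17705 to [10.10.10.92]:80; syncache_timer: Response timeout, retransmitting (1) SYN|ACK",
    "Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:17705 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Our SYN|ACK was rejected, connection attempt aborted by remote endpoint",
]


def tally(lines):
    """Count matching log lines per kernel function and per client source port."""
    by_func = Counter()
    by_port = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # not a TCP debug line of the expected shape
        by_func[m.group("func")] += 1
        by_port[int(m.group("sport"))] += 1
    return by_func, by_port


if __name__ == "__main__":
    funcs, ports = tally(SAMPLE)
    for name, count in funcs.most_common():
        print(name, count)
    for port, count in ports.most_common():
        print(port, count)
```

Grouping by source port makes it easier to see which connection attempts produced more than one message, such as port 19340 in the excerpt.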
Re: Network anomalies after update from 11.2 STABLE to 12.1 STABLE
> On 18. Oct 2019, at 14:57, Paul wrote:
>
> Our current version is:
>
> FreeBSD 11.2-STABLE #0 r340725
>
> New version that we have problems with:
>
> FreeBSD 12.1-STABLE #5 r352893
>
> After update to new version we have started to observe an incredible number of errors in HTTP requests in between various services in our system. This problem appeared on all the servers that were upgraded, and seems to not be specific to concrete network card: we use different models, all are affected.
>
> During various tests, we observed a lot of spontaneous TCP stream abortions, including at the establishment stage (SYN) in cases that were 100% issue free on 11.2-STABLE. Concrete test cases will be shown below.
>
> We also want to highlight that, on numerous occasions, we have observed random, huge ACK indices in a first response to a SYN packet, instead of 1, as expected. This forces client to abort connection via RST.
>
> On the fist glance it looks like races in the kernel, because problem disappears when:
> * we use `dev.ixl.0.iflib.override_nrxqs=1` and `dev.ixl.0.iflib.override_ntxqs=1`
> * we use `dev.ixl.0.iflib.override_nrxqs=0` and `dev.ixl.0.iflib.override_ntxqs=0`, but don't issue concurrent TCP streams
>
> These are some debug log messages, emitted by 12.1-STABLE:
>
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16304 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16326 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16402 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16652 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16686 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:18562 to [10.10.10.92]:80 tcpflags 0x4; tcp_do_segment: Timestamp missing, no action
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:18918 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19331 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19340 to [10.10.10.92]:80 tcpflags 0x4; tcp_do_segment: Timestamp missing, no action
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19340 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19340 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19489 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19580 to [10.10.10.92]:80 tcpflags 0x4; tcp_do_segment: Timestamp missing, no action
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19580 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19580 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
> Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:17705 to [10.10.10.92]:80; syncache_timer: Response timeout, retransmitting (1) SYN|ACK
> Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:18066 to [10.10.10.92]:80; syncache_timer: Response timeout, retransmitting (1) SYN|ACK
> Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:18066 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Our SYN|ACK was rejected, connection attempt aborted by remote endpoint
> Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:17705 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Our SYN|ACK was rejected, connection attempt aborted by remote endpoint
>
> Here, 10.10.10.92 runs 12.1-STABLE, while 10.10.10.39 is a client that runs 11.2-STABLE.
>
> In our test case we use nginx and wrk , with a minimal config, wh
Re: Network anomalies after update from 11.2 STABLE to 12.1 STABLE
> On 19. Oct 2019, at 18:09, Paul wrote:
>
> Hi Michael,
>
> Thank you, for taking your time!
>
> We use physical machines. We do not have any special `pf` rules.
> Both sides ran `pfctl -d` before testing.

Hi Paul,

OK. How are the physical machines connected to each other?

What happens when you don't use a lagg interface, but the physical ones?

(Trying to localise the problem...)

Best regards
Michael

> `nginx` config is primitive, no secrets there:
>
> ---
> user www;
> worker_processes auto;
>
> error_log /var/log/nginx/error.log warn;
>
> events {
>     worker_connections 81920;
>     kqueue_changes 4096;
>     use kqueue;
> }
>
> http {
>     include mime.types;
>     default_type application/octet-stream;
>
>     sendfile off;
>     keepalive_timeout 65;
>     tcp_nopush on;
>     tcp_nodelay on;
>
>     # Logging
>     log_format main '$remote_addr - $remote_user [$time_local] "$request" '
>                     '$status $request_length $body_bytes_sent "$http_referer" '
>                     '"$http_user_agent" "$http_x_real_ip" "$realip_remote_addr" "$request_completion" "$request_time" '
>                     '"$request_body"';
>
>     access_log /var/log/nginx/access.log main;
>
>     server {
>         listen 80 default;
>
>         server_name localhost _;
>
>         location / {
>             return 404;
>         }
>     }
> }
> ---
>
> `wrk` is compiled with a default configuration. We test like this:
>
> `wrk -c 10 --header "Connection: close" -d 10 -t 1 --latency http://10.10.10.92:80/missing`
>
> Also, it seems that our issue, and the one described in this thread, are identical:
>
> https://lists.freebsd.org/pipermail/freebsd-net/2019-June/053667.html
>
> We both have the Intel network cards, BTW. Our network cards are these:
>
> em0 at pci0:10:0:0:class=0x02 card=0x15d9 chip=0x10d38086 rev=0x00 hdr=0x00
>     vendor = 'Intel Corporation'
>     device = '82574L Gigabit Network Connection'
>
> ixl0 at pci0:4:0:0:class=0x02 card=0x00078086 chip=0x15728086 rev=0x01 hdr=0x00
>     vendor = 'Intel Corporation'
>     device = 'Ethernet Controller X710 for 10GbE SFP+'
>
> ==
>
> Additional info:
>
> During the tests, we have bonded two interfaces into a lagg:
>
> ixl0: flags=8843 metric 0 mtu 1500
>     options=c500b8
>     ether 3c:fd:fe:aa:60:20
>     media: Ethernet autoselect (10Gbase-SR )
>     status: active
>     nd6 options=29
> ixl1: flags=8843 metric 0 mtu 1500
>     options=c500b8
>     ether 3c:fd:fe:aa:60:20
>     hwaddr 3c:fd:fe:aa:60:21
>     media: Ethernet autoselect (10Gbase-SR )
>     status: active
>     nd6 options=29
>
> lagg0: flags=8843 metric 0 mtu 1500
>     options=c500b8
>     ether 3c:fd:fe:aa:60:20
>     inet 10.10.10.92 netmask 0x broadcast 10.10.255.255
>     laggproto failover lagghash l2,l3,l4
>     laggport: ixl0 flags=5
>     laggport: ixl1 flags=0<>
>     groups: lagg
>     media: Ethernet autoselect
>     status: active
>     nd6 options=29
>
> using this config:
>
> ifconfig_ixl0="up -lro -tso -rxcsum -txcsum" (tried different options - got the same outcome)
> ifconfig_ixl1="up -lro -tso -rxcsum -txcsum"
> ifconfig_lagg0="laggproto failover laggport ixl0 laggport ixl1 10.10.10.92/24"
>
> We have randomly picked `ixl0` and restricted number of RX/TX queues to 1:
>
> /boot/loader.conf :
> dev.ixl.0.iflib.override_ntxqs=1
> dev.ixl.0.iflib.override_nrxqs=1
>
> leaving `ixl1` with a default number, matching number of cores (6).
>
> ixl0: mem 0xf880-0xf8ff,0xf9808000-0xf980 irq 40 at device 0.0 on pci4
> ixl0: fw 5.0.40043 api 1.5 nvm 5.05 etid 80002927 oem 1.261.0
> ixl0: PF-ID[0]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
> ixl0: Using 1024 TX descriptors and 1024 RX descriptors
> ixl0: Using 1 RX queues 1 TX queues
> ixl0: Using MSI-X interrupts with 2 vectors
> ixl0: Ethernet address: 3c:fd:fe:aa:60:20
> ixl0: Allocating 1 queues for PF LAN VSI; 1 queues active
> ixl0: PCI Express Bus: Speed 8.0GT/s Width x4
> ixl0: SR-IOV ready
> ixl0: netmap queues/slots: TX 1/1024, RX 1/1024
> ixl1: mem 0xf800-0xf87f,0xf980-0xf9807fff irq 40 at device 0.1 on pci4
> ixl1: fw 5.0.40043 api 1.5 nvm 5.05 etid 80002927 oem 1.261.0
> ixl1: PF-ID[1]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
> ixl1: Using 1024 TX descriptors and 1024 RX descriptors
> ixl1: Using 6 RX queues 6 TX q
Re: Network anomalies after update from 11.2 STABLE to 12.1 STABLE
> On 19. Oct 2019, at 19:32, Paul wrote:
>
> Hi Rick,
>
> RST is only one part of a syndrome. Apart from it, we have a ton of other issues. For example: a lot (50+) of ACK and [FIN, ACK] retransmissions in cases where they are definitely not needed, as seen in tcpdump, unless the packets that we see in the dump are not actually processed by the kernel(?), therefore leading to retransmissions? It definitely has something to do with races, because the issue completely disappears when only a single queue is enabled.
>
> In other cases, we have observed that 12.1-STABLE sent a FIN, but then, when sending the ACK, it didn't actually increment SEQ, as if those two packets, FIN and ACK, were sent concurrently, though the ACK was dispatched later.
>
> Also, I want to focus on a weird behavior, as I wrote in the original post: the issue also disappears if multiple TCP streams each use a different DST port. It's as if it has something to do with sharing a port.

Hi Paul,

I understand that you see the NIC-level queue handling as part of what has to be taken into account. I agree that problems there might result in packets being sent out of the expected order, or in received packets not being processed in the expected order.

From a TCP perspective, both cases look like reordering in the network. This might hurt performance (unnecessary retransmissions, congestion control limiting the transfer more than it should), but it should not result in TCP connection drops.

Do you have tracefiles (.pcap preferred) from both sides showing connection drops?

Best regards
Michael

> 19 October 2019, 19:24:43, by "Rick Macklem" :
>
>> Btw, I once ran into a situation where "smart networking" was injecting
>> RSTs into a TCP stream. The packet captures at the client and server
>> machines were identical, except for the RSTs and the problem went away
>> when I connected the two machines with a cable, bypassing the network.
>> Might be worth a try, if you can do it?
>>
>> Good luck with it, rick
>>
>> From: owner-freebsd-...@freebsd.org on behalf of Paul
>> Sent: Saturday, October 19, 2019 12:09 PM
>> To: michael.tue...@lurchi.franken.de; freebsd-net@freebsd.org; freebsd-sta...@freebsd.org
>> Subject: Re[2]: Network anomalies after update from 11.2 STABLE to 12.1 STABLE
>>
>> Hi Michael,
>>
>> Thank you, for taking your time!
>>
>> We use physical machines. We do not have any special `pf` rules.
>> Both sides ran `pfctl -d` before testing.
>>
>> `nginx` config is primitive, no secrets there:
>>
>> ---
>> user www;
>> worker_processes auto;
>>
>> error_log /var/log/nginx/error.log warn;
>>
>> events {
>>     worker_connections 81920;
>>     kqueue_changes 4096;
>>     use kqueue;
>> }
>>
>> http {
>>     include mime.types;
>>     default_type application/octet-stream;
>>
>>     sendfile off;
>>     keepalive_timeout 65;
>>     tcp_nopush on;
>>     tcp_nodelay on;
>>
>>     # Logging
>>     log_format main '$remote_addr - $remote_user [$time_local] "$request" '
>>                     '$status $request_length $body_bytes_sent "$http_referer" '
>>                     '"$http_user_agent" "$http_x_real_ip" "$realip_remote_addr" "$request_completion" "$request_time" '
>>                     '"$request_body"';
>>
>>     access_log /var/log/nginx/access.log main;
>>
>>     server {
>>         listen 80 default;
>>
>>         server_name localhost _;
>>
>>         location / {
>>             return 404;
>>         }
>>     }
>> }
>> ---
>>
>> `wrk` is compiled with a default configuration. We test like this:
>>
>> `wrk -c 10 --header "Connection: close" -d 10 -t 1 --latency http://10.10.10.92:80/missing`
>>
>> Also, it seems that our issue, and the one described in this thread, are identical:
>>
>> https://lists.freebsd.org/pipermail/freebsd-net/2019-June/053667.h
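The sequence-number observations in this exchange (an unexpected ACK value in the SYN|ACK, and a FIN that does not seem to advance SEQ) can be checked against standard TCP arithmetic when reading a capture. The following is a minimal sketch under the usual TCP rules that SYN and FIN each consume one sequence number and that all values wrap modulo 2^32; the function names are hypothetical and it is not tied to any capture library.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def expected_synack_ack(client_isn):
    """ACK number a server should place in its SYN|ACK for a client SYN.

    The SYN consumes one sequence number, so the server acknowledges ISN + 1.
    Tools that display relative numbers show this value as 1.
    """
    return (client_isn + 1) % SEQ_MOD


def next_seq(seq, payload_len, syn=False, fin=False):
    """Sequence number of the sender's next segment after sending this one.

    Payload bytes, SYN and FIN all advance the sequence space; a pure ACK
    (no payload, no SYN, no FIN) does not.
    """
    return (seq + payload_len + int(syn) + int(fin)) % SEQ_MOD


if __name__ == "__main__":
    isn = 0xFFFFFFFF  # deliberately chosen to show the wrap-around
    print(expected_synack_ack(isn))
    print(next_seq(1000, 0, fin=True))
    print(next_seq(1000, 0))
```

Under these rules, a segment sent after a FIN should carry a sequence number one higher than the FIN's own, which is the increment reported as missing in the quoted message.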
Re[2]: Network anomalies after update from 11.2 STABLE to 12.1 STABLE
Hi Michael,

Thank you for taking the time!

We use physical machines. We do not have any special `pf` rules.
Both sides ran `pfctl -d` before testing.

The `nginx` config is primitive, no secrets there:

---
user www;
worker_processes auto;

error_log /var/log/nginx/error.log warn;

events {
    worker_connections 81920;
    kqueue_changes 4096;
    use kqueue;
}

http {
    include mime.types;
    default_type application/octet-stream;

    sendfile off;
    keepalive_timeout 65;
    tcp_nopush on;
    tcp_nodelay on;

    # Logging
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $request_length $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_real_ip" "$realip_remote_addr" "$request_completion" "$request_time" '
                    '"$request_body"';

    access_log /var/log/nginx/access.log main;

    server {
        listen 80 default;

        server_name localhost _;

        location / {
            return 404;
        }
    }
}
---

`wrk` is compiled with a default configuration. We test like this:

`wrk -c 10 --header "Connection: close" -d 10 -t 1 --latency http://10.10.10.92:80/missing`

Also, it seems that our issue and the one described in this thread are identical:

https://lists.freebsd.org/pipermail/freebsd-net/2019-June/053667.html

We both have Intel network cards, BTW. Our network cards are these:

em0 at pci0:10:0:0:class=0x02 card=0x15d9 chip=0x10d38086 rev=0x00 hdr=0x00
    vendor = 'Intel Corporation'
    device = '82574L Gigabit Network Connection'

ixl0 at pci0:4:0:0:class=0x02 card=0x00078086 chip=0x15728086 rev=0x01 hdr=0x00
    vendor = 'Intel Corporation'
    device = 'Ethernet Controller X710 for 10GbE SFP+'

==

Additional info:

During the tests, we bonded two interfaces into a lagg:

ixl0: flags=8843 metric 0 mtu 1500
    options=c500b8
    ether 3c:fd:fe:aa:60:20
    media: Ethernet autoselect (10Gbase-SR )
    status: active
    nd6 options=29
ixl1: flags=8843 metric 0 mtu 1500
    options=c500b8
    ether 3c:fd:fe:aa:60:20
    hwaddr 3c:fd:fe:aa:60:21
    media: Ethernet autoselect (10Gbase-SR )
    status: active
    nd6 options=29

lagg0: flags=8843 metric 0 mtu 1500
    options=c500b8
    ether 3c:fd:fe:aa:60:20
    inet 10.10.10.92 netmask 0x broadcast 10.10.255.255
    laggproto failover lagghash l2,l3,l4
    laggport: ixl0 flags=5
    laggport: ixl1 flags=0<>
    groups: lagg
    media: Ethernet autoselect
    status: active
    nd6 options=29

using this config:

ifconfig_ixl0="up -lro -tso -rxcsum -txcsum" (tried different options - got the same outcome)
ifconfig_ixl1="up -lro -tso -rxcsum -txcsum"
ifconfig_lagg0="laggproto failover laggport ixl0 laggport ixl1 10.10.10.92/24"

We randomly picked `ixl0` and restricted its number of RX/TX queues to 1:

/boot/loader.conf :
dev.ixl.0.iflib.override_ntxqs=1
dev.ixl.0.iflib.override_nrxqs=1

leaving `ixl1` with the default number, which matches the number of cores (6).

ixl0: mem 0xf880-0xf8ff,0xf9808000-0xf980 irq 40 at device 0.0 on pci4
ixl0: fw 5.0.40043 api 1.5 nvm 5.05 etid 80002927 oem 1.261.0
ixl0: PF-ID[0]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
ixl0: Using 1024 TX descriptors and 1024 RX descriptors
ixl0: Using 1 RX queues 1 TX queues
ixl0: Using MSI-X interrupts with 2 vectors
ixl0: Ethernet address: 3c:fd:fe:aa:60:20
ixl0: Allocating 1 queues for PF LAN VSI; 1 queues active
ixl0: PCI Express Bus: Speed 8.0GT/s Width x4
ixl0: SR-IOV ready
ixl0: netmap queues/slots: TX 1/1024, RX 1/1024
ixl1: mem 0xf800-0xf87f,0xf980-0xf9807fff irq 40 at device 0.1 on pci4
ixl1: fw 5.0.40043 api 1.5 nvm 5.05 etid 80002927 oem 1.261.0
ixl1: PF-ID[1]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
ixl1: Using 1024 TX descriptors and 1024 RX descriptors
ixl1: Using 6 RX queues 6 TX queues
ixl1: Using MSI-X interrupts with 7 vectors
ixl1: Ethernet address: 3c:fd:fe:aa:60:21
ixl1: Allocating 8 queues for PF LAN VSI; 6 queues active
ixl1: PCI Express Bus: Speed 8.0GT/s Width x4
ixl1: SR-IOV ready
ixl1: netmap queues/slots: TX 6/1024, RX 6/1024

This allowed us to switch easily between configurations without rebooting, simply by shutting down one interface or the other:

`ifconfig XXX down`

When testing `ixl0` that runs only a single queue:

ixl0: Using 1 RX queues 1 TX queues
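To confirm which queue configuration each interface actually came up with, the boot messages can be scanned for the "Using N RX queues N TX queues" lines shown above. This is a minimal sketch, assuming the message wording stays exactly as quoted in this message; the names `QUEUE_RE`, `queue_counts` and `DMESG_SAMPLE` are hypothetical.

```python
import re

# Pattern taken from the boot messages quoted above; the exact wording is
# assumed to be stable, which may not hold for other driver versions.
QUEUE_RE = re.compile(
    r"^(?P<ifname>\w+): Using (?P<rx>\d+) RX queues (?P<tx>\d+) TX queues$"
)

# Lines copied from the message, used as sample input.
DMESG_SAMPLE = [
    "ixl0: Using 1024 TX descriptors and 1024 RX descriptors",
    "ixl0: Using 1 RX queues 1 TX queues",
    "ixl0: Using MSI-X interrupts with 2 vectors",
    "ixl1: Using 1024 TX descriptors and 1024 RX descriptors",
    "ixl1: Using 6 RX queues 6 TX queues",
    "ixl1: Using MSI-X interrupts with 7 vectors",
]


def queue_counts(dmesg_lines):
    """Map each interface name to its (rx_queues, tx_queues) pair."""
    counts = {}
    for line in dmesg_lines:
        m = QUEUE_RE.match(line.strip())
        if m:
            counts[m.group("ifname")] = (int(m.group("rx")), int(m.group("tx")))
    return counts


def single_queue_interfaces(dmesg_lines):
    """Names of interfaces running with exactly one RX and one TX queue."""
    return sorted(
        name for name, pair in queue_counts(dmesg_lines).items() if pair == (1, 1)
    )


if __name__ == "__main__":
    print(queue_counts(DMESG_SAMPLE))
    print(single_queue_interfaces(DMESG_SAMPLE))
```

With the sample lines, `ixl0` is reported as the single-queue interface and `ixl1` as running six queues in each direction, matching the setup described in the message.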
Re[2]: Network anomalies after update from 11.2 STABLE to 12.1 STABLE
19 October 2019, 19:35:24, by "Michael Tuexen" :

> > On 19. Oct 2019, at 18:09, Paul wrote:
> >
> > Hi Michael,
> >
> > Thank you, for taking your time!
> >
> > We use physical machines. We do not have any special `pf` rules.
> > Both sides ran `pfctl -d` before testing.
> Hi Paul,
>
> OK. How are the physical machines connected to each other?

We have tested different connections: the old copper Ethernet cable as well as an optical connection, with an identical outcome. The machines are connected through a Juniper QFX5100.

> What happens when you don't use a lagg interface, but the physical ones?
>
> (Trying to localise the problem...)

Same thing, lagg does not change anything. Originally, the problem was observed on a regular interface.

We have tested on different hardware. Results are consistently stable on 11.2-STABLE and consistently unstable on 12.1-STABLE. The only constant is the network card vendor: Intel.

> Best regards
> Michael
> >
> > `nginx` config is primitive, no secrets there:
> >
> > ---
> > user www;
> > worker_processes auto;
> >
> > error_log /var/log/nginx/error.log warn;
> >
> > events {
> >     worker_connections 81920;
> >     kqueue_changes 4096;
> >     use kqueue;
> > }
> >
> > http {
> >     include mime.types;
> >     default_type application/octet-stream;
> >
> >     sendfile off;
> >     keepalive_timeout 65;
> >     tcp_nopush on;
> >     tcp_nodelay on;
> >
> >     # Logging
> >     log_format main '$remote_addr - $remote_user [$time_local] "$request" '
> >                     '$status $request_length $body_bytes_sent "$http_referer" '
> >                     '"$http_user_agent" "$http_x_real_ip" "$realip_remote_addr" "$request_completion" "$request_time" '
> >                     '"$request_body"';
> >
> >     access_log /var/log/nginx/access.log main;
> >
> >     server {
> >         listen 80 default;
> >
> >         server_name localhost _;
> >
> >         location / {
> >             return 404;
> >         }
> >     }
> > }
> > ---
> >
> > `wrk` is compiled with a default configuration. We test like this:
> >
> > `wrk -c 10 --header "Connection: close" -d 10 -t 1 --latency http://10.10.10.92:80/missing`
> >
> > Also, it seems that our issue, and the one described in this thread, are identical:
> >
> > https://lists.freebsd.org/pipermail/freebsd-net/2019-June/053667.html
> >
> > We both have the Intel network cards, BTW. Our network cards are these:
> >
> > em0 at pci0:10:0:0:class=0x02 card=0x15d9 chip=0x10d38086 rev=0x00 hdr=0x00
> >     vendor = 'Intel Corporation'
> >     device = '82574L Gigabit Network Connection'
> >
> > ixl0 at pci0:4:0:0:class=0x02 card=0x00078086 chip=0x15728086 rev=0x01 hdr=0x00
> >     vendor = 'Intel Corporation'
> >     device = 'Ethernet Controller X710 for 10GbE SFP+'
> >
> > ==
> >
> > Additional info:
> >
> > During the tests, we have bonded two interfaces into a lagg:
> >
> > ixl0: flags=8843 metric 0 mtu 1500
> >     options=c500b8
> >     ether 3c:fd:fe:aa:60:20
> >     media: Ethernet autoselect (10Gbase-SR )
> >     status: active
> >     nd6 options=29
> > ixl1: flags=8843 metric 0 mtu 1500
> >     options=c500b8
> >     ether 3c:fd:fe:aa:60:20
> >     hwaddr 3c:fd:fe:aa:60:21
> >     media: Ethernet autoselect (10Gbase-SR )
> >     status: active
> >     nd6 options=29
> >
> > lagg0: flags=8843 metric 0 mtu 1500
> >     options=c500b8
> >     ether 3c:fd:fe:aa:60:20
> >     inet 10.10.10.92 netmask 0x broadcast 10.10.255.255
> >     laggproto failover lagghash l2,l3,l4
> >     laggport: ixl0 flags=5
> >     laggport: ixl1 flags=0<>
> >     groups: lagg
> >     media: Ethernet autoselect
> >     status: active
> >     nd6 options=29
> >
> > using this config:
> >
> > ifconfig_ixl0="up -lro -tso -rxcsum -txcsum" (tried different options - got the same outcome)
> > ifconfig_ixl1="up -lro -tso -rxcsum -txcsum"
> > ifconfig_lagg0="laggproto failover laggport ixl0 laggport ixl1 10.10.10.92/24"
> >
> > We have randomly picked `ixl0` and restricted number of RX/TX queues to 1:
> >
> > /boot/loader.conf :
> > dev.ixl.0.iflib.override_ntxqs=1
> > dev.ixl.0.iflib.override_nrxqs=1
> >
> > leaving `ixl1` with a default number, matching number of cores (6).
> >
> > ixl0: mem 0xf880-0xf8ff,0xf9808000-0xf980 irq 40 at device 0.0 on pci4
> > ixl0: fw 5.0.40043
Re: Re[2]: Network anomalies after update from 11.2 STABLE to 12.1 STABLE
Btw, I once ran into a situation where "smart networking" was injecting RSTs into a TCP stream. The packet captures at the client and server machines were identical, except for the RSTs and the problem went away when I connected the two machines with a cable, bypassing the network. Might be worth a try, if you can do it? Good luck with it, rick From: owner-freebsd-...@freebsd.org on behalf of Paul Sent: Saturday, October 19, 2019 12:09 PM To: michael.tue...@lurchi.franken.de; freebsd-net@freebsd.org; freebsd-sta...@freebsd.org Subject: Re[2]: Network anomalies after update from 11.2 STABLE to 12.1 STABLE Hi Michael, Thank you, for taking your time! We use physical machines. We don not have any special `pf` rules. Both sides ran `pfctl -d` before testing. `nginx` config is primitive, no secrets there: --- user www; worker_processes auto; error_log /var/log/nginx/error.log warn; events { worker_connections 81920; kqueue_changes 4096; use kqueue; } http { include mime.types; default_typeapplication/octet-stream; sendfileoff; keepalive_timeout 65; tcp_nopush on; tcp_nodelay on; # Logging log_format main'$remote_addr - $remote_user [$time_local] "$request" ' '$status $request_length $body_bytes_sent "$http_referer" ' '"$http_user_agent" "$http_x_real_ip" "$realip_remote_addr" "$request_completion" "$request_time" ' '"$request_body"'; access_log /var/log/nginx/access.log main; server { listen 80 default; server_name localhost _; location / { return 404; } } } --- `wrk` is compiled with a default configuration. We test like this: `wrk -c 10 --header "Connection: close" -d 10 -t 1 --latency http://10.10.10.92:80/missing` Also, it seems that our issue, and the one described in this thread, are identical: https://lists.freebsd.org/pipermail/freebsd-net/2019-June/053667.html We both have the Intel network cards, BTW. 
Our network cards are these:

em0 at pci0:10:0:0:class=0x02 card=0x15d9 chip=0x10d38086 rev=0x00 hdr=0x00
    vendor = 'Intel Corporation'
    device = '82574L Gigabit Network Connection'

ixl0 at pci0:4:0:0:class=0x02 card=0x00078086 chip=0x15728086 rev=0x01 hdr=0x00
    vendor = 'Intel Corporation'
    device = 'Ethernet Controller X710 for 10GbE SFP+'

==

Additional info:

During the tests, we have bonded two interfaces into a lagg:

ixl0: flags=8843 metric 0 mtu 1500
    options=c500b8
    ether 3c:fd:fe:aa:60:20
    media: Ethernet autoselect (10Gbase-SR )
    status: active
    nd6 options=29
ixl1: flags=8843 metric 0 mtu 1500
    options=c500b8
    ether 3c:fd:fe:aa:60:20
    hwaddr 3c:fd:fe:aa:60:21
    media: Ethernet autoselect (10Gbase-SR )
    status: active
    nd6 options=29
lagg0: flags=8843 metric 0 mtu 1500
    options=c500b8
    ether 3c:fd:fe:aa:60:20
    inet 10.10.10.92 netmask 0x broadcast 10.10.255.255
    laggproto failover lagghash l2,l3,l4
    laggport: ixl0 flags=5
    laggport: ixl1 flags=0<>
    groups: lagg
    media: Ethernet autoselect
    status: active
    nd6 options=29

using this config:

ifconfig_ixl0="up -lro -tso -rxcsum -txcsum" (tried different options - got the same outcome)
ifconfig_ixl1="up -lro -tso -rxcsum -txcsum"
ifconfig_lagg0="laggproto failover laggport ixl0 laggport ixl1 10.10.10.92/24"

We have randomly picked `ixl0` and restricted number of RX/TX queues to 1:

/boot/loader.conf :
dev.ixl.0.iflib.override_ntxqs=1
dev.ixl.0.iflib.override_nrxqs=1

leaving `ixl1` with a default number, matching number of cores (6).
ixl0: mem 0xf880-0xf8ff,0xf9808000-0xf980 irq 40 at device 0.0 on pci4
ixl0: fw 5.0.40043 api 1.5 nvm 5.05 etid 80002927 oem 1.261.0
ixl0: PF-ID[0]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
ixl0: Using 1024 TX descriptors and 1024 RX descriptors
ixl0: Using 1 RX queues 1 TX queues
ixl0: Using MSI-X interrupts with 2 vectors
ixl0: Ethernet address: 3c:fd:fe:aa:60:20
ixl0: Allocating 1 queues for PF LAN VSI; 1 queues active
ixl0: PCI Express Bus: Speed 8.0GT/s Width x4
ixl0: SR-IOV ready
ixl0: netmap queues/slots: TX 1/
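
For anyone who wants to reproduce the access pattern from the `wrk` invocation quoted above without `wrk` itself, here is a rough Python sketch. It is only an approximation under stated assumptions: it opens several concurrent connections to the same destination port, sends one request with `Connection: close` on each, and tallies how every connection ended, so that resets show up as a separate count. The names `one_request` and `probe` and the outcome labels are made up for this illustration and do not come from the thread, and it makes no attempt to match the request rate or latency reporting of `wrk`.

```python
import socket
import threading
from collections import Counter


def one_request(host, port, path, timeout=5.0):
    """Open one TCP connection, send a single HTTP request with
    `Connection: close`, read until EOF and classify the outcome."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n\r\n"
    ).encode("ascii")
    data = b""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(request)
            while True:
                chunk = sock.recv(4096)
                if not chunk:  # peer closed the stream normally
                    break
                data += chunk
    except ConnectionResetError:
        return "reset"  # RST seen by the client side
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        return "other_error"
    return "ok" if data.startswith(b"HTTP/") else "bad_response"


def probe(host, port, path="/missing", connections=10, rounds=20):
    """Run `connections` workers in parallel; each performs `rounds`
    sequential requests. Returns a Counter of outcome labels."""
    outcomes = Counter()
    lock = threading.Lock()

    def worker():
        local = Counter(one_request(host, port, path) for _ in range(rounds))
        with lock:
            outcomes.update(local)

    threads = [threading.Thread(target=worker) for _ in range(connections)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outcomes
```

Calling `probe("10.10.10.92", 80)` against the test host and comparing the tallies for the single-queue and multi-queue settings would be one way to put numbers on the difference; any non-zero `reset` or `timeout` count is the interesting part.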
Re[2]: Re[2]: Network anomalies after update from 11.2 STABLE to 12.1 STABLE
Hi Rick,

RST is only one part of the syndrome. Apart from it, we have a ton of other issues. For example: a lot (50+) of ACK and [FIN, ACK] re-transmissions in cases where they are definitely not needed, as seen in tcpdump, unless the packets that we see in the dump are not actually processed by the kernel(?), therefore leading to re-transmissions? It definitely has something to do with races, because the issue completely disappears when only a single queue is enabled.

In other cases, we have observed that 12.1-STABLE sent a FIN, but then, when sending the ACK, it didn't actually increment SEQ, as if those two packets, FIN and ACK, were sent concurrently, though the ACK was dispatched later.

Also, I want to focus on a weird behavior, as I wrote in the original post: the issue also disappears if multiple TCP streams each use a different DST port. It's as if it has something to do with sharing a port.

19 October 2019, 19:24:43, by "Rick Macklem" :

> Btw, I once ran into a situation where "smart networking" was injecting
> RSTs into a TCP stream. The packet captures at the client and server
> machines were identical, except for the RSTs and the problem went away
> when I connected the two machines with a cable, bypassing the network.
> Might be worth a try, if you can do it?
>
> Good luck with it, rick
>
>
> From: owner-freebsd-...@freebsd.org on behalf
> of Paul
> Sent: Saturday, October 19, 2019 12:09 PM
> To: michael.tue...@lurchi.franken.de; freebsd-net@freebsd.org;
> freebsd-sta...@freebsd.org
> Subject: Re[2]: Network anomalies after update from 11.2 STABLE to 12.1 STABLE
>
> Hi Michael,
>
> Thank you for taking the time!
>
> We use physical machines. We do not have any special `pf` rules.
> Both sides ran `pfctl -d` before testing.
>
>
> `nginx` config is primitive, no secrets there:
>
> ---
> user www;
> worker_processes auto;
>
> error_log /var/log/nginx/error.log warn;
>
> events {
>     worker_connections 81920;
>     kqueue_changes 4096;
>     use kqueue;
> }
>
> http {
>     include mime.types;
>     default_type application/octet-stream;
>
>     sendfile off;
>     keepalive_timeout 65;
>     tcp_nopush on;
>     tcp_nodelay on;
>
>     # Logging
>     log_format main '$remote_addr - $remote_user [$time_local] "$request" '
>                     '$status $request_length $body_bytes_sent "$http_referer" '
>                     '"$http_user_agent" "$http_x_real_ip" "$realip_remote_addr" "$request_completion" "$request_time" '
>                     '"$request_body"';
>
>     access_log /var/log/nginx/access.log main;
>
>     server {
>         listen 80 default;
>
>         server_name localhost _;
>
>         location / {
>             return 404;
>         }
>     }
> }
> ---
>
>
> `wrk` is compiled with a default configuration. We test like this:
>
> `wrk -c 10 --header "Connection: close" -d 10 -t 1 --latency http://10.10.10.92:80/missing`
>
>
> Also, it seems that our issue, and the one described in this thread, are
> identical:
>
> https://lists.freebsd.org/pipermail/freebsd-net/2019-June/053667.html
>
> We both have the Intel network cards, BTW. Our network cards are these:
>
> em0 at pci0:10:0:0:class=0x02 card=0x15d9 chip=0x10d38086 rev=0x00 hdr=0x00
>     vendor = 'Intel Corporation'
>     device = '82574L Gigabit Network Connection'
>
> ixl0 at pci0:4:0:0:class=0x02 card=0x00078086 chip=0x15728086 rev=0x01 hdr=0x00
>     vendor = 'Intel Corporation'
>     device = 'Ethernet Controller X710 for 10GbE SFP+'
>
>
> ==
>
> Additional info:
>
> During the tests, we have bonded two interfaces into a lagg:
>
> ixl0: flags=8843 metric 0 mtu 1500
>
>     options=c500b8
>     ether 3c:fd:fe:aa:60:20
>     media: Ethernet autoselect (10Gbase-SR )
>     status: active
>     nd6 options=29
> ixl1: flags=8843 metric 0 mtu 1500
>
>     options=c500b8
>     ether 3c:fd:fe:aa:60:20
>     hwaddr 3c:fd:fe:aa:60:21
>     media: Ethernet autoselect