Re: Network anomalies after update from 11.2 STABLE to 12.1 STABLE

2019-10-20 Thread Michael Tuexen
> On 19. Oct 2019, at 19:32, Paul  wrote:
> 
> Hi Rick,
> 
> RST is only one part of the syndrome. Apart from it, we have a ton of other
> issues. For example: a lot (50+) of ACK and [FIN, ACK] retransmissions in
> cases where they are definitely not needed, as seen in tcpdump, unless the
> packets that we see in the dump are not actually processed by the kernel(?),
> therefore leading to retransmissions. It definitely has something to do with
> races, because the issue completely disappears when only a single queue is
> enabled.
> 
> In other cases, we have observed that 12.1-STABLE has sent a FIN but then
> did not increment SEQ when sending the subsequent ACK, as if those two
> packets, FIN and ACK, were built concurrently, though the ACK was dispatched
> later.
> 
> Also, I want to highlight a weird behavior, as I wrote in the original post:
> the issue also disappears if multiple TCP streams each use a different DST
> port. It's as if it has something to do with sharing a port.
Hi Paul,

I understand that you see the NIC-level queue handling as part of what has to
be taken into account. I agree that having problems there might result in
packets being sent out in an unexpected order, or received packets not being
processed in the expected order.
From a TCP perspective, both cases look like reordering in the network. This
might impact performance negatively (unnecessary retransmissions, congestion
control limiting the transfer more than it should), but it should not result
in TCP connection drops.
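To illustrate the reordering point: a toy cumulative-ACK receiver (an illustrative sketch, not the FreeBSD implementation) shows how reordered-but-not-lost segments still generate duplicate ACKs that a sender can misread as loss:

```python
# Toy cumulative-ACK receiver: reordered (not lost) segments still trigger
# duplicate ACKs, which can cause a spurious fast retransmit at the sender.
def acks_for_arrivals(segments, seg_len=100):
    """segments: starting sequence numbers in arrival order; returns the
    cumulative ACK emitted after each arrival."""
    expected, buffered, acks = 0, set(), []
    for seq in segments:
        buffered.add(seq)
        while expected in buffered:   # advance over contiguous data
            expected += seg_len
        acks.append(expected)
    return acks

in_order  = acks_for_arrivals([0, 100, 200, 300])  # ACK advances every time
reordered = acks_for_arrivals([0, 200, 300, 100])  # two duplicate ACKs of 100
# The repeated ACKs of 100 are duplicate ACKs; enough of them trigger a
# spurious fast retransmit even though nothing was dropped, only reordered.
```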

Do you have tracefiles (.pcap preferred) from both sides showing connection 
drops?

Best regards
Michael
> 
> 
> 19 October 2019, 19:24:43, by "Rick Macklem" :
> 
>> Btw, I once ran into a situation where "smart networking" was injecting
>> RSTs into a TCP stream. The packet captures at the client and server
>> machines were identical, except for the RSTs and the problem went away
>> when I connected the two machines with a cable, bypassing the network.
>> Might be worth a try, if you can do it?
>> 
>> Good luck with it, rick
>> 
>> 
>> From: owner-freebsd-...@freebsd.org  on 
>> behalf of Paul 
>> Sent: Saturday, October 19, 2019 12:09 PM
>> To: michael.tue...@lurchi.franken.de; freebsd-net@freebsd.org; 
>> freebsd-sta...@freebsd.org
>> Subject: Re[2]: Network anomalies after update from 11.2 STABLE to 12.1 
>> STABLE
>> 
>> Hi Michael,
>> 
>> Thank you for taking your time!
>> 
>> We use physical machines. We do not have any special `pf` rules.
>> Both sides ran `pfctl -d` before testing.
>> 
>> 
>> `nginx` config is primitive, no secrets there:
>> 
>> ---
>> user  www;
>> worker_processes  auto;
>> 
>> error_log  /var/log/nginx/error.log warn;
>> 
>> events {
>>worker_connections  81920;
>>kqueue_changes  4096;
>>use kqueue;
>> }
>> 
>> http {
>>include mime.types;
>>default_type  application/octet-stream;
>> 
>>sendfile  off;
>>keepalive_timeout   65;
>>tcp_nopush  on;
>>tcp_nodelay on;
>> 
>># Logging
>>log_format  main'$remote_addr - $remote_user [$time_local] 
>> "$request" '
>>'$status $request_length $body_bytes_sent 
>> "$http_referer" '
>>'"$http_user_agent" "$http_x_real_ip" 
>> "$realip_remote_addr" "$request_completion" "$request_time" '
>>'"$request_body"';
>> 
>>access_log  /var/log/nginx/access.log  main;
>> 
>>server {
>>listen  80 default;
>> 
>>server_name localhost _;
>> 
>>location / {
>>return 404;
>>}
>>}
>> }
>> ---
>> 
>> 
>> `wrk` is compiled with a default configuration. We test like this:
>> 
>> `wrk -c 10 --header "Connection: close" -d 10 -t 1 --latency 
>> http://10.10.10.92:80/missing`
>> 
>> 
>> Also, it seems that our issue, and the one described in this thread, are 
>> identical:
>> 
>>   https://lists.freebsd.org/pipermail/freebsd-net/2019-June/053667.html
>> 
>> We both have the Intel netw

Re[2]: Re[2]: Network anomalies after update from 11.2 STABLE to 12.1 STABLE

2019-10-19 Thread Paul
Hi Rick,

RST is only one part of the syndrome. Apart from it, we have a ton of other
issues. For example: a lot (50+) of ACK and [FIN, ACK] retransmissions in
cases where they are definitely not needed, as seen in tcpdump, unless the
packets that we see in the dump are not actually processed by the kernel(?),
therefore leading to retransmissions. It definitely has something to do with
races, because the issue completely disappears when only a single queue is
enabled.

In other cases, we have observed that 12.1-STABLE has sent a FIN but then
did not increment SEQ when sending the subsequent ACK, as if those two
packets, FIN and ACK, were built concurrently, though the ACK was dispatched
later.
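For reference, a FIN consumes one sequence number (RFC 793), so any later segment from the same side must advance SEQ past it. A minimal check with purely illustrative numbers:

```python
# FIN consumes one sequence number (RFC 793): if the FIN segment carries
# seq = S with payload length L, the next segment must use seq = S + L + 1.
def next_seq_after_fin(fin_seq, payload_len=0):
    return (fin_seq + payload_len + 1) % 2**32  # sequence space wraps at 2^32

# Illustrative values: a pure FIN at seq 1000 ...
assert next_seq_after_fin(1000) == 1001
# ... so a later ACK still carrying seq 1000 would look exactly like the
# anomaly described above.
```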

Also, I want to highlight a weird behavior, as I wrote in the original post:
the issue also disappears if multiple TCP streams each use a different DST
port. It's as if it has something to do with sharing a port.


19 October 2019, 19:24:43, by "Rick Macklem" :

> Btw, I once ran into a situation where "smart networking" was injecting
> RSTs into a TCP stream. The packet captures at the client and server
> machines were identical, except for the RSTs and the problem went away
> when I connected the two machines with a cable, bypassing the network.
> Might be worth a try, if you can do it?
> 
> Good luck with it, rick
> 
> 
> From: owner-freebsd-...@freebsd.org  on behalf 
> of Paul 
> Sent: Saturday, October 19, 2019 12:09 PM
> To: michael.tue...@lurchi.franken.de; freebsd-net@freebsd.org; 
> freebsd-sta...@freebsd.org
> Subject: Re[2]: Network anomalies after update from 11.2 STABLE to 12.1 STABLE
> 
> Hi Michael,
> 
> Thank you for taking your time!
> 
> We use physical machines. We do not have any special `pf` rules.
> Both sides ran `pfctl -d` before testing.
> 
> 
> `nginx` config is primitive, no secrets there:
> 
> ---
> user  www;
> worker_processes  auto;
> 
> error_log  /var/log/nginx/error.log warn;
> 
> events {
> worker_connections  81920;
> kqueue_changes  4096;
> use kqueue;
> }
> 
> http {
> include mime.types;
> default_type  application/octet-stream;
> 
> sendfile  off;
> keepalive_timeout   65;
> tcp_nopush  on;
> tcp_nodelay on;
> 
> # Logging
> log_format  main'$remote_addr - $remote_user [$time_local] 
> "$request" '
> '$status $request_length $body_bytes_sent 
> "$http_referer" '
> '"$http_user_agent" "$http_x_real_ip" 
> "$realip_remote_addr" "$request_completion" "$request_time" '
> '"$request_body"';
> 
> access_log  /var/log/nginx/access.log  main;
> 
> server {
> listen  80 default;
> 
> server_name localhost _;
> 
> location / {
> return 404;
> }
> }
> }
> ---
> 
> 
> `wrk` is compiled with a default configuration. We test like this:
> 
> `wrk -c 10 --header "Connection: close" -d 10 -t 1 --latency 
> http://10.10.10.92:80/missing`
> 
> 
> Also, it seems that our issue, and the one described in this thread, are 
> identical:
> 
>https://lists.freebsd.org/pipermail/freebsd-net/2019-June/053667.html
> 
> We both have the Intel network cards, BTW. Our network cards are these:
> 
> em0 at pci0:10:0:0:class=0x02 card=0x15d9 chip=0x10d38086 
> rev=0x00 hdr=0x00
> vendor = 'Intel Corporation'
> device = '82574L Gigabit Network Connection'
> 
> ixl0 at pci0:4:0:0:class=0x02 card=0x00078086 chip=0x15728086 
> rev=0x01 hdr=0x00
> vendor = 'Intel Corporation'
> device = 'Ethernet Controller X710 for 10GbE SFP+'
> 
> 
> ==
> 
> Additional info:
> 
> During the tests, we have bonded two interfaces into a lagg:
> 
> ixl0: flags=8843 metric 0 mtu 1500
> 
> options=c500b8
> ether 3c:fd:fe:aa:60:20
> media: Ethernet autoselect (10Gbase-SR )
> status: active
> nd6 options=29
> ixl1: flags=8843 metric 0 mtu 1500
> 
> options=c500b8
> ether 3c:fd:fe:aa:60:20
> hwaddr 3c:fd:fe:aa:60:21
> media: Ethernet autoselect (10Gbase-SR )
> status: active
> nd6 options=29
> 
> 
> la

Re[2]: Network anomalies after update from 11.2 STABLE to 12.1 STABLE

2019-10-19 Thread Paul



19 October 2019, 19:35:24, by "Michael Tuexen" :

> > On 19. Oct 2019, at 18:09, Paul  wrote:
> > 
> > Hi Michael,
> > 
> > Thank you for taking your time!
> > 
> > We use physical machines. We do not have any special `pf` rules.
> > Both sides ran `pfctl -d` before testing.
> Hi Paul,
> 
> OK. How are the physical machines connected to each other?

We have tested different connections: an old copper ethernet cable as well as
an optical connection, with an identical outcome. Machines are connected
through a Juniper QFX5100.


> 
> What happens when you don't use a lagg interface, but the physical ones?
> 
> (Trying to localise the problem...)

Same thing, lagg does not change anything. Originally, the problem was 
observed on a regular interface.


We have also tested on different hardware. Results are consistently stable
on 11.2-STABLE and consistently unstable on 12.1-STABLE. The only constant
is the network card vendor: Intel.

> 
> Best regards
> Michael
> > 
> > 
> > `nginx` config is primitive, no secrets there:
> > 
> > ---
> > user  www;
> > worker_processes  auto;
> > 
> > error_log  /var/log/nginx/error.log warn;
> > 
> > events {
> > worker_connections  81920;
> > kqueue_changes  4096;
> > use kqueue;
> > }
> > 
> > http {
> > include mime.types;
> > default_type  application/octet-stream;
> > 
> > sendfile  off;
> > keepalive_timeout   65;
> > tcp_nopush  on;
> > tcp_nodelay on;
> > 
> > # Logging
> > log_format  main'$remote_addr - $remote_user [$time_local] 
> > "$request" '
> > '$status $request_length $body_bytes_sent 
> > "$http_referer" '
> > '"$http_user_agent" "$http_x_real_ip" 
> > "$realip_remote_addr" "$request_completion" "$request_time" '
> > '"$request_body"';
> > 
> > access_log  /var/log/nginx/access.log  main;
> > 
> > server {
> > listen  80 default;
> > 
> > server_name localhost _;
> > 
> > location / {
> > return 404;
> > }
> > }
> > }
> > ---
> > 
> > 
> > `wrk` is compiled with a default configuration. We test like this:
> > 
> > `wrk -c 10 --header "Connection: close" -d 10 -t 1 --latency 
> > http://10.10.10.92:80/missing`
> > 
> > 
> > Also, it seems that our issue, and the one described in this thread, are 
> > identical:
> > 
> >https://lists.freebsd.org/pipermail/freebsd-net/2019-June/053667.html
> > 
> > We both have the Intel network cards, BTW. Our network cards are these:
> > 
> > em0 at pci0:10:0:0:class=0x02 card=0x15d9 chip=0x10d38086 
> > rev=0x00 hdr=0x00
> > vendor = 'Intel Corporation'
> > device = '82574L Gigabit Network Connection'
> > 
> > ixl0 at pci0:4:0:0:class=0x02 card=0x00078086 chip=0x15728086 
> > rev=0x01 hdr=0x00
> > vendor = 'Intel Corporation'
> > device = 'Ethernet Controller X710 for 10GbE SFP+'
> > 
> > 
> > ==
> > 
> > Additional info:
> > 
> > During the tests, we have bonded two interfaces into a lagg:
> > 
> > ixl0: flags=8843 metric 0 mtu 1500
> > 
> > options=c500b8
> > ether 3c:fd:fe:aa:60:20
> > media: Ethernet autoselect (10Gbase-SR )
> > status: active
> > nd6 options=29
> > ixl1: flags=8843 metric 0 mtu 1500
> > 
> > options=c500b8
> > ether 3c:fd:fe:aa:60:20
> > hwaddr 3c:fd:fe:aa:60:21
> > media: Ethernet autoselect (10Gbase-SR )
> > status: active
> > nd6 options=29
> > 
> > 
> > lagg0: flags=8843 metric 0 mtu 1500
> > 
> > options=c500b8
> > ether 3c:fd:fe:aa:60:20
> > inet 10.10.10.92 netmask 0x broadcast 10.10.255.255
> > laggproto failover lagghash l2,l3,l4
> > laggport: ixl0 flags=5
> > laggport: ixl1 flags=0<>
> > groups: lagg
> > media: Ethernet autoselect
> > status: active
> > nd6 options=29
> > 
> > using this config:
> > 
> > ifconfig_ixl0="up -lro -tso -rxcsum -txcsum"  (tried different options 
> > - got the same outcome)
> > ifconfig_ixl1="up -lro -tso -rxcsum -txcsum"
> > ifconfig_lagg0="laggproto failover laggport ixl0 laggport ixl1 
> > 10.10.10.92/24"
> > 
> > 
> > We have randomly picked `ixl0` and restricted number of RX/TX queues to 1:
> > /boot/loader.conf :
> > dev.ixl.0.iflib.override_ntxqs=1
> > dev.ixl.0.iflib.override_nrxqs=1
> > 
> > leaving `ixl1` with a default number, matching number of cores (6).
> > 
> > 
> > ixl0:  mem 
> > 0xf880-0xf8ff,0xf9808000-0xf980 irq 40 at device 0.0 on pci4
> > ixl0: fw 5.0.40043 

Re: Network anomalies after update from 11.2 STABLE to 12.1 STABLE

2019-10-19 Thread Michael Tuexen
> On 19. Oct 2019, at 18:09, Paul  wrote:
> 
> Hi Michael,
> 
> Thank you for taking your time!
> 
> We use physical machines. We do not have any special `pf` rules.
> Both sides ran `pfctl -d` before testing.
Hi Paul,

OK. How are the physical machines connected to each other?

What happens when you don't use a lagg interface, but the physical ones?

(Trying to localise the problem...)

Best regards
Michael
> 
> 
> `nginx` config is primitive, no secrets there:
> 
> ---
> user  www;
> worker_processes  auto;
> 
> error_log  /var/log/nginx/error.log warn;
> 
> events {
> worker_connections  81920;
> kqueue_changes  4096;
> use kqueue;
> }
> 
> http {
> include mime.types;
> default_type  application/octet-stream;
> 
> sendfile  off;
> keepalive_timeout   65;
> tcp_nopush  on;
> tcp_nodelay on;
> 
> # Logging
> log_format  main'$remote_addr - $remote_user [$time_local] 
> "$request" '
> '$status $request_length $body_bytes_sent 
> "$http_referer" '
> '"$http_user_agent" "$http_x_real_ip" 
> "$realip_remote_addr" "$request_completion" "$request_time" '
> '"$request_body"';
> 
> access_log  /var/log/nginx/access.log  main;
> 
> server {
> listen  80 default;
> 
> server_name localhost _;
> 
> location / {
> return 404;
> }
> }
> }
> ---
> 
> 
> `wrk` is compiled with a default configuration. We test like this:
> 
> `wrk -c 10 --header "Connection: close" -d 10 -t 1 --latency 
> http://10.10.10.92:80/missing`
> 
> 
> Also, it seems that our issue, and the one described in this thread, are 
> identical:
> 
>https://lists.freebsd.org/pipermail/freebsd-net/2019-June/053667.html
> 
> We both have the Intel network cards, BTW. Our network cards are these:
> 
> em0 at pci0:10:0:0:class=0x02 card=0x15d9 chip=0x10d38086 
> rev=0x00 hdr=0x00
> vendor = 'Intel Corporation'
> device = '82574L Gigabit Network Connection'
> 
> ixl0 at pci0:4:0:0:class=0x02 card=0x00078086 chip=0x15728086 
> rev=0x01 hdr=0x00
> vendor = 'Intel Corporation'
> device = 'Ethernet Controller X710 for 10GbE SFP+'
> 
> 
> ==
> 
> Additional info:
> 
> During the tests, we have bonded two interfaces into a lagg:
> 
> ixl0: flags=8843 metric 0 mtu 1500
> 
> options=c500b8
> ether 3c:fd:fe:aa:60:20
> media: Ethernet autoselect (10Gbase-SR )
> status: active
> nd6 options=29
> ixl1: flags=8843 metric 0 mtu 1500
> 
> options=c500b8
> ether 3c:fd:fe:aa:60:20
> hwaddr 3c:fd:fe:aa:60:21
> media: Ethernet autoselect (10Gbase-SR )
> status: active
> nd6 options=29
> 
> 
> lagg0: flags=8843 metric 0 mtu 1500
> 
> options=c500b8
> ether 3c:fd:fe:aa:60:20
> inet 10.10.10.92 netmask 0x broadcast 10.10.255.255
> laggproto failover lagghash l2,l3,l4
> laggport: ixl0 flags=5
> laggport: ixl1 flags=0<>
> groups: lagg
> media: Ethernet autoselect
> status: active
> nd6 options=29
> 
> using this config:
> 
> ifconfig_ixl0="up -lro -tso -rxcsum -txcsum"  (tried different options - 
> got the same outcome)
> ifconfig_ixl1="up -lro -tso -rxcsum -txcsum"
> ifconfig_lagg0="laggproto failover laggport ixl0 laggport ixl1 
> 10.10.10.92/24"
> 
> 
> We have randomly picked `ixl0` and restricted number of RX/TX queues to 1:
> /boot/loader.conf :
> dev.ixl.0.iflib.override_ntxqs=1
> dev.ixl.0.iflib.override_nrxqs=1
> 
> leaving `ixl1` with a default number, matching number of cores (6).
> 
> 
> ixl0:  mem 
> 0xf880-0xf8ff,0xf9808000-0xf980 irq 40 at device 0.0 on pci4
> ixl0: fw 5.0.40043 api 1.5 nvm 5.05 etid 80002927 oem 1.261.0
> ixl0: PF-ID[0]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
> ixl0: Using 1024 TX descriptors and 1024 RX descriptors
> ixl0: Using 1 RX queues 1 TX queues
> ixl0: Using MSI-X interrupts with 2 vectors
> ixl0: Ethernet address: 3c:fd:fe:aa:60:20
> ixl0: Allocating 1 queues for PF LAN VSI; 1 queues active
> ixl0: PCI Express Bus: Speed 8.0GT/s Width x4
> ixl0: SR-IOV ready
> ixl0: netmap queues/slots: TX 1/1024, RX 1/1024
> ixl1:  mem 
> 0xf800-0xf87f,0xf980-0xf9807fff irq 40 at device 0.1 on pci4
> ixl1: fw 5.0.40043 api 1.5 nvm 5.05 etid 80002927 oem 1.261.0
> ixl1: PF-ID[1]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
> ixl1: Using 1024 TX descriptors and 1024 RX descriptors
> ixl1: Using 6 RX queues 6 TX 

Re: Re[2]: Network anomalies after update from 11.2 STABLE to 12.1 STABLE

2019-10-19 Thread Rick Macklem
Btw, I once ran into a situation where "smart networking" was injecting
RSTs into a TCP stream. The packet captures at the client and server
machines were identical, except for the RSTs and the problem went away
when I connected the two machines with a cable, bypassing the network.
Might be worth a try, if you can do it?

Good luck with it, rick


From: owner-freebsd-...@freebsd.org  on behalf 
of Paul 
Sent: Saturday, October 19, 2019 12:09 PM
To: michael.tue...@lurchi.franken.de; freebsd-net@freebsd.org; 
freebsd-sta...@freebsd.org
Subject: Re[2]: Network anomalies after update from 11.2 STABLE to 12.1 STABLE

Hi Michael,

Thank you for taking your time!

We use physical machines. We do not have any special `pf` rules.
Both sides ran `pfctl -d` before testing.


`nginx` config is primitive, no secrets there:

---
user  www;
worker_processes  auto;

error_log  /var/log/nginx/error.log warn;

events {
worker_connections  81920;
kqueue_changes  4096;
use kqueue;
}

http {
include mime.types;
default_type  application/octet-stream;

sendfile  off;
keepalive_timeout   65;
tcp_nopush  on;
tcp_nodelay on;

# Logging
log_format  main'$remote_addr - $remote_user [$time_local] 
"$request" '
'$status $request_length $body_bytes_sent 
"$http_referer" '
'"$http_user_agent" "$http_x_real_ip" 
"$realip_remote_addr" "$request_completion" "$request_time" '
'"$request_body"';

access_log  /var/log/nginx/access.log  main;

server {
listen  80 default;

server_name localhost _;

location / {
return 404;
}
}
}
---


`wrk` is compiled with a default configuration. We test like this:

`wrk -c 10 --header "Connection: close" -d 10 -t 1 --latency 
http://10.10.10.92:80/missing`


Also, it seems that our issue, and the one described in this thread, are 
identical:

   https://lists.freebsd.org/pipermail/freebsd-net/2019-June/053667.html

We both have the Intel network cards, BTW. Our network cards are these:

em0 at pci0:10:0:0:class=0x02 card=0x15d9 chip=0x10d38086 
rev=0x00 hdr=0x00
vendor = 'Intel Corporation'
device = '82574L Gigabit Network Connection'

ixl0 at pci0:4:0:0:class=0x02 card=0x00078086 chip=0x15728086 
rev=0x01 hdr=0x00
vendor = 'Intel Corporation'
device = 'Ethernet Controller X710 for 10GbE SFP+'


==

Additional info:

During the tests, we have bonded two interfaces into a lagg:

ixl0: flags=8843 metric 0 mtu 1500

options=c500b8
ether 3c:fd:fe:aa:60:20
media: Ethernet autoselect (10Gbase-SR )
status: active
nd6 options=29
ixl1: flags=8843 metric 0 mtu 1500

options=c500b8
ether 3c:fd:fe:aa:60:20
hwaddr 3c:fd:fe:aa:60:21
media: Ethernet autoselect (10Gbase-SR )
status: active
nd6 options=29


lagg0: flags=8843 metric 0 mtu 1500

options=c500b8
ether 3c:fd:fe:aa:60:20
inet 10.10.10.92 netmask 0x broadcast 10.10.255.255
laggproto failover lagghash l2,l3,l4
laggport: ixl0 flags=5
laggport: ixl1 flags=0<>
groups: lagg
media: Ethernet autoselect
status: active
nd6 options=29

using this config:

ifconfig_ixl0="up -lro -tso -rxcsum -txcsum"  (tried different options - 
got the same outcome)
ifconfig_ixl1="up -lro -tso -rxcsum -txcsum"
ifconfig_lagg0="laggproto failover laggport ixl0 laggport ixl1 
10.10.10.92/24"


We have randomly picked `ixl0` and restricted number of RX/TX queues to 1:
/boot/loader.conf :
dev.ixl.0.iflib.override_ntxqs=1
dev.ixl.0.iflib.override_nrxqs=1

leaving `ixl1` with a default number, matching number of cores (6).


ixl0:  mem 
0xf880-0xf8ff,0xf9808000-0xf980 irq 40 at device 0.0 on pci4
ixl0: fw 5.0.40043 api 1.5 nvm 5.05 etid 80002927 oem 1.261.0
ixl0: PF-ID[0]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
ixl0: Using 1024 TX descriptors and 1024 RX descriptors
ixl0: Using 1 RX queues 1 TX queues
ixl0: Using MSI-X interrupts with 2 vectors
ixl0: Ethernet address: 3c:fd:fe:aa:60:20
ixl0: Allocating 1 queues for PF LAN VSI; 1 queues active
ixl0: PCI Express Bus: Speed 8.0GT/s Width x4
ixl0: SR-IOV ready
ixl0: netmap queues/slots: TX 1/1024, RX 1/1024
ixl1:  mem 
0xf800-0xf87f,0xf980-0xf9807fff irq 

Re[2]: Network anomalies after update from 11.2 STABLE to 12.1 STABLE

2019-10-19 Thread Paul
Hi Michael,

Thank you for taking your time!

We use physical machines. We do not have any special `pf` rules.
Both sides ran `pfctl -d` before testing.


`nginx` config is primitive, no secrets there:

---
user  www;
worker_processes  auto;

error_log  /var/log/nginx/error.log warn;

events {
worker_connections  81920;
kqueue_changes  4096;
use kqueue;
}

http {
include mime.types;
default_type  application/octet-stream;

sendfile  off;
keepalive_timeout   65;
tcp_nopush  on;
tcp_nodelay on;

# Logging
log_format  main'$remote_addr - $remote_user [$time_local] 
"$request" '
'$status $request_length $body_bytes_sent 
"$http_referer" '
'"$http_user_agent" "$http_x_real_ip" 
"$realip_remote_addr" "$request_completion" "$request_time" '
'"$request_body"';

access_log  /var/log/nginx/access.log  main;

server {
listen  80 default;

server_name localhost _;

location / {
return 404;
}
}
}
---


`wrk` is compiled with a default configuration. We test like this:

`wrk -c 10 --header "Connection: close" -d 10 -t 1 --latency 
http://10.10.10.92:80/missing`


Also, it seems that our issue, and the one described in this thread, are 
identical:

   https://lists.freebsd.org/pipermail/freebsd-net/2019-June/053667.html

We both have the Intel network cards, BTW. Our network cards are these:

em0 at pci0:10:0:0:class=0x02 card=0x15d9 chip=0x10d38086 
rev=0x00 hdr=0x00
vendor = 'Intel Corporation'
device = '82574L Gigabit Network Connection'

ixl0 at pci0:4:0:0:class=0x02 card=0x00078086 chip=0x15728086 
rev=0x01 hdr=0x00
vendor = 'Intel Corporation'
device = 'Ethernet Controller X710 for 10GbE SFP+'


==

Additional info:

During the tests, we have bonded two interfaces into a lagg:

ixl0: flags=8843 metric 0 mtu 1500

options=c500b8
ether 3c:fd:fe:aa:60:20
media: Ethernet autoselect (10Gbase-SR )
status: active
nd6 options=29
ixl1: flags=8843 metric 0 mtu 1500

options=c500b8
ether 3c:fd:fe:aa:60:20
hwaddr 3c:fd:fe:aa:60:21
media: Ethernet autoselect (10Gbase-SR )
status: active
nd6 options=29


lagg0: flags=8843 metric 0 mtu 1500

options=c500b8
ether 3c:fd:fe:aa:60:20
inet 10.10.10.92 netmask 0x broadcast 10.10.255.255
laggproto failover lagghash l2,l3,l4
laggport: ixl0 flags=5
laggport: ixl1 flags=0<>
groups: lagg
media: Ethernet autoselect
status: active
nd6 options=29

using this config:

ifconfig_ixl0="up -lro -tso -rxcsum -txcsum"  (tried different options - 
got the same outcome)
ifconfig_ixl1="up -lro -tso -rxcsum -txcsum"
ifconfig_lagg0="laggproto failover laggport ixl0 laggport ixl1 
10.10.10.92/24"


We have randomly picked `ixl0` and restricted number of RX/TX queues to 1:
/boot/loader.conf :
dev.ixl.0.iflib.override_ntxqs=1
dev.ixl.0.iflib.override_nrxqs=1

leaving `ixl1` with a default number, matching number of cores (6).


ixl0:  mem 
0xf880-0xf8ff,0xf9808000-0xf980 irq 40 at device 0.0 on pci4
ixl0: fw 5.0.40043 api 1.5 nvm 5.05 etid 80002927 oem 1.261.0
ixl0: PF-ID[0]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
ixl0: Using 1024 TX descriptors and 1024 RX descriptors
ixl0: Using 1 RX queues 1 TX queues
ixl0: Using MSI-X interrupts with 2 vectors
ixl0: Ethernet address: 3c:fd:fe:aa:60:20
ixl0: Allocating 1 queues for PF LAN VSI; 1 queues active
ixl0: PCI Express Bus: Speed 8.0GT/s Width x4
ixl0: SR-IOV ready
ixl0: netmap queues/slots: TX 1/1024, RX 1/1024
ixl1:  mem 
0xf800-0xf87f,0xf980-0xf9807fff irq 40 at device 0.1 on pci4
ixl1: fw 5.0.40043 api 1.5 nvm 5.05 etid 80002927 oem 1.261.0
ixl1: PF-ID[1]: VFs 64, MSI-X 129, VF MSI-X 5, QPs 768, I2C
ixl1: Using 1024 TX descriptors and 1024 RX descriptors
ixl1: Using 6 RX queues 6 TX queues
ixl1: Using MSI-X interrupts with 7 vectors
ixl1: Ethernet address: 3c:fd:fe:aa:60:21
ixl1: Allocating 8 queues for PF LAN VSI; 6 queues active
ixl1: PCI Express Bus: Speed 8.0GT/s Width x4
ixl1: SR-IOV ready
ixl1: netmap queues/slots: TX 6/1024, RX 6/1024


This allowed us to easily switch between configurations without rebooting,
by simply shutting down one interface or the other:

`ifconfig XXX down`

When testing `ixl0` that runs only a single queue:
ixl0: Using 1 RX queues 1 TX queues

Re: Network anomalies after update from 11.2 STABLE to 12.1 STABLE

2019-10-19 Thread Michael Tuexen
> On 18. Oct 2019, at 14:57, Paul  wrote:
> 
> Our current version is:
> 
>   FreeBSD 11.2-STABLE #0 r340725
> 
> New version that we have problems with:
> 
>   FreeBSD 12.1-STABLE #5 r352893
> 
> 
> After the update to the new version, we started to observe an incredible
> number of errors in HTTP requests between various services in our system.
> This problem appeared on all the servers that were upgraded, and does not
> seem to be specific to a concrete network card: we use different models,
> and all are affected.
> 
> During various tests, we observed a lot of spontaneous TCP stream abortions, 
> including at the establishment stage (SYN) in cases that were 100% issue free
> on 11.2-STABLE. Concrete test cases will be shown below.
> 
> We also want to highlight that, on numerous occasions, we have observed
> random, huge ACK numbers in the first response to a SYN packet, instead of 1,
> as expected. This forces the client to abort the connection via RST.
> 
> At first glance it looks like races in the kernel, because the problem
> disappears when:
>  * we use `dev.ixl.0.iflib.override_nrxqs=1` and 
> `dev.ixl.0.iflib.override_ntxqs=1`
>  * we use `dev.ixl.0.iflib.override_nrxqs=0` and 
> `dev.ixl.0.iflib.override_ntxqs=0`, but don't issue concurrent TCP streams
> 
> These are some debug log messages, emitted by 12.1-STABLE:
> 
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16304 to [10.10.10.92]:80 
> tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
> entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16326 to [10.10.10.92]:80 
> tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
> entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16402 to [10.10.10.92]:80 
> tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
> entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16652 to [10.10.10.92]:80 
> tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
> entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16686 to [10.10.10.92]:80 
> tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
> entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:18562 to [10.10.10.92]:80 
> tcpflags 0x4; tcp_do_segment: Timestamp missing, no action
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:18918 to [10.10.10.92]:80 
> tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
> entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19331 to [10.10.10.92]:80 
> tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
> entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19340 to [10.10.10.92]:80 
> tcpflags 0x4; tcp_do_segment: Timestamp missing, no action
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19340 to [10.10.10.92]:80 
> tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
> entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19340 to [10.10.10.92]:80 
> tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
> entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19489 to [10.10.10.92]:80 
> tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
> entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19580 to [10.10.10.92]:80 
> tcpflags 0x4; tcp_do_segment: Timestamp missing, no action
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19580 to [10.10.10.92]:80 
> tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
> entry (possibly syncookie only), segment ignored
> Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19580 to [10.10.10.92]:80 
> tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
> entry (possibly syncookie only), segment ignored
> Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:17705 to [10.10.10.92]:80; 
> syncache_timer: Response timeout, retransmitting (1) SYN|ACK
> Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:18066 to [10.10.10.92]:80; 
> syncache_timer: Response timeout, retransmitting (1) SYN|ACK
> Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:18066 to [10.10.10.92]:80 
> tcpflags 0x4; syncache_chkrst: Our SYN|ACK was rejected, connection 
> attempt aborted by remote endpoint
> Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:17705 to [10.10.10.92]:80 
> tcpflags 0x4; syncache_chkrst: Our SYN|ACK was rejected, connection 
> attempt aborted by remote endpoint
> 
> Here, 10.10.10.92 runs 12.1-STABLE, while 10.10.10.39 is a client that runs 
> 11.2-STABLE.
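When these kernel messages arrive in floods, it helps to tally them by the reporting function. A small stdlib-only sketch (the log format assumed here is exactly the one shown above):

```python
# Tally FreeBSD TCP debug log lines by the reporting function
# (syncache_chkrst, tcp_do_segment, syncache_timer, ...).
import re
from collections import Counter

LINE = re.compile(r"kernel: TCP: \[([\d.]+)\]:(\d+) to \[([\d.]+)\]:(\d+).*?; (\w+):")

def tally(log_lines):
    # group(5) is the function name between "; " and ":".
    return Counter(m.group(5) for line in log_lines
                   if (m := LINE.search(line)))

sample = [
    "Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16304 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored",
    "Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:18562 to [10.10.10.92]:80 tcpflags 0x4; tcp_do_segment: Timestamp missing, no action",
    "Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:17705 to [10.10.10.92]:80; syncache_timer: Response timeout, retransmitting (1) SYN|ACK",
]
print(tally(sample))
```

Feeding it the whole of /var/log/messages would show which failure mode dominates.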
> 
> 
> In our test case we use nginx and wrk, with a minimal config, where 

> Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:17705 to [10.10.10.92]:80; 
> syncache_timer: Response timeout, retransmitting (1) SYN|ACK
> Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:18066 to [10.10.10.92]:80; 
> syncache_timer: Response timeout, retransmitting (1) SYN|ACK
> Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:18066 to [10.10.10.92]:80 
> tcpflags 0x4; syncache_chkrst: Our SYN|ACK was rejected, connection 
> attempt aborted by remote endpoint
> Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:17705 to [10.10.10.92]:80 
> tcpflags 0x4; syncache_chkrst: Our SYN|ACK was rejected, connection 
> attempt aborted by remote endpoint
> 
> Here, 10.10.10.92 runs 12.1-STABLE, while 10.10.10.39 is a client that runs 
> 11.2-STABLE.
> 
> 
> In our test case we use nginx and wrk , with a minimal config, 

Network anomalies after update from 11.2 STABLE to 12.1 STABLE

2019-10-18 Thread Paul
Our current version is:

FreeBSD 11.2-STABLE #0 r340725

New version that we have problems with:

FreeBSD 12.1-STABLE #5 r352893


After updating to the new version, we started to observe an enormous number of 
errors in HTTP requests between various services in our system. The problem 
appeared on all upgraded servers and does not seem to be specific to a 
particular network card: we use different models, and all are affected.

During various tests, we observed a lot of spontaneous TCP stream aborts, 
including at the connection establishment stage (SYN), in cases that were 
100% issue-free on 11.2-STABLE. Concrete test cases are shown below.

We also want to highlight that, on numerous occasions, we have observed random, 
huge ACK numbers in the first response to a SYN packet, instead of the expected 
value of 1 (relative to the client's ISN). This forces the client to abort the 
connection with an RST.
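The handshake rule being violated here can be sketched in a few lines (an illustrative check, not code from the report): the ACK number in the server's SYN|ACK must acknowledge exactly the client's initial sequence number plus one, modulo 2^32; anything else is bogus, and a correct client tears the connection down with an RST.

```python
def syn_ack_is_valid(client_isn: int, syn_ack_acknum: int) -> bool:
    """True if the SYN|ACK acknowledges exactly ISN+1 (TCP sequence
    numbers wrap around at 2**32)."""
    return syn_ack_acknum == (client_isn + 1) % 2**32

# A SYN carrying ISN 1000 must be answered with ack=1001 ...
assert syn_ack_is_valid(1000, 1001)
# ... while a "random, huge" ACK number like those observed is rejected.
assert not syn_ack_is_valid(1000, 2_667_014_210)
# Sequence numbers wrap: after ISN 2**32 - 1 the expected ack is 0.
assert syn_ack_is_valid(2**32 - 1, 0)
```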

At first glance it looks like a race in the kernel, because the problem 
disappears when:
   * we set `dev.ixl.0.iflib.override_nrxqs=1` and 
`dev.ixl.0.iflib.override_ntxqs=1`
   * we set `dev.ixl.0.iflib.override_nrxqs=0` and 
`dev.ixl.0.iflib.override_ntxqs=0`, but don't issue concurrent TCP streams
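For reference, the tunables above are per-device iflib loader tunables; to make the single-queue workaround persistent across reboots they can be set at boot (a sketch for the ixl0 device named in the report; adjust the device name for other NICs):

```
# /boot/loader.conf -- force a single RX/TX queue pair on ixl0
# (workaround sketch; tunable names taken from the report above)
dev.ixl.0.iflib.override_nrxqs=1
dev.ixl.0.iflib.override_ntxqs=1
```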

These are some debug log messages emitted by 12.1-STABLE:

Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16304 to [10.10.10.92]:80 
tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16326 to [10.10.10.92]:80 
tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16402 to [10.10.10.92]:80 
tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16652 to [10.10.10.92]:80 
tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16686 to [10.10.10.92]:80 
tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:18562 to [10.10.10.92]:80 
tcpflags 0x4; tcp_do_segment: Timestamp missing, no action
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:18918 to [10.10.10.92]:80 
tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19331 to [10.10.10.92]:80 
tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19340 to [10.10.10.92]:80 
tcpflags 0x4; tcp_do_segment: Timestamp missing, no action
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19340 to [10.10.10.92]:80 
tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19340 to [10.10.10.92]:80 
tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19489 to [10.10.10.92]:80 
tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19580 to [10.10.10.92]:80 
tcpflags 0x4; tcp_do_segment: Timestamp missing, no action
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19580 to [10.10.10.92]:80 
tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19580 to [10.10.10.92]:80 
tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache 
entry (possibly syncookie only), segment ignored
Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:17705 to [10.10.10.92]:80; 
syncache_timer: Response timeout, retransmitting (1) SYN|ACK
Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:18066 to [10.10.10.92]:80; 
syncache_timer: Response timeout, retransmitting (1) SYN|ACK
Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:18066 to [10.10.10.92]:80 
tcpflags 0x4; syncache_chkrst: Our SYN|ACK was rejected, connection 
attempt aborted by remote endpoint
Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:17705 to [10.10.10.92]:80 
tcpflags 0x4; syncache_chkrst: Our SYN|ACK was rejected, connection 
attempt aborted by remote endpoint

Here, 10.10.10.92 runs 12.1-STABLE, while 10.10.10.39 is a client that runs 
11.2-STABLE.
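To see at a glance which TCP debug events dominate, log excerpts like the one above can be tallied with a short script (a hypothetical helper, not part of the report; the samples below are single-line forms of the wrapped messages):

```python
import re
from collections import Counter

# Sample kernel log lines in the format shown above.
LOG = [
    "Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16304 to [10.10.10.92]:80 "
    "tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache "
    "entry (possibly syncookie only), segment ignored",
    "Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:18562 to [10.10.10.92]:80 "
    "tcpflags 0x4; tcp_do_segment: Timestamp missing, no action",
    "Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:17705 to [10.10.10.92]:80; "
    "syncache_timer: Response timeout, retransmitting (1) SYN|ACK",
]

# The debug messages have the shape "...; <function>: <reason>".
PAT = re.compile(r'; (\w+): (.+)$')

def tally(lines):
    """Count log lines per emitting TCP function (syncache_chkrst, ...)."""
    counts = Counter()
    for line in lines:
        m = PAT.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

print(tally(LOG))
```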


In our test case we use nginx and wrk, with a minimal config where nginx 
always returns a 404 error page. nginx runs on the 12.1-STABLE machine, 
while wrk runs on the 11.2-STABLE one.
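For context, a minimal nginx server block of the kind described (a sketch; the exact config used in the test is not shown in the report) could look like:

```
# Hypothetical minimal vhost: answer every request with 404,
# matching the "always returns a 404 error page" setup described above.
server {
    listen 80;
    location / {
        return 404;
    }
}
```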

We run wrk like so:

wrk -c 10 --header "Connection: close" -d 10 -t 1 --latency 
http://10.10.10.92:80/missing

and often see