Our current version is:
FreeBSD 11.2-STABLE #0 r340725
The new version, which we have problems with, is:
FreeBSD 12.1-STABLE #5 r352893
After updating to the new version, we started to observe a very large number
of errors in HTTP requests between the various services in our system. The
problem appeared on every server that was upgraded and does not seem to be
specific to a particular network card: we use different models, and all are
affected.
During various tests we observed many spontaneous TCP stream aborts,
including during connection establishment (SYN), in cases that were completely
issue-free on 11.2-STABLE. Concrete test cases are shown below.
We also want to highlight that, on numerous occasions, we have observed random,
huge ACK numbers in the first response to a SYN packet, instead of 1 (relative
to the client's initial sequence number), as expected.
This forces the client to abort the connection with an RST.
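To make the "instead of 1" check concrete: in a healthy handshake, the ACK field of the SYN|ACK equals the client's initial sequence number plus one, modulo 2^32, so the relative ACK is 1. The helper below is a minimal sketch of that arithmetic only. `relative_ack` is a hypothetical name, and the two raw 32-bit values are assumed to have been read from a packet capture that shows absolute sequence numbers.

```sh
#!/bin/sh
# relative_ack ISN ACK
# Hypothetical helper: prints the ACK number of a SYN|ACK relative to the
# client's initial sequence number (ISN), with 32-bit wraparound.
# Assumes the shell evaluates arithmetic with at least 64-bit integers.
# A well-formed SYN|ACK yields 1; any other value matches the symptom above.
relative_ack() {
    echo $(( ($2 - $1) & 0xFFFFFFFF ))
}

relative_ack 1000 1001        # ordinary case, prints 1
relative_ack 4294967295 0     # ISN at the top of the 32-bit range, prints 1
```

Any result other than 1 for the first SYN|ACK of a connection would correspond to the bogus acknowledgment described above.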
At first glance this looks like a race in the kernel, because the problem
disappears when:
* we set `dev.ixl.0.iflib.override_nrxqs=1` and
`dev.ixl.0.iflib.override_ntxqs=1`
* we set `dev.ixl.0.iflib.override_nrxqs=0` and
`dev.ixl.0.iflib.override_ntxqs=0`, but do not run concurrent TCP streams
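For anyone who wants to reproduce the first workaround on several ports, the settings can be generated as in the sketch below. This rests on assumptions not stated in the report: that the iflib override values are applied as boot-time tunables from /boot/loader.conf, and that the affected ports use the ixl driver. `single_queue_tunables` is a hypothetical helper name.

```sh
#!/bin/sh
# single_queue_tunables [UNIT]
# Hypothetical helper: prints loader.conf-style lines that pin one ixl(4)
# port to a single RX queue and a single TX queue (the first workaround).
# UNIT defaults to 0, matching dev.ixl.0 in the report.
single_queue_tunables() {
    unit="${1:-0}"
    printf 'dev.ixl.%s.iflib.override_nrxqs="1"\n' "$unit"
    printf 'dev.ixl.%s.iflib.override_ntxqs="1"\n' "$unit"
}

# Print the lines for unit 0; review them before adding them to
# /boot/loader.conf (assumed location for these tunables).
single_queue_tunables 0
```

The function only prints text and changes nothing on the system, so the output can be inspected first.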
These are some debug log messages, emitted by 12.1-STABLE:
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16304 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16326 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16402 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16652 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:16686 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:18562 to [10.10.10.92]:80 tcpflags 0x4; tcp_do_segment: Timestamp missing, no action
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:18918 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19331 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19340 to [10.10.10.92]:80 tcpflags 0x4; tcp_do_segment: Timestamp missing, no action
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19340 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19340 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19489 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19580 to [10.10.10.92]:80 tcpflags 0x4; tcp_do_segment: Timestamp missing, no action
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19580 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:01 test kernel: TCP: [10.10.10.39]:19580 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored
Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:17705 to [10.10.10.92]:80; syncache_timer: Response timeout, retransmitting (1) SYN|ACK
Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:18066 to [10.10.10.92]:80; syncache_timer: Response timeout, retransmitting (1) SYN|ACK
Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:18066 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Our SYN|ACK was rejected, connection attempt aborted by remote endpoint
Oct 18 14:59:02 test kernel: TCP: [10.10.10.39]:17705 to [10.10.10.92]:80 tcpflags 0x4; syncache_chkrst: Our SYN|ACK was rejected, connection attempt aborted by remote endpoint
Here, 10.10.10.92 runs 12.1-STABLE, while 10.10.10.39 is a client that runs
11.2-STABLE.
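To get an overview of a longer log, the messages can be tallied by the kernel function that emitted them. The sketch below assumes the entries are stored one per line in the log file and that each TCP debug message has the form `function: text` after a semicolon, as in the excerpt above. `tally_tcp_debug` is a hypothetical helper name.

```sh
#!/bin/sh
# tally_tcp_debug
# Hypothetical helper: reads kernel log lines on stdin and prints the number
# of TCP debug messages per emitting function, most frequent first.
tally_tcp_debug() {
    grep 'kernel: TCP:' |
        sed -n 's/.*; *\([a-z_0-9]*\): .*/\1/p' |
        sort | uniq -c | sort -rn
}

# Example usage (the log path is an assumption):
#   tally_tcp_debug < /var/log/messages
```

In the excerpt above, tcpflags 0x4 is the RST bit, so a tally dominated by `syncache_chkrst` means the server is mostly reacting to resets sent by the client.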
In our test case we use nginx and wrk with a minimal config in which nginx
always returns a 404 error page. nginx runs on the 12.1-STABLE host, while wrk
runs on the 11.2-STABLE host.
We run wrk like so:
wrk -c 10 --header "Connection: close" -d 10 -t 1 --latency http://10.10.10.92:80/missing
and often see