[Bug 271205] [ix] [carp]: Continuous input errors on Intel X553

2023-05-02 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271205

Mark Linimon  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |IntelNetworking, regression
           Assignee|b...@freebsd.org            |n...@freebsd.org



Re: Cwnd grows slowly during slow-start due to LRO of the receiver side.

2023-05-02 Thread Rodney W. Grimes
Second attempt; the first one failed because I am not a member
of the list :-(.

> Adding freebsd-transp...@freebsd.org to get that specific group's
> eyes on this issue.
> 
> Rod
> 

-- 
Rod Grimes rgri...@freebsd.org



Re: Cwnd grows slowly during slow-start due to LRO of the receiver side.

2023-05-02 Thread Rodney W. Grimes
Adding freebsd-transp...@freebsd.org to get that specific group's
eyes on this issue.

Rod


-- 
Rod Grimes rgri...@freebsd.org



Re: Cwnd grows slowly during slow-start due to LRO of the receiver side.

2023-05-02 Thread Hans Petter Selasky

Hi Chen,

Have you tested with FreeBSD main / 14?

The "nsegs" are passed along like this:

nsegs = max(1, m->m_pkthdr.lro_nsegs);

...

cc_ack_received(tp, th, nsegs, CC_ACK);

...

(Newreno - FreeBSD-14)

incr = min(ccv->bytes_this_ack,
    ccv->nsegs * abc_val *
    CCV(ccv, t_maxseg));

And in FreeBSD-10, which is mentioned in the article you linked:

(Newreno - FreeBSD-10)

incr = min(ccv->bytes_this_ack,
    V_tcp_abc_l_var * CCV(ccv, t_maxseg));


There is no such nsegs factor in the FreeBSD-10 code.

This issue may already have been fixed!
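
To make the difference between the two snippets concrete, here is a rough
Python sketch of both increments. The function names are hypothetical and
the arithmetic only mirrors the two min() expressions quoted above; it is
not the kernel code. Whether nsegs is ever greater than 1 depends on how
many ACKs the sender's own LRO merged, which is an assumption here.

```python
def incr_freebsd10_style(bytes_this_ack, maxseg, abc_l_var=2):
    """Hypothetical mirror of the FreeBSD-10 expression quoted above."""
    return min(bytes_this_ack, abc_l_var * maxseg)


def incr_freebsd14_style(bytes_this_ack, nsegs, maxseg, abc_val=2):
    """Hypothetical mirror of the FreeBSD-14 expression: the cap scales with nsegs."""
    return min(bytes_this_ack, nsegs * abc_val * maxseg)


if __name__ == "__main__":
    mss = 1448
    acked = 10 * mss  # one ACK covering ten segments
    print(incr_freebsd10_style(acked, mss))      # capped at 2 * MSS
    print(incr_freebsd14_style(acked, 1, mss))   # nsegs = 1: same cap
    print(incr_freebsd14_style(acked, 5, mss))   # nsegs = 5: cap is 10 * MSS
```

With nsegs = 1 the two expressions give the same result, so the larger
cap only applies when several ACKs were merged before reaching the
congestion control module.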

--HPS



Re: Cwnd grows slowly during slow-start due to LRO of the receiver side.

2023-05-02 Thread Hans Petter Selasky

Hi Chen!

The FreeBSD mbufs carry the number of ACKs that have been joined
together in the following field:


m->m_pkthdr.lro_nsegs

Can this value be of any use to cc_newreno?

--HPS



Cwnd grows slowly during slow-start due to LRO of the receiver side.

2023-05-02 Thread Chen Shuo
As per newreno_ack_received() in sys/netinet/cc/cc_newreno.c, the
FreeBSD TCP sender strictly follows RFC 5681 with the RFC 3465
extension.  That is, during slow-start, when receiving an ACK that
acknowledges 'bytes_acked' bytes:

cwnd += min(bytes_acked, abc_l_var * SMSS);  // abc_l_var = 2 by default

As discussed in Section 3.2 of RFC 3465, L=2*SMSS bytes exactly balances
the negative impact of the delayed ACK algorithm.  RFC 5681 also
requires that a receiver SHOULD generate an ACK for at least every
second full-sized segment, so bytes_acked per ACK is at most 2 * SMSS.
If both sender and receiver follow it, cwnd should grow exponentially
during slow-start:

cwnd *= 2    (per RTT)
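
A minimal sketch of that expected growth, assuming an idealized receiver
that ACKs every second full-sized segment, no loss, and one full window
sent per RTT. The function names and the model are illustrative
assumptions, not FreeBSD code:

```python
def abc_increment(bytes_acked, smss, abc_l_var=2):
    """Slow-start increment for one ACK: min(bytes_acked, L * SMSS)."""
    return min(bytes_acked, abc_l_var * smss)


def cwnd_after_rtts(rtts, smss=1448, initial_segments=10, segs_per_ack=2):
    """Return cwnd in bytes after each RTT when each ACK covers segs_per_ack segments."""
    cwnd = initial_segments * smss
    history = [cwnd]
    for _ in range(rtts):
        in_flight = cwnd  # assumption: the whole window is sent and ACKed each RTT
        acked = 0
        while acked < in_flight:
            chunk = min(segs_per_ack * smss, in_flight - acked)
            cwnd += abc_increment(chunk, smss)
            acked += chunk
        history.append(cwnd)
    return history


if __name__ == "__main__":
    for rtt, cwnd in enumerate(cwnd_after_rtts(4)):
        print(rtt, cwnd // 1448, "MSS")
```

Under these assumptions the window goes 10, 20, 40, 80, 160 MSS, i.e. it
doubles every RTT.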

However, LRO and TSO are widely used today, so a receiver may generate
far fewer ACKs than it used to.  As I observed, both FreeBSD and
Linux generate at most one ACK per segment assembled by LRO/GRO.
The worst case is one ACK per 45 MSS, as 45 * 1448 = 65160 < 65535.
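
The worst-case figure can be checked with a few lines of Python. This is
only a sketch of the arithmetic above: it ignores IP/TCP header bytes,
as the calculation in the text does, and the constant and function names
are made up for illustration.

```python
MAX_AGGREGATE = 65535        # 16-bit length limit used in the text
PAYLOAD_PER_SEGMENT = 1448   # MSS 1460 minus 12 bytes of TCP timestamp option


def max_segments_per_aggregate(limit=MAX_AGGREGATE, payload=PAYLOAD_PER_SEGMENT):
    """Largest number of full-sized segments that fit in one aggregate."""
    return limit // payload


def growth_fraction(segments_per_ack, abc_l_var=2):
    """Fraction of the ACKed segments that is credited to cwnd in slow-start."""
    return min(segments_per_ack, abc_l_var) / segments_per_ack


if __name__ == "__main__":
    n = max_segments_per_aggregate()
    print(n, n * PAYLOAD_PER_SEGMENT)   # 45 segments, 65160 bytes
    print(growth_fraction(2))           # 1.0 with classic delayed ACKs
    print(growth_fraction(n))           # 2/45 in the worst case
```

With one ACK per 45 segments and abc_l_var = 2, only 2 of every 45
acknowledged segments are credited to cwnd.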

Sending 1MB over a link with 100ms delay from FreeBSD 13.2:

 0.000 IP sender > sink: Flags [S], seq 205083268, win 65535, options [mss 1460,nop,wscale 10,sackOK,TS val 495212525 ecr 0], length 0
 0.100 IP sink > sender: Flags [S.], seq 708257395, ack 205083269, win 65160, options [mss 1460,sackOK,TS val 563185696 ecr 495212525,nop,wscale 7], length 0
 0.100 IP sender > sink: Flags [.], ack 1, win 65, options [nop,nop,TS val 495212626 ecr 563185696], length 0
 // TSopt omitted below for brevity.

 // cwnd = 10 * MSS, sent 10 * MSS
 0.101 IP sender > sink: Flags [.], seq 1:14481, ack 1, win 65, length 14480

 // got one ACK for 10 * MSS, cwnd += 2 * MSS, sent 12 * MSS
 0.201 IP sink > sender: Flags [.], ack 14481, win 427, length 0
 0.201 IP sender > sink: Flags [.], seq 14481:31857, ack 1, win 65, length 17376

 // got ACK of 12*MSS above, cwnd += 2 * MSS, sent 14 * MSS
 0.301 IP sink > sender: Flags [.], ack 31857, win 411, length 0
 0.301 IP sender > sink: Flags [.], seq 31857:52129, ack 1, win 65, length 20272

 // got ACK of 14*MSS above, cwnd += 2 * MSS, sent 16 * MSS
 0.402 IP sink > sender: Flags [.], ack 52129, win 395, length 0
 0.402 IP sender > sink: Flags [P.], seq 52129:73629, ack 1, win 65, length 21500
 0.402 IP sender > sink: Flags [.], seq 73629:75077, ack 1, win 65, length 1448

As a consequence, instead of growing exponentially, cwnd grows
more-or-less quadratically during slow-start, unless abc_l_var is
set to a sufficiently large value.
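
The effect can be reproduced with a small model. This is a sketch under
simplifying assumptions (one full window per RTT, no loss, the receiver
sends one ACK per aggregate of up to segs_per_ack segments); the function
name and parameters are illustrative, and cwnd is counted in segments.

```python
def simulate_slow_start(rtts, initial_segments=10, segs_per_ack=45, abc_l_var=2):
    """Return cwnd (in segments) after each RTT for a receiver that coalesces ACKs."""
    cwnd = initial_segments
    history = [cwnd]
    for _ in range(rtts):
        remaining = cwnd
        growth = 0
        while remaining > 0:
            acked = min(segs_per_ack, remaining)   # segments covered by this ACK
            growth += min(acked, abc_l_var)        # ABC cap applied per ACK
            remaining -= acked
        cwnd += growth
        history.append(cwnd)
    return history


if __name__ == "__main__":
    print(simulate_slow_start(3))                                  # coalesced ACKs
    print(simulate_slow_start(3, segs_per_ack=2))                  # classic delayed ACKs
    print(simulate_slow_start(3, segs_per_ack=45, abc_l_var=45))   # large abc_l_var
```

With the defaults the model gives 10, 12, 14, 16 segments, matching the
comments in the trace above; with one ACK per two segments, or with
abc_l_var raised to 45, it gives 10, 20, 40, 80.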

NewReno took more than 20 seconds to ramp up throughput to 100Mbps
over an emulated 100ms delay link, while Linux took ~2 seconds.
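
For scale, here is a rough estimate of the window needed to fill such a
link. This is a back-of-the-envelope sketch, not a model of either stack:
it assumes 1448 payload bytes per segment and an initial window of 10
segments, and the function names are made up for illustration. It is not
meant to reproduce the measured times.

```python
import math


def bdp_segments(rate_bps, rtt_s, payload=1448):
    """Bandwidth-delay product expressed in full-sized segments."""
    return rate_bps * rtt_s / 8 / payload


def doublings_needed(target_segments, initial_segments=10):
    """RTTs of ideal per-RTT doubling needed to reach target_segments."""
    return max(0, math.ceil(math.log2(target_segments / initial_segments)))


if __name__ == "__main__":
    target = bdp_segments(100_000_000, 0.1)
    print(round(target))             # about 863 segments
    print(doublings_needed(target))  # 7 doublings from 10 segments
```

Roughly 863 segments must be in flight to sustain 100Mbps at 100ms, which
ideal doubling from 10 segments reaches in 7 RTTs.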
I can provide the pcap file if anyone is interested.

Switching to CUBIC won't help, because it uses the same logic as
NewReno's ack_received() for slow start.

Is this a well-known issue, and is abc_l_var the only cure for it?
https://calomel.org/freebsd_network_tuning.html

Thank you!

Best,
Shuo Chen