On 10/15/2018 07:50 AM, Eric Dumazet wrote:
> On Mon, Oct 15, 2018 at 3:26 AM Gasper Zejn <zelo.z...@gmail.com> wrote:
>>
>>
>> I've tried to isolate the issue as best I could. There seems to be an
>> issue if the TCP socket has keepalive set and send queue is not empty
>> and the route goes away.
>>
>> https://github.com/zejn/bbr_pfifo_interrupts_issue
>>
>> Hope this helps,
>> Gasper
> 
> This is awesome Gasper, I will take a look thanks.
> 
> Note that we are about to send a patch series (targeting net-next) to
> polish the EDT patch series that was merged last month for linux-4.20.
> TCP internal pacing is going to be much better performance-wise.
> 

Yeah, I believe that :

Commit c092dd5f4a7f4e4dbbcc8cf2e50b516bf07e432f ("tcp: switch
tcp_internal_pacing() to tcp_wstamp_ns")
has incidentally fixed the issue.

That is because it calls tcp_internal_pacing() from
tcp_update_skb_after_send() which is called only if the packet was
correctly sent by IP layer.

Before this patch, tcp_internal_pacing() was called from
__tcp_transmit_skb() before we attempted to send the clone
and the clone could be dropped in IP layer (lack of route for example)
right away.

So in case the packet was not sent because of a route problem, the high 
resolution
timer would kick soon after and TCP xmit path would be entered again, 
triggering this loop problem.

I am going to send the 2nd round of EDT patches, so that you can try David 
Miller net-next tree
with all the patches we believe are needed for 4.20. Once proven to work, we 
might have to backport
the series to 4.18 and 4.19

Thanks !

Reply via email to