On Thu, Jun 4, 2020 at 9:10 PM Eric Dumazet <eduma...@google.com> wrote:
>
> On Thu, Jun 4, 2020 at 2:01 AM <kerneljasonx...@gmail.com> wrote:
> >
> > From: Jason Xing <kerneljasonx...@gmail.com>
> >
> > When using BBR mode, too many tcp socks cannot be released because of
> > duplicate use of the sock_hold() in the manner of tcp_internal_pacing()
> > when RTO happens. Therefore, this situation maddly increases the slab
> > memory and then constantly triggers the OOM until crash.
> >
> > Besides, in addition to BBR mode, if some mode applies pacing function,
> > it could trigger what we've discussed above,
> >
> > Reproduce procedure:
> > 0) cat /proc/slabinfo | grep TCP
> > 1) switch net.ipv4.tcp_congestion_control to bbr
> > 2) using wrk tool something like that to send packages
> > 3) using tc to increase the delay and loss to simulate the RTO case.
> > 4) cat /proc/slabinfo | grep TCP
> > 5) kill the wrk command and observe the number of objects and slabs in
> > TCP.
> > 6) at last, you could notice that the number would not decrease.
> >
> > v2: extend the timer which could cover all those related potential risks
> > (suggested by Eric Dumazet and Neal Cardwell)
> >
> > Signed-off-by: Jason Xing <kerneljasonx...@gmail.com>
> > Signed-off-by: liweishi <liwei...@kuaishou.com>
> > Signed-off-by: Shujin Li <lishu...@kuaishou.com>
>
> That is not how things work really.
>
> I will submit this properly so that stable teams do not have to guess
> how to backport this to various kernels.
>
> Changelog is misleading, this has nothing to do with BBR, we need to be
> precise.
>

Thanks for your help. I can finally apply this patch to my kernel.

Looking forward to your patchset :)

Jason

> Thank you.