On Thu, 2016-04-07 at 13:59 +0800, Yang Yingliang wrote:
> 
> On 2016/3/30 21:47, Eric Dumazet wrote:
> > On Wed, 2016-03-30 at 13:56 +0800, Yang Yingliang wrote:
> >
> >> Sorry, I made a mistake. I am very sure my kernel has these two patches.
> >> And I can get some dropping of the packets in 10Gb eth.
> >>
> >> # netstat -s | grep -i backlog
> >>       TCPBacklogDrop: 4135
> >> # netstat -s | grep -i backlog
> >>       TCPBacklogDrop: 4167
> >
> > Sender will retransmit and the receiver backlog will likely be emptied
> > before the packets arrive again.
> >
> > Are you sure these are TCP drops ?
> Yes.
> 
> >
> > Which 10Gb NIC is it ? (ethtool -i eth0)
> The NIC driver is not upstream. And my system is arm64.
> 
> >
> > What is the max size of the sendmsg() chunks generated by your apps ?
> 256KB
> 
> >
> > Are they forcing small SO_RCVBUF or SO_SNDBUF ?
> I am not sure.
> I added some debug messages in the kernel:
> [2016-04-06 10:56:55][ 1365.477140] TCP: rcvbuf:10485760 sndbuf:2097152 
> limit:12582912 backloglen:12402232 rmem_alloc:0 truesize:53320
> [2016-04-06 10:56:55][ 1365.477170] TCP: rcvbuf:10485760 sndbuf:2097152 
> limit:12582912 backloglen:12460884 rmem_alloc:55986 truesize:58652
> [2016-04-06 10:56:55][ 1365.477192] TCP: rcvbuf:10485760 sndbuf:2097152 
> limit:12582912 backloglen:12506206 rmem_alloc:0 truesize:45322
> [2016-04-06 10:56:55][ 1365.477226] TCP: rcvbuf:10485760 sndbuf:2097152 
> limit:12582912 backloglen:12519536 rmem_alloc:7998 truesize:13330
> [2016-04-06 10:56:55][ 1365.477254] TCP: rcvbuf:10485760 sndbuf:2097152 
> limit:12582912 backloglen:12575522 rmem_alloc:0 truesize:55986
> [2016-04-06 10:56:55][ 1365.477282] TCP: rcvbuf:10485760 sndbuf:2097152 
> limit:12582912 backloglen:12634174 rmem_alloc:0 truesize:58652
> [2016-04-06 10:56:55][ 1365.477301] TCP: rcvbuf:10485760 sndbuf:2097152 
> limit:12582912 backloglen:12634174 rmem_alloc:26660 truesize:31992
> [2016-04-06 10:56:55][ 1365.477321] TCP: rcvbuf:10485760 sndbuf:2097152 
> limit:12582912 backloglen:12634174 rmem_alloc:58652 truesize:26660
> [2016-04-06 10:56:55][ 1365.477341] TCP: rcvbuf:10485760 sndbuf:2097152 
> limit:12582912 backloglen:12634174 rmem_alloc:58652 truesize:42656
> [2016-04-06 10:56:55][ 1365.477384] TCP: rcvbuf:10485760 sndbuf:2097152 
> limit:12582912 backloglen:12634174 rmem_alloc:0 truesize:58652
> [2016-04-06 10:56:55][ 1365.477403] TCP: rcvbuf:10485760 sndbuf:2097152 
> limit:12582912 backloglen:12634174 rmem_alloc:0 truesize:34658
> 
> >
> > What percentage of drops do you have ?
> The TCPBacklogDrop counter (netstat -s | grep -i TCPBacklogDrop) increases
> by 20-40 per second. That is about 0.055% of received segments
> (117724 TCPBacklogDrop / 214502873 InSegs from /proc/net/snmp).
> 
> >
> > Here (at Google), we have less than one backlog drop per billion
> > packets, on hosts facing the public Internet.
> >
> > If a TCP sender sends a burst of tiny packets because it is misbehaving,
> > you absolutely will drop packets, especially if applications use
> > sendmsg() with very big lengths and big SO_SNDBUF.
> >
> > Trying to not drop these hostile packets as you did is simply opening
> > your host to DOS attacks.
> >
> > Eventually, we should even drop earlier in TCP stack (before taking
> > socket lock).
> >
> >
> How about expanding the buffer like:

Please do not send patches before really understanding the issue you
have.

Having a backlog of 12506206 bytes is ridiculous. Dropping packets is
absolutely fine if this ever happens.

Something is really wrong on your host, or the sender simply does not
comply with the TCP protocol (it ignores the receive window entirely).

Since you added a trace of truesize, please also trace skb->len

