On Fri, 14 Jun 2019 19:08:08 -0700 (PDT), David Miller wrote:
> From: Jakub Kicinski <[email protected]>
> Date: Wed, 12 Jun 2019 11:51:21 -0700
>
> > Brendan reports that the use of netem's packet corruption capability
> > leads to strange crashes. This seems to be caused by
> > commit d66280b12bd7 ("net: netem: use a list in addition to rbtree")
> > which uses skb->next pointer to construct a fast-path queue of
> > in-order skbs.
> >
> > Packet corruption code has to invoke skb_gso_segment() in case
> > of skbs in need of GSO. skb_gso_segment() returns a list of
> > skbs. If next pointers of the skbs on that list do not get cleared
> > fast path list goes into the weeds and tries to access the next
> > segment skb multiple times.
> >
> > Reported-by: Brendan Galloway <[email protected]>
> > Fixes: d66280b12bd7 ("net: netem: use a list in addition to rbtree")
> > Signed-off-by: Jakub Kicinski <[email protected]>
> > Reviewed-by: Dirk van der Merwe <[email protected]>
>
> Please rework the commit message a bit to make things clearer, your
> ascii diagrams would be great. :)

In the process of rewriting the commit message I found a memory leak,
and the backlog accounting is also buggy in the segmentation path:

qdisc netem 8001: root refcnt 64 limit 100 delay 19us corrupt 1%
 Sent 30237896 bytes 19895 pkt (dropped 1885, overlimits 0 requeues 287)
 backlog 0b 99p requeues 287
         ^^^^^^
         99 packets but 0 bytes

I need an internal review, and will repost soon. I need to stop looking
for bugs here 🙈
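
For reference, the stale next pointer problem described in the quoted commit message can be sketched in userspace. This is only a toy model, not the netem code: `struct pkt`, `struct fifo`, `fifo_enqueue()`, `fifo_walk_count()` and `count_after_enqueue()` are hypothetical names made up for illustration, and the three-element array stands in for the segment list that skb_gso_segment() returns. It shows that if a segment is put on a next-linked list while still pointing at its siblings, a walk of that list reaches packets that were never enqueued.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an skb: only the fields this sketch needs. */
struct pkt {
	int id;
	struct pkt *next;
};

/* Toy stand-in for an in-order list tracked by head/tail pointers. */
struct fifo {
	struct pkt *head;
	struct pkt *tail;
};

/* Append p to the list. If clear_next is 0, p->next is left untouched,
 * which models enqueueing a segment still linked to its siblings. */
void fifo_enqueue(struct fifo *q, struct pkt *p, int clear_next)
{
	assert(q != NULL && p != NULL);
	if (clear_next)
		p->next = NULL;
	if (q->tail)
		q->tail->next = p;
	else
		q->head = p;
	q->tail = p;
}

/* Count the packets reachable by walking the list from its head. */
int fifo_walk_count(const struct fifo *q)
{
	int n = 0;
	const struct pkt *p;

	for (p = q->head; p != NULL; p = p->next)
		n++;
	return n;
}

/* Build a three-segment list, enqueue only the first segment, and
 * report how many packets the list walk sees. */
int count_after_enqueue(int clear_next)
{
	struct pkt seg[3] = { {0, NULL}, {1, NULL}, {2, NULL} };
	struct fifo q = { NULL, NULL };

	seg[0].next = &seg[1];
	seg[1].next = &seg[2];

	fifo_enqueue(&q, &seg[0], clear_next);
	return fifo_walk_count(&q);
}
```

In this model count_after_enqueue(0) walks all three segments even though only one was enqueued, while count_after_enqueue(1) sees just the one.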