From: Yonghong Song <y...@fb.com>
Date: Wed, 21 Mar 2018 16:31:02 -0700

> One of our in-house projects, bpf-based NAT, hits a kernel BUG_ON at
> function skb_segment(), line 3667. The bpf program attaches to
> clsact ingress, calls bpf_skb_change_proto to change protocol
> from ipv4 to ipv6 or from ipv6 to ipv4, and then calls bpf_redirect
> to send the changed packet out.
>  ...
>     3665         while (pos < offset + len) {
>     3666                 if (i >= nfrags) {
>     3667                         BUG_ON(skb_headlen(list_skb));
>  ...
>
> The triggering input skb has the following properties:
>     list_skb = skb->frag_list;
>     skb->nfrags != NULL && skb_headlen(list_skb) != 0
> and skb_segment() is not able to handle a frag_list skb
> if its headlen (list_skb->len - list_skb->data_len) is not 0.
>
> Patch #1 provides a simple solution to avoid BUG_ON. If
> list_skb->head_frag is true, its page-backed frag will
> be processed before the list_skb->frags.
> Patch #2 provides a test case in test_bpf module which
> constructs a skb and calls skb_segment() directly. The test
> case is able to trigger the BUG_ON without Patch #1.
>
> The patch has been tested in the following setup:
>   ipv6_host <-> nat_server <-> ipv4_host
> where nat_server has a bpf program doing ipv4<->ipv6
> translation and forwarding through clsact hook
> bpf_skb_change_proto.

Series applied, however I'm still not 100% convinced that allowing
this kind of protocol and MSS sized mucked GRO packet is a good idea.
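
For anyone following the thread, the triggering condition quoted above can be modelled in user space roughly as follows. This is only a sketch: struct toy_skb and both helper functions are hypothetical stand-ins, not kernel types or APIs, and the frag count is treated as an integer (nr_frags != 0) rather than compared against NULL as written in the cover letter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct sk_buff (not the kernel type). */
struct toy_skb {
	unsigned int len;          /* total bytes: linear area plus page frags */
	unsigned int data_len;     /* bytes held in page frags only */
	unsigned int nr_frags;     /* number of page frags on this skb */
	struct toy_skb *frag_list; /* first skb on the frag list, or NULL */
};

/* Linear (head) length, computed as in the cover letter: len - data_len. */
unsigned int toy_skb_headlen(const struct toy_skb *skb)
{
	return skb->len - skb->data_len;
}

/*
 * True when the input matches the properties listed in the cover letter:
 * the head skb has page frags and a frag list, and the first frag_list
 * skb still carries a non-empty linear area.
 */
bool toy_would_hit_bug_on(const struct toy_skb *skb)
{
	const struct toy_skb *list_skb = skb->frag_list;

	return list_skb != NULL &&
	       skb->nr_frags != 0 &&
	       toy_skb_headlen(list_skb) != 0;
}
```

With a head skb that has two page frags and a frag_list skb of len 100 and data_len 60, toy_skb_headlen() on the frag_list skb is 40 and the predicate is true. Setting that skb's data_len to 100 makes its headlen 0 and the predicate false.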