On Tue, Nov 27, 2018 at 3:32 PM Willem de Bruijn <willemdebruijn.ker...@gmail.com> wrote: > > On Tue, Nov 27, 2018 at 2:58 PM Willem de Bruijn > <willemdebruijn.ker...@gmail.com> wrote: > > > > On Tue, Nov 27, 2018 at 1:41 PM Maxim Mikityanskiy <maxi...@mellanox.com> > > wrote: > > > > > > Hi everyone, > > > > > > We are experiencing an issue with Mellanox mlx5 driver, and I tracked it > > > down to > > > the packet_snd function in net/packet/af_packet.c. > > > > > > Brief description: when a socket is created by calling `socket(AF_PACKET, > > > SOCK_RAW, 0)`, the mlx5 driver receives an skb with wrong > > > transport_offset, > > > which can confuse the driver and cause the transmit to fail (depending on > > > the > > > configuration of the NIC). > > > > > > The flow is the following: > > > > > > 1. packet_snd is called. > > > > > > 2. dev->hard_header_len (which is 14) is assigned to reserve. > > > > > > 3. The value of the third parameter of the initial socket() call is > > > assigned to > > > skb->protocol. In our case, it's 0. > > > > > > 4. skb_probe_transport_header is called with offset_hint == reserve > > > (which is > > > 14). > > > > > > 5. __skb_flow_dissect fails, because skb->protocol is 0. > > > > > > 6. skb_probe_transport_header happily sets transport_header to 14. > > > > > > I find this behavior (defaulting to 14) strange, because network_header > > > is also > > > set to 14, and the transport_header value is just wrong. Moreover, there > > > are two > > > more calls to skb_probe_transport_header in this file with offset_hint == > > > 0, > > > which looks more reasonable (if we can't find the transport header, we > > > indicate > > > that there is none, instead of pointing to the network header). > > > > That is not what offset_hint 0 does. It also sets the transport header > > to the same as the network header. > > Actually, what you observe may be due to commit b84bbaf7a6c8cc > ("packet: in packet_snd start writing at link layer allocation"). This > updated skb_set_network_header, but not the fall-back value for > skb_probe_transport_header. Let me take a closer look.
No, before and after the transport header is set to the same offset as the network header. This does leave the question whether the transport header would be better of not being set if it cannot be probed. I have not checked whether anything in the stack requires it to be set (i.e., not ~0U). The result with SOCK_DGRAM is actually probably buggy. In that case reserve is 0. But as in the case of SOCK_RAW skb->data is pointing to the link layer header when passing to po->xmit. So, nh_off is ETH_HLEN, but th_off is 0. Again, same behavior before and after the above patch, so ignore that red herring.