On Tue, 2018-11-27 at 10:00 +0000, Maxim Mikityanskiy wrote:
> Hi everyone,
>
> We are experiencing an issue with Mellanox mlx5 driver, and I tracked
> it down to the packet_snd function in net/packet/af_packet.c.
>
> Brief description: when a socket is created by calling
> `socket(AF_PACKET, SOCK_RAW, 0)`, the mlx5 driver receives an skb with
> wrong transport_offset, which can confuse the driver and cause the
> transmit to fail (depending on the configuration of the NIC).

Hi Max,

Can you elaborate? Which NIC, and what configuration? What do you mean by
confusion? In any case, please see below.

> The flow is the following:
>
> 1. packet_snd is called.
>
> 2. dev->hard_header_len (which is 14) is assigned to reserve.
>
> 3. The value of the third parameter of the initial socket() call is
> assigned to skb->protocol. In our case, it's 0.
>
> 4. skb_probe_transport_header is called with offset_hint == reserve
> (which is 14).
>
> 5. __skb_flow_dissect fails, because skb->protocol is 0.
>
> 6. skb_probe_transport_header happily sets transport_header to 14.
>
> I find this behavior (defaulting to 14) strange, because
> network_header is also set to 14, and the transport_header value is
> just wrong. Moreover, there are two more calls to
> skb_probe_transport_header in this file with offset_hint == 0, which
> looks more reasonable (if we can't find the transport header, we
> indicate that there is none, instead of pointing to the network
> header).
>
> Does anyone know why offset_hint is set to 14 in this single place?
> Can it be replaced by 0 safely, and what can be the consequences?
>
> Also, what guarantees does kernel provide for the network and
> transport header offsets? Especially in raw sockets, where the headers
> are not generated by different stack layers.
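
To make steps 3 to 6 of the quoted flow concrete, here is a small user-space model of the fallback. This is only a sketch, not kernel code: `struct model_skb`, `model_flow_dissect` and `model_probe_transport_header` are hypothetical stand-ins, and the dissector stub assumes it only understands IPv4 with a 20-byte header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_ETH_HLEN      14      /* Ethernet header length */
#define MODEL_ETH_P_IP      0x0800  /* EtherType for IPv4 */
#define MODEL_IPV4_MIN_HLEN 20      /* assumed: IPv4 header without options */

/* Hypothetical stand-in for the few sk_buff fields discussed in this thread. */
struct model_skb {
	uint16_t protocol;      /* third argument of socket(), 0 in the report */
	int network_offset;     /* offset of the network header in the packet */
	int transport_offset;   /* offset of the transport header in the packet */
};

/* Stub dissector: succeeds only for IPv4, fails for protocol 0 (step 5). */
static bool model_flow_dissect(const struct model_skb *skb, int *thoff)
{
	if (skb->protocol != MODEL_ETH_P_IP)
		return false;
	*thoff = skb->network_offset + MODEL_IPV4_MIN_HLEN;
	return true;
}

/* On dissect failure, fall back to offset_hint (step 6). */
static void model_probe_transport_header(struct model_skb *skb, int offset_hint)
{
	int thoff;

	assert(skb != NULL);
	if (model_flow_dissect(skb, &thoff))
		skb->transport_offset = thoff;
	else
		skb->transport_offset = offset_hint;
}
```

With protocol 0 and a hint of 14, the model leaves transport_offset equal to network_offset, which is the situation described in the report. With a hint of 0 it ends up at 0 instead.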
>

In mlx5 with ConnectX-4 or ConnectX-4 Lx there is a requirement to copy at
least the Ethernet header to the TX descriptor; otherwise the packet might
be dropped. For RAW sockets the skb header offsets are not set, but the
latest upstream mlx5 driver knows how to handle this and copies the minimum
amount required. Please see:

static inline u16 mlx5e_calc_min_inline(enum mlx5_inline_modes mode,
					struct sk_buff *skb)

It should default to:

	case MLX5_INLINE_MODE_L2:
	default:
		hlen = mlx5e_skb_l2_header_offset(skb);

static inline int mlx5e_skb_l2_header_offset(struct sk_buff *skb)
{
#define MLX5E_MIN_INLINE (ETH_HLEN + VLAN_HLEN)

	return max(skb_network_offset(skb), MLX5E_MIN_INLINE);
}

So it should return at least 18 and not 14.

We had some issues with this in older drivers, such as those in kernels
4.14/4.15, and it depends on the use case, so I need some information
first:

1. What cards do you have? (lspci)
2. What kernel/driver version are you using?
3. What is the current enum mlx5_inline_modes value seen in
   mlx5e_calc_min_inline or sq->min_inline_mode?
4. Firmware version? (ethtool -i)

Can you share the packet format you are sending and seeing the bad
behavior with?

> Thanks,
> Max
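
For reference, the "at least 18 and not 14" arithmetic can be checked with a small user-space sketch. It only mirrors the max() in mlx5e_skb_l2_header_offset; `model_l2_header_offset` and the `MODEL_*` macros are hypothetical names, and the constants assume the usual values ETH_HLEN = 14 and VLAN_HLEN = 4.

```c
#include <assert.h>

#define MODEL_ETH_HLEN   14  /* assumed value of ETH_HLEN */
#define MODEL_VLAN_HLEN   4  /* assumed value of VLAN_HLEN */
#define MODEL_MIN_INLINE (MODEL_ETH_HLEN + MODEL_VLAN_HLEN)

/*
 * Model of the L2 inline length: the larger of the network header
 * offset and the minimum inline size (Ethernet header plus one VLAN tag).
 */
static int model_l2_header_offset(int network_offset)
{
	assert(network_offset >= 0);
	return network_offset > MODEL_MIN_INLINE ? network_offset
						 : MODEL_MIN_INLINE;
}
```

With a network offset of 14, as in the reported flow, this gives 18, so in L2 inline mode the result does not depend on the transport header offset at all.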