Hello VPP developers,

We have a problem with VPP used for NAT on Ubuntu 18.04 servers equipped with Mellanox ConnectX-5 network cards (ConnectX-5 EN network interface card; 100GbE dual-port QSFP28; PCIe 3.0 x16; tall bracket; ROHS R6).
VPP is dropping packets in the ip4-input node due to "ip4 length > l2 length" errors when we use the RDMA plugin. The interfaces are configured like this:

create int rdma host-if enp101s0f1 name Interface101 num-rx-queues 1
create int rdma host-if enp179s0f1 name Interface179 num-rx-queues 1

(We have set num-rx-queues 1 for now to simplify troubleshooting; in production we use num-rx-queues 4.)

We see packets dropped due to "ip4 length > l2 length" for example in TCP tests at around 100 Mbit/s -- running such a test for just a few seconds already gives some errors. More traffic gives more errors, and the drops appear unrelated to the contents of the packets: they happen quite randomly, and already at these moderate traffic levels, far below what the hardware should be capable of. Only a small fraction of packets is affected: in tests at 100 Mbit/s with packet size 500, roughly 3 or 4 packets per million get the "ip4 length > l2 length" drop. However, the effect appears stronger at higher traffic levels and has impacted some of our end users, who observe decreased TCP speed as a result of these drops.

The "ip4 length > l2 length" errors can be seen with vppctl "show errors":

    142    ip4-input    ip4 length > l2 length

To get more information about the "ip4 length > l2 length" error we printed the involved sizes when the error happens (ip_len0 and cur_len0 in src/vnet/ip/ip4_input.h; a rough sketch of the print statement we added is included at the end of this mail). This shows that the actual packet size is often much smaller than ip_len0, which is what the packet size should be according to the IP header. For example, when ip_len0=500, as is the case for many of our packets in these test runs, cur_len0 is sometimes much smaller. The smallest case we have seen was cur_len0=59 with ip_len0=500 -- the IP header said the packet was 500 bytes, but the actual size was only 59 bytes. So it seems data is lost and packets have been truncated, sometimes with large parts of the packet missing.

The problem disappears if we skip the RDMA plugin and handle the interfaces the (old?) DPDK way; then there are no "ip4 length > l2 length" drops at all. That makes us think something is wrong with the RDMA plugin, either a bug or a problem with how we have configured it.

We have tested both the current master branch and the stable/1908 branch and see the same problem with both. We also tried updating the Mellanox driver from v4.6 to v4.7 (the latest version), but that did not help.

After trying some different values of the rx-queue-size parameter of the "create int rdma" command (example commands at the end of this mail), it looks like the rate of "ip4 length > l2 length" drops decreases as rx-queue-size is increased, perhaps indicating that the problem has to do with what happens when the end of that queue is reached.

Our questions:

1) Do you agree that the above points to a problem with the RDMA plugin in VPP?

2) Are there known bugs or other issues that could explain the "ip4 length > l2 length" drops?

3) Is it a good idea to set a very large rx-queue-size if that alleviates the problem, or are there significant downsides to a large rx-queue-size value?

4) What else could we do to troubleshoot this further? Are there configuration options for the RDMA plugin that could help solve this and/or provide more information about what is happening?

Best regards,
Elias
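P.S. For reference, here is roughly the instrumentation we used to print the sizes. The first two assignments are paraphrased from memory from src/vnet/ip/ip4_input.h and may not match the exact code on your branch; only the clib_warning call is something we added for troubleshooting:

  /* ip_len0 is the length claimed by the IP header,
     cur_len0 is the length of the buffer (chain) VPP actually has */
  ip_len0 = clib_net_to_host_u16 (ip0->length);
  cur_len0 = vlib_buffer_length_in_chain (vm, p0);

  /* debug print added while troubleshooting: log both lengths whenever
     the packet is shorter than its IP header claims */
  if (PREDICT_FALSE (cur_len0 < ip_len0))
    clib_warning ("ip4 length > l2 length: ip_len0=%u cur_len0=%u",
                  ip_len0, cur_len0);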
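P.P.S. The commands we used when experimenting with the RX ring size looked like the following (the value 4096 here is just one example of a larger size we tried, not a recommendation):

create int rdma host-if enp101s0f1 name Interface101 num-rx-queues 1 rx-queue-size 4096
create int rdma host-if enp179s0f1 name Interface179 num-rx-queues 1 rx-queue-size 4096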