Hi Jeff, Thank you for the bug report.
As you mention the graph node path taken by these packets does not go thru ip4-lookup and so does not have the fib-index set. Since ip4-lookup is replaced by ip4-options, IMO ip4-options would be the place to add the code you have identified, so that we don’t do unnecessary work on packets that are for-us and were subject to the lookup. Please submit a patch with the change. We have some unit tests in test/test_igmp.py (test_igmp_router()) where we send IGMP packets to an interface that is not IGMP enabled and so they take the same graph node paths as your packet. This tests passes. Is your input interface bound to a non-default VRF? If so would you be able to add a new test to that file where the input interface is bound to another table? Thanks, neale De : <vpp-dev@lists.fd.io> au nom de Jeff <jct...@mykolab.com> Date : mardi 16 octobre 2018 à 22:27 À : "vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io> Objet : [vpp-dev] vpp crash when handling IGMP with router alert Hello, I have a tap interface connected to a noisy LAN and I found that a certain type of IGMP packet will sometimes cause a crash (backtrace at the end) in ip4_fib_mtrie_lookup_step_one(). More specifically its an IGMP packet with the router alert IP option. Here's a packet trace: 00:02:41:522429: virtio-input virtio: hw_if_index 6 next-index 4 vring 0 len 54 hdr: flags 0x00 gso_type 0x00 hdr_len 0 gso_size 0 csum_start 0 csum_offset 0 num_buffers 1 00:02:41:522430: ethernet-input IP4: 00:0c:29:1f:43:a4 -> 01:00:5e:00:00:16 00:02:41:522430: ip4-input IGMP: 172.20.2.194 -> 224.0.0.22 version 4, header length 24 tos 0xc0, ttl 1, length 40, checksum 0x5523 fragment id 0x0000, flags DONT_FRAGMENT 00:02:41:522431: ip4-options option:[0x94,0x4,0x0,0x0] 00:02:41:522431: ip4-local IGMP: 172.20.2.194 -> 224.0.0.22 version 4, header length 24 tos 0xc0, ttl 1, length 40, checksum 0x5523 fragment id 0x0000, flags DONT_FRAGMENT 00:02:41:522434: igmp-input sw_if_index 6 next-index 0 membership_report_v3: code 0, checksum 0xfbf4 00:02:41:522435: error-drop igmp-input: IGMP not enabled on this interface I found that when the crash occurs vnet_buffer(b)->ip.fib_index is ~0 in ip4_local_check_src(). Here's an example debug print just added just after "if (PREDICT_FALSE (last_check->src.as_u32 != ip0->src_address.as_u32))" in ip4_local_check_src() Usual case: ip4_local_check_src: (00000000 != 0101A8C0), buf 0x7f6b6301b900, vlib_tx 4294967295 fib index 0 When crash happens: ip4_local_check_src: (00000000 != 0100A8C0), buf 0x7f6b63a00000, vlib_tx 4294967295 fib index 4294967295 I think the problem is that vnet_buffer(b)->ip.fib_index isn't set anywhere in this processing chain (ip4-input -> ip4-options -> ip4-local). This can cause an invalid pointer to be used when looking up the mtrie in ip4_local_check_src(). Normally the fib_index metadata is assigned by ip4-lookup via ip_lookup_set_buffer_fib_index(). But since the packet doesn't traverse that node the metadata is unset. I'm guessing that due to luck and/or initialization the fib_index metadata is usually zero, so the crash won't happen until the metadata is modified elsewhere and then the buffer is reused for this IGMP packet with router alert. I hope this is what's happening and it's not something more nefarious like memory corruption. I made the following change at the top of ip4_local_check_src (taken from ip_lookup_set_buffer_fib_index()) const dpo_id_t *dpo0; load_balance_t *lb0; u32 lbi0; + ip4_main_t *im = &ip4_main; vnet_buffer (b)->ip.fib_index = + vec_elt (im->fib_index_by_sw_if_index, vnet_buffer (b)->sw_if_index[VLIB_RX]); + vnet_buffer (b)->ip.fib_index = vnet_buffer (b)->sw_if_index[VLIB_TX] != ~0 ? vnet_buffer (b)->sw_if_index[VLIB_TX] : vnet_buffer (b)->ip.fib_index; With this change I was unable to trigger the crash. Don't know if this is a proper fix though. Here's the backtrace (some of the line numbers might be offset due to my debugging): Thread 1 "vpp_main" received signal SIGSEGV, Segmentation fault. 0x00007f73861c2748 in ip4_fib_mtrie_lookup_step_one (dst_address=0x7f717de38e1a, m=<optimized out>) at /home/jeff/vpp/src/vnet/ip/ip4_mtrie.h:229 229 /home/jeff/vpp/src/vnet/ip/ip4_mtrie.h: No such file or directory. (gdb) bt #0 0x00007f73861c2748 in ip4_fib_mtrie_lookup_step_one (dst_address=0x7f717de38e1a, m=<optimized out>) at /home/jeff/vpp/src/vnet/ip/ip4_mtrie.h:229 #1 ip4_local_check_src (error0=<synthetic pointer>, last_check=<synthetic pointer>, ip0=0x7f717de38e0e, b=<optimized out>) at /home/jeff/vpp/src/vnet/ip/ip4_forward.c:1352 #2 ip4_local_inline (vm=<optimized out>, node=<optimized out>, frame=<optimized out>, head_of_feature_arc=<optimized out>) at /home/jeff/vpp/src/vnet/ip/ip4_forward.c:1586 #3 0x00007f7385c70014 in dispatch_node (last_time_stamp=17304359695215669, frame=0x7f718dcaf300, dispatch_state=VLIB_NODE_STATE_POLLING, type=VLIB_NODE_TYPE_INTERNAL, node=0x7f7184ed2ec0, vm=0x7f7385ec9980 <vlib_global_main>) at /home/jeff/vpp/src/vlib/main.c:989 #4 dispatch_pending_node (vm=vm@entry=0x7f7385ec9980 <vlib_global_main>, pending_frame_index=pending_frame_index@entry=3, last_time_stamp=last_time_stamp@entry=17304359695215669) at /home/jeff/vpp/src/vlib/main.c:1139 #5 0x00007f7385c719fd in vlib_main_or_worker_loop (is_main=1, vm=0x7f7385ec9980 <vlib_global_main>) at /home/jeff/vpp/src/vlib/main.c:1555 #6 vlib_main_loop (vm=0x7f7385ec9980 <vlib_global_main>) at /home/jeff/vpp/src/vlib/main.c:1629 #7 vlib_main (vm=vm@entry=0x7f7385ec9980 <vlib_global_main>, input=input@entry=0x7f7184dfffa0) at /home/jeff/vpp/src/vlib/main.c:1820 #8 0x00007f7385cab453 in thread0 (arg=140134144842112) at /home/jeff/vpp/src/vlib/unix/main.c:607 #9 0x00007f73855917cc in clib_calljmp () from /usr/lib/x86_64-linux-gnu/libvppinfra.so.18.10 #10 0x00007ffd505eee30 in ?? () #11 0x00007f7385cac4e7 in vlib_unix_main (argc=<optimized out>, argv=<optimized out>) at /home/jeff/vpp/src/vlib/unix/main.c:676 #12 0x0000000000000000 in ?? () (gdb) up #1 ip4_local_check_src (error0=<synthetic pointer>, last_check=<synthetic pointer>, ip0=0x7f717de38e0e, b=<optimized out>) at /home/jeff/vpp/src/vnet/ip/ip4_forward.c:1352 1352 /home/jeff/vpp/src/vnet/ip/ip4_forward.c: No such file or directory. (gdb) p/x *ip0 $15 = {{ip_version_and_header_length = 0x46, tos = 0xc0, length = 0x2800, fragment_id = 0x0, flags_and_fragment_offset = 0x40, ttl = 0x1, protocol = 0x2, checksum = 0x8183, {{src_address = {data = {0xac, 0x14, 0xd4, 0x63}, data_u32 = 0x63d414ac, as_u8 = {0xac, 0x14, 0xd4, 0x63}, as_u16 = {0x14ac, 0x63d4}, as_u32 = 0x63d414ac}, dst_address = {data = {0xe0, 0x0, 0x0, 0x16}, data_u32 = 0x160000e0, as_u8 = {0xe0, 0x0, 0x0, 0x16}, as_u16 = { 0xe0, 0x1600}, as_u32 = 0x160000e0}}, address_pair = {src = {data = {0xac, 0x14, 0xd4, 0x63}, data_u32 = 0x63d414ac, as_u8 = {0xac, 0x14, 0xd4, 0x63}, as_u16 = {0x14ac, 0x63d4}, as_u32 = 0x63d414ac}, dst = {data = {0xe0, 0x0, 0x0, 0x16}, data_u32 = 0x160000e0, as_u8 = {0xe0, 0x0, 0x0, 0x16}, as_u16 = {0xe0, 0x1600}, as_u32 = 0x160000e0}}}}, {checksum_data_64 = {0x4000002800c046, 0x63d414ac81830201}, checksum_data_64_32 = {0x160000e0}}, {checksum_data_32 = {0x2800c046, 0x400000, 0x81830201, 0x63d414ac, 0x160000e0}}} Let me know if additional information is needed or if there are any other questions. Thanks, Jeff
-=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#10848): https://lists.fd.io/g/vpp-dev/message/10848 Mute This Topic: https://lists.fd.io/mt/27373338/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-