Hello,

I have a tap interface connected to a noisy LAN and I found that a certain type 
of IGMP packet will sometimes cause a crash (backtrace at the end) in 
ip4_fib_mtrie_lookup_step_one().  More specifically, it's an IGMP packet with the 
router alert IP option.  Here's a packet trace:

00:02:41:522429: virtio-input
  virtio: hw_if_index 6 next-index 4 vring 0 len 54
    hdr: flags 0x00 gso_type 0x00 hdr_len 0 gso_size 0 csum_start 0 csum_offset 0 num_buffers 1
00:02:41:522430: ethernet-input
  IP4: 00:0c:29:1f:43:a4 -> 01:00:5e:00:00:16
00:02:41:522430: ip4-input
  IGMP: 172.20.2.194 -> 224.0.0.22
    version 4, header length 24
    tos 0xc0, ttl 1, length 40, checksum 0x5523
    fragment id 0x0000, flags DONT_FRAGMENT
00:02:41:522431: ip4-options
    option:[0x94,0x4,0x0,0x0]
00:02:41:522431: ip4-local
    IGMP: 172.20.2.194 -> 224.0.0.22
      version 4, header length 24
      tos 0xc0, ttl 1, length 40, checksum 0x5523
      fragment id 0x0000, flags DONT_FRAGMENT
00:02:41:522434: igmp-input
  sw_if_index 6 next-index 0
  membership_report_v3: code 0, checksum 0xfbf4
00:02:41:522435: error-drop
  igmp-input: IGMP not enabled on this interface
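
For anyone reading the trace: "header length 24" means the IP header carries 
4 bytes of options, and [0x94,0x4,0x0,0x0] is the Router Alert option 
(RFC 2113).  Here is a minimal standalone sketch of that decoding.  It is not 
VPP code; the function names and macros are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* IPv4 option types; 0x94 is Router Alert (copied=1, class=0, number=20). */
#define IP4_OPTION_END_SKETCH          0x00
#define IP4_OPTION_NOP_SKETCH          0x01
#define IP4_OPTION_ROUTER_ALERT_SKETCH 0x94

/* Header length in bytes, from the first header byte (version << 4 | IHL). */
static size_t
ip4_header_bytes_sketch (uint8_t ip_version_and_header_length)
{
  return (size_t) (ip_version_and_header_length & 0x0f) * 4;
}

/* Return 1 if the options area holds a Router Alert option, else 0.
   opts points just past the 20-byte fixed header; n_opts is its length. */
static int
ip4_has_router_alert_sketch (const uint8_t *opts, size_t n_opts)
{
  size_t i = 0;

  while (i < n_opts)
    {
      uint8_t type = opts[i];
      uint8_t len;

      if (type == IP4_OPTION_END_SKETCH)
        break;
      if (type == IP4_OPTION_NOP_SKETCH)
        {
          i += 1;               /* single-byte option, no length field */
          continue;
        }
      if (i + 1 >= n_opts)
        break;                  /* truncated: no room for a length byte */
      len = opts[i + 1];
      if (len < 2 || i + len > n_opts)
        break;                  /* malformed length */
      if (type == IP4_OPTION_ROUTER_ALERT_SKETCH && len == 4)
        return 1;
      i += len;
    }
  return 0;
}
```

With the values from this report, ip4_header_bytes_sketch (0x46) gives 24, 
and the 4 option bytes after the fixed header are recognized as Router Alert.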

I found that when the crash occurs, vnet_buffer(b)->ip.fib_index is ~0 in 
ip4_local_check_src().  Here's a debug print I added just after "if 
(PREDICT_FALSE (last_check->src.as_u32 != ip0->src_address.as_u32))" in 
ip4_local_check_src():

Usual case:
ip4_local_check_src: (00000000 != 0101A8C0), buf 0x7f6b6301b900, vlib_tx 4294967295 fib index 0

When crash happens:
ip4_local_check_src: (00000000 != 0100A8C0), buf 0x7f6b63a00000, vlib_tx 4294967295 fib index 4294967295

I think the problem is that vnet_buffer(b)->ip.fib_index isn't set anywhere in 
this processing chain (ip4-input -> ip4-options -> ip4-local).  This can cause 
an invalid pointer to be used when looking up the mtrie in 
ip4_local_check_src().  Normally the fib_index metadata is assigned by 
ip4-lookup via ip_lookup_set_buffer_fib_index().  But since the packet doesn't 
traverse that node the metadata is unset.  I'm guessing that due to luck and/or 
initialization the fib_index metadata is usually zero, so the crash won't 
happen until the metadata is modified elsewhere and then the buffer is reused 
for this IGMP packet with router alert.  I hope this is what's happening and 
it's not something more nefarious like memory corruption.
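
To make the suspected failure mode concrete, here is a toy model of it.  This 
is only a sketch of my guess above, not VPP code: the types, the table, and 
the function are hypothetical stand-ins for the buffer metadata and the FIB 
pool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_FIBS_SKETCH 2

/* Stand-in for the per-buffer metadata that a lookup node would fill in. */
typedef struct
{
  uint32_t fib_index;
} buf_meta_sketch_t;

/* Stand-in for a FIB table entry. */
typedef struct
{
  int table_id;
} fib_sketch_t;

static fib_sketch_t fibs_sketch[N_FIBS_SKETCH];

/* Return the FIB selected by the buffer metadata, or NULL if the index is
   out of range.  An unchecked lookup would instead compute an address far
   outside the table when fib_index is stale, for example ~0. */
static const fib_sketch_t *
fib_get_checked_sketch (const buf_meta_sketch_t *m)
{
  if (m->fib_index >= N_FIBS_SKETCH)
    return NULL;
  return &fibs_sketch[m->fib_index];
}
```

A freshly zeroed buffer has fib_index 0 and resolves to the first table, which 
would explain why the usual case works.  A reused buffer that still holds ~0 
from an earlier packet has no valid table, and an unchecked lookup on it would 
dereference a bad pointer.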

I made the following change at the top of ip4_local_check_src() (taken from 
ip_lookup_set_buffer_fib_index()):
 
   const dpo_id_t *dpo0;
   load_balance_t *lb0;
   u32 lbi0;
+  ip4_main_t *im = &ip4_main;

   vnet_buffer (b)->ip.fib_index =
+    vec_elt (im->fib_index_by_sw_if_index,
+             vnet_buffer (b)->sw_if_index[VLIB_RX]);
+  vnet_buffer (b)->ip.fib_index =
     vnet_buffer (b)->sw_if_index[VLIB_TX] != ~0 ?
     vnet_buffer (b)->sw_if_index[VLIB_TX] : vnet_buffer (b)->ip.fib_index;

With this change I was unable to trigger the crash.  I don't know whether this 
is a proper fix, though.
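
The selection logic in the change above can be restated as a small standalone 
function, which makes it easier to reason about in isolation.  This is a 
sketch with hypothetical names; the plain array stands in for 
im->fib_index_by_sw_if_index, and the caller is assumed to pass an RX index 
that is within the table.

```c
#include <assert.h>
#include <stdint.h>

/* Pick the FIB index for a buffer: default to the FIB bound to the RX
   interface, but let a TX value other than ~0 override it.
   Assumption: rx_sw_if_index is a valid index into the table. */
static uint32_t
select_fib_index_sketch (const uint32_t *fib_index_by_sw_if_index,
                         uint32_t rx_sw_if_index, uint32_t tx_sw_if_index)
{
  uint32_t fib_index = fib_index_by_sw_if_index[rx_sw_if_index];

  return tx_sw_if_index != ~0u ? tx_sw_if_index : fib_index;
}
```

In both debug prints above vlib_tx is 4294967295 (~0), so the result comes from 
the RX interface's table entry and no longer depends on whatever value was 
left in the buffer metadata.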

Here's the backtrace (some of the line numbers might be offset due to my 
debugging):

Thread 1 "vpp_main" received signal SIGSEGV, Segmentation fault.
0x00007f73861c2748 in ip4_fib_mtrie_lookup_step_one 
(dst_address=0x7f717de38e1a, m=<optimized out>) at 
/home/jeff/vpp/src/vnet/ip/ip4_mtrie.h:229
229     /home/jeff/vpp/src/vnet/ip/ip4_mtrie.h: No such file or directory.
(gdb) bt
#0  0x00007f73861c2748 in ip4_fib_mtrie_lookup_step_one 
(dst_address=0x7f717de38e1a, m=<optimized out>) at 
/home/jeff/vpp/src/vnet/ip/ip4_mtrie.h:229
#1  ip4_local_check_src (error0=<synthetic pointer>, last_check=<synthetic 
pointer>, ip0=0x7f717de38e0e, b=<optimized out>)
    at /home/jeff/vpp/src/vnet/ip/ip4_forward.c:1352
#2  ip4_local_inline (vm=<optimized out>, node=<optimized out>, 
frame=<optimized out>, head_of_feature_arc=<optimized out>)
    at /home/jeff/vpp/src/vnet/ip/ip4_forward.c:1586
#3  0x00007f7385c70014 in dispatch_node (last_time_stamp=17304359695215669, 
frame=0x7f718dcaf300, dispatch_state=VLIB_NODE_STATE_POLLING,
    type=VLIB_NODE_TYPE_INTERNAL, node=0x7f7184ed2ec0, vm=0x7f7385ec9980 
<vlib_global_main>) at /home/jeff/vpp/src/vlib/main.c:989
#4  dispatch_pending_node (vm=vm@entry=0x7f7385ec9980 <vlib_global_main>, 
pending_frame_index=pending_frame_index@entry=3,
    last_time_stamp=last_time_stamp@entry=17304359695215669) at 
/home/jeff/vpp/src/vlib/main.c:1139
#5  0x00007f7385c719fd in vlib_main_or_worker_loop (is_main=1, 
vm=0x7f7385ec9980 <vlib_global_main>) at /home/jeff/vpp/src/vlib/main.c:1555
#6  vlib_main_loop (vm=0x7f7385ec9980 <vlib_global_main>) at 
/home/jeff/vpp/src/vlib/main.c:1629
#7  vlib_main (vm=vm@entry=0x7f7385ec9980 <vlib_global_main>, 
input=input@entry=0x7f7184dfffa0) at /home/jeff/vpp/src/vlib/main.c:1820
#8  0x00007f7385cab453 in thread0 (arg=140134144842112) at 
/home/jeff/vpp/src/vlib/unix/main.c:607
#9  0x00007f73855917cc in clib_calljmp () from 
/usr/lib/x86_64-linux-gnu/libvppinfra.so.18.10
#10 0x00007ffd505eee30 in ?? ()
#11 0x00007f7385cac4e7 in vlib_unix_main (argc=<optimized out>, argv=<optimized 
out>) at /home/jeff/vpp/src/vlib/unix/main.c:676
#12 0x0000000000000000 in ?? ()
(gdb) up
#1  ip4_local_check_src (error0=<synthetic pointer>, last_check=<synthetic 
pointer>, ip0=0x7f717de38e0e, b=<optimized out>)
    at /home/jeff/vpp/src/vnet/ip/ip4_forward.c:1352
1352    /home/jeff/vpp/src/vnet/ip/ip4_forward.c: No such file or directory.
(gdb) p/x *ip0
$15 = {{ip_version_and_header_length = 0x46, tos = 0xc0, length = 0x2800, 
fragment_id = 0x0, flags_and_fragment_offset = 0x40, ttl = 0x1, protocol = 0x2,
    checksum = 0x8183, {{src_address = {data = {0xac, 0x14, 0xd4, 0x63}, 
data_u32 = 0x63d414ac, as_u8 = {0xac, 0x14, 0xd4, 0x63}, as_u16 = {0x14ac,
            0x63d4}, as_u32 = 0x63d414ac}, dst_address = {data = {0xe0, 0x0, 
0x0, 0x16}, data_u32 = 0x160000e0, as_u8 = {0xe0, 0x0, 0x0, 0x16}, as_u16 = {
            0xe0, 0x1600}, as_u32 = 0x160000e0}}, address_pair = {src = {data = 
{0xac, 0x14, 0xd4, 0x63}, data_u32 = 0x63d414ac, as_u8 = {0xac, 0x14,
            0xd4, 0x63}, as_u16 = {0x14ac, 0x63d4}, as_u32 = 0x63d414ac}, dst = 
{data = {0xe0, 0x0, 0x0, 0x16}, data_u32 = 0x160000e0, as_u8 = {0xe0,
            0x0, 0x0, 0x16}, as_u16 = {0xe0, 0x1600}, as_u32 = 0x160000e0}}}}, 
{checksum_data_64 = {0x4000002800c046, 0x63d414ac81830201},
    checksum_data_64_32 = {0x160000e0}}, {checksum_data_32 = {0x2800c046, 
0x400000, 0x81830201, 0x63d414ac, 0x160000e0}}}

Let me know if additional information is needed or if there are any other 
questions.

Thanks,
Jeff