Hi Jeff,

Thank you for the bug report.

As you mention the graph node path taken by these packets does not go thru 
ip4-lookup and so does not have the fib-index set. Since ip4-lookup is replaced 
by ip4-options, IMO ip4-options would be the place to add the code you have 
identified, so that we don’t do unnecessary work on packets that are for-us and 
were subject to the lookup. Please submit a patch with the change.

We have some unit tests in test/test_igmp.py (test_igmp_router()) where we send 
IGMP packets to an interface that is not IGMP enabled and so they take the same 
graph node paths as your packet. This tests passes. Is your input interface 
bound to a non-default VRF? If so would you be able to add a new test to that 
file where the input interface is bound to another table?

Thanks,
neale

De : <vpp-dev@lists.fd.io> au nom de Jeff <jct...@mykolab.com>
Date : mardi 16 octobre 2018 à 22:27
À : "vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io>
Objet : [vpp-dev] vpp crash when handling IGMP with router alert

Hello,

I have a tap interface connected to a noisy LAN and I found that a certain type 
of IGMP packet will sometimes cause a crash (backtrace at the end) in 
ip4_fib_mtrie_lookup_step_one().  More specifically its an IGMP packet with the 
router alert IP option.  Here's a packet trace:

00:02:41:522429: virtio-input
  virtio: hw_if_index 6 next-index 4 vring 0 len 54
    hdr: flags 0x00 gso_type 0x00 hdr_len 0 gso_size 0 csum_start 0 csum_offset 
0 num_buffers 1
00:02:41:522430: ethernet-input
  IP4: 00:0c:29:1f:43:a4 -> 01:00:5e:00:00:16
00:02:41:522430: ip4-input
  IGMP: 172.20.2.194 -> 224.0.0.22
    version 4, header length 24
    tos 0xc0, ttl 1, length 40, checksum 0x5523
    fragment id 0x0000, flags DONT_FRAGMENT
00:02:41:522431: ip4-options
    option:[0x94,0x4,0x0,0x0]
00:02:41:522431: ip4-local
    IGMP: 172.20.2.194 -> 224.0.0.22
      version 4, header length 24
      tos 0xc0, ttl 1, length 40, checksum 0x5523
      fragment id 0x0000, flags DONT_FRAGMENT
00:02:41:522434: igmp-input
  sw_if_index 6 next-index 0
  membership_report_v3: code 0, checksum 0xfbf4
00:02:41:522435: error-drop
  igmp-input: IGMP not enabled on this interface


I found that when the crash occurs vnet_buffer(b)->ip.fib_index is ~0 in 
ip4_local_check_src().  Here's an example debug print just added just after "if 
(PREDICT_FALSE (last_check->src.as_u32 != ip0->src_address.as_u32))" in 
ip4_local_check_src()

Usual case:
ip4_local_check_src: (00000000 != 0101A8C0), buf 0x7f6b6301b900, vlib_tx 
4294967295 fib index 0

When crash happens:
ip4_local_check_src: (00000000 != 0100A8C0), buf 0x7f6b63a00000, vlib_tx 
4294967295 fib index 4294967295

I think the problem is that vnet_buffer(b)->ip.fib_index isn't set anywhere in 
this processing chain (ip4-input -> ip4-options -> ip4-local).  This can cause 
an invalid pointer to be used when looking up the mtrie in 
ip4_local_check_src().  Normally the fib_index metadata is assigned by 
ip4-lookup via ip_lookup_set_buffer_fib_index().  But since the packet doesn't 
traverse that node the metadata is unset.  I'm guessing that due to luck and/or 
initialization the fib_index metadata is usually zero, so the crash won't 
happen until the metadata is modified elsewhere and then the buffer is reused 
for this IGMP packet with router alert.  I hope this is what's happening and 
it's not something more nefarious like memory corruption.

I made the following change at the top of ip4_local_check_src (taken from 
ip_lookup_set_buffer_fib_index())

   const dpo_id_t *dpo0;
   load_balance_t *lb0;
   u32 lbi0;
+  ip4_main_t *im = &ip4_main;

   vnet_buffer (b)->ip.fib_index =
+        vec_elt (im->fib_index_by_sw_if_index, vnet_buffer 
(b)->sw_if_index[VLIB_RX]);
+  vnet_buffer (b)->ip.fib_index =
     vnet_buffer (b)->sw_if_index[VLIB_TX] != ~0 ?
     vnet_buffer (b)->sw_if_index[VLIB_TX] : vnet_buffer (b)->ip.fib_index;

With this change I was unable to trigger the crash.  Don't know if this is a 
proper fix though.

Here's the backtrace (some of the line numbers might be offset due to my 
debugging):

Thread 1 "vpp_main" received signal SIGSEGV, Segmentation fault.
0x00007f73861c2748 in ip4_fib_mtrie_lookup_step_one 
(dst_address=0x7f717de38e1a, m=<optimized out>) at 
/home/jeff/vpp/src/vnet/ip/ip4_mtrie.h:229
229     /home/jeff/vpp/src/vnet/ip/ip4_mtrie.h: No such file or directory.
(gdb) bt
#0  0x00007f73861c2748 in ip4_fib_mtrie_lookup_step_one 
(dst_address=0x7f717de38e1a, m=<optimized out>) at 
/home/jeff/vpp/src/vnet/ip/ip4_mtrie.h:229
#1  ip4_local_check_src (error0=<synthetic pointer>, last_check=<synthetic 
pointer>, ip0=0x7f717de38e0e, b=<optimized out>)
    at /home/jeff/vpp/src/vnet/ip/ip4_forward.c:1352
#2  ip4_local_inline (vm=<optimized out>, node=<optimized out>, 
frame=<optimized out>, head_of_feature_arc=<optimized out>)
    at /home/jeff/vpp/src/vnet/ip/ip4_forward.c:1586
#3  0x00007f7385c70014 in dispatch_node (last_time_stamp=17304359695215669, 
frame=0x7f718dcaf300, dispatch_state=VLIB_NODE_STATE_POLLING,
    type=VLIB_NODE_TYPE_INTERNAL, node=0x7f7184ed2ec0, vm=0x7f7385ec9980 
<vlib_global_main>) at /home/jeff/vpp/src/vlib/main.c:989
#4  dispatch_pending_node (vm=vm@entry=0x7f7385ec9980 <vlib_global_main>, 
pending_frame_index=pending_frame_index@entry=3,
    last_time_stamp=last_time_stamp@entry=17304359695215669) at 
/home/jeff/vpp/src/vlib/main.c:1139
#5  0x00007f7385c719fd in vlib_main_or_worker_loop (is_main=1, 
vm=0x7f7385ec9980 <vlib_global_main>) at /home/jeff/vpp/src/vlib/main.c:1555
#6  vlib_main_loop (vm=0x7f7385ec9980 <vlib_global_main>) at 
/home/jeff/vpp/src/vlib/main.c:1629
#7  vlib_main (vm=vm@entry=0x7f7385ec9980 <vlib_global_main>, 
input=input@entry=0x7f7184dfffa0) at /home/jeff/vpp/src/vlib/main.c:1820
#8  0x00007f7385cab453 in thread0 (arg=140134144842112) at 
/home/jeff/vpp/src/vlib/unix/main.c:607
#9  0x00007f73855917cc in clib_calljmp () from 
/usr/lib/x86_64-linux-gnu/libvppinfra.so.18.10
#10 0x00007ffd505eee30 in ?? ()
#11 0x00007f7385cac4e7 in vlib_unix_main (argc=<optimized out>, argv=<optimized 
out>) at /home/jeff/vpp/src/vlib/unix/main.c:676
#12 0x0000000000000000 in ?? ()
(gdb) up
#1  ip4_local_check_src (error0=<synthetic pointer>, last_check=<synthetic 
pointer>, ip0=0x7f717de38e0e, b=<optimized out>)
    at /home/jeff/vpp/src/vnet/ip/ip4_forward.c:1352
1352    /home/jeff/vpp/src/vnet/ip/ip4_forward.c: No such file or directory.
(gdb) p/x *ip0
$15 = {{ip_version_and_header_length = 0x46, tos = 0xc0, length = 0x2800, 
fragment_id = 0x0, flags_and_fragment_offset = 0x40, ttl = 0x1, protocol = 0x2,
    checksum = 0x8183, {{src_address = {data = {0xac, 0x14, 0xd4, 0x63}, 
data_u32 = 0x63d414ac, as_u8 = {0xac, 0x14, 0xd4, 0x63}, as_u16 = {0x14ac,
            0x63d4}, as_u32 = 0x63d414ac}, dst_address = {data = {0xe0, 0x0, 
0x0, 0x16}, data_u32 = 0x160000e0, as_u8 = {0xe0, 0x0, 0x0, 0x16}, as_u16 = {
            0xe0, 0x1600}, as_u32 = 0x160000e0}}, address_pair = {src = {data = 
{0xac, 0x14, 0xd4, 0x63}, data_u32 = 0x63d414ac, as_u8 = {0xac, 0x14,
            0xd4, 0x63}, as_u16 = {0x14ac, 0x63d4}, as_u32 = 0x63d414ac}, dst = 
{data = {0xe0, 0x0, 0x0, 0x16}, data_u32 = 0x160000e0, as_u8 = {0xe0,
            0x0, 0x0, 0x16}, as_u16 = {0xe0, 0x1600}, as_u32 = 0x160000e0}}}}, 
{checksum_data_64 = {0x4000002800c046, 0x63d414ac81830201},
    checksum_data_64_32 = {0x160000e0}}, {checksum_data_32 = {0x2800c046, 
0x400000, 0x81830201, 0x63d414ac, 0x160000e0}}}

Let me know if additional information is needed or if there are any other 
questions.

Thanks,
Jeff
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#10848): https://lists.fd.io/g/vpp-dev/message/10848
Mute This Topic: https://lists.fd.io/mt/27373338/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to