Hello,

We're seeing a VPP crash when our dataplane agent disconnects from and reconnects to the VPP API socket during upgrades. To reproduce the crash faster, we put the dataplane agent process into a constant restart loop in which it opens a new client session with the API, subscribes to l2fib events, unsubscribes, and then closes the API socket.

I have created a Gerrit change with a suspected fix (https://gerrit.fd.io/r/c/vpp/+/44561) that checks that vl_input_queue is initialized (non-NULL) before it is passed to vl_mem_api_can_send. I will also provide an example client to reproduce this crash.

Thanks

Catching the fault in gdb:

Thread 1 "vpp_main" received signal SIGSEGV, Segmentation fault.
vl_mem_api_can_send (q=0x0) at /root/vpp/src/vlibapi/memory_shared.c:781
781     /root/vpp/src/vlibapi/memory_shared.c: No such file or directory.
(gdb) bt full
#0  vl_mem_api_can_send (q=0x0) at /root/vpp/src/vlibapi/memory_shared.c:781
No locals.
#1  0x00007fa5ff04bbf4 in vl_api_can_send_msg (rp=0x7fa544fc4440) at /root/vpp/src/vlibmemory/api.h:39
No locals.
#2  l2fib_scan (vm=vm@entry=0x7fa540000ac0, start_time=<optimized out>, event_only=event_only@entry=1 '\001') at /root/vpp/src/vnet/l2/l2_fib.c:1262
        bd_learn_counts = 0x7fa545039dd8
        last_start = <optimized out>
        accum_t = <optimized out>
        delta_t = <optimized out>
        evt_idx = <optimized out>
        learn_count = <optimized out>
        lm = <optimized out>
        client = 1
        cl_idx = 33554560
        mp = 0x1301c3970
        reg = 0x7fa544fc4440
        fm = <optimized out>
        i = <optimized out>
        j = <optimized out>
        k = <optimized out>
        bd_index = 3
        h = <optimized out>
#3  0x00007fa5ff0475c1 in l2fib_mac_age_scanner_process (vm=0x7fa540000ac0, rt=<optimized out>, f=<optimized out>) at /root/vpp/src/vnet/l2/l2_fib.c:1334
        scan = <optimized out>
        SCAN_MAC_AGE = SCAN_MAC_AGE
        SCAN_MAC_EVENT = SCAN_MAC_EVENT
        SCAN_DISABLE = SCAN_DISABLE
        event_data = 0x7fa544f5fb18
        enabled = <optimized out>
        next_age_scan_time = <optimized out>
        event_type = 140347804209272
        start_time = 6.5254760102106957e-06
        fm = <optimized out>
        lm = <optimized out>
#4  0x00007fa5fedc6eb7 in vlib_process_bootstrap (_a=<optimized out>) at /root/vpp/src/vlib/main.c:1162
        a = <optimized out>
        vm = 0x7fa54503d078
        p = 0x7fa5408180c0
        f = 0x3
        node = 0x7fa5408180c0
        n = <optimized out>
#5  0x00007fa5fed46d88 in clib_calljmp () at /root/vpp/src/vppinfra/longjmp.S:123
No locals.
#6  0x00007fa5fcc15dc0 in ?? ()
No symbol table info available.
#7  0x00007fa5fedc2901 in vlib_process_startup (vm=0x7fa540000ac0, p=0x7fa5408180c0, f=0x0) at /root/vpp/src/vlib/main.c:1187
        a = <error reading variable a (Cannot access memory at address 0x7fa5fa8fc020)>
        r = <optimized out>
#8  dispatch_process (vm=0x7fa540000ac0, p=0x7fa5408180c0, f=0x0, last_time_stamp=<error reading variable: Cannot access memory at address 0x7fa5fa8fc010>) at /root/vpp/src/vlib/main.c:1259
        nm = 0x7fa540000c18
        node_runtime = 0x7fa5408180c0
        node = 0x7fa540817f60
        t = <error reading variable t (Cannot access memory at address 0x7fa5fa8fc010)>
        old_process_index = 4294967295
        n_vectors = <optimized out>
        is_suspend = <optimized out>
Backtrace stopped: Cannot access memory at address 0x7fa5fa8fc068

With CLIB_DEBUG enabled:

vpp[42]: received signal SIGSEGV, PC 0x7f1e42a455c0, faulting address 0x60
vpp[42]: Code: 8b 4f 60 31 c0 3b 4f 64 0f 9c c0 c3 0f 1f 40 00 41 56 53 50
vpp[42]: #0 0x00007f1e42a455c0 vl_mem_api_can_send + 0x0
vpp[42]: from /lib/x86_64-linux-gnu/libvlibapi.so.26.02
vpp[42]: #1 0x00007f1e42bf6bf4 l2fib_init + 0x31c4
vpp[42]: from /lib/x86_64-linux-gnu/libvnet.so.26.02
vpp[42]: #2 0x00007f1e42bf25c1 get_mac_table + 0x171
vpp[42]: from /lib/x86_64-linux-gnu/libvnet.so.26.02
vpp[42]: #3 0x00007f1e42971eb7 vlib_exit_with_status + 0xb37
vpp[42]: from /lib/x86_64-linux-gnu/libvlib.so.26.02
vpp[42]: #4 0x00007f1e428f1d88 clib_calljmp + 0x18
vpp[42]: from /lib/x86_64-linux-gnu/libvppinfra.so.26.02
[1]+ Aborted (core dumped) /usr/bin/vpp -c "${RUNTIME_DIR}/vpp.conf"
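
For illustration only, the kind of guard described above can be sketched in isolation. This is not the Gerrit change itself and does not use VPP headers: the types and function names prefixed with sketch_ are hypothetical stand-ins, and the field names cursize and maxsize are an assumption. The shape of the check is inferred from the Code: bytes in the CLIB_DEBUG output, which appear to load a 32-bit value at offset 0x60 from the queue pointer and compare it with the one at 0x64; with q=0x0 that first load lands on the reported faulting address 0x60.

```c
#include <stddef.h>

/* Hypothetical stand-in for the shared-memory queue; only the two
 * fields the capacity check reads are modelled (names assumed). */
typedef struct
{
  int cursize; /* messages currently queued */
  int maxsize; /* queue capacity */
} sketch_queue_t;

/* Hypothetical stand-in for an API client registration. */
typedef struct
{
  sketch_queue_t *vl_input_queue; /* NULL when the client has no queue */
} sketch_registration_t;

/* Unguarded capacity check: dereferences q, so q == NULL faults. */
int
sketch_mem_api_can_send (sketch_queue_t *q)
{
  return q->cursize < q->maxsize;
}

/* Guarded wrapper: a missing registration or a missing input queue is
 * reported as "cannot send" instead of being dereferenced. */
int
sketch_can_send_msg (sketch_registration_t *rp)
{
  if (rp == NULL || rp->vl_input_queue == NULL)
    return 0;
  return sketch_mem_api_can_send (rp->vl_input_queue);
}
```

With this shape, a caller such as the l2fib event scan would skip a client whose queue pointer is NULL rather than crash. Whether skipping is the right behaviour for a client that disconnected mid-scan is a separate question for review.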
