Hoi,

The crash I observed is now gone, thanks!

VPP occasionally hits an ASSERT related to error counters at drop.c:77 -- I'll
try to see if I can get a reproduction, but it may take a while, and it may be
transient.

*11: /home/pim/src/vpp/src/vlib/drop.c:77 (counter_index) assertion `ci < n->n_errors' fails*

Thread 14 "vpp_wk_11" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fff4bbfd700 (LWP 182685)]
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff6a5f859 in __GI_abort () at abort.c:79
#2  0x00000000004072e3 in os_panic () at /home/pim/src/vpp/src/vpp/vnet/main.c:413
#3  0x00007ffff6daea29 in debugger () at /home/pim/src/vpp/src/vppinfra/error.c:84
#4  0x00007ffff6dae7fa in _clib_error (how_to_die=2, function_name=0x0, line_number=0, fmt=0x7ffff6f9d19c "%s:%d (%s) assertion `%s' fails")
    at /home/pim/src/vpp/src/vppinfra/error.c:143
#5  0x00007ffff6f782d9 in counter_index (vm=0x7fffa09fb2c0, e=3416) at /home/pim/src/vpp/src/vlib/drop.c:77
#6  0x00007ffff6f77c57 in process_drop_punt (vm=0x7fffa09fb2c0, node=0x7fffa0c79b00, frame=0x7fff97168140, disposition=ERROR_DISPOSITION_DROP)
    at /home/pim/src/vpp/src/vlib/drop.c:224
#7  0x00007ffff6f77957 in error_drop_node_fn_hsw (vm=0x7fffa09fb2c0, node=0x7fffa0c79b00, frame=0x7fff97168140)
    at /home/pim/src/vpp/src/vlib/drop.c:248
#8  0x00007ffff6f0b10d in dispatch_node (vm=0x7fffa09fb2c0, node=0x7fffa0c79b00, type=VLIB_NODE_TYPE_INTERNAL, dispatch_state=VLIB_NODE_STATE_POLLING, frame=0x7fff97168140, last_time_stamp=5318787653101516)
    at /home/pim/src/vpp/src/vlib/main.c:961
#9  0x00007ffff6f0bb60 in dispatch_pending_node (vm=0x7fffa09fb2c0, pending_frame_index=5, last_time_stamp=5318787653101516)
    at /home/pim/src/vpp/src/vlib/main.c:1120
#10 0x00007ffff6f06e0f in vlib_main_or_worker_loop (vm=0x7fffa09fb2c0, is_main=0) at /home/pim/src/vpp/src/vlib/main.c:1587
#11 0x00007ffff6f06537 in vlib_worker_loop (vm=0x7fffa09fb2c0) at /home/pim/src/vpp/src/vlib/main.c:1721
#12 0x00007ffff6f44ef4 in vlib_worker_thread_fn (arg=0x7fff98eabec0) at /home/pim/src/vpp/src/vlib/threads.c:1587
#13 0x00007ffff6f3ffe5 in vlib_worker_thread_bootstrap_fn (arg=0x7fff98eabec0) at /home/pim/src/vpp/src/vlib/threads.c:426
#14 0x00007ffff6e61609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#15 0x00007ffff6b5c163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) up 4
#4  0x00007ffff6dae7fa in _clib_error (how_to_die=2, function_name=0x0, line_number=0, fmt=0x7ffff6f9d19c "%s:%d (%s) assertion `%s' fails")
    at /home/pim/src/vpp/src/vppinfra/error.c:143
143       debugger ();
(gdb) up
#5  0x00007ffff6f782d9 in counter_index (vm=0x7fffa09fb2c0, e=3416) at /home/pim/src/vpp/src/vlib/drop.c:77
77        ASSERT (ci < n->n_errors);
(gdb) list
72
73        ni = vlib_error_get_node (&vm->node_main, e);
74        n = vlib_get_node (vm, ni);
75
76        ci = vlib_error_get_code (&vm->node_main, e);
77        ASSERT (ci < n->n_errors);
78
79        ci += n->error_heap_index;
80
81        return ci;

On Wed, Apr 6, 2022 at 1:53 PM Damjan Marion (damarion) <damar...@cisco.com> wrote:

>
> This seems to be day one issue, and my patch just exposed it.
> Current interface deletion code is not removing node stats entries.
>
> So if you delete interface and then create one with the same name,
> stats entry is already there, and creation of new entry fails.
>
> Hope this helps:
>
> https://gerrit.fd.io/r/c/vpp/+/35900
>
> --
> Damjan
>
> On 05.04.2022., at 22:13, Pim van Pelt <p...@ipng.nl> wrote:
> >
> > Hoi,
> >
> > Here's a minimal repro that reliably crashes VPP at head for me, does
> > not crash before gerrit 35640:
> >
> > create loopback interface instance 0
> > create bond id 0 mode lacp load-balance l34
> > create bond id 1 mode lacp load-balance l34
> > delete loopback interface intfc loop0
> > delete bond BondEthernet0
> > delete bond BondEthernet1
> > create bond id 0 mode lacp load-balance l34
> > delete bond BondEthernet0
> > comment { the next command crashes VPP }
> > create loopback interface instance 0
> >
> > On Tue, Apr 5, 2022 at 9:48 PM Pim van Pelt <p...@ipng.nl> wrote:
> > Hoi,
> >
> > There is a crashing regression in VPP after
> > https://gerrit.fd.io/r/c/vpp/+/35640
> >
> > With that change merged, VPP crashes upon creation and deletion of
> > interfaces. Winding back the repo until before 35640 does not crash.
> > The crash happens in
> > 0: /home/pim/src/vpp/src/vlib/stats/stats.h:115 (vlib_stats_get_entry) assertion `entry_index < vec_len (sm->directory_vector)' fails
> >
> > (gdb) bt
> > #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> > #1  0x00007ffff6a5e859 in __GI_abort () at abort.c:79
> > #2  0x00000000004072e3 in os_panic () at /home/pim/src/vpp/src/vpp/vnet/main.c:413
> > #3  0x00007ffff6dada29 in debugger () at /home/pim/src/vpp/src/vppinfra/error.c:84
> > #4  0x00007ffff6dad7fa in _clib_error (how_to_die=2, function_name=0x0, line_number=0, fmt=0x7ffff6f9c19c "%s:%d (%s) assertion `%s' fails")
> >     at /home/pim/src/vpp/src/vppinfra/error.c:143
> > #5  0x00007ffff6f39605 in vlib_stats_get_entry (sm=0x7ffff6fce5e8 <vlib_stats_main>, entry_index=4294967295)
> >     at /home/pim/src/vpp/src/vlib/stats/stats.h:115
> > #6  0x00007ffff6f39273 in vlib_stats_remove_entry (entry_index=4294967295) at /home/pim/src/vpp/src/vlib/stats/stats.c:135
> > #7  0x00007ffff6ee36d9 in vlib_register_errors (vm=0x7fff96800740, node_index=718, n_errors=0, error_strings=0x0, counters=0x0)
> >     at /home/pim/src/vpp/src/vlib/error.c:149
> > #8  0x00007ffff70b8e0c in setup_tx_node (vm=0x7fff96800740, node_index=718, dev_class=0x7fff973f9fb0) at /home/pim/src/vpp/src/vnet/interface.c:816
> > #9  0x00007ffff70b7f26 in vnet_register_interface (vnm=0x7ffff7f579a0 <vnet_main>, dev_class_index=31, dev_instance=0, hw_class_index=29, hw_instance=7)
> >     at /home/pim/src/vpp/src/vnet/interface.c:1085
> > #10 0x00007ffff7129efd in vnet_eth_register_interface (vnm=0x7ffff7f579a0 <vnet_main>, r=0x7fff4b288f18)
> >     at /home/pim/src/vpp/src/vnet/ethernet/interface.c:376
> > #11 0x00007ffff712bd05 in vnet_create_loopback_interface (sw_if_indexp=0x7fff4b288fb8, mac_address=0x7fff4b288fb2 "", is_specified=1 '\001', user_instance=0)
> >     at /home/pim/src/vpp/src/vnet/ethernet/interface.c:883
> > #12 0x00007ffff712fecf in create_simulated_ethernet_interfaces (vm=0x7fff96800740, input=0x7fff4b2899d0, cmd=0x7fff973c7e38)
> >     at /home/pim/src/vpp/src/vnet/ethernet/interface.c:930
> > #13 0x00007ffff6ed65e8 in vlib_cli_dispatch_sub_commands (vm=0x7fff96800740, cm=0x42c2f0 <vlib_global_main+48>, input=0x7fff4b2899d0, parent_command_index=1161)
> >     at /home/pim/src/vpp/src/vlib/cli.c:592
> > #14 0x00007ffff6ed6358 in vlib_cli_dispatch_sub_commands (vm=0x7fff96800740, cm=0x42c2f0 <vlib_global_main+48>, input=0x7fff4b2899d0, parent_command_index=33)
> >     at /home/pim/src/vpp/src/vlib/cli.c:549
> > #15 0x00007ffff6ed6358 in vlib_cli_dispatch_sub_commands (vm=0x7fff96800740, cm=0x42c2f0 <vlib_global_main+48>, input=0x7fff4b2899d0, parent_command_index=0)
> >     at /home/pim/src/vpp/src/vlib/cli.c:549
> > #16 0x00007ffff6ed5528 in vlib_cli_input (vm=0x7fff96800740, input=0x7fff4b2899d0, function=0x0, function_arg=0)
> >     at /home/pim/src/vpp/src/vlib/cli.c:695
> > #17 0x00007ffff6f61f21 in unix_cli_exec (vm=0x7fff96800740, input=0x7fff4b289e78, cmd=0x7fff973c99d8) at /home/pim/src/vpp/src/vlib/unix/cli.c:3454
> > #18 0x00007ffff6ed65e8 in vlib_cli_dispatch_sub_commands (vm=0x7fff96800740, cm=0x42c2f0 <vlib_global_main+48>, input=0x7fff4b289e78, parent_command_index=0)
> >     at /home/pim/src/vpp/src/vlib/cli.c:592
> > #19 0x00007ffff6ed5528 in vlib_cli_input (vm=0x7fff96800740, input=0x7fff4b289e78, function=0x7ffff6f55960 <unix_vlib_cli_output>, function_arg=1)
> >     at /home/pim/src/vpp/src/vlib/cli.c:695
> >
> > This is caught by a local regression test
> > (https://github.com/pimvanpelt/vppcfg/tree/main/intest) that executes a
> > bunch of CLI statements, and I have a set of transitions there which I can
> > probably narrow down to an exact repro case.
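
The failure mode Damjan describes, together with the entry_index=4294967295 seen in frames #5 and #6 of the backtrace above, can be sketched with a toy model. This is not VPP code: StatsDirectory, add_entry, get_entry, remove_entry, replay, and the "/err/loop0-tx" name are all hypothetical, and the idea that a failed creation is reported as (u32) ~0 is an assumption inferred only from the value gdb prints.

```python
# Toy model of a name-keyed stats directory (hypothetical, not VPP code).
INVALID_INDEX = 0xFFFFFFFF  # (u32) ~0; gdb prints this as 4294967295


class AssertionFailure(Exception):
    """Stands in for a failed ASSERT in a debug build."""


class StatsDirectory:
    def __init__(self):
        self.entries = []   # index -> name, or None for a freed slot
        self.by_name = {}   # name -> index

    def add_entry(self, name):
        # Creation fails if an entry with this name already exists.
        if name in self.by_name:
            return INVALID_INDEX
        self.entries.append(name)
        self.by_name[name] = len(self.entries) - 1
        return self.by_name[name]

    def get_entry(self, index):
        # Same shape as the failing assertion in the backtrace.
        if not index < len(self.entries):
            raise AssertionFailure("entry_index < vec_len (directory_vector)")
        return self.entries[index]

    def remove_entry(self, index):
        name = self.get_entry(index)
        del self.by_name[name]
        self.entries[index] = None


def replay(cleanup_on_delete):
    """Create, delete, and re-create an interface with the same name."""
    d = StatsDirectory()
    first = d.add_entry("/err/loop0-tx")     # create the interface
    if cleanup_on_delete:
        d.remove_entry(first)                # delete it and drop its entry
    second = d.add_entry("/err/loop0-tx")    # create it again, same name
    try:
        d.remove_entry(second)               # later use of the returned index
    except AssertionFailure as exc:
        return f"assertion `{exc}' fails"
    return "ok"


if __name__ == "__main__":
    print(replay(cleanup_on_delete=False))
    print(replay(cleanup_on_delete=True))
```

In this model, skipping the cleanup on delete makes the second creation return the invalid index, and the first later use of that index trips the bounds check. Removing the entry on delete avoids it.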
> >
> > On Fri, Apr 1, 2022 at 3:08 PM Pim van Pelt via lists.fd.io <pim=ipng...@lists.fd.io> wrote:
> > Hoi,
> >
> > As a followup - I tried to remember why I copied class VPPStats() and
> > friends into my own repository, but that may be because it's not exported
> > in __init__.py. Should it be? I pulled in the latest changes Damjan made to
> > vpp_stats.py into my own repo, and my app runs again. Is it possibly worth
> > our while to add the VPPStats() class to the exported classes in vpp_papi?
> >
> > groet,
> > Pim
> >
> > On Fri, Apr 1, 2022 at 2:50 PM Pim van Pelt via lists.fd.io <pim=ipng...@lists.fd.io> wrote:
> > Hoi,
> >
> > I noticed that my VPP SNMP Agent no longer works with the Python API at
> > HEAD, and my attention was drawn to this change:
> > https://gerrit.fd.io/r/c/vpp/+/35640
> > stats: convert error counters to normal counters
> >
> > At HEAD, src/vpp-api/python/vpp_papi/vpp_stats.py now fails 4 out of 6
> > tests with the same error as my application:
> > struct.error: offset -140393469444104 out of range for 1073741824-byte buffer
> > ..
> > Ran 6 tests in 0.612s
> > FAILED (errors=4)
> >
> > Damjan, Ole, any clues?
> >
> > groet,
> > Pim
> > --
> > Pim van Pelt <p...@ipng.nl>
> > PBVP1-RIPE - http://www.ipng.nl/

--
Pim van Pelt <p...@ipng.nl>
PBVP1-RIPE - http://www.ipng.nl/
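
The struct.error quoted in the first message of this thread is what Python's struct.unpack_from raises when it is asked to read at an offset that lies outside the buffer. The sketch below only reproduces that class of error on a small buffer. It does not claim to show why vpp_stats.py computed such an offset. read_u64 and try_read are hypothetical helpers, the 64-byte buffer is an arbitrary stand-in, and the negative offset is copied from the report.

```python
import struct


def read_u64(buf, offset):
    """Read one little-endian unsigned 64-bit value at the given byte offset."""
    return struct.unpack_from("<Q", buf, offset)[0]


def try_read(buf, offset):
    """Return the value, or the struct.error text if the offset is unusable."""
    try:
        return read_u64(buf, offset)
    except struct.error as exc:
        return f"struct.error: {exc}"


if __name__ == "__main__":
    buf = bytearray(64)                      # stand-in for a mapped stats segment
    struct.pack_into("<Q", buf, 8, 42)       # store a counter value at offset 8
    print(try_read(buf, 8))
    print(try_read(buf, -140393469444104))   # offset copied from the report
```

A reader that derives offsets from values stored in shared memory can guard against this by range-checking each offset against the buffer length before unpacking.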
View/Reply Online (#21210): https://lists.fd.io/g/vpp-dev/message/21210