Hoi,

The crash I observed is now gone, thanks!

VPP occasionally hits an ASSERT on error counters at drop.c:77.
I'll try to get a reproduction, but it may take a while, and the failure
may be transient.


11: /home/pim/src/vpp/src/vlib/drop.c:77 (counter_index) assertion `ci < n->n_errors' fails

Thread 14 "vpp_wk_11" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fff4bbfd700 (LWP 182685)]
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff6a5f859 in __GI_abort () at abort.c:79
#2  0x00000000004072e3 in os_panic () at /home/pim/src/vpp/src/vpp/vnet/main.c:413
#3  0x00007ffff6daea29 in debugger () at /home/pim/src/vpp/src/vppinfra/error.c:84
#4  0x00007ffff6dae7fa in _clib_error (how_to_die=2, function_name=0x0, line_number=0, fmt=0x7ffff6f9d19c "%s:%d (%s) assertion `%s' fails")
    at /home/pim/src/vpp/src/vppinfra/error.c:143
#5  0x00007ffff6f782d9 in counter_index (vm=0x7fffa09fb2c0, e=3416) at /home/pim/src/vpp/src/vlib/drop.c:77
#6  0x00007ffff6f77c57 in process_drop_punt (vm=0x7fffa09fb2c0, node=0x7fffa0c79b00, frame=0x7fff97168140, disposition=ERROR_DISPOSITION_DROP)
    at /home/pim/src/vpp/src/vlib/drop.c:224
#7  0x00007ffff6f77957 in error_drop_node_fn_hsw (vm=0x7fffa09fb2c0, node=0x7fffa0c79b00, frame=0x7fff97168140)
    at /home/pim/src/vpp/src/vlib/drop.c:248
#8  0x00007ffff6f0b10d in dispatch_node (vm=0x7fffa09fb2c0, node=0x7fffa0c79b00, type=VLIB_NODE_TYPE_INTERNAL, dispatch_state=VLIB_NODE_STATE_POLLING, frame=0x7fff97168140, last_time_stamp=5318787653101516) at /home/pim/src/vpp/src/vlib/main.c:961
#9  0x00007ffff6f0bb60 in dispatch_pending_node (vm=0x7fffa09fb2c0, pending_frame_index=5, last_time_stamp=5318787653101516)
    at /home/pim/src/vpp/src/vlib/main.c:1120
#10 0x00007ffff6f06e0f in vlib_main_or_worker_loop (vm=0x7fffa09fb2c0, is_main=0) at /home/pim/src/vpp/src/vlib/main.c:1587
#11 0x00007ffff6f06537 in vlib_worker_loop (vm=0x7fffa09fb2c0) at /home/pim/src/vpp/src/vlib/main.c:1721
#12 0x00007ffff6f44ef4 in vlib_worker_thread_fn (arg=0x7fff98eabec0) at /home/pim/src/vpp/src/vlib/threads.c:1587
#13 0x00007ffff6f3ffe5 in vlib_worker_thread_bootstrap_fn (arg=0x7fff98eabec0) at /home/pim/src/vpp/src/vlib/threads.c:426
#14 0x00007ffff6e61609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#15 0x00007ffff6b5c163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) up 4
#4  0x00007ffff6dae7fa in _clib_error (how_to_die=2, function_name=0x0, line_number=0, fmt=0x7ffff6f9d19c "%s:%d (%s) assertion `%s' fails")
    at /home/pim/src/vpp/src/vppinfra/error.c:143
143         debugger ();
(gdb) up
#5  0x00007ffff6f782d9 in counter_index (vm=0x7fffa09fb2c0, e=3416) at /home/pim/src/vpp/src/vlib/drop.c:77
77        ASSERT (ci < n->n_errors);
(gdb) list
72
73        ni = vlib_error_get_node (&vm->node_main, e);
74        n = vlib_get_node (vm, ni);
75
76        ci = vlib_error_get_code (&vm->node_main, e);
77        ASSERT (ci < n->n_errors);
78
79        ci += n->error_heap_index;
80
81        return ci;
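For context on what that assertion enforces, here is a tiny Python model of the indexing done in counter_index() above. The names mirror the drop.c listing, but this is an illustrative sketch of the invariant, not VPP's actual implementation:

```python
# Hypothetical, simplified model of VPP's error-counter indexing.
# An error value decodes to (node, code); the counter lives at
# node.error_heap_index + code in a shared counter heap, and the
# code must be within the node's registered error count.

class Node:
    def __init__(self, name, n_errors, error_heap_index):
        self.name = name
        self.n_errors = n_errors                  # error codes this node registered
        self.error_heap_index = error_heap_index  # base offset into the counter heap

def counter_index(node, code):
    # Mirrors the failing check: ASSERT (ci < n->n_errors)
    assert code < node.n_errors, f"code {code} out of range for {node.name}"
    return node.error_heap_index + code

# Normal case: a node with 4 error codes based at heap offset 100.
n = Node("example-node", n_errors=4, error_heap_index=100)
print(counter_index(n, 3))  # -> 103

# Failure mode seen above: a stale error value decodes to a code at or
# beyond the node's current n_errors (e.g. after errors were
# re-registered), so the assertion fires.
try:
    counter_index(n, 7)
except AssertionError as exc:
    print("ASSERT fails:", exc)
```

If the delete/re-create theory below is right, a stale or mismatched error value decoding to a code >= n_errors would trip exactly this ASSERT.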

On Wed, Apr 6, 2022 at 1:53 PM Damjan Marion (damarion) <damar...@cisco.com> wrote:

>
> This seems to be a day-one issue, and my patch just exposed it.
> The current interface deletion code does not remove node stats entries.
>
> So if you delete an interface and then create one with the same name,
> the stats entry is already there, and creation of the new entry fails.
>
> Hope this helps:
>
> https://gerrit.fd.io/r/c/vpp/+/35900
>
> —
> Damjan
>
>
>
> > On 05.04.2022., at 22:13, Pim van Pelt <p...@ipng.nl> wrote:
> >
> > Hoi,
> >
> > Here's a minimal repro that reliably crashes VPP at head for me and does not crash before gerrit 35640:
> >
> > create loopback interface instance 0
> > create bond id 0 mode lacp load-balance l34
> > create bond id 1 mode lacp load-balance l34
> > delete loopback interface intfc loop0
> > delete bond BondEthernet0
> > delete bond BondEthernet1
> > create bond id 0 mode lacp load-balance l34
> > delete bond BondEthernet0
> > comment { the next command crashes VPP }
> > create loopback interface instance 0
> >
> >
> >
> > On Tue, Apr 5, 2022 at 9:48 PM Pim van Pelt <p...@ipng.nl> wrote:
> > Hoi,
> >
> > There is a crashing regression in VPP after https://gerrit.fd.io/r/c/vpp/+/35640
> >
> > With that change merged, VPP crashes upon creation and deletion of interfaces. Winding back the repo to before 35640 does not crash. The crash happens in:
> > 0: /home/pim/src/vpp/src/vlib/stats/stats.h:115 (vlib_stats_get_entry) assertion `entry_index < vec_len (sm->directory_vector)' fails
> >
> > (gdb) bt
> > #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> > #1  0x00007ffff6a5e859 in __GI_abort () at abort.c:79
> > #2  0x00000000004072e3 in os_panic () at /home/pim/src/vpp/src/vpp/vnet/main.c:413
> > #3  0x00007ffff6dada29 in debugger () at /home/pim/src/vpp/src/vppinfra/error.c:84
> > #4  0x00007ffff6dad7fa in _clib_error (how_to_die=2, function_name=0x0, line_number=0, fmt=0x7ffff6f9c19c "%s:%d (%s) assertion `%s' fails")
> >     at /home/pim/src/vpp/src/vppinfra/error.c:143
> > #5  0x00007ffff6f39605 in vlib_stats_get_entry (sm=0x7ffff6fce5e8 <vlib_stats_main>, entry_index=4294967295)
> >     at /home/pim/src/vpp/src/vlib/stats/stats.h:115
> > #6  0x00007ffff6f39273 in vlib_stats_remove_entry (entry_index=4294967295) at /home/pim/src/vpp/src/vlib/stats/stats.c:135
> > #7  0x00007ffff6ee36d9 in vlib_register_errors (vm=0x7fff96800740, node_index=718, n_errors=0, error_strings=0x0, counters=0x0)
> >     at /home/pim/src/vpp/src/vlib/error.c:149
> > #8  0x00007ffff70b8e0c in setup_tx_node (vm=0x7fff96800740, node_index=718, dev_class=0x7fff973f9fb0) at /home/pim/src/vpp/src/vnet/interface.c:816
> > #9  0x00007ffff70b7f26 in vnet_register_interface (vnm=0x7ffff7f579a0 <vnet_main>, dev_class_index=31, dev_instance=0, hw_class_index=29, hw_instance=7) at /home/pim/src/vpp/src/vnet/interface.c:1085
> > #10 0x00007ffff7129efd in vnet_eth_register_interface (vnm=0x7ffff7f579a0 <vnet_main>, r=0x7fff4b288f18)
> >     at /home/pim/src/vpp/src/vnet/ethernet/interface.c:376
> > #11 0x00007ffff712bd05 in vnet_create_loopback_interface (sw_if_indexp=0x7fff4b288fb8, mac_address=0x7fff4b288fb2 "", is_specified=1 '\001', user_instance=0) at /home/pim/src/vpp/src/vnet/ethernet/interface.c:883
> > #12 0x00007ffff712fecf in create_simulated_ethernet_interfaces (vm=0x7fff96800740, input=0x7fff4b2899d0, cmd=0x7fff973c7e38)
> >     at /home/pim/src/vpp/src/vnet/ethernet/interface.c:930
> > #13 0x00007ffff6ed65e8 in vlib_cli_dispatch_sub_commands (vm=0x7fff96800740, cm=0x42c2f0 <vlib_global_main+48>, input=0x7fff4b2899d0, parent_command_index=1161) at /home/pim/src/vpp/src/vlib/cli.c:592
> > #14 0x00007ffff6ed6358 in vlib_cli_dispatch_sub_commands (vm=0x7fff96800740, cm=0x42c2f0 <vlib_global_main+48>, input=0x7fff4b2899d0, parent_command_index=33) at /home/pim/src/vpp/src/vlib/cli.c:549
> > #15 0x00007ffff6ed6358 in vlib_cli_dispatch_sub_commands (vm=0x7fff96800740, cm=0x42c2f0 <vlib_global_main+48>, input=0x7fff4b2899d0, parent_command_index=0) at /home/pim/src/vpp/src/vlib/cli.c:549
> > #16 0x00007ffff6ed5528 in vlib_cli_input (vm=0x7fff96800740, input=0x7fff4b2899d0, function=0x0, function_arg=0)
> >     at /home/pim/src/vpp/src/vlib/cli.c:695
> > #17 0x00007ffff6f61f21 in unix_cli_exec (vm=0x7fff96800740, input=0x7fff4b289e78, cmd=0x7fff973c99d8) at /home/pim/src/vpp/src/vlib/unix/cli.c:3454
> > #18 0x00007ffff6ed65e8 in vlib_cli_dispatch_sub_commands (vm=0x7fff96800740, cm=0x42c2f0 <vlib_global_main+48>, input=0x7fff4b289e78, parent_command_index=0) at /home/pim/src/vpp/src/vlib/cli.c:592
> > #19 0x00007ffff6ed5528 in vlib_cli_input (vm=0x7fff96800740, input=0x7fff4b289e78, function=0x7ffff6f55960 <unix_vlib_cli_output>, function_arg=1)
> >     at /home/pim/src/vpp/src/vlib/cli.c:695
> >
> > This is caught by a local regression test (https://github.com/pimvanpelt/vppcfg/tree/main/intest) that executes a bunch of CLI statements, and I have a set of transitions there which I can probably narrow down to an exact repro case.
> >
> > On Fri, Apr 1, 2022 at 3:08 PM Pim van Pelt via lists.fd.io <pim=ipng...@lists.fd.io> wrote:
> > Hoi,
> >
> > As a followup - I tried to remember why I copied class VPPStats() and friends into my own repository, but that may be because it's not exported in __init__.py. Should it be? I pulled in the latest changes Damjan made to vpp_stats.py into my own repo, and my app runs again. Is it worth our while to add the VPPStats() class to the exported classes in vpp_papi?
> >
> > groet,
> > Pim
> >
> > On Fri, Apr 1, 2022 at 2:50 PM Pim van Pelt via lists.fd.io <pim=ipng...@lists.fd.io> wrote:
> > Hoi,
> >
> > I noticed that my VPP SNMP Agent no longer works with the Python API at HEAD, and my attention was drawn to this change:
> > https://gerrit.fd.io/r/c/vpp/+/35640
> > stats: convert error counters to normal counters
> >
> >
> > At HEAD, src/vpp-api/python/vpp_papi/vpp_stats.py now fails 4 out of 6 tests with the same error as my application:
> > struct.error: offset -140393469444104 out of range for 1073741824-byte buffer
> > ..
> > Ran 6 tests in 0.612s
> > FAILED (errors=4)
> >
> > Damjan, Ole, any clues?
> >
> > groet,
> > Pim
> > --
> > Pim van Pelt <p...@ipng.nl>
> > PBVP1-RIPE - http://www.ipng.nl/
> >
>
>

-- 
Pim van Pelt <p...@ipng.nl>
PBVP1-RIPE - http://www.ipng.nl/