Hi Damjan,

Just a quick note - 22.06 still has this regression

1: /home/pim/src/vpp/src/vlib/drop.c:77 (counter_index) assertion `ci < n->n_errors' fails


Would a reasonable fix be to have counter_index() return NULL here instead
of hitting the ASSERT, and to make its two call sites (L95 and L224 in
drop.c) tolerant of that?
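
To make the idea concrete, here is a minimal standalone sketch. It is not the actual VPP code: sketch_node_t, INVALID_COUNTER_INDEX, counter_index_checked() and bump_counter() are hypothetical names made up for illustration, and only the range check and the error_heap_index offset mirror the drop.c listing quoted further down. Because the function returns an integer index rather than a pointer, the sketch assumes a sentinel value in place of NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "no valid counter"; not a VPP definition. */
#define INVALID_COUNTER_INDEX UINT32_MAX

/* Simplified stand-in for vlib_node_t, keeping only the two fields that
   the quoted counter_index() listing reads. */
typedef struct
{
  uint32_t n_errors;
  uint32_t error_heap_index;
} sketch_node_t;

/* Map a per-node error code to a global counter index. Instead of
   asserting, return the sentinel when the code is out of range, as could
   happen for a node that belongs to a deleted interface. */
static uint32_t
counter_index_checked (const sketch_node_t *n, uint32_t ci)
{
  assert (n != NULL);
  if (ci >= n->n_errors)
    return INVALID_COUNTER_INDEX;
  return ci + n->error_heap_index;
}

/* Call-site pattern: skip the increment when no valid counter exists.
   Returns 1 if a counter was bumped, 0 if the update was skipped. */
static int
bump_counter (uint64_t *counters, const sketch_node_t *n, uint32_t ci,
	      uint64_t n_packets)
{
  uint32_t idx = counter_index_checked (n, ci);
  if (idx == INVALID_COUNTER_INDEX)
    return 0;
  counters[idx] += n_packets;
  return 1;
}
```

The trade-off in this sketch is that a skipped update silently loses the drop count for the in-flight packet; whether that is acceptable, or whether such drops should go to some catch-all counter, is a separate question.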

On Thu, Apr 7, 2022 at 3:11 PM Damjan Marion (damarion) <damar...@cisco.com>
wrote:

>
> Yeah, looks like ip4_neighbor_probe is sending a packet to a deleted interface:
>
> (gdb)p n->name
> $4 = (u8 *) 0x7fff82b47578 "interface-3-output-deleted"
>
> So it is right that this assert kicks in.
>
> Likely what happens is that the batch of commands first triggers
> generation of a neighbor probe packet, and then the interface is deleted
> immediately afterwards. The packet is still in flight, so the drop node
> tries to bump counters for the deleted interface.
>
> —
> Damjan
>
>
>
> > On 06.04.2022., at 16:21, Pim van Pelt <p...@ipng.nl> wrote:
> >
> > Hoi,
> >
> > The following reproduces the drop.c:77 assertion:
> >
> > create loopback interface instance 0
> > set interface ip address loop0 10.0.0.1/32
> > set interface state GigabitEthernet3/0/1 up
> > set interface state loop0 up
> > set interface state loop0 down
> > set interface ip address del loop0 10.0.0.1/32
> > delete loopback interface intfc loop0
> > set interface state GigabitEthernet3/0/1 down
> > set interface state GigabitEthernet3/0/1 up
> > comment { the following crashes VPP }
> > set interface state GigabitEthernet3/0/1 down
> >
> > I found that adding IPv6 addresses does not provoke the crash, while
> adding IPv4 addresses to loop0 does provoke it.
> >
> > groet,
> > Pim
> >
> > On Wed, Apr 6, 2022 at 3:56 PM Pim van Pelt via lists.fd.io <pim=
> ipng...@lists.fd.io> wrote:
> > Hoi,
> >
> > The crash I observed is now gone, thanks!
> >
> > VPP occasionally hits an ASSERT related to error counters at drop.c:77
> -- I'll try to see if I can get a reproduction, but it may take a while,
> and it may be transient.
> >
> > 11: /home/pim/src/vpp/src/vlib/drop.c:77 (counter_index) assertion `ci <
> n->n_errors' fails
> >
> > Thread 14 "vpp_wk_11" received signal SIGABRT, Aborted.
> > [Switching to Thread 0x7fff4bbfd700 (LWP 182685)]
> > __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
> > 50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
> > (gdb) bt
> > #0  __GI_raise (sig=sig@entry=6) at
> ../sysdeps/unix/sysv/linux/raise.c:50
> > #1  0x00007ffff6a5f859 in __GI_abort () at abort.c:79
> > #2  0x00000000004072e3 in os_panic () at
> /home/pim/src/vpp/src/vpp/vnet/main.c:413
> > #3  0x00007ffff6daea29 in debugger () at
> /home/pim/src/vpp/src/vppinfra/error.c:84
> > #4  0x00007ffff6dae7fa in _clib_error (how_to_die=2, function_name=0x0,
> line_number=0, fmt=0x7ffff6f9d19c "%s:%d (%s) assertion `%s' fails")
> >     at /home/pim/src/vpp/src/vppinfra/error.c:143
> > #5  0x00007ffff6f782d9 in counter_index (vm=0x7fffa09fb2c0, e=3416) at
> /home/pim/src/vpp/src/vlib/drop.c:77
> > #6  0x00007ffff6f77c57 in process_drop_punt (vm=0x7fffa09fb2c0,
> node=0x7fffa0c79b00, frame=0x7fff97168140,
> disposition=ERROR_DISPOSITION_DROP)
> >     at /home/pim/src/vpp/src/vlib/drop.c:224
> > #7  0x00007ffff6f77957 in error_drop_node_fn_hsw (vm=0x7fffa09fb2c0,
> node=0x7fffa0c79b00, frame=0x7fff97168140)
> >     at /home/pim/src/vpp/src/vlib/drop.c:248
> > #8  0x00007ffff6f0b10d in dispatch_node (vm=0x7fffa09fb2c0,
> node=0x7fffa0c79b00, type=VLIB_NODE_TYPE_INTERNAL,
> >     dispatch_state=VLIB_NODE_STATE_POLLING, frame=0x7fff97168140,
> last_time_stamp=5318787653101516) at /home/pim/src/vpp/src/vlib/main.c:961
> > #9  0x00007ffff6f0bb60 in dispatch_pending_node (vm=0x7fffa09fb2c0,
> pending_frame_index=5, last_time_stamp=5318787653101516)
> >     at /home/pim/src/vpp/src/vlib/main.c:1120
> > #10 0x00007ffff6f06e0f in vlib_main_or_worker_loop (vm=0x7fffa09fb2c0,
> is_main=0) at /home/pim/src/vpp/src/vlib/main.c:1587
> > #11 0x00007ffff6f06537 in vlib_worker_loop (vm=0x7fffa09fb2c0) at
> /home/pim/src/vpp/src/vlib/main.c:1721
> > #12 0x00007ffff6f44ef4 in vlib_worker_thread_fn (arg=0x7fff98eabec0) at
> /home/pim/src/vpp/src/vlib/threads.c:1587
> > #13 0x00007ffff6f3ffe5 in vlib_worker_thread_bootstrap_fn
> (arg=0x7fff98eabec0) at /home/pim/src/vpp/src/vlib/threads.c:426
> > #14 0x00007ffff6e61609 in start_thread (arg=<optimized out>) at
> pthread_create.c:477
> > #15 0x00007ffff6b5c163 in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> > (gdb) up 4
> > #4  0x00007ffff6dae7fa in _clib_error (how_to_die=2, function_name=0x0,
> line_number=0, fmt=0x7ffff6f9d19c "%s:%d (%s) assertion `%s' fails")
> >     at /home/pim/src/vpp/src/vppinfra/error.c:143
> > 143         debugger ();
> > (gdb) up
> > #5  0x00007ffff6f782d9 in counter_index (vm=0x7fffa09fb2c0, e=3416) at
> /home/pim/src/vpp/src/vlib/drop.c:77
> > 77        ASSERT (ci < n->n_errors);
> > (gdb) list
> > 72
> > 73        ni = vlib_error_get_node (&vm->node_main, e);
> > 74        n = vlib_get_node (vm, ni);
> > 75
> > 76        ci = vlib_error_get_code (&vm->node_main, e);
> > 77        ASSERT (ci < n->n_errors);
> > 78
> > 79        ci += n->error_heap_index;
> > 80
> > 81        return ci;
> >
> > On Wed, Apr 6, 2022 at 1:53 PM Damjan Marion (damarion) <
> damar...@cisco.com> wrote:
> >
> > This seems to be a day-one issue, and my patch just exposed it.
> > The current interface deletion code does not remove node stats entries.
> >
> > So if you delete an interface and then create one with the same name,
> > the stats entry is already there, and creation of the new entry fails.
> >
> > Hope this helps:
> >
> > https://gerrit.fd.io/r/c/vpp/+/35900
> >
> > —
> > Damjan
> >
> >
> >
> > > On 05.04.2022., at 22:13, Pim van Pelt <p...@ipng.nl> wrote:
> > >
> > > Hoi,
> > >
> > > Here's a minimal repro that reliably crashes VPP at HEAD for me; it does
> not crash before gerrit 35640:
> > >
> > > create loopback interface instance 0
> > > create bond id 0 mode lacp load-balance l34
> > > create bond id 1 mode lacp load-balance l34
> > > delete loopback interface intfc loop0
> > > delete bond BondEthernet0
> > > delete bond BondEthernet1
> > > create bond id 0 mode lacp load-balance l34
> > > delete bond BondEthernet0
> > > comment { the next command crashes VPP }
> > > create loopback interface instance 0
> > >
> > >
> > >
> > > On Tue, Apr 5, 2022 at 9:48 PM Pim van Pelt <p...@ipng.nl> wrote:
> > > Hoi,
> > >
> > > There is a crashing regression in VPP after
> https://gerrit.fd.io/r/c/vpp/+/35640
> > >
> > > With that change merged, VPP crashes upon creation and deletion of
> interfaces. Winding the repo back to before 35640 avoids the crash. The
> crash happens in
> > > 0: /home/pim/src/vpp/src/vlib/stats/stats.h:115 (vlib_stats_get_entry)
> assertion `entry_index < vec_len (sm->directory_vector)' fails
> > >
> > > (gdb) bt
> > > #0  __GI_raise (sig=sig@entry=6) at
> ../sysdeps/unix/sysv/linux/raise.c:50
> > > #1  0x00007ffff6a5e859 in __GI_abort () at abort.c:79
> > > #2  0x00000000004072e3 in os_panic () at
> /home/pim/src/vpp/src/vpp/vnet/main.c:413
> > > #3  0x00007ffff6dada29 in debugger () at
> /home/pim/src/vpp/src/vppinfra/error.c:84
> > > #4  0x00007ffff6dad7fa in _clib_error (how_to_die=2,
> function_name=0x0, line_number=0, fmt=0x7ffff6f9c19c "%s:%d (%s) assertion
> `%s' fails")
> > >    at /home/pim/src/vpp/src/vppinfra/error.c:143
> > > #5  0x00007ffff6f39605 in vlib_stats_get_entry (sm=0x7ffff6fce5e8
> <vlib_stats_main>, entry_index=4294967295)
> > >    at /home/pim/src/vpp/src/vlib/stats/stats.h:115
> > > #6  0x00007ffff6f39273 in vlib_stats_remove_entry
> (entry_index=4294967295) at /home/pim/src/vpp/src/vlib/stats/stats.c:135
> > > #7  0x00007ffff6ee36d9 in vlib_register_errors (vm=0x7fff96800740,
> node_index=718, n_errors=0, error_strings=0x0, counters=0x0)
> > >    at /home/pim/src/vpp/src/vlib/error.c:149
> > > #8  0x00007ffff70b8e0c in setup_tx_node (vm=0x7fff96800740,
> node_index=718, dev_class=0x7fff973f9fb0) at
> /home/pim/src/vpp/src/vnet/interface.c:816
> > > #9  0x00007ffff70b7f26 in vnet_register_interface (vnm=0x7ffff7f579a0
> <vnet_main>, dev_class_index=31, dev_instance=0, hw_class_index=29,
> > >    hw_instance=7) at /home/pim/src/vpp/src/vnet/interface.c:1085
> > > #10 0x00007ffff7129efd in vnet_eth_register_interface
> (vnm=0x7ffff7f579a0 <vnet_main>, r=0x7fff4b288f18)
> > >    at /home/pim/src/vpp/src/vnet/ethernet/interface.c:376
> > > #11 0x00007ffff712bd05 in vnet_create_loopback_interface
> (sw_if_indexp=0x7fff4b288fb8, mac_address=0x7fff4b288fb2 "", is_specified=1
> '\001',
> > >    user_instance=0) at
> /home/pim/src/vpp/src/vnet/ethernet/interface.c:883
> > > #12 0x00007ffff712fecf in create_simulated_ethernet_interfaces
> (vm=0x7fff96800740, input=0x7fff4b2899d0, cmd=0x7fff973c7e38)
> > >    at /home/pim/src/vpp/src/vnet/ethernet/interface.c:930
> > > #13 0x00007ffff6ed65e8 in vlib_cli_dispatch_sub_commands
> (vm=0x7fff96800740, cm=0x42c2f0 <vlib_global_main+48>,
> input=0x7fff4b2899d0,
> > >    parent_command_index=1161) at /home/pim/src/vpp/src/vlib/cli.c:592
> > > #14 0x00007ffff6ed6358 in vlib_cli_dispatch_sub_commands
> (vm=0x7fff96800740, cm=0x42c2f0 <vlib_global_main+48>,
> input=0x7fff4b2899d0,
> > >    parent_command_index=33) at /home/pim/src/vpp/src/vlib/cli.c:549
> > > #15 0x00007ffff6ed6358 in vlib_cli_dispatch_sub_commands
> (vm=0x7fff96800740, cm=0x42c2f0 <vlib_global_main+48>,
> input=0x7fff4b2899d0,
> > >    parent_command_index=0) at /home/pim/src/vpp/src/vlib/cli.c:549
> > > #16 0x00007ffff6ed5528 in vlib_cli_input (vm=0x7fff96800740,
> input=0x7fff4b2899d0, function=0x0, function_arg=0)
> > >    at /home/pim/src/vpp/src/vlib/cli.c:695
> > > #17 0x00007ffff6f61f21 in unix_cli_exec (vm=0x7fff96800740,
> input=0x7fff4b289e78, cmd=0x7fff973c99d8) at
> /home/pim/src/vpp/src/vlib/unix/cli.c:3454
> > > #18 0x00007ffff6ed65e8 in vlib_cli_dispatch_sub_commands
> (vm=0x7fff96800740, cm=0x42c2f0 <vlib_global_main+48>,
> input=0x7fff4b289e78,
> > >    parent_command_index=0) at /home/pim/src/vpp/src/vlib/cli.c:592
> > > #19 0x00007ffff6ed5528 in vlib_cli_input (vm=0x7fff96800740,
> input=0x7fff4b289e78, function=0x7ffff6f55960 <unix_vlib_cli_output>,
> function_arg=1)
> > >    at /home/pim/src/vpp/src/vlib/cli.c:695
> > >
> > > This is caught by a local regression test (
> https://github.com/pimvanpelt/vppcfg/tree/main/intest) that executes a
> bunch of CLI statements, and I have a set of transitions there which I can
> probably narrow down to an exact repro case.
> > >
> > > On Fri, Apr 1, 2022 at 3:08 PM Pim van Pelt via lists.fd.io <pim=
> ipng...@lists.fd.io> wrote:
> > > Hoi,
> > >
> > > As a followup - I tried to remember why I copied class VPPStats() and
> friends into my own repository; it may be because it's not exported
> in __init__.py. Should it be? I pulled the latest changes Damjan made to
> vpp_stats.py into my own repo, and my app runs again. Would it be worth
> our while to add the VPPStats() class to the exported classes in vpp_papi?
> > >
> > > groet,
> > > Pim
> > >
> > > On Fri, Apr 1, 2022 at 2:50 PM Pim van Pelt via lists.fd.io <pim=
> ipng...@lists.fd.io> wrote:
> > > Hoi,
> > >
> > > I noticed that my VPP SNMP Agent no longer works with the python API
> at HEAD, and my attention was drawn to this change:
> > > https://gerrit.fd.io/r/c/vpp/+/35640
> > > stats: convert error counters to normal counters
> > >
> > >
> > > At HEAD, src/vpp-api/python/vpp_papi/vpp_stats.py now fails 4 out of 6
> tests with the same error as my application:
> > > struct.error: offset -140393469444104 out of range for 1073741824-byte
> buffer
> > > ..
> > > Ran 6 tests in 0.612s
> > > FAILED (errors=4)
> > >
> > > Damjan, Ole, any clues?
> > >
> > > groet,
> > > Pim
> > > --
> > > Pim van Pelt <p...@ipng.nl>
> > > PBVP1-RIPE - http://www.ipng.nl/

-- 
Pim van Pelt <p...@ipng.nl>
PBVP1-RIPE - http://www.ipng.nl/