Hi Alexander, 

Quick reply. 

Nice bug report! Agreed that it looks like vl_api_clnt_process sleeps, probably 
because it saw a queue size of 0, but either memclnt_queue_callback or the 
timeout (albeit 20s is a lot) should wake it up.

So, given that QUEUE_SIGNAL_EVENT is set, the only thing that comes to mind is 
that the vlib_process_signal_event context somehow gets corrupted. Could you 
run a debug image and see if anything asserts? Is vlib_process_signal_event by 
any chance called from a worker?
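
If it helps, a quick bit of instrumentation along these lines could confirm 
that (untested sketch; where to drop it, e.g. in memclnt_queue_callback or 
right before the vlib_process_signal_event call, is just a suggestion):

  /* temporary debug check: catch the signaling path being hit from a worker */
  uword ti = vlib_get_thread_index ();
  if (ti != 0)
    clib_warning ("signal path hit from worker thread %u", (u32) ti);
  ASSERT (ti == 0);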

Regards,
Florin

> On Jan 24, 2023, at 7:59 AM, Alexander Chernavin via lists.fd.io 
> <achernavin=netgate....@lists.fd.io> wrote:
> 
> Hello all,
> 
> We are experiencing VPP crashes that occur a few days after startup because 
> of API segment exhaustion. Increasing the API segment size to 256MB didn't 
> stop the crashes from occurring.
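> 
> For reference, this is roughly how the size is raised in startup.conf (the 
> api-segment stanza; option name written from memory, adjust if needed):
>> api-segment {
>>   global-size 256M
>> }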
> 
> Can you please take a look at the description below and tell us if you have 
> seen similar issues or have any ideas what the cause may be?
> 
> Given:
> VPP 22.10
> 2 worker threads
> API segment size is 256MB
> ~893k IPv4 routes and ~160k IPv6 routes added
> 
> Backtrace:
>> [..]
>> #32660 0x000055b02f606896 in os_panic () at 
>> /home/jenkins/tnsr-pkgs/work/vpp/src/vpp/vnet/main.c:414
>> #32661 0x00007fce3c0ec740 in clib_mem_heap_alloc_inline (heap=0x0, 
>> size=<optimized out>, align=8, 
>>     os_out_of_memory_on_failure=1) at 
>> /home/jenkins/tnsr-pkgs/work/vpp/src/vppinfra/mem_dlmalloc.c:613
>> #32662 clib_mem_alloc (size=<optimized out>)
>>     at /home/jenkins/tnsr-pkgs/work/vpp/src/vppinfra/mem_dlmalloc.c:628
>> #32663 0x00007fce3dc4ee6f in vl_msg_api_alloc_internal (vlib_rp=0x130026000, 
>> nbytes=69, pool=0, 
>>     may_return_null=0) at 
>> /home/jenkins/tnsr-pkgs/work/vpp/src/vlibmemory/memory_shared.c:179
>> #32664 0x00007fce3dc592cd in vl_api_rpc_call_main_thread_inline (force_rpc=0 
>> '\000', 
>>     fp=<optimized out>, data=<optimized out>, data_length=<optimized out>)
>>     at /home/jenkins/tnsr-pkgs/work/vpp/src/vlibmemory/memclnt_api.c:617
>> #32665 vl_api_rpc_call_main_thread (fp=0x7fce3c74de70 <ip_neighbor_learn>, 
>>     data=0x7fcc372bdc00 "& \001$ ", data_length=28)
>>     at /home/jenkins/tnsr-pkgs/work/vpp/src/vlibmemory/memclnt_api.c:641
>> #32666 0x00007fce3cc7fe2d in icmp6_neighbor_solicitation_or_advertisement 
>> (vm=0x7fccc0864000, 
>>     frame=0x7fcccd7d2d40, is_solicitation=1, node=<optimized out>)
>>     at /home/jenkins/tnsr-pkgs/work/vpp/src/vnet/ip6-nd/ip6_nd.c:163
>> #32667 icmp6_neighbor_solicitation (vm=0x7fccc0864000, node=0x7fccc09e3380, 
>> frame=0x7fcccd7d2d40)
>>     at /home/jenkins/tnsr-pkgs/work/vpp/src/vnet/ip6-nd/ip6_nd.c:322
>> #32668 0x00007fce3c1a2fe0 in dispatch_node (vm=0x7fccc0864000, 
>> node=0x7fce3dc74836, 
>>     type=VLIB_NODE_TYPE_INTERNAL, dispatch_state=VLIB_NODE_STATE_POLLING, 
>> frame=0x7fcccd7d2d40, 
>>     last_time_stamp=4014159654296481) at 
>> /home/jenkins/tnsr-pkgs/work/vpp/src/vlib/main.c:961
>> #32669 dispatch_pending_node (vm=0x7fccc0864000, pending_frame_index=7, 
>>     last_time_stamp=4014159654296481) at 
>> /home/jenkins/tnsr-pkgs/work/vpp/src/vlib/main.c:1120
>> #32670 vlib_main_or_worker_loop (vm=0x7fccc0864000, is_main=0)
>>     at /home/jenkins/tnsr-pkgs/work/vpp/src/vlib/main.c:1589
>> #32671 vlib_worker_loop (vm=vm@entry=0x7fccc0864000)
>>     at /home/jenkins/tnsr-pkgs/work/vpp/src/vlib/main.c:1723
>> #32672 0x00007fce3c1f581a in vlib_worker_thread_fn (arg=0x7fccbdb11b40)
>>     at /home/jenkins/tnsr-pkgs/work/vpp/src/vlib/threads.c:1579
>> #32673 0x00007fce3c1f02c1 in vlib_worker_thread_bootstrap_fn 
>> (arg=0x7fccbdb11b40)
>>     at /home/jenkins/tnsr-pkgs/work/vpp/src/vlib/threads.c:418
>> #32674 0x00007fce3be3db43 in start_thread (arg=<optimized out>) at 
>> ./nptl/pthread_create.c:442
>> #32675 0x00007fce3becfa00 in clone3 () at 
>> ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
> 
> According to the backtrace, an IPv6 neighbor is being learned. Since the 
> packet was received on a worker thread, the neighbor information is passed to 
> the main thread by making an RPC call (which goes through the API). For this, 
> an API message for the RPC call is allocated from the API segment (as a 
> client), and this allocation fails because there is no memory available.
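> 
> A minimal sketch of that call path as we read it from the backtrace (struct 
> contents and casts simplified, not the actual source):
>> /* on a worker thread, e.g. in icmp6_neighbor_solicitation_or_advertisement */
>> ip_neighbor_learn_t l;  /* ip, mac, sw_if_index taken from the packet */
>> /* ... fill l ... */
>> /* allocates a VL_API_RPC_CALL message in the shared API segment and queues
>>  * it so that ip_neighbor_learn () eventually runs on the main thread */
>> vl_api_rpc_call_main_thread (ip_neighbor_learn, (u8 *) &l, sizeof (l));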
> 
> If we inspect the API rings after the crash, we can see that they are all 
> filled with VL_API_RPC_CALL messages. We can also see that there are a lot of 
> pending RPC requests (vm->pending_rpc_requests has ~3.3M items). Thus, the 
> API segment exhaustion is caused by a huge number of pending RPC messages.
> 
> RPC messages are processed in a process node called api-rx-from-ring (its 
> function is vl_api_clnt_process), and process nodes run in the main thread 
> only.
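> 
> For context, a rough sketch of what such a process node does (simplified, not 
> the actual vl_api_clnt_process source):
>> uword event_type, *event_data = 0;
>> f64 timeout = 20.0;  /* the 20s resume interval mentioned below */
>> while (1)
>>   {
>>     /* suspend until an event (e.g. QUEUE_SIGNAL_EVENT) is signaled or the
>>      * timeout expires; this is where the node appears to be stuck */
>>     vlib_process_wait_for_event_or_clock (vm, timeout);
>>     event_type = vlib_process_get_events (vm, &event_data);
>>     /* ... drain the input queue / pending RPCs ... */
>>     vec_reset_length (event_data);
>>   }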
> 
> The first hypothesis is that the main loop of the main thread pauses for so 
> long that the worker threads (which keep running) accumulate a huge number of 
> pending RPC messages. But this doesn't seem to be confirmed by 
> vm->loop_interval_start of the threads after the crash: if the main thread 
> had stalled, vm->loop_interval_start of the worker threads would have been 
> greater than that of the main thread.
>> (gdb) p vlib_global_main.vlib_mains[0]->loop_interval_start
>> $117 = 197662.55595008997
>> (gdb) p vlib_global_main.vlib_mains[1]->loop_interval_start
>> $119 = 197659.82887979984
>> (gdb) p vlib_global_main.vlib_mains[2]->loop_interval_start
>> $121 = 197659.93944517447
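> Here it is the other way around: the main thread's interval started later 
> (197662.56) than the workers' (197659.83 and 197659.94), so the main loop 
> appears to have kept running.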
> 
> The second hypothesis is that at some point pending RPC messages stop being 
> processed entirely and keep accumulating for as long as memory permits. This 
> seems to be confirmed by the process node's state after the crash: 
> vm->main_loop_count is much bigger than the node's 
> main_loop_count_last_dispatch (the difference is ~50M iterations). Yet, 
> according to the flags, the node is waiting for a timer or an event, and it 
> was suspended for 20s. Also, vm->queue_signal_pending is set, which means 
> that QUEUE_SIGNAL_EVENT was signaled and the process node was supposed to run 
> shortly. Such a big gap since the node's last execution suggests that it 
> wasn't woken up by the timer either.
>> (gdb) p vlib_global_main.vlib_mains[0]->main_loop_count 
>> $115 = 1696611558
>> (gdb) p vlib_global_main.vlib_mains[0]->queue_signal_pending
>> $116 = 1
>> (gdb) p *vlib_global_main.vlib_mains[0]->node_main.processes[26]
>> $122 = {[...], function = 0x7fce3dc57260 <vl_api_clnt_process>,
>> [...], main_loop_count_last_dispatch = 1646632623,
>> [...], flags = 11,
>> [...], resume_clock_interval = 2000000,
>> [...]}
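> (For reference: 1,696,611,558 - 1,646,632,623 = 49,978,935, i.e. roughly 50M 
> main loop iterations since the node was last dispatched.)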
> 
> It is also worth noting that if we look at the stop_timer_handle the process 
> node keeps, the timer doesn't belong to it. In fact, it belongs to another 
> process node, which might suggest that the timer was reused. But if we 
> compare the main_loop_count_last_dispatch values, after one crash it does 
> look like a reuse, while after another it doesn't. For example, in this case 
> it doesn't look like a reuse because 1,646,632,623 > 1,552,541,082.
>> (gdb) p *vlib_global_main.vlib_mains[0]->node_main.processes[26]
>> $254 = {[...], function = 0x7fce3dc57260 <vl_api_clnt_process>,
>> [...], main_loop_count_last_dispatch = 1646632623,
>> [...], stop_timer_handle = 3087,
>> [...]}
>> (gdb) p 
>> ((tw_timer_wheel_1t_3w_1024sl_ov_t*)vlib_global_main.vlib_mains[0]->node_main.timing_wheel)->timers[3087]
>> $255 = {next = 3095, prev = 3074, {{fast_ring_offset = 287, slow_ring_offset 
>> = 356}, expiration_time = 18446744069437915423}, user_handle = 84}
>> (gdb) p *vlib_global_main.vlib_mains[0]->node_main.processes[42]
>> $256 = {[...], function = 0x7fce3c638700 <ip4_sv_reass_walk_expired>,
>> [...], main_loop_count_last_dispatch = 1552541082,
>> [...], stop_timer_handle = 3087,
>> [...]}
> 
> Regards,
> Alexander
> 
