Let’s be clear: you’re seeing a crash in a modified fork of vpp-19.08. I’ve 
never seen this crash myself, nor, to my knowledge, has anyone else reported 
one.

 

That having been written, all signs point to the volatile int ** vector 
vl_api_queue_cursizes having had an accident:

 

static void
memclnt_queue_callback (vlib_main_t * vm)
{
<snip>
  for (i = 0; i < vec_len (vl_api_queue_cursizes); i++)
    {
      if (*vl_api_queue_cursizes[i])
        {
          vm->queue_signal_pending = 1;
          vm->api_queue_nonempty = 1;
          vlib_process_signal_event (vm, vl_api_clnt_node.index,
                                     /* event_type */ QUEUE_SIGNAL_EVENT,
                                     /* event_data */ 0);
          break;
        }
    }
<snip>
}

 

Try a debug image. Try capturing “i”, and the value vl_api_queue_cursizes[i] 
before dereferencing as a pointer. Add a couple of global variables with names 
which won’t collide with anything else:

 

volatile int oingo_save_i;
volatile int *oingo_save_cursizep;

 

In the loop, set:

   oingo_save_i = i;
   oingo_save_cursizep = vl_api_queue_cursizes[i];

   if (*vl_api_queue_cursizes[i])
     <etc>
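For anyone who wants to try the idea outside of VPP first, here is a minimal standalone sketch of the same capture-before-dereference pattern. It assumes a plain C array in place of the VPP vector, and first_nonempty_queue is a hypothetical name used only for illustration; it is not a VPP function.

```c
#include <stddef.h>

/* Debug globals: volatile so the stores survive an optimized build and
   can be read back from a core file. */
volatile int oingo_save_i;
volatile int *oingo_save_cursizep;

/* Hypothetical stand-in for the loop in memclnt_queue_callback().
   cursizes is a plain array of n pointers here, not a VPP vector.
   Returns the index of the first non-empty queue, or -1 if none. */
int
first_nonempty_queue (volatile int **cursizes, int n)
{
  int i;

  for (i = 0; i < n; i++)
    {
      /* Record what is about to be touched, then dereference. */
      oingo_save_i = i;
      oingo_save_cursizep = cursizes[i];

      if (*cursizes[i])
        return i;
    }
  return -1;
}
```

If the dereference faults, the two globals in the core file hold the index and the pointer value that were in use at the time of the crash.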

 

Capture a coredump. It should be obvious why the reference blows up. If you 
can, change your custom signal handler so that the faulting virtual address is 
as obvious as possible.
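
As a rough sketch of what that could look like: a handler installed with SA_SIGINFO receives a siginfo_t, and for SIGSEGV its si_addr field holds the faulting address. The names fault_addr_to_hex, segv_handler and install_segv_handler below are hypothetical, and this is not the code in your fork or in vlib/unix/main.c. It only uses write(), since printf is not async-signal-safe.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Format v as "0x" plus 16 hex digits into buf (at least 19 bytes).
   Returns the string length, which is always 18. */
size_t
fault_addr_to_hex (uint64_t v, char *buf)
{
  static const char digits[] = "0123456789abcdef";
  int i;

  buf[0] = '0';
  buf[1] = 'x';
  for (i = 0; i < 16; i++)
    buf[2 + i] = digits[(v >> (60 - 4 * i)) & 0xf];
  buf[18] = '\0';
  return 18;
}

/* Hypothetical handler: print the faulting address, then let the
   default action run so a core file is still produced. */
static void
segv_handler (int signo, siginfo_t *si, void *uc)
{
  char line[64] = "SIGSEGV: faulting address ";
  size_t n = strlen (line);

  (void) uc;
  n += fault_addr_to_hex ((uint64_t) (uintptr_t) si->si_addr, line + n);
  line[n++] = '\n';
  if (write (STDERR_FILENO, line, n) < 0)
    {
      /* Nothing useful can be done about a failed write here. */
    }

  signal (signo, SIG_DFL);
  raise (signo);
}

/* Install the handler for SIGSEGV; returns 0 on success, -1 on error. */
int
install_segv_handler (void)
{
  struct sigaction sa;

  memset (&sa, 0, sizeof (sa));
  sa.sa_sigaction = segv_handler;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset (&sa.sa_mask);
  return sigaction (SIGSEGV, &sa, NULL);
}
```

Frame #4 of the backtrace shows unix_signal_handler already receiving an si argument, so the same si->si_addr value should be available there.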

 

Beyond that, you’re on your own.

 

HTH... Dave 

 

From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> On Behalf Of Rajith PR via 
lists.fd.io
Sent: Tuesday, November 17, 2020 7:03 AM
To: vpp-dev <vpp-dev@lists.fd.io>
Subject: [vpp-dev]: Crash in memclnt_queue_callback().

 

Hi All,

 

We are seeing a random crash in VPP-19.08. The crash is occurring in 
memclnt_queue_callback and it is in code that we are not using. Any pointers to 
fix the crash would be helpful.

 

Complete Call Stack:

 

Thread 1 (Thread 0x7fe728f43d00 (LWP 189)):
#0  0x00007fe728049492 in __GI___waitpid (pid=732, stat_loc=stat_loc@entry=0x7fe6f9ebeed8, options=options@entry=0)
    at ../sysdeps/unix/sysv/linux/waitpid.c:30
#1  0x00007fe727fb4177 in do_system (line=<optimized out>) at ../sysdeps/posix/system.c:149
#2  0x00007fe728ad6457 in bd_signal_handler_cb (signo=11) at /development/librtbrickinfra/bd/src/bd.c:770
#3  0x00007fe71c90fbf7 in rtb_bd_signal_handler (signo=11) at /development/libvpp/src/vlib/unix/main.c:80
#4  0x00007fe71c90ff92 in unix_signal_handler (signum=11, si=0x7fe6f9ebf7b0, uc=0x7fe6f9ebf680)
    at /development/libvpp/src/vlib/unix/main.c:180
#5  <signal handler called>
#6  memclnt_queue_callback (vm=0x7fe71cb49e80 <vlib_global_main>) at /development/libvpp/src/vlibmemory/memory_api.c:96
#7  0x00007fe71c8a9258 in vlib_main_or_worker_loop (vm=0x7fe71cb49e80 <vlib_global_main>, is_main=1)
    at /development/libvpp/src/vlib/main.c:1799
#8  0x00007fe71c8a9f9d in vlib_main_loop (vm=0x7fe71cb49e80 <vlib_global_main>) at /development/libvpp/src/vlib/main.c:1982
#9  0x00007fe71c8aac7b in vlib_main (vm=0x7fe71cb49e80 <vlib_global_main>, input=0x7fe6f9ebffb0) at /development/libvpp/src/vlib/main.c:2209
#10 0x00007fe71c911745 in thread0 (arg=140630595772032) at /development/libvpp/src/vlib/unix/main.c:666
#11 0x00007fe71c568560 in clib_calljmp () from /usr/local/lib/libvppinfra.so.1.0.1
#12 0x00007ffe85672480 in ?? ()
#13 0x00007fe71c911cbb in vlib_unix_main (argc=42, argv=0x563be4aaa5a0) at /development/libvpp/src/vlib/unix/main.c:736
#14 0x00007fe71e0bc9eb in rtb_vpp_core_init (argc=42, argv=0x563be4aaa5a0) at /development/libvpp/src/vpp/vnet/main.c:483
#15 0x00007fe71e18fba2 in rtb_vpp_main () at /development/libvpp/src/vpp/rtbrick/rtb_vpp_main.c:113
#16 0x00007fe728ad5e46 in bd_load_daemon_lib (dmn_lib_cfg=0x7fe728cf2820 <bd_json_global+21408>)
    at /development/librtbrickinfra/bd/src/bd.c:627
#17 0x00007fe728ad5ef1 in bd_load_all_daemon_libs () at /development/librtbrickinfra/bd/src/bd.c:646
#18 0x00007fe728ad7362 in bd_start_process () at /development/librtbrickinfra/bd/src/bd.c:1128
#19 0x00007fe72583c860 in bds_bd_init () at /development/librtbrickinfra/libbds/code/bds/src/bds.c:657
#20 0x00007fe7258c8a30 in pubsub_bd_init_expiry (data=0x0) at /development/librtbrickinfra/libbds/code/pubsub/src/pubsub_helper.c:1444
#21 0x00007fe7285d6640 in timer_dispatch (item=0x563be68209b0, p=QB_LOOP_HIGH) at /development/librtbrickinfra/libqb/lib/loop_timerlist.c:56
#22 0x00007fe7285d25d6 in qb_loop_run_level (level=0x563be47a17a0) at /development/librtbrickinfra/libqb/lib/loop.c:43
#23 0x00007fe7285d2d4b in qb_loop_run (lp=0x563be47a1730) at /development/librtbrickinfra/libqb/lib/loop.c:210
#24 0x00007fe7285e461e in lib_qb_service_start_event_loop () at /development/librtbrickinfra/libqb/lib/wrapper/lib_qb_service.c:257
#25 0x0000563be3d9f153 in main ()
(gdb) 
 
Code Snippet:
 
 94   for (i = 0; i < vec_len (vl_api_queue_cursizes); i++)
 95     {
 96       if (*vl_api_queue_cursizes[i])   <--- Crashed here
 97         {
 98           vm->queue_signal_pending = 1;
 99           vm->api_queue_nonempty = 1;
100           vlib_process_signal_event (vm, vl_api_clnt_node.index,
101                                      /* event_type */ QUEUE_SIGNAL_EVENT,
102                                      /* event_data */ 0);
103           break;
104         }
105     }
 
Thanks,
Rajith