Hi,
In some scenarios, when VPP is restarted after a crash, the new VPP instance
itself crashes during startup with the backtrace below.
This does not happen every time. Any hints on what could cause it would be
appreciated.
Backtrace:
(gdb) bt
#0 0x00002b688139c207 in raise () from /lib64/libc.so.6
#1 0x00002b688139d8f8 in abort () from /lib64/libc.so.6
#2 0x0000000000405f7e in os_exit (code=code@entry=1) at
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vpp/vnet/main.c:292
#3 0x00002b687f4b09d7 in unix_signal_handler (signum=<optimized out>,
si=<optimized out>, uc=<optimized out>)
at
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/unix/main.c:120
#4 <signal handler called>
#5 rte_mempool_populate_iova_tab (mp=mp@entry=0x1098fbec0,
vaddr=0x2aaaaac00000 "", iova=0x0, pg_num=<optimized out>, pg_shift=<optimized
out>, free_cb=free_cb@entry=0x0, opaque=opaque@entry=0x0)
at
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-root/build-vpp-native/dpdk/dpdk-stable-17.11.4/lib/librte_mempool/rte_mempool.c:486
#6 0x00002b6b0212a373 in dpdk_pool_create (vm=vm@entry=0x2b687f6cb260
<vlib_global_main>, pool_name=pool_name@entry=0x2b6883a0d924
"dpdk_mbuf_pool_socket0", elt_size=elt_size@entry=2432,
num_elts=num_elts@entry=500000,
pool_priv_size=pool_priv_size@entry=6, cache_size=cache_size@entry=512,
numa=numa@entry=0 '\000', _mp=_mp@entry=0x2b6884266c68,
pri=pri@entry=0x2b6884266c5f "")
at
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/plugins/dpdk/buffer.c:498
#7 0x00002b6b0212a470 in dpdk_buffer_pool_create (vm=vm@entry=0x2b687f6cb260
<vlib_global_main>, num_mbufs=500000, socket_id=0)
at
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/plugins/dpdk/buffer.c:537
#8 0x00002b6b02144861 in dpdk_config (vm=<optimized out>, input=<optimized
out>)
at
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/plugins/dpdk/device/init.c:1288
#9 0x00002b687f4715ad in vlib_call_all_config_functions (vm=<optimized out>,
input=input@entry=0x2b6884266fa0, is_early=is_early@entry=0)
at
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/init.c:146
#10 0x00002b687f479908 in vlib_main (vm=vm@entry=0x2b687f6cb260
<vlib_global_main>, input=input@entry=0x2b6884266fa0)
at
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/main.c:1772
#11 0x00002b687f4b0b23 in thread0 (arg=47727814423136) at
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/unix/main.c:567
#12 0x00002b68807e38b8 in clib_calljmp () at
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vppinfra/longjmp.S:110
#13 0x00007ffe03c4a0e0 in ?? ()
#14 0x00002b687f4b187f in vlib_unix_main (argc=<optimized out>, argv=<optimized
out>) at
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/unix/main.c:631
#15 0x6f2d326c00000028 in ?? ()
#16 0x6c632d7475707475 in ?? ()
#17 0x4c20796669737361 in ?? ()
#18 0x697373616c432032 in ?? ()
#19 0x73706f7244207966 in ?? ()
#20 0x000000006425203a in ?? ()
#21 0x0000000000000000 in ?? ()
VPP version: 18.01
Analysing the backtrace, the iova argument of rte_mempool_populate_iova_tab
is NULL (iova=0x0 in frame #5). dpdk_pool_create passes pr->page_table as
that argument, where pr is declared as:
    vlib_physmem_region_t *pr;
Before rte_mempool_populate_iova_tab is called, vlib_physmem_region_alloc is
called, which internally calls vm->os_physmem_region_alloc.
vm->os_physmem_region_alloc is mapped to unix_physmem_region_alloc, which
sets the page table as follows:
    pr->page_table = clib_mem_vm_get_paddr (pr->mem, pr->log2_page_size,
                                            pr->n_pages);
So it looks like clib_mem_vm_get_paddr is returning NULL for the page table.
Code for the clib_mem_vm_get_paddr function:
    u64 *
    clib_mem_vm_get_paddr (void *mem, int log2_page_size, int n_pages)
    {
      int pagesize = sysconf (_SC_PAGESIZE);
      int fd;
      int i;
      u64 *r = 0;
      if ((fd = open ((char *) "/proc/self/pagemap", O_RDONLY)) == -1)
        return 0;
      for (i = 0; i < n_pages; i++)
        {
          u64 seek, pagemap = 0;
          uword vaddr = pointer_to_uword (mem) + (((u64) i) << log2_page_size);
          seek = ((u64) vaddr / pagesize) * sizeof (u64);
          if (lseek (fd, seek, SEEK_SET) != seek)
            goto done;
          if (read (fd, &pagemap, sizeof (pagemap)) != (sizeof (pagemap)))
            goto done;
          if ((pagemap & (1ULL << 63)) == 0)
            goto done;
          pagemap &= pow2_mask (55);
          vec_add1 (r, pagemap * pagesize);
        }
    done:
      close (fd);
      if (vec_len (r) != n_pages)
        {
          vec_free (r);
          return 0;
        }
      return r;
    }
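From the code, NULL is returned if /proc/self/pagemap cannot be opened, if
lseek or read fails for any page, or if any page's present bit (bit 63) is
clear. Why could one of these happen on a restart after a crash? Any help
would be appreciated.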
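To narrow down which of those exit paths is taken, a minimal standalone sketch
(not VPP code) that mirrors the lookup for a single page and reports the
failing step could look like the following. The names pm_status_t,
pm_decode_entry, pm_lookup and pm_status_str are hypothetical helpers
introduced here for illustration; only the pagemap entry layout (bit 63 =
present, bits 0-54 = page frame number) is taken from the function above.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical status codes: one per early-exit path in the lookup. */
typedef enum
{
  PM_OK = 0,
  PM_OPEN_FAILED,   /* open ("/proc/self/pagemap") returned -1 */
  PM_SEEK_FAILED,   /* lseek did not land on the requested offset */
  PM_SHORT_READ,    /* read returned fewer than 8 bytes */
  PM_NOT_PRESENT    /* bit 63 of the entry is clear */
} pm_status_t;

/* Decode one 64-bit pagemap entry.
   Bit 63 is the "page present" flag; bits 0-54 hold the page frame number. */
pm_status_t
pm_decode_entry (uint64_t entry, uint64_t *pfn)
{
  assert (pfn != NULL);
  if ((entry & (1ULL << 63)) == 0)
    return PM_NOT_PRESENT;
  *pfn = entry & ((1ULL << 55) - 1);
  return PM_OK;
}

/* Look up the pagemap entry for the page containing mem and report
   which step failed, if any. On success *pfn holds the frame number
   (it may read as 0 if the kernel hides PFNs from this process). */
pm_status_t
pm_lookup (const void *mem, uint64_t *pfn)
{
  long pagesize = sysconf (_SC_PAGESIZE);
  uint64_t entry = 0;
  pm_status_t rv;
  off_t seek;
  int fd;

  fd = open ("/proc/self/pagemap", O_RDONLY);
  if (fd == -1)
    return PM_OPEN_FAILED;

  /* One 8-byte entry per virtual page, indexed by virtual page number. */
  seek = (off_t) (((uint64_t) (uintptr_t) mem / (uint64_t) pagesize)
                  * sizeof (uint64_t));

  if (lseek (fd, seek, SEEK_SET) != seek)
    rv = PM_SEEK_FAILED;
  else if (read (fd, &entry, sizeof (entry)) != (ssize_t) sizeof (entry))
    rv = PM_SHORT_READ;
  else
    rv = pm_decode_entry (entry, pfn);

  close (fd);
  return rv;
}

/* Human-readable name for a status code, for logging. */
const char *
pm_status_str (pm_status_t s)
{
  switch (s)
    {
    case PM_OK:          return "ok";
    case PM_OPEN_FAILED: return "open failed";
    case PM_SEEK_FAILED: return "lseek failed";
    case PM_SHORT_READ:  return "short read";
    case PM_NOT_PRESENT: return "page not present";
    }
  return "unknown";
}
```

Calling pm_lookup on each page of the mapped region and logging
pm_status_str of the result (plus errno for the open, lseek and read cases)
should show whether the failure is a permission or I/O problem on
/proc/self/pagemap, or a page in the region that is not resident at the time
of the lookup.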
Thanks
Alok