Re: [vpp-dev] Regarding page table address NULL in dpdk_pool_create
' allocated on the wrong numa node (requested 0 actual 4294967294)
Nov 26 07:57:18 fe01-oam vnet[13134]: received signal SIGSEGV, PC 0x2b30b3bdfdf8, faulting address 0x0
Nov 26 07:57:25 fe01-oam vnet[13466]: clib_sysfs_prealloc_hugepages:239: pre-allocating 212 additional 2048K hugepages on numa node 0
Nov 26 07:57:26 fe01-oam vnet[13466]: unix_physmem_region_alloc:231: physmem page for region 'dpdk_mbuf_pool_socket0' allocated on the wrong numa node (requested 0 actual 4294967294)
Nov 26 07:57:26 fe01-oam vnet[13466]: received signal SIGSEGV, PC 0x2b097d347df8, faulting address 0x0
Nov 26 07:57:33 fe01-oam vnet[13978]: clib_sysfs_prealloc_hugepages:239: pre-allocating 212 additional 2048K hugepages on numa node 0
Nov 26 07:57:33 fe01-oam vnet[13978]: unix_physmem_region_alloc:231: physmem page for region 'dpdk_mbuf_pool_socket0' allocated on the wrong numa node (requested 0 actual 4294967294)
Nov 26 07:57:33 fe01-oam vnet[13978]: received signal SIGSEGV, PC 0x2b585d42ddf8, faulting address 0x0
Nov 26 07:57:52 fe01-oam vnet[14464]: clib_sysfs_prealloc_hugepages:239: pre-allocating 212 additional 2048K hugepages on numa node 0
Nov 26 07:57:55 fe01-oam vnet[14464]: clib_sysfs_prealloc_hugepages:239: pre-allocating 212 additional 2048K hugepages on numa node 1

I see that the file src/plugins/dpdk/device/init.c does the following:

  /* lazy umount hugepages */
  umount2 ((char *) huge_dir_path, MNT_DETACH);
  rmdir ((char *) huge_dir_path);
  vec_free (huge_dir_path);

Could this be a problem in some scenarios? Is there a way to do an 'unlazy' (synchronous) unmount here, so that it is guaranteed before moving forward that the unmount has actually completed?
Regards,
Alok

-----Original Message-----
From: Damjan Marion
Sent: 23 November 2018 15:03
To: Alok Makhariya
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Regarding page table address NULL in dpdk_pool_create

>
> On 22 Nov 2018, at 10:55, Alok Makhariya wrote:
>
> Hi
>
> I have a situation where in some scenarios when VPP is restarted after a
> crash, the VPP which is coming up itself crashes with the following backtrace.
> This does not happen always. Any hints on what could cause this would be
> appreciated.
>
> Backtrace
>
> (gdb) bt
> #0  0x2b688139c207 in raise () from /lib64/libc.so.6
> #1  0x2b688139d8f8 in abort () from /lib64/libc.so.6
> #2  0x00405f7e in os_exit (code=code@entry=1)
>     at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vpp/vnet/main.c:292
> #3  0x2b687f4b09d7 in unix_signal_handler (signum=<optimized out>, si=<optimized out>, uc=<optimized out>)
>     at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/unix/main.c:120
> #4  <signal handler called>
> #5  rte_mempool_populate_iova_tab (mp=mp@entry=0x1098fbec0, vaddr=0x2ac0 "", iova=0x0, pg_num=<optimized out>, pg_shift=<optimized out>, free_cb=free_cb@entry=0x0, opaque=opaque@entry=0x0)
>     at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-root/build-vpp-native/dpdk/dpdk-stable-17.11.4/lib/librte_mempool/rte_mempool.c:486
> #6  0x2b6b0212a373 in dpdk_pool_create (vm=vm@entry=0x2b687f6cb260, pool_name=pool_name@entry=0x2b6883a0d924 "dpdk_mbuf_pool_socket0", elt_size=elt_size@entry=2432, num_elts=num_elts@entry=50, pool_priv_size=pool_priv_size@entry=6, cache_size=cache_size@entry=512, numa=numa@entry=0 '\000', _mp=_mp@entry=0x2b6884266c68, pri=pri@entry=0x2b6884266c5f "")
>     at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/plugins/dpdk/buffer.c:498
> #7  0x2b6b0212a470 in dpdk_buffer_pool_create (vm=vm@entry=0x2b687f6cb260, num_mbufs=50, socket_id=0)
>     at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/plugins/dpdk/buffer.c:537
> #8  0x2b6b02144861 in dpdk_config (vm=<optimized out>, input=<optimized out>)
>     at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/plugins/dpdk/device/init.c:1288
> #9  0x2b687f4715ad in vlib_call_all_config_functions (vm=<optimized out>, input=input@entry=0x2b6884266fa0, is_early=is_early@entry=0)
>     at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/init.c:146
> #10 0x2b687f479908 in vlib_main (vm=vm@entry=0x2b687f6cb260, input=input@entry=0x2b6884266fa0)
>     at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/main.c:1772
> #11 0x2b687f4b0b23 in thread0 (arg=47727814423136) at /bfs-build/build-area.4
Re: [vpp-dev] Regarding page table address NULL in dpdk_pool_create
Thanks Damjan.

Regards,
Alok

-----Original Message-----
From: Damjan Marion
Sent: 23 November 2018 15:03
To: Alok Makhariya
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Regarding page table address NULL in dpdk_pool_create

>
> On 22 Nov 2018, at 10:55, Alok Makhariya wrote:
>
> Hi
>
> I have a situation where in some scenarios when VPP is restarted after a
> crash, the VPP which is coming up itself crashes with the following backtrace.
> This does not happen always. Any hints on what could cause this would be
> appreciated.
>
> Backtrace
>
> (gdb) bt
> #0  0x2b688139c207 in raise () from /lib64/libc.so.6
> #1  0x2b688139d8f8 in abort () from /lib64/libc.so.6
> #2  0x00405f7e in os_exit (code=code@entry=1)
>     at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vpp/vnet/main.c:292
> #3  0x2b687f4b09d7 in unix_signal_handler (signum=<optimized out>, si=<optimized out>, uc=<optimized out>)
>     at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/unix/main.c:120
> #4  <signal handler called>
> #5  rte_mempool_populate_iova_tab (mp=mp@entry=0x1098fbec0, vaddr=0x2ac0 "", iova=0x0, pg_num=<optimized out>, pg_shift=<optimized out>, free_cb=free_cb@entry=0x0, opaque=opaque@entry=0x0)
>     at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-root/build-vpp-native/dpdk/dpdk-stable-17.11.4/lib/librte_mempool/rte_mempool.c:486
> #6  0x2b6b0212a373 in dpdk_pool_create (vm=vm@entry=0x2b687f6cb260, pool_name=pool_name@entry=0x2b6883a0d924 "dpdk_mbuf_pool_socket0", elt_size=elt_size@entry=2432, num_elts=num_elts@entry=50, pool_priv_size=pool_priv_size@entry=6, cache_size=cache_size@entry=512, numa=numa@entry=0 '\000', _mp=_mp@entry=0x2b6884266c68, pri=pri@entry=0x2b6884266c5f "")
>     at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/plugins/dpdk/buffer.c:498
> #7  0x2b6b0212a470 in dpdk_buffer_pool_create (vm=vm@entry=0x2b687f6cb260, num_mbufs=50, socket_id=0)
>     at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/plugins/dpdk/buffer.c:537
> #8  0x2b6b02144861 in dpdk_config (vm=<optimized out>, input=<optimized out>)
>     at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/plugins/dpdk/device/init.c:1288
> #9  0x2b687f4715ad in vlib_call_all_config_functions (vm=<optimized out>, input=input@entry=0x2b6884266fa0, is_early=is_early@entry=0)
>     at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/init.c:146
> #10 0x2b687f479908 in vlib_main (vm=vm@entry=0x2b687f6cb260, input=input@entry=0x2b6884266fa0)
>     at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/main.c:1772
> #11 0x2b687f4b0b23 in thread0 (arg=47727814423136)
>     at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/unix/main.c:567
> #12 0x2b68807e38b8 in clib_calljmp ()
>     at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vppinfra/longjmp.S:110
> #13 0x7ffe03c4a0e0 in ?? ()
> #14 0x2b687f4b187f in vlib_unix_main (argc=<optimized out>, argv=<optimized out>)
>     at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/unix/main.c:631
> #15 0x6f2d326c0028 in ?? ()
> #16 0x6c632d7475707475 in ?? ()
> #17 0x4c20796669737361 in ?? ()
> #18 0x697373616c432032 in ?? ()
> #19 0x73706f7244207966 in ?? ()
> #20 0x6425203a in ?? ()
> #21 0x in ?? ()
>
> VPP version: 18.01

Please move to a recent version of vpp; 18.01 is not supported anymore.

> After analysing the back-trace, it is observed that in
> rte_mempool_populate_iova_tab the iova address is NULL; it is called by
> dpdk_pool_create, passing pr->page_table as the iova address.
> pr : vlib_physmem_region_t *pr;
> Before calling rte_mempool_populate_iova_tab, vlib_physmem_region_alloc is
> called, which internally calls vm->os_physmem_region_alloc.
> vm->os_physmem_region_alloc is mapped to unix_physmem_region_alloc.
> So in unix_physmem_region_alloc,
>   pr->page_table = clib_mem_vm_get_paddr (pr->mem, pr->log2_page_size, pr->n_pages);
> Looks like clib_mem_vm_get_paddr is returning NULL for the page table address.
>
> Code
[vpp-dev] Regarding page table address NULL in dpdk_pool_create
Hi

I have a situation where in some scenarios when VPP is restarted after a crash, the VPP which is coming up itself crashes with the following backtrace. This does not happen always. Any hints on what could cause this would be appreciated.

Backtrace

(gdb) bt
#0  0x2b688139c207 in raise () from /lib64/libc.so.6
#1  0x2b688139d8f8 in abort () from /lib64/libc.so.6
#2  0x00405f7e in os_exit (code=code@entry=1)
    at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vpp/vnet/main.c:292
#3  0x2b687f4b09d7 in unix_signal_handler (signum=<optimized out>, si=<optimized out>, uc=<optimized out>)
    at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/unix/main.c:120
#4  <signal handler called>
#5  rte_mempool_populate_iova_tab (mp=mp@entry=0x1098fbec0, vaddr=0x2ac0 "", iova=0x0, pg_num=<optimized out>, pg_shift=<optimized out>, free_cb=free_cb@entry=0x0, opaque=opaque@entry=0x0)
    at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-root/build-vpp-native/dpdk/dpdk-stable-17.11.4/lib/librte_mempool/rte_mempool.c:486
#6  0x2b6b0212a373 in dpdk_pool_create (vm=vm@entry=0x2b687f6cb260, pool_name=pool_name@entry=0x2b6883a0d924 "dpdk_mbuf_pool_socket0", elt_size=elt_size@entry=2432, num_elts=num_elts@entry=50, pool_priv_size=pool_priv_size@entry=6, cache_size=cache_size@entry=512, numa=numa@entry=0 '\000', _mp=_mp@entry=0x2b6884266c68, pri=pri@entry=0x2b6884266c5f "")
    at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/plugins/dpdk/buffer.c:498
#7  0x2b6b0212a470 in dpdk_buffer_pool_create (vm=vm@entry=0x2b687f6cb260, num_mbufs=50, socket_id=0)
    at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/plugins/dpdk/buffer.c:537
#8  0x2b6b02144861 in dpdk_config (vm=<optimized out>, input=<optimized out>)
    at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/plugins/dpdk/device/init.c:1288
#9  0x2b687f4715ad in vlib_call_all_config_functions (vm=<optimized out>, input=input@entry=0x2b6884266fa0, is_early=is_early@entry=0)
    at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/init.c:146
#10 0x2b687f479908 in vlib_main (vm=vm@entry=0x2b687f6cb260, input=input@entry=0x2b6884266fa0)
    at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/main.c:1772
#11 0x2b687f4b0b23 in thread0 (arg=47727814423136)
    at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/unix/main.c:567
#12 0x2b68807e38b8 in clib_calljmp ()
    at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vppinfra/longjmp.S:110
#13 0x7ffe03c4a0e0 in ?? ()
#14 0x2b687f4b187f in vlib_unix_main (argc=<optimized out>, argv=<optimized out>)
    at /bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/unix/main.c:631
#15 0x6f2d326c0028 in ?? ()
#16 0x6c632d7475707475 in ?? ()
#17 0x4c20796669737361 in ?? ()
#18 0x697373616c432032 in ?? ()
#19 0x73706f7244207966 in ?? ()
#20 0x6425203a in ?? ()
#21 0x in ?? ()

VPP version: 18.01

After analysing the back-trace, it is observed that in rte_mempool_populate_iova_tab the iova address is NULL; it is called by dpdk_pool_create, passing pr->page_table as the iova address.
pr : vlib_physmem_region_t *pr;
Before calling rte_mempool_populate_iova_tab, vlib_physmem_region_alloc is called, which internally calls vm->os_physmem_region_alloc.
vm->os_physmem_region_alloc is mapped to unix_physmem_region_alloc.
So in unix_physmem_region_alloc,

  pr->page_table = clib_mem_vm_get_paddr (pr->mem, pr->log2_page_size, pr->n_pages);

Looks like clib_mem_vm_get_paddr is returning NULL for the page table address.

Code for the clib_mem_vm_get_paddr function:

u64 *
clib_mem_vm_get_paddr (void *mem, int log2_page_size, int n_pages)
{
  int pagesize = sysconf (_SC_PAGESIZE);
  int fd;
  int i;
  u64 *r = 0;

  if ((fd = open ((char *) "/proc/self/pagemap", O_RDONLY)) == -1)
    return 0;

  for (i = 0; i < n_pages; i++)
    {
      u64 seek, pagemap = 0;
      uword vaddr = pointer_to_uword (mem) + (((u64) i) << log2_page_size);
      seek = ((u64) vaddr / pagesize) * sizeof (u64);
      if (lseek (fd, seek, SEEK_SET) != seek)
        goto done;

      if (read (fd, &pagemap, sizeof (pagemap)) != sizeof (pagemap))
        goto done;

      if ((pagemap & (1ULL << 63)) == 0)
        goto done;

      pagemap &= pow2_mask (55);
      vec_add1 (r, pagemap * pagesize);
    }

done:
  close (fd);
  if (vec_len (r) != n_pages)
    {
      vec_free (r);
      return 0;
    }
  return r;
}

So why could this be returning NULL? Any help would be appreciated.