Re: [vpp-dev] Regarding page table address NULL in dpdk_pool_create

2018-11-26 Thread Alok Makhariya
[...] unix_physmem_region_alloc:231: physmem page for region 'dpdk_mbuf_pool_socket0' allocated on the wrong numa node (requested 0 actual 4294967294)
Nov 26 07:57:18 fe01-oam vnet[13134]: received signal SIGSEGV, PC 0x2b30b3bdfdf8, faulting address 0x0
Nov 26 07:57:25 fe01-oam vnet[13466]: clib_sysfs_prealloc_hugepages:239: pre-allocating 212 additional 2048K hugepages on numa node 0
Nov 26 07:57:26 fe01-oam vnet[13466]: unix_physmem_region_alloc:231: physmem page for region 'dpdk_mbuf_pool_socket0' allocated on the wrong numa node (requested 0 actual 4294967294)
Nov 26 07:57:26 fe01-oam vnet[13466]: received signal SIGSEGV, PC 0x2b097d347df8, faulting address 0x0
Nov 26 07:57:33 fe01-oam vnet[13978]: clib_sysfs_prealloc_hugepages:239: pre-allocating 212 additional 2048K hugepages on numa node 0
Nov 26 07:57:33 fe01-oam vnet[13978]: unix_physmem_region_alloc:231: physmem page for region 'dpdk_mbuf_pool_socket0' allocated on the wrong numa node (requested 0 actual 4294967294)
Nov 26 07:57:33 fe01-oam vnet[13978]: received signal SIGSEGV, PC 0x2b585d42ddf8, faulting address 0x0
Nov 26 07:57:52 fe01-oam vnet[14464]: clib_sysfs_prealloc_hugepages:239: pre-allocating 212 additional 2048K hugepages on numa node 0
Nov 26 07:57:55 fe01-oam vnet[14464]: clib_sysfs_prealloc_hugepages:239: pre-allocating 212 additional 2048K hugepages on numa node 1
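
As an aside: 4294967294 is (u32) -2, i.e. -ENOENT. A minimal sketch of where such a -2 can come from, assuming the numa node is queried with something like move_pages(2); this is an illustration only, not necessarily the exact call VPP 18.01 makes. move_pages reports a status of -ENOENT for a page that is not present:

#include <stdio.h>
#include <sys/mman.h>
#include <numaif.h>		/* move_pages(); link with -lnuma */

int
main (void)
{
  void *pages[1];
  int status[1] = { 99 };

  /* anonymous mapping, never touched, so no physical page behind it yet */
  pages[0] = mmap (0, 4096, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

  /* nodes == NULL: do not migrate anything, just report per-page status */
  if (pages[0] != MAP_FAILED && move_pages (0, 1, pages, 0, status, 0) == 0)
    /* prints -2 (-ENOENT, page not present), i.e. 4294967294 as a u32 */
    printf ("status[0] = %d\n", status[0]);

  return 0;
}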

I see that the file src/plugins/dpdk/device/init.c does the following:
/* lazy umount hugepages */
  umount2 ((char *) huge_dir_path, MNT_DETACH);
  rmdir ((char *) huge_dir_path);
  vec_free (huge_dir_path);

Could this be a problem in some scenarios?
Is there a way to do a non-lazy ('unlazy') unmount here, so that it is guaranteed the unmount has completed before moving forward? Something along the lines of the sketch below, perhaps.
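
A hedged, untested sketch; umount_sync is a hypothetical helper, not existing VPP code. umount2 without MNT_DETACH fails with EBUSY while the mount is still in use, so it can simply be retried until it actually completes:

#include <sys/mount.h>
#include <errno.h>
#include <unistd.h>

static int
umount_sync (const char *path, int retries)
{
  while (retries--)
    {
      if (umount2 (path, 0) == 0)
	return 0;			/* unmounted for real */
      if (errno == EINVAL || errno == ENOENT)
	return 0;			/* not a mount point / already gone */
      if (errno != EBUSY)
	return -1;			/* some other error */
      usleep (100 * 1000);		/* still busy: wait and retry */
    }
  return -1;
}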

Regards
Alok

-----Original Message-----
From: Damjan Marion  
Sent: 23 November 2018 15:03
To: Alok Makhariya 
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Regarding page table address NULL in dpdk_pool_create

> 
> On 22 Nov 2018, at 10:55, Alok Makhariya  wrote:
> 
> [problem description and backtrace trimmed; quoted in full in the messages below]

Re: [vpp-dev] Regarding page table address NULL in dpdk_pool_create

2018-11-23 Thread Alok Makhariya
Thanks Damjan. 

Regards
Alok

-----Original Message-----
From: Damjan Marion  
Sent: 23 November 2018 15:03
To: Alok Makhariya 
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Regarding page table address NULL in dpdk_pool_create

> 
> On 22 Nov 2018, at 10:55, Alok Makhariya  wrote:
> 
> Hi
>  
> [problem description and backtrace trimmed; quoted in full in the original message below]
> 
> VPP version: 18.01

Please move to a recent version of VPP; 18.01 is not supported anymore.


> 
> [analysis of clib_mem_vm_get_paddr trimmed; quoted in full in the original message below]

[vpp-dev] Regarding page table address NULL in dpdk_pool_create

2018-11-22 Thread Alok Makhariya
Hi

I have a situation where, in some scenarios, when VPP is restarted after a crash, 
the newly starting VPP itself crashes with the following backtrace.
This does not always happen. Any hints on what could cause this would be 
appreciated.

Backtrace

(gdb) bt
#0  0x2b688139c207 in raise () from /lib64/libc.so.6
#1  0x2b688139d8f8 in abort () from /lib64/libc.so.6
#2  0x00405f7e in os_exit (code=code@entry=1) at 
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vpp/vnet/main.c:292
#3  0x2b687f4b09d7 in unix_signal_handler (signum=<optimized out>, 
si=<optimized out>, uc=<optimized out>)
at 
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/unix/main.c:120
#4  <signal handler called>
#5  rte_mempool_populate_iova_tab (mp=mp@entry=0x1098fbec0, 
vaddr=0x2ac0 "", iova=0x0, pg_num=<optimized out>, pg_shift=<optimized out>, free_cb=free_cb@entry=0x0, opaque=opaque@entry=0x0)
at 
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-root/build-vpp-native/dpdk/dpdk-stable-17.11.4/lib/librte_mempool/rte_mempool.c:486
#6  0x2b6b0212a373 in dpdk_pool_create (vm=vm@entry=0x2b687f6cb260 
, pool_name=pool_name@entry=0x2b6883a0d924 
"dpdk_mbuf_pool_socket0", elt_size=elt_size@entry=2432, 
num_elts=num_elts@entry=50,
pool_priv_size=pool_priv_size@entry=6, cache_size=cache_size@entry=512, 
numa=numa@entry=0 '\000', _mp=_mp@entry=0x2b6884266c68, 
pri=pri@entry=0x2b6884266c5f "")
at 
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/plugins/dpdk/buffer.c:498
#7  0x2b6b0212a470 in dpdk_buffer_pool_create (vm=vm@entry=0x2b687f6cb260 
, num_mbufs=50, socket_id=0)
at 
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/plugins/dpdk/buffer.c:537
#8  0x2b6b02144861 in dpdk_config (vm=<optimized out>, input=<optimized out>)
at 
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/plugins/dpdk/device/init.c:1288
#9  0x2b687f4715ad in vlib_call_all_config_functions (vm=<optimized out>, 
input=input@entry=0x2b6884266fa0, is_early=is_early@entry=0)
at 
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/init.c:146
#10 0x2b687f479908 in vlib_main (vm=vm@entry=0x2b687f6cb260 
, input=input@entry=0x2b6884266fa0)
at 
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/main.c:1772
#11 0x2b687f4b0b23 in thread0 (arg=47727814423136) at 
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/unix/main.c:567
#12 0x2b68807e38b8 in clib_calljmp () at 
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vppinfra/longjmp.S:110
#13 0x7ffe03c4a0e0 in ?? ()
#14 0x2b687f4b187f in vlib_unix_main (argc=<optimized out>, argv=<optimized out>) at 
/bfs-build/build-area.49/builds/LinuxNBngp_7.X_RH7/2018-11-16-0505/third-party/vpp/vpp_1801/build-data/../src/vlib/unix/main.c:631
#15 0x6f2d326c0028 in ?? ()
#16 0x6c632d7475707475 in ?? ()
#17 0x4c20796669737361 in ?? ()
#18 0x697373616c432032 in ?? ()
#19 0x73706f7244207966 in ?? ()
#20 0x6425203a in ?? ()
#21 0x in ?? ()
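
Incidentally, the "addresses" in frames #15-#21 do not look like code addresses at all; read as little-endian bytes they spell out ASCII text, which suggests gdb walked off the real stack into string data. A quick check (on little-endian x86):

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int
main (void)
{
  /* the values from frames #16-#20 above */
  uint64_t vals[] = { 0x6c632d7475707475, 0x4c20796669737361,
		      0x697373616c432032, 0x73706f7244207966,
		      0x6425203a };
  char buf[sizeof (vals) + 1];

  memcpy (buf, vals, sizeof (vals));	/* reinterpret as raw bytes */
  buf[sizeof (vals)] = 0;
  puts (buf);	/* prints: utput-classify L2 Classify Drops: %d */
  return 0;
}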


VPP version: 18.01

Analysing the backtrace, the iova argument passed to rte_mempool_populate_iova_tab 
is NULL; dpdk_pool_create passes pr->page_table as the iova address.
pr : vlib_physmem_region_t *pr;
Before rte_mempool_populate_iova_tab is called, vlib_physmem_region_alloc is 
called, which internally calls vm->os_physmem_region_alloc.

vm->os_physmem_region_alloc is mapped to unix_physmem_region_alloc, which sets:
pr->page_table = clib_mem_vm_get_paddr (pr->mem, pr->log2_page_size, 
pr->n_pages);
It looks like clib_mem_vm_get_paddr is returning NULL for the page table address.

Code for the clib_mem_vm_get_paddr function:
u64 *
clib_mem_vm_get_paddr (void *mem, int log2_page_size, int n_pages)
{
  int pagesize = sysconf (_SC_PAGESIZE);
  int fd;
  int i;
  u64 *r = 0;

  if ((fd = open ((char *) "/proc/self/pagemap", O_RDONLY)) == -1)
    return 0;

  for (i = 0; i < n_pages; i++)
    {
      u64 seek, pagemap = 0;
      uword vaddr = pointer_to_uword (mem) + (((u64) i) << log2_page_size);
      /* one u64 pagemap entry per virtual page */
      seek = ((u64) vaddr / pagesize) * sizeof (u64);
      if (lseek (fd, seek, SEEK_SET) != seek)
	goto done;

      if (read (fd, &pagemap, sizeof (pagemap)) != (sizeof (pagemap)))
	goto done;

      /* bit 63 = page present; clear for pages never faulted in */
      if ((pagemap & (1ULL << 63)) == 0)
	goto done;

      /* bits 0-54 = page frame number */
      pagemap &= pow2_mask (55);
      vec_add1 (r, pagemap * pagesize);
    }

done:
  close (fd);
  if (vec_len (r) != n_pages)
    {
      vec_free (r);
      return 0;
    }
  return r;
}

So why could this be returning NULL? Any help would be appreciated.
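
For reference, a minimal standalone program (an illustration only, not VPP code) that reads the pagemap entry for one page the same way. If the page has never been faulted in, bit 63 (present) is clear, which is exactly the case in which clib_mem_vm_get_paddr gives up and returns 0:

#include <fcntl.h>
#include <inttypes.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int
main (void)
{
  long pagesize = sysconf (_SC_PAGESIZE);
  char *mem = mmap (0, pagesize, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  uint64_t entry = 0;
  int fd = open ("/proc/self/pagemap", O_RDONLY);

  if (fd < 0 || mem == MAP_FAILED)
    return 1;

  mem[0] = 1;	/* comment this out and 'present' below becomes 0 */

  /* one u64 entry per virtual page */
  off_t off = ((uintptr_t) mem / pagesize) * sizeof (uint64_t);
  if (pread (fd, &entry, sizeof (entry), off) != sizeof (entry))
    return 1;

  /* bit 63 = present; bits 0-54 = page frame number (reads as 0 for
     unprivileged processes on newer kernels) */
  printf ("present=%d pfn=0x%" PRIx64 "\n",
	  (int) (entry >> 63), entry & ((1ULL << 55) - 1));
  return 0;
}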