Hi All,
We are observing the following crash in VPP while running in Azure with
MLX5:
Program received signal SIGABRT, Aborted.
0x00007f8c2387c387 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x00007f8c2387c387 in raise () from /lib64/libc.so.6
#1 0x00007f8c2387da78 in abort () from /lib64/libc.so.6
#2 0x000000000040755a in os_panic () at /home/vpp/src/vpp/vnet/main.c:355
#3 0x00007f8c2461eb39 in debugger () at /home/vpp/src/vppinfra/error.c:84
#4 0x00007f8c2461ef08 in _clib_error (how_to_die=2, function_name=0x0,
line_number=0, fmt=0x7f8bdffa7df8 "%s:%d (%s) assertion `%s' fails")
at /home/vpp/src/vppinfra/error.c:143
#5 0x00007f8bdfd38823 in vlib_get_buffer_index (vm=0x7f8c25479d80
<vlib_global_main>, p=0x17f19ed00)
at /home/vpp/src/vlib/buffer_funcs.h:261
#6 0x00007f8bdfd38b87 in vlib_get_buffer_indices_with_offset
(vm=0x7f8c25479d80 <vlib_global_main>, b=0x7f8be48c5fc8, bi=0x7f8be4a1c514,
count=2, offset=128) at /home/vpp/src/vlib/buffer_funcs.h:322
#7 0x00007f8bdfd3ae2d in dpdk_device_input (vm=0x7f8c25479d80
<vlib_global_main>, dm=0x7f8be06968e0 <dpdk_main>, xd=0x7f8be494e240,
node=0x7f8be3afae80, thread_index=0, queue_id=0) at
/home/vpp/src/plugins/dpdk/device/node.c:371
#8 0x00007f8bdfd3b362 in dpdk_input_node_fn_avx2 (vm=0x7f8c25479d80
<vlib_global_main>, node=0x7f8be3afae80, f=0x0)
at /home/vpp/src/plugins/dpdk/device/node.c:469
#9 0x00007f8c251d6c42 in dispatch_node (vm=0x7f8c25479d80
<vlib_global_main>, node=0x7f8be3afae80, type=VLIB_NODE_TYPE_INPUT,
dispatch_state=VLIB_NODE_STATE_POLLING, frame=0x0,
last_time_stamp=265582086742909)
at /home/vpp/src/vlib/main.c:1209
#10 0x00007f8c251d8d94 in vlib_main_or_worker_loop (vm=0x7f8c25479d80
<vlib_global_main>, is_main=1)
at /home/vpp/src/vlib/main.c:1781
#11 0x00007f8c251d98a9 in vlib_main_loop (vm=0x7f8c25479d80
<vlib_global_main>)
at /home/vpp/src/vlib/main.c:1930
#12 0x00007f8c251da571 in vlib_main (vm=0x7f8c25479d80 <vlib_global_main>,
input=0x7f8be2695fb0)
at /home/vpp/src/vlib/main.c:2147
#13 0x00007f8c25240533 in thread0 (arg=140239897599360) at
/home/vpp/src/vlib/unix/main.c:640
#14 0x00007f8c2463f9b0 in clib_calljmp ()
from
/home/vpp/build-root/install-vpp_debug-native/vpp/lib/libvppinfra.so.19.08.1
#15 0x00007ffc224c65e0 in ?? ()
#16 0x00007f8c25240aa9 in vlib_unix_main (argc=67, argv=0x855030)
at /home/vpp/src/vlib/unix/main.c:710
#17 0x0000000000406ece in main (argc=67, argv=0x855030) at
/home/vpp/src/vpp/vnet/main.c:280
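For context, the assertion in frame #5 appears to be the sanity check made when a buffer pointer is converted into a buffer index. Below is a minimal standalone sketch of that kind of check as we understand it. It is not the actual VPP source: the type and function names are hypothetical, the region start is simply the address from the net_mlx5 log message further down (an assumption), and the 1 GiB region size is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 64-byte cache lines: buffer indices count 64-byte units. */
#define LOG2_CACHE_LINE_BYTES 6

/* Hypothetical description of the memory region that holds all buffers. */
typedef struct
{
  uint64_t buffer_mem_start;	/* first valid address of the region */
  uint64_t buffer_mem_size;	/* region size in bytes */
} buffer_region_t;

/* Return 1 if p lies inside the region and is cache-line aligned
   relative to its start, 0 otherwise. */
static int
buffer_pointer_is_valid (const buffer_region_t * r, uint64_t p)
{
  uint64_t offset;

  if (p < r->buffer_mem_start)
    return 0;
  offset = p - r->buffer_mem_start;
  if (offset >= r->buffer_mem_size)
    return 0;
  if (offset % (1u << LOG2_CACHE_LINE_BYTES))
    return 0;
  return 1;
}

/* Convert a buffer pointer to an index; aborts on an invalid pointer,
   which is the behaviour a debug build shows in the backtrace. */
static uint32_t
buffer_index_from_pointer (const buffer_region_t * r, uint64_t p)
{
  assert (buffer_pointer_is_valid (r, p));
  return (uint32_t) ((p - r->buffer_mem_start) >> LOG2_CACHE_LINE_BYTES);
}
```

Under these assumed values, the pointer from frame #5 (p=0x17f19ed00) is below the region start, so buffer_pointer_is_valid() returns 0 and the conversion would abort. If that matches the real layout, it would suggest the mbuf handed back by the driver is not located in VPP's buffer memory.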
*Details:*
VPP Version: 19.08
DPDK version: 19.05
Linux Version: CentOS Linux release 7.8.2003 (Core)
Kernel version: 3.10.0-1127.19.1.el7.x86_64
Steps followed: https://fd.io/docs/vpp/master/usecases/vppinazure.html
The interfaces come up properly and a normal ping works fine, but when we
try to send more traffic we run into this crash.
*More Info:*
- 1G hugepages are configured
- We can see the following errors in the VPP show logging output (the lspci
and interface output come first, for reference)
[root@exe91-fpm vpp]# lspci
0000:00:00.0 Host bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (AGP disabled) (rev 03)
0000:00:07.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 01)
0000:00:07.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)
0000:00:07.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 02)
0000:00:08.0 VGA compatible controller: Microsoft Corporation Hyper-V virtual VGA
042d:00:02.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] (rev 80)
1d94:00:02.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx Virtual Function] (rev 80)
int
Name                    Idx  State  MTU (L3/IP4/IP6/MPLS)  Counter     Count
FiftyGigabitEthernet2   1    up     2000/0/0/0             rx packets  1
                                                           rx bytes    42
                                                           tx packets  1
                                                           tx bytes    42
                                                           drops       1
FiftyGigabitEthernet3   2    up     9000/0/0/0             rx packets  9
                                                           rx bytes    890
                                                           drops       9
                                                           ip4         2
                                                           ip6         7
DBGvpp# sh logging
2021/01/04 10:03:55:234 err perfmon No table for cpuid 830f10
2021/01/04 10:03:55:234 err perfmon model 31, stepping 0
2021/01/04 10:03:55:276 warn dpdk EAL init args: -c 2 -n 4 --in-memory --vdev net_vdev_netvsc0,iface=eth1 --vdev net_vdev_netvsc1,iface=eth2 --file-prefix vpp -w 042d:00:02.0 -w 1d94:00:02.0 --master-lcore 1
2021/01/04 10:03:55:660 notice dpdk DPDK drivers found 2 ports...
2021/01/04 10:03:55:679 notice dpdk EAL: Detected 8 lcore(s)
2021/01/04 10:03:55:679 notice dpdk EAL: Detected 1 NUMA nodes
2021/01/04 10:03:55:679 notice dpdk EAL: Probing VFIO support...
2021/01/04 10:03:55:679 notice dpdk EAL: WARNING! Base virtual address hint (0xa80001000 != 0x7f1c00000000) not respected!
2021/01/04 10:03:55:679 notice dpdk EAL: This may cause issues with mapping memory into secondary processes
2021/01/04 10:03:55:679 notice dpdk EAL: WARNING! Base virtual address hint (0xc00002000 != 0x7f13c0000000) not respected!
2021/01/04 10:03:55:679 notice dpdk EAL: This may cause issues with mapping memory into secondary processes
2021/01/04 10:03:55:679 notice dpdk EAL: WARNING! Base virtual address hint (0xd80003000 != 0x7f0b80000000) not respected!
2021/01/04 10:03:55:679 notice dpdk EAL: This may cause issues with mapping memory into secondary processes
2021/01/04 10:03:55:679 notice dpdk EAL: Using memfd is not supported, falling back to anonymous hugepages
2021/01/04 10:03:55:679 notice dpdk EAL: WARNING: cpu flags constant_tsc=no nonstop_tsc=no -> using unreliable clock cycles !
2021/01/04 10:03:55:679 notice dpdk EAL: PCI device 042d:00:02.0 on NUMA socket 0
2021/01/04 10:03:55:679 notice dpdk EAL: probe driver: 15b3:1016 net_mlx5
2021/01/04 10:03:55:679 notice dpdk EAL: PCI device 1d94:00:02.0 on NUMA socket 0
2021/01/04 10:03:55:679 notice dpdk EAL: probe driver: 15b3:1016 net_mlx5
2021/01/04 10:03:55:679 notice dpdk vmbus_probe_one_driver(): Invalid NUMA socket, default to 0
2021/01/04 10:03:55:679 notice dpdk vmbus_probe_one_driver(): Invalid NUMA socket, default to 0
2021/01/04 10:03:55:679 notice dpdk net_vdev_netvsc: Cannot find the specified netvsc device
2021/01/04 10:03:55:679 notice dpdk net_vdev_netvsc: Cannot find the specified netvsc device
2021/01/04 10:03:55:679 notice dpdk EAL: VFIO support not initialized
2021/01/04 10:03:55:679 notice dpdk EAL: Couldn't map new region for DMA
2021/01/04 10:03:59:435 notice dpdk net_mlx5: port 0 unable to find virtually contiguous chunk for address (0x1000000000). rte_memseg_contig_walk() failed.
DBGvpp# sh buffers
Pool Name       Index  NUMA  Size  Data Size  Total   Avail   Cached  Used
default-numa-0  0      0     2496  2048       430185  429641  75      469
Any help with this would be greatly appreciated. We are ready to provide
more information if required.
Thanks
Himanshu