Hi Pim and Andrew,
Thanks for the help! Turns out it was the stats memory that I had left
out. After increasing that to 128M I was able to import a full v4 and
v6 table no problem. As an aside, is the netlink plugin scheduled for
an upcoming release or is the interface still experimental?
Many thanks,
Nate
On Thu, May 27, 2021 at 11:36 am, Pim van Pelt <p...@ipng.nl> wrote:
Hoi Nate,
further to what Andrew suggested, there are a few more hints I can
offer:
1) Make sure there is enough netlink socket buffer by adding this to
your sysctl set:
cat << EOF > /etc/sysctl.d/81-VPP-netlink.conf
# Increase netlink to 64M
net.core.rmem_default=67108864
net.core.wmem_default=67108864
net.core.rmem_max=67108864
net.core.wmem_max=67108864
EOF
sysctl -p /etc/sysctl.d/81-VPP-netlink.conf
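For reference, 67108864 is exactly 64 MiB; a quick sanity check of the arithmetic (the commented `sysctl` query is illustrative and assumes the settings above are already applied):

```shell
# 64 MiB in bytes -- the value used for all four net.core.* settings above
bytes=$((64 * 1024 * 1024))
echo "$bytes"   # prints 67108864

# Confirm the running kernel picked the values up (illustrative):
# sysctl net.core.rmem_max net.core.wmem_max
```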
2) Ensure there is enough memory by adding this to VPP's startup
config:
memory {
  main-heap-size 2G
  main-heap-page-size default-hugepage
}
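As a sanity check on the sizes (my arithmetic, not stated in the thread): a 2G main heap backed by default 2 MiB hugepages needs 1024 pages, which lines up with the vm.nr_hugepages = 1024 visible in Nate's sysctl output:

```shell
# 2 GiB heap / 2 MiB per hugepage = 1024 hugepages
pages=$(( (2 * 1024 * 1024 * 1024) / (2 * 1024 * 1024) ))
echo "$pages"   # prints 1024
```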
3) Many prefixes (like a full BGP routing table) will need more stats
memory, so increase that too in VPP's startup config:
statseg {
  size 128M
}
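If it helps to see whether the new size is actually enough while the table loads, the VPP CLI can report stats-segment usage (a sketch; exact output and option names can vary by release):

```shell
# Watch stats-segment headroom while BGP converges
# (assumes vppctl is on PATH and a VPP 21.x-era CLI)
vppctl show memory stats-segment
```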
And in case you missed it, make sure to create the linux-cp devices
in a separate namespace by adding this to the startup config:
linux-cp {
  default netns dataplane
}
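One hedged note from my side (not stated in the thread): depending on how the host is provisioned, the `dataplane` namespace may need to exist before VPP starts; a minimal sketch, run as root:

```shell
# Create the 'dataplane' namespace if it is missing, then look for
# the TAP mirrors linux-cp creates inside it (names are host-specific)
ip netns list | grep -qw dataplane || ip netns add dataplane
ip netns exec dataplane ip link
```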
Then you should be able to consume the IPv4 and IPv6 DFZ in your
router. I tested extensively with FRR and Bird2, and so far had good
success.
groet,
Pim
On Thu, May 27, 2021 at 10:02 AM Andrew Yourtchenko
<ayour...@gmail.com> wrote:
I would guess from your traceback that you are running out of memory, so
increasing the main heap size to something like 4x could help…
--a
On 27 May 2021, at 08:29, Nate Sales <n...@natesales.net> wrote:
Hello,
I'm having some trouble with the linux-cp netlink plugin. After
building it from the patch set
(<https://gerrit.fd.io/r/c/vpp/+/31122>), it does correctly receive
netlink messages and insert routes from the Linux kernel table into
the VPP FIB. When loading a large number of routes, however (a full
IPv4 table), VPP crashes after about 400k routes.
It appears to be receiving a SIGABRT that terminates the VPP
process:
May 27 06:10:33 pdx1rtr1 vnet[2232]: received signal SIGABRT, PC
0x7fe9b99bdce1
May 27 06:10:33 pdx1rtr1 vnet[2232]: #0 0x00007fe9b9de1a7b
0x7fe9b9de1a7b
May 27 06:10:33 pdx1rtr1 vnet[2232]: #1 0x00007fe9b9d13140
0x7fe9b9d13140
May 27 06:10:33 pdx1rtr1 vnet[2232]: #2 0x00007fe9b99bdce1 gsignal
+ 0x141
May 27 06:10:33 pdx1rtr1 vnet[2232]: #3 0x00007fe9b99a7537 abort +
0x123
May 27 06:10:33 pdx1rtr1 vnet[2232]: #4 0x000055d43480a1f3
0x55d43480a1f3
May 27 06:10:33 pdx1rtr1 vnet[2232]: #5 0x00007fe9b9c9c8d5
vec_resize_allocate_memory + 0x285
May 27 06:10:33 pdx1rtr1 vnet[2232]: #6 0x00007fe9b9d71feb
vlib_validate_combined_counter + 0xdb
May 27 06:10:33 pdx1rtr1 vnet[2232]: #7 0x00007fe9ba4f1e55
load_balance_create + 0x205
May 27 06:10:33 pdx1rtr1 vnet[2232]: #8 0x00007fe9ba4c639d
fib_entry_src_mk_lb + 0x38d
May 27 06:10:33 pdx1rtr1 vnet[2232]: #9 0x00007fe9ba4c64a4
fib_entry_src_action_install + 0x44
May 27 06:10:33 pdx1rtr1 vnet[2232]: #10 0x00007fe9ba4c681b
fib_entry_src_action_activate + 0x17b
May 27 06:10:33 pdx1rtr1 vnet[2232]: #11 0x00007fe9ba4c3780
fib_entry_create + 0x70
May 27 06:10:33 pdx1rtr1 vnet[2232]: #12 0x00007fe9ba4b9afc
fib_table_entry_update + 0x29c
May 27 06:10:33 pdx1rtr1 vnet[2232]: #13 0x00007fe935fcedce
0x7fe935fcedce
May 27 06:10:33 pdx1rtr1 vnet[2232]: #14 0x00007fe935fd2ab5
0x7fe935fd2ab5
May 27 06:10:33 pdx1rtr1 systemd[1]: vpp.service: Main process
exited, code=killed, status=6/ABRT
May 27 06:10:33 pdx1rtr1 systemd[1]: vpp.service: Failed with
result 'signal'.
May 27 06:10:33 pdx1rtr1 systemd[1]: vpp.service: Consumed 12.505s
CPU time.
May 27 06:10:34 pdx1rtr1 systemd[1]: vpp.service: Scheduled restart
job, restart counter is at 2.
May 27 06:10:34 pdx1rtr1 systemd[1]: Stopped vector packet
processing engine.
May 27 06:10:34 pdx1rtr1 systemd[1]: vpp.service: Consumed 12.505s
CPU time.
May 27 06:10:34 pdx1rtr1 systemd[1]: Starting vector packet
processing engine...
May 27 06:10:34 pdx1rtr1 systemd[1]: Started vector packet
processing engine.
Here's what I'm working with:
root@pdx1rtr1:~# uname -a
Linux pdx1rtr1 5.10.0-7-amd64 #1 SMP Debian 5.10.38-1 (2021-05-20)
x86_64 GNU/Linux
root@pdx1rtr1:~# vppctl show ver
vpp v21.10-rc0~3-g3f3da0d27 built by nate on altair at
2021-05-27T01:21:58
root@pdx1rtr1:~# bird --version
BIRD version 2.0.7
And some adjusted sysctl params:
net.core.rmem_default = 67108864
net.core.wmem_default = 67108864
net.core.rmem_max = 67108864
net.core.wmem_max = 67108864
vm.nr_hugepages = 1024
vm.max_map_count = 3096
vm.hugetlb_shm_group = 0
kernel.shmmax = 2147483648
In case it's at all helpful, I ran a "sh ip fib sum" every second
and restarted BIRD to observe when the routes start processing, and
to get the last known fib state before the crash:
Thu May 27 06:10:20 UTC 2021
ipv4-VRF:0, fib_index:0, flow hash:[src dst sport dport proto
flowlabel ] epoch:0 flags:none locks:[adjacency:1, default-route:1,
lcp-rt:1, ]
Prefix length         Count
           0              1
           4              2
           8              3
           9              5
          10             29
          11             62
          12            169
          13            357
          14            702
          15           1140
          16           7110
          17           4710
          18           7763
          19          13814
          20          22146
          21          26557
          22          51780
          23          43914
          24         227173
          27              1
          32              6
Thu May 27 06:10:21 UTC 2021
clib_socket_init: connect (fd 3, '/run/vpp/cli.sock'): Connection
refused
Thu May 27 06:10:22 UTC 2021
ipv4-VRF:0, fib_index:0, flow hash:[src dst sport dport proto
flowlabel ] epoch:0 flags:none locks:[default-route:1, ]
Prefix length         Count
           0              1
           4              2
          32              2
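The one-second polling described above can be sketched roughly as follows (my reconstruction, not the exact script from the thread; assumes `vppctl` is on PATH):

```shell
# Timestamped FIB summary once a second; the 'Connection refused'
# line above marks the instant the VPP process died
while true; do
  date -u
  vppctl show ip fib summary
  sleep 1
done
```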
I'm new to VPP so let me know if there are other logs that would be
useful too.
Cheers,
Nate
--
Pim van Pelt <p...@ipng.nl>
PBVP1-RIPE - <http://www.ipng.nl/>
View/Reply Online (#19493): https://lists.fd.io/g/vpp-dev/message/19493