Re: [vpp-dev] vpp's memory is leaking
Hi all,

I disabled the transparent hugepage function, and now the RES memory no longer increases. Is that an acceptable workaround?

echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

Regards

On 2018-05-29 16:01:27, "xulang" wrote:

Hi all,

My version is 17.04. I have encountered a memory leak: the RES memory of VPP is increasing slowly and continuously. I shut down all interfaces and set breakpoints on the memory allocation functions (malloc, calloc, realloc, mmap, vmalloc, clib_mem_alloc, mheap_alloc_with_flags). The program keeps running and the RES memory keeps increasing. Any guidance?

Regards

root@ubuntu:/home/wangzy# top -c | grep vpp
4499 root 20 0 5.000t 1.207g 197808 S 201.0 31.4 26:30.57 /usr/bin/vpp -c /etc/vpp/startup.conf
4499 root 20 0 5.000t 1.209g 197808 S 201.7 31.5 26:36.62 /usr/bin/vpp -c /etc/vpp/startup.conf
4499 root 20 0 5.000t 1.209g 197808 t 3.3 31.5 26:36.72 /usr/bin/vpp -c /etc/vpp/startup.conf
4499 root 20 0 5.000t 1.209g 197808 S 115.0 31.5 26:40.18 /usr/bin/vpp -c /etc/vpp/startup.conf
4499 root 20 0 5.000t 1.209g 197808 S 201.0 31.5 26:46.23 /usr/bin/vpp -c /etc/vpp/startup.conf
4499 root 20 0 5.000t 1.209g 197808 S 200.7 31.5 26:52.27 /usr/bin/vpp -c /etc/vpp/startup.conf
4499 root 20 0 5.000t 1.209g 197808 S 201.3 31.5 26:58.31 /usr/bin/vpp -c /etc/vpp/startup.conf
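For reference, the kernel reports the active transparent-hugepage mode as the bracketed token in /sys/kernel/mm/transparent_hugepage/enabled. A minimal sketch for verifying the echoed setting actually took effect (the helper name is illustrative):

```python
def thp_mode(contents: str) -> str:
    """Return the active THP mode from the contents of
    /sys/kernel/mm/transparent_hugepage/enabled, where the kernel marks
    the current choice in brackets, e.g. 'always madvise [never]'."""
    for token in contents.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError("no active mode found")

# After the 'echo never > .../enabled' above, the file should read:
print(thp_mode("always madvise [never]"))  # -> never
```

The same check applies to the companion `defrag` file, which uses the identical bracketed format.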
Re: [vpp-dev] gdb break point of plugin not hit
Hi Damjan,

Some observations:

#1
(gdb) set args unix { interactive cli-listen /run/vpp/cli.sock full-coredump } plugin_path /home/harish/vpp_0524/vpp/build-root/install-vpp_debug-native/sample-plugin/lib64/vpp_plugins dpdk { uio-driver igb_uio dev :07:00.0 dev :07:00.1 }
(gdb) r
Starting program: /home/harish/vpp_0524/vpp/./build-root/install-vpp_debug-native/vpp/bin/vpp unix { interactive cli-listen /run/vpp/cli.sock full-coredump } plugin_path /home/harish/vpp_0524/vpp/build-root/install-vpp_debug-native/sample-plugin/lib64/vpp_plugins dpdk { uio-driver igb_uio dev :07:00.0 dev :07:00.1 }
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
vlib_plugin_early_init:356: plugin path /home/harish/vpp_0524/vpp/build-root/install-vpp_debug-native/sample-plugin/lib64/vpp_plugins
load_one_plugin:184: Loaded plugin: sample_plugin.so (Sample of VPP Plugin)
vlib_call_all_config_functions: unknown input `dpdk uio-driver igb_uio dev :07:00.0 dev :07:00.1 '
[Inferior 1 (process 338072) exited with code 01]

#2 - Remove dpdk arguments
The break point is hit, but there are no 10G interfaces since the dpdk arguments could not be passed via GDB.
(gdb) set args unix { interactive cli-listen /run/vpp/cli.sock full-coredump } plugin_path /home/harish/vpp_0524/vpp/build-root/install-vpp_debug-native/sample-plugin/lib64/vpp_plugins
(gdb) r
Starting program: /home/harish/vpp_0524/vpp/./build-root/install-vpp_debug-native/vpp/bin/vpp unix { interactive cli-listen /run/vpp/cli.sock full-coredump } plugin_path /home/harish/vpp_0524/vpp/build-root/install-vpp_debug-native/sample-plugin/lib64/vpp_plugins
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
vlib_plugin_early_init:356: plugin path /home/harish/vpp_0524/vpp/build-root/install-vpp_debug-native/sample-plugin/lib64/vpp_plugins
load_one_plugin:184: Loaded plugin: sample_plugin.so (Sample of VPP Plugin)

Breakpoint 1, sample_init (vm=0x77b9d400) at /home/harish/vpp_0524/vpp/build-data/../src/examples/sample-plugin/sample/sample.c:205
205 {
(gdb) c
Continuing.
[New Thread 0x7fff82fc2700 (LWP 339053)]

[fd.io VPP ASCII-art startup banner]

DBGvpp# sh int
              Name               Idx       State          Counter          Count
local0                            0        down
DBGvpp#

So I still have issues.

Thanks,
Harish

On Tue, May 29, 2018 at 11:13 AM, Harish Patil wrote:
> Hi Damjan,
> Thanks.
>
> Yes, I do see "show run" hitting the sample plugin.
> I sent one ping request packet.
>
> sample    active    1    1    0    2.67e4    1.00
>
> I would like to stick to the current version of VPP.
> Is there a known issue or something that needs my VPP to be upgraded?
>
> Thanks.
> Harish
>
> On Sun, May 27, 2018 at 2:35 AM, Damjan Marion wrote:
>
>> Do you see in "show run" output that the sample plugin node was hit with
>> some packets?
>>
>> also, I suggest moving to a newer version of vpp...
>>
>> --
>> Damjan
>>
>> On 25 May 2018, at 21:12, Harish Patil wrote:
>>
>> Hi,
>>
>> I built VPP using "make -j build TAG=vpp_debug" with "export SAMPLE_PLUGIN=yes";
>> the VPP release is origin/stable/1710. First I verified that sample_plugin.so
>> is loaded.
>>
>> [root@localhost vpp]# make run
>> ..
>> load_one_plugin:184: Loaded plugin: pppoe_plugin.so (PPPoE)
>> load_one_plugin:184: Loaded plugin: sample_plugin.so (Sample of VPP Plugin)
>> ..
>>
>> Now when I try to put a break point on a function in sample_plugin:
>>
>> [root@localhost vpp]# gdb ./build-root/install-vpp_debug-native/vpp/bin/vpp
>> GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-64.el7
>> ..
>> ..
>> Reading symbols from /home/harish/vpp_0524/vpp/build-root/install-vpp_debug-native/vpp/bin/vpp...done.
>> (gdb) b sample_node_fn
>> Function "sample_node_fn" not defined.
>> Make breakpoint pending on future shared library load? (y or [n])
>>
>> 1) I selected y, but this future break point is never hit when traffic
>> is sent while the sample plugin is activated.
>>
>> 2) The other thing I tried is to manually load the symbols of sample_plugin:
>>
>> (gdb) add-symbol-file ./build-root/install-vpp_debug-native/sample-plugin/lib64/vpp_plugins/sample_plugin.so
>> where TXT is the text address obtained from "info sections".
>> But the break points applied seem to be set at incorrect locations, so
>> they are never hit.
>>
>> Could you please help here?
>>
>> Thanks,
>> Harish
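The "#1" failure above looks like gdb's argument handling and vpp's brace-structured command line disagreeing over the dpdk stanza. One workaround (an assumption on my part, not something from the thread) is to move the whole configuration into a startup file and launch "gdb --args vpp -c <file>", so only one plain argument crosses gdb. A sketch of generating such a file; paths are illustrative, and the "dev" entries are omitted because their PCI addresses were truncated in the original mail:

```python
import tempfile

# Everything vpp was given inline on the gdb command line, moved into a
# "-c" startup file (paths illustrative; truncated "dev ..." entries omitted).
STARTUP_CONF = """\
unix { interactive cli-listen /run/vpp/cli.sock full-coredump }
plugin_path /home/harish/vpp_0524/vpp/build-root/install-vpp_debug-native/sample-plugin/lib64/vpp_plugins
dpdk { uio-driver igb_uio }
"""

with tempfile.NamedTemporaryFile("w", suffix=".conf", delete=False) as f:
    f.write(STARTUP_CONF)
    path = f.name

# Launch under gdb with a single, unambiguous argument:
print(f"gdb --args ./build-root/install-vpp_debug-native/vpp/bin/vpp -c {path}")
```

This sidesteps quoting entirely: vpp parses the braces itself when reading the file, the same way it parses /etc/vpp/startup.conf.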
Re: [vpp-dev] gdb break point of plugin not hit
Hi Damjan,

Thanks.

Yes, I do see "show run" hitting the sample plugin. I sent one ping request packet.

sample    active    1    1    0    2.67e4    1.00

I would like to stick to the current version of VPP. Is there a known issue or something that needs my VPP to be upgraded?

Thanks.
Harish

On Sun, May 27, 2018 at 2:35 AM, Damjan Marion wrote:
> Do you see in "show run" output that the sample plugin node was hit with some
> packets?
>
> also, I suggest moving to a newer version of vpp...
>
> --
> Damjan
>
> On 25 May 2018, at 21:12, Harish Patil wrote:
>
> Hi,
>
> I built VPP using "make -j build TAG=vpp_debug" with "export SAMPLE_PLUGIN=yes";
> the VPP release is origin/stable/1710. First I verified that sample_plugin.so
> is loaded.
>
> [root@localhost vpp]# make run
> ..
> load_one_plugin:184: Loaded plugin: pppoe_plugin.so (PPPoE)
> load_one_plugin:184: Loaded plugin: sample_plugin.so (Sample of VPP Plugin)
> ..
>
> Now when I try to put a break point on a function in sample_plugin:
>
> [root@localhost vpp]# gdb ./build-root/install-vpp_debug-native/vpp/bin/vpp
> GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-64.el7
> ..
> ..
> Reading symbols from /home/harish/vpp_0524/vpp/build-root/install-vpp_debug-native/vpp/bin/vpp...done.
> (gdb) b sample_node_fn
> Function "sample_node_fn" not defined.
> Make breakpoint pending on future shared library load? (y or [n])
>
> 1) I selected y, but this future break point is never hit when traffic
> is sent while the sample plugin is activated.
>
> 2) The other thing I tried is to manually load the symbols of sample_plugin:
>
> (gdb) add-symbol-file ./build-root/install-vpp_debug-native/sample-plugin/lib64/vpp_plugins/sample_plugin.so
> where TXT is the text address obtained from "info sections".
> But the break points applied seem to be set at incorrect locations, so
> they are never hit.
>
> Could you please help here?
>
> Thanks,
> Harish
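On the add-symbol-file approach mentioned in the quoted mail: gdb wants the address where the plugin's .text section actually sits, which for a runtime-loaded .so is the section address from the ELF headers plus the library's load base. A small helper (names illustrative, and the sample line is a typical but not verbatim `readelf -S` row) for pulling the section address out of `readelf -S` output:

```python
import re

def text_section_addr(readelf_s_output: str) -> int:
    """Pull the .text section address out of `readelf -S <plugin>.so`
    output, for use with gdb's `add-symbol-file <plugin>.so <addr>`.
    For a shared object mapped at runtime, add the library's load base
    (visible in /proc/<pid>/maps or gdb's `info sharedlibrary`)."""
    for line in readelf_s_output.splitlines():
        m = re.search(r"\.text\s+\S+\s+([0-9a-fA-F]+)", line)
        if m:
            return int(m.group(1), 16)
    raise ValueError(".text section not found")

sample = "  [11] .text  PROGBITS  0000000000001040  00001040  0000000000002f11"
print(hex(text_section_addr(sample)))  # -> 0x1040
```

Forgetting the load-base offset would put the breakpoints at the wrong addresses, which matches the "set in incorrect locations" symptom described above.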
Re: [vpp-dev] VPP Vnet crash with vhost-user interface
Steve,

Thanks for the inputs on debugs and gdb. I am using gdb on my development system to debug the issue. I would like to have reliable core generation on the system where I don't have access to install gdb. I installed corekeeper and it still doesn't generate a core. I am running vpp inside a VM (VirtualBox/vagrant); not sure if I need to set something inside the vagrant config file.

dpkg -l corekeeper
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name        Version  Architecture  Description
ii  corekeeper  1.6      amd64        enable core files and report crashes to the system

Thanks.

On Tue, May 29, 2018 at 9:38 AM, Steven Luong (sluong) wrote:
> Ravi,
>
> I install corekeeper and the core file is kept in /var/crash. But why not use
> gdb to attach to the VPP process?
> To turn on VPP vhost-user debug, type "debug vhost-user on" at the VPP prompt.
>
> Steven
>
> On 5/29/18, 9:10 AM, "vpp-dev@lists.fd.io on behalf of Ravi Kerur" wrote:
>
> Hi Marco,
>
> On Tue, May 29, 2018 at 6:30 AM, Marco Varlese wrote:
> > Ravi,
> >
> > On Sun, 2018-05-27 at 12:20 -0700, Ravi Kerur wrote:
> >> Hello,
> >>
> >> I have a VM (16.04.4 Ubuntu x86_64) with 2 cores and 4G RAM. I have
> >> installed VPP successfully on it. Later I created vhost-user
> >> interfaces via
> >>
> >> create vhost socket /var/run/vpp/sock1.sock server
> >> create vhost socket /var/run/vpp/sock2.sock server
> >> set interface state VirtualEthernet0/0/0 up
> >> set interface state VirtualEthernet0/0/1 up
> >>
> >> set interface l2 bridge VirtualEthernet0/0/0 1
> >> set interface l2 bridge VirtualEthernet0/0/1 1
> >>
> >> I then run 'DPDK/testpmd' inside a container which will use
> >> virtio-user interfaces, using the following command:
> >>
> >> docker run -it --privileged -v /var/run/vpp/sock1.sock:/var/run/usvhost1
> >> -v /var/run/vpp/sock2.sock:/var/run/usvhost2
> >> -v /dev/hugepages:/dev/hugepages dpdk-app-testpmd ./bin/testpmd
> >> -c 0x3 -n 4 --log-level=9 -m 64 --no-pci --single-file-segments
> >> --vdev=virtio_user0,path=/var/run/usvhost1,mac=54:01:00:01:01:01
> >> --vdev=virtio_user1,path=/var/run/usvhost2,mac=54:01:00:01:01:02
> >> -- -i
> >>
> >> VPP Vnet crashes with the following message:
> >>
> >> May 27 11:44:00 localhost vnet[6818]: received signal SIGSEGV, PC
> >> 0x7fcca4620187, faulting address 0x7fcb317ac000
> >>
> >> Questions:
> >> I have 'ulimit -c unlimited' and /etc/vpp/startup.conf has
> >> unix {
> >>   nodaemon
> >>   log /var/log/vpp/vpp.log
> >>   full-coredump
> >>   cli-listen /run/vpp/cli.sock
> >>   gid vpp
> >> }
> >>
> >> But I couldn't locate the corefile?
> > The location of the coredump file depends on your system configuration.
> >
> > Please, check "cat /proc/sys/kernel/core_pattern"
> >
> > If you have systemd-coredump in the output of the above command, then likely the
> > location of the coredump files is "/var/lib/systemd/coredump/"
> >
> > You can also change the location of where your system places the coredump files:
> > echo '/PATH_TO_YOU_LOCATION/core_%e.%p' | sudo tee /proc/sys/kernel/core_pattern
> >
> > See if that helps...
>
> Initially '/proc/sys/kernel/core_pattern' was set to 'core'. I changed
> it to 'systemd-coredump'. Still no core generated. VPP crashes:
>
> May 29 08:54:34 localhost vnet[4107]: received signal SIGSEGV, PC
> 0x7f0167751187, faulting address 0x7efff43ac000
> May 29 08:54:34 localhost systemd[1]: vpp.service: Main process exited, code=killed, status=6/ABRT
> May 29 08:54:34 localhost systemd[1]: vpp.service: Unit entered failed state.
> May 29 08:54:34 localhost systemd[1]: vpp.service: Failed with result 'signal'.
>
> cat /proc/sys/kernel/core_pattern
> systemd-coredump
>
> ulimit -a
> core file size          (blocks, -c) unlimited
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 15657
> max locked memory       (kbytes, -l) 64
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 1024
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 8192
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 15657
> virtual memory
Re: [vpp-dev] VPP Vnet crash with vhost-user interface
Ravi,

I install corekeeper and the core file is kept in /var/crash. But why not use gdb to attach to the VPP process?
To turn on VPP vhost-user debug, type "debug vhost-user on" at the VPP prompt.

Steven

On 5/29/18, 9:10 AM, "vpp-dev@lists.fd.io on behalf of Ravi Kerur" wrote:

Hi Marco,

On Tue, May 29, 2018 at 6:30 AM, Marco Varlese wrote:
> Ravi,
>
> On Sun, 2018-05-27 at 12:20 -0700, Ravi Kerur wrote:
>> Hello,
>>
>> I have a VM (16.04.4 Ubuntu x86_64) with 2 cores and 4G RAM. I have
>> installed VPP successfully on it. Later I created vhost-user
>> interfaces via
>>
>> create vhost socket /var/run/vpp/sock1.sock server
>> create vhost socket /var/run/vpp/sock2.sock server
>> set interface state VirtualEthernet0/0/0 up
>> set interface state VirtualEthernet0/0/1 up
>>
>> set interface l2 bridge VirtualEthernet0/0/0 1
>> set interface l2 bridge VirtualEthernet0/0/1 1
>>
>> I then run 'DPDK/testpmd' inside a container which will use
>> virtio-user interfaces, using the following command:
>>
>> docker run -it --privileged -v /var/run/vpp/sock1.sock:/var/run/usvhost1
>> -v /var/run/vpp/sock2.sock:/var/run/usvhost2
>> -v /dev/hugepages:/dev/hugepages dpdk-app-testpmd ./bin/testpmd
>> -c 0x3 -n 4 --log-level=9 -m 64 --no-pci --single-file-segments
>> --vdev=virtio_user0,path=/var/run/usvhost1,mac=54:01:00:01:01:01
>> --vdev=virtio_user1,path=/var/run/usvhost2,mac=54:01:00:01:01:02
>> -- -i
>>
>> VPP Vnet crashes with the following message:
>>
>> May 27 11:44:00 localhost vnet[6818]: received signal SIGSEGV, PC
>> 0x7fcca4620187, faulting address 0x7fcb317ac000
>>
>> Questions:
>> I have 'ulimit -c unlimited' and /etc/vpp/startup.conf has
>> unix {
>>   nodaemon
>>   log /var/log/vpp/vpp.log
>>   full-coredump
>>   cli-listen /run/vpp/cli.sock
>>   gid vpp
>> }
>>
>> But I couldn't locate the corefile?
> The location of the coredump file depends on your system configuration.
>
> Please, check "cat /proc/sys/kernel/core_pattern"
>
> If you have systemd-coredump in the output of the above command, then likely the
> location of the coredump files is "/var/lib/systemd/coredump/"
>
> You can also change the location of where your system places the coredump files:
> echo '/PATH_TO_YOU_LOCATION/core_%e.%p' | sudo tee /proc/sys/kernel/core_pattern
>
> See if that helps...

Initially '/proc/sys/kernel/core_pattern' was set to 'core'. I changed it to 'systemd-coredump'. Still no core generated. VPP crashes:

May 29 08:54:34 localhost vnet[4107]: received signal SIGSEGV, PC 0x7f0167751187, faulting address 0x7efff43ac000
May 29 08:54:34 localhost systemd[1]: vpp.service: Main process exited, code=killed, status=6/ABRT
May 29 08:54:34 localhost systemd[1]: vpp.service: Unit entered failed state.
May 29 08:54:34 localhost systemd[1]: vpp.service: Failed with result 'signal'.

cat /proc/sys/kernel/core_pattern
systemd-coredump

ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 15657
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 15657
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

cd /var/lib/systemd/coredump/
root@localhost:/var/lib/systemd/coredump# ls
root@localhost:/var/lib/systemd/coredump#

>>
>> (2) How to enable debugs? I have used 'make build' but no additional
>> logs other than those shown below
>>
>> VPP logs from /var/log/syslog is shown below
>> cat /var/log/syslog
>> May 27 11:40:28 localhost vpp[6818]: vlib_plugin_early_init:361: plugin path /usr/lib/vpp_plugins:/usr/lib64/vpp_plugins
>> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: abf_plugin.so (ACL based Forwarding)
>> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: acl_plugin.so (Access Control Lists)
>> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: avf_plugin.so (Intel Adaptive Virtual Function (AVF) Device Plugin)
>> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:191: Loaded plugin: cdp_plugin.so
>> May 27 11:40:28 localhost
Re: [vpp-dev] VPP Vnet crash with vhost-user interface
Hi Marco,

On Tue, May 29, 2018 at 6:30 AM, Marco Varlese wrote:
> Ravi,
>
> On Sun, 2018-05-27 at 12:20 -0700, Ravi Kerur wrote:
>> Hello,
>>
>> I have a VM (16.04.4 Ubuntu x86_64) with 2 cores and 4G RAM. I have
>> installed VPP successfully on it. Later I created vhost-user
>> interfaces via
>>
>> create vhost socket /var/run/vpp/sock1.sock server
>> create vhost socket /var/run/vpp/sock2.sock server
>> set interface state VirtualEthernet0/0/0 up
>> set interface state VirtualEthernet0/0/1 up
>>
>> set interface l2 bridge VirtualEthernet0/0/0 1
>> set interface l2 bridge VirtualEthernet0/0/1 1
>>
>> I then run 'DPDK/testpmd' inside a container which will use
>> virtio-user interfaces, using the following command:
>>
>> docker run -it --privileged -v /var/run/vpp/sock1.sock:/var/run/usvhost1
>> -v /var/run/vpp/sock2.sock:/var/run/usvhost2
>> -v /dev/hugepages:/dev/hugepages dpdk-app-testpmd ./bin/testpmd
>> -c 0x3 -n 4 --log-level=9 -m 64 --no-pci --single-file-segments
>> --vdev=virtio_user0,path=/var/run/usvhost1,mac=54:01:00:01:01:01
>> --vdev=virtio_user1,path=/var/run/usvhost2,mac=54:01:00:01:01:02
>> -- -i
>>
>> VPP Vnet crashes with the following message:
>>
>> May 27 11:44:00 localhost vnet[6818]: received signal SIGSEGV, PC
>> 0x7fcca4620187, faulting address 0x7fcb317ac000
>>
>> Questions:
>> I have 'ulimit -c unlimited' and /etc/vpp/startup.conf has
>> unix {
>>   nodaemon
>>   log /var/log/vpp/vpp.log
>>   full-coredump
>>   cli-listen /run/vpp/cli.sock
>>   gid vpp
>> }
>>
>> But I couldn't locate the corefile?
> The location of the coredump file depends on your system configuration.
>
> Please, check "cat /proc/sys/kernel/core_pattern"
>
> If you have systemd-coredump in the output of the above command, then likely
> the location of the coredump files is "/var/lib/systemd/coredump/"
>
> You can also change the location of where your system places the coredump
> files:
> echo '/PATH_TO_YOU_LOCATION/core_%e.%p' | sudo tee /proc/sys/kernel/core_pattern
>
> See if that helps...

Initially '/proc/sys/kernel/core_pattern' was set to 'core'. I changed it to 'systemd-coredump'. Still no core generated. VPP crashes:

May 29 08:54:34 localhost vnet[4107]: received signal SIGSEGV, PC 0x7f0167751187, faulting address 0x7efff43ac000
May 29 08:54:34 localhost systemd[1]: vpp.service: Main process exited, code=killed, status=6/ABRT
May 29 08:54:34 localhost systemd[1]: vpp.service: Unit entered failed state.
May 29 08:54:34 localhost systemd[1]: vpp.service: Failed with result 'signal'.

cat /proc/sys/kernel/core_pattern
systemd-coredump

ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 15657
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 15657
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

cd /var/lib/systemd/coredump/
root@localhost:/var/lib/systemd/coredump# ls
root@localhost:/var/lib/systemd/coredump#

>>
>> (2) How to enable debugs? I have used 'make build' but no additional
>> logs other than those shown below
>>
>> VPP logs from /var/log/syslog is shown below
>> cat /var/log/syslog
>> May 27 11:40:28 localhost vpp[6818]: vlib_plugin_early_init:361: plugin path /usr/lib/vpp_plugins:/usr/lib64/vpp_plugins
>> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: abf_plugin.so (ACL based Forwarding)
>> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: acl_plugin.so (Access Control Lists)
>> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: avf_plugin.so (Intel Adaptive Virtual Function (AVF) Device Plugin)
>> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:191: Loaded plugin: cdp_plugin.so
>> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: dpdk_plugin.so (Data Plane Development Kit (DPDK))
>> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: flowprobe_plugin.so (Flow per Packet)
>> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: gbp_plugin.so (Group Based Policy)
>> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: gtpu_plugin.so (GTPv1-U)
>> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: igmp_plugin.so (IGMP messaging)
>> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: ila_plugin.so (Identifier-locator addressing for IPv6)
>> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded
[vpp-dev] VCL and LD Preload
Hi!

I have been trying to test VPP's potential for TCP/IP latency optimization. When I run socket_test.sh without any special parameters (so just a comparison between native-kernel, native-vcl and native-preload), it seems that preload is around six times worse than kernel. I have also tried this same test with a few different hardware setups. I also tried to compare LD preload to kernel performance between cloud VMs without using the provided script, and ended up getting similar results. All tests are run on 18.07 (build-release).

What is the status of VCL and LD preload? Should it be working, and am I maybe missing something obvious?

Thanks,
Ville Kapanen
Re: [vpp-dev] VPP Vnet crash with vhost-user interface
Ravi,

On Sun, 2018-05-27 at 12:20 -0700, Ravi Kerur wrote:
> Hello,
>
> I have a VM (16.04.4 Ubuntu x86_64) with 2 cores and 4G RAM. I have
> installed VPP successfully on it. Later I created vhost-user
> interfaces via
>
> create vhost socket /var/run/vpp/sock1.sock server
> create vhost socket /var/run/vpp/sock2.sock server
> set interface state VirtualEthernet0/0/0 up
> set interface state VirtualEthernet0/0/1 up
>
> set interface l2 bridge VirtualEthernet0/0/0 1
> set interface l2 bridge VirtualEthernet0/0/1 1
>
> I then run 'DPDK/testpmd' inside a container which will use
> virtio-user interfaces, using the following command:
>
> docker run -it --privileged -v /var/run/vpp/sock1.sock:/var/run/usvhost1
> -v /var/run/vpp/sock2.sock:/var/run/usvhost2
> -v /dev/hugepages:/dev/hugepages dpdk-app-testpmd ./bin/testpmd
> -c 0x3 -n 4 --log-level=9 -m 64 --no-pci --single-file-segments
> --vdev=virtio_user0,path=/var/run/usvhost1,mac=54:01:00:01:01:01
> --vdev=virtio_user1,path=/var/run/usvhost2,mac=54:01:00:01:01:02
> -- -i
>
> VPP Vnet crashes with the following message:
>
> May 27 11:44:00 localhost vnet[6818]: received signal SIGSEGV, PC
> 0x7fcca4620187, faulting address 0x7fcb317ac000
>
> Questions:
> I have 'ulimit -c unlimited' and /etc/vpp/startup.conf has
> unix {
>   nodaemon
>   log /var/log/vpp/vpp.log
>   full-coredump
>   cli-listen /run/vpp/cli.sock
>   gid vpp
> }
>
> But I couldn't locate the corefile?

The location of the coredump file depends on your system configuration.

Please, check "cat /proc/sys/kernel/core_pattern"

If you have systemd-coredump in the output of the above command, then likely the location of the coredump files is "/var/lib/systemd/coredump/"

You can also change the location of where your system places the coredump files:

echo '/PATH_TO_YOU_LOCATION/core_%e.%p' | sudo tee /proc/sys/kernel/core_pattern

See if that helps...

> (2) How to enable debugs? I have used 'make build' but no additional
> logs other than those shown below
>
> VPP logs from /var/log/syslog is shown below
> cat /var/log/syslog
> May 27 11:40:28 localhost vpp[6818]: vlib_plugin_early_init:361: plugin path /usr/lib/vpp_plugins:/usr/lib64/vpp_plugins
> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: abf_plugin.so (ACL based Forwarding)
> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: acl_plugin.so (Access Control Lists)
> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: avf_plugin.so (Intel Adaptive Virtual Function (AVF) Device Plugin)
> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:191: Loaded plugin: cdp_plugin.so
> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: dpdk_plugin.so (Data Plane Development Kit (DPDK))
> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: flowprobe_plugin.so (Flow per Packet)
> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: gbp_plugin.so (Group Based Policy)
> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: gtpu_plugin.so (GTPv1-U)
> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: igmp_plugin.so (IGMP messaging)
> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: ila_plugin.so (Identifier-locator addressing for IPv6)
> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: ioam_plugin.so (Inbound OAM)
> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:117: Plugin disabled (default): ixge_plugin.so
> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: l2e_plugin.so (L2 Emulation)
> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: lacp_plugin.so (Link Aggregation Control Protocol)
> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: lb_plugin.so (Load Balancer)
> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: memif_plugin.so (Packet Memory Interface (experimetal))
> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: nat_plugin.so (Network Address Translation)
> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: pppoe_plugin.so (PPPoE)
> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: srv6ad_plugin.so (Dynamic SRv6 proxy)
> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: srv6am_plugin.so (Masquerading SRv6 proxy)
> May 27 11:40:28 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: srv6as_plugin.so (Static SRv6 proxy)
> May 27 11:40:29 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: stn_plugin.so (VPP Steals the NIC for Container integration)
> May 27 11:40:29 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: tlsmbedtls_plugin.so (mbedtls based TLS Engine)
> May 27 11:40:29 localhost vpp[6818]: load_one_plugin:189: Loaded plugin: tlsopenssl_plugin.so (openssl
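Regarding the core_pattern check earlier in this thread: the kernel treats a pattern beginning with '|' as a pipe to a userspace handler (which is how systemd-coredump registers itself); anything else is just a filename template. A rough model of that decision (illustrative, not the kernel's actual code):

```python
def classify_core_pattern(pattern: str) -> str:
    """Rough model of how the kernel interprets /proc/sys/kernel/core_pattern:
    a leading '|' pipes the core to a userspace handler; otherwise the
    string is a filename template (%e = executable name, %p = pid, ...)."""
    pattern = pattern.strip()
    if pattern.startswith("|"):
        return "pipe to handler: " + pattern[1:].split()[0]
    return "write file: " + pattern

print(classify_core_pattern("|/lib/systemd/systemd-coredump %P %u %g %s %t %c %h"))
# -> pipe to handler: /lib/systemd/systemd-coredump
print(classify_core_pattern("/var/crash/core_%e.%p"))
# -> write file: /var/crash/core_%e.%p
```

Note that writing the bare string "systemd-coredump" into core_pattern, as happened later in this thread, names a relative file rather than invoking the handler, which may explain the empty /var/lib/systemd/coredump directory.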
Re: [vpp-dev] Support for TCP flag
Thanks a lot for your prompt answer. :)

From: Andrew Yourtchenko
Sent: Tuesday, May 29, 2018 1:10 PM
To: Rubina Bianchi
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Support for TCP flag

Hi Rubina,

I designed the stateful mode to be just a bit more than the ACL, with a "diode" state, rather than going for the fully fledged firewall model - as a balance between simplicity and functionality. Full tracking of the TCP state machine was not in scope - getting into that territory properly also requires TCP sequence number tracking, etc. - and there the complexity would far outweigh the usefulness for most practical cases.

So I needed to primarily differentiate the session state from the timeout perspective - when to remove it. For that purpose, there are two types of TCP sessions, decided by the combination of the SYN, FIN, RST and ACK TCP flag bits seen from each side:

1) Those that have seen SYN+ACK on both sides are fully open (this is where the "tcp idle" timeout applies, which is usually rather long).

2) Those that have seen any other combination of the flags (this is where the "tcp transient" timeout applies, which defaults to 2 minutes).

As we receive the packets, we update the seen flags, and we may change the current idle timeout based on the accumulated seen flags. Additionally, if we run out of sessions when creating new ones, the sessions in the transient state will be cleaned up and reused in FIFO manner - a simple mechanism against resource starvation at higher session rates.

This is a deliberate design choice, and unless there are operational issues with it (i.e. where the resource clean-up does not happen where it should, etc...), I did not have any plans to change it.

So, could you expand a bit more on what kind of use case you are looking for, to discuss further?

--a

On 5/29/18, Rubina Bianchi wrote:
> Hi
> I have a question about vpp stateful mode. It seems that vpp stateful mode
> hasn't been implemented completely. I mean there isn't any feature like
> conntrack in the Linux kernel, so vpp doesn't have any mechanism to handle
> TCP sessions based on different flags. For example, I sent TCP three-way
> handshake packets in a different order (ack -> syn -> syn-ack); in this case
> an idle session is added to the session table. Do you have any plan to
> develop it?
Re: [vpp-dev] Rx stuck to 0 after a while
Dear Andrew I cleaned everything and created a new deb packages by your patch once again. With your patch I never see deadlock again, but still I have throughput problem in my scenario. -Per port stats table ports | 0 | 1 - opackets | 474826597 | 452028770 obytes |207843848531 |199591809555 ipackets |71010677 |72028456 ibytes | 31441646551 | 31687562468 ierrors | 0 | 0 oerrors | 0 | 0 Tx Bw | 9.56 Gbps | 9.16 Gbps -Global stats enabled Cpu Utilization : 88.4 % 7.1 Gb/core Platform_factor : 1.0 Total-Tx: 18.72 Gbps Total-Rx: 59.30 Mbps Total-PPS : 5.31 Mpps Total-CPS : 79.79 Kcps Expected-PPS: 9.02 Mpps Expected-CPS: 135.31 Kcps Expected-BPS: 31.77 Gbps Active-flows:88837 Clients : 252 Socket-util : 0.5598 % Open-flows : 14708455 Servers :65532 Socket :88837 Socket/Clients : 352.5 Total_queue_full : 328355248 drop-rate : 18.66 Gbps current time: 180.9 sec test duration : 99819.1 sec In best case (4 interface in one numa that only 2 of them has acl) my device (HP DL380 G9) throughput is maximum (18.72Gbps) but in worst case (4 interface in one numa that all of them has acl) my device throughput will decrease from maximum to around 60Mbps. Actually patch just prevent deadlock in my case but throughput is same as before. From: Andrew Yourtchenko Sent: Tuesday, May 29, 2018 10:11 AM To: Rubina Bianchi Cc: vpp-dev@lists.fd.io Subject: Re: [vpp-dev] Rx stuck to 0 after a while Dear Rubina, thank you for quickly checking it! Judging by the logs the VPP quits, so I would say there should be a core file, could you check ? If you find it (doublecheck by the timestamps that it is indeed the fresh one), you can load it in gdb (using gdb 'path-to-vpp-binary' 'path-to-core') and then get the backtrace using 'bt', this will give more idea on what is going on. --a On 5/29/18, Rubina Bianchi wrote: > Dear Andrew > > I tested your patch and my problem still exist, but my service status > changed and now there isn't any information about deadlock problem. 
Do you > have any idea about how I can provide you more information? > > root@MYRB:~# service vpp status > * vpp.service - vector packet processing engine >Loaded: loaded (/lib/systemd/system/vpp.service; disabled; vendor preset: > enabled) >Active: inactive (dead) > > May 29 09:27:06 MYRB /usr/bin/vpp[30805]: load_one_vat_plugin:67: Loaded > plugin: udp_ping_test_plugin.so > May 29 09:27:06 MYRB /usr/bin/vpp[30805]: load_one_vat_plugin:67: Loaded > plugin: stn_test_plugin.so > May 29 09:27:06 MYRB vpp[30805]: /usr/bin/vpp[30805]: dpdk: EAL init args: > -c 1ff -n 4 --huge-dir /run/vpp/hugepages --file-prefix vpp -w :08:00.0 > -w :08:00.1 -w :08 > May 29 09:27:06 MYRB /usr/bin/vpp[30805]: dpdk: EAL init args: -c 1ff -n 4 > --huge-dir /run/vpp/hugepages --file-prefix vpp -w :08:00.0 -w > :08:00.1 -w :08:00.2 -w 000 > May 29 09:27:07 MYRB vnet[30805]: dpdk_ipsec_process:1012: not enough DPDK > crypto resources, default to OpenSSL > May 29 09:27:13 MYRB vnet[30805]: unix_signal_handler:124: received signal > SIGCONT, PC 0x7fa535dfbac0 > May 29 09:27:13 MYRB vnet[30805]: received SIGTERM, exiting... > May 29 09:27:13 MYRB systemd[1]: Stopping vector packet processing > engine... > May 29 09:27:13 MYRB vnet[30805]: unix_signal_handler:124: received signal > SIGTERM, PC 0x7fa534121867 > May 29 09:27:13 MYRB systemd[1]: Stopped vector packet processing engine. > > > > From: Andrew Yourtchenko > Sent: Monday, May 28, 2018 5:58 PM > To: Rubina Bianchi > Cc: vpp-dev@lists.fd.io > Subject: Re: [vpp-dev] Rx stuck to 0 after a while > > Dear Rubina, > > Thanks for catching and reporting this! > > I suspect what might be happening is my recent change of using two > unidirectional sessions in bihash vs. the single one triggered a race, > whereby as the owning worker is deleting the session, > the non-owning worker is trying to update it. That would logically > explain the "BUG: .." 
> line (since you don't change the interfaces nor
> move the traffic around, the 5-tuples should not collide), and as
> well the later stop.
>
> To take care of this issue, I think I will split the deletion of the
> session into two stages:
> 1) deactivation of the bihash entries that steer the traffic
> 2) freeing up the per-worker session structure
>
> and have a little pause in between these two so that the
> workers-in-progress can finish updating the structures.
>
> The below gerrit is the first cut:
>
> https://gerrit.fd.io/r/#/c/12770/
>
> It passes the make test right now
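A rough sketch of the two-stage deletion Andrew describes (hypothetical structure and function names, not the actual VPP ACL plugin code) could look like this:

```c
#include <stdbool.h>

/* Hypothetical, simplified model of the proposed two-stage deletion.
 * Stage 1 only stops traffic from being steered to the session; the
 * memory is freed later, after a grace period, so in-flight workers
 * can finish touching the structure. */

typedef struct {
    bool bihash_active;   /* stage 1 state: entries steering traffic */
    bool allocated;       /* stage 2 state: per-worker session memory */
} session_t;

/* Stage 1: deactivate the bihash entries; no new packets find the
 * session, but workers already holding it may still update it. */
static void session_deactivate(session_t *s)
{
    s->bihash_active = false;
}

/* Stage 2: after the pause, actually free the session structure. */
static void session_free(session_t *s)
{
    s->allocated = false;
}

/* Deletion is complete only once both stages have run. */
static bool session_is_gone(const session_t *s)
{
    return !s->bihash_active && !s->allocated;
}
```

The point of the split is that between the two stages the session is invisible to new traffic but still valid memory, which closes the window where one worker frees a session another worker is updating.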
Re: [vpp-dev] Support for TCP flag
Hi Rubina,

I designed the stateful mode to be just a bit more than the ACL, with a "diode" state, rather than going for the fully fledged firewall model - as a balance between simplicity and functionality. Full tracking of the TCP state machine was not in scope - doing that properly also requires TCP sequence number tracking, etc. - and there the complexity would far outweigh the usefulness for most practical cases.

So I primarily needed to differentiate the session state from the timeout perspective - when to remove it. For that purpose, there are two types of TCP sessions, decided by the combination of the SYN, FIN, RST and ACK TCP flag bits seen from each side:

1) Those that have seen SYN+ACK on both sides are fully open (this is where the "tcp idle" timeout applies, which is usually rather long).
2) Those that have seen any other combination of the flags (this is where the "tcp transient" timeout applies, which defaults to 2 minutes).

As we receive packets, we update the seen flags, and we may change the current idle timeout based on the accumulated seen flags. Additionally, if we run out of sessions when creating new ones, the sessions in the transient state are cleaned up and reused in FIFO order - a simple mechanism against resource starvation at higher session rates.

This is a deliberate design choice, and unless there are operational issues with it (i.e. where the resource clean-up does not happen where it should, etc...), I did not have any plans to change it. So, could you expand a bit more on what kind of use case you are looking for, to discuss further?

--a

On 5/29/18, Rubina Bianchi wrote:
> Hi
> I have a question about vpp stateful mode. It seems that vpp stateful mode
> hasn't been implemented completely. I mean there isn't any feature like
> conntrack in the linux kernel. So, vpp doesn't have any mechanism to handle TCP
> sessions based on different flags.
> For example, I sent the TCP three-way handshake packets in a different order
> (ack -> syn -> syn-ack); in this case an idle session is added to the session
> table. Do you have any plan to develop it?
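The flag-based classification Andrew describes above can be sketched as follows. The flag bit values match the TCP header; the 2-minute transient default is from the text, while the idle timeout value here is an illustrative assumption (the text only says it is "usually rather long"):

```c
#include <stdint.h>

/* Simplified sketch of the two-timeout session classification
 * described above -- not the actual VPP ACL plugin code. */

#define TCP_FLAG_FIN 0x01
#define TCP_FLAG_SYN 0x02
#define TCP_FLAG_RST 0x04
#define TCP_FLAG_ACK 0x10

enum {
    TIMEOUT_TCP_TRANSIENT = 120,  /* default per the text: 2 minutes */
    TIMEOUT_TCP_IDLE      = 600,  /* illustrative "rather long" value */
};

/* A session that has accumulated SYN and ACK from BOTH sides is
 * considered fully open and gets the long idle timeout; any other
 * combination of seen flags keeps it in the transient state. */
static int tcp_session_timeout(uint8_t flags_seen_side_a,
                               uint8_t flags_seen_side_b)
{
    const uint8_t established = TCP_FLAG_SYN | TCP_FLAG_ACK;

    if ((flags_seen_side_a & established) == established &&
        (flags_seen_side_b & established) == established)
        return TIMEOUT_TCP_IDLE;

    return TIMEOUT_TCP_TRANSIENT;
}
```

This also shows why the out-of-order handshake in the question (ack -> syn -> syn-ack) still ends up with a session: the accumulated flags eventually include SYN+ACK from each side regardless of arrival order, so the diode model cannot distinguish it from a legitimate handshake the way a full conntrack state machine would.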
[vpp-dev] How to trigger perf test and compare the results
Hi Guys,

I need CSIT perf testing to test a patch. I know that vpp-verify-perf-l2 can trigger a perf test, but I only see the X520 results. My questions are:
1. How do I trigger an XL710 perf test?
2. What do we compare it to ... do we have the nightly build results? Where do I find those?
3. These are MRR (maximum receive rate) tests; how do we trigger NDR/PDR?

Thanks
Zhiyong
Re: [vpp-dev] Rx stuck to 0 after a while
Dear Rubina,

thank you for quickly checking it! Judging by the logs the VPP quits, so I would say there should be a core file, could you check? If you find it (double-check by the timestamps that it is indeed the fresh one), you can load it in gdb (using gdb 'path-to-vpp-binary' 'path-to-core') and then get the backtrace using 'bt'; this will give more of an idea of what is going on.

--a

On 5/29/18, Rubina Bianchi wrote:
> Dear Andrew
>
> I tested your patch and my problem still exists, but my service status
> changed and now there isn't any information about the deadlock problem. Do you
> have any idea about how I can provide you more information?
>
> root@MYRB:~# service vpp status
> * vpp.service - vector packet processing engine
>    Loaded: loaded (/lib/systemd/system/vpp.service; disabled; vendor preset: enabled)
>    Active: inactive (dead)
>
> May 29 09:27:06 MYRB /usr/bin/vpp[30805]: load_one_vat_plugin:67: Loaded plugin: udp_ping_test_plugin.so
> May 29 09:27:06 MYRB /usr/bin/vpp[30805]: load_one_vat_plugin:67: Loaded plugin: stn_test_plugin.so
> May 29 09:27:06 MYRB vpp[30805]: /usr/bin/vpp[30805]: dpdk: EAL init args: -c 1ff -n 4 --huge-dir /run/vpp/hugepages --file-prefix vpp -w :08:00.0 -w :08:00.1 -w :08
> May 29 09:27:06 MYRB /usr/bin/vpp[30805]: dpdk: EAL init args: -c 1ff -n 4 --huge-dir /run/vpp/hugepages --file-prefix vpp -w :08:00.0 -w :08:00.1 -w :08:00.2 -w 000
> May 29 09:27:07 MYRB vnet[30805]: dpdk_ipsec_process:1012: not enough DPDK crypto resources, default to OpenSSL
> May 29 09:27:13 MYRB vnet[30805]: unix_signal_handler:124: received signal SIGCONT, PC 0x7fa535dfbac0
> May 29 09:27:13 MYRB vnet[30805]: received SIGTERM, exiting...
> May 29 09:27:13 MYRB systemd[1]: Stopping vector packet processing engine...
> May 29 09:27:13 MYRB vnet[30805]: unix_signal_handler:124: received signal SIGTERM, PC 0x7fa534121867
> May 29 09:27:13 MYRB systemd[1]: Stopped vector packet processing engine.
>
>
> From: Andrew Yourtchenko
> Sent: Monday, May 28, 2018 5:58 PM
> To: Rubina Bianchi
> Cc: vpp-dev@lists.fd.io
> Subject: Re: [vpp-dev] Rx stuck to 0 after a while
>
> Dear Rubina,
>
> Thanks for catching and reporting this!
>
> I suspect what might be happening is my recent change of using two
> unidirectional sessions in bihash vs. the single one triggered a race,
> whereby as the owning worker is deleting the session,
> the non-owning worker is trying to update it. That would logically
> explain the "BUG: .." line (since you don't change the interfaces nor
> move the traffic around, the 5-tuples should not collide), and as
> well the later stop.
>
> To take care of this issue, I think I will split the deletion of the
> session into two stages:
> 1) deactivation of the bihash entries that steer the traffic
> 2) freeing up the per-worker session structure
>
> and have a little pause in between these two so that the
> workers-in-progress can finish updating the structures.
>
> The below gerrit is the first cut:
>
> https://gerrit.fd.io/r/#/c/12770/
>
> It passes the make test right now but I did not kick its tires too
> much yet, will do tomorrow.
>
> You can try this change out in your test setup as well and tell me how it
> feels.
>
> --a
>
> On 5/28/18, Rubina Bianchi wrote:
>> Hi
>>
>> I ran vpp v18.07-rc0~237-g525c9d0f with only 2 interfaces in stateful acl
>> (permit+reflect) and generated sfr traffic using trex v2.27. My rx
>> becomes 0 after a short while, about 300 sec on my machine.
>> Here is the vpp status:
>>
>> root@MYRB:~# service vpp status
>> * vpp.service - vector packet processing engine
>>    Loaded: loaded (/lib/systemd/system/vpp.service; disabled; vendor preset: enabled)
>>    Active: failed (Result: signal) since Mon 2018-05-28 11:35:03 +0130; 37s ago
>>  Process: 32838 ExecStopPost=/bin/rm -f /dev/shm/db /dev/shm/global_vm /dev/shm/vpe-api (code=exited, status=0/SUCCESS)
>>  Process: 31754 ExecStart=/usr/bin/vpp -c /etc/vpp/startup.conf (code=killed, signal=ABRT)
>>  Process: 31750 ExecStartPre=/sbin/modprobe uio_pci_generic (code=exited, status=0/SUCCESS)
>>  Process: 31747 ExecStartPre=/bin/rm -f /dev/shm/db /dev/shm/global_vm /dev/shm/vpe-api (code=exited, status=0/SUCCESS)
>> Main PID: 31754 (code=killed, signal=ABRT)
>>
>> May 28 16:32:47 MYRB vnet[31754]: acl_fa_node_fn:210: BUG: session LSB16(sw_if_index) and 5-tuple collision!
>> May 28 16:35:02 MYRB vnet[31754]: unix_signal_handler:124: received signal SIGCONT, PC 0x7f1fb591cac0
>> May 28 16:35:02 MYRB vnet[31754]: received SIGTERM, exiting...
>> May 28 16:35:02 MYRB systemd[1]: Stopping vector packet processing engine...
>> May 28 16:35:02 MYRB vnet[31754]: unix_signal_handler:124: received signal SIGTERM, PC 0x7f1fb3c40867
>> May 28 16:35:03 MYRB vpp[31754]: vlib_worker_thread_barrier_sync_int: worker thread deadlock
>> May 28 16:35:03 MYRB systemd[1]: vpp.service:
[vpp-dev] vpp's memory is leaking
Hi all,

My version is 17.04. I encountered a memory leak problem: the RES memory of VPP is increasing slowly and continuously. I shut down all interfaces and set breakpoints on memory allocation functions such as malloc, calloc, realloc, mmap, vmalloc, clib_mem_alloc and mheap_alloc_with_flags. The program is still running continuously and the RES memory is also still increasing; any guidance?

Regards

root@ubuntu:/home/wangzy# top -c |grep vpp
4499 root 20 0 5.000t 1.207g 197808 S 201.0 31.4 26:30.57 /usr/bin/vpp -c /etc/vpp/startup.conf
4499 root 20 0 5.000t 1.209g 197808 S 201.7 31.5 26:36.62 /usr/bin/vpp -c /etc/vpp/startup.conf
4499 root 20 0 5.000t 1.209g 197808 t 3.3 31.5 26:36.72 /usr/bin/vpp -c /etc/vpp/startup.conf
4499 root 20 0 5.000t 1.209g 197808 S 115.0 31.5 26:40.18 /usr/bin/vpp -c /etc/vpp/startup.conf
4499 root 20 0 5.000t 1.209g 197808 S 201.0 31.5 26:46.23 /usr/bin/vpp -c /etc/vpp/startup.conf
4499 root 20 0 5.000t 1.209g 197808 S 200.7 31.5 26:52.27 /usr/bin/vpp -c /etc/vpp/startup.conf
4499 root 20 0 5.000t 1.209g 197808 S 201.3 31.5 26:58.31 /usr/bin/vpp -c /etc/vpp/startup.conf
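When chasing a RES increase like this, it helps to sample the process's resident set size over time rather than breaking on allocators (RES can grow without any new allocation, e.g. when the kernel backs already-mapped heap pages with transparent hugepages). A minimal Linux-only sketch that reads VmRSS from /proc/self/status (the same accounting top's RES column is based on):

```c
#include <stdio.h>

/* Read the resident set size (VmRSS, in kB) of the current process
 * from /proc/self/status. Returns -1 on failure. For another process,
 * the same works with /proc/<pid>/status. */
static long read_vmrss_kb(void)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    long kb = -1;

    if (!f)
        return -1;

    while (fgets(line, sizeof line, f)) {
        /* The leading whitespace in the format skips the tab after the
         * field name; sscanf returns 1 once the value is parsed. */
        if (sscanf(line, "VmRSS: %ld kB", &kb) == 1)
            break;
    }
    fclose(f);
    return kb;
}
```

Sampling this periodically (e.g. once a minute) and logging the delta tells you whether growth is steady or steps up with specific events, which narrows down whether it is an allocator leak or kernel-side page accounting such as THP promotion.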
Re: [vpp-dev] Rx stuck to 0 after a while
Dear Andrew

I tested your patch and my problem still exists, but my service status changed and now there isn't any information about the deadlock problem. Do you have any idea about how I can provide you more information?

root@MYRB:~# service vpp status
* vpp.service - vector packet processing engine
   Loaded: loaded (/lib/systemd/system/vpp.service; disabled; vendor preset: enabled)
   Active: inactive (dead)

May 29 09:27:06 MYRB /usr/bin/vpp[30805]: load_one_vat_plugin:67: Loaded plugin: udp_ping_test_plugin.so
May 29 09:27:06 MYRB /usr/bin/vpp[30805]: load_one_vat_plugin:67: Loaded plugin: stn_test_plugin.so
May 29 09:27:06 MYRB vpp[30805]: /usr/bin/vpp[30805]: dpdk: EAL init args: -c 1ff -n 4 --huge-dir /run/vpp/hugepages --file-prefix vpp -w :08:00.0 -w :08:00.1 -w :08
May 29 09:27:06 MYRB /usr/bin/vpp[30805]: dpdk: EAL init args: -c 1ff -n 4 --huge-dir /run/vpp/hugepages --file-prefix vpp -w :08:00.0 -w :08:00.1 -w :08:00.2 -w 000
May 29 09:27:07 MYRB vnet[30805]: dpdk_ipsec_process:1012: not enough DPDK crypto resources, default to OpenSSL
May 29 09:27:13 MYRB vnet[30805]: unix_signal_handler:124: received signal SIGCONT, PC 0x7fa535dfbac0
May 29 09:27:13 MYRB vnet[30805]: received SIGTERM, exiting...
May 29 09:27:13 MYRB systemd[1]: Stopping vector packet processing engine...
May 29 09:27:13 MYRB vnet[30805]: unix_signal_handler:124: received signal SIGTERM, PC 0x7fa534121867
May 29 09:27:13 MYRB systemd[1]: Stopped vector packet processing engine.

From: Andrew Yourtchenko
Sent: Monday, May 28, 2018 5:58 PM
To: Rubina Bianchi
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Rx stuck to 0 after a while

Dear Rubina,

Thanks for catching and reporting this!

I suspect what might be happening is my recent change of using two unidirectional sessions in bihash vs. the single one triggered a race, whereby as the owning worker is deleting the session, the non-owning worker is trying to update it. That would logically explain the "BUG: .."
line (since you don't change the interfaces nor move the traffic around, the 5-tuples should not collide), and as well the later stop.

To take care of this issue, I think I will split the deletion of the session into two stages:
1) deactivation of the bihash entries that steer the traffic
2) freeing up the per-worker session structure

and have a little pause in between these two so that the workers-in-progress can finish updating the structures.

The below gerrit is the first cut:

https://gerrit.fd.io/r/#/c/12770/

It passes the make test right now but I did not kick its tires too much yet, will do tomorrow.

You can try this change out in your test setup as well and tell me how it feels.

--a

On 5/28/18, Rubina Bianchi wrote:
> Hi
>
> I ran vpp v18.07-rc0~237-g525c9d0f with only 2 interfaces in stateful acl
> (permit+reflect) and generated sfr traffic using trex v2.27. My rx
> becomes 0 after a short while, about 300 sec on my machine. Here is the vpp
> status:
>
> root@MYRB:~# service vpp status
> * vpp.service - vector packet processing engine
>    Loaded: loaded (/lib/systemd/system/vpp.service; disabled; vendor preset: enabled)
>    Active: failed (Result: signal) since Mon 2018-05-28 11:35:03 +0130; 37s ago
>  Process: 32838 ExecStopPost=/bin/rm -f /dev/shm/db /dev/shm/global_vm /dev/shm/vpe-api (code=exited, status=0/SUCCESS)
>  Process: 31754 ExecStart=/usr/bin/vpp -c /etc/vpp/startup.conf (code=killed, signal=ABRT)
>  Process: 31750 ExecStartPre=/sbin/modprobe uio_pci_generic (code=exited, status=0/SUCCESS)
>  Process: 31747 ExecStartPre=/bin/rm -f /dev/shm/db /dev/shm/global_vm /dev/shm/vpe-api (code=exited, status=0/SUCCESS)
> Main PID: 31754 (code=killed, signal=ABRT)
>
> May 28 16:32:47 MYRB vnet[31754]: acl_fa_node_fn:210: BUG: session LSB16(sw_if_index) and 5-tuple collision!
> May 28 16:35:02 MYRB vnet[31754]: unix_signal_handler:124: received signal SIGCONT, PC 0x7f1fb591cac0
> May 28 16:35:02 MYRB vnet[31754]: received SIGTERM, exiting...
> May 28 16:35:02 MYRB systemd[1]: Stopping vector packet processing engine...
> May 28 16:35:02 MYRB vnet[31754]: unix_signal_handler:124: received signal SIGTERM, PC 0x7f1fb3c40867
> May 28 16:35:03 MYRB vpp[31754]: vlib_worker_thread_barrier_sync_int: worker thread deadlock
> May 28 16:35:03 MYRB systemd[1]: vpp.service: Main process exited, code=killed, status=6/ABRT
> May 28 16:35:03 MYRB systemd[1]: Stopped vector packet processing engine.
> May 28 16:35:03 MYRB systemd[1]: vpp.service: Unit entered failed state.
> May 28 16:35:03 MYRB systemd[1]: vpp.service: Failed with result 'signal'.
>
> I attach my vpp configs to this email. I also ran this test with the same
> config and 4 interfaces added instead of two. But in this case nothing
> happened to vpp and it was functional for a long time.
>
> Thanks,
> RB