[vpp-dev] svm ASAN check error
Hi experts,

There is an svm ASAN check error. My VPP version is 21.01-rc0~394-g798267aaa.

DBGvpp# ==445==AddressSanitizer CHECK failed: ../../../../libsanitizer/asan/asan_mapping.h:377 "((AddrIsInMem(p))) != (0)" (0x0, 0x0)
    #0 0x774c158a (/root/net-base/script/../lib/libasan.so.5+0x11658a)
    #1 0x774df49a (/root/net-base/script/../lib/libasan.so.5+0x13449a)
    #2 0x774bcedc in __asan_unpoison_memory_region (/root/net-base/script/../lib/libasan.so.5+0x111edc)
    #3 0x739a201b in clib_mem_vm_map_internal /home/dev/code/net-base/.vpp-21.01-rc0/src/vppinfra/linux/mem.c:491
    #4 0x73867124 in clib_mem_vm_map_shared /home/dev/code/net-base/.vpp-21.01-rc0/src/vppinfra/mem.c:68
    #5 0x77ed6d8b in ssvm_server_init_memfd /home/dev/code/net-base/.vpp-21.01-rc0/src/svm/ssvm.c:253
    #6 0x77ed7df6 in ssvm_server_init /home/dev/code/net-base/.vpp-21.01-rc0/src/svm/ssvm.c:437
    #7 0x763d9d4c in session_vpp_event_queues_allocate /home/dev/code/net-base/.vpp-21.01-rc0/src/vnet/session/session.c:1497
    #8 0x763dd48c in session_manager_main_enable /home/dev/code/net-base/.vpp-21.01-rc0/src/vnet/session/session.c:1678
    #9 0x763dde66 in vnet_session_enable_disable /home/dev/code/net-base/.vpp-21.01-rc0/src/vnet/session/session.c:1772
    #10 0x7ffea7331d86 in vpn_client_create_command_fn /home/dev/code/net-base/vpn/client.c:210
    #11 0x740dc611 in vlib_cli_dispatch_sub_commands /home/dev/code/net-base/.vpp-21.01-rc0/src/vlib/cli.c:572
    #12 0x740dc06f in vlib_cli_dispatch_sub_commands /home/dev/code/net-base/.vpp-21.01-rc0/src/vlib/cli.c:529
    #13 0x740dc06f in vlib_cli_dispatch_sub_commands /home/dev/code/net-base/.vpp-21.01-rc0/src/vlib/cli.c:529
    #14 0x740dd166 in vlib_cli_input /home/dev/code/net-base/.vpp-21.01-rc0/src/vlib/cli.c:674
    #15 0x7428a9f6 in startup_config_process /home/dev/code/net-base/.vpp-21.01-rc0/src/vlib/unix/main.c:366
    #16 0x7418c83f in vlib_process_bootstrap /home/dev/code/net-base/.vpp-21.01-rc0/src/vlib/main.c:1477
    #17 0x7384ff0f (/root/net-base/script/../lib/libvppinfra.so.21.01.0+0xcff0f)
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffe98803700 (LWP 468)]
0x7417d5f9 in vlib_worker_thread_barrier_check () at /home/dev/code/net-base/.vpp-21.01-rc0/src/vlib/threads.h:439
439     /home/dev/code/net-base/.vpp-21.01-rc0/src/vlib/threads.h: No such file or directory.
Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.el7.x86_64 libgcc-4.8.5-16.el7.x86_64 libgcc-4.8.5-36.el7_6.1.x86_64 libstdc++-4.8.5-16.el7.x86_64 libstdc++-4.8.5-36.el7_6.1.x86_64 libuuid-2.23.2-43.el7.x86_64
(gdb) bt
#0  0x7417d5f9 in vlib_worker_thread_barrier_check () at /home/dev/code/net-base/.vpp-21.01-rc0/src/vlib/threads.h:439
#1  0x74190196 in vlib_main_or_worker_loop (vm=0x7ffeb83b01c0, is_main=0) at /home/dev/code/net-base/.vpp-21.01-rc0/src/vlib/main.c:1812
#2  0x741925e0 in vlib_worker_loop (vm=0x7ffeb83b01c0) at /home/dev/code/net-base/.vpp-21.01-rc0/src/vlib/main.c:2038
#3  0x7422a926 in vlib_worker_thread_fn (arg=0x7ffeafa1b040) at /home/dev/code/net-base/.vpp-21.01-rc0/src/vlib/threads.c:1870
#4  0x7384ff10 in clib_calljmp () at /home/dev/code/net-base/.vpp-21.01-rc0/src/vppinfra/longjmp.S:123
#5  0x7ffe98802c30 in ?? ()
#6  0x7421c75c in vlib_worker_thread_bootstrap_fn (arg=0x7ffeafa1b040) at /home/dev/code/net-base/.vpp-21.01-rc0/src/vlib/threads.c:585
#7  0x7ffeaa0a9978 in eal_thread_loop (arg=0x0) at ../src-dpdk/lib/librte_eal/linux/eal_thread.c:127
#8  0x73c14e25 in start_thread () from /lib64/libpthread.so.0
#9  0x72a3834d in clone () from /lib64/libc.so.6
(gdb)

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#18031): https://lists.fd.io/g/vpp-dev/message/18031
Mute This Topic: https://lists.fd.io/mt/78245228/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] Frequently updated gauge (scalar)?
> On 14 Nov 2020, at 00:44, Christian Hopps wrote:
>
> Yeah, I ended up doing this. Would be nice to have access to a "double" vs "u64" type, but using vlib_set_counter works well enough for me.

From the perspective of the stat infra it doesn't much matter, as long as it's 64-bit aligned and updated atomically.

Cheers,
Ole

> Thanks,
> Chris.
>
>> On Nov 13, 2020, at 5:16 PM, otr...@employees.org wrote:
>>
>> Hi Christian,
>>
>> Absolutely. The periodic gauge callback is purely there for sources that don't provide the counters themselves, e.g. used for polling memory heaps, free buffers and so on.
>>
>> You can just use a normal counter (counter.c) for your high-frequency gauge.
>>
>> Cheers,
>> Ole
>>
>>> On 13 Nov 2020, at 20:20, Christian Hopps wrote:
>>>
>>> I need to track a frequently updated value (pps rate during congestion control), and I need the stat updated very frequently (basically whenever I change the value, which is based on RTT). It seems with the current gauge code you supply a callback which updates the counter (at a default of every 10 seconds).
>>>
>>> Is there something about the scalar stat that's different from the counter stat that requires this infrequent updating? Is it possible to "register" a scalar stat that I can update directly and very frequently (instead of supplying a callback)?
>>>
>>> Thanks,
>>> Chris.

View/Reply Online (#18030): https://lists.fd.io/g/vpp-dev/message/18030
Re: [vpp-dev] Frequently updated gauge (scalar)?
Yeah, I ended up doing this. Would be nice to have access to a "double" vs "u64" type, but using vlib_set_counter works well enough for me.

Thanks,
Chris.

> On Nov 13, 2020, at 5:16 PM, otr...@employees.org wrote:
>
> Hi Christian,
>
> Absolutely. The periodic gauge callback is purely there for sources that don't provide the counters themselves, e.g. used for polling memory heaps, free buffers and so on.
>
> You can just use a normal counter (counter.c) for your high-frequency gauge.
>
> Cheers,
> Ole
>
>> On 13 Nov 2020, at 20:20, Christian Hopps wrote:
>>
>> I need to track a frequently updated value (pps rate during congestion control), and I need the stat updated very frequently (basically whenever I change the value, which is based on RTT). It seems with the current gauge code you supply a callback which updates the counter (at a default of every 10 seconds).
>>
>> Is there something about the scalar stat that's different from the counter stat that requires this infrequent updating? Is it possible to "register" a scalar stat that I can update directly and very frequently (instead of supplying a callback)?
>>
>> Thanks,
>> Chris.

View/Reply Online (#18029): https://lists.fd.io/g/vpp-dev/message/18029
Re: [vpp-dev] Frequently updated gauge (scalar)?
Hi Christian,

Absolutely. The periodic gauge callback is purely there for sources that don't provide the counters themselves, e.g. used for polling memory heaps, free buffers and so on.

You can just use a normal counter (counter.c) for your high-frequency gauge.

Cheers,
Ole

> On 13 Nov 2020, at 20:20, Christian Hopps wrote:
>
> I need to track a frequently updated value (pps rate during congestion control), and I need the stat updated very frequently (basically whenever I change the value, which is based on RTT). It seems with the current gauge code you supply a callback which updates the counter (at a default of every 10 seconds).
>
> Is there something about the scalar stat that's different from the counter stat that requires this infrequent updating? Is it possible to "register" a scalar stat that I can update directly and very frequently (instead of supplying a callback)?
>
> Thanks,
> Chris.

View/Reply Online (#18028): https://lists.fd.io/g/vpp-dev/message/18028
Re: Handoff design issues [Re: RES: RES: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?]
This is a typical problem one would face with a pipeline mode of processing packets, i.e. the overall performance of the pipeline equals the performance of the lowest-performing stage in the pipeline. Having a bigger queue would help handle a burst, or might solve the problem for a given platform and traffic profile. One option could be to run the lowest-performing stage in multiple threads/CPUs, but then the previous stage needs to distribute the packets evenly.

Thanks,
Honnappa

> -----Original Message-----
> From: vpp-dev@lists.fd.io On Behalf Of Christian Hopps via lists.fd.io
> Sent: Friday, November 13, 2020 3:47 PM
> To: Marcos - Mgiga
> Cc: Christian Hopps ; Klement Sekera ; Elias Rudberg ; vpp-dev@lists.fd.io
> Subject: Handoff design issues [Re: RES: RES: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?]
>
> FWIW, I too have hit this issue. Basically VPP is designed to process a packet from rx to tx in the same thread. When downstream nodes run slower, the upstream rx node doesn't run, so the vector size in each frame naturally increases, and then the downstream nodes can benefit from "V" (i.e., processing multiple packets in one go).
>
> This back-pressure from downstream does not occur when you hand off from a fast thread to a slower thread, so you end up with many single-packet frames and fill your hand-off queue.
>
> The quick fix one tries then is to increase the queue size; however, this is not a great solution b/c you are still not taking advantage of the "V" in VPP. To really fit this back into the original design one needs to somehow still be creating larger vectors in the hand-off frames.
>
> TBH I think the right solution here is to not hand off frames, and instead switch to packet queues; then on the handed-off side the frames would get constructed from the packet queues (basically creating another polling input node, but on the new thread).
>
> Thanks,
> Chris.
>
>> On Nov 13, 2020, at 12:21 PM, Marcos - Mgiga wrote:
>>
>> Understood. And what path did you take in order to analyse and monitor vector rates? Is there some specific command or log?
>>
>> Thanks
>>
>> Marcos
>>
>> -----Original Message-----
>> From: vpp-dev@lists.fd.io On Behalf Of ksekera via []
>> Sent: Friday, 13 November 2020 14:02
>> To: Marcos - Mgiga
>> Cc: Elias Rudberg ; vpp-dev@lists.fd.io
>> Subject: Re: RES: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?
>>
>> Not completely idle, more like medium load. Vector rates at which I saw congestion drops were roughly 40 for the thread doing no work (just handoffs - I hardcoded it this way for test purposes), and roughly 100 for the thread picking up the packets and doing NAT.
>>
>> What got me into infra investigation was the fact that once I was hitting vector rates around 255, I did see packet drops, but no congestion drops.
>>
>> HTH,
>> Klement
>>
>>> On 13 Nov 2020, at 17:51, Marcos - Mgiga wrote:
>>>
>>> So you mean that this situation (congestion drops) is most likely to occur when the system in general is idle than when it is processing a large amount of traffic?
>>>
>>> Best Regards
>>>
>>> Marcos
>>>
>>> -----Original Message-----
>>> From: vpp-dev@lists.fd.io On Behalf Of Klement Sekera via lists.fd.io
>>> Sent: Friday, 13 November 2020 12:15
>>> To: Elias Rudberg
>>> Cc: vpp-dev@lists.fd.io
>>> Subject: Re: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?
>>>
>>> Hi Elias,
>>>
>>> I've already debugged this and came to the conclusion that it's the infra which is the weak link. I was seeing congestion drops at mild load, but not at full load. The issue is that with handoff, there is uneven workload. For simplicity's sake, just consider thread 1 handing off all the traffic to thread 2. What happens is that for thread 1 the job is much easier: it just does some ip4 parsing and then hands the packet to thread 2, which actually does the heavy lifting of hash inserts/lookups/translation etc. A 64-element queue can hold 64 frames; one extreme is 64 1-packet frames, totalling 64 packets, the other extreme is 64 255-packet frames, totalling ~16k packets. What happens is this: thread 1 is mostly idle, just picking a few packets from the NIC, and every one of these small frames creates an entry in the handoff queue. Now thread 2 picks one element from the handoff queue and deals with it before picking another one. If the queue has only 3-packet or 10-packet elements, then thread 2 can never really get into what VPP excels in - bulk processing.
>>>
>>> Q: Why doesn't it pick as many packets as possible from the handoff queue?
>>> A: It's not implemented.
>>>
>>> I already wrote a patch for it, which made all congestion drops which I saw (in above
Handoff design issues [Re: RES: RES: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?]
FWIW, I too have hit this issue. Basically VPP is designed to process a packet from rx to tx in the same thread. When downstream nodes run slower, the upstream rx node doesn't run, so the vector size in each frame naturally increases, and then the downstream nodes can benefit from "V" (i.e., processing multiple packets in one go).

This back-pressure from downstream does not occur when you hand off from a fast thread to a slower thread, so you end up with many single-packet frames and fill your hand-off queue.

The quick fix one tries then is to increase the queue size; however, this is not a great solution b/c you are still not taking advantage of the "V" in VPP. To really fit this back into the original design one needs to somehow still be creating larger vectors in the hand-off frames.

TBH I think the right solution here is to not hand off frames, and instead switch to packet queues; then on the handed-off side the frames would get constructed from the packet queues (basically creating another polling input node, but on the new thread).

Thanks,
Chris.

> On Nov 13, 2020, at 12:21 PM, Marcos - Mgiga wrote:
>
> Understood. And what path did you take in order to analyse and monitor vector rates? Is there some specific command or log?
>
> Thanks
>
> Marcos
>
> -----Original Message-----
> From: vpp-dev@lists.fd.io On Behalf Of ksekera via []
> Sent: Friday, 13 November 2020 14:02
> To: Marcos - Mgiga
> Cc: Elias Rudberg ; vpp-dev@lists.fd.io
> Subject: Re: RES: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?
>
> Not completely idle, more like medium load. Vector rates at which I saw congestion drops were roughly 40 for the thread doing no work (just handoffs - I hardcoded it this way for test purposes), and roughly 100 for the thread picking up the packets and doing NAT.
>
> What got me into infra investigation was the fact that once I was hitting vector rates around 255, I did see packet drops, but no congestion drops.
>
> HTH,
> Klement
>
>> On 13 Nov 2020, at 17:51, Marcos - Mgiga wrote:
>>
>> So you mean that this situation (congestion drops) is most likely to occur when the system in general is idle than when it is processing a large amount of traffic?
>>
>> Best Regards
>>
>> Marcos
>>
>> -----Original Message-----
>> From: vpp-dev@lists.fd.io On Behalf Of Klement Sekera via lists.fd.io
>> Sent: Friday, 13 November 2020 12:15
>> To: Elias Rudberg
>> Cc: vpp-dev@lists.fd.io
>> Subject: Re: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?
>>
>> Hi Elias,
>>
>> I've already debugged this and came to the conclusion that it's the infra which is the weak link. I was seeing congestion drops at mild load, but not at full load. The issue is that with handoff, there is uneven workload. For simplicity's sake, just consider thread 1 handing off all the traffic to thread 2. What happens is that for thread 1 the job is much easier: it just does some ip4 parsing and then hands the packet to thread 2, which actually does the heavy lifting of hash inserts/lookups/translation etc. A 64-element queue can hold 64 frames; one extreme is 64 1-packet frames, totalling 64 packets, the other extreme is 64 255-packet frames, totalling ~16k packets. What happens is this: thread 1 is mostly idle, just picking a few packets from the NIC, and every one of these small frames creates an entry in the handoff queue. Now thread 2 picks one element from the handoff queue and deals with it before picking another one. If the queue has only 3-packet or 10-packet elements, then thread 2 can never really get into what VPP excels in - bulk processing.
>>
>> Q: Why doesn't it pick as many packets as possible from the handoff queue?
>> A: It's not implemented.
>>
>> I already wrote a patch for it, which made all congestion drops which I saw (in above synthetic test case) disappear. Mentioned patch https://gerrit.fd.io/r/c/vpp/+/28980 is sitting in gerrit.
>>
>> Would you like to give it a try and see if it helps your issue? We shouldn't need big queues under mild loads anyway ...
>>
>> Regards,
>> Klement
>>
>>> On 13 Nov 2020, at 16:03, Elias Rudberg wrote:
>>>
>>> Hello VPP experts,
>>>
>>> We are using VPP for NAT44 and we get some "congestion drops", in a situation where we think VPP is far from overloaded in general. Then we started to investigate if it would help to use a larger handoff frame queue size. In theory at least, allowing a longer queue could help avoid drops in case of short spikes of traffic, or if it happens that some worker thread is temporarily busy for whatever reason.
>>>
>>> The NAT worker handoff frame queue size is hard-coded in the NAT_FQ_NELTS macro in src/plugins/nat/nat.h where the current value is 64. The idea is that putting a larger value there
Re: [vpp-dev] Python API modules
Hi Paul,

Thank you for your answer. I am sure that anyone who uses or maintains this project is very competent, and as a beginner I'm grateful for that; hopefully I will be able to help it get even bigger.

Well, my goal is just to build a dashboard to make the process of managing VPP easier; maybe you can help me out with that. If you are interested, email me and tell me the best way to keep in touch.

Best regards

-------- Original message --------
From: Paul Vinciguerra
Date: 11/13/20 17:29 (GMT-03:00)
To: "Vratko Polak -X (vrpolak - PANTHEON TECHNOLOGIES at Cisco)"
Cc: Marcos - Mgiga , vpp-dev , Ole Troan
Subject: Re: [vpp-dev] Python API modules

> Hi Marcos.
>
> Yes. Your assumption is correct. A library user expects access to the library.
>
> I'm going to ask the question that you haven't: why won't Ole, as maintainer, allow it?
>
> The build system uses something called vppapigen to generate the C include files and the JSON files. It could as easily generate static stubs so that development with Python could be usable without a running VPP instance.
>
> The thing is that there are other trivial issues that need to be addressed first. Wouldn't you love to be able to do 'pip install vpp_papi' and get going? Yes, vpp builds artifacts for vpp_api_python, but who actually wants vpp_api installed globally? The Python best practice is to install a venv and do pip install. Even 'make test' follows this pattern.
>
> vpp_papi has changed significantly, yet the version number is never, ever increased. To address this, on April 5, 2019, I attempted to automate versioning of vpp_papi to the git tag: https://gerrit.fd.io/r/c/vpp/+/18694. There is a version on pypi, it is woefully out of date, and honestly not usable.
>
> On June 18, 2019, I contributed https://gerrit.fd.io/r/c/vpp/+/20035, to automate the process of updating pypi with a simple command, 'tox -epypi'; see src/vpp-api/python/tox.ini.
>
> I have also floated the idea of moving vpp_papi to a submodule, so that it could be easily developed against if pypi were too burdensome to add to the release process. Just include the submodule in your code, pin it to any commit you like, and you're off to the races!
>
> To be 100% clear, this is not an attack on Ole. I have nothing but respect for him. He is extremely talented and has been *extremely* generous with his time to me. As for me, as anyone here can tell you, I'm not a C programmer. The only reason I ever considered touching the C code was because of Ole's help and guidance and his suggestion to fix the API instead of conforming the test. He/Vratko removed me as one of the maintainers of papi: https://gerrit.fd.io/r/c/vpp/+/22672. I'm cool with not being a maintainer; kinda funny that it was stuffed into another changeset. I have been called out repeatedly for submitting unrelated changes ;)
>
> All my changesets are still out there, so that should others become blocked, they can still get work done. 'git-review -X 12345' is your friend. (If you use -X instead of -x, you can remove the changeset with 'git rebase -i' if you want to submit a contribution.)
>
> In that spirit, if you want a Python module with static methods, let me know.
>
> Paul
>
> On Fri, Nov 13, 2020 at 10:57 AM Vratko Polak -X (vrpolak - PANTHEON TECHNOLOGIES at Cisco) wrote:
>
>> > All available api are represented in python.
>>
>> That is one part of it. Not every "available" message is usable, as some messages need corresponding VPP plugins, which may be disabled.
>>
>> >> all available Python modules to use in VPP API.
>>
>> It depends on what you mean by "Python module". If you mean "usable messages", you can examine the message table [0] after PAPI has connected to a running VPP.
>>
>> Vratko.
>>
>> [0] https://github.com/FDio/vpp/blob/66d10589f412d11841c4c8adc0a498b5527e88cb/src/vpp-api/python/vpp_papi/vpp_papi.py#L834-L836
>>
>> From: vpp-dev@lists.fd.io On Behalf Of Ole Troan
>> Sent: Friday, 2020-November-13 15:51
>> To: Marcos - Mgiga
>> Cc: vpp-dev
>> Subject: Re: [vpp-dev] Python API modules
>>
>> Hi Marcos,
>>
>>> On 13 Nov 2020, at 15:08, Marcos - Mgiga wrote:
>>>
>>> Hello There,
>>>
>>> I believe this is a trivial question, but where / how can I get a list of all available Python modules to use in VPP API.
>>
>> You can't. They are auto-generated from the available JSON representations of .api files. All available api are represented in python.
>>
>> Cheers,
>> Ole

View/Reply Online (#18025): https://lists.fd.io/g/vpp-dev/message/18025
Re: [vpp-dev] Python API modules
Hi Marcos.

Yes. Your assumption is correct. A library user expects access to the library.

I'm going to ask the question that you haven't: why won't Ole, as maintainer, allow it?

The build system uses something called vppapigen to generate the C include files and the JSON files. It could as easily generate static stubs so that development with Python could be usable without a running VPP instance.

The thing is that there are other trivial issues that need to be addressed first. Wouldn't you love to be able to do 'pip install vpp_papi' and get going? Yes, vpp builds artifacts for vpp_api_python, but who actually wants vpp_api installed globally? The Python best practice is to install a venv and do pip install. Even 'make test' follows this pattern.

vpp_papi has changed significantly, yet the version number is never, ever increased. To address this, on April 5, 2019, I attempted to automate versioning of vpp_papi to the git tag: https://gerrit.fd.io/r/c/vpp/+/18694. There is a version on pypi, it is woefully out of date, and honestly not usable.

On June 18, 2019, I contributed https://gerrit.fd.io/r/c/vpp/+/20035, to automate the process of updating pypi with a simple command, 'tox -epypi'; see src/vpp-api/python/tox.ini.

I have also floated the idea of moving vpp_papi to a submodule, so that it could be easily developed against if pypi were too burdensome to add to the release process. Just include the submodule in your code, pin it to any commit you like, and you're off to the races!

To be 100% clear, this is not an attack on Ole. I have nothing but respect for him. He is extremely talented and has been *extremely* generous with his time to me. As for me, as anyone here can tell you, I'm not a C programmer. The only reason I ever considered touching the C code was because of Ole's help and guidance and his suggestion to fix the API instead of conforming the test. He/Vratko removed me as one of the maintainers of papi: https://gerrit.fd.io/r/c/vpp/+/22672. I'm cool with not being a maintainer; kinda funny that it was stuffed into another changeset. I have been called out repeatedly for submitting unrelated changes ;)

All my changesets are still out there, so that should others become blocked, they can still get work done. 'git-review -X 12345' is your friend. (If you use -X instead of -x, you can remove the changeset with 'git rebase -i' if you want to submit a contribution.)

In that spirit, if you want a Python module with static methods, let me know.

Paul

On Fri, Nov 13, 2020 at 10:57 AM Vratko Polak -X (vrpolak - PANTHEON TECHNOLOGIES at Cisco) wrote:

> > All available api are represented in python.
>
> That is one part of it. Not every "available" message is usable, as some messages need corresponding VPP plugins, which may be disabled.
>
> >> all available Python modules to use in VPP API.
>
> It depends on what you mean by "Python module". If you mean "usable messages", you can examine the message table [0] after PAPI has connected to a running VPP.
>
> Vratko.
>
> [0] https://github.com/FDio/vpp/blob/66d10589f412d11841c4c8adc0a498b5527e88cb/src/vpp-api/python/vpp_papi/vpp_papi.py#L834-L836
>
> From: vpp-dev@lists.fd.io On Behalf Of Ole Troan
> Sent: Friday, 2020-November-13 15:51
> To: Marcos - Mgiga
> Cc: vpp-dev
> Subject: Re: [vpp-dev] Python API modules
>
> Hi Marcos,
>
>> On 13 Nov 2020, at 15:08, Marcos - Mgiga wrote:
>>
>> Hello There,
>>
>> I believe this is a trivial question, but where / how can I get a list of all available Python modules to use in VPP API.
>
> You can't. They are auto-generated from the available JSON representations of .api files. All available api are represented in python.
>
> Cheers,
> Ole

View/Reply Online (#18024): https://lists.fd.io/g/vpp-dev/message/18024
[vpp-dev] Frequently updated gauge (scalar)?
I need to track a frequently updated value (pps rate during congestion control), and I need the stat updated very frequently (basically whenever I change the value, which is based on RTT). It seems with the current gauge code you supply a callback which updates the counter (at a default of every 10 seconds).

Is there something about the scalar stat that's different from the counter stat that requires this infrequent updating? Is it possible to "register" a scalar stat that I can update directly and very frequently (instead of supplying a callback)?

Thanks,
Chris.

View/Reply Online (#18023): https://lists.fd.io/g/vpp-dev/message/18023
RES: RES: RES: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?
Thanks. Do you see reducing the number of VPP threads as an option to work around this issue, since you would probably increase the vector rate per thread?

Best Regards

-----Original Message-----
From: vpp-dev@lists.fd.io On Behalf Of Klement Sekera via lists.fd.io
Sent: Friday, 13 November 2020 14:26
To: Marcos - Mgiga
Cc: Elias Rudberg ; vpp-dev
Subject: Re: RES: RES: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?

I used the usual:

1. start traffic
2. clear run
3. wait n seconds (e.g. n == 10)
4. show run

Klement

> On 13 Nov 2020, at 18:21, Marcos - Mgiga wrote:
>
> Understood. And what path did you take in order to analyse and monitor vector rates? Is there some specific command or log?
>
> Thanks
>
> Marcos
>
> -----Original Message-----
> From: vpp-dev@lists.fd.io On Behalf Of ksekera via []
> Sent: Friday, 13 November 2020 14:02
> To: Marcos - Mgiga
> Cc: Elias Rudberg ; vpp-dev@lists.fd.io
> Subject: Re: RES: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?
>
> Not completely idle, more like medium load. Vector rates at which I saw congestion drops were roughly 40 for the thread doing no work (just handoffs - I hardcoded it this way for test purposes), and roughly 100 for the thread picking up the packets and doing NAT.
>
> What got me into infra investigation was the fact that once I was hitting vector rates around 255, I did see packet drops, but no congestion drops.
>
> HTH,
> Klement
>
>> On 13 Nov 2020, at 17:51, Marcos - Mgiga wrote:
>>
>> So you mean that this situation (congestion drops) is most likely to occur when the system in general is idle than when it is processing a large amount of traffic?
>>
>> Best Regards
>>
>> Marcos
>>
>> -----Original Message-----
>> From: vpp-dev@lists.fd.io On Behalf Of Klement Sekera via lists.fd.io
>> Sent: Friday, 13 November 2020 12:15
>> To: Elias Rudberg
>> Cc: vpp-dev@lists.fd.io
>> Subject: Re: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?
>>
>> Hi Elias,
>>
>> I've already debugged this and came to the conclusion that it's the infra which is the weak link. I was seeing congestion drops at mild load, but not at full load. The issue is that with handoff, there is uneven workload. For simplicity's sake, just consider thread 1 handing off all the traffic to thread 2. What happens is that for thread 1 the job is much easier: it just does some ip4 parsing and then hands the packet to thread 2, which actually does the heavy lifting of hash inserts/lookups/translation etc. A 64-element queue can hold 64 frames; one extreme is 64 1-packet frames, totalling 64 packets, the other extreme is 64 255-packet frames, totalling ~16k packets. What happens is this: thread 1 is mostly idle, just picking a few packets from the NIC, and every one of these small frames creates an entry in the handoff queue. Now thread 2 picks one element from the handoff queue and deals with it before picking another one. If the queue has only 3-packet or 10-packet elements, then thread 2 can never really get into what VPP excels in - bulk processing.
>>
>> Q: Why doesn't it pick as many packets as possible from the handoff queue?
>> A: It's not implemented.
>>
>> I already wrote a patch for it, which made all congestion drops which I saw (in above synthetic test case) disappear. Mentioned patch https://gerrit.fd.io/r/c/vpp/+/28980 is sitting in gerrit.
>>
>> Would you like to give it a try and see if it helps your issue? We shouldn't need big queues under mild loads anyway ...
>>
>> Regards,
>> Klement
>>
>>> On 13 Nov 2020, at 16:03, Elias Rudberg wrote:
>>>
>>> Hello VPP experts,
>>>
>>> We are using VPP for NAT44 and we get some "congestion drops", in a situation where we think VPP is far from overloaded in general. Then we started to investigate if it would help to use a larger handoff frame queue size. In theory at least, allowing a longer queue could help avoid drops in case of short spikes of traffic, or if it happens that some worker thread is temporarily busy for whatever reason.
>>>
>>> The NAT worker handoff frame queue size is hard-coded in the NAT_FQ_NELTS macro in src/plugins/nat/nat.h where the current value is 64. The idea is that putting a larger value there could help.
>>>
>>> We have run some tests where we changed the NAT_FQ_NELTS value from 64 to a range of other values, each time rebuilding VPP and running an identical test, a test case that is to some extent trying to mimic our real traffic, although of course it is simplified. The test runs many iperf3 tests simultaneously using TCP, combined with some UDP traffic chosen to trigger VPP to create more new sessions (to make the NAT
Re: [vpp-dev] VPP-1946
Hi Paul,

> I just wanted to take a sec and say thank you to Damjan/Ben for fixing
> VPP-1946. I documented an issue, and it was fixed in 2 days. That was an
> example of a great user experience!

Thanks for the kind words; your contribution is appreciated too: testing and filing meaningful bug reports is important.

Also, if you happen to use VPP on ARM, I'd be curious to know whether you see any performance improvement from the patch: if I understand the build logic correctly, the generic ARM package for VPP is built for a 128-byte cache line. Prior to the patch, vlib_buffer_t used three 128-byte cache lines in that scenario. After the patch, it uses two 128-byte cache lines instead (we save one cache line per packet). Note that the improvement should be limited to NAT and QoS scenarios, which access this cache line. It should not impact e.g. forwarding as far as I can tell.

Best,
Ben

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#18021): https://lists.fd.io/g/vpp-dev/message/18021
Mute This Topic: https://lists.fd.io/mt/78233745/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-
Re: RES: RES: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?
I used the usual:
1. start traffic
2. clear run
3. wait n seconds (e.g. n == 10)
4. show run

Klement

> On 13 Nov 2020, at 18:21, Marcos - Mgiga wrote:
> [...]
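For reference, Klement's four steps map onto the VPP debug CLI like this. This is only a sketch of a measurement session: it assumes a running VPP instance reachable via vppctl, and note that "clear run" / "show run" in the steps above are the usual abbreviations of "clear runtime" / "show runtime".

```shell
# Sketch: measure per-node vector rates over a 10-second window.
# Assumes traffic is already flowing (step 1) and vppctl can reach VPP.
vppctl clear runtime    # step 2: reset per-node counters
sleep 10                # step 3: wait n seconds (n == 10 here)
vppctl show runtime     # step 4: vectors/call per node since the clear
```

The "Vectors/Call" column of "show runtime" is the vector rate discussed in this thread; values approaching 255 (the maximum frame size) indicate a saturated node.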
RES: RES: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?
Understood. And what path did you take to analyse and monitor vector rates? Is there some specific command or log?

Thanks

Marcos

-----Original Message-----
From: vpp-dev@lists.fd.io On Behalf Of ksekera via lists.fd.io
Sent: Friday, 13 November 2020 14:02
To: Marcos - Mgiga
Cc: Elias Rudberg; vpp-dev@lists.fd.io
Subject: Re: RES: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?

Not completely idle, more like medium load. The vector rates at which I saw congestion drops were roughly 40 for the thread doing no work (just handoffs - I hardcoded it this way for test purposes), and roughly 100 for the thread picking up the packets and doing NAT.

What got me into the infra investigation was the fact that once I was hitting vector rates around 255, I did see packet drops, but no congestion drops.

HTH,
Klement

> On 13 Nov 2020, at 17:51, Marcos - Mgiga wrote:
> [...]
[vpp-dev] VPP-1946
Hi All,

I just wanted to take a sec and say thank you to Damjan/Ben for fixing VPP-1946. I documented an issue, and it was fixed in 2 days. That was an example of a great user experience!

Thank you!
RES: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?
So you mean that this situation (congestion drops) is more likely to occur when the system is generally idle than when it is processing a large amount of traffic?

Best Regards

Marcos

-----Original Message-----
From: vpp-dev@lists.fd.io On Behalf Of Klement Sekera via lists.fd.io
Sent: Friday, 13 November 2020 12:15
To: Elias Rudberg
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?

Hi Elias,

[...]
Re: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?
>> Would you consider changing to a larger value in the official VPP code?

Maybe make it configurable? I mean, after 28980 is merged, if you still find tweaking the value helpful.

Vratko.

-----Original Message-----
From: vpp-dev@lists.fd.io On Behalf Of Klement Sekera via lists.fd.io
Sent: Friday, 2020-November-13 16:15
To: Elias Rudberg
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?

Hi Elias,

[...]
Re: [vpp-dev] Python API modules
> All available api are represented in python.

That is one part of it. Not every "available" message is usable, as some messages need corresponding VPP plugins, which may be disabled.

>> all available Python modules to use in VPP API.

It depends on what you mean by "Python module". If you mean "usable messages", you can examine the message table [0] after PAPI has connected to a running VPP.

Vratko.

[0] https://github.com/FDio/vpp/blob/66d10589f412d11841c4c8adc0a498b5527e88cb/src/vpp-api/python/vpp_papi/vpp_papi.py#L834-L836

From: vpp-dev@lists.fd.io On Behalf Of Ole Troan
Sent: Friday, 2020-November-13 15:51
To: Marcos - Mgiga
Cc: vpp-dev
Subject: Re: [vpp-dev] Python API modules

Hi Marcos,

> On 13 Nov 2020, at 15:08, Marcos - Mgiga wrote:
>
> I believe this is a trivial question, but where / how can I get a list of
> all available Python modules to use in VPP API.

You can't. They are auto generated from the available json representations of .api files. All available api are represented in python.

Cheers,
Ole
Re: [vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?
Hi Elias,

I’ve already debugged this and came to the conclusion that it’s the infra which is the weak link. I was seeing congestion drops at mild load, but not at full load. The issue is that with handoff, there is uneven workload. For simplicity’s sake, just consider thread 1 handing off all the traffic to thread 2. For thread 1 the job is much easier: it just does some ip4 parsing and then hands the packet to thread 2, which actually does the heavy lifting of hash inserts/lookups/translation etc.

A 64-element queue can hold 64 frames. One extreme is 64 1-packet frames, totalling 64 packets; the other extreme is 64 255-packet frames, totalling ~16k packets. What happens is this: thread 1 is mostly idle, just picking a few packets from the NIC, and every one of these small frames creates an entry in the handoff queue. Thread 2 then picks one element from the handoff queue and deals with it before picking another one. If the queue has only 3-packet or 10-packet elements, then thread 2 can never really get into what VPP excels at - bulk processing.

Q: Why doesn’t it pick as many packets as possible from the handoff queue?
A: It’s not implemented.

I already wrote a patch for it, which made all the congestion drops I saw (in the above synthetic test case) disappear. The patch https://gerrit.fd.io/r/c/vpp/+/28980 is sitting in gerrit.

Would you like to give it a try and see if it helps your issue? We shouldn’t need big queues under mild loads anyway …

Regards,
Klement

> On 13 Nov 2020, at 16:03, Elias Rudberg wrote:
> [...]
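The queue-capacity extremes Klement describes are simple arithmetic; the sketch below is purely illustrative (it is not VPP code - the 255 figure is the largest per-frame packet count mentioned in the thread):

```python
# Back-of-the-envelope check of the handoff queue extremes described above:
# 64 queue elements, each holding one frame of between 1 and 255 packets.
QUEUE_NELTS = 64          # NAT_FQ_NELTS default from the thread
MAX_FRAME_PACKETS = 255   # largest frame size mentioned in the thread

min_packets = QUEUE_NELTS * 1                  # 64 one-packet frames
max_packets = QUEUE_NELTS * MAX_FRAME_PACKETS  # 64 full frames

print(min_packets)  # 64
print(max_packets)  # 16320, i.e. the "~16k packets" in the text
```

So the same 64-element queue represents anywhere from 64 to roughly 16k queued packets, which is why element count alone says little about how much work is actually buffered.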
[vpp-dev] Increasing NAT worker handoff frame queue size NAT_FQ_NELTS to avoid congestion drops?
Hello VPP experts,

We are using VPP for NAT44 and we get some "congestion drops" in a situation where we think VPP is far from overloaded in general. So we started to investigate whether it would help to use a larger handoff frame queue size. In theory at least, allowing a longer queue could help avoid drops in case of short spikes of traffic, or if some worker thread happens to be temporarily busy for whatever reason.

The NAT worker handoff frame queue size is hard-coded in the NAT_FQ_NELTS macro in src/plugins/nat/nat.h, where the current value is 64. The idea is that putting a larger value there could help.

We have run some tests where we changed the NAT_FQ_NELTS value from 64 to a range of other values, each time rebuilding VPP and running an identical test - a test case that to some extent tries to mimic our real traffic, although of course it is simplified. The test runs many iperf3 tests simultaneously using TCP, combined with some UDP traffic chosen to trigger VPP to create more new sessions (to make the NAT "slowpath" happen more).

The following NAT_FQ_NELTS values were tested:
16
32
64 <-- current value
128
256
512
1024
2048 <-- best performance in our tests
4096
8192
16384
32768
65536
131072

In those tests, performance was very bad for the smallest NAT_FQ_NELTS values of 16 and 32, while values larger than 64 gave improved performance. The best results in terms of throughput were seen for NAT_FQ_NELTS=2048. For even larger values, we got reduced performance compared to the 2048 case.

The tests were done for VPP 20.05 running on an Ubuntu 18.04 server with a 12-core Intel Xeon CPU and two Mellanox mlx5 network cards. The number of NAT threads was 8 in some of the tests and 4 in some of the tests.

According to these tests, the effect of changing NAT_FQ_NELTS can be quite large. For example, for one test case chosen such that congestion drops were a significant problem, the throughput increased from about 43 to 90 Gbit/second, with the number of congestion drops per second reduced to about one third. In another kind of test, throughput increased by about 20% with congestion drops reduced to zero. Of course such results depend a lot on how the tests are constructed. But anyway, it seems clear that the choice of NAT_FQ_NELTS value can be important and that increasing it would be good, at least for the kind of usage we have tested now.

Based on the above, we are considering changing NAT_FQ_NELTS from 64 to a larger value and starting to try that in our production environment (so far we have only tried it in a test environment).

Were there specific reasons for setting NAT_FQ_NELTS to 64?

Are there some potential drawbacks or dangers of changing it to a larger value?

Would you consider changing to a larger value in the official VPP code?

Best regards,
Elias
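One iteration of the edit-rebuild-retest loop Elias describes could look like the sketch below. Assumptions (not confirmed by the thread): a standard VPP 20.05 source checkout, that the macro is spelled `#define NAT_FQ_NELTS 64` in src/plugins/nat/nat.h, and that `make build-release` is the desired build target for your setup.

```shell
# Hypothetical sketch of one test iteration: bump the hard-coded
# handoff queue size, rebuild VPP, then rerun the traffic test.
cd vpp                                              # assumed checkout dir
sed -i 's/define NAT_FQ_NELTS 64/define NAT_FQ_NELTS 2048/' \
    src/plugins/nat/nat.h                           # assumed macro spelling
make build-release
# ...then restart VPP, rerun the iperf3/UDP test, and compare
# congestion drop counters ("show errors") and throughput.
```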
[vpp-dev] Python API modules
Hello There,

I believe this is a trivial question, but where / how can I get a list of all available Python modules to use in the VPP API?

Best Regards
Re: [vpp-dev] VPP load estimation
Thanks Ben for your reply. I found that we can get the VPP rate (size) information through the vpp_get_stats function.

/Joe