Hi Florin,

Thanks for the quick and very useful reply.
I’d been looking at the mp_safe flags, and had concluded that I’d need the calls I was interested in to be at least marked mp_safe. However, I was thinking that wasn’t sufficient, as it appeared that some calls marked as mp_safe invoke barrier_sync lower down the call stack. For instance, the internal functions adj_last_lock_gone(), adj_nbr_update_rewrite_internal() and vlib_node_serialize() all seem to call vlib_worker_thread_barrier_sync(), and the fix for defect VPP-892 (https://jira.fd.io/browse/VPP-892?gerritReviewStatus=All#gerrit-reviews-left-panel) involves adding barrier calls in code related to the mp_safe ADD_DEL_ROUTE (which fits with the packet loss I’d observed when testing route deletion).

I think the raw lossless packet processing which vpp has achieved on static configs is truly amazing, but what I’m trying to understand is whether it is viewed as important to achieve similar behaviour while the system is being reconfigured. Personally, I think many of the potential uses of a software dataplane include the need for limited-impact dynamic reconfiguration; however, maybe the kinds of application I have in mind are in a minority?

More than anything, given the number of areas which would likely be touched by the required changes, I wanted to understand whether there is a consensus that such a change is even needed. Thanks in advance for any insight you (or others) can offer.

Cheers,

Colin.

From: Florin Coras [mailto:fcoras.li...@gmail.com]
Sent: 22 August 2017 09:40
To: Colin Tregenza Dancer <c...@metaswitch.com>
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Packet loss on use of API & cmdline

Hi Colin,

Your assumption was right. More often than not, a binary API/CLI call results in a vlib_worker_thread_barrier_sync, because most handlers and CLIs are not mp-safe. As a consequence, vpp may experience packet loss. One way around this issue, for binary APIs, is to make sure the handler you’re interested in is thread safe and then mark it is_mp_safe in api_main. See, for instance, VL_API_IP_ADD_DEL_ROUTE.

Hope this helps,
Florin

On Aug 22, 2017, at 1:11 AM, Colin Tregenza Dancer via vpp-dev <vpp-dev@lists.fd.io> wrote:

I might have just missed it, but looking through the ongoing regression tests I can’t see anything that explicitly tests for packet loss during CLI/API commands, so I’m wondering whether minimization of packet loss during configuration is viewed as a goal for vpp?

Many (if not most) of the real-world applications I’ve been exploring require the ability to reconfigure live systems without impacting the existing flows related to stable elements (route updates, tunnel add/remove, VM addition/removal), and it would be great to understand how this fits with vpp use cases.

Thanks again,

Colin.

From: vpp-dev-boun...@lists.fd.io [mailto:vpp-dev-boun...@lists.fd.io] On Behalf Of Colin Tregenza Dancer via vpp-dev
Sent: 19 August 2017 12:17
To: vpp-dev@lists.fd.io
Subject: [vpp-dev] Packet loss on use of API & cmdline

Hi,

I’ve been doing some prototyping and load testing of the vpp dataplane, and have observed packet loss when I issue API requests or use the debug command line. Is this to be expected given the use of the worker_thread_barrier, or might there be some way I could improve matters?

Currently I’m running a fairly modest 2Mpps throughput between a pair of 10G ports on an Intel X520 NIC, on baremetal Ubuntu 16 and vpp 17.01.

Thanks in advance,

Colin.
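To make the loss mechanism discussed above concrete, here is a minimal sketch of the stop-the-world pattern that non-mp-safe handlers (and internal helpers such as adj_nbr_update_rewrite_internal()) rely on. Only the vlib_worker_thread_barrier_* calls are real VPP API; update_shared_state() is a hypothetical caller, and details assume a 17.x-era tree:

    #include <vlib/vlib.h>
    #include <vlib/threads.h>

    /* Hypothetical caller illustrating the barrier pattern: while the
     * worker threads are parked at the barrier they are not polling
     * their rx queues, so at sufficient offered load packets are lost. */
    static void
    update_shared_state (vlib_main_t * vm)
    {
      /* Spin until every worker thread checks in at the barrier. */
      vlib_worker_thread_barrier_sync (vm);

      /* Now safe to rewrite state the workers read (e.g. adjacencies),
       * since no worker is processing packets. */

      /* Release the workers; packet processing resumes. */
      vlib_worker_thread_barrier_release (vm);
    }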
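And the workaround Florin describes, sketched loosely after the API hookup code in src/vnet/ip/ip_api.c (example_api_hookup() is a placeholder name; the is_mp_safe array lives in api_main, and VL_API_IP_ADD_DEL_ROUTE comes from the generated API headers; exact layout varies by release):

    #include <vlibapi/api.h>

    /* Sketch of an API hookup function: after registering message
     * handlers, flag the ones that have been audited as thread safe
     * so the dispatcher skips the worker barrier when running them. */
    static clib_error_t *
    example_api_hookup (vlib_main_t * vm)
    {
      api_main_t *am = &api_main;

      /* ... vl_msg_api_set_handlers () calls elided ... */

      /* Handler is safe to run concurrently with workers:
       * don't take vlib_worker_thread_barrier_sync for it. */
      am->is_mp_safe[VL_API_IP_ADD_DEL_ROUTE] = 1;

      return 0;
    }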