Re: [vpp-dev] Wireguard: bottleneck in handshake process
Hi Artem,

May I kindly ask you to review the under-load state determination patch for WireGuard: https://gerrit.fd.io/r/c/vpp/+/37764

Regards
Kai

From: Oginski, GabrielX
Sent: Friday, January 13, 2023 8:56 AM
To: vpp-dev
Cc: Ji, Kai; Pei, Yulong; Zhang, Fan; artem.glazyc...@xored.com
Subject: Wireguard: bottleneck in handshake process

Hi,

We recently ran a test with 10K concurrent WireGuard tunnels, and there is a handshake issue that I would like to bring to the community's attention.

It seems to us that the VPP WireGuard handshake is always processed by the main thread, and there is a handoff if other threads receive handshake messages. This means that handshake processing on the main thread runs in a thread-safe environment.

When a large number of tunnels is being set up (a lot of handshakes), the main thread configures all the tunnels, and the handshake processing function has to look up an element in the vector of all existing WireGuard interfaces that share the same UDP port (listen-port).

This seems to be a bottleneck for us. For example, when setting up 10k tunnels where every handshake arrives on the same listen-port (e.g. 51820), each handshake amounts to looking up an element in a vector with 10k entries. The lookup can be really time-consuming if the corresponding interface is located at the end of the vector, because VPP has to calculate and check the MAC for each candidate interface when a message is received.

The following code is executed by the main thread:

    static wg_input_error_t
    wg_handshake_process (vlib_main_t *vm, wg_main_t *wmp, vlib_buffer_t *b,
                          u32 node_idx, u8 is_ip4)
    ...
      index_t *ii;
      /* all interfaces that listen on this UDP port */
      wg_ifs = wg_if_indexes_get_by_port (udp_dst_port);
      if (NULL == wg_ifs)
        return WG_INPUT_ERROR_INTERFACE;

      vec_foreach (ii, wg_ifs)
        {
          wg_if = wg_if_get (*ii);
          if (NULL == wg_if)
            continue;

          under_load = wg_if_is_under_load (vm, wg_if);
          /* MAC is recalculated with the key of each candidate interface */
          mac_state = cookie_checker_validate_macs (
            vm, &wg_if->cookie_checker, macs, current_b_data, len, under_load,
            &src_ip, udp_src_port);
          if (mac_state == INVALID_MAC)
            {
              wg_if_dec_handshake_num (wg_if);
              wg_if = NULL;
              continue;
            }
          break;
        }

The vector "wg_ifs" holds the indexes of all WireGuard interfaces that were created with the same "udp_dst_port". The "vec_foreach" loop walks the "wg_ifs" elements, and for each one the "cookie_checker_validate_macs()" function has to check the received MAC against a MAC calculated with the key of that particular interface (&wg_if->cookie_checker).

I measured the time before and after the call to the handshake processing function, and I found that the number of handshakes processed per second is much lower when more tunnels exist in the system. This means that VPP eventually cannot process all handshake messages in time, which can lead to packet drops during the handoff: there is not enough space for incoming handshake messages while the main thread is still busy with the "old" handshakes, and a large number of "congestion drop" errors start to appear in "wg4-handshake-handoff".

I have also investigated the "under load state" in WireGuard. Because all handshakes arrive on the same UDP listen-port and the most time-consuming part is the lookup, it brings very little improvement. Please see the patch that updates the under-load state determination for WireGuard: https://gerrit.fd.io/r/c/vpp/+/37764

I came up with the idea of using different UDP listening ports in the configuration. In this case the results we gathered are much more promising: VPP shows much better handshake performance, we no longer see "congestion drop", and a larger number of tunnels can be created.
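The effect of the shared listen-port can be illustrated with a small standalone model. This is only a sketch and not VPP code: the type toy_wg_if_t and the functions toy_mac_is_valid, toy_handshake_lookup and toy_fill are hypothetical names invented for the illustration, and an integer comparison stands in for the real keyed MAC calculation. The model counts only how many MAC validations one handshake needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a WireGuard interface (not a VPP type). */
typedef struct
{
  uint16_t listen_port; /* UDP port the interface listens on */
  uint32_t mac_key;     /* toy replacement for the per-interface MAC key */
} toy_wg_if_t;

/* Toy MAC check: models the per-interface keyed MAC validation,
 * which is the expensive step in the real handshake path. */
static int
toy_mac_is_valid (const toy_wg_if_t *wg_if, uint32_t msg_mac)
{
  return wg_if->mac_key == msg_mac;
}

/* Try every interface on dst_port until one accepts the MAC.
 * Returns the index of the matching interface or -1, and stores the
 * number of MAC validations performed in *n_checks. The port filter
 * is assumed to be cheap, so only MAC validations are counted. */
int
toy_handshake_lookup (const toy_wg_if_t *ifs, size_t n_ifs,
                      uint16_t dst_port, uint32_t msg_mac,
                      size_t *n_checks)
{
  assert (ifs != NULL && n_checks != NULL);
  *n_checks = 0;
  for (size_t i = 0; i < n_ifs; i++)
    {
      if (ifs[i].listen_port != dst_port)
        continue;
      (*n_checks)++;
      if (toy_mac_is_valid (&ifs[i], msg_mac))
        return (int) i;
    }
  return -1;
}

/* Fill the table: either all interfaces share base_port, or interface i
 * listens on base_port + i. Interface i gets the toy key i + 1. */
void
toy_fill (toy_wg_if_t *ifs, size_t n_ifs, uint16_t base_port,
          int distinct_ports)
{
  for (size_t i = 0; i < n_ifs; i++)
    {
      ifs[i].listen_port =
        distinct_ports ? (uint16_t) (base_port + i) : base_port;
      ifs[i].mac_key = (uint32_t) (i + 1);
    }
}
```

In this model, with 10,000 interfaces filled on the shared port 51820, a handshake for the last interface costs 10,000 MAC validations, while with one port per interface the same handshake costs a single validation. The real cost per validation in VPP is of course much higher than an integer comparison.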
I would like to ask community members to review my under-load state determination patch, and any feedback on the approach of changing the UDP listening port is welcome.

I am looking forward to hearing from you soon.

Best regards,
Gabriel Oginski

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#22474): https://lists.fd.io/g/vpp-dev/message/22474
Mute This Topic: https://lists.fd.io/mt/96242405/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/leave/1480452/21656/631435203/xyzzy [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-
Re: [vpp-dev] memory growth in charon using vpp_sswan
Hi Mahdi,

We have never tried the DPD features with the sswan plugin, so I don’t have any information on this. I think we need to investigate and reproduce this memory growth issue on our end. I will keep you posted.

Regards
Kai

View/Reply Online (#22337): https://lists.fd.io/g/vpp-dev/message/22337
Re: [vpp-dev] memory growth in charon using vpp_sswan
Hi Mahdi,

Thank you for reporting your discovery in vpp_sswan. Unfortunately, we haven't seen this memory growth issue on our end.

If my understanding is correct, the stat_segment_connect function should be called only if you want to see how many packets and bytes were processed by each SA. That function is part of VPP, and there are no changes to it in the sswan plugin.

Could you please try a different version of sswan (5.9.5 or 5.9.6) to see whether the issue remains?

Regards
Kai

View/Reply Online (#22324): https://lists.fd.io/g/vpp-dev/message/22324
Re: [vpp-dev] VPP SSWAN plugin git cherry pick to stable 22.10
Hi Andrew,

I think the patch is for the SSWAN plugin only; the VPP code is untouched. I don’t see any impact this patch would have on the IPsec source code. I have cc’d Fan and Radu for suggestions.

Regards
Kai

From: vpp-dev@lists.fd.io On Behalf Of Andrew Yourtchenko
Sent: Friday, October 14, 2022 11:04 AM
To: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] VPP SSWAN plugin git cherry pick to stable 22.10

Hi Kai,

Having a second look at it: the plugin is offloading IPsec, so I would like some of the CSIT folks/IPsec maintainers to chime in with confirmation that they are comfortable with this… I think I might need to rescind the previous “no risk” assessment.

--a

View/Reply Online (#22037): https://lists.fd.io/g/vpp-dev/message/22037
Re: [vpp-dev] VPP SSWAN plugin git cherry pick to stable 22.10
Hi Andrew,

Thank you for the reply. I agree with you: https://gerrit.fd.io/r/c/vpp/+/36183 is way too big and risky for stable 22.10 now.

In that case, may I kindly ask you to consider merging the VPP SSWAN plugin patch (https://gerrit.fd.io/r/c/vpp/+/36552) into stable 22.10? This patch only updates the plugin and the test script itself, and has no dependencies on the vlibapi refactor or the library fix.

Regards
Kai

From: vpp-dev@lists.fd.io On Behalf Of Andrew Yourtchenko
Sent: Friday, October 14, 2022 9:57 AM
To: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] VPP SSWAN plugin git cherry pick to stable 22.10

Kai,

*release manager hat on*

We are past the RC2 milestone, which means only critical fixes for CSIT tests go in.

If the commit were just the plugin itself, in principle it might be possible to entertain the idea, since a separate plugin is relatively low risk. But https://gerrit.fd.io/r/c/vpp/+/36183 is *far* too much risk to backport at this point in time, and is again not a fix per se.

So unfortunately I will have to say “no” on this one.

--a

View/Reply Online (#22035): https://lists.fd.io/g/vpp-dev/message/22035
Re: [vpp-dev] VPP SSWAN plugin git cherry pick to stable 22.10
Hello,

We recently added a plugin fix for vpp-sswan (https://gerrit.fd.io/r/c/vpp/+/36552), and I wonder whether it is too late for this patch to be picked to stable 22.10?

To make things more complicated, there is also a linked-library fix patch for vpp-sswan (https://gerrit.fd.io/r/c/vpp/+/37388), which needs to be applied on top of the vlibapi refactor (https://gerrit.fd.io/r/c/vpp/+/36183). That fix patch is not required if the vlibapi refactor is out of scope for stable 22.10.

Regards
Kai Ji

Intel Research and Development Ireland Ltd
Co. Reg. #308263
Collinstown Industrial Park, Leixlip, County Kildare, Ireland

View/Reply Online (#22033): https://lists.fd.io/g/vpp-dev/message/22033