Hi Artem,

Could you kindly review the under-load state determination patch for
wireguard: https://gerrit.fd.io/r/c/vpp/+/37764

Regards

Kai

From: Oginski, GabrielX <gabrielx.ogin...@intel.com>
Sent: Friday, January 13, 2023 8:56 AM
To: vpp-dev <vpp-dev@lists.fd.io>
Cc: Ji, Kai <kai...@intel.com>; Pei, Yulong <yulong....@intel.com>; Zhang, Fan 
<fanzhang....@gmail.com>; artem.glazyc...@xored.com
Subject: Wireguard: bottleneck in handshake process

Hi,

We recently ran a test with 10K concurrent wireguard tunnels, and there is a
handshake issue I would like to bring to the community's attention.

It seems to us that the VPP wireguard handshake is always processed by the
main thread, with a handoff when other threads receive handshake messages;
this keeps handshake processing single-threaded and therefore thread-safe.
When a large number of tunnels is set up (and thus many handshakes arrive),
the main thread configures all the tunnels, and the handshake processing
function has to look up the matching interface in the vector of all existing
wireguard interfaces registered on the same UDP port (listen-port).

This is a bottleneck for us. For example, when setting up 10k tunnels where
every handshake arrives on the same listen-port (e.g. 51820), each handshake
amounts to a look-up in a vector with 10k entries.
The process can be very time consuming if the corresponding interface is
located at the end of the vector, because VPP has to calculate and check a
MAC for each candidate interface on every received message.

The following is the code executed by the main thread:

static wg_input_error_t
wg_handshake_process (vlib_main_t *vm, wg_main_t *wmp, vlib_buffer_t *b,
                      u32 node_idx, u8 is_ip4)

...

  index_t *ii;
  wg_ifs = wg_if_indexes_get_by_port (udp_dst_port);
  if (NULL == wg_ifs)
    return WG_INPUT_ERROR_INTERFACE;

  vec_foreach (ii, wg_ifs)
    {
      wg_if = wg_if_get (*ii);
      if (NULL == wg_if)
        continue;

      under_load = wg_if_is_under_load (vm, wg_if);
      mac_state = cookie_checker_validate_macs (
        vm, &wg_if->cookie_checker, macs, current_b_data, len, under_load,
        &src_ip, udp_src_port);
      if (mac_state == INVALID_MAC)
        {
          wg_if_dec_handshake_num (wg_if);
          wg_if = NULL;
          continue;
        }
      break;
    }


The variable "wg_ifs" has value how many wireguard interfaces were created with 
the same "udp_dst_port", for each "vec_foreach" has to look-up on "wg_ifs" 
elements, and the "cookie_checker_validate_macs()" function has to check 
received "mac" with "mac" that will be calculated with key for each interfaces 
separately (&wg_if->cookie_checker).

I measured the time before and after the handshake processing function call
and found that the number of handshakes processed per second drops
significantly as more tunnels exist in the system. This means that VPP
eventually cannot process all handshake messages in time, which leads to
packet drops during the handoff: while the main thread is still busy with
"old" handshakes there is no room for incoming handshake messages, and a
large number of "congestion drop" errors start to appear in
"wg4-handshake-handoff".

I have also investigated the "under load" state in wireguard. Because all
handshakes arrive on the same UDP listen-port and the most time-consuming
part is the look-up itself, this alone brings very little improvement.
Please see the patch that updates under-load state determination for
wireguard:
https://gerrit.fd.io/r/c/vpp/+/37764
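
For context, "under load" means that cookie_checker_validate_macs () also
requires a valid mac2 (cookie) before the handshake is processed further. A
minimal sketch of a threshold-style heuristic, purely for illustration (the
function name, counter and threshold below are hypothetical and not
necessarily what the patch implements):

/* Illustrative sketch only, not the patch: treat an interface as "under
 * load" once the number of handshake messages currently queued for it
 * exceeds a fixed threshold. */
static inline int
wg_if_under_load_sketch (u32 pending_handshakes)
{
  const u32 under_load_threshold = 40; /* hypothetical value */

  /* Above the threshold, cookie_checker_validate_macs () also demands a
   * valid mac2 (cookie), so unsolicited handshakes are rejected cheaply. */
  return pending_handshakes > under_load_threshold;
}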

I also came up with the idea of using a different UDP listening port per
tunnel in the configuration. With this setup the results we gathered are much
more promising: VPP achieves much better handshake performance, we no longer
see "congestion drop", and a larger number of tunnels can be created.
I would like to ask community members to review my under-load state
determination patch; any feedback on the different-UDP-listening-port
approach is also welcome.
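
As an illustration, the per-tunnel listen-port configuration looks roughly
like this (the src address and ports below are placeholders, and the private
keys are generated on the fly); with one listen-port per interface,
wg_if_indexes_get_by_port () then returns a single-entry vector:

wireguard create listen-port 51820 generate-key src 10.10.1.1
wireguard create listen-port 51821 generate-key src 10.10.1.1
wireguard create listen-port 51822 generate-key src 10.10.1.1
...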

I look forward to hearing from you soon.

Best regards,
Gabriel Oginski
