Hi, Michael.  Thanks for the new version!

On 6/6/22 20:59, Michael Santana wrote:
> Additional threads are required to service upcalls when we have CPU
> isolation (in per-cpu dispatch mode). The reason additional threads
> are required is because it creates a more fair distribution.

I think, the description above lacks the information on why we need
to create additional threads, and why we choose the number to be a
prime number.  Yes, you said that it creates a more fair distribution,
but we probably need to describe that more with an example.

> With more
> threads we decrease the load of each thread as more threads would
> decrease the number of cores each threads is assigned.

While that is true, it's not obvious why increasing the number of
threads beyond the number of available cores is good for us.
The key factor is more fair load distribution among threads, so all
cores are utilized, but that is not highlighted.

> Adding as many
> threads are there are cores could have a performance hit when the number
> of active cores (which all threads have to share) is very low. For this
> reason we avoid creating as many threads as there are cores and instead
> meet somewhere in the middle.
> 
> The formula used to calculate the number of handler threads to create
> is as follows:
> 
> handlers_n = min(next_prime(active_cores+1), total_cores)
> 
> Where next_prime is a function that returns the next numeric prime
> number that is strictly greater than active_cores
> 
> Assume default behavior when total_cores <= 2, that is do not create
> additional threads when we have less than 2 total cores on the system
> 
> Fixes: b1e517bd2f81 ("dpif-netlink: Introduce per-cpu upcall dispatch.")
> Signed-off-by: Michael Santana <msant...@redhat.com>
> ---
>  lib/dpif-netlink.c | 86 ++++++++++++++++++++++++++++++++++++++++++++--
>  lib/ovs-thread.c   | 16 +++++++++
>  lib/ovs-thread.h   |  1 +
>  3 files changed, 101 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> index 06e1e8ca0..e948a829b 100644
> --- a/lib/dpif-netlink.c
> +++ b/lib/dpif-netlink.c
> @@ -31,6 +31,7 @@
>  #include <sys/epoll.h>
>  #include <sys/stat.h>
>  #include <unistd.h>
> +#include <math.h>
>  
>  #include "bitmap.h"
>  #include "dpif-netlink-rtnl.h"
> @@ -2506,6 +2507,87 @@ dpif_netlink_handler_uninit(struct dpif_handler 
> *handler)
>  }
>  #endif
>  
> +/*
> + * Returns 1 if num is a prime number,
> + * Otherwise return 0
> + */
> +static uint32_t

This should just return bool.

> +is_prime(uint32_t num)
> +{
> +    if ((num < 2) || ((num % 2) == 0)) {

There is no need for so many parenthesis.

And it might make sense to just write 3 if blocks
here to make the code simpler, i.e. check for
< 2, == 2 and % 2.

> +        return num == 2;
> +    }
> +
> +    uint32_t i;
> +    uint32_t limit = sqrt((float) num);
> +    for (i = 3; i <= limit; i += 2) {

for (uint64_t i = 3; i * i <= num; i += 2) {

There is no real need for sqrt.  I don't think we should be concerned
about 'i * i' performance here.

> +        if (num % i == 0) {
> +            return 0;
> +        }
> +    }
> +
> +    return 1;
> +}
> +
> +/*
> + * Returns num if num is a prime number. Otherwise returns the next
> + * prime greater than num. Search is limited by UINT32_MAX.

2 spaces between sentences, please.

> + *
> + * Returns 0 if no prime has been found between num and UINT32_MAX
> + */
> +static uint32_t
> +next_prime(uint32_t num)
> +{
> +    if (num < 2) {
> +        return 2;
> +    }
> +    if (num == 2) {
> +        return 3;

This contradicts the description of the function.  So, this check
should probably be done in dpif_netlink_calculate_handlers_num.

> +    }
> +    if (num % 2 == 0) {
> +        num++;
> +    }
> +
> +    uint32_t i;
> +    for (i = num; i < UINT32_MAX; i += 2) {

We may just start this loop from 'num' and increment by 1 instead
of 2.  is_prime will return false for even numbers right away, so
there is no real need for +=2 optimization here.  And we can also
remove 2 out of 3 'if's at the beginning of this function (except
the incorrect one).

> +        if (is_prime(i)) {
> +            return i;
> +        }
> +    }
> +
> +    return 0;
> +}
> +
> +/*
> + * Calcuales and returns the number of handler threads needed based

s/Calcuales/Calculates/

> + * the following formula:
> + *
> + * handlers_n = min(next_prime(active_cores+1), total_cores)
> + *
> + * Where next_prime is a function that returns the next numeric prime
> + * number that is strictly greater than active_cores
> + */
> +static uint32_t
> +dpif_netlink_calculate_handlers_num(void)

Maybe dpif_netlink_calculate_n_handlers() ?

> +{
> +    uint32_t next_prime_num;
> +    uint32_t n_handlers = count_cpu_cores();
> +    uint32_t total_cores = count_total_cores();

Reverse x-mass tree, please.

> +
> +    /*
> +     * If we have isolated cores, add additional handler threads to
> +     * service inactive cores in the unlikely event that traffic goes
> +     * through inactive cores

These core are not really inactive, they are just not available to
OVS process.  And it can be a very likely event that traffic will
go through these cores, in some typical configurations.

> +     */
> +    if (n_handlers < total_cores && total_cores > 2) {
> +        next_prime_num = next_prime(n_handlers + 1);
> +        n_handlers = next_prime_num >= total_cores ?
> +                                            total_cores : next_prime_num;

Please, breka the line before the '?' and indent it at the
beginning of the right-hand expression, i.e. at the start of
'next_prime_num'.

> +    }
> +
> +    return n_handlers;
> +}
> +
>  static int
>  dpif_netlink_refresh_handlers_cpu_dispatch(struct dpif_netlink *dpif)
>      OVS_REQ_WRLOCK(dpif->upcall_lock)
> @@ -2515,7 +2597,7 @@ dpif_netlink_refresh_handlers_cpu_dispatch(struct 
> dpif_netlink *dpif)
>      uint32_t n_handlers;
>      uint32_t *upcall_pids;
>  
> -    n_handlers = count_cpu_cores();
> +    n_handlers = dpif_netlink_calculate_handlers_num();
>      if (dpif->n_handlers != n_handlers) {
>          VLOG_DBG("Dispatch mode(per-cpu): initializing %d handlers",
>                     n_handlers);
> @@ -2755,7 +2837,7 @@ dpif_netlink_number_handlers_required(struct dpif 
> *dpif_, uint32_t *n_handlers)
>      struct dpif_netlink *dpif = dpif_netlink_cast(dpif_);
>  
>      if (dpif_netlink_upcall_per_cpu(dpif)) {
> -        *n_handlers = count_cpu_cores();
> +        *n_handlers = dpif_netlink_calculate_handlers_num();
>          return true;
>      }
>  
> diff --git a/lib/ovs-thread.c b/lib/ovs-thread.c
> index 805cba622..2172b3d3f 100644
> --- a/lib/ovs-thread.c
> +++ b/lib/ovs-thread.c
> @@ -663,6 +663,22 @@ count_cpu_cores(void)
>      return n_cores > 0 ? n_cores : 0;
>  }
>  
> +/* Returns the total number of cores on the system, or 0 if the
> + * number cannot be determined. */
> +int
> +count_total_cores(void) {

The '{' should be on the next line.

> +    long int n_cores;
> +
> +#ifndef _WIN32
> +    n_cores = sysconf(_SC_NPROCESSORS_CONF);
> +#else
> +    n_cores = 0;
> +    errno = ENOTSUP;
> +#endif
> +
> +    return n_cores > 0 ? n_cores : 0;
> +}
> +
>  /* Returns 'true' if current thread is PMD thread. */
>  bool
>  thread_is_pmd(void)
> diff --git a/lib/ovs-thread.h b/lib/ovs-thread.h
> index 3b444ccdc..aac5e19c9 100644
> --- a/lib/ovs-thread.h
> +++ b/lib/ovs-thread.h
> @@ -522,6 +522,7 @@ bool may_fork(void);
>  /* Useful functions related to threading. */
>  
>  int count_cpu_cores(void);
> +int count_total_cores(void);
>  bool thread_is_pmd(void);
>  
>  #endif /* ovs-thread.h */

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to