Pinning an rxq to a PMD with pmd-rxq-affinity may be done for various reasons,
such as reserving a full PMD for an rxq, or ensuring that multiple rxqs from a
port are handled on different PMDs.
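For example, each rxq of a port can be pinned to a different core (the port
name and queue:core pairs below are illustrative, taken from the tests added
by this patch):
ovs-vsctl set Interface p1 other_config:pmd-rxq-affinity="0:3,1:7,2:2,3:8"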
Previously pmd-rxq-affinity always isolated the PMD so no other rxqs could be
assigned to it by OVS. There may be cases where there are unused cycles on
those PMDs and the user would like other rxqs to also be assignable to them
by OVS.

Add an option to pin the rxq and non-isolate the PMD. The default behaviour
is unchanged, which is to pin and isolate the PMD.

In order to pin and non-isolate:
ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-isolate=false

Note this is available only with the group assignment type, as pinning
conflicts with the operation of the other rxq assignment algorithms.

Signed-off-by: Kevin Traynor <ktray...@redhat.com>
Acked-by: Sunil Pai G <sunil.pa...@intel.com>
Acked-by: David Marchand <david.march...@redhat.com>
---
 Documentation/topics/dpdk/pmd.rst |   9 ++-
 NEWS                              |   3 +
 lib/dpif-netdev.c                 |  34 ++++++++--
 tests/pmd.at                      | 105 ++++++++++++++++++++++++++++++
 vswitchd/vswitch.xml              |  19 ++++++
 5 files changed, 162 insertions(+), 8 deletions(-)

diff --git a/Documentation/topics/dpdk/pmd.rst b/Documentation/topics/dpdk/pmd.rst
index 29ba53954..30040d703 100644
--- a/Documentation/topics/dpdk/pmd.rst
+++ b/Documentation/topics/dpdk/pmd.rst
@@ -102,6 +102,11 @@ like so:
 - Queue #3 pinned to core 8
 
-PMD threads on cores where Rx queues are *pinned* will become *isolated*. This
-means that this thread will only poll the *pinned* Rx queues.
+PMD threads on cores where Rx queues are *pinned* will become *isolated* by
+default. This means that this thread will only poll the *pinned* Rx queues.
+
+If using ``pmd-rxq-assign=group`` PMD threads with *pinned* Rxqs can be
+*non-isolated* by setting::
+
+  $ ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-isolate=false
 
 .. warning::
diff --git a/NEWS b/NEWS
index c6a929068..5de06e9b1 100644
--- a/NEWS
+++ b/NEWS
@@ -54,4 +54,7 @@ Post-v2.15.0
      * Added new 'group' option to pmd-rxq-assign. This will assign rxq to
        pmds purely based on rxq and pmd load.
+     * Add new 'pmd-rxq-isolate' option that can be set to 'false' in order
+       that pmd cores which are pinned with rxqs using 'pmd-rxq-affinity'
+       are available for assigning other non-pinned rxqs.
    - ovs-ctl:
      * New option '--no-record-hostname' to disable hostname configuration
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index ddb52c685..97cf06aa8 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -294,4 +294,5 @@ struct dp_netdev {
     /* Rxq to pmd assignment type. */
     enum sched_assignment_type pmd_rxq_assign_type;
+    bool pmd_iso;
 
     /* Protects the access of the 'struct dp_netdev_pmd_thread'
@@ -4238,4 +4239,22 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config)
     }
 
+    bool pmd_iso = smap_get_bool(other_config, "pmd-rxq-isolate", true);
+
+    if (pmd_rxq_assign_type != SCHED_GROUP && pmd_iso == false) {
+        /* Invalid combination. */
+        VLOG_WARN("pmd-rxq-isolate can only be set false "
+                  "when using pmd-rxq-assign=group");
+        pmd_iso = true;
+    }
+    if (dp->pmd_iso != pmd_iso) {
+        dp->pmd_iso = pmd_iso;
+        if (pmd_iso) {
+            VLOG_INFO("pmd-rxq-affinity isolates PMD core");
+        } else {
+            VLOG_INFO("pmd-rxq-affinity does not isolate PMD core");
+        }
+        dp_netdev_request_reconfigure(dp);
+    }
+
     struct pmd_auto_lb *pmd_alb = &dp->pmd_alb;
     bool cur_rebalance_requested = pmd_alb->auto_lb_requested;
@@ -4967,5 +4986,5 @@ sched_numa_list_assignments(struct sched_numa_list *numa_list,
         sched_pmd = sched_pmd_find_by_pmd(numa_list, rxq->pmd);
         if (sched_pmd) {
-            if (rxq->core_id != OVS_CORE_UNSPEC) {
+            if (rxq->core_id != OVS_CORE_UNSPEC && dp->pmd_iso) {
                 sched_pmd->isolated = true;
             }
@@ -5234,4 +5253,5 @@ sched_numa_list_schedule(struct sched_numa_list *numa_list,
     struct dp_netdev_pmd_thread *pmd;
     struct sched_numa *numa;
+    bool iso = dp->pmd_iso;
     uint64_t proc_cycles;
     char rxq_cyc_log[MAX_RXQ_CYC_STRLEN];
@@ -5256,9 +5276,11 @@ sched_numa_list_schedule(struct sched_numa_list *numa_list,
                 continue;
             }
-            /* Mark PMD as isolated if not done already. */
-            if (sched_pmd->isolated == false) {
-                sched_pmd->isolated = true;
-                numa = sched_pmd->numa;
-                numa->n_isolated++;
+            if (iso) {
+                /* Mark PMD as isolated if not done already. */
+                if (sched_pmd->isolated == false) {
+                    sched_pmd->isolated = true;
+                    numa = sched_pmd->numa;
+                    numa->n_isolated++;
+                }
             }
             proc_cycles = dp_netdev_rxq_get_cycles(rxq,
diff --git a/tests/pmd.at b/tests/pmd.at
index 3b21b4c22..d59e55f6b 100644
--- a/tests/pmd.at
+++ b/tests/pmd.at
@@ -613,4 +613,14 @@ p1 3 0 1
 ])
 
+# Check they are pinned when those pmds are available again
+AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x1fe])
+
+AT_CHECK([ovs-appctl dpif-netdev/pmd-rxq-show | parse_pmd_rxq_show], [0], [dnl
+p1 0 0 3
+p1 1 0 7
+p1 2 0 2
+p1 3 0 8
+])
+
 AT_CHECK([ovs-vsctl remove Interface p1 other_config pmd-rxq-affinity])
@@ -625,4 +635,5 @@ p1 3
 ])
 
+AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=6])
 AT_CHECK([ovs-vsctl set Interface p1 other_config:pmd-rxq-affinity='0:1'])
@@ -639,4 +650,98 @@ OVS_VSWITCHD_STOP(["/cannot be pinned with port/d"])
 AT_CLEANUP
 
+AT_SETUP([PMD - rxq affinity - non-isolate])
+OVS_VSWITCHD_START(
+  [], [], [], [--dummy-numa 0,0,0,0,0,0,0,0,0])
+AT_CHECK([ovs-appctl vlog/set dpif:dbg dpif_netdev:dbg])
+
+AT_CHECK([ovs-ofctl add-flow br0 actions=controller])
+
+AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x1fe])
+
+AT_CHECK([ovs-vsctl add-port br0 p1 -- set Interface p1 type=dummy-pmd ofport_request=1 options:n_rxq=4 other_config:pmd-rxq-affinity="0:3,1:7,2:2,3:8"])
+
+dnl The rxqs should be on the requested cores.
+AT_CHECK([ovs-appctl dpif-netdev/pmd-rxq-show | parse_pmd_rxq_show], [0], [dnl
+p1 0 0 3
+p1 1 0 7
+p1 2 0 2
+p1 3 0 8
+])
+
+# change rxq assignment algorithm
+AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-assign=group])
+
+dnl The rxqs should be on the requested cores.
+AT_CHECK([ovs-appctl dpif-netdev/pmd-rxq-show | parse_pmd_rxq_show], [0], [dnl
+p1 0 0 3
+p1 1 0 7
+p1 2 0 2
+p1 3 0 8
+])
+
+# try to pin & non-isolate
+TMP=$(($(cat ovs-vswitchd.log | wc -l | tr -d [[:blank:]])+1))
+AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-isolate=false])
+OVS_WAIT_UNTIL([tail -n +$TMP ovs-vswitchd.log | grep "pmd-rxq-affinity does not isolate PMD core"])
+
+# should not impact - all rxqs are still pinned
+dnl The rxqs should be on the requested cores.
+AT_CHECK([ovs-appctl dpif-netdev/pmd-rxq-show | parse_pmd_rxq_show], [0], [dnl
+p1 0 0 3
+p1 1 0 7
+p1 2 0 2
+p1 3 0 8
+])
+
+# remove some pinning - see if non-isolated pmds are used for ovs rxq assignment of other rxqs
+AT_CHECK([ovs-vsctl remove Interface p1 other_config pmd-rxq-affinity])
+AT_CHECK([ovs-vsctl set Interface p1 other_config:pmd-rxq-affinity='0:1'])
+AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=6])
+
+dnl We explicitly requested core 1 for queue 0. Core 1 is not isolated so it is
+dnl used for other rxqs.
+AT_CHECK([ovs-appctl dpif-netdev/pmd-rxq-show | parse_pmd_rxq_show], [0], [dnl
+p1 0 0 1
+p1 1 0 2
+p1 2 0 1
+p1 3 0 2
+])
+
+# change to algorithm that does not support pin & non-isolate
+TMP=$(($(cat ovs-vswitchd.log | wc -l | tr -d [[:blank:]])+1))
+AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-assign=cycles])
+OVS_WAIT_UNTIL([tail -n +$TMP ovs-vswitchd.log | grep "pmd-rxq-isolate can only be set false when using pmd-rxq-assign=group"])
+OVS_WAIT_UNTIL([tail -n +$TMP ovs-vswitchd.log | grep "pmd-rxq-affinity isolates PMD core"])
+OVS_WAIT_UNTIL([tail -n +$TMP ovs-vswitchd.log | grep "Performing pmd to rx queue assignment using cycles algorithm"])
+
+dnl We explicitly requested core 1 for queue 0. Core 1 becomes isolated and
+dnl every other queue goes to core 2.
+AT_CHECK([ovs-appctl dpif-netdev/pmd-rxq-show | parse_pmd_rxq_show], [0], [dnl
+p1 0 0 1
+p1 1 0 2
+p1 2 0 2
+p1 3 0 2
+])
+
+# change rxq assignment algorithm to one that supports pin & non-isolate
+TMP=$(($(cat ovs-vswitchd.log | wc -l | tr -d [[:blank:]])+1))
+AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-assign=group])
+OVS_WAIT_UNTIL([tail -n +$TMP ovs-vswitchd.log | grep "pmd-rxq-affinity does not isolate PMD core"])
+OVS_WAIT_UNTIL([tail -n +$TMP ovs-vswitchd.log | grep "Performing pmd to rx queue assignment using group algorithm"])
+
+dnl We explicitly requested core 1 for queue 0. Core 1 is no longer isolated,
+dnl so it is used for other rxqs again.
+AT_CHECK([ovs-appctl dpif-netdev/pmd-rxq-show | parse_pmd_rxq_show], [0], [dnl
+p1 0 0 1
+p1 1 0 2
+p1 2 0 1
+p1 3 0 2
+])
+
+OVS_VSWITCHD_STOP(["/cannot be pinned with port/d
+/pmd-rxq-isolate can only be set false when using pmd-rxq-assign=group/d"])
+AT_CLEANUP
+
+
 AT_SETUP([PMD - rxq affinity - NUMA])
 OVS_VSWITCHD_START(
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index cf6937501..c15053b25 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -549,4 +549,23 @@
       </column>
 
+      <column name="other_config" key="pmd-rxq-isolate"
+              type='{"type": "boolean"}'>
+        <p>
+          Specifies if a CPU core will be isolated after being pinned with
+          an Rx queue.
+        </p>
+        <p>
+          Set this value to <code>false</code> to non-isolate a CPU core after
+          it is pinned with an Rxq using <code>pmd-rxq-affinity</code>. This
+          will allow OVS to assign other Rxqs to that CPU core.
+        </p>
+        <p>
+          The default value is <code>true</code>.
+        </p>
+        <p>
+          This can only be <code>false</code> when <code>pmd-rxq-assign</code>
+          is set to <code>group</code>.
+        </p>
+      </column>
+
       <column name="other_config" key="n-handler-threads"
               type='{"type": "integer", "minInteger": 1}'>
--
2.31.1