On 24/03/2022 16:35, Jan Scheurich wrote:
Hi Kevin,
This was a bit of a misunderstanding. We didn't check your RFC patch carefully
enough to realize that you had meant to encompass our cross-numa-polling
function in that RFC patch. Sorry for the confusion.
I wouldn't say we are particularly keen on upstreaming exactly our
implementation of cross-numa-polling for ALB, as long as we get the
functionality with a per-interface configuration option (preferably as in our
patch, so that we can maintain backward compatibility with our downstream
solution).
I suggest we have a closer look at your RFC and come back with comments on that.
No problem. It doesn't have selection logic, tests, docs, etc.; it was just
a way of seeing how the additions below could be added. It mainly
reorganises the schedule() function so that it does not always select the
numa first and then select a pmd from that numa. It also splits some things
into functions to help with checking the load on all pmds.
'roundrobin' and 'cycles' are dependent on RR of cores per numa, so I agree it
makes sense in those cases to still RR the numas. Otherwise a mix of
cross-numa enabled and cross-numa disabled interfaces would conflict with each
other when selecting a core. So even though the user has selected to ignore
numa for this interface, we don't have a choice but to still RR the numas.
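A minimal sketch of the point above, not OVS code: the class, method and core layout below are hypothetical, and it assumes one shared NUMA round-robin plus one core round-robin per NUMA node. Because pinned and cross-numa interfaces advance the same per-node core iterator, they do not pick cores independently of each other.

```python
from itertools import cycle

# Hypothetical layout: NUMA node id -> PMD core ids. Not taken from OVS.
PMDS_BY_NUMA = {0: [2, 4], 1: [3, 5]}


class RoundRobinScheduler:
    """Round-robin the NUMA nodes, then round-robin the cores in that node."""

    def __init__(self, pmds_by_numa):
        self._numa_rr = cycle(sorted(pmds_by_numa))
        self._core_rr = {n: cycle(cores) for n, cores in pmds_by_numa.items()}

    def next_pmd(self, local_numa, cross_numa):
        # A NUMA-pinned interface stays on its local node; a cross-numa
        # interface takes the next node from the shared NUMA round-robin.
        numa = next(self._numa_rr) if cross_numa else local_numa
        # Both kinds of interface advance the same per-node core iterator.
        return numa, next(self._core_rr[numa])


sched = RoundRobinScheduler(PMDS_BY_NUMA)
picks = [
    sched.next_pmd(local_numa=0, cross_numa=False),
    sched.next_pmd(local_numa=0, cross_numa=True),
    sched.next_pmd(local_numa=0, cross_numa=True),
]
print(picks)
```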
For 'group' it is about finding the lowest loaded pmd core. In that case we
don't need to RR numa. We can just find the lowest loaded pmd core from any
numa. This is better because the user is choosing to remove numa based
selection for the interface, and as the rxq scheduling algorithm is not
dependent on it, we can fully remove it too by checking pmds from all numas.
I have done this in my RFC.
It is also better to do this where possible because there may not be the same
number of pmd cores on each numa, or one numa could already be more
heavily loaded than the other.
Another difference from the above is that for 'group' I added a tiebreaker so
that a pmd on the interface's local numa is selected if multiple pmds from
different numas are available with the same load. This is most likely to be
helpful for initial selection, when there is no load on any pmds.
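The 'group' selection described above can be sketched as follows. This is an illustration under stated assumptions, not the RFC's implementation: the Pmd type, its fields and select_pmd_group() are hypothetical names, and load is assumed to be a single comparable number per pmd.

```python
from dataclasses import dataclass


@dataclass
class Pmd:
    core_id: int
    numa_id: int
    load: int  # assumed: one comparable load figure per pmd


def select_pmd_group(pmds, rxq_numa):
    """Pick the least-loaded pmd from any numa; prefer the rxq's numa on ties."""
    # Tuples compare element by element: load first, then locality.
    # False sorts before True, so a local pmd wins among equal loads.
    return min(pmds, key=lambda p: (p.load, p.numa_id != rxq_numa))


# Initial selection: no load anywhere, so the local numa decides.
idle = [Pmd(2, 0, 0), Pmd(3, 1, 0)]
print(select_pmd_group(idle, rxq_numa=1).core_id)

# Once loads differ, the lowest load wins even if it is on the remote numa.
busy = [Pmd(2, 0, 10), Pmd(3, 1, 5)]
print(select_pmd_group(busy, rxq_numa=0).core_id)
```

If several pmds on the same numa still tie, min() returns the first one it sees, which corresponds to "just pick any".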
At first glance I agree with your reasoning. Choosing the least-loaded PMD from
all NUMAs, using NUMA-locality as a tie-breaker, makes sense in the 'group'
algorithm. How do you select between equally loaded PMDs on the same NUMA node?
Just pick any?
Yes, that is how the group algorithm currently operates. A further way
to tiebreak among pmds on the local numa could be to check whether one of
the remaining pmds was already polling that rxq, or to compare the number
of rxqs assigned to each pmd.
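As a rough sketch of those further tiebreaks: the names below (PmdState, select_pmd_with_tiebreaks) are hypothetical, and the order in which the two extra criteria are applied is an assumption made for illustration, since it is not settled above.

```python
from dataclasses import dataclass, field


@dataclass
class PmdState:
    core_id: int
    numa_id: int
    load: int
    rxqs: set = field(default_factory=set)  # names of rxqs polled by this pmd


def select_pmd_with_tiebreaks(pmds, rxq_name, rxq_numa):
    """Least load, then local numa, then already polling, then fewest rxqs."""

    def rank(p):
        return (
            p.load,                  # lowest load first
            p.numa_id != rxq_numa,   # local numa preferred on equal load
            rxq_name not in p.rxqs,  # assumed next: keep the rxq where it is
            len(p.rxqs),             # assumed last: fewest assigned rxqs
        )

    return min(pmds, key=rank)


pmds = [
    PmdState(2, 0, 0, {"dpdk0:0", "dpdk0:1"}),
    PmdState(4, 0, 0, {"dpdk1:0"}),
    PmdState(6, 0, 0, set()),
]
# dpdk1:0 stays on core 4, which already polls it.
print(select_pmd_with_tiebreaks(pmds, "dpdk1:0", rxq_numa=0).core_id)
# A new rxq goes to core 6, which has the fewest rxqs.
print(select_pmd_with_tiebreaks(pmds, "dpdk2:0", rxq_numa=0).core_id)
```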
thanks,
Kevin.
BR, Jan
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev