On Fri, Jan 26, 2018 at 05:16:02PM +0300, Ilya Maximets wrote:
> On 26.01.2018 15:00, Stokes, Ian wrote:
> > Hi All,
> > 
> > Recently an issue was raised regarding the move from a single shared 
> > mempool model that was in place up to OVS 2.8, to a mempool per port model 
> > introduced in 2.9.
> > 
> > https://mail.openvswitch.org/pipermail/ovs-discuss/2018-January/046021.html
> > 
> > The per port mempool model was introduced on September 5th with commit 
> > d555d9bded to allow fine-grained control of memory usage on a per-port basis.
> > 
> > In the 'common/shared mempool' model, ports sharing a similar MTU would all 
> > share the same buffer mempool (the most common example being that all ports 
> > are created with a 1500B MTU by default and, as such, share the same mbuf 
> > mempool).
> > 
> > This approach had some drawbacks, however. For example, with the shared 
> > memory pool model a user could exhaust the shared memory pool (for instance 
> > by requesting a large number of RXQs for a port), which would cause the 
> > vSwitch to crash as any remaining ports would not have the required memory 
> > to function. This bug was discovered and reported to the community in late 
> > 2016: 
> > https://mail.openvswitch.org/pipermail/ovs-discuss/2016-September/042560.html
> > 
> > The per port mempool patch aimed to avoid such issues by allocating a 
> > separate buffer mempool to each port.
> > 
> > An issue has been flagged on ovs-discuss whereby the memory dimensioned 
> > for a given number of ports on OvS 2.6-2.8 may be insufficient to support 
> > the same number of ports in OvS 2.9 under the per-port mempool model, 
> > unless extra memory is provisioned. The effect of this is use case 
> > dependent (number of ports, RXQs, MTU settings, number of PMDs etc.). 
> > The previous 'common-pool' model was rudimentary in estimating the mempool 
> > size and was marked as something to be improved upon. The memory allocation 
> > calculation for the per-port model was modified to take the configuration 
> > factors mentioned above into account.
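
For illustration, the kind of per-port calculation described above boils down
to roughly the sketch below. The helper name and the constants
(NETDEV_MAX_BURST, MIN_NB_MBUF and their values) are approximations chosen for
the example, not the exact code; check lib/netdev-dpdk.c for the real formula.

/* Rough sketch of a per-port mbuf estimate under the per-port model
 * (approximate; the exact formula and constants live in
 * lib/netdev-dpdk.c).  The idea is to cover packets sitting in the
 * device rx queues, packets that may be stuck in tx queues, packets
 * held in per-PMD bursts, plus some headroom. */
#include <stdint.h>
#include <stdio.h>

#define NETDEV_MAX_BURST 32u     /* per-PMD receive burst size */
#define MIN_NB_MBUF      16384u  /* extra headroom (approximate) */

static uint32_t
estimate_port_mbufs(uint32_t n_rxq, uint32_t rxq_size,
                    uint32_t n_txq, uint32_t txq_size,
                    uint32_t n_pmds)
{
    return n_rxq * rxq_size            /* fill the rx queues */
           + n_txq * txq_size          /* mbufs stuck in tx queues */
           + n_pmds * NETDEV_MAX_BURST /* in-flight per-PMD bursts */
           + MIN_NB_MBUF;              /* headroom for corner cases */
}

int
main(void)
{
    /* Example: 2 rxqs and 2 txqs of 2048 descriptors each, 2 PMDs. */
    printf("~%u mbufs for this port\n",
           estimate_port_mbufs(2, 2048, 2, 2048, 2));
    return 0;
}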
> > 
> > It's unfortunate that this came to light so close to the release code 
> > freeze - but better late than never as it is a valid problem to be resolved.
> > 
> > I wanted to highlight some options to the community as I don't think the 
> > next steps should be taken in isolation due to the impact this feature has.
> > 
> > There are a number of possibilities for the 2.9 release.
> > 
> > (i) Revert the mempool per port patches and return to the shared mempool 
> > model. There are a number of features and refactorings in place on top of 
> > the change, so this will not be a simple revert. I'm currently investigating 
> > what exactly is involved.
> 
> This looks like a bad thing to do. Shared memory pools have their own issues 
> and hide the real memory usage of each port. Also, reverting seems almost 
> impossible; we'd have to re-implement it from scratch.

Agreed.

> > (ii) Leave the per port mempool implementation as is and flag to users that 
> > memory requirements have increased. Extra memory may have to be provided on 
> > a per-use-case basis.
> 
> That's a good point. I prefer this option if possible.

I was told we have ways to handle this change outside of OVS, which is
not ideal but doable.

It would be nice to have a doc explaining how to estimate the new
amount of memory required though.
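
Until that doc exists, here is a back-of-the-envelope sketch of how one might
estimate it. It assumes roughly 20K mbufs per port (in line with the estimate
sketched earlier in the thread) and about 3 KiB of mempool space per mbuf at
the default 1500B MTU; both figures are assumptions for the example, not
values taken from the code.

/* Back-of-the-envelope hugepage requirement for N ports under the
 * per-port mempool model.  The per-port mbuf count and per-mbuf size
 * below are assumptions for illustration only. */
#include <inttypes.h>
#include <stdio.h>

int
main(void)
{
    uint64_t n_ports = 10;
    uint64_t mbufs_per_port = 20480;  /* assumed per-port estimate */
    uint64_t bytes_per_mbuf = 3072;   /* ~1500B data room + metadata */

    uint64_t total_mib = n_ports * mbufs_per_port * bytes_per_mbuf
                         / (1024 * 1024);

    printf("~%" PRIu64 " MiB of hugepages for %" PRIu64 " ports\n",
           total_mib, n_ports);
    return 0;
}

With those assumptions, ten such ports come out at roughly 600 MiB of
hugepages, so the port count and queue configuration dominate the requirement.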

 
> > (iii) Reduce the amount of memory allocated per mempool per port. An RFC to 
> > this effect was submitted by Kevin, but on follow-up the feeling is that it 
> > does not resolve the issue adequately.
> 
> IMHO, we should not reduce the mempool size by hardcoding a small value. 
> Different installations have their own requirements because of different 
> numbers of HW NICs and CPU cores. I believe that 20-30K is a normal 
> per-port size for now.
> 
> How about adding an upper limit for the mempool size, configurable at boot time?
> See the code below for an example (not tested):

Thanks for the quick patch.  I see two problems:
1) Existing environments might stop working after an upgrade due to
insufficient memory.  In that case this solution would help, but one
could just as easily increase the amount of hugepages and move on,
since either way manual intervention is required.

2) New installations would need to follow the new documentation
anyway, so I'm not sure how this would help there.

My concern is that this is another tunable to consider, and most
probably to be misused, unless we think the number of buffers is
currently overestimated.


> -------------------------------------------------------------------------------
> diff --git a/lib/dpdk-stub.c b/lib/dpdk-stub.c
> index 3602180..ed54e06 100644
> --- a/lib/dpdk-stub.c
> +++ b/lib/dpdk-stub.c
> @@ -54,3 +54,9 @@ dpdk_vhost_iommu_enabled(void)

[...]

> > (iv) Introduce a feature to allow users to configure the mempool as
> > shared or on a per-port basis: this would be the best of both
> > worlds, but given the proximity to the 2.9 freeze I don't think
> > it's feasible by the end of January.
> 
> This may solve some issues, but yes, it's definitely not for 2.9.
> A good implementation and extensive testing would be required.

+1 here, too late.

Thanks,
-- 
Flavio