On 05/27/2018 06:04 PM, Lam, Tiago wrote:
> Hi Kevin,
> 
> On 25/05/2018 18:22, Kevin Traynor wrote:
>> Hi Tiago,
>>
> 
> [snip]
> 
>>>> +
>>>> +static struct dpdk_mp *
>>>> +dpdk_mp_get(struct netdev_dpdk *dev, int mtu, bool per_port_mp)
>>>> +{
>>>> +    struct dpdk_mp *dmp;
>>>> +    bool reuse = false;
>>>>  
>>>>      ovs_mutex_lock(&dpdk_mp_mutex);
>>>> -    if (dpdk_mp_full(mp)) {
>>>> -        VLOG_DBG("Freeing mempool \"%s\"", mp->name);
>>>> -        rte_mempool_free(mp);
>>>> -    } else {
>>>> -        struct dpdk_mp *dmp;
>>>> +    /* Check if shared mempools are being used, if so check existing 
>>>> mempools
>>>> +     * to see if reuse is possible. */
>>>> +    if (!per_port_mp) {
>>>> +        LIST_FOR_EACH (dmp, list_node, &dpdk_mp_list) {
>>>> +            if (dmp->socket_id == dev->requested_socket_id
>>>> +                && dmp->mtu == mtu) {
>>>> +                VLOG_DBG("Reusing mempool \"%s\"", dmp->mp->name);
>>>> +                dmp->refcount++;
>>>> +                reuse = true;
>>>> +                break;
>>>> +            }
>>>> +        }
>>>> +    }
>>>> +    /* Sweep mempools after reuse or before create. */
>>>> +    dpdk_mp_sweep();
>>>
>>> *Should dpdk_mp_sweep() be called when destroying an interface as well,
>>> in common_destruct()? While testing, I've noticed that if I add one port
>>> and delete the same port the mempool will still be allocated until you
>>> add another port, since dpdk_mp_sweep() will only be called on the next
>>> call to dpdp_mp_get(). This could potentially create problems in the
>>> per-port model where one deletes a certain number of ports and can't add
>>> any other ports because there's (hanging) mempools holding memory.
>>>
>>
>> With the shared model, it was deliberate to leave the dpdk_mp_sweep() as
>> late as possible to give more time for buffers to be returned to it and
>> to allow for the possibility that it might be reused. In order to not
>> require any additional memory, it is done just before a new mempool is
>> created.
>>
> 
> Thanks for giving some context, I also found that to be the case while
> testing with both models.
> 
>> It's not a problem to call dpdk_mp_sweep() anytime but not sure I follow
>> the issue you've raised - sweep would still be called before mempools
>> for any new port are created. Is there a case missed?
>>
> 
> Maybe I haven't explained myself as clearly as I wanted to. I wouldn't
> say it is a missed case as much as a consequence of what you just
> mentioned - "leave the dpdk_mp_sweep() as late as possible", but I
> thought it would be best to flag it just so we are aware.
> 
> Let's say you are using the per-port model and you add port X, which
> gets mempool A allocated to it. And let's say you immediately delete
> that port X. What I was seeing while testing is that mempool A is not
> freed right away, only when a new port, let's say port Y, is added. This
> also goes in-line with what you just mentioned.
> 
> Now, my point is that this could potentially become an issue in the
> per-port model, since it doesn't reuse the mempools. If you add a bunch
> of ports and then delete them in a batch (or maybe this is not a
> use-case?), if you try to add more ports you may find yourself out of
> memory, because the mempools haven't actually been freed when deleting
> the ports.
> 

I think it will be ok, because the sweep should be still called at least
once before creating the next mempool.

>>>> +
>>>> +/* Depending on the memory model being used this function tries
>>>> + * identify and reuse an existing mempool or tries to allocate new
>>>> + * mempool on requested_socket_id with mbuf size corresponding to
>>>> + * requested_mtu. On success new configuration will be applied.
>>>>   * On error, device will be left unchanged. */
>>>>  static int
>>>>  netdev_dpdk_mempool_configure(struct netdev_dpdk *dev)
>>>>      OVS_REQUIRES(dev->mutex)
>>>>  {
>>>>      uint32_t buf_size = dpdk_buf_size(dev->requested_mtu);
>>>> -    struct rte_mempool *mp;
>>>> +    struct dpdk_mp *mp;
>>>>      int ret = 0;
>>>> +    bool per_port_mp = dpdk_per_port_mempool_enabled();
>>>>  
>>>> -    dpdk_mp_sweep();
>>>> +    /* With shared mempools we do not need to configure a mempool if the 
>>>> MTU
>>>> +     * and socket ID have not changed, the previous configuration is still
>>>> +     * valid so return 0 */
>>>> +    if (dev->mtu == dev->requested_mtu
>>>> +        && dev->socket_id == dev->requested_socket_id
>>>> +        && (!per_port_mp)) {
>>>
>>> I also find that moving the `(!per_port_mp)` condition to the beginning
>>> improves readability here. It even goes in hand with your comment just
>>> above - "With shared mempools we do not ...".
>>>
>>>> +        return ret;
>>>> +    }
>>>>  
>>>> -    mp = dpdk_mp_create(dev, FRAME_LEN_TO_MTU(buf_size));
>>>> +    mp = dpdk_mp_get(dev, FRAME_LEN_TO_MTU(buf_size), per_port_mp);
>>>>      if (!mp) {
>>>>          VLOG_ERR("Failed to create memory pool for netdev "
>>>>                   "%s, with MTU %d on socket %d: %s\n",
>>>>                   dev->up.name, dev->requested_mtu, 
>>>> dev->requested_socket_id,
>>>> -                 rte_strerror(rte_errno));
>>>> -        ret = rte_errno;
>>>> +        rte_strerror(rte_errno));
>>>
>>> Missing indentation here.
>>>
>>>> +        return rte_errno;
>>>>      } else {
>>>> -        /* If a new MTU was requested and its rounded value equals the one
>>>> -         * that is currently used, then the existing mempool is returned. 
>>>> */
>>>> -        if (dev->mp != mp) {
>>>> -            /* A new mempool was created, release the previous one. */
>>>> -            dpdk_mp_release(dev->mp);
>>>> -        } else {
>>>> -            ret = EEXIST;
>>>> +        /* Check for any pre-existing dpdk_mp for the device */
>>>> +        if (dev->dpdk_mp != NULL) {
>>>> +            /* If a new MTU was requested and its rounded value equals the
>>>> +             * one that is currently used, then the existing mempool is
>>>> +             * returned. */
>>>> +            if (dev->dpdk_mp->mp != mp->mp) {
>>>> +                /* A new mempool was created, release the previous one. */
>>>> +                dpdk_mp_put(dev->dpdk_mp);
>>>> +            } else {
>>>> +                /* If the mempool already exists in the current dpdk_mp 
>>>> then
>>>> +                 * we need to ensure dpdk_mp that was just created is 
>>>> freed in
>>>> +                 * the next sweep as it will not be used. */
>>> The code path around the `else` block here will only be called when
>>> `!per_port_mp`. This is because `dpdk_mp_get()` will only return an
>>> already existing mempool when using the shared model. Otherwise a new
>>> one is always returned, and thus the `if (dev->dpdk_mp->mp != mp->mp)`
>>> will be true. Am I reading this right? If so then refactoring this a bit
>>> to differentiate on `per_port_mp` might help on readability - this goes
>>> in-line with Kevin's comment about making this piece a bit more readable.
>>>
>>> On the same note, the comment above mislead me to think that the
>>> allocated `mp` is being freed, which would result in error since the
>>> same `mp` is then assigned below. Instead, what it is doing is
>>> decrementing the refcount in struct dpdk_mp, which might end up being
>>> freed on the next dpdk_mp_sweep() if `refcount=0`. But that won't happen
>>> on the shared model unless no port is using the mempool.
>>>
>>
>> but this port is using it :-)
>>
>> +            if (dev->dpdk_mp->mp != mp->mp) {
>>
>> is false, which means they both have the same mempool. The refcount in
>> dpdk_mp only counts user of the dpdk_mp, not the actual mp, so it could
>> then be freed while still needed.
>>
> 
> I'm not sure I understand your point entirely. It is indeed true that
> both `mp`s will be equal and thus `dpdk_mp_put()` will be called,
> decrementing the `refcount`. But `refcount` shouldn't get to 0 (and thus
> the mempool shouldn't be freed during sweep) as the previous call to
> `dpdk_mp_get()` is also incrementing `refcount` by 1.
> 

sure, that is the case if the dpdk_mp structs are guaranteed to be same
also, but that is dependent on how mp_get/create works.

> (In fact, the `else` block seems to be here only to decrement the
> `refcount` when a port has been reconfigured and ends up reusing the
> same mempool it was using before. In that case, the previous call to
> `dpdk_mp_get()` is "wrongly" incrementing `refcount` by 1.)
> 
> I also tried to crash the patch by exercising this part of the code with
> the following:
> 1. Add one port - `refcount` here was 1;
> 2. Reconfigure that port (but keep the MTU, so the same mempool gets
> reused) - `refcount` here was 1;
> 3. Add a second port (this should trigger a call to `dpdk_mp_sweep()`) -
> `refcount` here was 2;
> 4. Send some traffic.
> 
> And wasn't able to do so. I do agree, though, that this part can be
> improved.
> 
> Regards,
> Tiago.
> 

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to