On 05/27/2018 06:04 PM, Lam, Tiago wrote: > Hi Kevin, > > On 25/05/2018 18:22, Kevin Traynor wrote: >> Hi Tiago, >> > > [snip] > >>>> + >>>> +static struct dpdk_mp * >>>> +dpdk_mp_get(struct netdev_dpdk *dev, int mtu, bool per_port_mp) >>>> +{ >>>> + struct dpdk_mp *dmp; >>>> + bool reuse = false; >>>> >>>> ovs_mutex_lock(&dpdk_mp_mutex); >>>> - if (dpdk_mp_full(mp)) { >>>> - VLOG_DBG("Freeing mempool \"%s\"", mp->name); >>>> - rte_mempool_free(mp); >>>> - } else { >>>> - struct dpdk_mp *dmp; >>>> + /* Check if shared mempools are being used, if so check existing >>>> mempools >>>> + * to see if reuse is possible. */ >>>> + if (!per_port_mp) { >>>> + LIST_FOR_EACH (dmp, list_node, &dpdk_mp_list) { >>>> + if (dmp->socket_id == dev->requested_socket_id >>>> + && dmp->mtu == mtu) { >>>> + VLOG_DBG("Reusing mempool \"%s\"", dmp->mp->name); >>>> + dmp->refcount++; >>>> + reuse = true; >>>> + break; >>>> + } >>>> + } >>>> + } >>>> + /* Sweep mempools after reuse or before create. */ >>>> + dpdk_mp_sweep(); >>> >>> *Should dpdk_mp_sweep() be called when destroying an interface as well, >>> in common_destruct()? While testing, I've noticed that if I add one port >>> and delete the same port the mempool will still be allocated until you >>> add another port, since dpdk_mp_sweep() will only be called on the next >>> call to dpdp_mp_get(). This could potentially create problems in the >>> per-port model where one deletes a certain number of ports and can't add >>> any other ports because there's (hanging) mempools holding memory. >>> >> >> With the shared model, it was deliberate to leave the dpdk_mp_sweep() as >> late as possible to give more time for buffers to be returned to it and >> to allow for the possibility that it might be reused. In order to not >> require any additional memory, it is done just before a new mempool is >> created. >> > > Thanks for giving some context, I also found that to be the case while > testing with both models. > >> It's not a problem to call dpdk_mp_sweep() anytime but not sure I follow >> the issue you've raised - sweep would still be called before mempools >> for any new port are created. Is there a case missed? >> > > Maybe I haven't explained myself as clearly as I wanted to. I wouldn't > say it is a missed case as much as a consequence of what you just > mentioned - "leave the dpdk_mp_sweep() as late as possible", but I > thought it would be best to flag it just so we are aware. > > Let's say you are using the per-port model and you add port X, which > gets mempool A allocated to it. And let's say you immediately delete > that port X. What I was seeing while testing is that mempool A is not > freed right away, only when a new port, let's say port Y, is added. This > also goes in-line with what you just mentioned. > > Now, my point is that this could potentially become an issue in the > per-port model, since it doesn't reuse the mempools. If you add a bunch > of ports and then delete them in a batch (or maybe this is not a > use-case?), if you try to add more ports you may find yourself out of > memory, because the mempools haven't actually been freed when deleting > the ports. >
I think it will be ok, because the sweep should be still called at least once before creating the next mempool. >>>> + >>>> +/* Depending on the memory model being used this function tries >>>> + * identify and reuse an existing mempool or tries to allocate new >>>> + * mempool on requested_socket_id with mbuf size corresponding to >>>> + * requested_mtu. On success new configuration will be applied. >>>> * On error, device will be left unchanged. */ >>>> static int >>>> netdev_dpdk_mempool_configure(struct netdev_dpdk *dev) >>>> OVS_REQUIRES(dev->mutex) >>>> { >>>> uint32_t buf_size = dpdk_buf_size(dev->requested_mtu); >>>> - struct rte_mempool *mp; >>>> + struct dpdk_mp *mp; >>>> int ret = 0; >>>> + bool per_port_mp = dpdk_per_port_mempool_enabled(); >>>> >>>> - dpdk_mp_sweep(); >>>> + /* With shared mempools we do not need to configure a mempool if the >>>> MTU >>>> + * and socket ID have not changed, the previous configuration is still >>>> + * valid so return 0 */ >>>> + if (dev->mtu == dev->requested_mtu >>>> + && dev->socket_id == dev->requested_socket_id >>>> + && (!per_port_mp)) { >>> >>> I also find that moving the `(!per_port_mp)` condition to the beginning >>> improves readability here. It even goes in hand with your comment just >>> above - "With shared mempools we do not ...". >>> >>>> + return ret; >>>> + } >>>> >>>> - mp = dpdk_mp_create(dev, FRAME_LEN_TO_MTU(buf_size)); >>>> + mp = dpdk_mp_get(dev, FRAME_LEN_TO_MTU(buf_size), per_port_mp); >>>> if (!mp) { >>>> VLOG_ERR("Failed to create memory pool for netdev " >>>> "%s, with MTU %d on socket %d: %s\n", >>>> dev->up.name, dev->requested_mtu, >>>> dev->requested_socket_id, >>>> - rte_strerror(rte_errno)); >>>> - ret = rte_errno; >>>> + rte_strerror(rte_errno)); >>> >>> Missing indentation here. >>> >>>> + return rte_errno; >>>> } else { >>>> - /* If a new MTU was requested and its rounded value equals the one >>>> - * that is currently used, then the existing mempool is returned. >>>> */ >>>> - if (dev->mp != mp) { >>>> - /* A new mempool was created, release the previous one. */ >>>> - dpdk_mp_release(dev->mp); >>>> - } else { >>>> - ret = EEXIST; >>>> + /* Check for any pre-existing dpdk_mp for the device */ >>>> + if (dev->dpdk_mp != NULL) { >>>> + /* If a new MTU was requested and its rounded value equals the >>>> + * one that is currently used, then the existing mempool is >>>> + * returned. */ >>>> + if (dev->dpdk_mp->mp != mp->mp) { >>>> + /* A new mempool was created, release the previous one. */ >>>> + dpdk_mp_put(dev->dpdk_mp); >>>> + } else { >>>> + /* If the mempool already exists in the current dpdk_mp >>>> then >>>> + * we need to ensure dpdk_mp that was just created is >>>> freed in >>>> + * the next sweep as it will not be used. */ >>> The code path around the `else` block here will only be called when >>> `!per_port_mp`. This is because `dpdk_mp_get()` will only return an >>> already existing mempool when using the shared model. Otherwise a new >>> one is always returned, and thus the `if (dev->dpdk_mp->mp != mp->mp)` >>> will be true. Am I reading this right? If so then refactoring this a bit >>> to differentiate on `per_port_mp` might help on readability - this goes >>> in-line with Kevin's comment about making this piece a bit more readable. >>> >>> On the same note, the comment above mislead me to think that the >>> allocated `mp` is being freed, which would result in error since the >>> same `mp` is then assigned below. Instead, what it is doing is >>> decrementing the refcount in struct dpdk_mp, which might end up being >>> freed on the next dpdk_mp_sweep() if `refcount=0`. But that won't happen >>> on the shared model unless no port is using the mempool. >>> >> >> but this port is using it :-) >> >> + if (dev->dpdk_mp->mp != mp->mp) { >> >> is false, which means they both have the same mempool. The refcount in >> dpdk_mp only counts user of the dpdk_mp, not the actual mp, so it could >> then be freed while still needed. >> > > I'm not sure I understand your point entirely. It is indeed true that > both `mp`s will be equal and thus `dpdk_mp_put()` will be called, > decrementing the `refcount`. But `refcount` shouldn't get to 0 (and thus > the mempool shouldn't be freed during sweep) as the previous call to > `dpdk_mp_get()` is also incrementing `refcount` by 1. > sure, that is the case if the dpdk_mp structs are guaranteed to be same also, but that is dependent on how mp_get/create works. > (In fact, the `else` block seems to be here only to decrement the > `refcount` when a port has been reconfigured and ends up reusing the > same mempool it was using before. In that case, the previous call to > `dpdk_mp_get()` is "wrongly" incrementing `refcount` by 1.) > > I also tried to crash the patch by exercising this part of the code with > the following: > 1. Add one port - `refcount` here was 1; > 2. Reconfigure that port (but keep the MTU, so the same mempool gets > reused) - `refcount` here was 1; > 3. Add a second port (this should trigger a call to `dpdk_mp_sweep()`) - > `refcount` here was 2; > 4. Send some traffic. > > And wasn't able to do so. I do agree, though, that this part can be > improved. > > Regards, > Tiago. > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev