On 26.06.2018 12:19, Eelco Chaudron wrote: > > > On 22 Jun 2018, at 21:03, Lam, Tiago wrote: > >> On 18/06/2018 12:28, Eelco Chaudron wrote: >>> >>> >>> On 11 Jun 2018, at 18:21, Tiago Lam wrote: >>> >>>> A new mutex, 'nonpmd_mp_mutex', has been introduced to serialise >>>> allocation and free operations by non-pmd threads on a given mempool. >>>> >>> >>> Can you explain why we need the mutex here? Can't see any reason why >>> rte_pktmbuf_free() needs to be protected for non-pmd threads? >> >> My understanding is that each PMD has a thread-local cache of its own, >> which it can access without locking. However, all non-PMD threads have >> direct access to the mempool, and thus need to lock in order to access >> it. Otherwise memory corruption could ensue. >> >> For OvS this means all these functions which can be called outside of a >> PMD context, or without holding the `non_pmd_mutex` mutex, in struct >> dp_netdev, need to lock before issue operations that modify a mempool. >> >> Hopefully I've put it succinctly enough without botching up the concept >> too much. > > I do not think this is a problem, as DPDK uses rte_mempool_default_cache() to > figure out which cache to use. > As the non PMD threads have a higher than RTE_MAX_LCORE id the cache will not > be used.
It's not the problem. The main reason is that implementation of ring buffers on which the mempools based is not preemptible: lib/librte_mempool/rte_mempool.h: * Note: the mempool implementation is not preemptible. An lcore must not be * interrupted by another task that uses the same mempool (because it uses a * ring which is not preemptible). As soon as non-PMD threads are not pinned, they could be scheduled on the same core and preempted while executing mempool operations. > >>> >>>> free_dpdk_buf() has been modified to make use of the introduced mutex. >>>> >>>> Signed-off-by: Tiago Lam <[email protected]> >>>> --- >>>> lib/netdev-dpdk.c | 21 ++++++++++++++++++++- >>>> 1 file changed, 20 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c >>>> index f546507..efd7c20 100644 >>>> --- a/lib/netdev-dpdk.c >>>> +++ b/lib/netdev-dpdk.c >>>> @@ -294,6 +294,10 @@ static struct ovs_mutex dpdk_mp_mutex >>>> OVS_ACQ_AFTER(dpdk_mutex) >>>> static struct ovs_list dpdk_mp_free_list >>>> OVS_GUARDED_BY(dpdk_mp_mutex) >>>> = OVS_LIST_INITIALIZER(&dpdk_mp_free_list); >>>> >>>> +/* This mutex must be used by non pmd threads when allocating or >>>> freeing >>>> + * mbufs through mempools. */ >>>> +static struct ovs_mutex nonpmd_mp_mutex = OVS_MUTEX_INITIALIZER; >>>> + >>>> /* Wrapper for a mempool released but not yet freed. */ >>>> struct dpdk_mp { >>>> struct rte_mempool *mp; >>>> @@ -461,6 +465,8 @@ struct netdev_rxq_dpdk { >>>> dpdk_port_t port_id; >>>> }; >>>> >>>> +static bool dpdk_thread_is_pmd(void); >>>> + >>>> static void netdev_dpdk_destruct(struct netdev *netdev); >>>> static void netdev_dpdk_vhost_destruct(struct netdev *netdev); >>>> >>>> @@ -494,6 +500,12 @@ dpdk_buf_size(int mtu) >>>> NETDEV_DPDK_MBUF_ALIGN); >>>> } >>>> >>>> +static bool >>>> +dpdk_thread_is_pmd(void) >>>> +{ >>>> + return rte_lcore_id() != NON_PMD_CORE_ID; >>> >>> Will this continue to work with newer DPDK versions? I've not looked at >>> it in detail, but I did notice that the vhost threads in the newer DPDK >>> now get created with rte_ctrl_thread_create() and does some lcore >>> mangling. >>> >> >> I had a look. This seems to be a wrapper for creating management >> threads only, to standardize pthread creation around that in DPDK. Usual >> EAL thread creation and assigning of LCORE_ID_ANY to non-EAL threads >> seems to still be the same. > > Thanks for checking. >> >>>> +} >>>> + >>>> /* Allocates an area of 'sz' bytes from DPDK. The memory is zero'ed. >>>> * >>>> * Unlike xmalloc(), this function can return NULL on failure. */ >>>> @@ -506,9 +518,16 @@ dpdk_rte_mzalloc(size_t sz) >>>> void >>>> free_dpdk_buf(struct dp_packet *p) >>>> { >>>> - struct rte_mbuf *pkt = (struct rte_mbuf *) p; >>>> + if (!dpdk_thread_is_pmd()) { >>>> + ovs_mutex_lock(&nonpmd_mp_mutex); >>>> + } >>>> >>>> + struct rte_mbuf *pkt = (struct rte_mbuf *) p; >>> >>> Should we not use a container_of macro here? >>> >> >> I'll use it for the next iteration. >> >> Tiago. > > > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
