On Thu, Apr 23, 2020 at 9:38 PM Andrew Rybchenko <[email protected]> wrote: > > On 4/13/20 5:21 PM, [email protected] wrote: > > From: Tonghao Zhang <[email protected]> > > > > The order of mempool initiation affects mempool index in the > > rte_mempool_ops_table. For example, when building APPs with: > > > > $ gcc -lrte_mempool_bucket -lrte_mempool_ring ... > > > > The "bucket" mempool will be registered firstly, and its index > > in table is 0 while the index of "ring" mempool is 1. DPDK > > uses the mk/rte.app.mk to build APPs, and others, for example, > > Open vSwitch, use the libdpdk.a or libdpdk.so to build it. > > The mempool lib linked in dpdk and Open vSwitch is different. > > > > The mempool can be used between primary and secondary process, > > such as dpdk-pdump and pdump-pmd/Open vSwitch(pdump enabled). > > There will be a crash because dpdk-pdump creates the "ring_mp_mc" > > ring which index in table is 0, but the index of "bucket" ring > > is 0 in Open vSwitch. If Open vSwitch use the index 0 to get > > mempool ops and malloc memory from mempool. The crash will occur: > > > > bucket_dequeue (access null and crash) > > rte_mempool_get_ops (should get "ring_mp_mc", > > but get "bucket" mempool) > > rte_mempool_ops_dequeue_bulk > > ... > > rte_pktmbuf_alloc > > rte_pktmbuf_copy > > pdump_copy > > pdump_rx > > rte_eth_rx_burst > > > > To avoid the crash, there are some solution: > > * constructor priority: Different mempool uses different > > priority in RTE_INIT, but it's not easy to maintain. > > > > * change mk/rte.app.mk: Change the order in mk/rte.app.mk to > > be same as libdpdk.a/libdpdk.so, but when adding a new mempool > > driver in future, we must make sure the order. > > > > * register mempool orderly: Sort the mempool when registering, > > so the lib linked will not affect the index in mempool table. > > but the number of mempool libraries may be different. > > > > * shared memzone: The primary process allocates a struct in > > shared memory named memzone, When we register a mempool ops, > > we first get a name and id from the shared struct: with the lock held, > > lookup for the registered name and return its index, else > > get the last id and copy the name in the struct. > > > > Previous discussion: > > https://mails.dpdk.org/archives/dev/2020-March/159354.html > > > > Suggested-by: Olivier Matz <[email protected]> > > Suggested-by: Jerin Jacob <[email protected]> > > Signed-off-by: Tonghao Zhang <[email protected]> > > --- > > v2: > > * fix checkpatch warning > > --- > > lib/librte_mempool/rte_mempool.h | 28 +++++++++++- > > lib/librte_mempool/rte_mempool_ops.c | 89 > > ++++++++++++++++++++++++++++-------- > > 2 files changed, 96 insertions(+), 21 deletions(-) > > > > diff --git a/lib/librte_mempool/rte_mempool.h > > b/lib/librte_mempool/rte_mempool.h > > index c90cf31467b2..2709b9e1d51b 100644 > > --- a/lib/librte_mempool/rte_mempool.h > > +++ b/lib/librte_mempool/rte_mempool.h > > @@ -50,6 +50,7 @@ > > #include <rte_ring.h> > > #include <rte_memcpy.h> > > #include <rte_common.h> > > +#include <rte_init.h> > > > > #ifdef __cplusplus > > extern "C" { > > @@ -678,7 +679,6 @@ struct rte_mempool_ops { > > */ > > struct rte_mempool_ops_table { > > rte_spinlock_t sl; /**< Spinlock for add/delete. */ > > - uint32_t num_ops; /**< Number of used ops structs in the table. > > */ > > /** > > * Storage for all possible ops structs. > > */ > > @@ -910,6 +910,30 @@ int rte_mempool_ops_get_info(const struct rte_mempool > > *mp, > > */ > > int rte_mempool_register_ops(const struct rte_mempool_ops *ops); > > > > +struct rte_mempool_shared_ops { > > + size_t num_mempool_ops; > > Is there any specific reason to change type from uint32_t used > above to size_t? I think that uint32_t is better here since > it is just a number, not a size of memory or related value. Thanks for you review. busy to commit other patch, so that be delayed. I will change that in next version. > > + struct { > > + char name[RTE_MEMPOOL_OPS_NAMESIZE]; > > + } mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX]; > > + > > + rte_spinlock_t mempool; > > +}; > > + > > +static inline int > > +mempool_ops_register_cb(const void *arg) > > +{ > > + const struct rte_mempool_ops *h = (const struct rte_mempool_ops *)arg; > > + > > + return rte_mempool_register_ops(h); > > +} > > + > > +static inline void > > +mempool_ops_register(const struct rte_mempool_ops *ops) > > +{ > > + rte_init_register(mempool_ops_register_cb, (const void *)ops, > > + RTE_INIT_PRE); > > +} > > + > > /** > > * Macro to statically register the ops of a mempool handler. > > * Note that the rte_mempool_register_ops fails silently here when > > @@ -918,7 +942,7 @@ int rte_mempool_ops_get_info(const struct rte_mempool > > *mp, > > #define MEMPOOL_REGISTER_OPS(ops) \ > > RTE_INIT(mp_hdlr_init_##ops) \ > > { \ > > - rte_mempool_register_ops(&ops); \ > > + mempool_ops_register(&ops); \ > > } > > > > /** > > diff --git a/lib/librte_mempool/rte_mempool_ops.c > > b/lib/librte_mempool/rte_mempool_ops.c > > index 22c5251eb068..b10fda662db6 100644 > > --- a/lib/librte_mempool/rte_mempool_ops.c > > +++ b/lib/librte_mempool/rte_mempool_ops.c > > @@ -14,43 +14,95 @@ > > /* indirect jump table to support external memory pools. */ > > struct rte_mempool_ops_table rte_mempool_ops_table = { > > .sl = RTE_SPINLOCK_INITIALIZER, > > - .num_ops = 0 > > }; > > > > -/* add a new ops struct in rte_mempool_ops_table, return its index. */ > > -int > > -rte_mempool_register_ops(const struct rte_mempool_ops *h) > > +static int > > +rte_mempool_register_shared_ops(const char *name) > > { > > - struct rte_mempool_ops *ops; > > - int16_t ops_index; > > + static bool mempool_shared_ops_inited; > > + struct rte_mempool_shared_ops *shared_ops; > > + const struct rte_memzone *mz; > > + uint32_t ops_index = 0; > > + > > I think we should sanity check 'name' here to be not > empty string (see review notes below). OK > > + if (rte_eal_process_type() == RTE_PROC_PRIMARY && > > + !mempool_shared_ops_inited) { > > + > > + mz = rte_memzone_reserve("mempool_ops_shared", > > + sizeof(*shared_ops), 0, 0); > > + if (mz == NULL) > > + return -ENOMEM; > > + > > + shared_ops = mz->addr; > > + shared_ops->num_mempool_ops = 0; > > + rte_spinlock_init(&shared_ops->mempool); > > + > > + mempool_shared_ops_inited = true; > > + } else { > > + mz = rte_memzone_lookup("mempool_ops_shared"); > > + if (mz == NULL) > > + return -ENOMEM; > > + > > + shared_ops = mz->addr; > > + } > > > > - rte_spinlock_lock(&rte_mempool_ops_table.sl); > > + rte_spinlock_lock(&shared_ops->mempool); > > > > - if (rte_mempool_ops_table.num_ops >= > > - RTE_MEMPOOL_MAX_OPS_IDX) { > > - rte_spinlock_unlock(&rte_mempool_ops_table.sl); > > + if (shared_ops->num_mempool_ops >= RTE_MEMPOOL_MAX_OPS_IDX) { > > + rte_spinlock_unlock(&shared_ops->mempool); > > RTE_LOG(ERR, MEMPOOL, > > "Maximum number of mempool ops structs exceeded\n"); > > return -ENOSPC; > > } > > > > + while (shared_ops->mempool_ops[ops_index].name[0]) { > > Please, compare with '\0' as DPDK style guide says. > > > + if (!strcmp(name, shared_ops->mempool_ops[ops_index].name)) { > > + rte_spinlock_unlock(&shared_ops->mempool); > > + return ops_index; > > + } > > + > > + ops_index++; > > + } > > + > > + strlcpy(shared_ops->mempool_ops[ops_index].name, name, > > + sizeof(shared_ops->mempool_ops[0].name)); > > + > > + shared_ops->num_mempool_ops++; > > + > > + rte_spinlock_unlock(&shared_ops->mempool); > > + return ops_index; > > +} > > + > > +/* add a new ops struct in rte_mempool_ops_table, return its index. */ > > +int > > +rte_mempool_register_ops(const struct rte_mempool_ops *h) > > +{ > > + struct rte_mempool_ops *ops; > > + int16_t ops_index; > > + > > if (h->alloc == NULL || h->enqueue == NULL || > > - h->dequeue == NULL || h->get_count == NULL) { > > - rte_spinlock_unlock(&rte_mempool_ops_table.sl); > > + h->dequeue == NULL || h->get_count == NULL) { > > Changing formatting just makes review a bit more harder. > > > RTE_LOG(ERR, MEMPOOL, > > "Missing callback while registering mempool ops\n"); > > + rte_errno = EINVAL; > > Why is it done in the patch? For me it looks like logically > different change if it is really required (properly motivated). I guess that should add the err code to rte_rrno ? but I am fine, not to do that. > > return -EINVAL; > > } > > > > if (strlen(h->name) >= sizeof(ops->name) - 1) { > > - rte_spinlock_unlock(&rte_mempool_ops_table.sl); > > - RTE_LOG(DEBUG, EAL, "%s(): mempool_ops <%s>: name too long\n", > > - __func__, h->name); > > + RTE_LOG(ERR, MEMPOOL, > > + "The registering mempool_ops <%s>: name too long\n", > > + h->name); > > Why do you change from DEBUG to ERR here? It it not > directly related to the purpose of the patch. Yes, because it is an error, so change that type. I change it in separate patch ? > > rte_errno = EEXIST; > > return -EEXIST; > > } > > > > - ops_index = rte_mempool_ops_table.num_ops++; > > + ops_index = rte_mempool_register_shared_ops(h->name); > > + if (ops_index < 0) { > > + rte_errno = -ops_index; > > + return ops_index; > > + } > > + > > + rte_spinlock_lock(&rte_mempool_ops_table.sl); > > + > > ops = &rte_mempool_ops_table.ops[ops_index]; > > strlcpy(ops->name, h->name, sizeof(ops->name)); > > ops->alloc = h->alloc; > > @@ -165,9 +217,8 @@ struct rte_mempool_ops_table rte_mempool_ops_table = { > > if (mp->flags & MEMPOOL_F_POOL_CREATED) > > return -EEXIST; > > > > - for (i = 0; i < rte_mempool_ops_table.num_ops; i++) { > > - if (!strcmp(name, > > - rte_mempool_ops_table.ops[i].name)) { > > + for (i = 0; i < RTE_MEMPOOL_MAX_OPS_IDX; i++) { > > + if (!strcmp(name, rte_mempool_ops_table.ops[i].name)) { > > Since rte_mempool_ops_table is filled in which zeros, > name string is empty by default. So, request with empty name > will match the first unused entry. I guess it is not what we > want here. I think we should handle empty string before the > loop and return -EINVAL. OK, thanks. > > ops = &rte_mempool_ops_table.ops[i]; > > break; > > } > > >
-- Best regards, Tonghao

