On Tue, Mar 24, 2020 at 5:36 PM Andrew Rybchenko
<arybche...@solarflare.com> wrote:
>
> On 3/9/20 11:27 AM, Olivier Matz wrote:
> > Hi,
> >
> > On Mon, Mar 09, 2020 at 11:01:25AM +0800, Tonghao Zhang wrote:
> >> On Sat, Mar 7, 2020 at 8:54 PM Andrew Rybchenko
> >> <arybche...@solarflare.com> wrote:
> >>>
> >>> On 3/7/20 3:51 PM, Andrew Rybchenko wrote:
> >>>> On 3/6/20 4:37 PM, Jerin Jacob wrote:
> >>>>> On Fri, Mar 6, 2020 at 7:06 PM <xiangxia.m....@gmail.com> wrote:
> >>>>>> From: Tonghao Zhang <xiangxia.m....@gmail.com>
> >>>>>>
> >>>>>> The order of mempool driver initialization affects the ops index
> >>>>>> in rte_mempool_ops_table. For example, when building apps with:
> >>>>>>
> >>>>>> $ gcc -lrte_mempool_bucket -lrte_mempool_ring ...
> >>>>>>
> >>>>>> the "bucket" mempool ops are registered first, so their index in
> >>>>>> the table is 0 while the index of the "ring" ops is 1. DPDK uses
> >>>>>> mk/rte.app.mk to build its apps, while others, for example
> >>>>>> Open vSwitch, link against libdpdk.a or libdpdk.so, so the order
> >>>>>> of the mempool libs linked into DPDK apps and into Open vSwitch
> >>>>>> can differ.
> >>>>>>
> >>>>>> A mempool can be shared between a primary and a secondary
> >>>>>> process, e.g. dpdk-pdump and a pdump-enabled PMD/Open vSwitch.
> >>>>>> A crash happens because dpdk-pdump creates the "ring_mp_mc" ops
> >>>>>> at index 0 of its table, while in Open vSwitch index 0 is the
> >>>>>> "bucket" ops. When Open vSwitch uses index 0 to get the mempool
> >>>>>> ops and allocate memory from the mempool, it crashes:
> >>>>>>
> >>>>>>   bucket_dequeue (access null and crash)
> >>>>>>   rte_mempool_get_ops (should get "ring_mp_mc",
> >>>>>>                        but gets the "bucket" mempool ops)
> >>>>>>   rte_mempool_ops_dequeue_bulk
> >>>>>>   ...
> >>>>>>   rte_pktmbuf_alloc
> >>>>>>   rte_pktmbuf_copy
> >>>>>>   pdump_copy
> >>>>>>   pdump_rx
> >>>>>>   rte_eth_rx_burst
> >>>>>>
> >>>>>> To avoid the crash, there are several possible solutions:
> >>>>>> * constructor priority: give each mempool driver a different
> >>>>>>   priority in RTE_INIT, but that is not easy to maintain.
> >>>>>>
> >>>>>> * change mk/rte.app.mk: make the order in mk/rte.app.mk match
> >>>>>>   libdpdk.a/libdpdk.so, but every mempool driver added in the
> >>>>>>   future would have to preserve that order.
> >>>>>>
> >>>>>> * register mempool ops in sorted order: sort the ops by name
> >>>>>>   when registering them, so the libs linked no longer affect the
> >>>>>>   index in the mempool ops table.
> >>>>>>
> >>>>>> Signed-off-by: Tonghao Zhang <xiangxia.m....@gmail.com>
> >>>>>> Acked-by: Olivier Matz <olivier.m...@6wind.com>
> >>>>> Acked-by: Jerin Jacob <jer...@marvell.com>
> >>>>
> >>>> The patch is OK, but the fact that the ops index changes during
> >>>> the mempool driver lifetime is frightening. In fact it breaks the
> >>>> rte_mempool_register_ops() return value semantics (read: an API
> >>>> break). The return value is not used in DPDK, but it is a public
> >>>> function. If I'm not mistaken, this should be taken into account.
> >
> > Good points.
> >
> > The fact that the ops index changes during the mempool driver
> > lifetime is indeed frightening, especially knowing that this is a
> > dynamic registration that could happen at any moment in the life of
> > the application. Also, breaking the ABI is not desirable.
> >
> > Let me try to propose something else to solve your issue:
> >
> > 1/ At init, the primary process allocates a struct in shared memory
> >    (a named memzone):
> >
> >    struct rte_mempool_shared_ops {
> >        size_t num_mempool_ops;
> >        struct {
> >            char name[RTE_MEMPOOL_OPS_NAMESIZE];
> >        } mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
> >        char *mempool_ops_name[RTE_MEMPOOL_MAX_OPS_IDX];
> >        rte_spinlock_t mempool;
> >    };
> >
> > 2/ When we register a mempool ops, we first get a name and id from
> >    the shared struct: with the lock held, look up the registered name
> >    and return its index, else take the next id and copy the name into
> >    the struct.
> >
> > 3/ Then do as before (in the per-process global table), except that
> >    we reuse the registered id.
> >
> > We can remove the num_ops field from rte_mempool_ops_table.
> >
> > Thoughts?
>
> I like the solution.

The patch will be sent, thanks.
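
To make sure I understood the proposal before sending the patch, here is a
rough, untested sketch of steps 1/ and 2/. The memzone name, the helper
names and the lock field name are placeholders of mine, not the final API,
and I dropped the mempool_ops_name pointer array since pointers into
per-process memory would not be valid in the other process:

/*
 * Sketch only: MEMPOOL_OPS_MZNAME, mempool_ops_shared_init() and
 * mempool_ops_shared_get_id() are placeholder names.
 */
#include <errno.h>
#include <string.h>

#include <rte_eal.h>
#include <rte_memzone.h>
#include <rte_spinlock.h>
#include <rte_string_fns.h>
#include <rte_mempool.h>

#define MEMPOOL_OPS_MZNAME "rte_mempool_shared_ops" /* placeholder */

struct rte_mempool_shared_ops {
        rte_spinlock_t lock;
        size_t num_mempool_ops;
        struct {
                char name[RTE_MEMPOOL_OPS_NAMESIZE];
        } mempool_ops[RTE_MEMPOOL_MAX_OPS_IDX];
};

/* 1/ reserve the memzone in the primary, look it up in a secondary */
static struct rte_mempool_shared_ops *
mempool_ops_shared_init(void)
{
        const struct rte_memzone *mz;
        struct rte_mempool_shared_ops *shared;

        if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
                mz = rte_memzone_reserve(MEMPOOL_OPS_MZNAME,
                                         sizeof(*shared),
                                         SOCKET_ID_ANY, 0);
                if (mz == NULL)
                        return NULL;
                shared = mz->addr;
                memset(shared, 0, sizeof(*shared));
                rte_spinlock_init(&shared->lock);
        } else {
                mz = rte_memzone_lookup(MEMPOOL_OPS_MZNAME);
                if (mz == NULL)
                        return NULL;
                shared = mz->addr;
        }
        return shared;
}

/*
 * 2/ return the index already assigned to this ops name, or assign the
 * next free one. The caller then installs the ops at the same index in
 * its per-process rte_mempool_ops_table.
 */
static int
mempool_ops_shared_get_id(struct rte_mempool_shared_ops *shared,
                          const char *name)
{
        size_t i;

        rte_spinlock_lock(&shared->lock);

        for (i = 0; i < shared->num_mempool_ops; i++) {
                if (strcmp(shared->mempool_ops[i].name, name) == 0)
                        goto out; /* already registered by another lib */
        }

        if (shared->num_mempool_ops >= RTE_MEMPOOL_MAX_OPS_IDX) {
                rte_spinlock_unlock(&shared->lock);
                return -ENOSPC;
        }

        strlcpy(shared->mempool_ops[i].name, name,
                RTE_MEMPOOL_OPS_NAMESIZE);
        shared->num_mempool_ops++;
out:
        rte_spinlock_unlock(&shared->lock);
        return (int)i;
}

Then, for step 3/, rte_mempool_register_ops() would store the ops at the
returned index in its per-process table instead of using its own counter,
so the index depends only on the ops name and not on the link order. Does
that match what you had in mind?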
--
Best regards, Tonghao