Hi, Slava > -----邮件原件----- > 发件人: Slava Ovsiienko <[email protected]> > 发送时间: 2021年5月17日 22:15 > 收件人: Feifei Wang <[email protected]>; Matan Azrad > <[email protected]>; Shahaf Shuler <[email protected]> > 抄送: [email protected]; nd <[email protected]>; Ruifeng Wang > <[email protected]> > 主题: RE: [PATCH v2 2/2] net/mlx5: remove unnecessary wmb for Memory > Region cache > > Hi, Feifei > > Thanks you for the patch. > Please, see my notes below about typos and minor commit message > rewording.
Thanks very much for your very careful reviewing.
I will apply these in the next version.
>
> > -----Original Message-----
> > From: Feifei Wang <[email protected]>
> > Sent: Monday, May 17, 2021 13:00
> > To: Matan Azrad <[email protected]>; Shahaf Shuler
> > <[email protected]>; Slava Ovsiienko <[email protected]>
> > Cc: [email protected]; [email protected]; Feifei Wang <[email protected]>;
> > Ruifeng Wang <[email protected]>
> > Subject: [PATCH v2 2/2] net/mlx5: remove unnecessary wmb for Memory
> > Region cache
> >
> > 'dev_gen' is a variable to inform other cores to flush their local
> > cache when global cache is rebuilt. It is unnecessary to add write
> > memory barrier (wmb) before or after its updating for synchronization.
> >
> Would it be better "to trigger all cores to flush their local caches once the
> global MR cache has been rebuilt" ?
>
1.Yes, I think this can be more clear.
> > This is due to MR cache's R/W lock can maintain synchronization
> > between
> > threads:
> I would add empty line here.
2.Done.
> > 1. dev_gen and global cache update ordering inside the lock protected
> > section does not matter. Because other threads cannot take the lock
> > until global cache has been updated. Thus, in out of order platform,
> > even if other agents firstly observed updated dev_gen but global does
> > not update, they also needs to wait the lock. As a result, it is
> > unnecessary to add a wmb
> Type: "need" (no S) -> "have to" would be better ?
>
3.Done.
> > between rebuiling global cache and updating dev_gen to keep the order
>
> rebuiling -> rebuilding
4.Done.
> And let's reword a little bit?
> "wmb between global cache rebuilding and updating the dev_gen to keep
> the memory store order."
>
5.Done.
> > rebuilding global cache and updating dev_gen.
> >
> > 2. Store-Release of unlock can provide the implicit wmb at the level
> > visible by
> can provide -> provides
>
6.Done.
> > software. This makes 'rebuiling global cache' and 'updating dev_gen'
> > be
> Typo: rebuiling -> rebuilding
7.Done.
>
>
> > observed before local_cache starts to be updated by other agents.
> > Thus, wmb after 'updating dev_gen' can be removed.
> >
> > Suggested-by: Ruifeng Wang <[email protected]>
> > Signed-off-by: Feifei Wang <[email protected]>
> > Reviewed-by: Ruifeng Wang <[email protected]>
> > ---
> > drivers/net/mlx5/mlx5_mr.c | 26 ++++++++++----------------
> > 1 file changed, 10 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
> > index
> > e791b6338d..85e5865050 100644
> > --- a/drivers/net/mlx5/mlx5_mr.c
> > +++ b/drivers/net/mlx5/mlx5_mr.c
> > @@ -107,18 +107,15 @@ mlx5_mr_mem_event_free_cb(struct
> > mlx5_dev_ctx_shared *sh,
> > if (rebuild) {
> > mlx5_mr_rebuild_cache(&sh->share_cache);
> > /*
> > - * Flush local caches by propagating invalidation across cores.
> > - * rte_smp_wmb() is enough to synchronize this event. If
> > one of
> > - * freed memsegs is seen by other core, that means the
> > memseg
> > - * has been allocated by allocator, which will come after this
> > - * free call. Therefore, this store instruction (incrementing
> > - * generation below) will be guaranteed to be seen by other
> > core
> > - * before the core sees the newly allocated memory.
> > + * No wmb is needed after updating dev_gen due to store-
> > release of
> > + * unlock can provide the implicit wmb at the level visible by
> > + * software. This makes rebuilt global cache and updated
> > dev_gen
> > + * be observed when local_cache starts to be updating by
> > other
> > + * agents.
> > */
> Let's make comment a less wordy (and try to keep source code concise),
> what about this?
> "No explicit wmb is needed after updating dev_gen due to store-release
> ordering in unlock that provides the implicit barrier at the software visible
> level."
8.That's better than before. A concise comment works better in the code.
>
> > ++sh->share_cache.dev_gen;
> > DRV_LOG(DEBUG, "broadcasting local cache flush, gen=%d",
> > sh->share_cache.dev_gen);
> > - rte_smp_wmb();
> > }
> > rte_rwlock_write_unlock(&sh->share_cache.rwlock);
> > }
> > @@ -411,18 +408,15 @@ mlx5_dma_unmap(struct rte_pci_device *pdev,
> void
> > *addr,
> > (void *)mr);
> > mlx5_mr_rebuild_cache(&sh->share_cache);
> > /*
> > - * Flush local caches by propagating invalidation across cores.
> > - * rte_smp_wmb() is enough to synchronize this event. If one of
> > - * freed memsegs is seen by other core, that means the memseg
> > - * has been allocated by allocator, which will come after this
> > - * free call. Therefore, this store instruction (incrementing
> > - * generation below) will be guaranteed to be seen by other core
> > - * before the core sees the newly allocated memory.
> > + * No wmb is needed after updating dev_gen due to store-release of
> > + * unlock can provide the implicit wmb at the level visible by
> > + * software. This makes rebuilt global cache and updated dev_gen
> > + * be observed when local_cache starts to be updating by other
> > + * agents.
> The same as previous comment above.
9.Done.
>
> Please, apply the same comments to the mlx4 patch:
> http://patches.dpdk.org/project/dpdk/patch/20210517100002.19905-2-
> [email protected]/
>
10.Done.
Best Regards
Feifei
> With best regards,
> Slava

