> -----Original Message-----
> From: Morten Brørup <m...@smartsharesystems.com>
> Sent: Wednesday, March 22, 2023 12:08 PM
> To: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>; Tyler Retzlaff
> <roret...@linux.microsoft.com>
> Cc: Stephen Hemminger <step...@networkplumber.org>; dev@dpdk.org;
> Ruifeng Wang <ruifeng.w...@arm.com>; tho...@monjalon.net; nd
> <n...@arm.com>; nd <n...@arm.com>
> Subject: RE: [PATCH 0/7] replace rte atomics with GCC builtin atomics
> 
> > From: Honnappa Nagarahalli [mailto:honnappa.nagaraha...@arm.com]
> > Sent: Wednesday, 22 March 2023 17.40
> >
> > > From: Morten Brørup <m...@smartsharesystems.com>
> > > Sent: Wednesday, March 22, 2023 11:14 AM
> > >
> > > > From: Tyler Retzlaff [mailto:roret...@linux.microsoft.com]
> > > > Sent: Wednesday, 22 March 2023 16.30
> > > >
> > > > On Wed, Mar 22, 2023 at 03:58:07PM +0100, Morten Brørup wrote:
> > > > > > From: Tyler Retzlaff [mailto:roret...@linux.microsoft.com]
> > > > > > Sent: Wednesday, 22 March 2023 15.22
> > > > > >
> > > > > > On Wed, Mar 22, 2023 at 12:28:44PM +0100, Morten Brørup wrote:
> > > > > > > > From: Tyler Retzlaff [mailto:roret...@linux.microsoft.com]
> > > > > > > > Sent: Friday, 17 March 2023 22.49
> > > > > > > >
> > > > > > > > > On Fri, Mar 17, 2023 at 02:42:26PM -0700, Stephen Hemminger wrote:
> > > > > > > > > On Fri, 17 Mar 2023 13:19:41 -0700 Tyler Retzlaff
> > > > > > > > > <roret...@linux.microsoft.com> wrote:
> > > > > > > > >
> > > > > > > > > > > Replace the use of rte_atomic.h types and functions,
> > > > > > > > > > > instead use GCC supplied C++11 memory model builtins.
> > > > > > > > > >
> > > > > > > > > > > This series covers the libraries and drivers that are
> > > > > > > > > > > built on Windows.
> > > > > > > > > >
> > > > > > > > > > > The code has been converted to use the __atomic builtins,
> > > > > > > > > > > but during conversion i noticed that there may be some
> > > > > > > > > > > additional issues that need to be addressed.
> > > > > > > > >
> > > > > > > > > > I don't think all these cmpset need to use SEQ_CST.
> > > > > > > > > > Especially in the places where it is used in a loop, it
> > > > > > > > > > might be more efficient with one of the other memory orders.
> > > > > > > >
> > > > > > > > i agree.
> > > > > > > >
> > > > > > > > > however, i'm not trying to improve the code with this change,
> > > > > > > > > just decouple it from rte_atomic.h, so i'm trying my best to
> > > > > > > > > avoid any unnecessary semantic change.
> > > > > > > >
> > > > > > > > > certainly if the maintainers of this code wish to weaken
> > > > > > > > > the ordering where appropriate after the change is merged
> > > > > > > > > they can do so, and handily this change makes that easy,
> > > > > > > > > allowing them to test just their change in isolation.
> > > > > > >
> > > > > > > > I agree with the two-step approach, where this first step is
> > > > > > > > a simple search-and-replace; but I insist that you add a FIXME
> > > > > > > > or similar note where you have blindly used SEQ_CST, indicating
> > > > > > > > that the memory order needs to be reviewed and potentially
> > > > > > > > corrected.
> > > > > >
> > > > > > > i think the maintainers need to take some responsibility, if they
> > > > > > > see optimizations they missed when previously writing the code
> > > > > > > they need to follow up with a patch themselves. i can't do
> > > > > > > everything for them and marking things i'm not sure about will
> > > > > > > only lead to me having to churn patch series to remove the
> > > > > > > unwanted comments later.
> > > > >
> > > > > > The previous atomic functions didn't have the "memory order"
> > > > > > parameter, so the maintainers didn't have to think about it - and
> > > > > > thus they didn't miss any optimizations when accepting the code.
> > > > >
> > > > > > I also agree 100 % that it is not your responsibility to consider
> > > > > > or determine which memory order is appropriate!
> > > > >
> > > > > > But I think you should mark the locations where you are changing
> > > > > > from the old rte_atomic functions (where no memory order
> > > > > > optimization was available) to the new functions - to highlight
> > > > > > where the option of memory ordering has been introduced and
> > > > > > knowingly ignored (by you).
> > > > >
> > > >
> > > > first, i have to apologize i confused myself about which of the
> > > > many patch series i have up right now that you were commenting on.
> > >
> > > > No worries... you are rushing through quite an effort for this, so a
> > > > little confusion is perfectly understandable. Especially when I'm
> > > > replying to an ageing email. :-)
> > >
> > > >
> > > > let me ask for clarification in relation to this series.
> > > >
> > > > isn't that every single usage of the rte_atomic APIs?
> > >
> > > Probably, yes.
> > >
> > > > i mean are you
> > > > literally asking for the entire patch series to look like the
> > > > following patch snippet with the expectation that maintainers will
> > > > come along and clean up/review after this series is merged?
> > > >
> > > > > -rte_atomic_add32(&o, v);
> > > > > +//FIXME: opportunity for relaxing ordering constraint, please review
> > > > > +__atomic_fetch_add(&o, v, order);
> > >
> > > > Exactly. And something similar for the rte_atomicXX_t variables
> > > > changed to intXX_t, such as the packet counters.
> > >
> > > > Realistically, I don't expect the maintainers to clean them up
> > > > anytime soon. The purpose is to make the FIXMEs stick until someone
> > > > eventually cleans them up, so they are not forgotten as time passes.
> > > Cleaning up the rte_atomic APIs is a different effort. A lot of
> > > effort has already gone into this, and more is under way
> > > (rte_ring being a painful one).
> >
> > Instead of having FIXME, why not just send a separate patch with
> > SEQ_CST (still a search and replace)? We can leave the tougher ones
> > like rte_ring as they are being worked on.
> 
> The FIXME makes it possible in the future to differentiate between the
> instances that still need review and the instances that have been reviewed
> where SEQ_CST was the correct choice. (Similarly for the choice of type for
> variables previously rte_atomicNN_t.)
Apologies, I looked again at the subject of this patch; I had confused it
with other patches.

The changes Arm made from rte_atomic_xxx to __atomic_xxx were not direct
replacements. The algorithms were studied and relaxed where required, race
conditions were fixed, and performance was benchmarked. IMO, we need to go
through the same steps here.
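
As a rough illustration of the difference between a direct replacement and a
reviewed one, here is a minimal sketch for a plain statistics counter. The
names rx_pkts, count_rx_seq_cst(), count_rx_relaxed() and read_rx() are
hypothetical and are not taken from the series or from any driver; only the
GCC __atomic builtins are real. It assumes the counter publishes no other
data, which is the case where relaxed ordering is normally considered enough.

```c
#include <stdint.h>

/* Hypothetical packet counter, unsigned because it can never go negative. */
static uint64_t rx_pkts;

/* Direct replacement: keep the strongest ordering and mark it for review. */
static inline void
count_rx_seq_cst(uint64_t n)
{
	/* FIXME: memory order not reviewed; SEQ_CST kept from the conversion. */
	__atomic_fetch_add(&rx_pkts, n, __ATOMIC_SEQ_CST);
}

/*
 * Reviewed variant (assumption: the counter does not guard or publish any
 * other data, so only atomicity of the increment is needed).
 */
static inline void
count_rx_relaxed(uint64_t n)
{
	__atomic_fetch_add(&rx_pkts, n, __ATOMIC_RELAXED);
}

/* Readers of a pure statistic only need an untorn value. */
static inline uint64_t
read_rx(void)
{
	return __atomic_load_n(&rx_pkts, __ATOMIC_RELAXED);
}
```

Anything that synchronizes other memory (ring head/tail updates, flags that
publish data) would need its own analysis and benchmarking and cannot be
relaxed this mechanically.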

I looked at the series; we should just review the patches and make the
suggested changes. Are we constrained by any deadlines for this work?

I would suggest dropping 1/7. Arm is working on removing the non-C11
algorithm for rte_ring (not sure if we will be successful). I think it is
better to explore that approach than to make the changes in patch 1/7.

> 
> >
> > >
> > > >
> > > > > this would just be a mechanical addition to this series so i can
> > > > > certainly accommodate that, i thought something more complicated
> > > > > was being asked for. if this is all, then sure no problem.
> > >
> > > Great.
> > >
> > > >
> > > > > > > keep in mind i have to touch each of these again when
> > > > > > > converting to standard so that's a better time to review
> > > > > > > ~everything in more detail because when converting to standard
> > > > > > > that's when suddenly you get a bunch of code generation that is
> > > > > > > "fallback" to seq_cst that isn't happening now.
> > > > >
> > > > > > I think you should do it when replacing the rte_atomic functions
> > > > > > with the __atomic functions. It will make it easier to see where
> > > > > > the memory order was knowingly ignored, and should be reviewed for
> > > > > > optimization.
> > > > >
> > > > > >
> > > > > > > the series that converts to standard needs to be up for review
> > > > > > > as soon as possible to maximize available time for feedback
> > > > > > > before 23.11 so it would be better to get the simpler cut & paste
> > > > > > > normalizing the code out of the way to unblock that series
> > > > > > > submission.
> > > > > >
> > > > > > >
> > > > > > > > Also, in a couple of the drivers, you are using int64_t for
> > > > > > > > packet counters. These cannot be negative and should be
> > > > > > > > uint64_t. And AFAIK, such counters can use RELAXED memory order.
> > > > > >
> > > > > > > i know you don't mean to say i selected the types and rather
> > > > > > > that the types that were selected are not quite correct for their
> > > > > > > usage.
> > > > >
> > > > > > Yes; the previous types were also signed, and you didn't change
> > > > > > that.
> > > > >
> > > > > > > again, the review that actually adopts std atomics is a better
> > > > > > > place to make any potential type changes since we are
> > > > > > > "breaking" the API for 23.11 anyway. further, the std atomics
> > > > > > > series technically changes all the types so it's probably better
> > > > > > > to make one type change then rather than one now and one later.
> > > > > >
> > > > > > > i think it would be best to get these validated and merged
> > > > > > > asap so we can get to the std atomics review. when that series is
> > > > > > > up let's discuss further how i can mark areas of concern, with
> > > > > > > that series i expect there will have to be some changes in order
> > > > > > > to avoid minor regressions.
> > > > > >
> > > > > > thanks!
> > > > >
> > > > > > I thought it would be better to catch these details (i.e. memory
> > > > > > ordering and signedness) early on, but I now understand that you
> > > > > > planned to do it in a later step. So I'll let you proceed as you
> > > > > > have planned.
> > > > >
> > > > > Thanks for all your work on this, Tyler. It is much appreciated!
> > > >
> > > > > again, sorry for the confusion. the sooner i can get some of these
> > > > > merged the easier it will be for me to manage the final series. i
> > > > > hope david/thomas can merge the simple normalization patches as soon
> > > > > as the 23.03 cycle is complete.
> > >
> > > > Yes. An early merge would also provide more time for reviewing and
> > > > optimizing the memory order of the most important atomic operations.
> >
