> -----Original Message-----
> From: Morten Brørup <m...@smartsharesystems.com>
> Sent: Wednesday, March 22, 2023 11:14 AM
> To: Tyler Retzlaff <roret...@linux.microsoft.com>
> Cc: Stephen Hemminger <step...@networkplumber.org>; dev@dpdk.org;
> Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>; Ruifeng Wang
> <ruifeng.w...@arm.com>; tho...@monjalon.net
> Subject: RE: [PATCH 0/7] replace rte atomics with GCC builtin atomics
> 
> > From: Tyler Retzlaff [mailto:roret...@linux.microsoft.com]
> > Sent: Wednesday, 22 March 2023 16.30
> >
> > On Wed, Mar 22, 2023 at 03:58:07PM +0100, Morten Brørup wrote:
> > > > From: Tyler Retzlaff [mailto:roret...@linux.microsoft.com]
> > > > Sent: Wednesday, 22 March 2023 15.22
> > > >
> > > > On Wed, Mar 22, 2023 at 12:28:44PM +0100, Morten Brørup wrote:
> > > > > > From: Tyler Retzlaff [mailto:roret...@linux.microsoft.com]
> > > > > > Sent: Friday, 17 March 2023 22.49
> > > > > >
> > > > > > On Fri, Mar 17, 2023 at 02:42:26PM -0700, Stephen Hemminger
> wrote:
> > > > > > > On Fri, 17 Mar 2023 13:19:41 -0700 Tyler Retzlaff
> > > > > > > <roret...@linux.microsoft.com> wrote:
> > > > > > >
> > > > > > > > Replace the use of rte_atomic.h types and functions,
> > > > > > > > instead use
> > GCC
> > > > > > > > supplied C++11 memory model builtins.
> > > > > > > >
> > > > > > > > This series covers the libraries and drivers that are
> > > > > > > > built on
> > > > Windows.
> > > > > > > >
> > > > > > > > The code has be converted to use the __atomic builtins but
> > > > > > > > there
> > are
> > > > > > > > additional during conversion i notice that there may be
> > > > > > > > some
> > issues
> > > > > > > > that need to be addressed.
> > > > > > >
> > > > > > > I don't think all these cmpset need to use SEQ_CST.
> > > > > > > Especially for the places where it is used a loop, might be
> > > > > > > more efficient with some of the other memory models.
> > > > > >
> > > > > > i agree.
> > > > > >
> > > > > > however, i'm not trying to improve the code with this change,
> > > > > > just decouple it from rte_atomics.h so trying my best to avoid
> > > > > > any unnecessary semantic change.
> > > > > >
> > > > > > certainly if the maintainers of this code wish to weaken the
> > > > > > ordering where appropriate after the change is merged they can
> > > > > > do so and
> > handily
> > > > > > this change has enabled them to do so easily allowing them to
> > > > > > test
> > just
> > > > > > their change in isolation.
> > > > >
> > > > > I agree with the two-step approach, where this first step is a
> > > > > simple
> > > > search-and-replacement; but I insist that you add a FIXME or
> > > > similar note where you have blindly used SEQ_CST, indicating that
> > > > the memory order
> > needs to
> > > > be reviewed and potentially corrected.
> > > >
> > > > i think the maintainers need to take some responsibility, if they
> > > > see optimizations they missed when previously writing the code
> > > > they need to follow up with a patch themselves. i can't do
> > > > everything for them and marking things i'm not sure about will
> > > > only lead to me having to churn patch series to remove the unwanted
> comments later.
> > >
> > > The previous atomic functions didn't have the "memory order"
> > > parameter, so
> > the maintainers didn't have to think about it - and thus they didn't
> > miss any optimizations when accepting the code.
> > >
> > > I also agree 100 % that it is not your responsibility to consider or
> > determine which memory order is appropriate!
> > >
> > > But I think you should mark the locations where you are changing
> > > from the
> > old rte_atomic functions (where no memory order optimization was
> > available) to the new functions - to highlight where the option of
> > memory ordering has been introduced and knowingly ignored (by you).
> > >
> >
> > first, i have to apologize i confused myself about which of the many
> > patch series i have up right now that you were commenting on.
> 
> No worries... you are rushing through quite an effort for this, so a little
> confusion is perfectly understandable. Especially when I'm replying to an 
> ageing
> email. :-)
> 
> >
> > let me ask for clarification in relation to this series.
> >
> > isn't that every single usage of the rte_atomic APIs?
> 
> Probably, yes.
> 
> > i mean are you
> > literally asking for the entire patch series to look like the
> > following patch snippet with the expectation that maintainers will
> > come along and clean up/review after this series is merged?
> >
> > -rte_atomic_add32(&o, v);
> > +//FIXME: opportunity for relaxing ordering constraint, please review
> > +__atomic_fetch_add(&o, v, order);
> 
> Exactly. And something similar for the rte_atomicXX_t variables changed to
> intXX_t, such as the packet counters.
> 
> Realistically, I don't expect the maintainers to clean them up anytime soon. 
> The
> purpose is to make the FIXMEs stick until someone eventually cleans them up, 
> so
> they are not forgotten as time passes.
Cleaning up the rte_atomic APIs is a different effort. There is already lot of 
effort that has gone into this and there is more effort happening (rte_ring 
being a painful one)

Instead of having FIXME, why not just send a separate patch with SEQ_CST (still 
a search and replace)? We can leave the tougher ones like rte_ring as they are 
being worked on.

> 
> >
> > this would just be a mechanical addition to this series so i can
> > certainly accomodate that, i thought something more complicated was
> > being asked for. if this is all, then sure no problem.
> 
> Great.
> 
> >
> > > > keep in mind i have to touch each of these again when converting
> > > > to standard so that's a better time to review ~everything in more
> > > > detail because when converting to standard that's when suddenly
> > > > you get a bunch of code generation that is "fallback" to seq_cst that 
> > > > isn't
> happening now.
> > >
> > > I think you should to do it when replacing the rte_atomic functions
> > > with the
> > __atomic functions. It will make it easier to see where the memory
> > order was knowingly ignored, and should be reviewed for optimization.
> > >
> > > >
> > > > the series that converts to standard needs to be up for review as
> > > > soon as possible to maximize available time for feedback before
> > > > 23.11 so it would be better to get the simpler cut & paste
> > > > normalizing the code out of the way to unblock that series submission.
> > > >
> > > > >
> > > > > Also, in a couple of the drivers, you are using int64_t for
> > > > > packet
> > counters.
> > > > These cannot be negative and should be uint64_t. And AFAIK, such
> > > > counters
> > can
> > > > use RELAXED memory order.
> > > >
> > > > i know you don't mean to say i selected the types and rather that
> > > > the types that were selected are not quite correct for their usage.
> > >
> > > Yes; the previous types were also signed, and you didn't change that.
> > >
> > > > again
> > > > on the review that actually adopts std atomics is a better place
> > > > to make any potential type changes since we are "breaking" the API
> > > > for 23.11 anyway. further, the std atomics series technically
> > > > changes all the types so it's probably better to make one type
> > > > change then rather than one now and one later.
> > > >
> > > > i think it would be best to get these validated and merged asap so
> > > > we can get to the std atomics review. when that series is up let's
> > > > discuss further how i can mark areas of concern, with that series
> > > > i expect there will have to be some changes in order to avoid minor
> regressions.
> > > >
> > > > thanks!
> > >
> > > I thought it would be better to catch these details (i.e. memory
> > > ordering
> > and signedness) early on, but I now understand that you planned to do
> > it in a later step. So I'll let you proceed as you have planned.
> > >
> > > Thanks for all your work on this, Tyler. It is much appreciated!
> >
> > again, sorry for the confusion the sooner i can get some of these
> > merged the easier it will be for me to manage the final series. i hope
> > david/thomas can merge the simple normalization patches as soon as
> > 23.03 cycle is complete.
> 
> Yes. An early merge would also provide more time for reviewing and optimizing
> the memory order of the most important atomic operations.

Reply via email to