On 2024-04-26 14:00, Morten Brørup wrote:
From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
Sent: Friday, 26 April 2024 11.39

On 2024-04-25 18:18, Morten Brørup wrote:
From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
Sent: Thursday, 25 April 2024 16.36

On 2024-04-25 12:25, Morten Brørup wrote:
+#define rte_bit_atomic_test(addr, nr, memory_order)
        \
+       _Generic((addr),                                                \
+                uint32_t *: __rte_bit_atomic_test32,                   \
+                uint64_t *: __rte_bit_atomic_test64)(addr, nr,
memory_order)

I wonder if these should have RTE_ATOMIC qualifier:

+                RTE_ATOMIC(uint32_t) *: __rte_bit_atomic_test32,
                \
+                RTE_ATOMIC(uint64_t) *: __rte_bit_atomic_test64)(addr, nr,
memory_order)


+#define __RTE_GEN_BIT_ATOMIC_TEST(size)
        \
+       static inline bool                                              \
+       __rte_bit_atomic_test ## size(const uint ## size ## _t *addr,
        \

I wonder if the "addr" parameter should have RTE_ATOMIC qualifier:

+       __rte_bit_atomic_test ## size(const RTE_ATOMIC(uint ## size ## _t)
*addr,  \

instead of casting into a_addr.


Check the cover letter for the rationale for the cast.

Thanks, that clarifies it. Then...
For the series:
Acked-by: Morten Brørup <m...@smartsharesystems.com>


Where I'm at now is that I think C11 _Atomic is rather poor design. The
assumption that an object which allows for atomic access always should
require all operations upon it to be atomic, regardless of where it is
in its lifetime, and which thread is accessing it, does not hold, in the
general case.

It might be slow, but I suppose the C11 standard prioritizes correctness
over performance.


That's a false dichotomy, in this case. You can have both.

In theory you shouldn't need non-atomic access to atomic variables.
In reality, we want it anyway, because real CPUs are faster at non-atomic 
operations.


It seems locks are automatically added to _Atomic types larger than what is
natively supported by the architecture.
E.g. MSVC adds locks to _Atomic types larger than 8 byte. [1]

[1]: https://devblogs.microsoft.com/cppblog/c11-atomics-in-visual-studio-
2022-version-17-5-preview-2/


The only reason for _Atomic being as it is, as far as I can see, is to
accommodate for ISAs which does not have the appropriate atomic machine
instructions, and thus require a lock or some other data associated with
the actual user-data-carrying bits. Neither GCC nor DPDK supports any
such ISAs, to my knowledge. I suspect neither never will. So the cast
will continue to work.

I tend to agree with you on this.

We should officially decide that DPDK treats RTE_ATOMIC types as a union of
_Atomic and non-atomic, i.e. operations on RTE_ATOMIC types can be both atomic
and non-atomic.


I think this is a subject which needs to be further explored.

Yes. It's easier exploring and deciding now, when our options are open, than 
after we have locked down the affected APIs.


Objects that can be accessed both atomically and non-atomically should
be without _Atomic. With my current understanding of this issue, that
seems like the best option.

Agree.

The alterative described below is certainly no good!

It would be nice if they were marked as sometimes-atomic by a qualifier or 
special type, like rte_be32_t marks the network byte order variant of an 
uint32_t.

Furthermore, large atomic objects need the _Atomic qualifier for the compiler 
to add (and use) the associated lock.

If you have larger objects than the ISA can handle, you wouldn't want to leave the choice of the synchronization primitive to use to the compiler. I don't see how it could possibly know, which one is the most appropriate, especially in a DPDK context. It would for example need to know if the contending threads are non-preemptable or not.

In some situations a sequence lock may well be your best option. Will the compiler generate one for you?

If "lock" means std::mutex, it would be a disaster, performance-wise.

Alternatively, we could specify that sometimes-atomic objects cannot be larger 
than 8 byte, which is what MSVC can handle without locking.


You could turn it around as well, and have such marked _Atomic and have
explicit casts to their non-_Atomic cousins when operated upon by
non-atomic functions. Not sure how realistic that is, since
non-atomicity is the norm. All generic selection-based "functions" must
take this into account.


+                                     unsigned int nr, int memory_order) \
+       {                                                               \
+               RTE_ASSERT(nr < size);                                       \
+                                                                       \
+               const RTE_ATOMIC(uint ## size ## _t) *a_addr =          \
+                       (const RTE_ATOMIC(uint ## size ## _t) *)addr;   \
+               uint ## size ## _t mask = (uint ## size ## _t)1 << nr;    \
+               return rte_atomic_load_explicit(a_addr, memory_order) &
mask; \
+       }


Similar considerations regarding volatile qualifier for the "once"
operations.

Reply via email to