Sorry to say, but I see the ARM disclaimer at the bottom even though I
requested the disclaimer not to be added to this specific email (there
is a trick for that). I need to figure out what went wrong.

-- Ola

On 4 April 2017 at 22:38, Ola Liljedahl <ola.liljed...@arm.com> wrote:
>
>
>
>
> On 04/04/2017, 22:14, "Dmitry Eremin-Solenikov"
> <dmitry.ereminsoleni...@linaro.org> wrote:
>
>>On 04.04.2017 21:48, Brian Brooks wrote:
>>> Signed-off-by: Ola Liljedahl <ola.liljed...@arm.com>
>>> Reviewed-by: Brian Brooks <brian.bro...@arm.com>
>>> Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
>>> ---
>>>  platform/linux-generic/include/odp_llsc.h | 332 ++++++++++++++++++++++++++++++
>>>  1 file changed, 332 insertions(+)
>>>  create mode 100644 platform/linux-generic/include/odp_llsc.h
>>>
>>> diff --git a/platform/linux-generic/include/odp_llsc.h b/platform/linux-generic/include/odp_llsc.h
>>> new file mode 100644
>>> index 00000000..ea60c54b
>>> --- /dev/null
>>> +++ b/platform/linux-generic/include/odp_llsc.h
>>> @@ -0,0 +1,332 @@
>>> +/* Copyright (c) 2017, ARM Limited
>>> + * All rights reserved.
>>> + *
>>> + * SPDX-License-Identifier:BSD-3-Clause
>>> + */
>>> +
>>> +#ifndef ODP_LLSC_H_
>>> +#define ODP_LLSC_H_
>>> +
>>> +#include <odp_config_internal.h>
>>> +
>>> +/******************************************************************************
>>> + * LL/SC primitives
>>> + *****************************************************************************/
>>> +
>>> +#if defined __ARM_ARCH
>>> +#if __ARM_ARCH == 7 || (__ARM_ARCH == 8 && __ARM_64BIT_STATE == 0)
>>> +static inline void dmb(void)
>>> +{
>>> +        __asm__ volatile("dmb" : : : "memory");
>>> +}
>>> +
>>> +static inline uint32_t ll8(uint8_t *var, int mm)
>>> +{
>>> +        uint8_t old;
>>> +
>>> +        __asm__ volatile("ldrexb %0, [%1]"
>>> +                         : "=&r" (old)
>>> +                         : "r" (var)
>>> +                         : );
>>> +        /* Barrier after an acquiring load */
>>> +        if (mm == __ATOMIC_ACQUIRE)
>>> +                dmb();
>>> +        return old;
>>> +}
>>
>>
>>Hmm, I remember Ola's story about ipfrag and stdatomic not providing
>>enough support for 128-bit atomics. But why do you need to define
>>load/store for 8- and 32-bit variables? Why can not you use stdatomic
>>interface here?
> The usage here is actually not to perform atomic updates.
>
> Load-exclusive is used in ARMv8 to arm the local "monitor" (which is used
> to check for atomicity in load-exclusive/store-exclusive sections).
> When the monitor is "lost" (there is a better formal word for that but I
> don't remember it now) because some other CPU obtained exclusive ownership
> of the cache line in order to write to it, an event is generated. This is
> the event that wakes up the CPU when it is sleeping in WFE (wait for
> event).
>
> This is all explained in the ARMv8 documentation. In ARMv7a, you had to
> perform DSB+SEV in order to signal/wake up waiting CPUs; this is much
> slower (you can still do it on ARMv8 CPUs).
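>
> To make it concrete, here is a minimal sketch of how these primitives are
> meant to be combined into a wait-for-change loop (hypothetical helper;
> ll32() is defined elsewhere in this patch):
>
> static inline uint32_t wait_until_changed(uint32_t *loc, uint32_t old)
> {
>         uint32_t val;
>
>         sevl(); /* prime the event register so the first WFE falls through */
>         while (wfe() && (val = ll32(loc, __ATOMIC_RELAXED)) == old)
>                 ; /* ll32() arms the monitor; a remote store generates an event */
>         return val;
> }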
>
>>
>>Not to mention that the ll/ll8/etc macro names are not _that_ easy to
>>understand without any additional comments. Please expand the names.
>>
>>> +#endif
>>> +
>>> +#if __ARM_ARCH == 8 && __ARM_64BIT_STATE == 1
>>
>>#elif here please.
> Brian, this one is for you! :-)
>
>>
>>> +static inline void sevl(void)
>>> +{
>>> +#if defined __ARM_ARCH
>>> +        __asm__ volatile("sevl" : : : );
>>> +#endif
>>> +}
>>> +
>>> +static inline void sev(void)
>>> +{
>>> +#if defined __ARM_ARCH
>>> +        __asm__ volatile("sev" : : : "memory");
>>> +#endif
>>> +}
>>> +
>>> +static inline int wfe(void)
>>> +{
>>> +#if defined __ARM_ARCH
>>> +        __asm__ volatile("wfe" : : : "memory");
>>> +#endif
>>> +        return 1;
>>> +}
>>
>>Ugh. And what about other architectures?
> I don't know if any other architectures support something like the ARM
> event mechanism. I have never seen anything like it in e.g. MIPS or POWER.
>
> We could remove the inline functions and have the WFE() macro translate
> directly to the inline asm statement. Better?
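>
> Something like this sketch (untested; a GCC statement expression keeps
> WFE() evaluating to 1 so the waiting loops stay unchanged):
>
> #if defined __ARM_ARCH
> #define WFE() ({ __asm__ volatile("wfe" : : : "memory"); 1; })
> #else
> #define WFE() 1
> #endif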
>
>>
>>> +
>>> +#ifdef ODP_CONFIG_DMBSTR
>>> +
>>> +#if defined __ARM_ARCH && __ARM_ARCH == 8
>>> +/* Only ARMv8 supports DMB ISHLD */
>>> +/* A load-only barrier is much cheaper than a full barrier */
>>> +#define _odp_release_barrier(ro) \
>>> +do { \
>>> +        if (ro) \
>>> +                __asm__ volatile("dmb ishld" ::: "memory"); \
>>> +        else \
>>> +                __asm__ volatile("dmb ish" ::: "memory"); \
>>> +} while (0)
>>> +#else
>>> +#define _odp_release_barrier(ro) \
>>> +        __atomic_thread_fence(__ATOMIC_RELEASE)
>>> +#endif
>>> +
>>> +#define atomic_store_release(loc, val, ro) \
>>> +do { \
>>> +        _odp_release_barrier(ro); \
>>> +        __atomic_store_n(loc, val, __ATOMIC_RELAXED); \
>>> +} while (0)
>>> +
>>> +#else
>>> +
>>> +#define atomic_store_release(loc, val, ro) \
>>> +        __atomic_store_n(loc, val, __ATOMIC_RELEASE)
>>> +
>>> +#endif
>>> +
>>> +#ifdef ODP_CONFIG_USE_WFE
>>
>>Lack of documentation on those abbreviations is a very, very nice thing.
> Which abbreviations are we talking about here? The different build
> configurations (e.g. ODP_CONFIG_USE_WFE) are described in
> odp_config_internal.h. Do you think there should be some documentation
> here as well?
>
>>
>>> +#define SEVL() sevl()
>>> +#define WFE() wfe()
>>> +#define SEV() do { __asm__ volatile("dsb ish" ::: "memory"); sev(); } while (0)
>>
>>Soooo, if you need dsb here, why isn't it a part of sev()?
> We actually only need SEV (and DSB if there were any stores that need to
> be visible, which is the normal case) on ARMv7a. I assume that we need to
> ensure ARMv7a support at some point (although 32-bit support doesn't
> necessarily mean ARMv7a; ARMv8 AArch32 code can still use ARMv8
> mechanisms). This macro could probably be written in a better (or at least
> different) way, e.g. combining DSB and SEV in one inline asm statement.
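>
> A sketch of that combined form (untested):
>
> #define SEV() __asm__ volatile("dsb ish; sev" : : : "memory")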
>
> So I am a little bit torn here. Remove the ARMv7a stuff until it is really
> needed, or keep it (but possibly change it when it is actually used)?
> Suggestions? (Now I start to feel that dead-ish code should be removed.)
>
>>
>>> +#if defined __ARM_ARCH && __ARM_ARCH == 8 && __ARM_64BIT_STATE == 1
>>> +#define LDXR128(addr, mo) lld((addr), (mo))
>>> +#endif
>>
>>And who will provide LDXR128 in the other case?
> There are no other cases AFAIK.
> Only ARMv8 supports both the event mechanism and 128-bit words.
> On ARMv7a, a double-word (e.g. two pointers) will only be 64 bits. If we
> need to do wait-for-event on e.g. a double-word location, ll64() will
> suffice.
>
> In practice, it might not matter which size (e.g. ll8..ll128) is used for
> the load-exclusive that arms the local monitor, because the mechanism is
> typically implemented using the cache coherency protocol. Doing a
> load-exclusive on any word in the cache line would probably work, but that
> feels misleading.
>
> On non-ARM architectures, the macros in the #else branch below will be
> used and we will spin wait the generic way (relaxed loads and using
> whatever NOP/pause instruction odp_cpu_pause() translates to).
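>
> I.e. the same waiting loop compiles to either variant, roughly like this
> (sketch; loc and old are placeholders):
>
> SEVL();
> while (WFE() && LDXR32(loc, __ATOMIC_RELAXED) == old)
>         DOZE(); /* (void)0 in the WFE case, odp_cpu_pause() otherwise */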
>
>>
>>> +#define LDXR64(addr, mo) ll64((addr), (mo))
>>> +#define LDXR32(addr, mo) ll32((addr), (mo))
>>> +#define LDXR8(addr, mo) ll8((addr), (mo))
>>> +/* When using WFE do not stall the pipeline using other means */
>>> +#define DOZE() (void)0
>>> +#else
>>> +#define SEVL() (void)0
>>> +#define WFE() 1
>>> +#define SEV() (void)0
>>> +#define LDXR128(addr, mo) __atomic_load_n((addr), (mo))
>>> +#define LDXR64(addr, mo) __atomic_load_n((addr), (mo))
>>> +#define LDXR32(addr, mo) __atomic_load_n((addr), (mo))
>>> +#define LDXR8(addr, mo) __atomic_load_n((addr), (mo))
>>> +#define DOZE() odp_cpu_pause()
>>> +#endif
>>> +
>>> +#endif
>>
>>
>>--
>>With best wishes
>>Dmitry
>
> IMPORTANT NOTICE: The contents of this email and any attachments are 
> confidential and may also be privileged. If you are not the intended 
> recipient, please notify the sender immediately and do not disclose the 
> contents to any other person, use it for any purpose, or store or copy the 
> information in any medium. Thank you.
