On Wed, Aug 12, 2020 at 3:42 PM Tijl Coosemans <t...@freebsd.org> wrote:

> On Wed, 12 Aug 2020 09:44:25 +0400 Gleb Popov <arr...@freebsd.org> wrote:
> > On Wed, Aug 12, 2020 at 9:21 AM Gleb Popov <arr...@freebsd.org> wrote:
> >> Indeed, this looks like the culprit! When compiling with the first command
> >> line (the long one) I get the following warnings:
> >>
> >>
> >> /wrkdirs/usr/ports/lang/ghc/work/ghc-8.10.1/libraries/ghc-prim/cbits/atomic.c:369:10:
> >> warning: misaligned atomic operation may incur significant performance
> >> penalty [-Watomic-alignment]
> >>   return __atomic_load_n((StgWord64 *) x, __ATOMIC_SEQ_CST);
> >>          ^
> >>
> >> /wrkdirs/usr/ports/lang/ghc/work/ghc-8.10.1/libraries/ghc-prim/cbits/atomic.c:417:3:
> >> warning: misaligned atomic operation may incur significant performance
> >> penalty [-Watomic-alignment]
> >>   __atomic_store_n((StgWord64 *) x, (StgWord64) val, __ATOMIC_SEQ_CST);
> >>   ^
> >> 2 warnings generated.
> >>
> >> I guess this basically means "I'm emitting a call there". So, what's the
> >> correct fix in this case?
> >
> > I just noticed that Clang emits these warnings (and the call instruction)
> > only for functions handling StgWord64 type. For the same code with
> > StgWord32, like
> >
> > StgWord
> > hs_atomicread32(StgWord x)
> > {
> > #if HAVE_C11_ATOMICS
> >   return __atomic_load_n((StgWord32 *) x, __ATOMIC_SEQ_CST);
> > #else
> >   return __sync_add_and_fetch((StgWord32 *) x, 0);
> > #endif
> > }
> >
> > neither the warning nor the call is emitted.
> >
> > How does clang infer alignment in these cases? What's so special about
> > StgWord64?
>
> StgWord64 is uint64_t which is unsigned long long which is 4 byte
> aligned on i386.  Clang wants 8 byte alignment to use the fildll
> instruction.
>
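
A minimal sketch for checking the alignment claim above on a given target. The struct and function names are made up for illustration. Built for i386 (e.g. with -m32) it should report 4 for both the alignment and the offset if the explanation above holds; on amd64 it reports 8.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical struct: a 32-bit field followed by a 64-bit one.
 * The offset of `value` shows the alignment the ABI gives uint64_t. */
struct pair { uint32_t tag; uint64_t value; };

/* Returns nonzero if p is 8-byte aligned. */
static int is_aligned8(const void *p)
{
    return ((uintptr_t)p % 8) == 0;
}

/* Print what the compiler assumes; the numbers depend on the target ABI. */
static void report_alignment(void)
{
    struct pair s = { 0, 0 };

    printf("_Alignof(uint64_t)       = %zu\n", (size_t)_Alignof(uint64_t));
    printf("offsetof(pair, value)    = %zu\n", offsetof(struct pair, value));
    printf("&s.value 8-byte aligned? = %d\n", is_aligned8(&s.value));
}
```

Calling report_alignment() from main() is enough to see the numbers for the current target.
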
> You could change the definition of the StgWord64 type to look like:
>
> typedef uint64_t StgWord64 __attribute__((aligned(8)));
>
> But this only works if all calls to hs_atomicread64 pass a StgWord64
> as argument and not some other 64 bit value.
>
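
A sketch of how the aligned typedef suggested above would be used, under the stated caveat. The names aligned_u64, load64_aligned and counter are hypothetical stand-ins, not the GHC definitions.

```c
#include <stdint.h>

/* Hypothetical name; the suggestion above applies the same attribute
 * to the StgWord64 typedef itself. */
typedef uint64_t aligned_u64 __attribute__((aligned(8)));

/* The pointee type now carries the 8-byte alignment, so the compiler
 * may assume it when expanding the atomic load. */
static uint64_t load64_aligned(aligned_u64 *p)
{
    return __atomic_load_n(p, __ATOMIC_SEQ_CST);
}

/* Storage declared with the typedef is 8-byte aligned by construction. */
static aligned_u64 counter = 42;

static uint64_t read_counter(void)
{
    return load64_aligned(&counter);
}

/* Caveat from the text: casting a pointer to a plain uint64_t (which may
 * be only 4-byte aligned on i386) to aligned_u64 * would break the
 * promise made by the typedef. */
```
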
>
> Another solution I already mentioned in a previous message: replace
> HAVE_C11_ATOMICS with 0 in hs_atomicread64 so it uses
> __sync_add_and_fetch instead of __atomic_load_n.  That uses the
> cmpxchg8b instruction which doesn't care about alignment.  It's much
> slower but I guess 64 bit atomic loads are rare enough that this
> doesn't matter much.
>

Yep, your suggested workaround worked, many thanks.
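
For the archives, a minimal sketch of the idea behind that workaround. This is not the actual GHC patch, and load64_sync is a made-up name.

```c
#include <stdint.h>

/* Atomic 64-bit read via a read-modify-write: adding 0 leaves the value
 * unchanged and returns it. Because this is a read-modify-write, the
 * memory must be writable, unlike with a plain atomic load. */
static uint64_t load64_sync(uint64_t *p)
{
    return __sync_add_and_fetch(p, 0);
}
```
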

Still, I'm curious: where can I get an implementation of __atomic_load_n on
i386 if I don't want to pull in gcc?
_______________________________________________
freebsd-toolchain@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain
To unsubscribe, send any mail to "freebsd-toolchain-unsubscr...@freebsd.org"
