[Bug target/93177] PPC: Missing many useful platform intrinsics

segher at gcc dot gnu.org Thu, 23 Jan 2020 06:06:35 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93177


--- Comment #10 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to Matt Emmerton from comment #9)
> > > __sync()
> > > __isync()
> > > __lwsync()
> > 
> > The sync intrinsics need to be tied to some other code.  A volatile asm with
> > a "memory" clobber is not good enough, in many cases.
> 
> We use these in our internal mutex and atomic implementations, and the
> resulting sequences are carefully scrutinized.

You have to check it after *every build* then, in general :-/
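
One way to read "tied to some other code" is to attach the ordering to the
memory access itself and let the compiler pick the barrier. The following is
only a sketch under that assumption: it uses the existing GCC `__atomic`
builtins rather than any intrinsic requested in this bug, and the `mailbox`
type and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical message slot; the names are illustrative only. */
struct mailbox {
    uint32_t payload;
    uint32_t ready;
};

/* Writer: the release ordering belongs to the store of `ready`, so the
   compiler may not move the payload store after it. */
static inline void mailbox_publish(struct mailbox *m, uint32_t value)
{
    m->payload = value;
    __atomic_store_n(&m->ready, 1, __ATOMIC_RELEASE);
}

/* Reader: the acquire ordering belongs to the load of `ready`, so the
   payload load cannot be hoisted above it. Returns 1 if a value was read. */
static inline int mailbox_try_read(struct mailbox *m, uint32_t *out)
{
    if (!__atomic_load_n(&m->ready, __ATOMIC_ACQUIRE))
        return 0;
    *out = m->payload;
    return 1;
}
```

Because the ordering is a property of the access, there is no free-standing
barrier statement whose placement has to be re-checked after each build.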

> > > __lwarx()
> > > __ldarx()
> > > __stwcx()
> > > __stdcx()
> > 
> > The compiler can always insert memory accesses in between those two, if you
> > have them as separate intrinsics (and it will: simple stack accesses for
> > temporaries are already enough).  If those accesses hit the same reservation
> > granule that the larx/stcx. pair uses, you lose.
> > 
> > You need to write the whole sequence in one piece of assembler code.
> 
> I would argue that the compiler should be smart enough to realize that these
> are part of a decomposed atomic operation, and avoid arbitrary instruction
> injection.

But this is impossible; it is contrary to all optimisation goals we have.  Yes,
it could perhaps work with -O0.
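
To illustrate "the whole sequence in one piece of assembler code", here is a
minimal sketch of an atomic add written as a single asm statement, so nothing
can be scheduled between the lwarx and the stwcx. The helper name
`add_fetch_u32` is hypothetical, it provides no memory ordering (relaxed), and
the non-PowerPC branch exists only so the example builds elsewhere.

```c
#include <stdint.h>

/* Hypothetical helper, not a GCC intrinsic: atomically add `delta` to `*p`
   and return the new value. No ordering guarantees (relaxed). */
static inline uint32_t add_fetch_u32(uint32_t *p, uint32_t delta)
{
#if defined(__powerpc__) || defined(__powerpc64__)
    uint32_t tmp;
    __asm__ volatile(
        "1: lwarx  %0,0,%2\n\t"   /* load word and take the reservation */
        "   add    %0,%0,%3\n\t"  /* compute the new value */
        "   stwcx. %0,0,%2\n\t"   /* store if reservation held; sets CR0.EQ */
        "   bne-   1b"            /* reservation lost: retry */
        : "=&r"(tmp), "+m"(*p)
        : "r"(p), "r"(delta)
        : "cr0");
    return tmp;
#else
    /* Portable fallback using the existing GCC builtin. */
    return __atomic_add_fetch(p, delta, __ATOMIC_RELAXED);
#endif
}
```

Inside the asm the retry branch tests CR0 itself, so no mfocrf is needed; the
compiler only sees the statement as a unit and cannot place stack accesses in
the middle of it.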

> > > __protected_stream_set()
> > > __protected_stream_count()
> > > __protected_stream_count_depth() // currently not implemented in gcc
> > > __protected_stream_go()
> > 
> > Those are pretty specific to CBE I think?
> 
> No.  They are implemented on POWER5 and above (ISA 2.02), and are useful in
> managing cache prefetch behaviour.

Open a separate feature request for these then, please.

> > > The implementation of stwcx() and stdcx() needs revision on PPC.
> > > As I understand it, there is no need for the mfocrf instruction or the
> > > mask-and-shift on the result.
> > 
> > How else would you output the CR0.EQ bit?
> 
> There is no need to copy CR0 to a GPR - branch instructions such as BNE can
> operate on CR0 directly.

You cannot write anything that maps to a CR field directly.
