Inline assembly has been relatively useless in GCC for years. Inline asm
interferes with the optimiser's ability to do a good job, which basically
makes use of inline assembly self-defeating.
The only time I ever need to use inline asm is to interface with an arch
feature that has no API. As long as there are intrinsics for all the opcodes
one might want, it's better to use them.
That said, as stated above, if use of this stuff is for performance, then
using an inline-asm block will ruin the surrounding code anyway.
Could someone explain to me why inline asm screws up the optimizer? My naive
view on the matter is that the optimizer has full knowledge of what is going
on regardless of whether intrinsics or asm are used. I could also imagine an
optimizer that optimizes inline asm too, for example by reassigning registers.
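
To make the question concrete, here is a rough sketch of the two cases. It assumes GCC (or Clang) targeting x86 with the default AT&T asm syntax; the names add_plain and add_asm are made up for illustration and come from nowhere in particular. As far as I understand it, the operand constraints let the compiler choose the registers for the asm block, but the instruction text itself is pasted through without being interpreted, so the compiler only knows what goes in and what comes out, not what the block computes.

```c
/* Plain C: the optimiser sees the addition itself, so it is free to
   constant-fold it, vectorise it, or merge it with surrounding code. */
static inline int add_plain(int a, int b)
{
    return a + b;
}

/* GCC extended asm: the constraints describe which operands are read and
   written, and the compiler allocates the registers. The template string
   is emitted as-is; the compiler does not analyse the instruction. */
static inline int add_asm(int a, int b)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__("addl %1, %0"
            : "+r"(a)   /* read-write operand, any general register */
            : "r"(b)    /* read-only operand, any general register */
            : "cc");    /* the flags register is clobbered */
    return a;
#else
    return a + b;       /* fallback so the sketch still builds elsewhere */
#endif
}
```

With optimisation enabled, I would expect a call such as add_plain(2, 3) to be folded to a constant, while add_asm(2, 3) keeps its add instruction, because the compiler has no model of what "addl" does. Comparing the output of gcc -O2 -S for the two should show the difference.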