Hello all,
I am an active AltiVec PPC assembly programmer, but until recently have
not been using gcc's AltiVec extensions.
However, lately, with a project I am contributing to, called "macstl"
(www.pixelglow.com/macstl/), I've become involved in using this stuff.
So here is my question: why would one define the vec_XXX routines in
terms of __builtin_altivec_XXX compiler primitives as opposed to using
inline asm statements? Is there any good reason for doing that with the
already existing good support for inline assembly?
Now, it is important to note: I am *not* using the builtins directly; I
am invoking the vec_XXX functions that are defined in altivec.h and
specified in the AltiVec PIM.
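To make the question concrete, here is a minimal sketch of the two alternatives for a single float add. It is an illustration only, not how altivec.h is actually written. The names add_pim, add_asm and add4 are hypothetical, and it assumes gcc's "v" constraint for AltiVec vector registers and the vaddfp instruction. A scalar fallback is included so that the file also compiles where AltiVec is not available.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef __ALTIVEC__
#include <altivec.h>

/* Alternative 1: the PIM interface from altivec.h (builtin underneath). */
static vector float add_pim(vector float a, vector float b)
{
    return vec_add(a, b);
}

/* Alternative 2: hand-written inline asm. The "v" constraint asks gcc
   to place each operand in an AltiVec vector register. */
static vector float add_asm(vector float a, vector float b)
{
    vector float r;
    __asm__("vaddfp %0,%1,%2" : "=v"(r) : "v"(a), "v"(b));
    return r;
}
#endif

/* Hypothetical helper: out[i] = a[i] + b[i] for four floats.
   use_asm selects the inline asm variant when AltiVec is available. */
void add4(const float a[4], const float b[4], float out[4], int use_asm)
{
#ifdef __ALTIVEC__
    /* The union gives the float arrays the alignment of a vector. */
    union { vector float v; float f[4]; } ua, ub, ur;

    assert(a != NULL && b != NULL && out != NULL);
    memcpy(ua.f, a, sizeof ua.f);
    memcpy(ub.f, b, sizeof ub.f);
    ur.v = use_asm ? add_asm(ua.v, ub.v) : add_pim(ua.v, ub.v);
    memcpy(out, ur.f, sizeof ur.f);
#else
    /* Scalar fallback for targets without AltiVec. */
    int i;

    assert(a != NULL && b != NULL && out != NULL);
    (void)use_asm;
    for (i = 0; i < 4; i++)
        out[i] = a[i] + b[i];
#endif
}
```

Both variants are expected to yield the same result; the question is only about which form the header should use.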
I am asking this because we're having some problems with those builtins
not being inlined as instructions properly when a certain level of logic complexity
(in loops) arises. Even worse, gcc 4.0 (both 4.0.0 and 4.0.1) generates
bad code (whereas gcc 3.4 is OK). gcc 3.4 simply resorts to a series of
calls to compiler-generated routines (with mangled names such as
"_Z7vec_addU8_vectorfs", probably corresponding to vec_add's builtin)
instead of inlining actual instructions. Again, that happens when a
certain level of code mass is reached. gcc 4.0 tries to do the same but,
apparently, something goes wrong. I didn't inspect the produced assembly
code in depth. Currently, I can only give examples in terms of the
macstl expressions and compiler-generated assembly output, but on
request I can try to write a more direct loop that uses the vec_* code.
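As a starting point, a direct loop of that kind might look like the sketch below. It is not a reproduction of the failing macstl case. The function name add_floats is hypothetical, and the vector path is taken only when all three pointers are 16-byte aligned, since vec_ld and vec_st ignore the low four address bits. Without AltiVec, only the scalar loop is compiled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __ALTIVEC__
#include <altivec.h>
#endif

/* Hypothetical test routine: out[i] = a[i] + b[i] for i in [0, n). */
void add_floats(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;

    assert(a != NULL && b != NULL && out != NULL);

#ifdef __ALTIVEC__
    /* vec_ld/vec_st need 16-byte aligned addresses, so check first. */
    if ((((uintptr_t)a | (uintptr_t)b | (uintptr_t)out) & 15) == 0) {
        for (; i + 4 <= n; i += 4) {
            vector float va = vec_ld(0, a + i);
            vector float vb = vec_ld(0, b + i);
            vec_st(vec_add(va, vb), 0, out + i);
        }
    }
#endif

    /* Scalar loop: handles the tail, unaligned input, and non-AltiVec builds. */
    for (; i < n; i++)
        out[i] = a[i] + b[i];
}
```

Compiling this with -maltivec -S and looking for vaddfp in the output (as opposed to calls to out-of-line routines) would show whether vec_add was inlined.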
So do any of you compiler writers know why the builtins are
needed? Also, does anyone care that this stuff doesn't really work?
Regards,
Ilya
- Question about the use of builtins in altivec.h Ilya Lipovsky