Hello all,

I am an active AltiVec PPC assembly programmer, but until recently I had not been using gcc's AltiVec extensions.

Lately, however, through a project I am contributing to called "macstl" (www.pixelglow.com/macstl/), I have started using them.

So here is my question: why would one define the vec_XXX routines in terms of __builtin_altivec_XXX compiler primitives rather than inline asm statements? Is there a good reason for doing so, given that gcc already has good support for inline assembly?
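To make the question concrete, the two alternatives look roughly like the sketch below. This is only an illustration, not the actual altivec.h definitions: add4_intrinsic and add4_inline_asm are hypothetical names, the memcpy-based loads and stores are just a way to sidestep alignment requirements, and the scalar branch exists only so the file also builds on targets without AltiVec.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__ALTIVEC__)
#include <altivec.h>

/* (a) PIM-style intrinsic; altivec.h maps vec_add onto a compiler builtin. */
static void add4_intrinsic(const float *a, const float *b, float *out)
{
    vector float va, vb, vr;
    assert(a != NULL && b != NULL && out != NULL);
    memcpy(&va, a, sizeof va);
    memcpy(&vb, b, sizeof vb);
    vr = vec_add(va, vb);
    memcpy(out, &vr, sizeof vr);
}

/* (b) The same operation written as an inline asm statement.
   The "v" constraint asks gcc for an AltiVec vector register. */
static void add4_inline_asm(const float *a, const float *b, float *out)
{
    vector float va, vb, vr;
    assert(a != NULL && b != NULL && out != NULL);
    memcpy(&va, a, sizeof va);
    memcpy(&vb, b, sizeof vb);
    __asm__("vaddfp %0,%1,%2" : "=v"(vr) : "v"(va), "v"(vb));
    memcpy(out, &vr, sizeof vr);
}

#else

/* Scalar stand-ins (assumption: only here so the sketch builds elsewhere). */
static void add4_intrinsic(const float *a, const float *b, float *out)
{
    size_t i;
    assert(a != NULL && b != NULL && out != NULL);
    for (i = 0; i < 4; ++i)
        out[i] = a[i] + b[i];
}

static void add4_inline_asm(const float *a, const float *b, float *out)
{
    add4_intrinsic(a, b, out);
}

#endif
```

Both functions add four floats; the only difference is whether the add is expressed through the vec_add intrinsic or through a hand-written vaddfp.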

Now, it is important to note: I am *not* using the builtins directly; I am invoking the vec_XXX functions that are defined in altivec.h and specified in the AltiVec PIM.

I am asking because we are having problems with these builtins not being inlined as instructions once the logic (in loops) reaches a certain level of complexity. Even worse, gcc 4.0 (both 4.0.0 and 4.0.1) generates bad code, whereas gcc 3.4 is OK. Instead of inlining the actual instructions, 3.4 simply resorts to a series of calls to compiler-generated routines (with mangled names such as "_Z7vec_addU8_vectorfs", probably corresponding to the vec_add builtin). Again, this happens once a certain amount of code is reached. gcc 4.0 tries to do the same but, apparently, something goes wrong; I have not inspected the generated assembly in depth. Currently I can only give examples in terms of macstl expressions and the compiler-generated assembly output, but if you request it, I can try to write a more direct loop that uses the vec_* functions.
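For reference, a direct loop over vec_* calls could look something like the sketch below. It is illustrative only: it is not taken from macstl, it is not known to reproduce the problem described above, and madd_loop is a hypothetical name. It assumes n is a multiple of 4 and uses memcpy for loads and stores so that the arrays need not be 16-byte aligned; the scalar branch is there only so the file builds on targets without AltiVec.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__ALTIVEC__)
#include <altivec.h>

/* dst[i] = a[i] * b[i] + c[i], four floats per iteration via vec_madd. */
void madd_loop(float *dst, const float *a, const float *b,
               const float *c, size_t n)
{
    size_t i;
    vector float va, vb, vc, vr;

    assert(n % 4 == 0);
    for (i = 0; i < n; i += 4) {
        memcpy(&va, a + i, sizeof va);
        memcpy(&vb, b + i, sizeof vb);
        memcpy(&vc, c + i, sizeof vc);
        vr = vec_madd(va, vb, vc);      /* fused multiply-add */
        memcpy(dst + i, &vr, sizeof vr);
    }
}

#else

/* Scalar stand-in (assumption: only here so the sketch builds elsewhere). */
void madd_loop(float *dst, const float *a, const float *b,
               const float *c, size_t n)
{
    size_t i;

    assert(n % 4 == 0);
    for (i = 0; i < n; ++i)
        dst[i] = a[i] * b[i] + c[i];
}

#endif
```

Compiling something like this with -maltivec and -S, and looking for calls to out-of-line vec_* routines instead of vmaddfp instructions, would be one way to check whether the intrinsics are being inlined.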

So, do any of you compiler writers know why the builtins are needed? Also, does anyone care that this stuff doesn't really work?

Regards,
Ilya