On Mon, Sep 10, 2012 at 6:37 PM, Richard Henderson <r...@redhat.com> wrote:
> On 09/10/2012 09:09 AM, Iyer, Balaji V wrote:
>>> >If that's the case, what's the point in defining an external ABI and
>>> >defining what __attribute__((vector)) placed on a function declaration
>>> >means?
>
>> When you have __attribute__((vector)) you are asking the compiler to
>> create a vector AND a scalar version of the function. The advantage
>> is that if the function is used, for example, in 2 loops where 1 can
>> be vectorized and another cannot, the vectorizable loop won't suffer
>> (i.e. it won't be left unvectorized).
>
> You've totally misunderstood my point.
>
> Whether or not the compiler creates a clone COULD BE totally up to the
> compiler, based on whether or not vectorization is enabled, whether the
> loop has been analyzed such that vectorization may proceed, or indeed
> the phase of the moon.
>
> But in order for that to happen, the clone must be totally private to
> the module for which we are generating code (in the LTO sense, this is
> the entire program or dll; without LTO, this is just the object file).
> It means that we never attempt to generate clones for functions for
> which the body of the function is not visible.
>
> On the other hand, if you insist on assuming a clone exists merely
> because a declaration bears an attribute, then you must address ALL
> of the problems with respect to defining a stable ABI in the face of
> different cpu revisions, different ISAs, and different vector lengths.
>
> I've not seen you address ANY of these problems, despite having the
> problem pointed out multiple times.

Indeed, if the definition of an elemental function is always visible to
the vectorizer, the vectorizer itself can request creation of the clone
if it does not already exist (just have the callgraph manage those
clones).  The clones are then visible to the current TU only, and no ABI
issues exist.  You could argue that the vectorizer or the inliner could
just as well force inlining of elemental functions into the places it
wants to vectorize.  One complication, even with local clones, is that
the x86 ABI has no callee-saved XMM registers, which makes function calls
inside loops especially expensive.
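
To make the "local clone" idea concrete, here is a rough hand-written
sketch in plain C.  It is not what the compiler would emit: the names
(saxpy_elem, saxpy_elem_v4, saxpy) and the fixed vector length of 4 are
made up for illustration, and the "vector" clone is simply a 4-lane loop
over the scalar body.  The point is only that both versions are static,
so nothing about the clone leaks out of the TU.

```c
#include <assert.h>
#include <stddef.h>

enum { VLEN = 4 };  /* hypothetical vector length, chosen for illustration */

/* Scalar elemental function; its body is visible in this TU. */
static float saxpy_elem(float a, float x, float y)
{
    return a * x + y;
}

/* Stand-in for a clone the vectorizer could request on demand.
   It is static, so it is private to this TU and needs no stable ABI. */
static void saxpy_elem_v4(float a, const float *x, const float *y, float *out)
{
    for (int lane = 0; lane < VLEN; ++lane)
        out[lane] = saxpy_elem(a, x[lane], y[lane]);
}

/* Caller: full groups of VLEN go through the clone, the tail stays scalar. */
static void saxpy(float a, const float *x, const float *y, float *out, size_t n)
{
    assert(x != NULL && y != NULL && out != NULL);

    size_t i = 0;
    for (; i + VLEN <= n; i += VLEN)
        saxpy_elem_v4(a, x + i, y + i, out + i);
    for (; i < n; ++i)
        out[i] = saxpy_elem(a, x[i], y[i]);
}
```

Calling saxpy() on six elements sends the first four through the clone
and the last two through the scalar function.  Since both are local, the
compiler remains free to inline either one or pick another vector
length, which is exactly the freedom an external ABI would take away.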

Richard.

>
> r~
