http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49363

--- Comment #17 from vincenzo Innocente <vincenzo.innocente at cern dot ch> 
2012-05-10 10:16:13 UTC ---
I tested this:

float x[1024], y[1024], z[1024], w[1024];

void foo() {
  for (int i=0; i!=1024; ++i)
     x[i]=y[i]*z[i]+w[i];
}


void __attribute__ ((target("arch=corei7"))) foo() {
  for (int i=0; i!=1024; ++i)
     x[i]=y[i]*z[i]+w[i];
}

void __attribute__ ((target("avx"))) foo() {
  for (int i=0; i!=1024; ++i)
     x[i]=y[i]*z[i]+w[i];
}


and I see the three versions generated, plus the "resolver".

As you can see, the source code of the three versions is identical, since I am
relying on compiler autovectorization here.
In this case I was hoping that a single declaration such as __attribute__
((target("arch=corei7,avx")))
or __attribute__ ((target("arch=corei7"),target("avx"))) would generate the two
versions without having to duplicate the source code.
Would this be possible to support?
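
For comparison, here is a rough sketch of how the duplication can be hidden by
hand in the meantime. It is not the multiversioning feature itself: the loop is
written once in a macro, stamped out into separately named functions, and
dispatched through a hand-written wrapper standing in for the generated
resolver. The names TARGET, HAVE_X86_TARGETS, FOO_BODY, foo_default, foo_corei7
and foo_avx are made up for illustration, and testing for "sse4.2" is only an
assumed approximation of what arch=corei7 requires.

```cpp
float x[1024], y[1024], z[1024], w[1024];

// Only apply the target attributes on x86 with a GNU-compatible compiler;
// elsewhere all three functions are compiled with the default options.
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define TARGET(t) __attribute__((target(t)))
#define HAVE_X86_TARGETS 1
#else
#define TARGET(t)
#define HAVE_X86_TARGETS 0
#endif

// The loop body is written exactly once (hypothetical helper macro).
#define FOO_BODY                      \
  for (int i = 0; i != 1024; ++i)     \
    x[i] = y[i] * z[i] + w[i];

// One copy per target, each free to be autovectorized differently.
void foo_default() { FOO_BODY }
TARGET("arch=corei7") void foo_corei7() { FOO_BODY }
TARGET("avx") void foo_avx() { FOO_BODY }

// Hand-written stand-in for the resolver: pick a version at run time.
void foo() {
#if HAVE_X86_TARGETS
  __builtin_cpu_init();
  if (__builtin_cpu_supports("avx")) {
    foo_avx();
    return;
  }
  // Assumption: SSE4.2 support is used as a proxy for "corei7 or newer".
  if (__builtin_cpu_supports("sse4.2")) {
    foo_corei7();
    return;
  }
#endif
  foo_default();
}
```

Callers just call foo() as before. The drawback is that the CPU check runs on
every call and the dispatch order has to be maintained by hand, which is what a
single multi-target declaration would avoid.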
