https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78808

            Bug ID: 78808
           Summary: target_clones not applying to openmp functions
           Product: gcc
           Version: 6.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: other
          Assignee: unassigned at gcc dot gnu.org
          Reporter: steven at uplinklabs dot net
  Target Milestone: ---

Simple test case:

---
__attribute__((target_clones("arch=haswell", "arch=sandybridge", "default")))
static void _saxpy(int n, float a, float * restrict x, float * restrict y)
{
#pragma omp parallel for
        for (int i = 0; i < n; ++i)
                y[i] = a*x[i] + y[i];
}

void saxpy(int n, float a, float * restrict x, float * restrict y)
{
        _saxpy(n, a, x, y);
}
---

Compile with:

gcc-6 -O3 -std=gnu11 -ffast-math -S

If -fopenmp is also specified, the function that GCC outlines for the
parallelized for loop is compiled with only the default target options; the
target_clones attribute on _saxpy is not applied to it.
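
For anyone unfamiliar with how the loop body ends up in a separate function,
here is a rough hand-written sketch of the outlining step. It is not the code
GCC emits. The names saxpy_omp_data, saxpy_outlined, run_parallel_serially and
saxpy_sketch are hypothetical, the chunking is only one possible static
schedule, and the "runtime" here just calls the outlined function once per
pretend thread instead of starting real threads. In the real compiler output
the outlined function typically shows up under a name like _saxpy._omp_fn.0
and is handed to libgomp.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the block of shared variables that the
   compiler passes to the outlined loop body. */
struct saxpy_omp_data {
        int n;
        float a;
        const float *x;
        float *y;
};

/* Hypothetical stand-in for the compiler-generated outlined function.
   It is a separate function from the one carrying the attribute, which
   is why target options have to be copied to it explicitly. */
static void saxpy_outlined(void *arg, int thread_id, int num_threads)
{
        struct saxpy_omp_data *d = arg;

        /* Assumed static schedule: one contiguous chunk per thread. */
        int chunk = (d->n + num_threads - 1) / num_threads;
        int begin = thread_id * chunk;
        int end = begin + chunk < d->n ? begin + chunk : d->n;

        for (int i = begin; i < end; ++i)
                d->y[i] = d->a * d->x[i] + d->y[i];
}

/* Serial stand-in for the runtime entry point: runs the outlined
   function once per pretend thread, one after another. */
static void run_parallel_serially(void (*fn)(void *, int, int),
                                  void *arg, int num_threads)
{
        assert(fn != NULL && num_threads > 0);
        for (int t = 0; t < num_threads; ++t)
                fn(arg, t, num_threads);
}

/* What the body of _saxpy conceptually turns into: pack the shared
   variables and hand the outlined function to the runtime. */
void saxpy_sketch(int n, float a, const float *x, float *y)
{
        struct saxpy_omp_data d = { n, a, x, y };
        run_parallel_serially(saxpy_outlined, &d, 4);
}
```

In this picture, the clones requested by target_clones correspond to copies of
saxpy_sketch, while the actual arithmetic lives in saxpy_outlined, which is
the part that ends up built for the default target.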

See attachments for comparison.
