Re: gomp slowness

Tomash Brechko Sat, 20 Oct 2007 11:33:07 -0700

I'm not sure what the OpenMP spec says about the default data scope
(too lazy to read through it), but it seems that the examples from
http://kallipolis.com/openmp/2.html assume default(private), while
GCC's GOMP defaults to shared.  In your case,


  #pragma omp parallel for shared(A, row, col)
    for (i = k+1; i<SIZE; i++) {
      for (j = k+1; j<SIZE; j++) {
          A[i][j] = A[i][j] - row[i] * col[j];
      }
    }

'#pragma omp for' makes 'i' private implicitly (it couldn't be
otherwise), but 'j' is still shared, so all threads race on the same
inner loop counter.  I just tried your original case: not only is it
slow, it also produces different results with and without OpenMP
(just try printing any element of 'A').  Adding 'private(j)' (or
defining 'j' inside the outer loop) fixes it.
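
For reference, here is a minimal sketch of both fixes next to a serial
version, so the results can be compared directly.  The element type
(double), SIZE = 64, the fill values and all function names are my
assumptions for illustration; none of them come from your program.

```c
#define SIZE 64   /* assumed size; the thread does not state it */

/* Serial reference: the same loop nest with no OpenMP directive. */
static void update_serial(double A[SIZE][SIZE], double row[SIZE],
                          double col[SIZE], int k)
{
    int i, j;

    for (i = k + 1; i < SIZE; i++)
        for (j = k + 1; j < SIZE; j++)
            A[i][j] = A[i][j] - row[i] * col[j];
}

/* Fix 1: 'j' is still declared outside the loop, but listed in private(). */
static void update_private_j(double A[SIZE][SIZE], double row[SIZE],
                             double col[SIZE], int k)
{
    int i, j;

    #pragma omp parallel for shared(A, row, col) private(j)
    for (i = k + 1; i < SIZE; i++) {
        for (j = k + 1; j < SIZE; j++) {
            A[i][j] = A[i][j] - row[i] * col[j];
        }
    }
}

/* Fix 2: 'j' is declared inside the outer loop body, so every thread
   gets its own copy without any extra clause. */
static void update_local_j(double A[SIZE][SIZE], double row[SIZE],
                           double col[SIZE], int k)
{
    int i;

    #pragma omp parallel for shared(A, row, col)
    for (i = k + 1; i < SIZE; i++) {
        int j;
        for (j = k + 1; j < SIZE; j++) {
            A[i][j] = A[i][j] - row[i] * col[j];
        }
    }
}

/* Fill the inputs with small integers (arbitrary test data), so every
   result is exactly representable and can be compared with '=='. */
static void fill(double A[SIZE][SIZE], double row[SIZE], double col[SIZE])
{
    int i, j;

    for (i = 0; i < SIZE; i++) {
        row[i] = i;
        col[i] = i + 1;
        for (j = 0; j < SIZE; j++)
            A[i][j] = i + j;
    }
}

/* Run all three variants on identical input and return the number of
   elements where either fixed variant differs from the serial result. */
int count_mismatches(int k)
{
    static double ref[SIZE][SIZE], a1[SIZE][SIZE], a2[SIZE][SIZE];
    double row[SIZE], col[SIZE];
    int i, j, bad = 0;

    fill(ref, row, col);
    fill(a1, row, col);
    fill(a2, row, col);

    update_serial(ref, row, col, k);
    update_private_j(a1, row, col, k);
    update_local_j(a2, row, col, k);

    for (i = 0; i < SIZE; i++)
        for (j = 0; j < SIZE; j++)
            if (a1[i][j] != ref[i][j] || a2[i][j] != ref[i][j])
                bad++;
    return bad;
}
```

Built with 'gcc -fopenmp', count_mismatches(k) should return 0 for any
valid k.  Dropping 'private(j)' from the first variant brings the race
back, so the count may then become nonzero from run to run.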

It would be nice if someone could post measurements for the fixed
case; my machine has only HT (Hyper-Threading), and I experience a
slowdown for this example (though it still runs much faster than
before the fix).


-- 
   Tomash Brechko
