[Bug c++/53292] multi-threaded (OpenMP) is slower than single-threaded

2012-05-12 Thread fh_p at hotmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53292

--- Comment #9 from FH fh_p at hotmail dot com 2012-05-12 21:27:31 UTC ---
Well...

I tested an OpenMP benchmark (designed to demonstrate OpenMP performance) found
on the web: the multi-threaded (OpenMP) version is again slower than the
single-threaded one. I also tried coding it with pthreads: same thing.

So, I have a dual-core hyper-threaded PC, and I end up with multi-threaded
applications that are slower than single-threaded ones, and this is supposed
to be normal behavior?! Anyway, this still seems illogical to me.


--- Comment #1 from FH fh_p at hotmail dot com 2012-05-09 09:17:29 UTC ---
I am not sure whether this problem is related to gcc or to Ubuntu. I started
with the assumption that it is related to gcc.


--- Comment #2 from FH fh_p at hotmail dot com 2012-05-09 10:16:52 UTC ---
I have just tested on another computer (CPU: Xeon5650, 12 cores; OS:
Scientific Linux) and I reproduce the unexpected behavior (OpenMP slower than
single-threaded).

So, I believe the problem is related to gcc rather than to the OS.

When I use more threads (export OMP_NUM_THREADS=2, then 6, then 12), OpenMP
gets even slower relative to single-threaded. (Is this behavior related to
thread initialisation?)


Jakub Jelinek jakub at gcc dot gnu.org changed:

   What       |Removed     |Added
   -----------+------------+-------------------------
   Status     |UNCONFIRMED |RESOLVED
   CC         |            |jakub at gcc dot gnu.org
   Resolution |            |INVALID

--- Comment #3 from Jakub Jelinek jakub at gcc dot gnu.org 2012-05-09 12:15:53 UTC ---
This is just a bad test.  You are storing the values in the different
threads, but then reading everything in a single thread only.


--- Comment #4 from FH fh_p at hotmail dot com 2012-05-09 12:53:46 UTC ---
I don't understand your answer.

The timing covers only the for loop. The array-content check is
single-threaded: it is there to make sure the for loop did its job correctly,
and it is not timed. The array being initialized is shared by the threads
(shared by default) and not private to each thread.

To me, the test seems relevant. If it's not, why? And how should I modify it?
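
The test itself is attached to the bug and not reproduced in this thread. A
minimal sketch of a test with the shape described above might look like the
following. It is a hypothetical reconstruction, not the reporter's code: the
names fill_timed and check_filled are invented here, and timing uses C11
timespec_get rather than whatever the original used.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical reconstruction, not the reporter's actual test.
   Fills a[0..n-1] with 1.0 and returns the elapsed wall-clock seconds
   of the fill loop only.  Build with -fopenmp to run the loop in
   parallel; without it the pragma is skipped and one thread is used.  */
double
fill_timed (double *a, size_t n)
{
  struct timespec t0, t1;
  long i;

  timespec_get (&t0, TIME_UTC);
#ifdef _OPENMP
#pragma omp parallel for
#endif
  for (i = 0; i < (long) n; i++)
    a[i] = 1.0;                 /* 'a' is shared by all threads by default */
  timespec_get (&t1, TIME_UTC);

  return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

/* Untimed, single-threaded verification: returns 1 if every element
   was set to 1.0, otherwise 0.  */
int
check_filled (const double *a, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
    if (a[i] != 1.0)
      return 0;
  return 1;
}
```

A caller would allocate the array, call fill_timed once, print the returned
time, and only then call check_filled, so the check stays outside the
measurement.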


FH fh_p at hotmail dot com changed:

   What       |Removed  |Added
   -----------+---------+------------
   Status     |RESOLVED |UNCONFIRMED
   Resolution |INVALID  |

--- Comment #5 from FH fh_p at hotmail dot com 2012-05-09 12:55:56 UTC ---
To me, the test seems relevant. If it's not, why? And how should I modify it?


Jakub Jelinek jakub at gcc dot gnu.org changed:

   What       |Removed     |Added
   -----------+------------+---------
   Status     |UNCONFIRMED |RESOLVED
   Resolution |            |INVALID

--- Comment #6 from Jakub Jelinek jakub at gcc dot gnu.org 2012-05-09 13:29:00 UTC ---
Sorry, I missed that your measurement doesn't include the single-threaded
loop.  Anyway, the test is still not relevant: it is purely memory bound, and,
as you can see from running it with very small arguments, the thread creation
and omp for initial overhead is in the noise.  What you see is just how the
cache hierarchy of your CPU works.  The inner loop, in which all the measured
time is spent, is very similar in both cases (and even hand-editing it to be
identical doesn't help at all).
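
One way to see what "memory bound" means here is to vary how much arithmetic
is done per stored element. The sketch below is illustrative only and not part
of the bug report; the name fill_with_work and its work parameter are
hypothetical. With work == 0 the loop does almost nothing but store, like the
test under discussion, so its speed is set by the memory system. With a large
work value the arithmetic dominates, which is the kind of loop where extra
threads would usually be expected to help.

```c
#include <stddef.h>
#include <time.h>

/* Illustrative sketch, not from the bug report.  For each element,
   performs 'work' dependent multiply-adds and then stores the result.
   work == 0 gives a store-only loop.  Returns the elapsed wall-clock
   seconds.  Build with -fopenmp to run the outer loop in parallel.  */
double
fill_with_work (double *a, size_t n, int work)
{
  struct timespec t0, t1;
  long i;

  timespec_get (&t0, TIME_UTC);
#ifdef _OPENMP
#pragma omp parallel for
#endif
  for (i = 0; i < (long) n; i++)
    {
      double x = (double) i;    /* depends on i, so it cannot be hoisted */
      int k;
      for (k = 0; k < work; k++)
        x = x * 0.5 + 0.5;      /* each step depends on the previous one */
      a[i] = x;
    }
  timespec_get (&t1, TIME_UTC);

  return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Comparing the returned times for one thread and several threads, first with
work == 0 and then with a large work value, on an array much larger than the
caches, is one way to try the distinction on a given machine.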


--- Comment #7 from FH fh_p at hotmail dot com 2012-05-09 14:36:00 UTC ---
Well...

I still don't really get why it is not possible to improve performance for
such basic things. I tried with allocations up to 7 GB or more (RAM full +
swap full): I still get the same result, which looks unexpected to me.

Anyway, I guess I won't be able to get the logic of this...


--- Comment #8 from Jakub Jelinek jakub at gcc dot gnu.org 2012-05-09 15:01:24 UTC ---
Just try an equivalent pthread program and you'll see the same behavior.

#include <pthread.h>
#include <stdlib.h>

double *p;
int c;

void *tf (void *x)
{
  int i, s = ((long) x) * c, e = s + c;
  for (i = s; i < e; i++)
    p[i] = 1.0;
  return NULL;
}

int
main (int argc, char **argv)
{
  int n = atoi (argv[1]), i;
  int sz = atoi (argv[2]);
  if (n > 32 || n < 1 || sz < 128 || (sz % n) != 0)
    return 1;
  p = malloc (sz * sizeof (double));
  if (p == NULL)
    return 1;
  c = sz / n;
  pthread_t t[32];
  for (i = 1; i < n; i++)
    pthread_create (&t[i], NULL, tf, (void *)(long) i);
  tf ((void *) 0L);
  for (i = 1; i < n; i++)
    pthread_join (t[i], NULL);
  return 0;
}
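
The program above has no timing of its own and would normally be run under
time(1). As a rough sketch of how the same fill could be timed in-process for
different thread counts, the following variant wraps it in a function. The
names timed_fill, fill_chunk and struct chunk are hypothetical, the use of
C11 timespec_get is an assumption, and the final verification pass is an
addition that is kept outside the timed region.

```c
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

/* One contiguous slice of the shared array, handled by one thread.  */
struct chunk { double *base; long len; };

static void *
fill_chunk (void *arg)
{
  struct chunk *ck = arg;
  long i;
  for (i = 0; i < ck->len; i++)
    ck->base[i] = 1.0;
  return NULL;
}

/* Fill sz doubles using n threads (1 <= n <= 32, sz divisible by n).
   Returns the elapsed wall-clock seconds of the fill, or -1.0 on bad
   arguments, allocation failure, thread-creation failure, or if the
   untimed check finds an element that was not set.  */
double
timed_fill (int n, long sz)
{
  pthread_t t[32];
  struct chunk ck[32];
  struct timespec t0, t1;
  double *p;
  long j;
  int i, started = 1, ok = 1;

  if (n < 1 || n > 32 || sz < n || sz % n != 0)
    return -1.0;
  p = malloc (sz * sizeof (double));
  if (p == NULL)
    return -1.0;
  for (i = 0; i < n; i++)
    {
      ck[i].len = sz / n;
      ck[i].base = p + i * ck[i].len;
    }

  timespec_get (&t0, TIME_UTC);
  for (i = 1; i < n && ok; i++)
    {
      if (pthread_create (&t[i], NULL, fill_chunk, &ck[i]) != 0)
        ok = 0;
      else
        started = i + 1;
    }
  fill_chunk (&ck[0]);          /* the calling thread fills chunk 0 */
  for (i = 1; i < started; i++)
    pthread_join (t[i], NULL);
  timespec_get (&t1, TIME_UTC);

  for (j = 0; j < sz && ok; j++)        /* untimed verification */
    if (p[j] != 1.0)
      ok = 0;
  free (p);
  if (!ok)
    return -1.0;
  return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling timed_fill (1, sz) and then timed_fill (2, sz) with the same large sz
and comparing the two results is the experiment the comment suggests. Older
toolchains may need -pthread when compiling and linking.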