Hi,
We faced an issue when testing the scalability of a parallel merge sort using
a reduction tree on an array of size 1024^3.
Currently, only the master opens the input file, parses it into an array
using fscanf, and then distributes the array to the other processors.
When using 32 processors, it took
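[For context, the read-then-distribute pattern described above looks roughly
like the minimal sketch below. This is not the poster's actual pms.c: the
element type, the chunking, and the MPI_Scatter call are assumptions, and
MPI_Scatter stands in for whatever send/receive distribution the real code
uses.]

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const long N = 1L << 30;          /* 1024^3 elements, as in the post */
    int chunk = (int)(N / size);      /* assume size divides N evenly */
    int *all = NULL;

    if (rank == 0) {                  /* only the master touches the file */
        FILE *f = fopen(argv[1], "r");
        if (f == NULL)
            MPI_Abort(MPI_COMM_WORLD, 1);
        all = malloc(N * sizeof(int));
        for (long i = 0; i < N; i++)
            fscanf(f, "%d", &all[i]);
        fclose(f);
    }

    int *local = malloc(chunk * sizeof(int));
    /* non-root ranks block here for the whole duration of the read */
    MPI_Scatter(all, chunk, MPI_INT, local, chunk, MPI_INT, 0, MPI_COMM_WORLD);

    /* ... local sort and reduction-tree merge would go here ... */

    free(local);
    if (rank == 0)
        free(all);
    MPI_Finalize();
    return 0;
}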
Hi,
The log filenames suggest you are always running on a single node, is
that correct?
Do you create the input file on the tmpfs once and for all, or before each run?
Can you please post your mpirun command lines?
If you did not bind the tasks, can you try again with
mpirun --bind-to core ...
Cheers,
Gilles
Also, in mpi_just_read.c, what if you add
MPI_Barrier(MPI_COMM_WORLD);
right before invoking
MPI_Finalize();
Can you observe a similar performance degradation when moving from 32 to
64 tasks?
Cheers,
Gilles
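[mpi_just_read.c itself was not posted, but a minimal skeleton of what Gilles
is suggesting would look like the following; everything except the barrier
placement is an assumption.]

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* rank 0 reads the whole input file with fscanf, as in pms.c ... */
    }

    /* the suggested addition: keep every rank busy inside MPI until
       rank 0 finishes reading, mimicking what the slaves in pms.c do
       while waiting for their data, instead of exiting early */
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
    return 0;
}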
Hello,
Thank you for your replies.
Yes, it is only a single node with 64 cores.
The input file is copied from NFS to a tmpfs when I start the node.
The mpirun command lines were:
$ mpirun -np 64 --mca btl vader,self pms.out /run/user/10002/bigarray.in >
pms-vader-64.log 2>&1
$ mpirun -np 32 --mca btl vader,self pms.out /run/user/10002/bigarray.in >
pms-vader-32.log 2>&1
How is the performance if you leave a few cores for the OS, e.g. running with
60 processes instead of 64? The reasoning is that the file read operation is
really executed by the OS and could potentially be quite resource intensive.
Thanks
Edgar
Thank you both for your interest.
$ mpirun --use-hwthread-cpus --bind-to core -np 64 --mca btl vader,self
mpijr.out /run/user/10002/bigarray.in > mpijr-bindto-vader-64.log 2>&1
$ mpirun --use-hwthread-cpus --bind-to core -np 60 --mca btl vader,self
mpijr.out /run/user/10002/bigarray.in > mpijr-bindto-vader-60.log 2>&1
When you run 64 MPI tasks, you oversubscribe your cores (2 MPI tasks per
core).
Here is what I think is happening:
In pms.out, all tasks are spinning (the master is reading the file, the
slaves are busy-waiting for data), so you end up with task 0 sharing half
its timeslices.
On the other hand, in mpi_just_read.c
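[A toy reproducer of that hypothesis; the CPU-bound loop standing in for the
fscanf parsing and all names here are illustrative, not from the original
code. If the explanation holds, running this once with -np 32 and once with
-np 64 on the same node should show rank 0's timed phase roughly doubling
once the busy-polling receivers start sharing its core.]

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int token = 0;
    double t0 = MPI_Wtime();

    if (rank == 0) {
        /* CPU-bound stand-in for parsing 1024^3 numbers with fscanf */
        volatile double x = 0.0;
        for (long i = 1; i <= 1000000000L; i++)
            x += 1.0 / (double)i;
    }

    /* every other rank sits here the whole time; Open MPI's default
       progress engine busy-polls inside blocking calls, so each
       "waiting" rank consumes a full hardware thread */
    MPI_Bcast(&token, 1, MPI_INT, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("simulated read + broadcast: %.2f s\n", MPI_Wtime() - t0);

    MPI_Finalize();
    return 0;
}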
Hi Gilles,
I just read your email after sending mine; yes, it does indeed make perfect
sense. That's why I hinted at the MPI_Recv by the 63 other processors at the
beginning of this email.
When using --use-hwthread-cpus, is it still considered oversubscription?
And is there any way to overcome it?
Thank you,
Ali