[OMPI users] Read from file performance degradation when increasing number of processors in some cases

2020-03-06 Thread Ali Cherry via users
Hi, We faced an issue when testing the scalability of a parallel merge sort using a reduction tree on an array of size 1024^3. Currently, only the master opens the input file, parses it into an array using fscanf, and then distributes the array to the other processors. When using 32 processors, it took

Re: [OMPI users] Read from file performance degradation when increasing number of processors in some cases

2020-03-06 Thread Gilles Gouaillardet via users
Hi, The log filenames suggest you are always running on a single node, is that correct? Do you create the input file on the tmpfs once and for all, or before each run? Can you please post your mpirun command lines? If you did not bind the tasks, can you try again with mpirun --bind-to core ... Ch

Re: [OMPI users] Read from file performance degradation when increasing number of processors in some cases

2020-03-06 Thread Gilles Gouaillardet via users
Also, in mpi_just_read.c, what if you add MPI_Barrier(MPI_COMM_WORLD); right before invoking MPI_Finalize()? Can you observe a similar performance degradation when moving from 32 to 64 tasks? Cheers, Gilles - Original Message - Hi, The log filenames suggest you are al
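Gilles's suggestion amounts to something like the following fragment. This is an illustrative sketch only (it needs an MPI installation and mpirun to execute); the timing printout is an assumption, not part of the original mpi_just_read.c:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    double t0 = MPI_Wtime();

    /* ... the existing mpi_just_read.c work goes here ... */

    /* Suggested addition: synchronize all ranks so the measured time
     * includes the slowest rank before shutdown begins. */
    MPI_Barrier(MPI_COMM_WORLD);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0)
        printf("elapsed: %f s\n", MPI_Wtime() - t0);
    MPI_Finalize();
    return 0;
}
```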

Re: [OMPI users] Read from file performance degradation when increasing number of processors in some cases

2020-03-06 Thread Ali Cherry via users
Hello, Thank you for your replies. Yes, it is only a single node with 64 cores. The input file is copied from NFS to a tmpfs when I start the node. The mpirun command lines were: $ mpirun -np 64 --mca btl vader,self pms.out /run/user/10002/bigarray.in > pms-vader-64.log 2>&1 $ mpirun -np 32 --mc

Re: [OMPI users] Read from file performance degradation when increasing number of processors in some cases

2020-03-06 Thread Gabriel, Edgar via users
How is the performance if you leave a few cores for the OS, e.g. running with 60 processes instead of 64? Reasoning being that the file read operation is really executed by the OS, and could potentially be quite resource intensive. Thanks Edgar From: users On Behalf Of Ali Cherry via users Sen

Re: [OMPI users] Read from file performance degradation when increasing number of processors in some cases

2020-03-06 Thread Ali Cherry via users
Thank you both for your interest. $ mpirun --use-hwthread-cpus --bind-to core -np 64 --mca btl vader,self mpijr.out /run/user/10002/bigarray.in > mpijr-bindto-vader-64.log 2>&1 $ mpirun --use-hwthread-cpus --bind-to core -np 60 --mca btl vader,self mpijr.out /run/user/10002/bigarray.in > mpijr-b

Re: [OMPI users] Read from file performance degradation when increasing number of processors in some cases

2020-03-06 Thread Gilles Gouaillardet via users
When you run 64 MPI tasks, you oversubscribe your cores (2 MPI tasks per core). Here is what I think is happening: in pms.out, all tasks are spinning (the master is reading the file, the slaves are waiting for data), so you end up with task 0 sharing half its timeslices. On the other hand, on mpi_just
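One common mitigation for this busy-wait behavior when oversubscribing is to ask idle ranks to yield the CPU instead of spinning. A hypothetical invocation is shown below; the mpi_yield_when_idle MCA parameter exists in Open MPI, but its name and default behavior vary across versions, so treat this as a sketch rather than the list's recommendation:

```shell
# Hypothetical: make blocked ranks yield while polling, so the master's
# fscanf loop is not starved of timeslices by 63 spinning receivers.
mpirun -np 64 --oversubscribe --mca mpi_yield_when_idle 1 \
       --mca btl vader,self pms.out /run/user/10002/bigarray.in
```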

Re: [OMPI users] Read from file performance degradation when increasing number of processors in some cases

2020-03-06 Thread Ali Cherry via users
Hi Gilles, Just read your email after sending mine; yes, it does indeed make perfect sense. That’s why I hinted at MPI_Recv by the 63 other processors at the beginning of this email. By using --use-hwthread-cpus, is it still considered oversubscription? And any way to overcome it? Thank you,