With '--bind-to socket' I get the same results as with '--bind-to core': 3813 MB/s.
I have attached the ompi_yalla_socket.out and ompi_yalla_socket.err files to this
message.


Tuesday, June 16, 2015, 18:15 +03:00 from Alina Sklarevich 
<ali...@dev.mellanox.co.il>:
>Hi Timur,
>
>Can you please try running your ompi_yalla cmd with '--bind-to socket'
>(instead of binding to core) and check if it affects the results?
>We saw that it made a difference in performance in our lab, which is why I
>asked you to try the same.
>
>Thanks,
>Alina.
>
>On Tue, Jun 16, 2015 at 5:53 PM, Timur Ismagilov  < tismagi...@mail.ru > wrote:
>>Hello, Alina!
>>
>>If I use --map-by node, I get only intranode communication in osu_mbw_mr, so I
>>use --map-by core instead.
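>>To illustrate why: osu_mbw_mr pairs the first half of the ranks with the second
>>half, so the process placement decides whether each pair crosses the network.
>>A minimal sketch of that pairing (a simplified illustration assuming the usual
>>first-half/second-half scheme, not the actual OSU source):
>>
>>#include <mpi.h>
>>#include <stdio.h>
>>
>>int main(int argc, char **argv)
>>{
>>    int rank, size;
>>    MPI_Init(&argc, &argv);
>>    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>    MPI_Comm_size(MPI_COMM_WORLD, &size);
>>
>>    int pairs = size / 2;
>>    int partner = (rank < pairs) ? rank + pairs : rank - pairs;
>>
>>    /* With 32 ranks on 2 nodes:
>>     *  --map-by core: ranks 0-15 on node A, 16-31 on node B,
>>     *                 so every pair (i, i+16) crosses the network;
>>     *  --map-by node: even ranks on node A, odd ranks on node B,
>>     *                 and i and i+16 have the same parity, so every
>>     *                 pair stays inside one node (shared memory). */
>>    printf("rank %d pairs with rank %d\n", rank, partner);
>>
>>    MPI_Finalize();
>>    return 0;
>>}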
>>
>>I have 2 nodes, each node has 2 sockets with 8 cores per socket.
>>
>>When I run osu_mbw_mr on 2 nodes with 32 MPI procs (see the commands below), I
>>expect the result of this test to show the unidirectional bandwidth of the
>>4x FDR link.
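>>As a rough check of that expectation: 4x FDR is 4 x 14.0625 Gb/s = 56.25 Gb/s of
>>signaling, and with 64/66b encoding about 54.5 Gb/s of data, i.e. roughly
>>6.8 GB/s peak, so a measured value around 6.3 GB/s is about what the link should
>>deliver.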
>>
>>With Intel MPI I get 6367 MB/s;
>>with ompi_yalla I get about 3744 MB/s (the problem: this is half of the impi result);
>>with Open MPI without MXM (ompi_clear) I get 6321 MB/s.
>>
>>How can I improve the yalla results?
>>
>>IntelMPI cmd:
>>/opt/software/intel/impi/4.1.0.030/intel64/bin/mpiexec.hydra -machinefile
>>machines.pYAvuK -n 32 -binding domain=core
>>../osu_impi/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_mbw_mr -v -r=0
>>
>>ompi_yalla cmd:
>>/gpfs/NETHOME/oivt1/nicevt/itf/sources/hpcx-v1.3.330-icc-OFED-1.5.4.1-redhat6.2-x86_64/ompi-mellanox-fca-v1.8.5/bin/mpirun
>>-report-bindings -display-map -mca coll_hcoll_enable 1 -x HCOLL_MAIN_IB=mlx4_0:1
>>-x MXM_IB_PORTS=mlx4_0:1 -x MXM_SHM_KCOPY_MODE=off --mca pml yalla
>>--map-by core --bind-to core --hostfile hostlist
>>../osu_ompi_hcoll/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_mbw_mr -v -r=0
>>
>>ompi_clear cmd:
>>/gpfs/NETHOME/oivt1/nicevt/itf/sources/hpcx-v1.3.330-icc-OFED-1.5.4.1-redhat6.2-x86_64/ompi-clear-v1.8.5/bin/mpirun
>>-report-bindings -display-map --hostfile hostlist --map-by core --bind-to core
>>../osu_ompi_clear/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_mbw_mr -v -r=0
>>
>>I have attached the output files to this message:
>>ompi_clear.out, ompi_clear.err - contain the ompi_clear results
>>ompi_yalla.out, ompi_yalla.err - contain the ompi_yalla results
>>impi.out, impi.err - contain the Intel MPI results
>>
>>Best regards,
>>Timur
>>
>>Sunday, June 7, 2015, 16:11 +03:00 from Alina Sklarevich < 
>>ali...@dev.mellanox.co.il >:
>>>Hi Timur,
>>>
>>>After running the osu_mbw_mr benchmark in our lab, we observed that the
>>>binding policy made a difference in performance.
>>>Can you please rerun your ompi tests with the following added to your 
>>>command line? (one of them in each run)
>>>
>>>1. --map-by node --bind-to socket
>>>2. --map-by node --bind-to core
>>>
>>>Please attach your results.
>>>
>>>Thank you,
>>>Alina.
>>>
>>>On Thu, Jun 4, 2015 at 6:53 PM, Timur Ismagilov  < tismagi...@mail.ru > 
>>>wrote:
>>>>Hello, Alina.
>>>>1. Here is my ompi_yalla command line:
>>>>$HPCX_MPI_DIR/bin/mpirun -mca coll_hcoll_enable 1 -x HCOLL_MAIN_IB=mlx4_0:1 
>>>>-x MXM_IB_PORTS=mlx4_0:1 -x MXM_SHM_KCOPY_MODE=off --mca pml yalla 
>>>>--hostfile hostlist $@
>>>>echo $HPCX_MPI_DIR
>>>>/gpfs/NETHOME/oivt1/nicevt/itf/sources/hpcx-v1.3.330-icc-OFED-1.5.4.1-redhat6.2-x86_64/ompi-mellanox-fca-v1.8.5
>>>>This MPI was configured with: --with-mxm=/path/to/mxm
>>>>--with-hcoll=/path/to/hcoll
>>>>--with-platform=contrib/platform/mellanox/optimized
>>>>--prefix=/path/to/ompi-mellanox-fca-v1.8.5
>>>>ompi_clear command line:
>>>>$HPCX_MPI_DIR/bin/mpirun --hostfile hostlist $@
>>>>echo $HPCX_MPI_DIR
>>>>/gpfs/NETHOME/oivt1/nicevt/itf/sources/hpcx-v1.3.330-icc-OFED-1.5.4.1-redhat6.2-x86_64/ompi-clear-v1.8.5
>>>>This MPI was configured with:
>>>>--with-platform=contrib/platform/mellanox/optimized
>>>>--prefix=/path/to/ompi-clear-v1.8.5
>>>>2. When I run osu_mbw_mr with "-x MXM_TLS=self,shm,rc", it fails with a
>>>>segmentation fault:
>>>>the stdout log is in the attached file osu_mbr_mr_n-2_ppn-16.out;
>>>>the stderr log is in the attached file osu_mbr_mr_n-2_ppn-16.err;
>>>>cmd line:
>>>>$HPCX_MPI_DIR/bin/mpirun -mca coll_hcoll_enable 1 -x HCOLL_MAIN_IB=mlx4_0:1 
>>>>-x MXM_IB_PORTS=mlx4_0:1 -x MXM_SHM_KCOPY_MODE=off --mca pml yalla -x 
>>>>MXM_TLS=self,shm,rc --hostfile hostlist osu_mbw_mr -v -r=0
>>>>I have changed WINDOW_SIZES in osu_mbw_mr.c:
>>>>#define WINDOW_SIZES {8, 16, 32, 64,  128, 256, 512, 1024 }  
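>>>>For context, this is roughly how the window-based part of osu_mbw_mr works: the
>>>>sender keeps 'window' messages in flight before synchronizing, so a larger
>>>>window hides latency and pushes the measured bandwidth closer to the link peak.
>>>>A simplified sketch (not the actual OSU code):
>>>>
>>>>#include <mpi.h>
>>>>
>>>>/* Bandwidth of one sender/receiver pair for one message size: 'window'
>>>> * non-blocking transfers are posted back-to-back, then the sender waits
>>>> * for a 1-byte ack from its partner before the next iteration. */
>>>>double measure_pair_bw(int partner, int is_sender, char *buf,
>>>>                       int size, int window, int loops)
>>>>{
>>>>    MPI_Request reqs[window];
>>>>    char ack;
>>>>    double t0 = MPI_Wtime();
>>>>
>>>>    for (int l = 0; l < loops; l++) {
>>>>        if (is_sender) {
>>>>            for (int w = 0; w < window; w++)
>>>>                MPI_Isend(buf, size, MPI_CHAR, partner, 100,
>>>>                          MPI_COMM_WORLD, &reqs[w]);
>>>>            MPI_Waitall(window, reqs, MPI_STATUSES_IGNORE);
>>>>            MPI_Recv(&ack, 1, MPI_CHAR, partner, 101,
>>>>                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>>>>        } else {
>>>>            for (int w = 0; w < window; w++)
>>>>                MPI_Irecv(buf, size, MPI_CHAR, partner, 100,
>>>>                          MPI_COMM_WORLD, &reqs[w]);
>>>>            MPI_Waitall(window, reqs, MPI_STATUSES_IGNORE);
>>>>            MPI_Send(&ack, 1, MPI_CHAR, partner, 101, MPI_COMM_WORLD);
>>>>        }
>>>>    }
>>>>
>>>>    double t1 = MPI_Wtime();
>>>>    /* MB/s for this pair; the benchmark aggregates this over all pairs. */
>>>>    return (double)size * window * loops / (t1 - t0) / 1.0e6;
>>>>}
>>>>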
>>>>3. I have added the results of running osu_mbw_mr with yalla and without hcoll
>>>>on 32 and 64 nodes (512 and 1024 MPI procs) to mvs10p_mpi.xls, sheet osu_mbr_mr.
>>>>The results are 20 percent lower than the old results (with hcoll).
>>>>
>>>>
>>>>
>>>>Wednesday, June 3, 2015, 10:29 +03:00 from Alina Sklarevich < 
>>>>ali...@dev.mellanox.co.il >:
>>>>>Hello Timur,
>>>>>
>>>>>I will review your results and try to reproduce them in our lab.
>>>>>
>>>>>You are using an old OFED (OFED-1.5.4.1), and we suspect that this may be
>>>>>causing the performance issues you are seeing.
>>>>>
>>>>>In the meantime, could you please:
>>>>>
>>>>>1. send us the exact command lines that you were running when you got 
>>>>>these results?
>>>>>
>>>>>2. add the following to the command line that you are running with 'pml 
>>>>>yalla' and attach the results?
>>>>>"-x MXM_TLS=self,shm,rc"
>>>>>
>>>>>3. run your command line with yalla and without hcoll?
>>>>>
>>>>>Thanks,
>>>>>Alina.
>>>>>
>>>>>
>>>>>
>>>>>On Tue, Jun 2, 2015 at 4:56 PM, Timur Ismagilov  < tismagi...@mail.ru > 
>>>>>wrote:
>>>>>>Hi, Mike!
>>>>>>I have Intel MPI v4.1.2 (impi).
>>>>>>I built OMPI 1.8.5 with MXM and hcoll (ompi_yalla).
>>>>>>I built OMPI 1.8.5 without MXM and hcoll (ompi_clear).
>>>>>>I ran the OSU p2p osu_mbw_mr test with these MPIs.
>>>>>>You can find the benchmark results in the attached file (mvs10p_mpi.xls,
>>>>>>sheet osu_mbr_mr).
>>>>>>
>>>>>>On 64 nodes (1024 MPI processes) ompi_yalla gets 2x worse performance than
>>>>>>ompi_clear.
>>>>>>Does MXM with yalla reduce p2p performance compared with ompi_clear (and impi)?
>>>>>>Am I doing something wrong?
>>>>>>P.S. My colleague Alexander Semenov is in CC
>>>>>>Best regards,
>>>>>>Timur
>>>>>>
>>>>>>Thursday, May 28, 2015, 20:02 +03:00 from Mike Dubman < 
>>>>>>mi...@dev.mellanox.co.il >:
>>>>>>>It is not an apples-to-apples comparison.
>>>>>>>
>>>>>>>yalla/mxm is a point-to-point library, not a collective library;
>>>>>>>the collective algorithm runs on top of yalla.
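>>>>>>>
>>>>>>>To make "on top of" concrete: a collective such as alltoall is ultimately
>>>>>>>decomposed into point-to-point messages, which is what yalla/mxm accelerates;
>>>>>>>the schedule itself comes from the collective component. A simplified sketch
>>>>>>>(not OMPI's actual tuned algorithm):
>>>>>>>
>>>>>>>#include <mpi.h>
>>>>>>>
>>>>>>>/* Naive alltoall built only from point-to-point exchanges: at step s,
>>>>>>> * every rank sends its block for (rank+s) and receives the block coming
>>>>>>> * from (rank-s). The p2p layer (e.g. pml yalla / mxm) carries each
>>>>>>> * message; a real coll component just picks a smarter schedule. */
>>>>>>>void naive_alltoall(char *sendbuf, char *recvbuf, int block_bytes,
>>>>>>>                    MPI_Comm comm)
>>>>>>>{
>>>>>>>    int rank, size;
>>>>>>>    MPI_Comm_rank(comm, &rank);
>>>>>>>    MPI_Comm_size(comm, &size);
>>>>>>>
>>>>>>>    for (int step = 0; step < size; step++) {
>>>>>>>        int dst = (rank + step) % size;          /* block I send    */
>>>>>>>        int src = (rank - step + size) % size;   /* block I receive */
>>>>>>>        MPI_Sendrecv(sendbuf + dst * block_bytes, block_bytes, MPI_CHAR,
>>>>>>>                     dst, 0,
>>>>>>>                     recvbuf + src * block_bytes, block_bytes, MPI_CHAR,
>>>>>>>                     src, 0, comm, MPI_STATUS_IGNORE);
>>>>>>>    }
>>>>>>>}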
>>>>>>>
>>>>>>>Intel's collective algorithm for alltoall (a2a) is better than OMPI's
>>>>>>>built-in collective algorithm.
>>>>>>>
>>>>>>>To see the benefit of yalla, you should run the p2p benchmarks
>>>>>>>(osu_lat/bw/bibw/mr).
>>>>>>>
>>>>>>>
>>>>>>>On Thu, May 28, 2015 at 7:35 PM, Timur Ismagilov  < tismagi...@mail.ru > 
>>>>>>>wrote:
>>>>>>>>I compared ompi-1.8.5 (hpcx-1.3.3-icc) with impi v4.1.4.
>>>>>>>>
>>>>>>>>I built OMPI with MXM but without HCOLL and without KNEM (I am working on
>>>>>>>>it). The configure options are:
>>>>>>>>./configure --prefix=my_prefix
>>>>>>>>--with-mxm=path/to/hpcx/hpcx-v1.3.330-icc-OFED-1.5.4.1-redhat6.2-x86_64/mxm
>>>>>>>>--with-platform=contrib/platform/mellanox/optimized
>>>>>>>>
>>>>>>>>The IMB-MPI1 Alltoall test gave disappointing results: for most message
>>>>>>>>sizes on 64 nodes with 16 processes per node, impi is much (~40%) better.
>>>>>>>>
>>>>>>>>You can look at the results in the attached file "mvs10p_mpi.xlsx"; the
>>>>>>>>system configuration is also there.
>>>>>>>>
>>>>>>>>What do you think about this? Is there any way to improve the ompi yalla
>>>>>>>>performance results?
>>>>>>>>
>>>>>>>>I have attached the output of "IMB-MPI1 Alltoall" for yalla and impi.
>>>>>>>>
>>>>>>>>P.S. My colleague Alexander Semenov is in CC
>>>>>>>>
>>>>>>>>Best regards,
>>>>>>>>Timur
>>>>>>>
>>>>>>>
>>>>>>>-- 
>>>>>>>
>>>>>>>Kind Regards,
>>>>>>>
>>>>>>>M.
>>>>>>
>>>>>>
>>>>>>
>>>>>>_______________________________________________
>>>>>>users mailing list
>>>>>>us...@open-mpi.org
>>>>>>Subscription:  http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>>Link to this post:  
>>>>>>http://www.open-mpi.org/community/lists/users/2015/06/27029.php
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>



