I understood. Thanks.

Tetsuya Mishima

On 2016/08/09 11:33:15, "devel" wrote in "Re: [OMPI devel] sm BTL performace of
the openmpi-2.0.0":
> I will add a control to have the new behavior of using all available RDMA
> btls or just the eager ones for the RDMA protocol. The flags will remain as
> they are. And, yes, for 2.0.0 you can set the btl flags if you do not intend
> to use MPI RMA.
>
> New patch:
>
>
> https://github.com/hjelmn/ompi/commit/43267012e58d78e3fc713b98c6fb9f782de977c7.patch
>
> -Nathan
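For reference, the flag workaround mentioned above amounts to something like
this on the command line (311 is just the value reported to work earlier in
the thread, quoted below, not a recommended default):

  mpirun -np 2 --mca btl_openib_flags 311 osu_bw

and a build's current default for that flag can be checked with something like
"ompi_info --param btl openib --level 9 | grep btl_openib_flags".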
>
> > On Aug 8, 2016, at 8:16 PM, tmish...@jcity.maeda.co.jp wrote:
> >
> > Then, my understanding is that you will restore the default value of
> > btl_openib_flags to the previous one (= 310) and add a new MCA parameter to
> > control HCA inclusion for such a situation. The workaround so far for
> > openmpi-2.0.0 is setting those flags manually. Right?
> >
> > Tetsuya Mishima
> >
> > On 2016/08/09 9:56:29, "devel" wrote in "Re: [OMPI devel] sm BTL performace
> > of the openmpi-2.0.0":
> >> Hmm, not good. So we have a situation where it is sometimes better to
> >> include the HCA when it is the only rdma btl. Will have a new version up
> >> in a bit that adds an MCA parameter to control the behavior. The default
> >> will be the same as 1.10.x.
> >>
> >> -Nathan
> >>
> >>> On Aug 8, 2016, at 4:51 PM, tmish...@jcity.maeda.co.jp wrote:
> >>>
> >>> Hi, unfortunately it doesn't work well. The previous one was much
> >>> better ...
> >>>
> >>> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -report-bindings osu_bw
> >>> [manage.cluster:25107] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
> >>> [manage.cluster:25107] MCW rank 1 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
> >>> # OSU MPI Bandwidth Test v3.1.1
> >>> # Size        Bandwidth (MB/s)
> >>> 1                         2.22
> >>> 2                         4.53
> >>> 4                         9.11
> >>> 8                        18.02
> >>> 16                       35.44
> >>> 32                       70.84
> >>> 64                      113.71
> >>> 128                     176.74
> >>> 256                     311.07
> >>> 512                     529.03
> >>> 1024                    907.83
> >>> 2048                   1597.66
> >>> 4096                    330.14
> >>> 8192                    516.49
> >>> 16384                   780.31
> >>> 32768                  1038.43
> >>> 65536                  1186.36
> >>> 131072                 1268.87
> >>> 262144                 1222.24
> >>> 524288                 1232.30
> >>> 1048576                1244.62
> >>> 2097152                1260.25
> >>> 4194304                1263.47
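The drop starts right at the 4 KB message size, which is presumably where the
eager path ends and the rendezvous/RDMA path takes over. A quick check, as a
sketch only and assuming the usual btl_vader_eager_limit parameter, is to move
the vader eager limit and see whether the cliff moves with it:

  mpirun -np 2 --mca btl_vader_eager_limit 32768 -report-bindings osu_bw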
> >>>
> >>> Tetsuya
> >>>
> >>>
> >>> On 2016/08/09 2:42:24, "devel" wrote in "Re: [OMPI devel] sm BTL performace
> >>> of the openmpi-2.0.0":
> >>>> Ok, there was a problem with the selection logic when only one rdma
> >>>> capable btl is available. I changed the logic to always use the RDMA btl
> >>>> over pipelined send/recv. This works better for me on an Intel Omni-Path
> >>>> system. Let me know if this works for you.
> >>>>
> >>>> https://github.com/hjelmn/ompi/commit/dddb865b5337213fd73d0e226b02e2f049cfab47.patch
> >>>>
> >>>> -Nathan
> >>>>
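Applying one of these commit patches to a source tree is the usual routine; as
a rough sketch (the directory name and job count are only examples):

  cd openmpi-2.0.0
  wget https://github.com/hjelmn/ompi/commit/dddb865b5337213fd73d0e226b02e2f049cfab47.patch
  patch -p1 < dddb865b5337213fd73d0e226b02e2f049cfab47.patch
  make -j 8 && make install
  mpirun -np 2 -report-bindings osu_bw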
> >>>> On Aug 07, 2016, at 10:00 PM, tmish...@jcity.maeda.co.jp wrote:
> >>>>
> >>>> Hi, here is the gdb output for additional information:
> >>>>
> >>>> (It might be inexact, because I built openmpi-2.0.0 without debug option)
> >>>>
> >>>> Core was generated by `osu_bw'.
> >>>> Program terminated with signal 11, Segmentation fault.
> >>>> #0 0x00000031d9008806 in ?? () from /lib64/libgcc_s.so.1
> >>>> (gdb) where
> >>>> #0 0x00000031d9008806 in ?? () from /lib64/libgcc_s.so.1
> >>>> #1 0x00000031d9008934 in _Unwind_Backtrace () from /lib64/libgcc_s.so.1
> >>>> #2 0x00000037ab8e5ee8 in backtrace () from /lib64/libc.so.6
> >>>> #3 0x00002ad882bd4345 in opal_backtrace_print ()
> >>>> at ./backtrace_execinfo.c:47
> >>>> #4 0x00002ad882bd1180 in show_stackframe () at ./stacktrace.c:331
> >>>> #5 <signal handler called>
> >>>> #6 mca_pml_ob1_recv_request_schedule_once () at ./pml_ob1_recvreq.c:983
> >>>> #7 0x00002aaab412f47a in mca_pml_ob1_recv_request_progress_rndv ()
> >>>> from /home/mishima/opt/mpi/openmpi-2.0.0-pgi16.5/lib/openmpi/mca_pml_ob1.so
> >>>> #8 0x00002aaab412c645 in mca_pml_ob1_recv_frag_match ()
> >>>> at ./pml_ob1_recvfrag.c:715
> >>>> #9 0x00002aaab412bba6 in mca_pml_ob1_recv_frag_callback_rndv ()
> >>>> at ./pml_ob1_recvfrag.c:267
> >>>> #10 0x00002aaaaf2748d3 in mca_btl_vader_poll_handle_frag ()
> >>>> at ./btl_vader_component.c:589
> >>>> #11 0x00002aaaaf274b9a in mca_btl_vader_component_progress ()
> >>>> at ./btl_vader_component.c:231
> >>>> #12 0x00002ad882b916fc in opal_progress () at runtime/opal_progress.c:224
> >>>> #13 0x00002ad8820a9aa5 in ompi_request_default_wait_all () at request/req_wait.c:77
> >>>> #14 0x00002ad8820f10dd in PMPI_Waitall () at ./pwaitall.c:76
> >>>> #15 0x0000000000401108 in main () at ./osu_bw.c:144
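For reference, a backtrace like the one above comes straight from loading the
core file into gdb, e.g.:

  gdb ./osu_bw core.<pid>
  (gdb) where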
> >>>>
> >>>> Tetsuya
> >>>>
> >>>>
> >>>> On 2016/08/08 12:34:57, "devel" wrote in "Re: [OMPI devel] sm BTL performace
> >>>> of the openmpi-2.0.0":
> >>>> Hi, it caused segfault as below:
> >>>>
> >>>> [manage.cluster:25436] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
> >>>> [manage.cluster:25436] MCW rank 1 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
> >>>> # OSU MPI Bandwidth Test v3.1.1
> >>>> # Size        Bandwidth (MB/s)
> >>>> 1                         2.23
> >>>> 2                         4.51
> >>>> 4                         8.99
> >>>> 8                        17.83
> >>>> 16                       35.18
> >>>> 32                       69.66
> >>>> 64                      109.84
> >>>> 128                     179.65
> >>>> 256                     303.52
> >>>> 512                     532.81
> >>>> 1024                    911.74
> >>>> 2048                   1605.29
> >>>> 4096                   1598.73
> >>>> 8192                   2135.94
> >>>> 16384                  2468.98
> >>>> 32768                  2818.37
> >>>> 65536                  3658.83
> >>>> 131072                 4200.50
> >>>> 262144                 4545.01
> >>>> 524288                 4757.84
> >>>> 1048576                4831.75
> >>>> [manage:25442] *** Process received signal ***
> >>>> [manage:25442] Signal: Segmentation fault (11)
> >>>> [manage:25442] Signal code: Address not mapped (1)
> >>>> [manage:25442] Failing at address: 0x8
> >>>> --------------------------------------------------------------------------
> >>>> mpirun noticed that process rank 1 with PID 0 on node manage exited on
> >>>> signal 11 (Segmentation fault).
> >>>> --------------------------------------------------------------------------
> >>>>
> >>>> Tetsuya Mishima
> >>>>
> >>>> On 2016/08/08 10:12:05, "devel" wrote in "Re: [OMPI devel] sm BTL performace
> >>>> of the openmpi-2.0.0":
> >>>> This patch also modifies the put path. Let me know if this works:
> >>>>
> >>>> diff --git a/ompi/mca/pml/ob1/pml_ob1_rdma.c b/ompi/mca/pml/ob1/pml_ob1_rdma.c
> >>>> index 888e126..a3ec6f8 100644
> >>>> --- a/ompi/mca/pml/ob1/pml_ob1_rdma.c
> >>>> +++ b/ompi/mca/pml/ob1/pml_ob1_rdma.c
> >>>> @@ -42,6 +42,7 @@ size_t mca_pml_ob1_rdma_btls(
> >>>>      mca_pml_ob1_com_btl_t* rdma_btls)
> >>>>  {
> >>>>      int num_btls = mca_bml_base_btl_array_get_size(&bml_endpoint->btl_rdma);
> >>>> +    int num_eager_btls = mca_bml_base_btl_array_get_size (&bml_endpoint->btl_eager);
> >>>>      double weight_total = 0;
> >>>>      int num_btls_used = 0;
> >>>>
> >>>> @@ -57,6 +58,21 @@ size_t mca_pml_ob1_rdma_btls(
> >>>>              (bml_endpoint->btl_rdma_index + n) % num_btls);
> >>>>          mca_btl_base_registration_handle_t *reg_handle = NULL;
> >>>>          mca_btl_base_module_t *btl = bml_btl->btl;
> >>>> +        bool ignore = true;
> >>>> +
> >>>> +        /* do not use rdma btls that are not in the eager list. this is necessary to avoid using
> >>>> +         * btls that exist on the endpoint only to support RMA. */
> >>>> +        for (int i = 0 ; i < num_eager_btls ; ++i) {
> >>>> +            mca_bml_base_btl_t *eager_btl = mca_bml_base_btl_array_get_index (&bml_endpoint->btl_eager, i);
> >>>> +            if (eager_btl->btl_endpoint == bml_btl->btl_endpoint) {
> >>>> +                ignore = false;
> >>>> +                break;
> >>>> +            }
> >>>> +        }
> >>>> +
> >>>> +        if (ignore) {
> >>>> +            continue;
> >>>> +        }
> >>>>
> >>>>          if (btl->btl_register_mem) {
> >>>>              /* do not use the RDMA protocol with this btl if 1) leave pinned is disabled,
> >>>> @@ -99,18 +115,34 @@ size_t mca_pml_ob1_rdma_pipeline_btls( mca_bml_base_endpoint_t* bml_endpoint,
> >>>>                                         size_t size,
> >>>>                                         mca_pml_ob1_com_btl_t* rdma_btls )
> >>>>  {
> >>>> -    int i, num_btls = mca_bml_base_btl_array_get_size(&bml_endpoint->btl_rdma);
> >>>> +    int num_btls = mca_bml_base_btl_array_get_size (&bml_endpoint->btl_rdma);
> >>>> +    int num_eager_btls = mca_bml_base_btl_array_get_size(&bml_endpoint->btl_eager);
> >>>>      double weight_total = 0;
> >>>> +    int rdma_count = 0;
> >>>>
> >>>> -    for(i = 0; i < num_btls && i < mca_pml_ob1.max_rdma_per_request; i++) {
> >>>> -        rdma_btls[i].bml_btl =
> >>>> -            mca_bml_base_btl_array_get_next(&bml_endpoint->btl_rdma);
> >>>> -        rdma_btls[i].btl_reg = NULL;
> >>>> +    for(int i = 0; i < num_btls && i < mca_pml_ob1.max_rdma_per_request; i++) {
> >>>> +        mca_bml_base_btl_t *bml_btl = mca_bml_base_btl_array_get_next (&bml_endpoint->btl_rdma);
> >>>> +        bool ignore = true;
> >>>> +
> >>>> +        for (int i = 0 ; i < num_eager_btls ; ++i) {
> >>>> +            mca_bml_base_btl_t *eager_btl = mca_bml_base_btl_array_get_index (&bml_endpoint->btl_eager, i);
> >>>> +            if (eager_btl->btl_endpoint == bml_btl->btl_endpoint) {
> >>>> +                ignore = false;
> >>>> +                break;
> >>>> +            }
> >>>> +        }
> >>>>
> >>>> -        weight_total += rdma_btls[i].bml_btl->btl_weight;
> >>>> +        if (ignore) {
> >>>> +            continue;
> >>>> +        }
> >>>> +
> >>>> +        rdma_btls[rdma_count].bml_btl = bml_btl;
> >>>> +        rdma_btls[rdma_count++].btl_reg = NULL;
> >>>> +
> >>>> +        weight_total += bml_btl->btl_weight;
> >>>>      }
> >>>>
> >>>> -    mca_pml_ob1_calc_weighted_length (rdma_btls, i, size, weight_total);
> >>>> +    mca_pml_ob1_calc_weighted_length (rdma_btls, rdma_count, size, weight_total);
> >>>>
> >>>> -    return i;
> >>>> +    return rdma_count;
> >>>>  }
> >>>>
> >>>>> On Aug 7, 2016, at 6:51 PM, Nathan Hjelm <hje...@me.com> wrote:
> >>>>>
> >>>>> Looks like the put path probably needs a similar patch. Will send another
> >>>>> patch soon.
> >>>>>
> >>>>>> On Aug 7, 2016, at 6:01 PM, tmish...@jcity.maeda.co.jp wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> I applied the patch to the file "pml_ob1_rdma.c" and ran osu_bw again.
> >>>>>> Then, I still see the bad performance for larger sizes (>= 2097152).
> >>>>>>
> >>>>>> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -report-bindings osu_bw
> >>>>>> [manage.cluster:27444] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
> >>>>>> [manage.cluster:27444] MCW rank 1 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
> >>>>>> # OSU MPI Bandwidth Test v3.1.1
> >>>>>> # Size        Bandwidth (MB/s)
> >>>>>> 1                         2.23
> >>>>>> 2                         4.52
> >>>>>> 4                         8.82
> >>>>>> 8                        17.83
> >>>>>> 16                       35.31
> >>>>>> 32                       69.49
> >>>>>> 64                      109.46
> >>>>>> 128                     178.51
> >>>>>> 256                     307.68
> >>>>>> 512                     532.64
> >>>>>> 1024                    909.34
> >>>>>> 2048                   1583.95
> >>>>>> 4096                   1554.74
> >>>>>> 8192                   2120.31
> >>>>>> 16384                  2489.79
> >>>>>> 32768                  2853.66
> >>>>>> 65536                  3692.82
> >>>>>> 131072                 4236.67
> >>>>>> 262144                 4575.63
> >>>>>> 524288                 4778.47
> >>>>>> 1048576                4839.34
> >>>>>> 2097152                2231.46
> >>>>>> 4194304                1505.48
> >>>>>>
> >>>>>> Regards,
> >>>>>> Tetsuya Mishima
> >>>>>>
> >>>>>> On 2016/08/06 0:00:08, "devel" wrote in "Re: [OMPI devel] sm BTL performace
> >>>>>> of the openmpi-2.0.0":
> >>>>>>> Making ob1 ignore RDMA btls that are not in use for eager messages might
> >>>>>>> be sufficient. Please try the following patch and let me know if it
> >>>>>>> works for you.
> >>>>>>>
> >>>>>>> diff --git a/ompi/mca/pml/ob1/pml_ob1_rdma.c b/ompi/mca/pml/ob1/pml_ob1_rdma.c
> >>>>>>> index 888e126..0c99525 100644
> >>>>>>> --- a/ompi/mca/pml/ob1/pml_ob1_rdma.c
> >>>>>>> +++ b/ompi/mca/pml/ob1/pml_ob1_rdma.c
> >>>>>>> @@ -42,6 +42,7 @@ size_t mca_pml_ob1_rdma_btls(
> >>>>>>>      mca_pml_ob1_com_btl_t* rdma_btls)
> >>>>>>>  {
> >>>>>>>      int num_btls = mca_bml_base_btl_array_get_size(&bml_endpoint->btl_rdma);
> >>>>>>> +    int num_eager_btls = mca_bml_base_btl_array_get_size (&bml_endpoint->btl_eager);
> >>>>>>>      double weight_total = 0;
> >>>>>>>      int num_btls_used = 0;
> >>>>>>>
> >>>>>>> @@ -57,6 +58,21 @@ size_t mca_pml_ob1_rdma_btls(
> >>>>>>>              (bml_endpoint->btl_rdma_index + n) % num_btls);
> >>>>>>>          mca_btl_base_registration_handle_t *reg_handle = NULL;
> >>>>>>>          mca_btl_base_module_t *btl = bml_btl->btl;
> >>>>>>> +        bool ignore = true;
> >>>>>>> +
> >>>>>>> +        /* do not use rdma btls that are not in the eager list. this is necessary to avoid using
> >>>>>>> +         * btls that exist on the endpoint only to support RMA. */
> >>>>>>> +        for (int i = 0 ; i < num_eager_btls ; ++i) {
> >>>>>>> +            mca_bml_base_btl_t *eager_btl = mca_bml_base_btl_array_get_index (&bml_endpoint->btl_eager, i);
> >>>>>>> +            if (eager_btl->btl_endpoint == bml_btl->btl_endpoint) {
> >>>>>>> +                ignore = false;
> >>>>>>> +                break;
> >>>>>>> +            }
> >>>>>>> +        }
> >>>>>>> +
> >>>>>>> +        if (ignore) {
> >>>>>>> +            continue;
> >>>>>>> +        }
> >>>>>>>
> >>>>>>>          if (btl->btl_register_mem) {
> >>>>>>>              /* do not use the RDMA protocol with this btl if 1) leave pinned is disabled,
> >>>>>>>
> >>>>>>> -Nathan
> >>>>>>>
> >>>>>>>> On Aug 5, 2016, at 8:44 AM, Nathan Hjelm <hje...@me.com> wrote:
> >>>>>>>>
> >>>>>>>> Nope. We are not going to change the flags as this will disable the btl
> >>>>>>>> for one-sided. Not sure what is going on here as the openib btl should
> >>>>>>>> be 1) not used for pt2pt, and 2) polled infrequently. The btl debug log
> >>>>>>>> suggests both of these are the case. Not sure what is going on yet.
> >>>>>>>>
> >>>>>>>> -Nathan
> >>>>>>>>
> >>>>>>>>> On Aug 5, 2016, at 8:16 AM, r...@open-mpi.org wrote:
> >>>>>>>>>
> >>>>>>>>> Perhaps those flags need to be the default?
> >>>>>>>>>
> >>>>>>>>>> On Aug 5, 2016, at 7:14 AM, tmish...@jcity.maeda.co.jp wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hi Christoph,
> >>>>>>>>>>
> >>>>>>>>>> I applied the commits - pull/#1250 as Nathan told me and added "-mca
> >>>>>>>>>> btl_openib_flags 311" to the mpirun command line option, then it
> >>>>>>>>>> worked for me. I don't know the reason, but it looks like ATOMIC_FOP
> >>>>>>>>>> in the btl_openib_flags degrades the sm/vader performance.
> >>>>>>>>>>
> >>>>>>>>>> Regards,
> >>>>>>>>>> Tetsuya Mishima
> >>>>>>>>>>
> >>>>>>>>>> On 2016/08/05 22:10:37, "devel" wrote in "Re: [OMPI devel] sm BTL
> >>>>>>>>>> performace of the openmpi-2.0.0":
> >>>>>>>>>>> Hello,
> >>>>>>>>>>>
> >>>>>>>>>>> We see the same problem here on various machines with Open MPI 2.0.0.
> >>>>>>>>>>> To us it seems that enabling the openib btl triggers bad performance
> >>>>>>>>>>> for the sm AND vader btls!
> >>>>>>>>>>> --mca btl_base_verbose 10 reports in both cases the correct use of
> >>>>>>>>>>> sm and vader between MPI ranks - only performance differs?!
> >>>>>>>>>>>
> >>>>>>>>>>> One irritating thing I see in the log output is the following:
> >>>>>>>>>>> openib BTL: rdmacm CPC unavailable for use on mlx4_0:1; skipped
> >>>>>>>>>>> [rank=1] openib: using port mlx4_0:1
> >>>>>>>>>>> select: init of component openib returned success
> >>>>>>>>>>>
> >>>>>>>>>>> Did not look into the "Skipped" code part yet, ...
> >>>>>>>>>>>
> >>>>>>>>>>> Results see below.
> >>>>>>>>>>>
> >>>>>>>>>>> Best regards
> >>>>>>>>>>> Christoph Niethammer
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> Christoph Niethammer
> >>>>>>>>>>> High Performance Computing Center Stuttgart (HLRS)
> >>>>>>>>>>> Nobelstrasse 19
> >>>>>>>>>>> 70569 Stuttgart
> >>>>>>>>>>> Tel: ++49(0)711-685-87203
> >>>>>>>>>>> email: nietham...@hlrs.de
> >>>>>>>>>>> http://www.hlrs.de/people/niethammer
> >>>>>>>>>>>
> >>>>>>>>>>> mpirun -np 2 --mca btl self,vader osu_bw
> >>>>>>>>>>> # OSU MPI Bandwidth Test
> >>>>>>>>>>> # Size        Bandwidth (MB/s)
> >>>>>>>>>>> 1                         4.83
> >>>>>>>>>>> 2                        10.30
> >>>>>>>>>>> 4                        24.68
> >>>>>>>>>>> 8                        49.27
> >>>>>>>>>>> 16                       95.80
> >>>>>>>>>>> 32                      187.52
> >>>>>>>>>>> 64                      270.82
> >>>>>>>>>>> 128                     405.00
> >>>>>>>>>>> 256                     659.26
> >>>>>>>>>>> 512                    1165.14
> >>>>>>>>>>> 1024                   2372.83
> >>>>>>>>>>> 2048                   3592.85
> >>>>>>>>>>> 4096                   4283.51
> >>>>>>>>>>> 8192                   5523.55
> >>>>>>>>>>> 16384                  7388.92
> >>>>>>>>>>> 32768                  7024.37
> >>>>>>>>>>> 65536                  7353.79
> >>>>>>>>>>> 131072                 7465.96
> >>>>>>>>>>> 262144                 8597.56
> >>>>>>>>>>> 524288                 9292.86
> >>>>>>>>>>> 1048576                9168.01
> >>>>>>>>>>> 2097152                9009.62
> >>>>>>>>>>> 4194304                9013.02
> >>>>>>>>>>>
> >>>>>>>>>>> mpirun -np 2 --mca btl self,vader,openib osu_bw
> >>>>>>>>>>> # OSU MPI Bandwidth Test
> >>>>>>>>>>> # Size        Bandwidth (MB/s)
> >>>>>>>>>>> 1                         5.32
> >>>>>>>>>>> 2                        11.14
> >>>>>>>>>>> 4                        20.88
> >>>>>>>>>>> 8                        49.26
> >>>>>>>>>>> 16                       99.11
> >>>>>>>>>>> 32                      197.42
> >>>>>>>>>>> 64                      301.08
> >>>>>>>>>>> 128                     413.64
> >>>>>>>>>>> 256                     651.15
> >>>>>>>>>>> 512                    1161.12
> >>>>>>>>>>> 1024                   2460.99
> >>>>>>>>>>> 2048                   3627.36
> >>>>>>>>>>> 4096                   2191.06
> >>>>>>>>>>> 8192                   3118.36
> >>>>>>>>>>> 16384                  3428.45
> >>>>>>>>>>> 32768                  3676.96
> >>>>>>>>>>> 65536                  3709.65
> >>>>>>>>>>> 131072                 3748.64
> >>>>>>>>>>> 262144                 3764.88
> >>>>>>>>>>> 524288                 3764.61
> >>>>>>>>>>> 1048576                3772.45
> >>>>>>>>>>> 2097152                3757.37
> >>>>>>>>>>> 4194304                3746.45
> >>>>>>>>>>>
> >>>>>>>>>>> mpirun -np 2 --mca btl self,sm osu_bw
> >>>>>>>>>>> # OSU MPI Bandwidth Test
> >>>>>>>>>>> # Size        Bandwidth (MB/s)
> >>>>>>>>>>> 1                         2.98
> >>>>>>>>>>> 2                         5.97
> >>>>>>>>>>> 4                        11.99
> >>>>>>>>>>> 8                        23.47
> >>>>>>>>>>> 16                       50.64
> >>>>>>>>>>> 32                       99.91
> >>>>>>>>>>> 64                      197.87
> >>>>>>>>>>> 128                     343.32
> >>>>>>>>>>> 256                     667.48
> >>>>>>>>>>> 512                    1200.86
> >>>>>>>>>>> 1024                   2050.05
> >>>>>>>>>>> 2048                   3578.52
> >>>>>>>>>>> 4096                   3966.92
> >>>>>>>>>>> 8192                   5687.96
> >>>>>>>>>>> 16384                  7395.88
> >>>>>>>>>>> 32768                  7101.41
> >>>>>>>>>>> 65536                  7619.49
> >>>>>>>>>>> 131072                 7978.09
> >>>>>>>>>>> 262144                 8648.87
> >>>>>>>>>>> 524288                 9129.18
> >>>>>>>>>>> 1048576               10525.31
> >>>>>>>>>>> 2097152               10511.63
> >>>>>>>>>>> 4194304               10489.66
> >>>>>>>>>>>
> >>>>>>>>>>> mpirun -np 2 --mca btl self,sm,openib osu_bw
> >>>>>>>>>>> # OSU MPI Bandwidth Test
> >>>>>>>>>>> # Size        Bandwidth (MB/s)
> >>>>>>>>>>> 1                         2.02
> >>>>>>>>>>> 2                         3.00
> >>>>>>>>>>> 4                         9.99
> >>>>>>>>>>> 8                        19.96
> >>>>>>>>>>> 16                       40.10
> >>>>>>>>>>> 32                       70.63
> >>>>>>>>>>> 64                      144.08
> >>>>>>>>>>> 128                     282.21
> >>>>>>>>>>> 256                     543.55
> >>>>>>>>>>> 512                    1032.61
> >>>>>>>>>>> 1024                   1871.09
> >>>>>>>>>>> 2048                   3294.07
> >>>>>>>>>>> 4096                   2336.48
> >>>>>>>>>>> 8192                   3142.22
> >>>>>>>>>>> 16384                  3419.93
> >>>>>>>>>>> 32768                  3647.30
> >>>>>>>>>>> 65536                  3725.40
> >>>>>>>>>>> 131072                 3749.43
> >>>>>>>>>>> 262144                 3765.31
> >>>>>>>>>>> 524288                 3771.06
> >>>>>>>>>>> 1048576                3772.54
> >>>>>>>>>>> 2097152                3760.93
> >>>>>>>>>>> 4194304                3745.37
> >>>>>>>>>>>
> >>>>>>>>>>> ----- Original Message -----
> >>>>>>>>>>> From: tmish...@jcity.maeda.co.jp
> >>>>>>>>>>> To: "Open MPI Developers" <de...@open-mpi.org>
> >>>>>>>>>>> Sent: Wednesday, July 27, 2016 6:04:48 AM
> >>>>>>>>>>> Subject: Re: [OMPI devel] sm BTL performace of the openmpi-2.0.0
> >>>>>>>>>>>
> >>>>>>>>>>> Hi Nathan,
> >>>>>>>>>>>
> >>>>>>>>>>> I applied those commits and ran again without any BTL specified.
> >>>>>>>>>>> Then, although it says "mca: bml: Using vader btl for send to
> >>>>>>>>>>> [[18993,1],1] on node manage", the osu_bw still shows it's very slow
> >>>>>>>>>>> as shown below:
> >>>>>>>>>>>
> >>>>>>>>>>> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca btl_base_verbose 10 -bind-to core -report-bindings osu_bw
> >>>>>>>>>>> [manage.cluster:17482] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
> >>>>>>>>>>> [manage.cluster:17482] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_register: registering framework btl components
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_register: found loaded component self
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_register: component self register function successful
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_register: found loaded component vader
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_register: registering framework btl components
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_register: found loaded component self
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_register: component vader register function successful
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_register: component self register function successful
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_register: found loaded component vader
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_register: found loaded component tcp
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_register: component vader register function successful
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_register: found loaded component tcp
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_register: component tcp register function successful
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_register: found loaded component sm
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_register: component tcp register function successful
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_register: found loaded component sm
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_register: component sm register function successful
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_register: component sm register function successful
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_register: found loaded component openib
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_register: found loaded component openib
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_register: component openib register function successful
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_open: opening btl components
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_open: found loaded component self
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_open: component self open function successful
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_open: found loaded component vader
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_open: component vader open function successful
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_open: found loaded component tcp
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_open: component tcp open function successful
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_open: found loaded component sm
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_open: component sm open function successful
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_open: found loaded component openib
> >>>>>>>>>>> [manage.cluster:17488] mca: base: components_open: component openib open function successful
> >>>>>>>>>>> [manage.cluster:17488] select: initializing btl component self
> >>>>>>>>>>> [manage.cluster:17488] select: init of component self returned success
> >>>>>>>>>>> [manage.cluster:17488] select: initializing btl component vader
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_register: component openib register function successful
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_open: opening btl components
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_open: found loaded component self
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_open: component self open function successful
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_open: found loaded component vader
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_open: component vader open function successful
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_open: found loaded component tcp
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_open: component tcp open function successful
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_open: found loaded component sm
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_open: component sm open function successful
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_open: found loaded component openib
> >>>>>>>>>>> [manage.cluster:17488] select: init of component vader returned success
> >>>>>>>>>>> [manage.cluster:17488] select: initializing btl component tcp
> >>>>>>>>>>> [manage.cluster:17487] mca: base: components_open: component openib open function successful
> >>>>>>>>>>> [manage.cluster:17487] select: initializing btl component self
> >>>>>>>>>>> [manage.cluster:17487] select: init of component self returned success
> >>>>>>>>>>> [manage.cluster:17487] select: initializing btl component vader
> >>>>>>>>>>> [manage.cluster:17488] select: init of component tcp returned success
> >>>>>>>>>>> [manage.cluster:17488] select: initializing btl component sm
> >>>>>>>>>>> [manage.cluster:17488] select: init of component sm returned success
> >>>>>>>>>>> [manage.cluster:17488] select: initializing btl component openib
> >>>>>>>>>>> [manage.cluster:17487] select: init of component vader returned success
> >>>>>>>>>>> [manage.cluster:17487] select: initializing btl component tcp
> >>>>>>>>>>> [manage.cluster:17487] select: init of component tcp returned success
> >>>>>>>>>>> [manage.cluster:17487] select: initializing btl component sm
> >>>>>>>>>>> [manage.cluster:17488] Checking distance from this process to device=mthca0
> >>>>>>>>>>> [manage.cluster:17488] hwloc_distances->nbobjs=2
> >>>>>>>>>>> [manage.cluster:17488] hwloc_distances->latency[0]=1.000000
> >>>>>>>>>>> [manage.cluster:17488] hwloc_distances->latency[1]=1.600000
> >>>>>>>>>>> [manage.cluster:17488] hwloc_distances->latency[2]=1.600000
> >>>>>>>>>>> [manage.cluster:17488] hwloc_distances->latency[3]=1.000000
> >>>>>>>>>>> [manage.cluster:17488] ibv_obj->type set to NULL
> >>>>>>>>>>> [manage.cluster:17488] Process is bound: distance to device is 0.000000
> >>>>>>>>>>> [manage.cluster:17487] select: init of component sm returned success
> >>>>>>>>>>> [manage.cluster:17487] select: initializing btl component openib
> >>>>>>>>>>> [manage.cluster:17488] openib BTL: rdmacm CPC unavailable for use on mthca0:1; skipped
> >>>>>>>>>>> [manage.cluster:17487] Checking distance from this process to device=mthca0
> >>>>>>>>>>> [manage.cluster:17487] hwloc_distances->nbobjs=2
> >>>>>>>>>>> [manage.cluster:17487] hwloc_distances->latency[0]=1.000000
> >>>>>>>>>>> [manage.cluster:17487] hwloc_distances->latency[1]=1.600000
> >>>>>>>>>>> [manage.cluster:17487] hwloc_distances->latency[2]=1.600000
> >>>>>>>>>>> [manage.cluster:17487] hwloc_distances->latency[3]=1.000000
> >>>>>>>>>>> [manage.cluster:17487] ibv_obj->type set to NULL
> >>>>>>>>>>> [manage.cluster:17487] Process is bound: distance to device is 0.000000
> >>>>>>>>>>> [manage.cluster:17488] [rank=1] openib: using port mthca0:1
> >>>>>>>>>>> [manage.cluster:17488] select: init of component openib returned success
> >>>>>>>>>>> [manage.cluster:17487] openib BTL: rdmacm CPC unavailable for use on mthca0:1; skipped
> >>>>>>>>>>> [manage.cluster:17487] [rank=0] openib: using port mthca0:1
> >>>>>>>>>>> [manage.cluster:17487] select: init of component openib returned success
> >>>>>>>>>>> [manage.cluster:17488] mca: bml: Using self btl for send to [[18993,1],1] on node manage
> >>>>>>>>>>> [manage.cluster:17487] mca: bml: Using self btl for send to [[18993,1],0] on node manage
> >>>>>>>>>>> [manage.cluster:17488] mca: bml: Using vader btl for send to [[18993,1],0] on node manage
> >>>>>>>>>>> [manage.cluster:17487] mca: bml: Using vader btl for send to [[18993,1],1] on node manage
> >>>>>>>>>>> # OSU MPI Bandwidth Test v3.1.1
> >>>>>>>>>>> # Size        Bandwidth (MB/s)
> >>>>>>>>>>> 1                         1.76
> >>>>>>>>>>> 2                         3.53
> >>>>>>>>>>> 4                         7.06
> >>>>>>>>>>> 8                        14.46
> >>>>>>>>>>> 16                       29.12
> >>>>>>>>>>> 32                       57.54
> >>>>>>>>>>> 64                      100.12
> >>>>>>>>>>> 128                     157.78
> >>>>>>>>>>> 256                     277.32
> >>>>>>>>>>> 512                     477.53
> >>>>>>>>>>> 1024                    894.81
> >>>>>>>>>>> 2048                   1330.68
> >>>>>>>>>>> 4096                    278.58
> >>>>>>>>>>> 8192                    516.00
> >>>>>>>>>>> 16384                   762.99
> >>>>>>>>>>> 32768                  1037.19
> >>>>>>>>>>> 65536                  1181.66
> >>>>>>>>>>> 131072                 1261.91
> >>>>>>>>>>> 262144                 1237.39
> >>>>>>>>>>> 524288                 1247.86
> >>>>>>>>>>> 1048576                1252.04
> >>>>>>>>>>> 2097152                1273.46
> >>>>>>>>>>> 4194304                1281.21
> >>>>>>>>>>> [manage.cluster:17488] mca: base: close: component self closed
> >>>>>>>>>>> [manage.cluster:17488] mca: base: close: unloading component self
> >>>>>>>>>>> [manage.cluster:17487] mca: base: close: component self closed
> >>>>>>>>>>> [manage.cluster:17487] mca: base: close: unloading component self
> >>>>>>>>>>> [manage.cluster:17488] mca: base: close: component vader closed
> >>>>>>>>>>> [manage.cluster:17488] mca: base: close: unloading component vader
> >>>>>>>>>>> [manage.cluster:17487] mca: base: close: component vader closed
> >>>>>>>>>>> [manage.cluster:17487] mca: base: close: unloading component vader
> >>>>>>>>>>> [manage.cluster:17488] mca: base: close: component tcp closed
> >>>>>>>>>>> [manage.cluster:17488] mca: base: close: unloading component tcp
> >>>>>>>>>>> [manage.cluster:17487] mca: base: close: component tcp closed
> >>>>>>>>>>> [manage.cluster:17487] mca: base: close: unloading component tcp
> >>>>>>>>>>> [manage.cluster:17488] mca: base: close: component sm closed
> >>>>>>>>>>> [manage.cluster:17488] mca: base: close: unloading component sm
> >>>>>>>>>>> [manage.cluster:17487] mca: base: close: component sm closed
> >>>>>>>>>>> [manage.cluster:17487] mca: base: close: unloading component sm
> >>>>>>>>>>> [manage.cluster:17488] mca: base: close: component openib closed
> >>>>>>>>>>> [manage.cluster:17488] mca: base: close: unloading component openib
> >>>>>>>>>>> [manage.cluster:17487] mca: base: close: component openib closed
> >>>>>>>>>>> [manage.cluster:17487] mca: base: close: unloading component openib
> >>>>>>>>>>>
> >>>>>>>>>>> Tetsuya Mishima
> >>>>>>>>>>>
> >>>>>>>>>>> On 2016/07/27 9:20:28, "devel" wrote in "Re: [OMPI devel] sm BTL
> >>>>>>>>>>> performace of the openmpi-2.0.0":
> >>>>>>>>>>>> sm is deprecated in 2.0.0 and will likely be removed in favor of
> >>>>>>>>>>>> vader in 2.1.0.
> >>>>>>>>>>>>
> >>>>>>>>>>>> This issue is probably this known issue:
> >>>>>>>>>>>> https://github.com/open-mpi/ompi-release/pull/1250
> >>>>>>>>>>>>
> >>>>>>>>>>>> Please apply those commits and see if it fixes the issue for you.
> >>>>>>>>>>>>
> >>>>>>>>>>>> -Nathan
> >>>>>>>>>>>>
> >>>>>>>>>>>>> On Jul 26, 2016, at 6:17 PM, tmish...@jcity.maeda.co.jp wrote:
_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/devel
