Ack, the segv is due to a typo from transcribing the patch. Fixed. Please try the following patch and let me know if it fixes the issues.
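For the record, here is the registration hunk from Tetsuya's report below with the transcription typo he flagged corrected (only the storage pointer in the last argument changes, &ca_pml_ob1.use_all_rdma -> &mca_pml_ob1.use_all_rdma):

    mca_pml_ob1.use_all_rdma = false;
    (void) mca_base_component_var_register (&mca_pml_ob1_component.pmlm_version, "use_all_rdma",
                                            "Use all available RDMA btls for the RDMA and RDMA pipeline protocols "
                                            "(default: false)", MCA_BASE_VAR_TYPE_BOOL, NULL, 0, 0,
                                            OPAL_INFO_LVL_5, MCA_BASE_VAR_SCOPE_GROUP,
                                            &mca_pml_ob1.use_all_rdma);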
https://github.com/hjelmn/ompi/commit/4079eec9749e47dddc6acc9c0847b3091601919f.patch

-Nathan

> On Aug 8, 2016, at 9:48 PM, tmish...@jcity.maeda.co.jp wrote:
>
> The latest patch also causes a segfault...
>
> By the way, I found a typo as below. &ca_pml_ob1.use_all_rdma in the last
> line should be &mca_pml_ob1.use_all_rdma:
>
> +    mca_pml_ob1.use_all_rdma = false;
> +    (void) mca_base_component_var_register (&mca_pml_ob1_component.pmlm_version, "use_all_rdma",
> +                                            "Use all available RDMA btls for the RDMA and RDMA pipeline protocols "
> +                                            "(default: false)", MCA_BASE_VAR_TYPE_BOOL, NULL, 0, 0,
> +                                            OPAL_INFO_LVL_5, MCA_BASE_VAR_SCOPE_GROUP, &ca_pml_ob1.use_all_rdma);
> +
>
> Here is the OSU_BW and gdb output:
>
> # OSU MPI Bandwidth Test v3.1.1
> # Size          Bandwidth (MB/s)
> 1               2.19
> 2               4.43
> 4               8.98
> 8               18.07
> 16              35.58
> 32              70.62
> 64              108.88
> 128             172.97
> 256             305.73
> 512             536.48
> 1024            957.57
> 2048            1587.21
> 4096            1638.81
> 8192            2165.14
> 16384           2482.43
> 32768           2866.33
> 65536           3655.33
> 131072          4208.40
> 262144          4596.12
> 524288          4769.27
> 1048576         4900.00
> [manage:16596] *** Process received signal ***
> [manage:16596] Signal: Segmentation fault (11)
> [manage:16596] Signal code: Address not mapped (1)
> [manage:16596] Failing at address: 0x8
> ...
> Core was generated by `osu_bw'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x00000031d9008806 in ?? () from /lib64/libgcc_s.so.1
> (gdb) where
> #0  0x00000031d9008806 in ?? () from /lib64/libgcc_s.so.1
> #1  0x00000031d9008934 in _Unwind_Backtrace () from /lib64/libgcc_s.so.1
> #2  0x00000037ab8e5ee8 in backtrace () from /lib64/libc.so.6
> #3  0x00002b5060c14345 in opal_backtrace_print () at ./backtrace_execinfo.c:47
> #4  0x00002b5060c11180 in show_stackframe () at ./stacktrace.c:331
> #5  <signal handler called>
> #6  mca_pml_ob1_recv_request_schedule_once () at ./pml_ob1_recvreq.c:983
> #7  0x00002aaab461c71a in mca_pml_ob1_recv_request_progress_rndv ()
>    from /home/mishima/opt/mpi/openmpi-2.0.0-pgi16.5/lib/openmpi/mca_pml_ob1.so
> #8  0x00002aaab46198e5 in mca_pml_ob1_recv_frag_match () at ./pml_ob1_recvfrag.c:715
> #9  0x00002aaab4618e46 in mca_pml_ob1_recv_frag_callback_rndv () at ./pml_ob1_recvfrag.c:267
> #10 0x00002aaab37958d3 in mca_btl_vader_poll_handle_frag () at ./btl_vader_component.c:589
> #11 0x00002aaab3795b9a in mca_btl_vader_component_progress () at ./btl_vader_component.c:231
> #12 0x00002b5060bd16fc in opal_progress () at runtime/opal_progress.c:224
> #13 0x00002b50600e9aa5 in ompi_request_default_wait_all () at request/req_wait.c:77
> #14 0x00002b50601310dd in PMPI_Waitall () at ./pwaitall.c:76
> #15 0x0000000000401108 in main () at ./osu_bw.c:144
>
> Tetsuya Mishima
>
> On 2016/08/09 11:53:04, "devel" wrote in "Re: [OMPI devel] sm BTL performace of the openmpi-2.0.0":
>> No problem. Thanks for reporting this. Not all platforms see a slowdown,
>> so we missed it before the release. Let me know if that latest patch
>> works for you.
>>
>> -Nathan
>>
>>> On Aug 8, 2016, at 8:50 PM, tmish...@jcity.maeda.co.jp wrote:
>>>
>>> I understood. Thanks.
>>>
>>> Tetsuya Mishima
>>>
>>> On 2016/08/09 11:33:15, "devel" wrote in "Re: [OMPI devel] sm BTL performace of the openmpi-2.0.0":
>>>> I will add a control for choosing between the new behavior (use all
>>>> available RDMA btls) and the old one (use just the eager btls) for the
>>>> RDMA and RDMA pipeline protocols. The flags will remain as they are.
>>>> And, yes, for 2.0.0 you can set the btl flags if you do not intend to
>>>> use MPI RMA.
>>>>
>>>> New patch:
>>>>
>>>> https://github.com/hjelmn/ompi/commit/43267012e58d78e3fc713b98c6fb9f782de977c7.patch
>>>>
>>>> -Nathan
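(For anyone following along: the filter introduced by the patches quoted further down keeps an RDMA btl only if its endpoint also appears in the eager list. The new use_all_rdma parameter presumably just gates that filter; a hypothetical sketch of the idea, not the literal commit, using only calls that appear in the patches below:

    /* hypothetical sketch, not the literal commit: decide whether ob1 should
     * skip an RDMA btl, honoring the new use_all_rdma control */
    static bool pml_ob1_skip_rdma_btl (mca_bml_base_endpoint_t *bml_endpoint,
                                       mca_bml_base_btl_t *bml_btl)
    {
        if (mca_pml_ob1.use_all_rdma) {
            return false;  /* keep every RDMA-capable btl */
        }

        int num_eager_btls = mca_bml_base_btl_array_get_size (&bml_endpoint->btl_eager);

        for (int i = 0 ; i < num_eager_btls ; ++i) {
            mca_bml_base_btl_t *eager_btl = mca_bml_base_btl_array_get_index (&bml_endpoint->btl_eager, i);
            if (eager_btl->btl_endpoint == bml_btl->btl_endpoint) {
                return false;  /* btl also carries eager traffic; keep it */
            }
        }

        return true;  /* btl exists on the endpoint only to support RMA; skip it */
    }

With the default of false the eager-only filtering stays in effect, matching 1.10.x behavior; under the usual MCA naming the parameter would be enabled with "-mca pml_ob1_use_all_rdma 1".)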
>>>>> On Aug 8, 2016, at 8:16 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>
>>>>> Then, my understanding is that you will restore the default value of
>>>>> btl_openib_flags to the previous one (= 310) and add a new MCA
>>>>> parameter to control HCA inclusion for such a situation. The
>>>>> workaround so far for openmpi-2.0.0 is setting those flags manually.
>>>>> Right?
>>>>>
>>>>> Tetsuya Mishima
>>>>>
>>>>> On 2016/08/09 9:56:29, "devel" wrote in "Re: [OMPI devel] sm BTL performace of the openmpi-2.0.0":
>>>>>> Hmm, not good. So we have a situation where it is sometimes better
>>>>>> to include the HCA when it is the only rdma btl. Will have a new
>>>>>> version up in a bit that adds an MCA parameter to control the
>>>>>> behavior. The default will be the same as 1.10.x.
>>>>>>
>>>>>> -Nathan
>>>>>>
>>>>>>> On Aug 8, 2016, at 4:51 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>>>
>>>>>>> Hi, unfortunately it doesn't work well. The previous one was much
>>>>>>> better ...
>>>>>>>
>>>>>>> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -report-bindings osu_bw
>>>>>>> [manage.cluster:25107] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
>>>>>>> [manage.cluster:25107] MCW rank 1 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
>>>>>>> # OSU MPI Bandwidth Test v3.1.1
>>>>>>> # Size          Bandwidth (MB/s)
>>>>>>> 1               2.22
>>>>>>> 2               4.53
>>>>>>> 4               9.11
>>>>>>> 8               18.02
>>>>>>> 16              35.44
>>>>>>> 32              70.84
>>>>>>> 64              113.71
>>>>>>> 128             176.74
>>>>>>> 256             311.07
>>>>>>> 512             529.03
>>>>>>> 1024            907.83
>>>>>>> 2048            1597.66
>>>>>>> 4096            330.14
>>>>>>> 8192            516.49
>>>>>>> 16384           780.31
>>>>>>> 32768           1038.43
>>>>>>> 65536           1186.36
>>>>>>> 131072          1268.87
>>>>>>> 262144          1222.24
>>>>>>> 524288          1232.30
>>>>>>> 1048576         1244.62
>>>>>>> 2097152         1260.25
>>>>>>> 4194304         1263.47
>>>>>>>
>>>>>>> Tetsuya
>>>>>>>
>>>>>>> On 2016/08/09 2:42:24, "devel" wrote in "Re: [OMPI devel] sm BTL performace of the openmpi-2.0.0":
>>>>>>>> Ok, there was a problem with the selection logic when only one rdma
>>>>>>>> capable btl is available. I changed the logic to always use the RDMA
>>>>>>>> btl over pipelined send/recv. This works better for me on an Intel
>>>>>>>> Omnipath system. Let me know if this works for you.
>>>>>>>>
>>>>>>>> https://github.com/hjelmn/ompi/commit/dddb865b5337213fd73d0e226b02e2f049cfab47.patch
>>>>>>>>
>>>>>>>> -Nathan
>>>>>>>>
>>>>>>>> On Aug 07, 2016, at 10:00 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>>>>
>>>>>>>> Hi, here is the gdb output for additional information:
>>>>>>>>
>>>>>>>> (It might be inexact, because I built openmpi-2.0.0 without the debug option)
>>>>>>>>
>>>>>>>> Core was generated by `osu_bw'.
>>>>>>>> Program terminated with signal 11, Segmentation fault.
>>>>>>>> #0  0x00000031d9008806 in ?? () from /lib64/libgcc_s.so.1
>>>>>>>> (gdb) where
>>>>>>>> #0  0x00000031d9008806 in ?? () from /lib64/libgcc_s.so.1
>>>>>>>> #1  0x00000031d9008934 in _Unwind_Backtrace () from /lib64/libgcc_s.so.1
>>>>>>>> #2  0x00000037ab8e5ee8 in backtrace () from /lib64/libc.so.6
>>>>>>>> #3  0x00002ad882bd4345 in opal_backtrace_print () at ./backtrace_execinfo.c:47
>>>>>>>> #4  0x00002ad882bd1180 in show_stackframe () at ./stacktrace.c:331
>>>>>>>> #5  <signal handler called>
>>>>>>>> #6  mca_pml_ob1_recv_request_schedule_once () at ./pml_ob1_recvreq.c:983
>>>>>>>> #7  0x00002aaab412f47a in mca_pml_ob1_recv_request_progress_rndv ()
>>>>>>>>    from /home/mishima/opt/mpi/openmpi-2.0.0-pgi16.5/lib/openmpi/mca_pml_ob1.so
>>>>>>>> #8  0x00002aaab412c645 in mca_pml_ob1_recv_frag_match () at ./pml_ob1_recvfrag.c:715
>>>>>>>> #9  0x00002aaab412bba6 in mca_pml_ob1_recv_frag_callback_rndv () at ./pml_ob1_recvfrag.c:267
>>>>>>>> #10 0x00002aaaaf2748d3 in mca_btl_vader_poll_handle_frag () at ./btl_vader_component.c:589
>>>>>>>> #11 0x00002aaaaf274b9a in mca_btl_vader_component_progress () at ./btl_vader_component.c:231
>>>>>>>> #12 0x00002ad882b916fc in opal_progress () at runtime/opal_progress.c:224
>>>>>>>> #13 0x00002ad8820a9aa5 in ompi_request_default_wait_all () at request/req_wait.c:77
>>>>>>>> #14 0x00002ad8820f10dd in PMPI_Waitall () at ./pwaitall.c:76
>>>>>>>> #15 0x0000000000401108 in main () at ./osu_bw.c:144
>>>>>>>>
>>>>>>>> Tetsuya
>>>>>>>>
>>>>>>>> On 2016/08/08 12:34:57, "devel" wrote in "Re: [OMPI devel] sm BTL performace of the openmpi-2.0.0":
>>>>>>>>
>>>>>>>> Hi, it caused a segfault as below:
>>>>>>>>
>>>>>>>> [manage.cluster:25436] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
>>>>>>>> [manage.cluster:25436] MCW rank 1 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
>>>>>>>> # OSU MPI Bandwidth Test v3.1.1
>>>>>>>> # Size          Bandwidth (MB/s)
>>>>>>>> 1               2.23
>>>>>>>> 2               4.51
>>>>>>>> 4               8.99
>>>>>>>> 8               17.83
>>>>>>>> 16              35.18
>>>>>>>> 32              69.66
>>>>>>>> 64              109.84
>>>>>>>> 128             179.65
>>>>>>>> 256             303.52
>>>>>>>> 512             532.81
>>>>>>>> 1024            911.74
>>>>>>>> 2048            1605.29
>>>>>>>> 4096            1598.73
>>>>>>>> 8192            2135.94
>>>>>>>> 16384           2468.98
>>>>>>>> 32768           2818.37
>>>>>>>> 65536           3658.83
>>>>>>>> 131072          4200.50
>>>>>>>> 262144          4545.01
>>>>>>>> 524288          4757.84
>>>>>>>> 1048576         4831.75
>>>>>>>> [manage:25442] *** Process received signal ***
>>>>>>>> [manage:25442] Signal: Segmentation fault (11)
>>>>>>>> [manage:25442] Signal code: Address not mapped (1)
>>>>>>>> [manage:25442] Failing at address: 0x8
>>>>>>>> --------------------------------------------------------------------------
>>>>>>>> mpirun noticed that process rank 1 with PID 0 on node manage exited on
>>>>>>>> signal 11 (Segmentation fault).
>>>>>>>> --------------------------------------------------------------------------
>>>>>>>>
>>>>>>>> Tetsuya Mishima
>>>>>>>>
>>>>>>>> On 2016/08/08 10:12:05, "devel" wrote in "Re: [OMPI devel] sm BTL performace of the openmpi-2.0.0":
>>>>>>>>
>>>>>>>> This patch also modifies the put path.
>>>>>>>> Let me know if this works:
>>>>>>>>
>>>>>>>> diff --git a/ompi/mca/pml/ob1/pml_ob1_rdma.c b/ompi/mca/pml/ob1/pml_ob1_rdma.c
>>>>>>>> index 888e126..a3ec6f8 100644
>>>>>>>> --- a/ompi/mca/pml/ob1/pml_ob1_rdma.c
>>>>>>>> +++ b/ompi/mca/pml/ob1/pml_ob1_rdma.c
>>>>>>>> @@ -42,6 +42,7 @@ size_t mca_pml_ob1_rdma_btls(
>>>>>>>>                               mca_pml_ob1_com_btl_t* rdma_btls)
>>>>>>>>  {
>>>>>>>>      int num_btls = mca_bml_base_btl_array_get_size (&bml_endpoint->btl_rdma);
>>>>>>>> +    int num_eager_btls = mca_bml_base_btl_array_get_size (&bml_endpoint->btl_eager);
>>>>>>>>      double weight_total = 0;
>>>>>>>>      int num_btls_used = 0;
>>>>>>>>
>>>>>>>> @@ -57,6 +58,21 @@ size_t mca_pml_ob1_rdma_btls(
>>>>>>>>              (bml_endpoint->btl_rdma_index + n) % num_btls);
>>>>>>>>          mca_btl_base_registration_handle_t *reg_handle = NULL;
>>>>>>>>          mca_btl_base_module_t *btl = bml_btl->btl;
>>>>>>>> +        bool ignore = true;
>>>>>>>> +
>>>>>>>> +        /* do not use rdma btls that are not in the eager list. this is
>>>>>>>> +         * necessary to avoid using btls that exist on the endpoint only to support RMA. */
>>>>>>>> +        for (int i = 0 ; i < num_eager_btls ; ++i) {
>>>>>>>> +            mca_bml_base_btl_t *eager_btl = mca_bml_base_btl_array_get_index (&bml_endpoint->btl_eager, i);
>>>>>>>> +            if (eager_btl->btl_endpoint == bml_btl->btl_endpoint) {
>>>>>>>> +                ignore = false;
>>>>>>>> +                break;
>>>>>>>> +            }
>>>>>>>> +        }
>>>>>>>> +
>>>>>>>> +        if (ignore) {
>>>>>>>> +            continue;
>>>>>>>> +        }
>>>>>>>>
>>>>>>>>          if (btl->btl_register_mem) {
>>>>>>>>              /* do not use the RDMA protocol with this btl if 1) leave pinned is disabled,
>>>>>>>> @@ -99,18 +115,34 @@ size_t mca_pml_ob1_rdma_pipeline_btls( mca_bml_base_endpoint_t* bml_endpoint,
>>>>>>>>                                         size_t size,
>>>>>>>>                                         mca_pml_ob1_com_btl_t* rdma_btls )
>>>>>>>>  {
>>>>>>>> -    int i, num_btls = mca_bml_base_btl_array_get_size (&bml_endpoint->btl_rdma);
>>>>>>>> +    int num_btls = mca_bml_base_btl_array_get_size (&bml_endpoint->btl_rdma);
>>>>>>>> +    int num_eager_btls = mca_bml_base_btl_array_get_size (&bml_endpoint->btl_eager);
>>>>>>>>      double weight_total = 0;
>>>>>>>> +    int rdma_count = 0;
>>>>>>>>
>>>>>>>> -    for(i = 0; i < num_btls && i < mca_pml_ob1.max_rdma_per_request; i++) {
>>>>>>>> -        rdma_btls[i].bml_btl =
>>>>>>>> -            mca_bml_base_btl_array_get_next (&bml_endpoint->btl_rdma);
>>>>>>>> -        rdma_btls[i].btl_reg = NULL;
>>>>>>>> +    for (int i = 0; i < num_btls && i < mca_pml_ob1.max_rdma_per_request; i++) {
>>>>>>>> +        mca_bml_base_btl_t *bml_btl = mca_bml_base_btl_array_get_next (&bml_endpoint->btl_rdma);
>>>>>>>> +        bool ignore = true;
>>>>>>>> +
>>>>>>>> +        for (int i = 0 ; i < num_eager_btls ; ++i) {
>>>>>>>> +            mca_bml_base_btl_t *eager_btl = mca_bml_base_btl_array_get_index (&bml_endpoint->btl_eager, i);
>>>>>>>> +            if (eager_btl->btl_endpoint == bml_btl->btl_endpoint) {
>>>>>>>> +                ignore = false;
>>>>>>>> +                break;
>>>>>>>> +            }
>>>>>>>> +        }
>>>>>>>>
>>>>>>>> -        weight_total += rdma_btls[i].bml_btl->btl_weight;
>>>>>>>> +        if (ignore) {
>>>>>>>> +            continue;
>>>>>>>> +        }
>>>>>>>> +
>>>>>>>> +        rdma_btls[rdma_count].bml_btl = bml_btl;
>>>>>>>> +        rdma_btls[rdma_count++].btl_reg = NULL;
>>>>>>>> +
>>>>>>>> +        weight_total += bml_btl->btl_weight;
>>>>>>>>      }
>>>>>>>>
>>>>>>>> -    mca_pml_ob1_calc_weighted_length (rdma_btls, i, size, weight_total);
>>>>>>>> +    mca_pml_ob1_calc_weighted_length (rdma_btls, rdma_count, size, weight_total);
>>>>>>>>
>>>>>>>> -    return i;
>>>>>>>> +    return rdma_count;
>>>>>>>>  }
>>>>>>>>
>>>>>>>> On Aug 7, 2016, at 6:51 PM, Nathan Hjelm <hje...@me.com> wrote:
>>>>>>>>
>>>>>>>> Looks like the put path probably needs a similar patch.
>>>>>>>> Will send another patch soon.
>>>>>>>>
>>>>>>>> On Aug 7, 2016, at 6:01 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I applied the patch to the file "pml_ob1_rdma.c" and ran osu_bw again.
>>>>>>>> Then, I still see the bad performance for larger sizes (>= 2097152).
>>>>>>>>
>>>>>>>> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -report-bindings osu_bw
>>>>>>>> [manage.cluster:27444] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
>>>>>>>> [manage.cluster:27444] MCW rank 1 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.]
>>>>>>>> # OSU MPI Bandwidth Test v3.1.1
>>>>>>>> # Size          Bandwidth (MB/s)
>>>>>>>> 1               2.23
>>>>>>>> 2               4.52
>>>>>>>> 4               8.82
>>>>>>>> 8               17.83
>>>>>>>> 16              35.31
>>>>>>>> 32              69.49
>>>>>>>> 64              109.46
>>>>>>>> 128             178.51
>>>>>>>> 256             307.68
>>>>>>>> 512             532.64
>>>>>>>> 1024            909.34
>>>>>>>> 2048            1583.95
>>>>>>>> 4096            1554.74
>>>>>>>> 8192            2120.31
>>>>>>>> 16384           2489.79
>>>>>>>> 32768           2853.66
>>>>>>>> 65536           3692.82
>>>>>>>> 131072          4236.67
>>>>>>>> 262144          4575.63
>>>>>>>> 524288          4778.47
>>>>>>>> 1048576         4839.34
>>>>>>>> 2097152         2231.46
>>>>>>>> 4194304         1505.48
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Tetsuya Mishima
>>>>>>>>
>>>>>>>> On 2016/08/06 0:00:08, "devel" wrote in "Re: [OMPI devel] sm BTL performace of the openmpi-2.0.0":
>>>>>>>>
>>>>>>>> Making ob1 ignore RDMA btls that are not in use for eager messages might
>>>>>>>> be sufficient. Please try the following patch and let me know if it works
>>>>>>>> for you.
>>>>>>>>
>>>>>>>> diff --git a/ompi/mca/pml/ob1/pml_ob1_rdma.c b/ompi/mca/pml/ob1/pml_ob1_rdma.c
>>>>>>>> index 888e126..0c99525 100644
>>>>>>>> --- a/ompi/mca/pml/ob1/pml_ob1_rdma.c
>>>>>>>> +++ b/ompi/mca/pml/ob1/pml_ob1_rdma.c
>>>>>>>> @@ -42,6 +42,7 @@ size_t mca_pml_ob1_rdma_btls(
>>>>>>>>                               mca_pml_ob1_com_btl_t* rdma_btls)
>>>>>>>>  {
>>>>>>>>      int num_btls = mca_bml_base_btl_array_get_size (&bml_endpoint->btl_rdma);
>>>>>>>> +    int num_eager_btls = mca_bml_base_btl_array_get_size (&bml_endpoint->btl_eager);
>>>>>>>>      double weight_total = 0;
>>>>>>>>      int num_btls_used = 0;
>>>>>>>>
>>>>>>>> @@ -57,6 +58,21 @@ size_t mca_pml_ob1_rdma_btls(
>>>>>>>>              (bml_endpoint->btl_rdma_index + n) % num_btls);
>>>>>>>>          mca_btl_base_registration_handle_t *reg_handle = NULL;
>>>>>>>>          mca_btl_base_module_t *btl = bml_btl->btl;
>>>>>>>> +        bool ignore = true;
>>>>>>>> +
>>>>>>>> +        /* do not use rdma btls that are not in the eager list. this is
>>>>>>>> +         * necessary to avoid using btls that exist on the endpoint only to support RMA. */
>>>>>>>> +        for (int i = 0 ; i < num_eager_btls ; ++i) {
>>>>>>>> +            mca_bml_base_btl_t *eager_btl = mca_bml_base_btl_array_get_index (&bml_endpoint->btl_eager, i);
>>>>>>>> +            if (eager_btl->btl_endpoint == bml_btl->btl_endpoint) {
>>>>>>>> +                ignore = false;
>>>>>>>> +                break;
>>>>>>>> +            }
>>>>>>>> +        }
>>>>>>>> +
>>>>>>>> +        if (ignore) {
>>>>>>>> +            continue;
>>>>>>>> +        }
>>>>>>>>
>>>>>>>>          if (btl->btl_register_mem) {
>>>>>>>>              /* do not use the RDMA protocol with this btl if 1) leave pinned is disabled,
>>>>>>>>
>>>>>>>> -Nathan
>>>>>>>>
>>>>>>>> On Aug 5, 2016, at 8:44 AM, Nathan Hjelm <hje...@me.com> wrote:
>>>>>>>>
>>>>>>>> Nope. We are not going to change the flags as this will disable the btl
>>>>>>>> for one-sided. Not sure what is going on here as the openib btl should be
>>>>>>>> 1) not used for pt2pt, and 2) polled infrequently. The btl debug log
>>>>>>>> suggests both of these are the case. Not sure what is going on yet.
>>>>>>>>
>>>>>>>> -Nathan
>>>>>>>>
>>>>>>>> On Aug 5, 2016, at 8:16 AM, r...@open-mpi.org wrote:
>>>>>>>>
>>>>>>>> Perhaps those flags need to be the default?
>>>>>>>>
>>>>>>>> On Aug 5, 2016, at 7:14 AM, tmish...@jcity.maeda.co.jp wrote:
>>>>>>>>
>>>>>>>> Hi Christoph,
>>>>>>>>
>>>>>>>> I applied the commits - pull/#1250 as Nathan told me and added
>>>>>>>> "-mca btl_openib_flags 311" to the mpirun command line option, then it
>>>>>>>> worked for me. I don't know the reason, but it looks like ATOMIC_FOP in
>>>>>>>> the btl_openib_flags degrades the sm/vader performance.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Tetsuya Mishima
>>>>>>>>
>>>>>>>> On 2016/08/05 22:10:37, "devel" wrote in "Re: [OMPI devel] sm BTL performace of the openmpi-2.0.0":
>>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> We see the same problem here on various machines with Open MPI 2.0.0.
>>>>>>>>
>>>>>>>> To us it seems that enabling the openib btl triggers bad performance
>>>>>>>> for the sm AND vader btls!
>>>>>>>> --mca btl_base_verbose 10 reports in both cases the correct use of sm
>>>>>>>> and vader between MPI ranks - only performance differs?!
>>>>>>>>
>>>>>>>> One irritating thing I see in the log output is the following:
>>>>>>>> openib BTL: rdmacm CPC unavailable for use on mlx4_0:1; skipped
>>>>>>>> [rank=1] openib: using port mlx4_0:1
>>>>>>>> select: init of component openib returned success
>>>>>>>>
>>>>>>>> Did not look into the "Skipped" code part yet, ...
>>>>>>>>
>>>>>>>> Results see below.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>> Christoph Niethammer
>>>>>>>>
>>>>>>>> --
>>>>>>>> Christoph Niethammer
>>>>>>>> High Performance Computing Center Stuttgart (HLRS)
>>>>>>>> Nobelstrasse 19
>>>>>>>> 70569 Stuttgart
>>>>>>>>
>>>>>>>> Tel: ++49(0)711-685-87203
>>>>>>>> email: nietham...@hlrs.de
>>>>>>>> http://www.hlrs.de/people/niethammer
>>>>>>>>
>>>>>>>> mpirun -np 2 --mca btl self,vader osu_bw
>>>>>>>> # OSU MPI Bandwidth Test
>>>>>>>> # Size          Bandwidth (MB/s)
>>>>>>>> 1               4.83
>>>>>>>> 2               10.30
>>>>>>>> 4               24.68
>>>>>>>> 8               49.27
>>>>>>>> 16              95.80
>>>>>>>> 32              187.52
>>>>>>>> 64              270.82
>>>>>>>> 128             405.00
>>>>>>>> 256             659.26
>>>>>>>> 512             1165.14
>>>>>>>> 1024            2372.83
>>>>>>>> 2048            3592.85
>>>>>>>> 4096            4283.51
>>>>>>>> 8192            5523.55
>>>>>>>> 16384           7388.92
>>>>>>>> 32768           7024.37
>>>>>>>> 65536           7353.79
>>>>>>>> 131072          7465.96
>>>>>>>> 262144          8597.56
>>>>>>>> 524288          9292.86
>>>>>>>> 1048576         9168.01
>>>>>>>> 2097152         9009.62
>>>>>>>> 4194304         9013.02
>>>>>>>>
>>>>>>>> mpirun -np 2 --mca btl self,vader,openib osu_bw
>>>>>>>> # OSU MPI Bandwidth Test
>>>>>>>> # Size          Bandwidth (MB/s)
>>>>>>>> 1               5.32
>>>>>>>> 2               11.14
>>>>>>>> 4               20.88
>>>>>>>> 8               49.26
>>>>>>>> 16              99.11
>>>>>>>> 32              197.42
>>>>>>>> 64              301.08
>>>>>>>> 128             413.64
>>>>>>>> 256             651.15
>>>>>>>> 512             1161.12
>>>>>>>> 1024            2460.99
>>>>>>>> 2048            3627.36
>>>>>>>> 4096            2191.06
>>>>>>>> 8192            3118.36
>>>>>>>> 16384           3428.45
>>>>>>>> 32768           3676.96
>>>>>>>> 65536           3709.65
>>>>>>>> 131072          3748.64
>>>>>>>> 262144          3764.88
>>>>>>>> 524288          3764.61
>>>>>>>> 1048576         3772.45
>>>>>>>> 2097152         3757.37
>>>>>>>> 4194304         3746.45
>>>>>>>>
>>>>>>>> mpirun -np 2 --mca btl self,sm osu_bw
>>>>>>>> # OSU MPI Bandwidth Test
>>>>>>>> # Size          Bandwidth (MB/s)
>>>>>>>> 1               2.98
>>>>>>>> 2               5.97
>>>>>>>> 4               11.99
>>>>>>>> 8               23.47
>>>>>>>> 16              50.64
>>>>>>>> 32              99.91
>>>>>>>> 64              197.87
>>>>>>>> 128             343.32
>>>>>>>> 256             667.48
>>>>>>>> 512             1200.86
>>>>>>>> 1024            2050.05
>>>>>>>> 2048            3578.52
>>>>>>>> 4096            3966.92
>>>>>>>> 8192            5687.96
>>>>>>>> 16384           7395.88
>>>>>>>> 32768           7101.41
>>>>>>>> 65536           7619.49
>>>>>>>> 131072          7978.09
>>>>>>>> 262144          8648.87
>>>>>>>> 524288          9129.18
>>>>>>>> 1048576         10525.31
>>>>>>>> 2097152         10511.63
>>>>>>>> 4194304         10489.66
>>>>>>>>
>>>>>>>> mpirun -np 2 --mca btl self,sm,openib osu_bw
>>>>>>>> # OSU MPI Bandwidth Test
>>>>>>>> # Size          Bandwidth (MB/s)
>>>>>>>> 1               2.02
>>>>>>>> 2               3.00
>>>>>>>> 4               9.99
>>>>>>>> 8               19.96
>>>>>>>> 16              40.10
>>>>>>>> 32              70.63
>>>>>>>> 64              144.08
>>>>>>>> 128             282.21
>>>>>>>> 256             543.55
>>>>>>>> 512             1032.61
>>>>>>>> 1024            1871.09
>>>>>>>> 2048            3294.07
>>>>>>>> 4096            2336.48
>>>>>>>> 8192            3142.22
>>>>>>>> 16384           3419.93
>>>>>>>> 32768           3647.30
>>>>>>>> 65536           3725.40
>>>>>>>> 131072          3749.43
>>>>>>>> 262144          3765.31
>>>>>>>> 524288          3771.06
>>>>>>>> 1048576         3772.54
>>>>>>>> 2097152         3760.93
>>>>>>>> 4194304         3745.37
>>>>>>>>
>>>>>>>> ----- Original Message -----
>>>>>>>> From: tmish...@jcity.maeda.co.jp
>>>>>>>> To: "Open MPI Developers" <de...@open-mpi.org>
>>>>>>>> Sent: Wednesday, July 27, 2016 6:04:48 AM
>>>>>>>> Subject: Re: [OMPI devel] sm BTL performace of the openmpi-2.0.0
>>>>>>>>
>>>>>>>> Hi Nathan,
>>>>>>>>
>>>>>>>> I applied those commits and ran again without any BTL specified.
>>>>>>>>
>>>>>>>> Then, although it says "mca: bml: Using vader btl for send to
>>>>>>>> [[18993,1],1] on node manage", the osu_bw still shows it's very slow
>>>>>>>> as shown below:
>>>>>>>>
>>>>>>>> [mishima@manage OMB-3.1.1-openmpi2.0.0]$ mpirun -np 2 -mca btl_base_verbose 10 -bind-to core -report-bindings osu_bw
>>>>>>>> [manage.cluster:17482] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
>>>>>>>> [manage.cluster:17482] MCW rank 1 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]
>>>>>>>> [manage.cluster:17487] mca: base: components_register: registering framework btl components
>>>>>>>> [manage.cluster:17487] mca: base: components_register: found loaded component self
>>>>>>>> [manage.cluster:17487] mca: base: components_register: component self register function successful
>>>>>>>> [manage.cluster:17487] mca: base: components_register: found loaded component vader
>>>>>>>> [manage.cluster:17488] mca: base: components_register: registering framework btl components
>>>>>>>> [manage.cluster:17488] mca: base: components_register: found loaded component self
>>>>>>>> [manage.cluster:17487] mca: base: components_register: component vader register function successful
>>>>>>>> [manage.cluster:17488] mca: base: components_register: component self register function successful
>>>>>>>> [manage.cluster:17488] mca: base: components_register: found loaded component vader
>>>>>>>> [manage.cluster:17487] mca: base: components_register: found loaded component tcp
>>>>>>>> [manage.cluster:17488] mca: base: components_register: component vader register function successful
>>>>>>>> [manage.cluster:17488] mca: base: components_register: found loaded component tcp
>>>>>>>> [manage.cluster:17487] mca: base: components_register: component tcp register function successful
>>>>>>>> [manage.cluster:17487] mca: base: components_register: found loaded component sm
>>>>>>>> [manage.cluster:17488] mca: base: components_register: component tcp register function successful
>>>>>>>> [manage.cluster:17488] mca: base: components_register: found loaded component sm
>>>>>>>> [manage.cluster:17487] mca: base: components_register: component sm register function successful
>>>>>>>> [manage.cluster:17488] mca: base: components_register: component sm register function successful
>>>>>>>> [manage.cluster:17488] mca: base: components_register: found loaded component openib
>>>>>>>> [manage.cluster:17487] mca: base: components_register: found loaded component openib
>>>>>>>> [manage.cluster:17488] mca: base: components_register: component openib register function successful
>>>>>>>> [manage.cluster:17488] mca: base: components_open: opening btl components
>>>>>>>> [manage.cluster:17488] mca: base: components_open: found loaded component self
>>>>>>>> [manage.cluster:17488] mca: base: components_open: component self open function successful
>>>>>>>> [manage.cluster:17488] mca: base: components_open: found loaded component vader
>>>>>>>> [manage.cluster:17488] mca: base: components_open: component vader open function successful
>>>>>>>> [manage.cluster:17488] mca: base: components_open: found loaded component tcp
>>>>>>>> [manage.cluster:17488] mca: base: components_open: component tcp open function successful
>>>>>>>> [manage.cluster:17488] mca: base: components_open: found loaded component sm
>>>>>>>> [manage.cluster:17488] mca: base: components_open: component sm open function successful
>>>>>>>> [manage.cluster:17488] mca: base: components_open: found loaded component openib
>>>>>>>> [manage.cluster:17488] mca: base: components_open: component openib open function successful
>>>>>>>> [manage.cluster:17488] select: initializing btl component self
>>>>>>>> [manage.cluster:17488] select: init of component self returned success
>>>>>>>> [manage.cluster:17488] select: initializing btl component vader
>>>>>>>> [manage.cluster:17487] mca: base: components_register: component openib register function successful
>>>>>>>> [manage.cluster:17487] mca: base: components_open: opening btl components
>>>>>>>> [manage.cluster:17487] mca: base: components_open: found loaded component self
>>>>>>>> [manage.cluster:17487] mca: base: components_open: component self open function successful
>>>>>>>> [manage.cluster:17487] mca: base: components_open: found loaded component vader
>>>>>>>> [manage.cluster:17487] mca: base: components_open: component vader open function successful
>>>>>>>> [manage.cluster:17487] mca: base: components_open: found loaded component tcp
>>>>>>>> [manage.cluster:17487] mca: base: components_open: component tcp open function successful
>>>>>>>> [manage.cluster:17487] mca: base: components_open: found loaded component sm
>>>>>>>> [manage.cluster:17487] mca: base: components_open: component sm open function successful
>>>>>>>> [manage.cluster:17487] mca: base: components_open: found loaded component openib
>>>>>>>> [manage.cluster:17488] select: init of component vader returned success
>>>>>>>> [manage.cluster:17488] select: initializing btl component tcp
>>>>>>>> [manage.cluster:17487] mca: base: components_open: component openib open function successful
>>>>>>>> [manage.cluster:17487] select: initializing btl component self
>>>>>>>> [manage.cluster:17487] select: init of component self returned success
>>>>>>>> [manage.cluster:17487] select: initializing btl component vader
>>>>>>>> [manage.cluster:17488] select: init of component tcp returned success
>>>>>>>> [manage.cluster:17488] select: initializing btl component sm
>>>>>>>> [manage.cluster:17488] select: init of component sm returned success
>>>>>>>> [manage.cluster:17488] select: initializing btl component openib
>>>>>>>> [manage.cluster:17487] select: init of component vader returned success
>>>>>>>> [manage.cluster:17487] select: initializing btl component tcp
>>>>>>>> [manage.cluster:17487] select: init of component tcp returned success
>>>>>>>> [manage.cluster:17487] select: initializing btl component sm
>>>>>>>> [manage.cluster:17488] Checking distance from this process to device=mthca0
>>>>>>>> [manage.cluster:17488] hwloc_distances->nbobjs=2
>>>>>>>> [manage.cluster:17488] hwloc_distances->latency[0]=1.000000
>>>>>>>> [manage.cluster:17488] hwloc_distances->latency[1]=1.600000
>>>>>>>> [manage.cluster:17488] hwloc_distances->latency[2]=1.600000
>>>>>>>> [manage.cluster:17488] hwloc_distances->latency[3]=1.000000
>>>>>>>> [manage.cluster:17488] ibv_obj->type set to NULL
>>>>>>>> [manage.cluster:17488] Process is bound: distance to device is 0.000000
>>>>>>>> [manage.cluster:17487] select: init of component sm returned success
>>>>>>>> [manage.cluster:17487] select: initializing btl component openib
>>>>>>>> [manage.cluster:17488] openib BTL: rdmacm CPC unavailable for use on mthca0:1; skipped
>>>>>>>> [manage.cluster:17487] Checking distance from this process to device=mthca0
>>>>>>>> [manage.cluster:17487] hwloc_distances->nbobjs=2
>>>>>>>> [manage.cluster:17487] hwloc_distances->latency[0]=1.000000
>>>>>>>> [manage.cluster:17487] hwloc_distances->latency[1]=1.600000
>>>>>>>> [manage.cluster:17487] hwloc_distances->latency[2]=1.600000
>>>>>>>> [manage.cluster:17487] hwloc_distances->latency[3]=1.000000
>>>>>>>> [manage.cluster:17487] ibv_obj->type set to NULL
>>>>>>>> [manage.cluster:17487] Process is bound: distance to device is 0.000000
>>>>>>>> [manage.cluster:17488] [rank=1] openib: using port mthca0:1
>>>>>>>> [manage.cluster:17488] select: init of component openib returned success
>>>>>>>> [manage.cluster:17487] openib BTL: rdmacm CPC unavailable for use on mthca0:1; skipped
>>>>>>>> [manage.cluster:17487] [rank=0] openib: using port mthca0:1
>>>>>>>> [manage.cluster:17487] select: init of component openib returned success
>>>>>>>> [manage.cluster:17488] mca: bml: Using self btl for send to [[18993,1],1] on node manage
>>>>>>>> [manage.cluster:17487] mca: bml: Using self btl for send to [[18993,1],0] on node manage
>>>>>>>> [manage.cluster:17488] mca: bml: Using vader btl for send to [[18993,1],0] on node manage
>>>>>>>> [manage.cluster:17487] mca: bml: Using vader btl for send to [[18993,1],1] on node manage
>>>>>>>> # OSU MPI Bandwidth Test v3.1.1
>>>>>>>> # Size          Bandwidth (MB/s)
>>>>>>>> 1               1.76
>>>>>>>> 2               3.53
>>>>>>>> 4               7.06
>>>>>>>> 8               14.46
>>>>>>>> 16              29.12
>>>>>>>> 32              57.54
>>>>>>>> 64              100.12
>>>>>>>> 128             157.78
>>>>>>>> 256             277.32
>>>>>>>> 512             477.53
>>>>>>>> 1024            894.81
>>>>>>>> 2048            1330.68
>>>>>>>> 4096            278.58
>>>>>>>> 8192            516.00
>>>>>>>> 16384           762.99
>>>>>>>> 32768           1037.19
>>>>>>>> 65536           1181.66
>>>>>>>> 131072          1261.91
>>>>>>>> 262144          1237.39
>>>>>>>> 524288          1247.86
>>>>>>>> 1048576         1252.04
>>>>>>>> 2097152         1273.46
>>>>>>>> 4194304         1281.21
>>>>>>>> [manage.cluster:17488] mca: base: close: component self closed
>>>>>>>> [manage.cluster:17488] mca: base: close: unloading component self
>>>>>>>> [manage.cluster:17487] mca: base: close: component self closed
>>>>>>>> [manage.cluster:17487] mca: base: close: unloading component self
>>>>>>>> [manage.cluster:17488] mca: base: close: component vader closed
>>>>>>>> [manage.cluster:17488] mca: base: close: unloading component vader
>>>>>>>> [manage.cluster:17487] mca: base: close: component vader closed
>>>>>>>> [manage.cluster:17487] mca: base: close: unloading component vader
>>>>>>>> [manage.cluster:17488] mca: base: close: component tcp closed
>>>>>>>> [manage.cluster:17488] mca: base: close: unloading component tcp
>>>>>>>> [manage.cluster:17487] mca: base: close: component tcp closed
>>>>>>>> [manage.cluster:17487] mca: base: close: unloading component tcp
>>>>>>>> [manage.cluster:17488] mca: base: close: component sm closed
>>>>>>>> [manage.cluster:17488] mca: base: close: unloading component sm
>>>>>>>> [manage.cluster:17487] mca: base: close: component sm closed
>>>>>>>> [manage.cluster:17487] mca: base: close: unloading component sm
>>>>>>>> [manage.cluster:17488] mca: base: close: component openib closed
>>>>>>>> [manage.cluster:17488] mca: base: close: unloading component openib
>>>>>>>> [manage.cluster:17487] mca: base: close: component openib closed
>>>>>>>> [manage.cluster:17487] mca: base: close: unloading component openib
>>>>>>>>
>>>>>>>> Tetsuya Mishima
>>>>>>>>
>>>>>>>> On 2016/07/27 9:20:28, "devel" wrote in "Re: [OMPI devel] sm BTL performace of the openmpi-2.0.0":
>>>>>>>>
>>>>>>>> sm is deprecated in 2.0.0 and will likely be removed in favor of vader
>>>>>>>> in 2.1.0.
>>>>>>>>
>>>>>>>> This issue is probably this known issue:
>>>>>>>>
>>>>>>>> https://github.com/open-mpi/ompi-release/pull/1250
>>>>>>>>
>>>>>>>> Please apply those commits and see if it fixes the issue for you.
>>>>>>>>
>>>>>>>> -Nathan
>>>>>>>>
>>>>>>>> On Jul 26, 2016, at 6:17 PM, tmish...@jcity.maeda.co.jp wrote:

_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/devel