Hi All

Back to the original issue of knem in Open MPI 1.8.3.
It really seems to be broken.

I launched the Intel MPI benchmarks (IMB) job both with
'-mca btl ^vader,tcp', and with '-mca btl sm,self,openib'.
Both syntaxes seem to have turned off vader (along with tcp),
as shown in stderr by messages like this
(I also used -mca btl_base_verbose 30):

[1,11]<stddiag>:[node26:13439] mca: bml: Using sm btl to [[39251,1],0] on node node26
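
For anyone who wants to tally these messages instead of eyeballing stderr, here is a minimal sketch. The regular expression is modelled only on the single log line quoted above; other Open MPI versions may word the message differently, and the helper name `count_btls` is made up for illustration.

```python
import re
from collections import Counter

# Pattern based on the one verbose line quoted above (an assumption,
# not a documented Open MPI log format).
BML_LINE = re.compile(
    r"mca: bml: Using (\w+) btl to (\[\[\d+,\d+\],\d+\]) on node (\S+)"
)

def count_btls(log_text):
    """Count how many peer connections each BTL was selected for."""
    counts = Counter()
    for line in log_text.splitlines():
        match = BML_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

sample = (
    "[1,11]<stddiag>:[node26:13439] mca: bml: "
    "Using sm btl to [[39251,1],0] on node node26"
)
print(dict(count_btls(sample)))
```

If vader were still being selected, it would show up as its own key in the tally.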

*However*, in both cases /dev/knem continues to *show only zeros*.

My conclusion is that knem does not seem to be working
at all in OMPI 1.8.3.

That is a real pity, because without knem performance really suffers.
I took a quick look at the Intel MPI benchmarks output
using OMPI 1.6.5 with knem, and OMPI 1.8.3 where knem doesn't work
(despite my attempts to make it work).
The older OMPI with knem shows very good speedups.
For instance, ping-pong on two processors, message size 256kB,
OMPI 1.6.5+knem has a ~32% speedup w.r.t. OMPI 1.8.3.

#bytes #repetitions      t[usec]   Mbytes/sec
262144          160        48.04      5203.93 (OMPI 1.6.5 + knem)
262144          160        63.72      3923.30 (OMPI 1.8.3, broken knem)
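
The arithmetic behind that comparison can be checked with a short sketch. It assumes that the "Mbytes" in the IMB output are units of 2**20 bytes, which the figures in the table are consistent with; the function names are mine, not IMB's.

```python
MIB = 1024 * 1024  # assumption: IMB "Mbytes" means 2**20 bytes

def bandwidth_mib_per_s(nbytes, t_usec):
    """Bandwidth implied by moving nbytes in t_usec microseconds."""
    return (nbytes / MIB) / (t_usec * 1e-6)

def speedup(t_slow_usec, t_fast_usec):
    """Ratio of the slower time to the faster time."""
    return t_slow_usec / t_fast_usec

# Timings copied from the two table rows above (256 kB ping-pong).
t_knem = 48.04      # OMPI 1.6.5 + knem
t_no_knem = 63.72   # OMPI 1.8.3, knem not active

s = speedup(t_no_knem, t_knem)
print(f"speedup: {s:.3f}x, i.e. {(s - 1) * 100:.1f}% faster")
print(f"bandwidth with knem:    {bandwidth_mib_per_s(262144, t_knem):.1f}")
print(f"bandwidth without knem: {bandwidth_mib_per_s(262144, t_no_knem):.1f}")
```

The recomputed bandwidths differ from the table only in the last digits, presumably because the printed times are rounded.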

Numbers like these don't give me any incentive to upgrade
our production codes to OMPI 1.8.
Will this be fixed in the next Open MPI 1.8 release?

Thank you,
Gus Correa

PS - Many thanks to Aurélien Bouteiller for pointing out the existence
of the vader btl.  Without his tip I would still be on the dark side.

On 10/16/2014 05:46 PM, Gus Correa wrote:

On 10/16/2014 05:28 PM, Nathan Hjelm wrote:
And it doesn't support knem at this time. Probably never will because of
the existence of CMA.

-Nathan


Thanks, Nathan

But for the benefit of mere mortals like me
who don't share the dark or the bright side of the force,
and just need to keep their MPI applications running in production mode,
hopefully with Open MPI 1.8,
can somebody explain more clearly what "vader" is about?

Thank you,
Gus Correa


On Thu, Oct 16, 2014 at 01:49:09PM -0700, Ralph Castain wrote:
FWIW: vader is the default in 1.8

On Oct 16, 2014, at 1:40 PM, Aurélien Bouteiller
<boute...@icl.utk.edu> wrote:

Are you sure you are not using the vader BTL ?

Setting mca_btl_base_verbose and/or sm_verbose should spit out some
knem initialization info.

The Linux CMA mechanism (which ships with most 3.1x Linux kernels) has
similar features, and is also supported in sm.

Aurelien
--
          ~~~ Aurélien Bouteiller, Ph.D. ~~~
             ~ Research Scientist @ ICL ~
The University of Tennessee, Innovative Computing Laboratory
1122 Volunteer Blvd, suite 309, Knoxville, TN 37996
tel: +1 (865) 974-9375       fax: +1 (865) 974-8296
https://icl.cs.utk.edu/~bouteill/




On Oct 16, 2014, at 4:35 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:

Dear Open MPI developers

Well, I just can't keep my promises for too long ...
So, here I am pestering you again, although this time
it is not a request for more documentation.
Hopefully it is something more legit.

I am having trouble using knem with Open MPI 1.8.3,
and need your help.

I configured Open MPI 1.8.3 with knem.
I had done the same with some builds of Open MPI 1.6.5 before.

When I build and launch the Intel MPI benchmarks (IMB)
with Open MPI 1.6.5,
'cat /dev/knem'
starts showing non-zero-and-growing statistics right away.

However, when I build and launch IMB with Open MPI 1.8.3,
/dev/knem shows only zeros,
no statistics growing, nothing.
Knem just seems to be completely asleep.

So, my conclusion is that somehow knem is not working with OMPI 1.8.3,
at least not for me.

***

The runtime environment related to knem is set up the
same way on both OMPI releases.
I tried setting it up both on the command line:

-mca btl_sm_eager_limit 32768 -mca btl_sm_knem_dma_min 1048576

and on the MCA parameter file:

btl_sm_use_knem = 1
btl_sm_eager_limit = 32768
btl_sm_knem_dma_min = 1048576

and the behavior is the same (i.e., knem is active in 1.6.5,
but doesn't seem to be used by 1.8.3, as indicated by the
/dev/knem statistics.)
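
To keep the two forms in sync, a small sketch like the following can turn the parameter-file lines into the equivalent '-mca name value' arguments. It assumes the file holds plain 'name = value' lines with '#' starting a comment; the function names are hypothetical and not part of Open MPI.

```python
def parse_mca_params(text):
    """Parse 'name = value' lines; '#' is assumed to start a comment."""
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        params[name.strip()] = value.strip()
    return params

def to_mpirun_args(params):
    """Build the '-mca name value' argument list for the same settings."""
    args = []
    for name, value in params.items():
        args += ["-mca", name, value]
    return args

# The three settings quoted above from the MCA parameter file.
conf = """
btl_sm_use_knem = 1
btl_sm_eager_limit = 32768
btl_sm_knem_dma_min = 1048576
"""
print(" ".join(to_mpirun_args(parse_mca_params(conf))))
```

This only rewrites the settings; it does not check whether a given Open MPI build actually recognizes them.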

***

When I 'grep -i knem config.log', both 1.6.5 and 1.8.3 builds show:

#define OMPI_BTL_SM_HAVE_KNEM 1

suggesting that both configurations picked up knem correctly.

On the other hand, when I do 'ompi_info --all --all |grep knem',
OMPI 1.6.5 shows "btl_sm_have_knem_support":

'MCA btl: information "btl_sm_have_knem_support" (value: <1>, data
source: default value)  Whether this component supports the knem
Linux kernel module or not'

By contrast, in OMPI 1.8.3 ompi_info doesn't show this particular
item ("btl_sm_have_knem_support"),
although the *other* 'btl sm knem' items are there,
namely "btl_sm_use_knem","btl_sm_knem_dma_min",
"btl_sm_knem_max_simultaneous".

I am scratching my head to understand why a parameter with such a
suggestive name ("btl_sm_have_knem_support"),
so similar to the OMPI_BTL_SM_HAVE_KNEM cpp macro,
somehow vanished from ompi_info in OMPI 1.8.3.

***

Questions:

- Am I doing something totally wrong,
perhaps with the knem runtime environment?

- Was knem somehow phased out in 1.8.3?

- Could there be a bad interaction with other runtime parameters that
somehow is knocking out knem in 1.8.3?
(FYI, besides knem, I'm just excluding the tcp btl, binding to
core, and reporting the bindings, which is exactly what I do on 1.6.5,
although the runtime parameter syntax has changed.)

- Is knem inadvertently not being activated at runtime in OMPI 1.8.3?
(i.e. a bug)

- Is there a way to increase verbosity to detect if knem is being
used by OMPI?
That would certainly help to check what is going on.
I tried '-mca btl_base_verbose 30' but there was no trace of knem
in stderr/stdout of either 1.6.5 or 1.8.3.
So, the evidence I have that knem is
active in 1.6.5 but not in 1.8.3 comes only from the statistics in
/dev/knem.
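
The /dev/knem check can be made less manual with a sketch along these lines: take one snapshot of the statistics before the run and one after, and see whether any counter grew. The 'label : number' layout and the counter names in the sample strings are assumptions for illustration only; the real text comes from 'cat /dev/knem' and may look different.

```python
import re

# Matches lines of the form "<label> : <integer>" (assumed layout).
COUNTER_LINE = re.compile(r"^\s*(.+?)\s*:\s*(\d+)\s*$")

def parse_counters(text):
    """Collect every 'label : integer' line into a dict."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER_LINE.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters

def knem_was_used(before_text, after_text):
    """True if any counter is larger in the second snapshot."""
    before = parse_counters(before_text)
    after = parse_counters(after_text)
    return any(value > before.get(name, 0) for name, value in after.items())

# Hypothetical snapshots; in practice read them with
# open("/dev/knem").read() before and after the MPI job.
snapshot_before = " Requests Submitted : 0\n Requests Processed : 0\n"
snapshot_idle = " Requests Submitted : 0\n Requests Processed : 0\n"
snapshot_busy = " Requests Submitted : 42\n Requests Processed : 42\n"

print(knem_was_used(snapshot_before, snapshot_idle))
print(knem_was_used(snapshot_before, snapshot_busy))
```

On a shared node other jobs can also bump the counters, so a positive result is only meaningful when the node is otherwise idle.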

***


Thank you,
Gus Correa

***

PS - As an aside, I also have some questions on the knem setup,
which I mostly copied from the knem web site
(hopefully Brice Goglin is listening ...):

- Is 32768 in 'btl_sm_eager_limit 32768' a good number,
or should it be larger/smaller/something else?
[OK, I know I should benchmark it, but exploring the whole parameter
space takes long, so why not ask?]

- Is it worth using 'btl_sm_knem_dma_min 1048576'?
[I think I read somewhere that this dma engine offload
is an Intel thing, not AMD.]

- How about btl_sm_knem_max_simultaneous?
That one is not mentioned on the knem web site.
Should I leave it at its default of zero or set it to 1? 2? 4?
Something else?


Thanks again,
Gus Correa
_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post:
http://www.open-mpi.org/community/lists/users/2014/10/25511.php
