And vader doesn't support knem at this time. It probably never will,
because CMA exists.
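
If you want knem exercised anyway, one option is to select the sm BTL
explicitly so that vader is not picked by default. Below is a minimal
sketch, assuming a single-node run where the sm and self BTLs are
sufficient. The helper name knem_mpirun_cmd is hypothetical and only
prints the command line; the btl and btl_sm_use_knem parameters are the
ones already mentioned in this thread.

```shell
# Hypothetical helper (not part of Open MPI): print an mpirun command line
# that selects the sm BTL (plus self) instead of the default vader, so the
# btl_sm_* knem settings can take effect.
# Assumption: single-node run; no other BTLs are needed.
knem_mpirun_cmd() {
    np="$1"      # number of processes
    shift        # remaining arguments: the program and its arguments
    printf 'mpirun -np %s -mca btl sm,self -mca btl_sm_use_knem 1 %s\n' "$np" "$*"
}

# Print the command for a 2-process IMB run; inspect it, then run it by hand.
knem_mpirun_cmd 2 ./IMB-MPI1
```

While the benchmark runs, 'cat /dev/knem' should show whether the
counters move, as they did with 1.6.5.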

-Nathan

On Thu, Oct 16, 2014 at 01:49:09PM -0700, Ralph Castain wrote:
> FWIW: vader is the default in 1.8
> 
> On Oct 16, 2014, at 1:40 PM, Aurélien Bouteiller <boute...@icl.utk.edu> wrote:
> 
> > Are you sure you are not using the vader BTL?
> > 
> > Setting mca_btl_base_verbose and/or sm_verbose should spit out some knem 
> > initialization info. 
> > 
> > The Linux CMA (Cross Memory Attach) mechanism (which ships with most 3.1x
> > Linux kernels) has similar features, and is also supported in sm.
> > 
> > Aurelien
> > --
> >          ~~~ Aurélien Bouteiller, Ph.D. ~~~
> >             ~ Research Scientist @ ICL ~
> > The University of Tennessee, Innovative Computing Laboratory
> > 1122 Volunteer Blvd, suite 309, Knoxville, TN 37996
> > tel: +1 (865) 974-9375       fax: +1 (865) 974-8296
> > https://icl.cs.utk.edu/~bouteill/
> > 
> > 
> > 
> > 
> > On Oct 16, 2014, at 4:35 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:
> > 
> >> Dear Open MPI developers
> >> 
> >> Well, I just can't keep my promises for too long ...
> >> So, here I am pestering you again, although this time
> >> it is not a request for more documentation.
> >> Hopefully it is something more legit.
> >> 
> >> I am having trouble using knem with Open MPI 1.8.3,
> >> and need your help.
> >> 
> >> I configured Open MPI 1.8.3 with knem.
> >> I had done the same with some builds of Open MPI 1.6.5 before.
> >> 
> >> When I build and launch the Intel MPI benchmarks (IMB)
> >> with Open MPI 1.6.5,
> >> 'cat /dev/knem'
> >> starts showing non-zero-and-growing statistics right away.
> >> 
> >> However, when I build and launch IMB with Open MPI 1.8.3,
> >> /dev/knem shows only zeros,
> >> no statistics growing, nothing.
> >> Knem just seems to be completely asleep.
> >> 
> >> So, my conclusion is that somehow knem is not working with OMPI 1.8.3,
> >> at least not for me.
> >> 
> >> ***
> >> 
> > >> The runtime environment related to knem is set up the
> > >> same way on both OMPI releases.
> >> I tried setting it up both on the command line:
> >> 
> >> -mca btl_sm_eager_limit 32768 -mca btl_sm_knem_dma_min 1048576
> >> 
> >> and on the MCA parameter file:
> >> 
> >> btl_sm_use_knem = 1
> >> btl_sm_eager_limit = 32768
> >> btl_sm_knem_dma_min = 1048576
> >> 
> >> and the behavior is the same (i.e., knem is active in 1.6.5,
> >> but doesn't seem to be used by 1.8.3, as indicated by the
> >> /dev/knem statistics.)
> >> 
> >> ***
> >> 
> >> When I 'grep -i knem config.log', both 1.6.5 and 1.8.3 builds show:
> >> 
> >> #define OMPI_BTL_SM_HAVE_KNEM 1
> >> 
> >> suggesting that both configurations picked up knem correctly.
> >> 
> >> On the other hand, when I do 'ompi_info --all --all |grep knem',
> >> OMPI 1.6.5 shows "btl_sm_have_knem_support":
> >> 
> >> 'MCA btl: information "btl_sm_have_knem_support" (value: <1>, data source: 
> >> default value)  Whether this component supports the knem Linux kernel 
> >> module or not'
> >> 
> >> By contrast, in OMPI 1.8.3 ompi_info doesn't show this particular item 
> >> ("btl_sm_have_knem_support"),
> >> although the *other* 'btl sm knem' items are there,
> >> namely "btl_sm_use_knem","btl_sm_knem_dma_min", 
> >> "btl_sm_knem_max_simultaneous".
> >> 
> >> I am scratching my head to understand why a parameter with such a
> >> suggestive name ("btl_sm_have_knem_support"),
> >> so similar to the OMPI_BTL_SM_HAVE_KNEM cpp macro,
> >> somehow vanished from ompi_info in OMPI 1.8.3.
> >> 
> >> ***
> >> 
> >> Questions:
> >> 
> >> - Am I doing something totally wrong,
> >> perhaps with the knem runtime environment?
> >> 
> >> - Was knem somehow phased out in 1.8.3?
> >> 
> >> - Could there be a bad interaction with other runtime parameters that
> >> somehow is knocking out knem in 1.8.3?
> >> (FYI, besides knem, I'm just excluding the tcp btl, binding to core, and 
> >> reporting the bindings, which is exactly what I do on 1.6.5,
> >> although the runtime parameter syntax has changed.)
> >> 
> >> - Is knem inadvertently not being activated at runtime in OMPI 1.8.3?
> >> (i.e. a bug)
> >> 
> >> - Is there a way to increase verbosity to detect if knem is being
> >> used by OMPI?
> >> That would certainly help to check what is going on.
> >> I tried '-mca btl_base_verbose 30' but there was no trace of knem
> > >> in stderr/stdout of either 1.6.5 or 1.8.3.
> >> So, the evidence I have that knem is
> >> active in 1.6.5 but not in 1.8.3 comes only from the statistics in
> >> /dev/knem.
> >> 
> >> ***
> >> 
> >> 
> >> Thank you,
> >> Gus Correa
> >> 
> >> ***
> >> 
> >> PS - As an aside, I also have some questions on the knem setup,
> >> which I mostly copied from the knem web site
> >> (hopefully Brice Goglin is listening ...):
> >> 
> >> - Is 32768 in 'btl_sm_eager_limit 32768' a good number,
> >> or should it be larger/smaller/something else?
> > >> [OK, I know I should benchmark it, but exploring the whole parameter
> > >> space takes a long time, so why not ask?]
> >> 
> >> - Is it worth using 'btl_sm_knem_dma_min 1048576'?
> >> [I think I read somewhere that this dma engine offload
> >> is an Intel thing, not AMD.]
> >> 
> > >> - How about btl_sm_knem_max_simultaneous?
> > >> That one is not mentioned on the knem web site.
> > >> Should I leave it at its default of zero or set it to 1? 2? 4? Something else?
> >> 
> >> 
> >> Thanks again,
> >> Gus Correa
> >> _______________________________________________
> >> users mailing list
> >> us...@open-mpi.org
> >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> >> Link to this post: 
> >> http://www.open-mpi.org/community/lists/users/2014/10/25511.php
