Hi Nathan

Thank you very much for addressing this problem.

I read your notes on Jeff's blog about vader,
and that clarified many things that were obscure to me
when I first started this thread
whining that knem was not working in OMPI 1.8.3.
Thank you also for writing that blog post,
and for sending the link to it.
That was very helpful indeed.

As your closing comments on the blog post point out,
and your IMB benchmark graphs of pingpong/latency &
sendrecv/bandwidth show,
vader+xpmem outperforms the other combinations
of btl+memory_copy_mechanism for intra-node communication.

For the benefit of pedestrian OpenMPI users like me:

1) What is the status of xpmem in the Linux world at this point?
[Is it proprietary (SGI?) or open source? Is it part of the Linux
kernel (and if so, since which version)? Does it ship with standard
distributions (and if so, which ones)?]

2) Any recommendation for the values of the
various vader btl parameters?
[There are 12 of them in OMPI 1.8.3!
That is a real challenge to get right.]

Which values did you use in your benchmarks?
Defaults?
Other?

In particular, is there an optimal value for the eager/rendezvous threshold (btl_vader_eager_limit, default = 4 kB)? [The INRIA web site suggests 32 kB for the sm+knem counterpart (btl_sm_eager_limit, default = 4 kB).]
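
[In case it is useful to others: I have been poking at these with
something like the commands below. "./my_benchmark" is just a
placeholder, and 32768 is only the INRIA suggestion carried over to
vader, not a value I know to be good:

   # list all vader btl parameters, their defaults, and descriptions
   ompi_info --param btl vader --level 9

   # try a larger eager/rendezvous threshold on a test run
   mpirun --mca btl_vader_eager_limit 32768 -np 4 ./my_benchmark
]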

3) Did I understand correctly that the upcoming Open MPI 1.8.5
can be configured with more than one memory-copy mechanism at once
(e.g. --with-knem and --with-cma and --with-xpmem),
and that one of them can then be selected at runtime with the
btl_vader_single_copy_mechanism parameter?
Or must OMPI be configured with only one memory-copy mechanism?
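
[To make the question concrete, I mean a single build configured
along these lines, where the install paths are just placeholders:

   ./configure --with-knem=/opt/knem --with-cma --with-xpmem=/opt/xpmem ...
]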

Many thanks,
Gus Correa


On 10/30/2014 05:44 PM, Nathan Hjelm wrote:
I want to close the loop on this issue. 1.8.5 will address it in several
ways:

  - knem support in btl/sm has been fixed. A sanity check was disabling
    knem during component registration. I wrote the sanity check before
    the 1.7 release and didn't intend this side-effect.

  - vader now supports xpmem, cma, and knem. The best available
    single-copy mechanism will be used. If multiple single-copy
    mechanisms are available, you can select which one you want to use at
    runtime.
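
For example, to force a particular mechanism at runtime (assuming it
was enabled when Open MPI was configured), something like

  mpirun --mca btl_vader_single_copy_mechanism xpmem ...

should do it; the accepted values match the mechanism names above.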

More about the vader btl can be found here:
http://blogs.cisco.com/performance/the-vader-shared-memory-transport-in-open-mpi-now-featuring-3-flavors-of-zero-copy/

-Nathan Hjelm
HPC-5, LANL

On Fri, Oct 17, 2014 at 01:02:23PM -0700, Ralph Castain wrote:
      On Oct 17, 2014, at 12:06 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:
      Hi Jeff

      Many thanks for looking into this and filing a bug report at 11:16PM!

      Thanks to Aurelien, Ralph and Nathan for their help and clarifications
      also.

      **

      Related suggestion:

      Add a note to the FAQ explaining that in OMPI 1.8
      the new (default) btl is vader (and what it is).

      It was a real surprise to me.
      If Aurelien Bouteiller didn't tell me about vader,
      I might have never realized it even existed.

      That could be part of one of the already existent FAQs
      explaining how to select the btl.

      **

      Doubts (btl in OMPI 1.8):

      I still don't understand clearly the meaning and scope of vader
      being a "default btl".

    We mean that it has a higher priority than the other shared memory
    implementation, and so it will be used for intra-node messaging by
    default.
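
    If you want to see how the two shared-memory btls are set up on your
    build, you can dump their parameters side by side, e.g.:

      ompi_info --param btl vader --level 9
      ompi_info --param btl sm --level 9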

      What is the scope of this default: intra-node btl only, perhaps?

    Yes - strictly intra-node

      Was there a default btl before vader, and if so, which one?

    The "sm" btl was the default shared memory transport before vader

      Is vader the intra-node default only (i.e. replaces sm by default),

    Yes

      or does it somehow extend beyond node boundaries and replace (or
      bring in) network btls (openib, tcp, etc.)?

    Nope - just intra-node

      If I am running on several nodes, and want to use openib, not tcp,
      and, say, use vader, what is the right syntax?

      * nothing (OMPI will figure it out ... but what if you have
      IB, Ethernet, Myrinet, OpenGM altogether?)

    If you have higher-speed connections, we will pick the fastest for
    inter-node messaging as the "default" since we expect you would want the
    fastest possible transport.

      * -mca btl openib (and vader will come along automatically)

    Among the ones you show, these would indeed be the likely choices
    (openib and vader)

      * -mca btl openib,self (and vader will come along automatically)

    The "self" btl is *always* active as the loopback transport

      * -mca btl openib,self,vader (because vader is default only for 1-node
      jobs)
      * something else (or several alternatives)

      Whatever happened to the "self" btl in this new context?
      Gone? Still there?

      Many thanks,
      Gus Correa

      On 10/16/2014 11:16 PM, Jeff Squyres (jsquyres) wrote:

        On Oct 16, 2014, at 1:35 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:

          and on the MCA parameter file:

          btl_sm_use_knem = 1

        I think the logic enforcing this MCA param got broken when we revamped
        the MCA param system.  :-(

          I am scratching my head to understand why a parameter with such a
          suggestive name ("btl_sm_have_knem_support"),
          so similar to the OMPI_BTL_SM_HAVE_KNEM cpp macro,
          somehow vanished from ompi_info in OMPI 1.8.3.

        It looks like this MCA param was also dropped when we revamped the MCA
        system.  Doh!  :-(

        There's some deep mojo going on that is somehow causing knem to not be
        used; I'm too tired to understand the logic right now.  I just opened
        https://github.com/open-mpi/ompi/issues/239 to track this issue --
        feel free to subscribe to the issue to get updates.
