Re: [OMPI devel] Running on Kubernetes
Hi guys,

Thanks for all the suggestions! It's been a while, but we finally got it approved for open sourcing. I've submitted a proposal to Kubeflow:
https://github.com/kubeflow/community/blob/master/proposals/mpi-operator-proposal.md

In this version we've managed to avoid ssh, relying on `kubectl exec` instead. It's still pretty "ghetto", but at least we've managed to train some TensorFlow models with it. :)

Please take a look and let me know what you think.

Thanks,
Rong

On Fri, Mar 16, 2018 at 11:38 AM r...@open-mpi.org wrote:

> I haven’t really spent any time with Kubernetes, but it seems to me you
> could just write a Kubernetes plm (and maybe an odls) component and bypass
> the ssh stuff completely given that you say there is a launcher API.
>
> > On Mar 16, 2018, at 11:02 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
> >
> > On Mar 16, 2018, at 10:01 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:
> >>
> >> By default, Open MPI uses the rsh PLM in order to start a job.
> >
> > To clarify one thing here: the name of our plugin is "rsh" for
> > historical reasons, but it defaults to looking for "ssh" first.
> > If it finds ssh, it uses it. Otherwise, it tries to find rsh and use that.
> >
> > --
> > Jeff Squyres
> > jsquy...@cisco.com

___
devel mailing list
devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/devel
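
For readers following the thread, here is a minimal Python sketch of the two ideas discussed above: the "look for ssh first, then rsh" fallback that Jeff describes, and building a `kubectl exec` command line in place of an ssh call. It is an illustration only, not Open MPI's or the proposal's actual code. The function names `find_launch_agent` and `kubectl_exec_argv`, the pod name `worker-0`, and the `hostname` command are all hypothetical.

```python
import shutil
from typing import Callable, List, Optional, Sequence


def find_launch_agent(
    candidates: Sequence[str] = ("ssh", "rsh"),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first candidate found on PATH, or None.

    Mirrors the fallback described in the thread: try ssh first,
    and only fall back to rsh if ssh is not found.
    """
    for name in candidates:
        path = which(name)
        if path is not None:
            return path
    return None


def kubectl_exec_argv(pod: str, command: Sequence[str]) -> List[str]:
    """Build (but do not run) a `kubectl exec` command line for a pod.

    Assumption: the remote command is passed after `--`, as kubectl expects.
    """
    return ["kubectl", "exec", pod, "--", *command]


if __name__ == "__main__":
    # Which remote-shell agent would be picked on this machine, if any.
    print(find_launch_agent())
    # Hypothetical pod name and command, for illustration only.
    print(kubectl_exec_argv("worker-0", ["hostname"]))
```

The `which` parameter is injectable only so the fallback order can be checked without ssh or rsh being installed.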
Re: [OMPI devel] About supporting HWLOC 2.0.x
I just pushed my patches, rebased on master, plus the update to hwloc 2.0.1, to bgoglin/ompi (master branch).

My testing of mapping/ranking/binding looks good here (on a dual Xeon with CoD: 2 sockets x 2 NUMA x 6 cores).

It'd be nice if somebody else could test on another platform with different options and/or advanced options (PPR, PE, etc).

Brice

On 23/05/2018 at 17:07, Vallee, Geoffroy R. wrote:

> I totally missed that PR before I sent my email, sorry. It pretty much covers
> all the modifications I made. :) Let me know if I can help in any way.
>
> Thanks,
>
>> On May 22, 2018, at 11:49 AM, Jeff Squyres (jsquyres) wrote:
>>
>> Geoffroy -- check out https://github.com/open-mpi/ompi/pull/4677.
>>
>> If all those issues are now moot, great. I really haven't followed up much
>> since I made the initial PR; I'm happy to have someone else take it over...
>>
>>> On May 22, 2018, at 11:46 AM, Vallee, Geoffroy R. wrote:
>>>
>>> Hi,
>>>
>>> HWLOC 2.0.x support was brought up during the call. FYI, I am currently
>>> using (and still testing) hwloc 2.0.1 as an external library with master
>>> and I did not face any major problem; I only had to fix minor things,
>>> mainly for putting the HWLOC topology in a shared memory segment. Let me
>>> know if you want me to help with the effort of supporting HWLOC 2.0.x.
>>>
>>> Thanks,
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
Re: [OMPI devel] openmpi-3.1.0 cygwin patch
On 5/23/2018 2:58 PM, Gilles Gouaillardet wrote:

> Marco,
>
> Have you tried to build Open MPI with an external (e.g. Cygwin provided)
> libevent library? If that works, I think that would be the preferred method.
>
> Cheers,
> Gilles

I will try. If I remember right, there was an issue in the past where WIN32 was defined somewhere and it was screwing up the build.

Regards,
Marco