Hi,

Yes, lsrun exists under OpenLava.

Using mpirun is fine, but OpenLava currently requires it to be launched through a bash script (openmpi-mpirun). It would be neater if one could do away with that.

Again, thanks for looking into this!

/Marc

Hold on - I was discussing this with a (possibly former) OpenLava developer, who made some suggestions that would make this work. It all hinges on one thing.

Can you please check and see if you have "lsrun" on your system? If you do, then I can offer a tight integration in which we would use OpenLava to actually launch the OMPI daemons. I'm still not sure I could support launching MPI apps directly without using mpirun, if that's what you are after.
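
If lsrun is there, the launch would presumably look something like the sketch below: mpirun forks and execs lsrun to start one orted per remote node, much as the rsh/ssh launcher does today. This is a minimal illustration, not the actual plm code; the "-m <host>" host-selection flag is assumed from lsrun's LSF heritage, and the orted argument list is hypothetical.

/* Minimal sketch: launch one orted per allocated host by exec'ing
 * OpenLava's "lsrun" instead of ssh. Illustrative only. */
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

static pid_t launch_daemon(const char *host)
{
    pid_t pid = fork();
    if (pid == 0) {
        /* hypothetical daemon command line, for illustration */
        execlp("lsrun", "lsrun", "-m", host, "orted", (char *)NULL);
        perror("execlp lsrun");   /* only reached if exec fails */
        _exit(1);
    }
    return pid;
}

int main(void)
{
    const char *hosts[] = { "node01", "node02" };   /* from the allocation */
    for (int i = 0; i < 2; i++) {
        if (launch_daemon(hosts[i]) < 0)
            return 1;
    }
    while (wait(NULL) > 0)
        ;   /* reap the lsrun children */
    return 0;
}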


On Nov 18, 2014, at 7:58 AM, Marc Höppner <marc.hoepp...@bils.se> wrote:

Hi Ralph,

I really appreciate you guys looking into this! At least now I know that there isn't a better way to run MPI jobs. Probably worth looking into LSF again.

Cheers,

Marc
I took a brief gander at the OpenLava source code, and a couple of things jump out. First, OpenLava is a batch scheduler and only supports batch execution - there is no interactive command for "run this job". So you would have to "bsub" mpirun regardless.

Once you submit the job, mpirun can certainly read the local allocation via the environment. However, we cannot use the OpenLava internal functions to launch the daemons or processes, as the code is GPLv2 and thus carries a viral license that is incompatible with ours. Ordinarily, we get around that by just executing the interactive job execution command, but OpenLava doesn't have one.
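
To illustrate the environment-reading half, here is a minimal sketch, assuming OpenLava exports the LSF-style LSB_MCPU_HOSTS variable (pairs of "host slots"); the variable name and format are taken from classic LSF and would need to be verified against what OpenLava actually sets.

/* Minimal sketch: read an OpenLava/LSF-style allocation from the
 * environment and print it as an Open MPI machinefile. Assumes
 * LSB_MCPU_HOSTS holds "host1 n1 host2 n2 ..." pairs, as in classic
 * LSF; illustrative only, not actual plm code. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const char *alloc = getenv("LSB_MCPU_HOSTS");
    if (alloc == NULL) {
        fprintf(stderr, "no OpenLava allocation found\n");
        return 1;
    }
    char *copy = strdup(alloc);   /* strtok modifies its argument */
    for (char *host = strtok(copy, " "); host != NULL;
         host = strtok(NULL, " ")) {
        char *slots = strtok(NULL, " ");
        if (slots == NULL)
            break;                /* malformed trailing entry */
        printf("%s slots=%s\n", host, slots);
    }
    free(copy);
    return 0;
}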

So we'd have no choice but to use ssh to launch the daemons on the remote nodes. This is exactly what the openmpi-mpirun wrapper script that comes with OpenLava already does.

Bottom line: absent an interactive execution command, I don't see a way to do any deeper integration. If OpenLava had a way of getting an allocation and then interactively running jobs within it, we could support what you requested. This doesn't seem to be what they are intending, unless I'm missing something (the documentation is rather incomplete).

Ralph


On Tue, Nov 18, 2014 at 6:20 AM, Marc Höppner <marc.hoepp...@bils.se> wrote:

    Hi,

    Sure, no problem. And about the C API, I really don't know more
    than what I was told in the Google Groups post I referred to
    (i.e. the API is essentially identical to LSF 4-6, whose
    documentation should be on the web).

    The output of env can be found here:
    https://dl.dropboxusercontent.com/u/1918141/env.txt

    /M

    Marc P. Hoeppner, PhD
    Team Leader
    BILS Genome Annotation Platform
    Department for Medical Biochemistry and Microbiology
    Uppsala University, Sweden
    marc.hoepp...@bils.se

    On 18 Nov 2014, at 15:14, Ralph Castain <r...@open-mpi.org> wrote:

    If you could just run a single copy of "env" and send the
    output along, that would help a lot. I'm not interested in the
    usual PATH etc., but would like to see the environment
    variables that OpenLava is setting.

    Thanks
    Ralph


    On Tue, Nov 18, 2014 at 2:19 AM, Gilles Gouaillardet
    <gilles.gouaillar...@iferc.org> wrote:

        Marc,

        The reply you pointed to is a bit confusing to me:

        "There is a native C API which can
        submit/start/stop/kill/re queue jobs"
        this is not what i am looking for :-(

        "you need to make an appropriate call to openlava to start
        a remote process"
        this is what i am interested in :-)
        could you be more specific (e.g. point me to the functions,
        since the OpenLava doc is pretty minimal ...)

        The goal here is to spawn the orted daemons as part of the
        parallel job, so that these daemons are accounted for within
        the parallel job.
        /* if we use an API that simply spawns orted, but the orted
        is not related in any way to the parallel job,
        then we might as well use ssh */
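
        If OpenLava's LSLIB retains the LSF 4.x remote-execution
        calls, the kind of function being asked about might be
        ls_rtask(). The sketch below is a guess based on old LSF
        documentation; the header name, the ls_initrex()/ls_rtask()
        signatures, and the accounting behaviour are all assumptions,
        not verified against the OpenLava source.

        /* Speculative sketch, assuming OpenLava's LSLIB still exports
         * the LSF 4.x remote-execution API; names and signatures are
         * taken from old LSF docs and are NOT verified. */
        #include <stdio.h>
        #include <lsf/lsf.h>   /* assumed LSLIB header */

        int main(void)
        {
            /* ls_initrex() reportedly sets up remote-execution ports */
            if (ls_initrex(1, 0) < 0) {
                fprintf(stderr, "ls_initrex failed\n");
                return 1;
            }
            char *argv[] = { "orted", NULL };   /* hypothetical argv */
            /* ls_rtask() reportedly starts a task on a remote host,
             * under the control of the LIM/RES daemons */
            int tid = ls_rtask("node01", argv, 0);
            if (tid < 0) {
                fprintf(stderr, "ls_rtask failed\n");
                return 1;
            }
            printf("started remote task %d\n", tid);
            return 0;
        }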

        Cheers,

        Gilles


        On 2014/11/18 18:24, Marc Höppner wrote:
        Hi Gilles,

        thanks for the prompt reply. Yes, as far as I know there is
        a C API to interact with jobs etc. It is mentioned here:
        https://groups.google.com/forum/#!topic/openlava-users/w74cRUe9Y9E
        /Marc

        Marc P. Hoeppner, PhD
        Team Leader
        BILS Genome Annotation Platform
        Department for Medical Biochemistry and Microbiology
        Uppsala University, Sweden
        marc.hoepp...@bils.se
        On 18 Nov 2014, at 08:40, Gilles Gouaillardet
        <gilles.gouaillar...@iferc.org> wrote:

        Hi Marc,

        OpenLava is based on a pretty old version of LSF (4.x, if I
        remember correctly), and I do not think LSF had support for
        tight integration of parallel jobs at that time.

        My understanding is that there are basically two kinds of
        tight integration:
        - mpirun launch: mpirun spawns orted via the API provided by
        the batch manager
        - direct launch: the MPI tasks are launched directly from the
        script/command line and no mpirun/orted is involved; at the
        moment this works with SLURM and possibly other PMI-capable
        batch managers

        I think OpenLava simply gets a list of hosts from the
        environment, builds a machinefile, and passes it to mpirun,
        which then spawns orted via ssh, so this is really loose
        integration.
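
        For example (illustrative only, assuming the wrapper reads
        the LSF-style LSB_MCPU_HOSTS variable), an allocation of

            LSB_MCPU_HOSTS="node01 2 node02 4"

        would be turned into a machinefile along the lines of

            node01 slots=2
            node02 slots=4

        which is then handed to mpirun via its -machinefile option.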

        OpenMPI is based on plugins, so as long as the queuing system
        provides an API to start/stop/kill tasks, adding mpirun
        launch support should not be a huge effort.

        Are you aware of such an API provided by OpenLava?

        Cheers,

        Gilles

        On 2014/11/18 16:31, Marc Höppner wrote:
        Hi list,

        I have recently started to wonder how hard it would be to add
        tight-integration support for more queuing systems to OpenMPI
        (unfortunately, I am not a developer myself). Specifically, we
        are working with OpenLava (www.openlava.org), which is based
        on an early version of Lava/LSF and is open source. It has
        proven quite useful in environments where some level of LSF
        compatibility is needed, but without actually paying for a
        (rather pricey) LSF license.

        Given that OpenLava shares quite a bit of DNA with LSF, I was
        wondering how hard it would be to add OpenLava tight
        integration support to OpenMPI. Currently, OpenLava enables
        OpenMPI jobs through a wrapper script, but that is obviously
        not ideal and doesn't work for some programs that have MPI
        support built in (and thus expect to be able to just execute
        mpirun).

        Any thoughts on this would be greatly appreciated!

        Regards,

        Marc

        Marc P. Hoeppner, PhD
        Team Leader
        BILS Genome Annotation Platform
        Department for Medical Biochemistry and Microbiology
        Uppsala University, Sweden
        marc.hoepp...@bils.se
