Hi Reuti

The DVM in master seems to be fairly complete, but several organizations are in 
the process of automating tests for it so it gets more regular exercise.

If you are using a version from the OMPI 2.x series, that is an early 
prototype - we haven’t updated the code in the release branches. The more 
production-ready version will be in 3.0, and we’ll start supporting it there.

In the meantime, we do appreciate any suggestions and bug reports as we polish it up.


> On Feb 28, 2017, at 2:17 AM, Reuti <re...@staff.uni-marburg.de> wrote:
> 
> Hi,
> 
> I only became aware of the DVM by reading recent posts. This would be a 
> welcome feature for our setup*. But I see that not all options work as 
> expected - is it still a work in progress, or should everything work as 
> advertised?
> 
> 1)
> 
> $ soft@server:~> orte-submit -cf foo --hnp file:/home/reuti/dvmuri -n 1 touch /home/reuti/hacked
> ----------------------------------------------------------------------------
> Open MPI has detected that a parameter given to a command line
> option does not match the expected format:
> 
>  Option: np
>  Param:  foo
> 
> ==> The given option is -cf, not -np
> 
> 2)
> 
> According to `man orte-dvm` there are -H, -host, --host, -machinefile, and 
> -hostfile, but none of them seems operational (Open MPI 2.0.2). A hostlist 
> provided by SGE is honored, though.
> 
> -- Reuti
> 
> 
> *) We run Open MPI jobs inside SGE. This works fine. Some applications invoke 
> several `mpiexec`-calls during their execution and rely on temporary files 
> they created in the last step(s). While this is working fine on one and the 
> same machine, it fails in case SGE granted slots on several machines as the 
> scratch directories created by `qrsh -inherit …` vanish once the 
> `mpiexec`-call on this particular node finishes (and not at the end of the 
> complete job). I can mimic persistent scratch directories in SGE for a 
> complete job, but invoking the DVM before and shutting it down later on 
> (either by hand in the job script or by SGE killing all remains at the end of 
> the job) might be more straightforward (looks like `orte-dvm` is started by 
> `qrsh -inherit …` too).
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

