All things are possible, including what you describe. Not sure when we would 
get to it, though.
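
In the meantime, if the wrapper around mpirun can be adjusted at all, one possible workaround is to record the PID of each mpirun when it is launched and signal only that PID, instead of calling orte-clean for the whole user. The sketch below is only an illustration under assumptions: the helper names start_job and stop_job and the pidfile locations are hypothetical and not part of Open MPI. It relies on mpirun terminating its own job when it receives SIGTERM, which is what the mpirun man page describes as far as I know.

```shell
#!/bin/sh
# Hypothetical helpers: the function names and the pidfile layout are
# assumptions made for illustration; they are not part of Open MPI.

# start_job PIDFILE COMMAND [ARGS...]
# Launch the command in the background and record its PID in PIDFILE.
start_job() {
    pidfile=$1
    shift
    "$@" &
    echo "$!" > "$pidfile"
}

# stop_job PIDFILE
# Send SIGTERM to the single recorded PID; other jobs are left untouched.
stop_job() {
    pidfile=$1
    [ -f "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    kill -TERM "$pid" 2>/dev/null
    rm -f "$pidfile"
}

# Example usage (myexec1 and myexec2 are the executables from the
# original post; the /tmp paths are placeholders):
#   start_job /tmp/myexec1.pid mpirun -np 2 myexec1
#   start_job /tmp/myexec2.pid mpirun -np 2 myexec2
#   stop_job  /tmp/myexec1.pid    # myexec2 keeps running
```

This does not remove stale session directories the way orte-clean does; it only avoids killing the other mpirun processes of the same user.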


On Oct 24, 2012, at 4:01 AM, Nicolas Deladerriere 
<nicolas.deladerri...@gmail.com> wrote:

> Reuti,
> 
> The problem I am facing is a small part of our production
> system, and I cannot modify our mpirun submission system. This is why
> I am looking for a solution that uses only the ompi-clean or mpirun
> command specification.
> 
> Thanks,
> Nicolas
> 
> 2012/10/24, Reuti <re...@staff.uni-marburg.de>:
>> On 24.10.2012 at 11:33, Nicolas Deladerriere wrote:
>> 
>>> Reuti,
>>> 
>>> Thanks for your comments,
>>> 
>>> In our case, we are currently running different mpirun commands on
>>> clusters sharing the same frontend. Basically we use a wrapper to run
>>> the mpirun command and to run an ompi-clean command to clean up the
>>> mpi job if required.
>>> Using ompi-clean like this just kills all other MPI jobs running on
>>> the same frontend. I cannot use a queuing system
>> 
>> Why? Using it on a single machine was only one possible setup. Its purpose
>> is to distribute jobs to slave hosts. If you already have one frontend as a
>> login machine, it fits perfectly: the qmaster (in the case of SGE) can run
>> there and the execd on the nodes.
>> 
>> -- Reuti
>> 
>> 
>>> as you have suggested, which is why I was wondering whether there is
>>> an option or other solution associated with the ompi-clean command
>>> to avoid this general cleanup of MPI jobs.
>>> 
>>> Cheers
>>> Nicolas
>>> 
>>> 2012/10/24, Reuti <re...@staff.uni-marburg.de>:
>>>> Hi,
>>>> 
>>>> On 24.10.2012 at 09:36, Nicolas Deladerriere wrote:
>>>> 
>>>>> I am having an issue running ompi-clean, which cleans up (this is
>>>>> normal) the session associated with a user, which means it kills all
>>>>> running jobs associated with this session (this is also normal). But I
>>>>> would like to be able to clean up the session associated with a job
>>>>> (and not a user).
>>>>> 
>>>>> Here is my point:
>>>>> 
>>>>> I am running two executables:
>>>>> 
>>>>> % mpirun -np 2 myexec1
>>>>>      --> run with PID 2399 ...
>>>>> % mpirun -np 2 myexec2
>>>>>      --> run with PID 2402 ...
>>>>> 
>>>>> When I run orte-clean I get this result:
>>>>> % orte-clean -v
>>>>> orte-clean: cleaning session dir tree
>>>>> openmpi-sessions-ndelader@myhost_0
>>>>> orte-clean: killing any lingering procs
>>>>> orte-clean: found potential rogue orterun process
>>>>> (pid=2399,user=ndelader), sending SIGKILL...
>>>>> orte-clean: found potential rogue orterun process
>>>>> (pid=2402,user=ndelader), sending SIGKILL...
>>>>> 
>>>>> Which means that both jobs have been killed :-(
>>>>> Basically I would like to run orte-clean with an executable name, a
>>>>> PID, or whatever identifies which job I want to stop and clean. It
>>>>> seems I would need to create an openmpi session per job. Does that
>>>>> make sense? And I would like to be able to run something like the
>>>>> following command and get the following result:
>>>>> 
>>>>> % orte-clean -v myexec1
>>>>> orte-clean: cleaning session dir tree
>>>>> openmpi-sessions-ndelader@myhost_0
>>>>> orte-clean: killing any lingering procs
>>>>> orte-clean: found potential rogue orterun process
>>>>> (pid=2399,user=ndelader), sending SIGKILL...
>>>>> 
>>>>> 
>>>>> Does it make sense? Is there a way to perform this kind of selection
>>>>> in the cleaning process?
>>>> 
>>>> How many jobs are you starting on how many nodes at one time? This
>>>> requirement could be a reason to start using a queuing system, where
>>>> you can remove jobs individually and also serialize your workflow. In
>>>> fact, we also use GridEngine locally on workstations for this purpose.
>>>> 
>>>> -- Reuti
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>> 
>> 
>> 
>> 

