All things are possible, including what you describe. Not sure when we would get to it, though.
On Oct 24, 2012, at 4:01 AM, Nicolas Deladerriere <nicolas.deladerri...@gmail.com> wrote:

> Reuti,
>
> The problem I am facing is a small part of our production system, and
> I cannot modify our mpirun submission system. This is why I am looking
> for a solution using only the ompi-clean or mpirun command options.
>
> Thanks,
> Nicolas
>
> 2012/10/24, Reuti <re...@staff.uni-marburg.de>:
>> On 24.10.2012 at 11:33, Nicolas Deladerriere wrote:
>>
>>> Reuti,
>>>
>>> Thanks for your comments.
>>>
>>> In our case, we are currently running different mpirun commands on
>>> clusters sharing the same frontend. Basically we use a wrapper to
>>> run the mpirun command and then an ompi-clean command to clean up
>>> the MPI job if required. Using ompi-clean like this kills all the
>>> other MPI jobs running on the same frontend. I cannot use a queuing
>>> system
>>
>> Why? Using it on a single machine was only one possible setup. Its
>> purpose is to distribute jobs to slave hosts. If you already have one
>> frontend as the login machine, it fits perfectly: the qmaster (in the
>> case of SGE) can run there and the execd on the nodes.
>>
>> -- Reuti
>>
>>> as you have suggested, which is why I was wondering about an option
>>> or some other solution associated with the ompi-clean command to
>>> avoid this blanket cleaning of MPI jobs.
>>>
>>> Cheers,
>>> Nicolas
>>>
>>> 2012/10/24, Reuti <re...@staff.uni-marburg.de>:
>>>> Hi,
>>>>
>>>> On 24.10.2012 at 09:36, Nicolas Deladerriere wrote:
>>>>
>>>>> I am having an issue with ompi-clean: it cleans up the session
>>>>> associated with a user (this is normal), which means it kills all
>>>>> running jobs associated with that session (this is also normal).
>>>>> But I would like to be able to clean up the session associated
>>>>> with a job (not a user).
>>>>>
>>>>> Here is my point. I am running two executables:
>>>>>
>>>>> % mpirun -np 2 myexec1
>>>>> --> runs with PID 2399 ...
>>>>> % mpirun -np 2 myexec2
>>>>> --> runs with PID 2402 ...
>>>>>
>>>>> When I run orte-clean, I get this result:
>>>>>
>>>>> % orte-clean -v
>>>>> orte-clean: cleaning session dir tree openmpi-sessions-ndelader@myhost_0
>>>>> orte-clean: killing any lingering procs
>>>>> orte-clean: found potential rogue orterun process
>>>>> (pid=2399,user=ndelader), sending SIGKILL...
>>>>> orte-clean: found potential rogue orterun process
>>>>> (pid=2402,user=ndelader), sending SIGKILL...
>>>>>
>>>>> This means that both jobs have been killed :-(
>>>>> Basically, I would like to run orte-clean with an executable name,
>>>>> a PID, or whatever else identifies which job I want to stop and
>>>>> clean. It seems I would need to create one Open MPI session per
>>>>> job. Does that make sense? I would like to be able to run
>>>>> something like the following command and get the following result:
>>>>>
>>>>> % orte-clean -v myexec1
>>>>> orte-clean: cleaning session dir tree openmpi-sessions-ndelader@myhost_0
>>>>> orte-clean: killing any lingering procs
>>>>> orte-clean: found potential rogue orterun process
>>>>> (pid=2399,user=ndelader), sending SIGKILL...
>>>>>
>>>>> Is there a way to perform this kind of selection in the cleaning
>>>>> process?
>>>>
>>>> How many jobs are you starting on how many nodes at one time? This
>>>> requirement could be a reason to start using a queuing system,
>>>> where you can remove jobs individually and also serialize your
>>>> workflow. In fact, we also use GridEngine locally on workstations
>>>> for this purpose.
>>>>
>>>> -- Reuti
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
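Pending a per-job option in orte-clean itself, the selection described above can be approximated with a small wrapper around standard process tools. This is a minimal sketch, not part of Open MPI: the `clean_job` name and the command-line match pattern are hypothetical, and it assumes `pgrep` (procps) is available on the frontend. It relies on the fact that Open MPI's mpirun forwards a SIGTERM to the ranks it launched, so killing one specific mpirun tears down only that job.

```shell
# Hypothetical per-job cleanup sketch (not an orte-clean feature):
# kill only the mpirun that launched a given executable, rather than
# every orterun process the user owns.
clean_job() {
    exe="$1"
    # refuse to run without a target, to avoid matching everything
    [ -n "$exe" ] || { echo "usage: clean_job <executable>" >&2; return 1; }
    # match this user's mpirun processes whose command line mentions
    # the target executable (pattern is a heuristic, adjust as needed)
    for pid in $(pgrep -u "$(id -un)" -f "mpirun.*$exe"); do
        echo "sending SIGTERM to mpirun pid=$pid ($exe)"
        kill -TERM "$pid"   # mpirun forwards the signal to its ranks
    done
}
```

With the two jobs from the example above, `clean_job myexec1` would signal only the mpirun with PID 2399 and leave myexec2 running; orte-clean can still be run later to sweep up any leftover session directories once no jobs remain.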