On 24.10.2012 at 11:33, Nicolas Deladerriere wrote:

> Reuti,
>
> Thanks for your comments,
>
> In our case, we are currently running different mpirun commands on
> clusters sharing the same frontend. Basically we use a wrapper to run
> the mpirun command and to run an ompi-clean command to clean up the
> mpi job if required.
> Using ompi-clean like this just kills all other mpi jobs running on
> the same frontend. I cannot use a queuing system

Why? Using it on a single machine was only one possible setup. Its purpose is to distribute jobs to slave hosts. If you already have one frontend as a login machine, it fits perfectly: the qmaster (in the case of SGE) can run there and the execd on the nodes.

-- Reuti

> as you have suggested; this
> is why I was wondering about an option or other solution associated with
> the ompi-clean command to avoid this general cleaning of mpi jobs.
>
> Cheers
> Nicolas
>
> 2012/10/24, Reuti <re...@staff.uni-marburg.de>:
>> Hi,
>>
>> On 24.10.2012 at 09:36, Nicolas Deladerriere wrote:
>>
>>> I am having an issue running ompi-clean, which cleans up (this is normal)
>>> the session associated with a user, which means it kills all running jobs
>>> associated with this session (this is also normal). But I would like to be
>>> able to clean up the session associated with a job (and not a user).
>>>
>>> Here is my point:
>>>
>>> I am running two executables:
>>>
>>> % mpirun -np 2 myexec1
>>> --> run with PID 2399 ...
>>> % mpirun -np 2 myexec2
>>> --> run with PID 2402 ...
>>>
>>> When I run orte-clean I get this result:
>>> % orte-clean -v
>>> orte-clean: cleaning session dir tree openmpi-sessions-ndelader@myhost_0
>>> orte-clean: killing any lingering procs
>>> orte-clean: found potential rogue orterun process
>>> (pid=2399,user=ndelader), sending SIGKILL...
>>> orte-clean: found potential rogue orterun process
>>> (pid=2402,user=ndelader), sending SIGKILL...
>>>
>>> Which means that both jobs have been killed :-(
>>> Basically I would like to perform orte-clean using the executable name or PID
>>> or whatever identifies which job I want to stop and clean. It seems I
>>> would need to create an openmpi session per job. Does it make sense ?
>>> And I would like to be able to do something like the following command
>>> and get the following result:
>>>
>>> % orte-clean -v myexec1
>>> orte-clean: cleaning session dir tree openmpi-sessions-ndelader@myhost_0
>>> orte-clean: killing any lingering procs
>>> orte-clean: found potential rogue orterun process
>>> (pid=2399,user=ndelader), sending SIGKILL...
>>>
>>>
>>> Does it make sense ? Is there a way to perform this kind of selection in
>>> the cleaning process ?
>>
>> How many jobs are you starting on how many nodes at one time? This
>> requirement could be a reason to start using a queuing system, where you can
>> remove jobs individually and also serialize your workflow. In fact: we use
>> GridEngine also locally on workstations for this purpose.
>>
>> -- Reuti
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
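
For the wrapper setup described in this thread, one possible workaround that avoids orte-clean altogether is to have the wrapper remember the PID of the mpirun it started and signal only that process. The sketch below is a minimal illustration, not an Open MPI feature: start_job, stop_job and job_pid are hypothetical names, and it assumes that mpirun, on receiving SIGTERM, shuts down the processes of its own job (worth verifying against the mpirun man page of the installed version). It also assumes the job is stopped from the same shell that started it.

```shell
#!/bin/sh
# Hypothetical wrapper helpers; these names are illustrative only.

# Start a command in the background and remember its PID in job_pid.
start_job() {
    "$@" &
    job_pid=$!
}

# Send SIGTERM to one recorded PID only, then reap it.
# Other jobs started by the same user are left untouched.
stop_job() {
    kill -TERM "$1" 2>/dev/null
    # wait reports the signal as a non-zero status; ignore it here.
    wait "$1" 2>/dev/null || true
}
```

In a wrapper this might be used as `start_job mpirun -np 2 myexec1; pid1=$job_pid` and later `stop_job "$pid1"`, which leaves a second mpirun started the same way running.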