Hi,

On 24.10.2012 at 09:36, Nicolas Deladerriere wrote:

> I am having an issue running ompi-clean, which cleans up the session
> associated with a user (this is normal), which means it kills all running
> jobs associated with this session (this is also normal). But I would like
> to be able to clean up the session associated with a job (not a user).
>
> Here is my point:
>
> I am running two executables:
>
> % mpirun -np 2 myexec1
>       --> run with PID 2399 ...
> % mpirun -np 2 myexec2
>       --> run with PID 2402 ...
>
> When I run orte-clean I get this result:
> % orte-clean -v
> orte-clean: cleaning session dir tree openmpi-sessions-ndelader@myhost_0
> orte-clean: killing any lingering procs
> orte-clean: found potential rogue orterun process (pid=2399,user=ndelader),
> sending SIGKILL...
> orte-clean: found potential rogue orterun process (pid=2402,user=ndelader),
> sending SIGKILL...
>
> Which means that both jobs have been killed :-(
> Basically I would like to run orte-clean using the executable name or PID
> or whatever identifies which job I want to stop and clean. It seems I would
> need to create an openmpi session per job. Does it make sense? And I would
> like to be able to run something like the following command and get the
> following result:
>
> % orte-clean -v myexec1
> orte-clean: cleaning session dir tree openmpi-sessions-ndelader@myhost_0
> orte-clean: killing any lingering procs
> orte-clean: found potential rogue orterun process (pid=2399,user=ndelader),
> sending SIGKILL...
>
>
> Does it make sense? Is there a way to perform this kind of selection in
> the cleaning process?

How many jobs are you starting on how many nodes at one time?

This requirement could be a good reason to start using a queuing system, where you can remove jobs individually and also serialize your workflow. In fact, we also use GridEngine locally on workstations for this purpose.

-- Reuti
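
Without a queuing system, a rough stopgap for the per-job selection asked about above is to pick out the one launcher by executable name and signal only that one. The sketch below is not an orte-clean feature; the function names `launcher_pids` and `stop_job` are hypothetical. It assumes the executable name appears as an argument on the mpirun/orterun command line, and that the launcher reacts to SIGTERM by terminating its own ranks, as the mpirun man page describes. Whether the session directory ends up fully cleaned afterwards is not something this sketch guarantees.

```shell
#!/bin/sh
# Hypothetical helper: read "PID COMMAND ARGS..." lines on stdin and print
# the PIDs of mpirun/orterun launchers that were started with the given
# executable name as one of their arguments.
launcher_pids() {
    awk -v exe="$1" '{
        # Keep only lines whose command is mpirun or orterun (path stripped).
        n = split($2, cmd, "/")
        if (cmd[n] != "mpirun" && cmd[n] != "orterun") next
        # Compare the basename of each argument with the requested name.
        for (i = 3; i <= NF; i++) {
            m = split($i, arg, "/")
            if (arg[m] == exe) { print $1; break }
        }
    }'
}

# Hypothetical wrapper (assumption: SIGTERM is enough for the launcher to
# shut down its own job). Defined here but not called.
stop_job() {
    for pid in $(ps -u "$(id -un)" -o pid= -o args= | launcher_pids "$1"); do
        kill -TERM "$pid"
    done
}
```

With the two jobs from the example, `stop_job myexec1` would signal only the launcher with PID 2399 and leave 2402 running. Matching on the executable name is ambiguous if the same binary is started twice, which is one reason job IDs from a queuing system are the more robust handle.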