...but patches would be greatly appreciated.  :-)

On Oct 24, 2012, at 12:24 PM, Ralph Castain wrote:

> All things are possible, including what you describe. Not sure when we would
> get to it, though.
>
>
> On Oct 24, 2012, at 4:01 AM, Nicolas Deladerriere
> <nicolas.deladerri...@gmail.com> wrote:
>
>> Reuti,
>>
>> The problem I am facing is a small part of our production
>> system, and I cannot modify our mpirun submission system. This is why
>> I am looking for a solution using only ompi-clean or the mpirun command
>> specification.
>>
>> Thanks,
>> Nicolas
>>
>> 2012/10/24, Reuti <re...@staff.uni-marburg.de>:
>>> On 24.10.2012 at 11:33, Nicolas Deladerriere wrote:
>>>
>>>> Reuti,
>>>>
>>>> Thanks for your comments,
>>>>
>>>> In our case, we are currently running different mpirun commands on
>>>> clusters sharing the same frontend. Basically we use a wrapper to run
>>>> the mpirun command and to run an ompi-clean command to clean up the
>>>> mpi job if required.
>>>> Using ompi-clean like this just kills all other mpi jobs running on
>>>> the same frontend. I cannot use a queuing system
>>>
>>> Why? Using it on a single machine was only one possible setup. Its purpose
>>> is to distribute jobs to slave hosts. If you already have one frontend as
>>> login machine it fits perfectly: the qmaster (in case of SGE) can run there
>>> and the execd on the nodes.
>>>
>>> -- Reuti
>>>
>>>
>>>> as you have suggested; this
>>>> is why I was wondering about an option or other solution associated with
>>>> the ompi-clean command to avoid this general cleaning of mpi jobs.
>>>>
>>>> Cheers
>>>> Nicolas
>>>>
>>>> 2012/10/24, Reuti <re...@staff.uni-marburg.de>:
>>>>> Hi,
>>>>>
>>>>> On 24.10.2012 at 09:36, Nicolas Deladerriere wrote:
>>>>>
>>>>>> I am having an issue running ompi-clean, which cleans up (this is normal)
>>>>>> the session associated with a user, which means it kills all running jobs
>>>>>> associated with this session (this is also normal). But I would like to
>>>>>> be able to clean up the session associated with a job (and not a user).
>>>>>>
>>>>>> Here is my point:
>>>>>>
>>>>>> I am running two executables:
>>>>>>
>>>>>> % mpirun -np 2 myexec1
>>>>>>       --> run with PID 2399 ...
>>>>>> % mpirun -np 2 myexec2
>>>>>>       --> run with PID 2402 ...
>>>>>>
>>>>>> When I run orte-clean I get this result:
>>>>>> % orte-clean -v
>>>>>> orte-clean: cleaning session dir tree
>>>>>> openmpi-sessions-ndelader@myhost_0
>>>>>> orte-clean: killing any lingering procs
>>>>>> orte-clean: found potential rogue orterun process
>>>>>> (pid=2399,user=ndelader), sending SIGKILL...
>>>>>> orte-clean: found potential rogue orterun process
>>>>>> (pid=2402,user=ndelader), sending SIGKILL...
>>>>>>
>>>>>> Which means that both jobs have been killed :-(
>>>>>> Basically I would like to perform orte-clean using the executable name,
>>>>>> PID, or whatever identifies which job I want to stop and clean. It seems I
>>>>>> would need to create an openmpi session per job. Does it make sense?
>>>>>> And I would like to be able to do something like the following command
>>>>>> and get the following result:
>>>>>>
>>>>>> % orte-clean -v myexec1
>>>>>> orte-clean: cleaning session dir tree
>>>>>> openmpi-sessions-ndelader@myhost_0
>>>>>> orte-clean: killing any lingering procs
>>>>>> orte-clean: found potential rogue orterun process
>>>>>> (pid=2399,user=ndelader), sending SIGKILL...
>>>>>>
>>>>>>
>>>>>> Does it make sense? Is there a way to perform this kind of selection
>>>>>> in the cleaning process?
>>>>>
>>>>> How many jobs are you starting on how many nodes at one time? This
>>>>> requirement could be a reason to start using a queuing system, where you
>>>>> can remove jobs individually and also serialize your workflow. In fact,
>>>>> we also use GridEngine locally on workstations for this purpose.
>>>>>
>>>>> -- Reuti
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> us...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
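
For the wrapper setup described in the quoted thread, one possible workaround is to have the wrapper remember the mpirun it started and signal only that one, instead of calling orte-clean for the whole user. The sketch below is not an Open MPI feature and uses no orte-clean option. launch_job and stop_job are hypothetical helper names. It assumes a POSIX system, and it assumes that mpirun shuts down its own job when it receives SIGTERM. It does not remove the session directory tree.

```python
import os
import signal
import subprocess


def launch_job(cmd):
    """Start cmd (e.g. an mpirun command line) in its own session.

    start_new_session=True makes the child a process-group leader, so the
    job can later be signalled without touching other jobs of the same user.
    """
    return subprocess.Popen(cmd, start_new_session=True)


def stop_job(proc, grace=5.0):
    """Stop only the job behind proc and return its exit status.

    Sends SIGTERM to the job's local process group, waits up to `grace`
    seconds, then falls back to SIGKILL. Assumption: the launcher (mpirun)
    takes care of its remote ranks when it receives SIGTERM.
    """
    if proc.poll() is not None:
        return proc.returncode  # job already finished, nothing to do
    try:
        pgid = os.getpgid(proc.pid)
        os.killpg(pgid, signal.SIGTERM)
    except ProcessLookupError:
        return proc.wait()  # exited between poll() and the signal
    try:
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        try:
            os.killpg(pgid, signal.SIGKILL)  # last resort, like orte-clean
        except ProcessLookupError:
            pass
        return proc.wait()
```

With the two jobs from the example above, the wrapper would call job1 = launch_job(["mpirun", "-np", "2", "myexec1"]) and later stop_job(job1), leaving the myexec2 job untouched. Signalling the process group rather than the single PID is a design choice: it also catches local children that mpirun forked directly.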