On Aug 9, 2012, at 2:38 PM, Andrew Connolly wrote:

>  
> I see that ipython has the ability to submit jobs into a batch system. Is 
> this feature how you were submitting jobs into Condor?
> 
> I did not use ipython to submit the jobs; I escaped to the shell from within 
> ipython with a leading "!" before the call to condor, i.e.: "@ipython_prompt[0]: 
> !condor_submit my_submit_file.condor"
>  
> Were all of the jobs running on your local machine or on multiple machines in 
> a cluster?
> The jobs were running on multiple machines in a cluster, and as far as I know 
> none of them were killed.
>  
> Or, more technically, do you know what Condor universe the jobs were running 
> under?
> 
> I always use universe  = vanilla in my condor submit files.
>  
> I suspect that when you removed the jobs, Condor killed the processes that it 
> spawned, but didn't know about some child processes they had in turn spawned. 
> Condor uses several techniques to track the child processes of jobs, none of 
> which are perfect.
>  
> That would make sense, but I am fairly certain that my processes were not 
> spawning others, although I can check this.
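
For reference, the submit-and-remove workflow described above amounts to 
something like this from within ipython (the prompt numbers, submit-file name, 
and cluster id are just placeholders):

    In [1]: !condor_submit my_submit_file.condor
    In [2]: !condor_rm <cluster_id>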


Two suggestions to help diagnose the problem:

1) Submit some simple sleep jobs, remove them, and check whether Condor kills 
all of them (a sketch of such a submit file follows below).

2) Submit some of your real jobs and use pstree and friends to see what processes 
they create when Condor runs them (see the second sketch below).
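
For (1), a throwaway submit file along these lines should do (the file names, 
sleep duration, and job count are arbitrary):

    universe   = vanilla
    executable = /bin/sleep
    arguments  = 600
    log        = sleep.log
    output     = sleep.$(Process).out
    error      = sleep.$(Process).err
    queue 10

Submit it, condor_rm the cluster, and then check condor_q and the execute 
machines for leftover sleep processes.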

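For (2), once one of your real jobs is running, you can log into the execute 
machine and look at the process tree under the condor_starter managing it, 
e.g. (finding the starter's pid via pgrep is just one way to do it):

    pgrep -f condor_starter      # find the starter pid(s)
    pstree -a -p <starter_pid>   # show everything the job has spawned

Any processes still alive after a condor_rm are the ones Condor lost track of.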
Thanks and regards,
Jaime Frey
UW-Madison Condor Team
