Hmmm... ok. I'll have to look at it this weekend when I return from travel. Can you please send me your test program so I can try to reproduce it locally?
On Thu, Oct 15, 2015 at 3:42 PM, Mark Santcroos <mark.santcr...@rutgers.edu> wrote:

> > On 16 Oct 2015, at 0:23 , Ralph Castain <r...@open-mpi.org> wrote:
> >
> > Okay, that means that the dvm isn't recognizing that the jobs actually completed.
>
> Ok.
>
> > So the question is: what is it about those jobs?
>
> They are all the same.
>
> > Are those 6 jobs very short-lived, and the others are longer-lived?
>
> All very short-lived, as that's the easiest way to reproduce it.
>
> > If you look at the nodes (before you kill the dvm), are any of those procs still there?
>
> I originally ran into this on a large machine, but can reproduce it easily on my laptop, so the results I've been sending in the last messages are from runs on my laptop.
>
> The stalled orte-submits are still there obviously, but the actual task process is no longer active.