Hmm, OK. I'll have to look at it this weekend when I return from travel.
Can you please send me your test program so I can try to locally reproduce
it?



On Thu, Oct 15, 2015 at 3:42 PM, Mark Santcroos <mark.santcr...@rutgers.edu>
wrote:

>
> > On 16 Oct 2015, at 0:23 , Ralph Castain <r...@open-mpi.org> wrote:
> > Okay, that means that the dvm isn't recognizing that the jobs actually
> completed.
>
> Ok.
>
> > So the question is: what is it about those jobs?
>
> They are all the same.
>
> > Are those 6 jobs very short-lived, and the others are longer-lived?
>
> All very short-lived, as that's the easiest way to reproduce it.
>
> > If you look at the nodes (before you kill the dvm), are any of those
> procs still there?
>
> I originally ran into this on a large machine, but I can reproduce it easily
> on my laptop, so the results I've been sending in the last messages are from
> runs on my laptop.
>
> The stalled orte-submits are still there, obviously, but the actual task
> process is no longer active.
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/10/18186.php
>
