Ok, I'll give it a try. Thanks, nick
On Thu, Dec 17, 2009 at 12:44, Ralph Castain <r...@open-mpi.org> wrote:

> In case you missed it, this patch should be in the 1.4 nightly tarballs -
> feel free to test and let me know what you find.
>
> Thanks
> Ralph
>
> On Dec 2, 2009, at 10:06 PM, Nicolas Bock wrote:
>
> That was quick. I will try the patch as soon as you release it.
>
> nick
>
>
> On Wed, Dec 2, 2009 at 21:06, Ralph Castain <r...@open-mpi.org> wrote:
>
>> Patch is built and under review...
>>
>> Thanks again
>> Ralph
>>
>> On Dec 2, 2009, at 5:37 PM, Nicolas Bock wrote:
>>
>> Thanks
>>
>> On Wed, Dec 2, 2009 at 17:04, Ralph Castain <r...@open-mpi.org> wrote:
>>
>>> Yeah, that's the one all right! Definitely missing from 1.3.x.
>>>
>>> Thanks - I'll build a patch for the next bug-fix release
>>>
>>>
>>> On Dec 2, 2009, at 4:37 PM, Abhishek Kulkarni wrote:
>>>
>>> > On Wed, Dec 2, 2009 at 5:00 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>> >> Indeed - that is very helpful! Thanks!
>>> >> Looks like we aren't cleaning up high enough - missing the directory level.
>>> >> I seem to recall seeing that error go by and that someone fixed it on our
>>> >> devel trunk, so this is likely a repair that didn't get moved over to the
>>> >> release branch as it should have done.
>>> >> I'll look into it and report back.
>>> >
>>> > You are probably referring to
>>> > https://svn.open-mpi.org/trac/ompi/changeset/21498
>>> >
>>> > There was an issue about orte_session_dir_finalize() not
>>> > cleaning up the session directories properly.
>>> >
>>> > Hope that helps.
>>> >
>>> > Abhishek
>>> >
>>> >> Thanks again
>>> >> Ralph
>>> >> On Dec 2, 2009, at 2:45 PM, Nicolas Bock wrote:
>>> >>
>>> >>
>>> >> On Wed, Dec 2, 2009 at 14:23, Ralph Castain <r...@open-mpi.org> wrote:
>>> >>>
>>> >>> Hmm....if you are willing to keep trying, could you perhaps let it run for
>>> >>> a brief time, ctrl-z it, and then do an ls on a directory from a process
>>> >>> that has already terminated? The pids will be in order, so just look for an
>>> >>> early number (not mpirun or the parent, of course).
>>> >>> It would help if you could give us the contents of a directory from a
>>> >>> child process that has terminated - would tell us what subsystem is failing
>>> >>> to properly cleanup.
>>> >>
>>> >> Ok, so I Ctrl-Z the master. In
>>> >> /tmp/.private/nbock/openmpi-sessions-nbock@mujo_0 I now have only one
>>> >> directory
>>> >>
>>> >> /tmp/.private/nbock/openmpi-sessions-nbock@mujo_0/857
>>> >>
>>> >> I can't find that PID though. mpirun has PID 4230, orted does not exist,
>>> >> master is 4231, and slave is 4275. When I "fg" master and Ctrl-Z it again,
>>> >> slave has a different PID as expected. I Ctrl-Z'ed in iteration 68, there
>>> >> are 70 sequentially numbered directories starting at 0. Every directory
>>> >> contains another directory called "0". There is nothing in any of those
>>> >> directories. I see for instance:
>>> >>
>>> >> /tmp/.private/nbock/openmpi-sessions-nbock@mujo_0/857 $ ls -lh 70
>>> >> total 4.0K
>>> >> drwx------ 2 nbock users 4.0K Dec 2 14:41 0
>>> >>
>>> >> and
>>> >>
>>> >> nbock@mujo /tmp/.private/nbock/openmpi-sessions-nbock@mujo_0/857 $ ls -lh 70/0/
>>> >> total 0
>>> >>
>>> >> I hope this information helps. Did I understand your question correctly?
>>> >>
>>> >> nick
>>> >>
>>> >> _______________________________________________
>>> >> users mailing list
>>> >> us...@open-mpi.org
>>> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
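
For anyone repeating the check described in the quoted thread, here is a minimal sketch in Python of the same inspection: it lists the per-process directories under a job directory that contain no files at all, which is the leftover pattern reported above. The function name `leftover_dirs` is made up for illustration and is not part of Open MPI, and the directory layout is only assumed from the `ls` output in the thread.

```python
import os


def leftover_dirs(job_dir):
    """Return names of subdirectories of job_dir whose subtree holds no files.

    Hypothetical helper: it assumes the layout seen in the thread, where each
    terminated process leaves behind <job_dir>/<number>/0 with nothing inside.
    """
    leftovers = []
    for entry in os.scandir(job_dir):
        # Only directories are of interest; do not follow symlinks.
        if not entry.is_dir(follow_symlinks=False):
            continue
        # os.walk yields (path, dirnames, filenames) for every level below.
        has_files = any(files for _path, _dirs, files in os.walk(entry.path))
        if not has_files:
            leftovers.append(entry.name)
    # Numeric names first, in numeric order, then anything else by name.
    return sorted(
        leftovers,
        key=lambda n: (not n.isdecimal(), int(n) if n.isdecimal() else 0, n),
    )
```

Calling `leftover_dirs("/tmp/.private/nbock/openmpi-sessions-nbock@mujo_0/857")` while the job is suspended would return the numbered directories that were not cleaned up. A count that grows with each iteration matches the behaviour discussed in the thread.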