On May 5, 2010, at 7:54 PM, Douglas Guptill wrote:

> P.S. Yes, I know OpenMPI 1.2.8 is old. We have been using it for 2
> years with no apparent problems.

It ain't broke; don't fix it -- nothing wrong with that.

> When I saw comments like "machine hung" for 1.4.1,

FWIW, I find it hard to believe that Open MPI is the cause of machine hangs. Open MPI is user-level process stuff, which should generally not be able to crash Linux. If user-level processes can hang Linux, then something else is probably broken.

But also FWIW, we have found various MPI benchmarks and test applications can be *excellent* at finding underlying server / network problems. I can't think of a case offhand where Open MPI "caused" a machine to hang/crash/die/whatever that wasn't ultimately tracked down to some other root cause.

> and "data loss" for 1.3.x, I put aside thoughts of upgrading.

We definitely did have a big problem with OpenFabrics registered memory in Open MPI 1.3.0 and 1.3.1 (corrected in 1.3.2). Shame on us. :-(

But to continue the "FWIW" from above: we actually do *millions* of regression tests before Open MPI is released -- literally. All of us were convinced that 1.3.0 and 1.3.1 were ok to release; the data corruption issues caught us by surprise. Yuck. Those kinds of bugs are the worst. :-(

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/