Re: [OMPI devel] This is why we test

2009-01-16 Thread Jeff Squyres
We fixed the openib segv, but I forgot to follow up about the timeouts that I mentioned in my original mail. The timeouts were from poorly configured spawn tests. That is, I had 8 cores in the job and ran the spawn test on all 8 cores (all aggressively polling). The spawn test then spawned

Re: [OMPI devel] This is why we test

2009-01-15 Thread Jeff Squyres
Pasha and I *think* we have a fix. However, we're not quite clear on this part of the code, so we need some more testing and eyes on the code. I'll start the tests now -- given that this is a low-frequency bug, I'm going to run a slightly larger MTT run (several thousand tests) that'll t

[OMPI devel] This is why we test

2009-01-15 Thread Jeff Squyres
Unfortunately, I have to throw the flag in the v1.3 release. :-( I ran ~16k tests via MTT yesterday on the rc5 and rc6 tarballs. I found the following:
Found test runs: 15962
Passed: 15785 (98.89%)
Failed: 83 (0.52%)
--> Openib failures: 80 (0.50%)
Skipped: 46 (0.29%)
Timedout: 48 (0.30%)
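
The percentages in the MTT summary follow directly from the raw counts. As a minimal sketch for cross-checking them (the names `counts`, `total`, and `percent` are illustrative and not part of MTT; the numbers are copied from the summary):

```python
# Counts copied from the MTT summary; the variable names are illustrative only.
counts = {
    "Passed": 15785,
    "Failed": 83,
    "Skipped": 46,
    "Timedout": 48,
}
total = 15962        # "Found test runs"
openib_failures = 80  # subset of the failed runs


def percent(n, total):
    """Return n as a percentage of total, rounded to two decimals."""
    return round(100.0 * n / total, 2)


# The four top-level categories should account for every test run.
assert sum(counts.values()) == total

for name, n in counts.items():
    print(f"{name}: {n} ({percent(n, total):.2f}%)")

# Openib failures are reported relative to all runs, not just the failures.
print(f"Openib failures: {openib_failures} ({percent(openib_failures, total):.2f}%)")
```

The four categories sum to the total, and the openib failures account for 80 of the 83 failed runs.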