Hello,
It appears that something broke the trunk's openib support for 32-bit
builds sometime after r16777 and no later than r16799.
The 64-bit tests all seem normal, as do the 32-bit and 64-bit tests on
the 1.2 branch on the same machine (odin).

See this MTT results page permalink showing the 32-bit odin runs:
http://www.open-mpi.org/mtt/index.php?do_redir=468

Pasha & Gleb, you both made a variety of checkins in that svn revision range.
Does either of you have time to investigate this?

Here is a snippet from one randomly picked failed test (out of thousands):
[1,1][btl_openib_component.c:1665:btl_openib_module_progress] from
odin001 to: odin001 error
polling LP CQ with status LOCAL PROTOCOL ERROR status number 4 for
wr_id 141733120 opcode 128
qp_idx 3
--------------------------------------------------------------------------
mpirun has exited due to process rank 1 with PID 29761 on
node odin001 calling "abort". This will have caused other processes
in the application to be terminated by signals sent by mpirun
(as reported here).
--------------------------------------------------------------------------

Thanks, and happy bug hunting!
-- 
Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
 tmat...@gmail.com || timat...@open-mpi.org
    I'm a bright... http://www.the-brights.net/
