Jeff,
Can you please provide more information about your HCA type (ibv_devinfo -v)?
Do you see this error immediately during startup, or do you get it during your run?

Thanks,
Pasha

Jeff Layton wrote:
Evening everyone,

I'm running a CFD code on IB and I've encountered an error I'm not sure about and I'm looking for some guidance on where to start looking. Here's the error:

mlx4: local QP operation err (QPN 260092, WQE index 9a9e0000, vendor syndrome 6f, opcode = 5e)
[0,1,6][btl_openib_component.c:1392:btl_openib_component_progress] from compute-2-0.local to: compute-2-0.local error polling HP CQ with status LOCAL QP OPERATION ERROR status number 2 for wr_id 37742320 opcode 0
mpirun noticed that job rank 0 with PID 21220 on node compute-2-0.local exited on signal 15 (Terminated).
78 additional processes aborted (not shown)


This is openmpi-1.2.9rc2 (sorry - need to upgrade to 1.3.0). The code works correctly for smaller cases, but when I run larger cases I get this error.

I'm heading to bed but I'll check email tomorrow (sorry to sleep and run, but it's been a long day).

TIA!

Jeff

