Matt Hughes wrote:
2009/2/26 Brett Pemberton <br...@vpac.org>:
[[1176,1],0][btl_openib_component.c:2905:handle_wc] from tango092.vpac.org
to: tango090 error polling LP CQ with status RETRY EXCEEDED ERROR status
number 12 for wr_id 38996224 opcode 0 qp_idx 0
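
For reference, "status number 12" corresponds to IBV_WC_RETRY_EXC_ERR in libibverbs' enum ibv_wc_status. When collecting these messages from many nodes, it can help to pull the fields out programmatically. The following is a minimal Python sketch; the function name parse_openib_error and the regular expression are illustrative assumptions derived only from the single message quoted above, not from any Open MPI format specification.

```python
import re

# Subset of libibverbs' enum ibv_wc_status, keyed by numeric value.
WC_STATUS = {
    0: "IBV_WC_SUCCESS",
    5: "IBV_WC_WR_FLUSH_ERR",
    12: "IBV_WC_RETRY_EXC_ERR",
    13: "IBV_WC_RNR_RETRY_EXC_ERR",
}

# Pattern inferred from the one message quoted in this thread (assumption).
LINE_RE = re.compile(
    r"from (?P<src>\S+) to: (?P<dst>\S+) error polling (?P<cq>LP|HP) CQ "
    r"with status (?P<text>.+?) status number (?P<num>\d+) "
    r"for wr_id (?P<wr_id>\d+) opcode (?P<opcode>\d+)"
)

def parse_openib_error(message):
    """Return the fields of an openib completion error, or None if no match."""
    # Collapse mail-client line wrapping into single spaces before matching.
    flat = " ".join(message.split())
    m = LINE_RE.search(flat)
    if m is None:
        return None
    num = int(m.group("num"))
    return {
        "source": m.group("src"),
        "peer": m.group("dst"),
        "status_text": m.group("text"),
        "status_number": num,
        "status_name": WC_STATUS.get(num, "unknown"),
        "wr_id": int(m.group("wr_id")),
    }

SAMPLE = """[[1176,1],0][btl_openib_component.c:2905:handle_wc] from tango092.vpac.org
to: tango090 error polling LP CQ with status RETRY EXCEEDED ERROR status
number 12 for wr_id 38996224 opcode 0 qp_idx 0"""

info = parse_openib_error(SAMPLE)
```

Counting failures by (source, peer) pair across a job's output is one way to see whether the retries cluster on a particular node or link.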

What OS are you using?

CentOS 5

I've seen this error and many other InfiniBand-related
errors on Red Hat Enterprise Linux 4 Update 4, with ConnectX
cards and various versions of OFED, up to version 1.3.  Depending on
the MCA parameters, I also see hangs often enough to make native
InfiniBand unusable on this OS.


I'd appreciate some advice on whether I'm using OFED correctly.

I'm running OFED 1.4, but only the userland packages, not the kernel modules.
Is this a bad idea?

Basically, I recompile the OFED src.rpms for:

dapl, libibcm, libibcommon, libibmad, libibumad, libibverbs, libmthca, librdmacm, libsdp, mstflint

And install them onto CentOS, upgrading the in-distro versions.
Should I also be compiling ofa_kernel?
Could this be causing problems?

As explained off-list, I'm running the most recent firmware for my cards, although the release is quite old:

hca_id: mthca0
        fw_ver:                         1.2.0
        node_guid:                      0002:c902:0024:3c6c
        sys_image_guid:                 0002:c902:0024:3c6f
        vendor_id:                      0x02c9
        vendor_part_id:                 25204
        hw_ver:                         0xA0
        board_id:                       MT_03B0140001
        phys_port_cnt:                  1
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 1
                        port_lid:               34
                        port_lmc:               0x00
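
To compare firmware versions and port state across nodes, output like the above can be parsed with a short script. This is a minimal Python sketch that assumes the input has the same "key: value" layout as the listing in this message; the function name parse_devinfo and the returned dictionary layout are illustrative choices, not part of any OFED tool.

```python
def parse_devinfo(text):
    """Parse device listing text into {hca_id: {"attrs": {...}, "ports": {n: {...}}}}."""
    devices = {}
    device = None
    target = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or ":" not in line:
            continue
        # Split on the first colon only, so GUID values keep their colons.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "hca_id":
            device = {"attrs": {}, "ports": {}}
            devices[value] = device
            target = device["attrs"]
        elif device is None:
            continue  # ignore anything before the first hca_id line
        elif key == "port":
            # Subsequent keys belong to this port until the next port or HCA.
            target = device["ports"].setdefault(int(value), {})
        else:
            target[key] = value
    return devices

SAMPLE = """hca_id: mthca0
        fw_ver:                         1.2.0
        node_guid:                      0002:c902:0024:3c6c
        vendor_part_id:                 25204
        phys_port_cnt:                  1
                port:   1
                        state:                  PORT_ACTIVE (4)
                        active_mtu:             2048 (4)
                        port_lid:               34
"""

devices = parse_devinfo(SAMPLE)
for name, dev in devices.items():
    print(name, dev["attrs"]["fw_ver"], dev["ports"][1]["state"])
```

Running this over the output gathered from every node makes mismatched firmware or a port that is not PORT_ACTIVE easy to spot.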

cheers,

        / Brett

--
Brett Pemberton - VPAC Senior Systems Administrator
http://www.vpac.org/ - (03) 9925 4899
