On 2008-01-22 01:49, Eli Cohen wrote:

I am sending two patches, one for userspace and one for kernel space
which solves this issue.


Thanks for the patches:
http://lists.openfabrics.org/pipermail/general/2008-January/045259.html
http://lists.openfabrics.org/pipermail/general/2008-January/045260.html

They "fix" the test program I sent to the list earlier. It ran for many hours. Unfortunately, they did not fix my convoluted software.

I applied the user space patch to libmthca-1.0.4 from OFED-1.2.5.4, and the kernel space patch to the 2.6.23.14 kernel. One hunk of the user space patch (the one containing '- wbm();') failed to apply to srq.c because the code being patched does not have the 'wbm();' line, so I removed the '- wbm()' line from the patch file.

Then I observed these errors, each of which has occurred twice so far:
- A send completion is out of order. It has a "future" wr_id value.
- A receive completion has a "future" imm_data value.

It looks exactly as if the sending side dropped a few IBV_WR_RDMA_WRITE_WITH_IMM requests. Or the sender sent them later (but my software does not know about them because it stops on the first error).
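
To make the failure mode concrete, the kind of check that flags a "future" value can be sketched as below. This is only an illustration under stated assumptions, not the actual software: it assumes each request carries a consecutive sequence number (in wr_id on the send side, in imm_data on the receive side), and the names seq_checker and seq_check are hypothetical, not part of libibverbs. A real caller would first convert imm_data from network byte order, and this sketch ignores counter wraparound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a libibverbs API: tracks the next
 * sequence number expected from a stream of completions. */
struct seq_checker {
    uint64_t expected;   /* next sequence number we expect to see */
};

enum seq_result {
    SEQ_OK,       /* value matches the expected sequence number */
    SEQ_FUTURE,   /* value is ahead: earlier completions are missing */
    SEQ_STALE     /* value is behind: duplicate or late completion */
};

/* Compare one observed value (wr_id or imm_data, by assumption a
 * sequence number) against the expected one. On SEQ_FUTURE, store
 * the number of skipped values in *missing if missing is non-NULL.
 * The expected counter advances only on SEQ_OK, which mirrors
 * stopping on the first error. */
static enum seq_result seq_check(struct seq_checker *c, uint64_t seen,
                                 uint64_t *missing)
{
    assert(c != NULL);

    if (seen == c->expected) {
        c->expected++;
        return SEQ_OK;
    }
    if (seen > c->expected) {
        if (missing != NULL)
            *missing = seen - c->expected;
        return SEQ_FUTURE;
    }
    return SEQ_STALE;
}
```

With a check like this, a dropped request and a request that merely completes later look the same at the moment of detection: both show up as SEQ_FUTURE. Telling them apart would require continuing to poll after the first error and watching for the skipped values to arrive.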

Is it possible that, with IBV_QPT_RC queues, IBV_WR_RDMA_WRITE_WITH_IMM requests are completed out of order on either the sending or the receiving side?

Roman
_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general