If you are not connecting the QPs using the CM, maybe you have a
synchronization problem: one side (the sender) is already in RTS while
the other side isn't in RTR yet
(or there is a synchronization problem when closing the connection).
Dotan
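
One common way to rule out this kind of race without the CM is to exchange a
small token over an out-of-band channel (for example the TCP socket that was
already used to swap QP numbers and LIDs) before sending and again before
destroying the QP. The sketch below is only an illustration under that
assumption: barrier(), run_side() and demo() are hypothetical names, a local
socket pair stands in for the real out-of-band connection, and the verbs
calls appear only as comments.

```python
import socket
import threading

TOKEN = b"R"  # arbitrary one-byte token (assumption, not part of any protocol)


def barrier(sock: socket.socket) -> None:
    """Block until the peer has reached the same barrier."""
    sock.sendall(TOKEN)
    if sock.recv(1) != TOKEN:
        raise ConnectionError("peer closed the out-of-band channel early")


def run_side(name: str, sock: socket.socket, log: list) -> None:
    # Here the real code would call ibv_modify_qp() to reach RTR
    # and post its receive buffers.
    log.append((name, "rtr"))
    barrier(sock)  # after this, both QPs are known to be in RTR

    # Here the real code would move to RTS, post sends and poll
    # the CQ until every completion has been reaped.
    log.append((name, "traffic"))
    barrier(sock)  # after this, both sides have finished all traffic

    # Only now is it safe to call ibv_destroy_qp(); no fixed delay is needed.
    log.append((name, "destroy"))


def demo() -> list:
    """Run both sides in threads over a local socket pair and return the log."""
    a, b = socket.socketpair()
    log: list = []
    threads = [
        threading.Thread(target=run_side, args=(name, sock, log))
        for name, sock in (("sender", a), ("receiver", b))
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    a.close()
    b.close()
    return log


if __name__ == "__main__":
    for entry in demo():
        print(entry)
```

With the second barrier in place, neither side has to guess how long to wait
before destroying its QP: it simply waits until the peer reports that it has
seen all of its completions.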
Tang, Changqing wrote:
The timeout is 18 (~1 sec), and the retry count is 7 (the max).
The error occurs in only 1% of runs. Sometimes I run the same hello_world
code in a loop and catch it only after 1500 runs, so I don't think it is a
cable issue (but I have not checked the port error counters).
--CQ
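
For reference, the "~1 sec" figure follows from the InfiniBand local ACK
timeout encoding, 4.096 us * 2^timeout. The sketch below works through that
arithmetic; the function names are hypothetical, and the second function is
only a rough lower bound that assumes one original transmission plus
retry_cnt retries, each waiting exactly one timeout period (real HCAs may
wait longer per attempt).

```python
def local_ack_timeout_seconds(timeout: int) -> float:
    """Convert the 5-bit local ACK timeout field to seconds.

    The encoding is 4.096 us * 2**timeout; a value of 0 means the
    timeout is disabled (treated here as infinite).
    """
    if not 0 <= timeout <= 31:
        raise ValueError("timeout must be in the range 0..31")
    if timeout == 0:
        return float("inf")
    return 4.096e-6 * (2 ** timeout)


def min_seconds_until_retry_exceeded(timeout: int, retry_cnt: int) -> float:
    """Rough lower bound on the time before a retry-exceeded error.

    Assumption: the original attempt and each of the retry_cnt retries
    waits one full timeout period before the sender gives up.
    """
    if not 0 <= retry_cnt <= 7:
        raise ValueError("retry_cnt must be in the range 0..7")
    return (retry_cnt + 1) * local_ack_timeout_seconds(timeout)


if __name__ == "__main__":
    print(local_ack_timeout_seconds(18))            # about 1.07 seconds
    print(min_seconds_until_retry_exceeded(18, 7))  # about 8.6 seconds
```

Under these assumptions, with timeout 18 and retry count 7 the sender would
keep retrying for at least about 8.6 seconds before reporting the error.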
-----Original Message-----
From: Dotan Barak [mailto:[EMAIL PROTECTED]
Sent: Sunday, October 28, 2007 2:48 AM
To: Tang, Changqing
Cc: Sean Hefty; Roland Dreier; [email protected]
Subject: Re: [ofa-general] message is received but sender
report error.
Hi.
Maybe you should increase the timeout/retry count for your
application?
Can you check the port error counters (using perfquery)?
Maybe you have bad cables in your subnet.
Dotan
Tang, Changqing wrote:
This is verbs-layer code; no IB CM is used.
--CQ
-----Original Message-----
From: Sean Hefty [mailto:[EMAIL PROTECTED]
Sent: Thursday, October 25, 2007 12:38 PM
To: Tang, Changqing; Roland Dreier
Cc: [email protected]
Subject: RE: [ofa-general] message is received but sender report
error.
If this is the case, how would we fix the problem? It's hard for us to
delay destroying the QP, because we don't know how long to delay.
The other way is to do something in the driver or firmware.
Do you disconnect the QPs using the IB CM?
- Sean
_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
To unsubscribe, please visit
http://openib.org/mailman/listinfo/openib-general