[openib-general] Local QP operation error

2006-06-27 Thread Ramachandra K
In a kernel module, on polling the CQ, I am getting a local QP
operation error (IB_WC_LOC_QP_OP_ERR). The work request
posted was of type IB_WR_SEND, and the QP was moved to the
IB_QPS_RTS state before posting the send work request.

The IB specification says that this error indicates an internal QP
consistency error. What are the possible reasons for this, and is there
any way I can pinpoint the inconsistency?

I would appreciate any hints to resolve this error.

Regards,
Ram

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] Local QP operation error

2006-06-27 Thread Michael S. Tsirkin
Quoting r. Ramachandra K [EMAIL PROTECTED]:
> Subject: Local QP operation error
>
> In a kernel module, on polling the CQ, I am getting a local QP
> operation error (IB_WC_LOC_QP_OP_ERR). Work request
> posted was of type IB_WR_SEND and the QP was moved to
> IB_QPS_RTS state before posting the send work request.
>
> The IB specification says that this error indicates an internal QP consistency
> error. What are the possible reasons for this and is there any way I can
> pinpoint the inconsistency?

This normally indicates some kind of driver bug, or memory corruption.
What is the value of the vendor_err field?

-- 
MST




Re: [openib-general] Local QP operation error

2006-06-27 Thread Ramachandra K
Michael S. Tsirkin wrote:
>> The IB specification says that this error indicates an internal QP consistency
>> error. What are the possible reasons for this and is there any way I can
>> pinpoint the inconsistency?
>
> This normally indicates some kind of driver bug, or memory corruption.
> What is the value of the vendor_err field?

The vendor_err field value is 115 (0x73).

Just to clarify, I am writing the kernel module that is getting the
local QP operation error. I guess I am missing something in my code
that is causing the error, but I am unable to pinpoint the cause.

Does this error point to some issue with the DMA address specified
in the work request SGE?

Regards,
Ram



Re: [openib-general] Local QP operation error

2006-06-27 Thread Talpey, Thomas
At 09:21 AM 6/27/2006, Ramachandra K wrote:
> Does this error point to some issue with the DMA address specified
> in the work request SGE?


Ding Ding Ding Ding! :-)

We recently identified this exact issue in the NFS/RDMA server, where it
happened only when running on ia64. If you're not using the dma_map_*
API, that's maybe something to look at. ;-)

Tom. 





Re: [openib-general] Local QP operation error

2006-06-27 Thread Michael S. Tsirkin
Quoting r. Ramachandra K [EMAIL PROTECTED]:
> Just to clarify, I am writing the kernel module that is getting the local
> QP operation error. I guess I am missing something in my code that
> is causing the error. But I am unable to pinpoint the cause of the error.
>
> Does this error point to some issue with the DMA address specified
> in the work request SGE?

Yes, it seems the hardware could not read (gather) the data described by
the work request's SGE.

-- 
MST
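
[Illustration, not part of the original messages.] The replies above point
at the address placed in the SGE: the device gathers data through a DMA
(bus) address obtained from the dma_map_* API, and that address need not
equal the CPU pointer the module holds. The sketch below is a toy userspace
model of that idea only. It is not kernel or verbs code, and every toy_*
name, the XOR translation, and the table size are assumptions made up for
the example. The status name simply echoes the error discussed in this
thread; a real HCA may report a bad address differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model: none of these names exist in the kernel or the verbs API. */
#define TOY_MAX_MAPPINGS 4
#define TOY_BUS_FLIP (1ull << 63) /* arbitrary: makes bus != CPU address */

enum toy_status { TOY_SUCCESS = 0, TOY_LOC_QP_OP_ERR = 1 };

struct toy_mapping {
    const uint8_t *cpu_addr; /* pointer the module code holds */
    uint64_t bus_addr;       /* address the device must be given */
    size_t len;
    int in_use;
};

struct toy_device {
    struct toy_mapping map[TOY_MAX_MAPPINGS];
};

static void toy_device_init(struct toy_device *dev)
{
    memset(dev, 0, sizeof(*dev));
}

/* Stand-in for a dma_map_* call: records the mapping and returns the
 * bus address, or 0 if the (tiny) mapping table is full. */
static uint64_t toy_dma_map(struct toy_device *dev, const void *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    for (int i = 0; i < TOY_MAX_MAPPINGS; i++) {
        struct toy_mapping *m = &dev->map[i];
        if (!m->in_use) {
            m->cpu_addr = buf;
            m->bus_addr = (uint64_t)(uintptr_t)buf ^ TOY_BUS_FLIP;
            m->len = len;
            m->in_use = 1;
            return m->bus_addr;
        }
    }
    return 0;
}

/* Device-side gather for one SGE (addr, len): succeeds only when the whole
 * range lies inside a recorded mapping, otherwise reports an error. */
static enum toy_status toy_gather(const struct toy_device *dev,
                                  uint64_t addr, size_t len, uint8_t *out)
{
    for (int i = 0; i < TOY_MAX_MAPPINGS; i++) {
        const struct toy_mapping *m = &dev->map[i];
        if (!m->in_use || addr < m->bus_addr)
            continue;
        uint64_t off = addr - m->bus_addr;
        if (off <= m->len && len <= m->len - off) {
            memcpy(out, m->cpu_addr + off, len);
            return TOY_SUCCESS;
        }
    }
    return TOY_LOC_QP_OP_ERR;
}
```

In this model, handing toy_gather() the raw CPU pointer fails even after the
buffer has been mapped, while handing it the value returned by toy_dma_map()
succeeds. The analogous check in a real module would be to confirm that the
SGE address comes from the DMA mapping call and that the SGE length stays
within the mapped region.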
