[openib-general] Local QP operation error
In a kernel module, on polling the CQ, I am getting a local QP operation error (IB_WC_LOC_QP_OP_ERR). The work request posted was of type IB_WR_SEND, and the QP was moved to the IB_QPS_RTS state before posting the send work request. The IB specification says that this error indicates an internal QP consistency error. What are the possible reasons for this, and is there any way I can pinpoint the inconsistency? I would appreciate any hints to resolve this error.

Regards,
Ram

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Local QP operation error
Quoting r. Ramachandra K [EMAIL PROTECTED]:
> Subject: Local QP operation error
>
> In a kernel module, on polling the CQ, I am getting a local QP operation error (IB_WC_LOC_QP_OP_ERR). Work request posted was of type IB_WR_SEND and the QP was moved to IB_QPS_RTS state before posting the send work request. The IB specification says that this error indicates an internal QP consistency error. What are the possible reasons for this and is there any way I can pinpoint the inconsistency?

This normally indicates some kind of driver bug or memory corruption. What is the value of the vendor_err field?

-- MST
Re: [openib-general] Local QP operation error
Michael S. Tsirkin wrote:
>> The IB specification says that this error indicates an internal QP consistency error. What are the possible reasons for this and is there any way I can pinpoint the inconsistency?
>
> This normally indicates some kind of driver bug or memory corruption. What is the value of the vendor_err field?

The vendor_err field value is 115 (0x73).

Just to clarify, I am writing the kernel module that is getting the local QP operation error. I guess I am missing something in my code that is causing the error, but I am unable to pinpoint the cause. Does this error point to some issue with the DMA address specified in the work request SGE?

Regards,
Ram
Re: [openib-general] Local QP operation error
At 09:21 AM 6/27/2006, Ramachandra K wrote:
> Does this error point to some issue with the DMA address specified in the work request SGE?

Ding Ding Ding Ding! :-)

We recently identified the exact issue in the NFS/RDMA server, which happened only when running on ia64. If you're not using the dma_map_* API, that's maybe something to look at. ;-)

Tom.
Re: [openib-general] Local QP operation error
Quoting r. Ramachandra K [EMAIL PROTECTED]:
> Just to clarify, I am writing the kernel module that is getting the local QP operation error. I guess I am missing something in my code that is causing the error. But I am unable to pinpoint the cause of the error. Does this error point to some issue with the DMA address specified in the work request SGE?

Yes, it seems the hardware could not read (gather) the data when executing the work request SGE.

-- MST
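To make the failure mode discussed in this thread concrete, here is a toy userspace model in C. It is not kernel code and not how any HCA driver works: every name in it (toy_device, toy_dma_map, toy_gather, the TOY_* constants) is made up for illustration. It only assumes that the device can read memory solely through bus addresses handed out by a mapping call, which is the role the dma_map_* family (for example dma_map_single()) plays in a real module.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: none of these names are kernel or verbs APIs. */
#define TOY_MAX_MAPPINGS 8
#define TOY_SLOT_SIZE    0x10000ULL
/* Arbitrary base, chosen so bus addresses never look like CPU pointers. */
#define TOY_BUS_BASE     0xD0D0000000000000ULL

struct toy_mapping {
    uint64_t    bus_addr; /* address the "device" understands */
    const void *cpu_ptr;  /* buffer that backs it */
    size_t      len;
    int         in_use;
};

struct toy_device {
    struct toy_mapping map[TOY_MAX_MAPPINGS];
};

/* Stand-in for a dma_map_* call: returns a bus address, or 0 on failure. */
uint64_t toy_dma_map(struct toy_device *dev, const void *cpu_ptr, size_t len)
{
    assert(dev != NULL && cpu_ptr != NULL);
    if (len == 0 || len > TOY_SLOT_SIZE)
        return 0;
    for (int i = 0; i < TOY_MAX_MAPPINGS; i++) {
        struct toy_mapping *m = &dev->map[i];
        if (m->in_use)
            continue;
        m->bus_addr = TOY_BUS_BASE + (uint64_t)i * TOY_SLOT_SIZE;
        m->cpu_ptr = cpu_ptr;
        m->len = len;
        m->in_use = 1;
        return m->bus_addr;
    }
    return 0; /* no free slot */
}

/*
 * Stand-in for the hardware gathering one SGE: copies len bytes found at
 * bus address addr into out. Returns 0 on success, or -1 if the range is
 * not covered by any mapping (the toy analogue of the send failing).
 */
int toy_gather(const struct toy_device *dev, uint64_t addr, size_t len, void *out)
{
    assert(dev != NULL && out != NULL);
    for (int i = 0; i < TOY_MAX_MAPPINGS; i++) {
        const struct toy_mapping *m = &dev->map[i];
        if (!m->in_use || addr < m->bus_addr)
            continue;
        uint64_t off = addr - m->bus_addr;
        if (off > m->len || len > m->len - off)
            continue;
        memcpy(out, (const char *)m->cpu_ptr + off, len);
        return 0;
    }
    return -1;
}
```

In this model, passing the value returned by toy_dma_map() to toy_gather() succeeds, while passing the buffer's CPU pointer cast to an integer fails. That mirrors the suggestion above: a pointer value placed directly in the SGE may happen to work on platforms where CPU and bus addresses coincide, and fail on others such as ia64.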