Tom,

Did you make any changes that let bonnie++, a dd of a 10G file, and vdbench run concurrently to completion?

I keep hitting the WQE overflow error below. I see that most requests have
two chunks (a 32K chunk and a small chunk of a few bytes), and each chunk
requires its own FRMR register + invalidate WRs. However, you set
ep->rep_attr.cap.max_send_wr = cdata->max_requests, and then for the FRMR
case you do ep->rep_attr.cap.max_send_wr *= 3, which is not enough.
Moreover, you also set ep->rep_cqinit = max_send_wr/2 as the trigger for
signaling send completions, which makes the WQE overflow happen even
faster. After applying the following patch, I have had vdbench, dd, and a
copy of 10g_file running overnight.

-vu

--- ofa_kernel-1.5.1.orig/net/sunrpc/xprtrdma/verbs.c	2010-02-24 10:41:22.000000000 -0800
+++ ofa_kernel-1.5.1/net/sunrpc/xprtrdma/verbs.c	2010-02-24 10:03:18.000000000 -0800
@@ -649,8 +654,15 @@
 	ep->rep_attr.cap.max_send_wr = cdata->max_requests;
 	switch (ia->ri_memreg_strategy) {
 	case RPCRDMA_FRMR:
-		/* Add room for frmr register and invalidate WRs */
-		ep->rep_attr.cap.max_send_wr *= 3;
+		/*
+		 * Add room for frmr register and invalidate WRs.
+		 * Requests sometimes have two chunks, and each
+		 * chunk requires its own frmr. The safest sizing
+		 * is max_send_wr * 6; however, we get send
+		 * completions and poll fast enough that
+		 * max_send_wr * 4 is pretty safe.
+		 */
+		ep->rep_attr.cap.max_send_wr *= 4;
 		if (ep->rep_attr.cap.max_send_wr > devattr.max_qp_wr)
 			return -EINVAL;
 		break;
@@ -682,7 +694,8 @@
 		ep->rep_attr.cap.max_recv_sge);
 
 	/* set trigger for requesting send completion */
-	ep->rep_cqinit = ep->rep_attr.cap.max_send_wr/2 /* - 1*/;
+	ep->rep_cqinit = ep->rep_attr.cap.max_send_wr/4;
+
 	switch (ia->ri_memreg_strategy) {
 	case RPCRDMA_MEMWINDOWS_ASYNC:
 	case RPCRDMA_MEMWINDOWS:
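
For anyone following along, here is the send-WR arithmetic behind the new
multiplier as I understand it. This is only an illustrative sketch under
my assumptions (two chunks per request, one fast-register plus one
invalidate WR per chunk); the macros and the helper function below are
made up for the example and are not identifiers from verbs.c:

	/* wr_budget.c -- hypothetical sketch of the FRMR send-queue budget */
	#include <stdio.h>

	#define CHUNKS_PER_REQ	2	/* observed: a 32K chunk plus a small one */
	#define WRS_PER_CHUNK	2	/* one FRMR fast-register WR + one invalidate WR */

	/* Worst-case send WRs a single RPC can occupy on the send queue. */
	static unsigned int wrs_per_request(void)
	{
		return 1 + CHUNKS_PER_REQ * WRS_PER_CHUNK;	/* 1 SEND + 4 = 5 */
	}

	int main(void)
	{
		/* "cq_init 48" in the log below suggests max_send_wr was 96,
		 * i.e. cdata->max_requests = 32 under the old *3 multiplier. */
		unsigned int max_requests = 32;

		printf("worst case: %u WRs, old cap (*3): %u, new cap (*4): %u\n",
		       max_requests * wrs_per_request(),
		       max_requests * 3, max_requests * 4);
		return 0;
	}

With up to five WRs per two-chunk request, the old *3 cap overflows once
enough requests are in flight; *6 would cover the worst case outright,
and *4 holds up in practice because requesting a signaled completion
every max_send_wr/4 posts reaps send-queue slots quickly enough.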

> -----Original Message-----
> From: ewg-boun...@lists.openfabrics.org [mailto:ewg-
> boun...@lists.openfabrics.org] On Behalf Of Vu Pham
> Sent: Monday, February 22, 2010 12:23 PM
> To: Tom Tucker
> Cc: linux-rdma@vger.kernel.org; Mahesh Siddheshwar;
> e...@lists.openfabrics.org
> Subject: Re: [ewg] nfsrdma fails to write big file,
>
> Tom,
>
> Some more info on the problem:
> 1. Running with memreg=4 (FMR), I cannot reproduce the problem.
> 2. I also see different errors on the client:
>
> Feb 22 12:16:55 mellanox-2 rpc.idmapd[5786]: nss_getpwnam: name 'nobody'
> does not map into domain 'localdomain'
> Feb 22 12:16:55 mellanox-2 kernel: QP 0x70004b: WQE overflow
> Feb 22 12:16:55 mellanox-2 kernel: QP 0x6c004a: WQE overflow
> Feb 22 12:16:55 mellanox-2 kernel: QP 0x6c004a: WQE overflow
> Feb 22 12:16:55 mellanox-2 kernel: RPC: rpcrdma_ep_post: ib_post_send
> returned -12 cq_init 48 cq_count 32
> Feb 22 12:17:00 mellanox-2 kernel: RPC: rpcrdma_event_process:
> send WC status 5, vend_err F5
> Feb 22 12:17:00 mellanox-2 kernel: rpcrdma: connection to
> 13.20.1.9:20049 closed (-103)
>
> -vu
>
> > -----Original Message-----
> > From: Tom Tucker [mailto:t...@opengridcomputing.com]
> > Sent: Monday, February 22, 2010 10:49 AM
> > To: Vu Pham
> > Cc: linux-rdma@vger.kernel.org; Mahesh Siddheshwar;
> > e...@lists.openfabrics.org
> > Subject: Re: [ewg] nfsrdma fails to write big file,
> >
> > Vu Pham wrote:
> > > Setup:
> > > 1. Linux nfsrdma client/server with OFED-1.5.1-20100217-0600,
> > > ConnectX2 QDR HCAs fw 2.7.8-6, RHEL 5.2.
> > > 2. Solaris nfsrdma server snv 130, ConnectX QDR HCA.
> > >
> > > Running vdbench on a 10G file, or *dd if=/dev/zero of=10g_file bs=1M
> > > count=10000*, the operation fails, the connection gets dropped, and
> > > the client cannot re-establish a connection to the server.
> > > After rebooting only the client, I can mount again.
> > >
> > > It happens with both the Solaris and the Linux nfsrdma servers.
> > >
> > > With the Linux client/server I run memreg=5 (FRMR); I don't see the
> > > problem with memreg=6 (global DMA key).
> > >
> >
> > Awesome. This is the key, I think.
> >
> > Thanks for the info Vu,
> > Tom
> >
> > > On the Solaris server (snv 130) we see a problem decoding a 32K
> > > write request. The client sends two read chunks (32K and 16 bytes);
> > > the server fails to do the RDMA read on the 16-byte chunk
> > > (cqe.status = 10, i.e. IB_WC_REM_ACCESS_ERR) and therefore
> > > terminates the connection. We don't see this problem with NFS
> > > version 3 on Solaris. The Solaris server runs in normal memory
> > > registration mode.
> > >
> > > On the Linux client, I see cqe.status = 12, i.e. IB_WC_RETRY_EXC_ERR.
> > >
> > > I added these notes to bug #1919 (bugs.openfabrics.org) to track
> > > the issue.
> > >
> > > thanks,
> > > -vu