Vu Pham wrote:
> Tom,
>
> Did you make any change to have bonnie++, dd of a 10G file, and
> vdbench concurrently run & finish?
>
No I did not, but my disk subsystem is pretty slow, so it might be that
I just don't have fast enough storage.

> I keep hitting the WQE overflow error below.
>
> I saw that most of the requests have two chunks (a 32K chunk and a
> small chunk of a few bytes), and each chunk requires frmr register +
> invalidate WRs. However, you set
> ep->rep_attr.cap.max_send_wr = cdata->max_requests and then, for the
> frmr case, you do ep->rep_attr.cap.max_send_wr *= 3; which is not
> enough. Moreover, you also set ep->rep_cqinit = max_send_wr/2 as the
> send completion signal trigger, which makes the WQE overflow happen
> faster.
>
> After applying the following patch, I have had vdbench, dd, and a copy
> of the 10G file running overnight.
>
> -vu
>
> --- ofa_kernel-1.5.1.orig/net/sunrpc/xprtrdma/verbs.c	2010-02-24 10:41:22.000000000 -0800
> +++ ofa_kernel-1.5.1/net/sunrpc/xprtrdma/verbs.c	2010-02-24 10:03:18.000000000 -0800
> @@ -649,8 +654,15 @@
>  	ep->rep_attr.cap.max_send_wr = cdata->max_requests;
>  	switch (ia->ri_memreg_strategy) {
>  	case RPCRDMA_FRMR:
> -		/* Add room for frmr register and invalidate WRs */
> -		ep->rep_attr.cap.max_send_wr *= 3;
> +		/*
> +		 * Add room for frmr register and invalidate WRs.
> +		 * Requests sometimes have two chunks, and each
> +		 * chunk requires its own frmr. The safest budget
> +		 * would be max_send_wr * 6; however, since we get
> +		 * send completions and poll fast enough, it is
> +		 * pretty safe to use max_send_wr * 4.
> +		 */
> +		ep->rep_attr.cap.max_send_wr *= 4;
>  		if (ep->rep_attr.cap.max_send_wr > devattr.max_qp_wr)
>  			return -EINVAL;
>  		break;
> @@ -682,7 +694,8 @@
>  		ep->rep_attr.cap.max_recv_sge);
> 
>  	/* set trigger for requesting send completion */
> -	ep->rep_cqinit = ep->rep_attr.cap.max_send_wr/2 /* - 1*/;
> +	ep->rep_cqinit = ep->rep_attr.cap.max_send_wr/4;
> +
>  	switch (ia->ri_memreg_strategy) {
>  	case RPCRDMA_MEMWINDOWS_ASYNC:
>  	case RPCRDMA_MEMWINDOWS:

Erf. This is client code. I'll take a look at this and see if I can
understand what Talpey was up to.

Tom
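(A back-of-the-envelope sketch of the send-queue budget Vu describes,
written as a standalone C program rather than kernel code. The value of
max_requests and all names here are illustrative, and the per-request
cost assumes his report: one send WR plus a fastreg and an invalidate
WR for each of two chunks.)

#include <stdio.h>

int main(void)
{
	/* stands in for cdata->max_requests; 32 is consistent with the
	 * "cq_init 48" (= 32 * 3 / 2) in the log quoted below */
	int max_requests   = 32;
	int chunks_per_req = 2;	/* the 32K chunk plus the small chunk */

	/* one send WR, plus fastreg + invalidate per chunk */
	int wrs_per_req = 1 + 2 * chunks_per_req;	/* = 5 here */
	int peak_demand = max_requests * wrs_per_req;	/* worst case */

	int factors[] = { 3, 4, 6 };
	for (int i = 0; i < 3; i++) {
		int max_send_wr = max_requests * factors[i];
		printf("max_send_wr = max_requests * %d = %3d; "
		       "worst case %3d -> %s\n",
		       factors[i], max_send_wr, peak_demand,
		       max_send_wr >= peak_demand ? "fits" : "can overflow");
	}

	/* the signaled-send trigger, before and after the patch */
	printf("rep_cqinit: /2 -> %d, /4 -> %d\n",
	       max_requests * 4 / 2, max_requests * 4 / 4);
	return 0;
}

By this count the x3 budget is exceeded as soon as two-chunk requests
dominate, and even x4 falls short of the absolute worst case, matching
the patch comment's caveat that x6 is safest and x4 relies on prompt
completion processing. Since the SQ slots of unsignaled WRs are only
reclaimed once a later signaled WR completes, that is presumably also
why the patch lowers the signaling trigger from max_send_wr/2 to
max_send_wr/4.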
>
>> -----Original Message-----
>> From: ewg-boun...@lists.openfabrics.org
>> [mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of Vu Pham
>> Sent: Monday, February 22, 2010 12:23 PM
>> To: Tom Tucker
>> Cc: linux-r...@vger.kernel.org; Mahesh Siddheshwar;
>> ewg@lists.openfabrics.org
>> Subject: Re: [ewg] nfsrdma fails to write big file,
>>
>> Tom,
>>
>> Some more info on the problem:
>> 1. Running with memreg=4 (FMR) I cannot reproduce the problem.
>> 2. I also see a different error on the client:
>>
>> Feb 22 12:16:55 mellanox-2 rpc.idmapd[5786]: nss_getpwnam: name
>> 'nobody' does not map into domain 'localdomain'
>> Feb 22 12:16:55 mellanox-2 kernel: QP 0x70004b: WQE overflow
>> Feb 22 12:16:55 mellanox-2 kernel: QP 0x6c004a: WQE overflow
>> Feb 22 12:16:55 mellanox-2 kernel: QP 0x6c004a: WQE overflow
>> Feb 22 12:16:55 mellanox-2 kernel: RPC: rpcrdma_ep_post: ib_post_send
>> returned -12 cq_init 48 cq_count 32
>> Feb 22 12:17:00 mellanox-2 kernel: RPC: rpcrdma_event_process:
>> send WC status 5, vend_err F5
>> Feb 22 12:17:00 mellanox-2 kernel: rpcrdma: connection to
>> 13.20.1.9:20049 closed (-103)
>>
>> -vu
>>
>>> -----Original Message-----
>>> From: Tom Tucker [mailto:t...@opengridcomputing.com]
>>> Sent: Monday, February 22, 2010 10:49 AM
>>> To: Vu Pham
>>> Cc: linux-r...@vger.kernel.org; Mahesh Siddheshwar;
>>> ewg@lists.openfabrics.org
>>> Subject: Re: [ewg] nfsrdma fails to write big file,
>>>
>>> Vu Pham wrote:
>>>> Setup:
>>>> 1. Linux nfsrdma client/server with OFED-1.5.1-20100217-0600,
>>>>    ConnectX2 QDR HCAs fw 2.7.8-6, RHEL 5.2.
>>>> 2. Solaris nfsrdma server snv 130, ConnectX QDR HCA.
>>>>
>>>> Running vdbench on a 10G file or *dd if=/dev/zero of=10g_file bs=1M
>>>> count=10000*, the operation fails, the connection gets dropped, and
>>>> the client cannot re-establish a connection to the server.
>>>> After rebooting only the client, I can mount again.
>>>>
>>>> It happens with both Solaris and Linux nfsrdma servers.
>>>>
>>>> For the Linux client/server, I run memreg=5 (FRMR); I don't see the
>>>> problem with memreg=6 (global dma key).
>>>
>>> Awesome. This is the key I think.
>>>
>>> Thanks for the info Vu,
>>> Tom
>>>
>>>> On the Solaris server snv 130, we see a problem decoding a 32K
>>>> write request. The client sends two read chunks (32K and 16-byte);
>>>> the server fails to do the RDMA read on the 16-byte chunk
>>>> (cqe.status = 10, i.e. IB_WC_REM_ACCESS_ERR); therefore, the server
>>>> terminates the connection. We don't see this problem with NFS
>>>> version 3 on Solaris. The Solaris server runs normal memory
>>>> registration mode.
>>>>
>>>> On the Linux client, I see cqe.status = 12, i.e.
>>>> IB_WC_RETRY_EXC_ERR.
>>>>
>>>> I added these notes in bug #1919 (bugs.openfabrics.org) to track
>>>> the issue.
>>>>
>>>> thanks,
>>>> -vu
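(For reference, the completion-status codes quoted in this thread
correspond to the ib_wc_status enum in include/rdma/ib_verbs.h. The
standalone decoder below is illustrative only and covers just the
values seen above:)

#include <stdio.h>

/* map the cqe.status values reported in this thread to their
 * ib_wc_status enum names (values match include/rdma/ib_verbs.h) */
static const char *wc_status_name(int status)
{
	switch (status) {
	case 5:  return "IB_WC_WR_FLUSH_ERR";   /* client "send WC status 5" */
	case 10: return "IB_WC_REM_ACCESS_ERR"; /* server RDMA read, 16-byte chunk */
	case 12: return "IB_WC_RETRY_EXC_ERR";  /* client side of the failure */
	default: return "other";
	}
}

int main(void)
{
	int seen[] = { 10, 12, 5 };

	for (int i = 0; i < 3; i++)
		printf("cqe.status %2d = %s\n", seen[i], wc_status_name(seen[i]));
	return 0;
}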