Based on my tests, I believe a value of 0 means no retry. With both values set
to 0, it kept receiving IBV_WC_RETRY_EXC_ERR for RDMA READ operations, which
aborts the connection. After changing both to 7 (a value I find commonly used
in several examples from the RDMA Core library), the error goes away and we can
at least keep a stable connection in such a RoCE network.
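
For illustration only, here is a minimal userspace sketch of the idea. It is
not the kio code: struct retry_params, clamp_retry() and retry_params_init()
are hypothetical names that only mirror the two fields touched by the patch.
It assumes both fields are 3 bits wide, so 7 is the largest value that can be
requested; as far as I know, the InfiniBand spec also treats an RNR retry
value of 7 as "retry indefinitely".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two retry fields of struct rdma_conn_param. */
struct retry_params {
	uint8_t retry_count;     /* # retransmissions when no ACK received */
	uint8_t rnr_retry_count; /* # RNR retransmissions */
};

/* Assumption: both fields are 3 bits wide, so the valid range is 0..7. */
#define RETRY_FIELD_MAX 7

/* Clamp a requested retry value into the 0..RETRY_FIELD_MAX range. */
uint8_t clamp_retry(int requested)
{
	if (requested < 0)
		return 0;
	if (requested > RETRY_FIELD_MAX)
		return RETRY_FIELD_MAX;
	return (uint8_t)requested;
}

/* Fill in both retry fields, clamping each one independently. */
void retry_params_init(struct retry_params *rp, int retry, int rnr_retry)
{
	assert(rp != NULL);
	rp->retry_count = clamp_retry(retry);
	rp->rnr_retry_count = clamp_retry(rnr_retry);
}
```

Calling retry_params_init(&rp, 7, 7) corresponds to the values in the patch,
while retry_params_init(&rp, 0, 0) corresponds to the old behaviour, where the
first missed ACK is reported as an error.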


-----Original Message-----
From: Alexey Kuznetsov <kuz...@acronis.com>
Date: Thursday, 17 August 2023 at 11:52 PM
To: Kui Liu <kui....@acronis.com>
Cc: Devel <devel@openvz.org>
Subject: Re: [Devel] [PATCH RH7] fs/fuse kio: adjust rdma connection parameters


Ack.


What did the 0 values mean? No retry, or some default value?


On Thu, Aug 17, 2023 at 11:47 PM Kui Liu <kui....@acronis.com> wrote:
>
> In a RoCE network, packet loss and delay due to congestion can happen
> quite often. We need to tolerate such events, so increase retry_count
> and rnr_retry_count to 7 to allow the NIC to retry operations when an
> error happens, instead of returning the error directly, which causes
> the connection to be aborted.
>
> Signed-off-by: Liu Kui <kui....@acronis.com>
> ---
>  fs/fuse/kio/pcs/pcs_rdma_conn.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/fuse/kio/pcs/pcs_rdma_conn.c b/fs/fuse/kio/pcs/pcs_rdma_conn.c
> index 4db903151de0..7339b1466d3a 100644
> --- a/fs/fuse/kio/pcs/pcs_rdma_conn.c
> +++ b/fs/fuse/kio/pcs/pcs_rdma_conn.c
> @@ -44,8 +44,8 @@ conn_param_init(struct rdma_conn_param *cp, struct pcs_rdmaio_conn_req *cr,
>  	cp->initiator_depth = min_t(int, U8_MAX, cmid->device->attrs.max_qp_init_rd_atom);
>
>  	cp->flow_control = 1; /* does not matter */
> -	cp->retry_count = 0; /* # retransmissions when no ACK received */
> -	cp->rnr_retry_count = 0; /* # RNR retransmissions */
> +	cp->retry_count = 7; /* # retransmissions when no ACK received */
> +	cp->rnr_retry_count = 7; /* # RNR retransmissions */
>  }
>
>  static int pcs_rdma_cm_event_handler(struct rdma_cm_id *cmid,
> --
> 2.32.0 (Apple Git-132)
>
> _______________________________________________
> Devel mailing list
> Devel@openvz.org
> https://lists.openvz.org/mailman/listinfo/devel

