On 5/23/2013 1:31 PM, Alex Rosenbaum wrote:
On 5/21/2013 6:24 PM, Hefty, Sean wrote:
My first guess is that the server isn't responding to new requests. -
Sean
This is where we're looking now.
Now testing on 17 servers with 8 clients per server.
When all RDMA traffic in the test is disabled, 100% of the RDMA
connections are established. So at
Hi Sean,
We have a user-space application built on an M (clients) x N (servers)
RC connectivity pattern using librdmacm. Basically, there are N nodes,
each running M client processes, and each client connects to all N
servers.
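For a sense of scale, here is a rough sketch of the connection counts this pattern implies. It assumes exactly one server process per node and that every client really does attempt all N connections; the function names are illustrative only, and the 17 x 8 figures are taken from the test run mentioned elsewhere in this thread.

```python
def connections_per_server(n_nodes: int, clients_per_node: int) -> int:
    """Incoming RC connections one server must accept.

    Assumption: one server per node, and every client on every node
    connects to it (including clients on the server's own node).
    """
    return n_nodes * clients_per_node


def total_connections(n_nodes: int, clients_per_node: int) -> int:
    """Total RC connections across the whole M x N pattern."""
    return connections_per_server(n_nodes, clients_per_node) * n_nodes


if __name__ == "__main__":
    # 17 nodes with 8 clients each, as in the test run from this thread.
    print(connections_per_server(17, 8))
    print(total_connections(17, 8))
```

Under these assumptions each server sees N x M connection requests, which may arrive in a short burst if all clients start together.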
So under some unknown conditions, many of the clients' connection
attempts fail with an RDMA_CM_EVENT_UNREACHABLE event and a status of
-ETIMEDOUT. Looking at the rdma-cm kernel code, I see that the only
location which generates this event is in cma_ib_handler when getting
IB_CM_REQ_ERROR (or
On 21/05/2013 18:24, Hefty, Sean wrote:
I don't remember this patch at all.
Alex, can you please send Sean this patch
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at
On Tue, May 21, 2013 at 6:24 PM, Hefty, Sean sean.he...@intel.com wrote:
One thing seen in the nodes' dmesg is a message from an old patch of
yours, which exists in ofed1.5.3 but didn't hit (or wasn't accepted?)
upstream, saying "ib_cm: calculated mra timeout 67584 8192, decreasing
used timeout_ms". Does this provide any insight into the problem?
I don't