On 06/10/2015 06:35 AM, Hal Rosenstock wrote:
> RDMA_CM_EVENT_UNREACHABLE is indicated when there are timeouts in
> underlying CM protocol exchange. I suspect that the server is really
> busy and doesn't respond to the low level CM MADs in a timely manner.
> RDMA CM (and other kernel ULPs like IPoIB and SRP) use hard-coded local
> and remote re
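
To get a feel for how long a connection attempt can sit in CM retries before RDMA_CM_EVENT_UNREACHABLE surfaces, here is a rough sketch. It relies on the standard InfiniBand timeout encoding (4.096 microseconds times 2 to the power of the encoded value). The exponent and retry count used below are illustrative assumptions, not values taken from this thread, and the model ignores packet lifetime and service-time terms, so treat the result as an order-of-magnitude estimate only.

```python
# Rough estimate of how long a CM exchange may wait before giving up.
# Assumption: timeouts use the InfiniBand encoding 4.096 us * 2**exponent.
# The exponent and retry count passed in below are hypothetical examples.

IB_TIMEOUT_BASE_SECONDS = 4.096e-6


def ib_timeout_seconds(exponent: int) -> float:
    """Convert an encoded IB timeout exponent to seconds."""
    return IB_TIMEOUT_BASE_SECONDS * (2 ** exponent)


def worst_case_wait_seconds(exponent: int, max_retries: int) -> float:
    """One initial send plus max_retries resends, each waiting a full timeout."""
    return ib_timeout_seconds(exponent) * (max_retries + 1)


if __name__ == "__main__":
    exponent, retries = 20, 15  # illustrative values only
    print(f"per-attempt timeout: {ib_timeout_seconds(exponent):.2f} s")
    print(f"worst-case wait:     {worst_case_wait_seconds(exponent, retries):.1f} s")
```

With numbers in this range, a server that is slow to answer CM MADs for more than about a minute would be enough for clients to see the connection as unreachable, which fits the picture of thousands of clients reconnecting at once.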
On 6/9/2015 9:52 PM, Bob Ciotti wrote:
> We have an issue where lustre servers and clients cannot talk to each
> other.
> There are about 11,000 clients all trying to connect to a server that
> has just been rebooted
> (nbp6-oss3 in this example)
>
> pfe21 is a lustre client that's trying to remount th