Re: [ewg] MLX4 Strangeness

2010-02-18 Thread Vu Pham
Sent: Wednesday, February 17, 2010 10:07 AM To: Tziporet Koren Cc: linux-r...@vger.kernel.org; ewg@lists.openfabrics.org Subject: Re: [ewg] MLX4 Strangeness Hi Tziporet: Here is a trace with the data for WR failing with status 12. The vendor error is 129. Feb 17 12:27:33 vic10 kernel

Re: [ewg] MLX4 Strangeness

2010-02-17 Thread Tom Tucker
Hi Tziporet: Here is a trace with the data for WR failing with status 12. The vendor error is 129. Feb 17 12:27:33 vic10 kernel: rpcrdma_event_process:154 wr_id status 12 opcode 0 vendor_err 129 byte_len 0 qp 81002a13ec00 ex src_qp wc_flags, 0 pkey_index
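
For readers following the trace, here is a minimal sketch of turning the numeric fields in that log line into something readable. It assumes the status value follows the ordering of the Linux kernel's `enum ib_wc_status`, under which 12 is `IB_WC_RETRY_EXC_ERR` (transport retry counter exceeded). The helper name `decode_wc_line` and the sample line (cut down from the snippet above) are illustrative only. `vendor_err` is adapter-specific and is left undecoded.

```python
import re

# Completion status names, in the order of the Linux kernel's
# enum ib_wc_status (assumed here; check against your kernel headers).
IB_WC_STATUS = [
    "IB_WC_SUCCESS",
    "IB_WC_LOC_LEN_ERR",
    "IB_WC_LOC_QP_OP_ERR",
    "IB_WC_LOC_EEC_OP_ERR",
    "IB_WC_LOC_PROT_ERR",
    "IB_WC_WR_FLUSH_ERR",
    "IB_WC_MW_BIND_ERR",
    "IB_WC_BAD_RESP_ERR",
    "IB_WC_LOC_ACCESS_ERR",
    "IB_WC_REM_INV_REQ_ERR",
    "IB_WC_REM_ACCESS_ERR",
    "IB_WC_REM_OP_ERR",
    "IB_WC_RETRY_EXC_ERR",
    "IB_WC_RNR_RETRY_EXC_ERR",
    "IB_WC_LOC_RDD_VIOL_ERR",
    "IB_WC_REM_INV_RD_REQ_ERR",
    "IB_WC_REM_ABORT_ERR",
    "IB_WC_INV_EECN_ERR",
    "IB_WC_INV_EEC_STATE_ERR",
    "IB_WC_FATAL_ERR",
    "IB_WC_RESP_TIMEOUT_ERR",
    "IB_WC_GENERAL_ERR",
]

# Only pick up keys that are followed by a decimal value in the log line;
# keys printed without a value (e.g. wr_id in the snippet) are skipped.
FIELD_RE = re.compile(r"\b(status|opcode|vendor_err|byte_len)\s+(\d+)")


def decode_wc_line(line):
    """Extract numeric work-completion fields and name the status code."""
    fields = {key: int(value) for key, value in FIELD_RE.findall(line)}
    status = fields.get("status")
    if status is not None and 0 <= status < len(IB_WC_STATUS):
        fields["status_name"] = IB_WC_STATUS[status]
    return fields


if __name__ == "__main__":
    sample = ("rpcrdma_event_process:154 wr_id status 12 opcode 0 "
              "vendor_err 129 byte_len 0 qp 81002a13ec00")
    print(decode_wc_line(sample))
```

A retry-exceeded status on the sender means it never received an acknowledgement from the peer, which fits the report elsewhere in this thread that the server shows no completions.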

Re: [ewg] MLX4 Strangeness

2010-02-16 Thread Tom Tucker
Tziporet Koren wrote: On 2/15/2010 10:24 PM, Tom Tucker wrote: Hello, I am seeing some very strange behavior on my MLX4 adapters running 2.7 firmware and the latest OFED 1.5.1. Two systems are involved, and each has a dual-ported MTHCA DDR adapter and MLX4 adapters. The scenario starts

Re: [ewg] MLX4 Strangeness

2010-02-16 Thread Tom Tucker
More info... Rebooting the client and trying to reconnect to a server that has not been rebooted fails in the same way. It must be an issue with the server. I see no completions on the server or any indication that an RDMA_SEND was incoming. Is there some way to dump adapter state or otherwise see