On 10/25/2020 1:51 PM, zhenwei pi wrote:
Hit a kernel warning:

refcount_t: underflow; use-after-free.
WARNING: CPU: 0 PID: 0 at lib/refcount.c:28
RIP: 0010:refcount_warn_saturate+0xd9/0xe0
Call Trace:
 <IRQ>
 nvme_rdma_recv_done+0xf3/0x280 [nvme_rdma]
 __ib_process_cq+0x76/0x150 [ib_core]
 ...

The reason is that a zero-byte message was received from the target, and
the host side continued to process it without checking the length, so the
previous CQE was processed twice.

Do a sanity check on the received data length, and try to recover in the
corrupted CQE case.

Because a zero-byte message is not defined in the spec, and using a
zero-byte message to detect dead connections at the transport layer is
not standard, currently still treat it as illegal.

Thanks to Chao Leng & Sagi for suggestions.

Signed-off-by: zhenwei pi <pizhen...@bytedance.com>
---
 drivers/nvme/host/rdma.c | 8 ++++++++
 1 file changed, 8 insertions(+)

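For readers following the thread, the length check described above can be sketched in user-space C. This is only an illustration under stated assumptions, not the code from the patch: every type and function name below (sketch_wc, sketch_nvme_completion, sketch_check_recv_len) is hypothetical, and the only spec-level fact relied on is that an NVMe completion queue entry is 16 bytes. The idea is that a receive shorter than a full CQE leaves the buffer holding whatever arrived earlier, presumably the previous CQE, so it must not be parsed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the 16-byte NVMe completion queue entry. */
struct sketch_nvme_completion {
	uint8_t raw[16];
};

/* Hypothetical stand-in for the one work-completion field used here. */
struct sketch_wc {
	uint32_t byte_len;	/* bytes actually received for this completion */
};

enum sketch_action {
	SKETCH_PROCESS_CQE,	/* buffer holds a full CQE: safe to parse */
	SKETCH_ERROR_RECOVERY,	/* short or empty message: do not parse */
};

/*
 * Decide what to do with a receive completion. Anything shorter than a
 * full CQE (including a zero-byte message) is treated as illegal, so the
 * stale buffer contents are never interpreted as a new completion.
 */
static enum sketch_action sketch_check_recv_len(const struct sketch_wc *wc)
{
	if (wc->byte_len < sizeof(struct sketch_nvme_completion))
		return SKETCH_ERROR_RECOVERY;

	return SKETCH_PROCESS_CQE;
}
```

In this sketch a zero-byte receive takes the error-recovery path rather than being silently dropped, which matches the stated intent of still treating such messages as illegal.
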
It seems strange that the target sends zero-byte packets. Can you specify which target this is and describe the scenario?