Hi,

I have not received any response to this, and I haven’t worked on it lately. I 
hope to revisit the RDMA messenger on Nautilus in the future.

Thanks,
Orlando


From: Lazuardi Nasution [mailto:mrxlazuar...@gmail.com]
Sent: Saturday, May 25, 2019 9:14 PM
To: Moreno, Orlando <orlando.mor...@intel.com>; Tang, Haodong 
<haodong.t...@intel.com>
Cc: Ceph Users <ceph-users@lists.ceph.com>
Subject: Re: ceph-users Digest, Vol 60, Issue 26

Hi Orlando and Haodong,

Has there been any response to this thread? I'm interested in this too.

Best regards,

Date: Fri, 26 Jan 2018 21:53:59 +0000
From: "Moreno, Orlando" <orlando.mor...@intel.com>
To: "ceph-users@lists.ceph.com" <ceph-users@lists.ceph.com>, Ceph
        Development <ceph-de...@vger.kernel.org>
Cc: "Tang, Haodong" <haodong.t...@intel.com>
Subject: [ceph-users] Ceph OSDs fail to start with RDMA
Message-ID: <034aad465c6cbe4f96d9fb98573a79a63719e...@fmsmsx108.amr.corp.intel.com>

Content-Type: text/plain; charset="us-ascii"

Hi all,

I am trying to bring up a Ceph cluster where the private network 
communicates over RoCEv2. Each storage node has two dual-port 25Gb Mellanox 
ConnectX-4 NICs, with each NIC's ports bonded (2x25Gb, mode 4). I have set 
memory limits to unlimited, can rping each node, and have set 
ms_async_rdma_device_name to the ibdev (mlx5_bond_1). Everything goes 
smoothly until I start bringing up OSDs. Nothing appears in stderr, but on 
further inspection of the OSD log, I see the following error:

RDMAConnectedSocketImpl activate failed to transition to RTR state: (19) No such device
/build/ceph-12.2.2/src/msg/async/rdma/RDMAConnectedSocketImpl.cc: In function 'void RDMAConnectedSocketImpl::handle_connection()' thread 7f908633c700 time 2018-01-26 10:47:51.607573
/build/ceph-12.2.2/src/msg/async/rdma/RDMAConnectedSocketImpl.cc: 221: FAILED assert(!r)

ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x564a2ccf7892]
2: (RDMAConnectedSocketImpl::handle_connection()+0xb4a) [0x564a2d007fba]
3: (EventCenter::process_events(int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0xa08) [0x564a2cd9a418]
4: (()+0xb4f3a8) [0x564a2cd9e3a8]
5: (()+0xb8c80) [0x7f9088c04c80]
6: (()+0x76ba) [0x7f90892f36ba]
7: (clone()+0x6d) [0x7f908836a41d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
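
For readers triaging the same failure, here is a minimal sketch (assuming Python 3 on a Linux host) that pulls the errno out of a log line like the first one above and maps it to its symbolic name. The helper name decode_errno is hypothetical and not part of Ceph. On Linux, errno 19 corresponds to ENODEV, which matches the "No such device" text in the log.

```python
import errno
import os
import re

# Matches a parenthesised errno such as "(19) " in an OSD log line.
ERRNO_RE = re.compile(r"\((\d+)\)\s")


def decode_errno(log_line):
    """Return (number, symbolic name, OS description), or None if no errno is present.

    Hypothetical helper for illustration; not part of Ceph.
    """
    match = ERRNO_RE.search(log_line)
    if match is None:
        return None
    number = int(match.group(1))
    name = errno.errorcode.get(number, "UNKNOWN")
    return number, name, os.strerror(number)


line = ("RDMAConnectedSocketImpl activate failed to transition to RTR state: "
        "(19) No such device")
result = decode_errno(line)
print(result)
```

This only identifies which error code the queue pair transition returned. It does not say which device or GID lookup failed.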

Has anyone seen this before, or does anyone have suggestions?
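
The "memory limits set to unlimited" precondition mentioned above can be checked with a short sketch like the following. It assumes the limit in question is the locked-memory limit (RLIMIT_MEMLOCK, i.e. `ulimit -l`), which is the usual RDMA requirement but is not stated explicitly in the message. The function name memlock_is_unlimited is hypothetical.

```python
import resource


def memlock_is_unlimited(limits=None):
    """Return True when both the soft and hard RLIMIT_MEMLOCK are unlimited.

    `limits` is an optional (soft, hard) tuple; when omitted, the limits of
    the current process are read. Hypothetical helper for illustration.
    """
    if limits is None:
        limits = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    soft, hard = limits
    return soft == resource.RLIM_INFINITY and hard == resource.RLIM_INFINITY


print("memlock unlimited:", memlock_is_unlimited())
```

Note that this reports the limit of the process running the script, not of the ceph-osd daemon. A daemon started by systemd takes its limit from the unit's LimitMEMLOCK setting, so a shell-level `ulimit -l unlimited` may not apply to it.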

Thanks,
Orlando
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
