> if you can point out specific issues, we will be happy to work with you
> to get them addressed!

Hello Ursula,
My list of issues that I would like to see addressed can be found below. Doug,
Christoph and others may have additional inputs. The issues that have not yet
been mentioned in other e-mails are:
- The SMC driver only supports one RDMA transport type (RoCEv1) and
none of the other RDMA transport types (RoCEv2, IB and iWARP). New
RDMA drivers should support all RDMA transport types transparently.
The traditional way to support multiple RDMA transport types is to
use the RDMA/CM to establish connections.
- The implementation of the SMC driver only supports RoCEv1. This is
a very unfortunate choice because:
- RoCEv1 is not routable and hence is limited to a single Ethernet
broadcast domain.
- RoCEv1 packets bypass a number of mechanisms that only work for
IP packets. Firewalls pass all RoCEv1 packets and switches do
not restrict RoCEv1 packets to a single VLAN. This means that if
the network configuration is changed after an SMC connection has
been set up such that IP communication between the endpoints of
that connection is blocked, the SMC RoCEv1 packets will still
not be blocked by the network equipment whose configuration has
just been changed.
- As already mentioned by Christoph, the SMC implementation uses RDMA
calls that probably will be deprecated soon (ib_create_cq()) and
should use the RDMA R/W API instead of building sge lists itself.
I would also suggest that you stop exposing the DMA MR for remote
access (at least by default) and use a proper reg_mr operation with a
limited lifetime on a properly sized buffer.
Also, the cq polling code looks completely wrong; you should really
use the RDMA CQ API.
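
To illustrate the RoCEv1 filtering concern raised above, here is a minimal
userspace sketch (not SMC code) of how a frame classifier tells the two RoCE
variants apart. RoCEv1 is carried directly in Ethernet with its own Ethertype
(0x8915), so there is no IP header for an IP-level filter to match, while
RoCEv2 is carried in UDP with destination port 4791. The function and enum
names are made up for this example, and only IPv4 plus a single optional
VLAN tag is handled.

```c
#include <stddef.h>
#include <stdint.h>

#define ETHERTYPE_IPV4   0x0800
#define ETHERTYPE_VLAN   0x8100
#define ETHERTYPE_ROCEV1 0x8915  /* RoCEv1: GRH directly after the Ethernet header */
#define IP_PROTO_UDP     17
#define ROCEV2_UDP_PORT  4791    /* RoCEv2: IANA-assigned UDP destination port */

/* Hypothetical names, introduced only for this illustration. */
enum frame_kind { FRAME_OTHER, FRAME_ROCE_V1, FRAME_ROCE_V2 };

/* Read a big-endian 16-bit value. */
static uint16_t be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Classify a raw Ethernet frame (no FCS) of 'len' bytes. */
enum frame_kind classify_frame(const uint8_t *f, size_t len)
{
	size_t off = 12;	/* skip destination and source MAC */
	uint16_t type;
	size_t ihl;

	if (len < 14)
		return FRAME_OTHER;
	type = be16(f + off);
	off += 2;

	/* Step over one 802.1Q tag if present. */
	if (type == ETHERTYPE_VLAN) {
		if (len < off + 4)
			return FRAME_OTHER;
		type = be16(f + off + 2);
		off += 4;
	}

	/* No IP header at all: IP-based filtering never sees this frame. */
	if (type == ETHERTYPE_ROCEV1)
		return FRAME_ROCE_V1;

	if (type != ETHERTYPE_IPV4 || len < off + 20)
		return FRAME_OTHER;

	ihl = (size_t)(f[off] & 0x0f) * 4;	/* IPv4 header length in bytes */
	if (ihl < 20 || len < off + ihl + 8)
		return FRAME_OTHER;
	if (f[off + 9] != IP_PROTO_UDP)
		return FRAME_OTHER;

	/* UDP destination port is at bytes 2..3 of the UDP header. */
	if (be16(f + off + ihl + 2) == ROCEV2_UDP_PORT)
		return FRAME_ROCE_V2;
	return FRAME_OTHER;
}
```

Since a RoCEv2 frame is an ordinary routable UDP/IP packet, it is subject to
the same IP routing and filtering rules as the TCP connection that SMC falls
back to, which is the property the RoCEv1-only design lacks.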