Eli Cohen wrote:
> RDMA over Ethernet (RDMAoE) allows running the IB transport protocol
> using Ethernet frames allowing the deployment of IB semantics on
> lossless Ethernet fabrics. RDMAoE packets are standard Ethernet frames
> with an IEEE assigned Ethertype, a GRH, unmodified IB transport
> headers and payload. Aside from the considerations pointed out below,
> RDMAoE ports are functionally equivalent to regular IB ports from the
> RDMA stack perspective.
>
> IB subnet management and SA services are not required for RDMAoE
> operation; Ethernet management practices are used instead. In
> Ethernet, nodes are commonly referred to by applications by means of
> an IP address. RDMAoE encodes the IP addresses that were assigned to
> the corresponding Ethernet port into its GIDs, and makes use of the IP
> stack to bind a destination address to the corresponding netdevice
> (just as the CMA does today for IB and iWARP) and to obtain its L2 MAC
> addresses.
>
> The RDMA Verbs API is syntactically unmodified. When referring to
> RDMAoE ports, Address handles are required to contain GIDs and the L2
> address fields in the API are ignored. The Ethernet L2 information is
> then obtained by the vendor-specific driver (both in kernel- and
> user-space) while modifying QPs to RTR and creating address handles.
>
> In order to maximize transparency for applications, RDMAoE implements
> a dedicated API that provides services equivalent to some of those
> provided by the IB-SA. The current approach is strictly local but may
> evolve in the future. This API is implemented using an independent
> source code file which allows for seamless evolution of the code
> without affecting the IB native SA interfaces. We have successfully
> tested MPI, SDP, RDS, and native Verbs applications over RDMAoE.
>
> To enable RDMAoE with the mlx4 driver stack, both the mlx4_en and
> mlx4_ib drivers must be loaded, and the netdevice for the
> corresponding RDMAoE port must be running. Individual ports of a multi
> port HCA can be independently configured as Ethernet (with support for
> RDMAoE) or IB, as is already the case.
>
> Following is a series of 8 patches based on version 2.6.30 of the
> Linux kernel. This new series reflects changes based on feedback from
> the community on the previous set of patches. The whole series is
> tagged v3.
>
> Signed-off-by: Eli Cohen <e...@mellanox.co.il>
>
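
For readers following the thread, the statement that RDMAoE "encodes the IP addresses ... into its GIDs" can be illustrated with a small sketch. The cover letter does not spell out the byte layout, so the IPv4-mapped IPv6 form (::ffff:a.b.c.d) used here is an assumption, and the helper names ip_to_gid and gid_to_str are hypothetical, not taken from the patches.

```python
import ipaddress


def ip_to_gid(ip: str) -> bytes:
    """Return a 16-byte GID for an IPv4 or IPv6 address.

    Assumption: IPv4 addresses are stored in IPv4-mapped IPv6 form
    (::ffff:a.b.c.d); the patch series may use a different layout.
    """
    addr = ipaddress.ip_address(ip)
    if addr.version == 4:
        # 80 zero bits, 16 one bits, then the 32-bit IPv4 address
        return b"\x00" * 10 + b"\xff\xff" + addr.packed
    # An IPv6 address is already 128 bits wide, the same size as a GID
    return addr.packed


def gid_to_str(gid: bytes) -> str:
    """Format a GID as eight colon-separated 16-bit hex groups."""
    if len(gid) != 16:
        raise ValueError("a GID is exactly 16 bytes")
    return ":".join(gid[i:i + 2].hex() for i in range(0, 16, 2))


if __name__ == "__main__":
    print(gid_to_str(ip_to_gid("192.168.1.10")))
    print(gid_to_str(ip_to_gid("fe80::1")))
```

Under this assumed layout a GID carries the full IP address, so a peer can be identified from the IP address alone, and the MAC address can then be looked up through the IP stack as the cover letter describes.
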
I agree with Or here. I really do not think that making RDMAoE transparent to applications is worth pushing a lot of compatibility code into the kernel. The winner here is definitely rdmaoe_sa - 1000 lines of useless code which boils down to kernel_bind and kernel_setsockopt. Why do you need all this code to hold state, refcounts, whatever - if the kernel already does this for you?

If an application uses IB - let it use real IB. If it uses RDMA - let it use all RDMA implementations out there (IB, iWARP, RDMAoE). Therefore, I think the correct place to add RDMAoE is under rdma_cm. If a consumer wants to use RDMAoE - it should use rdma_cm.

It looks like you are trying to add something that sits between RDMAoE and IBoE, and are putting a lot of hacky bypass logic into the core and the ULPs.

--Yossi