See below. --Liran
Liran Liss wrote: > GRH is required for all packets; it keeps things simple and provides > length information yes, makes sense > There is no reason to restrict RDMAoE to RDMACM apps: all existing IB > applications can work over RDMAoE Considering the fact that you've put away IPoIB and the non-existence of user space SA queries, the only remaining potential user is SRP which anyway uses sophisticated QP1 and SA services even before going to H.A and multi-pathing - so I don't see anyone using SRP with RDMAoE. As for MPIs which don't use rdma-cm, the reason for that has been the problem or perception of SA scalability which doesn't exist over Ethernet. So all-in-all (or maybe all-to-all...) the only remaining consumers are those MPIs which will insist on exchanging <GID, MAC> pairs the way they exchange <GID, LID> pairs today (note that with IB GRH isn't a must so you should now go and patch all MPIs which don't exchange GIDs). These guys will simply create Adress handles with the <GID, MAC> pairs, and as such there's no need to have any layer below the rdma-cm and addr doing MAC or GID resolution. LL: I think that not having a user-space SA API is a big hole - we should not accept this situation and work to fill in the gap. Once such an API exists, all non-rdmacm apps will probably use it - implementing path queries and multicast joins is a tedious effort, as the convergence in the kernel has shown. For this reason, I believe that eventually almost all MAD traffic (at least for conventional apps - not IB network tools which are irrelevant for RDMAoE anyhow) will go only through QP0 and QP1. So, we will add the same SA support for RDMAoE in user-space, and nearly all apps can make use of it. > we already have a great CM protocol that runs over QP1 and is a > perfect fit for RDMAoE; we don't see any reason to change this sounds good, lets see what the other reviewers think, so you didn't change a bit in the CM? LL: that's the beauty of it: not a single change! > The rmdaoa "sa" code does more than just path resolution - it provides SL, MTU, rate, packet lifetime, and other params. The same thing is also true for multicast joins. mmm, interesting, what is the criteria / logic to assign SL and rate? looking in rdmaoe_sa.c I didn't find any sign to such code, please point me to the function/s that implement this. This is before going to the way you've implemented the multicast joins, which I'll get to later. LL: this is still TBD. We will need to define reasonable policies to integrate with DCB features, such as PFC. Any ideas on this are more than welcome! > In terms of the code, it is very simple: - move useful definitions > that were previously in multicast.c/sa_query.c to header files. - > implement only what is needed for rdmaoe in rdmaoe_sa.c. Perhaps we > can make better code reuse (calling the SA's cmp_rec(), for example). > We will look into it; thanks! I would just like to note that there is > a tradeoff here what ever, however, going your way (i.e claim that there's a need to rdmaoe_sa.c, I still don't see why rdmaoe_sa.c is needed) I don't see how it can be reviewed this since you made all the above changes in one patch - if you want review, separate to at least two patches, the first moves a way code, the second add the rdmaoe_sa.c. LL: good idea, thanks. Regarding rmdaoe_sa.c, we are open to suggestions from the community. We can surely do without it: RDMAoE support will amount to a few 'if' statements in the existing SA code. We understood that most people want to see this separated until the standard is finalized. In the meantime, we will try to reduce code duplication as much as possible. BTW - same goes for the API changes, you should have one patch that relates to kernel only changes, and another patch which relates to the user space changes, and this patch should come in a series with changes to libibverbs so people can review both sides of the kernel/user channel. LL: point noted, thanks. > Yes, we intend to use standard IPv4-mapped addresses > (::0xffff<a.b.c.d>) okay, any reason not to do this from day one? is there anyone in the IB stack or in your firmware that assumes a specific GID would always be in index 0? LL: not at all. Its just a matter of what comes first - implementation is ongoing. > on the contrary, there is *nothing* in the architecture that is HW-specific. This is intentional --- SW rdmaoe will be straighforward, efficient (as is SW FCoE), and simple (this is rdma transport after all!) to implement over any nic. We already know of people that are looking into that. I'm not convinced... e.g we'll get to this more when doing a detailed review on the multicast thing. Also, I haven't seen any comment from anyone on your API changes, can be nice if you act to make the sw rdmaoe developers comment here. LL: the only API changes are the sa-like path-queries and multicast joins; verbs are not changed in any way. In addition, we implemented the 'get_mac' method (for translating gids to macs) in the generic user-kernel ABI so any RDMAoE implementation can use it. I don't understand what concerns you; please explain. Or. _______________________________________________ general mailing list [email protected] http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
