S.B.
--Liran

-----Original Message-----
From: Roland Dreier [mailto:rdre...@cisco.com] 
Sent: Saturday, July 03, 2010 11:34 PM
To: Liran Liss
Cc: Jason Gunthorpe; Hefty, Sean; Aleksey Senin; linux-rdma; 
mo...@voltaire.com; aleks...@voltaire.com; yift...@voltaire.com; Tziporet 
Koren; al...@voltaire.com
Subject: Re: When IBoE will be merged to upstream?

 > Third, RoCE is not IB; it's all about making RDMA user-friendly to Ethernet 
 > users.

This is utter nonsense.  RoCE (or IBoE as I prefer ;) is absolutely 
IB-over-Ethernet and it is all about making minimal changes to IB and IB 
applications to run on Ethernet.
LL: ??? IBoE has a completely different management plane (no SM, no SA), and we 
want this management plane to be "as Ethernet as possible" - this is what I 
mean by "not IB".
Otherwise, we are in full agreement: the changes to the IB transport should 
indeed be minimal.

 > Most importantly, we don't want to change the way Ethernet networks are 
 > managed.

That makes sense.  However let's be honest with ourselves -- the fraction of 
Ethernet networks using IPv6 as their only or even main address scheme is 
pretty small.  Of course having a migration path to work with IPv6 is 
important, but for the moment users want to use IPv4 addresses to specify 
destinations.

LL: I wasn't referring only to IPv6 networks; there are standard ways to 
represent IPv4 addresses in the IPv6 (and thus, IBoE GID) namespace: 
IPv4-mapped addresses (::ffff:<ipv4>).
Any mapped IPv4 address can be resolved the same way as the IPv4 address it 
embeds.
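
As an illustration of the point above, here is a minimal sketch of how a 
16-byte GID could be tested for the IPv4-mapped form and the embedded IPv4 
address pulled out for ordinary IPv4 resolution. The helper name 
gid_extract_ipv4 and the raw byte-array GID layout are assumptions made for 
this example; they are not taken from the IBoE patches.

```c
#include <assert.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/*
 * Hypothetical helper (not from the IBoE patches): if the 16-byte GID has
 * the IPv4-mapped form ::ffff:a.b.c.d, copy the embedded IPv4 address
 * (network byte order) into *out and return 1. Otherwise return 0.
 */
static int gid_extract_ipv4(const unsigned char gid[16], struct in_addr *out)
{
    /* 80 zero bits followed by 16 one bits, per the IPv4-mapped format. */
    static const unsigned char prefix[12] = {
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff
    };

    assert(gid != NULL && out != NULL);

    if (memcmp(gid, prefix, sizeof(prefix)) != 0)
        return 0;

    /* The last four bytes are the IPv4 address, already in network order. */
    memcpy(&out->s_addr, gid + 12, sizeof(out->s_addr));
    return 1;
}
```

A caller would hand the extracted address to whatever IPv4 neighbor lookup 
the Ethernet interface already uses; a GID that does not match the prefix 
would be handled as a plain IPv6 address instead.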

 > - RoCE gids are L3 addresses, which are not (necessarily) of link-local
 >   scope; people will mostly use IP-mapped gids of global scope.
 > - These gids will map to an IP address, which then can resolve to an
 >   outgoing vlan device exactly as in Ethernet.

At that level it all makes sense, but the problem is the specifics of where, 
when and how the mapping is done.

 > We have a specification, we have an implementation, and we have a clean way 
 > of passing RoCE L2 information to user-space via address handles.

We may have an implementation but we absolutely don't have a specification.  Or 
at least the IBA annex has nothing beyond this:

    A16.5.1 ADDRESS ASSIGNMENT AND RESOLUTION

    Layer 2 local addresses (i.e. SMAC, DMAC), and the methods by which
    those addresses are assigned, are outside the scope of this annex.

    The means for resolving a GID to a local port address (i.e. SMAC or
    DMAC) are outside the scope of this annex. It is assumed that
    standard Ethernet mechanisms, such as ARP or Neighbor Discovery are
    used to maintain an appropriate address cache for RoCE ports.

which was really pretty unfortunate, since it means the exact point we're 
talking about is completely unspecified.  Or is there some other spec you can 
point to?

(This also means it's pretty important that we get this right, since every 
future implementation is going to have a lot of pressure to follow what Linux 
does)

LL: the ibxoe working group has recommended using both IP-mapped and link-local 
addresses (http://www.t11.org/ftp/t11/pub/fc/study/09-543v0.pdf).
Other than that, there is no comprehensive spec so I am afraid you are right.
It seems natural to base IBoE addressing on IPv6 practices:
- map IPv6 addresses in a straightforward manner.
- map IPv4 addresses using IPv4-mapped addresses.
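
The two mapping rules above can be sketched as follows. This is only an 
illustration under stated assumptions: the function name ip_literal_to_gid 
and the choice to return the address family are invented for the example, 
and the GID is treated as a plain 16-byte array.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Hypothetical helper: turn an IPv4 or IPv6 address literal into a
 * 16-byte GID.
 *   - IPv6 literal: the 16 address bytes are copied as-is.
 *   - IPv4 literal: stored as the IPv4-mapped address ::ffff:a.b.c.d.
 * Returns AF_INET6 or AF_INET to say which rule applied, or -1 if the
 * text is not a valid address literal.
 */
static int ip_literal_to_gid(const char *text, unsigned char gid[16])
{
    struct in6_addr v6;
    struct in_addr v4;

    assert(text != NULL && gid != NULL);

    if (inet_pton(AF_INET6, text, &v6) == 1) {
        memcpy(gid, v6.s6_addr, 16);
        return AF_INET6;
    }

    if (inet_pton(AF_INET, text, &v4) == 1) {
        memset(gid, 0, 10);                 /* bytes 0..9 are zero    */
        gid[10] = 0xff;                     /* bytes 10..11 are ffff  */
        gid[11] = 0xff;
        memcpy(gid + 12, &v4.s_addr, 4);    /* network byte order     */
        return AF_INET;
    }

    return -1;
}
```

With this sketch, "10.0.0.1" and "::ffff:10.0.0.1" produce the same 16 
bytes, which is the property that lets an IPv4-configured interface and an 
IPv6-style GID namespace coexist.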

 > I don't see any substantial reason to change the basic approach.

I don't really even know what the basic approach is.  For example what's the 
plan for handling GIDs that aren't derived from a MAC address?  For a long time 
we've assumed that the create_ah verb can't sleep, so where are you going to do 
neighbor discovery?

LL: by basic approach, I mean: without modifying IB L2 fields in address 
handles, CQEs, or in MAD payloads.
IBoE doesn't need to do discovery on its own; it can inherit the IP addresses, 
MACs and VLANs of the eth interface it is associated with.
GIDs that aren't derived from MAC addresses are IP-mapped addresses, which can 
be resolved according to their associated IP addresses.
So, from an admin's perspective, IBoE address resolution matches whatever was 
configured for the eth interfaces; no new scheme.
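
To make the split between MAC-derived and IP-mapped GIDs concrete, here is 
a rough sketch. It assumes that a MAC-derived GID is a link-local address 
(fe80::/64) whose interface identifier follows the modified EUI-64 
convention from IPv6 (RFC 4291); the thread does not spell this out, so 
treat it as an assumption. The names gid_kind, classify_gid and gid_to_mac 
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical classification of a 16-byte GID for resolution purposes. */
enum gid_kind {
    GID_LINK_LOCAL_EUI64,   /* MAC is recoverable from the GID itself   */
    GID_IPV4_MAPPED,        /* resolve through the embedded IPv4 address */
    GID_OTHER               /* resolve as an ordinary IPv6 address       */
};

static enum gid_kind classify_gid(const uint8_t gid[16])
{
    static const uint8_t ll_prefix[8] = { 0xfe, 0x80, 0, 0, 0, 0, 0, 0 };
    static const uint8_t v4_prefix[12] = {
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff
    };

    assert(gid != NULL);

    /* Modified EUI-64 inserts ff:fe in the middle of the 48-bit MAC. */
    if (memcmp(gid, ll_prefix, sizeof(ll_prefix)) == 0 &&
        gid[11] == 0xff && gid[12] == 0xfe)
        return GID_LINK_LOCAL_EUI64;

    if (memcmp(gid, v4_prefix, sizeof(v4_prefix)) == 0)
        return GID_IPV4_MAPPED;

    return GID_OTHER;
}

/*
 * Recover the MAC from a MAC-derived GID without any neighbor lookup.
 * Returns 0 on success, -1 if the GID is not of that form.
 */
static int gid_to_mac(const uint8_t gid[16], uint8_t mac[6])
{
    if (classify_gid(gid) != GID_LINK_LOCAL_EUI64)
        return -1;

    mac[0] = gid[8] ^ 0x02;   /* undo the universal/local bit flip */
    mac[1] = gid[9];
    mac[2] = gid[10];
    mac[3] = gid[13];
    mac[4] = gid[14];
    mac[5] = gid[15];
    return 0;
}
```

Only the first case can be answered without sleeping; the other two depend 
on the neighbor state of the associated eth interface, which is where the 
question about create_ah comes in.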

Regarding the implementation, there is no inherent issue that prevents 
create_ah() from sleeping:
- Change a few spinlocks to mutexes in the CMA (which sleeps a lot anyway 
because it modifies QP states)
- Trivial for user-space calls...



 - R.
--
Roland Dreier <rola...@cisco.com> || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
