Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Sun, 2015-05-17 at 08:50 +0300, Haggai Eran wrote:
> Thanks again everyone for the review comments. I've updated the patch set
> accordingly. The main changes are in the first patch to use a read-write
> semaphore instead of an SRCU, and with the reference counting of shared
> ib_cm_ids.
> Please let me know if I missed anything, or if there are other issues with
> the series.

Hi Haggai,

I know you are probably busy reworking this right now on the basis of
Jason's comments. However, my biggest issue with this patch set right
now is not technical (well, it is, but it's only partially technical).

This is a core feature more than anything else. Namespaces for RDMA
devices is not unique to IB or RoCE in any way. Yet no thought has been
given to how this will work universally across all of the RDMA capable
devices (mainly I'm talking about iWARP here...I don't think this is an
issue for usNIC as if you want namespace support there, you just start
the user space app in a given namespace and you are probably 90% of the
way there since the user space application gets its own device and so
its own MAC/IP and all of the RDMA transfers are UDP, so the
application's namespace should get inherited by all the rest, but Cisco
would need to confirm that, hence why I say 90% of the way there, it
needs confirmed).

So, while you are reworking things right now, you would ideally contact
Steve Wise and/or Tatyana Nikolova and discuss the iWARP story on this.
I know there won't be a lot of overlap between IB and iWARP, but last
time you were asked you didn't even know if this setup could be extended
to iWARP.

For this next statement, I know I'm directing this to you Haggai, but
please don't take it that way. I'm really using your patch set to make
a broader point to everyone on the list.

When I look at patches for support for a given feature, one of the
things I'm going to look at is whether or not that feature is specific
to a given hardware type, or if it's a generic feature. If it's a
generic feature, then I'm going to want to know that the person
submitting it has designed it well. A pre-requisite of designing a
generic feature well is that it considers all hardware types, not just
your specific hardware type. So when you come back with the next
version of this patch set, please have an answer for how it should work
on each hardware type even if you don't have implementation patches for
each hardware type.

--
Doug Ledford
GPG KeyID: 0E572FDD
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Tue, May 26, 2015 at 09:34:40AM -0400, Doug Ledford wrote:

> This is a core feature more than anything else. Namespaces for RDMA
> devices is not unique to IB or RoCE in any way. Yet no thought has been
> given to how this will work universally across all of the RDMA
> capable

I think if Haggai is able to follow the prescription I gave then things
will be general.

 - All rdma cm ids are associated with a netdev
 - The output flow uses that netdev to restrict, configure and
   determine the output RDMA device QP
 - The input flow locates the netdev as step one and then uses the
   (netdev,ip,port) tuple to find the rdma listener, which is in turn
   tied to a netdev and is restricted/configured by it.

The technology specific part is the two maps: from (input
device,packet) to netdev, from netdev to (output device,packet)

After the above clean up is done, namespace enabling is basically
providing those two mapping functions for each technology in a way
that can locate delegatable netdevs.

The trivial case for all the ethernet techs is to provide the above
maps that can take the (input device,VLAN) and locate the correct
child VLAN specific netdev. The existing code to support VLAN should
pretty much immediately enable basic namespace support for all the
ethernet families.

The big open question for ethernet is how to work without relying on
VLAN to create delegated netdevs - typically one would use a bridge and
veth's, which do not seem very RDMA compatible. But that doesn't need
to be answered right now.

Remember, this isn't RDMA namespaces, this is netdev namespace support
for RDMA-CM -> very different things.

Basically, I'm happy with the generality story, if the clean up work I
outlined turns out..

> issue for usNIC as if you want namespace support there, you just start
> the user space app in a given namespace and you are probably 90% of
> the

usNIC has no kernel facing functionality, and no interaction with
RDMA-CM, so it is irrelevant to any discussion about RDMA-CM :(

Jason
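The two-step input flow described in this message can be sketched in a few lines of C. This is only an illustration under stated assumptions: every type and function name below (`sketch_netdev`, `sketch_listener`, `sketch_find_netdev`, `sketch_find_listener`) is hypothetical and is not taken from the kernel or from the patch set, and the technology specific map shown is the Ethernet VLAN case mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; these are NOT kernel structures. */
struct sketch_netdev {
    int ifindex;    /* identifies the (child) netdev */
    int rdma_dev;   /* RDMA device the netdev sits on */
    uint16_t vlan;  /* VLAN tag that selects this child netdev */
};

struct sketch_listener {
    int ifindex;    /* netdev the listener is tied to */
    uint32_t ip;    /* IPv4 address, host byte order */
    uint16_t port;  /* RDMA-CM port chosen by the application */
};

/* Step 1, technology specific: (input device, packet) -> netdev.
 * Here the only packet field consulted is the VLAN tag. */
const struct sketch_netdev *
sketch_find_netdev(const struct sketch_netdev *devs, size_t n,
                   int rdma_dev, uint16_t vlan)
{
    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (devs[i].rdma_dev == rdma_dev && devs[i].vlan == vlan)
            return &devs[i];
    return NULL;
}

/* Step 2, generic: (netdev, ip, port) -> listener. The netdev found in
 * step 1 is part of the key, so the IP address alone never selects it. */
const struct sketch_listener *
sketch_find_listener(const struct sketch_listener *ls, size_t n,
                     const struct sketch_netdev *nd,
                     uint32_t ip, uint16_t port)
{
    assert(ls != NULL || n == 0);
    if (nd == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        if (ls[i].ifindex == nd->ifindex &&
            ls[i].ip == ip && ls[i].port == port)
            return &ls[i];
    return NULL;
}
```

With two child netdevs on the same RDMA device that carry the same IP address and port but different VLAN tags, step 1 picks the netdev from the tag and step 2 then returns the listener tied to that netdev only.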
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Tue, 2015-05-26 at 10:59 -0600, Jason Gunthorpe wrote:
> On Tue, May 26, 2015 at 09:34:40AM -0400, Doug Ledford wrote:
>
> > This is a core feature more than anything else. Namespaces for RDMA
> > devices is not unique to IB or RoCE in any way. Yet no thought has been
> > given to how this will work universally across all of the RDMA
> > capable
>
> I think if Haggai is able to follow the prescription I gave then things
> will be general.
>
>  - All rdma cm ids are associated with a netdev
>  - The output flow uses that netdev to restrict, configure and
>    determine the output RDMA device QP
>  - The input flow locates the netdev as step one and then uses the
>    (netdev,ip,port) tuple to find the rdma listener, which is in turn
>    tied to a netdev and is restricted/configured by it.
>
> The technology specific part is the two maps: from (input
> device,packet) to netdev, from netdev to (output device,packet)
>
> After the above clean up is done, namespace enabling is basically
> providing those two mapping functions for each technology in a way
> that can locate delegatable netdevs.
>
> The trivial case for all the ethernet techs is to provide the above
> maps that can take the (input device,VLAN) and locate the correct
> child VLAN specific netdev. The existing code to support VLAN should
> pretty much immediately enable basic namespace support for all the
> ethernet families.
>
> The big open question for ethernet is how to work without relying on
> VLAN to create delegated netdevs - typically one would use a bridge and
> veth's, which do not seem very RDMA compatible. But that doesn't need
> to be answered right now.
>
> Remember, this isn't RDMA namespaces, this is netdev namespace support
> for RDMA-CM -> very different things.

That was the point of my email. This is a very myopic view of the
feature. It *should* at least have an idea of these other things too.

> Basically, I'm happy with the generality story, if the clean up work I
> outlined turns out..
>
> > issue for usNIC as if you want namespace support there, you just start
> > the user space app in a given namespace and you are probably 90% of
> > the
>
> usNIC has no kernel facing functionality, and no interaction with
> RDMA-CM, so it is irrelevant to any discussion about RDMA-CM :(

Whether usNIC has a kernel facing functionality or not is irrelevant.
This feature isn't kernel only, it affects user space applications
launched in a namespace too. And, again, my point was that this
discussion is about RDMA-CM and it should be broader (even if the
implementation isn't broader). Due to the implementation of usNIC I
suspect it would "just work", but it would be better to know so.

--
Doug Ledford
GPG KeyID: 0E572FDD
RE: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
> -----Original Message-----
> From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org] On
> Behalf Of Doug Ledford
> Sent: Tuesday, May 26, 2015 6:35 AM
> To: Haggai Eran
> Cc: linux-r...@vger.kernel.org; netdev@vger.kernel.org; Liran Liss; Guy
> Shapiro; Shachar Raindel; Yotam Kenneth
> Subject: Re: [PATCH v4 for-next 00/12] Add network namespace support in the
> RDMA-CM

...

> I don't think this is an issue for usNIC as if you
> want namespace support there, you just start the user space app in a given
> namespace and you are probably 90% of the way there since the user space
> application gets its own device and so its own MAC/IP and all of the RDMA
> transfers are UDP, so the application's namespace should get inherited by all
> the rest, but Cisco would need to confirm that, hence why I say 90% of the way
> there, it needs confirmed).

This is correct.

Thanks
/Chris
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Tue, May 26, 2015 at 01:46:36PM -0400, Doug Ledford wrote:

> > Remember, this isn't RDMA namespaces, this is netdev namespace support
> > for RDMA-CM -> very different things.
>
> That was the point of my email. This is a very myopic view of the
> feature. It *should* at least have an idea of these other things too.

Everything you talked about seems covered: iwarp/roce/ib now have a
fairly clear uniform story for CM. usNIC doesn't use core code.

I doubt a larger discussion about a 'rdma namespace' is going to
substantially change these patches, they are really netdev focused.

Anyhow, I've been saving that discussion for when the roce and
umad/uverbs namespace stuff is re-sent. It seems more appropriate at
that point.

I don't know about you, but I am exhausted looking at these huge patch
sets, and narrowing the focus is the only way I see to get through.

This series has hopefully narrowed to: 'fix the flow in netdev
handling for rdma-cm'.

Jason
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On 26/05/2015 16:34, Doug Ledford wrote:
> On Sun, 2015-05-17 at 08:50 +0300, Haggai Eran wrote:
>> Thanks again everyone for the review comments. I've updated the patch set
>> accordingly. The main changes are in the first patch to use a read-write
>> semaphore instead of an SRCU, and with the reference counting of shared
>> ib_cm_ids.
>> Please let me know if I missed anything, or if there are other issues with
>> the series.
>
> Hi Haggai,
>
> I know you are probably busy reworking this right now on the basis of
> Jason's comments. However, my biggest issue with this patch set right
> now is not technical (well, it is, but it's only partially technical).

Hi,

I'm sorry about the late reply. We had a holiday here, and then some
other tasks took precedence. I've only got back to working on this
today.

>
> This is a core feature more than anything else. Namespaces for RDMA
> devices is not unique to IB or RoCE in any way. Yet no thought has been
> given to how this will work universally across all of the RDMA capable
> devices (mainly I'm talking about iWARP here...

I don't agree. It is true we are not planning to provide an iWarp
implementation for network namespaces, as we lack the capacity and the
expertise. However, I think that the changes we proposed to the rdma_cm
module will work with iWarp too. Perhaps with some of Jason's
suggestions it will be smoother, but even in the current design, I think
that if iWarp drivers can provide iw_cm with the network device on which
a request is received, then it should be simple to modify it for
namespace support without significant change to rdma_cm.

> I don't think this is an
> issue for usNIC as if you want namespace support there, you just start
> the user space app in a given namespace and you are probably 90% of the
> way there since the user space application gets its own device and so
> its own MAC/IP and all of the RDMA transfers are UDP, so the
> application's namespace should get inherited by all the rest, but Cisco
> would need to confirm that, hence why I say 90% of the way there, it
> needs confirmed).
>
> So, while you are reworking things right now, you would ideally contact
> Steve Wise and/or Tatyana Nikolova and discuss the iWARP story on this.
> I know there won't be a lot of overlap between IB and iWARP, but last
> time you were asked you didn't even know if this setup could be extended
> to iWARP.
>
> For this next statement, I know I'm directing this to you Haggai, but
> please don't take it that way. I'm really using your patch set to make
> a broader point to everyone on the list.
>
> When I look at patches for support for a given feature, one of the
> things I'm going to look at is whether or not that feature is specific
> to a given hardware type, or if it's a generic feature. If it's a
> generic feature, then I'm going to want to know that the person
> submitting it has designed it well. A pre-requisite of designing a
> generic feature well is that it considers all hardware types, not just
> your specific hardware type. So when you come back with the next
> version of this patch set, please have an answer for how it should work
> on each hardware type even if you don't have implementation patches for
> each hardware type.

Well, because the RDMA subsystem supports a very diverse set of devices,
I think there are few people who know the details of all hardware types
well. If we are going to evolve the generic parts of the stack, we have
to cooperate. We have to rely on the knowledge of people on the mailing
list to say whether the feature is well designed for all hardware types,
or whether changes are warranted. In this specific case, the patches have
been on the list since February. I think it is enough time to allow
anyone who is interested in network namespace support to chime in.

Regards,
Haggai
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On 26/05/2015 19:59, Jason Gunthorpe wrote:
> The big open question for ethernet is how to work without relying on
> VLAN to create delegated netdevs - typically one would use a bridge and
> veth's, which do not seem very RDMA compatible. But that doesn't need
> to be answered right now.

I think in Ethernet the first step would be to support macvlan devices.
Like IPoIB child devices, they are directly attached to an RDMA device,
so they don't require handling a complex virtual bridging topology as
veths do.
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On 26/05/2015 20:46, Doug Ledford wrote:
>> Remember, this isn't RDMA namespaces, this is netdev namespace support
>> for RDMA-CM -> very different things.
> That was the point of my email. This is a very myopic view of the
> feature. It *should* at least have an idea of these other things too.

We did give some thought to the question of whether an RDMA namespace is
needed, and concluded that it isn't. RDMA resources such as QP numbers,
memory keys, etc. are allocated by the devices. So different containers
wouldn't care if they share the "QP number namespace", etc. RDMA CM
ports are different because they are chosen by the applications, but
they map directly to the network namespace, so they don't require their
own namespace.

Regards,
Haggai
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Thu, 2015-05-28 at 16:07 +0300, Haggai Eran wrote:
> On 26/05/2015 16:34, Doug Ledford wrote:
> > On Sun, 2015-05-17 at 08:50 +0300, Haggai Eran wrote:

> > This is a core feature more than anything else. Namespaces for RDMA
> > devices is not unique to IB or RoCE in any way. Yet no thought has been
> > given to how this will work universally across all of the RDMA capable
> > devices (mainly I'm talking about iWARP here...
> I don't agree. It is true we are not planning to provide an iWarp
> implementation for network namespaces, as we lack the capacity and the
> expertise. However, I think that the changes we proposed to the rdma_cm
> module will work with iWarp too. Perhaps with some of Jason's
> suggestions it will be smoother, but even in the current design, I think
> that if iWarp drivers can provide iw_cm with the network device on which
> a request is received, then it should be simple to modify it for
> namespace support without significant change to rdma_cm.

My request wasn't for a functional implementation, just a statement that
you had in fact thought about it and, as you say here, would expect it
to work (and preferably why as well).

> Well, because the RDMA subsystem supports a very diverse set of devices,
> I think there are few people who know the details of all hardware types
> well. If we are going to evolve the generic parts of the stack, we have
> to cooperate. We have to rely on the knowledge of people on the mailing
> list to say whether the feature is well designed for all hardware types,
> or whether changes are warranted. In this specific case, the patches have
> been on the list since February. I think it is enough time to allow
> anyone who is interested in network namespace support to chime in.

You would think that, but sometimes important information comes from
totally different places. See mine and Jason's comments back and forth
in the SRIOV thread started by Or. Long story short:

    ip link add dev ib0 name ib0.1 type ipoib

is totally broken on at least all Red Hat OSes. It will require
reworking of the network scripts and NetworkManager assumptions to make
it work. It will also break DHCP on the interface as pkey/guid are the
only items that uniquely identify DHCP clients. The net result of our
talks was that it is likely that each interface on the same pkey will
require an alias GUID per child interface in order to keep things
workable.

--
Doug Ledford
GPG KeyID: 0E572FDD
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Thu, May 28, 2015 at 04:22:36PM +0300, Haggai Eran wrote:
> wouldn't care if they share the "QP number namespace", etc. RDMA CM
> ports are different because they are chosen by the applications, but
> they map directly to the network namespace, so they don't require their
> own namespace.

Different containers should have restricted access to the PKey and GID
tables, and to the presence of the device itself. Just like in the SRIOV
case.

That is what the 'RDMA Namespace' would control.

Jason
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On 5/28/2015 5:07 PM, Doug Ledford wrote:
> You would think that, but sometimes important information comes from
> totally different places. See mine and Jason's comments back and forth
> in the SRIOV thread started by Or. Long story short: ip link add dev ib0
> name ib0.1 type ipoib is totally broken on at least all Red Hat OSes.
> It will require reworking of the network scripts and NetworkManager
> assumptions to make it work. It will also break DHCP on the interface as
> pkey/guid are the only items that uniquely identify DHCP clients. The
> net result of our talks was that it is likely that each interface on the
> same pkey will require an alias GUID per child interface in order to
> keep things workable.

Doug,

Just to make sure we're on the same page, you're saying that the IPoIB
DHCP scheme (client + server) used on RH products uses a Client-ID which
is either eight bytes long, or 20 bytes long with the four upper bytes
masked out (which of them?), and hence is broken when multiple entities
use the same ID.

Anything else except for that (you said "reworking of the network scripts
and NetworkManager assumptions to make it work")??

On the one hand, we realized that the implementation of same-PKEY IPoIB
children, which has existed for a while, is broken with the RH DHCP
scheme and should be enhanced. On the other hand, these children can
serve as nice building blocks for IPoIB containers or a virtio-IPoIB
scheme.

Note that out of the eleven patches that make the series, only ONE
relates directly to IPoIB, the rest are either applicable to all the
transports supported by the RDMA stack, or to IPoIB + RoCE. Under some
assumptions and changes, people can test it with a DHCP scheme different
from RH's or with a non-DHCP based IP address assignment scheme.

So we have a very nice effort and work done by developers, to bring RDMA
into containers, accompanied by reviewers providing lots of their brain
power to make it robust.

I don't see why we should stop the whole RDMA containers support train just
b/c we found out the IPoIB DHCP bug which was there for few years before
this effort started.

How about letting this series go in after the rest of the reviewers'
comments are addressed, so that under IPoIB it will work on a small set
of environments, while with macvlan based RoCE support, to be introduced
later, it will work on a wider set of environments?

Or.
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Thu, May 28, 2015 at 07:21:11PM +0300, Or Gerlitz wrote:

> Anything else except for that (you said "reworking of the network scripts
> and NetworkManager assumptions to make it work")??

IPv6 becomes very broken, child interfaces will generate the same IPv6
addresses for radv and link local resulting in duplicate address
scenarios.

About the only thing that will work properly is statically assigned
IPv4 addresses.

> I don't see why we should stop the whole RDMA containers support train just
> b/c we found out the IPoIB DHCP bug which was there for few years before
> this effort started.

I don't think that is what Doug said.

Jason
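The duplicate link-local problem can be made concrete with a small C sketch. It assumes the interface identifier is simply the 8-byte port GUID with the universal/local bit set, appended to the fe80::/64 prefix; that derivation is an assumption made for illustration, and the function names (`sketch_link_local_from_guid`, `sketch_same_link_local`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Build a link-local address: fe80::/64 followed by an interface
 * identifier derived from the 8-byte port GUID.
 * Assumption: the identifier is the GUID with the universal/local
 * bit (0x02 in the first byte) set. */
void sketch_link_local_from_guid(const uint8_t guid[8], uint8_t addr[16])
{
    assert(guid != NULL && addr != NULL);
    memset(addr, 0, 16);
    addr[0] = 0xfe;
    addr[1] = 0x80;
    memcpy(addr + 8, guid, 8);
    addr[8] |= 0x02;
}

/* Returns 1 when two GUIDs lead to the same link-local address. */
int sketch_same_link_local(const uint8_t guid_a[8], const uint8_t guid_b[8])
{
    uint8_t a[16], b[16];

    sketch_link_local_from_guid(guid_a, a);
    sketch_link_local_from_guid(guid_b, b);
    return memcmp(a, b, 16) == 0;
}
```

Nothing in this derivation depends on the PKey or on which child interface is asking, so a parent and a child that share a GUID end up with the same address, while a child given its own alias GUID does not.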
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Thu, 2015-05-28 at 11:43 -0600, Jason Gunthorpe wrote:
> On Thu, May 28, 2015 at 07:21:11PM +0300, Or Gerlitz wrote:
>
> > Anything else except for that (you said "reworking of the network scripts
> > and NetworkManager assumptions to make it work")??
>
> IPv6 becomes very broken, child interfaces will generate the same IPv6
> addresses for radv and link local resulting in duplicate address
> scenarios.
>
> About the only thing that will work properly is statically assigned
> IPv4 addresses.
>
> > I don't see why we should stop the whole RDMA containers support train just
> > b/c we found out the IPoIB DHCP bug which was there for few years before
> > this effort started.
>
> I don't think that is what Doug said.

Indeed. There is no need to scrap things, but if the design as it
stands, and the intended means of creating objects for use in
containers, is going to result in an unworkable network, then we have to
re-evaluate how the container constructs are created, and that then has
possible consequences for how we would get from an incoming packet to
the proper container.

I'm not trying to stop the "support train" here, but at the same time,
if the train is headed for a bridge that's out

--
Doug Ledford
GPG KeyID: 0E572FDD
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Thu, May 28, 2015 at 9:22 PM, Doug Ledford wrote:

>> I don't think that is what Doug said.

> Indeed. There is no need to scrap things, but if the design as it
> stands, and the intended means of creating objects for use in
> containers, is going to result in an unworkable network, then we have to
> re-evaluate how the container constructs are created, and that then has
> possible consequences for how we would get from an incoming packet to
> the proper container.

To be precise, do we agree that the issue here isn't "in the design as
it stands" but rather in a problem we found in the intended way of
assigning IP addresses through DHCP for the containers?

> I'm not trying to stop the "support train" here, but at the same time,
> if the train is headed for a bridge that's out

So what are you concretely saying here? Where should we go from here?

Or.
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Thu, 2015-05-28 at 22:05 +0300, Or Gerlitz wrote:
> On Thu, May 28, 2015 at 9:22 PM, Doug Ledford wrote:
>
> >> I don't think that is what Doug said.
>
> > Indeed. There is no need to scrap things, but if the design as it
> > stands, and the intended means of creating objects for use in
> > containers, is going to result in an unworkable network, then we have to
> > re-evaluate how the container constructs are created, and that then has
> > possible consequences for how we would get from an incoming packet to
> > the proper container.
>
> To be precise, do we agree that the issue here isn't "in the design as
> it stands" but rather in a problem we found in the intended way of
> assigning IP addresses through DHCP for the containers?

No, I would say the problem *is* in the design. But the problem is the
selected means of identifying the netdev to get to the namespace (and
the proposed means of creating non-default namespace devices to exist in
the container), not the namespace design itself.

> > I'm not trying to stop the "support train" here, but at the same time,
> > if the train is headed for a bridge that's out
>
> So what are you concretely saying here? Where should we go from here?

This excerpt is from the commit log of patch 3/12:

    The IB device and port, together with the P_Key and the IP address should
    be enough to uniquely identify the ULP net device.

The problem here is that this is wrong. If we allow more than one
device per pkey with the same GUID, then DHCP breaks, which is bad in
and of itself, but it also breaks ipv6 link local addressing. Which
means that this hunk in patch 4/12:

+#if IS_ENABLED(CONFIG_IPV6)
+	case AF_INET6:
+		if (ipv6_chk_addr(net, &addr_in6->sin6_addr, dev, 1))
+			return true;
+
+		break;
+#endif

can now be tricked into returning true for incorrect devices.

Where do we go from here?

First, I'm inclined to say we should modify the add_child portion of
IPoIB to refuse to add links to a PKey if that GUID is already present
on that PKey. You could then use different PKeys on the default GUID
for separate namespaces. If you need separate namespaces on the same
PKey, then enable alias GUIDs for use on the local adapter and require
one GUID per namespace on the same PKey.

Then I'm inclined to say that we should map for namespaces using device,
port, guid/gid, pkey. And in this situation, since a unique guid/gid on
any given pkey maps to a unique dhcp identifier and a unique ipv6
lladdr, this becomes freely interchangeable with device, port, pkey,
address mappings that this patchset was built around.

--
Doug Ledford
GPG KeyID: 0E572FDD
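The rule proposed in this message (one GUID per PKey on a given port, then lookup by device, port, GUID and PKey) can be sketched as a small C table. This is a toy model only: `sketch_port`, `sketch_child`, `sketch_add_child` and `sketch_lookup_netns` are hypothetical names, a `sketch_port` stands for one (device, port) pair, and the integer namespace id stands in for whatever handle a real implementation would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_CHILDREN 8

/* Hypothetical model of one IPoIB child interface. */
struct sketch_child {
    uint16_t pkey;
    uint64_t guid;
    int netns_id;   /* stand-in for a namespace handle */
};

/* Hypothetical model of one (device, port) pair and its children. */
struct sketch_port {
    struct sketch_child children[SKETCH_MAX_CHILDREN];
    size_t count;
};

/* Add a child; refuse when the same GUID already exists on that PKey.
 * Returns 0 on success, -1 on a duplicate or a full table. */
int sketch_add_child(struct sketch_port *port, uint16_t pkey,
                     uint64_t guid, int netns_id)
{
    assert(port != NULL);
    for (size_t i = 0; i < port->count; i++)
        if (port->children[i].pkey == pkey &&
            port->children[i].guid == guid)
            return -1;
    if (port->count == SKETCH_MAX_CHILDREN)
        return -1;
    port->children[port->count].pkey = pkey;
    port->children[port->count].guid = guid;
    port->children[port->count].netns_id = netns_id;
    port->count++;
    return 0;
}

/* Resolve (pkey, guid) on this port to a namespace id, or -1.
 * Uniqueness enforced at add time means at most one entry matches,
 * so no IP address is needed as part of the key. */
int sketch_lookup_netns(const struct sketch_port *port, uint16_t pkey,
                        uint64_t guid)
{
    assert(port != NULL);
    for (size_t i = 0; i < port->count; i++)
        if (port->children[i].pkey == pkey &&
            port->children[i].guid == guid)
            return port->children[i].netns_id;
    return -1;
}
```

In this model a second child on the same PKey is accepted only when it brings its own (alias) GUID, and the default GUID can still be reused across different PKeys.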
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On 29/05/2015 00:55, Doug Ledford wrote:
> On Thu, 2015-05-28 at 22:05 +0300, Or Gerlitz wrote:
>> So what are you concretely saying here? Where should we go from here?
>
> This excerpt is from the commit log of patch 3/12:
>
>     The IB device and port, together with the P_Key and the IP address should
>     be enough to uniquely identify the ULP net device.
>
> The problem here is that this is wrong. If we allow more than one
> device per pkey with the same GUID, then DHCP breaks, which is bad in
> and of itself, but it also breaks ipv6 link local addressing. Which
> means that this hunk in patch 4/12:
>
> +#if IS_ENABLED(CONFIG_IPV6)
> +	case AF_INET6:
> +		if (ipv6_chk_addr(net, &addr_in6->sin6_addr, dev, 1))
> +			return true;
> +
> +		break;
> +#endif
>
> can now be tricked into returning true for incorrect devices.
>
> Where do we go from here?
>
> First, I'm inclined to say we should modify the add_child portion of
> IPoIB to refuse to add links to a PKey if that GUID is already present
> on that PKey. You could then use different PKeys on the default GUID
> for separate namespaces. If you need separate namespaces on the same
> PKey, then enable alias GUIDs for use on the local adapter and require
> one GUID per namespace on the same PKey.

I don't think blocking the current add_child implementation is needed. I
agree IPv6 SLAAC and DHCP currently don't work well, and adding alias
GUID for child interfaces is important, but the current implementation
can be used with static IPv4 addresses, so I don't think it must be
disabled.

> Then I'm inclined to say that we should map for namespaces using device,
> port, guid/gid, pkey. And in this situation, since a unique guid/gid on
> any given pkey maps to a unique dhcp identifier and a unique ipv6
> lladdr, this becomes freely interchangeable with device, port, pkey,
> address mappings that this patchset was built around.

What if we change the namespaces patches to map (device, port, GID,
P_Key, IP) to netdev / namespace? That is, to use both the GID and the
IP address. This would allow people to use namespaces with the current
implementation (provided they have a valid configuration with no
conflicting IP addresses), and once alias GUIDs are added, the GUIDs
will be used to uniquely resolve the namespace even with such
misconfigurations.

Regards,
Haggai
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On 28/05/2015 18:46, Jason Gunthorpe wrote:
> On Thu, May 28, 2015 at 04:22:36PM +0300, Haggai Eran wrote:
>> wouldn't care if they share the "QP number namespace", etc. RDMA CM
>> ports are different because they are chosen by the applications, but
>> they map directly to the network namespace, so they don't require their
>> own namespace.
>
> Different containers should have restricted access to the PKey and GID
> tables, and to the presence of the device itself. Just like in the SRIOV
> case.
>
> That is what the 'RDMA Namespace' would control.

We were thinking here that there is room for an RDMA cgroup. It would
limit the amount of RDMA resources a container can use. It can also be
used for the restrictions you mentioned, but maybe they are more
suitable for a namespace. I'm not sure. In RoCE for instance, a
restricted access to the GID table can be derived from the network
namespace directly, but perhaps not in InfiniBand.

Regards,
Haggai
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Wed, Jun 03, 2015 at 01:03:01PM +0300, Haggai Eran wrote:
> > Then I'm inclined to say that we should map for namespaces using device,
> > port, guid/gid, pkey. And in this situation, since a unique guid/gid on
> > any given pkey maps to a unique dhcp identifier and a unique ipv6
> > lladdr, this becomes freely interchangeable with device, port, pkey,
> > address mappings that this patchset was built around.
>
> What if we change the namespaces patches to map (device, port, GID,
> P_Key, IP) to netdev / namespace? That is, to use both the GID and the
> IP address.

As I keep saying, you are not supposed to use the IP address as a key
to find the netdev, that is the wrong way to use the Linux netdev
model.

Requiring unique GID/PKey allows the implementation to avoid this
wrongness, which would be simplifying and more correct.

That is the appeal to blocking this scenario when children are created.

Jason
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Wed, Jun 3, 2015 at 7:14 PM, Jason Gunthorpe wrote: > On Wed, Jun 03, 2015 at 01:03:01PM +0300, Haggai Eran wrote: >> > Then I'm inclined to say that we should map for namespaces using device, >> > port, guid/gid, pkey. And in this situation, since a unique guid/gid on >> > any given pkey maps to a unique dhcp identifier and a unique ipv6 >> > lladdr, this becomes freely interchangeable with device, port, pkey, >> > address mappings that this patchset was built around. >> >> What if we change the namespaces patches to map (device, port, GID, >> P_Key, IP) to netdev / namespace? That is, to use both the GID and the >> IP address. > > As I keep saying, you are not supposed to use the IP address as a key > to find the netdev, that is the wrong way to use the Linux netdev > model. > > Requiring unique GID/PKey allows the implementation to avoid this > wrongness, which would be simplifying and more correct. > > That is the appeal to blocking this scenario when children are created. Jason, The IPoIB RTNL children were added around release 3.6/3.7 of the upstream kernel and are part of the kernel UAPI. They are used successfully in a bunch of schemes: 1. when static IP address assignment is used 2. under a PV scheme, when the guest has a para-virtual Ethernet NIC and the host routes between the back-end (e.g. tap or similar) and the IPoIB child, or when the host does tunneling (e.g. vxlan) and sends the encapsulated packet down through a host IP address assigned to the IPoIB child 3. a few more. Indeed the DHCP story isn't working there, and something has to be done to get DHCP to work. But this issue can't serve as grounds for blocking the existing UAPI and introducing a regression to working systems. Or.
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Wed, Jun 03, 2015 at 10:05:34PM +0300, Or Gerlitz wrote: > Indeed the DHCP story isn't working there and to get DHCP work > something has to be done. But this issue can't serve for blocking the > existing UAPI and introduce regression to working systems. It is not DHCP that concerns me, it is the fact we can't combine net namespaces, RDMA-CM and duplicate GUID IPoIB children together without adding hacks to the kernel. Searching netdevs by IP is a hack. I'm mostly fine with it as an optional capability, similar to macvlan, I just don't see how to cleanly integrate it with RDMA CM and namespaces. And I don't see what RDMA CM is supposed to do when it hits this case. So, any ideas that don't involve the searching for IP hack?? [And yes, as discussed with Haggai, it is not the worst hack in the world, and maybe we can live with it, but let's understand the trade-offs carefully] Also, now that this has been brought up, I think you need to make a patch to fix the IPv6 SLAAC breakage this caused. It looks trivial to modify addrconf_ifid_infiniband to return error if the IPoIB child is sharing a guid. It was not good at all to push the child patches forward to 3.6/3.7 if you knew that IPv6 SLAAC was broken by them. Jason
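The suggested change to addrconf_ifid_infiniband can be sketched in user space as follows. To my understanding the existing kernel function copies the last 8 bytes of the 20-byte IPoIB hardware address (the port GUID) into the interface identifier and sets the universal/local bit; the sketch mirrors that and adds the early error return. The `shares_parent_guid` flag and the function name `ifid_infiniband` are assumptions: the thread does not say how the kernel would learn that a child shares its parent's GUID.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INFINIBAND_ALEN 20 /* IPoIB hardware address length in bytes */

/*
 * User-space sketch of the interface-identifier derivation.
 * shares_parent_guid is a hypothetical input; how the kernel would
 * obtain it is left open in the discussion.
 * Returns 0 on success, -1 if no identifier can be produced.
 */
static int ifid_infiniband(uint8_t eui[8], const uint8_t *hwaddr,
			   size_t addr_len, bool shares_parent_guid)
{
	if (addr_len != INFINIBAND_ALEN)
		return -1;
	if (shares_parent_guid)
		return -1; /* no unique identifier, so SLAAC is skipped */
	memcpy(eui, hwaddr + 12, 8); /* port GUID is the last 8 bytes */
	eui[0] |= 2;                 /* set the universal/local bit */
	return 0;
}
```

Returning an error here means a same-GUID child simply gets no autoconfigured address instead of colliding with its parent.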
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Wed, Jun 3, 2015 at 10:53 PM, Jason Gunthorpe wrote: > On Wed, Jun 03, 2015 at 10:05:34PM +0300, Or Gerlitz wrote: > >> Indeed the DHCP story isn't working there and to get DHCP work >> something has to be done. But this issue can't serve for blocking the >> existing UAPI and introduce regression to working systems. > > It is not DHCP that concerns me, it is the fact we can't combine net > namespaces, RDMA-CM and duplicate GUID IPoIB children together without > adding hacks to the kernel. Searching netdevs by IP is a hack. > > I'm mostly fine with it as an optional capability, similar to macvlan, > I just don't see how to cleanly integrate it with RDMA CM and > namespaces. And I don't see what RDMA CM is supposed to do when > it hits this case. > > So, any ideas that don't involve the searching for IP hack?? > > [And yes, as discussed with Haggai, it is not the worst hack in the > world, and maybe we can live with it, but lets understand the trade > offs carefully] As Haggai wrote, if we let the IP address approach go ahead, we get support for RDMA in containers using the RDMA-CM in IPoIB environments. This will let people test, use, experiment, fix, interact (and even put it into production when a static IP address assignment scheme is used). Later, use of alias GUIDs for IPoIB RTNL children would allow us to remove the IP lookup. Later still, the next stages in Matan's work on the RoCE GID table would allow support for MACVLAN and hence RoCE too. This is how the Linux kernel has evolved since the 2.5 failure to come up with giant releases: doing things in relatively small steps. > Also, now that this has been brought up, I think you need to make a > patch to fix the IPv6 SLAAC breakage this caused. It looks trivial to > modify addrconf_ifid_infiniband to return error if the IPoIB child is > sharing a guid. It was not good at all to push the child patches > forward to 3.6/3.7 if you knew that IPv6 SLAAC was broken by them. 
Until the alias GUID support is introduced, maybe we can patch addrconf_ifid_infiniband to use the QPN value from the device HW address to come up with a unique IPv6 link-local address, agreed? Where do you think we can place the 24-bit QPN? Or.
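For reference, the QPN that Or mentions sits in the IPoIB hardware address itself. A small sketch, assuming the RFC 4391 layout (one flags byte, a 3-byte QPN, then the 16-byte GID), shows that two same-GUID children differ only in that field. The helper names are hypothetical, and the sketch deliberately does not choose where the 24 bits would go in the interface identifier, since that question is left open here.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * IPoIB link-layer address, assuming the RFC 4391 layout:
 * byte 0 flags, bytes 1-3 QPN, bytes 4-19 GID (prefix + port GUID).
 */
static uint32_t ipoib_hwaddr_qpn(const uint8_t hwaddr[20])
{
	return ((uint32_t)hwaddr[1] << 16) |
	       ((uint32_t)hwaddr[2] << 8) |
	       (uint32_t)hwaddr[3];
}

/* True when both addresses carry the same port GUID (last 8 bytes). */
static bool ipoib_same_guid(const uint8_t a[20], const uint8_t b[20])
{
	return memcmp(a + 12, b + 12, 8) == 0;
}
```

Because a QPN is allocated when the interface is brought up, an identifier derived from it may change across reloads.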
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Wed, Jun 03, 2015 at 11:07:37PM +0300, Or Gerlitz wrote: > As Haggai wrote, if we let the using IP address thing to fly up, we have > support for RDMA in containers using the RDMA-CM at IPoIB environments. > This will let people test, use, experiment, fix, interact (and even > production-it when static IP address assignment scheme is used). Sure, I think we all understand the goal, and you've explained some reasonable use cases for the child support. > Later, usage of alias GUIDs for IPoIB RTNL childs would allow to > remove the IP thing. How do we remove it? Along with same-guid child support? What is your idea here? > > Also, now that this has been brought up, I think you need to make a > > patch to fix the IPv6 SLAAC breakage this caused. It looks trivial to > > modify addrconf_ifid_infiniband to return error if the IPoIB child is > > sharing a guid. It was not good at all to push the child patches > > forward to 3.6/3.7 if you knew that IPv6 SLAAC was broken by them. > > Till the alias GUID thing is introduced, maybe we can patch > addrconf_ifid_infiniband to use the QPN value from the device HW > address to come up with unique IPv6 link local address, agree? where > you think we can place the 24 bits QPN? I don't know if that is a good idea, an unstable SLAAC is not in the spirit of the RFCs. The safest bet is to return error and disable SLAAC completely. But I'm just guessing here - I only feel strongly that something should be done to address this issue in the existing kernel. Jason
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Wed, Jun 03, 2015 at 11:07:37PM +0300, Or Gerlitz wrote: > > I'm mostly fine with it as an optional capability, similar to macvlan, > > I just don't see how to cleanly integrate it with RDMA CM and > > namespaces. And I don't see what RDMA CM is supposed to do when > > it hits this case. > > > > So, any ideas that don't involve the searching for IP hack?? > > > > [And yes, as discussed with Haggai, it is not the worst hack in the > > world, and maybe we can live with it, but lets understand the trade > > offs carefully] > > As Haggai wrote, if we let the using IP address thing to fly up, we have > support for RDMA in containers using the RDMA-CM at IPoIB environments. > This will let people test, use, experiment, fix, interact (and even > production-it when static IP address assignment scheme is used). I just noticed ipvlan got merged a few months ago. That certainly changed my view on this topic. It is basically a software version of the same-guid ipoib children scheme. Similar issues: same MAC address as the parent, IPv6 SLAAC is disabled (?), DHCP has a similar issue (solved with RFC4361 and a broadcasting fallback, it seems). The l2/l3 distinction in ipvlan is also very interesting. The L3 mode solves some of the security type issues. What do you think, Haggai? Is there any chance standard things like ipvlan and macvlan could be used with rdma-cm if their master devices are IPoIB? Are we even on the right path to do that someday? Is that the plan for roce? Any thoughts on the idea we still need ipoib same-guid children if ipvlan is available? Jason
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On 04/06/2015 02:48, Jason Gunthorpe wrote: > On Wed, Jun 03, 2015 at 11:07:37PM +0300, Or Gerlitz wrote: > >>> I'm mostly fine with it as an optional capability, similar to macvlan, >>> I just don't see how to cleanly integrate it with RDMA CM and >>> namespaces. And I don't see what RDMA CM is supposed to do when >>> it hits this case. >>> >>> So, any ideas that don't involve the searching for IP hack?? >>> >>> [And yes, as discussed with Haggai, it is not the worst hack in the >>> world, and maybe we can live with it, but lets understand the trade >>> offs carefully] >> >> As Haggai wrote, if we let the using IP address thing to fly up, we have >> support for RDMA in containers using the RDMA-CM at IPoIB environments. >> This will let people test, use, experiment, fix, interact (and even >> production-it when static IP address assignment scheme is used). > > I just noticed ipvlan got merged a few months ago.. That certainly > changed my view on this topic. It is basically a software > version of the same-guid ipoib children scheme. Similar issues: Same MAC > address as the parent, IPv6 SLAAC is disabled (?), DHCP has similar > issue (solved with RFC4361, and broadcasting fallback, it seems).. > > The l2/l3 distinction in ipvlan is also very interesting. The L3 mode > solves some of the security type issues. What do you think Haggai? I think some issues ipvlan is trying to solve would also affect us using the alias GUIDs solution. ipvlan tries to solve, among other things, the problem of a limited MAC filter table in NICs, and to avoid using promiscuous mode. But the GID table is also limited, and we don't have something like promiscuous mode for GIDs in InfiniBand. For large scale use of containers we would need to also allow the current model. As for L3 mode, it does seem more restrictive, as all routing decisions are done in the controlling namespace. Our current ipoib child interface implementation is more like the L2 version of ipvlan. 
> > Is there any chance standard things like ipvlan and macvlan could be > used with rdma-cm if their master devices are IPoIB? These standard interfaces seem very much connected with Ethernet (both have an ARPHRD_ETHER-only check for their upper devices). I think macvlan's functionality would be covered by adding alias GUIDs to ipoib, and ipvlan L2 is covered by the current behavior. Perhaps it would be beneficial to try and make ipvlan more generic so that it would work over ipoib, giving us support for L3 mode. As for rdma-cm support, the patch I had for ipoib attempts to scan each child's upper devices in order to support such topologies. We only tested it with bonding, but I think it would also work with such devices. > Are we even on > the right path to do that someday? Is that the plan for roce? Yes, for RoCE our goal from the start was to support namespaces in RDMA CM through macvlan devices. As long as we can update the RoCE gid table correctly for macvlan and ipvlan devices, the RDMA CM implementation shouldn't care where the details come from. > Any thoughts on the idea we still need ipoib same-guid children if > ipvlan is available? If we port ipvlan to work over IPoIB interfaces and not just Ethernet, then ipvlan L2 would provide exactly the same functionality. The only difference I can think of is that ipvlan would use a single UD QP for all devices (and in connected-mode, a single RC QP between a pair of hosts), while ipoib would use a QP per child device, and multiple RC QPs for such pairs. Regards, Haggai
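The Ethernet-only restriction mentioned above comes down to a check on the device's ARP hardware type. The following is a minimal sketch of such a check and of what "making ipvlan more generic" might mean at that one spot. The constant values match the Linux UAPI header `<linux/if_arp.h>` as far as I know; `lower_dev_supported` and its `allow_ipoib` switch are hypothetical and do not describe the actual macvlan or ipvlan code.

```c
#include <stdbool.h>

/* ARP hardware types, with the values used by <linux/if_arp.h>. */
#define ARPHRD_ETHER       1
#define ARPHRD_INFINIBAND 32

/*
 * Hypothetical gate for a stacked virtual device: today only Ethernet
 * devices pass; allow_ipoib models a generalized driver that would also
 * accept an IPoIB device underneath.
 */
static bool lower_dev_supported(unsigned short type, bool allow_ipoib)
{
	if (type == ARPHRD_ETHER)
		return true;
	if (allow_ipoib && type == ARPHRD_INFINIBAND)
		return true;
	return false;
}
```

The type check is only the first obstacle; a real port would also have to deal with the 20-byte IPoIB link-layer address wherever the driver assumes a 6-byte MAC.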
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On 04/06/2015 00:45, Jason Gunthorpe wrote: > On Wed, Jun 03, 2015 at 11:07:37PM +0300, Or Gerlitz wrote: >> As Haggai wrote, if we let the using IP address thing to fly up, we have >> support for RDMA in containers using the RDMA-CM at IPoIB environments. >> This will let people test, use, experiment, fix, interact (and even >> production-it when static IP address assignment scheme is used). > > Sure, I think we all understand the goal, and you've explained some > reasonable use cases for the child support. > >> Later, usage of alias GUIDs for IPoIB RTNL childs would allow to >> remove the IP thing. > > How do we remove it? Along with same-guid child support? What is your > idea here? > >>> Also, now that this has been brought up, I think you need to make a >>> patch to fix the IPv6 SLAAC breakage this caused. It looks trivial to >>> modify addrconf_ifid_infiniband to return error if the IPoIB child is >>> sharing a guid. It was not good at all to push the child patches >>> forward to 3.6/3.7 if you knew that IPv6 SLAAC was broken by them. >> >> Till the alias GUID thing is introduced, maybe we can patch >> addrconf_ifid_infiniband to use the QPN value from the device HW >> address to come up with unique IPv6 link local address, agree? where >> you think we can place the 24 bits QPN? > > I don't know if that is a good idea, an unstable SLAAC is not in > spirit with the RFCs. The safest bet is to return error and disable > SLAAC completely. Maybe this is a silly question, but doesn't DAD already disable SLAAC addresses when there's a conflict? Haggai
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Thu, Jun 04, 2015 at 12:41:33PM +0300, Haggai Eran wrote: > On 04/06/2015 00:45, Jason Gunthorpe wrote: > > I don't know if that is a good idea, an unstable SLAAC is not in > > spirit with the RFCs. The safest bet is to return error and disable > > SLAAC completely. > Maybe this is a silly question, but doesn't DAD already disable SLAAC > addresses when there's a conflict? Yes, DAD should certainly trigger and disable the child, but the kernel should not rely on DAD for correctness. It is a safety net, and it isn't guaranteed to be 100% reliable. Jason
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Thu, Jun 04, 2015 at 09:24:37AM +0300, Haggai Eran wrote: > > The l2/l3 distinction in ipvlan is also very interesting. The L3 mode > > solves some of the security type issues. What do you think Haggai? > I think some issues ipvlan is trying to solve would also affect us using > the alias GUIDs solution. ipvlan tries to solve among other the problem > of a limited MAC filter table in NICs, and avoid using promiscuous mode. > But the GID table is also limited, and we don't have something like > promiscuous mode for GIDs in InfiniBand. For large scale use of > containers we would need to also allow the current model. Yes, that is certainly true. > As for L3 mode, it does seem more restrictive, as all routing decisions > are done in the controlling namespace. Our current ipoib child interface > implementation is more like the L2 version of ipvlan. The ipoib children are exactly like macvlan, because they all have unique LLADDRs. It doesn't start acting like ipvlan until we reach the rdma-cm patches, where we see the IP stack side act like macvlan and the rdma-cm side try to act like ipvlan - that is why it is so ugly/hacky. > > Is there any chance standard things like ipvlan and macvlan could be > > used with rdma-cm if their master devices are IPoIB? > These standard interfaces seem very much connected with Ethernet (both > have an ARPHRD_ETHER-only check for their upper devices). I think > macvlan's functionality would be covered by adding alias GUIDs to ipoib, > and ipvlan L2 is covered by the current behavior. Perhaps it would be > beneficial to try and make ipvlan more generic so that it would work > over ipoib, giving us support for L3 mode. Yes, macvlan seems very well covered already by IPoIB child interfaces, and I don't see too many reasons to worry about changing that. ipvlan on the other hand, as you observe, is valuable for many reasons. 
> As for rdma-cm support, the patch I had for ipoib attempts to scan each > child's upper devices in order to support such topologies. We only > tested it with bonding, but I think it would also work with such devices. .. it is so sketchy :| Firstly: I still think the prior discussion is right, and proceeding along the reworking of the ingress side of rdma-cm and focusing on the device,guid,pkey makes 100% sense and will progress things right away. Every other variation seems to build on that. But when we get into bonding and the various vlan things, we lose encapsulation - snooping the children list to guess what the bonding driver is doing seems very hacky. Discussion idea: Can we actually use the netstack to process the RDMA-CM packets? It looks like the netstack wants a skb to do this mid-layer work, so rdma-cm would have to synthesize a skb for the CM packets and pass it through netdev to apply all the transformations and access the various internal states (eg from ipvlan, bonding, etc). rdma-cm would have to 'catch' the skb once it is done traveling and resume its normal processing. Very similar to your notion of using UDP, but without any on-the-wire change. This would fit in that same ingress spot I suggested adding the routing lookup, instead of routing we want the full stack to have a go at figuring out the final netdev. This seems the most general because it will work for all the *vlan type drivers, bonding, and all of the RDMA technologies. (each would have a slightly different way to make the skb, but same basic idea) Lots and lots of details to do that, but conceptually it seems pretty solid? > Yes, for RoCE our goal for the start was to support namespaces in RDMA > CM through macvlan devices. As long as we can update the RoCE gid table > correctly for macvlan and ipvlan devices, the RDMA CM implementation > shouldn't care where the details come from. Hurm, the gid index tagged on the QP1 packet should not be directly used for much on ingress. 
rdma-cm will have to recover the mac address and vlan to use that as a guide. Synchronizing the gid table and all the internal state in macvlan, ipvlan, bonding seems very hard, I do not envy your task :( > > Any thoughts on the idea we still need ipoib same-guid children if > > ipvlan is available? > If we port ipvlan to work over IPoIB interfaces and not just Ethernet, > then ipvlan L2 would provide exactly the same functionality. The only > difference I can think of is that ipvlan would use a single UD QP for > all devices (and in connected-mode, a single RC QP between a pair of > hosts), while ipoib would use a QP per child device, and multiple RC QPs > for such pairs. Agree with this. Jason
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On 04/06/2015 19:40, Jason Gunthorpe wrote: > Discussion idea: Can we actually use the netstack to process the > RDMA-CM packets? It looks like the netstack wants a skb to do this > mid-layer work, so rdma-cm would have to synthesize a skb for the CM > packets and pass it through netdev to apply all the transformations > and access the various internal states (eg from ipvlan, bonding, > etc). rdma-cm would have to 'catch' the skb once it is done traveling > and resume its normal processing. Very similar to your notion of using > UDP, but without any on-the-wire change. > > This would fit in that same ingress spot I suggested adding the > routing lookup, instead of routing we want the full stack to have a go > at figuring out the final netdev. > > This seems the most general because it will work for all the *vlan > type drivers, bonding, and all of the RDMA technologies. (each would > have a slightly different way to make the skb, but same basic idea) > > Lots and lots of details to do that, but conceptually it seems pretty > solid? The problem is that the network stack can make all sorts of changes to the packets (like NAT), and it may be the case that the hardware can't reflect these changes later on when creating a QP. I think it would be best to stick with resolving the net_dev using the request parameters, and the simpler routing lookup. This way RDMA CM remains in control, and if the user configures routing in an unexpected way, it can just block the request. Haggai
Re: [PATCH v4 for-next 00/12] Add network namespace support in the RDMA-CM
On Mon, Jun 08, 2015 at 10:52:34AM +0300, Haggai Eran wrote: > On 04/06/2015 19:40, Jason Gunthorpe wrote: > > Discussion idea: Can we actually use the netstack to process the > > RDMA-CM packets? It looks like the netstack wants a skb to do this > > mid-layer work, so rdma-cm would have to synthesize a skb for the CM > > packets and pass it through netdev to apply all the transformations > > and access the various internal states (eg from ipvlan, bonding, > > etc). rdma-cm would have to 'catch' the skb once it is done traveling > > and resume its normal processing. Very similar to your notion of using > > UDP, but without any on-the-wire change. > > > > This would fit in that same ingress spot I suggested adding the > > routing lookup, instead of routing we want the full stack to have a go > > at figuring out the final netdev. > > > > This seems the most general because it will work for all the *vlan > > type drivers, bonding, and all of the RDMA technologies. (each would > > have a slightly different way to make the skb, but same basic idea) > > > > Lots and lots of details to do that, but conceptually it seems pretty > > solid? > > The problem is that the network stack can do all sort of changes to the > packets (like NAT), and it may be the case that the hardware can't > reflect these changes later on when creating a QP. Yes, I am aware of that, but there are also a lot of things netdev can do that we can realize, like netfilter rules to block packets, for instance. Ignoring NAT is a bad choice as well, the best would be to drop on NAT. It would be easy to detect if the netstack mangled the REQ skb packet, for instance. We can't track netdev after the QP is created, but totally ignoring one thing while re-implementing others seems like a bad idea, long term... > I think it would be best to stick with resolving the net_dev using > the request parameters, and the simpler routing lookup. 
This way > RDMA CM remains in control, and if the user configures routing in an > unexpected way, it can just block the request. As I said, I think that is fine for the immediate IB support, but when you start talking about roce and emulating macvlan and ipvlan.. Then it starts to look really bad. At least think it through carefully before posting those series. Jason
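The "drop on NAT" idea above can be sketched as a before/after comparison of the addresses carried by the REQ. This is only an illustration of the decision logic under stated assumptions: `req_addrs`, `req_verdict` and `req_check` are hypothetical names, not kernel types, and the sketch says nothing about how a synthesized skb would actually be injected into or caught from the network stack.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical snapshot of the addressing a CM REQ carries. */
struct req_addrs {
	uint8_t  src_ip[16];
	uint8_t  dst_ip[16];
	uint16_t dst_port;
};

enum req_verdict { REQ_ACCEPT, REQ_DROP };

static bool req_addrs_equal(const struct req_addrs *a,
			    const struct req_addrs *b)
{
	return memcmp(a->src_ip, b->src_ip, sizeof(a->src_ip)) == 0 &&
	       memcmp(a->dst_ip, b->dst_ip, sizeof(a->dst_ip)) == 0 &&
	       a->dst_port == b->dst_port;
}

/*
 * before: addresses taken from the REQ as it arrived.
 * after:  addresses read back once the synthesized packet has gone
 *         through the stack, or NULL if the stack discarded it
 *         (for example because a netfilter rule blocked it).
 */
static enum req_verdict req_check(const struct req_addrs *before,
				  const struct req_addrs *after)
{
	if (after == NULL)
		return REQ_DROP; /* filtered by the stack */
	if (!req_addrs_equal(before, after))
		return REQ_DROP; /* rewritten, e.g. by NAT */
	return REQ_ACCEPT;
}
```

The point of the comparison is that the hardware cannot reproduce a rewritten address when the QP is created, so a mangled REQ is rejected instead of silently accepted.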