On Tue, Nov 14, 2017 at 8:44 AM, Or Gerlitz <gerlitz...@gmail.com> wrote:
> On Mon, Nov 13, 2017 at 7:10 PM, Alexander Duyck
> <alexander.du...@gmail.com> wrote:
>> On Sun, Nov 12, 2017 at 10:16 PM, Or Gerlitz <gerlitz...@gmail.com> wrote:
>>> On Sun, Nov 12, 2017 at 10:38 PM, Alexander Duyck
>
>>> The what we call slow path requirements are the following:
>>>
>>> 1. xmit on VF rep always turns to a receive on the VF, regardless of
>>> the offloaded SW steering rules ("send-to-vport")
>>>
>>> 2. xmit on VF which doesn't meet any offloaded SW steering rules must
>>> be received into the host OS from the VF rep
>
>>> 1,2 above must hold also for the uplink and the PF reps
>
>> I am well aware of the requirements. We discussed these with Jiri at
>> the previous netdev.
>
>>> When the i40e limitation was described to @ netdev, it seems you have a
>>> problem with VF xmit that should be turned to be a recv on the VF rep
>>> but also goes to the wire.
>
>>> It smells as if a FW patch can solve that, isn't that?
>
>> That is a huge maybe. We looked into it last time and while we can
>> meet requirements 1 and 2 we do so with a heavy performance penalty
>> due to the fact that we don't support anywhere near the same number of
>> flows as a true switch. Also while that might work for i40e
>
> to recap on i40e, you can support the slow path requirements, but you have an
> issue with the fast path (== offloaded flows)? what is the issue there?

We basically need to do some feasibility research to see if we can
actually meet all the requirements for switchdev on i40e. We have been
getting mixed messages where we are given a great many "yes, but" type
answers. For i40e we are looking into it but I don't have high
confidence in our ability to actually support it in hardware/firmware.
If it were as easy as you have been led to believe, we would have done
it months ago when we were researching the requirements to support
switchdev.

In addition i40e isn't really my concern. I am much more concerned
about ixgbe as it has a much larger install base and many more
customers that are still buying it today.

>> we still have a much larger install base of ixgbe ports that we have to
>> support.
>
> ok, but support is one thing and keep enhancing a ten years old wrong
> SW model is 2nd thing

The model might be 10 years old, but as I said we are still shipping
new silicon that was released just over a year ago that is supported
by the ixgbe driver.

Also I don't know if "enhancing" is the right word for what I am
thinking. I'm not talking about adding new drivers that only support
legacy mode. We are looking at probably having to refactor the whole
concept of "trusted" VF in order to break it out into smaller buckets.
In addition I plan to come up with a source mode macvlan based "port
representor" for legacy SR-IOV and hope to be able to use that to
start working on a better path for SR-IOV live migration.

Fundamentally the problem I have with us saying we cannot extend
legacy mode SR-IOV is that 82599 is a very large piece of the existing
install base for 10Gbit in general. We have it shipping on brand new
platforms as the silicon that is installed on the motherboard. With
that being the case people are going to want to get the most value
they can out of the silicon that they purchased since in many cases it
is just a standard part of the platform.

>>>> I would have to disagree with this.
>>>> For devices such as 82599 that
>>>> doesn't have a true switch this may limit future functionality since
>>>> we can't move it over to switchdev mode. For example one thing I may
>>>> need to add is the ability to disable multicast and broadcast receive
>>>> on a per-VF basis at some point in the future.
>
>>> We are on the same boat with ConnectX3/mlx4, so us lucky that misery loves
>>> company (my google search also yielded "many narrow-half consolation" is
>>> that completely unrelated?) - the legacy mode for ixgbe/mlx4 is there for
>>> ~8-10 years - and since then both companies had 2-3 newer HW generations.
>>> I don't see why you can't come to your customers and tell that newish
>>> functionality needs newer HW - it will also help sell more from the new
>>> stuff.. If you keep extending the legacy mode, more ppl/drivers will do
>>> that as well and it will not let us go in the right direction.
>
>> Well I don't know about you guys, but we still are selling parts
>> supported by ixgbe
>
> Same here, we are selling lots of CX3 and have to support that, but I didn't
> see why someone will want new features there.

I think the difference is that we get pressed on as part of the
platform instead of being a single component. If a customer wants some
specific feature enabled on 82599 as a part of the platform we tend to
need to go along with it in order to avoid being a roadblock in a sale
of other components.

>> still been adding new hardware as recently as just a couple years ago.
>
> wait, that's different story.
>
> You are saying that your older HW doesn't support e-switch
> and you want to keep doing new parts of that older HW and you want the
> kernel to keep enhance a wrong SW model b/c you are doing new parts
> from old HW, I don't see why we as a community need to go there.

I'm not saying we have new parts. I'm saying we have existing parts
that will likely need some work done. SwitchDev was only introduced
about 2 years ago.
We have parts that were released around or before then with
functionality that didn't anticipate this. We still haven't finished
fully implementing all the features that were available on the parts,
that is what I am arguing. New features usually keep going in for
several years after a part is released, typically somewhere in the 3
to 5 year range.

> Lets focus on this point for a moment before discussing the other points
> you raised.
>
> Or.

When SR-IOV was introduced there were two available modes, Virtual
Ethernet Port Aggregator, aka VEPA, and Virtual Ethernet Bridging, aka
VEB. The fact is SwitchDev is designed specifically for SR-IOV
networking with VEB. You argue that the legacy model is bad, but I
would argue that is because the legacy model was really designed to
work with both VEPA and VEB, whereas SwitchDev only focuses on VEB.

If you take a look in the ixgbe or i40e drivers you will see that we
support configuring both of those modes via ndo_bridge_setlink since
we have customer install bases that actually prefer VEPA over VEB as
they prefer to have their traffic centrally managed instead of having
the local host managing the traffic. We cannot just arbitrarily tell
our customers they are doing SR-IOV using the "wrong model".

I would rather not have SwitchDev become the next SystemD. The type of
argument you are making is basically dictating to us and our customers
how things are supposed to work based on your view of things. We have
different hardware, different customers, and all of our needs aren't
necessarily met by SwitchDev.

I would agree that SwitchDev is the go-to solution for VEB
configuration, and we do plan to have future hardware support it. In
addition I would argue that for the sake of consistency we should make
sure that any feature that gets added to the legacy model has to be
supported by the SwitchDev model as well before it can go in.
If anything my hope is to evolve the legacy model to have much of the
same look and feel as SwitchDev, but that will take time and require
changes to the legacy model.

I don't plan to have a ton of new features added to legacy SR-IOV. As
I stated earlier, my main concern is the "trusted" VF mode, which has
become a security issue because everything is getting dumped into it,
so we need to break it up to get finer granularity. For example I am
looking at adding a promisc/allmulti/multicast/broadcast control per
VF to set the upper limit of what a VF can request to receive instead
of just turning on "trusted" to allow a VF to turn on promiscuous.

My only other concern is live migration. I don't know if that will
require changes to the legacy SR-IOV mode or not, but it would be
better to not have that door closed as an option than to have to work
around it entirely.

So, to summarize:
1. VEPA is still a thing, and that implies no e-switch. SwitchDev does
   not address that model.
2. I agree that SwitchDev is the way forward for VEB.
3. I agree we should focus on interface consistency so any new feature
   added to legacy mode has to also be enabled in SwitchDev.

I hope this makes my point a bit clearer. I don't fundamentally
disagree with the need to focus on having a consistent UAPI going
forward. The only spot where we have issues is that I don't see
SwitchDev as the only solution, as we still have customers that aren't
necessarily making use of an eswitch and telling them they are "doing
it wrong" isn't really a viable solution.

If nothing else I think we can look at re-evaluating this at the next
netdev/netconf, and for now I would agree legacy SR-IOV changes should
be under greater scrutiny.

- Alex
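
To make the per-VF receive control mentioned above a bit more concrete,
here is a rough sketch of the idea in plain C. It is only an
illustration: the flag names, the struct, and both helpers are made up
for this example and do not come from ixgbe, i40e, or any existing
UAPI. The baseline used for an untrusted VF is likewise an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical receive-mode bits, for illustration only. They are not
 * taken from any driver or UAPI header. */
enum vf_rx_mode {
	VF_RX_BROADCAST = 1u << 0,
	VF_RX_MULTICAST = 1u << 1, /* explicitly subscribed groups */
	VF_RX_ALLMULTI  = 1u << 2,
	VF_RX_PROMISC   = 1u << 3,
};

#define VF_RX_ALL (VF_RX_BROADCAST | VF_RX_MULTICAST | \
		   VF_RX_ALLMULTI | VF_RX_PROMISC)

/* Hypothetical per-VF state kept by the PF. */
struct vf_rx_policy {
	uint32_t allowed;   /* upper limit set by the host admin */
	uint32_t requested; /* what the VF last asked for */
};

/* Modes that would actually be programmed: the VF's request clipped
 * to the admin-set upper limit. */
static inline uint32_t vf_rx_effective(const struct vf_rx_policy *p)
{
	assert(p);
	return p->requested & p->allowed;
}

/* The single "trusted" switch expressed as a cap: everything or the
 * baseline. Assumption: an untrusted VF keeps broadcast and subscribed
 * multicast; real drivers may differ. */
static inline uint32_t vf_rx_cap_from_trusted(int trusted)
{
	uint32_t base = VF_RX_BROADCAST | VF_RX_MULTICAST;

	return trusted ? VF_RX_ALL : base;
}
```

With something along these lines an admin could allow allmulti for a
VF without also granting promiscuous, which the all-or-nothing
"trusted" flag cannot express. A VF asking for more than its cap would
simply get the clipped result.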