Re: [patch net-next 00/14] rocker: add support for multiple worlds
[...]

>> It's related in that if you expose your device model you do not need
>> opaque strings to do wholesale reconfiguration of the device. Instead
>> if the parts of the device that are configurable are exposed to the
>> user they can build the "world" they want.
>
> The disconnect here, I believe, is offloading to hw the Linux
> forwarding plane vs. offloading an arbitrary application's forwarding
> plane. Switchdev (and rocker) are about offloading the Linux
> dataplane. That means Linux _is_ the application (the NOS); hw
> offloads what it can from the kernel to accelerate pkt forwarding.
> But the user's experience is that standard Linux tools (iproute2,
> netlink) and building blocks (bridge, bond, etc) are used to construct
> a switch (or router), and the fact that the data path is offloaded to
> hw is transparent to the user. We could define APIs for arbitrary
> applications to program hardware, like John is suggesting, by giving
> up raw access to hw resources, like tables, etc. This approach moves
> the "driver" to the application, and by-passes the Linux tools and
> building blocks. We're still TBD on these APIs, probably because of
> the "by-pass" part.

Thanks Scott, I think this helps some. I don't view my approach as a
by-pass though, or even as letting arbitrary applications have access
to the hardware. Today I load arbitrary filters and bpf programs into
the kernel to create a pipeline. Now I want to string a couple of other
tables in front of my pipeline to do some of the heavy lifting.

Maybe the real difference is that my _datapath_ is not offloaded
(by-passing?) the kernel. Most (all?) of my packets are meant for the
host, and I want to do partial offloading where some of the initial
processing is done in the hardware and the rest is handled by software.
The "driver" is not in the application; it is still in the kernel.

I almost have something ready to kick out. I meant to do this today;
it might be another day or two though.

> Jiri's patchset here is about moving things around so he can define
> another hw mode in rocker. The upper edge for rocker driver is still
> switchdev, but with the new eBPF hw mode he's working on, he'll be
> able to push down a dynamic pipeline rather than being stuck with the
> OF-DPA pipeline we have today (in rocker). I presume once he has this
> new eBPF support, he'll program in a "Linux kernel" pipeline, and fill
> out the corresponding switchdev ops. I imagine a P4 -> eBPF compiler,
> and we take a linux.p4 and program hw. Linux.p4 should be
> generic...consumable by any hardware...it is a representation of the
> Linux pipeline. (Similar to P4's switch.p4).
>
> But now, with eBPF mode in hw, an arbitrary.p4 could be written for
> that arbitrary application and pushed down. We still need APIs for
> that application.
>

My gripe here was flipping the hardware between modes with a string
value. It seems it has been dropped from the latest version though, so
I have no problem with the patches.

.John

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
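
The partial-offload arrangement described in this message, where a few hardware tables sit in front of the software pipeline and only the packets they do not resolve reach the host, could be sketched roughly as below. This is an illustration only: `run_pipeline`, the stage functions and the verdict strings are made-up names, not an existing kernel or driver API.

```python
from typing import Callable, List, Optional, Tuple

# A stage inspects a packet and returns a verdict, or None to pass it on.
Stage = Callable[[dict], Optional[str]]


def run_pipeline(pkt: dict, hw_stages: List[Stage],
                 sw_stages: List[Stage]) -> Tuple[str, str]:
    """Return (verdict, where_it_was_decided); hardware stages run first."""
    for where, stages in (("hw", hw_stages), ("sw", sw_stages)):
        for stage in stages:
            verdict = stage(pkt)
            if verdict is not None:
                return verdict, where
    # Nothing matched anywhere: the packet is simply handed to the host.
    return "pass", "sw"


def hw_drop_bad_vlan(pkt: dict) -> Optional[str]:
    """Hypothetical hardware table: drop one VLAN early, ignore the rest."""
    return "drop" if pkt.get("vlan") == 666 else None


def sw_filter(pkt: dict) -> Optional[str]:
    """Hypothetical software filter standing in for a tc/bpf program."""
    return "accept" if pkt.get("dport") == 22 else None
```

In this toy model the software stages never see a packet the hardware already resolved, which is the "heavy lifting" split; the software pipeline itself is unchanged.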
Re: [patch net-next 00/14] rocker: add support for multiple worlds
[...]

>> I think you're underestimating the flexibility of hardware. And
>> completely missing the hardware that is based on FPGAs and/or cell
>> architectures. This hardware is available today and could support
>> topology changes like this. But even less exotic hardware can/will
>> support parser updates which makes the device behave differently.
>
> Sure, I'm just trying to explain that worlds and your "profiles" are
> something completely different. I feel like we are running in circles.
>

Must be going in circles because I don't see the difference.

>> Other hardware can reconfigure the topology within some constraints,
>> the fm10k device supports this model. An extreme example would put
>> an ebpf interpreter in a fpga on the nic and expose it via a driver.
>>
>> If it's just for rocker purposes I'm not really excited about adding
>> it to the kernel to support a qemu device. If we allow it for one
>
> What exactly are you against? Multi-world support as it is, or the
> userspace iface to change worlds? If the second, I understand, kind of.
>

Just the userspace interface. I have hardware that can support multiple
worlds (although I'm fuzzy on what you mean by worlds) today and want to
expose that as well.

I guess my main objection is that I wanted to get away from out-of-band
firmware/microcode updates, and this doesn't really look much better to
me as I currently understand it. I'll have a bank of microcode images
and, depending on the string, load one of them.

I agree we need some way to support configurable hardware.

>> driver I don't see how/why we should block it for "real" devices.
>> From the kernel's point of view these are all real drivers. I could
>> build a qemu model that maps 1:1 with real hardware and do a drop
>> in replacement.
>>
>>> Rocker has a notion of "worlds". When a port is set to be in a certain
>>> world, it behaves in completely different way. Now we have just OF-DPA
>>> world. I will be adding BPF world shortly.
>>>
>>> This has nothing to do with profiles as you describe it, this is
>>> something completely different!
>>
>> I'm missing why it's different.
>>
>> Would you object to me adding multiple worlds to fm10k
>> using opaque strings? I'll create a world with a topology that maps
>> well to ipv4 networks, a world for ipv6 networks, a world for l2 flat
>> networks, etc. Each world in this example will have a specific table
>
> Not worlds in rocker terminology. This is what you call profiles.
>
>> topology and parser to support it. In this sense the ports will behave
>> in completely different ways i.e. packets will be processed by
>> different pipelines. Are you suggesting we do this?
>
> No, I definitely do not suggest this. Again, this is what you call "profile".
> I don't care about those, not in the scope of this patchset.
>

Hmm, maybe you can explain to me what makes a change large enough to be
called a "world" and where it is a "profile"?

[...] skipped a few comments because I think they were interesting but
not the point I was trying to ask.

>> It's related in that if you expose your device model you do not need
>> opaque strings to do wholesale reconfiguration of the device. Instead
>> if the parts of the device that are configurable are exposed to the
>> user they can build the "world" they want.
>
> No, this is not about building. This is about choosing from fixed-sized
> pre-defined list of choices. Again, no "profiles".
>

But I can just expose a list of pre-defined choices that map to what I
call "profiles". I must be missing the point; help me understand what a
world is vs. a profile?

Maybe our goals are not actually conflicting. Do you have any objection
to pushing configuration code to create tables, insert a new parser,
and change the table topology, and then bind the tables to software
subsystems like fdb, l3, tc, nft, bpf, etc.?

.John
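
The closing question in this message (create tables, change the table topology, then bind tables to software subsystems) can be pictured with a small model like the one below. It is a sketch under stated assumptions, not the interface John later posted: `DeviceModel`, `Table`, `create_table`, `link`, `bind` and `l2_l3_profile` are hypothetical names invented for illustration, and the parser step is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Table:
    name: str
    match_fields: Tuple[str, ...]
    next_table: Optional[str] = None  # topology: where a miss/continue goes


@dataclass
class DeviceModel:
    """Hypothetical model of the configurable parts a driver could expose."""
    tables: Dict[str, Table] = field(default_factory=dict)
    bindings: Dict[str, str] = field(default_factory=dict)  # subsystem -> table

    def create_table(self, name: str, match_fields: Tuple[str, ...]) -> None:
        self.tables[name] = Table(name, match_fields)

    def link(self, src: str, dst: str) -> None:
        if dst not in self.tables:
            raise KeyError(dst)
        self.tables[src].next_table = dst

    def bind(self, subsystem: str, table: str) -> None:
        if table not in self.tables:
            raise KeyError(table)
        self.bindings[subsystem] = table

    def pipeline(self, start: str) -> List[str]:
        """Walk the topology from `start`, stopping at the end or on a loop."""
        order: List[str] = []
        seen = set()
        cur: Optional[str] = start
        while cur is not None and cur not in seen:
            seen.add(cur)
            order.append(cur)
            cur = self.tables[cur].next_table
        return order


def l2_l3_profile() -> DeviceModel:
    """A pre-defined "profile" expressed as a recipe over the same primitives."""
    dev = DeviceModel()
    dev.create_table("l2", ("dst_mac", "vlan"))
    dev.create_table("l3", ("dst_ip",))
    dev.link("l2", "l3")
    dev.bind("fdb", "l2")
    dev.bind("l3", "l3")
    return dev
```

Seen this way, a named profile is readable: a user can inspect `tables` and `bindings` to learn what the name actually configures, which is the visibility an opaque string alone does not give.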
Re: [patch net-next 00/14] rocker: add support for multiple worlds
Mon, Oct 05, 2015 at 06:58:06PM CEST, john.fastab...@gmail.com wrote:
>On 15-10-05 09:30 AM, Jiri Pirko wrote:
>> Mon, Oct 05, 2015 at 05:41:38PM CEST, john.fastab...@gmail.com wrote:
>>> On 15-10-04 02:25 PM, Jiri Pirko wrote:
>>>> From: Jiri Pirko
>>>>
>>>> This patchset allows new rocker worlds to be easily added in future
>>>> (like eBPF based one I have been working on). The main part of the
>>>> patchset is the OF-DPA carve-out. It results in OF-DPA specific file.
>>>> Clean cut. The user is able to change rocker port world/mode using rtnl.
>>>
>>> Hi Jiri,
>>>
>>> I'm not sure I understand the motivation here. Are you thinking the
>>> "real" drivers will start to load worlds or what I've been calling
>>> profiles on the devices I have here? If this is the case using
>>> opaque strings without any other infrastructure around it to expose
>>> what the profile is doing is not sufficient in my opinion. What I
>>> would rather have is for drivers to expose the actual configuration
>>> parameters they are using, preferably these would be both readable
>>> and writable so we don't end up with what the firmware/device driver
>>> writers think is best. I think we can get there by exposing a model
>>> of the device and configuring "tables". I'll post my latest patch
>>> set today to give you a better idea what I'm thinking here. Without
>>> this I guess you will end up with drivers creating many profiles and
>>> in no consistent way so you end up with here is my "vxlan" profile,
>>> here is my "geneve" profile, here is my "magic-foo" profile, etc. I
>>> wanted to avoid this.
>>
>> This is just for rocker purposes. I do not want to do something similar
>> for real devices. It does not make sense as real hw always have some
>> hard-wired topology. Rocker HW does not. I think that this is the main
>> part that may cause some misunderstandings.
>
>I think you're underestimating the flexibility of hardware. And
>completely missing the hardware that is based on FPGAs and/or cell
>architectures. This hardware is available today and could support
>topology changes like this. But even less exotic hardware can/will
>support parser updates which makes the device behave differently.

Sure, I'm just trying to explain that worlds and your "profiles" are
something completely different. I feel like we are running in circles.

>
>Other hardware can reconfigure the topology within some constraints,
>the fm10k device supports this model. An extreme example would put
>an ebpf interpreter in a fpga on the nic and expose it via a driver.
>
>If it's just for rocker purposes I'm not really excited about adding
>it to the kernel to support a qemu device. If we allow it for one

What exactly are you against? Multi-world support as it is, or the
userspace iface to change worlds? If the second, I understand, kind of.

>driver I don't see how/why we should block it for "real" devices.
>From the kernel's point of view these are all real drivers. I could
>build a qemu model that maps 1:1 with real hardware and do a drop
>in replacement.
>
>>
>> Rocker has a notion of "worlds". When a port is set to be in a certain
>> world, it behaves in completely different way. Now we have just OF-DPA
>> world. I will be adding BPF world shortly.
>>
>> This has nothing to do with profiles as you describe it, this is
>> something completely different!
>
>I'm missing why it's different.
>
>Would you object to me adding multiple worlds to fm10k
>using opaque strings? I'll create a world with a topology that maps
>well to ipv4 networks, a world for ipv6 networks, a world for l2 flat
>networks, etc. Each world in this example will have a specific table

Not worlds in rocker terminology. This is what you call profiles.

>topology and parser to support it. In this sense the ports will behave
>in completely different ways i.e. packets will be processed by
>different pipelines. Are you suggesting we do this?

No, I definitely do not suggest this. Again, this is what you call "profile".
I don't care about those, not in the scope of this patchset.

>
>I'm not sure what you mean by completely different? Is it just a
>different parser and table topology? Real hardware can support changing
>or at least modifying these today.
>
>>>
>>> But if this is only meant to be a rocker thing then why expose it on
>>> the driver side vs just compiling it on the qemu side? If it's just
>>
>> I want user to be able to set the world/mode of the port on fly. No need to
>> re-set the hardware if possible to do it from driver.
>
>But the user has no way to know what these strings are doing?
>
>>> for convenience and only meant for the emulated device we should be
>>> clear in the documentation and patch set.
>>
>> This is rocker-only patchset, where do you want to clear it?
>
>I don't think this is reasonable from the kernel side to "know" or
>expose a driver is running on qemu like this. The kernel shouldn't
>know or care if
Re: [patch net-next 00/14] rocker: add support for multiple worlds
On 15-10-05 09:30 AM, Jiri Pirko wrote:
> Mon, Oct 05, 2015 at 05:41:38PM CEST, john.fastab...@gmail.com wrote:
>> On 15-10-04 02:25 PM, Jiri Pirko wrote:
>>> From: Jiri Pirko
>>>
>>> This patchset allows new rocker worlds to be easily added in future
>>> (like eBPF based one I have been working on). The main part of the
>>> patchset is the OF-DPA carve-out. It results in OF-DPA specific file.
>>> Clean cut. The user is able to change rocker port world/mode using rtnl.
>>>
>>
>> Hi Jiri,
>>
>> I'm not sure I understand the motivation here. Are you thinking the
>> "real" drivers will start to load worlds or what I've been calling
>> profiles on the devices I have here? If this is the case using
>> opaque strings without any other infrastructure around it to expose
>> what the profile is doing is not sufficient in my opinion. What I
>> would rather have is for drivers to expose the actual configuration
>> parameters they are using, preferably these would be both readable
>> and writable so we don't end up with what the firmware/device driver
>> writers think is best. I think we can get there by exposing a model
>> of the device and configuring "tables". I'll post my latest patch
>> set today to give you a better idea what I'm thinking here. Without
>> this I guess you will end up with drivers creating many profiles and
>> in no consistent way so you end up with here is my "vxlan" profile,
>> here is my "geneve" profile, here is my "magic-foo" profile, etc. I
>> wanted to avoid this.
>
> This is just for rocker purposes. I do not want to do something similar
> for real devices. It does not make sense as real hw always have some
> hard-wired topology. Rocker HW does not. I think that this is the main
> part that may cause some misunderstandings.

I think you're underestimating the flexibility of hardware. And
completely missing the hardware that is based on FPGAs and/or cell
architectures. This hardware is available today and could support
topology changes like this. But even less exotic hardware can/will
support parser updates which makes the device behave differently.

Other hardware can reconfigure the topology within some constraints;
the fm10k device supports this model. An extreme example would put
an ebpf interpreter in a fpga on the nic and expose it via a driver.

If it's just for rocker purposes I'm not really excited about adding
it to the kernel to support a qemu device. If we allow it for one
driver I don't see how/why we should block it for "real" devices.
From the kernel's point of view these are all real drivers. I could
build a qemu model that maps 1:1 with real hardware and do a drop
in replacement.

>
> Rocker has a notion of "worlds". When a port is set to be in a certain
> world, it behaves in completely different way. Now we have just OF-DPA
> world. I will be adding BPF world shortly.
>
> This has nothing to do with profiles as you describe it, this is
> something completely different!
>

I'm missing why it's different.

Would you object to me adding multiple worlds to fm10k
using opaque strings? I'll create a world with a topology that maps
well to ipv4 networks, a world for ipv6 networks, a world for l2 flat
networks, etc. Each world in this example will have a specific table
topology and parser to support it. In this sense the ports will behave
in completely different ways, i.e. packets will be processed by
different pipelines. Are you suggesting we do this?

I'm not sure what you mean by completely different? Is it just a
different parser and table topology? Real hardware can support changing
or at least modifying these today.

>>
>> But if this is only meant to be a rocker thing then why expose it on
>> the driver side vs just compiling it on the qemu side? If it's just
>
> I want user to be able to set the world/mode of the port on fly. No need to
> re-set the hardware if possible to do it from driver.
>

But the user has no way to know what these strings are doing?

>> for convenience and only meant for the emulated device we should be
>> clear in the documentation and patch set.
>
> This is rocker-only patchset, where do you want to clear it?
>

I don't think this is reasonable from the kernel side to "know" or
expose a driver is running on qemu like this. The kernel shouldn't
know or care if a device is emulated or not.

>
>>
>> Final comment, can we abstract the interfaces better? An L2 and L3
>> table could be mapped generically onto a table pipeline model if the
>> driver gave some small hints like this is my l2 table and this is my l3
>> table. Then you don't need all the world specific callbacks and the
>> OF-DPA model just looks like an instance of a pipeline with some
>> specific hints where to put l2/l3 rules.
>
> I think you are missing something, or I am. How do you map BPF world
> pipeline into tables? The idea of the worlds is to do *completely*
> different HW implementation, not just rewire some pre-defined tables.
> For BPF
Re: [patch net-next 00/14] rocker: add support for multiple worlds
Mon, Oct 05, 2015 at 05:41:38PM CEST, john.fastab...@gmail.com wrote:
>On 15-10-04 02:25 PM, Jiri Pirko wrote:
>> From: Jiri Pirko
>>
>> This patchset allows new rocker worlds to be easily added in future
>> (like eBPF based one I have been working on). The main part of the
>> patchset is the OF-DPA carve-out. It results in OF-DPA specific file.
>> Clean cut. The user is able to change rocker port world/mode using rtnl.
>>
>
>Hi Jiri,
>
>I'm not sure I understand the motivation here. Are you thinking the
>"real" drivers will start to load worlds or what I've been calling
>profiles on the devices I have here? If this is the case using
>opaque strings without any other infrastructure around it to expose
>what the profile is doing is not sufficient in my opinion. What I
>would rather have is for drivers to expose the actual configuration
>parameters they are using, preferably these would be both readable
>and writable so we don't end up with what the firmware/device driver
>writers think is best. I think we can get there by exposing a model
>of the device and configuring "tables". I'll post my latest patch
>set today to give you a better idea what I'm thinking here. Without
>this I guess you will end up with drivers creating many profiles and
>in no consistent way so you end up with here is my "vxlan" profile,
>here is my "geneve" profile, here is my "magic-foo" profile, etc. I
>wanted to avoid this.

This is just for rocker purposes. I do not want to do something similar
for real devices. It does not make sense as real hw always have some
hard-wired topology. Rocker HW does not. I think that this is the main
part that may cause some misunderstandings.

Rocker has a notion of "worlds". When a port is set to be in a certain
world, it behaves in completely different way. Now we have just OF-DPA
world. I will be adding BPF world shortly.

This has nothing to do with profiles as you describe it, this is
something completely different!

>
>But if this is only meant to be a rocker thing then why expose it on
>the driver side vs just compiling it on the qemu side? If it's just

I want user to be able to set the world/mode of the port on fly. No need to
re-set the hardware if possible to do it from driver.

>for convenience and only meant for the emulated device we should be
>clear in the documentation and patch set.

This is rocker-only patchset, where do you want to clear it?

>
>Final comment, can we abstract the interfaces better? An L2 and L3
>table could be mapped generically onto a table pipeline model if the
>driver gave some small hints like this is my l2 table and this is my l3
>table. Then you don't need all the world specific callbacks and the
>OF-DPA model just looks like an instance of a pipeline with some
>specific hints where to put l2/l3 rules.

I think you are missing something, or I am. How do you map BPF world
pipeline into tables? The idea of the worlds is to do *completely*
different HW implementation, not just rewire some pre-defined tables.
For BPF world, there will be just BPF interpreter sitting inside HW and
running arbitrary code, no tables.

>
>Like I said I'll send some patches, they will be a bit rough and
>against fm10k driver. I'll just send out what I have end of day here.

Your patchset sounds totally unrelated to this one. Let's make that
clear.
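
The distinction Jiri draws here, a table-based world next to a world that is just an interpreter running arbitrary code, with the port switched between them at run time, could be pictured as below. This is a loose illustration and not how the rocker driver is structured: every class and method name is invented, the world names "ofdpa" and "bpf" are used only as labels, and a Python callable stands in for a BPF program.

```python
from typing import Callable, Dict, List


class World:
    """One self-contained forwarding implementation for a port."""
    name = "base"

    def forward(self, pkt: dict) -> str:
        raise NotImplementedError


class TableWorld(World):
    """Stand-in for a fixed table pipeline: a single L2 lookup table."""
    name = "ofdpa"

    def __init__(self) -> None:
        self.l2_table: Dict[str, str] = {}

    def forward(self, pkt: dict) -> str:
        return self.l2_table.get(pkt["dst_mac"], "flood")


class ProgramWorld(World):
    """Stand-in for an interpreter world: no tables, just a program."""
    name = "bpf"

    def __init__(self, program: Callable[[dict], str]) -> None:
        self.program = program

    def forward(self, pkt: dict) -> str:
        return self.program(pkt)


class Port:
    def __init__(self, worlds: List[World]) -> None:
        self.worlds = {w.name: w for w in worlds}
        self.world = worlds[0]  # default world, set when the port is created

    def set_world(self, name: str) -> None:
        """Switch behaviour on the fly; unknown names are rejected."""
        if name not in self.worlds:
            raise ValueError(f"unknown world: {name}")
        self.world = self.worlds[name]

    def forward(self, pkt: dict) -> str:
        return self.world.forward(pkt)
```

The sketch also shows the point John keeps returning to: from the caller's side the only handle is the string passed to `set_world`, and nothing in that string says what the selected world does with a packet.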
[patch net-next 00/14] rocker: add support for multiple worlds
From: Jiri Pirko

This patchset allows new rocker worlds to be easily added in future
(like eBPF based one I have been working on). The main part of the
patchset is the OF-DPA carve-out. It results in OF-DPA specific file.
Clean cut. The user is able to change rocker port world/mode using rtnl.

Jiri Pirko (14):
  rocker: remove unused rocker_port param from alloc funcs and shorten
    their names
  rocker: rename rocker.h to rocker_hw.h
  rocker: rename rocker.c to rocker_main.c
  rocker: push tlv processing into separate files
  rocker: implement set settings mode command
  rocker: introduce worlds infrastructure
  rocker: introduce OF-DPA world skeleton
  rocker: set default world on port probe and clean world on remove
  rocker: add rtnl ops for port mode [gs]etting
  rocker: pass "learning" value as a parameter to
    rocker_port_set_learning
  rocker: pre-allocate wait structures during cmd ring init
  rocker: remove trans parameter to rocker_cmd_exec function
  rocker: call rocker_cmd_exec function with "nowait" boolean instead of
    flags
  rocker: move OF-DPA stuff into separate file

 drivers/net/ethernet/rocker/Makefile       |    1 +
 drivers/net/ethernet/rocker/rocker.c       | 5478
 drivers/net/ethernet/rocker/rocker.h       |  543 +--
 drivers/net/ethernet/rocker/rocker_hw.h    |  467 +++
 drivers/net/ethernet/rocker/rocker_main.c  | 3093
 drivers/net/ethernet/rocker/rocker_ofdpa.c | 2927 +++
 drivers/net/ethernet/rocker/rocker_tlv.c   |   54 +
 drivers/net/ethernet/rocker/rocker_tlv.h   |  202 +
 include/uapi/linux/if_link.h               |   11 +
 9 files changed, 6847 insertions(+), 5929 deletions(-)
 delete mode 100644 drivers/net/ethernet/rocker/rocker.c
 create mode 100644 drivers/net/ethernet/rocker/rocker_hw.h
 create mode 100644 drivers/net/ethernet/rocker/rocker_main.c
 create mode 100644 drivers/net/ethernet/rocker/rocker_ofdpa.c
 create mode 100644 drivers/net/ethernet/rocker/rocker_tlv.c
 create mode 100644 drivers/net/ethernet/rocker/rocker_tlv.h

--
1.9.3