On 8/25/14, 3:50 PM, Thomas Graf wrote:
On 08/25/14 at 12:15pm, Jamal Hadi Salim wrote:
On 08/25/14 10:17, Thomas Graf wrote:
On 08/25/14 at 09:53am, Jamal Hadi Salim wrote:
fdb_add() *is* flow based. At least in my understanding, the whole
point here is to extend the idea of fdb_add() and make it understand
L2-L4 in a more generic way for the most common protocols.

The reason fdb_add() is not reused is because it is Netlink specific
and only suitable for User -> HW offload. Kernel -> HW offload is
technically possible but not clean.

I don't think we have a problem handling any of this today.
Yes we do. It's restricted to L2 and we can't extend it easily
because it is based on NDA_*. The use of Netlink makes in-kernel
usage a pain. To me this is the sole reason for not using fdb_add()
in the first place. It seems absolutely clear though that fdb_add()
should be removed after the more generic ndo is in place providing
a superset of what fdb_add() can do today.

This is where our (shall I say strong) disagreement is.
I think you will find it non-trivial to show me how you can
actually take the simple L2 bridge and map it to a "flow".
Since your starting point is "everything can be represented via a flow
and some table" - we are at a crossroads.
OK, let me do the conversion for you:

NDA_DST         unused
NDA_LLADDR      sw_flow_key.eth.dst
NDA_CACHEINFO   unused
NDA_PROBES      unused
NDA_VLAN        sw_flow_key.eth.tci
NDA_PORT        unused
NDA_VNI         sw_flow_key.tun_key.tun_id
NDA_IFINDEX     sw_flow_key.phys.in_port
NDA_MASTER      unused
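
The mapping in the table above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `demo_fdb_entry`, `demo_flow_key` and `demo_fdb_to_flow_key` are hypothetical names, the key struct borrows just the field names from the table rather than the real sw_flow_key layout, and byte order and any flag bits the real TCI field may carry are ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ETH_ALEN 6

/* Hypothetical stand-in for the L2 attributes carried by NDA_*. */
struct demo_fdb_entry {
    uint8_t  lladdr[DEMO_ETH_ALEN]; /* NDA_LLADDR  */
    uint16_t vlan;                  /* NDA_VLAN    */
    uint32_t vni;                   /* NDA_VNI     */
    uint32_t ifindex;               /* NDA_IFINDEX */
};

/* Simplified stand-in that only borrows the field names from the table. */
struct demo_flow_key {
    struct { uint8_t dst[DEMO_ETH_ALEN]; uint16_t tci; } eth;
    struct { uint64_t tun_id; } tun_key;
    struct { uint32_t in_port; } phys;
};

/* Fill a flow key from an FDB entry, one assignment per table row. */
static void demo_fdb_to_flow_key(const struct demo_fdb_entry *fdb,
                                 struct demo_flow_key *key)
{
    assert(fdb != NULL && key != NULL);
    memset(key, 0, sizeof(*key));   /* rows marked "unused" stay zero */
    memcpy(key->eth.dst, fdb->lladdr, DEMO_ETH_ALEN);
    key->eth.tci = fdb->vlan;       /* simplification: raw VLAN id only */
    key->tun_key.tun_id = fdb->vni;
    key->phys.in_port = fdb->ifindex;
}
```

A caller would fill a `demo_fdb_entry` from the parsed attributes and hand the resulting key to whatever flow-based interface the driver exposes.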

The tc filter API seems to be doing just that.
You have different types of classifiers - the h/w may not be able
to support some classifier types - but that is a capability discovery
challenge.
Agreed, but tc is only one of the many existing interfaces
we have. macvtap (given we want to extend beyond L2), routing,
OVS, bridge and eventually even things like a team device can and
should make use of offloads.
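
The capability discovery idea raised above could look roughly like the following. It is a sketch under assumptions, not an existing kernel interface: the `demo_cls_type` flags, `demo_device` and `demo_can_offload` are all hypothetical names chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classifier types a device might advertise. */
enum demo_cls_type {
    DEMO_CLS_L2   = 1u << 0,
    DEMO_CLS_L3   = 1u << 1,
    DEMO_CLS_FLOW = 1u << 2,
};

struct demo_device {
    uint32_t hw_classifiers; /* bitmask of demo_cls_type the hardware supports */
};

/* True only when every classifier type in 'wanted' is supported in hardware. */
static bool demo_can_offload(const struct demo_device *dev, uint32_t wanted)
{
    return (dev->hw_classifiers & wanted) == wanted;
}
```

Any caller (tc, bridge, OVS, routing) would ask this question first and keep the rule in software when the answer is no.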

I am saying two things:
1) There are a few "fundamental" interfaces; L2 and L3 being some.
Add crypto offload and a few I mentioned in my presentation.
Can you share that preso? I was not present.

We know how to do those. Example: there is nothing I can't do with
the rtmsg that is L3, or the fdb/port/vlan filter for L2.
This flow thing should stay out of those.
Let me remind you about the name of the structure behind all L3
forwarding decisions:

        struct flowi4 {
                [...]
        };

Adding a route means adding a flow. Can we please stop the flow
bashing? The concept of a flow is very generic, well known and already
very present in the kernel.
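
To make the "adding a route means adding a flow" point concrete, here is a minimal sketch of a route expressed as a masked match on the destination address plus an output action. The struct and function names (`demo_l3_flow`, `demo_prefix_to_mask`, `demo_route_to_flow`, `demo_flow_matches`) are hypothetical, and addresses are kept in host byte order for simplicity; this is not the layout of struct flowi4.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flow-style view of an IPv4 route (host byte order). */
struct demo_l3_flow {
    uint32_t daddr;      /* network prefix, already masked */
    uint32_t daddr_mask; /* mask derived from the prefix length */
    uint32_t oif;        /* "action": output interface index */
};

/* Convert a prefix length to a netmask without shifting by 32 or more. */
static uint32_t demo_prefix_to_mask(unsigned int len)
{
    if (len == 0)
        return 0;
    if (len >= 32)
        return 0xFFFFFFFFu;
    return 0xFFFFFFFFu << (32 - len);
}

/* Express "prefix/len via oif" as a masked match plus an action. */
static struct demo_l3_flow demo_route_to_flow(uint32_t prefix,
                                              unsigned int len,
                                              uint32_t oif)
{
    struct demo_l3_flow f;

    f.daddr_mask = demo_prefix_to_mask(len);
    f.daddr = prefix & f.daddr_mask;
    f.oif = oif;
    return f;
}

/* A packet matches when its masked destination equals the prefix. */
static bool demo_flow_matches(const struct demo_l3_flow *f, uint32_t daddr)
{
    return (daddr & f->daddr_mask) == f->daddr;
}
```

Longest-prefix ordering between several such entries is left out; the sketch only shows that a single route is a match/action pair.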

The sw_flow_key proposed comes close to flowi4. Some fields are
different. They can eventually get merged. The strict IPv4/IPv6
separation is what makes it non-obvious and probably why Jiri chose
the OVS representation. If you say rtmsg is complete then that clearly
is not the case. In particular VTEP fields, ARP, and TCP flags are
clearly missing for many uses.

Again, I'm not saying flow is the ultimate answer to everything. It
is not. But a lot of hardware out there is aware of flows in combination
with some form of action execution. Non-flow-based hardware can have
its own classifier.

2) The flow thing should allow a variety of classifiers to be
handled. Again capability discovery would take care of differences.
So you want the flow to represent something that is not a flow. Again,
this comes back to the conversation in the other email. If this is
all about having a single ndo I'm sure we can find common ground on
that.

From what I understood (trying to summarize here for my own benefit):
the switchdev API currently under review proposes to represent every switch ASIC offload abstraction as a flow. The code does not mandate this; however, there seems to be some discussion along those lines.

The switchdev API flow ndos need to stay for switch ASIC drivers that support flows directly, or that want all of their hw offload abstraction to be represented by the flow abstraction (openvswitch, the rocker dev). The details of how a flow is mapped to hw live in the corresponding switch driver code.

We think rtnetlink is the API to model switch ASIC hw tables.
We have a working model (Cumulus) that maps rtnetlink to switch
ASIC hw tables (via snooping rtnetlink msgs). This can be done by
extending the switchdev API with new ndos for L2 and L3.

Example:
  new switchdev ndo's for fdb_add/fdb_del
  new switchdev ndo's for l3
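
As a rough sketch of how flow and non-flow callbacks could coexist, the following models an ops table in userspace C where every callback is optional. All names (`demo_switch_ops`, `demo_fdb_add`, `demo_flow_insert`, `demo_l2_only_ops`) are hypothetical and are not taken from the patches under review; the signatures are assumptions made only for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ops table: a driver fills in only what its hardware supports. */
struct demo_switch_ops {
    int (*fdb_add)(const uint8_t mac[6], uint16_t vid);
    int (*fdb_del)(const uint8_t mac[6], uint16_t vid);
    int (*l3_route_add)(uint32_t prefix, unsigned int len);
    int (*flow_insert)(const void *flow);
};

/* Dispatch helpers report -EOPNOTSUPP when a callback is missing. */
static int demo_fdb_add(const struct demo_switch_ops *ops,
                        const uint8_t mac[6], uint16_t vid)
{
    if (!ops || !ops->fdb_add)
        return -EOPNOTSUPP;
    return ops->fdb_add(mac, vid);
}

static int demo_flow_insert(const struct demo_switch_ops *ops,
                            const void *flow)
{
    if (!ops || !ops->flow_insert)
        return -EOPNOTSUPP;
    return ops->flow_insert(flow);
}

/* Example driver for an ASIC with an L2 table and no flow support. */
static int demo_l2_fdb_add(const uint8_t mac[6], uint16_t vid)
{
    (void)mac;
    (void)vid;
    return 0; /* pretend the hardware table was programmed */
}

static const struct demo_switch_ops demo_l2_only_ops = {
    .fdb_add = demo_l2_fdb_add,
};
```

With a table like this, a flow-based driver and an L2/L3-only driver share one API, and callers fall back to software when a callback is absent.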

Now we only need working patches that implement the switchdev API ndo ops for L2/L3 (this is in the works).

As long as the current patches under review allow the API to be extended to cover non-flow-based L2/L3 switch ASIC offloads, we might be good (?).

Thanks,
Roopa



_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
