Re: [ovs-dev] [PATCH net 1/1] net: openvswitch: Fix ct_state nat flags for conns arriving from tc

2022-01-06 Thread Jamal Hadi Salim

On 2022-01-05 11:18, Paul Blakey wrote:



On Wed, 5 Jan 2022, Daniel Borkmann wrote:



[..]


Full ack on the bloat for corner cases like ovs offload, especially
given distros just enable most stuff anyway and therefore no light
fast path as with !CONFIG_NET_TC_SKB_EXT. :(

Could this somehow be hidden behind static key or such if offloads
are not used, so we can shrink it back to just calling into plain
__tcf_classify() for sw-only use cases (like BPF)?




It is used for both tc -> ovs and driver -> tc path.

I think I can do what you suggest adn will work on something like
that,  but this specific patch  doesn't really change the ext
allocation/derefences count (and probably  not the size as well).
So can  we take  this (not yet posted v2 after fixing what already
mentioned) and I'll do a patch of what you suggest in net-next?



Sounds reasonable.

The main outstanding challenge is still going to be all these bit
declarations (and i am sure more to come) that are specific for
specific components in TC (act_ct in this case); if every component
started planting their flags(pun intended) then both backwards and
forwards compatibility are going to be needed for maintainance.

The _cb spaces are supposed to be opaque and meaning to whats in
that space is only sensible to the component that is storing and
retrieving. Something like chain id is ok in that namespaces
because it has global meaning to TC, the others are not. My
suggestion is going forward try to not add more component specific
variables.
For a proper solution:
I think we need some sort of "metadata bus" to resolve this in the
long run. Something which is a bigger surgery...

cheers,
jamal

cheers,
jamal


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net 1/1] net: openvswitch: Fix ct_state nat flags for conns arriving from tc

2022-01-05 Thread Daniel Borkmann

On 1/5/22 3:57 PM, Jamal Hadi Salim wrote:

On 2022-01-04 03:28, Paul Blakey wrote:
[..]

--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -287,7 +287,9 @@ struct tc_skb_ext {
  __u32 chain;
  __u16 mru;
  __u16 zone;
-    bool post_ct;
+    bool post_ct:1;
+    bool post_ct_snat:1;
+    bool post_ct_dnat:1;
  };


is skb_ext intended only for ovs? If yes, why does it belong
in the core code? Ex: Looking at tcf_classify() which is such
a core function in the fast path any packet going via tc, it
is now encumbered with with checking presence of skb_ext.
I know passing around metadata is a paramount requirement
for programmability but this is getting messier with speacial
use cases for ovs and/or offload...


Full ack on the bloat for corner cases like ovs offload, especially
given distros just enable most stuff anyway and therefore no light
fast path as with !CONFIG_NET_TC_SKB_EXT. :(

Could this somehow be hidden behind static key or such if offloads
are not used, so we can shrink it back to just calling into plain
__tcf_classify() for sw-only use cases (like BPF)?
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net 1/1] net: openvswitch: Fix ct_state nat flags for conns arriving from tc

2022-01-05 Thread Jamal Hadi Salim

On 2022-01-04 03:28, Paul Blakey wrote:
[..]

--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -287,7 +287,9 @@ struct tc_skb_ext {
__u32 chain;
__u16 mru;
__u16 zone;
-   bool post_ct;
+   bool post_ct:1;
+   bool post_ct_snat:1;
+   bool post_ct_dnat:1;
  };



is skb_ext intended only for ovs? If yes, why does it belong
in the core code? Ex: Looking at tcf_classify() which is such
a core function in the fast path any packet going via tc, it
is now encumbered with with checking presence of skb_ext.
I know passing around metadata is a paramount requirement
for programmability but this is getting messier with speacial
use cases for ovs and/or offload...

cheers,
jamal
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net 1/1] net: openvswitch: Fix ct_state nat flags for conns arriving from tc

2022-01-05 Thread Paul Blakey via dev


On Wed, 5 Jan 2022, Daniel Borkmann wrote:

> On 1/5/22 3:57 PM, Jamal Hadi Salim wrote:
> > On 2022-01-04 03:28, Paul Blakey wrote:
> > [..]
> >> --- a/include/linux/skbuff.h
> >> +++ b/include/linux/skbuff.h
> >> @@ -287,7 +287,9 @@ struct tc_skb_ext {
> >>   __u32 chain;
> >>   __u16 mru;
> >>   __u16 zone;
> >> -    bool post_ct;
> >> +    bool post_ct:1;
> >> +    bool post_ct_snat:1;
> >> +    bool post_ct_dnat:1;
> >>   };
> > 
> > is skb_ext intended only for ovs? If yes, why does it belong
> > in the core code? Ex: Looking at tcf_classify() which is such
> > a core function in the fast path any packet going via tc, it
> > is now encumbered with with checking presence of skb_ext.
> > I know passing around metadata is a paramount requirement
> > for programmability but this is getting messier with speacial
> > use cases for ovs and/or offload...
> 
> Full ack on the bloat for corner cases like ovs offload, especially
> given distros just enable most stuff anyway and therefore no light
> fast path as with !CONFIG_NET_TC_SKB_EXT. :(
> 
> Could this somehow be hidden behind static key or such if offloads
> are not used, so we can shrink it back to just calling into plain
> __tcf_classify() for sw-only use cases (like BPF)?
> 
> 

It is used for both tc -> ovs and driver -> tc path.

I think I can do what you suggest adn will work on something like
that,  but this specific patch  doesn't really change the ext 
allocation/derefences count (and probably  not the size as well).
So can  we take  this (not yet posted v2 after fixing what already 
mentioned) and I'll do a patch of what you suggest in net-next?

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net 1/1] net: openvswitch: Fix ct_state nat flags for conns arriving from tc

2022-01-05 Thread Paul Blakey via dev




On Tue, 4 Jan 2022, Jakub Kicinski wrote:

> On Tue, 4 Jan 2022 10:28:21 +0200 Paul Blakey wrote:
> > Netfilter conntrack maintains NAT flags per connection indicating
> > whether NAT was configured for the connection. Openvswitch maintains
> > NAT flags on the per packet flow key ct_state field, indicating
> > whether NAT was actually executed on the packet.
> > 
> > When a packet misses from tc to ovs the conntrack NAT flags are set.
> > However, NAT was not necessarily executed on the packet because the
> > connection's state might still be in NEW state. As such, openvswitch wrongly
> > assumes that NAT was executed and sets an incorrect flow key NAT flags.
> > 
> > Fix this, by flagging to openvswitch which NAT was actually done in
> > act_ct via tc_skb_ext and tc_skb_cb to the openvswitch module, so
> > the packet flow key NAT flags will be correctly set.
> 
> Fixes ?

I wasn't sure which patches to blame, I guess the bug was there from the
introduction of action ct in tc, so I'll blame that. 

> 
> > Signed-off-by: Paul Blakey 
> 
> > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> > index 4507d77d6941..bab45a009310 100644
> > --- a/include/linux/skbuff.h
> > +++ b/include/linux/skbuff.h
> > @@ -287,7 +287,9 @@ struct tc_skb_ext {
> > __u32 chain;
> > __u16 mru;
> > __u16 zone;
> > -   bool post_ct;
> > +   bool post_ct:1;
> > +   bool post_ct_snat:1;
> > +   bool post_ct_dnat:1;
> 
> single bit bool variables seem weird, use a unsigned int type, like u8.
> 
> >  };
> >  #endif
> >  
> > diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
> > index 9e71691c491b..a171dfa91910 100644
> > --- a/include/net/pkt_sched.h
> > +++ b/include/net/pkt_sched.h
> > @@ -197,7 +197,9 @@ struct tc_skb_cb {
> > struct qdisc_skb_cb qdisc_cb;
> >  
> > u16 mru;
> > -   bool post_ct;
> > +   bool post_ct: 1;
> 
> extra space

Will remove, and send v2.

> 
> > +   bool post_ct_snat:1;
> > +   bool post_ct_dnat:1;
> > u16 zone; /* Only valid if post_ct = true */
> >  };
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net 1/1] net: openvswitch: Fix ct_state nat flags for conns arriving from tc

2022-01-04 Thread Jakub Kicinski
On Tue, 4 Jan 2022 10:28:21 +0200 Paul Blakey wrote:
> Netfilter conntrack maintains NAT flags per connection indicating
> whether NAT was configured for the connection. Openvswitch maintains
> NAT flags on the per packet flow key ct_state field, indicating
> whether NAT was actually executed on the packet.
> 
> When a packet misses from tc to ovs the conntrack NAT flags are set.
> However, NAT was not necessarily executed on the packet because the
> connection's state might still be in NEW state. As such, openvswitch wrongly
> assumes that NAT was executed and sets an incorrect flow key NAT flags.
> 
> Fix this, by flagging to openvswitch which NAT was actually done in
> act_ct via tc_skb_ext and tc_skb_cb to the openvswitch module, so
> the packet flow key NAT flags will be correctly set.

Fixes ?

> Signed-off-by: Paul Blakey 

> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 4507d77d6941..bab45a009310 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -287,7 +287,9 @@ struct tc_skb_ext {
>   __u32 chain;
>   __u16 mru;
>   __u16 zone;
> - bool post_ct;
> + bool post_ct:1;
> + bool post_ct_snat:1;
> + bool post_ct_dnat:1;

single bit bool variables seem weird, use a unsigned int type, like u8.

>  };
>  #endif
>  
> diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
> index 9e71691c491b..a171dfa91910 100644
> --- a/include/net/pkt_sched.h
> +++ b/include/net/pkt_sched.h
> @@ -197,7 +197,9 @@ struct tc_skb_cb {
>   struct qdisc_skb_cb qdisc_cb;
>  
>   u16 mru;
> - bool post_ct;
> + bool post_ct: 1;

extra space

> + bool post_ct_snat:1;
> + bool post_ct_dnat:1;
>   u16 zone; /* Only valid if post_ct = true */
>  };
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev