Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
On 17-01-17 06:53 AM, Paul Blakey wrote: Should be "trivial" like my patch. Add a new TLV type in TCA_XXX enum in rtnetlink.h in tc_ctl_tfilter look for it and memcpy it in tcf_fill_node set it on the new TCA_XXX if the cls struct has it present. That's exactly what I did before I realized it won't work per internal element (per handle), which is what I want. see my example. So I'll probably implement it like actions dumping/init works - tcf_exts_init, tcf_exts_dump (calling a generic functions on all classifiers who want this). I'll add something like get_cookie, dump_cookie. which get/set the TCA_COOKIE. Yes, that makes more sense. cheers, jamal
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
On 17/01/2017 13:23, Jamal Hadi Salim wrote: On 17-01-16 04:51 AM, Jiri Pirko wrote: Mon, Jan 16, 2017 at 08:54:18AM CET, pa...@mellanox.com wrote: I think we should do it in a generic way, for every classifier, right away. Same as Jamal is doing for actions. I think that first we should get Jamal's patch merged and then do the same for classifiers. Should be "trivial" like my patch. Add a new TLV type in TCA_XXX enum in rtnetlink.h in tc_ctl_tfilter look for it and memcpy it in tcf_fill_node set it on the new TCA_XXX if the cls struct has it present. That's exactly what I did before I realized it won't work per internal element (per handle), which is what I want. see my example. So I'll probably implement it like actions dumping/init works - tcf_exts_init, tcf_exts_dump (calling a generic functions on all classifiers who want this). I'll add something like get_cookie, dump_cookie. which get/set the TCA_COOKIE. Thanks, Paul. Look at my patch (1 of 2) which I just reposted. cheers, jamal
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
On 17-01-16 04:51 AM, Jiri Pirko wrote: Mon, Jan 16, 2017 at 08:54:18AM CET, pa...@mellanox.com wrote: I think we should do it in a generic way, for every classifier, right away. Same as Jamal is doing for actions. I think that first we should get Jamal's patch merged and then do the same for classifiers. Should be "trivial" like my patch. Add a new TLV type in TCA_XXX enum in rtnetlink.h in tc_ctl_tfilter look for it and memcpy it in tcf_fill_node set it on the new TCA_XXX if the cls struct has it present. Look at my patch (1 of 2) which I just reposted. cheers, jamal
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
Mon, Jan 16, 2017 at 08:54:18AM CET, pa...@mellanox.com wrote: > > >On 15/01/2017 21:08, John Fastabend wrote: >> On 17-01-15 09:36 AM, Paul Blakey wrote: >> > >> > >> > On 08/01/2017 19:12, Jiri Pirko wrote: >> > > Mon, Jan 02, 2017 at 03:59:49PM CET, j...@mojatatu.com wrote: >> > > > >> > > > We have been using a cookie as well for actions (which we have been >> > > > using but have been too lazy to submit so far). I am going to port >> > > > it over to the newer kernels and post it. >> > > >> > > Hard to deal with something we can't look at :) >> > > >> > > >> > > > In our case that is intended to be opaque to the kernel i.e kernel >> > > > never inteprets it; in that case it is similar to the kernel >> > > > FIB protocol field. >> > > >> > > In case of this patch, kernel also never interprets it. What makes you >> > > think otherwise. Bot kernel, it is always a binary blob. >> > > >> > > >> > > > >> > > > In your case - could this cookie have been a class/flowid >> > > > (a 32 bit)? >> > > > And would it not make more sense for it the cookie to be >> > > > generic to all classifiers? i.e why is it specific to flower? >> > > >> > > Correct, makes sense to have it generic for all cls and perhaps also >> > > acts. >> > > >> > > >> > >> > Hi all, >> > I've started implementing a general cookie for all classifiers, >> > I added the cookie on tcf_proto struct but then I realized that it won't >> > work as >> > I want. What I want is cookie per internal element (those that are >> > identified by >> > handles), which we can have many per one tcf_proto: >> > >> > tc filter add dev parent : prio 1 cookie 0x123 basic action drop >> > tc filter add dev parent : prio 1 cookie 0x456 "port6" basic >> > action drop >> > tc filter add dev parent : prio 1 cookie 0x777 basic action drop >> > >> > So there is three options to do that: >> > 1) Implement it in each classifier, using some generic function. (kinda >> > like >> > stats, where classifiler_dump() calls tcf_exts_dump_stats()) >> > 2) Have a global hash table for cookies [prio,handle]->[cookie] >> > 3) Have it on flower only for now :) >> > >> > With the first one we will have to change each classifier (or leave it >> > unsupported as default). >> > Second is less code to change (instead of changing each classifier), but >> > might >> > be slower. I'm afraid how it will affect dumping several filters. >> > Third is this patch. >> > >> > Thanks, >> > Paul. >> >> Avoid (2) it creates way more problems than its worth like is it locked/not >> locked, how is it synced, collisions, etc. Seems way overkill. +1 >> >> I like the generic functionality of (1) but unless we see this pop up in >> different filters I wouldn't require it for now. If you just do it in flower >> (option 3) when its added to another classifier we can generalize it. As a >> middle ground creating nice helper routines as needed goes a long way. >> >> .John >> > >Hi all, >So is everyone ok with leaving the commit as is, only adding user data >to flower for now? I think we should do it in a generic way, for every classifier, right away. Same as Jamal is doing for actions. I think that first we should get Jamal's patch merged and then do the same for classifiers.
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
On 15/01/2017 21:08, John Fastabend wrote: On 17-01-15 09:36 AM, Paul Blakey wrote: On 08/01/2017 19:12, Jiri Pirko wrote: Mon, Jan 02, 2017 at 03:59:49PM CET, j...@mojatatu.com wrote: We have been using a cookie as well for actions (which we have been using but have been too lazy to submit so far). I am going to port it over to the newer kernels and post it. Hard to deal with something we can't look at :) In our case that is intended to be opaque to the kernel i.e kernel never inteprets it; in that case it is similar to the kernel FIB protocol field. In case of this patch, kernel also never interprets it. What makes you think otherwise. Bot kernel, it is always a binary blob. In your case - could this cookie have been a class/flowid (a 32 bit)? And would it not make more sense for it the cookie to be generic to all classifiers? i.e why is it specific to flower? Correct, makes sense to have it generic for all cls and perhaps also acts. Hi all, I've started implementing a general cookie for all classifiers, I added the cookie on tcf_proto struct but then I realized that it won't work as I want. What I want is cookie per internal element (those that are identified by handles), which we can have many per one tcf_proto: tc filter add dev parent : prio 1 cookie 0x123 basic action drop tc filter add dev parent : prio 1 cookie 0x456 "port6" basic action drop tc filter add dev parent : prio 1 cookie 0x777 basic action drop So there is three options to do that: 1) Implement it in each classifier, using some generic function. (kinda like stats, where classifiler_dump() calls tcf_exts_dump_stats()) 2) Have a global hash table for cookies [prio,handle]->[cookie] 3) Have it on flower only for now :) With the first one we will have to change each classifier (or leave it unsupported as default). Second is less code to change (instead of changing each classifier), but might be slower. I'm afraid how it will affect dumping several filters. Third is this patch. Thanks, Paul. Avoid (2) it creates way more problems than its worth like is it locked/not locked, how is it synced, collisions, etc. Seems way overkill. I like the generic functionality of (1) but unless we see this pop up in different filters I wouldn't require it for now. If you just do it in flower (option 3) when its added to another classifier we can generalize it. As a middle ground creating nice helper routines as needed goes a long way. .John Hi all, So is everyone ok with leaving the commit as is, only adding user data to flower for now? Thanks, Paul.
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
On 17-01-15 09:36 AM, Paul Blakey wrote: > > > On 08/01/2017 19:12, Jiri Pirko wrote: >> Mon, Jan 02, 2017 at 03:59:49PM CET, j...@mojatatu.com wrote: >>> >>> We have been using a cookie as well for actions (which we have been >>> using but have been too lazy to submit so far). I am going to port >>> it over to the newer kernels and post it. >> >> Hard to deal with something we can't look at :) >> >> >>> In our case that is intended to be opaque to the kernel i.e kernel >>> never inteprets it; in that case it is similar to the kernel >>> FIB protocol field. >> >> In case of this patch, kernel also never interprets it. What makes you >> think otherwise. Bot kernel, it is always a binary blob. >> >> >>> >>> In your case - could this cookie have been a class/flowid >>> (a 32 bit)? >>> And would it not make more sense for it the cookie to be >>> generic to all classifiers? i.e why is it specific to flower? >> >> Correct, makes sense to have it generic for all cls and perhaps also >> acts. >> >> > > Hi all, > I've started implementing a general cookie for all classifiers, > I added the cookie on tcf_proto struct but then I realized that it won't work > as > I want. What I want is cookie per internal element (those that are identified > by > handles), which we can have many per one tcf_proto: > > tc filter add dev parent : prio 1 cookie 0x123 basic action drop > tc filter add dev parent : prio 1 cookie 0x456 "port6" basic action > drop > tc filter add dev parent : prio 1 cookie 0x777 basic action drop > > So there is three options to do that: > 1) Implement it in each classifier, using some generic function. (kinda like > stats, where classifiler_dump() calls tcf_exts_dump_stats()) > 2) Have a global hash table for cookies [prio,handle]->[cookie] > 3) Have it on flower only for now :) > > With the first one we will have to change each classifier (or leave it > unsupported as default). > Second is less code to change (instead of changing each classifier), but might > be slower. I'm afraid how it will affect dumping several filters. > Third is this patch. > > Thanks, > Paul. Avoid (2) it creates way more problems than its worth like is it locked/not locked, how is it synced, collisions, etc. Seems way overkill. I like the generic functionality of (1) but unless we see this pop up in different filters I wouldn't require it for now. If you just do it in flower (option 3) when its added to another classifier we can generalize it. As a middle ground creating nice helper routines as needed goes a long way. .John
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
On 08/01/2017 19:12, Jiri Pirko wrote: Mon, Jan 02, 2017 at 03:59:49PM CET, j...@mojatatu.com wrote: We have been using a cookie as well for actions (which we have been using but have been too lazy to submit so far). I am going to port it over to the newer kernels and post it. Hard to deal with something we can't look at :) In our case that is intended to be opaque to the kernel i.e kernel never inteprets it; in that case it is similar to the kernel FIB protocol field. In case of this patch, kernel also never interprets it. What makes you think otherwise. Bot kernel, it is always a binary blob. In your case - could this cookie have been a class/flowid (a 32 bit)? And would it not make more sense for it the cookie to be generic to all classifiers? i.e why is it specific to flower? Correct, makes sense to have it generic for all cls and perhaps also acts. Hi all, I've started implementing a general cookie for all classifiers, I added the cookie on tcf_proto struct but then I realized that it won't work as I want. What I want is cookie per internal element (those that are identified by handles), which we can have many per one tcf_proto: tc filter add dev parent : prio 1 cookie 0x123 basic action drop tc filter add dev parent : prio 1 cookie 0x456 "port6" basic action drop tc filter add dev parent : prio 1 cookie 0x777 basic action drop So there is three options to do that: 1) Implement it in each classifier, using some generic function. (kinda like stats, where classifiler_dump() calls tcf_exts_dump_stats()) 2) Have a global hash table for cookies [prio,handle]->[cookie] 3) Have it on flower only for now :) With the first one we will have to change each classifier (or leave it unsupported as default). Second is less code to change (instead of changing each classifier), but might be slower. I'm afraid how it will affect dumping several filters. Third is this patch. Thanks, Paul. cheers, jamal On 17-01-02 08:13 AM, Paul Blakey wrote: This is to support saving extra data that might be helpful on retrieval. First use case is upcoming openvswitch flow offloads, extra data will include UFID and port mappings for each added flow. Signed-off-by: Paul Blakey Reviewed-by: Roi Dayan Acked-by: Jiri Pirko --- include/uapi/linux/pkt_cls.h | 3 +++ net/sched/cls_flower.c | 22 +- 2 files changed, 24 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h index cb4bcdc..ca9bbe3 100644 --- a/include/uapi/linux/pkt_cls.h +++ b/include/uapi/linux/pkt_cls.h @@ -471,10 +471,13 @@ enum { TCA_FLOWER_KEY_ICMPV6_TYPE, /* u8 */ TCA_FLOWER_KEY_ICMPV6_TYPE_MASK,/* u8 */ + TCA_FLOWER_COOKIE, /* binary */ + __TCA_FLOWER_MAX, }; #define TCA_FLOWER_MAX (__TCA_FLOWER_MAX - 1) +#define FLOWER_MAX_COOKIE_SIZE 128 enum { TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT = (1 << 0), diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c index 333f8e2..e2f5b25 100644 --- a/net/sched/cls_flower.c +++ b/net/sched/cls_flower.c @@ -85,6 +85,8 @@ struct cls_fl_filter { struct rcu_head rcu; struct tc_to_netdev tc; struct net_device *hw_dev; + size_t cookie_len; + long cookie[0]; }; static unsigned short int fl_mask_range(const struct fl_flow_mask *mask) @@ -794,6 +796,9 @@ static int fl_change(struct net *net, struct sk_buff *in_skb, struct cls_fl_filter *fnew; struct nlattr *tb[TCA_FLOWER_MAX + 1]; struct fl_flow_mask mask = {}; + const struct nlattr *attr; + size_t cookie_len = 0; + void *cookie; int err; if (!tca[TCA_OPTIONS]) @@ -806,10 +811,22 @@ static int fl_change(struct net *net, struct sk_buff *in_skb, if (fold && handle && fold->handle != handle) return -EINVAL; - fnew = kzalloc(sizeof(*fnew), GFP_KERNEL); + if (tb[TCA_FLOWER_COOKIE]) { + attr = tb[TCA_FLOWER_COOKIE]; + cookie_len = nla_len(attr); + cookie = nla_data(attr); + if (cookie_len > FLOWER_MAX_COOKIE_SIZE) + return -EINVAL; + } + + fnew = kzalloc(sizeof(*fnew) + cookie_len, GFP_KERNEL); if (!fnew) return -ENOBUFS; + fnew->cookie_len = cookie_len; + if (cookie_len) + memcpy(fnew->cookie, cookie, cookie_len); + err = tcf_exts_init(&fnew->exts, TCA_FLOWER_ACT, 0); if (err < 0) goto errout; @@ -1151,6 +1168,9 @@ static int fl_dump(struct net *net, struct tcf_proto *tp, unsigned long fh, nla_put_u32(skb, TCA_FLOWER_FLAGS, f->flags); + if (f->cookie_len) + nla_put(skb, TCA_FLOWER_COOKIE, f->cookie_len, f->cookie); + if (tcf_exts_dump(skb, &f->exts)) goto nla_put_failure;
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
On 17-01-14 10:29 AM, Jiri Pirko wrote: Sat, Jan 14, 2017 at 04:03:17PM CET, j...@mojatatu.com wrote: Fair. So could this be done like IFLA_PHYS_SWITCH_ID and IFLA_PHYS_PORT_ID. They can have variable size, max is MAX_PHYS_ITEM_ID_LEN We can let user to pass arbitrary len up to 16 bytes. This has benefit in fact that if in future this needs to be extended to say 32 bytes, it is backward compatible. We just change the check in kernel. Sure. Ive run out of time for now; but will work on a new version later. cheers, jamal
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
Sat, Jan 14, 2017 at 04:03:17PM CET, j...@mojatatu.com wrote: >On 17-01-14 09:48 AM, Jiri Pirko wrote: >> Sat, Jan 14, 2017 at 01:56:35PM CET, j...@mojatatu.com wrote: > > >> > I think the feature makes a lot of sense (as is the action variant). >> > But can we make it: >> > a) fixed size >> >> Can you elaborate on why is this needed? >> > >My experience with the action bits its easier to debug >and enforces some discipline to not abuse the amount of RAM used. >If you have 1M rules, one extra 128M is easier on the system than >a few Gigs. Fair. So could this be done like IFLA_PHYS_SWITCH_ID and IFLA_PHYS_PORT_ID. They can have variable size, max is MAX_PHYS_ITEM_ID_LEN We can let user to pass arbitrary len up to 16 bytes. This has benefit in fact that if in future this needs to be extended to say 32 bytes, it is backward compatible. We just change the check in kernel. > >> >> > b) apply to all classifiers >> > c) please post a usage example via iproute2/tc >> > >> > I am going to post the action variant in the next while - will do some more >> > testing first. >> >> I believe we have to make the cls and ats cookies exactly the same. >> > >Probably - they are both needed. See the patch I just posted. > >cheers, >jamal >
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
On 17-01-14 09:48 AM, Jiri Pirko wrote: Sat, Jan 14, 2017 at 01:56:35PM CET, j...@mojatatu.com wrote: I think the feature makes a lot of sense (as is the action variant). But can we make it: a) fixed size Can you elaborate on why is this needed? My experience with the action bits its easier to debug and enforces some discipline to not abuse the amount of RAM used. If you have 1M rules, one extra 128M is easier on the system than a few Gigs. b) apply to all classifiers c) please post a usage example via iproute2/tc I am going to post the action variant in the next while - will do some more testing first. I believe we have to make the cls and ats cookies exactly the same. Probably - they are both needed. See the patch I just posted. cheers, jamal
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
Sat, Jan 14, 2017 at 01:56:35PM CET, j...@mojatatu.com wrote: >On 17-01-09 01:23 PM, John Fastabend wrote: >> On 17-01-08 09:19 AM, Jiri Pirko wrote: > >[..] >> > This should never be interpreted by kernel. I think this would be good >> > to make clear in the comment in the code. >> > >> >> Ah OK I had assumed you would be pushing this via tc_cls_flower_offload into >> the driver in a follow up patch. But if it lives in kernel space as opaque >> cookie guess its no different then other id fields order/prio/cookie. >> >> Thanks for clarifying. > > >I think the feature makes a lot of sense (as is the action variant). >But can we make it: >a) fixed size Can you elaborate on why is this needed? >b) apply to all classifiers >c) please post a usage example via iproute2/tc > >I am going to post the action variant in the next while - will do some more >testing first. I believe we have to make the cls and ats cookies exactly the same. > >cheers, >jamal
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
On 17-01-03 07:22 AM, Paul Blakey wrote: I don't mind having it on TC level but I didn't want to intervene with all filters/TC. Please do make it for all classifiers. I described a use case for u32 where the cookie could be used to pretty print the output on a dump/get. cheers, jamal
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
On 17-01-09 01:23 PM, John Fastabend wrote: On 17-01-08 09:19 AM, Jiri Pirko wrote: [..] This should never be interpreted by kernel. I think this would be good to make clear in the comment in the code. Ah OK I had assumed you would be pushing this via tc_cls_flower_offload into the driver in a follow up patch. But if it lives in kernel space as opaque cookie guess its no different then other id fields order/prio/cookie. Thanks for clarifying. I think the feature makes a lot of sense (as is the action variant). But can we make it: a) fixed size b) apply to all classifiers c) please post a usage example via iproute2/tc I am going to post the action variant in the next while - will do some more testing first. cheers, jamal
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
On 17-01-08 09:19 AM, Jiri Pirko wrote: > Mon, Jan 02, 2017 at 11:21:41PM CET, j...@mojatatu.com wrote: >> On 17-01-02 01:23 PM, John Fastabend wrote: >> >>> >>> Additionally I would like to point out this is an arbitrary length binary >>> blob (for undefined use, without even a specified encoding) that gets pushed >>> between user space and hardware ;) This seemed to get folks fairly excited >>> in >>> the past. >>> >> >> The binary blob size is a little strange - but i think there is value >> in storing some "cookie" field. The challenge is whether the kernel >> gets to intepret it; in which case encoding must be specified. Or >> whether we should leave it up to user space - in which something >> like tc could standardize its own encodings. > > This should never be interpreted by kernel. I think this would be good > to make clear in the comment in the code. > Ah OK I had assumed you would be pushing this via tc_cls_flower_offload into the driver in a follow up patch. But if it lives in kernel space as opaque cookie guess its no different then other id fields order/prio/cookie. Thanks for clarifying.
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
Mon, Jan 02, 2017 at 11:21:41PM CET, j...@mojatatu.com wrote: >On 17-01-02 01:23 PM, John Fastabend wrote: > >> >> Additionally I would like to point out this is an arbitrary length binary >> blob (for undefined use, without even a specified encoding) that gets pushed >> between user space and hardware ;) This seemed to get folks fairly excited in >> the past. >> > >The binary blob size is a little strange - but i think there is value >in storing some "cookie" field. The challenge is whether the kernel >gets to intepret it; in which case encoding must be specified. Or >whether we should leave it up to user space - in which something >like tc could standardize its own encodings. This should never be interpreted by kernel. I think this would be good to make clear in the comment in the code. > >> Some questions, exactly what do you mean by "port mappings" above? In >> general the 'tc' API uses the netdev the netlink msg is processed on as >> the port mapping. If you mean OVS port to netdev port I think this is >> a OVS problem and nothing to do with 'tc'. For what its worth there is an >> existing problem with 'tc' where rules only apply to a single ingress or >> egress port which is limiting on hardware. >> > >In our case the desire is to be able to correlate for a system wide >mostly identity/key mapping. > >> The UFID in my ovs code base is defined as best I can tell here, >> >> [OVS_FLOW_ATTR_UFID] = { .type = NL_A_UNSPEC, .optional = true, >> .min_len = sizeof(ovs_u128) }, >> >> So you need 128 bits if you want a 1:1 mapping onto 'tc'. So rather >> than an arbitrary blob why not make the case that 'tc' ids need to be >> 128 bits long? Even if its just initially done in flower call it >> flower_flow_id and define it so its not opaque and at least at the code >> level it isn't an arbitrary blob of data. >> > >I dont know what this UFID is, but do note: >The idea is not new - the FIB for example has some such cookie >(albeit a tiny one) which will typically be populated to tell >you who/what installed the entry. >I could see f.e use for this cookie to simplify and pretty print in >a human language for the u32 classifier (i.e user space tc sets >some fields in the cookie when updating kernel and when user space >invokes get/dump it uses the cookie to intepret how to pretty print). > >I have attached a compile tested version of the cookies on actions >(flat 64 bit; now that we have experienced the use when we have a >large number of counters - I would not mind a 128 bit field). > > >cheers, >jamal > >> And what are the "next" uses of this besides OVS. It would be really >> valuable to see how this generalizes to other usage models. To avoid >> embedding OVS syntax into 'tc'. >> >> Finally if you want to see an example of binary data encodings look at >> how drivers/hardware/users are currently using the user defined bits in >> ethtools ntuple API. Also track down out of tree drivers to see other >> interesting uses. And that was capped at 64bits :/ >> >> Thanks, >> John >> >> >> >> >> > >diff --git a/include/net/act_api.h b/include/net/act_api.h >index 1d71644..f299ed3 100644 >--- a/include/net/act_api.h >+++ b/include/net/act_api.h >@@ -41,6 +41,7 @@ struct tc_action { > struct rcu_head tcfa_rcu; > struct gnet_stats_basic_cpu __percpu *cpu_bstats; > struct gnet_stats_queue __percpu *cpu_qstats; >+ u64 cookie; > }; > #define tcf_head common.tcfa_head > #define tcf_index common.tcfa_index >diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h >index cb4bcdc..2e968ee 100644 >--- a/include/uapi/linux/pkt_cls.h >+++ b/include/uapi/linux/pkt_cls.h >@@ -67,6 +67,7 @@ enum { > TCA_ACT_INDEX, > TCA_ACT_STATS, > TCA_ACT_PAD, >+ TCA_ACT_COOKIE, > __TCA_ACT_MAX > }; > >diff --git a/net/sched/act_api.c b/net/sched/act_api.c >index 2095c83..97eae6b 100644 >--- a/net/sched/act_api.c >+++ b/net/sched/act_api.c >@@ -26,6 +26,7 @@ > #include > #include > #include >+#include > > static void free_tcf(struct rcu_head *head) > { >@@ -467,17 +468,21 @@ int tcf_action_destroy(struct list_head *actions, int >bind) > return a->ops->dump(skb, a, bind, ref); > } > >-int >-tcf_action_dump_1(struct sk_buff *skb, struct tc_action *a, int bind, int ref) >+int tcf_action_dump_1(struct sk_buff *skb, struct tc_action *a, int bind, >+int ref) > { > int err = -EINVAL; > unsigned char *b = skb_tail_pointer(skb); > struct nlattr *nest; >+ u64 cookie = a->cookie; > > if (nla_put_string(skb, TCA_KIND, a->ops->kind)) > goto nla_put_failure; > if (tcf_action_copy_stats(skb, a, 0)) > goto nla_put_failure; >+ if (nla_put_u64_64bit(skb, TCA_ACT_COOKIE, cookie, TCA_ACT_PAD)) >+ goto nla_put_failure; >+ > nest = nla_nest_start(skb, TCA_OPTIONS); > if (nest == NULL) >
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
Mon, Jan 02, 2017 at 07:23:27PM CET, john.fastab...@gmail.com wrote: >On 17-01-02 06:59 AM, Jamal Hadi Salim wrote: >> >> We have been using a cookie as well for actions (which we have been >> using but have been too lazy to submit so far). I am going to port >> it over to the newer kernels and post it. >> In our case that is intended to be opaque to the kernel i.e kernel >> never inteprets it; in that case it is similar to the kernel >> FIB protocol field. >> >> In your case - could this cookie have been a class/flowid >> (a 32 bit)? >> And would it not make more sense for it the cookie to be >> generic to all classifiers? i.e why is it specific to flower? >> >> cheers, >> jamal >> >> On 17-01-02 08:13 AM, Paul Blakey wrote: >>> This is to support saving extra data that might be helpful on retrieval. >>> First use case is upcoming openvswitch flow offloads, extra data will >>> include UFID and port mappings for each added flow. >>> >>> Signed-off-by: Paul Blakey >>> Reviewed-by: Roi Dayan >>> Acked-by: Jiri Pirko >>> --- > >Additionally I would like to point out this is an arbitrary length binary >blob (for undefined use, without even a specified encoding) that gets pushed >between user space and hardware ;) This seemed to get folks fairly excited in >the past. No John, this is very different. What was frowned upon was interchange of binary blobs between userspace and hw. In this case, cookie is never interpreted, only stored in kernel memory, used *always* only by user.
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
Mon, Jan 02, 2017 at 03:59:49PM CET, j...@mojatatu.com wrote: > >We have been using a cookie as well for actions (which we have been >using but have been too lazy to submit so far). I am going to port >it over to the newer kernels and post it. Hard to deal with something we can't look at :) >In our case that is intended to be opaque to the kernel i.e kernel >never inteprets it; in that case it is similar to the kernel >FIB protocol field. In case of this patch, kernel also never interprets it. What makes you think otherwise. Bot kernel, it is always a binary blob. > >In your case - could this cookie have been a class/flowid >(a 32 bit)? >And would it not make more sense for it the cookie to be >generic to all classifiers? i.e why is it specific to flower? Correct, makes sense to have it generic for all cls and perhaps also acts. > >cheers, >jamal > >On 17-01-02 08:13 AM, Paul Blakey wrote: >> This is to support saving extra data that might be helpful on retrieval. >> First use case is upcoming openvswitch flow offloads, extra data will >> include UFID and port mappings for each added flow. >> >> Signed-off-by: Paul Blakey >> Reviewed-by: Roi Dayan >> Acked-by: Jiri Pirko >> --- >> include/uapi/linux/pkt_cls.h | 3 +++ >> net/sched/cls_flower.c | 22 +- >> 2 files changed, 24 insertions(+), 1 deletion(-) >> >> diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h >> index cb4bcdc..ca9bbe3 100644 >> --- a/include/uapi/linux/pkt_cls.h >> +++ b/include/uapi/linux/pkt_cls.h >> @@ -471,10 +471,13 @@ enum { >> TCA_FLOWER_KEY_ICMPV6_TYPE, /* u8 */ >> TCA_FLOWER_KEY_ICMPV6_TYPE_MASK,/* u8 */ >> >> +TCA_FLOWER_COOKIE, /* binary */ >> + >> __TCA_FLOWER_MAX, >> }; >> >> #define TCA_FLOWER_MAX (__TCA_FLOWER_MAX - 1) >> +#define FLOWER_MAX_COOKIE_SIZE 128 >> >> enum { >> TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT = (1 << 0), >> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c >> index 333f8e2..e2f5b25 100644 >> --- a/net/sched/cls_flower.c >> +++ b/net/sched/cls_flower.c >> @@ -85,6 +85,8 @@ struct cls_fl_filter { >> struct rcu_head rcu; >> struct tc_to_netdev tc; >> struct net_device *hw_dev; >> +size_t cookie_len; >> +long cookie[0]; >> }; >> >> static unsigned short int fl_mask_range(const struct fl_flow_mask *mask) >> @@ -794,6 +796,9 @@ static int fl_change(struct net *net, struct sk_buff >> *in_skb, >> struct cls_fl_filter *fnew; >> struct nlattr *tb[TCA_FLOWER_MAX + 1]; >> struct fl_flow_mask mask = {}; >> +const struct nlattr *attr; >> +size_t cookie_len = 0; >> +void *cookie; >> int err; >> >> if (!tca[TCA_OPTIONS]) >> @@ -806,10 +811,22 @@ static int fl_change(struct net *net, struct sk_buff >> *in_skb, >> if (fold && handle && fold->handle != handle) >> return -EINVAL; >> >> -fnew = kzalloc(sizeof(*fnew), GFP_KERNEL); >> +if (tb[TCA_FLOWER_COOKIE]) { >> +attr = tb[TCA_FLOWER_COOKIE]; >> +cookie_len = nla_len(attr); >> +cookie = nla_data(attr); >> +if (cookie_len > FLOWER_MAX_COOKIE_SIZE) >> +return -EINVAL; >> +} >> + >> +fnew = kzalloc(sizeof(*fnew) + cookie_len, GFP_KERNEL); >> if (!fnew) >> return -ENOBUFS; >> >> +fnew->cookie_len = cookie_len; >> +if (cookie_len) >> +memcpy(fnew->cookie, cookie, cookie_len); >> + >> err = tcf_exts_init(&fnew->exts, TCA_FLOWER_ACT, 0); >> if (err < 0) >> goto errout; >> @@ -1151,6 +1168,9 @@ static int fl_dump(struct net *net, struct tcf_proto >> *tp, unsigned long fh, >> >> nla_put_u32(skb, TCA_FLOWER_FLAGS, f->flags); >> >> +if (f->cookie_len) >> +nla_put(skb, TCA_FLOWER_COOKIE, f->cookie_len, f->cookie); >> + >> if (tcf_exts_dump(skb, &f->exts)) >> goto nla_put_failure; >> >> >
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
On Wed, Jan 04, 2017 at 01:45:28PM +0200, Paul Blakey wrote: > On 04/01/2017 12:14, Simon Horman wrote: > >On Tue, Jan 03, 2017 at 02:22:05PM +0200, Paul Blakey wrote: > >> > >>On 03/01/2017 13:44, Jamal Hadi Salim wrote: > >>>On 17-01-02 11:33 PM, John Fastabend wrote: > On 17-01-02 05:22 PM, Jamal Hadi Salim wrote: > >>>[..] > >Like all cookie semantics it is for storing state. The receiver > >(kernel) > >is not just store it and not intepret it. The user when reading it back > >simplifies what they have to do for their processing. > > > >>The tuple really should be unique why > >>not use this for system wide mappings? > >> > >I think on a single machine should be enough, however: > >typically the user wants to define the value in a manner that > >in a distributed system it is unique. It would be trickier to > >do so with well defined values such as above. > > > Just extend the tuple that > should be unique in the domain of hostname's, or use some other domain > wide machine identifier. > > >>>May work for the case of filter identification. The nice thing for > >>>allowing cookies is you can let the user define it define their > >>>own scheme. > >>> > Although actions can be shared so the cookie can be shared across > filters. Maybe its useful but it doesn't uniquely identify a filter > in the shared case but the user would have to specify that case > so maybe its not important. > > >>>Note: the action cookies and filter cookies are unrelated/orthogonal. > >>>Their basic concept of stashing something in the cookie to help improve > >>>what user space does (in our case millions of actions of which some are > >>>used for accounting) is similar. > >>>I have no objections to the flow cookies; my main concern was it should > >>>be applicable to all classifiers not just flower. And the arbitrary size > >>>of the cookie that you pointed out is questionable. > >>> > >>>cheers, > >>>jamal > >> > >>Hi all, > >>Our use case is replacing OVS rules with TC filters for HW offload, and > >>you're are right the cookie would > >>have saved us the mapping from OVS rule ufid to the tc filter handle/prio... > >>that was generated for it. > >>It also was going to be used to store other info like which OVS output port > >>corresponds to the ifindex, > >Possibly off-topic but I am curious to know why you need to store the port. > >My possibly naïve assumption is that a filter is attached to the netdev > >corresponding to the input port and mirred or other actions are used to > >output > >to netdevs corresponding to output ports. > > Right, its for the output ports, OVS uses ovs port numbers and mirred action > uses the device ifindex, so there is need > to translate it back to OVS port on dump. Understood, that is a tedious abstraction to support. But I don't see an easy way around it at this time. If I read Jamal's emails correctly he is working on per-action cookies. They may be better than per-flow cookies for storing the OvS port number (though not the UUID of the flow). ...
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
On 04/01/2017 12:14, Simon Horman wrote: On Tue, Jan 03, 2017 at 02:22:05PM +0200, Paul Blakey wrote: On 03/01/2017 13:44, Jamal Hadi Salim wrote: On 17-01-02 11:33 PM, John Fastabend wrote: On 17-01-02 05:22 PM, Jamal Hadi Salim wrote: [..] Like all cookie semantics it is for storing state. The receiver (kernel) is not just store it and not intepret it. The user when reading it back simplifies what they have to do for their processing. The tuple really should be unique why not use this for system wide mappings? I think on a single machine should be enough, however: typically the user wants to define the value in a manner that in a distributed system it is unique. It would be trickier to do so with well defined values such as above. Just extend the tuple that should be unique in the domain of hostname's, or use some other domain wide machine identifier. May work for the case of filter identification. The nice thing for allowing cookies is you can let the user define it define their own scheme. Although actions can be shared so the cookie can be shared across filters. Maybe its useful but it doesn't uniquely identify a filter in the shared case but the user would have to specify that case so maybe its not important. Note: the action cookies and filter cookies are unrelated/orthogonal. Their basic concept of stashing something in the cookie to help improve what user space does (in our case millions of actions of which some are used for accounting) is similar. I have no objections to the flow cookies; my main concern was it should be applicable to all classifiers not just flower. And the arbitrary size of the cookie that you pointed out is questionable. cheers, jamal Hi all, Our use case is replacing OVS rules with TC filters for HW offload, and you're are right the cookie would have saved us the mapping from OVS rule ufid to the tc filter handle/prio... that was generated for it. It also was going to be used to store other info like which OVS output port corresponds to the ifindex, Possibly off-topic but I am curious to know why you need to store the port. My possibly naïve assumption is that a filter is attached to the netdev corresponding to the input port and mirred or other actions are used to output to netdevs corresponding to output ports. Right, its for the output ports, OVS uses ovs port numbers and mirred action uses the device ifindex, so there is need to translate it back to OVS port on dump. so we need 128+32 for now. It helps us with dumping the the flows back, when we lose data on crash or restarting the user space daemon. HW hints is another thing that might be helpful. Its binary blob because user/app specifc and its usage might change in the future and its and that's why there is some headroom with size as well.
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
On Tue, Jan 03, 2017 at 02:22:05PM +0200, Paul Blakey wrote: > > > On 03/01/2017 13:44, Jamal Hadi Salim wrote: > >On 17-01-02 11:33 PM, John Fastabend wrote: > >>On 17-01-02 05:22 PM, Jamal Hadi Salim wrote: > > > >[..] > >>>Like all cookie semantics it is for storing state. The receiver > >>>(kernel) > >>>is not just store it and not intepret it. The user when reading it back > >>>simplifies what they have to do for their processing. > >>> > > The tuple really should be unique why > not use this for system wide mappings? > > >>> > >>>I think on a single machine should be enough, however: > >>>typically the user wants to define the value in a manner that > >>>in a distributed system it is unique. It would be trickier to > >>>do so with well defined values such as above. > >>> > >> > >>Just extend the tuple that > >>should be unique in the domain of hostname's, or use some other domain > >>wide machine identifier. > >> > > > >May work for the case of filter identification. The nice thing for > >allowing cookies is you can let the user define it define their > >own scheme. > > > >>Although actions can be shared so the cookie can be shared across > >>filters. Maybe its useful but it doesn't uniquely identify a filter > >>in the shared case but the user would have to specify that case > >>so maybe its not important. > >> > > > >Note: the action cookies and filter cookies are unrelated/orthogonal. > >Their basic concept of stashing something in the cookie to help improve > >what user space does (in our case millions of actions of which some are > >used for accounting) is similar. > >I have no objections to the flow cookies; my main concern was it should > >be applicable to all classifiers not just flower. And the arbitrary size > >of the cookie that you pointed out is questionable. > > > >cheers, > >jamal > > > Hi all, > Our use case is replacing OVS rules with TC filters for HW offload, and > you're are right the cookie would > have saved us the mapping from OVS rule ufid to the tc filter handle/prio... > that was generated for it. > It also was going to be used to store other info like which OVS output port > corresponds to the ifindex, Possibly off-topic but I am curious to know why you need to store the port. My possibly naïve assumption is that a filter is attached to the netdev corresponding to the input port and mirred or other actions are used to output to netdevs corresponding to output ports. > so we need 128+32 for now. It helps us with dumping the the flows back, when > we lose data on crash > or restarting the user space daemon. > HW hints is another thing that might be helpful. > Its binary blob because user/app specifc and its usage might change in the > future and its and that's why there > is some headroom with size as well.
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
On 03/01/2017 13:44, Jamal Hadi Salim wrote: On 17-01-02 11:33 PM, John Fastabend wrote: On 17-01-02 05:22 PM, Jamal Hadi Salim wrote: [..] Like all cookie semantics it is for storing state. The receiver (kernel) is not just store it and not intepret it. The user when reading it back simplifies what they have to do for their processing. The tuple really should be unique why not use this for system wide mappings? I think on a single machine should be enough, however: typically the user wants to define the value in a manner that in a distributed system it is unique. It would be trickier to do so with well defined values such as above. Just extend the tuple that should be unique in the domain of hostname's, or use some other domain wide machine identifier. May work for the case of filter identification. The nice thing for allowing cookies is you can let the user define it define their own scheme. Although actions can be shared so the cookie can be shared across filters. Maybe its useful but it doesn't uniquely identify a filter in the shared case but the user would have to specify that case so maybe its not important. Note: the action cookies and filter cookies are unrelated/orthogonal. Their basic concept of stashing something in the cookie to help improve what user space does (in our case millions of actions of which some are used for accounting) is similar. I have no objections to the flow cookies; my main concern was it should be applicable to all classifiers not just flower. And the arbitrary size of the cookie that you pointed out is questionable. cheers, jamal Hi all, Our use case is replacing OVS rules with TC filters for HW offload, and you're are right the cookie would have saved us the mapping from OVS rule ufid to the tc filter handle/prio... that was generated for it. It also was going to be used to store other info like which OVS output port corresponds to the ifindex, so we need 128+32 for now. It helps us with dumping the the flows back, when we lose data on crash or restarting the user space daemon. HW hints is another thing that might be helpful. Its binary blob because user/app specifc and its usage might change in the future and its and that's why there is some headroom with size as well. I don't mind having it on TC level but I didn't want to intervene with all filters/TC.
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
On 17-01-02 11:33 PM, John Fastabend wrote: On 17-01-02 05:22 PM, Jamal Hadi Salim wrote: [..] Like all cookie semantics it is for storing state. The receiver (kernel) is not just store it and not intepret it. The user when reading it back simplifies what they have to do for their processing. The tuple really should be unique why not use this for system wide mappings? I think on a single machine should be enough, however: typically the user wants to define the value in a manner that in a distributed system it is unique. It would be trickier to do so with well defined values such as above. Just extend the tuple that should be unique in the domain of hostname's, or use some other domain wide machine identifier. May work for the case of filter identification. The nice thing for allowing cookies is you can let the user define it define their own scheme. Although actions can be shared so the cookie can be shared across filters. Maybe its useful but it doesn't uniquely identify a filter in the shared case but the user would have to specify that case so maybe its not important. Note: the action cookies and filter cookies are unrelated/orthogonal. Their basic concept of stashing something in the cookie to help improve what user space does (in our case millions of actions of which some are used for accounting) is similar. I have no objections to the flow cookies; my main concern was it should be applicable to all classifiers not just flower. And the arbitrary size of the cookie that you pointed out is questionable. cheers, jamal
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
On 17-01-02 05:22 PM, Jamal Hadi Salim wrote: > On 17-01-02 05:58 PM, John Fastabend wrote: >> On 17-01-02 02:21 PM, Jamal Hadi Salim wrote: > > >> Well having the length value avoids ending up with cookie1, cookie2, ... >> values as folks push more and more data into the cookie. >> > > Unless there is good reason I dont see why it shouldnt be a fixed length > value. u64/128 should be plenty. > >> I don't see any use in the kernel interpreting it. It has no use >> for it as far as I can see. It doesn't appear to be metadata which >> we use skb->mark for at the moment. >> > > Like all cookie semantics it is for storing state. The receiver (kernel) > is not just store it and not intepret it. The user when reading it back > simplifies what they have to do for their processing. > >> >> The tuple really should be unique why >> not use this for system wide mappings? >> > > I think on a single machine should be enough, however: > typically the user wants to define the value in a manner that > in a distributed system it is unique. It would be trickier to > do so with well defined values such as above. > Just extend the tuple that should be unique in the domain of hostname's, or use some other domain wide machine identifier. > >> The only thing I can think to do with this that I can't do with >> the above tuple and a simple userspace lookup is stick hardware specific >> "hints" in the cookie for the firmware to consume. Which would be >> very helpful for what its worth. >> > > Ok, very different from our use case with actions. > We just use those values to map to stats info without needing to > know what flow or action (graph) it is associated with. > Sure. >> Its a bit strange to push it as an action when its not really an >> action in the traditional datapath. >> > > The action is part of a graph pointed to by a filter. Although actions can be shared so the cookie can be shared across filters. Maybe its useful but it doesn't uniquely identify a filter in the shared case but the user would have to specify that case so maybe its not important. > >> I suspect the OVS usage is a simple 1:1 lookup from OVS id to TC id to >> avoid a userspace map lookup. > > Not that i care about OVS but it sounds like a good use case (even for > tc), no? I'm not opposed to it. Just pushing on the use case a bit to understand. > > cheers, > jamal
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
On 17-01-02 05:58 PM, John Fastabend wrote: On 17-01-02 02:21 PM, Jamal Hadi Salim wrote: Well having the length value avoids ending up with cookie1, cookie2, ... values as folks push more and more data into the cookie. Unless there is good reason I dont see why it shouldnt be a fixed length value. u64/128 should be plenty. I don't see any use in the kernel interpreting it. It has no use for it as far as I can see. It doesn't appear to be metadata which we use skb->mark for at the moment. Like all cookie semantics it is for storing state. The receiver (kernel) is not just store it and not intepret it. The user when reading it back simplifies what they have to do for their processing. The tuple really should be unique why not use this for system wide mappings? I think on a single machine should be enough, however: typically the user wants to define the value in a manner that in a distributed system it is unique. It would be trickier to do so with well defined values such as above. The only thing I can think to do with this that I can't do with the above tuple and a simple userspace lookup is stick hardware specific "hints" in the cookie for the firmware to consume. Which would be very helpful for what its worth. Ok, very different from our use case with actions. We just use those values to map to stats info without needing to know what flow or action (graph) it is associated with. Its a bit strange to push it as an action when its not really an action in the traditional datapath. The action is part of a graph pointed to by a filter. I suspect the OVS usage is a simple 1:1 lookup from OVS id to TC id to avoid a userspace map lookup. Not that i care about OVS but it sounds like a good use case (even for tc), no? cheers, jamal
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
On 17-01-02 02:21 PM, Jamal Hadi Salim wrote: > On 17-01-02 01:23 PM, John Fastabend wrote: > >> >> Additionally I would like to point out this is an arbitrary length binary >> blob (for undefined use, without even a specified encoding) that gets pushed >> between user space and hardware ;) This seemed to get folks fairly excited in >> the past. >> > > The binary blob size is a little strange - but i think there is value > in storing some "cookie" field. The challenge is whether the kernel > gets to intepret it; in which case encoding must be specified. Or > whether we should leave it up to user space - in which something > like tc could standardize its own encodings. > Well having the length value avoids ending up with cookie1, cookie2, ... values as folks push more and more data into the cookie. I don't see any use in the kernel interpreting it. It has no use for it as far as I can see. It doesn't appear to be metadata which we use skb->mark for at the moment. >> Some questions, exactly what do you mean by "port mappings" above? In >> general the 'tc' API uses the netdev the netlink msg is processed on as >> the port mapping. If you mean OVS port to netdev port I think this is >> a OVS problem and nothing to do with 'tc'. For what its worth there is an >> existing problem with 'tc' where rules only apply to a single ingress or >> egress port which is limiting on hardware. >> > > In our case the desire is to be able to correlate for a system wide > mostly identity/key mapping. > The tuple really should be unique why not use this for system wide mappings? The only thing I can think to do with this that I can't do with the above tuple and a simple userspace lookup is stick hardware specific "hints" in the cookie for the firmware to consume. Which would be very helpful for what its worth. >> The UFID in my ovs code base is defined as best I can tell here, >> >> [OVS_FLOW_ATTR_UFID] = { .type = NL_A_UNSPEC, .optional = true, >> .min_len = sizeof(ovs_u128) }, >> >> So you need 128 bits if you want a 1:1 mapping onto 'tc'. So rather >> than an arbitrary blob why not make the case that 'tc' ids need to be >> 128 bits long? Even if its just initially done in flower call it >> flower_flow_id and define it so its not opaque and at least at the code >> level it isn't an arbitrary blob of data. >> > > I dont know what this UFID is, but do note: > The idea is not new - the FIB for example has some such cookie > (albeit a tiny one) which will typically be populated to tell > you who/what installed the entry. > I could see f.e use for this cookie to simplify and pretty print in > a human language for the u32 classifier (i.e user space tc sets > some fields in the cookie when updating kernel and when user space > invokes get/dump it uses the cookie to intepret how to pretty print). > > I have attached a compile tested version of the cookies on actions > (flat 64 bit; now that we have experienced the use when we have a > large number of counters - I would not mind a 128 bit field). > Its a bit strange to push it as an action when its not really an action in the traditional datapath. I suspect the OVS usage is a simple 1:1 lookup from OVS id to TC id to avoid a userspace map lookup. > > cheers, > jamal > >> And what are the "next" uses of this besides OVS. It would be really >> valuable to see how this generalizes to other usage models. To avoid >> embedding OVS syntax into 'tc'. >> >> Finally if you want to see an example of binary data encodings look at >> how drivers/hardware/users are currently using the user defined bits in >> ethtools ntuple API. Also track down out of tree drivers to see other >> interesting uses. And that was capped at 64bits :/ >> >> Thanks, >> John >> >> >> >> >> >
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
On 17-01-02 01:23 PM, John Fastabend wrote: Additionally I would like to point out this is an arbitrary length binary blob (for undefined use, without even a specified encoding) that gets pushed between user space and hardware ;) This seemed to get folks fairly excited in the past. The binary blob size is a little strange - but i think there is value in storing some "cookie" field. The challenge is whether the kernel gets to intepret it; in which case encoding must be specified. Or whether we should leave it up to user space - in which something like tc could standardize its own encodings. Some questions, exactly what do you mean by "port mappings" above? In general the 'tc' API uses the netdev the netlink msg is processed on as the port mapping. If you mean OVS port to netdev port I think this is a OVS problem and nothing to do with 'tc'. For what its worth there is an existing problem with 'tc' where rules only apply to a single ingress or egress port which is limiting on hardware. In our case the desire is to be able to correlate for a system wide mostly identity/key mapping. The UFID in my ovs code base is defined as best I can tell here, [OVS_FLOW_ATTR_UFID] = { .type = NL_A_UNSPEC, .optional = true, .min_len = sizeof(ovs_u128) }, So you need 128 bits if you want a 1:1 mapping onto 'tc'. So rather than an arbitrary blob why not make the case that 'tc' ids need to be 128 bits long? Even if its just initially done in flower call it flower_flow_id and define it so its not opaque and at least at the code level it isn't an arbitrary blob of data. I dont know what this UFID is, but do note: The idea is not new - the FIB for example has some such cookie (albeit a tiny one) which will typically be populated to tell you who/what installed the entry. I could see f.e use for this cookie to simplify and pretty print in a human language for the u32 classifier (i.e user space tc sets some fields in the cookie when updating kernel and when user space invokes get/dump it uses the cookie to intepret how to pretty print). I have attached a compile tested version of the cookies on actions (flat 64 bit; now that we have experienced the use when we have a large number of counters - I would not mind a 128 bit field). cheers, jamal And what are the "next" uses of this besides OVS. It would be really valuable to see how this generalizes to other usage models. To avoid embedding OVS syntax into 'tc'. Finally if you want to see an example of binary data encodings look at how drivers/hardware/users are currently using the user defined bits in ethtools ntuple API. Also track down out of tree drivers to see other interesting uses. And that was capped at 64bits :/ Thanks, John diff --git a/include/net/act_api.h b/include/net/act_api.h index 1d71644..f299ed3 100644 --- a/include/net/act_api.h +++ b/include/net/act_api.h @@ -41,6 +41,7 @@ struct tc_action { struct rcu_head tcfa_rcu; struct gnet_stats_basic_cpu __percpu *cpu_bstats; struct gnet_stats_queue __percpu *cpu_qstats; + u64 cookie; }; #define tcf_head common.tcfa_head #define tcf_index common.tcfa_index diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h index cb4bcdc..2e968ee 100644 --- a/include/uapi/linux/pkt_cls.h +++ b/include/uapi/linux/pkt_cls.h @@ -67,6 +67,7 @@ enum { TCA_ACT_INDEX, TCA_ACT_STATS, TCA_ACT_PAD, + TCA_ACT_COOKIE, __TCA_ACT_MAX }; diff --git a/net/sched/act_api.c b/net/sched/act_api.c index 2095c83..97eae6b 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -26,6 +26,7 @@ #include #include #include +#include static void free_tcf(struct rcu_head *head) { @@ -467,17 +468,21 @@ int tcf_action_destroy(struct list_head *actions, int bind) return a->ops->dump(skb, a, bind, ref); } -int -tcf_action_dump_1(struct sk_buff *skb, struct tc_action *a, int bind, int ref) +int tcf_action_dump_1(struct sk_buff *skb, struct tc_action *a, int bind, + int ref) { int err = -EINVAL; unsigned char *b = skb_tail_pointer(skb); struct nlattr *nest; + u64 cookie = a->cookie; if (nla_put_string(skb, TCA_KIND, a->ops->kind)) goto nla_put_failure; if (tcf_action_copy_stats(skb, a, 0)) goto nla_put_failure; + if (nla_put_u64_64bit(skb, TCA_ACT_COOKIE, cookie, TCA_ACT_PAD)) + goto nla_put_failure; + nest = nla_nest_start(skb, TCA_OPTIONS); if (nest == NULL) goto nla_put_failure; @@ -578,6 +583,11 @@ struct tc_action *tcf_action_init_1(struct net *net, struct nlattr *nla, if (err < 0) goto err_mod; + if (tb[TCA_ACT_COOKIE]) + a->cookie = nla_get_u64(tb[TCA_ACT_COOKIE]); + else + a->cookie = 0; /* kernel uses 0 */ + /* module cou
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
On 17-01-02 06:59 AM, Jamal Hadi Salim wrote: > > We have been using a cookie as well for actions (which we have been > using but have been too lazy to submit so far). I am going to port > it over to the newer kernels and post it. > In our case that is intended to be opaque to the kernel i.e kernel > never inteprets it; in that case it is similar to the kernel > FIB protocol field. > > In your case - could this cookie have been a class/flowid > (a 32 bit)? > And would it not make more sense for it the cookie to be > generic to all classifiers? i.e why is it specific to flower? > > cheers, > jamal > > On 17-01-02 08:13 AM, Paul Blakey wrote: >> This is to support saving extra data that might be helpful on retrieval. >> First use case is upcoming openvswitch flow offloads, extra data will >> include UFID and port mappings for each added flow. >> >> Signed-off-by: Paul Blakey >> Reviewed-by: Roi Dayan >> Acked-by: Jiri Pirko >> --- Additionally I would like to point out this is an arbitrary length binary blob (for undefined use, without even a specified encoding) that gets pushed between user space and hardware ;) This seemed to get folks fairly excited in the past. Some questions, exactly what do you mean by "port mappings" above? In general the 'tc' API uses the netdev the netlink msg is processed on as the port mapping. If you mean OVS port to netdev port I think this is a OVS problem and nothing to do with 'tc'. For what its worth there is an existing problem with 'tc' where rules only apply to a single ingress or egress port which is limiting on hardware. The UFID in my ovs code base is defined as best I can tell here, [OVS_FLOW_ATTR_UFID] = { .type = NL_A_UNSPEC, .optional = true, .min_len = sizeof(ovs_u128) }, So you need 128 bits if you want a 1:1 mapping onto 'tc'. So rather than an arbitrary blob why not make the case that 'tc' ids need to be 128 bits long? Even if its just initially done in flower call it flower_flow_id and define it so its not opaque and at least at the code level it isn't an arbitrary blob of data. And what are the "next" uses of this besides OVS. It would be really valuable to see how this generalizes to other usage models. To avoid embedding OVS syntax into 'tc'. Finally if you want to see an example of binary data encodings look at how drivers/hardware/users are currently using the user defined bits in ethtools ntuple API. Also track down out of tree drivers to see other interesting uses. And that was capped at 64bits :/ Thanks, John
Re: [PATCH net-next] net/sched: cls_flower: Add user specified data
We have been using a cookie as well for actions (which we have been using but have been too lazy to submit so far). I am going to port it over to the newer kernels and post it. In our case that is intended to be opaque to the kernel i.e kernel never inteprets it; in that case it is similar to the kernel FIB protocol field. In your case - could this cookie have been a class/flowid (a 32 bit)? And would it not make more sense for it the cookie to be generic to all classifiers? i.e why is it specific to flower? cheers, jamal On 17-01-02 08:13 AM, Paul Blakey wrote: This is to support saving extra data that might be helpful on retrieval. First use case is upcoming openvswitch flow offloads, extra data will include UFID and port mappings for each added flow. Signed-off-by: Paul Blakey Reviewed-by: Roi Dayan Acked-by: Jiri Pirko --- include/uapi/linux/pkt_cls.h | 3 +++ net/sched/cls_flower.c | 22 +- 2 files changed, 24 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h index cb4bcdc..ca9bbe3 100644 --- a/include/uapi/linux/pkt_cls.h +++ b/include/uapi/linux/pkt_cls.h @@ -471,10 +471,13 @@ enum { TCA_FLOWER_KEY_ICMPV6_TYPE, /* u8 */ TCA_FLOWER_KEY_ICMPV6_TYPE_MASK,/* u8 */ + TCA_FLOWER_COOKIE, /* binary */ + __TCA_FLOWER_MAX, }; #define TCA_FLOWER_MAX (__TCA_FLOWER_MAX - 1) +#define FLOWER_MAX_COOKIE_SIZE 128 enum { TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT = (1 << 0), diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c index 333f8e2..e2f5b25 100644 --- a/net/sched/cls_flower.c +++ b/net/sched/cls_flower.c @@ -85,6 +85,8 @@ struct cls_fl_filter { struct rcu_head rcu; struct tc_to_netdev tc; struct net_device *hw_dev; + size_t cookie_len; + long cookie[0]; }; static unsigned short int fl_mask_range(const struct fl_flow_mask *mask) @@ -794,6 +796,9 @@ static int fl_change(struct net *net, struct sk_buff *in_skb, struct cls_fl_filter *fnew; struct nlattr *tb[TCA_FLOWER_MAX + 1]; struct fl_flow_mask mask = {}; + const struct nlattr *attr; + size_t cookie_len = 0; + void *cookie; int err; if (!tca[TCA_OPTIONS]) @@ -806,10 +811,22 @@ static int fl_change(struct net *net, struct sk_buff *in_skb, if (fold && handle && fold->handle != handle) return -EINVAL; - fnew = kzalloc(sizeof(*fnew), GFP_KERNEL); + if (tb[TCA_FLOWER_COOKIE]) { + attr = tb[TCA_FLOWER_COOKIE]; + cookie_len = nla_len(attr); + cookie = nla_data(attr); + if (cookie_len > FLOWER_MAX_COOKIE_SIZE) + return -EINVAL; + } + + fnew = kzalloc(sizeof(*fnew) + cookie_len, GFP_KERNEL); if (!fnew) return -ENOBUFS; + fnew->cookie_len = cookie_len; + if (cookie_len) + memcpy(fnew->cookie, cookie, cookie_len); + err = tcf_exts_init(&fnew->exts, TCA_FLOWER_ACT, 0); if (err < 0) goto errout; @@ -1151,6 +1168,9 @@ static int fl_dump(struct net *net, struct tcf_proto *tp, unsigned long fh, nla_put_u32(skb, TCA_FLOWER_FLAGS, f->flags); + if (f->cookie_len) + nla_put(skb, TCA_FLOWER_COOKIE, f->cookie_len, f->cookie); + if (tcf_exts_dump(skb, &f->exts)) goto nla_put_failure;
[PATCH net-next] net/sched: cls_flower: Add user specified data
This is to support saving extra data that might be helpful on retrieval. First use case is upcoming openvswitch flow offloads, extra data will include UFID and port mappings for each added flow. Signed-off-by: Paul Blakey Reviewed-by: Roi Dayan Acked-by: Jiri Pirko --- include/uapi/linux/pkt_cls.h | 3 +++ net/sched/cls_flower.c | 22 +- 2 files changed, 24 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h index cb4bcdc..ca9bbe3 100644 --- a/include/uapi/linux/pkt_cls.h +++ b/include/uapi/linux/pkt_cls.h @@ -471,10 +471,13 @@ enum { TCA_FLOWER_KEY_ICMPV6_TYPE, /* u8 */ TCA_FLOWER_KEY_ICMPV6_TYPE_MASK,/* u8 */ + TCA_FLOWER_COOKIE, /* binary */ + __TCA_FLOWER_MAX, }; #define TCA_FLOWER_MAX (__TCA_FLOWER_MAX - 1) +#define FLOWER_MAX_COOKIE_SIZE 128 enum { TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT = (1 << 0), diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c index 333f8e2..e2f5b25 100644 --- a/net/sched/cls_flower.c +++ b/net/sched/cls_flower.c @@ -85,6 +85,8 @@ struct cls_fl_filter { struct rcu_head rcu; struct tc_to_netdev tc; struct net_device *hw_dev; + size_t cookie_len; + long cookie[0]; }; static unsigned short int fl_mask_range(const struct fl_flow_mask *mask) @@ -794,6 +796,9 @@ static int fl_change(struct net *net, struct sk_buff *in_skb, struct cls_fl_filter *fnew; struct nlattr *tb[TCA_FLOWER_MAX + 1]; struct fl_flow_mask mask = {}; + const struct nlattr *attr; + size_t cookie_len = 0; + void *cookie; int err; if (!tca[TCA_OPTIONS]) @@ -806,10 +811,22 @@ static int fl_change(struct net *net, struct sk_buff *in_skb, if (fold && handle && fold->handle != handle) return -EINVAL; - fnew = kzalloc(sizeof(*fnew), GFP_KERNEL); + if (tb[TCA_FLOWER_COOKIE]) { + attr = tb[TCA_FLOWER_COOKIE]; + cookie_len = nla_len(attr); + cookie = nla_data(attr); + if (cookie_len > FLOWER_MAX_COOKIE_SIZE) + return -EINVAL; + } + + fnew = kzalloc(sizeof(*fnew) + cookie_len, GFP_KERNEL); if (!fnew) return -ENOBUFS; + fnew->cookie_len = cookie_len; + if (cookie_len) + memcpy(fnew->cookie, cookie, cookie_len); + err = tcf_exts_init(&fnew->exts, TCA_FLOWER_ACT, 0); if (err < 0) goto errout; @@ -1151,6 +1168,9 @@ static int fl_dump(struct net *net, struct tcf_proto *tp, unsigned long fh, nla_put_u32(skb, TCA_FLOWER_FLAGS, f->flags); + if (f->cookie_len) + nla_put(skb, TCA_FLOWER_COOKIE, f->cookie_len, f->cookie); + if (tcf_exts_dump(skb, &f->exts)) goto nla_put_failure; -- 1.8.3.1