Re: [net-next PATCH v3 3/8] net: sched: add cls_u32 offload hooks for netdevs
On 16-02-18 10:24 AM, John Fastabend wrote: On 16-02-18 04:14 AM, Jamal Hadi Salim wrote: On 16-02-17 06:07 PM, John Fastabend wrote: [...] IMO, it would be better at this early stage to enforce the correct behavior for future generations. To follow the netlink semantics which a lot of people are already trained to think in. Current netlink behavior is supposed to be: 1) NEW ==> "Create". Ambigous - could mean a)"create if it doesnt exist" or b) "fail if it exists otherwise create" Unfortunately different parts of the kernel often assume some default from either #a or #b. But this is already handled by the core cls_api.c code. We never get to u32_change if the flags are not correct. Look at the block right above the op call into the classifiers change() code in cls_api.c. Starting at line 287. Indeed that would cover s/ware filters but not h/ware. That will depend on what hardware can do. Really, all you need is to propagate those flags. Your driver ndo can ignore them. I know i will need them. Alternatively: If we say all filters are going to be stored in s/ware as well i.e the tri-state we talked about then the cls_api checks will work as well. But when you are talking millions of filters (such as i deal with) - that may become impractical (and then you are going to have all kind of clever things to find whether an EXCLUSIVE will work or not etc). cheers, jamal
Re: [net-next PATCH v3 3/8] net: sched: add cls_u32 offload hooks for netdevs
On Thu, Feb 18, 2016 at 11:23 AM, Amir Vadai" wrote: > On Wed, Feb 17, 2016 at 03:07:23PM -0800, John Fastabend wrote: >> Actually thinking about this a bit more I wrote this thinking >> that there existed some hardware that actually cared if it was >> a new rule or an existing rule. For me it doesn't matter I do >> the same thing in the new/replace cases I just write into the >> slot on the hardware table and if it happens to have something >> in it well its overwritten e.g. "replaced". This works because >> the cls_u32 layer protects us from doing something unexpected. >> I'm wondering (mostly asking the mlx folks) is there hardware >> out there that cares to make this distinction between new and >> replace? Otherwise I can just drop new and always use replace. >> Or vice versa which is the case in its current form. > I don't see a need for such a distinction in mlx hardware. If you (say) have the same match and different action for a given rule we have HW API to modify existing steering entry under which we would end up calling the FW once instead of twice (delete old, add new). Or.
Re: [net-next PATCH v3 3/8] net: sched: add cls_u32 offload hooks for netdevs
On Thu, Feb 18, 2016 at 11:23:35AM +0200, Amir Vadai" wrote: > On Wed, Feb 17, 2016 at 03:07:23PM -0800, John Fastabend wrote: > > [...] > > > > >> > > >>> +static void u32_replace_hw_hnode(struct tcf_proto *tp, struct > > >>> tc_u_hnode *h) > > >>> +{ > > >>> +struct net_device *dev = tp->q->dev_queue->dev; > > >>> +struct tc_cls_u32_offload u32_offload = {0}; > > >>> +struct tc_to_netdev offload; > > >>> + > > >>> +offload.type = TC_SETUP_CLSU32; > > >>> +offload.cls_u32 = &u32_offload; > > >>> + > > >>> +if (dev->netdev_ops->ndo_setup_tc) { > > >>> +offload.cls_u32->command = TC_CLSU32_NEW_HNODE; > > >> > > >> TC_CLSU32_REPLACE_HNODE? > > >> > > > > > > Yep I made this change and will send out v4. > > > > > > [...] > > > > > >> > > > > Actually thinking about this a bit more I wrote this thinking > > that there existed some hardware that actually cared if it was > > a new rule or an existing rule. For me it doesn't matter I do > > the same thing in the new/replace cases I just write into the > > slot on the hardware table and if it happens to have something > > in it well its overwritten e.g. "replaced". This works because > > the cls_u32 layer protects us from doing something unexpected. > > > > I'm wondering (mostly asking the mlx folks) is there hardware > > out there that cares to make this distinction between new and > > replace? Otherwise I can just drop new and always use replace. > > Or vice versa which is the case in its current form. > I don't see a need for such a distinction in mlx hardware. FWIW, I think it is unlikely such a distinction would be needed for Netronome hardware.
Re: [net-next PATCH v3 3/8] net: sched: add cls_u32 offload hooks for netdevs
On 16-02-18 04:14 AM, Jamal Hadi Salim wrote: > On 16-02-17 06:07 PM, John Fastabend wrote: >> [...] >> > >> Actually thinking about this a bit more I wrote this thinking >> that there existed some hardware that actually cared if it was >> a new rule or an existing rule. For me it doesn't matter I do >> the same thing in the new/replace cases I just write into the >> slot on the hardware table and if it happens to have something >> in it well its overwritten e.g. "replaced". This works because >> the cls_u32 layer protects us from doing something unexpected. >> > > You are describing create-or-update which is a reasonable default > BUT: counting on the user to specify the htid+bktid+nodeid > for every filter and knowing what that means is prone to mistakes > when for example (using your big hammer approach right now) they > dont specify the handle and the kernel creates one for them. > > IMO, it would be better at this early stage to enforce the correct > behavior for future generations. > To follow the netlink semantics which a lot of people are already > trained to think in. > > Current netlink behavior is supposed to be: > > 1) NEW ==> "Create". > Ambigous - could mean a)"create if it doesnt exist" or b) "fail if it > exists otherwise create" > Unfortunately different parts of the kernel often assume some > default from either #a or #b. > But this is already handled by the core cls_api.c code. We never get to u32_change if the flags are not correct. Look at the block right above the op call into the classifiers change() code in cls_api.c. Starting at line 287. > 2) NEW|REPLACE flag ==> "Create if it doesnt exist and replace > if it exists" > > 3)NEW|EXCLUSIVE ==> "Create if it doesnt exist and fail if it > exists" > > 4)NEW|APPEND ==> "just fscking create; i dont care if it exists". > > IOW, just add the flag field which is intepreted from whatever > the user explicitly asks for. And reject say what the hardware > doesnt support. I think I agree with what your saying but I'm fairly sure this is working as you describe. > I have worked with tcams where we support #3. It is a bit inefficient > because you have to check if a rule exists first. And i have worked > in cases where #1 is assumed to mean #2 and at times #4. It is better > user experience to be explicit. Agreed, and I think it is :) > > cheers, > jamal
Re: [net-next PATCH v3 3/8] net: sched: add cls_u32 offload hooks for netdevs
On 16-02-17 06:07 PM, John Fastabend wrote: [...] Actually thinking about this a bit more I wrote this thinking that there existed some hardware that actually cared if it was a new rule or an existing rule. For me it doesn't matter I do the same thing in the new/replace cases I just write into the slot on the hardware table and if it happens to have something in it well its overwritten e.g. "replaced". This works because the cls_u32 layer protects us from doing something unexpected. You are describing create-or-update which is a reasonable default BUT: counting on the user to specify the htid+bktid+nodeid for every filter and knowing what that means is prone to mistakes when for example (using your big hammer approach right now) they dont specify the handle and the kernel creates one for them. IMO, it would be better at this early stage to enforce the correct behavior for future generations. To follow the netlink semantics which a lot of people are already trained to think in. Current netlink behavior is supposed to be: 1) NEW ==> "Create". Ambigous - could mean a)"create if it doesnt exist" or b) "fail if it exists otherwise create" Unfortunately different parts of the kernel often assume some default from either #a or #b. 2) NEW|REPLACE flag ==> "Create if it doesnt exist and replace if it exists" 3)NEW|EXCLUSIVE ==> "Create if it doesnt exist and fail if it exists" 4)NEW|APPEND ==> "just fscking create; i dont care if it exists". IOW, just add the flag field which is intepreted from whatever the user explicitly asks for. And reject say what the hardware doesnt support. I have worked with tcams where we support #3. It is a bit inefficient because you have to check if a rule exists first. And i have worked in cases where #1 is assumed to mean #2 and at times #4. It is better user experience to be explicit. cheers, jamal
Re: [net-next PATCH v3 3/8] net: sched: add cls_u32 offload hooks for netdevs
On Wed, Feb 17, 2016 at 03:07:23PM -0800, John Fastabend wrote: > [...] > > >> > >>> +static void u32_replace_hw_hnode(struct tcf_proto *tp, struct > >>> tc_u_hnode *h) > >>> +{ > >>> +struct net_device *dev = tp->q->dev_queue->dev; > >>> +struct tc_cls_u32_offload u32_offload = {0}; > >>> +struct tc_to_netdev offload; > >>> + > >>> +offload.type = TC_SETUP_CLSU32; > >>> +offload.cls_u32 = &u32_offload; > >>> + > >>> +if (dev->netdev_ops->ndo_setup_tc) { > >>> +offload.cls_u32->command = TC_CLSU32_NEW_HNODE; > >> > >> TC_CLSU32_REPLACE_HNODE? > >> > > > > Yep I made this change and will send out v4. > > > > [...] > > > >> > > Actually thinking about this a bit more I wrote this thinking > that there existed some hardware that actually cared if it was > a new rule or an existing rule. For me it doesn't matter I do > the same thing in the new/replace cases I just write into the > slot on the hardware table and if it happens to have something > in it well its overwritten e.g. "replaced". This works because > the cls_u32 layer protects us from doing something unexpected. > > I'm wondering (mostly asking the mlx folks) is there hardware > out there that cares to make this distinction between new and > replace? Otherwise I can just drop new and always use replace. > Or vice versa which is the case in its current form. I don't see a need for such a distinction in mlx hardware. Thanks, Amir. > > Thanks, > John >
Re: [net-next PATCH v3 3/8] net: sched: add cls_u32 offload hooks for netdevs
[...] >> >>> +static void u32_replace_hw_hnode(struct tcf_proto *tp, struct >>> tc_u_hnode *h) >>> +{ >>> +struct net_device *dev = tp->q->dev_queue->dev; >>> +struct tc_cls_u32_offload u32_offload = {0}; >>> +struct tc_to_netdev offload; >>> + >>> +offload.type = TC_SETUP_CLSU32; >>> +offload.cls_u32 = &u32_offload; >>> + >>> +if (dev->netdev_ops->ndo_setup_tc) { >>> +offload.cls_u32->command = TC_CLSU32_NEW_HNODE; >> >> TC_CLSU32_REPLACE_HNODE? >> > > Yep I made this change and will send out v4. > > [...] > >> Actually thinking about this a bit more I wrote this thinking that there existed some hardware that actually cared if it was a new rule or an existing rule. For me it doesn't matter I do the same thing in the new/replace cases I just write into the slot on the hardware table and if it happens to have something in it well its overwritten e.g. "replaced". This works because the cls_u32 layer protects us from doing something unexpected. I'm wondering (mostly asking the mlx folks) is there hardware out there that cares to make this distinction between new and replace? Otherwise I can just drop new and always use replace. Or vice versa which is the case in its current form. Thanks, John
Re: [net-next PATCH v3 3/8] net: sched: add cls_u32 offload hooks for netdevs
On 16-02-17 02:59 AM, Jamal Hadi Salim wrote: > On 16-02-17 12:17 AM, John Fastabend wrote: >> This patch allows netdev drivers to consume cls_u32 offloads via >> the ndo_setup_tc ndo op. >> >> This works aligns with how network drivers have been doing qdisc >> offloads for mqprio. >> > > This one i have comments on. > >> Signed-off-by: John Fastabend >> --- >> include/linux/netdevice.h |6 ++- >> include/net/pkt_cls.h | 34 +++ >> net/sched/cls_u32.c | 99 >> - [...] >> #endif /* CONFIG_NET_CLS_IND */ >> >> +struct tc_cls_u32_knode { >> +struct tcf_exts *exts; >> +u8 fshift; >> +u32 handle; >> +u32 val; >> +u32 mask; >> +u32 link_handle; >> +struct tc_u32_sel *sel; >> +}; >> > > Swapping sel and fshift would give better struct alignment. > Makes sense I went ahead and did this. >> +struct tc_cls_u32_hnode { >> +u32 handle; >> +u32 prio; >> +unsigned int divisor; >> +}; > > > Assuming in the future "prio" would be moved to something that is more > generic classifier specific? > Sure in the future it can be moved up into a generic struct if it becomes useful there. >> +enum tc_clsu32_command { >> +TC_CLSU32_NEW_KNODE, >> +TC_CLSU32_REPLACE_KNODE, >> +TC_CLSU32_DELETE_KNODE, >> +TC_CLSU32_NEW_HNODE, >> +TC_CLSU32_REPLACE_HNODE, >> +TC_CLSU32_DELETE_HNODE, >> +}; >> + > > It seems to me commands should be generic which speak > Netlinkism. A REPLACE is just a flag to NEW. You dont need > a NEW_XXX for every object. switchdev got this right. > If you use cmd + flags then you can have all kinds of > netlink semantics that relay user intent from user space. Example: > Exclusivity where user says "create if it doesnt exist but dont replace > if it does". > At minimal add "flags" there. > Maybe not this release - but it makes sense to move "command" into > tc_to_netdev; a u16 cmd + u16 flags. > Yep next set of patches add the specific hw/sw/both semantics and specific error handling strategies. For this series we just get the simplest one. >> +struct tc_cls_u32_offload { >> +/* knode values */ >> +enum tc_clsu32_command command; >> +union { >> +struct tc_cls_u32_knode knode; >> +struct tc_cls_u32_hnode hnode; >> +}; >> +}; >> + >> #endif >> diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c >> index 4fbb674..d54bc94 100644 > >> +static void u32_replace_hw_hnode(struct tcf_proto *tp, struct >> tc_u_hnode *h) >> +{ >> +struct net_device *dev = tp->q->dev_queue->dev; >> +struct tc_cls_u32_offload u32_offload = {0}; >> +struct tc_to_netdev offload; >> + >> +offload.type = TC_SETUP_CLSU32; >> +offload.cls_u32 = &u32_offload; >> + >> +if (dev->netdev_ops->ndo_setup_tc) { >> +offload.cls_u32->command = TC_CLSU32_NEW_HNODE; > > TC_CLSU32_REPLACE_HNODE? > Yep I made this change and will send out v4. [...] > > > You are unconditionally calling the _hw_ api. For someone not using _hw_ > offloads, there are a few more instructions. Maybe just do the > dev->netdev_ops->ndo_setup_tc first? > My simple add/rem stress test didn't seem to make any difference. I think there is so much other stuff going on here this is in the noise. I'll take a pass at optimizing this later for all cases not just the hw loading ones. > > And to Or's point: How do i distinguish s/w from h/w? Next series handles this for now just enable the simplest case. > > cheers, > jamal >
Re: [net-next PATCH v3 3/8] net: sched: add cls_u32 offload hooks for netdevs
On 16-02-17 12:17 AM, John Fastabend wrote: This patch allows netdev drivers to consume cls_u32 offloads via the ndo_setup_tc ndo op. This works aligns with how network drivers have been doing qdisc offloads for mqprio. This one i have comments on. Signed-off-by: John Fastabend --- include/linux/netdevice.h |6 ++- include/net/pkt_cls.h | 34 +++ net/sched/cls_u32.c | 99 - 3 files changed, 136 insertions(+), 3 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index e396060..47671ce0 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -779,17 +779,21 @@ static inline bool netdev_phys_item_id_same(struct netdev_phys_item_id *a, typedef u16 (*select_queue_fallback_t)(struct net_device *dev, struct sk_buff *skb); -/* This structure holds attributes of qdisc and classifiers +/* These structures hold the attributes of qdisc and classifiers * that are being passed to the netdevice through the setup_tc op. */ enum { TC_SETUP_MQPRIO, + TC_SETUP_CLSU32, }; +struct tc_cls_u32_offload; + struct tc_to_netdev { unsigned int type; union { u8 tc; + struct tc_cls_u32_offload *cls_u32; }; }; > diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h index bc49967..59789ca 100644 --- a/include/net/pkt_cls.h +++ b/include/net/pkt_cls.h @@ -358,4 +358,38 @@ tcf_match_indev(struct sk_buff *skb, int ifindex) } #endif /* CONFIG_NET_CLS_IND */ +struct tc_cls_u32_knode { + struct tcf_exts *exts; + u8 fshift; + u32 handle; + u32 val; + u32 mask; + u32 link_handle; + struct tc_u32_sel *sel; +}; Swapping sel and fshift would give better struct alignment. +struct tc_cls_u32_hnode { + u32 handle; + u32 prio; + unsigned int divisor; +}; Assuming in the future "prio" would be moved to something that is more generic classifier specific? +enum tc_clsu32_command { + TC_CLSU32_NEW_KNODE, + TC_CLSU32_REPLACE_KNODE, + TC_CLSU32_DELETE_KNODE, + TC_CLSU32_NEW_HNODE, + TC_CLSU32_REPLACE_HNODE, + TC_CLSU32_DELETE_HNODE, +}; + It seems to me commands should be generic which speak Netlinkism. A REPLACE is just a flag to NEW. You dont need a NEW_XXX for every object. switchdev got this right. If you use cmd + flags then you can have all kinds of netlink semantics that relay user intent from user space. Example: Exclusivity where user says "create if it doesnt exist but dont replace if it does". At minimal add "flags" there. Maybe not this release - but it makes sense to move "command" into tc_to_netdev; a u16 cmd + u16 flags. +struct tc_cls_u32_offload { + /* knode values */ + enum tc_clsu32_command command; + union { + struct tc_cls_u32_knode knode; + struct tc_cls_u32_hnode hnode; + }; +}; + #endif diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c index 4fbb674..d54bc94 100644 +static void u32_replace_hw_hnode(struct tcf_proto *tp, struct tc_u_hnode *h) +{ + struct net_device *dev = tp->q->dev_queue->dev; + struct tc_cls_u32_offload u32_offload = {0}; + struct tc_to_netdev offload; + + offload.type = TC_SETUP_CLSU32; + offload.cls_u32 = &u32_offload; + + if (dev->netdev_ops->ndo_setup_tc) { + offload.cls_u32->command = TC_CLSU32_NEW_HNODE; TC_CLSU32_REPLACE_HNODE? + offload.cls_u32->hnode.divisor = h->divisor; + offload.cls_u32->hnode.handle = h->handle; + offload.cls_u32->hnode.prio = h->prio; + + dev->netdev_ops->ndo_setup_tc(dev, tp->q->handle, + tp->protocol, &offload); + } +} static void u32_clear_hnode(struct tcf_proto *tp, struct tc_u_hnode *ht) { struct tc_u_knode *n; @@ -434,6 +522,7 @@ static void u32_clear_hnode(struct tcf_proto *tp, struct tc_u_hnode *ht) RCU_INIT_POINTER(ht->ht[h], rtnl_dereference(n->next)); tcf_unbind_filter(tp, &n->res); + u32_remove_hw_knode(tp, n->handle); call_rcu(&n->rcu, u32_delete_key_freepf_rcu); } } @@ -454,6 +543,7 @@ static int u32_destroy_hnode(struct tcf_proto *tp, struct tc_u_hnode *ht) phn; hn = &phn->next, phn = rtnl_dereference(*hn)) { if (phn == ht) { + u32_clear_hw_hnode(tp, ht); RCU_INIT_POINTER(*hn, ht->next); kfree_rcu(ht, rcu); return 0; @@ -540,8 +630,10 @@ static int u32_delete(struct tcf_proto *tp, unsigned long arg) if (ht == NULL) return 0; - if (TC_U
Re: [net-next PATCH v3 3/8] net: sched: add cls_u32 offload hooks for netdevs
Wed, Feb 17, 2016 at 06:17:09AM CET, john.fastab...@gmail.com wrote: >This patch allows netdev drivers to consume cls_u32 offloads via >the ndo_setup_tc ndo op. > >This works aligns with how network drivers have been doing qdisc >offloads for mqprio. > >Signed-off-by: John Fastabend Acked-by: Jiri Pirko