Re: [patch net-next 00/34] net: sched: allow qdiscs to share filter block instances

2017-10-13 Thread Jiri Pirko
Fri, Oct 13, 2017 at 04:20:38PM CEST, dsah...@gmail.com wrote:
>On 10/13/17 12:26 AM, Jiri Pirko wrote:
>> Thu, Oct 12, 2017 at 11:37:30PM CEST, dsah...@gmail.com wrote:
>>> On 10/12/17 11:17 AM, Jiri Pirko wrote:
 So back to the example. First, we create 2 qdiscs. Both will share
 block number 22. "22" is just an identification. If we don't pass any
 block number, a new one will be generated by kernel:

 $ tc qdisc add dev ens7 ingress block 22
 
 $ tc qdisc add dev ens8 ingress block 22
 

 Now if we list the qdiscs, we will see the block index in the output:

 $ tc qdisc
 qdisc ingress : dev ens7 parent :fff1 block 22 
 qdisc ingress : dev ens8 parent :fff1 block 22 

 Now we can add filter to any of qdiscs sharing the same block:

 $ tc filter add dev ens7 parent : protocol ip pref 25 flower dst_ip 
 192.168.0.0/16 action drop


 We will see the same output if we list filters for ens7 and ens8, 
 including stats:

 $ tc -s filter show dev ens7 ingress
 filter protocol ip pref 25 flower chain 0 
 filter protocol ip pref 25 flower chain 0 handle 0x1 
   eth_type ipv4
   dst_ip 192.168.0.0/16
   not_in_hw
 action order 1: gact action drop
  random type none pass val 0
  index 1 ref 1 bind 1 installed 39 sec used 2 sec
 Action statistics:
 Sent 3108 bytes 37 pkt (dropped 37, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 

 $ tc -s filter show dev ens8 ingress
 filter protocol ip pref 25 flower chain 0 
 filter protocol ip pref 25 flower chain 0 handle 0x1 
   eth_type ipv4
   dst_ip 192.168.0.0/16
   not_in_hw
 action order 1: gact action drop
  random type none pass val 0
  index 1 ref 1 bind 1 installed 40 sec used 3 sec
 Action statistics:
 Sent 3108 bytes 37 pkt (dropped 37, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
>>>
>>> This seems like really odd semantics to me ... a filter added to one
>>> device shows up on another.
>> 
>> Why is it odd? They share the same block, so it is natural that rule
>> added to one shows in list of rules for all devices that share the same
>> block.
>> 
>> 
>>>
>>> Why not make the shared block a standalone object that is configured
>>> through its own set of commands and then referenced by both devices?
>> 
>> I was thinking about that for a long time. That would require entirely
>> new set of netlink api and internal kernel handling just for this. Lots
>> of duplications. The reason is, the current API is strictly build around
>> ifindex. But the new API would not solve anything. As a user, I still
>> want so see shared rules in individial device listing, because they
>> would get processed for the device. So I believe that the proposed
>> behaviour is correct.
>> 
>
>netconf has NETCONFA_IFINDEX_ALL to keep the device concept but to relay
>information that applies to more than 1 device. You could have something
>similar for tc and shared blocks. Admin is done on this device index
>(e.g., your shared block 22 becomes dev index -22) and the filters are
>attached to another device for sharing using the 'qdisc add' command above.

It can be extended like this I guess. But still, the original rule adding
has to work.



Re: [patch net-next 00/34] net: sched: allow qdiscs to share filter block instances

2017-10-13 Thread David Ahern
On 10/13/17 12:26 AM, Jiri Pirko wrote:
> Thu, Oct 12, 2017 at 11:37:30PM CEST, dsah...@gmail.com wrote:
>> On 10/12/17 11:17 AM, Jiri Pirko wrote:
>>> So back to the example. First, we create 2 qdiscs. Both will share
>>> block number 22. "22" is just an identification. If we don't pass any
>>> block number, a new one will be generated by kernel:
>>>
>>> $ tc qdisc add dev ens7 ingress block 22
>>> 
>>> $ tc qdisc add dev ens8 ingress block 22
>>> 
>>>
>>> Now if we list the qdiscs, we will see the block index in the output:
>>>
>>> $ tc qdisc
>>> qdisc ingress : dev ens7 parent :fff1 block 22 
>>> qdisc ingress : dev ens8 parent :fff1 block 22 
>>>
>>> Now we can add filter to any of qdiscs sharing the same block:
>>>
>>> $ tc filter add dev ens7 parent : protocol ip pref 25 flower dst_ip 
>>> 192.168.0.0/16 action drop
>>>
>>>
>>> We will see the same output if we list filters for ens7 and ens8, including 
>>> stats:
>>>
>>> $ tc -s filter show dev ens7 ingress
>>> filter protocol ip pref 25 flower chain 0 
>>> filter protocol ip pref 25 flower chain 0 handle 0x1 
>>>   eth_type ipv4
>>>   dst_ip 192.168.0.0/16
>>>   not_in_hw
>>> action order 1: gact action drop
>>>  random type none pass val 0
>>>  index 1 ref 1 bind 1 installed 39 sec used 2 sec
>>> Action statistics:
>>> Sent 3108 bytes 37 pkt (dropped 37, overlimits 0 requeues 0) 
>>> backlog 0b 0p requeues 0 
>>>
>>> $ tc -s filter show dev ens8 ingress
>>> filter protocol ip pref 25 flower chain 0 
>>> filter protocol ip pref 25 flower chain 0 handle 0x1 
>>>   eth_type ipv4
>>>   dst_ip 192.168.0.0/16
>>>   not_in_hw
>>> action order 1: gact action drop
>>>  random type none pass val 0
>>>  index 1 ref 1 bind 1 installed 40 sec used 3 sec
>>> Action statistics:
>>> Sent 3108 bytes 37 pkt (dropped 37, overlimits 0 requeues 0) 
>>> backlog 0b 0p requeues 0
>>
>> This seems like really odd semantics to me ... a filter added to one
>> device shows up on another.
> 
> Why is it odd? They share the same block, so it is natural that rule
> added to one shows in list of rules for all devices that share the same
> block.
> 
> 
>>
>> Why not make the shared block a standalone object that is configured
>> through its own set of commands and then referenced by both devices?
> 
> I was thinking about that for a long time. That would require entirely
> new set of netlink api and internal kernel handling just for this. Lots
> of duplications. The reason is, the current API is strictly build around
> ifindex. But the new API would not solve anything. As a user, I still
> want so see shared rules in individial device listing, because they
> would get processed for the device. So I believe that the proposed
> behaviour is correct.
> 

netconf has NETCONFA_IFINDEX_ALL to keep the device concept but to relay
information that applies to more than 1 device. You could have something
similar for tc and shared blocks. Admin is done on this device index
(e.g., your shared block 22 becomes dev index -22) and the filters are
attached to another device for sharing using the 'qdisc add' command above.


Re: [patch net-next 00/34] net: sched: allow qdiscs to share filter block instances

2017-10-13 Thread Jiri Pirko
Fri, Oct 13, 2017 at 08:31:53AM CEST, da...@davemloft.net wrote:
>From: Jiri Pirko 
>Date: Fri, 13 Oct 2017 08:21:01 +0200
>
>> Thu, Oct 12, 2017 at 07:21:48PM CEST, da...@davemloft.net wrote:
>>>
>>>Jiri I'm not looking at a 34 patch set, it's too large.
>>>
>>>Break this up into groups of a dozen or so patches each, no
>>>more.  Submit them one at a time and wait for each series
>>>to be fully reviewed and integrated before moving onto the
>>>next one.
>> 
>> Yeah. As I stated in the beginning of the cover letter, I did not find a
>> way to do it. I could split into 2 of 3 patchsets, problem is that I
>> would introduce interfaces in first patchset that would be only used in
>> patchset 2 or 3. I believe that is not ok. Do you think that I can do it
>> like this this time?
>
>Jiri, please try harder.

Hmm. Okay. The reviewer would miss the big picture. But as you wish :)


Re: [patch net-next 00/34] net: sched: allow qdiscs to share filter block instances

2017-10-13 Thread David Miller
From: Jiri Pirko 
Date: Fri, 13 Oct 2017 08:21:01 +0200

> Thu, Oct 12, 2017 at 07:21:48PM CEST, da...@davemloft.net wrote:
>>
>>Jiri I'm not looking at a 34 patch set, it's too large.
>>
>>Break this up into groups of a dozen or so patches each, no
>>more.  Submit them one at a time and wait for each series
>>to be fully reviewed and integrated before moving onto the
>>next one.
> 
> Yeah. As I stated in the beginning of the cover letter, I did not find a
> way to do it. I could split into 2 of 3 patchsets, problem is that I
> would introduce interfaces in first patchset that would be only used in
> patchset 2 or 3. I believe that is not ok. Do you think that I can do it
> like this this time?

Jiri, please try harder.

Thank you.



Re: [patch net-next 00/34] net: sched: allow qdiscs to share filter block instances

2017-10-13 Thread Jiri Pirko
Thu, Oct 12, 2017 at 11:37:30PM CEST, dsah...@gmail.com wrote:
>On 10/12/17 11:17 AM, Jiri Pirko wrote:
>> So back to the example. First, we create 2 qdiscs. Both will share
>> block number 22. "22" is just an identification. If we don't pass any
>> block number, a new one will be generated by kernel:
>> 
>> $ tc qdisc add dev ens7 ingress block 22
>> 
>> $ tc qdisc add dev ens8 ingress block 22
>> 
>> 
>> Now if we list the qdiscs, we will see the block index in the output:
>> 
>> $ tc qdisc
>> qdisc ingress : dev ens7 parent :fff1 block 22 
>> qdisc ingress : dev ens8 parent :fff1 block 22 
>> 
>> Now we can add filter to any of qdiscs sharing the same block:
>> 
>> $ tc filter add dev ens7 parent : protocol ip pref 25 flower dst_ip 
>> 192.168.0.0/16 action drop
>> 
>> 
>> We will see the same output if we list filters for ens7 and ens8, including 
>> stats:
>> 
>> $ tc -s filter show dev ens7 ingress
>> filter protocol ip pref 25 flower chain 0 
>> filter protocol ip pref 25 flower chain 0 handle 0x1 
>>   eth_type ipv4
>>   dst_ip 192.168.0.0/16
>>   not_in_hw
>> action order 1: gact action drop
>>  random type none pass val 0
>>  index 1 ref 1 bind 1 installed 39 sec used 2 sec
>> Action statistics:
>> Sent 3108 bytes 37 pkt (dropped 37, overlimits 0 requeues 0) 
>> backlog 0b 0p requeues 0 
>> 
>> $ tc -s filter show dev ens8 ingress
>> filter protocol ip pref 25 flower chain 0 
>> filter protocol ip pref 25 flower chain 0 handle 0x1 
>>   eth_type ipv4
>>   dst_ip 192.168.0.0/16
>>   not_in_hw
>> action order 1: gact action drop
>>  random type none pass val 0
>>  index 1 ref 1 bind 1 installed 40 sec used 3 sec
>> Action statistics:
>> Sent 3108 bytes 37 pkt (dropped 37, overlimits 0 requeues 0) 
>> backlog 0b 0p requeues 0
>
>This seems like really odd semantics to me ... a filter added to one
>device shows up on another.

Why is it odd? They share the same block, so it is natural that rule
added to one shows in list of rules for all devices that share the same
block.


>
>Why not make the shared block a standalone object that is configured
>through its own set of commands and then referenced by both devices?

I was thinking about that for a long time. That would require entirely
new set of netlink api and internal kernel handling just for this. Lots
of duplications. The reason is, the current API is strictly build around
ifindex. But the new API would not solve anything. As a user, I still
want so see shared rules in individial device listing, because they
would get processed for the device. So I believe that the proposed
behaviour is correct.



Re: [patch net-next 00/34] net: sched: allow qdiscs to share filter block instances

2017-10-13 Thread Jiri Pirko
Thu, Oct 12, 2017 at 07:21:48PM CEST, da...@davemloft.net wrote:
>
>Jiri I'm not looking at a 34 patch set, it's too large.
>
>Break this up into groups of a dozen or so patches each, no
>more.  Submit them one at a time and wait for each series
>to be fully reviewed and integrated before moving onto the
>next one.

Yeah. As I stated in the beginning of the cover letter, I did not find a
way to do it. I could split into 2 of 3 patchsets, problem is that I
would introduce interfaces in first patchset that would be only used in
patchset 2 or 3. I believe that is not ok. Do you think that I can do it
like this this time?

Thanks


Re: [patch net-next 00/34] net: sched: allow qdiscs to share filter block instances

2017-10-12 Thread David Ahern
On 10/12/17 11:17 AM, Jiri Pirko wrote:
> So back to the example. First, we create 2 qdiscs. Both will share
> block number 22. "22" is just an identification. If we don't pass any
> block number, a new one will be generated by kernel:
> 
> $ tc qdisc add dev ens7 ingress block 22
> 
> $ tc qdisc add dev ens8 ingress block 22
> 
> 
> Now if we list the qdiscs, we will see the block index in the output:
> 
> $ tc qdisc
> qdisc ingress : dev ens7 parent :fff1 block 22 
> qdisc ingress : dev ens8 parent :fff1 block 22 
> 
> Now we can add filter to any of qdiscs sharing the same block:
> 
> $ tc filter add dev ens7 parent : protocol ip pref 25 flower dst_ip 
> 192.168.0.0/16 action drop
> 
> 
> We will see the same output if we list filters for ens7 and ens8, including 
> stats:
> 
> $ tc -s filter show dev ens7 ingress
> filter protocol ip pref 25 flower chain 0 
> filter protocol ip pref 25 flower chain 0 handle 0x1 
>   eth_type ipv4
>   dst_ip 192.168.0.0/16
>   not_in_hw
> action order 1: gact action drop
>  random type none pass val 0
>  index 1 ref 1 bind 1 installed 39 sec used 2 sec
> Action statistics:
> Sent 3108 bytes 37 pkt (dropped 37, overlimits 0 requeues 0) 
> backlog 0b 0p requeues 0 
> 
> $ tc -s filter show dev ens8 ingress
> filter protocol ip pref 25 flower chain 0 
> filter protocol ip pref 25 flower chain 0 handle 0x1 
>   eth_type ipv4
>   dst_ip 192.168.0.0/16
>   not_in_hw
> action order 1: gact action drop
>  random type none pass val 0
>  index 1 ref 1 bind 1 installed 40 sec used 3 sec
> Action statistics:
> Sent 3108 bytes 37 pkt (dropped 37, overlimits 0 requeues 0) 
> backlog 0b 0p requeues 0

This seems like really odd semantics to me ... a filter added to one
device shows up on another.

Why not make the shared block a standalone object that is configured
through its own set of commands and then referenced by both devices?


Re: [patch net-next 00/34] net: sched: allow qdiscs to share filter block instances

2017-10-12 Thread David Miller

Jiri I'm not looking at a 34 patch set, it's too large.

Break this up into groups of a dozen or so patches each, no
more.  Submit them one at a time and wait for each series
to be fully reviewed and integrated before moving onto the
next one.

Thanks.



[patch net-next 00/34] net: sched: allow qdiscs to share filter block instances

2017-10-12 Thread Jiri Pirko
From: Jiri Pirko 

First of all, I would like to apologize for big patchset. However after
couple of hours trying to figure out how to cut it, I found out it is
actually not possible. I would have to add some interface in one patchset
and only use it in second, which is forbidden. Also, I would like
to provide the reviewer the full picture. Most of the patches are small
and contained anyway, so it should be easy to review them. But to the
motivation:

Currently the filters added to qdiscs are independent. So for example if you
have 2 netdevices and you create ingress qdisc on both and you want to add
identical filter rules both, you need to add them twice. This patchset
makes this easier and mainly saves resources allowing to share all filters
within a qdisc - I call it a "filter block". Also this helps to save
resources when we do offload to hw for example to expensive TCAM.

So back to the example. First, we create 2 qdiscs. Both will share
block number 22. "22" is just an identification. If we don't pass any
block number, a new one will be generated by kernel:

$ tc qdisc add dev ens7 ingress block 22

$ tc qdisc add dev ens8 ingress block 22


Now if we list the qdiscs, we will see the block index in the output:

$ tc qdisc
qdisc ingress : dev ens7 parent :fff1 block 22 
qdisc ingress : dev ens8 parent :fff1 block 22 

Now we can add filter to any of qdiscs sharing the same block:

$ tc filter add dev ens7 parent : protocol ip pref 25 flower dst_ip 
192.168.0.0/16 action drop


We will see the same output if we list filters for ens7 and ens8, including 
stats:

$ tc -s filter show dev ens7 ingress
filter protocol ip pref 25 flower chain 0 
filter protocol ip pref 25 flower chain 0 handle 0x1 
  eth_type ipv4
  dst_ip 192.168.0.0/16
  not_in_hw
action order 1: gact action drop
 random type none pass val 0
 index 1 ref 1 bind 1 installed 39 sec used 2 sec
Action statistics:
Sent 3108 bytes 37 pkt (dropped 37, overlimits 0 requeues 0) 
backlog 0b 0p requeues 0 

$ tc -s filter show dev ens8 ingress
filter protocol ip pref 25 flower chain 0 
filter protocol ip pref 25 flower chain 0 handle 0x1 
  eth_type ipv4
  dst_ip 192.168.0.0/16
  not_in_hw
action order 1: gact action drop
 random type none pass val 0
 index 1 ref 1 bind 1 installed 40 sec used 3 sec
Action statistics:
Sent 3108 bytes 37 pkt (dropped 37, overlimits 0 requeues 0) 
backlog 0b 0p requeues 0


Patches overview:

Patches 1-3 introduce infrastructure for block sharing and the interface
   funtions to the qdisc, tcf_block_get_ext and tcf_block_put_ext
Patches 4-11 are removing usages of tp->q pointer, which needs to be
   eventually removed in order to set the tfc_proto independent
   on a qdisc instance
Patches 12-19 introduces block callbacks, internal infra and driver-facing
   interface, they add callback calling to individual classifiers
Patches 20-28 convert individual drivers from ndo_setup_tc to block
   callbacks for classifiers offloading
Patches 29-31 remove unused things due to the previous conversion
Patch 32 introduces block mechanism to handle netif_keep_dst calls
Patch 33 removes tp->q and tp->classid - makes tcf_proto independent on qdisc
Patch 34 finally enables block sharing for cls_ingress and cls_clsact


Iproute2 implementation is here:
https://github.com/jpirko/iproute2_mlxsw/commit/f91ff81e3b307adfe769f29b3c04c70f39a2520c

The next patchset will introduce block sharing for mlxsw. For the
curious ones the patches could be found here:
https://github.com/jpirko/linux_mlxsw/commits/jiri_devel_shblock

Jiri Pirko (34):
  net: sched: store Qdisc pointer in struct block
  net: sched: introduce support for multiple filter chain pointers
registration
  net: sched: introduce shared filter blocks infrastructure
  net: sched: teach tcf_bind/unbind_filter to use block->q
  net: sched: ematch: obtain net pointer from blocks
  net: core: use dev->ingress_queue instead of tp->q
  net: sched: cls_u32: use block instead of q in tc_u_common
  net: sched: avoid usage of tp->q in tcf_classify
  net: sched: tcindex, fw, flow: use tcf_block_q helper to get struct
Qdisc
  net: sched: use tcf_block_q helper to get q pointer for sch_tree_lock
  net: sched: propagate q and parent from caller down to tcf_fill_node
  net: sched: add block bind/unbind notification to drivers
  net: sched: introduce per-block callbacks
  net: sched: use extended variants of block get and put in ingress and
clsact qdiscs
  net: sched: use tc_setup_cb_call to call per-block callbacks
  net: sched: cls_matchall: call block callbacks for offload
  net: sched: cls_u32: swap u32_remove_hw_knode and u32_remove_hw_hnode
  net: sched: cls_u32: call block callbacks for offload
  net: sched: cls_bpf: call block callbacks for offload
  mlxsw: spectrum: