Re: [PATCH net] net_sched: act_mirred: full rcu conversion

2016-09-12 Thread John Fastabend
On 16-09-11 11:12 PM, Cong Wang wrote:
> On Fri, Sep 9, 2016 at 8:52 AM, John Fastabend  
> wrote:
>> On 16-09-08 10:26 PM, Cong Wang wrote:
>>> On Thu, Sep 8, 2016 at 8:51 AM, Eric Dumazet  wrote:
 On Thu, 2016-09-08 at 08:47 -0700, John Fastabend wrote:

> Works for me. FWIW I find this plenty straightforward and don't really
> see the need to make the hash table itself rcu friendly.
>
> Acked-by: John Fastabend 
>

 Yes, it seems this hash table is used in control path, with RTNL held
 anyway.
>>>
>>> Seriously? You never read hashtable in fast path?? I think you need
>>> to wake up.
>>>
>>
>> But the actions use refcnt'ing and should never be decremented to zero
>> as long as they can still be referenced by an active filter. If each
>> action handles its parameters like mirred/gact then I don't see why its
>> necessary.
> 
> This is correct, by "read" I meant "dereference", the tc actions
> are now permanently stored in hashtable directly, so "reading"
> a tc action is reading from hashtable.
> 
> Sorry if this wasn't clear.
> 

OK, but with the current code there is no need to protect the hash table
with an RCU semantics. The ref counting ensures the hash table entries
are always available and any 'replace' commands on actions are handled
internally in the action itself with rcu_assign_pointer replacing old
params with new params. The fast path readers will always read a
consistent set of parameters in this scheme.

So no need for an rcu hash table. The reference count though should
likely be atomic because not all increments/decrements are protected by
RTNL lock.

.John


Re: [PATCH net] net_sched: act_mirred: full rcu conversion

2016-09-12 Thread Cong Wang
On Fri, Sep 9, 2016 at 8:52 AM, John Fastabend  wrote:
> On 16-09-08 10:26 PM, Cong Wang wrote:
>> On Thu, Sep 8, 2016 at 8:51 AM, Eric Dumazet  wrote:
>>> On Thu, 2016-09-08 at 08:47 -0700, John Fastabend wrote:
>>>
 Works for me. FWIW I find this plenty straightforward and don't really
 see the need to make the hash table itself rcu friendly.

 Acked-by: John Fastabend 

>>>
>>> Yes, it seems this hash table is used in control path, with RTNL held
>>> anyway.
>>
>> Seriously? You never read hashtable in fast path?? I think you need
>> to wake up.
>>
>
> But the actions use refcnt'ing and should never be decremented to zero
> as long as they can still be referenced by an active filter. If each
> action handles its parameters like mirred/gact then I don't see why its
> necessary.

This is correct, by "read" I meant "dereference", the tc actions
are now permanently stored in hashtable directly, so "reading"
a tc action is reading from hashtable.

Sorry if this wasn't clear.


Re: [PATCH net] net_sched: act_mirred: full rcu conversion

2016-09-11 Thread Cong Wang
On Fri, Sep 9, 2016 at 5:23 AM, Eric Dumazet  wrote:
> On Thu, 2016-09-08 at 22:24 -0700, Cong Wang wrote:
>> On Thu, Sep 8, 2016 at 8:35 AM, Eric Dumazet  wrote:
>> > From: Eric Dumazet 
>> >
>> > As reported by Cong Wang, I was lazy when I did initial RCU conversion
>> > of tc_mirred, as I thought I could avoid allocation/freeing of a
>> > parameter block.
>>
>> Quote from Eric Dumazet:
>>
>> https://www.mail-archive.com/netdev@vger.kernel.org/msg115482.html
>>
>> 
>> Well, I added a READ_ONCE() to read tcf_action once.
>>
>> Adding rcu here would mean adding a pointer and extra cache line, to
>> deref the values.
>>
>> IMHO the race here has no effect . You either read the old or new value.
>> 
>>
>> Me with facepalm... ;-)
>
>
> Point is still valid. Show me a real case where it was a serious
> problem, instead of simply theoretical.
>
> tc_mirred + ifb patches allowed us to reach a milestone, removing the
> last contended spinlocks, and you are catching up with this one year
> later.
>
> I wont backport this fix in Google prod kernels, because there is
> absolutely no way we need it, and the extra memory cache line might hurt
> latencies.
>
> Since you did not write a fix on your side since June 17th, I presume
> you do not care that much.

Sounds like I were the author of this patch Why are you questioning
your own patch? Did I ask you to care about it? ;-)

Please drop this patch.

Thanks.


Re: [PATCH net] net_sched: act_mirred: full rcu conversion

2016-09-09 Thread John Fastabend
On 16-09-08 10:26 PM, Cong Wang wrote:
> On Thu, Sep 8, 2016 at 8:51 AM, Eric Dumazet  wrote:
>> On Thu, 2016-09-08 at 08:47 -0700, John Fastabend wrote:
>>
>>> Works for me. FWIW I find this plenty straightforward and don't really
>>> see the need to make the hash table itself rcu friendly.
>>>
>>> Acked-by: John Fastabend 
>>>
>>
>> Yes, it seems this hash table is used in control path, with RTNL held
>> anyway.
> 
> Seriously? You never read hashtable in fast path?? I think you need
> to wake up.
> 

But the actions use refcnt'ing and should never be decremented to zero
as long as they can still be referenced by an active filter. If each
action handles its parameters like mirred/gact then I don't see why its
necessary.

I believe though that the refcnt needs to be fixed a bit though most
likely by making it atomic. I original assumed it was protected by
RTNL lock but because its getting decremented from rcu callback this is
not true.

.John




Re: [PATCH net] net_sched: act_mirred: full rcu conversion

2016-09-09 Thread Eric Dumazet
On Thu, 2016-09-08 at 22:59 -0700, Cong Wang wrote:
> On Thu, Sep 8, 2016 at 10:48 PM, Alexei Starovoitov
>  wrote:
> >
> > imo the deliberate small race in Eric's initial lock removal in mirred
> > was a good design choice. I think he did this patch only to
> > silence the complains with the code instead of arguing with words.
> > imo the initial code is good as-is. This patch is also a nice improvement,
> > but certainly not mandatory. 'face palm' is unnecessary.
> 
> LOL, to summarize your words:
> 
> 1) not changing the current code is good
> 2) changing the current code is good too

Only because I was tired of your rants.

I wont backport this patch to Google prod kernels, I already spent too
much time.





Re: [PATCH net] net_sched: act_mirred: full rcu conversion

2016-09-09 Thread Eric Dumazet
On Thu, 2016-09-08 at 22:24 -0700, Cong Wang wrote:
> On Thu, Sep 8, 2016 at 8:35 AM, Eric Dumazet  wrote:
> > From: Eric Dumazet 
> >
> > As reported by Cong Wang, I was lazy when I did initial RCU conversion
> > of tc_mirred, as I thought I could avoid allocation/freeing of a
> > parameter block.
> 
> Quote from Eric Dumazet:
> 
> https://www.mail-archive.com/netdev@vger.kernel.org/msg115482.html
> 
> 
> Well, I added a READ_ONCE() to read tcf_action once.
> 
> Adding rcu here would mean adding a pointer and extra cache line, to
> deref the values.
> 
> IMHO the race here has no effect . You either read the old or new value.
> 
> 
> Me with facepalm... ;-)


Point is still valid. Show me a real case where it was a serious
problem, instead of simply theoretical.

tc_mirred + ifb patches allowed us to reach a milestone, removing the
last contended spinlocks, and you are catching up with this one year
later.

I wont backport this fix in Google prod kernels, because there is
absolutely no way we need it, and the extra memory cache line might hurt
latencies.

Since you did not write a fix on your side since June 17th, I presume
you do not care that much.






Re: [PATCH net] net_sched: act_mirred: full rcu conversion

2016-09-09 Thread Eric Dumazet
On Thu, 2016-09-08 at 22:26 -0700, Cong Wang wrote:
> On Thu, Sep 8, 2016 at 8:51 AM, Eric Dumazet  wrote:
> > On Thu, 2016-09-08 at 08:47 -0700, John Fastabend wrote:
> >
> >> Works for me. FWIW I find this plenty straightforward and don't really
> >> see the need to make the hash table itself rcu friendly.
> >>
> >> Acked-by: John Fastabend 
> >>
> >
> > Yes, it seems this hash table is used in control path, with RTNL held
> > anyway.
> 
> Seriously? You never read hashtable in fast path?? I think you need
> to wake up.

I seriously consider to ignore your mails.

Since you write patches with no changelog, expecting us to read your
mind, you definitely don't need us.





Re: [PATCH net] net_sched: act_mirred: full rcu conversion

2016-09-09 Thread Cong Wang
On Thu, Sep 8, 2016 at 10:48 PM, Alexei Starovoitov
 wrote:
>
> imo the deliberate small race in Eric's initial lock removal in mirred
> was a good design choice. I think he did this patch only to
> silence the complains with the code instead of arguing with words.
> imo the initial code is good as-is. This patch is also a nice improvement,
> but certainly not mandatory. 'face palm' is unnecessary.

LOL, to summarize your words:

1) not changing the current code is good
2) changing the current code is good too


Re: [PATCH net] net_sched: act_mirred: full rcu conversion

2016-09-08 Thread Alexei Starovoitov
On Thu, Sep 08, 2016 at 10:24:32PM -0700, Cong Wang wrote:
> On Thu, Sep 8, 2016 at 8:35 AM, Eric Dumazet  wrote:
> > From: Eric Dumazet 
> >
> > As reported by Cong Wang, I was lazy when I did initial RCU conversion
> > of tc_mirred, as I thought I could avoid allocation/freeing of a
> > parameter block.
> 
> Quote from Eric Dumazet:
> 
> https://www.mail-archive.com/netdev@vger.kernel.org/msg115482.html
> 
> 
> Well, I added a READ_ONCE() to read tcf_action once.
> 
> Adding rcu here would mean adding a pointer and extra cache line, to
> deref the values.
> 
> IMHO the race here has no effect . You either read the old or new value.
> 
> 
> Me with facepalm... ;-)

imo the deliberate small race in Eric's initial lock removal in mirred
was a good design choice. I think he did this patch only to
silence the complains with the code instead of arguing with words.
imo the initial code is good as-is. This patch is also a nice improvement,
but certainly not mandatory. 'face palm' is unnecessary.



Re: [PATCH net] net_sched: act_mirred: full rcu conversion

2016-09-08 Thread Cong Wang
On Thu, Sep 8, 2016 at 8:51 AM, Eric Dumazet  wrote:
> On Thu, 2016-09-08 at 08:47 -0700, John Fastabend wrote:
>
>> Works for me. FWIW I find this plenty straightforward and don't really
>> see the need to make the hash table itself rcu friendly.
>>
>> Acked-by: John Fastabend 
>>
>
> Yes, it seems this hash table is used in control path, with RTNL held
> anyway.

Seriously? You never read hashtable in fast path?? I think you need
to wake up.


Re: [PATCH net] net_sched: act_mirred: full rcu conversion

2016-09-08 Thread Cong Wang
On Thu, Sep 8, 2016 at 8:35 AM, Eric Dumazet  wrote:
> From: Eric Dumazet 
>
> As reported by Cong Wang, I was lazy when I did initial RCU conversion
> of tc_mirred, as I thought I could avoid allocation/freeing of a
> parameter block.

Quote from Eric Dumazet:

https://www.mail-archive.com/netdev@vger.kernel.org/msg115482.html


Well, I added a READ_ONCE() to read tcf_action once.

Adding rcu here would mean adding a pointer and extra cache line, to
deref the values.

IMHO the race here has no effect . You either read the old or new value.


Me with facepalm... ;-)


Re: [PATCH net] net_sched: act_mirred: full rcu conversion

2016-09-08 Thread Eric Dumazet
On Thu, 2016-09-08 at 08:47 -0700, John Fastabend wrote:

> Works for me. FWIW I find this plenty straightforward and don't really
> see the need to make the hash table itself rcu friendly.
> 
> Acked-by: John Fastabend 
> 

Yes, it seems this hash table is used in control path, with RTNL held
anyway.

Thanks for reviewing John.




Re: [PATCH net] net_sched: act_mirred: full rcu conversion

2016-09-08 Thread John Fastabend
On 16-09-08 08:35 AM, Eric Dumazet wrote:
> From: Eric Dumazet 
> 
> As reported by Cong Wang, I was lazy when I did initial RCU conversion
> of tc_mirred, as I thought I could avoid allocation/freeing of a
> parameter block.
> 
> Use an intermediate object so that fast path can get a consistent
> snapshot of multiple variables (dev, action, eaction, ok_push)
> 
> Fixes: 2ee22a90c7af ("net_sched: act_mirred: remove spinlock in fast path")
> Signed-off-by: Eric Dumazet 
> Reported-by: Cong Wang 
> Cc: John Fastabend 
> Cc: Jamal Hadi Salim 
> Cc: Hadar Hen Zion 
> Cc: Amir Vadai 
> ---

Works for me. FWIW I find this plenty straightforward and don't really
see the need to make the hash table itself rcu friendly.

Acked-by: John Fastabend 



[PATCH net] net_sched: act_mirred: full rcu conversion

2016-09-08 Thread Eric Dumazet
From: Eric Dumazet 

As reported by Cong Wang, I was lazy when I did initial RCU conversion
of tc_mirred, as I thought I could avoid allocation/freeing of a
parameter block.

Use an intermediate object so that fast path can get a consistent
snapshot of multiple variables (dev, action, eaction, ok_push)

Fixes: 2ee22a90c7af ("net_sched: act_mirred: remove spinlock in fast path")
Signed-off-by: Eric Dumazet 
Reported-by: Cong Wang 
Cc: John Fastabend 
Cc: Jamal Hadi Salim 
Cc: Hadar Hen Zion 
Cc: Amir Vadai 
---
 include/net/tc_act/tc_mirred.h |   12 +-
 net/sched/act_mirred.c |   54 +--
 2 files changed, 47 insertions(+), 19 deletions(-)

diff --git a/include/net/tc_act/tc_mirred.h b/include/net/tc_act/tc_mirred.h
index 62770add15bd..a613b4b9c56e 100644
--- a/include/net/tc_act/tc_mirred.h
+++ b/include/net/tc_act/tc_mirred.h
@@ -4,12 +4,20 @@
 #include 
 #include 
 
+
+struct tcf_mirred_params {
+   struct net_device   *dev;
+   int action;
+   int eaction;
+   int ok_push;
+   struct rcu_head rcu;
+};
+
 struct tcf_mirred {
struct tc_actioncommon;
int tcfm_eaction;
int tcfm_ifindex;
-   int tcfm_ok_push;
-   struct net_device __rcu *tcfm_dev;
+   struct tcf_mirred_params __rcu *params;
struct list_headtcfm_list;
 };
 #define to_mirred(a) ((struct tcf_mirred *)a)
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index 6038c85d92f5..1dc0b6036180 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -36,15 +36,16 @@ static DEFINE_SPINLOCK(mirred_list_lock);
 static void tcf_mirred_release(struct tc_action *a, int bind)
 {
struct tcf_mirred *m = to_mirred(a);
-   struct net_device *dev;
+   struct tcf_mirred_params *params;
 
/* We could be called either in a RCU callback or with RTNL lock held. 
*/
spin_lock_bh(_list_lock);
list_del(>tcfm_list);
-   dev = rcu_dereference_protected(m->tcfm_dev, 1);
-   if (dev)
-   dev_put(dev);
+   params = rcu_dereference_protected(m->params, 1);
+   if (params->dev)
+   dev_put(params->dev);
spin_unlock_bh(_list_lock);
+   kfree_rcu(params, rcu);
 }
 
 static const struct nla_policy mirred_policy[TCA_MIRRED_MAX + 1] = {
@@ -60,6 +61,8 @@ static int tcf_mirred_init(struct net *net, struct nlattr 
*nla,
 {
struct tc_action_net *tn = net_generic(net, mirred_net_id);
struct nlattr *tb[TCA_MIRRED_MAX + 1];
+   struct tcf_mirred_params *params_new;
+   struct tcf_mirred_params *params_old;
struct tc_mirred *parm;
struct tcf_mirred *m;
struct net_device *dev;
@@ -124,21 +127,33 @@ static int tcf_mirred_init(struct net *net, struct nlattr 
*nla,
tcf_hash_release(*a, bind);
if (!ovr)
return -EEXIST;
+   ret = 0;
}
m = to_mirred(*a);
 
ASSERT_RTNL();
-   m->tcf_action = parm->action;
-   m->tcfm_eaction = parm->eaction;
+   params_new = kzalloc(sizeof(*params_new), GFP_KERNEL);
+   if (unlikely(!params_new)) {
+   if (ret == ACT_P_CREATED)
+   tcf_hash_release(*a, bind);
+   return -ENOMEM;
+   }
+   params_old = rtnl_dereference(m->params);
+   params_new->action = m->tcf_action = parm->action;
+   params_new->eaction = m->tcfm_eaction = parm->eaction;
if (dev != NULL) {
m->tcfm_ifindex = parm->ifindex;
-   if (ret != ACT_P_CREATED)
-   dev_put(rcu_dereference_protected(m->tcfm_dev, 1));
dev_hold(dev);
-   rcu_assign_pointer(m->tcfm_dev, dev);
-   m->tcfm_ok_push = ok_push;
+   params_new->dev = dev;
+   params_new->ok_push = ok_push;
}
 
+   rcu_assign_pointer(m->params, params_new);
+   if (params_old) {
+   if (params_old->dev)
+   dev_put(params_old->dev);
+   kfree_rcu(params_old, rcu);
+   }
if (ret == ACT_P_CREATED) {
spin_lock_bh(_list_lock);
list_add(>tcfm_list, _list);
@@ -153,6 +168,7 @@ static int tcf_mirred(struct sk_buff *skb, const struct 
tc_action *a,
  struct tcf_result *res)
 {
struct tcf_mirred *m = to_mirred(a);
+   struct tcf_mirred_params *params;
struct net_device *dev;
struct sk_buff *skb2;
int retval, err;
@@ -162,8 +178,9 @@ static int tcf_mirred(struct sk_buff *skb, const struct 
tc_action *a,