[PATCH net 1/2] ipv6: Move common init code for rt6_info to a new function rt6_info_init()

2015-10-15 Thread Martin KaFai Lau
Introduce rt6_info_init() to do the common init work for 'struct rt6_info' (after calling dst_alloc). It is a prep work to fix the rt6_info init logic in the ip6_blackhole_route(). Signed-off-by: Martin KaFai Lau Cc: Hannes Frederic Sowa Cc: Julian Anastasov Cc: Phil Sutter C

[PATCH net 1/3] ipv6: Avoid creating RTF_CACHE from a rt that is not managed by fib6 tree

2015-11-11 Thread Martin KaFai Lau
"ipv6: Only create RTF_CACHE routes after encountering pmtu") Signed-off-by: Martin KaFai Lau Reported-by: Chris Siebenmann Cc: Chris Siebenmann Cc: Hannes Frederic Sowa --- net/ipv6/route.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/net/ipv6/route.c b/net

[PATCH net 0/3] ipv6: Fixes for pmtu update and DST_NOCACHE route

2015-11-11 Thread Martin KaFai Lau
This patchset fixes: 1. An oops during IPv6 pmtu update on a IPv4 GRE running in an IPSec setup 2. Misc fixes on DST_NOCACHE route -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kerne

[PATCH net 2/3] ipv6: Check expire on DST_NOCACHE route

2015-11-11 Thread Martin KaFai Lau
__rt6_check_expired() as one of the condition check. Signed-off-by: Martin KaFai Lau Cc: Hannes Frederic Sowa --- net/ipv6/route.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 74907c5..3754cf9 100644 --- a/net/ipv6

[PATCH net 3/3] ipv6: Check rt->dst.from for the DST_NOCACHE route

2015-11-11 Thread Martin KaFai Lau
Fixes: 8e3d5be73681 ("ipv6: Avoid double dst_free") Signed-off-by: Martin KaFai Lau Cc: Hannes Frederic Sowa --- include/net/ip6_fib.h | 3 ++- net/ipv6/route.c | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h index

Re: [PATCH net] net/ip6_tunnel: fix dst leak

2015-11-18 Thread Martin KaFai Lau
cached > dst on non current cpu are not actually reset. > > This patch replaces raw_cpu_ptr with per_cpu_ptr, properly cleaning > such storage. Thanks for fixing it. Acked-by: Martin KaFai Lau > > Fixes: cdf3464e6c6b ("ipv6: Fix dst_entry refcnt bugs in ip6_tunnel")

Re: kernel warning in tcp_fragment

2015-08-12 Thread Martin KaFai Lau
On Mon, Aug 10, 2015 at 02:35:37PM -0400, Neal Cardwell wrote: > On Mon, Aug 10, 2015 at 2:10 PM, Jovi Zhangwei wrote: > > > > Ping? > > > > We saw a lot of this warnings in our production system. It would be > > great appreciate if someone can give us the fix on this warnings. :) > > What is you

[PATCH RFC net 0/3] ipv6: Fix potential deadlock when creating pcpu rt

2015-08-13 Thread Martin KaFai Lau
This patch series fixes a potential deadlock when creating a pcpu rt. It happens when dst_alloc() decided to run gc. Something like this: read_lock(&table->tb6_lock); ip6_rt_pcpu_alloc() => dst_alloc() => ip6_dst_gc() => write_lock(&table->tb6_lock); /* oops */ Patch 1 and 2 are some prep works.
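The lock ordering described in this cover letter can be illustrated with a small user-space toy model. This is only a sketch under stated assumptions: `toy_rwlock`, `toy_alloc_with_gc()` and the two `alloc_*` helpers are hypothetical names, the lock only tracks state and never blocks, and dropping the read lock before allocating is one possible way out, not necessarily what patch 3 does.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy reader/writer lock: it records state only and never blocks. */
struct toy_rwlock {
    int readers;
    bool writer;
};

static void toy_read_lock(struct toy_rwlock *l)
{
    assert(!l->writer);
    l->readers++;
}

static void toy_read_unlock(struct toy_rwlock *l)
{
    assert(l->readers > 0);
    l->readers--;
}

/* Returns false in the situation where a real write_lock() would spin forever. */
static bool toy_write_trylock(struct toy_rwlock *l)
{
    if (l->readers > 0 || l->writer)
        return false;
    l->writer = true;
    return true;
}

static void toy_write_unlock(struct toy_rwlock *l)
{
    assert(l->writer);
    l->writer = false;
}

/* Hypothetical stand-in for an allocation that decides to run gc,
 * where gc needs the table's write lock. */
static bool toy_alloc_with_gc(struct toy_rwlock *table_lock)
{
    if (!toy_write_trylock(table_lock))
        return false;           /* the real code would deadlock here */
    toy_write_unlock(table_lock);
    return true;
}

/* Problematic ordering: allocate while still holding the read lock. */
bool alloc_under_read_lock(struct toy_rwlock *l)
{
    bool ok;

    toy_read_lock(l);
    ok = toy_alloc_with_gc(l);
    toy_read_unlock(l);
    return ok;
}

/* One possible ordering that avoids the problem: finish the lookup,
 * drop the read lock, then allocate. */
bool alloc_after_read_unlock(struct toy_rwlock *l)
{
    toy_read_lock(l);
    /* ... route lookup would happen here ... */
    toy_read_unlock(l);
    return toy_alloc_with_gc(l);
}
```

In the model, the first helper reports the failure that would be a hang in the kernel, while the second completes and leaves the lock released.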

[PATCH RFC net 3/3] ipv6: Fix potential deadlock when creating pcpu rt

2015-08-13 Thread Martin KaFai Lau
awv6_setsockopt+0x5e/0x67 [141625.827146] [] ? sock_common_setsockopt+0xf/0x11 [141625.833660] [] ? SyS_setsockopt+0x81/0xa2 [141625.839565] [] entry_SYSCALL_64_fastpath+0x12/0x6a Fixes: d52d3997f843 ("ipv6: Create percpu rt6_info") Signed-off-by: Martin KaFai Lau CC: Hannes Frederic Sowa

[PATCH RFC net 2/3] ipv6: Add rt6_make_pcpu_route()

2015-08-13 Thread Martin KaFai Lau
It is a prep work for the potential deadlock. The current rt6_get_pcpu_route() will also create a pcpu rt if one does not exist. This patch moves the pcpu rt creation logic into another function, rt6_make_pcpu_route(). Signed-off-by: Martin KaFai Lau CC: Hannes Frederic Sowa --- net/ipv6

[PATCH RFC net 1/3] ipv6: Remove un-used argument from ip6_dst_alloc()

2015-08-13 Thread Martin KaFai Lau
After 4b32b5ad31a6 ("ipv6: Stop rt6_info from using inet_peer's metrics"), ip6_dst_alloc() does not need the 'table' argument. This patch cleans it up. Signed-off-by: Martin KaFai Lau CC: Hannes Frederic Sowa --- net/ipv6/route.c | 21 + 1 file cha

Re: [PATCH RFC net 0/3] ipv6: Fix potential deadlock when creating pcpu rt

2015-08-14 Thread Martin KaFai Lau
On Thu, Aug 13, 2015 at 05:29:09PM -0700, David Miller wrote: > From: Martin KaFai Lau > Date: Thu, 13 Aug 2015 00:58:00 -0700 > > > This patch series fixes a potential deadlock when creating a pcpu rt. > > It happens when dst_alloc() decided to run gc. Something like thi

[PATCH v2 net 1/3] ipv6: Remove un-used argument from ip6_dst_alloc()

2015-08-14 Thread Martin KaFai Lau
After 4b32b5ad31a6 ("ipv6: Stop rt6_info from using inet_peer's metrics"), ip6_dst_alloc() does not need the 'table' argument. This patch cleans it up. Signed-off-by: Martin KaFai Lau CC: Hannes Frederic Sowa --- net/ipv6/route.c | 21 + 1 file cha

[PATCH v2 net 3/3] ipv6: Fix a potential deadlock when creating pcpu rt

2015-08-14 Thread Martin KaFai Lau
awv6_setsockopt+0x5e/0x67 [141625.827146] [] ? sock_common_setsockopt+0xf/0x11 [141625.833660] [] ? SyS_setsockopt+0x81/0xa2 [141625.839565] [] entry_SYSCALL_64_fastpath+0x12/0x6a Fixes: d52d3997f843 ("ipv6: Create percpu rt6_info") Signed-off-by: Martin KaFai Lau CC: Hannes Frederic Sowa

[PATCH v2 net 2/3] ipv6: Add rt6_make_pcpu_route()

2015-08-14 Thread Martin KaFai Lau
It is a prep work for fixing a potential deadlock when creating a pcpu rt. The current rt6_get_pcpu_route() will also create a pcpu rt if one does not exist. This patch moves the pcpu rt creation logic into another function, rt6_make_pcpu_route(). Signed-off-by: Martin KaFai Lau CC: Hannes

[PATCH v2 net 0/3] ipv6: Fix a potential deadlock when creating pcpu rt

2015-08-14 Thread Martin KaFai Lau
v1 -> v2: A minor change in the commit message of patch 2. This patch series fixes a potential deadlock when creating a pcpu rt. It happens when dst_alloc() decided to run gc. Something like this: read_lock(&table->tb6_lock); ip6_rt_pcpu_alloc() => dst_alloc() => ip6_dst_gc() => write_lock(&table

Re: [PATCH net-next v5 00/11] ipv6: Only create RTF_CACHE route after encountering pmtu exception

2015-08-28 Thread Martin KaFai Lau
On Mon, Aug 17, 2015 at 11:43:20AM +0200, Alexander Holler wrote: > That's why I vote to check out if it's possible/reasonable to backport this > series to the stable kernels. I have backported to 4.0.y without major issue, so possible. I did try on 3.1x and gave up. It is a lot of changes, so I

[PATCH net-next] ipv6: Avoid rt6_probe() taking writer lock in the fast path

2015-07-21 Thread Martin KaFai Lau
30s. At the end, the total number of finished sendto(): Before: 55M After: 95M Signed-off-by: Martin KaFai Lau Cc: Hannes Frederic Sowa --- net/ipv6/route.c | 41 - 1 file changed, 20 insertions(+), 21 deletions(-) diff --git a/net/ipv6/route.c b/

Re: [PATCH net-next] ipv6: Avoid rt6_probe() taking writer lock in the fast path

2015-07-22 Thread Martin KaFai Lau
On Wed, Jul 22, 2015 at 11:10:59AM +0900, YOSHIFUJI Hideaki wrote: > You have to take "some" lock when accessing neigh->nud_state > theoretically. I don't think read_lock can buy us a lot of extra protection either. If it has missed the train, the next ip6_pol_route() call will trigger rt6_probe().

[PATCH net-next v2 2/2] ipv6: Avoid rt6_probe() taking writer lock in the fast path

2015-07-24 Thread Martin KaFai Lau
dto(): Before: 55M After: 95M Signed-off-by: Martin KaFai Lau Cc: Hannes Frederic Sowa CC: Julian Anastasov CC: YOSHIFUJI Hideaki --- net/ipv6/route.c | 4 1 file changed, 4 insertions(+) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 6d503db..76dcff8 100644 --- a/net/ipv6/rout

[PATCH net-next v2 0/2] ipv6: Avoid rt6_probe() taking writer lock in the fast path

2015-07-24 Thread Martin KaFai Lau
v1 -> v2: 1. Separate the code re-arrangement into another patch 2. Fix style

[PATCH net-next v2 1/2] ipv6: Re-arrange code in rt6_probe()

2015-07-24 Thread Martin KaFai Lau
KaFai Lau Cc: Hannes Frederic Sowa Cc: Julian Anastasov Cc: YOSHIFUJI Hideaki --- net/ipv6/route.c | 44 1 file changed, 20 insertions(+), 24 deletions(-) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 7f2214f..6d503db 100644 --- a/net/ipv6

Re: kernel warning in tcp_fragment

2015-07-27 Thread Martin KaFai Lau
On Wed, Jul 22, 2015 at 11:55:35AM -0700, Jovi Zhangwei wrote: > Sorry for disturbing, our production system(3.14 and 3.18 stable > kernel) have many tcp_fragment warnings, > the trace is same as below one which you discussed before. > > https://urldefense.proofpoint.com/v1/url?u=http://comments.g

[PATCH net 2/3] ipv6: Rename the dst_cache helper functions in ip6_tunnel

2015-09-01 Thread Martin KaFai Lau
ip6_tnl_dst_get(). Signed-off-by: Martin KaFai Lau --- include/net/ip6_tunnel.h | 4 ++-- net/ipv6/ip6_gre.c | 4 ++-- net/ipv6/ip6_tunnel.c| 12 ++-- 3 files changed, 10 insertions(+), 10 deletions(-) diff --git a/include/net/ip6_tunnel.h b/include/net/ip6_tunnel.h index

[PATCH net 3/3] ipv6: Fix dst_entry refcnt bugs in ip6_tunnel

2015-09-01 Thread Martin KaFai Lau
y's refcnt. This patch: 1. Create a percpu dst_entry cache in ip6_tnl 2. Use a spinlock to protect the dst_cache operations 3. The outgoing skb always holds the dst_entry's refcnt Signed-off-by: Martin KaFai Lau --- include/net/ip6_tunnel.h | 11 - net/ipv6/ip6_gre.

[PATCH net 0/3] ipv6: Fix dst_entry refcnt bugs in ip6_tunnel

2015-09-01 Thread Martin KaFai Lau
This patch series is to fix the dst refcnt bugs in ip6_tunnel. Patch 1 and 2 are the prep works. Patch 3 is the fix. I can reproduce the bug by adding and removing the ip6gre tunnel while running a super_netperf TCP_CRR test. I get the following trace by adding WARN_ON_ONCE(newrefcnt < 0) to ds

[PATCH net 1/3] ipv6: Refactor common ip6gre_tunnel_init codes

2015-09-01 Thread Martin KaFai Lau
It is a prep work to fix the dst_entry refcnt bugs in ip6_tunnel. This patch refactors some common init codes used by both ip6gre_tunnel_init and ip6gre_tap_init. Signed-off-by: Martin KaFai Lau --- net/ipv6/ip6_gre.c | 37 - 1 file changed, 24 insertions

Re: [PATCH net 3/3] ipv6: Fix dst_entry refcnt bugs in ip6_tunnel

2015-09-01 Thread Martin KaFai Lau
On Tue, Sep 01, 2015 at 01:14:20PM -0700, Eric Dumazet wrote: > It should not be a problem. refcnt is taken when/if necessary (skb > queued on a qdisc for example) > > We have other uses of skb_dst_set_noref() > > Please describe the problem ? The current ip6_tnl_dst_get() does not take the dst ref

Re: [PATCH net 3/3] ipv6: Fix dst_entry refcnt bugs in ip6_tunnel

2015-09-01 Thread Martin KaFai Lau
On Tue, Sep 01, 2015 at 02:26:58PM -0700, Eric Dumazet wrote: > On Tue, 2015-09-01 at 13:55 -0700, Martin KaFai Lau wrote: > > On Tue, Sep 01, 2015 at 01:14:20PM -0700, Eric Dumazet wrote: > > > It should not be a problem. refcnt is taken when/if necessary (skb > > > qu

Re: [PATCH net 3/3] ipv6: Fix dst_entry refcnt bugs in ip6_tunnel

2015-09-01 Thread Martin KaFai Lau
On Tue, Sep 01, 2015 at 03:38:36PM -0700, Eric Dumazet wrote: > On Tue, 2015-09-01 at 15:25 -0700, Martin KaFai Lau wrote: > > On Tue, Sep 01, 2015 at 02:26:58PM -0700, Eric Dumazet wrote: > > > On Tue, 2015-09-01 at 13:55 -0700, Martin KaFai Lau wrote: > > > > On T

Re: [PATCH net 3/3] ipv6: Fix dst_entry refcnt bugs in ip6_tunnel

2015-09-01 Thread Martin KaFai Lau
On Tue, Sep 01, 2015 at 05:31:44PM -0700, Martin KaFai Lau wrote: > On Tue, Sep 01, 2015 at 03:38:36PM -0700, Eric Dumazet wrote: > > On Tue, 2015-09-01 at 15:25 -0700, Martin KaFai Lau wrote: > > > On Tue, Sep 01, 2015 at 02:26:58PM -0700, Eric Dumazet wrote: > > >

Re: [PATCH net 3/3] ipv6: Fix dst_entry refcnt bugs in ip6_tunnel

2015-09-01 Thread Martin KaFai Lau
On Tue, Sep 01, 2015 at 05:42:00PM -0700, Martin KaFai Lau wrote: > I took a closer look at dst_rcu_free() and your commit pointers. I can see > your point > for DST_NOCACHE. > > However, dst_free() for not DST_NOCACHE is still an issue, I think. oops. Ignore this email a

Re: [PATCH net 3/3] ipv6: Fix dst_entry refcnt bugs in ip6_tunnel

2015-09-01 Thread Martin KaFai Lau
On Tue, Sep 01, 2015 at 01:14:20PM -0700, Eric Dumazet wrote: > On Tue, 2015-09-01 at 11:55 -0700, Martin KaFai Lau wrote: > > Problems in the current dst_entry cache in the ip6_tunnel: > > > > 1. ip6_tnl_dst_set is racy. There is no lock to protect it: > >- One ma

Re: [PATCH net 3/3] ipv6: Fix dst_entry refcnt bugs in ip6_tunnel

2015-09-02 Thread Martin KaFai Lau
On Tue, Sep 01, 2015 at 01:14:20PM -0700, Eric Dumazet wrote: > > 2. Use a spinlock to protect the dst_cache operations > > Well, a seqlock would be better : No need for an atomic operation in > fast path. > seqlock can ensure consistency between idst->dst and idst->cookie. However, IPv6 dst destru

Re: [PATCH net 3/3] ipv6: Fix dst_entry refcnt bugs in ip6_tunnel

2015-09-02 Thread Martin KaFai Lau
On Wed, Sep 02, 2015 at 02:30:45PM -0700, Eric Dumazet wrote: > Object cannot be freed until all cpus have exited their RCU sections. You meant the dst_destroy() here will wait for all cpus exited their RCU sections? static inline void dst_free(struct dst_entry *dst) { if (dst->obsolete >

Re: [PATCH net 3/3] ipv6: Fix dst_entry refcnt bugs in ip6_tunnel

2015-09-02 Thread Martin KaFai Lau
On Wed, Sep 02, 2015 at 03:48:57PM -0700, Eric Dumazet wrote: > On Wed, 2015-09-02 at 14:52 -0700, Martin KaFai Lau wrote: > > On Wed, Sep 02, 2015 at 02:30:45PM -0700, Eric Dumazet wrote: > > > Object cannot be freed until all cpus have exited their RCU sections. > > Y

[PATCH RFC v2 net 4/5] ipv6: Avoid double dst_free

2015-09-04 Thread Martin KaFai Lau
been destroyed already. 3. If rt is a DST_NOCACHE, dst_free(rt) should not be called. 4. It is a stopper to make dst freeing from fib tree undergo a rcu grace period. This patch is to use a DST_NOCACHE flag to indicate a rt is managed by the fib tree or not. Signed-off-by: Martin KaFai Lau --- net

[PATCH RFC v2 net 1/5] ipv6: Refactor common ip6gre_tunnel_init codes

2015-09-04 Thread Martin KaFai Lau
It is a prep work to fix the dst_entry refcnt bugs in ip6_tunnel. This patch refactors some common init codes used by both ip6gre_tunnel_init and ip6gre_tap_init. Signed-off-by: Martin KaFai Lau --- net/ipv6/ip6_gre.c | 42 +- 1 file changed, 25

[PATCH RFC v2 net 2/5] ipv6: Rename the dst_cache helper functions in ip6_tunnel

2015-09-04 Thread Martin KaFai Lau
ip6_tnl_dst_get(). Signed-off-by: Martin KaFai Lau --- include/net/ip6_tunnel.h | 4 ++-- net/ipv6/ip6_gre.c | 4 ++-- net/ipv6/ip6_tunnel.c| 12 ++-- 3 files changed, 10 insertions(+), 10 deletions(-) diff --git a/include/net/ip6_tunnel.h b/include/net/ip6_tunnel.h index

[PATCH RFC v2 net 5/5] ipv6: Replace spinlock with seqlock and rcu in ip6_tunnel

2015-09-04 Thread Martin KaFai Lau
This patch uses a seqlock to ensure consistency between idst->dst and idst->cookie. It also makes dst freeing from fib tree to undergo a rcu grace period. Signed-off-by: Martin KaFai Lau --- include/net/ip6_tunnel.h | 4 ++-- net/ipv6/ip6_fib.c | 9 +++-- net/ipv6/ip6_tu
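The idea of keeping a (dst, cookie) pair consistent with a sequence counter can be sketched in user space as follows. This is a minimal single-threaded illustration, not the kernel's seqlock: `toy_dst_cache` and its helpers are hypothetical names, and a real concurrent implementation would also need the memory barriers and data-access rules that the kernel primitives provide.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cache slot: the sequence number is even when the pair is
 * stable and odd while a writer is updating it. */
struct toy_dst_cache {
    atomic_uint seq;
    const void *dst;
    uint32_t cookie;
};

void toy_write_begin(struct toy_dst_cache *c)
{
    atomic_fetch_add(&c->seq, 1);       /* seq becomes odd */
}

void toy_write_end(struct toy_dst_cache *c)
{
    assert(atomic_load(&c->seq) & 1);   /* a write must be in progress */
    atomic_fetch_add(&c->seq, 1);       /* seq becomes even again */
}

/* Writer: update both fields inside one write section. */
void toy_cache_set(struct toy_dst_cache *c, const void *dst, uint32_t cookie)
{
    toy_write_begin(c);
    c->dst = dst;
    c->cookie = cookie;
    toy_write_end(c);
}

/* Reader: one attempt. Returns false if a write was in progress or
 * happened in between, in which case the caller should retry. */
bool toy_cache_try_get(struct toy_dst_cache *c, const void **dst,
                       uint32_t *cookie)
{
    unsigned int start = atomic_load(&c->seq);

    if (start & 1)
        return false;
    *dst = c->dst;
    *cookie = c->cookie;
    return atomic_load(&c->seq) == start;
}
```

A reader would normally call `toy_cache_try_get()` in a loop until it returns true, so it never acts on a dst paired with a cookie from a different update.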

[PATCH RFC v2 net 0/5] ipv6: Fix dst_entry refcnt bugs in ip6_tunnel

2015-09-04 Thread Martin KaFai Lau
v2: - Add patch 4 and 5 to remove the spinlock v1: This patch series is to fix the dst refcnt bugs in ip6_tunnel. Patch 1 and 2 are the prep works. Patch 3 is the fix. I can reproduce the bug by adding and removing the ip6gre tunnel while running a super_netperf TCP_CRR test. I get the followi

[PATCH RFC v2 net 3/5] ipv6: Fix dst_entry refcnt bugs in ip6_tunnel

2015-09-04 Thread Martin KaFai Lau
: 1. Create a percpu dst_entry cache in ip6_tnl 2. Use a spinlock to protect the dst_cache operations 3. ip6_tnl_dst_get always takes the dst refcnt before returning Signed-off-by: Martin KaFai Lau Conflicts: net/ipv6/ip6_gre.c net/ipv6/ip6_tunnel.c --- include/net/ip6_tunnel.h |

Re: [PATCH RFC v2 net 4/5] ipv6: Avoid double dst_free

2015-09-04 Thread Martin KaFai Lau
On Fri, Sep 04, 2015 at 04:12:41PM -0700, Martin KaFai Lau wrote: > @@ -1962,6 +1961,9 @@ static int __ip6_del_rt(struct rt6_info *rt, struct > nl_info *info) > if (rt == net->ipv6.ip6_null_entry) { > err = -ENOENT; > goto out; > + } e

[PATCH v3 net 2/5] ipv6: Rename the dst_cache helper functions in ip6_tunnel

2015-09-11 Thread Martin KaFai Lau
ip6_tnl_dst_get(). Signed-off-by: Martin KaFai Lau --- include/net/ip6_tunnel.h | 4 ++-- net/ipv6/ip6_gre.c | 4 ++-- net/ipv6/ip6_tunnel.c| 12 ++-- 3 files changed, 10 insertions(+), 10 deletions(-) diff --git a/include/net/ip6_tunnel.h b/include/net/ip6_tunnel.h index

[PATCH v3 net 5/5] ipv6: Replace spinlock with seqlock and rcu in ip6_tunnel

2015-09-11 Thread Martin KaFai Lau
This patch uses a seqlock to ensure consistency between idst->dst and idst->cookie. It also makes dst freeing from fib tree to undergo a rcu grace period. Signed-off-by: Martin KaFai Lau --- include/net/ip6_tunnel.h | 4 ++-- net/ipv6/ip6_fib.c | 9 +++-- net/ipv6/ip6_tu

[PATCH v3 net 3/5] ipv6: Fix dst_entry refcnt bugs in ip6_tunnel

2015-09-11 Thread Martin KaFai Lau
: 1. Create a percpu dst_entry cache in ip6_tnl 2. Use a spinlock to protect the dst_cache operations 3. ip6_tnl_dst_get always takes the dst refcnt before returning Signed-off-by: Martin KaFai Lau --- include/net/ip6_tunnel.h | 11 - net/ipv6/ip6_gre.c | 38 --- net/i

[PATCH v3 net 0/5] ipv6: Fix dst_entry refcnt bugs in ip6_tunnel

2015-09-11 Thread Martin KaFai Lau
v3: - Merge a 'if else if' test in patch 4 - Use rcu_dereference_protected in patch 5 to fix a sparse check when CONFIG_SPARSE_RCU_POINTER is enabled v2: - Add patch 4 and 5 to remove the spinlock v1: This patch series is to fix the dst refcnt bugs in ip6_tunnel. Patch 1 and 2 are the prep wor

[PATCH v3 net 4/5] ipv6: Avoid double dst_free

2015-09-11 Thread Martin KaFai Lau
royed already. 3. If rt is a DST_NOCACHE, dst_free(rt) should not be called. 4. It is a stopper to make dst freeing from fib tree undergo a rcu grace period. This patch is to use a DST_NOCACHE flag to indicate a rt is not managed by the fib tree. Signed-off-by: Martin KaFai Lau --- net

[PATCH v3 net 1/5] ipv6: Refactor common ip6gre_tunnel_init codes

2015-09-11 Thread Martin KaFai Lau
It is a prep work to fix the dst_entry refcnt bugs in ip6_tunnel. This patch refactors some common init codes used by both ip6gre_tunnel_init and ip6gre_tap_init. Signed-off-by: Martin KaFai Lau --- net/ipv6/ip6_gre.c | 37 - 1 file changed, 24 insertions

Re: [PATCH v3 net 1/5] ipv6: Refactor common ip6gre_tunnel_init codes

2015-09-11 Thread Martin KaFai Lau
On Fri, Sep 11, 2015 at 03:30:59PM -0700, David Miller wrote: > From: Martin KaFai Lau > Date: Fri, 11 Sep 2015 11:06:17 -0700 > > > @@ -1460,19 +1474,16 @@ static void ip6gre_netlink_parms(struct nlattr > > *data[], > > static int ip6gre_tap_init(struct net_device

Re: kernel warning in tcp_fragment

2015-09-14 Thread Martin KaFai Lau
> I would really like to see the issue been fixed by upstream(and backported > to kernel longterm tree 3.14)--either by this patch or something else. Is > there a plan for this? > > Thanks, > > Grant > > On 12/08/2015 20:45, Martin KaFai Lau wrote: > >On Mon, Aug 1

Re: [PATCH v3 net 0/5] ipv6: Fix dst_entry refcnt bugs in ip6_tunnel

2015-09-15 Thread Martin KaFai Lau
On Mon, Sep 14, 2015 at 07:56:37PM -0700, David Miller wrote: > From: David Miller > Date: Mon, 14 Sep 2015 19:49:25 -0700 (PDT) > > > Series applied, thanks Martin. > > Actually, reverted, this doesn't even compile :-/ > > In file included from include/linux/srcu.h:33:0, > from i

[PATCH v4 net 4/5] ipv6: Avoid double dst_free

2015-09-15 Thread Martin KaFai Lau
royed already. 3. If rt is a DST_NOCACHE, dst_free(rt) should not be called. 4. It is a stopper to make dst freeing from fib tree undergo a rcu grace period. This patch is to use a DST_NOCACHE flag to indicate a rt is not managed by the fib tree. Signed-off-by: Martin KaFai Lau --- net

[PATCH v4 net 5/5] ipv6: Replace spinlock with seqlock and rcu in ip6_tunnel

2015-09-15 Thread Martin KaFai Lau
This patch uses a seqlock to ensure consistency between idst->dst and idst->cookie. It also makes dst freeing from fib tree to undergo a rcu grace period. Signed-off-by: Martin KaFai Lau --- include/net/ip6_tunnel.h | 4 ++-- net/ipv6/ip6_fib.c | 9 +++-- net/ipv6/ip6_tu

[PATCH v4 net 1/5] ipv6: Refactor common ip6gre_tunnel_init codes

2015-09-15 Thread Martin KaFai Lau
It is a prep work to fix the dst_entry refcnt bugs in ip6_tunnel. This patch refactors some common init codes used by both ip6gre_tunnel_init and ip6gre_tap_init. Signed-off-by: Martin KaFai Lau --- net/ipv6/ip6_gre.c | 37 - 1 file changed, 24 insertions

[PATCH v4 net 2/5] ipv6: Rename the dst_cache helper functions in ip6_tunnel

2015-09-15 Thread Martin KaFai Lau
ip6_tnl_dst_get(). Signed-off-by: Martin KaFai Lau --- include/net/ip6_tunnel.h | 4 ++-- net/ipv6/ip6_gre.c | 4 ++-- net/ipv6/ip6_tunnel.c| 12 ++-- 3 files changed, 10 insertions(+), 10 deletions(-) diff --git a/include/net/ip6_tunnel.h b/include/net/ip6_tunnel.h index

[PATCH v4 net 0/5] ipv6: Fix dst_entry refcnt bugs in ip6_tunnel

2015-09-15 Thread Martin KaFai Lau
v4: - Fix a compilation error in patch 5 when CONFIG_LOCKDEP is turned on and re-test it v3: - Merge a 'if else if' test in patch 4 - Use rcu_dereference_protected in patch 5 to fix a sparse check when CONFIG_SPARSE_RCU_POINTER is enabled v2: - Add patch 4 and 5 to remove the spinlock v1: Th

[PATCH v4 net 3/5] ipv6: Fix dst_entry refcnt bugs in ip6_tunnel

2015-09-15 Thread Martin KaFai Lau
: 1. Create a percpu dst_entry cache in ip6_tnl 2. Use a spinlock to protect the dst_cache operations 3. ip6_tnl_dst_get always takes the dst refcnt before returning Signed-off-by: Martin KaFai Lau --- include/net/ip6_tunnel.h | 11 - net/ipv6/ip6_gre.c | 38 --- net/i

[PATCH net-next v4 10/10] ipv6: Create percpu rt6_info

2015-05-20 Thread Martin KaFai Lau
After the patch 'ipv6: Only create RTF_CACHE routes after encountering pmtu exception', we need to compensate the performance hit (bouncing dst->__refcnt). Signed-off-by: Martin KaFai Lau Cc: Hannes Frederic Sowa Cc: Steffen Klassert Cc: Julian Anastasov --- include/

[PATCH net-next v4 04/10] ipv6: Only create RTF_CACHE routes after encountering pmtu exception

2015-05-20 Thread Martin KaFai Lau
This patch creates RTF_CACHE routes only after encountering a pmtu exception. After ip6_rt_update_pmtu() has inserted the RTF_CACHE route to the fib6 tree, the rt->rt6i_node->fn_sernum is bumped which will fail the ip6_dst_check() and trigger a relookup. Signed-off-by: Martin KaFai L

[PATCH net-next v4 09/10] ipv6: Break up ip6_rt_copy()

2015-05-20 Thread Martin KaFai Lau
This patch breaks up ip6_rt_copy() into ip6_rt_copy_init() and ip6_rt_cache_alloc(). In the later patch, we need to create a percpu rt6_info copy. Hence, refactor the common rt6_info init codes to ip6_rt_copy_init(). Signed-off-by: Martin KaFai Lau Cc: Hannes Frederic Sowa Cc: Steffen Klassert

[PATCH net-next v4 08/10] ipv6: Keep track of DST_NOCACHE routes in case of iface down/unregister

2015-05-20 Thread Martin KaFai Lau
This patch keeps track of the DST_NOCACHE routes in a list and replaces its dev with loopback during the iface down/unregister event. Signed-off-by: Martin KaFai Lau Cc: Hannes Frederic Sowa Cc: Steffen Klassert Cc: Julian Anastasov --- include/net/ip6_fib.h | 3 ++ net/ipv6/route.c

[PATCH net-next v4 0/10] ipv6: Only create RTF_CACHE route after encountering pmtu exception

2015-05-20 Thread Martin KaFai Lau
v3 -> v4: - Patch 8 is new. It keeps track of the DST_NOCACHE routes in a list to handle the iface down/unregister event. - Remove rcu from the newly added rt6i_pcpu variable. It is not needed because it has already been protected by the existing reader/writer lock. - Thanks to 'Julian Anast

[PATCH net-next v4 07/10] ipv6: Create RTF_CACHE clone when FLOWI_FLAG_KNOWN_NH is set

2015-05-20 Thread Martin KaFai Lau
This patch always creates RTF_CACHE clone with DST_NOCACHE when FLOWI_FLAG_KNOWN_NH is set so that the rt6i_dst is set to the fl6->daddr. Signed-off-by: Martin KaFai Lau Acked-by: Julian Anastasov Tested-by: Julian Anastasov Cc: Hannes Frederic Sowa Cc: Steffen Klassert --- include/

[PATCH net-next v4 03/10] ipv6: Combine rt6_alloc_cow and rt6_alloc_clone

2015-05-20 Thread Martin KaFai Lau
A prep work for creating RTF_CACHE on exception only. After this patch, the same condition (rt->rt6i_flags & (RTF_NONEXTHOP | RTF_GATEWAY)) is checked twice. This redundancy will be removed in the later patch. Signed-off-by: Martin KaFai Lau Cc: Hannes Frederic Sowa Cc: Steffen Klass

[PATCH net-next v4 01/10] ipv6: Remove external dependency on rt6i_dst and rt6i_src

2015-05-20 Thread Martin KaFai Lau
) later. Signed-off-by: Martin KaFai Lau Reviewed-by: Hannes Frederic Sowa Cc: Steffen Klassert Cc: Julian Anastasov --- drivers/scsi/cxgbi/libcxgbi.c | 2 +- include/net/ipv6.h | 3 ++- net/ipv6/icmp.c | 2 +- net/ipv6/ip6_output.c | 22

[PATCH net-next v4 05/10] ipv6: Add rt6_get_cookie() function

2015-05-20 Thread Martin KaFai Lau
Instead of doing the rt6->rt6i_node check whenever we need to get the route's cookie. Refactor it into rt6_get_cookie(). It is a prep work to handle FLOWI_FLAG_KNOWN_NH and also percpu rt6_info later. Signed-off-by: Martin KaFai Lau Cc: Hannes Frederic Sowa Cc: Steffen Klassert Cc

[PATCH net-next v4 02/10] ipv6: Remove external dependency on rt6i_gateway and RTF_ANYCAST

2015-05-20 Thread Martin KaFai Lau
on rt6i_gateway and RTF_ANYCAST. Signed-off-by: Martin KaFai Lau Cc: Hannes Frederic Sowa Cc: Steffen Klassert Cc: Julian Anastasov --- include/net/ip6_route.h| 19 ++- net/bluetooth/6lowpan.c| 2 +- net/ipv6/icmp.c| 4

[PATCH net-next v4 06/10] ipv6: Set FLOWI_FLAG_KNOWN_NH at flowi6_flags

2015-05-20 Thread Martin KaFai Lau
clone. Signed-off-by: Martin KaFai Lau Acked-by: Julian Anastasov Tested-by: Julian Anastasov Cc: Hannes Frederic Sowa Cc: Steffen Klassert --- net/ipv6/raw.c | 3 +++ net/netfilter/ipvs/ip_vs_xmit.c | 13 + net/netfilter/xt_TEE.c | 1 + 3 files chang

[PATCH net-next v5 06/11] ipv6: Add rt6_get_cookie() function

2015-05-22 Thread Martin KaFai Lau
Instead of doing the rt6->rt6i_node check whenever we need to get the route's cookie. Refactor it into rt6_get_cookie(). It is a prep work to handle FLOWI_FLAG_KNOWN_NH and also percpu rt6_info later. Signed-off-by: Martin KaFai Lau Cc: Hannes Frederic Sowa Cc: Steffen Klassert Cc

[PATCH net-next v5 01/11] ipv6: Clean up ipv6_select_ident() and ip6_fragment()

2015-05-22 Thread Martin KaFai Lau
heck fragment id has been generated or not. Signed-off-by: Martin KaFai Lau Cc: Hannes Frederic Sowa Cc: Steffen Klassert Cc: Julian Anastasov --- include/net/ipv6.h | 3 +-- net/ipv6/ip6_output.c | 17 ++--- net/ipv6/output_core.c | 5 ++--- 3 files changed, 9 insertions(+),

[PATCH net-next v5 09/11] ipv6: Keep track of DST_NOCACHE routes in case of iface down/unregister

2015-05-22 Thread Martin KaFai Lau
This patch keeps track of the DST_NOCACHE routes in a list and replaces its dev with loopback during the iface down/unregister event. Signed-off-by: Martin KaFai Lau Cc: Hannes Frederic Sowa Cc: Steffen Klassert Cc: Julian Anastasov --- include/net/ip6_fib.h | 3 ++ net/ipv6/route.c

[PATCH net-next v5 10/11] ipv6: Break up ip6_rt_copy()

2015-05-22 Thread Martin KaFai Lau
This patch breaks up ip6_rt_copy() into ip6_rt_copy_init() and ip6_rt_cache_alloc(). In the later patch, we need to create a percpu rt6_info copy. Hence, refactor the common rt6_info init codes to ip6_rt_copy_init(). Signed-off-by: Martin KaFai Lau Cc: Hannes Frederic Sowa Cc: Steffen Klassert

[PATCH net-next v5 05/11] ipv6: Only create RTF_CACHE routes after encountering pmtu exception

2015-05-22 Thread Martin KaFai Lau
This patch creates RTF_CACHE routes only after encountering a pmtu exception. After ip6_rt_update_pmtu() has inserted the RTF_CACHE route to the fib6 tree, the rt->rt6i_node->fn_sernum is bumped which will fail the ip6_dst_check() and trigger a relookup. Signed-off-by: Martin KaFai L

[PATCH net-next v5 03/11] ipv6: Remove external dependency on rt6i_gateway and RTF_ANYCAST

2015-05-22 Thread Martin KaFai Lau
on rt6i_gateway and RTF_ANYCAST. Signed-off-by: Martin KaFai Lau Cc: Hannes Frederic Sowa Cc: Steffen Klassert Cc: Julian Anastasov --- include/net/ip6_route.h| 19 ++- net/bluetooth/6lowpan.c| 2 +- net/ipv6/icmp.c| 4

[PATCH net-next v5 07/11] ipv6: Set FLOWI_FLAG_KNOWN_NH at flowi6_flags

2015-05-22 Thread Martin KaFai Lau
clone. Signed-off-by: Martin KaFai Lau Acked-by: Julian Anastasov Tested-by: Julian Anastasov Cc: Hannes Frederic Sowa Cc: Steffen Klassert --- net/ipv6/raw.c | 3 +++ net/netfilter/ipvs/ip_vs_xmit.c | 13 + net/netfilter/xt_TEE.c | 1 + 3 files chang

[PATCH net-next v5 11/11] ipv6: Create percpu rt6_info

2015-05-22 Thread Martin KaFai Lau
After the patch 'ipv6: Only create RTF_CACHE routes after encountering pmtu exception', we need to compensate the performance hit (bouncing dst->__refcnt). Signed-off-by: Martin KaFai Lau Cc: Hannes Frederic Sowa Cc: Steffen Klassert Cc: Julian Anastasov --- include/

[PATCH net-next v5 02/11] ipv6: Remove external dependency on rt6i_dst and rt6i_src

2015-05-22 Thread Martin KaFai Lau
) later. Signed-off-by: Martin KaFai Lau Reviewed-by: Hannes Frederic Sowa Cc: Steffen Klassert Cc: Julian Anastasov --- drivers/scsi/cxgbi/libcxgbi.c | 2 +- include/net/ipv6.h | 4 +++- net/ipv6/icmp.c | 2 +- net/ipv6/ip6_output.c | 13

[PATCH net-next v5 08/11] ipv6: Create RTF_CACHE clone when FLOWI_FLAG_KNOWN_NH is set

2015-05-22 Thread Martin KaFai Lau
This patch always creates RTF_CACHE clone with DST_NOCACHE when FLOWI_FLAG_KNOWN_NH is set so that the rt6i_dst is set to the fl6->daddr. Signed-off-by: Martin KaFai Lau Acked-by: Julian Anastasov Tested-by: Julian Anastasov Cc: Hannes Frederic Sowa Cc: Steffen Klassert --- include/

[PATCH net-next v5 00/11] ipv6: Only create RTF_CACHE route after encountering pmtu exception

2015-05-22 Thread Martin KaFai Lau
v4 -> v5: - Patch 1 is new. Clean up the ipv6_select_ident() and ip6_fragment(). - Further simplify the newly added rt6_get_pcpu_route(). If there is a 'prev' after cmpxchg, return prev instead of the newly created percpu clone. v3 -> v4: - Patch 8 is new. It keeps track of the DST_NOCACHE r
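The "return prev instead of the newly created clone" pattern mentioned for v5 can be sketched with a C11 compare-and-exchange. This is an illustration under assumptions: `toy_rt` and `toy_install_once()` are hypothetical, the slot is a plain atomic pointer rather than a percpu variable, and freeing the losing copy with `free()` is a stand-in for whatever cleanup the kernel code performs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical route object. */
struct toy_rt {
    int id;
};

/* Try to install 'fresh' into an empty slot. If another copy was
 * installed first, drop ours and return the one already there. */
struct toy_rt *toy_install_once(_Atomic(struct toy_rt *) *slot,
                                struct toy_rt *fresh)
{
    struct toy_rt *prev = NULL;

    assert(fresh != NULL);
    if (atomic_compare_exchange_strong(slot, &prev, fresh))
        return fresh;   /* slot was empty, our copy is now installed */

    /* On failure 'prev' holds the pointer that won the race. */
    free(fresh);
    return prev;
}
```

Whichever caller loses the race ends up using the same object as the winner, so the slot never holds more than one clone.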

[PATCH net-next v5 04/11] ipv6: Combine rt6_alloc_cow and rt6_alloc_clone

2015-05-22 Thread Martin KaFai Lau
A prep work for creating RTF_CACHE on exception only. After this patch, the same condition (rt->rt6i_flags & (RTF_NONEXTHOP | RTF_GATEWAY)) is checked twice. This redundancy will be removed in the later patch. Signed-off-by: Martin KaFai Lau Cc: Hannes Frederic Sowa Cc: Steffen Klass

Re: [PATCH net-next] ipv6: ipv6_select_ident() returns a __be32

2015-05-25 Thread Martin KaFai Lau
Reported-by: kbuild test robot Thanks for fixing it. Acked-by: Martin KaFai Lau

Re: [PATCH net-next v5 00/11] ipv6: Only create RTF_CACHE route after encountering pmtu exception

2015-05-26 Thread Martin KaFai Lau
On Tue, May 26, 2015 at 11:20:53PM +0200, Hannes Frederic Sowa wrote: > I also went over the changes to the last version and such, albeit a bit > late: > Reviewed-by: Hannes Frederic Sowa Thanks for your help and review, Hannes! --Martin

Re: Recurring trace from tcp_fragment()

2015-06-04 Thread Martin KaFai Lau
Hi Grant, On Thu, Jun 04, 2015 at 09:35:04AM -0700, Grant Zhang wrote: > Hi Neal, > > Unfortunately with the patch we still see the same stack trace. > Attached is the TcpExtTCPSACKReneging with the patch, captured with > 60 seconds interval. Its value is incremented at an similar speed as > befor

Re: Recurring trace from tcp_fragment()

2015-06-04 Thread Martin KaFai Lau
On Thu, Jun 04, 2015 at 01:10:26PM -0700, Grant Zhang wrote: > Hi Martin, > > Thank you! My net.ipv4.tcp_mtu_probing is 1. After turning it off, > the WARN_ON stack is gone. Thanks for confirming it. > Could you elaborate a bit on why this setting relates to the WARN_ON > trace? The WARN_ON is c

[RFC PATCH net] tcp: Update pcount after skb_pull() during mtu probing

2015-06-05 Thread Martin KaFai Lau
/0x7e0 The WARN_ON pointed out that tcp_skb_pcount (i.e. TCP_SKB_CB(skb)->tcp_gso_segs) and skb->len are inconsistent. The WARN_ON stack goes away after setting net.ipv4.tcp_mtu_probing to 0. This patch is to set the pcount after skb_pull() was called in tcp_mtu_probe(). Signed-off-by: Mart
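
The inconsistency described above, a cached segment count that no longer matches the buffer length after bytes are pulled from the head, can be modelled in a few lines. This is a simplified userspace sketch, not the kernel code: `struct seg_buf`, `segs_for()` and `pull_and_update()` are hypothetical names, and the rounding rule is an assumption about how a segment count relates to length and MSS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two skb fields discussed above. */
struct seg_buf {
    size_t len;        /* bytes remaining (plays the role of skb->len) */
    unsigned pcount;   /* cached segment count (role of tcp_gso_segs) */
};

/* Segments needed to carry 'len' bytes at 'mss' bytes each, minimum 1. */
static unsigned segs_for(size_t len, unsigned mss)
{
    assert(mss > 0);
    if (len <= mss)
        return 1;
    return (unsigned)((len + mss - 1) / mss);
}

/*
 * Drop 'bytes' from the head and recompute the cached count right
 * away, so that len and pcount never disagree afterwards.
 */
static void pull_and_update(struct seg_buf *b, size_t bytes, unsigned mss)
{
    assert(bytes <= b->len);
    b->len -= bytes;
    b->pcount = segs_for(b->len, mss);
}
```

Skipping the recomputation in `pull_and_update()` would leave `pcount` describing the old, longer buffer, which is the kind of mismatch the WARN_ON reports.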

Re: [RFC PATCH net] tcp: Update pcount after skb_pull() during mtu probing

2015-06-05 Thread Martin KaFai Lau
On Fri, Jun 05, 2015 at 09:53:51AM -0700, Eric Dumazet wrote: > Sounds good, although I would simply get rid of all this complexity in > this very unlikely path. > > Would you instead try the following ? > > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c > index > eeb59befaf06867b00e

Re: [RFC PATCH net] tcp: Update pcount after skb_pull() during mtu probing

2015-06-05 Thread Martin KaFai Lau
On Fri, Jun 05, 2015 at 02:23:55PM -0700, Eric Dumazet wrote: > On Fri, 2015-06-05 at 11:02 -0700, Martin KaFai Lau wrote: > > > tcp_trim_head() does not take the mss_now. > > Is it fine to have mss_now <= tcp_skb_mss(skb)? or we can depend on > > the tcp_init_tso_s

[PATCH net v2] tcp: Force updating pcount after skb_pull() during mtu probing

2015-06-05 Thread Martin KaFai Lau
. v1 - Call tcp_set_skb_tso_segs() for all slicing cases. Signed-off-by: Martin KaFai Lau Reported-by: Grant Zhang Cc: Grant Zhang Cc: Eric Dumazet Cc: Neal Cardwell Cc: Yuchung Cheng --- net/ipv4/tcp_output.c | 12 ++-- 1 file changed, 2 insertions(+), 10 deletions(-) diff --gi

Re: [PATCH net v2] tcp: Force updating pcount after skb_pull() during mtu probing

2015-06-08 Thread Martin KaFai Lau
On Fri, Jun 05, 2015 at 06:11:33PM -0700, Eric Dumazet wrote: > On Fri, 2015-06-05 at 17:46 -0700, Martin KaFai Lau wrote: > > The problem is caught by this WARN_ON(len > skb->len) in tcp_fragment(): > > > > [] warn_slowpath_null+0x1a/0x20 > > [] tcp_fragment+0x2a

Re: [PATCH net v2] tcp: Force updating pcount after skb_pull() during mtu probing

2015-06-09 Thread Martin KaFai Lau
On Tue, Jun 09, 2015 at 10:06:25AM -0700, Eric Dumazet wrote: > I've been working on this, but still can get the bug triggering in > tcp_fragment(), no matter what (Neal patch , yours, mine...) Can you describe the test case that can reproduce it?

Re: [RFC PATCH 06/10] ipv6: Avoid deleting RTF_CACHE route from ip6_route_del()

2015-04-20 Thread Martin KaFai Lau
On Mon, Apr 20, 2015 at 02:23:05PM -0400, David Miller wrote: > From: Martin KaFai Lau > Date: Fri, 10 Apr 2015 18:54:09 -0700 > > > Before patch 'Allow pmtu update on /128 via gateway route', > > RTF_CACHE route was not created for DST_HOST. It also requires

[PATCH net-next 1/5] ipv6: Consider RTF_CACHE when searching the fib6 tree

2015-04-28 Thread Martin KaFai Lau
h will not affect it. Signed-off-by: Martin KaFai Lau Reviewed-by: Hannes Frederic Sowa Cc: Steffen Klassert --- net/ipv6/addrconf.c | 2 ++ net/ipv6/route.c| 6 ++ 2 files changed, 8 insertions(+) diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 37b70e8..21c2c81 100644 --

[PATCH net-next 5/5] ipv6: Remove DST_METRICS_FORCE_OVERWRITE and _rt6i_peer

2015-04-28 Thread Martin KaFai Lau
is bit is also not needed. Signed-off-by: Martin KaFai Lau Reviewed-by: Hannes Frederic Sowa Cc: Michal Kubeček Cc: Steffen Klassert --- include/net/dst.h | 6 -- include/net/ip6_fib.h | 31 --- net/ipv6/route.c| 36 +

[PATCH net-next 3/5] ipv6: Stop /128 route from disappearing after pmtu update

2015-04-28 Thread Martin KaFai Lau
that allow pmtu update should have a RTF_CACHE clone. Hence, stop updating MTU for any non RTF_CACHE route. Signed-off-by: Martin KaFai Lau Signed-off-by: Steffen Klassert Reviewed-by: Hannes Frederic Sowa --- net/ipv6/route.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff -

[PATCH net-next 2/5] ipv6: Extend the route lookups to low priority metrics.

2015-04-28 Thread Martin KaFai Lau
deletes the invalid route. This typically happens if a host route expires after a pmtu event. Fix this by searching also for routes with a lower priority metric. Signed-off-by: Steffen Klassert Signed-off-by: Martin KaFai Lau Reviewed-by: Hannes Frederic Sowa --- net/ipv6/route.c | 28

[PATCH net-next 4/5] ipv6: Stop rt6_info from using inet_peer's metrics

2015-04-28 Thread Martin KaFai Lau
ch. Signed-off-by: Martin KaFai Lau Reviewed-by: Hannes Frederic Sowa Cc: Steffen Klassert --- include/net/ip6_fib.h | 10 + net/ipv6/route.c | 102 +- 2 files changed, 60 insertions(+), 52 deletions(-) diff --git a/include/net/ip6_f

[PATCH net-next 0/5] ipv6: Stop /128 route from disappearing after pmtu update

2015-04-28 Thread Martin KaFai Lau
The series is separated from another patch series, 'ipv6: Only create RTF_CACHE route after encountering pmtu exception', which can be found here: http://thread.gmane.org/gmane.linux.network/359140 This series focuses on fixing the /128 route issues. It is currently targeted for net-next due to the

[PATCH net-next 0/6 v2] ipv6: Only create RTF_CACHE route after encountering pmtu exception

2015-04-28 Thread Martin KaFai Lau
v1 -> v2: - Move the /128 route bug fixes to another series (posted). - Create a function for checking (rt6i_flags & (RTF_NONEXTHOP | RTF_GATEWAY)). - Avoid shuffling the skb network_header. Instead, change the function signature to take iph instead of skb. The perf numbers do not change much s
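
The v2 note about creating a function for the (rt6i_flags & (RTF_NONEXTHOP | RTF_GATEWAY)) check describes a small refactor: a condition tested in more than one place is given a single named helper. A minimal sketch of that idea follows. The helper name `ex_flags_match()` is hypothetical, and the `EX_` flag values are illustrative placeholders rather than the definitions from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits; assumed values, not taken from kernel headers. */
#define EX_RTF_GATEWAY   0x00000002u
#define EX_RTF_NONEXTHOP 0x00200000u

static_assert((EX_RTF_GATEWAY & EX_RTF_NONEXTHOP) == 0,
              "the two flag bits must not overlap");

/*
 * Single home for the repeated test: true when either flag is set.
 * Callers use this instead of open-coding the mask each time.
 */
static bool ex_flags_match(uint32_t flags)
{
    return (flags & (EX_RTF_NONEXTHOP | EX_RTF_GATEWAY)) != 0;
}
```

Keeping the mask in one helper means a later change to the condition touches one function instead of every call site.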

[PATCH net-next 3/6] ipv6: Combine rt6_alloc_cow and rt6_alloc_clone

2015-04-28 Thread Martin KaFai Lau
A prep work for creating RTF_CACHE on exception only. After this patch, the same condition (rt->rt6i_flags & (RTF_NONEXTHOP | RTF_GATEWAY)) is checked twice. This redundancy will be removed in the later patch. Signed-off-by: Martin KaFai Lau Reviewed-by: Hannes Frederic Sowa Cc:
