Re: Multitude of dst obsolescense race conditions

2014-05-14 Thread dormando
> On Wed, May 14, 2014, at 2:57, dormando wrote: > > Given a machine with frequently changing routes (ie; a router with an > > active internet BGP table and multiple interfaces), there're at least > > several places where obsolete dst's are handled improperly. If I pause >

Re: Multitude of dst obsolescense race conditions

2014-05-14 Thread dormando
> On Wed, 2014-05-14 at 02:57 -0700, dormando wrote: > > Hi, > > > > Given a machine with frequently changing routes (ie; a router with an > > active internet BGP table and multiple interfaces), there're at least > > several places where obsolete dst's a

Multitude of dst obsolescense race conditions

2014-05-14 Thread dormando
23139.585250] [] tcp_write_timer_handler+0xa0/0x1d0 <4>[14723139.585314] [] ? tcp_write_timer_handler+0x1d0/0x1d0 <4>[14723139.585378] [] tcp_write_timer+0x60/0x70 <4>[14723139.585443] [] call_timer_fn+0x3b/0x150 <4>[14723139.585507] [] ? do_IRQ+0x63/0xe0 <4>[14

Multitude of dst obsolescense race conditions

2014-05-14 Thread dormando
and I have to stop for now. Anyway this sucks, please help! thanks! -Dormando -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ

Re: Multitude of dst obsolescense race conditions

2014-05-14 Thread dormando
On Wed, 2014-05-14 at 02:57 -0700, dormando wrote: Hi, Given a machine with frequently changing routes (ie; a router with an active internet BGP table and multiple interfaces), there're at least several places where obsolete dst's are handled improperly. If I pause the route changes

Re: Multitude of dst obsolescense race conditions

2014-05-14 Thread dormando
On Wed, May 14, 2014, at 2:57, dormando wrote: Given a machine with frequently changing routes (ie; a router with an active internet BGP table and multiple interfaces), there're at least several places where obsolete dst's are handled improperly. If I pause the route changes

Re: ipv4_dst_destroy panic regression after 3.10.15

2014-05-13 Thread dormando
On Wed, 22 Jan 2014, Alexei Starovoitov wrote: > On Tue, Jan 21, 2014 at 10:02 PM, Alexei Starovoitov > wrote: > > On Tue, Jan 21, 2014 at 8:10 PM, dormando wrote: > >> > >> > >> On Tue, 21 Jan 2014, Alexei Starovoitov wrote: > >> > >

Re: ipv4_dst_destroy panic regression after 3.10.15

2014-05-13 Thread dormando
On Wed, 22 Jan 2014, Alexei Starovoitov wrote: On Tue, Jan 21, 2014 at 10:02 PM, Alexei Starovoitov alexei.starovoi...@gmail.com wrote: On Tue, Jan 21, 2014 at 8:10 PM, dormando dorma...@rydia.net wrote: On Tue, 21 Jan 2014, Alexei Starovoitov wrote: On Tue, Jan 21, 2014 at 5:39 PM

Re: kmem_cache_alloc panic in 3.10+

2014-01-31 Thread dormando
On Fri, 31 Jan 2014, David Rientjes wrote: > On Fri, 31 Jan 2014, dormando wrote: > > > > CONFIG_SLUB_DEBUG_ON will definitely be slower but can help to identify > > > any possible corruption issues. > > > > > > I'm wondering if you have CONFIG_MEMCG enab

Re: kmem_cache_alloc panic in 3.10+

2014-01-31 Thread dormando
On Fri, 31 Jan 2014, David Rientjes wrote: On Fri, 31 Jan 2014, dormando wrote: CONFIG_SLUB_DEBUG_ON will definitely be slower but can help to identify any possible corruption issues. I'm wondering if you have CONFIG_MEMCG enabled and are actually allocating slab in a non-root

Re: kmem_cache_alloc panic in 3.10+

2014-01-30 Thread dormando
> On Thu, Jan 30, 2014 at 6:16 PM, Eric Dumazet wrote: > > On Wed, 2014-01-29 at 23:05 -0800, dormando wrote: > > > >> We hit the routing code fairly hard. Any hints for what to look at or how > >> to instrument it? Or if it's fixed already? It's a real pain t

Re: kmem_cache_alloc panic in 3.10+

2014-01-30 Thread dormando
On Thu, Jan 30, 2014 at 6:16 PM, Eric Dumazet eric.duma...@gmail.com wrote: On Wed, 2014-01-29 at 23:05 -0800, dormando wrote: We hit the routing code fairly hard. Any hints for what to look at or how to instrument it? Or if it's fixed already? It's a real pain to iterate since it takes

Re: kmem_cache_alloc panic in 3.10+

2014-01-29 Thread dormando
> > On Sat, 2014-01-18 at 00:44 -0800, dormando wrote: > > > Hello again! > > > > > > We've had a rare crash that's existed between 3.10.0 and 3.10.15 at least > > > (trying newer stables now, but I can't tell if it was fixed, and it takes > > > w

Re: kmem_cache_alloc panic in 3.10+

2014-01-29 Thread dormando
On Sat, 2014-01-18 at 00:44 -0800, dormando wrote: Hello again! We've had a rare crash that's existed between 3.10.0 and 3.10.15 at least (trying newer stables now, but I can't tell if it was fixed, and it takes weeks to reproduce). Unfortunately I can only get 8k back from

Re: ipv4_dst_destroy panic regression after 3.10.15

2014-01-21 Thread dormando
On Tue, 21 Jan 2014, Alexei Starovoitov wrote: > On Tue, Jan 21, 2014 at 5:39 PM, dormando wrote: > > > > > On Fri, Jan 17, 2014 at 11:16 PM, dormando wrote: > > > >> On Fri, 2014-01-17 at 22:49 -0800, Eric Dumazet wrote: > > > >> >

Re: ipv4_dst_destroy panic regression after 3.10.15

2014-01-21 Thread dormando
> On Fri, Jan 17, 2014 at 11:16 PM, dormando wrote: > >> On Fri, 2014-01-17 at 22:49 -0800, Eric Dumazet wrote: > >> > On Fri, 2014-01-17 at 17:25 -0800, dormando wrote: > >> > > Hi, > >> > > > >> > > Upgraded a few

Re: ipv4_dst_destroy panic regression after 3.10.15

2014-01-21 Thread dormando
On Fri, Jan 17, 2014 at 11:16 PM, dormando dorma...@rydia.net wrote: On Fri, 2014-01-17 at 22:49 -0800, Eric Dumazet wrote: On Fri, 2014-01-17 at 17:25 -0800, dormando wrote: Hi, Upgraded a few kernels to the latest 3.10 stable tree while tracking down a rare kernel

Re: ipv4_dst_destroy panic regression after 3.10.15

2014-01-21 Thread dormando
On Tue, 21 Jan 2014, Alexei Starovoitov wrote: On Tue, Jan 21, 2014 at 5:39 PM, dormando dorma...@rydia.net wrote: On Fri, Jan 17, 2014 at 11:16 PM, dormando dorma...@rydia.net wrote: On Fri, 2014-01-17 at 22:49 -0800, Eric Dumazet wrote: On Fri, 2014-01-17 at 17:25 -0800

Re: ipv4_dst_destroy panic regression after 3.10.15

2014-01-19 Thread dormando
> On Fri, 2014-01-17 at 17:25 -0800, dormando wrote: > > Hi, > > > > Upgraded a few kernels to the latest 3.10 stable tree while tracking down > > a rare kernel panic, seems to have introduced a much more frequent kernel > > panic. Takes anywhere from 4 hours

Re: ipv4_dst_destroy panic regression after 3.10.15

2014-01-19 Thread dormando
On Fri, 2014-01-17 at 17:25 -0800, dormando wrote: Hi, Upgraded a few kernels to the latest 3.10 stable tree while tracking down a rare kernel panic, seems to have introduced a much more frequent kernel panic. Takes anywhere from 4 hours to 2 days to trigger: <4>[196727.311203

Re: ipv4_dst_destroy panic regression after 3.10.15

2014-01-18 Thread dormando
> On Fri, Jan 17, 2014 at 11:16 PM, dormando wrote: > >> On Fri, 2014-01-17 at 22:49 -0800, Eric Dumazet wrote: > >> > On Fri, 2014-01-17 at 17:25 -0800, dormando wrote: > >> > > Hi, > >> > > > >> > > Upgraded a few

Re: kmem_cache_alloc panic in 3.10+

2014-01-18 Thread dormando
> On Sat, 2014-01-18 at 00:44 -0800, dormando wrote: > > Hello again! > > > > We've had a rare crash that's existed between 3.10.0 and 3.10.15 at least > > (trying newer stables now, but I can't tell if it was fixed, and it takes > > weeks to reproduce). > >

kmem_cache_alloc panic in 3.10+

2014-01-18 Thread dormando
49 8b 3c 24 <49> 8b 5c 05 00 48 8d 4a 01 4c 89 e8 65 48 0f c7 0f 0f 94 c0 3c <1>[1197485.264417] RIP [] kmem_cache_alloc+0x5a/0x130 <4>[1197485.264424] RSP <4>[1197485.264427] CR2: 0001 <4>[1197485.264431] ---[ end trace 90fee06aa40b7305 ]--- <0>

kmem_cache_alloc panic in 3.10+

2014-01-18 Thread dormando
of time it takes to reproduce. Since we have a large number of machines they're always crashing here and there, but once they do it's not going to happen again for a while. Thanks! -Dormando

Re: kmem_cache_alloc panic in 3.10+

2014-01-18 Thread dormando
On Sat, 2014-01-18 at 00:44 -0800, dormando wrote: Hello again! We've had a rare crash that's existed between 3.10.0 and 3.10.15 at least (trying newer stables now, but I can't tell if it was fixed, and it takes weeks to reproduce). Unfortunately I can only get 8k back from pstore

Re: ipv4_dst_destroy panic regression after 3.10.15

2014-01-18 Thread dormando
On Fri, Jan 17, 2014 at 11:16 PM, dormando dorma...@rydia.net wrote: On Fri, 2014-01-17 at 22:49 -0800, Eric Dumazet wrote: On Fri, 2014-01-17 at 17:25 -0800, dormando wrote: Hi, Upgraded a few kernels to the latest 3.10 stable tree while tracking down a rare kernel

Re: ipv4_dst_destroy panic regression after 3.10.15

2014-01-17 Thread dormando
> On Fri, 2014-01-17 at 22:49 -0800, Eric Dumazet wrote: > > On Fri, 2014-01-17 at 17:25 -0800, dormando wrote: > > > Hi, > > > > > > Upgraded a few kernels to the latest 3.10 stable tree while tracking down > > > a rare kernel panic, seems t

ipv4_dst_destroy panic regression after 3.10.15

2014-01-17 Thread dormando
00 48 be 00 01 10 00 00 00 ad de <48> 89 42 08 48 89 10 48 89 bb b8 00 00 00 48 c7 c7 4a 9f e9 81 <1>[196727.313071] RIP [] ipv4_dst_destroy+0x4f/0x80 <4>[196727.313100] RSP <4>[196727.313377] ---[ end trace 64b3f14fae0f2e29 ]--- <0>[196727.380908] Kernel panic - not syn

ipv4_dst_destroy panic regression after 3.10.15

2014-01-17 Thread dormando
the change exists between .15 and .25. Thanks, -Dormando

Re: ipv4_dst_destroy panic regression after 3.10.15

2014-01-17 Thread dormando
On Fri, 2014-01-17 at 22:49 -0800, Eric Dumazet wrote: On Fri, 2014-01-17 at 17:25 -0800, dormando wrote: Hi, Upgraded a few kernels to the latest 3.10 stable tree while tracking down a rare kernel panic, seems to have introduced a much more frequent kernel panic. Takes anywhere

Re: ipv4: warnings on sk_wmem_queued

2013-08-31 Thread dormando
> I noticed these warnings on stock 3.10.9 running stress tests on > cmogstored.git (git://bogomips.org/cmogstored.git) doing standard > HTTP server stuff between lo and tmpfs: > [...] > I was going to reboot into 3.10.10 before I looked at dmesg. These > warnings happened after ~8 hours of

Re: ipv4: warnings on sk_wmem_queued

2013-08-31 Thread dormando
I noticed these warnings on stock 3.10.9 running stress tests on cmogstored.git (git://bogomips.org/cmogstored.git) doing standard HTTP server stuff between lo and tmpfs: [...] I was going to reboot into 3.10.10 before I looked at dmesg. These warnings happened after ~8 hours of stress

Re: [PATCH 0/10] Reduce system disruption due to kswapd V2

2013-04-10 Thread dormando
> On Tue, Apr 09, 2013 at 05:27:18PM +, Christoph Lameter wrote: > > One additional measure that may be useful is to make kswapd prefer one > > specific processor on a socket. Two benefits arise from that: > > > > 1. Better use of cpu caches and therefore higher speed, less > > serialization.

Re: [PATCH 0/10] Reduce system disruption due to kswapd V2

2013-04-10 Thread dormando
On Tue, Apr 09, 2013 at 05:27:18PM +, Christoph Lameter wrote: One additional measure that may be useful is to make kswapd prefer one specific processor on a socket. Two benefits arise from that: 1. Better use of cpu caches and therefore higher speed, less serialization.

Re: BUG: IPv4: Attempt to release TCP socket in state 1

2013-03-17 Thread dormando
> On Sat, 2013-03-16 at 10:36 -0700, Eric Dumazet wrote: > > On Fri, 2013-03-15 at 00:19 +0100, Eric Dumazet wrote: > > > > > Thanks thats really useful, we might miss to increment socket refcount > > > in a timer setup. > > > > > > > Hmm, please add following debugging patch as well > > > > diff

Re: BUG: IPv4: Attempt to release TCP socket in state 1

2013-03-17 Thread dormando
On Sat, 2013-03-16 at 10:36 -0700, Eric Dumazet wrote: On Fri, 2013-03-15 at 00:19 +0100, Eric Dumazet wrote: Thanks thats really useful, we might miss to increment socket refcount in a timer setup. Hmm, please add following debugging patch as well diff --git

Re: BUG: IPv4: Attempt to release TCP socket in state 1

2013-03-16 Thread dormando
> On Sat, 2013-03-16 at 10:36 -0700, Eric Dumazet wrote: > > On Fri, 2013-03-15 at 00:19 +0100, Eric Dumazet wrote: > > > > > Thanks thats really useful, we might miss to increment socket refcount > > > in a timer setup. > > > > > > > Hmm, please add following debugging patch as well > > > > diff

Re: BUG: IPv4: Attempt to release TCP socket in state 1

2013-03-16 Thread dormando
On Sat, 2013-03-16 at 10:36 -0700, Eric Dumazet wrote: On Fri, 2013-03-15 at 00:19 +0100, Eric Dumazet wrote: Thanks thats really useful, we might miss to increment socket refcount in a timer setup. Hmm, please add following debugging patch as well diff --git

Re: BUG: IPv4: Attempt to release TCP socket in state 1

2013-03-14 Thread dormando
> On Thu, 2013-03-14 at 14:21 -0700, dormando wrote: > > > > > > diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c > > > index 68f6a94..1d4d97e 100644 > > > --- a/net/ipv4/af_inet.c > > > +++ b/net/ipv4/af_inet.c > > > @@

Re: BUG: IPv4: Attempt to release TCP socket in state 1

2013-03-14 Thread dormando
On Thu, 2013-03-14 at 14:21 -0700, dormando wrote: diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index 68f6a94..1d4d97e 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -141,8 +141,9 @@ void inet_sock_destruct(struct sock *sk) sk_mem_reclaim(sk

Re: BUG: IPv4: Attempt to release TCP socket in state 1

2013-03-07 Thread dormando
> On Wed, 2013-03-06 at 16:41 -0800, dormando wrote: > > > Ok... bridge module is loaded but nothing seems to be using it. No > > bond/tunnels/anything enabled. I couldn't quickly figure out what was > > causing it to load. > > > > We removed the need for ma

Re: BUG: IPv4: Attempt to release TCP socket in state 1

2013-03-07 Thread dormando
On Wed, 2013-03-06 at 16:41 -0800, dormando wrote: Ok... bridge module is loaded but nothing seems to be using it. No bond/tunnels/anything enabled. I couldn't quickly figure out what was causing it to load. We removed the need for macvlan, started machines with a fresh boot

Re: BUG: IPv4: Attempt to release TCP socket in state 1

2013-03-06 Thread dormando
> On Mon, 2013-03-04 at 21:44 -0800, dormando wrote: > > > No 3rd party modules. There's a tiny patch for controlling initcwnd from > > userspace and another one for the extra_free_kbytes tunable that I brought > > up in another thread. We've had the initcwnd patch in for

Re: BUG: IPv4: Attempt to release TCP socket in state 1

2013-03-06 Thread dormando
On Mon, 2013-03-04 at 21:44 -0800, dormando wrote: No 3rd party modules. There's a tiny patch for controlling initcwnd from userspace and another one for the extra_free_kbytes tunable that I brought up in another thread. We've had the initcwnd patch in for a long time without trouble

Re: BUG: IPv4: Attempt to release TCP socket in state 1

2013-03-04 Thread dormando
On Mon, 4 Mar 2013, Eric Dumazet wrote: > On Tue, 2013-03-05 at 11:47 +0800, Cong Wang wrote: > > (Cc'ing the right netdev mailing list...) > > > > On 03/05/2013 08:01 AM, dormando wrote: > > > Hi! > > > > > > I have a (core lockup?) with 3.7.6+

BUG: IPv4: Attempt to release TCP socket in state 1

2013-03-04 Thread dormando
just does this until someone reboots the box. Apologies for the ugly paste. Thanks, -Dormando

BUG: IPv4: Attempt to release TCP socket in state 1

2013-03-04 Thread dormando
/0x1de ... then swapped just does this until someone reboots the box. Apologies for the ugly paste. Thanks, -Dormando

Re: BUG: IPv4: Attempt to release TCP socket in state 1

2013-03-04 Thread dormando
On Mon, 4 Mar 2013, Eric Dumazet wrote: On Tue, 2013-03-05 at 11:47 +0800, Cong Wang wrote: (Cc'ing the right netdev mailing list...) On 03/05/2013 08:01 AM, dormando wrote: Hi! I have a (core lockup?) with 3.7.6+ and 3.8.2 which appears to be under ixgbe. The machine appears

Re: [PATCH] add extra free kbytes tunable

2013-02-19 Thread dormando
> > The problem is that adding this tunable will constrain future VM > implementations. We will forever need to at least retain the > pseudo-file. We will also need to make some effort to retain its > behaviour. > > It would of course be better to fix things so you don't need to tweak > VM

Re: [PATCH] add extra free kbytes tunable

2013-02-19 Thread dormando
The problem is that adding this tunable will constrain future VM implementations. We will forever need to at least retain the pseudo-file. We will also need to make some effort to retain its behaviour. It would of course be better to fix things so you don't need to tweak VM internals to

Re: extra free kbytes tunable

2013-02-17 Thread dormando
x-kernel-ow...@vger.kernel.org > > > [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of dormando > > > Sent: Monday, February 11, 2013 9:01 PM > > > To: Rik van Riel > > > Cc: Randy Dunlap; Satoru Moriya; linux-kernel@vger.kernel.org; > > > linux...@kvack.org;

[PATCH] add extra free kbytes tunable

2013-02-17 Thread dormando
From: Rik van Riel Add a userspace visible knob to tell the VM to keep an extra amount of memory free, by increasing the gap between each zone's min and low watermarks. This is useful for realtime applications that call system calls and have a bound on the number of allocations that happen in

[PATCH] add extra free kbytes tunable

2013-02-17 Thread dormando
From: Rik van Riel r...@redhat.com Add a userspace visible knob to tell the VM to keep an extra amount of memory free, by increasing the gap between each zone's min and low watermarks. This is useful for realtime applications that call system calls and have a bound on the number of allocations

Re: extra free kbytes tunable

2013-02-17 Thread dormando
Of dormando Sent: Monday, February 11, 2013 9:01 PM To: Rik van Riel Cc: Randy Dunlap; Satoru Moriya; linux-kernel@vger.kernel.org; linux...@kvack.org; lwood...@redhat.com; Seiji Aguchi; a...@linux-foundation.org; hu...@google.com Subject: extra free kbytes tunable Hi

extra free kbytes tunable

2013-02-11 Thread dormando
cut it down as much as I could for this mail. Thanks, -Dormando

extra free kbytes tunable

2013-02-11 Thread dormando
for this mail. Thanks, -Dormando