Re: Cache not being reclaimed?

2007-09-05 Thread Ian Kumlien
On ons, 2007-09-05 at 05:45 -0700, Andrew Morton wrote:
> > On Wed, 05 Sep 2007 03:28:07 +0200 Ian Kumlien <[EMAIL PROTECTED]> wrote:
> > Hi, 
> > 
> > I have just had a quite unexpected 'low memory situation'...
> > 
> > This is a AMD64 machine with 2 gig memory, running 64 bit userland.
> > 
> > Kernel: 2.6.23-rc3-git10, updating to -rc5-* as soon as i can.
> > I'm using SLUB:s
> > 
> > 
> > To me, this looks odd... I thought that any cached memory would be
> > reclamed but it was always full.
> > 
> > Ideas?
> > 
> > One example from dmesg:
> > swapper: page allocation failure. order:1, mode:0x4020
> > 
> > Call Trace:
> >[] __alloc_pages+0x30f/0x330
> >  [] __slab_alloc+0x141/0x590
> >  [] __netdev_alloc_skb+0x17/0x40
> >  [] __netdev_alloc_skb+0x17/0x40
> >  [] __kmalloc_track_caller+0xa0/0xc0
> >  [] __alloc_skb+0x6f/0x150
> >  [] __netdev_alloc_skb+0x17/0x40
> >  [] :sky2:sky2_rx_alloc+0x25/0xf0
> >  [] :sky2:sky2_poll+0x6dc/0xcf0
> >  [] tcp_delack_timer+0x0/0x210
> >  [] net_rx_action+0x8a/0x140
> >  [] __do_softirq+0x69/0xe0
> >  [] call_softirq+0x1c/0x30
> >  [] do_softirq+0x35/0x90
> >  [] do_IRQ+0x80/0x100
> >  [] default_idle+0x0/0x40
> >  [] ret_from_intr+0x0/0xa
> >[] default_idle+0x29/0x40
> >  [] cpu_idle+0xa1/0xf0
> > 
> 
> An order-1 GFP_ATOMIC allocation can fail, and networking should recover
> from it.

Well, this isn't only networking, It started with all the apps running
and ended up with a pretty basic desktop with almost nothing running...
(due to continued freezes that caused me to shut down more and more
programs)

> If this is happening a lot then someting might have been broken.  Do you
> have reason to believe that the frequency of this happening has inreased?

I have never, to my knowledge, had this happen before... 

I just happened to start a few downloads with rtorrent and watched the
machine slow down to a crawl... All this with over a gig in cache.

The machine was actually deadlocked for almost a minute at one time.

Top memory usage:
  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
  
 2395 pomac 20   0  440m 289m 238m S2 14.4  32:56.54 rtorrent
21647 root  20   0  203m 120m  10m S1  6.0 569:45.31 X  
  
 2351 pomac 20   0  170m 111m  53m S0  5.5  15:43.66 rtorrent  

At peak time, one of the rtorrent processes consumed more, but i still
had 1.x gig as cache, which imho should have been reclaimed.

vmstat now:
procs ---memory-- ---swap-- -io -system-- cpu
 r  b   swpd   free   buff  cache   si   sobibo   in   cs us sy id wa
 0  0 365204  16628  30644 144531621   36292   18   21  4  2 91  3

PS. I have a dmesg dump from the incident, it's not long enough to
contain all but it could be seen as a snapshot...
DS.

-- 
Ian Kumlien  -- http://pomac.netswarm.net


signature.asc
Description: This is a digitally signed message part


Re: Cache not being reclaimed?

2007-09-05 Thread Andrew Morton
> On Wed, 05 Sep 2007 03:28:07 +0200 Ian Kumlien <[EMAIL PROTECTED]> wrote:
> Hi, 
> 
> I have just had a quite unexpected 'low memory situation'...
> 
> This is a AMD64 machine with 2 gig memory, running 64 bit userland.
> 
> Kernel: 2.6.23-rc3-git10, updating to -rc5-* as soon as i can.
> I'm using SLUB:s
> 
> 
> To me, this looks odd... I thought that any cached memory would be
> reclamed but it was always full.
> 
> Ideas?
> 
> One example from dmesg:
> swapper: page allocation failure. order:1, mode:0x4020
> 
> Call Trace:
>[] __alloc_pages+0x30f/0x330
>  [] __slab_alloc+0x141/0x590
>  [] __netdev_alloc_skb+0x17/0x40
>  [] __netdev_alloc_skb+0x17/0x40
>  [] __kmalloc_track_caller+0xa0/0xc0
>  [] __alloc_skb+0x6f/0x150
>  [] __netdev_alloc_skb+0x17/0x40
>  [] :sky2:sky2_rx_alloc+0x25/0xf0
>  [] :sky2:sky2_poll+0x6dc/0xcf0
>  [] tcp_delack_timer+0x0/0x210
>  [] net_rx_action+0x8a/0x140
>  [] __do_softirq+0x69/0xe0
>  [] call_softirq+0x1c/0x30
>  [] do_softirq+0x35/0x90
>  [] do_IRQ+0x80/0x100
>  [] default_idle+0x0/0x40
>  [] ret_from_intr+0x0/0xa
>[] default_idle+0x29/0x40
>  [] cpu_idle+0xa1/0xf0
> 

An order-1 GFP_ATOMIC allocation can fail, and networking should recover
from it.

If this is happening a lot then someting might have been broken.  Do you
have reason to believe that the frequency of this happening has inreased?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Cache not being reclaimed?

2007-09-05 Thread Andrew Morton
 On Wed, 05 Sep 2007 03:28:07 +0200 Ian Kumlien [EMAIL PROTECTED] wrote:
 Hi, 
 
 I have just had a quite unexpected 'low memory situation'...
 
 This is a AMD64 machine with 2 gig memory, running 64 bit userland.
 
 Kernel: 2.6.23-rc3-git10, updating to -rc5-* as soon as i can.
 I'm using SLUB:s
 
 
 To me, this looks odd... I thought that any cached memory would be
 reclamed but it was always full.
 
 Ideas?
 
 One example from dmesg:
 swapper: page allocation failure. order:1, mode:0x4020
 
 Call Trace:
  IRQ  [8026c7ef] __alloc_pages+0x30f/0x330
  [8028a0a1] __slab_alloc+0x141/0x590
  [805a5937] __netdev_alloc_skb+0x17/0x40
  [805a5937] __netdev_alloc_skb+0x17/0x40
  [8028b470] __kmalloc_track_caller+0xa0/0xc0
  [805a4b3f] __alloc_skb+0x6f/0x150
  [805a5937] __netdev_alloc_skb+0x17/0x40
  [88010945] :sky2:sky2_rx_alloc+0x25/0xf0
  [88013b0c] :sky2:sky2_poll+0x6dc/0xcf0
  [805e5f60] tcp_delack_timer+0x0/0x210
  [805ac38a] net_rx_action+0x8a/0x140
  [80242ac9] __do_softirq+0x69/0xe0
  [8020cd9c] call_softirq+0x1c/0x30
  [8020eb75] do_softirq+0x35/0x90
  [8020ede0] do_IRQ+0x80/0x100
  [8020ad00] default_idle+0x0/0x40
  [8020c121] ret_from_intr+0x0/0xa
  EOI  [8020ad29] default_idle+0x29/0x40
  [8020ade1] cpu_idle+0xa1/0xf0
 

An order-1 GFP_ATOMIC allocation can fail, and networking should recover
from it.

If this is happening a lot then someting might have been broken.  Do you
have reason to believe that the frequency of this happening has inreased?
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Cache not being reclaimed?

2007-09-05 Thread Ian Kumlien
On ons, 2007-09-05 at 05:45 -0700, Andrew Morton wrote:
  On Wed, 05 Sep 2007 03:28:07 +0200 Ian Kumlien [EMAIL PROTECTED] wrote:
  Hi, 
  
  I have just had a quite unexpected 'low memory situation'...
  
  This is a AMD64 machine with 2 gig memory, running 64 bit userland.
  
  Kernel: 2.6.23-rc3-git10, updating to -rc5-* as soon as i can.
  I'm using SLUB:s
  
  
  To me, this looks odd... I thought that any cached memory would be
  reclamed but it was always full.
  
  Ideas?
  
  One example from dmesg:
  swapper: page allocation failure. order:1, mode:0x4020
  
  Call Trace:
   IRQ  [8026c7ef] __alloc_pages+0x30f/0x330
   [8028a0a1] __slab_alloc+0x141/0x590
   [805a5937] __netdev_alloc_skb+0x17/0x40
   [805a5937] __netdev_alloc_skb+0x17/0x40
   [8028b470] __kmalloc_track_caller+0xa0/0xc0
   [805a4b3f] __alloc_skb+0x6f/0x150
   [805a5937] __netdev_alloc_skb+0x17/0x40
   [88010945] :sky2:sky2_rx_alloc+0x25/0xf0
   [88013b0c] :sky2:sky2_poll+0x6dc/0xcf0
   [805e5f60] tcp_delack_timer+0x0/0x210
   [805ac38a] net_rx_action+0x8a/0x140
   [80242ac9] __do_softirq+0x69/0xe0
   [8020cd9c] call_softirq+0x1c/0x30
   [8020eb75] do_softirq+0x35/0x90
   [8020ede0] do_IRQ+0x80/0x100
   [8020ad00] default_idle+0x0/0x40
   [8020c121] ret_from_intr+0x0/0xa
   EOI  [8020ad29] default_idle+0x29/0x40
   [8020ade1] cpu_idle+0xa1/0xf0
  
 
 An order-1 GFP_ATOMIC allocation can fail, and networking should recover
 from it.

Well, this isn't only networking, It started with all the apps running
and ended up with a pretty basic desktop with almost nothing running...
(due to continued freezes that caused me to shut down more and more
programs)

 If this is happening a lot then someting might have been broken.  Do you
 have reason to believe that the frequency of this happening has inreased?

I have never, to my knowledge, had this happen before... 

I just happened to start a few downloads with rtorrent and watched the
machine slow down to a crawl... All this with over a gig in cache.

The machine was actually deadlocked for almost a minute at one time.

Top memory usage:
  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
  
 2395 pomac 20   0  440m 289m 238m S2 14.4  32:56.54 rtorrent
21647 root  20   0  203m 120m  10m S1  6.0 569:45.31 X  
  
 2351 pomac 20   0  170m 111m  53m S0  5.5  15:43.66 rtorrent  

At peak time, one of the rtorrent processes consumed more, but i still
had 1.x gig as cache, which imho should have been reclaimed.

vmstat now:
procs ---memory-- ---swap-- -io -system-- cpu
 r  b   swpd   free   buff  cache   si   sobibo   in   cs us sy id wa
 0  0 365204  16628  30644 144531621   36292   18   21  4  2 91  3

PS. I have a dmesg dump from the incident, it's not long enough to
contain all but it could be seen as a snapshot...
DS.

-- 
Ian Kumlien pomac () vapor ! com -- http://pomac.netswarm.net


signature.asc
Description: This is a digitally signed message part


Cache not being reclaimed?

2007-09-04 Thread Ian Kumlien
Hi, 

I have just had a quite unexpected 'low memory situation'...

This is a AMD64 machine with 2 gig memory, running 64 bit userland.

Kernel: 2.6.23-rc3-git10, updating to -rc5-* as soon as i can.
I'm using SLUB:s


To me, this looks odd... I thought that any cached memory would be
reclamed but it was always full.

Ideas?

One example from dmesg:
swapper: page allocation failure. order:1, mode:0x4020

Call Trace:
   [] __alloc_pages+0x30f/0x330
 [] __slab_alloc+0x141/0x590
 [] __netdev_alloc_skb+0x17/0x40
 [] __netdev_alloc_skb+0x17/0x40
 [] __kmalloc_track_caller+0xa0/0xc0
 [] __alloc_skb+0x6f/0x150
 [] __netdev_alloc_skb+0x17/0x40
 [] :sky2:sky2_rx_alloc+0x25/0xf0
 [] :sky2:sky2_poll+0x6dc/0xcf0
 [] tcp_delack_timer+0x0/0x210
 [] net_rx_action+0x8a/0x140
 [] __do_softirq+0x69/0xe0
 [] call_softirq+0x1c/0x30
 [] do_softirq+0x35/0x90
 [] do_IRQ+0x80/0x100
 [] default_idle+0x0/0x40
 [] ret_from_intr+0x0/0xa
   [] default_idle+0x29/0x40
 [] cpu_idle+0xa1/0xf0

Mem-info:
DMA per-cpu:
CPU0: Hot: hi:0, btch:   1 usd:   0   Cold: hi:0, btch:   1
usd:   0
CPU1: Hot: hi:0, btch:   1 usd:   0   Cold: hi:0, btch:   1
usd:   0
DMA32 per-cpu:
CPU0: Hot: hi:  186, btch:  31 usd: 163   Cold: hi:   62, btch:  15
usd:  56
CPU1: Hot: hi:  186, btch:  31 usd:  33   Cold: hi:   62, btch:  15
usd:  60
Active:348343 inactive:122950 dirty:13504 writeback:0 unstable:0
 free:2665 slab:21427 mapped:243884 pagetables:4816 bounce:0
DMA free:8020kB min:20kB low:24kB high:28kB active:16kB inactive:0kB
present:7636kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 2003 2003 2003
DMA32 free:2640kB min:5716kB low:7144kB high:8572kB active:1393356kB
inactive:491800kB present:2052008kB pages_scanned:22 all_unreclaimable?
no
lowmem_reserve[]: 0 0 0 0
DMA: 1*4kB 0*8kB 1*16kB 2*32kB 4*64kB 2*128kB 3*256kB 1*512kB 0*1024kB
1*2048kB 1*4096kB = 8020kB
DMA32: 400*4kB 0*8kB 1*16kB 0*32kB 2*64kB 1*128kB 1*256kB 1*512kB
0*1024kB 0*2048kB 0*4096kB = 2640kB
Swap cache: add 985117, delete 960396, find 102684/214435, race 0+193
Free swap  = 2136272kB
Total swap = 2530180kB
Free swap:   2136272kB
524208 pages of RAM
10098 reserved pages
588916 pages shared
24719 pages swap cached

vmstat
procs ---memory-- ---swap-- -io -system--
cpu
 r  b   swpd   free   buff  cache   si   sobibo   in   cs us sy
id wa
 0  1 393904  16108  13788 158337221   2296798  4  2
92  2


-- 
Ian Kumlien  -- http://pomac.netswarm.net


signature.asc
Description: This is a digitally signed message part


Cache not being reclaimed?

2007-09-04 Thread Ian Kumlien
Hi, 

I have just had a quite unexpected 'low memory situation'...

This is a AMD64 machine with 2 gig memory, running 64 bit userland.

Kernel: 2.6.23-rc3-git10, updating to -rc5-* as soon as i can.
I'm using SLUB:s


To me, this looks odd... I thought that any cached memory would be
reclamed but it was always full.

Ideas?

One example from dmesg:
swapper: page allocation failure. order:1, mode:0x4020

Call Trace:
 IRQ  [8026c7ef] __alloc_pages+0x30f/0x330
 [8028a0a1] __slab_alloc+0x141/0x590
 [805a5937] __netdev_alloc_skb+0x17/0x40
 [805a5937] __netdev_alloc_skb+0x17/0x40
 [8028b470] __kmalloc_track_caller+0xa0/0xc0
 [805a4b3f] __alloc_skb+0x6f/0x150
 [805a5937] __netdev_alloc_skb+0x17/0x40
 [88010945] :sky2:sky2_rx_alloc+0x25/0xf0
 [88013b0c] :sky2:sky2_poll+0x6dc/0xcf0
 [805e5f60] tcp_delack_timer+0x0/0x210
 [805ac38a] net_rx_action+0x8a/0x140
 [80242ac9] __do_softirq+0x69/0xe0
 [8020cd9c] call_softirq+0x1c/0x30
 [8020eb75] do_softirq+0x35/0x90
 [8020ede0] do_IRQ+0x80/0x100
 [8020ad00] default_idle+0x0/0x40
 [8020c121] ret_from_intr+0x0/0xa
 EOI  [8020ad29] default_idle+0x29/0x40
 [8020ade1] cpu_idle+0xa1/0xf0

Mem-info:
DMA per-cpu:
CPU0: Hot: hi:0, btch:   1 usd:   0   Cold: hi:0, btch:   1
usd:   0
CPU1: Hot: hi:0, btch:   1 usd:   0   Cold: hi:0, btch:   1
usd:   0
DMA32 per-cpu:
CPU0: Hot: hi:  186, btch:  31 usd: 163   Cold: hi:   62, btch:  15
usd:  56
CPU1: Hot: hi:  186, btch:  31 usd:  33   Cold: hi:   62, btch:  15
usd:  60
Active:348343 inactive:122950 dirty:13504 writeback:0 unstable:0
 free:2665 slab:21427 mapped:243884 pagetables:4816 bounce:0
DMA free:8020kB min:20kB low:24kB high:28kB active:16kB inactive:0kB
present:7636kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 2003 2003 2003
DMA32 free:2640kB min:5716kB low:7144kB high:8572kB active:1393356kB
inactive:491800kB present:2052008kB pages_scanned:22 all_unreclaimable?
no
lowmem_reserve[]: 0 0 0 0
DMA: 1*4kB 0*8kB 1*16kB 2*32kB 4*64kB 2*128kB 3*256kB 1*512kB 0*1024kB
1*2048kB 1*4096kB = 8020kB
DMA32: 400*4kB 0*8kB 1*16kB 0*32kB 2*64kB 1*128kB 1*256kB 1*512kB
0*1024kB 0*2048kB 0*4096kB = 2640kB
Swap cache: add 985117, delete 960396, find 102684/214435, race 0+193
Free swap  = 2136272kB
Total swap = 2530180kB
Free swap:   2136272kB
524208 pages of RAM
10098 reserved pages
588916 pages shared
24719 pages swap cached

vmstat
procs ---memory-- ---swap-- -io -system--
cpu
 r  b   swpd   free   buff  cache   si   sobibo   in   cs us sy
id wa
 0  1 393904  16108  13788 158337221   2296798  4  2
92  2


-- 
Ian Kumlien pomac () vapor ! com -- http://pomac.netswarm.net


signature.asc
Description: This is a digitally signed message part