Re: kernel 2.6.22: what IS the VM doing?

2007-09-14 Thread Sami Farin
On Wed, Sep 05, 2007 at 18:48:51 -0400, Rik van Riel wrote:
> Sami Farin wrote:
>> On Wed, Sep 05, 2007 at 12:24:26 -0400, Rik van Riel wrote:
>> ...
>>>> *shrug*
>>> The attached patch should make sure kswapd does not free an
>>> excessive number of pages in zone_normal just because the
>>> pages in zone_highmem are difficult to free.
>>>
>>> It does give kswapd a large margin to continue putting equal
>>> pressure on all zones in normal situations.
>>>
>>> Sami, could you try out this patch to see if it helps your
>>> situation?
>>
>> Thanks, Rik.  bzImage is ready, I probably reboot inside one
>> month for a reason or other 8-)
>
> The more I look at the bug, the more I see that it is probably
> not very easy to reproduce on demand.  I have, however, a full

Well, I now booted to x86_64 kernel.

I can still reproduce this.
When I unload ipset modules, kernel resumes "normal" operation, i.e.,
not swapping like mad.

sysrq-m, normal operation:

[172074.989053] SysRq : Show Memory
[172074.989063] Mem-info:
[172074.989071] DMA per-cpu:
[172074.989078] CPU0: Hot: hi:    0, btch:   1 usd:   0   Cold: hi:    0, btch:   1 usd:   0
[172074.989083] CPU1: Hot: hi:    0, btch:   1 usd:   0   Cold: hi:    0, btch:   1 usd:   0
[172074.989089] DMA32 per-cpu:
[172074.989094] CPU0: Hot: hi:  186, btch:  31 usd: 153   Cold: hi:   62, btch:  15 usd:  10
[172074.989101] CPU1: Hot: hi:  186, btch:  31 usd:  34   Cold: hi:   62, btch:  15 usd:  14
[172074.989109] Active:123048 inactive:55194 dirty:5 writeback:0 unstable:0
[172074.989111]  free:15001 slab:27961 mapped:15063 pagetables:2996 bounce:0
[172074.989118] DMA free:5560kB min:32kB low:40kB high:48kB active:404kB inactive:736kB present:8620kB pages_scanned:0 all_unreclaimable? no
[172074.989124] lowmem_reserve[]: 0 968 968
[172074.989143] DMA32 free:54444kB min:3964kB low:4952kB high:5944kB active:491788kB inactive:220040kB present:991996kB pages_scanned:0 all_unreclaimable? no
[172074.989150] lowmem_reserve[]: 0 0 0
[172074.989166] DMA: 276*4kB 121*8kB 30*16kB 6*32kB 0*64kB 0*128kB 1*256kB 1*512kB 2*1024kB 0*2048kB 0*4096kB = 5560kB
[172074.989205] DMA32: 9467*4kB 1222*8kB 121*16kB 22*32kB 3*64kB 1*128kB 1*256kB 3*512kB 0*1024kB 1*2048kB 0*4096kB = 54444kB
[172074.989249] Swap cache: add 613353, delete 556659, find 441592/473681, race 0+5
[172074.989255] Free swap  = 2751640kB
[172074.989260] Total swap = 3911784kB
[172074.989265] Free swap:   2751640kB
[172074.993693] 255744 pages of RAM
[172074.993699] 6060 reserved pages
[172074.993702] 79933 pages shared
[172074.993706] 56719 pages swap cached

then it goes bad:

[172373.542933] net/ipv4/netfilter/ip_set_nethash.c: retry: rehashing of set blockedp2pnew triggered: hashsize grows from 262144 to 288358
[172373.554837] net/ipv4/netfilter/ip_set_nethash.c: retry: rehashing of set blockedp2pnew triggered: hashsize grows from 262144 to 317193
[172373.561167] net/ipv4/netfilter/ip_set_nethash.c: retry: rehashing of set blockedp2pnew triggered: hashsize grows from 262144 to 348912
[172373.569375] net/ipv4/netfilter/ip_set_nethash.c: retry: rehashing of set blockedp2pnew triggered: hashsize grows from 262144 to 383803
[172394.471570] SysRq : Show Memory
[172394.471580] Mem-info:
[172394.471583] DMA per-cpu:
[172394.471586] CPU0: Hot: hi:    0, btch:   1 usd:   0   Cold: hi:    0, btch:   1 usd:   0
[172394.471590] CPU1: Hot: hi:    0, btch:   1 usd:   0   Cold: hi:    0, btch:   1 usd:   0
[172394.471593] DMA32 per-cpu:
[172394.471597] CPU0: Hot: hi:  186, btch:  31 usd: 152   Cold: hi:   62, btch:  15 usd:  58
[172394.471601] CPU1: Hot: hi:  186, btch:  31 usd: 108   Cold: hi:   62, btch:  15 usd:  52
[172394.471606] Active:46934 inactive:23643 dirty:0 writeback:17112 unstable:0
[172394.471608]  free:133942 slab:16510 mapped:7826 pagetables:3004 bounce:0
[172394.471613] DMA free:8460kB min:32kB low:40kB high:48kB active:0kB inactive:0kB present:8620kB pages_scanned:0 all_unreclaimable? yes
[172394.471616] lowmem_reserve[]: 0 968 968
[172394.471623] DMA32 free:527308kB min:3964kB low:4952kB high:5944kB active:187736kB inactive:94572kB present:991996kB pages_scanned:92 all_unreclaimable? no
[172394.471627] lowmem_reserve[]: 0 0 0
[172394.471631] DMA: 154*4kB 133*8kB 78*16kB 29*32kB 12*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 8464kB
[172394.471644] DMA32: 47127*4kB 24614*8kB 7110*16kB 751*32kB 29*64kB 2*128kB 0*256kB 2*512kB 1*1024kB 0*2048kB 0*4096kB = 527372kB
[172394.471658] Swap cache: add 659497, delete 623328, find 442788/475174, race 0+5
[172394.471661] Free swap  = 2571424kB
[172394.471664] Total swap = 3911784kB
[172394.471666] Free swap:   2571424kB
[172394.476322] 255744 pages of RAM
[172394.476325] 6060 reserved pages
[172394.476327] 61683 pages shared
[172394.476329] 36197 pages swap cached

---

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in


Re: kernel 2.6.22: what IS the VM doing?

2007-09-05 Thread Rik van Riel

Sami Farin wrote:
> On Wed, Sep 05, 2007 at 12:24:26 -0400, Rik van Riel wrote:
> ...
>>> *shrug*
>>
>> The attached patch should make sure kswapd does not free an
>> excessive number of pages in zone_normal just because the
>> pages in zone_highmem are difficult to free.
>>
>> It does give kswapd a large margin to continue putting equal
>> pressure on all zones in normal situations.
>>
>> Sami, could you try out this patch to see if it helps your
>> situation?
>
> Thanks, Rik.  bzImage is ready, I probably reboot inside one
> month for a reason or other 8-)

The more I look at the bug, the more I see that it is probably
not very easy to reproduce on demand.  I have, however, a full
explanation on why it happens and why the patch should fix it,
so I will submit it for inclusion in -mm.

Sami, thank you for the detailed bug report.

--
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is.  Each group
calls the other unpatriotic.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: kernel 2.6.22: what IS the VM doing?

2007-09-05 Thread Sami Farin
On Wed, Sep 05, 2007 at 12:24:26 -0400, Rik van Riel wrote:
...
>> *shrug*
>
> The attached patch should make sure kswapd does not free an
> excessive number of pages in zone_normal just because the
> pages in zone_highmem are difficult to free.
>
> It does give kswapd a large margin to continue putting equal
> pressure on all zones in normal situations.
>
> Sami, could you try out this patch to see if it helps your
> situation?

Thanks, Rik.  bzImage is ready, I probably reboot inside one
month for a reason or other 8-)

-- 
Do what you love because life is too short for anything else.



Re: kernel 2.6.22: what IS the VM doing?

2007-09-05 Thread Rik van Riel

Sami Farin wrote:
> On Tue, Sep 04, 2007 at 21:37:35 -0400, Rik van Riel wrote:
>> Is the system trying to evict pages like crazy when your
>> system becomes unusable?
>
> I think so..
>
>> If so, I wonder if kswapd is simply doing the wrong thing
>> and trying to evict data from all zones, simply because the
>> highmem zone is low on free pages...
>
> *shrug*

The attached patch should make sure kswapd does not free an
excessive number of pages in zone_normal just because the
pages in zone_highmem are difficult to free.

It does give kswapd a large margin to continue putting equal
pressure on all zones in normal situations.

Sami, could you try out this patch to see if it helps your
situation?

Signed-off-by: Rik van Riel <[EMAIL PROTECTED]>
--- linux-2.6.22.noarch/mm/vmscan.c.excessive	2007-09-05 12:19:49.000000000 -0400
+++ linux-2.6.22.noarch/mm/vmscan.c	2007-09-05 12:21:40.000000000 -0400
@@ -1371,7 +1371,13 @@ loop_again:
 			temp_priority[i] = priority;
 			sc.nr_scanned = 0;
 			note_zone_scanning_priority(zone, priority);
-			nr_reclaimed += shrink_zone(priority, zone, &sc);
+			/*
+			 * We put equal pressure on every zone, unless one
+			 * zone has way too many pages free already.
+			 */
+			if (!zone_watermark_ok(zone, order, 8*zone->pages_high,
+						end_zone, 0))
+				nr_reclaimed += shrink_zone(priority, zone, &sc);
 			reclaim_state->reclaimed_slab = 0;
 			nr_slab = shrink_slab(sc.nr_scanned, GFP_KERNEL,
 						lru_pages);


Re: kernel 2.6.22: what IS the VM doing?

2007-09-05 Thread Sami Farin
On Tue, Sep 04, 2007 at 21:37:35 -0400, Rik van Riel wrote:
> Sami Farin wrote:
>> Using SMP kernel 2.6.22.6pre-CFS-v20.5 on Pentium D (IA-32).
>> I think this bug (or whatever you want to call it) got triggered
>> when you first allocate several megabytes of memory in a kernel module
>> and then free them, and then run e.g. X and when memory gets tight,
>> you end up with this situation...
>> Top 2 /proc/vmstat Biggest Winners:
>> pgrefill_normal:49900/second
>> pgrefill_high:20810/second
>
> That means the pageout code was scanning about 70000 pages
> per second on your system during peak stress.  You may have
> run into a scalability problem in the Linux kernel, where it
> wants to clear the referenced bit off all the anonymous pages
> before swapping something out.

Thanks for the analysis...

Why did turning off swap not make any difference?
Why does it not keep e.g. xterm in memory (when I had 700MB free)?

> To make matters worse, that unlucky page gets chosen because
> it was the page where kswapd started scanning.  It has little
> to do with being the least recently used page, because every
> anonymous page tends to have its referenced bit set by the time
> we start scanning.
>
> On truly enormous systems, say with 256GB of memory, kswapd
> sometimes needs to scan hundreds of thousands or even millions
> of pages before finding something to swap out.  Not fun.
>
>> Did I forget to include some info???
>> Oh, and I need to reboot in order to get usable system
>> when this bug happens.
>
> Is the system trying to evict pages like crazy when your
> system becomes unusable?

I think so..

> If so, I wonder if kswapd is simply doing the wrong thing
> and trying to evict data from all zones, simply because the
> highmem zone is low on free pages...

*shrug*

-- 
"If we put the Pentagon's personnel managers in charge of the Sahara
Desert, they would run out of sand in five years." -Sayen Report



Re: kernel 2.6.22: what IS the VM doing?

2007-09-04 Thread Rik van Riel

Sami Farin wrote:
> Using SMP kernel 2.6.22.6pre-CFS-v20.5 on Pentium D (IA-32).
> I think this bug (or whatever you want to call it) got triggered
> when you first allocate several megabytes of memory in a kernel module
> and then free them, and then run e.g. X and when memory gets tight,
> you end up with this situation...
> Top 2 /proc/vmstat Biggest Winners:
> pgrefill_normal:49900/second
> pgrefill_high:20810/second


That means the pageout code was scanning about 70000 pages
per second on your system during peak stress.  You may have
run into a scalability problem in the Linux kernel, where it
wants to clear the referenced bit off all the anonymous pages
before swapping something out.

To make matters worse, that unlucky page gets chosen because
it was the page where kswapd started scanning.  It has little
to do with being the least recently used page, because every
anonymous page tends to have its referenced bit set by the time
we start scanning.

On truly enormous systems, say with 256GB of memory, kswapd
sometimes needs to scan hundreds of thousands or even millions
of pages before finding something to swap out.  Not fun.


> Did I forget to include some info???
> Oh, and I need to reboot in order to get usable system
> when this bug happens.


Is the system trying to evict pages like crazy when your
system becomes unusable?

If so, I wonder if kswapd is simply doing the wrong thing
and trying to evict data from all zones, simply because the
highmem zone is low on free pages...

--
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is.  Each group
calls the other unpatriotic.