Re: kernel 2.6.22: what IS the VM doing?
On Wed, Sep 05, 2007 at 18:48:51 -0400, Rik van Riel wrote:
> Sami Farin wrote:
>> On Wed, Sep 05, 2007 at 12:24:26 -0400, Rik van Riel wrote:
>> ...
>>>> *shrug*
>>>
>>> The attached patch should make sure kswapd does not free an
>>> excessive number of pages in zone_normal just because the
>>> pages in zone_highmem are difficult to free.
>>>
>>> It does give kswapd a large margin to continue putting equal
>>> pressure on all zones in normal situations.
>>>
>>> Sami, could you try out this patch to see if it helps your
>>> situation?
>>
>> Thanks, Rik.  bzImage is ready, I probably reboot inside one
>> month for a reason or other 8-)
>
> The more I look at the bug, the more I see that it is probably
> not very easy to reproduce on demand.  I have, however, a full

Well, I now booted to x86_64 kernel.  I can still reproduce this.
When I unload ipset modules, kernel resumes "normal" operation,
i.e., not swapping like mad.

sysrq-m, normal operation:

[172074.989053] SysRq : Show Memory
[172074.989063] Mem-info:
[172074.989071] DMA per-cpu:
[172074.989078] CPU0: Hot: hi:0, btch: 1 usd: 0 Cold: hi:0, btch: 1 usd: 0
[172074.989083] CPU1: Hot: hi:0, btch: 1 usd: 0 Cold: hi:0, btch: 1 usd: 0
[172074.989089] DMA32 per-cpu:
[172074.989094] CPU0: Hot: hi: 186, btch: 31 usd: 153 Cold: hi: 62, btch: 15 usd: 10
[172074.989101] CPU1: Hot: hi: 186, btch: 31 usd: 34 Cold: hi: 62, btch: 15 usd: 14
[172074.989109] Active:123048 inactive:55194 dirty:5 writeback:0 unstable:0
[172074.989111] free:15001 slab:27961 mapped:15063 pagetables:2996 bounce:0
[172074.989118] DMA free:5560kB min:32kB low:40kB high:48kB active:404kB inactive:736kB present:8620kB pages_scanned:0 all_unreclaimable? no
[172074.989124] lowmem_reserve[]: 0 968 968
[172074.989143] DMA32 free:54444kB min:3964kB low:4952kB high:5944kB active:491788kB inactive:220040kB present:991996kB pages_scanned:0 all_unreclaimable? no
[172074.989150] lowmem_reserve[]: 0 0 0
[172074.989166] DMA: 276*4kB 121*8kB 30*16kB 6*32kB 0*64kB 0*128kB 1*256kB 1*512kB 2*1024kB 0*2048kB 0*4096kB = 5560kB
[172074.989205] DMA32: 9467*4kB 1222*8kB 121*16kB 22*32kB 3*64kB 1*128kB 1*256kB 3*512kB 0*1024kB 1*2048kB 0*4096kB = 54444kB
[172074.989249] Swap cache: add 613353, delete 556659, find 441592/473681, race 0+5
[172074.989255] Free swap = 2751640kB
[172074.989260] Total swap = 3911784kB
[172074.989265] Free swap: 2751640kB
[172074.993693] 255744 pages of RAM
[172074.993699] 6060 reserved pages
[172074.993702] 79933 pages shared
[172074.993706] 56719 pages swap cached

then it goes bad:

[172373.542933] net/ipv4/netfilter/ip_set_nethash.c: retry: rehashing of set blockedp2pnew triggered: hashsize grows from 262144 to 288358
[172373.554837] net/ipv4/netfilter/ip_set_nethash.c: retry: rehashing of set blockedp2pnew triggered: hashsize grows from 262144 to 317193
[172373.561167] net/ipv4/netfilter/ip_set_nethash.c: retry: rehashing of set blockedp2pnew triggered: hashsize grows from 262144 to 348912
[172373.569375] net/ipv4/netfilter/ip_set_nethash.c: retry: rehashing of set blockedp2pnew triggered: hashsize grows from 262144 to 383803
[172394.471570] SysRq : Show Memory
[172394.471580] Mem-info:
[172394.471583] DMA per-cpu:
[172394.471586] CPU0: Hot: hi:0, btch: 1 usd: 0 Cold: hi:0, btch: 1 usd: 0
[172394.471590] CPU1: Hot: hi:0, btch: 1 usd: 0 Cold: hi:0, btch: 1 usd: 0
[172394.471593] DMA32 per-cpu:
[172394.471597] CPU0: Hot: hi: 186, btch: 31 usd: 152 Cold: hi: 62, btch: 15 usd: 58
[172394.471601] CPU1: Hot: hi: 186, btch: 31 usd: 108 Cold: hi: 62, btch: 15 usd: 52
[172394.471606] Active:46934 inactive:23643 dirty:0 writeback:17112 unstable:0
[172394.471608] free:133942 slab:16510 mapped:7826 pagetables:3004 bounce:0
[172394.471613] DMA free:8460kB min:32kB low:40kB high:48kB active:0kB inactive:0kB present:8620kB pages_scanned:0 all_unreclaimable? yes
[172394.471616] lowmem_reserve[]: 0 968 968
[172394.471623] DMA32 free:527308kB min:3964kB low:4952kB high:5944kB active:187736kB inactive:94572kB present:991996kB pages_scanned:92 all_unreclaimable? no
[172394.471627] lowmem_reserve[]: 0 0 0
[172394.471631] DMA: 154*4kB 133*8kB 78*16kB 29*32kB 12*64kB 0*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 8464kB
[172394.471644] DMA32: 47127*4kB 24614*8kB 7110*16kB 751*32kB 29*64kB 2*128kB 0*256kB 2*512kB 1*1024kB 0*2048kB 0*4096kB = 527372kB
[172394.471658] Swap cache: add 659497, delete 623328, find 442788/475174, race 0+5
[172394.471661] Free swap = 2571424kB
[172394.471664] Total swap = 3911784kB
[172394.471666] Free swap: 2571424kB
[172394.476322] 255744 pages of RAM
[172394.476325] 6060 reserved pages
[172394.476327] 61683 pages shared
[172394.476329] 36197 pages swap cached

--- procs ---memory-- ---swap-- -io --system-- -cpu--
r b swpd free buff cache si so bi bo in
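
The per-zone buddy lists in a sysrq-m dump can be cross-checked by summing their `count*sizekB` terms. The following is a minimal sketch of that arithmetic; the helper name `buddy_total_kb` is made up for illustration and the two input strings are copied from the DMA32 lines of the dumps in this message.

```python
import re

def buddy_total_kb(line: str) -> int:
    """Sum the 'count*sizekB' terms of one sysrq-m buddy-list line."""
    terms = re.findall(r"(\d+)\*(\d+)kB", line)
    return sum(int(count) * int(size_kb) for count, size_kb in terms)

# DMA32 free lists from the "normal operation" dump.
normal_dma32 = ("9467*4kB 1222*8kB 121*16kB 22*32kB 3*64kB 1*128kB "
                "1*256kB 3*512kB 0*1024kB 1*2048kB 0*4096kB")

# DMA32 free lists from the dump taken after "it goes bad".
bad_dma32 = ("47127*4kB 24614*8kB 7110*16kB 751*32kB 29*64kB 2*128kB "
             "0*256kB 2*512kB 1*1024kB 0*2048kB 0*4096kB")

print(buddy_total_kb(normal_dma32))  # 54444
print(buddy_total_kb(bad_dma32))     # 527372
```

By this count the free memory in DMA32 grows from about 53MB to about 515MB between the two dumps, out of 991996kB present, while the zone's high watermark stays at 5944kB.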
Re: kernel 2.6.22: what IS the VM doing?
Sami Farin wrote:
> On Wed, Sep 05, 2007 at 12:24:26 -0400, Rik van Riel wrote:
> ...
>>> *shrug*
>>
>> The attached patch should make sure kswapd does not free an
>> excessive number of pages in zone_normal just because the
>> pages in zone_highmem are difficult to free.
>>
>> It does give kswapd a large margin to continue putting equal
>> pressure on all zones in normal situations.
>>
>> Sami, could you try out this patch to see if it helps your
>> situation?
>
> Thanks, Rik.  bzImage is ready, I probably reboot inside one
> month for a reason or other 8-)

The more I look at the bug, the more I see that it is probably
not very easy to reproduce on demand.  I have, however, a full
explanation on why it happens and why the patch should fix it,
so I will submit it for inclusion in -mm.

Sami, thank you for the detailed bug report.

--
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is.  Each group
calls the other unpatriotic.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: kernel 2.6.22: what IS the VM doing?
On Wed, Sep 05, 2007 at 12:24:26 -0400, Rik van Riel wrote:
...
>> *shrug*
>
> The attached patch should make sure kswapd does not free an
> excessive number of pages in zone_normal just because the
> pages in zone_highmem are difficult to free.
>
> It does give kswapd a large margin to continue putting equal
> pressure on all zones in normal situations.
>
> Sami, could you try out this patch to see if it helps your
> situation?

Thanks, Rik.  bzImage is ready, I probably reboot inside one
month for a reason or other 8-)

--
Do what you love because life is too short for anything else.
Re: kernel 2.6.22: what IS the VM doing?
Sami Farin wrote:
> On Tue, Sep 04, 2007 at 21:37:35 -0400, Rik van Riel wrote:
>> Is the system trying to evict pages like crazy when your
>> system becomes unusable?
>
> I think so..
>
>> If so, I wonder if kswapd is simply doing the wrong thing
>> and trying to evict data from all zones, simply because the
>> highmem zone is low on free pages...
>
> *shrug*

The attached patch should make sure kswapd does not free an
excessive number of pages in zone_normal just because the
pages in zone_highmem are difficult to free.

It does give kswapd a large margin to continue putting equal
pressure on all zones in normal situations.

Sami, could you try out this patch to see if it helps your
situation?

Signed-off-by: Rik van Riel <[EMAIL PROTECTED]>

--- linux-2.6.22.noarch/mm/vmscan.c.excessive 2007-09-05 12:19:49.0 -0400
+++ linux-2.6.22.noarch/mm/vmscan.c 2007-09-05 12:21:40.0 -0400
@@ -1371,7 +1371,13 @@ loop_again:
                         temp_priority[i] = priority;
                         sc.nr_scanned = 0;
                         note_zone_scanning_priority(zone, priority);
-                        nr_reclaimed += shrink_zone(priority, zone, &sc);
+                        /*
+                         * We put equal pressure on every zone, unless one
+                         * zone has way too many pages free already.
+                         */
+                        if (!zone_watermark_ok(zone, order, 8*zone->pages_high,
+                                                end_zone, 0))
+                                nr_reclaimed += shrink_zone(priority, zone, &sc);
                         reclaim_state->reclaimed_slab = 0;
                         nr_slab = shrink_slab(sc.nr_scanned, GFP_KERNEL,
                                                 lru_pages);
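
The guard added by the patch can be modelled outside the kernel. The sketch below is a simplified, order-0 approximation of the check (free pages must exceed the mark plus the zone's lowmem reserve); it ignores higher orders and allocation flags, and the `Zone` class, `watermark_ok`, and `should_shrink` are illustrative names, not kernel APIs. The DMA32 numbers come from the x86_64 sysrq-m dump posted elsewhere in this thread, the "tight" zone is hypothetical, and a 4kB page size is assumed.

```python
from dataclasses import dataclass

PAGE_KB = 4  # assumed page size, matching the 4kB order-0 buddy blocks


@dataclass
class Zone:
    name: str
    free_kb: int
    high_kb: int                   # the zone's "high" watermark
    lowmem_reserve_pages: int = 0  # reserve for the class zone


def watermark_ok(zone: Zone, mark_pages: int) -> bool:
    """Order-0 approximation: free pages must exceed mark + reserve."""
    free_pages = zone.free_kb // PAGE_KB
    return free_pages > mark_pages + zone.lowmem_reserve_pages


def should_shrink(zone: Zone) -> bool:
    """Model of the patched condition: reclaim unless free > 8 * pages_high."""
    pages_high = zone.high_kb // PAGE_KB
    return not watermark_ok(zone, 8 * pages_high)


# DMA32 after things went bad: over half the zone is already free.
dma32_bad = Zone("DMA32", free_kb=527308, high_kb=5944)
# Hypothetical zone that really is short on memory.
tight = Zone("tight (hypothetical)", free_kb=4000, high_kb=5944)

print(should_shrink(dma32_bad))  # False
print(should_shrink(tight))      # True
```

With a high watermark of 5944kB the cutoff is 8 * 1486 = 11888 pages, roughly 46MB, which is the "large margin" the patch description refers to: zones below it still receive equal pressure.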
Re: kernel 2.6.22: what IS the VM doing?
On Tue, Sep 04, 2007 at 21:37:35 -0400, Rik van Riel wrote:
> Sami Farin wrote:
>> Using SMP kernel 2.6.22.6pre-CFS-v20.5 on Pentium D (IA-32).
>> I think this bug (or whatever you want to call it) got triggered
>> when you first allocate several megabytes of memory in a kernel module
>> and then free them, and then run e.g. X and when memory gets tight,
>> you end up with this situation...
>
>> Top 2 /proc/vmstat Biggest Winners:
>> pgrefill_normal:49900/second
>> pgrefill_high:20810/second
>
> That means the pageout code was scanning about 70000 pages
> per second on your system during peak stress.  You may have
> run into a scalability problem in the Linux kernel, where it
> wants to clear the referenced bit off all the anonymous pages
> before swapping something out.

Thanks for the analysis...
Why did turning off swap not make any difference?
Why does it not keep e.g. xterm in memory (I had 700MB free)?

> To make matters worse, that unlucky page gets chosen because
> it was the page where kswapd started scanning.  It has little
> to do with being the least recently used page, because every
> anonymous page tends to have its referenced bit set by the time
> we start scanning.
>
> On truly enormous systems, say with 256GB of memory, kswapd
> sometimes needs to scan hundreds of thousands or even millions
> of pages before finding something to swap out.  Not fun.
>
>> Did I forget to include some info???
>> Oh, and I need to reboot in order to get usable system
>> when this bug happens.
>
> Is the system trying to evict pages like crazy when your
> system becomes unusable?

I think so..

> If so, I wonder if kswapd is simply doing the wrong thing
> and trying to evict data from all zones, simply because the
> highmem zone is low on free pages...

*shrug*

--
"If we put the Pentagon's personnel managers in charge of the
Sahara Desert, they would run out of sand in five years."
-Sayen Report
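
Per-second figures like the "Biggest Winners" quoted in this message can be obtained by sampling /proc/vmstat twice and ranking the counter deltas. This is a minimal sketch of that idea, not the script that produced the original numbers; `parse_vmstat` and `top_rates` are made-up helper names, and the two sample texts are invented so that the deltas reproduce the quoted rates.

```python
def parse_vmstat(text: str) -> dict:
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            counters[parts[0]] = int(parts[1])
    return counters


def top_rates(before: dict, after: dict, seconds: float, n: int = 2) -> list:
    """Return the n counters with the largest per-second increase."""
    rates = {name: (after[name] - before[name]) / seconds
             for name in after if name in before}
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)[:n]


# Invented samples taken one second apart (illustration only).
t0 = "pgrefill_normal 1000000\npgrefill_high 500000\npgfault 42000\n"
t1 = "pgrefill_normal 1049900\npgrefill_high 520810\npgfault 42300\n"

top = top_rates(parse_vmstat(t0), parse_vmstat(t1), seconds=1.0)
print(top)                            # [('pgrefill_normal', 49900.0), ('pgrefill_high', 20810.0)]
print(sum(rate for _, rate in top))   # 70710.0
```

On a live system the two texts would come from reading /proc/vmstat before and after a `time.sleep()` of the chosen interval. The two refill rates together come to 70710 pages per second.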
Re: kernel 2.6.22: what IS the VM doing?
Sami Farin wrote:
> Using SMP kernel 2.6.22.6pre-CFS-v20.5 on Pentium D (IA-32).
> I think this bug (or whatever you want to call it) got triggered
> when you first allocate several megabytes of memory in a kernel module
> and then free them, and then run e.g. X and when memory gets tight,
> you end up with this situation...

> Top 2 /proc/vmstat Biggest Winners:
> pgrefill_normal:49900/second
> pgrefill_high:20810/second

That means the pageout code was scanning about 70000 pages
per second on your system during peak stress.  You may have
run into a scalability problem in the Linux kernel, where it
wants to clear the referenced bit off all the anonymous pages
before swapping something out.

To make matters worse, that unlucky page gets chosen because
it was the page where kswapd started scanning.  It has little
to do with being the least recently used page, because every
anonymous page tends to have its referenced bit set by the time
we start scanning.

On truly enormous systems, say with 256GB of memory, kswapd
sometimes needs to scan hundreds of thousands or even millions
of pages before finding something to swap out.  Not fun.

> Did I forget to include some info???
> Oh, and I need to reboot in order to get usable system
> when this bug happens.

Is the system trying to evict pages like crazy when your
system becomes unusable?

If so, I wonder if kswapd is simply doing the wrong thing
and trying to evict data from all zones, simply because the
highmem zone is low on free pages...

--
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is.  Each group
calls the other unpatriotic.
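
The scanning behaviour described in this message can be illustrated with a textbook second-chance (clock) scan. This is a deliberately simplified model and not the kernel's actual active/inactive list code; the function name `second_chance_evict` and the page count of 1000 are assumptions chosen for the example.

```python
def second_chance_evict(referenced: list, start: int) -> tuple:
    """Scan circularly from `start`, clearing referenced bits that are set.

    Returns (victim_index, pages_scanned) for the first page found with
    its referenced bit already clear.
    """
    n = len(referenced)
    pos = start
    scanned = 0
    while True:
        scanned += 1
        if referenced[pos]:
            referenced[pos] = False  # give the page a second chance
            pos = (pos + 1) % n
        else:
            return pos, scanned


# Worst case from the message: every page has its referenced bit set.
pages = [True] * 1000
victim, scanned = second_chance_evict(pages, start=137)
print(victim, scanned)  # 137 1001
```

When every bit is set, the scan clears all of them, wraps around, and evicts the page it started at, so the choice of victim says nothing about recency and the cost grows linearly with the number of pages.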