Re: [Ext2-devel] Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-04-12 Thread Stephen C. Tweedie
Hi,
On Wed, 2005-04-06 at 02:23, Andrew Morton wrote:
> Nobody has noticed the now-fixed leak since 2.6.6 and this one appears to
> be 100x slower. Which is fortunate because this one is going to take a
> long time to fix. I'll poke at it some more.
OK, I'm now at the stage where I can kick of

Re: [Ext2-devel] Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-04-05 Thread Andrew Morton
Mingming Cao <[EMAIL PROTECTED]> wrote:
>
> I run the test(20 instances of fsx) with your patch on 2.6.12-rc1 with
> 512MB RAM (where I were able to constantly re-create the mem leak and
> lead to OOM before). The result is the kernel did not get into OOM after
> about 19 hours(before it took ab

Re: [Ext2-devel] Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-04-05 Thread Mingming Cao
On Mon, 2005-04-04 at 13:04 -0700, Andrew Morton wrote:
> Mingming Cao <[EMAIL PROTECTED]> wrote:
> >
> > On Sun, 2005-04-03 at 18:35 -0700, Andrew Morton wrote:
> > > Mingming Cao <[EMAIL PROTECTED]> wrote:
> > > >
> > > > I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
> >

Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-04-05 Thread Stephen C. Tweedie
Hi,
On Mon, 2005-04-04 at 02:35, Andrew Morton wrote:
> Without the below patch it's possible to make ext3 leak at around a
> megabyte per minute by arranging for the fs to run a commit every 50
> milliseconds, btw.
Ouch!
> (Stephen, please review...)
Doing so now.
> The patch teaches journa

Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-04-04 Thread Martin J. Bligh
>> > > I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
>> > > 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
>> > > hours the system hit OOM, and OOM keep killing processes one by one. I
>> > > could reproduce this problem very constantly on a 2 way

Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-04-04 Thread Andrew Morton
"Martin J. Bligh" <[EMAIL PROTECTED]> wrote:
>
> >> > > I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
> >> > > 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
> >> > > hours the system hit OOM, and OOM keep killing processes one by one. I
> >> > > c

Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-04-04 Thread Andrew Morton
Mingming Cao <[EMAIL PROTECTED]> wrote:
>
> On Sun, 2005-04-03 at 18:35 -0700, Andrew Morton wrote:
> > Mingming Cao <[EMAIL PROTECTED]> wrote:
> > >
> > > I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
> > > 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after abou

Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-04-04 Thread Mingming Cao
On Sun, 2005-04-03 at 18:35 -0700, Andrew Morton wrote:
> Mingming Cao <[EMAIL PROTECTED]> wrote:
> >
> > I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
> > 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
> > hours the system hit OOM, and OOM keep kil

Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-04-03 Thread Andrew Morton
Mingming Cao <[EMAIL PROTECTED]> wrote:
>
> I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
> 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
> hours the system hit OOM, and OOM keep killing processes one by one. I
> could reproduce this problem very

Re: [Ext2-devel] Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-03-27 Thread Badari Pulavarty
Badari Pulavarty wrote: Mingming Cao wrote: On Sat, 2005-03-26 at 16:23 -0800, Mingming Cao wrote: On Fri, 2005-03-25 at 14:11 -0800, Badari Pulavarty wrote: On Fri, 2005-03-25 at 13:56, Andrew Morton wrote: Mingming Cao <[EMAIL PROTECTED]> wrote: I run into OOM problem again on 2.6.12-rc1. I run s

Re: [Ext2-devel] Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-03-27 Thread Badari Pulavarty
Mingming Cao wrote: On Sat, 2005-03-26 at 16:23 -0800, Mingming Cao wrote: On Fri, 2005-03-25 at 14:11 -0800, Badari Pulavarty wrote: On Fri, 2005-03-25 at 13:56, Andrew Morton wrote: Mingming Cao <[EMAIL PROTECTED]> wrote: I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on 2.6

Re: [Ext2-devel] Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-03-27 Thread Mingming Cao
On Sat, 2005-03-26 at 16:23 -0800, Mingming Cao wrote:
> On Fri, 2005-03-25 at 14:11 -0800, Badari Pulavarty wrote:
> > On Fri, 2005-03-25 at 13:56, Andrew Morton wrote:
> > > Mingming Cao <[EMAIL PROTECTED]> wrote:
> > > >
> > > > I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx test

Re: [Ext2-devel] Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-03-26 Thread Mingming Cao
On Fri, 2005-03-25 at 14:11 -0800, Badari Pulavarty wrote:
> On Fri, 2005-03-25 at 13:56, Andrew Morton wrote:
> > Mingming Cao <[EMAIL PROTECTED]> wrote:
> > >
> > > I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
> > > 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem,

Re: [Ext2-devel] Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-03-25 Thread Badari Pulavarty
On Fri, 2005-03-25 at 16:17, Dave Jones wrote:
> On Wed, Mar 23, 2005 at 11:53:04AM -0800, Mingming Cao wrote:
>
> > The fsx command is:
> >
> > ./fsx -c 10 -n -r 4096 -w 4096 /mnt/test/foo1 &
> >
> > I also see fsx tests start to generating report about read bad data
> > about the tests h

Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-03-25 Thread Dave Jones
On Wed, Mar 23, 2005 at 11:53:04AM -0800, Mingming Cao wrote:
> The fsx command is:
>
> ./fsx -c 10 -n -r 4096 -w 4096 /mnt/test/foo1 &
>
> I also see fsx tests start to generating report about read bad data
> about the tests have run for about 9 hours(one hour before of the OOM
> happen)

Re: [Ext2-devel] Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-03-25 Thread Badari Pulavarty
On Fri, 2005-03-25 at 13:56, Andrew Morton wrote:
> Mingming Cao <[EMAIL PROTECTED]> wrote:
> >
> > I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
> > 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
> > hours the system hit OOM, and OOM keep killing pro

Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-03-25 Thread Andrew Morton
Mingming Cao <[EMAIL PROTECTED]> wrote:
>
> I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
> 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
> hours the system hit OOM, and OOM keep killing processes one by one. I
> could reproduce this problem very con

Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-03-23 Thread Andrew Morton
Andrea Arcangeli <[EMAIL PROTECTED]> wrote:
>
> On Wed, Mar 23, 2005 at 03:42:32PM -0800, Andrew Morton wrote:
> > I'm suspecting here that we simply leaked a refcount on every darn
> > pagecache page in the machine. Note how mapped memory has shrunk down to
> > less than a megabyte and everything

Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-03-23 Thread Andrea Arcangeli
On Wed, Mar 23, 2005 at 03:42:32PM -0800, Andrew Morton wrote:
> I'm suspecting here that we simply leaked a refcount on every darn
> pagecache page in the machine. Note how mapped memory has shrunk down to
> less than a megabyte and everything which can be swapped out has been
> swapped out.
>
>

Re: [Ext2-devel] Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-03-23 Thread Badari Pulavarty
Andrew Morton wrote: "Martin J. Bligh" <[EMAIL PROTECTED]> wrote: Nothing beats poking around in a dead machine's guts with kgdb though. Everyone his taste. But I was surprised by SwapTotal: 1052216 kB SwapFree: 1045984 kB Strange that processes are killed while lots of swap is available.

Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-03-23 Thread Andrew Morton
"Martin J. Bligh" <[EMAIL PROTECTED]> wrote:
>
> >> Nothing beats poking around in a dead machine's guts with kgdb though.
> >
> > Everyone his taste.
> >
> > But I was surprised by
> >
> >> SwapTotal: 1052216 kB
> >> SwapFree: 1045984 kB
> >
> > Strange that processes are killed while

Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-03-23 Thread Martin J. Bligh
>> Nothing beats poking around in a dead machine's guts with kgdb though.
>
> Everyone his taste.
>
> But I was surprised by
>
>> SwapTotal: 1052216 kB
>> SwapFree: 1045984 kB
>
> Strange that processes are killed while lots of swap is available.
I don't think we're that smart about i

Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-03-23 Thread Andries Brouwer
On Wed, Mar 23, 2005 at 03:20:55PM -0800, Andrew Morton wrote:
> Nothing beats poking around in a dead machine's guts with kgdb though.
Everyone his taste.
But I was surprised by
> SwapTotal: 1052216 kB
> SwapFree: 1045984 kB
Strange that processes are killed while lots of swap is ava

Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-03-23 Thread Andrew Morton
"Martin J. Bligh" <[EMAIL PROTECTED]> wrote:
>
> > It would be interesting if you could run the same test on 2.6.11.
>
> One thing I'm finding is that it's hard to backtrace who has each page
> in this sort of situation. My plan is to write a debug patch to walk
> mem_map and dump out some info

Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-03-23 Thread Martin J. Bligh
>> I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
>> 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
>> hours the system hit OOM, and OOM keep killing processes one by one.
>
> I don't have a very good record reading these oom dumps lately, but this

Re: OOM problems on 2.6.12-rc1 with many fsx tests

2005-03-23 Thread Andrew Morton
Mingming Cao <[EMAIL PROTECTED]> wrote:
>
> I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on
> 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10
> hours the system hit OOM, and OOM keep killing processes one by one.
I don't have a very good record readin

OOM problems on 2.6.12-rc1 with many fsx tests

2005-03-23 Thread Mingming Cao
Andrea, Andrew, I run into OOM problem again on 2.6.12-rc1. I run some(20) fsx tests on 2.6.12-rc1 kernel(and 2.6.11-mm4) on ext3 filesystem, after about 10 hours the system hit OOM, and OOM keep killing processes one by one. I could reproduce this problem very constantly on a 2 way PIII 700MHZ wi