Re: innd mmap bug in 2.4.0-test12

2001-01-04 Thread Ralf Gerbig

Hi,

* "Linus" == Linus Torvalds <[EMAIL PROTECTED]> writes:

[...]

> Almost nothing does that. innd is (sadly) the only regular thing that uses
> this, which is why it's always innd that breaks, even if everything else
> works.

> And even innd is often compiled to use "write()" instead of shared
> mappings (it's a config option), so not even all innd's will break.

Just in case nobody else has tried: I suffered from the INN problem, and it
works for me now with -prerelease (K6, 384MB, mostly idle except for a small
incoming uucp feed).
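
For readers who have not met the access pattern Linus is describing, here
is a minimal userspace sketch. It is not innd's actual code. It stores data
through a writable MAP_SHARED mapping and never calls write() on it, so the
kernel alone knows which pages are dirty and has to write them back. The
function names and the read-back check are illustrative assumptions.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Store `len` bytes of `msg` at offset 0 of `path` through a MAP_SHARED
 * mapping, without calling write() on the data.
 * Returns 0 on success, -1 on any failure.
 */
int store_via_shared_mapping(const char *path, const char *msg, size_t len)
{
	int fd, rc;
	char *map;

	assert(path != NULL && msg != NULL);
	if (len == 0)
		return -1;		/* mmap() rejects zero-length mappings */

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	/* The file must be long enough to back the mapping. */
	if (ftruncate(fd, (off_t)len) < 0) {
		close(fd);
		return -1;
	}

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED) {
		close(fd);
		return -1;
	}

	/* This store dirties the page; no system call reports it. */
	memcpy(map, msg, len);

	/* Ask for writeback now rather than waiting for the VM. */
	rc = msync(map, len, MS_SYNC);

	munmap(map, len);
	close(fd);
	return rc;
}

/* Read the file back with plain read() to see what actually landed. */
ssize_t read_back(const char *path, char *buf, size_t len)
{
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	n = read(fd, buf, len);
	close(fd);
	return n;
}
```

If the kernel drops the dirty state of such a page, read_back() returns
stale data even though the store itself succeeded, which is the kind of
loss this thread is chasing.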

Ralf
-- 
 P: Linus Torvalds  patch-2.2.4
-S: Buried alive in diapers
+S: Buried alive in reporters
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: innd mmap bug in 2.4.0-test12

2000-12-29 Thread Christoph Rohland

Chris Wedgwood <[EMAIL PROTECTED]> writes:

> I would prefer we leave ramfs alone as is -- it makes an excellent
> starting point for a new fs and is fairly simple to grok. If we are
> to add any more complexity here like the size limiting patches or the
> use of a backing store, I'd like to have this as a new filesystem,
> something like 'vmfs' or some such.

That's shm fs + read and write which should be easy to add.

> ramfs is small simple and elegant; for mere mortals like me it
> contains enough to help understand what is required of a filesystem
> without obscuring this fact. I'd hate to see that change.

yes. That's why I copied a lot of the ramfs code into mm/shmem.c

Christoph




Re: innd mmap bug in 2.4.0-test12 (UNIMPORTANT)

2000-12-28 Thread Pau

On Thu, 28 Dec 2000, Alan Cox wrote:

> The ramfs maintainer has patches (in -ac) which deal with the size limiting
> of RAMfs. I'm using it on a PDA and its really really nice to be able to
> pop up a GUI app and drag the bar to '60% for apps' like other PDA systems ;)

May I ask which PDA you have that runs Linux and is worth a few bucks?

Pau




Re: [PATCH] Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Daniel Phillips

Linus Torvalds wrote:
> Ok, pre-5 should have all the same places you found already fixed, but
> please do give it some heavy-duty testing to make sure there isn't
> anything hidden..

I've beaten on it fairly heavily with the BUG in there as you suggested,
with no problems.

This kernel even seems a little faster:

  Test machine: 64 meg, 500 MHz K6, IDE, Ext2, Blocksize=4K
  Test: dbench 48

  pre4    9.5 MB/sec  11 min  6 secs
  pre5   10.8 MB/sec   9 min 48 secs

I think it would be an awfully good idea to let the BUG out for mass
consumption:

+	if (PageDirty(page)) BUG();
 	remove_page_from_inode_queue(page);
 	remove_page_from_hash_queue(page);
 	page->mapping = NULL;

--
Daniel



Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Mo McKinlay


  > does anyone other than me think that the pm code is *way* too aggressive about
  > spinning down the hard drive? my 256mb laptop (2.2.16) will only spin down the
  > disk for about 30 seconds before it decides it's got something else it feels
  > like writing out, and spins back up. Spin-up has got to be more wasteful than
  > just leaving the drive spinning...

Yup, I find this - especially if I hang around in X for a while.




Re: [PATCH] Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Linus Torvalds



On Thu, 28 Dec 2000, Daniel Phillips wrote:
> 
> OK, I see you just posted -pre5 while I was making the patch, but here
> it is anyway, as a cross-check.

Ok, pre-5 should have all the same places you found already fixed, but
please do give it some heavy-duty testing to make sure there isn't
anything hidden..

Linus




Re: [PATCH] Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Daniel Phillips

Linus Torvalds wrote:
> We don't want to lose dirty bits by mistake. The only cases where it's ok
> to clear the dirty bit is when we truncate a page completely (so it won't
> be needed and obviously really shouldn't be written out) and when we've
> lost the last user of a swap cache entry.
> 
> Any other cases might be bugs, where we remove a page from a mapping
> without noticing that it is dirty (we had this bug in reclaim_pages(), for
> example).

I tried to go the lazy way the first time.  This time I put the BUG in
there and then went chasing all the places that do
(__)remove_inode_page.  There are tentacles all over the place but I
*think* I found them all.  That turned up one more place needing
changing, and this is one you already spotted a couple of days ago, in
reclaim_page.  I subjected this patch to some dbenching without
triggering the BUG.

The try_to_unuse function calls delete_from_swap_cache and this is
pretty unfamiliar stuff for me, but it looks like the page is just
freshly read and couldn't be dirty.  There's one more case in
arch/m68k/atari/stram.c (unswap_by_read), similar to try_to_unuse.

OK, I see you just posted -pre5 while I was making the patch, but here
it is anyway, as a cross-check.

--- 2.4.0-test13-pre4.clean/mm/filemap.c	Fri Dec 29 03:14:58 2000
+++ 2.4.0-test13-pre4/mm/filemap.c	Fri Dec 29 04:29:09 2000
@@ -96,6 +96,7 @@
 	remove_page_from_inode_queue(page);
 	remove_page_from_hash_queue(page);
 	page->mapping = NULL;
+	if (PageDirty(page)) BUG();
 }
 
 void remove_inode_page(struct page *page)
@@ -132,7 +133,7 @@
 		curr = curr->next;
 
 		/* We cannot invalidate a locked page */
-		if (TryLockPage(page))
+		if (PageDirty(page) || TryLockPage(page))
 			continue;
 
 		/* Neither can we invalidate something in use.. */
--- 2.4.0-test13-pre4.clean/mm/vmscan.c	Fri Dec 29 03:14:58 2000
+++ 2.4.0-test13-pre4/mm/vmscan.c	Fri Dec 29 04:30:48 2000
@@ -484,7 +484,7 @@
 		}
 
 		/* The page is dirty, or locked, move to inactive_dirty list. */
-		if (page->buffers || TryLockPage(page)) {
+		if (page->buffers || PageDirty(page) || TryLockPage(page)) {
 			del_page_from_inactive_clean_list(page);
 			add_page_to_inactive_dirty_list(page);
 			continue;



Re: [PATCH] Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Daniel Phillips

Linus Torvalds wrote:
> No, I'd much rather have
> 
> if (PageDirty(page)) BUG();
> 
> there, and then have the free_swap_cache code clear the dirty bit.
> 
> We don't want to lose dirty bits by mistake. The only cases where it's ok
> to clear the dirty bit is when we truncate a page completely (so it won't
> be needed and obviously really shouldn't be written out) and when we've
> lost the last user of a swap cache entry.
> 
> Any other cases might be bugs, where we remove a page from a mapping
> without noticing that it is dirty (we had this bug in reclaim_pages(), for
> example).

And in this case it's clear we lose data with nfs and smbfs that way. 
Maybe this is more like it:

--- 2.4.0-test13.clean/mm/filemap.c	Fri Dec 29 03:14:58 2000
+++ 2.4.0-test13/mm/filemap.c	Fri Dec 29 04:13:27 2000
@@ -132,7 +132,7 @@
 		curr = curr->next;
 
 		/* We cannot invalidate a locked page */
-		if (TryLockPage(page))
+		if (PageDirty(page) || TryLockPage(page))
 			continue;
 
 		/* Neither can we invalidate something in use.. */



Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Alan Cox

> > I use ramfs for /tmp on my laptop -- it's very handy because it
> > extends the amount of time the disk spends spun down and therefore
> > battery life; but writing large files into /tmp can blow away the
> > system or at the very least eat away at otherwise usable ram. Not
> > terribly desirable.
> 
> Jeff Garzik had the code to do this, and the new shared memory code should
> be able to be massaged to handle this all without actually bloating the
> kernel (ie "ramfs" would still stay very very tiny, just taking advantage
> of the common code that the VM layer already has to support for other
> things).

The ramfs maintainer has patches (in -ac) which deal with the size limiting
of RAMfs. I'm using it on a PDA and it's really, really nice to be able to
pop up a GUI app and drag the bar to '60% for apps' like other PDA systems ;)

They do touch the core vm/vfs code for one callback, which would be nice to
lose, but it is not obvious that it can be.




Re: [PATCH] Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Linus Torvalds



On Thu, 28 Dec 2000, Daniel Phillips wrote:

> [in vmscan.c]
> > Between line 573 and 594 the page can have 1 user and be unlocked, so it
> > can be removed by invalidate_inode_pages, and the mapping will be
> > cleared here:
> > http://innominate.org/~graichen/projects/lxr/source/mm/filemap.c?v=v2.3#L98
> 
> This seems like the obvious thing to do:
> 
> --- 2.4.0-test13.clean/mm/filemap.c   Fri Dec 29 03:14:58 2000
> +++ 2.4.0-test13/mm/filemap.c Fri Dec 29 03:16:21 2000
> @@ -96,6 +96,7 @@
>   remove_page_from_inode_queue(page);
>   remove_page_from_hash_queue(page);
>   page->mapping = NULL;
> + ClearPageDirty(page);
>  }
>  
>  void remove_inode_page(struct page *page)

No, I'd much rather have

if (PageDirty(page)) BUG();

there, and then have the free_swap_cache code clear the dirty bit.

We don't want to lose dirty bits by mistake. The only cases where it's ok
to clear the dirty bit is when we truncate a page completely (so it won't
be needed and obviously really shouldn't be written out) and when we've
lost the last user of a swap cache entry.

Any other cases might be bugs, where we remove a page from a mapping
without noticing that it is dirty (we had this bug in reclaim_pages(), for
example).
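
The rule above can be pictured with a small userspace model. This is only
a sketch under stated assumptions: struct toy_page, TOY_PG_DIRTY and the
toy_* helpers are made-up stand-ins for the kernel's definitions, and a
counter replaces BUG() so the program keeps running. The generic removal
path never clears the dirty bit; only a caller that knows the data is dead
(truncate here) clears it first.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; none of these are the kernel's real definitions. */
#define TOY_PG_DIRTY 0x1u

struct toy_page {
	unsigned int flags;
	void *mapping;		/* non-NULL while the page is in a mapping */
};

int toy_bug_hits;		/* counts the places BUG() would have fired */

int toy_page_dirty(const struct toy_page *page)
{
	return (page->flags & TOY_PG_DIRTY) != 0;
}

void toy_clear_page_dirty(struct toy_page *page)
{
	page->flags &= ~TOY_PG_DIRTY;
}

/* Generic removal: never hides a dirty bit, only complains about it. */
void toy_remove_inode_page(struct toy_page *page)
{
	assert(page != NULL);
	if (toy_page_dirty(page))
		toy_bug_hits++;	/* the kernel would BUG() here */
	page->mapping = NULL;
}

/* Truncation knows the contents are no longer wanted, so it may clear. */
void toy_truncate_page(struct toy_page *page)
{
	toy_clear_page_dirty(page);
	toy_remove_inode_page(page);
}
```

Clearing the bit inside toy_remove_inode_page() instead would make both
paths look clean and would silently drop data on the one that is a bug.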

Linus




Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Linus Torvalds



On Thu, 28 Dec 2000, Chris Wedgwood wrote:
> 
> this reminds me; perhaps you or Al could answer this.
> 
>   How hard would it be to have ramfs backed by swap? The goal being
>   to try to achieve something like FreeBSD's mfs.
> 
> I use ramfs for /tmp on my laptop -- it's very handy because it
> extends the amount of time the disk spends spun down and therefore
> battery life; but writing large files into /tmp can blow away the
> system or at the very least eat away at otherwise usable ram. Not
> terribly desirable.

Jeff Garzik had the code to do this, and the new shared memory code should
be able to be massaged to handle this all without actually bloating the
kernel (ie "ramfs" would still stay very very tiny, just taking advantage
of the common code that the VM layer already has to support for other
things).

Linus




[PATCH] Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Daniel Phillips

[in vmscan.c]
> Between line 573 and 594 the page can have 1 user and be unlocked, so it
> can be removed by invalidate_inode_pages, and the mapping will be
> cleared here:
> http://innominate.org/~graichen/projects/lxr/source/mm/filemap.c?v=v2.3#L98

This seems like the obvious thing to do:

--- 2.4.0-test13.clean/mm/filemap.c	Fri Dec 29 03:14:58 2000
+++ 2.4.0-test13/mm/filemap.c	Fri Dec 29 03:16:21 2000
@@ -96,6 +96,7 @@
 	remove_page_from_inode_queue(page);
 	remove_page_from_hash_queue(page);
 	page->mapping = NULL;
+	ClearPageDirty(page);
 }
 
 void remove_inode_page(struct page *page)



Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Chris Mason



On Thursday, December 28, 2000 15:51:24 -0200 Rik van Riel
<[EMAIL PROTECTED]> wrote:

> On Thu, 28 Dec 2000, Chris Mason wrote:
> 
>> I think a dirty page without a writepage func seems a bit
>> broken.  How about we give ramfs a writepage func that just
>> returns 1.  That way nobody does any special if
>> (ramfs_page(page)) kinds of tests...
> 
> This will lead to the ramfs pages staying on the inactive_dirty
> list forever, deadlocking the system.
> 


This wouldn't be the first time this week I've read this part of
page_launder wrong, but I don't see a difference between a NULL writepage
func, and a writepage func that returns 1.  I read the code like this:

if(PageDirty(page)) {
	...
	if (!writepage)
		goto page_active ;
	...
	result = writepage(page)
	if (result != 1)
		continue ;
	SetPageDirty(page);
	goto page_active ;
}
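
The reading above can be checked with a tiny userspace model. It is a
sketch of the summarized branch only, not of page_launder itself: the
elided "..." parts are left out, and struct launder_page, the outcome enum
and the two writepage stubs are hypothetical names invented for the
illustration.

```c
#include <assert.h>
#include <stddef.h>

enum launder_outcome {
	LAUNDER_NOT_DIRTY,	/* the dirty branch was not entered */
	LAUNDER_PAGE_ACTIVE,	/* "goto page_active" in the summary */
	LAUNDER_CONTINUE	/* "continue" in the summary */
};

struct launder_page {
	int dirty;
	int (*writepage)(struct launder_page *page);	/* may be NULL */
};

/* A writepage that does nothing and returns 1, as proposed for ramfs. */
int writepage_returns_1(struct launder_page *page)
{
	(void)page;
	return 1;
}

/* Any other return value takes the "continue" path in the summary. */
int writepage_returns_0(struct launder_page *page)
{
	(void)page;
	return 0;
}

enum launder_outcome launder_dirty_branch(struct launder_page *page)
{
	int result;

	assert(page != NULL);
	if (!page->dirty)
		return LAUNDER_NOT_DIRTY;
	if (!page->writepage)
		return LAUNDER_PAGE_ACTIVE;
	result = page->writepage(page);
	if (result != 1)
		return LAUNDER_CONTINUE;
	page->dirty = 1;	/* SetPageDirty(page) in the summary */
	return LAUNDER_PAGE_ACTIVE;
}
```

In this model a NULL method and a method returning 1 both end at
LAUNDER_PAGE_ACTIVE with the page still dirty, which is the equivalence
being argued; whether the real function agrees depends on the code hidden
behind the "..." lines.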

-chris



Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Rik van Riel

On Thu, 28 Dec 2000, Linus Torvalds wrote:
> On Thu, 28 Dec 2000, Rik van Riel wrote:
> > 
> > I've made a small debugging patch that simply checks
> > for this illegal state in add_page_to_active_list and
> > add_page_to_inactive_dirty_list.
> 
> I bet it won't catch the real bad guy, which almost certainly is
> the "remove_from_swap_cache()" thing (it should do a
> ClearPageDirty() before it removes it from the swapper_inode
> mapping).

Ohhh indeed. This is a very likely candidate which is
trivial to fix.

regards,

Rik
--
Hollywood goes for world dumbination,
Trailer at 11.

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com.br/




Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Rik van Riel

On Thu, 28 Dec 2000, Chris Mason wrote:

> I think a dirty page without a writepage func seems a bit
> broken.  How about we give ramfs a writepage func that just
> returns 1.  That way nobody does any special if
> (ramfs_page(page)) kinds of tests...

This will lead to the ramfs pages staying on the inactive_dirty
list forever, deadlocking the system.

Rik
--
Hollywood goes for world dumbination,
Trailer at 11.

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com.br/




Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Linus Torvalds



On Thu, 28 Dec 2000, Daniel Phillips wrote:
> 
> It's logical that PageDirty should never be set for ramfs, and a ramfs
> page should never have buffers on it.

What?

No no no.

You're obviously right that ramfs will never have buffers on the page, but
why shouldn't a ramfs page be dirty?

Of _course_ a ramfs page is dirty - that's the only thing that tells the
VM that it can't be thrown out.
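
A toy model of that point, assuming a reclaim test shaped like the patched
reclaim_page check posted elsewhere in this thread (buffers, dirty or
locked pages go to the dirty list). struct ram_page and both function
names are hypothetical; the real VM tracks far more state.

```c
#include <assert.h>
#include <stddef.h>

enum reclaim_list { LIST_INACTIVE_CLEAN, LIST_INACTIVE_DIRTY };

/* Hypothetical page state, reduced to the three bits the check reads. */
struct ram_page {
	int dirty;
	int has_buffers;
	int locked;
};

/* Dirty, buffered or locked pages are kept off the clean list. */
enum reclaim_list classify_page(const struct ram_page *page)
{
	assert(page != NULL);
	if (page->has_buffers || page->dirty || page->locked)
		return LIST_INACTIVE_DIRTY;
	return LIST_INACTIVE_CLEAN;
}

/* Clean-list pages can be reused at any time; their contents are lost. */
int may_discard(const struct ram_page *page)
{
	return classify_page(page) == LIST_INACTIVE_CLEAN;
}
```

A ramfs page has no buffers and no backing store, so in this model the
dirty bit is the only field that stops may_discard() from returning 1.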

Linus




Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Chris Mason



On Thursday, December 28, 2000 16:15:48 +0100 Daniel Phillips
<[EMAIL PROTECTED]> wrote:

> On Thu, 28 Dec 2000, Rik van Riel wrote:
>> On Thu, 28 Dec 2000, Daniel Phillips wrote:
>> 
>> > It's logical that PageDirty should never be set for ramfs,
>> 
>> No. Not setting PageDirty will cause the system to move the
>> page to the inactive_clean list and happily reclaim your data.
>> 
>> We _have to_ use something like PageDirty for this, and
>> checking for the ->writepage method will even allow us to
>> do stuff like dynamically switching swapping support for
>> ramfs on/off (or other funny things).
> 
> You're suggesting using the absence of a method as a kind of flag, but
> the code is really too full of obscure stuff like that already.
> 
> How about taking an extra user on the ramfs pages instead.  It doesn't
> sound right to set PageDirty when you are not requesting IO.

I think a dirty page without a writepage func seems a bit broken.  How
about we give ramfs a writepage func that just returns 1.  That way nobody
does any special if (ramfs_page(page)) kinds of tests...

-chris




Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Linus Torvalds



On Thu, 28 Dec 2000, Rik van Riel wrote:
> 
> I've made a small debugging patch that simply checks
> for this illegal state in add_page_to_active_list and
> add_page_to_inactive_dirty_list.

I bet it won't catch the real bad guy, which almost certainly is the
"remove_from_swap_cache()" thing (it should do a ClearPageDirty() before
it removes it from the swapper_inode mapping).

Linus




Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Daniel Phillips

Rik van Riel wrote:
> 
> On Thu, 28 Dec 2000, Rik van Riel wrote:
> > On Wed, 27 Dec 2000, Linus Torvalds wrote:
> > > On Wed, 27 Dec 2000, Rik van Riel wrote:
> > > >
> > > > The (trivial) patch below should fix this problem.
> > >
> > > It must be wrong.
> > >
> > > If we have a dirty page on the LRU lists, that page _must_ have
> > > a mapping.
> >
> > Hmm, last I looked buffercache pages didn't have
> > page->mapping set ...
> 
> OK, you're right ;)
> 
> We never set PG_dirty for buffercache pages, so a
> pure buffercache page shouldn't be caught here...

But we should, and something that does it is buried in Chris Mason's
patch under the thread [RFC] changes to buffer.c.  Chris is also trying
some other fancy things in that patch, but the buffer cache mapping part
is actually pretty simple.

For buffer cache pages I think the current thinking is that PG_dirty
should stay on until the last dirty buffer on the page has been
submitted for writing.
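
A rough userspace sketch of that bookkeeping, assuming a fixed number of
buffers per page. TOY_BUFFERS_PER_PAGE, struct buf_page and both helpers
are invented for the illustration and do not correspond to any kernel
interface.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BUFFERS_PER_PAGE 4

/* Hypothetical page with per-buffer dirty state. */
struct buf_page {
	int page_dirty;
	int buffer_dirty[TOY_BUFFERS_PER_PAGE];
};

/* Dirtying any buffer also marks the page dirty. */
void toy_mark_buffer_dirty(struct buf_page *page, int i)
{
	assert(page != NULL && i >= 0 && i < TOY_BUFFERS_PER_PAGE);
	page->buffer_dirty[i] = 1;
	page->page_dirty = 1;
}

/*
 * Submit buffer `i` for writing. The page stays dirty while any other
 * buffer is still dirty. Returns 1 if no dirty buffer remains.
 */
int toy_submit_buffer(struct buf_page *page, int i)
{
	int j;

	assert(page != NULL && i >= 0 && i < TOY_BUFFERS_PER_PAGE);
	page->buffer_dirty[i] = 0;
	for (j = 0; j < TOY_BUFFERS_PER_PAGE; j++)
		if (page->buffer_dirty[j])
			return 0;
	page->page_dirty = 0;
	return 1;
}
```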

--
Daniel



Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Daniel Phillips

On Thu, 28 Dec 2000, Rik van Riel wrote:
> On Thu, 28 Dec 2000, Daniel Phillips wrote:
> 
> > It's logical that PageDirty should never be set for ramfs,
> 
> No. Not setting PageDirty will cause the system to move the
> page to the inactive_clean list and happily reclaim your data.
> 
> We _have to_ use something like PageDirty for this, and
> checking for the ->writepage method will even allow us to
> do stuff like dynamically switching swapping support for
> ramfs on/off (or other funny things).

You're suggesting using the absence of a method as a kind of flag, but
the code is really too full of obscure stuff like that already.

How about taking an extra user on the ramfs pages instead.  It doesn't
sound right to set PageDirty when you are not requesting IO.

-- 
Daniel



Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Rik van Riel

On Thu, 28 Dec 2000, Daniel Phillips wrote:

> It's logical that PageDirty should never be set for ramfs,

No. Not setting PageDirty will cause the system to move the
page to the inactive_clean list and happily reclaim your data.

We _have to_ use something like PageDirty for this, and
checking for the ->writepage method will even allow us to
do stuff like dynamically switching swapping support for
ramfs on/off (or other funny things).

regards,

Rik
--
Hollywood goes for world dumbination,
Trailer at 11.

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com.br/




Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Daniel Phillips

Linus Torvalds wrote:
> 
> On Wed, 27 Dec 2000, Philipp Rumpf wrote:
> 
> > On Wed, Dec 27, 2000 at 03:41:04PM -0800, Linus Torvalds wrote:
> > > It must be wrong.
> > >
> > > If we have a dirty page on the LRU lists, that page _must_ have a mapping.
> >
> > What about pages with a mapping but without a writepage function ? or pages
> > whose writepage function fails ?  The current code seems to simply put the
> > page onto the active list in that case, which seems just as wrong to me.
> 
> ramfs. It doesn't have a writepage() function, as there is no backing
> store.
> 
> > > The bug is somewhere else, and your patch is just papering it over. We
> > > should not have a page without a mapping on the LRU lists in the first
> > > place, except if the page has anonymous buffers (and such a page cannot
> >
> > So is there any legal reason we could ever get to page_active ?  Removing
> > that code (or replacing it with BUG()) certainly would make page_launder
> > more readable.
> 
> Apart from the "we have no backing store", there is no legal reason to put
> it back on the active list that I can see.

It's logical that PageDirty should never be set for ramfs, and a ramfs
page should never have buffers on it.  With this and Chris's anon_space
mapping can we replace the check for null ->writepage with BUG?  With
the anon_space mapping we should be able to do the same for
->mapping.

Though these things aren't strictly bugs, having to check multiple paths
for everything is slowing us down in picking off the fluff.

--
Daniel



Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Rik van Riel

On Thu, 28 Dec 2000, Rik van Riel wrote:
> On Wed, 27 Dec 2000, Linus Torvalds wrote:
> > On Wed, 27 Dec 2000, Rik van Riel wrote:
> > > 
> > > The (trivial) patch below should fix this problem.
> > 
> > It must be wrong.
> > 
> > If we have a dirty page on the LRU lists, that page _must_ have
> > a mapping.
> 
> Hmm, last I looked buffercache pages didn't have
> page->mapping set ...

OK, you're right ;)

We never set PG_dirty for buffercache pages, so a
pure buffercache page shouldn't be caught here...

I've made a small debugging patch that simply checks
for this illegal state in add_page_to_active_list and
add_page_to_inactive_dirty_list.

regards,

Rik
--
Hollywood goes for world dumbination,
Trailer at 11.

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com.br/


--- include/linux/swap.h.orig	Thu Dec 28 12:29:29 2000
+++ include/linux/swap.h	Thu Dec 28 12:31:29 2000
@@ -206,7 +206,11 @@
 #define ZERO_PAGE_BUG \
 	if (page_count(page) == 0) BUG();
 
+#define DIRTY_NO_MAPPING \
+	if (PageDirty(page) && !page->mapping) BUG();
+
 #define add_page_to_active_list(page) { \
+	DIRTY_NO_MAPPING \
 	DEBUG_ADD_PAGE \
 	ZERO_PAGE_BUG \
 	SetPageActive(page); \
@@ -215,6 +219,7 @@
 }
 
 #define add_page_to_inactive_dirty_list(page) { \
+	DIRTY_NO_MAPPING \
 	DEBUG_ADD_PAGE \
 	ZERO_PAGE_BUG \
 	SetPageInactiveDirty(page); \




Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Daniel Phillips

Zlatko Calusic wrote:
>
> Rik van Riel <[EMAIL PROTECTED]> writes:
>
> > On Mon, 25 Dec 2000, Dan Aloni wrote:
> > > On 25 Dec 2000, Zlatko Calusic wrote:
> > >
> > > > Speaking of page_launder() I just stumbled upon two oopsen today on
> > > > the UP build. Maybe it could give a hint to someone, I'm not that good
> > > > at Oops decoding.
> > >
> > > I've decoded the Oops I got, and found that the problem is in
> > > vmscan.c:line-605, where page->mapping is NULL and a_ops gets
> > > resolved and dereferenced at 0x000c.
> >
> > The code assumes that every page which has the PG_dirty
> > bit set also has page->mapping set to a valid value.
> >
> > The BUG() people are getting confirms that this assumption
> > is not necessarily true and the VM work that's going on will
> > most likely make it not be true either in some cases.
> >
> > The (trivial) patch below should fix this problem...
> >
> Looking at the patch, I'm practically sure it will cure the symptoms.
> But I'm still slightly worried about those pages we skip in
> there. Maybe we should at least try to discover what are those pages,
> and then maybe it will become obvious what we need (or not) to do with
> them.
>
> Some strategic printk()s could probably give us some clue.

Between line 573 and 594 the page can have 1 user and be unlocked, so it
can be removed by invalidate_inode_pages, and the mapping will be
cleared here:
http://innominate.org/~graichen/projects/lxr/source/mm/filemap.c?v=v2.3#L98
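
The window described above can be modeled in userspace. This is a sketch
under assumptions, not the kernel code: struct race_page and the toy_*
functions are hypothetical, and the "guarded" variant simply adds the
dirty test that a later patch in this thread puts in front of the lock
attempt.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page state, just enough to show the window. */
struct race_page {
	int dirty;
	int locked;
	int count;	/* number of users */
	void *mapping;
};

/* Invalidate without looking at the dirty bit. Returns 1 if removed. */
int toy_invalidate_unguarded(struct race_page *page)
{
	assert(page != NULL);
	if (page->locked || page->count != 1)
		return 0;	/* locked or in use: left alone */
	page->mapping = NULL;
	return 1;
}

/* Same, but a dirty page is skipped before anything else happens. */
int toy_invalidate_guarded(struct race_page *page)
{
	assert(page != NULL);
	if (page->dirty)
		return 0;
	return toy_invalidate_unguarded(page);
}

/* The state that breaks code assuming dirty pages have a mapping. */
int toy_dirty_without_mapping(const struct race_page *page)
{
	return page->dirty && page->mapping == NULL;
}
```

With one user and no lock held, the unguarded version leaves a page that
is still dirty but has no mapping, so anything that follows page->mapping
for a dirty page dereferences NULL.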

--
Daniel



Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Rik van Riel

On Wed, 27 Dec 2000, Linus Torvalds wrote:
> On Wed, 27 Dec 2000, Rik van Riel wrote:
> > 
> > The (trivial) patch below should fix this problem.
> 
> It must be wrong.
> 
> If we have a dirty page on the LRU lists, that page _must_ have
> a mapping.

Hmm, last I looked buffercache pages didn't have
page->mapping set ...

And before anyone gets afraid that pure buffercache
pages get skipped ... they're caught by page_launder()
just fine, in the `if (page->buffers)' code below the
dirty page writeout code.

regards,

Rik
--
Hollywood goes for world dumbination,
Trailer at 11.

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com.br/




Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Alan Cox

> does anyone other than me think that the pm code is *way* too aggressive about
> spinning down the hard drive? my 256mb laptop (2.2.16) will only spin down the
> disk for about 30 seconds before it decides it's got something else it feels
> like writing out, and spins back up. Spin-up has got to be more wasteful than
> just leaving the drive spinning...

Take that up with your BIOS. Hard disk power management is done by the drive
and BIOS settings. hdparm will let you override it. You can also get good
results using noatime to avoid atime writes on reading.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Alan Cox

 does anyone other than me think that the pm code is *way* too agressive about
 spinning down the hard drive? my 256mb laptop (2.2.16) will only spin down the
 disk for about 30 seconds before it decides it's got something else it feels
 like writing out, and spins back up. Spinnup has got to be more wasteful than
 just leaving the drive spinning...

Take that up with your bios. Hard disk power management is done by the drive
and bios settings. hdparm will let you override it. You can also get good
results using noatime to avoid atime writes on reading

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Rik van Riel

On Wed, 27 Dec 2000, Linus Torvalds wrote:
 On Wed, 27 Dec 2000, Rik van Riel wrote:
  
  The (trivial) patch below should fix this problem.
 
 It must be wrong.
 
 If we have a dirty page on the LRU lists, that page _must_ have
 a mapping.

Hmm, last I looked buffercache pages didn't have
page-mapping set ...

And before anyone gets afraid that pure buffercache
pages get skipped ... they're caught by page_launder()
just fine, in the `if (page-buffers)' code below the
dirty page writeout code.

regards,

Rik
--
Hollywood goes for world dumbination,
Trailer at 11.

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com.br/




Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Daniel Phillips

Zlatko Calusic wrote:
> 
> Rik van Riel <[EMAIL PROTECTED]> writes:
> 
> > On Mon, 25 Dec 2000, Dan Aloni wrote:
> > > On 25 Dec 2000, Zlatko Calusic wrote:
> > > 
> > > > Speaking of page_launder() I just stumbled upon two oopsen today on
> > > > the UP build. Maybe it could give a hint to someone, I'm not that good
> > > > at Oops decoding.
> > > 
> > > I've decoded the Oops I got, and found that the problem is in
> > > vmscan.c:line-605, where page->mapping is NULL and a_ops gets
> > > resolved and dereferenced at 0x000c.
> > 
> > The code assumes that every page which has the PG_dirty
> > bit set also has page->mapping set to a valid value.
> > 
> > The BUG() people are getting confirms that this assumption
> > is not necessarily true and the VM work that's going on will
> > most likely make it not be true either in some cases.
> > 
> > The (trivial) patch below should fix this problem...
> 
> Looking at the patch, I'm practically sure it will cure the symptoms.
> But I'm still slightly worried about those pages we skip in
> there. Maybe we should at least try to discover what are those pages,
> and then maybe it will become obvious what we need (or not) to do with
> them.
> 
> Some strategic printk()s could probably give us some clue.

Between line 573 and 594 the page can have 1 user and be unlocked, so it
can be removed by invalidate_inode_pages, and the mapping will be
cleared here:
http://innominate.org/~graichen/projects/lxr/source/mm/filemap.c?v=v2.3#L98

--
Daniel



Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Rik van Riel

On Thu, 28 Dec 2000, Rik van Riel wrote:
> On Wed, 27 Dec 2000, Linus Torvalds wrote:
> > On Wed, 27 Dec 2000, Rik van Riel wrote:
> > > 
> > > The (trivial) patch below should fix this problem.
> > 
> > It must be wrong.
> > 
> > If we have a dirty page on the LRU lists, that page _must_ have
> > a mapping.
> 
> Hmm, last I looked buffercache pages didn't have
> page->mapping set ...

OK, you're right ;)

We never set PG_dirty for buffercache pages, so a
pure buffercache page shouldn't be caught here...

I've made a small debugging patch that simply checks
for this illegal state in add_page_to_active_list and
add_page_to_inactive_dirty_list.

regards,

Rik
--
Hollywood goes for world dumbination,
Trailer at 11.

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com.br/


--- include/linux/swap.h.orig   Thu Dec 28 12:29:29 2000
+++ include/linux/swap.h        Thu Dec 28 12:31:29 2000
@@ -206,7 +206,11 @@
 #define ZERO_PAGE_BUG \
    if (page_count(page) == 0) BUG();
 
+#define DIRTY_NO_MAPPING \
+   if (PageDirty(page) && !page->mapping) BUG();
+
 #define add_page_to_active_list(page) { \
+   DIRTY_NO_MAPPING \
    DEBUG_ADD_PAGE \
    ZERO_PAGE_BUG \
    SetPageActive(page); \
@@ -215,6 +219,7 @@
 }
 
 #define add_page_to_inactive_dirty_list(page) { \
+   DIRTY_NO_MAPPING \
    DEBUG_ADD_PAGE \
    ZERO_PAGE_BUG \
    SetPageInactiveDirty(page); \




Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Daniel Phillips

Linus Torvalds wrote:
> 
> On Wed, 27 Dec 2000, Philipp Rumpf wrote:
> 
> > On Wed, Dec 27, 2000 at 03:41:04PM -0800, Linus Torvalds wrote:
> > > It must be wrong.
> > > 
> > > If we have a dirty page on the LRU lists, that page _must_ have a mapping.
> > 
> > What about pages with a mapping but without a writepage function ? or pages
> > whose writepage function fails ?  The current code seems to simply put the
> > page onto the active list in that case, which seems just as wrong to me.
> 
> ramfs. It doesn't have a writepage() function, as there is no backing
> store.
> 
> > > The bug is somewhere else, and your patch is just papering it over. We
> > > should not have a page without a mapping on the LRU lists in the first
> > > place, except if the page has anonymous buffers (and such a page cannot
> > 
> > So is there any legal reason we could ever get to page_active ?  Removing
> > that code (or replacing it with BUG()) certainly would make page_launder
> > more readable.
> 
> Apart from the "we have no backing store", there is no legal reason to put
> it back on the active list that I can see.

It's logical that PageDirty should never be set for ramfs, and a ramfs
page should never have buffers on it.  With this and Chris's anon_space
mapping can we replace the check for null ->writepage with BUG?  With
the anon_space mapping we should be able to do the same for
->mapping.

Though these things aren't strictly bugs, having to check multiple paths
for everything is slowing us down in picking off the fluff.

--
Daniel



Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Rik van Riel

On Thu, 28 Dec 2000, Daniel Phillips wrote:

> It's logical that PageDirty should never be set for ramfs,

No. Not setting PageDirty will cause the system to move the
page to the inactive_clean list and happily reclaim your data.

We _have to_ use something like PageDirty for this, and
checking for the ->writepage method will even allow us to
do stuff like dynamically switching swapping support for
ramfs on/off (or other funny things).

regards,

Rik
--
Hollywood goes for world dumbination,
Trailer at 11.

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com.br/




Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Daniel Phillips

On Thu, 28 Dec 2000, Rik van Riel wrote:
> On Thu, 28 Dec 2000, Daniel Phillips wrote:
> 
> > It's logical that PageDirty should never be set for ramfs,
> 
> No. Not setting PageDirty will cause the system to move the
> page to the inactive_clean list and happily reclaim your data.
> 
> We _have to_ use something like PageDirty for this, and
> checking for the ->writepage method will even allow us to
> do stuff like dynamically switching swapping support for
> ramfs on/off (or other funny things).

You're suggesting using the absence of a method as a kind of flag, but
the code is really too full of obscure stuff like that already.

How about taking an extra user on the ramfs pages instead.  It doesn't
sound right to set PageDirty when you are not requesting IO.

-- 
Daniel



Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Daniel Phillips

Rik van Riel wrote:
> 
> On Thu, 28 Dec 2000, Rik van Riel wrote:
> > On Wed, 27 Dec 2000, Linus Torvalds wrote:
> > > On Wed, 27 Dec 2000, Rik van Riel wrote:
> > > > 
> > > > The (trivial) patch below should fix this problem.
> > > 
> > > It must be wrong.
> > > 
> > > If we have a dirty page on the LRU lists, that page _must_ have
> > > a mapping.
> > 
> > Hmm, last I looked buffercache pages didn't have
> > page->mapping set ...
> 
> OK, you're right ;)
> 
> We never set PG_dirty for buffercache pages, so a
> pure buffercache page shouldn't be caught here...

But we should, and something that does it is buried in Chris Mason's
patch under the thread [RFC] changes to buffer.c.  Chris is also trying
some other fancy things in that patch, but the buffer cache mapping part
is actually pretty simple.

For buffer cache pages I think the current thinking is that PG_dirty
should stay on until the last dirty buffer on the page has been
submitted for writing.

--
Daniel



Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Linus Torvalds



On Thu, 28 Dec 2000, Rik van Riel wrote:
> 
> I've made a small debugging patch that simply checks
> for this illegal state in add_page_to_active_list and
> add_page_to_inactive_dirty_list.

I bet it won't catch the real bad guy, which almost certainly is the
"remove_from_swap_cache()" thing (it should do a ClearPageDirty() before
it removes it from the swapper_inode mapping).

Linus




Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Chris Mason



On Thursday, December 28, 2000 16:15:48 +0100 Daniel Phillips
<[EMAIL PROTECTED]> wrote:

> On Thu, 28 Dec 2000, Rik van Riel wrote:
> > On Thu, 28 Dec 2000, Daniel Phillips wrote:
> > 
> > > It's logical that PageDirty should never be set for ramfs,
> > 
> > No. Not setting PageDirty will cause the system to move the
> > page to the inactive_clean list and happily reclaim your data.
> > 
> > We _have to_ use something like PageDirty for this, and
> > checking for the ->writepage method will even allow us to
> > do stuff like dynamically switching swapping support for
> > ramfs on/off (or other funny things).
> 
> You're suggesting using the absence of a method as a kind of flag, but
> the code is really too full of obscure stuff like that already.
> 
> How about taking an extra user on the ramfs pages instead.  It doesn't
> sound right to set PageDirty when you are not requesting IO.

I think a dirty page without a writepage func seems a bit broken.  How
about we give ramfs a writepage func that just returns 1.  That way nobody
does any special if (ramfs_page(page)) kinds of tests...

-chris




Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Linus Torvalds



On Thu, 28 Dec 2000, Daniel Phillips wrote:
> 
> It's logical that PageDirty should never be set for ramfs, and a ramfs
> page should never have buffers on it.

What?

No no no.

You're obviously right that ramfs will never have buffers on the page, but
why shouldn't a ramfs page be dirty?

Of _course_ a ramfs page is dirty - that's the only thing that tells the
VM that it can't be thrown out.
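
The point above can be pictured with a minimal userspace sketch. It assumes a toy `struct toy_page` and a hypothetical `toy_reclaim()` pass; neither is kernel code, and the real reclaim logic is far more involved. The only thing it models is that a reclaim pass which drops clean pages will leave a dirty page alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy page: a stand-in for the kernel's struct page, not the real layout. */
struct toy_page {
    bool dirty;     /* models PG_dirty */
    bool resident;  /* true while the page still holds its data */
};

/* Hypothetical reclaim pass: only clean pages may be dropped.
 * Returns the number of pages that were freed. */
size_t toy_reclaim(struct toy_page *pages, size_t n)
{
    size_t freed = 0;

    for (size_t i = 0; i < n; i++) {
        if (pages[i].resident && !pages[i].dirty) {
            pages[i].resident = false;
            freed++;
        }
    }
    return freed;
}

/* Usage: a dirty "ramfs" page survives, a clean page is reclaimed. */
void toy_reclaim_demo(void)
{
    struct toy_page pages[2] = {
        { .dirty = true,  .resident = true },  /* ramfs data, no backing store */
        { .dirty = false, .resident = true },  /* clean page-cache page */
    };

    assert(toy_reclaim(pages, 2) == 1);
    assert(pages[0].resident);   /* the dirty page was not thrown out */
    assert(!pages[1].resident);  /* the clean page was */
}
```

In this model the dirty flag is the only thing protecting the ramfs data, which is the role described above.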

Linus




Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Rik van Riel

On Thu, 28 Dec 2000, Chris Mason wrote:

> I think a dirty page without a writepage func seems a bit
> broken.  How about we give ramfs a writepage func that just
> returns 1.  That way nobody does any special if
> (ramfs_page(page)) kinds of tests...

This will lead to the ramfs pages staying on the inactive_dirty
list forever, deadlocking the system.

Rik
--
Hollywood goes for world dumbination,
Trailer at 11.

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com.br/




Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Rik van Riel

On Thu, 28 Dec 2000, Linus Torvalds wrote:
> On Thu, 28 Dec 2000, Rik van Riel wrote:
> > 
> > I've made a small debugging patch that simply checks
> > for this illegal state in add_page_to_active_list and
> > add_page_to_inactive_dirty_list.
> 
> I bet it won't catch the real bad guy, which almost certainly is
> the "remove_from_swap_cache()" thing (it should do a
> ClearPageDirty() before it removes it from the swapper_inode
> mapping).

Ohhh indeed. This is a very likely candidate which is
trivial to fix.

regards,

Rik
--
Hollywood goes for world dumbination,
Trailer at 11.

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com.br/




Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Chris Mason



On Thursday, December 28, 2000 15:51:24 -0200 Rik van Riel
<[EMAIL PROTECTED]> wrote:

> On Thu, 28 Dec 2000, Chris Mason wrote:
> 
> > I think a dirty page without a writepage func seems a bit
> > broken.  How about we give ramfs a writepage func that just
> > returns 1.  That way nobody does any special if
> > (ramfs_page(page)) kinds of tests...
> 
> This will lead to the ramfs pages staying on the inactive_dirty
> list forever, deadlocking the system.


This wouldn't be the first time this week I've read this part of
page_launder wrong, but I don't see a difference between a NULL writepage
func, and a writepage func that returns 1.  I read the code like this:

if (PageDirty(page)) {
    ...
    if (!writepage)
        goto page_active;
    ...
    result = writepage(page);
    if (result != 1)
        continue;
    SetPageDirty(page);
    goto page_active;
}
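
The reading above can be checked against a small userspace model. This is a sketch under stated assumptions: `struct toy_page`, `toy_launder_one()` and both writepage helpers are hypothetical names, the elided `...` parts of page_launder are not modelled, and `goto page_active` / `continue` are represented by enum values.

```c
#include <assert.h>
#include <stddef.h>

struct toy_page;
typedef int (*toy_writepage_fn)(struct toy_page *);

/* Toy page: writepage stands in for page->mapping->a_ops->writepage. */
struct toy_page {
    int dirty;
    toy_writepage_fn writepage;
};

/* What the modelled branch does with a page. */
enum toy_fate { TOY_NOT_DIRTY, TOY_PAGE_ACTIVE, TOY_CONTINUE };

/* Hypothetical ramfs-style method: always answers 1. */
int toy_writepage_refuse(struct toy_page *page) { (void)page; return 1; }

/* Hypothetical method for a page that can be written: answers 0. */
int toy_writepage_ok(struct toy_page *page) { (void)page; return 0; }

/* Mirrors the branch structure quoted above, nothing more. */
enum toy_fate toy_launder_one(struct toy_page *page)
{
    if (!page->dirty)
        return TOY_NOT_DIRTY;
    if (page->writepage == NULL)
        return TOY_PAGE_ACTIVE;      /* goto page_active */
    if (page->writepage(page) != 1)
        return TOY_CONTINUE;         /* continue */
    page->dirty = 1;                 /* SetPageDirty(page) */
    return TOY_PAGE_ACTIVE;          /* goto page_active */
}

/* Usage: a NULL method and a method returning 1 end up in the same place. */
void toy_launder_demo(void)
{
    struct toy_page no_method = { 1, NULL };
    struct toy_page refuses   = { 1, toy_writepage_refuse };
    struct toy_page writes    = { 1, toy_writepage_ok };

    assert(toy_launder_one(&no_method) == TOY_PAGE_ACTIVE);
    assert(toy_launder_one(&refuses) == TOY_PAGE_ACTIVE);
    assert(toy_launder_one(&writes) == TOY_CONTINUE);
}
```

Within this simplified model the two cases are indistinguishable; whether the real code has other differences depends on the parts elided above.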

-chris



[PATCH] Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Daniel Phillips

[in vmscan.c]
> Between line 573 and 594 the page can have 1 user and be unlocked, so it
> can be removed by invalidate_inode_pages, and the mapping will be
> cleared here:
> http://innominate.org/~graichen/projects/lxr/source/mm/filemap.c?v=v2.3#L98

This seems like the obvious thing to do:

--- 2.4.0-test13.clean/mm/filemap.c Fri Dec 29 03:14:58 2000
+++ 2.4.0-test13/mm/filemap.c   Fri Dec 29 03:16:21 2000
@@ -96,6 +96,7 @@
    remove_page_from_inode_queue(page);
    remove_page_from_hash_queue(page);
    page->mapping = NULL;
+   ClearPageDirty(page);
 }
 
 void remove_inode_page(struct page *page)



Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Linus Torvalds



On Thu, 28 Dec 2000, Chris Wedgwood wrote:
> 
> this remind me; perhaps you or Al could answer this.
> 
>   How hard would it be to have ramfs backed by swap? The goal being
>   try to achieve something like a FreeBSDs mfs.
> 
> I use ramfs for /tmp on my laptop -- it's very handy because it
> extends the amount of the the disk had spent spun down and therefore
> battery life; but writing large files into /tmp can blow away the
> system or at the very least eat away at otherwise usable ram. Not
> terribly desirable.

Jeff Garzik had the code to do this, and the new shared memory code should
be able to be massaged to handle this all without actually bloating the
kernel (ie "ramfs" would still stay very very tiny, just taking advantage
of the common code that the VM layer already has to support for other
things).

Linus




Re: [PATCH] Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Linus Torvalds



On Thu, 28 Dec 2000, Daniel Phillips wrote:

> [in vmscan.c]
> > Between line 573 and 594 the page can have 1 user and be unlocked, so it
> > can be removed by invalidate_inode_pages, and the mapping will be
> > cleared here:
> > http://innominate.org/~graichen/projects/lxr/source/mm/filemap.c?v=v2.3#L98
> 
> This seems like the obvious thing to do:
> 
> --- 2.4.0-test13.clean/mm/filemap.c   Fri Dec 29 03:14:58 2000
> +++ 2.4.0-test13/mm/filemap.c Fri Dec 29 03:16:21 2000
> @@ -96,6 +96,7 @@
>     remove_page_from_inode_queue(page);
>     remove_page_from_hash_queue(page);
>     page->mapping = NULL;
> +   ClearPageDirty(page);
>  }
>  
>  void remove_inode_page(struct page *page)

No, I'd much rather have

if (PageDirty(page)) BUG();

there, and then have the free_swap_cache code clear the dirty bit.

We don't want to lose dirty bits by mistake. The only cases where it's ok
to clear the dirty bit is when we truncate a page completely (so it won't
be needed and obviously really shouldn't be written out) and when we've
lost the last user of a swap cache entry.

Any other cases might be bugs, where we remove a page from a mapping
without noticing that it is dirty (we had this bug in reclaim_pages(), for
example).
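
A rough userspace sketch of that policy follows, assuming a toy `struct toy_page` and hypothetical helpers `toy_remove_inode_page()` and `toy_truncate_page()`. Where the kernel would hit BUG(), the model simply refuses and returns false, so the behaviour can be exercised without aborting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy page: mapping is an opaque pointer here, not a real address_space. */
struct toy_page {
    bool dirty;
    void *mapping;
};

/* Detach a page from its mapping. Returns false (where the kernel version
 * would BUG()) if the page is still dirty, and leaves the page untouched. */
bool toy_remove_inode_page(struct toy_page *page)
{
    if (page->dirty)
        return false;
    page->mapping = NULL;
    return true;
}

/* Truncation discards the data, so clearing the dirty bit first is one of
 * the legitimate cases named above. */
bool toy_truncate_page(struct toy_page *page)
{
    page->dirty = false;
    return toy_remove_inode_page(page);
}

/* Usage: plain removal of a dirty page is rejected, truncation succeeds. */
void toy_remove_demo(void)
{
    int fake_mapping = 0;
    struct toy_page page = { true, &fake_mapping };

    assert(!toy_remove_inode_page(&page));
    assert(page.dirty && page.mapping == &fake_mapping);

    assert(toy_truncate_page(&page));
    assert(!page.dirty && page.mapping == NULL);
}
```

The design choice being illustrated is that the dirty bit is cleared only by callers that know the data is no longer needed, never silently inside the generic removal helper.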

Linus




Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Alan Cox

> > I use ramfs for /tmp on my laptop -- it's very handy because it
> > extends the amount of the the disk had spent spun down and therefore
> > battery life; but writing large files into /tmp can blow away the
> > system or at the very least eat away at otherwise usable ram. Not
> > terribly desirable.
> 
> Jeff Garzik had the code to do this, and the new shared memory code should
> be able to be massaged to handle this all without actually bloating the
> kernel (ie "ramfs" would still stay very very tiny, just taking advantage
> of the common code that the VM layer already has to support for other
> things).

The ramfs maintainer has patches (in -ac) which deal with the size limiting
of ramfs. I'm using it on a PDA and it's really really nice to be able to
pop up a GUI app and drag the bar to '60% for apps' like other PDA systems ;)

They do touch the core vm/vfs code for one callback, which would be nice to
lose, but it is not obvious that it can be.




Re: [PATCH] Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Daniel Phillips

Linus Torvalds wrote:
> No, I'd much rather have
> 
> if (PageDirty(page)) BUG();
> 
> there, and then have the free_swap_cache code clear the dirty bit.
> 
> We don't want to lose dirty bits by mistake. The only cases where it's ok
> to clear the dirty bit is when we truncate a page completely (so it won't
> be needed and obviously really shouldn't be written out) and when we've
> lost the last user of a swap cache entry.
> 
> Any other cases might be bugs, where we remove a page from a mapping
> without noticing that it is dirty (we had this bug in reclaim_pages(), for
> example).

And in this case it's clear we lose data with nfs and smbfs that way. 
Maybe this is more like it:

--- 2.4.0-test13.clean/mm/filemap.c Fri Dec 29 03:14:58 2000
+++ 2.4.0-test13/mm/filemap.c   Fri Dec 29 04:13:27 2000
@@ -132,7 +132,7 @@
        curr = curr->next;
 
        /* We cannot invalidate a locked page */
-       if (TryLockPage(page))
+       if (PageDirty(page) || TryLockPage(page))
            continue;
 
        /* Neither can we invalidate something in use.. */



Re: [PATCH] Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Daniel Phillips

Linus Torvalds wrote:
> We don't want to lose dirty bits by mistake. The only cases where it's ok
> to clear the dirty bit is when we truncate a page completely (so it won't
> be needed and obviously really shouldn't be written out) and when we've
> lost the last user of a swap cache entry.
> 
> Any other cases might be bugs, where we remove a page from a mapping
> without noticing that it is dirty (we had this bug in reclaim_pages(), for
> example).

I tried to go the lazy way the first time.  This time I put the BUG in
there and then went chasing all the places that do
(__)remove_inode_page.  There are tentacles all over the place but I
*think* I found them all.  That turned up one more place needing
changing, and this is one you already spotted a couple of days ago, in
reclaim_page.  I subjected this patch to some dbenching without
triggering the BUG.

The try_to_unuse function calls delete_from_swap_cache and this is
pretty unfamiliar stuff for me, but it looks like the page is just
freshly read and couldn't be dirty.  There's one more case in
arch/m68k/atari/stram.c (unswap_by_read), similar to try_to_unuse.

OK, I see you just posted -pre5 while I was making the patch, but here
it is anyway, as a cross-check.

--- 2.4.0-test13-pre4.clean/mm/filemap.c    Fri Dec 29 03:14:58 2000
+++ 2.4.0-test13-pre4/mm/filemap.c  Fri Dec 29 04:29:09 2000
@@ -96,6 +96,7 @@
    remove_page_from_inode_queue(page);
    remove_page_from_hash_queue(page);
    page->mapping = NULL;
+   if (PageDirty(page)) BUG();
 }
 
 void remove_inode_page(struct page *page)
@@ -132,7 +133,7 @@
        curr = curr->next;
 
        /* We cannot invalidate a locked page */
-       if (TryLockPage(page))
+       if (PageDirty(page) || TryLockPage(page))
            continue;
 
        /* Neither can we invalidate something in use.. */
--- 2.4.0-test13-pre4.clean/mm/vmscan.c Fri Dec 29 03:14:58 2000
+++ 2.4.0-test13-pre4/mm/vmscan.c   Fri Dec 29 04:30:48 2000
@@ -484,7 +484,7 @@
        }
 
        /* The page is dirty, or locked, move to inactive_dirty list. */
-       if (page->buffers || TryLockPage(page)) {
+       if (page->buffers || PageDirty(page) || TryLockPage(page)) {
            del_page_from_inactive_clean_list(page);
            add_page_to_inactive_dirty_list(page);
            continue;



Re: [PATCH] Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Linus Torvalds



On Thu, 28 Dec 2000, Daniel Phillips wrote:
> 
> OK, I see you just posted -pre5 while I was making the patch, but here
> it is anyway, as a cross-check.

Ok, pre-5 should have all the same places you found already fixed, but
please do give it some heavy-duty testing to make sure there isn't
anything hidden..

Linus




Re: innd mmap bug in 2.4.0-test12

2000-12-28 Thread Mo McKinlay

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


> > > does anyone other than me think that the pm code is *way* too aggressive about
> > > spinning down the hard drive? my 256mb laptop (2.2.16) will only spin down the
> > > disk for about 30 seconds before it decides it's got something else it feels
> > > like writing out, and spins back up. Spin-up has got to be more wasteful than
> > > just leaving the drive spinning...

Yup, I find this - especially if I hang around in X for a while.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.4 (GNU/Linux)
Comment: Made with pgp4pine

iD8DBQE6S7JJRcGgB3aidfkRAhpkAJ9UYVhD+sRqADqUMm2i+UgbuYS8kACgzG4E
WhqPEhm6XHixqHpUOFQ4els=
=yQKY
-----END PGP SIGNATURE-----




Re: innd mmap bug in 2.4.0-test12 (UNIMPORTANT)

2000-12-28 Thread Pau

On Thu, 28 Dec 2000, Alan Cox wrote:

> The ramfs maintainer has patches (in -ac) which deal with the size limiting
> of ramfs. I'm using it on a PDA and it's really really nice to be able to
> pop up a GUI app and drag the bar to '60% for apps' like other PDA systems ;)

May I ask which PDA do you have running Linux and worth a few bucks?

Pau




Re: innd mmap bug in 2.4.0-test12

2000-12-27 Thread Ari Heitner


On Thu, 28 Dec 2000, Chris Wedgwood wrote:

> (cc' list trimmed)
[further]
> I use ramfs for /tmp on my laptop -- it's very handy because it
> extends the amount of the the disk had spent spun down and therefore
> battery life; but writing large files into /tmp can blow away the
> system or at the very least eat away at otherwise usable ram. Not
> terribly desirable.

mph.

does anyone other than me think that the pm code is *way* too aggressive about
spinning down the hard drive? my 256mb laptop (2.2.16) will only spin down the
disk for about 30 seconds before it decides it's got something else it feels
like writing out, and spins back up. Spin-up has got to be more wasteful than
just leaving the drive spinning...

unless the vfs code is cleaning dirty pages to disk when there's no activity,
then any time the vfs writes to disk, it's got something that's really hot and
shouldn't stay in ram ... if that's the case, and the frequency of such
occurrences is so high, perhaps the pm code should wait a couple of times as
long as it does before it spins down. basically, if there's still code doing
stuff, the drive should be spinning. only spindown when we're sure the user has
walked away for good :)





ari






Re: innd mmap bug in 2.4.0-test12

2000-12-27 Thread Dan Aloni

On Thu, 28 Dec 2000, Dan Aloni wrote:

> On Wed, 27 Dec 2000, Philipp Rumpf wrote:
> 
> > On Wed, Dec 27, 2000 at 03:41:04PM -0800, Linus Torvalds wrote:
> > > It must be wrong.
> > > 
> > > If we have a dirty page on the LRU lists, that page _must_ have a mapping.
> > 
> > What about pages with a mapping but without a writepage function ? or pages
> > whose writepage function fails ?  The current code seems to simply put the
> > page onto the active list in that case, which seems just as wrong to me.
> 
> I noticed the offset of 'mapping' in struct page doesn't match the offset
> which gets resolved at absolute address 0x000c according to the Oops,
> but it does match the offset of a_ops in struct address_space, which
> makes me wonder if it's possible that a_ops == NULL. Suggestions?

I was confused, or maybe it was temporary insanity; ignore that ;-)

-- 
Dan Aloni 
[EMAIL PROTECTED]




Re: innd mmap bug in 2.4.0-test12

2000-12-27 Thread Dan Aloni

On Wed, 27 Dec 2000, Philipp Rumpf wrote:

> On Wed, Dec 27, 2000 at 03:41:04PM -0800, Linus Torvalds wrote:
> > It must be wrong.
> > 
> > If we have a dirty page on the LRU lists, that page _must_ have a mapping.
> 
> What about pages with a mapping but without a writepage function ? or pages
> whose writepage function fails ?  The current code seems to simply put the
> page onto the active list in that case, which seems just as wrong to me.

I noticed the offset of 'mapping' in struct page doesn't match the offset
which gets resolved at absolute address 0x000c according to the Oops,
but it does match the offset of a_ops in struct address_space, which
makes me wonder if it's possible that a_ops == NULL. Suggestions?
 
-- 
Dan Aloni 
[EMAIL PROTECTED]




Re: innd mmap bug in 2.4.0-test12

2000-12-27 Thread Linus Torvalds



On Wed, 27 Dec 2000, Philipp Rumpf wrote:

> On Wed, Dec 27, 2000 at 03:41:04PM -0800, Linus Torvalds wrote:
> > It must be wrong.
> > 
> > If we have a dirty page on the LRU lists, that page _must_ have a mapping.
> 
> What about pages with a mapping but without a writepage function ? or pages
> whose writepage function fails ?  The current code seems to simply put the
> page onto the active list in that case, which seems just as wrong to me.

ramfs. It doesn't have a writepage() function, as there is no backing
store.

> > The bug is somewhere else, and your patch is just papering it over. We
> > should not have a page without a mapping on the LRU lists in the first
> > place, except if the page has anonymous buffers (and such a page cannot
> 
> So is there any legal reason we could ever get to page_active ?  Removing
> that code (or replacing it with BUG()) certainly would make page_launder
> more readable.

Apart from the "we have no backing store", there is no legal reason to put
it back on the active list that I can see.

Linus




Re: innd mmap bug in 2.4.0-test12

2000-12-27 Thread Philipp Rumpf

On Wed, Dec 27, 2000 at 03:41:04PM -0800, Linus Torvalds wrote:
> It must be wrong.
> 
> If we have a dirty page on the LRU lists, that page _must_ have a mapping.

What about pages with a mapping but without a writepage function ? or pages
whose writepage function fails ?  The current code seems to simply put the
page onto the active list in that case, which seems just as wrong to me.

> The bug is somewhere else, and your patch is just papering it over. We
> should not have a page without a mapping on the LRU lists in the first
> place, except if the page has anonymous buffers (and such a page cannot

So is there any legal reason we could ever get to page_active ?  Removing
that code (or replacing it with BUG()) certainly would make page_launder
more readable.



Re: innd mmap bug in 2.4.0-test12

2000-12-27 Thread Linus Torvalds



On Wed, 27 Dec 2000, Rik van Riel wrote:
> 
> The (trivial) patch below should fix this problem.

It must be wrong.

If we have a dirty page on the LRU lists, that page _must_ have a mapping.

What semantics would you say a non-mapped page has? What are the LRU
routines supposed to do with such a page? 

The bug is somewhere else, and your patch is just papering it over. We
should not have a page without a mapping on the LRU lists in the first
place, except if the page has anonymous buffers (and such a page cannot
legally be dirty as things are in the standard kernel - Chris Mason has
been working on stuff that would make that a normal thing, but it would
also make the page have a mapping).

We'd better add a debug check that makes sure that we don't have
non-mapped non-buffer pages on the LRU lists, and figure out how such a
thing could happen.
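
One way to picture such a check is the following userspace sketch. It assumes toy stand-ins (`struct toy_page`, `struct toy_mapping`) rather than the kernel's real structures, and `toy_lru_page_ok()` is a hypothetical predicate that encodes only the rule stated above: a page on the LRU lists needs a mapping, unless it is a clean page carrying anonymous buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; not the real definitions. */
struct toy_mapping { int unused; };

struct toy_page {
    bool dirty;                   /* models PG_dirty */
    struct toy_mapping *mapping;  /* models page->mapping */
    void *buffers;                /* models page->buffers */
};

/* Returns true if the page is in a state that may legally sit on an LRU
 * list under the rule described above. */
bool toy_lru_page_ok(const struct toy_page *page)
{
    if (page->mapping != NULL)
        return true;
    /* No mapping: only a clean page with anonymous buffers is acceptable. */
    return page->buffers != NULL && !page->dirty;
}

/* Usage: the three interesting states. */
void toy_lru_demo(void)
{
    struct toy_mapping mapping = { 0 };
    int fake_buffer = 0;

    struct toy_page mapped_dirty = { true,  &mapping, NULL };
    struct toy_page anon_buffers = { false, NULL, &fake_buffer };
    struct toy_page orphan_dirty = { true,  NULL, NULL };

    assert(toy_lru_page_ok(&mapped_dirty));
    assert(toy_lru_page_ok(&anon_buffers));
    assert(!toy_lru_page_ok(&orphan_dirty));  /* the state being hunted */
}
```

A debug build could call such a predicate wherever pages are added to a list and complain loudly when it fails, which narrows down where the bad state is created.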

Linus




Re: innd mmap bug in 2.4.0-test12

2000-12-27 Thread Zlatko Calusic

Rik van Riel <[EMAIL PROTECTED]> writes:

> On Mon, 25 Dec 2000, Dan Aloni wrote:
> > On 25 Dec 2000, Zlatko Calusic wrote:
> > 
> > > Speaking of page_launder() I just stumbled upon two oopsen today on
> > > the UP build. Maybe it could give a hint to someone, I'm not that good
> > > at Oops decoding.
> > 
> > I've decoded the Oops I got, and found that the problem is in
> > vmscan.c:line-605, where page->mapping is NULL and a_ops gets
> > resolved and dereferenced at 0x000c.
> 
> The code assumes that every page which has the PG_dirty
> bit set also has page->mapping set to a valid value.
> 
> The BUG() people are getting confirms that this assumption
> is not necessarily true and the VM work that's going on will
> most likely make it not be true either in some cases.
> 
> The (trivial) patch below should fix this problem.
> 
> Linus and/or Alan, could you please apply this for the next
> pre-patch ?
> 

Looking at the patch, I'm practically sure it will cure the symptoms.
But I'm still slightly worried about those pages we skip in
there. Maybe we should at least try to discover what those pages are,
and then maybe it will become obvious what we need (or not) to do with
them.

Some strategic printk()s could probably give us some clue.

Too bad I lost track of the recent changes due to a catastrophic time
shortage. And that is a shame as I'm very satisfied with the current
VM code, thanks to your hard work Rik.

> regards,
> 
> Rik
> --
> Hollywood goes for world dumbination,
>   Trailer at 11.
> 
>   http://www.surriel.com/
> http://www.conectiva.com/ http://distro.conectiva.com.br/
> 
> 
> --- vmscan.c.orig	Wed Dec 27 16:48:24 2000
> +++ vmscan.c	Wed Dec 27 17:14:32 2000
> @@ -601,7 +601,7 @@
>  		 * Dirty swap-cache page? Write it out if
>  		 * last copy..
>  		 */
> -		if (PageDirty(page)) {
> +		if (PageDirty(page) && page->mapping) {
>  			int (*writepage)(struct page *) =
>  				page->mapping->a_ops->writepage;
>  			int result;

-- 
Zlatko



Re: innd mmap bug in 2.4.0-test12

2000-12-27 Thread Rik van Riel

On Mon, 25 Dec 2000, Dan Aloni wrote:
> On 25 Dec 2000, Zlatko Calusic wrote:
> 
> > Speaking of page_launder() I just stumbled upon two oopsen today on
> > the UP build. Maybe it could give a hint to someone, I'm not that good
> > at Oops decoding.
> 
> I've decoded the Oops I got, and found that the problem is in
> vmscan.c:line-605, where page->mapping is NULL and a_ops gets
> resolved and dereferenced at 0x000c.

The code assumes that every page which has the PG_dirty
bit set also has page->mapping set to a valid value.

The BUG() people are getting confirms that this assumption
is not necessarily true and the VM work that's going on will
most likely make it not be true either in some cases.

The (trivial) patch below should fix this problem.

Linus and/or Alan, could you please apply this for the next
pre-patch ?

regards,

Rik
--
Hollywood goes for world dumbination,
Trailer at 11.

http://www.surriel.com/
http://www.conectiva.com/   http://distro.conectiva.com.br/


--- vmscan.c.orig	Wed Dec 27 16:48:24 2000
+++ vmscan.c	Wed Dec 27 17:14:32 2000
@@ -601,7 +601,7 @@
 		 * Dirty swap-cache page? Write it out if
 		 * last copy..
 		 */
-		if (PageDirty(page)) {
+		if (PageDirty(page) && page->mapping) {
 			int (*writepage)(struct page *) =
 				page->mapping->a_ops->writepage;
 			int result;
 




Re: innd mmap bug in 2.4.0-test12

2000-12-27 Thread Alan Cox

> I can just imagine Xmas at the Torvalds residence, with their annual
> tradition of having the kids scream... But dad, other kids have the lights
> strung around the trees, not the computer

I don't think you get the full picture. I suspect what gets strung up on the
trees at Christmas if Linus does too much hacking is ... Linus




Re: innd mmap bug in 2.4.0-test12

2000-12-27 Thread Anton Blanchard

 
> This all only matters to things that do shared writable mmap's.
> 
> Almost nothing does that. innd is (sadly) the only regular thing that uses
> this, which is why it's always innd that breaks, even if everything else
> works.

btw samba 2.2 makes extensive use of shared writable mmaps (well it uses tdb
which uses shared writable mmaps). In fact it picked up a bug with virtual
aliasing of shared writable mmaps on sparc64 recently.

Anton



Re: innd mmap bug in 2.4.0-test12

2000-12-27 Thread Philipp Rumpf

On Wed, Dec 27, 2000 at 03:41:04PM -0800, Linus Torvalds wrote:
> It must be wrong.
>
> If we have a dirty page on the LRU lists, that page _must_ have a mapping.

What about pages with a mapping but without a writepage function ? or pages
whose writepage function fails ?  The current code seems to simply put the
page onto the active list in that case, which seems just as wrong to me.

> The bug is somewhere else, and your patch is just papering it over. We
> should not have a page without a mapping on the LRU lists in the first
> place, except if the page has anonymous buffers (and such a page cannot

So is there any legal reason we could ever get to page_active ?  Removing
that code (or replacing it with BUG()) certainly would make page_launder
more readable.



Re: innd mmap bug in 2.4.0-test12

2000-12-27 Thread Linus Torvalds



On Wed, 27 Dec 2000, Philipp Rumpf wrote:

> On Wed, Dec 27, 2000 at 03:41:04PM -0800, Linus Torvalds wrote:
> > It must be wrong.
> >
> > If we have a dirty page on the LRU lists, that page _must_ have a mapping.
>
> What about pages with a mapping but without a writepage function ? or pages
> whose writepage function fails ?  The current code seems to simply put the
> page onto the active list in that case, which seems just as wrong to me.

ramfs. It doesn't have a writepage() function, as there is no backing
store.

> > The bug is somewhere else, and your patch is just papering it over. We
> > should not have a page without a mapping on the LRU lists in the first
> > place, except if the page has anonymous buffers (and such a page cannot
>
> So is there any legal reason we could ever get to page_active ?  Removing
> that code (or replacing it with BUG()) certainly would make page_launder
> more readable.

Apart from the "we have no backing store", there is no legal reason to put
it back on the active list that I can see.

Linus
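
A rough sketch of the decision described above, assuming heavily cut-down structures: a dirty page whose mapping has no writepage() (the ramfs case) goes back to the active list, everything else gets written. The enum, demo_writepage() and launder_decide() are hypothetical names for illustration and do not reproduce the real page_launder() code.

```c
#include <assert.h>
#include <stddef.h>

struct page;

/* Hypothetical, cut-down versions of the kernel structures. */
struct address_space_operations {
	int (*writepage)(struct page *);	/* NULL: no backing store (ramfs) */
};
struct address_space {
	struct address_space_operations *a_ops;
};
struct page {
	struct address_space *mapping;
	int dirty;				/* models PG_dirty */
};

enum launder_action {
	LAUNDER_CLEAN,		/* nothing to write */
	LAUNDER_WRITE,		/* call ->writepage() */
	LAUNDER_ACTIVATE	/* no backing store: back to the active list */
};

/* Example writepage for a filesystem that does have backing store. */
int demo_writepage(struct page *page)
{
	page->dirty = 0;
	return 0;
}

/* Decide what a page_launder()-style scan would do with one page. */
enum launder_action launder_decide(const struct page *page)
{
	if (!page->dirty)
		return LAUNDER_CLEAN;
	/* The invariant argued for in this thread: dirty implies a mapping. */
	assert(page->mapping != NULL);
	if (page->mapping->a_ops->writepage == NULL)
		return LAUNDER_ACTIVATE;
	return LAUNDER_WRITE;
}
```

The sketch leaves out the second case raised in the question, a writepage() that exists but fails, since the reply does not say how that should be handled.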




Re: innd mmap bug in 2.4.0-test12

2000-12-27 Thread Dan Aloni

On Wed, 27 Dec 2000, Philipp Rumpf wrote:

> On Wed, Dec 27, 2000 at 03:41:04PM -0800, Linus Torvalds wrote:
> > It must be wrong.
> >
> > If we have a dirty page on the LRU lists, that page _must_ have a mapping.
>
> What about pages with a mapping but without a writepage function ? or pages
> whose writepage function fails ?  The current code seems to simply put the
> page onto the active list in that case, which seems just as wrong to me.

I noticed the offset of 'mapping' in struct page doesn't match the offset
which gets resolved at absolute address 0x000c according to the Oops,
but it does match the offset of a_ops in struct address_space, which
makes me wonder if it's possible that a_ops == NULL. Suggestions?
 
-- 
Dan Aloni 
[EMAIL PROTECTED]
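
One possible reading of the 0x000c in the Oops, sketched under an assumption: on a 32-bit machine, if struct address_space begins with a list_head, then nrpages, then a_ops, loading mapping->a_ops through a NULL mapping touches address 0 + 0xc. The field order below is assumed for illustration (it is chosen to agree with the a_ops offset reported above) and should be checked against the real include/linux/fs.h; the *32 type names and a_ops_load_address() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit model: every pointer and unsigned long is 4 bytes wide. */
struct list_head32 {
	uint32_t next;
	uint32_t prev;
};

/* ASSUMED field order, not copied from the kernel headers. */
struct address_space32 {
	struct list_head32 pages;	/* offset 0x0 */
	uint32_t nrpages;		/* offset 0x8 */
	uint32_t a_ops;			/* offset 0xc */
};

/* Address the CPU would touch when loading mapping->a_ops. */
uint32_t a_ops_load_address(uint32_t mapping)
{
	return mapping + (uint32_t)offsetof(struct address_space32, a_ops);
}
```

Under that layout a fault at 0x0000000c fits mapping == NULL. A NULL a_ops would instead fault at the offset of writepage inside the operations table.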




Re: innd mmap bug in 2.4.0-test12

2000-12-27 Thread Dan Aloni

On Thu, 28 Dec 2000, Dan Aloni wrote:

> On Wed, 27 Dec 2000, Philipp Rumpf wrote:
>
> > On Wed, Dec 27, 2000 at 03:41:04PM -0800, Linus Torvalds wrote:
> > > It must be wrong.
> > >
> > > If we have a dirty page on the LRU lists, that page _must_ have a mapping.
> >
> > What about pages with a mapping but without a writepage function ? or pages
> > whose writepage function fails ?  The current code seems to simply put the
> > page onto the active list in that case, which seems just as wrong to me.
>
> I noticed the offset of 'mapping' in struct page doesn't match the offset
> which gets resolved at absolute address 0x000c according to the Oops,
> but it does match the offset of a_ops in struct address_space, which
> makes me wonder if it's possible that a_ops == NULL. Suggestions?

I was confused, or maybe it was temporary insanity. Ignore that ;-)

-- 
Dan Aloni 
[EMAIL PROTECTED]




Re: innd mmap bug in 2.4.0-test12

2000-12-27 Thread Ari Heitner


On Thu, 28 Dec 2000, Chris Wedgwood wrote:

> (cc' list trimmed)
[further]
> I use ramfs for /tmp on my laptop -- it's very handy because it
> extends the amount of time the disk spends spun down and therefore
> battery life; but writing large files into /tmp can blow away the
> system or at the very least eat away at otherwise usable ram. Not
> terribly desirable.

mph.

does anyone other than me think that the pm code is *way* too aggressive about
spinning down the hard drive? my 256mb laptop (2.2.16) will only spin down the
disk for about 30 seconds before it decides it's got something else it feels
like writing out, and spins back up. Spin-up has got to be more wasteful than
just leaving the drive spinning...

unless the vfs code is cleaning dirty pages to disk when there's no activity,
then any time the vfs writes to disk, it's got something that's really hot and
shouldn't stay in ram ... if that's the case, and the frequency of such
occurrences is so high, perhaps the pm code should wait a couple of times as
long as it does before it spins down. basically, if there's still code doing
stuff, the drive should be spinning. only spindown when we're sure the user has
walked away for good :)





ari






Re: innd mmap bug in 2.4.0-test12

2000-12-26 Thread Michael Peddemors

> Yeah, yeah, it's 7PM Christmas Eve over there, and you're in the middle of
> your Christmas dinner. You might feel that it's unreasonable of me to ask
> you to test out my latest crazy idea.
>
> How selfish of you.
>
> Get back there in front of the computer NOW. Christmas can wait.
>
>   Linus "the Grinch" Torvalds

I can just imagine Xmas at the Torvalds residence, with their annual 
tradition of having the kids scream... But dad, other kids have the lights 
strung around the trees, not the computer

Linus, that's why they call them computer 'boxes'... didn't you know you are 
supposed to wait for 'boxing day' before climbing back in front of the 
screen??


-- 

Michael Peddemors - Senior Consultant
Unix Administration - WebSite Hosting
Network Services - Programming
Wizard Internet Services http://www.wizard.ca
Linux Support Specialist - http://www.linuxmagic.com

(604) 589-0037 Beautiful British Columbia, Canada




Re: innd mmap bug in 2.4.0-test12

2000-12-26 Thread Linus Torvalds



On Mon, 25 Dec 2000, Alan Cox wrote:
>
> > The simple fix is along the lines of adding code to fsync() that walks the
> > inode page list and writes out dirty pages.
> > 
> > The clever and clean fix is to split the inode page list into two lists,
> > one for dirty and one for clean pages, and only walk the dirty list.
> 
> Like the patches that were floating around on the list for the past 
> few months to make O_SYNC work. Could those be used for it ?

I haven't seen the patches, but I suspect strongly that they clash with
the ext2 smart buffer sync code.

For example, the old generic_fdatasync() function already walks the list
of pages and could trivially be extended to handle dirty pages in addition
to just dirty blocks. But ext2 uses the faster and more clever "walk just
the dirty buffers of this inode" thing, which never even looks at any
pages.

This is high-level enough that the low-level filesystems should not even
see this functionality, and it doesn't look very hard to do. It's just all
these holidays in the way (and if you think you can skip christmas with a
4-year-old in the house, think AGAIN. This is a sacred time.).

Linus




Re: innd mmap bug in 2.4.0-test12

2000-12-26 Thread Alan Cox

> The simple fix is along the lines of adding code to fsync() that walks the
> inode page list and writes out dirty pages.
> 
> The clever and clean fix is to split the inode page list into two lists,
> one for dirty and one for clean pages, and only walk the dirty list.

Like the patches that were floating around on the list for the past 
few months to make O_SYNC work. Could those be used for it ?




Re: innd mmap bug in 2.4.0-test12

2000-12-25 Thread Linus Torvalds



On Tue, 26 Dec 2000, Chris Wedgwood wrote:

> On Mon, Dec 25, 2000 at 01:42:33AM -0800, Linus Torvalds wrote:
> 
> We just don't write them out. Because right now the only thing
> that writes out dirty pages is memory pressure. "sync()",
> "fsync()" and "fdatasync()" will happily ignore dirty pages
> completely. The thing that made me overlook that simple thing in
> testing was that I was testing the new VM stuff under heavy VM
> load - to shake out any bugs.
> 
> Does this mean anyone using test13-pre4 should also expect to see
> data not being flushed on shutdown? 

No.

This all only matters to things that do shared writable mmap's.

Almost nothing does that. innd is (sadly) the only regular thing that uses
this, which is why it's always innd that breaks, even if everything else
works.

And even innd is often compiled to use "write()" instead of shared
mappings (it's a config option), so not even all innd's will break.

Linus
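
For readers who have not used them, a small user-space sketch of what "shared writable mmap" means here: data reaches the file through stores into a MAP_SHARED mapping, not through write(). The helper names write_via_shared_mmap() and read_back() are made up for this example, and it only uses the standard POSIX calls open(), ftruncate(), mmap(), msync(), munmap() and read().

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Store `len` bytes (len > 0) at the start of the file at `path` through
 * a shared writable mapping, then msync(). Returns 0 on success, -1 on
 * any failure.
 */
int write_via_shared_mmap(const char *path, const char *data, size_t len)
{
	int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)len) < 0) {
		close(fd);
		return -1;
	}
	char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED) {
		close(fd);
		return -1;
	}
	memcpy(map, data, len);			/* dirties the page, no write() */
	int rc = msync(map, len, MS_SYNC);	/* ask for write-back of the range */
	munmap(map, len);
	close(fd);
	return rc;
}

/* Read the file back with plain read() to see what the file now holds. */
ssize_t read_back(const char *path, char *buf, size_t len)
{
	int fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	ssize_t n = read(fd, buf, len);
	close(fd);
	return n;
}
```

Note that read_back() is served from the page cache, so it cannot show whether the data reached the disk. That is exactly why the problem discussed in this thread only becomes visible after a reboot.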




Re: innd mmap bug in 2.4.0-test12

2000-12-25 Thread Linus Torvalds



On Mon, 25 Dec 2000, Marco d'Itri wrote:
>
>  >Do you get any messages? I don't think you will, but it should be tested.
>
> I read you found the real cause so that may be bogus, but I have got two
> messages while booting. The first showed up while doing the fsck of a 6
> GB file system and killed the process (fscks of smaller partitions
> completed successfully), the second occurred while initializing
> /dev/random and left an unkillable dd process and a stuck boot process
> (I gathered this info with sysrq).

I'd still love to get the trace for these. I think I have a handle on the
problems, but it would still be helpful - dropping a dirty page really
shouldn't happen except for the swap cache (and that should have been
plugged by adding the ClearPageDirty()).

Linus




Re: innd mmap bug in 2.4.0-test12

2000-12-25 Thread Marco d'Itri

On Dec 25, Linus Torvalds <[EMAIL PROTECTED]> wrote:

 >Add a printk() to __remove_inode_page() that complains whenever it removes
 >a dirty page. 
 >
 >Oh, in order to not see this with swap pages (which _can_ be removed when
 >they are dirty, if all users of them are gone), add a ClearPageDirty() to
 >"remove_from_swap_cache()" so that we don't get false positives..
 >
 >Do you get any messages? I don't think you will, but it should be tested.
I read you found the real cause so that may be bogus, but I have got two
messages while booting. The first showed up while doing the fsck of a 6
GB file system and killed the process (fscks of smaller partitions
completed successfully), the second occurred while initializing
/dev/random and left an unkillable dd process and a stuck boot process
(I gathered this info with sysrq).

Since -test12 is unstable for me, if you don't need more data I'll go back
to -test9 until the next release.

 >That's probably the infinite loop in the tty task queue handling, should
 >be fixed in test13-pre3 or so.
Looks like I missed it, evil vger postmasters unsubscribed me again for
no apparent reason...

-- 
ciao,
Marco




Re: innd mmap bug in 2.4.0-test12

2000-12-25 Thread Linus Torvalds



On Mon, 25 Dec 2000, Linus Torvalds wrote:
> 
> Assuming we don't lose any PG_dirty bits, we might of course just lose it
> from the page tables themselves before it ever even gets to "struct page".
> I'm just surprised that it seems to be so repeatable for you - it sounds
> like we _never_ actually write out the dirty pages to disk. It's not that
> we can lose the dirty bit occasionally, we seem to lose it every time in
> your setup.

Nope. I got it.

The thing is even more embarrassing than just losing a dirty bit.

We don't lose any dirty bits (well, we could before, but after adding the
PageDirty() test to reclaim_page() we're ok now).

In fact, we know _exactly_ which pages are dirty, and which pages are not.

We just don't write them out. Because right now the only thing that writes
out dirty pages is memory pressure. "sync()", "fsync()" and "fdatasync()"
will happily ignore dirty pages completely. The thing that made me
overlook that simple thing in testing was that I was testing the new VM
stuff under heavy VM load - to shake out any bugs.

Under heavy VM load, there are no problems, because the memory pressure
will make sure everything gets written out. Under heavy VM load the thing
works just beautifully.

Under _low_, or no, memory pressure, however, the dang thing just stays in
memory. We'll happily reboot with the new contents still cached, in fact.

I bet that if you start something that eats up all your memory, and causes
some nice swapping just before you shut down the machine, your innd active
file will be right as rain after a reboot.

I'm a stupid git. I even remember thinking about the syncing issues at
some point, and then obviously just forgetting about it _completely_.

The simple fix is along the lines of adding code to fsync() that walks the
inode page list and writes out dirty pages.

The clever and clean fix is to split the inode page list into two lists,
one for dirty and one for clean pages, and only walk the dirty list.

Ho ho ho. I _so_ enjoy making a fool out of myself.

Linus
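
The two fixes described above can be contrasted in a toy model. This is only a sketch: the singly linked lists, the struct names and the functions fsync_simple() and fsync_split() are invented for illustration, and "writing" a page is reduced to clearing its dirty flag.

```c
#include <assert.h>
#include <stddef.h>

struct page {
	struct page *next;	/* singly linked for brevity */
	int dirty;
};

/* Simple fix: one page list per inode, fsync walks all of it. */
struct inode_simple {
	struct page *pages;
};

/* Clean fix: separate clean and dirty lists per inode. */
struct inode_split {
	struct page *clean;
	struct page *dirty;
};

/* Walk every page, "write" the dirty ones. Returns pages written. */
int fsync_simple(struct inode_simple *inode, int *visited)
{
	int written = 0;

	*visited = 0;
	for (struct page *p = inode->pages; p; p = p->next) {
		(*visited)++;
		if (p->dirty) {
			p->dirty = 0;
			written++;
		}
	}
	return written;
}

/* Walk only the dirty list, moving each written page to the clean list. */
int fsync_split(struct inode_split *inode, int *visited)
{
	int written = 0;

	*visited = 0;
	while (inode->dirty) {
		struct page *p = inode->dirty;

		inode->dirty = p->next;
		p->dirty = 0;
		p->next = inode->clean;
		inode->clean = p;
		(*visited)++;
		written++;
	}
	return written;
}
```

With one dirty page among many clean ones, fsync_simple() visits every page while fsync_split() visits only the dirty one, which is the point of the second variant.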




Re: innd mmap bug in 2.4.0-test12

2000-12-25 Thread Linus Torvalds



On Mon, 25 Dec 2000, Marco d'Itri wrote:

> On Dec 24, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> 
>  >/* The page is dirty, or locked, move to inactive_dirty list. */
>  >if (page->buffers || TryLockPage(page)) {
>  >...
>  >
>  >and change the test to
>  >
>  >if (page->buffers || PageDirty(page) || TryLockPage(page)) {
> Done, no change.
> Got some articles, restarted the server, all is good.
> Got other articles, rebooted and the files now differ.

Willing to test some more?

Add a printk() to __remove_inode_page() that complains whenever it removes
a dirty page. 

Oh, in order to not see this with swap pages (which _can_ be removed when
they are dirty, if all users of them are gone), add a ClearPageDirty() to
"remove_from_swap_cache()" so that we don't get false positives..

Do you get any messages? I don't think you will, but it should be tested.
You might mark it a BUG(), so that we'll get a stack-trace if it happens.

Assuming we don't lose any PG_dirty bits, we might of course just lose it
from the page tables themselves before it ever even gets to "struct page".
I'm just surprised that it seems to be so repeatable for you - it sounds
like we _never_ actually write out the dirty pages to disk. It's not that
we can lose the dirty bit occasionally, we seem to lose it every time in
your setup.

I wonder if it's something specific innd does. Like "msync()" just being
broken or similar. But the code looks sane.

Hmm.. Can you send me an "strace" of innd when this happens?

> And I have another problem: I'm experiencing random hangs using X[1] with
> 2.4.0-test12.

That's probably the infinite loop in the tty task queue handling, should
be fixed in test13-pre3 or so.

Linus
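
A toy version of the instrumentation suggested in this message, as a sketch only: complain when a dirty page is removed, and clear the dirty flag on the swap-cache path first so that it does not trigger the complaint. The struct, the counter dirty_removals and both function bodies are hypothetical; they borrow the kernel function names without reproducing their code, and fprintf() stands in for printk().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical cut-down page: only the dirty flag matters here. */
struct page {
	int dirty;
};

int dirty_removals;	/* how many times the complaint fired */

/* Stand-in for __remove_inode_page() with the suggested check added. */
void remove_inode_page(struct page *page)
{
	if (page->dirty) {
		dirty_removals++;
		fprintf(stderr, "removing dirty page %p\n", (void *)page);
	}
	/* the real function would unlink the page from its lists here */
}

/* Stand-in for remove_from_swap_cache(): dirty swap pages may be dropped. */
void remove_from_swap_cache(struct page *page)
{
	page->dirty = 0;	/* models ClearPageDirty(): no false positive */
	remove_inode_page(page);
}
```

Replacing the fprintf() with an abort would model the BUG() variant and give a stack trace at the first offending removal.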





Re: innd mmap bug in 2.4.0-test12

2000-12-25 Thread Linus Torvalds



On Mon, 25 Dec 2000, Marco d'Itri wrote:

 On Dec 24, Linus Torvalds [EMAIL PROTECTED] wrote:
 
  /* The page is dirty, or locked, move to inactive_dirty list. */
  if (page-buffers || TryLockPage(page)) {
  ...
  
  and change the test to
  
  if (page-buffers || PageDirty(page) || TryLockPage(page)) {
 Done, no change.
 Got some articles, restarted the server, all is good.
 Got other articles, rebooted and the files now differ.

Willing to test some more?

Add a printk() to __remove_inode_page() that complains whenever it removes
a dirty page. 

Oh, in order to not see this with swap pages (which _can_ be removed when
they are dirty, if all users of them are gone), add a PageClearDirty() to
"remove_from_swap_cache()" so that we don't get false positives..

Do you get any messages? I don't think you will, but it should be tested.
You might mark it a BUG(), so tht we'll get a stack-trace if it happens.

Assuming we don't lose any PG_dirty bits, we might of course just lose it
from the page tables themselves before it ever even gets to "struct page".
I'm just surprised that it seems to be so repeatable for you - it sounds
like we _never_ actually write out the dirty pages to disk. It's not that
we can lose the dirty bit occasionally, we seem to lose it every time in
your setup.

I wonder if it's something specific innd does. Like "msync()" just being
broken or similar. But the code looks sane.

Hmm.. Can you send me an "strace" of innd when this happens?

 And I have another problem: I'm experiencing random hangs using X[1] with
 2.4.0-test12.

That's probably the infinite loop in the tty task queue handling, should
be fixed in test13-pre3 or so.

Linus





Re: innd mmap bug in 2.4.0-test12

2000-12-25 Thread Linus Torvalds



On Mon, 25 Dec 2000, Linus Torvalds wrote:
 
> Assuming we don't lose any PG_dirty bits, we might of course just lose it
> from the page tables themselves before it ever even gets to "struct page".
> I'm just surprised that it seems to be so repeatable for you - it sounds
> like we _never_ actually write out the dirty pages to disk. It's not that
> we can lose the dirty bit occasionally, we seem to lose it every time in
> your setup.

Nope. I got it.

The thing is even more embarrassing than just losing a dirty bit.

We don't lose any dirty bits (well, we could before, but after adding the
PageDirty() test to reclaim_page() we're ok now).

In fact, we know _exactly_ which pages are dirty, and which pages are not.

We just don't write them out. Because right now the only thing that writes
out dirty pages is memory pressure. "sync()", "fsync()" and "fdatasync()"
will happily ignore dirty pages completely. The thing that made me
overlook that simple thing in testing was that I was testing the new VM
stuff under heavy VM load - to shake out any bugs.

Under heavy VM load, there are no problems, because the memory pressure
will make sure everything gets written out. Under heavy VM load the thing
works just beautifully.

Under _low_, or no, memory pressure, however, the dang thing just stays in
memory. We'll happily reboot with the new contents still cached, in fact.

I bet that if you start something that eats up all your memory, and causes
some nice swapping just before you shut down the machine, your innd active
file will be right as rain after a reboot.

I'm a stupid git. I even remember thinking about the syncing issues at
some point, and then obviously just forgetting about it _completely_.

The simple fix is along the lines of adding code to fsync() that walks the
inode page list and writes out dirty pages.

The clever and clean fix is to split the inode page list into two lists,
one for dirty and one for clean pages, and only walk the dirty list.

Ho ho ho. I _so_ enjoy making a fool out of myself.

Linus
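
The "clever and clean" fix described above, keeping dirty and clean pages on separate per-inode lists so that fsync() only walks the dirty one, can be sketched as a small user-space model. This is only an illustration under stated assumptions: every name here (toy_page, toy_inode, toy_fsync and so on) is hypothetical, nothing is taken from the kernel source, and the "disk write" is just a counter.

```c
#include <stddef.h>

/* Toy user-space model; none of these names are kernel identifiers. */
struct toy_page {
    struct toy_page *next;
    int index;                /* page number within the file */
    int dirty;
};

struct toy_inode {
    struct toy_page *clean;   /* pages that match what is on disk */
    struct toy_page *dirty;   /* pages that still need write-back */
    int writes_issued;        /* counts simulated disk writes */
};

/* Put a page on the clean list. */
void toy_add_clean(struct toy_inode *ino, struct toy_page *pg)
{
    pg->dirty = 0;
    pg->next = ino->clean;
    ino->clean = pg;
}

/* Unlink pg from the clean list (if present) and put it on the dirty list. */
void toy_mark_dirty(struct toy_inode *ino, struct toy_page *pg)
{
    if (pg->dirty)
        return;               /* already on the dirty list */
    for (struct toy_page **pp = &ino->clean; *pp; pp = &(*pp)->next) {
        if (*pp == pg) {
            *pp = pg->next;
            break;
        }
    }
    pg->dirty = 1;
    pg->next = ino->dirty;
    ino->dirty = pg;
}

/* Model of fsync(): walk only the dirty list, "write" each page and move
 * it to the clean list. Returns the number of pages written. */
int toy_fsync(struct toy_inode *ino)
{
    int written = 0;
    while (ino->dirty) {
        struct toy_page *pg = ino->dirty;
        ino->dirty = pg->next;
        ino->writes_issued++; /* stand-in for the real I/O */
        toy_add_clean(ino, pg);
        written++;
    }
    return written;
}
```

The point of the split is that the cost of toy_fsync() depends on the number of dirty pages, not on the number of cached pages; the "simple fix" would instead scan one combined list and test each page's dirty flag.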




Re: innd mmap bug in 2.4.0-test12

2000-12-25 Thread Marco d'Itri

On Dec 25, Linus Torvalds <[EMAIL PROTECTED]> wrote:

> Add a printk() to __remove_inode_page() that complains whenever it removes
> a dirty page. 
> 
> Oh, in order to not see this with swap pages (which _can_ be removed when
> they are dirty, if all users of them are gone), add a PageClearDirty() to
> "remove_from_swap_cache()" so that we don't get false positives..
> 
> Do you get any messages? I don't think you will, but it should be tested.
I read you found the real cause so that may be bogus, but I have got two
messages while booting. The first showed up while doing the fsck of a 6
GB file system and killed the process (fscks of smaller partitions
completed successfully), the second occurred while initializing
/dev/random and left an unkillable dd process and a stuck boot process
(I gathered this info with sysrq).

Since -test12 is unstable for me, if you don't need more data I'll go back
to -test9 until the next release.

> That's probably the infinite loop in the tty task queue handling, should
> be fixed in test13-pre3 or so.
Looks like I missed it, evil vger postmasters unsubscribed me again for
no apparent reason...

-- 
ciao,
Marco




Re: innd mmap bug in 2.4.0-test12

2000-12-25 Thread Linus Torvalds



On Mon, 25 Dec 2000, Marco d'Itri wrote:

> > Do you get any messages? I don't think you will, but it should be tested.
> 
> I read you found the real cause so that may be bogus, but I have got two
> messages while booting. The first showed up while doing the fsck of a 6
> GB file system and killed the process (fscks of smaller partitions
> completed successfully), the second occurred while initializing
> /dev/random and left an unkillable dd process and a stuck boot process
> (I gathered this info with sysrq).

I'd still love to get the trace for these. I think I have a handle on the
problems, but it would still be helpful - dropping a dirty page really
shouldn't happen except for the swap cache (and that should have been
plugged by adding the ClearPageDirty()).

Linus




Re: innd mmap bug in 2.4.0-test12

2000-12-25 Thread Linus Torvalds



On Tue, 26 Dec 2000, Chris Wedgwood wrote:

> On Mon, Dec 25, 2000 at 01:42:33AM -0800, Linus Torvalds wrote:
> 
> > We just don't write them out. Because right now the only thing
> > that writes out dirty pages is memory pressure. "sync()",
> > "fsync()" and "fdatasync()" will happily ignore dirty pages
> > completely. The thing that made me overlook that simple thing in
> > testing was that I was testing the new VM stuff under heavy VM
> > load - to shake out any bugs.
> 
> Does this mean anyone using test13-pre4 should also expect to see
> data not being flushed on shutdown? 

No.

This all only matters to things that do shared writable mmap's.

Almost nothing does that. innd is (sadly) the only regular thing that uses
this, which is why it's always innd that breaks, even if everything else
works.

And even innd is often compiled to use "write()" instead of shared
mappings (it's a config option), so not even all innd's will break.

Linus
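
For readers unfamiliar with the distinction drawn above, here is a minimal sketch of the two ways a program can update a file in place: through a shared writable mapping (the path that hit the bug) and through write(). It assumes a POSIX system; the function names update_via_mmap and update_via_write are made up for this illustration and are not taken from innd. The file must already be at least off + len bytes long for the mmap variant.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Update len bytes at offset off of path through a shared writable
 * mapping, then ask the kernel to write the dirty page back.
 * Returns 0 on success, -1 on error. */
int update_via_mmap(const char *path, off_t off, const char *data, size_t len)
{
    long pagesz = sysconf(_SC_PAGESIZE);
    off_t base = off - (off % pagesz);   /* mmap offsets must be page-aligned */
    size_t maplen = (size_t)(off - base) + len;
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    char *map = mmap(NULL, maplen, PROT_READ | PROT_WRITE, MAP_SHARED, fd, base);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }
    memcpy(map + (off - base), data, len); /* dirties the page, no syscall */

    int rc = msync(map, maplen, MS_SYNC);  /* explicit write-back request */
    munmap(map, maplen);
    close(fd);
    return rc;
}

/* The same update through the write() family, followed by fsync(). */
int update_via_write(const char *path, off_t off, const char *data, size_t len)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    ssize_t n = pwrite(fd, data, len, off);
    int rc = (n == (ssize_t)len && fsync(fd) == 0) ? 0 : -1;
    close(fd);
    return rc;
}
```

Reading the file back on the same running system will show the new data in both cases, because reads are served from the page cache. That matches the report earlier in the thread that the files only differed after a reboot.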




Re: innd mmap bug in 2.4.0-test12

2000-12-24 Thread Augusto César Radtke

Marco d'Itri wrote:

> And I have another problem: I'm experiencing random hangs using X[1] with
> 2.4.0-test12. After a variable amount of time, some of the times I use X
> (I mostly use console) it just freezes hard (no caps lock activity).
> I'm not sure if this only happens while using X or it's just less
> frequent in console. -test9 works fine and I used it since it has been
> released with no ill effects.

This is probably the run_task_queue bug fixed in test13pre3.

Augusto



Re: innd mmap bug in 2.4.0-test12

2000-12-24 Thread Dan Aloni

On 25 Dec 2000, Zlatko Calusic wrote:

> Linus Torvalds <[EMAIL PROTECTED]> writes:
> 
> > On Sun, 24 Dec 2000, Linus Torvalds wrote:
> > > 
> > > Marco, would you mind changing the test in reclaim_page(), somewhere
> > > around line mm/vmscan.c:487 that says:
> > 
> 
> Speaking of page_launder() I just stumbled upon two oopsen today on
> the UP build. Maybe it could give a hint to someone, I'm not that good
> at Oops decoding.
> 
> Unable to handle kernel NULL pointer dereference at virtual address 0000000c
>  printing eip:
> c012872e
> *pde = 00000000
> Oops: 0000
> CPU:0
> EIP:0010:[page_launder+510/2156]

I suspected I'm not the only one who is getting these exact same Oopses
(and the lockups that follow them), so earlier today I decoded the Oops
I got and found that the problem is at vmscan.c line 605, where
page->mapping is NULL and a_ops gets resolved and dereferenced at
0x0000000c.

I leave the fix for the mm experts, I've notified Linus, I guess he's
looking into it. 

-- 
Dan Aloni 
[EMAIL PROTECTED]
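
Why a NULL page->mapping faults at address 0xc rather than 0 can be shown with offsetof(). The sketch below assumes a 32-bit layout in which a_ops is preceded by one list head and one counter; that layout is an assumption made for illustration (modelled with uint32_t fields so the arithmetic is the same on any host), and the toy_ names are hypothetical, not kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit layout: pointers and unsigned long are modelled as
 * uint32_t so the offsets come out the same on any host. */
struct toy_list_head {
    uint32_t next;
    uint32_t prev;
};

struct toy_address_space {
    struct toy_list_head pages;  /* offset 0x0 */
    uint32_t nrpages;            /* offset 0x8 */
    uint32_t a_ops;              /* offset 0xc */
};

/* Address that a load of mapping->a_ops touches for a given value of the
 * mapping pointer. */
uint32_t toy_a_ops_fault_address(uint32_t mapping)
{
    return mapping + (uint32_t)offsetof(struct toy_address_space, a_ops);
}
```

With mapping equal to 0 the load lands on 0x0000000c, which is consistent with the faulting address in the Oops and with the "8b 40 0c" bytes at the start of its Code line (a load from offset 0xc of the register holding the pointer).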




Re: innd mmap bug in 2.4.0-test12

2000-12-24 Thread Marco d'Itri

On Dec 24, Linus Torvalds <[EMAIL PROTECTED]> wrote:

 >  /* The page is dirty, or locked, move to inactive_dirty list. */
 >  if (page->buffers || TryLockPage(page)) {
 >  ...
 >
 >and change the test to
 >
 >  if (page->buffers || PageDirty(page) || TryLockPage(page)) {
Done, no change.
Got some articles, restarted the server, all is good.
Got other articles, rebooted and the files now differ.


And I have another problem: I'm experiencing random hangs using X[1] with
2.4.0-test12. After a variable amount of time, some of the times I use X
(I mostly use console) it just freezes hard (no caps lock activity).
I'm not sure if this only happens while using X or it's just less
frequent in console. -test9 works fine and I used it since it has been
released with no ill effects.


My hardware:

00:00.0 Host bridge: VIA Technologies, Inc. VT82C598 [Apollo MVP3] (rev 04)
00:01.0 PCI bridge: VIA Technologies, Inc. VT82C598 [Apollo MVP3 AGP]
00:07.0 ISA bridge: VIA Technologies, Inc. VT82C586/A/B PCI-to-ISA [Apollo VP] (rev 41)
00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
00:07.3 Bridge: VIA Technologies, Inc. VT82C586B ACPI (rev 10)
00:09.0 Ethernet controller: Realtek Semiconductor Co., Ltd.  RTL-8029(AS)
01:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G100 [Productiva] AGP (rev 02)


vendor_id   : AuthenticAMD
cpu family  : 5
model   : 8
model name  : AMD-K6(tm) 3D processor
stepping: 12
cpu MHz : 267.282
cache size  : 64 KB


gcc version 2.95.2 20000220 (Debian GNU/Linux)


[1] Good old stable XF86_SVGA 3.x from debian potato.
-- 
ciao,
Marco




Re: innd mmap bug in 2.4.0-test12

2000-12-24 Thread Zlatko Calusic

Linus Torvalds <[EMAIL PROTECTED]> writes:

> On Sun, 24 Dec 2000, Linus Torvalds wrote:
> > 
> > Marco, would you mind changing the test in reclaim_page(), somewhere
> > around line mm/vmscan.c:487 that says:
> 

Speaking of page_launder() I just stumbled upon two oopsen today on
the UP build. Maybe it could give a hint to someone, I'm not that good
at Oops decoding.

Merry Christmas!


Unable to handle kernel NULL pointer dereference at virtual address 0000000c
 printing eip:
c012872e
*pde = 00000000
Oops: 0000
CPU:0
EIP:0010:[page_launder+510/2156]
EFLAGS: 00010202
eax: 00000000   ebx: c12e2ce8   ecx: c1244474   edx: 00000000
esi: c12e2d04   edi: 00000000   ebp: 00000000   esp: c15d1fb4
ds: 0018   es: 0018   ss: 0018
Process bdflush (pid: 6, stackpage=c15d1000)
Stack: c15d  c15d023a 0008e000   0001 2933 
    c0131e5d 0003  00010f00 c146ff88 c146ffc4 c01073fc 
   c146ffc4 0078 c146ffc4 
Call Trace: [bdflush+141/236] [kernel_thread+40/56] 
Code: 8b 40 0c 8b 00 85 c0 0f 84 ba 04 00 00 83 7c 24 10 00 75 73 


Unable to handle kernel NULL pointer dereference at virtual address 0000000c
 printing eip:
c012872e
*pde = 00000000
Oops: 0000
CPU:0
EIP:0010:[page_launder+510/2156]
EFLAGS: 00010202
eax: 00000000   ebx: c1260eec   ecx: c15d5fe0   edx: c02917f0
esi: c1260f08   edi: 00000000   ebp: 00000000   esp: c15d5f9c
ds: 0018   es: 0018   ss: 0018
Process kswapd (pid: 4, stackpage=c15d5000)
Stack: 00010f00 0004   0004   2938 
    c01290fc 0004  00010f00 c01f77f7 c15d4239 0008e000 
   c01291c6 0004  c146ffb8  c01073fc  0078 
Call Trace: [do_try_to_free_pages+52/128] [tvecs+8683/64084] [kswapd+126/288] [kernel_thread+40/56] 
Code: 8b 40 0c 8b 00 85 c0 0f 84 ba 04 00 00 83 7c 24 10 00 75 73 

-- 
Zlatko



Re: innd mmap bug in 2.4.0-test12

2000-12-24 Thread Linus Torvalds



On Sun, 24 Dec 2000, Daniel Phillips wrote:
> 
> It looks like PG_dirty is now being used only for swap_cache pages, and
> not for buffer cache and page cache pages, is that correct?

No. PG_dirty is used for all memory mapped pages - be they anonymous or
not.  

These days the buffer dirty bits are only used by "write()", because
write() can obviously dirty smaller areas than one page.

Linus
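
The remark that write() can dirty an area smaller than one page is easy to make concrete. The sketch below computes which filesystem blocks inside a single page a write touches; the 4096-byte page and 1024-byte block sizes are assumptions chosen for the example, and toy_dirty_block_mask is a hypothetical helper, not a kernel function.

```c
#define TOY_PAGE_SIZE  4096u
#define TOY_BLOCK_SIZE 1024u   /* assumed filesystem block size */

/* Bitmask of the blocks inside one page touched by a write of len bytes
 * starting at byte off within the page. Bit i set means block i is dirty. */
unsigned toy_dirty_block_mask(unsigned off, unsigned len)
{
    unsigned mask = 0;
    if (len == 0 || off >= TOY_PAGE_SIZE)
        return 0;
    if (len > TOY_PAGE_SIZE - off)
        len = TOY_PAGE_SIZE - off;          /* clamp to the page */
    unsigned first = off / TOY_BLOCK_SIZE;
    unsigned last = (off + len - 1) / TOY_BLOCK_SIZE;
    for (unsigned b = first; b <= last; b++)
        mask |= 1u << b;
    return mask;
}
```

A 50-byte write at offset 100 gives mask 0x1 (only the first block needs writing), while a store through a mapping can only be tracked per page, which in this model corresponds to mask 0xf.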




Re: innd mmap bug in 2.4.0-test12

2000-12-24 Thread Daniel Phillips

Linus Torvalds wrote:
> Hmm.. I wonder if such a dirty page might have been moved to the
> "inactive_clean" list some way? It shouldn't really be there, as the page
> had users, but if it gets on that list we'd not have tested the dirty bit.
> 
> Marco, would you mind changing the test in reclaim_page(), somewhere
> around line mm/vmscan.c:487 that says:
> 
> /* The page is dirty, or locked, move to inactive_dirty list. */
> if (page->buffers || TryLockPage(page)) {
> ...
> 
> and change the test to
> 
> if (page->buffers || PageDirty(page) || TryLockPage(page)) {
> 
> instead? Ie add the test for "PageDirty(page)"

Good point.  Up until recently the page dirty bit wasn't actually being
set anywhere and page->buffers was acting as kind of a surrogate dirty
bit - page_launder would call try_to_free_buffers which would find the
dirty buffers and fail out, but start io first.

It looks like PG_dirty is now being used only for swap_cache pages, and
not for buffer cache and page cache pages, is that correct?

--
Daniel



Re: innd mmap bug in 2.4.0-test12

2000-12-24 Thread Dietmar Kling


> 
> Get back there in front of the computer NOW. Christmas can wait.
> 
> Linus "the Grinch" Torvalds


Hoo - Hoo - Hoo,

you've been very naughty Linus. 

Asking people to work on Christmas evening. 

My god Linus, that's so naughty that I add 
it to my list...


As soon as I'm finished with Futurama,
   ... I'll get you!

Merry X-Mas
Santa Claus 

:))



Re: innd mmap bug in 2.4.0-test12

2000-12-24 Thread Linus Torvalds



On Sun, 24 Dec 2000, Linus Torvalds wrote:
> 
> Marco, would you mind changing the test in reclaim_page(), somewhere
> around line mm/vmscan.c:487 that says:

Yeah, yeah, it's 7PM Christmas Eve over there, and you're in the middle of
your Christmas dinner. You might feel that it's unreasonable of me to ask
you to test out my latest crazy idea.

How selfish of you.

Get back there in front of the computer NOW. Christmas can wait.

Linus "the Grinch" Torvalds




Re: innd mmap bug in 2.4.0-test12

2000-12-24 Thread Linus Torvalds



On Sun, 24 Dec 2000, Marco d'Itri wrote:
> On Dec 24, Alexander Viro <[EMAIL PROTECTED]> wrote:
> 
>  >> I put "cp active active.ok" in the rc file before shutting down the
>  >> daemon and at the next boot the files are different, every time.
>  >
>  >Could you send me both files? BTW, which filesystem it is?
> I use ext2. The files are not corrupted, they just are not updated.
> Another data point: at least in some cases, if I stop and start inn
> without rebooting the files are the same.

Ok, looks like we just drop the page cache page without writing it out in
some cases. Possibly/probably because we have dropped the dirty bit on the
floor.

Looks like this is a completely different case from the previous
corruptions, it looks more like a VM issue than a FS thing..

Hmm.. munmap() (and exit()) go through "zap_page_range()", which goes
through "free_pte()", which definitely copies the dirty bit to the page
structure.

Hmm.. I wonder if such a dirty page might have been moved to the
"inactive_clean" list some way? It shouldn't really be there, as the page
had users, but if it gets on that list we'd not have tested the dirty bit.

Marco, would you mind changing the test in reclaim_page(), somewhere
around line mm/vmscan.c:487 that says:

/* The page is dirty, or locked, move to inactive_dirty list. */
if (page->buffers || TryLockPage(page)) {
...

and change the test to

if (page->buffers || PageDirty(page) || TryLockPage(page)) {

instead? Ie add the test for "PageDirty(page)" (and order _is_ important:
the TryLockPage() thing must come last, because it has side effects).

(You might add a "printk()" too that triggers when the new condition
happens, just to see if it does indeed happen).

If the page is on the inactive_clean list, we'll have to find where it
is put there, because it really shouldn't have been there. 

Uhhuh. Actually, reading "page_launder()", the buffer clearing case looks
suspiciously like it doesn't check for page accessed or dirty bits. That's
probably it. Maybe there are other cases. Anyway, I'd love to hear if the
above one-liner fixes the corruption for you..

Thanks,
Linus
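
The note above that the lock attempt must come last in the condition follows from C's short-circuit evaluation of ||. The sketch below models it in user space, assuming TryLockPage() behaves like a test-and-set that returns nonzero when the page was already locked and otherwise takes the lock. All toy_ names are hypothetical.

```c
#include <stdbool.h>

struct toy_page {
    bool has_buffers;
    bool dirty;
    bool locked;
};

/* Returns true if the page was already locked; otherwise takes the lock
 * (the side effect) and returns false. */
bool toy_try_lock(struct toy_page *pg)
{
    if (pg->locked)
        return true;
    pg->locked = true;
    return false;
}

/* Correct order: the side-effect-free tests run first, so a page that is
 * skipped because it has buffers or is dirty is never locked. */
bool toy_must_skip(struct toy_page *pg)
{
    return pg->has_buffers || pg->dirty || toy_try_lock(pg);
}

/* Wrong order: the lock is taken before the other tests, so a dirty page
 * is skipped while still holding a lock nobody will release. */
bool toy_must_skip_wrong_order(struct toy_page *pg)
{
    return toy_try_lock(pg) || pg->has_buffers || pg->dirty;
}
```

For an unlocked dirty page, both functions report that the page must be skipped, but only the wrong-order variant leaves it locked afterwards. When toy_must_skip() returns false the caller holds the lock and is responsible for releasing it.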




Re: innd mmap bug in 2.4.0-test12

2000-12-24 Thread Marco d'Itri

On Dec 24, Alexander Viro <[EMAIL PROTECTED]> wrote:

 >> I put "cp active active.ok" in the rc file before shutting down the
 >> daemon and at the next boot the files are different, every time.
 >
 >Could you send me both files? BTW, which filesystem it is?
I use ext2. The files are not corrupted, they just are not updated.
Another data point: at least in some cases, if I stop and start inn
without rebooting the files are the same.

--- active.ok   Sun Dec 24 09:58:00 2000
+++ active  Sun Dec 24 08:33:34 2000
@@ -1,5 +1,5 @@
 control 004793 004794 y
-control.cancel 022865 021934 n
+control.cancel 022864 021934 n
 junk 001806 001807 y
 fido.ita.ridere 014779 014777 y
 fido.ita.dewdney 004073 004074 y
@@ -10,19 +10,19 @@
 fido.ita.sf 004777 004778 y
 comp.os.linux.announce 010782 010779 m
 fido.ita.tex 000248 000249 y
-it.news.annunci 004909 004787 m
+it.news.annunci 004905 004787 m
 it.news.gestione 007878 007399 y
 fido.ita.tv 011944 011944 y
 it.test 000796 000797 y
-it.news.gruppi 048004 047898 y
+it.news.gruppi 047994 047898 y
 it.comp.sicurezza.varie 030696 030353 y
 it.comp.sicurezza.unix 002721 002722 y
-it.faq 001154 001091 m
+it.faq 001150 001091 m
[...]

-- 
ciao,
Marco




Re: innd mmap bug in 2.4.0-test12

2000-12-24 Thread Jeff Lightfoot

"Marco d'Itri" ([EMAIL PROTECTED]) wrote:
> I can confirm the bug which loses updates to the inn active file when
> it's unmapped is present again in 2.4.0-test12.

It is also still in 2.4.0-test13-pre4 in case someone thought they had
fixed it.

-- 
Jeff Lightfoot   --jeffml at pobox.com   --   http://thefoots.com/
"I see the light at the end of the tunnel now ... someone please
tell me it's not a train" -- Cracker



innd mmap bug in 2.4.0-test12

2000-12-24 Thread Marco d'Itri

I can confirm the bug which loses updates to the inn active file when
it's unmapped is present again in 2.4.0-test12.

I put "cp active active.ok" in the rc file before shutting down the
daemon and at the next boot the files are different, every time.

Alexander Viro posted this test case:

#include <unistd.h>
main(argc,argv)
int argc;
char **argv;
{
	int fd;
	char c=0;
	truncate(argv[1], 10);
	fd = open(argv[1], 1);
	lseek(fd, 16384, 0);
	write(fd, &c, 1);
	close(fd);
}

but I tried it and it gives the correct result (a 16384 bytes long file
with only the first few bytes non-zeroed).

Linux wonderland 2.4.0-test12 #15 Thu Dec 21 16:40:16 CET 2000 i586 unknown

-- 
ciao,
Marco



