Re: page_launder() bug

2001-05-14 Thread Marcelo Tosatti
On Sun, 13 May 2001, Linus Torvalds wrote: > > On Sun, 13 May 2001, Rik van Riel wrote: > > > > This means that the swapin path (and the same path for > > other pagecache pages) doesn't take the page lock and > > the page lock doesn't protect us from other people using > > the page while we h

Re: Another VM race? (was: page_launder() bug)

2001-05-14 Thread Mikulas Patocka
> > CPU 0 CPU 1 > > is executing the code markedis executing try_to_free_buffers on > > above with ^^^: the same page (it can be, because CPU 0 > > did not lock the page) > > > > (page->buffers && > > > >

Re: page_launder() bug

2001-05-14 Thread Kai Henningsen
[EMAIL PROTECTED] (Linus Torvalds) wrote on 13.05.01 in <[EMAIL PROTECTED]>: > And that's why I'd rather have generic support for _any_ page mapping that > wants to drop pages early than have specific logic for swapping. > Historically, we've always had very good results from trying to avoid >

Re: Another VM race? (was: page_launder() bug)

2001-05-13 Thread Rik van Riel
On Sun, 13 May 2001, Mikulas Patocka wrote: > CPU 0 CPU 1 > is executing the code marked is executing try_to_free_buffers on > above with ^^^: the same page (it can be, because CPU 0 > did not lock the page) > > (page->buffers

Another VM race? (was: page_launder() bug)

2001-05-13 Thread Mikulas Patocka
> > > + if (!dead_swap_page && > > > + (PageTestandClearReferenced(page) || page->age > 0 || > > > + (!page->buffers && page_count(page) > 1) || > > > + page_ramdisk(page))) { > > ^^ > > > del_page_from

Re: page_launder() bug

2001-05-13 Thread Rik van Riel
On Sun, 13 May 2001, Linus Torvalds wrote: > On Sun, 13 May 2001, Rik van Riel wrote: > > > > Why the hell would we want this ? > You've missed about half the discussion, it seems.. True, I was away at a conference ;) > > If the page is referenced, it should be moved back to the > > active list

Re: page_launder() bug

2001-05-13 Thread Linus Torvalds
On Sun, 13 May 2001, Rik van Riel wrote: > > This means that the swapin path (and the same path for > other pagecache pages) doesn't take the page lock and > the page lock doesn't protect us from other people using > the page while we have it locked. You can test for swap cache deadness without

Re: page_launder() bug

2001-05-13 Thread David S. Miller
Rik van Riel writes: > Then I'd rather check this in a visible place in page_launder() > itself. Granted, this is a special case, but I don't think this > one is worth obfuscating the code for... I think Linus's scheme is just fine, controlling the new 'priority' argument to writepage() using

Re: page_launder() bug

2001-05-13 Thread Rik van Riel
On Sun, 13 May 2001, David S. Miller wrote: > Rik van Riel writes: > > On Tue, 8 May 2001, David S. Miller wrote: > > > Nice. Now the only bit left is moving the referenced bit > > > checking and/or state into writepage as well. This is still > > > part of the plan right? > > > > Why the

Re: page_launder() bug

2001-05-13 Thread David S. Miller
Rik van Riel writes: > On Tue, 8 May 2001, David S. Miller wrote: > > Nice. Now the only bit left is moving the referenced bit > > checking and/or state into writepage as well. This is still > > part of the plan right? > > Why the hell would we want this ? Because if it's a dead swap pa

Re: page_launder() bug

2001-05-13 Thread Rik van Riel
On Tue, 8 May 2001, David S. Miller wrote: > Marcelo Tosatti writes: > > Ok, this patch implements thet thing and also changes ext2+swap+shm > > writepage operations (so I could test the thing). > > > > The performance is better with the patch on my restricted swapping tests. > > Nice. Now

Re: page_launder() bug

2001-05-13 Thread Rik van Riel
On Tue, 8 May 2001, Mikulas Patocka wrote: > > + if (!dead_swap_page && > > + (PageTestandClearReferenced(page) || page->age > 0 || > > +(!page->buffers && page_count(page) > 1) || > > +page_ramdisk(page))) { >

Re: page_launder() bug

2001-05-13 Thread Rik van Riel
On Mon, 7 May 2001, Linus Torvalds wrote: > Which you MUST NOT do without holding the page lock. > > Hint: it needs "page->index", and without holding the page lock you > don't know what it could be. Wouldn't that be the pagecache_lock ? Remember that the semantics for find_swap_page() and fri

Re: page_launder() bug

2001-05-10 Thread Anuradha Ratnaweera
On Mon, 7 May 2001, J . A . Magallon wrote: > > On 05.07 Helge Hafting wrote: > > > > !0 is 1. !(anything else) is 0. It is zero and one, not > > zero and "non-zero". So a !! construction gives zero if you have > > zero, and one if you had anything else. There's no doubt about it. > > > >

Re: page_launder() bug

2001-05-10 Thread Ingo Oeser
On Tue, May 08, 2001 at 09:52:15AM +0200, Helge Hafting wrote: > > Isn't this asking for trouble with the optimizer ? It could kill both > > !!. Using that is like trusting on a certain struct padding-alignment. > > No, this won't cause trouble with the optimizer, because the > optimizer isn't su

Re: page_launder() bug

2001-05-09 Thread Marcelo Tosatti
On Wed, 9 May 2001, David S. Miller wrote: > > Marcelo Tosatti writes: > > > Let me state it a different way, how is the new writepage() framework > > > going to do things like ignore the referenced bit during page_launder > > > for dead swap pages? > > > > Its not able to ignore the refe

Re: page_launder() bug

2001-05-09 Thread David S. Miller
Marcelo Tosatti writes: > > Let me state it a different way, how is the new writepage() framework > > going to do things like ignore the referenced bit during page_launder > > for dead swap pages? > > Its not able to ignore the referenced bit. > > I know we want that, but I can't see an

Re: page_launder() bug

2001-05-09 Thread Marcelo Tosatti
On Wed, 9 May 2001, David S. Miller wrote: > > Marcelo Tosatti writes: > > You want writepage() to check/clean the referenced bit and move the page > > to the active list itself ? > > Well, that's the other part of what my patch was doing. > > Let me state it a different way, how is the ne

Re: page_launder() bug

2001-05-09 Thread David S. Miller
Marcelo Tosatti writes: > You want writepage() to check/clean the referenced bit and move the page > to the active list itself ? Well, that's the other part of what my patch was doing. Let me state it a different way, how is the new writepage() framework going to do things like ignore the ref

Re: page_launder() bug

2001-05-09 Thread Marcelo Tosatti
On Tue, 8 May 2001, David S. Miller wrote: > > Marcelo Tosatti writes: > > Ok, this patch implements thet thing and also changes ext2+swap+shm > > writepage operations (so I could test the thing). > > > > The performance is better with the patch on my restricted swapping tests. > > Nice.

Re: page_launder() bug

2001-05-09 Thread Martin Dalecki
Rusty Russell wrote: > > In message <[EMAIL PROTECTED]> you write: > > > > Jonathan Morton writes: > > > >- page_count(page) == (1 + !!page->buffers)); > > > > > > Two inversions in a row? > > > > It is the most straightforward way to make a '1' or '0' > > integer from the NUL

Re: page_launder() bug

2001-05-08 Thread Jonathan Morton
>That said, anyone who doesn't understand the former should probably >get some more C experience before commenting on others' code... I understood it, but it looked very much like a typo. -- from: Jonathan "Chromatix" Morton mail:

Re: page_launder() bug

2001-05-08 Thread Rusty Russell
In message <[EMAIL PROTECTED]> you write: > > Jonathan Morton writes: > > >- page_count(page) == (1 + !!page->buffers)); > > > > Two inversions in a row? > > It is the most straightforward way to make a '1' or '0' > integer from the NULL state of a pointer. Overall, I'd hav

Re: page_launder() bug

2001-05-08 Thread David S. Miller
Marcelo Tosatti writes: > Ok, this patch implements thet thing and also changes ext2+swap+shm > writepage operations (so I could test the thing). > > The performance is better with the patch on my restricted swapping tests. Nice. Now the only bit left is moving the referenced bit checking

Re: page_launder() bug

2001-05-08 Thread Marcelo Tosatti
On Tue, 8 May 2001, Linus Torvalds wrote: > > > On Tue, 8 May 2001, Marcelo Tosatti wrote: > > > > There are two issues which I missed yesterday: we have to get a reference > > on the page, mark it clean, drop the locks and then call writepage(). If > > the writepage() fails, we'll have to se

Re: page_launder() bug

2001-05-08 Thread Linus Torvalds
On Tue, 8 May 2001, Marcelo Tosatti wrote: > > There are two issues which I missed yesterday: we have to get a reference > on the page, mark it clean, drop the locks and then call writepage(). If > the writepage() fails, we'll have to set_page_dirty(page). We can move the "mark it clean" into w

Re: page_launder() bug

2001-05-08 Thread Kai Henningsen
[EMAIL PROTECTED] (Horst von Brand) wrote on 07.05.01 in <[EMAIL PROTECTED]>: > "David S. Miller" <[EMAIL PROTECTED]> said: > > Jonathan Morton writes: > > > >-page_count(page) == (1 + !!page->buffers)); > > > > > > Two inversions in a row? > > > > It is the most stra

Re: page_launder() bug

2001-05-08 Thread Mikulas Patocka
> > My point is that its _ok_ for us to check if the page is a dead swap cache > > page _without_ the lock since writepage() will recheck again with the page > > _locked_. Quoting you two messages back: > > > > "But it is important to re-calculate the deadness after getting the lock. > > B

Re: page_launder() bug

2001-05-08 Thread Marcelo Tosatti
On Mon, 7 May 2001, Linus Torvalds wrote: > In fact, it might even clean stuff up. Who knows? At least > page_launder() would not need to know about magic dead swap pages, because > the decision would be entirely in writepage(). > > And there aren't that many writepage() implementations in the

Re: page_launder() bug

2001-05-08 Thread BERECZ Szabolcs
On Mon, 7 May 2001, David S. Miller wrote: > My patch is crap and can cause corruptions, there is not argument > about it now :-) is it the only bug in the swap handling? or why is this bug triggered so heavily if the swap is on a filesystem? I had oopses when I used a swapfile on a partition, bu

Re: page_launder() bug

2001-05-08 Thread David S. Miller
Linus Torvalds writes: > Maybe it's academic. Do we know that any of this actually makes any > performance difference at all? We know that dirty swap pages can accumulate to the point where the swapper starves before it gets to enough of the "second pass" cases of the page_launder loop to run

Re: page_launder() bug

2001-05-08 Thread Helge Hafting
"J . A . Magallon" wrote: > > On 05.07 Helge Hafting wrote: > > !0 is 1. !(anything else) is 0. It is zero and one, not > > zero and "non-zero". So a !! construction gives zero if you have > > zero, and one if you had anything else. There's no doubt about it. > > > > > Isn't this asking for

Re: page_launder() bug

2001-05-08 Thread Linus Torvalds
On Mon, 7 May 2001, David S. Miller wrote: > > The only downside would be that the formerly "quick case" in the loop > of dealing with referenced pages would now need to go inside the page > lock. It's probably a non-issue... It might easily be an issue. That function will touch pretty much ev

Re: page_launder() bug

2001-05-07 Thread Linus Torvalds
On Mon, 7 May 2001, Marcelo Tosatti wrote: > > So what about moving the check for a dead swap cache page from > swap_writepage() to page_launder() (+ PageSwapCache() check) just before > the "if (!launder_loop)" ? > > Yes, its ugly special casing. Any other suggestion ? My most favourite app

Re: page_launder() bug

2001-05-07 Thread David S. Miller
Marcelo Tosatti writes: > > Hmmm, can't this happen without my patch? > > No. We will never call writepage() without __GFP_IO without your patch. > I see, because launder_loop never progresses to 1 in that case. My patch is crap and can cause corruptions, there is not argument about it no

Re: page_launder() bug

2001-05-07 Thread Marcelo Tosatti
On Mon, 7 May 2001, David S. Miller wrote: > > Marcelo Tosatti writes: > > I just thought about this case: > > > > We find a dead swap cache page, so dead_swap_page goes to 1. > > > > We call swap_writepage(), but in the meantime the swapin readahead code > > got a reference on the

Re: page_launder() bug

2001-05-07 Thread Linus Torvalds
On Mon, 7 May 2001, David S. Miller wrote: > > Marcelo Tosatti writes: > > I was wrong. The patch is indeed buggy because of the __GFP_IO thing. > > What about the __GFP_IO thing? > > Specifically, what protects the __GFP_IO thing from happening without > my patch? This:

Re: page_launder() bug

2001-05-07 Thread Linus Torvalds
On Mon, 7 May 2001, David S. Miller wrote: > > Marcelo Tosatti writes: > > I just thought about this case: > > > > We find a dead swap cache page, so dead_swap_page goes to 1. > > > > We call swap_writepage(), but in the meantime the swapin readahead code > > got a reference on the

Re: page_launder() bug

2001-05-07 Thread David S. Miller
Marcelo Tosatti writes: > I was wrong. The patch is indeed buggy because of the __GFP_IO thing. What about the __GFP_IO thing? Specifically, what protects the __GFP_IO thing from happening without my patch? Later, David S. Miller [EMAIL PROTECTED] - To unsubscribe from this list: send the lin

Re: page_launder() bug

2001-05-07 Thread David S. Miller
Marcelo Tosatti writes: > I just thought about this case: > > We find a dead swap cache page, so dead_swap_page goes to 1. > > We call swap_writepage(), but in the meantime the swapin readahead code > got a reference on the swap map for the page. > > We write the page out because

Re: page_launder() bug

2001-05-07 Thread David S. Miller
Linus Torvalds writes: > YOUR HEURISTIC IS WRONG! Please start the conversation this way next time. > I call that a bug. You don't. Fine. You made it sound like a data corrupter, a kernel crasher, and that any bug against a kernel with that patch indicates my patch caused it. There is an imp

Re: page_launder() bug

2001-05-07 Thread Linus Torvalds
On Mon, 7 May 2001, David S. Miller wrote: > > Here, let's talk code a little bit so there are no misunderstandings, > I really want to put this to rest: > > Calculate dead_swap_page outside of lock. NO. That's not what you're doing at all. You're calculating something completely different tha

Re: page_launder() bug

2001-05-07 Thread Linus Torvalds
On Mon, 7 May 2001, Marcelo Tosatti wrote: > > So lets fix it and make it look for the swap counts. Ehh. Which you MUST NOT do without holding the page lock. Hint: it needs "page->index", and without holding the page lock you don't know what it could be. An out-of-bounds page index could do

Re: page_launder() bug

2001-05-07 Thread David S. Miller
Marcelo Tosatti writes: > My point is that its _ok_ for us to check if the page is a dead swap cache > page _without_ the lock since writepage() will recheck again with the page > _locked_. Quoting you two messages back: > > "But it is important to re-calculate the deadness after getting t

Re: page_launder() bug

2001-05-07 Thread David S. Miller
Linus Torvalds writes: > What do you expect me to do? The patch is buggy. It should be reverted. > What's your problem? I think the problem he has is that you are acting as if the patch causes corruptions and will end in failures. This is how you are coming across, at least. Really, your pro

Re: page_launder() bug

2001-05-07 Thread Marcelo Tosatti
On Mon, 7 May 2001, Linus Torvalds wrote: > > On Mon, 7 May 2001, Marcelo Tosatti wrote: > > > > So the "dead_swap_page" logic is _not_ buggy and you are full of shit when > > telling Alan to revert the change. (sorry, I could not avoid this one) > > Well, the problem is that the patch _is_

Re: page_launder() bug

2001-05-07 Thread Linus Torvalds
On Mon, 7 May 2001, Marcelo Tosatti wrote: > > So the "dead_swap_page" logic is _not_ buggy and you are full of shit when > telling Alan to revert the change. (sorry, I could not avoid this one) Well, the problem is that the patch _is_ buggy. swap_writepage() does it right. And dead_swap_page

Re: page_launder() bug

2001-05-07 Thread David S. Miller
Linus Torvalds writes: > > On Mon, 7 May 2001, Marcelo Tosatti wrote: > > And thats what swap_writepage() is doing: > > Ehh.. swap_writepage() is called with the page locked. So it _can_ depend > on it. > > If the page isn't locked there, then THAT is a bug. A major one. Linus, he's t

Re: page_launder() bug

2001-05-07 Thread Marcelo Tosatti
On Mon, 7 May 2001, Linus Torvalds wrote: > > On Mon, 7 May 2001, Marcelo Tosatti wrote: > > > > On 7 May 2001, Linus Torvalds wrote: > > > > > But it is important to re-calculate the deadness after getting the > > > lock. Before, it was just an informed guess. After the lock, it is > > > kn

Re: page_launder() bug

2001-05-07 Thread Linus Torvalds
On Mon, 7 May 2001, Marcelo Tosatti wrote: > > On 7 May 2001, Linus Torvalds wrote: > > > But it is important to re-calculate the deadness after getting the > > lock. Before, it was just an informed guess. After the lock, it is > > knowledge. And you can use informed guesses for heuristics, but

Re: page_launder() bug

2001-05-07 Thread Marcelo Tosatti
On 7 May 2001, Linus Torvalds wrote: > But it is important to re-calculate the deadness after getting the > lock. Before, it was just an informed guess. After the lock, it is > knowledge. And you can use informed guesses for heuristics, but you > must _not_ use them for any serious decisions.

Re: page_launder() bug

2001-05-07 Thread David S. Miller
Linus Torvalds writes: > The whole "dead_swap_page" optimization in the -ac tree is apparentrly > completely bogus. It caches a value that is not valid: you cannot > reliably look at whether the page has buffers etc without holding the > page locked. It caches a value controlling heuristic

Re: page_launder() bug

2001-05-07 Thread J . A . Magallon
On 05.07 Helge Hafting wrote: > Tobias Ringstrom wrote: > > > > On Sun, 6 May 2001, David S. Miller wrote: > > > It is the most straightforward way to make a '1' or '0' > > > integer from the NULL state of a pointer. > > > > But is it really specified in the C "standards" to be exctly zero or o

Re: page_launder() bug

2001-05-07 Thread Linus Torvalds
In article <[EMAIL PROTECTED]>, BERECZ Szabolcs <[EMAIL PROTECTED]> wrote: > >there is a bug in page_launder introduced with kernel 2.4.3-ac12. Yes. The whole "dead_swap_page" optimization in the -ac tree is apparentrly completely bogus. It caches a value that is not valid: you cannot reliably

Re: page_launder() bug

2001-05-07 Thread Horst von Brand
"David S. Miller" <[EMAIL PROTECTED]> said: > Jonathan Morton writes: > > >- page_count(page) == (1 + !!page->buffers)); > > > > Two inversions in a row? > > It is the most straightforward way to make a '1' or '0' > integer from the NULL state of a pointer. IMVHO, it is clea

Re: page_launder() bug

2001-05-07 Thread H. Peter Anvin
Followup to: <[EMAIL PROTECTED]> By author:Tobias Ringstrom <[EMAIL PROTECTED]> In newsgroup: linux.dev.kernel > > On Sun, 6 May 2001, David S. Miller wrote: > > It is the most straightforward way to make a '1' or '0' > > integer from the NULL state of a pointer. > > But is it really specifi

Re: page_launder() bug

2001-05-07 Thread Daniel Phillips
On Monday 07 May 2001 08:26, Tobias Ringstrom wrote: > On Sun, 6 May 2001, David S. Miller wrote: > > It is the most straightforward way to make a '1' or '0' > > integer from the NULL state of a pointer. > > But is it really specified in the C "standards" to be exctly zero or > one, and not zero a

Re: page_launder() bug

2001-05-07 Thread Alan Cox
> > It is the most straightforward way to make a '1' or '0' > > integer from the NULL state of a pointer. > > But is it really specified in the C "standards" to be exctly zero or one, > and not zero and non-zero? Yes. (Fortunately since when this argument occurred Linus said he would eat his und

Re: page_launder() bug

2001-05-07 Thread Helge Hafting
Tobias Ringstrom wrote: > > On Sun, 6 May 2001, David S. Miller wrote: > > It is the most straightforward way to make a '1' or '0' > > integer from the NULL state of a pointer. > > But is it really specified in the C "standards" to be exctly zero or one, > and not zero and non-zero? !0 is 1. !

Re: page_launder() bug

2001-05-07 Thread David S. Miller
Tobias Ringstrom writes: > But is it really specified in the C "standards" to be exctly zero or one, > and not zero and non-zero? I'm pretty sure it does. > IMHO, the ?: construct is way more readable and reliable. Well identical code has been there for several months just a few lines away.

Re: page_launder() bug

2001-05-06 Thread Tobias Ringstrom
On Sun, 6 May 2001, David S. Miller wrote: > It is the most straightforward way to make a '1' or '0' > integer from the NULL state of a pointer. But is it really specified in the C "standards" to be exctly zero or one, and not zero and non-zero? IMHO, the ?: construct is way more readable and re

Re: page_launder() bug

2001-05-06 Thread Aaron Lehmann
On Sun, May 06, 2001 at 09:55:26PM -0700, David S. Miller wrote: > > Jonathan Morton writes: > > >- page_count(page) == (1 + !!page->buffers)); > > Two inversions in a row? > It is the most straightforward way to make a '1' or '0' > integer from the NULL state of a pointer. pa

Re: page_launder() bug

2001-05-06 Thread David S. Miller
Jonathan Morton writes: > >-page_count(page) == (1 + !!page->buffers)); > > Two inversions in a row? It is the most straightforward way to make a '1' or '0' integer from the NULL state of a pointer. Later, David S. Miller [EMAIL PROTECTED] - To unsubscribe from this list

Re: page_launder() bug

2001-05-06 Thread Jonathan Lundell
At 12:07 AM +0200 2001-05-07, BERECZ Szabolcs wrote: >On Sun, 6 May 2001, Jonathan Morton wrote: > > > >- page_count(page) == (1 + !!page->buffers)); >> >> Two inversions in a row? I'd like to see that made more explicit, >> otherwise it looks like a bug to me. Of course, if

Re: page_launder() bug

2001-05-06 Thread BERECZ Szabolcs
Hi! On Sun, 6 May 2001, Jonathan Morton wrote: > >- page_count(page) == (1 + !!page->buffers)); > > Two inversions in a row? I'd like to see that made more explicit, > otherwise it looks like a bug to me. Of course, if it IS a bug... it's not a bug. if page->buffers is zero

Re: page_launder() bug

2001-05-06 Thread Jonathan Morton
>- page_count(page) == (1 + !!page->buffers)); Two inversions in a row? I'd like to see that made more explicit, otherwise it looks like a bug to me. Of course, if it IS a bug... -- from: Jonathan "Chromatix"

page_launder() bug

2001-05-06 Thread BERECZ Szabolcs
Hi! there is a bug in page_launder introduced with kernel 2.4.3-ac12. if the swapfile is on a filesystem, then after swapping out some pages, the system locks up. sometimes it writes an oops message. I don't know exactly what's the problem, but with the attached patch it works. (this is just a re