Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-29 Thread Andrea Arcangeli
On Mon, Jan 29, 2007 at 03:08:44PM +0100, Andrea Gelmini wrote: > On Mon, Jan 22, 2007 at 10:10:39AM +0100, Peter Zijlstra wrote: > > On Fri, 2007-01-12 at 01:39 +0100, Andrea Gelmini wrote: > > > Hi, > > > I can't do the test 'till next week. > > > > > > Thanks a lot for your time, > > > Gelma

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-29 Thread Andrea Gelmini
On Mon, Jan 22, 2007 at 10:10:39AM +0100, Peter Zijlstra wrote: > On Fri, 2007-01-12 at 01:39 +0100, Andrea Gelmini wrote: > > Hi, > > I can't do the test 'till next week. > > > > Thanks a lot for your time, > > Gelma > > Have you ever gotten around to testing this? well, I spent some time

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-29 Thread Peter Zijlstra
On Mon, 2007-01-29 at 15:08 +0100, Andrea Gelmini wrote: > On Mon, Jan 22, 2007 at 10:10:39AM +0100, Peter Zijlstra wrote: > > On Fri, 2007-01-12 at 01:39 +0100, Andrea Gelmini wrote: > > > Hi, > > > I can't do the test 'till next week. > > > > > > Thanks a lot for your time, > > > Gelma > > >

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-29 Thread Peter Zijlstra
On Mon, 2007-01-29 at 15:08 +0100, Andrea Gelmini wrote: On Mon, Jan 22, 2007 at 10:10:39AM +0100, Peter Zijlstra wrote: On Fri, 2007-01-12 at 01:39 +0100, Andrea Gelmini wrote: Hi, I can't do the test 'till next week. Thanks a lot for your time, Gelma Have you ever

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-29 Thread Andrea Gelmini
On Mon, Jan 22, 2007 at 10:10:39AM +0100, Peter Zijlstra wrote: On Fri, 2007-01-12 at 01:39 +0100, Andrea Gelmini wrote: Hi, I can't do the test 'till next week. Thanks a lot for your time, Gelma Have you ever gotten around to testing this? well, I spent some time doing more

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-29 Thread Andrea Arcangeli
On Mon, Jan 29, 2007 at 03:08:44PM +0100, Andrea Gelmini wrote: On Mon, Jan 22, 2007 at 10:10:39AM +0100, Peter Zijlstra wrote: On Fri, 2007-01-12 at 01:39 +0100, Andrea Gelmini wrote: Hi, I can't do the test 'till next week. Thanks a lot for your time, Gelma Have you

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-10 Thread Andrea Gelmini
On Fri, Jan 05, 2007 at 04:36:06PM +1100, Nick Piggin wrote: > I was thinking like a 100 line C program that I can reproduce here ;) not soon, but I'll try to produce it. > If you can even describe the steps it does: (eg. mmap file A, write(2) to > it, truncate it, , should contain 1s but it

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-10 Thread Andrea Gelmini
On Fri, Jan 05, 2007 at 04:36:06PM +1100, Nick Piggin wrote: I was thinking like a 100 line C program that I can reproduce here ;) not soon, but I'll try to produce it. If you can even describe the steps it does: (eg. mmap file A, write(2) to it, truncate it, , should contain 1s but it

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-07 Thread dean gaudet
On Wed, 3 Jan 2007, Andrew Morton wrote: > On Wed, 03 Jan 2007 22:56:07 -0800 (PST) > David Miller <[EMAIL PROTECTED]> wrote: > > > Note that the original rtorrent debian bug report was against 2.6.18 > > I think that was 2.6.18+debian-added-dirty-page-tracking-patches. i've seen it on a

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-07 Thread dean gaudet
On Wed, 3 Jan 2007, Andrew Morton wrote: On Wed, 03 Jan 2007 22:56:07 -0800 (PST) David Miller [EMAIL PROTECTED] wrote: Note that the original rtorrent debian bug report was against 2.6.18 I think that was 2.6.18+debian-added-dirty-page-tracking-patches. i've seen it on a 2.6.18.4 box

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-05 Thread Linus Torvalds
On Fri, 5 Jan 2007, Christoph Lameter wrote: > > It looks as if most code handling the dirty bits already uses the page > lock? Much does. But I did some debugging (when trying to figure out the VM corruption), and certainly not all of it does. And when I looked at some of the code-paths, I

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-05 Thread Christoph Lameter
On Fri, 5 Jan 2007, Linus Torvalds wrote: > However, a lot of the code isn't really amenable to it as it stands now. > We very much tend to call it in critical sections, and you have to move > them all out of the locks they are now. It looks as if most code handling the dirty bits already uses

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-05 Thread Linus Torvalds
On Fri, 5 Jan 2007, Christoph Lameter wrote: > > Maybe we should require taking the page lock before the dirty bits are > modified? I think it's been suggested several times. However, a lot of the code isn't really amenable to it as it stands now. We very much tend to call it in critical

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-05 Thread Christoph Lameter
On Wed, 3 Jan 2007, Linus Torvalds wrote: > And I haven't actually thought about it that much, so I could be full of > crap. But I don't see anything that protects against it: we may hold the > page lock, but since the code that marks things _dirty_ doesn't > necessarily always hold it, that

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-05 Thread Christoph Lameter
On Wed, 3 Jan 2007, Linus Torvalds wrote: And I haven't actually thought about it that much, so I could be full of crap. But I don't see anything that protects against it: we may hold the page lock, but since the code that marks things _dirty_ doesn't necessarily always hold it, that

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-05 Thread Linus Torvalds
On Fri, 5 Jan 2007, Christoph Lameter wrote: Maybe we should require taking the page lock before the dirty bits are modified? I think it's been suggested several times. However, a lot of the code isn't really amenable to it as it stands now. We very much tend to call it in critical

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-05 Thread Christoph Lameter
On Fri, 5 Jan 2007, Linus Torvalds wrote: However, a lot of the code isn't really amenable to it as it stands now. We very much tend to call it in critical sections, and you have to move them all out of the locks they are now. It looks as if most code handling the dirty bits already uses

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-05 Thread Linus Torvalds
On Fri, 5 Jan 2007, Christoph Lameter wrote: It looks as if most code handling the dirty bits already uses the page lock? Much does. But I did some debugging (when trying to figure out the VM corruption), and certainly not all of it does. And when I looked at some of the code-paths, I

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-04 Thread Nick Piggin
Andrea Gelmini wrote: On Thu, Jan 04, 2007 at 05:03:43PM +1100, Nick Piggin wrote: Anyway that leaves us with the question of why Andrea's database is getting corrupted. Hopefully he can give us a minimal test-case. yep, I can give you a complete image of my machine, or a root access.

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-04 Thread Hugh Dickins
On Wed, 3 Jan 2007, Andrew Morton wrote: > On Wed, 03 Jan 2007 22:56:07 -0800 (PST) > David Miller <[EMAIL PROTECTED]> wrote: > > From: Andrew Morton <[EMAIL PROTECTED]> > > > > > > It's odd that stories of pre-2.6.19 BerkeleyDB corruption are now coming > > > out of the woodwork. It's the first

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-04 Thread Andrea Gelmini
On Thu, Jan 04, 2007 at 02:57:24PM +1100, Nick Piggin wrote: > I wouldn't discount a kernel bug, but it will be hard to track down > unless you can find an earlier kernel that did not cause the corruptions > and/or provide source for a minimal test case to reproduce. see my others reply, please.

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-04 Thread Andrea Gelmini
On Thu, Jan 04, 2007 at 04:07:18PM +1100, Nick Piggin wrote: > But the patch that Andrea was pointing to was your last patch (The Fix), > which stopped page_mkclean caller throwing out dirty bits. You probably > didn't see that in the mail I cc'ed you on. well, I pointed at that patch for reply,

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-04 Thread Andrea Gelmini
On Thu, Jan 04, 2007 at 05:03:43PM +1100, Nick Piggin wrote: > Anyway that leaves us with the question of why Andrea's database is getting > corrupted. Hopefully he can give us a minimal test-case. yep, I can give you a complete image of my machine, or a root access. replicate the problem it's

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-04 Thread Andrea Gelmini
On Wed, Jan 03, 2007 at 10:12:20PM -0800, Andrew Morton wrote: > > Anyway that leaves us with the question of why Andrea's database is getting > > corrupted. Hopefully he can give us a minimal test-case. > > It's odd that stories of pre-2.6.19 BerkeleyDB corruption are now coming > out of the

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-04 Thread Andrea Gelmini
On Wed, Jan 03, 2007 at 10:12:20PM -0800, Andrew Morton wrote: Anyway that leaves us with the question of why Andrea's database is getting corrupted. Hopefully he can give us a minimal test-case. It's odd that stories of pre-2.6.19 BerkeleyDB corruption are now coming out of the woodwork.

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-04 Thread Andrea Gelmini
On Thu, Jan 04, 2007 at 05:03:43PM +1100, Nick Piggin wrote: Anyway that leaves us with the question of why Andrea's database is getting corrupted. Hopefully he can give us a minimal test-case. yep, I can give you a complete image of my machine, or a root access. replicate the problem it's not

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-04 Thread Andrea Gelmini
On Thu, Jan 04, 2007 at 04:07:18PM +1100, Nick Piggin wrote: But the patch that Andrea was pointing to was your last patch (The Fix), which stopped page_mkclean caller throwing out dirty bits. You probably didn't see that in the mail I cc'ed you on. well, I pointed at that patch for reply, but

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-04 Thread Andrea Gelmini
On Thu, Jan 04, 2007 at 02:57:24PM +1100, Nick Piggin wrote: I wouldn't discount a kernel bug, but it will be hard to track down unless you can find an earlier kernel that did not cause the corruptions and/or provide source for a minimal test case to reproduce. see my others reply, please.

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-04 Thread Hugh Dickins
On Wed, 3 Jan 2007, Andrew Morton wrote: On Wed, 03 Jan 2007 22:56:07 -0800 (PST) David Miller [EMAIL PROTECTED] wrote: From: Andrew Morton [EMAIL PROTECTED] It's odd that stories of pre-2.6.19 BerkeleyDB corruption are now coming out of the woodwork. It's the first I've ever heard

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-03 Thread Nick Piggin
Andrew Morton wrote: On Wed, 03 Jan 2007 22:56:07 -0800 (PST) David Miller <[EMAIL PROTECTED]> wrote: Anyway that leaves us with the question of why Andrea's database is getting corrupted. Hopefully he can give us a minimal test-case. It's odd that stories of pre-2.6.19 BerkeleyDB

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-03 Thread Andrew Morton
On Wed, 03 Jan 2007 22:56:07 -0800 (PST) David Miller <[EMAIL PROTECTED]> wrote: > From: Andrew Morton <[EMAIL PROTECTED]> > Date: Wed, 3 Jan 2007 22:12:20 -0800 > > > On Thu, 04 Jan 2007 17:03:43 +1100 > > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > > > > That bug was introduced in 2.6.19,

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-03 Thread David Miller
From: Andrew Morton <[EMAIL PROTECTED]> Date: Wed, 3 Jan 2007 22:12:20 -0800 > On Thu, 04 Jan 2007 17:03:43 +1100 > Nick Piggin <[EMAIL PROTECTED]> wrote: > > > > That bug was introduced in 2.6.19, with the dirty page tracking patches. > > > > > > 2.6.18 and earlier used ->private_lock coverage

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-03 Thread Andrew Morton
On Thu, 04 Jan 2007 17:03:43 +1100 Nick Piggin <[EMAIL PROTECTED]> wrote: > > That bug was introduced in 2.6.19, with the dirty page tracking patches. > > > > 2.6.18 and earlier used ->private_lock coverage in try_to_free_buffers() to > > prevent it. > > Ohh, right you are, I was looking at

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-03 Thread Nick Piggin
Andrew Morton wrote: On Wed, 3 Jan 2007 20:44:36 -0800 (PST) Linus Torvalds <[EMAIL PROTECTED]> wrote: Actually, I think 2.6.18 may have a subtle variation on it. In particular, I look back at the try_to_free_buffers() thing that I hated so much, and it makes me wonder.. It used to do:

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-03 Thread Andrew Morton
On Wed, 3 Jan 2007 20:44:36 -0800 (PST) Linus Torvalds <[EMAIL PROTECTED]> wrote: > Actually, I think 2.6.18 may have a subtle variation on it. > > In particular, I look back at the try_to_free_buffers() thing that I hated > so much, and it makes me wonder.. It used to do: > >

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-03 Thread Nick Piggin
Linus Torvalds wrote: On Thu, 4 Jan 2007, Nick Piggin wrote: That's when the bug was introduced -- 2.6.19. 2.6.18 does not have this bug, so it cannot be years old. Actually, I think 2.6.18 may have a subtle variation on it. In particular, I look back at the try_to_free_buffers() thing

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-03 Thread Linus Torvalds
On Thu, 4 Jan 2007, Nick Piggin wrote: > > That's when the bug was introduced -- 2.6.19. 2.6.18 does not have > this bug, so it cannot be years old. Actually, I think 2.6.18 may have a subtle variation on it. In particular, I look back at the try_to_free_buffers() thing that I hated so

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-03 Thread Nick Piggin
Andrea Gelmini wrote: On Sun, Dec 31, 2006 at 02:55:58PM +1100, Nick Piggin wrote: This bug was only introduced in 2.6.19, due to a change that caused pte no, Linus said that with 2.6.19 it's easier to trigger this bug... That's when the bug was introduced -- 2.6.19. 2.6.18 does not have

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-03 Thread Linus Torvalds
On Thu, 4 Jan 2007, Nick Piggin wrote: That's when the bug was introduced -- 2.6.19. 2.6.18 does not have this bug, so it cannot be years old. Actually, I think 2.6.18 may have a subtle variation on it. In particular, I look back at the try_to_free_buffers() thing that I hated so much,

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-03 Thread Andrew Morton
On Wed, 3 Jan 2007 20:44:36 -0800 (PST) Linus Torvalds [EMAIL PROTECTED] wrote: Actually, I think 2.6.18 may have a subtle variation on it. In particular, I look back at the try_to_free_buffers() thing that I hated so much, and it makes me wonder.. It used to do:

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-03 Thread Andrew Morton
On Thu, 04 Jan 2007 17:03:43 +1100 Nick Piggin [EMAIL PROTECTED] wrote: That bug was introduced in 2.6.19, with the dirty page tracking patches. 2.6.18 and earlier used ->private_lock coverage in try_to_free_buffers() to prevent it. Ohh, right you are, I was looking at 2.6.19 sources.

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-03 Thread David Miller
From: Andrew Morton [EMAIL PROTECTED] Date: Wed, 3 Jan 2007 22:12:20 -0800 On Thu, 04 Jan 2007 17:03:43 +1100 Nick Piggin [EMAIL PROTECTED] wrote: That bug was introduced in 2.6.19, with the dirty page tracking patches. 2.6.18 and earlier used ->private_lock coverage in

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2007-01-03 Thread Andrew Morton
On Wed, 03 Jan 2007 22:56:07 -0800 (PST) David Miller [EMAIL PROTECTED] wrote: From: Andrew Morton [EMAIL PROTECTED] Date: Wed, 3 Jan 2007 22:12:20 -0800 On Thu, 04 Jan 2007 17:03:43 +1100 Nick Piggin [EMAIL PROTECTED] wrote: That bug was introduced in 2.6.19, with the dirty page

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2006-12-31 Thread Andrea Gelmini
On Sun, Dec 31, 2006 at 02:55:58PM +1100, Nick Piggin wrote: > This bug was only introduced in 2.6.19, due to a change that caused pte no, Linus said that with 2.6.19 it's easier to trigger this bug... > So if your corruption is years old, then it must be something else. > Maybe it is hidden by a

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2006-12-31 Thread Andrea Gelmini
On Sun, Dec 31, 2006 at 02:55:58PM +1100, Nick Piggin wrote: This bug was only introduced in 2.6.19, due to a change that caused pte no, Linus said that with 2.6.19 it's easier to trigger this bug... So if your corruption is years old, then it must be something else. Maybe it is hidden by a

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2006-12-30 Thread Nick Piggin
: 3bf8ba38f38d3647368e4edcf7d019f9f8d9184a Author: Linus Torvalds <[EMAIL PROTECTED]> AuthorDate: Fri Dec 29 10:00:58 2006 -0800 Committer: Linus Torvalds <[EMAIL PROTECTED]> CommitDate: Fri Dec 29 10:00:58 2006 -0800 VM: Fix nasty and subtle race in shared mmap'ed pa

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2006-12-30 Thread Nick Piggin
: 3bf8ba38f38d3647368e4edcf7d019f9f8d9184a Author: Linus Torvalds [EMAIL PROTECTED] AuthorDate: Fri Dec 29 10:00:58 2006 -0800 Committer: Linus Torvalds [EMAIL PROTECTED] CommitDate: Fri Dec 29 10:00:58 2006 -0800 VM: Fix nasty and subtle race in shared mmap'ed page writeback With 2.6.20

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2006-12-29 Thread Andrea Gelmini
> Parent: 3bf8ba38f38d3647368e4edcf7d019f9f8d9184a > Author: Linus Torvalds <[EMAIL PROTECTED]> > AuthorDate: Fri Dec 29 10:00:58 2006 -0800 > Committer: Linus Torvalds <[EMAIL PROTECTED]> > CommitDate: Fri Dec 29 10:00:58 2006 -0800 > > VM: Fix nasty and subtle race in

Re: VM: Fix nasty and subtle race in shared mmap'ed page writeback

2006-12-29 Thread Andrea Gelmini
: 3bf8ba38f38d3647368e4edcf7d019f9f8d9184a Author: Linus Torvalds [EMAIL PROTECTED] AuthorDate: Fri Dec 29 10:00:58 2006 -0800 Committer: Linus Torvalds [EMAIL PROTECTED] CommitDate: Fri Dec 29 10:00:58 2006 -0800 VM: Fix nasty and subtle race in shared mmap'ed page writeback With 2.6.20-rc2-git1, which