On Fri, Feb 02, 2007 at 07:42:52PM +1100, Nick Piggin ([EMAIL PROTECTED]) wrote:
> Anyway, I had a look at your bugzilla test-case and managed to slim it
> down to something that easily shows what the problem is (available on
> request) -- the problem is that recipient of the sendfile is seeing
> m
Mark Groves wrote:
Hi,
I have been seeing a problem when using sendfile repeatedly on an
SMP server, which I believe is related to the problem that was
discovered recently with marking dirty pages. The bug, as well as a test
script, is listed at http://bugzilla.kernel.org/show_bug.cgi?id=7
Hi,
I have been seeing a problem when using sendfile repeatedly on an
SMP server, which I believe is related to the problem that was
discovered recently with marking dirty pages. The bug, as well as a test
script, is listed at http://bugzilla.kernel.org/show_bug.cgi?id=7650.
Currently, we're
On Sun, 7 Jan 2007 12:36:18 +1030
"Tom Lanyon" <[EMAIL PROTECTED]> wrote:
> On 12/27/06, Linus Torvalds <[EMAIL PROTECTED]> wrote:
> > What would also actually be interesting is whether somebody can reproduce
> > this on Reiserfs, for example. I _think_ all the reports I've seen are on
> > ext2 or
On 1/7/07, Tom Lanyon <[EMAIL PROTECTED]> wrote:
I've been following this thread for a while now as I started
experiencing file corruption in rtorrent when I upgraded to 2.6.19. I
am using reiserfs.
However, moving to 2.6.20-rc3 does indeed seem to fix the issue thus far...
--
Tom Lanyon
On 12/27/06, Linus Torvalds <[EMAIL PROTECTED]> wrote:
What would also actually be interesting is whether somebody can reproduce
this on Reiserfs, for example. I _think_ all the reports I've seen are on
ext2 or ext3, and if this is somehow writeback-related, it could be some
bug that is just shar
On Fri, 29 Dec 2006 16:58:41 -0800 (PST)
Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>
> On Fri, 29 Dec 2006, Andrew Morton wrote:
> >
> > > > Somewhat nastily, but as ext3 directories are metadata it is appropriate
> > > > that modifications to them be done in terms of buffer_heads (ie:
> > >
On Fri, 29 Dec 2006, Andrew Morton wrote:
>
> > > Somewhat nastily, but as ext3 directories are metadata it is appropriate
> > > that modifications to them be done in terms of buffer_heads (ie: blocks).
> >
> > No. There is nothing "appropriate" about using buffer_heads for metadata.
>
> I sai
On Fri, 29 Dec 2006, Andrew Morton wrote:
>
> Adam Richter spent considerable time a few years ago trying to make the
> mpage code go direct-to-BIO in all cases and we eventually gave up. The
> conceptual layering of page<->blocks<->bio is pretty clean, and it is hard
> and ugly to fully optimi
On Fri, 29 Dec 2006 16:11:44 -0800 (PST)
Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>
> > JBD implements physical block-based journalling, so it is 100% appropriate
> > that JBD deal with these disk blocks using their buffer_head
> > representation.
>
> And as long as it does that, you just ha
On Fri, 29 Dec 2006, Andrew Morton wrote:
>
> They're extra. As in "can be optimised away".
Sure. Don't use buffer heads.
> The buffer_head is not an IO container. It is the kernel's core
> representation of a disk block.
Please come back from the 90's.
The buffer heads are nothing but a m
On Fri, 29 Dec 2006 18:32:07 -0500
Theodore Tso <[EMAIL PROTECTED]> wrote:
> On Fri, Dec 29, 2006 at 02:42:51PM -0800, Linus Torvalds wrote:
> > I think ext3 is terminally crap by now. It still uses buffer heads in
> > places where it really really shouldn't, and as a result, things like
> > dir
On Fri, 29 Dec 2006, Theodore Tso wrote:
>
> If we do get this fixed for ext4, one interesting question is whether
> people would accept a patch to backport the fixes to ext3, given
> the grief this is causing the page I/O and VM routines.
I don't think backporting is the smartest option (un
On Fri, 29 Dec 2006 14:42:51 -0800 (PST)
Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>
> On Fri, 29 Dec 2006, Andrew Morton wrote:
> >
> > - The above change means that we do extra writeout. If a page is dirtied
> > once, kjournald will write it and then pdflush will come along and
> > nee
On Fri, Dec 29, 2006 at 02:42:51PM -0800, Linus Torvalds wrote:
> I think ext3 is terminally crap by now. It still uses buffer heads in
> places where it really really shouldn't, and as a result, things like
> directory accesses are simply slower than they should be. Sadly, I don't
> think ext4
On Fri, 29 Dec 2006, Andrew Morton wrote:
>
> - The above change means that we do extra writeout. If a page is dirtied
> once, kjournald will write it and then pdflush will come along and
> needlessly write it again.
There's zero extra writeout for any flushing that flushes BY PAGES.
Only
On Fri, 29 Dec 2006 14:16:32 -0800
Andrew Morton <[EMAIL PROTECTED]> wrote:
> - Poor old IO accounting broke again.
No it didn't - we're relying upon the behaviour of __set_page_dirty_buffers()
against an already-dirty page.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel
On Fri, 29 Dec 2006 02:48:35 -0800 (PST)
Linus Torvalds <[EMAIL PROTECTED]> wrote:
> + if (mapping && mapping_cap_account_dirty(mapping)) {
> + /*
> + * Yes, Virginia, this is indeed insane.
> + *
> + * We use this sequence to make sure that
>
On Fri, 29 Dec 2006, Theodore Tso wrote:
>
> I'm confused. Does this mean that if "fs blocksize"=="VM pagesize"
> this bug can't trigger?
No. Even if there is just a single buffer-head, if the filesystem ever
writes out that _single_ buffer-head out of turn (ie before the VM
actually asks it
On Fri, 29 Dec 2006, Nick Piggin wrote:
>
> > It still has a tiny tiny race (see the comment), but I bet nobody can really
> > hit it in real life anyway, and I know several ways to fix it, so I'm not
> > really _that_ worried about it.
>
> Well the race isn't a data loss one, is it? Just a cas
* Stephen Clark <[EMAIL PROTECTED]> [2006-12-29 10:17]:
> >It works for me now, both your testcase as well as an installation of
> >Debian on this ARM device. I manually applied the patch to 2.6.19.
>
> Can you post a diff against 2.6.19?
--- a/mm/page-writeback.c 2006-11-29 21:57:37.0
On Fri, Dec 29, 2006 at 12:58:12AM -0800, Linus Torvalds wrote:
> Because what "__set_page_dirty_buffers()" does is that AT THE TIME THE
> "set_page_dirty()" IS CALLED, it will mark all the buffers on that page as
> dirty. That may _sound_ like what we want, but it really isn't. Because by
> the
Martin Michlmayr wrote:
* Linus Torvalds <[EMAIL PROTECTED]> [2006-12-29 02:48]:
Can anybody get corruption with this thing applied? It goes on top
of plain v2.6.20-rc2.
It works for me now, both your testcase as well as an installation of
Debian on this ARM device. I manually applie
* Linus Torvalds <[EMAIL PROTECTED]> [2006-12-29 02:48]:
> Can anybody get corruption with this thing applied? It goes on top
> of plain v2.6.20-rc2.
It works for me now, both your testcase as well as an installation of
Debian on this ARM device. I manually applied the patch to 2.6.19.
Thanks.
Linus Torvalds wrote:
[...]
The patch is mostly a comment. The "real" meat of it is actually just a
few lines.
Can anybody get corruption with this thing applied? It goes on top of
plain v2.6.20-rc2.
No corruption with the testcase here. Will check with rtorrent too later
today but I supp
* Linus Torvalds <[EMAIL PROTECTED]> wrote:
> > Hmm? I'd love it if somebody else wrote the patch and tested it,
> > because I'm getting sick and tired of this bug ;)
>
> Who the hell am I kidding? I haven't been able to sleep right for the
> last few days over this bug. It was really getting
Hey nice work Linus!
Linus Torvalds wrote:
On Fri, 29 Dec 2006, Linus Torvalds wrote:
Hmm? I'd love it if somebody else wrote the patch and tested it, because
I'm getting sick and tired of this bug ;)
Who the hell am I kidding? I haven't been able to sleep right for the last
few days over
On Fri, 2006-12-29 at 02:48 -0800, Linus Torvalds wrote:
>
> On Fri, 29 Dec 2006, Linus Torvalds wrote:
> >
> > Hmm? I'd love it if somebody else wrote the patch and tested it, because
> > I'm getting sick and tired of this bug ;)
>
> Who the hell am I kidding? I haven't been able to sleep righ
On Fri, 29 Dec 2006, Linus Torvalds wrote:
>
> Hmm? I'd love it if somebody else wrote the patch and tested it, because
> I'm getting sick and tired of this bug ;)
Who the hell am I kidding? I haven't been able to sleep right for the last
few days over this bug. It was really getting to me.
On Thu, 28 Dec 2006, Linus Torvalds wrote:
>
> So everything I have ever seen says that the VM layer is actually doing
> everything right.
That was true, but at the same time, it's not. Let me explain.
> That to me says: "somebody didn't actually write it out". The VM layer
> asked the file
On Fri, 29 Dec 2006, Segher Boessenkool wrote:
>
> > I think what might be happening is that pdflush writes them out fine,
> > however we don't trap writes by the application _during_ that writeout.
>
> Yeah. I believe that more exactly it happens if the very last
> write to the page causes a w
I think what might be happening is that pdflush writes them out fine,
however we don't trap writes by the application _during_ that writeout.
Yeah. I believe that more exactly it happens if the very last
write to the page causes a writeback (due to dirty balancing)
while another writeback for t
From: Linus Torvalds <[EMAIL PROTECTED]>
Date: Thu, 28 Dec 2006 12:14:31 -0800 (PST)
> I get corruption - but the whole point is that it's very much pdflush that
> should be writing these pages out.
I think what might be happening is that pdflush writes them out fine,
however we don't trap write
On Thu, 2006-12-28 at 11:45 -0800, Andrew Morton wrote:
> On Thu, 28 Dec 2006 11:28:52 -0800 (PST)
> Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
> >
> >
> > On Thu, 28 Dec 2006, Guillaume Chazarain wrote:
> > >
> > > The attached patch fixes the corruption for me.
> >
> > Well, that's a good h
On Thu, 28 Dec 2006, Andrew Morton wrote:
>
> It would be interesting to convert your app to do fsync() before
> FADV_DONTNEED. That would take WB_SYNC_NONE out of the picture as well
> (apart from pdflush activity).
I get corruption - but the whole point is that it's very much pdflush that
s
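The ordering suggested above (fsync() first, then POSIX_FADV_DONTNEED) can be sketched in C roughly as follows. This is a minimal illustration, not code from the thread's test program: the function name sync_then_drop is hypothetical, and the feature-test macro is there only because another message in this thread reports needing -D_POSIX_C_SOURCE=200112 before POSIX_FADV_DONTNEED was declared.

```c
/* Feature-test macro must come before any system header. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: write the file's dirty data back synchronously,
 * then hint that its cached pages are no longer needed.
 *
 * Returns 0 on success, -1 if fsync() failed (errno is set), or the
 * positive error number that posix_fadvise() returns directly.
 */
int sync_then_drop(int fd)
{
    assert(fd >= 0);

    /* Synchronous writeback of everything dirty for this file. */
    if (fsync(fd) != 0)
        return -1;

    /* offset 0, len 0 means "from the start to the end of the file". */
    return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}
```

With the data already written back by fsync(), the advice call should only have clean pages to discard, which is the point of the suggestion. As noted in the quoted message, this cannot exclude pdflush activity that happens on its own schedule.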
On Thu, 28 Dec 2006 11:28:52 -0800 (PST)
Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>
> On Thu, 28 Dec 2006, Guillaume Chazarain wrote:
> >
> > The attached patch fixes the corruption for me.
>
> Well, that's a good hint, but it's really just a symptom. You effectively
> just made the test-p
On Thu, 28 Dec 2006, Guillaume Chazarain wrote:
>
> The attached patch fixes the corruption for me.
Well, that's a good hint, but it's really just a symptom. You effectively
just made the test-program not even try to flush the data to disk, so the
page cache would stay in memory, and you'd no
Guillaume Chazarain wrote:
I get this kind of corruption:
http://guichaz.free.fr/linux-bug/corruption.png
Actually in qemu, I get three different behaviours:
- no corruption at all: with linux-2.4
- corruption only on the first chunks: before [PATCH] mm: balance dirty
pages as identified
On Thu, 28 Dec 2006, Russell King wrote:
>
> Yup, but I have nothing to do with glibc because I refuse to do that
> silly copyright assignment FSF thing. Hopefully someone else can
> resolve it, but...
Yeah, me too.
> _is_ a fix whether _you_ like it or not to work around the issue so
> peopl
On Thu, Dec 28, 2006 at 09:27:12AM -0800, Linus Torvalds wrote:
> On Thu, 28 Dec 2006, Russell King wrote:
> > and if you look at glibc's memset() function, you'll notice that's exactly
> > what you expect if you pass a non-8bit value to it. Ergo, what you're
> > seeing is utterly expected given g
On Thu, 28 Dec 2006, Russell King wrote:
>
> and if you look at glibc's memset() function, you'll notice that's exactly
> what you expect if you pass a non-8bit value to it. Ergo, what you're
> seeing is utterly expected given glibc's memset() implementation on ARM.
Guys, you _really_ should f
On Thu, 28 Dec 2006, Zhang, Yanmin wrote:
>
> The test program is a process to write/read data. pdflush might write data
> to disk asynchronously. After pdflush writes a page to disk, it will call
> (either by softirq) clear_page_dirty to clear the dirty bit after getting
> the interrupt
> noti
On Wed, 27 Dec 2006, Chen, Kenneth W wrote:
> >
> > Running the test code, git bisect points its finger at this commit.
> > Reverting this commit on top of 2.6.20-rc2 doesn't trigger the bug from the
> > test code.
> >
> > [PATCH] mm: balance dirty pages
> >
> > Now that we can detect
On Wed, 27 Dec 2006, Gordon Farquharson wrote:
>
> 100kB and 200kB files always succeed on the ARM system. 400kB and
> larger always seem to fail.
Oh, wow. Yeah, I've just repressed how tiny 32MB is. And especially if you
lowered the /proc/sys/vm/dirty_ratio to a smaller percentage, I guess
4
On Thu Dec 28 15:09 , Guillaume Chazarain sent:
>I set a qemu environment to test kernels: http://guichaz.free.fr/linux-bug/
>I have corruption with every Fedora release kernel except the first, that is
>2.4.22 works, but 2.6.5, 2.6.9, 2.6.11, 2.6.15 and 2.6.18-1.2798 exhibit
>some corruption.
* Gordon Farquharson <[EMAIL PROTECTED]> [2006-12-28 07:15]:
> Thanks for the fix, Russell.
>
> I can now trigger the (real) problem by using a 25 MB file (100 << 18)
> and the Linksys NSLU2 (ARM, IXP420 processor, 32 MB RAM).
Me too (using 100 << 18). Interestingly, I don't seem to get any
corr
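The two TARGETSIZE values quoted in this thread work out to 400 kB and 25 MB respectively. A small sketch of the arithmetic, assuming the macro keeps the form (100 << shift) shown in the quoted messages; target_size is a hypothetical helper, not part of the test program.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper mirroring "#define TARGETSIZE (100 << shift)".
 * Returns the file size in bytes for a given shift.
 */
size_t target_size(unsigned int shift)
{
    /* Keep the result comfortably inside a 32-bit size_t. */
    assert(shift < 24);
    return (size_t)100 << shift;
}

/* Convenience conversions for reading the numbers. */
size_t bytes_to_kib(size_t bytes) { return bytes / 1024; }
size_t bytes_to_mib(size_t bytes) { return bytes / (1024 * 1024); }
```

Here target_size(12) is 409600 bytes (400 kB) and target_size(18) is 26214400 bytes (25 MB). On a machine with 32 MB of RAM the larger file cannot stay wholly in the page cache, which is presumably why that size triggers writeback.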
I set a qemu environment to test kernels: http://guichaz.free.fr/linux-bug/
I have corruption with every Fedora release kernel except the first, that is
2.4.22 works, but 2.6.5, 2.6.9, 2.6.11, 2.6.15 and 2.6.18-1.2798 exhibit
some corruption.
Command line to test:
qemu root_fs -snapshot -kerne
* Russell King <[EMAIL PROTECTED]> [2006-12-28 10:49]:
> > By the way, I just tried it with TARGETSIZE (100 << 12) on a different
> > ARM machine (a Thecus N2100 based on an IOP32x chip with 128 MB of
> > memory) and I see similar results to that from Gordon:
>
> Work around the glibc memset() pro
On 12/28/06, Russell King <[EMAIL PROTECTED]> wrote:
Fixing Linus' test program to pass nr & 255 to memset results in clean
passes on 2.6.9 on TheCus N2100 (IOP8032x) and 2.6.16.9 StrongARM
machines (as would be expected.)
Thanks for the fix, Russell.
I can now trigger the (real) problem by u
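The fix described above amounts to masking the chunk number down to a single byte before handing it to memset(). A minimal sketch, assuming the test program fills each chunk with a pattern derived from a chunk number nr; fill_chunk and chunk_is_intact are hypothetical names, not taken from Linus' program.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Fill a chunk with the low 8 bits of its sequence number.
 * The C standard says memset() converts its value argument to
 * unsigned char, so the mask should be redundant on a conforming
 * library; the thread reports that the ARM glibc memset() of the
 * time did not behave that way for values above 255.
 */
void fill_chunk(unsigned char *buf, size_t len, unsigned int nr)
{
    assert(buf != NULL);
    memset(buf, (int)(nr & 255), len);
}

/* Return 1 if every byte still holds the expected pattern, else 0. */
int chunk_is_intact(const unsigned char *buf, size_t len, unsigned int nr)
{
    const unsigned char expected = (unsigned char)(nr & 255);

    assert(buf != NULL);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != expected)
            return 0;
    }
    return 1;
}
```

Comparing against the same masked value on the read side keeps the check independent of how a particular memset() treats the upper bits, so a mismatch points at the file contents and not at the C library.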
On Wed, Dec 27, 2006 at 07:04:34PM -0800, Linus Torvalds wrote:
> [ Modified test-program that tells you where the corruption happens (and
> when the missing parts were supposed to be written out) appended, in
> case people care. ]
Hi
2.6.18 (and 2.6.18.6) is ok, 2.6.19-rc1 is broken. I tri
On Thu, Dec 28, 2006 at 11:16:59AM +0100, Martin Michlmayr wrote:
> * Gordon Farquharson <[EMAIL PROTECTED]> [2006-12-27 22:38]:
> > >> #define TARGETSIZE (100 << 12)
> > >
> > >That's just 400kB!
> > >
> > >There's no way you should see corruption with that kind of value. It
> > >should all stay s
* Gordon Farquharson <[EMAIL PROTECTED]> [2006-12-27 22:38]:
> >> #define TARGETSIZE (100 << 12)
> >
> >That's just 400kB!
> >
> >There's no way you should see corruption with that kind of value. It
> >should all stay solidly in the cache.
> >
> >Is this perhaps with ARM nommu or something else str
On Wed, Dec 27, 2006 at 10:20:20PM -0700, Gordon Farquharson wrote:
> I have run the program a few times, and the output is pretty
> consistent. However, when I increase the target size, the difference
> between the expected and actual values is larger.
>
> Written as (749)935(738)
> Chunk 1113 co
* Gordon Farquharson <[EMAIL PROTECTED]> [2006-12-27 22:38]:
> >That's just 400kB!
> >
> >There's no way you should see corruption with that kind of value. It
> >should all stay solidly in the cache.
> >
> >Is this perhaps with ARM nommu or something else strange? It may be that
> >the program just
On Wed, 2006-12-27 at 19:04 -0800, Linus Torvalds wrote:
>
> On Wed, 27 Dec 2006, David Miller wrote:
> > >
> > > I still don't see _why_, though. But maybe smarter people than me can see
> > > it..
> >
> > FWIW this program definitely triggers the bug for me.
>
> Ok, now that I have something
From: "Chen, Kenneth W" <[EMAIL PROTECTED]>
Date: Wed, 27 Dec 2006 22:10:52 -0800
> Chen, Kenneth wrote on Wednesday, December 27, 2006 9:55 PM
> > Linus Torvalds wrote on Wednesday, December 27, 2006 7:05 PM
> > > On Wed, 27 Dec 2006, David Miller wrote:
> > > > >
> > > > > I still don't see _wh
Chen, Kenneth wrote on Wednesday, December 27, 2006 9:55 PM
> Linus Torvalds wrote on Wednesday, December 27, 2006 7:05 PM
> > On Wed, 27 Dec 2006, David Miller wrote:
> > > >
> > > > I still don't see _why_, though. But maybe smarter people than me can
> > > > see it..
> > >
> > > FWIW
On 12/27/06, Linus Torvalds <[EMAIL PROTECTED]> wrote:
That's just 400kB!
There's no way you should see corruption with that kind of value. It
should all stay solidly in the cache.
100kB and 200kB files always succeed on the ARM system. 400kB and
larger always seem to fail.
Does the followin
Linus Torvalds wrote on Wednesday, December 27, 2006 7:05 PM
> On Wed, 27 Dec 2006, David Miller wrote:
> > >
> > > I still don't see _why_, though. But maybe smarter people than me can see
> > > it..
> >
> > FWIW this program definitely triggers the bug for me.
>
> Ok, now that I have somethin
Hi David
On 12/27/06, David Miller <[EMAIL PROTECTED]> wrote:
Me too, I added "-D_POSIX_C_SOURCE=200112" to "fix" this.
That works for me. Thanks for the tip.
Gordon
--
Gordon Farquharson
From: "Gordon Farquharson" <[EMAIL PROTECTED]>
Date: Wed, 27 Dec 2006 22:20:20 -0700
> and for some reason I get
>
> linus-test.c: In function 'remap':
> linus-test.c:61: error: 'POSIX_FADV_DONTNEED' undeclared (first use in this function)
>
> when I compile the program, so I replaced POSIX_FA
On 12/27/06, Linus Torvalds <[EMAIL PROTECTED]> wrote:
On Wed, 27 Dec 2006, Gordon Farquharson wrote:
>
> I don't think so. I did reduce the target size
>
> #define TARGETSIZE (100 << 12)
That's just 400kB!
There's no way you should see corruption with that kind of value. It
should all stay so
[Oops - forgot to hit "Reply to All" first time round.]
Hi Linus
On 12/27/06, Linus Torvalds <[EMAIL PROTECTED]> wrote:
For all I know, my test-program is buggy wrt the ordering printouts,
though. Did you perhaps change the logic in any way?
I don't think so. I did reduce the target size
#d
On Wed, 27 Dec 2006, Gordon Farquharson wrote:
>
> Is it at all surprising that the second offset within a page can be
> less than the first offset within a page? e.g.
>
> Chunk 260 corrupted (1-1455) (2769-127)
No, that just means that it went over to the next page (so you actually
had two
On 12/27/06, Linus Torvalds <[EMAIL PROTECTED]> wrote:
[ Modified test-program that tells you where the corruption happens (and
when the missing parts were supposed to be written out) appended, in
case people care. ]
For the record, this is the output from a run on our ARM machine (32
MB R
On Wed, 27 Dec 2006, David Miller wrote:
> >
> > I still don't see _why_, though. But maybe smarter people than me can see
> > it..
>
> FWIW this program definitely triggers the bug for me.
Ok, now that I have something simple to do repeatable stuff with, I can
say what the pattern is.. It's
From: Linus Torvalds <[EMAIL PROTECTED]>
Date: Wed, 27 Dec 2006 16:39:43 -0800 (PST)
>
>
> On Wed, 27 Dec 2006, Linus Torvalds wrote:
> >
> > I think the test-case could probably be improved by having a munmap() and
> > page-cache flush in between the writing and the checking, to see whether
From: Linus Torvalds <[EMAIL PROTECTED]>
Date: Wed, 27 Dec 2006 16:42:40 -0800 (PST)
> That's fine. In that situation, you shouldn't need any atomic ops at all,
> I think all our sw page-table operations are already done under the pte
> lock.
This is true, but there is one subtlety to this I w
On Thu, 28 Dec 2006, Martin Schwidefsky wrote:
>
> For s390 there are two aspects to consider:
> 1) the pte values are 100% software controlled.
That's fine. In that situation, you shouldn't need any atomic ops at all,
I think all our sw page-table operations are already done under the pte
lo
On Wed, 27 Dec 2006, Linus Torvalds wrote:
>
> I think the test-case could probably be improved by having a munmap() and
> page-cache flush in between the writing and the checking, to see whether
> that shows the corruption easier (and possibly without having to start
> paging in order to thr
On Tue, 26 Dec 2006, David Miller wrote:
>
> I've seen it on sparc64, UP kernel, no preempt.
Ok, I still don't have a clue, but I think I at least have a new
test-case.
It can probably be improved upon, but this would _seem_ to trigger the
problem. Can people check?
You'd want to make sure y
On Thu, 2006-12-21 at 12:01 -0800, Linus Torvalds wrote:
> What do you guys think? Does something like this work out for S/390 too? I
> tried to make that "ptep_flush_dirty()" concept work for architectures
> that hide the dirty bit somewhere else too, but..
For s390 there are two aspects to consi
On 12/27/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
I do get this error on reiserfs (old one, didn't try on reiser4).
Stock 2.6.19 plus reiser4 patch. Previously reported by me only in the
debian bts.
I've had reports of corrupted data on earlier kernel releases with
reiserfs3, which we
On Tue, Dec 26, 2006 at 11:26:50AM -0800, Linus Torvalds wrote:
> What would also actually be interesting is whether somebody can reproduce
> this on Reiserfs, for example. I _think_ all the reports I've seen are on
> ext2 or ext3, and if this is somehow writeback-related, it could be some
> bug
On 12/27/06, Linus Torvalds <[EMAIL PROTECTED]> wrote:
- It never uses mprotect on the shared mappings, but it _does_ do:
"mincore()" - but the return values don't much matter (it's used
as a heuristic on which parts to hash, apparently)
I do
I have corrupted files...
> ---
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 263f88e..4652ef1 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -1653,19 +1653,7 @@ static int __block_write_full_page(struct inode *inode, struct page *page,
> do {
> if (!buffer_mapped(bh
On Tue, 26 Dec 2006, David Miller wrote:
>
> I've seen it on sparc64, UP kernel, no preempt.
Btw, having tried to debug the writeback code, there's one very special
case that just makes me go "hmm".
If we have a buffer that is "busy" when we try to write back a page, we
have this magic "wbc-
From: Tobias Diedrich <[EMAIL PROTECTED]>
Date: Tue, 26 Dec 2006 17:17:00 +0100
> Linus Torvalds wrote:
> > I don't think it's a page table issue any more, it just doesn't look
> > likely with the ARM UP corruption. It's also not apparently even on a
> > cacheline boundary, so it probably is rea
On Tue, 26 Dec 2006, Nick Piggin wrote:
> Linus Torvalds wrote:
> >
> > Ok, so how about this diff.
> >
> > I'm actually feeling good about this one. It really looks like
> > "do_no_page()" was simply buggy, and that this explains everything.
>
> Still trying to catch up here, so I'm not goin
On Tue, Dec 26, 2006 at 05:51:55PM +, Al Viro wrote:
> On Sun, Dec 24, 2006 at 12:24:46PM -0800, Linus Torvalds wrote:
> >
> >
> > On Sun, 24 Dec 2006, Andrei Popa wrote:
> > >
> > > Hash check on download completion found bad chunks, consider using
> > > "safe_sync".
> >
> > Dang. Did you
On Sun, Dec 24, 2006 at 12:24:46PM -0800, Linus Torvalds wrote:
>
>
> On Sun, 24 Dec 2006, Andrei Popa wrote:
> >
> > Hash check on download completion found bad chunks, consider using
> > "safe_sync".
>
> Dang. Did you get any warning messages from the kernel?
>
> Linus
BTW, rm
Linus Torvalds wrote:
On Sun, 24 Dec 2006, Linus Torvalds wrote:
Peter, tell me I'm crazy, but with the new rules, the following condition
is a bug:
- shared mapping
- writable
- not already marked dirty in the PTE
Ok, so how about this diff.
I'm actually feeling good about this one. It
* Linus Torvalds <[EMAIL PROTECTED]> [2006-12-24 11:35]:
> And if this doesn't fix it, I don't know what will..
Sorry, but it still fails (on top of plain 2.6.19).
--
Martin Michlmayr
http://www.cyrius.com/
> Quoting Linus Torvalds <[EMAIL PROTECTED]>:
> Subject: Re: [PATCH] mm: fix page_mkclean_one (was: 2.6.19 file content
> corruption on ext3)
>
> Peter, tell me I'm crazy, but with the new rules, the following condition
> is a bug:
>
> - shared mapping
>
On 12/24/06, Linus Torvalds <[EMAIL PROTECTED]> wrote:
Ok, so how about this diff.
I'm actually feeling good about this one. It really looks like
"do_no_page()" was simply buggy, and that this explains everything.
I tested with just this patch and 2.6.19 and no change. Sorry Linus,
no early C
On Sun, 2006-12-24 at 12:24 -0800, Linus Torvalds wrote:
>
> On Sun, 24 Dec 2006, Andrei Popa wrote:
> >
> > Hash check on download completion found bad chunks, consider using
> > "safe_sync".
>
> Dang. Did you get any warning messages from the kernel?
>
only these:
ACPI: EC: evaluating _Q80
A
On Sun, 24 Dec 2006, Andrei Popa wrote:
>
> Hash check on download completion found bad chunks, consider using
> "safe_sync".
Dang. Did you get any warning messages from the kernel?
Linus
On Sun, 2006-12-24 at 11:35 -0800, Linus Torvalds wrote:
>
> On Sun, 24 Dec 2006, Gordon Farquharson wrote:
> >
> > The apt cache files (/var/cache/apt/*.bin) still get corrupted with
> > this patch and 2.6.19.
>
> Yeah, if my guess about do_no_page() is right, _none_ of the previous
> patches
On Sun, 24 Dec 2006, Gordon Farquharson wrote:
>
> The apt cache files (/var/cache/apt/*.bin) still get corrupted with
> this patch and 2.6.19.
Yeah, if my guess about do_no_page() is right, _none_ of the previous
patches should have ANY effect what-so-ever. In fact, I'd say that even
the "ex
On 12/24/06, Linus Torvalds <[EMAIL PROTECTED]> wrote:
How about this particularly stupid diff? (please test with something that
_would_ cause corruption normally).
It is _entirely_ untested, but what it tries to do is to simply serialize
any writeback in progress with any process that tries to
On Sun, 24 Dec 2006, Linus Torvalds wrote:
>
> Peter, tell me I'm crazy, but with the new rules, the following condition
> is a bug:
>
> - shared mapping
> - writable
> - not already marked dirty in the PTE
Ok, so how about this diff.
I'm actually feeling good about this one. It really lo
On Sun, 24 Dec 2006, Linus Torvalds wrote:
>
> How about this particularly stupid diff? (please test with something that
> _would_ cause corruption normally).
Actually, here's an even more stupid diff, which actually to some degree
seems to capture the real problem better.
Peter, tell me I'm
On Sun, 24 Dec 2006 09:16:06 -0800 (PST)
Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>
> On Sun, 24 Dec 2006, Andrei Popa wrote:
>
> > On Sun, 2006-12-24 at 04:31 -0800, Andrew Morton wrote:
> > > Andrei Popa <[EMAIL PROTECTED]> wrote:
> > > > /dev/sda7 on / type ext3 (rw,noatime,nobh)
> > > >
On Sun, 24 Dec 2006, Andrei Popa wrote:
> On Sun, 2006-12-24 at 04:31 -0800, Andrew Morton wrote:
> > Andrei Popa <[EMAIL PROTECTED]> wrote:
> > > /dev/sda7 on / type ext3 (rw,noatime,nobh)
> > >
> > > I don't have corruption. I tested twice.
> >
> > This is a surprising result. Can you pleas
On Sun, 2006-12-24 at 04:31 -0800, Andrew Morton wrote:
> On Sun, 24 Dec 2006 14:14:38 +0200
> Andrei Popa <[EMAIL PROTECTED]> wrote:
>
> > > - mount the fs with ext2 with the no-buffer-head option. That means
> > > either:
> > >
> > > grub.conf: rootfstype=ext2 rootflags=nobh
> > > /etc/f
* Andrew Morton <[EMAIL PROTECTED]> [2006-12-24 00:57]:
> /etc/fstab: ext2 nobh
> /etc/fstab: ext3 data=writeback,nobh
It seems that busybox mount ignores the nobh option but both ext2 and
ext3 data=writeback work for me. This is with plain 2.6.19 which
normally always fails.
--
Martin Michl
On Sun, 24 Dec 2006 14:14:38 +0200
Andrei Popa <[EMAIL PROTECTED]> wrote:
> > - mount the fs with ext2 with the no-buffer-head option. That means either:
> >
> > grub.conf: rootfstype=ext2 rootflags=nobh
> > /etc/fstab: ext2 nobh
>
> ierdnac ~ # mount
> /dev/sda7 on / type ext2 (rw,noatime
On Sun, 24 Dec 2006 14:26:01 +0200
Andrei Popa <[EMAIL PROTECTED]> wrote:
> I also tested with ext3 ordered, nobh and I have file corruption...
ordered+nobh isn't a possible combination. The filesystem probably ignored
nobh. nobh mode only makes sense with data=writeback.
On Sun, 2006-12-24 at 14:14 +0200, Andrei Popa wrote:
> On Sun, 2006-12-24 at 00:57 -0800, Andrew Morton wrote:
> > On Sun, 24 Dec 2006 00:43:54 -0800 (PST)
> > Linus Torvalds <[EMAIL PROTECTED]> wrote:
> >
> > > I now _suspect_ that we're talking about something like
> > >
> > > - we started a
On Sun, 2006-12-24 at 00:57 -0800, Andrew Morton wrote:
> On Sun, 24 Dec 2006 00:43:54 -0800 (PST)
> Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
> > I now _suspect_ that we're talking about something like
> >
> > - we started a writeout. The IO is still pending, and the page was
> >marked