Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-30 Thread Dave Chinner
On Mon, Oct 29, 2012 at 06:11:58PM +, Luck, Tony wrote: > > What I would recommend is adding a > > > > #define FS_CORRUPTED_FL 0x0100 /* File is corrupted */ > > > > ... and which could be accessed and cleared via the lsattr and chattr > > programs. > > Good - but we need some

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Jun'ichi Nomura
On 10/30/12 04:07, Andi Kleen wrote: > Theodore Ts'o writes: >> Note that the problem that we're dealing with is buffered writes; so >> it's quite possible that the process which wrote the file, thus >> dirtying the page cache, has already exited; so there's no way we can >> guarantee we can infor

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Naoya Horiguchi
On Mon, Oct 29, 2012 at 12:07:04PM -0700, Andi Kleen wrote: > Theodore Ts'o writes: ... > > Also, if you're going to keep this state in memory, what happens if > > the inode gets pushed out of memory? > > You lose the error, just like you do today with any other IO error. > > We had a lot of di

RE: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Luck, Tony
> What I would recommend is adding a > > #define FS_CORRUPTED_FL 0x0100 /* File is corrupted */ > > ... and which could be accessed and cleared via the lsattr and chattr > programs. Good - but we need some space to save the corrupted range information too. These errors should be

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Andi Kleen
Theodore Ts'o writes: > > It's actually pretty easy to test this particular one, Note the error can happen at any time. > and certainly > one of the things I'd strongly encourage in this patch series is the > introduction of an interface via madvise It already exists of course. I would sugges

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Jan Kara
On Mon 29-10-12 14:24:56, Ted Tso wrote: > On Mon, Oct 29, 2012 at 03:37:05AM -0700, Andi Kleen wrote: > > > Agreed, if we're going to add an xattr, then we might as well store > > > > I don't think an xattr makes sense for this. It's sufficient to keep > > this state in memory. > > > > In genera

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Theodore Ts'o
On Mon, Oct 29, 2012 at 03:37:05AM -0700, Andi Kleen wrote: > > Agreed, if we're going to add an xattr, then we might as well store > > I don't think an xattr makes sense for this. It's sufficient to keep > this state in memory. > > In general these error paths are hard to test and it's important

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Jun'ichi Nomura
On 10/29/12 19:37, Andi Kleen wrote: > Theodore Ts'o writes: >> On Mon, Oct 29, 2012 at 12:16:32PM +1100, Dave Chinner wrote: >>> Except that there are filesystems that cannot implement such flags, >>> or require on-disk format changes to add more of those flags. This >>> is most definitely not a

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-29 Thread Andi Kleen
Theodore Ts'o writes: > On Mon, Oct 29, 2012 at 12:16:32PM +1100, Dave Chinner wrote: >> >> Except that there are filesystems that cannot implement such flags, >> or require on-disk format changes to add more of those flags. This >> is most definitely not a filesystem specific behaviour, so any

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-28 Thread Theodore Ts'o
On Mon, Oct 29, 2012 at 12:16:32PM +1100, Dave Chinner wrote: > > Except that there are filesystems that cannot implement such flags, > or require on-disk format changes to add more of those flags. This > is most definitely not a filesystem specific behaviour, so any sort > of VFS level per-file s

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-28 Thread Dave Chinner
On Sat, Oct 27, 2012 at 06:16:26PM -0400, Theodore Ts'o wrote: > On Fri, Oct 26, 2012 at 10:24:23PM +, Luck, Tony wrote: > > > Well, we could set a new attribute bit on the file which indicates > > > that the file has been corrupted, and this could cause any attempts to > > > open the file to r

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-27 Thread Naoya Horiguchi
Hi Ted, On Sat, Oct 27, 2012 at 06:16:26PM -0400, Theodore Ts'o wrote: > On Fri, Oct 26, 2012 at 10:24:23PM +, Luck, Tony wrote: > > > Well, we could set a new attribute bit on the file which indicates > > > that the file has been corrupted, and this could cause any attempts to > > > open the

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-27 Thread Theodore Ts'o
On Fri, Oct 26, 2012 at 10:24:23PM +, Luck, Tony wrote: > > Well, we could set a new attribute bit on the file which indicates > > that the file has been corrupted, and this could cause any attempts to > > open the file to return some error until the bit has been cleared. > > That sounds a lot

RE: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-26 Thread Luck, Tony
> Well, we could set a new attribute bit on the file which indicates > that the file has been corrupted, and this could cause any attempts to > open the file to return some error until the bit has been cleared. That sounds a lot better than renaming/moving the file. > This would persist across re

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-26 Thread Naoya Horiguchi
On Fri, Oct 26, 2012 at 02:12:06AM -0400, Theodore Ts'o wrote: > On Thu, Oct 25, 2012 at 11:12:48AM -0400, Naoya Horiguchi wrote: > > + /* Lost data. Handle as critical fs error. */ > > + bh = head = page_buffers(page); > > + do { > > + if (buffer_dirty(bh) && !buffer_delay(bh)) { >

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-26 Thread Theodore Ts'o
On Fri, Oct 26, 2012 at 04:55:01PM +, Luck, Tony wrote: > > I think that we know that the file *is* corrupted, not just "potentially". > We probably know the location of the corruption to cache-line granularity. > Perhaps better on systems where we have access to ecc syndrome bits, > perhaps w

RE: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-26 Thread Luck, Tony
> If we go back to first principles, what do we want to do? We want the > system administrator to know that a file might be potentially > corrupted. And perhaps, if a program tries to read from that file, it > should get an error. If we have a program that has that file mmap'ed > at the time of

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-25 Thread Theodore Ts'o
On Thu, Oct 25, 2012 at 11:12:48AM -0400, Naoya Horiguchi wrote: > + /* Lost data. Handle as critical fs error. */ > + bh = head = page_buffers(page); > + do { > + if (buffer_dirty(bh) && !buffer_delay(bh)) { > + block = bh->b_blocknr; > +

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-25 Thread Jan Kara
On Thu 25-10-12 11:12:48, Naoya Horiguchi wrote: > Ext4 has its own configurable error handling policy, so it's helpful > if we can use it also in the context of memory error handling. > With this patch, when we detect a memory error on a dirty pagecache in > ext4 filesystem, we can allow users to