Hi Jeff,

On Fri, May 05, 2017 at 04:11:18PM -0400, Jeff Layton wrote:
> On Fri, 2017-05-05 at 12:21 -0700, Liu Bo wrote:
> > Hi Jeff,
> > 
> > On Thu, May 04, 2017 at 07:26:17AM -0400, Jeff Layton wrote:
> > > I've been working on set of patches to clean up how writeback errors are
> > > tracked and handled in the kernel:
> > > 
> > > http://marc.info/?l=linux-fsdevel&m=149304074111261&w=2
> > > 
> > > The basic idea is that rather than having a set of flags that are
> > > cleared whenever they are checked, we have a sequence counter and error
> > > that are tracked on a per-mapping basis, and can then use that sequence
> > > counter to tell whether the error should be reported.
> > > 
> > > This changes the way that things like filemap_write_and_wait work.
> > > Rather than having to ensure that AS_EIO/AS_ENOSPC are not cleared
> > > inappropriately (and thus losing errors that should be reported), you
> > > can now tell whether there has been a writeback error since a certain
> > > point in time, irrespective of whether anyone else is checking for
> > > errors.
> > > 
> > > I've been doing some conversions of the existing code to the new scheme,
> > > but btrfs has _really_ complicated error handling. I think it could
> > > probably be simplified with this new scheme, but I could use some help
> > > here.
> > > 
> > > What I think we probably want to do is to sample the error sequence in
> > > the mapping at well-defined points in time (probably when starting a
> > > transaction?) and then use that to determine whether writeback errors
> > > have occurred since then. Is there anyone in the btrfs community who
> > > could help me here?
> > > 
> > 
> > I went through the patch set and reviewed the btrfs part particular in
> > [PATCH v3 14/20] fs: retrofit old error reporting API onto new 
> > infrastructure
> > 
> > It looks good to me.
> > 
> > In btrfs ->writepage(), it sets PG_error whenever an error
> > (-EIO/-ENOSPC/-ENOMEM) occurs and it sets mapping's error as well in
> > end_extent_writepage().  And the special case is the compression code, 
> > where it
> > only sets mapping's error when there is any error during processing 
> > compression
> > bytes.
> > 
> > Similar to ext4, btrfs tracks the IO error by setting mapping's error in
> > writepage_endio and other places (eg. compression code), and around 
> > tree-log.c
> > it's checking BTRFS_ORDERED_IOERR from ordered_extent->flags, which is 
> > usually
> > set in writepage_endio and sometimes in some error handling code where it
> > couldn't call endio.
> > 
> > So the conversion in btrfs's fsync() seems to be good enough, did I miss
> > anything?
> > 
> 
> Many thanks for taking a look:
> 
> There are a number of calls in btrfs to filemap_fdatawait_range that
> check the return code. That function will wait for writeback on all of
> the pages in the mapping range and return an error if there has been
> one. Note too that there are also some places that ignore the return
> code.
> 
> These patches change how filemap_fdatawait_range (and some similar
> functions) work. Before this set, you'd get an error if one had occurred
> since anyone last checked it. Now, you only get an error there if one
> occurred since you started waiting. If the failed writeback occurred
> before that function was called, you won't get an error back.
> 

Since all filemap_fdatawait_range() called in btrfs checked the return value, it
is supposed to catch any errors that are occured from filemap_fdatawrite_range()
which is called twice by btrfs_fdatawrite_range()[1], so with this set, it's
possible to fail to detect errors if only calling filemap_fdatawait_range().

[1]: filemap_fdatawrite_range() needs to be called twice to make sure compressed
data is flushed.

> For fsync, it shouldn't matter. You'll get an error back there if one
> occurred since the last fsync since you're setting it in the mapping.
> The bigger question is whether other callers in this code do anything
> with that error return.
> 
> If they do, then the next question is: from what point do you want to
> detect errors that have occurred?
> 
> What sort of makes sense to me (in a handwavy way) would be to sample
> the errseq_t in the mapping when you start a transaction, and then check
> vs. that for errors. Then, even if you have parallel transactions going
> on the same inode (is that even possible?) then you can tell whether
> they all succeded or not.
> 
> Thoughts?

That may not work, a transaction here is kind of unrelated to mapping and
writeback of a mapping can be processed accross two transactions.

Is it possible to provide another version of filemap_fdatawait_range() to take
@since as an argument?  If so, we can sample wb_err before calling
filemap_fdatawrite_range() (or btrfs_fdatawrite_range()) and collect any wb_err
after filemap_fdatawait_range().

Thanks,

-liubo
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to