Re: [dm-devel] [RFC PATCH] block: xfs: dm thin: train XFS to give up on retrying IO if thinp is out of space

2015-07-23 Thread Dave Chinner
On Thu, Jul 23, 2015 at 01:08:36PM -0400, Mikulas Patocka wrote:
> On Wed, 22 Jul 2015, Dave Chinner wrote:
> > On Wed, Jul 22, 2015 at 10:09:23AM +1000, Dave Chinner wrote:
> > > On Tue, Jul 21, 2015 at 01:47:53PM -0400, Mike Snitzer wrote:
> > > | $ cat /sys/fs/xfs/vda/meta_write_errors/enospc/transient_fail_at_umount
> > > | 1

[...]

> You can just stop retrying the I/Os when the user attempts to unmount the 
> filesystem - then, you don't need any configuration option.

See above - the default will do that, but there are users who do not
want that unmount behaviour.

-Dave.
-- 
Dave Chinner
da...@fromorbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
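
A minimal sketch of how the proposed sysfs knobs in this thread might combine
into a retry decision. The semantics are assumptions the thread does not spell
out: a "permanent" error class fails at once, a "transient" one is retried
until the timeout or the attempt limit is reached, and
transient_fail_at_umount set to 1 stops retries once unmount begins.
`EnospcPolicy` and `should_retry` are hypothetical names for illustration, not
XFS code.

```python
from dataclasses import dataclass


@dataclass
class EnospcPolicy:
    # Field names mirror the proposed sysfs files; `kind` stands in for
    # the `type` file. The semantics below are assumed, not from XFS.
    kind: str = "transient"              # "transient" or "permanent"
    perm_timeout_seconds: int = 300
    perm_max_retry_attempts: int = 50
    transient_fail_at_umount: bool = True


def should_retry(policy, attempts, elapsed_seconds, unmounting):
    """Return True if a failed metadata write should be resubmitted."""
    if policy.kind == "permanent":
        return False                     # fail immediately
    if unmounting and policy.transient_fail_at_umount:
        return False                     # give up once unmount starts
    if elapsed_seconds >= policy.perm_timeout_seconds:
        return False                     # retried for too long
    if attempts >= policy.perm_max_retry_attempts:
        return False                     # retried too many times
    return True
```

With the defaults shown, an unmount stops further retries, which matches the
default described in the reply above; setting transient_fail_at_umount to
False models the users who want retries to continue across unmount.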


Re: [dm-devel] [RFC PATCH] block: xfs: dm thin: train XFS to give up on retrying IO if thinp is out of space

2015-07-23 Thread Mikulas Patocka


On Wed, 22 Jul 2015, Dave Chinner wrote:

> On Wed, Jul 22, 2015 at 10:09:23AM +1000, Dave Chinner wrote:
> > On Tue, Jul 21, 2015 at 01:47:53PM -0400, Mike Snitzer wrote:
> > > > On Tue, Jul 21 2015 at 11:34am -0400, Eric Sandeen <sand...@redhat.com>
> > > > wrote:
> > > > On 7/20/15 5:36 PM, Dave Chinner wrote:
> > > > The issue we had discussed previously is that there is no agreement
> > > > across block devices about whether ENOSPC is a permanent or temporary
> > > > > condition.  Asking the admin to tune the fs to each block device's
> > > > behavior sucks, IMHO.
> > > 
> > > It does suck, but it beats the alternative of XFS continuing to do
> > > nothing about the problem.
> > 
> > Just a comment on that: doing nothing is better than doing the wrong
> > thing and being stuck with it forever. :)
> > 
> > > > Discussing more with Vivek, might be that XFS would be best served to
> > > > model what dm-thinp has provided with its 'no_space_timeout'.  It
> > > > defaults to queueing IO for 60 seconds; once the timeout expires the
> > > > queued IOs get errored.  If set to 0 dm-thinp will queue IO
> > > > indefinitely.
> > 
> > Yes, that's exactly what I proposed in the thread I referenced in
> > my previous email, and what got stuck on the bikeshed wall because
> > of these concerns about knob twiddling:
> > 
> > http://oss.sgi.com/archives/xfs/2015-02/msg00346.html
> > 
> > | e.g. if we need configurable error handling, it needs to be
> > | configurable for different error types, and it needs to be
> > | configurable on a per-mount basis. And it needs to be configurable
> > | at runtime, not just at mount time. That kind of leads to using
> > > | sysfs for this. e.g. for each error type we need to handle different
> > | behaviour for:
> > | 
> > | $ cat /sys/fs/xfs/vda/meta_write_errors/enospc/type
> > | [transient] permanent
> > | $ cat /sys/fs/xfs/vda/meta_write_errors/enospc/perm_timeout_seconds
> > | 300
> > > | $ cat /sys/fs/xfs/vda/meta_write_errors/enospc/perm_max_retry_attempts
> > > | 50
> > > | $ cat /sys/fs/xfs/vda/meta_write_errors/enospc/transient_fail_at_umount
> > > | 1
> > 
> > I've rebased this patchset, and I'm cleaning it up now, so in a few
> > days I'll have something for review, likely for the 4.3 merge
> > > window.
> 
> Just thinking a bit more on how to make this simpler to configure,
> is there a simple way for the filesystem to determine the current
> config of the dm thinp volume?

You can just stop retrying the I/Os when the user attempts to unmount the 
filesystem - then, you don't need any configuration option.

Mikulas

> i.e. if the dm-thinp volume is
> configured to error out immediately on enospc, then XFS should
> default to doing the same thing. Having XFS be able to grab this
> status at mount time and change the default ENOSPC error config from
> transient to permanent on such dm-thinp volumes would go a long way
> to making these configs Just Do The Right Thing on block dev enospc
> errors...
> 
> e.g. if dm-thinp is configured to queue for 60s and then fail on
> ENOSPC, we want XFS to fail immediately on ENOSPC in metadata IO. If
> dm-thinp is configured to ENOSPC instantly (i.e. no queueing) then
> we want XFS to retry and use its default retry maximums before
> failing permanently.
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> da...@fromorbit.com
> 
> --
> dm-devel mailing list
> dm-de...@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel
> 
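
The 'no_space_timeout' behaviour quoted above can be sketched as follows. This
is a toy model of the stated rule only (queue for the timeout, then error; 0
means queue indefinitely); `queued_io_outcome` and its parameters are
hypothetical names for illustration and not dm-thinp code.

```python
def queued_io_outcome(waited_seconds, space_available, no_space_timeout=60):
    """Decide what happens to an IO that hit an out-of-space pool.

    Returns "submit" once space is available again, "queue" while the IO
    is still being held, and "error" once the timeout has expired.
    """
    if space_available:
        return "submit"
    if no_space_timeout == 0:
        return "queue"          # 0 disables the timeout: hold indefinitely
    if waited_seconds >= no_space_timeout:
        return "error"          # timeout expired: fail the queued IO
    return "queue"
```

For example, with the default of 60 seconds an IO that has waited 10 seconds
stays queued, while one that has waited 60 seconds is errored; with the
timeout set to 0 it stays queued no matter how long it has waited.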

