Re: XFS and write barrier

2006-07-19 Thread Neil Brown
On Wednesday July 19, [EMAIL PROTECTED] wrote:
 On Tue, Jul 18, 2006 at 06:58:56PM +1000, Neil Brown wrote:
  On Tuesday July 18, [EMAIL PROTECTED] wrote:
   On Mon, Jul 17, 2006 at 01:32:38AM +0800, Federico Sevilla III wrote:
    On Sat, Jul 15, 2006 at 12:48:56PM +0200, Martin Steigerwald wrote:
     I am currently gathering information to write an article about journal
     filesystems with emphasis on write barrier functionality, how it
     works, why journalling filesystems need write barrier and the current
     implementation of write barrier support for different filesystems.
  
  "Journalling filesystems need write barrier" isn't really accurate.
  They can make good use of write barrier if it is supported, and where
  it isn't supported, they should use blkdev_issue_flush in combination
  with regular submit/wait.
 
 blkdev_issue_flush() causes a write cache flush - just like a
 barrier typically causes a write cache flush up to the I/O with the
 barrier in it.  Both of these mechanisms provide the same thing - an
 I/O barrier that enforces ordering of I/Os to disk.
 
 Given that filesystems already indicate to the block layer when they
 want a barrier, wouldn't it be better to get the block layer to issue
 this cache flush if the underlying device doesn't support barriers
 and it receives a barrier request?

A barrier means a lot more than just a flush.
It means
  wait for all preceding requests to commit
  flush
  write this request
  flush

Any block device that uses the io scheduler could probably manage
this.  Other block devices might not find it so easy.
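
As a rough illustration of the four steps listed above, here is a minimal
sketch in Python.  ToyDisk and all of its methods are hypothetical names
made up for this example; they model a disk with a volatile write cache
and are not a kernel interface.

```python
class ToyDisk:
    """Hypothetical disk model with a volatile write cache (not a real API)."""

    def __init__(self):
        self.in_flight = []  # submitted, not yet acknowledged
        self.cache = []      # acknowledged, still in the volatile cache
        self.media = []      # on stable storage, in commit order

    def submit(self, block):
        self.in_flight.append(block)

    def wait_all(self):
        # Acknowledge every outstanding request into the write cache.
        self.cache.extend(self.in_flight)
        self.in_flight.clear()

    def flush(self):
        # Move everything in the write cache to stable media.
        self.media.extend(self.cache)
        self.cache.clear()


def barrier_write(disk, block):
    """Emulate a barrier write with plain submit/wait plus cache flushes."""
    disk.wait_all()     # wait for all preceding requests to commit
    disk.flush()        # flush
    disk.submit(block)  # write this request
    disk.wait_all()
    disk.flush()        # flush


disk = ToyDisk()
disk.submit("data-1")
disk.submit("data-2")
barrier_write(disk, "commit-record")
disk.submit("data-3")  # submitted after the barrier, still in flight
```

In this toy model the commit record can only reach the media after
data-1 and data-2, and data-3 cannot overtake it.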

 
 Any particular reason for not supporting barriers on the other types
 of RAID?
 

Imagine trying to implement barriers for raid0 (or any level with
striping).  You would need to
  block new requests
  wait for all requests to all devices to complete
  issue a flush to all devices
  issue the barrier request to the target device
  issue a flush to the target device
  permit new requests.

This means raid0 would need to keep track of all pending requests,
which it doesn't do.  As the filesystem already does, it is just as
efficient to let the filesystem do the work.
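
The six steps above can be sketched as a toy model in Python.  ToyMember
and ToyRaid0 are hypothetical names invented for this illustration, the
striping rule (chunk n lives on member n modulo the member count) is a
simplifying assumption, and none of this reflects the actual md code.

```python
class ToyMember:
    """Hypothetical member disk with a volatile write cache (not a kernel API)."""

    def __init__(self):
        self.pending = []  # submitted, not yet completed
        self.cache = []    # completed, still in the volatile write cache
        self.media = []    # on stable storage

    def submit(self, block):
        self.pending.append(block)

    def wait_all(self):
        self.cache.extend(self.pending)
        self.pending.clear()

    def flush(self):
        self.media.extend(self.cache)
        self.cache.clear()


class ToyRaid0:
    def __init__(self, members):
        self.members = members
        self.blocked = False

    def target(self, chunk):
        # Assumed striping rule: chunk n lives on member n % len(members).
        return self.members[chunk % len(self.members)]

    def write(self, chunk, block):
        if self.blocked:
            raise RuntimeError("new requests are blocked during a barrier")
        self.target(chunk).submit(block)

    def barrier_write(self, chunk, block):
        self.blocked = True     # block new requests
        for m in self.members:  # wait for all requests to all devices
            m.wait_all()
        for m in self.members:  # issue a flush to all devices
            m.flush()
        t = self.target(chunk)
        t.submit(block)         # issue the barrier request to the target
        t.wait_all()
        t.flush()               # issue a flush to the target device
        self.blocked = False    # permit new requests


array = ToyRaid0([ToyMember(), ToyMember()])
array.write(0, "a")
array.write(1, "b")
array.write(2, "c")
array.barrier_write(3, "commit")
```

Note that the wait step touches every member, which is exactly the
per-request bookkeeping described above.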

I guess raid0 could
  - block new requests
  - submit a no-op barrier to all devices
  - wait for the no-op to complete
  - submit the write/barrier request
  - permit new requests.

This would avoid needing to keep track of all requests.  However I
don't think the Linux block layer supports a no-op barrier, and I don't
think this would actually be better than not supporting barriers.
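
For comparison, the no-op barrier variant might look like the sketch
below.  It assumes a device that orders requests around barriers by
itself and a no-op barrier request, which (as noted above) the block
layer probably does not provide; ToyBarrierDisk, NOOP and
raid0_barrier_write are hypothetical names for this illustration only.

```python
NOOP = object()  # hypothetical empty barrier request


class ToyBarrierDisk:
    """Hypothetical disk that understands barriers itself (not a kernel API)."""

    def __init__(self):
        self.queue = []  # (block, is_barrier) in submission order
        self.cache = []  # volatile write cache
        self.media = []  # stable storage

    def submit(self, block, barrier=False):
        self.queue.append((block, barrier))

    def run_queue(self):
        for block, is_barrier in self.queue:
            if is_barrier:
                # Everything queued earlier reaches media before the barrier.
                self.media.extend(self.cache)
                self.cache.clear()
                if block is not NOOP:
                    self.media.append(block)
            else:
                self.cache.append(block)
        self.queue.clear()


def raid0_barrier_write(members, target, block):
    # "Block new requests" is implicit here: the toy is single-threaded.
    for m in members:
        m.submit(NOOP, barrier=True)  # submit a no-op barrier to all devices
    for m in members:
        m.run_queue()                 # wait for the no-op to complete
    members[target].submit(block, barrier=True)  # submit the write/barrier request
    members[target].run_queue()


disks = [ToyBarrierDisk(), ToyBarrierDisk()]
disks[0].submit("a")
disks[1].submit("b")
raid0_barrier_write(disks, 1, "commit")
```

The raid0 layer no longer tracks individual requests, but it still has
to stall and wait for every member before the real write can go out.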

The real value of barriers (as far as I can see) is that the target
device can understand them so you don't need to stall the queue of
requests flying over the bus to the device.  If you need to stall the
flow of requests and wait at the OS level, then the value of barriers
disappears and you may as well wait in the filesystem code.

At least, that is my understanding.  I am happy to be educated.

NeilBrown
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: XFS and write barrier

2006-07-18 Thread Nathan Scott
On Mon, Jul 17, 2006 at 01:32:38AM +0800, Federico Sevilla III wrote:
 On Sat, Jul 15, 2006 at 12:48:56PM +0200, Martin Steigerwald wrote:
  I am currently gathering information to write an article about journal
  filesystems with emphasis on write barrier functionality, how it
  works, why journalling filesystems need write barrier and the current
  implementation of write barrier support for different filesystems.
 
 Cool! Would you by any chance have information on the interaction
 between journal filesystems with write barrier functionality, and
 software RAID (md)? Based on my experience with 2.6.17, XFS detects that
 the underlying software RAID 1 device does not support barriers and
 therefore disables that functionality.

No one here seems to know; maybe Neil or the other folks on linux-raid
can help us out with details on status of MD and write barriers?

cheers.

-- 
Nathan


Re: XFS and write barrier

2006-07-18 Thread David Chinner
On Tue, Jul 18, 2006 at 06:58:56PM +1000, Neil Brown wrote:
 On Tuesday July 18, [EMAIL PROTECTED] wrote:
  On Mon, Jul 17, 2006 at 01:32:38AM +0800, Federico Sevilla III wrote:
   On Sat, Jul 15, 2006 at 12:48:56PM +0200, Martin Steigerwald wrote:
    I am currently gathering information to write an article about journal
    filesystems with emphasis on write barrier functionality, how it
    works, why journalling filesystems need write barrier and the current
    implementation of write barrier support for different filesystems.
 
 "Journalling filesystems need write barrier" isn't really accurate.
 They can make good use of write barrier if it is supported, and where
 it isn't supported, they should use blkdev_issue_flush in combination
 with regular submit/wait.

blkdev_issue_flush() causes a write cache flush - just like a
barrier typically causes a write cache flush up to the I/O with the
barrier in it.  Both of these mechanisms provide the same thing - an
I/O barrier that enforces ordering of I/Os to disk.

Given that filesystems already indicate to the block layer when they
want a barrier, wouldn't it be better to get the block layer to issue
this cache flush if the underlying device doesn't support barriers
and it receives a barrier request?
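
To make the suggestion concrete, here is a minimal sketch in Python of
the dispatch decision.  The function name, the operation names and the
exact fallback sequence are all assumptions for illustration; nothing
here is an existing block layer interface.

```python
def dispatch(request, is_barrier, device_supports_barriers):
    """Return the operations a hypothetical block layer would issue."""
    if not is_barrier:
        return [("write", request)]
    if device_supports_barriers:
        # The device orders I/O around the barrier by itself.
        return [("barrier_write", request)]
    # Assumed fallback: wait for earlier I/O, then bracket a plain
    # write with cache flushes.
    return [
        ("wait_for_earlier_io",),
        ("flush_cache",),
        ("write", request),
        ("wait", request),
        ("flush_cache",),
    ]


plain = dispatch("data", is_barrier=False, device_supports_barriers=False)
native = dispatch("log", is_barrier=True, device_supports_barriers=True)
fallback = dispatch("log", is_barrier=True, device_supports_barriers=False)
```

With such a fallback the filesystem would submit barrier requests
unconditionally and never need to know which path was taken.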

FWIW, only XFS and Reiser3 use this function, and only when
issuing an fsync with barriers disabled, to make sure a common
test (fsync then power cycle) doesn't result in data loss...

  No one here seems to know; maybe Neil or the other folks on linux-raid
  can help us out with details on status of MD and write barriers?
 
 In 2.6.17, md/raid1 will detect if the underlying devices support
 barriers and if they all do, it will accept barrier requests from the
 filesystem and pass those requests down to all devices.
 
 Other raid levels will reject all barrier requests.

Any particular reason for not supporting barriers on the other types
of RAID?

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group