On Fri, 2007-04-27 at 11:01 +0200, Roch - PAE wrote:
> Chad Mynhier writes:
>  > On 4/27/07, Erblichs <[EMAIL PROTECTED]> wrote:
>  > > Ming Zhang wrote:
>  > > >
>  > > > Hi All
>  > > >
>  > > > I wonder if anyone has an idea of the performance loss caused by COW
>  > > > in ZFS? If you have to read the old data before writing it to some
>  > > > other place, that involves a disk seek.
>  > > >
>  > >
>  > > Ming,
>  > >
>  > >         Lets take a pro example with a minimal performance
>  > >         tradeoff.
>  > >
>  > >         All FSs that modify a disk block, IMO, do a full
>  > >         disk block read before anything.
>  > >
>  > 
>  > Actually, I'd say that this is the main point that needs to be made.
>  > If you're modifying data that was once on disk, that data had to be
>  > read from disk at some point in the past.  This is invariably true for any
>  > filesystem.
>  > 
> 
> A nit, just so readers are clear about this: the read of old
> data to service a write need only be done when handling a write
> of a partial filesystem block (and the data is not cached, as
> mentioned). For a fixed-size-block database with a matching ZFS
> recordsize, writes will mostly be handled without a need to
> read previous data. Most filesystems should behave the same here.
> 
>  > With traditional filesystems, that data block is rewritten in the same
>  > place.  If it were the case that disk blocks were always written
>  > immediately after being read, with no intervening I/O to cause a disk
>  > seek, COW would have no performance benefit over traditional
>  > filesystems.  (Well, this isn't true, as there are other benefits to
>  > be had.)
>  > 
>  > But it's rarely (if ever) the case that this happens.  The modified
>  > block is generally written some time after the original block was
>  > read, with plenty of intervening I/O that leaves the disk head over
>  > some random location on the platter.  So for traditional filesystems,
>  > the in-place write of a modified block will typically involve a disk
>  > seek.
>  > 
>  > And a second point to be made about this is the effect of caching.
>  > With any filesystem, writes are cached in memory and flushed out to
>  > disk on a regular basis.  With traditional filesystems, flushing the
>  > cache involves a set of random writes on the disk, which is possibly
>  > going to involve a disk seek for every block written.  (In the best
>  > case, writes could be reordered in ascending order across the disk to
>  > minimize the disk seeks, but there would still possibly be a small
>  > disk seek between each write.)
>  > 
>  > With a COW filesystem, flushing the cache involves writing
>  > sequentially to disk with no intervening disk seeks.  (This assumes
>  > that there's enough free space on disk to avoid fragmentation.)  In
>  > the ideal case, this means writing to disk at full platter speed.
>  > This is where the main performance benefit of COW comes from.
>  > 
> 



Thank you all for the detailed explanation. I am sorry; had I read more
first, I might not have needed to ask at all.

Originally I saw that ZFS uses a 128KB block size for files and does COW
(I forgot the URL), so I thought small writes would cause a lot of
partial block writes, which is why I asked. I knew that a full block
write does not need the read.
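
To make the partial-block case concrete, here is a minimal sketch in
Python. The function name and parameters are made up for illustration
and are not ZFS code; it assumes the affected blocks already exist on
disk and are not cached.

```python
def needs_read_before_write(offset: int, length: int, block_size: int) -> bool:
    """Return True if the write covers some block only partially.

    Assumption: every touched block already holds data on disk and none
    of it is cached, so a partially covered block must be read first.
    """
    if length <= 0:
        return False
    starts_on_boundary = offset % block_size == 0
    ends_on_boundary = (offset + length) % block_size == 0
    return not (starts_on_boundary and ends_on_boundary)


RECORD = 128 * 1024  # the 128KB figure mentioned above

# An aligned, full 128KB write: no old data is needed.
print(needs_read_before_write(0, RECORD, RECORD))    # False
# An 8KB write inside a 128KB block: the rest of the block must be read.
print(needs_read_before_write(4096, 8192, RECORD))   # True
# An 8KB database page with a matching 8KB block size: no read.
print(needs_read_before_write(0, 8192, 8192))        # False
```

The last case is the fixed-size-block database with a matching
recordsize that Roch describes.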

Now that I have read more, including the on-disk format spec, I know the
block size is variable, and my question is cleared up.

I understand the overall benefit of this write mechanism. It is somewhat
like WAFL from NetApp, so what is the difference between the two in this
respect? Also, I remember reading a conference paper about it back in
2003/4; was that from Sun, or did that author later go to Sun? ;)

Just a thought: such writes will make a file non-sequential on disk, so
a later sequential read will have to seek. How does ZFS solve this? By
aggregated caching? Or does ZFS need a defrag tool?
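
Here is a toy model in Python of what I mean. It is only a sketch under
simple assumptions, not how the ZFS allocator really works: a file starts
as one contiguous run of blocks, each COW overwrite moves the block to
the next free address at the end, and a "seek" is any jump between
consecutive logical blocks during a sequential read. All names are
invented for the example.

```python
import random


def count_read_seeks(layout):
    """Count discontinuities when reading logical blocks 0..n-1 in order."""
    return sum(1 for prev, cur in zip(layout, layout[1:]) if cur != prev + 1)


def overwrite(layout, logical_blocks, copy_on_write):
    """Return the layout after overwriting the given logical blocks in order.

    layout[i] is the physical address of logical block i. In-place
    overwrites leave addresses unchanged; COW overwrites relocate each
    block to the next free address past the end of the layout.
    """
    layout = list(layout)
    next_free = max(layout) + 1
    for blk in logical_blocks:
        if copy_on_write:
            layout[blk] = next_free
            next_free += 1
    return layout


rng = random.Random(0)          # fixed seed so runs are repeatable
n = 1000
initial = list(range(n))        # file starts fully sequential on disk
updates = [rng.randrange(n) for _ in range(200)]

in_place = overwrite(initial, updates, copy_on_write=False)
cow = overwrite(initial, updates, copy_on_write=True)

print("sequential-read seeks, in-place:", count_read_seeks(in_place))
print("sequential-read seeks, COW:     ", count_read_seeks(cow))
```

In this model the in-place layout stays sequential, while the COW layout
picks up a jump around most rewritten blocks, even though the COW writes
themselves went to consecutive addresses.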


PS: before I jump into the deep code ocean, are there any other documents
about ZFS internals besides the on-disk format spec?

thanks again.

> yep.
> 
>  > Chad Mynhier
>  > _______________________________________________
>  > zfs-discuss mailing list
>  > zfs-discuss@opensolaris.org
>  > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss