On Mon, 9 Nov 2009, Ilya wrote:

2. I came upon this statement in a forum post:

[i]"ZFS uses 128K data blocks by default whereas other filesystems typically use 4K or 8K blocks. This naturally reduces the potential for fragmentation by 32X over 4k blocks."[/i]

How is this true? I mean, if you have a 128k default block size and you store a 4k file within that block, then you will have a ton of slack space left over.

Short files are stored in a single block sized to fit the data, so a 4k file does not occupy a full 128K block. Files larger than 128K are diced into 128K blocks.

The fragmentation discussed is fragmentation at the file level.
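To make that concrete, here is a quick Python sketch of the block-sizing rule described above. The function name, the sector-size rounding, and the example sizes are my own illustration, not anything out of the actual ZFS code:

    # Rough sketch of how ZFS sizes the blocks of a file, given the rule
    # described above (illustrative only, not the real allocator).
    def zfs_blocks(file_size, recordsize=128 * 1024, sector=512):
        """Return the list of logical block sizes used to store the file."""
        if file_size <= recordsize:
            # A short file gets one block just big enough to hold it,
            # rounded up to the assumed sector size.
            rounded = max(sector, -(-file_size // sector) * sector)
            return [rounded]
        # A larger file is diced into full recordsize blocks.
        nblocks = -(-file_size // recordsize)
        return [recordsize] * nblocks

    print(zfs_blocks(4 * 1024))      # [4096]  -- no 124K of slack
    print(zfs_blocks(300 * 1024))    # [131072, 131072, 131072]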

3. Another statement from a post:

[i]"the seek time for single-user contiguous access is essentially zero since the seeks occur while the application is already busy processing other data. When mirror vdevs are used, any device in the mirror may be used to read the data."[/i]

All this is saying is that when you are reading off of one physical device, you will already be seeking for the blocks that you need on the other device, so the seek time will no longer be an issue, right?

Seek time becomes less of an issue for sequential reads when blocks are read from different disks and the reads are scheduled (prefetched) in advance, so the latency is hidden behind other work. A seek still consumes drive IOPS whenever the disk has to move its heads, though.
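A back-of-the-envelope calculation shows the difference between hiding latency and eliminating it. All of the drive numbers below are made up for illustration:

    # Illustrative numbers only: prefetch hides seek latency behind the
    # application's processing, but each seek still costs the drive time.
    seek_ms = 8.0          # assumed average seek + rotational latency
    transfer_mb_s = 80.0   # assumed sustained transfer rate
    record_kb = 128.0

    transfer_ms = record_kb / 1024 / transfer_mb_s * 1000
    print("transfer per 128K block: %.2f ms" % transfer_ms)   # ~1.56 ms
    print("seek per block if not hidden: %.1f ms" % seek_ms)

    # With prefetch, the next block's seek overlaps work on the previous
    # block, so the application never waits for it -- but the drive still
    # spends seek_ms per seek, which caps how many such reads/sec it can do:
    print("max seek-then-read 128K ops/sec: %.0f" % (1000 / (seek_ms + transfer_ms)))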

6. Could anyone clarify this post:

[i]"ZFS uses a copy-on-write model. Copy-on-write tends to cause fragmentation if portions of existing files are updated. If a large portion of a file is overwritten in a short period of time, the result should be reasonably fragment-free but if parts of the file are updated over a long period of time (like a database) then the file is certain to be fragmented. This is not such a big problem as it appears to be since such files were already typically accessed using random access."[/i]

The point here is that ZFS buffers unwritten data in memory for up to 30 seconds. With a large amount of buffered data, ZFS is able to write the data in a more sequential and better-optimized fashion, while wasting fewer IOPS. Databases usually use random I/O and synchronous writes, which tends to scramble the data layout on disk with a copy-on-write model. ZFS is not optimized for database performance. On the other hand, the copy-on-write model reduces the chance of database corruption if there is a power failure or system crash.
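Here is a toy model of that fragmentation effect, with made-up sizes and a simplistic "append new copies at the end" allocator of my own invention rather than ZFS's real one:

    # Toy copy-on-write model: a file of 1000 blocks starts out contiguous,
    # then 200 random blocks are rewritten.  In-place update would keep one
    # extent; copy-on-write relocates each rewritten block to fresh space,
    # breaking the file into many extents.
    import random

    nblocks, nupdates = 1000, 200
    random.seed(1)

    location = list(range(nblocks))      # initially contiguous on disk
    next_free = nblocks
    for blk in random.sample(range(nblocks), nupdates):
        location[blk] = next_free        # COW: new copy written elsewhere
        next_free += 1

    # Count contiguous extents in the file's logical-to-physical mapping.
    extents = 1 + sum(1 for i in range(1, nblocks)
                      if location[i] != location[i - 1] + 1)
    print("extents after random copy-on-write updates:", extents)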

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
