2012-07-30 0:40, opensolarisisdeadlongliveopensolaris wrote:
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Jim Klimov
Several times now I've seen statements on this list implying
that a dedicated ZIL/SLOG device, which catches sync writes for the log,
also allows for more streamlined writes to the pool during normal
healthy TXG syncs than is the case with the default ZIL located
within the pool.
It might be clearer if it's stated differently:
At any given time, your pool is in one of four states: idle, reading, writing,
or idle with writes queued but not currently being written. Now a sync write
operation takes place. If you have a dedicated log, it goes directly to the
log, and it doesn't interfere with any of the other operations that might be
occurring right now. You don't have to interrupt your current activity;
your sync write simply goes to a dedicated device that's guaranteed to be idle
in relation to all that other stuff. Then the sync write becomes async, and
gets coalesced into the pending TXG.
If you don't have a dedicated log, then the sync write jumps the write queue,
and becomes next in line. It waits for the present read or write operation to
complete, and then the sync write hits the disk, and flushes the disk buffer.
This means the sync write suffered a penalty waiting for the main pool disks to
be interruptible. Without a slog, you're delaying your sync writes, and
you're delaying the start of the next read or write operation... But
that's it. Without a slog, your operations are serial, whereas with a slog your
sync write can occur in parallel with your other operations.
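
To put rough numbers on the serial-versus-parallel point above, here is a toy model in Python. It is only a sketch of the description in this message, not of real ZFS scheduling; the function names and the millisecond figures are made up for illustration.

```python
# Toy model of the description above; not real ZFS behavior or code.
# All names and numbers here are hypothetical, chosen only for illustration.

def sync_write_latency(has_slog, pool_busy_ms, log_write_ms):
    """Return how long (ms) a sync write waits before it is acknowledged.

    pool_busy_ms: time left on the pool's in-flight read or write.
    log_write_ms: time to write and flush the log record itself.
    """
    if has_slog:
        # Dedicated log device: idle with respect to pool traffic,
        # so the sync write proceeds in parallel with the pool operation.
        return log_write_ms
    # No slog: the write jumps the queue but must still wait for the
    # current pool operation to finish before it can hit the disk.
    return pool_busy_ms + log_write_ms


def next_pool_op_delay(has_slog, log_write_ms):
    """Return the extra delay (ms) imposed on the next pool read or write."""
    return 0 if has_slog else log_write_ms


with_slog = sync_write_latency(True, pool_busy_ms=8, log_write_ms=1)
without_slog = sync_write_latency(False, pool_busy_ms=8, log_write_ms=1)
print(with_slog, without_slog)  # prints: 1 9
```

In this model the only costs of running without a slog are the wait for the in-flight pool operation and the delay pushed onto the next one, which is the whole of the penalty described above.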
There's no extra fragmentation, with or without a slog, because in either case
the sync write hits some dedicated and recyclable disk blocks, and then it
becomes async and is coalesced with all the other async writes. The layout and
fragmentation characteristics of the permanent TXG to be written to the pool are
exactly the same either way.
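
The claim in the paragraph above can be restated as a small Python sketch. It models only the claim as worded here (log records go to recyclable blocks, then every write is coalesced into the TXG); it says nothing about how ZFS actually allocates blocks, and `commit` and the location labels are hypothetical names.

```python
# Toy model of the claim above; hypothetical names, not ZFS internals.

def commit(writes, has_slog):
    """Model one TXG: sync writes are logged first, then everything is
    coalesced into a single TXG whose layout is returned."""
    log_location = "slog" if has_slog else "pool-zil"
    # Log records land in recyclable blocks, on the slog or in the pool.
    log_records = [(log_location, data) for data, is_sync in writes if is_sync]
    # Once logged, a sync write is treated like any other async write,
    # so the TXG contains every write in arrival order.
    txg_layout = [data for data, _is_sync in writes]
    return log_records, txg_layout


writes = [("a", False), ("b", True), ("c", False), ("d", True)]
log_with, txg_with = commit(writes, has_slog=True)
log_without, txg_without = commit(writes, has_slog=False)
```

Under these assumptions only the location of the log records differs between the two runs; `txg_with` and `txg_without` are identical.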
Thanks... but doesn't your description imply that sync writes
would always be written twice? That should be so with a dedicated SLOG, but
even with one, I think, small writes hit the SLOG and large ones
go straight to the pool devices (and smaller blocks catch up from
the TXG queue upon TXG flush). However, without a dedicated SLOG,
I thought that the writes into the ZIL happen once on the main
pool devices, and then are referenced from the live block pointer
tree without being rewritten elsewhere (and for the next TXG some
other location may be used for the ZIL). Maybe I am wrong, because
it would also make sense for small writes to indeed hit the disk twice,
with the same pool location(s) being reused for the ZIL.
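
To make the two guesses above concrete, here is a short Python sketch that counts how many times the data of one sync write lands on each device under each hypothesis. It encodes only my understanding as written in this message, not the ZFS source; `SMALL_WRITE_LIMIT`, its value, and the function names are assumptions for illustration.

```python
# Sketch of the two hypotheses in the paragraph above; not ZFS source code.
# SMALL_WRITE_LIMIT and the function names are hypothetical.

SMALL_WRITE_LIMIT = 32 * 1024  # assumed cutoff between "small" and "large"


def writes_with_slog(size):
    """Return (slog_writes, pool_writes) for one sync write of `size` bytes."""
    if size <= SMALL_WRITE_LIMIT:
        # Small write: data goes to the SLOG, then again to the pool
        # when the TXG is flushed.
        return (1, 1)
    # Large write: data goes straight to the pool; the log only refers to it.
    return (0, 1)


def pool_writes_without_slog(size, rewrite_small):
    """Return pool writes for one sync write when the ZIL lives in the pool.

    rewrite_small=False models the first guess (the ZIL block is referenced
    from the block pointer tree and never rewritten); True models the
    alternative (small writes hit the pool twice).
    """
    if rewrite_small and size <= SMALL_WRITE_LIMIT:
        return 2
    return 1
```

For a 4 KiB sync write without a SLOG, the first guess gives one pool write and the alternative gives two; which of the two matches the real code is exactly the question.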
//Jim