On Jul 29, 2012, at 1:53 PM, Jim Klimov wrote:
> 2012-07-30 0:40, opensolarisisdeadlongliveopensolaris wrote:
>>> From: [email protected] [mailto:zfs-discuss-
>>> [email protected]] On Behalf Of Jim Klimov
>>>
>>> Several times now I've seen statements on this list implying
>>> that a dedicated ZIL/SLOG device catching sync writes for the log,
>>> also allows for more streamlined writes to the pool during normal
>>> healthy TXG syncs, than is the case with the default ZIL located
>>> within the pool.
>>
>> It might be clearer if stated differently:
>>
>> At any given time, your pool is in one of four states: idle, reading,
>> writing, or idle with writes queued but not currently being written. Now a
>> sync write operation takes place. If you have a dedicated log, it goes
>> directly to the log, and it doesn't interfere with any of the other
>> operations that might be occurring right now. You don't have to interrupt
>> your current activity; your sync write simply goes to a dedicated device
>> that's guaranteed to be idle in relation to all that other stuff. Then the
>> sync write becomes async, and gets coalesced into the pending TXG.
>>
>> If you don't have a dedicated log, then the sync write jumps the write
>> queue, and becomes next in line. It waits for the present read or write
>> operation to complete, and then the sync write hits the disk, and flushes
>> the disk buffer. This means the sync write suffered a penalty waiting for
>> the main pool disks to be interruptible. Without slog, you're causing delay
>> to your sync writes, and you're causing delay before the next read or write
>> operation can begin... But that's it. Without slog, your operations are
>> serial, whereas with slog your sync write can occur in parallel with your
>> other operations.
>>
>> There's no extra fragmentation, with or without slog. Because in either
>> case, the sync write hits some dedicated and recyclable disk blocks, and
>> then it becomes async and coalesced with all the other async writes. The
>> layout and/or fragmentation characteristics of the permanent TXG to be
>> written to the pool is exactly the same either way.
>
> Thanks... but doesn't your description imply that the sync writes
> would always be written twice? It should be with dedicated SLOG, but
> even with one, I think, small writes hit the SLOG and large ones
> go straight to the pool devices (and smaller blocks catch up from
> the TXG queue upon TXG flush). However, without a dedicated SLOG,
> I thought that the writes into the ZIL happen once on the main
> pool devices, and then are referenced from the live block pointer
> tree without being rewritten elsewhere (and for the next TXG some
> other location may be used for the ZIL). Maybe I am wrong, because
> it would also make sense for small writes to hit the disk twice
> indeed, and the same pool location(s) being reused for the ZIL.

You are both right and wrong at the same time. It depends on the data.
Without a slog, writes that are larger than zfs_immediate_write_sz are
written to the permanent place in the pool. Please review (again) my
slides on the subject.
http://www.slideshare.net/relling/zfs-tutorial-lisa-2011
slide 78.
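
As a rough illustration of the rule above, here is a minimal Python sketch
of where a sync write lands first on a pool that has no slog. It is a model
of the behavior described in this message, not the ZFS source. The function
name, the return strings, and the 32 KiB value used for
zfs_immediate_write_sz are assumptions for illustration; check the tunable
on your own system. The strict "larger than" comparison follows the wording
above, and the exact boundary behavior is likewise an assumption.

```python
# Sketch only: models the rule described above, not the actual ZFS code.
# Assumed value of zfs_immediate_write_sz in bytes; verify on your system.
ASSUMED_IMMEDIATE_WRITE_SZ = 32 * 1024


def sync_write_path_without_slog(size_bytes,
                                 immediate_write_sz=ASSUMED_IMMEDIATE_WRITE_SZ):
    """Return where a sync write's data lands first on a pool with no slog.

    Hypothetical helper; the name and return strings are illustrative.
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes > immediate_write_sz:
        # Large write: the data goes to its permanent location once, and
        # the log record only has to reference it.
        return "permanent pool location (data written once)"
    # Small write: the data is copied into a ZIL block in the pool now and
    # written again to its permanent location when the TXG commits.
    return "in-pool ZIL block, then TXG commit (data written twice)"


if __name__ == "__main__":
    for size in (4 * 1024, 32 * 1024, 128 * 1024):
        print(size, "->", sync_write_path_without_slog(size))
```

So each of you is describing one branch: large writes are written once, and
small ones are written twice.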
For those who prefer to be lectured, another opportunity will arise in
December 2012 in San Diego at the LISA'12 conference. I am revamping
much of the material from 2011 to catch up with all of the cool new things
that arrived and are due this year.
-- richard
--
ZFS Performance and Training
[email protected]
+1-760-896-4422