> From: [email protected] [mailto:[email protected]]
> On Behalf Of Luke S. Crawford
> 
> Sweet!  thanks for the info.  Why only 8G?   I mean, my systems are pretty
> much being constantly written to in a very random manner;  unrolling that
> into sequential writes, which I believe is what a ZIL does, would be
> a huge win.

Here's how it works:
When an application issues a write to disk, it's an async write by default,
and the OS is permitted to buffer it in memory.  Depending on which version
of ZFS we're talking about, it will buffer for up to 30 sec or up to 5 sec,
assembling all the async writes into a single transaction group (TXG),
which is a performance enhancement.  Thanks to copy on write and the block
remapping needed to support it, all the small writes that would be
scattered randomly across the disk on an older filesystem (ext3) are now
written out as one sequential block.

When an application calls fsync(), or opens a file in SYNC write mode
(O_SYNC), the OS cannot do the above.  It has to store the write on
nonvolatile storage immediately.  This is what the log device is for: it
allows these small SYNC writes ("small" is configurable, I think by default
smaller than 32k) to be written quickly to nonvolatile storage, after which
they can be buffered and optimized along with all the async writes.  Also,
since the log device is used only for sync writes, there is no device
contention: if some other async read or write is in progress on the main
pool, there's no need to wait for it.  So the log device greatly improves
the performance of small random sync writes.
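
To make the async/sync distinction concrete, here is a minimal Python
sketch of the three ways an application can issue a write.  The function
names are made up for illustration; only the os module calls are real.
Nothing here is ZFS-specific, it just shows which application calls force
the filesystem onto the sync path described above.

```python
import os
import tempfile

def write_async(path, data):
    """Plain write: the OS may buffer it in memory before it reaches disk."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        return os.write(fd, data)
    finally:
        os.close(fd)

def write_with_fsync(path, data):
    """Write, then fsync(): returns only after the data is on stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = os.write(fd, data)
        os.fsync(fd)  # this is the call that makes the write synchronous
        return written
    finally:
        os.close(fd)

def write_o_sync(path, data):
    """Open in SYNC write mode: every write() on this fd is synchronous."""
    # O_SYNC is not defined on every platform; fall back to 0 (plain write)
    # where it is missing.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_SYNC", 0)
    fd = os.open(path, flags, 0o644)
    try:
        return os.write(fd, data)
    finally:
        os.close(fd)

def demo():
    """Run all three variants against a scratch file and return its contents."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "example.dat")
        write_async(path, b"buffered")
        write_with_fsync(path, b"fsynced")
        write_o_sync(path, b"o_sync")
        with open(path, "rb") as f:
            return f.read()
```

Only the second and third variants are the kind of write a log device
helps with; the first one just goes into the next TXG.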

Recent versions of ZFS support log device removal, so if a log device
fails, the TXG is immediately flushed from RAM, and the pool drops back to
the "normal" level of performance without a log device.  During that window,
a power failure or kernel panic would result in data loss...  As long as
you're willing/able to risk a second or two of vulnerability, there is no
need for ZIL mirroring.  It is logical, though I've never seen anybody test
it, that a mirrored ZIL makes your system slower than an unmirrored one
(but still much faster than running without a dedicated log device).

BTW, there are many situations where completely disabling the ZIL is also
acceptable.  Even if you're doing databases and so forth, there is
absolutely nothing faster than disabling the ZIL, and there is no risk of
data corruption; there is only a risk of losing up to 30 sec of data in the
event of a power failure, kernel panic, etc.  Depending on how your machine
interacts with other machines, this may be acceptable, and it usually is
if your server is a file server.

(And finally, the actual answer to the question...)

Suppose your log device is on a 6Gbit bus, a TXG is flushed at least once
every 30 seconds, and you're absolutely hammering your system with small
sync writes (perhaps you run a transactional database or something).  This
is the worst case.  Then the most you could possibly ever write to your log
before a TXG flush would be 6Gbit/sec x 30 sec = 180Gbit = 22.5 GB.  In
practice, this is probably unattainable, and certainly unusual if it's even
possible.  In practice, 8G is more than enough for 99% of what people care
about.  In fact, the DDRDrive is designed specifically for this purpose,
and it's only 4G.  Even 4G is sufficient for 99%
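
The worst-case arithmetic above is easy to redo for other bus speeds and
flush intervals.  A small sketch in Python; the function name is
hypothetical, and it assumes the log device can saturate the raw bus rate
for the whole interval, which ignores protocol overhead and is therefore an
upper bound.

```python
def worst_case_log_bytes(bus_gbit_per_s, txg_interval_s):
    """Upper bound on bytes written to the log between two TXG flushes.

    Assumes the bus is saturated with sync writes for the entire interval.
    """
    bits = bus_gbit_per_s * 1e9 * txg_interval_s
    return bits / 8  # 8 bits per byte

# The two TXG intervals mentioned earlier: 30 sec and 5 sec.
for interval in (30, 5):
    gb = worst_case_log_bytes(6, interval) / 1e9
    print(f"{interval} s TXG interval: {gb:.2f} GB")
# 30 s TXG interval: 22.50 GB
# 5 s TXG interval: 3.75 GB
```

With the 5 sec interval, even the theoretical worst case on a 6Gbit bus
fits in under 4G.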

_______________________________________________
Tech mailing list
[email protected]
https://lists.lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
 http://lopsa.org/
