On Sat, 10 Apr 2010, Edward Ned Harvey wrote:
For several seconds, *only* the log device is busy. Then it stops, and for maybe 0.5 secs *only* the primary storage disks are busy. Repeat, recycle. I expected to see the log device busy nonstop. And the spindle disks blinking lightly. As long as the spindle disks are idle, why wait for a larger TXG to be built? Why not flush out smaller TXGs as long as the disks are idle? But worse yet … During the 1-second (or 0.5 second) that the spindle disks are busy, why stop the log device? (Presumably also stopping my application that’s doing all the writing.)
What you are seeing should be expected and is good. The intent log allows synchronous writes to be turned into lazy ordinary writes (like async writes) in the next TXG cycle. Since the intent log is on an SSD, the pressure is taken off the primary disks to serve that function, so you will not see so many IOPS to the primary disks.
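To make the pattern concrete, here is a toy sketch of that behavior in Python. It is only an illustration under stated assumptions, not how ZFS is implemented: the function name `simulate`, the fixed 5-second TXG interval, and the write rate are all hypothetical, and the model ignores the stall of the log device during the flush that the original post describes.

```python
def simulate(sync_writes_per_sec, write_size, txg_interval_s, seconds):
    """Toy model: return a list of (log_bytes, pool_bytes) per second.

    Assumption: every sync write is committed to the log device at once,
    and the accumulated data reaches the primary disks only when a TXG
    closes, once every txg_interval_s seconds.
    """
    pending = 0          # bytes acknowledged but not yet on the primary disks
    timeline = []
    for t in range(1, seconds + 1):
        incoming = sync_writes_per_sec * write_size
        log_bytes = incoming            # the log device absorbs every sync write
        pending += incoming
        pool_bytes = 0
        if t % txg_interval_s == 0:     # TXG closes: one lazy bulk flush
            pool_bytes = pending
            pending = 0
        timeline.append((log_bytes, pool_bytes))
    return timeline


# Hypothetical workload: 1000 sync writes/s of 4 KiB each, TXG every 5 s.
timeline = simulate(1000, 4096, 5, 10)
for second, (log_b, pool_b) in enumerate(timeline, start=1):
    print(f"t={second:2d}s  log={log_b:>9d} B  pool={pool_b:>9d} B")
```

In this sketch the log column is busy every second while the pool column is zero except for one burst per TXG, which is the "log busy, then disks busy" rhythm described above.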
This means, if I’m doing zillions of *tiny* sync writes, I will get the best performance with the dedicated log device present. But if I’m doing large sync writes, I would actually get better performance without the log device at all. Or else … add just as many log devices as I have primary storage devices. Which seems kind of crazy.
If this is really a problem for you, then you should be able to somewhat resolve it by placing a smaller cap on the maximum size of a TXG. Then the system will write more often. However, the maximum synchronous bulk write rate will still be limited by the bandwidth of your intent log devices. Huge synchronous bulk writes are pretty rare since usually the bottleneck is elsewhere, such as the ethernet.
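As a back-of-envelope sketch of that limit: the sustained synchronous bulk rate can be no higher than the slower of the log devices and the primary disks. The function name `max_sync_rate` and all bandwidth figures below are hypothetical, and the model assumes that multiple log devices add their bandwidth (striped rather than mirrored).

```python
def max_sync_rate(log_devices, log_bw_mb_s, pool_bw_mb_s):
    """Upper bound on sustained sync write rate in MB/s (toy model).

    Assumptions: log devices are striped so their bandwidth adds up,
    and every synchronous byte passes through the log before the TXG
    flushes it to the primary disks.
    """
    return min(log_devices * log_bw_mb_s, pool_bw_mb_s)


# Hypothetical numbers: one 100 MB/s SSD log in front of a 400 MB/s pool.
print(max_sync_rate(1, 100, 400))   # the single log device is the bottleneck
# Four such log devices would be needed before the pool becomes the limit.
print(max_sync_rate(4, 100, 400))
```

With small writes the picture is different, because the limit is then IOPS rather than bandwidth, and that is where a single SSD log pays off.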
Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
_______________________________________________ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss