Bob Friesenhahn wrote:
On Sat, 4 Jul 2009, Phil Harman wrote:

If you reboot, your cpio(1) tests will probably go fast again, until someone uses mmap(2) on the files again. I think tar(1) uses read(2), but from my iPod I can't be sure. It would be interesting to see how tar(1) performs if you run that test before cp(1) on a freshly rebooted system.

Ok, I just rebooted the system. Now 'zpool iostat Sun_2540 60' shows that the cpio read rate has increased from (the most recently observed) 33 MB/second to as much as 132 MB/second. To some this may not seem significant but to me it looks a whole lot different. ;-)

Thanks, that's really useful data. I wasn't near a machine at the time, so I couldn't do it for myself. I answered your initial question based on what I understood of the implementation, and it's very satisfying to have the data to back it up.

I have done some work with the ZFS team towards a fix, but it is currently only in OpenSolaris.

Hopefully the fix is very, very good. It is difficult to displace the many years of SunOS training that using mmap is the path to best performance. mmap provides many tools to improve application performance which are just not available via traditional I/O.

The part of the problem I highlighted was ...

  6699438 zfs induces crosscall storm under heavy mapped sequential read

This has been fixed in OpenSolaris, and should be fixed in Solaris 10 update 8.

However, this is only part of the problem. The fundamental issue is that ZFS has its own ARC, separate from the Solaris page cache, so whenever mmap() is used, all I/O to that file has to make sure that the two caches are in sync. Hence, a read(2) on a file that has at some time been mapped will be impacted, even if the file is no longer mapped.
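
To make the two access paths concrete, here is a minimal sketch (in Python, for brevity) that reads a file sequentially once with read(2) and once through mmap(2), and reports the throughput of each. The names read_with_read, read_with_mmap and throughput_mb_s are made up for this illustration. It shows only the two system call paths discussed above, not anything about ZFS internals, and whatever numbers it reports depend entirely on the system and on the state of the caches.

```python
import mmap
import os
import time


def read_with_read(path, chunk_size=1 << 20):
    """Read the whole file sequentially with read(2); return bytes read."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            buf = os.read(fd, chunk_size)
            if not buf:  # empty result means end of file
                break
            total += len(buf)
    finally:
        os.close(fd)
    return total


def read_with_mmap(path, chunk_size=1 << 20):
    """Map the file with mmap(2) and copy it out chunk by chunk; return bytes read."""
    total = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            return 0  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for off in range(0, size, chunk_size):
                # Slicing copies the bytes, which faults in the mapped pages.
                total += len(m[off:off + chunk_size])
    return total


def throughput_mb_s(reader, path):
    """Time one full pass of `reader` over `path`; return MB/second."""
    start = time.perf_counter()
    nbytes = reader(path)
    elapsed = time.perf_counter() - start
    return nbytes / (1024 * 1024) / max(elapsed, 1e-9)
```

One way to use it, on the assumption that the behaviour described above holds on the system under test, is to call throughput_mb_s(read_with_read, path) on a freshly rebooted system, then throughput_mb_s(read_with_mmap, path), and then the read(2) variant again on the same file, comparing the first and last figures.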

I'm sure the data and interest from this thread will be useful to the ZFS team in prioritising further performance enhancements. So thanks again. And if there's any more useful data you can add, please do so. If you have a support contract, you might also consider logging a call and even raising an escalation request.

Cheers,
Phil

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
