There is nothing in here that requires zfs-confidential. Cross-posted to zfs-discuss.
On Oct 21, 2010, at 3:37 PM, Jim Nissen wrote:
> Cross-posting.
>
> -------- Original Message --------
> Subject: Performance problems due to smaller ZFS recordsize
> Date: Thu, 21 Oct 2010 14:00:42 -0500
> From: Jim Nissen <jim.nis...@oracle.com>
> To: perf-roundtable...@oracle.com, perf-roundta...@sun.com, kernel-supp...@sun.com
>
> I'm working with a customer who is having Directory Server backup performance
> problems since switching to ZFS. In short, backups that used to take 1 - 4
> hours on UFS are now taking 12+ hours on ZFS. We've figured out that ZFS
> reads seem to be throttled, whereas writes seem really fast. Back-end storage
> is IBM SVC.
>
> As part of their cutover, they were given the following Best Practice
> recommendations from the LDAP folks @Sun...
>
> /etc/system tunables:
> set zfs:zfs_arc_max = 0x100000000
> set zfs:zfs_vdev_cache_size = 0
> set zfs:zfs_vdev_cache_bshift = 13
> set zfs:zfs_prefetch_disable = 1
> set zfs:zfs_nocacheflush = 1
>
> At the ZFS filesystem level:
> recordsize = 32K
> noatime
>
> One of the things they noticed is that simple dd reads from one of the 128K
> recordsize filesystems run much faster (4 - 7 times) than from their 32K
> filesystems. I joined a shared shell where we switched the same filesystem
> from 32K to 128K, and we could see the underlying disks were getting 4x better
> throughput (from 1.5 - 2 MB/s to 8 - 10 MB/s), whereas a direct dd against one
> of the disks showed that the disks were capable of much more (45+ MB/s).
>
> Here are some snippets from iostat...
>
> ZFS recordsize of 32K, dd if=./somelarge5gfile of=/dev/null bs=16k
> (to mimic application blocksizes)
>
>                          extended device statistics
>     r/s    w/s     kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
>    67.6    0.0   2132.7    0.0  0.0  0.3    0.0    4.5   0  30 c6t60050768018E82BDA800000000000565d0
>    67.4    0.0   2156.8    0.0  0.0  0.1    0.0    1.5   0  10 c6t60050768018E82BDA800000000000564d0
>    68.4    0.0   2158.3    0.0  0.0  0.3    0.0    4.5   0  31 c6t60050768018E82BDA800000000000563d0
>    66.2    0.0   2118.4    0.0  0.0  0.2    0.0    3.4   0  22 c6t60050768018E82BDA800000000000562d0
>
> ZFS recordsize of 128K, dd if=./somelarge5gfile of=/dev/null bs=16k
> (to mimic application blocksizes)
>
>                          extended device statistics
>     r/s    w/s     kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
>    78.2    0.0  10009.6    0.0  0.0  0.2    0.0    1.9   0  15 c6t60050768018E82BDA800000000000565d0
>    78.6    0.0   9960.0    0.0  0.0  0.1    0.0    1.2   0  10 c6t60050768018E82BDA800000000000564d0
>    79.4    0.0  10062.3    0.0  0.0  0.4    0.0    4.4   0  35 c6t60050768018E82BDA800000000000563d0
>    76.6    0.0   9804.8    0.0  0.0  0.2    0.0    2.3   0  17 c6t60050768018E82BDA800000000000562d0
>
> dd if=/dev/rdsk/c6t60050768018E82BDA800000000000564d0s0 of=/dev/null bs=32k
> (to mimic the small ZFS blocksize)
>
>                          extended device statistics
>     r/s    w/s     kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
>  3220.9    0.0  51533.9    0.0  0.0  0.9    0.0    0.3   1  94 c6t60050768018E82BDA800000000000564d0
>
> So, it's not as if the underlying disks aren't capable of much more than what
> ZFS is asking of them. I understand the part where ZFS has to do 4x as much
> work with a 32K blocksize as with 128K, but it doesn't seem as if ZFS is doing
> much at all with the underlying disks.
>
> We've asked the customer to rerun the test without the /etc/system tunables.
> Has anybody else worked a similar issue? Any hints would be greatly
> appreciated.
>
> Thanks!
>
> Jim
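Doing the arithmetic on the iostat snippets above shows what "throttled" means here:
per-disk read IOPS barely change between the two recordsize settings, so bytes/sec
scale with the record size rather than with what the LUN can deliver. Roughly (average
read size derived as kr/s divided by r/s):

    32K recordsize run :  ~67 r/s  x ~32 KB/read  ≈  2.1 MB/s per disk
    128K recordsize run:  ~78 r/s  x ~128 KB/read ≈  9.8 MB/s per disk
    raw dd on one LUN  : ~3221 r/s x ~16 KB/read  ≈ 50.3 MB/s

In other words, ZFS is issuing about the same small number of reads per second in both
cases and leaving the disks mostly idle, which is one reason the rerun without the
/etc/system tunables (zfs_prefetch_disable = 1 in particular) is the interesting data point.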
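For anyone reproducing this, a minimal sketch of how the filesystem-level recommendations
are applied and how to confirm what is actually in effect before the rerun. The dataset
name ldappool/ds is hypothetical; substitute the customer's actual pool/filesystem:

    # Filesystem-level settings from the Best Practice list (hypothetical dataset name)
    zfs set recordsize=32k ldappool/ds
    zfs set atime=off ldappool/ds
    zfs get recordsize,atime ldappool/ds        # confirm the properties in effect

    # Confirm whether the /etc/system tunables are live in the running kernel
    echo "zfs_prefetch_disable/D" | mdb -k
    echo "zfs_vdev_cache_size/D"  | mdb -k
    echo "zfs_arc_max/E"          | mdb -k

Note that recordsize only applies to blocks written after the change, so flipping an
existing filesystem back to 128K does not rewrite data already stored as 32K records.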