Re: [zfs-discuss] slow speed problem with a new SAS shelf
On 27.08.2012 14:43, Sašo Kiselkov wrote:
>> Is there any way to disable ARC for testing and leave prefetch enabled?
>
> No. The reason is simple: prefetch is a mechanism separate from your application's direct read requests. Prefetch runs ahead of your anticipated read requests and places the blocks it expects you'll need in the ARC, so by disabling the ARC you have disabled prefetch as well. You can get around the problem by exporting and importing the pool between test runs, which will clear the ARC. So do:
>
> # dd if=/dev/zero of=testfile bs=1024k count=1
> # zpool export sas1
> # zpool import sas1
> # dd if=testfile of=/dev/null bs=1024k

Thank you very much, Sašo. Now I can see that the hardware works without problems. I created another zpool for testing, 10 disks in mirror pairs (sas2):

root@atom:/# zpool export sas2 ; zpool import sas2
root@atom:/# readspeed /sas2/5g
5120+0 records in
5120+0 records out
5368709120 bytes (5.4 GB) copied, 5.73728 s, 936 MB/s

root@atom:/# zpool export sas2 ; zpool import sas2
root@atom:/# readspeed /sas2/5g
5120+0 records in
5120+0 records out
5368709120 bytes (5.4 GB) copied, 5.63869 s, 952 MB/s
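For repeated cold-cache runs, the export/import trick can be wrapped in a small helper in the spirit of the readspeed() function used in this thread. This is only a sketch: the coldread name is made up here, and the export will of course fail if anything is still using the pool.

# hypothetical helper: export/import to empty the ARC, then read sequentially
coldread () {
    pool=$1; file=$2
    zpool export "$pool" || return 1
    zpool import "$pool" || return 1
    dd if="$file" of=/dev/null bs=1M
}

# usage matching the runs above:
# coldread sas2 /sas2/5g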
Re: [zfs-discuss] slow speed problem with a new SAS shelf
On 27.08.2012 14:02, Sašo Kiselkov wrote:
>> Can someone with a Supermicro JBOD equipped with SAS drives and an LSI HBA do this sequential read test?
>
> Did that on a SC847 with 45 drives, read speeds around 2GB/s aren't a problem.

Thanks for the info.

>> Don't forget to set primarycache=none on the testing dataset.
>
> There's your problem. By disabling the cache you've essentially disabled prefetch. Why are you doing that?

Hm. The box has 96GB of RAM; I was trying to exclude the influence of the ARC. I hadn't thought about prefetch...

readspeed is:

readspeed () { dd if=$1 of=/dev/null bs=1M ;}

root@atom:/sas1/test# zfs set primarycache=metadata sas1/test
root@atom:/sas1/test# readspeed 3g
3000+0 records in
3000+0 records out
3145728000 bytes (3.1 GB) copied, 19.2203 s, 164 MB/s

Prefetch still disabled?

root@atom:/sas1/test# zfs set primarycache=all sas1/test
root@atom:/sas1/test# readspeed 3g
3000+0 records in
3000+0 records out
3145728000 bytes (3.1 GB) copied, 3.99195 s, 788 MB/s

This seems to be the disk read speed with prefetch enabled.

root@atom:/sas1/test# readspeed 3g
3000+0 records in
3000+0 records out
3145728000 bytes (3.1 GB) copied, 0.901665 s, 3.5 GB/s
root@atom:/sas1/test# readspeed 3g
3000+0 records in
3000+0 records out
3145728000 bytes (3.1 GB) copied, 1.02127 s, 3.1 GB/s
root@atom:/sas1/test# readspeed 3g
3000+0 records in
3000+0 records out
3145728000 bytes (3.1 GB) copied, 0.86884 s, 3.6 GB/s

These results are obviously coming from memory. Is there any way to disable ARC for testing and leave prefetch enabled?
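As a side note on the "Prefetch still disabled?" question: you can watch the ZFS prefetch counters while a read is running, and file-level prefetch can also be toggled globally instead of indirectly via primarycache. The kstat and tunable names below are the OpenSolaris/illumos-era ones as remembered, not taken from this thread, so verify them on your own build:

# prefetch (zfetch) activity counters; compare before and after a readspeed run
kstat -p zfs:0:zfetchstats

# global file-level prefetch switch (0 = enabled, 1 = disabled), live via mdb
echo "zfs_prefetch_disable/D" | mdb -k
echo "zfs_prefetch_disable/W0t1" | mdb -kw

# persistent form, in /etc/system:
#   set zfs:zfs_prefetch_disable = 1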
Re: [zfs-discuss] slow speed problem with a new SAS shelf
On 25.07.2012 9:29, Yuri Vorobyev wrote:
>>>> I faced a strange performance problem with a new disk shelf. We have been using a ZFS system with SATA disks for a while.
>>>
>>> What OS and release?
>>
>> Oh, I forgot this important detail. It is OpenIndiana oi_151a5 now.
>
> New testing data: I rebooted into the first boot environment with the original oi_151 dev install (without updates). Read speed remains the same, about 150 MB/s.
>
> Now something interesting. I booted a CentOS 6.3 live CD and created a software RAID10:
>
> # mdadm -C /dev/md0 --level=raid10 --assume-clean --raid-devices=10 /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai /dev/sdaj /dev/sdak /dev/sdal
>
> # dd if=/dev/zero of=zero bs=1M count=5000
> 5000+0 records in
> 5000+0 records out
> 5242880000 bytes (5.2 GB) copied, 9.50402 s, 552 MB/s
>
> Cleaning the file system caches:
>
> # free
>              total       used       free     shared    buffers     cached
> Mem:      99195228    6800076   92395152          0      22184    5269432
> -/+ buffers/cache:    1508460   97686768
> Swap:            0          0          0
> # echo 3 > /proc/sys/vm/drop_caches
> # free
>              total       used       free     shared    buffers     cached
> Mem:      99195228    1434564   97760664          0       1964      75988
> -/+ buffers/cache:    1356612   97838616
> Swap:            0          0          0
>
> # dd if=zero of=/dev/null bs=1M
> 5000+0 records in
> 5000+0 records out
> 5242880000 bytes (5.2 GB) copied, 5.65738 s, 927 MB/s
>
> iostat during the read is here: http://pastebin.com/rwd0LWdc
> CentOS dmesg is here: https://dl.dropbox.com/u/12915469/centos_dmesg.txt
>
> Sequential speed is six times higher than in OpenIndiana. Seems like a driver bug in mpt_sas? As a reminder, the HBA is an LSI 9205-8e (2308 chip). LSI support answered that OpenIndiana is not a supported OS (who would have doubted it...).
>
> What should I do? Go back to the time-tested LSI2008 HBA?

We bought an LSI 9200-8e (LSI2008 chip). The card came with firmware version 7; I am not upgrading it for now. I reconnected the shelf to it. No luck: read speed is much lower than expected.

# dd if=3g of=/dev/null bs=1M
3000+0 records in
3000+0 records out
3145728000 bytes (3.1 GB) copied, 23.5578 s, 134 MB/s

Can someone with a Supermicro JBOD equipped with SAS drives and an LSI HBA do this sequential read test? Don't forget to set primarycache=none on the testing dataset.
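For anyone willing to run the requested test, a minimal sequence on the ZFS side might look like the sketch below. The pool name tank and the dataset name (with its default mountpoint) are placeholders, and note that, as discussed earlier in the thread, primarycache=none also knocks out prefetch, so the number it produces is a worst-case one.

# scratch dataset with data caching disabled, as requested above
zfs create tank/seqtest
zfs set primarycache=none tank/seqtest

# write a test file, then read it back sequentially
dd if=/dev/zero of=/tank/seqtest/3g bs=1M count=3000
dd if=/tank/seqtest/3g of=/dev/null bs=1M

# clean up afterwards
zfs destroy tank/seqtest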
Re: [zfs-discuss] slow speed problem with a new SAS shelf
On 23.07.2012 21:59, Yuri Vorobyev wrote:
>>> I faced a strange performance problem with a new disk shelf. We have been using a ZFS system with SATA disks for a while.
>>
>> What OS and release?
>
> Oh, I forgot this important detail. It is OpenIndiana oi_151a5 now.

New testing data: I rebooted into the first boot environment with the original oi_151 dev install (without updates). Read speed remains the same, about 150 MB/s.

Now something interesting. I booted a CentOS 6.3 live CD and created a software RAID10:

# mdadm -C /dev/md0 --level=raid10 --assume-clean --raid-devices=10 /dev/sdac /dev/sdad /dev/sdae /dev/sdaf /dev/sdag /dev/sdah /dev/sdai /dev/sdaj /dev/sdak /dev/sdal

# dd if=/dev/zero of=zero bs=1M count=5000
5000+0 records in
5000+0 records out
5242880000 bytes (5.2 GB) copied, 9.50402 s, 552 MB/s

Cleaning the file system caches:

# free
             total       used       free     shared    buffers     cached
Mem:      99195228    6800076   92395152          0      22184    5269432
-/+ buffers/cache:    1508460   97686768
Swap:            0          0          0
# echo 3 > /proc/sys/vm/drop_caches
# free
             total       used       free     shared    buffers     cached
Mem:      99195228    1434564   97760664          0       1964      75988
-/+ buffers/cache:    1356612   97838616
Swap:            0          0          0

# dd if=zero of=/dev/null bs=1M
5000+0 records in
5000+0 records out
5242880000 bytes (5.2 GB) copied, 5.65738 s, 927 MB/s

iostat during the read is here: http://pastebin.com/rwd0LWdc
CentOS dmesg is here: https://dl.dropbox.com/u/12915469/centos_dmesg.txt

Sequential speed is six times higher than in OpenIndiana. Seems like a driver bug in mpt_sas? As a reminder, the HBA is an LSI 9205-8e (2308 chip). LSI support answered that OpenIndiana is not a supported OS (who would have doubted it...).

What should I do? Go back to the time-tested LSI2008 HBA?
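The CentOS comparison above boils down to three steps: write a large file, drop the page cache, read it back. Collected into one sequence it would look roughly like this; the mkfs/mount steps were not shown in the original post and are assumptions, as are the mount point and file name.

# assumed setup (not shown above): put a filesystem on the md array and mount it
mkfs.ext4 /dev/md0
mount /dev/md0 /mnt/test
cd /mnt/test

# sequential write
dd if=/dev/zero of=zero bs=1M count=5000

# flush dirty data and drop the page cache so the read really hits the disks
sync
echo 3 > /proc/sys/vm/drop_caches

# sequential read
dd if=zero of=/dev/null bs=1M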
Re: [zfs-discuss] slow speed problem with a new SAS shelf
On 23.07.2012 19:39, Richard Elling wrote:
>> I faced a strange performance problem with a new disk shelf. We have been using a ZFS system with SATA disks for a while.
>
> What OS and release?

Oh, I forgot this important detail. It is OpenIndiana oi_151a5 now.
[zfs-discuss] slow speed problem with a new SAS shelf
Hello.

I am facing a strange performance problem with a new disk shelf. We have been using a ZFS system with SATA disks for a while: a Supermicro SC846-E16 chassis, a Supermicro X8DTH-6F motherboard with 96GB of RAM, and 24 HITACHI HDS723020BLA642 SATA disks attached to the onboard LSI 2008 controller.

Pretty much satisfied with it, we bought an additional shelf with SAS disks for hosting VMs. The new shelf is a Supermicro SC846-E26 chassis; the disk model is HITACHI HUS156060VLS600 (15K 600GB SAS2). An additional LSI 9205-8e controller was installed in the server and connected to the JBOD. I initially connected the JBOD with two channels and set up multipath, but when I noticed the performance problem I disabled multipath and disconnected one cable (to be sure multipath is not the cause of the problem).

Problem description follows.

Creating a test pool with 5 pairs of mirrors (new shelf, SAS disks):

# zpool create -o version=28 -O primarycache=none test mirror c9t5000CCA02A138899d0 c9t5000CCA02A102181d0 mirror c9t5000CCA02A13500Dd0 c9t5000CCA02A13316Dd0 mirror c9t5000CCA02A005699d0 c9t5000CCA02A004271d0 mirror c9t5000CCA02A004229d0 c9t5000CCA02A1342CDd0 mirror c9t5000CCA02A1251E5d0 c9t5000CCA02A1151DDd0

(primarycache=none is set to exclude ARC influence.)

Testing sequential write:

# dd if=/dev/zero of=/test/zero bs=1M count=2048
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 1.04272 s, 2.1 GB/s

iostat while writing looks like this:

   r/s     w/s   kr/s      kw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
   0.0  1334.6    0.0  165782.9   0.0   8.4     0.0     6.3   1  86  c9t5000CCA02A1151DDd0
   0.0  1345.5    0.0  169575.3   0.0   8.7     0.0     6.5   1  88  c9t5000CCA02A1342CDd0
   2.0  1359.5    1.0  168969.8   0.0   8.7     0.0     6.4   1  90  c9t5000CCA02A13500Dd0
   0.0  1358.5    0.0  168714.0   0.0   8.7     0.0     6.4   1  90  c9t5000CCA02A13316Dd0
   0.0  1345.5    0.0      19.3   0.0   9.0     0.0     6.7   1  92  c9t5000CCA02A102181d0
   1.0  1317.5    1.0  164456.9   0.0   8.5     0.0     6.5   1  88  c9t5000CCA02A004271d0
   4.0  1342.5    2.0  166282.2   0.0   8.5     0.0     6.3   1  88  c9t5000CCA02A1251E5d0
   0.0  1377.5    0.0  170515.5   0.0   8.7     0.0     6.3   1  90  c9t5000CCA02A138899d0

Now the read:

# dd if=/test/zero of=/dev/null bs=1M
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 13.5681 s, 158 MB/s

iostat while reading:

   r/s     w/s     kr/s  kw/s  wait  actv  wsvc_t  asvc_t  %w  %b  device
 106.0     0.0  11417.4   0.0   0.0   0.2     0.0     2.4   0  14  c9t5000CCA02A004271d0
  80.0     0.0  10239.9   0.0   0.0   0.2     0.0     2.4   0  10  c9t5000CCA02A1251E5d0
 110.0     0.0  12182.4   0.0   0.0   0.1     0.0     1.3   0   9  c9t5000CCA02A138899d0
 102.0     0.0  11664.4   0.0   0.0   0.2     0.0     1.8   0  15  c9t5000CCA02A005699d0
  99.0     0.0  10900.9   0.0   0.0   0.3     0.0     3.0   0  16  c9t5000CCA02A004229d0
 107.0     0.0  11545.4   0.0   0.0   0.2     0.0     1.9   0  13  c9t5000CCA02A1151DDd0
  81.0     0.0  10367.9   0.0   0.0   0.2     0.0     2.2   0  11  c9t5000CCA02A1342CDd0

Unexpectedly low speed! Note the busy column: when writing it is about 90%, when reading only about 15%.

Individual raw disk read speed (don't be confused by the name change: I connected the JBOD to another HBA channel):

# dd if=/dev/dsk/c8t5000CCA02A13889Ad0 of=/dev/null bs=1M count=2000
2000+0 records in
2000+0 records out
2097152000 bytes (2.1 GB) copied, 10.9685 s, 191 MB/s
# dd if=/dev/dsk/c8t5000CCA02A1342CEd0 of=/dev/null bs=1M count=2000
2000+0 records in
2000+0 records out
2097152000 bytes (2.1 GB) copied, 10.8024 s, 194 MB/s

The 10-disk mirror zpool reads more slowly than a single disk. There is no tuning in /etc/system.

I tried the test with a FreeBSD 8.3 live CD; reads were the same (about 150 MB/s). I also tried SmartOS, but it can't see the disks behind the LSI 9205-8e controller.
For comparison, this is the speed from the SATA pool (it consists of four 6-disk raidz2 vdevs):

# dd if=CentOS-6.2-x86_64-bin-DVD1.iso of=/dev/null bs=1M
4218+1 records in
4218+1 records out
4423129088 bytes (4.4 GB) copied, 4.76552 s, 928 MB/s

     r/s   w/s      kr/s  kw/s  wait  actv  wsvc_t  asvc_t  %w   %b  device
 13614.4   0.0  800338.5   0.0   0.1  36.0     0.0     2.6   0  914  c6
   459.9   0.0   25761.4   0.0   0.0   0.8     0.0     1.8   0   22  c6t5000CCA369D16860d0
    84.0   0.0    2785.2   0.0   0.0   0.2     0.0     3.0   0   13  c6t5000CCA369D1B1E0d0
   836.9   0.0   50089.5   0.0   0.0   2.6     0.0     3.1   0   60  c6t5000CCA369D1B302d0
   411.0   0.0   24492.6   0.0   0.0   0.8     0.0     2.1   0   25  c6t5000CCA369D16982d0
   821.9   0.0   49385.1   0.0   0.0   3.0     0.0     3.7   0   67  c6t5000CCA369CFBDA3d0
   231.0   0.0   12292.5   0.0   0.0   0.5     0.0     2.3   0   18  c6t5000CCA369D17E73d0
   803.9   0.0   50091.5   0.0   0.0   2.9     0.0     3.6   1   69  c6t5000CCA369D0EA93d0

PS: Before testing I flashed the latest firmware and BIOS to the LSI 9205-8e. It came with factory version 9; I flashed version 13.5. Now I think it was not worth such a hurry. Then I downgraded it to version 12. Read speed remained the same.
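One more check that was not in the original post but helps tell per-disk speed from HBA/expander bandwidth: read several raw disks in parallel and watch the aggregate. A sketch, using the two c8t... disks sampled above (add more device names as needed):

# parallel raw reads; if the aggregate scales with the number of disks,
# the HBA/expander link is not the bottleneck
for d in c8t5000CCA02A13889Ad0 c8t5000CCA02A1342CEd0; do
    dd if=/dev/dsk/$d of=/dev/null bs=1M count=2000 &
done
wait

# in another terminal, watch per-device and controller totals:
# iostat -xn 1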
[zfs-discuss] volblocksize for VMware VMFS-5
Hello.

What are the best practices for choosing the ZFS volume volblocksize for VMware VMFS-5? The VMFS-5 block size is 1MB; I am not sure how that corresponds to ZFS.

Setup details:
- 11 pairs of mirrors;
- 600GB 15K SAS disks;
- SSDs for L2ARC and ZIL;
- COMSTAR FC target;
- about 30 virtual machines, mostly Windows (so the underlying file system is NTFS with a 4K block size);
- 3 ESXi hosts.

Also, I will be glad to hear volume layout suggestions. I see several options:
- one big zvol equal to the size of the pool;
- one big zvol sized at the pool minus 20% (to avoid fragmentation);
- several zvols (of what size?).

Thanks for your attention.
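For reference, creating a zvol with an explicit volblocksize and exporting it over COMSTAR looks roughly like the sketch below. The pool name, zvol name, 2T size and the 64K volblocksize are placeholders chosen for illustration, not a recommendation from this thread; volblocksize cannot be changed after creation, so it has to be decided before the first VM lands on the LUN.

# create a zvol with an explicit block size (immutable after creation)
zfs create -V 2T -o volblocksize=64K tank/vmfs01

# expose it as a COMSTAR logical unit and make it visible to the initiators
stmfadm create-lu /dev/zvol/rdsk/tank/vmfs01
stmfadm add-view <LU-GUID-printed-by-create-lu>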
Re: [zfs-discuss] convert output zfs diff to something human readable
> Is it possible to convert the "octal representation" in zfs diff output to something human readable? With iconv, maybe? Please see the screenshot: http://i.imgur.com/bHhXV.png (I created a file with a Russian name there). The OS is Solaris 11 Express.

This command did the job:

zfs diff | perl -plne 's#\\\d{8}(\d{3})#chr(oct ($1)-oct (400))#ge; s#\\040# #g'
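A more generic variant, not from the original thread: the snapshot names below are hypothetical, and it assumes the escapes are plain three-digit octal byte values (as the \040 space escape above suggests) being viewed in a UTF-8 locale. It simply turns every \NNN escape back into the corresponding raw byte:

# decode \NNN octal escapes emitted by zfs diff back into raw bytes;
# a UTF-8 terminal then displays the non-ASCII file names normally
zfs diff tank/fs@snap1 tank/fs@snap2 | perl -plne 's#\\(\d{3})#chr(oct($1))#ge'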
[zfs-discuss] convert output zfs diff to something human readable
Hello.

Is it possible to convert the "octal representation" in zfs diff output to something human readable? With iconv, maybe? Please see the screenshot: http://i.imgur.com/bHhXV.png (I created a file with a Russian name there). The OS is Solaris 11 Express.
Re: [zfs-discuss] WarpDrive SLP-300
> http://www.lsi.com/channel/about_channel/whatsnew/warpdrive_slp300/index.html

I think drivers will be the problem.
Re: [zfs-discuss] HP ProLiant N36L
On 28.09.2010 10:45, Brandon High wrote:
>> Has anyone had any luck getting either OpenSolaris or FreeBSD with ZFS working on it?
>
> I looked at it some, and all the hardware should be supported. There is a half-height PCIe x16 and a x1 slot as well.

Has anybody already bought this microserver? :)
Re: [zfs-discuss] 4k block alignment question (X-25E)
On 31.08.2010 21:23, Ray Van Dolson wrote:
> Here's an article with some benchmarks:
> http://wikis.sun.com/pages/viewpage.action?pageId=186241353
> Seems to really impact IOPS.

This is really interesting reading. Can someone do the same tests with an Intel X25-E?
Re: [zfs-discuss] New SSD options
> As for the Vertex drives: if they are within +/-10% of the Intel, they're still doing it for half of what the Intel drive costs, so it's an option. Not a great option, but still an option.

Yes, but the Intel is SLC. Much more endurance.
Re: [zfs-discuss] rpool on ssd. endurance question.
Hello.

> Is all this data what you're looking for?

Yes, thank you, Paul.
Re: [zfs-discuss] rpool on ssd. endurance question.
>> If anybody has been using an SSD for rpool for more than half a year, can you post the SMART information for the host-writes attribute? I want to see how an SSD wears when used as a system disk.
>
> I'd be happy to, exactly what commands shall I run?

Hm. I'm only experimenting with OpenSolaris in a virtual machine right now, so unfortunately I can't give you an exact how-to. But I think it is possible to compile smartmontools (http://smartmontools.sourceforge.net) and get the SMART attributes with something like:

smartctl -a /dev/rdsk/c0t0d0s0

You need to install the SUNWgcc package to compile. Take a look at http://opensolaris.org/jive/thread.jspa?threadID=120402 and http://opensolaris.org/jive/thread.jspa?threadID=124372 .
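Once smartctl works, the wear-related counters can be pulled out directly. This is only a sketch: the attribute names vary by vendor (Intel drives report attributes like Host_Writes and Media_Wearout_Indicator, others differ), so treat the grep pattern and the device path as placeholders.

# dump only the vendor attribute table and pick out the wear-related counters
smartctl -A /dev/rdsk/c0t0d0s0 | egrep -i 'host.*write|wear|lifetime'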
[zfs-discuss] rpool on ssd. endurance question.
Hello.

If anybody has been using an SSD for rpool for more than half a year, can you post the SMART information for the host-writes attribute? I want to see how an SSD wears when used as a system disk.
Re: [zfs-discuss] Workaround for mpt timeouts in snv_127
Hello.

> We are seeing the problem on both Sun and non-Sun hardware. On our Sun Thumper X4540 we can reproduce it on all 3 devices. Our configuration is large stripes with only 2 vdevs. Doing a simple scrub will show the typical mpt timeout. We are running snv_131.

Has anybody observed similar problems with the new LSI 2008 SAS2 6Gb HBAs?
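For anyone trying to confirm the same symptom, a low-impact way to reproduce and watch for it (a sketch: the pool name is a placeholder, and the log location is the stock Solaris one) is to kick off a scrub and follow the system log for mpt complaints:

# start a scrub and watch for mpt timeout / reset messages while it runs
zpool scrub tank
tail -f /var/adm/messages | egrep -i 'mpt|timeout|reset'

# per-device soft/hard/transport error counters accumulate here as well
iostat -En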