Re: [zfs-discuss] recovering from zfs destroy -r
- Original Message -
> Hi, is there a simple way of rolling back to a specific TXG of a volume
> to recover from such a situation?

You can't undo a zfs destroy - restore from backup...

--
Vennlige hilsener / Best regards

Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
[Norwegian, translated:] In all pedagogy it is essential that the curriculum be presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of foreign origin. In most cases adequate and relevant synonyms exist in Norwegian.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Server with 4 drives, how to configure ZFS?
- Original Message -
From: Dave U. Random anonym...@anonymitaet-im-inter.net
Date: Tuesday, June 21, 2011 18:32
Subject: Re: [zfs-discuss] Server with 4 drives, how to configure ZFS?
To: zfs-discuss@opensolaris.org

> Hello Jim! I understood ZFS doesn't like slices, but from your reply maybe
> I should reconsider. I have a few older servers with 4 bays x 73G. If I
> make a root mirror pool and swap on the other 2 as you suggest, then I
> would have about 63G x 4 left over.

For the sake of completeness, I should mention that you can also create a fast and redundant 4-way mirrored root pool ;)

> If so then I am back to wondering what to do about 4 drives. Is raidz1
> worthwhile in this scenario? That is less redundancy than a mirror and
> much less than a 3-way mirror, isn't it? Is it even possible to do raidz2
> on 4 slices? Or would 2 2-way mirrors be better? I don't understand what
> RAID10 is, is it simply a stripe of two mirrors?

Yes, by that I meant striping over two mirrors.

> Or would it be best to do a 3-way mirror and a hot spare? I would like to
> be able to tolerate losing one drive without loss of integrity.

Any of the scenarios above allow you to lose one drive and not lose data immediately.
The rest is a compromise between performance, space and further redundancy:

* 3- or 4-way mirror: least usable space (25% of total disk capacity), most redundancy, highest read speeds for concurrent loads
* striping of mirrors (raid10): average usable space (50%), high read speeds for concurrent loads, can tolerate loss of up to 2 drives (slices) in a good scenario (if they are from different mirrors)
* raidz2: average usable space (50%), can tolerate loss of any 2 drives
* raidz1: max usable space (75%), can tolerate loss of any 1 drive

After all the recent discussions about performance on this forum, I would not try to guess which would perform better in general - raidz1 or raidz2 (there are reads, writes, scrubs and resilvers, seemingly all with different preferences as to disk layout), but with the generic workload we have (i.e. serving up zones with some development databases and J2SE app servers) this was not seen to matter much. So for us it was usually raidz2 for tolerance or raidz1 for space.

> I will be doing new installs of Solaris 10. Is there an option in the
> installer for me to issue ZFS commands and set up pools, or do I need to
> format the disks before installing, and if so how do I do that?

Unfortunately, I last installed Solaris 10u7 or so from scratch; the others were Live Upgrades of existing systems and OpenSolaris machines, so I am not certain. From what I gather, the text installer is much more powerful than the graphical one, and its ZFS root setup might encompass creating a root pool in a slice of a given size, and possibly mirroring it right away. Maybe you can do likewise in JumpStart, but we did not do that after all. Anyhow, after you install a ZFS root of sufficient size (i.e.
our minimalist Solaris 10 installs are often under 1-2Gb per boot environment; multiply for storing different OEs for Live Upgrade and for snapshot history), you can create a slice for the data pool component (s3 in our setups), and then clone the disk slice layout to the other 3 drives like this:

# prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2

(you might need to install a slice table spanning 100% of the drives with the fdisk command first). Then you attach one of the slices to the ZFS root pool to make a mirror, if the installer did not do that:

# zpool attach rpool c1t0d0s0 c1t1d0s0

If you have several controllers (perhaps even on different PCI buses) you might want to pick a drive on a different controller than the first one, in order to have fewer SPoFs, but make sure that the second controller is bootable from BIOS. And make that drive bootable:

SPARC:
# installboot /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c1t1d0s0

x86/x86_64:
# installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1t1d0s0

For the two other drives you just create a new pool in slices *s0:

# zpool create swappool mirror c1t2d0s0 c1t3d0s0
# zfs create -V2g swappool/dump
# zfs create -V6g swappool/swap

Sizes are arbitrary here; they depend on your RAM sizing. You can later add swap from other pools, including the data pool. Dump device size can be tested by configuring dumpadm to use the new device - it would either refuse to use a device that is too small (then you recreate it bigger), or accept it. The installer would probably create dump and swap devices in your root pool; you may elect to destroy them since you have another swap device, at least. Make sure to update the /etc/vfstab file to reference the swap areas which your system should use further on. After this is all completed, you can create a data pool in the s3 slices with your chosen geometry, i.e.

# zpool create pool raidz2 c1t0d0s3 c1t1d0s3 c1t2d0s3 c1t3d0s3

In our setups
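For reference, the alternative data-pool geometries weighed earlier in this message differ only in the vdev specification passed to zpool create (a sketch using the same illustrative s3 slices):

```shell
# striped mirrors ("raid10"): 50% usable space, survives 2 drive losses if from different mirrors
zpool create pool mirror c1t0d0s3 c1t1d0s3 mirror c1t2d0s3 c1t3d0s3

# raidz1: 75% usable space, tolerates loss of any single drive
zpool create pool raidz1 c1t0d0s3 c1t1d0s3 c1t2d0s3 c1t3d0s3

# 4-way mirror: 25% usable space, most redundancy
zpool create pool mirror c1t0d0s3 c1t1d0s3 c1t2d0s3 c1t3d0s3
```

Only one of these would be created in practice, of course; the pool name and slices match the raidz2 example above.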
Re: [zfs-discuss] recovering from zfs destroy -r
Those were backups. What about
http://www.solarisinternals.com/wiki/index.php/ZFS_forensics_scrollback_script ?
I am willing to give it a go.

Thanks, P

On 27 Jun 2011, at 09:32, Roy Sigurd Karlsbakk <r...@karlsbakk.net> wrote:
>> Hi, Is there a simple way of rolling back to a specific TXG of a volume
>> to recover from such a situation?
> You can't undo a zfs destroy - restore from backup...
Re: [zfs-discuss] Server with 4 drives, how to configure ZFS?
> In this setup that will install everything on the root mirror, so I will
> have to move things around later? Like /var and /usr or whatever I don't
> want on the root mirror?

Actually, you do want /usr and much of /var on the root pool; they are integral parts of the svc:/filesystem/local needed to bring up your system to a usable state (regardless of whether the other pools are working or not). Depending on the OS version, you can do manual data migrations to separate datasets of the root pool, in order to keep some data common between OEs or to enforce different quotas or compression rules. For example, on SXCE and Solaris 10 (but not on oi_148a) we successfully splice out many filesystems in such a layout (the example below also illustrates multiple OEs):

# zfs list -o name,refer,quota,compressratio,canmount,mountpoint -t filesystem -r rpool
NAME                      REFER  QUOTA  RATIO  CANMOUNT  MOUNTPOINT
rpool                     7.92M   none  1.45x  on        /rpool
rpool/ROOT                  21K   none  1.38x  noauto    /rpool/ROOT
rpool/ROOT/snv_117         758M   none  1.00x  noauto    /
rpool/ROOT/snv_117/opt    27.1M   none  1.00x  noauto    /opt
rpool/ROOT/snv_117/usr     416M   none  1.00x  noauto    /usr
rpool/ROOT/snv_117/var     122M   none  1.00x  noauto    /var
rpool/ROOT/snv_129         930M   none  1.45x  noauto    /
rpool/ROOT/snv_129/opt     109M   none  2.70x  noauto    /opt
rpool/ROOT/snv_129/usr     509M   none  2.71x  noauto    /usr
rpool/ROOT/snv_129/var     288M   none  2.54x  noauto    /var
rpool/SHARED                18K   none  3.36x  noauto    legacy
rpool/SHARED/var            18K   none  3.36x  noauto    legacy
rpool/SHARED/var/adm      2.97M     5G  4.43x  noauto    legacy
rpool/SHARED/var/cores     118M     5G  3.44x  noauto    legacy
rpool/SHARED/var/crash    1.39G     5G  3.41x  noauto    legacy
rpool/SHARED/var/log       102M     5G  3.43x  noauto    legacy
rpool/SHARED/var/mail     66.4M   none  1.79x  noauto    legacy
rpool/SHARED/var/tmp        20K   none  1.00x  noauto    legacy
rpool/test                50.5K   none  1.00x  noauto    /rpool/test

Mounts of /var/* components are done via /etc/vfstab lines like:

rpool/SHARED/var/adm   -  /var/adm   zfs  -  yes  -
rpool/SHARED/var/log   -  /var/log   zfs  -  yes  -
rpool/SHARED/var/mail  -  /var/mail  zfs  -  yes  -
rpool/SHARED/var/crash -  /var/crash  zfs  -  yes  -
rpool/SHARED/var/cores -  /var/cores  zfs  -  yes  -

while the system paths /usr, /var and /opt are mounted by SMF services directly.

> And then I just make a RAID10 like Jim was saying with the other 4x60
> slices? How should I move mountpoints that aren't separate ZFS
> filesystems?
> The only conclusion you can draw from that is: First take it as a given
> that you can't boot from a raidz volume. Given, you must have one mirror.
> Thanks, I will keep it in mind. Then you raidz all the remaining space
> that's capable of being put into a raidz... And what you have left is a
> pair of unused spaces, equal to the size of your boot volume. You either
> waste that space, or you mirror it and put it into your tank.

...or use it as swap space :)

> I didn't understand what you suggested about appending a 13G mirror to
> tank. Would that be something like RAID10 without actually being RAID10
> so I could still boot from it? How would the system use it?

No, this would be an uneven striping over a raid10 (or raidzN) bank of 60Gb slices and a 13Gb mirror. ZFS can do that too, although for performance reasons unbalanced pools are not recommended and have to be forced on the command line. And you cannot boot from any pool other than a mirror or a single drive. Rationale: a single BIOS device must be sufficient to boot the system and contain all the data needed to boot.

> So RAID10 sounds like the only reasonable choice since there are an even
> number of slices, I mean is RAIDZ1 even possible with 4 slices?

Yes, it is possible with any number of slices starting from 3.

--
Климов Евгений, Jim Klimov
технический директор / CTO
ЗАО ЦОС и ВТ / JSC COSHT
+7-903-7705859 (cellular) mailto:jimkli...@cos.ru
CC: ad...@cos.ru, jimkli...@gmail.com
Re: [zfs-discuss] Server with 4 drives, how to configure ZFS?
> Hello Bob! Thanks for the reply. I was thinking about going with a 3-way
> mirror and a hot spare.

Keep in mind that you can have problems in Sol10u8 if you use a mirror+spare config for the root pool. Should be fixed in u9.

> But I don't think I can upgrade to larger drives unless I do it all at
> once, is that correct?

You can replace the drives one by one, but the pool will only expand when all the data drives have the new, bigger capacity.

//Jim
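The drive-by-drive replacement can be sketched as follows (pool and device names are made up; the autoexpand property exists on newer zpool versions - on older releases the pool grows only after an export/import or reboot instead):

```shell
zpool set autoexpand=on tank       # let the pool grow once all drives are bigger
zpool replace tank c1t0d0 c1t4d0   # swap one drive for a larger one
zpool status tank                  # confirm the resilver completed before the next swap
```

Repeat the replace/wait cycle for each drive; the extra capacity appears only after the last small drive is gone.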
[zfs-discuss] replace zil drive
Hello everybody,

some time ago an SSD within a ZIL mirror died. As I had no SSD available to replace it, I dropped in a normal SAS harddisk to rebuild the mirror. In the meantime I got the warranty replacement SSD.

Now I'm wondering about the best option to replace the HDD with the SSD:

1. Remove the log mirror, put the new disk in place, add the log mirror again.
2. Pull the HDD, forcing the mirror to fail, and replace the HDD with the SSD.

Unfortunately I have no free slot in the JBOD available (and want to keep the ZIL in the same JBOD as the rest of the pool), so:

3. Put an additional temporary SAS HDD in a free slot of a different JBOD, replace the HDD in the ZIL mirror with the temporary HDD, pull the now unused HDD, use the freed slot for the SSD, replace the temporary HDD with the SSD.

Any suggestions?

thx

Carsten

--
Max Planck Institut fuer marine Mikrobiologie
- Network Administration -
Celsiustr. 1
D-28359 Bremen
Tel.: +49 421 2028568
Fax.: +49 421 2028565
PGP public key: http://www.mpi-bremen.de/Carsten_John.html
Re: [zfs-discuss] write cache partial-disk pools (was Server with 4 drives, how to configure ZFS?)
2011-06-19 3:47, Richard Elling wrote:
> On Jun 16, 2011, at 8:05 PM, Daniel Carosone wrote:
>> On Thu, Jun 16, 2011 at 10:40:25PM -0400, Edward Ned Harvey wrote:
>>> From: Daniel Carosone [mailto:d...@geek.com.au]
>>> Sent: Thursday, June 16, 2011 10:27 PM
>>>> Is it still the case, as it once was, that allocating anything other
>>>> than whole disks as vdevs forces NCQ / write cache off on the drive
>>>> (either or both, I forget which; my guess is write cache)?
>>> I will only say that, regardless of whether or not that is or ever was
>>> true, I believe it's entirely irrelevant. Because your system performs
>>> read and write caching and buffering in RAM, the tiny little RAM on the
>>> disk can't possibly contribute anything.
>> I disagree. It can vastly help improve the IOPS of the disk and keep the
>> channel open for more transactions while one is in progress. Otherwise,
>> the channel is idle, blocked on command completion, while the heads seek.
> Actually, all of the data I've gathered recently shows that the number of
> IOPS does not significantly increase for HDDs running random workloads.
> However the response time does :-( My data is leading me to want to
> restrict the queue depth to 1 or 2 for HDDs. SSDs are another story; they
> scale much better in the response-time and IOPS vs. queue-depth analysis.

Now, is there going to be a tunable which would allow us to set queue depths per-device? Or are tunables so evil that you'd rather poke out your eye with a stick? (C) Richard Elling ;)
Re: [zfs-discuss] Fixing txg commit frequency
> I'd like to ask about whether there is a method to enforce a certain txg
> commit frequency on ZFS.

Well, there is a timer frequency based on TXG age (i.e. 5 sec by default now), set in /etc/system like this:

set zfs:zfs_txg_synctime = 5

There is also a buffer-size limit, like this (384Mb):

set zfs:zfs_write_limit_override = 0x18000000

or on the command line like this:

# echo zfs_write_limit_override/W0t402653184 | mdb -kw

We had similar spikes with big writes to a Thumper with SXCE in the pre-90's builds, when the system would stall for seconds while flushing a 30-second TXG full of data. Adding a reasonable megabyte limit solved the unresponsiveness problem for us, by making these flush-writes rather small and quick.

See also:
http://opensolaris.org/jive/thread.jspa?threadID=106453&start=15&tstart=0
http://opensolaris.org/jive/thread.jspa?messageID=347212
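The currently active value can be read back from the live kernel before and after tuning (a sketch; /D prints a 32-bit decimal, matching the 32-bit W0t write above):

```shell
# print the current write-limit override from the running kernel
echo "zfs_write_limit_override/D" | mdb -k
```

This is handy for confirming that the /etc/system setting actually took effect after a reboot.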
Re: [zfs-discuss] replace zil drive
I think that the least disruptive way would be to detach the HDD from the ZIL mirror and offline it, remove and replace it with the SSD, and then attach the SSD to the ZIL to make it a mirror again. Note that this would create a window of possible ZIL failure (and you had such a window already when the first SSD died), but the system *should* survive that (fall back to the on-pool ZIL after a short timeout of the dedicated device) - unless the power dies during this time.

- Original Message -
From: Carsten John cj...@mpi-bremen.de
Date: Monday, June 27, 2011 13:21
Subject: [zfs-discuss] replace zil drive
To: zfs-discuss@opensolaris.org
> [...]
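Jim's detach-and-reattach sequence, spelled out as commands (pool and device names are purely illustrative):

```shell
# Detach the temporary HDD half of the log mirror; the remaining SSD keeps serving the ZIL
zpool detach tank c2t23d0
# ...physically swap the HDD for the replacement SSD, then re-form the log mirror...
zpool attach tank c2t22d0 c2t24d0
# verify the log mirror resilvers and shows ONLINE again
zpool status tank
```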
[zfs-discuss] Encryption accelerator card recommendations.
I recently bought an HP Proliant Microserver for a home file server (pics and more here: http://arstechnica.com/civis/viewtopic.php?p=20968192 ). I installed 5 1.5TB (5900 RPM) drives, upgraded the memory to 8GB, and installed Solaris 11 Express without a hitch. A few simple tests using dd with 1gb and 2gb files showed excellent transfer rates: ~200 MB/sec on a 5-drive raidz2 pool, ~310 MB/sec on a five-drive pool with no redundancy.

That is, until I enabled encryption, which brought the transfer rates down to around 20 MB/sec... Obviously the CPU is the bottleneck here, and I'm wondering what to do next.

I can split the storage into file systems with and without encryption and allocate data accordingly. No need, for example, to encrypt open source code, or music. But I would like to have everything encrypted by default. My concern is not industrial espionage from a hacker in Belarus, but having a disk fail and sending it for repair with my credit card statements easily readable on it, etc.

I am new to (open or closed) Solaris. I found there is something called the Cryptographic Framework, and that there is hardware support for encryption. This server has two unused PCI-e slots, so I thought a card could be the solution, but the few I found seem to be geared to protect SSH and VPN connections, etc., not the file system. Cost is a factor also. I could build a similar server with a much faster processor for a few hundred dollars more, so a $1000 card for a $1000 file server is not a reasonable option.

Is there anything out there I could use?

Thanks,
Roberto Waltman
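Splitting encrypted and unencrypted data by dataset might look like this (a sketch for Solaris 11 Express; pool and dataset names are made up, and note that the encryption property can only be set at dataset creation time, not toggled later):

```shell
# encrypted dataset for sensitive files, keyed by a prompted passphrase
zfs create -o encryption=on -o keysource=passphrase,prompt tank/private
# plain dataset for music, source code and other non-sensitive data
zfs create tank/media
```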
Re: [zfs-discuss] replace zil drive
I'd guess removing the SLOG altogether might be the safest, so for this configuration:

    logs
      mirror-7   ONLINE  0  0  0
        c2t22d0  ONLINE  0  0  0
        c2t23d0  ONLINE  0  0  0

just `zpool remove zwimming mirror-7`, run `cfgadm -a` to find the full device path, do `cfgadm -c unconfigure <devpath>`, replace the drive, run devfsadm for good measure, check that it's connected with `cfgadm -a`, and add the SLOG again.

roy

- Original Message -
> I think that the least disruptive way would be to detach the HDD from the
> ZIL mirror and offline it, remove and replace it with the SSD, and then
> attach the SSD to the ZIL to make it a mirror again. [...]
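Roy's remove-and-re-add sequence as a command sketch (the cfgadm attachment-point syntax shown is an assumption and varies by controller):

```shell
zpool remove zwimming mirror-7          # drop the whole log mirror; ZIL falls back into the main pool
cfgadm -a | grep c2t23d0                # find the attachment point of the drive to pull
cfgadm -c unconfigure c2::dsk/c2t23d0   # illustrative attachment point
# ...swap the drive physically...
devfsadm                                # rebuild device links for good measure
cfgadm -a | grep c2t23d0                # confirm the new drive is configured
zpool add zwimming log mirror c2t22d0 c2t23d0
```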
Re: [zfs-discuss] Encryption accelerator card recommendations.
IMO a faster processor with built-in AES and other crypto support is most likely to give you the most bang for your buck, particularly if you're using closed Solaris 11, as Solaris engineering is likely to add support for new crypto instructions faster than Illumos (but I don't really know enough about Illumos to say for sure).

Nico
--
Re: [zfs-discuss] [solarisx86] Encryption accelerator card recommendations.
John Martin wrote:
> rob_waltman wrote:
>> I installed 5 1.5TB (5900 RPM) drives, upgraded the memory to 8GB, and
>> installed Solaris 11 Express without a hitch. A few simple tests using
>> dd with 1gb and 2gb files showed excellent transfer rates: ~200 MB/sec
>> on a 5 drive raidz2 pool, ~310 MB/sec on a five drive pool with no
>> redundancy.
> "No redundancy"? What does zpool status report?

Sorry, I am not near the machine to try it. What I meant by "no redundancy" is this: I partitioned the 5 disks identically -

(a) A 40GB boot partition (used only on the boot disk, planning to mirror it later)
(b) A 200GB "fast" partition
(c) Two equal-size "safe" partitions filling the rest of the disk (~600GB each?)

Then (the disks are on c7t0/1/2/3/5; c7t4 is an esata port):

# 1Tb "fast" pool for temporary storage, work in progress, etc.
zpool create ${props} fast2 c7t0d0p2 ... c7t5d0p2
zpool create ${props} safe3 raidz2 c7t0d0p3 ... c7t5d0p3
zpool create ${props} safe4 raidz2 c7t0d0p4 ... c7t5d0p4

where props contains my chosen property defaults: -O utf8only=on -O mountpoint=none -O atime=off -O encryption=on ... etc.

Roberto Waltman
Re: [zfs-discuss] Encryption accelerator card recommendations.
On 6/27/2011 9:55 AM, Roberto Waltman wrote:
> I recently bought an HP Proliant Microserver for a home file server. [...]
> That is, until I enabled encryption, which brought the transfer rates
> down to around 20 MB/sec... [...]
> Is there anything out there I could use?

You're out of luck. The hardware-encryption device is seen as a small market by the vendors, and they price accordingly. All the solutions are FIPS-compliant, which makes them non-trivially expensive to build/test/verify. I have yet to see the basic crypto accelerator - which should be as simple as an FPGA with downloadable (and updateable) firmware.
The other major cost point is the crypto plugins - sadly, there is no way to simply have the CPU farm off crypto jobs to a co-processor. That is, there's no way for the CPU to go "oh, that looks like I'm running a crypto algorithm - I should hand it over to the crypto co-processor." Instead, you have to write custom plugins/drivers/libraries for each OS, and pray that each OS has some standardized crypto framework. Otherwise, you have to link apps with custom libraries.

I'm always kind of surprised that there hasn't been a movement to create standardized crypto commands, like the various FP-specific commands that are part of MMX/SSE/etc. That way, most of this could be done in hardware seamlessly.

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
Re: [zfs-discuss] Encryption accelerator card recommendations.
On Mon, June 27, 2011 15:24, Erik Trimble wrote:
[...]
> I'm always kind of surprised that there hasn't been a movement to create
> standardized crypto commands, like the various FP-specific commands that
> are part of MMX/SSE/etc. That way, most of this could be done in hardware
> seamlessly.

The (Ultra)SPARC T-series processors do, but to a certain extent it goes against a CPU manufacturer's best (financial) interest to provide this: crypto is very CPU-intensive using 'regular' instructions, so if you need to do a lot of it, you are forced to purchase the manufacturer's top-of-the-line CPUs, and to have as many sockets as you can to handle the load (and presumably you need to do useful work besides just en/decrypting traffic). If you have special instructions that do the operations efficiently, it means that you're not chewing up cycles as much, so a less powerful (and cheaper) processor can be purchased.

I'm sure all the Web 2.0 companies would love to have these (and have OpenSSL use the instructions), so they could simply enable HTTPS for everything. (Of course it'd also be helpful for data-at-rest, on-disk encryption as well.) The last benchmarks I saw indicated that the SPARC T-series could do 45 Gb/s AES or some such, with gobs of RSA operations as well.
Re: [zfs-discuss] Encryption accelerator card recommendations.
On 6/27/2011 1:13 PM, David Magda wrote:
> The (Ultra)SPARC T-series processors do, but to a certain extent it goes
> against a CPU manufacturer's best (financial) interest to provide this.
> [...] The last benchmarks I saw indicated that the SPARC T-series could
> do 45 Gb/s AES or some such, with gobs of RSA operations as well.

The T-series crypto isn't what I'm thinking of. AFAIK, you still need to use the crypto framework in Solaris to access the on-chip functionality, which makes the T-series no different than a CPU without a crypto module but with a crypto add-in board instead.

What I'm thinking of is something along the lines of what AMD proposed a while ago, in combination with how we used to handle hardware that had FP optional. That is, you continue to make CPUs without any crypto functionality, EXCEPT that they support certain extensions a la MMX. If no crypto accelerator was available, the CPU would trap any crypto calls and force them to be done in software.
You could then stick a crypto accelerator in a second CPU socket, and the CPU would recognize it was there and pipe crypto calls to the dedicated co-processor. Think about how things were done with the i386 and i387. That's what I'm after. With modern CPU buses like the ones AMD and Intel support, plopping a co-processor into another CPU socket would really, really help.
Re: [zfs-discuss] Encryption accelerator card recommendations.
On Jun 27, 2011, at 17:16, Erik Trimble wrote: Think about how things were done with the i386 and i387. That's what I'm after. With modern CPU buses like those AMD and Intel support, plopping a co-processor into another CPU socket would really, really help. Given the amount of transistors available nowadays, I think it'd be simpler to just create a series of SIMD instructions right in/on general CPUs and skip the whole co-processor angle. There's more and more sensitive data out there, so on-disk crypto could be deployed in more places to help prevent data loss (on both servers and desktops/laptops), and those systems that don't do disk IO probably do network IO, and so would be helped from a TLS/SSL/SSH perspective. If I were AMD I'd seriously be thinking about this, as it'd help boost volume and mindshare for a little while: all the folks doing any kind of web activity would pick up kit for HTTPS, at least until Intel brought out a similar thing.
Re: [zfs-discuss] Encryption accelerator card recommendations.
On 06/27/11 15:24, David Magda wrote: Given the amount of transistors that are available nowadays I think it'd be simpler to just create a series of SIMD instructions right in/on general CPUs, and skip the whole co-processor angle. see: http://en.wikipedia.org/wiki/AES_instruction_set Present in many current Intel CPUs; also expected to be present in AMD's Bulldozer-based CPUs.
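For anyone curious whether their x86 box already has these instructions, here is a quick check of the advertised CPU flags. This is a sketch that assumes a Linux-style /proc/cpuinfo; on other systems it just reports "unknown":

```shell
# Does this CPU advertise the AES-NI instruction set?
# /proc/cpuinfo is Linux-specific, so fall back to "unknown" elsewhere.
if [ -r /proc/cpuinfo ]; then
    if grep -qw aes /proc/cpuinfo; then
        aesni=yes
    else
        aesni=no
    fi
else
    aesni=unknown
fi
echo "AES-NI: $aesni"
```

On Solaris, `isainfo -v` similarly lists the instruction-set extensions the kernel will use.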
Re: [zfs-discuss] replace zil drive
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Carsten John Now I'm wondering about the best option to replace the HDD with the SSD: What version of zpool are you running? If it's >= 19, then you could actually survive a complete ZIL device failure, so you could simply offline or detach the HDD and then either attach or add the new SSD. Attach would give you a mirror; add would give you two separate non-mirrored log devices. Maybe better performance, maybe not. If it's < 19, then you absolutely do not want to degrade to non-mirrored status. First attach the new SSD, then, when it's done resilvering, detach the HDD.
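The attach-first procedure might look like the following sketch, where 'tank' is the pool and c0t1d0 (existing HDD log device) and c0t2d0 (new SSD) are hypothetical device names; substitute your own before running anything:

```shell
# Attach the SSD to the existing log device, forming a mirrored log
# so the pool is never left with an unmirrored ZIL:
zpool attach tank c0t1d0 c0t2d0

# Watch the resilver; only proceed once it reports completion:
zpool status tank

# Then drop the HDD out of the log mirror, leaving the SSD alone:
zpool detach tank c0t1d0
```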
Re: [zfs-discuss] Encryption accelerator card recommendations.
On Jun 27, 2011, at 18:32, Bill Sommerfeld wrote: On 06/27/11 15:24, David Magda wrote: Given the amount of transistors that are available nowadays I think it'd be simpler to just create a series of SIMD instructions right in/on general CPUs, and skip the whole co-processor angle. see: http://en.wikipedia.org/wiki/AES_instruction_set Present in many current Intel CPUs; also expected to be present in AMD's Bulldozer-based CPUs. Now compare that with the T-series stuff, which also handles 3DES, RC4, RSA2048, DSA, DH, ECC, MD5, SHA1, and SHA2, as well as a hardware RNG: http://en.wikipedia.org/wiki/UltraSPARC_T2 http://blogs.oracle.com/BestPerf/entry/20100920_sparc_t3_pk11rsaperf The initial TLS/SSL setup is actually the expensive part (20-58% of the time spent on the 'transaction'), and AES can be performed decently even on non-AESNI CPUs: simply adding an RSA accelerator can double performance without session caching, and still adds ~20% with it. SSL session caching alone can improve throughput by a factor of more than two. Performance Analysis of TLS Web Servers http://www.cs.rice.edu/~dwallach/pub/tls-tocs.pdf http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.16.1403 AESNI is certainly better than nothing, but RSA, SHA, and the RNG would be nice as well. It'd also be handy for ZFS crypto in addition to all the network IO stuff.
Re: [zfs-discuss] Encryption accelerator card recommendations.[GPU acceleration of ZFS]
FYI, there is another thread named -- GPU acceleration of ZFS -- in this list discussing the possibility of utilizing the power of GPGPU. I posted here: Good day, I think ZFS can take advantage of using a GPU for SHA-256 calculation, encryption, and maybe compression. Modern video cards, like the ATI HD 5xxx or 6xxx series, can compute SHA-256 50-100 times faster than a modern four-core CPU. The kgpu project for Linux shows nice results. 'zfs scrub' would run freely on high-performance ZFS pools. The only problem is that there are no AMD/Nvidia drivers for Solaris that support hardware-assisted OpenCL. Is anyone interested in it? Best regards, Anatoly Legkodymov. On Tue, May 10, 2011 at 11:29 AM, Anatoly legko...@fastmail.fm wrote: Modern video cards, like the ATI HD 5xxx or 6xxx series, can compute SHA-256 50-100 times faster than a modern four-core CPU. Ignoring optimizations from SIMD extensions like SSE and friends, this is probably true. However, the GPU also has to deal with the overhead of transferring data to itself before it can even begin crunching it. Granted, a Gen. 2 x16 link is quite speedy, but is CPU performance really so poor that a GPU can still out-perform it? My undergrad thesis dealt with computational acceleration using CUDA, and the datasets had to scale quite a ways before there was a noticeable advantage to using a Tesla or similar over a bog-standard i7-920. The only problem is that there are no AMD/Nvidia drivers for Solaris that support hardware-assisted OpenCL. This, and keep in mind that most of the professional users here will likely be using professional hardware, where a simple 8MB Rage XL gets the job done thanks to the magic of out-of-band management cards and other such facilities.
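For a sense of the work being discussed: a scrub of a sha256-checksummed pool recomputes one digest per block, each equivalent to the following (a sketch using OpenSSL's command-line tool; the presence of `openssl` and `awk` is an assumption):

```shell
# Compute one SHA-256 digest on the CPU, the same primitive a
# 'zfs scrub' repeats for every block when checksum=sha256.
# 'openssl dgst -r' prints "digest *stdin"; awk keeps the digest.
digest=$(printf 'hello' | openssl dgst -sha256 -r | awk '{print $1}')
echo "$digest"
```

Multiply that by millions of 128K blocks and the appeal of offloading, whether to a GPU or to CPU instructions, is clear; the open question raised above is whether the PCIe transfer overhead eats the gain for small blocks.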
Even as a home user, I have not placed a high-end video card in my machine; I use a $5 ATI PCI video card that saw about an hour of use whilst I installed Solaris 11. -- --khd IMHO, ZFS needs to run on all kinds of hardware. T-series CMT servers have had hardware that can help with SHA calculation since the T1 days, but I have not seen any work in ZFS to take advantage of it. On 5/10/2011 11:29 AM, Anatoly wrote: [...] The only problem is that there are no AMD/Nvidia drivers for Solaris that support hardware-assisted OpenCL. Is anyone interested in it? This isn't technically true. The NVIDIA drivers support compute, but there are other parts of the toolchain missing. /* I don't know about ATI/AMD, but I'd guess they likely don't support compute across platforms */ /* Disclaimer - The company I work for has a working HMPP compiler for Solaris/FreeBSD and we may soon support CUDA */ On 10 May 2011, at 16:44, Hung-Sheng Tsao (LaoTsao) Ph. D. 
wrote: IMHO, ZFS needs to run on all kinds of hardware. T-series CMT servers have had hardware that can help with SHA calculation since the T1 days, but I have not seen any work in ZFS to take advantage of it. That support would be in the crypto framework though, not ZFS per se. So I think the OP might consider how best to add GPU support to the crypto framework. Chris ___ Thanks. Fred
Re: [zfs-discuss] Encryption accelerator card recommendations.[GPU acceleration of ZFS]
On Jun 27, 2011, at 22:03, Fred Liu wrote: FYI There is another thread named -- GPU acceleration of ZFS in this list to discuss the possibility to utilize the power of GPGPU. I posted here: In a similar vein, I recently came across SSLShader: http://shader.kaist.edu/sslshader/ http://www.usenix.org/event/nsdi11/tech/full_papers/Jang.pdf http://www.google.com/search?q=sslshader This could be handy for desktops doing ZFS crypto (and even browser SSL and/or SSH), but few servers have decent graphics cards (and SPARC systems don't even have video ports by default :).
Re: [zfs-discuss] Encryption accelerator card recommendations.
On Jun 27, 2011 9:24 PM, David Magda dma...@ee.ryerson.ca wrote: AESNI is certainly better than nothing, but RSA, SHA, and the RNG would be nice as well. It'd also be handy for ZFS crypto in addition to all the network IO stuff. The most important reason for AES-NI might not be performance but defeating side-channel attacks. Also, really fast AES hardware makes AES-based hash functions quite tempting. No, AES-NI is nothing to sneeze at. Nico --
Re: [zfs-discuss] Encryption accelerator card recommendations.
On Jun 27, 2011 4:15 PM, David Magda dma...@ee.ryerson.ca wrote: The (Ultra)SPARC T-series processors do, but to a certain extent it goes against a CPU manufacturer's best (financial) interest to provide this: crypto is very CPU-intensive using 'regular' instructions, so if you need to do a lot of it, you are forced to purchase a manufacturer's top-of-the-line CPUs, with as many sockets as you can to handle the load (and presumably you need to do useful work besides just en/decrypting traffic). I hope no CPU vendor thinks about the economics of chip making that way. I actually doubt any do. Nico --