Re: [zfs-discuss] maczfs / ZEVO
On 17 Feb 2013, at 15:15, Edward Ned Harvey (opensolarisisdeadlongliveopensolaris) opensolarisisdeadlongliveopensola...@nedharvey.com wrote:

From: Tim Cook [mailto:t...@cook.ms] Sent: Friday, February 15, 2013 11:14 AM
I have a few coworkers using it. No horror stories and it's been in use about 6 months now. If there were any showstoppers I'm sure I'd have heard loud complaints by now :)

So, I have discovered a *couple* of unexpected problems.

At first, I thought it would be nice to split my HD into two partitions, use the second partition for the zpool, and use a vmdk wrapper around a zvol raw device. So I started partitioning my HD. As it turns out, there's a bug in Disk Utility... As long as you partition your hard drive and *format* the second partition with HFS+, it works very smoothly. But then I couldn't find any way to dismount the second partition (there is no eject)... If I go back, I think maybe I'll figure it out, but I didn't try too hard... I resized back to normal, and then split again, selecting the Free Space option for the second partition. Bad idea. Disk Utility horked the partition tables, and I had to restore from Time Machine. I thought maybe it was just a fluke, so I repeated the whole process a second time... tried to split the disk, tried to make the second half Free Space, and was forced to restore the system. Lesson learned: don't try to create an unused partition on the Mac HD.

So then I just created one big honking file via dd and used it as the zpool store. Tried to create a zvol. Unfortunately ZEVO doesn't do zvols. OK, no problem. Windows can run NTFS inside a vmdk file inside a ZFS filesystem inside a file on the HFS+ filesystem. (Yuck.) But it works. Unfortunately, because the backend is a file, ZEVO doesn't find the pool on reboot. It doesn't seem to do the equivalent of a zpool.cache. I've asked a question in their support forum to see if there's some way to solve that problem, but I don't know yet.
Tim, Simon, Volker, Chris, and Erik - how do you use it? I am making the informed guess that you're using it primarily on non-laptops, which have second hard drives, and you're giving the entire disk to the zpool. Right?

Actually, my usage is with a laptop, but I've pretty much given up on doing anything serious in ZFS without going whole-disk, so I hadn't run across the partitioning issues or the lack of a zpool.cache for mounting file-based pools.

Back to the day-to-day usage. I'm using it primarily with my MacBook Air, and I have a Seagate GoFlex Thunderbolt adapter into which I plug SSDs holding VMs and sources. While on the move, I leave the external drive in my bag and use a 1m Thunderbolt cable, so I'm tethered to the bag, but it's usable. Eventually, I'll probably get one of the StarTech 4-disk toaster docks on USB 3 for while I'm at the office, and continue to rely on the Thunderbolt SSD while on the road.

On the partitioning front, after thinking a bit, you should be able to tell ZEVO to use a second partition on the main disk. The trick would be creating the partition normally as an HFS+ volume, unmounting it with something like sudo diskutil unmount disk0s4, followed by sudo zpool create zevo disk0s4.

Oh, other side notes I almost forgot. Since the ARC can chew up all of your memory, it's also a good idea to disable Spotlight searching on ZFS volumes (sudo mdutil -i off /Volumes/Zevo).

Cheers, Erik

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
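[Editor's note] Putting Erik's steps together, the whole second-partition workflow might look like the sketch below. disk0s4 is only an example slice; verify your own identifier with diskutil list before running anything, and the /Volumes/zevo mountpoint is an assumption about where ZEVO surfaces the pool.

```shell
# Sketch of the second-partition approach described above.
# disk0s4 is an example -- check `diskutil list` for the real slice.
sudo diskutil unmount disk0s4        # release the freshly created HFS+ volume
sudo zpool create zevo disk0s4       # hand the slice to ZEVO as a single-device pool
sudo mdutil -i off /Volumes/zevo     # keep Spotlight from churning the ARC
```

Adjust the mdutil path to wherever the pool actually mounts on your system.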
Re: [zfs-discuss] maczfs / ZEVO
I've been using it happily since before the GreenBytes purchase. You're currently limited to pool version 28. I generally use it with external USB drives (single-disk pools), but I have tested file-based RAIDZ pools, which worked fine.

The only caveat I will note, particularly for working with VMs (my primary use case as well), is that you can run into situations where the OS is RAM-starved with the ARC filling up. I've run into cases where Fusion refused to boot VMs, claiming not enough memory, after I had been using another machine for a while. Ejecting the pool will generally clear out the ARC (allocated to the kernel) so that you can reinsert it and then start the VM.

It's a full implementation as far as I can tell, including zfs send/recv, so you can easily back up across the network without having to plug your disks into the other server. I'd put it in the reliable camp (at the very least, more reliable than HFS+ or ExFAT on cheap 2.5" drives).

Cheers, Erik

On 15 Feb 2013, at 17:08, Edward Ned Harvey (opensolarisisdeadlongliveopensolaris) opensolarisisdeadlongliveopensola...@nedharvey.com wrote:

Anybody using maczfs / ZEVO? Have good or bad things to say, in terms of reliability, performance, features? My main reason for asking is this: I have a Mac, I use Time Machine, and I have VMs inside. Time Machine, while great in general, has the limitation of being unable to intelligently identify changed bits inside a VM file. So you have to exclude the VM from Time Machine, and you have to run backup software inside the VM. I would greatly prefer, if it's reliable, to let the VM reside on ZFS and use zfs send to back up my guest VMs. I am not looking to replace HFS+ as the primary filesystem of the Mac; although that would be cool, there's often a reliability benefit to staying on the supported, beaten-path, standard configuration. But if ZFS can be used to hold the guest VM storage reliably, I would benefit from that.
Re: [zfs-discuss] all in one server
On 18 Sept 2012, at 16:40, Dan Swartzendruber dswa...@druber.com wrote:

On 9/18/2012 10:31 AM, Eugen Leitl wrote:
I'm currently thinking about rolling a variant of http://www.napp-it.org/napp-it/all-in-one/index_en.html with remote backup (via snapshot and send) to 2-3 other (HP N40L-based) ZFS boxes for production in our organisation. The systems themselves would be either Dell or Supermicro (the latter with ZIL/L2ARC on SSD, plus SAS disks (pools as mirrors), all with hardware pass-through). The idea is to use ZFS for data integrity and backup via snapshots (especially important data will also be backed up via conventional DLT tapes). Before I test this -- is anyone using this in production? Any caveats?

I run an all-in-one and it works fine. Supermicro X9SCL-F with 32GB ECC RAM, 20 of which is for the OpenIndiana SAN VM, with an IBM M1015 passed through via VMDirectPath (PCI passthrough). Four SAS nearline drives in a 2x2 mirror config in a JBOD chassis, and two Samsung 830 128GB SSDs as L2ARC.

The main caveat is to order the VMs properly for auto-start (assuming you use that, as I do). The OI VM goes first, and I give it a good 120 seconds before starting the other VMs. For auto-shutdown, all VMs but OI do suspend; OI does shutdown.

The big caveat: do NOT use iSCSI for the datastore, use NFS. Maybe there's a way to fix this, but I found that on start-up, ESXi would time out the iSCSI datastore mount before the virtualized SAN VM was up and serving the share - bad news. NFS seems to be more resilient there. vmxnet3 vnics should work fine for the OI VM, but you might want to stick to e1000.

Can I actually have a year's worth of snapshots in ZFS without too much performance degradation? Dunno about that.

This accords with my experience after building a few custom appliances with similar configurations. For the backup side of things, stop and think about the actual use cases for keeping a year's worth of snapshots.
Generally speaking, restore requests are for data that is relatively hot and has been live some time in the current quarter. I think that you could limit your snapshot retention to something smaller, and pull files back from tape if you go past that.

One detail missing from this calculation is the frequency of snapshots. A year's worth of hourly snapshots is huge for a little box like the HP NXXL machines. A year's worth of daily snapshots is more in the domain of the reasonable. For reference, though, I have one that retains 4 weeks of replicated hourly snapshots without complaint (8GB / 4x2TB raidz1).

The bigger issue you'll run into will be data sizing, as a year's worth of snapshots basically means that you're keeping a journal of every single write that's occurred over the year. If you are running VM images, this can also mean that you're retaining a year's worth of writes to your OS swap file - something of exceedingly little value. You might want to consider moving the swap files to a separate virtual disk on a different volume.

If you're running ESXi with a vSphere license, I'd recommend looking at VDR (free with the vCenter license) for backing up the VMs to the little HPs, since you get compressed and deduplicated backups that will minimize the replication bandwidth requirements.

Much depends on what you're optimizing for. If it's RTO (bringing VMs back online very quickly), then replicating the primary NFS datastore is great - just point a server at the replicated NFS store, import the VM and start, with an RPO that coincides with your snapshot frequency.

Cheers, Erik
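[Editor's note] A rough way to size that write journal, as a back-of-the-envelope sketch. The churn figure is an assumption — measure your own pool's daily write rate before trusting the result:

```shell
# Back-of-the-envelope sizing: space pinned by snapshots is roughly
# (average daily churn) x (days retained), since every rewritten block
# stays referenced by some snapshot until that snapshot expires.
daily_churn_gb=10      # assumed GB rewritten per day -- measure yours
retention_days=365     # a year of daily snapshots
pinned_gb=$(( daily_churn_gb * retention_days ))
echo "~${pinned_gb} GB held by snapshots on top of the live data"
```

At 10GB/day of churn, a year of retention pins roughly 3.65TB — most of a little N40L-class box on its own.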
[zfs-discuss] Incremental send/recv interoperability
Just wondering if an expert can chime in on this one. I have an older machine running 2009.11 with a zpool at version 14. I have a new machine running Solaris Express 11 with the zpool at version 31. I can use zfs send/recv to send a filesystem from the older machine to the new one without any difficulties. However, as soon as I try to update the remote copy with an incremental send/recv, I get back the error "cannot receive incremental stream: invalid backup stream". I was under the impression that the streams were backwards compatible (i.e. a newer version could receive older streams), which appears to be correct for the initial send/recv operation but fails on the incremental.

Cheers, Erik
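[Editor's note] For reference, the sequence in question looks something like the sketch below; the pool, filesystem, snapshot and host names are placeholders:

```shell
# Initial full send -- this part worked across pool versions:
zfs snapshot tank/fs@base
zfs send tank/fs@base | ssh newhost zfs recv backup/fs

# Incremental update -- this is the step that failed:
zfs snapshot tank/fs@today
zfs send -i tank/fs@base tank/fs@today | ssh newhost zfs recv backup/fs
```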
Re: [zfs-discuss] Incremental send/recv interoperability
Doh - 2008.11

On 15 Feb 2011, at 11:18, Erik ABLESON wrote:
I have an older machine running 2009.11 with a zpool at version 14. I have a new machine running Solaris Express 11 with the zpool at version 31.
Re: [zfs-discuss] How to avoid striping ?
On 18 Oct 2010, at 08:44, Habony, Zsolt zsolt.hab...@hp.com wrote:

Hi, I have seen a similar question on this list in the archive but haven't seen the answer. Can I avoid striping across top-level vdevs? I use a zpool which is one LUN from the SAN, and when it becomes full I add a new LUN to it. But I cannot guarantee that the LUN will not come from the same spindles on the SAN. Can I force zpool not to stripe the data?

No. The basic principle of the zpool is dynamic striping across vdevs, in order to ensure that all available spindles are contributing to the workload. If you want/need more granular control over which data goes to which disk, then you'll need to create multiple pools. Just create a new pool from the new SAN volume and you will segregate the IO. But then you risk having hot and cold spots in your storage, as the IO won't be striped. If the approach is to fill a vdev completely before adding a new one, this possibility exists anyway until block pointer rewrite arrives to redistribute existing data across available vdevs.

Cheers, Erik
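[Editor's note] The two options might be sketched like this; the device names are placeholders for your SAN LUNs:

```shell
# Option 1: one pool per LUN -- IO stays segregated, no striping:
zpool create tank1 c2t0d0
zpool create tank2 c2t1d0

# Option 2: grow the existing pool -- new writes are then dynamically
# striped across both vdevs, which is exactly what cannot be avoided:
# zpool add tank1 c2t1d0
```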
Re: [zfs-discuss] Mac OS X clients with ZFS server
On 16 Sept 2010, at 16:18, Rich Teer rich.t...@rite-group.com wrote:

On Thu, 16 Sep 2010, erik.ableson wrote:
And for reference, I have a number of 10.6 clients using NFS for sharing Fusion virtual machines, iTunes libraries, iPhoto libraries etc. without any issues.

Excellent; what OS is your NFS server running?

OpenSolaris snv_129

Erik
Re: [zfs-discuss] Mac OS X clients with ZFS server
The only tweak needed was making sure that I used the FQDN of the client machines (with appropriate reverse lookups in my DNS) for the sharenfs properties.

Sent from my iPhone

On 16 Sept 2010, at 17:15, Rich Teer rich.t...@rite-group.com wrote:

On Thu, 16 Sep 2010, Erik Ableson wrote:
OpenSolaris snv_129

Hmm, SXCE snv_130 here. Did you have to do any server-side tuning (e.g., allowing remote connections), or did it just work out of the box? I know that Sendmail needs some gentle persuasion to accept remote connections out of the box; perhaps lockd is the same?

-- Rich Teer, Publisher Vinylphile Magazine www.vinylphilemag.com
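[Editor's note] The FQDN-based sharenfs setting described above would look something like this; the hostnames and dataset name are illustrative:

```shell
# Grant rw access by fully-qualified client names; the reverse DNS
# lookups for the clients must resolve to exactly these names.
zfs set sharenfs=rw=mac1.example.com:mac2.example.com tank/vms
```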
Re: [zfs-discuss] Snapshots and Data Loss
A snapshot is a picture of the storage at a point in time, so everything depends on the applications using the storage. If you're running a DB with lots of cache, it's probably a good idea to stop the service or force a flush to disk before taking the snapshot to ensure the integrity of the data. That said, rolling back to a snapshot would be roughly the same thing as stopping the application brutally, and it's up to the application to evaluate the data. Some will handle it better than others. If you're running virtual machines, the ideal solution is to take a VM snapshot, followed by the filesystem snapshot, then delete the VM snapshot. ZFS snapshots are very reliable, but their scope is limited to the disks that ZFS manages, so if there's unflushed data living at a higher level, ZFS won't be aware of it.

Regards, Erik Ableson

On 13 Apr 2010, at 14:22, Tony MacDoodle tpsdoo...@gmail.com wrote:
I was wondering if any data was lost while doing a snapshot on a running system? Does it flush everything to disk or would some stuff be lost? Thanks
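[Editor's note] The VM-snapshot-then-ZFS-snapshot sequence could be scripted along these lines for VMware Fusion/Workstation guests; the paths, pool and snapshot names are illustrative, and other hypervisors have equivalent tooling:

```shell
# Quiesce-then-snapshot sequence described above; names are placeholders.
vmrun snapshot /vms/guest.vmx pre-zfs        # 1. VM-level snapshot captures consistent guest state
zfs snapshot tank/vms@$(date +%Y%m%d-%H%M)   # 2. ZFS snapshot records that consistent image
vmrun deleteSnapshot /vms/guest.vmx pre-zfs  # 3. drop the VM snapshot; ZFS retains the clean copy
```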
Re: [zfs-discuss] zfs send/receive - actual performance
On 25 Mar 2010, at 22:00, Bruno Sousa bso...@epinfante.com wrote:

Hi, indeed the 3 disks per vdev (raidz2) seems a bad idea... but it's the system I have now. Regarding the performance... let's assume that a bonnie++ benchmark could go to 200 MB/s in. Is the possibility of getting the same values (or near) in a zfs send / zfs receive just a matter of putting, let's say, a 10GbE card between both systems? I have the impression that benchmarks are always synthetic, therefore live/production environments behave quite differently. Again, it might be just me, but with a 1Gb link, being able to replicate 2 servers with an average speed above 60 MB/s does seem quite good. However, like I said, I would like to know the results other guys are seeing...

Don't forget to factor in your transport mechanism. If you're using ssh to pipe the send/recv data, your overall speed may end up being CPU-bound, since I think that ssh is single-threaded; even on a multicore system you'll only be able to consume one core, and here raw clock speed will make a difference.

Cheers, Erik
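[Editor's note] One common way around the single-core ssh bottleneck on a trusted network is to move the stream over a plain TCP pipe with mbuffer; the hostname, port and buffer sizes below are illustrative:

```shell
# Receiver first (trusted LAN only -- this stream is unencrypted):
mbuffer -I 9090 -s 128k -m 1G | zfs recv backup/fs

# Then on the sender:
zfs send tank/fs@snap | mbuffer -O backuphost:9090 -s 128k -m 1G
```

The buffers also smooth out the bursty nature of zfs send, which helps keep the link saturated even when encryption is not the limiting factor.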
Re: [zfs-discuss] ZFS/OSOL/Firewire...
Funny, I thought the same thing up until a couple of years ago, when I thought Apple should have bought Sun :-)

Regards, Erik Ableson
+33.6.80.83.58.28
Sent from my iPhone

On 19 Mar 2010, at 09:41, Khyron khyron4...@gmail.com wrote:
Of course, I'm the only person I know who said that Sun should have bought Apple 10 years ago. What do I know?
Re: [zfs-discuss] Can we get some documentation on iSCSI sharing after comstar took over?
Certainly! I just whipped that up since I was testing out a pile of clients with different volumes and got tired of going through all the steps, so anything to make it more complete would be useful.

Regards, Erik Ableson
+33.6.80.83.58.28
Sent from my iPhone

On 17 Mar 2010, at 00:25, Svein Skogen sv...@stillbilde.net wrote:

On 16.03.2010 22:31, erik.ableson wrote:
On 16 Mar 2010, at 21:00, Marc Nicholas wrote:
On Tue, Mar 16, 2010 at 3:16 PM, Svein Skogen sv...@stillbilde.net wrote:

I'll write you a Perl script :)

I think there are ... several people that'd like a script that gave us back some of the ease of the old shareiscsi one-off, instead of having to spend time on copy-and-pasting GUIDs they have ... no real use for. ;)

I'll try and knock something up in the next few days, then!

Try this: http://www.infrageeks.com/groups/infrageeks/wiki/56503/zvol2iscsi.html

Thank you! :) Mind if I (after some sleep) look at extending your script a little? Of course with feedback of the changes I make?

//Svein
Re: [zfs-discuss] Moving Storage to opensolaris+zfs. What about backup?
Comments inline:

On Wednesday, March 03, 2010, at 06:35PM, Svein Skogen sv...@stillbilde.net wrote:

However, trying to wrap my head around Solaris and backups (I'm used to FreeBSD) is now leaving me with a nasty headache, and still no closer to a real solution. I need something that on regular intervals pushes this zpool:

storage  4.06T  1.19T  2.87T  29%  1.00x  ONLINE  -

onto a series of tapes, and I really want a solution that allows me something resembling one-button disaster recovery, either via a CD/DVD boot disc, a bootable USB image, or via writing a boot block on the tapes. Preferably a solution that manages to dump the entire zpool, including zfses and volumes and whatnot. If I can dump the rpool along with it, all the better. (Basically something that allows me to shuffle a stack of tapes into the safe, maybe along with a boot device, with the effect of making me sleep easy knowing that... when disaster happens, I can use a similar-or-better-specced box to restore the entire server and bring everything back online.) Are there... ANY good ideas out there for such a solution?

Only limited by your creativity. Out of curiosity, why the tape solution for disaster recovery? That strikes me as being more work, not to mention much more complicated for disaster recovery, since LTOs aren't usually found as standard kit on most machines.

As a quick idea, how about the following: boot your system from a USB key (or portable HD), and dd the key to a spare that's kept in the safe, updated when you do anything substantial. There you recover not just a bootable system but any system-based customization you've done. This does require downtime for the duplication, however.

For the data, rather than fight with tapes, I'd go buy a dual-bay disk enclosure and pop in two 2TB drives. Attach that to the server (USB/eSATA, whatever's convenient) and use zfs send/recv to copy over snapshots into a fully exploitable copy.
Put that in the safe with the USB key and you have a completely mobile solution that needs only a computer. Assuming that you don't fill up your current 4TB of storage, you can keep a number of snapshots to replace the iterative copies done to tape in the old-fashioned world. Better yet, do this to two destinations and rotate one off-site. That would be the best as far as disaster-recovery convenience goes, but it does still require the legwork of attaching the backup disks, running the send/recv, exporting the pool and putting it back in the safe. Using a second machine somewhere and sending it across the network is more easily scalable (but possibly more expensive). Remember that by copying to another zpool you have a fully exploitable backup copy.

I don't think that the idea of copying zfs send streams to tape is a reasonable approach to backups - way too many failure points and dependencies. Not to mention that testing your backup is easy - just import the pool and scrub. Testing against tape adds wear and tear to the tapes, you need room to restore to, and it's time-consuming and a general PITA. (But it's essential!) If you want to stick with a traditional approach, Amanda is a good choice, and OpenSolaris does include an NDMP service, although I haven't looked at it yet.

This kind of design depends on your RTO, RPO, administrative constraints, data-retention requirements, budget and your definition of a disaster... IMHO, disk-to-disk with zfs send/recv offers a very flexible and practical solution to many backup and restore needs. Your storage media can be wildly different - small, fast SAS for production going to fewer big SATA drives - with asymmetric snapshot-retention policies: keep a week in production and as many as you want on the bigger backup drives. Then do file-level dumps to tape from the backup volumes for archival purposes that can be restored onto any filesystem.
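[Editor's note] The disk-to-disk rotation described above might look like this in practice; the pool and snapshot names are placeholders, and the previous snapshot name would normally be tracked by a wrapper script:

```shell
# Attach the enclosure, import the backup pool, replicate, detach.
zpool import backup
zfs snapshot -r storage@weekly-new
# -R replicates the whole dataset tree; -i sends only the delta
# since the last replicated snapshot (here called weekly-prev):
zfs send -R -i storage@weekly-prev storage@weekly-new | zfs recv -d backup
zpool export backup        # now safe to unplug and return to the safe
```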
Cheers, Erik
Re: [zfs-discuss] file concatenation with ZFS copy-on-write
On 3 Dec 2009, at 13:29, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote:

On Thu, 3 Dec 2009, Darren J Moffat wrote:
The answer to this is likely deduplication, which ZFS now has. The reason dedup should help here is that after the 'cat', f15 will be made up of blocks that match the blocks of f1 f2 f3 f4 f5. Copy-on-write isn't what helps you here; it is dedup.

Isn't this only true if the file sizes are such that the concatenated blocks are perfectly aligned on the same ZFS block boundaries they used before? This seems unlikely to me.

It's also worth noting that if the block alignment works out for the dedup, the actual write traffic will be trivial, consisting only of pointer references, so the heavy lifting will be the read operations. Much depends on the contents of the files: fixed-size binary blobs that align nicely with 16/32/64K boundaries, or variable-sized text files.

Regards, Erik Ableson
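[Editor's note] Bob's alignment objection can be illustrated with a little arithmetic. 128K is the ZFS default recordsize; the file length is an arbitrary example:

```shell
# Dedup matches whole blocks, so f2's data inside the concatenated file
# only dedups if it starts on a recordsize boundary.
recordsize=$(( 128 * 1024 ))   # 131072 bytes, the ZFS default
f1_size=300000                 # arbitrary example length for f1
offset=$f1_size                # where f2's bytes land inside the cat'd file
if [ $(( offset % recordsize )) -eq 0 ]; then
  echo "aligned: f2's blocks can dedup against the originals"
else
  echo "misaligned by $(( offset % recordsize )) bytes: no dedup for f2's blocks"
fi
```

Only when every preceding file's length is an exact multiple of the recordsize do the later files' blocks line up with their originals.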
Re: [zfs-discuss] RAID-Z and virtualization
Uhhh - for an unmanaged server you can use ESXi for free. Identical server functionality; it just requires licenses if you need multiserver features (i.e. vMotion).

Regards, Erik Ableson

On 8 Nov 2009, at 19:12, Tim Cook t...@cook.ms wrote:

On Sun, Nov 8, 2009 at 11:48 AM, Joe Auty j...@netmusician.org wrote:
Tim Cook wrote:
It appears that one can get more in the way of features out of VMware Server for free than with ESX, which is seemingly a hook into buying more VMware stuff. I've never looked at Sun xVM; in fact I didn't know it even existed, but I do now. Thank you, I will research this some more! The only other variable, I guess, is the future of said technologies given the Oracle takeover? There has been much discussion on how this impacts ZFS, but I'll have to learn how xVM might be affected, if at all.

Quite frankly, I wouldn't let that stop you. Even if Oracle were to pull the plug on xVM entirely (not likely), you could very easily just move the VMs back over to *insert your favorite flavor of Linux* or Citrix Xen. Including Unbreakable Linux (Oracle's version of RHEL).

I remember now why Xen was a no-go from when I last tested it. I rely on the 64-bit version of FreeBSD for most of my VM guest machines, and FreeBSD only supports running as domU on i386 systems. This is a monkey wrench! Sorry, just thinking out loud here...

I have no idea what it supports right now. I can't even find a decent support matrix. Quite frankly, I would (and do) just use a separate server for the fileserver rather than the VM box. You can get 64-bit CPUs with 4GB of RAM awfully cheap nowadays. That should be more than enough for most home workloads.

--Tim
Re: [zfs-discuss] RAID-Z and virtualization
Simply put, ESXi has exactly the same local feature set as ESX Server. So you get all of the useful stuff like transparent memory page sharing (memory deduplication), virtual switches with VLAN tagging, and high-performance storage I/O. For free. As many copies as you like. But... you will need a vCenter license, and then per-server (well, per-processor) licenses, if you want the advanced management features like live migration of running VMs between servers, fault tolerance, guided consolidation etc. Most importantly, ESXi is a bare-metal install, so you have a proper hypervisor allocating resources instead of a general-purpose OS with a virtualisation application.

Regards, Erik Ableson

On 8 Nov 2009, at 19:43, Tim Cook t...@cook.ms wrote:

On Sun, Nov 8, 2009 at 12:39 PM, Joe Auty j...@netmusician.org wrote:
Erik Ableson wrote:
Uhhh - for an unmanaged server you can use ESXi for free. Identical server functionality, just requires licenses if you need multiserver features (ie vMotion)

How does ESXi w/o vMotion, vSphere, and vCenter Server stack up against VMware Server? My impression was that you need these other pieces to make such an infrastructure useful?

VMware Server doesn't have vMotion. There is no such thing as vSphere; that's the marketing name for the entire product suite. vCenter is only required for advanced functionality like HA/DPM/DRS that you don't have with VMware Server either. Are you just throwing out buzzwords, or do you actually know what they do?

--Tim
Re: [zfs-discuss] Borked zpool, missing slog/zil
Hmmm - I've got a fairly old copy of the zpool.cache file (circa July), but nothing structural has changed in the pool since that date. What other data is held in that file? There have been some filesystem changes, but nothing critical is in the newer filesystems. Any particular procedure required for swapping out the zpool.cache file?

Erik

On Sunday, 27 September 2009, at 12:28AM, Ross myxi...@googlemail.com wrote:
Do you have a backup copy of your zpool.cache file? If you have that file, ZFS will happily mount a pool on boot without its slog device - it'll just flag the slog as faulted and you can do your normal replace. I used that for a long while on a test server with a ramdisk slog - and I never needed to swap it to a file-based slog. However, without a backup of that file to make ZFS load the pool on boot, I don't believe there is any way to import that pool.
Re: [zfs-discuss] Borked zpool, missing slog/zil
Good link - thanks. I'm looking at the details for that one and learning a little zdb at the same time. I've got a situation perhaps a little different, in that I _do_ have a current copy of the slog in a file with what appears to be current data. However, I don't see how to attach the slog file to an offline zpool - I have both a dd backup of the ramdisk slog from midnight as well as the current file-based slog:

# zdb -l /root/slog.tmp
    version=14
    name='siovale'
    state=1
    txg=4499446
    pool_guid=13808783103733022257
    hostid=4834000
    hostname='shemhazai'
    top_guid=6374488381605474740
    guid=6374488381605474740
    is_log=1
    vdev_tree
        type='file'
        id=1
        guid=6374488381605474740
        path='/root/slog.tmp'
        metaslab_array=230
        metaslab_shift=21
        ashift=9
        asize=938999808
        is_log=1
        DTL=51

Is there any way that I can attach this slog to the zpool while it's offline?

Erik

On 27 Sept 2009, at 02:23, David Turnbull dsturnb...@gmail.com wrote:
I believe this is relevant: http://github.com/pjjw/logfix
Saved my array last year, looks maintained.

On 27/09/2009, at 4:49 AM, Erik Ableson wrote:
Hmmm - this is an annoying one. I'm currently running an OpenSolaris install (2008.11 upgraded to 2009.06): SunOS shemhazai 5.11 snv_111b i86pc i386 i86pc Solaris, with a zpool made up of one raidz vdev and a small ramdisk-based ZIL. I usually swap out the ZIL for a file-based copy when I need to reboot (zpool replace /dev/ramdisk/slog /root/slog.tmp), but this time I had a brain fart and forgot to. The server came back up and I could sort of work on the zpool, but it was complaining, so I did my replace command and it happily resilvered. Then I restarted one more time in order to test bringing everything up cleanly, and this time it can't find the file-based ZIL. I try importing and it comes back with:

  pool: siovale
    id: 13808783103733022257
 state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing devices and try again.
   see: http://www.sun.com/msg/ZFS-8000-6X
config:

        siovale    UNAVAIL  missing device
          raidz1   ONLINE
            c8d0   ONLINE
            c9d0   ONLINE
            c10d0  ONLINE
            c11d0  ONLINE

Additional devices are known to be part of this pool, though their exact configuration cannot be determined.

Now the file still exists, so I don't know why it can't seem to find it, and I thought the missing ZIL issue was corrected in this version (or did I miss something?). I've looked around for solutions to bring it back online and ran across this method: http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg16545.html but before I jump in on this one I was hoping there was a newer, cleaner approach that I missed somehow. Ideas appreciated...

Erik
[zfs-discuss] Borked zpool, missing slog/zil
Hmmm - this is an annoying one. I'm currently running an OpenSolaris install (2008.11 upgraded to 2009.06):

SunOS shemhazai 5.11 snv_111b i86pc i386 i86pc Solaris

with a zpool made up of one raidz vdev and a small ramdisk-based ZIL. I usually swap out the ZIL for a file-based copy when I need to reboot (zpool replace /dev/ramdisk/slog /root/slog.tmp), but this time I had a brain fart and forgot to. The server came back up and I could sort of work on the zpool, but it was complaining, so I did my replace command and it happily resilvered. Then I restarted one more time in order to test bringing everything up cleanly, and this time it can't find the file-based ZIL. I try importing and it comes back with:

# zpool import
  pool: siovale
    id: 13808783103733022257
 state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing devices and try again.
   see: http://www.sun.com/msg/ZFS-8000-6X
config:

        siovale    UNAVAIL  missing device
          raidz1   ONLINE
            c8d0   ONLINE
            c9d0   ONLINE
            c10d0  ONLINE
            c11d0  ONLINE

Additional devices are known to be part of this pool, though their exact configuration cannot be determined.

Now the file still exists, so I don't know why it can't seem to find it, and I thought the missing ZIL issue was corrected in this version (or did I miss something?). I've looked around for solutions to bring it back online and ran across this method: http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg16545.html but before I jump in on this one I was hoping there was a newer, cleaner approach that I missed somehow. Ideas appreciated...

Erik
Re: [zfs-discuss] ZFS for iSCSI based SAN
Bottom line with virtual machines: your IO will be random by definition, since it all goes into the same pipe. If you want to be able to scale, go with RAID 1 (mirror) vdevs. And don't skimp on the memory. Our current experience hasn't shown a need for an SSD for the ZIL, but one might be useful for L2ARC (we're using iSCSI for VMs, NFS for templates and ISO images).

Regards,

Erik Ableson
+33.6.80.83.58.28
Sent from my iPhone

On 24 June 2009, at 18:56, milosz mew...@gmail.com wrote:

Within the thread there are instructions for using iometer to load-test your storage. You should test out your solution before going live, and compare what you get with what you need. Just because striping 3 mirrors *will* give you more performance than raidz2 doesn't always mean that is the best solution. Choose the best solution for your use case.

multiple vm disks that have any kind of load on them will bury a raidz or raidz2. out of a 6x raidz2 you are going to get the iops and random seek latency of a single drive (realistically the random seek will probably be slightly worse, actually). how could that be adequate for a virtual machine backend? if you set up a raidz2 with 6x 15k drives, for the majority of use cases, you are pretty much throwing your money away. you are going to roll your own san, buy a bunch of 15k drives, use 2-3u of rackspace and four (or more) switchports, and what you're getting out of it is essentially a 500gb 15k drive with a high mttdl and a really huge theoretical transfer speed for sequential operations (which you won't be able to saturate anyway because you're delivering over gige)? for this particular setup i can't really think of a situation where that would make sense.

Regarding ZIL usage, from what I have read you will only see benefits if you are using NFS backed storage, but that it can be significant.

link?
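The IOPS argument in the quoted message can be made concrete with back-of-the-envelope arithmetic (the 175 IOPS-per-drive figure is an assumed ballpark for a 15k drive, not from the thread):

```python
# Rough random-IOPS estimate for two 6-drive layouts, illustrating why
# striped mirrors beat raidz2 for VM workloads.
DRIVE_IOPS = 175   # assumed random IOPS for one 15k drive
DRIVES = 6

# A raidz/raidz2 vdev services roughly one random IO at a time across
# the whole stripe, so the vdev delivers about one drive's worth of IOPS.
raidz2_iops = DRIVE_IOPS

# Three striped 2-way mirrors: each mirror is an independent vdev, and
# random reads can additionally be spread over both sides of each mirror.
mirrors = DRIVES // 2
mirror_write_iops = mirrors * DRIVE_IOPS      # 3 vdevs accept writes
mirror_read_iops = mirrors * 2 * DRIVE_IOPS   # both sides serve reads

print(raidz2_iops)        # 175
print(mirror_write_iops)  # 525
print(mirror_read_iops)   # 1050
```

The absolute numbers are guesses, but the ratio is the point: the same six drives deliver several times the random IOPS when laid out as striped mirrors.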
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Best controller card for 8 SATA drives ?
Just a side note on the PERC-labelled cards: they don't have a JBOD mode, so you _have_ to use hardware RAID. This may or may not be an issue in your configuration, but it does mean that moving disks between controllers is no longer possible. The only way to get a pseudo-JBOD is to create broken RAID 1 volumes, which is not ideal.

Regards,

Erik Ableson
+33.6.80.83.58.28
Sent from my iPhone

On 23 June 2009, at 04:33, Eric D. Mudama edmud...@bounceswoosh.org wrote:

On Mon, Jun 22 at 15:46, Miles Nordin wrote:
edm == Eric D Mudama edmud...@bounceswoosh.org writes:
edm We bought a Dell T610 as a fileserver, and it comes with an
edm LSI 1068E based board (PERC6/i SAS).

which driver attaches to it? pciids.sourceforge.net says this is a 1078 board, not a 1068 board. please, be careful. There's too much confusion about these cards.

Sorry, that may have been confusing. We have the cheapest storage option on the T610, with no onboard cache. I guess it's called the Dell SAS6i/R, while they reserve the PERC name for the ones with cache. I had understood that they were basically identical except for the cache, but maybe not. Anyway, this adapter has worked great for us so far.

Snippet of prtconf -D:

i86pc (driver name: rootnex)
    pci, instance #0 (driver name: npe)
        pci8086,3411, instance #6 (driver name: pcie_pci)
            pci1028,1f10, instance #0 (driver name: mpt)
                sd, instance #1 (driver name: sd)
                sd, instance #6 (driver name: sd)
                sd, instance #7 (driver name: sd)
                sd, instance #2 (driver name: sd)
                sd, instance #4 (driver name: sd)
                sd, instance #5 (driver name: sd)

For this board the mpt driver is being used, and here's the prtconf -pv info:

Node 0x1f
    assigned-addresses: 81020010..fc00..0100.83020014..df2ec000..4000.8302001c..df2f..0001
    reg: 0002.....01020010....0100.03020014....4000.0302001c....0001
    compatible: 'pciex1000,58.1028.1f10.8' + 'pciex1000,58.1028.1f10' + 'pciex1000,58.8' + 'pciex1000,58' + 'pciexclass,01' + 'pciexclass,0100' + 'pci1000,58.1028.1f10.8' + 'pci1000,58.1028.1f10' + 'pci1028,1f10' + 'pci1000,58.8' + 'pci1000,58' + 'pciclass,01' + 'pciclass,0100'
    model: 'SCSI bus controller'
    power-consumption: 0001.0001
    devsel-speed:
    interrupts: 0001
    subsystem-vendor-id: 1028
    subsystem-id: 1f10
    unit-address: '0'
    class-code: 0001
    revision-id: 0008
    vendor-id: 1000
    device-id: 0058
    pcie-capid-pointer: 0068
    pcie-capid-reg: 0001
    name: 'pci1028,1f10'

--eric

--
Eric D. Mudama
edmud...@mail.bounceswoosh.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Best controller card for 8 SATA drives ?
The problem I had was with the single RAID 0 volumes (I miswrote RAID 1 in the original message). This is not a straight-to-disk connection, and you'll have problems if you ever need to move disks around or move them to another controller. I agree that the MD1000 with ZFS is a rocking, inexpensive setup (we have several!), but I'd recommend using a SAS card with a true JBOD mode for maximum flexibility and portability. If I remember correctly, I think we're using the Adaptec 3085. I've pulled 465MB/s write and 1GB/s read off the MD1000 filled with SATA drives.

Regards,

Erik Ableson
+33.6.80.83.58.28
Sent from my iPhone

On 23 June 2009, at 21:18, Henrik Johansen hen...@scannet.dk wrote:

Kyle McDonald wrote:
Erik Ableson wrote:
Just a side note on the PERC labelled cards: they don't have a JBOD mode so you _have_ to use hardware RAID. This may or may not be an issue in your configuration but it does mean that moving disks between controllers is no longer possible. The only way to do a pseudo JBOD is to create broken RAID 1 volumes which is not ideal.

It won't even let you make single drive RAID 0 LUNs? That's a shame.

We currently have 90+ disks that are created as single-drive RAID 0 LUNs on several PERC 6/E (LSI 1078E chipset) controllers and used by ZFS. I can assure you that they work without any problems and perform very well indeed. In fact, the combination of PERC 6/E and MD1000 disk arrays has worked so well for us that we are going to double the number of disks this fall.

The lack of portability is disappointing. The trade-off, though, is battery-backed cache if the card supports it.

-Kyle

--
Med venlig hilsen / Best Regards

Henrik Johansen
hen...@scannet.dk
Tlf. 75 53 35 00

ScanNet Group
A/S ScanNet
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
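As a sketch of how a pool like the one Henrik describes is assembled: once the controller exports each drive as its own single-drive RAID 0 LUN, ZFS sees ordinary disks and can stripe mirrors across them. The device names below are invented for illustration and will differ on any real system:

```shell
# Each MD1000 slot exported as a single-drive RAID 0 LUN by the PERC 6/E
# appears to Solaris as a plain disk (ctd names here are examples only).
zpool create tank \
  mirror c2t0d0 c2t1d0 \
  mirror c2t2d0 c2t3d0 \
  mirror c2t4d0 c2t5d0
zpool status tank
```

This works well day to day; the portability complaint in the thread is that such LUNs carry controller metadata, so the disks can't simply be moved to a different controller the way true JBOD disks can.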
Re: [zfs-discuss] 7110 questions
There's a configuration issue in there somewhere. I have a ZFS-based system serving up to some ESX servers that works great, with a few exceptions. At first, performance was awful, but there was some confusion about how to optimize network traffic on ESX, so I installed a fresh one using only the defaults (no jumbo frames, no etherchannel) and was able to push the ZFS server to wire speed on reads and writes over iSCSI. I still have the write problem over NFS, though. I should be back in the datacenter tomorrow to see if it's specific to the ESX NFS client. So my advice is to start by looking at all of the tweaks that have been applied to the networking setup on the Xen side.

Regards,

Erik Ableson
+33.6.80.83.58.28
Sent from my iPhone

On 18 June 2009, at 21:06, lawrence ho no-re...@opensolaris.org wrote:

We have a 7110 on the try-and-buy program. We tried using the 7110 with XenServer 5 over iSCSI and NFS. Nothing seems to solve the slow write problem. Within the VM, we observed around 8MB/s on writes. Read performance is fantastic. Some troubleshooting was done with the local Sun rep. The conclusion is that the 7110 does not have a write cache in the form of SSDs or controller DRAM write cache. The solution from Sun is to buy StorageTek or a 7000-series model with an SSD write cache. Adam, please advise if there are any fixes for the 7110. I am still shopping for a SAN and would rather buy a 7110 than a StorageTek or something else.

--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
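A quick way to separate guest-side effects from storage-side effects in a slow-write case like this is a raw sequential write run both inside the VM and from a host mounting the share directly. The paths and sizes below are assumed examples, not from the thread, and conv=fsync is the GNU dd spelling for flushing at the end of the run:

```shell
# Inside the VM: sequential write through the whole virtual disk stack.
dd if=/dev/zero of=/tmp/ddtest bs=1M count=1024 conv=fsync

# From a physical host mounting the same NFS export directly,
# bypassing the hypervisor's NFS client:
dd if=/dev/zero of=/mnt/nfs-share/ddtest bs=1M count=1024 conv=fsync
```

If the direct mount is fast and the in-VM run is still ~8MB/s, the bottleneck is in the hypervisor's NFS client or its sync-write behavior rather than in the 7110 itself.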