On 18 sept. 2012, at 16:40, Dan Swartzendruber <dswa...@druber.com> wrote:

> On 9/18/2012 10:31 AM, Eugen Leitl wrote:
>> I'm currently thinking about rolling a variant of
>> 
>> http://www.napp-it.org/napp-it/all-in-one/index_en.html
>> 
>> with remote backup (via snapshot and send) to 2-3
>> other (HP N40L-based) zfs boxes for production in
>> our organisation. The systems themselves would
>> be either Dell or Supermicro (latter with ZIL/L2ARC
>> on SSD, plus SAS disks (pools as mirrors) all with
>> hardware pass-through).
>> 
>> The idea is to use zfs for data integrity and
>> backup via data snapshot (especially important
>> data will be also back-up'd via conventional DLT
>> tapes).
>> 
>> Before I test this --
>> 
>> Is anyone using this in production? Any caveats?
>>   
> I run an all-in-one and it works fine: Supermicro X9SCL-F with 32GB ECC RAM, 
> 20GB of which goes to the OpenIndiana SAN VM, with an IBM M1015 passed through 
> via VMDirectPath (PCI passthrough). 4 SAS nearline drives in a 2x2 mirror 
> config in a JBOD chassis, and 2 Samsung 830 128GB SSDs as L2ARC. The main 
> caveat is to order the VMs properly for auto-start (assuming you use that, as 
> I do): the OI VM goes first, and I give it a good 120 seconds before starting 
> the other VMs. For auto shutdown, all VMs but OI suspend; OI does a full 
> shutdown. The big caveat: do NOT use iSCSI for the datastore, use NFS. Maybe 
> there's a way to fix this, but I found that on startup, ESXi would time out 
> the iSCSI datastore mount before the virtualized SAN VM was up and serving 
> the share - bad news. NFS seems to be more resilient there. vmxnet3 vnics 
> should work fine for the OI VM, but you might want to stick to e1000.
>> Can I actually have a year's worth of snapshots in
>> zfs without too much performance degradation?
>>   
> Dunno about that.

This matches my experience after building a few custom appliances with 
similar configurations. On the backup side, stop and think about the actual 
use cases for keeping a year's worth of snapshots. Generally speaking, 
restore requests are for data that is relatively hot and has been live at 
some point in the current quarter. You could limit your snapshot retention 
to something much shorter and pull anything older back from tape.
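To make the shorter-retention idea concrete, here's a minimal pruning sketch. 
The daily- naming scheme, the tank/data dataset, and the 90-snapshot cutoff 
are all assumptions for illustration; the wiring uses the stock zfs(1M) CLI, 
and the selection logic is plain shell so it can be dry-run safely:

```shell
# prune_list: reads snapshot names on stdin, newest first, and prints
# everything past the first $keep - i.e. the snapshots to destroy.
prune_list() {
  keep=$1
  tail -n +"$((keep + 1))"
}

# Wiring on the OI box (shown as comments; dataset/prefix are placeholders):
#   zfs list -H -t snapshot -o name -S creation -r tank/data \
#     | grep '^tank/data@daily-' \
#     | prune_list 90 \
#     | xargs -n 1 zfs destroy
```

Keeping the list/destroy plumbing separate from the "which ones" logic makes 
it easy to preview what would be destroyed before adding the xargs step.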

One detail missing from this calculation is the frequency of snapshots. A 
year's worth of hourly snapshots (8,760 of them) is huge for a little box 
like the HP NxxL machines; a year's worth of daily snapshots (365) is more 
in the domain of the reasonable. For reference, though, I have one that 
retains 4 weeks of replicated hourly snapshots without complaint (8GB RAM, 
4x2TB raidz1).
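On OpenIndiana the time-slider/zfs-auto-snapshot service can handle scheduled 
snapshots for you; if you roll your own from cron instead, the name just needs 
to encode the interval and a timestamp so the pruning side can match on it. A 
minimal naming helper, sketched under that assumption:

```shell
# snapname: build a snapshot name like dataset@hourly-2012-09-18T14:00.
# Colons and hyphens are legal in ZFS snapshot names; the hourly/daily
# prefix convention is our own, not anything ZFS mandates.
snapname() {
  printf '%s@%s-%s\n' "$1" "$2" "$(date -u +%Y-%m-%dT%H:%M)"
}

# From cron, e.g. hourly:  zfs snapshot "$(snapname tank/data hourly)"
```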

The bigger issue you'll run into is data sizing, as a year's worth of 
snapshots basically means you're keeping a journal of every single write 
that has occurred over the year. If you are running VM images, this can also 
mean you're retaining a year's worth of writes to your OS swap files - 
something of exceedingly little value. You might want to consider moving the 
swap files to a separate virtual disk on a different volume that you don't 
snapshot.
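On the OI side that separate volume can be a dedicated dataset that the 
auto-snapshot service skips. A sketch, assuming the Solaris/OI 
com.sun:auto-snapshot property convention; the tank/vmswap name is a 
placeholder, and the function takes an optional runner so it can be previewed:

```shell
# setup_swap_dataset: create an NFS-shared dataset for VM swap vmdks,
# marked so zfs-auto-snapshot/time-slider leaves it alone.
setup_swap_dataset() {
  zfs=${1:-zfs}   # pass "echo" to preview the commands instead of running them
  "$zfs" create tank/vmswap
  "$zfs" set com.sun:auto-snapshot=false tank/vmswap
  "$zfs" set sharenfs=on tank/vmswap
}

# On the OI box:  setup_swap_dataset
```

Then mount that share as its own ESXi datastore and point each VM's swap vmdk 
at it.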

If you're running ESXi with a vSphere license, I'd recommend looking at VDR 
(free with the vCenter license) for backing up the VMs to the little HPs since 
you get compressed and deduplicated backups that will minimize the replication 
bandwidth requirements.

Much depends on what you're optimizing for. If it's RTO (bringing VMs back 
online very quickly), then replicating the primary NFS datastore is great: 
just point a server at the replicated NFS store, import the VM, and start 
it, with an RPO that coincides with your snapshot frequency.
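The point-and-import step is only a couple of commands on the surviving ESXi 
host. A sketch using the stock esxcli and vim-cmd tools; the backup-nas host, 
/tank/vmstore share, vmstore-dr label, and VM name are all placeholders:

```shell
# recover_vm: mount the replicated NFS datastore and register a VM from it.
recover_vm() {
  vm=$1
  run=${2:-}   # pass "echo" as the second arg to preview the commands
  $run esxcli storage nfs add --host backup-nas --share /tank/vmstore --volume-name vmstore-dr
  $run vim-cmd solo/registervm "/vmfs/volumes/vmstore-dr/$vm/$vm.vmx"
}

# On the ESXi host:  recover_vm myvm
```

After registering, power the VM on and answer the "moved/copied" prompt; 
total downtime is dominated by how fast you notice the primary is gone.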

Cheers,

Erik
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
