Re: [zfs-discuss] maczfs / ZEVO

2013-02-17 Thread Erik Ableson


On 17 Feb. 2013, at 15:15, Edward Ned Harvey 
(opensolarisisdeadlongliveopensolaris) 
opensolarisisdeadlongliveopensola...@nedharvey.com wrote:

 From: Tim Cook [mailto:t...@cook.ms]
 Sent: Friday, February 15, 2013 11:14 AM
 
 I have a few coworkers using it.  No horror stories and it's been in use 
 about 6
 months now.  If there were any showstoppers I'm sure I'd have heard loud
 complaints by now :)
 
 So, I have discovered a *couple* of unexpected problems.
 At first, I thought it would be nice to split my HD into 2 partitions, use 
 the 2nd partition for a zpool, and use a vmdk wrapper around a zvol raw device.  
 So I started partitioning my HD.  As it turns out, there's a bug in Disk 
 Utility...  As long as you partition your hard drive and *format* the 
 second partition with HFS+, then it works very smoothly.  But then I couldn't 
 find any way to dismount the second partition (there is no eject) ... If I go 
 back, I think maybe I'll figure it out, but I didn't try too hard ... I 
 resized back to normal, and then split again, selecting the Empty Space 
 option for the second partition.  Bad idea.  Disk Utility horked the 
 partition tables, and I had to restore from Time Machine.  I thought maybe it 
 was just a fluke, so I repeated the whole process a second time ... tried to 
 split the disk, tried to make the second half Free Space, and was forced to 
 restore the system.
 
 Lesson learned. Don't try to create an unused partition on the mac HD.
 
 So then I just created one big honking file via dd and used it for the zpool 
 store.  Tried to create a zvol.  Unfortunately ZEVO doesn't do zvols.
 
 Ok, no problem.  Windows can run NTFS inside a vmdk file inside a zfs 
 filesystem inside an hfs+ file inside the hfs+ filesystem.  (Yuk.)  But it 
 works.  
 Unfortunately, because the pool is backed by a file, ZEVO doesn't find it 
 on reboot.  It doesn't seem to do the equivalent of a zpool.cache.  I've 
 asked a question in their support forum to see if there's some way to solve 
 that problem, but I don't know yet.
 
 Tim, Simon, Volker, Chris, and Erik - How do you use it?
 I am making the informed guess that you're using it primarily on 
 non-laptops, which have second hard drives, and that you're giving the entire 
 disk to the zpool.  Right?

Actually, my usage is with a laptop, but I've pretty much given up on doing 
anything serious in ZFS without going whole-disk, so I hadn't run across the 
partitioning issues or the lack of a zpool.cache equivalent for mounting 
file-based pools.

Back to the day-to-day usage. I'm using it primarily with my MacBook Air and I 
have a Seagate GoFlex Thunderbolt adaptor into which I plug SSDs holding VMs and 
sources. While on the move, I leave the external drive in my bag and use a 1m 
Thunderbolt cable, so I'm tethered to the bag, but it's usable.

Eventually, I'll probably get one of the StarTech 4-disk toaster docks on USB 3 
for while I'm at the office, and continue to rely on the Thunderbolt SSD while 
on the road. 

On the partitioning front, after thinking about it a bit, you should be able to 
tell ZEVO to use a second partition on the main disk. The trick would be creating 
the partition normally as an HFS+ volume, unmounting it with something like 
sudo diskutil unmount disk0s4, and then running sudo zpool create zevo disk0s4.
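
Spelled out, and completely untested from memory, the whole sequence would be 
something like this (disk0s4 is just a placeholder - check diskutil list for 
the actual slice Disk Utility creates):

  diskutil list                    # identify the new slice, e.g. disk0s4
  sudo diskutil unmount disk0s4    # release it from HFS+
  sudo zpool create zevo disk0s4   # hand the slice over to ZEVO
  zpool status zevo                # confirm the pool is online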

Oh, one other side note I almost forgot: to make sure that you don't chew up all 
of your memory with the ARC, it's also a good idea to disable Spotlight indexing 
on ZFS volumes (sudo mdutil -i off /Volumes/Zevo).

Cheers,

Erik
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] maczfs / ZEVO

2013-02-15 Thread Erik Ableson
I've been using it happily since before the GreenBytes purchase. You're 
currently limited to pool version 28. I generally use it with external USB 
drives (single-disk pools), but I have tested file-based RAIDZ pools, which 
worked fine.

The only caveat I will note, particularly for working with VMs (my primary use 
case as well), is that you can run into situations where the OS is RAM-starved 
with the ARC filling up. I've run into cases where Fusion refused to boot up 
VMs, claiming not enough memory, after I'd been using another machine for a while. 
Ejecting the pool will generally clear out the ARC (allocated to the kernel) so 
that you can reinsert it and then start the VM.

It's a full implementation as far as I can tell, including zfs send/recv, so you 
can easily back up across the network without having to plug your disks into 
the other server.
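
For example, a push to another machine is just the usual pipeline - something 
along these lines, with made-up pool and host names:

  zfs snapshot tank/vms@backup1
  zfs send tank/vms@backup1 | ssh backuphost zfs recv -F backup/vms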

I'd put it in the reliable camp (at the very least more reliable than HFS+ or 
ExFAT on cheap 2.5" drives).

Cheers,

Erik

On 15 Feb. 2013, at 17:08, Edward Ned Harvey 
(opensolarisisdeadlongliveopensolaris) 
opensolarisisdeadlongliveopensola...@nedharvey.com wrote:

 Anybody using maczfs / ZEVO?  Have good or bad things to say, in terms of 
 reliability, performance, features?
  
 My main reason for asking is this:  I have a Mac, I use Time Machine, and I 
 have VMs inside.  Time Machine, while great in general, has the limitation 
 of being unable to intelligently identify changed bits inside a VM file.  So 
 you have to exclude the VM from Time Machine, and you have to run backup 
 software inside the VM. 
  
 I would greatly prefer, if it's reliable, to let the VMs reside on ZFS and use 
 zfs send to back up my guest VMs.
  
 I am not looking to replace HFS+ as the primary filesystem of the mac; 
 although that would be cool, there's often a reliability benefit to staying 
 on the supported, beaten path, standard configuration.  But if ZFS can be 
 used to hold the guest VM storage reliably, I would benefit from that.
 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] all in one server

2012-09-18 Thread Erik Ableson


On 18 Sept. 2012, at 16:40, Dan Swartzendruber dswa...@druber.com wrote:

 On 9/18/2012 10:31 AM, Eugen Leitl wrote:
 I'm currently thinking about rolling a variant of
 
 http://www.napp-it.org/napp-it/all-in-one/index_en.html
 
 with remote backup (via snapshot and send) to 2-3
 other (HP N40L-based) zfs boxes for production in
 our organisation. The systems themselves would
 be either Dell or Supermicro (latter with ZIL/L2ARC
 on SSD, plus SAS disks (pools as mirrors) all with
 hardware pass-through).
 
 The idea is to use zfs for data integrity and
 backup via data snapshot (especially important
 data will be also back-up'd via conventional DLT
 tapes).
 
 Before I test this --
 
 Is anyone using this is in production? Any caveats?
   
 I run an all-in-one and it works fine. Supermicro X9SCL-F with 32GB ECC RAM, 
 20GB of which goes to the OpenIndiana SAN VM, with an IBM M1015 passed through 
 via VMDirectPath (PCI passthrough).  4 SAS nearline drives in a 2x2 mirror 
 config in a JBOD chassis.  2 Samsung 830 128GB SSDs as L2ARC.  The main caveat 
 is to order the VMs properly for auto-start (assuming you use that as I do).  
 The OI VM goes first, and I give it a good 120 seconds before starting the 
 other VMs.  For auto shutdown, all VMs but OI do suspend; OI does shutdown.  
 The big caveat: do NOT use iSCSI for the datastore, use NFS.  Maybe there's a 
 way to fix this, but I found that on startup, ESXi would time out the iSCSI 
 datastore mount before the virtualized SAN VM was up and serving the share - 
 bad news.  NFS seems to be more resilient there.  vmxnet3 vnics should work 
 fine for the OI VM, but you might want to stick to e1000.
 Can I actually have a year's worth of snapshots in
 zfs without too much performance degradation?
   
 Dunno about that.

This matches my experience after building a few custom appliances with 
similar configurations. For the backup side of things, stop and think about the 
actual use cases for keeping a year's worth of snapshots. Generally speaking, 
restore requests are for data that is relatively hot and has been live at some 
point in the current quarter. I think that you could limit your snapshot 
retention to something smaller, and pull the files back from tape if you go 
past that.

One detail missing from this calculation is the frequency of snapshots. A 
year's worth of hourly snapshots is huge for a little box like the HP NXXL 
machines. A year's worth of daily snapshots is more in the domain of the 
reasonable. For reference, though, I have one box that retains 4 weeks of 
replicated hourly snapshots without complaint (8GB / 4x2TB raidz1).
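
For what it's worth, that kind of rotation boils down to something like this 
run from cron (the dataset name, prefix and timestamps are placeholders):

  # take a recursive, timestamped snapshot
  zfs snapshot -r tank/vms@hourly-`date +%Y%m%d-%H%M`
  # prune: destroy whatever has aged out of the retention window, e.g.
  zfs destroy -r tank/vms@hourly-20120821-1400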

The bigger issue you'll run into will be data sizing, as a year's worth of 
snapshots basically means that you're keeping a journal of every single write 
that's occurred over the year. If you are running VM images, this can also mean 
that you're retaining a year's worth of writes to your OS swap files - something 
of exceedingly little value. You might want to consider moving the swap files 
to a separate virtual disk on a different volume.
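
As a sketch (the names are invented), the swap disks can live on their own 
filesystem that simply never gets snapshotted or replicated:

  zfs create -o compression=off tank/vm-swap
  # point each guest's swap vmdk at this share/datastore and leave
  # tank/vm-swap out of the snapshot and send/recv schedule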

If you're running ESXi with a vSphere license, I'd recommend looking at VDR 
(free with the vCenter license) for backing up the VMs to the little HPs since 
you get compressed and deduplicated backups that will minimize the replication 
bandwidth requirements.

Much depends on what you're optimizing for. If it's RTO (bringing VMs back online 
very quickly), then replicating the primary NFS datastore is great - just point 
a server at the replicated NFS store, import the VM and start, with an RPO that 
coincides with your snapshot frequency.

Cheers,

Erik
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Incremental send/recv interoperability

2011-02-15 Thread Erik ABLESON
Just wondering if an expert can chime in on this one.

I have an older machine running 2009.11 with a zpool at version 14. I have a 
new machine running Solaris Express 11 with the zpool at version 31.

I can use zfs send/recv to send a filesystem from the older machine to the new 
one without any difficulties. However, as soon as I try to update the remote 
copy with an incremental send/recv, I get back the error "cannot receive 
incremental stream: invalid backup stream".

I was under the impression that the streams were backwards compatible (i.e. a 
newer version can receive older streams), which appears to be correct for the 
initial send/recv operation but is failing on the incremental.
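
For reference, the pattern is just the basic one - something like the following, 
with the names changed and ssh as an assumed transport:

  # initial copy - works fine
  zfs send tank/data@snap1 | ssh newhost zfs recv -F tank/data
  # incremental update - this is the step that fails
  zfs send -i tank/data@snap1 tank/data@snap2 | ssh newhost zfs recv tank/data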

Cheers,

Erik
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Incremental send/recv interoperability

2011-02-15 Thread Erik ABLESON
Doh - 2008.11

On 15 Feb. 2011, at 11:18, Erik ABLESON wrote:

 I have an older machine running 2009.11 with a zpool at version 14. I have a 
 new machine running Solaris Express 11 with the zpool at version 31.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to avoid striping ?

2010-11-01 Thread Erik Ableson
On 18 Oct. 2010 at 08:44, Habony, Zsolt zsolt.hab...@hp.com wrote:

 Hi,
 
I have seen a similar question on this list in the archive but 
 haven't seen the answer.
 
 Can I avoid striping across top level vdevs ?
 
  
 
If I use a zpool which is one LUN from the SAN, then when it 
 becomes full I add a new LUN to it.
 
 But I cannot guarantee that the LUN will not come from the same spindles on 
 the SAN.
 
  
 
Can I force zpool to not to stripe the data ?
 
No. The basic principle of the zpool is dynamic striping across vdevs in order 
to ensure that all available spindles are contributing to the workload. If you 
want/need more granular control over what data goes to which disk, then you'll 
need to create multiple pools.

Just create a new pool from the new SAN volume and you will segregate the IO. 
But then you risk having hot and cold spots in your storage, as the IO won't be 
striped. If the approach is to fill a vdev completely before adding a new one, 
this possibility exists anyway until block rewrite arrives to redistribute 
existing data across the available vdevs.
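
For example (device names are placeholders):

  # rather than zpool add pool1 c4t1d0, which would stripe across both LUNs:
  zpool create pool2 c4t1d0    # c4t1d0 being the new LUN
  zfs create pool2/data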

Cheers,

Erik

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Mac OS X clients with ZFS server

2010-09-28 Thread Erik Ableson

On 16 Sept. 2010 at 16:18, Rich Teer rich.t...@rite-group.com wrote:

 On Thu, 16 Sep 2010, erik.ableson wrote:
 
 And for reference, I have a number of 10.6 clients using NFS for
 sharing Fusion virtual machines, iTunes library, iPhoto libraries etc.
 without any issues.
 
 Excellent; what OS is your NFS server running?

OpenSolaris snv129

Erik
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Mac OS X clients with ZFS server

2010-09-28 Thread Erik Ableson
The only tweak needed was making sure that I used the FQDN of the client 
machines (with appropriate reverse lookups in my DNS) for the sharenfs 
properties. 
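
Something along these lines, with hypothetical client names:

  zfs set sharenfs=rw=client1.example.com:client2.example.com,root=client1.example.com tank/vms

As far as I recall, the reverse lookups matter because the server resolves the 
incoming client address back to a name before matching it against that list.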

Sent from my iPhone

On 16 Sept. 2010 at 17:15, Rich Teer rich.t...@rite-group.com wrote:

 On Thu, 16 Sep 2010, Erik Ableson wrote:
 
 OpenSolaris snv129
 
 Hmm, SXCE snv_130 here.  Did you have to do any server-side tuning
 (e.g., allowing remote connections), or did it just work out of the
 box?  I know that Sendmail needs some gentle persuasion to accept
 remote connections out of the box; perhaps lockd is the same?
 
 -- 
 Rich Teer, Publisher
 Vinylphile Magazine
 
 www.vinylphilemag.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Snapshots and Data Loss

2010-04-13 Thread Erik Ableson
A snapshot is a picture of the storage at a point in time, so everything 
depends on the applications using the storage. If you're running a db with 
lots of cache, it's probably a good idea to stop the service or force a flush 
to disk before taking the snapshot to ensure the integrity of the data. That 
said, rolling back to a snapshot is roughly the same thing as stopping the 
application brutally, and it's up to the application to evaluate the data. 
Some will handle it better than others.


If you're running virtual machines, the ideal solution is to take a VM 
snapshot, followed by the filesystem snapshot, then delete the VM 
snapshot.
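
In practice that boils down to something like this (the tooling and names are 
illustrative, not a recipe):

  # 1. take a VM-level snapshot with the hypervisor's tools so the guest quiesces
  # 2. snapshot the underlying ZFS filesystem
  zfs snapshot tank/vmstore@`date +%Y%m%d-%H%M`
  # 3. delete the VM-level snapshot so the guest isn't left running on a delta disk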


ZFS snapshots are very reliable, but their scope is limited to the disks 
that ZFS manages, so if there's unflushed data living at a higher level, 
ZFS won't be aware of it.


Regards,

Erik Ableson

On 13 Apr. 2010, at 14:22, Tony MacDoodle tpsdoo...@gmail.com wrote:

I was wondering if any data was lost while doing a snapshot on a  
running system? Does it flush everything to disk or would some stuff  
be lost?


Thanks
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send/receive - actual performance

2010-03-26 Thread Erik Ableson


On 25 March 2010, at 22:00, Bruno Sousa bso...@epinfante.com wrote:


Hi,

Indeed the 3 disks per vdev (raidz2) seems a bad idea... but it's the 
system I have now.
Regarding the performance... let's assume that a bonnie++ benchmark 
could go to 200 MB/s in. Is the possibility of getting the same values 
(or near) in a zfs send / zfs receive just a matter of putting, 
let's say, a 10GbE card between both systems?
I have the impression that benchmarks are always synthetic, and 
therefore live/production environments behave quite differently.
Again, it might be just me, but with a 1Gb link, being able to 
replicate 2 servers with an average speed above 60 MB/s does seem 
quite good. However, like I said, I would like to know the results 
from other guys...


Don't forget to factor in your transport mechanism. If you're using 
ssh to pipe the send/recv data, your overall speed may end up being CPU 
bound, since I think that ssh will be single threaded; so even on a 
multicore system you'll only be able to consume one core, and here raw 
clock speed will make a difference.
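
Two things worth trying if that turns out to be the bottleneck (both are 
sketches with made-up pool and host names): a cheaper ssh cipher, or taking 
ssh out of the data path on a trusted network with something like mbuffer.

  # lighter cipher, if your ssh build still offers it
  zfs send -i tank@prev tank@now | ssh -c arcfour backuphost zfs recv -F backup/tank

  # mbuffer variant (mbuffer must be installed on both ends)
  # receiver: mbuffer -I 9090 -s 128k -m 1G | zfs recv -F backup/tank
  # sender:   zfs send -i tank@prev tank@now | mbuffer -O backuphost:9090 -s 128k -m 1G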


Cheers,

Erik
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS/OSOL/Firewire...

2010-03-19 Thread Erik Ableson
Funny, I thought the same thing up until a couple of years ago when I  
thought Apple should have bought Sun :-)


Regards,

Erik Ableson

+33.6.80.83.58.28
Sent from my iPhone

On 19 March 2010, at 09:41, Khyron khyron4...@gmail.com wrote:


Of course, I'm the only person I know who said that Sun should have
bought Apple 10 years ago.  What do I know?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Can we get some documentation on iSCSI sharing after comstar took over?

2010-03-17 Thread Erik Ableson
Certainly! I just whipped that up since I was testing out a pile of  
clients with different volumes and got tired of going through all the  
steps so anything to make it more complete would be useful.


Regards,

Erik Ableson

+33.6.80.83.58.28
Sent from my iPhone

On 17 March 2010, at 00:25, Svein Skogen sv...@stillbilde.net wrote:



On 16.03.2010 22:31, erik.ableson wrote:


On 16 March 2010, at 21:00, Marc Nicholas wrote:

On Tue, Mar 16, 2010 at 3:16 PM, Svein Skogen sv...@stillbilde.net wrote:



I'll write you a Perl script :)


   I think there are ... several people that'd like a script that gave us
   back some of the ease of the old shareiscsi one-off, instead of having
   to spend time on copy-and-pasting GUIDs they have ... no real use
   for. ;)


I'll try and knock something up in the next few days, then!


Try this:

http://www.infrageeks.com/groups/infrageeks/wiki/56503/zvol2iscsi.html




Thank you! :)

Mind if I (after some sleep) look at extending your script a little? Of
course with feedback of the changes I make?

//Svein

- --
- +---+---
 /\   |Svein Skogen   | sv...@d80.iso100.no
 \ /   |Solberg Østli 9| PGP Key:  0xE5E76831
  X|2020 Skedsmokorset | sv...@jernhuset.no
 / \   |Norway | PGP Key:  0xCE96CE13
   |   | sv...@stillbilde.net
ascii  |   | PGP Key:  0x58CD33B6
ribbon |System Admin   | svein-listm...@stillbilde.net
Campaign|stillbilde.net | PGP Key:  0x22D494A4
   +---+---
   |msn messenger: | Mobile Phone: +47 907 03 575
   |sv...@jernhuset.no | RIPE handle:SS16503-RIPE
- +---+---
If you really are in a hurry, mail me at
  svein-mob...@stillbilde.net
This mailbox goes directly to my cellphone and is checked
   even when I'm not in front of my computer.
- 
Picture Gallery:
 https://gallery.stillbilde.net/v/svein/
- 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Moving Storage to opensolaris+zfs. What about backup?

2010-03-03 Thread Erik Ableson
Comments inline :
 
On Wednesday, March 03, 2010, at 06:35PM, Svein Skogen sv...@stillbilde.net 
wrote:

However trying to wrap my head around solaris and backups (I'm used to 
FreeBSD) is now leaving me with a nasty headache, and still no closer to a 
real solution. I need something that on regular intervals pushes this zpool:

storage  4.06T  1.19T  2.87T29%  1.00x  ONLINE  -

onto a series of tapes, and I really want a solution that allows me to have 
something resembling a one-button-disaster recovery, either via a cd/dvd 
bootdisc, or a bootusb image, or via writing a bootblock on the tapes. 
Preferably a solution that manages to dump the entire zpool, including zfses 
and volumes and whatnot. If I can dump the rpool along with it, all the 
better. (basically something that allows me to shuffle a stack of tapes into 
the safe, maybe along with a bootdevice, with the effect of making me sleep 
easy knowing that ... when disaster happens, I can use a similar-or-better 
specced box to restore the entire server to bring everything back on line).

are there ... ANY good ideas out there for such a solution?
-- 
Only limited by your creativity.  Out of curiosity, why the tape solution for 
disaster recovery?  That strikes me as being more work, not to mention much 
more complicated for disaster recovery, since LTOs aren't usually found as 
standard kit on most machines.  As a quick idea, how about the following:

Boot your system from a USB key (or portable HD), and dd the key to a spare 
that's kept in the safe, updated whenever you do anything substantial.  That way 
you recover not just a bootable system but any system customization you've 
done. This does require downtime for the duplication, however.

For the data, rather than fight with tapes, I'd go buy a dual-bay disk 
enclosure and pop in two 2TB drives.  Attach that to the server (USB/eSATA, 
whatever's convenient) and use zfs send/recv to copy snapshots over into a fully 
exploitable copy.  Put that in the safe with the USB key and you have a 
completely mobile solution that needs only a computer. Assuming that you don't 
fill up your current 4TB of storage, you can keep a number of snapshots to 
replace the iterative copies done to tape in the old-fashioned world. Better 
yet, do this to two destinations and rotate one off-site.
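
A sketch of the idea, with invented pool and device names (the external pair 
becomes its own mirrored pool):

  # one-time setup of the backup pool on the two external drives
  zpool create safe mirror c5t0d0 c6t0d0
  # each run: snapshot, send the increment since the last run, then put it away
  zfs snapshot -r storage@backup-20100303
  zfs send -R -i storage@backup-20100224 storage@backup-20100303 | zfs recv -Fd safe
  zpool export safe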

That would be the best as far as disaster recovery convenience goes, but it does 
still require the legwork of attaching the backup disks, running the send/recv, 
exporting the pool and putting it back in the safe. Using a second machine 
somewhere and sending it across the network is more easily scalable (but 
possibly more expensive).

Remember that by copying to another zpool you have a fully exploitable 
backup copy.  I don't think that the idea of copying zfs send streams to tape 
is a reasonable approach to backups - way too many failure points and 
dependencies. Not to mention that testing your backup is easy - just import the 
pool and scrub.  Testing against tape adds wear and tear to the tapes, you 
need room to restore to, it is time consuming, and it's a general PITA (but it's 
essential!).
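
Verification of the disk-based copy is then just (assuming the backup pool 
sketched above is called safe):

  zpool import safe
  zpool scrub safe
  zpool status -v safe   # check for errors once the scrub finishes
  zpool export safe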

If you want to stick with a traditional approach, amanda is a good choice, and 
OpenSolaris does include an ndmp service, although I haven't looked at it yet.

This kind of design depends on your RTO, RPO, administrative constraints, data 
retention requirements, budget and your definition of a disaster...

IMHO, disk-to-disk with zfs send/recv offers a very flexible and practical 
solution to many backup and restore needs. Your storage media can be wildly 
different - small, fast SAS for production going to fewer big SATA drives, with 
asymmetric snapshot retention policies - keep a week in production and as many 
as you want on the bigger backup drives. Then do file-level dumps to tape from 
the backup volumes for archival purposes, which can be restored onto any 
filesystem.

Cheers,

Erik
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] file concatenation with ZFS copy-on-write

2009-12-03 Thread Erik Ableson
 On 3 Dec. 2009, at 13:29, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote:



On Thu, 3 Dec 2009, Darren J Moffat wrote:


The answer to this is likely deduplication which ZFS now has.

The reason dedup should help here is that after the 'cat' f15 will  
be made up of blocks that match the blocks of f1 f2 f3 f4 f5.


Copy-on-write isn't what helps you here it is dedup.


Isn't this only true if the file sizes are such that the  
concatenated blocks are perfectly aligned on the same zfs block  
boundaries they used before?  This seems unlikely to me.


It's also worth noting that if the block alignment works out for the  
dedup, the actual write traffic will be trivial, consisting only of  
pointer references, so the heavy lifting will be the read operations.


Much depends on the contents of the files: fixed-size binary blobs that align 
nicely with 16/32/64K boundaries versus variable-sized text files.


Regards,

Erik Ableson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAID-Z and virtualization

2009-11-08 Thread Erik Ableson
Uhhh - for an unmanaged server you can use ESXi for free. Identical 
server functionality; it just requires licenses if you need multiserver 
features (i.e. vMotion).


Regards,

Erik Ableson

On 8 Nov. 2009, at 19:12, Tim Cook t...@cook.ms wrote:




On Sun, Nov 8, 2009 at 11:48 AM, Joe Auty j...@netmusician.org wrote:
Tim Cook wrote:




It appears that one can get more in the way of features out of  
VMWare Server for free than with ESX, which is seemingly a hook  
into buying more VMWare stuff.


I've never looked at Sun xVM, in fact I didn't know it even  
existed, but I do now. Thank you, I will research this some more!


The only other variable, I guess, is the future of said  
technologies given the Oracle takeover? There has been much  
discussion on how this impacts ZFS, but I'll have to learn how xVM  
might be affected, if at all.



Quite frankly, I wouldn't let that stop you.  Even if Oracle were  
to pull the plug on xVM entirely (not likely), you could very  
easily just move the VM's back over to *insert your favorite flavor  
of Linux* or Citrix Xen.  Including Unbreakable Linux (Oracle's  
version of RHEL).




I remember now why Xen was a no-go from when I last tested it. I  
rely on the 64 bit version of FreeBSD for most of my VM guest  
machines, and FreeBSD only supports running as domU on i386 systems.  
This is a monkey wrench!


Sorry, just thinking outloud here...



I have no idea what it supports right now.  I can't even find a  
decent support matrix.  Quite frankly, I would (and do) just use a  
separate server for the fileserver than the vm box.  You can get  
64bit cpu's with 4GB of ram for awfully cheap nowadays.  That should  
be more than enough for most home workloads.


--Tim

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAID-Z and virtualization

2009-11-08 Thread Erik Ableson
Simply put, ESXi has exactly the same local feature set as ESX Server. 
So you get all of the useful stuff like transparent memory page 
sharing (memory deduplication), virtual switches with VLAN tagging, 
and high-performance storage I/O. For free. As many copies as you like.


But... you will need a vCenter license, and then per-server (well, 
per-processor) licenses, if you want the advanced management features like 
live migration of running VMs between servers, fault tolerance, guided 
consolidation etc.


Most importantly, ESXi is a bare-metal install, so you have a proper 
hypervisor allocating resources instead of a general-purpose OS with a 
virtualization application on top.


Regards,

Erik Ableson

On 8 Nov. 2009, at 19:43, Tim Cook t...@cook.ms wrote:




On Sun, Nov 8, 2009 at 12:39 PM, Joe Auty j...@netmusician.org wrote:
Erik Ableson wrote:


Uhhh - for an unmanaged server you can use ESXi for free. Identical  
server functionality, just requires licenses if you need  
multiserver features (ie vMotion)


How does ESXi w/o vMotion, vSphere, and vCenter server stack up  
against VMWare Server? My impression was that you need these other  
pieces to make such an infrastructure useful?



VMware server doesn't have vmotion.  There is no such thing as  
vsphere, that's the marketing name for the entire product suite.   
vCenter is only required for advanced functionality like HA/DPM/DRS  
that you don't have with VMware server either.


Are you just throwing out buzzwords, or do you actually know what  
they do?


--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Borked zpool, missing slog/zil

2009-09-27 Thread Erik Ableson
Hmmm - I've got a fairly old copy of the zpool.cache file (circa July), but 
nothing structural has changed in the pool since that date. What other data is 
held in that file? There have been some filesystem changes, but nothing critical 
is in the newer filesystems.

Any particular procedure required for swapping out the zpool.cache file?

Erik

On Sunday, 27 September, 2009, at 12:28AM, Ross myxi...@googlemail.com 
wrote:
Do you have a backup copy of your zpool.cache file?

If you have that file, ZFS will happily mount a pool on boot without its slog 
device - it'll just flag the slog as faulted and you can do your normal 
replace.  I used that for a long while on a test server with a ramdisk slog - 
and I never needed to swap it to a file based slog.

However without a backup of that file to make zfs load the pool on boot I 
don't believe there is any way to import that pool.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Borked zpool, missing slog/zil

2009-09-27 Thread Erik Ableson
Good link - thanks. I'm looking at the details for that one and learning a 
little zdb at the same time. I've got a situation that's perhaps a little 
different in that I _do_ have a current copy of the slog in a file with what 
appears to be current data.

However, I don't see how to attach the slog file to an offline zpool - I have 
both a dd backup of the ramdisk slog from midnight and the current 
file-based slog:

zdb -l /root/slog.tmp

    version=14
    name='siovale'
    state=1
    txg=4499446
    pool_guid=13808783103733022257
    hostid=4834000
    hostname='shemhazai'
    top_guid=6374488381605474740
    guid=6374488381605474740
    is_log=1
    vdev_tree
        type='file'
        id=1
        guid=6374488381605474740
        path='/root/slog.tmp'
        metaslab_array=230
        metaslab_shift=21
        ashift=9
        asize=938999808
        is_log=1
        DTL=51

Is there any way that I can attach this slog to the zpool while it's offline?

Erik

On 27 Sept. 2009, at 02:23, David Turnbull dsturnb...@gmail.com wrote:

 I believe this is relevant: http://github.com/pjjw/logfix
 Saved my array last year, looks maintained.

 On 27/09/2009, at 4:49 AM, Erik Ableson wrote:

 Hmmm - this is an annoying one.

 I'm currently running an OpenSolaris install (2008.11 upgraded to  
 2009.06) :
 SunOS shemhazai 5.11 snv_111b i86pc i386 i86pc Solaris

 with a zpool made up of one raidz vdev and a small ramdisk-based  
 zil.  I usually swap out the zil for a file-based copy when I need  
 to reboot (zpool replace /dev/ramdisk/slog /root/slog.tmp) but this  
 time I had a brain fart and forgot to.

 The server came back up and I could sort of work on the zpool but  
 it was complaining so I did my replace command and it happily  
 resilvered.  Then I restarted one more time in order to test  
 bringing everything up cleanly and this time it can't find the file  
 based zil.

 I try importing and it comes back with:
 zpool import
 pool: siovale
   id: 13808783103733022257
 state: UNAVAIL
 status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
   devices and try again.
  see: http://www.sun.com/msg/ZFS-8000-6X
 config:

   siovale UNAVAIL  missing device
 raidz1ONLINE
   c8d0ONLINE
   c9d0ONLINE
   c10d0   ONLINE
   c11d0   ONLINE

   Additional devices are known to be part of this pool, though  
 their
   exact configuration cannot be determined.

 Now the file still exists so I don't know why it can't seem to find  
 it and I thought the missing zil issue was corrected in this  
 version (or did I miss something?).

 I've looked around for solutions to bring it back online and ran  
 across this method: 
 http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg16545.html 
  but before I jump in on this one I was hoping there was a newer,  
 cleaner approach that I missed somehow.

 Ideas appreciated...

 Erik

 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Borked zpool, missing slog/zil

2009-09-26 Thread Erik Ableson
Hmmm - this is an annoying one.

I'm currently running an OpenSolaris install (2008.11 upgraded to 2009.06) :
SunOS shemhazai 5.11 snv_111b i86pc i386 i86pc Solaris

with a zpool made up of one raidz vdev and a small ramdisk-based zil.  I 
usually swap out the zil for a file-based copy when I need to reboot (zpool 
replace /dev/ramdisk/slog /root/slog.tmp), but this time I had a brain fart and 
forgot to.
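
For reference, the swap-out I normally do before a reboot is along these lines 
(the size is approximate and from memory):

  # migrate the log from the ramdisk onto a file before rebooting
  mkfile 900m /root/slog.tmp
  zpool replace siovale /dev/ramdisk/slog /root/slog.tmp
  # ...and after the reboot, recreate the ramdisk and move the log back
  ramdiskadm -a slog 900m
  zpool replace siovale /root/slog.tmp /dev/ramdisk/slog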

The server came back up and I could sort of work on the zpool, but it was 
complaining, so I did my replace command and it happily resilvered.  Then I 
restarted one more time in order to test bringing everything up cleanly, and 
this time it can't find the file-based zil.

I try importing and it comes back with:
zpool import
  pool: siovale
id: 13808783103733022257
 state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
devices and try again.
   see: http://www.sun.com/msg/ZFS-8000-6X
config:

siovale UNAVAIL  missing device
  raidz1ONLINE
c8d0ONLINE
c9d0ONLINE
c10d0   ONLINE
c11d0   ONLINE

Additional devices are known to be part of this pool, though their
exact configuration cannot be determined.

Now the file still exists so I don't know why it can't seem to find it and I 
thought the missing zil issue was corrected in this version (or did I miss 
something?).

I've looked around for solutions to bring it back online and ran across this 
method: http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg16545.html 
but before I jump in on this one I was hoping there was a newer, cleaner 
approach that I missed somehow.

Ideas appreciated...

Erik

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS for iSCSI based SAN

2009-06-24 Thread Erik Ableson
Bottom line with virtual machines is that your IO will be random by 
definition, since it all goes into the same pipe. If you want to be 
able to scale, go with RAID 1 (mirror) vdevs. And don't skimp on the memory.


Our current experience hasn't shown a need for an SSD for the ZIL, but 
it might be useful for L2ARC (we use iSCSI for VMs, NFS for templates 
and ISO images).
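
Adding an L2ARC device later is trivial in any case - something like the 
following, with placeholder pool and device names:

  zpool add tank cache c2t1d0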


Regards,

Erik Ableson

+33.6.80.83.58.28
Sent from my iPhone

On 24 June 2009, at 18:56, milosz mew...@gmail.com wrote:

Within the thread there are instructions for using iometer to load  
test your storage. You should test out your solution before going  
live, and compare what you get with what you need. Just because  
striping 3 mirrors *will* give you more performance than raidz2  
doesn't always mean that is the best solution. Choose the best  
solution for your use case.


multiple vm disks that have any kind of load on them will bury a raidz
or raidz2.  out of a 6x raidz2 you are going to get the iops and
random seek latency of a single drive (realistically the random seek
will probably be slightly worse, actually).  how could that be
adequate for a virtual machine backend?  if you set up a raidz2 with
6x15k drives, for the majority of use cases, you are pretty much
throwing your money away.  you are going to roll your own san, buy a
bunch of 15k drives, use 2-3u of rackspace and four (or more)
switchports, and what you're getting out of it is essentially a 500gb
15k drive with a high mttdl and a really huge theoretical transfer
speed for sequential operations (which you won't be able to saturate
anyway because you're delivering over gige)?  for this particular
setup i can't really think of a situation where that would make sense.

Regarding ZIL usage, from what I have read you will only see  
benefits if you are using NFS backed storage, but that it can be  
significant.


link?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best controller card for 8 SATA drives ?

2009-06-23 Thread Erik Ableson
Just a side note on the PERC-labelled cards: they don't have a JBOD 
mode, so you _have_ to use hardware RAID. This may or may not be an 
issue in your configuration, but it does mean that moving disks between 
controllers is no longer possible. The only way to do a pseudo-JBOD is 
to create broken RAID 1 volumes, which is not ideal.


Regards,

Erik Ableson

+33.6.80.83.58.28
Sent from my iPhone

On 23 June 2009, at 04:33, Eric D. Mudama edmud...@bounceswoosh.org wrote:



On Mon, Jun 22 at 15:46, Miles Nordin wrote:

edm == Eric D Mudama edmud...@bounceswoosh.org writes:


 edm We bought a Dell T610 as a fileserver, and it comes with an
 edm LSI 1068E based board (PERC6/i SAS).

which driver attaches to it?

pciids.sourceforge.net says this is a 1078 board, not a 1068 board.

please, be careful.  There's too much confusion about these cards.


Sorry, that may have been confusing.  We have the cheapest storage
option on the T610, with no onboard cache.  I guess it's called the
Dell SAS6i/R while they reserve the PERC name for the ones with
cache.  I had understood that they were basically identical except for
the cache, but maybe not.

Anyway, this adapter has worked great for us so far.


snippet of prtconf -D:


i86pc (driver name: rootnex)
   pci, instance #0 (driver name: npe)
   pci8086,3411, instance #6 (driver name: pcie_pci)
   pci1028,1f10, instance #0 (driver name: mpt)
   sd, instance #1 (driver name: sd)
   sd, instance #6 (driver name: sd)
   sd, instance #7 (driver name: sd)
   sd, instance #2 (driver name: sd)
   sd, instance #4 (driver name: sd)
   sd, instance #5 (driver name: sd)


For this board the mpt driver is being used, and here's the prtconf
-pv info:


 Node 0x1f
   assigned-addresses:   
81020010..fc00..0100.83020014.. 
df2ec000..4000.8302001c. 
.df2f..0001
   reg:   
0002.....01020010....0100.03020014....4000.0302001c. 
...0001
   compatible: 'pciex1000,58.1028.1f10.8' + 'pciex1000,58.1028.1f10'  
+ 'pciex1000,58.8' + 'pciex1000,58' + 'pciexclass,01' +  
'pciexclass,0100' + 'pci1000,58.1028.1f10.8' +  
'pci1000,58.1028.1f10' + 'pci1028,1f10' + 'pci1000,58.8' +  
'pci1000,58' + 'pciclass,01' + 'pciclass,0100'

   model:  'SCSI bus controller'
   power-consumption:  0001.0001
   devsel-speed:  
   interrupts:  0001
   subsystem-vendor-id:  1028
   subsystem-id:  1f10
   unit-address:  '0'
   class-code:  0001
   revision-id:  0008
   vendor-id:  1000
   device-id:  0058
   pcie-capid-pointer:  0068
   pcie-capid-reg:  0001
   name:  'pci1028,1f10'


--eric


--
Eric D. Mudama
edmud...@mail.bounceswoosh.org

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best controller card for 8 SATA drives ?

2009-06-23 Thread Erik Ableson
The problem I had was with the single RAID 0 volumes (I miswrote RAID 1 
in the original message).


This is not a straight-to-disk connection, and you'll have problems if 
you ever need to move disks around or move them to another controller.


I agree that the MD1000 with ZFS is a rocking, inexpensive setup (we  
have several!) but I'd recommend using a SAS card with a true JBOD  
mode for maximum flexibility and portability. If I remember correctly,  
I think we're using the Adaptec 3085. I've pulled 465MB/s write and  
1GB/s read off the MD1000 filled with SATA drives.


Regards,

Erik Ableson

+33.6.80.83.58.28
Sent from my iPhone

On 23 June 2009, at 21:18, Henrik Johansen hen...@scannet.dk wrote:


Kyle McDonald wrote:

Erik Ableson wrote:


Just a side note on the PERC labelled cards: they don't have a  
JBOD mode so you _have_ to use hardware RAID. This may or may not  
be an issue in your configuration but it does mean that moving  
disks between controllers is no longer possible. The only way to  
do a pseudo JBOD is to create broken RAID 1 volumes which is not  
ideal.



It won't even let you make single drive RAID 0 LUNs? That's a shame.


We currently have 90+ disks that are created as single drive RAID 0  
LUNs

on several PERC 6/E (LSI 1078E chipset) controllers and used by ZFS.

I can assure you that they work without any problems and perform very
well indeed.

In fact, the combination of PERC 6/E and MD1000 disk arrays has worked
so well for us that we are going to double the number of disks during
this fall.

The lack of portability is disappointing. The trade-off though is  
battery backed cache if the card supports it.


-Kyle



Regards,

Erik Ableson

+33.6.80.83.58.28
Sent from my iPhone

On 23 June 2009, at 04:33, Eric D. Mudama edmud...@bounceswoosh.org wrote:


 On Mon, Jun 22 at 15:46, Miles Nordin wrote:
 edm == Eric D Mudama edmud...@bounceswoosh.org writes:

  edm We bought a Dell T610 as a fileserver, and it comes with an
  edm LSI 1068E based board (PERC6/i SAS).

 which driver attaches to it?

 pciids.sourceforge.net says this is a 1078 board, not a 1068  
board.


 please, be careful.  There's too much confusion about these  
cards.


 Sorry, that may have been confusing.  We have the cheapest storage
 option on the T610, with no onboard cache.  I guess it's called  
the

 Dell SAS6i/R while they reserve the PERC name for the ones with
 cache.  I had understood that they were basically identical  
except for

 the cache, but maybe not.

 Anyway, this adapter has worked great for us so far.


 snippet of prtconf -D:


 i86pc (driver name: rootnex)
pci, instance #0 (driver name: npe)
pci8086,3411, instance #6 (driver name: pcie_pci)
pci1028,1f10, instance #0 (driver name: mpt)
sd, instance #1 (driver name: sd)
sd, instance #6 (driver name: sd)
sd, instance #7 (driver name: sd)
sd, instance #2 (driver name: sd)
sd, instance #4 (driver name: sd)
sd, instance #5 (driver name: sd)


 For this board the mpt driver is being used, and here's the  
prtconf

 -pv info:


  Node 0x1f
assigned-addresses:
81020010..fc00..0100.83020014..

 df2ec000..4000.8302001c.
 .df2f..0001
reg:
0002.....01020010....0100.03020014....4000.0302001c.

 ...0001
compatible: 'pciex1000,58.1028.1f10.8' +  
'pciex1000,58.1028.1f10'  + 'pciex1000,58.8' + 'pciex1000,58' +  
'pciexclass,01' +  'pciexclass,0100' +  
'pci1000,58.1028.1f10.8' +  'pci1000,58.1028.1f10' +  
'pci1028,1f10' + 'pci1000,58.8' +  'pci1000,58' + 'pciclass, 
01' + 'pciclass,0100'

model:  'SCSI bus controller'
power-consumption:  0001.0001
devsel-speed:  
interrupts:  0001
subsystem-vendor-id:  1028
subsystem-id:  1f10
unit-address:  '0'
class-code:  0001
revision-id:  0008
vendor-id:  1000
device-id:  0058
pcie-capid-pointer:  0068
pcie-capid-reg:  0001
name:  'pci1028,1f10'


 --eric


 --
 Eric D. Mudama
 edmud...@mail.bounceswoosh.org

 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


--
Med venlig hilsen / Best Regards

Henrik Johansen
hen...@scannet.dk
Tlf. 75 53 35 00

ScanNet Group
A/S ScanNet

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

Re: [zfs-discuss] 7110 questions

2009-06-18 Thread Erik Ableson
There's a configuration issue in there somewhere. I have a ZFS-based 
system serving storage up to some ESX servers, working great with a few 
exceptions.


First off, performance was awful, but there was some confusion on how to 
optimize network traffic on ESX, so I installed a fresh one using only 
the defaults - no jumbo frames, no EtherChannel - and I was able to push 
the ZFS server to wire speed read and write over iSCSI. I still have 
the write problem over NFS though. I should be back in the datacenter 
tomorrow to see if it's specific to the ESX NFS client.


So my advice is to start looking at all of the tweaks that have been  
applied to the networking setup on the Xen side first.


Regards,

Erik Ableson

+33.6.80.83.58.28
Sent from my iPhone

On 18 June 2009, at 21:06, lawrence ho no-re...@opensolaris.org wrote:


We have a 7110 on the try-and-buy program.

We tried using the 7110 with XenServer 5 over iSCSI and NFS. 
Nothing seems to solve the slow write problem. Within the VM, we 
observed around 8MB/s on writes. Read performance is fantastic. Some 
troubleshooting was done with the local Sun rep. The conclusion is that 
the 7110 does not have a write cache in the form of an SSD or controller 
DRAM write cache. The solution from Sun is to buy a StorageTek or a 7000 
series model with SSD write cache.


Adam, please advise if there are any fixes for the 7110. I am still shopping 
for a SAN and would rather buy a 7110 than a StorageTek or something 
else.

--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss