Hi Christoph.

Several reasons; I'll try to outline those I can recall:

1. We found the performance of NFS to be far better than iSCSI, although we 
probably hadn't spent sufficient time tweaking the iSCSI configuration. NB: we 
were exporting ZFS block devices as LUNs to virtual machines as RDMs (raw 
device mappings) rather than formatting them as VMFS, so the comparison 
perhaps isn't completely fair.
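For context, the COMSTAR side of that setup looked roughly like the sketch below. The pool and volume names are illustrative only, not our actual configuration:

```shell
# Create a ZFS block device (zvol) to back the LUN
zfs create -V 20G tank/vm01-disk0

# Register the zvol as a COMSTAR logical unit; this prints the LU's GUID
stmfadm create-lu /dev/zvol/rdsk/tank/vm01-disk0

# Expose the LU (here to all initiators; views can be restricted per host group)
stmfadm add-view <LU-GUID-from-previous-step>

# Create the iSCSI target that ESX discovers and maps as an RDM
itadm create-target
```

ESX then rescans its software iSCSI HBA and sees the LU as a raw device.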

2. Overall, the administration of iSCSI was overcomplicated and prone to human 
error. Training our people to administer and troubleshoot it was likely to be 
too costly, and the additional ongoing costs of the increased administration 
had to be considered.

3. Similar to point (2) above, we found that you had to be careful to keep 
track of IQNs and LUN IDs when mapping in VMware; it's only possible to put 
names on targets, not LUNs, so mapping multiple disks into a VM was a 
potentially error-prone process.

4. The provisioning of iSCSI storage between ESXi and COMSTAR *felt* a little 
unstable; we had numerous instances of lengthy HBA rescans and of targets 
appearing and disappearing. Although these were all explained as 'expected 
behaviour', they were either a little counter-intuitive or too time-consuming.

5. My understanding of the nature of ESX NFS connections is that less IO 
blocking takes place, which would explain why so many people see better 
throughput when running multiple VMs over the same SAN connection.

6. The limit of 48 NFS mappings in ESX wasn't going to be a constraint for us 
for the foreseeable future; we rarely load more than 15 VMs per node. It was 
therefore practical to create a separate ZFS filesystem for each VM and export 
each one over NFS.
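As a sketch, provisioning a VM this way comes down to very little (the dataset names and subnet below are made up for illustration):

```shell
# One ZFS filesystem per virtual machine
zfs create tank/vms/vm01

# Share it over NFS, granting the ESX hosts' subnet read/write and root access
zfs set sharenfs='rw=@10.0.0.0/24,root=@10.0.0.0/24' tank/vms/vm01

# ESX then mounts tank/vms/vm01 as a dedicated NFS datastore for that VM
```

Per-filesystem properties (quotas, compression, snapshot schedules) can then be tuned per VM, which is part of the appeal.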

7. Everyone understands files; when you log in to the Solaris system you can 
navigate to the ZFS filesystem and see the .vmx and .vmdk files. These are 
nice and simple to manage, clone, back up, export, etc.

8. Related to point (7), when you use NFS in this way the virtual machine 
configuration is stored alongside the rest of the virtual machine data. This 
means that a snapshot of the ZFS filesystem for that VM gives us a true 
complete backup at that point in time.
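To illustrate point (8), here is a hedged sketch of the backup workflow this enables (snapshot names and the backup host are invented for the example):

```shell
# Snapshot the VM's filesystem: the .vmx, .vmdk and ancillary files are
# all captured together at a single point in time
zfs snapshot tank/vms/vm01@backup-20090907

# Optionally replicate that point-in-time copy to another machine
zfs send tank/vms/vm01@backup-20090907 | ssh backuphost zfs receive backup/vm01

# Or clone it locally to bring up a copy of the whole VM
zfs clone tank/vms/vm01@backup-20090907 tank/vms/vm01-clone
```

Because the configuration lives alongside the disks, there's no separate step to capture the VM's metadata.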

I think that just about covers it. Essentially, although iSCSI feels like a 
cleverer way of solving our problem, and certainly presents as a more 
'enterprise' solution, for us it was a case of overcomplicating things.

Several times in the last few years we've found that simple is better, even if 
it doesn't satisfy one's desire to implement the 'technically perfect 
solution'. It's more a question of balancing the economics (I'm running a 
business, after all) with the actual requirements. Just because you *can* do 
something doesn't mean you should, and I've often found that PERCEIVED 
requirements outgrow ACTUAL requirements simply because some technology 
exists to solve problems that you don't have [yet].

In short, I like to keep it fit for purpose, even if it does feel like a more 
agricultural solution.

Finally, I should add that the one major shortcoming of our NFS solution is 
the lack of any equivalent to iSCSI multipathing. If we had any machines 
that required true high availability or automated failover this would probably 
have negated all of the points above - iSCSI multipathing is a beautiful thing, 
and it creates some awesome possibilities for fault tolerance.

As it stands, we take care of link failure at the network level (as opposed to 
the iSCSI MPIO protocol level) and deal with ESX node or storage node failure 
by manually remapping NFS filesystems from elsewhere. This is actually 
preferable to automated recovery since sometimes we don't want to take the 
'default' action during a failure scenario. By having a carefully documented 
failure plan I believe we have more flexibility, and can deal with recovery on 
a per-client basis, rather than a system-wide basis.

Ultimately we are a business dealing with multiple clients hosted on shared 
hardware, so it's important to keep our implementations client-centric, rather 
than system-centric.

Lastly, although we don't have automated failover of these systems, our 
solution still permits us to stay well within our contracted SLAs, which 
serves the business need.

Regards,

Timothy Creswick


-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Christoph Jahn
Sent: 07 September 2009 12:12 PM
To: [email protected]
Subject: Re: [storage-discuss] Scripting iSCSI COMSTAR target creation from ZFS 
block device

Could you please give some background as to why you switched to NFS?

Thanks,
Christoph

> Ah...
> 
> As I mentioned, I wrote this a while ago and actually
> we've mostly switched to using NFS as opposed to
> iSCSI for VMs so haven't had occasion to test on new
> releases.
> 
> Recent posts to this list regarding mapping LUs to
> TGs etc prompted me to post it.
> 
> T
_______________________________________________
storage-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/storage-discuss