Re: [zfs-discuss] OpenSolaris ZFS NAS Setup
To repeat what some others have said: yes, Solaris seems to handle an iSCSI device going offline in that it doesn't panic and continues working once everything has timed out. However, that doesn't necessarily mean it's ready for production use. ZFS will hang for 3 minutes (180 seconds) waiting for the iSCSI client to time out. Now I don't know about you, but to me HA doesn't mean "Highly Available, but with occasional 3-minute breaks". Most of the client applications we would want to run on ZFS would be broken by a 3-minute delay in returning data, and this was enough for us to give up on ZFS over iSCSI for now.
Re: [zfs-discuss] OpenSolaris ZFS NAS Setup
Ross wrote:
> ZFS will hang for 3 mins (180 seconds) waiting for the iSCSI client to timeout. Now I don't know about you, but HA to me doesn't mean Highly Available, but with occasional 3 minute breaks.

By default, the sd driver has a 60-second timeout with either 3 or 5 retries before timing out the I/O request. In other words, for the same failure mode on DAS or SAN storage you will get the same behaviour.

-- richard
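For anyone who wants to see or adjust those defaults, the sd timeout and retry count are kernel tunables. A minimal sketch follows, with the caveat that the exact tunable names and defaults vary by release, so treat them as assumptions and check the Solaris Tunable Parameters guide before relying on them:

  # Inspect the current values on a live system (run as root; symbol names assumed)
  echo 'sd_io_time/D' | mdb -k        # per-command timeout, in seconds
  echo 'sd_retry_count/D' | mdb -k    # retries before the I/O is failed

  # /etc/system fragment to shorten the timeout (takes effect after reboot)
  set sd:sd_io_time = 30

Shortening it too far invites the false failure modes discussed later in this thread (spun-down disks, slow devices on the same bus), so the defaults are conservative on purpose.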
Re: [zfs-discuss] OpenSolaris ZFS NAS Setup
On Mon, 7 Apr 2008, Ross wrote:
> However that doesn't necessarily mean it's ready for production use. ZFS will hang for 3 mins (180 seconds) waiting for the iSCSI client to timeout. [...] Most of the client applications we would want to run on ZFS would be broken with a 3 minute delay returning data, and this was enough for us to give up on ZFS over iSCSI for now.

It seems to me that this is a problem with the iSCSI client timeout parameters rather than with ZFS itself. Three minutes is sufficient for use over the internet but seems excessive on a LAN. Have you investigated whether the iSCSI client timeout parameters can be adjusted?

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
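If anyone wants to dig into that, the initiator's current settings can at least be inspected from the command line; whether the session timeout itself is tunable on a given build is a separate question. A small sketch using the standard Solaris initiator tools:

  # Show the software initiator node and its configured/negotiated parameters
  iscsiadm list initiator-node

  # Show discovered targets, connection state, and per-target login parameters
  iscsiadm list target -v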
Re: [zfs-discuss] OpenSolaris ZFS NAS Setup
> Crazy question here... but has anyone tried this with, say, a QLogic hardware iSCSI card? Seems like it would solve all your issues. Granted, they aren't free like the software stack, but if you're trying to set up an HA solution, the ~$800 price tag per card seems pretty darn reasonable to me.

Not sure how this would help if one target fails. The card doesn't work any magic making the target always available. We are testing a QLA-4052C card; we believe QLogic tested it as installed on a Sun box, but not against Solaris iSCSI targets. An attempt to connect from this card *appears* to cause our iscsitgtd daemon to consume a great deal of CPU and memory. We're still trying to find out why.

CT
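For what it's worth, that growth is easy to watch from the shell while reproducing the connect attempt; a quick sketch using ordinary Solaris tools (nothing iSCSI-specific, and assuming a single iscsitgtd process):

  # Watch the target daemon's CPU and resident set size every 5 seconds
  prstat -p $(pgrep iscsitgtd) 5

  # Or sample its total address-space usage
  pmap -x $(pgrep iscsitgtd) | tail -1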
Re: [zfs-discuss] OpenSolaris ZFS NAS Setup
On Mon, Apr 7, 2008 at 10:40 AM, Christine Tran [EMAIL PROTECTED] wrote:
> Not sure how this would help if one target fails. The card doesn't work any magic making the target always available.

How would it not help? From what I'm reading, there's a flag in the software iSCSI stack controlling how it reacts if a target is lost. That is completely bypassed if you use the hardware card. As far as the OS is concerned, it's just another SCSI disk.
[zfs-discuss] iSCSI targets mapped to a VMWare ESX server
Hi All;

We are running the latest Solaris 10 on an X4500 Thumper. We defined a test iSCSI LUN. Output below:

  Target: AkhanTemp/VM
      iSCSI Name: iqn.1986-03.com.sun:02:72406bf8-2f5f-635a-f64c-cb664935f3d1
      Alias: AkhanTemp/VM
      Connections: 0
      ACL list:
      TPGT list:
      LUN information:
          LUN: 0
              GUID: 01144fa709302a0047fa50e6
              VID: SUN
              PID: SOLARIS
              Type: disk
              Size: 100G
              Backing store: /dev/zvol/rdsk/AkhanTemp/VM
              Status: online

We tried to access the LUN from a Windows laptop, and it worked without any problems. However, the VMware ESX 3.2 server is unable to access the LUNs. We checked that the virtual interface can ping the X4500. Sometimes it sees the LUN, but then 200+ LUNs with the same properties are listed and we can't add them as storage. Then after a rescan they vanish. Any help appreciated.

Mertol

Mertol Ozyoney
Storage Practice - Sales Manager
Sun Microsystems, TR
Istanbul TR
Phone +902123352200
Mobile +905339310752
Fax +90212335
Email [EMAIL PROTECTED]
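For context, a LUN like the one above is normally backed by a zvol and exported either through the shareiscsi property or with iscsitadm. A sketch of how such a target is typically created, using the dataset name from this post (the exact commands are assumed, not taken from the original message):

  # Create a 100 GB zvol to back the LUN
  zfs create -V 100g AkhanTemp/VM

  # Simplest route: let ZFS manage the iSCSI target
  zfs set shareiscsi=on AkhanTemp/VM

  # Or create the target by hand against the zvol's raw device
  # iscsitadm create target -b /dev/zvol/rdsk/AkhanTemp/VM AkhanTemp/VM

  # Verify the result
  iscsitadm list target -v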
Re: [zfs-discuss] iSCSI targets mapped to a VMWare ESX server
Mertol Ozyoney wrote:
> However, the VMware ESX 3.2 server is unable to access the LUNs. Sometimes it sees the LUN, but then 200+ LUNs with the same properties are listed and we can't add them as storage.

There is a set of issues being looked at that prevent the VMware ESX server from working with the Solaris iSCSI target:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6597310

At this time there is no target date for when these issues will be resolved.

Jim
Re: [zfs-discuss] OpenSolaris ZFS NAS Setup
Ross Smith wrote:
> Which again is unacceptable for network storage. If hardware RAID controllers took over a minute to time out a drive, network admins would be in uproar. Why should software be held to a different standard?

You need to take a systems approach to analyzing these things. For example, how long does an array take to cold boot? When I was Chief Architect for Integrated Systems Engineering, we had a product which included a storage array and a server racked together. If you used the defaults and simulated a power-loss failure scenario, the whole thing fell apart. Why? Because the server cold-booted much faster than the array. When Solaris started, it looked for the disks, found none because the array was still booting, and declared those disks dead. The result was that you needed system administrator intervention to get the services started again. Not acceptable. The solution was to delay the server boot to more closely match the array's boot time.

The default timeout values can be changed, but we rarely recommend it. You can get into all sorts of false failure modes with small timeouts. For example, most disks spec a 30-second spin-up time, so if your disk is spun down, perhaps for power savings, then you need a timeout which is greater than 30 seconds by some margin. Similarly, if you have a CD-ROM hanging off the bus, then you need a long timeout to accommodate the slow data access of a CD-ROM. I wrote a Sun BluePrint article discussing some of these issues a few years ago:

http://www.sun.com/blueprints/1101/clstrcomplex.pdf

> I can understand the driver being persistent if your data is on a single disk; however, when you have any kind of redundant data, there is no need for these delays. And there should definitely not be delays in returning status information. Whoever heard of a hardware RAID controller that takes 3 minutes to tell you which disk has gone bad?
>
> I can understand how the current configuration came about, but it seems to me that the design of ZFS isn't quite consistent. You do all this end-to-end checksumming to double-check that data is consistent because you don't trust the hardware, cables, or controllers not to corrupt data. Yet you trust that same equipment absolutely when it comes to making status decisions. It seems to me that you either trust the infrastructure or you don't, and the safest decision (as ZFS' integrity checking has shown) is not to trust it. ZFS would be better off assuming that drivers and controllers won't always return accurate status information, and having its own set of criteria to determine whether a drive (of any kind) is working as expected and returning responses in a timely manner.

I don't see any benefit in ZFS adding another set of timeouts over and above the existing ones. Indeed, we often want to delay any rash actions which would cause human intervention or prolonged recovery later. Sometimes patience is a virtue.

-- richard
Re: [zfs-discuss] iSCSI targets mapped to a VMWare ESX server
Thanks James;

The problem is nearly identical to mine. When we had 2 LUNs, VMware tried to multipath over them. I think this is a bug inside VMware, as it thinks that the two LUN 0s are the same disk. I think I can fool it by setting up targets with different LUN numbers.

After I figured this out, I switched to a single LUN, cleaned a few things up and reinitialized. For some reason VMware found 200+ LUNs again, but all with a single active path. I started formatting the first one (100 GB). It's been going on for the last 20 minutes; I hope it will eventually finish. zpool iostat shows only small activity on the disk.

I think that VMware 3.0.2 has a severe bug. I will try to open a case with VMware, but there seem to be a lot of people on the web who had no problems with the exact same setup.

Mertol

Mertol Ozyoney
Storage Practice - Sales Manager
Sun Microsystems, TR
Istanbul TR
Phone +902123352200
Mobile +905339310752
Fax +90212335
Email [EMAIL PROTECTED]
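In case it helps anyone watching the same symptom, the trickle of I/O during the ESX format can be watched per device; a quick sketch using the pool name from this thread:

  # Per-vdev I/O statistics every 5 seconds while the VMFS format runs
  zpool iostat -v AkhanTemp 5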
[zfs-discuss] Downgrade zpool version?
I've been using ZFS on my home media server for about a year now. There's a lot I like about Solaris, but the rest of the computers in my house are Macs. Now that the Mac has experimental read/write support for ZFS, I'd like to migrate my zpool to my Mac Pro. I primarily use the machine to serve my iTunes library, and my Solaris Samba shares never really played well with iTunes.

The problem is that I upgraded to Solaris nv84 a while ago and bumped my zpool to version 9 (I think) at that time. The Macintosh guys only support up to version 8. There doesn't seem to be much activity on the ZFS project at macosforge.com, so I'm guessing support for v9 isn't right around the corner. I read the zpool man page and I know it says there's no way to downgrade, but I'm grasping at straws here. My next alternative is to try to run Solaris in VMware on the Mac, but I'd rather not have to do that.

Thanks for the help,
Dave
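For reference, the pool's current on-disk version, and the versions a given build can handle, can be checked before attempting any migration; a small sketch (the pool name is made up):

  # Report the ZFS pool version this system runs and any pools below it
  zpool upgrade

  # Show every version this release supports and what each one added
  zpool upgrade -v

  # On builds with pool properties, the version is also visible directly
  zpool get version mediapool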
Re: [zfs-discuss] Per filesystem scrub
Jeff,

On Mon, Mar 31, 2008 at 9:01 AM, Jeff Bonwick [EMAIL PROTECTED] wrote: Peter, that's a great suggestion. And as fortune would have it, we have the code to do it already. Scrubbing in ZFS is driven from the logical layer, not the physical layer. When you scrub a pool, you're really just scrubbing the pool-wide metadata, then scrubbing each filesystem.

Thanks for the encouraging response. I was hoping that it would be little more than starting the traversal in the correct place! I've logged CR 6685106 to cover this request.

At 50,000 feet, it's as simple as adding a zfs(1M) scrub subcommand and having it invoke the already-existing DMU traverse interface. Closer to the ground, there are a few details to work out -- we need an option to specify whether to include snapshots, whether to descend recursively (in the case of nested filesystems), and how to handle branch points (which are created by clones). Plus we need some way to name the MOS (meta-object set, which is where we keep all pool metadata) so you can ask to scrub only that. Devil's in the details and all that...

Sounds like a nice tidy project for a summer intern!

Jeff

On Sat, Mar 29, 2008 at 05:14:20PM +0000, Peter Tribble wrote: A brief search didn't show anything relevant, so here goes: would it be feasible to support a scrub per filesystem rather than per pool? The reason is that on a large system, a scrub of a pool can take excessively long (and, indeed, may never complete). Running a scrub on each filesystem would allow it to be broken up into smaller chunks, which would be much easier to arrange. (For example, I could scrub one filesystem a night and not have it run into working hours.) Another reason might be that I have both busy and quiet filesystems. The busy ones are regularly backed up, and the data regularly read anyway; the quiet ones are neither read nor backed up, so it would be nice to be able to validate those.

-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
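Until something like that lands, the only granularity is the pool, so for anyone following along the existing workflow looks roughly like this (pool name is hypothetical):

  # Scrub the whole pool: all filesystems, snapshots, and metadata
  zpool scrub tank

  # Check progress; a scrub in flight reports percent complete and errors found
  zpool status -v tank

  # Stop a running scrub if it collides with working hours
  zpool scrub -s tank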
Re: [zfs-discuss] iSCSI targets mapped to a VMWare ESX server
Some time ago I experienced the same issue: only one target could be connected from an ESX host, and the others were shown as alternative paths to that target. If I'm remembering correctly, I read on a forum that it has something to do with the disk's serial number. Normally every (i)SCSI disk is presented with a serial number, but when we create zvols on ZFS and share them via iSCSI, there is no unique serial number. Somebody explained that the Cisco initiator (built into ESX) uses this serial to differentiate between multiple targets, but since the serial is empty, ESX thinks all those targets are the same disk. If that's true, I don't understand why they use the serial, since the IQN is already a unique name for a target. Or shouldn't (Open)Solaris set a unique serial number for every zvol?

K
Re: [zfs-discuss] Downgrade zpool version?
On Mon, Apr 7, 2008 at 12:46 PM, David Loose [EMAIL PROTECTED] wrote:
> The problem is that I upgraded to Solaris nv84 a while ago and bumped my zpool to version 9 (I think) at that time. The Macintosh guys only support up to version 8.

I'm not sure if it would work, but did you try zfs send / zfs recv? If it's just sending the filesystem data, you may be able to get around the zpool version problem.

-B

-- 
Brandon High [EMAIL PROTECTED]
The good is the enemy of the best. - Nietzsche
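A rough sketch of what that could look like, assuming ssh access from the Solaris box to the Mac and a pool already created on the Mac at a version its ZFS port understands (all names here are hypothetical, and whether the Mac port accepts the stream is exactly the open question):

  # On the Solaris server: snapshot the filesystem for a consistent source
  zfs snapshot mediapool/itunes@migrate

  # Stream it to the Mac; the receiving pool keeps whatever version it was created with
  zfs send mediapool/itunes@migrate | ssh macpro zfs receive -d tank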
Re: [zfs-discuss] Downgrade zpool version?
On Apr 7, 2008, at 1:46 PM, David Loose wrote:
> my Solaris samba shares never really played well with iTunes.

Another approach might be to stick with Solaris on the server and run netatalk (netatalk.sourceforge.net) instead of Samba (or, you know, your Macs can speak NFS ;).

-- 
Keith H. Bierman [EMAIL PROTECTED] | AIM kbiermank
5430 Nassau Circle East | Cherry Hills Village, CO 80113 | 303-997-2749
speaking for myself* Copyright 2008
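If the NFS route sounds appealing, exporting a dataset from Solaris is a one-liner, and the Mac can mount it from the Finder or the shell; a sketch with hypothetical dataset and host names:

  # On the Solaris server
  zfs set sharenfs=on mediapool/itunes

  # On the Mac (or use Finder's "Connect to Server" with an nfs:// URL)
  sudo mkdir -p /Volumes/itunes
  sudo mount -t nfs server:/mediapool/itunes /Volumes/itunes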