[zfs-discuss] RFE: ISCSI alias when shareiscsi=on
Starting from this thread: http://www.opensolaris.org/jive/thread.jspa?messageID=118786

I would love to have the possibility to set an iSCSI alias when doing a shareiscsi=on on ZFS. This would greatly facilitate identifying where an IQN is hosted. The iSCSI alias is defined in RFC 3721, e.g. http://www.apps.ietf.org/rfc/rfc3721.html#sec-2, and the CLI could be something like:

  zfs set shareiscsi=on shareiscsiname= tank

Ced.

--
Cedric BRINER
Geneva - Switzerland
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
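For illustration, the current and requested usage might look like this (a sketch; the shareiscsiname property is the hypothetical one proposed in this RFE and does not exist in ZFS, and the zvol name is made up):

```
# today: share a zvol as an iSCSI target; the IQN is auto-generated
zfs create -V 10g tank/vol-1
zfs set shareiscsi=on tank/vol-1

# proposed: also attach a human-readable iSCSI alias (hypothetical property)
zfs set shareiscsiname=tiger tank/vol-1
```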
Re: [zfs-discuss] iscsitadm local_name in ZFS
cedric briner wrote:
> hello dear community,
> Is there a way to have a ``local_name'' as defined in iscsitadm.1m when you shareiscsi a zvol? This way, it would give an even easier way to identify a device through its IQN.
> Ced.

Okay, no reply from you, so... maybe I didn't make myself clear. Let me try to re-explain what I mean: when you use a zvol and enable shareiscsi, could you add a suffix to the IQN (iSCSI Qualified Name)? This suffix would be chosen by me and would help me identify which IQN corresponds to which zvol: it is just a more human-readable tag on an IQN. Similarly, this tag is also given when you use iscsitadm, and in the man page of iscsitadm it is called a local_name:

  iscsitadm create target -b /dev/dsk/c0d0s5 tiger
or
  iscsitadm create target -b /dev/dsk/c0d0s5 hd-1

tiger and hd-1 are local_names.

Ced.

--
Cedric BRINER
Geneva - Switzerland
[zfs-discuss] iscsitadm local_name in ZFS
hello dear community,

Is there a way to have a ``local_name'' as defined in iscsitadm.1m when you shareiscsi a zvol? This way, it would give an even easier way to identify a device through its IQN.

Ced.

--
Cedric BRINER
Geneva - Switzerland
Re: [zfs-discuss] Probability Failure & Calculator
Richard Elling wrote:
> cedric briner wrote:
>> Hello ZFS community,
>> I do not have such a strong love for *probability*, and even less love when probability characterizes the true, solid and tangible stuff that I have to administer. I've started doing some math... don't get scared: I'm not going to show you the little scribbles I've done. But I'm looking for some good material:
>> - online tools to calculate, in an analytic or numeric manner, different topologies
>> - best practice on raid topology: which is best:
>>   - zfs pool tank1 raidz 1 2 3 raidz 4 5 6 raidz 7 8 9
>>   - zfs pool tank1 raidz raidz 1 2 3 raidz 4 5 6 raidz 7 8 9
>>   - ...
>> and other things that I have not even thought about. Any thoughts?
>
> I've put together some graphs to show various possible configurations. The models are also explained (except for the ditto blocks, which is in the pipeline, hopefully in the next few days). See http://blogs.sun.com/relling
> -- richard

This is good news to me! I'll be checking your blog in the next few days. thanks!

Ced.

--
Cedric BRINER
Geneva - Switzerland
[zfs-discuss] Probability Failure & Calculator
Hello ZFS community,

I do not have such a strong love for *probability*, and even less love when probability characterizes the true, solid and tangible stuff that I have to administer. I've started doing some math... don't get scared: I'm not going to show you the little scribbles I've done. But I'm looking for some good material:

- online tools to calculate, in an analytic or numeric manner, different topologies
- best practice on raid topology: which is best:
  - zfs pool tank1 raidz 1 2 3 raidz 4 5 6 raidz 7 8 9
  - zfs pool tank1 raidz raidz 1 2 3 raidz 4 5 6 raidz 7 8 9
  - ...
- and other things that I have not even thought about

Any thoughts?

Ced.

--
Cedric BRINER
Geneva - Switzerland
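As a rough illustration of the kind of calculation involved, here is a deliberately simplified model (independent disk failures, single-parity raidz, no resilver-time term; serious models such as the MTTDL ones on Richard Elling's blog account for repair windows too). The 3% per-disk failure probability is an arbitrary example value:

```python
def group_loss_prob(p, n):
    """Probability that a single-parity raidz group of n disks loses data,
    i.e. that 2 or more of its disks fail within the period considered,
    assuming independent failures with per-disk probability p."""
    survive = (1 - p) ** n                 # no disk fails
    one_fail = n * p * (1 - p) ** (n - 1)  # exactly one fails (parity absorbs it)
    return 1 - survive - one_fail

def pool_loss_prob(p, n, groups):
    """Probability that a pool of `groups` independent raidz groups of n
    disks loses data: the pool is lost as soon as any one group is lost."""
    return 1 - (1 - group_loss_prob(p, n)) ** groups

# Compare the two 9-disk layouts from the question with p = 3% per disk:
# three raidz groups of 3 disks versus one raidz group of 9 disks.
p = 0.03
three_small = pool_loss_prob(p, 3, 3)  # smaller groups: safer,
one_big = pool_loss_prob(p, 9, 1)      # but only 6 vs 8 data disks of capacity
print(f"3 x raidz(3): {three_small:.4f}")
print(f"1 x raidz(9): {one_big:.4f}")
```

Under this model the three small groups lose data noticeably less often than the single wide group, at the cost of two extra parity disks; that is the capacity-vs-reliability trade-off the question is really about.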
Re: [zfs-discuss] HowTo: UPS + ZFS & NFS + no fsync
> You might set zil_disable to 1 (_then_ mount the fs to be shared). But
> you're still exposed to OS crashes; those would still corrupt your nfs
> clients.

Just to understand better (I know that I'm quite slow :( ): when you say _nfs clients_, are you specifically talking about:
- the nfs client programs themselves (lockd, statd), meaning that you can get a stale nfs handle or other such things? or
- the host acting as an nfs client, meaning that the nfs client service still works, but the data used by the software on the NFS-mounted disk may be corrupted?

If I'm digging and digging into this ZIL, and NFS over UFS with write cache, it's because I do not understand which kind of problems can occur. What I read in general are statements like "corruption from the client's point of view"... but what does that mean? Is the scheme of what can happen:
- the application on the nfs client side writes data to the nfs server
- meanwhile the nfs server crashes, so:
  - the data are not stored
  - the application on the nfs client thinks that the data are stored! :(
- when the server is up again:
  - the nfs client re-reads the data
  - the application on the nfs client side finds itself with data in the state preceding its last writes.

Am I right? So with the ZIL: the application has the ability to do things the right way, so even after an nfs-server crash, the application on the nfs-client side can rely on its own data. And without the ZIL: the application does not have the ability to do things the right way, and we can have corruption of data. But that doesn't mean corruption of the FS; it means that the data were only partially written and some are missing.

> For the love of God do NOT do stuff like that. Just create ZFS on a pile
> of disks the way that we should, with the write cache disabled on all the
> disks and with redundancy in the ZPool config .. nothing special :

What!! noo.. this is really special to me!!
I've read and re-read many times:
- NFS and ZFS, a fine combination
- ZFS Best Practices Guide
and other blogs, without noticing such an idea! I even notice the opposite recommendation.

From the ZFS Best Practices Guide >> ZFS Storage Pools Recommendations
(http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#ZFS_Storage_Pools_Recommendations) I read:

  - For production systems, consider using whole disks for storage pools rather than slices, for the following reasons:
    + Allows ZFS to enable the disk's write cache, for those disks that have write caches

And from NFS and ZFS, a fine combination >> Comparison with UFS
(http://blogs.sun.com/roch/#zfs_to_ufs_performance_comparison) I read:

  Semantically correct NFS service:
    nfs/ufs : 17 sec (write cache disable)
    nfs/zfs : 12 sec (write cache disable, zil_disable=0)
    nfs/zfs :  7 sec (write cache enable,  zil_disable=0)

So I can say that nfs/zfs with the write cache enabled and zil_disable=0 is -- in that case -- faster. So why are you recommending that I disable the write cache?

--
Cedric BRINER
Geneva - Switzerland
Re: [zfs-discuss] HowTo: UPS + ZFS & NFS + no fsync
> You might set zil_disable to 1 (_then_ mount the fs to be shared). But
> you're still exposed to OS crashes; those would still corrupt your nfs
> clients.
>
> -r

hello Roch,

I have a few questions.

1) From "Shenanigans with ZFS flushing and intelligent arrays..."
(http://blogs.digitar.com/jjww/?itemid=44) I read:

  Disable the ZIL. The ZIL is the way ZFS maintains _consistency_ until it can get the blocks written to their final place on the disk. That's why the ZIL flushes the cache. If you don't have the ZIL and a power outage occurs, your blocks may go poof in your server's RAM... 'cause they never made it to the disk, Kemosabe.

And from Eric Kustarz's weblog (http://blogs.sun.com/erickustarz/entry/zil_disable) I read:

  Note: disabling the ZIL does _NOT_ compromise filesystem integrity. Disabling the ZIL does NOT cause corruption in ZFS.

I don't understand: one says that we can lose _consistency_, and the other says that it does not compromise filesystem integrity. So... which one is right?

2) From Eric Kustarz's weblog (same entry) I read:

  Disabling the ZIL is definitely frowned upon and can cause your applications much confusion. Disabling the ZIL can cause corruption for NFS clients in the case where a reply to the client is done before the server crashes, and the server crashes before the data is committed to stable storage. If you can't live with this, then don't turn off the ZIL.

The service that we export with zfs & NFS is nothing like a database or some really stressful system; we are just exporting home directories. So it feels to me that we can just disable this ZIL.
3) From NFS and ZFS, a fine combination
(http://blogs.sun.com/roch/#zfs_to_ufs_performance_comparison) I read:

  NFS service with risk of corruption of client's side view:
    nfs/ufs :   7 sec (write cache enable)
    nfs/zfs : 4.2 sec (write cache enable,  zil_disable=1)
    nfs/zfs : 4.7 sec (write cache disable, zil_disable=1)

  Semantically correct NFS service:
    nfs/ufs : 17 sec (write cache disable)
    nfs/zfs : 12 sec (write cache disable, zil_disable=0)
    nfs/zfs :  7 sec (write cache enable,  zil_disable=0)

Does this mean that when you just create a UFS filesystem and export it with NFS, you are running a semantically incorrect NFS service, and that you have to disable the write cache to have a correct NFS server???

4) So can we say that people used to an NFS service with risk of corruption of the client's side view can just take ZFS and disable the ZIL?

thanks in advance for your clarifications,

Ced.

P.-S. Do some of you know the best way to send an email containing many questions? Should I create a thread for each of them next time?

--
Cedric BRINER
Geneva - Switzerland
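For reference, on the Solaris builds of that era the ZIL was disabled via the unsupported zil_disable tunable discussed above (a sketch; this tunable was never a supported interface and was later removed, so check your release before relying on it):

```
# persistent: add to /etc/system and reboot
set zfs:zil_disable = 1

# at runtime, with mdb (affects filesystems mounted afterwards)
echo 'zil_disable/W0t1' | mdb -kw
```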
Re: [zfs-discuss] HowTo: UPS + ZFS & NFS + no fsync
> > okay let's say that it is not. :)
> > Imagine that I set up a box:
> > - with Solaris
> > - with many HDs (directly attached)
> > - using ZFS as the FS
> > - exporting the data with NFS
> > - on a UPS.
> > Then after reading:
> > http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#ZFS_and_Complex_Storage_Considerations
> > I wonder if there is a way to tell the OS to ignore the fsync flush commands, since the data are likely to survive a power outage.
>
> Cedric,
> You do not want to ignore syncs from ZFS if your hard disk is directly
> attached to the server. As the document mentions, that is really for
> complex storage with NVRAM, where the flush is not necessary.

This post follows `XServe Raid & Complex Storage Considerations'
(http://www.opensolaris.org/jive/thread.jspa?threadID=29276&tstart=0), where we made the assumption (*1) that if the XServe Raid is connected to a UPS, we can consider the RAM in the XServe Raid as if it were NVRAM.

(*1) This assumption is even pointed out by Roch:
http://blogs.sun.com/roch/#zfs_to_ufs_performance_comparison >> Intelligent Storage
via `Shenanigans with ZFS flushing and intelligent arrays...'
(http://blogs.digitar.com/jjww/?itemid=44) >> Tell your array to ignore ZFS' flush commands.

So in this way, when we export it with NFS, we get a boost in the bandwidth. Okay, then is there any difference that I am not catching between the Shenanigans setup and my situation? I mean, I want to have a cheap and reliable nfs service. Why should I buy an expensive `Complex Storage with NVRAM' instead of just buying a machine with 8 IDE HDs?

Ced.

--
Cedric BRINER
Geneva - Switzerland
[zfs-discuss] HowTo: UPS + ZFS & NFS + no fsync
Hello,

I wonder if the subject of this email is not self-explanatory? Okay, let's say that it is not. :)

Imagine that I set up a box:
- with Solaris
- with many HDs (directly attached)
- using ZFS as the FS
- exporting the data with NFS
- on a UPS.

Then after reading:
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#ZFS_and_Complex_Storage_Considerations
I wonder if there is a way to tell the OS to ignore the fsync flush commands, since the data are likely to survive a power outage.

Ced.

--
Cedric BRINER
Geneva - Switzerland
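For what it's worth, some Solaris releases grew an unsupported tunable for exactly this, telling ZFS not to issue the cache-flush (SYNCHRONIZE CACHE) commands at all. A sketch, assuming a build that has the zfs_nocacheflush tunable; verify against your release, and only consider it when every write cache in the path is battery- or UPS-protected:

```
# /etc/system (requires a reboot to take effect)
set zfs:zfs_nocacheflush = 1
```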
Re: [zfs-discuss] XServe Raid & Complex Storage Considerations
> The Xraid is a very well thought of storage device with a heck of a price
> point. Attached is an image of the "Settings"/"Performance" screen where
> you see "Allow Host Cache Flushing". I think when you use ZFS, it would
> be best to uncheck that box.

This is what happens when you use the GUI in your native language (French, in my case): I finally understood what the meaning was in French after reading it from your image in English :) And your setting just boosted my bandwidth from 0.8 MiB/s to 7 MiB/s (*)!! Good to see that it just works.

> The only 2 drawbacks to using Xserve raid that I have found are:
> 1. Partition management, dynamic expansion and volume management. If we
> stay native in OSX tools/filesystems we can't partition with free space
> then later try to create a partition and retain the data from the already
> created partition. This really sucks. I'm betting Xsan changes these
> limitations however.

I would love the Xserve to just provide a way to export its 14 disks to the host. That way, we could manage them with zfs in a more fine-grained fashion.

> 2. Each controller can only talk to 7 disks (1/2 the array).
> Other than that, the thing is really fast, and quite reliable. Not to
> mention the sexy blue lights that tell you it's hummin'

yeah right.. quite sexy!

> -Andy

Ced.

(*) a MiB is a mebibyte, 2^20 bytes; ref: http://en.wikipedia.org/wiki/Mebibyte

--
Cedric BRINER
Geneva - Switzerland
[zfs-discuss] XServe Raid & Complex Storage Considerations
hello the list,

After reading the _excellent_ ZFS Best Practices Guide, I've seen in the section "ZFS and Complex Storage Considerations" that we should configure the storage system to ignore the commands which flush its memory to the disks.

So, do some of you know how to tell an Xserve Raid to ignore ``fsync'' requests? After the announcement that zfs will be included in Tiger, I'd be surprised if the Xserve Raid did not include such a configuration.

Ced.

--
Cedric BRINER
Geneva - Switzerland
[zfs-discuss] ISCSI + ZFS + NFS
Hello the list,

After participating in the presentation of Bill Moore & Jeff Bonwick, I started to think about: ``No special hardware -- ZFS loves cheap disks''. Okay, it loves them. But _how_ can you achieve a well-sized storage (40TB) with such technologies? I mean, how can you physically bind 70 HDs into a zfs pool? After some thinking, I came up with the idea shown below in some ascii :)

     +---------+   +---------+   +---------+
     | node i1 |   | node i2 |   | node i3 |
     |  with   |   |  with   |   |  with   |
     |  16 HD  |   |  16 HD  |   |  16 HD  |
     +---------+   +---------+   +---------+
           \            |            /
   iSCSI    +----------------------+    iSCSI
   i-node   |     jumbo switch     |   i-node
            +----------------------+
               /                \
     +---------+                +---------+
     | node z0 |      ZFS       | node z1 |    ZFS
     |  with   |     z-node     |  with   |   z-node
     |  2 HD   |                |  2 HD   |
     +---------+                +---------+
            \                   /
   NFS       +----------------+     NFS
   n-node    |     switch     |    n-node
             +----------------+
            /      /     |      \
     +---------+ +---------+ +---------+ +---------+
     | node n0 | | node n1 | | node n2 | | node nX |
     |  with   | |  with   | |  with   | |  with   |
     |  1 HD   | |  1 HD   | |  1 HD   | |  2 HD   |
     +---------+ +---------+ +---------+ +---------+

I've split it like this:

iSCSI space (the top part):
- It provides the ability to see, on z0 and z1, hard disks which are physically mounted on i1, i2, i3.
- I have at least three i-nodes, which will protect me from an unavailable zpool if one i-node goes down, as long as I spread the HDs correctly between the i-nodes. So I can have a cheap i-node with only one power supply.

ZFS space (the middle part):
- the z-nodes are here to consolidate the HDs with a zpool
- I will put them into a container so it will be easy to migrate the zpool from one z-node to another
- I will create about 6 zpools to be able to do some load balancing between the z-nodes
- there will be at least 2 z-nodes, and I'm open to more than 2, but not only one

NFS space (the bottom part):
- these are the clients which connect to a z-node
- nothing special here.

OKAY, that's the idea, but then this becomes not so easy to manage.
I have made some tries, and I found iscsi{,t}adm not that pleasant to use compared to what the zfs/zpool interfaces provide.

Now the questions:

1) What do you think about such a topology?

2) Do you have any comment on what I've said above?

3) One of the excellent features of zfs is that the meta-data kept inside the HD give the possibility to export and import HDs without worrying about which HD belongs to which pool. _But_:

3.1) In this case (see the ascii drawing), I mean using it through iscsi, we lose this ability to easily find which HD belongs where, due to the fact that iscsi and zfs are two different programs.

3.2) If I use zfs on an iscsi HD (IDE), create a pool with it, and then decide to export it and plug it directly (without iscsi) into the node, I'm not able to do a zfs import on the IDE HD. I tried to do a *loopback* iscsi without success :( . This is sad, because I was thinking of moving the z-node into the i-node, but I won't be able to do this due to this behaviour of iscsi.

Ced.

--
Cedric BRINER
Geneva - Switzerland
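The zpool migration between z-nodes mentioned in the ZFS-space bullets would, in principle, look like this (a sketch with a hypothetical pool name; it assumes both z-nodes can see the same iSCSI targets):

```
# on z0: stop serving and release the pool
zpool export tank

# on z1: scan the shared iSCSI devices and take the pool over
zpool import tank
```

This works because the pool metadata lives on the disks themselves, which is exactly the property question 3 worries about losing visibility into when the disks sit behind iSCSI.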
Re: [zfs-discuss] zfs and iscsi: cannot open : I/O error
>> devfsadm -i iscsi   # to create the device on sf3
>> iscsiadm list target -Sv | egrep 'OS Device|Peer|Alias'   # not empty
>>   Alias: vol-1
>>   IP address (Peer): 10.194.67.111:3260
>>   OS Device Name: /dev/rdsk/c1t014005A267C12A0045E2F524d0s2

This is where my confusion began. I don't know what that device is for? I mean, what does it represent? I've found that the ``OS Device Name'' is created by the invocation:

  devfsadm -i iscsi   # to create the device on sf3

but no way, this is not a device that you can use. You can find the usable device only with the command:

  format
  Searching for disks...done

  AVAILABLE DISK SELECTIONS:
    0. c0t0d0 /[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED],0
    1. c0t2d0 /[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED],0
    2. c1t014005A267C12A0045E308D2d0 /scsi_vhci/[EMAIL PROTECTED]

and then if you create the zpool with:

  zpool create tank c1t014005A267C12A0045E308D2d0

it works!!

BUT... BUT... and re-BUT: given this, and with all this virtualization... how can I link a device name on my iscsi client with the device name on my iscsi server? Because imagine that you are in my situation, where I want to have (let's say) 4 iscsi servers with at most 16 disks attached per server, and at least 2 iscsi clients which consolidate this space with zfs. Suddenly you see with zpool that a disk is dead. I have to be able to replace this disk, and for that I have to know on which one of the 4 machines it resides and which disk it is.

So, do some of you know a little bit about this?

Ced.

--
Cedric BRINER
Geneva - Switzerland
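One way to make that client-to-server link, for what it's worth: the IQN appears on both sides, so matching it ties a client device to a server and its backing store. A sketch using only the listing commands already shown in this thread (field names may vary slightly between releases):

```
# on the client: map each OS device name to its target IQN and portal
iscsiadm list target -Sv | egrep 'Target:|IP address|OS Device'

# on each server: map each target IQN to its backing store
iscsitadm list target -v | egrep 'Target:|iSCSI Name|Backing store'
```

The portal IP identifies the server machine, and the backing-store path identifies the physical disk behind the dead cXtYdZ device.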
[zfs-discuss] zfs and iscsi: cannot open : I/O error
hello,

I'm trying to consolidate my HDs in a cheap but (I hope) reliable manner. To do so, I was thinking of using zfs over iscsi. Unfortunately, I'm having some issues with it when I do:

  # iscsi server (nexenta alpha 5)
  svcadm enable iscsitgt
  iscsitadm delete target --lun 0 vol-1
  iscsitadm list target                 # empty
  iscsitadm create target -b /dev/dsk/c0d0s5 vol-1
  iscsitadm list target                 # not empty
    Target: vol-1
      iSCSI Name: iqn.1986-03.com.sun:02:662bd119-1660-6141-cea7-dd799d53b254.vol-1
      Connections: 0

  # iscsi client (solaris 5.10, up-to-date)
  iscsiadm add discovery-address 10.194.67.111   # (iscsi server)
  iscsiadm modify discovery --sendtargets enable
  iscsiadm list discovery-address       # not empty
  iscsiadm list target                  # not empty
    Target: iqn.1986-03.com.sun:02:662bd119-1660-6141-cea7-dd799d53b254.vol-1
      Alias: vol-1
      TPGT: 1
      ISID: 402a
      Connections: 1
  devfsadm -i iscsi   # to create the device on sf3
  iscsiadm list target -Sv | egrep 'OS Device|Peer|Alias'   # not empty
    Alias: vol-1
    IP address (Peer): 10.194.67.111:3260
    OS Device Name: /dev/rdsk/c1t014005A267C12A0045E2F524d0s2
  zpool create tank c1t014005A267C12A0045E2F524d0s2
  cannot open '/dev/dsk/c1t014005A267C12A0045E2F524d0s2': I/O error

The error was produced when using the ``disk'' type for the iscsi target. I've followed Roch's advice to try the different iscsi types (disk|raw|tape), but unfortunately the only type that accepts ``iscsitadm create target -b /dev/dsk/c0d0s5'' is the disk type, which doesn't work.

Any idea what I could do to improve this?

thanks in advance,

Ced.

--
Cedric BRINER
Geneva - Switzerland
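An alternative worth trying on the server side (a sketch with hypothetical pool and zvol names): back the target with a zvol instead of the raw c0d0s5 slice, or let ZFS create the target itself via shareiscsi, since that is the path ZFS's iSCSI integration is designed around:

```
# on the server: carve a zvol and export it as an iSCSI target by hand
zfs create -V 10g tank/vol-1
iscsitadm create target -b /dev/zvol/rdsk/tank/vol-1 vol-1

# or let ZFS manage the target itself
zfs set shareiscsi=on tank/vol-1
```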