[zfs-discuss] RFE: ISCSI alias when shareiscsi=on

2007-05-24 Thread cedric briner

Starting from this thread:
http://www.opensolaris.org/jive/thread.jspa?messageID=118786

I would love to have the possibility to set an iSCSI alias when setting 
shareiscsi=on in ZFS. This would make it much easier to identify where an 
IQN is hosted.


The iSCSI alias is defined in RFC 3721,
e.g. http://www.apps.ietf.org/rfc/rfc3721.html#sec-2

and the CLI could be something like:
zfs set shareiscsi=on shareiscsiname=<alias> tank
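
A minimal sketch of how the proposed property might fit next to the existing
shareiscsi workflow. The zvol name tank/vol1 is made up for illustration, and
shareiscsiname is only the property suggested above, not an existing one:

  # existing behaviour: create a zvol and export it over iSCSI
  zfs create -V 10G tank/vol1
  zfs set shareiscsi=on tank/vol1

  # proposed: attach a human-readable iSCSI alias to the generated target
  zfs set shareiscsiname=tiger tank/vol1

  # the alias would then show up on the initiator side, e.g.
  iscsiadm list target | grep Alias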


Ced.
--

Cedric BRINER
Geneva - Switzerland


Re: [zfs-discuss] iscsitadm local_name in ZFS

2007-05-04 Thread cedric briner

cedric briner wrote:

hello dear community,

Is there a way to set a ``local_name'', as defined in iscsitadm(1M), when 
you shareiscsi a zvol? That way it would give an even easier 
way to identify a device through its IQN.


Ced.



Okay, no reply so far... maybe I didn't make myself clear enough.

Let me try to re-explain what I mean:
when you use a zvol and enable shareiscsi, could you add a suffix to the 
IQN (iSCSI Qualified Name)? This suffix would be chosen by me and would 
help me identify which IQN corresponds to which zvol: it is just a 
more human-readable tag on an IQN.


Similarly, this tag can also be given when you use iscsitadm, and in the 
man page of iscsitadm it is called a local_name.


iscsitadm create target -b /dev/dsk/c0d0s5 tiger
or
iscsitadm create target -b /dev/dsk/c0d0s5 hd-1

tiger and hd-1 are local_names.
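
For reference, a short sketch of where that local_name then shows up, based
on the iscsitadm output quoted later in this archive (the uuid part of the
IQN is abridged here):

  iscsitadm list target
     Target: tiger
     iSCSI Name: iqn.1986-03.com.sun:02:<uuid>.tiger
     Connections: 0

The local_name becomes both the target name and the suffix of the IQN, which
is exactly the human-readable handle being asked for on the ZFS side.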

Ced.

--

Cedric BRINER
Geneva - Switzerland


[zfs-discuss] iscsitadm local_name in ZFS

2007-05-03 Thread cedric briner

hello dear community,

Is there a way to set a ``local_name'', as defined in iscsitadm(1M), when 
you shareiscsi a zvol? That way it would give an even easier 
way to identify a device through its IQN.


Ced.

--

Cedric BRINER
Geneva - Switzerland


Re: [zfs-discuss] Probability Failure & Calculator

2007-04-27 Thread cedric briner

Richard Elling wrote:

cedric briner wrote:

Hello ZFS community,

I do not have such a strong love for *probability*, and even less 
love when probability characterizes the real, solid and tangible equipment 
that I have to administer.


I started doing some math.
Don't get scared: I'm not going to show you the little scribbles that 
I've made.


But I'm looking for some good material:
- online tools to calculate, analytically or numerically, the reliability 
of different topologies

- best practices on RAID topology: which is best:
 - zpool create tank1 raidz 1 2 3 raidz 4 5 6 raidz 7 8 9
 - zpool create tank1 raidz raidz 1 2 3 raidz 4 5 6 raidz 7 8 9
 - ... and other layouts that I have not even thought about

Any thoughts?


I've put together some graphs to show various possible configurations.
The models are also explained (except for the ditto-block model, which is
in the pipeline, hopefully in the next few days).
See http://blogs.sun.com/relling

This is good news to me!

I'll be checking your blog over the next few days.

Thanks!

Ced.

 -- richard




--

Cedric BRINER
Geneva - Switzerland


[zfs-discuss] Probability Failure & Calculator

2007-04-27 Thread cedric briner

Hello ZFS community,

I do not have such a strong love for *probability*, and even less love 
when probability characterizes the real, solid and tangible equipment that 
I have to administer.


I started doing some math.
Don't get scared: I'm not going to show you the little scribbles that 
I've made.


But I'm looking for some good material:
- online tools to calculate, analytically or numerically, the reliability 
of different topologies

- best practices on RAID topology: which is best (see the sketch after 
this list):
 - zpool create tank1 raidz 1 2 3 raidz 4 5 6 raidz 7 8 9
 - zpool create tank1 raidz raidz 1 2 3 raidz 4 5 6 raidz 7 8 9
 - ... and other layouts that I have not even thought about
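
A minimal sketch of the two layouts being compared, assuming nine equal
disks named c1t1d0 ... c1t9d0 (these device names are placeholders, not
taken from the original post):

  # three raidz vdevs of three disks each
  zpool create tank1 raidz c1t1d0 c1t2d0 c1t3d0 \
                     raidz c1t4d0 c1t5d0 c1t6d0 \
                     raidz c1t7d0 c1t8d0 c1t9d0

  # for comparison, a single wide raidz vdev across all nine disks
  zpool create tank1 raidz c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 \
                     c1t6d0 c1t7d0 c1t8d0 c1t9d0

zpool does not accept a raidz vdev nested inside another raidz vdev, so the
second line of the list above can really only mean one of these two variants.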

Any thoughts?


Ced.





--

Cedric BRINER
Geneva - Switzerland


Re: [zfs-discuss] HowTo: UPS + ZFS & NFS + no fsync

2007-04-27 Thread cedric briner

You might set zil_disable to 1 (_then_ mount the fs to be
shared). But you're still exposed to OS crashes; those would
still corrupt your nfs clients.

Just to understand better (I know that I'm quite slow :( ):
when you say _nfs clients_, are you specifically talking about:
- the NFS client programs themselves
   (lockd, statd), meaning that you can get a stale NFS handle or other 
such things?

- the host acting as an NFS client,
   meaning that the NFS client service still works, but the data that 
applications use on the NFS-mounted disk may be corrupted?



If I keep digging and digging into this ZIL and NFS/UFS-with-write-cache 
question, it is because I do not understand which kinds of problems can 
occur. What I generally read are statements like _corruption_ from the 
client's point of view... but what does that mean?


Is the scheme of what can happen this:
- the application on the NFS client side writes data to the NFS server
- meanwhile the NFS server crashes, so:
 - the data are not stored
 - the application on the NFS client thinks that the data are stored! :(
- when the server is up again
- the NFS client re-reads the data
- the application on the NFS client side finds itself with data in the 
state prior to its last writes.

Am I right?

So with the ZIL:
 - The application has the ability to do things the right way, so even 
after an NFS-server crash the application on the NFS-client side can 
rely on its own data.

So without the ZIL:
 - The application does not have the ability to do things the right way, 
and we can get data corruption. But that does not mean corruption 
of the filesystem; it means that the data were only partially written and 
some are missing.



For the love of God do NOT do stuff like that.

Just create ZFS on a pile of disks the way that we should, with the
write cache disabled on all the disks and with redundancy in the ZPool
config .. nothing special :

What!! No...  this is really special to me!!
I've read and re-read many times:
 - NFS and ZFS, a fine combination
 - the ZFS Best Practices Guide
and other blogs without ever noticing such an idea!

I even noticed the opposite recommendation,
from:
- ZFS Best Practices Guide >> ZFS Storage Pools Recommendations
- http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#ZFS_Storage_Pools_Recommendations
where I read:
 - For production systems, consider using whole disks for storage pools 
rather than slices for the following reasons:
  + Allow ZFS to enable the disk's write cache, for those disks that 
have write caches


and from:
- NFS and ZFS, a fine combination >> Comparison with UFS
- http://blogs.sun.com/roch/#zfs_to_ufs_performance_comparison
where I read:
 Semantically correct NFS service :

nfs/ufs : 17 sec (write cache disable)
nfs/zfs : 12 sec (write cache disable,zil_disable=0)
nfs/zfs :  7 sec (write cache enable,zil_disable=0)

so I can say that nfs/zfs with the write cache enabled and the ZIL 
enabled is, in that case, the faster one.


So why are you recommending that I disable the write cache?
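
As a side note, a hedged sketch of how one can check or change a disk's
write cache on Solaris 10-era systems using format's expert mode; the exact
menu entries vary with the disk and driver, so treat this as an illustration
rather than a recipe:

  format -e
  # select the disk, then navigate the menus:
  #   cache -> write_cache -> display    (show the current state)
  #   cache -> write_cache -> disable    (turn the write cache off)
  #   cache -> write_cache -> enable     (turn it back on)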

--

Cedric BRINER
Geneva - Switzerland



Re: [zfs-discuss] HowTo: UPS + ZFS & NFS + no fsync

2007-04-26 Thread cedric briner

You might set zil_disable to 1 (_then_ mount the fs to be
shared). But you're still exposed to OS crashes; those would 
still corrupt your nfs clients.


-r


Hello Roch,

I have a few questions.

1)
from:
  Shenanigans with ZFS flushing and intelligent arrays...
  http://blogs.digitar.com/jjww/?itemid=44
I read:
 Disable the ZIL. The ZIL is the way ZFS maintains _consistency_ until 
it can get the blocks written to their final place on the disk. That's 
why the ZIL flushes the cache. If you don't have the ZIL and a power 
outage occurs, your blocks may go poof in your server's RAM...'cause 
they never made it to the disk Kemosabe.


from:
  Eric Kustarz's Weblog
  http://blogs.sun.com/erickustarz/entry/zil_disable
I read:
  Note: disabling the ZIL does _NOT_ compromise filesystem integrity. 
Disabling the ZIL does NOT cause corruption in ZFS.


then:
  I don't understand: one of them says that
   - we can lose _consistency_
  and the other one says that
   - it does not compromise filesystem integrity
  so... which one is right?


2)
from:
  Eric Kustarz's Weblog
  http://blogs.sun.com/erickustarz/entry/zil_disable
I read:
 Disabling the ZIL is definitely frowned upon and can cause your 
applications much confusion. Disabling the ZIL can cause corruption for 
NFS clients in the case where a reply to the client is done before the 
server crashes, and the server crashes before the data is commited to 
stable storage. If you can't live with this, then don't turn off the ZIL.


then:
  The service that we export with ZFS & NFS is not something like a 
database or a really stressful workload, just exported home directories. So 
it feels to me that we can simply disable the ZIL (see the sketch below).
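
For context, a hedged sketch of how the zil_disable switch was typically set
on Solaris/OpenSolaris of that era. It is a global setting (it affects every
pool and dataset on the host) and is exactly what the quoted warnings are
about, so it is shown only for illustration:

  # /etc/system -- takes effect at the next reboot
  set zfs:zil_disable = 1

  # or at run time, something like the usual mdb incantation
  # (filesystems must be re-mounted afterwards for it to take effect):
  echo zil_disable/W0t1 | mdb -kw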


3)
from:
  NFS and ZFS, a fine combination
  http://blogs.sun.com/roch/#zfs_to_ufs_performance_comparison
I read:
  NFS service with risk of corruption of client's side view :

nfs/ufs :  7 sec (write cache enable)
nfs/zfs :  4.2   sec (write cache enable,zil_disable=1)
nfs/zfs :  4.7   sec (write cache disable,zil_disable=1)

Semantically correct NFS service :

nfs/ufs : 17 sec (write cache disable)
nfs/zfs : 12 sec (write cache disable,zil_disable=0)
nfs/zfs :  7 sec (write cache enable,zil_disable=0)

then:
  Does this mean that when you just create a UFS filesystem and simply 
export it with NFS, you are running a not semantically correct NFS 
service, and that you have to disable the write cache to get a correct 
NFS server???


4)
So can we say that people who were already living with an NFS service that 
risks corrupting the client's view can just take ZFS and disable the ZIL?






Thanks in advance for your clarifications.

Ced.
P.S. Do any of you know the best way to send an email containing 
many questions? Should I create a thread for each of them 
next time?






--

Cedric BRINER
Geneva - Switzerland


Re: [zfs-discuss] HowTo: UPS + ZFS & NFS + no fsync

2007-04-26 Thread cedric briner

Okay, let's say that it is not. :)
Imagine that I set up a box:
   - with Solaris
   - with many HDs (directly attached)
   - using ZFS as the FS
   - exporting the data with NFS
   - on a UPS.

Then after reading:
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#ZFS_and_Complex_Storage_Considerations

I wonder if there is a way to tell the OS to ignore the fsync flush
commands, since with the UPS the cached data are likely to survive a power
outage.


Cedric,

You do not want to ignore syncs from ZFS if your harddisk is directly
attached to the server.  As the document mentioned, that is really for
Complex Storage with NVRAM where flush is not necessary.


This post follows: `XServe Raid & Complex Storage Considerations'
http://www.opensolaris.org/jive/thread.jspa?threadID=29276&tstart=0

where we made the assumption (*1) that if the XServe RAID is connected 
to a UPS, we can treat the RAM in the XServe RAID as if it were NVRAM.


(*1)
  This assumption is even pointed out by Roch:
  http://blogs.sun.com/roch/#zfs_to_ufs_performance_comparison
  >> Intelligent Storage
  via `Shenanigans with ZFS flushing and intelligent arrays...'
  http://blogs.digitar.com/jjww/?itemid=44
  >> Tell your array to ignore ZFS' flush commands

So in this way, when we export it with NFS, we get a boost in bandwidth.

Okay, then is there any difference that I am missing between:
 - the `Shenanigans with ZFS flushing and intelligent arrays...' setup
 - and my situation?

I mean, I want to have a cheap and reliable NFS service. Why should I 
buy expensive `Complex Storage with NVRAM' instead of just buying a machine 
with 8 IDE HDs?



Ced.
--

Cedric BRINER
Geneva - Switzerland


[zfs-discuss] HowTo: UPS + ZFS & NFS + no fsync

2007-04-26 Thread cedric briner

Hello,

I wonder if the subject of this email is not self-explanatory.


Okay, let's say that it is not. :)
Imagine that I set up a box:
 - with Solaris
 - with many HDs (directly attached)
 - using ZFS as the FS
 - exporting the data with NFS
 - on a UPS.

Then after reading: 
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#ZFS_and_Complex_Storage_Considerations
I wonder if there is a way to tell the OS to ignore the fsync flush 
commands, since with the UPS the cached data are likely to survive a power 
outage.
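
For reference, a hedged sketch of the kind of knob being asked about. On
later Solaris/OpenSolaris builds the cache-flush behaviour is controlled by
the zfs_nocacheflush tunable; as the replies in this thread point out, it is
only appropriate when the write cache really is non-volatile, so this is an
illustration rather than a recommendation:

  # /etc/system -- tell ZFS not to send cache-flush commands (reboot required)
  set zfs:zfs_nocacheflush = 1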



Ced.

--

Cedric BRINER
Geneva - Switzerland


Re: [zfs-discuss] XServe Raid & Complex Storage Considerations

2007-04-26 Thread cedric briner

The Xraid is a very well thought of storage device with a heck of a price
point.  Attached is an image of the "Settings"/"Performance" Screen where
you see "Allow Host Cache Flushing".

I think when you use ZFS, it would be best to uncheck that box.

This is what happens when you use a GUI in your native language (French 
in my case). I finally understood what the French label meant after 
reading it in English in your screenshot. :)


And your setting just boosted my bandwidth from 0.8 MiB/s to 7 MiB/s* !!
Good to see that it just works.


The only 2 drawbacks to using Xserve raid that I have found are:

1. Partition Management, dynamic expansion and Volume management.  If we
stay native in OSX tools/filesystems we cant partition with free space then
later try to create a partition and retain the data from the already created
partition.  This really sucks.  I'm betting Xsan changes these limitations
however.

I would love it if the Xserve RAID just provided a way to export all 14 
disks to the host. That way we could manage them with ZFS in a more 
fine-grained fashion.



2. Each controller can only talk to 7 disks (1/2 the array).

Other than that, the thing is really fast, and quite reliable.  Not to
mention the sexy blue lights that tell you its hummin'

Yeah, right... quite sexy!


-Andy


Ced.
* (a MiB is a mebibyte, 2^20 bytes; ref: http://en.wikipedia.org/wiki/Mebibyte)


--

Cedric BRINER
Geneva - Switzerland


[zfs-discuss] XServe Raid & Complex Storage Considerations

2007-04-25 Thread cedric briner

hello the list,

After reading the _excellent_ ZFS Best Practices Guide, I saw in the 
section `ZFS and Complex Storage Considerations' that we should configure 
the storage system to ignore the commands that flush its cache memory to 
disk.


So do any of you know how to tell the Xserve RAID to ignore ``fsync'' 
(cache flush) requests?


After the announcement that ZFS will be included in Tiger, I would be 
surprised if the Xserve RAID did not include such a configuration option.


Ced.

--

Cedric BRINER
Geneva - Switzerland


[zfs-discuss] ISCSI + ZFS + NFS

2007-03-20 Thread cedric briner

Hello the list,

After attending the presentation by Bill Moore & Jeff Bonwick, I started
to think about:
``No special hardware – ZFS loves cheap disks''
Okay, it loves them. But _how_ can you achieve a well-sized storage system
(40 TB) with such technologies? I mean, how can you physically bind 70 HDs
into a ZFS pool?

After some thinking, I came up with the idea shown below in a bit of
ASCII art. :)


            +-------+   +-------+   +-------+
            |node i1|   |node i2|   |node i3|
            | with  |   | with  |   | with  |
            | 16 HD |   | 16 HD |   | 16 HD |
            +-------+   +-------+   +-------+
                 \           |          /
                  \          |         /
                +---------------------------+
  i S C S I     |        jumbo switch       |     i S C S I
  i-nodes       +---------------------------+     i-nodes
                     /                 \
                    /                   \
            +-------+                 +-------+
            |node z0|                 |node z1|
  Z F S     | with  |                 | with  |      Z F S
  z-nodes   | 2 HD  |                 | 2 HD  |      z-nodes
            +-------+                 +-------+
                    \                   /
                     \                 /
                +---------------------------+
  N F S         |           switch          |     N F S
  n-nodes       +---------------------------+     n-nodes
  the clients     /      /       |       \        the clients
                 /      /        |        \
         +-------+  +-------+  +-------+     +-------+
         |node n0|  |node n1|  |node n2|     |node nX|
         | with  |  | with  |  | with  |     | with  |
         | 1 HD  |  | 1 HD  |  | 1 HD  |     | 2 HD  |
         +-------+  +-------+  +-------+     +-------+


I've split it up like this.


iSCSI space: the one at the top.
---
- It provides the ability to see, on z0 and z1, hard disks which are
physically mounted on i1, i2 and i3.
- I have at least three i-nodes, which protects me from an unavailable
zpool if one i-node goes down, as long as I spread the HDs correctly
between the i-nodes. So I can use cheap i-nodes with a single power supply.

ZFS space: the middle one
-
- the z-nodes are here to consolidate the HDs into zpools (see the sketch
after this list)
- I will put them into a container so it will be easy to migrate a
zpool from one z-node to another
- I will create about 6 zpools to be able to do some load balancing
between the z-nodes
- there will be at least 2 z-nodes; I'm open to more than 2, but not
only one
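
A minimal sketch of what one of those pools might look like on a z-node,
once the iSCSI LUNs from the i-nodes appear as local devices. The device
names are placeholders for illustration; in practice they would be the long
GUID-based names shown further down in this archive:

  # on a z-node: build a raidz vdev from one LUN per i-node,
  # so the pool survives the loss of a whole i-node
  zpool create tank1 raidz c1t<lun-from-i1>d0 c1t<lun-from-i2>d0 c1t<lun-from-i3>d0
  zfs create tank1/home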


NFS space: the one at the bottom.
-
- these will be the clients which connect to a z-node
- nothing special here.


OKAY, that's the idea, but it is not so easy to manage. I have
made some tries and I found iscsi{,t}adm not that pleasant to use compared
with what the zfs and zpool interfaces provide.

Now the questions:
1) What do you think about such a topology?
2) Do you have any comments on what I've said above?
3) One of the excellent features of ZFS is that the metadata kept
on the disks makes it possible to export and import disks without
worrying about which disk belongs to which pool. _But_:
3.1) In this case (see the ASCII drawing), i.e. using it through iSCSI,
we lose this ability to easily find which disk belongs where, because
iSCSI and ZFS are two separate programs.
3.2) If I use ZFS on an iSCSI disk (IDE), create a pool with it, export
the pool and plug the disk directly (without iSCSI) into the node, I am
not able to do a zpool import on the IDE disk. I tried to do a *loopback*
iscsi without success :(. This is sad, because I was thinking of moving
the z-node role onto the i-node, but I won't be able to do this due to
this iSCSI behaviour (see the sketch below).
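
For clarity, a hedged sketch of the move described in 3.2; the pool name is
a placeholder, the point is only the intended export/import sequence:

  # on the z-node, while the disk is still reached over iSCSI
  zpool export tank1

  # then attach the disk directly to the i-node and, on the i-node:
  zpool import            # list importable pools found on attached disks
  zpool import tank1      # this is the step that did not work in the test above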



Ced.

--

Cedric BRINER
Geneva - Switzerland


Re: [zfs-discuss] zfs and iscsi: cannot open : I/O error

2007-02-26 Thread cedric briner

>> devfsadm -i iscsi # to create the device on sf3
>> iscsiadm list target -Sv| egrep 'OS Device|Peer|Alias' # not empty
>>  Alias: vol-1
>>IP address (Peer): 10.194.67.111:3260
>>   OS Device Name:
>> /dev/rdsk/c1t014005A267C12A0045E2F524d0s2
This is where my confusion began.
I don't know what the device c1t04d0s2 is for; I mean, what does it 
represent?


I've found that the ``OS Device Name'' (c1t04d0s2) is created by 
the invocation:

devfsadm -i iscsi # to create the device on sf3

but no, this is not a device that you can use. You can find the usable 
device only with the command:
format
   Searching for disks...done


   AVAILABLE DISK SELECTIONS:
   0. c0t0d0 
  /[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED],0
   1. c0t2d0 
  /[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED],0
   2. c1t014005A267C12A0045E308D2d0 
  /scsi_vhci/[EMAIL PROTECTED]

and then if you create the zpool with:
zpool create tank c1t014005A267C12A0045E308D2d0
it works !!


BUT... BUT... and re-BUT:
with all this virtualization, how can I link a device name on my iSCSI 
client to the device name on my iSCSI server?

Because, imagine that you are in my situation, where I want to have 
(let's say) 4 iSCSI servers with at most 16 disks attached per server, and 
at least 2 iSCSI clients which consolidate this space with ZFS. Suddenly 
you see with zpool status that a disk is dead. I have to be able to 
replace this disk, and for that I have to know on which of the 4 machines 
it resides and which disk it is (one way to correlate the names is 
sketched below).
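
A hedged sketch of how one might correlate the two sides with the commands
already used in this thread (the verbose listing on the target side should
show the backing store, though the exact output format may differ):

  # on the iSCSI client: map the local device name back to the target IQN
  iscsiadm list target -Sv | egrep 'Target:|Alias|OS Device'

  # on each iSCSI server: map the target (the IQN's local_name suffix)
  # to its backing store
  iscsitadm list target -v

The IQN, or its local_name/alias suffix, is the common key between the two
outputs, which is exactly why a meaningful, settable alias would help here.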



So do any of you know a little bit about this?

Ced.
--

Cedric BRINER
Geneva - Switzerland


[zfs-discuss] zfs and iscsi: cannot open : I/O error

2007-02-26 Thread cedric briner

hello,

I'm trying to consolidate my HDs in a cheap but (I hope) reliable 
manner. To do so, I was thinking of using ZFS over iSCSI.


Unfortunately, I'm having some issues with it when I do:

# iscsi server (nexenta alpha 5)
#
svcadm enable iscsitgt
iscsitadm delete target --lun 0 vol-1
iscsitadm list target # empty
iscsitadm create target -b  /dev/dsk/c0d0s5  vol-1
iscsitadm list target # not empty
   Target: vol-1
   iSCSI Name: 
iqn.1986-03.com.sun:02:662bd119-1660-6141-cea7-dd799d53b254.vol-1

   Connections: 0

#iscsi client (solaris 5.10, up-to-date)
#

iscsiadm add discovery-address 10.194.67.111 # (iscsi server)
iscsiadm modify discovery --sendtargets enable
iscsiadm list discovery-address # not empty
iscsiadm list target # not empty
   Target: 
iqn.1986-03.com.sun:02:662bd119-1660-6141-cea7-dd799d53b254.vol-1

   Alias: vol-1
   TPGT: 1
   ISID: 402a
   Connections: 1

devfsadm -i iscsi # to create the device on sf3
iscsiadm list target -Sv| egrep 'OS Device|Peer|Alias' # not empty
Alias: vol-1
  IP address (Peer): 10.194.67.111:3260
 OS Device Name: 
/dev/rdsk/c1t014005A267C12A0045E2F524d0s2



zpool create tank c1t014005A267C12A0045E2F524d0s2
cannot open '/dev/dsk/c1t014005A267C12A0045E2F524d0s2': I/O error

#-
The error was produced when using the ``disk'' type for the iSCSI target. 
I followed Roch's advice to try the different iSCSI target types:

disk|raw|tape

but unfortunately the only type that accepts 
``iscsitadm create target -b /dev/dsk/c0d0s5'' is the disk type, which 
doesn't work.


Any idea of what I could do to fix this?


thanks in advance

Ced.
--

Cedric BRINER
Geneva - Switzerland