Re: [zfs-discuss] ZFS vq_max_pending value ?

2008-01-22 Thread Manoj Nayak

 A 4-disk raidz group issues 128k/3 = 42.6k I/Os to each individual data
 disk. If 35 concurrent 128k I/Os are enough to saturate a disk (vdev),
 then 35*3 = 105 concurrent 42k I/Os will be required to saturate the
 same disk.
>>>
>>> ZFS doesn't know anything about disk saturation.  It will send
>>> up to vq_max_pending  I/O requests per vdev (usually a vdev is a
>>> disk). It will try to keep vq_max_pending I/O requests queued to
>>> the vdev.
>>
>> If I can see the "avg pending I/Os" hitting my vq_max_pending limit,
>> then raising the limit would be a good thing. I think it's due to the
>> many 42k read I/Os to each individual disk in the 4-disk raidz group.
>
> You're dealing with a queue here.  iostat's average pending I/Os 
> represents
> the queue depth.   Some devices can't handle a large queue.  In any
> case, queuing theory applies.
>
> Note that for reads, the disk will likely have a track cache, so it is
> not a good assumption that a read I/O will require a media access.
My workload issues around 5000 MB of read I/O, and iopattern says around
55% of the I/Os are random in nature.
I don't know how much prefetching through the track cache is going to help
here. Probably I can try disabling the vdev_cache
through "set 'zfs_vdev_cache_max' 1".

Thanks
Manoj Nayak


Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?

2008-01-22 Thread Kyle McDonald
Albert Chin wrote:
> On Tue, Jan 22, 2008 at 09:20:30PM -0500, Kyle McDonald wrote:
>   
>> Anyone know the answer to this? I'll be ordering 2 of the 7K's for
>> my x346's this week. If neither A nor B will work I'm not sure
>> there's any advantage to using the 7k card considering I want ZFS to
>> do the mirroring.
>> 
>
> Why even bother with a H/W RAID array when you won't use the H/W RAID?
> Better to find a decent SAS/FC JBOD with cache. Would definitely be
> cheaper.
>
>   
I've never heard of such a thing. Do you have any links (cheap or not)?

Do they exist for less than $350? That's what the 7k will run me.
Do they include an enclosure for at least 6 disks? The 7k will use the 6 
U320 hot-swap bays already in my IBM x346 chassis.

I'm not being sarcastic; if something better exists, even for a little 
more, I'm interested. I'd especially love to switch to SATA, as I'm about 
to pay about $550 each for 300GB U320 drives, and with SATA I could go 
bigger, or save money, or both. :)

   -Kyle



Re: [zfs-discuss] ZFS vq_max_pending value ?

2008-01-22 Thread Richard Elling
manoj nayak wrote:
>
> - Original Message - From: "Richard Elling" 
> <[EMAIL PROTECTED]>
> To: "manoj nayak" <[EMAIL PROTECTED]>
> Cc: 
> Sent: Wednesday, January 23, 2008 7:20 AM
> Subject: Re: [zfs-discuss] ZFS vq_max_pending value ?
>
>
>> manoj nayak wrote:
>>>
 Manoj Nayak wrote:
> Hi All.
>
> The ZFS documentation says ZFS schedules its I/O in such a way that it
> manages to saturate a single disk's bandwidth using enough
> concurrent 128K I/Os.
> The number of concurrent I/Os is decided by vq_max_pending. The default
> value for vq_max_pending is 35.
>
> We have created a 4-disk raid-z group inside a ZFS pool on a Thumper.
> The ZFS record size is set to 128k. When we read/write a 128K record,
> it issues a 128K/3 I/O to each of the 3 data disks in the 4-disk
> raid-z group.
>

 Yes, this is how it works for a read without errors.  For a write, you
 should see 4 writes, each 128KBytes/3.  Writes may also be
 coalesced, so you may see larger physical writes.

> We need to saturate all three data disks' bandwidth in the raidz
> group. Is it required to set the vq_max_pending value to 35*3 = 105?
>

 No.  vq_max_pending applies to each vdev.
>>>
>>> A 4-disk raidz group issues 128k/3 = 42.6k I/Os to each individual
>>> data disk. If 35 concurrent 128k I/Os are enough to saturate a disk
>>> (vdev), then 35*3 = 105 concurrent 42k I/Os will be required to
>>> saturate the same disk.
>>
>> ZFS doesn't know anything about disk saturation.  It will send
>> up to vq_max_pending  I/O requests per vdev (usually a vdev is a
>> disk). It will try to keep vq_max_pending I/O requests queued to
>> the vdev.
>
> If I can see the "avg pending I/Os" hitting my vq_max_pending limit,
> then raising the limit would be a good thing. I think it's due to the
> many 42k read I/Os to each individual disk in the 4-disk raidz group.

You're dealing with a queue here.  iostat's average pending I/Os represents
the queue depth.   Some devices can't handle a large queue.  In any
case, queuing theory applies.

Note that for reads, the disk will likely have a track cache, so it is
not a good assumption that a read I/O will require a media access.
 -- richard



[zfs-discuss] ZFS volume Block Size & Record Size

2008-01-22 Thread Manoj Nayak
Hi All,

How is the ZFS volblocksize related to the ZFS recordsize?

Thanks
Manoj Nayak


Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?

2008-01-22 Thread Carson Gaspar
Albert Chin wrote:
> On Tue, Jan 22, 2008 at 09:20:30PM -0500, Kyle McDonald wrote:
>> Anyone know the answer to this? I'll be ordering 2 of the 7K's for
>> my x346's this week. If neither A nor B will work I'm not sure
>> there's any advantage to using the 7k card considering I want ZFS to
>> do the mirroring.
> 
> Why even bother with a H/W RAID array when you won't use the H/W RAID?
> Better to find a decent SAS/FC JBOD with cache. Would definitely be
> cheaper.

Please name some candidate cards matching that description - I don't 
know of any.

-- 
Carson


Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?

2008-01-22 Thread Albert Chin
On Tue, Jan 22, 2008 at 09:20:30PM -0500, Kyle McDonald wrote:
> Anyone know the answer to this? I'll be ordering 2 of the 7K's for
> my x346's this week. If neither A nor B will work I'm not sure
> there's any advantage to using the 7k card considering I want ZFS to
> do the mirroring.

Why even bother with a H/W RAID array when you won't use the H/W RAID?
Better to find a decent SAS/FC JBOD with cache. Would definitely be
cheaper.

-- 
albert chin ([EMAIL PROTECTED])


Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?

2008-01-22 Thread Kyle McDonald
Kyle McDonald wrote:
> Now I just need to determine if  a) the cache is used by the card even 
> when using the disks on it as JBOD, or b) if the card will allow me to 
> make 5 or 6 raid 0 luns with only 1 disk in each, to simulate (a) and 
> activate the write cache.
>
>   
I found docs at IBM that make me think that (B) at least will work. The 
7k can use up to 30 disks, and can allow the host to see as many as 8 
LUNs. IBM's description of RAID 0 is that it requires a minimum of 1 drive, 
so I can't see why I couldn't create 5 or 6 single-drive RAID 0 LUNs to use 
the 7k's 256MB cache with ZFS.

The next question is: with a single-drive RAID 0 LUN, will the card's stripe 
unit size be a factor? And if so, how should the card's stripe unit size be 
set? I know ZFS likes to write to the disk in 128K chunks. Is that 128K to 
each vdev, or 128K/n to each vdev?

This card allows stripe unit sizes of 8k, 16k, 32k, and 64k. I'm 
guessing that if ZFS will be sending 128K to each vdev at once nearly 
all the time, I should use 64k. If it's 128K across the 5 vdevs, then 
16k or 32k might be better? In either case, is there an advantage to 
tuning ZFS's record size down to match the card's?
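
If it comes to experimenting, I assume dropping the dataset's recordsize to
match the card's stripe unit is just the usual property set (names below
are made up):

  # recordsize only affects newly written files, so set it before
  # loading data
  zfs set recordsize=64k tank/builds
  zfs get recordsize tank/builds
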
> If this all does work, it should speed up all the writes to the disk, 
> including the ZIL writes. Is there still an advantage to investigating a 
> Solid State Disk, or Flash Drive device to relocate the ZIL to?
>   
I'm still going to investigate this further. I think I'll try to 
calculate the max ZIL size once I'm up and running, and see if I can't 
get a cheap USB flash drive of a decent size to test this with.
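
If the build I end up on supports separate intent-log devices, I'm assuming
the test would look roughly like this (device name is just an example):

  # dedicate a small, fast device to the ZIL as a separate log vdev
  zpool add tank log c3t0d0
  # and confirm it shows up under "logs"
  zpool status tank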

  -Kyle



Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?

2008-01-22 Thread Kyle McDonald
Carson Gaspar wrote:
> Kyle McDonald wrote:
> ...
>   
>> I know, but for that card you need a driver to make it appear as a 
>> device. Plus it would take a PCI slot.
>> I was hoping to make use of the battery backed ram on a RAID card that I 
>> already have (but can't use since I want to let ZFS do the redundancy.)  
>> If I had a card with battery backed ram, how would I go about testing 
>> the commit semantics to see if it is only obeying ZFS commits when the 
>> battery is bad?
>> 
>
> Any _sane_ controller that supports battery backed cache will disable 
> its write cache if its battery goes bad. It should also log this. I'd 
> check the docs or contact your vendor's tech support to verify the card 
> you have is sane, and if it reports the error to its monitoring tools so 
> you find out about it quickly.
>   
You're right. I forgot that. Not only would the commits need to happen 
right away, but the cache should be disabled completely.

Now that you mention it, I know from experience, for the ServeRAID 7k/8k 
controllers, the cache is disabled if/when the battery fails. Good point.

Now I just need to determine if  a) the cache is used by the card even 
when using the disks on it as JBOD, or b) if the card will allow me to 
make 5 or 6 raid 0 luns with only 1 disk in each, to simulate (a) and 
activate the write cache.

Anyone know the answer to this? I'll be ordering 2 of the 7K's for my 
x346's this week. If neither A nor B will work I'm not sure there's any 
advantage to using the 7k card considering I want ZFS to do the mirroring.

If this all does work, it should speed up all the writes to the disk, 
including the ZIL writes. Is there still an advantage to investigating a 
Solid State Disk, or Flash Drive device to relocate the ZIL to?
> Now you'll probably _still_ need to disable the ZFS cache flushes, which 
> is a global option, so you'd need to make sure that _all_ your ZFS 
> devices had battery backed write caches or no write caches at all.
>
>   
I guess this is a better solution than chasing down firmware authors to 
get them to ignore flush requests.
It's just too bad it's not settable on a pool-by-pool basis rather than 
server by server. It won't affect me, though, as this will be the only 
pool on this machine.

   -Kyle



Re: [zfs-discuss] ZFS vq_max_pending value ?

2008-01-22 Thread manoj nayak

- Original Message - 
From: "Richard Elling" <[EMAIL PROTECTED]>
To: "manoj nayak" <[EMAIL PROTECTED]>
Cc: 
Sent: Wednesday, January 23, 2008 7:20 AM
Subject: Re: [zfs-discuss] ZFS vq_max_pending value ?


> manoj nayak wrote:
>>
>>> Manoj Nayak wrote:
 Hi All.

 The ZFS documentation says ZFS schedules its I/O in such a way that it
 manages to saturate a single disk's bandwidth using enough concurrent
 128K I/Os. The number of concurrent I/Os is decided by vq_max_pending.
 The default value for vq_max_pending is 35.

 We have created a 4-disk raid-z group inside a ZFS pool on a Thumper.
 The ZFS record size is set to 128k. When we read/write a 128K record, it
 issues a 128K/3 I/O to each of the 3 data disks in the 4-disk raid-z group.

>>>
>>> Yes, this is how it works for a read without errors.  For a write, you
>>> should see 4 writes, each 128KBytes/3.  Writes may also be
>>> coalesced, so you may see larger physical writes.
>>>
 We need to saturate all three data disks' bandwidth in the raidz group.
 Is it required to set the vq_max_pending value to 35*3 = 105?

>>>
>>> No.  vq_max_pending applies to each vdev.
>>
>> A 4-disk raidz group issues 128k/3 = 42.6k I/Os to each individual data
>> disk. If 35 concurrent 128k I/Os are enough to saturate a disk (vdev),
>> then 35*3 = 105 concurrent 42k I/Os will be required to saturate the
>> same disk.
>
> ZFS doesn't know anything about disk saturation.  It will send
> up to vq_max_pending  I/O requests per vdev (usually a vdev is a
> disk). It will try to keep vq_max_pending I/O requests queued to
> the vdev.

If I can see the "avg pending I/Os" hitting my vq_max_pending limit, then
raising the limit would be a good thing. I think it's due to the many
42k read I/Os to each individual disk in the 4-disk raidz group.

Thanks
Manoj Nayak

> For writes, you should see them become coalesced, so rather than
> sending 3 42.6kByte write requests to a vdev, you might see one
> 128kByte write request.
>
> In other words, ZFS has an I/O scheduler which is responsible
> for sending I/O requests to vdevs.
> -- richard
>
> 



Re: [zfs-discuss] Sparc zfs root/boot status ?

2008-01-22 Thread David Magda
On Jan 22, 2008, at 18:24, Lori Alt wrote:

> ZFS boot supported by the installation software, plus
> support for having swap and dump be zvols within
> the root pool (i.e., no longer requiring a separate
> swap/dump slice), plus various other features, such
> as support for failsafe-archive booting.

Will there be any support for tying into patching / Live Upgrade with 
the ZFS boot putback, or is that a separate project?

Thanks for any info.


Re: [zfs-discuss] ZFS vq_max_pending value ?

2008-01-22 Thread Richard Elling
manoj nayak wrote:
>
>> Manoj Nayak wrote:
>>> Hi All.
>>>
>>> The ZFS documentation says ZFS schedules its I/O in such a way that it
>>> manages to saturate a single disk's bandwidth using enough concurrent
>>> 128K I/Os. The number of concurrent I/Os is decided by vq_max_pending.
>>> The default value for vq_max_pending is 35.
>>>
>>> We have created a 4-disk raid-z group inside a ZFS pool on a Thumper.
>>> The ZFS record size is set to 128k. When we read/write a 128K record,
>>> it issues a 128K/3 I/O to each of the 3 data disks in the 4-disk
>>> raid-z group.
>>>
>>
>> Yes, this is how it works for a read without errors.  For a write, you
>> should see 4 writes, each 128KBytes/3.  Writes may also be
>> coalesced, so you may see larger physical writes.
>>
>>> We need to saturate all three data disks' bandwidth in the raidz
>>> group. Is it required to set the vq_max_pending value to 35*3 = 105?
>>>
>>
>> No.  vq_max_pending applies to each vdev.
>
> A 4-disk raidz group issues 128k/3 = 42.6k I/Os to each individual
> data disk. If 35 concurrent 128k I/Os are enough to saturate a disk
> (vdev), then 35*3 = 105 concurrent 42k I/Os will be required to
> saturate the same disk.

ZFS doesn't know anything about disk saturation.  It will send
up to vq_max_pending  I/O requests per vdev (usually a vdev is a
disk). It will try to keep vq_max_pending I/O requests queued to
the vdev.

For writes, you should see them become coalesced, so rather than
sending 3 42.6kByte write requests to a vdev, you might see one
128kByte write request.

In other words, ZFS has an I/O scheduler which is responsible
for sending I/O requests to vdevs.
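
For reference, that queue depth is governed by a single global tunable in
current bits (the name below is taken from the Nevada source; verify it
exists on your build before relying on it):

  # /etc/system entry to change the per-vdev queue depth at boot
  # (35 is the current default)
  set zfs:zfs_vdev_max_pending = 35

  # or poke it on a live system with mdb (0t35 = decimal 35)
  echo "zfs_vdev_max_pending/W 0t35" | mdb -kw
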
 -- richard



Re: [zfs-discuss] ZFS vdev_cache

2008-01-22 Thread eric kustarz

On Jan 22, 2008, at 5:39 PM, manoj nayak wrote:

>>
>> Manoj Nayak writes:
>>> Hi All,
>>>
>>> Is any dtrace script available to figure out the vdev_cache (or
>>> software track buffer) reads in kilobytes?
>>>
>>> The document says the default size of the read is 128k; however, the
>>> vdev_cache source code implementation says the default size is 64k.
>>>
>>> Thanks
>>> Manoj Nayak
>>>
>>
>> Which document? It's 64K when it applies.
>> Nevada won't use the vdev_cache for data blocks anymore.
>
> How is readahead (or the software track buffer) going to be used in
> Nevada without the vdev_cache? Any pointers to documents regarding that?

The vdev cache is still used - just for metadata.  See:
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/vdev_cache.c#35

http://blogs.sun.com/erickustarz/entry/vdev_cache_improvements_to_help
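
As for the original question, here is a rough sketch of the kind of script
that would show it. It assumes the fbt provider can see vdev_cache_read()
and that its zio_t argument carries an io_size member on your build, so
treat it as untested:

  # total KB of reads entering the vdev cache path, every 10 seconds
  dtrace -n '
  fbt::vdev_cache_read:entry
  {
          @bytes = sum(args[0]->io_size);
  }
  tick-10s
  {
          normalize(@bytes, 1024);
          printa("vdev_cache read KB: %@d\n", @bytes);
          clear(@bytes);
  }'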

eric



Re: [zfs-discuss] ZFS vdev_cache

2008-01-22 Thread manoj nayak
>
> Manoj Nayak writes:
> > Hi All,
> >
> > Is any dtrace script available to figure out the vdev_cache (or
> > software track buffer) reads in kilobytes?
> >
> > The document says the default size of the read is 128k; however, the
> > vdev_cache source code implementation says the default size is 64k.
> >
> > Thanks
> > Manoj Nayak
> >
>
> Which document? It's 64K when it applies.
> Nevada won't use the vdev_cache for data blocks anymore.

How is readahead (or the software track buffer) going to be used in Nevada
without the vdev_cache? Any pointers to documents regarding that?

Thanks
Manoj Nayak

>
> -r
>



Re: [zfs-discuss] ZFS vq_max_pending value ?

2008-01-22 Thread manoj nayak

> Manoj Nayak wrote:
>> Hi All.
>>
>> The ZFS documentation says ZFS schedules its I/O in such a way that it
>> manages to saturate a single disk's bandwidth using enough concurrent
>> 128K I/Os. The number of concurrent I/Os is decided by vq_max_pending.
>> The default value for vq_max_pending is 35.
>>
>> We have created a 4-disk raid-z group inside a ZFS pool on a Thumper.
>> The ZFS record size is set to 128k. When we read/write a 128K record,
>> it issues a 128K/3 I/O to each of the 3 data disks in the 4-disk
>> raid-z group.
>>
>
> Yes, this is how it works for a read without errors.  For a write, you
> should see 4 writes, each 128KBytes/3.  Writes may also be
> coalesced, so you may see larger physical writes.
>
>> We need to saturate all three data disks' bandwidth in the raidz group.
>> Is it required to set the vq_max_pending value to 35*3 = 105?
>>
>
> No.  vq_max_pending applies to each vdev.

A 4-disk raidz group issues 128k/3 = 42.6k I/Os to each individual data
disk. If 35 concurrent 128k I/Os are enough to saturate a disk (vdev),
then 35*3 = 105 concurrent 42k I/Os will be required to saturate the same
disk.

Thanks
Manoj Nayak

> Use iostat to see what
> the device load is.  For the commonly used Hitachi 500 GByte disks
> in a thumper, the read media bandwidth is 31-64.8 MBytes/s.  Writes
> will be about 80% of reads, or 24.8-51.8 MBytes/s.  In a thumper,
> the disk bandwidth will be the limiting factor for the hardware.
> -- richard
>
> 



Re: [zfs-discuss] Sparc zfs root/boot status ?

2008-01-22 Thread Lori Alt
andrewk9 wrote:
>> zfs boot on sparc will not be putback on its own.
>> It will be putback with the rest of zfs boot support,
>> sometime around build 86.
>> 
>
> Since we already have ZFS boot on x86, what else will be added in addition to 
> ZFS boot for SPARC?
>
>   
ZFS boot supported by the installation software, plus
support for having swap and dump be zvols within
the root pool (i.e., no longer requiring a separate
swap/dump slice), plus various other features, such
as support for failsafe-archive booting.
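
For the curious, once that support is in, the setup is expected to look
roughly like this (pool name and sizes here are just examples):

  # swap and dump as zvols in the root pool
  zfs create -V 2g rpool/swap
  swap -a /dev/zvol/dsk/rpool/swap
  zfs create -V 1g rpool/dump
  dumpadm -d /dev/zvol/dsk/rpool/dump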

Lori


Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?

2008-01-22 Thread Carson Gaspar
Kyle McDonald wrote:
...
> I know, but for that card you need a driver to make it appear as a 
> device. Plus it would take a PCI slot.
> I was hoping to make use of the battery backed ram on a RAID card that I 
> already have (but can't use since I want to let ZFS do the redundancy.)  
> If I had a card with battery backed ram, how would I go about testing 
> the commit semantics to see if it is only obeying ZFS commits when the 
> battery is bad?

Any _sane_ controller that supports battery backed cache will disable 
its write cache if its battery goes bad. It should also log this. I'd 
check the docs or contact your vendor's tech support to verify the card 
you have is sane, and if it reports the error to its monitoring tools so 
you find out about it quickly.

Now you'll probably _still_ need to disable the ZFS cache flushes, which 
is a global option, so you'd need to make sure that _all_ your ZFS 
devices had battery backed write caches or no write caches at all.
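
If you do go that route, the knob is a kernel tunable rather than a pool
property; something like the following in /etc/system (tunable name as in
recent bits, so double-check it exists on your release):

  # stop ZFS from issuing cache-flush commands to devices; only safe
  # when every device behind every pool has battery/NVRAM-backed cache
  set zfs:zfs_nocacheflush = 1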

-- 
Carson


[zfs-discuss] zpool attach problem

2008-01-22 Thread Rob
On a V240 running s10u4 (no additional patches), I had a pool which looked like 
this:


> # zpool status
>   pool: pool01
>  state: ONLINE
>  scrub: none requested
> config:
> 
> NAME   STATE READ WRITE CKSUM
> pool01 ONLINE   0 0 0
>   mirror   ONLINE   0 0 0
> c8t600C0FF0082668310F838000d0  ONLINE   0 0 0
> c8t600C0FF007E4BE4C38F4ED00d0  ONLINE   0 0 0
>   mirror   ONLINE   0 0 0
> c8t600C0FF008266812A0877700d0  ONLINE   0 0 0
> c8t600C0FF007E4BE2BEDBC9600d0  ONLINE   0 0 0
> 
> errors: No known data errors


Since this system is not in production yet, I wanted to do a little disk 
juggling as follows:


> # zpool detach pool01 c8t600C0FF007E4BE4C38F4ED00d0
> # zpool detach pool01 c8t600C0FF007E4BE2BEDBC9600d0


New pool status:


> # zpool status
>   pool: pool01
>  state: ONLINE
>  scrub: none requested
> config:
> 
> NAME STATE READ WRITE CKSUM
> pool01   ONLINE   0 0 0
>   c8t600C0FF0082668310F838000d0  ONLINE   0 0 0
>   c8t600C0FF008266812A0877700d0  ONLINE   0 0 0
> 
> errors: No known data errors


Finally, I wanted to re-establish mirrors, but am seeing the following errors:


> # zpool attach pool01 c8t600C0FF008266812A0877700d0 
> c8t600C0FF007E4BE4C38F4ED00d0
> cannot attach c8t600C0FF007E4BE4C38F4ED00d0 to 
> c8t600C0FF008266812A0877700d0: device is too small
> # zpool attach pool01 c8t600C0FF0082668310F838000d0 
> c8t600C0FF007E4BE2BEDBC9600d0
> cannot attach c8t600C0FF007E4BE2BEDBC9600d0 to 
> c8t600C0FF0082668310F838000d0: device is too small


Is this expected behavior? The 'zpool' man page says:

If device is not currently part of a mirrored configuration,  device
automatically  transforms  into a two-way  mirror of device and new_device.

But, this isn't what I'm seeing . . . did I do something wrong?

Here's the format output for the disks:


   4. c8t600C0FF007E4BE2BEDBC9600d0 
  /scsi_vhci/[EMAIL PROTECTED]
   5. c8t600C0FF007E4BE4C38F4ED00d0 
  /scsi_vhci/[EMAIL PROTECTED]
   6. c8t600C0FF008266812A0877700d0 
  /scsi_vhci/[EMAIL PROTECTED]
   7. c8t600C0FF0082668310F838000d0 
  /scsi_vhci/[EMAIL PROTECTED]
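
If it helps, I can compare the raw sizes the devices report, e.g. something
like this (sector counts from prtvtoc; use the slice 2 node instead if
these LUNs carry SMI labels):

  prtvtoc /dev/rdsk/c8t600C0FF008266812A0877700d0 | grep -i sectors
  prtvtoc /dev/rdsk/c8t600C0FF007E4BE4C38F4ED00d0 | grep -i sectors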


Rob
 
 


Re: [zfs-discuss] Sparc zfs root/boot status ?

2008-01-22 Thread andrewk9
> zfs boot on sparc will not be putback on its own.
> It will be putback with the rest of zfs boot support,
> sometime around build 86.

Since we already have ZFS boot on x86, what else will be added in addition to 
ZFS boot for SPARC?

Thanks

Andrew.
 
 


Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?

2008-01-22 Thread Kyle McDonald
Albert Chin wrote:
> On Tue, Jan 22, 2008 at 12:47:37PM -0500, Kyle McDonald wrote:
>   
>> My primary use case is NFS-based storage for a farm of software build 
>> servers and developer desktops.
>> 
>
> For the above environment, you'll probably see a noticeable improvement
> with a battery-backed NVRAM-based ZIL. Unfortunately, no inexpensive
> cards exist for the common consumer (with ECC memory anyways). If you
> convince http://www.micromemory.com/ to sell you one, let us know :)
>
>   
I know, but for that card you need a driver to make it appear as a 
device. Plus it would take a PCI slot.
I was hoping to make use of the battery backed ram on a RAID card that I 
already have (but can't use since I want to let ZFS do the redundancy.)  
If I had a card with battery backed ram, how would I go about testing 
the commit semantics to see if it is only obeying ZFS commits when the 
battery is bad?

Does anyone know if the IBM ServeRAID 7k or 8k does this correctly? If not, 
is there any chance of getting IBM to 'fix' the firmware? The Solaris 
Redbooks I've read seem to think highly of ZFS.

Back on the subject of NVRAM for ZIL devices: what are people using, then, 
for ZIL devices on the budget-limited side of things?

I've found some SATA flash drives, and a bunch that are IDE. 
Unfortunately the HW I'd like to stick this in is a little older... it's 
got a U320 SCSI controller in it. Has anyone found a good U320 flash 
disk that's not overkill size-wise, and not outrageously expensive? 
Google found what appear to be a few OEM vendors, but no resellers in 
the quantity I'd be interested in.

Anyone using a USB Flash drive? Is USB fast enough to gain any benefits?

   -Kyle

> Set "set zfs:zil_disable = 1" in /etc/system to gauge the type of
> improvement you can expect. Don't use this in production though.
>
>   




Re: [zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?

2008-01-22 Thread Albert Chin
On Tue, Jan 22, 2008 at 12:47:37PM -0500, Kyle McDonald wrote:
> 
> My primary use case is NFS-based storage for a farm of software build 
> servers and developer desktops.

For the above environment, you'll probably see a noticeable improvement
with a battery-backed NVRAM-based ZIL. Unfortunately, no inexpensive
cards exist for the common consumer (with ECC memory anyways). If you
convince http://www.micromemory.com/ to sell you one, let us know :)

Set "set zfs:zil_disable = 1" in /etc/system to gauge the type of
improvement you can expect. Don't use this in production though.

-- 
albert chin ([EMAIL PROTECTED])


Re: [zfs-discuss] ZFS vq_max_pending value ?

2008-01-22 Thread Richard Elling
Manoj Nayak wrote:
> Hi All.
>
> The ZFS documentation says ZFS schedules its I/O in such a way that it
> manages to saturate a single disk's bandwidth using enough concurrent
> 128K I/Os. The number of concurrent I/Os is decided by vq_max_pending.
> The default value for vq_max_pending is 35.
>
> We have created a 4-disk raid-z group inside a ZFS pool on a Thumper.
> The ZFS record size is set to 128k. When we read/write a 128K record,
> it issues a 128K/3 I/O to each of the 3 data disks in the 4-disk
> raid-z group.
>   

Yes, this is how it works for a read without errors.  For a write, you
should see 4 writes, each 128KBytes/3.  Writes may also be
coalesced, so you may see larger physical writes.

> We need to saturate all three data disks' bandwidth in the raidz group.
> Is it required to set the vq_max_pending value to 35*3 = 105?
>   

No.  vq_max_pending applies to each vdev.  Use iostat to see what
the device load is.  For the commonly used Hitachi 500 GByte disks
in a thumper, the read media bandwidth is 31-64.8 MBytes/s.  Writes
will be about 80% of reads, or 24.8-51.8 MBytes/s.  In a thumper,
the disk bandwidth will be the limiting factor for the hardware.
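
For example, something like this while the workload is running shows how
close each data disk is to those media rates:

  # 1-second samples, extended stats, idle devices suppressed
  iostat -xnz 1
  # roughly: kr/s + kw/s near the media bandwidth above, or %b pinned
  # near 100 with actv sitting at vq_max_pending, means the vdev is
  # saturated
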
 -- richard



[zfs-discuss] LowEnd Batt. backed raid controllers that will deal with ZFS commit semantics correctly?

2008-01-22 Thread Kyle McDonald
Are there, or does it make any sense to try to find, RAID cards with 
battery backup that will ignore the ZFS commit commands when the battery 
is able to guarantee stable storage?

I don't know if they do this, but I've recently had good non-ZFS 
performance with the IBM ServeRAID 8k RAID controller that was in an 
xSeries server I was using. The 8k has 256MB of battery-backed cache.

The server it was in only had 6 drive bays, and I'm not looking to have 
it do RAID5 for ZFS, but I just had the idea:

 "Hey, I wonder if I could set up the card with 5 (single drive) RAID 0 LUNs,
  and gain the advantage of the 256MB battery-backed cache, when I tell
  ZFS to do RAIDZ across them?"

I know battery-backed cache and the proper commit semantics are 
generally found only on higher-end RAID controllers and arrays (right?), 
but I'm wondering now if I couldn't get an 8-port SATA controller that 
would let me map each single drive as a RAID 0 LUN and use its cache to 
boost performance.
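
i.e., once the card exports 5 single-drive RAID 0 LUNs, the pool side would
presumably just be the usual (device names made up):

  zpool create tank raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0
  zpool status tank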

My primary use case is NFS-based storage for a farm of software build 
servers and developer desktops.

Has anyone searched for this already? Has anyone found any reasons why it 
wouldn't work?

  -Kyle






Re: [zfs-discuss] Sparc zfs root/boot status ?

2008-01-22 Thread Lori Alt
zfs boot on sparc will not be putback on its own.
It will be putback with the rest of zfs boot support,
sometime around build 86.

Lori

Mauro Mozzarelli wrote:
> Back in October/November 2007 when I asked about Sparc zfs boot and root 
> capabilities, I got a reply indicating late December 2007 for a possible 
> release.
>
> I was wondering what is the status right now, will this feature make it into 
> build 79?



Re: [zfs-discuss] Sparc zfs root/boot status ?

2008-01-22 Thread Darren J Moffat
Mauro Mozzarelli wrote:
> Back in October/November 2007 when I asked about Sparc zfs boot and root 
> capabilities, I got a reply indicating late December 2007 for a possible 
> release.
> 
> I was wondering what is the status right now, will this feature make it into 
> build 79?

No, build 79 has long since closed and SPARC ZFS Boot isn't in it.

-- 
Darren J Moffat


Re: [zfs-discuss] Swap on ZVOL safe to use?

2008-01-22 Thread Darren J Moffat
Lori Alt wrote:
> The bug is being actively worked at this time (it just got a boost
> in urgency as a result of the issues it was causing for the
> zfs boot project).   It is likely that there will be a fix soon
> (sooner than zfs boot will be available).  In the
> meantime, I know of no workaround.  Maybe someone
> else does.

Is the fix to make it safe to swap on a ZVOL, or is it the introduction 
of the raw (non-COW) volumes mentioned previously?

-- 
Darren J Moffat


Re: [zfs-discuss] ZFS vdev_cache

2008-01-22 Thread Roch - PAE

Manoj Nayak writes:
 > Hi All,
 > 
 > Is any dtrace script available to figure out the vdev_cache (or
 > software track buffer) reads in kilobytes?
 > 
 > The document says the default size of the read is 128k; however, the
 > vdev_cache source code implementation says the default size is 64k.
 > 
 > Thanks
 > Manoj Nayak
 > 

Which document? It's 64K when it applies.
Nevada won't use the vdev_cache for data blocks anymore.

-r




[zfs-discuss] Updated ZFS Automatic Snapshot Service - version 0.10.

2008-01-22 Thread Tim Foster
Hi all,

I've got a slightly updated version of the ZFS Automatic Snapshot SMF
Service on my blog.

This version contains a few bugfixes (many thanks to Reid Spencer and
Breandan Dezendorf!) as well as a small new feature - by default we now
avoid taking snapshots for any datasets that are on a pool that's
currently being scrubbed or resilvered to avoid running into 6343667.

More at:
http://blogs.sun.com/timf/entry/zfs_automatic_snapshots_0_10
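
(Conceptually the new check is nothing more than something along these
lines; the real service handles the corner cases, and the pool/dataset
names below are placeholders:)

  # skip the snapshot if the pool is scrubbing or resilvering
  if zpool status "$POOL" | grep "in progress" > /dev/null ; then
          echo "scrub/resilver in progress on $POOL, skipping snapshot"
  else
          zfs snapshot "$FS@`date +%Y-%m-%d-%H%M`"
  fi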


Is this service something that we'd like to put into OpenSolaris, or are
there plans for something similar that achieves the same goal (and
perhaps integrates more neatly with the rest of ZFS)?

Otherwise, should I start filling in an ARC one-pager template, or is
this sort of utility something that's better left to sysadmins to
implement themselves, rather than baking it into the OS?

cheers,
tim
-- 
Tim Foster, Sun Microsystems Inc, Solaris Engineering Ops
http://blogs.sun.com/timf



[zfs-discuss] ZFS vq_max_pending value ?

2008-01-22 Thread Manoj Nayak
Hi All.

The ZFS documentation says ZFS schedules its I/O in such a way that it
manages to saturate a single disk's bandwidth using enough concurrent
128K I/Os. The number of concurrent I/Os is decided by vq_max_pending.
The default value for vq_max_pending is 35.

We have created a 4-disk raid-z group inside a ZFS pool on a Thumper.
The ZFS record size is set to 128k. When we read/write a 128K record, it
issues a 128K/3 I/O to each of the 3 data disks in the 4-disk raid-z group.

We need to saturate all three data disks' bandwidth in the raidz group.
Is it required to set the vq_max_pending value to 35*3 = 105?

Thanks
Manoj Nayak


[zfs-discuss] ZFS vdev_cache

2008-01-22 Thread Manoj Nayak
Hi All,

Is any dtrace script available to figure out the vdev_cache (or
software track buffer) reads in kilobytes?

The document says the default size of the read is 128k; however, the
vdev_cache source code implementation says the default size is 64k.

Thanks
Manoj Nayak





[zfs-discuss] Sparc zfs root/boot status ?

2008-01-22 Thread Mauro Mozzarelli
Back in October/November 2007 when I asked about Sparc zfs boot and root 
capabilities, I got a reply indicating late December 2007 for a possible 
release.

I was wondering what is the status right now, will this feature make it into 
build 79?
 
 


Re: [zfs-discuss] Ditto blocks in S10U4 ?

2008-01-22 Thread Tomas Ögren
On 22 January, 2008 - [EMAIL PROTECTED] sent me these 1,6K bytes:

> bash-3.00# cat /etc/release
> Solaris 10 8/07 s10x_u4wos_12b X86
>Copyright 2007 Sun Microsystems, Inc.  All Rights Reserved.
> Use is subject to license terms.
> Assembled 16 August 2007
> 
> (with all the latest patches)
> 
> bash-3.00# zpool list
> NAME     SIZE   USED   AVAIL   CAP   HEALTH   ALTROOT
> zpool1  20.8T  5.44G   20.8T    0%   ONLINE   -
> 
> bash-3.00# zpool upgrade -v
> This system is currently running ZFS version 4.
> 
> The following versions are supported:
> 
> VER  DESCRIPTION
> ---  
>  1   Initial ZFS version
>  2   Ditto blocks (replicated metadata)
>  3   Hot spares and double parity RAID-Z
>  4   zpool history
> 
> For more information on a particular version, including supported
> releases, see:
> 
> http://www.opensolaris.org/os/community/zfs/version/N
> 
> Where 'N' is the version number.
> 
> bash-3.00# zfs set copies=2 zpool1
> cannot set property for 'zpool1': invalid property 'copies'
> 
> From http://www.opensolaris.org/os/community/zfs/version/2/
> "... This version includes support for "Ditto Blocks", or replicated
> metadata."
> 
> Can anybody shed any light on it?

The 'copies' thing in zfs set is ditto blocks for data; the one in version 2
is for metadata only.
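
On a release whose zfs(1M) knows the property it's just a per-dataset
setting, e.g. (dataset name made up; it only affects newly written blocks):

  zfs set copies=2 zpool1/data
  zfs get copies zpool1/data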

/Tomas
-- 
Tomas Ögren, [EMAIL PROTECTED], http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se


[zfs-discuss] Ditto blocks in S10U4 ?

2008-01-22 Thread przemolicc
bash-3.00# cat /etc/release
Solaris 10 8/07 s10x_u4wos_12b X86
   Copyright 2007 Sun Microsystems, Inc.  All Rights Reserved.
Use is subject to license terms.
Assembled 16 August 2007

(with all the latest patches)

bash-3.00# zpool list
NAME     SIZE   USED   AVAIL   CAP   HEALTH   ALTROOT
zpool1  20.8T  5.44G   20.8T    0%   ONLINE   -

bash-3.00# zpool upgrade -v
This system is currently running ZFS version 4.

The following versions are supported:

VER  DESCRIPTION
---  
 1   Initial ZFS version
 2   Ditto blocks (replicated metadata)
 3   Hot spares and double parity RAID-Z
 4   zpool history

For more information on a particular version, including supported
releases, see:

http://www.opensolaris.org/os/community/zfs/version/N

Where 'N' is the version number.

bash-3.00# zfs set copies=2 zpool1
cannot set property for 'zpool1': invalid property 'copies'

From http://www.opensolaris.org/os/community/zfs/version/2/
"... This version includes support for "Ditto Blocks", or replicated
metadata."

Can anybody shed any light on it?

Regards
przemol

-- 
http://przemol.blogspot.com/



Re: [zfs-discuss] problem with nfs share of zfs storage

2008-01-22 Thread Robert Milkowski
Hello Francois,

Monday, January 21, 2008, 9:51:22 PM, you wrote:

FD> I have a need to stream video over nfs. video is stored on zfs. every 10
FD> minutes or so, the video will freeze, and then 1 minute later it
FD> resumes. This doesn't happen from an nfs mount on ufs. zfs server is a
FD> 32 bit P4 box with 512MB, running nexenta in plain text mode, and
FD> nothing else, really. Tried playback from different OSes and the same is
FD> happening. Network has more than 10x the capacity that is required, no
FD> compression on zfs

FD> Any idea what is going on? cpu is not pegged on server or playback
FD> client. Not sure what to look for.



Try running iostat -xnz 1 while you are streaming and catch the moment
you experience a problem.

Also try vmstat -p 1 at the same time and catch the same moment.



-- 
Best regards,
 Robert    mailto:[EMAIL PROTECTED]
   http://milek.blogspot.com



Re: [zfs-discuss] ATA UDMA data parity error

2008-01-22 Thread Kent Watsen

For the archive, I swapped the mobo and all is good now...  (I copied 
100GB into the pool without a crash)

One problem I had was that Solaris would hang whenever booting - even 
when all the aoc-sat2-mv8 cards were pulled out.  Turns out that 
switching the BIOS field "USB 2.0 Controller Mode" from "HiSpeed" to 
"FullSpeed" makes the difference - any ideas why?

Thanks,
Kent
