Re: [zfs-discuss] Best way to incorporate disk size tolerance into raidz arrays?

2007-09-13 Thread Daniel Carosone
> Caveat: do not enable nonvolatile write cache for UFS.

Correction:  do not enable *volatile* write cache for UFS  :-)

--
Dan.
 
 


Re: [zfs-discuss] Best way to incorporate disk size tolerance into raidz arrays?

2007-09-10 Thread Richard Elling
MC wrote:
> To expand on this:
> 
>> The recommended use of whole disks is for drives with volatile 
>> write caches where ZFS will enable the cache if it owns the whole disk.
> 
> Does ZFS really never use disk cache when working with a disk slice?  

This question doesn't make sense.  ZFS doesn't know anything about the
disk's cache.  But if ZFS has full control over the disk, then it will
attempt to enable the disk's nonvolatile cache.

> Is there any way to force it to use the disk cache?

ZFS doesn't know anything about the disk's cache.  But it will try to
issue the flush cache commands as needed.

To try to preempt the next question, some disks allow you to turn off
the nonvolatile write cache.  Some don't.  Some disks allow you to
enable or disable the nonvolatile write cache via the format(1M) command
in expert mode.  Some don't.  AFAIK, nobody has a list of these, so you
might try it.  Caveat: do not enable nonvolatile write cache for UFS.
  -- richard



Re: [zfs-discuss] Best way to incorporate disk size tolerance into raidz arrays?

2007-09-10 Thread MC
To expand on this:

> The recommended use of whole disks is for drives with volatile write caches 
> where ZFS will enable the cache if it owns the whole disk.

Does ZFS really never use disk cache when working with a disk slice?  Is there 
any way to force it to use the disk cache?
 
 


Re: [zfs-discuss] Best way to incorporate disk size tolerance into raidz arrays?

2007-08-29 Thread Richard Elling
MC wrote:
>> This is a problem for replacement, not creation.
> 
> You're talking about solving the problem in the future?  I'm talking about 
> working around the problem today.  :)  This isn't a fluffy dream problem.  I 
> ran into this last month when an RMA'd drive wouldn't fit back into a RAID5 
> array.  RAIDZ is subject to the exact same problem, so I want to find the 
> solution before making a RAIDZ array.
> 
>> The authoritative answer is in the man page for zpool.
> 
> You quoted the exact same line that I quoted in my original post.  That isn't 
> a solution.  That is a constraint which causes the problem and which the 
> solution must work around.
> 
> The two solutions listed here are slicing and hacking the "whole disk" label 
> to be smaller than the whole disk.  There is no consensus here on what 
> solution, if any, should be used.  I would like there to be, so I'll leave 
> the original question open.

Slicing seems simple enough.
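
A minimal sketch of that approach (the device names, pool name, and amount of
headroom are only examples): on each disk, use format(1M) to make slice 0 a
little smaller than the raw capacity, say 100 MB or so short, then build the
raidz from the slices:

     # zpool create tank raidz c1t0d0s0 c1t1d0s0 c1t2d0s0

A later zpool replace can then point at an identically sized slice on the
replacement drive, even if the new drive's raw capacity is slightly smaller.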
  -- richard


Re: [zfs-discuss] Best way to incorporate disk size tolerance into raidz arrays?

2007-08-29 Thread MC
> This is a problem for replacement, not creation.

You're talking about solving the problem in the future?  I'm talking about 
working around the problem today.  :)  This isn't a fluffy dream problem.  I 
ran into this last month when an RMA'd drive wouldn't fit back into a RAID5 
array.  RAIDZ is subject to the exact same problem, so I want to find the 
solution before making a RAIDZ array.

> The authoritative answer is in the man page for zpool.

You quoted the exact same line that I quoted in my original post.  That isn't a 
solution.  That is a constraint which causes the problem and which the solution 
must work around.

The two solutions listed here are slicing and hacking the "whole disk" label to 
be smaller than the whole disk.  There is no consensus here on what solution, 
if any, should be used.  I would like there to be, so I'll leave the original 
question open.
 
 


Re: [zfs-discuss] Best way to incorporate disk size tolerance into raidz arrays?

2007-08-29 Thread Richard Elling
MC wrote:
> Thanks for the comprehensive replies!
> 
> I'll need some baby speak on this one though: 
> 
>> The recommended use of whole disks is for drives with volatile write 
>> caches where ZFS will enable the cache if it owns the whole disk. There 
>> may be an RFE lurking here, but it might be tricky to correctly implement 
>> to protect against future data corruptions by non-ZFS use.
> 
> I don't know what you mean by "drives with volatile write caches", but I'm 
> dealing with commodity SATA2 drives from WD/Seagate/Hitachi/Samsung.  

You may see it in the data sheet as "buffer" or "cache buffer" for such drives.
Usually 8-16 MBytes, with 32 MBytes on newer drives.

> This disk replacement thing is a pretty common use case, so I think it would 
> be smart to sort it out while someone cares, and then stick the authoritative
> answer into the zfs wiki.  This is what I can contribute without knowing the 
> answer:

The authoritative answer is in the man page for zpool.
System Administration Commands  zpool(1M)

 The size of new_device must be greater than or equal  to
 the minimum size of all the devices in a mirror or raidz
 configuration.
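
For example, with made-up pool and device names, a replacement attempt like

     # zpool replace tank c1t2d0 c1t5d0

will be refused if c1t5d0 is even slightly smaller than the smallest device
already in that raidz vdev.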

> The best way to incorporate abnormal disk size variance tolerance into a 
> raidz array 
> is BLANK, and it has these BLANK side effects.  

This is a problem for replacement, not creation.  For creation, the problem
becomes more generic, but can make use of automation.  I've got some
algorithms to do that, but am not quite ready with a generic solution which
is administrator-friendly.  In other words, the science isn't difficult; the
automation is.
  -- richard


Re: [zfs-discuss] Best way to incorporate disk size tolerance into raidz arrays?

2007-08-28 Thread MC
Thanks for the comprehensive replies!

I'll need some baby speak on this one though: 

> The recommended use of whole disks is for drives with volatile write caches 
> where ZFS will enable the cache if it owns the whole disk. There may be an 
> RFE lurking here, but it might be tricky to correctly implement to protect 
> against future data corruptions by non-ZFS use.

I don't know what you mean by "drives with volatile write caches", but I'm 
dealing with commodity SATA2 drives from WD/Seagate/Hitachi/Samsung.  

This disk replacement thing is a pretty common use case, so I think it would be 
smart to sort it out while someone cares, and then stick the authoritative 
answer into the zfs wiki.  This is what I can contribute without knowing the 
answer:

The best way to incorporate abnormal disk size variance tolerance into a raidz 
array is BLANK, and it has these BLANK side effects.  

Now you guys fill in the BLANKs :P
 
 


Re: [zfs-discuss] Best way to incorporate disk size tolerance into raidz arrays?

2007-08-28 Thread Richard Elling
MC wrote:
> The situation: a three 500gb disk raidz array.  One disk breaks and you 
> replace it with a new one.  But the new 500gb disk is slightly smaller 
> than the smallest disk in the array.  

This is quite a problem for RAID arrays, too.  It is why vendors use custom
labels for disks.  When you have multiple disk vendors, or the disk vendors
change designs, you can end up with slightly different sized disks.  So you
tend to use a least common denominator for your custom label.

> I presume the disk would not be accepted into the array because the zpool 
> replace entry on the zpool man page says "The size of new_device must be 
> greater than or equal to the minimum size of all the devices in a mirror or 
> raidz configuration."[1]

Yes.

> I had expected (hoped) that a raidz array with sufficient free space would 
> downsize itself to accommodate the smaller replaced disk.  But I've never 
> seen that function mentioned anywhere :o)

This is the infamous "shrink vdev" RFE.

> So I figure the only way to build smaller-than-max-disk-size functionality
> into a raidz array is to make a slice on each disk that is slightly smaller 
> than the max disk size, and then build the array out of those slices.  Am I 
> correct here?

This is the technique vendors use for RAID arrays.

> If so, is there a downside to using slice(s) instead of whole disks?  The 
> zpool 
> manual says "ZFS can use individual slices or partitions, though the 
> recommended 
> mode of operation is to use whole disks." ["Virtual Devices (vdevs)", 1]  

The recommended use of whole disks is for drives with volatile write caches,
where ZFS will enable the cache if it owns the whole disk.  There may be an
RFE lurking here, but it might be tricky to correctly implement to protect
against future data corruptions by non-ZFS use.
  -- richard


Re: [zfs-discuss] Best way to incorporate disk size tolerance into raidz arrays?

2007-08-28 Thread Marion Hakanson
[EMAIL PROTECTED] said:
> The situation: a three 500gb disk raidz array.  One disk breaks and you
> replace it with a new one.  But the new 500gb disk is slightly smaller than
> the smallest disk in the array.   
> . . .
> So I figure the only way to build smaller-than-max-disk-size functionality
> into a raidz array is to make a slice on each disk that is slightly smaller
> than the max disk size, and then build the array out of those slices.  Am I
> correct here?

Actually, you can manually adjust the "whole disk" label so it takes up
less than the whole disk.  ZFS doesn't seem to notice.  One way of doing
this is to create a temporary whole-disk pool on an unlabelled disk,
allowing ZFS to set up its standard EFI label.  Then destroy that temporary
pool, and use "format" to adjust the size of slice 0 to whatever smaller
block count you want.  Later "zpool create", "add", or "attach" operations
seem to just follow the existing label, rather than adjust it upwards to
the maximum block count that will fit on the disk.
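
In command form, that procedure looks roughly like this (the pool and device
names are made up, and the exact block count for slice 0 is up to you):

     # zpool create tmppool c2t0d0   (temporary pool; ZFS writes its EFI label)
     # zpool destroy tmppool
     # format c2t0d0                 (partition menu: shrink slice 0, then label)
     # zpool create tank raidz c2t0d0 c2t1d0 c2t2d0

with the last step observed to follow the already-shrunk labels rather than
re-expanding them.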

I'm just reporting what I've observed (Solaris 10U3); naturally this
could change as releases go forward, although the current behavior
seems like a pretty safe one.


> If so, is there a downside to using slice(s) instead of whole disks?  The
> zpool manual says "ZFS can use individual slices or partitions, though the
> recommended mode of operation is to use whole disks." ["Virtual Devices
> (vdevs)", 1]   

The only downside I know of is a potential one:  you could get
competing uses of the same spindle, if you have more than one slice in
use on the same physical drive at the same time.  That can definitely
slow things down a lot, depending on what's going on.  ZFS seems to try
to use up all available performance of the drives it has been configured
to use.

Note that slicing up a boot drive with boot filesystems on part of
the disk, and a ZFS data pool on the rest, works just fine, likely
because you don't typically see a lot of I/O on the OS/boot filesystems
unless you're short on RAM (in which case things go slow for other reasons).

Regards,

Marion




[zfs-discuss] Best way to incorporate disk size tolerance into raidz arrays?

2007-08-28 Thread MC
The situation: a three 500gb disk raidz array.  One disk breaks and you replace 
it with a new one.  But the new 500gb disk is slightly smaller than the 
smallest disk in the array.  

I presume the disk would not be accepted into the array because the zpool 
replace entry on the zpool man page says "The size of new_device must be 
greater than or equal to the minimum size of all the devices in a mirror or 
raidz configuration."[1]

I had expected (hoped) that a raidz array with sufficient free space would 
downsize itself to accommodate the smaller replaced disk.  But I've never seen 
that function mentioned anywhere :o)  

So I figure the only way to build smaller-than-max-disk-size functionality into 
a raidz array is to make a slice on each disk that is slightly smaller than the 
max disk size, and then build the array out of those slices.  Am I correct here?

If so, is there a downside to using slice(s) instead of whole disks?  The zpool 
manual says "ZFS can use individual slices or partitions, though the 
recommended mode of operation is to use whole disks." ["Virtual Devices 
(vdevs)", 1]  

[1] http://docs.sun.com/app/docs/doc/819-2240/zpool-1m?a=view
 
 