Re: [zfs-discuss] To slice, or not to slice

2010-04-06 Thread Edward Ned Harvey
 I have reason to believe that both the drive and the OS are correct.
 I suspect that the HBA simply handled the creation of this
 volume somehow differently than how it handled the original.  Don't
 know the answer for sure yet.

Ok, that's confirmed now.  Apparently when the drives ship from the factory,
they're pre-initialized for the HBA, so the HBA happily imports them and
creates a simple volume (aka JBOD) using the factory initialization.
Unfortunately, the factory init includes HBA metadata at both the start and
end of the drive ... so I lose 1MB.

The fix for the problem is to initialize the disk again with the HBA, and
then create a new simple volume.



Re: [zfs-discuss] To slice, or not to slice

2010-04-04 Thread Edward Ned Harvey
 Your experience is exactly why I suggested ZFS start doing some "right
 sizing" if you will.  Chop off a bit from the end of any disk so that
 we're guaranteed to be able to replace drives from different
 manufacturers.  The excuse being "no reason to, Sun drives are always
 of identical size."  If your drives did indeed come from Sun, their
 response is clearly not true.  Regardless, I guess I still think it
 should be done.  Figure out the greatest variation we've seen from
 drives that are supposedly of the exact same size, and chop it off the
 end of every disk.  I'm betting it's no more than 1GB, and probably
 less than that.  When we're talking about a 2TB drive, I'm willing to
 give up a gig to be guaranteed I won't have any issues when it comes
 time to swap it out.

My disks are Sun-branded Intel disks.  Same model number.  The first
replacement disk had newer firmware, so we jumped to the conclusion that was
the cause of the problem, and caused Oracle plenty of trouble locating an
older-firmware drive in some warehouse somewhere.  But the second
replacement disk is truly identical to the original.  Same firmware and
everything.  Only the serial number is different.  Still the same problem
behavior.

I have reason to believe that both the drive and the OS are correct.  I
suspect that the HBA simply handled the creation of this volume somehow
differently than how it handled the original.  Don't know the answer for
sure yet.

Either way, yes, I would love zpool to automatically waste a little space at
the end of the drive, to avoid this sort of situation, whether it's caused
by the drive manufacturer, the HBA, or any other factor.



Re: [zfs-discuss] To slice, or not to slice

2010-04-04 Thread Edward Ned Harvey
 CR 6844090, zfs should be able to mirror to a smaller disk
 http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6844090
 b117, June 2009

Awesome.  Now if someone would only port that to Solaris, I'd be a happy
man.   ;-)



Re: [zfs-discuss] To slice, or not to slice

2010-04-04 Thread Tim Cook
On Sun, Apr 4, 2010 at 9:46 PM, Edward Ned Harvey solar...@nedharvey.com wrote:

  CR 6844090, zfs should be able to mirror to a smaller disk
  http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6844090
  b117, June 2009

 Awesome.  Now if someone would only port that to Solaris, I'd be a happy
 man.   ;-)



Have you tried pointing that bug out to the support engineers who have your
case at Oracle?  If the fixed code is already out there, it's just a matter
of porting the code, right?  :)

--Tim


Re: [zfs-discuss] To slice, or not to slice

2010-04-04 Thread Edward Ned Harvey
  There is some question about performance.  Is there any additional
 overhead caused by using a slice instead of the whole physical device?
 
 No.
 
 If the disk is only used for ZFS, then it is ok to enable volatile disk
 write caching if the disk also supports write cache flush requests.
 
 If the disk is shared with UFS, then it is not ok to enable volatile
 disk write caching.

Thank you.  If you don't know the answer to this off the top of your head,
I'll go attempt the internet, but thought you might just know the answer in
2 seconds ...

Assuming the disk's write cache is disabled because of the slice (as
documented in the Best Practices Guide), how do you enable it?  I would only
be using ZFS on the drive.  The slice exists purely to avoid future mirror
problems and the like.



Re: [zfs-discuss] To slice, or not to slice

2010-04-04 Thread Edward Ned Harvey
I haven't taken that approach, but I guess I'll give it a try.

From: Tim Cook [mailto:t...@cook.ms]
Sent: Sunday, April 04, 2010 11:00 PM
To: Edward Ned Harvey
Cc: Richard Elling; zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] To slice, or not to slice

 [earlier quotes trimmed]

Have you tried pointing that bug out to the support engineers who have your
case at Oracle?  If the fixed code is already out there, it's just a matter
of porting the code, right?  :)

--Tim



Re: [zfs-discuss] To slice, or not to slice

2010-04-04 Thread Richard Elling
On Apr 4, 2010, at 8:11 PM, Edward Ned Harvey wrote:
 There is some question about performance.  Is there any additional
 overhead caused by using a slice instead of the whole physical device?
 
 No.
 
 If the disk is only used for ZFS, then it is ok to enable volatile disk
 write caching if the disk also supports write cache flush requests.
 
 If the disk is shared with UFS, then it is not ok to enable volatile
 disk write caching.
 
 Thank you.  If you don't know the answer to this off the top of your head,
 I'll go attempt the internet, but thought you might just know the answer in
 2 seconds ...
 
 Assuming the disk's write cache is disabled because of the slice (as
 documented in the Best Practices Guide) how do you enable it?  I would only
 be using ZFS on the drive.  The existence of a slice is purely to avoid
 future mirror problems and the like.

This is a trick question -- some drives ignore efforts to disable the write 
cache :-P

Use format -e for access to the expert mode where you can enable
the write cache. 
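
For anyone finding this in the archives later, a minimal sketch of that
expert-mode flow, assuming a SCSI disk that exposes the cache menu (the
device name is hypothetical, and the exact prompts vary by release):

# format -e
(select the disk, e.g. c1t1d0)
format> cache
cache> write_cache
write_cache> display
write_cache> enable
write_cache> quit
cache> quit
format> quit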

As for performance benefits, YMMV.
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 



[zfs-discuss] To slice, or not to slice

2010-04-03 Thread Edward Ned Harvey
Momentarily, I will begin scouring the omniscient interweb for information, but 
I'd like to know a little bit of what people would say here.  The question is 
to slice, or not to slice, disks before using them in a zpool.

One reason to slice comes from recent personal experience.  One disk of a 
mirror dies.  Replaced under contract with an identical disk.  Same model 
number, same firmware.  Yet when it's plugged into the system, for an unknown 
reason, it appears 0.001 GB smaller than the old disk, and is therefore unable 
to attach and un-degrade the mirror.  It seems logical this problem could have 
been avoided if the device added to the pool originally had been a slice 
somewhat smaller than the whole physical device.  Say, a slice of 28G out of 
the 29G physical disk.  Because later when I get the infinitesimally smaller 
disk, I can always slice 28G out of it to use as the mirror device.
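
To make that concrete, a sketch of the slice-based layout (pool and device
names are hypothetical): build the mirror from deliberately undersized s0
slices, and when the marginally smaller replacement shows up, give it an s0
of the same 28G and swap it in:

# zpool create tank mirror c1t0d0s0 c1t1d0s0
(later: label the replacement so its s0 is also 28G)
# zpool replace tank c1t1d0s0 c1t2d0s0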

There is some question about performance.  Is there any additional overhead 
caused by using a slice instead of the whole physical device?

There is another question about performance.  One of my colleagues said he saw 
some literature on the internet somewhere, saying ZFS behaves differently for 
slices than it does on physical devices, because it doesn't assume it has 
exclusive access to that physical device, and therefore caches or buffers 
differently ... or something like that.

Any other pros/cons people can think of?

And finally, if anyone has experience doing this, any process recommendations?  
That is ... my next task is to go read documentation again, to refresh my 
memory from years ago, about the difference between "format," "partition," 
"label," and "fdisk," because those terms don't have the same meaning that 
they do in other OSes...  And I don't know clearly right now, which one(s) I 
want to do, in order to create the large slice on my disks.



Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Edward Ned Harvey
  One reason to slice comes from recent personal experience. One disk
 of
  a mirror dies. Replaced under contract with an identical disk. Same
  model number, same firmware. Yet when it's plugged into the system,
  for an unknown reason, it appears 0.001 Gb smaller than the old disk,
  and therefore unable to attach and un-degrade the mirror. It seems
  logical this problem could have been avoided if the device added to
  the pool originally had been a slice somewhat smaller than the whole
  physical device. Say, a slice of 28G out of the 29G physical disk.
  Because later when I get the infinitesimally smaller disk, I can
  always slice 28G out of it to use as the mirror device.
 
 
 What build were you running?  That should have been addressed by
 CR 6844090, which went into build 117.

I'm running Solaris, but that's irrelevant.  The StorageTek array controller
itself reports the new disk as infinitesimally smaller than the one which I
want to mirror.  Even before the drive is given to the OS, that's the way it
is.  Sun X4275 server.

BTW, I'm still degraded.  Haven't found an answer yet, and am considering
breaking all my mirrors, creating a new pool on the freed disks using
partitions, and then rebuilding my pool using partitions on all disks.  The
aforementioned performance problem is not as scary to me as running with
degraded redundancy.
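
A sketch of the migration I'm contemplating (untested; it assumes a single
two-way mirror, and the device names are hypothetical):

# zpool detach tank c1t1d0
(put an SMI label on c1t1d0 with a slightly undersized s0 slice)
# zpool create tank2 c1t1d0s0
# zfs snapshot -r tank@move
# zfs send -R tank@move | zfs recv -Fd tank2
# zpool destroy tank
(relabel c1t0d0 the same way, then restore the mirror)
# zpool attach tank2 c1t1d0s0 c1t0d0s0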


 it's well documented. ZFS won't attempt to enable the drive's cache
 unless it has the physical device. See
 
 http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Storage_Pools

Nice.  Thank you.




Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Roy Sigurd Karlsbakk
- Edward Ned Harvey solar...@nedharvey.com wrote:
  What build were you running? The should have been addressed by
  CR6844090
  that went into build 117.
 
 I'm running Solaris, but that's irrelevant.  The StorageTek array
 controller itself reports the new disk as infinitesimally smaller than
 the one which I want to mirror.  Even before the drive is given to the
 OS, that's the way it is.  Sun X4275 server.
 
 BTW, I'm still degraded.  Haven't found an answer yet, and am considering
 breaking all my mirrors, creating a new pool on the freed disks using
 partitions, and then rebuilding my pool using partitions on all disks.
 The aforementioned performance problem is not as scary to me as running
 with degraded redundancy.

I would return the drive to get a bigger one before doing something as drastic 
as that. There might have been a hiccup in the production line, and that's not 
your fault.

roy


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Edward Ned Harvey
 And finally, if anyone has experience doing this, and process
 recommendations?  That is … My next task is to go read documentation
 again, to refresh my memory from years ago, about the difference
 between “format,” “partition,” “label,” “fdisk,” because those terms
 don’t have the same meaning that they do in other OSes…  And I don’t
 know clearly right now, which one(s) I want to do, in order to create
 the large slice of my disks.
 
 The whole partition vs. slice thing is a bit fuzzy to me, so take this
 with a grain of salt. You can create partitions using fdisk, or slices
 using format. The BIOS and other operating systems (windows, linux,
 etc) will be able to recognize partitions, while they won't be able to
 make sense of slices. If you need to boot from the drive or share it
 with another OS, then partitions are the way to go. If it's exclusive
 to solaris, then you can use slices. You can (but shouldn't) use slices
 and partitions from the same device (eg: c5t0d0s0 and c5t0d0p0).

Oh, I managed to find a really good answer to this question.  Several
sources all say to do precisely the same procedure, and when I did it on a
test system, it worked perfectly.  Simple and easy to repeat.  So I think
this is the gospel method to create the slices, if you're going to create
slices:

http://docs.sun.com/app/docs/doc/806-4073/6jd67r9hu
and
http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide#Replacing.2FRelabeling_the_Root_Pool_Disk
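
In case those links rot, the heart of the procedure is roughly this (a
sketch only; device names are hypothetical): use format to put an SMI label
on the disk with slice 0 sized a little under the full capacity, and when a
replacement arrives that is at least as large as the slice, copy the label
over from the surviving disk and replace:

# prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2
# zpool replace tank c1t1d0s0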




Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Roy Sigurd Karlsbakk
 Oh, I managed to find a really good answer to this question.  Several
 sources all say to do precisely the same procedure, and when I did it on a
 test system, it worked perfectly.  Simple and easy to repeat.  So I think
 this is the gospel method to create the slices, if you're going to create
 slices:

Seems like a clumsy workaround for a hardware problem. It will also leave the 
drive's write cache disabled, which is not a good idea. Why not just get a new 
drive?

Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. 
It is an elementary imperative for all pedagogues to avoid excessive use of 
idioms of foreign origin. In most cases adequate and relevant synonyms exist 
in Norwegian.



Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Edward Ned Harvey
 On Apr 2, 2010, at 2:29 PM, Edward Ned Harvey wrote:
  I've also heard that the risk for unexpected failure of your pool is
 higher if/when you reach 100% capacity.  I've heard that you should
 always create a small ZFS filesystem within a pool, and give it some
 reserved space, along with the filesystem that you actually plan to use
 in your pool.  Anyone care to offer any comments on that?
 
 Define "failure" in this context?
 
 I am not aware of a data loss failure when near full.  However, all file
 systems will experience performance degradation for write operations as
 they become full.

To tell the truth, I'm not exactly sure, because I've never lost any ZFS
pool or filesystem.  I only have it deployed on 3 servers, and only one of
those gets heavy use.  It only filled up once, and it didn't have any
problem.  So I'm only trying to understand the great beyond, that which I
have never known myself.  Learn from other people's experience,
preventively.  Yes, I do embrace a lot of voodoo and superstition in doing
sysadmin work, but that's just because stuff ain't perfect, and I've seen so
many things happen that were supposedly not possible.  (Not talking about
ZFS in that regard...  yet.)  Well, unless you count the issue I'm having
right now, with two identical disks appearing as different sizes...  But I
don't think that's a ZFS problem.

I recall some discussion either here or on opensolaris-discuss or
opensolaris-help, where at least a few people said they had some sort of
problem, and they were suspicious about a correlation between it happening
and the disk being full.  I also recall talking to some random guy at a
conference who said something similar.  But it's all vague.  I really don't
know.

And I have nothing concrete.  Hence the post asking for people's comments.
Somebody might relate something they experienced that's less vague than
what I know.



Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Edward Ned Harvey
 I would return the drive to get a bigger one before doing something as
 drastic as that. There might have been a hiccup in the production line,
 and that's not your fault.

Yeah, but I already have 2 of the replacement disks, both doing the same
thing.  One has newer firmware than my old disk (so originally I thought
that was the cause, and requested another replacement disk).  But then we
got a replacement disk which is identical in every way to the failed disk
... but it still appears smaller for some reason.

So this happened on my SSD.  What's to prevent it from happening on one of
the spindle disks in the future?  Nothing that I know of ...  

So far, the idea of slicing seems to be the only preventive or corrective
measure.  Hence, wondering what pros/cons people would describe, beyond what
I've already thought up myself.



Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Bob Friesenhahn

On Sat, 3 Apr 2010, Edward Ned Harvey wrote:


I would return the drive to get a bigger one before doing something as
drastic as that. There might have been a hiccup in the production line,
and that's not your fault.


Yeah, but I already have 2 of the replacement disks, both doing the same
thing.  One has a firmware newer than my old disk (so originally I thought
that was the cause, and requested another replacement disk).  But then we
got a replacement disk which is identical in every way to the failed disk
... but it still appears smaller for some reason.

So this happened on my SSD.  What's to prevent it from happening on one of
the spindle disks in the future?  Nothing that I know of ...


Just keep in mind that this has been fixed in OpenSolaris for some 
time, and will surely be fixed in Solaris 10, if not already.  The 
annoying issue is that you probably need to add all of the vdev 
devices using an OS which already has the fix.  I don't know if it can 
repair a slightly overly-large device.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Tim Cook
On Fri, Apr 2, 2010 at 4:05 PM, Edward Ned Harvey
guacam...@nedharvey.com wrote:

 [original post trimmed]


Your experience is exactly why I suggested ZFS start doing some "right
sizing" if you will.  Chop off a bit from the end of any disk so that we're
guaranteed to be able to replace drives from different manufacturers.  The
excuse being "no reason to, Sun drives are always of identical size."  If
your drives did indeed come from Sun, their response is clearly not true.
Regardless, I guess I still think it should be done.  Figure out the
greatest variation we've seen from drives that are supposedly of the exact
same size, and chop it off the end of every disk.  I'm betting it's no more
than 1GB, and probably less than that.  When we're talking about a 2TB
drive, I'm willing to give up a gig to be guaranteed I won't have any issues
when it comes time to swap it out.

--Tim


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Robert Milkowski

On 03/04/2010 19:24, Tim Cook wrote:



On Fri, Apr 2, 2010 at 4:05 PM, Edward Ned Harvey guacam...@nedharvey.com wrote:

[original post trimmed]


Your experience is exactly why I suggested ZFS start doing some "right 
sizing" if you will.  Chop off a bit from the end of any disk so that 
we're guaranteed to be able to replace drives from different 
manufacturers.  The excuse being "no reason to, Sun drives are always 
of identical size."  If your drives did indeed come from Sun, their 
response is clearly not true.  Regardless, I guess I still think it 
should be done.  Figure out the greatest variation we've seen 
from drives that are supposedly of the exact same size, and chop it 
off the end of every disk.  I'm betting it's no more than 1GB, and 
probably less than that.  When we're talking about a 2TB drive, I'm 
willing to give up a gig to be guaranteed I won't have any issues when 
it comes time to swap it out.




That's what OpenSolaris has been doing, more or less, for some time now.

Look in the archives of this mailing list for more information.
--
Robert Milkowski
http://milek.blogspot.com



Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Tim Cook
On Sat, Apr 3, 2010 at 6:53 PM, Robert Milkowski mi...@task.gda.pl wrote:

  On 03/04/2010 19:24, Tim Cook wrote:

 [original post and earlier discussion trimmed]

  that's what open solaris is doing more or less for some time now.

 look in the archives of this mailing list for more information.
 --
 Robert Milkowski
 http://milek.blogspot.com



Since when?  It isn't doing it on any of my drives, build 134, and judging
by the OP's issues, it isn't doing it for him either... I try to follow this
list fairly closely and I've never seen anyone at Sun/Oracle say they were
going to start doing it after I was shot down the first time.

--Tim


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Tim Cook
On Sat, Apr 3, 2010 at 7:50 PM, Tim Cook t...@cook.ms wrote:



 On Sat, Apr 3, 2010 at 6:53 PM, Robert Milkowski mi...@task.gda.pl wrote:

  [original post and earlier discussion trimmed]

  that's what open solaris is doing more or less for some time now.

 look in the archives of this mailing list for more information.
 --
 Robert Milkowski
 http://milek.blogspot.com



 Since when?  It isn't doing it on any of my drives, build 134, and judging
 by the OP's issues, it isn't doing it for him either... I try to follow this
 list fairly closely and I've never seen anyone at Sun/Oracle say they were
 going to start doing it after I was shot down the first time.

 --Tim



Oh... and after 15 minutes of searching for everything from 'right-sizing'
to 'block reservation' to 'replacement disk smaller size fewer blocks' etc.
etc. I don't see a single thread on it.

--Tim


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Richard Elling
On Apr 3, 2010, at 5:56 PM, Tim Cook wrote:
 
 On Sat, Apr 3, 2010 at 7:50 PM, Tim Cook t...@cook.ms wrote:

 [earlier discussion trimmed]
 
 
 
 Since when?  It isn't doing it on any of my drives, build 134, and judging by 
 the OP's issues, it isn't doing it for him either... I try to follow this 
 list fairly closely and I've never seen anyone at Sun/Oracle say they were 
 going to start doing it after I was shot down the first time.
 
 --Tim
 
 
 Oh... and after 15 minutes of searching for everything from 'right-sizing' to 
 'block reservation' to 'replacement disk smaller size fewer blocks' etc. etc. 
 I don't see a single thread on it.

CR 6844090, zfs should be able to mirror to a smaller disk
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6844090
b117, June 2009
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 







Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Richard Elling
On Apr 2, 2010, at 2:05 PM, Edward Ned Harvey wrote:

 Momentarily, I will begin scouring the omniscient interweb for information, 
 but I’d like to know a little bit of what people would say here.  The 
 question is to slice, or not to slice, disks before using them in a zpool.
  
 One reason to slice comes from recent personal experience.  One disk of a 
 mirror dies.  Replaced under contract with an identical disk.  Same model 
 number, same firmware.  Yet when it’s plugged into the system, for an unknown 
 reason, it appears 0.001 Gb smaller than the old disk, and therefore unable 
 to attach and un-degrade the mirror.  It seems logical this problem could 
 have been avoided if the device added to the pool originally had been a slice 
 somewhat smaller than the whole physical device.  Say, a slice of 28G out of 
 the 29G physical disk.  Because later when I get the infinitesimally smaller 
 disk, I can always slice 28G out of it to use as the mirror device.

If the HBA is configured for RAID mode, then it will reserve some space on disk
for its metadata.  This occurs no matter what type of disk you attach.

 There is some question about performance.  Is there any additional overhead 
 caused by using a slice instead of the whole physical device?

No.

 There is another question about performance.  One of my colleagues said he 
 saw some literature on the internet somewhere, saying ZFS behaves differently 
 for slices than it does on physical devices, because it doesn’t assume it has 
 exclusive access to that physical device, and therefore caches or buffers 
 differently … or something like that.
  
 Any other pros/cons people can think of?

If the disk is only used for ZFS, then it is ok to enable volatile disk write 
caching if the disk also supports write cache flush requests.

If the disk is shared with UFS, then it is not ok to enable volatile disk write 
caching.

 -- richard

 

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 







Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Tim Cook
On Sat, Apr 3, 2010 at 9:52 PM, Richard Elling richard.ell...@gmail.com wrote:

 On Apr 3, 2010, at 5:56 PM, Tim Cook wrote:
 
  [earlier discussion trimmed]
 
 
 
  Since when?  It isn't doing it on any of my drives, build 134, and
 judging by the OP's issues, it isn't doing it for him either... I try to
 follow this list fairly closely and I've never seen anyone at Sun/Oracle say
 they were going to start doing it after I was shot down the first time.
 
  --Tim
 
 
  Oh... and after 15 minutes of searching for everything from
 'right-sizing' to 'block reservation' to 'replacement disk smaller size
 fewer blocks' etc. etc. I don't see a single thread on it.

 CR 6844090, zfs should be able to mirror to a smaller disk
 http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6844090
 b117, June 2009
  -- richard



Unless the bug description is incomplete, that's talking about adding a
mirror to an existing drive.  Not about replacing a failed drive in an
existing vdev that could be raid-z#.  I'm almost positive I had an issue
post b117 with replacing a failed drive in a raid-z2 vdev.

I'll have to see if I can dig up a system to test the theory on.

--Tim


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Richard Elling
On Apr 3, 2010, at 8:00 PM, Tim Cook wrote:
 On Sat, Apr 3, 2010 at 9:52 PM, Richard Elling richard.ell...@gmail.com wrote:

 [earlier discussion trimmed]
 
 CR 6844090, zfs should be able to mirror to a smaller disk
 http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6844090
 b117, June 2009
  -- richard
 
 
 
 Unless the bug description is incomplete, that's talking about adding a 
 mirror to an existing drive.  Not about replacing a failed drive in an 
 existing vdev that could be raid-z#.  I'm almost positive I had an issue post 
 b117 with replacing a failed drive in a raid-z2 vdev.

It is the same code.

That said, I have experimented with various cases and I have not found
prediction of tolerable size difference to be easy.

 I'll have to see if I can dig up a system to test the theory on.

Works fine.

# ramdiskadm -a rd1 10k
/dev/ramdisk/rd1
# ramdiskadm -a rd2 10k
/dev/ramdisk/rd2
# ramdiskadm -a rd3 10k
/dev/ramdisk/rd3
# ramdiskadm -a rd4 99900k
/dev/ramdisk/rd4
# zpool create -o cachefile=none zwimming raidz /dev/ramdisk/rd1 /dev/ramdisk/rd2 /dev/ramdisk/rd3
# zpool status zwimming
  pool: zwimming
 state: ONLINE
 scrub: none requested
config:

NAME  STATE READ WRITE CKSUM
zwimming  ONLINE   0 0 0
  raidz1-0ONLINE   0 0 0
/dev/ramdisk/rd1  ONLINE   0 0 0
/dev/ramdisk/rd2  ONLINE   0 0 0
/dev/ramdisk/rd3  ONLINE   0 0 0

errors: No known data errors
# zpool replace zwimming /dev/ramdisk/rd3 /dev/ramdisk/rd4
# zpool status zwimming
  pool: zwimming
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Sat Apr  3 20:08:35 2010
config:

NAME  STATE READ WRITE CKSUM
zwimming  ONLINE   0 0 0
  raidz1-0ONLINE   0 0 0
/dev/ramdisk/rd1  ONLINE   0 0 0
/dev/ramdisk/rd2  ONLINE   0 0 0
/dev/ramdisk/rd4  ONLINE   0 0 0  45K resilvered

errors: No known data errors
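
Cleanup, for anyone replaying the experiment, is just destroying the pool
and releasing the ramdisks:

# zpool destroy zwimming
# ramdiskadm -d rd1
# ramdiskadm -d rd2
# ramdiskadm -d rd3
# ramdiskadm -d rd4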


 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 







Re: [zfs-discuss] To slice, or not to slice

2010-04-02 Thread Edward Ned Harvey
This might be unrelated, but along similar lines ...

I've also heard that the risk for unexpected failure of your pool is higher
if/when you reach 100% capacity.  I've heard that you should always create a
small ZFS filesystem within a pool, and give it some reserved space, along
with the filesystem that you actually plan to use in your pool.  Anyone care
to offer any comments on that?

From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Edward Ned Harvey
Sent: Friday, April 02, 2010 5:23 PM
To: zfs-discuss@opensolaris.org
Subject: [zfs-discuss] To slice, or not to slice

 

[original post quoted in full; trimmed]



Re: [zfs-discuss] To slice, or not to slice

2010-04-02 Thread Ian Collins

On 04/ 3/10 10:23 AM, Edward Ned Harvey wrote:


Momentarily, I will begin scouring the omniscient interweb for 
information, but I’d like to know a little bit of what people would 
say here. The question is to slice, or not to slice, disks before 
using them in a zpool.




Not.

One reason to slice comes from recent personal experience. One disk of 
a mirror dies. Replaced under contract with an identical disk. Same 
model number, same firmware. Yet when it’s plugged into the system, 
for an unknown reason, it appears 0.001 Gb smaller than the old disk, 
and therefore unable to attach and un-degrade the mirror. It seems 
logical this problem could have been avoided if the device added to 
the pool originally had been a slice somewhat smaller than the whole 
physical device. Say, a slice of 28G out of the 29G physical disk. 
Because later when I get the infinitesimally smaller disk, I can 
always slice 28G out of it to use as the mirror device.




What build were you running? That should have been addressed by CR 6844090 
which went into build 117.


There is some question about performance. Is there any additional 
overhead caused by using a slice instead of the whole physical device?


There is another question about performance. One of my colleagues said 
he saw some literature on the internet somewhere, saying ZFS behaves 
differently for slices than it does on physical devices, because it 
doesn’t assume it has exclusive access to that physical device, and 
therefore caches or buffers differently … or something like that.


It's well documented. ZFS won't attempt to enable the drive's cache 
unless it has the physical device. See


http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Storage_Pools

--
Ian.



Re: [zfs-discuss] To slice, or not to slice

2010-04-02 Thread Brandon High
On Fri, Apr 2, 2010 at 2:23 PM, Edward Ned Harvey
guacam...@nedharvey.com wrote:

  There is some question about performance.  Is there any additional
 overhead caused by using a slice instead of the whole physical device?


ZFS won't enable the write cache when it's not working with whole disks,
which may reduce performance. You can turn the cache back on, however. I
don't remember the exact incantation to do so, but format -e springs to
mind.

And finally, if anyone has experience doing this, and process
 recommendations?  That is … My next task is to go read documentation again,
 to refresh my memory from years ago, about the difference between “format,”
 “partition,” “label,” “fdisk,” because those terms don’t have the same
 meaning that they do in other OSes…  And I don’t know clearly right now,
 which one(s) I want to do, in order to create the large slice of my disks.


The whole partition vs. slice thing is a bit fuzzy to me, so take this with
a grain of salt. You can create partitions using fdisk, or slices using
format. The BIOS and other operating systems (Windows, Linux, etc.) will be
able to recognize partitions, while they won't be able to make sense of
slices. If you need to boot from the drive or share it with another OS, then
partitions are the way to go. If it's exclusive to Solaris, then you can use
slices. You can (but shouldn't) use slices and partitions from the same
device (e.g., c5t0d0s0 and c5t0d0p0).
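
A minimal sketch of the two paths (device names hypothetical; fdisk -B
writes a single Solaris partition spanning the whole disk, while format's
partition menu is interactive):

# fdisk -B /dev/rdsk/c5t0d0p0
# format c5t0d0
format> partition
(size slice 0 as desired, then label the disk)
# zpool create tank c5t0d0s0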

-B

-- 
Brandon High : bh...@freaks.com


Re: [zfs-discuss] To slice, or not to slice

2010-04-02 Thread Brandon High
On Fri, Apr 2, 2010 at 2:29 PM, Edward Ned Harvey solar...@nedharvey.com wrote:

  I’ve also heard that the risk for unexpected failure of your pool is
 higher if/when you reach 100% capacity.  I’ve heard that you should always
 create a small ZFS filesystem within a pool, and give it some reserved
 space, along with the filesystem that you actually plan to use in your
 pool.  Anyone care to offer any comments on that?

I think you can just create a dataset with a reservation to avoid the issue.
As I understand it, ZFS doesn't automatically set aside a few percent of
reserved space the way UFS does.
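
A sketch of that approach (pool name and size are hypothetical): an empty
dataset with a reservation keeps the rest of the pool from ever reaching
truly 100% full:

# zfs create tank/reserve
# zfs set reservation=10g tank/reserve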

-B

-- 
Brandon High : bh...@freaks.com


Re: [zfs-discuss] To slice, or not to slice

2010-04-02 Thread Richard Elling
On Apr 2, 2010, at 2:29 PM, Edward Ned Harvey wrote:
 I’ve also heard that the risk for unexpected failure of your pool is higher 
 if/when you reach 100% capacity.  I’ve heard that you should always create a 
 small ZFS filesystem within a pool, and give it some reserved space, along 
 with the filesystem that you actually plan to use in your pool.  Anyone care 
 to offer any comments on that?

Define "failure" in this context?

I am not aware of a data loss failure when near full.  However, all file systems
will experience performance degradation for write operations as they become
full.
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 




