Re: [zfs-discuss] To slice, or not to slice

2010-04-06 Thread Edward Ned Harvey
> I have reason to believe that both the drive and the OS are correct.
> I suspect that the HBA simply handled the creation of this volume
> differently than it handled the original.  Don't know the answer for
> sure yet.

Ok, that's confirmed now.  Apparently when the drives ship from the factory,
they're pre-initialized for the HBA, so the HBA happily imports them and
creates a "simple volume" (a.k.a. JBOD) using the factory initialization.
Unfortunately, the factory init includes HBA metadata at both the start and
end of the drive ... so I lose 1MB.

The fix is to re-initialize the disk with the HBA, and then create a new
simple volume.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-04 Thread Richard Elling
On Apr 4, 2010, at 8:11 PM, Edward Ned Harvey wrote:
>>> There is some question about performance.  Is there any additional
>>> overhead caused by using a slice instead of the whole physical device?
>> 
>> No.
>> 
>> If the disk is only used for ZFS, then it is ok to enable volatile disk
>> write caching if the disk also supports write cache flush requests.
>> 
>> If the disk is shared with UFS, then it is not ok to enable volatile
>> disk write caching.
> 
> Thank you.  If you don't know the answer to this off the top of your head,
> I'll go attempt the internet, but thought you might just know the answer in
> 2 seconds ...
> 
> Assuming the disk's write cache is disabled because of the slice (as
> documented in the Best Practices Guide) how do you enable it?  I would only
> be using ZFS on the drive.  The existence of a slice is purely to avoid
> future mirror problems and the like.

This is a trick question -- some drives ignore efforts to disable the write 
cache :-P

Use "format -e" for access to the expert mode where you can enable
the write cache. 
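
For anyone who hasn't driven the expert mode before, the session looks
roughly like this -- a sketch from memory, so the menu wording may differ a
bit between releases, and c1t1d0 is just a placeholder device:

# format -e c1t1d0
format> cache
cache> write_cache
write_cache> display
Write Cache is disabled
write_cache> enable
write_cache> display
Write Cache is enabled
write_cache> quit
cache> quit
format> quit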

As for performance benefits, YMMV.
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-04 Thread Edward Ned Harvey
I haven't taken that approach, but I guess I'll give it a try.

From: Tim Cook [mailto:t...@cook.ms] 
Sent: Sunday, April 04, 2010 11:00 PM
To: Edward Ned Harvey
Cc: Richard Elling; zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] To slice, or not to slice

 

 

On Sun, Apr 4, 2010 at 9:46 PM, Edward Ned Harvey wrote:

> CR 6844090, zfs should be able to mirror to a smaller disk
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6844090
> b117, June 2009

Awesome.  Now if someone would only port that to solaris, I'd be a happy
man.   ;-)



Have you tried pointing that bug out to the support engineers who have your
case at Oracle?  If the fixed code is already out there, it's just a matter
of porting the code, right?  :)

--Tim

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-04 Thread Edward Ned Harvey
> > There is some question about performance.  Is there any additional
> > overhead caused by using a slice instead of the whole physical device?
> 
> No.
> 
> If the disk is only used for ZFS, then it is ok to enable volatile disk
> write caching if the disk also supports write cache flush requests.
> 
> If the disk is shared with UFS, then it is not ok to enable volatile
> disk write caching.

Thank you.  If you don't know the answer to this off the top of your head,
I'll go attempt the internet, but thought you might just know the answer in
2 seconds ...

Assuming the disk's write cache is disabled because of the slice (as
documented in the Best Practices Guide) how do you enable it?  I would only
be using ZFS on the drive.  The existence of a slice is purely to avoid
future mirror problems and the like.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-04 Thread Tim Cook
On Sun, Apr 4, 2010 at 9:46 PM, Edward Ned Harvey wrote:

> > CR 6844090, zfs should be able to mirror to a smaller disk
> > http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6844090
> > b117, June 2009
>
> Awesome.  Now if someone would only port that to solaris, I'd be a happy
> man.   ;-)
>
>

Have you tried pointing that bug out to the support engineers who have your
case at Oracle?  If the fixed code is already out there, it's just a matter
of porting the code, right?  :)

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-04 Thread Edward Ned Harvey
> CR 6844090, zfs should be able to mirror to a smaller disk
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6844090
> b117, June 2009

Awesome.  Now if someone would only port that to solaris, I'd be a happy
man.   ;-)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-04 Thread Edward Ned Harvey
> Your experience is exactly why I suggested ZFS start doing some "right
> sizing" if you will.  Chop off a bit from the end of any disk so that
> we're guaranteed to be able to replace drives from different
> manufacturers.  The excuse being "no reason to, Sun drives are always
> of identical size".  If your drives did indeed come from Sun, their
> response is clearly not true.  Regardless, I guess I still think it
> should be done.  Figure out what the greatest variation we've seen from
> drives that are supposedly of the exact same size, and chop it off the
> end of every disk.  I'm betting it's no more than 1GB, and probably
> less than that.  When we're talking about a 2TB drive, I'm willing to
> give up a gig to be guaranteed I won't have any issues when it comes
> time to swap it out.

My disks are Sun-branded Intel disks.  Same model number.  The first
replacement disk had newer firmware, so we jumped to the conclusion that was
the cause of the problem, and caused Oracle plenty of trouble locating an
older-firmware drive in some warehouse somewhere.  But the second
replacement disk is truly identical to the original.  Same firmware and
everything.  Only the serial number is different.  Still the same problem
behavior.

I have reason to believe that both the drive and the OS are correct.  I
suspect that the HBA simply handled the creation of this volume differently
than it handled the original.  Don't know the answer for sure yet.

Either way, yes, I would love zpool to automatically waste a little space at
the end of the drive, to avoid this sort of situation, whether it's caused
by the drive manufacturer, the HBA, or any other factor.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Richard Elling
On Apr 3, 2010, at 8:00 PM, Tim Cook wrote:
> On Sat, Apr 3, 2010 at 9:52 PM, Richard Elling  
> wrote:
> On Apr 3, 2010, at 5:56 PM, Tim Cook wrote:
> >
> > On Sat, Apr 3, 2010 at 7:50 PM, Tim Cook  wrote:
> >> Your experience is exactly why I suggested ZFS start doing some "right 
> >> sizing" if you will.  Chop off a bit from the end of any disk so that 
> >> we're guaranteed to be able to replace drives from different 
> >> manufacturers.  The excuse being "no reason to, Sun drives are always of 
> >> identical size".  If your drives did indeed come from Sun, their response 
> >> is clearly not true.  Regardless, I guess I still think it should be done. 
> >>  Figure out what the greatest variation we've seen from drives that are 
> >> supposedly of the exact same size, and chop it off the end of every disk.  
> >> I'm betting it's no more than 1GB, and probably less than that.  When 
> >> we're talking about a 2TB drive, I'm willing to give up a gig to be 
> >> guaranteed I won't have any issues when it comes time to swap it out.
> >>
> >>
> > that's what open solaris is doing more or less for some time now.
> >
> > look in the archives of this mailing list for more information.
> > --
> > Robert Milkowski
> > http://milek.blogspot.com
> >
> >
> >
> > Since when?  It isn't doing it on any of my drives, build 134, and judging 
> > by the OP's issues, it isn't doing it for him either... I try to follow 
> > this list fairly closely and I've never seen anyone at Sun/Oracle say they 
> > were going to start doing it after I was shot down the first time.
> >
> > --Tim
> >
> >
> > Oh... and after 15 minutes of searching for everything from 'right-sizing' 
> > to 'block reservation' to 'replacement disk smaller size fewer blocks' etc. 
> > etc. I don't see a single thread on it.
> 
> CR 6844090, zfs should be able to mirror to a smaller disk
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6844090
> b117, June 2009
>  -- richard
> 
> 
> 
> Unless the bug description is incomplete, that's talking about adding a 
> mirror to an existing drive.  Not about replacing a failed drive in an 
> existing vdev that could be raid-z#.  I'm almost positive I had an issue post 
> b117 with replacing a failed drive in a raid-z2 vdev.

It is the same code.

That said, I have experimented with various cases and I have not found
prediction of tolerable size difference to be easy.

> I'll have to see if I can dig up a system to test the theory on.

Works fine.

# ramdiskadm -a rd1 100000k
/dev/ramdisk/rd1
# ramdiskadm -a rd2 100000k
/dev/ramdisk/rd2
# ramdiskadm -a rd3 100000k
/dev/ramdisk/rd3
# ramdiskadm -a rd4 99900k
/dev/ramdisk/rd4
# zpool create -o cachefile=none zwimming raidz /dev/ramdisk/rd1 /dev/ramdisk/rd2 /dev/ramdisk/rd3
# zpool status zwimming
  pool: zwimming
 state: ONLINE
 scrub: none requested
config:

        NAME                  STATE     READ WRITE CKSUM
        zwimming              ONLINE       0     0     0
          raidz1-0            ONLINE       0     0     0
            /dev/ramdisk/rd1  ONLINE       0     0     0
            /dev/ramdisk/rd2  ONLINE       0     0     0
            /dev/ramdisk/rd3  ONLINE       0     0     0

errors: No known data errors
# zpool replace zwimming /dev/ramdisk/rd3 /dev/ramdisk/rd4
# zpool status zwimming
  pool: zwimming
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Sat Apr  3 20:08:35 2010
config:

        NAME                  STATE     READ WRITE CKSUM
        zwimming              ONLINE       0     0     0
          raidz1-0            ONLINE       0     0     0
            /dev/ramdisk/rd1  ONLINE       0     0     0
            /dev/ramdisk/rd2  ONLINE       0     0     0
            /dev/ramdisk/rd4  ONLINE       0     0     0  45K resilvered

errors: No known data errors
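
To tear the experiment down afterwards (same sketch, same placeholder
names):

# zpool destroy zwimming
# ramdiskadm -d rd1
# ramdiskadm -d rd2
# ramdiskadm -d rd3
# ramdiskadm -d rd4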


 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Tim Cook
On Sat, Apr 3, 2010 at 9:52 PM, Richard Elling wrote:

> On Apr 3, 2010, at 5:56 PM, Tim Cook wrote:
> >
> > On Sat, Apr 3, 2010 at 7:50 PM, Tim Cook  wrote:
> >> Your experience is exactly why I suggested ZFS start doing some "right
> sizing" if you will.  Chop off a bit from the end of any disk so that we're
> guaranteed to be able to replace drives from different manufacturers.  The
> excuse being "no reason to, Sun drives are always of identical size".  If
> your drives did indeed come from Sun, their response is clearly not true.
>  Regardless, I guess I still think it should be done.  Figure out what the
> greatest variation we've seen from drives that are supposedly of the exact
> same size, and chop it off the end of every disk.  I'm betting it's no more
> than 1GB, and probably less than that.  When we're talking about a 2TB
> drive, I'm willing to give up a gig to be guaranteed I won't have any issues
> when it comes time to swap it out.
> >>
> >>
> > that's what open solaris is doing more or less for some time now.
> >
> > look in the archives of this mailing list for more information.
> > --
> > Robert Milkowski
> > http://milek.blogspot.com
> >
> >
> >
> > Since when?  It isn't doing it on any of my drives, build 134, and
> judging by the OP's issues, it isn't doing it for him either... I try to
> follow this list fairly closely and I've never seen anyone at Sun/Oracle say
> they were going to start doing it after I was shot down the first time.
> >
> > --Tim
> >
> >
> > Oh... and after 15 minutes of searching for everything from
> 'right-sizing' to 'block reservation' to 'replacement disk smaller size
> fewer blocks' etc. etc. I don't see a single thread on it.
>
> CR 6844090, zfs should be able to mirror to a smaller disk
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6844090
> b117,
> June 2009
>  -- richard
>
>

Unless the bug description is incomplete, that's talking about adding a
mirror to an existing drive.  Not about replacing a failed drive in an
existing vdev that could be raid-z#.  I'm almost positive I had an issue
post b117 with replacing a failed drive in a raid-z2 vdev.

I'll have to see if I can dig up a system to test the theory on.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Richard Elling
On Apr 2, 2010, at 2:05 PM, Edward Ned Harvey wrote:

> Momentarily, I will begin scouring the omniscient interweb for information, 
> but I’d like to know a little bit of what people would say here.  The 
> question is to slice, or not to slice, disks before using them in a zpool.
>  
> One reason to slice comes from recent personal experience.  One disk of a 
> mirror dies.  Replaced under contract with an identical disk.  Same model 
> number, same firmware.  Yet when it’s plugged into the system, for an unknown 
> reason, it appears 0.001 Gb smaller than the old disk, and therefore unable 
> to attach and un-degrade the mirror.  It seems logical this problem could 
> have been avoided if the device added to the pool originally had been a slice 
> somewhat smaller than the whole physical device.  Say, a slice of 28G out of 
> the 29G physical disk.  Because later when I get the infinitesimally smaller 
> disk, I can always slice 28G out of it to use as the mirror device.

If the HBA is configured for RAID mode, then it will reserve some space on disk
for its metadata.  This occurs no matter what type of disk you attach.

> There is some question about performance.  Is there any additional overhead 
> caused by using a slice instead of the whole physical device?

No.

> There is another question about performance.  One of my colleagues said he 
> saw some literature on the internet somewhere, saying ZFS behaves differently 
> for slices than it does on physical devices, because it doesn’t assume it has 
> exclusive access to that physical device, and therefore caches or buffers 
> differently … or something like that.
>  
> Any other pros/cons people can think of?

If the disk is only used for ZFS, then it is ok to enable volatile disk
write caching if the disk also supports write cache flush requests.

If the disk is shared with UFS, then it is not ok to enable volatile disk
write caching.

 -- richard

 
> And finally, if anyone has experience doing this, and process 
> recommendations?  That is … My next task is to go read documentation again, 
> to refresh my memory from years ago, about the difference between “format,” 
> “partition,” “label,” “fdisk,” because those terms don’t have the same 
> meaning that they do in other OSes…  And I don’t know clearly right now, 
> which one(s) I want to do, in order to create the large slice of my disks.
>  
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Richard Elling
On Apr 3, 2010, at 5:56 PM, Tim Cook wrote:
> 
> On Sat, Apr 3, 2010 at 7:50 PM, Tim Cook  wrote:
>> Your experience is exactly why I suggested ZFS start doing some "right 
>> sizing" if you will.  Chop off a bit from the end of any disk so that we're 
>> guaranteed to be able to replace drives from different manufacturers.  The 
>> excuse being "no reason to, Sun drives are always of identical size".  If 
>> your drives did indeed come from Sun, their response is clearly not true.  
>> Regardless, I guess I still think it should be done.  Figure out what the 
>> greatest variation we've seen from drives that are supposedly of the exact 
>> same size, and chop it off the end of every disk.  I'm betting it's no more 
>> than 1GB, and probably less than that.  When we're talking about a 2TB 
>> drive, I'm willing to give up a gig to be guaranteed I won't have any issues 
>> when it comes time to swap it out.
>> 
>> 
> that's what open solaris is doing more or less for some time now.
> 
> look in the archives of this mailing list for more information.
> -- 
> Robert Milkowski
> http://milek.blogspot.com
> 
> 
> 
> Since when?  It isn't doing it on any of my drives, build 134, and judging by 
> the OP's issues, it isn't doing it for him either... I try to follow this 
> list fairly closely and I've never seen anyone at Sun/Oracle say they were 
> going to start doing it after I was shot down the first time.
> 
> --Tim
> 
> 
> Oh... and after 15 minutes of searching for everything from 'right-sizing' to 
> 'block reservation' to 'replacement disk smaller size fewer blocks' etc. etc. 
> I don't see a single thread on it.

CR 6844090, zfs should be able to mirror to a smaller disk
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6844090
b117, June 2009
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Tim Cook
On Sat, Apr 3, 2010 at 7:50 PM, Tim Cook  wrote:

>
>
> On Sat, Apr 3, 2010 at 6:53 PM, Robert Milkowski wrote:
>
>>  On 03/04/2010 19:24, Tim Cook wrote:
>>
>>
>>
>> On Fri, Apr 2, 2010 at 4:05 PM, Edward Ned Harvey <
>> guacam...@nedharvey.com> wrote:
>>
>>>   Momentarily, I will begin scouring the omniscient interweb for
>>> information, but I’d like to know a little bit of what people would say
>>> here.  The question is to slice, or not to slice, disks before using them in
>>> a zpool.
>>>
>>>
>>>
>>> One reason to slice comes from recent personal experience.  One disk of a
>>> mirror dies.  Replaced under contract with an identical disk.  Same model
>>> number, same firmware.  Yet when it’s plugged into the system, for an
>>> unknown reason, it appears 0.001 Gb smaller than the old disk, and therefore
>>> unable to attach and un-degrade the mirror.  It seems logical this problem
>>> could have been avoided if the device added to the pool originally had been
>>> a slice somewhat smaller than the whole physical device.  Say, a slice of
>>> 28G out of the 29G physical disk.  Because later when I get the
>>> infinitesimally smaller disk, I can always slice 28G out of it to use as the
>>> mirror device.
>>>
>>>
>>>
>>> There is some question about performance.  Is there any additional
>>> overhead caused by using a slice instead of the whole physical device?
>>>
>>>
>>>
>>> There is another question about performance.  One of my colleagues said
>>> he saw some literature on the internet somewhere, saying ZFS behaves
>>> differently for slices than it does on physical devices, because it doesn’t
>>> assume it has exclusive access to that physical device, and therefore caches
>>> or buffers differently … or something like that.
>>>
>>>
>>>
>>> Any other pros/cons people can think of?
>>>
>>>
>>>
>>> And finally, if anyone has experience doing this, and process
>>> recommendations?  That is … My next task is to go read documentation again,
>>> to refresh my memory from years ago, about the difference between “format,”
>>> “partition,” “label,” “fdisk,” because those terms don’t have the same
>>> meaning that they do in other OSes…  And I don’t know clearly right now,
>>> which one(s) I want to do, in order to create the large slice of my disks.
>>>
>>
>>  Your experience is exactly why I suggested ZFS start doing some "right
>> sizing" if you will.  Chop off a bit from the end of any disk so that we're
>> guaranteed to be able to replace drives from different manufacturers.  The
>> excuse being "no reason to, Sun drives are always of identical size".  If
>> your drives did indeed come from Sun, their response is clearly not true.
>>  Regardless, I guess I still think it should be done.  Figure out what the
>> greatest variation we've seen from drives that are supposedly of the exact
>> same size, and chop it off the end of every disk.  I'm betting it's no more
>> than 1GB, and probably less than that.  When we're talking about a 2TB
>> drive, I'm willing to give up a gig to be guaranteed I won't have any issues
>> when it comes time to swap it out.
>>
>>
>>  that's what open solaris is doing more or less for some time now.
>>
>> look in the archives of this mailing list for more information.
>> --
>> Robert Milkowski
>> http://milek.blogspot.com
>>
>>
>
> Since when?  It isn't doing it on any of my drives, build 134, and judging
> by the OP's issues, it isn't doing it for him either... I try to follow this
> list fairly closely and I've never seen anyone at Sun/Oracle say they were
> going to start doing it after I was shot down the first time.
>
> --Tim
>


Oh... and after 15 minutes of searching for everything from 'right-sizing'
to 'block reservation' to 'replacement disk smaller size fewer blocks' etc.
etc. I don't see a single thread on it.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Tim Cook
On Sat, Apr 3, 2010 at 6:53 PM, Robert Milkowski  wrote:

>  On 03/04/2010 19:24, Tim Cook wrote:
>
>
>
> On Fri, Apr 2, 2010 at 4:05 PM, Edward Ned Harvey wrote:
>
>>   Momentarily, I will begin scouring the omniscient interweb for
>> information, but I’d like to know a little bit of what people would say
>> here.  The question is to slice, or not to slice, disks before using them in
>> a zpool.
>>
>>
>>
>> One reason to slice comes from recent personal experience.  One disk of a
>> mirror dies.  Replaced under contract with an identical disk.  Same model
>> number, same firmware.  Yet when it’s plugged into the system, for an
>> unknown reason, it appears 0.001 Gb smaller than the old disk, and therefore
>> unable to attach and un-degrade the mirror.  It seems logical this problem
>> could have been avoided if the device added to the pool originally had been
>> a slice somewhat smaller than the whole physical device.  Say, a slice of
>> 28G out of the 29G physical disk.  Because later when I get the
>> infinitesimally smaller disk, I can always slice 28G out of it to use as the
>> mirror device.
>>
>>
>>
>> There is some question about performance.  Is there any additional
>> overhead caused by using a slice instead of the whole physical device?
>>
>>
>>
>> There is another question about performance.  One of my colleagues said he
>> saw some literature on the internet somewhere, saying ZFS behaves
>> differently for slices than it does on physical devices, because it doesn’t
>> assume it has exclusive access to that physical device, and therefore caches
>> or buffers differently … or something like that.
>>
>>
>>
>> Any other pros/cons people can think of?
>>
>>
>>
>> And finally, if anyone has experience doing this, and process
>> recommendations?  That is … My next task is to go read documentation again,
>> to refresh my memory from years ago, about the difference between “format,”
>> “partition,” “label,” “fdisk,” because those terms don’t have the same
>> meaning that they do in other OSes…  And I don’t know clearly right now,
>> which one(s) I want to do, in order to create the large slice of my disks.
>>
>
>  Your experience is exactly why I suggested ZFS start doing some "right
> sizing" if you will.  Chop off a bit from the end of any disk so that we're
> guaranteed to be able to replace drives from different manufacturers.  The
> excuse being "no reason to, Sun drives are always of identical size".  If
> your drives did indeed come from Sun, their response is clearly not true.
>  Regardless, I guess I still think it should be done.  Figure out what the
> greatest variation we've seen from drives that are supposedly of the exact
> same size, and chop it off the end of every disk.  I'm betting it's no more
> than 1GB, and probably less than that.  When we're talking about a 2TB
> drive, I'm willing to give up a gig to be guaranteed I won't have any issues
> when it comes time to swap it out.
>
>
>  that's what open solaris is doing more or less for some time now.
>
> look in the archives of this mailing list for more information.
> --
> Robert Milkowski
> http://milek.blogspot.com
>
>

Since when?  It isn't doing it on any of my drives, build 134, and judging
by the OP's issues, it isn't doing it for him either... I try to follow this
list fairly closely and I've never seen anyone at Sun/Oracle say they were
going to start doing it after I was shot down the first time.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Robert Milkowski

On 03/04/2010 19:24, Tim Cook wrote:
>
> On Fri, Apr 2, 2010 at 4:05 PM, Edward Ned Harvey
> <guacam...@nedharvey.com> wrote:
>
>> Momentarily, I will begin scouring the omniscient interweb for
>> information, but I'd like to know a little bit of what people
>> would say here.  The question is to slice, or not to slice, disks
>> before using them in a zpool.
>>
>> One reason to slice comes from recent personal experience.  One
>> disk of a mirror dies.  Replaced under contract with an identical
>> disk.  Same model number, same firmware.  Yet when it's plugged
>> into the system, for an unknown reason, it appears 0.001 Gb
>> smaller than the old disk, and therefore unable to attach and
>> un-degrade the mirror.  It seems logical this problem could have
>> been avoided if the device added to the pool originally had been a
>> slice somewhat smaller than the whole physical device.  Say, a
>> slice of 28G out of the 29G physical disk.  Because later when I
>> get the infinitesimally smaller disk, I can always slice 28G out
>> of it to use as the mirror device.
>>
>> There is some question about performance.  Is there any additional
>> overhead caused by using a slice instead of the whole physical device?
>>
>> There is another question about performance.  One of my colleagues
>> said he saw some literature on the internet somewhere, saying ZFS
>> behaves differently for slices than it does on physical devices,
>> because it doesn't assume it has exclusive access to that physical
>> device, and therefore caches or buffers differently ... or something
>> like that.
>>
>> Any other pros/cons people can think of?
>>
>> And finally, if anyone has experience doing this, and process
>> recommendations?  That is ... My next task is to go read
>> documentation again, to refresh my memory from years ago, about
>> the difference between "format," "partition," "label," "fdisk,"
>> because those terms don't have the same meaning that they do in
>> other OSes ...  And I don't know clearly right now, which one(s) I
>> want to do, in order to create the large slice of my disks.
>
> Your experience is exactly why I suggested ZFS start doing some "right
> sizing" if you will.  Chop off a bit from the end of any disk so that
> we're guaranteed to be able to replace drives from different
> manufacturers.  The excuse being "no reason to, Sun drives are always
> of identical size".  If your drives did indeed come from Sun, their
> response is clearly not true.  Regardless, I guess I still think it
> should be done.  Figure out what the greatest variation we've seen
> from drives that are supposedly of the exact same size, and chop it
> off the end of every disk.  I'm betting it's no more than 1GB, and
> probably less than that.  When we're talking about a 2TB drive, I'm
> willing to give up a gig to be guaranteed I won't have any issues when
> it comes time to swap it out.

that's what OpenSolaris has been doing, more or less, for some time now.

look in the archives of this mailing list for more information.
--
Robert Milkowski
http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Tim Cook
On Fri, Apr 2, 2010 at 4:05 PM, Edward Ned Harvey
wrote:

>  Momentarily, I will begin scouring the omniscient interweb for
> information, but I’d like to know a little bit of what people would say
> here.  The question is to slice, or not to slice, disks before using them in
> a zpool.
>
>
>
> One reason to slice comes from recent personal experience.  One disk of a
> mirror dies.  Replaced under contract with an identical disk.  Same model
> number, same firmware.  Yet when it’s plugged into the system, for an
> unknown reason, it appears 0.001 Gb smaller than the old disk, and therefore
> unable to attach and un-degrade the mirror.  It seems logical this problem
> could have been avoided if the device added to the pool originally had been
> a slice somewhat smaller than the whole physical device.  Say, a slice of
> 28G out of the 29G physical disk.  Because later when I get the
> infinitesimally smaller disk, I can always slice 28G out of it to use as the
> mirror device.
>
>
>
> There is some question about performance.  Is there any additional overhead
> caused by using a slice instead of the whole physical device?
>
>
>
> There is another question about performance.  One of my colleagues said he
> saw some literature on the internet somewhere, saying ZFS behaves
> differently for slices than it does on physical devices, because it doesn’t
> assume it has exclusive access to that physical device, and therefore caches
> or buffers differently … or something like that.
>
>
>
> Any other pros/cons people can think of?
>
>
>
> And finally, if anyone has experience doing this, and process
> recommendations?  That is … My next task is to go read documentation again,
> to refresh my memory from years ago, about the difference between “format,”
> “partition,” “label,” “fdisk,” because those terms don’t have the same
> meaning that they do in other OSes…  And I don’t know clearly right now,
> which one(s) I want to do, in order to create the large slice of my disks.
>
>
Your experience is exactly why I suggested ZFS start doing some "right
sizing" if you will.  Chop off a bit from the end of any disk so that we're
guaranteed to be able to replace drives from different manufacturers.  The
excuse being "no reason to, Sun drives are always of identical size".  If
your drives did indeed come from Sun, their response is clearly not true.
 Regardless, I guess I still think it should be done.  Figure out what the
greatest variation we've seen from drives that are supposedly of the exact
same size, and chop it off the end of every disk.  I'm betting it's no more
than 1GB, and probably less than that.  When we're talking about a 2TB
drive, I'm willing to give up a gig to be guaranteed I won't have any issues
when it comes time to swap it out.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Bob Friesenhahn

On Sat, 3 Apr 2010, Edward Ned Harvey wrote:


>> I would return the drive to get a bigger one before doing something as
>> drastic as that. There might have been a hiccup in the production line,
>> and that's not your fault.
>
> Yeah, but I already have 2 of the replacement disks, both doing the same
> thing.  One has a firmware newer than my old disk (so originally I thought
> that was the cause, and requested another replacement disk).  But then we
> got a replacement disk which is identical in every way to the failed disk
> ... but it still appears smaller for some reason.
>
> So this happened on my SSD.  What's to prevent it from happening on one of
> the spindle disks in the future?  Nothing that I know of ...


Just keep in mind that this has been fixed in OpenSolaris for some 
time, and will surely be fixed in Solaris 10, if not already.  The 
annoying issue is that you probably need to add all of the vdev 
devices using an OS which already has the fix.  I don't know if it can 
"repair" a slightly overly-large device.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Edward Ned Harvey
> I would return the drive to get a bigger one before doing something as
> drastic as that. There might have been a hiccup in the production line,
> and that's not your fault.

Yeah, but I already have 2 of the replacement disks, both doing the same
thing.  One has a firmware newer than my old disk (so originally I thought
that was the cause, and requested another replacement disk).  But then we
got a replacement disk which is identical in every way to the failed disk
... but it still appears smaller for some reason.

So this happened on my SSD.  What's to prevent it from happening on one of
the spindle disks in the future?  Nothing that I know of ...  

So far, the idea of slicing seems to be the only preventive or corrective
measure.  Hence, wondering what pros/cons people would describe, beyond what
I've already thought up myself.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Edward Ned Harvey
> On Apr 2, 2010, at 2:29 PM, Edward Ned Harvey wrote:
> > I've also heard that the risk for unexpected failure of your pool is
> higher if/when you reach 100% capacity.  I've heard that you should
> always create a small ZFS filesystem within a pool, and give it some
> reserved space, along with the filesystem that you actually plan to use
> in your pool.  Anyone care to offer any comments on that?
> 
> Define "failure" in this context?
> 
> I am not aware of a data loss failure when near full.  However, all
> file systems
> will experience performance degradation for write operations as they
> become
> full.

To tell the truth, I'm not exactly sure.  Because I've never lost any ZFS
pool or filesystem.  I only have it deployed on 3 servers, and only one of
those gets heavy use.  It only filled up once, and it didn't have any
problem.  So I'm only trying to understand "the great beyond," that which I
have never known myself.  Learn from other people's experience,
preventively.  Yes, I do embrace a lot of voodoo and superstition in doing
sysadmin, but that's just cuz stuff ain't perfect, and I've seen so many
things happen that were supposedly not possible.  (Not talking about ZFS in
that regard...  yet.)  Well, unless you count the issue I'm having right
now, with two identical disks appearing as different sizes...  But I don't
think that's a zfs problem.

I recall some discussion either here or on opensolaris-discuss or
opensolaris-help, where at least one or a few people said they had some sort
of problem or problems, and they were suspicious about the correlation
between it happening, and the disk being full.  I also recall talking to
some random guy at a conference who said something similar.  But it's all
vague.  I really don't know.

And I have nothing concrete.  Hence the post asking for people's comments.
Somebody might relate something they experienced less vague than what I
know.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Roy Sigurd Karlsbakk
> Oh, I managed to find a really good answer to this question.  Several
> sources all say to do precisely the same procedure, and when I did it
> on a
> test system, it worked perfectly.  Simple and easy to repeat.  So I
> think
> this is the gospel method to create the slices, if you're going to
> create

Seems like a clumsy workaround for a hardware problem. It will also keep ZFS
from enabling the drives' write cache, which is not a good idea. Why not
just get a new drive?

Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented
intelligibly. It is an elementary imperative for all pedagogues to avoid
excessive use of idioms of foreign origin. In most cases adequate and
relevant synonyms exist in Norwegian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Edward Ned Harvey
>> And finally, if anyone has experience doing this, and process
>> recommendations?  That is … My next task is to go read documentation
>> again, to refresh my memory from years ago, about the difference
>> between “format,” “partition,” “label,” “fdisk,” because those terms
>> don’t have the same meaning that they do in other OSes…  And I don’t
>> know clearly right now, which one(s) I want to do, in order to create
>> the large slice of my disks.
> 
> The whole partition vs. slice thing is a bit fuzzy to me, so take this
> with a grain of salt. You can create partitions using fdisk, or slices
> using format. The BIOS and other operating systems (windows, linux,
> etc) will be able to recognize partitions, while they won't be able to
> make sense of slices. If you need to boot from the drive or share it
> with another OS, then partitions are the way to go. If it's exclusive
> to solaris, then you can use slices. You can (but shouldn't) use slices
> and partitions from the same device (eg: c5t0d0s0 and c5t0d0p0).

Oh, I managed to find a really good answer to this question.  Several
sources all say to do precisely the same procedure, and when I did it on a
test system, it worked perfectly.  Simple and easy to repeat.  So I think
this is the gospel method to create the slices, if you're going to create
slices:

http://docs.sun.com/app/docs/doc/806-4073/6jd67r9hu
and
http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide#Replacing.2FRelabeling_the_Root_Pool_Disk
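
For the archives, the heart of that procedure is roughly the following -- a
sketch rather than a verbatim session, since the disk name, partition tag,
and sizes here are placeholders and the prompts vary a little between
releases:

# format c1t1d0
format> partition
partition> 0
Enter partition id tag[unassigned]: root
Enter partition permission flags[wm]: wm
Enter new starting cyl[0]: 1
Enter partition size[0b, 0c, 0e, 0.00mb, 0.00gb]: 27.9gb
partition> label
Ready to label disk, continue? y
partition> quit
format> quit
# zpool create tank c1t1d0s0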


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Roy Sigurd Karlsbakk
- "Edward Ned Harvey"  skrev:
> > What build were you running? The should have been addressed by
> > CR6844090
> > that went into build 117.
> 
> I'm running solaris, but that's irrelevant.  The storagetek array
> controller
> itself reports the new disk as infinitesimally smaller than the one
> which I
> want to mirror.  Even before the drive is given to the OS, that's the
> way it
> is.  Sun X4275 server.
> 
> BTW, I'm still degraded.  Haven't found an answer yet, and am
> considering
> breaking all my mirrors, to create a new pool on the freed disks, and
> using
> partitions in those disks, for the sake of rebuilding my pool using
> partitions on all disks.  The aforementioned performance problem is
> not as
> scary to me as running in degraded redundancy.

I would return the drive to get a bigger one before doing something as
drastic as that. There might have been a hiccup in the production line, and
that's not your fault.

roy
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-03 Thread Edward Ned Harvey
> > One reason to slice comes from recent personal experience. One disk
> of
> > a mirror dies. Replaced under contract with an identical disk. Same
> > model number, same firmware. Yet when it's plugged into the system,
> > for an unknown reason, it appears 0.001 Gb smaller than the old disk,
> > and therefore unable to attach and un-degrade the mirror. It seems
> > logical this problem could have been avoided if the device added to
> > the pool originally had been a slice somewhat smaller than the whole
> > physical device. Say, a slice of 28G out of the 29G physical disk.
> > Because later when I get the infinitesimally smaller disk, I can
> > always slice 28G out of it to use as the mirror device.
> >
> 
> What build were you running? That should have been addressed by
> CR 6844090, which went into build 117.

I'm running Solaris, but that's irrelevant.  The StorageTek array controller
itself reports the new disk as infinitesimally smaller than the one which I
want to mirror.  Even before the drive is given to the OS, that's the way it
is.  Sun X4275 server.

BTW, I'm still degraded.  Haven't found an answer yet, and am considering
breaking all my mirrors, creating a new pool on the freed disks, and using
partitions on those disks, for the sake of rebuilding my pool with
partitions on all disks.  The aforementioned performance problem is not as
scary to me as running with degraded redundancy.
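
If I do go that route, the mechanics would presumably be something like the
following -- a totally untested sketch with placeholder device and pool
names:

# zpool detach tank c1t3d0
# format c1t3d0            (create an undersized slice 0, as discussed)
# zpool create tank2 c1t3d0s0
# zfs snapshot -r tank@move
# zfs send -R tank@move | zfs recv -Fd tank2
# zpool destroy tank
# format c1t2d0            (slice the freed disk the same way)
# zpool attach tank2 c1t3d0s0 c1t2d0s0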


> it's well documented. ZFS won't attempt to enable the drive's cache
> unless it has the physical device. See
> 
> http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide
> #Storage_Pools

Nice.  Thank you.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-02 Thread Richard Elling
On Apr 2, 2010, at 2:29 PM, Edward Ned Harvey wrote:
> I’ve also heard that the risk for unexpected failure of your pool is higher 
> if/when you reach 100% capacity.  I’ve heard that you should always create a 
> small ZFS filesystem within a pool, and give it some reserved space, along 
> with the filesystem that you actually plan to use in your pool.  Anyone care 
> to offer any comments on that?

Define "failure" in this context?

I am not aware of a data loss failure when near full.  However, all file systems
will experience performance degradation for write operations as they become
full.
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-02 Thread Brandon High
On Fri, Apr 2, 2010 at 2:29 PM, Edward Ned Harvey wrote:

>  I’ve also heard that the risk for unexpected failure of your pool is
> higher if/when you reach 100% capacity.  I’ve heard that you should always
> create a small ZFS filesystem within a pool, and give it some reserved
> space, along with the filesystem that you actually plan to use in your
> pool.  Anyone care to offer any comments on that?
>
I think you can just create a dataset with a reservation to avoid the issue.
As I understand it, zfs doesn't automatically set aside a few percent of
reserved space like ufs does.
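
Something along these lines, for example (pool and dataset names are
hypothetical):

# zfs create tank/reserve
# zfs set reservation=5G tank/reserve
# zfs list -o name,reservation tank/reserve
NAME          RESERV
tank/reserve      5G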

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-02 Thread Brandon High
On Fri, Apr 2, 2010 at 2:23 PM, Edward Ned Harvey
wrote:

>  There is some question about performance.  Is there any additional
> overhead caused by using a slice instead of the whole physical device?
>

zfs won't enable the drive's write cache when it's not working with whole
disks, which may reduce performance. You can turn the cache on manually,
however. I don't remember the exact incantation to do so, but "format -e"
springs to mind.

And finally, if anyone has experience doing this, and process
> recommendations?  That is … My next task is to go read documentation again,
> to refresh my memory from years ago, about the difference between “format,”
> “partition,” “label,” “fdisk,” because those terms don’t have the same
> meaning that they do in other OSes…  And I don’t know clearly right now,
> which one(s) I want to do, in order to create the large slice of my disks.
>

The whole partition vs. slice thing is a bit fuzzy to me, so take this with
a grain of salt. You can create partitions using fdisk, or slices using
format. The BIOS and other operating systems (windows, linux, etc) will be
able to recognize partitions, while they won't be able to make sense of
slices. If you need to boot from the drive or share it with another OS, then
partitions are the way to go. If it's exclusive to solaris, then you can use
slices. You can (but shouldn't) use slices and partitions from the same
device (eg: c5t0d0s0 and c5t0d0p0).
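
For instance, both namespaces show up under /dev/dsk for the same drive (an
illustrative, trimmed listing using the same hypothetical device):

# ls /dev/dsk/c5t0d0*
/dev/dsk/c5t0d0p0  /dev/dsk/c5t0d0p1  /dev/dsk/c5t0d0p2  /dev/dsk/c5t0d0p3
/dev/dsk/c5t0d0p4  /dev/dsk/c5t0d0s0  /dev/dsk/c5t0d0s1  /dev/dsk/c5t0d0s2
...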

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-02 Thread Ian Collins

On 04/ 3/10 10:23 AM, Edward Ned Harvey wrote:
>
> Momentarily, I will begin scouring the omniscient interweb for
> information, but I'd like to know a little bit of what people would
> say here. The question is to slice, or not to slice, disks before
> using them in a zpool.
>

Not.

> One reason to slice comes from recent personal experience. One disk of
> a mirror dies. Replaced under contract with an identical disk. Same
> model number, same firmware. Yet when it's plugged into the system,
> for an unknown reason, it appears 0.001 Gb smaller than the old disk,
> and therefore unable to attach and un-degrade the mirror. It seems
> logical this problem could have been avoided if the device added to
> the pool originally had been a slice somewhat smaller than the whole
> physical device. Say, a slice of 28G out of the 29G physical disk.
> Because later when I get the infinitesimally smaller disk, I can
> always slice 28G out of it to use as the mirror device.
>

What build were you running? That should have been addressed by CR 6844090,
which went into build 117.

> There is some question about performance. Is there any additional
> overhead caused by using a slice instead of the whole physical device?
>
> There is another question about performance. One of my colleagues said
> he saw some literature on the internet somewhere, saying ZFS behaves
> differently for slices than it does on physical devices, because it
> doesn't assume it has exclusive access to that physical device, and
> therefore caches or buffers differently ... or something like that.

It's well documented. ZFS won't attempt to enable the drive's cache
unless it has the physical device. See

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Storage_Pools

--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] To slice, or not to slice

2010-04-02 Thread Edward Ned Harvey
This might be unrelated, but along similar lines ...

I've also heard that the risk for unexpected failure of your pool is higher
if/when you reach 100% capacity.  I've heard that you should always create a
small ZFS filesystem within a pool, and give it some reserved space, along
with the filesystem that you actually plan to use in your pool.  Anyone care
to offer any comments on that?

From: zfs-discuss-boun...@opensolaris.org
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Edward Ned Harvey
Sent: Friday, April 02, 2010 5:23 PM
To: zfs-discuss@opensolaris.org
Subject: [zfs-discuss] To slice, or not to slice

Momentarily, I will begin scouring the omniscient interweb for information,
but I'd like to know a little bit of what people would say here.  The
question is to slice, or not to slice, disks before using them in a zpool.

One reason to slice comes from recent personal experience.  One disk of a
mirror dies.  Replaced under contract with an identical disk.  Same model
number, same firmware.  Yet when it's plugged into the system, for an
unknown reason, it appears 0.001 Gb smaller than the old disk, and therefore
unable to attach and un-degrade the mirror.  It seems logical this problem
could have been avoided if the device added to the pool originally had been
a slice somewhat smaller than the whole physical device.  Say, a slice of
28G out of the 29G physical disk.  Because later when I get the
infinitesimally smaller disk, I can always slice 28G out of it to use as the
mirror device.

There is some question about performance.  Is there any additional overhead
caused by using a slice instead of the whole physical device?

There is another question about performance.  One of my colleagues said he
saw some literature on the internet somewhere, saying ZFS behaves
differently for slices than it does on physical devices, because it doesn't
assume it has exclusive access to that physical device, and therefore caches
or buffers differently ... or something like that.

Any other pros/cons people can think of?

And finally, if anyone has experience doing this, and process
recommendations?  That is ... My next task is to go read documentation again,
to refresh my memory from years ago, about the difference between "format,"
"partition," "label," "fdisk," because those terms don't have the same
meaning that they do in other OSes ...  And I don't know clearly right now,
which one(s) I want to do, in order to create the large slice of my disks.

 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss