[zfs-discuss] Lackluster ZFS performance trials using various ZIL and L2ARC configurations...

2009-01-14 Thread Gray Carper
Hey, all!

Using iozone (with the sequential read, sequential write, random read, and
random write categories), on a Sun X4240 system running OpenSolaris b104
(NexentaStor 1.1.2, actually), we recently ran a number of relative
performance tests using a few ZIL and L2ARC configurations (meant to try and
uncover which configuration would be the best choice). I'd like to share the
highlights with you all (without bogging you down with raw data) to see if
anything strikes you.

Our first (baseline) test used a ZFS pool which had a self-contained ZIL and
L2ARC (i.e. not moved to other devices, the default configuration). Note
that this system had both SSDs and SAS drives attached to the controller, but
only the SAS drives were in use.

In the second test, we rebuilt the ZFS pool with the ZIL on a 32GB SSD and
the L2ARC on four 146GB SAS drives. Random reads were significantly worse
than the baseline, but all other categories were slightly better.

In the third test, we rebuilt the ZFS pool with the ZIL on a 32GB SSD and
the L2ARC on four 80GB SSDs. Sequential reads were better than the baseline,
but all other categories were worse.

In the fourth test, we rebuilt the ZFS pool with no separate ZIL, but with
the L2ARC on four 146GB SAS drives. Random reads were significantly worse
than the baseline and all other categories were about the same as the
baseline.
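
(For readers following along: separate-ZIL and L2ARC layouts like the ones
above are normally built with zpool's "log" and "cache" vdev types. A minimal
sketch, with hypothetical pool and device names:

  # zpool create tank raidz c1t2d0 c1t3d0 c1t4d0 c1t5d0
  # zpool add tank log c2t0d0
  # zpool add tank cache c1t6d0 c1t7d0 c1t8d0 c1t9d0

Substitute whatever device names format(1M) reports on your own box.)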

As you can imagine, we were disappointed. None of those configurations
resulted in any significant improvements, and all of the configurations
resulted in at least one category being worse. This was very much not what
we expected.

For the sake of sanity checking, we decided to run the baseline case again
(ZFS pool which had a self-contained ZIL and L2ARC), but this time remove
the SSDs completely from the box. Amazingly, the simple presence of the SSDs
seemed to be a negative influence - the new SSD-free test showed improvement
in every single category when compared to the original baseline test.

So, this has led us to the conclusion that we shouldn't be mixing SSDs with
SAS drives on the same controller (at least, not the controller we have in
this box). Has anyone else seen problems like this before that might
validate that conclusion? If so, we think we should probably build an SSD
JBOD, hook it up to the box, and re-run the tests. This leads us to another
question: Does anyone have any recommendations for SSD-performant
controllers that have great OpenSolaris driver support?

Thanks!
-Gray
-- 
Gray Carper
MSIS Technical Services
University of Michigan Medical School
gcar...@umich.edu  |  skype:  graycarper  |  734.418.8506
http://www.umms.med.umich.edu/msis/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SDXC and the future of ZFS

2009-01-14 Thread Chris Ridd

On 14 Jan 2009, at 10:01, Andrew Gabriel wrote:

> DOS/FAT filesystem implementations in appliances can be found in less
> than 8K code and data size (mostly that's code). Limited functionality
> implementations can be smaller than 1kB size.

Just for the sake of comparison, how big is the limited ZFS  
implementation in grub?

Cheers,

Chris
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] (no subject)

2009-01-14 Thread JZ
Ok, it's also important, in many many cases, but not all -
taking the problem into tomorrow is also not very good.

IMHO, maybe all you smart open folks that know all about this and that, but 
dunno how to fix your darn email address to appear "zfs user" on the darn 
list discussion?
do I have to spell this out to you?

OMG,
your Solaris mail server is too much for me, kicking my ass, you win
chatting on, z open folks.




best,
z, at home

[Daisy baby getting off late, you babies don't understand!] 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send and file fragmentation

2009-01-14 Thread JZ
Yes, that's more like it.
hahahahahahahaha, it's all happy.
chatting on, Sun folks.

best,
z


- Original Message - 
From: "David Shirley" 
To: 
Sent: Wednesday, January 14, 2009 11:44 PM
Subject: Re: [zfs-discuss] zfs send and file fragmentation


> CoW = copy on write, a ZFS feature, which makes snapshots easier to 
> maintain.
>
> However it also introduces file fragmentation in database datafiles.
>
>> Sorry, I just cannot tell how is this name related to
>> Sun, IMHO.
>>
>> In my days and enterprise environments, lots of CoWs
>> are stored in
>> relational databases.
>> I dunno where the hell you come from.
>>
>> best,
>> z
>>
> -- 
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send and file fragmentation

2009-01-14 Thread David Shirley
CoW = copy on write, a ZFS feature, which makes snapshots easier to maintain.

However it also introduces file fragmentation in database datafiles.

> Sorry, I just cannot tell how is this name related to
> Sun, IMHO.
> 
> In my days and enterprise environments, lots of CoWs
> are stored in 
> relational databases.
> I dunno where the hell you come from.
> 
> best,
> z
>
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send and file fragmentation

2009-01-14 Thread JZ
Sorry, I just cannot tell how this name is related to Sun, IMHO.

In my days and enterprise environments, lots of CoWs are stored in 
relational databases.
I dunno where the hell you come from.

best,
z



- Original Message - 
From: "David Shirley" 
To: 
Sent: Wednesday, January 14, 2009 11:29 PM
Subject: [zfs-discuss] zfs send and file fragmentation


> Hi Guys,
>
> We are in the process of using snapshots and zfs send/recv to copy 
> terrabytes of database datafiles.
>
> Due to CoW not being very friendly to databases we have a LOT of 
> fragmentation on the datafiles.
>
> This really slows down zfs send.
>
> Is it possible (or already done) to have zfs send/recv able to do a 
> blockwise send/recv of a zpool/zfs dataset so that we can harness the 
> speed of sequential I/O.
>
> This would also solve the uncompress/recompress nature of zfs send/recv.
>
> It's probably easier said than done, however I hope it gets looked at :)
>
> Cheers
> David
> -- 
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zfs send and file fragmentation

2009-01-14 Thread David Shirley
Hi Guys,

We are in the process of using snapshots and zfs send/recv to copy terabytes
of database datafiles.

Due to CoW not being very friendly to databases we have a LOT of fragmentation 
on the datafiles.

This really slows down zfs send.

Is it possible (or already done) to have zfs send/recv do a blockwise
send/recv of a zpool/zfs dataset, so that we can harness the speed of sequential
I/O?

This would also solve the uncompress/recompress nature of zfs send/recv.

It's probably easier said than done, however I hope it gets looked at :)
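
(For reference, the workflow under discussion, in sketch form with hypothetical
dataset and host names:

  # zfs snapshot tank/db@monday
  # zfs send tank/db@monday | ssh backuphost zfs receive backup/db
  # zfs snapshot tank/db@tuesday
  # zfs send -i monday tank/db@tuesday | ssh backuphost zfs receive backup/db

The send stream follows the dataset's block pointers, which is why heavily
CoW-fragmented datafiles turn it into mostly random reads on the source pool.)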

Cheers
David
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What are the usual suspects in data errors?

2009-01-14 Thread Richard Elling
well, since this is part of how I make my living, or at least
what is in my current job description...

Gary Mills wrote:
> I realize that any error can occur in a storage subsystem, but most
> of these have an extremely low probability.  I'm interested in this
> discussion in only those that do occur occasionally, and that are
> not catastrophic.

excellent... fertile ground for research.  One of the things
that we see occur with ZFS is that it detects errors which were
previously not detected.  You can see this happen on this forum
when people try to kill the canary (ZFS).  I think a better
analogy is astronomy: as our ability to see more of the universe
gets better, we see more of the universe -- but that also raises
the number of questions we can't answer... well... yet...

> Consider the common configuration of two SCSI disks connected to
> the same HBA that are configured as a mirror in some manner.  In this
> case, the data path in general consists of:

Beware of the Decomposition Law which says, the part is more
than a fraction of the whole.  This is what trips people up when
they think that if every part performs flawlessly, then the whole
will perform flawlessly.

> o The application
> o The filesystem
> o The drivers
> o The HBA
> o The SCSI bus
> o The controllers
> o The heads and platters
> 
> Many of those components have their own error checking.  Some have
> error correction.  For example, parity checking is done on a SCSI bus,
> unless it's specifically disabled.  Do SATA and PATA connections also
> do error checking?  Disk sector I/O uses CRC error checking and
> correction.  Memory buffers would often be protected by parity memory.
> Is there any more that I've missed?

thousands more ;-)


> Now, let's consider common errors.  To me, the most frequent would
> be a bit error on a disk sector.  In this case, the controller would
> report a CRC error and would not return bad data.  The filesystem
> would obtain the data from its redundant copy.  I assume that ZFS
> would also rewrite the bad sector to correct it.  The application
> would not see an error.  Similar events would happen for a parity
> error on the SCSI bus.

Nit: modern disks can detect and correct multiple byte errors in a
sector.  If ZFS can correct it (depends on the ZFS configuration)
then it will, but it will not rewrite the defective sector -- it
will write to a different sector.  While that seems better, it also
introduces at least one new failure mode and can help to expose
other, existing failure modes, such as phantom writes.

> What can go wrong with the disk controller?  A simple seek to the
> wrong track is not a problem because the track number is encoded on
> the platter.  The controller will simply recalibrate the mechanism and
> retry the seek.  If it computes the wrong sector, that would be a
> problem.  Does this happen with any frequency?  In this case, ZFS
> would detect a checksum error and obtain the data from its redundant
> copy.
> 
> A logic error in ZFS might result in incorrect metadata being written
> with valid checksum.  In this case, ZFS might panic on import or might
> correct the error.  How is this sort of error prevented?
> 
> If the application wrote bad data to the filesystem, none of the
> error checking in lower layers would detect it.  This would be
> strictly an error in the application.
> 
> Some errors might result from a loss of power if some ZFS data was
> written to a disk cache but never was written to the disk platter.
> Again, ZFS might panic on import or might correct the error.  How is
> this sort of error prevented?
> 
> After all of this discussion, what other errors can ZFS checksums
> reasonably detect?  Certainly if some of the other error checking
> failed to detect an error, ZFS would still detect one.  How likely
> are these other error checks to fail?
> 
> Is there anything else I've missed in this analysis?

Everything along the way.  If you search the archives here you will
find anecdotes of:
+ bad disks -- of all sorts
+ bad power supplies
+ bad FC switch firmware
+ flaky cables
+ bugs in NIC drivers
+ transient and permanent DRAM errors
+ and, of course, bugs in ZFS code

Basically, anywhere your data touches can fail.

However, to make the problem tractable, we often
divide failures into two classifications:

1. mechanical, including quantum-mechanical

2. design or implementation, including software defects,
design deficiencies, and manufacturing

There is a lot of experience with measurements of mechanical
failure modes, so we tend to have some ways to assign reliability
budgets and predictions.  For #2, the science we use for #1
doesn't apply.
  -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What are the usual suspects in data errors?

2009-01-14 Thread JZ
Folks, I am very sorry, for don't know how to be not misleading.

I was not challenging the Ten Commandments.
That is a good code. And maybe the first one we need to follow.

Vanity and pride and arrogance and courage are all very different, but very 
similar.
Before you can truly understand the code of love, you will have to be very 
careful.

And then, there are beyond.

Folks, this is a technology discussion, not a religious discussion.
I love you all.
I do not want to see you folks cannot make to your dream states with your 
technology knowhow just because you don't even understand the basic code of 
love.

Folks, I love you all.
OMG, I did not teach King and High anything beyond what I have said here in 
open.
Is that not enough to make the darn open discussion go on?
Please.


[do you know if not because of the 40 friends, I can be dead by now 
talking this much to an open list???]

best,
z


- Original Message - 
From: "JZ" 
To: "A Darren Dunham" ; 
Sent: Wednesday, January 14, 2009 7:48 PM
Subject: Re: [zfs-discuss] What are the usual suspects in data errors?


> folks, please, chatting on - don't make me stop you, we are all open 
> folks.
>
>
> [but darn]
>
> ok, thank you much for the anticipation for something actually useful, 
> here
> is another thing I shared with MS Storage but not with you folks yet --
>
> we win with real advantages, not lies, not scales, but only real knowhow.
>
> cheers,
> z
>
>
>
> - Original Message - 
> From: "JZ" 
> To: "A Darren Dunham" ; 
> Sent: Wednesday, January 14, 2009 7:38 PM
> Subject: Re: [zfs-discuss] What are the usual suspects in data errors?
>
>
>> darn, Darren, learning fast!
>>
>> best,
>> z
>>
>>
>> - Original Message - 
>> From: "A Darren Dunham" 
>> To: 
>> Sent: Wednesday, January 14, 2009 6:15 PM
>> Subject: Re: [zfs-discuss] What are the usual suspects in data errors?
>>
>>
>>> On Wed, Jan 14, 2009 at 04:39:03PM -0600, Gary Mills wrote:
 I realize that any error can occur in a storage subsystem, but most
 of these have an extremely low probability.  I'm interested in this
 discussion in only those that do occur occasionally, and that are
 not catastrophic.
>>>
>>> What level is "extremely low" here?
>>>
 Many of those components have their own error checking.  Some have
 error correction.  For example, parity checking is done on a SCSI bus,
 unless it's specifically disabled.  Do SATA and PATA connections also
 do error checking?  Disk sector I/O uses CRC error checking and
 correction.  Memory buffers would often be protected by parity memory.
 Is there any more that I've missed?
>>>
>>> Reports suggest that bugs in drive firmware could account for errors at
>>> a level that is not insignificant.
>>>
 What can go wrong with the disk controller?  A simple seek to the
 wrong track is not a problem because the track number is encoded on
 the platter.  The controller will simply recalibrate the mechanism and
 retry the seek.  If it computes the wrong sector, that would be a
 problem.  Does this happen with any frequency?
>>>
>>> Netapp documents certain rewrite bugs that they've specifically seen.  I
>>> would imagine they have good data on the frequency that they see it in
>>> the field.
>>>
 In this case, ZFS
 would detect a checksum error and obtain the data from its redundant
 copy.
>>>
>>> Correct.
>>>
 A logic error in ZFS might result in incorrect metadata being written
 with valid checksum.  In this case, ZFS might panic on import or might
 correct the error.  How is this sort of error prevented?
>>>
>>> It's very difficult to protect yourself from software bugs with the same
>>> piece of software.  You can create assertions that are hopefully simpler
>>> and less prone to errors, but they will not catch all bugs.
>>>
 Some errors might result from a loss of power if some ZFS data was
 written to a disk cache but never was written to the disk platter.
 Again, ZFS might panic on import or might correct the error.  How is
 this sort of error prevented?
>>>
>>> ZFS uses a multi-stage commit.  It relies on the "disk" responding to a
>>> request to flush caches to the disk.  If that assumption is correct,
>>> then there is no problem in general with power issues.  The disk is
>>> consistent both before and after the cache is flushed.
>>>
>>> -- 
>>> Darren
>>> ___
>>> zfs-discuss mailing list
>>> zfs-discuss@opensolaris.org
>>> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>> ___
>> zfs-discuss mailing list
>> zfs-discuss@opensolaris.org
>> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss 


Re: [zfs-discuss] What are the usual suspects in data errors?

2009-01-14 Thread JZ
folks, please, chatting on - don't make me stop you, we are all open folks.


[but darn]

ok, thank you much for the anticipation for something actually useful, here 
is another thing I shared with MS Storage but not with you folks yet --

we win with real advantages, not lies, not scales, but only real knowhow.

cheers,
z



- Original Message - 
From: "JZ" 
To: "A Darren Dunham" ; 
Sent: Wednesday, January 14, 2009 7:38 PM
Subject: Re: [zfs-discuss] What are the usual suspects in data errors?


> darn, Darren, learning fast!
>
> best,
> z
>
>
> - Original Message - 
> From: "A Darren Dunham" 
> To: 
> Sent: Wednesday, January 14, 2009 6:15 PM
> Subject: Re: [zfs-discuss] What are the usual suspects in data errors?
>
>
>> On Wed, Jan 14, 2009 at 04:39:03PM -0600, Gary Mills wrote:
>>> I realize that any error can occur in a storage subsystem, but most
>>> of these have an extremely low probability.  I'm interested in this
>>> discussion in only those that do occur occasionally, and that are
>>> not catastrophic.
>>
>> What level is "extremely low" here?
>>
>>> Many of those components have their own error checking.  Some have
>>> error correction.  For example, parity checking is done on a SCSI bus,
>>> unless it's specifically disabled.  Do SATA and PATA connections also
>>> do error checking?  Disk sector I/O uses CRC error checking and
>>> correction.  Memory buffers would often be protected by parity memory.
>>> Is there any more that I've missed?
>>
>> Reports suggest that bugs in drive firmware could account for errors at
>> a level that is not insignificant.
>>
>>> What can go wrong with the disk controller?  A simple seek to the
>>> wrong track is not a problem because the track number is encoded on
>>> the platter.  The controller will simply recalibrate the mechanism and
>>> retry the seek.  If it computes the wrong sector, that would be a
>>> problem.  Does this happen with any frequency?
>>
>> Netapp documents certain rewrite bugs that they've specifically seen.  I
>> would imagine they have good data on the frequency that they see it in
>> the field.
>>
>>> In this case, ZFS
>>> would detect a checksum error and obtain the data from its redundant
>>> copy.
>>
>> Correct.
>>
>>> A logic error in ZFS might result in incorrect metadata being written
>>> with valid checksum.  In this case, ZFS might panic on import or might
>>> correct the error.  How is this sort of error prevented?
>>
>> It's very difficult to protect yourself from software bugs with the same
>> piece of software.  You can create assertions that are hopefully simpler
>> and less prone to errors, but they will not catch all bugs.
>>
>>> Some errors might result from a loss of power if some ZFS data was
>>> written to a disk cache but never was written to the disk platter.
>>> Again, ZFS might panic on import or might correct the error.  How is
>>> this sort of error prevented?
>>
>> ZFS uses a multi-stage commit.  It relies on the "disk" responding to a
>> request to flush caches to the disk.  If that assumption is correct,
>> then there is no problem in general with power issues.  The disk is
>> consistent both before and after the cache is flushed.
>>
>> -- 
>> Darren
>> ___
>> zfs-discuss mailing list
>> zfs-discuss@opensolaris.org
>> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What are the usual suspects in data errors?

2009-01-14 Thread JZ
darn, Darren, learning fast!

best,
z


- Original Message - 
From: "A Darren Dunham" 
To: 
Sent: Wednesday, January 14, 2009 6:15 PM
Subject: Re: [zfs-discuss] What are the usual suspects in data errors?


> On Wed, Jan 14, 2009 at 04:39:03PM -0600, Gary Mills wrote:
>> I realize that any error can occur in a storage subsystem, but most
>> of these have an extremely low probability.  I'm interested in this
>> discussion in only those that do occur occasionally, and that are
>> not catastrophic.
> 
> What level is "extremely low" here?
> 
>> Many of those components have their own error checking.  Some have
>> error correction.  For example, parity checking is done on a SCSI bus,
>> unless it's specifically disabled.  Do SATA and PATA connections also
>> do error checking?  Disk sector I/O uses CRC error checking and
>> correction.  Memory buffers would often be protected by parity memory.
>> Is there any more that I've missed?
> 
> Reports suggest that bugs in drive firmware could account for errors at
> a level that is not insignificant.
> 
>> What can go wrong with the disk controller?  A simple seek to the
>> wrong track is not a problem because the track number is encoded on
>> the platter.  The controller will simply recalibrate the mechanism and
>> retry the seek.  If it computes the wrong sector, that would be a
>> problem.  Does this happen with any frequency? 
> 
> Netapp documents certain rewrite bugs that they've specifically seen.  I
> would imagine they have good data on the frequency that they see it in
> the field.
> 
>> In this case, ZFS
>> would detect a checksum error and obtain the data from its redundant
>> copy.
> 
> Correct.
> 
>> A logic error in ZFS might result in incorrect metadata being written
>> with valid checksum.  In this case, ZFS might panic on import or might
>> correct the error.  How is this sort of error prevented?
> 
> It's very difficult to protect yourself from software bugs with the same
> piece of software.  You can create assertions that are hopefully simpler
> and less prone to errors, but they will not catch all bugs.
> 
>> Some errors might result from a loss of power if some ZFS data was
>> written to a disk cache but never was written to the disk platter.
>> Again, ZFS might panic on import or might correct the error.  How is
>> this sort of error prevented?
> 
> ZFS uses a multi-stage commit.  It relies on the "disk" responding to a
> request to flush caches to the disk.  If that assumption is correct,
> then there is no problem in general with power issues.  The disk is
> consistent both before and after the cache is flushed.
> 
> -- 
> Darren
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] What are the usual suspects in data errors?

2009-01-14 Thread A Darren Dunham
On Wed, Jan 14, 2009 at 04:39:03PM -0600, Gary Mills wrote:
> I realize that any error can occur in a storage subsystem, but most
> of these have an extremely low probability.  I'm interested in this
> discussion in only those that do occur occasionally, and that are
> not catastrophic.

What level is "extremely low" here?

> Many of those components have their own error checking.  Some have
> error correction.  For example, parity checking is done on a SCSI bus,
> unless it's specifically disabled.  Do SATA and PATA connections also
> do error checking?  Disk sector I/O uses CRC error checking and
> correction.  Memory buffers would often be protected by parity memory.
> Is there any more that I've missed?

Reports suggest that bugs in drive firmware could account for errors at
a level that is not insignificant.

> What can go wrong with the disk controller?  A simple seek to the
> wrong track is not a problem because the track number is encoded on
> the platter.  The controller will simply recalibrate the mechanism and
> retry the seek.  If it computes the wrong sector, that would be a
> problem.  Does this happen with any frequency? 

Netapp documents certain rewrite bugs that they've specifically seen.  I
would imagine they have good data on the frequency that they see it in
the field.

> In this case, ZFS
> would detect a checksum error and obtain the data from its redundant
> copy.

Correct.

> A logic error in ZFS might result in incorrect metadata being written
> with valid checksum.  In this case, ZFS might panic on import or might
> correct the error.  How is this sort of error prevented?

It's very difficult to protect yourself from software bugs with the same
piece of software.  You can create assertions that are hopefully simpler
and less prone to errors, but they will not catch all bugs.

> Some errors might result from a loss of power if some ZFS data was
> written to a disk cache but never was written to the disk platter.
> Again, ZFS might panic on import or might correct the error.  How is
> this sort of error prevented?

ZFS uses a multi-stage commit.  It relies on the "disk" responding to a
request to flush caches to the disk.  If that assumption is correct,
then there is no problem in general with power issues.  The disk is
consistent both before and after the cache is flushed.

-- 
Darren
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OpenSolaris better Than Solaris10u6 with requards to ARECA Raid Card

2009-01-14 Thread James C. McPherson

Just to let everybody know, I'm in touch with Charles and we're
working on this problem offline. We'll report back to the list
when we've got something to talk about.


James

On Wed, 14 Jan 2009 08:37:44 -0800 (PST)
Charles Wright  wrote:

> Here's an update:
> 
> I thought that the error message
> arcmsr0: too many outstanding commands
> might be due to a Scsi queue being over ran
> 
> The areca driver has
> #define ARCMSR_MAX_OUTSTANDING_CMD 256
> 
> What might be happening is each raid set results in a new instance of
> the areca driver getting loaded so perhaps the scsi queue on the card
> is just get over ran as each drive is getting a queue depth of 256,
> as such I tested with sd_max_throttle:16
> 
> (16 Drives * 16 Queues = 256)
> 
> I verified sd_max_throttle got set via:
> r...@yoda:~/solaris-install-stuff# echo "sd_max_throttle/D" |mdb -k
> sd_max_throttle:
> sd_max_throttle:16  
> 






James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp   http://www.jmcp.homeunix.com/blog
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on partitions

2009-01-14 Thread Mattias Pantzare
>>
>> ZFS will always flush the disk cache at appropriate times. If ZFS
>> thinks that is alone it will turn the write cache on the disk on.
>
> I'm not sure if you're trying to argue or agree.  If you're trying to argue,
> you're going to have to do a better job than "zfs will always flush disk
> cache at appropriate times", because that's outright false in the case where
> zfs doesn't own the entire disk.  That flush may very well produce an
> outcome zfs could never pre-determine.

You can send flush-cache commands to the disk as often as you wish; the
only thing that happens is that the disk writes dirty sectors from its
cache to the platters. That is, no writes will be done that should not
have happened at some time anyway. This will not harm UFS or any other
user of the disk. Other users can issue flush-cache commands without
affecting ZFS. Please read up on what the flush-cache command does!

ZFS will send flush cache commands even when it is not alone on the
disk. There are many disks with write cache on by default. There have
even been disks that won't turn it off even if told so.


>> > It could cause corruption if you had UFS and zfs on the same disk.
>>
>> It is safe to have UFS and ZFS on the same disk and it has always been
>> safe.
>
> ***unless you turn on write cache.  And without write cache, performance
> sucks.  Hence me answering the OP's question.

There was no mention of cache at all in the question.

It was not clear that this sentence referred to your own text, hence
the misunderstanding:

"It could cause corruption if you had UFS and zfs on the same disk."

I read that as a separate statement.


As for "performance sucks", that is putting it a bit harshly: you will
get better performance with the write cache enabled, but the system will be
perfectly usable without it.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zpool upgrade while some disks are faulted

2009-01-14 Thread Amer Ather
What happens when zpool upgrade is run on a zpool that has some faulted 
disks? I guess it is safe to run zpool upgrade while zpool is online.
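
(A cautious sequence to check first, as a sketch:

   # zpool status -x        shows whether any pool is currently unhealthy
   # zpool upgrade          lists pools still on an older on-disk version
   # zpool upgrade -v       lists the versions this release supports
   # zpool upgrade <pool>   performs the upgrade

zpool upgrade only rewrites pool metadata, but it seems safer to clear or
replace the faulted disks and let any resilver finish before running it.)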

Thanks,
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] What are the usual suspects in data errors?

2009-01-14 Thread Gary Mills
I realize that any error can occur in a storage subsystem, but most
of these have an extremely low probability.  I'm interested in this
discussion in only those that do occur occasionally, and that are
not catastrophic.

Consider the common configuration of two SCSI disks connected to
the same HBA that are configured as a mirror in some manner.  In this
case, the data path in general consists of:

o The application
o The filesystem
o The drivers
o The HBA
o The SCSI bus
o The controllers
o The heads and platters

Many of those components have their own error checking.  Some have
error correction.  For example, parity checking is done on a SCSI bus,
unless it's specifically disabled.  Do SATA and PATA connections also
do error checking?  Disk sector I/O uses CRC error checking and
correction.  Memory buffers would often be protected by parity memory.
Is there any more that I've missed?

Now, let's consider common errors.  To me, the most frequent would
be a bit error on a disk sector.  In this case, the controller would
report a CRC error and would not return bad data.  The filesystem
would obtain the data from its redundant copy.  I assume that ZFS
would also rewrite the bad sector to correct it.  The application
would not see an error.  Similar events would happen for a parity
error on the SCSI bus.

What can go wrong with the disk controller?  A simple seek to the
wrong track is not a problem because the track number is encoded on
the platter.  The controller will simply recalibrate the mechanism and
retry the seek.  If it computes the wrong sector, that would be a
problem.  Does this happen with any frequency?  In this case, ZFS
would detect a checksum error and obtain the data from its redundant
copy.
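
(In practice that detection and repair is visible from the command line; a
minimal sketch, assuming a redundant pool named tank:

  # zpool scrub tank        read every block and verify it against its checksum
  # zpool status -v tank    the CKSUM column counts checksum errors per device
  # zpool clear tank        reset the counters once the cause is understood
)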

A logic error in ZFS might result in incorrect metadata being written
with valid checksum.  In this case, ZFS might panic on import or might
correct the error.  How is this sort of error prevented?

If the application wrote bad data to the filesystem, none of the
error checking in lower layers would detect it.  This would be
strictly an error in the application.

Some errors might result from a loss of power if some ZFS data was
written to a disk cache but never was written to the disk platter.
Again, ZFS might panic on import or might correct the error.  How is
this sort of error prevented?

After all of this discussion, what other errors can ZFS checksums
reasonably detect?  Certainly if some of the other error checking
failed to detect an error, ZFS would still detect one.  How likely
are these other error checks to fail?

Is there anything else I've missed in this analysis?

-- 
-Gary Mills--Unix Support--U of M Academic Computing and Networking-
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] questions on zfs backups

2009-01-14 Thread Ian Collins
Mark Shellenbaum wrote:
> Ian Collins wrote:
>> satya wrote:
>>> Any idea if we can use pax command to backup ZFS acls? will -p
>>> option of pax utility do the trick?
>> pax should, according to
>> http://docs.sun.com/app/docs/doc/819-5461/gbchx?a=view
>>
>
> pax isn't ACL aware. It does handle extended attributes, though.
> Here is pax RFE to support ACLs.
>
> 1191280 *pax* pax should understand acls
>
Thanks for the clarification Mark.

I think the documentation should clarify this; grouping pax with tar
and cpio in the bullet "Archive utilities – Save ZFS data with archive
utilities such as tar, cpio, and pax or third-party backup products."
implies otherwise.

-- 
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] questions on zfs backups

2009-01-14 Thread Mark Shellenbaum
Ian Collins wrote:
> satya wrote:
>>  Any idea if we can use pax command to backup ZFS acls? will -p option of 
>> pax utility do the trick? 
>>
>>   
> pax should, according to
> http://docs.sun.com/app/docs/doc/819-5461/gbchx?a=view
> 

pax isn't ACL aware.  It does handle extended attributes, though.
Here is pax RFE to support ACLs.

1191280 *pax* pax should understand acls

> tar and cpio do.
> 

tar and cpio handle both ACLs and extended attributes.

> It should be simple enough to test, just generate an archive and have a
> look.
> 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] help please - Zpool import : I/O error

2009-01-14 Thread Mathieu
Hi,

I have a big problem with my ZFS drive. After a kernel panic, I cannot import 
the pool anymore :

---
=> zpool status 
no pools available
=> zpool list
no pools available

---
=> zpool import

pool: ZFS
   id: 9004030332584099627
  state: FAULTED
status: The pool metadata is corrupted.
action: The pool cannot be imported due to damaged devices or data.
 The pool may be active on another system, but can be imported 
using
 the '-f' flag.
see: http://www.sun.com/msg/ZFS-8000-72
config:

ZFS       FAULTED  corrupted data
  disk2s2   ONLINE

  pool: ZFS
   id: 5050959592823553345
  state: FAULTED
status: The pool was last accessed by another system.
action: The pool cannot be imported due to damaged devices or data.
see: http://www.sun.com/msg/ZFS-8000-EY
config:

ZFS UNAVAIL  insufficient replicas
  disk1 UNAVAIL  cannot open

---
=> zpool import -f 9004030332584099627
cannot import 'ZFS': I/O error
---

I am in despair; all my data is on this drive and I didn't have time to make a
backup.

Is there anything I can do?

Thank you for your help,

Mathieu
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] questions on zfs backups

2009-01-14 Thread Ian Collins
satya wrote:
>  Any idea if we can use pax command to backup ZFS acls? will -p option of pax 
> utility do the trick? 
>
>   
pax should, according to
http://docs.sun.com/app/docs/doc/819-5461/gbchx?a=view

tar and cpio do.

It should be simple enough to test, just generate an archive and have a
look.
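
A quick test along those lines, with hypothetical paths (ls -v prints the full
NFSv4 ACL, so the before and after can be compared directly):

  # touch /tank/fs/aclfile
  # chmod A+user:webservd:read_data:allow /tank/fs/aclfile
  # ls -v /tank/fs/aclfile
  # cd /tank/fs && tar cf /tmp/acl.tar aclfile
  # mkdir /tmp/restore && cd /tmp/restore && tar xpf /tmp/acl.tar
  # ls -v aclfile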

-- 
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on partitions

2009-01-14 Thread Richard Elling
??

Tim wrote:
> On Wed, Jan 14, 2009 at 2:40 PM, Mattias Pantzare  > wrote:
> 
> On Wed, Jan 14, 2009 at 20:03, Tim  > wrote:
>  >
>  >
>  > On Tue, Jan 13, 2009 at 6:26 AM, Brian Wilson
> mailto:bfwil...@doit.wisc.edu>>
>  > wrote:
>  >>
>  >> Does creating ZFS pools on multiple partitions on the same
> physical drive
>  >> still run into the performance and other issues that putting
> pools in slices
>  >> does?
>  >
>  >
>  > Is zfs going to own the whole drive or not?  The *issue* is that
> zfs will
>  > not use the drive cache if it doesn't own the whole disk since it
> won't know
>  > whether or not it should be flushing cache at any given point in
> time.
> 
> ZFS will always flush the disk cache at appropriate times. If ZFS
> thinks that is alone it will turn the write cache on the disk on.
> 
> 
> I'm not sure if you're trying to argue or agree.  If you're trying to 
> argue, you're going to have to do a better job than "zfs will always 
> flush disk cache at appropriate times", because that's outright false in 
> the case where zfs doesn't own the entire disk.  That flush may very 
> well produce an outcome zfs could never pre-determine.

Would you care to explain this logic?  Are you saying that if ZFS
sends a cache flush command to a disk that it will "produce an
outcome ZFS could never pre-determine?"  Or am I just misinterpreting?
  -- richard
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] questions on zfs backups

2009-01-14 Thread satya
Any update on star's ability to back up ZFS ACLs? Any idea if we can use the pax
command to back up ZFS ACLs? Will the -p option of the pax utility do the trick?

-satya
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on partitions

2009-01-14 Thread Tim
On Wed, Jan 14, 2009 at 2:40 PM, Mattias Pantzare wrote:

> On Wed, Jan 14, 2009 at 20:03, Tim  wrote:
> >
> >
> > On Tue, Jan 13, 2009 at 6:26 AM, Brian Wilson 
> > wrote:
> >>
> >> Does creating ZFS pools on multiple partitions on the same physical
> drive
> >> still run into the performance and other issues that putting pools in
> slices
> >> does?
> >
> >
> > Is zfs going to own the whole drive or not?  The *issue* is that zfs will
> > not use the drive cache if it doesn't own the whole disk since it won't
> know
> > whether or not it should be flushing cache at any given point in time.
>
> ZFS will always flush the disk cache at appropriate times. If ZFS
> thinks that is alone it will turn the write cache on the disk on.


I'm not sure if you're trying to argue or agree.  If you're trying to argue,
you're going to have to do a better job than "zfs will always flush disk
cache at appropriate times", because that's outright false in the case where
zfs doesn't own the entire disk.  That flush may very well produce an
outcome zfs could never pre-determine.


>
> > It could cause corruption if you had UFS and zfs on the same disk.
>
> It is safe to have UFS and ZFS on the same disk and it has always been
> safe.
>

***unless you turn on write cache.  And without write cache, performance
sucks.  Hence me answering the OP's question.


>
> Write cache on the disk is not safe for UFS, that is why zfs will turn
> it on only if it is alone.


Which is EXACTLY what he's asking, and what I just told him.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on partitions

2009-01-14 Thread Tim
On Wed, Jan 14, 2009 at 2:48 PM, Miles Nordin  wrote:

> but, the write cache on/offness is a stateful setting stored on the
> disk platter, right?  so it survives reboots of the disk, and ZFS
> doesn't turn it off, and UFS arguably should turn it off but
> doesn't---once you've dedicated a disk to ZFS, you have to turn the
> write cache off yourself somehow using 'format -e' if you are no
> longer using a disk for ZFS only.  Or am I remembering wrong?
>


ZFS does turn it off if it doesn't have the whole disk.  That's where the
performance issues come from.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on partitions

2009-01-14 Thread Miles Nordin
but, the write cache on/offness is a stateful setting stored on the
disk platter, right?  so it survives reboots of the disk, and ZFS
doesn't turn it off, and UFS arguably should turn it off but
doesn't---once you've dedicated a disk to ZFS, you have to turn the
write cache off yourself somehow using 'format -e' if you are no
longer using a disk for ZFS only.  Or am I remembering wrong?
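
(From memory the interactive path is roughly the following, so check it against
format(1M) before trusting it:

  # format -e
  ... pick the disk ...
  format> cache
  cache> write_cache
  write_cache> display
  write_cache> disable
)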


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on partitions

2009-01-14 Thread Mattias Pantzare
On Wed, Jan 14, 2009 at 20:03, Tim  wrote:
>
>
> On Tue, Jan 13, 2009 at 6:26 AM, Brian Wilson 
> wrote:
>>
>> Does creating ZFS pools on multiple partitions on the same physical drive
>> still run into the performance and other issues that putting pools in slices
>> does?
>
>
> Is zfs going to own the whole drive or not?  The *issue* is that zfs will
> not use the drive cache if it doesn't own the whole disk since it won't know
> whether or not it should be flushing cache at any given point in time.

ZFS will always flush the disk cache at appropriate times. If ZFS
thinks that it is alone, it will turn on the disk's write cache.

>
> It could cause corruption if you had UFS and zfs on the same disk.

It is safe to have UFS and ZFS on the same disk and it has always been safe.

Write cache on the disk is not safe for UFS; that is why ZFS will turn
it on only if it is alone.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on partitions

2009-01-14 Thread Tim
On Tue, Jan 13, 2009 at 6:26 AM, Brian Wilson wrote:

>
> Does creating ZFS pools on multiple partitions on the same physical drive
> still run into the performance and other issues that putting pools in slices
> does?
>


Is zfs going to own the whole drive or not?  The *issue* is that zfs will
not use the drive cache if it doesn't own the whole disk since it won't know
whether or not it should be flushing cache at any given point in time.

It could cause corruption if you had UFS and zfs on the same disk.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Why is st_size of a zfs directory equal to the

2009-01-14 Thread David Collier-Brown
"Richard L. Hamilton"  wrote:
>> I did find the earlier discussion on the subject (someone e-mailed me that 
>> there had been
>> such).  It seemed to conclude that some apps are statically linked with old 
>> scandir() code
>> that (incorrectly) assumed that the number of directory entries could be 
>> estimated as
>> st_size/24; and worse, that some such apps might be seeing the small st_size 
>> that zfs
>> offers via NFS, so they might not even be something that could be fixed on 
>> Solaris at all.
>> But I didn't see anything in the discussion that suggested that this was 
>> going to be changed.
>> Nor did I see a compelling argument for leaving it the way it is, either.  
>> In the face of
>> "undefined", all arguments end up as pragmatism rather than principle, IMO.
> 
Joerg Schilling wrote:
> This is a problem I had to fix for some customers in 1992 when people started 
> to use NFS 
> servers based on the Novell OS.
> Jörg
> 

  Oh bother, I should have noticed this back in 1999/2001 (;-))

  Joking aside, we were looking at the Solaris ABI (application
Binary interface) and working on ensuring binary stability. The
size of a directory entry was supposed to be undefined and in
principle *variable*, but Novell et al. seem to have assumed that
the size they used was guaranteed to be the same for all time.

  And no machine needs more than 640 KB of memory, either...

  Ah well, at least the ZFS folks found it for us, so I can add
it to my database of porting problems.  What OSs did you folks
find it on?

--dave (an external consultant, these days) c-b
-- 
David Collier-Brown| Always do right. This will gratify
Sun Microsystems, Toronto  | some people and astonish the rest
dav...@sun.com |  -- Mark Twain
cell: (647) 833-9377, bridge: (877) 385-4099 code: 506 9191#
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Solaris destroys large discs?? Bug in install?

2009-01-14 Thread Orvar Korvar
Ok, I've upgraded the motherboard's BIOS and installed ZFS with b105 over the
existing UFS b104. It works better now. The disk sounds almost like normal,
barely audible. But sometimes it goes back and sounds like hell. Very seldom now.

I don't get it. Why does UFS do this? Hmm...
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Why is st_size of a zfs directory equal to the

2009-01-14 Thread Joerg Schilling
"Richard L. Hamilton"  wrote:

> I did find the earlier discussion on the subject (someone e-mailed me that 
> there had been
> such).  It seemed to conclude that some apps are statically linked with old 
> scandir() code
> that (incorrectly) assumed that the number of directory entries could be 
> estimated as
> st_size/24; and worse, that some such apps might be seeing the small st_size 
> that zfs
> offers via NFS, so they might not even be something that could be fixed on 
> Solaris at all.
> But I didn't see anything in the discussion that suggested that this was 
> going to be changed.
> Nor did I see a compelling argument for leaving it the way it is, either.  In 
> the face of
> "undefined", all arguments end up as pragmatism rather than principle, IMO.

This is a problem I had to fix for some customers in 1992 when people started 
to use NFS 
servers based on the Novell OS.

Jörg

-- 
 EMail:jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
   j...@cs.tu-berlin.de(uni)  
   joerg.schill...@fokus.fraunhofer.de (work) Blog: 
http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OpenSolaris better Than Solaris10u6 with requards to ARECA Raid Card

2009-01-14 Thread Richard Elling
Charles Wright wrote:
> Here's an update:
>
> I thought that the error message
> arcmsr0: too many outstanding commands
> might be due to a Scsi queue being over ran
>   

Rather than messing with sd_max_throttle, you might try
changing the number of iops ZFS will queue to a vdev.
IMHO this is easier to correlate because the ZFS tunable is
closer to the application than an sd tunable.  Details at:
http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Device_I.2FO_Queue_Size_.28I.2FO_Concurrency.29
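
(Roughly as that guide describes, the tunable goes in /etc/system -- the name
and value below are only a sketch, so check the guide before copying them:

   set zfs:zfs_vdev_max_pending = 10

A reboot picks it up; for a quick experiment it can also be poked into the
running kernel with mdb -kw.)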

iostat will show the number of iops queued to the device in
the actv column, but for modern hardware this number can
fluctuate quite a bit in a 1-second sample period -- which
implies that you need lots of load to see it.  The problem is
that if lots of load makes it fall over, then the load will be
automatically reduced -- causing you to chase your tail.
It should be fairly easy to whip up a dtrace script which
would quantize the queue depth, though... [need more days
in the hour...]
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Why is st_size of a zfs directory equal to the

2009-01-14 Thread Richard L. Hamilton
> "Richard L. Hamilton"  wrote:
> 
> > Cute idea, maybe.  But very inconsistent with the
> size in blocks (reported by ls -dls dir).
> > Is there a particular reason for this, or is it one
> of those just for the heck of it things?
> >
> > Granted that it isn't necessarily _wrong_.  I just
> checked SUSv3 for stat() and sys/stat.h,
> > and it appears that st_size is only well-defined
> for regular files and symlinks.  So I suppose
> > it could be (a) undefined, or  (b) whatever is
> deemed to be useful, for directories,
> > device files, etc.
> 
> You could also return 0 for st_size for all
> directories and would still be 
> POSIX compliant.
> 
> 
> Jörg
> 

Yes, some do IIRC (certainly for empty directories, maybe always; I forget what
OS I saw that on).

Heck, "undefined" means it wouldn't be _wrong_ to return a random number.  Even
a _negative_ number wouldn't necessarily be wrong (although it would be a new 
low
in rudeness, perhaps).

I did find the earlier discussion on the subject (someone e-mailed me that 
there had been
such).  It seemed to conclude that some apps are statically linked with old 
scandir() code
that (incorrectly) assumed that the number of directory entries could be 
estimated as
st_size/24; and worse, that some such apps might be seeing the small st_size 
that zfs
offers via NFS, so they might not even be something that could be fixed on 
Solaris at all.
But I didn't see anything in the discussion that suggested that this was going 
to be changed.
Nor did I see a compelling argument for leaving it the way it is, either.  In 
the face of
"undefined", all arguments end up as pragmatism rather than principle, IMO.

Maybe it's not a bad thing to go and break incorrect code.  But if that code 
has worked for
a long time (maybe long enough for the source to have been lost), I don't know 
that it's
helpful to just remind everyone that st_size is only defined for certain types 
of objects,
and directories aren't one of them.

(Now if one wanted to write something to break code depending on 32-bit time_t 
_now_
rather than waiting for 2038, that might be a good deed in terms of breaking 
things.
But I'll be 80 then (if I'm still alive), and I probably won't care.)
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OpenSolaris better Than Solaris10u6 with requards to ARECA Raid Card

2009-01-14 Thread Charles Wright
Here's an update:

I thought that the error message
arcmsr0: too many outstanding commands
might be due to a SCSI queue being overrun

The areca driver has
#define ARCMSR_MAX_OUTSTANDING_CMD 256

What might be happening is that each RAID set results in a new instance of the
areca driver getting loaded, so perhaps the SCSI queue on the card is simply
being overrun as each drive is getting a queue depth of 256; as such, I tested
with sd_max_throttle:16

(16 Drives * 16 Queues = 256)

I verified sd_max_throttle got set via:
r...@yoda:~/solaris-install-stuff# echo "sd_max_throttle/D" |mdb -k
sd_max_throttle:
sd_max_throttle:16  
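
(For completeness: the throttle itself was set the usual way, with a line in
/etc/system and a reboot, the value 16 coming from the 16 drives * 16 queues
arithmetic above:

   set sd:sd_max_throttle=16
)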



I see that if I run this script to create a bunch of small files, I can make a
lot of drives jump to DEGRADED in a hurry.

#!/bin/bash

dir=/backup/homebackup/junk
mkdir -p $dir

cd $dir

date
printf "Creating 1 1k files in $dir \n"
i=1
while [ $i -ge 0 ]
do
   j=`expr $i - 1`
   dd if=/dev/zero of=$i count=1 bs=1k &> /dev/null
   i=$j
done

date

i=1
printf "Deleting 1 1k files in $dir \n"
while [ $i -ge 0 ]
do
   j=`expr $i - 1`
   rm $i
   i=$j
done
date


Before running the script:
r...@yoda:~# zpool status
 pool: backup
state: ONLINE
scrub: none requested
config:

   NAME STATE READ WRITE CKSUM
   backup   ONLINE   0 0 0
 raidz1 ONLINE   0 0 2
   c4t2d0   ONLINE   0 0 0
   c4t3d0   ONLINE   0 0 0
   c4t4d0   ONLINE   0 0 0
   c4t5d0   ONLINE   0 0 0
   c4t6d0   ONLINE   0 0 0
   c4t7d0   ONLINE   0 0 0
   c4t8d0   ONLINE   0 0 0
 raidz1 ONLINE   0 0 2
   c4t9d0   ONLINE   0 0 0
   c4t10d0  ONLINE   0 0 0
   c4t11d0  ONLINE   0 0 0
   c4t12d0  ONLINE   0 0 0
   c4t13d0  ONLINE   0 0 0
   c4t14d0  ONLINE   0 0 0
   c4t15d0  ONLINE   0 0 0

errors: No known data errors

 pool: rpool
state: ONLINE
scrub: none requested
config:

   NAME  STATE READ WRITE CKSUM
   rpool ONLINE   0 0 0
 mirror  ONLINE   0 0 0
   c4t0d0s0  ONLINE   0 0 0
   c4t1d0s0  ONLINE   0 0 0

errors: No known data errors


   AFTER running the script:

r...@yoda:~/solaris-install-stuff# zpool status -v
 pool: backup
state: DEGRADED
status: One or more devices has experienced an error resulting in data
   corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
   entire pool from backup.
  see: http://www.sun.com/msg/ZFS-8000-8A
scrub: none requested
config:

   NAME STATE READ WRITE CKSUM
   backup   DEGRADED 0 0 5
 raidz1 DEGRADED 0 014
   c4t2d0   DEGRADED 0 0 0  too many errors
   c4t3d0   ONLINE   0 0 0
   c4t4d0   ONLINE   0 0 0
   c4t5d0   DEGRADED 0 0 0  too many errors
   c4t6d0   ONLINE   0 0 0
   c4t7d0   DEGRADED 0 0 0  too many errors
   c4t8d0   DEGRADED 0 0 0  too many errors
 raidz1 DEGRADED 0 012
   c4t9d0   DEGRADED 0 0 0  too many errors
   c4t10d0  DEGRADED 0 0 0  too many errors
   c4t11d0  DEGRADED 0 0 0  too many errors
   c4t12d0  DEGRADED 0 0 0  too many errors
   c4t13d0  DEGRADED 0 0 0  too many errors
   c4t14d0  DEGRADED 0 0 0  too many errors
   c4t15d0  DEGRADED 0 0 1  too many errors

errors: Permanent errors have been detected in the following files:

   backup/homebackup:<0x0>

 pool: rpool
state: ONLINE
scrub: none requested
config:

   NAME  STATE READ WRITE CKSUM
   rpool ONLINE   0 0 0
 mirror  ONLINE   0 0 0
   c4t0d0s0  ONLINE   0 0 0
   c4t1d0s0  ONLINE   0 0 0

errors: No known data errors

BTW,
I called Seagate to check the drive firmware. They confirm that firmware
version 3.AEK is the latest for the drives I have. This is the version
running on all 16 of my drives.

I'm about out of ideas to try.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OpenSolaris better Than Solaris10u6 with requards to ARECA Raid Card

2009-01-14 Thread Charles Wright
Thanks for the info. I'm running the latest firmware for my card: V1.46
with BOOT ROM version V1.45.

Could you tell me how you have your card configured? Are you using JBOD,
RAID, or pass-through? What is your Max SATA mode set to? How many drives
do you have attached?

What is your ZFS config like?

Thanks.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] mirror rpool

2009-01-14 Thread Richard Elling
mijenix wrote:
> yes, that's the way zpool likes it
>
> I think I've to understand how (Open)Solaris create disks or how 
> the partition thing works under OSol. Do you know any guide or howto?
>   

We've tried to make sure the ZFS Admin Guide covers these things, including
the procedure for mirroring the root pool by hand (Jumpstart can do it
automatically :-)
http://www.opensolaris.org/os/community/zfs/docs/zfsadmin.pdf

If it does not meet your needs, please let us know.
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool export+import doesn't maintain snapshot

2009-01-14 Thread Nico Sabbi
On Wednesday 14 January 2009 16:49:48 cindy.swearin...@sun.com wrote:
> Nico,
>
> If you want to enable snapshot display as in previous releases,
> then set this parameter on the pool:
>
> # zpool set listsnapshots=on pool-name
>
> Cind
>

thanks, it works as I need.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool export+import doesn't maintain snapshot

2009-01-14 Thread Cindy . Swearingen
Nico,

If you want to enable snapshot display as in previous releases,
then set this parameter on the pool:

# zpool set listsnapshots=on pool-name

Cind

Nico Sabbi wrote:
> On Wednesday 14 January 2009 11:44:56 Peter Tribble wrote:
> 
>>On Wed, Jan 14, 2009 at 10:11 AM, Nico Sabbi  
> 
> wrote:
> 
>>>Hi,
>>>I wanted to migrate a virtual disk from a S10U6 to OpenSolaris
>>>2008.11.
>>>In the first machine I rebooted to single-user and ran
>>>$ zpool export disco
>>>
>>>then copied the disk files to the target VM, rebooted as
>>>single-user and ran
>>>$ zpool import disco
>>>
>>>The disc was mounted, but none of the hundreds of snapshots was
>>>there.
>>>
>>>Did Imiss something?
>>
>>How do you know the snapshots are gone?
>>
>>Note that the zfs list command no longer shows snapshots by
>>default. You need 'zfs list -t all' for that.
> 
> 
> now I see them, but why this change? what do I have to do to list them
> by default as on the old server?
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Why is st_size of a zfs directory equal to the number of entries?

2009-01-14 Thread Joerg Schilling
"Richard L. Hamilton"  wrote:

> Cute idea, maybe.  But very inconsistent with the size in blocks (reported by 
> ls -dls dir).
> Is there a particular reason for this, or is it one of those just for the 
> heck of it things?
>
> Granted that it isn't necessarily _wrong_.  I just checked SUSv3 for stat() 
> and sys/stat.h,
> and it appears that st_size is only well-defined for regular files and 
> symlinks.  So I suppose
> it could be (a) undefined, or  (b) whatever is deemed to be useful, for 
> directories,
> device files, etc.

You could also return 0 for st_size for all directories and still be 
POSIX-compliant.


Jörg

-- 
 EMail:jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
   j...@cs.tu-berlin.de(uni)  
   joerg.schill...@fokus.fraunhofer.de (work) Blog: 
http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs & iscsi sustained write performance

2009-01-14 Thread milosz
sorry, that 60% statement was misleading... i will VERY OCCASIONALLY get a 
spike to 60%, but i'm averaging more like 15%, with the throughput often 
dropping to zero for several seconds at a time.

that iperf test more or less demonstrates it isn't a network problem, no?

also i have been using microsoft iscsi initiator... i will try doing a 
solaris-solaris test later.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Reboot during import (now with kernel panic log)

2009-01-14 Thread Tobbe Lundberg
I have now installed OpenSolaris 2008.11 to a hard drive so I could catch the 
shutdown messages that get written to /var/adm/messages.

When my computer reboots I get a kernel panic; this is the relevant part of 
the log (also posted here if you don't like the line breaks: 
http://paste2.org/p/129753):

Jan 14 15:12:40 opensolaris unix: [ID 836849 kern.notice] 
Jan 14 15:12:40 opensolaris ^Mpanic[cpu1]/thread=ff000c0dec80: 
Jan 14 15:12:40 opensolaris genunix: [ID 335743 kern.notice] BAD TRAP: type=e 
(#pf Page fault) rp=ff000c0de680 addr=0 occurred in module "unix" due to a 
NULL pointer dereference
Jan 14 15:12:40 opensolaris unix: [ID 10 kern.notice] 
Jan 14 15:12:40 opensolaris unix: [ID 839527 kern.notice] sched: 
Jan 14 15:12:40 opensolaris unix: [ID 753105 kern.notice] #pf Page fault
Jan 14 15:12:40 opensolaris unix: [ID 532287 kern.notice] Bad kernel fault at 
addr=0x0
Jan 14 15:12:40 opensolaris unix: [ID 243837 kern.notice] pid=0, 
pc=0xfb84e84b, sp=0xff000c0de778, eflags=0x10246
Jan 14 15:12:40 opensolaris unix: [ID 211416 kern.notice] cr0: 
8005003b cr4: 6f8
Jan 14 15:12:40 opensolaris unix: [ID 624947 kern.notice] cr2: 0
Jan 14 15:12:40 opensolaris unix: [ID 625075 kern.notice] cr3: 340
Jan 14 15:12:40 opensolaris unix: [ID 625715 kern.notice] cr8: c
Jan 14 15:12:40 opensolaris unix: [ID 10 kern.notice] 
Jan 14 15:12:40 opensolaris unix: [ID 592667 kern.notice]   rdi:
0 rsi:0 rdx: ff000c0dec80
Jan 14 15:12:40 opensolaris unix: [ID 592667 kern.notice]   rcx:
   20  r8:0  r9: ff025b247040
Jan 14 15:12:40 opensolaris unix: [ID 592667 kern.notice]   rax:
0 rbx:  400 rbp: ff000c0de7e0
Jan 14 15:12:40 opensolaris unix: [ID 592667 kern.notice]   r10:   
1a6ee4882b124b r11: ff025c2173f8 r12:0
Jan 14 15:12:40 opensolaris unix: [ID 592667 kern.notice]   r13:   
5f9800 r14: 3191 r15: ff02544c
Jan 14 15:12:40 opensolaris unix: [ID 592667 kern.notice]   fsb:
0 gsb: ff0252ea8540  ds:   4b
Jan 14 15:12:40 opensolaris unix: [ID 592667 kern.notice]es:
   4b  fs:0  gs:  1c3
Jan 14 15:12:40 opensolaris unix: [ID 592667 kern.notice]   trp:
e err:2 rip: fb84e84b
Jan 14 15:12:40 opensolaris unix: [ID 592667 kern.notice]cs:
   30 rfl:10246 rsp: ff000c0de778
Jan 14 15:12:40 opensolaris unix: [ID 266532 kern.notice]ss:
   38
Jan 14 15:12:40 opensolaris unix: [ID 10 kern.notice] 
Jan 14 15:12:40 opensolaris genunix: [ID 655072 kern.notice] ff000c0de560 
unix:die+dd ()
Jan 14 15:12:40 opensolaris genunix: [ID 655072 kern.notice] ff000c0de670 
unix:trap+1752 ()
Jan 14 15:12:40 opensolaris genunix: [ID 655072 kern.notice] ff000c0de680 
unix:_cmntrap+e9 ()
Jan 14 15:12:40 opensolaris genunix: [ID 655072 kern.notice] ff000c0de7e0 
unix:mutex_enter+b ()
Jan 14 15:12:40 opensolaris genunix: [ID 655072 kern.notice] ff000c0de850 
zfs:metaslab_free+12e ()
Jan 14 15:12:40 opensolaris genunix: [ID 655072 kern.notice] ff000c0de870 
zfs:zio_dva_free+26 ()
Jan 14 15:12:40 opensolaris genunix: [ID 655072 kern.notice] ff000c0de8a0 
zfs:zio_execute+a0 ()
Jan 14 15:12:40 opensolaris genunix: [ID 655072 kern.notice] ff000c0de8d0 
zfs:zio_nowait+57 ()
Jan 14 15:12:40 opensolaris genunix: [ID 655072 kern.notice] ff000c0de960 
zfs:arc_free+18f ()
Jan 14 15:12:40 opensolaris genunix: [ID 655072 kern.notice] ff000c0de9b0 
zfs:dsl_free+30 ()
Jan 14 15:12:40 opensolaris genunix: [ID 655072 kern.notice] ff000c0dea40 
zfs:dsl_dataset_block_kill+293 ()
Jan 14 15:12:40 opensolaris genunix: [ID 655072 kern.notice] ff000c0deac0 
zfs:dmu_objset_sync+c4 ()
Jan 14 15:12:40 opensolaris genunix: [ID 655072 kern.notice] ff000c0deb30 
zfs:dsl_pool_sync+1e8 ()
Jan 14 15:12:40 opensolaris genunix: [ID 655072 kern.notice] ff000c0debc0 
zfs:spa_sync+2af ()
Jan 14 15:12:40 opensolaris genunix: [ID 655072 kern.notice] ff000c0dec60 
zfs:txg_sync_thread+1fc ()
Jan 14 15:12:40 opensolaris genunix: [ID 655072 kern.notice] ff000c0dec70 
unix:thread_start+8 ()
Jan 14 15:12:40 opensolaris unix: [ID 10 kern.notice] 
Jan 14 15:12:40 opensolaris genunix: [ID 672855 kern.notice] syncing file 
systems...
Jan 14 15:12:40 opensolaris genunix: [ID 904073 kern.notice]  done
Jan 14 15:12:41 opensolaris genunix: [ID 111219 kern.notice] dumping to 
/dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
Jan 14 15:12:47 opensolaris genunix: [ID 409368 kern.notice] ^M100% done: 
107977 pages dumped, compression ratio 4.62, 
Jan 14 15:12:47 opensolaris genunix: [ID 851671 kern.notice] dump succeeded
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

[zfs-discuss] Why is st_size of a zfs directory equal to the number of entries?

2009-01-14 Thread Richard L. Hamilton
Cute idea, maybe.  But very inconsistent with the size in blocks (reported by 
ls -dls dir).
Is there a particular reason for this, or is it one of those just for the heck 
of it things?

Granted that it isn't necessarily _wrong_.  I just checked SUSv3 for stat() and 
sys/stat.h,
and it appears that st_size is only well-defined for regular files and 
symlinks.  So I suppose
it could be (a) undefined, or  (b) whatever is deemed to be useful, for 
directories,
device files, etc.

This is of course inconsistent with the behavior on other filesystems.  On UFS 
(a bit
of a special case perhaps in that it still allows read(2) on a directory, for 
compatibility),
the st_size seems to reflect the actual number of bytes used by the 
implementation to
hold the directory's current contents.  That may well also be the case for 
tmpfs, but from
user-land, one can't tell since it (reasonably enough) disallows read(2) on 
directories.
Haven't checked any other filesystems.  Don't have anything else (pcfs, hsfs, 
udfs, ...)
mounted at the moment to check.

(other stuff: ISTR that devices on Solaris will give a "size" if applicable, 
but for
non LF-aware 32-bit, that may be capped at MAXOFF32_T rather than returning an 
error;
I think maybe for pipes, one sees the number of bytes available to be read.  
None of
which is portable or should necessarily be depended on...)

Cool ideas are fine, but IMO, if one does wish to make something nominally 
undefined
have some particular behavior, I wonder why one wouldn't at least try for 
consistency...
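
A quick way to see the behaviour from user-land, assuming /tank is a ZFS 
filesystem (the paths are only examples):

    # watch st_size change as directory entries are added on ZFS
    mkdir /tank/testdir
    ls -dls /tank/testdir
    touch /tank/testdir/a /tank/testdir/b /tank/testdir/c
    ls -dls /tank/testdir   # the size column grows with the number of entries,
                            # while on UFS it stays a multiple of the directory block size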
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] mirror rpool

2009-01-14 Thread Johan Hartzenberg
On Wed, Jan 14, 2009 at 10:58 AM, mijenix  wrote:

> yes, that's the way zpool likes it
>
> I think I've to understand how (Open)Solaris create disks or how
> the partition thing works under OSol. Do you know any guide or howto?
>

http://initialprogramload.blogspot.com/2008/07/how-solaris-disk-device-names-work.html




-- 
Any sufficiently advanced technology is indistinguishable from magic.
   Arthur C. Clarke

My blog: http://initialprogramload.blogspot.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OpenSolaris better Than Solaris10u6 with requards to ARECA Raid Card

2009-01-14 Thread Johan Hartzenberg
There is an update in build 105, but it only pertains to the RAID
management tool:

 Issues Resolved:
 BUG/RFE: 6776690 Areca raid management util doesn't work on solaris
 Files Changed:
 update: usr/src/uts/intel/io/scsi/adapters/arcmsr/arcmsr.c
 update: usr/src/uts/intel/io/scsi/adapters/arcmsr/arcmsr.h



On Wed, Jan 14, 2009 at 1:17 PM, Orvar Korvar <
knatte_fnatte_tja...@yahoo.com> wrote:

> Ive read about some Areca bug(?) being fixed in SXCE b105?
> --
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>



-- 
Any sufficiently advanced technology is indistinguishable from magic.
   Arthur C. Clarke

My blog: http://initialprogramload.blogspot.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OpenSolaris better Than Solaris10u6 with requards to ARECA Raid Card

2009-01-14 Thread Orvar Korvar
I've read about some Areca bug(?) being fixed in SXCE b105?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool export+import doesn't maintain snapshot

2009-01-14 Thread Nico Sabbi
On Wednesday 14 January 2009 11:44:56 Peter Tribble wrote:
> On Wed, Jan 14, 2009 at 10:11 AM, Nico Sabbi  
wrote:
> > Hi,
> > I wanted to migrate a virtual disk from a S10U6 to OpenSolaris
> > 2008.11.
> > In the first machine I rebooted to single-user and ran
> > $ zpool export disco
> >
> > then copied the disk files to the target VM, rebooted as
> > single-user and ran
> > $ zpool import disco
> >
> > The disc was mounted, but none of the hundreds of snapshots was
> > there.
> >
> > Did Imiss something?
>
> How do you know the snapshots are gone?
>
> Note that the zfs list command no longer shows snapshots by
> default. You need 'zfs list -t all' for that.

Now I see them, but why this change? What do I have to do to list them
by default, as on the old server?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool export+import doesn't maintain snapshot

2009-01-14 Thread Peter Tribble
On Wed, Jan 14, 2009 at 10:11 AM, Nico Sabbi  wrote:
> Hi,
> I wanted to migrate a virtual disk from a S10U6 to OpenSolaris
> 2008.11.
> In the first machine I rebooted to single-user and ran
> $ zpool export disco
>
> then copied the disk files to the target VM, rebooted as single-user
> and ran
> $ zpool import disco
>
> The disc was mounted, but none of the hundreds of snapshots was there.
>
> Did Imiss something?

How do you know the snapshots are gone?

Note that the zfs list command no longer shows snapshots by default.
You need 'zfs list -t all' for that.

-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Odd network performance with ZFS/CIFS

2009-01-14 Thread fredrick phol
Turning off Windows Quality of Service seems to have given me sustained write 
speeds of about 90MB/s using CIFS.

Writes to the iSCSI device are hitting about 40MB/s, but the network utilisation 
graph is very jagged: a constant spike to 60% utilisation, then a drop to 0, and 
repeat.

I also seem to have seen a slight improvement by turning off flow control 
(which, if my network knowledge serves me correctly, shouldn't really be on for 
my NIC anyway).
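
For reference, on the OpenSolaris side the equivalent flow-control setting is 
a dladm link property; a minimal sketch, assuming an e1000g interface (the 
link name is only an example, and whether changing it helps depends on the 
switch):

    # show the current flow-control setting on the link
    dladm show-linkprop -p flowctrl e1000g0

    # disable flow control on that link
    dladm set-linkprop -p flowctrl=no e1000g0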
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zpool export+import doesn't maintain snapshot

2009-01-14 Thread Nico Sabbi
Hi,
I wanted to migrate a virtual disk from a S10U6 to OpenSolaris 
2008.11.
In the first machine I rebooted to single-user and ran
$ zpool export disco

then copied the disk files to the target VM, rebooted as single-user
and ran
$ zpool import disco

The disc was mounted, but none of the hundreds of snapshots was there.

Did I miss something?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS vs ZFS + HW raid? Which is best?

2009-01-14 Thread Ashish Nabira
Hi;

It's all about performance when you consider H/W RAID. It will put  
extra overhead on your OS. As ZFS is fast, I will always prefer ZFS-based  
RAID. It will also save the cost of a RAID card.


Ashish Nabira
nab...@sun.com
http://sun.com
"Work is worship."


On 13-Jan-09, at 4:49 PM, Orvar Korvar wrote:

> Ok, I draw the conclusion that there is no consensus on this. Nobody  
> really knows for sure.
>
> I am in the process of converting some Windows guys to ZFS, and they  
> think that HW raid + ZFS should be better than only ZFS. I tell them  
> they should ditch their HW raid, but can not really motivate why.  
> That is why I am asking this question. And no one at Sun really  
> knows, it seems. I've asked this in another thread, no answer.
>
> I will tell people that ZFS + HW raid is good enough, and I will not  
> recommend against HW raid anymore.
> -- 
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] mirror rpool

2009-01-14 Thread JZ
thank you, at least the list is alive.

ok, let me provide some comments beyond IT, since the zpool likes it.

knowledge is power.
truth is knowledge.
one can only understand truth if one can handle truth.
and that is through learning, and reasoning.
when you possess enough knowledge, you will be able to handle the truth.
and any time before you possess enough knowledge, truth may not be handled 
correctly and it can be very counter-productive.

if you would like a universal guide to doing things, one guide has been 
around for a long time, and agreed upon by many folks.
it's called the Ten Commandments.
http://www.the-ten-commandments.org/the-ten-commandments.html

best,
z



- Original Message - 
From: "mijenix" 
To: 
Sent: Wednesday, January 14, 2009 3:58 AM
Subject: Re: [zfs-discuss] mirror rpool


> yes, that's the way zpool likes it
>
> I think I've to understand how (Open)Solaris create disks or how
> the partition thing works under OSol. Do you know any guide or howto?
> -- 
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SDXC and the future of ZFS

2009-01-14 Thread Andrew Gabriel
Roch Bourbonnais wrote:
> On 12 Jan 09, at 17:39, Carson Gaspar wrote:
>
>   
>> Joerg Schilling wrote:
>> 
>>> Fabian Wörner  wrote:
>>>
>>>   
 my post was not to start a discuss gpl<>cddl.
 It just an idea to promote ZFS and OPENSOLARIS
 If it was against anything than against exfat, nothing else!!!
 
>>> If you like to promoote ZFS, you need to understand why the party  
>>> you like
>>> to promote it to does not already use it ;-)
>>>   
>> And for SDXC, ZFS will probably never be the filesystem of choice.
>> Removable media of this type is mostly used in portable electronic
>> devices, such as cameras, cellphones, etc. All of which are power,  
>> CPU,
>> and memory limited. ZFS, while a marvelous filesystem, is incredibly  
>> RAM
>> hungry. I suspect it's CPU profile is also non-trivial for a  
>> restricted
>> performance device.
>> 
>
> I have not looked at it recently but for any access greater than ~ 16K  
> ZFS was more efficient than UFS.
> It's just one partial data point but the conventional wisdom that ZFS  
> will use more cpu is not an absolute truth.
>
> Even more so for RAM: ZFS with 128K records makes efficient use of  
> metadata. The only RAM it needs to operate is 10 seconds of
> your workload's throughput, and that can be tuned down in appliances.
>   

DOS/FAT filesystem implementations in appliances can be found in less 
than 8K code and data size (mostly that's code). Limited functionality 
implementations can be smaller than 1kB size.

-- 
Andrew
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs & iscsi sustained write performance

2009-01-14 Thread Roch

milosz writes:
 > iperf test coming out fine, actually...
 > 
 > iperf -s -w 64k
 > 
 > iperf -c -w 64k -t 900 -i 5
 > 
 > [ ID] Interval       Transfer     Bandwidth
 > [  5]  0.0-899.9 sec  81.1 GBytes    774 Mbits/sec
 > 
 > totally steady.  i could probably implement some tweaks to improve it, but 
 > if i were getting a steady 77% of gigabit i'd be very happy.
 > 

So you're trying to get from 60% to 77%. IIRC you had a
small amount of reads going on. If you can find out where
those come from and eliminate them, that could help.

Did we also cover maxrecvdataseglen? I've seen it help
throughput with the Solaris initiator:
 
iscsiadm list target | grep ^Target | awk '{print $2}' | while read x ; do
    iscsiadm modify target-param -p maxrecvdataseglen=65536 $x
done
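
To confirm the new value took effect, the per-target parameters can be listed 
afterwards (the target name below is only a placeholder):

    # list configured/negotiated parameters for one target on the Solaris initiator
    iscsiadm list target-param -v <target-iqn>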


-r

 
 > not seeing any cpu saturation with mpstat... nothing unusual other than low 
 > activity while zfs commits writes to disk (ostensibly this is when the 
 > transfer rate troughs)...
 > -- 
 > This message posted from opensolaris.org
 > ___
 > zfs-discuss mailing list
 > zfs-discuss@opensolaris.org
 > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS vs HardWare raid - data integrity?

2009-01-14 Thread JZ
Folks,
what can I post to the list to make the discussion go on?

Is this what you folks want to see? which I shared with King and High but 
not you folks?
http://www.excelsioritsolutions.com/jz/jzbrush/jzbrush.htm
This is not even IT stuff so that I never thought I should post this to the 
list...

This is getting too strange for an open discussion.
Please, folks.

Best,
z 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] mirror rpool

2009-01-14 Thread mijenix
yes, that's the way zpool likes it

I think I have to understand how (Open)Solaris creates disks, or how 
the partition thing works under OSol. Do you know of any guide or howto?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss