Re: [zfs-discuss] b134 pool borked!
90 reads and not a single comment? Not the slightest hint of what's going on? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] b134 pool borked!
This is what the output of my zpool import command looks like; attached you'll find the output of zdb -l for each device. pool: tank id: 10904371515657913150 state: ONLINE action: The pool can be imported using its name or numeric identifier. config: tank ONLINE raidz1-0 ONLINE c13t4d0 ONLINE c13t5d0 ONLINE c13t6d0 ONLINE c13t7d0 ONLINE raidz1-1 ONLINE c13t3d0 ONLINE c13t1d0 ONLINE c13t2d0 ONLINE c13t0d0 ONLINE cache c8t2d0 logs c8t0d0 ONLINE -- This message posted from opensolaris.org zdbl.gz Description: Binary data ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] b134 pool borked!
Hi, It definitely seems like a hardware-related issue, as panics from common tools like format aren't to be expected. Anyhow, you might want to start by getting all your disks to show up in iostat / cfgadm before trying to import the pool. You should replace the controller if you have not already done so, and the RAM should be all OK, I guess? Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
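For what it's worth, a pre-import sanity check along those lines might look roughly like this (the commands are only a suggestion; the device names come from the zpool import listing earlier in the thread):

  # cfgadm -al                  # do all SATA/SAS ports show up as connected/configured?
  # iostat -En | grep -i error  # are any devices accumulating hard/soft/transport errors?
  # format < /dev/null          # do c13t0d0-c13t7d0, c8t0d0 and c8t2d0 all enumerate without a panic?
  # zpool import                # list-only: does every vdev in tank still show ONLINE?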
[zfs-discuss] Another MPT issue - kernel crash
Hi all, I have faced yet another kernel panic that seems to be related to the mpt driver. This time I was trying to add a new disk to a running system (snv_134) and the new disk was not being detected. Following a tip, I ran the lsitool to reset the bus and this led to a system panic. MPT driver: BAD TRAP: type=e (#pf Page fault) rp=ff001fc98020 addr=4 occurred in module mpt due to a NULL pointer dereference. If someone has a similar problem it might be worthwhile to report it here or to add information to the filed bug, available at https://defect.opensolaris.org/bz/show_bug.cgi?id=15879 Thanks, Bruno -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [storage-discuss] iscsitgtd failed request to share on zpool import after upgrade from b104 to b134
Przem, Anybody has an idea what I can do about it? zfs set shareiscsi=off vol01/zvol01 zfs set shareiscsi=off vol01/zvol02 Doing this will have no impact on the LUs if configured under COMSTAR. This will also transparently go away with b136, when ZFS ignores the shareiscsi property. - Jim On 04/05/2010 16:43, eXeC001er execoo...@gmail.com wrote: Perhaps the problem is that the old version of pool have shareiscsi, but new version have not this option, and for share LUN via iscsi you need to make lun-mapping. 2010/5/4 Przemyslaw Ceglowski prze...@ceglowski.netmailto:prze...@ceglowski.net Jim, On May 4, 2010, at 3:45 PM, Jim Dunham wrote: On May 4, 2010, at 2:43 PM, Richard Elling wrote: On May 4, 2010, at 5:19 AM, Przemyslaw Ceglowski wrote: It does not look like it is: r...@san01a:/export/home/admin# svcs -a | grep iscsi online May_01 svc:/network/iscsi/initiator:default online May_01 svc:/network/iscsi/target:default This is COMSTAR. Thanks Richard, I am aware of that. Since you upgrade to b134, not b136 the iSCSI Target Daemon is still around, just not on our system. IPS packaging changes have not installed the iSCSI Target Daemon (among other things) by default. It is contained in IPS package known as either SUNWiscsitgt or network/iscsi/target/legacy. Visit your local package repository for updates: http://pkg.opensolaris.org/dev/ Of course starting with build 136..., iSCSI Target Daemon (and ZFS shareiscsi) are gone, so you will need to reconfigure your two ZVOLs 'vol01/zvol01' and 'vol01/zvol02', under COMSTAR soon. http://wikis.sun.com/display/OpenSolarisInfo/How+to+Configure+iSCSI+Target+Po rts http://wikis.sun.com/display/OpenSolarisInfo/COMSTAR+Administration - Jim The migrated zVols have been running under COMSTAR originally on b104 which makes me wonder even more. Is there any way I can get rid of those messages? _ Przem From: Rick McNeal [ramcn...@gmail.commailto:ramcn...@gmail.com] Sent: 04 May 2010 13:14 To: Przemyslaw Ceglowski Subject: Re: [storage-discuss] iscsitgtd failed request to share on zpool import after upgrade from b104 to b134 Look and see if the target daemon service is still enabled. COMSTAR has been the official scsi target project for a while now. In fact, the old iscscitgtd was removed in build 136. For Nexenta, the old iscsi target was removed in 3.0 (based on b134). -- richard It does not answer my original question. -- Przem Rick McNeal On May 4, 2010, at 5:38 AM, Przemyslaw Ceglowski prze...@ceglowski.netmailto:prze...@ceglowski.net wrote: Hi, I am posting my question to both storage-discuss and zfs-discuss as I am not quite sure what is causing the messages I am receiving. I have recently migrated my zfs volume from b104 to b134 and upgraded it from zfs version 14 to 22. It consist of two zvol's 'vol01/zvol01' and 'vol01/zvol02'. During zpool import I am getting a non-zero exit code, however the volume is imported successfuly. Could you please help me to understand what could be the reason of those messages? 
r...@san01a:/export/home/admin#zpool import vol01 r...@san01a:/export/home/admin#cannot share 'vol01/zvol01': iscsitgtd failed request to share r...@san01a:/export/home/admin#cannot share 'vol01/zvol02': iscsitgtd failed request to share Many thanks, Przem ___ storage-discuss mailing list storage-disc...@opensolaris.orgmailto:storage-disc...@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/storage-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.orgmailto:zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- ZFS storage and performance consulting at http://www.RichardElling.com ___ storage-discuss mailing list storage-disc...@opensolaris.orgmailto:storage-disc...@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/storage-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.orgmailto:zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
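If the two zvols do need to be re-exposed over iSCSI under COMSTAR (instead of the legacy iscsitgtd), the usual sequence is roughly as follows; this is only a sketch, and the GUID placeholder has to come from the sbdadm output on the system itself:

  # svcadm enable -r svc:/system/stmf:default
  # svcadm enable -r svc:/network/iscsi/target:default
  # sbdadm create-lu /dev/zvol/rdsk/vol01/zvol01   # prints the GUID of the new logical unit
  # stmfadm add-view <GUID-from-sbdadm>            # default view: visible to all hosts and ports
  # itadm create-target                            # creates an iSCSI target with a generated IQN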
Re: [zfs-discuss] Another MPT issue - kernel crash
On 5/05/10 10:42 PM, Bruno Sousa wrote: Hi all, I have faced yet another kernel panic that seems to be related to mpt driver. This time i was trying to add a new disk to a running system (snv_134) and this new disk was not being detected...following a tip i ran the lsitool to reset the bus and this lead to a system panic. MPT driver : BAD TRAP: type=e (#pf Page fault) rp=ff001fc98020 addr=4 occurred in module mpt due to a NULL pointer dereference If someone has a similar problem it might be worthwhile to expose it here or to add information to the filled bug , available at https://defect.opensolaris.org/bz/show_bug.cgi?id=15879 That's an already-known CR, tracked in Bugster. I've updated defect.o.o and transferred your info to the Bugster CR, 6895862. Until the nightly inside-outside bugs.o.o sync up it'll still show up as closed, but don't worry, I've re-opened it. James C. McPherson -- Senior Software Engineer, Solaris Oracle http://www.jmcp.homeunix.com/blog ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Another MPT issue - kernel crash
Hi James, Thanks for the information, and if there's any test/command to be done on this server, just let me know it. Regards, Bruno On 5-5-2010 15:38, James C. McPherson wrote: On 5/05/10 10:42 PM, Bruno Sousa wrote: Hi all, I have faced yet another kernel panic that seems to be related to mpt driver. This time i was trying to add a new disk to a running system (snv_134) and this new disk was not being detected...following a tip i ran the lsitool to reset the bus and this lead to a system panic. MPT driver : BAD TRAP: type=e (#pf Page fault) rp=ff001fc98020 addr=4 occurred in module mpt due to a NULL pointer dereference If someone has a similar problem it might be worthwhile to expose it here or to add information to the filled bug , available at https://defect.opensolaris.org/bz/show_bug.cgi?id=15879 That's an already-known CR, tracked in Bugster. I've updated defect.o.o and transferred your info to the Bugster CR, 6895862. Until the nightly inside-outside bugs.o.o sync up it'll still show up as closed, but don't worry, I've re-opened it. James C. McPherson -- Senior Software Engineer, Solaris Oracle http://www.jmcp.homeunix.com/blog ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
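If a crash dump was saved, the panic stack and message buffer can be pulled out of it with mdb and attached to the CR; the paths below assume the default /var/crash/<hostname> location and dump number 0:

  # cd /var/crash/`hostname`
  # mdb unix.0 vmcore.0
  > ::status   # panic string and dump details
  > ::stack    # stack of the panicking thread (should show the mpt frame)
  > ::msgbuf   # console messages leading up to the panic
  > $q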
[zfs-discuss] Using local raw disks for an opensolaris b134 virtualized host under ESXi 4
Hi all, I would like to install a virtual SAN using OpenSolaris b134 under an ESXi 4 host. Instead of using VMFS datastores I would like to use local raw disks on the ESXi 4 host: http://www.mattiasholm.com/node/33. Has anybody tried this? Are there any problems in doing so? Or is it better to use VMFS than raw local disks? Thanks. -- CL Martinez carlopmart {at} gmail {d0t} com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [indiana-discuss] image-update doesn't work anymore (bootfs not supported on EFI)
On 5/5/10 1:44 AM, Christian Thalinger wrote: On Tue, 2010-05-04 at 16:19 -0600, Evan Layton wrote: Can you try the following and see if it really thinks it's an EFI lable? # dd if=/dev/dsk/c12t0d0s2 of=x skip=512 bs=1 count=10 # cat x This may help us determine if this is another instance of bug 6860320 # dd if=/dev/dsk/c12t0d0s2 of=x skip=512 bs=1 count=10 10+0 records in 10+0 records out 10 bytes (10 B) copied, 0.0259365 s, 0.4 kB/s # cat x # od x 000 00 00 00 00 00 012 # Doesn't look like an EFI label. No that doesn't appear like an EFI label. So it appears that ZFS is seeing something there that it's interpreting as an EFI label. Then the command to set the bootfs property is failing due to that. To restate the problem the BE can't be activated because we can't set the bootfs property of the root pool and even the ZFS command to set it fails with property 'bootfs' not supported on EFI labeled devices for example the following command: # zfs set bootfs=rpool/ROOT/opensolaris rpool fails with that same error message. Do you have any of the older BEs like build 134 that you can boot back to and see if those will allow you to set the bootfs property on the root pool? It's just really strange that out of nowhere it started thinking that the device is EFI labeled. I'm including zfs-discuss to get the ZFS folks thoughts on the issue. -evan More info from original thread: On 05/ 4/10 10:45 AM, Christian Thalinger wrote: On Tue, 2010-05-04 at 10:36 -0500, Shawn Walker wrote: What confuses me is that the update from b133 to b134 obviously worked before--because I have a b134 image--but it doesn't now. I'm on b135 myself and haven't seen this issue yet. I can't think of anything I did that changed anything on the disk or the partition table, whatever that could be. Or is this because I tried to install b137 and that changed something? What does your partition layout look like? Not sure how I can print the partition to show what you want to see. Maybe this: format current Current Disk = c12t0d0 DEFAULT cyl 38910 alt 2 hd 255 sec 63 /p...@0,0/pci10de,c...@b/d...@0,0 format verify Primary label contents: Volume name = ascii name =DEFAULT cyl 38910 alt 2 hd 255 sec 63 pcyl = 38912 ncyl = 38910 acyl = 2 bcyl = 0 nhead = 255 nsect = 63 Part Tag Flag Cylinders Size Blocks 0 root wm 1 - 38909 298.06GB (38909/0/0) 625073085 1 unassigned wm 0 0 (0/0/0) 0 2 backup wu 0 - 38909 298.07GB (38910/0/0) 625089150 3 unassigned wm 0 0 (0/0/0) 0 4 unassigned wm 0 0 (0/0/0) 0 5 unassigned wm 0 0 (0/0/0) 0 6 unassigned wm 0 0 (0/0/0) 0 7 unassigned wm 0 0 (0/0/0) 0 8 boot wu 0 - 0 7.84MB (1/0/0) 16065 9 unassigned wm 0 0 (0/0/0) 0 format How are you booting the system? (rEFIt?) No, I just installed OpenSolaris. Ah, only OS then? ... Only bug I see possibly related is 6929493 (in the sense that changes for the bug may have triggered this issue possibly). A few days ago I noticed that the new boot environment is actually there and can be booted despite the ZFS error. I installed b138 today and it works, but I get this error on updating. So, there are some ZFS bugs that seem related, although some of them are supposedly already fixed and I'm not certain that others relate: http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6740164 http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6860320 Did you recently attach or import any zpools? Also, when you originally installed the OS, did you completely erase the drive before installing? 
I've run into problems in the past where fixes or changes have caused the OS to check partition headers and other areas for signatures that were leftover by other disk utilities and gave me grief. So, to be clear, sometime after you updated to b134, you could no longer update to any other builds because it gave you a message like this? be_get_uuid: failed to get uuid property from BE root dataset user ... set_bootfs: failed to set bootfs property for pool rpool: property 'bootfs' not supported on EFI labeled devices be_activate: failed to set bootfs pool property for rpool/ROOT/opensolaris-135 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
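A couple of additional checks that might help narrow down where the phantom EFI label is coming from; the device name is the one from the thread, and the commands are only suggestions:

  # prtvtoc /dev/rdsk/c12t0d0s2                           # a normal SMI/VTOC label should print cleanly
  # zdb -l /dev/rdsk/c12t0d0s0 | egrep 'path|whole_disk'  # does the ZFS label claim the whole disk?
  # zpool set bootfs=rpool/ROOT/opensolaris-135 rpool     # retry from an older BE, as suggested above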
Re: [zfs-discuss] b134 pool borked!
Thanks for your reply! I ran memtest86 and it did not report any errors. The disk controller I have not replaced yet. The server is up in multi-user mode with the broken pool in an un-imported state. format now works and properly lists all my devices without panicking. zpool import poolname panics the box with the same stack trace as above. Could it still be the disk controller? I'd jump through the roof with happiness if that's the case. It's one of those Supermicro thumper controllers. Does anyone know any good non-destructive diagnostics to run? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
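Some non-destructive checks that are often used to implicate (or clear) a controller, offered only as a starting point:

  # fmdump -eV | more   # any PCI/transport error telemetry logged around the time of the panics?
  # fmadm faulty        # has FMA already diagnosed a fault against the HBA or a disk?
  # iostat -En          # per-device hard/soft/transport error counters
  # cfgadm -al          # do all targets behind the suspect controller still enumerate?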
Re: [zfs-discuss] [indiana-discuss] image-update doesn't work anymore (bootfs not supported on EFI)
On 5/5/10 10:22 AM, Christian Thalinger wrote: On Wed, 2010-05-05 at 09:45 -0600, Evan Layton wrote: No that doesn't appear like an EFI label. So it appears that ZFS is seeing something there that it's interpreting as an EFI label. Then the command to set the bootfs property is failing due to that. To restate the problem the BE can't be activated because we can't set the bootfs property of the root pool and even the ZFS command to set it fails with property 'bootfs' not supported on EFI labeled devices for example the following command: # zfs set bootfs=rpool/ROOT/opensolaris rpool fails with that same error message. I guess you mean zpool, but yes: Yes that's what I meant (I hate when my fingers betray me like that) ;-) # zpool set bootfs=rpool/ROOT/opensolaris-138 rpool cannot set property for 'rpool': property 'bootfs' not supported on EFI labeled devices Do you have any of the older BEs like build 134 that you can boot back to and see if those will allow you to set the bootfs property on the root pool? It's just really strange that out of nowhere it started thinking that the device is EFI labeled. I have a couple of BEs I could boot to: $ beadm list BE Active Mountpoint Space Policy Created -- -- -- - -- --- opensolaris - - 1.00G static 2009-10-01 08:00 opensolaris-124 - - 20.95M static 2009-10-03 13:30 opensolaris-125 - - 30.00M static 2009-10-17 15:18 opensolaris-126 - - 25.33M static 2009-10-29 20:18 opensolaris-127 - - 1.37G static 2009-11-14 13:20 opensolaris-128 - - 1.91G static 2009-12-04 14:28 opensolaris-129 - - 22.49M static 2009-12-12 11:31 opensolaris-130 - - 21.64M static 2009-12-26 19:46 opensolaris-131 - - 24.72M static 2010-01-22 22:51 opensolaris-132 - - 57.32M static 2010-02-09 23:05 opensolaris-133 - - 1.07G static 2010-02-20 12:55 opensolaris-134 N / 43.17G static 2010-03-08 21:58 opensolaris-138 R - 1.81G static 2010-05-04 12:03 I will try on 132 or 133. Get back to you later. Thanks! ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore
On May 4, 2010, at 7:55 AM, Bob Friesenhahn wrote: On Mon, 3 May 2010, Richard Elling wrote: This is not a problem on Solaris 10. It can affect OpenSolaris, though. That's precisely the opposite of what I thought. Care to explain? In Solaris 10, you are stuck with LiveUpgrade, so the root pool is not shared with other boot environments. Richard, You have fallen out of touch with Solaris 10, which is still a moving target. While the Live Upgrade commands you are familiar with in Solaris 10 still mostly work as before, they *do* take advantage of zfs's features and boot environments do share the same root pool just like in OpenSolaris. Solaris 10 Live Upgrade is dramatically improved in conjunction with zfs boot. I am not sure how far behind it is from OpenSolaris new boot administration tools but under zfs its function can not be terribly different. Bob and Ian are right. I was trying to remember the last time I installed Solaris 10, and the best I can recall, it was around late fall 2007. The fine folks at Oracle have been making improvements to the product since then, even though no new significant features have been added since that time :-( -- richard -- ZFS storage and performance consulting at http://www.RichardElling.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
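For reference, with a ZFS root the Solaris 10 Live Upgrade workflow really is just a clone inside the same root pool, roughly (the BE name and media path are examples only):

  # lucreate -n s10-patched                        # clones the current BE within rpool
  # luupgrade -u -n s10-patched -s /path/to/media  # or simply patch the new BE instead
  # luactivate s10-patched
  # init 6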
[zfs-discuss] zfs destroy -f and dataset is busy?
We have a pair of OpenSolaris systems running snv_124. Our main zpool 'z' is running ZFS pool version 18. Problem: #zfs destroy -f z/Users/harri...@zfs-auto-snap:daily-2010-04-09-00:00 cannot destroy 'z/Users/harri...@zfs-auto-snap:daily-2010-04-09-00:00': dataset is busy I have tried unmounting the filesystem and remounting it (same problem) and exporting and importing the zpool; I am unable to destroy numerous datasets even with the -f option. The zpool scrub completes without errors: 12:51pm taurus/harrison [~] 182#zpool status z pool: z state: ONLINE scrub: scrub completed after 17h32m with 0 errors on Sun May 2 18:47:38 2010 config: NAME STATE READ WRITE CKSUM z ONLINE 0 0 0 raidz2 ONLINE 0 0 0 Any suggestions would be greatly appreciated. Thanks in advance, -C ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
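One thing worth checking on snv_124 (pool version 18 supports user holds) is whether the snapshot is pinned by a hold or by a clone; the snapshot name below is abbreviated the same way as above, so substitute the full name:

  # zfs holds z/Users/<user>@zfs-auto-snap:daily-2010-04-09-00:00           # any user holds on it?
  # zfs release <tag> z/Users/<user>@zfs-auto-snap:daily-2010-04-09-00:00   # if so, release them first
  # zfs list -t filesystem,volume -o name,origin | grep daily-2010-04-09    # is a clone descended from it?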
[zfs-discuss] Does Opensolaris support thin reclamation?
Support for thin reclamation depends on the SCSI WRITE SAME command; see this draft of a document from T10: http://www.t10.org/ftp/t10/document.05/05-270r0.pdf. I spent some time searching the source code for support for WRITE SAME, but I wasn't able to find much. I assume that if it was supported, it would be listed in this header file: http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/sys/scsi/generic/commands.h Does anyone know for certain whether Opensolaris supports thin reclamation on thinly-provisioned LUNs? If not, is anyone interested in or actively working on this? I'm especially interested in ZFS' support for thin reclamation, but I would be interested in hearing about support (or lack of) for UFS and SVM as well. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ZIL behavior on import
All, I had a question regarding how the ZIL interacts with zpool import: Given that the intent log is replayed in the event of a system failure, does the replay behavior differ if -f is passed to zpool import? For example, if I have a system which fails prior to completing a series of writes and I reboot using a failsafe (i.e. install disc), will the log be replayed after a zpool import -f ? Regards, Steve ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZIL behavior on import
On 05/05/2010 20:45, Steven Stallion wrote: All, I had a question regarding how the ZIL interacts with zpool import: Given that the intent log is replayed in the event of a system failure, does the replay behavior differ if -f is passed to zpool import? For example, if I have a system which fails prior to completing a series of writes and I reboot using a failsafe (i.e. install disc), will the log be replayed after a zpool import -f ? yes -- Robert Milkowski http://milek.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] How to completely eradicate ZFS
Stupid question time. I have a CF card on which I placed a ZFS volume. Now I want to put a UFS volume on it instead, but I cannot seem to get rid of the ZFS information on the drive. I have tried clearing and recreating the partition table with fdisk. I have tried clearing the labels and VTOC, but when I put the Solaris partition on the disk again the ZFS information seemingly reappears and the system complains that it cannot mount the ZFS rpool. Any help would be appreciated. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Mirroring USB Drive with Laptop for Backup purposes
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Matt Keenan Just wondering whether mirroring a USB drive with main laptop disk for backup purposes is recommended or not. Plan would be to connect the USB drive, once or twice a week, let it resilver, and then disconnect again. Connecting USB drive 24/7 would AFAIK have performance issues for the Laptop. MMmmm... If it works, sounds good. But I don't think it'll work as expected, for a number of reasons, outlined below. The suggestion I would have instead, would be to make the external drive its own separate zpool, and then you can incrementally zfs send | zfs receive onto the external. Here are the obstacles I think you'll have with your proposed solution: #1 I think all the entire used portion of the filesystem needs to resilver every time. I don't think there's any such thing as an incremental resilver. #2 How would you plan to disconnect the drive? If you zpool detach it, I think it's no longer a mirror, and not mountable. If you simply yank out the plug ... although that might work, it would certainly be nonideal. If you power off, disconnect, and power on ... Again, it should probably be fine, but it's not designed to be used that way intentionally, so your results ... are probably as-yet untested. I don't want to go on. This list could go on forever. I will strongly encourage you to simply use zfs send | zfs receive because that's a standard practice thing to do. It is known that the external drive is not bootable this way, but that's why you have this article on how to make it bootable: http://docs.sun.com/app/docs/doc/819-5461/ghzur?l=ena=view This would have the added benefit of the USB drive being bootable. By default, AFAIK, that's not correct. When you mirror rpool to another device, by default the 2nd device is not bootable, because it's just got an rpool in there. No boot loader. Even if you do this mirror idea, which I believe will be slower and less reliable than zfs send | zfs receive you still haven't gained anything as compared to the zfs send | zfs receive procedure, which is known to work reliable with optimal performance. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
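A minimal sketch of that send/receive workflow, assuming the external disk carries its own pool (pool, device and snapshot names are invented):

  # zpool create usbbackup c9t0d0                                # one-time setup of the external pool
  # zfs snapshot -r rpool@weekly-1
  # zfs send -R rpool@weekly-1 | zfs receive -Fdu usbbackup
  (a week later, only the changes need to travel)
  # zfs snapshot -r rpool@weekly-2
  # zfs send -R -I rpool@weekly-1 rpool@weekly-2 | zfs receive -Fdu usbbackup
  # zpool export usbbackup                                       # before unplugging the drive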
Re: [zfs-discuss] How to completely eradicate ZFS
It probably put an EFI label on the disk. Try wiping the first AND last 2MB. --M -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of nich romero Sent: Wednesday, May 05, 2010 1:00 PM To: zfs-discuss@opensolaris.org Subject: [zfs-discuss] How to completely erradicate ZFS Stupid question time. I have a CF Card that I place a ZFS volume. Now I want to put a UFS volume on it instead but I can not seem to get ride of the ZFS information on the drive. I have tried clearing and recreating the Partition Table with fdisk. I have tried clearing the labels and VTOC but when I put the Solaris partition on the disk again the ZFS information seeming reapears and the system complains that is cannot mount ZFS rpool. Any help would be appreciated. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
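A sketch of that wipe; the device is the whole-disk node and SECTORS is whatever total the label reports (both values below are examples only, so substitute the real device and the size reported by format or prtvtoc):

  # DISK=/dev/rdsk/c7t0d0p0
  # SECTORS=15622144                                                          # example value only
  # dd if=/dev/zero of=$DISK bs=512 count=4096                                # first 2 MB
  # dd if=/dev/zero of=$DISK bs=512 seek=`expr $SECTORS - 4096` count=4096    # last 2 MB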
Re: [zfs-discuss] How to completely eradicate ZFS
On Wed, May 5, 2010 at 1:36 PM, Matt Cowger mcow...@salesforce.com wrote: It probably put an EFI label on the disk. Try doing a wiping the first AND last 2MB. If nothing else works, the following should definitely do it: dd if=/dev/zero of=/dev/whatever bs=1M That will write zeroes to every bit of the drive, start to finish. You can play around with the block size (bs). -- Freddie Cash fjwc...@gmail.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] b134 pool borked!
I got a suggestion to check what fmdump -eV gave to look for PCI errors if the controller might be broken. Attached you'll find the last panic's fmdump -eV. It indicates that ZFS can't open the drives. That might suggest a broken controller, but my slog is on the motherboard's internal controller. One might think that the motherboard itself might be toast or do we have a case of unstable power? -- This message posted from opensolaris.orgMay 04 2010 19:44:31.716566239 ereport.fs.zfs.vdev.open_failed nvlist version: 0 class = ereport.fs.zfs.vdev.open_failed ena = 0xeeed67dca00c01 detector = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = zfs pool = 0x97541c1ea1ad833e vdev = 0x645834a4c69584e5 (end detector) pool = tank pool_guid = 0x97541c1ea1ad833e pool_context = 1 pool_failmode = wait vdev_guid = 0x645834a4c69584e5 vdev_type = disk vdev_path = /dev/dsk/c13t1d0s0 vdev_devid = id1,s...@sata_wdc_wd5001aals-0_wd-wmasy3260051/a parent_guid = 0x6041a7903a345374 parent_type = raidz prev_state = 0x1 __ttl = 0x1 __tod = 0x4be05cff 0x2ab5eedf May 04 2010 19:44:31.716565705 ereport.fs.zfs.vdev.open_failed nvlist version: 0 class = ereport.fs.zfs.vdev.open_failed ena = 0xeeed67dca00c01 detector = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = zfs pool = 0x97541c1ea1ad833e vdev = 0x928ecd01b281b313 (end detector) pool = tank pool_guid = 0x97541c1ea1ad833e pool_context = 1 pool_failmode = wait vdev_guid = 0x928ecd01b281b313 vdev_type = disk vdev_path = /dev/dsk/c13t2d0s0 vdev_devid = id1,s...@sata_samsung_hd103si___s1vsj90sc22634/a parent_guid = 0x6041a7903a345374 parent_type = raidz prev_state = 0x1 __ttl = 0x1 __tod = 0x4be05cff 0x2ab5ecc9 May 04 2010 19:44:31.716565713 ereport.fs.zfs.vdev.open_failed nvlist version: 0 class = ereport.fs.zfs.vdev.open_failed ena = 0xeeed67dca00c01 detector = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = zfs pool = 0x97541c1ea1ad833e vdev = 0xc6c893601f1263cb (end detector) pool = tank pool_guid = 0x97541c1ea1ad833e pool_context = 1 pool_failmode = wait vdev_guid = 0xc6c893601f1263cb vdev_type = disk vdev_path = /dev/dsk/c8t0d0s0 vdev_devid = id1,s...@sata_intel_ssdsa2m080__cvpo003401vt080bgn/a parent_guid = 0x97541c1ea1ad833e parent_type = root prev_state = 0x1 __ttl = 0x1 __tod = 0x4be05cff 0x2ab5ecd1 May 04 2010 19:44:31.716566468 ereport.fs.zfs.vdev.open_failed nvlist version: 0 class = ereport.fs.zfs.vdev.open_failed ena = 0xeeed67dca00c01 detector = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = zfs pool = 0x97541c1ea1ad833e vdev = 0x381e0480469b4ed7 (end detector) pool = tank pool_guid = 0x97541c1ea1ad833e pool_context = 1 pool_failmode = wait vdev_guid = 0x381e0480469b4ed7 vdev_type = disk vdev_path = /dev/dsk/c13t3d0s0 vdev_devid = id1,s...@sata_samsung_hd103si___s1vsj90sc22045/a parent_guid = 0x6041a7903a345374 parent_type = raidz prev_state = 0x1 __ttl = 0x1 __tod = 0x4be05cff 0x2ab5efc4 May 04 2010 19:44:31.716566182 ereport.fs.zfs.vdev.open_failed nvlist version: 0 class = ereport.fs.zfs.vdev.open_failed ena = 0xeeed67dca00c01 detector = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = zfs pool = 0x97541c1ea1ad833e vdev = 0x6e5ce9b416a3f8a4 (end detector) pool = tank pool_guid = 0x97541c1ea1ad833e pool_context = 1 pool_failmode = wait vdev_guid = 0x6e5ce9b416a3f8a4 vdev_type = disk vdev_path = /dev/dsk/c13t6d0s0 vdev_devid = id1,s...@sata_wdc_wd6400aacs-0_wd-wcauf0934679/a parent_guid = 0x4491e617ebc26c75 parent_type = raidz prev_state = 0x1 __ttl = 0x1 __tod = 0x4be05cff 0x2ab5eea6 May 04 2010 
19:44:31.716565740 ereport.fs.zfs.vdev.open_failed nvlist version: 0 class = ereport.fs.zfs.vdev.open_failed ena = 0xeeed67dca00c01 detector = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = zfs pool = 0x97541c1ea1ad833e vdev = 0x69f0986c92adda53
Re: [zfs-discuss] Mirroring USB Drive with Laptop for Backup purposes
* Edward Ned Harvey (solar...@nedharvey.com) wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Matt Keenan Just wondering whether mirroring a USB drive with main laptop disk for backup purposes is recommended or not. Plan would be to connect the USB drive, once or twice a week, let it resilver, and then disconnect again. Connecting USB drive 24/7 would AFAIK have performance issues for the Laptop. MMmmm... If it works, sounds good. But I don't think it'll work as expected, for a number of reasons, outlined below. It used to work for James Gosling. http://blogs.sun.com/jag/entry/solaris_and_os_x [snip] This would have the added benefit of the USB drive being bootable. By default, AFAIK, that's not correct. When you mirror rpool to another device, by default the 2nd device is not bootable, because it's just got an rpool in there. No boot loader. That's true, but easily fixed (just like for any other mirrored pool configuration). installgrub -m /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/{disk} Even if you do this mirror idea, which I believe will be slower and less reliable than zfs send | zfs receive you still haven't gained anything as compared to the zfs send | zfs receive procedure, which is known to work reliable with optimal performance. How about ease-of-use, all you have to do is plug in the usb disk and zfs will 'do the right thing'. You don't have to remember to run zfs send | zfs receive, or bother with figuring out what to send/recv etc etc etc. Cheers, -- Glenn ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore
On 05/ 6/10 05:32 AM, Richard Elling wrote: On May 4, 2010, at 7:55 AM, Bob Friesenhahn wrote: On Mon, 3 May 2010, Richard Elling wrote: This is not a problem on Solaris 10. It can affect OpenSolaris, though. That's precisely the opposite of what I thought. Care to explain? In Solaris 10, you are stuck with LiveUpgrade, so the root pool is not shared with other boot environments. Richard, You have fallen out of touch with Solaris 10, which is still a moving target. While the Live Upgrade commands you are familiar with in Solaris 10 still mostly work as before, they *do* take advantage of zfs's features and boot environments do share the same root pool just like in OpenSolaris. Solaris 10 Live Upgrade is dramatically improved in conjunction with zfs boot. I am not sure how far behind it is from OpenSolaris new boot administration tools but under zfs its function can not be terribly different. Bob and Ian are right. I was trying to remember the last time I installed Solaris 10, and the best I can recall, it was around late fall 2007. The fine folks at Oracle have been making improvements to the product since then, even though no new significant features have been added since that time :-( ZFS boot? -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Mirroring USB Drive with Laptop for Backup purposes
Glenn Lagasse wrote: How about ease-of-use, all you have to do is plug in the usb disk and zfs will 'do the right thing'. You don't have to remember to run zfs send | zfs receive, or bother with figuring out what to send/recv etc etc etc. It should be possible to automate that via syseventd/syseventconfd. Sadly the documentation is a bit... um... sparse... -- Carson ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Mirroring USB Drive with Laptop for Backup purposes
On Wed, May 05, 2010 at 04:34:13PM -0400, Edward Ned Harvey wrote: The suggestion I would have instead, would be to make the external drive its own separate zpool, and then you can incrementally zfs send | zfs receive onto the external. I'd suggest doing both, to different destinations :) Each kind of backup serves different, complementary purposes. #1 I think all the entire used portion of the filesystem needs to resilver every time. I don't think there's any such thing as an incremental resilver. Incorrect. It will play forward all the (still-live) blocks from txg's between the time it was last online and now. That said, I'd also recommend a scrub on a regular basis, once the resilver has completed, and that will trawl through all the data and take all that time you were worried about anyway. For a 200G disk, full, over usb, I'd expect around 4-5 hours. That's fine for a leave running overnight workflow. This is the benefit of this kind of backup - as well as being almost brainless to initiate, it's able to automatically repair marginal sectors on the laptop disk if they become unreadable, saving you from the hassle of trying to restore damaged files. The send|recv kind of backup is much better for restoring data from old snapshots (if the target is larger than the source and keeps them longer), and recovering from accidentally destroying both mirrored copies of data due to operator error. #2 How would you plan to disconnect the drive? If you zpool detach it, I think it's no longer a mirror, and not mountable. That's correct - which is why you would use zpool offline. -- Dan. pgpgbQjfYhj6R.pgp Description: PGP signature ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
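Putting the pieces of this thread together, the mirror-based variant would look roughly like this (device names are placeholders):

  # zpool attach rpool c0t0d0s0 c5t0d0s0      # one-time: add the USB disk as a mirror of the laptop disk
  # installgrub -m /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c5t0d0s0
  (before unplugging the drive)
  # zpool offline rpool c5t0d0s0
  (after reconnecting it; the incremental resilver of the intervening txgs runs automatically)
  # zpool online rpool c5t0d0s0
  # zpool scrub rpool                         # periodically, as recommended above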
Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore
Hi Euan, You might find some of this useful: http://breden.org.uk/2009/08/29/home-fileserver-mirrored-ssd-zfs-root-boot/ http://breden.org.uk/2009/08/30/home-fileserver-zfs-boot-pool-recovery/ I backed up the rpool to a single file, which I believe is frowned upon due to the consequences of an error occurring within the sent stream, but sending to a filesystem instead will fix this aspect, and you may still find the rest of it useful. Cheers, Simon -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to completely eradicate ZFS
On May 5, 2010, at 12:59 PM, nich romero wrote: Stupid question time. I have a CF Card that I place a ZFS volume. Now I want to put a UFS volume on it instead but I can not seem to get ride of the ZFS information on the drive. I have tried clearing and recreating the Partition Table with fdisk. I have tried clearing the labels and VTOC but when I put the Solaris partition on the disk again the ZFS information seeming reapears and the system complains that is cannot mount ZFS rpool. Any help would be appreciated. The system won't care unless it is expected to import rpool. Use zdb -C and see if the cache expects to import the pool. If so, export it. If not, please show the exact error message you see. -- richard -- ZFS storage and performance consulting at http://www.RichardElling.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Mirroring USB Drive with Laptop for Backup purposes
On Wed, 5 May 2010, Edward Ned Harvey wrote: Here are the obstacles I think you'll have with your proposed solution: #1 I think all the entire used portion of the filesystem needs to resilver every time. I don't think there's any such thing as an incremental resilver. It sounds like you are not sure. Maybe you should be sure. Yes, I do think that it is a wise idea if you are really sure. See Transactional pruning at http://blogs.sun.com/bonwick/entry/smokin_mirrors and then Top-down resilvering. This would have the added benefit of the USB drive being bootable. By default, AFAIK, that's not correct. When you mirror rpool to another device, by default the 2nd device is not bootable, because it's just got an rpool in there. No boot loader. Unless it was added at install time, or the user added a boot loader. It is quite doable since it is the normal case as when a system is installed onto a mirror pair of disks. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Mirroring USB Drive with Laptop for Backup purposes
On Thu, 6 May 2010, Daniel Carosone wrote: That said, I'd also recommend a scrub on a regular basis, once the resilver has completed, and that will trawl through all the data and take all that time you were worried about anyway. For a 200G disk, full, over usb, I'd expect around 4-5 hours. That's fine for a leave running overnight workflow. When I have simply powered down a mirror disk in a USB-based mirrored pair, I have noticed that it seems that zfs must be doing its own little secret scrub of the restored disk without me requesting it even though 'zpool status' does not mention it and it says the disk is resilvered. The flashing lights annoyed me so I exported and imported the pool and then the flashing lights were gone. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore
On Thu, 6 May 2010, Ian Collins wrote: Bob and Ian are right. I was trying to remember the last time I installed Solaris 10, and the best I can recall, it was around late fall 2007. The fine folks at Oracle have been making improvements to the product since then, even though no new significant features have been added since that time :-( ZFS boot? I think that Richard is referring to the fact that the PowerPC/Cell Solaris 10 port for the Sony Playstation III never emerged. ;-) Other than desktop features, as a Solaris 10 user I have seen OpenSolaris kernel features continually percolate down to Solaris 10 so I don't feel as left out as Richard would like me to feel. From a zfs standpoint, Solaris 10 does not seem to be behind the currently supported OpenSolaris release. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore
On Wed, May 05, 2010 at 04:31:08PM -0700, Bob Friesenhahn wrote: On Thu, 6 May 2010, Ian Collins wrote: Bob and Ian are right. I was trying to remember the last time I installed Solaris 10, and the best I can recall, it was around late fall 2007. The fine folks at Oracle have been making improvements to the product since then, even though no new significant features have been added since that time :-( ZFS boot? I think that Richard is referring to the fact that the PowerPC/Cell Solaris 10 port for the Sony Playstation III never emerged. ;-) Other than desktop features, as a Solaris 10 user I have seen OpenSolaris kernel features continually percolate down to Solaris 10 so I don't feel as left out as Richard would like me to feel. From a zfs standpoint, Solaris 10 does not seem to be behind the currently supported OpenSolaris release. Bob Well, being able to remove ZIL devices is one important feature missing. Hopefully in U9. :) Ray ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Different devices with the same name in zpool status
I know for certain that my rpool and tank pool are not both using c6t0d0 and c6t1d0, but that's what zpool status is showing. It appears to be an output bug, or a problem with the zpool.cache, since format shows my rpool devices at c8t0d0 and c8t1d0. What's the right way to fix this? Do nothing? boot -r? Remove /etc/zfs/zpool.cache? Edit or remove /etc/path_to_inst and let boot-r fix it? -B bh...@basestar:~$ zpool status pool: rpool state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM rpool ONLINE 0 0 0 mirror-0ONLINE 0 0 0 c6t0d0s0 ONLINE 0 0 0 c6t1d0s0 ONLINE 0 0 0 errors: No known data errors pool: tank state: ONLINE scrub: scrub completed after 6h35m with 0 errors on Tue May 4 16:29:46 2010 config: NAMESTATE READ WRITE CKSUM tankONLINE 0 0 0 raidz2-0 ONLINE 0 0 0 c6t0d0 ONLINE 0 0 0 c6t1d0 ONLINE 0 0 0 c6t2d0 ONLINE 0 0 0 c6t3d0 ONLINE 0 0 0 c6t4d0 ONLINE 0 0 0 c6t5d0 ONLINE 0 0 0 c6t6d0 ONLINE 0 0 0 c6t7d0 ONLINE 0 0 0 logs c7t0d0s0 ONLINE 0 0 0 cache c7t0d0s1 ONLINE 0 0 0 bh...@basestar:~$ pfexec format -e /dev/null Searching for disks...done AVAILABLE DISK SELECTIONS: 0. c6t0d0 ATA-WDC WD10EADS-00L-1A01-931.51GB /p...@0,0/pci10de,3...@a/pci8086,3...@0/pci11ab,1...@4/d...@0,0 1. c6t1d0 ATA-WDC WD10EADS-00L-1A01-931.51GB /p...@0,0/pci10de,3...@a/pci8086,3...@0/pci11ab,1...@4/d...@1,0 2. c6t2d0 ATA-WDC WD10EADS-00L-1A01-931.51GB /p...@0,0/pci10de,3...@a/pci8086,3...@0/pci11ab,1...@4/d...@2,0 3. c6t3d0 ATA-WDC WD10EADS-00L-1A01-931.51GB /p...@0,0/pci10de,3...@a/pci8086,3...@0/pci11ab,1...@4/d...@3,0 4. c6t4d0 ATA-WDC WD10EADS-00L-1A01-931.51GB /p...@0,0/pci10de,3...@a/pci8086,3...@0/pci11ab,1...@4/d...@4,0 5. c6t5d0 ATA-WDC WD10EADS-00L-1A01-931.51GB /p...@0,0/pci10de,3...@a/pci8086,3...@0/pci11ab,1...@4/d...@5,0 6. c6t6d0 ATA-WDC WD10EADS-00L-1A01-931.51GB /p...@0,0/pci10de,3...@a/pci8086,3...@0/pci11ab,1...@4/d...@6,0 7. c6t7d0 ATA-WDC WD10EADS-00L-1A01-931.51GB /p...@0,0/pci10de,3...@a/pci8086,3...@0/pci11ab,1...@4/d...@7,0 8. c7t0d0 DEFAULT cyl 3889 alt 2 hd 255 sec 63 /p...@0,0/pci1043,8...@5,2/d...@0,0 9. c8t0d0 ATA-WDCWD1200BEVT-0-1A01 cyl 14590 alt 2 hd 255 sec 63 /p...@0,0/pci1043,8...@5,1/d...@0,0 10. c8t1d0 ATA-WDCWD1200BEVT-0-1A01 cyl 14590 alt 2 hd 255 sec 63 /p...@0,0/pci1043,8...@5,1/d...@1,0 -- Brandon High : bh...@freaks.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Different devices with the same name in zpool status
On 05/ 6/10 11:48 AM, Brandon High wrote: I know for certain that my rpool and tank pool are not both using c6t0d0 and c6t1d0, but that's what zpool status is showing. It appears to be an output bug, or a problem with the zpool.cache, since format shows my rpool devices at c8t0d0 and c8t1d0. Have you hot swapped any drives? I had a similar oddity after swapping drives and running cfgadm. What's the right way to fix this? Do nothing? boot -r? Remove /etc/zfs/zpool.cache? Edit or remove /etc/path_to_inst and let boot-r fix it? After the system rebooted, the drives all matched up. -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
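If a reboot is inconvenient, exporting and re-importing the data pool is sometimes enough to get the stale device paths in zpool.cache rewritten; this is a guess rather than something verified against this exact case:

  # zpool export tank
  # zpool import -d /dev/dsk tank   # rescans /dev/dsk and records the current paths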
Re: [zfs-discuss] How to completely eradicate ZFS
You are right; the system does not really care that it can not mount it automatically but it still tries since it sees the zpool data. [b]pfexec zdb -l /dev/rdsk/c7t0d0s2[/b] LABEL 0 failed to unpack label 0 LABEL 1 failed to unpack label 1 LABEL 2 version: 19 name: 'rpool' state: 0 txg: 604 pool_guid: 15191080926808974889 hostid: 215494 hostname: '' top_guid: 10231941211973911013 guid: 10231941211973911013 vdev_children: 1 vdev_tree: type: 'disk' id: 0 guid: 10231941211973911013 path: '/dev/dsk/c4t0d0s0' devid: 'id1,s...@ast68022cf=4nx017qk/a' phys_path: '/p...@0,0/pci10f1,2...@5/d...@0,0:a' whole_disk: 0 metaslab_array: 23 metaslab_shift: 26 ashift: 9 asize: 7985430528 is_log: 0 create_txg: 4 LABEL 3 version: 19 name: 'rpool' state: 0 txg: 604 pool_guid: 15191080926808974889 hostid: 215494 hostname: '' top_guid: 10231941211973911013 guid: 10231941211973911013 vdev_children: 1 vdev_tree: type: 'disk' id: 0 guid: 10231941211973911013 path: '/dev/dsk/c4t0d0s0' devid: 'id1,s...@ast68022cf=4nx017qk/a' phys_path: '/p...@0,0/pci10f1,2...@5/d...@0,0:a' whole_disk: 0 metaslab_array: 23 metaslab_shift: 26 ashift: 9 asize: 7985430528 is_log: 0 create_txg: 4 What I finally ended up doing was dd'ing the the disk: [b]prtvtoc /dev/rdsk/c7t0d0s2[/b] * /dev/rdsk/c7t0d0s2 partition map * * Dimensions: * 512 bytes/sector * 32 sectors/track * 128 tracks/cylinder *4096 sectors/cylinder *3813 cylinders *3811 accessible cylinders * * Flags: * 1: unmountable * 10: read-only * * Unallocated space: * First SectorLast * Sector CountSector *4096 15605760 15609855 * * First SectorLast * Partition Tag FlagsSector CountSector Mount Directory 2 501 0 15609856 15609855 8 101 0 4096 4095 [b]pfexec dd if=/dev/zero of=/dev/rdsk/c7t0d0p0 bs=512 count=8192 pfexec dd if=/dev/zero of=/dev/rdsk/c7t0d0p0 bs=512 seek=15613952 count=8192 pfexec fdisk -B /dev/rdsk/c7t0d0p0[/b] [b]pfexec newfs -v /dev/dsk/c7t0d0s2[/b] newfs: construct a new file system /dev/rdsk/c7t0d0s2: (y/n)? y pfexec mkfs -F ufs /dev/rdsk/c7t0d0s2 15609856 32 -1 8192 1024 224 1 1056 8192 t 0 -1 8 8 n mkfs: bad value for rps: 1056 must be between 1 and 1000 mkfs: rps reset to default 60 Warning: 2048 sector(s) in last cylinder unallocated /dev/rdsk/c7t0d0s2: 15609856 sectors in 2541 cylinders of 48 tracks, 128 sectors 7622.0MB in 159 cyl groups (16 c/g, 48.00MB/g, 5824 i/g) super-block backups (for fsck -F ufs -o b=#) at: 32, 98464, 196896, 295328, 393760, 492192, 590624, 689056, 787488, 885920, The only real reason I am doing this anyway it to experiment with the R/W speeds and comparing PCFS (FAT32), UFS and ZFS on the removable media. Apparently the slow PCFS speeds are not going to be fixed any time soon and copying 8G files to a CF was becoming tedious. Just switching to UFS took me from 1.3MB/s to 6.9MB/s on an old microdrive. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore
On Wed, 5 May 2010, Ray Van Dolson wrote: From a zfs standpoint, Solaris 10 does not seem to be behind the currently supported OpenSolaris release. Well, being able to remove ZIL devices is one important feature missing. Hopefully in U9. :) While the development versions of OpenSolaris are clearly well beyond Solaris 10, I don't believe that the supported version of OpenSolaris (a year old already) has this feature yet either and Solaris 10 has been released several times since then already. When the forthcoming OpenSolaris release emerges in 2011, the situation will be far different. Solaris 10 can then play catch-up with the release of U9 in 2012. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore
On Wed, 2010-05-05 at 19:03 -0500, Bob Friesenhahn wrote: On Wed, 5 May 2010, Ray Van Dolson wrote: From a zfs standpoint, Solaris 10 does not seem to be behind the currently supported OpenSolaris release. Well, being able to remove ZIL devices is one important feature missing. Hopefully in U9. :) While the development versions of OpenSolaris are clearly well beyond Solaris 10, I don't believe that the supported version of OpenSolaris (a year old already) has this feature yet either and Solaris 10 has been released several times since then already. When the forthcoming OpenSolaris release emerges in 2011, the situation will be far different. Solaris 10 can then play catch-up with the release of U9 in 2012. Bob Pessimist. ;-) s/2011/2010/ s/2012/2011/ -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore
On Wed, May 05, 2010 at 05:09:40PM -0700, Erik Trimble wrote: On Wed, 2010-05-05 at 19:03 -0500, Bob Friesenhahn wrote: On Wed, 5 May 2010, Ray Van Dolson wrote: From a zfs standpoint, Solaris 10 does not seem to be behind the currently supported OpenSolaris release. Well, being able to remove ZIL devices is one important feature missing. Hopefully in U9. :) While the development versions of OpenSolaris are clearly well beyond Solaris 10, I don't believe that the supported version of OpenSolaris (a year old already) has this feature yet either and Solaris 10 has been released several times since then already. When the forthcoming OpenSolaris release emerges in 2011, the situation will be far different. Solaris 10 can then play catch-up with the release of U9 in 2012. Bob Pessimist. ;-) s/2011/2010/ s/2012/2011/ Yeah, U9 in 2012 makes me very sad. I would really love to see the hot-removable ZIL's this year. Otherwise I'll need to rebuild a few zpools :) Ray ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Different devices with the same name in zpool status
On Wed, May 5, 2010 at 5:00 PM, Ian Collins i...@ianshome.com wrote: Have you hot swapped any drives? I had a similar oddity after swapping drives and running cfgadm. No hot-swapping. I'd imported and exported both pools from a LiveCD environment, but I'd also rebooted at least twice since then. -B -- Brandon High : bh...@freaks.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] why both dedup and compression?
I've googled this for a bit, but can't seem to find the answer. What does compression bring to the party that dedupe doesn't cover already? Thank you for your patience and answers. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] why both dedup and compression?
Dedup came much later than compression. Also, compression saves both space and therefore load time even when there's only one copy. It is especially good for e.g. HTML or man page documentation which tends to compress very well (versus binary formats like images or MP3s that don't). It gives me an extra, say, 10g on my laptop's 80g SSD which isn't bad. Alex Sent from my (new) iPhone On 6 May 2010, at 02:06, Richard Jahnel rich...@ellipseinc.com wrote: I've googled this for a bit, but can't seem to find the answer. What does compression bring to the party that dedupe doesn't cover already? Thank you for you patience and answers. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] why both dedup and compression?
I've googled this for a bit, but can't seem to find the answer. What does compression bring to the party that dedupe doesn't cover already? Thank you for you patience and answers. That almost sounds like a classroom question. Pick a simple example: large text files, of which each is unique, maybe lines of data or something. Not likely to be much in the way of duplicate blocks to share, but very likely to be highly compressible. Contrast that with binary files, which might have blocks of zero bytes in them (without being strictly sparse, sometimes). With deduping, one such block is all that's actually stored (along with all the references to it, of course). In the 30 seconds or so I've been thinking about it to type this, I would _guess_ that one might want one or the other, but rarely both, since compression might tend to work against deduplication. So given the availability of both, and how lightweight zfs filesystems are, one might want to create separate filesystems within a pool with one or the other as appropriate, and separate the data according to which would likely work better on it. Also, one might as well put compressed video, audio, and image formats in a filesystem that was _not_ compressed, since compressing an already compressed file seldom gains much if anything more. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
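Since both properties are per-dataset, the split described above is easy to express; the dataset names here are invented for illustration:

  # zfs create -o compression=on tank/text        # man pages, HTML, logs: compresses well
  # zfs create -o dedup=on tank/vmimages          # many near-identical blocks: dedups well
  # zfs create tank/media                         # already-compressed audio/video/images: leave both off
  # zfs get compressratio tank/text               # check whether compression is actually paying off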
Re: [zfs-discuss] why both dedup and compression?
Another thought is this: _unless_ the CPU is the bottleneck on a particular system, compression (_when_ it actually helps) can speed up overall operation, by reducing the amount of I/O needed. But storing already-compressed files in a filesystem with compression is likely to result in wasted effort, with little or no gain to show for it. Even deduplication requires some extra effort. Looking at the documentation, it implies a particular checksum algorithm _plus_ verification (if the checksum or digest matches, then make sure by doing a byte-for-byte compare of the blocks, since nothing shorter than the data itself can _guarantee_ that they're the same, just like no lossless compression can possibly work for all possible bitstreams). So doing either of these where the success rate is likely to be too low is probably not helpful. There are stats that show the savings for a filesystem due to compression or deduplication. What I think would be interesting is some advice as to how much (percentage) savings one should be getting to expect to come out ahead not just on storage, but on overall system performance. Of course, no such guidance would exactly fit any particular workload, but I think one might be able to come up with some approximate numbers, or at least a range, below which those features probably represented a waste of effort unless space was at an absolute premium. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] why both dedup and compression?
One of the big things to remember with dedup is that it is block-oriented (as is compression) - it deals with things in discrete chunks, (usually) not the entire file as a stream. So, let's do a thought-experiment here: File A is 100MB in size. From ZFS's standpoint, let's say it's made up of 100 1MB blocks (or chunks, or slabs). Let's also say that none of the blocks are identical (which is highly likely) - that is, no block checksums identically. Thus, with dedup on, this file takes up 100MB of space. If I do a cp fileA fileB, no more additional space will be taken up. However, let's say I then add 1 bit of data to the very front of file A. Now, block alignments have changed for the entire file, so all the 1MB blocks checksum differently. Thus, in this case, adding 1 bit of data to file A actually causes 100MB+1bit of new data to be used, as now none of file B's block are the same as file A. Therefore, after 1 additional bit has been written, total disk usage is 200MB+1 bit. If compression were being used, file A originally would likely take up 100MB, and file B would take up the same amount; thus, the two together could take up, say 150MB together (with a conservative 25% compression ratio). After writing 1 new bit to file A, file A almost certainly compresses the same as before, so the two files will continue to occupy 150MB of space. Compression is not obsoleted by dedup. They both have their places, depending on the data being stored, and the usage pattern of that data. -Erik On Wed, 2010-05-05 at 19:11 -0700, Richard L. Hamilton wrote: Another thought is this: _unless_ the CPU is the bottleneck on a particular system, compression (_when_ it actually helps) can speed up overall operation, by reducing the amount of I/O needed. But storing already-compressed files in a filesystem with compression is likely to result in wasted effort, with little or no gain to show for it. Even deduplication requires some extra effort. Looking at the documentation, it implies a particular checksum algorithm _plus_ verification (if the checksum or digest matches, then make sure by doing a byte-for-byte compare of the blocks, since nothing shorter than the data itself can _guarantee_ that they're the same, just like no lossless compression can possibly work for all possible bitstreams). So doing either of these where the success rate is likely to be too low is probably not helpful. There are stats that show the savings for a filesystem due to compression or deduplication. What I think would be interesting is some advice as to how much (percentage) savings one should be getting to expect to come out ahead not just on storage, but on overall system performance. Of course, no such guidance would exactly fit any particular workload, but I think one might be able to come up with some approximate numbers, or at least a range, below which those features probably represented a waste of effort unless space was at an absolute premium. -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
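The whole-file-copy case from this thought experiment is easy to observe on a scratch dataset (a sketch only; it assumes a build with dedup, b128 or later, and an existing pool named tank):

  # zfs create -o dedup=on tank/dtest
  # dd if=/dev/urandom of=/tank/dtest/fileA bs=1024k count=100
  # cp /tank/dtest/fileA /tank/dtest/fileB    # an identical copy adds no new unique blocks
  # sync
  # zpool get dedupratio tank                 # should approach 2.00x for this test data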
Re: [zfs-discuss] Reverse lookup: inode to name lookup
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Edward Ned Harvey Thanks to Victor, here is at least proof of concept that yes, it is possible to reverse resolve inode number -> pathname, and yes, it is almost infinitely faster than doing something like find: Root can reverse lookup names of inodes with this command: zdb - dataset_name object_number (on a tangent) Surprisingly, it is not limited to just looking up directories. It finds files too (sort of). Apparently a file inode does contain *one* reference to its latest parent. But if you hardlink more than once, you'll only find the latest parent, and if you rm the latest hardlink, then it'll still find only the latest parent, which has been unlinked and is therefore no longer valid. But it works perfectly for directories. (back from tangent) Regardless of how big the filesystem is, regardless of cache warmness, regardless of how many inodes you want to reverse-lookup, this zdb command takes between 1 and 2 seconds per filesystem, fixed. In other words, the operation of performing a reverse-lookup per inode is essentially zero time, but there is some kind of startup overhead. In theory at least, the reverse lookup could be as fast as a regular forward lookup, such as ls or stat. But my measurements also show that a forward lookup incurs some form of startup overhead. A forward lookup on an already mounted filesystem should require a few ms, but in my example below it takes several hundred ms per snapshot, which means there's a warmup period for some reason, to open up each snapshot. Find, of course, scales linearly with the total number of directories/files in the filesystem. On my company filer, I got these results: A forward lookup, time ls -d /tank/somefilesystem/.zfs/snapshot/*/some_object, took 24 sec across my 53 snapshots (that's 0.45 sec per snapshot). A for loop using zdb to reverse lookup the same things took 1m 3s across the 53 snapshots (1.19 sec per snapshot). Using find -inum to locate all those things ... I only let it complete 4 snapshots. It took 33 mins per snapshot. So that's a marvelous proof of concept. Yes, reverse lookup is possible, and it's essentially infinitely faster than find -inum can be. I have a feeling a reverse-lookup application could be even faster, if it were an application designed specifically for this purpose. Zdb is not a suitable long term solution for this purpose. Zdb is only sufficient here as a proof of concept. Here's the problem with zdb: man zdb DESCRIPTION The zdb command is used by support engineers to diagnose failures and gather statistics. Since the ZFS file system is always consistent on disk and is self-repairing, zdb should only be run under the direction by a support engineer. If no arguments are specified, zdb, performs basic consistency checks on the pool and associated datasets, and report any problems detected. Any options supported by this command are internal to Sun and subject to change at any time. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
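The exact zdb invocation got mangled in the archive; a hedged reconstruction follows - the flags, dataset name, and object number here are assumptions (repeating -d simply raises verbosity until the object dump includes its path):

    ls -di /tank/somefilesystem/somedir          # forward: path -> object (inode) number, say 12345
    zdb -ddddd tank/somefilesystem 12345         # reverse: dumps object 12345, including a "path" line
    # The brute-force comparison measured above (hypothetical snapshot name; very slow):
    find /tank/somefilesystem/.zfs/snapshot/snap1 -inum 12345 -print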
Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Bob Friesenhahn From a zfs standpoint, Solaris 10 does not seem to be behind the currently supported OpenSolaris release. I'm sorry, I'll have to disagree with you there. In solaris 10, fully updated, you can only get up to zpool version 15. This is lacking many later features ... For me in particular, zpool 19 is when zpool remove log was first supported. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
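To see where a given system stands, and what the version-19 feature mentioned here actually enables (a sketch; the pool and device names are invented):

    zpool upgrade -v            # lists every pool version this build supports and what each added
    zpool get version tank      # the version a particular pool is currently at
    # On version 19 or later, a dedicated log device can be removed again:
    zpool remove tank c8t0d0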
Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Ray Van Dolson Well, being able to remove ZIL devices is one important feature missing. Hopefully in U9. :) I did have a support rep confirm for me that both the log device removal, and the ability to mirror slightly smaller devices will be present in U9. But he couldn't say when that would be. And if I happen to remember my facts wrong (or not remember my facts when I think I do) ... Please throw no stones. ;-) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Loss of L2ARC SSD Behaviour
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Michael Sullivan I have a question I cannot seem to find an answer to. Google for ZFS Best Practices Guide (on solarisinternals). I know this answer is there. I know if I set up ZIL on SSD and the SSD goes bad, the ZIL will be relocated back to the pool. I'd probably have it mirrored anyway, just in case. However you cannot mirror the L2ARC, so... Careful. The log device removal feature exists, and is present in the developer builds of opensolaris today. However, it's not included in opensolaris 2009.06, and it's not included in the latest and greatest solaris 10 yet. Which means, right now, if you lose an unmirrored ZIL (log) device, your whole pool is lost, unless you're running a developer build of opensolaris. What I want to know is what happens if one of those SSD's goes bad? What happens to the L2ARC? Is it just taken offline, or will it continue to perform even with one drive missing? In the L2ARC (cache) there is no ability to mirror, because cache device removal has always been supported. You can't mirror a cache device, because you don't need it. If one of the cache devices fails, no harm is done. That device goes offline. The rest stay online. Sorry, if these questions have been asked before, but I cannot seem to find an answer. Since you said this twice, I'll answer it twice. ;-) I think the best advice regarding cache/log device mirroring is in the ZFS Best Practices Guide. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
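A sketch of the layout being discussed (device names are invented); note the asymmetry the thread keeps coming back to - log devices can be mirrored, cache devices cannot be and need not be:

    zpool add tank log mirror c2t0d0 c2t1d0   # dedicated ZIL, mirrored (ideally across controllers)
    zpool add tank cache c3t0d0 c3t1d0        # L2ARC devices are added individually; no mirror syntax exists
    zpool remove tank c3t0d0                  # a cache device can always be removed without risk to the pool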
Re: [zfs-discuss] Performance of the ZIL
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Robert Milkowski if you can disable ZIL and compare the performance to when it is on, it will give you an estimate of what's the absolute maximum performance increase (if any) by having a dedicated ZIL device. I'll second this suggestion. It'll cost you nothing to disable the ZIL temporarily. (You have to remount the filesystem twice: once to disable the ZIL, and once to re-enable it.) Then you can see how much performance improves. If it improves a lot, then you'll know you need to accelerate your ZIL. (Because a disabled ZIL is the fastest thing you could possibly ever do.) Generally speaking, you should not disable your ZIL for the long run. But in some cases, it makes sense. Here's how you determine if you want to disable your ZIL permanently: First, understand that with the ZIL disabled, all sync writes are treated as async writes. These are buffered in RAM before being written to disk, so the kernel can optimize and aggregate the write operations into one big chunk. No matter what, if you have an ungraceful system shutdown, you will lose all the async writes that were waiting in RAM. If you have the ZIL disabled, you will also lose the sync writes that were waiting in RAM (because those are being handled as async.) In neither case do you have data or filesystem corruption. The risk of running with no ZIL is: in the case of ungraceful shutdown, in addition to the (up to 30 sec) of async writes that will be lost, you will also lose up to 30 sec of sync writes. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
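One hedged sketch of how that temporary test was typically done on builds of this vintage (the per-dataset sync property came later); the zil_disable tunable is host-wide, only takes effect when a filesystem is mounted, and the dataset name below is hypothetical:

    echo zil_disable/W0t1 | mdb -kw               # set the live tunable (persistent form:
                                                  # "set zfs:zil_disable = 1" in /etc/system)
    zfs umount tank/data && zfs mount tank/data   # remount so the setting applies
    # ... run the benchmark ...
    echo zil_disable/W0t0 | mdb -kw               # re-enable the ZIL
    zfs umount tank/data && zfs mount tank/data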
Re: [zfs-discuss] why both dedup and compression?
Hmm... To clarify. Every discussion or benchmarking that I have seen always show both off, compression only or both on. Why never compression off and dedup on? After some further thought... perhaps it's because compression works at the byte level and dedup is at the block level. Perhaps I have answered my own question. Some confirmation would be nice though. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
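For what it's worth, the combination being asked about is easy to express per dataset (the names here are hypothetical):

    zfs set compression=off tank/vmimages
    zfs set dedup=on tank/vmimages
    zfs get compression,dedup tank/vmimages
    zpool get dedupratio tank          # watch the dedup savings independently of compression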
Re: [zfs-discuss] why both dedup and compression?
On 05/ 6/10 03:35 PM, Richard Jahnel wrote: Hmm... To clarify. Every discussion or benchmarking that I have seen always show both off, compression only or both on. Why never compression off and dedup on? After some further thought... perhaps it's because compression works at the byte level and dedup is at the block level. Perhaps I have answered my own question. Data that doesn't compress well also tends to be data that doesn't dedup well (media files, for example). -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
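A crude way to test that on a sample before turning either feature on (the file path is hypothetical, and gzip is only a stand-in for ZFS's faster, weaker lzjb):

    ls -l /tank/media/sample.avi                 # original size
    gzip -c /tank/media/sample.avi | wc -c       # if this is barely smaller, compression (and, per the
                                                 # point above, likely dedup too) has little to work with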
Re: [zfs-discuss] ZIL behavior on import
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Steven Stallion I had a question regarding how the ZIL interacts with zpool import: Given that the intent log is replayed in the event of a system failure, does the replay behavior differ if -f is passed to zpool import? For example, if I have a system which fails prior to completing a series of writes and I reboot using a failsafe (i.e. install disc), will the log be replayed after a zpool import -f ? If your log devices are present, and you zpool import (even without the -f), then the log will be replayed. Regardless of which version of zpool you have. If your log device is not present, and you zpool import ... If you have zpool version <19, you simply cannot import. If you have zpool >=19, the system will prompt you: Warning, log device not present. If you import -f, you will lose any unplayed events on the missing log device, but the pool will import. FYI, in solaris 10, you cannot have zpool version 19 yet. In opensolaris 2009.06, zpool version 19 is not available unless you upgrade to a developer build. In the latest developer build, zpool version is >=19 by default. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
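A sketch of the two paths described above (the pool name is invented, and the missing-log behavior is as described in this thread for pool version >= 19):

    zpool import                  # from failsafe/install media: scan attached devices for importable pools
    zpool import tank             # log device present: any committed-but-unplayed ZIL records replay here
    zpool import -f tank          # log device missing on a >=19 pool: proceeds as described above,
                                  # discarding whatever was only on the missing log device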
Re: [zfs-discuss] zpool mirror (dumb question)
OK, I've installed OpenSolaris DEV 134 and created 2 files: mkfile 128m /disk1 mkfile 127m /disk2 zpool create stapler /disk1 zpool attach stapler /disk1 /disk2 cannot attach /disk2 to /disk1: device is too small (that's what she said.. lol) But if I create 128m and 128m minus 10 bytes, it works. I can attach the smaller drive. And if I create 1000m, I can attach a 999m virtual disk... So, my question is, what is the ratio on this? How much smaller can drive2 be than drive1? I was trying to find developer notes on what's been upgraded, but my searching didn't turn up anything of interest :( Steve Another question: is zpool shrinking available? ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
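One way to probe the threshold empirically, along the lines of the experiment above - a sketch only, for a scratch box; the pool name, file paths, and probe sizes are arbitrary:

    #!/bin/ksh
    mkfile 1000m /var/tmp/d1
    zpool create -f stapler /var/tmp/d1
    for sz in 1000m 999m 998m 997m; do
        mkfile $sz /var/tmp/d2
        if zpool attach -f stapler /var/tmp/d1 /var/tmp/d2 2>/dev/null; then
            echo "attach ok at $sz"
            zpool detach stapler /var/tmp/d2
        else
            echo "too small at $sz"
        fi
        rm -f /var/tmp/d2
    done
    zpool destroy stapler; rm -f /var/tmp/d1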
Re: [zfs-discuss] Loss of L2ARC SSD Behaviour
Hi Ed, Thanks for your answers. Seems to make sense, sort of… On 6 May 2010, at 12:21 , Edward Ned Harvey wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Michael Sullivan I have a question I cannot seem to find an answer to. Google for ZFS Best Practices Guide (on solarisinternals). I know this answer is there. My Google is very strong and I have the Best Practices Guide committed to bookmark as well as most of it to memory. While it explains how to implement these, there is no information regarding failure of a device in a striped L2ARC set of SSD's. I have been hard pressed to find this information anywhere, short of testing it myself, but I don't have the necessary hardware in a lab to test correctly. If someone has pointers to references, could you please provide them, chapter and verse, rather than the advice to "go read the manual". I know if I set up ZIL on SSD and the SSD goes bad, the ZIL will be relocated back to the pool. I'd probably have it mirrored anyway, just in case. However you cannot mirror the L2ARC, so... Careful. The log device removal feature exists, and is present in the developer builds of opensolaris today. However, it's not included in opensolaris 2009.06, and it's not included in the latest and greatest solaris 10 yet. Which means, right now, if you lose an unmirrored ZIL (log) device, your whole pool is lost, unless you're running a developer build of opensolaris. I'm running 2009.11 which is the latest OpenSolaris. I should have made that clear, and that I don't intend this to be on a Solaris 10 system, and am waiting for the next production build anyway. As you say, it does not exist in 2009.06, this is not the latest production OpenSolaris which is 2009.11, and I'd be more interested in its behavior than an older release. I am also well aware that losing a ZIL device will cause loss of the entire pool. Which is why I would never have a ZIL device unless it was mirrored and on different controllers. From the information I've been reading about the loss of a ZIL device, it will be relocated to the storage pool it is assigned to. I'm not sure which version this is in, but it would be nice if someone could provide the release number it is included in (and actually works). Also, will this functionality be included in the mythical 2010.03 release? Also, I'd be interested to know what features along these lines will be available in 2010.03 if it ever sees the light of day. What I want to know is what happens if one of those SSD's goes bad? What happens to the L2ARC? Is it just taken offline, or will it continue to perform even with one drive missing? In the L2ARC (cache) there is no ability to mirror, because cache device removal has always been supported. You can't mirror a cache device, because you don't need it. If one of the cache devices fails, no harm is done. That device goes offline. The rest stay online. So what you are saying is that if a single device fails in a striped L2ARC VDEV, then the entire VDEV is taken offline and the fallback is to simply use the regular ARC and fetch from the pool whenever there is a cache miss. Or, does what you are saying here mean that if I have 4 SSD's in a stripe for my L2ARC, and one device fails, the L2ARC will be reconfigured dynamically using the remaining SSD's for L2ARC? 
It would be good to get an answer to this from someone who has actually tested this or is more intimately familiar with the ZFS code rather than all the speculation I've been getting so far. Sorry, if these questions have been asked before, but I cannot seem to find an answer. Since you said this twice, I'll answer it twice. ;-) I think the best advice regarding cache/log device mirroring is in the ZFS Best Practices Guide. Been there read that, many, many times. It's an invaluable reference, I agree. Thanks Mike --- Michael Sullivan michael.p.sulli...@me.com http://www.kamiogi.net/ Japan Mobile: +81-80-3202-2599 US Phone: +1-561-283-2034 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Loss of L2ARC SSD Behaviour
From: Michael Sullivan [mailto:michael.p.sulli...@mac.com] My Google is very strong and I have the Best Practices Guide committed to bookmark as well as most of it to memory. While it explains how to implement these, there is no information regarding failure of a device in a striped L2ARC set of SSD's. I have http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Separate_Cache_Devices It is not possible to mirror or use raidz on cache devices, nor is it necessary. If a cache device fails, the data will simply be read from the main pool storage devices instead. I guess I didn't write this part, but: If you have multiple cache devices, they are all independent from each other. Failure of one does not negate the functionality of the others. I'm running 2009.11 which is the latest OpenSolaris. Quoi?? 2009.06 is the latest available from opensolaris.com and opensolaris.org. If you want something newer, AFAIK, you have to go to a developer build, such as osol-dev-134. Sure you didn't accidentally get 2008.11? I am also well aware that losing a ZIL device will cause loss of the entire pool. Which is why I would never have a ZIL device unless it was mirrored and on different controllers. Um ... the log device is not special. If you lose *any* unmirrored device, you lose the pool. Except for cache devices, or log devices on zpool >=19 From the information I've been reading about the loss of a ZIL device, it will be relocated to the storage pool it is assigned to. I'm not sure which version this is in, but it would be nice if someone could provide the release number it is included in (and actually works). What the heck? Didn't I just answer that question? I know I said this is answered in ZFS Best Practices Guide. http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Separate_Log_Devices Prior to pool version 19, if you have an unmirrored log device that fails, your whole pool is permanently lost. Prior to pool version 19, mirroring the log device is highly recommended. In pool version 19 or greater, if an unmirrored log device fails during operation, the system reverts to the default behavior, using blocks from the main storage pool for the ZIL, just as if the log device had been gracefully removed via the zpool remove command. Also, will this functionality be included in the mythical 2010.03 release? Zpool 19 was released in build 125. Oct 16, 2009. You can rest assured it will be included in 2010.03, or 04, or whenever that thing comes out. So what you are saying is that if a single device fails in a striped L2ARC VDEV, then the entire VDEV is taken offline and the fallback is to simply use the regular ARC and fetch from the pool whenever there is a cache miss. It sounds like you're only going to believe it if you test it. Go for it. That's what I did before I wrote that section of the ZFS Best Practices Guide. In ZFS, there is no such thing as striping, although the term is commonly used, because adding multiple devices creates all the benefit of striping, plus all the benefit of concatenation, but colloquially, people think concatenation is weird or unused or something, so people just naturally gravitated to calling it a stripe in ZFS too, although that's not technically correct according to the traditional RAID definition. But nobody bothered to create a new term stripecat or whatever, for ZFS. 
Or, does what you are saying here mean that if I have 4 SSD's in a stripe for my L2ARC, and one device fails, the L2ARC will be reconfigured dynamically using the remaining SSD's for L2ARC? No reconfiguration necessary, because it's not a stripe. It's 4 separate devices, which ZFS can use simultaneously if it wants to. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
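If it helps make that concrete, this is roughly how independent cache devices look from the command line (pool and device names are invented):

    zpool add tank cache c3t0d0 c3t1d0 c3t2d0 c3t3d0   # four separate L2ARC devices, no mirroring possible
    zpool status tank                                   # each appears individually under "cache"
    zpool remove tank c3t1d0                            # removing (or losing) one leaves the other three caching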
Re: [zfs-discuss] Loss of L2ARC SSD Behaviour
On 6 May 2010, at 13:18 , Edward Ned Harvey wrote: From: Michael Sullivan [mailto:michael.p.sulli...@mac.com] While it explains how to implement these, there is no information regarding failure of a device in a striped L2ARC set of SSD's. I have http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Separate_Cache_Devices It is not possible to mirror or use raidz on cache devices, nor is it necessary. If a cache device fails, the data will simply be read from the main pool storage devices instead. I understand this. I guess I didn't write this part, but: If you have multiple cache devices, they are all independent from each other. Failure of one does not negate the functionality of the others. Ok, this is what I wanted to know. That the L2ARC devices assigned to the pool are not striped but are independent. Loss of one drive will just cause a cache miss and force ZFS to go out to the pool for its objects. But then I'm not talking about using RAIDZ on a cache device. I'm talking about a striped device which would be RAID-0. If the SSD's are all assigned to L2ARC, then they are not striped in any fashion (RAID-0), but are completely independent and the L2ARC will continue to operate, just missing a single SSD. I'm running 2009.11 which is the latest OpenSolaris. Quoi?? 2009.06 is the latest available from opensolaris.com and opensolaris.org. If you want something newer, AFAIK, you have to go to a developer build, such as osol-dev-134. Sure you didn't accidentally get 2008.11? My mistake… snv_111b which is 2009.06. I know it went up to 11 somewhere. I am also well aware that losing a ZIL device will cause loss of the entire pool. Which is why I would never have a ZIL device unless it was mirrored and on different controllers. Um ... the log device is not special. If you lose *any* unmirrored device, you lose the pool. Except for cache devices, or log devices on zpool >=19 Well, if I've got a separate ZIL for performance, mirrored because I think my data is valuable and important, I will have something more than RAID-0 on my main storage pool too. More than likely RAIDZ2, since I plan on using L2ARC to help improve performance along with separate SSD mirrored ZIL devices. From the information I've been reading about the loss of a ZIL device, it will be relocated to the storage pool it is assigned to. I'm not sure which version this is in, but it would be nice if someone could provide the release number it is included in (and actually works). What the heck? Didn't I just answer that question? I know I said this is answered in ZFS Best Practices Guide. http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Separate_Log_Devices Prior to pool version 19, if you have an unmirrored log device that fails, your whole pool is permanently lost. Prior to pool version 19, mirroring the log device is highly recommended. In pool version 19 or greater, if an unmirrored log device fails during operation, the system reverts to the default behavior, using blocks from the main storage pool for the ZIL, just as if the log device had been gracefully removed via the zpool remove command. No need to get defensive here; all I'm looking for is the zpool version number which supports it and the version of OpenSolaris which supports that zpool version. I think that if you are building for performance, it would be almost intuitive to have a mirrored ZIL in the event of failure, and perhaps even a hot spare available as well. 
I don't like the idea of my ZIL being transferred back to the pool, but having it transferred back is better than the alternative which would be data loss or corruption. Also, will this functionality be included in the mythical 2010.03 release? Zpool 19 was released in build 125. Oct 16, 2009. You can rest assured it will be included in 2010.03, or 04, or whenever that thing comes out. Thanks, build 125. So what you are saying is that if a single device fails in a striped L2ARC VDEV, then the entire VDEV is taken offline and the fallback is to simply use the regular ARC and fetch from the pool whenever there is a cache miss. It sounds like you're only going to believe it if you test it. Go for it. That's what I did before I wrote that section of the ZFS Best Practices Guide. In ZFS, there is no such thing as striping, although the term is commonly used, because adding multiple devices creates all the benefit of striping, plus all the benefit of concatenation, but colloquially, people think concatenation is weird or unused or something, so people just naturally gravitated to calling it a stripe in ZFS too, although that's not technically correct according to the traditional RAID definition. But nobody bothered to create a new term stripecat or whatever, for ZFS. Ummm, yes
Re: [zfs-discuss] why both dedup and compression?
On May 5, 2010, at 8:35 PM, Richard Jahnel wrote: Hmm... To clarify. Every discussion or benchmarking that I have seen always show both off, compression only or both on. Why never compression off and dedup on? I've seen this quite often. The decision to compress is based on the compressibility of the data. The decision to dedup is based on the duplication of the data. After some further thought... perhaps it's because compression works at the byte level and dedup is at the block level. Perhaps I have answered my own question. Both work at the block level. Hence, they are complementary. Two identical blocks will compress identically, and then dedup. -- richard -- ZFS storage and performance consulting at http://www.RichardElling.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss