[zfs-discuss] Possible newbie question about space between zpool and zfs file systems

2010-03-15 Thread Michael Hassey
Sorry if this is too basic -

So I have a single zpool in addition to the rpool, called xpool.

NAME    SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
rpool   136G   109G  27.5G    79%  ONLINE  -
xpool   408G   171G   237G    42%  ONLINE  -

I have 408 GB in the pool and am using 171 GB, leaving me 237 GB.

The pool is built up as:

  pool: xpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        xpool       ONLINE       0     0     0
          raidz2    ONLINE       0     0     0
            c8t1d0  ONLINE       0     0     0
            c8t2d0  ONLINE       0     0     0
            c8t3d0  ONLINE       0     0     0

errors: No known data errors


But - and here is the question -

I created file systems on it, and the file systems in play report only 76 GB of
free space.

<<<>>

xpool/zones/logserver/ROOT/zbe    975M  76.4G   975M  legacy
xpool/zones/openxsrvr            2.22G  76.4G  21.9K  /export/zones/openxsrvr
xpool/zones/openxsrvr/ROOT       2.22G  76.4G  18.9K  legacy
xpool/zones/openxsrvr/ROOT/zbe   2.22G  76.4G  2.22G  legacy
xpool/zones/puggles               241M  76.4G  21.9K  /export/zones/puggles
xpool/zones/puggles/ROOT          241M  76.4G  18.9K  legacy
xpool/zones/puggles/ROOT/zbe      241M  76.4G   241M  legacy
xpool/zones/reposerver            299M  76.4G  21.9K  /export/zones/reposerver


So my question is: where is the space from xpool being used? Or is it?


Thanks for reading.

Mike.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] corruption of ZFS on iSCSI storage

2010-03-15 Thread Gabriele Bulfon
Hello,
I'd like to ask for any guidance about using ZFS on iSCSI storage appliances.
Recently I had an unlucky situation with a storage machine freezing.
Once the storage was up again (rebooted), all the other iSCSI clients were happy,
while one of the iSCSI clients (a Sun Solaris SPARC host running Oracle) did not
mount the volume and marked it as corrupted.
I had no way to get my ZFS data back: I had to destroy the pool and recreate it
from backups.
So I have some questions regarding this nice story:
- I remember sysadmins being able to almost always recover data on corrupted
UFS filesystems by the magic of superblocks. Is there something similar on ZFS? Is
there really no way to access the data of a corrupted ZFS filesystem?
- In this case, the storage appliance is a legacy system based on Linux, so
RAID/mirroring is managed on the storage side in its own way. Being an iSCSI
target, this volume was mounted as a single iSCSI disk from the Solaris host,
and prepared as a ZFS pool consisting of this single iSCSI target. The ZFS best
practices tell me that, to be safe in case of corruption, pools should always
be mirrors or raidz on 2 or more disks. In this case I considered everything safe,
because the mirroring and RAID were managed by the storage machine. But from the
Solaris host's point of view, the pool was just one disk! And maybe this has been
the point of failure. What is the correct way to go in this case?
- Finally, looking forward to running new storage appliances using OpenSolaris and
its ZFS plus iscsitadm and/or COMSTAR, I feel a bit confused by the possibility of
having a double-ZFS situation: in this case, I would have the storage's ZFS
filesystem divided into ZFS volumes, accessed via iSCSI by a Solaris host that
creates its own ZFS pool on them (is it too redundant??), and again I would fall
into the same case as before (a host ZFS pool connected to only one iSCSI
resource).

Any guidance would be really appreciated :)
Thanks a lot
Gabriele.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Possible newbie question about space between zpool and zfs file systems

2010-03-15 Thread Cindy Swearingen

Hi Michael,

For a RAIDZ pool, the zpool list command identifies the "inflated" space
for the storage pool, which is the physically available space without any
accounting for redundancy overhead.

The zfs list command identifies how much actual pool space is available
to the file systems.

See the example below of a RAIDZ-2 pool created with three 44 GB disks.
The total pool capacity reported by zpool list is 134 GB. The amount of
pool space that is available to the file systems is 43.8 GB, due to the
RAIDZ-2 redundancy overhead.

See this FAQ section for more information.

http://hub.opensolaris.org/bin/view/Community+Group+zfs/faq#HZFSAdministrationQuestions

Why doesn't the space that is reported by the zpool list command and the 
zfs list command match?


Although this site is dog-slow for me today...

Thanks,

Cindy

# zpool create xpool raidz2 c3t40d0 c3t40d1 c3t40d2
# zpool list xpool
NAME    SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
xpool   134G   234K   134G     0%  ONLINE  -
# zfs list xpool
NAME    USED  AVAIL  REFER  MOUNTPOINT
xpool  73.2K  43.8G  20.9K  /xpool
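
Applying the same arithmetic to your pool (a rough cross-check, not exact to the
gigabyte): 408 GB raw across three disks is about 136 GB of data capacity once
two disks' worth of RAIDZ-2 parity is taken out. The 171 GB that zpool list shows
as USED is roughly 57 GB of data plus its parity, and 57 GB used plus the ~76 GB
AVAIL from zfs list comes to about 133 GB, i.e. the ~136 GB of data capacity
minus a small amount of metadata overhead.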


On 03/15/10 08:38, Michael Hassey wrote:

Sorry if this is too basic -

So I have a single zpool in addition to the rpool, called xpool.

NAMESIZE   USED  AVAILCAP  HEALTH  ALTROOT
rpool   136G   109G  27.5G79%  ONLINE  -
xpool   408G   171G   237G42%  ONLINE  -

I have 408 in the pool, am using 171 leaving me 237 GB. 


The pool is built up as;

  pool: xpool
 state: ONLINE
 scrub: none requested
config:

NAMESTATE READ WRITE CKSUM
xpool   ONLINE   0 0 0
  raidz2ONLINE   0 0 0
c8t1d0  ONLINE   0 0 0
c8t2d0  ONLINE   0 0 0
c8t3d0  ONLINE   0 0 0

errors: No known data errors


But - and here is the question -

Creating file systems on it, and the file systems in play report only 76GB of 
space free

<<<>>

xpool/zones/logserver/ROOT/zbe 975M  76.4G   975M  legacy
xpool/zones/openxsrvr 2.22G  76.4G  21.9K  /export/zones/openxsrvr
xpool/zones/openxsrvr/ROOT2.22G  76.4G  18.9K  legacy
xpool/zones/openxsrvr/ROOT/zbe2.22G  76.4G  2.22G  legacy
xpool/zones/puggles241M  76.4G  21.9K  /export/zones/puggles
xpool/zones/puggles/ROOT   241M  76.4G  18.9K  legacy
xpool/zones/puggles/ROOT/zbe   241M  76.4G   241M  legacy
xpool/zones/reposerver 299M  76.4G  21.9K  /export/zones/reposerver


So my question is, where is the space from xpool being used? or is it?


Thanks for reading.

Mike.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Possible newbie question about space between zpool and zfs file systems

2010-03-15 Thread Michael Hassey
That solved it.

Thank you Cindy.

Zpool list NOT reporting raidz overhead is what threw me...


Thanks again.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] corruption of ZFS on iSCSI storage

2010-03-15 Thread Ware Adams

On Mar 15, 2010, at 10:55 AM, Gabriele Bulfon wrote:

> - In this case, the storage appliance is a legacy system based on linux, so 
> raids/mirrors are managed at the storage side its own way. Being an iscsi 
> target, this volume was mounted as a single iscsi disk from the solaris host, 
> and prepared as a zfs pool consisting of this single iscsi target. ZFS best 
> practices, tell me that to be safe in case of corruption, pools should always 
> be mirrors or raidz on 2 or more disks. In this case, I considered all safe, 
> because the mirror and raid was managed by the storage machine. But from the 
> solaris host point of view, the pool was just one! And maybe this has been 
> the point of failure. What is the correct way to go in this case?

I'd guess this could be because the iSCSI target wasn't honoring ZFS cache
flush requests.

> - Finally, looking forward to run new storage appliances using OpenSolaris 
> and its ZFS+iscsitadm and/or comstar, I feel a bit confused by the 
> possibility of having a double zfs situation: in this case, I would have the 
> storage zfs filesystem divided into zfs volumes, accessed via iscsi by a 
> possible solaris host that creates his own zfs pool on it (...is it too 
> redundant??) and again I would fall in the same previous case (host zfs pool 
> connected to one only iscsi resource).

My experience with this is significantly lower end, but I have had iSCSI shares
from a ZFS NAS come up as corrupt on the client.  It's fixable if you have
snapshots.

I've been using iSCSI to provide Time Machine targets to OS X boxes.  We had a
client crash during writing, and upon reboot it showed the iSCSI volume as
corrupt.  You can put whatever file system you like on the iSCSI target, obviously.
The current OpenSolaris iSCSI implementation, I believe, uses synchronous
writes, so hopefully what happened to you wouldn't happen in this case.

In my case I was using HFS+ (the OS X client has to), and I couldn't repair the
volume.  However, with a snapshot I could roll it back.  If you plan ahead, this
should save you some restoration work (you'll need to be able to roll back all
the files that have to be consistent).
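
For example, a minimal sketch of that plan-ahead approach, assuming a
hypothetical zvol tank/tm-target backing the iSCSI LUN (adjust names to your
setup):

# take a periodic snapshot of the zvol that backs the iSCSI target
zfs snapshot tank/tm-target@2010-03-15
# if a client crash later leaves the contained filesystem unrepairable,
# roll the volume back to the last known-good snapshot
# (-r destroys any snapshots taken after that point)
zfs rollback -r tank/tm-target@2010-03-15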

Good luck,
Ware
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Possible newbie question about space between zpool and zfs file systems

2010-03-15 Thread Khyron
Yeah, this threw me.  A 3-disk RAID-Z2 doesn't make sense, because at the
redundancy level, RAID-Z2 looks like RAID 6.  That is, there are 2 levels of
parity for the data.  Out of 3 disks, the equivalent of 2 disks will be used to
store redundancy (parity) data and only 1 disk equivalent will store actual
data.  This is what others might term a "degenerate case of 3-way mirroring",
except with a lot more computational overhead, since we're performing 2
parity calculations.
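
As a concrete illustration (rough numbers, using the ~136 GB disks from the
original post): three such disks give about 136 GB of usable data capacity as
RAID-Z2, about 272 GB as RAID-Z1, and about 136 GB as a 3-way mirror, so the
3-disk RAID-Z2 buys no extra space over the mirror, only a different way of
computing the redundancy.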

I'm curious what the purpose of creating a 3 disk RAID-Z2 pool is/was?
(For my own personal edification.  Maybe there is something for me to learn
from this example.)

Aside: Does ZFS actually create the pool as a 3-way mirror, given that this
configuration is effectively the same?  This is a question for any of the ZFS
team who may be reading, but I'm curious now.

On Mon, Mar 15, 2010 at 10:38, Michael Hassey  wrote:

> Sorry if this is too basic -
>
> So I have a single zpool in addition to the rpool, called xpool.
>
> NAMESIZE   USED  AVAILCAP  HEALTH  ALTROOT
> rpool   136G   109G  27.5G79%  ONLINE  -
> xpool   408G   171G   237G42%  ONLINE  -
>
> I have 408 in the pool, am using 171 leaving me 237 GB.
>
> The pool is built up as;
>
>  pool: xpool
>  state: ONLINE
>  scrub: none requested
> config:
>
>NAMESTATE READ WRITE CKSUM
>xpool   ONLINE   0 0 0
>  raidz2ONLINE   0 0 0
>c8t1d0  ONLINE   0 0 0
>c8t2d0  ONLINE   0 0 0
>c8t3d0  ONLINE   0 0 0
>
> errors: No known data errors
>
>
> But - and here is the question -
>
> Creating file systems on it, and the file systems in play report only 76GB
> of space free
>
> <<<>>
>
> xpool/zones/logserver/ROOT/zbe 975M  76.4G   975M  legacy
> xpool/zones/openxsrvr 2.22G  76.4G  21.9K
>  /export/zones/openxsrvr
> xpool/zones/openxsrvr/ROOT2.22G  76.4G  18.9K  legacy
> xpool/zones/openxsrvr/ROOT/zbe2.22G  76.4G  2.22G  legacy
> xpool/zones/puggles241M  76.4G  21.9K
>  /export/zones/puggles
> xpool/zones/puggles/ROOT   241M  76.4G  18.9K  legacy
> xpool/zones/puggles/ROOT/zbe   241M  76.4G   241M  legacy
> xpool/zones/reposerver 299M  76.4G  21.9K
>  /export/zones/reposerver
>
>
> So my question is, where is the space from xpool being used? or is it?
>
>
> Thanks for reading.
>
> Mike.
> --
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>



-- 
"You can choose your friends, you can choose the deals." - Equity Private

"If Linux is faster, it's a Solaris bug." - Phil Harman

Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] corruption of ZFS on iSCSI storage

2010-03-15 Thread Ross Walker
On Mar 15, 2010, at 10:55 AM, Gabriele Bulfon   
wrote:



Hello,
I'd like to check for any guidance about using zfs on iscsi storage  
appliances.
Recently I had an unlucky situation with an unlucky storage machine  
freezing.
Once the storage was up again (rebooted) all other iscsi clients  
were happy, while one of the iscsi clients (a sun solaris sparc,  
running Oracle) did not mount the volume marking it as corrupted.
I had no way to get back my zfs data: had to destroy and recreate  
from backups.

So I have some questions regarding this nice story:
- I remember sysadmins being able to almost always recover data on  
corrupted ufs filesystems by magic of superblocks. Is there  
something similar on zfs? Is there really no way to access data of a  
corrupted zfs filesystem?
- In this case, the storage appliance is a legacy system based on  
linux, so raids/mirrors are managed at the storage side its own way.  
Being an iscsi target, this volume was mounted as a single iscsi  
disk from the solaris host, and prepared as a zfs pool consisting of  
this single iscsi target. ZFS best practices, tell me that to be  
safe in case of corruption, pools should always be mirrors or raidz  
on 2 or more disks. In this case, I considered all safe, because the  
mirror and raid was managed by the storage machine. But from the  
solaris host point of view, the pool was just one! And maybe this  
has been the point of failure. What is the correct way to go in this  
case?
- Finally, looking forward to run new storage appliances using  
OpenSolaris and its ZFS+iscsitadm and/or comstar, I feel a bit  
confused by the possibility of having a double zfs situation: in  
this case, I would have the storage zfs filesystem divided into zfs  
volumes, accessed via iscsi by a possible solaris host that creates  
his own zfs pool on it (...is it too redundant??) and again I would  
fall in the same previous case (host zfs pool connected to one only  
iscsi resource).


Any guidance would be really appreciated :)
Thanks a lot
Gabriele.


What iSCSI target was this?

If it was IET, I hope you were NOT using the write-back option on it, as
it caches write data in volatile RAM.


IET does support cache flushes, but if you cache in RAM (a bad idea) a
system lockup or panic will ALWAYS lose data.
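
For what it's worth, a rough illustration of where that shows up in an IET
configuration (the target name and device path are made up, and the exact
syntax varies between IET versions, so treat this purely as a sketch):

Target iqn.2010-03.example.com:storage.lun0
    # fileio with IOMode=wb buffers writes in RAM (write-back);
    # IOMode=wt (write-through) or Type=blockio avoids that RAM cache
    Lun 0 Path=/dev/vg0/lv0,Type=fileio,IOMode=wt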


-Ross

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] corruption of ZFS on iSCSI storage

2010-03-15 Thread Gabriele Bulfon
Well, I actually don't know what implementation is inside this legacy machine.
This machine is an AMI StoreTrends ITX; maybe it has been built around IET, I
don't know.
Maybe I should disable write-back for every ZFS host connecting over iSCSI?
How do I check this?

Thx
Gabriele.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] corruption of ZFS on iSCSI storage

2010-03-15 Thread Ware Adams

On Mar 15, 2010, at 12:13 PM, Gabriele Bulfon wrote:

> Well, I actually don't know what implementation is inside this legacy machine.
> This machine is an AMI StoreTrends ITX, but maybe it has been built around 
> IET, don't know.
> Well, maybe I should disable write-back on every zfs host connecting on iscsi?
> How do I check this?

I think this would be a property of the NAS, not the clients.

--Ware
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] corruption of ZFS on iSCSI storage

2010-03-15 Thread Ross Walker
On Mar 15, 2010, at 12:19 PM, Ware Adams   
wrote:




On Mar 15, 2010, at 12:13 PM, Gabriele Bulfon wrote:

Well, I actually don't know what implementation is inside this  
legacy machine.
This machine is an AMI StoreTrends ITX, but maybe it has been built  
around IET, don't know.
Well, maybe I should disable write-back on every zfs host  
connecting on iscsi?

How do I check this?


I think this would be a property of the NAS, not the clients.


Yes, Ware's right, the setting should be on the AMI device.

I don't know what target it's using either, but if it has an option to
disable write-back caching, then at least your data should still be safe
even if it doesn't honor flushing.
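
For comparison, on an OpenSolaris/COMSTAR target (the kind of appliance
discussed earlier in the thread) this is a per-LU property; something along
these lines should show and change it, but check stmfadm(1M) on your build
before relying on it (the GUID below is made up):

# list logical units and look at the write-back cache setting
stmfadm list-lu -v
# disable the write-back cache on a specific LU
stmfadm modify-lu -p wcd=true 600144F0C73ABF0000004B9E6C060001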


-Ross

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] CR 6880994 and pkg fix

2010-03-15 Thread David Dyer-Bennet

On Sun, March 14, 2010 13:54, Frank Middleton wrote:

>
> How can it even be remotely possible to get a checksum failure on mirrored
> drives with copies=2? That means all four copies were corrupted? Admittedly
> this is on a grotty PC with no ECC and flaky bus parity, but how come the
> same file always gets flagged as being clobbered (even though apparently it
> isn't).
>
> The oddest part is that libdlpi.so.1 doesn't actually seem to be corrupted.
> nm lists it with no problem and you can copy it to /tmp, rename it, and then
> copy it back. objdump and readelf can all process this library with no
> problem. But "pkg fix" flags an error in its own inscrutable way. CCing
> pkg-discuss in case a pkg guru can shed any light on what the output of
> "pkg fix" (below) means. Presumably libc is OK, or it wouldn't boot :-).

This sounds really bizarre.

One detail suggestion on checking what's going on (since I don't have a
clue towards a real root-cause determination): get an md5sum of a clean
copy of the file, say from a new install or something, and check the
allegedly-corrupted copy against that.  This can fairly easily give you a
pretty reliable indication of whether the file is truly corrupted or not.
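
For example (a minimal sketch; the library path is the usual /lib location and
the alternate-BE mount point is made up):

# checksum the suspect library and a known-good copy, then compare
digest -a md5 /lib/libdlpi.so.1
digest -a md5 /mnt/known-good-be/lib/libdlpi.so.1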
-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool reporting consistent read errors

2010-03-15 Thread David Dyer-Bennet

On Mon, March 15, 2010 00:54, no...@euphoriq.com wrote:
> I'm running a raidz1 with 3 Samsung 1.5TB drives.  Every time I scrub the
> pool I get multiple read errors, no write errors and no checksum errors on
> one drive (always the same drive, and no data loss).
>
> I've changed cables, changed the sata ports the drives are attached to, I
> always get the same outcome.  The drives are new.  Is this likely a drive
> problem?

Given what you've already changed, it's sounding like it could well be a
drive problem.  The one other thing that comes to mind is power to the
drive.

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] backup zpool to tape

2010-03-15 Thread Scott Meilicke
Greg, I am using NetBackup 6.5.3.1 (7.x is out) with fine results. Nice and 
fast.

-Scott
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] backup zpool to tape

2010-03-15 Thread Greg
Hey Scott, 
Thanks for the information. I doubt I can drop that kind of cash, but back to 
getting bacula working!

Thanks again,
Greg
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool reporting consistent read errors

2010-03-15 Thread no...@euphoriq.com
Wow.  I never thought about it.  I changed the power supply to a cheap one a 
while back (a now seemingly foolish effort to save money) - it could be the 
issue.  I'll change it back and let you know.

Thanks
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool reporting consistent read errors

2010-03-15 Thread Svein Skogen
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 15.03.2010 21:13, no...@euphoriq.com wrote:
> Wow.  I never thought about it.  I changed the power supply to a cheap one a 
> while back (a now seemingly foolish effort to save money) - it could be the 
> issue.  I'll change it back and let you know.

"cheap" powersupplies rarely are.  ;)

It's been my experience that if you "overengineer" the PSU a bit, its
efficiency increases (it's no longer pushing 100% of its
rated spec) and the power consumed (on the 220 V side) actually drops.

//Svein

- -- 
- +---+---
  /"\   |Svein Skogen   | sv...@d80.iso100.no
  \ /   |Solberg Østli 9| PGP Key:  0xE5E76831
   X|2020 Skedsmokorset | sv...@jernhuset.no
  / \   |Norway | PGP Key:  0xCE96CE13
|   | sv...@stillbilde.net
 ascii  |   | PGP Key:  0x58CD33B6
 ribbon |System Admin   | svein-listm...@stillbilde.net
Campaign|stillbilde.net | PGP Key:  0x22D494A4
+---+---
|msn messenger: | Mobile Phone: +47 907 03 575
|sv...@jernhuset.no | RIPE handle:SS16503-RIPE
- +---+---
 If you really are in a hurry, mail me at
   svein-mob...@stillbilde.net
 This mailbox goes directly to my cellphone and is checked
even when I'm not in front of my computer.
- 
 Picture Gallery:
  https://gallery.stillbilde.net/v/svein/
- 
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.12 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkuemfgACgkQSBMQn1jNM7ZJ5gCghZuA3LnqkZnA54zddSlrkG6Y
MbcAoK8RU5td2Xx79q+Wmbztth7pB217
=pRID
-END PGP SIGNATURE-
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] persistent L2ARC

2010-03-15 Thread Abdullah Al-Dahlawi
Greeting ALL


I understand that L2ARC is still under enhancement. Does anyone know if ZFS
can be upgraded to include a "persistent L2ARC", i.e. an L2ARC that will not lose
its contents after a system reboot?




-- 
Abdullah Al-Dahlawi
George Washington University
Department. Of Electrical & Computer Engineering

Check The Fastest 500 Super Computers Worldwide
http://www.top500.org/list/2009/11/100
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] pool causes kernel panic, recursive mutex enter, 134

2010-03-15 Thread Mark
Hi,
I've been using OpenSolaris for about 2 years with a mirrored rpool and a data pool
with 3 x 2 (mirrored) drives.
The data pool drives are connected to SIL PCI Express cards.

Yesterday I updated from build 130 to 134; everything seemed to be fine, and I also
replaced one pair of mirrored drives with larger disks.
Still no problems: I did some tests, rebooted a few times, checked logs, nothing
special.

Today I started copying a larger amount of data. While copying, at about 40 GB,
OpenSolaris gave me the first kernel panic ever seen on this system. The system
rebooted, and while mounting the data pool, you may guess it, it panicked again.

What I have done so far in trying to get it up again:

Booted without the data drives, tried to mount manually and with -F -n
(non-destructive, as the manual says).
Tried to mount normally with different combinations of mirrors taken offline, so
that there is only a single drive for each slice. Same panic.

I still have the drives that I replaced with the newer drives, but I believe
they are useless since the pool structure changed?

The kernel panic I get is cpu(0) recursive mutex enter, plus several lines of SIL
driver errors.
I also tried booting into the previous BE (130, from before the update, where the
pools never got an error); same panic.

ANY ideas for volume rescue are welcome - if I missed some important
information, please tell me.
Regards, Mark
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool reporting consistent read errors

2010-03-15 Thread David Dyer-Bennet

On Mon, March 15, 2010 15:35, Svein Skogen wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> On 15.03.2010 21:13, no...@euphoriq.com wrote:
>> Wow.  I never thought about it.  I changed the power supply to a cheap
>> one a while back (a now seemingly foolish effort to save money) - it
>> could be the issue.  I'll change it back and let you know.
>
> "cheap" powersupplies rarely are.  ;)

I've had all types fail on me.  I think I've had more power supplies than
disk drives fail on me, even.

And they can produce the most *amazing* range of symptoms, if they don't
fail completely.  Quite remarkable.

> It's been my experience that if you "overengineer" the psu a bit, the
> efficiency of the PSU increases (it's no longer pushing 100% of its
> rated spec) and actually the consumed power (on the 220v side) drops.

Strangely enough, running up to the limit is hard on components, yes.

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] pool causes kernel panic, recursive mutex enter, 134

2010-03-15 Thread Mark
some screenshots that may help:

 pool: tank
id: 5649976080828524375
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

        data         ONLINE
          mirror-0   ONLINE
            c27t2d0  ONLINE
            c27t0d0  ONLINE
          mirror-1   ONLINE
            c27t3d0  ONLINE
            c29t1d0  ONLINE
          mirror-2   ONLINE
            c27t1d0  ONLINE
            c29t0d0  ONLINE



Mar 15 21:42:50 solaris1.local panic[cpu0]/thread=d6792f00:
Mar 15 21:42:50 solaris1.local genunix: [ID 335743 kern.notice] BAD TRAP: type=e (#pf Page fault) rp=d76d3658 addr=34 occurred in module "zfs" due to a NULL pointer dereference
Mar 15 21:42:50 solaris1.local unix: [ID 10 kern.notice]
Mar 15 21:42:50 solaris1.local unix: [ID 839527 kern.notice] syseventd:
Mar 15 21:42:50 solaris1.local unix: [ID 753105 kern.notice] #pf Page fault
Mar 15 21:42:50 solaris1.local unix: [ID 532287 kern.notice] Bad kernel fault at addr=0x34
Mar 15 21:42:50 solaris1.local unix: [ID 243837 kern.notice] pid=93, pc=0xf924b97e, sp=0xd76d36c4, eflags=0x10282
Mar 15 21:42:50 solaris1.local unix: [ID 211416 kern.notice] cr0: 8005003b cr4: 6f8
Mar 15 21:42:50 solaris1.local unix: [ID 624947 kern.notice] cr2: 34
Mar 15 21:42:50 solaris1.local unix: [ID 625075 kern.notice] cr3: 2ead020
Mar 15 21:42:50 solaris1.local unix: [ID 10 kern.notice]
Mar 15 21:42:50 solaris1.local unix: [ID 537610 kern.notice] gs: d76d01b0  fs: 0  es: cb0160  ds: e31a0160
Mar 15 21:42:50 solaris1.local unix: [ID 537610 kern.notice] edi: 0  esi: de581350  ebp: d76d36a4  esp: d76d3690
Mar 15 21:42:50 solaris1.local unix: [ID 537610 kern.notice] ebx: 0  edx: b  ecx: 0  eax: 0
Mar 15 21:42:50 solaris1.local unix: [ID 537610 kern.notice] trp: e  err: 0  eip: f924b97e  cs: 158
Mar 15 21:42:50 solaris1.local unix: [ID 717149 kern.notice] efl: 10282  usp: d76d36c4  ss: f924b9c6
Mar 15 21:42:50 solaris1.local unix: [ID 10 kern.notice]
Mar 15 21:42:50 solaris1.local genunix: [ID 353471 kern.notice] d76d3594 unix:die+93 (e, d76d3658, 34, 0)
Mar 15 21:42:50 solaris1.local genunix: [ID 353471 kern.notice] d76d3644 unix:trap+1449 (d76d3658, 34, 0)
Mar 15 21:42:50 solaris1.local genunix: [ID 353471 kern.notice] d76d3658 unix:cmntrap+7c (d76d01b0, 0, cb0160)
Mar 15 21:42:50 solaris1.local genunix: [ID 353471 kern.notice] d76d36a4 zfs:vdev_is_dead+6 (0, 0, cb36a7, e31ad)
Mar 15 21:42:50 solaris1.local genunix: [ID 353471 kern.notice] d76d36c4 zfs:vdev_readable+e (0, 1, 0, fe96c13d)
Mar 15 21:42:50 solaris1.local genunix: [ID 353471 kern.notice] d76d3704 zfs:vdev_mirror_child_select+55 (dedc6560, 1, 0, f92)
Mar 15 21:42:50 solaris1.local genunix: [ID 353471 kern.notice] d76d3744 zfs:vdev_mirror_io_start+b3 (dedc6560, 1

Re: [zfs-discuss] Possible newbie question about space between zpool and zfs file systems

2010-03-15 Thread Tonmaus
Hi Cindy,
trying to reproduce this 

> For a RAIDZ pool, the zpool list command identifies
> the "inflated" space
> for the storage pool, which is the physical available
> space without an
> accounting for redundancy overhead.
> 
> The zfs list command identifies how much actual pool
> space is available
> to the file systems.

I am lacking 1 TB on my pool:

u...@filemeister:~$ zpool list daten
NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
daten    10T  3,71T  6,29T    37%  1.00x  ONLINE  -
u...@filemeister:~$ zpool status daten
  pool: daten
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        daten         ONLINE       0     0     0
          raidz2-0    ONLINE       0     0     0
            c10t2d0   ONLINE       0     0     0
            c10t3d0   ONLINE       0     0     0
            c10t4d0   ONLINE       0     0     0
            c10t5d0   ONLINE       0     0     0
            c10t6d0   ONLINE       0     0     0
            c10t7d0   ONLINE       0     0     0
            c10t8d0   ONLINE       0     0     0
            c10t9d0   ONLINE       0     0     0
            c11t18d0  ONLINE       0     0     0
            c11t19d0  ONLINE       0     0     0
            c11t20d0  ONLINE       0     0     0
        spares
          c11t21d0    AVAIL

errors: No known data errors
u...@filemeister:~$ zfs list daten
NAME    USED  AVAIL  REFER  MOUNTPOINT
daten  3,01T  4,98T   110M  /daten

I am counting 11 disks of 1 TB each in a raidz2 pool. That is 11 TB gross
capacity, and 9 TB net. zpool is however stating 10 TB and zfs is stating 8 TB.
The difference between net and gross is correct, but where is the capacity from
the 11th disk going?

Regards,

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Possible newbie question about space between zpool and zfs file systems

2010-03-15 Thread Carson Gaspar

Tonmaus wrote:


I am lacking 1 TB on my pool:

u...@filemeister:~$ zpool list daten
NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
daten    10T  3,71T  6,29T    37%  1.00x  ONLINE  -
u...@filemeister:~$ zpool status daten
  pool: daten
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        daten         ONLINE       0     0     0
          raidz2-0    ONLINE       0     0     0
            c10t2d0   ONLINE       0     0     0
            c10t3d0   ONLINE       0     0     0
            c10t4d0   ONLINE       0     0     0
            c10t5d0   ONLINE       0     0     0
            c10t6d0   ONLINE       0     0     0
            c10t7d0   ONLINE       0     0     0
            c10t8d0   ONLINE       0     0     0
            c10t9d0   ONLINE       0     0     0
            c11t18d0  ONLINE       0     0     0
            c11t19d0  ONLINE       0     0     0
            c11t20d0  ONLINE       0     0     0
        spares
          c11t21d0    AVAIL

errors: No known data errors
u...@filemeister:~$ zfs list daten
NAME    USED  AVAIL  REFER  MOUNTPOINT
daten  3,01T  4,98T   110M  /daten

I am counting 11 disks 1 TB each in a raidz2 pool. This is 11 TB
gross capacity, and 9 TB net. Zpool is however stating 10 TB and zfs
is stating 8TB. The difference between net and gross is correct, but
where is the capacity from the 11th disk going?


My guess is unit conversion and rounding. Your pool has 11 base 10 TB, 
which is 10.2445 base 2 TiB.


Likewise your fs has 9 base 10 TB, which is 8.3819 base 2 TiB.

--
Carson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Possible newbie question about space between zpool and zfs file systems

2010-03-15 Thread Erik Trimble
On Mon, 2010-03-15 at 15:03 -0700, Tonmaus wrote:
> Hi Cindy,
> trying to reproduce this 
> 
> > For a RAIDZ pool, the zpool list command identifies
> > the "inflated" space
> > for the storage pool, which is the physical available
> > space without an
> > accounting for redundancy overhead.
> > 
> > The zfs list command identifies how much actual pool
> > space is available
> > to the file systems.
> 
> I am lacking 1 TB on my pool:
> 
> u...@filemeister:~$ zpool list daten
> NAMESIZE  ALLOC   FREECAP  DEDUP  HEALTH  ALTROOT
> daten10T  3,71T  6,29T37%  1.00x  ONLINE  -
> u...@filemeister:~$ zpool status daten
>   pool: daten
>  state: ONLINE
>  scrub: none requested
> config:
> 
> NAME  STATE READ WRITE CKSUM
> daten ONLINE   0 0 0
>   raidz2-0ONLINE   0 0 0
> c10t2d0   ONLINE   0 0 0
> c10t3d0   ONLINE   0 0 0
> c10t4d0   ONLINE   0 0 0
> c10t5d0   ONLINE   0 0 0
> c10t6d0   ONLINE   0 0 0
> c10t7d0   ONLINE   0 0 0
> c10t8d0   ONLINE   0 0 0
> c10t9d0   ONLINE   0 0 0
> c11t18d0  ONLINE   0 0 0
> c11t19d0  ONLINE   0 0 0
> c11t20d0  ONLINE   0 0 0
> spares
>   c11t21d0AVAIL
> 
> errors: No known data errors
> u...@filemeister:~$ zfs list daten
> NAMEUSED  AVAIL  REFER  MOUNTPOINT
> daten  3,01T  4,98T   110M  /daten
> 
> I am counting 11 disks 1 TB each in a raidz2 pool. This is 11 TB gross 
> capacity, and 9 TB net. Zpool is however stating 10 TB and zfs is stating 
> 8TB. The difference between net and gross is correct, but where is the 
> capacity from the 11th disk going?
> 
> Regards,
> 
> Tonmaus

"1TB" disks aren't a full terabyte in the powers-of-two sense.

Remember, the storage industry uses powers of 10, not 2.  It's
annoying.

For each GB, you lose about 7% in the actual space computation. For each TB, it's
about 9%. So, your "1TB" disk is actually about 931 GB.

'zfs list' is going to report in actual powers of 2, just like df.


In my case, I have a 12 x 1TB configuration, and zpool list and zfs list show:


# zpool list
NAME        SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
array2540  10.9T  5.46T  5.41T    50%  ONLINE  -

Likewise:

# zfs list
NAME        USED  AVAIL  REFER  MOUNTPOINT
array2540  4.53T  4.34T  80.4M  /data


So, here's the math:

1 "storage TB" = 1e12 / (1024^3) = 931 actual GB

931 GB x 12 = 11,172 GB
but, 1TB = 1024 GB
so:  931 GB x 12 / (1024) = 10.9TB.


Quick Math: 1 TB of advertised space = 0.91 TB of real space
1 GB of advertised space = 0.93 GB of real space





-- 
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Possible newbie question about space between zpool and zfs file systems

2010-03-15 Thread Erik Trimble
On Mon, 2010-03-15 at 15:40 -0700, Carson Gaspar wrote:
> Tonmaus wrote:
> 
> > I am lacking 1 TB on my pool:
> > 
> > u...@filemeister:~$ zpool list daten
> > NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
> > daten    10T  3,71T  6,29T    37%  1.00x  ONLINE  -
> > u...@filemeister:~$ zpool status daten
> >   pool: daten
> >  state: ONLINE
> >  scrub: none requested
> > config:
> > 
> >         NAME          STATE     READ WRITE CKSUM
> >         daten         ONLINE       0     0     0
> >           raidz2-0    ONLINE       0     0     0
> >             c10t2d0   ONLINE       0     0     0
> >             c10t3d0   ONLINE       0     0     0
> >             c10t4d0   ONLINE       0     0     0
> >             c10t5d0   ONLINE       0     0     0
> >             c10t6d0   ONLINE       0     0     0
> >             c10t7d0   ONLINE       0     0     0
> >             c10t8d0   ONLINE       0     0     0
> >             c10t9d0   ONLINE       0     0     0
> >             c11t18d0  ONLINE       0     0     0
> >             c11t19d0  ONLINE       0     0     0
> >             c11t20d0  ONLINE       0     0     0
> >         spares
> >           c11t21d0    AVAIL
> > 
> > errors: No known data errors
> > u...@filemeister:~$ zfs list daten
> > NAME    USED  AVAIL  REFER  MOUNTPOINT
> > daten  3,01T  4,98T   110M  /daten
> > 
> > I am counting 11 disks 1 TB each in a raidz2 pool. This is 11 TB
> > gross capacity, and 9 TB net. Zpool is however stating 10 TB and zfs
> > is stating 8TB. The difference between net and gross is correct, but
> > where is the capacity from the 11th disk going?
> 
> My guess is unit conversion and rounding. Your pool has 11 base 10 TB, 
> which is 10.2445 base 2 TiB.
> 
> Likewise your fs has 9 base 10 TB, which is 8.3819 base 2 TiB.

Not quite.  

11 x 10^12 =~ 10.004 x (1024^4).

So, the 'zpool list' is right on, at "10T" available.

For the 'zfs list', remember there is a slight overhead for filesystem
formatting. 

So, instead of 

9 x 10^12 =~ 8.185 x (1024^4)

it shows 7.99TB usable.  The roughly 200GB difference is the overhead (or about
3%).




-- 
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] corruption of ZFS on iSCSI storage

2010-03-15 Thread Tonmaus
> Being an iscsi
> target, this volume was mounted as a single iscsi
> disk from the solaris host, and prepared as a zfs
> pool consisting of this single iscsi target. ZFS best
> practices, tell me that to be safe in case of
> corruption, pools should always be mirrors or raidz
> on 2 or more disks. In this case, I considered all
> safe, because the mirror and raid was managed by the
> storage machine. 

As far as I understand the Best Practices, redundancy needs to be within ZFS in
order to provide full protection. So, actually, the Best Practices say that your
scenario is rather one to be avoided.

Regards,
Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Possible newbie question about space between zpool and zfs file systems

2010-03-15 Thread Tonmaus
> > My guess is unit conversion and rounding. Your pool has 11 base 10 TB,
> > which is 10.2445 base 2 TiB.
> >
> > Likewise your fs has 9 base 10 TB, which is 8.3819 base 2 TiB.
>
> Not quite.
>
> 11 x 10^12 =~ 10.004 x (1024^4).
>
> So, the 'zpool list' is right on, at "10T" available.

Duh! I completely forgot about this. Thanks for the heads-up.

Tonmaus
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Possible newbie question about space between zpool and zfs file systems

2010-03-15 Thread Carson Gaspar

Someone wrote (I haven't seen the mail, only the unattributed quote):

My guess is unit conversion and rounding. Your pool has 11 base 10 TB,
which is 10.2445 base 2 TiB.

Likewise your fs has 9 base 10 TB, which is 8.3819 base 2 TiB.


Not quite.

11 x 10^12 =~ 10.004 x (1024^4).

So, the 'zpool list' is right on, at "10T" available.


Duh, I was doing GiB math (y = x * 10^9 / 2^20), not TiB math (y = x * 
10^12 / 2^40).


Thanks for the correction.

--
Carson
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] corruption of ZFS on iSCSI storage

2010-03-15 Thread Tim Cook
On Mon, Mar 15, 2010 at 9:55 AM, Gabriele Bulfon wrote:

> Hello,
> I'd like to check for any guidance about using zfs on iscsi storage
> appliances.
> Recently I had an unlucky situation with an unlucky storage machine
> freezing.
> Once the storage was up again (rebooted) all other iscsi clients were
> happy, while one of the iscsi clients (a sun solaris sparc, running Oracle)
> did not mount the volume marking it as corrupted.
> I had no way to get back my zfs data: had to destroy and recreate from
> backups.
> So I have some questions regarding this nice story:
> - I remember sysadmins being able to almost always recover data on
> corrupted ufs filesystems by magic of superblocks. Is there something
> similar on zfs? Is there really no way to access data of a corrupted zfs
> filesystem?
> - In this case, the storage appliance is a legacy system based on linux, so
> raids/mirrors are managed at the storage side its own way. Being an iscsi
> target, this volume was mounted as a single iscsi disk from the solaris
> host, and prepared as a zfs pool consisting of this single iscsi target. ZFS
> best practices, tell me that to be safe in case of corruption, pools should
> always be mirrors or raidz on 2 or more disks. In this case, I considered
> all safe, because the mirror and raid was managed by the storage machine.
> But from the solaris host point of view, the pool was just one! And maybe
> this has been the point of failure. What is the correct way to go in this
> case?
> - Finally, looking forward to run new storage appliances using OpenSolaris
> and its ZFS+iscsitadm and/or comstar, I feel a bit confused by the
> possibility of having a double zfs situation: in this case, I would have the
> storage zfs filesystem divided into zfs volumes, accessed via iscsi by a
> possible solaris host that creates his own zfs pool on it (...is it too
> redundant??) and again I would fall in the same previous case (host zfs pool
> connected to one only iscsi resource).
>
> Any guidance would be really appreciated :)
> Thanks a lot
> Gabriele.
>
>
To answer the other portion of your question: yes, you can roll back ZFS if
you're at the proper version.  The procedure is described at the link below;
essentially it will try to find the last known good transaction.  If that
doesn't work, your only remaining option is to restore from backup:
http://docs.sun.com/app/docs/doc/817-2271/gbctt?l=ja&a=view
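
On builds that have it, the same last-known-good-transaction recovery is also
exposed through the import path; a rough sketch (the pool name is hypothetical,
and -F discards the last few seconds of writes, so treat it as a last resort):

# try a normal import first
zpool import tank
# if that fails due to corruption, ask ZFS to rewind to a consistent txg
zpool import -F tank
# add -n to see whether the rewind would succeed without actually doing it
zpool import -Fn tank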

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] persistent L2ARC

2010-03-15 Thread Giovanni Tirloni
On Mon, Mar 15, 2010 at 5:39 PM, Abdullah Al-Dahlawi wrote:

> Greeting ALL
>
>
> I understand that L2ARC is still under enhancement. Does any one know if
> ZFS can be upgrades to include "Persistent L2ARC", ie. L2ARC will not loose
> its contents after system reboot ?
>

There is a bug open for that, but it doesn't seem to be implemented yet.

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6662467

-- 
Giovanni
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] corruption of ZFS on iSCSI storage

2010-03-15 Thread Ross Walker

On Mar 15, 2010, at 7:11 PM, Tonmaus  wrote:


Being an iscsi
target, this volume was mounted as a single iscsi
disk from the solaris host, and prepared as a zfs
pool consisting of this single iscsi target. ZFS best
practices, tell me that to be safe in case of
corruption, pools should always be mirrors or raidz
on 2 or more disks. In this case, I considered all
safe, because the mirror and raid was managed by the
storage machine.


As far as I understand Best Practises, redundancy needs to be within  
zfs in order to provide full protection. So, actually Best Practises  
says that your scenario is rather one to be avoided.


There is nothing saying redundancy can't be provided below ZFS; it's just that if
you want auto-recovery you need redundancy within ZFS itself as well.


You can have 2 separate RAID arrays served up via iSCSI to ZFS, which then
makes a mirror out of the storage.
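
A minimal sketch of that layout (device names are made up; each one is a LUN
exported by a different array):

# c2t1d0 comes from array A, c3t1d0 from array B
zpool create tank mirror c2t1d0 c3t1d0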


-Ross

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] corruption of ZFS on iSCSI storage

2010-03-15 Thread Tim Cook
On Mon, Mar 15, 2010 at 9:10 PM, Ross Walker  wrote:

> On Mar 15, 2010, at 7:11 PM, Tonmaus  wrote:
>
>  Being an iscsi
>>> target, this volume was mounted as a single iscsi
>>> disk from the solaris host, and prepared as a zfs
>>> pool consisting of this single iscsi target. ZFS best
>>> practices, tell me that to be safe in case of
>>> corruption, pools should always be mirrors or raidz
>>> on 2 or more disks. In this case, I considered all
>>> safe, because the mirror and raid was managed by the
>>> storage machine.
>>>
>>
>> As far as I understand Best Practises, redundancy needs to be within zfs
>> in order to provide full protection. So, actually Best Practises says that
>> your scenario is rather one to be avoided.
>>
>
> There is nothing saying redundancy can't be provided below ZFS just if you
> want auto recovery you need redundancy within ZFS itself as well.
>
> You can have 2 separate raid arrays served up via iSCSI to ZFS which then
> makes a mirror out of the storage.
>
> -Ross
>
>
Perhaps I'm remembering incorrectly, but I didn't think mirroring would
auto-heal/recover; I thought that was limited to the raidz* implementations.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] corruption of ZFS on iSCSI storage

2010-03-15 Thread Ross Walker

On Mar 15, 2010, at 11:10 PM, Tim Cook  wrote:




On Mon, Mar 15, 2010 at 9:10 PM, Ross Walker   
wrote:

On Mar 15, 2010, at 7:11 PM, Tonmaus  wrote:

Being an iscsi
target, this volume was mounted as a single iscsi
disk from the solaris host, and prepared as a zfs
pool consisting of this single iscsi target. ZFS best
practices, tell me that to be safe in case of
corruption, pools should always be mirrors or raidz
on 2 or more disks. In this case, I considered all
safe, because the mirror and raid was managed by the
storage machine.

As far as I understand Best Practises, redundancy needs to be within  
zfs in order to provide full protection. So, actually Best Practises  
says that your scenario is rather one to be avoided.


There is nothing saying redundancy can't be provided below ZFS just  
if you want auto recovery you need redundancy within ZFS itself as  
well.


You can have 2 separate raid arrays served up via iSCSI to ZFS which  
then makes a mirror out of the storage.


-Ross


Perhaps I'm remembering incorrectly, but I didn't think mirroring  
would auto-heal/recover, I thought that was limited to the raidz*  
implementations.


Mirroring auto-heals; in fact, copies=2 on a single-disk vdev can auto-heal
(as long as it isn't a whole-disk failure).
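
For example (a sketch; the dataset name is made up):

# keep two copies of every block in this dataset, even on a single-disk pool
zfs set copies=2 tank/important
# a later scrub can then repair a damaged copy from the surviving one
zpool scrub tank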


-Ross

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to manage scrub priority or defer scrub?

2010-03-15 Thread Richard Elling
On Mar 14, 2010, at 11:25 PM, Tonmaus wrote:
> Hello again,
> 
> I am still concerned if my points are being well taken.
> 
>> If you are concerned that a
>> single 200TB pool would take a long
>> time to scrub, then use more pools and scrub in
>> parallel.
> 
> The main concern is not scrub time. Scrub time could be weeks if scrub just 
> would behave. You may imagine that there are applications where segmentation 
> is a pain point, too.

I agree.

>> The scrub will queue no more
>> than 10 I/Os at one time to a device, so devices which
>> can handle concurrent I/O
>> are not consumed entirely by scrub I/O. This could be
>> tuned lower, but your storage 
>> is slow and *any* I/O activity will be noticed.
> 
> There are a couple of things I maybe don't understand, then.
> 
> - zpool iostat is reporting more than 1k of outputs while scrub

ok

> - throughput is as high as can be until maxing out CPU

You would rather your CPU be idle?  What use is an idle CPU, besides wasting 
energy :-)?

> - nominal I/O capacity of a single device is still around 90, how can 10 I/Os 
> already bring down payload

90 IOPS is approximately the worst-case rate for a 7,200 rpm disk under a small,
random workload. ZFS tends to write sequentially, so "random writes" tend to
become "sequential writes" on ZFS. So it is quite common to see scrub workloads
with >> 90 IOPS.
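
(As a rough sanity check on that figure: a 7,200 rpm drive averages about 4.2 ms
of rotational latency plus roughly 8 ms of seek per random I/O, call it 12 ms
per operation, which works out to something in the 80-90 IOPS range.)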

> - scrubbing the same pool, configured as raidz1 didn't max out CPU which is 
> no surprise (haha, slow storage...) the notable part is that it didn't slow 
> down payload that much either.

raidz creates more, smaller writes than a mirror or simple stripe. If the disks
are slow, then the IOPS will be lower and the scrub takes longer, but the I/O
scheduler can manage the queue better (disks are slower).

> - scrub is obviously fine with data added or deleted during a pass. So, it 
> could be possible to pause and resume a pass, couldn't it?

You can start or stop scrubs, but there is no resume directive.   There are several
bugs/RFEs along these lines, something like:
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6743992

> My conclusion from these observations is that not only disk speed counts 
> here, but other bottlenecks may strike as well. Solving the issue by the 
> wallet is one way, solving it by configuration of parameters is another. So, 
> is there a lever for scrub I/O prio, or not? Is there a possibility to pause 
> scrub passed and resume?

Scrub is already the lowest priority.  Would you like it to be lower?
I think the issue is more related to which queue is being managed by
the ZFS priority scheduler rather than the lack of scheduling priority.
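
(If you do want to throttle it further, the per-device scrub queue depth
mentioned above has historically been a kernel tunable; a hedged sketch,
assuming the zfs_scrub_limit tunable exists on your build - verify before
poking at it with mdb:)

# read the current per-leaf-vdev scrub I/O limit (default 10)
echo zfs_scrub_limit/D | mdb -k
# lower it to 5 on the running kernel (not persistent across reboot)
echo zfs_scrub_limit/W0t5 | mdb -kw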
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Atlanta, March 16-18, 2010 http://nexenta-atlanta.eventbrite.com 
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss