[zfs-discuss] zpool mirror (dumb question)

2010-05-01 Thread Steve Staples
Hi there!

I am new to the list, and to OpenSolaris, as well as ZFS.

I am creating a zpool/zfs to use on my NAS server, and basically I want some
redundancy for my files/media.  What I am looking to do is get a bunch of
2TB drives and add them to a zpool as mirrors, so that I don't have to
worry about running out of room. (I know, pretty typical I guess).

My problem is that not all 2TB hard drives are the same size (even though
they should all be 2 trillion bytes, there is sometimes a small variance;
I've only noticed this twice so far).  If I create them mirrored, one
fails, and the replacement drive happens to be even one byte smaller, the
replace will not work.

How would I go about fixing this "problem"?


THIS is just a thought, and I am looking for thoughts and opinions on doing
this... it probably would be a bad idea, but hey, does it hurt to ask?

I have been thinking: would it be a good idea to create, say, 1TB or 500GB
"files" on the 2TB drives and then mirror those?  So basically, set up a
2TB hard drive like this:

(where drive1 and drive2 are the paths to the mount points)
mkfile 465g /drive1/drive1part1
mkfile 465g /drive1/drive1part2
mkfile 465g /drive1/drive1part3
mkfile 465g /drive1/drive1part4

mkfile 465g /drive2/drive2part1
mkfile 465g /drive2/drive2part2
mkfile 465g /drive2/drive2part3
mkfile 465g /drive2/drive2part4

(I use 465g because 2TB = 2 trillion bytes, which is about 1862 GiB; divided
by 4 that is about 465.66 GiB.)

And then add them to the zpool:
zpool add medianas mirror /drive1/drive1part1 /drive2/drive2part1
zpool add medianas mirror /drive1/drive1part2 /drive2/drive2part2
zpool add medianas mirror /drive1/drive1part3 /drive2/drive2part3
zpool add medianas mirror /drive1/drive1part4 /drive2/drive2part4

And then, if a drive goes and I only have a 500GB and a 1.5TB drive on hand,
they could be used as replacements that way?

I am sure there are performance issues in doing this, but would the
performance cost be outweighed by the flexibility it gives when a drive
fails and has to be replaced?
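
I could try the layout first with small scratch files before committing the
real disks (the paths here are just placeholders, and I guess the first
mirror pair goes into "zpool create" rather than "zpool add"):

mkfile 100m /test/d1p1 /test/d1p2
mkfile 100m /test/d2p1 /test/d2p2
# the first pair creates the pool, later pairs get added to it
zpool create medianas mirror /test/d1p1 /test/d2p1
zpool add medianas mirror /test/d1p2 /test/d2p2
zpool status medianas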

Sorry for posting a novel, but I am just concerned about failure on bigger
drives, and about putting my media/files into what is basically a JBOD-type
array (on steroids).

Steve

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore

2010-05-01 Thread Edward Ned Harvey
> From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us]
> Sent: Saturday, May 01, 2010 7:07 PM
> 
> On Sat, 1 May 2010, Peter Tribble wrote:
> >>
> >> With the new Oracle policies, it seems unlikely that you will be
> able to
> >> reinstall the OS and achieve what you had before.
> >
> > And what policies have Oracle introduced that mean you can't
> reinstall
> > your system?
> 
> The main concern is that you might not be able to get back the same OS
> install you had before due to loss of patch access after your service
> contract has expired and Oracle arbitrarily decided not to grant you a
> new one.

It's as if you didn't even read this thread.  In the proposed answers to
Euan's question, there is no need to apply any patches, or to have any
service contract.  As long as you still have your OS install CD, or *any* OS
install CD, you install a throw-away OS, just for the sake of letting the
installer create the partitions, boot record, boot properties, etc...  And
then you immediately obliterate and overwrite rpool, using your backup
image.  Since this restoration process puts the filesystem back into the
exact state it was in before the failure, all the patches you previously had
are restored, and everything is just as it was before the crash.
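
As a rough sketch of the overwrite step (the BE name, device, and backup file
below are only placeholders, and I'm assuming the backup is a "zfs send -R"
stream of a recursive rpool snapshot):

# boot the CD again after the throw-away install and grab the pool
zpool import -f rpool
# overwrite every dataset in rpool from the backup stream
zfs receive -Fdu rpool < /backup/rpool.zfs
# point the pool at the restored boot environment and refresh the boot blocks
zpool set bootfs=rpool/ROOT/opensolaris rpool
installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t0d0s0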

There is nothing anywhere which indicates any reason you couldn't do this,
even in the future.  So you're totally spreading BS on this one.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Single-disk pool corrupted after controller failure

2010-05-01 Thread Diogo Franco
On 05/01/2010 06:07 PM, Bill Sommerfeld wrote:
> there are two reasons why you could get this:
>  1) the labels are gone.
Possible, since I got the metadata errors on `zpool status` before.

>  2) the labels are not at the start of what solaris sees as p1, and thus
> are somewhere else on the disk.  I'd look more closely at how freebsd
> computes the start of the partition or slice '/dev/ad6s1d'
> that contains the pool.
> 
> I think #2 is somewhat more likely.
c5d0p1 is also the only place where zdb finds any labels at all...
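
I'll check the other device nodes that cover the disk as well, something like
this (p0 is the whole disk; s2/s0 are just the usual slice candidates):

# each label that unpacks prints one 'version:' line, so the count
# shows how many of the four labels are readable via that node
for dev in c5d0p0 c5d0p1 c5d0s2 c5d0s0; do
    echo "== /dev/dsk/$dev"
    zdb -l /dev/dsk/$dev | grep -c 'version:'
done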
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Virtual to physical migration

2010-05-01 Thread devsk
This actually turned out be a lot of fun! The end of it is that I have a hard 
disk partition now which can boot in both physical and virtual world (got rid 
of the VDIs finally!). The physical world has outstanding performance but has 
ugly graphics (1600x1200 VESA driver with weird DPI and fonts... ughhh) because
ATI drivers don't work in OpenSolaris for my card.

Virtualbox gives me better graphics than my real install. This is a bit of a 
painful pill to swallow!

One thing I tested right away in the physical install was to see how portage 
performance was compared to Linux. This is a sort of test of the scheduler as 
well as small file performance of the FS. OpenSolaris emerged python-2.6.5 in 
45 seconds compared to 55 seconds in Linux, cmake took 38 seconds vs 53 seconds 
in Linux. In general, a portage operation (like emerge -pv ) completes 
much much faster in OpenSolaris.

It's not to say OpenSolaris doesn't have issues. I notice short freezes
(keyboard/mouse and gkrellm updates) lasting a few seconds (sometimes more)
during FS activity. A Firefox restart (like after an add-on install) takes
forever, whereas it should be instant because all of it is already in memory. I
would like to troubleshoot these sometime using DTrace.

Lastly, OpenSolaris is memory hungry! It crawls without enough of it. In the VM
I have it using 1.5GB, and the package manager alone can eat more than half of
that, throwing everything else into swap.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS Root Permissions Create Pool With Zpool

2010-05-01 Thread WebDawg
When I first started using ZFS I tried to create a pool from my disks
/dev/c8d1 and /dev/c8d1.  I could see the slices, though.

I could not see those disks without being root, and although I understood
why, ZFS didn't: it could not find the disks and did not tell me I needed to
be root.

That is all...

Web...
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore

2010-05-01 Thread Bob Friesenhahn

On Sat, 1 May 2010, Peter Tribble wrote:

>> With the new Oracle policies, it seems unlikely that you will be able to
>> reinstall the OS and achieve what you had before.
>
> And what policies have Oracle introduced that mean you can't reinstall
> your system?

The main concern is that you might not be able to get back the same OS
install you had before due to loss of patch access after your service
contract has expired and Oracle arbitrarily decided not to grant you a
new one.

Maybe if you are able to overwrite the pool with the original pristine
state rather than rely on an "install", then you would be ok.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore

2010-05-01 Thread Ian Collins

On 05/ 1/10 04:46 PM, Edward Ned Harvey wrote:
> One more really important gotcha.  Let's suppose the version of zfs on the
> CD supports up to zpool 14.  Let's suppose your "live" system had been fully
> updated before the crash, and let's suppose the zpool had been upgraded to
> zpool 15.  Wouldn't that mean it's impossible to restore your rpool using
> the CD?

Just make sure you have an up-to-date live CD when you upgrade your pool.

It's seldom wise to upgrade a pool too quickly after an OS upgrade; you
may find an issue and have to revert to a previous BE.


--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore

2010-05-01 Thread Peter Tribble
On Fri, Apr 30, 2010 at 6:39 PM, Bob Friesenhahn wrote:
> On Thu, 29 Apr 2010, Edward Ned Harvey wrote:
>>
>> This is why I suggested the technique of:
>> Reinstall the OS just like you did when you first built your machine,
>> before the catastrophe.  It doesn't even matter if you make the same
>> selections you
>
> With the new Oracle policies, it seems unlikely that you will be able to
> reinstall the OS and achieve what you had before.

And what policies have Oracle introduced that mean you can't reinstall
your system?

-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Single-disk pool corrupted after controller failure

2010-05-01 Thread Bill Sommerfeld

On 05/01/10 13:06, Diogo Franco wrote:
> After seeing that in some cases labels were corrupted, I tried running
> zdb -l on mine:
> ...
> (labels 0, 1 not there, labels 2, 3 are there).
>
> I'm looking for pointers on how to fix this situation, since the disk
> still has available metadata.

there are two reasons why you could get this:
 1) the labels are gone.

 2) the labels are not at the start of what solaris sees as p1, and thus
are somewhere else on the disk.  I'd look more closely at how freebsd
computes the start of the partition or slice '/dev/ad6s1d' that contains
the pool.

I think #2 is somewhat more likely.

- Bill
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Performance drop during scrub?

2010-05-01 Thread Bob Friesenhahn

On Fri, 30 Apr 2010, Freddie Cash wrote:
> Without a periodic scrub that touches every single bit of data in the
> pool, how can you be sure that 10-year-old files that haven't been opened
> in 5 years are still intact?

You don't.  But it seems that having two or three extra copies of the
data on different disks should instill considerable confidence.  With
sufficient redundancy, chances are that the computer will explode
before it loses data due to media corruption.  The calculated time
before data loss becomes longer than even the pyramids in Egypt could
withstand.

The situation becomes similar to having a house with a heavy front
door with three deadbolt locks, and many glass windows.  The front
door with its three locks is no longer a concern when you are
evaluating your home for its security against burglary or home
invasion, because the glass windows are so fragile and easily broken.

It is necessary to look at all the factors which might result in data
loss before deciding what the most effective steps are to minimize
the probability of loss.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS: "Cannot replace a replacing drive"

2010-05-01 Thread Victor Latushkin

On Apr 29, 2010, at 2:20 AM, Freddie Cash wrote:

> On Wed, Apr 28, 2010 at 2:48 PM, Victor Latushkin wrote:
>>
>> 2. Run 'zdb -ddd storage' and provide the section titled 'Dirty Time Logs'
>
> See attached.

So you really do have enough redundancy to handle this scenario, which means
this is a software bug. On a recent OpenSolaris build you should be able to
detach one of the devices and replace the second one. Pool version 14
corresponds to build 103, and spa_vdev_detach() was changed significantly in
build 105 (along with other related changes), so those changes are probably
not yet available in FreeBSD.
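
In sketch form (the pool name is yours, but the guid and device names below
are only placeholders - 'zpool status' shows the real ones):

# detach the stuck half of the "replacing" vdev by its guid
zpool detach storage 1234567890123456789
# then replace the remaining old device with the new disk
zpool replace storage da3 da12
zpool status storage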


> 
> 3. Try 'zpool detach' approach on v14 system?
> 
> Pool upgraded successfully.

By the way, you do not have to upgrade the pool immediately along with an
upgrade to a newer ZFS version. But now you cannot use the upgraded pool on a
system running older bits.

> Same results to all the zpool commands, though:  online, offline, detach, 
> replace.

OK, I see. I mistakenly thought that v14 is for user/group quotas, which would
mean build 114, but I was wrong - user/group quotas require version 15.

> Another option may be to try latest OpenSolaris LiveCD (build 134).
> 
> I'll have to see if I can download/make one.
> 
> Does it include drivers for 3Ware 9550SXU and 9650SE RAID controllers?  All 
> the drives are plugged into those.

I do not know for sure, but a quick check with Google suggests that the
chances are good.

victor

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Single-disk pool corrupted after controller failure

2010-05-01 Thread Diogo Franco
I had a single spare 500GB HDD and I decided to install a FreeBSD file
server on it for learning purposes, and I moved almost all of my data
to it. Yesterday, naturally after no longer having backups of the
data on the server, I had a controller failure (SiS 180 (oh, the
quality)) and the HDD was considered unplugged. When I noticed a few
checksum failures on `zpool status` (including two on metadata (small
hex numbers)), I tried running `zpool scrub tank`, thinking it was
regular data corruption, and then the box locked up. I had also
converted the pool to v14 a few days before, so the FreeBSD v13 tools
couldn't do anything to help.

Today I downloaded the OpenSolaris 134 snapshot image and booted it to
try and rescue the pool, but:

# zpool status
no pools available

So I couldn't run a clear, or an export, or a destroy to re-import with -D.
I tried to run a regular import:

# zpool import
 pool: tank
   id: 6157028625215863355
state: FAULTED
status: The pool was last accessed by another system.
action: The pool cannot be imported due to damaged devices or data.
The pool may be active on another system, but can be imported using
the '-f' flag.
  see: http://www.sun.com/msg/ZFS-8000-EY
config:

tankFAULTED  corrupted data
 c5d0p1UNAVAIL  corrupted data

There was no important data written in the past two days or so, thus
using an older uberblock wouldn't be a problem, so I tried using the
new recovery option:

# mkdir -p /mnt/tank && zpool import -fF -R /mnt/tank tank
cannot import 'tank': one or more devices is currently unavailable
Destroy and re-create the pool from
a backup source.

I tried googling for other people with similar issues, but almost all
of them had RAID setups and other complex configurations, and were not
really related to this problem.
After seeing that in some cases labels were corrupted, I tried running
zdb -l on mine:

# zdb -l /dev/dsk/c5d0p1

LABEL 0

failed to unpack label 0

LABEL 1

failed to unpack label 1

LABEL 2

version: 14
name: 'tank'
state: 0
txg: 11420324
pool_guid: 6157028625215863355
hostid: 2563111091
hostname: ''
top_guid: 1987270273092463401
guid: 1987270273092463401
vdev_tree:
type: 'disk'
id: 0
guid: 1987270273092463401
path: '/dev/ad6s1d'
whole_disk: 0
metaslab_array: 23
metaslab_shift: 32
ashift: 9
asize: 497955373056
is_log: 0
DTL: 111

LABEL 3

version: 14
name: 'tank'
state: 0
txg: 11420324
pool_guid: 6157028625215863355
hostid: 2563111091
hostname: ''
top_guid: 1987270273092463401
guid: 1987270273092463401
vdev_tree:
type: 'disk'
id: 0
guid: 1987270273092463401
path: '/dev/ad6s1d'
whole_disk: 0
metaslab_array: 23
metaslab_shift: 32
ashift: 9
asize: 497955373056
is_log: 0
DTL: 111

I'm looking for pointers on how to fix this situation, since the disk
still has available metadata.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS on ZFS based storage?

2010-05-01 Thread Brandon High
On Sat, May 1, 2010 at 7:08 AM, Gabriele Bulfon  wrote:
> My question is:
> - is it correct to mount the iSCSI devices as base disks for the VM and then
> create zpools/volumes in it, considering that behind it there is already 
> another zfs?

Yes, that will work fine. In fact, zfs checksums will help protect
against over-the-wire errors. You can enable redundancy at either or both
levels, depending on performance requirements, available space and
your level of paranoia. Using mirroring or raidz in your VM will use
more bandwidth to your iSCSI server.
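
For example, with two iSCSI LUNs presented to the guest (the device names are
placeholders), mirroring at the VM level is just:

# inside the guest, mirror across the two iSCSI-backed disks
zpool create vmpool mirror c2t0d0 c2t1d0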

> - in case it's correct to have the VM zfs over the storage zfs, where should 
> I manage snapshots? on the VM or on the storage?

It depends on what you plan on doing with the VM. I'd probably do both,
depending on the changes that I plan on making. For instance, use time
slider / zfs-auto-snapshot on the VM, but also snapshot the zvol on
the backing store before making any big configuration changes.
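
Roughly (the pool, zvol, and snapshot names are only examples):

# inside the VM: frequent, fine-grained snapshots
zfs snapshot -r vmpool@2010-05-01-nightly
# on the storage box: a coarse checkpoint of the backing zvol
zfs snapshot tank/vm/osol-disk@before-upgrade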

-B

-- 
Brandon High : bh...@freaks.com
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Reverse lookup: inode to name lookup

2010-05-01 Thread Mattias Pantzare
> If the kernel (or root) can open an arbitrary directory by inode number,
> then the kernel (or root) can find the inode number of its parent by looking
> at the '..' entry, which the kernel (or root) can then open, and identify
> both:  (a) the name of the child subdir whose inode number is already known, and
> (b) yet another '..' entry.  The kernel (or root) can repeat this process
> recursively, up to the root of the filesystem tree.  At that time, the
> kernel (or root) has completely identified the absolute path of the inode
> that it started with.
>
> The only question I want answered right now is:
>
> Although it is possible, is it implemented?  Is there any kind of function,
> or existing program, which can be run by root, to obtain either the complete
> path of a directory by inode number, or to simply open an inode by number,
> which would leave the recursion and absolute path generation yet to be
> completed?

You can do it in the kernel by calling vnodetopath(). I don't know if it
is exposed to user space.

But that could be slow if you have large directories, so you have to
think about where you would use it.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Reverse lookup: inode to name lookup

2010-05-01 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Mattias Pantzare
> 
> The nfs server can find the file but not the file _name_.
> 
> The inode is all that the NFS server needs; it does not need the file name
> if it has the inode number.

It is not useful or helpful for you guys to debate whether or not this is
possible.  And it is especially not helpful to flat out say "it's not
possible."  Here is the final word on whether or not it's possible:
Whenever any process calls "open('/some/path/filename')" that system call is
handled by the kernel, recursively resolving name to inode number, checking
permissions, and opening that inode number, until the final inode is
identified and opened, or some error is encountered.  The point is:
Obviously, the kernel has the facility to open an inode by number.  However,
for security reasons (enforcing permissions of parent directories before the
parent directories have been identified), the ability to open an arbitrary
inode by number is not normally made available to user level applications,
except perhaps when run by root.

At present, a file inode does not contain any reference to its parent
directory or directories.  But that's just a problem inherent to files.  It
is fundamentally easier to reverse lookup a directory by inode number,
because this information is already in the filesystem.  No filesystem
enhancements are needed to reverse lookup a directory by inode number,
because:  (a) every directory contains an entry ".." which refers to its
parent by number, and (b) every directory has precisely one parent, and no
more.  There is no such thing as a hardlink copy of a directory.  Therefore,
there is exactly one absolute path to any directory in any ZFS filesystem.

If the kernel (or root) can open an arbitrary directory by inode number,
then the kernel (or root) can find the inode number of its parent by looking
at the '..' entry, which the kernel (or root) can then open, and identify
both:  (a) the name of the child subdir whose inode number is already known, and
(b) yet another '..' entry.  The kernel (or root) can repeat this process
recursively, up to the root of the filesystem tree.  At that time, the
kernel (or root) has completely identified the absolute path of the inode
that it started with.
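
To be concrete, here is the same recursion done the slow way, from a shell,
starting at a directory you can already cd into (this is essentially what
pwd(1) does; hidden directories and names with spaces are skipped for
brevity):

#!/bin/sh
# resolve the absolute path of the current directory by walking '..'
# and matching inode numbers
path=
while :; do
    cur=`ls -di .  | awk '{print $1}'`   # inode number of the current dir
    par=`ls -di .. | awk '{print $1}'`   # inode number of its parent
    [ "$cur" = "$par" ] && break         # '.' and '..' match only at /
    # find the entry in the parent whose inode matches the current dir
    name=`ls -di ../*/ | awk '$1 == '"$cur"' {print $2; exit}'`
    name=`basename "$name"`
    path="/$name$path"
    cd ..
done
echo "${path:-/}"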

The only question I want answered right now is:

Although it is possible, is it implemented?  Is there any kind of function,
or existing program, which can be run by root, to obtain either the complete
path of a directory by inode number, or to simply open an inode by number,
which would leave the recursion and absolute path generation yet to be
completed?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs log on another zfs pool

2010-05-01 Thread mark.musa...@oracle.com

What problem are you trying to solve?




On 1 May 2010, at 02:18, Tuomas Leikola wrote:



Hi.

I have a simple question. Is it safe to place log device on another  
zfs disk?


I'm planning on placing the log on my mirrored root partition. Using  
latest opensolaris.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore

2010-05-01 Thread Bob Friesenhahn

On Sat, 1 May 2010, Edward Ned Harvey wrote:
> Would that be fuel to recommend people, "Never upgrade your version of zpool
> or zfs on your rpool"?

It does seem to be a wise policy to not update the pool and filesystem
versions unless you require a new pool or filesystem feature.  Then
you would update to the minimum version required to support that
feature.  Note that if the default filesystem version changes and you
create a new filesystem, this may also cause problems (I have been
bitten by that before).
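
For reference, the commands involved look roughly like this (the version
number and dataset name are only examples, assuming your build allows setting
the version property at creation time):

# show the versions this build supports and what the pool/filesystems run
zpool upgrade -v
zfs upgrade -v
zpool get version rpool
# if a new filesystem must stay usable by older bits, pin its version
zfs create -o version=3 rpool/export/legacy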


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Reverse lookup: inode to name lookup

2010-05-01 Thread Mattias Pantzare
On Sat, May 1, 2010 at 16:49,   wrote:
>
>
>>No, a NFS client will not ask the NFS server for a name by sending the
>>inode or NFS-handle. There is no need for a NFS client to do that.
>
> The NFS clients, certainly for versions 2 and 3, only use the "file handle";
> the file handle can be decoded by the server.  The filehandle does not
> contain the name, only the FSid, the inode number and the generation.
>
>
>>There is no way to get a name from an inode number.
>
> The nfs server knows how so it is clearly possible.  It is not exported to
> userland but the kernel can find a file by its inumber.

The NFS server can find the file but not the file _name_.

The inode is all that the NFS server needs; it does not need the file name
if it has the inode number.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Reverse lookup: inode to name lookup

2010-05-01 Thread Casper . Dik


>No, a NFS client will not ask the NFS server for a name by sending the
>inode or NFS-handle. There is no need for a NFS client to do that.

The NFS clients, certainly for versions 2 and 3, only use the "file handle";
the file handle can be decoded by the server.  The filehandle does not
contain the name, only the FSid, the inode number and the generation.


>There is no way to get a name from an inode number.

The NFS server knows how, so it is clearly possible.  It is not exported to
userland, but the kernel can find a file by its inumber.

Casper

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Reverse lookup: inode to name lookup

2010-05-01 Thread Mattias Pantzare
On Sat, May 1, 2010 at 16:23,   wrote:
>
>
>>I understand you cannot lookup names by inode number in general, because
>>that would present a security violation.  Joe User should not be able to
>>find the name of an item that's in a directory where he does not have
>>permission.
>>
>>
>>
>>But, even if it can only be run by root, is there some way to lookup the
>>name of an object based on inode number?
>
> Sure, that's typically how NFS works.
>
> The inode itself is not sufficient; an inode number might be recycled, and
> an old snapshot with the same inode number may refer to a different file.

No, an NFS client will not ask the NFS server for a name by sending the
inode or NFS handle. There is no need for an NFS client to do that.

There is no way to get a name from an inode number.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Reverse lookup: inode to name lookup

2010-05-01 Thread Casper . Dik


>I understand you cannot lookup names by inode number in general, because
>that would present a security violation.  Joe User should not be able to
>find the name of an item that's in a directory where he does not have
>permission.
>
> 
>
>But, even if it can only be run by root, is there some way to lookup the
>name of an object based on inode number?

Sure, that's typically how NFS works.

The inode itself is not sufficient; an inode number might be recycled, and
an old snapshot with the same inode number may refer to a different file.

Casper

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS on ZFS based storage?

2010-05-01 Thread Gabriele Bulfon
I'm trying to guess what the best practice is in this scenario:
- let's say I have a zfs-based storage box (let's say Nexenta) that has its zfs
pools and volumes shared as iSCSI raw devices
- let's say I have another server running xVM or VirtualBox connected to the
storage
- let's say one of the virtual guests is OpenSolaris

My question is:
- is it correct to mount the iSCSI devices as base disks for the VM and then
create zpools/volumes on them, considering that behind them there is already
another zfs?
- what alternatives do I have?
- in case it's correct to have the VM zfs over the storage zfs, where should I
manage snapshots? On the VM or on the storage?

Thanks for any idea
Gabriele.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Reverse lookup: inode to name lookup

2010-05-01 Thread Edward Ned Harvey
Forget about files for the moment, because directories are fundamentally
easier to deal with.

 

Let's suppose I've got the inode number of some directory in the present
filesystem.

[r...@filer ~]# ls -id /share/projects/foo/goo/rev1.0/working

 14363 /share/projects/foo/goo/rev1.0/working/

 

I want to identify the previous names & locations of that directory from
snapshots.

find /share/.zfs/snapshot -inum 14363

 

And I want to do it fast.  I don't want to use "find" or anything else that
needs to walk every tree of every snapshot.  The answer needs to be
essentially zero-time, just like the "ls -id" is essentially zero-time.

 

I understand you cannot lookup names by inode number in general, because
that would present a security violation.  Joe User should not be able to
find the name of an item that's in a directory where he does not have
permission.

 

But, even if it can only be run by root, is there some way to lookup the
name of an object based on inode number?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to manage scrub priority or defer scrub?

2010-05-01 Thread Lutz Schumann
I was going through this posting and it seems that there is some "personal
tension" :).

However, going back to the technical problem of scrubbing a 200 TB pool, I
think this issue needs to be addressed.

One warning up front: this posting is rather long, and if you would like to
jump to the part dealing with scrub, jump to "Scrub implementation" below.

From my perspective:

  - ZFS is great for huge amounts of data

That's what it was made for, with 128-bit addressing and a JBOD design in
mind. So ZFS is perfect for internet multimedia in terms of scalability.

  - ZFS is great for commodity hardware

OK, you should use 24x7-rated drives, but 2 TB 7200 RPM disks are fine for
internet media mass storage. We want huge amounts of data stored, and in the
internet age nobody pays for this. So you must use low-cost hardware (well, it
must be compatible) - but you should not need enterprise components - that's
what we have ZFS as clever software for. For mass-storage internet services,
the alternative is NOT EMC or NetApp (remember, nobody pays a lot for the
services because you can get them free at Google) - the alternative is
Linux-based HW RAID (with its well-known limitations) and home-grown
solutions. Those do not have the nice ZFS features mentioned below.

  - ZFS guarantees data integrity by self-healing silent data corruption
(that's what the checksums are for) - but only if you have redundancy.

There are a lot of posts on the net about when people notice bad blocks - it
happens when a disk in a RAID5 fails and they have to resilver everything.
That is when you detect the missing redundancy. So people use RAID6 and
"hope" that everything works. Or people do scrubs on their advanced RAID
controllers (if they provide internal checksumming).

The same problem exists for huge, passive, raidz1 data sets in ZFS. If you do
not scrub the array regularly, the chances are higher that you will hit a bad
block during resilvering, and then ZFS cannot help. For active data sets the
problem is not as critical, because on every read the checksum is verified -
but the problem still exists, because once data is in the ARC cache nobody
checks it again. So we need scrub!
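
(As a stop-gap until scrub I/O can really be throttled, the best we can do
today is schedule it. The crontab lines below are only an example for a pool
named "tank": start the scrub in the quiet hours and stop whatever is still
running before the weekday load comes back.)

# root crontab: start a scrub Sunday 01:00, stop any running scrub Monday 06:00
0 1 * * 0  /usr/sbin/zpool scrub tank
0 6 * * 1  /usr/sbin/zpool scrub -s tank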

  - ZFS can do many other nice things

There's compression, dedup, etc. - however, I look at them as "nice to have".

  - ZFS needs proper pool design

Using ZFS right is not easy; sizing the system is even more complicated. There
are a lot of threads regarding pool design - the easiest answer is "do a lot
of mirrors", because then the read performance really scales. However, in
internet mass-media services you can't - too expensive - because mirrored ZFS
is more expensive than HW RAID6 with Linux. How many members per vdev?
Multiple pools or a single pool?

  - ZFS is open and community based

... well, let's see how this goes with Oracle "financing" the whole thing :)

And some of those points make ZFS a hit for internet service provider and
mass-media requirements (VOD etc.)!

So what's my point, you may ask?

My experience with ZFS is that some points are simply not addressed well
enough yet - BUT - ZFS is a living piece of software, and thanks to the many
great people developing it, it evolves faster than all the other storage
solutions. So for the longer term I believe ZFS will (hopefully) gain all the
"enterprise-ness" it needs, and it will revolutionize the storage industry
(like Cisco did for networking). I really believe that.

From my perspective, some of the points not addressed well in ZFS are:

  - pool defragmentation - you need this for a COW filesystem

I think the ZFS developers are working on this with the background rewriter,
so I hope it will come in 2010. With the rewriter, the on-disk layout can be
optimized for read performance for sequential workloads - also for raidz1 and
raidz2 - meaning ZFS can compete with RAID5 and RAID6, even with wider vdevs.
And wider vdevs mean more effective capacity. If the vdev read-ahead cache
works well with a sequentially aligned on-disk layout, then (from-disk) read
performance will be great.

  - I/O prioritization for zvols / zfs filesystems (aka storage QoS)

Unfortunately, you cannot prioritize I/O to zfs filesystems and zvols right
now. I think this is one of the features that makes ZFS unsuitable for
first-tier storage (like the EMC Symmetrix or NetApp FAS6000 series). You
need prioritization here - because your SAP system really is more important
than my MP3 web server :)

  - Deduplication is not ready for production

Currently dedup is nice, but the DDT handling and memory sizing are tricky
and hardly usable for larger pools (my perspective). The DDT is handled like
any other component - meaning user I/O can push the DDT out of the ARC (and
the L2ARC) - even with primarycache=metadata and secondarycache=metadata. For
typical mass-media storage applications, the working set is much larger than
the memory (and L2ARC), meaning your DDT will come from disk - causing real
performance degradation.
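
(To get a feel for the numbers on a given pool - the pool name "tank" is just
an example - zdb can report the DDT size, and can even simulate dedup on a
pool that does not use it yet:)

# histogram of the dedup table plus an estimate of its in-core size
zdb -DD tank
# simulate what enabling dedup would do on an existing, non-dedup pool
zdb -S tank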

This is especially true for 

[zfs-discuss] zfs log on another zfs pool

2010-05-01 Thread Tuomas Leikola
Hi.

I have a simple question: is it safe to place a log device on another zfs
disk?

I'm planning on placing the log on my mirrored root partition, using the
latest OpenSolaris.
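
In other words, something like this (pool, disk and slice names are just
placeholders - a spare slice on each of the two root disks, not a file or
zvol inside rpool):

# mirror a small slice of each root disk as the log device
zpool add tank log mirror c0t0d0s4 c0t1d0s4
zpool status tank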
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Panic when deleting a large dedup snapshot

2010-05-01 Thread Roy Sigurd Karlsbakk
- "Cindy Swearingen"  skrev:

> Brandon,
> 
> You're probably hitting this CR:
> 
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6924824

Interesting - reported in February and still no fix?

roy
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss