[zfs-discuss] dedup and handling corruptions - impossible?

2010-08-21 Thread devsk
If dedup is ON and the pool develops corruption in a file, I can never fix it 
because when I try to copy the correct file on top of the corrupt file,
the block hashes will match the existing blocks and only the reference count 
will be updated. The only way to fix it is to delete all
snapshots (to remove all references), then delete the file, and then copy the 
valid file back. This is a pretty high cost if that is really the case (empirical
evidence so far; I don't know the internal details).

Has anyone else experienced this?
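For reference, a rough sketch of the commands involved (pool/dataset names are 
placeholders, and whether 'dedup=verify' actually avoids the reference-count-only 
update in this situation is exactly what I'm unsure about):

  # list the files the pool currently reports as corrupt
  zpool status -v tank

  # see how dedup and checksums are configured on the affected dataset
  zfs get dedup,checksum tank/data

  # assumption: forcing a byte-for-byte comparison on hash matches,
  # instead of trusting the checksum alone, might sidestep the problem
  zfs set dedup=verify tank/data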
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Root pool on boot drive lost on another machine because of devids

2010-08-21 Thread devsk
Actually, I figured it has nothing to do with /etc/zfs/zpool.cache. As part of 
removing that file within a LiveCD, I was basically importing and exporting the 
rpool.
So, the only thing required is an equivalent of 'zpool import -f rpool && zpool 
export rpool'. I wonder why this can't be automated.
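Roughly, the manual workaround from a LiveCD looks like this (a sketch; 'rpool' 
is the root pool name in my case, and the alternate root is just so nothing 
collides with the live environment):

  # import the root pool while ignoring the stale zpool.cache,
  # then export it again so the labels/devids are rewritten
  zpool import -f -R /a rpool
  zpool export rpool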

PS: I understand why the grub code may not have the 'zpool' command as such, but 
it can definitely do the equivalent. Grub does have ZFS-specific code.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Root pool on boot drive lost on another machine because of devids

2010-08-21 Thread devsk
> > I have a USB flash drive which boots up my
> > opensolaris install. What happens is that whenever I
> > move to a different machine,
> > the root pool is lost because the devids don't match
> > with what's in /etc/zfs/zpool.cache and the system
> > just can't find the rpool.
> 
> See defect 4755 or defect 5484
> 
> https://defect.opensolaris.org/bz/show_bug.cgi?id=4755
> 
> https://defect.opensolaris.org/bz/show_bug.cgi?id=5484
> 
> When I last experimented with booting Solaris
> from flash memory sticks I modified scsa2usb
> so that it would construct a devid for the usb
> flash memory stick,

Isn't it as simple as 'read what pool the user specified in the findroot/bootfs 
commands, zpool export <pool> and zpool import -f <pool>' and move on?

If it wasn't, why would removing /etc/zfs/zpool.cache using an ISO-based LiveCD 
make it work every time? I can change the controller for this drive freely,
as long as there is no valid /etc/zfs/zpool.cache file present.

BTW: in my case, it's not strictly treated as a removable flash drive, because I 
am accessing the USB flash drive as a physical drive in VirtualBox.
So, depending on which SCSI port I add it to, it may or may not boot.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS with Equallogic storage

2010-08-21 Thread Toby Thain


On 21-Aug-10, at 3:06 PM, Ross Walker wrote:

> On Aug 21, 2010, at 2:14 PM, Bill Sommerfeld wrote:
> 
>> On 08/21/10 10:14, Ross Walker wrote:
>>> ...
>>> Would I be better off forgoing resiliency for simplicity, putting
>>> all my faith into the Equallogic to handle data resiliency?
>> 
>> IMHO, no; the resulting system will be significantly more brittle.
> 
> Exactly how brittle I guess depends on the Equallogic system.

If you don't let zfs manage redundancy, Bill is correct: it's a more
fragile system that *cannot* self-heal data errors in the (deep)
stack. Quantifying the increased risk is a question that Richard
Elling could probably answer :)

--Toby


> -Ross

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS with Equallogic storage

2010-08-21 Thread Richard Elling
On Aug 21, 2010, at 2:21 PM, Ross Walker wrote:

> On Aug 21, 2010, at 4:40 PM, Richard Elling  wrote:
> 
>> On Aug 21, 2010, at 10:14 AM, Ross Walker wrote:
>>> I'm planning on setting up an NFS server for our ESXi hosts and plan on 
>>> using a virtualized Solaris or Nexenta host to serve ZFS over NFS.
>> 
>> Please follow the joint EMC+NetApp best practices for VMware ESX servers.
>> The recommendations apply to any NFS implementation for ESX.
> 
> Thanks, I'll check that out! Always looking for advice on how best to tweak 
> NFS for ESX.

In this case, they are recommendations for ESX over NFS.  You will want to change 
the settings on the ESX server.
http://www.vmware.com/files/pdf/partners/netapp_esx_best_practices_whitepaper.pdf

 -- richard

-- 
OpenStorage Summit, October 25-27, San Francisco
http://nexenta-summit2010.eventbrite.com

Richard Elling
rich...@nexenta.com   +1-760-896-4422
Enterprise class storage for everyone
www.nexenta.com




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS with Equallogic storage

2010-08-21 Thread Ross Walker
On Aug 21, 2010, at 4:40 PM, Richard Elling  wrote:

> On Aug 21, 2010, at 10:14 AM, Ross Walker wrote:
>> I'm planning on setting up an NFS server for our ESXi hosts and plan on 
>> using a virtualized Solaris or Nexenta host to serve ZFS over NFS.
> 
> Please follow the joint EMC+NetApp best practices for VMware ESX servers.
> The recommendations apply to any NFS implementation for ESX.

Thanks, I'll check that out! Always looking for advice on how best to tweak NFS 
for ESX.

I have a current ZFS over NFS implementation, but on direct attached storage 
using Sol10. I will be interested to see how Nexenta compares.

>> The storage I have available is provided by Equallogic boxes over 10Gbe 
>> iSCSI.
>> 
>> I am trying to figure out the best way to provide both performance and 
>> resiliency given the Equallogic provides the redundancy.
>> 
>> Since I am hoping to provide a 2TB datastore I am thinking of carving out 
>> either 3 1TB luns or 6 500GB luns that will be RDM'd to the storage VM and 
>> within the storage server setting up either 1 raidz vdev with the 1TB luns 
>> (less RDMs) or 2 raidz vdevs with the 500GB luns (more fine grained 
>> expandability, work in 1TB increments).
>> 
>> Given the 2GB of write-back cache on the Equallogic I think the integrated 
>> ZIL would work fine (needs benchmarking though).
> 
> This should work fine.
> 
>> The vmdk files themselves won't be backed up (more data than I can store), 
>> just the essential data contained within, so I would think resiliency would 
>> be important here.
>> 
>> My questions are these.
>> 
>> Does this setup make sense?
> 
> Yes, it is perfectly reasonable.
> 
>> Would I be better off forgoing resiliency for simplicity, putting all my 
>> faith into the Equallogic to handle data resiliency?
> 
> I don't have much direct experience with Equallogic, but I would expect that
> they do a reasonable job of protecting data, or they would be out of business.
> 
> You can also use the copies parameter to set extra redundancy for the important
> files. ZFS will also tell you if corruption is found in a single file, so that
> you can recover just the file and not be forced to recover everything else. I
> think this fits into your backup strategy.

I thought of the copies parameter, but figured a raidz laid on top of the 
storage pool would only waste 33% instead of 50%. And since this sits on top of 
what is conceptually a single RAID volume, the IOPS bottleneck won't come into 
play: any single drive's IOPS will be equal to the IOPS of the array as a whole.
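The back-of-the-envelope numbers behind that (assuming 3 x 1TB LUNs and ignoring 
metadata overhead):

  raidz over 3 x 1TB LUNs : ~2 TB usable   (one LUN's worth of parity, ~33% overhead)
  copies=2 over 3 TB      : ~1.5 TB usable (every block written twice, ~50% overhead)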

>> Will this setup perform? Anybody with experience in this type of setup?
> 
> Many people are quite happy with RAID arrays and still take advantage of 
> the features of ZFS: checksums, snapshots, clones, send/receive, VMware
> integration, etc. The decision of where to implement data protection (RAID) 
> is not as important as the decision to protect your data.  
> 
> My advice: protect your data.

Always good advice.

So I suppose this just confirms my analysis.

Thanks,

-Ross

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS with Equallogic storage

2010-08-21 Thread Richard Elling
On Aug 21, 2010, at 10:14 AM, Ross Walker wrote:
> I'm planning on setting up an NFS server for our ESXi hosts and plan on using 
> a virtualized Solaris or Nexenta host to serve ZFS over NFS.

Please follow the joint EMC+NetApp best practices for VMware ESX servers.
The recommendations apply to any NFS implementation for ESX.

> The storage I have available is provided by Equallogic boxes over 10Gbe iSCSI.
> 
> I am trying to figure out the best way to provide both performance and 
> resiliency given the Equallogic provides the redundancy.
> 
> Since I am hoping to provide a 2TB datastore I am thinking of carving out 
> either 3 1TB luns or 6 500GB luns that will be RDM'd to the storage VM and 
> within the storage server setting up either 1 raidz vdev with the 1TB luns 
> (less RDMs) or 2 raidz vdevs with the 500GB luns (more fine grained 
> expandability, work in 1TB increments).
> 
> Given the 2GB of write-back cache on the Equallogic I think the integrated 
> ZIL would work fine (needs benchmarking though).

This should work fine.

> The vmdk files themselves won't be backed up (more data than I can store), 
> just the essential data contained within, so I would think resiliency would 
> be important here.
> 
> My questions are these.
> 
> Does this setup make sense?

Yes, it is perfectly reasonable.

> Would I be better off forgoing resiliency for simplicity, putting all my 
> faith into the Equallogic to handle data resiliency?

I don't have much direct experience with Equallogic, but I would expect that
they do a reasonable job of protecting data, or they would be out of business.

You can also use the copies parameter to set extra redundancy for the important
files. ZFS will also tell you if corruption is found in a single file, so that
you can recover just the file and not be forced to recover everything else. I
think this fits into your backup strategy.
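A minimal sketch of what that looks like (pool/dataset names are placeholders):

  # keep two copies of every block of the important dataset, even though
  # the pool itself has no zfs-level redundancy
  zfs set copies=2 tank/vmstore/important

  # a scrub will flag damage, and 'zpool status -v' lists any files
  # with unrecoverable errors so they can be restored individually
  zpool scrub tank
  zpool status -v tank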

> Will this setup perform? Anybody with experience in this type of setup?

Many people are quite happy with RAID arrays and still take advantage of 
the features of ZFS: checksums, snapshots, clones, send/receive, VMware
integration, etc. The decision of where to implement data protection (RAID) 
is not as important as the decision to protect your data.  

My advice: protect your data.
 -- richard

-- 
Richard Elling
rich...@nexenta.com   +1-760-896-4422
Enterprise class storage for everyone
www.nexenta.com



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] OpenStorage Summit

2010-08-21 Thread Brad Stone
Just wanted to make a quick announcement that there will be an OpenStorage 
Summit in Palo Alto, CA in late October. The conference should have a lot of
good OpenSolaris talks, with ZFS experts such as Bill Moore, Adam Leventhal, 
and Ben Rockwood already planning to give presentations. The conference is open 
to other storage solutions, and we also expect participation from FreeNAS, 
OpenFiler, and Lustre for example. There will be presentations on SSDs, ZFS 
basics, performance tuning, etc.

The agenda is still being formed, as we are hoping to get more presentation 
proposals from the community. To submit a proposal, send an email to 
summit2...@nexenta.com.

For additional details or to take advantage of early bird registration, go to 
http://nexenta-summit2010.eventbrite.com.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Root pool on boot drive lost on another machine because of devids

2010-08-21 Thread Jürgen Keil
> I have a USB flash drive which boots up my
> opensolaris install. What happens is that whenever I
> move to a different machine,
> the root pool is lost because the devids don't match
> with what's in /etc/zfs/zpool.cache and the system
> just can't find the rpool.

See defect 4755 or defect 5484

https://defect.opensolaris.org/bz/show_bug.cgi?id=4755
https://defect.opensolaris.org/bz/show_bug.cgi?id=5484

When I last experimented with booting Solaris
from flash memory sticks I modified scsa2usb
so that it would construct a devid for the usb
flash memory stick,
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS with Equallogic storage

2010-08-21 Thread Ross Walker
On Aug 21, 2010, at 2:14 PM, Bill Sommerfeld  wrote:

> On 08/21/10 10:14, Ross Walker wrote:
>> I am trying to figure out the best way to provide both performance and 
>> resiliency given the Equallogic provides the redundancy.
> 
> (I have no specific experience with Equallogic; the following is just generic 
> advice)
> 
> Every bit stored in zfs is checksummed at the block level; zfs will not use 
> data or metadata if the checksum doesn't match.

I understand that much, and it is the reason I picked ZFS for persistent data 
storage.

> zfs relies on redundancy (storing multiple copies) to provide resilience; if 
> it can't independently read the multiple copies and pick the one it likes, it 
> can't recover from bitrot or failure of the underlying storage.

It can't auto-recover, but it will report the failure so the data can be restored 
from backup. Since the vmdk files are too big to back up, though...

> if you want resilience, zfs must be responsible for redundancy.

It must have redundancy, though not necessarily full control of the disks.

> You imply having multiple storage servers.  The simplest thing to do is 
> export one large LUN from each of two different storage servers, and have ZFS 
> mirror them.

Well... You need to know that the multiple storage servers act as a single pool 
with tiered storage levels (SAS 15K in RAID10 and SATA in RAID6), and LUNs are 
auto-tiered across these based on demand, so a pool of mirrors won't really 
provide any more performance than a raidz (same physical RAID), and raidz will 
only "waste" 33% as opposed to 50%.

> While this reduces the available space, depending on your workload, you can 
> make some of it back by enabling compression.
> 
> And, given sufficiently recent software, and sufficient memory and/or ssd for 
> l2arc, you can enable dedup.

The host is a blade server with no room for SSDs, but if SSD investment is 
needed in the future I can add an SSD Equallogic box to the storage pool.

> Of course, the effectiveness of both dedup and compression depends on your 
> workload.
> 
>> Would I be better off forgoing resiliency for simplicity, putting all my 
>> faith into the Equallogic to handle data resiliency?
> 
> IMHO, no; the resulting system will be significantly more brittle.

Exactly how brittle I guess depends on the Equallogic system.

-Ross

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS with Equallogic storage

2010-08-21 Thread Bill Sommerfeld

On 08/21/10 10:14, Ross Walker wrote:

> I am trying to figure out the best way to provide both performance and
> resiliency given the Equallogic provides the redundancy.


(I have no specific experience with Equallogic; the following is just 
generic advice)


Every bit stored in zfs is checksummed at the block level; zfs will not 
use data or metadata if the checksum doesn't match.


zfs relies on redundancy (storing multiple copies) to provide 
resilience; if it can't independently read the multiple copies and pick 
the one it likes, it can't recover from bitrot or failure of the 
underlying storage.


if you want resilience, zfs must be responsible for redundancy.

You imply having multiple storage servers.  The simplest thing to do is 
export one large LUN from each of two different storage servers, and 
have ZFS mirror them.
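A minimal sketch of that layout (pool and LUN device names are placeholders):

  # one large LUN exported from each storage server, mirrored by zfs,
  # so zfs has an independent copy to self-heal from
  zpool create tank mirror c3t0d0 c4t0d0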


While this reduces the available space, depending on your workload, you 
can make some of it back by enabling compression.


And, given sufficiently recent software, and sufficient memory and/or 
ssd for l2arc, you can enable dedup.


Of course, the effectiveness of both dedup and compression depends on 
your workload.



> Would I be better off forgoing resiliency for simplicity, putting all my faith
> into the Equallogic to handle data resiliency?


IMHO, no; the resulting system will be significantly more brittle.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS with Equallogic storage

2010-08-21 Thread Ross Walker

I'm planning on setting up an NFS server for our ESXi hosts and plan on using a 
virtualized Solaris or Nexenta host to serve ZFS over NFS.

The storage I have available is provided by Equallogic boxes over 10Gbe iSCSI.

I am trying to figure out the best way to provide both performance and 
resiliency given the Equallogic provides the redundancy.

Since I am hoping to provide a 2TB datastore I am thinking of carving out 
either 3 1TB luns or 6 500GB luns that will be RDM'd to the storage VM and 
within the storage server setting up either 1 raidz vdev with the 1TB luns 
(less RDMs) or 2 raidz vdevs with the 500GB luns (more fine grained 
expandability, work in 1TB increments).
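Concretely, the two candidate layouts look roughly like this (device names are 
placeholders for whatever the RDMs show up as inside the storage VM):

  # option 1: one raidz vdev over 3 x 1TB LUNs
  zpool create datastore raidz c2t1d0 c2t2d0 c2t3d0

  # option 2: two raidz vdevs over 6 x 500GB LUNs (grow in 1TB steps later)
  zpool create datastore raidz c2t1d0 c2t2d0 c2t3d0 \
                         raidz c2t4d0 c2t5d0 c2t6d0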

Given the 2GB of write-back cache on the Equallogic I think the integrated ZIL 
would work fine (needs benchmarking though).

The vmdk files themselves won't be backed up (more data than I can store), just 
the essential data contained within, so I would think resiliency would be 
important here.

My questions are these.

Does this setup make sense?

Would I be better off forgoing resiliency for simplicity, putting all my faith 
into the Equallogic to handle data resiliency?

Will this setup perform? Anybody with experience in this type of setup?

-Ross


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS development moving behind closed doors

2010-08-21 Thread Orvar Korvar
"And by the way: Wasn't there a comment of Linus Torvals recently that people 
shound move their low-quality code into the codebase ??? ;)"

Anyone knows the link? Good against the Linux fanboys. :o)
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cant't detach spare device from pool

2010-08-21 Thread Ian Collins

On 08/21/10 08:50 PM, Simone Caldana wrote:

> On 21 Aug 2010, at 10:10, Ian Collins wrote:
> 
>> On 08/21/10 07:03 PM, Martin Mundschenk wrote:
>> 
>>> After about 62 hours and at 90%, the resilvering process got stuck. Nothing
>>> has happened for 12 hours, so I cannot detach the spare device. Is there a
>>> way to get the resilvering process running again?
>> 
>> Are you sure it's stuck?  They can take a very long time and go really slow
>> at the end.
> 
> Especially if you're writing to that pool.

Oh yes, I just had to wait 88 hours for a 500G drive in a raidz2 to 
resilver on a backup staging server.  It was at "100%" for about half of 
that...


--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cant't detach spare device from pool

2010-08-21 Thread Simone Caldana
On 21 Aug 2010, at 10:10, Ian Collins wrote:

> On 08/21/10 07:03 PM, Martin Mundschenk wrote:
>> After about 62 hours and at 90%, the resilvering process got stuck. Nothing
>> has happened for 12 hours, so I cannot detach the spare device. Is there a
>> way to get the resilvering process running again?
> 
> Are you sure it's stuck?  They can take a very long time and go really slow
> at the end.

Especially if you're writing to that pool.

-- 
Simone Caldana

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cant't detach spare device from pool

2010-08-21 Thread Ian Collins

On 08/21/10 07:03 PM, Martin Mundschenk wrote:

> After about 62 hours and at 90%, the resilvering process got stuck. Nothing 
> has happened for 12 hours, so I cannot detach the spare device. Is there a 
> way to get the resilvering process running again?

Are you sure it's stuck?  They can take a very long time and go really 
slow at the end.
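A quick way to check (pool name is a placeholder): look at the resilver progress 
line in 'zpool status' a while apart and compare the amount done rather than the 
percentage:

  zpool status tank
  # wait ten minutes or so
  sleep 600
  zpool status tank

If the resilvered amount is still creeping up between runs, it's just slow, not 
stuck.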


--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cant't detach spare device from pool

2010-08-21 Thread Martin Mundschenk
After about 62 hours and at 90%, the resilvering process got stuck. Nothing has 
happened for 12 hours, so I cannot detach the spare device. Is there a way to 
get the resilvering process running again?

Martin



Am 18.08.2010 um 20:11 schrieb Mark Musante:

> You need to let the resilver complete before you can detach the spare.  This 
> is a known problem, CR 6909724.
> 
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6909724

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss