On Wed, 29 Oct 2008, Graham McArdle wrote:

> Maybe the discussions you recall aren't fully indexed for searching 
> on these keywords or they were in another forum, but thanks for 
> giving me the gist of it. It is potentially quite an Achilles heel 
> for ZFS though. I've argued locally to migrate our main online data 
> archive (currently 3.5TB) to ZFS, but if the recovery time for disk 
> failures keeps getting slower as the archive grows and accumulates 
> snapshots etc., some questions might be asked about this policy.

The simple solution is to use a reasonable pool design.  Limit the 
maximum LUN size to one which can be resilvered in a reasonable time. 
Don't do something silly like building a mirror across two 10TB LUNs.

Huge disks will take longer to resilver, and it is more likely for a 
secondary failure to occur during resilvering.  Manage your potential 
losses by managing the size of a loss, and therefore the time to 
recover.
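
To put rough numbers on the size/time trade-off, here is a minimal 
sketch. ZFS resilvers only allocated data, so the estimate is just 
allocated bytes divided by a sustained rate. The function name and 
the 50 MB/s figure are assumptions for illustration, not measured 
values; a fragmented or busy pool may run far slower.

```python
def resilver_hours(allocated_tb: float, rate_mb_per_s: float) -> float:
    """Estimate hours needed to resilver `allocated_tb` terabytes of data.

    Uses decimal units (1 TB = 1e12 bytes, 1 MB = 1e6 bytes) and assumes
    the rate stays constant for the whole resilver, which is optimistic.
    """
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    total_bytes = allocated_tb * 1e12
    seconds = total_bytes / (rate_mb_per_s * 1e6)
    return seconds / 3600.0


if __name__ == "__main__":
    # 50 MB/s is an assumed sustained resilver rate, not a benchmark.
    for size_tb in (0.5, 1, 10):
        print(f"{size_tb:>4} TB -> {resilver_hours(size_tb, 50):.1f} h")
```

At that assumed rate a 10TB LUN needs roughly 56 hours, against 
under 6 hours for 1TB, which is a much shorter window for a second 
failure.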

Another rule is to control how full a pool is allowed to become.  If 
you fill it to 99% you can be assured that the pool will become 
fragmented and resilvers will take much longer.  The pool will be 
slower in normal use as well.
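
One way to keep an eye on this is a small check over the output of 
`zpool list -H -o name,capacity`, which prints one pool per line as 
a name and a percentage. The sketch below only parses text in that 
shape; the function name, the 80% limit, and the sample pools are 
assumptions for illustration, so pick a threshold that suits your 
own workload.

```python
def pools_over_limit(zpool_output: str, limit_pct: int = 80):
    """Return (pool, percent_full) pairs at or above `limit_pct`.

    Expects text shaped like `zpool list -H -o name,capacity` output:
    one pool per line, name and capacity separated by whitespace.
    Lines that do not match that shape are skipped.
    """
    over = []
    for line in zpool_output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue
        name, capacity = fields
        digits = capacity.rstrip("%")
        if not digits.isdigit():
            continue
        percent = int(digits)
        if percent >= limit_pct:
            over.append((name, percent))
    return over


if __name__ == "__main__":
    # Made-up sample data, not output from a real system.
    sample = "tank\t45%\narchive\t99%\n"
    for name, percent in pools_over_limit(sample):
        print(f"{name} is {percent}% full")
```

Run from cron against the real command output, something like this 
gives warning well before a pool gets anywhere near 99%.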

Bob
======================================
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
