On Mar 20, 2011, at 14:24, Roy Sigurd Karlsbakk wrote:

>>> It all depends on the number of drives in the VDEV(s), traffic
>>> patterns during resilver, speed of the drives, VDEV fill, etc. Still,
>>> close to 6 days is a lot. Can you detail your configuration?
>> 
>> How many times do we have to rehash this? The speed of resilver is
>> dependent on the amount of data, the distribution of data on the
>> resilvering device, speed of the resilvering device, and the throttle. It is 
>> NOT
>> dependent on the number of drives in the vdev.
> 
> Thanks for clearing this up - I've been told large VDEVs lead to long 
> resilver times, but then, I guess that was wrong.

There was a thread ("Suggested RaidZ configuration...") a little while back 
where the topic of IOps and resilver time came up:

http://mail.opensolaris.org/pipermail/zfs-discuss/2010-September/thread.html#44633

I think this message by Erik Trimble is a good summary:

> Scenario 1:    I have 5 1TB disks in a raidz1, and I assume I have 128k slab 
> sizes.  Thus, I have 32k of data for each slab written to each disk. (4x32k 
> data + 32k parity for a 128k slab size).  So, each IOPS gets to reconstruct 
> 32k of data on the failed drive.   It thus takes about 1TB/32k = 31e6 IOPS to 
> reconstruct the full 1TB drive.
> 
> Scenario 2:    I have 10 1TB drives in a raidz1, with the same 128k slab 
> sizes.  In this case, there's only about 14k of data on each drive for a 
> slab. This means, each IOPS to the failed drive only writes 14k.  So, it takes 
> 1TB/14k = 71e6 IOPS to complete.
> 
> From this, it can be pretty easy to see that the number of required IOPS to 
> the resilvered disk goes up linearly with the number of data drives in a 
> vdev.  Since you're always going to be IOPS bound by the single disk 
> resilvering, you have a fixed limit.

        http://mail.opensolaris.org/pipermail/zfs-discuss/2010-September/044660.html
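Erik's arithmetic can be sketched in a few lines of Python (a hypothetical back-of-the-envelope model, not actual ZFS behavior: it assumes raidz1 parity and a fixed 128k slab striped evenly across the data drives, as in his two scenarios):

```python
DISK_BYTES = 10**12   # 1 TB resilvering disk
SLAB = 128 * 1024     # 128k slab size

def iops_to_resilver(drives_in_vdev, parity=1):
    """IOPS needed on the resilvering disk to rewrite it in full."""
    data_drives = drives_in_vdev - parity
    bytes_per_slab_per_disk = SLAB / data_drives
    return DISK_BYTES / bytes_per_slab_per_disk

print(iops_to_resilver(5))    # ~31e6, matching Scenario 1
print(iops_to_resilver(10))   # ~69e6 here; Erik's rounded 14k gives ~71e6
```

The point the numbers make: the required IOPS on the one resilvering disk grow roughly linearly with the number of data drives in the vdev, and that single disk's IOPS capability is the fixed bottleneck.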

Also, a post by Jeff Bonwick on resilvering:

        http://blogs.sun.com/bonwick/entry/smokin_mirrors

Between Richard's and Erik's statements, I would say that while resilver time 
is not dependent on the "number of drives in the vdev" per se, the pool 
configuration can affect the IOps rate, and /that/ can affect the time it 
takes to finish a resilver. Is that a decent summary?

I think the "number of drives in the vdev" perhaps comes into play because 
when people have a lot of disks, they often put them into RAIDZ[123] 
configurations. So it's just a matter of confusing the (IOps-limiting) 
configuration with the fact that one may have many disks.

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss