On May 6, 2011, at 3:24 AM, Erik Trimble <erik.trim...@oracle.com> wrote:

> On 5/6/2011 1:37 AM, casper....@oracle.com wrote:
>>> Op 06-05-11 05:44, Richard Elling schreef:
>>>> As the size of the data grows, the need to have the whole DDT in RAM or 
>>>> L2ARC
>>>> decreases. With one notable exception, destroying a dataset or snapshot 
>>>> requires
>>>> the DDT entries for the destroyed blocks to be updated. This is why people 
>>>> can
>>>> go for months or years and not see a problem, until they try to destroy a 
>>>> dataset.
>>> So what you are saying is "you with your ram-starved system, don't even
>>> try to start using snapshots on that system". Right?
>> 
>> I think it's more like "don't use dedup when you don't have RAM".
>> 
>> (It is not possible to not use snapshots in Solaris; they are used for
>> everything)

:-)

>> 
>> Casper
>> 
> Casper and Richard are correct - RAM starvation seriously impacts snapshot or 
> dataset deletion when a pool has dedup enabled.  The reason behind this is 
> that ZFS needs to scan the entire DDT to check to see if it can actually 
> delete each block in the to-be-deleted snapshot/dataset, or if it just needs 
> to update the dedup reference count.

AIUI, the issue is not that the DDT is scanned; it is an AVL tree for a reason. 
The issue is that each reference update means that one, small bit of data is 
changed. If the reference is not already in ARC, then a small, probably random 
read is needed. If you have a typical consumer disk, especially a "green" disk, 
and have not tuned zfs_vdev_max_pending, then that itty bitty read can easily 
take more than 100 milliseconds(!). Consider that you can have thousands or 
millions of reference updates to do during a zfs destroy, and the math gets 
ugly. This is why fast SSDs make good dedup candidates.
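To put the "ugly math" in concrete terms, here is a back-of-envelope sketch (plain Python, purely illustrative; the block count, ARC hit rate, and per-read latency below are assumptions for the sake of the example, not measurements):

  # Rough estimate of a dedup-enabled destroy where most DDT reference
  # updates miss the ARC and each miss costs one small random read.
  blocks_to_free = 2_000_000    # blocks referenced by the doomed dataset (assumed)
  arc_hit_rate   = 0.10         # fraction of DDT entries already cached (assumed)
  read_latency_s = 0.100        # ~100 ms per random read on a slow "green" disk

  misses = blocks_to_free * (1 - arc_hit_rate)
  hours  = misses * read_latency_s / 3600
  print(f"{misses:,.0f} random reads -> roughly {hours:.0f} hours")
  # With these numbers: 1,800,000 reads -> roughly 50 hours.

Swap in an SSD at ~0.1 ms per random read and the same destroy finishes in a few minutes.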

> If it can't store the entire DDT in either the ARC or L2ARC, it will be 
> forced to do considerable I/O to disk, as it brings in the appropriate DDT 
> entry.   Worst case for insufficient ARC/L2ARC space can increase deletion 
> times by many orders of magnitude. E.g. days, weeks, or even months to do a 
> deletion.

I've never seen months, but I have seen days, especially for low-perf disks.
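A related quick check is whether the DDT can fit in ARC/L2ARC at all. A sizing sketch along the same lines (the ~320 bytes per in-core entry is the rule of thumb that usually gets quoted on this list, and the pool numbers are assumptions; zdb -DD reports the real entry counts for a given pool):

  # Rough in-core DDT size for a deduped pool (illustrative numbers only).
  pool_used_bytes = 10 * 2**40   # 10 TiB of deduped data (assumed)
  avg_block_bytes = 64 * 2**10   # 64 KiB average block size (assumed)
  bytes_per_entry = 320          # oft-quoted approximate in-core cost per DDT entry

  entries = pool_used_bytes / avg_block_bytes
  ddt_gib = entries * bytes_per_entry / 2**30
  print(f"~{entries:,.0f} DDT entries -> ~{ddt_gib:.0f} GiB of ARC/L2ARC")
  # With these numbers: ~167,772,160 entries -> ~50 GiB.

If that figure is bigger than RAM plus L2ARC, a destroy will be paying the random-read penalty for most of its reference updates.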

> 
> If dedup isn't enabled, snapshot and data deletion is very light on RAM 
> requirements, and generally won't need to do much (if any) disk I/O.  Such 
> deletion should take milliseconds to a minute or so.

Yes, perhaps a bit longer for recursive destruction, but everyone here knows 
recursion is evil, right? :-)
 -- richard
