On 5/6/2011 5:46 PM, Richard Elling wrote:
On May 6, 2011, at 3:24 AM, Erik Trimble<erik.trim...@oracle.com> wrote:
Casper and Richard are correct - RAM starvation seriously impacts snapshot or
dataset deletion when a pool has dedup enabled. The reason behind this is that
ZFS needs to scan the entire DDT to check whether it can actually delete each
block in the to-be-deleted snapshot/dataset, or whether it just needs to update
the dedup reference count.
AIUI, the issue is not that the DDT is scanned - it is an AVL tree for a reason. The issue
is that each reference update means that one small bit of data is changed. If the
reference is not already in the ARC, then a small, probably random read is needed. If you
have a typical consumer disk, especially a "green" disk, and have not tuned
zfs_vdev_max_pending, then that itty bitty read can easily take more than 100
milliseconds(!). Consider that you can have thousands or millions of reference updates to
do during a zfs destroy, and the math gets ugly. This is why fast SSDs make good dedup
candidates.
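To make "the math gets ugly" concrete, here is a back-of-envelope sketch. The ~100 ms
per random read and the million-update count are just the figures from the paragraph
above, not measurements from any particular pool:

```python
# Rough seek-bound estimate for a dedup-enabled zfs destroy.
# Assumes every reference update misses the ARC and costs one
# random read at ~100 ms (the untuned "green" disk case above).
ms_per_read = 100
updates = 1_000_000          # a million reference updates

total_seconds = updates * ms_per_read / 1000
total_hours = total_seconds / 3600
print(f"~{total_hours:.1f} hours for one destroy")  # ~27.8 hours
```

Even at a more typical 10 ms seek, that is still nearly three hours for a single
destroy, which is why keeping the DDT in ARC/L2ARC (or on fast SSDs) matters so much.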
Just out of curiosity - I'm assuming that a delete works like this:
(1) find list of blocks associated with file to be deleted
(2) using the DDT, find out if any other files are using those blocks
(3) delete/update any metadata associated with the file (dirents,
ACLs, etc.)
(4) for each block in the file
(4a) if the DDT indicates there ARE other files using this
block, update the DDT entry to change the refcount
(4b) if the DDT indicates there AREN'T any other files, move
the physical block to the free list, and delete the DDT entry
In a bulk delete scenario (not just snapshot deletion), I'd presume #1
above almost always causes a random I/O request to disk, as all the
relevant metadata for every to-be-deleted file is unlikely to be
stored in the ARC. If you can't fit the DDT in ARC/L2ARC, #2 above would
require you to pull in the remainder of the DDT info from disk, right?
#3 and #4 can be batched up, so they don't hurt that much.
Is that a (roughly) correct deletion methodology? Or can someone give a
more accurate view of what's actually going on?
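The hypothesized flow in steps 1-4 above can be sketched in Python pseudocode. This is
purely illustrative - `ddt` and `free_list` are stand-in names, and the real ZFS DDT is
an on-disk AVL-tree-backed structure keyed by block checksum, not a Python dict:

```python
def destroy_file(file_blocks, ddt, free_list):
    """Hypothetical dedup-aware delete, following steps 4a/4b above.

    ddt maps block checksum -> reference count; free_list collects
    blocks whose last reference is going away. Illustrative sketch
    only, not an actual ZFS interface.
    """
    for checksum in file_blocks:          # step 4: walk the file's blocks
        refs = ddt.get(checksum, 0)
        if refs > 1:                      # step 4a: other files still use it
            ddt[checksum] = refs - 1      # just decrement the refcount
        else:                             # step 4b: last reference is gone
            ddt.pop(checksum, None)       # drop the DDT entry...
            free_list.append(checksum)    # ...and free the physical block
    return free_list
```

For example, `destroy_file(["a", "b"], {"a": 2, "b": 1}, [])` would decrement the
refcount for "a" and free only "b". The pain point from the earlier discussion is the
`ddt.get` lookup: in real ZFS each one that misses the ARC is a random disk read.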
If ZFS can't store the entire DDT in either the ARC or L2ARC, it will be forced
to do considerable I/O to disk as it brings in the appropriate DDT entries.
Worst case, insufficient ARC/L2ARC space can increase deletion times by many
orders of magnitude - e.g., days, weeks, or even months to do a deletion.
I've never seen months, but I have seen days, especially for low-perf disks.
I've seen an estimate of 5 weeks for removing a snapshot on a 1TB dedup
pool made up of 1 disk.
Not an optimal setup.
:-)
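For a sense of scale on whether a DDT fits in ARC/L2ARC, here is a rough sizing sketch.
The ~320 bytes per in-core DDT entry is a commonly cited ZFS rule of thumb, not an exact
figure, and the 128 KiB average block size is an assumption - real pools vary widely:

```python
# Rough in-core DDT footprint for 1 TiB of unique deduped data.
# ~320 bytes/entry is a commonly cited ZFS estimate; actual entry
# size and average block size depend on the pool.
pool_bytes = 1 * 2**40            # 1 TiB of deduped data
block_size = 128 * 2**10          # assumed 128 KiB average block
bytes_per_entry = 320             # rule-of-thumb in-core entry size

entries = pool_bytes // block_size
ddt_bytes = entries * bytes_per_entry
print(f"{entries} entries, ~{ddt_bytes / 2**30:.1f} GiB of DDT")
```

That works out to roughly 2.5 GiB of DDT per TiB at 128 KiB blocks - and several times
more with smaller average block sizes - which shows how easily the DDT outgrows RAM on
the 1 TB single-disk pool described above.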
If dedup isn't enabled, snapshot and data deletion is very light on RAM
requirements, and generally won't need to do much (if any) disk I/O. Such
deletion should take milliseconds to a minute or so.
Yes, perhaps a bit longer for recursive destruction, but everyone here knows
recursion is evil, right? :-)
-- richard
You, my friend, have obviously never worshipped at the Temple of the
Lambda Calculus, nor been exposed to the Holy Writ that is "Structure and
Interpretation of Computer Programs"
(http://mitpress.mit.edu/sicp/full-text/book/book.html).
I sentence you to a semester of 6.001 problem sets, written by Prof
Sussman sometime in the 1980s.
(yes, I went to MIT.)
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss