Rich Freeman <r-bt...@thefreemanclan.net> wrote:

> On Thu, Mar 26, 2015 at 8:07 PM, Martin <m_bt...@ml1.co.uk> wrote:
>>
>> Anyone with any comments on how well duperemove performs for TB-sized
>> volumes?
> 
> Took many hours but less than a day for a few TB - I'm not sure
> whether it is smart enough to take less time on subsequent scans like
> bedup.
> 
>>
>> Does it work across subvolumes? (Presumably not...)
> 
> As far as I can tell, yes.  Unless you pass a command-line option it
> crosses filesystem boundaries and even scans non-btrfs filesystems
> (like /proc, /dev, etc).  Obviously you'll want to avoid that since it
> only wastes time and I can just imagine it trying to hash kcore and
> such.
> 
> Other than being less-than-ideal intelligence-wise, it seemed
> effective.  I can live with that in an early release like this.

Crossing boundaries is mainly in there to support deduping across different
subvolumes within the same device pool. So I think the idea is neither
less than ideal nor unintelligent, and it has nothing to do with performance.

But your warning is still valid: one should take care not to "dedupe"
special filesystems (the same applies to every other tool that supports
recursion, like rsync or cp). Crossing a boundary to a non-btrfs device is
also not very effective for the deduplication process, with one or more
exceptions: you may want duperemove to write hashes for a non-btrfs device
and use the result for other purposes outside of duperemove's scope, or you
are nesting btrfs into non-btrfs into btrfs mounts, or...
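
To avoid handing special filesystems to a recursive tool, one could first
check which mount points are actually btrfs. This is only a rough sketch of
that idea, not anything duperemove does itself. It assumes the usual Linux
/proc/mounts layout (device, mount point, fstype, options, ...); the
function name btrfs_mount_points is made up for illustration.

```python
def btrfs_mount_points(mounts_text):
    """Return the mount points whose filesystem type is btrfs.

    mounts_text is expected to look like the contents of /proc/mounts:
    one mount per line, whitespace-separated fields, fstype in field 3.
    Mount points containing spaces appear escaped (e.g. \\040) and are
    returned as-is, without unescaping.
    """
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] == "btrfs":
            points.append(fields[1])
    return points
```

On a live system the input would come from reading /proc/mounts; anything
not in the returned list (proc, sysfs, tmpfs, ...) is a candidate to leave
out of the scan.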

In conclusion: duperemove should probably not try to become smart about
filesystem boundaries. It should either cross them or not, as it does now.
The choice is left to the user (as is the task of supplying proper
command-line arguments).
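
To illustrate what "cross them or not" means for a recursive scanner, here
is a minimal sketch of a file walk with a user-controlled boundary switch.
It is not duperemove's code; walk_files and cross_boundaries are
hypothetical names. The boundary test compares st_dev, the same approach as
find -xdev. As far as I know, each btrfs subvolume reports its own st_dev,
so a check like this also stops at subvolume boundaries, which is exactly
why a tool meant to dedupe across subvolumes has a reason to cross.

```python
import os


def walk_files(root, cross_boundaries=False):
    """Yield paths of regular files below root.

    With cross_boundaries=False, any directory whose st_dev differs from
    that of root is skipped, so the walk stays on one filesystem.
    """
    root_dev = os.lstat(root).st_dev
    for dirpath, dirnames, filenames in os.walk(root):
        if not cross_boundaries:
            # Prune in place so os.walk does not descend into other devices.
            dirnames[:] = [
                d for d in dirnames
                if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev
            ]
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks, device nodes, sockets and similar.
            if os.path.isfile(path) and not os.path.islink(path):
                yield path
```

The tool itself stays dumb about boundaries; the caller decides by passing
the flag, and by choosing which root to start from.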

With the planned performance improvements, I'm guessing the best way will 
become mounting the root subvolume (subvolid 0) and letting duperemove work 
on that as a whole - including crossing all fs boundaries.

-- 
Replies to list only preferred.