Hello Stefan,

As you can read, dedup keeps me awake at night.

I still think that there is a need for a deduplication implementation
that would perform nearly as fast as regular qcow2.

I thought about this: http://en.wikipedia.org/wiki/Normal_distribution.

Not all blocks are equal for deduplication.
Some will deduplicate well and some won't.

My idea would be to periodically run a filter on the in-RAM tree in order to
drop the worst-performing and least promising blocks.

The low-performing blocks involved in a deduplication operation since the last
run of the filter would be kept because they are promising; they would
survive and have a chance to climb among the top performers.

The low-performing blocks not involved in a deduplication operation since the
last run of the filter would be definitively dropped from the HashNode tree
since they are losers.

The blocks at the center of the bell curve would be kept since they are champions.
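To make the idea concrete, the periodic filter could be sketched roughly like this. Everything here is hypothetical (the `HashNode` fields, `filter_tree`, the `keep_fraction` knob); it is not taken from the actual patch series, just an illustration of the keep/drop policy described above:

```python
class HashNode:
    """Hypothetical in-RAM entry for one block hash (fields are illustrative)."""
    def __init__(self):
        self.total_hits = 0   # dedup hits over the node's lifetime
        self.recent_hits = 0  # dedup hits since the last filter run

def filter_tree(tree, keep_fraction=0.5):
    """Periodic filter over the in-RAM tree:
    - champions (the top performers) are kept unconditionally,
    - low performers hit since the last run are kept as promising,
    - low performers with no recent hits are dropped as losers."""
    ranked = sorted(tree.items(),
                    key=lambda kv: kv[1].total_hits, reverse=True)
    cutoff = int(len(ranked) * keep_fraction)
    survivors = {}
    for i, (block_hash, node) in enumerate(ranked):
        if i < cutoff or node.recent_hits > 0:
            node.recent_hits = 0  # reset for the next filter interval
            survivors[block_hash] = node
    return survivors
```

With `keep_fraction` tuned, this would bound memory to roughly a fixed number of entries between filter runs while still letting newly active blocks earn their place.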

This way the RAM-based implementation could offer speed while keeping its
memory usage bounded.

Regards

Benoît
