On Wed, May 22, 2013 at 03:34:18PM +0000, Dietmar Maurer wrote:
> > That sounds like more work than a persistent dirty bitmap. The advantage
> > is that while dirty bitmaps are consumed by a single user, the Merkle
> > tree can be used to sync up any number of replicas.
>
> I also consider it safer, because you make sure the data exists (using hash
> keys like SHA1).
>
> I am unsure how you can check if a dirty bitmap contains errors, or is out of
> date?
>
> Also, you can compare arbitrary Merkle trees, whereas a dirty bitmap is
> always related to a single image.
> (consider the user remove the latest backup from the backup target).

One disadvantage of Merkle trees is that the client becomes stateful - the
client needs to store its own Merkle tree and this requires fancier
client-side code.

It is also more expensive to update hashes than a dirty bitmap. The cost is
not in hashing the data itself: a small write (e.g. 1 sector) requires that
you read the surrounding sectors to recompute the hash for the whole cluster.
Therefore you can expect worse guest I/O performance than with a dirty
bitmap.

I still think it's a cool idea. Making it work well will require a lot more
effort than a dirty bitmap.

Stefan
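
To illustrate the update-cost point above, here is a minimal sketch in Python. It is not QEMU code. The sector and cluster sizes, the toy Disk class, and both write helpers are assumptions made up for illustration, and only the leaf hashes of a Merkle tree are modelled (interior nodes are left out).

```python
import hashlib

SECTOR_SIZE = 512          # bytes; assumed for illustration
CLUSTER_SIZE = 64 * 1024   # bytes; assumed tracking granularity


class Disk:
    """Toy in-memory image that counts bytes read back for change tracking."""

    def __init__(self, num_clusters):
        self.data = bytearray(num_clusters * CLUSTER_SIZE)
        self.tracking_bytes_read = 0

    def write(self, offset, buf):
        self.data[offset:offset + len(buf)] = buf

    def read_cluster(self, index):
        # Every call models extra read I/O caused by tracking, not by the guest.
        self.tracking_bytes_read += CLUSTER_SIZE
        start = index * CLUSTER_SIZE
        return bytes(self.data[start:start + CLUSTER_SIZE])


def touched_clusters(offset, length):
    """Indices of the clusters covered by a write of `length` bytes at `offset`."""
    first = offset // CLUSTER_SIZE
    last = (offset + length - 1) // CLUSTER_SIZE
    return range(first, last + 1)


def write_with_dirty_bitmap(disk, bitmap, offset, buf):
    """Dirty bitmap: set one flag per touched cluster, no data is read back."""
    disk.write(offset, buf)
    for i in touched_clusters(offset, len(buf)):
        bitmap[i] = True


def write_with_leaf_hashes(disk, leaves, offset, buf):
    """Hash tracking: each touched cluster is read in full and rehashed."""
    disk.write(offset, buf)
    for i in touched_clusters(offset, len(buf)):
        leaves[i] = hashlib.sha1(disk.read_cluster(i)).digest()
```

With these assumed sizes, a single 512-byte write through write_with_leaf_hashes reads back a full 64 KiB cluster, while the same write through write_with_dirty_bitmap reads nothing. A real implementation could skip the read when a write covers a whole cluster, but small guest writes would still pay it.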