> > I also consider it safer, because you make sure the data exists (using
> > hash keys like SHA1).
> >
> > I am unsure how you can check if a dirty bitmap contains errors, or is
> > out of date?
> >
> > Also, you can compare arbitrary Merkle trees, whereas a dirty bitmap is
> > always related to a single image.
> > (consider the user remove the latest backup from the backup target).
>
> One disadvantage of Merkle trees is that the client becomes stateful - the
> client needs to store its own Merkle tree and this requires fancier
> client-side code.

Which 'client' are you talking about here? But sure, the code gets more
complex, and it needs a considerable amount of RAM to store the hash keys.

> It is also more expensive to update hashes than a dirty bitmap. Not because
> you need to hash data but because a small write (e.g. 1 sector) requires
> that you read the surrounding sectors to recompute a hash for the cluster.
> Therefore you can expect worse guest I/O performance than with a dirty
> bitmap.

There is no need to update any hash on guest writes. You only need to compute
hashes at backup time. In fact, all of that can be done by the backup driver.

> I still think it's a cool idea. Making it work well will require a lot more
> effort than a dirty bitmap.

How do you regenerate a dirty bitmap after a server crash?
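
To make the "hash only at backup time" point concrete, here is a minimal
sketch. It is not the backup driver's code: the 64 KiB cluster size, the
function names, and the flat hash list (only the leaf level of a Merkle tree)
are assumptions chosen for illustration.

```python
import hashlib

CLUSTER_SIZE = 64 * 1024  # assumed cluster size, for illustration only


def cluster_hashes(image: bytes, cluster_size: int = CLUSTER_SIZE) -> list[bytes]:
    """Hash every cluster of the image; run once per backup, not per guest write."""
    return [
        hashlib.sha1(image[off:off + cluster_size]).digest()
        for off in range(0, len(image), cluster_size)
    ]


def changed_clusters(old: list[bytes], new: list[bytes]) -> list[int]:
    """Indices of clusters whose hash differs from, or is missing in, the old list."""
    return [i for i, h in enumerate(new) if i >= len(old) or old[i] != h]


# Usage: a single 512-byte guest write between two backups.
before = bytearray(4 * CLUSTER_SIZE)
hashes_before = cluster_hashes(bytes(before))

after = bytearray(before)
after[CLUSTER_SIZE + 512:CLUSTER_SIZE + 1024] = b"\xff" * 512  # lands in cluster 1
hashes_after = cluster_hashes(bytes(after))

print(changed_clusters(hashes_before, hashes_after))  # prints [1]
```

The guest write path is untouched in this sketch. The cost moves to backup
time, where the whole image has to be read and hashed. The old hash list can
come from any earlier backup that still exists on the target, and an empty
list simply marks every cluster as changed.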