> Maybe the best approach is to maintain a dirty bitmap while the guest is
> running, which is fairly cheap. Then you can use the dirty bitmap to only
> hash modified clusters when building the Merkle tree - this avoids reading
> the entire disk image.

Yes, this is a good optimization.

> > > I still think it's a cool idea. Making it work well will require a
> > > lot more effort than a dirty bitmap.
> >
> > How do you re-generate a dirty bitmap after a server crash?
>
> The dirty bitmap is invalid after crash. A full backup is required, all
> clusters are considered dirty.
>
> The simplest way to implement this is to mark the persistent bitmap
> "invalid" upon the first guest write. When QEMU is terminated cleanly,
> flush all dirty bitmap updates to disk and then mark the file "valid"
> again. If QEMU finds the file is "invalid" on startup, start from scratch.

Or could you compare the hash keys in that case? Although I guess computing
all those SHA1 checksums would need a considerable amount of CPU time.
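
To make the two options concrete, here is a rough sketch of the valid/invalid
flag scheme next to the hash-comparison fallback. It is a toy in-memory model,
not QEMU code: the class PersistentDirtyBitmap, the helpers hash_clusters and
changed_by_hash, and the 4-byte cluster size are all made up for illustration.

```python
import hashlib

CLUSTER_SIZE = 4  # bytes; unrealistically small, for illustration only


class PersistentDirtyBitmap:
    """Toy model of a dirty bitmap file carrying a valid/invalid flag."""

    def __init__(self, num_clusters):
        self.num_clusters = num_clusters
        self.disk_valid = True   # flag as stored in the bitmap file
        self.disk_bits = set()   # dirty clusters as of the last flush
        self.mem_bits = set()    # dirty clusters tracked while running

    def guest_write(self, cluster):
        # The first guest write marks the file "invalid"; later writes
        # leave the flag unchanged and only update the in-memory bitmap.
        self.disk_valid = False
        self.mem_bits.add(cluster)

    def clean_shutdown(self):
        # Flush all bitmap updates first, then mark the file "valid" again.
        self.disk_bits = set(self.mem_bits)
        self.disk_valid = True

    def crash(self):
        # In-memory updates are lost; the file stays flagged "invalid".
        self.mem_bits = set()

    def startup(self):
        # A valid file can be trusted; otherwise start from scratch and
        # treat every cluster as dirty.
        if self.disk_valid:
            self.mem_bits = set(self.disk_bits)
        else:
            self.mem_bits = set(range(self.num_clusters))
        return set(self.mem_bits)


def hash_clusters(image, clusters):
    """Return {cluster index: SHA-1 hex digest} for the given clusters."""
    return {
        c: hashlib.sha1(image[c * CLUSTER_SIZE:(c + 1) * CLUSTER_SIZE]).hexdigest()
        for c in clusters
    }


def changed_by_hash(image, old_hashes):
    """Fallback after a crash: rehash every cluster and compare.

    This reads the whole image, which is the CPU and I/O cost in question.
    """
    new_hashes = hash_clusters(image, old_hashes.keys())
    return {c for c in old_hashes if new_hashes[c] != old_hashes[c]}
```

After a clean shutdown, startup() returns only the clusters that were
written. After a crash it returns every cluster, and changed_by_hash() can
narrow that down again, at the price of hashing the entire image.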