On 06.11.19 12:22, Max Reitz wrote:
> On 06.11.19 12:18, Dietmar Maurer wrote:
>>> And if it issues a smaller request, there is no way for a guest device
>>> to tell it “OK, here’s your data, but note we have a whole 4 MB chunk
>>> around it, maybe you’d like to take that as well...?”
>>>
>>> I understand wanting to increase the backup buffer size, but I don’t
>>> quite understand why we’d want it to increase to the source cluster size
>>> when the guest also has no idea what the source cluster size is.
>>
>> Because it is more efficent.
>
> For rbd.
Let me elaborate: Yes, a cluster size generally means that it is most “efficient” to access the storage at that size. But there’s a tradeoff. At some point, reading the data takes sufficiently long that reading a bit of metadata doesn’t matter anymore (usually, that is).

There is a problem with making the backup copy size rather large, though: backup’s copy-before-write causes guest writes to stall. So if the guest just writes a bit of data, a 4 MB buffer size may mean that in the background it will have to wait for 4 MB of data to be copied. [1]

Hm. OTOH, we have the same problem already with the target’s cluster size, which can of course be 4 MB as well. But I can imagine that actually being important for the target, because otherwise there might be read-modify-write cycles.

For the source, however, I still don’t quite understand why rbd has such a problem with small read requests. I don’t doubt that it does (as you explained), but then how is it even possible to use rbd as the backend for a guest that has no idea of this requirement? Does Linux really prefill the page cache with 4 MB of data for each read?

Max

[1] I suppose we could decouple the copy buffer size from the bitmap granularity, but that would be more work than just a MAX() in backup_calculate_cluster_size().