On Fri, 06/29 12:24, Eric Blake wrote:
> On 06/29/2018 10:15 AM, Vladimir Sementsov-Ogievskiy wrote:
> > We need to synchronize backup job with reading from fleecing image
> > like it was done in block/replication.c.
> >
> > Otherwise, the following situation is theoretically possible:
> >
>
> Grammar suggestions:
>
> > 1. client start reading
>
> client starts reading
>
> > 2. client understand, that there is no corresponding cluster in
> > fleecing image
> > 3. client is going to read from backing file (i.e. active image)
>
> client sees that no corresponding cluster has been allocated in the fleecing
> image, so the request is forwarded to the backing file
>
> > 4. guest writes to active image
> > 5. this write is stopped by backup(sync=none) and cluster is copied to
> > fleecing image
> > 6. guest write continues...
> > 7. and client reads _new_ (or partly new) date from active image
>
> Interesting race. Can it actually happen, or does our read code already
> serialize writes to the same area while a read is underway?

Yes, I wonder why wait_serialising_requests() is not enough. If it's possible,
can we have a test case (with help of blkdebug, for example)?

>
> In short, I see what problem you are claiming exists: the moment the client
> starts reading from the backing file, that portion of the backing file must
> remain unchanged until after the client is done reading. But I don't know
> enough details of the block layer to know if this is actually a problem, or
> if adding the new filter is just overhead.

Fam