On Thu, Mar 01, 2012 at 05:10:57PM +0100, Kevin Wolf wrote:
> On 27.02.2012 14:16, Stefan Hajnoczi wrote:
> > The L2 table cache reduces QED metadata reads that would be required
> > when translating LBAs to offsets into the image file.  Since requests
> > execute in parallel it is possible to share an L2 table between multiple
> > requests.
> >
> > There is a potential data corruption issue when an in-use L2 table is
> > evicted from the cache because the following situation occurs:
> >
> > 1. An allocating write performs an update to L2 table "A".
> >
> > 2. Another request needs L2 table "B" and causes table "A" to be
> >    evicted.
> >
> > 3. A new read request needs L2 table "A" but it is not cached.
> >
> > As a result the L2 update from #1 can overlap with the L2 fetch from #3.
> > We must avoid doing overlapping I/O requests here since the worst case
> > outcome is that the L2 fetch completes before the L2 update and yields
> > stale data.  In that case we would effectively discard the L2 update and
> > lose data clusters!
> >
> > Thanks to Benoît Canet <benoit.ca...@gmail.com> for extensive testing
> > and debugging which led to discovery of this bug.
> >
> > Reported-by: Benoît Canet <benoit.ca...@gmail.com>
> > Signed-off-by: Stefan Hajnoczi <stefa...@linux.vnet.ibm.com>
>
> Thanks, applied to the block branch.
>
> How about a qemu-iotests case?

The test case is not ready yet.  I started writing one, but it is racy
because there is not yet a way for tests to control when AIO requests are
issued and completed.  My next step is to add that.

Stefan
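
To illustrate why control over AIO issue/completion order makes the
scenario reproducible, here is a toy model in Python.  It is not QED code
and not a qemu-iotests case: FakeDisk, L2Cache and run_scenario are
hypothetical names invented for this sketch, and "do not evict entries
that are still referenced" is only assumed here as one possible way to
avoid the overlap.  The fake disk lets the test decide when an issued
write actually lands, so steps 1-3 from the commit message always run in
the bad order.

```python
from collections import OrderedDict


class FakeDisk:
    """Toy image file: writes land only when the test completes them."""

    def __init__(self):
        self.tables = {}   # table name -> contents "on disk"
        self.pending = []  # writes issued but not yet completed

    def issue_write(self, name, contents):
        self.pending.append((name, dict(contents)))

    def complete_writes(self):
        for name, contents in self.pending:
            self.tables[name] = contents
        self.pending.clear()

    def read(self, name):
        return dict(self.tables.get(name, {}))


class Entry:
    def __init__(self, table):
        self.table = table
        self.refs = 0


class L2Cache:
    """LRU cache of L2 tables; optionally refuses to evict in-use entries."""

    def __init__(self, disk, capacity, pin_in_use):
        self.disk = disk
        self.capacity = capacity
        self.pin_in_use = pin_in_use
        self.entries = OrderedDict()  # name -> Entry, oldest first

    def get(self, name):
        if name in self.entries:
            self.entries.move_to_end(name)
        else:
            self._make_room()
            self.entries[name] = Entry(self.disk.read(name))
        entry = self.entries[name]
        entry.refs += 1
        return entry

    def put(self, entry):
        entry.refs -= 1

    def _make_room(self):
        while len(self.entries) >= self.capacity:
            for name, entry in list(self.entries.items()):
                if not (self.pin_in_use and entry.refs > 0):
                    del self.entries[name]
                    break
            else:
                return  # everything is in use: exceed capacity for now


def run_scenario(pin_in_use):
    disk = FakeDisk()
    cache = L2Cache(disk, capacity=1, pin_in_use=pin_in_use)

    # 1. An allocating write updates L2 table "A"; the write is issued
    #    but the test has not let it complete yet.
    a = cache.get("A")
    a.table[0] = "cluster@0x1000"
    disk.issue_write("A", a.table)

    # 2. Another request needs table "B", which may evict "A".
    cache.put(cache.get("B"))

    # 3. A new read needs table "A".
    a_again = cache.get("A")
    seen = a_again.table.get(0)
    cache.put(a_again)

    # Only now does the update from step 1 complete.
    disk.complete_writes()
    cache.put(a)
    return seen
```

With pin_in_use=False the read in step 3 fetches table "A" from disk
before the update has landed and does not see the new mapping; with
pin_in_use=True table "A" stays cached while the write is in flight and
the read sees it.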