On Mon, Feb 27, 2012 at 1:16 PM, Stefan Hajnoczi
<stefa...@linux.vnet.ibm.com> wrote:
> The L2 table cache reduces QED metadata reads that would be required
> when translating LBAs to offsets into the image file.  Since requests
> execute in parallel it is possible to share an L2 table between multiple
> requests.
>
> There is a potential data corruption issue when an in-use L2 table is
> evicted from the cache because the following situation occurs:
>
> 1. An allocating write performs an update to L2 table "A".
>
> 2. Another request needs L2 table "B" and causes table "A" to be
>    evicted.
>
> 3. A new read request needs L2 table "A" but it is not cached.
>
> As a result the L2 update from #1 can overlap with the L2 fetch from #3.
> We must avoid doing overlapping I/O requests here since the worst case
> outcome is that the L2 fetch completes before the L2 update and yields
> stale data.  In that case we would effectively discard the L2 update and
> lose data clusters!
>
> Thanks to Benoît Canet <benoit.ca...@gmail.com> for extensive testing
> and debugging which lead to discovery of this bug.
>
> Reported-by: Benoît Canet <benoit.ca...@gmail.com>
> Signed-off-by: Stefan Hajnoczi <stefa...@linux.vnet.ibm.com>
> ---
> Please include this in -stable once it has been merged into qemu.git/master.
>
>  block/qed-l2-cache.c |   22 ++++++++++++++++++----
>  1 files changed, 18 insertions(+), 4 deletions(-)

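
For anyone following the thread without the patch in front of them, here is
a minimal sketch of the general idea: a cache entry that a request still
holds a reference to is never chosen for eviction.  This is not the code
from block/qed-l2-cache.c.  The names (L2Entry, L2Cache, l2_cache_get,
l2_cache_put, l2_cache_insert) and the two-slot cache size are made up for
illustration, and returning NULL when every slot is pinned is just one
possible policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 2            /* hypothetical size, tiny on purpose */

typedef struct {
    uint64_t offset;             /* L2 table offset in the image; 0 = empty */
    unsigned ref;                /* number of requests using this table */
    uint64_t last_use;           /* logical timestamp for LRU ordering */
} L2Entry;

typedef struct {
    L2Entry slots[CACHE_SLOTS];
    uint64_t clock;
} L2Cache;

/* Look up a cached table and pin it for the caller; NULL on a miss. */
static L2Entry *l2_cache_get(L2Cache *c, uint64_t offset)
{
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        L2Entry *e = &c->slots[i];
        if (offset != 0 && e->offset == offset) {
            e->ref++;
            e->last_use = ++c->clock;
            return e;
        }
    }
    return NULL;
}

/* Drop the caller's reference; the entry becomes evictable at ref == 0. */
static void l2_cache_put(L2Entry *e)
{
    assert(e->ref > 0);
    e->ref--;
}

/*
 * Insert a table, reusing an empty slot or the least recently used slot
 * that nobody references.  Pinned slots are skipped, so an update that is
 * still in flight can never have its table evicted underneath it.
 * Returns NULL if every slot is pinned.
 */
static L2Entry *l2_cache_insert(L2Cache *c, uint64_t offset)
{
    L2Entry *victim = NULL;

    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        L2Entry *e = &c->slots[i];
        if (e->ref > 0) {
            continue;            /* in use: not a candidate */
        }
        if (victim == NULL || e->last_use < victim->last_use) {
            victim = e;
        }
    }
    if (victim == NULL) {
        return NULL;
    }
    victim->offset = offset;
    victim->ref = 1;             /* pinned for the inserting request */
    victim->last_use = ++c->clock;
    return victim;
}
```

With a zero-initialized L2Cache, inserting table "A" and then table "B"
while both are still referenced makes a third insert fail instead of
evicting "A".  Once the request using "B" calls l2_cache_put(), the third
insert reuses B's slot and a later lookup of "A" still hits the cache, so
the fetch in step 3 never races with the update from step 1.
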
Thanks for testing this fix and confirming it works, Benoît.  Feel free to
reply with your Tested-by: line.

Stefan