On Mon, Feb 27, 2012 at 1:16 PM, Stefan Hajnoczi
<stefa...@linux.vnet.ibm.com> wrote:
> The L2 table cache reduces QED metadata reads that would be required
> when translating LBAs to offsets into the image file.  Since requests
> execute in parallel it is possible to share an L2 table between multiple
> requests.
>
> There is a potential data corruption issue when an in-use L2 table is
> evicted from the cache because the following situation occurs:
>
>  1. An allocating write performs an update to L2 table "A".
>
>  2. Another request needs L2 table "B" and causes table "A" to be
>     evicted.
>
>  3. A new read request needs L2 table "A" but it is not cached.
>
> As a result the L2 update from #1 can overlap with the L2 fetch from #3.
> We must avoid doing overlapping I/O requests here since the worst case
> outcome is that the L2 fetch completes before the L2 update and yields
> stale data.  In that case we would effectively discard the L2 update and
> lose data clusters!
>
> Thanks to Benoît Canet <benoit.ca...@gmail.com> for extensive testing
> and debugging which led to the discovery of this bug.
>
> Reported-by: Benoît Canet <benoit.ca...@gmail.com>
> Signed-off-by: Stefan Hajnoczi <stefa...@linux.vnet.ibm.com>
> ---
> Please include this in -stable once it has been merged into qemu.git/master.
>
>  block/qed-l2-cache.c |   22 ++++++++++++++++++----
>  1 files changed, 18 insertions(+), 4 deletions(-)
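
For anyone following the thread without the diff at hand, the scenario in
steps 1-3 above can be sketched roughly as follows.  This is only an
illustration of the general idea (pin a table while a request uses it and
never evict a pinned table), not the code from the patch.  All names here
(L2Entry, L2Cache, l2_cache_get, l2_cache_put, l2_cache_insert) are
hypothetical, and the fixed two-slot cache and the "offset 0 means empty
slot" convention are assumptions made to keep the example small.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SLOTS 2   /* tiny on purpose so eviction is easy to trigger */

/* Hypothetical cache entry, not the structure used in block/qed-l2-cache.c. */
typedef struct {
    unsigned long offset;   /* L2 table offset in the image; 0 = empty slot (assumption) */
    int ref;                /* number of in-flight requests using this table */
} L2Entry;

typedef struct {
    L2Entry slots[CACHE_SLOTS];
} L2Cache;

/* Look up a table and pin it for the caller.  Returns NULL on a miss. */
static L2Entry *l2_cache_get(L2Cache *c, unsigned long offset)
{
    assert(offset != 0);
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        if (c->slots[i].offset == offset) {
            c->slots[i].ref++;
            return &c->slots[i];
        }
    }
    return NULL;
}

/* Drop the caller's pin once its L2 read or update has completed. */
static void l2_cache_put(L2Entry *e)
{
    assert(e->ref > 0);
    e->ref--;
}

/*
 * Insert a table and return it pinned.  Only entries with ref == 0 are
 * eviction candidates, so a table with an L2 update in flight stays cached
 * and a later request for it hits the cache instead of re-reading the file.
 * Returns NULL if every slot is in use; what a real cache does in that case
 * is outside the scope of this sketch.
 */
static L2Entry *l2_cache_insert(L2Cache *c, unsigned long offset)
{
    L2Entry *victim = NULL;

    assert(offset != 0);
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        L2Entry *e = &c->slots[i];
        /* Take the first unpinned slot, but prefer an empty one if found. */
        if (e->ref == 0 && (victim == NULL || e->offset == 0)) {
            victim = e;
        }
    }
    if (victim == NULL) {
        return NULL;
    }
    victim->offset = offset;
    victim->ref = 1;
    return victim;
}
```

With this shape, the allocating write from step 1 holds a reference on
table "A" until its update completes, so the request in step 2 has to pick
a different victim and the read in step 3 finds "A" in the cache rather
than issuing a fetch that overlaps the update.
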

Thanks for testing this fix and confirming it works, Benoît.  Feel
free to reply with your Tested-by: line.

Stefan
