Tested-by: Benoît Canet <benoit.ca...@gmail.com>

On Mon, Feb 27, 2012 at 2:16 PM, Stefan Hajnoczi <stefa...@linux.vnet.ibm.com> wrote:

> The L2 table cache reduces QED metadata reads that would be required
> when translating LBAs to offsets into the image file.  Since requests
> execute in parallel it is possible to share an L2 table between multiple
> requests.
>
> There is a potential data corruption issue when an in-use L2 table is
> evicted from the cache because the following situation occurs:
>
>  1. An allocating write performs an update to L2 table "A".
>
>  2. Another request needs L2 table "B" and causes table "A" to be
>     evicted.
>
>  3. A new read request needs L2 table "A" but it is not cached.
>
> As a result the L2 update from #1 can overlap with the L2 fetch from #3.
> We must avoid doing overlapping I/O requests here since the worst case
> outcome is that the L2 fetch completes before the L2 update and yields
> stale data.  In that case we would effectively discard the L2 update and
> lose data clusters!
>
> Thanks to Benoît Canet <benoit.ca...@gmail.com> for extensive testing
> and debugging, which led to the discovery of this bug.
>
> Reported-by: Benoît Canet <benoit.ca...@gmail.com>
> Signed-off-by: Stefan Hajnoczi <stefa...@linux.vnet.ibm.com>
> ---
> Please include this in -stable once it has been merged into
> qemu.git/master.
>
>  block/qed-l2-cache.c |   22 ++++++++++++++++++----
>  1 files changed, 18 insertions(+), 4 deletions(-)
>
> diff --git a/block/qed-l2-cache.c b/block/qed-l2-cache.c
> index 02b81a2..e9b2aae 100644
> --- a/block/qed-l2-cache.c
> +++ b/block/qed-l2-cache.c
> @@ -161,11 +161,25 @@ void qed_commit_l2_cache_entry(L2TableCache *l2_cache, CachedL2Table *l2_table)
>         return;
>     }
>
> +    /* Evict an unused cache entry so we have space.  If all entries are in use
> +     * we can grow the cache temporarily and we try to shrink back down later.
> +     */
>     if (l2_cache->n_entries >= MAX_L2_CACHE_SIZE) {
> -        entry = QTAILQ_FIRST(&l2_cache->entries);
> -        QTAILQ_REMOVE(&l2_cache->entries, entry, node);
> -        l2_cache->n_entries--;
> -        qed_unref_l2_cache_entry(entry);
> +        CachedL2Table *next;
> +        QTAILQ_FOREACH_SAFE(entry, &l2_cache->entries, node, next) {
> +            if (entry->ref > 1) {
> +                continue;
> +            }
> +
> +            QTAILQ_REMOVE(&l2_cache->entries, entry, node);
> +            l2_cache->n_entries--;
> +            qed_unref_l2_cache_entry(entry);
> +
> +            /* Stop evicting when we've shrunk back to max size */
> +            if (l2_cache->n_entries < MAX_L2_CACHE_SIZE) {
> +                break;
> +            }
> +        }
>     }
>
>     l2_cache->n_entries++;
> --
> 1.7.9
>
>
>
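
For anyone who wants to play with the eviction rule outside of QEMU, here is
a minimal standalone sketch of the policy the patch describes: skip entries
that are still referenced by a request, let the cache grow past its limit if
everything is in use, and shrink back on a later commit. It is not the QED
code. The names Entry, Cache, entry_new, entry_unref, cache_find,
cache_commit and MAX_CACHE_SIZE are all made up for illustration, a plain
singly linked list stands in for QTAILQ, and the assumption that commit
takes over the caller's reference is mine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_CACHE_SIZE 2    /* hypothetical, tiny limit chosen for the demo */

typedef struct Entry {
    int id;                 /* stands in for the L2 table offset */
    int ref;                /* 1 = held only by the cache, >1 = in use */
    struct Entry *next;
} Entry;

typedef struct {
    Entry *head;            /* oldest entry first */
    int n_entries;
} Cache;

/* Allocate an entry that carries one reference, owned by the caller. */
static Entry *entry_new(int id)
{
    Entry *e = calloc(1, sizeof(*e));
    assert(e != NULL);
    e->id = id;
    e->ref = 1;
    return e;
}

/* Drop one reference and free the entry when the last one goes away. */
static void entry_unref(Entry *e)
{
    assert(e->ref > 0);
    if (--e->ref == 0) {
        free(e);
    }
}

/* Look up an entry; on a hit the caller receives an extra reference. */
static Entry *cache_find(Cache *c, int id)
{
    for (Entry *e = c->head; e != NULL; e = e->next) {
        if (e->id == id) {
            e->ref++;
            return e;
        }
    }
    return NULL;
}

/* Insert e at the tail, taking over the caller's reference (assumption). */
static void cache_commit(Cache *c, Entry *e)
{
    Entry **link;

    if (c->n_entries >= MAX_CACHE_SIZE) {
        link = &c->head;
        while (*link != NULL) {
            Entry *cur = *link;

            if (cur->ref > 1) {         /* in use by a request: keep it */
                link = &cur->next;
                continue;
            }

            *link = cur->next;          /* unlink, then drop the cache's ref */
            c->n_entries--;
            entry_unref(cur);

            /* Stop evicting once we are back under the limit */
            if (c->n_entries < MAX_CACHE_SIZE) {
                break;
            }
        }
    }

    /* Append even if nothing could be evicted: the cache grows temporarily */
    for (link = &c->head; *link != NULL; link = &(*link)->next) {
        /* walk to the tail */
    }
    e->next = NULL;
    *link = e;
    c->n_entries++;
}
```

To see the behaviour, commit two entries, take a reference to the first with
cache_find(), and commit a third: the unused second entry should be the one
that goes, not the oldest. If every cached entry is referenced, n_entries
ends up above MAX_CACHE_SIZE until the references are dropped and another
commit comes along.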
