Hi, Juan. Thanks for your review.

> -----Original Message-----
> From: Juan Quintela [mailto:quint...@redhat.com]
> Sent: Tuesday, February 25, 2014 11:25 PM
> To: Gonglei (Arei)
> Cc: qemu-devel@nongnu.org; Dr. David Alan Gilbert; owass...@redhat.com;
> chenliang (T); Huangweidong (C)
> Subject: Re: [PATCH v3] XBZRLE: Fix qemu crash when resize the xbzrle cache
> 
> "Gonglei (Arei)" <arei.gong...@huawei.com> wrote:
> > Resizing the xbzrle cache during migration causes qemu-crash,
> > because the main-thread and migration-thread modify the xbzrle
> > cache size concurrently without lock-protection.
> >
> > Signed-off-by: ChenLiang <chenlian...@huawei.com>
> > Signed-off-by: Gonglei <arei.gong...@huawei.com>
> > Reviewed-by: Dr. David Alan Gilbert <dgilb...@redhat.com>
> 
> Sorry for the late review.
> 
> > ---
> > Changes against the previous version:
> >  *Remove the temporary variable cache in the xbzrle_cache_resize.
> >
> >  arch_init.c | 58
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++---
> >  1 file changed, 55 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch_init.c b/arch_init.c
> > index 80574a0..003640f 100644
> > --- a/arch_init.c
> > +++ b/arch_init.c
> > @@ -164,26 +164,69 @@ static struct {
> >      uint8_t *encoded_buf;
> >      /* buffer for storing page content */
> >      uint8_t *current_buf;
> > -    /* Cache for XBZRLE */
> > +    /* Cache for XBZRLE, Protected by lock. */
> >      PageCache *cache;
> > +    QemuMutex lock;
> >  } XBZRLE = {
> >      .encoded_buf = NULL,
> >      .current_buf = NULL,
> >      .cache = NULL,
> > +    /* use the lock carefully */
> > +    .lock = {PTHREAD_MUTEX_INITIALIZER},
> >  };
> 
> Have you tried to compile for windows?  I think this would make it
> break.  We need to init it with qemu_mutex_init() in ram_save_setup()
> after the call to cache_init()?

Yeah,it isn't compatible with windows. There is a problem that we must 
initialize the lock at qemu startup. 
Because the xbzrle_cache_resize can be called anytime. I don't know how to 
handle it in windows. 
Do you have any suggestion?
> 
> 
> 
> >  /* buffer used for XBZRLE decoding */
> >  static uint8_t *xbzrle_decoded_buf;
> >
> > +static void XBZRLE_cache_lock(void)
> > +{
> > +    qemu_mutex_lock(&XBZRLE.lock);
> > +}
> > +
> 
> if we change this to:
> 
>     if (migrate_use_xbzrle())
>         qemu_mutex_lock(&XBZRLE.lock);
> 
> On both cache operations, we would remove the overhead in the no XBRLE
> case, right?

Check.
> 
> > +static void XBZRLE_cache_unlock(void)
> > +{
> > +    qemu_mutex_unlock(&XBZRLE.lock);
> > +}
> > +
> >  int64_t xbzrle_cache_resize(int64_t new_size)
> >  {
> > +    PageCache *new_cache, *old_cache;
> > +
> >      if (new_size < TARGET_PAGE_SIZE) {
> >          return -1;
> >      }
> >
> > +    XBZRLE_cache_lock();
> >      if (XBZRLE.cache != NULL) {
> > -        return cache_resize(XBZRLE.cache, new_size /
> TARGET_PAGE_SIZE) *
> > -            TARGET_PAGE_SIZE;
> > +        /* check XBZRLE.cache again later */
> > +        XBZRLE_cache_unlock();
> > +        if (pow2floor(new_size) == migrate_xbzrle_cache_size()) {
> > +            return pow2floor(new_size);
> > +        }
> > +        new_cache = cache_init(new_size / TARGET_PAGE_SIZE,
> > +                                        TARGET_PAGE_SIZE);
> > +        if (!new_cache) {
> > +            DPRINTF("Error creating cache\n");
> > +            return -1;
> > +        }
> > +
> > +        XBZRLE_cache_lock();
> > +        /* the XBZRLE.cache may have be destroyed, check it again */
> > +        if (XBZRLE.cache != NULL) {
> > +            old_cache = XBZRLE.cache;
> > +            XBZRLE.cache = new_cache;
> > +            new_cache = NULL;
> > +        }
> > +        XBZRLE_cache_unlock();
> > +
> > +        if (NULL == new_cache) {
> > +            cache_fini(old_cache);
> > +        } else {
> > +            cache_fini(new_cache);
> > +        }
> > +    } else {
> > +        XBZRLE_cache_unlock();
> >      }
> 
> Can we change this to:
> 
>         if (pow2floor(new_size) == migrate_xbzrle_cache_size()) {
>             return pow2floor(new_size);
>         }
> 
>         new_cache = cache_init(new_size / TARGET_PAGE_SIZE,
>                                         TARGET_PAGE_SIZE);
>         if (!new_cache) {
>             DPRINTF("Error creating cache\n");
>             return -1;
>         }
> 
>         XBZRLE_cache_lock();
>         /* the XBZRLE.cache may have be destroyed, check it again */
>         if (XBZRLE.cache != NULL) {
>             cache_to_free = XBZRLE.cache;
>             XBZRLE.cache = new_cache;
>         } else {
>             cache_to_free = new_cache;
>         }
>         XBZRLE_cache_unlock();
> 
>         cache_fini(cache_to_free);
> 
> I think that looking is clearer this way.  BTW, we are losing
> _everything_ that is on the cache if we resize it.  I don't know if that
> is what we want.  If we have a big cache, and make it bigger because we
> are having too many misses, just dropping the whole cache looks like a
> bad idea :-(
> 

Dropping the whole cache is a pity. I have thought about reuse the cache_resize 
in the xbzrle_cache_resize. 
But the cache_resize may consume a little time, and it will block migration 
thread too long.
> 
> And my understanding is that we should do a:
> 
> migrate_get_current()->xbzrle_cache_size = new_size;
> 
> independently of what CACHE is equal or different from NULL?
> (this was already wrong before your patch, just asking because you are
> looking at the code).

We should do it definitely. The old code do it too. But the old code don't 
handle the return
value of the xbzrle_cache_resize. The return value may be -1. I will fix it .

In addition, I also do some work to fix another XBZRLE bug and optimize XBZRLE. 
The patches will be committed later.
> 
> Thanks, Juan.

Best regards,
-Gonglei

Reply via email to