On 10/15/2018 04:32 PM, Ian Jackson wrote:
> Make the bufdev and non-bufdev messages distinct, and always print the
> non-constant argument (ie, the size).

Ok, so I was live migrating a domU with 8GiB of RAM, with Xen 4.11 and
Linux 4.18 in the dom0, and it consistently failed with:

xencall: error: alloc_pages: mmap failed: Invalid argument
xc: error: Unable to allocate memory for dirty bitmaps, batch pfns and deferred pages: Internal error
xc: error: Save failed (12 = Cannot allocate memory): Internal error
libxl-save-helper: debug: complete r=-1: Cannot allocate memory

With Linux 4.17 it either succeeds, or crashes the hypervisor (oh yeah!).

With help in #xendevel a workaround was found (thanks Juergen): raise
the value of /sys/module/xen_privcmd/parameters/limit above its default
of 64, which, by some obscure calculation method, seems to mean 256KiB
of memory.

I could migrate this 8GiB domU when setting it to 128.
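
For what it's worth, here is a minimal sketch of the arithmetic as I
understand it. It assumes the limit counts pages and that pages are
4KiB, which would explain 64 meaning 256KiB; I have not verified that
against the driver. The helper names limit_to_bytes() and read_limit()
are made up for illustration and are not part of any Xen or Linux API.

```c
#include <stdio.h>

/*
 * Hypothetical helper: convert a privcmd "limit" value to bytes,
 * ASSUMING the limit is a page count and each page is page_size bytes.
 */
static unsigned long long limit_to_bytes(unsigned long limit_pages,
                                         unsigned long page_size)
{
    return (unsigned long long)limit_pages * page_size;
}

/*
 * Hypothetical helper: read the current limit from a sysfs-style file,
 * e.g. /sys/module/xen_privcmd/parameters/limit.
 * Returns -1 if the file cannot be opened or does not hold a number.
 */
static long read_limit(const char *path)
{
    FILE *f = fopen(path, "r");
    unsigned long value;
    int matched;

    if ( !f )
        return -1;

    matched = fscanf(f, "%lu", &value);
    fclose(f);

    return matched == 1 ? (long)value : -1;
}
```

Under those assumptions limit_to_bytes(64, 4096) is 262144 bytes
(256KiB), and the value of 128 that worked for me comes out at 512KiB.
It says nothing about how the required limit scales with domU size.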

Andrew said:
17:13 < andyhhp> migration can and will use large buffers
17:13 < andyhhp> the total size of mappings is O(n) with the size of the VM you are trying to migrate.

If I live migrate a domU with 64GiB of memory, how do I know what value
I need for this limit?

It feels a bit like a new kind of gnttab_max_frames setting that users
will run into when they do non-trivial things with Xen.

If this is the case, and sane low defaults are preferred, then I'd
really like to see some kind of error message that is actually helpful
for an end user. E.g. "Hey! You're trying to migrate a domU with XGiB of
memory. It's not gonna happen right now, but if you change the value for
/sys/module/xen_privcmd/parameters/limit to Y it will work. Don't worry,
your domU is still running fine on this host.", having Y calculated back
from memory size.

And then the live migration howto should contain a section on tunables
that describes how to tune Xen for the maximum domU memory size that
you're using...

Just thinking out loud... For normal users, these kinds of errors are
really scary. If they're not a bug, but an expected consequence of
settings, they should not be presented to the user in a way that looks
like a really serious problem.

Hans

> This assists diagnosis.
> 
> CC: Andrew Cooper <andrew.coop...@citrix.com>
> CC: Hans van Kranenburg <h...@knorrie.org>
> ---
> v2: Print sizes.
> ---
>  tools/libs/call/linux.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/libs/call/linux.c b/tools/libs/call/linux.c
> index d8a6306e04..51fa4899eb 100644
> --- a/tools/libs/call/linux.c
> +++ b/tools/libs/call/linux.c
> @@ -93,7 +93,8 @@ static void *alloc_pages_bufdev(xencall_handle *xcall, size_t npages)
>               xcall->buf_fd, 0);
>      if ( p == MAP_FAILED )
>      {
> -        PERROR("alloc_pages: mmap failed");
> +        PERROR("alloc_pages: mmap (,%zu*%lu,...) [bufdev] failed",
> +               npages, (unsigned long)PAGE_SIZE);
>          p = NULL;
>      }
>  
> @@ -110,7 +111,7 @@ static void *alloc_pages_nobufdev(xencall_handle *xcall, size_t npages)
>      p = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_LOCKED, -1, 0);
>      if ( p == MAP_FAILED )
>      {
> -        PERROR("alloc_pages: mmap failed");
> +        PERROR("alloc_pages: mmap(,%zu,...) [nobufdev] failed", size);
>          return NULL;
>      }
>  
> @@ -119,7 +120,8 @@ static void *alloc_pages_nobufdev(xencall_handle *xcall, size_t npages)
>      rc = madvise(p, npages * PAGE_SIZE, MADV_DONTFORK);
>      if ( rc < 0 )
>      {
> -        PERROR("alloc_pages: madvise failed");
> +        PERROR("alloc_pages: madvise (,%zu*%lu,) [nobufdev] failed",
> +               npages, (unsigned long)PAGE_SIZE);
>          goto out;
>      }
>  
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
