Re: [Xen-devel] [PATCH] migration, xen: Fix block image lock issue on live migration

2017-10-02 Thread Dr. David Alan Gilbert
Adding in kwolf;  it looks sane to me; Kevin?
If I'm reading this right, this is just after the device state save.

Dave

* Anthony PERARD (anthony.per...@citrix.com) wrote:
> When doing a live migration of a Xen guest with libxl, the images for
> block devices are locked by the original QEMU process, and this prevent
> the QEMU at the destination to take the lock and the migration fail.
> 
> From QEMU point of view, once the RAM of a domain is migrated, there is
> two QMP commands, "stop" then "xen-save-devices-state", at which point a
> new QEMU is spawned at the destination.
> 
> Release locks in "xen-save-devices-state" so the destination can takes
> them.
> 
> Signed-off-by: Anthony PERARD <anthony.per...@citrix.com>
> ---
> CCing libxl maintainers:
> CC: Ian Jackson <ian.jack...@eu.citrix.com>
> CC: Wei Liu <wei.l...@citrix.com>
> ---
>  migration/savevm.c | 14 ++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 4a88228614..69d904c179 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2263,6 +2263,20 @@ void qmp_xen_save_devices_state(const char *filename, 
> Error **errp)
>  qemu_fclose(f);
>  if (ret < 0) {
>  error_setg(errp, QERR_IO_ERROR);
> +} else {
> +/* libxl calls the QMP command "stop" before calling
> + * "xen-save-devices-state" and in case of migration failure, libxl
> + * would call "cont".
> + * So call bdrv_inactivate_all (release locks) here to let the other
> + * side of the migration take controle of the images.
> + */
> +if (!saved_vm_running) {
> +ret = bdrv_inactivate_all();
> +if (ret) {
> +        error_setg(errp, "%s: bdrv_inactivate_all() failed (%d)",
> +   __func__, ret);
> +}
> +}
>  }
>  
>   the_end:
> -- 
> Anthony PERARD
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 2/4] shutdown: Prepare for use of an enum in reset/shutdown_request

2017-04-28 Thread Dr. David Alan Gilbert
* Eric Blake (ebl...@redhat.com) wrote:
> On 04/28/2017 11:09 AM, Dr. David Alan Gilbert wrote:
> 
> >>> At a higher level, using your tags, I'm not sure where a reset triggered
> >>> by a fault detected by the hypervisor lives - e.g. an x86 triple fault
> >>> where the guest screws up so badly that it just gets reset.  Is
> >>> that a guest-reset or a guest-panic or what - neither case
> >>> was actually asked for by the guest itself.
> >>
> >> Wouldn't that be host-error (qemu detected an error that prevents
> >> further execution of the guest without a reset - and a triple fault
> >> seems to fall into the category of the guest getting itself wedged
> >> rather than actually trying to reset)?  Except patch 3 only used
> >> SHUTDOWN_TYPE_HOST_ERROR in the xen portion of the patch.
> >>
> >> So if any x86 expert has an opinion on where triple-fault handling is
> >> emulated, and what category should be used there, I'm welcome to
> >> tweaking this series.
> > 
> > It's pretty much on the border anyway, I don't think it matters too
> > much; it sounds perfectly reasonable.
> 
> Actually, reading
> https://blogs.msdn.microsoft.com/larryosterman/2005/02/08/faster-syscall-trap-redux/
> makes it sound like the triple-fault = reset is exploited by existing OS
> (dating back to days of targetting 286 machines), so it is bare-metal
> behavior that we have to faithfully emulate as a guest-triggered reset,
> and not something where the guest has wedged itself to the point where
> qemu can no longer execute the guest.

The point is it's both :-)
A lot of x86 reset code tries four or five different ways to invoke
a reset and if all else fails they triple fault.

Dave

> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.   +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org
> 



--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 2/4] shutdown: Prepare for use of an enum in reset/shutdown_request

2017-04-28 Thread Dr. David Alan Gilbert
* Eric Blake (ebl...@redhat.com) wrote:
> On 04/28/2017 10:27 AM, Dr. David Alan Gilbert wrote:
> 
> >>>> +# Enumeration of various causes for shutdown.
> >>>> +#
> >>>> +# @host-qmp: Reaction to a QMP command, such as 'quit'
> >>>> +# @host-signal: Reaction to a signal, such as SIGINT
> >>>> +# @host-ui: Reaction to a UI event, such as closing the window
> >>>> +# @host-replay: The host is replaying an earlier shutdown event
> >>>> +# @host-error: Qemu encountered an error that prevents further use of 
> >>>> the guest
> >>>> +# @guest-shutdown: The guest requested a shutdown, such as via ACPI or
> >>>> +#  other hardware-specific action
> >>>> +# @guest-reset: The guest requested a reset, and the command line
> >>>> +#   response to a reset is to instead trigger a shutdown
> >>>> +# @guest-panic: The guest panicked, and the command line response to
> >>>> +#   a panic is to trigger a shutdown
> >>>
> 
> > At a higher level, using your tags, I'm not sure where a reset triggered
> > by a fault detected by the hypervisor lives - e.g. an x86 triple fault
> > where the guest screws up so badly that it just gets reset.  Is
> > that a guest-reset or a guest-panic or what - neither case
> > was actually asked for by the guest itself.
> 
> Wouldn't that be host-error (qemu detected an error that prevents
> further execution of the guest without a reset - and a triple fault
> seems to fall into the category of the guest getting itself wedged
> rather than actually trying to reset)?  Except patch 3 only used
> SHUTDOWN_TYPE_HOST_ERROR in the xen portion of the patch.
> 
> So if any x86 expert has an opinion on where triple-fault handling is
> emulated, and what category should be used there, I'm welcome to
> tweaking this series.

It's pretty much on the border anyway, I don't think it matters too
much; it sounds perfectly reasonable.

Dave

> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.   +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org
> 



--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 2/4] shutdown: Prepare for use of an enum in reset/shutdown_request

2017-04-28 Thread Dr. David Alan Gilbert
* Eric Blake (ebl...@redhat.com) wrote:
> On 04/28/2017 03:08 AM, Dr. David Alan Gilbert wrote:
> > * Eric Blake (ebl...@redhat.com) wrote:
> >> We want to track why a guest was shutdown; in particular, being able
> >> to tell the difference between a guest request (such as ACPI request)
> >> and host request (such as SIGINT) will prove useful to libvirt.
> >> Since all requests eventually end up changing shutdown_requested in
> >> vl.c, the logical change is to make that value track the reason,
> >> rather than its current 0/1 contents.
> >>
> 
> >>  ##
> >> +# @ShutdownCause:
> >> +#
> >> +# Enumeration of various causes for shutdown.
> >> +#
> >> +# @host-qmp: Reaction to a QMP command, such as 'quit'
> >> +# @host-signal: Reaction to a signal, such as SIGINT
> >> +# @host-ui: Reaction to a UI event, such as closing the window
> >> +# @host-replay: The host is replaying an earlier shutdown event
> >> +# @host-error: Qemu encountered an error that prevents further use of the 
> >> guest
> >> +# @guest-shutdown: The guest requested a shutdown, such as via ACPI or
> >> +#  other hardware-specific action
> >> +# @guest-reset: The guest requested a reset, and the command line
> >> +#   response to a reset is to instead trigger a shutdown
> >> +# @guest-panic: The guest panicked, and the command line response to
> >> +#   a panic is to trigger a shutdown
> > 
> > It's a little coarse grained;  is there anyway to pass platform specific 
> > information
> > for debug?  I ask because I spent a while debugging a few bugs with 
> > unexpected
> > resets and had to figure out which of x86's many reset causes triggered it.
> 
> I'm open to any followup patches that add further enum values and
> adjusts the various callers (patch 3 shows how MANY callers use
> qemu_system_shutdown_request).  But I don't think it's necessarily in
> scope for this series - remember, my goal here was merely to distinguish
> between host- and guest-triggered resets (which libvirt and higher
> management tasks want to know)

Yep, that's fine.

> rather than which of multiple reset
> paths was taken (I agree that it is useful during a qemu debug session -
> but that's a different audience).  I also don't consider myself an
> expert in the many ways that x86 can reset - it was easy to blindly
> rewrite qemu_system_shutdown_request() into
> qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN) based solely
> on directory, but it would be harder to distinguish which of the
> multiple files should have which finer-grained cause.

Yes, I'm also not an expert on x86 resets - but when I was debugging
I just added a tag in every place it called the reset code.

At a higher level, using your tags, I'm not sure where a reset triggered
by a fault detected by the hypervisor lives - e.g. an x86 triple fault
where the guest screws up so badly that it just gets reset.  Is
that a guest-reset or a guest-panic or what - neither case
was actually asked for by the guest itself.

Dave


> 
> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.   +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org
> 



--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 2/4] shutdown: Prepare for use of an enum in reset/shutdown_request

2017-04-28 Thread Dr. David Alan Gilbert
   qemu_system_reset(VMRESET_REPORT);
> +request = qemu_reset_requested_get();
> +if (request >= 0) {
> +qemu_system_reset(VMRESET_REPORT, request);
>  destroy_hvm_domain(true);
>  }
>  }
> diff --git a/migration/colo.c b/migration/colo.c
> index c19eb3f..17a5482 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -611,7 +611,7 @@ void *colo_process_incoming_thread(void *opaque)
>  }
> 
>  qemu_mutex_lock_iothread();
> -qemu_system_reset(VMRESET_SILENT);
> +qemu_system_reset(VMRESET_SILENT, -1);
>  vmstate_loading = true;
>  if (qemu_loadvm_state(fb) < 0) {
>  error_report("COLO: loadvm failed");
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 03ae1bd..dcbaf00 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2292,7 +2292,7 @@ int load_vmstate(const char *name)
>  return -EINVAL;
>  }
> 
> -qemu_system_reset(VMRESET_SILENT);
> +qemu_system_reset(VMRESET_SILENT, -1);
>  mis->from_src_file = f;
> 
>  aio_context_acquire(aio_context);
> -- 
> 2.9.3
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Regression: Xen guest with 5G of RAM on 32bit fail to boot

2015-12-02 Thread Dr. David Alan Gilbert
* Paolo Bonzini (pbonz...@redhat.com) wrote:
> 
> 
> On 01/12/2015 18:53, Anthony PERARD wrote:
> > The problem is in qemu_ram_alloc_internal() where 'size' and 'maxsize' are
> > now been truncate to 32bit, due to 'qemu_host_page_size' been an uintptr_t
> > in the HOST_PAGE_ALIGN macro.
> 
> Isn't it qemu_host_page_mask that causes the problem?
> 
> This should also work, as it causes qemu_host_page_mask to sign-extend:

I've just posted a set that just flips them to ram_addr_t (and removes the 10 
year
old warning that questions whether the type is right - which it obviously 
wasn't);
see '[For 2.5?? PATCH 1/1] qemu_{real_}host_page_[size|mask] change types to
 ram_addr_t'

Anthony: I'd appreciate if you could give it a Xen test.

Dave

> diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
> index f9998b9..87a4145 100644
> --- a/include/exec/cpu-all.h
> +++ b/include/exec/cpu-all.h
> @@ -174,11 +174,10 @@ extern unsigned long reserved_va;
>  #define TARGET_PAGE_MASK ~(TARGET_PAGE_SIZE - 1)
>  #define TARGET_PAGE_ALIGN(addr) (((addr) + TARGET_PAGE_SIZE - 1) & 
> TARGET_PAGE_MASK)
>  
> -/* ??? These should be the larger of uintptr_t and target_ulong.  */
>  extern uintptr_t qemu_real_host_page_size;
> -extern uintptr_t qemu_real_host_page_mask;
> +extern intptr_t qemu_real_host_page_mask;
>  extern uintptr_t qemu_host_page_size;
> -extern uintptr_t qemu_host_page_mask;
> +extern intptr_t qemu_host_page_mask;
>  
>  #define HOST_PAGE_ALIGN(addr) (((addr) + qemu_host_page_size - 1) & 
> qemu_host_page_mask)
>  #define REAL_HOST_PAGE_ALIGN(addr) (((addr) + qemu_real_host_page_size - 1) 
> & \
> diff --git a/translate-all.c b/translate-all.c
> index a940bd2..7a15109 100644
> --- a/translate-all.c
> +++ b/translate-all.c
> @@ -118,7 +118,7 @@ typedef struct PageDesc {
>  #define V_L1_SHIFT (L1_MAP_ADDR_SPACE_BITS - TARGET_PAGE_BITS - V_L1_BITS)
>  
>  uintptr_t qemu_host_page_size;
> -uintptr_t qemu_host_page_mask;
> +intptr_t qemu_host_page_mask;
>  
>  /* The bottom level has pointers to PageDesc */
>  static void *l1_map[V_L1_SIZE];
> @@ -326,14 +326,14 @@ void page_size_init(void)
>  /* NOTE: we can always suppose that qemu_host_page_size >=
> TARGET_PAGE_SIZE */
>  qemu_real_host_page_size = getpagesize();
> -qemu_real_host_page_mask = ~(qemu_real_host_page_size - 1);
> +qemu_real_host_page_mask = -(intptr_t)qemu_real_host_page_size;
>  if (qemu_host_page_size == 0) {
>  qemu_host_page_size = qemu_real_host_page_size;
>  }
>  if (qemu_host_page_size < TARGET_PAGE_SIZE) {
>  qemu_host_page_size = TARGET_PAGE_SIZE;
>  }
> -qemu_host_page_mask = ~(qemu_host_page_size - 1);
> +qemu_host_page_mask = -(intptr_t)qemu_host_page_size;
>  }
>  
>  static void page_init(void)
> diff --git a/translate-common.c b/translate-common.c
> index 619feb4..171222d 100644
> --- a/translate-common.c
> +++ b/translate-common.c
> @@ -21,7 +21,7 @@
>  #include "qom/cpu.h"
>  
>  uintptr_t qemu_real_host_page_size;
> -uintptr_t qemu_real_host_page_mask;
> +intptr_t qemu_real_host_page_mask;
>  
>  #ifndef CONFIG_USER_ONLY
>  /* mask must never be zero, except for A20 change call */
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Regression: Xen guest with 5G of RAM on 32bit fail to boot

2015-12-02 Thread Dr. David Alan Gilbert
* Paolo Bonzini (pbonz...@redhat.com) wrote:
> 
> 
> On 02/12/2015 11:30, Paolo Bonzini wrote:
> > diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
> > index f9998b9..87a4145 100644
> > --- a/include/exec/cpu-all.h
> > +++ b/include/exec/cpu-all.h
> > @@ -174,11 +174,10 @@ extern unsigned long reserved_va;
> >  #define TARGET_PAGE_MASK ~(TARGET_PAGE_SIZE - 1)
> >  #define TARGET_PAGE_ALIGN(addr) (((addr) + TARGET_PAGE_SIZE - 1) & 
> > TARGET_PAGE_MASK)
> >  
> > -/* ??? These should be the larger of uintptr_t and target_ulong.  */
> >  extern uintptr_t qemu_real_host_page_size;
> > -extern uintptr_t qemu_real_host_page_mask;
> > +extern intptr_t qemu_real_host_page_mask;
> >  extern uintptr_t qemu_host_page_size;
> > -extern uintptr_t qemu_host_page_mask;
> > +extern intptr_t qemu_host_page_mask;
> >  
> >  #define HOST_PAGE_ALIGN(addr) (((addr) + qemu_host_page_size - 1) & 
> > qemu_host_page_mask)
> >  #define REAL_HOST_PAGE_ALIGN(addr) (((addr) + qemu_real_host_page_size - 
> > 1) & \
> > diff --git a/translate-all.c b/translate-all.c
> > index a940bd2..7a15109 100644
> > --- a/translate-all.c
> > +++ b/translate-all.c
> > @@ -118,7 +118,7 @@ typedef struct PageDesc {
> >  #define V_L1_SHIFT (L1_MAP_ADDR_SPACE_BITS - TARGET_PAGE_BITS - V_L1_BITS)
> >  
> >  uintptr_t qemu_host_page_size;
> > -uintptr_t qemu_host_page_mask;
> > +intptr_t qemu_host_page_mask;
> >  
> >  /* The bottom level has pointers to PageDesc */
> >  static void *l1_map[V_L1_SIZE];
> > @@ -326,14 +326,14 @@ void page_size_init(void)
> >  /* NOTE: we can always suppose that qemu_host_page_size >=
> > TARGET_PAGE_SIZE */
> >  qemu_real_host_page_size = getpagesize();
> > -qemu_real_host_page_mask = ~(qemu_real_host_page_size - 1);
> > +qemu_real_host_page_mask = -(intptr_t)qemu_real_host_page_size;
> >  if (qemu_host_page_size == 0) {
> >  qemu_host_page_size = qemu_real_host_page_size;
> >  }
> >  if (qemu_host_page_size < TARGET_PAGE_SIZE) {
> >  qemu_host_page_size = TARGET_PAGE_SIZE;
> >  }
> > -qemu_host_page_mask = ~(qemu_host_page_size - 1);
> > +qemu_host_page_mask = -(intptr_t)qemu_host_page_size;
> >  }
> >  
> >  static void page_init(void)
> > diff --git a/translate-common.c b/translate-common.c
> > index 619feb4..171222d 100644
> > --- a/translate-common.c
> > +++ b/translate-common.c
> > @@ -21,7 +21,7 @@
> >  #include "qom/cpu.h"
> >  
> >  uintptr_t qemu_real_host_page_size;
> > -uintptr_t qemu_real_host_page_mask;
> > +intptr_t qemu_real_host_page_mask;
> >  
> >  #ifndef CONFIG_USER_ONLY
> >  /* mask must never be zero, except for A20 change call */
> > 
> > 
> 
> Ok, I tested this by adding
> 
> + assert(HOST_PAGE_ALIGN(0x123456700ll) == 0x123457000ll);
> + assert(REAL_HOST_PAGE_ALIGN(0x123456700ll) == 0x123457000ll);
> 
> and doing a 32-bit x86_64-linux-user build.  Since Dave's patch does not
> compile for user-mode emulation (ram_addr_t is a softmmu concept), I'm
> queuing my patch for 2.5.

Hmm yes OK; my alternate was just making ram_addr_t being always defined.
I'm not sure we have any other type that's more suitable than ram_addr_t
but I guess the intptr_t will work for the mask.

Dave

> 
> Paolo
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Regression: Xen guest with 5G of RAM on 32bit fail to boot

2015-12-01 Thread Dr. David Alan Gilbert
* Anthony PERARD (anthony.per...@citrix.com) wrote:
> Hi,
> 
> Under Xen, a guest with 5G of RAM, with a 32bit binary QEMU (well, with a
> 32bit dom0) does not boot anymore. QEMU abort() with "Bad ram offset 
> efffd000".
> 
> This issue first appear in 4ed023ce2a39ab5812d33cf4d819def168965a7f (Round
> up RAMBlock sizes to host page sizes).
> 
> The problem is in qemu_ram_alloc_internal() where 'size' and 'maxsize' are
> now been truncate to 32bit, due to 'qemu_host_page_size' been an uintptr_t
> in the HOST_PAGE_ALIGN macro.
> 
> ram_add_t is uint64_t when compiled with --enable-xen.

Hmm, that's a fun problem.
Would changing qemu_host_page_[size|mask] to ram_addr_t  work?

Dave

> 
> Regards,
> 
> -- 
> Anthony PERARD
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [Qemu-devel] [edk2] EDK II & GPL - Re: OVMF BoF @ KVM Forum 2015

2015-09-10 Thread Dr. David Alan Gilbert
* Daniel P. Berrange (berra...@redhat.com) wrote:
> On Wed, Sep 09, 2015 at 05:41:59PM -0700, Jordan Justen wrote:
> > On 2015-09-09 16:05:20, Andrew Fish wrote:
> > > 
> > > > On Sep 9, 2015, at 3:24 PM, Jordan Justen <jordan.l.jus...@intel.com> 
> > > > wrot> > > FWIW, I don't mind if the consensus is that GplDriverPkg must 
> > > > live in
> > > > a separate repo. But, it would be nice to hear a good reason why it
> > > > must live elsewhere.
> > > 
> > > Because GPL is not a permissive license. An accidental git grep and
> > > copying some code can change the license of the code that gets the
> > > GPL code pasted into it.
> > 
> > I like this argument. It is slightly tempered by the fact that git
> > grep always shows the source path, and thus 'GplDriverPkg' would be
> > obviously visible.
> 
> Plenty of projects have a scenario in which different parts of their
> codebase are under different licenses, without there being undue
> problems. If you make it clear by having a separate directory, then
> I think you can ultimately credit the developers with having enough
> intelligence to do the right thing here. If not, then I'd probably
> question whether you can trust them to submit any code at all, as
> they could equally have blindly copied it from a 3rd party project
> under an incompatible license.

Many companies dont trust their engineers to do that, and have painful
review processes to stop their engineers stupidly copying closed
code into open projects; and in general they're needed because the
engineers would do it if they weren't stopped.

Dave

> 
> Regards,
> Daniel
> -- 
> |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org  -o-     http://virt-manager.org :|
> |: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel