Re: [Xen-devel] [PATCH v5 2/4] shutdown: Prepare for use of an enum in reset/shutdown_request

2017-04-28 Thread Dr. David Alan Gilbert
* Eric Blake (ebl...@redhat.com) wrote:
> On 04/28/2017 11:09 AM, Dr. David Alan Gilbert wrote:
> 
> >>> At a higher level, using your tags, I'm not sure where a reset triggered
> >>> by a fault detected by the hypervisor lives - e.g. an x86 triple fault
> >>> where the guest screws up so badly that it just gets reset.  Is
> >>> that a guest-reset or a guest-panic or what - neither case
> >>> was actually asked for by the guest itself.
> >>
> >> Wouldn't that be host-error (qemu detected an error that prevents
> >> further execution of the guest without a reset - and a triple fault
> >> seems to fall into the category of the guest getting itself wedged
> >> rather than actually trying to reset)?  Except patch 3 only used
> >> SHUTDOWN_TYPE_HOST_ERROR in the xen portion of the patch.
> >>
> >> So if any x86 expert has an opinion on where triple-fault handling is
> >> emulated, and what category should be used there, I'm open to
> >> tweaking this series.
> > 
> > It's pretty much on the border anyway, I don't think it matters too
> > much; it sounds perfectly reasonable.
> 
> Actually, reading
> https://blogs.msdn.microsoft.com/larryosterman/2005/02/08/faster-syscall-trap-redux/
> makes it sound like triple-fault = reset is exploited by existing OSes
> (dating back to the days of targeting 286 machines), so it is bare-metal
> behavior that we have to faithfully emulate as a guest-triggered reset,
> and not something where the guest has wedged itself to the point where
> qemu can no longer execute the guest.

The point is it's both :-)
A lot of x86 reset code tries four or five different ways to invoke
a reset and, if all else fails, falls back to a triple fault.

Dave

> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.   +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org
> 



--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 2/4] shutdown: Prepare for use of an enum in reset/shutdown_request

2017-04-28 Thread Eric Blake
On 04/28/2017 11:09 AM, Dr. David Alan Gilbert wrote:

>>> At a higher level, using your tags, I'm not sure where a reset triggered
>>> by a fault detected by the hypervisor lives - e.g. an x86 triple fault
>>> where the guest screws up so badly that it just gets reset.  Is
>>> that a guest-reset or a guest-panic or what - neither case
>>> was actually asked for by the guest itself.
>>
>> Wouldn't that be host-error (qemu detected an error that prevents
>> further execution of the guest without a reset - and a triple fault
>> seems to fall into the category of the guest getting itself wedged
>> rather than actually trying to reset)?  Except patch 3 only used
>> SHUTDOWN_TYPE_HOST_ERROR in the xen portion of the patch.
>>
>> So if any x86 expert has an opinion on where triple-fault handling is
>> emulated, and what category should be used there, I'm open to
>> tweaking this series.
> 
> It's pretty much on the border anyway, I don't think it matters too
> much; it sounds perfectly reasonable.

Actually, reading
https://blogs.msdn.microsoft.com/larryosterman/2005/02/08/faster-syscall-trap-redux/
makes it sound like triple-fault = reset is exploited by existing OSes
(dating back to the days of targeting 286 machines), so it is bare-metal
behavior that we have to faithfully emulate as a guest-triggered reset,
and not something where the guest has wedged itself to the point where
qemu can no longer execute the guest.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org





Re: [Xen-devel] [PATCH v5 2/4] shutdown: Prepare for use of an enum in reset/shutdown_request

2017-04-28 Thread Dr. David Alan Gilbert
* Eric Blake (ebl...@redhat.com) wrote:
> On 04/28/2017 10:27 AM, Dr. David Alan Gilbert wrote:
> 
> >>>> +# Enumeration of various causes for shutdown.
> >>>> +#
> >>>> +# @host-qmp: Reaction to a QMP command, such as 'quit'
> >>>> +# @host-signal: Reaction to a signal, such as SIGINT
> >>>> +# @host-ui: Reaction to a UI event, such as closing the window
> >>>> +# @host-replay: The host is replaying an earlier shutdown event
> >>>> +# @host-error: Qemu encountered an error that prevents further use of the guest
> >>>> +# @guest-shutdown: The guest requested a shutdown, such as via ACPI or
> >>>> +#  other hardware-specific action
> >>>> +# @guest-reset: The guest requested a reset, and the command line
> >>>> +#   response to a reset is to instead trigger a shutdown
> >>>> +# @guest-panic: The guest panicked, and the command line response to
> >>>> +#   a panic is to trigger a shutdown
> >>>
> 
> > At a higher level, using your tags, I'm not sure where a reset triggered
> > by a fault detected by the hypervisor lives - e.g. an x86 triple fault
> > where the guest screws up so badly that it just gets reset.  Is
> > that a guest-reset or a guest-panic or what - neither case
> > was actually asked for by the guest itself.
> 
> Wouldn't that be host-error (qemu detected an error that prevents
> further execution of the guest without a reset - and a triple fault
> seems to fall into the category of the guest getting itself wedged
> rather than actually trying to reset)?  Except patch 3 only used
> SHUTDOWN_TYPE_HOST_ERROR in the xen portion of the patch.
> 
> So if any x86 expert has an opinion on where triple-fault handling is
> emulated, and what category should be used there, I'm open to
> tweaking this series.

It's pretty much on the border anyway, I don't think it matters too
much; it sounds perfectly reasonable.

Dave

> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.   +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org
> 



--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Xen-devel] [PATCH v5 2/4] shutdown: Prepare for use of an enum in reset/shutdown_request

2017-04-28 Thread Eric Blake
On 04/28/2017 10:27 AM, Dr. David Alan Gilbert wrote:

>>>> +# Enumeration of various causes for shutdown.
>>>> +#
>>>> +# @host-qmp: Reaction to a QMP command, such as 'quit'
>>>> +# @host-signal: Reaction to a signal, such as SIGINT
>>>> +# @host-ui: Reaction to a UI event, such as closing the window
>>>> +# @host-replay: The host is replaying an earlier shutdown event
>>>> +# @host-error: Qemu encountered an error that prevents further use of the guest
>>>> +# @guest-shutdown: The guest requested a shutdown, such as via ACPI or
>>>> +#  other hardware-specific action
>>>> +# @guest-reset: The guest requested a reset, and the command line
>>>> +#   response to a reset is to instead trigger a shutdown
>>>> +# @guest-panic: The guest panicked, and the command line response to
>>>> +#   a panic is to trigger a shutdown
>>>

> At a higher level, using your tags, I'm not sure where a reset triggered
> by a fault detected by the hypervisor lives - e.g. an x86 triple fault
> where the guest screws up so badly that it just gets reset.  Is
> that a guest-reset or a guest-panic or what - neither case
> was actually asked for by the guest itself.

Wouldn't that be host-error (qemu detected an error that prevents
further execution of the guest without a reset - and a triple fault
seems to fall into the category of the guest getting itself wedged
rather than actually trying to reset)?  Except patch 3 only used
SHUTDOWN_TYPE_HOST_ERROR in the xen portion of the patch.

So if any x86 expert has an opinion on where triple-fault handling is
emulated, and what category should be used there, I'm open to
tweaking this series.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org





Re: [Xen-devel] [PATCH v5 2/4] shutdown: Prepare for use of an enum in reset/shutdown_request

2017-04-28 Thread Dr. David Alan Gilbert
* Eric Blake (ebl...@redhat.com) wrote:
> On 04/28/2017 03:08 AM, Dr. David Alan Gilbert wrote:
> > * Eric Blake (ebl...@redhat.com) wrote:
> >> We want to track why a guest was shut down; in particular, being able
> >> to tell the difference between a guest request (such as ACPI request)
> >> and host request (such as SIGINT) will prove useful to libvirt.
> >> Since all requests eventually end up changing shutdown_requested in
> >> vl.c, the logical change is to make that value track the reason,
> >> rather than its current 0/1 contents.
> >>
> 
> >>  ##
> >> +# @ShutdownCause:
> >> +#
> >> +# Enumeration of various causes for shutdown.
> >> +#
> >> +# @host-qmp: Reaction to a QMP command, such as 'quit'
> >> +# @host-signal: Reaction to a signal, such as SIGINT
> >> +# @host-ui: Reaction to a UI event, such as closing the window
> >> +# @host-replay: The host is replaying an earlier shutdown event
> >> +# @host-error: Qemu encountered an error that prevents further use of the guest
> >> +# @guest-shutdown: The guest requested a shutdown, such as via ACPI or
> >> +#  other hardware-specific action
> >> +# @guest-reset: The guest requested a reset, and the command line
> >> +#   response to a reset is to instead trigger a shutdown
> >> +# @guest-panic: The guest panicked, and the command line response to
> >> +#   a panic is to trigger a shutdown
> > 
> > It's a little coarse-grained; is there any way to pass platform-specific
> > information for debug?  I ask because I spent a while debugging a few bugs
> > with unexpected resets and had to figure out which of x86's many reset
> > causes triggered it.
> 
> I'm open to any followup patches that add further enum values and
> adjust the various callers (patch 3 shows how MANY callers use
> qemu_system_shutdown_request).  But I don't think it's necessarily in
> scope for this series - remember, my goal here was merely to distinguish
> between host- and guest-triggered resets (which libvirt and higher
> management tasks want to know)

Yep, that's fine.

> rather than which of multiple reset
> paths was taken (I agree that it is useful during a qemu debug session -
> but that's a different audience).  I also don't consider myself an
> expert in the many ways that x86 can reset - it was easy to blindly
> rewrite qemu_system_shutdown_request() into
> qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN) based solely
> on directory, but it would be harder to distinguish which of the
> multiple files should have which finer-grained cause.

Yes, I'm also not an expert on x86 resets - but when I was debugging
I just added a tag in every place it called the reset code.
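
The call-site tagging described above might look roughly like the following. This is only a sketch of the debugging idea, not code from this series: request_reset_tagged(), REQUEST_RESET_TAGGED and last_reset_tag are hypothetical names, and the actual reset request is left as a comment.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical debug aid: remembers which call site last asked for a reset. */
static const char *last_reset_tag;

static void request_reset_tagged(const char *tag, const char *file, int line)
{
    assert(tag != NULL);
    last_reset_tag = tag;
    /* Log the tag together with the source location of the caller. */
    fprintf(stderr, "reset requested: %s (%s:%d)\n", tag, file, line);
    /* A real build would go on to issue the normal reset request here. */
}

/* Wrap the call so that every site reports its own file and line. */
#define REQUEST_RESET_TAGGED(tag) \
    request_reset_tagged((tag), __FILE__, __LINE__)
```

Each reset path would then call REQUEST_RESET_TAGGED("some-tag") with its own string, so the log shows which of the many paths fired.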

At a higher level, using your tags, I'm not sure where a reset triggered
by a fault detected by the hypervisor lives - e.g. an x86 triple fault
where the guest screws up so badly that it just gets reset.  Is
that a guest-reset or a guest-panic or what - neither case
was actually asked for by the guest itself.

Dave


> 
> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.   +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org
> 



--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Xen-devel] [PATCH v5 2/4] shutdown: Prepare for use of an enum in reset/shutdown_request

2017-04-28 Thread Eric Blake
On 04/28/2017 03:08 AM, Dr. David Alan Gilbert wrote:
> * Eric Blake (ebl...@redhat.com) wrote:
>> We want to track why a guest was shut down; in particular, being able
>> to tell the difference between a guest request (such as ACPI request)
>> and host request (such as SIGINT) will prove useful to libvirt.
>> Since all requests eventually end up changing shutdown_requested in
>> vl.c, the logical change is to make that value track the reason,
>> rather than its current 0/1 contents.
>>

>>  ##
>> +# @ShutdownCause:
>> +#
>> +# Enumeration of various causes for shutdown.
>> +#
>> +# @host-qmp: Reaction to a QMP command, such as 'quit'
>> +# @host-signal: Reaction to a signal, such as SIGINT
>> +# @host-ui: Reaction to a UI event, such as closing the window
>> +# @host-replay: The host is replaying an earlier shutdown event
>> +# @host-error: Qemu encountered an error that prevents further use of the guest
>> +# @guest-shutdown: The guest requested a shutdown, such as via ACPI or
>> +#  other hardware-specific action
>> +# @guest-reset: The guest requested a reset, and the command line
>> +#   response to a reset is to instead trigger a shutdown
>> +# @guest-panic: The guest panicked, and the command line response to
>> +#   a panic is to trigger a shutdown
> 
> It's a little coarse-grained; is there any way to pass platform-specific
> information for debug?  I ask because I spent a while debugging a few bugs
> with unexpected resets and had to figure out which of x86's many reset
> causes triggered it.

I'm open to any followup patches that add further enum values and
adjust the various callers (patch 3 shows how MANY callers use
qemu_system_shutdown_request).  But I don't think it's necessarily in
scope for this series - remember, my goal here was merely to distinguish
between host- and guest-triggered resets (which libvirt and higher
management tasks want to know), rather than which of multiple reset
paths was taken (I agree that it is useful during a qemu debug session -
but that's a different audience).  I also don't consider myself an
expert in the many ways that x86 can reset - it was easy to blindly
rewrite qemu_system_shutdown_request() into
qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN) based solely
on directory, but it would be harder to distinguish which of the
multiple files should have which finer-grained cause.
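
The host-versus-guest distinction described above can be sketched as follows. The enum mirrors the ShutdownCause values in the patch, but the C identifiers and the cause_is_guest() helper are assumptions made for illustration; they are not the names QAPI generates, and the helper is not part of this series.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative mirror of the proposed ShutdownCause values (names assumed). */
typedef enum {
    CAUSE_HOST_QMP,
    CAUSE_HOST_SIGNAL,
    CAUSE_HOST_UI,
    CAUSE_HOST_REPLAY,
    CAUSE_HOST_ERROR,
    CAUSE_GUEST_SHUTDOWN,
    CAUSE_GUEST_RESET,
    CAUSE_GUEST_PANIC,
} Cause;

/* The helper below relies on all host causes sorting before guest causes. */
static_assert(CAUSE_HOST_ERROR < CAUSE_GUEST_SHUTDOWN,
              "host causes must precede guest causes");

/* Hypothetical helper: true when the guest, not the host, was the trigger. */
static bool cause_is_guest(Cause c)
{
    return c >= CAUSE_GUEST_SHUTDOWN;
}
```

A management layer only needs the boolean; finer-grained causes could be added later without changing that check, as long as the ordering is kept.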

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org





Re: [Xen-devel] [PATCH v5 2/4] shutdown: Prepare for use of an enum in reset/shutdown_request

2017-04-28 Thread Dr. David Alan Gilbert
* Eric Blake (ebl...@redhat.com) wrote:
> We want to track why a guest was shut down; in particular, being able
> to tell the difference between a guest request (such as ACPI request)
> and host request (such as SIGINT) will prove useful to libvirt.
> Since all requests eventually end up changing shutdown_requested in
> vl.c, the logical change is to make that value track the reason,
> rather than its current 0/1 contents.
> 
> Since command-line options control whether a reset request is turned
> into a shutdown request instead, the same treatment is given to
> reset_requested.
> 
> This patch adds a QAPI enum ShutdownCause that describes reasons
> that a shutdown can be requested, and changes qemu_system_reset() to
> pass the reason through, although for now it is not reported.  The
> next patch will actually wire things up to modify events to report
> data based on the reason, and to pass the correct enum value in from
> various call-sites that can trigger a reset/shutdown.  Since QAPI
> generates enums starting at 0, it's easier if we use a different
> number as our sentinel that no request has happened yet.  Most of
> the changes are in vl.c, but xen was using things externally.
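
The sentinel described in the commit message above can be sketched as follows. This is a minimal illustration using C11 atomics rather than QEMU's own atomic_xchg(); NO_REQUEST, pending_cause and both function names are made up for the example.

```c
#include <assert.h>
#include <stdatomic.h>

/* Enum values start at 0, so -1 can never collide with a real cause. */
#define NO_REQUEST (-1)

static atomic_int pending_cause = NO_REQUEST;

/* Record that a shutdown was requested and why. */
static void request_shutdown(int cause)
{
    assert(cause >= 0);            /* must not look like the sentinel */
    atomic_store(&pending_cause, cause);
}

/* Return the pending cause and clear it, or NO_REQUEST if none is pending. */
static int take_shutdown_request(void)
{
    return atomic_exchange(&pending_cause, NO_REQUEST);
}
```

With 0 as the "nothing pending" value, a request whose cause happens to be the first enum member would be indistinguishable from no request at all, which is why the checks in the diff change from `r` to `r >= 0`.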
> 
> Signed-off-by: Eric Blake 
> 
> ---
> v4: s/ShutdownType/ShutdownCause/, no thanks to mingw header pollution
> v3: new patch
> ---
>  qapi-schema.json        | 23 +++
>  include/sysemu/sysemu.h |  2 +-
>  vl.c                    | 44
>  hw/i386/xen/xen-hvm.c   |  9 ++---
>  migration/colo.c        |  2 +-
>  migration/savevm.c      |  2 +-
>  6 files changed, 60 insertions(+), 22 deletions(-)
> 
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 01b087f..a4ebdd1 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -2304,6 +2304,29 @@
>  { 'command': 'system_powerdown' }
> 
>  ##
> +# @ShutdownCause:
> +#
> +# Enumeration of various causes for shutdown.
> +#
> +# @host-qmp: Reaction to a QMP command, such as 'quit'
> +# @host-signal: Reaction to a signal, such as SIGINT
> +# @host-ui: Reaction to a UI event, such as closing the window
> +# @host-replay: The host is replaying an earlier shutdown event
> +# @host-error: Qemu encountered an error that prevents further use of the guest
> +# @guest-shutdown: The guest requested a shutdown, such as via ACPI or
> +#  other hardware-specific action
> +# @guest-reset: The guest requested a reset, and the command line
> +#   response to a reset is to instead trigger a shutdown
> +# @guest-panic: The guest panicked, and the command line response to
> +#   a panic is to trigger a shutdown

It's a little coarse-grained; is there any way to pass platform-specific
information for debug?  I ask because I spent a while debugging a few bugs
with unexpected resets and had to figure out which of x86's many reset
causes triggered it.

Dave

> +# Since: 2.10
> +##
> +{ 'enum': 'ShutdownCause',
> +  'data': [ 'host-qmp', 'host-signal', 'host-ui', 'host-replay', 'host-error',
> +            'guest-shutdown', 'guest-reset', 'guest-panic' ] }
> +
> +##
>  # @cpu:
>  #
>  # This command is a nop that is only provided for the purposes of compatibility.
> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> index 16175f7..00a907f 100644
> --- a/include/sysemu/sysemu.h
> +++ b/include/sysemu/sysemu.h
> @@ -65,7 +65,7 @@ bool qemu_vmstop_requested(RunState *r);
>  int qemu_shutdown_requested_get(void);
>  int qemu_reset_requested_get(void);
>  void qemu_system_killed(int signal, pid_t pid);
> -void qemu_system_reset(bool report);
> +void qemu_system_reset(bool report, int reason);
>  void qemu_system_guest_panicked(GuestPanicInformation *info);
>  size_t qemu_target_page_size(void);
> 
> diff --git a/vl.c b/vl.c
> index 879786a..2b95b7f 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1597,8 +1597,8 @@ void vm_state_notify(int running, RunState state)
>  }
>  }
> 
> -static int reset_requested;
> -static int shutdown_requested, shutdown_signal;
> +static int reset_requested = -1;
> +static int shutdown_requested = -1, shutdown_signal;
>  static pid_t shutdown_pid;
>  static int powerdown_requested;
>  static int debug_requested;
> @@ -1624,7 +1624,7 @@ int qemu_reset_requested_get(void)
> 
>  static int qemu_shutdown_requested(void)
>  {
> -    return atomic_xchg(&shutdown_requested, 0);
> +    return atomic_xchg(&shutdown_requested, -1);
>  }
> 
>  static void qemu_kill_report(void)
> @@ -1650,11 +1650,11 @@ static void qemu_kill_report(void)
>  static int qemu_reset_requested(void)
>  {
>      int r = reset_requested;
> -    if (r && replay_checkpoint(CHECKPOINT_RESET_REQUESTED)) {
> -        reset_requested = 0;
> +    if (r >= 0 && replay_checkpoint(CHECKPOINT_RESET_REQUESTED)) {
> +        reset_requested = -1;
>          return r;
>      }
> -    return false;
> +    return -1;
>  }
> 
>  static int qemu_suspend_requested(void)
> @@ -1686,7 +1686,12 @@