Re: [Xen-devel] [PATCH 0/3] xen/arm: Inject an exception to the guest rather than crashing it

2018-01-31 Thread Julien Grall

Hi Stefano,

On 30/01/18 19:21, Stefano Stabellini wrote:

On Tue, 30 Jan 2018, Julien Grall wrote:

Hi,

On 30/01/18 18:29, Andrew Cooper wrote:

On 30/01/18 17:00, Julien Grall wrote:



On 30/01/18 16:38, Andrew Cooper wrote:

On 30/01/18 16:14, Julien Grall wrote:

Hi all,

This small series replaces all call to domain_crash_synchronous by
injecting
an exception to the guest.

This will result to a nicer trace from the guest (no need to
manually walk
the stack) and give a chance to the guest to give a bit more
information on
what it was doing.

Cheers,

Julien Grall (3):
     xen/arm: io: Distinguish unhandled IO from aborted one
     xen/arm: Don't crash domain on bad MMIO emulation
     xen/arm: Don't crash the domain on invalid HVC immediate


Thanks.

I don't feel qualified to review these, but some notes.

Patch 1.  s/avodi/avoid/ in the commit message

Patches 2 and 3.  You probably want to convert the printks to
gdprintk()s, otherwise guests can choke up the ratelimited log.  Doing
so will also mean that the vcpu will be identified consistently, which
it isn't currently.

We didn't use g*printk because it would be more confusing to print the
current vCPU in some cases (e.g when accessing the re-distributor of
another vCPU) or does not matter (e.g for ITS).


In the former case, you'd want to print both current, and the target
vcpu.  The latter still matters what current is if something goes wrong.

We have plenty of similar cases in x86, but at the point you are
printing an diagnostic message, ignoring current is almost always the
wrong think to do.


I will look at it on another series.





The problem with the debug version is those information are actually
quite useful in non-debug build. We found quite a few issues thanks to
them.

I think it would make more sense for Xen to provide per-guest
ratelimited than hiding those messages in non-debug build.


Per guest is quite a lot more complicated than global, and would still
require a global limit to prevent a concerted attack from multiple
guests to avoid DoSing the system.

Debug vs unilateral is your prerogative as a maintainer, but as you've
said yourself, the are used for debugging purposes, which proves my point.


So on x86, you always request the user to reproduce it with debug build
enable?

Stefano, what's your opinion here?


I think it would be great to have per-guest rate limiting.

On ARM, it would be impossible to ask users to repro with debug enabled,
given that many deployments are embedded (set-top boxes, etc).

I think it is OK to keep the XENLOG_DEBUG, they should be filtered out
by loglvl. But we need to be careful about the others. We might want to
convert the XENLOG_WARNING in traps.c to XENLOG_DEBUG: they are warning
about guests misbehaving, but from the hypervisor point of view, it's not
a problem, guests can shoot themselves if they want to; it's OK.


Not really. Some of the XENLOG_WARNING may happen because Xen does not 
handle a trap (see the one on do_trap_guest_sync). That has nothing to 
do with a guest misbehaving and could happen if Xen boots on newer hardware.


Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 0/3] xen/arm: Inject an exception to the guest rather than crashing it

2018-01-30 Thread Andrew Cooper
On 30/01/18 19:21, Stefano Stabellini wrote:
> On Tue, 30 Jan 2018, Julien Grall wrote:
>> Hi,
>>
>> On 30/01/18 18:29, Andrew Cooper wrote:
>>> On 30/01/18 17:00, Julien Grall wrote:

 On 30/01/18 16:38, Andrew Cooper wrote:
> On 30/01/18 16:14, Julien Grall wrote:
>> Hi all,
>>
>> This small series replaces all call to domain_crash_synchronous by
>> injecting
>> an exception to the guest.
>>
>> This will result to a nicer trace from the guest (no need to
>> manually walk
>> the stack) and give a chance to the guest to give a bit more
>> information on
>> what it was doing.
>>
>> Cheers,
>>
>> Julien Grall (3):
>>     xen/arm: io: Distinguish unhandled IO from aborted one
>>     xen/arm: Don't crash domain on bad MMIO emulation
>>     xen/arm: Don't crash the domain on invalid HVC immediate
> Thanks.
>
> I don't feel qualified to review these, but some notes.
>
> Patch 1.  s/avodi/avoid/ in the commit message
>
> Patches 2 and 3.  You probably want to convert the printks to
> gdprintk()s, otherwise guests can choke up the ratelimited log.  Doing
> so will also mean that the vcpu will be identified consistently, which
> it isn't currently.
 We didn't use g*printk because it would be more confusing to print the
 current vCPU in some cases (e.g when accessing the re-distributor of
 another vCPU) or does not matter (e.g for ITS).
>>> In the former case, you'd want to print both current, and the target
>>> vcpu.  The latter still matters what current is if something goes wrong.
>>>
>>> We have plenty of similar cases in x86, but at the point you are
>>> printing an diagnostic message, ignoring current is almost always the
>>> wrong think to do.
>> I will look at it on another series.
>>
 The problem with the debug version is those information are actually
 quite useful in non-debug build. We found quite a few issues thanks to
 them.

 I think it would make more sense for Xen to provide per-guest
 ratelimited than hiding those messages in non-debug build.
>>> Per guest is quite a lot more complicated than global, and would still
>>> require a global limit to prevent a concerted attack from multiple
>>> guests to avoid DoSing the system.
>>>
>>> Debug vs unilateral is your prerogative as a maintainer, but as you've
>>> said yourself, the are used for debugging purposes, which proves my point.
>> So on x86, you always request the user to reproduce it with debug build
>> enable?
>>
>> Stefano, what's your opinion here?
> I think it would be great to have per-guest rate limiting.
>
> On ARM, it would be impossible to ask users to repro with debug enabled,
> given that many deployments are embedded (set-top boxes, etc).
>
> I think it is OK to keep the XENLOG_DEBUG, they should be filtered out
> by loglvl. But we need to be careful about the others. We might want to
> convert the XENLOG_WARNING in traps.c to XENLOG_DEBUG: they are warning
> about guests misbehaving, but from the hypervisor point of view, it's not
> a problem, guests can shoot themselves if they want to; it's OK.

This is a valid argument, but to do it consistently, you'll want to
provide an arch-specific version of gdprintk() so that the common code
printk()s don't evaporate into /dev/null.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 0/3] xen/arm: Inject an exception to the guest rather than crashing it

2018-01-30 Thread Stefano Stabellini
On Tue, 30 Jan 2018, Julien Grall wrote:
> Hi,
> 
> On 30/01/18 18:29, Andrew Cooper wrote:
> > On 30/01/18 17:00, Julien Grall wrote:
> > > 
> > > 
> > > On 30/01/18 16:38, Andrew Cooper wrote:
> > > > On 30/01/18 16:14, Julien Grall wrote:
> > > > > Hi all,
> > > > > 
> > > > > This small series replaces all call to domain_crash_synchronous by
> > > > > injecting
> > > > > an exception to the guest.
> > > > > 
> > > > > This will result to a nicer trace from the guest (no need to
> > > > > manually walk
> > > > > the stack) and give a chance to the guest to give a bit more
> > > > > information on
> > > > > what it was doing.
> > > > > 
> > > > > Cheers,
> > > > > 
> > > > > Julien Grall (3):
> > > > >     xen/arm: io: Distinguish unhandled IO from aborted one
> > > > >     xen/arm: Don't crash domain on bad MMIO emulation
> > > > >     xen/arm: Don't crash the domain on invalid HVC immediate
> > > > 
> > > > Thanks.
> > > > 
> > > > I don't feel qualified to review these, but some notes.
> > > > 
> > > > Patch 1.  s/avodi/avoid/ in the commit message
> > > > 
> > > > Patches 2 and 3.  You probably want to convert the printks to
> > > > gdprintk()s, otherwise guests can choke up the ratelimited log.  Doing
> > > > so will also mean that the vcpu will be identified consistently, which
> > > > it isn't currently.
> > > We didn't use g*printk because it would be more confusing to print the
> > > current vCPU in some cases (e.g when accessing the re-distributor of
> > > another vCPU) or does not matter (e.g for ITS).
> > 
> > In the former case, you'd want to print both current, and the target
> > vcpu.  The latter still matters what current is if something goes wrong.
> > 
> > We have plenty of similar cases in x86, but at the point you are
> > printing an diagnostic message, ignoring current is almost always the
> > wrong think to do.
> 
> I will look at it on another series.
> 
> > 
> > > 
> > > The problem with the debug version is those information are actually
> > > quite useful in non-debug build. We found quite a few issues thanks to
> > > them.
> > > 
> > > I think it would make more sense for Xen to provide per-guest
> > > ratelimited than hiding those messages in non-debug build.
> > 
> > Per guest is quite a lot more complicated than global, and would still
> > require a global limit to prevent a concerted attack from multiple
> > guests to avoid DoSing the system.
> > 
> > Debug vs unilateral is your prerogative as a maintainer, but as you've
> > said yourself, the are used for debugging purposes, which proves my point.
> 
> So on x86, you always request the user to reproduce it with debug build
> enable?
> 
> Stefano, what's your opinion here?

I think it would be great to have per-guest rate limiting.

On ARM, it would be impossible to ask users to repro with debug enabled,
given that many deployments are embedded (set-top boxes, etc).

I think it is OK to keep the XENLOG_DEBUG, they should be filtered out
by loglvl. But we need to be careful about the others. We might want to
convert the XENLOG_WARNING in traps.c to XENLOG_DEBUG: they are warning
about guests misbehaving, but from the hypervisor point of view, it's not
a problem, guests can shoot themselves if they want to; it's OK.___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 0/3] xen/arm: Inject an exception to the guest rather than crashing it

2018-01-30 Thread Andrew Cooper
On 30/01/18 18:46, Julien Grall wrote:
> Hi,
>
> On 30/01/18 18:29, Andrew Cooper wrote:
>> On 30/01/18 17:00, Julien Grall wrote:
>>>
>>>
>>> On 30/01/18 16:38, Andrew Cooper wrote:
 On 30/01/18 16:14, Julien Grall wrote:
> Hi all,
>
> This small series replaces all call to domain_crash_synchronous by
> injecting
> an exception to the guest.
>
> This will result to a nicer trace from the guest (no need to
> manually walk
> the stack) and give a chance to the guest to give a bit more
> information on
> what it was doing.
>
> Cheers,
>
> Julien Grall (3):
>     xen/arm: io: Distinguish unhandled IO from aborted one
>     xen/arm: Don't crash domain on bad MMIO emulation
>     xen/arm: Don't crash the domain on invalid HVC immediate

 Thanks.

 I don't feel qualified to review these, but some notes.

 Patch 1.  s/avodi/avoid/ in the commit message

 Patches 2 and 3.  You probably want to convert the printks to
 gdprintk()s, otherwise guests can choke up the ratelimited log.  Doing
 so will also mean that the vcpu will be identified consistently, which
 it isn't currently.
>>> We didn't use g*printk because it would be more confusing to print the
>>> current vCPU in some cases (e.g when accessing the re-distributor of
>>> another vCPU) or does not matter (e.g for ITS).
>>
>> In the former case, you'd want to print both current, and the target
>> vcpu.  The latter still matters what current is if something goes wrong.
>>
>> We have plenty of similar cases in x86, but at the point you are
>> printing an diagnostic message, ignoring current is almost always the
>> wrong think to do.
>
> I will look at it on another series.

Fair enough.

>
>>
>>>
>>> The problem with the debug version is those information are actually
>>> quite useful in non-debug build. We found quite a few issues thanks to
>>> them.
>>>
>>> I think it would make more sense for Xen to provide per-guest
>>> ratelimited than hiding those messages in non-debug build.
>>
>> Per guest is quite a lot more complicated than global, and would still
>> require a global limit to prevent a concerted attack from multiple
>> guests to avoid DoSing the system.
>>
>> Debug vs unilateral is your prerogative as a maintainer, but as you've
>> said yourself, the are used for debugging purposes, which proves my
>> point.
>
> So on x86, you always request the user to reproduce it with debug
> build enable?

That is very specific to what goes wrong, but no, we generally don't
have release-build messages for "the guest screwed up and we hit it
expected way for doing so".  We do (well - are trying to, which is how I
found this domain_crash_sync issue in the first place) get consistent at
printing useful release-build messages for abnormal termination of the
guest.

In terms of making it easier to debug, XenServer always ships with a
release and debug hypervisor built from the same source so customers can
trivially switch into the debug version and collect logs if we think
that is a useful thing to do, but this is only used in a fraction of
cases in the first place.

Anything more complicated is definitely going to require an experienced
human to investigate, at which point they can do whatever they want.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 0/3] xen/arm: Inject an exception to the guest rather than crashing it

2018-01-30 Thread Andrew Cooper
On 30/01/18 17:00, Julien Grall wrote:
>
>
> On 30/01/18 16:38, Andrew Cooper wrote:
>> On 30/01/18 16:14, Julien Grall wrote:
>>> Hi all,
>>>
>>> This small series replaces all call to domain_crash_synchronous by
>>> injecting
>>> an exception to the guest.
>>>
>>> This will result to a nicer trace from the guest (no need to
>>> manually walk
>>> the stack) and give a chance to the guest to give a bit more
>>> information on
>>> what it was doing.
>>>
>>> Cheers,
>>>
>>> Julien Grall (3):
>>>    xen/arm: io: Distinguish unhandled IO from aborted one
>>>    xen/arm: Don't crash domain on bad MMIO emulation
>>>    xen/arm: Don't crash the domain on invalid HVC immediate
>>
>> Thanks.
>>
>> I don't feel qualified to review these, but some notes.
>>
>> Patch 1.  s/avodi/avoid/ in the commit message
>>
>> Patches 2 and 3.  You probably want to convert the printks to
>> gdprintk()s, otherwise guests can choke up the ratelimited log.  Doing
>> so will also mean that the vcpu will be identified consistently, which
>> it isn't currently.
> We didn't use g*printk because it would be more confusing to print the
> current vCPU in some cases (e.g when accessing the re-distributor of
> another vCPU) or does not matter (e.g for ITS).

In the former case, you'd want to print both current, and the target
vcpu.  The latter still matters what current is if something goes wrong.

We have plenty of similar cases in x86, but at the point you are
printing an diagnostic message, ignoring current is almost always the
wrong think to do.

>
> The problem with the debug version is those information are actually
> quite useful in non-debug build. We found quite a few issues thanks to
> them.
>
> I think it would make more sense for Xen to provide per-guest
> ratelimited than hiding those messages in non-debug build.

Per guest is quite a lot more complicated than global, and would still
require a global limit to prevent a concerted attack from multiple
guests to avoid DoSing the system.

Debug vs unilateral is your prerogative as a maintainer, but as you've
said yourself, the are used for debugging purposes, which proves my point.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 0/3] xen/arm: Inject an exception to the guest rather than crashing it

2018-01-30 Thread Julien Grall



On 30/01/18 16:38, Andrew Cooper wrote:

On 30/01/18 16:14, Julien Grall wrote:

Hi all,

This small series replaces all call to domain_crash_synchronous by injecting
an exception to the guest.

This will result to a nicer trace from the guest (no need to manually walk
the stack) and give a chance to the guest to give a bit more information on
what it was doing.

Cheers,

Julien Grall (3):
   xen/arm: io: Distinguish unhandled IO from aborted one
   xen/arm: Don't crash domain on bad MMIO emulation
   xen/arm: Don't crash the domain on invalid HVC immediate


Thanks.

I don't feel qualified to review these, but some notes.

Patch 1.  s/avodi/avoid/ in the commit message

Patches 2 and 3.  You probably want to convert the printks to
gdprintk()s, otherwise guests can choke up the ratelimited log.  Doing
so will also mean that the vcpu will be identified consistently, which
it isn't currently.
We didn't use g*printk because it would be more confusing to print the 
current vCPU in some cases (e.g when accessing the re-distributor of 
another vCPU) or does not matter (e.g for ITS).


The problem with the debug version is those information are actually 
quite useful in non-debug build. We found quite a few issues thanks to them.


I think it would make more sense for Xen to provide per-guest 
ratelimited than hiding those messages in non-debug build.




~Andrew



--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 0/3] xen/arm: Inject an exception to the guest rather than crashing it

2018-01-30 Thread Andrew Cooper
On 30/01/18 16:14, Julien Grall wrote:
> Hi all,
>
> This small series replaces all call to domain_crash_synchronous by injecting
> an exception to the guest.
>
> This will result to a nicer trace from the guest (no need to manually walk
> the stack) and give a chance to the guest to give a bit more information on
> what it was doing.
>
> Cheers,
>
> Julien Grall (3):
>   xen/arm: io: Distinguish unhandled IO from aborted one
>   xen/arm: Don't crash domain on bad MMIO emulation
>   xen/arm: Don't crash the domain on invalid HVC immediate

Thanks.

I don't feel qualified to review these, but some notes.

Patch 1.  s/avodi/avoid/ in the commit message

Patches 2 and 3.  You probably want to convert the printks to
gdprintk()s, otherwise guests can choke up the ratelimited log.  Doing
so will also mean that the vcpu will be identified consistently, which
it isn't currently.

~Andrew

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 0/3] xen/arm: Inject an exception to the guest rather than crashing it

2018-01-30 Thread Julien Grall
Hi all,

This small series replaces all call to domain_crash_synchronous by injecting
an exception to the guest.

This will result to a nicer trace from the guest (no need to manually walk
the stack) and give a chance to the guest to give a bit more information on
what it was doing.

Cheers,

Julien Grall (3):
  xen/arm: io: Distinguish unhandled IO from aborted one
  xen/arm: Don't crash domain on bad MMIO emulation
  xen/arm: Don't crash the domain on invalid HVC immediate

 xen/arch/arm/io.c  | 24 +--
 xen/arch/arm/traps.c   | 47 ++
 xen/arch/arm/vgic-v2.c |  2 --
 xen/arch/arm/vgic-v3-its.c |  3 ---
 xen/arch/arm/vgic-v3.c |  8 
 xen/arch/arm/vpl011.c  |  2 --
 xen/include/asm-arm/mmio.h |  9 -
 7 files changed, 53 insertions(+), 42 deletions(-)

-- 
2.11.0


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel