On 14/10/2020 17:28, Roger Pau Monné wrote:
> On Fri, Oct 09, 2020 at 12:53:01PM +0100, Andrew Cooper wrote:
>> Despite appearing to be a deliberate design choice of early PV64, the
>> resulting behaviour for unregistered SYSCALL callbacks creates an untenable
>> testability problem for Xen.  Furthermore, the behaviour is undocumented,
>> bizarre, and inconsistent with related behaviour in Xen, and very liable
>> introduce a security vulnerability into a PV guest if the author hasn't
>> studied Xen's assembly code in detail.
>>
>> There are two different bugs here.
>>
>> 1) The current logic confuses the registered entrypoints, and may deliver a
>>    SYSCALL from 32bit userspace to the 64bit entry, when only a 64bit
>>    entrypoint is registered.
>>
>>    This has been the case ever since 2007 (c/s cd75d47348b) but up until
>>    2018 (c/s dba899de14) the wrong selectors would be handed to the guest for
>>    a 32bit SYSCALL entry, making it appear as if it a 64bit entry all along.
>>
>>    Xen would malfunction under these circumstances, if it were a PV guest.
>>    Linux would as well, but PVOps has always registered both entrypoints and
>>    discarded the Xen-provided selectors.  NetBSD really does malfunction as a
>>    consequence (benignly now, but a VM DoS before the 2018 Xen selector fix).
>>
>> 2) In the case that neither SYSCALL callbacks are registered, the guest will
>>    be crashed when userspace executes a SYSCALL instruction, which is a
>>    userspace => kernel DoS.
>>
>>    This has been the case ever since the introduction of 64bit PV support, 
>> but
>>    behaves unlike all other SYSCALL/SYSENTER callbacks in Xen, which yield
>>    #GP/#UD in userspace before the callback is registered, and are therefore
>>    safe by default.
>>
>> This change does constitute a change in the PV ABI, for corner cases of a PV
>> guest kernel registering neither callback, or not registering the 32bit
>> callback when running on AMD/Hygon hardware.
>>
>> It brings the behaviour in line with PV32 SYSCALL/SYSENTER, and PV64
>> SYSENTER (safe by default, until explicitly enabled), as well as native
>> hardware (always delivered to the single applicable callback).
>>
>> Most importantly however, and the primary reason for the change, is that it
>> lets us sensibly test the fast system call entrypoints under all states a PV
>> guest can construct, to prove correct behaviour.
>>
>> Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
> Acked-by: Roger Pau Monné <roger....@citrix.com>
>
>> ---
>> CC: Jan Beulich <jbeul...@suse.com>
>> CC: Roger Pau Monné <roger....@citrix.com>
>> CC: Wei Liu <w...@xen.org>
>> CC: Andy Lutomirski <l...@kernel.org>
>> CC: Manuel Bouyer <bou...@antioche.eu.org>
>>
>> v2:
>>  * Drop unnecessary instruction suffixes
>>  * Don't truncate #UD entrypoint to 32 bits
>>
>> Manuel: This will result in a corner case change for NetBSD.
>>
>> At the moment on native, 32bit userspace on 64bit NetBSD will get #UD (Intel,
>> etc), or an explicit -ENOSYS (AMD, etc) when trying to execute a 32bit 
>> SYSCALL
>> instruction.
>>
>> After this change, a 64bit PV VM will consistently see #UD (like on Intel, 
>> etc
>> hardware) even when running on AMD/Hygon hardware (as Xsyscall32 isn't
>> registered with Xen), rather than following Xsyscall into the proper system
>> call path.
>> ---
>>  xen/arch/x86/x86_64/entry.S | 26 +++++++++++++++++++-------
>>  1 file changed, 19 insertions(+), 7 deletions(-)
>>
>> diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
>> index 000eb9722b..aaf8402f93 100644
>> --- a/xen/arch/x86/x86_64/entry.S
>> +++ b/xen/arch/x86/x86_64/entry.S
>> @@ -26,18 +26,30 @@
>>  /* %rbx: struct vcpu */
>>  ENTRY(switch_to_kernel)
>>          leaq  VCPU_trap_bounce(%rbx),%rdx
>> -        /* TB_eip = (32-bit syscall && syscall32_addr) ?
>> -         *          syscall32_addr : syscall_addr */
>> -        xor   %eax,%eax
>> +
>> +        /* TB_eip = 32-bit syscall ? syscall32_addr : syscall_addr */
>> +        mov   VCPU_syscall32_addr(%rbx), %ecx
> This being an unsigned long field, shouldn't you use %rcx here?

Yes I should.  Sorry - thought I'd fixed all of these.  I'll ad higher
half handlers to the XTF test.

~Andrew

Reply via email to