Re: Is espfix64's double-fault thing OK on Xen?

2014-07-14 Thread H. Peter Anvin
On 07/14/2014 07:46 PM, Andy Lutomirski wrote:
> 
> On espfix-less kernels (Xen and non-Xen), 16-bit CS w/ 16-bit SS
> always fails.  Native (32-bit or 64-bit, according to the binary) CS
> with 16-bit SS fails for sigreturn_32, but passes for sigreturn_64.  I
> find this somewhat odd.  Native SS always passes.
> 

espfix32 is disabled on Xen.

-hpa

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Is espfix64's double-fault thing OK on Xen?

2014-07-14 Thread Andy Lutomirski
On Mon, Jul 14, 2014 at 7:46 PM, Andy Lutomirski  wrote:
> On Mon, Jul 14, 2014 at 3:23 PM, H. Peter Anvin  wrote:
>> On 07/14/2014 02:35 PM, Andy Lutomirski wrote:
>>> Presumably the problem is here:
>>>
>>> ENTRY(xen_iret)
>>>         pushq $0
>>> 1:      jmp hypercall_iret
>>> ENDPATCH(xen_iret)
>>>
>>> This seems rather unlikely to work on the espfix stack.
>>>
>>> Maybe espfix64 should be disabled when running on Xen and Xen should
>>> implement its own espfix64 in the hypervisor.
>>
>> Perhaps the first question is: is espfix even necessary on Xen?  How
>> does the Xen PV IRET handle returning to a 16-bit stack segment?
>>
>
> Test case here:
>
> https://gitorious.org/linux-test-utils/linux-clock-tests/source/dbfe196a0f6efedc119deb1cdbb0139dbdf609ee:
>
> It's sigreturn_32 and sigreturn_64.  Summary:
>
> (sigreturn_64 always fails unless my SS patch is applied.  Results
> below for sigreturn_64 assume the patch is applied.  This is on KVM
> (-cpu host) on Sandy Bridge.)
>
> On Xen with espfix, both OOPS intermittently.
>
> On espfix-less kernels (Xen and non-Xen), 16-bit CS w/ 16-bit SS
> always fails.  Native (32-bit or 64-bit, according to the binary) CS
> with 16-bit SS fails for sigreturn_32, but passes for sigreturn_64.  I
> find this somewhat odd.  Native SS always passes.
>
> So I think that Xen makes no difference here, aside from the bug.
>
> That being said, I don't know whether Linux can do espfix64 at all
> when Xen is running -- for all I know, the IRET hypercall switches
> stacks to a Xen stack.

Microcode is weird.  Without espfix:

[RUN] 64-bit CS (33), 32-bit SS (2b)
    SP: 8badf00d5aadc0de -> 8badf00d5aadc0de
[OK] all registers okay
[RUN] 32-bit CS (23), 32-bit SS (2b)
    SP: 8badf00d5aadc0de -> 5aadc0de
[OK] all registers okay
[RUN] 16-bit CS (7), 32-bit SS (2b)
    SP: 8badf00d5aadc0de -> 5aadc0de
[OK] all registers okay
[RUN] 64-bit CS (33), 16-bit SS (f)
    SP: 8badf00d5aadc0de -> 8badf00d5aadc0de
[OK] all registers okay
[RUN] 32-bit CS (23), 16-bit SS (f)
    SP: 8badf00d5aadc0de -> 5ae3c0de
[FAIL] Reg 15 mismatch: requested 0x8badf00d5aadc0de; got 0x5ae3c0de
[RUN] 16-bit CS (7), 16-bit SS (f)
    SP: 8badf00d5aadc0de -> 5ae3c0de
[FAIL] Reg 15 mismatch: requested 0x8badf00d5aadc0de; got 0x5ae3c0de
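
For what it's worth, here is a toy model consistent with those numbers. This is an assumption inferred from the test output, not documented behavior: returning to a 16-bit SS appears to restore only the low 16 bits of SP, leak bits 16-31 from whatever ESP held in the kernel, and zero the top half. The 0x5ae30000 value below is a hypothetical leaked kernel ESP chosen to match the FAIL lines.

```python
REQUESTED_SP = 0x8badf00d5aadc0de  # value sigreturn asks the kernel to restore
LEAKED_ESP = 0x5ae30000            # hypothetical leftover bits 16-31 of the kernel's ESP

def iret_to_16bit_ss(requested_sp, leaked_esp):
    # Model: low 16 bits come from the IRET frame, bits 16-31 leak from
    # whatever ESP held, and bits 32-63 end up zero.
    return (leaked_esp & 0xFFFF0000) | (requested_sp & 0xFFFF)

print(hex(iret_to_16bit_ss(REQUESTED_SP, LEAKED_ESP)))  # 0x5ae3c0de, matching the FAIL lines
```

If that model is right, the 0x5ae3 bits are whatever happened to be on the kernel stack, which is exactly the leak espfix exists to hide.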

--Andy


Re: Is espfix64's double-fault thing OK on Xen?

2014-07-14 Thread Andy Lutomirski
On Mon, Jul 14, 2014 at 3:23 PM, H. Peter Anvin  wrote:
> On 07/14/2014 02:35 PM, Andy Lutomirski wrote:
>> Presumably the problem is here:
>>
>> ENTRY(xen_iret)
>>         pushq $0
>> 1:      jmp hypercall_iret
>> ENDPATCH(xen_iret)
>>
>> This seems rather unlikely to work on the espfix stack.
>>
>> Maybe espfix64 should be disabled when running on Xen and Xen should
>> implement its own espfix64 in the hypervisor.
>
> Perhaps the first question is: is espfix even necessary on Xen?  How
> does the Xen PV IRET handle returning to a 16-bit stack segment?
>

Test case here:

https://gitorious.org/linux-test-utils/linux-clock-tests/source/dbfe196a0f6efedc119deb1cdbb0139dbdf609ee:

It's sigreturn_32 and sigreturn_64.  Summary:

(sigreturn_64 always fails unless my SS patch is applied.  Results
below for sigreturn_64 assume the patch is applied.  This is on KVM
(-cpu host) on Sandy Bridge.)

On Xen with espfix, both OOPS intermittently.

On espfix-less kernels (Xen and non-Xen), 16-bit CS w/ 16-bit SS
always fails.  Native (32-bit or 64-bit, according to the binary) CS
with 16-bit SS fails for sigreturn_32, but passes for sigreturn_64.  I
find this somewhat odd.  Native SS always passes.

So I think that Xen makes no difference here, aside from the bug.

That being said, I don't know whether Linux can do espfix64 at all
when Xen is running -- for all I know, the IRET hypercall switches
stacks to a Xen stack.

--Andy


Re: Is espfix64's double-fault thing OK on Xen?

2014-07-14 Thread H. Peter Anvin
On 07/14/2014 02:35 PM, Andy Lutomirski wrote:
> Presumably the problem is here:
> 
> ENTRY(xen_iret)
>         pushq $0
> 1:      jmp hypercall_iret
> ENDPATCH(xen_iret)
> 
> This seems rather unlikely to work on the espfix stack.
> 
> Maybe espfix64 should be disabled when running on Xen and Xen should
> implement its own espfix64 in the hypervisor.

Perhaps the first question is: is espfix even necessary on Xen?  How
does the Xen PV IRET handle returning to a 16-bit stack segment?

-hpa




Re: Is espfix64's double-fault thing OK on Xen?

2014-07-14 Thread Andy Lutomirski
On Mon, Jul 14, 2014 at 2:31 PM, Andy Lutomirski  wrote:
> I'm now rather confused.
>
> On Xen 64-bit, AFAICS, syscall handlers run with CS = 0xe033.  I think
> that Xen is somehow fixing up traps that came from "kernel" mode to
> show CS = 0xe030, which is an impossible selector value (unless that
> segment is conforming) to keep user_mode_vm happy.
>
> I'm running this test:
>
> https://gitorious.org/linux-test-utils/linux-clock-tests/source/1e13516a41416a7282f43c83097c9dfe4619344b:sigreturn.c
>
> It requires a kernel with my SS sigcontext change; otherwise it
> doesn't do anything.
>
> Without Xen, it works reliably.  On Xen, it seems to OOPS some
> fraction of the time.  It gets a null pointer dereference here:
>
> movq %rax,(0*8)(%rdi)    /* RAX */
>
> It looks like:
>
> [0.565752] BUG: unable to handle kernel NULL pointer dereference at (null)
> [0.566706] IP: [] irq_return_ldt+0x11/0x5c
> [0.566706] PGD 4eb40067 PUD 4eb38067 PMD 0
> [0.566706] Oops: 0002 [#1] SMP
> [0.566706] Modules linked in:
> [0.566706] CPU: 1 PID: 81 Comm: sigreturn Not tainted 3.16.0-rc4+ #47
> [0.566706] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [0.566706] task: 88004e8aa180 ti: 88004eb68000 task.ti: 88004eb68000
> [0.566706] RIP: e030:[]  [] irq_return_ldt+0x11/0x5c
> [0.566706] RSP: e02b:88004eb6bfc8  EFLAGS: 00010002
> [0.566706] RAX:  RBX:  RCX: 
> [0.566706] RDX: 000a RSI: 0051 RDI: 
> [0.566706] RBP: 006d3018 R08:  R09: 
> [0.566706] R10: 0008 R11: 0202 R12: 
> [0.566706] R13: 0001 R14: 0040eec0 R15: 
> [0.566706] FS:  (0063) GS:88005630() knlGS:
> [0.566706] CS:  e033 DS: 000f ES: 000f CR0: 80050033
> [0.566706] CR2:  CR3: 4eb3c000 CR4: 00042660
> [0.566706] Stack:
> [0.566706]  0051   0007
> [0.566706]  0202 8badf00d5aad 000f
> [0.566706] Call Trace:
> [0.566706] Code: 44 24 20 04 75 14 e9 9d 5a 89 ff 90 66 66 66 2e 0f 1f 84 00 00 00 00 00 48 cf 50 57 66 66 90 66 66 90 65 48 8b 3c 25 00 b0 00 00 <48> 89 07 48 8b 44 24 10 48 89 47 08 48 8b 44 24 18 48 89 47 10
> [0.566706] RIP  [] irq_return_ldt+0x11/0x5c
> [0.566706]  RSP 
> [0.566706] CR2: 
> [0.566706] ---[ end trace a62b7f28ce379a48 ]---
>
> When it doesn't OOPS, it segfaults.  I don't know why.  I suspect that
> Xen either has a bug in modify_ldt, sigreturn, or iret when returning
> to a CS that lives in the LDT.

Presumably the problem is here:

ENTRY(xen_iret)
        pushq $0
1:      jmp hypercall_iret
ENDPATCH(xen_iret)

This seems rather unlikely to work on the espfix stack.

Maybe espfix64 should be disabled when running on Xen and Xen should
implement its own espfix64 in the hypervisor.

--Andy

>
>
> --Andy



-- 
Andy Lutomirski
AMA Capital Management, LLC


Re: Is espfix64's double-fault thing OK on Xen?

2014-07-14 Thread Andy Lutomirski
I'm now rather confused.

On Xen 64-bit, AFAICS, syscall handlers run with CS = 0xe033.  I think
that Xen is somehow fixing up traps that came from "kernel" mode to
show CS = 0xe030, which is an impossible selector value (unless that
segment is conforming) to keep user_mode_vm happy.
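
The selector arithmetic behind that fixup, for reference (assuming a user_mode-style check that just looks at the requested privilege level in the low two bits of CS):

```python
def rpl(selector):
    # The low two bits of an x86 segment selector are the RPL.
    return selector & 3

print(rpl(0xe033))  # 3: what the syscall handler actually runs with
print(rpl(0xe030))  # 0: what Xen reports, so CS-based checks see "kernel mode"
```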

I'm running this test:

https://gitorious.org/linux-test-utils/linux-clock-tests/source/1e13516a41416a7282f43c83097c9dfe4619344b:sigreturn.c

It requires a kernel with my SS sigcontext change; otherwise it
doesn't do anything.

Without Xen, it works reliably.  On Xen, it seems to OOPS some
fraction of the time.  It gets a null pointer dereference here:

movq %rax,(0*8)(%rdi)    /* RAX */

It looks like:

[0.565752] BUG: unable to handle kernel NULL pointer dereference at (null)
[0.566706] IP: [] irq_return_ldt+0x11/0x5c
[0.566706] PGD 4eb40067 PUD 4eb38067 PMD 0
[0.566706] Oops: 0002 [#1] SMP
[0.566706] Modules linked in:
[0.566706] CPU: 1 PID: 81 Comm: sigreturn Not tainted 3.16.0-rc4+ #47
[0.566706] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[0.566706] task: 88004e8aa180 ti: 88004eb68000 task.ti: 88004eb68000
[0.566706] RIP: e030:[]  [] irq_return_ldt+0x11/0x5c
[0.566706] RSP: e02b:88004eb6bfc8  EFLAGS: 00010002
[0.566706] RAX:  RBX:  RCX: 
[0.566706] RDX: 000a RSI: 0051 RDI: 
[0.566706] RBP: 006d3018 R08:  R09: 
[0.566706] R10: 0008 R11: 0202 R12: 
[0.566706] R13: 0001 R14: 0040eec0 R15: 
[0.566706] FS:  (0063) GS:88005630() knlGS:
[0.566706] CS:  e033 DS: 000f ES: 000f CR0: 80050033
[0.566706] CR2:  CR3: 4eb3c000 CR4: 00042660
[0.566706] Stack:
[0.566706]  0051   0007
[0.566706]  0202 8badf00d5aad 000f
[0.566706] Call Trace:
[0.566706] Code: 44 24 20 04 75 14 e9 9d 5a 89 ff 90 66 66 66 2e 0f 1f 84 00 00 00 00 00 48 cf 50 57 66 66 90 66 66 90 65 48 8b 3c 25 00 b0 00 00 <48> 89 07 48 8b 44 24 10 48 89 47 08 48 8b 44 24 18 48 89 47 10
[0.566706] RIP  [] irq_return_ldt+0x11/0x5c
[0.566706]  RSP 
[0.566706] CR2: 
[0.566706] ---[ end trace a62b7f28ce379a48 ]---

When it doesn't OOPS, it segfaults.  I don't know why.  I suspect that
Xen either has a bug in modify_ldt, sigreturn, or iret when returning
to a CS that lives in the LDT.


--Andy


Re: Is espfix64's double-fault thing OK on Xen?

2014-07-14 Thread Andy Lutomirski
On Mon, Jul 14, 2014 at 10:11 AM, Andy Lutomirski  wrote:
> On Mon, Jul 14, 2014 at 10:04 AM, H. Peter Anvin  wrote:
>> On 07/09/2014 04:17 PM, Andy Lutomirski wrote:
>>> This part in __do_double_fault looks fishy:
>>>
>>> cmpl $__KERNEL_CS,CS(%rdi)
>>> jne do_double_fault
>>>
>>> Shouldn't that be:
>>>
>>> test $3,CS(%rdi)
>>> jnz do_double_fault
>>>
>>
>> No, it should be fine.  The *only* case where we need to do the espfix
>> magic is when we are on __KERNEL_CS.
>>
>
> IIRC Xen has a somewhat different GDT, and at least the userspace CS
> in IA32_STAR disagrees with normal Linux.  If the kernel CS is also
> strange, then there will be an extra possible CS value here.

There's FLAT_KERNEL_CS64, which is not equal to __KERNEL_CS.  If the
espfix mechanism gets invoked with that CS, then I expect that
something unexpected will happen.

That being said, FLAT_KERNEL_CS64 is CPL3, so my code might not be any better.

--Andy


Re: Is espfix64's double-fault thing OK on Xen?

2014-07-14 Thread Andy Lutomirski
On Mon, Jul 14, 2014 at 10:04 AM, H. Peter Anvin  wrote:
> On 07/09/2014 04:17 PM, Andy Lutomirski wrote:
>> This part in __do_double_fault looks fishy:
>>
>> cmpl $__KERNEL_CS,CS(%rdi)
>> jne do_double_fault
>>
>> Shouldn't that be:
>>
>> test $3,CS(%rdi)
>> jnz do_double_fault
>>
>
> No, it should be fine.  The *only* case where we need to do the espfix
> magic is when we are on __KERNEL_CS.
>

IIRC Xen has a somewhat different GDT, and at least the userspace CS
in IA32_STAR disagrees with normal Linux.  If the kernel CS is also
strange, then there will be an extra possible CS value here.
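
A sketch of the difference between the two checks (0x10 for __KERNEL_CS is the standard Linux 64-bit GDT value, and 0xe030 is the kernel-mode CS that Xen PV reports, per earlier in this thread; both are assumptions here rather than verified constants):

```python
KERNEL_CS = 0x10  # assumed standard Linux 64-bit __KERNEL_CS

def espfix_check_exact(cs):
    # cmpl $__KERNEL_CS, CS(%rdi): espfix path only for the exact selector
    return cs == KERNEL_CS

def espfix_check_cpl(cs):
    # test $3, CS(%rdi): espfix path for any CPL-0 selector (RPL bits clear)
    return (cs & 3) == 0

print(espfix_check_exact(0x10), espfix_check_cpl(0x10))      # True True
print(espfix_check_exact(0xe030), espfix_check_cpl(0xe030))  # False True
```

So a non-standard CPL-0 kernel CS would slip past the exact-match check but be caught by the CPL test.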

--Andy


Re: Is espfix64's double-fault thing OK on Xen?

2014-07-14 Thread H. Peter Anvin
On 07/09/2014 04:17 PM, Andy Lutomirski wrote:
> This part in __do_double_fault looks fishy:
> 
> cmpl $__KERNEL_CS,CS(%rdi)
> jne do_double_fault
> 
> Shouldn't that be:
> 
> test $3,CS(%rdi)
> jnz do_double_fault
> 

No, it should be fine.  The *only* case where we need to do the espfix
magic is when we are on __KERNEL_CS.

-hpa




Re: Is espfix64's double-fault thing OK on Xen?

2014-07-14 Thread Konrad Rzeszutek Wilk
On Wed, Jul 09, 2014 at 04:17:57PM -0700, Andy Lutomirski wrote:
> This part in __do_double_fault looks fishy:
> 
> cmpl $__KERNEL_CS,CS(%rdi)
> jne do_double_fault
> 
> Shouldn't that be:
> 
> test $3,CS(%rdi)
> jnz do_double_fault
> 

Let me rope in David, who was playing with that recently.

> --Andy


Is espfix64's double-fault thing OK on Xen?

2014-07-09 Thread Andy Lutomirski
This part in __do_double_fault looks fishy:

cmpl $__KERNEL_CS,CS(%rdi)
jne do_double_fault

Shouldn't that be:

test $3,CS(%rdi)
jnz do_double_fault

--Andy