Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread Linus Torvalds


On Sun, 4 Nov 2007, Eric W. Biederman wrote:
> 
> I do seem to recall etherboot having a far jump in that spot and it
> working on everything from a 386 on up.  So I'm not certain if the
> kind of jump matters.  Still the kernel has a lot more exposure.

I actually suspect you could have just about anything in there, including 
just a couple of nops, or just avoiding certain instructions for a few 
cycles.

The i386/i486 pipeline isn't actually all that long (I can't find it here, 
but I want to say it was just five stages), and the whole/only issue with 
writing to cr0 on those CPU's is literally that there isn't any forwarding 
of the cr0 state, so any instruction that actually depends on the cr0 
value needs to have that value stable in the register by the time it 
executes.

So I literally suspect that just a couple of no-ops in between the move to 
cr0 and any instruction that depends on the state of the PE bit would be 
ok. And there aren't that many instructions that do, it's generally just 
the ones that load a segment that can care.

But I'd actually be worried about a ljmp directly after the "move to 
cr0", exactly because an ljmp actually does have semantic dependencies on 
the PE bit. But it's quite likely that ljmp is microcoded (it takes 12+ 
cycles even in real mode), and since microcode was nonpipelined, that 
would hide it.

But "move to segment" is definitely *not* microcoded in real mode (it's 
documented as just two cycles for reg->seg), so I'm not at all surprised 
that "mov->cr0" followed immediately by "mov->seg" will not work.

In short:

 - far jumps are in the "dangerous instruction" category after a change to 
   PE. I would suggest not using it, although I also suspect that it 
   probably works if only because it's probably microcoded on at least an 
   i386.

 - instead of a short taken jump, you can almost certainly use anything 
   that is microcoded or just otherwise takes enough cycles (where 
   "enough" is likely in the 5-10 range) to make sure the writeback to CR0 
   is stable by the time any instruction uses it.

 - almost anything that doesn't actually involve a segment descriptor 
   lookup is probably not going to care at all about the value of PE. The 
   PE bit really doesn't affect all that much of the x86 instruction set, 
   and if an instruction doesn't care, it doesn't matter whether it's 
   executed with the old or the new value.


Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
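
Note on the message above (an illustration, not part of Linus's mail): the argument is that a write to CR0 is not forwarded, so a later instruction sees the new PE value only once the write has settled. Below is a minimal toy model in C. It assumes a single hypothetical `stable_after` latency; the mail only guesses "5-10" cycles and no documented figure is implied. All names are invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model only, not a description of real i386/i486 hardware.
 * A "mov to CR0" issues at cycle 0; the new PE value becomes visible
 * stable_after cycles later, and nothing forwards it any earlier.
 */
struct toy_cr0_write {
    bool old_pe;        /* PE before the write */
    bool new_pe;        /* PE once the write has settled */
    int  stable_after;  /* hypothetical settle time in cycles */
};

/* PE value observed by an instruction that consults CR0 state
 * cycles_after_write cycles after the write issued. */
static bool toy_pe_seen(const struct toy_cr0_write *w, int cycles_after_write)
{
    assert(cycles_after_write >= 0);
    return cycles_after_write >= w->stable_after ? w->new_pe : w->old_pe;
}
```

With `stable_after` set to 5, an instruction that consults PE two cycles after the write still sees the old value in this model, while one that consults it twelve cycles later sees the new one. That matches the intuition in the mail but says nothing about real silicon.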


Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread Eric W. Biederman
"H. Peter Anvin" <[EMAIL PROTECTED]> writes:

> Linus Torvalds wrote:
>>
>> And Linux always did it correctly. I don't understand why you disagree, and
>> why Jeremy says
>>
>>  "Having successfully broken the rules for a long time so far, maybe
>> we can get away with still cutting corners..."
>>
>> when the fact is, we used to *not* cut corners, we used to *not* break the
>> rules, and what we used to do (a short jump immediately after setting PE) was
>> exactly what Intel always said you should do, and there is no question
>> what-so-ever about it.
>>
>
> Apparently because the Intel documentation disagrees with itself. That's all.

Yes.  Let's go back to the tested version with the short jump, that
looks safest as it is what we have always done, and we certainly need some
kind of jump in there.

I do seem to recall etherboot having a far jump in that spot and it
working on everything from a 386 on up.  So I'm not certain if the
kind of jump matters.  Still the kernel has a lot more exposure.

At the same time it does look like we really do enter protected mode
with a valid gdt after the short jump so doing the segments loads as
I did originally in 32bit mode looks like it was excessively
conservative.

Eric


Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread H. Peter Anvin

H. Peter Anvin wrote:
> 
> Apparently because the Intel documentation disagrees with itself. That's 
> all.
> 

Just to be perfectly clear: I much prefer the code with the short (near) 
jump, because it keeps the code cleaner.  I have sent a patch to Mikael 
to test out.


-hpa


Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread Eric W. Biederman
"H. Peter Anvin" <[EMAIL PROTECTED]> writes:

> Hi Linus; please pull:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup.git
> for-linus
>
> H. Peter Anvin (1):
>   x86 setup: correct booting on 486DX4
>
>  arch/x86/boot/pmjump.S |   32 +++++++++++++++++++++-----------
>  1 files changed, 21 insertions(+), 11 deletions(-)
>
> [Full diff and log follows]

Looks reasonable to me.

> commit ac3b37b78c5f0f0be0b476a35370650f7bad482f
> Author: H. Peter Anvin <[EMAIL PROTECTED]>
> Date:   Sun Nov 4 14:33:41 2007 -0800
>
> x86 setup: correct booting on 486DX4
> 
> Apparently, the 486DX4 does not correctly serialize a mov to %cr0, so
> we really do need the far jump immediately afterwards.  This means
> losing the nice separation between 16- and 32-bit code, but c'est la
> vie.
> 
> Also pass %ebx = %edi = %ebp = 0 to support future extension of the
> 32-bit boot protocol.
> 
> Signed-off-by: H. Peter Anvin <[EMAIL PROTECTED]>
>
> diff --git a/arch/x86/boot/pmjump.S b/arch/x86/boot/pmjump.S
> index 2e55923..17e6dec 100644
> --- a/arch/x86/boot/pmjump.S
> +++ b/arch/x86/boot/pmjump.S
> @@ -28,27 +28,37 @@
>   * void protected_mode_jump(u32 entrypoint, u32 bootparams);
>   */
>  protected_mode_jump:
> -	xorl	%ebx, %ebx		# Flag to indicate this is a boot
>  	movl	%edx, %esi		# Pointer to boot_params table
> -	movl	%eax, 2f		# Patch ljmpl instruction
> +
> +	xorl	%edx, %edx
> +	movw	%cs, %dx
> +	shll	$4, %edx		# Patch ljmpl instruction
> +	addl	%edx, 2f
>  	jmp	1f			# Short jump to flush instruction q.
>  
>  1:
>  	movw	$__BOOT_DS, %cx
> +	xorl	%ebx, %ebx		# Per protocol
> +	xorl	%ebp, %ebp		# Per protocol
> +	xorl	%edi, %edi		# Per protocol
>  
>  	movl	%cr0, %edx
>  	orb	$1, %dl			# Protected mode (PE) bit
>  	movl	%edx, %cr0
> +	
> +	.byte	0x66, 0xea		# ljmpl opcode
> +2:	.long	3f			# Offset
> +	.word	__BOOT_CS		# Segment
>  
> -	movw	%cx, %ds
> -	movw	%cx, %es
> -	movw	%cx, %fs
> -	movw	%cx, %gs
> -	movw	%cx, %ss
> +	.code32
> +3:
> +	movl	%ecx, %ds
> +	movl	%ecx, %es
> +	movl	%ecx, %fs
> +	movl	%ecx, %gs
> +	movl	%ecx, %ss
>  
>  	# Jump to the 32-bit entrypoint
> -	.byte	0x66, 0xea		# ljmpl opcode
> -2:	.long	0			# offset
> -	.word	__BOOT_CS		# segment
> -
> +	jmpl	*%eax
> +	
>  	.size	protected_mode_jump, .-protected_mode_jump
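
For readers unfamiliar with the hand-assembled jump in the quoted patch: the `.byte 0x66, 0xea` / `.long` / `.word` sequence spells out a far jump with a 32-bit offset and a 16-bit selector, as encoded from 16-bit code. A small C sketch of that byte layout follows. The function name and the selector value used in any example call are assumptions for illustration, not taken from the kernel source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* prefix + opcode + 4-byte offset + 2-byte selector */
#define LJMPL16_LEN 8u

/*
 * Hypothetical helper: writes the encoding of "ljmpl $selector, $offset"
 * as it appears in 16-bit code and returns the number of bytes written.
 */
static size_t encode_ljmpl16(uint8_t *buf, size_t buflen,
                             uint32_t offset, uint16_t selector)
{
    assert(buf != NULL && buflen >= LJMPL16_LEN);

    buf[0] = 0x66;                       /* operand-size prefix: 32-bit offset */
    buf[1] = 0xea;                       /* far JMP, immediate pointer */
    for (int i = 0; i < 4; i++)          /* offset, little-endian (.long) */
        buf[2 + i] = (uint8_t)(offset >> (8 * i));
    buf[6] = (uint8_t)(selector & 0xff); /* selector, little-endian (.word) */
    buf[7] = (uint8_t)(selector >> 8);
    return LJMPL16_LEN;
}
```

Because the offset is stored as ordinary data at label `2:`, the setup code can patch it at run time before the jump executes, which is what the `addl %edx, 2f` line in the patch relies on.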


Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread H. Peter Anvin

Linus Torvalds wrote:
> 
> And Linux always did it correctly. I don't understand why you disagree, 
> and why Jeremy says
> 
> 	"Having successfully broken the rules for a long time so far, 
> 	 maybe we can get away with still cutting corners..."
> 
> when the fact is, we used to *not* cut corners, we used to *not* break the 
> rules, and what we used to do (a short jump immediately after setting PE) 
> was exactly what Intel always said you should do, and there is no question 
> what-so-ever about it.
> 

Apparently because the Intel documentation disagrees with itself. 
That's all.


-hpa


Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread Linus Torvalds


On Sun, 4 Nov 2007, H. Peter Anvin wrote:
> 
> It's not an instruction-decoding issue at all (that's a 16- vs 32-bit issue,
> which can only be changed by a ljmp).  Apparently the 486DX4 mis-executes the
> load to segment register, which is an EU function in that context.  (And yes,
> it's sort-of-documented behaviour in the sense that the documentation says "do
> things this way", but the Intel docs are unfortunately full of "do things this
> way" which don't make sense and occasionally are actively harmful, too.)

I still disagree.

I took out "Programming the 80386" just to check, and the documentation 
very clearly states that when changing the CR0 bits (I quote):

"The program must execute a jump instruction immediately after 
 changing the value of the PE bit in order to flush the execution 
 pipeline of any instructions that may have been fetched in the 
 wrong mode. [...]"

In other words, not only is this documented since day 1, it makes total 
sense, and they even said exactly *why* that jump had to be done.

In fact, there's even a code example. It's page 624 in my copy of the 
book, and yes, it has a short jump to flush things, followed by a long 
jump. The code there looks like this:

; *
; ** [4] Enter Protected Mode
; *
SMSW AX
OR   AX, PE
LMSW AX
JMP  Flush
Flush:
JMP far ptr Start32


which is pretty damn conclusive. It's documented, it has examples, it 
works. In other words, it's how you should do things.

And Linux always did it correctly. I don't understand why you disagree, 
and why Jeremy says

"Having successfully broken the rules for a long time so far, 
 maybe we can get away with still cutting corners..."

when the fact is, we used to *not* cut corners, we used to *not* break the 
rules, and what we used to do (a short jump immediately after setting PE) 
was exactly what Intel always said you should do, and there is no question 
what-so-ever about it.

So here's a suggestion:

 - make the code do what it used to do. A regular jump to flush the 
   pipeline. Which is what Intel has always said should be done.

and I really don't see that there is any argument about this. 

Linus
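
Note on the message above (an illustration, not part of Linus's mail): the quoted Intel example sets PE through the machine status word (`SMSW`/`OR`/`LMSW`), whereas the original Linux snippet quoted elsewhere in this thread loads the constant 1 with `lmsw`. The C sketch below models the architectural effect of `LMSW` on CR0: it loads only the low four bits (PE, MP, EM, TS) and can set PE but not clear it. Function and macro names are invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_PE        0x1u  /* protection enable, bit 0 */
#define MSW_LMSW_MASK 0xFu  /* PE, MP, EM, TS: the bits LMSW can load */

/* SMSW: read the low 16 bits of CR0 (the machine status word). */
static uint16_t model_smsw(uint32_t cr0)
{
    return (uint16_t)(cr0 & 0xFFFFu);
}

/* LMSW: load bits 0-3 from src; an already-set PE bit stays set. */
static uint32_t model_lmsw(uint32_t cr0, uint16_t src)
{
    uint32_t low = (src & MSW_LMSW_MASK) | (cr0 & CR0_PE);
    uint32_t result = (cr0 & ~MSW_LMSW_MASK) | low;

    assert((result & CR0_PE) >= (cr0 & CR0_PE)); /* PE is never cleared */
    return result;
}
```

In this model, `model_lmsw(cr0, model_smsw(cr0) | CR0_PE)` sets PE and leaves MP, EM and TS as they were, while `model_lmsw(cr0, 1)` sets PE and clears those three bits. Either way the jump discussed in the thread still has to follow.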


Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread H. Peter Anvin

Linus Torvalds wrote:
> 
> On Sun, 4 Nov 2007, Linus Torvalds wrote:
>> I'm not entirely sure that it needs to be a long-jump, btw. I think any
>> regular branch is sufficient. You obviously *do* need to make the long 
>> jump later (to reload %cs in protected mode), but I'm not sure it's needed 
>> in that place. I forget the exact rules (but they definitely were 
>> documented).
> 
> Hmm. The original Linux code did
> 
> 	movw	$1, %ax
> 	lmsw	%ax
> 	jmp	flush_instr
> flush_instr:
> 
> and I think that was straight out of the documentation. So yeah, I think 
> that's the right fix - not a longjmp (which in itself is dangerous: it 
> potentially behaves *differently* on different CPU's, since some CPU's may 
> do the long jump with pre-protected-mode semantics, while others will do 
> it with protected mode already in effect!)
> 

Just looked it up; it was a bit hard to find (it is Intel vol 3 page 
9-27, at least in the version I have), but you're right -- the 
documentation only demands a short jump here, not a long jmp (which 
actually makes sense given what I remembered that a long jump should be 
deferrable here.)  So yes, that is definitely the right fix and avoids 
the ugly mixing of code.


I'll update the patch.

-hpa


Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread H. Peter Anvin

Linus Torvalds wrote:
> 
> On Sun, 4 Nov 2007, H. Peter Anvin wrote:
>> Apparently, the 486DX4 does not correctly serialize a mov to %cr0, so
>> we really do need the far jump immediately afterwards.
> 
> Hmm. I'm not sure I agree with the commit message.
> 
> This is documented behaviour on i386 and i486: instruction decoding is 
> decoupled from execution, so things that change processor mode have to do 
> a jump to make sure that %cr0 changes take effect.
> 

It's not an instruction-decoding issue at all (that's a 16- vs 32-bit 
issue, which can only be changed by a ljmp).  Apparently the 486DX4 
mis-executes the load to segment register, which is an EU function in 
that context.  (And yes, it's sort-of-documented behaviour in the sense 
that the documentation says "do things this way", but the Intel docs are 
unfortunately full of "do things this way" which don't make sense and 
occasionally are actively harmful, too.)

> I'm not entirely sure that it needs to be a long-jump, btw. I think any
> regular branch is sufficient. You obviously *do* need to make the long 
> jump later (to reload %cs in protected mode), but I'm not sure it's needed 
> in that place. I forget the exact rules (but they definitely were 
> documented).

That's exactly the issue here.  The code without this patch deferred the 
long jump until after the segment loads; this worked on all processors 
except, apparently, the 486DX4.  Hence, move the ljmp up to the earliest 
possible location.


-hpa


Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread Jeremy Fitzhardinge
Linus Torvalds wrote:
> I'm not entirely sure that it needs to be a long-jump, btw. I think any
> regular branch is sufficient. You obviously *do* need to make the long 
> jump later (to reload %cs in protected mode), but I'm not sure it's needed 
> in that place. I forget the exact rules (but they definitely were 
> documented).

Yes, it says it needs to be a far jmp or call (and if you enabled paging
in the cr0 load, you need to identity-map the branch target).  Having
successfully broken the rules for a long time so far, maybe we can get
away with still cutting corners...  but it doesn't seem particularly
worthwhile since we've been caught once.

J


Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread Linus Torvalds


On Sun, 4 Nov 2007, Linus Torvalds wrote:
> 
> I'm not entirely sure that it needs to be a long-jump, btw. I think any
> regular branch is sufficient. You obviously *do* need to make the long 
> jump later (to reload %cs in protected mode), but I'm not sure it's needed 
> in that place. I forget the exact rules (but they definitely were 
> documented).

Hmm. The original Linux code did

	movw	$1, %ax
	lmsw	%ax
	jmp	flush_instr
flush_instr:

and I think that was straight out of the documentation. So yeah, I think 
that's the right fix - not a longjmp (which in itself is dangerous: it 
potentially behaves *differently* on different CPU's, since some CPU's may 
do the long jump with pre-protected-mode semantics, while others will do 
it with protected mode already in effect!)

Linus


Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread Linus Torvalds


On Sun, 4 Nov 2007, H. Peter Anvin wrote:
> 
> Apparently, the 486DX4 does not correctly serialize a mov to %cr0, so
> we really do need the far jump immediately afterwards.

Hmm. I'm not sure I agree with the commit message.

This is documented behaviour on i386 and i486: instruction decoding is 
decoupled from execution, so things that change processor mode have to do 
a jump to make sure that %cr0 changes take effect.

I'm not entirely sure that it needs to be a long-jump, btw. I think any
regular branch is sufficient. You obviously *do* need to make the long 
jump later (to reload %cs in protected mode), but I'm not sure it's needed 
in that place. I forget the exact rules (but they definitely were 
documented).

Linus


[GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread H. Peter Anvin
Hi Linus; please pull:

  git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup.git 
for-linus

H. Peter Anvin (1):
  x86 setup: correct booting on 486DX4

 arch/x86/boot/pmjump.S |   32 +++++++++++++++++++++-----------
 1 files changed, 21 insertions(+), 11 deletions(-)

[Full diff and log follows]

commit ac3b37b78c5f0f0be0b476a35370650f7bad482f
Author: H. Peter Anvin <[EMAIL PROTECTED]>
Date:   Sun Nov 4 14:33:41 2007 -0800

x86 setup: correct booting on 486DX4

Apparently, the 486DX4 does not correctly serialize a mov to %cr0, so
we really do need the far jump immediately afterwards.  This means
losing the nice separation between 16- and 32-bit code, but c'est la
vie.

Also pass %ebx = %edi = %ebp = 0 to support future extension of the
32-bit boot protocol.

Signed-off-by: H. Peter Anvin <[EMAIL PROTECTED]>

diff --git a/arch/x86/boot/pmjump.S b/arch/x86/boot/pmjump.S
index 2e55923..17e6dec 100644
--- a/arch/x86/boot/pmjump.S
+++ b/arch/x86/boot/pmjump.S
@@ -28,27 +28,37 @@
  * void protected_mode_jump(u32 entrypoint, u32 bootparams);
  */
 protected_mode_jump:
-	xorl	%ebx, %ebx		# Flag to indicate this is a boot
 	movl	%edx, %esi		# Pointer to boot_params table
-	movl	%eax, 2f		# Patch ljmpl instruction
+
+	xorl	%edx, %edx
+	movw	%cs, %dx
+	shll	$4, %edx		# Patch ljmpl instruction
+	addl	%edx, 2f
 	jmp	1f			# Short jump to flush instruction q.
 
 1:
 	movw	$__BOOT_DS, %cx
+	xorl	%ebx, %ebx		# Per protocol
+	xorl	%ebp, %ebp		# Per protocol
+	xorl	%edi, %edi		# Per protocol
 
 	movl	%cr0, %edx
 	orb	$1, %dl			# Protected mode (PE) bit
 	movl	%edx, %cr0
+	
+	.byte	0x66, 0xea		# ljmpl opcode
+2:	.long	3f			# Offset
+	.word	__BOOT_CS		# Segment
 
-	movw	%cx, %ds
-	movw	%cx, %es
-	movw	%cx, %fs
-	movw	%cx, %gs
-	movw	%cx, %ss
+	.code32
+3:
+	movl	%ecx, %ds
+	movl	%ecx, %es
+	movl	%ecx, %fs
+	movl	%ecx, %gs
+	movl	%ecx, %ss
 
 	# Jump to the 32-bit entrypoint
-	.byte	0x66, 0xea		# ljmpl opcode
-2:	.long	0			# offset
-	.word	__BOOT_CS		# segment
-
+	jmpl	*%eax
+	
 	.size	protected_mode_jump, .-protected_mode_jump


[GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread H. Peter Anvin
Hi Linus; please pull:

  git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup.git 
for-linus

H. Peter Anvin (1):
  x86 setup: correct booting on 486DX4

 arch/x86/boot/pmjump.S |   32 +---
 1 files changed, 21 insertions(+), 11 deletions(-)

[Full diff and log follows]

commit ac3b37b78c5f0f0be0b476a35370650f7bad482f
Author: H. Peter Anvin [EMAIL PROTECTED]
Date:   Sun Nov 4 14:33:41 2007 -0800

x86 setup: correct booting on 486DX4

Apparently, the 486DX4 does not correctly serialize a mov to %cr0, so
we really do need the far jump immediately afterwards.  This means
losing the nice separation between 16- and 32-bit code, but c'est la
vie.

Also pass %ebx = %edi = %ebp = 0 to support future extension of the
32-bit boot protocol.

Signed-off-by: H. Peter Anvin [EMAIL PROTECTED]

diff --git a/arch/x86/boot/pmjump.S b/arch/x86/boot/pmjump.S
index 2e55923..17e6dec 100644
--- a/arch/x86/boot/pmjump.S
+++ b/arch/x86/boot/pmjump.S
@@ -28,27 +28,37 @@
  * void protected_mode_jump(u32 entrypoint, u32 bootparams);
  */
 protected_mode_jump:
-   xorl%ebx, %ebx  # Flag to indicate this is a boot
movl%edx, %esi  # Pointer to boot_params table
-   movl%eax, 2f# Patch ljmpl instruction
+
+   xorl%edx, %edx
+   movw%cs, %dx
+   shll$4, %edx# Patch ljmpl instruction
+   addl%edx, 2f
jmp 1f  # Short jump to flush instruction q.
 
 1:
movw$__BOOT_DS, %cx
+   xorl%ebx, %ebx  # Per protocol
+   xorl%ebp, %ebp  # Per protocol
+   xorl%edi, %edi  # Per protocol
 
movl%cr0, %edx
orb $1, %dl # Protected mode (PE) bit
movl%edx, %cr0
+   
+   .byte   0x66, 0xea  # ljmpl opcode
+2: .long   3f  # Offset
+   .word   __BOOT_CS   # Segment
 
-   movw%cx, %ds
-   movw%cx, %es
-   movw%cx, %fs
-   movw%cx, %gs
-   movw%cx, %ss
+   .code32
+3:
+   movl%ecx, %ds
+   movl%ecx, %es
+   movl%ecx, %fs
+   movl%ecx, %gs
+   movl%ecx, %ss
 
# Jump to the 32-bit entrypoint
-   .byte   0x66, 0xea  # ljmpl opcode
-2: .long   0   # offset
-   .word   __BOOT_CS   # segment
-
+   jmpl*%eax
+   
.size   protected_mode_jump, .-protected_mode_jump
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread Linus Torvalds


On Sun, 4 Nov 2007, H. Peter Anvin wrote:
 
 Apparently, the 486DX4 does not correctly serialize a mov to %cr0, so
 we really do need the far jump immediately afterwards.

Hmm. I'm not sure I agree with the commit message.

This is documented behaviour on i386 and i486: instruction decoding is 
decoupled from execution, so things that change processor mode have to do 
a jump to make sure that %cr0 changes take effect.

I'm not entirely sure that it needs to be a long-jump, btw. I think any
regular branch is sufficient. You obviously *do* need to make the long 
jump later (to reload %cs in protected mode), but I'm not sure it's needed 
in that place. I forget the exact rules (but they definitely were 
documented).

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread Linus Torvalds


On Sun, 4 Nov 2007, Linus Torvalds wrote:
 
 I'm not entirely sure that it needs to be a long-jump, btw. I think any
 regular branch is sufficient. You obviously *do* need to make the long 
 jump later (to reload %cs in protected mode), but I'm not sure it's needed 
 in that place. I forget the exact rules (but they definitely were 
 documented).

Hmm. The original Linux code did

movw$1, %ax
lmsw%ax
jmp flush_instr
flush_instr:

and I think that was straigh out of the documentation. So yeah, I think 
that's the right fix - not a longjmp (which in itself is dangerous: it 
potentially behaves *differently* on different CPU's, since some CPU's may 
do the long jump with pre-protected-mode semantics, while others will do 
it with protected mode already in effect!)

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread Jeremy Fitzhardinge
Linus Torvalds wrote:
 I'm not entirely sure that it needs to be a long-jump, btw. I think any
 regular branch is sufficient. You obviously *do* need to make the long 
 jump later (to reload %cs in protected mode), but I'm not sure it's needed 
 in that place. I forget the exact rules (but they definitely were 
 documented).

Yes, it says it needs to be a far jmp or call (and if you enabled paging
in the cr0 load, you need to identity-map the branch target).  Having
successfully broken the rules for a long time so far, maybe we can get
away with still cutting corners...  but it doesn't seem particularly
worthwhile since we've been caught once.

J
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread H. Peter Anvin

Linus Torvalds wrote:


On Sun, 4 Nov 2007, H. Peter Anvin wrote:

Apparently, the 486DX4 does not correctly serialize a mov to %cr0, so

we really do need the far jump immediately afterwards.


Hmm. I'm not sure I agree with the commit message.

This is documented behaviour on i386 and i486: instruction decoding is 
decoupled from execution, so things that change processor mode have to do 
a jump to make sure that %cr0 changes take effect.




It's not an instruction-decoding issue at all (that's a 16- vs 32-bit 
issue, which can only be changed by a ljmp).  Apparently the 486DX4 
mis-executes the load to segment register, which is an EU function in 
that context.  (And yes, it's sort-of-documented behaviour in the sense 
that the documentation says do things this way, but the Intel docs are 
unfortunately full of do things this way which don't make sense and 
occasionally are actively harmful, too.)



I'm not entirely sure that it needs to be a long-jump, btw. I think any
regular branch is sufficient. You obviously *do* need to make the long 
jump later (to reload %cs in protected mode), but I'm not sure it's needed 
in that place. I forget the exact rules (but they definitely were 
documented).


That's exactly the issue here.  The code without this patch deferred the 
long jump until after the segment loads, this worked on all processors 
except, apparently, the 486DX4.  Hence, move the ljmp up to the earliest 
possible location.


-hpa
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread H. Peter Anvin

Linus Torvalds wrote:


On Sun, 4 Nov 2007, Linus Torvalds wrote:

I'm not entirely sure that it needs to be a long-jump, btw. I think any
regular branch is sufficient. You obviously *do* need to make the long 
jump later (to reload %cs in protected mode), but I'm not sure it's needed 
in that place. I forget the exact rules (but they definitely were 
documented).


Hmm. The original Linux code did

movw$1, %ax
lmsw%ax
jmp flush_instr
flush_instr:

and I think that was straigh out of the documentation. So yeah, I think 
that's the right fix - not a longjmp (which in itself is dangerous: it 
potentially behaves *differently* on different CPU's, since some CPU's may 
do the long jump with pre-protected-mode semantics, while others will do 
it with protected mode already in effect!)




Just looked it up; it was a bit hard to find (it is Intel vol 3 page 
9-27, at least in the version I have), but you're right -- the 
documentation only demands a short jump here, not a long jmp (which 
actually makes sense given what I remembered that a long jump should be 
deferrable here.)  So yes, that is definitely the right fix and avoids 
the ugly mixing of code.


I'll update the patch.

-hpa
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread Linus Torvalds


On Sun, 4 Nov 2007, H. Peter Anvin wrote:
 
 It's not an instruction-decoding issue at all (that's a 16- vs 32-bit issue,
 which can only be changed by a ljmp).  Apparently the 486DX4 mis-executes the
 load to segment register, which is an EU function in that context.  (And yes,
 it's sort-of-documented behaviour in the sense that the documentation says do
 things this way, but the Intel docs are unfortunately full of do things this
 way which don't make sense and occasionally are actively harmful, too.)

I still disagree.

I took out Programming the 80386 just to check, and the documentation 
very clearly states that when changing the CR0 bits (I quote):

The program must execute a jump instruction immediately after 
 changing the value of the PE bit in order to flush the execution 
 pipeline of any instructions that may have been fetched in the 
 wrong mode. [...]

In other words, not only is this documented since day 1, it makes total 
sense, and they even said exactöy *why* that jump had to be done.

In fact, there's even a code example. It's page 624 in my copy of the 
book, and yes, it has a short jump to flush things, followed by a long 
jump. The code there looks like this:

; *
; ** [4] Enter Protected Mode
; *
SMSW AX
OR   AX, PE
LMSW AX
JMP  Flush
Flush:
JMP far ptr Start32


which is pretty damn conclusive. It's documented, it has examples, it 
works. In other words, it's how you should do things.

And Linux always did it correctly. I don't understand why you disagree, 
and why Jeremy says

Having successfully broken the rules for a long time so far, 
 maybe we can get away with still cutting corners...

when the fact is, we used to *not* cut corners, we used to *not* break the 
rules, and what we used to do (a short jump immediately after setting PE) 
was exactly what Intel always said you should do, and there is no question 
what-so-ever about it.

So here's a suggestion:

 - make the code do what it used to do. A regular jump to flush the 
   pipeline. Which is what Intel has always said should be done.

and I really don't see that there is any argument about this. 

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread H. Peter Anvin

Linus Torvalds wrote:


And Linux always did it correctly. I don't understand why you disagree, 
and why Jeremy says


	Having successfully broken the rules for a long time so far, 
	 maybe we can get away with still cutting corners...


when the fact is, we used to *not* cut corners, we used to *not* break the 
rules, and what we used to do (a short jump immediately after setting PE) 
was exactly what Intel always said you should do, and there is no question 
what-so-ever about it.




Apparently because the Intel documentation disagrees with itself. 
That's all.


-hpa
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread Eric W. Biederman
H. Peter Anvin [EMAIL PROTECTED] writes:

> Hi Linus; please pull:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup.git
> for-linus
>
> H. Peter Anvin (1):
>   x86 setup: correct booting on 486DX4
>
>  arch/x86/boot/pmjump.S |   32 +++++++++++++++++++++-----------
>  1 files changed, 21 insertions(+), 11 deletions(-)
>
> [Full diff and log follows]

Looks reasonable to me.

> commit ac3b37b78c5f0f0be0b476a35370650f7bad482f
> Author: H. Peter Anvin [EMAIL PROTECTED]
> Date:   Sun Nov 4 14:33:41 2007 -0800
>
>     x86 setup: correct booting on 486DX4
>
>     Apparently, the 486DX4 does not correctly serialize a mov to %cr0, so
>     we really do need the far jump immediately afterwards.  This means
>     losing the nice separation between 16- and 32-bit code, but c'est la
>     vie.
>
>     Also pass %ebx = %edi = %ebp = 0 to support future extension of the
>     32-bit boot protocol.
>
>     Signed-off-by: H. Peter Anvin [EMAIL PROTECTED]
>
> diff --git a/arch/x86/boot/pmjump.S b/arch/x86/boot/pmjump.S
> index 2e55923..17e6dec 100644
> --- a/arch/x86/boot/pmjump.S
> +++ b/arch/x86/boot/pmjump.S
> @@ -28,27 +28,37 @@
>   * void protected_mode_jump(u32 entrypoint, u32 bootparams);
>   */
>  protected_mode_jump:
> -	xorl	%ebx, %ebx		# Flag to indicate this is a boot
>  	movl	%edx, %esi		# Pointer to boot_params table
> -	movl	%eax, 2f		# Patch ljmpl instruction
> +
> +	xorl	%edx, %edx
> +	movw	%cs, %dx
> +	shll	$4, %edx		# Patch ljmpl instruction
> +	addl	%edx, 2f
>  	jmp	1f			# Short jump to flush instruction q.
>  
>  1:
>  	movw	$__BOOT_DS, %cx
> +	xorl	%ebx, %ebx		# Per protocol
> +	xorl	%ebp, %ebp		# Per protocol
> +	xorl	%edi, %edi		# Per protocol
>  
>  	movl	%cr0, %edx
>  	orb	$1, %dl			# Protected mode (PE) bit
>  	movl	%edx, %cr0
> +
> +	.byte	0x66, 0xea		# ljmpl opcode
> +2:	.long	3f			# Offset
> +	.word	__BOOT_CS		# Segment
>  
> -	movw	%cx, %ds
> -	movw	%cx, %es
> -	movw	%cx, %fs
> -	movw	%cx, %gs
> -	movw	%cx, %ss
> +	.code32
> +3:
> +	movl	%ecx, %ds
> +	movl	%ecx, %es
> +	movl	%ecx, %fs
> +	movl	%ecx, %gs
> +	movl	%ecx, %ss
>  
>  	# Jump to the 32-bit entrypoint
> -	.byte	0x66, 0xea		# ljmpl opcode
> -2:	.long	0			# offset
> -	.word	__BOOT_CS		# segment
> -
> +	jmpl	*%eax
> +
>  	.size	protected_mode_jump, .-protected_mode_jump
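
The "Patch ljmpl instruction" step in the quoted diff (movw %cs, %dx;
shll $4, %edx; addl %edx, 2f) is plain real-mode address arithmetic: the
label 3f is assembled as an offset within the setup segment, so the code
adds %cs * 16 to it to get an address usable after the far jump. A rough
C sketch of that arithmetic follows. It assumes __BOOT_CS describes a flat
segment with base 0 (not stated in this thread), and the helper names are
hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode linear address: segment * 16 + offset. */
static uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/*
 * Hypothetical helper mirroring the patched sequence:
 *   xorl %edx, %edx ; movw %cs, %dx ; shll $4, %edx ; addl %edx, 2f
 * It turns a segment-relative label offset into a flat 32-bit address.
 */
static uint32_t patch_ljmpl_offset(uint16_t cs, uint32_t label_offset)
{
    /* Assumption: the label lies within the 64 KiB real-mode segment. */
    assert(label_offset <= 0xFFFFu);

    uint32_t base = (uint32_t)cs << 4;  /* shll $4, %edx */
    return label_offset + base;         /* addl %edx, 2f */
}
```

For example, with %cs = 0x9020 and the label at offset 0x0123, the patched
offset would be 0x90323, the same value real_mode_linear(0x9020, 0x0123)
gives.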


Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread H. Peter Anvin

H. Peter Anvin wrote:
>
> Apparently because the Intel documentation disagrees with itself. That's
> all.

Just to be perfectly clear: I much prefer the code with the short (near) 
jump, because it keeps the code cleaner.  I have sent a patch to Mikael 
to test out.


-hpa


Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread Eric W. Biederman
H. Peter Anvin [EMAIL PROTECTED] writes:

> Linus Torvalds wrote:
>
>> And Linux always did it correctly. I don't understand why you disagree, and
>> why Jeremy says
>>
>> 	"Having successfully broken the rules for a long time so far, maybe
>> 	 we can get away with still cutting corners..."
>>
>> when the fact is, we used to *not* cut corners, we used to *not* break the
>> rules, and what we used to do (a short jump immediately after setting PE) was
>> exactly what Intel always said you should do, and there is no question
>> what-so-ever about it.
>
> Apparently because the Intel documentation disagrees with itself. That's all.

Yes.  Let's go back to the tested version with the short jump; that
looks safest, as it is what we have always done, and we certainly need some
kind of jump in there.

I do seem to recall etherboot having a far jump in that spot and it
working on everything from a 386 on up.  So I'm not certain if the
kind of jump matters.  Still the kernel has a lot more exposure.

At the same time, it does look like we really do enter protected mode
with a valid GDT after the short jump, so doing the segment loads in
32-bit mode, as I did originally, looks like it was excessively
conservative.

Eric


Re: [GIT PULL] x86 setup: correct booting on 486DX4

2007-11-04 Thread Linus Torvalds


On Sun, 4 Nov 2007, Eric W. Biederman wrote:
> 
> I do seem to recall etherboot having a far jump in that spot and it
> working on everything from a 386 on up.  So I'm not certain if the
> kind of jump matters.  Still the kernel has a lot more exposure.

I actually suspect you could have just about anything in there, including 
just a couple of nops, or just avoiding certain instructions for a few 
cycles.

The i386/i486 pipeline isn't actually all that long (I can't find it here, 
but I want to say it was just five stages), and the whole/only issue with 
writing to cr0 on those CPUs is literally that there isn't any forwarding 
of the cr0 state, so any instruction that actually depends on the cr0 
value needs to have that value stable in the register by the time it 
executes.

So I literally suspect that just a couple of no-ops in between the move to 
cr0 and any instruction that depends on the state of the PE bit would be 
ok. And there aren't that many instructions that do, it's generally just 
the ones that load a segment that can care.

But I'd actually be worried about an ljmp directly after the "move to 
cr0", exactly because an ljmp actually does have semantic dependencies on 
the PE bit. But it's quite likely that ljmp is microcoded (it takes 12+ 
cycles even in real mode), and since microcode was nonpipelined, that 
would hide it.

But "move to segment" is definitely *not* microcoded in real mode (it's 
documented as just two cycles for reg->seg), so I'm not at all surprised 
that "mov->cr0" followed immediately by "mov->seg" will not work.

In short:

 - far jumps are in the "dangerous instruction" category after a change to 
   PE. I would suggest not using it, although I also suspect that it 
   probably works if only because it's probably microcoded on at least an 
   i386.

 - instead of a short taken jump, you can almost certainly use anything 
   that is microcoded or just otherwise takes enough cycles (where 
   "enough" is likely in the 5-10 range) to make sure the writeback to CR0 
   is stable by the time any instruction uses it.

 - almost anything that doesn't actually involve a segment descriptor 
   lookup is probably not going to care at all about the value of PE. The 
   PE bit really doesn't affect all that much of the x86 instruction set, 
   and if an instruction doesn't care, it doesn't matter whether it's 
   executed with the old or the new value.
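
The timing argument in the list above can be illustrated with a toy model
in C. This is only a sketch under loud assumptions: the cycle counts and
the settle_cycles value are made-up parameters (the message only guesses
at a 5-10 cycle range), the struct and function names are hypothetical,
and the model does not try to capture the "microcoded ljmp" case, where
the dependent instruction itself may hide the latency.

```c
#include <assert.h>
#include <stddef.h>

/* One instruction issued after the write to CR0 (all values hypothetical). */
struct insn {
    const char *name;
    int cycles;         /* assumed cost in cycles */
    int depends_on_pe;  /* nonzero if it consults CR0.PE, e.g. a segment load */
};

/*
 * Return the index of the first PE-dependent instruction that would begin
 * before the CR0 write has settled, or -1 if every dependent instruction
 * begins at or after settle_cycles.
 */
static int first_pe_hazard(const struct insn *seq, size_t n, int settle_cycles)
{
    int elapsed = 0;  /* cycles since the move to CR0 */

    assert(seq != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (seq[i].depends_on_pe && elapsed < settle_cycles)
            return (int)i;
        elapsed += seq[i].cycles;
    }
    return -1;
}
```

In this model a segment load placed directly after the CR0 write is
flagged, while the same load placed after any filler whose total cost
reaches settle_cycles is not, regardless of whether the filler is a jump.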


Linus