Re: [GIT PULL] x86 setup: correct booting on 486DX4
On Sun, 4 Nov 2007, Eric W. Biederman wrote:
>
> I do seem to recall etherboot having a far jump in that spot and it
> working on everything from a 386 on up. So I'm not certain if the
> kind of jump matters. Still the kernel has a lot more exposure.

I actually suspect you could have just about anything in there, including
just a couple of nops, or just avoiding certain instructions for a few
cycles.

The i386/i486 pipeline isn't actually all that long (I can't find it here,
but I want to say it was just five stages), and the whole/only issue with
writing to cr0 on those CPU's is literally that there isn't any forwarding
of the cr0 state, so any instruction that actually depends on the cr0
value needs to have that value stable in the register by the time it
executes.

So I literally suspect that just a couple of no-ops in between the move to
cr0 and any instruction that depends on the state of the PE bit would be
ok. And there aren't that many instructions that do, it's generally just
the ones that load a segment that can care.

But I'd actually be worried about a ljmp directly after the "move to cr0",
exactly because an ljmp actually does have semantic dependencies on the PE
bit. But it's quite likely that ljmp is microcoded (it takes 12+ cycles
even in real mode), and since microcode was nonpipelined, that would hide
it.

But "move to segment" is definitely *not* microcoded in real mode (it's
documented as just two cycles for reg->seg), so I'm not at all surprised
that "mov->cr0" followed immediately by "mov->seg" will not work.

In short:

 - far jumps are in the "dangerous instruction" category after a change to
   PE. I would suggest not using it, although I also suspect that it
   probably works if only because it's probably microcoded on at least an
   i386.

 - instead of a short taken jump, you can almost certainly use anything
   that is microcoded or just otherwise takes enough cycles (where
   "enough" is likely in the 5-10 range) to make sure the writeback to CR0
   is stable by the time any instruction uses it.

 - almost anything that doesn't actually involve a segment descriptor
   lookup is probably not going to care at all about the value of PE. The
   PE bit really doesn't affect all that much of the x86 instruction set,
   and if an instruction doesn't care, it doesn't matter whether it's
   executed with the old or the new value.

		Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] x86 setup: correct booting on 486DX4
"H. Peter Anvin" <[EMAIL PROTECTED]> writes:

> Linus Torvalds wrote:
>>
>> And Linux always did it correctly. I don't understand why you disagree, and
>> why Jeremy says
>>
>>   "Having successfully broken the rules for a long time so far, maybe
>>    we can get away with still cutting corners..."
>>
>> when the fact is, we used to *not* cut corners, we used to *not* break the
>> rules, and what we used to do (a short jump immediately after setting PE) was
>> exactly what Intel always said you should do, and there is no question
>> what-so-ever about it.
>>
>
> Apparently because the Intel documentation disagrees with itself. That's all.

Yes. Let's go back to the tested version with the short jump; that
looks safest as it is what we have always done, and we certainly
need some kind of jump in there.

I do seem to recall etherboot having a far jump in that spot and it
working on everything from a 386 on up. So I'm not certain if the
kind of jump matters. Still the kernel has a lot more exposure.

At the same time it does look like we really do enter protected mode
with a valid GDT after the short jump, so doing the segment loads
in 32-bit mode, as I did originally, looks like it was excessively
conservative.

Eric
Re: [GIT PULL] x86 setup: correct booting on 486DX4
H. Peter Anvin wrote:
>
> Apparently because the Intel documentation disagrees with itself. That's all.

Just to be perfectly clear: I much prefer the code with the short (near)
jump, because it keeps the code cleaner. I have sent a patch to Mikael to
test out.

	-hpa
Re: [GIT PULL] x86 setup: correct booting on 486DX4
"H. Peter Anvin" <[EMAIL PROTECTED]> writes:

> Hi Linus; please pull:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup.git
> for-linus
>
> H. Peter Anvin (1):
>       x86 setup: correct booting on 486DX4
>
>  arch/x86/boot/pmjump.S |   32 +++++++++++++++++++++-----------
>  1 files changed, 21 insertions(+), 11 deletions(-)
>
> [Full diff and log follows]

Looks reasonable to me.

> commit ac3b37b78c5f0f0be0b476a35370650f7bad482f
> Author: H. Peter Anvin <[EMAIL PROTECTED]>
> Date:   Sun Nov 4 14:33:41 2007 -0800
>
>     x86 setup: correct booting on 486DX4
>
>     Apparently, the 486DX4 does not correctly serialize a mov to %cr0, so
>     we really do need the far jump immediately afterwards. This means
>     losing the nice separation between 16- and 32-bit code, but c'est la
>     vie.
>
>     Also pass %ebx = %edi = %ebp = 0 to support future extension of the
>     32-bit boot protocol.
>
>     Signed-off-by: H. Peter Anvin <[EMAIL PROTECTED]>
>
> diff --git a/arch/x86/boot/pmjump.S b/arch/x86/boot/pmjump.S
> index 2e55923..17e6dec 100644
> --- a/arch/x86/boot/pmjump.S
> +++ b/arch/x86/boot/pmjump.S
> @@ -28,27 +28,37 @@
>   * void protected_mode_jump(u32 entrypoint, u32 bootparams);
>   */
>  protected_mode_jump:
> -	xorl	%ebx, %ebx		# Flag to indicate this is a boot
>  	movl	%edx, %esi		# Pointer to boot_params table
> -	movl	%eax, 2f		# Patch ljmpl instruction
> +
> +	xorl	%edx, %edx
> +	movw	%cs, %dx
> +	shll	$4, %edx		# Patch ljmpl instruction
> +	addl	%edx, 2f
>  	jmp	1f			# Short jump to flush instruction q.
>
>  1:
>  	movw	$__BOOT_DS, %cx
> +	xorl	%ebx, %ebx		# Per protocol
> +	xorl	%ebp, %ebp		# Per protocol
> +	xorl	%edi, %edi		# Per protocol
>
>  	movl	%cr0, %edx
>  	orb	$1, %dl			# Protected mode (PE) bit
>  	movl	%edx, %cr0
> +
> +	.byte	0x66, 0xea		# ljmpl opcode
> +2:	.long	3f			# Offset
> +	.word	__BOOT_CS		# Segment
>
> -	movw	%cx, %ds
> -	movw	%cx, %es
> -	movw	%cx, %fs
> -	movw	%cx, %gs
> -	movw	%cx, %ss
> +	.code32
> +3:
> +	movl	%ecx, %ds
> +	movl	%ecx, %es
> +	movl	%ecx, %fs
> +	movl	%ecx, %gs
> +	movl	%ecx, %ss
>
>  	# Jump to the 32-bit entrypoint
> -	.byte	0x66, 0xea		# ljmpl opcode
> -2:	.long	0			# offset
> -	.word	__BOOT_CS		# segment
> -
> +	jmpl	*%eax
> +
>  	.size	protected_mode_jump, .-protected_mode_jump
Re: [GIT PULL] x86 setup: correct booting on 486DX4
Linus Torvalds wrote:
>
> And Linux always did it correctly. I don't understand why you disagree,
> and why Jeremy says
>
>   "Having successfully broken the rules for a long time so far, maybe
>    we can get away with still cutting corners..."
>
> when the fact is, we used to *not* cut corners, we used to *not* break
> the rules, and what we used to do (a short jump immediately after
> setting PE) was exactly what Intel always said you should do, and there
> is no question what-so-ever about it.
>

Apparently because the Intel documentation disagrees with itself. That's
all.

	-hpa
Re: [GIT PULL] x86 setup: correct booting on 486DX4
On Sun, 4 Nov 2007, H. Peter Anvin wrote:
>
> It's not an instruction-decoding issue at all (that's a 16- vs 32-bit issue,
> which can only be changed by a ljmp). Apparently the 486DX4 mis-executes the
> load to segment register, which is an EU function in that context. (And yes,
> it's sort-of-documented behaviour in the sense that the documentation says "do
> things this way", but the Intel docs are unfortunately full of "do things this
> way" which don't make sense and occasionally are actively harmful, too.)

I still disagree. I took out "Programming the 80386" just to check, and
the documentation very clearly states that when changing the CR0 bits (I
quote):

  "The program must execute a jump instruction immediately after changing
   the value of the PE bit in order to flush the execution pipeline of
   any instructions that may have been fetched in the wrong mode. [...]"

In other words, not only is this documented since day 1, it makes total
sense, and they even said exactly *why* that jump had to be done.

In fact, there's even a code example. It's page 624 in my copy of the
book, and yes, it has a short jump to flush things, followed by a long
jump. The code there looks like this:

	; *
	; ** [4] Enter Protected Mode
	; *
		SMSW	AX
		OR	AX, PE
		LMSW	AX
		JMP	Flush
	Flush:
		JMP	far ptr Start32

which is pretty damn conclusive. It's documented, it has examples, it
works. In other words, it's how you should do things.

And Linux always did it correctly. I don't understand why you disagree,
and why Jeremy says

  "Having successfully broken the rules for a long time so far, maybe
   we can get away with still cutting corners..."

when the fact is, we used to *not* cut corners, we used to *not* break the
rules, and what we used to do (a short jump immediately after setting PE)
was exactly what Intel always said you should do, and there is no question
what-so-ever about it.

So here's a suggestion:

 - make the code do what it used to do. A regular jump to flush the
   pipeline. Which is what Intel has always said should be done.

and I really don't see that there is any argument about this.

		Linus
Re: [GIT PULL] x86 setup: correct booting on 486DX4
Linus Torvalds wrote:
>
> On Sun, 4 Nov 2007, Linus Torvalds wrote:
>> I'm not entirely sure that it needs to be a long-jump, btw. I think any
>> regular branch is sufficient. You obviously *do* need to make the long
>> jump later (to reload %cs in protected mode), but I'm not sure it's needed
>> in that place. I forget the exact rules (but they definitely were
>> documented).
>
> Hmm. The original Linux code did
>
> 	movw	$1, %ax
> 	lmsw	%ax
> 	jmp	flush_instr
> flush_instr:
>
> and I think that was straight out of the documentation.
>
> So yeah, I think that's the right fix - not a longjmp (which in itself is
> dangerous: it potentially behaves *differently* on different CPU's, since
> some CPU's may do the long jump with pre-protected-mode semantics, while
> others will do it with protected mode already in effect!)
>

Just looked it up; it was a bit hard to find (it is Intel vol 3 page
9-27, at least in the version I have), but you're right -- the
documentation only demands a short jump here, not a long jmp (which
actually makes sense given what I remembered that a long jump should be
deferrable here.)

So yes, that is definitely the right fix and avoids the ugly mixing of
code. I'll update the patch.

	-hpa
Re: [GIT PULL] x86 setup: correct booting on 486DX4
Linus Torvalds wrote:
>
> On Sun, 4 Nov 2007, H. Peter Anvin wrote:
>> Apparently, the 486DX4 does not correctly serialize a mov to %cr0, so
>> we really do need the far jump immediately afterwards.
>
> Hmm. I'm not sure I agree with the commit message. This is documented
> behaviour on i386 and i486: instruction decoding is decoupled from
> execution, so things that change processor mode have to do a jump to make
> sure that %cr0 changes take effect.
>

It's not an instruction-decoding issue at all (that's a 16- vs 32-bit
issue, which can only be changed by a ljmp). Apparently the 486DX4
mis-executes the load to segment register, which is an EU function in
that context. (And yes, it's sort-of-documented behaviour in the sense
that the documentation says "do things this way", but the Intel docs are
unfortunately full of "do things this way" which don't make sense and
occasionally are actively harmful, too.)

> I'm not entirely sure that it needs to be a long-jump, btw. I think any
> regular branch is sufficient. You obviously *do* need to make the long
> jump later (to reload %cs in protected mode), but I'm not sure it's needed
> in that place. I forget the exact rules (but they definitely were
> documented).

That's exactly the issue here. The code without this patch deferred the
long jump until after the segment loads; this worked on all processors
except, apparently, the 486DX4. Hence, move the ljmp up to the earliest
possible location.

	-hpa
Re: [GIT PULL] x86 setup: correct booting on 486DX4
Linus Torvalds wrote:
> I'm not entirely sure that it needs to be a long-jump, btw. I think any
> regular branch is sufficient. You obviously *do* need to make the long
> jump later (to reload %cs in protected mode), but I'm not sure it's needed
> in that place. I forget the exact rules (but they definitely were
> documented).

Yes, it says it needs to be a far jmp or call (and if you enabled paging
in the cr0 load, you need to identity-map the branch target).

Having successfully broken the rules for a long time so far, maybe we
can get away with still cutting corners... but it doesn't seem
particularly worthwhile since we've been caught once.

    J
Re: [GIT PULL] x86 setup: correct booting on 486DX4
On Sun, 4 Nov 2007, Linus Torvalds wrote:
>
> I'm not entirely sure that it needs to be a long-jump, btw. I think any
> regular branch is sufficient. You obviously *do* need to make the long
> jump later (to reload %cs in protected mode), but I'm not sure it's needed
> in that place. I forget the exact rules (but they definitely were
> documented).

Hmm. The original Linux code did

	movw	$1, %ax
	lmsw	%ax
	jmp	flush_instr
flush_instr:

and I think that was straight out of the documentation.

So yeah, I think that's the right fix - not a longjmp (which in itself is
dangerous: it potentially behaves *differently* on different CPU's, since
some CPU's may do the long jump with pre-protected-mode semantics, while
others will do it with protected mode already in effect!)

		Linus
Re: [GIT PULL] x86 setup: correct booting on 486DX4
On Sun, 4 Nov 2007, H. Peter Anvin wrote:
>
> Apparently, the 486DX4 does not correctly serialize a mov to %cr0, so
> we really do need the far jump immediately afterwards.

Hmm. I'm not sure I agree with the commit message. This is documented
behaviour on i386 and i486: instruction decoding is decoupled from
execution, so things that change processor mode have to do a jump to make
sure that %cr0 changes take effect.

I'm not entirely sure that it needs to be a long-jump, btw. I think any
regular branch is sufficient. You obviously *do* need to make the long
jump later (to reload %cs in protected mode), but I'm not sure it's needed
in that place. I forget the exact rules (but they definitely were
documented).

		Linus
[GIT PULL] x86 setup: correct booting on 486DX4
Hi Linus; please pull:

  git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup.git for-linus

H. Peter Anvin (1):
      x86 setup: correct booting on 486DX4

 arch/x86/boot/pmjump.S |   32 +++++++++++++++++++++-----------
 1 files changed, 21 insertions(+), 11 deletions(-)

[Full diff and log follows]

commit ac3b37b78c5f0f0be0b476a35370650f7bad482f
Author: H. Peter Anvin <[EMAIL PROTECTED]>
Date:   Sun Nov 4 14:33:41 2007 -0800

    x86 setup: correct booting on 486DX4

    Apparently, the 486DX4 does not correctly serialize a mov to %cr0, so
    we really do need the far jump immediately afterwards. This means
    losing the nice separation between 16- and 32-bit code, but c'est la
    vie.

    Also pass %ebx = %edi = %ebp = 0 to support future extension of the
    32-bit boot protocol.

    Signed-off-by: H. Peter Anvin <[EMAIL PROTECTED]>

diff --git a/arch/x86/boot/pmjump.S b/arch/x86/boot/pmjump.S
index 2e55923..17e6dec 100644
--- a/arch/x86/boot/pmjump.S
+++ b/arch/x86/boot/pmjump.S
@@ -28,27 +28,37 @@
  * void protected_mode_jump(u32 entrypoint, u32 bootparams);
  */
 protected_mode_jump:
-	xorl	%ebx, %ebx		# Flag to indicate this is a boot
 	movl	%edx, %esi		# Pointer to boot_params table
-	movl	%eax, 2f		# Patch ljmpl instruction
+
+	xorl	%edx, %edx
+	movw	%cs, %dx
+	shll	$4, %edx		# Patch ljmpl instruction
+	addl	%edx, 2f
 	jmp	1f			# Short jump to flush instruction q.
 
 1:
 	movw	$__BOOT_DS, %cx
+	xorl	%ebx, %ebx		# Per protocol
+	xorl	%ebp, %ebp		# Per protocol
+	xorl	%edi, %edi		# Per protocol
 
 	movl	%cr0, %edx
 	orb	$1, %dl			# Protected mode (PE) bit
 	movl	%edx, %cr0
+
+	.byte	0x66, 0xea		# ljmpl opcode
+2:	.long	3f			# Offset
+	.word	__BOOT_CS		# Segment
 
-	movw	%cx, %ds
-	movw	%cx, %es
-	movw	%cx, %fs
-	movw	%cx, %gs
-	movw	%cx, %ss
+	.code32
+3:
+	movl	%ecx, %ds
+	movl	%ecx, %es
+	movl	%ecx, %fs
+	movl	%ecx, %gs
+	movl	%ecx, %ss
 
 	# Jump to the 32-bit entrypoint
-	.byte	0x66, 0xea		# ljmpl opcode
-2:	.long	0			# offset
-	.word	__BOOT_CS		# segment
-
+	jmpl	*%eax
+
 	.size	protected_mode_jump, .-protected_mode_jump
[GIT PULL] x86 setup: correct booting on 486DX4
Hi Linus; please pull: git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup.git for-linus H. Peter Anvin (1): x86 setup: correct booting on 486DX4 arch/x86/boot/pmjump.S | 32 +--- 1 files changed, 21 insertions(+), 11 deletions(-) [Full diff and log follows] commit ac3b37b78c5f0f0be0b476a35370650f7bad482f Author: H. Peter Anvin [EMAIL PROTECTED] Date: Sun Nov 4 14:33:41 2007 -0800 x86 setup: correct booting on 486DX4 Apparently, the 486DX4 does not correctly serialize a mov to %cr0, so we really do need the far jump immediately afterwards. This means losing the nice separation between 16- and 32-bit code, but c'est la vie. Also pass %ebx = %edi = %ebp = 0 to support future extension of the 32-bit boot protocol. Signed-off-by: H. Peter Anvin [EMAIL PROTECTED] diff --git a/arch/x86/boot/pmjump.S b/arch/x86/boot/pmjump.S index 2e55923..17e6dec 100644 --- a/arch/x86/boot/pmjump.S +++ b/arch/x86/boot/pmjump.S @@ -28,27 +28,37 @@ * void protected_mode_jump(u32 entrypoint, u32 bootparams); */ protected_mode_jump: - xorl%ebx, %ebx # Flag to indicate this is a boot movl%edx, %esi # Pointer to boot_params table - movl%eax, 2f# Patch ljmpl instruction + + xorl%edx, %edx + movw%cs, %dx + shll$4, %edx# Patch ljmpl instruction + addl%edx, 2f jmp 1f # Short jump to flush instruction q. 
1: movw$__BOOT_DS, %cx + xorl%ebx, %ebx # Per protocol + xorl%ebp, %ebp # Per protocol + xorl%edi, %edi # Per protocol movl%cr0, %edx orb $1, %dl # Protected mode (PE) bit movl%edx, %cr0 + + .byte 0x66, 0xea # ljmpl opcode +2: .long 3f # Offset + .word __BOOT_CS # Segment - movw%cx, %ds - movw%cx, %es - movw%cx, %fs - movw%cx, %gs - movw%cx, %ss + .code32 +3: + movl%ecx, %ds + movl%ecx, %es + movl%ecx, %fs + movl%ecx, %gs + movl%ecx, %ss # Jump to the 32-bit entrypoint - .byte 0x66, 0xea # ljmpl opcode -2: .long 0 # offset - .word __BOOT_CS # segment - + jmpl*%eax + .size protected_mode_jump, .-protected_mode_jump - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] x86 setup: correct booting on 486DX4
On Sun, 4 Nov 2007, H. Peter Anvin wrote: Apparently, the 486DX4 does not correctly serialize a mov to %cr0, so we really do need the far jump immediately afterwards. Hmm. I'm not sure I agree with the commit message. This is documented behaviour on i386 and i486: instruction decoding is decoupled from execution, so things that change processor mode have to do a jump to make sure that %cr0 changes take effect. I'm not entirely sure that it needs to be a long-jump, btw. I think any regular branch is sufficient. You obviously *do* need to make the long jump later (to reload %cs in protected mode), but I'm not sure it's needed in that place. I forget the exact rules (but they definitely were documented). Linus - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] x86 setup: correct booting on 486DX4
On Sun, 4 Nov 2007, Linus Torvalds wrote: I'm not entirely sure that it needs to be a long-jump, btw. I think any regular branch is sufficient. You obviously *do* need to make the long jump later (to reload %cs in protected mode), but I'm not sure it's needed in that place. I forget the exact rules (but they definitely were documented). Hmm. The original Linux code did movw$1, %ax lmsw%ax jmp flush_instr flush_instr: and I think that was straigh out of the documentation. So yeah, I think that's the right fix - not a longjmp (which in itself is dangerous: it potentially behaves *differently* on different CPU's, since some CPU's may do the long jump with pre-protected-mode semantics, while others will do it with protected mode already in effect!) Linus - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] x86 setup: correct booting on 486DX4
Linus Torvalds wrote: I'm not entirely sure that it needs to be a long-jump, btw. I think any regular branch is sufficient. You obviously *do* need to make the long jump later (to reload %cs in protected mode), but I'm not sure it's needed in that place. I forget the exact rules (but they definitely were documented). Yes, it says it needs to be a far jmp or call (and if you enabled paging in the cr0 load, you need to identity-map the branch target). Having successfully broken the rules for a long time so far, maybe we can get away with still cutting corners... but it doesn't seem particularly worthwhile since we've been caught once. J - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] x86 setup: correct booting on 486DX4
Linus Torvalds wrote: On Sun, 4 Nov 2007, H. Peter Anvin wrote: Apparently, the 486DX4 does not correctly serialize a mov to %cr0, so we really do need the far jump immediately afterwards. Hmm. I'm not sure I agree with the commit message. This is documented behaviour on i386 and i486: instruction decoding is decoupled from execution, so things that change processor mode have to do a jump to make sure that %cr0 changes take effect. It's not an instruction-decoding issue at all (that's a 16- vs 32-bit issue, which can only be changed by a ljmp). Apparently the 486DX4 mis-executes the load to segment register, which is an EU function in that context. (And yes, it's sort-of-documented behaviour in the sense that the documentation says do things this way, but the Intel docs are unfortunately full of do things this way which don't make sense and occasionally are actively harmful, too.) I'm not entirely sure that it needs to be a long-jump, btw. I think any regular branch is sufficient. You obviously *do* need to make the long jump later (to reload %cs in protected mode), but I'm not sure it's needed in that place. I forget the exact rules (but they definitely were documented). That's exactly the issue here. The code without this patch deferred the long jump until after the segment loads, this worked on all processors except, apparently, the 486DX4. Hence, move the ljmp up to the earliest possible location. -hpa - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] x86 setup: correct booting on 486DX4
Linus Torvalds wrote: On Sun, 4 Nov 2007, Linus Torvalds wrote: I'm not entirely sure that it needs to be a long-jump, btw. I think any regular branch is sufficient. You obviously *do* need to make the long jump later (to reload %cs in protected mode), but I'm not sure it's needed in that place. I forget the exact rules (but they definitely were documented). Hmm. The original Linux code did movw$1, %ax lmsw%ax jmp flush_instr flush_instr: and I think that was straigh out of the documentation. So yeah, I think that's the right fix - not a longjmp (which in itself is dangerous: it potentially behaves *differently* on different CPU's, since some CPU's may do the long jump with pre-protected-mode semantics, while others will do it with protected mode already in effect!) Just looked it up; it was a bit hard to find (it is Intel vol 3 page 9-27, at least in the version I have), but you're right -- the documentation only demands a short jump here, not a long jmp (which actually makes sense given what I remembered that a long jump should be deferrable here.) So yes, that is definitely the right fix and avoids the ugly mixing of code. I'll update the patch. -hpa - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] x86 setup: correct booting on 486DX4
On Sun, 4 Nov 2007, H. Peter Anvin wrote: It's not an instruction-decoding issue at all (that's a 16- vs 32-bit issue, which can only be changed by a ljmp). Apparently the 486DX4 mis-executes the load to segment register, which is an EU function in that context. (And yes, it's sort-of-documented behaviour in the sense that the documentation says do things this way, but the Intel docs are unfortunately full of do things this way which don't make sense and occasionally are actively harmful, too.) I still disagree. I took out Programming the 80386 just to check, and the documentation very clearly states that when changing the CR0 bits (I quote): The program must execute a jump instruction immediately after changing the value of the PE bit in order to flush the execution pipeline of any instructions that may have been fetched in the wrong mode. [...] In other words, not only is this documented since day 1, it makes total sense, and they even said exactöy *why* that jump had to be done. In fact, there's even a code example. It's page 624 in my copy of the book, and yes, it has a short jump to flush things, followed by a long jump. The code there looks like this: ; * ; ** [4] Enter Protected Mode ; * SMSW AX OR AX, PE LMSW AX JMP Flush Flush: JMP far ptr Start32 which is pretty damn conclusive. It's documented, it has examples, it works. In other words, it's how you should do things. And Linux always did it correctly. I don't understand why you disagree, and why Jeremy says Having successfully broken the rules for a long time so far, maybe we can get away with still cutting corners... when the fact is, we used to *not* cut corners, we used to *not* break the rules, and what we used to do (a short jump immediately after setting PE) was exactly what Intel always said you should do, and there is no question what-so-ever about it. So here's a suggestion: - make the code do what it used to do. A regular jump to flush the pipeline. 
Which is what Intel has always said should be done. and I really don't see that there is any argument about this. Linus - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] x86 setup: correct booting on 486DX4
Linus Torvalds wrote: And Linux always did it correctly. I don't understand why you disagree, and why Jeremy says Having successfully broken the rules for a long time so far, maybe we can get away with still cutting corners... when the fact is, we used to *not* cut corners, we used to *not* break the rules, and what we used to do (a short jump immediately after setting PE) was exactly what Intel always said you should do, and there is no question what-so-ever about it. Apparently because the Intel documentation disagrees with itself. That's all. -hpa - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] x86 setup: correct booting on 486DX4
H. Peter Anvin [EMAIL PROTECTED] writes: Hi Linus; please pull: git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup.git for-linus H. Peter Anvin (1): x86 setup: correct booting on 486DX4 arch/x86/boot/pmjump.S | 32 +--- 1 files changed, 21 insertions(+), 11 deletions(-) [Full diff and log follows] Looks reasonable to me. commit ac3b37b78c5f0f0be0b476a35370650f7bad482f Author: H. Peter Anvin [EMAIL PROTECTED] Date: Sun Nov 4 14:33:41 2007 -0800 x86 setup: correct booting on 486DX4 Apparently, the 486DX4 does not correctly serialize a mov to %cr0, so we really do need the far jump immediately afterwards. This means losing the nice separation between 16- and 32-bit code, but c'est la vie. Also pass %ebx = %edi = %ebp = 0 to support future extension of the 32-bit boot protocol. Signed-off-by: H. Peter Anvin [EMAIL PROTECTED] diff --git a/arch/x86/boot/pmjump.S b/arch/x86/boot/pmjump.S index 2e55923..17e6dec 100644 --- a/arch/x86/boot/pmjump.S +++ b/arch/x86/boot/pmjump.S @@ -28,27 +28,37 @@ * void protected_mode_jump(u32 entrypoint, u32 bootparams); */ protected_mode_jump: - xorl%ebx, %ebx # Flag to indicate this is a boot movl%edx, %esi # Pointer to boot_params table - movl%eax, 2f# Patch ljmpl instruction + + xorl%edx, %edx + movw%cs, %dx + shll$4, %edx# Patch ljmpl instruction + addl%edx, 2f jmp 1f # Short jump to flush instruction q. 
1: movw$__BOOT_DS, %cx + xorl%ebx, %ebx # Per protocol + xorl%ebp, %ebp # Per protocol + xorl%edi, %edi # Per protocol movl%cr0, %edx orb $1, %dl # Protected mode (PE) bit movl%edx, %cr0 + + .byte 0x66, 0xea # ljmpl opcode +2: .long 3f # Offset + .word __BOOT_CS # Segment - movw%cx, %ds - movw%cx, %es - movw%cx, %fs - movw%cx, %gs - movw%cx, %ss + .code32 +3: + movl%ecx, %ds + movl%ecx, %es + movl%ecx, %fs + movl%ecx, %gs + movl%ecx, %ss # Jump to the 32-bit entrypoint - .byte 0x66, 0xea # ljmpl opcode -2: .long 0 # offset - .word __BOOT_CS # segment - + jmpl*%eax + .size protected_mode_jump, .-protected_mode_jump - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] x86 setup: correct booting on 486DX4
H. Peter Anvin wrote:
> Apparently because the Intel documentation disagrees with itself.
> That's all.

Just to be perfectly clear: I much prefer the code with the short (near)
jump, because it keeps the code cleaner. I have sent a patch to Mikael to
test out.

	-hpa
Re: [GIT PULL] x86 setup: correct booting on 486DX4
H. Peter Anvin [EMAIL PROTECTED] writes:

> Linus Torvalds wrote:
>> And Linux always did it correctly. I don't understand why you disagree,
>> and why Jeremy says "Having successfully broken the rules for a long
>> time so far, maybe we can get away with still cutting corners..." when
>> the fact is, we used to *not* cut corners, we used to *not* break the
>> rules, and what we used to do (a short jump immediately after setting
>> PE) was exactly what Intel always said you should do, and there is no
>> question what-so-ever about it.
>
> Apparently because the Intel documentation disagrees with itself.
> That's all.

Yes. Let's go back to the tested version with the short jump; that looks
safest, as it is what we have always done, and we certainly need some kind
of jump in there.

I do seem to recall etherboot having a far jump in that spot and it
working on everything from a 386 on up. So I'm not certain if the
kind of jump matters. Still the kernel has a lot more exposure.

At the same time, it does look like we really do enter protected mode with
a valid GDT after the short jump, so doing the segment loads in 32-bit
mode, as I did originally, looks like it was excessively conservative.

Eric
Re: [GIT PULL] x86 setup: correct booting on 486DX4
On Sun, 4 Nov 2007, Eric W. Biederman wrote:
>
> I do seem to recall etherboot having a far jump in that spot and it
> working on everything from a 386 on up. So I'm not certain if the
> kind of jump matters. Still the kernel has a lot more exposure.

I actually suspect you could have just about anything in there, including
just a couple of nops, or just avoiding certain instructions for a few
cycles.

The i386/i486 pipeline isn't actually all that long (I can't find it here,
but I want to say it was just five stages), and the whole/only issue with
writing to cr0 on those CPUs is literally that there isn't any forwarding
of the cr0 state, so any instruction that actually depends on the cr0
value needs to have that value stable in the register by the time it
executes.

So I literally suspect that just a couple of no-ops in between the move to
cr0 and any instruction that depends on the state of the PE bit would be
ok. And there aren't that many instructions that do, it's generally just
the ones that load a segment that can care.

But I'd actually be worried about a ljmp directly after the "move to cr0",
exactly because an ljmp actually does have semantic dependencies on the PE
bit. But it's quite likely that ljmp is microcoded (it takes 12+ cycles
even in real mode), and since microcode was nonpipelined, that would hide
it.

But "move to segment" is definitely *not* microcoded in real mode (it's
documented as just two cycles for reg->seg), so I'm not at all surprised
that "mov->cr0" followed immediately by "mov->seg" will not work.

In short:

 - far jumps are in the "dangerous instruction" category after a change to
   PE. I would suggest not using them, although I also suspect that it
   probably works if only because it's probably microcoded on at least an
   i386.

 - instead of a short taken jump, you can almost certainly use anything
   that is microcoded or just otherwise takes enough cycles (where
   "enough" is likely in the 5-10 range) to make sure the writeback to CR0
   is stable by the time any instruction uses it.

 - almost anything that doesn't actually involve a segment descriptor
   lookup is probably not going to care at all about the value of PE. The
   PE bit really doesn't affect all that much of the x86 instruction set,
   and if an instruction doesn't care, it doesn't matter whether it's
   executed with the old or the new value.

		Linus
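The hazard described in this message can be illustrated with a toy model in C. This is only a sketch of the reasoning, not a description of real i386/i486 hardware: the latency constant, the slot numbering, and both function names are assumptions made for illustration, and the figure 5 is just the low end of the "5-10 range" guessed above.

```c
#include <stdbool.h>

/* Assumed number of instruction slots before a CR0 write becomes
 * visible. Hypothetical value; the mail only guesses "5-10". */
enum { CR0_WRITEBACK_SLOTS = 5 };

/* PE value observed by an instruction issued `slot` slots after the
 * write (slot 1 immediately follows "movl %edx, %cr0"). With no
 * forwarding, early slots still observe the old value. */
bool observed_pe(bool old_pe, bool new_pe, int slot)
{
    return slot > CR0_WRITEBACK_SLOTS ? new_pe : old_pe;
}

/* An instruction is unaffected if it never consults PE, or if it
 * issues late enough to observe the new value (real -> protected). */
bool safe_after_pe_change(bool depends_on_pe, int slot)
{
    return !depends_on_pe || observed_pe(false, true, slot);
}
```

Under these assumptions, a segment load in slot 1 is unsafe, a nop in slot 1 is harmless, and the same segment load becomes safe once enough slots have passed. The model deliberately ignores the prefetch queue, which the patch's own comment ("Short jump to flush instruction q.") names as a reason for the jump.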