Re: [PATCH v3 2/3] Fix undefined operation fault that can hang a cpu on crash or panic

2020-07-07 Thread David P. Reed


On Tuesday, July 7, 2020 3:24pm, "Sean Christopherson" 
 said:

> On Tue, Jul 07, 2020 at 03:09:38PM -0400, David P. Reed wrote:
>>
>> On Tuesday, July 7, 2020 1:09am, "Sean Christopherson"
>>  said:
>> Sean, are you the one who would get this particular fix pushed into Linus's
>> tree, by the way? The "maintainership" is not clear to me.
> 
> Nope, I'm just here to complain and nitpick :-)  There's no direct maintainer
> for virtext.h so it falls under the higher level arch/x86 umbrella, i.e. I
> expect Boris/Thomas/Ingo will pick this up.
> 
Thanks for your time and effort in helping.



Re: [PATCH v3 2/3] Fix undefined operation fault that can hang a cpu on crash or panic

2020-07-07 Thread David P. Reed


On Tuesday, July 7, 2020 1:09am, "Sean Christopherson" 
 said:

> On Sat, Jul 04, 2020 at 04:38:08PM -0400, David P. Reed wrote:
>> Fix: Mask undefined operation fault during emergency VMXOFF that must be
>> attempted to force cpu exit from VMX root operation.
>> Explanation: When a cpu may be in VMX root operation (only possible when
>> CR4.VMXE is set), crash or panic reboot tries to exit VMX root operation
>> using VMXOFF. This is necessary, because any INIT will be masked while cpu
>> is in VMX root operation, but that state cannot be reliably
>> discerned by the state of the cpu.
>> VMXOFF faults if the cpu is not actually in VMX root operation, signalling
>> undefined operation.
>> Discovered while debugging an out-of-tree x-visor with a race. Can happen
>> due to certain kinds of bugs in KVM.
>>
>> Fixes: 208067 <https://bugzilla.kernel.org/show_bug.cgi?id=208067>
>> Reported-by: David P. Reed 
>> Suggested-by: Thomas Gleixner 
>> Suggested-by: Sean Christopherson 
>> Suggested-by: Andy Lutomirski 
>> Signed-off-by: David P. Reed 
>> ---
>>  arch/x86/include/asm/virtext.h | 20 ++--
>>  1 file changed, 14 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/virtext.h b/arch/x86/include/asm/virtext.h
>> index 0ede8d04535a..0e0900eacb9c 100644
>> --- a/arch/x86/include/asm/virtext.h
>> +++ b/arch/x86/include/asm/virtext.h
>> @@ -30,11 +30,11 @@ static inline int cpu_has_vmx(void)
>>  }
>>
>>
>> -/* Disable VMX on the current CPU
>> +/* Exit VMX root mode and isable VMX on the current CPU.
>>   *
>>   * vmxoff causes a undefined-opcode exception if vmxon was not run
>> - * on the CPU previously. Only call this function if you know VMX
>> - * is enabled.
>> + * on the CPU previously. Only call this function if you know cpu
>> + * is in VMX root mode.
>>   */
>>  static inline void cpu_vmxoff(void)
>>  {
>> @@ -47,14 +47,22 @@ static inline int cpu_vmx_enabled(void)
>>  return __read_cr4() & X86_CR4_VMXE;
>>  }
>>
>> -/* Disable VMX if it is enabled on the current CPU
>> +/* Safely exit VMX root mode and disable VMX if VMX enabled
>> + * on the current CPU. Handle undefined-opcode fault
>> + * that can occur if cpu is not in VMX root mode, due
>> + * to a race.
>>   *
>>   * You shouldn't call this if cpu_has_vmx() returns 0.
>>   */
>>  static inline void __cpu_emergency_vmxoff(void)
>>  {
>> -if (cpu_vmx_enabled())
>> -cpu_vmxoff();
>> +if (!cpu_vmx_enabled())
>> +return;
>> +asm volatile ("1:vmxoff\n\t"
>> +  "2:\n\t"
>> +  _ASM_EXTABLE(1b, 2b)
>> +  ::: "cc", "memory");
>> +cr4_clear_bits(X86_CR4_VMXE);
> 
> Open coding vmxoff doesn't make sense, and IMO is flat out wrong as it fixes
> flows that use __cpu_emergency_vmxoff() but leaves the same bug hanging
> around in emergency_vmx_disable_all() until the next patch.
> 
> The reason I say it doesn't make sense is that there is no sane scenario
> where the generic vmxoff helper should _not_ eat the fault.  All other VMXOFF
> faults are mode related, i.e. any fault is guaranteed to be due to the
> !post-VMXON check unless we're magically in RM, VM86, compat mode, or at
> CPL>0.  Given that the whole point of this series is that it's impossible to
> determine whether or not the CPU is post-VMXON if CR4.VMXE=1 without taking a
> fault of some form, there's simply no way that anything except the hypervisor
> (in normal operation) can know the state of VMX.  And given that the only
> in-tree hypervisor (KVM) has its own version of vmxoff, that means there is
> no scenario in which cpu_vmxoff() can safely be used.  Case in point, after
> the next patch there are no users of cpu_vmxoff().
> 
> TL;DR: Just do fixup on cpu_vmxoff().

Personally, I don't care either way, since it fixes the bug either way (and 
it's inlined, so either way no additional code is generated). I was just being 
conservative, since cpu_vmxoff() is exported throughout the kernel source, 
so it might be expected to stay the same (when not in an "emergency"). I'll 
wait a day or two for any objections to just doing the fix in cpu_vmxoff() by 
other commenters. With no objection, I'll just do it that way.

Sean, are you the one who would get this particular fix pushed into Linus's 
tree, by the way? The "maintainership" is not clear to me. If you are, happy to 
take direction from you as the primary input.

> 
>>  }
>>
>>  /* Disable VMX if it is supported and enabled on the current CPU
>> --
>> 2.26.2
>>
> 




Re: [PATCH v3 2/3] Fix undefined operation fault that can hang a cpu on crash or panic

2020-07-05 Thread David P. Reed


On Sunday, July 5, 2020 4:55pm, "Andy Lutomirski"  said:

> On Sun, Jul 5, 2020 at 12:52 PM David P. Reed  wrote:
>>
>> Thanks, will handle these. 2 questions below.
>>
>> On Sunday, July 5, 2020 2:22pm, "Andy Lutomirski"  said:
>>
>> > On Sat, Jul 4, 2020 at 1:38 PM David P. Reed  wrote:
>> >>
>> >> Fix: Mask undefined operation fault during emergency VMXOFF that must be
>> >> attempted to force cpu exit from VMX root operation.
>> >> Explanation: When a cpu may be in VMX root operation (only possible when
>> >> CR4.VMXE is set), crash or panic reboot tries to exit VMX root operation
>> >> using VMXOFF. This is necessary, because any INIT will be masked while cpu
>> >> is in VMX root operation, but that state cannot be reliably
>> >> discerned by the state of the cpu.
>> >> VMXOFF faults if the cpu is not actually in VMX root operation, signalling
>> >> undefined operation.
>> >> Discovered while debugging an out-of-tree x-visor with a race. Can happen
>> >> due to certain kinds of bugs in KVM.
>> >
>> > Can you re-wrap lines to 68 characters?  Also, the Fix: and
>>
>> I used 'scripts/checkpatch.pl' and it had me wrap to 75 chars:
>> "WARNING: Possible unwrapped commit description (prefer a maximum 75 chars 
>> per
>> line)"
>>
>> Should I submit a fix to checkpatch.pl to say 68?
> 
> 75 is probably fine too, but something is odd about your wrapping.
> You have long lines mostly alternating with short lines.  It's as if
> you wrote 120-ish character lines and then wrapped to 75 without
> reflowing.
My emacs settings tend to wrap at about 85 depending on file type (big 
screens). I did the shortening manually, aiming to break at meaningful 
points, not worrying too much about line-length uniformity.

> 
>>
>> > Explanation: is probably unnecessary.  You could say:
>> >
>> > Ignore a potential #UD failut during emergency VMXOFF ...
>> >
>> > When a cpu may be in VMX ...
>> >
>> >>
>> >> Fixes: 208067 <https://bugzilla.kernel.org/show_bug.cgi?id=208067>
>> >> Reported-by: David P. Reed 
>> >
>> > It's not really necessary to say that you, the author, reported the
>> > problem, but I guess it's harmless.
>> >
>> >> Suggested-by: Thomas Gleixner 
>> >> Suggested-by: Sean Christopherson 
>> >> Suggested-by: Andy Lutomirski 
>> >> Signed-off-by: David P. Reed 
>> >> ---
>> >>  arch/x86/include/asm/virtext.h | 20 ++--
>> >>  1 file changed, 14 insertions(+), 6 deletions(-)
>> >>
>> >> diff --git a/arch/x86/include/asm/virtext.h 
>> >> b/arch/x86/include/asm/virtext.h
>> >> index 0ede8d04535a..0e0900eacb9c 100644
>> >> --- a/arch/x86/include/asm/virtext.h
>> >> +++ b/arch/x86/include/asm/virtext.h
>> >> @@ -30,11 +30,11 @@ static inline int cpu_has_vmx(void)
>> >>  }
>> >>
>> >>
>> >> -/* Disable VMX on the current CPU
>> >> +/* Exit VMX root mode and isable VMX on the current CPU.
>> >
>> > s/isable/disable/
>> >
>> >
>> >>  /* Disable VMX if it is supported and enabled on the current CPU
>> >> --
>> >> 2.26.2
>> >>
>> >
>> > Other than that:
>> >
>> > Reviewed-by: Andy Lutomirski 
>>
>> As a newbie, I have a process question - should I resend the patch with the
>> 'Reviewed-by' line, as well as correcting the other wording? Thanks!
> 
> Probably.  Sometimes a maintainer will apply the patch and make these
> types of cosmetic changes, but it's easier if you resubmit.  That
> being said, for non-urgent patches, it's usually considered polite to
> wait a day or two to give other people a chance to comment.

I'm not sure which maintainer will move the patches along. I am waiting for 
additional input, but will resubmit in a day or two.

> 
> --Andy
> 




Re: [PATCH v3 3/3] Force all cpus to exit VMX root operation on crash/panic reliably

2020-07-05 Thread David P. Reed


On Sunday, July 5, 2020 2:26pm, "Andy Lutomirski"  said:

> On Sat, Jul 4, 2020 at 1:38 PM David P. Reed  wrote:
>>
>> Fix the logic during crash/panic reboot on Intel processors that
>> can support VMX operation to ensure that all processors are not
>> in VMX root operation. Prior code made optimistic assumptions
>> about other cpus that would leave other cpus in VMX root operation
>> depending on timing of crash/panic reboot.
>> Builds on cpu_emergency_vmxoff() and __cpu_emergency_vmxoff() created
>> in a prior patch.
>>
>> Suggested-by: Sean Christopherson 
>> Signed-off-by: David P. Reed 
>> ---
>>  arch/x86/kernel/reboot.c | 20 +++-
>>  1 file changed, 7 insertions(+), 13 deletions(-)
>>
>> diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
>> index 0ec7ced727fe..c8e96ba78efc 100644
>> --- a/arch/x86/kernel/reboot.c
>> +++ b/arch/x86/kernel/reboot.c
>> @@ -543,24 +543,18 @@ static void emergency_vmx_disable_all(void)
>>  * signals when VMX is enabled.
>>  *
>>  * We can't take any locks and we may be on an inconsistent
>> -* state, so we use NMIs as IPIs to tell the other CPUs to disable
>> -* VMX and halt.
>> +* state, so we use NMIs as IPIs to tell the other CPUs to exit
>> +* VMX root operation and halt.
>>  *
>>  * For safety, we will avoid running the nmi_shootdown_cpus()
>>  * stuff unnecessarily, but we don't have a way to check
>> -* if other CPUs have VMX enabled. So we will call it only if the
>> -* CPU we are running on has VMX enabled.
>> -*
>> -* We will miss cases where VMX is not enabled on all CPUs. This
>> -* shouldn't do much harm because KVM always enable VMX on all
>> -* CPUs anyway. But we can miss it on the small window where KVM
>> -* is still enabling VMX.
>> +* if other CPUs might be in VMX root operation.
>>  */
>> -   if (cpu_has_vmx() && cpu_vmx_enabled()) {
>> -   /* Disable VMX on this CPU. */
>> -   cpu_vmxoff();
>> +   if (cpu_has_vmx()) {
>> +   /* Safely force out of VMX root operation on this CPU. */
>> +   __cpu_emergency_vmxoff();
>>
>> -   /* Halt and disable VMX on the other CPUs */
>> +   /* Halt and exit VMX root operation on the other CPUs */
>> nmi_shootdown_cpus(vmxoff_nmi);
>>
>> }
> 
> Seems reasonable to me.
> 
> As a minor caveat, doing cr4_clear_bits() in NMI context is not really
> okay, but we're about to reboot, so nothing too awful should happen.
> And this has very little to do with your patch.

I had wondered why the bit is cleared, too. (I assumed it was OK or desirable, 
because it was being cleared in NMI context before). Happy to submit a separate 
patch to eliminate that issue as well, since the point of emergency vmxoff is 
only to get out of VMX root mode - CR4.VMXE's state is irrelevant. Of course, 
clearing it prevents any future emergency vmxoff attempts. (There seemed to be 
some confusion about "enabling" VMX vs. "in VMX operation" in the comments.) 
Should I?

> 
> Reviewed-by: Andy Lutomirski 
> 




Re: [PATCH v3 2/3] Fix undefined operation fault that can hang a cpu on crash or panic

2020-07-05 Thread David P. Reed
Thanks, will handle these. 2 questions below.

On Sunday, July 5, 2020 2:22pm, "Andy Lutomirski"  said:

> On Sat, Jul 4, 2020 at 1:38 PM David P. Reed  wrote:
>>
>> Fix: Mask undefined operation fault during emergency VMXOFF that must be
>> attempted to force cpu exit from VMX root operation.
>> Explanation: When a cpu may be in VMX root operation (only possible when
>> CR4.VMXE is set), crash or panic reboot tries to exit VMX root operation
>> using VMXOFF. This is necessary, because any INIT will be masked while cpu
>> is in VMX root operation, but that state cannot be reliably
>> discerned by the state of the cpu.
>> VMXOFF faults if the cpu is not actually in VMX root operation, signalling
>> undefined operation.
>> Discovered while debugging an out-of-tree x-visor with a race. Can happen
>> due to certain kinds of bugs in KVM.
> 
> Can you re-wrap lines to 68 characters?  Also, the Fix: and

I used 'scripts/checkpatch.pl' and it had me wrap to 75 chars:
"WARNING: Possible unwrapped commit description (prefer a maximum 75 chars per 
line)"

Should I submit a fix to checkpatch.pl to say 68? 

> Explanation: is probably unnecessary.  You could say:
> 
> Ignore a potential #UD failut during emergency VMXOFF ...
> 
> When a cpu may be in VMX ...
> 
>>
>> Fixes: 208067 <https://bugzilla.kernel.org/show_bug.cgi?id=208067>
>> Reported-by: David P. Reed 
> 
> It's not really necessary to say that you, the author, reported the
> problem, but I guess it's harmless.
> 
>> Suggested-by: Thomas Gleixner 
>> Suggested-by: Sean Christopherson 
>> Suggested-by: Andy Lutomirski 
>> Signed-off-by: David P. Reed 
>> ---
>>  arch/x86/include/asm/virtext.h | 20 ++--
>>  1 file changed, 14 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/virtext.h b/arch/x86/include/asm/virtext.h
>> index 0ede8d04535a..0e0900eacb9c 100644
>> --- a/arch/x86/include/asm/virtext.h
>> +++ b/arch/x86/include/asm/virtext.h
>> @@ -30,11 +30,11 @@ static inline int cpu_has_vmx(void)
>>  }
>>
>>
>> -/* Disable VMX on the current CPU
>> +/* Exit VMX root mode and isable VMX on the current CPU.
> 
> s/isable/disable/
> 
> 
>>  /* Disable VMX if it is supported and enabled on the current CPU
>> --
>> 2.26.2
>>
> 
> Other than that:
> 
> Reviewed-by: Andy Lutomirski 

As a newbie, I have a process question - should I resend the patch with the 
'Reviewed-by' line, as well as correcting the other wording? Thanks!

> 
> --Andy
> 




[PATCH v3 2/3] Fix undefined operation fault that can hang a cpu on crash or panic

2020-07-04 Thread David P. Reed
Fix: Mask undefined operation fault during emergency VMXOFF that must be
attempted to force cpu exit from VMX root operation.
Explanation: When a cpu may be in VMX root operation (only possible when
CR4.VMXE is set), crash or panic reboot tries to exit VMX root operation
using VMXOFF. This is necessary, because any INIT will be masked while cpu
is in VMX root operation, but that state cannot be reliably
discerned by the state of the cpu.
VMXOFF faults if the cpu is not actually in VMX root operation, signalling
undefined operation.
Discovered while debugging an out-of-tree x-visor with a race. Can happen
due to certain kinds of bugs in KVM.

Fixes: 208067 <https://bugzilla.kernel.org/show_bug.cgi?id=208067>
Reported-by: David P. Reed 
Suggested-by: Thomas Gleixner 
Suggested-by: Sean Christopherson 
Suggested-by: Andy Lutomirski 
Signed-off-by: David P. Reed 
---
 arch/x86/include/asm/virtext.h | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/virtext.h b/arch/x86/include/asm/virtext.h
index 0ede8d04535a..0e0900eacb9c 100644
--- a/arch/x86/include/asm/virtext.h
+++ b/arch/x86/include/asm/virtext.h
@@ -30,11 +30,11 @@ static inline int cpu_has_vmx(void)
 }
 
 
-/* Disable VMX on the current CPU
+/* Exit VMX root mode and isable VMX on the current CPU.
  *
  * vmxoff causes a undefined-opcode exception if vmxon was not run
- * on the CPU previously. Only call this function if you know VMX
- * is enabled.
+ * on the CPU previously. Only call this function if you know cpu
+ * is in VMX root mode.
  */
 static inline void cpu_vmxoff(void)
 {
@@ -47,14 +47,22 @@ static inline int cpu_vmx_enabled(void)
return __read_cr4() & X86_CR4_VMXE;
 }
 
-/* Disable VMX if it is enabled on the current CPU
+/* Safely exit VMX root mode and disable VMX if VMX enabled
+ * on the current CPU. Handle undefined-opcode fault
+ * that can occur if cpu is not in VMX root mode, due
+ * to a race.
  *
  * You shouldn't call this if cpu_has_vmx() returns 0.
  */
 static inline void __cpu_emergency_vmxoff(void)
 {
-   if (cpu_vmx_enabled())
-   cpu_vmxoff();
+   if (!cpu_vmx_enabled())
+   return;
+   asm volatile ("1:vmxoff\n\t"
+ "2:\n\t"
+ _ASM_EXTABLE(1b, 2b)
+ ::: "cc", "memory");
+   cr4_clear_bits(X86_CR4_VMXE);
 }
 
 /* Disable VMX if it is supported and enabled on the current CPU
-- 
2.26.2
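[Editorial note: Sean's request in the follow-up discussion (do the exception fixup in cpu_vmxoff() itself, rather than open-coding VMXOFF in __cpu_emergency_vmxoff()) would amount to roughly the sketch below. It relies on the kernel-internal _ASM_EXTABLE() and cr4_clear_bits() helpers and is an illustration of the suggested direction, not the patch as posted or merged:]

```c
/*
 * Sketch only: cpu_vmxoff() with the exception fixup folded in.
 * Eats the #UD raised when the CPU is not post-VMXON, instead of
 * requiring callers to know the VMX state beforehand.
 */
static inline void cpu_vmxoff(void)
{
	asm volatile("1: vmxoff\n\t"
		     "2:\n\t"
		     _ASM_EXTABLE(1b, 2b)
		     ::: "cc", "memory");
	cr4_clear_bits(X86_CR4_VMXE);
}
```

__cpu_emergency_vmxoff() would then stay a thin wrapper: check cpu_vmx_enabled() and call cpu_vmxoff().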



[PATCH v3 0/3] Fix undefined operation VMXOFF during reboot and crash

2020-07-04 Thread David P. Reed
At the request of Sean Christopherson, the original patch was split
into three patches, each fixing a distinct issue related to the original
bug, of a hang due to VMXOFF causing an undefined operation fault
when the kernel reboots with CR4.VMXE set. The combination of
the patches is the complete fix for the reported bug and for a lurking
error in asm side effects.

David P. Reed (3):
  Correct asm VMXOFF side effects
  Fix undefined operation fault that can hang a cpu on crash or panic
  Force all cpus to exit VMX root operation on crash/panic reliably

 arch/x86/include/asm/virtext.h | 24 
 arch/x86/kernel/reboot.c   | 20 +++-
 2 files changed, 23 insertions(+), 21 deletions(-)

-- 
2.26.2



[PATCH v3 3/3] Force all cpus to exit VMX root operation on crash/panic reliably

2020-07-04 Thread David P. Reed
Fix the logic during crash/panic reboot on Intel processors that
can support VMX operation to ensure that all processors are not
in VMX root operation. Prior code made optimistic assumptions
about other cpus that would leave other cpus in VMX root operation
depending on timing of crash/panic reboot.
Builds on cpu_emergency_vmxoff() and __cpu_emergency_vmxoff() created
in a prior patch.

Suggested-by: Sean Christopherson 
Signed-off-by: David P. Reed 
---
 arch/x86/kernel/reboot.c | 20 +++-
 1 file changed, 7 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 0ec7ced727fe..c8e96ba78efc 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -543,24 +543,18 @@ static void emergency_vmx_disable_all(void)
 * signals when VMX is enabled.
 *
 * We can't take any locks and we may be on an inconsistent
-* state, so we use NMIs as IPIs to tell the other CPUs to disable
-* VMX and halt.
+* state, so we use NMIs as IPIs to tell the other CPUs to exit
+* VMX root operation and halt.
 *
 * For safety, we will avoid running the nmi_shootdown_cpus()
 * stuff unnecessarily, but we don't have a way to check
-* if other CPUs have VMX enabled. So we will call it only if the
-* CPU we are running on has VMX enabled.
-*
-* We will miss cases where VMX is not enabled on all CPUs. This
-* shouldn't do much harm because KVM always enable VMX on all
-* CPUs anyway. But we can miss it on the small window where KVM
-* is still enabling VMX.
+* if other CPUs might be in VMX root operation.
 */
-   if (cpu_has_vmx() && cpu_vmx_enabled()) {
-   /* Disable VMX on this CPU. */
-   cpu_vmxoff();
+   if (cpu_has_vmx()) {
+   /* Safely force out of VMX root operation on this CPU. */
+   __cpu_emergency_vmxoff();
 
-   /* Halt and disable VMX on the other CPUs */
+   /* Halt and exit VMX root operation on the other CPUs */
nmi_shootdown_cpus(vmxoff_nmi);
 
}
-- 
2.26.2



[PATCH v3 1/3] Correct asm VMXOFF side effects

2020-07-04 Thread David P. Reed
Tell gcc that VMXOFF instruction clobbers condition codes
and memory when executed.
Also, correct original comments to remove kernel-doc syntax
per Randy Dunlap's request.

Suggested-by: Randy Dunlap 
Signed-off-by: David P. Reed 
---
 arch/x86/include/asm/virtext.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/virtext.h b/arch/x86/include/asm/virtext.h
index 9aad0e0876fb..0ede8d04535a 100644
--- a/arch/x86/include/asm/virtext.h
+++ b/arch/x86/include/asm/virtext.h
@@ -30,7 +30,7 @@ static inline int cpu_has_vmx(void)
 }
 
 
-/** Disable VMX on the current CPU
+/* Disable VMX on the current CPU
  *
  * vmxoff causes a undefined-opcode exception if vmxon was not run
  * on the CPU previously. Only call this function if you know VMX
@@ -38,7 +38,7 @@ static inline int cpu_has_vmx(void)
  */
 static inline void cpu_vmxoff(void)
 {
-   asm volatile ("vmxoff");
+   asm volatile ("vmxoff" ::: "cc", "memory");
cr4_clear_bits(X86_CR4_VMXE);
 }
 
@@ -47,7 +47,7 @@ static inline int cpu_vmx_enabled(void)
return __read_cr4() & X86_CR4_VMXE;
 }
 
-/** Disable VMX if it is enabled on the current CPU
+/* Disable VMX if it is enabled on the current CPU
  *
  * You shouldn't call this if cpu_has_vmx() returns 0.
  */
@@ -57,7 +57,7 @@ static inline void __cpu_emergency_vmxoff(void)
cpu_vmxoff();
 }
 
-/** Disable VMX if it is supported and enabled on the current CPU
+/* Disable VMX if it is supported and enabled on the current CPU
  */
 static inline void cpu_emergency_vmxoff(void)
 {
-- 
2.26.2



Re: [PATCH v2] Fix undefined operation VMXOFF during reboot and crash

2020-06-29 Thread David P. Reed


On Monday, June 29, 2020 5:49pm, "Sean Christopherson" 
 said:

> On Mon, Jun 29, 2020 at 02:22:45PM -0700, Andy Lutomirski wrote:
>>
>>
>> > On Jun 29, 2020, at 1:54 PM, David P. Reed  wrote:
>> >
>> > Simple question for those on the To: and CC: list here. Should I
>> > abandon any hope of this patch being accepted? It's been a long time.
>> >
>> > The non-response after I acknowledged that this was discovered when
>> > working on a personal, non-commercial research project - which is
>> > "out-of-tree" (apparently dirty words on LKML) has me thinking my
>> > contribution is unwanted. That's fine, I suppose. I can maintain this patch
>> > out-of-tree as well.  I did incorporate all the helpful suggestions I
>> > received in this second patch, and given some encouragement, will happily
>> > submit a revised v3 if there is any likelihood of acceptance. I'm wary of
>> > doing more radical changes (like combining emergency and normal paths).
>> >
>>
>> Sorry about being slow and less actively encouraging than we should be. We
>> absolutely welcome personal contributions. The actual problem is that
>> everyone is overworked and we’re all slow. Also, you may be hitting a corner
>> case in the process: is this a KVM patch or an x86 patch?
> 
> It's an x86 patch as it's not KVM specific, e.g. this code also helps play
> nice with out of tree hypervisors.
> 
> The code change is mostly good, but it needs to be split up as there are
> three separate fixes:
> 
>   1. Handle #UD on VMXON due to a race.
>   2. Mark memory and flags as clobbered by VMXON.
>   3. Change emergency_vmx_disable_all() to not manually check 
> cpu_vmx_enabled().
> 
> Yes, the changes are tiny, but if for example #3 introduces a bug then we
> don't have to revert #1 and #2.  Or perhaps older kernels are only subject
> to the #1 and #2 and thus dumping all three changes into a single patch makes
> it all harder to backport.  In other words, all the usual "one change per
> patch" reasons.
> 
Thanks. If no one else responds with additional suggestions, I will make it 
into 3 patches.
I'm happy to learn the nuances of the kernel patch regimen.





Re: [PATCH v2] Fix undefined operation VMXOFF during reboot and crash

2020-06-29 Thread David P. Reed
Simple question for those on the To: and CC: list here. Should I abandon any 
hope of this patch being accepted? It's been a long time.

The non-response after I acknowledged that this was discovered when working 
on a personal, non-commercial research project - which is "out-of-tree" 
(apparently dirty words on LKML) - has me thinking my contribution is unwanted. 
That's fine, I suppose. I can maintain this patch out-of-tree as well.
I did incorporate all the helpful suggestions I received in this second patch, 
and given some encouragement, will happily submit a revised v3 if there is any 
likelihood of acceptance. I'm wary of doing more radical changes (like 
combining emergency and normal paths).


On Thursday, June 25, 2020 10:59am, "David P. Reed"  said:

> Correction to my comment below.
> On Thursday, June 25, 2020 10:45am, "David P. Reed"  
> said:
> 
>> [Sorry: this is resent because my mailer included HTML, rejected by LKML]
>> Thanks for the response, Sean ... I had thought everyone was too busy to 
>> follow
>> up
>> from the first version.
>>  
>> I confess I'm not sure why this should be broken up into a patch series, 
>> given
>> that it is so very small and is all aimed at the same category of bug.
>>  
>> And the "emergency" path pre-existed, I didn't want to propose removing it, 
>> since
>> I assumed it was there for a reason. I didn't want to include my own 
>> judgement as
>> to whether there should only be one path. (I'm pretty sure I didn't find a 
>> VMXOFF
>> in KVM separately from the instance in this include file, but I will check).
> Just checked. Yes, the KVM code's handling of VMXOFF is separate, and though
> it uses exception masking, it seems to do other things, perhaps related to
> nested KVM, but I haven't studied the deep logic of KVM nesting.
> 
>>  
>> A question: if I make it a series, I have to test each patch doesn't break
>> something individually, in order to handle the case where one patch is 
>> accepted
>> and the others are not. Do I need to test each individual patch thoroughly 
>> as an
>> independent patch against all those cases?
>> I know the combination doesn't break anything and fixes the issues I've
>> discovered, by testing all combinations. (I've done thorough testing of
>> panics, oopses, crashes, kexec, ... under all combinations of CR4.VMXE
>> enablement and crash source, to verify the fix fixes the problem's
>> manifestations and that it doesn't break any of the working paths.)
>>  
>> That said, I'm willing to do a v3 "series" based on these suggestions if it 
>> will
>> smooth its acceptance. If it's not going to get accepted after doing that, my
>> motivation is flagging.
>> On Thursday, June 25, 2020 2:06am, "Sean Christopherson"
>>  said:
>>
>>
>>
>>> On Thu, Jun 11, 2020 at 03:48:18PM -0400, David P. Reed wrote:
>>> > -/** Disable VMX on the current CPU
>>> > +/* Disable VMX on the current CPU
>>> > *
>>> > - * vmxoff causes a undefined-opcode exception if vmxon was not run
>>> > - * on the CPU previously. Only call this function if you know VMX
>>> > - * is enabled.
>>> > + * vmxoff causes an undefined-opcode exception if vmxon was not run
>>> > + * on the CPU previously. Only call this function directly if you know 
>>> > VMX
>>> > + * is enabled *and* CPU is in VMX root operation.
>>> > */
>>> > static inline void cpu_vmxoff(void)
>>> > {
>>> > - asm volatile ("vmxoff");
>>> > + asm volatile ("vmxoff" ::: "cc", "memory"); /* clears all flags on 
>>> > success
>>> */
>>> > cr4_clear_bits(X86_CR4_VMXE);
>>> > }
>>> >
>>> > @@ -47,17 +47,35 @@ static inline int cpu_vmx_enabled(void)
>>> > return __read_cr4() & X86_CR4_VMXE;
>>> > }
>>> >
>>> > -/** Disable VMX if it is enabled on the current CPU
>>> > - *
>>> > - * You shouldn't call this if cpu_has_vmx() returns 0.
>>> > +/*
>>> > + * Safely disable VMX root operation if active
>>> > + * Note that if CPU is not in VMX root operation this
>>> > + * VMXOFF will fault an undefined operation fault,
>>> > + * so use the exception masking facility to handle that RARE
>>> > + * case.
>>> > + * You shouldn't call this directly if cpu_has_vmx() returns 0
>>

Re: [PATCH v2] Fix undefined operation VMXOFF during reboot and crash

2020-06-25 Thread David P. Reed
Correction to my comment below.
On Thursday, June 25, 2020 10:45am, "David P. Reed"  said:

> [Sorry: this is resent because my mailer included HTML, rejected by LKML]
> Thanks for the response, Sean ... I had thought everyone was too busy to 
> follow up
> from the first version.
>  
> I confess I'm not sure why this should be broken up into a patch series, given
> that it is so very small and is all aimed at the same category of bug.
>  
> And the "emergency" path pre-existed, I didn't want to propose removing it, 
> since
> I assumed it was there for a reason. I didn't want to include my own 
> judgement as
> to whether there should only be one path. (I'm pretty sure I didn't find a 
> VMXOFF
> in KVM separately from the instance in this include file, but I will check).
Just checked. Yes, the KVM code's handling of VMXOFF is separate, and though it 
uses exception masking, it seems to do other things, perhaps related to nested 
KVM, but I haven't studied the deep logic of KVM nesting.

>  
> A question: if I make it a series, I have to test each patch doesn't break
> something individually, in order to handle the case where one patch is 
> accepted
> and the others are not. Do I need to test each individual patch thoroughly as 
> an
> independent patch against all those cases?
> I know the combination doesn't break anything and fixes the issues I've
> discovered, by testing all combinations. (I've done thorough testing of
> panics, oopses, crashes, kexec, ... under all combinations of CR4.VMXE
> enablement and crash source, to verify the fix fixes the problem's
> manifestations and that it doesn't break any of the working paths.)
>  
> That said, I'm willing to do a v3 "series" based on these suggestions if it 
> will
Re: [PATCH v2] Fix undefined operation VMXOFF during reboot and crash

2020-06-25 Thread David P. Reed
[Sorry: this is resent because my mailer included HTML, rejected by LKML]
Thanks for the response, Sean ... I had thought everyone was too busy to follow 
up from the first version.
 
I confess I'm not sure why this should be broken up into a patch series, given 
that it is so very small and is all aimed at the same category of bug.
 
And the "emergency" path pre-existed, I didn't want to propose removing it, 
since I assumed it was there for a reason. I didn't want to include my own 
judgement as to whether there should only be one path. (I'm pretty sure I 
didn't find a VMXOFF in KVM separately from the instance in this include file, 
but I will check).
 
A question: if I make it a series, I have to test each patch doesn't break 
something individually, in order to handle the case where one patch is accepted 
and the others are not. Do I need to test each individual patch thoroughly as 
an independent patch against all those cases?
I know the combination doesn't break anything and fixes the issues I've 
discovered, by testing all combinations. (I've done thorough testing of 
panics, oopses, crashes, kexec, ... under all combinations of CR4.VMXE 
enablement and crash source, to verify that the fix addresses the problem's 
manifestations and doesn't break any of the working paths.)
 
That said, I'm willing to do a v3 "series" based on these suggestions if it 
will smooth its acceptance. If it's not going to get accepted after doing that, 
my motivation is flagging.
On Thursday, June 25, 2020 2:06am, "Sean Christopherson" 
 said:



> On Thu, Jun 11, 2020 at 03:48:18PM -0400, David P. Reed wrote:
> > -/** Disable VMX on the current CPU
> > +/* Disable VMX on the current CPU
> > *
> > - * vmxoff causes a undefined-opcode exception if vmxon was not run
> > - * on the CPU previously. Only call this function if you know VMX
> > - * is enabled.
> > + * vmxoff causes an undefined-opcode exception if vmxon was not run
> > + * on the CPU previously. Only call this function directly if you know VMX
> > + * is enabled *and* CPU is in VMX root operation.
> > */
> > static inline void cpu_vmxoff(void)
> > {
> > - asm volatile ("vmxoff");
> > + asm volatile ("vmxoff" ::: "cc", "memory"); /* clears all flags on success
> */
> > cr4_clear_bits(X86_CR4_VMXE);
> > }
> >
> > @@ -47,17 +47,35 @@ static inline int cpu_vmx_enabled(void)
> > return __read_cr4() & X86_CR4_VMXE;
> > }
> >
> > -/** Disable VMX if it is enabled on the current CPU
> > - *
> > - * You shouldn't call this if cpu_has_vmx() returns 0.
> > +/*
> > + * Safely disable VMX root operation if active
> > + * Note that if CPU is not in VMX root operation this
> > + * VMXOFF will fault an undefined operation fault,
> > + * so use the exception masking facility to handle that RARE
> > + * case.
> > + * You shouldn't call this directly if cpu_has_vmx() returns 0
> > + */
> > +static inline void cpu_vmxoff_safe(void)
> > +{
> > + asm volatile("1:vmxoff\n\t" /* clears all flags on success */
> 
> Eh, I wouldn't bother with the comment, there are a million other caveats
> with VMXOFF that are far more interesting.
> 
> > + "2:\n\t"
> > + _ASM_EXTABLE(1b, 2b)
> > + ::: "cc", "memory");
> 
> Adding the memory and flags clobber should be a separate patch.
> 
> > + cr4_clear_bits(X86_CR4_VMXE);
> > +}
> 
> 
> I don't see any value in safe/unsafe variants. The only in-kernel user of
> VMXOFF outside of the emergency flows is KVM, which has its own VMXOFF
> helper, i.e. all users of cpu_vmxoff() want the "safe" variant. Just add
> the exception fixup to cpu_vmxoff() and call it good.
> 
> > +
> > +/*
> > + * Force disable VMX if it is enabled on the current CPU,
> > + * when it is unknown whether CPU is in VMX operation.
> > */
> > static inline void __cpu_emergency_vmxoff(void)
> > {
> > - if (cpu_vmx_enabled())
> > - cpu_vmxoff();
> > + if (!cpu_vmx_enabled())
> > + return;
> > + cpu_vmxoff_safe();
> 
> Unnecessary churn.
> 
> > }
> >
> > -/** Disable VMX if it is supported and enabled on the current CPU
> > +/* Force disable VMX if it is supported on current CPU
> > */
> > static inline void cpu_emergency_vmxoff(void)
> > {
> > diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
> > index e040ba6be27b..b0e6b106a67e 100644
> > --- a/arch/x86/kernel/reboot.c
> > +++ b/arch/x86/kernel/reboot.c
> > @@ -540,21 +540,14 @@ static void emergency_vmx_disable_all(void)
> > *
> >

[PATCH v2] Fix undefined operation VMXOFF during reboot and crash

2020-06-11 Thread David P. Reed
If a panic/reboot occurs when CR4 has VMX enabled, a VMXOFF is
done on all CPUS, to allow the INIT IPI to function, since
INIT is suppressed when CPUs are in VMX root operation.
Problem is that VMXOFF will cause an undefined operation fault when the CPU is
not in VMX operation, that is, when VMXON has not been executed yet, or VMXOFF
has been executed but VMX is still enabled. Patch makes the reboot
work more reliably by masking the exception on VMXOFF in the
crash/panic/reboot path, which uses cpu_emergency_vmxoff().
Can happen with KVM due to a race, but that race is rare today.
Problem discovered doing out-of-tree x-visor development that uses VMX
in a novel way for kernel performance analysis.
The logic in reboot.c is also corrected, since the point of forcing
the processor out of VMX root operation is to allow the INIT signal
to be unmasked. See Intel SDM section on differences between VMX Root
operation and normal operation. Thus every CPU must be forced out of
VMX operation. Since the CPU may hang rather than restart if INIT fails,
a manual hardware "reset" is the only way out of this state in a
lights-out datacenter (well, if there is a BMC, it can issue a
hardware RESET to the chip).
Style errors in original file fixed, at request of Randy Dunlap:
eliminate '/**' in non-kernel-doc comments.

Fixes: 208067 <https://bugzilla.kernel.org/show_bug.cgi?id=208067>
Reported-by: David P. Reed 
Reported-by: Randy Dunlap 
Suggested-by: Thomas Gleixner 
Suggested-by: Sean Christopherson 
Suggested-by: Andy Lutomirski 
Signed-off-by: David P. Reed 
---
 arch/x86/include/asm/virtext.h | 40 --
 arch/x86/kernel/reboot.c   | 13 +++
 2 files changed, 32 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/virtext.h b/arch/x86/include/asm/virtext.h
index 9aad0e0876fb..ed22c1983da8 100644
--- a/arch/x86/include/asm/virtext.h
+++ b/arch/x86/include/asm/virtext.h
@@ -30,15 +30,15 @@ static inline int cpu_has_vmx(void)
 }
 
 
-/** Disable VMX on the current CPU
+/* Disable VMX on the current CPU
  *
- * vmxoff causes a undefined-opcode exception if vmxon was not run
- * on the CPU previously. Only call this function if you know VMX
- * is enabled.
+ * vmxoff causes an undefined-opcode exception if vmxon was not run
+ * on the CPU previously. Only call this function directly if you know VMX
+ * is enabled *and* CPU is in VMX root operation.
  */
 static inline void cpu_vmxoff(void)
 {
-   asm volatile ("vmxoff");
+   asm volatile ("vmxoff" ::: "cc", "memory"); /* clears all flags on 
success */
cr4_clear_bits(X86_CR4_VMXE);
 }
 
@@ -47,17 +47,35 @@ static inline int cpu_vmx_enabled(void)
return __read_cr4() & X86_CR4_VMXE;
 }
 
-/** Disable VMX if it is enabled on the current CPU
- *
- * You shouldn't call this if cpu_has_vmx() returns 0.
+/*
+ * Safely disable VMX root operation if active
+ * Note that if CPU is not in VMX root operation this
+ * VMXOFF will raise an undefined operation fault,
+ * so use the exception masking facility to handle that RARE
+ * case.
+ * You shouldn't call this directly if cpu_has_vmx() returns 0
+ */
+static inline void cpu_vmxoff_safe(void)
+{
+   asm volatile("1:vmxoff\n\t" /* clears all flags on success */
+   "2:\n\t"
+_ASM_EXTABLE(1b, 2b)
+::: "cc",  "memory");
+   cr4_clear_bits(X86_CR4_VMXE);
+}
+
+/*
+ * Force disable VMX if it is enabled on the current CPU,
+ * when it is unknown whether CPU is in VMX operation.
  */
 static inline void __cpu_emergency_vmxoff(void)
 {
-   if (cpu_vmx_enabled())
-   cpu_vmxoff();
+   if (!cpu_vmx_enabled())
+   return;
+   cpu_vmxoff_safe();
 }
 
-/** Disable VMX if it is supported and enabled on the current CPU
+/* Force disable VMX if it is supported on current CPU
  */
 static inline void cpu_emergency_vmxoff(void)
 {
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index e040ba6be27b..b0e6b106a67e 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -540,21 +540,14 @@ static void emergency_vmx_disable_all(void)
 *
 * For safety, we will avoid running the nmi_shootdown_cpus()
 * stuff unnecessarily, but we don't have a way to check
-* if other CPUs have VMX enabled. So we will call it only if the
-* CPU we are running on has VMX enabled.
-*
-* We will miss cases where VMX is not enabled on all CPUs. This
-* shouldn't do much harm because KVM always enable VMX on all
-* CPUs anyway. But we can miss it on the small window where KVM
-* is still enabling VMX.
+* if other CPUs have VMX enabled.
 */
-   if (cpu_has_vmx() && cpu_vmx_enabled()) {
+   if (cpu_has_vmx()) {
/* Disable VMX on this CPU. */
-   cpu_v
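
The cpu_vmx_enabled() test that the patch relies on reduces to a single bit check. A sketch in Python (CR4.VMXE is bit 13 per the Intel SDM; the CR4 value is passed in as an integer here, since only kernel code can read the actual register):

```python
X86_CR4_VMXE = 1 << 13  # CR4 bit 13: VMX-enable bit (Intel SDM)

def cpu_vmx_enabled(cr4: int) -> bool:
    """Mirror of the kernel's cpu_vmx_enabled(): true iff CR4.VMXE is set."""
    return bool(cr4 & X86_CR4_VMXE)

# CR4.VMXE set means the CPU *may* be in VMX root operation; clear means
# it certainly is not, which is why the patch keys the VMXOFF on this bit.
```

Note that, as the commit message stresses, VMXE being set does not prove the CPU is in VMX root operation; that ambiguity is exactly why the emergency VMXOFF must tolerate a fault.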


[PATCH] Fix undefined operation VMXOFF during reboot and crash

2020-06-10 Thread David P. Reed
If a panic/reboot occurs when CR4 has VMX enabled, a VMXOFF is
done on all CPUS, to allow the INIT IPI to function, since
INIT is suppressed when CPUs are in VMX root operation.
However, VMXOFF causes an undefined operation fault if the CPU is not
in VMX operation, that is, VMXON has not been executed, or VMXOFF
has been executed but VMX is still enabled. This fix makes the reboot
work more reliably by modifying the #UD handler to skip the VMXOFF
if VMX is enabled on the CPU and the VMXOFF is executed as part
of cpu_emergency_vmxoff().
The logic in reboot.c is also corrected, since the point of forcing
the processor out of VMX root operation is that, while in VMX root
operation, the processor's INIT signal is always masked.
See Intel SDM section on differences between VMX Root operation and normal
operation. Thus every CPU must be forced out of VMX operation.
Since the CPU will hang rather than restart, a manual "reset" is the
only way out of this state (or if there is a BMC, it can issue a RESET
to the chip).

Signed-off-by: David P. Reed 
---
 arch/x86/include/asm/virtext.h | 24 
 arch/x86/kernel/reboot.c   | 13 ++---
 arch/x86/kernel/traps.c| 52 --
 3 files changed, 71 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/virtext.h b/arch/x86/include/asm/virtext.h
index 9aad0e0876fb..ea2d67191684 100644
--- a/arch/x86/include/asm/virtext.h
+++ b/arch/x86/include/asm/virtext.h
@@ -13,12 +13,16 @@
 #ifndef _ASM_X86_VIRTEX_H
 #define _ASM_X86_VIRTEX_H
 
+#include 
+
 #include 
 
 #include 
 #include 
 #include 
 
+DECLARE_PER_CPU_READ_MOSTLY(int, doing_emergency_vmxoff);
+
 /*
  * VMX functions:
  */
@@ -33,8 +37,8 @@ static inline int cpu_has_vmx(void)
 /** Disable VMX on the current CPU
  *
  * vmxoff causes a undefined-opcode exception if vmxon was not run
- * on the CPU previously. Only call this function if you know VMX
- * is enabled.
+ * on the CPU previously. Only call this function directly if you know VMX
+ * is enabled *and* CPU is in VMX root operation.
  */
 static inline void cpu_vmxoff(void)
 {
@@ -47,17 +51,25 @@ static inline int cpu_vmx_enabled(void)
return __read_cr4() & X86_CR4_VMXE;
 }
 
-/** Disable VMX if it is enabled on the current CPU
+/** Force disable VMX if it is enabled on the current CPU.
+ * Note that if CPU is not in VMX root operation this
+ * VMXOFF will raise an undefined operation fault.
+ * So the 'doing_emergency_vmxoff' percpu flag is set, and
+ * the trap handler then just restarts execution after
+ * the VMXOFF instruction.
  *
- * You shouldn't call this if cpu_has_vmx() returns 0.
+ * You shouldn't call this directly if cpu_has_vmx() returns 0.
  */
 static inline void __cpu_emergency_vmxoff(void)
 {
-   if (cpu_vmx_enabled())
+   if (cpu_vmx_enabled()) {
+   this_cpu_write(doing_emergency_vmxoff, 1);
cpu_vmxoff();
+   this_cpu_write(doing_emergency_vmxoff, 0);
+   }
 }
 
-/** Disable VMX if it is supported and enabled on the current CPU
+/** Force disable VMX if it is supported and enabled on the current CPU
  */
 static inline void cpu_emergency_vmxoff(void)
 {
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 3ca43be4f9cf..abc8b51a57c7 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -540,21 +540,14 @@ static void emergency_vmx_disable_all(void)
 *
 * For safety, we will avoid running the nmi_shootdown_cpus()
 * stuff unnecessarily, but we don't have a way to check
-* if other CPUs have VMX enabled. So we will call it only if the
-* CPU we are running on has VMX enabled.
-*
-* We will miss cases where VMX is not enabled on all CPUs. This
-* shouldn't do much harm because KVM always enable VMX on all
-* CPUs anyway. But we can miss it on the small window where KVM
-* is still enabling VMX.
+* if other CPUs have VMX enabled.
 */
-   if (cpu_has_vmx() && cpu_vmx_enabled()) {
+   if (cpu_has_vmx()) {
/* Disable VMX on this CPU. */
-   cpu_vmxoff();
+   cpu_emergency_vmxoff();
 
/* Halt and disable VMX on the other CPUs */
nmi_shootdown_cpus(vmxoff_nmi);
-
}
 }
 
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 4cc541051994..2dcf57ef467e 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -39,6 +39,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -59,6 +60,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_X86_64
 #include 
@@ -70,6 +72,8 @@
 #include 
 #endif
 
+DEFINE_PER_CPU_READ_MOSTLY(int, doing_emergency_vmxoff) = 0;
+
 DECLARE_BITMAP(system_vectors, NR_VECTORS);
 
 static inline void cond_local_irq_enable(struct pt_regs *regs)
@@ -115,6 +119,43 @@ int fixup_bug(struct pt_regs *regs, int trapnr)
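
The per-cpu-flag handshake between __cpu_emergency_vmxoff() and the #UD handler in this version of the patch can be modeled in a few lines. This is a sketch only; emergency_vmxoff() and ud_fixup() below are hypothetical stand-ins for the kernel paths, not real kernel APIs:

```python
# Per-cpu flag, as declared by DEFINE_PER_CPU_READ_MOSTLY in the patch.
doing_emergency_vmxoff = {0: 0, 1: 0}

def ud_fixup(cpu: int) -> bool:
    """Model of the #UD handler check: skip the faulting VMXOFF only when
    the per-cpu flag is set; otherwise report it as a real bug."""
    return bool(doing_emergency_vmxoff[cpu])

def emergency_vmxoff(cpu: int, in_vmx_root: bool) -> bool:
    """Model of __cpu_emergency_vmxoff(): VMXOFF raises #UD when the CPU
    is not in VMX root operation; the flag makes that fault harmless."""
    doing_emergency_vmxoff[cpu] = 1
    faulted = not in_vmx_root              # VMXOFF #UDs outside VMX root
    recovered = ud_fixup(cpu) if faulted else True
    doing_emergency_vmxoff[cpu] = 0
    return recovered

assert emergency_vmxoff(0, in_vmx_root=False)  # fault occurs but is masked
assert emergency_vmxoff(1, in_vmx_root=True)   # no fault at all
```

The later v2/v3 patches drop this flag entirely in favor of an exception-table fixup on the VMXOFF instruction itself, which needs no handler-side state.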

Re: [linux-kernel] Re: [PATCH] x86: use explicit timing delay for pit accesses in kernel and pcspkr driver

2008-02-20 Thread David P. Reed
Actually, disparaging things as "one idiotic system" doesn't seem like a 
long-term thoughtful process - it's not even accurate.  There are more 
such systems that are running code today than the total number of 486 
systems ever manufactured.  The production rate is $1M/month.


a) ENE chips are "documented" to receive port 80, and also it is the 
case that modern chipsets will happily diagnose writes to non-existent 
ports as MCE's.   Using side effects that depend on non-existent ports 
just creates a brittle failure mode down the road.  And it's not just 
post ACPI "initialization".   The pcspkr use of port 80 caused solid 
freezes if you typed "tab" to complete a command line and there was 
more than one choice, leading to beeps.


b) sad to say, Linux is not what hardware vendors use as the system that 
their BIOSes MUST work with.  That's Windows, and Windows, whether we 
like it or not does not require hardware vendors to stay away from port 80.


IMHO, calling something "idiotic" is hardly evidence-based decision 
making.   Maybe you love to hate Microsoft, but until Intel writes an 
architecture standard that says explicitly that a "standard PC" must not 
use port 80 for any peripheral, the port 80 thing is folklore, and one 
that is solely Linux-defined.


Rene Herman wrote:

On 20-02-08 18:05, H. Peter Anvin wrote:
 

Rene Herman wrote:


_Something_ like this would seem to be the only remaining option. It 
seems fairly unuseful to #ifdef around that switch statement for 
kernels without support for the earlier families, but if you insist...




"Only remaining option" other than the one we've had all along.  Even 
on the one idiotic set of systems which break, it only breaks 
post-ACPI initialization, IIRC.


Linus vetoed the DMI switch.

Rene.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/




[PATCH] x86: use explicit timing delay for pit accesses in kernel and pcspkr driver

2008-02-18 Thread David P. Reed
x86: use explicit timing delay for pit accesses in kernel and pcspkr driver

pit accesses in i8253.c and pcspkr driver use outb_p for timing.
Fix them to use explicit timing delay for access to PIT,
rather than inb_p/outb_p calls, which use insufficiently explicit
delays (defaulting to port 80 writes) that can cause freeze problems
on some machines, such as Quanta motherboard machines using ENE ECs.

Since the pcspkr driver accesses PIT registers directly, it should also
use outb_pit, which is inlined, so it does not need to be exported.
Explicit timing delay is only needed in pcspkr for accesses to the 8253 PIT.
Fix pcspkr driver to use the new outb_pit call properly, use
named PIT port values rather than hex constants, and drop its use of
inb_p and outb_p in accessing port 61h where it has never been needed.

Signed-off-by: David P. Reed <[EMAIL PROTECTED]>

Index: linux-2.6/drivers/input/misc/pcspkr.c
===
--- linux-2.6.orig/drivers/input/misc/pcspkr.c
+++ linux-2.6/drivers/input/misc/pcspkr.c
@@ -36,6 +36,7 @@ static int pcspkr_event(struct input_dev
 {
unsigned int count = 0;
unsigned long flags;
+   unsigned char port61;
 
if (type != EV_SND)
return -1;
@@ -51,17 +52,18 @@ static int pcspkr_event(struct input_dev
 
	spin_lock_irqsave(&i8253_lock, flags);
 
+   port61 = inb(0x61);
if (count) {
/* enable counter 2 */
-   outb_p(inb_p(0x61) | 3, 0x61);
+   outb(port61 | 3, 0x61);
/* set command for counter 2, 2 byte write */
-   outb_p(0xB6, 0x43);
+   outb_pit(0xB6, PIT_MODE);
/* select desired HZ */
-   outb_p(count & 0xff, 0x42);
-   outb((count >> 8) & 0xff, 0x42);
+   outb_pit(count & 0xff, PIT_CH2);
+   outb((count >> 8) & 0xff, PIT_CH2);
} else {
/* disable counter 2 */
-   outb(inb_p(0x61) & 0xFC, 0x61);
+   outb(port61 & 0xFC, 0x61);
}
 
	spin_unlock_irqrestore(&i8253_lock, flags);
Index: linux-2.6/include/asm-x86/i8253.h
===
--- linux-2.6.orig/include/asm-x86/i8253.h
+++ linux-2.6/include/asm-x86/i8253.h
@@ -12,7 +12,25 @@ extern struct clock_event_device *global_clock_event;
 
 extern void setup_pit_timer(void);
 
-#define inb_pit	   inb_p
-#define outb_pit   outb_p
+/* accesses to PIT registers need careful delays on some platforms. Define
+   them here in a common place */
+static inline unsigned char inb_pit(unsigned int port)
+{
+   /* delay for some accesses to PIT on motherboard or in chipset must be
+  at least one microsecond, but be safe here. */
+   unsigned char value = inb(port);
+   udelay(2);
+   return value;
+}
+
+static inline void outb_pit(unsigned char value, unsigned int port)
+{
+   /* delay for some accesses to PIT on motherboard or in chipset must be
+  at least one microsecond, but be safe here. */
+   outb(value, port);
+   udelay(2);
+}
+
+
 
 #endif /* __ASM_I8253_H__ */
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
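
The two-byte write to PIT_CH2 in the patch above splits a 16-bit divisor derived from the PIT's fixed input clock. A sketch of the arithmetic (the PIT_TICK_RATE of 1193182 Hz is the standard i8253 input clock, assumed here, and pit_divisor is a hypothetical helper, not a kernel function):

```python
PIT_TICK_RATE = 1193182  # standard i8253/i8254 input clock, in Hz (assumed)

def pit_divisor(hz: int):
    """16-bit counter value for a given tone frequency, split into the
    low and high bytes exactly as the driver writes them to PIT_CH2."""
    count = PIT_TICK_RATE // hz
    return count, count & 0xFF, (count >> 8) & 0xFF

# e.g. a 1 kHz beep programs counter 2 with 1193 = 0x04A9
count, lo, hi = pit_divisor(1000)
```

This is why the driver performs two successive outb_pit calls per tone: the 8253 latches the low byte first, then the high byte, per the "2 byte write" command (0xB6) it programs into PIT_MODE.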


[PATCH] x86: define outb_pic and inb_pic to stop using outb_p and inb_p

2008-02-18 Thread David P. Reed
x86: define outb_pic and inb_pic to stop using outb_p and inb_p
 
The delay between io port accesses to the PIC is now defined using outb_pic
and inb_pic.  This fix provides the next step, using udelay(2) to define the
*PIC specific* timing requirements, rather than on bus-oriented timing, which
is not well calibrated.

Again, the primary reason for fixing this is to use proper delay strategy,
and in particular to fix crashes that can result from using port 80 writes
on machines that have resources on port 80, such as the ENE chips used by Quanta
in laptops it designs and sells to, e.g., HP.

Signed-off-by: David P. Reed <[EMAIL PROTECTED]>
Index: linux-2.6/include/asm-x86/i8259.h
===
--- linux-2.6.orig/include/asm-x86/i8259.h
+++ linux-2.6/include/asm-x86/i8259.h
@@ -29,7 +29,23 @@ extern void enable_8259A_irq(unsigned in
 extern void disable_8259A_irq(unsigned int irq);
 extern unsigned int startup_8259A_irq(unsigned int irq);
 
-#define inb_pic	   inb_p
-#define outb_pic   outb_p
+/* the PIC may need a careful delay on some platforms, hence specific calls */
+static inline unsigned char inb_pic(unsigned int port)
+{
+   /* delay for some accesses to PIC on motherboard or in chipset must be
+  at least one microsecond, but be safe here. */
+   unsigned char value = inb(port);
+   udelay(2);
+   return value;
+}
+
+static inline void outb_pic(unsigned char value, unsigned int port)
+{
+   /* delay for some accesses to PIC on motherboard or in chipset must be
+  at least one microsecond, but be safe here. */
+   outb(value, port);
+   udelay(2);
+}
+
 
 #endif /* __ASM_I8259_H__ */

-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [linux-kernel] Re: [patch 1/2] x86: define outb_pic and inb_pic to stop using outb_p and inb_p

2008-02-18 Thread David P. Reed

Alan Cox wrote:

+unsigned char inb_pic(unsigned int port)
+{
+   /* delay for some accesses to PIC on motherboard or in chipset must be
+  at least one microsecond, but be safe here. */
+   unsigned char value = inb(port);
+   udelay(2);
+   return value;
+}



inline it. Its almost no instructions

  
Will do. Assume you desire inlining of the outb_pic, and also the 
inb_pit and outb_pit routines.  Didn't do it because the code is 
slightly bigger than the call.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[patch] x86: revised - use explicit timing delay for pit accesses

2008-02-18 Thread David P. Reed
x86: revised - use explicit timing delay for pit accesses

pit accesses in i8253.c and pcspkr driver use outb_p for timing.
Fix them to use explicit timing delay for access to PIT,
rather than inb_p/outb_p calls, which use insufficiently explicit
delays (defaulting to port 80 writes) that can cause freeze problems
on some machines, such as Quanta motherboard machines using ENE ECs.

Since the pcspkr driver accesses PIT registers directly, it needs
the symbol outb_pit exported, so it can be built as a module.
Explicit timing delay is only needed in pcspkr for accesses to the 8253 PIT.
Fix pcspkr driver to use the new outb_pit call properly, use
named PIC port values rather than hex constants, and drop its use of
inb_p and outb_p in accessing port 61h where it has never been needed.

Signed-off-by: David P. Reed <[EMAIL PROTECTED]>

Index: linux-2.6/drivers/input/misc/pcspkr.c
===
--- linux-2.6.orig/drivers/input/misc/pcspkr.c
+++ linux-2.6/drivers/input/misc/pcspkr.c
@@ -36,6 +36,7 @@ static int pcspkr_event(struct input_dev
 {
unsigned int count = 0;
unsigned long flags;
+   unsigned char port61;
 
if (type != EV_SND)
return -1;
@@ -51,17 +52,18 @@ static int pcspkr_event(struct input_dev
 
	spin_lock_irqsave(&i8253_lock, flags);
 
+   port61 = inb(0x61);
if (count) {
/* enable counter 2 */
-   outb_p(inb_p(0x61) | 3, 0x61);
+   outb(port61 | 3, 0x61);
/* set command for counter 2, 2 byte write */
-   outb_p(0xB6, 0x43);
+   outb_pit(0xB6, PIT_MODE);
/* select desired HZ */
-   outb_p(count & 0xff, 0x42);
-   outb((count >> 8) & 0xff, 0x42);
+   outb_pit(count & 0xff, PIT_CH2);
+   outb((count >> 8) & 0xff, PIT_CH2);
} else {
/* disable counter 2 */
-   outb(inb_p(0x61) & 0xFC, 0x61);
+   outb(port61 & 0xFC, 0x61);
}
 
	spin_unlock_irqrestore(&i8253_lock, flags);
Index: linux-2.6/arch/x86/kernel/i8253.c
===
--- linux-2.6.orig/arch/x86/kernel/i8253.c
+++ linux-2.6/arch/x86/kernel/i8253.c
@@ -31,6 +31,29 @@ static inline void pit_disable_clocksour
 struct clock_event_device *global_clock_event;
 
 /*
+ * define the PIT specific port access routines, which define the timing
+ * needed by the PIT registers on some platforms.
+ */
+unsigned char inb_pit(unsigned int port)
+{
+   /* delay for some accesses to PIT on motherboard or in chipset must be
+  at least one microsecond, but be safe here. */
+   unsigned char value = inb(port);
+   udelay(2);
+   return value;
+}
+EXPORT_SYMBOL(inb_pit);
+
+void outb_pit(unsigned char value, unsigned int port)
+{
+   /* delay for some accesses to PIT on motherboard or in chipset must be
+  at least one microsecond, but be safe here. */
+   outb(value, port);
+   udelay(2);
+}
+EXPORT_SYMBOL(outb_pit);
+
+/*
  * Initialize the PIT timer.
  *
  * This is also called after resume to bring the PIT into operation again.
Index: linux-2.6/include/asm-x86/i8253.h
===
--- linux-2.6.orig/include/asm-x86/i8253.h
+++ linux-2.6/include/asm-x86/i8253.h
@@ -12,7 +12,9 @@ extern struct clock_event_device *global
 
 extern void setup_pit_timer(void);
 
-#define inb_pitinb_p
-#define outb_pit   outb_p
+/* accesses to PIT registers need careful delays on some platforms. Define
+   them here in a common place */
+extern unsigned char inb_pit(unsigned int port);
+extern void outb_pit(unsigned char value, unsigned int port);
 
 #endif /* __ASM_I8253_H__ */
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [linux-kernel] [patch 2/2] x86: use explicit timing delay for pit accesses

2008-02-18 Thread David P. Reed
Oops. The patch I just submitted for i8253.c didn't export the symbol
needed by the pcspkr driver to build it as a module.  I will send the
revised patch shortly.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[patch 2/2] x86: use explicit timing delay for pit accesses

2008-02-18 Thread David P. Reed
PIT accesses in i8253.c and the pcspkr driver use outb_p for timing.
Fix them to use an explicit timing delay for accesses to the PIT,
rather than inb_p/outb_p calls, whose implicit delays (defaulting to
port 80 writes) can cause freeze problems on some machines, such as
Quanta motherboard machines using ENE ECs.

The explicit timing delay is only needed in pcspkr for accesses to the 8253 PIT.
Fix the pcspkr driver to use the new outb_pit call properly, use
named PIT port values rather than hex constants, and drop its use of
inb_p and outb_p in accessing port 61h, where they have never been needed.

Signed-off-by: David P. Reed <[EMAIL PROTECTED]>

Index: linux-2.6/drivers/input/misc/pcspkr.c
===
--- linux-2.6.orig/drivers/input/misc/pcspkr.c
+++ linux-2.6/drivers/input/misc/pcspkr.c
@@ -36,6 +36,7 @@ static int pcspkr_event(struct input_dev
 {
unsigned int count = 0;
unsigned long flags;
+   unsigned char port61;
 
if (type != EV_SND)
return -1;
@@ -51,17 +52,18 @@ static int pcspkr_event(struct input_dev
 
	spin_lock_irqsave(&i8253_lock, flags);
 
+   port61 = inb(0x61);
if (count) {
/* enable counter 2 */
-   outb_p(inb_p(0x61) | 3, 0x61);
+   outb(port61 | 3, 0x61);
/* set command for counter 2, 2 byte write */
-   outb_p(0xB6, 0x43);
+   outb_pit(0xB6, PIT_MODE);
/* select desired HZ */
-   outb_p(count & 0xff, 0x42);
-   outb((count >> 8) & 0xff, 0x42);
+   outb_pit(count & 0xff, PIT_CH2);
+   outb((count >> 8) & 0xff, PIT_CH2);
} else {
/* disable counter 2 */
-   outb(inb_p(0x61) & 0xFC, 0x61);
+   outb(port61 & 0xFC, 0x61);
}
 
	spin_unlock_irqrestore(&i8253_lock, flags);
Index: linux-2.6/arch/x86/kernel/i8253.c
===
--- linux-2.6.orig/arch/x86/kernel/i8253.c
+++ linux-2.6/arch/x86/kernel/i8253.c
@@ -31,6 +31,27 @@ static inline void pit_disable_clocksour
 struct clock_event_device *global_clock_event;
 
 /*
+ * define the PIT specific port access routines, which define the timing
+ * needed by the PIT registers on some platforms.
+ */
+unsigned char inb_pit(unsigned int port)
+{
+   /* delay for some accesses to PIT on motherboard or in chipset must be
+  at least one microsecond, but be safe here. */
+   unsigned char value = inb(port);
+   udelay(2);
+   return value;
+}
+
+void outb_pit(unsigned char value, unsigned int port)
+{
+   /* delay for some accesses to PIT on motherboard or in chipset must be
+  at least one microsecond, but be safe here. */
+   outb(value, port);
+   udelay(2);
+}
+
+/*
  * Initialize the PIT timer.
  *
  * This is also called after resume to bring the PIT into operation again.
Index: linux-2.6/include/asm-x86/i8253.h
===
--- linux-2.6.orig/include/asm-x86/i8253.h
+++ linux-2.6/include/asm-x86/i8253.h
@@ -12,7 +12,9 @@ extern struct clock_event_device *global
 
 extern void setup_pit_timer(void);
 
-#define inb_pitinb_p
-#define outb_pit   outb_p
+/* accesses to PIT registers need careful delays on some platforms. Define
+   them here in a common place */
+extern unsigned char inb_pit(unsigned int port);
+extern void outb_pit(unsigned char value, unsigned int port);
 
 #endif /* __ASM_I8253_H__ */
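For context on the two-byte write the patch performs: the count programmed into PIT channel 2 is the 1,193,182 Hz input clock divided by the desired tone frequency, written low byte first after the 0xB6 command. A user-space sketch of that arithmetic (the helper names are illustrative only, not kernel code):

```c
#include <assert.h>

#define PIT_TICK_RATE 1193182  /* PIT input clock, in Hz */

/* Divisor the driver would program for a given tone frequency. */
static unsigned int pit_divisor(unsigned int hz)
{
	return PIT_TICK_RATE / hz;
}

/* The 16-bit count is sent as two successive byte writes, low byte
 * first, matching the 0xB6 "2 byte write" access mode. */
static unsigned char pit_low_byte(unsigned int count)
{
	return count & 0xff;
}

static unsigned char pit_high_byte(unsigned int count)
{
	return (count >> 8) & 0xff;
}
```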



[patch 1/2] x86: define outb_pic and inb_pic to stop using outb_p and inb_p

2008-02-18 Thread David P. Reed
The delay between I/O port accesses to the PIC is now defined using outb_pic
and inb_pic.  This fix provides the next step, using udelay(2) to define the
*PIC specific* timing requirements, rather than relying on bus-oriented
timing, which is not well calibrated.

Again, the primary reason for fixing this is to use a proper delay strategy,
and in particular to fix crashes that can result from using port 80 writes
on machines that have resources on port 80, such as the ENE chips used by
Quanta in laptops it designs and sells to, e.g. HP.

Signed-off-by: David P. Reed <[EMAIL PROTECTED]>
Index: linux-2.6/arch/x86/kernel/i8259_32.c
===
--- linux-2.6.orig/arch/x86/kernel/i8259_32.c
+++ linux-2.6/arch/x86/kernel/i8259_32.c
@@ -277,6 +277,23 @@ static int __init i8259A_init_sysfs(void
 
 device_initcall(i8259A_init_sysfs);
 
+unsigned char inb_pic(unsigned int port)
+{
+   /* delay for some accesses to PIC on motherboard or in chipset must be
+  at least one microsecond, but be safe here. */
+   unsigned char value = inb(port);
+   udelay(2);
+   return value;
+}
+
+void outb_pic(unsigned char value, unsigned int port)
+{
+   /* delay for some accesses to PIC on motherboard or in chipset must be
+  at least one microsecond, but be safe here. */
+   outb(value, port);
+   udelay(2);
+}
+
 void init_8259A(int auto_eoi)
 {
unsigned long flags;
Index: linux-2.6/arch/x86/kernel/i8259_64.c
===
--- linux-2.6.orig/arch/x86/kernel/i8259_64.c
+++ linux-2.6/arch/x86/kernel/i8259_64.c
@@ -347,6 +347,23 @@ static int __init i8259A_init_sysfs(void
 
 device_initcall(i8259A_init_sysfs);
 
+unsigned char inb_pic(unsigned int port)
+{
+   /* delay for some accesses to PIC on motherboard or in chipset must be
+  at least one microsecond, but be safe here. */
+   unsigned char value = inb(port);
+   udelay(2);
+   return value;
+}
+
+void outb_pic(unsigned char value, unsigned int port)
+{
+   /* delay for some accesses to PIC on motherboard or in chipset must be
+  at least one microsecond, but be safe here. */
+   outb(value, port);
+   udelay(2);
+}
+
 void init_8259A(int auto_eoi)
 {
unsigned long flags;
Index: linux-2.6/include/asm-x86/i8259.h
===
--- linux-2.6.orig/include/asm-x86/i8259.h
+++ linux-2.6/include/asm-x86/i8259.h
@@ -28,8 +28,8 @@ extern void init_8259A(int auto_eoi);
 extern void enable_8259A_irq(unsigned int irq);
 extern void disable_8259A_irq(unsigned int irq);
 extern unsigned int startup_8259A_irq(unsigned int irq);
-
-#define inb_picinb_p
-#define outb_pic   outb_p
+/* the PIC may need a careful delay on some platforms, hence specific calls */
+extern unsigned char inb_pic(unsigned int port);
+extern void outb_pic(unsigned char value, unsigned int port);
 
 #endif /* __ASM_I8259_H__ */
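The accessor pattern above (access the port, then let the device settle) can be exercised in user space by substituting mock port functions for inb()/outb() and counting the delay calls; everything below (mock_inb, fake_port, and friends) is illustrative test scaffolding, not kernel API:

```c
#include <assert.h>

/* Mock port-I/O layer standing in for inb()/outb()/udelay(). */
static unsigned char fake_port[0x100];
static int delay_calls;

static unsigned char mock_inb(unsigned int port)
{
	return fake_port[port & 0xff];
}

static void mock_outb(unsigned char value, unsigned int port)
{
	fake_port[port & 0xff] = value;
}

static void mock_udelay(unsigned long usecs)
{
	(void)usecs;
	delay_calls++;	/* record that a settle delay happened */
}

/* Same shape as the patch's accessors: do the access, then wait
 * for the device to settle before the next access can occur. */
static unsigned char inb_pic(unsigned int port)
{
	unsigned char value = mock_inb(port);
	mock_udelay(2);
	return value;
}

static void outb_pic(unsigned char value, unsigned int port)
{
	mock_outb(value, port);
	mock_udelay(2);
}
```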



[patch 0/2] replacement submission for motherboard/chipset iodelay fixes

2008-02-18 Thread David P. Reed
Here are the two revised patches, based on Alan Cox's NAKs and suggestions
regarding using the _pic and _pit versions of inb/outb.  The new patches
use udelay(2) as a conservative delay for the PIC and PIT, and isolate
that usage in the respective i8253.c and i8259_*.c files.
Together with the already-acked patch for the CMOS RTC (not included here),
these should solve the problem on modern machines, the ones that don't use
older devices, but only motherboard/chipset resources.



[patch] x86: revised - use explicit timing delay for pit accesses

2008-02-18 Thread David P. Reed
x86: revised - use explicit timing delay for pit accesses

PIT accesses in i8253.c and the pcspkr driver use outb_p for timing.
Fix them to use an explicit timing delay for accesses to the PIT,
rather than inb_p/outb_p calls, whose implicit delays (defaulting to
port 80 writes) can cause freeze problems on some machines, such as
Quanta motherboard machines using ENE ECs.

Since the pcspkr driver accesses PIT registers directly, it needs
the symbol outb_pit exported, so it can be built as a module.
The explicit timing delay is only needed in pcspkr for accesses to the 8253 PIT.
Fix the pcspkr driver to use the new outb_pit call properly, use
named PIT port values rather than hex constants, and drop its use of
inb_p and outb_p in accessing port 61h, where they have never been needed.

Signed-off-by: David P. Reed [EMAIL PROTECTED]

Index: linux-2.6/drivers/input/misc/pcspkr.c
===
--- linux-2.6.orig/drivers/input/misc/pcspkr.c
+++ linux-2.6/drivers/input/misc/pcspkr.c
@@ -36,6 +36,7 @@ static int pcspkr_event(struct input_dev
 {
unsigned int count = 0;
unsigned long flags;
+   unsigned char port61;
 
if (type != EV_SND)
return -1;
@@ -51,17 +52,18 @@ static int pcspkr_event(struct input_dev
 
	spin_lock_irqsave(&i8253_lock, flags);
 
+   port61 = inb(0x61);
if (count) {
/* enable counter 2 */
-   outb_p(inb_p(0x61) | 3, 0x61);
+   outb(port61 | 3, 0x61);
/* set command for counter 2, 2 byte write */
-   outb_p(0xB6, 0x43);
+   outb_pit(0xB6, PIT_MODE);
/* select desired HZ */
-   outb_p(count & 0xff, 0x42);
-   outb((count >> 8) & 0xff, 0x42);
+   outb_pit(count & 0xff, PIT_CH2);
+   outb((count >> 8) & 0xff, PIT_CH2);
} else {
/* disable counter 2 */
-   outb(inb_p(0x61) & 0xFC, 0x61);
+   outb(port61 & 0xFC, 0x61);
}
 
	spin_unlock_irqrestore(&i8253_lock, flags);
Index: linux-2.6/arch/x86/kernel/i8253.c
===
--- linux-2.6.orig/arch/x86/kernel/i8253.c
+++ linux-2.6/arch/x86/kernel/i8253.c
@@ -31,6 +31,29 @@ static inline void pit_disable_clocksour
 struct clock_event_device *global_clock_event;
 
 /*
+ * define the PIT specific port access routines, which define the timing
+ * needed by the PIT registers on some platforms.
+ */
+unsigned char inb_pit(unsigned int port)
+{
+   /* delay for some accesses to PIT on motherboard or in chipset must be
+  at least one microsecond, but be safe here. */
+   unsigned char value = inb(port);
+   udelay(2);
+   return value;
+}
+EXPORT_SYMBOL(inb_pit);
+
+void outb_pit(unsigned char value, unsigned int port)
+{
+   /* delay for some accesses to PIT on motherboard or in chipset must be
+  at least one microsecond, but be safe here. */
+   outb(value, port);
+   udelay(2);
+}
+EXPORT_SYMBOL(outb_pit);
+
+/*
  * Initialize the PIT timer.
  *
  * This is also called after resume to bring the PIT into operation again.
Index: linux-2.6/include/asm-x86/i8253.h
===
--- linux-2.6.orig/include/asm-x86/i8253.h
+++ linux-2.6/include/asm-x86/i8253.h
@@ -12,7 +12,9 @@ extern struct clock_event_device *global
 
 extern void setup_pit_timer(void);
 
-#define inb_pitinb_p
-#define outb_pit   outb_p
+/* accesses to PIT registers need careful delays on some platforms. Define
+   them here in a common place */
+extern unsigned char inb_pit(unsigned int port);
+extern void outb_pit(unsigned char value, unsigned int port);
 
 #endif /* __ASM_I8253_H__ */
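For reference, the 0xB6 control word written to PIT_MODE packs four fields: counter select (channel 2), access mode (low then high byte), operating mode (3, square wave), and binary/BCD counting. A small sketch decoding those fields (the helper names are my own, not from the patch or kernel):

```c
#include <assert.h>

/* Decode the i8253 control word fields of a PIT_MODE command byte. */
static unsigned int pit_counter_select(unsigned char cmd)
{
	return (cmd >> 6) & 0x3;	/* bits 7-6: which counter */
}

static unsigned int pit_access_mode(unsigned char cmd)
{
	return (cmd >> 4) & 0x3;	/* bits 5-4: latch/lo/hi/lo-then-hi */
}

static unsigned int pit_operating_mode(unsigned char cmd)
{
	return (cmd >> 1) & 0x7;	/* bits 3-1: modes 0-5 */
}

static unsigned int pit_bcd(unsigned char cmd)
{
	return cmd & 0x1;		/* bit 0: 0 = binary, 1 = BCD */
}
```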


Re: [linux-kernel] Re: [patch 1/2] x86: define outb_pic and inb_pic to stop using outb_p and inb_p

2008-02-18 Thread David P. Reed

Alan Cox wrote:

+unsigned char inb_pic(unsigned int port)
+{
+   /* delay for some accesses to PIC on motherboard or in chipset must be
+  at least one microsecond, but be safe here. */
+   unsigned char value = inb(port);
+   udelay(2);
+   return value;
+}



inline it. Its almost no instructions

  
Will do. Assume you desire inlining of the outb_pic, and also the 
inb_pit and outb_pit routines.  Didn't do it because the code is 
slightly bigger than the call.



[PATCH] x86: define outb_pic and inb_pic to stop using outb_p and inb_p

2008-02-18 Thread David P. Reed
x86: define outb_pic and inb_pic to stop using outb_p and inb_p
 
The delay between I/O port accesses to the PIC is now defined using outb_pic
and inb_pic.  This fix provides the next step, using udelay(2) to define the
*PIC specific* timing requirements, rather than relying on bus-oriented
timing, which is not well calibrated.

Again, the primary reason for fixing this is to use a proper delay strategy,
and in particular to fix crashes that can result from using port 80 writes
on machines that have resources on port 80, such as the ENE chips used by
Quanta in laptops it designs and sells to, e.g. HP.

Signed-off-by: David P. Reed [EMAIL PROTECTED]
Index: linux-2.6/include/asm-x86/i8259.h
===
--- linux-2.6.orig/include/asm-x86/i8259.h
+++ linux-2.6/include/asm-x86/i8259.h
@@ -29,7 +29,23 @@ extern void enable_8259A_irq(unsigned in
 extern void disable_8259A_irq(unsigned int irq);
 extern unsigned int startup_8259A_irq(unsigned int irq);
 
-#define inb_picinb_p
-#define outb_pic   outb_p
+/* the PIC may need a careful delay on some platforms, hence specific calls */
+static inline unsigned char inb_pic(unsigned int port)
+{
+   /* delay for some accesses to PIC on motherboard or in chipset must be
+  at least one microsecond, but be safe here. */
+   unsigned char value = inb(port);
+   udelay(2);
+   return value;
+}
+
+static inline void outb_pic(unsigned char value, unsigned int port)
+{
+   /* delay for some accesses to PIC on motherboard or in chipset must be
+  at least one microsecond, but be safe here. */
+   outb(value, port);
+   udelay(2);
+}
+
 
 #endif /* __ASM_I8259_H__ */



[PATCH] x86: use explicit timing delay for pit accesses in kernel and pcspkr driver

2008-02-18 Thread David P. Reed
x86: use explicit timing delay for pit accesses in kernel and pcspkr driver

PIT accesses in i8253.c and the pcspkr driver use outb_p for timing.
Fix them to use an explicit timing delay for accesses to the PIT,
rather than inb_p/outb_p calls, whose implicit delays (defaulting to
port 80 writes) can cause freeze problems on some machines, such as
Quanta motherboard machines using ENE ECs.

Since the pcspkr driver accesses PIT registers directly, it should also
use outb_pit, which is inlined and therefore does not need to be exported.
The explicit timing delay is only needed in pcspkr for accesses to the 8253 PIT.
Fix the pcspkr driver to use the new outb_pit call properly, use
named PIT port values rather than hex constants, and drop its use of
inb_p and outb_p in accessing port 61h, where they have never been needed.

Signed-off-by: David P. Reed [EMAIL PROTECTED]

Index: linux-2.6/drivers/input/misc/pcspkr.c
===
--- linux-2.6.orig/drivers/input/misc/pcspkr.c
+++ linux-2.6/drivers/input/misc/pcspkr.c
@@ -36,6 +36,7 @@ static int pcspkr_event(struct input_dev
 {
unsigned int count = 0;
unsigned long flags;
+   unsigned char port61;
 
if (type != EV_SND)
return -1;
@@ -51,17 +52,18 @@ static int pcspkr_event(struct input_dev
 
	spin_lock_irqsave(&i8253_lock, flags);
 
+   port61 = inb(0x61);
if (count) {
/* enable counter 2 */
-   outb_p(inb_p(0x61) | 3, 0x61);
+   outb(port61 | 3, 0x61);
/* set command for counter 2, 2 byte write */
-   outb_p(0xB6, 0x43);
+   outb_pit(0xB6, PIT_MODE);
/* select desired HZ */
-   outb_p(count & 0xff, 0x42);
-   outb((count >> 8) & 0xff, 0x42);
+   outb_pit(count & 0xff, PIT_CH2);
+   outb((count >> 8) & 0xff, PIT_CH2);
} else {
/* disable counter 2 */
-   outb(inb_p(0x61) & 0xFC, 0x61);
+   outb(port61 & 0xFC, 0x61);
}
 
	spin_unlock_irqrestore(&i8253_lock, flags);
Index: linux-2.6/include/asm-x86/i8253.h
===
--- linux-2.6.orig/include/asm-x86/i8253.h
+++ linux-2.6/include/asm-x86/i8253.h
@@ -12,7 +12,25 @@ extern struct clock_event_device *global
 
 extern void setup_pit_timer(void);
 
-#define inb_pitinb_p
-#define outb_pit   outb_p
+/* accesses to PIT registers need careful delays on some platforms. Define
+   them here in a common place */
+static inline unsigned char inb_pit(unsigned int port)
+{
+   /* delay for some accesses to PIT on motherboard or in chipset must be
+  at least one microsecond, but be safe here. */
+   unsigned char value = inb(port);
+   udelay(2);
+   return value;
+}
+
+static inline void outb_pit(unsigned char value, unsigned int port)
+{
+   /* delay for some accesses to PIT on motherboard or in chipset must be
+  at least one microsecond, but be safe here. */
+   outb(value, port);
+   udelay(2);
+}
+
+
 
 #endif /* __ASM_I8253_H__ */


Re: [linux-kernel] Re: [PATCH 1/3] x86: fix init_8259A() to not use outb_pic

2008-02-17 Thread David P. Reed

Rene Herman wrote:

On 17-02-08 23:25, Alan Cox wrote:


On Sun, 17 Feb 2008 16:56:28 -0500 (EST)
"David P. Reed" <[EMAIL PROTECTED]> wrote:


fix init_8259A() which initializes the 8259 PIC to not use outb_pic,
which is a renamed version of outb_p, and delete outb_pic define.


NAK

The entire point of inb_pic/outb_pic is to isolate the various methods
and keep the logic for delays in one place. Undoing this just creates a
nasty mess.

Quite probably inb_pic/outb_pic will end up as static inlines that do inb
or outb with a udelay of 1 or 2 but that is where the knowledge belongs.


Additional NAK insofar as the PIC delays were reported to be
necessary with some VIA chipsets earlier in these threads.


Rene.

This not being a place where performance matters, I will submit a new 
patch that changes inb_pic and outb_pic to use udelay(2).  However, note 
that init_8259A does not use these consistently in its own accesses to 
the PIC registers.  Should I change it to use the _pic calls wherever 
it touches the PIC registers, to be conservative?  Note that there is a 
udelay(100) after the registers are all set up; perhaps this is the real 
VIA requirement...



[PATCH 3/3] x86: fix pcspkr to not use inb_p/outb_p calls.

2008-02-17 Thread David P. Reed
Fix the pcspkr driver to use an explicit timing delay for accesses to the
PIT, rather than inb_p/outb_p calls, whose implicit delays (defaulting to
port 80 writes) can cause freeze problems on some machines, such as Quanta
motherboard machines using ENE ECs.
The explicit timing delay is only needed for accesses to the 8253 PIT.
The standard requirement for the 8253 to respond to successive writes
is 1 microsecond.  The 8253 has never been on the expansion bus, so
a proper delay has nothing to do with expansion bus timing, but instead
with its internal logic's capability to react to input.  Since udelay is
correctly calibrated by the time the pcspkr driver is initialized, we use
1 microsecond as the delay.

Also shorten lines to less than 80 characters.

Signed-off-by: David P. Reed <[EMAIL PROTECTED]>

Index: linux-2.6/drivers/input/misc/pcspkr.c
===
--- linux-2.6.orig/drivers/input/misc/pcspkr.c
+++ linux-2.6/drivers/input/misc/pcspkr.c
@@ -32,9 +32,11 @@ MODULE_ALIAS("platform:pcspkr");
 static DEFINE_SPINLOCK(i8253_lock);
 #endif
 
-static int pcspkr_event(struct input_dev *dev, unsigned int type, unsigned int code, int value)
+static int pcspkr_event(struct input_dev *dev, unsigned int type,
+   unsigned int code, int value)
 {
unsigned int count = 0;
+   unsigned char mask;
unsigned long flags;
 
if (type != EV_SND)
@@ -51,17 +53,21 @@ static int pcspkr_event(struct input_dev
 
	spin_lock_irqsave(&i8253_lock, flags);
 
+   mask = inb(0x61);
if (count) {
/* enable counter 2 */
-   outb_p(inb_p(0x61) | 3, 0x61);
+   outb(mask | 3, 0x61);
+   /* some 8253's may require 1 usec. between accesses */
/* set command for counter 2, 2 byte write */
-   outb_p(0xB6, 0x43);
+   outb(0xB6, 0x43);
+   udelay(1);
/* select desired HZ */
-   outb_p(count & 0xff, 0x42);
+   outb(count & 0xff, 0x42);
+   udelay(1);
outb((count >> 8) & 0xff, 0x42);
} else {
/* disable counter 2 */
-   outb(inb_p(0x61) & 0xFC, 0x61);
+   outb(mask & 0xFC, 0x61);
}
 
	spin_unlock_irqrestore(&i8253_lock, flags);
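As background for the port 61h accesses above: bits 0 and 1 of that port gate PIT channel 2 to the speaker, and the driver sets both to start a tone and clears both to stop, preserving the remaining bits. A user-space model of that masking (helper names are illustrative only):

```c
#include <assert.h>

/* Model the keyboard-controller port 0x61 speaker bits:
 * bit 0 = timer-2 gate, bit 1 = speaker data enable. */
static unsigned char speaker_on(unsigned char port61)
{
	return port61 | 3;	/* set both gate bits, keep the rest */
}

static unsigned char speaker_off(unsigned char port61)
{
	return port61 & 0xFC;	/* clear both gate bits, keep the rest */
}
```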



[PATCH 1/3] x86: fix init_8259A() to not use outb_pic

2008-02-17 Thread David P. Reed
Fix init_8259A(), which initializes the 8259 PIC, to not use outb_pic,
which is a renamed version of outb_p, and delete the outb_pic define.
There is already code in the .c files that accesses the CMD and IMR registers
in successive outb() calls without _p.  Thus the outb_p is obviously not
needed, if it ever was.  Research into chipset documentation and old BIOS
listings shows that IODELAY was not used even in early machines.  Thus
the delay between I/O port writes was deleted for the 8259.

Again, the primary reason for fixing this is to use a proper delay strategy,
and in particular to fix crashes that can result from using port 80 writes
on machines that have resources on port 80, such as the ENE chips used by
Quanta in laptops it designs and sells to, e.g. HP.

Signed-off-by: David P. Reed <[EMAIL PROTECTED]>
Index: linux-2.6/arch/x86/kernel/i8259_32.c
===
--- linux-2.6.orig/arch/x86/kernel/i8259_32.c
+++ linux-2.6/arch/x86/kernel/i8259_32.c
@@ -285,24 +285,30 @@ void init_8259A(int auto_eoi)
 
	spin_lock_irqsave(&i8259A_lock, flags);
 
-   outb(0xff, PIC_MASTER_IMR); /* mask all of 8259A-1 */
-   outb(0xff, PIC_SLAVE_IMR);  /* mask all of 8259A-2 */
-
-   /*
-* outb_pic - this has to work on a wide range of PC hardware.
-*/
-   outb_pic(0x11, PIC_MASTER_CMD); /* ICW1: select 8259A-1 init */
-   outb_pic(0x20 + 0, PIC_MASTER_IMR); /* ICW2: 8259A-1 IR0-7 mapped to 0x20-0x27 */
-   outb_pic(1U << PIC_CASCADE_IR, PIC_MASTER_IMR); /* 8259A-1 (the master) has a slave on IR2 */
+   /* mask all of 8259A-1 */
+   outb(0xff, PIC_MASTER_IMR);
+   /* mask all of 8259A-2 */
+   outb(0xff, PIC_SLAVE_IMR);
+
+   /* ICW1: select 8259A-1 init */
+   outb(0x11, PIC_MASTER_CMD);
+   /* ICW2: 8259A-1 IR0-7 mapped to 0x20-0x27 */
+   outb(0x20 + 0, PIC_MASTER_IMR);
+   /* 8259A-1 (the master) has a slave on IR2 */
+   outb(1U << PIC_CASCADE_IR, PIC_MASTER_IMR);
if (auto_eoi)   /* master does Auto EOI */
-   outb_pic(MASTER_ICW4_DEFAULT | PIC_ICW4_AEOI, PIC_MASTER_IMR);
+   outb(MASTER_ICW4_DEFAULT | PIC_ICW4_AEOI, PIC_MASTER_IMR);
else/* master expects normal EOI */
-   outb_pic(MASTER_ICW4_DEFAULT, PIC_MASTER_IMR);
+   outb(MASTER_ICW4_DEFAULT, PIC_MASTER_IMR);
 
-   outb_pic(0x11, PIC_SLAVE_CMD);  /* ICW1: select 8259A-2 init */
-   outb_pic(0x20 + 8, PIC_SLAVE_IMR);  /* ICW2: 8259A-2 IR0-7 mapped 
to 0x28-0x2f */
-   outb_pic(PIC_CASCADE_IR, PIC_SLAVE_IMR);/* 8259A-2 is a slave 
on master's IR2 */
-   outb_pic(SLAVE_ICW4_DEFAULT, PIC_SLAVE_IMR); /* (slave's support for 
AEOI in flat mode is to be investigated) */
+   /* ICW1: select 8259A-2 init */
+   outb(0x11, PIC_SLAVE_CMD);
+   /* ICW2: 8259A-2 IR0-7 mapped to 0x28-0x2f */
+   outb(0x20 + 8, PIC_SLAVE_IMR);
+   /* 8259A-2 is a slave on master's IR2 */
+   outb(PIC_CASCADE_IR, PIC_SLAVE_IMR);
+   /* (slave's support for AEOI in flat mode is to be investigated) */
+   outb(SLAVE_ICW4_DEFAULT, PIC_SLAVE_IMR);
if (auto_eoi)
/*
 * In AEOI mode we just have to mask the interrupt
Index: linux-2.6/arch/x86/kernel/i8259_64.c
===
--- linux-2.6.orig/arch/x86/kernel/i8259_64.c
+++ linux-2.6/arch/x86/kernel/i8259_64.c
@@ -355,29 +355,30 @@ void init_8259A(int auto_eoi)
 
	spin_lock_irqsave(&i8259A_lock, flags);
 
-   outb(0xff, PIC_MASTER_IMR); /* mask all of 8259A-1 */
-   outb(0xff, PIC_SLAVE_IMR);  /* mask all of 8259A-2 */
+   /* mask all of 8259A-1 */
+   outb(0xff, PIC_MASTER_IMR);
+   /* mask all of 8259A-2 */
+   outb(0xff, PIC_SLAVE_IMR);
 
-   /*
-* outb_pic - this has to work on a wide range of PC hardware.
-*/
-   outb_pic(0x11, PIC_MASTER_CMD); /* ICW1: select 8259A-1 init */
+   /* ICW1: select 8259A-1 init */
+   outb(0x11, PIC_MASTER_CMD);
/* ICW2: 8259A-1 IR0-7 mapped to 0x30-0x37 */
-   outb_pic(IRQ0_VECTOR, PIC_MASTER_IMR);
+   outb(IRQ0_VECTOR, PIC_MASTER_IMR);
/* 8259A-1 (the master) has a slave on IR2 */
-   outb_pic(0x04, PIC_MASTER_IMR);
+   outb(0x04, PIC_MASTER_IMR);
if (auto_eoi)   /* master does Auto EOI */
-   outb_pic(MASTER_ICW4_DEFAULT | PIC_ICW4_AEOI, PIC_MASTER_IMR);
+   outb(MASTER_ICW4_DEFAULT | PIC_ICW4_AEOI, PIC_MASTER_IMR);
else/* master expects normal EOI */
-   outb_pic(MASTER_ICW4_DEFAULT, PIC_MASTER_IMR);
+   outb(MASTER_ICW4_DEFAULT, PIC_MASTER_IMR);
 
-   outb_pic(0x11, PIC_SLAVE_CMD);  /* ICW1: select 8259A-2 init */
+   /* ICW1: select 8259A-2 init */
+   outb(0x11, PIC_SLAVE_CMD);
/* ICW2: 8259A-2 IR0-7 mapped to 0x38-0x3f */
- 

[PATCH 2/3] x86: fix cmos read and write to not use inb_p and outb_p

2008-02-17 Thread David P. Reed
fix code to access CMOS rtc registers so that it does not use inb_p and
outb_p routines, which are deprecated.  Extensive research on all known CMOS RTC
chipset timing shows that there is no need for a delay in accessing the
registers of these chips even on old machines. These chips are never on an
expansion bus, but have always been "motherboard" resources, either in the
processor chipset or explicitly on the motherboard, and they are not part of 
the ISA/LPC or PCI buses, so delays should not be based on bus timing.
The reasons to fix it:
1) port 80 writes often hang some laptops that use ENE EC chipsets, esp. those
designed and manufactured by Quanta for HP;
2) RTC accesses are timing sensitive, and extra microseconds may matter;
3) the new "io_delay" function is calibrated by expansion bus timing needs,
thus is not appropriate for access to CMOS rtc registers.

Signed-off-by: David P. Reed <[EMAIL PROTECTED]>
Index: linux-2.6/arch/x86/kernel/rtc.c
===
--- linux-2.6.orig/arch/x86/kernel/rtc.c
+++ linux-2.6/arch/x86/kernel/rtc.c
@@ -151,8 +151,8 @@ unsigned char rtc_cmos_read(unsigned cha
unsigned char val;
 
lock_cmos_prefix(addr);
-   outb_p(addr, RTC_PORT(0));
-   val = inb_p(RTC_PORT(1));
+   outb(addr, RTC_PORT(0));
+   val = inb(RTC_PORT(1));
lock_cmos_suffix(addr);
return val;
 }
@@ -161,8 +161,8 @@ EXPORT_SYMBOL(rtc_cmos_read);
 void rtc_cmos_write(unsigned char val, unsigned char addr)
 {
lock_cmos_prefix(addr);
-   outb_p(addr, RTC_PORT(0));
-   outb_p(val, RTC_PORT(1));
+   outb(addr, RTC_PORT(0));
+   outb(val, RTC_PORT(1));
lock_cmos_suffix(addr);
 }
 EXPORT_SYMBOL(rtc_cmos_write);

-- 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 0/3] x86: cleanup primary motherboard chip port access delays

2008-02-17 Thread David P. Reed
cleanup motherboard chip io port delays.  inb_p and outb_p have traditionally
used a write to port 80 (a non-existent port) as a delay.  Though there is an
argument that this is a good delay for devices on the ISA or PCI expansion buses,
it is not a good mechanism for devices in the processor chipset or on the
"motherboard".  The write to port 80 at best causes an abort on the ISA or LPC
bus, and on some machines (like many of the HP laptops manufactured by Quanta)
actually writes data to real i/o devices.  For example, the ENE Embedded
Controller chip family defaults to provide a register at port 80 that 
can be written, and which can cause an interrupt in the Embedded Controller.

This has been shown to cause hangs on some machines, especially in accessing
the CMOS RTC during bootup.

This patch series addresses three of the places where these are used in common
kernel code - in particular the three uses that affect the HP laptops mentioned
above, modifying the delays to match the worst known delay issues for the
specific chips.

The patch set is complementary to the iodelay= kernel parameter added in 2.6.25,
since it means fewer users will need to add that parameter to run linux "out of 
the box" without hanging.



[PATCH 3/3] x86: fix pcspkr to not use inb_p/outb_p calls.

2008-02-17 Thread David P. Reed
Fix pcspkr driver to use explicit timing delay for access to PIT,
rather than inb_p/outb_p calls, which use insufficiently explicit
delays (defaulting to port 80 writes) that can cause freeze problems
on some machines, such as Quanta motherboard machines using ENE ECs.
The explicit timing delay is only needed for accesses to the 8253 PIT.
The standard requirement for the 8253 to respond to successive writes
is 1 microsecond.  The 8253 has never been on the expansion bus, so 
a proper delay has nothing to do with expansion bus timing, but instead
its internal logic's capability to react to input.  Since udelay is correctly
calibrated by the time the pcspkr driver is initialized, we use 1 microsecond
as the timing.

Also shorten lines to less than 80 characters.

Signed-off-by: David P. Reed [EMAIL PROTECTED]

Index: linux-2.6/drivers/input/misc/pcspkr.c
===
--- linux-2.6.orig/drivers/input/misc/pcspkr.c
+++ linux-2.6/drivers/input/misc/pcspkr.c
@@ -32,9 +32,11 @@ MODULE_ALIAS("platform:pcspkr");
 static DEFINE_SPINLOCK(i8253_lock);
 #endif
 
-static int pcspkr_event(struct input_dev *dev, unsigned int type, unsigned int 
code, int value)
+static int pcspkr_event(struct input_dev *dev, unsigned int type,
+   unsigned int code, int value)
 {
unsigned int count = 0;
+   unsigned char mask;
unsigned long flags;
 
if (type != EV_SND)
@@ -51,17 +53,21 @@ static int pcspkr_event(struct input_dev
 
	spin_lock_irqsave(&i8253_lock, flags);
 
+   mask = inb(0x61);
if (count) {
/* enable counter 2 */
-   outb_p(inb_p(0x61) | 3, 0x61);
+   outb(mask | 3, 0x61);
+   /* some 8253's may require 1 usec. between accesses */
/* set command for counter 2, 2 byte write */
-   outb_p(0xB6, 0x43);
+   outb(0xB6, 0x43);
+   udelay(1);
/* select desired HZ */
-   outb_p(count & 0xff, 0x42);
+   outb(count & 0xff, 0x42);
+   udelay(1);
 outb((count >> 8) & 0xff, 0x42);
 } else {
 /* disable counter 2 */
-   outb(inb_p(0x61) & 0xFC, 0x61);
+   outb(mask & 0xFC, 0x61);
}
 
	spin_unlock_irqrestore(&i8253_lock, flags);





Re: [linux-kernel] Re: [PATCH 1/3] x86: fix init_8259A() to not use outb_pic

2008-02-17 Thread David P. Reed

Rene Herman wrote:

On 17-02-08 23:25, Alan Cox wrote:


On Sun, 17 Feb 2008 16:56:28 -0500 (EST)
David P. Reed [EMAIL PROTECTED] wrote:


fix init_8259A() which initializes the 8259 PIC to not use outb_pic,
which is a renamed version of outb_p, and delete outb_pic define.


NAK

The entire point of inb_pic/outb_pic is to isolate the various methods
and keep the logic for delays in one place. Undoing this just creates a
nasty mess.

Quite probably inb_pic/outb_pic will end up as static inlines that do inb
or outb with a udelay of 1 or 2 but that is where the knowledge belongs.


Additional NAK insofar as the PIC delays were reported to be 
necessary with some VIA chipsets earlier in these threads.


Rene.

This not being a place where performance matters, I will submit a new 
patch that changes inb_pic and outb_pic to use udelay(2).  However, note 
that init_8259A does not use these consistently in its own accesses to 
the PIC registers.  Should I change it to use the _pic calls wherever 
it touches the PIC registers to be conservative?  Note that there is a 
udelay(100) after the registers are all setup, perhaps this is the real 
VIA requirement...



Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-14 Thread David P. Reed

David Woodhouse wrote:

On Fri, 2008-01-11 at 09:35 -0500, David P. Reed wrote:
  

  Using any "unused port" for a delay means that the machine check
feature is wasted and utterly unusable.



Not entirely unusable. You can recover silently from the machine check
if it was one of the known accesses to the 'unused port'. It certainly
achieves a delay :)
  

I'm sure that's what the driver writers had in mind.  ;-)

And I think we probably have a great shot at getting Intel, Microsoft, 
HP, et al.. to add a feature for Linux to one of the ACPI table 
specifications that define an "unused port for delay purposes" field in 
the ACPI 4.0 spec, and retrofit it into PC/104 machine BIOSes.  At least 
Microsoft doesn't have a patent on using port 80 for delay purposes. :-)









Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-11 Thread David P. Reed

Alan Cox wrote:
bus abort on the LPC bus".   More problematic is that I would think some 
people might want to turn on the AMD feature that generates machine 
checks if a bus timeout happens.   The whole point of machine checks is 



An ISA/LPC bus timeout is fulfilled by the bridge so doesn't cause an MCE.


  
Good possibility, but the documentation on HyperTransport suggests 
otherwise, even for LPC bridges in this particular modern world of 
AMD64. I might do the experiment someday to see if my LPC bridge is 
implemented in a way that does or doesn't support enabling MCE's. Could 
be different between Intel and AMD - I haven't had reason to pore over 
the Intel chipset specs, since my poking into all this stuff has been 
driven by my personal machine's issues, and it's not got any Intel 
compatible parts.



Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-11 Thread David P. Reed



Rene Herman wrote:

On 11-01-08 02:36, Zachary Amsden wrote:

FWIW, I fixed the problem locally by recompiling, changing port 80 to 
port 84 in io.h; works great, and doesn't conflict with any occupied 
ports.


Might not give you a "proper" delay though. 0xed should be a better 
choice...


I don't think there is any magic here.  I modified the patch to do *no 
delay at all* in the io_delay "quirk" and have been running reliably for 
weeks including the very heavy I/O load that comes from using software 
RAID on this nice laptop that has two separate SATA drives!  This 
particular laptop has no problematic devices - the only problem is 
actually in the CMOS_READ and CMOS_WRITE macros that *use* the _p 
operations in a way that is unnecessary on this machine.  (in fact, it 
would be hard to add a problematic device - there's no PCMCIA slot 
either, and so every option is USB or Firewire).


Using 0xED happens to work, but it's not guaranteed to work either.  
There is not a "standard" for an "unused port that is mapped to cause a 
bus abort on the LPC bus".   More problematic is that I would think some 
people might want to turn on the AMD feature that generates machine 
checks if a bus timeout happens.   The whole point of machine checks is 
to allow the machine to be more reliable.   Using any "unused port" for 
a delay means that the machine check feature is wasted and utterly unusable.







Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-10 Thread David P. Reed


Rene Herman wrote:

On 10-01-08 01:37, Robert Hancock wrote:


I agree. In this case the BIOS on these laptops is trying to tell us 
"port 80 is used for our purposes, do not touch it". We should be 
listening.


Listening is fine but what are you going to do after you have 
listened? Right, not use port 0x80 but since that's the current idea 
anyway outside of legacy drivers, you don't actually need to listen.


If the quirk-to-0xed or similar was to stay, it's a much better 
switching point than DMI strings but if not, it's not actually important.
Well, I was just suggesting a warning that would come up when a driver 
that still used port 80 was initialized...
I think that is what Alan Cox and others suggest for legacy drivers that 
have worked - I agree that it may not be the right thing to screw them 
up, especially since my laptop, and probably most machines that might 
start using port 80 or other "legacy ports" won't ever need those drivers.


I thought more about a complete solution last night.   A clean idea that 
really fits the linux design might be the following outline of a patch. 
I suspect it might seem far less ugly, and probably would meet Alan 
Cox's needs, too - I am very sympathetic about not breaking 8390's, etc.


Define a "motherboard resources" driver that claims all the resources 
defined for PNP0C02 devices during the pnp process.   I think Windoze 
actually does something quite similar.   This would claim port 80.


Define an iodelay driver.  This driver exists largely to claim port 80 
for the iodelay operation  (you could even define an option for other 
ports).  Legacy drivers would be modified to require iodelay.  The 
iodelay driver would set up the iodelay mechanism to do something other 
than the "boot time" default - which could be no delay, or udelay.  It 
would also set a flag that says "_b operations are safe".


Put a WARN_ONCE() in the in/out*_b operations that checks the flag that 
is set by the iodelay driver.  This would only trigger in the case that 
80 or whatever was reserved by some other device driver - such as the 
motherboard resources driver above.  Modern machines won't do that.


Finally, anything that happens before the motherboard resources and 
iodelay drivers are initialized cannot use in*_p or out*_p (whether they 
can be loadable modules rather than built in is a question).  This is a 
very small set, and I believe with the exception of the PIT (8253/4) are 
very safe.


Note that this idea is also compatible with rewriting all drivers to use 
"iodelay()" explicitly instead of _p().  But it doesn't require that.




Rene.






Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-09 Thread David P. Reed

Zachary Amsden wrote:


According to Phoenix Technologies book "System BIOS for IBM PCs,
Compatibles and EISA Computers, 2nd Edition", the I/O port list gives

port 0080h   R/W  Extra page register (temporary storage)

Despite looking, I've never seen it documented anywhere else, but I
believe it works on just about every PC platform.  Except, apparently,
my laptop.


  
The port 80 problem was discovered by me, after months of "bisecting" 
the running code around a problem with hanging when using hwclock in 
64-bit mode when ACPI is on.  So it kills my laptop, too, and many 
current laptop motherboards designed by Quanta for HP and Compaq (dv6000, 
dv9000, tx1000, apparently).


In the last couple of weeks, I was able with luck to discover that the 
problem is the ENE KB3920 chip, which is the "big brother" of the KB3700 
chip included in the OLPC XO "$100 laptop" made also by Quanta.  I 
verified this by taking my laptop apart - a fun and risky experience.  
Didn't break any connectors, but I don't recommend it for those who are 
not experienced disassembling laptops and cellphones, etc.  The KB3920 
contains an EC, an SMBus, a KBC, some watchdog timers, and a variety of 
other functions that keep the laptop going, coordinating the 
relationships among various peripherals.  The firmware is part standard 
from ENE, part OEM-specific, in this case coded by Quanta or a BIOS 
subcontractor.


You can read the spec sheet for the KB3700 online at laptop.org, since 
the specs of the laptop are "open".  The 3920's spec is confidential.  
And the firmware is confidential as well for both the 3700 and 3920.  
Clues as to what it does can be gleaned by reading the disassembler 
output of the DSDT code in the particular laptops - though the SMM BIOS 
probably also talks to it.


Modern machines have many subsystems, and the ACPI and SMBIOS coordinate 
to run them; blade servers also have drawer controllers and backplane 
management buses.  The part that runs Linux is only part of the machine.


Your laptop isn't an aberration.  It's part of the new generation of 
evolved machines that take advantage of the capabilities of ACPI and 
SMBIOS and DMI standards that are becoming core parts of the market.





Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-09 Thread David P. Reed

Christer Weinigel wrote:


Did I miss anything?

  

Nothing that seems *crucial* going forward for Linux.  The fate of
"legacy machines" is really important to get right.

I have a small suggestion in mind that might be helpful in the future:
the  "motherboard resources" discovered as PNP0C02 devices in their _CRS
settings in ACPI during ACPI PnP startup should be reserved (or
checked), and any drivers that still use port 80 implicitly should
reserve that port.

This may be too late in the boot process to make a decision not to use 
port 80, and it doesn't help decide a strategy to use an alternate port 
(0xED happens to "work" on the dv9000 machines in the sense that it 
generates a bus timeout on LPC, but there is no guarantee that 0xED is 
free on any particular motherboard, and "unusedness" is not declared in 
any BIOS/ACPI tables) or to use a udelay-based iodelay (but there is 
nothing in the BIOS tables that suggests the right delays for various 
I/O ports, if any modern parts need them... which I question, but can't 
prove a negative in general).


However, doing the reservations on such resources could generate a 
warning that would help diagnose current and future designs, including 
devices like the ENE KB3920 that have a port that is defaulted to port 
80 and routed to the EC for functions that the firmware and ACPI can 
agree to do.  Or any other ports used in new ways and properly notified 
to the OS via the now-standard Wintel BIOS functions.


I don't know if /proc/ioports is being maintained, but the fact that it
doesn't contain all of those PNP0C02 resources known on my machine seems 
to be a bug - which I am happy to code a patch or two for as a 
contribution back to Linux, if it isn't on the way out as the /sys 
hierarchy does a better job.

/sys/bus/pnp/... does get built properly and has port 80 described
properly - not as a DMA port, but as a port in use by device 05:00, 
which is the motherboard resource catchall.  Thus the patch would be small.





Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-08 Thread David P. Reed

Christer Weinigel wrote:

Argument by personal authority.  That's good.
There is no other kind of argument.  Are you claiming supernatural 
authority drives your typing fingers, or is your argument based on what 
you think you know?  I have piles of code that I wrote, spec sheets (now 
that I'm back in my home office), code that others wrote at the time, 
and documentation from vendors that come from my personal experiences.  
That doesn't mean I'm always right - always happy to learn something 
new.  Just don't condescend to a 55 year old who has been writing 
operating systems, compilers, and designing hardware for almost 40 years 
professionally (yes, I got my first job at 16 writing FORTRAN code to 
simulate hydrodynamic systems).

I guess that's why you
don't seem to understand the difference between reading the serial port
status register and not being allowed to access a register at all
due to such things as the 4 cycle delay you quoted yourself from the 8390
data sheet,
If you read what I said carefully, I said that the 8390 was a very 
special case.   The "chip select" problem it experienced was pretty much 
unique among boards of the time.  Those of us who looked at its design 
and had any experience designing hardware for buses like the unibus or 
even the buses on PDP-8's and DG machines thought it had to be a joke.  
Of course it saved money per board, so it beat the 3Com boards on price 
- and you could program it after a fashion.  So it involved "cheaping out".


The normal timing problem was that an out or in operation to a board or 
chip required some time to elapse before the chip performed the side 
effects internally so that the next operation to it would have an 
effect.  This is exactly the reason why most chips and boards are 
designed to have a flag that can be polled to indicate operation 
completion.  The serial "buffer empty" flag is the simplest possible 
explanatory example of such handshaking that came to mind (writing a 
character to a serial output device twice often leads to surprises, 
unless you wait for the previous character to clock out).  See my 
comment on RTC below, for a more complex to explain example.

and similar issues with the I8253 that I quoted from its
data sheet a few posts ago.

  
The 8253 was a motherboard chip.  I am not sure it had any timing 
problems with its electrical signalling.  I just don't remember.  The 
spec sheet doesn't say its internal state can get scrambled.


I was thinking of another timer, the RTC which is usually a part of the
Super I/O.
The RTC has very well documented timing requirements.  But none of the 
spec sheets, nor my experience with it, mention electrical issues that 
prevented back-to-back port operations.  The documented timing 
requirements have to do with the state during the time it ticks over 
internally once per second.  But it is carefully designed to have a flag 
that is "on" during the 244 microseconds prior to and covering the time it 
is unsafe to read the registers.   That design is special because it is 
designed to operate when the machine is powered off, so it has two 
internal clock domains, one of which is used in "low power" mode and is 
very slow to minimize power.




Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-08 Thread David P. Reed


Christer Weinigel wrote:
There is no need to use io writes to supposedly/theoretically "unused 
ports" to make drivers work on any bus.

ISA included!  You can, for example, wait for an ISA bus serial
adapter to put out its next character by looping reading the port
that has the output buffer full flag in a tight loop, with no delay
code at all.  And if you need to time things, just call a timing loop
subroutine that you calibrate at boot time.



Now you're totally confusing things.  You're talking about looking at
bits in a register to see if a transmit register is empty.  
That's easy.


The delays needed for the Intel M8259 and M8253 say that you're not
even allowed to access the registers _at_ _all_ for some time after a
register access.  If you do a write to a register immediately followed
by any access, including a read of the status register, you can corrupt
the state of the chip.
  
Not true.  Even on the original IBM 5150 PC, the 8259 on the motherboard 
accepted back to back  OUT and IN instructions, and it would NOT trash 
the chip state.  You can read the original IBM BIOS code if you like.  I 
don't remember about the 8253's timing.  I doubt the chip's state would 
be corrupted in any way. The data and address lines were the same data 
and address lines that the microprocessor used to access memory - it 
didn't "hold" the lines stable any longer than the OUT instruction.

And the Intel chips are not the only ones with that kind of brain
damage.  But what makes the 8259 and 8253 a big problem is that every
modern PC has a descendant of those chips in them.
Register compatible.  Not the same chips or even the same masks or 
timing requirements.

The discrete Intel
chips or clones got aggregated into Super I/O chips, and the Super I/O
chips were put on a LPC bus (an ISA bus with another name) or
integrated into the southbrige.
Don't try to teach your grandmother to suck eggs: I've been programming 
PC compatibles since probably before you were able to do long division - 
including writing code on the first prototype IBM PCs, the first 
pre-manufacturing PC-ATs, and zillions of clones.  (and I was also 
involved in designing hardware including the so-called "Lotus Intel" 
expanded memory cards and the original PC cards)  The 8259 PIC is an 
*interrupt controller*. It was NEVER present in a Super I/O chip, or an 
LPC chip.  Its functionality was absorbed into the chipsets that control 
interrupt mapping, like the PIIX and the nForce. 


And the "if it ain't broken, don't fix
it" mantra probably means that some modern chipsets are still using
exactly the same internal design as the 25 year old chips and will
still be subject to some of those ancient limitations.
  
Oh, come on.  Give the VLSI designers some credit for brains.   The CAD 
tools used to design the 8259 and 8253 were so primitive you couldn't 
even get a chip manufactured with designs from that era today.  When 
people design chips today, they do it with simulators and testers 
running test suites that were not available at the time.



Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-08 Thread David P. Reed

Alan -

I dug up a DP83901A SNIC datasheet in a quick Google search, while that 
wasn't the only such chip, it was one of them.  I can forward the PDF 
(the www.alldatasheet.com site dynamically creates the download URL), if 
anyone wants it.
The relevant passage says, in regard to delaying between checking the 
CRDA addresses to see if a dummy "remote read" has been executed, and 
in regard perhaps to other card I/O register loops: 



   TIME BETWEEN CHIP SELECTS

   The SNIC requires that successive chip selects be no closer
   than 4 bus clocks (BSCK) together. If the condition is violated,
   the SNIC may glitch ACK. CPUs that operate from pipelined
   instructions (i.e. 386) or have a cache (i.e. 486) can execute
   consecutive I/O cycles very quickly. The solution is to delay
   the execution of consecutive I/O cycles by either breaking the
   pipeline or forcing the CPU to access outside its cache.

The NE2000, as I recall, had no special logic on the board to protect the 
chip from successive chip selects that were too close - which is the 
reason for the problem.  Clearly an out to port 80 takes more than 4 ISA 
bus clocks, so that works if the NE2000 is on the ISA bus.  On the 
other hand, there are other ways to delay more than 4 ISA bus clocks.  
And as you say, one needs a delay for this chip that relates to the 
chip's card's bus's clock speed, not absolute time.


Alan Cox wrote:
As well you should. I am honestly curious (for my own satisfaction) as 
to what the natsemi docs say the delay code should do  (can't imagine 
they say "use io port 80 because it is unused").  I don't have any 



They say you must allow 4 bus clocks for the address decode. They don't
deal with the ISA side as the chip itself has no ISA glue.


  
copies anymore. But mere curiosity on my part is not worth spending a 
lot of time on - I know you are super busy.   If there's a copy online 
at a URL ...



Not that I know of. There may be. A good general source of info is Russ
Nelson's old DOS packet driver collection.


  



Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-08 Thread David P. Reed

Alan Cox wrote:

The natsemi docs here say otherwise. I trust them not you.
  
As well you should. I am honestly curious (for my own satisfaction) as 
to what the natsemi docs say the delay code should do  (can't imagine 
they say "use io port 80 because it is unused").  I don't have any 
copies anymore. But mere curiosity on my part is not worth spending a 
lot of time on - I know you are super busy.   If there's a copy online 
at a URL ...


The problem is that certain people, unfortunately those who know
nothing about ISA related bus systems, keep trying to confuse ISA delay
logic with core chip logic and end up trying to solve both a problem and a
non-problem in one, creating a nasty mess in the process.

  
I agree that the problems of chip logic and ISA delay are all tangled 
up, probably more than need be.  I hope that the solution turns out to 
simplify matters, and hopefully to document the intention of the 
resulting code sections a bit more clearly for the future.



Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-08 Thread David P. Reed



Ondrej Zary wrote:

On Tuesday 08 January 2008 18:24:02 David P. Reed wrote:
  

Windows these days does delays with timing loops or the scheduler.  It
doesn't use a "port".  Also, Windows XP only supports machines that tend
not to have timing problems that use delays.  Instead, if a device takes
a while to respond, it has a "busy bit" in some port or memory slot that
can be tested.



Windows XP can run on a machine with ISA slot(s) and has built-in drivers for 
some plug-and-play ISA cards - e.g. the famous 3Com EtherLink III. I think that 
there's a driver for NE2000-compatible cards too and it probably works.
  
There is no need to use io writes to supposedly/theoretically "unused 
ports" to make drivers work on any bus.
ISA included!  You can, for example, wait for an ISA bus serial adapter 
to put out its next character by looping reading the port that has the 
output buffer full flag in a tight loop, with no delay code at all.  And 
if you need to time things, just call a timing loop subroutine that you 
calibrate at boot time.
I wrote DOS drivers for NE2000's on the ISA bus when they were brand new 
designs from Novell without such kludges as writes to I/O port 80.  I 
don't remember writing a driver for the 3com devices - probably didn't, 
because 3Com's cards were expensive at the time.


In any case, Linux *did* adopt this port 80 strategy - I'm sure all 
concerned thought it was frightfully clever at the time.  Linus 
expressed his skepticism in the comments in io.h.  The problem is to 
safely move away from it toward a proper strategy that doesn't depend on 
"bus aborts" which would trigger machine checks if they were properly 
enabled.




Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-08 Thread David P. Reed
Windows these days does delays with timing loops or the scheduler.  It 
doesn't use a "port".  Also, Windows XP only supports machines that tend 
not to have timing problems that use delays.  Instead, if a device takes 
a while to respond, it has a "busy bit" in some port or memory slot that 
can be tested.


Almost all of the issues in Linux where _p operations are used are (or 
should be) historical - IMO.


Ondrej Zary wrote:

On Tuesday 08 January 2008 02:38:15 David P. Reed wrote:
  

H. Peter Anvin wrote:


And shoot the designer of this particular microcontroller firmware.
  

Well, some days I want to shoot the "designer" of the entire Wintel
architecture...  it's not exactly "designed" by anybody of course, and
today it's created largely by a collection of Taiwanese and Chinese ODM
firms, coupled with Microsoft WinHEC and Intel folks.  At least they
follow the rules and their ACPI and BIOS code say that they are using
port 80 very clearly if you use PnP and ACPI properly.  And in the old
days, you were "supposed" to use the system BIOS to talk to things like
the PIT that had timing issues, not write your own code.



Does anyone know what port does Windows use? I'm pretty sure that it isn't 80h 
as I run Windows 98 often with port 80h debug card inserted. The last POST 
code set by BIOS usually remains on the display and only changes when BIOS 
does something like suspend/resume. IIRC, there was a program that was able 
to display temperature from onboard sensors on the port 80h display that's 
integrated on some mainboards.


  



Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-08 Thread David P. Reed
The last time I heard of a 12 MHz bus in a PC system was in the days of 
the PC-AT, when some clone makers sped up their buses (pre PCI!!!) in an 
attempt to allow adapter card *memory* to run at the 12 MHz speed.


This caused so many industry-wide problems with adapter cards that 
couldn't be installed in certain machines and still run reliably that 
the industry learned a lesson.  That doesn't mean that LPCs don't run at 
12 MHz, but if they do, they don't have old 8 bit punky cards plugged 
into them for lots of practical reasons.  (I have whole drawers full of 
such old cards, trying to figure out an environmentally responsible way 
to get rid of them - even third world countries would be fools to make 
machiens with them).


I can't believe that we are not supporting today's machines correctly 
because we are still trying to be compatible with a few (at most a 
hundred thousand were manufactured!  Much less still functioning or 
running Linux) machines.


Now I understand that PC/104 machines and other things are very non PC 
compatible, but are x86 processor architectures.  Do they even run x86 
under 2.6.24?


Perhaps the rational solution here is to declare x86 the architecture 
for "relics" and develop a merged architecture called "modern machines" 
to include only those PCs that have been made to work since, say, the 
release of (cough) Windows 2000?


Bodo Eggert wrote:

On Tue, 8 Jan 2008, Rene Herman wrote:
  

On 08-01-08 00:24, H. Peter Anvin wrote:


Rene Herman wrote:
  


  

Is this only about the ones then left for things like legacy PIC and PIT?
Does anyone care about just sticking in a udelay(2) (or 1) there as a
replacement and call it a day?



PIT is problematic because the PIT may be necessary for udelay setup.
  

Yes, can initialise loops_per_jiffy conservatively. Just didn't quite get why
you guys are talking about an ISA bus speed parameter.



If the ISA bus is below 8 MHz, we might need a longer delay. If we default
to the longer delay, the delay will be too long for more than 99,99 % of 
all systems, not counting i586+. Especially if the driver is fine-tuned to 
give maximum throughput, this may be bad.


OTOH, the DOS drivers I heard about use delays and would break on 
underclocked ISA busses if the n * ISA_HZ delay was needed. Maybe

somebody having a configurable ISA bus speed and some problematic
chips can test it ...

  

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-08 Thread David P. Reed
The last time I heard of a 12 MHz bus in a PC system was in the days of 
the PC-AT, when some clone makers sped up their buses (pre PCI!!!) in an 
attempt to allow adapter card *memory* to run at the 12 MHz speed.


This caused so many industry-wide problems with adapter cards that 
couldn't be installed in certain machines and still run reliably that 
the industry learned a lesson.  That doesn't mean that LPCs don't run at 
12 MHz, but if they do, they don't have old 8 bit punky cards plugged 
into them for lots of practical reasons.  (I have whole drawers full of 
such old cards, trying to figure out an environmentally responsible way 
to get rid of them - even third world countries would be fools to make 
machiens with them).


I can't believe that we are not supporting today's machines correctly 
because we are still trying to be compatible with a few (at most a 
hundre thousand were manufactured!  Much less still functioning or 
running Linux) machines.


Now I understand that PC/104 machines and other things are very non PC 
compatible, but are x86 processor architectures.  Do they even run x86 
under 2.6.24?


Perhaps the rational solution here is to declare x86 the architecture 
for relics and develop a merged architecture called modern machines 
to include only those PCs that have been made to work since, say, the 
release of (cough) WIndows 2000?


Bodo Eggert wrote:

On Tue, 8 Jan 2008, Rene Herman wrote:
  

On 08-01-08 00:24, H. Peter Anvin wrote:


Rene Herman wrote:
  


  

Is this only about the ones then left for things like legacy PIC and PIT?
Does anyone care about just sticking in a udelay(2) (or 1) there as a
replacement and call it a day?



PIT is problematic because the PIT may be necessary for udelay setup.
  

Yes, can initialise loops_per_jiffy conservatively. Just didn't quite get why
you guys are talking about an ISA bus speed parameter.



If the ISA bus is below 8 MHz, we might need a longer delay. If we default
to the longer delay, the delay will be too long for more than 99,99 % of 
all systems, not counting i586+. Especially if the driver is fine-tuned to 
give maximum throughput, this may be bad.


OTOH, the DOS drivers I heared about use delays and would break on 
underclocked ISA busses if the n * ISA_HZ delay was needed. Maybe

somebody having a configurable ISA bus speed and some problematic
chips can test it ...

  

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-08 Thread David P. Reed
Windows these days does delays with timing loops or the scheduler.  It 
doesn't use a port.  Also, Windows XP only supports machines that tend 
not to have timing problems that use delays.  Instead, if a device takes 
a while to respond, it has a busy bit in some port or memory slot that 
can be tested.


Almost all of the issues in Linux where _p operations are used are (or 
should be) historical - IMO.


Ondrej Zary wrote:

On Tuesday 08 January 2008 02:38:15 David P. Reed wrote:
  

H. Peter Anvin wrote:


And shoot the designer of this particular microcontroller firmware.
  

Well, some days I want to shoot the designer of the entire Wintel
architecture...  it's not exactly designed by anybody of course, and
today it's created largely by a collection of Taiwanese and Chinese ODM
firms, coupled with Microsoft WinHEC and Intel folks.  At least they
follow the rules and their ACPI and BIOS code say that they are using
port 80 very clearly if you use PnP and ACPI properly.  And in the old
days, you were supposed to use the system BIOS to talk to things like
the PIT that had timing issues, not write your own code.



Does anyone know what port does Windows use? I'm pretty sure that it isn't 80h 
as I run Windows 98 often with port 80h debug card inserted. The last POST 
code set by BIOS usually remains on the display and only changes when BIOS 
does something like suspend/resume. IIRC, there was a program that was able 
to display temperature from onboard sensors on the port 80h display that's 
integrated on some mainboards.


  

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-08 Thread David P. Reed



Ondrej Zary wrote:

On Tuesday 08 January 2008 18:24:02 David P. Reed wrote:
  

Windows these days does delays with timing loops or the scheduler.  It
doesn't use a port.  Also, Windows XP only supports machines that tend
not to have timing problems that use delays.  Instead, if a device takes
a while to respond, it has a busy bit in some port or memory slot that
can be tested.



Windows XP can run on a machine with ISA slot(s) and has built-in drivers for 
some plugplay ISA cards - e.g. the famous 3Com EtherLink III. I think that 
there's a driver for NE2000-compatible cards too and it probably works.
  
There is no need to use io writes to supposedly/theoretically unused 
ports to make drivers work on any bus.
ISA included!  You can, for example, wait for an ISA bus serial adapter 
to put out its next character by looping reading the port that has the 
output buffer full flag in a tight loop, with no delay code at all.  And 
if you need to time things, just call a timing loop subroutine that you 
calibrate at boot time.
I wrote DOS drivers for NE2000's on the ISA bus when they were brand new 
designs from Novell without such kludges as writes to I/O port 80.  I 
don't remember writing a driver for the 3com devices - probably didn't, 
because 3Com's cards were expensive at the time.


In any case, Linux *did* adopt this port 80 strategy - I'm sure all 
concerned thought it was frightfully clever at the time.  Linus 
expressed his skepticism in the comments in io.h.  The problem is to 
safely move away from it toward a proper strategy that doesn't depend on 
bus aborts which would trigger machine checks if they were properly 
enabled.




Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-08 Thread David P. Reed

Alan Cox wrote:

The natsemi docs here say otherwise. I trust them not you.
  
As well you should. I am honestly curious (for my own satisfaction) as 
to what the natsemi docs say the delay code should do  (can't imagine 
they say use io port 80 because it is unused).  I don't have any 
copies anymore. But mere curiosity on my part is not worth spending a 
lot of time on - I know you are super busy.   If there's a copy online 
at a URL ...


The problem is that certain people, unfortunately those who know
nothing about ISA related bus systems, keep trying to confuse ISA delay
logic with core chip logic and end up trying to solve both a problem and a
non-problem in one, creating a nasty mess in the process.

  
I agree that the problems of chip logic and ISA delay are all tangled 
up, probably more than need be.  I hope that the solution turns out to 
simplify matters, and hopefully to document the intention of the 
resulting code sections a bit more clearly for the future.



Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-08 Thread David P. Reed

Alan -

I dug up a DP83901A SNIC datasheet in a quick Google search, while that 
wasn't the only such chip, it was one of them.  I can forward the PDF 
(the www.alldatasheet.com site dynamically creates the download URL), if 
anyone wants it.
The relevant passage says, in regard to delaying between checking the 
CRDA addresses to see if a dummy remote read has been executed, and 
in regard perhaps to other card I/O register loops: 



   TIME BETWEEN CHIP SELECTS

   The SNIC requires that successive chip selects be no closer
   than 4 bus clocks (BSCK) together. If the condition is
   violated the SNIC may glitch ACK. CPUs that operate from
   pipelined instructions (i.e. 386) or have a cache (i.e. 486)
   can execute consecutive I/O cycles very quickly. The solution
   is to delay the execution of consecutive I/O cycles by either
   breaking the pipeline or forcing the CPU to access outside
   its cache.

The NE2000 as I recall had no special logic on the board to protect the 
chip from successive chip selects that were too close - which is the 
reason for the problem. Clearly an out to port 80 takes more than 4 ISA 
bus clocks, so that works if the NE2000 is on the ISA bus.  On the 
other hand, there are other ways to delay more than 4 ISA bus clocks.  
And as you say, one needs a delay for this chip that relates to the 
clock speed of the bus the chip's card sits on, not absolute time.


Alan Cox wrote:
As well you should. I am honestly curious (for my own satisfaction) as 
to what the natsemi docs say the delay code should do  (can't imagine 
they say use io port 80 because it is unused).  I don't have any 



They say you must allow 4 bus clocks for the address decode. They don't
deal with the ISA side as the chip itself has no ISA glue.


  
copies anymore. But mere curiosity on my part is not worth spending a 
lot of time on - I know you are super busy.   If there's a copy online 
at a URL ...



Not that I know of. There may be. A good general source of info is Russ
Nelson's old DOS packet driver collection.


  



Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-08 Thread David P. Reed


Christer Weinigel wrote:
There is no need to use io writes to supposedly/theoretically unused 
ports to make drivers work on any bus.

ISA included!  You can, for example, wait for an ISA bus serial
adapter to put out its next character by looping reading the port
that has the output buffer full flag in a tight loop, with no delay
code at all.  And if you need to time things, just call a timing loop
subroutine that you calibrate at boot time.



Now you're totally confusing things.  You're talking about looking at
bits in a register to see if a transmit register is empty.  
That's easy.


The delays needed for the Intel M8259 and M8253 say that you're not
even allowed to access the registers _at_ _all_ for some time after a
register access.  If you do a write to a register immediately followed
by any access, including a read of the status register, you can corrupt
the state of the chip.
  
Not true.  Even on the original IBM 5150 PC, the 8259 on the motherboard 
accepted back to back  OUT and IN instructions, and it would NOT trash 
the chip state.  You can read the original IBM BIOS code if you like.  I 
don't remember about the 8253's timing.  I doubt the chip's state would 
be corrupted in any way. The data and address lines were the same data 
and address lines that the microprocessor used to access memory - it 
didn't hold the lines stable any longer than the OUT instruction.

And the Intel chips are not the only ones with that kind of brain
damage.  But what makes the 8259 and 8253 a big problem is that every
modern PC has a descendant of those chips in them.
Register compatible.  Not  the same chips or even the same  masks or 
timing requirements.

The discrete Intel
chips or clones got aggregated into Super I/O chips, and the Super I/O
chips were put on a LPC bus (an ISA bus with another name) or
integrated into the southbrige.
Don't try to teach your grandmother to suck eggs: I've been programming 
PC compatibles since probably before you were able to do long division - 
including writing code on the first prototype IBM PCs, the first 
pre-manufacturing PC-ATs, and zillions of clones.  (and I was also 
involved in designing hardware including the so-called Lotus Intel 
expanded memory cards and the original PC cards)  The 8259 PIC is an 
*interrupt controller*. It was NEVER present in a Super I/O chip, or an 
LPC chip.  Its functionality was absorbed into the chipsets that control 
interrupt mapping, like the PIIX and the nForce. 


And the if it ain't broken, don't fix
it mantra probably means that some modern chipsets are still using
exactly the same internal design as the 25 year old chips and will
still be subject to some of those ancient limitations.
  
Oh, come on.  Give the VLSI designers some credit for brains.   The CAD 
tools used to design the 8259 and 8253 were so primitive you couldn't 
even get a chip manufactured with designs from that era today.  When 
people design chips today, they do it with simulators and testers, 
driven by test suites, that were not available at the time.



Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-08 Thread David P. Reed

Christer Weinigel wrote:

Argument by personal authority.  Thats good.
There is no other kind of argument.  Are you claiming supernatural 
authority drives your typing fingers, or is your argument based on what 
you think you know?  I have piles of code that I wrote, spec sheets (now 
that I'm back in my home office), code that others wrote at the time, 
and documentation from vendors that come from my personal experiences.  
That doesn't mean I'm always right - always happy to learn something 
new.  Just don't condescend to a 55 year old who has been writing 
operating systems, compilers, and designing hardware for almost 40 years 
professionally (yes, I got my first job at 16 writing FORTRAN code to 
simulate hydrodynamic systems).

I guess that's why you
don't seem to understand the difference between reading the serial port
status register and not being allowed to access a register at all
due to such things as the 4 cycle delay you quoted yourself from the 8390
data sheet,
If you read what I said carefully, I said that the 8390 was a very 
special case.   The chip select problem it experienced was pretty much 
unique among boards of the time.  Those of us who looked at its design 
and had any experience designing hardware for buses like the unibus or 
even the buses on PDP-8's and DG machines thought it had to be a joke.  
Of course it saved money per board, so it beat the 3Com boards on price 
- and you could program it after a fashion.  So it involved cheaping out.


The normal timing problem was that an out or in operation to a board or 
chip required some time to elapse before the chip performed the side 
effects internally so that the next operation to it would have an 
effect.  This is exactly the reason why most chips and boards are 
designed to have a status flag that can be polled to indicate operation 
completion.  The serial buffer empty flag is the simplest possible 
explanatory example of such handshaking that came to mind (writing a 
character to a serial output device twice often leads to surprises, 
unless you wait for the previous character to clock out).  See my 
comment on RTC below, for a more complex to explain example.

and similar issues with the I8253 that I quoted from its
data sheet a few posts ago.

  
The 8253 was a motherboard chip.  I am not sure it had any timing 
problems with its electrical signalling.  I just don't remember.  The 
spec sheet doesn't say its internal state can get scrambled.


I was thinking of another timer, the RTC which is usually a part of the
Super I/O.
The RTC has very well documented timing requirements.  But none of the 
spec sheets, nor my experience with it, mention electrical issues that 
prevented back-to-back port operations.  The documented timing 
requirements have to do with the state during the time it ticks over 
internally once per second.  But it is carefully designed to have a flag 
that is on during 244 microseconds prior to and covering the time it 
is unsafe to read the registers.   That design is special because it is 
designed to operate when the machine is powered off, so it has two 
internal clock domains, one of which is used in low power mode and is 
very slow to minimize power.




Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-07 Thread David P. Reed

H. Peter Anvin wrote:


And shoot the designer of this particular microcontroller firmware.

 
Well, some days I want to shoot the "designer" of the entire Wintel 
architecture...  it's not exactly "designed" by anybody of course, and 
today it's created largely by a collection of Taiwanese and Chinese ODM 
firms, coupled with Microsoft WinHEC and Intel folks.  At least they 
follow the rules and their ACPI and BIOS code say that they are using 
port 80 very clearly if you use PnP and ACPI properly.  And in the old 
days, you were "supposed" to use the system BIOS to talk to things like 
the PIT that had timing issues, not write your own code.


Or perhaps the ACPI spec should specify a timing loop spec and precisely 
specify the desired timing after accessing an I/O port till that device 
has properly "acted" on that operation.


The idea that Port 80 was "unused" and appropriate for delay purposes 
elicited skepticism by Linus that is recorded for posterity in the 
comments of the relevant Linux include files - especially since it was 
clearly "used" for non-delay purposes, by cards that could be plugged 
into a PCI (fast), not just an 8-bit ISA, bus.


Perhaps we should declare the world of ACPI systems a separate "arch" 
from the world of l'ancien regime where folklore about which ports were 
used for what ruled.   I lived through those old days, and they were not 
wonderful, either.


The world sucks, and Linux is supposed to be able to adapt to that 
world, suckitude and all.




Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-07 Thread David P. Reed
On another topic.  I have indeed determined what device uses port 80 on 
Quanta AMD64 laptops from HP.


I had lunch with Jim Gettys of OLPC a week ago; he's an old friend since 
he worked on the original X Window System.   After telling him my story 
about port 80, he mentioned that the OLPC XO machine had some issues 
with port 80 which was by design handled by the ENE KBC device on its 
motherboard.   He said the ENE was a very desirable chipset for AMD 
designs recommended by Quanta.  Richard Smith of OLPC explained to me 
how the KB3700 they use works, and that they use the KB3700 to send POST 
codes out over a serial link during boot up.


This gave me a reason to take apart my laptop, to discover that it has 
an ENE KB3920 B0 as its EC and KBC.  The port interface for the KB3920 
includes listening to port 80 which is then made available to firmware 
on the EC.  It is recognized and decoded on the LPC bus, only for 
writes, and optionally can generate an interrupt in the 8051.


Dumping both the ENE chip, and looking at the DSDT.dsl for my machine, I 
discovered that port 80 is used as an additional parameter for various 
DSDT methods that communicate to the EC, when it is operating in ACPI mode.


More work is in progress as I play around with this.  But the key thing 
is that ACPI and perhaps SMM both use port 80 as part of the base 
function of the chipset.


And actually, if I had looked at the /sys/bus/pnp definitions, rather 
than /proc/ioports, I would have noticed that port 80 was part of a 
PNP0C02 resource set.   That means exactly one thing:  ACPI says that 
port 80 is NOT free to be used, for delays or anything else.


This should make no difference here: it's just one more reason to stop 
using port 80 for delays on modern machines.




Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-07 Thread David P. Reed



H. Peter Anvin wrote:

Rene Herman wrote:


Is this only about the ones then left for things like legacy PIC and 
PIT? Does anyone care about just sticking in a udelay(2) (or 1) there 
as a replacement and call it a day?




PIT is problematic because the PIT may be necessary for udelay setup.

The PIT usage for calibrating the delay loop can be moderated, if need 
be, by using the PC BIOS, which by definition uses the PIT correctly in 
its int 15 function 83 call.  Just do it before coming up in a state 
where the PC BIOS int 15h calls no longer work.  I gave code to do this 
in a much earlier message.


This is the MOST reliable way to use the PIT early in boot, on a PC 
compatible.  God knows how one should do it on a Macintosh running a 
386/20  :-).   But the ONLY old bat-PIT machines are, thank god, PC 
compatible, maybe.







Re: [linux-kernel] Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-02 Thread David P. Reed
FYI - another quirky Quanta motherboard from HP, with DMI readings 
reported to me.


 Original Message 
Date:   Wed, 2 Jan 2008 16:23:27 +1030
From:   Joel Stanley <[EMAIL PROTECTED]>
To: David P. Reed <[EMAIL PROTECTED]>
Subject:Re: [PATCH] Option to disable AMD C1E (allows dynticks to work)



On Dec 30, 2007 1:13 AM, David P. Reed <[EMAIL PROTECTED]> wrote:

I have also attached a c program that only touches port 80.  Compile it
for 32-bit mode (see comment), run it as root, and after two or three
runs, it will hang a system that has the port 80 bug.


Using port80.c, I could hard lock a HP Pavilion tx1000 laptop on the
first go. This was with ubuntu hardy's stock kernel (a 2.6.24-rc)


dmidecode -s baseboard-manufacturer
dmidecode -s baseboard-product-name


Quanta
30BF

Tonight, I will try compiling a kernel with these values added to your patch.

Some history, feel free to ignore if it's not relevant: ubuntu
feisty's 2.6.22 based kernel worked fine, IIRC. We were having issues
with sound, so tried fedora8's .23 based kernel, but this would
sporadically hard lock. Ubuntu hardy's 2.6.24 appeared fine, for the 2
hours or so I used it last night, until using the port80.c program,
obviously.

Cheers,

Joel





Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-01 Thread David P. Reed


Alan Cox wrote:

That does imply some muppet 'extended' the debug interface for power
management on your laptop. Also pretty much proves that for such systems
we do have to move from port 0x80 to another delay approach.
  
Alan - in googling around the net yesterday looking for SuperIO chipsets 
that claim to support port 80, I have found that "blade" servers from 
companies like IBM and HP *claim* to have a system for monitoring port 
80 diagnostic codes and sending them to the "drawer" management 
processor through a management backplane.   This is a little puzzling, 
because you'd think they would have noticed port 80 issues, since they 
run Linux in their systems.  Maybe not hangs, but it seems unhelpful to 
have a lot of noise spewing over a bus that is supposed to provide 
"management" diagnostics.  Anyway, what I did not find was whether there 
was a particular chipset that provided that port 80 feature on those 
machines.  However, if it's a common "cell" in a design, it may have 
leaked into the notebook market chipsets too.


Anyone know if the Linux kernels used on blade servers have been patched 
to not do the port 80 things?  I don't think this would break anything 
there, but it might have been a helpful patch for their purposes.  I 
don't do blades personally or at work (I focus on mobile devices these 
days, and my personal servers are discrete), so I have no knowledge.


It could be that the blade servers have BIOSes that don't do POST codes 
over port 80, but send them directly to the "drawer" management bus, of 
course.  


Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-01 Thread David P. Reed

Pavel Machek wrote:

2. there is some "meaning" to certain byte values being written (the
_PTS and _WAK use of arguments that come from callers to store into port
80 makes me suspicious.)   That might mean that the freeze happens only
when certain values are written, or when they are written closely in
time to some other action - being used to communicate something to the



There's nothing easier than always writing 0 to the 0x80 to check if
it hangs in such case...?
Pavel

  
I did try that.  Machine in question does hang when you write 0 to 0x80 
in a loop a few thousand times.  This particular suspicion was that the 
problem was caused by the following sort of thing (it's a multi-cpu 
system...)


First, some ACPI code writes "meaningful value" X to  port 80 that is 
sort of a "parameter" to whatever follows.  Just because the DSDT 
disassembly *calls* it the DBUG port doesn't mean it is *only* used for 
debugging.   We (Linux) use it for timing delays, after all...


then Linux driver writes some random value (!=X) including zero to port 80.

then ACPI writes some other values that cause SMI or some other thing to 
happen,


There are experiments that are not so simple that could rule this 
particular guess out.   I have them on my queue of experiments I might 
try (locking out ACPI).  Of course if the BIOS were GPL, we could look 
at the comments, etc...


I may today pull the laptop apart to see if I can see what chips are on 
it, besides the nvidia chipset and the processor.  That might give a 
clue as to what SuperIO or other logic chips are there.






Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-01 Thread David P. Reed

Alan Cox wrote:
responds to reads differently than "unused" ports.  In particular, an 
inb takes 1/2 the elapsed time compared to a read to "known" unused port 
0xed - 792 tsc ticks for port 80 compared to about 1450 tsc ticks for 
port 0xed and other unused ports (tsc at 800 MHz).



Well at least we know where the port is now - thats too fast for an LPC
bus device, so it must be an SMI trap.

Only easy way to find out is to use the debugging event counters and see
how many instruction cycles are issued as part of the 0x80 port. If its
suprisingly high then you've got a firmware bug and can go spank HP.

  
Alan, thank you for the pointers.  I have been doing variations on this 
testing theme for a while - I get intrigued by a good debugging 
challenge, and after all it's my machine...


Two relevant new data points, and then some more suggestions:

1. It appears to be a real port.  SMI traps are not happening in the 
normal outb to 80.  Hundreds of them execute perfectly with the expected 
instruction counts.  If I can trace the particular event that creates 
the hard freeze (getting really creative, here) and stop before the 
freeze disables the entire computer, I will.  That may be an SMI, or 
perhaps any other kind of interrupt or exception.  Maybe someone knows 
how to safely trace through an impending SMI while doing printk's or 
something?


2. It appears to be the standard POST diagnostic port.  On a whim, I 
disassembled my DSDT code, and studied it more closely.   It turns out 
that there are a bunch of "Store(..., DBUG)" instructions scattered 
throughout, and when you look at what DBUG is defined as, it is defined 
as an IO Port at IO address DBGP, which is a 1-byte value = 0x80.  So 
the ACPI BIOS thinks it has something to do with debugging.   There's a 
little strangeness here, however, because the value sent to the port 
occasionally has something to do with arguments to the ACPI operations 
relating to sleep and wakeup ...  could just be that those arguments are 
distinctive.


In thinking about this, I recognize a couple of things.  ACPI is telling 
us something when it declares a reference to port 80 in its code.  It's 
not telling us the function of this port on this machine, but it is 
telling us that it is being used by the BIOS.   This could be a reason 
to put out a printk warning message...   'warning: port 80 is used by 
ACPI BIOS - if you are experiencing problems, you might try an alternate 
means of iodelay.'


Second, it seems likely that there are one of two possible reasons that 
the port 80 writes cause hang/freezes:


1. buffer overflow in such a device.

2. there is some "meaning" to certain byte values being written (the 
_PTS and _WAK use of arguments that come from callers to store into port 
80 makes me suspicious.)   That might mean that the freeze happens only 
when certain values are written, or when they are written closely in 
time to some other action - being used to communicate something to the 
SMM code).   If there is some race in when Linux's port 80 writes happen 
that happen to change the meaning of a request to the hardware or to 
SMM, then we could be rarely stepping on




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-01 Thread David P. Reed


Alan Cox wrote:

That does imply some muppet 'extended' the debug interface for power
management on your laptop. Also pretty much proves that for such systems
we do have to move from port 0x80 to another delay approach.
  
Alan - in googling around the net yesterday looking for SuperIO chipsets 
that claim to support port 80, I have found that blade servers from 
companies like IBM and HP *claim* to have a system for monitoring port 
80 diagnostic codes and sending them to the drawer management 
processor through a management backplane.   This is a little puzzling, 
because you'd think they would have noticed port 80 issues, since they 
run Linux in their systems.  Maybe not hangs, but it seems unhelpful to 
have a lot of noise spewing over a bus that is supposed to provide 
management diagnostics.  Anyway, what I did not find was whether there 
was a particular chipset that provided that port 80 feature on those 
machines.  However, if it's a common cell in a design, it may have 
leaked into the notebook market chipsets too.


Anyone know if the Linux kernels used on blade servers have been patched 
to not do the port 80 things?  I don't think this would break anything 
there, but it might have been a helpful patch for their purposes.  I 
don't do blades personally or at work (I focus on mobile devices these 
days, and my personal servers are discrete), so I have no knowledge.


It could be that the blade servers have BIOSes that don't do POST codes 
over port 80, but send them directly to the drawer management bus, of 
course.  


Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2008-01-01 Thread David P. Reed

Pavel Machek wrote:

2. there is some meaning to certain byte values being written (the
_PTS and _WAK use of arguments that come from callers to store into port
80 makes me suspicious.)   That might mean that the freeze happens only
when certain values are written, or when they are written closely in
time to some other action - being used to communicate something to the



There's nothing easier than always writing 0 to the 0x80 to check if
it hangs in such case...?
Pavel

  
I did try that.  Machine in question does hang when you write 0 to 0x80 
in a loop a few thousand times.  This particular suspicion was that the 
problem was caused by the following sort of thing (it's a multi-cpu 
system...)


First, some ACPI code writes meaningful value X to  port 80 that is 
sort of a parameter to whatever follows.  Just because the DSDT 
disassembly *calls* it the DBUG port doesn't mean it is *only* used for 
debugging.   We (Linux) use it for timing delays, after all...


then a Linux driver writes some random value (!=X), including zero, to port 80.

then ACPI writes some other values that cause an SMI or some other thing to 
happen.


There are experiments that are not so simple that could rule this 
particular guess out.   I have them on my queue of experiments I might 
try (locking out ACPI).  Of course if the BIOS were GPL, we could look 
at the comments, etc...


I may today pull the laptop apart to see if I can see what chips are on 
it, besides the nvidia chipset and the processor.  That might give a 
clue as to what SuperIO or other logic chips are there.






Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override

2007-12-30 Thread David P. Reed



H. Peter Anvin wrote:
Now, I think there is a specific reason to believe that EGA/VGA (but 
perhaps not CGA/MDA) didn't need these kinds of hacks: the video cards 
of the day were touched, directly, by an interminable number of DOS 
applications.  CGA/MDA generally *were not*, due to the unsynchronized 
memory of the original versions (writing could cause snow), so most 
applications tended to fall back to using the BIOS access methods for 
CGA and MDA.


A little history... not that it really matters, but some might be 
interested in a 55-year-old hacker's sentimental recollections...As 
someone who actually wrote drivers for CGA and MDA on the original IBM 
PC, I can tell you that back to back I/O *port* writes and reads were 
perfectly fine.  The "snow" problem had nothing to do with I/O ports.  
It had to do with the memory on the CGA adapter card not being dual 
ported, and in high-res (80x25) character mode (only!) a CPU read or 
write access caused a read of the adapter memory by the 
character-generator to fail, causing one character-position of the 
current scanline being output to get all random bits, which was then put 
through the character-generator and generated whatever the character 
generator did with 8 random bits of character or attributes as an index 
into the character generator's font table.


In particular, the solution in both the BIOS and in Visicalc, 1-2-3, and 
other products that did NOT use the BIOS or DOS for I/O to the CGA or 
MDA because they were Dog Slow, was to detect the CGA, and do a *very* 
tight loop doing "inb" instructions from one of the CGA status 
registers, looking for a 0-1 transition on the horizontal retrace flag.  
It would then do a write to display memory with all interrupts locked 
out, because that was all it could do during the horizontal retrace, 
given the speed of the processor.  One of the hacks I did in those days 
(I wrote the CGA driver for Visicalc Advanced Version and several other 
Software Arts programs, some of which were sold to Lotus when they 
bought our assets, and hired me, in 1985) was to measure the "horizontal 
retrace time" and the "vertical blanking interval" when the program 
started, and compile screen-writing code that squeezed as many writes as 
possible into both horizontal retraces and vertical retraces.   That was 
actually a "selling point" for spreadsheets - the reviewers actually 
measured whether you could use the down-arrow key in auto-repeat mode 
and keep the screen scrolling at the relevant rate!  That was hard on an 
8088 or 80286 processor, with a CGA card.


It was great when EGA and VGA came out, but we still had to support the 
CGA long after.  Which is why I fully understand the need not to break 
old machines.  We had to run on every machine that was claimed to be "PC 
compatible" - many of which were hardly so compatible  (the PS/2 model 
50 had a completely erroneous serial chip that claimed to emulate the 
original 8250, but had an immense pile of bugs, for example, that IBM 
begged ISVs to call a software problem and fix so they didn't get sued).


The IBM PC bus (predecessor of the current ISA bus, which came from the 
PC-AT's 16-bit bus), did just fine electrically - any I/O port-specific 
timing problems had to do with the timing of the chips attached to the 
bus.  For example, if a bus write to a port was routed into a particular 
chip, the timing of that chip's subsequent processing might be such that 
it was not ready to respond to another read or write.  That's not a 
"signalling" problem - it has nothing to do with capacitance on the bus, 
e.g., but a functional speed problem in the chip (if on the motherboard) 
or the adapter card.


Rant off.  This has nothing, of course, to do with present issues.


Re: [PATCH] Option to disable AMD C1E (allows dynticks to work)

2007-12-30 Thread David P. Reed



Richard Harman wrote:


I think I may have a monkey wrench to throw into this, I finally got 
around to testing the C1E patch, and the port80 patch.  End result: 
port80 patch has zero effect on this laptop, and the C1E patch makes

it stable.



Stating that your system is "stable" is not very definitive.   Does 
hwclock work when full Fedora 8 is running without the port80 patch, or 
have you disabled the uses of hwclock in your init and shutdown code?   
Have you set the hwclock setting to use the extremely dangerous 
"-directisa" option - which hides the problem because it avoids the port 
80 i/o?


Try compiling and running the test program port80.c a few times.   If 
your machine doesn't hang, it would be interesting to see the results it 
gives.


The C1E patch alone does not fix the port 80 problem several of us have 
observed.  What does dmidecode say for your motherboard vendor and model?





Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override

2007-12-30 Thread David P. Reed

Alan Cox wrote:
Now what's interesting is that the outb to port 80 is *faster* than an 
outb to an unused port, on my machine.  So there's something there - 
actually accepting the bus transaction.   In the ancient 5150 PC, 80 was 



Yes and I even told you a while back how to verify where it is. From the
timing you get it's not on the LPC bus but chipset core so pretty
certainly an SMM trap as other systems with the same chipset don't have
the bug. Probably all that is needed is a BIOS upgrade

  
Actually, I could see whether it was SMM trapping due to AMD MSR's that 
would allow such trapping, performance or debug registers.  Nothing was 
set to trap with SMI or other traps on any port outputs.   But I'm 
continuing to investigate for a cause.  It would be nice if it were a 
BIOS-fixable problem.  It would be even nicer if the BIOS were GPL...



Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override

2007-12-30 Thread David P. Reed
I am so happy that there will be a way for people who don't build their 
own kernels to run Linux on their HP and Compaq laptops that have 
problems with gazillions of writes to port 80, and I'm also happy that 
some of the strange driver code will be cleaned up over time.  Thank you 
all.  Some thoughts you all might consider, take or leave, in this 
process, from an old engineering manager who once had to worry about QA 
for software on nearly every personal computer model in the 1980-1992 
period:


You know, there is a class of devices that are defined to use port 
0x80...  it's that historically useful class of devices that show/record 
the POST diagnostics.   It certainly was not designed for "delay" 
purposes.   In fact, some of those same silly devices are still used in 
industry during manufacturing test.   I wonder what would happen if 
Windows were not part of manufacturing test, and instead Linux were the 
"standard" for some category of machines...


When I was still working at Lotus in the late '80's, when we still 
supported machines like 286's, there were lots of problems with timing 
loops in drivers in applications (even Win 3.0 had some in hard disk 
drivers, as did some of our printer drivers, ...), as clock speeds 
continued to ramp.  There were major news stories of machines that 
"crashed when xyz application or zyx peripheral were added".  It was 
Intel, as I recall, that started "publicly" berating companies in the PC 
industry for using the "two short jumps" solutions, and suggesting that 
they measure the processor speed at bootup, using the BIOS standard for 
doing that with the int 15 BIOS elapsed time calls, and always use 
"calibrated" timing loops.   Which all of us who supported device 
drivers started to do  (remember, apps had device drivers in those days 
for many devices that talked directly with the registers).


I was impressed when I dug into Linux eventually, that this operating 
system "got it right" by measuring the timing during boot and creating a 
udelay function that really worked!


So I have to say, that when I was tracing down the problem that 
originally kicked off this thread, which was that just accessing the RTC 
using the standard CMOS_READ macros in a loop caused a hang, that these 
"outb al,80h" things were there.   And I noticed your skeptical comment 
in the code, Linus.  Knowing that there was never in any of the 
documented RTC chipsets a need for a pause between accesses (going back 
to my days at Software Arts working on just about every old machine 
there was...) I changed it on a lark to do no pause at all.   And my 
machine never hung...


Now what's interesting is that the outb to port 80 is *faster* than an 
outb to an unused port, on my machine.  So there's something there - 
actually accepting the bus transaction.   In the ancient 5150 PC, 80 was 
unused because it was the DMA controller port that drove memory refresh, 
and had no meaning.


Now my current hypothesis (not having access to quanta's design specs 
for a board they designed and have shipped in quantity, or having taken 
the laptop apart recently) is that there is logic there on port 80, 
doing something.  Perhaps even "POST diagnostic recording" as every PC 
since the XT has supported... perhaps supporting post-crash 
diagnostics...   And that that something has a buffer, perhaps even in 
the "Embedded Controller" that may need emptying periodically.   It 
takes several tens of thousands of "outb" to port 80 to hang the 
hardware solid - so something is either rare or overflowing.  In any 
case, if this hypothesis is correct - the hardware may have an erratum, 
but the hardware is doing a very desirable thing - standardizing on an 
error mechanism that was already in the "standard" as an option...  It's 
Linux that is using a "standard" in a wrong way (a diagnostic port as a 
delay).


So I say all this, mainly to point out that Linux has done timing loops 
right (udelay and ndelay) - except one place where there was some 
skepticism expressed, right there in the code.   Linus may have some 
idea why it was thought important to do an essential delay with a bus 
transaction that had uncertain timing.   My hypothesis is that 
"community" projects have the danger of "magical theories" and 
"coolness" overriding careful engineering design practices.


Cleaning up that "clever hack" that seemed so good at the time is hugely 
difficult, especially when the driver writer didn't write down why he 
used it.  

Thus I would suggest that the _p functions be deprecated, and if there 
needs to be a timing-delay after in/out instructions, define 
in_pause(port, nsec_delay) with an explicit delay.   And if the delay is 
dependent on bus speeds, define a bus-speed ratio calibration.


Thus in future driver writing, people will be forced to think clearly 
about what the timing characteristics of their device on its bus must 
be.   That presupposes that driver writers understand the timing issues.


Re: [PATCH] Option to disable AMD C1E (allows dynticks to work)

2007-12-29 Thread David P. Reed

Islam Amer wrote:

Hello.
I was interested in getting dynticks to work on my compaq presario v6000
to help with the 1 hour thirty minutes battery time, but after this
discussion I lost interest.

I too had the early boot time hang, and found it was udev triggering the
bug.
  
This early boot time hang is *almost certainly* due to the in/out port 
80 bug, which I discovered a few weeks ago, which affects hwclock and 
other I/O device drivers on a number of HP/Compaq machines in exactly 
this way.  The proper fix for this bug is in dispute, and will probably 
not occur in the 2.6.24 release because it touches code in many, many 
drivers.  The simplest way to test if you have a problem of this sort is 
to try this shell line as root, after you boot successfully.  If your 
machine hangs hard,  you have a problem that really looks like the port 
80 problem.


for ((i = 0; i < 1000; i = i + 1)); do cat /dev/nvram > /dev/null; done

I have also attached a c program that only touches port 80.  Compile it 
for 32-bit mode (see comment), run it as root, and after two or three 
runs, it will hang a system that has the port 80 bug.


If you then run:

dmidecode -s baseboard-manufacturer
dmidecode -s baseboard-product-name

the values printed are what you should plug into the .matches field in 
the dmi_system_id struct in the attached patch.  It would be great if 
you could do that, test, and post back with those values so they can be 
accumulated.  HP/Compaq machines with Quanta motherboards are very 
popular and very common - so at least a quirk patch for all the broken 
models would be worth doing in 2.6.25 or downstream in the distros.  The 
right patches will probably take a long time - there is a dispute among 
the core kernel developers as to what the semantics of port 80 writes 
even mean, because the hack is lost in the dim dark days of history, and 
a safe resolution will take time.


There is also a C1E issue with the BIOS in my machine (an HP Pavilion 
dv9000z).  I don't know if it is a bug, yet, but that's a different 
problem - associated with dynticks, perhaps.  I have to say that 
researching the AMD Kernel/BIOS docs on C1E (a very new feature in the 
last year on AMD) leaves me puzzled as to whether the dynticks problem 
exists on my machine at all, but the patch for it turns off dynticks!





Changing the /etc/init.d/udev script so that the line containing

/sbin/udevtrigger

to

/sbin/udevtrigger --subsystem-nomatch="*misc*"

seemed to fix things.

the hang is triggered specifically by 


echo add > /sys/class/misc/rtc/uevent
after inserting rtc.ko

Also, using hwclock to set the rtc will cause a hard hang if you are
using 64-bit Linux. Disable the init scripts that set the time, or use
the 32-bit binary, as suggested here: 


http://www.mail-archive.com/[EMAIL PROTECTED]/msg41964.html

I hope this helps. But your hardware is slightly different though.
  


commit c12c7a47b9af87e8d867d5aa0ca5c6bcdd2463da
Author: Rene Herman <[EMAIL PROTECTED]>
Date:   Mon Dec 17 21:23:55 2007 +0100

x86: provide a DMI based port 0x80 I/O delay override.

Certain (HP) laptops experience trouble from our port 0x80 I/O delay
writes. This patch provides for a DMI based switch to the "alternate
diagnostic port" 0xed (as used by some BIOSes as well) for these.

David P. Reed confirmed that port 0xed works for him and provides a
proper delay. The symptoms of _not_ working are a hanging machine,
with "hwclock" use being a direct trigger.

Earlier versions of this attempted to simply use udelay(2), with the
2 being a value tested to be a nicely conservative upper-bound with
help from many on the linux-kernel mailinglist, but that approach has
two problems.

First, pre-loops_per_jiffy calibration (which is post PIT init while
some implementations of the PIT are actually one of the historically
problematic devices that need the delay) udelay() isn't particularly
well-defined. We could initialise loops_per_jiffy conservatively (and
based on CPU family so as to not unduly delay old machines) which
would sort of work, but still leaves:

Second, delaying isn't the only effect that a write to port 0x80 has.
It's also a PCI posting barrier which some devices may be explicitly
or implicitly relying on. Alan Cox did a survey and found evidence
that additionally various drivers are racy on SMP without the bus
locking outb.

Switching to an inb() makes the timing too unpredictable and as such,
this DMI based switch should be the safest approach for now. Any more
invasive changes should get more rigid testing first. It's moreover
only very few machines with the problem and a DMI based hack seems
to fit that situation.

An early boot parameter to make the choice manually (and override any
possible DMI based decision) is also provided:

	io_delay=standard|alternate

Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2007-12-17 Thread David P. Reed
Besides the two reports of freezes on bugzilla.kernel.org (9511, 6307), 
the following two bug reports on bugzilla.redhat.com are almost 
certainly due to the same cause (imo, of course): 245834, 227234.


Ubuntu launchpad bug 158849 also seems to report the same problem, for 
an HP dv6258se 64-bit machine.


Also this one: 
http://www.mail-archive.com/[EMAIL PROTECTED]/msg10321.html


If you want to collect dmidecode data from these folks, perhaps we might 
get a wider sense of what categories of machines are affected.  They all 
seem to be recent HP and Compaq AMD64 laptops, probably all with Quanta 
motherboards.





Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2007-12-17 Thread David P. Reed



H. Peter Anvin wrote:

David P. Reed wrote:
As support: port 80 on the reporter's (my) HP dv9000z laptop clearly 
responds to reads differently than "unused" ports.  In particular, an 
inb takes 1/2 the elapsed time compared to a read to "known" unused 
port 0xed - 792 tsc ticks for port 80 compared to about 1450 tsc 
ticks for port 0xed and other unused ports (tsc at 800 MHz).




Any timings for port 0xf0 (write zero), out of curiosity?



Here's a bunch of data:

port 0xF0: cycles: out 919, in 933
port 0xed: cycles: out 2541, in 2036
port 0x70: cycles: out n/a,  in 934
port 0x80: cycles: out 1424, in 795

AMD Turion 64x2 TL-60 CPU running at 800 MHz, nVidia MCP51 chipset, 
Quanta motherboard.  Running 2.6.24-rc5 with Ingo's patch so inb_p, etc. 
use port 0xed.


Note that I can run the port 80 test once; the second time I get the 
hard freeze.  I didn't try writing to port 0x70 from userspace - that 
one's dangerous - but reading it is included as a timing typical of a 
chipset-supported device.  These are all pretty consistent.


I find the "read" timing from port 0x80 very interesting.  The write 
timing is also interesting, being faster than an unused port.





Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2007-12-17 Thread David P. Reed

H. Peter Anvin wrote:

Rene Herman wrote:


I do not know how universal that is, but _reading_ port 0xf0 might in 
fact be sensible then? And should even work on a 386/387 pair? (I 
have a 386/387 in fact, although I'd need to dig it up).




No.  Someone might have used 0xf0 as a readonly port for other uses.

As support: port 80 on the reporter's (my) HP dv9000z laptop clearly 
responds to reads differently than "unused" ports.  In particular, an 
inb takes 1/2 the elapsed time compared to a read to "known" unused port 
0xed - 792 tsc ticks for port 80 compared to about 1450 tsc ticks for 
port 0xed and other unused ports (tsc at 800 MHz).




Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2007-12-17 Thread David P. Reed

Ingo -

I finished testing the rolled up patch that you provided.  It seems to 
work just fine.  Thank you for putting this all together and persevering 
in this long and complex discussion. 

Here are the results, on the offending laptop, using 2.6.24-rc5 plus 
that one patch.


First: booted with normal boot parameters (no io_delay=):

   According to dmesg, 0xed is used.

   hwclock ran fine, hundreds of times.
   my shell script loop doing "cat /dev/nvram > /dev/null" ran fine, 
several times.
   Rene's "port 80" speed test ran fine once, then froze the 
system hard.  (expected)


Second: booted with io_delay=0x80, several tests, rebooting after freezes:

   hwclock froze system hard.  (this is the problem that drove me to 
find this bug).

   my shell script loop froze system hard.

Third: booted with io_delay=none:

   hwclock ran fine, also hundreds of times.
   my shell script loop ran fine several times.
   Rene's port80 test ran fine twice, then froze the system hard on the 
third try.


Fourth: booted with io_delay=udelay:

   hwclock ran fine, also hundreds of times.
   my shell script loop ran fine several times.
   Rene's port80 test ran fine, then froze the system hard on the second try.

Analysis:

   patch works fine, and default to 0xed seems super conservative.
   I will probably use the boot parameter io_delay=none, because I 
don't seem to have any I/O devices that require any delays - and this 
way I can find any that do.

Still wondering:

   what the heck is going on with port 80 on my laptop motherboard.  
Clearly it "does something".
   I will in my spare time continue investigating, though having a 
reliable system is GREAT.





  




Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2007-12-17 Thread David P. Reed

About to start building and testing.  It will take a few hours.

Ingo Molnar wrote:
here's an updated rollup patch, against 2.6.24-rc4. David, could you 
please try this? This should work out of box on your system, without any 
boot option or other tweak needed.



  



Re: [PATCH] x86: provide a DMI based port 0x80 I/O delay override.

2007-12-17 Thread David P. Reed

Rene Herman wrote:
No, most definitely not. Having the user select udelay or none through 
the kernel config and then the kernel deciding "ah, you know what, 
I'll know better and use port access anyway" is _utterly_ broken 
behaviour. Software needs to listen to its master.


When acting as an ordinary user, the .config is beyond my control 
(except on Gentoo); it is in the control of the distro - Fedora, Ubuntu, 
and so on.  I think the distro guys want a default behavior set in 
.config, with quirk overrides applied when needed.  And of course the 
user gets the final say in his/her boot params.


