Re: [PATCH] x86, FPU: Fix FPU initialization

2013-04-16 Thread Ingo Molnar

* Borislav Petkov wrote:

> On Tue, Apr 16, 2013 at 11:25:28AM +0200, Ingo Molnar wrote:
> > I've got limited extra capacity right now - but if Peter rebases
> > tip:x86/cpu or you send a pullable update of tip:x86/cpu I can stick
> > it into -tip testing and yell if it goes wrong.
> >
> 
> Ok, I'll do that in a second.
> 
> Btw, another heads-up: you know how I'm regularly testing
> Linus+tip/master - well, I started seeing some strange lockups on my
> workstation with -rc7 + tip from two days ago. And the box wouldn't
> resume properly, the last line it would print is:
> 
> "Disabling non-boot CPUs ..."
> 
> and then hang. I've backed-out tip/master and it seems to work so it has
> to be some interaction caused by something in tip. I haven't been able
> to put my finger on it though but I'll watch it and try to trigger it on
> my other boxes.

Would be nice to pin that down ...

You are the first one to report this.

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86, FPU: Fix FPU initialization

2013-04-16 Thread Borislav Petkov
On Tue, Apr 16, 2013 at 11:25:28AM +0200, Ingo Molnar wrote:
> I've got limited extra capacity right now - but if Peter rebases
> tip:x86/cpu or you send a pullable update of tip:x86/cpu I can stick
> it into -tip testing and yell if it goes wrong.
>

Ok, I'll do that in a second.

Btw, another heads-up: you know how I'm regularly testing
Linus+tip/master - well, I started seeing some strange lockups on my
workstation with -rc7 + tip from two days ago. And the box wouldn't
resume properly, the last line it would print is:

"Disabling non-boot CPUs ..."

and then hang. I've backed-out tip/master and it seems to work so it has
to be some interaction caused by something in tip. I haven't been able
to put my finger on it though but I'll watch it and try to trigger it on
my other boxes.

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


Re: [PATCH] x86, FPU: Fix FPU initialization

2013-04-16 Thread Ingo Molnar

* Borislav Petkov wrote:

> On Mon, Apr 15, 2013 at 05:54:15PM +0200, Borislav Petkov wrote:
> > On Mon, Apr 15, 2013 at 12:18:25PM +0200, Ingo Molnar wrote:
> > > It was tip:master with x86/cpu merged in freshly.
> > 
> > Ok, some more observations. I can trigger some oops similar yours (I
> > haven't caught mine yet over serial or such) with latest tip/master +
> > tip:x86/cpu.
> 
> Ok, here's the deal - it looks like a corruption which causes a couple
> of different backtraces with different functions in the call trace. I've
> bisected tip:x86/cpu and the evildoers are:

Correct, 'late effects of memory corruption' was my first impression too, from
the crash pattern.

> 
> commit 3019653a57585602690fd38679326e9337f7ed7f
> Author: Borislav Petkov 
> Date:   Wed Apr 10 21:37:03 2013 +0200
> 
> x86/fpu: Fix FPU initialization
> 
> commit c70293d0e3fef6b989cd8268027d410cf06ce384
> Author: H. Peter Anvin 
> Date:   Mon Apr 8 17:57:43 2013 +0200
> 
> x86: Get rid of ->hard_math and all the FPU asm fu
> 
> 
> I'll venture a guess and say that if you revert those, your .config
> would boot on your K8 too.
> 
> So, I'd propose we take those 2 out for more careful inspection and
> fixing and the rest of tip:x86/cpu can go upstream in the upcoming merge
> window. IMHO of course.

I've got limited extra capacity right now - but if Peter rebases tip:x86/cpu or 
you send a pullable update of tip:x86/cpu I can stick it into -tip testing and 
yell if it goes wrong.

Thanks,

Ingo


Re: [PATCH] x86, FPU: Fix FPU initialization

2013-04-15 Thread Borislav Petkov
On Mon, Apr 15, 2013 at 05:54:15PM +0200, Borislav Petkov wrote:
> On Mon, Apr 15, 2013 at 12:18:25PM +0200, Ingo Molnar wrote:
> > It was tip:master with x86/cpu merged in freshly.
> 
> Ok, some more observations. I can trigger some oops similar yours (I
> haven't caught mine yet over serial or such) with latest tip/master +
> tip:x86/cpu.

Ok, here's the deal - it looks like a corruption which causes a couple
of different backtraces with different functions in the call trace. I've
bisected tip:x86/cpu and the evildoers are:

commit 3019653a57585602690fd38679326e9337f7ed7f
Author: Borislav Petkov 
Date:   Wed Apr 10 21:37:03 2013 +0200

x86/fpu: Fix FPU initialization

commit c70293d0e3fef6b989cd8268027d410cf06ce384
Author: H. Peter Anvin 
Date:   Mon Apr 8 17:57:43 2013 +0200

x86: Get rid of ->hard_math and all the FPU asm fu


I'll venture a guess and say that if you revert those, your .config
would boot on your K8 too.

So, I'd propose we take those 2 out for more careful inspection and
fixing and the rest of tip:x86/cpu can go upstream in the upcoming merge
window. IMHO of course.

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


Re: [PATCH] x86, FPU: Fix FPU initialization

2013-04-15 Thread Borislav Petkov
On Mon, Apr 15, 2013 at 12:18:25PM +0200, Ingo Molnar wrote:
> It was tip:master with x86/cpu merged in freshly.

Ok, some more observations. I can trigger some oops similar to yours (I
haven't caught mine yet over serial or such) with latest tip/master +
tip:x86/cpu.

When I remove tip:x86/cpu, the machine boots fine so I probably can
say now that I can reproduce at least similar behavior to what you're
observing.

Anyway, I'll try to catch the oops and try to decipher it and do a
bisection. Will keep you posted.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


Re: [PATCH] x86, FPU: Fix FPU initialization

2013-04-15 Thread Ingo Molnar

* Borislav Petkov wrote:

> On Mon, Apr 15, 2013 at 12:08:58PM +0200, Ingo Molnar wrote:
> > It gave me the impression of memory corruption - but impressions can
> > deceive ;-)
> >
> > Anyway, not sure I can test/bisect it this week - merge window
> > preparations and all that.
> 
> Ok, and also, in your oops, it said 3.9.0-rc6+ but tip:x86/cpu is 
> v3.9-rc5-11-g3019653a5758 so could it be a different kernel or some strange 
> interaction with some other code. [...]

It was tip:master with x86/cpu merged in freshly.

Thanks,

Ingo


Re: [PATCH] x86, FPU: Fix FPU initialization

2013-04-15 Thread Borislav Petkov
On Mon, Apr 15, 2013 at 12:08:58PM +0200, Ingo Molnar wrote:
> It gave me the impression of memory corruption - but impressions can
> deceive ;-)
>
> Anyway, not sure I can test/bisect it this week - merge window
> preparations and all that.

Ok, and also, in your oops, it said 3.9.0-rc6+ but tip:x86/cpu is
v3.9-rc5-11-g3019653a5758 so could it be a different kernel or some
strange interaction with some other code. I'll run your config with
tip:x86/cpu on my AMD box here to see whether I can repro on real hw.
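
A note on the version string: `v3.9-rc5-11-g3019653a5758` has the shape that `git describe` prints (nearest tag, number of commits on top of that tag, then the abbreviated commit hash prefixed with `g`). A minimal Python sketch of taking it apart follows; the helper name `parse_describe` is made up for illustration.

```python
import re

def parse_describe(version: str):
    """Split a `git describe`-style string into (tag, commits_since_tag, short_hash)."""
    match = re.fullmatch(r"(.+)-(\d+)-g([0-9a-f]+)", version)
    if match is None:
        raise ValueError(f"not a git-describe style string: {version!r}")
    tag, count, short_hash = match.groups()
    return tag, int(count), short_hash

# The string quoted in the mail above.
tag, count, short_hash = parse_describe("v3.9-rc5-11-g3019653a5758")
print(tag, count, short_hash)
```

Read this way, tip:x86/cpu sits 11 commits on top of v3.9-rc5, which is consistent with the "11 patches" mentioned elsewhere in the thread, and the short hash is the prefix of commit 3019653a57585602690fd38679326e9337f7ed7f ("x86/fpu: Fix FPU initialization").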

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


Re: [PATCH] x86, FPU: Fix FPU initialization

2013-04-15 Thread Ingo Molnar

* Borislav Petkov wrote:

> And 0x3f76 + 0x104 gives exactly 0x407a which is the address at
> which we #PF:
> 
> [   15.921486] BUG: unable to handle kernel paging request at 407a
> [   15.921486] IP: [<41071ab0>] __lock_acquire.isra.19+0x3e0/0xb00
> 
> More hmmm...

It gave me the impression of memory corruption - but impressions can deceive ;-)

Anyway, not sure I can test/bisect it this week - merge window preparations and 
all that.

Thanks,

Ingo


Re: [PATCH] x86, FPU: Fix FPU initialization

2013-04-12 Thread Borislav Petkov
On Fri, Apr 12, 2013 at 11:47:24AM +0200, Borislav Petkov wrote:
> On Thu, Apr 11, 2013 at 10:34:48PM -0700, H. Peter Anvin wrote:
> > >The lockup went away after excluding x86/cpu. I'll try more testing
> > >as time permits.
> 
> Right,
> 
> so tip:x86/cpu has all in all 11 patches. Maybe a quick bisect?

Ok, some more info. decodecoding your "Code:" section gives this (yep,
all the instruction bytes were repeated so I could've made a mistake
there while removing the duplicates):

[ 15.921486] Code: 00 83 3d c0 14 d0 41 00 0f 85 18 05 00 00 ba 34 03 00 00 b8 cb e0 4e 41 e8 ee 74 fb ff e9 04 05 00 00 85 db 0f 84 fc 04 00 00 90 <3e> ff 83 04 01 00 00 a1 48 48 77 41 8b b7 5c 03 00 00 85 c0 0f
All code

   0:   00 83 3d c0 14 d0       add    %al,-0x2feb3fc3(%rbx)
   6:   41 00 0f                add    %cl,(%r15)
   9:   85 18                   test   %ebx,(%rax)
   b:   05 00 00 ba 34          add    $0x34ba0000,%eax
  10:   03 00                   add    (%rax),%eax
  12:   00 b8 cb e0 4e 41       add    %bh,0x414ee0cb(%rax)
  18:   e8 ee 74 fb ff          callq  0xfffffffffffb750b
  1d:   e9 04 05 00 00          jmpq   0x526
  22:   85 db                   test   %ebx,%ebx
  24:   0f 84 fc 04 00 00       je     0x526
  2a:   90                      nop
  2b:*  3e ff 83 04 01 00 00    incl   %ds:0x104(%rbx)          <-- trapping instruction
  32:   a1 48 48 77 41 8b b7    movabs 0x35cb78b41774848,%eax
  39:   5c 03
  3b:   00 00                   add    %al,(%rax)
  3d:   85 c0                   test   %eax,%eax
  3f:
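
The offset of the trapping instruction can be cross-checked without a disassembler: in the oops "Code:" line, the byte at the faulting EIP is the one wrapped in angle brackets, so counting the bytes in front of it gives the offset. A small Python sketch of that count follows; `parse_code_line` is a hypothetical helper, not part of scripts/decodecode, and the byte string is the de-duplicated one from this mail.

```python
import re

# Byte dump from the "Code:" line above, without the timestamp prefix.
CODE = (
    "00 83 3d c0 14 d0 41 00 0f 85 18 05 00 00 ba 34 03 00 00 b8 "
    "cb e0 4e 41 e8 ee 74 fb ff e9 04 05 00 00 85 db 0f 84 fc 04 00 00 90 "
    "<3e> ff 83 04 01 00 00 a1 48 48 77 41 8b b7 5c 03 00 00 85 c0 0f"
)

def parse_code_line(code: str):
    """Return (raw bytes, offset of the byte marked <xx>) for an oops code dump."""
    data = bytearray()
    trap_offset = None
    for index, token in enumerate(code.split()):
        marked = re.fullmatch(r"<([0-9a-fA-F]{2})>", token)
        if marked:
            trap_offset = index      # the marked byte is the first byte at the faulting IP
            token = marked.group(1)
        data.append(int(token, 16))
    return bytes(data), trap_offset

code_bytes, trap_offset = parse_code_line(CODE)
print(len(code_bytes), hex(trap_offset))
```

The marked byte lands at offset 0x2b, which agrees with the `2b:*` line in the listing. The byte count does not depend on whether the dump is later disassembled as 32-bit or 64-bit code.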

Now, if I look at __lock_acquire objdump here, I get:

2688:   31 c0                   xor    %eax,%eax
268a:   e9 49 0b 00 00          jmp    31d8 <__lock_acquire+0xba6>
268f:   8b 4d c4                mov    -0x3c(%ebp),%ecx
2692:   8b 44 91 04             mov    0x4(%ecx,%edx,4),%eax
2696:   85 c0                   test   %eax,%eax
2698:   75 0e                   jne    26a8 <__lock_acquire+0x76>
269a:   8b 45 c4                mov    -0x3c(%ebp),%eax
269d:   31 c9                   xor    %ecx,%ecx
269f:   e8 12 e5 ff ff          call   bb6 <register_lock_class>
26a4:   85 c0                   test   %eax,%eax
26a6:   74 e0                   je     2688 <__lock_acquire+0x56>
26a8:   ff 80 04 01 00 00       incl   0x104(%eax)              <---
26ae:   8b 96 68 03 00 00       mov    0x368(%esi),%edx

which can be correlated with a lot of fuzz but the INC seems to look
the same and the offset within __lock_acquire is almost in the same
vicinity.

Which looks like this snippet here:

.L752:
        movl    -60(%ebp), %eax         # %sfp,
        xorl    %ecx, %ecx              #
        call    register_lock_class     #
        testl   %eax, %eax              # class
        je      .L970                   #,
.L753:
#APP
# 95 "/w/kernel/linux-2.6/arch/x86/include/asm/atomic.h" 1
        incl 260(%eax)                  # MEM[(struct atomic_t *)D.29327_54].counter    <---
# 0 "" 2
#NO_APP

and this has to be:

        /*
         * Not cached?
         */
        if (unlikely(!class)) {
                class = register_lock_class(lock, subclass, 0);
                if (!class)
                        return 0;
        }
        atomic_inc((atomic_t *)&class->ops);    <---


So looking at the decode above, we have the class pointer in %ebx
(decodecode somehow can't differentiate between 32- and 64-bit code
dump, probably needs a flag or so) and it is 0x3f76. Which doesn't
look like a valid kernel pointer to me.

And 0x3f76 + 0x104 gives exactly 0x407a which is the address at
which we #PF:

[   15.921486] BUG: unable to handle kernel paging request at 407a
[   15.921486] IP: [<41071ab0>] __lock_acquire.isra.19+0x3e0/0xb00
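
The arithmetic can be checked mechanically. Below is a minimal Python sketch that uses the register and fault values exactly as they are printed in the oops quoted in this thread; `effective_address` is an illustrative helper name, and the 32-bit wrap is an assumption based on this being a 32-bit kernel.

```python
def effective_address(base: int, displacement: int, bits: int = 32) -> int:
    """Address touched by `incl disp(%reg)`, wrapped to the register width."""
    return (base + displacement) & ((1 << bits) - 1)

EBX = 0x3F76            # class pointer, as printed in the register dump
OPS_OFFSET = 0x104      # displacement in `incl 0x104(%ebx)`
FAULT_ADDRESS = 0x407A  # address from "unable to handle kernel paging request"

# The compiler output writes the same displacement in decimal: incl 260(%eax).
assert OPS_OFFSET == 260

fault = effective_address(EBX, OPS_OFFSET)
print(hex(fault), fault == FAULT_ADDRESS)
```

The sum matches the reported faulting address, which supports the reading that the increment went through a bogus class pointer rather than through a valid lock class.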

More hmmm...

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


Re: [PATCH] x86, FPU: Fix FPU initialization

2013-04-12 Thread Borislav Petkov
On Thu, Apr 11, 2013 at 10:34:48PM -0700, H. Peter Anvin wrote:
> >The lockup went away after excluding x86/cpu. I'll try more testing
> >as time permits.

Right,

so tip:x86/cpu has all in all 11 patches. Maybe a quick bisect?
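
For a sense of why a bisect over 11 patches is "quick": a binary search needs at most four build-and-boot rounds. The Python sketch below only simulates the search that `git bisect` performs on real history. It assumes a linear series and that every patch after the first bad one also fails; the patch names and the `is_bad` predicate are placeholders, since the real test here is "does the box boot".

```python
def first_bad(commits, is_bad):
    """Binary search for the first bad entry in an oldest-first list.

    Assumes the last entry is bad and that badness persists once introduced.
    Returns (first bad commit, number of tests performed).
    """
    lo, hi = 0, len(commits) - 1
    steps = 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1                  # one build + boot test per iteration
        if is_bad(commits[mid]):
            hi = mid                # first bad commit is at mid or earlier
        else:
            lo = mid + 1            # first bad commit is after mid
    return commits[lo], steps

patches = [f"patch-{n:02d}" for n in range(1, 12)]   # 11 placeholder names
culprit = "patch-08"                                 # hypothetical, for the demo only
found, steps = first_bad(patches, lambda p: p >= culprit)
print(found, steps)
```

A search like this reports a single first-bad commit. When more than one patch is involved, as with the two commits named in the bisection result in this thread, each suspect still has to be checked separately, for example by reverting it.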

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


Re: [PATCH] x86, FPU: Fix FPU initialization

2013-04-11 Thread H. Peter Anvin
I used to have one of these but gave it away when cleaning out my study... no
space.

Ingo Molnar wrote:

>
>* Borislav Petkov  wrote:
>
>> On Thu, Apr 11, 2013 at 12:26:09PM -0700, H. Peter Anvin wrote:
>> > What host is this?
>> 
>> Judging by the DMI string in the oops:
>> 
>> > [   15.921486] Pid: 73, comm: hwclock Tainted: GW   
>3.9.0-rc6+ #222032 System manufacturer System Product Name/A8N-E
>> 
>> it is an ASUS board with a K8 on it - probably Ingo's old K8 which
>> triggers all kinds of crap off an on.
>> 
>> 8-)
>
>Yep, with Fedora Core 8, and totally unchanged userspace, booting
>randconfigs of 
>the latest -tip:master tree. This box has booted up over a million
>Linux kernels 
>in the past 4+ years, so when it shows new types of sickness then in
>99.9% of the 
>cases it's something about the kernel.
>
>The lockup went away after excluding x86/cpu. I'll try more testing as
>time 
>permits.
>
>Thanks,
>
>   Ingo

-- 
Sent from my mobile phone. Please excuse brevity and lack of formatting.


Re: [PATCH] x86, FPU: Fix FPU initialization

2013-04-11 Thread Ingo Molnar

* Borislav Petkov wrote:

> On Thu, Apr 11, 2013 at 12:26:09PM -0700, H. Peter Anvin wrote:
> > What host is this?
> 
> Judging by the DMI string in the oops:
> 
> > [   15.921486] Pid: 73, comm: hwclock Tainted: GW3.9.0-rc6+ 
> > #222032 System manufacturer System Product Name/A8N-E
> 
> it is an ASUS board with a K8 on it - probably Ingo's old K8 which
> triggers all kinds of crap off an on.
> 
> 8-)

Yep, with Fedora Core 8, and totally unchanged userspace, booting randconfigs
of the latest -tip:master tree. This box has booted up over a million Linux
kernels in the past 4+ years, so when it shows new types of sickness then in
99.9% of the cases it's something about the kernel.

The lockup went away after excluding x86/cpu. I'll try more testing as time
permits.

Thanks,

Ingo


Re: [PATCH] x86, FPU: Fix FPU initialization

2013-04-11 Thread Borislav Petkov
On Thu, Apr 11, 2013 at 12:26:09PM -0700, H. Peter Anvin wrote:
> What host is this?

Judging by the DMI string in the oops:

> [   15.921486] Pid: 73, comm: hwclock Tainted: GW3.9.0-rc6+ 
> #222032 System manufacturer System Product Name/A8N-E

it is an ASUS board with a K8 on it - probably Ingo's old K8 which
triggers all kinds of crap off and on.

8-)

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


Re: [PATCH] x86, FPU: Fix FPU initialization

2013-04-11 Thread H. Peter Anvin
On 04/11/2013 05:09 AM, Ingo Molnar wrote:
> 
> Even with this applied, the attached config is still unhappy and 
> crashes/locks up 
> during user-space init, see the crashlog attached below.
> 
> The config has MATH_EMULATION=y, so I suspect it's the same problem category. 
> 

What host is this?

-hpa




Re: [PATCH] x86, FPU: Fix FPU initialization

2013-04-11 Thread Borislav Petkov
On Thu, Apr 11, 2013 at 02:09:52PM +0200, Ingo Molnar wrote:
> Even with this applied, the attached config is still unhappy and
> crashes/locks up during user-space init, see the crashlog attached
> below.
>
> The config has MATH_EMULATION=y, so I suspect it's the same problem
> category.
>
> (I'll keep tip:x86/cpu excluded from tip:master so that others are not
> affected by this bug.)

Right,

of course, I can't trigger it here :(

Let's see:

> INIT: version 2.86 booting
> [   14.723352] mount (55) used greatest stack depth: 5820 bytes left
> [   14.723352] mount (55) used greatest stack depth: 5820 bytes left

Don't you just hate the repeated lines? :-)

> [   15.187354] awk (64) used greatest stack depth: 5816 bytes left
> [   15.187354] awk (64) used greatest stack depth: 5816 bytes left
>   Welcome to [   15.327059] gzip (70) used greatest stack depth: 
> 5576 bytes left
> [   15.327059] gzip (70) used greatest stack depth: 5576 bytes left
> Fedora Core
>   Press 'I' to enter interactive startup.
> modprobe: FATAL: Could not load /lib/modules/3.9.0-rc6+/modules.dep: No such 
> file or directory
> 
> [   15.921486] BUG: unable to handle kernel [   15.921486] BUG: unable to 
> handle kernel paging requestpaging request at 407a
>  at 407a
> [   15.921486] IP:[   15.921486] IP: [<41071ab0>] 
> __lock_acquire.isra.19+0x3e0/0xb00
>  [<41071ab0>] __lock_acquire.isra.19+0x3e0/0xb00
> [   15.921486] *pde =  [   15.921486] *pde =  
> 
> [   15.921486] Oops: 0002 [#1] [   15.921486] Oops: 0002 [#1] SMP SMP 
> 
> [   15.921486] Modules linked in:[   15.921486] Modules linked in:
> 
> [   15.921486] Pid: 73, comm: hwclock Tainted: GW3.9.0-rc6+ 
> #222032 System manufacturer System Product Name/A8N-E
> [   15.921486] Pid: 73, comm: hwclock Tainted: GW3.9.0-rc6+ 
> #222032 System manufacturer System Product Name/A8N-E

Ok, so you're running a M686 32-bit kernel on an Athlon 64?

Also, what exactly is that kernel: 3.9.0-rc6+? tip:x86/cpu is
v3.9-rc5-11-g3019653a5758

> [   15.921486] EIP: 0060:[<41071ab0>] EFLAGS: 00013002 CPU: 0
> [   15.921486] EIP: 0060:[<41071ab0>] EFLAGS: 00013002 CPU: 0
> [   15.921486] EIP is at __lock_acquire.isra.19+0x3e0/0xb00
> [   15.921486] EIP is at __lock_acquire.isra.19+0x3e0/0xb00
> [   15.921486] EAX: 7e917f94 EBX: 3f76 ECX:  EDX: 
> [   15.921486] EAX: 7e917f94 EBX: 3f76 ECX:  EDX: 
> [   15.921486] ESI:  EDI: 7e9469c0 EBP: 7e9cfed8 ESP: 7e9cfe88
> [   15.921486] ESI:  EDI: 7e9469c0 EBP: 7e9cfed8 ESP: 7e9cfe88
> [   15.921486]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> [   15.921486]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> [   15.921486] CR0: 8005003b CR2: 407a CR3: 01768000 CR4: 0690
> [   15.921486] CR0: 8005003b CR2: 407a CR3: 01768000 CR4: 0690
> [   15.921486] DR0:  DR1:  DR2:  DR3: 
> [   15.921486] DR0:  DR1:  DR2:  DR3: 
> [   15.921486] DR6: 0ff0 DR7: 0400
> [   15.921486] DR6: 0ff0 DR7: 0400
> [   15.921486] Process hwclock (pid: 73, ti=7e9ce000 task=7e9469c0 
> task.ti=7e9ce000)
> [   15.921486] Process hwclock (pid: 73, ti=7e9ce000 task=7e9469c0 
> task.ti=7e9ce000)
> [   15.921486] Stack:
> [   15.921486] Stack:
> [   15.921486]  0003[   15.921486]  0003 b4fe9c00 b4fe9c00 0003 
> 0003 0001 0001 7e999500 7e999500   7e999d00 
> 7e999d00 7e995340 7e995340
> 
> [   15.921486]  3002[   15.921486]  3002 7e8e8920 7e8e8920 7e9c0207 
> 7e9c0207 8018 8018 7e999500 7e999500 7e9c0207 7e9c0207 7e946d24 
> 7e946d24 7e946d20 7e946d20
> 
> [   15.921486]  7e917f94[   15.921486]  7e917f94   7e9469c0 
> 7e9469c0 3246 3246 7e9cff00 7e9cff00 4107264d 4107264d  
>   
> 
> [   15.921486] Call Trace:
> [   15.921486] Call Trace:
> [   15.921486]  [<4107264d>] lock_acquire+0x5d/0x80
> [   15.921486]  [<4107264d>] lock_acquire+0x5d/0x80
> [   15.921486]  [<41109905>] ? exit_fs+0x35/0x70
> [   15.921486]  [<41109905>] ? exit_fs+0x35/0x70

Right, so I can't see how exit_fs grabbing a bunch of locks could be
related to MATH_EMULATION. I'm not saying it can't - I just don't see it
from the trace.

> [   15.921486]  [<413deba1>] _raw_spin_lock+0x41/0x70
> [   15.921486]  [<413deba1>] _raw_spin_lock+0x41/0x70
> [   15.921486]  [<41109905>] ? exit_fs+0x35/0x70
> [   15.921486]  [<41109905>] ? exit_fs+0x35/0x70
> [   15.921486]  [<41109905>] exit_fs+0x35/0x70
> [   15.921486]  [<41109905>] exit_fs+0x35/0x70
> [   15.921486]  [<4102ddab>] do_exit+0x2fb/0x850
> [   15.921486]  [<4102ddab>] do_exit+0x2fb/0x850
> [   15.921486]  [<4102e48c>] do_group_exit+0x6c/0xb0
> [   15.921486]  [<4102e48c>] do_group_exit+0x6c/0xb0
> [   15.921486]  [<4102e4e3>] sys_exit_group+0x13/0x20
> [   15.921486]  [<4102e4e3>] sys_exit_group+0x13/0x20
> [   15.921486]  [<413e4f05>] sysenter_do_call+0x12/0x31




Re: [PATCH] x86, FPU: Fix FPU initialization

2013-04-11 Thread Borislav Petkov
On Thu, Apr 11, 2013 at 12:26:09PM -0700, H. Peter Anvin wrote:
> What host is this?

Judging by the DMI string in the oops:

> [   15.921486] Pid: 73, comm: hwclock Tainted: GW3.9.0-rc6+
> #222032 System manufacturer System Product Name/A8N-E

it is an ASUS board with a K8 on it - probably Ingo's old K8 which
triggers all kinds of crap off and on.

8-)

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.


Re: [PATCH] x86, FPU: Fix FPU initialization

2013-04-11 Thread Ingo Molnar

* Borislav Petkov <b...@alien8.de> wrote:

> On Thu, Apr 11, 2013 at 12:26:09PM -0700, H. Peter Anvin wrote:
> > What host is this?
>
> Judging by the DMI string in the oops:
>
> > [   15.921486] Pid: 73, comm: hwclock Tainted: GW3.9.0-rc6+
> > #222032 System manufacturer System Product Name/A8N-E
>
> it is an ASUS board with a K8 on it - probably Ingo's old K8 which
> triggers all kinds of crap off and on.
>
> 8-)

Yep, with Fedora Core 8, and totally unchanged userspace, booting randconfigs
of the latest -tip:master tree. This box has booted up over a million Linux
kernels in the past 4+ years, so when it shows new types of sickness then in
99.9% of the cases it's something about the kernel.

The lockup went away after excluding x86/cpu. I'll try more testing as time
permits.

Thanks,

Ingo


Re: [PATCH] x86, FPU: Fix FPU initialization

2013-04-11 Thread H. Peter Anvin
I used to have one of these but gave it away when cleaning out my study... no
space.

Ingo Molnar <mi...@kernel.org> wrote:

> * Borislav Petkov <b...@alien8.de> wrote:
>
> > On Thu, Apr 11, 2013 at 12:26:09PM -0700, H. Peter Anvin wrote:
> > > What host is this?
> >
> > Judging by the DMI string in the oops:
> >
> > > [   15.921486] Pid: 73, comm: hwclock Tainted: GW
> > > 3.9.0-rc6+ #222032 System manufacturer System Product Name/A8N-E
> >
> > it is an ASUS board with a K8 on it - probably Ingo's old K8 which
> > triggers all kinds of crap off and on.
> >
> > 8-)
>
> Yep, with Fedora Core 8, and totally unchanged userspace, booting
> randconfigs of the latest -tip:master tree. This box has booted up over
> a million Linux kernels in the past 4+ years, so when it shows new types
> of sickness then in 99.9% of the cases it's something about the kernel.
>
> The lockup went away after excluding x86/cpu. I'll try more testing as
> time permits.
>
> Thanks,
>
>	Ingo

-- 
Sent from my mobile phone. Please excuse brevity and lack of formatting.


[PATCH] x86, FPU: Fix FPU initialization

2013-04-10 Thread Borislav Petkov
On Wed, Apr 10, 2013 at 06:11:22PM +0200, Borislav Petkov wrote:
> On Wed, Apr 10, 2013 at 08:35:43AM -0700, H. Peter Anvin wrote:
> > OK, this thread took off in another direction but you're still looking
> > at this, right?
> 
> Yep, and I think I have the rootcause, let's start (oops below for
> info).

Ok, here's a fix which boots fine here in qemu. Ingo, it would be cool
if you gave it a run to verify.

Thanks.

--
From 2263430417dd8de1a5fef4b2c40127e681fdc1ab Mon Sep 17 00:00:00 2001
From: Borislav Petkov 
Date: Wed, 10 Apr 2013 21:37:03 +0200
Subject: [PATCH] x86, FPU: Fix FPU initialization

c70293d0e3fe ("x86: Get rid of ->hard_math and all the FPU asm
fu") converted the FPU detection code to C. Yours truly, in his
overzealousness, used static_cpu_has() too early, before alternatives
have run, causing the checks in fpu_init() to fail and fpu_init() to
set CR0.EM.

This, in turn, led to an early NULL ptr due to
a chicken-and-egg issue (full details here:
http://lkml.kernel.org/r/20130410161122.gi6...@pd.tnic).

Fix it back to the normal CPU feature checks.

Signed-off-by: Borislav Petkov 
---
 arch/x86/kernel/i387.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index 3a6455304c8d..b0928898bf54 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -163,7 +163,7 @@ void __cpuinit fpu_init(void)
 	unsigned long cr4_mask = 0;
 
 #ifndef CONFIG_MATH_EMULATION
-	if (!static_cpu_has(X86_FEATURE_FPU)) {
+	if (!cpu_has_fpu) {
 		pr_emerg("No FPU found and no math emulation present\n");
 		pr_emerg("Giving up\n");
 		for (;;)
@@ -179,7 +179,7 @@ void __cpuinit fpu_init(void)
 
 	cr0 = read_cr0();
 	cr0 &= ~(X86_CR0_TS|X86_CR0_EM); /* clear TS and EM */
-	if (!static_cpu_has(X86_FEATURE_FPU))
+	if (!cpu_has_fpu)
 		cr0 |= X86_CR0_EM;
 	write_cr0(cr0);
 
-- 
1.8.2.135.g7b592fa


-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.

