Re: [BISECTED] 3.19-rc1 regression - kernel does not load in GRUB 0.97 (GRUB Legacy)

2015-01-05 Thread yzhu1

On 12/28/2014 07:12 PM, Juergen Gross wrote:

On 12/28/2014 08:20 AM, Пламен Петров wrote:

-----Original Message-----
From: Juergen Gross [mailto:jgr...@suse.com]
Sent: Saturday, December 27, 2014 3:48 PM
To: Пламен Петров; linux-kernel@vger.kernel.org
Cc: 'Thomas Gleixner'
Subject: Re: [BISECTED] 3.19-rc1 regression - kernel does not load in GRUB
0.97 (GRUB Legacy)

On 12/24/2014 01:28 AM, Пламен Петров wrote:

Hello!

I use GRUB Legacy bootloader (version 0.97) on a couple machines, and
where 3.18.x loads fine, 3.19-rc1 does not.

While compiling I used the attached .config file accompanied by "make
olddefconfig"


Can you tell me something about the hardware (processor model)?
You are not booting the system under VMWare by any chance?


As a matter of fact - I am compiling in a VMware Player (6.0.3
build-1895310) virtual machine, boot testing there, and then if everything
is OK, I transfer the monolithic kernel produced on 3 virtual machines that
run on ESXi and 2 actual servers. So along those lines - the failing
3.19-rc1 never saw actual hardware - it was all tested (and bisected) inside
a VM.


Thanks. VMWare having problems with my patch is a known issue. I've
already sent a patch working around that issue (VMWare has a bug
emulating the PAT MSR). You can either use that patch (attached for
your convenience) or use the "nopat" option.


Juergen


I tested this patch and it works for me.

Zhu Yanjun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [BISECTED] 3.19-rc1 regression - kernel does not load in GRUB 0.97 (GRUB Legacy)

2014-12-28 Thread Juergen Gross

On 12/28/2014 08:20 AM, Пламен Петров wrote:

-----Original Message-----
From: Juergen Gross [mailto:jgr...@suse.com]
Sent: Saturday, December 27, 2014 3:48 PM
To: Пламен Петров; linux-kernel@vger.kernel.org
Cc: 'Thomas Gleixner'
Subject: Re: [BISECTED] 3.19-rc1 regression - kernel does not load in GRUB
0.97 (GRUB Legacy)

On 12/24/2014 01:28 AM, Пламен Петров wrote:

Hello!

I use GRUB Legacy bootloader (version 0.97) on a couple machines, and
where 3.18.x loads fine, 3.19-rc1 does not.

While compiling I used the attached .config file accompanied by "make
olddefconfig"


Can you tell me something about the hardware (processor model)?
You are not booting the system under VMWare by any chance?


As a matter of fact - I am compiling in a VMware Player (6.0.3
build-1895310) virtual machine, boot testing there, and then if everything
is OK, I transfer the monolithic kernel produced on 3 virtual machines that
run on ESXi and 2 actual servers. So along those lines - the failing
3.19-rc1 never saw actual hardware - it was all tested (and bisected) inside
a VM.


Thanks. VMWare having problems with my patch is a known issue. I've
already sent a patch working around that issue (VMWare has a bug
emulating the PAT MSR). You can either use that patch (attached for
your convenience) or use the "nopat" option.


Juergen

From 4b65fb80338c71673cabfa9fa9b0f80f5a4bc320 Mon Sep 17 00:00:00 2001
From: Juergen Gross 
Date: Tue, 16 Dec 2014 07:43:51 +0100
Subject: [PATCH] x86: don't rely on VMWare emulating PAT MSR correctly

VMWare seems not to emulate the PAT MSR correctly: reading
MSR_IA32_CR_PAT returns 0 even after writing another value to it.

Detect this bug and don't use the read value if it is 0.

Commit bd809af16e3ab1f8d55b3e2928c47c67e2a865d2 ("x86: Enable PAT to
use cache mode translation tables") triggers this VMWare bug when the
kernel is booted as a VMWare guest.

Reported-by: Jongman Heo 
Signed-off-by: Juergen Gross 
Tested-by: Jongman Heo 
---
 arch/x86/mm/pat.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index edf299c..7ac6869 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -234,8 +234,13 @@ void pat_init(void)
 	  PAT(4, WB) | PAT(5, WC) | PAT(6, UC_MINUS) | PAT(7, UC);
 
 	/* Boot CPU check */
-	if (!boot_pat_state)
+	if (!boot_pat_state) {
 		rdmsrl(MSR_IA32_CR_PAT, boot_pat_state);
+		if (!boot_pat_state) {
+			pat_disable("PAT read returns always zero, disabled.");
+			return;
+		}
+	}
 
 	wrmsrl(MSR_IA32_CR_PAT, pat);
 
-- 
2.1.2
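
A minimal user-space sketch of the check the patch above adds, for readers who want to see the logic in isolation. It is not kernel code: `struct fake_msr`, `fake_rdmsr()`, `fake_wrmsr()` and `pat_init_sketch()` are hypothetical stand-ins for the real MSR accessors and for `pat_init()`, and the `drops_writes` flag only models the behaviour described in the commit message (reads of the PAT MSR returning 0).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for MSR_IA32_CR_PAT; illustration only. */
struct fake_msr {
	uint64_t value;
	bool drops_writes;	/* models the emulation bug: reads always return 0 */
};

static void fake_wrmsr(struct fake_msr *msr, uint64_t v)
{
	if (!msr->drops_writes)
		msr->value = v;
}

static uint64_t fake_rdmsr(const struct fake_msr *msr)
{
	return msr->drops_writes ? 0 : msr->value;
}

/*
 * Mirrors the patched boot-CPU check. Returns true if PAT stays enabled,
 * false where the real patch calls pat_disable() and returns early.
 */
static bool pat_init_sketch(struct fake_msr *msr, uint64_t *boot_pat_state,
			    uint64_t pat)
{
	if (!*boot_pat_state) {
		*boot_pat_state = fake_rdmsr(msr);
		if (!*boot_pat_state)
			return false;	/* a zero read is treated as the bug */
	}

	fake_wrmsr(msr, pat);
	return true;
}
```

With a well-behaved register that starts at the documented power-on default of 0x0007040600070406, `pat_init_sketch()` records that value and writes the new one. With `drops_writes` set, the first read is 0 and the function bails out before touching the register, which is the path a VMware guest would take with the patch applied.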



RE: [BISECTED] 3.19-rc1 regression - kernel does not load in GRUB 0.97 (GRUB Legacy)

2014-12-27 Thread Пламен Петров
> -----Original Message-----
> From: Пламен Петров [mailto:pla...@petrovi.no-ip.info]
> Sent: Sunday, December 28, 2014 9:20 AM
> To: 'Juergen Gross'; 'linux-kernel@vger.kernel.org'
> Cc: 'Thomas Gleixner'
> Subject: RE: [BISECTED] 3.19-rc1 regression - kernel does not load in GRUB
> 0.97 (GRUB Legacy)
> 
> > -----Original Message-----
> > From: Juergen Gross [mailto:jgr...@suse.com]
> > Sent: Saturday, December 27, 2014 3:48 PM
> > To: Пламен Петров; linux-kernel@vger.kernel.org
> > Cc: 'Thomas Gleixner'
> > Subject: Re: [BISECTED] 3.19-rc1 regression - kernel does not load in GRUB
> > 0.97 (GRUB Legacy)
> >
...
> >
> > Can you boot with kernel option "nopat"?
> 
> Same as above - kernel command line options do not help, as they never get
> into play to even make a difference.
> 

Scratch the above, and sorry for jumping to conclusions instead of trying
it first.

Adding "nopat" to the kernel command line options does make the previously
broken unpatched vanilla 3.19-rc1 kernel boot fine.

Does this mean that I should add "nopat" command line option to all VMs for
3.19?

> If you need the actual details of my setup or want me to try patches - just
> ask.
> 
-
Plamen Petrov



RE: [BISECTED] 3.19-rc1 regression - kernel does not load in GRUB 0.97 (GRUB Legacy)

2014-12-27 Thread Пламен Петров
> -----Original Message-----
> From: Juergen Gross [mailto:jgr...@suse.com]
> Sent: Saturday, December 27, 2014 3:48 PM
> To: Пламен Петров; linux-kernel@vger.kernel.org
> Cc: 'Thomas Gleixner'
> Subject: Re: [BISECTED] 3.19-rc1 regression - kernel does not load in GRUB
> 0.97 (GRUB Legacy)
> 
> On 12/24/2014 01:28 AM, Пламен Петров wrote:
> > Hello!
> >
> > I use GRUB Legacy bootloader (version 0.97) on a couple machines, and
> > where 3.18.x loads fine, 3.19-rc1 does not.
> >
> > While compiling I used the attached .config file accompanied by "make
> > olddefconfig"
> 
> Can you tell me something about the hardware (processor model)?
> You are not booting the system under VMWare by any chance?

As a matter of fact - I am compiling in a VMware Player (6.0.3
build-1895310) virtual machine, boot testing there, and then if everything
is OK, I transfer the monolithic kernel produced on 3 virtual machines that
run on ESXi and 2 actual servers. So along those lines - the failing
3.19-rc1 never saw actual hardware - it was all tested (and bisected) inside
a VM.

> 
> Could you try the earlyprintk kernel option (serial or vga)?

Does not help. You see - the kernel never makes it there to print anything. 
GRUB just says: 

kernel /vmlinuz.test rw root=/dev/sda2 console=ttyS0,9600n8 console=tty0
[Linux-bzImage, setup=0x3c00, size=0x430ec0]

early console in decompress_kernel

Decompressing Linux... Parsing ELF... done.
Booting the kernel.

and it just stays there.

Retyping the above here got me thinking - is early console part of the
kernel?

> 
> Can you boot with kernel option "nopat"?

Same as above - kernel command line options do not help, as they never get
into play to even make a difference.

If you need the actual details of my setup or want me to try patches - just
ask.

> 
> Juergen
> 
> > The bisection I ran points to:
> > bd809af16e3ab1f8d55b3e2928c47c67e2a865d2 is the first bad commit
> > commit bd809af16e3ab1f8d55b3e2928c47c67e2a865d2
> > Author: Juergen Gross 
> > Date:   Mon Nov 3 14:02:03 2014 +0100
> >
> >  x86: Enable PAT to use cache mode translation tables
> >

Anyway, thanks to all the devs for their changes in 3.19-rc1 - there is a
huge difference in loading speed in my compile VM: with 3.18.x there is
something like 60 seconds until the VM is up and running; and with 3.19-rc1
(with commit bd809af16e3ab1f8d55b3e2928c4 reverted) - that same machine is
up and running in less than 10 seconds.
So that makes it a 6-fold loading time speedup from just going to 3.19-rc1.
Thanks!
-
Plamen Petrov



Re: [BISECTED] 3.19-rc1 regression - kernel does not load in GRUB 0.97 (GRUB Legacy)

2014-12-27 Thread Juergen Gross

On 12/24/2014 01:28 AM, Пламен Петров wrote:

Hello!

I use GRUB Legacy bootloader (version 0.97) on a couple machines, and where
3.18.x loads fine, 3.19-rc1 does not.

While compiling I used the attached .config file accompanied by "make
olddefconfig"


Can you tell me something about the hardware (processor model)?
You are not booting the system under VMWare by any chance?

Could you try the earlyprintk kernel option (serial or vga)?

Can you boot with kernel option "nopat"?


Juergen



The bisection I ran points to:
bd809af16e3ab1f8d55b3e2928c47c67e2a865d2 is the first bad commit
commit bd809af16e3ab1f8d55b3e2928c47c67e2a865d2
Author: Juergen Gross 
Date:   Mon Nov 3 14:02:03 2014 +0100

 x86: Enable PAT to use cache mode translation tables

 Update the translation tables from cache mode to pgprot values
 according to the PAT settings. This enables changing the cache
 attributes of a PAT index in just one place without having to change
 at the users side.

 With this change it is possible to use the same kernel with different
 PAT configurations, e.g. supporting Xen.

Here is the output of git bisect log

git bisect start
# bad: [97bf6af1f928216fd6c5a66e8a57bfa95a659672] Linux 3.19-rc1
git bisect bad 97bf6af1f928216fd6c5a66e8a57bfa95a659672
# good: [b2776bf7149bddd1f4161f14f79520f17fc1d71d] Linux 3.18
git bisect good b2776bf7149bddd1f4161f14f79520f17fc1d71d
# bad: [70e71ca0af244f48a5dcf56dc435243792e3a495] Merge
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
git bisect bad 70e71ca0af244f48a5dcf56dc435243792e3a495
# bad: [e28870f9b3e92cd3570925089c6bb789c2603bc4] Merge tag
'backlight-for-linus-3.19' of
git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight
git bisect bad e28870f9b3e92cd3570925089c6bb789c2603bc4
# good: [6da314122ddc11936c6f054753bbb956a499d020] Merge tag 'dt-for-linus'
of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
git bisect good 6da314122ddc11936c6f054753bbb956a499d020
# good: [a53b831549141aa060a8b54b76e3a42870d74cc0] exit: pidns: fix/update
the comments in zap_pid_ns_processes()
git bisect good a53b831549141aa060a8b54b76e3a42870d74cc0
# bad: [cbfe0de303a55ed96d8831c2d5f56f8131cd6612] Merge branch 'for-linus'
of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
git bisect bad cbfe0de303a55ed96d8831c2d5f56f8131cd6612
# bad: [a6b849578ef3e0b131b1ea4063473a4f935a65e9] Merge branch 'for-linus'
of git://git.samba.org/sfrench/cifs-2.6
git bisect bad a6b849578ef3e0b131b1ea4063473a4f935a65e9
# bad: [c9f861c77269bc9950c16c6404a9476062241671] Merge branch
'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad c9f861c77269bc9950c16c6404a9476062241671
# good: [773fed910d41e443e495a6bfa9ab1c2b7b13e012] Merge branches
'x86-platform-for-linus' and 'x86-uv-for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 773fed910d41e443e495a6bfa9ab1c2b7b13e012
# good: [f439c429c320981943f8b64b2a4049d946cb492b] x86: Support PAT bit in
pagetable dump for lower levels
git bisect good f439c429c320981943f8b64b2a4049d946cb492b
# good: [e3480271f59253cb60d030aa5e615bf00b731fea] x86, mce, severity:
Extend the the mce_severity mechanism to handle UCNA/DEFERRED error
git bisect good e3480271f59253cb60d030aa5e615bf00b731fea
# bad: [0dbcae884779fdf7e2239a97ac7488877f0693d9] x86: mm: Move PAT only
functions to mm/pat.c
git bisect bad 0dbcae884779fdf7e2239a97ac7488877f0693d9
# bad: [bd809af16e3ab1f8d55b3e2928c47c67e2a865d2] x86: Enable PAT to use
cache mode translation tables
git bisect bad bd809af16e3ab1f8d55b3e2928c47c67e2a865d2
# good: [f5b2831d654167d77da8afbef4d2584897b12d0c] x86: Respect PAT bit when
copying pte values between large and normal pages
git bisect good f5b2831d654167d77da8afbef4d2584897b12d0c
# first bad commit: [bd809af16e3ab1f8d55b3e2928c47c67e2a865d2] x86: Enable
PAT to use cache mode translation tables

Reverting the above commit fixes the problem for me, and 3.19-rc1 loads
fine.

Any additional info available on request.

Please, CC me - I am not subscribed to the list.
-
Plamen Petrov





[BISECTED] 3.19-rc1 regression - kernel does not load in GRUB 0.97 (GRUB Legacy)

2014-12-23 Thread Пламен Петров
Hello!

I use GRUB Legacy bootloader (version 0.97) on a couple machines, and where
3.18.x loads fine, 3.19-rc1 does not.

While compiling I used the attached .config file accompanied by "make
olddefconfig"

The bisection I ran points to:
bd809af16e3ab1f8d55b3e2928c47c67e2a865d2 is the first bad commit
commit bd809af16e3ab1f8d55b3e2928c47c67e2a865d2
Author: Juergen Gross 
Date:   Mon Nov 3 14:02:03 2014 +0100

x86: Enable PAT to use cache mode translation tables

Update the translation tables from cache mode to pgprot values
according to the PAT settings. This enables changing the cache
attributes of a PAT index in just one place without having to change
at the users side.

With this change it is possible to use the same kernel with different
PAT configurations, e.g. supporting Xen.

Here is the output of git bisect log

git bisect start
# bad: [97bf6af1f928216fd6c5a66e8a57bfa95a659672] Linux 3.19-rc1
git bisect bad 97bf6af1f928216fd6c5a66e8a57bfa95a659672
# good: [b2776bf7149bddd1f4161f14f79520f17fc1d71d] Linux 3.18
git bisect good b2776bf7149bddd1f4161f14f79520f17fc1d71d
# bad: [70e71ca0af244f48a5dcf56dc435243792e3a495] Merge
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
git bisect bad 70e71ca0af244f48a5dcf56dc435243792e3a495
# bad: [e28870f9b3e92cd3570925089c6bb789c2603bc4] Merge tag
'backlight-for-linus-3.19' of
git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight
git bisect bad e28870f9b3e92cd3570925089c6bb789c2603bc4
# good: [6da314122ddc11936c6f054753bbb956a499d020] Merge tag 'dt-for-linus'
of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
git bisect good 6da314122ddc11936c6f054753bbb956a499d020
# good: [a53b831549141aa060a8b54b76e3a42870d74cc0] exit: pidns: fix/update
the comments in zap_pid_ns_processes()
git bisect good a53b831549141aa060a8b54b76e3a42870d74cc0
# bad: [cbfe0de303a55ed96d8831c2d5f56f8131cd6612] Merge branch 'for-linus'
of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
git bisect bad cbfe0de303a55ed96d8831c2d5f56f8131cd6612
# bad: [a6b849578ef3e0b131b1ea4063473a4f935a65e9] Merge branch 'for-linus'
of git://git.samba.org/sfrench/cifs-2.6
git bisect bad a6b849578ef3e0b131b1ea4063473a4f935a65e9
# bad: [c9f861c77269bc9950c16c6404a9476062241671] Merge branch
'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad c9f861c77269bc9950c16c6404a9476062241671
# good: [773fed910d41e443e495a6bfa9ab1c2b7b13e012] Merge branches
'x86-platform-for-linus' and 'x86-uv-for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 773fed910d41e443e495a6bfa9ab1c2b7b13e012
# good: [f439c429c320981943f8b64b2a4049d946cb492b] x86: Support PAT bit in
pagetable dump for lower levels
git bisect good f439c429c320981943f8b64b2a4049d946cb492b
# good: [e3480271f59253cb60d030aa5e615bf00b731fea] x86, mce, severity:
Extend the the mce_severity mechanism to handle UCNA/DEFERRED error
git bisect good e3480271f59253cb60d030aa5e615bf00b731fea
# bad: [0dbcae884779fdf7e2239a97ac7488877f0693d9] x86: mm: Move PAT only
functions to mm/pat.c
git bisect bad 0dbcae884779fdf7e2239a97ac7488877f0693d9
# bad: [bd809af16e3ab1f8d55b3e2928c47c67e2a865d2] x86: Enable PAT to use
cache mode translation tables
git bisect bad bd809af16e3ab1f8d55b3e2928c47c67e2a865d2
# good: [f5b2831d654167d77da8afbef4d2584897b12d0c] x86: Respect PAT bit when
copying pte values between large and normal pages
git bisect good f5b2831d654167d77da8afbef4d2584897b12d0c
# first bad commit: [bd809af16e3ab1f8d55b3e2928c47c67e2a865d2] x86: Enable
PAT to use cache mode translation tables

Reverting the above commit fixes the problem for me, and 3.19-rc1 loads
fine.

Any additional info available on request.

Please, CC me - I am not subscribed to the list.
-
Plamen Petrov


config-v3.19-rc1-12-g443fb5a
Description: Binary data