Re: sched: system doesn't boot since "sched: Add new migrate_disable() implementation"

2020-10-20 Thread Christian Eggers
On Tuesday, 20 October 2020, 13:30:09 CEST, Peter Zijlstra wrote:
> On Mon, Oct 19, 2020 at 05:09:35PM +0200, Sebastian Andrzej Siewior wrote:
> > On 2020-10-19 12:21:06 [+0200], Christian Eggers wrote:
> > > I have problems with the latest 5.9-rt releases on i.MX6ULL
> > > (!CONFIG_SMP):
> > …
> > 
> > > Any hints?
> > 
> > Thank you for the report. The reason is the migrate_disable()
> > implementation for !SMP.
> 
> This should fix things I suppose. I'll fold it in.
> 
> ---
> --- a/include/linux/preempt.h
> +++ b/include/linux/preempt.h
> @@ -378,7 +378,12 @@ static inline void preempt_notifier_init
>  extern void migrate_disable(void);
>  extern void migrate_enable(void);
> 
> -#else /* !(CONFIG_SMP && CONFIG_PREEMPT_RT) */
> +#elif defined(CONFIG_PREEMPT_RT)
> +
> +static inline void migrate_disable(void) { }
> +static inline void migrate_enable(void { }
The closing parenthesis is missing in the line above.

> +
> +#else /* !CONFIG_PREEMPT_RT */
> 
>  /**
>   * migrate_disable - Prevent migration of the current task

I didn't understand much of your discussion with Sebastian,
but my system is able to boot now.

# uname -r
5.9.0-rt16+

Best regards
Christian
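
For reference, a minimal userspace sketch of the three-way split the quoted
patch introduces, with the missing parenthesis restored. The DEMO_* macros,
the depth counter and demo_balanced_section() are hypothetical stand-ins
invented for this illustration; only the empty UP+RT stubs mirror the patch.
The non-RT branch is modelled as a counter here, whereas the kernel (as far
as I recall) maps it onto preempt_disable()/preempt_enable().

```c
#include <assert.h>

/*
 * Hypothetical stand-ins for CONFIG_SMP and CONFIG_PREEMPT_RT.
 * DEMO_SMP is deliberately left undefined to model a uniprocessor RT build.
 */
#define DEMO_PREEMPT_RT 1

/* Nesting depth, maintained only by the counting variants below. */
static int demo_migrate_depth;

#if defined(DEMO_SMP) && defined(DEMO_PREEMPT_RT)

/* SMP + RT: the real functions live in the scheduler; modelled as a counter. */
static inline void migrate_disable(void) { demo_migrate_depth++; }
static inline void migrate_enable(void)
{
	assert(demo_migrate_depth > 0);
	demo_migrate_depth--;
}

#elif defined(DEMO_PREEMPT_RT)

/* UP + RT: empty stubs as in the quoted patch, with "(void)" closed. */
static inline void migrate_disable(void) { }
static inline void migrate_enable(void) { }

#else /* !DEMO_PREEMPT_RT */

/* Non-RT: modelled as a counter for the purpose of this sketch. */
static inline void migrate_disable(void) { demo_migrate_depth++; }
static inline void migrate_enable(void)
{
	assert(demo_migrate_depth > 0);
	demo_migrate_depth--;
}

#endif

/* Runs one balanced disable/enable pair and reports the resulting depth. */
static int demo_balanced_section(void)
{
	migrate_disable();
	migrate_enable();
	return demo_migrate_depth;
}
```

Calling demo_balanced_section() from a test main() should leave the depth at
zero in all three configurations; with the stubs selected the calls compile
away entirely.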





Re: sched: system doesn't boot since "sched: Add new migrate_disable() implementation"

2020-10-20 Thread Sebastian Andrzej Siewior
On 2020-10-20 13:41:37 [+0200], Peter Zijlstra wrote:
> Right, but this patch set doesn't include the lazy preemption stuff, and
> given the 'fun' Valentin and me are still having with it, I'd like to
> keep it like that.
> 
> But yes, that might warrant a slightly less NOP implementation.

Uh. Looking at the actual implementation we don't look at the mg-counter
but have preempt_lazy_disable() for that.
Let me sync your bits then.
Thanks.

Sebastian


Re: sched: system doesn't boot since "sched: Add new migrate_disable() implementation"

2020-10-20 Thread Peter Zijlstra
On Tue, Oct 20, 2020 at 01:38:28PM +0200, Sebastian Andrzej Siewior wrote:
> On 2020-10-20 13:30:09 [+0200], Peter Zijlstra wrote:
> > On Mon, Oct 19, 2020 at 05:09:35PM +0200, Sebastian Andrzej Siewior wrote:
> > > On 2020-10-19 12:21:06 [+0200], Christian Eggers wrote:
> > > > I have problems with the latest 5.9-rt releases on i.MX6ULL 
> > > > (!CONFIG_SMP):
> > > > 
> > > …
> > > > Any hints?
> > > 
> > > Thank you for the report. The reason is the migrate_disable()
> > > implementation for !SMP.
> > 
> > This should fix things I suppose. I'll fold it in.
> 
> It will. It will also break lazy-preemption. Each time a sleeping lock
> is acquired there is also migrate_disable() and the migrate-disable
> counter is != 0 (even for UP). The result is that a wake up for a
> SCHED_OTHER task with mg counter != 0 will not lead to a context switch
> (the same as with preemption counter != 0). The difference is that a wake
> up for an RT task ignores this counter and performs a context switch anyway.

Right, but this patch set doesn't include the lazy preemption stuff, and
given the 'fun' Valentin and me are still having with it, I'd like to
keep it like that.

But yes, that might warrant a slightly less NOP implementation.


Re: sched: system doesn't boot since "sched: Add new migrate_disable() implementation"

2020-10-20 Thread Sebastian Andrzej Siewior
On 2020-10-20 13:30:09 [+0200], Peter Zijlstra wrote:
> On Mon, Oct 19, 2020 at 05:09:35PM +0200, Sebastian Andrzej Siewior wrote:
> > On 2020-10-19 12:21:06 [+0200], Christian Eggers wrote:
> > > I have problems with the latest 5.9-rt releases on i.MX6ULL (!CONFIG_SMP):
> > > 
> > …
> > > Any hints?
> > 
> > Thank you for the report. The reason is the migrate_disable()
> > implementation for !SMP.
> 
> This should fix things I suppose. I'll fold it in.

It will. It will also break lazy-preemption. Each time a sleeping lock
is acquired there is also migrate_disable() and the migrate-disable
counter is != 0 (even for UP). The result is that a wake up for a
SCHED_OTHER task with mg counter != 0 will not lead to a context switch
(the same as with preemption counter != 0). The difference is that a wake
up for an RT task ignores this counter and performs a context switch anyway.

That way we have RT wake ups on time but avoid stumbling from one lock
to another.

> ---
> --- a/include/linux/preempt.h
> +++ b/include/linux/preempt.h
> @@ -378,7 +378,12 @@ static inline void preempt_notifier_init
>  extern void migrate_disable(void);
>  extern void migrate_enable(void);
>  
> -#else /* !(CONFIG_SMP && CONFIG_PREEMPT_RT) */
> +#elif defined(CONFIG_PREEMPT_RT)
> +
> +static inline void migrate_disable(void) { }
> +static inline void migrate_enable(void { }
> +
> +#else /* !CONFIG_PREEMPT_RT */
>  
>  /**
>   * migrate_disable - Prevent migration of the current task

Sebastian
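
The lazy-preemption rule described in the message above can be sketched as a
small decision function. This is a simplified model and not the RT patch's
code: struct demo_task, enum demo_policy and demo_wakeup_preempts() are
hypothetical names, and the sketch assumes the counters in question belong to
the task that is currently running when the wake-up happens.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Scheduling class of the task being woken (hypothetical names). */
enum demo_policy { DEMO_SCHED_OTHER, DEMO_SCHED_RT };

/* State of the currently running task, reduced to the two counters. */
struct demo_task {
	int preempt_count;          /* != 0: preemption is disabled */
	int migrate_disable_count;  /* != 0: e.g. a sleeping lock is held */
};

/*
 * Returns true if waking a task of class 'woken' should lead to an
 * immediate context switch away from 'curr'.
 */
static bool demo_wakeup_preempts(const struct demo_task *curr,
				 enum demo_policy woken)
{
	assert(curr != NULL);

	/* With preemption disabled nobody preempts right now. */
	if (curr->preempt_count != 0)
		return false;

	/* RT wake-ups ignore the migrate-disable counter. */
	if (woken == DEMO_SCHED_RT)
		return true;

	/* SCHED_OTHER wake-ups are deferred while the counter is non-zero. */
	return curr->migrate_disable_count == 0;
}
```

With empty migrate_disable() stubs the counter never leaves zero, so in this
model every SCHED_OTHER wake-up would preempt a lock holder, which is the
"stumbling from one lock to another" the message warns about.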


Re: sched: system doesn't boot since "sched: Add new migrate_disable() implementation"

2020-10-20 Thread Peter Zijlstra
On Mon, Oct 19, 2020 at 05:09:35PM +0200, Sebastian Andrzej Siewior wrote:
> On 2020-10-19 12:21:06 [+0200], Christian Eggers wrote:
> > I have problems with the latest 5.9-rt releases on i.MX6ULL (!CONFIG_SMP):
> > 
> …
> > Any hints?
> 
> Thank you for the report. The reason is the migrate_disable()
> implementation for !SMP.

This should fix things I suppose. I'll fold it in.

---
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -378,7 +378,12 @@ static inline void preempt_notifier_init
 extern void migrate_disable(void);
 extern void migrate_enable(void);
 
-#else /* !(CONFIG_SMP && CONFIG_PREEMPT_RT) */
+#elif defined(CONFIG_PREEMPT_RT)
+
+static inline void migrate_disable(void) { }
+static inline void migrate_enable(void { }
+
+#else /* !CONFIG_PREEMPT_RT */
 
 /**
  * migrate_disable - Prevent migration of the current task


Re: sched: system doesn't boot since "sched: Add new migrate_disable() implementation"

2020-10-19 Thread Sebastian Andrzej Siewior
On 2020-10-19 12:21:06 [+0200], Christian Eggers wrote:
> I have problems with the latest 5.9-rt releases on i.MX6ULL (!CONFIG_SMP):
> 
…
> Any hints?

Thank you for the report. The reason is the migrate_disable()
implementation for !SMP.

> Best regards
> Christian

Sebastian


sched: system doesn't boot since "sched: Add new migrate_disable() implementation"

2020-10-19 Thread Christian Eggers
I have problems with the latest 5.9-rt releases on i.MX6ULL (!CONFIG_SMP):

-rc8-rt13 works fine
-rc8-rt14 doesn't compile (due to CONFIG_FTRACE, already fixed in -rt16)
-rt15 ditto.
-rt16 compiles, but doesn't boot (no console output at all)

After reverting (on -rt16)

de1c0755e6f9 ("tracing: fix compile failure on RT with PREEMPT_RT off")
30763ce6c15d ("sched: Add new migrate_disable() implementation")

the system boots fine again.

Tracking the problem down showed that calls to wait_for_completion_timeout() 
(e.g. during imx_rngc_probe) will never return. The IRQ routine which should 
fire the completion is not executed, and the call doesn't return after the 
timeout. The IRQ flag on the ARM is not set before entering 
wait_for_completion_timeout(), so CPU interrupts seem to be on.

When building with CONFIG_SMP, the system boots fine.

Any hints?

Best regards
Christian





Re: 4.17.0-rc1 doesn't boot.

2018-04-17 Thread Dave Hansen
Heh, your .config is insidious:

9ffe3000 B __brk_base
9ffe3000 B __bss_stop
9fff3000 b .brk.dmi_alloc
a0003000 b .brk.early_pgt_alloc
a000f000 B _end
a000f000 B __brk_limit

dmi_alloc is __init, so it gets freed at some point and the PTEs zeroed
out.  That causes the warning when change_page_attr() sees the zero'd
PTE.  We just need to special-case the __init section along with the
linear map in pageattr.c.

I'll have some patches to do this shortly.
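
To make the layout in the listing easier to follow, here is a small sketch
that derives the size of each brk reservation from the symbol addresses shown
above. It assumes 4 KiB pages and that each reservation extends up to the next
symbol in the listing; the DEMO_* names and both helpers are hypothetical and
exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 0x1000u /* assumption: 4 KiB pages */

/* Symbol values copied from the listing above. */
#define DEMO_BRK_BASE      0x9ffe3000u /* __brk_base, same as __bss_stop */
#define DEMO_BRK_DMI_ALLOC 0x9fff3000u /* .brk.dmi_alloc */
#define DEMO_BRK_EARLY_PGT 0xa0003000u /* .brk.early_pgt_alloc */
#define DEMO_BRK_LIMIT     0xa000f000u /* __brk_limit, same as _end */

/* Number of whole pages between two addresses. */
static uint32_t demo_span_pages(uint32_t start, uint32_t end)
{
	assert(end >= start);
	return (end - start) / DEMO_PAGE_SIZE;
}

/* True if addr lies inside the brk area [__brk_base, __brk_limit). */
static int demo_in_brk(uint32_t addr)
{
	return addr >= DEMO_BRK_BASE && addr < DEMO_BRK_LIMIT;
}
```

Under these assumptions .brk.dmi_alloc covers 16 pages (64 KiB) and
.brk.early_pgt_alloc covers 12 pages, both after __bss_stop and before _end.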


Re: 4.17.0-rc1 doesn't boot.

2018-04-17 Thread Mariusz Ceier
On 17 April 2018 at 18:48, Dave Hansen wrote:
> On 04/17/2018 09:00 AM, Mike Galbraith wrote:
>> On Tue, 2018-04-17 at 17:31 +0200, Borislav Petkov wrote:
>>> On Tue, Apr 17, 2018 at 05:21:30PM +0200, Jörg Otte wrote:
>>>> finished bisection.
>>>> 39114b7a743e6759bab4d96b7d9651d44d17e3f9 is the first bad commit
>>>> (x86/pti: Never implicitly clear _PAGE_GLOBAL for kernel image).
>>>
>>> Looks like you're not the only one:
>>>
>>> http://marc.info/?i=20180417150130.ga11...@ak-laptop.emea.nsn-net.net
>>
>> I'm hitting this too, but only with PREEMPT_RT.  I put a bandaid on it
>> (tell pti_kernel_image_global_ok() to return true for PREEMPT_RT) while
>> waiting to see if it was really really as non-rt as it appeared to be.
>
> It looks like pti_init() is too early for
> change_page_attr()/set_memory_nonglobal() because they look for
> irqs_off().  This *should* be OK in practice because we only need to
> flush the boot CPU, not the others.  That's what ends up causing the
> BUG_ON().
>
> But, there's apparently something else going on too because things don't
> boot even with that BUG_ON() backed out.
>
> The good news is that it's easy to reproduce.

I'm hitting the same bug on my PC. Git bisect points to the same
commit (39114b7a743e6759bab4d96b7d9651d44d17e3f9).
Afaik I don't use PREEMPT_RT, unless CONFIG_PREEMPT=y is the same as PREEMPT_RT.
I'm attaching my .config. My CPU is i5 6600.

Kernel seems to work in qemu when -enable-kvm and -cpu host are
removed from the qemu-system-x86_64 command line from the other
mailthread.


PS. Resending mail as plain-text (hopefully, who knows what insane
gmail does), sorry for the previous mail (please ignore it).


.config.gz
Description: application/gzip


Re: 4.17.0-rc1 doesn't boot.

2018-04-17 Thread Dave Hansen
On 04/17/2018 09:00 AM, Mike Galbraith wrote:
> On Tue, 2018-04-17 at 17:31 +0200, Borislav Petkov wrote:
>> On Tue, Apr 17, 2018 at 05:21:30PM +0200, Jörg Otte wrote:
>>> finished bisection.
>>> 39114b7a743e6759bab4d96b7d9651d44d17e3f9 is the first bad commit
>>> (x86/pti: Never implicitly clear _PAGE_GLOBAL for kernel image).
>>
>> Looks like you're not the only one:
>>
>> http://marc.info/?i=20180417150130.ga11...@ak-laptop.emea.nsn-net.net
> 
> I'm hitting this too, but only with PREEMPT_RT.  I put a bandaid on it
> (tell pti_kernel_image_global_ok() to return true for PREEMPT_RT) while
> waiting to see if it was really really as non-rt as it appeared to be.

It looks like pti_init() is too early for
change_page_attr()/set_memory_nonglobal() because they look for
irqs_off().  This *should* be OK in practice because we only need to
flush the boot CPU, not the others.  That's what ends up causing the
BUG_ON().

But, there's apparently something else going on too because things don't
boot even with that BUG_ON() backed out.

The good news is that it's easy to reproduce.


Re: 4.17.0-rc1 doesn't boot.

2018-04-17 Thread Mike Galbraith
On Tue, 2018-04-17 at 17:31 +0200, Borislav Petkov wrote:
> On Tue, Apr 17, 2018 at 05:21:30PM +0200, Jörg Otte wrote:
> > finished bisection.
> > 39114b7a743e6759bab4d96b7d9651d44d17e3f9 is the first bad commit
> > (x86/pti: Never implicitly clear _PAGE_GLOBAL for kernel image).
> 
> Looks like you're not the only one:
> 
> http://marc.info/?i=20180417150130.ga11...@ak-laptop.emea.nsn-net.net

I'm hitting this too, but only with PREEMPT_RT.  I put a bandaid on it
(tell pti_kernel_image_global_ok() to return true for PREEMPT_RT) while
waiting to see if it was really really as non-rt as it appeared to be.

-Mike


Re: 4.17.0-rc1 doesn't boot.

2018-04-17 Thread Borislav Petkov
On Tue, Apr 17, 2018 at 05:21:30PM +0200, Jörg Otte wrote:
> finished bisection.
> 39114b7a743e6759bab4d96b7d9651d44d17e3f9 is the first bad commit
> (x86/pti: Never implicitly clear _PAGE_GLOBAL for kernel image).

Looks like you're not the only one:

http://marc.info/?i=20180417150130.ga11...@ak-laptop.emea.nsn-net.net

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: 4.17.0-rc1 doesn't boot.

2018-04-17 Thread Jörg Otte
2018-04-17 16:27 GMT+02:00 Borislav Petkov:
> On Tue, Apr 17, 2018 at 04:16:34PM +0200, Jörg Otte wrote:
>> Current Linus master tree (4.17.0-rc1-00021-ga27fc14) doesn't fix it.
>
> Then pls continue bisecting. Unless someone has a better idea...
>

finished bisection.
39114b7a743e6759bab4d96b7d9651d44d17e3f9 is the first bad commit
(x86/pti: Never implicitly clear _PAGE_GLOBAL for kernel image).

Thanks, Jörg


Re: 4.17.0-rc1 doesn't boot.

2018-04-17 Thread Borislav Petkov
On Tue, Apr 17, 2018 at 04:16:34PM +0200, Jörg Otte wrote:
> Current Linus master tree (4.17.0-rc1-00021-ga27fc14) doesn't fix it.

Then pls continue bisecting. Unless someone has a better idea...

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: 4.17.0-rc1 doesn't boot.

2018-04-17 Thread Jörg Otte
2018-04-17 10:14 GMT+02:00 Borislav Petkov:
> On Tue, Apr 17, 2018 at 10:00:25AM +0200, Jörg Otte wrote:
>> Maybe the problem came in with:
>> 6b0a02e:  "Merge branch 'x86-pti-for-linus' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip"
>
> Fetch latest Linus master and try again - there might be a relevant fix
> there.
>

Current Linus master tree (4.17.0-rc1-00021-ga27fc14) doesn't fix it.

Thanks, Jörg


Re: 4.17.0-rc1 doesn't boot.

2018-04-17 Thread Borislav Petkov
On Tue, Apr 17, 2018 at 10:00:25AM +0200, Jörg Otte wrote:
> Maybe the problem came in with:
> 6b0a02e:  "Merge branch 'x86-pti-for-linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip"

Fetch latest Linus master and try again - there might be a relevant fix
there.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


4.17.0-rc1 doesn't boot.

2018-04-17 Thread Jörg Otte
Hi,
my notebook doesn't boot with 4.17.0-rc1. Booting stops right after
displaying "loading initial ramdisk..". No further displays.
Also nothing is written to the logs.

First known bad kernel is: 4.16.0-12564-g6b0a02e
Last known good kernel is: 4.16.0-12548-g71b8ebb

Maybe the problem came in with:
6b0a02e:  "Merge branch 'x86-pti-for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip"


Thanks, Jörg


Re: arm64 kvm built with clang doesn't boot

2018-04-12 Thread Andrey Konovalov
On Fri, Mar 16, 2018 at 3:31 PM, Mark Rutland <mark.rutl...@arm.com> wrote:
> On Fri, Mar 16, 2018 at 02:13:14PM +0000, Mark Rutland wrote:
>> On Fri, Mar 16, 2018 at 02:49:00PM +0100, Andrey Konovalov wrote:
>> > Hi!
>>
>> Hi,
>>
>> > I've recently tried to boot clang built kernel on real hardware
>> > (Odroid C2 board) instead of using a VM. The issue that I stumbled
>> > upon is that arm64 kvm built with clang doesn't boot.
>> >
>> > Adding -fno-jump-tables compiler flag to arch/arm64/kvm/* helps. There
>> > was a patch some time ago that did exactly that
>> > (https://patchwork.kernel.org/patch/10060381/), but it wasn't accepted
>> > AFAICT (see the discussion on that thread).
>> >
>> > What would be the best way to get this fixed?
>>
>> I think that patch is our best bet currently, but to save ourselves pain
>> in future it would be *really* nice if GCC and clang could provide an
>> option like -fno-absolute-addressing that would implicitly disable any
>> feature that would generate an absolute address as jump tables do.
>>
>> > I've also had to disable CONFIG_JUMP_LABEL to get the kernel boot
>> > (even without kvm enabled), but that might be a different (though
>> > related) issue.
>>
>> With v4.15 (and clang 5.0.0), I did not have to disable jump labels to
>> get a kernel booting on a Juno platform, though I did have to pass
>> -fno-jump-tables to the hyp code.
>
> FWIW, with that same compiler and patch applied atop of v4.16-rc4, and
> some bodges around clang not liking the rX register naming in the SMCCC
> code, I get a kernel that boots on my Juno, though I immediately hit a
> KASAN splat:
>
> [8.476766] ==================================================================
> [8.483990] BUG: KASAN: slab-out-of-bounds in __d_lookup_rcu+0x350/0x400
> [8.490664] Read of size 8 at addr 8009336e2a30 by task init/1

Hi Mark!

Just FYI, this should be fixed with https://reviews.llvm.org/D44981 +
https://patchwork.kernel.org/patch/10339103/

Thanks!


Re: arm64 kvm built with clang doesn't boot

2018-03-19 Thread Nick Desaulniers
The thread I was thinking of is:
http://lists.infradead.org/pipermail/linux-arm-kernel/2018-March/562953.html
which is about b092201e0020614127f495c092e0a12d26a2116e `arm64: Add
ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support`.  As you mention, that
commit uses the correct widths.
Andrey had sent me his workaround, which modified some #defines in
include/linux/arm-smccc.h, which were added recently in commit
f2d3b2e8759a5833df6f022e42df2d581e6d843c `arm/arm64: smccc: Implement SMCCC
v1.1 inline primitive`.

>>> We probably want to be very explicit with register widths here.

So f2d3b2e8759 is the patch I'm referring to.

On Fri, Mar 16, 2018 at 10:18 AM Marc Zyngier wrote:

> On 16/03/18 16:52, Nick Desaulniers wrote:

> [dropping kernel-dynamic-to...@google.com which keeps bouncing]

> > Is this in regards to: commit "arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP
> > hardening support"? Has anyone tried to upstream a fix for this?  We
> > probably want to be very explicit with register widths here.
> What do you mean? The current code is as strict as it gets, and
> explicitly tells the compiler to use the right register width, based on
> the SMC call parameter types.

> Thanks,

>  M.
> --
> Jazz is not dead. It just smells funny...



--
Thanks,
~Nick Desaulniers


Re: arm64 kvm built with clang doesn't boot

2018-03-17 Thread Ard Biesheuvel
(+ Thomas)

On 16 March 2018 at 17:13, Mark Rutland wrote:
> On Fri, Mar 16, 2018 at 04:52:08PM +0000, Nick Desaulniers wrote:
>> + Sami (Google), Takahiro (Linaro)
>>
>> Just so I fully understand the problem enough to articulate it, we'd be
>> looking for the compiler to keep the jump tables for speed (I would guess
>> -fno-jump-tables would emit an if-else chain) but only emit relative jumps
>> (not absolute jumps)?
>
> Our main concern is that there is no absolute addressing. If that rules
> out using a relative jump table, that's ok.
>
> We want to avoid the fragility of collecting -fno-* options as future
> compiler transformations end up introducing absolute addressing.
>

This all comes back to the assumptions made by the compiler when
building PIC/PIE code, i.e., that symbols should be preemptible and
thus all references should be indirected via GOT entries, and that
text relocations should be avoided.

If we had a way to tell the compiler that these concerns do not apply
for us, we could use -fpic/-fpie in the kernel and be done with it.
-fvisibility=hidden *almost* gives us what we need, but in practice,
only the #pragma variant (#pragma GCC visibility push (hidden)) makes
-fpic behave in a sensible way for freestanding builds, and gets rid
of absolute references where possible (note that statically
initialized pointer variables always involve absolute relocations)
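
A minimal sketch of the #pragma variant mentioned above, assuming GCC or
clang. The demo_* names are hypothetical. It shows the two cases the paragraph
contrasts: ordinary references to hidden symbols, which need not go through
the GOT when built with -fpic, and a statically initialized pointer, whose
initial value is an address and so still requires an absolute relocation.

```c
#include <assert.h>

/* Everything declared between push and pop gets hidden ELF visibility. */
#pragma GCC visibility push(hidden)

int demo_counter = 41;

/*
 * Statically initialized pointer: the stored value is the address of
 * demo_counter, so it must be fixed up by an absolute relocation.
 */
int *demo_counter_ptr = &demo_counter;

/* Plain reference to a hidden symbol; no symbol preemption is possible. */
int demo_next(void)
{
	assert(demo_counter_ptr == &demo_counter);
	return ++demo_counter;
}

#pragma GCC visibility pop
```

Comparing the relocations of this file built with and without the pragma
(both with -fpic) is one way to observe the difference being described.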


Re: arm64 kvm built with clang doesn't boot

2018-03-16 Thread Marc Zyngier
On 16/03/18 16:52, Nick Desaulniers wrote:

[dropping kernel-dynamic-to...@google.com which keeps bouncing]

> Is this in regards to: commit "arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP
> hardening support"? Has anyone tried to upstream a fix for this?  We
> probably want to be very explicit with register widths here.
What do you mean? The current code is as strict as it gets, and
explicitly tells the compiler to use the right register width, based on
the SMC call parameter types.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...


Re: arm64 kvm built with clang doesn't boot

2018-03-16 Thread Mark Rutland
On Fri, Mar 16, 2018 at 04:52:08PM +0000, Nick Desaulniers wrote:
> + Sami (Google), Takahiro (Linaro)
> 
> Just so I fully understand the problem enough to articulate it, we'd be
> looking for the compiler to keep the jump tables for speed (I would guess
> -fno-jump-tables would emit an if-else chain) but only emit relative jumps
> (not absolute jumps)?

Our main concern is that there is no absolute addressing. If that rules
out using a relative jump table, that's ok.

We want to avoid the fragility of collecting -fno-* options as future
compiler transformations end up introducing absolute addressing.
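
To make "relative jump table" concrete, here is a hand-written sketch
using the GNU C labels-as-values extension (supported by GCC and clang).
The table stores distances from a base label instead of absolute code
addresses. The function name and the operations are invented for the
example, and this is not what either compiler is guaranteed to emit for
a switch statement:

```c
/* Hypothetical dispatcher; relies on the GNU C "labels as values" extension. */
int dispatch_relative(int op, int x)
{
	/* Each entry is an offset from op_add, not an absolute address. */
	static const int offsets[] = {
		&&op_add - &&op_add,
		&&op_sub - &&op_add,
		&&op_neg - &&op_add,
	};

	/* Out-of-range opcodes leave the value untouched. */
	if (op < 0 || op > 2)
		return x;

	/* Rebuild the target from the base label plus the stored offset. */
	goto *(&&op_add + offsets[op]);

op_add:
	return x + 1;
op_sub:
	return x - 1;
op_neg:
	return -x;
}
```

Because only label differences are stored, the table itself needs no
absolute relocation, which is the property the hyp code cares about.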

Thanks,
Mark.


Re: arm64 kvm built with clang doesn't boot

2018-03-16 Thread Nick Desaulniers
+ Sami (Google), Takahiro (Linaro)

Just so I fully understand the problem enough to articulate it, we'd be
looking for the compiler to keep the jump tables for speed (I would guess
-fno-jump-tables would emit an if-else chain) but only emit relative jumps
(not absolute jumps)?
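
As a concrete reference point, this is the kind of code under
discussion: a dense switch that an optimising compiler may lower to a
jump table, and that -fno-jump-tables forces into a chain of compares
and branches. The function and its case bodies are hypothetical; the
result is the same either way, only the generated code differs:

```c
/*
 * Hypothetical handler with six consecutive case values. At -O2 a
 * compiler is free to index a table of code addresses with "nr";
 * with -fno-jump-tables it has to emit compare-and-branch sequences.
 */
int handle_call(unsigned int nr, int x)
{
	switch (nr) {
	case 0:
		return x + 1;
	case 1:
		return x * 2;
	case 2:
		return x - 3;
	case 3:
		return -x;
	case 4:
		return x ^ 0x55;
	case 5:
		return x / 2;
	default:
		return 0;
	}
}
```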

> Perhaps Nick can comment on whether something like
> -fno-absolute-addressing would be feasible in clang.

Checked with some of my LLVM friends.  They mentioned that this is tricky
because you need to move the addresses of the jump table from a data
section back into the text section.

Looks like LLVM has an interesting method
`shouldPutJumpTableInFunctionSection` [0]. Unfortunately, it gets
overridden for ELF to always return false. [1]

It looks like there's also a flag `no-jump-tables` [2].  Looks like Sami
has used this in the past in kvm. [3]

It's still probably possible to add this to LLVM, so I can pursue that with
LLVM devs.

> But just for the reference, I'm using 4.16-rc4 with a patch to fix SMCCC
> issues that you mentioned.

Is this in regards to: commit "arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP
hardening support"? Has anyone tried to upstream a fix for this?  We
probably want to be very explicit with register widths here.

[0]
https://github.com/llvm-mirror/llvm/blob/a5bd54307b1adacb3df297b9b8010979b9afa4d7/lib/Target/TargetLoweringObjectFile.cpp#L280
[1]
https://github.com/llvm-mirror/llvm/blob/e7676fec11b02e4b698b5ffc99e1901246a7bf66/lib/CodeGen/TargetLoweringObjectFileImpl.cpp#L494
[2]
https://github.com/llvm-mirror/llvm/blob/11f5adb29bf90bc1a40b8bb512afcff4b1ac0f56/lib/Transforms/Utils/SimplifyCFG.cpp#L5233
[3] https://patchwork.kernel.org/patch/10060301/

--
Thanks,
~Nick Desaulniers


Re: arm64 kvm built with clang doesn't boot

2018-03-16 Thread Marc Zyngier
On 16/03/18 14:53, Andrey Konovalov wrote:
> On Fri, Mar 16, 2018 at 3:13 PM, Marc Zyngier  wrote:
>> I wasn't aware of that discussion, but this is indeed quite annoying.
>> Note that you should be able to restrict this to arch/arm64/kvm/hyp/*
>> and virt/kvm/arm/hyp/*.
> 
> That works as well (tried it, the kernel boots). I've also tried
> compiling without the flag for virt/kvm/arm/hyp/*, it boots as well.

That's just luck. Anything in a hyp/ directory has the potential to be
executed at EL2, and thus could be generating these jump tables. It
could blow up at a later time, once you start running VMs...

M.
-- 
Jazz is not dead. It just smells funny...


Re: arm64 kvm built with clang doesn't boot

2018-03-16 Thread Andrey Konovalov
On Fri, Mar 16, 2018 at 3:31 PM, Mark Rutland  wrote:
>
> FWIW, with that same compiler and patch applied atop of v4.16-rc4, and
> some bodges around clang not liking the rX register naming in the SMCCC
> code, I get a kernel that boots on my Juno, though I immediately hit a
> KASAN splat:
>
> [8.476766] ==================================================================
> [8.483990] BUG: KASAN: slab-out-of-bounds in __d_lookup_rcu+0x350/0x400
> [8.490664] Read of size 8 at addr 8009336e2a30 by task init/1

I see this as well, I'm looking into it. It seems that
__no_sanitize_address is not defined for clang (defining it doesn't
help though, so the issue might be deeper).


Re: arm64 kvm built with clang doesn't boot

2018-03-16 Thread Andrey Konovalov
On Fri, Mar 16, 2018 at 3:13 PM, Marc Zyngier  wrote:
> I wasn't aware of that discussion, but this is indeed quite annoying.
> Note that you should be able to restrict this to arch/arm64/kvm/hyp/*
> and virt/kvm/arm/hyp/*.

That works as well (tried it, the kernel boots). I've also tried
compiling without the flag for virt/kvm/arm/hyp/*, it boots as well.


Re: arm64 kvm built with clang doesn't boot

2018-03-16 Thread Andrey Konovalov
On Fri, Mar 16, 2018 at 3:13 PM, Mark Rutland  wrote:
> I think that patch is our best bet currently, but to save ourselves pain
> in future it would be *really* nice if GCC and clang could provide an
> option like -fno-absolute-addressing that would implicitly disable any
> feature that would generate an absolute address as jump tables do.
>

Let me know if you want me to mail that patch again.

Perhaps Nick can comment on whether something like
-fno-absolute-addressing would be feasible in clang. Although even if
it gets implemented, it won't fix the already released clang versions.

> With v4.15 (and clang 5.0.0), I did not have to disable jump labels to
> get a kernel booting on a Juno platform, though I did have to pass
> -fno-jump-tables to the hyp code.
>
> Which kernel version and clang version are you using?

I've rechecked and I think I was wrong here. I disabled
CONFIG_JUMP_LABEL while trying to get the kernel booting before I
added the kvm flags. It seems it's not needed after all.

But just for the reference, I'm using 4.16-rc4 with a patch to fix
SMCCC issues that you mentioned. As for clang, I'm using LLVM revision
325711 (a couple of weeks old).


Re: arm64 kvm built with clang doesn't boot

2018-03-16 Thread Mark Rutland
On Fri, Mar 16, 2018 at 02:13:14PM +, Mark Rutland wrote:
> On Fri, Mar 16, 2018 at 02:49:00PM +0100, Andrey Konovalov wrote:
> > Hi!
> 
> Hi,
>  
> > I've recently tried to boot clang built kernel on real hardware
> > (Odroid C2 board) instead of using a VM. The issue that I stumbled
> > upon is that arm64 kvm built with clang doesn't boot.
> > 
> > Adding -fno-jump-tables compiler flag to arch/arm64/kvm/* helps. There
> > was a patch some time ago that did exactly that
> > (https://patchwork.kernel.org/patch/10060381/), but it wasn't accepted
> > AFAICT (see the discussion on that thread).
> > 
> > What would be the best way to get this fixed?
> 
> I think that patch is our best bet currently, but to save ourselves pain
> in future it would be *really* nice if GCC and clang could provide an
> > option like -fno-absolute-addressing that would implicitly disable any
> feature that would generate an absolute address as jump tables do.
> 
> > I've also had to disable CONFIG_JUMP_LABEL to get the kernel boot
> > (even without kvm enabled), but that might be a different (though
> > related) issue.
> 
> With v4.15 (and clang 5.0.0), I did not have to disable jump labels to
> get a kernel booting on a Juno platform, though I did have to pass
> -fno-jump-tables to the hyp code.

FWIW, with that same compiler and patch applied atop of v4.16-rc4, and
some bodges around clang not liking the rX register naming in the SMCCC
code, I get a kernel that boots on my Juno, though I immediately hit a
KASAN splat:

[8.476766] ==================================================================
[8.483990] BUG: KASAN: slab-out-of-bounds in __d_lookup_rcu+0x350/0x400
[8.490664] Read of size 8 at addr 8009336e2a30 by task init/1
[8.496808] 
[8.498313] CPU: 2 PID: 1 Comm: init Not tainted 4.16.0-rc4-1-g1e3a801c4e30-dirty #1
[8.506361] Hardware name: ARM Juno development board (r1) (DT)
[8.512248] Call trace:
[8.514699]  dump_backtrace+0x0/0x29c
[8.518356]  show_stack+0x20/0x2c
[8.521667]  dump_stack+0x118/0x15c
[8.525151]  print_address_description+0x80/0x2d0
[8.529839]  kasan_report+0x208/0x278
[8.533492]  __asan_load8+0x1b0/0x1b8
[8.537148]  __d_lookup_rcu+0x350/0x400
[8.540974]  lookup_fast+0x19c/0x780
[8.544541]  walk_component+0x108/0x121c
[8.548452]  path_lookupat+0x1a4/0x620
[8.552197]  filename_lookup+0x1d8/0x440
[8.556113]  user_path_at_empty+0x54/0x68
[8.560112]  vfs_statx+0x108/0x1f0
[8.563507]  SyS_newfstatat+0x118/0x19c
[8.567333]  el0_svc_naked+0x30/0x34
[8.570889] 
[8.572380] Allocated by task 1:
[8.575603]  kasan_kmalloc+0xe0/0x1ac
[8.579259]  __kmalloc+0x1e4/0x278
[8.582656]  __d_alloc+0x8c/0x370
[8.585968]  d_alloc_parallel+0xdc/0xca0
[8.589883]  nfs_readdir_xdr_to_array+0xe44/0x1694
[8.594658]  nfs_readdir_filler+0x44/0xe8
[8.598662]  do_read_cache_page+0x450/0x6f4
[8.602836]  read_cache_page+0x44/0x54
[8.606575]  nfs_readdir+0xd58/0xef4
[8.610143]  iterate_dir+0x15c/0x26c
[8.613711]  SyS_getdents64+0x180/0x30c
[8.617538]  el0_svc_naked+0x30/0x34
[8.621093] 
[8.622584] Freed by task 0:
[8.625451] (stack is not available)
[8.629007] 
[8.630505] The buggy address belongs to the object at 8009336e2a00
[8.630505]  which belongs to the cache kmalloc-128 of size 128
[8.642969] The buggy address is located 48 bytes inside of
[8.642969]  128-byte region [8009336e2a00, 8009336e2a80)
[8.654558] The buggy address belongs to the page:
[8.659335] page:7e0024cdb880 count:1 mapcount:0 mapping: index:0x0
[8.667304] flags: 0x1fffc100(slab)
[8.671487] raw: 1fffc100   000100100010
[8.679206] raw: dead0100 dead0200 800937403c00
[8.686907] page dumped because: kasan: bad access detected
[8.692447] 
[8.693935] Memory state around the buggy address:
[8.698710]  8009336e2900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[8.705902]  8009336e2980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[8.713093] >8009336e2a00: 00 00 00 00 00 00 06 fc fc fc fc fc fc fc fc fc
[8.720277]                                  ^
[8.725051]  8009336e2a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[8.732242]  8009336e2b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[8.739426] ==================================================================
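
For anyone reading the "Memory state" rows above: in generic KASAN each
shadow byte describes 8 bytes of memory. The sketch below is my own
simplified model of that encoding (0 means all 8 bytes are addressable,
1..7 means only the first N are, values with the top bit set such as fc
are redzones); shadow_allows() is a made-up helper, not the kernel's
implementation:

```c
#include <stdbool.h>
#include <stddef.h>

#define GRANULE 8	/* bytes of memory covered by one shadow byte */

/*
 * Return true if an access of "size" bytes starting "offset" bytes into
 * an 8-byte granule is permitted by that granule's shadow byte.
 */
bool shadow_allows(signed char shadow, size_t offset, size_t size)
{
	if (shadow < 0)
		return false;			/* redzone or freed memory */
	if (shadow == 0)
		return offset + size <= GRANULE;	/* whole granule valid */
	return offset + size <= (size_t)shadow;	/* only first N bytes valid */
}
```

The bad address sits 48 bytes into the object, so it falls in the
seventh granule, whose shadow byte is 06: only 6 of its 8 bytes belong
to the allocation, and an 8-byte read at offset 0 of that granule is
rejected.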

Thanks,
Mark.


Re: arm64 kvm built with clang doesn't boot

2018-03-16 Thread Marc Zyngier
Hi Andrey,

On 16/03/18 13:49, Andrey Konovalov wrote:
> Hi!
> 
> I've recently tried to boot clang built kernel on real hardware
> (Odroid C2 board) instead of using a VM. The issue that I stumbled
> upon is that arm64 kvm built with clang doesn't boot.
> 
> Adding -fno-jump-tables compiler flag to arch/arm64/kvm/* helps. There
> was a patch some time ago that did exactly that
> (https://patchwork.kernel.org/patch/10060381/), but it wasn't accepted
> AFAICT (see the discussion on that thread).

I wasn't aware of that discussion, but this is indeed quite annoying.
Note that you should be able to restrict this to arch/arm64/kvm/hyp/*
and virt/kvm/arm/hyp/*.

> What would be the best way to get this fixed?

Ideally, what I'd like to see is a way to stick to PC-relative addressing
within a compilation unit.

> I've also had to disable CONFIG_JUMP_LABEL to get the kernel boot
> (even without kvm enabled), but that might be a different (though
> related) issue.

That's quite bizarre. Does clang have an equivalent of "asm goto"? Or do
you rely on reading a variable to decide whether or not to branch?
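
To spell out the second option: without "asm goto" there is no branch
site to patch, so the test has to be an ordinary load and compare. A
rough sketch of that variable-reading fallback, with invented names
(static_key_sketch, key_enabled, pick_path) and not the kernel's actual
static key code:

```c
#include <stdbool.h>

/* Hypothetical stand-in for a static key: just a flag in memory. */
struct static_key_sketch {
	int enabled;
};

/* Fallback test: read the flag every time instead of patching code. */
static inline bool key_enabled(const struct static_key_sketch *key)
{
	/* volatile read so the compiler does not cache the value */
	return *(const volatile int *)&key->enabled != 0;
}

/* Returns 1 when the optional path is taken, 0 for the default path. */
int pick_path(const struct static_key_sketch *key)
{
	if (key_enabled(key))
		return 1;
	return 0;
}
```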

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...


Re: arm64 kvm built with clang doesn't boot

2018-03-16 Thread Mark Rutland
On Fri, Mar 16, 2018 at 02:49:00PM +0100, Andrey Konovalov wrote:
> Hi!

Hi,
 
> I've recently tried to boot clang built kernel on real hardware
> (Odroid C2 board) instead of using a VM. The issue that I stumbled
> upon is that arm64 kvm built with clang doesn't boot.
> 
> Adding -fno-jump-tables compiler flag to arch/arm64/kvm/* helps. There
> was a patch some time ago that did exactly that
> (https://patchwork.kernel.org/patch/10060381/), but it wasn't accepted
> AFAICT (see the discussion on that thread).
> 
> What would be the best way to get this fixed?

I think that patch is our best bet currently, but to save ourselves pain
in future it would be *really* nice if GCC and clang could provide an
option like -fno-absolute-addressing that would implicitly disable any
feature that would generate an absolute address as jump tables do.

> I've also had to disable CONFIG_JUMP_LABEL to get the kernel boot
> (even without kvm enabled), but that might be a different (though
> related) issue.

With v4.15 (and clang 5.0.0), I did not have to disable jump labels to
get a kernel booting on a Juno platform, though I did have to pass
-fno-jump-tables to the hyp code.

Which kernel version and clang version are you using?

Thanks,
Mark.


arm64 kvm built with clang doesn't boot

2018-03-16 Thread Andrey Konovalov
Hi!

I've recently tried to boot a clang-built kernel on real hardware
(Odroid C2 board) instead of using a VM. The issue that I stumbled
upon is that arm64 kvm built with clang doesn't boot.

Adding -fno-jump-tables compiler flag to arch/arm64/kvm/* helps. There
was a patch some time ago that did exactly that
(https://patchwork.kernel.org/patch/10060381/), but it wasn't accepted
AFAICT (see the discussion on that thread).

What would be the best way to get this fixed?

I've also had to disable CONFIG_JUMP_LABEL to get the kernel to boot
(even without kvm enabled), but that might be a different (though
related) issue.

Thanks!


Re: 4.14.9 doesn't boot (regression)

2017-12-30 Thread Josh Poimboeuf
On Sun, Dec 31, 2017 at 01:03:25AM +0300, Alexander Tsoy wrote:
> > Turns out my previous code to print iret frames was a bit ...
> > misguided, to put it nicely.  Not sure what I was smoking.
> > 
> > Hopefully the below patch should fix it (in place of the previous
> > patch).  Would you mind testing again?
> > 
> 
> With that patch I get:
> 
> [2.160017] NMI backtrace for cpu 0
> [2.160017] CPU: 0 PID: 1 Comm: init Not tainted 4.15.0-rc5 #1
> [2.160017] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc27 04/01/2014
> [2.160017] RIP: 0010:double_fault+0x0/0x30
> [2.160017] RSP: :fe807fd0 EFLAGS: 00010086
> [2.160017] RAX: ffc0 RBX: 0001 RCX: c101
> [2.160017] RDX: 8edc RSI:  RDI: fe807f58
> [2.160017] RBP:  R08:  R09: 
> [2.160017] R10:  R11:  R12: a3c01426
> [2.160017] R13:  R14:  R15: 
> [2.160017] FS:  () GS:8edcffc0() knlGS:
> [2.160017] CS:  0010 DS:  ES:  CR0: 80050033
> [2.160017] CR2: fe806f08 CR3: 7c153000 CR4: 06b0
> [2.160017] Call Trace:
> [2.160017]  <#DF>
> [2.160017] RIP: 0010:do_double_fault+0xb/0x140
> [2.160017] RSP: :fe806f18 EFLAGS: 00010086
> [2.160017]  

Yes, that's more like it.  I'll clean up the patches and submit them
soon.  These nasty bugs are always a good testcase for the stack dump
code.
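
For readers following along: on x86-64 the hardware-pushed iret frame
only holds five words (RIP, CS, RFLAGS, RSP, SS), which is why the
second block in the trace above is limited to RIP/RSP/EFLAGS. A rough
userspace sketch of formatting such a frame; the struct and function
names are invented here and this is not the code from the patch:

```c
#include <stdio.h>

/* Hypothetical layout of the five words pushed on an exception. */
struct iret_frame_sketch {
	unsigned long ip;
	unsigned long cs;
	unsigned long flags;
	unsigned long sp;
	unsigned long ss;
};

/* Format only what the frame contains; there are no GPRs to print. */
int format_iret_frame(char *buf, size_t len,
		      const struct iret_frame_sketch *f)
{
	return snprintf(buf, len,
			"RIP: %04lx:%016lx RSP: %04lx:%016lx EFLAGS: %08lx",
			f->cs, f->ip, f->ss, f->sp, f->flags);
}
```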

Thanks for testing!

-- 
Josh


Re: 4.14.9 doesn't boot (regression)

2017-12-30 Thread Alexander Tsoy
On Sat, 30 Dec 2017 11:57:46 -0600
Josh Poimboeuf wrote:

> On Sat, Dec 30, 2017 at 11:09:46AM -0600, Josh Poimboeuf wrote:
> > On Sat, Dec 30, 2017 at 11:45:13AM +0300, Alexander Tsoy wrote:  
> > > On Fri, 29/12/2017 at 21:49 -0600, Josh Poimboeuf wrote:
> > > > On Fri, Dec 29, 2017 at 05:10:35PM -0700, Andy Lutomirski
> > > > wrote:  
> > > > > (Also, Josh, the oops code should have printed the contents
> > > > > of the struct pt_regs at the top of the DF stack.  Any idea
> > > > > why it didn't?)  
> > > > 
> > > > Looking at one of the dumps:
> > > > 
> > > >   [  392.774879] NMI backtrace for cpu 0
> > > >   [  392.774881] CPU: 0 PID: 1 Comm: init Not tainted 4.14.9-gentoo #1
> > > >   [  392.774881] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> > > >   [  392.774882] task: 8802368b8000 task.stack: c900c000
> > > >   [  392.774885] RIP: 0010:double_fault+0x0/0x30
> > > >   [  392.774886] RSP: :ff527fd0 EFLAGS: 0086
> > > >   [  392.774887] RAX: 3fc0 RBX: 0001 RCX: c101
> > > >   [  392.774887] RDX: 8802 RSI:  RDI: ff527f58
> > > >   [  392.774887] RBP:  R08:  R09: 
> > > >   [  392.774888] R10:  R11:  R12: 816ae726
> > > >   [  392.774888] R13:  R14:  R15: 
> > > >   [  392.774889] FS:  () GS:88023fc0() knlGS:
> > > >   [  392.774889] CS:  0010 DS:  ES:  CR0: 80050033
> > > >   [  392.774890] CR2: ff526f08 CR3: 000235b48002 CR4: 001606f0
> > > >   [  392.774892] Call Trace:
> > > >   [  392.774894]  <#DF>
> > > >   [  392.774897]  do_double_fault+0xb/0x140
> > > >   [  392.774898]  
> > > > 
> > > > It should have at least printed the #DF iret frame registers,
> > > > which I recently added support for in "x86/unwinder: Handle
> > > > stack overflows more
> > > > gracefully", which is in both 4.14.9 and 4.15-rc5.
> > > > 
> > > > I think the missing iret regs are due to a bug in
> > > > show_trace_log_lvl(),
> > > > where if the unwind starts with two regs frames in a row, the
> > > > second regs don't get printed.
> > > > 
> > > > Alexander, would you mind reproducing again with the below
> > > > patch?  It should still fail, but this time it should hopefully
> > > > show another RIP/RSP/EFLAGS instead of the
> > > > "do_double_fault+0xb/0x140" line. 
> > > 
> > > Yes, it works:
> > > 
> > > [   23.058064] NMI backtrace for cpu 2
> > > [   23.058068] CPU: 2 PID: 1 Comm: init Not tainted 4.15.0-rc5+ #1
> > > [   23.058069] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc27 04/01/2014
> > > [   23.058074] RIP: 0010:double_fault+0x0/0x30
> > > [   23.058075] RSP: :fe85ffd0 EFLAGS: 0086
> > > [   23.058077] RAX: 3fd0 RBX: 0001 RCX: c101
> > > [   23.058077] RDX: 9681 RSI:  RDI: fe85ff58
> > > [   23.058078] RBP:  R08:  R09: 
> > > [   23.058079] R10:  R11:  R12: 92001426
> > > [   23.058080] R13:  R14:  R15: 
> > > [   23.058083] FS:  () GS:96813fd0() knlGS:
> > > [   23.058084] CS:  0010 DS:  ES:  CR0: 80050033
> > > [   23.058085] CR2: fe85ef08 CR3: 000137a09000 CR4: 000406a0
> > > [   23.058089] Call Trace:
> > > [   23.058101]  <#DF>
> > > [   23.058104] RIP: 0010:do_double_fault+0xb/0x140
> > > [   23.058105] RSP: :fe85ef18 EFLAGS: 00010086 ORIG_RAX: 
> > > [   23.058106] RAX: 3fd0 RBX: 0001 RCX: c101
> > > [   23.058107] RDX: 9681 RSI:  RDI: fe85ff58
> > > [   23.058107] RBP:  R08:  R09: 
> > > [   23.058108] R10:  R11:  R12: 92001426
> > > [   23.058108] R13:  R14:  R15: 
> > > [   23.058111]  
> > > [   23.058111] Code: 05 00 00 48 89 e7 31 f6 e8 2e 8c 61 ff e9 69 06 00 00 e8 94 05 00 00 48 89 e7 31 f6 e8 1a 8c 61 ff e9 55 06 00 00 0f 1f 44 00 00 <0f> 1f 00 48 83 c4 88 e8 e4 04 00 00 48 89 e7 48 8b 74 24 78 48
> > 
> > That's better indeed, though still not quite right.  It should have
> > only shown a subset of those registers.  One more bug to fix
> > there...  
> 
> Turns out my previous code to print iret frames was a bit ...
> misguided, to put it nicely.  Not sure what I was smoking.
> 
> Hopefully the below patch should fix it (in 

Re: 4.14.9 doesn't boot (regression)

2017-12-30 Thread Josh Poimboeuf
On Sat, Dec 30, 2017 at 11:09:46AM -0600, Josh Poimboeuf wrote:
> On Sat, Dec 30, 2017 at 11:45:13AM +0300, Alexander Tsoy wrote:
> > > On Fri, 29/12/2017 at 21:49 -0600, Josh Poimboeuf wrote:
> > > On Fri, Dec 29, 2017 at 05:10:35PM -0700, Andy Lutomirski wrote:
> > > > (Also, Josh, the oops code should have printed the contents of the
> > > > struct pt_regs at the top of the DF stack.  Any idea why it
> > > > didn't?)
> > > 
> > > Looking at one of the dumps:
> > > 
> > >   [  392.774879] NMI backtrace for cpu 0
> > >   [  392.774881] CPU: 0 PID: 1 Comm: init Not tainted 4.14.9-gentoo
> > > #1
> > >   [  392.774881] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> > >   [  392.774882] task: 8802368b8000 task.stack: c900c000
> > >   [  392.774885] RIP: 0010:double_fault+0x0/0x30
> > >   [  392.774886] RSP: :ff527fd0 EFLAGS: 0086
> > >   [  392.774887] RAX: 3fc0 RBX: 0001 RCX:
> > > c101
> > >   [  392.774887] RDX: 8802 RSI:  RDI:
> > > ff527f58
> > >   [  392.774887] RBP:  R08:  R09:
> > > 
> > >   [  392.774888] R10:  R11:  R12:
> > > 816ae726
> > >   [  392.774888] R13:  R14:  R15:
> > > 
> > >   [  392.774889] FS:  ()
> > > GS:88023fc0() knlGS:
> > >   [  392.774889] CS:  0010 DS:  ES:  CR0: 80050033
> > >   [  392.774890] CR2: ff526f08 CR3: 000235b48002 CR4:
> > > 001606f0
> > >   [  392.774892] Call Trace:
> > >   [  392.774894]  <#DF>
> > >   [  392.774897]  do_double_fault+0xb/0x140
> > >   [  392.774898]  
> > > 
> > > It should have at least printed the #DF iret frame registers, which I
> > > recently added support for in "x86/unwinder: Handle stack overflows
> > > more
> > > gracefully", which is in both 4.14.9 and 4.15-rc5.
> > > 
> > > I think the missing iret regs are due to a bug in
> > > show_trace_log_lvl(),
> > > where if the unwind starts with two regs frames in a row, the second
> > > regs don't get printed.
> > > 
> > > Alexander, would you mind reproducing again with the below patch?  It
> > > should still fail, but this time it should hopefully show another
> > > RIP/RSP/EFLAGS instead of the "do_double_fault+0xb/0x140" line.
> > > 
> > 
> > Yes, it works:
> > 
> > [   23.058064] NMI backtrace for cpu 2
> > [   23.058068] CPU: 2 PID: 1 Comm: init Not tainted 4.15.0-rc5+ #1
> > [   23.058069] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> > BIOS 1.10.2-1.fc27 04/01/2014
> > [   23.058074] RIP: 0010:double_fault+0x0/0x30
> > [   23.058075] RSP: :fe85ffd0 EFLAGS: 0086
> > [   23.058077] RAX: 3fd0 RBX: 0001 RCX:
> > c101
> > [   23.058077] RDX: 9681 RSI:  RDI:
> > fe85ff58
> > [   23.058078] RBP:  R08:  R09:
> > 
> > [   23.058079] R10:  R11:  R12:
> > 92001426
> > [   23.058080] R13:  R14:  R15:
> > 
> > [   23.058083] FS:  () GS:96813fd0()
> > knlGS:
> > [   23.058084] CS:  0010 DS:  ES:  CR0: 80050033
> > [   23.058085] CR2: fe85ef08 CR3: 000137a09000 CR4:
> > 000406a0
> > [   23.058089] Call Trace:
> > [   23.058101]  <#DF>
> > [   23.058104] RIP: 0010:do_double_fault+0xb/0x140
> > [   23.058105] RSP: :fe85ef18 EFLAGS: 00010086 ORIG_RAX:
> > 
> > [   23.058106] RAX: 3fd0 RBX: 0001 RCX:
> > c101
> > [   23.058107] RDX: 9681 RSI:  RDI:
> > fe85ff58
> > [   23.058107] RBP:  R08:  R09:
> > 
> > [   23.058108] R10:  R11:  R12:
> > 92001426
> > [   23.058108] R13:  R14:  R15:
> > 
> > [   23.058111]  
> > [   23.058111] Code: 05 00 00 48 89 e7 31 f6 e8 2e 8c 61 ff e9 69 06 00
> > 00 e8 94 05 00 00 48 89 e7 31 f6 e8 1a 8c 61 ff e9 55 06 00 00 0f 1f 44
> > 00 00 <0f> 1f 00 48 83 c4 88 e8 e4 04 00 00 48 89 e7 48 8b 74 24 78 48
> 
> That's better indeed, though still not quite right.  It should have only
> shown a subset of those registers.  One more bug to fix there...

Turns out my previous code to print iret frames was a bit ... misguided,
to put it nicely.  Not sure what I was smoking.

Hopefully the below patch should fix it (in place of the previous
patch).  Would you mind testing again?

diff --git a/arch/x86/include/asm/unwind.h b/arch/x86/include/asm/unwind.h
index c1688c2d0a12..1f86e1b0a5cd 100644
--- a/arch/x86/include/asm/unwind.h
+++ b/arch/x86/include/asm/unwind.h
@@ -56,18 +56,27 @@ void unwind_start(struct unwind_state 

Re: 4.14.9 doesn't boot (regression)

2017-12-30 Thread Josh Poimboeuf
On Sat, Dec 30, 2017 at 11:45:13AM +0300, Alexander Tsoy wrote:
> On Fri, 29/12/2017 at 21:49 -0600, Josh Poimboeuf wrote:
> > On Fri, Dec 29, 2017 at 05:10:35PM -0700, Andy Lutomirski wrote:
> > > (Also, Josh, the oops code should have printed the contents of the
> > > struct pt_regs at the top of the DF stack.  Any idea why it
> > > didn't?)
> > 
> > Looking at one of the dumps:
> > 
> >   [  392.774879] NMI backtrace for cpu 0
> >   [  392.774881] CPU: 0 PID: 1 Comm: init Not tainted 4.14.9-gentoo
> > #1
> >   [  392.774881] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> >   [  392.774882] task: 8802368b8000 task.stack: c900c000
> >   [  392.774885] RIP: 0010:double_fault+0x0/0x30
> >   [  392.774886] RSP: :ff527fd0 EFLAGS: 0086
> >   [  392.774887] RAX: 3fc0 RBX: 0001 RCX:
> > c101
> >   [  392.774887] RDX: 8802 RSI:  RDI:
> > ff527f58
> >   [  392.774887] RBP:  R08:  R09:
> > 
> >   [  392.774888] R10:  R11:  R12:
> > 816ae726
> >   [  392.774888] R13:  R14:  R15:
> > 
> >   [  392.774889] FS:  ()
> > GS:88023fc0() knlGS:
> >   [  392.774889] CS:  0010 DS:  ES:  CR0: 80050033
> >   [  392.774890] CR2: ff526f08 CR3: 000235b48002 CR4:
> > 001606f0
> >   [  392.774892] Call Trace:
> >   [  392.774894]  <#DF>
> >   [  392.774897]  do_double_fault+0xb/0x140
> >   [  392.774898]  
> > 
> > It should have at least printed the #DF iret frame registers, which I
> > recently added support for in "x86/unwinder: Handle stack overflows
> > more
> > gracefully", which is in both 4.14.9 and 4.15-rc5.
> > 
> > I think the missing iret regs are due to a bug in
> > show_trace_log_lvl(),
> > where if the unwind starts with two regs frames in a row, the second
> > regs don't get printed.
> > 
> > Alexander, would you mind reproducing again with the below patch?  It
> > should still fail, but this time it should hopefully show another
> > RIP/RSP/EFLAGS instead of the "do_double_fault+0xb/0x140" line.
> > 
> 
> Yes, it works:
> 
> [   23.058064] NMI backtrace for cpu 2
> [   23.058068] CPU: 2 PID: 1 Comm: init Not tainted 4.15.0-rc5+ #1
> [   23.058069] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 1.10.2-1.fc27 04/01/2014
> [   23.058074] RIP: 0010:double_fault+0x0/0x30
> [   23.058075] RSP: :fe85ffd0 EFLAGS: 0086
> [   23.058077] RAX: 3fd0 RBX: 0001 RCX:
> c101
> [   23.058077] RDX: 9681 RSI:  RDI:
> fe85ff58
> [   23.058078] RBP:  R08:  R09:
> 
> [   23.058079] R10:  R11:  R12:
> 92001426
> [   23.058080] R13:  R14:  R15:
> 
> [   23.058083] FS:  () GS:96813fd0()
> knlGS:
> [   23.058084] CS:  0010 DS:  ES:  CR0: 80050033
> [   23.058085] CR2: fe85ef08 CR3: 000137a09000 CR4:
> 000406a0
> [   23.058089] Call Trace:
> [   23.058101]  <#DF>
> [   23.058104] RIP: 0010:do_double_fault+0xb/0x140
> [   23.058105] RSP: :fe85ef18 EFLAGS: 00010086 ORIG_RAX:
> 
> [   23.058106] RAX: 3fd0 RBX: 0001 RCX:
> c101
> [   23.058107] RDX: 9681 RSI:  RDI:
> fe85ff58
> [   23.058107] RBP:  R08:  R09:
> 
> [   23.058108] R10:  R11:  R12:
> 92001426
> [   23.058108] R13:  R14:  R15:
> 
> [   23.058111]  
> [   23.058111] Code: 05 00 00 48 89 e7 31 f6 e8 2e 8c 61 ff e9 69 06 00
> 00 e8 94 05 00 00 48 89 e7 31 f6 e8 1a 8c 61 ff e9 55 06 00 00 0f 1f 44
> 00 00 <0f> 1f 00 48 83 c4 88 e8 e4 04 00 00 48 89 e7 48 8b 74 24 78 48

That's better indeed, though still not quite right.  It should have only
shown a subset of those registers.  One more bug to fix there...

-- 
Josh
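
The show_trace_log_lvl() problem described in the quoted text (two regs
frames in a row, the second one not printed) can be pictured with a toy
model in C. This is a hypothetical reconstruction of that class of bug,
not the kernel's actual code: the frame kinds, the "previous frame was
regs" flag and both function names are assumptions made up for
illustration. The functions only count how many register blocks would
be printed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy frame kinds: a plain return address or a saved register block. */
enum toy_frame_kind { TOY_FRAME_ADDR, TOY_FRAME_REGS };

/*
 * Hypothetical buggy walk: a regs block is only "printed" when the
 * previous frame was not a regs block, so the second of two regs
 * frames in a row is silently dropped.
 */
size_t toy_count_regs_buggy(const enum toy_frame_kind *frames, size_t n)
{
	size_t printed = 0;
	bool prev_was_regs = false;

	assert(frames != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		bool is_regs = (frames[i] == TOY_FRAME_REGS);

		if (is_regs && !prev_was_regs)
			printed++;
		prev_was_regs = is_regs;
	}
	return printed;
}

/* Fixed walk: every regs frame is printed, whatever preceded it. */
size_t toy_count_regs_fixed(const enum toy_frame_kind *frames, size_t n)
{
	size_t printed = 0;

	assert(frames != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (frames[i] == TOY_FRAME_REGS)
			printed++;
	}
	return printed;
}
```

For a trace that starts with two regs frames followed by an address,
the buggy walk counts one register block and the fixed walk counts two,
which mirrors the difference between the first dump (only the
"do_double_fault+0xb/0x140" line) and the second one (a second
RIP/RSP/EFLAGS block).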


Re: 4.14.9 doesn't boot (regression)

2017-12-30 Thread Jiri Kosina
On Sat, 30 Dec 2017, Toralf Förster wrote:

> This made the issue go away :
> 
> diff --git a/Makefile b/Makefile
> index ac8c441866b7..11a12947c550 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -414,7 +414,7 @@ LINUXINCLUDE:= \
>  
>  KBUILD_AFLAGS   := -D__ASSEMBLY__
>  KBUILD_CFLAGS   := -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
> -  -fno-strict-aliasing -fno-common -fshort-wchar \
> +  -fno-strict-aliasing -fno-common -fshort-wchar -fstack-check=no \
>    -Werror-implicit-function-declaration \
>    -Wno-format-security \
>    -std=gnu89
> 
> But this doesn't solve the root cause, right ? So if the root cause is 
> "Gentoo hardened GCC is broken" please just let me know this - FWIW I'm 
> in #gentoo-dev on freenode.

-fstack-check for kernel is never going to work properly.

That option is purely for userspace, and assumes all the logic around 
'stack guard gap' and the auto-growing semantics being in place; which is 
there for user stack VMA, but definitely not for kernel stack.

It's probably the "hardened" flavor of your distro trying to push 
'-fstack-check' to everything it compiles; so I actually think the 
Makefile patch, sanitizing CFLAGS by force-disabling -fstack-check is 
exactly what we should be doing.

Thanks,

-- 
Jiri Kosina
SUSE Labs
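
Jiri's point about the missing guard gap can be illustrated with a
small user-space toy model in C. This is only a sketch under stated
assumptions, not kernel code: the names toy_stack, toy_stack_contains
and toy_probe_ok are made up for illustration, the stack is assumed to
be a single 4 KiB page with nothing mapped below it, and the probe
distance is derived from the low bits visible in the dumps earlier in
the thread (RSP ending in 7fd0 on entry to double_fault, CR2 ending in
6f08).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a fixed-size stack: valid addresses are [low, high). */
struct toy_stack {
	uintptr_t low;   /* lowest mapped address */
	uintptr_t high;  /* one past the highest mapped address */
};

/* Returns true if addr lies inside the mapped stack area. */
bool toy_stack_contains(const struct toy_stack *s, uintptr_t addr)
{
	return addr >= s->low && addr < s->high;
}

/*
 * Model of a stack probe: the compiler-generated prologue touches the
 * memory `distance` bytes below the current stack pointer. Returns
 * true if that access stays on the stack, false if it would fault.
 */
bool toy_probe_ok(const struct toy_stack *s, uintptr_t sp, uintptr_t distance)
{
	/* The stack pointer itself must be on (or at the top of) the stack. */
	assert(sp > s->low && sp <= s->high);
	/* Avoid unsigned wrap-around in the subtraction below. */
	assert(distance <= sp);

	return toy_stack_contains(s, sp - distance);
}
```

With a stack covering [0x7000, 0x8000) and a stack pointer of 0x7fd0,
toy_probe_ok(&s, 0x7fd0, 0x10c8) reports a miss, because
0x7fd0 - 0x10c8 = 0x6f08 lies below the stack. A user stack VMA would
grow to cover such an access; a fixed-size kernel stack cannot.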


Re: 4.14.9 doesn't boot (regression)

2017-12-30 Thread Toralf Förster
On 12/30/2017 02:13 AM, Alexander Tsoy wrote:
> You are right, it's due to -fstack-check being enabled in Gentoo's gcc spec.
> "-fstack-check=no" in KBUILD_CFLAGS fixed this problem for me. =/

This made the issue go away :

diff --git a/Makefile b/Makefile
index ac8c441866b7..11a12947c550 100644
--- a/Makefile
+++ b/Makefile
@@ -414,7 +414,7 @@ LINUXINCLUDE:= \
 
 KBUILD_AFLAGS   := -D__ASSEMBLY__
 KBUILD_CFLAGS   := -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
-  -fno-strict-aliasing -fno-common -fshort-wchar \
+  -fno-strict-aliasing -fno-common -fshort-wchar -fstack-check=no \
   -Werror-implicit-function-declaration \
   -Wno-format-security \
   -std=gnu89

But this doesn't solve the root cause, right ? So if the root cause is "Gentoo 
hardened GCC is broken" please just let me know this - FWIW I'm in #gentoo-dev 
on freenode.

-- 
Toralf
PGP C4EACDDE 0076E94E


Re: 4.14.9 doesn't boot (regression)

2017-12-30 Thread Jiri Kosina
On Fri, 29 Dec 2017, Linus Torvalds wrote:

> Ok, so what does seem to be consistent for everybody is that 
> double-fault in the NMI backtrace.
> 
> So the fact that the NMI always hits on a double-fault does make me
> suspect that it's an infinite stream of double-faults, and that is
> presumably also what causes the RCU timeout.

As I've been fighting with recursive double-faults lately (backporting PTI 
to ancient kernels), I can tell you that this is not the symptom you'd be 
seeing in such a case; a recursive double fault pretty quickly overflows the 
interrupt stack and triple-faults.

-- 
Jiri Kosina
SUSE Labs


Re: 4.14.9 doesn't boot (regression)

2017-12-30 Thread Toralf Förster
On 12/30/2017 04:49 AM, Josh Poimboeuf wrote:
> Alexander, would you mind reproducing again with the below patch?  It
> should still fail, but this time it should hopefully show another
> RIP/RSP/EFLAGS instead of the "do_double_fault+0xb/0x140" line.

I applied that too on top of v4.15-rc5-114-g2758b3e3e630 (no other patches or 
changes to CFLAGS), ran make clean, then built and booted the kernel. It still 
gets stuck; the result is in [1].


[1] https://zwiebeltoralf.de/pub/IMG_20171230_102325.jpg

-- 
Toralf
PGP C4EACDDE 0076E94E


Re: 4.14.9 doesn't boot (regression)

2017-12-30 Thread Toralf Förster
On 12/30/2017 10:14 AM, Alexander Tsoy wrote:
> Yes, and only in hardened profile, so most users don't have
> -fstack-check by default. :)
Indeed, I do run hardened Gentoo only.

-- 
Toralf
PGP C4EACDDE 0076E94E


Re: 4.14.9 doesn't boot (regression)

2017-12-30 Thread Alexander Tsoy
On Fri, 29/12/2017 at 17:34 -0800, Linus Torvalds wrote:
> On Fri, Dec 29, 2017 at 5:00 PM, Linus Torvalds
>  wrote:
> > 
> > > Good. I was not feeling so happy about this bug report, but now I
> > > can firmly just blame the gentoo compiler for having some
> > > shit-for-brains "feature".
> 
> Looks like I can generate similar bad code with the F26 version of
> gcc, it's just not enabled by default.
> 
> So all gentoo did was change the default options.

Yes, and only in hardened profile, so most users don't have
-fstack-check by default. :)

> 
> I suspect we should just add a
> 
> KBUILD_CFLAGS  += $(call cc-option,-fno-stack-check,)
> 
> somewhere to the main Makefile, just to make sure.
> 
> Maybe like the appended?
> 
> Toralf, Alexander, does this make things JustWork(tm) for you?

I can confirm that with your patch my gcc produces a working kernel.


Re: 4.14.9 doesn't boot (regression)

2017-12-30 Thread Alexander Tsoy
On Fri, 29/12/2017 at 21:49 -0600, Josh Poimboeuf wrote:
> On Fri, Dec 29, 2017 at 05:10:35PM -0700, Andy Lutomirski wrote:
> > (Also, Josh, the oops code should have printed the contents of the
> > struct pt_regs at the top of the DF stack.  Any idea why it
> > didn't?)
> 
> Looking at one of the dumps:
> 
>   [  392.774879] NMI backtrace for cpu 0
>   [  392.774881] CPU: 0 PID: 1 Comm: init Not tainted 4.14.9-gentoo
> #1
>   [  392.774881] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
>   [  392.774882] task: 8802368b8000 task.stack: c900c000
>   [  392.774885] RIP: 0010:double_fault+0x0/0x30
>   [  392.774886] RSP: :ff527fd0 EFLAGS: 0086
>   [  392.774887] RAX: 3fc0 RBX: 0001 RCX:
> c101
>   [  392.774887] RDX: 8802 RSI:  RDI:
> ff527f58
>   [  392.774887] RBP:  R08:  R09:
> 
>   [  392.774888] R10:  R11:  R12:
> 816ae726
>   [  392.774888] R13:  R14:  R15:
> 
>   [  392.774889] FS:  ()
> GS:88023fc0() knlGS:
>   [  392.774889] CS:  0010 DS:  ES:  CR0: 80050033
>   [  392.774890] CR2: ff526f08 CR3: 000235b48002 CR4:
> 001606f0
>   [  392.774892] Call Trace:
>   [  392.774894]  <#DF>
>   [  392.774897]  do_double_fault+0xb/0x140
>   [  392.774898]  
> 
> It should have at least printed the #DF iret frame registers, which I
> recently added support for in "x86/unwinder: Handle stack overflows
> more
> gracefully", which is in both 4.14.9 and 4.15-rc5.
> 
> I think the missing iret regs are due to a bug in
> show_trace_log_lvl(),
> where if the unwind starts with two regs frames in a row, the second
> regs don't get printed.
> 
> Alexander, would you mind reproducing again with the below patch?  It
> should still fail, but this time it should hopefully show another
> RIP/RSP/EFLAGS instead of the "do_double_fault+0xb/0x140" line.
> 

Yes, it works:

[   23.058064] NMI backtrace for cpu 2
[   23.058068] CPU: 2 PID: 1 Comm: init Not tainted 4.15.0-rc5+ #1
[   23.058069] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc27 04/01/2014
[   23.058074] RIP: 0010:double_fault+0x0/0x30
[   23.058075] RSP: :fe85ffd0 EFLAGS: 0086
[   23.058077] RAX: 3fd0 RBX: 0001 RCX: c101
[   23.058077] RDX: 9681 RSI:  RDI: fe85ff58
[   23.058078] RBP:  R08:  R09:
[   23.058079] R10:  R11:  R12: 92001426
[   23.058080] R13:  R14:  R15:
[   23.058083] FS:  () GS:96813fd0() knlGS:
[   23.058084] CS:  0010 DS:  ES:  CR0: 80050033
[   23.058085] CR2: fe85ef08 CR3: 000137a09000 CR4: 000406a0
[   23.058089] Call Trace:
[   23.058101]  <#DF>
[   23.058104] RIP: 0010:do_double_fault+0xb/0x140
[   23.058105] RSP: :fe85ef18 EFLAGS: 00010086 ORIG_RAX:
[   23.058106] RAX: 3fd0 RBX: 0001 RCX: c101
[   23.058107] RDX: 9681 RSI:  RDI: fe85ff58
[   23.058107] RBP:  R08:  R09:
[   23.058108] R10:  R11:  R12: 92001426
[   23.058108] R13:  R14:  R15:
[   23.058111]  </#DF>
[   23.058111] Code: 05 00 00 48 89 e7 31 f6 e8 2e 8c 61 ff e9 69 06 00 00 e8 94 05 00 00 48 89 e7 31 f6 e8 1a 8c 61 ff e9 55 06 00 00 0f 1f 44 00 00 <0f> 1f 00 48 83 c4 88 e8 e4 04 00 00 48 89 e7 48 8b 74 24 78 48


Re: 4.14.9 doesn't boot (regression)

2017-12-30 Thread Toralf Förster
On 12/30/2017 01:10 AM, Andy Lutomirski wrote:
> Toralf, can you send the complete output of:
> 
> objdump -dr arch/x86/kernel/traps.o
> 
> From the build tree of a nonworking kernel?

I attached it.

FWIW:

tfoerste@t44 ~/devel/linux $ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-pc-linux-gnu/6.4.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /var/tmp/portage/sys-devel/gcc-6.4.0/work/gcc-6.4.0/configure 
--host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --prefix=/usr 
--bindir=/usr/x86_64-pc-linux-gnu/gcc-bin/6.4.0 
--includedir=/usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0/include 
--datadir=/usr/share/gcc-data/x86_64-pc-linux-gnu/6.4.0 
--mandir=/usr/share/gcc-data/x86_64-pc-linux-gnu/6.4.0/man 
--infodir=/usr/share/gcc-data/x86_64-pc-linux-gnu/6.4.0/info 
--with-gxx-include-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0/include/g++-v6 
--with-python-dir=/share/gcc-data/x86_64-pc-linux-gnu/6.4.0/python 
--enable-languages=c,c++ --enable-obsolete --enable-secureplt --disable-werror 
--with-system-zlib --enable-nls --without-included-gettext 
--enable-checking=release --with-bugurl=https://bugs.gentoo.org/ 
--with-pkgversion='Gentoo Hardened 6.4.0 p1.1' --enable-esp 
--enable-libstdcxx-time --disable-libstdcxx-pch --enable-shared 
--enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu 
--enable-multilib --with-multilib-list=m32,m64 --disable-altivec 
--disable-fixed-point --enable-targets=all --disable-libgcj --enable-libgomp 
--disable-libmudflap --disable-libssp --disable-libcilkrts --disable-libmpx 
--enable-vtable-verify --enable-libvtv --disable-libquadmath --enable-lto 
--without-isl --disable-libsanitizer --enable-default-pie --enable-default-ssp
Thread model: posix
gcc version 6.4.0 (Gentoo Hardened 6.4.0 p1.1)

-- 
Toralf
PGP C4EACDDE 0076E94E

arch/x86/kernel/traps.o: file format elf64-x86-64


Disassembly of section .text:

 :
   0:   41 57                   push   %r15
   2:   41 56                   push   %r14
   4:   41 55                   push   %r13
   6:   41 54                   push   %r12
   8:   55                      push   %rbp
   9:   53                      push   %rbx
   a:   48 81 ec 28 10 00 00    sub    $0x1028,%rsp
  11:   48 83 0c 24 00          orq    $0x0,(%rsp)
  16:   48 81 c4 20 10 00 00    add    $0x1020,%rsp
  1d:   65 48 8b 2c 25 00 00    mov    %gs:0x0,%rbp
  24:   00 00
                        22: R_X86_64_32S        current_task
  26:   f6 81 88 00 00 00 03    testb  $0x3,0x88(%rcx)
  2d:   4c 63 ef                movslq %edi,%r13
  30:   41 89 f6                mov    %esi,%r14d
  33:   48 89 14 24             mov    %rdx,(%rsp)
  37:   49 89 cc                mov    %rcx,%r12
  3a:   4d 89 c7                mov    %r8,%r15
  3d:   4c 89 cb                mov    %r9,%rbx
  40:   75 3b                   jne    7d
  42:   44 89 ee                mov    %r13d,%esi
  45:   48 89 cf                mov    %rcx,%rdi
  48:   e8 00 00 00 00          callq  4d
                        49: R_X86_64_PC32       fixup_exception-0x4
  4d:   85 c0                   test   %eax,%eax
  4f:   74 0f                   je     60
  51:   48 83 c4 08             add    $0x8,%rsp
  55:   5b                      pop    %rbx
  56:   5d                      pop    %rbp
  57:   41 5c                   pop    %r12
  59:   41 5d                   pop    %r13
  5b:   41 5e                   pop    %r14
  5d:   41 5f                   pop    %r15
  5f:   c3                      retq
  60:   48 8b 3c 24             mov    (%rsp),%rdi
  64:   4c 89 bd c0 09 00 00    mov    %r15,0x9c0(%rbp)
  6b:   4c 89 fa                mov    %r15,%rdx
  6e:   4c 89 e6                mov    %r12,%rsi
  71:   4c 89 ad b8 09 00 00    mov    %r13,0x9b8(%rbp)
  78:   e8 00 00 00 00          callq  7d
                        79: R_X86_64_PC32       die-0x4
  7d:   8b 05 00 00 00 00       mov    0x0(%rip),%eax        # 83
                        7f: R_X86_64_PC32       show_unhandled_signals-0x4
  83:   4c 89 bd c0 09 00 00    mov    %r15,0x9c0(%rbp)
  8a:   4c 89 ad b8 09 00 00    mov    %r13,0x9b8(%rbp)
  91:   85 c0                   test   %eax,%eax
  93:   75 28                   jne    bd
  95:   48 85 db                test   %rbx,%rbx
  98:   b8 01 00 00 00          mov    $0x1,%eax
  9d:   48 89 ea                mov    %rbp,%rdx
  a0:   48 0f 44 d8             cmove  %rax,%rbx
  a4:   48 83 c4 08             add    $0x8,%rsp
  a8:   44 89 f7                mov    %r14d,%edi
  ab:   48 89 de                mov    %rbx,%rsi
  ae:   5b                      pop    %rbx
  af:   5d                      pop    %rbp
  b0:   41 5c                   pop    %r12
  b2:   41 5d                   pop    %r13
  b4:   41 5e                   pop    %r14
  b6:   41 5f                   pop    %r15
  b8:   e9 00 00 00 00          jmpq   bd
                        b9: R_X86_64_PC32       force_sig_info-0x4
  bd:   44 89 f6

Re: 4.14.9 doesn't boot (regression)

2017-12-29 Thread Josh Poimboeuf
On Fri, Dec 29, 2017 at 05:10:35PM -0700, Andy Lutomirski wrote:
> (Also, Josh, the oops code should have printed the contents of the
> struct pt_regs at the top of the DF stack.  Any idea why it didn't?)

Looking at one of the dumps:

  [  392.774879] NMI backtrace for cpu 0
  [  392.774881] CPU: 0 PID: 1 Comm: init Not tainted 4.14.9-gentoo #1
  [  392.774881] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
  [  392.774882] task: 8802368b8000 task.stack: c900c000
  [  392.774885] RIP: 0010:double_fault+0x0/0x30
  [  392.774886] RSP: :ff527fd0 EFLAGS: 0086
  [  392.774887] RAX: 3fc0 RBX: 0001 RCX: c101
  [  392.774887] RDX: 8802 RSI:  RDI: ff527f58
  [  392.774887] RBP:  R08:  R09:
  [  392.774888] R10:  R11:  R12: 816ae726
  [  392.774888] R13:  R14:  R15:
  [  392.774889] FS:  () GS:88023fc0() knlGS:
  [  392.774889] CS:  0010 DS:  ES:  CR0: 80050033
  [  392.774890] CR2: ff526f08 CR3: 000235b48002 CR4: 001606f0
  [  392.774892] Call Trace:
  [  392.774894]  <#DF>
  [  392.774897]  do_double_fault+0xb/0x140
  [  392.774898]  </#DF>

It should have at least printed the #DF iret frame registers, which I
recently added support for in "x86/unwinder: Handle stack overflows more
gracefully", which is in both 4.14.9 and 4.15-rc5.

I think the missing iret regs are due to a bug in show_trace_log_lvl(),
where if the unwind starts with two regs frames in a row, the second
regs don't get printed.

Alexander, would you mind reproducing again with the below patch?  It
should still fail, but this time it should hopefully show another
RIP/RSP/EFLAGS instead of the "do_double_fault+0xb/0x140" line.


diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 36b17e0febe8..39a320d077aa 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -103,6 +103,7 @@ void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
 
 	unwind_start(&state, task, regs, stack);
 	stack = stack ? : get_stack_pointer(task, regs);
+	regs = unwind_get_entry_regs(&state);
 
 	/*
 	 * Iterate through the stacks, starting with the current stack pointer.
@@ -120,7 +121,7 @@ void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
 	 * - hardirq stack
 	 * - entry stack
 	 */
-	for (regs = NULL; stack; stack = PTR_ALIGN(stack_info.next_sp, sizeof(long))) {
+	for ( ; stack; stack = PTR_ALIGN(stack_info.next_sp, sizeof(long))) {
 		const char *stack_name;
 
 		if (get_stack_info(stack, task, &stack_info, &visit_mask)) {


Re: 4.14.9 doesn't boot (regression)

2017-12-29 Thread Linus Torvalds
On Fri, Dec 29, 2017 at 5:00 PM, Linus Torvalds wrote:
>
> Good. I was not feeling so happy about this bug report, but now I can
> firmly just blame the gentoo compiler for having some shit-for-brains
> "feature".

Looks like I can generate similar bad code with the F26 version of
gcc, it's just not enabled by default.

So all gentoo did was change the default options.
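
For anyone who wants to check their own toolchain, a minimal sketch of a test file follows. It is not taken from the kernel; the function names and the file name `probe.c` are made up for illustration. The idea is only to have a small non-leaf function whose prologue can be compared with and without the option. Whether a probe sequence actually appears depends on the GCC version and target.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Kept out of line so that caller() is not a leaf function. */
static __attribute__((noinline)) size_t callee(const char *s)
{
    return strlen(s);
}

/* A small non-leaf function with a tiny frame. Built without stack
 * checking, its prologue should only adjust %rsp by a small amount.
 * With -fstack-check, some GCC builds may add an extra probe of the
 * form "sub ...,%rsp; orq $0x0,(%rsp); add ...,%rsp". */
__attribute__((noinline)) size_t caller(const char *s)
{
    char buf[32];

    /* Copy at most 31 bytes and always terminate the buffer. */
    strncpy(buf, s, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';
    return callee(buf);
}
```

Compiling this once with `gcc -O2 -S -fstack-check probe.c` and once with `gcc -O2 -S -fno-stack-check probe.c`, then diffing the two assembly files, should show whether the compiler inserts the probe.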

I suspect we should just add a

KBUILD_CFLAGS  += $(call cc-option,-fno-stack-check,)

somewhere to the main Makefile, just to make sure.

Maybe like the appended?

Toralf, Alexander, does this make things JustWork(tm) for you?

Linus
 Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Makefile b/Makefile
index ac8c441866b7..92b74bcd3c2a 100644
--- a/Makefile
+++ b/Makefile
@@ -789,6 +789,9 @@ KBUILD_CFLAGS += $(call cc-disable-warning, pointer-sign)
 # disable invalid "can't wrap" optimizations for signed / pointers
 KBUILD_CFLAGS  += $(call cc-option,-fno-strict-overflow)
 
+# Make sure -fstack-check isn't enabled (like gentoo apparently did)
+KBUILD_CFLAGS  += $(call cc-option,-fno-stack-check,)
+
 # conserve stack if available
 KBUILD_CFLAGS   += $(call cc-option,-fconserve-stack)
 


Re: 4.14.9 doesn't boot (regression)

2017-12-29 Thread Alexander Tsoy
On Fri, 29/12/2017 at 17:10 -0700, Andy Lutomirski wrote:
> 
> Also, you wouldn't happen to be using Gentoo perchance?  I already
> have two reports of a Gentoo system miscompiling the vDSO due to
> Gentoo enabling -fstack-check and GCC generating stack check code
> that is highly suboptimal, actively incorrect, and doesn't even
> manage to check the stack in a particularly helpful way.
> 
> If this is indeed what's going on, I'm going to try to come up with a
> patch to outright fail the build on these buggy systems.  We could
> probably fudge the build options to avoid the problem, but Gentoo
> really just needs fix its toolchain.

You are right, it's due to -fstack-check being enabled in Gentoo's gcc spec.
"-fstack-check=no" in KBUILD_CFLAGS fixed this problem for me. =/


Re: 4.14.9 doesn't boot (regression)

2017-12-29 Thread Linus Torvalds
On Fri, Dec 29, 2017 at 4:10 PM, Andy Lutomirski  wrote:
>
> Double faults use IST, so a double fault that double faults will effectively 
> just start over rather than eventually running out of stack and triple 
> faulting.
>
> But check out the registers. We have RSP = ...28fd8 and CR2 = ...27f08.
> IOW the double fault stack is ...28000 - ...28fff and we're somehow getting
> a failed page fault a couple hundred bytes below the bottom of the IST stack.
> IOW, I think we're just stuck in a neverending loop of stack overflows.

Ahh, good catch. This feels like it might finally be explaining things.

> (Also, Josh, the oops code should have printed the contents of the struct 
> pt_regs at the top of the DF stack.  Any idea why it didn't?)
>
> Toralf, can you send the complete output of:
>
> objdump -dr arch/x86/kernel/traps.o
>
> From the build tree of a nonworking kernel?

Alexander made one of his failing kernels available earlier:

https://www.dropbox.com/s/yesupqgig3uxf73/linux-4.15-rc5%2B.tar.xz?dl=0

and yes, there's something seriously wrong there. Doing a disassembly
on "do_double_fault()" shows:

8101bda0 <do_double_fault>:
8101bda0:   41 54                   push   %r12
8101bda2:   55                      push   %rbp
8101bda3:   53                      push   %rbx
8101bda4:   48 81 ec 20 10 00 00    sub    $0x1020,%rsp
8101bdab:   48 83 0c 24 00          orq    $0x0,(%rsp)
8101bdb0:   48 81 c4 20 10 00 00    add    $0x1020,%rsp

WTF? That's bogus crap, and not ok in the kernel.  Doing a stack probe
below the stack by subtracting 4128 from the stack pointer and then
oring it, and then resetting the stack pointer again is just crazy.
And it's definitely not ever going to work for the kernel that has a
limited stack.
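
To spell out why this cannot work on a small stack, here is a rough user-space sketch of the arithmetic. It assumes the stack in question is a single 4 KiB page, like the double-fault IST stack discussed in this thread; `STACK_SIZE` and the helper names are illustrative, and only `0x1020` comes from the disassembly above.

```c
#include <assert.h>
#include <stdint.h>

#define PROBE_OFFSET 0x1020u  /* from "sub $0x1020,%rsp": 4128 bytes */
#define STACK_SIZE   0x1000u  /* assumption: a one-page (4 KiB) stack */

/* Address touched by "sub $0x1020,%rsp; orq $0x0,(%rsp)". */
static uint64_t probe_address(uint64_t rsp)
{
    return rsp - PROBE_OFFSET;
}

/* Nonzero if a probe issued at rsp lands inside [base, base + STACK_SIZE). */
static int probe_stays_on_stack(uint64_t base, uint64_t rsp)
{
    uint64_t addr = probe_address(rsp);

    return addr >= base && addr < base + STACK_SIZE;
}

/* Counts the 8-byte-aligned RSP values on the stack whose probe is in bounds.
 * base must be at least PROBE_OFFSET so the subtraction cannot wrap. */
static unsigned count_safe_rsp_values(uint64_t base)
{
    unsigned safe = 0;

    for (uint64_t rsp = base; rsp <= base + STACK_SIZE; rsp += 8)
        safe += probe_stays_on_stack(base, rsp);
    return safe;
}
```

Because the probe offset (4128) is larger than the whole stack (4096), `count_safe_rsp_values()` returns 0 for any base: there is no stack pointer value on such a stack from which the probe stays in bounds.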

So yes, it's a terminally broken compiler from hell. I assume gentoo
has applied some completely broken security patch to their compiler,
turning said compiler into complete garbage.

Doing some trivial grepping on the disassembly in that vmlinux file,
there's tons of those "let's probe more than a page below the stack"
issues. The biggest offset I found was 0x1400.

That one happened to be in do_sys_poll().

> Also, you wouldn't happen to be using Gentoo perchance?

Yes, several people involved are using gentoo. Maybe everybody.

> I already have two reports of a Gentoo system miscompiling the vDSO
> due to Gentoo enabling -fstack-check and GCC generating stack check
> code that is highly suboptimal, actively incorrect, and doesn't even
> manage to check the stack in a particularly helpful way.

Yes. Good. I think you root-caused it.

Good. I was not feeling so happy about this bug report, but now I can
firmly just blame the gentoo compiler for having some shit-for-brains
"feature".

   Linus


Re: 4.14.9 doesn't boot (regression)

2017-12-29 Thread Andy Lutomirski


> On Dec 29, 2017, at 3:53 PM, Linus Torvalds  
> wrote:
> 
>> On Fri, Dec 29, 2017 at 2:30 PM, Toralf Förster  
>> wrote:
>> 
>> The bad news - the issue is not solved with the changed cflags.
>> The good news - I could compile eventually a working config for my desktop  
>> (works fine with 4.14.10 with generic CPU) having a higher screen resolution 
>> during boot.
>> 
>> So I made a "make distclean", followed by a "sudo zcat /proc/config.gz > 
>> .config", changed the .config to use MCORE2 instead of GENERIC and defined 
>> the string "-local" to ensure that the modules directory is really unique.
>> Then I run "time make -j4 && sudo make modules_install && sudo cp 
>> arch/x86_64/boot/bzImage /boot/vmlinuz-0 && sudo grub-mkconfig -o 
>> /boot/grub/grub.cfg", booted and made 3 fotos which were uploaded to [1], 
>> look for IMG_*
> 
> Ok, so what does seem to be consistent for everybody is that
> double-fault in the NMI backtrace.
> 
> So the fact that the NMI always hits on a double-fault does make me
> suspect that it's a infinite stream of double-faults, and that is
> presumably also what causes the RCU timeout.
> 
> And as I pointed out elsewhere (damn two threads), I think that it
> would help to simply catch the *first* double-fault.
> 
> And I *think* that the only thing that can make a double-fault
> silently be re-tried is the CONFIG_X86_ESPFIX64 case, so if you can
> build a failing kernel with the CONFIG_X86_ESPFIX64 case disabled in
> arch/x86/kernel/traps.c do_double_fault(), that would be interesting.

Double faults use IST, so a double fault that double faults will effectively 
just start over rather than eventually running out of stack and triple faulting.

But check out the registers. We have RSP = ...28fd8 and CR2 = ...27f08. IOW the 
double fault stack is ...28000 - ...28fff and we're somehow getting a failed 
page fault a couple hundred bytes below the bottom of the IST stack.  IOW, I 
think we're just stuck in a neverending loop of stack overflows.
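
A small sketch of that arithmetic, using only the low address bits quoted above (the upper bits are elided in the report). This is a user-space model, not kernel code: the macro and function names are invented, and the one-page stack size is taken from the "...28000 - ...28fff" range.

```c
#include <assert.h>
#include <stdint.h>

/* Low bits only, as quoted in the mail; the upper address bits are elided. */
#define IST_BASE  0x28000u  /* lowest address of the #DF stack (one 4 KiB page) */
#define IST_TOP   0x29000u  /* one past the highest address of that stack */
#define RSP_SEEN  0x28fd8u  /* RSP reported in the backtrace */
#define CR2_SEEN  0x27f08u  /* faulting address reported in CR2 */

/* Nonzero if addr lies inside the IST stack page. */
static int on_ist_stack(uint32_t addr)
{
    return addr >= IST_BASE && addr < IST_TOP;
}

/* Distance of an address below the bottom of the IST stack.
 * Only meaningful for addr < IST_BASE. */
static uint32_t bytes_below_stack(uint32_t addr)
{
    assert(addr < IST_BASE);
    return IST_BASE - addr;
}

/* Hypothetical model of IST delivery: every entry reloads RSP from the
 * same fixed slot, no matter how deep the previous attempt got. */
static uint32_t ist_entry_rsp(unsigned attempt)
{
    (void)attempt;  /* the attempt number deliberately has no effect */
    return RSP_SEEN;
}
```

With these numbers `bytes_below_stack(CR2_SEEN)` is 0xf8, i.e. 248 bytes, and `ist_entry_rsp()` is the same for every attempt, which is why the handler can fault forever without ever escalating to a triple fault.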

(Also, Josh, the oops code should have printed the contents of the struct 
pt_regs at the top of the DF stack.  Any idea why it didn't?)

Toralf, can you send the complete output of:

objdump -dr arch/x86/kernel/traps.o

From the build tree of a nonworking kernel?

Also, you wouldn't happen to be using Gentoo perchance?  I already have two 
reports of a Gentoo system miscompiling the vDSO due to Gentoo enabling 
-fstack-check and GCC generating stack check code that is highly suboptimal, 
actively incorrect, and doesn't even manage to check the stack in a 
particularly helpful way.

If this is indeed what's going on, I'm going to try to come up with a patch to 
outright fail the build on these buggy systems.  We could probably fudge the 
build options to avoid the problem, but Gentoo really just needs to fix its 
toolchain.


Re: 4.14.9 doesn't boot (regression)

2017-12-29 Thread Andy Lutomirski


> On Dec 29, 2017, at 3:53 PM, Linus Torvalds  
> wrote:
> 
>> On Fri, Dec 29, 2017 at 2:30 PM, Toralf Förster  
>> wrote:
>> 
>> The bad news - the issue is not solved with the changed cflags.
>> The good news - I could compile eventually a working config for my desktop  
>> (works fine with 4.14.10 with generic CPU) having a higher screen resolution 
>> during boot.
>> 
>> So I made a "make distclean", followed by a "sudo zcat /proc/config.gz > 
>> .config", changed the .config to use MCORE2 instead of GENERIC and defined 
>> the string "-local" to ensure that the modules directory is really unique.
>> Then I run "time make -j4 && sudo make modules_install && sudo cp 
>> arch/x86_64/boot/bzImage /boot/vmlinuz-0 && sudo grub-mkconfig -o 
>> /boot/grub/grub.cfg", booted and made 3 fotos which were uploaded to [1], 
>> look for IMG_*
> 
> Ok, so what does seem to be consistent for everybody is that
> double-fault in the NMI backtrace.
> 
> So the fact that the NMI always hits on a double-fault does make me
> suspect that it's a infinite stream of double-faults, and that is
> presumably also what causes the RCU timeout.
> 
> And as I pointed out elsewhere (damn two threads), I think that it
> would help to simply catch the *first* double-fault.
> 
> And I *think* that the only thing that can make a double-fault
> silently be re-tried is the CONFIG_X86_ESPFIX64 case, so if you can
> build a failing kernel with the CONFIG_X86_ESPFIX64 case disabled in
> arch/x86/kernel/traps.c do_double_fault(), that would be interesting.

Double faults use IST, so a double fault that double faults will effectively 
just start over rather than eventually running out of stack and triple faulting.

But check out the registers. We have RSP = ...28fd8 and CR2 = ...27f08. IOW the 
double fault stack is ...28000 - ...28fff and we're somehow getting a failed 
page fault a couple hundred bytes below the bottom of the IST stack.  IOW, I 
think we're just stuck in a neverending loop of stack overflows.

(Also, Josh, the oops code should have printed the contents of the struct 
pt_regs at the top of the DF stack.  Any idea why it didn't?)

Toralf, can you send the complete output of:

objdump -dr arch/x86/kernel/traps.o

from the build tree of a nonworking kernel?

Also, you wouldn't happen to be using Gentoo perchance?  I already have two 
reports of a Gentoo system miscompiling the vDSO due to Gentoo enabling 
-fstack-check and GCC generating stack check code that is highly suboptimal, 
actively incorrect, and doesn't even manage to check the stack in a 
particularly helpful way.

If this is indeed what's going on, I'm going to try to come up with a patch to 
outright fail the build on these buggy systems.  We could probably fudge the 
build options to avoid the problem, but Gentoo really just needs to fix its 
toolchain.


Re: 4.14.9 doesn't boot (regression)

2017-12-29 Thread Toralf Förster
On 12/29/2017 11:53 PM, Linus Torvalds wrote:
> So just change the
> 
>   #ifdef CONFIG_X86_ESPFIX64
> 
> into a
> 
>   #if 0
> 
> and see if instead of the RCU stall after 20 seconds, you get an
> immediate double fault error report instead?

Well, the 3 IMG_20171230_0008* photos at https://zwiebeltoralf.de/pub/ should show the results.



-- 
Toralf
PGP C4EACDDE 0076E94E


Re: 4.14.9 doesn't boot (regression)

2017-12-29 Thread Linus Torvalds
On Fri, Dec 29, 2017 at 2:30 PM, Toralf Förster wrote:
>
> The bad news - the issue is not solved with the changed cflags.
> The good news - I could eventually compile a working config for my desktop
> (works fine with 4.14.10 with generic CPU) having a higher screen resolution
> during boot.
>
> So I ran "make distclean", followed by "sudo zcat /proc/config.gz >
> .config", changed the .config to use MCORE2 instead of GENERIC and defined
> the string "-local" to ensure that the modules directory is really unique.
> Then I ran "time make -j4 && sudo make modules_install && sudo cp
> arch/x86_64/boot/bzImage /boot/vmlinuz-0 && sudo grub-mkconfig -o
> /boot/grub/grub.cfg", booted and took 3 photos which were uploaded to [1],
> look for IMG_*

Ok, so what does seem to be consistent for everybody is that
double-fault in the NMI backtrace.

So the fact that the NMI always hits on a double-fault does make me
suspect that it's an infinite stream of double-faults, and that is
presumably also what causes the RCU timeout.

And as I pointed out elsewhere (damn two threads), I think that it
would help to simply catch the *first* double-fault.

And I *think* that the only thing that can make a double-fault
silently be re-tried is the CONFIG_X86_ESPFIX64 case, so if you can
build a failing kernel with the CONFIG_X86_ESPFIX64 case disabled in
arch/x86/kernel/traps.c do_double_fault(), that would be interesting.

So just change the

  #ifdef CONFIG_X86_ESPFIX64

into a

  #if 0

and see if instead of the RCU stall after 20 seconds, you get an
immediate double fault error report instead?
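
As a rough illustration of what that switch does, here is a toy model in C. It is not the code in arch/x86/kernel/traps.c: the function name, its argument and the return convention are made up for the example, and only the CONFIG_X86_ESPFIX64 symbol comes from the discussion above.

```c
#include <stdio.h>

/* Stand-in for the kernel config symbol; defined by hand in this toy. */
#define CONFIG_X86_ESPFIX64 1

/*
 * Toy handler: returns 1 if the fault is quietly retried,
 * 0 if it is reported immediately.
 */
int handle_double_fault(int looks_like_espfix_case)
{
#ifdef CONFIG_X86_ESPFIX64  /* change this line to "#if 0" to drop the retry path */
    if (looks_like_espfix_case)
        return 1;           /* silent retry, nothing is printed */
#endif
    fprintf(stderr, "double fault\n");
    return 0;
}
```

With the #ifdef in place, a fault that matches the special case is retried without any message. With "#if 0" the retry branch is compiled out, so every double fault falls through to the report, which is the behaviour the experiment is meant to expose.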

I'm still entirely confused about why that MCORE2 would make _any_
difference what-so-ever, so this is all fishing for random clues in
the dark.

  Linus


Re: 4.14.9 doesn't boot (regression)

2017-12-29 Thread Toralf Förster
On 12/29/2017 10:17 PM, Linus Torvalds wrote:
> On Fri, Dec 29, 2017 at 1:02 PM, Toralf Förster wrote:
>> On 12/29/2017 09:12 PM, Linus Torvalds wrote:
>>> instead, and see if that makes a difference, that would narrow down
>>> the possible root cause of this problem.
>>
>> not at this ThinkPad T440s (didn't test at the server with an i7-3930).
>>
>> Boot stops just at:
>>
>> tsc: Refined TSC clocksource calibration: 2494.225 MHz
>> clocksource: tsc: mask: 0x max_cycles: 
>> 0x23f3ea95b09, max_idle_ns: 440795287034 ns
> 
> Uhhuh. So for Alexander Tsoy, just getting rid of the -march=core2
> fixed the boot.
> 
> But not for you.
> 
> Strange. It really looked like the exact same thing.
> 
>> This is a "Intel(R) Core(TM) i5-4300U CPU @ 1.90GHz" with gcc-6.4
> 
> Yeah, other reporters of this have used gcc-6.4.0 too.
> 
> But there's been some muddying of the waters there too - changing
> compilers has fixed it for some cases, but there's at least one
> report that a kernel build with gcc-7.2.0 still had the issue (and
> another that said it didn't).
> 
> But the MCORE2 was consistent for several people - including you.
> Until this point.
> 
> Strange.
> 
> The only other thing (apart from the compiler flag) that MCORE2
> results in is to enable
> 
>  CONFIG_X86_INTEL_USERCOPY
>  CONFIG_X86_USE_PPRO_CHECKSUM
>  CONFIG_X86_P6_NOP
> 
> and the first two of those shouldn't even matter on x86-64, and I
> don't see that last one making any difference either.
> 
> So because it looks so impossible that the "-march=core2" didn't make
> a difference for you, I'll ask you to please double-check that you
> actually booted into the right kernel.
> 
> Sorry for doubting you, but your report just broke the _one_
> consistent thing we've seen about this bug.
> 
>   Linus
> 


I double-checked it.

The bad news - the issue is not solved with the changed cflags.
The good news - I could eventually compile a working config for my desktop
(works fine with 4.14.10 with generic CPU) having a higher screen resolution
during boot.

So I ran "make distclean", followed by "sudo zcat /proc/config.gz >
.config", changed the .config to use MCORE2 instead of GENERIC and defined the
string "-local" to ensure that the modules directory is really unique.
Then I ran "time make -j4 && sudo make modules_install && sudo cp
arch/x86_64/boot/bzImage /boot/vmlinuz-0 && sudo grub-mkconfig -o
/boot/grub/grub.cfg", booted and took 3 photos which were uploaded to [1], look
for IMG_*

[1] https://zwiebeltoralf.de/pub/


-- 
Toralf
PGP C4EACDDE 0076E94E


Re: 4.14.9 doesn't boot (regression)

2017-12-29 Thread Alexander Tsoy
On Fri, 29/12/2017 at 13:39 -0800, Linus Torvalds wrote:
> On Fri, Dec 29, 2017 at 1:17 PM, Linus Torvalds wrote:
> > 
> > Yeah, other reporters of this have used gcc-6.4.0 too.
> > 
> > But there's been some muddying of the waters there too - changing
> > compilers has fixed it for some cases, but there's at least one
> > report that a kernel build with gcc-7.2.0 still had the issue (and
> > another that said it didn't).
> 
> Side note: I'm not convinced that we will reliably catch a compiler
> version change in our dependency analysis, so it's probably best to
> "make clean" between switching compilers to make sure that you don't
> have old object files with the old compiler.

I did "make clean" after changing compiler flags.


Re: 4.14.9 doesn't boot (regression)

2017-12-29 Thread Linus Torvalds
On Fri, Dec 29, 2017 at 1:17 PM, Linus Torvalds wrote:
>
> Yeah, other reporters of this have used gcc-6.4.0 too.
>
> But there's been some muddying of the waters there too - changing
> compilers has fixed it for some cases, but there's at least one
> report that a kernel build with gcc-7.2.0 still had the issue (and
> another that said it didn't).

Side note: I'm not convinced that we will reliably catch a compiler
version change in our dependency analysis, so it's probably best to
"make clean" between switching compilers to make sure that you don't
have old object files with the old compiler.

> But the MCORE2 was consistent for several people - including you.
> Until this point.

.. and our build infrastructure definitely _should_ catch compiler
switch changes automatically and force a re-build.

  Linus


Re: 4.14.9 doesn't boot (regression)

2017-12-29 Thread Linus Torvalds
On Fri, Dec 29, 2017 at 1:02 PM, Toralf Förster wrote:
> On 12/29/2017 09:12 PM, Linus Torvalds wrote:
>> instead, and see if that makes a difference, that would narrow down
>> the possible root cause of this problem.
>
> not at this ThinkPad T440s (didn't test at the server with an i7-3930).
>
> Boot stops just at:
>
> tsc: Refined TSC clocksource calibration: 2494.225 MHz
> clocksource: tsc: mask: 0x max_cycles: 0x23f3ea95b09, 
> max_idle_ns: 440795287034 ns

Uhhuh. So for Alexander Tsoy, just getting rid of the -march=core2
fixed the boot.

But not for you.

Strange. It really looked like the exact same thing.

> This is a "Intel(R) Core(TM) i5-4300U CPU @ 1.90GHz" with gcc-6.4

Yeah, other reporters of this have used gcc-6.4.0 too.

But there's been some muddying of the waters there too - changing
compilers has fixed it for some cases, but there's at least one
report that a kernel build with gcc-7.2.0 still had the issue (and
another that said it didn't).

But the MCORE2 was consistent for several people - including you.
Until this point.

Strange.

The only other thing (apart from the compiler flag) that MCORE2
results in is to enable

 CONFIG_X86_INTEL_USERCOPY
 CONFIG_X86_USE_PPRO_CHECKSUM
 CONFIG_X86_P6_NOP

and the first two of those shouldn't even matter on x86-64, and I
don't see that last one making any difference either.

So because it looks so impossible that the "-march=core2" didn't make
a difference for you, I'll ask you to please double-check that you
actually booted into the right kernel.

Sorry for doubting you, but your report just broke the _one_
consistent thing we've seen about this bug.

  Linus

