Re: [BUG] Revert 0b05e2d671c4 'powerpc/32: cacheable_memcpy becomes memcpy'

2015-09-21 Thread Michael Ellerman
On Fri, 2015-09-18 at 16:45 +0200, Thomas Gleixner wrote:
> On Fri, 18 Sep 2015, Segher Boessenkool wrote:
> 
> > On Fri, Sep 18, 2015 at 08:01:57PM +1000, Michael Ellerman wrote:
> > > Weird.
> > > 
> > > Can you try:
> > > 
> > >   d690740f22f6 ("powerpc/powernv: Enable LEDS support")
> > > 
> > > That is the commit before my merge of Scott's tree, which had all the 
> > > Freescale
> > > & 32-bit stuff.
> > > 
> > > If that works, it would isolate it to only about 30 commits from Scott's 
> > > tree.
> > 
> > My theory is it is not a kernel problem at all; instead, uboot overwrites
> > some little part of the kernel image.
> 
> u-boot is just loading a cuImage which is a kernel with appended
> device tree. So u-boot is out of the picture,

Unless something is still DMA'ing after u-boot has handed over to the kernel.
We had that happen on OF years ago.

> still investigating (when time allows)

Thanks. I assume no one at Freescale has been able to reproduce this?

cheers


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [BUG] Revert 0b05e2d671c4 'powerpc/32: cacheable_memcpy becomes memcpy'

2015-09-21 Thread Scott Wood
On Mon, 2015-09-21 at 17:10 +1000, Michael Ellerman wrote:
> On Fri, 2015-09-18 at 16:45 +0200, Thomas Gleixner wrote:
> > On Fri, 18 Sep 2015, Segher Boessenkool wrote:
> > 
> > > On Fri, Sep 18, 2015 at 08:01:57PM +1000, Michael Ellerman wrote:
> > > > Weird.
> > > > 
> > > > Can you try:
> > > > 
> > > >   d690740f22f6 ("powerpc/powernv: Enable LEDS support")
> > > > 
> > > > That is the commit before my merge of Scott's tree, which had all the 
> > > > Freescale
> > > > & 32-bit stuff.
> > > > 
> > > > If that works, it would isolate it to only about 30 commits from 
> > > > Scott's tree.
> > > 
> > > My theory is it is not a kernel problem at all; instead, uboot 
> > > overwrites
> > > some little part of the kernel image.
> > 
> > u-boot is just loading a cuImage which is a kernel with appended
> > device tree. So u-boot is out of the picture,
> 
> Unless something is still DMA'ing after u-boot has handed over to the 
> kernel.
> We had that happen on OF years ago.
>
> > still investigating (when time allows)
> 
> Thanks. I assume no one at Freescale has been able to reproduce this?

I don't have an mpc52xx to test on.  I can boot a p4080 with tracing enabled.

-Scott

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [BUG] Revert 0b05e2d671c4 'powerpc/32: cacheable_memcpy becomes memcpy'

2015-09-18 Thread Michael Ellerman
On Thu, 2015-09-17 at 12:13 +0200, Thomas Gleixner wrote:
> On Wed, 16 Sep 2015, Steven Rostedt wrote:
> > On Wed, 16 Sep 2015 22:01:06 +0200 (CEST)
> > Thomas Gleixner  wrote:
> > > So now I have to chase down that one:
> > > 
> > > [0.230210] ftrace: allocating 16560 entries in 49 pages
> > > [0.273313] [ cut here ]
> > > [0.278048] WARNING: at 
> > > /home/tglx/work/kernel/tip/tip/kernel/trace/ftrace.c:1974
> > 
> 
> It's not observable before the ppc merge, but I can't identify the
> culprit by bisection. bisection led into lala land.

Weird.

Can you try:

  d690740f22f6 ("powerpc/powernv: Enable LEDS support")

That is the commit before my merge of Scott's tree, which had all the Freescale
& 32-bit stuff.

If that works, it would isolate it to only about 30 commits from Scott's tree.

cheers


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [BUG] Revert 0b05e2d671c4 'powerpc/32: cacheable_memcpy becomes memcpy'

2015-09-18 Thread Segher Boessenkool
On Fri, Sep 18, 2015 at 08:01:57PM +1000, Michael Ellerman wrote:
> Weird.
> 
> Can you try:
> 
>   d690740f22f6 ("powerpc/powernv: Enable LEDS support")
> 
> That is the commit before my merge of Scott's tree, which had all the 
> Freescale
> & 32-bit stuff.
> 
> If that works, it would isolate it to only about 30 commits from Scott's tree.

My theory is it is not a kernel problem at all; instead, uboot overwrites
some little part of the kernel image.


Segher
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [BUG] Revert 0b05e2d671c4 'powerpc/32: cacheable_memcpy becomes memcpy'

2015-09-18 Thread Thomas Gleixner
On Fri, 18 Sep 2015, Segher Boessenkool wrote:

> On Fri, Sep 18, 2015 at 08:01:57PM +1000, Michael Ellerman wrote:
> > Weird.
> > 
> > Can you try:
> > 
> >   d690740f22f6 ("powerpc/powernv: Enable LEDS support")
> > 
> > That is the commit before my merge of Scott's tree, which had all the 
> > Freescale
> > & 32-bit stuff.
> > 
> > If that works, it would isolate it to only about 30 commits from Scott's 
> > tree.
> 
> My theory is it is not a kernel problem at all; instead, uboot overwrites
> some little part of the kernel image.

u-boot is just loading a cuImage which is a kernel with appended
device tree. So u-boot is out of the picture, still investigating
(when time allows)

Thanks,

tglx
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [BUG] Revert 0b05e2d671c4 'powerpc/32: cacheable_memcpy becomes memcpy'

2015-09-17 Thread Thomas Gleixner
On Wed, 16 Sep 2015, Steven Rostedt wrote:
> On Wed, 16 Sep 2015 22:01:06 +0200 (CEST)
> Thomas Gleixner  wrote:
> > So now I have to chase down that one:
> > 
> > [0.230210] ftrace: allocating 16560 entries in 49 pages
> > [0.273313] [ cut here ]
> > [0.278048] WARNING: at 
> > /home/tglx/work/kernel/tip/tip/kernel/trace/ftrace.c:1974
> 
> OK, so this is where ftrace converts the mcount to nops.
> 
> Just to be clear, there's nothing in the command line that enables any
> function tracing, is there?

Nothing on the command line.

> > Happens somewhere between the powerpc merge and rc1. System boots up
> > to userspace and then locks hard 

Digging deeper. My assumption that it's a post powerpc merge failure
turned out to be wrong.

Some more data points. I see the above splat with

CONFIG_FUNCTION_TRACER=y
CONFIG_FUNCTION_GRAPH_TRACER=y

It goes away with 

CONFIG_FUNCTION_TRACER=y
CONFIG_FUNCTION_GRAPH_TRACER=n

But the box still does not get to the login prompt.

CONFIG_FUNCTION_TRACER=n

makes it work again.

It's not observable before the ppc merge, but I can't identify the
culprit by bisection. bisection led into lala land.

Thanks,

tglx
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

RE: [BUG] Revert 0b05e2d671c4 'powerpc/32: cacheable_memcpy becomes memcpy'

2015-09-17 Thread David Laight
From: Segher Boessenkool
> Sent: 17 September 2015 04:19
> On Thu, Sep 17, 2015 at 12:50:12PM +1000, Michael Ellerman wrote:
> > On Wed, 2015-09-16 at 21:54 -0400, Steven Rostedt wrote:
> > > This could be a symptom and not the problem. What the above shows is
> > > that ftrace tried to convert the mcount at change_protection but what
> > > it expected was there wasn't. Unfortunately, it doesn't state exactly
> > > what it wants (that would take a arch specific function to do that, and
> > > this is in generic code). But what it found was "74 66 74 70", which I
> > > have no idea what type of command that is.
> >
> > This is big endian, so I think that's:
> >
> >   andis.  r6,r3,29808
> >
> > Which is feasible.
> 
> It also says "tftp", which is intriguing if nothing else :-)

Much more likely than the above instruction.
If the address it wass read from is in the dump, look at the entire string.

David

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [BUG] Revert 0b05e2d671c4 'powerpc/32: cacheable_memcpy becomes memcpy'

2015-09-17 Thread Thomas Gleixner
On Thu, 17 Sep 2015, Steven Rostedt wrote:

> On Thu, 17 Sep 2015 16:38:52 +0200 (CEST)
> Thomas Gleixner  wrote:
> 
> > On Thu, 17 Sep 2015, Steven Rostedt wrote:
> > 
> > > On Thu, 17 Sep 2015 12:13:15 +0200 (CEST)
> > > Thomas Gleixner  wrote:
> > > 
> > > > Digging deeper. My assumption that it's a post powerpc merge failure
> > > > turned out to be wrong.
> > > 
> > > Does 4.2 have the problem?
> > 
> > No. Neither does 
> > 
> > 4c92b5bb1422: Merge branch 'pcmcia' of 
> > git://ftp.arm.linux.org.uk/~rmk/linux-arm
> 
> What's the significance of that commit?

It's the commit before the merge of the powerpc tree.

ff474e8ca854: Merge tag 'powerpc-4.3-1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
 
> > It just results in a different failure mode. Instead of silently
> > hanging I get:
> > 
> > [2.248275] Oops: Exception in kernel mode, sig: 4 [#1]
> > [2.253633] PREEMPT lite5200
> > [2.256584] Modules linked in:
> > [2.259723] CPU: 0 PID: 1 Comm: swapper Not tainted 
> > 4.3.0-rc1-51179-gae80a2f-dirty #75
> > [2.267815] task: c383c000 ti: c383a000 task.ti: c383a000
> > [2.273330] NIP: c00e1eec LR: c00df0f4 CTR: 
> > [2.278405] REGS: c383bcd0 TRAP: 0700   Not tainted  
> > (4.3.0-rc1-51179-gae80a2f-dirty)
> > [2.286396] MSR: 00089032   CR: 44824028  XER: 
> > [2.293187] 
> > GPR00: c00def84 c383bd80 c383c000 c3084000 bff1 00677595 c383bdd8 
> >  
> > GPR08: 0001 0001 0400 0002 24828022  c0004254 
> > 84822042 
> > GPR16: 2000 44822042 f000 c3086ffc c06ce248 c383a000 c3082000 
> > c06d 
> > GPR24: c383a000 0ffc 00677595 bff1 c3084000 c3015bfc 0017 
> > c3086000 
> > [2.323656] NIP [c00e1eec] vm_normal_page+0x0/0xdc
> > [2.328560] LR [c00df0f4] follow_page_mask+0x260/0x4fc
> > [2.333807] Call Trace:
> > [2.336321] [c383bd80] [c00def84] follow_page_mask+0xf0/0x4fc 
> > (unreliable)
> > [2.343360] [c383bdd0] [c00df4a4] __get_user_pages.part.28+0x114/0x3e0
> > [2.350050] [c383be30] [c010e788] copy_strings+0x16c/0x2c8
> > [2.355668] [c383bea0] [c010e91c] copy_strings_kernel+0x38/0x50
> > [2.361730] [c383bec0] [c011057c] do_execveat_common+0x440/0x658
> > [2.367877] [c383bf10] [c01107cc] do_execve+0x38/0x48
> > [2.373056] [c383bf20] [c00039f0] try_to_run_init_process+0x24/0x64
> > [2.379469] [c383bf30] [c000430c] kernel_init+0xb8/0x10c
> > [2.384924] [c383bf40] [c0010c40] ret_from_kernel_thread+0x5c/0x64
> > [2.391242] --- interrupt: 0 at   (null)
> > [2.391242] LR =   (null)
> > [2.398263] Instruction dump:
> > [2.401297] 0100 00037000   f000 0001 
> > 0a641e09 acde4823 
> > [2.409237] 000f 179a7b00 07de2900 03ef1480 <01f78a40> 0001c200 
> > 6000 9421fff0 
> 
> Can you objdump this and and see what that is suppose to be.

Certainly not the code at NIP [c00e1eec] vm_normal_page:

c00e1eec :
c00e1eec:   7c 08 02 a6 mflrr0
c00e1ef0:   90 01 00 04 stw r0,4(r1)
c00e1ef4:   4b f2 f6 05 bl  c00114f8 <_mcount>
c00e1ef8:   94 21 ff f0 stwur1,-16(r1)
c00e1efc:   7c 08 02 a6 mflrr0
c00e1f00:   90 01 00 14 stw r0,20(r1)
c00e1f04:   70 a9 08 00 andi.   r9,r5,2048

That looks more like random data corruption.
 
> > [2.417375] ---[ end trace 996fd312ce9c18ce ]---
> > 
> > Again, if I disable CONFIG_TRACER its gone.
> 
> You mean if you disable CONFIG_FUNCTION_TRACER?

I have to disable both to make it boot. Disabling
CONFIG_FUNCTION_TRACER changes the failure mode, but does not make it
go away.
 

> Below is the entire push of ftrace for this merge window. Not much has
> changed. Could using "unsigned long" instead of "long" with the
> MCOUNT_ADDR cause this bug?

No, because the trace merge happened after the powerpc merge. But the
powerpc merge might be a red herring and the whole issue is caused by
something else which just gets unearthed by it.

Thanks,

tglx


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [BUG] Revert 0b05e2d671c4 'powerpc/32: cacheable_memcpy becomes memcpy'

2015-09-17 Thread Steven Rostedt
On Thu, 17 Sep 2015 12:13:15 +0200 (CEST)
Thomas Gleixner  wrote:

> Digging deeper. My assumption that it's a post powerpc merge failure
> turned out to be wrong.

Does 4.2 have the problem?

> 
> Some more data points. I see the above splat with
> 
> CONFIG_FUNCTION_TRACER=y
> CONFIG_FUNCTION_GRAPH_TRACER=y
> 
> It goes away with 
> 
> CONFIG_FUNCTION_TRACER=y
> CONFIG_FUNCTION_GRAPH_TRACER=n

Strange, because function graph tracer should have no effect on the
conversions of mcount calls into nops.


> 
> But the box still does not get to the login prompt.
> 
> CONFIG_FUNCTION_TRACER=n
> 
> makes it work again.
> 
> It's not observable before the ppc merge, but I can't identify the
> culprit by bisection. bisection led into lala land.
> 

If it's a corruption of the mcount tables, it could be specific on what
the compiler does. That is, the working of one kernel to the other, may
only depend on how gcc compiled something. Have you tried different
compilers? Maybe one version of gcc may work over another?

This may explain why turning off function graph made the splat go away.
It changes the way the compiler built the code.

-- Steve
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [BUG] Revert 0b05e2d671c4 'powerpc/32: cacheable_memcpy becomes memcpy'

2015-09-17 Thread Steven Rostedt
On Thu, 17 Sep 2015 16:38:52 +0200 (CEST)
Thomas Gleixner  wrote:

> On Thu, 17 Sep 2015, Steven Rostedt wrote:
> 
> > On Thu, 17 Sep 2015 12:13:15 +0200 (CEST)
> > Thomas Gleixner  wrote:
> > 
> > > Digging deeper. My assumption that it's a post powerpc merge failure
> > > turned out to be wrong.
> > 
> > Does 4.2 have the problem?
> 
> No. Neither does 
> 
> 4c92b5bb1422: Merge branch 'pcmcia' of 
> git://ftp.arm.linux.org.uk/~rmk/linux-arm

What's the significance of that commit?

>  
> > If it's a corruption of the mcount tables, it could be specific on what
> > the compiler does. That is, the working of one kernel to the other, may
> > only depend on how gcc compiled something. Have you tried different
> > compilers? Maybe one version of gcc may work over another?
> 
> It just results in a different failure mode. Instead of silently
> hanging I get:
> 
> [2.248275] Oops: Exception in kernel mode, sig: 4 [#1]
> [2.253633] PREEMPT lite5200
> [2.256584] Modules linked in:
> [2.259723] CPU: 0 PID: 1 Comm: swapper Not tainted 
> 4.3.0-rc1-51179-gae80a2f-dirty #75
> [2.267815] task: c383c000 ti: c383a000 task.ti: c383a000
> [2.273330] NIP: c00e1eec LR: c00df0f4 CTR: 
> [2.278405] REGS: c383bcd0 TRAP: 0700   Not tainted  
> (4.3.0-rc1-51179-gae80a2f-dirty)
> [2.286396] MSR: 00089032   CR: 44824028  XER: 
> [2.293187] 
> GPR00: c00def84 c383bd80 c383c000 c3084000 bff1 00677595 c383bdd8 
>  
> GPR08: 0001 0001 0400 0002 24828022  c0004254 
> 84822042 
> GPR16: 2000 44822042 f000 c3086ffc c06ce248 c383a000 c3082000 
> c06d 
> GPR24: c383a000 0ffc 00677595 bff1 c3084000 c3015bfc 0017 
> c3086000 
> [2.323656] NIP [c00e1eec] vm_normal_page+0x0/0xdc
> [2.328560] LR [c00df0f4] follow_page_mask+0x260/0x4fc
> [2.333807] Call Trace:
> [2.336321] [c383bd80] [c00def84] follow_page_mask+0xf0/0x4fc (unreliable)
> [2.343360] [c383bdd0] [c00df4a4] __get_user_pages.part.28+0x114/0x3e0
> [2.350050] [c383be30] [c010e788] copy_strings+0x16c/0x2c8
> [2.355668] [c383bea0] [c010e91c] copy_strings_kernel+0x38/0x50
> [2.361730] [c383bec0] [c011057c] do_execveat_common+0x440/0x658
> [2.367877] [c383bf10] [c01107cc] do_execve+0x38/0x48
> [2.373056] [c383bf20] [c00039f0] try_to_run_init_process+0x24/0x64
> [2.379469] [c383bf30] [c000430c] kernel_init+0xb8/0x10c
> [2.384924] [c383bf40] [c0010c40] ret_from_kernel_thread+0x5c/0x64
> [2.391242] --- interrupt: 0 at   (null)
> [2.391242] LR =   (null)
> [2.398263] Instruction dump:
> [2.401297] 0100 00037000   f000 0001 0a641e09 
> acde4823 
> [2.409237] 000f 179a7b00 07de2900 03ef1480 <01f78a40> 0001c200 
> 6000 9421fff0 

Can you objdump this and and see what that is suppose to be.

> [2.417375] ---[ end trace 996fd312ce9c18ce ]---
> 
> Again, if I disable CONFIG_TRACER its gone.

You mean if you disable CONFIG_FUNCTION_TRACER?

Below is the entire push of ftrace for this merge window. Not much has
changed. Could using "unsigned long" instead of "long" with the
MCOUNT_ADDR cause this bug?

-- Steve

diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt
index 7ddb1e319f84..072d3c4d5753 100644
--- a/Documentation/trace/ftrace.txt
+++ b/Documentation/trace/ftrace.txt
@@ -686,6 +686,8 @@ The above is mostly meaningful for kernel developers.
 The marks are determined by the difference between this
 current trace and the next trace.
  '$' - greater than 1 second
+ '@' - greater than 100 milisecond
+ '*' - greater than 10 milisecond
  '#' - greater than 1000 microsecond
  '!' - greater than 100 microsecond
  '+' - greater than 10 microsecond
@@ -1939,26 +1941,49 @@ want, depending on your needs.
 
   ie:
 
-  0)   |up_write() {
-  0)   0.646 us|  _spin_lock_irqsave();
-  0)   0.684 us|  _spin_unlock_irqrestore();
-  0)   3.123 us|}
-  0)   0.548 us|fput();
-  0) + 58.628 us   |  }
+  3) # 1837.709 us |  } /* __switch_to */
+  3)   |  finish_task_switch() {
+  3)   0.313 us|_raw_spin_unlock_irq();
+  3)   3.177 us|  }
+  3) # 1889.063 us |} /* __schedule */
+  3) ! 140.417 us  |  } /* __schedule */
+  3) # 2034.948 us |} /* schedule */
+  3) * 33998.59 us |  } /* schedule_preempt_disabled */
 
   [...]
 
-  0)   |  putname() {
-  0)   |kmem_cache_free() {
-  0)   0.518 us|  __phys_addr();
-  0)   1.757 us|}
-  0)   2.861 us|  }
-  0) ! 115.305 us  |}
-  0) ! 116.402 us  |  }
+  1)   0.260 us|  msecs_to_jiffies();
+  1)   0.313 us|  __rcu_read_unlock();
+  1) + 61.770 us   |}
+  1) + 64.479 us   |  

Re: [BUG] Revert 0b05e2d671c4 'powerpc/32: cacheable_memcpy becomes memcpy'

2015-09-17 Thread Thomas Gleixner
On Thu, 17 Sep 2015, Steven Rostedt wrote:

> On Thu, 17 Sep 2015 12:13:15 +0200 (CEST)
> Thomas Gleixner  wrote:
> 
> > Digging deeper. My assumption that it's a post powerpc merge failure
> > turned out to be wrong.
> 
> Does 4.2 have the problem?

No. Neither does 

4c92b5bb1422: Merge branch 'pcmcia' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
 
> If it's a corruption of the mcount tables, it could be specific on what
> the compiler does. That is, the working of one kernel to the other, may
> only depend on how gcc compiled something. Have you tried different
> compilers? Maybe one version of gcc may work over another?

It just results in a different failure mode. Instead of silently
hanging I get:

[2.248275] Oops: Exception in kernel mode, sig: 4 [#1]
[2.253633] PREEMPT lite5200
[2.256584] Modules linked in:
[2.259723] CPU: 0 PID: 1 Comm: swapper Not tainted 
4.3.0-rc1-51179-gae80a2f-dirty #75
[2.267815] task: c383c000 ti: c383a000 task.ti: c383a000
[2.273330] NIP: c00e1eec LR: c00df0f4 CTR: 
[2.278405] REGS: c383bcd0 TRAP: 0700   Not tainted  
(4.3.0-rc1-51179-gae80a2f-dirty)
[2.286396] MSR: 00089032   CR: 44824028  XER: 
[2.293187] 
GPR00: c00def84 c383bd80 c383c000 c3084000 bff1 00677595 c383bdd8  
GPR08: 0001 0001 0400 0002 24828022  c0004254 84822042 
GPR16: 2000 44822042 f000 c3086ffc c06ce248 c383a000 c3082000 c06d 
GPR24: c383a000 0ffc 00677595 bff1 c3084000 c3015bfc 0017 c3086000 
[2.323656] NIP [c00e1eec] vm_normal_page+0x0/0xdc
[2.328560] LR [c00df0f4] follow_page_mask+0x260/0x4fc
[2.333807] Call Trace:
[2.336321] [c383bd80] [c00def84] follow_page_mask+0xf0/0x4fc (unreliable)
[2.343360] [c383bdd0] [c00df4a4] __get_user_pages.part.28+0x114/0x3e0
[2.350050] [c383be30] [c010e788] copy_strings+0x16c/0x2c8
[2.355668] [c383bea0] [c010e91c] copy_strings_kernel+0x38/0x50
[2.361730] [c383bec0] [c011057c] do_execveat_common+0x440/0x658
[2.367877] [c383bf10] [c01107cc] do_execve+0x38/0x48
[2.373056] [c383bf20] [c00039f0] try_to_run_init_process+0x24/0x64
[2.379469] [c383bf30] [c000430c] kernel_init+0xb8/0x10c
[2.384924] [c383bf40] [c0010c40] ret_from_kernel_thread+0x5c/0x64
[2.391242] --- interrupt: 0 at   (null)
[2.391242] LR =   (null)
[2.398263] Instruction dump:
[2.401297] 0100 00037000   f000 0001 0a641e09 
acde4823 
[2.409237] 000f 179a7b00 07de2900 03ef1480 <01f78a40> 0001c200 6000 
9421fff0 
[2.417375] ---[ end trace 996fd312ce9c18ce ]---

Again, if I disable CONFIG_TRACER its gone.

Thanks,

tglx

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [BUG] Revert 0b05e2d671c4 'powerpc/32: cacheable_memcpy becomes memcpy'

2015-09-17 Thread Segher Boessenkool
On Thu, Sep 17, 2015 at 09:47:39AM +, David Laight wrote:
> > It also says "tftp", which is intriguing if nothing else :-)
> 
> Much more likely than the above instruction.
> If the address it wass read from is in the dump, look at the entire string.

And tell us what network drivers you use, what firmware, what bootloader.
Try changing either of those (if feasible), see if the problem goes away.


Segher
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [BUG] Revert 0b05e2d671c4 'powerpc/32: cacheable_memcpy becomes memcpy'

2015-09-17 Thread Segher Boessenkool
On Thu, Sep 17, 2015 at 04:38:52PM +0200, Thomas Gleixner wrote:
> [2.398263] Instruction dump:
> [2.401297] 0100 00037000   f000 0001 0a641e09 
> acde4823 
> [2.409237] 000f 179a7b00 07de2900 03ef1480 <01f78a40> 0001c200 
> 6000 9421fff0 

Those are not instructions (until the nop).

Starting three back from the failing instruction, those are

39600 13200 6600 <3300> 115200

so something has been scribbling what looks like clock frequencies
over your code.  Judging from the register contents this has happened
some time before.

Some device trees have similar numbers, e.g. media5200.dts does.


Segher
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [BUG] Revert 0b05e2d671c4 'powerpc/32: cacheable_memcpy becomes memcpy'

2015-09-16 Thread Scott Wood
On Wed, 2015-09-16 at 18:23 +0200, Thomas Gleixner wrote:
> With this commit applied, my MPC5200 board fails to boot 4.3-rc1. The
> boot failure is silent.
> 
> Bisection pointed straight to this commit. Reverting it on top of
> 4.3-rc1 works like a charm.

Yes, we've been iterating on a fix for this, and I think the patches posted 
earlier today are ready to go in:

http://patchwork.ozlabs.org/patch/518321/
http://patchwork.ozlabs.org/patch/518323/

-Scott

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [BUG] Revert 0b05e2d671c4 'powerpc/32: cacheable_memcpy becomes memcpy'

2015-09-16 Thread Thomas Gleixner
On Wed, 16 Sep 2015, Scott Wood wrote:

> On Wed, 2015-09-16 at 18:23 +0200, Thomas Gleixner wrote:
> > With this commit applied, my MPC5200 board fails to boot 4.3-rc1. The
> > boot failure is silent.
> > 
> > Bisection pointed straight to this commit. Reverting it on top of
> > 4.3-rc1 works like a charm.
> 
> Yes, we've been iterating on a fix for this, and I think the patches posted 
> earlier today are ready to go in:
> 
> http://patchwork.ozlabs.org/patch/518321/
> http://patchwork.ozlabs.org/patch/518323/

Tested-by: Thomas Gleixner 

So now I have to chase down that one:

[0.230210] ftrace: allocating 16560 entries in 49 pages
[0.273313] [ cut here ]
[0.278048] WARNING: at 
/home/tglx/work/kernel/tip/tip/kernel/trace/ftrace.c:1974
[0.285653] Modules linked in:
[0.288781] CPU: 0 PID: 0 Comm: swapper Not tainted 
4.2.0-50528-gfac33bf-dirty #29
[0.296485] task: c06563e0 ti: c06a2000 task.ti: c06a2000
[0.301976] NIP: c008df9c LR: c008e440 CTR: 
[0.307027] REGS: c06a3e80 TRAP: 0700   Not tainted  
(4.2.0-50528-gfac33bf-dirty)
[0.314631] MSR: 00021032   CR: 22028022  XER: 
[0.321126]
GPR00: 0001 c06a3f30 c06563e0 ffea c3888c88 0004 c06a3ef8 764d9364
GPR08: 00407584 c069df92  0001 22022028  03fb9000 007fff00
GPR16: c065debc 9032 c06b8aa0 c0011184  c06a711c 0bb6 c38021a0
GPR24:  8c88 0bb6 c06a7120 c3888c88 101d6120 c3888c88 c00e1f2c
[0.351444] NIP [c008df9c] ftrace_bug+0xbc/0x2d4
[0.356160] LR [c008e440] ftrace_process_locs+0x28c/0x6cc
[0.361652] Call Trace:
[0.364169] [c06a3f30] [c00971a0] buffer_ftrace_now.isra.50+0x28/0x64 
(unreliable)
[0.371888] [c06a3f60] [c008e440] ftrace_process_locs+0x28c/0x6cc
[0.378119] [c06a3fb0] [c0610848] ftrace_init+0xac/0x14c
[0.383559] [c06a3fc0] [c06051fc] start_kernel+0x390/0x3a8
[0.389147] [c06a3ff0] [3438] 0x3438
[0.393136] Instruction dump:
[0.396154] 483f9951 3c60c059 7fe4fb78 3863b768 7fe5fb78 483f993d 41b20214 
48000144
[0.404060] 3d20c06a 3929df92 88090002 6801 <0f00> 2f80 409e01b4 
3d20c06a
[0.412203] ---[ end trace dc8fa200cb88537f ]---
[0.416912] ftrace failed to modify [] change_protection+0x8/0x2c
[0.424015]  actual: 74:66:74:70
[0.427324] ftrace record flags: 0
[0.430785]  (0)   expected tramp: c0011198

Happens somewhere between the powerpc merge and rc1. System boots up
to userspace and then locks hard 

Thanks,

tglx
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [BUG] Revert 0b05e2d671c4 'powerpc/32: cacheable_memcpy becomes memcpy'

2015-09-16 Thread Michael Ellerman
On Wed, 2015-09-16 at 11:29 -0500, Scott Wood wrote:
> On Wed, 2015-09-16 at 18:23 +0200, Thomas Gleixner wrote:
> > With this commit applied, my MPC5200 board fails to boot 4.3-rc1. The
> > boot failure is silent.
> > 
> > Bisection pointed straight to this commit. Reverting it on top of
> > 4.3-rc1 works like a charm.
> 
> Yes, we've been iterating on a fix for this, and I think the patches posted 
> earlier today are ready to go in:
> 
> http://patchwork.ozlabs.org/patch/518321/
> http://patchwork.ozlabs.org/patch/518323/

Yeah, sorry it took a few iterations, maybe we should have reverted first.

Anyway I'll take that as an ack from you Scott and put these in my fixes branch
with Thomas' Tested-by.

cheers


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [BUG] Revert 0b05e2d671c4 'powerpc/32: cacheable_memcpy becomes memcpy'

2015-09-16 Thread Steven Rostedt
On Wed, 16 Sep 2015 22:01:06 +0200 (CEST)
Thomas Gleixner  wrote:

> On Wed, 16 Sep 2015, Scott Wood wrote:
> 
> > On Wed, 2015-09-16 at 18:23 +0200, Thomas Gleixner wrote:
> > > With this commit applied, my MPC5200 board fails to boot 4.3-rc1. The
> > > boot failure is silent.
> > > 
> > > Bisection pointed straight to this commit. Reverting it on top of
> > > 4.3-rc1 works like a charm.
> > 
> > Yes, we've been iterating on a fix for this, and I think the patches posted 
> > earlier today are ready to go in:
> > 
> > http://patchwork.ozlabs.org/patch/518321/
> > http://patchwork.ozlabs.org/patch/518323/
> 
> Tested-by: Thomas Gleixner 
> 
> So now I have to chase down that one:
> 
> [0.230210] ftrace: allocating 16560 entries in 49 pages
> [0.273313] [ cut here ]
> [0.278048] WARNING: at 
> /home/tglx/work/kernel/tip/tip/kernel/trace/ftrace.c:1974

OK, so this is where ftrace converts the mcount to nops.

Just to be clear, there's nothing in the command line that enables any
function tracing, is there?

> [0.285653] Modules linked in:
> [0.288781] CPU: 0 PID: 0 Comm: swapper Not tainted 
> 4.2.0-50528-gfac33bf-dirty #29
> [0.296485] task: c06563e0 ti: c06a2000 task.ti: c06a2000
> [0.301976] NIP: c008df9c LR: c008e440 CTR: 
> [0.307027] REGS: c06a3e80 TRAP: 0700   Not tainted  
> (4.2.0-50528-gfac33bf-dirty)
> [0.314631] MSR: 00021032   CR: 22028022  XER: 
> [0.321126]
> GPR00: 0001 c06a3f30 c06563e0 ffea c3888c88 0004 c06a3ef8 764d9364
> GPR08: 00407584 c069df92  0001 22022028  03fb9000 007fff00
> GPR16: c065debc 9032 c06b8aa0 c0011184  c06a711c 0bb6 c38021a0
> GPR24:  8c88 0bb6 c06a7120 c3888c88 101d6120 c3888c88 c00e1f2c
> [0.351444] NIP [c008df9c] ftrace_bug+0xbc/0x2d4
> [0.356160] LR [c008e440] ftrace_process_locs+0x28c/0x6cc
> [0.361652] Call Trace:
> [0.364169] [c06a3f30] [c00971a0] buffer_ftrace_now.isra.50+0x28/0x64 
> (unreliable)
> [0.371888] [c06a3f60] [c008e440] ftrace_process_locs+0x28c/0x6cc
> [0.378119] [c06a3fb0] [c0610848] ftrace_init+0xac/0x14c
> [0.383559] [c06a3fc0] [c06051fc] start_kernel+0x390/0x3a8
> [0.389147] [c06a3ff0] [3438] 0x3438
> [0.393136] Instruction dump:
> [0.396154] 483f9951 3c60c059 7fe4fb78 3863b768 7fe5fb78 483f993d 41b20214 
> 48000144
> [0.404060] 3d20c06a 3929df92 88090002 6801 <0f00> 2f80 
> 409e01b4 3d20c06a
> [0.412203] ---[ end trace dc8fa200cb88537f ]---
> [0.416912] ftrace failed to modify [] change_protection+0x8/0x2c
> [0.424015]  actual: 74:66:74:70
> [0.427324] ftrace record flags: 0
> [0.430785]  (0)   expected tramp: c0011198
> 
> Happens somewhere between the powerpc merge and rc1. System boots up
> to userspace and then locks hard 

This could be a symptom and not the problem. What the above shows is
that ftrace tried to convert the mcount at change_protection but what
it expected was there wasn't. Unfortunately, it doesn't state exactly
what it wants (that would take a arch specific function to do that, and
this is in generic code). But what it found was "74 66 74 70", which I
have no idea what type of command that is.

-- Steve
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [BUG] Revert 0b05e2d671c4 'powerpc/32: cacheable_memcpy becomes memcpy'

2015-09-16 Thread Michael Ellerman
On Wed, 2015-09-16 at 21:54 -0400, Steven Rostedt wrote:
> On Wed, 16 Sep 2015 22:01:06 +0200 (CEST)
> Thomas Gleixner  wrote:
> 
> > On Wed, 16 Sep 2015, Scott Wood wrote:
> > 
> > > On Wed, 2015-09-16 at 18:23 +0200, Thomas Gleixner wrote:
> > > > With this commit applied, my MPC5200 board fails to boot 4.3-rc1. The
> > > > boot failure is silent.
> > > > 
> > > > Bisection pointed straight to this commit. Reverting it on top of
> > > > 4.3-rc1 works like a charm.
> > > 
> > > Yes, we've been iterating on a fix for this, and I think the patches 
> > > posted 
> > > earlier today are ready to go in:
> > > 
> > > http://patchwork.ozlabs.org/patch/518321/
> > > http://patchwork.ozlabs.org/patch/518323/
> > 
> > Tested-by: Thomas Gleixner 
> > 
> > So now I have to chase down that one:
> > 
> > [0.230210] ftrace: allocating 16560 entries in 49 pages
> > [0.273313] [ cut here ]
> > [0.278048] WARNING: at 
> > /home/tglx/work/kernel/tip/tip/kernel/trace/ftrace.c:1974
> 
> OK, so this is where ftrace converts the mcount to nops.
> 
> Just to be clear, there's nothing in the command line that enables any
> function tracing, is there?
> 
> > [0.285653] Modules linked in:
> > [0.288781] CPU: 0 PID: 0 Comm: swapper Not tainted 
> > 4.2.0-50528-gfac33bf-dirty #29
> > [0.296485] task: c06563e0 ti: c06a2000 task.ti: c06a2000
> > [0.301976] NIP: c008df9c LR: c008e440 CTR: 
> > [0.307027] REGS: c06a3e80 TRAP: 0700   Not tainted  
> > (4.2.0-50528-gfac33bf-dirty)
> > [0.314631] MSR: 00021032   CR: 22028022  XER: 
> > [0.321126]
> > GPR00: 0001 c06a3f30 c06563e0 ffea c3888c88 0004 c06a3ef8 
> > 764d9364
> > GPR08: 00407584 c069df92  0001 22022028  03fb9000 
> > 007fff00
> > GPR16: c065debc 9032 c06b8aa0 c0011184  c06a711c 0bb6 
> > c38021a0
> > GPR24:  8c88 0bb6 c06a7120 c3888c88 101d6120 c3888c88 
> > c00e1f2c
> > [0.351444] NIP [c008df9c] ftrace_bug+0xbc/0x2d4
> > [0.356160] LR [c008e440] ftrace_process_locs+0x28c/0x6cc
> > [0.361652] Call Trace:
> > [0.364169] [c06a3f30] [c00971a0] buffer_ftrace_now.isra.50+0x28/0x64 
> > (unreliable)
> > [0.371888] [c06a3f60] [c008e440] ftrace_process_locs+0x28c/0x6cc
> > [0.378119] [c06a3fb0] [c0610848] ftrace_init+0xac/0x14c
> > [0.383559] [c06a3fc0] [c06051fc] start_kernel+0x390/0x3a8
> > [0.389147] [c06a3ff0] [3438] 0x3438
> > [0.393136] Instruction dump:
> > [0.396154] 483f9951 3c60c059 7fe4fb78 3863b768 7fe5fb78 483f993d 
> > 41b20214 48000144
> > [0.404060] 3d20c06a 3929df92 88090002 6801 <0f00> 2f80 
> > 409e01b4 3d20c06a
> > [0.412203] ---[ end trace dc8fa200cb88537f ]---
> > [0.416912] ftrace failed to modify [] 
> > change_protection+0x8/0x2c
> > [0.424015]  actual: 74:66:74:70
> > [0.427324] ftrace record flags: 0
> > [0.430785]  (0)   expected tramp: c0011198
> > 
> > Happens somewhere between the powerpc merge and rc1. System boots up
> > to userspace and then locks hard 
> 
> This could be a symptom and not the problem. What the above shows is
> that ftrace tried to convert the mcount at change_protection but what
> it expected was there wasn't. Unfortunately, it doesn't state exactly
> what it wants (that would take a arch specific function to do that, and
> this is in generic code). But what it found was "74 66 74 70", which I
> have no idea what type of command that is.

This is big endian, so I think that's:

  andis.  r6,r3,29808

Which is feasible.

Can you objdump the vmlinux? Mine looks like:

  c0157cf0 :
  c0157cf0: 7c 08 02 a6 mflrr0
  c0157cf4: 90 01 00 04 stw r0,4(r1)
  c0157cf8: 4b eb 9b f9 bl  c00118f0 <_mcount>
  c0157d00: 7c 08 02 a6 mflrr0
  c0157d04: 7d 80 00 26 mfcrr12
  c0157d08: 90 01 00 64 stw r0,100(r1)
  c0157d0c: bd c1 00 18 stmwr14,24(r1)
  c0157d10: 91 81 00 14 stw r12,20(r1)
  c0157d14: 81 23 00 28 lwz r9,40(r3)
  c0157d18: 75 2a 00 40 andis.  r10,r9,64


cheers


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [BUG] Revert 0b05e2d671c4 'powerpc/32: cacheable_memcpy becomes memcpy'

2015-09-16 Thread Segher Boessenkool
On Thu, Sep 17, 2015 at 12:50:12PM +1000, Michael Ellerman wrote:
> On Wed, 2015-09-16 at 21:54 -0400, Steven Rostedt wrote:
> > This could be a symptom and not the problem. What the above shows is
> > that ftrace tried to convert the mcount at change_protection but what
> > it expected was there wasn't. Unfortunately, it doesn't state exactly
> > what it wants (that would take a arch specific function to do that, and
> > this is in generic code). But what it found was "74 66 74 70", which I
> > have no idea what type of command that is.
> 
> This is big endian, so I think that's:
> 
>   andis.  r6,r3,29808
> 
> Which is feasible.

It also says "tftp", which is intriguing if nothing else :-)


Segher
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev