Re: Kernel panic on PowerMac G5 while scanning for SMU sensors

2013-03-17 Thread Phileas Fogg

On 03/17/2013 01:07 PM, Phileas Fogg wrote:


I wanted to read the temperature sensors of my PowerMac G5 11,2.
To do that, I installed the lm-sensors package and ran the 'sensors-detect'
command, and my Linux 3.8.2 kernel panicked in

_smu_i2c_low_completion_

with this message

'Unable to handle kernel paging request for data at address 0x0008.'

Regards
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Further analysis showed that it crashes in _smu_i2c_complete_command_,
which is called from _smu_i2c_low_completion_.

This line caused the panic:

list_del(&cmd->link);
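
One possible reading of the fault address, offered as a sketch rather than a diagnosis: on a 64-bit kernel the prev pointer sits 8 bytes into struct list_head, so unlinking a node whose next pointer is NULL would store to address 0x8. The userspace model below only mirrors the layout of the kernel's struct list_head; null_next_store_offset is a hypothetical helper, and whether cmd->link really was zeroed (for example by a command completing twice, or never being queued) is an assumption this thread does not confirm.

```c
#include <stddef.h>

/* Userspace copy of the kernel's doubly linked list node layout. */
struct list_head {
	struct list_head *next;
	struct list_head *prev;
};

/*
 * Unlinking a node does "next->prev = prev". If next is NULL, that store
 * lands at address 0 + offsetof(struct list_head, prev).
 * Hypothetical helper: returns that offset (8 on LP64 targets such as ppc64).
 */
size_t null_next_store_offset(void)
{
	return offsetof(struct list_head, prev);
}
```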

regards


Kernel panic on PowerMac G5 while scanning for SMU sensors

2013-03-17 Thread Phileas Fogg


I wanted to read the temperature sensors of my PowerMac G5 11,2.
To do that, I installed the lm-sensors package and ran the 'sensors-detect'
command, and my Linux 3.8.2 kernel panicked in


_smu_i2c_low_completion_

with this message

'Unable to handle kernel paging request for data at address 0x0008.'

Regards


Re: Linux kernel 3.x problems on PowerMac G5

2013-03-11 Thread Phileas Fogg

On 03/10/2013 01:52 PM, Benjamin Herrenschmidt wrote:

On Sun, 2013-03-10 at 11:53 +0100, Phileas Fogg wrote:

Good news :) I found the bug.
MMU features were not set properly for PPC970MP DD1.0 which,
unfortunately, my machine has.
Damn, one line fix but one week searching.
Linux 3.8.2 boots without problems now :)


Nice one ! I didn't think anybody shipped a DD1.0 chip ! :-)

Looks like some typo/thinko in the cputable and you are one of the very
rare victims of it. I'll fix that up. Regarding the IDE problem, can you
shoot a note to Tejun who wrote that patch (and CC me) ? I do recommend
switching to libata but we should still fix the problem with legacy IDE.

Cheers,
Ben.





The IDE-CD bug appears to be already fixed. I tested g5_defconfig with
Linux 3.8.2 and it boots OK. I think I saw a commit regarding this issue
in git.


Regards


Re: Linux kernel 3.x problems on PowerMac G5

2013-03-10 Thread Phileas Fogg

Good news :) I found the bug.
MMU features were not set properly for PPC970MP DD1.0 which,
unfortunately, my machine has.
Damn, one line fix but one week searching.
Linux 3.8.2 boots without problems now :)


Here is my patch:

--- arch/powerpc/kernel/cputable.c.old  2013-03-10 11:48:56.559480758 +0100
+++ arch/powerpc/kernel/cputable.c  2013-03-10 11:41:07.522786804 +0100
@@ -275,7 +275,7 @@
.cpu_features   = CPU_FTRS_PPC970,
.cpu_user_features  = COMMON_USER_POWER4 |
PPC_FEATURE_HAS_ALTIVEC_COMP,
-   .mmu_features   = MMU_FTR_HPTE_TABLE,
+   .mmu_features   = MMU_FTRS_PPC970,
.icache_bsize   = 128,
.dcache_bsize   = 128,
.num_pmcs   = 8,



Regards


Re: Linux kernel 3.x problems on PowerMac G5

2013-03-10 Thread Phileas Fogg

On 03/10/2013 01:45 AM, Benjamin Herrenschmidt wrote:

On Sun, 2013-03-10 at 01:26 +0100, Phileas Fogg wrote:


i managed to find the bad commit after a couple of days bisecting.


Thanks !



44ae3ab3358e962039c36ad4ae461ae9fb29596c is the first bad commit
commit 44ae3ab3358e962039c36ad4ae461ae9fb29596c
Author: Matt Evans 
Date:   Wed Apr 6 19:48:50 2011 +

  powerpc: Free up some CPU feature bits by moving out MMU-related
features

  Some of the 64bit PPC CPU features are MMU-related, so this patch moves
  them to MMU_FTR_ bits.  All cpu_has_feature()-style tests are moved to
  mmu_has_feature(), and seven feature bits are freed as a result.

  Signed-off-by: Matt Evans 
  Signed-off-by: Benjamin Herrenschmidt 



Have you verified that if you checkout git at the above commit point, it
fails and if you then just revert that commit on top, it works again ?

The above should have been mostly a NOP change but I'll have a closer
look in case a typo of some kind actually broke something.


Actually, there are 2 problems i found.
The first problem occurs when i enable IDE CDROM driver on my machine.
The following commit causes hangs on my machine at boot:


Ok. You may want to switch to the new libata instead of the old IDE
driver too

(CONFIG_IDE off, CONFIG_ATA on, CONFIG_PATA_MACIO on and from there it
will use the SCSI CDROM driver).


--
commit 5b03a1b140e13a28ff6be1526892a9dc538ddef6
Author: Tejun Heo 
Date:   Wed Mar 9 19:54:27 2011 +0100

  ide: Convert to bdops->check_events()

  Convert ->media_changed() to the new ->check_events() method.  The
  conversion is mostly mechanical.  The only notable change is that
  cdrom now doesn't generate any event if @slot_nr isn't CDSL_CURRENT.
  It used to return -EINVAL which would be treated as media changed.  As
  media changer isn't supported anyway, this doesn't make any
  difference.

  This makes ide emit the standard disk events and allows kernel event
  polling.  Currently, only MEDIA_CHANGE event is implemented.  Adding
  support for EJECT_REQUEST shouldn't be difficult; however, given that
  ide driver is already deprecated, it probably is best to leave it
  alone.

  Signed-off-by: Tejun Heo 





If i disable IDE CDROM driver then the Linux kernel boots again
and then it hits the commit 44ae3ab3358e962039c36ad4ae461ae9fb29596c
and hangs again :)

The commit eca590f402332ab873d13f2d8d00fa0b91cfff36 which is before
the commit 44ae3ab3358e962039c36ad4ae461ae9fb29596c works fine,
i tested it myself to be on the safe side.


Ok thanks. I'll dig a bit if I get a chance next week.

Cheers,
Ben.



Regards







I'm trying to debug the problem and printed the CPU and MMU features
before and after this bad commit. I think I found the problem;
at least I got very strange results.

I added the following code to arch/powerpc/kernel/setup_64.c:setup_system

printk("cpu_features %lx\n", cur_cpu_spec->cpu_features);
printk("mmu_features %lx\n", cur_cpu_spec->mmu_features);



CPU features before 44ae3ab3358e962039c36ad4ae461ae9fb29596c commit:

cpu_features 0x24000c718100448


CPU and MMU features after 44ae3ab3358e962039c36ad4ae461ae9fb29596c commit:

cpu_features 0x18100448
mmu_features 0x0001

In the second case, the MMU features have only the MMU_FTR_HPTE_TABLE bit set.
Where are the bits MMU_FTR_SLB, MMU_FTR_16M_PAGE and MMU_FTR_TLBIEL,
which were introduced in commit 44ae3ab3358e962039c36ad4ae461ae9fb29596c?

They should be set, if I read arch/powerpc/include/asm/mmu.h correctly.

#define MMU_FTR_PPCAS_ARCH_V2   (MMU_FTR_SLB | MMU_FTR_TLBIEL | \
 MMU_FTR_16M_PAGE)
/* MMU feature bit sets for various CPUs */
#define MMU_FTRS_DEFAULT_HPTE_ARCH_V2   \
MMU_FTR_HPTE_TABLE | MMU_FTR_PPCAS_ARCH_V2
#define MMU_FTRS_POWER4 MMU_FTRS_DEFAULT_HPTE_ARCH_V2
#define MMU_FTRS_PPC970 MMU_FTRS_POWER4
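
To make the effect concrete, here is a small userspace sketch of the feature test. Only the value 0x1 for MMU_FTR_HPTE_TABLE is taken from the printk output above; the other bit values and the has_mmu_feature helper are placeholders invented for illustration and are not the kernel's definitions.

```c
#include <stdbool.h>

/* 0x1 matches the "mmu_features 0x0001" output above. */
#define MMU_FTR_HPTE_TABLE	0x00000001UL
/* Placeholder bit positions, NOT the real kernel values. */
#define MMU_FTR_SLB		0x00000100UL
#define MMU_FTR_TLBIEL		0x00000200UL
#define MMU_FTR_16M_PAGE	0x00000400UL

#define MMU_FTR_PPCAS_ARCH_V2	(MMU_FTR_SLB | MMU_FTR_TLBIEL | MMU_FTR_16M_PAGE)
#define MMU_FTRS_PPC970		(MMU_FTR_HPTE_TABLE | MMU_FTR_PPCAS_ARCH_V2)

/*
 * Hypothetical stand-in for mmu_has_feature(): true only if every bit in
 * 'feature' is set in 'mmu_features'.
 */
bool has_mmu_feature(unsigned long mmu_features, unsigned long feature)
{
	return (mmu_features & feature) == feature;
}
```

With mmu_features equal to MMU_FTR_HPTE_TABLE alone, the helper reports no SLB, no tlbiel and no 16M pages; with MMU_FTRS_PPC970 all three tests succeed.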


Hope it helps to fix the problem.

regards


Re: Linux kernel 3.x problems on PowerMac G5

2013-03-09 Thread Phileas Fogg

On 03/10/2013 01:45 AM, Benjamin Herrenschmidt wrote:



Have you verified that if you checkout git at the above commit point, it
fails and if you then just revert that commit on top, it works again ?

The above should have been mostly a NOP change but I'll have a closer
look in case a typo of some kind actually broke something.



Verified, the commit 65f47f1339dfcffcd5837a307172fb41aa39e479 hangs too.
After reverting the changes of the commit 
44ae3ab3358e962039c36ad4ae461ae9fb29596c, it is booting fine again.


regards



Re: Linux kernel 3.x problems on PowerMac G5

2013-03-09 Thread Phileas Fogg

On 03/07/2013 09:22 PM, Benjamin Herrenschmidt wrote:

On Thu, 2013-03-07 at 21:08 +0100, Phileas Fogg wrote:

And the bisect couldn't find the commit which causes hangs on my
machine.
All commits which were provided by the bisect were bad.
And the commit before the last bad bisect commit was bad too.
I did bisect several times, and got the same results.

For testing I used the linux-3.0.y branch of
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git.

Did I miss something or do something wrong here?


Did git bisect go down a merge commit ? It does for me if I try that and
asks to test that merge first. If you get that wrong it can get very
confused.

That's all I can think of... do you have the bisection log just in
case ?

Also you can use gitk -- arch/powerpc to look at the changes to powerpc
code and try manually random points before/after that if you think
bisect isn't doing the right thing.

Cheers,
Ben.





Hi,

i managed to find the bad commit after a couple of days bisecting.



44ae3ab3358e962039c36ad4ae461ae9fb29596c is the first bad commit
commit 44ae3ab3358e962039c36ad4ae461ae9fb29596c
Author: Matt Evans 
Date:   Wed Apr 6 19:48:50 2011 +

powerpc: Free up some CPU feature bits by moving out MMU-related 
features


Some of the 64bit PPC CPU features are MMU-related, so this patch moves
them to MMU_FTR_ bits.  All cpu_has_feature()-style tests are moved to
mmu_has_feature(), and seven feature bits are freed as a result.

Signed-off-by: Matt Evans 
Signed-off-by: Benjamin Herrenschmidt 





Actually, I found two problems.
The first problem occurs when I enable the IDE CDROM driver on my machine.
The following commit causes hangs on my machine at boot:



--
commit 5b03a1b140e13a28ff6be1526892a9dc538ddef6
Author: Tejun Heo 
Date:   Wed Mar 9 19:54:27 2011 +0100

ide: Convert to bdops->check_events()

Convert ->media_changed() to the new ->check_events() method.  The
conversion is mostly mechanical.  The only notable change is that
cdrom now doesn't generate any event if @slot_nr isn't CDSL_CURRENT.
It used to return -EINVAL which would be treated as media changed.  As
media changer isn't supported anyway, this doesn't make any
difference.

This makes ide emit the standard disk events and allows kernel event
polling.  Currently, only MEDIA_CHANGE event is implemented.  Adding
support for EJECT_REQUEST shouldn't be difficult; however, given that
ide driver is already deprecated, it probably is best to leave it
alone.

Signed-off-by: Tejun Heo 





If I disable the IDE CDROM driver, the Linux kernel boots again
until it reaches commit 44ae3ab3358e962039c36ad4ae461ae9fb29596c,
where it hangs again :)

Commit eca590f402332ab873d13f2d8d00fa0b91cfff36, which comes before
commit 44ae3ab3358e962039c36ad4ae461ae9fb29596c, works fine;
I tested it myself to be on the safe side.
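
The interaction of the two regressions can be sketched as a binary search over a linear history. This is only a model: real git bisect walks a commit graph with merges, and the commit indices, struct history and first_bad below are invented for illustration. It shows why the second regression only becomes findable once the first one is masked (here, by disabling the IDE CDROM driver).

```c
#include <stdbool.h>

/* Toy history: a commit hangs if it is at or after an active regression. */
struct history {
	int ide_regression;	/* index of the commit that breaks IDE CDROM */
	int mmu_regression;	/* index of the commit that breaks MMU features */
	bool ide_enabled;	/* is the IDE CDROM driver built in? */
};

bool commit_is_bad(const struct history *h, int commit)
{
	if (h->ide_enabled && commit >= h->ide_regression)
		return true;
	return commit >= h->mmu_regression;
}

/*
 * Return the first bad commit in (good, bad], assuming 'good' boots,
 * 'bad' hangs, and badness is monotonic in between.
 */
int first_bad(const struct history *h, int good, int bad)
{
	while (bad - good > 1) {
		int mid = good + (bad - good) / 2;

		if (commit_is_bad(h, mid))
			bad = mid;
		else
			good = mid;
	}
	return bad;
}
```

With the driver enabled the search converges on the earlier (IDE) regression; with it disabled the same search converges on the MMU one.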



Regards



Re: Linux kernel 3.x problems on PowerMac G5

2013-03-07 Thread Phileas Fogg

On 03/07/2013 09:22 PM, Benjamin Herrenschmidt wrote:

On Thu, 2013-03-07 at 21:08 +0100, Phileas Fogg wrote:

And the bisect couldn't find the commit which causes hangs on my
machine.
All commits which were provided by the bisect were bad.
And the commit before the last bad bisect commit was bad too.
I did bisect several times, and got the same results.

For testing I used the linux-3.0.y branch of
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git.

Did I miss something or do something wrong here?


Did git bisect go down a merge commit ? It does for me if I try that and
asks to test that merge first. If you get that wrong it can get very
confused.

That's all I can think of... do you have the bisection log just in
case ?

Also you can use gitk -- arch/powerpc to look at the changes to powerpc
code and try manually random points before/after that if you think
bisect isn't doing the right thing.

Cheers,
Ben.




Thanks. I'll try manually some commits then.

And here is the bisect log:

git bisect log
# bad: [55922c9d1b84b89cb946c777fddccb3247e7df2c] Linux 3.0-rc1
# good: [61c4f2c81c61f73549928dfd9f3e8f26aa36a8cf] Linux 2.6.39
git bisect start '55922c9d1b84b89cb946c777fddccb3247e7df2c' '61c4f2c81c61f73549928dfd9f3e8f26aa36a8cf'
# bad: [c44dead70a841d90ddc01968012f323c33217c9e] Merge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
git bisect bad c44dead70a841d90ddc01968012f323c33217c9e
# bad: [d93515611bbc70c2fe4db232e5feb448ed8e4cc9] macvlan: fix panic if lowerdev in a bond
git bisect bad d93515611bbc70c2fe4db232e5feb448ed8e4cc9
# bad: [9c6a02f41d10dc9fbf5dd42058e8846f38dd2d9a] sctp: make sctp over IPv6 work with IPsec
git bisect bad 9c6a02f41d10dc9fbf5dd42058e8846f38dd2d9a
# bad: [d30ee670f25ea8f265a2804e2a0a53804cac5185] net-bonding: Fix minor sparse complaints
git bisect bad d30ee670f25ea8f265a2804e2a0a53804cac5185
# bad: [b37e3b6d64358604960b35e8ecbb7aed22e0926e] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6
git bisect bad b37e3b6d64358604960b35e8ecbb7aed22e0926e
# bad: [6c74608bd479bbe02ce330f83df43c3f535ed200] ssb: pci: trivial: drop useless pointer
git bisect bad 6c74608bd479bbe02ce330f83df43c3f535ed200
# bad: [83860c594f65945b1a2c99e84338e1145cd34890] ath9k_hw: remove pCap->tx_triglevel_max
git bisect bad 83860c594f65945b1a2c99e84338e1145cd34890
# bad: [f171760c558946c7a2e0ee310dfb968f9d4853c6] ath9k_hw: enable a BlockAck related fixup specific to AR9100
git bisect bad f171760c558946c7a2e0ee310dfb968f9d4853c6
# bad: [e600707b021efdc109e7becd467798da339ec26d] mwl8k: differentiate between WMM queues and AMPDU queues
git bisect bad e600707b021efdc109e7becd467798da339ec26d
# bad: [e7fc63388def06d2d1bdb6916748c92c037a42c6] ath9k_hw: Speedup register ops for HTC driver
git bisect bad e7fc63388def06d2d1bdb6916748c92c037a42c6
# bad: [6d64ab7f9240e3201fde3fd16ce4227bd795d2ab] ath9k_htc: Fix LED pin for AR9287 HTC device
git bisect bad 6d64ab7f9240e3201fde3fd16ce4227bd795d2ab
# bad: [22dd2fd283ea96b4d45185d3e861ef46005082f4] iwlwifi: remove duplicate initialization in __iwl_down()
git bisect bad 22dd2fd283ea96b4d45185d3e861ef46005082f4



Regards


Re: Linux kernel 3.x problems on PowerMac G5

2013-03-07 Thread Phileas Fogg

On 03/03/2013 08:24 PM, Benjamin Herrenschmidt wrote:

On Sun, 2013-03-03 at 20:16 +0100, Phileas Fogg wrote:

Benjamin Herrenschmidt wrote:

Thanks. It looks like a bisection might indeed be the way to go...

Out of curiosity, have you tried without some of your additional drivers ?
Maybe one of them is the culprit...

Cheers,
Ben.



Not yet, will do.
But I tested the official Debian Wheezy RC netinstall CD with Linux 3.2,
it has the same problem and hangs at boot on my machine.


Ok, so it's definitely something about your configuration. Maybe
something in the 11,2 support code chokes on single-chip configs. I
don't have one of them to test; both of mine are dual chip (i.e. quad
core).

But it does look like a regression that should be bisectable, so let me
know when you're done there and what you get.

Cheers,
Ben.





Hi,

I'm completely confused now.
I did a bisect between the following 2 commits:
61c4f2c81c61f73549928dfd9f3e8f26aa36a8cf (good) Linux 2.6.39

and
55922c9d1b84b89cb946c777fddccb3247e7df2c (bad)  Linux 3.0-rc1
I tested both commits on my machine: Linux 3.0-rc1 hangs,
but Linux 2.6.39 works fine.

And the bisect couldn't find the commit which causes hangs on my machine.
All commits which were provided by the bisect were bad.
And the commit before the last bad bisect commit was bad too.
I did bisect several times, and got the same results.

For testing I used the linux-3.0.y branch of
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git.

Did I miss something or do something wrong here?

Regards


Re: Linux kernel 3.x problems on PowerMac G5

2013-03-03 Thread Phileas Fogg

Benjamin Herrenschmidt wrote:

Thanks. It looks like a bisection might indeed be the way to go...

Out of curiosity, have you tried without some of your additional drivers ?
Maybe one of them is the culprit...

Cheers,
Ben.



Not yet, will do.
But I tested the official Debian Wheezy RC netinstall CD with Linux 3.2,
it has the same problem and hangs at boot on my machine.

regards



Re: Linux kernel 3.x problems on PowerMac G5

2013-03-03 Thread Phileas Fogg

Benjamin Herrenschmidt wrote:

On Sat, 2013-03-02 at 15:40 +0100, Phileas Fogg wrote:

recently i got a PowerMac G5 and installed Debian Linux 2.6.32 on it.
Everything works so far and Debian boots properly.

Today i tried to boot Linux 3 on the machine and it doesn't boot.
The Linux 3 kernel was cross-compiled by me.

On Linux 3.8.1 it hangs after this line:
---
windfarm: Drive bay control loop started.

And then i'm getting RCU stall call traces.

On Linux 3.2 it hangs too but not at the same place.
It hangs after some SCSI message.

Has anyone tested Linux 3 kernels on a PowerMac G5 recently?


Hrm, this is odd. I do run pretty much every release on my G5's without
problems... Can you send me your .config and then try with a
g5_defconfig just to see if it makes a difference ?

There *might* have been a problem on those older machines vs. the 64T
address space patches, so maybe try back 3.6 and 3.7 and let me know,
I'm still trying to get the right fix in (I know it breaks PS/3 under
some circumstances).

Cheers,
Ben.





Reverted 64TB commit on Linux 3.8.1 and it didn't help, it still hangs.

Here is the RCU stall call trace:

BUG - soft lockup - CPU#1 stuck for 23s
Call trace:
padzero
load_elf_binary
search_binary_handler
load_script
search_binary_handler
do_execve_common
sys_execve
syscall_exit

Exception at kernel_execve
LR = run_init_process



regards




Re: Linux kernel 3.x problems on PowerMac G5

2013-03-02 Thread Phileas Fogg

Aaro Koskinen wrote:

Hi,

On Sat, Mar 02, 2013 at 03:40:19PM +0100, Phileas Fogg wrote:

Has anyone tested Linux 3 kernels on a PowerMac G5 recently?


3.8 boots normally to shell and is stable on G5 iMac (PowerMac8,1).

A.



Thanks. Then I probably misconfigured something.
I tried Linux 2.6.39.4 and it boots too.
I looked around on the Internet and it seems I'm not the only one
who has problems with Linux 3.x on a PowerMac G5.

The guy here has the same problem:
http://forums.gentoo.org/viewtopic-p-7222918.html


My PowerMac CPU


cat /proc/cpuinfo
processor   : 0
cpu : PPC970MP, altivec supported
clock   : 2000.00MHz
revision: 1.0 (pvr 0044 0100)

processor   : 1
cpu : PPC970MP, altivec supported
clock   : 2000.00MHz
revision: 1.0 (pvr 0044 0100)

timebase: 
platform: PowerMac
model   : PowerMac11,2
machine : PowerMac11,2
motherboard : PowerMac11,2 MacRISC4 Power Macintosh
detected as : 337 (PowerMac G5 Dual Core)
pmac flags  : 
L2 cache: 1024K unified
pmac-generation : NewWorld


Linux kernel 3.x problems on PowerMac G5

2013-03-02 Thread Phileas Fogg


Hi,

recently i got a PowerMac G5 and installed Debian Linux 2.6.32 on it.
Everything works so far and Debian boots properly.

Today i tried to boot Linux 3 on the machine and it doesn't boot.
The Linux 3 kernel was cross-compiled by me.

On Linux 3.8.1 it hangs after this line:
---
windfarm: Drive bay control loop started.

And then i'm getting RCU stall call traces.

On Linux 3.2 it hangs too but not at the same place.
It hangs after some SCSI message.

Has anyone tested Linux 3 kernels on a PowerMac G5 recently?

Regards


Re: PS3: Strange issue with kexec and FreeBSD loader

2013-02-22 Thread Phileas Fogg

Benjamin Herrenschmidt wrote:

On Sat, 2013-02-23 at 00:41 +0100, Phileas Fogg wrote:

Benjamin Herrenschmidt wrote:

On Fri, 2013-02-22 at 21:49 +0100, Phileas Fogg wrote:

I wanted to let you know that I tested your advice. And let me say, it was
damn good advice :) I can boot the FreeBSD loader on Linux 3.8 now, with no SHA256
checksum failures. And no more panics with the FreeBSD LiveCD either.

I just inserted hard_irq_disable() after each local_irq_disable() in
arch/powerpc/kernel/machine_kexec_64.c


Awesome ! :-)

Care to send a patch with a Signed-off-by: ?


No problem, but as i said it was your idea how to fix the issue with kexec.
Anyways here is the patch which i tested on my PS3 console with Linux 3.8.
After applying this patch i can boot any Linux kernel starting with 2.6,
FreeBSD loader, FreeBSD LiveCD and my own tiny ELF kernels too.
Even OpenBSD bootloader starts now too :)
And i don't see any failed SHA256 checksums in the purgatory code.


Thanks, but I still need the Signed-off-by: line before i can apply
it :-) (legal...)

Cheers,
Ben.


regards

  From c17cdf38dfe180b4a571827bb547aaf9b678cf29 Mon Sep 17 00:00:00 2001
From: Phileas Fogg 
Date: Sat, 23 Feb 2013 00:32:19 +0100
Subject: [PATCH] kexec: disable hard IRQ before kexec

Disable hard IRQ before kexec a new kernel image.
Not doing it can result in corrupted data in the memory segments
reserved for the new kernel.
---
   arch/powerpc/kernel/machine_kexec_64.c | 3 +++
   1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/kernel/machine_kexec_64.c
b/arch/powerpc/kernel/machine_kexec_64.c
index 7206701..e08b9d0 100644
--- a/arch/powerpc/kernel/machine_kexec_64.c
+++ b/arch/powerpc/kernel/machine_kexec_64.c
@@ -162,6 +162,7 @@ static int kexec_all_irq_disabled = 0;
   static void kexec_smp_down(void *arg)
   {
local_irq_disable();
+   hard_irq_disable();
mb(); /* make sure our irqs are disabled before we say they are */
get_paca()->kexec_state = KEXEC_STATE_IRQS_OFF;
while(kexec_all_irq_disabled == 0)
@@ -244,6 +245,7 @@ static void kexec_prepare_cpus(void)
wake_offline_cpus();
smp_call_function(kexec_smp_down, NULL, /* wait */0);
local_irq_disable();
+   hard_irq_disable();
mb(); /* make sure IRQs are disabled before we say they are */
get_paca()->kexec_state = KEXEC_STATE_IRQS_OFF;

@@ -281,6 +283,7 @@ static void kexec_prepare_cpus(void)
if (ppc_md.kexec_cpu_down)
ppc_md.kexec_cpu_down(0, 0);
local_irq_disable();
+   hard_irq_disable();
   }

   #endif /* SMP */






Next attempt.


Signed-off-by: Phileas Fogg 
---

From c17cdf38dfe180b4a571827bb547aaf9b678cf29 Mon Sep 17 00:00:00 2001
From: Phileas Fogg 
Date: Sat, 23 Feb 2013 00:32:19 +0100
Subject: [PATCH] kexec: disable hard IRQ before kexec

Disable hard IRQ before kexec a new kernel image.
Not doing it can result in corrupted data in the memory segments
reserved for the new kernel.
---
 arch/powerpc/kernel/machine_kexec_64.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/kernel/machine_kexec_64.c b/arch/powerpc/kernel/machine_kexec_64.c
index 7206701..e08b9d0 100644
--- a/arch/powerpc/kernel/machine_kexec_64.c
+++ b/arch/powerpc/kernel/machine_kexec_64.c
@@ -162,6 +162,7 @@ static int kexec_all_irq_disabled = 0;
 static void kexec_smp_down(void *arg)
 {
local_irq_disable();
+   hard_irq_disable();
mb(); /* make sure our irqs are disabled before we say they are */
get_paca()->kexec_state = KEXEC_STATE_IRQS_OFF;
while(kexec_all_irq_disabled == 0)
@@ -244,6 +245,7 @@ static void kexec_prepare_cpus(void)
wake_offline_cpus();
smp_call_function(kexec_smp_down, NULL, /* wait */0);
local_irq_disable();
+   hard_irq_disable();
mb(); /* make sure IRQs are disabled before we say they are */
get_paca()->kexec_state = KEXEC_STATE_IRQS_OFF;

@@ -281,6 +283,7 @@ static void kexec_prepare_cpus(void)
if (ppc_md.kexec_cpu_down)
ppc_md.kexec_cpu_down(0, 0);
local_irq_disable();
+   hard_irq_disable();
 }

 #endif /* SMP */
--
1.8.1.4


Re: PS3: Strange issue with kexec and FreeBSD loader

2013-02-22 Thread Phileas Fogg

Benjamin Herrenschmidt wrote:

On Fri, 2013-02-22 at 21:49 +0100, Phileas Fogg wrote:

I wanted to let you know that I tested your advice. And let me say, it was
damn good advice :) I can boot the FreeBSD loader on Linux 3.8 now, with no SHA256
checksum failures. And no more panics with the FreeBSD LiveCD either.

I just inserted hard_irq_disable() after each local_irq_disable() in
arch/powerpc/kernel/machine_kexec_64.c


Awesome ! :-)

Care to send a patch with a Signed-off-by: ?

Cheers,
Ben.




No problem, but as i said it was your idea how to fix the issue with kexec.
Anyways here is the patch which i tested on my PS3 console with Linux 3.8.
After applying this patch i can boot any Linux kernel starting with 2.6,
FreeBSD loader, FreeBSD LiveCD and my own tiny ELF kernels too.
Even OpenBSD bootloader starts now too :)
And i don't see any failed SHA256 checksums in the purgatory code.

regards

From c17cdf38dfe180b4a571827bb547aaf9b678cf29 Mon Sep 17 00:00:00 2001
From: Phileas Fogg 
Date: Sat, 23 Feb 2013 00:32:19 +0100
Subject: [PATCH] kexec: disable hard IRQ before kexec

Disable hard IRQ before kexec a new kernel image.
Not doing it can result in corrupted data in the memory segments
reserved for the new kernel.
---
 arch/powerpc/kernel/machine_kexec_64.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/kernel/machine_kexec_64.c b/arch/powerpc/kernel/machine_kexec_64.c
index 7206701..e08b9d0 100644
--- a/arch/powerpc/kernel/machine_kexec_64.c
+++ b/arch/powerpc/kernel/machine_kexec_64.c
@@ -162,6 +162,7 @@ static int kexec_all_irq_disabled = 0;
 static void kexec_smp_down(void *arg)
 {
local_irq_disable();
+   hard_irq_disable();
mb(); /* make sure our irqs are disabled before we say they are */
get_paca()->kexec_state = KEXEC_STATE_IRQS_OFF;
while(kexec_all_irq_disabled == 0)
@@ -244,6 +245,7 @@ static void kexec_prepare_cpus(void)
wake_offline_cpus();
smp_call_function(kexec_smp_down, NULL, /* wait */0);
local_irq_disable();
+   hard_irq_disable();
mb(); /* make sure IRQs are disabled before we say they are */
get_paca()->kexec_state = KEXEC_STATE_IRQS_OFF;

@@ -281,6 +283,7 @@ static void kexec_prepare_cpus(void)
if (ppc_md.kexec_cpu_down)
ppc_md.kexec_cpu_down(0, 0);
local_irq_disable();
+   hard_irq_disable();
 }

 #endif /* SMP */
--
1.8.1.4


Re: PS3: Strange issue with kexec and FreeBSD loader

2013-02-22 Thread Phileas Fogg

Benjamin Herrenschmidt wrote:

On Thu, 2013-02-21 at 22:44 +0100, Phileas Fogg wrote:

Stripped OpenWRT image:


c001a474:   48 00 00 05     bl      0xc001a478
c001a478:   7c a8 02 a6     mflr    r5
c001a47c:   38 a5 00 1c     addi    r5,r5,28
c001a480:   7c 21 0b 78     mr      r1,r1
c001a484:   80 85 00 00     lwz     r4,0(r5)
c001a488:   2c 04 00 00     cmpwi   r4,0
c001a48c:   40 82 00 62     bnea-   0x60
c001a490:   4b ff ff f0     b       0xc001a480
c001a494:   00 00 00 00     .long 0x0
c001a498:   a0 6d 00 48     lhz     r3,72(r13)
c001a49c:   48 00 00 11     bl      0xc001a4ac



Smells like a bad stack pointer to me...

One thing I noticed is that kexec doesn't seem to hard disable
interrupts, which is ... fishy at best. It should do that
before it switches stacks around. Dunno if that's the cause
of the problem but it might be worth adding a hard_irq_disable()
after all the local_irq_disable(), making sure we are hard
disabled before going into asm.

Cheers,
Ben.





Hi,

I wanted to let you know that I tested your advice. And let me say, it was
damn good advice :) I can boot the FreeBSD loader on Linux 3.8 now, with no SHA256
checksum failures. And no more panics with the FreeBSD LiveCD either.


I just inserted hard_irq_disable() after each local_irq_disable() in 
arch/powerpc/kernel/machine_kexec_64.c
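
As a rough mental model of why the extra call matters (a sketch under stated assumptions, not the kernel's implementation): with lazy interrupt handling on 64-bit powerpc, local_irq_disable() only records a software flag, while hard_irq_disable() also clears MSR[EE] so the CPU stops taking external interrupts. The struct cpu_model type and the model_* functions are invented for illustration.

```c
#include <stdbool.h>

/* Toy model of one CPU's interrupt state. */
struct cpu_model {
	bool soft_enabled;	/* software flag consulted by the kernel */
	bool msr_ee;		/* hardware external-interrupt enable bit */
};

/* Models local_irq_disable(): soft-disable only, hardware still armed. */
void model_local_irq_disable(struct cpu_model *cpu)
{
	cpu->soft_enabled = false;
}

/* Models hard_irq_disable(): also stop the hardware from delivering. */
void model_hard_irq_disable(struct cpu_model *cpu)
{
	cpu->soft_enabled = false;
	cpu->msr_ee = false;
}

/* An interrupt can still enter the exception path while MSR[EE] is set. */
bool model_can_take_interrupt(const struct cpu_model *cpu)
{
	return cpu->msr_ee;
}
```

In this model an interrupt can still enter the low-level exception code after model_local_irq_disable() alone, which is presumably the window that matters once kexec starts moving memory around.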


Thanks

regards


Re: PS3: Strange issue with kexec and FreeBSD loader

2013-02-21 Thread Phileas Fogg

Benjamin Herrenschmidt wrote:

On Thu, 2013-02-21 at 21:38 +0100, Phileas Fogg wrote:

The new 8 bytes at offset 0x90 in dt.dump.hex look suspiciously like
the kernel virtual address: 0xc001a4a0.


It does indeed. What does that address correspond to in the kernel
text ? Can you disassemble around it with "objdump -D vmlinux" ?

Cheers,
Ben.





Does it look like the new data at offset 0x80 and 0x88 in DT are MSR flags 
MSR_DR, MSR_IR and MSR_EE ?



Re: PS3: Strange issue with kexec and FreeBSD loader

2013-02-21 Thread Phileas Fogg

Benjamin Herrenschmidt wrote:

On Thu, 2013-02-21 at 21:38 +0100, Phileas Fogg wrote:

The new 8 bytes at offset 0x90 in dt.dump.hex look suspiciously like
the kernel virtual address: 0xc001a4a0.


It does indeed. What does that address correspond to in the kernel
text ? Can you disassemble around it with "objdump -D vmlinux" ?

Cheers,
Ben.





Here.
I used OpenWRT ELF for testing and it's stripped.
Then i compiled Linux 3.8 myself and didn't strip it.
Addresses are different in both cases but the code is the same and
it is kexec code :)


Stripped OpenWRT image:


c001a474:   48 00 00 05     bl      0xc001a478
c001a478:   7c a8 02 a6     mflr    r5
c001a47c:   38 a5 00 1c     addi    r5,r5,28
c001a480:   7c 21 0b 78     mr      r1,r1
c001a484:   80 85 00 00     lwz     r4,0(r5)
c001a488:   2c 04 00 00     cmpwi   r4,0
c001a48c:   40 82 00 62     bnea-   0x60
c001a490:   4b ff ff f0     b       0xc001a480
c001a494:   00 00 00 00     .long 0x0
c001a498:   a0 6d 00 48     lhz     r3,72(r13)
c001a49c:   48 00 00 11     bl      0xc001a4ac
c001a4a0:   38 80 00 02     li      r4,2    <-- !!!
c001a4a4:   98 8d 00 4b     stb     r4,75(r13)
c001a4a8:   4b ff ff cc     b       0xc001a474
c001a4ac:   39 20 00 02     li      r9,2
c001a4b0:   39 40 00 30     li      r10,48
c001a4b4:   7d 68 02 a6     mflr    r11
c001a4b8:   7d 80 00 a6     mfmsr   r12
c001a4bc:   7d 89 48 78     andc    r9,r12,r9
c001a4c0:   7d 8a 50 78     andc    r10,r12,r10
c001a4c4:   7d 21 01 64     mtmsrd  r9,1
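
For reference, the immediates loaded shortly after the marked instruction line up with MSR bit masks: 2 is MSR_RI and 48 (0x30) is MSR_IR | MSR_DR, which the andc instructions then clear from the current MSR value. The mask values below follow the bit positions used in the kernel's asm/reg.h; the clear_msr_bits helper is a hypothetical illustration of what andc computes.

```c
#include <stdint.h>

#define MSR_RI	(1ULL << 1)	/* recoverable interrupt */
#define MSR_DR	(1ULL << 4)	/* data relocation (MMU on for data) */
#define MSR_IR	(1ULL << 5)	/* instruction relocation */
#define MSR_EE	(1ULL << 15)	/* external interrupt enable */

/* What "andc rD,rA,rB" computes: rA with the bits of rB cleared. */
uint64_t clear_msr_bits(uint64_t msr, uint64_t mask)
{
	return msr & ~mask;
}
```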



Unstripped Linux 3.8 kernel:
-


c001c02c <.kexec_wait>:
c001c02c:   48 00 00 05     bl      c001c030 <.kexec_wait+0x4>
c001c030:   7c a8 02 a6     mflr    r5
c001c034:   38 a5 00 1c     addi    r5,r5,28
c001c038:   7c 21 0b 78     mr      r1,r1
c001c03c:   80 85 00 00     lwz     r4,0(r5)
c001c040:   2c 04 00 00     cmpwi   r4,0
c001c044:   40 82 00 62     bnea-   60
c001c048:   4b ff ff f0     b       c001c038 <.kexec_wait+0xc>

c001c04c :
c001c04c:   00 00 00 00     .long 0x0

c001c050 <.kexec_smp_wait>:
c001c050:   a0 6d 00 48     lhz     r3,72(r13)
c001c054:   48 00 00 11     bl      c001c064
c001c058:   38 80 00 02     li      r4,2    <-- !!!
c001c05c:   98 8d 00 4b     stb     r4,75(r13)
c001c060:   4b ff ff cc     b       c001c02c <.kexec_wait>

c001c064 :
c001c064:   39 20 00 02     li      r9,2
c001c068:   39 40 00 30     li      r10,48


regards




Re: PS3: Strange issue with kexec and FreeBSD loader

2013-02-21 Thread Phileas Fogg

Benjamin Herrenschmidt wrote:

On Wed, 2013-02-20 at 21:43 +0100, Phileas Fogg wrote:


I found the single commit which breaks kexec stuff for FreeBSD loader or other
custom ELF kernels on the PS3 console.


  From 7230c5644188cd9e3fb380cc97dde00c464a3ba7 Mon Sep 17 00:00:00 2001
From: Benjamin Herrenschmidt 
Date: Tue, 6 Mar 2012 18:27:59 +1100
Subject: [PATCH] powerpc: Rework lazy-interrupt handling


Odd... That rework had its own issues and so several patches went in
subsequently to address them. It's possible that the PS3 does more
horrid stuff we missed here but I don't quite see how to relate that to
your specific memory corruption problem...

Do you see any "pattern" to the corruption ? Does it look like
something known ? I.e., exception frame, ASCII data, MSR values, ...

Ben.





Hi,

Here is some data for analysis.

First, I modified kexec-tools to dump the kernel and DT segments before they
are passed to the kexec_load syscall. I also modified the purgatory code to
dump the computed SHA256 checksum, the original SHA256 checksum, and the DT.

Here is the output from kexec-tools:
--

root@ps3-linux:~# kexec -l loader.ps3
segment[0].mem:0x1371000 memsz:262144
segment[1].mem:0x13b1000 memsz:36864
segment[2].mem:0x7fff000 memsz:4096
sha256_digest: 66 a6 c0 be d5 3c ba c2 85 6 97 4 d2 e1 aa 28 63 fa 7f 79 ce de
   e7 7f 26 14 a1 fa 2a ea bc 83



Here is the output from the purgatory code:
-

I'm in purgatory
sha256 digests do not match :(
   digest: d4 dc 50 0a ef 78 8e 28 e0 9a fe 52 e1 72 1c b3 23 a6 f4 ea 40
   7a 2d fd 6b 2a 66 95 63 f6 99 2a
sha256_digest: 66 a6 c0 be d5 3c ba c2 85 06 97 04 d2 e1 aa 28 63 fa 7f 79 ce
   de e7 7f 26 14 a1 fa 2a ea bc 83
sha256_regions:
start=0x01371000 len=0x00040000
start=0x07fff000 len=0x00001000
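
The two logs print digest bytes in slightly different formats (kexec-tools omits leading zeros, e.g. `6` versus `06`), so a byte-wise comparison is easier than reading them by eye. A small Python sketch, assuming nothing beyond the values printed in this message (the helper name `parse_digest` is hypothetical):

```python
def parse_digest(text):
    """Turn space-separated hex bytes (zero-padded or not) into bytes."""
    return bytes(int(token, 16) for token in text.split())


# Digest printed by kexec-tools when the segments were loaded.
KEXEC_TOOLS = (
    "66 a6 c0 be d5 3c ba c2 85 6 97 4 d2 e1 aa 28 63 fa 7f 79 ce de "
    "e7 7f 26 14 a1 fa 2a ea bc 83"
)
# "sha256_digest" as printed by the purgatory (the stored reference value).
PURGATORY_EXPECTED = (
    "66 a6 c0 be d5 3c ba c2 85 06 97 04 d2 e1 aa 28 63 fa 7f 79 ce "
    "de e7 7f 26 14 a1 fa 2a ea bc 83"
)
# "digest" as printed by the purgatory (recomputed over the regions).
PURGATORY_COMPUTED = (
    "d4 dc 50 0a ef 78 8e 28 e0 9a fe 52 e1 72 1c b3 23 a6 f4 ea 40 "
    "7a 2d fd 6b 2a 66 95 63 f6 99 2a"
)

reference_intact = parse_digest(KEXEC_TOOLS) == parse_digest(PURGATORY_EXPECTED)
data_intact = parse_digest(PURGATORY_COMPUTED) == parse_digest(PURGATORY_EXPECTED)

if __name__ == "__main__":
    print("reference digest survived kexec:", reference_intact)
    print("recomputed digest matches:", data_intact)
```

In this run the stored reference digest is byte-identical to what kexec-tools computed, so the mismatch comes from the recomputed value, which is consistent with the hashed regions having changed rather than the reference.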



Here is the DT dump from kexec-tools:
---

0000  d0 0d fe ed 00 00 03 70  00 00 00 40 00 00 02 74  |...p...@...t|
0010  00 00 00 20 00 00 00 02  00 00 00 02 00 00 00 00  |... |
0020  00 00 00 00 07 ff f0 00  00 00 00 00 00 00 03 70  |...p|
0030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
0040  00 00 00 01 2f 00 00 00  00 00 00 03 00 00 00 04  |/...|
0050  00 00 00 00 00 00 00 02  00 00 00 03 00 00 00 04  ||
0060  00 00 00 0f 00 00 00 02  00 00 00 03 00 00 00 09  ||
0070  00 00 00 1b 00 00 00 00  73 6f 6e 79 2c 70 73 33  |sony,ps3|
0080  00 00 00 00 00 00 00 03  00 00 00 04 00 00 00 26  |...&|
0090  00 00 00 00 00 00 00 03  00 00 00 08 00 00 00 39  |...9|
00a0  00 00 00 00 38 6d 43 80  00 00 00 03 00 00 00 08  |8mC.|
00b0  00 00 00 48 00 00 00 00  53 6f 6e 79 50 53 33 00  |...HSonyPS3.|
00c0  00 00 00 03 00 00 00 01  00 00 00 4e 00 00 00 00  |...N|
00d0  00 00 00 01 2f 63 68 6f  73 65 6e 00 00 00 00 03  |/chosen.|
00e0  00 00 00 08 00 00 00 53  00 00 00 00 00 00 00 00  |...S|
00f0  00 00 00 03 00 00 00 07  00 00 00 4e 63 68 6f 73  |...Nchos|
0100  65 6e 00 00 00 00 00 03  00 00 00 02 00 00 00 66  |en.f|
0110  20 00 00 00 00 00 00 02  00 00 00 01 2f 63 70 75  | .../cpu|
0120  73 00 00 00 00 00 00 03  00 00 00 04 00 00 00 00  |s...|
0130  00 00 00 01 00 00 00 03  00 00 00 04 00 00 00 0f  ||
0140  00 00 00 00 00 00 00 03  00 00 00 05 00 00 00 4e  |...N|
0150  63 70 75 73 00 00 00 00  00 00 00 01 2f 63 70 75  |cpus/cpu|
0160  73 2f 63 70 75 40 30 00  00 00 00 03 00 00 00 04  |s/cpu@0.|
0170  00 00 00 6f 00 00 00 00  00 00 00 03 00 00 00 04  |...o|
0180  00 00 00 7f 00 00 00 80  00 00 00 03 00 00 00 04  ||
0190  00 00 00 91 00 00 80 00  00 00 00 03 00 00 00 04  ||
01a0  00 00 00 9e 63 70 75 00  00 00 00 03 00 00 00 04  |cpu.|
01b0  00 00 00 aa 00 00 00 80  00 00 00 03 00 00 00 04  ||
01c0  00 00 00 bc 00 00 80 00  00 00 00 03 00 00 00 08  ||
01d0  00 00 00 c9 00 00 00 00  00 00 00 00 00 00 00 01  ||
01e0  00 00 00 03 00 00 00 04  00 00 00 4e 63 70 75 00  |...Ncpu.|
01f0  00 00 00 03 00 00 00 04  00 00 00 e4 00 00 00 00  ||
0200  00 00 00 03 00 00 00 04  00 00 00 e8 00 00 00 00  ||
0210  00 00 00 02 00 00 00 02  00 00 00 01 2f 6d 65 6d  |/mem|
0220  6f 72 79 00 00 00 00 03  00 00 00 07 00 00 00 9e  |ory.|
0230  6d 65 6d 6f 72 79 0
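
The first 0x40 bytes of the dump are enough to read the flattened device tree header and its memory reservation map. Here is a minimal Python sketch, assuming the standard big-endian header layout for a version 2 blob (eight 32-bit fields); the function names are hypothetical and the byte values are copied from the dump above.

```python
import struct

FDT_MAGIC = 0xD00DFEED

# First 0x40 bytes of the device tree dump, copied from the hex columns.
HEAD = bytes.fromhex(
    "d00dfeed 00000370 00000040 00000274 "
    "00000020 00000002 00000002 00000000 "
    "00000000 07fff000 00000000 00000370 "
    "00000000 00000000 00000000 00000000"
)

HEADER_FIELDS = (
    "magic", "totalsize", "off_dt_struct", "off_dt_strings",
    "off_mem_rsvmap", "version", "last_comp_version", "boot_cpuid_phys",
)


def parse_header(blob):
    """Unpack the eight big-endian 32-bit header fields of a v2 blob."""
    fields = dict(zip(HEADER_FIELDS, struct.unpack_from(">8I", blob, 0)))
    if fields["magic"] != FDT_MAGIC:
        raise ValueError("not a flattened device tree")
    return fields


def parse_reserve_map(blob, offset):
    """Read (address, size) pairs until the all-zero terminator entry."""
    entries = []
    while True:
        address, size = struct.unpack_from(">QQ", blob, offset)
        if address == 0 and size == 0:
            return entries
        entries.append((address, size))
        offset += 16


if __name__ == "__main__":
    header = parse_header(HEAD)
    for name in HEADER_FIELDS:
        print(f"{name:18s} {header[name]:#x}")
    for address, size in parse_reserve_map(HEAD, header["off_mem_rsvmap"]):
        print(f"reserved: {address:#x} + {size:#x}")
```

Read this way, the blob is 0x370 bytes long, its string table starts at offset 0x274, and the only reservation entry covers the blob itself at 0x7fff000.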

Re: PS3: Strange issue with kexec and FreeBSD loader

2013-02-20 Thread Phileas Fogg

Phileas Fogg wrote:

Phileas Fogg wrote:

I finally found the commit that broke FreeBSD booting in the linux-stable.git
repository.
Linux 3.4-rc1 already seems to have this problem.

--
commit 5375871d432ae9fc581014ac117b96aaee3cd0c7
Merge: b57cb72 dfbc2d7
Author: Linus Torvalds 
Date:   Wed Mar 21 18:55:10 2012 -0700

 Merge branch 'next' of
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc

 Pull powerpc merge from Benjamin Herrenschmidt:
  "Here's the powerpc batch for this merge window.  It is going to be a
   bit more nasty than usual as in touching things outside of
   arch/powerpc mostly due to the big iSeriesectomy :-) We finally got
   rid of the bugger (legacy iSeries support) which was a PITA to
   maintain and that nobody really used anymore.

   Here are some of the highlights:

- Legacy iSeries is gone.  Thanks Stephen ! There's still some bits
  and pieces remaining if you do a grep -ir series arch/powerpc but
  they are harmless and will be removed in the next few weeks
  hopefully.

- The 'fadump' functionality (Firmware Assisted Dump) replaces the
  previous (equivalent) "pHyp assisted dump"...  it's a rewrite of a
  mechanism to get the hypervisor to do crash dumps on pSeries, the
  new implementation hopefully being much more reliable.  Thanks
  Mahesh Salgaonkar.

- The "EEH" code (pSeries PCI error handling & recovery) got a big
  spring cleaning, motivated by the need to be able to implement a
  new backend for it on top of some new different type of firwmare.

  The work isn't complete yet, but a good chunk of the cleanups is
  there.  Note that this adds a field to struct device_node which is
  not very nice and which Grant objects to.  I will have a patch soon
  that moves that to a powerpc private data structure (hopefully
  before rc1) and we'll improve things further later on (hopefully
  getting rid of the need for that pointer completely).  Thanks Gavin
  Shan.

- I dug into our exception & interrupt handling code to improve the
  way we do lazy interrupt handling (and make it work properly with
  "edge" triggered interrupt sources), and while at it found & fixed
  a wagon of issues in those areas, including adding support for page
  fault retry & fatal signals on page faults.

- Your usual random batch of small fixes & updates, including a bunch
  of new embedded boards, both Freescale and APM based ones, etc..."

 I fixed up some conflicts with the generalized irq-domain changes from
 Grant Likely, hopefully correctly.

 * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
(141 commits)
   powerpc/ps3: Do not adjust the wrapper load address
   powerpc: Remove the rest of the legacy iSeries include files
   powerpc: Remove the remaining CONFIG_PPC_ISERIES pieces
   init: Remove CONFIG_PPC_ISERIES
   powerpc: Remove FW_FEATURE ISERIES from arch code
   tty/hvc_vio: FW_FEATURE_ISERIES is no longer selectable
   powerpc/spufs: Fix double unlocks
   powerpc/5200: convert mpc5200 to use of_platform_populate()
   powerpc/mpc5200: add options to mpc5200_defconfig
   powerpc/mpc52xx: add a4m072 board support
   powerpc/mpc5200: update mpc5200_defconfig to fit for charon board
   Documentation/powerpc/mpc52xx.txt: Checkpatch cleanup
   powerpc/44x: Add additional device support for APM821xx SoC and Bluestone
board
   powerpc/44x: Add support PCI-E for APM821xx SoC and Bluestone board
   MAINTAINERS: Update PowerPC 4xx tree
   powerpc/44x: The bug fixed support for APM821xx SoC and Bluestone board
   powerpc: document the FSL MPIC message register binding
   powerpc: add support for MPIC message register API
   powerpc/fsl: Added aliased MSIIR register address to MSI node in dts
   powerpc/85xx: mpc8548cds - add 36-bit dts
   ...



Reverting this commit fixes the problem with SHA256 checksum in the purgatory
code too. I'm trying to find out which commit exactly caused the problem.

regards






I found the single commit which breaks kexec stuff for FreeBSD loader or other
custom ELF kernels on the PS3 console.



From 7230c5644188cd9e3fb380cc97dde00c464a3ba7 Mon Sep 17 00:00:00 2001
From: Benjamin Herrenschmidt 
Date: Tue, 6 Mar 2012 18:27:59 +1100
Subject: [PATCH] powerpc: Rework l

Re: PS3: Strange issue with kexec and FreeBSD loader

2013-02-19 Thread Phileas Fogg

Phileas Fogg wrote:

I finally found the commit that broke FreeBSD booting in the linux-stable.git
repository.
Linux 3.4-rc1 already seems to have this problem.

--
commit 5375871d432ae9fc581014ac117b96aaee3cd0c7
Merge: b57cb72 dfbc2d7
Author: Linus Torvalds 
Date:   Wed Mar 21 18:55:10 2012 -0700

 Merge branch 'next' of
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc

 Pull powerpc merge from Benjamin Herrenschmidt:
  "Here's the powerpc batch for this merge window.  It is going to be a
   bit more nasty than usual as in touching things outside of
   arch/powerpc mostly due to the big iSeriesectomy :-) We finally got
   rid of the bugger (legacy iSeries support) which was a PITA to
   maintain and that nobody really used anymore.

   Here are some of the highlights:

- Legacy iSeries is gone.  Thanks Stephen ! There's still some bits
  and pieces remaining if you do a grep -ir series arch/powerpc but
  they are harmless and will be removed in the next few weeks
  hopefully.

- The 'fadump' functionality (Firmware Assisted Dump) replaces the
  previous (equivalent) "pHyp assisted dump"...  it's a rewrite of a
  mechanism to get the hypervisor to do crash dumps on pSeries, the
  new implementation hopefully being much more reliable.  Thanks
  Mahesh Salgaonkar.

- The "EEH" code (pSeries PCI error handling & recovery) got a big
  spring cleaning, motivated by the need to be able to implement a
  new backend for it on top of some new different type of firwmare.

  The work isn't complete yet, but a good chunk of the cleanups is
  there.  Note that this adds a field to struct device_node which is
  not very nice and which Grant objects to.  I will have a patch soon
  that moves that to a powerpc private data structure (hopefully
  before rc1) and we'll improve things further later on (hopefully
  getting rid of the need for that pointer completely).  Thanks Gavin
  Shan.

- I dug into our exception & interrupt handling code to improve the
  way we do lazy interrupt handling (and make it work properly with
  "edge" triggered interrupt sources), and while at it found & fixed
  a wagon of issues in those areas, including adding support for page
  fault retry & fatal signals on page faults.

- Your usual random batch of small fixes & updates, including a bunch
  of new embedded boards, both Freescale and APM based ones, etc..."

 I fixed up some conflicts with the generalized irq-domain changes from
 Grant Likely, hopefully correctly.

 * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
(141 commits)
   powerpc/ps3: Do not adjust the wrapper load address
   powerpc: Remove the rest of the legacy iSeries include files
   powerpc: Remove the remaining CONFIG_PPC_ISERIES pieces
   init: Remove CONFIG_PPC_ISERIES
   powerpc: Remove FW_FEATURE ISERIES from arch code
   tty/hvc_vio: FW_FEATURE_ISERIES is no longer selectable
   powerpc/spufs: Fix double unlocks
   powerpc/5200: convert mpc5200 to use of_platform_populate()
   powerpc/mpc5200: add options to mpc5200_defconfig
   powerpc/mpc52xx: add a4m072 board support
   powerpc/mpc5200: update mpc5200_defconfig to fit for charon board
   Documentation/powerpc/mpc52xx.txt: Checkpatch cleanup
   powerpc/44x: Add additional device support for APM821xx SoC and Bluestone
board
   powerpc/44x: Add support PCI-E for APM821xx SoC and Bluestone board
   MAINTAINERS: Update PowerPC 4xx tree
   powerpc/44x: The bug fixed support for APM821xx SoC and Bluestone board
   powerpc: document the FSL MPIC message register binding
   powerpc: add support for MPIC message register API
   powerpc/fsl: Added aliased MSIIR register address to MSI node in dts
   powerpc/85xx: mpc8548cds - add 36-bit dts
   ...



Reverting this commit fixes the problem with SHA256 checksum in the purgatory
code too. I'm trying to find out which commit exactly caused the problem.


regards


Re: PS3: Strange issue with kexec and FreeBSD loader

2013-02-19 Thread Phileas Fogg
I finally found the commit that broke FreeBSD booting in the linux-stable.git
repository.

Linux 3.4-rc1 already seems to have this problem.

--
commit 5375871d432ae9fc581014ac117b96aaee3cd0c7
Merge: b57cb72 dfbc2d7
Author: Linus Torvalds 
Date:   Wed Mar 21 18:55:10 2012 -0700

Merge branch 'next' of 
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc


Pull powerpc merge from Benjamin Herrenschmidt:
 "Here's the powerpc batch for this merge window.  It is going to be a
  bit more nasty than usual as in touching things outside of
  arch/powerpc mostly due to the big iSeriesectomy :-) We finally got
  rid of the bugger (legacy iSeries support) which was a PITA to
  maintain and that nobody really used anymore.

  Here are some of the highlights:

   - Legacy iSeries is gone.  Thanks Stephen ! There's still some bits
 and pieces remaining if you do a grep -ir series arch/powerpc but
 they are harmless and will be removed in the next few weeks
 hopefully.

   - The 'fadump' functionality (Firmware Assisted Dump) replaces the
 previous (equivalent) "pHyp assisted dump"...  it's a rewrite of a
 mechanism to get the hypervisor to do crash dumps on pSeries, the
 new implementation hopefully being much more reliable.  Thanks
 Mahesh Salgaonkar.

   - The "EEH" code (pSeries PCI error handling & recovery) got a big
 spring cleaning, motivated by the need to be able to implement a
 new backend for it on top of some new different type of firwmare.

 The work isn't complete yet, but a good chunk of the cleanups is
 there.  Note that this adds a field to struct device_node which is
 not very nice and which Grant objects to.  I will have a patch soon
 that moves that to a powerpc private data structure (hopefully
 before rc1) and we'll improve things further later on (hopefully
 getting rid of the need for that pointer completely).  Thanks Gavin
 Shan.

   - I dug into our exception & interrupt handling code to improve the
 way we do lazy interrupt handling (and make it work properly with
 "edge" triggered interrupt sources), and while at it found & fixed
 a wagon of issues in those areas, including adding support for page
 fault retry & fatal signals on page faults.

   - Your usual random batch of small fixes & updates, including a bunch
 of new embedded boards, both Freescale and APM based ones, etc..."

I fixed up some conflicts with the generalized irq-domain changes from
Grant Likely, hopefully correctly.

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: 
(141 commits)

  powerpc/ps3: Do not adjust the wrapper load address
  powerpc: Remove the rest of the legacy iSeries include files
  powerpc: Remove the remaining CONFIG_PPC_ISERIES pieces
  init: Remove CONFIG_PPC_ISERIES
  powerpc: Remove FW_FEATURE ISERIES from arch code
  tty/hvc_vio: FW_FEATURE_ISERIES is no longer selectable
  powerpc/spufs: Fix double unlocks
  powerpc/5200: convert mpc5200 to use of_platform_populate()
  powerpc/mpc5200: add options to mpc5200_defconfig
  powerpc/mpc52xx: add a4m072 board support
  powerpc/mpc5200: update mpc5200_defconfig to fit for charon board
  Documentation/powerpc/mpc52xx.txt: Checkpatch cleanup
  powerpc/44x: Add additional device support for APM821xx SoC and Bluestone 
board

  powerpc/44x: Add support PCI-E for APM821xx SoC and Bluestone board
  MAINTAINERS: Update PowerPC 4xx tree
  powerpc/44x: The bug fixed support for APM821xx SoC and Bluestone board
  powerpc: document the FSL MPIC message register binding
  powerpc: add support for MPIC message register API
  powerpc/fsl: Added aliased MSIIR register address to MSI node in dts
  powerpc/85xx: mpc8548cds - add 36-bit dts
  ...



Re: PS3: Strange issue with kexec and FreeBSD loader

2013-02-17 Thread Phileas Fogg

Geert Uytterhoeven wrote:

Hi Phileas,

On Sun, Feb 17, 2013 at 12:12 AM, Phileas Fogg  wrote:

I found new clues about the problem.

Normally the device tree memory segment is allocated at the top of the boot
memory region. The boot memory size on the PS3 console is 128MB.


root@ps3-linux:~# kexec -l loader.ps3
segment[0].mem:0x131d000 memsz:262144
segment[1].mem:0x135d000 memsz:36864
segment[2].mem:0x7fff000 memsz:4096

And the device tree is located at address 0x7fff000, it's the last page of
the boot memory.

I changed the kexec-tools and made it store the device tree just after the
purgatory code which is located at address 0x135d000. Like here:


root@ps3-linux:~# kexec -l loader.ps3
segment[0].mem:0x131d000 memsz:262144
segment[1].mem:0x135d000 memsz:36864
segment[2].mem:0x1366000 memsz:4096   <-- new address of device tree segment

And now the sha256 verification is always successful for the FreeBSD loader
too.
But still no idea what actually corrupts the device tree segment when it's
located at the top of the boot memory region. And why it happens on Linux
3.7 and Linux 3.8 but not on Linux 3.3.8.


Have you looked at the actual data that ends up being written there?
It may give a clue...

Gr{oetje,eeting}s,

 Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
 -- Linus Torvalds



I was able to dump the device tree data from the purgatory code and compared it
with the original DT that I dumped from kexec-tools.
About 20 bytes at the end of the device tree's string table were corrupted.
A large part of the new data is zeros.


regards


Re: PS3: Strange issue with kexec and FreeBSD loader

2013-02-16 Thread Phileas Fogg


I found new clues about the problem.

Normally the device tree memory segment is allocated at the top of the boot 
memory region. The boot memory size on the PS3 console is 128MB.


root@ps3-linux:~# kexec -l loader.ps3
segment[0].mem:0x131d000 memsz:262144
segment[1].mem:0x135d000 memsz:36864
segment[2].mem:0x7fff000 memsz:4096

And the device tree is located at address 0x7fff000, it's the last page of the 
boot memory.


I changed the kexec-tools and made it store the device tree just after the 
purgatory code which is located at address 0x135d000. Like here:


root@ps3-linux:~# kexec -l loader.ps3
segment[0].mem:0x131d000 memsz:262144
segment[1].mem:0x135d000 memsz:36864
segment[2].mem:0x1366000 memsz:4096   <-- new address of device tree segment

And now the sha256 verification is always successful for the FreeBSD loader too.
But I still have no idea what actually corrupts the device tree segment when it's
located at the top of the boot memory region, or why it happens on Linux 3.7
and Linux 3.8 but not on Linux 3.3.8.
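
The addresses in the two `kexec -l` listings follow from simple arithmetic. A short Python sketch that reproduces them, assuming 4096-byte pages and using only the values printed above (the variable names are illustrative):

```python
PAGE_SIZE = 4096
BOOT_MEM_SIZE = 128 * 1024 * 1024  # 128MB boot memory region on the PS3

# (start address, size in bytes) as printed by kexec-tools
KERNEL = (0x131D000, 262144)     # segment[0]: the new kernel
PURGATORY = (0x135D000, 36864)   # segment[1]: the purgatory code


def segment_end(segment):
    """Address of the first byte after the segment."""
    start, size = segment
    return start + size


# Default placement: the device tree goes in the last page of boot memory.
default_dt_address = BOOT_MEM_SIZE - PAGE_SIZE

# Modified placement: the device tree directly follows the purgatory.
relocated_dt_address = segment_end(PURGATORY)

if __name__ == "__main__":
    print(f"default DT address:   {default_dt_address:#x}")
    print(f"relocated DT address: {relocated_dt_address:#x}")
    print(f"kernel ends at:       {segment_end(KERNEL):#x}")
```

The kernel segment ends exactly where the purgatory begins, so after the change all three segments form one contiguous block well below the top of the boot memory region.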


regards





Re: PS3: Strange issue with kexec and FreeBSD loader

2013-02-16 Thread Phileas Fogg

Phileas Fogg wrote:

I was able to capture the debug output from the purgatory code and it's very 
odd.

This is the SHA256 digest calculated by kexec-tools:

root@ps3-linux:~# kexec -l loader.ps3
Warning: append= option is not passed. Using the first kernel root partition
Modified cmdline:
Unable to find /proc/device-tree//chosen/linux,stdout-path, printing from
purgatory is diabled
segment[0].mem:0x131d000 memsz:262144
segment[1].mem:0x135d000 memsz:36864
segment[2].mem:0x7fff000 memsz:4096
sha256_digest: 77 d5 30 a7 67 5f 67 93 f1 e0 ce 84 bd 4e 1b ec 3c 4a 9e 86 5c a1
33 87 9e b1 5f c8 91 ce e8 61


And this is the debug output i'm always getting from the purgatory code:

I'm in purgatory
sha256 digests do not match :(
digest: fd 4f df a8 af 5b e1 6b bc 51 5d b8 ab be 75 fb 76 fd 64 64 26
3e a8 9f 46 ec 91 de 05 4e 72 78
sha256_digest: 00 39 e3 b2 45 0d 20 68 74 c2 4e ee e4 4a cf ec c3 78 4f 1c 65 ff
a8 76 73 68 5d 01 70 0b b6 50

regards




I was able to analyze the problem further and found out that the device tree memory
region gets corrupted. I slightly modified kexec-tools and made it first compute
a checksum of only the first segment, where the new kernel is located.

That checksum was always verified as correct in the purgatory code.
Then I made kexec-tools compute the checksum of only the third segment, where the
device tree is stored. This time the verify function in the purgatory always
failed.


Output from the purgatory code:


I'm in purgatory
sha256 digests do not match :(
   digest: e3 b0 c4 42 98 fc 1c 14 9a fb f4 c8 99 6f b9 24 27 ae 41 e4 64 
9b 93 4c a4 95 99 1b 78 52 b8 55
sha256_digest: 57 08 81 e7 62 c3 22 2f d9 1d 94 a5 d0 f7 53 8f fe 69 64 84 4d 71 
2d aa e2 07 45 b3 78 79 6e 26

sha256_regions:
start=0x07fff000 len=0x1000

The sha256_digest is actually the correct SHA256 checksum precomputed by 
kexec-tools when the new kernel was given to the old kernel.
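
One detail in this output stands out: the `digest` value recomputed by the purgatory is the well-known SHA-256 of empty input. This can be checked with a few lines of Python using only the standard `hashlib` module and the bytes printed above:

```python
import hashlib

# "digest" as printed by the purgatory in the output above.
PURGATORY_DIGEST = (
    "e3 b0 c4 42 98 fc 1c 14 9a fb f4 c8 99 6f b9 24 27 ae 41 e4 64 "
    "9b 93 4c a4 95 99 1b 78 52 b8 55"
)

computed = bytes(int(token, 16) for token in PURGATORY_DIGEST.split())
empty_input_digest = hashlib.sha256(b"").digest()

if __name__ == "__main__":
    print("matches SHA-256 of empty input:", computed == empty_input_digest)
```

This may suggest that in this particular test the purgatory hashed zero bytes instead of the 0x1000-byte region it lists, but the log alone does not establish why, and the earlier runs with all segments included show a different, non-empty digest.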


I will try to analyze the problem more later.

regards




Re: PS3: Strange issue with kexec and FreeBSD loader

2013-02-16 Thread Phileas Fogg

Phileas Fogg wrote:


Hi,

i'm using OpenWRT petitboot bootloader on my PS3 to boot FreeBSD loader which 
is a simple PPC32 ELF file.
I haven't had any issues with it and OpenWRT based on Linux 3.3.8.
Recently i built an OpenWRT image with Linux 3.7, i have no issues at all with 
kexec and any Linux kernels starting with 2.6 but
FreeBSD loader won't boot and just hangs. The same issue with OpenWRT based on 
Linux 3.6 kernel.
So, i started to analyze this problem and found out where it hangs.

It seems that the purgatory code from kexec-tools loops endlessly if SHA256 
verification of the loaded segments
fails.

See
   
http://git.kernel.org/?p=utils/kernel/kexec/kexec-tools.git;a=blob_plain;f=purgatory/purgatory.c;hb=566ca8a12145196b00ad37939cfd58a97f96ba89

Because the function _verify_sha256_digest fails, the function _purgatory_ 
loops endlessly.
This problem occurs only with Linux 3.6 or Linux 3.7 and FreeBSD loader.
I killed the endless loop and could boot the FreeBSD loader on Linux 3.7 too.

Any idea what could cause this problem ?

Thanks.



Found another strange problem. I'm not able to boot FreeBSD LiveCD with
OpenWRT + Linux 3.8 (or Linux 3.7), the same CD which boots on
OpenWRT + Linux 3.3.8.

The LiveCD just panics and the PS3 console shuts down. Very odd.
The problem is probably connected with the kexec issue i'm having
and happens only with the recent Linux kernels.

regards


Re: PS3: Strange issue with kexec and FreeBSD loader

2013-02-16 Thread Phileas Fogg

I was able to capture the debug output from the purgatory code and it's very 
odd.

This is the SHA256 digest calculated by kexec-tools:

root@ps3-linux:~# kexec -l loader.ps3
Warning: append= option is not passed. Using the first kernel root partition
Modified cmdline:
Unable to find /proc/device-tree//chosen/linux,stdout-path, printing from 
purgatory is diabled

segment[0].mem:0x131d000 memsz:262144
segment[1].mem:0x135d000 memsz:36864
segment[2].mem:0x7fff000 memsz:4096
sha256_digest: 77 d5 30 a7 67 5f 67 93 f1 e0 ce 84 bd 4e 1b ec 3c 4a 9e 86 5c a1 
33 87 9e b1 5f c8 91 ce e8 61



And this is the debug output i'm always getting from the purgatory code:

I'm in purgatory
sha256 digests do not match :(
   digest: fd 4f df a8 af 5b e1 6b bc 51 5d b8 ab be 75 fb 76 fd 64 64 26 
3e a8 9f 46 ec 91 de 05 4e 72 78
sha256_digest: 00 39 e3 b2 45 0d 20 68 74 c2 4e ee e4 4a cf ec c3 78 4f 1c 65 ff 
a8 76 73 68 5d 01 70 0b b6 50


regards




Re[2]: [PATCH 2/2] powerpc: Make context bits depend on virtual addr size.

2013-02-13 Thread Phileas Fogg

>
>Ok. How about the below patch. This is based on the suggestion from
>Paulus. I still have to take care of few comments in the code.
>
>We now split the proto-vsid range differently.
> User:   0 to [2^(CONTEXT_BITS) - 4  + 2^(USER_ESID_BITS)]
> kernel: [2^(CONTEXT_BITS) - 4 + 2^(USER_ESID_BITS)] to 2^(VSID_BITS) - 1
>
>Phileas and Geoff,
>
>Can we check whether this fix the PS3 boot hang ?
 Tested the new patch with Linux 3.8.0-rc7, it seems to be working well.

regards

Re[5]: PS3 platform is broken on Linux 3.7.0

2013-02-11 Thread Phileas Fogg

>"Aneesh Kumar K.V" < aneesh.ku...@linux.vnet.ibm.com > writes:
>
>> Phileas Fogg < phileas-f...@mail.ru > writes:
>>
>>>  And another note.
>>> I took a look at the MMU chapter in the Cell Architecture handbook and 
>>> indeed the first 15 bits in VA are treated as 0 by the hardware.
>>>
>>> Quote:
>>>
>>> 1. High-order bits above 65 bits in the 80-bit virtual address (VA[0:14]) 
>>> are not implemented. The hardware always
>>>    treats these bits as `0'. Software must not set these bits to any other 
>>> value than `0' or the results are undefined in
>>>    the PPE.
>>>
>>>
>>
>> True, we missed the below part of ISA doc:
>>
>> ISA doc says
>>
>> "On implementations that support a virtual address size
>> of only n bits, n < 78, bits 0:77-n of the AVA field must be
>> zeros. "
>>
>> The Cell document I found at 
>>
>>  
>> https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/7A77CCDF14FE70D5852575CA0074E8ED/$file/CellBE_Handbook_v1.12_3Apr09_pub.pdf
>>
>> gives 
>>
>> Virtual Address (VA) Size -> 65 bits
>>
>> So as per ISA, bits 0:12 should be zero, which should make 0:14 of PTE
>> fields zero for Cell.
>>
>> I will try to do a patch. 
>>
>
>Can you try this patch ?
>
>diff --git a/arch/powerpc/include/asm/mmu-hash64.h 
>b/arch/powerpc/include/asm/mmu-hash64.h
>index 2fdb47a..f01fd9a 100644
>--- a/arch/powerpc/include/asm/mmu-hash64.h
>+++ b/arch/powerpc/include/asm/mmu-hash64.h
>@@ -381,21 +381,37 @@ extern void slb_set_size(u16 size);
>  * hash collisions.
>  */
> 
>+/* This should go in Kconfig */
>+/*
>+ * Be careful with this value. This determines the VSID_MODULUS_*  and that
>+ * need to be co-prime with VSID_MULTIPLIER*
>+ */
>+#if 1
>+#define MAX_VIRTUAL_ADDR_BITS 65
>+#else
>+#define MAX_VIRTUAL_ADDR_BITS 66
>+#endif
>+/*
>+ * One bit is taken by the kernel, only the rest of space is available for the
>+ * user space.
>+ */
>+#define CONTEXT_BITS  (MAX_VIRTUAL_ADDR_BITS - \
>+   (USER_ESID_BITS + SID_SHIFT + 1))
>+#define USER_ESID_BITS 18
>+#define USER_ESID_BITS_1T 6
>+
> /*
>  * This should be computed such that protovosid * vsid_mulitplier
>  * doesn't overflow 64 bits. It should also be co-prime to vsid_modulus
>  */
> #define VSID_MULTIPLIER_256M  ASM_CONST(12538073) /* 24-bit prime */
>-#define VSID_BITS_256M 38
>+#define VSID_BITS_256M (CONTEXT_BITS + USER_ESID_BITS + 1)
> #define VSID_MODULUS_256M ((1UL<<VSID_BITS_256M)-1)
> 
> #define VSID_MULTIPLIER_1T ASM_CONST(12538073) /* 24-bit prime */
>-#define VSID_BITS_1T  26
>+#define VSID_BITS_1T  (CONTEXT_BITS + USER_ESID_BITS_1T + 1)
> #define VSID_MODULUS_1T   ((1UL<<VSID_BITS_1T)-1)
> 
>-#define CONTEXT_BITS  19
>-#define USER_ESID_BITS 18
>-#define USER_ESID_BITS_1T 6
> 
> #define USER_VSID_RANGE   (1UL << (USER_ESID_BITS + SID_SHIFT))
> 
>

Testing it with Linux 3.8.0-rc7, it looks good so far under heavy hard disk 
usage.
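
For reference, the bit widths that the quoted patch derives can be worked out by hand. The Python sketch below mirrors the macro arithmetic; `SID_SHIFT = 28` (256MB segments) is an assumption taken from the kernel headers rather than from this thread, and the function names are illustrative.

```python
SID_SHIFT = 28           # assumption: 256MB segment shift from the kernel headers
USER_ESID_BITS = 18      # from the quoted patch
USER_ESID_BITS_1T = 6    # from the quoted patch


def context_bits(max_virtual_addr_bits):
    """CONTEXT_BITS as defined in the patch: one bit is kept for the kernel."""
    return max_virtual_addr_bits - (USER_ESID_BITS + SID_SHIFT + 1)


def vsid_bits(max_virtual_addr_bits):
    """VSID_BITS_256M and VSID_BITS_1T as derived in the patch."""
    ctx = context_bits(max_virtual_addr_bits)
    return {
        "256M": ctx + USER_ESID_BITS + 1,
        "1T": ctx + USER_ESID_BITS_1T + 1,
    }


if __name__ == "__main__":
    for va_bits in (66, 65):
        print(va_bits, context_bits(va_bits), vsid_bits(va_bits))
```

With 66 virtual address bits this reproduces the previously hard-coded values (19 context bits, 38 and 26 VSID bits). With the 65 bits that the Cell handbook gives, each value drops by one, so a 37-bit VSID plus the 28-bit segment offset fits in 65 bits.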


Re[5]: PS3 platform is broken on Linux 3.7.0

2013-02-10 Thread Phileas Fogg

>Phileas Fogg < phileas-f...@mail.ru > writes:
>
>>  Please ignore the previous patch to fix the PACA issue on PS3 arch.
>> This is the correct one:
>>
>> --- a/arch/powerpc/kernel/setup_64.c 2013-02-10 13:56:12.803855673 +0100
>> +++ b/arch/powerpc/kernel/setup_64.c 2013-02-10 14:07:22.870561322 +0100
>> @@ -186,6 +186,9 @@
>>  initialise_paca(&boot_paca, 0);
>>  setup_paca(&boot_paca);
>> 
>> +/* Allow percpu accesses to "work" until we setup percpu data */
>> +boot_paca.data_offset = 0;
>> +
>>  /* Initialize lockdep early or else spinlocks will blow */
>>  lockdep_init();
>> 
>>
>
>commit 466921c5a4669f4315528a25f9afd66601ce2c04 is done to fix the
>lockdep related issue on ppc64. So this may need little bit more
>explanation. So if we explicitly use boot_paca, do we still need the
>changes in the above commit ?
>
>-aneesh
>

Ok, here is the next PACA fix test.

I tested the following patch with Linux 3.8.0-rc7 on PS3 arch and still getting 
panics.

Patch:

--- arch/powerpc/kernel/setup_64.c.old    2013-02-10 19:34:53.787366191 +0100
+++ arch/powerpc/kernel/setup_64.c    2013-02-10 19:35:38.834035478 +0100
@@ -186,6 +186,9 @@
     initialise_paca(&boot_paca, 0);
     setup_paca(&boot_paca);
 
+    /* Allow percpu accesses to "work" until we setup percpu data */
+    boot_paca.data_offset = 0;
+
     /* Initialize lockdep early or else spinlocks will blow */
     lockdep_init();
 
@@ -208,8 +211,6 @@
 
     /* Fix up paca fields required for the boot cpu */
     get_paca()->cpu_start = 1;
-    /* Allow percpu accesses to "work" until we setup percpu data */
-    get_paca()->data_offset = 0;
 
     /* Probe the machine type */
     probe_machine();



It seems that 'boot_paca' and 'get_paca()' refer to different PACAs.

regards 



Re[3]: PS3 platform is broken on Linux 3.7.0

2013-02-10 Thread Phileas Fogg
 Please ignore the previous patch to fix the PACA issue on PS3 arch.
This is the correct one:

--- a/arch/powerpc/kernel/setup_64.c 2013-02-10 13:56:12.803855673 +0100
+++ b/arch/powerpc/kernel/setup_64.c 2013-02-10 14:07:22.870561322 +0100
@@ -186,6 +186,9 @@
     initialise_paca(&boot_paca, 0);
     setup_paca(&boot_paca);
 
+    /* Allow percpu accesses to "work" until we setup percpu data */
+    boot_paca.data_offset = 0;
+
     /* Initialize lockdep early or else spinlocks will blow */
     lockdep_init();
 




Sunday, 10 February 2013, 15:45 +04:00, from Phileas Fogg:
>
>>On Fri, 2012-12-14 at 16:35 +0400, Phileas Fogg wrote:
>>> Hi,
>>> 
>>> I wanted to bring to your attention the fact that the PS3 platform is 
>>> broken on Linux 3.7.0.
>>> 
>>> i'm not able to boot Linux 3.7.0 on my PS3 slim. Linux 3.6.10 boots just 
>>> fine but not 3.7.0
>>> When i try to boot Linux 3.7.0 then my PS3  shuts down.
>>> 
>>> So i cloned the Linux powerpc GIT repository and tried to find out which 
>>> commits broke the PS3 platform.
>>> After some time I tracked it down to 2 commits:
>>
>>Aneesh, do you have any idea what might be going on there ? Can you look
>>at the PS3 hash code ? It's a bit different from the rest, you might
>>have missed an update or two...
>>
>>Michael, same deal with PACA...
>>
>>Cheers,
>>Ben.
>
>
>I debugged the issue with the panic on PACA access on PS3 arch and found out 
>that it panics in
>
>arch/powerpc/kernel/setup_64.c -> early_setup -> udbg_early_init -> 
>register_early_udbg_console -> console_lock -> down -> 
>raw_spin_unlock_irqrestore
>
>It panics only if i enable lock debugging in kernel.
>
>I suggest the following patch to fix the issue:
>
>--- arch/powerpc/kernel/setup_64.c.old 2013-02-10 13:39:45.147131547 +0100
>+++ arch/powerpc/kernel/setup_64.c 2013-02-10 13:40:51.697135419 +0100
>@@ -186,6 +186,9 @@
>   initialise_paca(&boot_paca, 0);
>   setup_paca(&boot_paca);
> 
>+  /* Allow percpu accesses to "work" until we setup percpu data */
>+  get_paca()->data_offset = 0;
>+
>   /* Initialize lockdep early or else spinlocks will blow */
>   lockdep_init();
> 
>@@ -208,8 +211,6 @@
> 
>   /* Fix up paca fields required for the boot cpu */
>   get_paca()->cpu_start = 1;
>-  /* Allow percpu accesses to "work" until we setup percpu data */
>-  get_paca()->data_offset = 0;
> 
>   /* Probe the machine type */
>   probe_machine();
>
>
>
>
>___
>Linuxppc-dev mailing list
>Linuxppc-dev@lists.ozlabs.org
>https://lists.ozlabs.org/listinfo/linuxppc-dev


Re[2]: PS3 platform is broken on Linux 3.7.0

2013-02-10 Thread Phileas Fogg

>On Fri, 2012-12-14 at 16:35 +0400, Phileas Fogg wrote:
>> Hi,
>> 
>> I wanted to bring to your attention the fact that the PS3 platform is broken 
>> on Linux 3.7.0.
>> 
>> i'm not able to boot Linux 3.7.0 on my PS3 slim. Linux 3.6.10 boots just 
>> fine but not 3.7.0
>> When i try to boot Linux 3.7.0 then my PS3  shuts down.
>> 
>> So i cloned the Linux powerpc GIT repository and tried to find out which 
>> commits broke the PS3 platform.
>> After some time I tracked it down to 2 commits:
>
>Aneesh, do you have any idea what might be going on there ? Can you look
>at the PS3 hash code ? It's a bit different from the rest, you might
>have missed an update or two...
>
>Michael, same deal with PACA...
>
>Cheers,
>Ben.


I debugged the panic on PACA access on the PS3 and found that it occurs in
the following call chain:

arch/powerpc/kernel/setup_64.c -> early_setup -> udbg_early_init -> 
register_early_udbg_console -> console_lock -> down -> 
raw_spin_unlock_irqrestore

It panics only if I enable lock debugging in the kernel.

I suggest the following patch to fix the issue:

--- arch/powerpc/kernel/setup_64.c.old  2013-02-10 13:39:45.147131547 +0100
+++ arch/powerpc/kernel/setup_64.c  2013-02-10 13:40:51.697135419 +0100
@@ -186,6 +186,9 @@
     initialise_paca(&boot_paca, 0);
     setup_paca(&boot_paca);
 
+    /* Allow percpu accesses to "work" until we setup percpu data */
+    get_paca()->data_offset = 0;
+
     /* Initialize lockdep early or else spinlocks will blow */
     lockdep_init();
 
@@ -208,8 +211,6 @@
 
     /* Fix up paca fields required for the boot cpu */
     get_paca()->cpu_start = 1;
-    /* Allow percpu accesses to "work" until we setup percpu data */
-    get_paca()->data_offset = 0;
 
     /* Probe the machine type */
     probe_machine();
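
The comment moved by this patch hints at the mechanism: a per-CPU access adds
the PACA's data_offset to the address of a variable, so early accesses only
"work" while data_offset is 0. A minimal sketch of that arithmetic, assuming a
poisoned initial offset. The poison constant, the example addresses and the
toy_ helpers are made up for illustration and are not taken from the kernel.

```c
#include <stdint.h>

/* Assumed poison value for an uninitialised data_offset. */
#define TOY_POISON UINT64_C(0xfeeeeeeeeeeeeeee)

/*
 * Toy model of a per-CPU access: the effective address is the
 * variable's link-time address plus the per-CPU offset.
 * Unsigned arithmetic wraps around modulo 2^64.
 */
static uint64_t toy_percpu_addr(uint64_t var_addr, uint64_t data_offset)
{
	return var_addr + data_offset;
}

/* Toy validity check: is addr inside the half-open range [start, end)? */
static int toy_addr_ok(uint64_t addr, uint64_t start, uint64_t end)
{
	return addr >= start && addr < end;
}
```

With data_offset 0 the computed address is the variable itself. With the
poison it lands far outside any mapped range, which is the kind of access that
would fault as soon as lock debugging touches per-CPU data.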






Re[3]: PS3 platform is broken on Linux 3.7.0

2013-02-10 Thread Phileas Fogg
One more note.
I took a look at the MMU chapter in the Cell Architecture handbook and, indeed,
the hardware treats the first 15 bits of the VA as 0.

Quote:

1. High-order bits above 65 bits in the 80-bit virtual address (VA[0:14]) are
   not implemented. The hardware always treats these bits as `0'. Software must
   not set these bits to any other value than `0' or the results are undefined
   in the PPE.


regards



Re[2]: PS3 platform is broken on Linux 3.7.0

2013-02-10 Thread Phileas Fogg
Hi,

I found where the problem lies.
I also printed some values in ps3_hpte_insert with and without 64TB support.
I used OpenWRT with Linux 3.7.6 for testing.

Some values without 64TB support:
-

[    0.060487] RPC: Registered named UNIX socket transport module.
[    0.060511] RPC: Registered udp transport module.
[    0.060672] RPC: Registered tcp transport module.
[    0.060873] RPC: Registered tcp NFSv4.1 backchannel transport module.
[    0.061080] initcall .init_sunrpc+0x0/0xbc returned 0 after 784 usecs
[    0.061280] calling  .populate_rootfs+0x0/0x120 @ 1
[    0.061683] ps3_hpte_insert:result=0 vpn=f09b89af50101 pa=d4e ix=dfa0 v=f09b89af5001 r=6c005d4e0194 psize=0 ssize=0 lpar=6c005d4e
[    0.061733] ps3_hpte_insert:result=0 vpn=f09b89af50102 pa=d4e1000 ix=dfb8 v=f09b89af5001 r=6c005d4e1194 psize=0 ssize=0 lpar=6c005d4e1000
[    0.061895] ps3_hpte_insert:result=0 vpn=f09b89af50103 pa=d4e2000 ix=dfb0 v=f09b89af5001 r=6c005d4e2194 psize=0 ssize=0 lpar=6c005d4e2000


Some values with 64TB support:
-

[    0.076477] calling  .init_sunrpc+0x0/0xbc @ 1
[    0.076992] RPC: Registered named UNIX socket transport module.
[    0.077017] RPC: Registered udp transport module.
[    0.077076] RPC: Registered tcp transport module.
[    0.077277] RPC: Registered tcp NFSv4.1 backchannel transport module.
[    0.077484] initcall .init_sunrpc+0x0/0xbc returned 0 after 784 usecs
[    0.077684] calling  .populate_rootfs+0x0/0x120 @ 1
[    0.078126] ps3_hpte_insert:result=-17 vpn=25008684d80101 pa=d567000 ix=2ec8 v=25008684d8001 r=6c005d567194 psize=0 ssize=0 lpar=6c005d567000
[    0.078164] ps3_hpte_insert:result=-17 vpn=25008684d80101 pa=d567000 ix=2ec8 v=25008684d8001 r=6c005d567194 psize=0 ssize=0 lpar=6c005d567000
[    0.078287] [ cut here ]
[    0.078482] Kernel BUG at c002cb3c [verbose debug info unavailable]
[    0.078686] Oops: Exception in kernel mode, sig: 5 [#1]
[    0.078883] SMP NR_CPUS=2 PS3
[    0.079084] Modules linked in:
[    0.079287] NIP: c002cb3c LR: c002cb38 CTR: 002ffc38
[    0.079489] REGS: cd04f0e0 TRAP: 0700   Not tainted  (3.7.6)
[    0.079687] MSR: 80020032   CR: 2222  XER: 
[    0.079888] SOFTE: 0
[    0.080090] TASK = cd049060[1] 'swapper/1' THREAD: cd04c000 CPU: 1
GPR00: c002cb38 cd04f360 c12ec8d0 0081 
GPR04:     
GPR08:  c124ce10  c002bcf0 
GPR12: 2222 c7ffe280 c0008c94 c05cba00 



Now compare the 'v' values in the two cases.

Without 64TB support: v=f09b89af5001
With 64TB support:    v=25008684d8001

As 64-bit values, f09b89af5001 has 16 leading zero bits and 25008684d8001
has 14.

That is why lv1_insert_htab_entry fails with -17, which means 
LV1_ILLEGAL_PARAMETER_VALUE: the PS3 hypervisor checks the number of leading
zero bits in the 'AVPN' value and requires at least 15 of them, and the 'v'
value 25008684d8001 has only 14.
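
The leading-zero counts can be reproduced with a few lines of C. This is only
a sketch: both 'v' values are treated as 64-bit words, the 15-bit threshold is
the one inferred in this mail and not a documented LV1 constant, and the
function names are made up.

```c
#include <stdint.h>

/* Threshold inferred in this thread; an assumption, not an LV1 constant. */
#define MIN_LEADING_ZEROS 15u

/* Count the leading zero bits of a 64-bit value; returns 64 for v == 0. */
static unsigned int leading_zeros64(uint64_t v)
{
	unsigned int n = 0;

	if (v == 0)
		return 64;

	/* Shift left until the top bit is set, counting the shifts. */
	while (!(v & (UINT64_C(1) << 63))) {
		v <<= 1;
		n++;
	}
	return n;
}

/* Non-zero if v has enough leading zero bits under the assumed rule. */
static int avpn_acceptable(uint64_t v)
{
	return leading_zeros64(v) >= MIN_LEADING_ZEROS;
}
```

Under that rule avpn_acceptable(0xf09b89af5001) holds (16 leading zero bits)
and avpn_acceptable(0x25008684d8001) does not (14 leading zero bits).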

I am not sure how to fix this in the current Linux kernel. You guys know it
better than me.

Regards




Monday, 14 January 2013, 15:37 -08:00 from Geoff Levand:
>Hi,
>
>On Fri, 2013-01-11 at 18:12 -0800, Geoff Levand wrote:
>> I checked these, and Michael's 407821a34fce89b4f0b031dbab5cec7d059f46bc
>> does indeed cause the LV1 hypervisor to panic early, and if that is
>> reverted, Aneesh's 048ee0993ec8360abb0b51bdf8f8721e9ed62ec4 hits a BUG.
>
>Just to give an update, I did a little more work on it and found that
>the call to lv1_insert_htab_entry() inside ps3_hpte_insert() is
>failing.
>
>    
>http://git.kernel.org/?p=linux/kernel/git/geoff/ps3-linux.git;a=blob;f=arch/powerpc/platforms/ps3/htab.c;hb=HEAD#l70
>
>The values of the variables printed all look strange compared with
>commit 048ee0993 reverted.  I'll try do some more work on it this
>week.
>
>-Geoff
>
>
>
>
>


PS3: Strange issue with kexec and FreeBSD loader

2013-02-08 Thread Phileas Fogg

Hi,

I'm using the OpenWRT petitboot bootloader on my PS3 to boot the FreeBSD
loader, which is a simple PPC32 ELF file.
I had no issues with it on OpenWRT based on Linux 3.3.8.
Recently I built an OpenWRT image with Linux 3.7. kexec works without any
problems for Linux kernels starting with 2.6, but the FreeBSD loader won't
boot and just hangs. The same happens with OpenWRT based on the Linux 3.6
kernel.
So I started to analyze this problem and found out where it hangs.

It seems that the purgatory code from kexec-tools loops endlessly if SHA256 
verification of the loaded segments fails.

See
  
http://git.kernel.org/?p=utils/kernel/kexec/kexec-tools.git;a=blob_plain;f=purgatory/purgatory.c;hb=566ca8a12145196b00ad37939cfd58a97f96ba89

Because the function _verify_sha256_digest_ fails, the function _purgatory_
loops endlessly.
This problem occurs only with Linux 3.6 or Linux 3.7 and the FreeBSD loader.
I removed the endless loop and could then boot the FreeBSD loader on Linux 3.7
too.
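
The verify-then-hang control flow described here can be sketched as follows.
This is not the kexec-tools code: a short FNV-1a checksum stands in for
SHA256 to keep the example small, the toy_ names are made up, and the sketch
returns a status where the real purgatory would spin forever.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in checksum (64-bit FNV-1a). NOT SHA256; illustration only. */
static uint64_t toy_digest(const unsigned char *buf, size_t len)
{
	uint64_t h = UINT64_C(0xcbf29ce484222325);
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= buf[i];
		h *= UINT64_C(0x100000001b3);
	}
	return h;
}

/* Returns 0 if the segment still matches the digest taken at load time. */
static int toy_verify_digest(const unsigned char *seg, size_t len,
			     uint64_t expected)
{
	return toy_digest(seg, len) != expected;
}

enum toy_outcome { TOY_JUMP_TO_IMAGE, TOY_HANG };

/* Same decision as described above, but reported instead of looping. */
static enum toy_outcome toy_purgatory(const unsigned char *seg, size_t len,
				      uint64_t expected)
{
	if (toy_verify_digest(seg, len, expected))
		return TOY_HANG;	/* the real code loops forever here */

	return TOY_JUMP_TO_IMAGE;
}
```

If any byte of a segment changes between load time and the jump, the digest
no longer matches and the outcome is TOY_HANG. Removing the loop only hides
that mismatch; it does not explain why the segments differ.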

Any idea what could cause this problem?

Thanks.

PS3 platform is broken on Linux 3.7.0

2012-12-14 Thread Phileas Fogg
Hi,

I wanted to bring to your attention the fact that the PS3 platform is broken on 
Linux 3.7.0.

I'm not able to boot Linux 3.7.0 on my PS3 slim. Linux 3.6.10 boots just fine,
but 3.7.0 does not: when I try to boot it, my PS3 shuts down.

So I cloned the Linux powerpc Git repository and tried to find out which 
commits broke the PS3 platform.
After some time I tracked it down to 2 commits:

-

commit 407821a34fce89b4f0b031dbab5cec7d059f46bc
Author: Michael Ellerman 
Date:   Fri Sep 7 15:31:44 2012 +

    powerpc: Initialise paca.data_offset with poison
    
    It's possible for the cpu_possible_mask to change between the time we
    initialise the pacas and the time we setup per_cpu areas.
    
    Obviously impossible cpus shouldn't ever be running, but stranger things
    have happened. So be paranoid and initialise data_offset with a poison
    value in case we don't set it up later.
    
    Based on a patch from Anton Blanchard.
    
    Signed-off-by: Michael Ellerman 
    Signed-off-by: Benjamin Herrenschmidt 




commit 048ee0993ec8360abb0b51bdf8f8721e9ed62ec4
Author: Aneesh Kumar K.V 
Date:   Mon Sep 10 02:52:55 2012 +

    powerpc/mm: Add 64TB support
    
    Increase max addressable range to 64TB. This is not tested on
    real hardware yet.
    
    Reviewed-by: Paul Mackerras 
    Signed-off-by: Aneesh Kumar K.V 
    Signed-off-by: Benjamin Herrenschmidt 

--

The first commit causes my PS3 to shut down. If I revert it, I am able to
boot Linux 3.7.0 and even see some boot messages on my screen, but then it
hangs. As far as I can tell, the second commit is the reason for the hang.

With both commits reverted in the current Linux 3.7.0, I was able to boot it
on my PS3 slim successfully.

Regards
