[PATCH] powerpc/cell: Only iterate over online nodes in cbe_init_pm_irq()

2013-04-23 Thread Michael Ellerman
None of the cell platforms support CPU hotplug, so we should iterate
only over online nodes when setting PMU interrupts.

This also fixes a warning during boot when NODES_SHIFT is large enough:

WARNING: at /scratch/michael/src/kmk/linus/kernel/irq/irqdomain.c:766
...
NIP [c00db278] .irq_linear_revmap+0x30/0x58
LR [c00dc2a0] .irq_create_mapping+0x38/0x1a8
Call Trace:
[c003fc9c3af0] [c00dc2a0] .irq_create_mapping+0x38/0x1a8 (unreliable)
[c003fc9c3b80] [c0655c1c] .__machine_initcall_cell_cbe_init_pm_irq+0x84/0x158
[c003fc9c3c20] [c000afb4] .do_one_initcall+0x5c/0x1e0
[c003fc9c3cd0] [c0644580] .kernel_init_freeable+0x238/0x328
[c003fc9c3db0] [c000b784] .kernel_init+0x1c/0x120
[c003fc9c3e30] [c0009fb8] .ret_from_kernel_thread+0x64/0xac

This is caused by us overflowing our linear revmap because we're
requesting too many interrupts.

Reported-by: Dennis Schridde 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/platforms/cell/pmu.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/cell/pmu.c b/arch/powerpc/platforms/cell/pmu.c
index 59c1a16..348a27b 100644
--- a/arch/powerpc/platforms/cell/pmu.c
+++ b/arch/powerpc/platforms/cell/pmu.c
@@ -382,7 +382,7 @@ static int __init cbe_init_pm_irq(void)
 	unsigned int irq;
 	int rc, node;
 
-	for_each_node(node) {
+	for_each_online_node(node) {
 		irq = irq_create_mapping(NULL, IIC_IRQ_IOEX_PMI |
 					       (node << IIC_IRQ_NODE_SHIFT));
 		if (irq == NO_IRQ) {
-- 
1.7.10.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
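
For readers following along, here is a rough userspace sketch of why
looping over every possible node overflows a linear revmap while looping
over the online nodes does not. The macro names echo the ones in the patch,
but every value below is an assumption picked for illustration, and the
helper functions are hypothetical; none of this is the kernel code itself.

```c
#include <stdbool.h>

/*
 * Illustrative constants only. The names mirror those in the patch,
 * but the values are assumptions for this sketch, not copied from
 * the kernel headers.
 */
#define SKETCH_IRQ_NODE_SHIFT	8	/* assumed: node number starts at bit 8 */
#define SKETCH_IRQ_IOEX_PMI	0x41	/* assumed: per-node PMI source number */
#define SKETCH_REVMAP_SIZE	0x200	/* assumed: entries in the linear revmap */

/* Compose the hardware IRQ number the same way the loop in the patch does. */
static unsigned int pmi_hwirq(unsigned int node)
{
	return SKETCH_IRQ_IOEX_PMI | (node << SKETCH_IRQ_NODE_SHIFT);
}

/* A linear revmap can only hold hwirq numbers below its size. */
static bool fits_in_revmap(unsigned int hwirq)
{
	return hwirq < SKETCH_REVMAP_SIZE;
}

/* Count how many of the first nr_nodes nodes yield an out-of-range hwirq. */
static unsigned int count_overflows(unsigned int nr_nodes)
{
	unsigned int node, bad = 0;

	for (node = 0; node < nr_nodes; node++)
		if (!fits_in_revmap(pmi_hwirq(node)))
			bad++;
	return bad;
}
```

With these assumed values, two online nodes stay inside the table, whereas
walking all 16 possible nodes (as a NODES_SHIFT of 4 would imply) asks for
mappings well past the end of it.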


Re: [PATCH] powerpc/cell: Only iterate over online nodes in cbe_init_pm_irq()

2013-04-23 Thread Michael Ellerman
On Tue, Apr 23, 2013 at 02:45:50PM +0200, Dennis Schridde wrote:
> Hello everyone!
> 
> I have been testing this patch (given to me by Grant Likely) with various
> kernel versions (3.6.2, 3.6.11, 3.8.6, 3.8.8) since November and can confirm
> that it solves part of the IRQ mapping issue on Cell (namely the one Michael
> mentioned) and does not cause any additional noticeable problems.
> 
> Some problems remain after this patch, as can be seen in the attached 
> dmesg.log. If further information is required to debug this, or you need some 
> help in testing more patches, please contact me.

..

This one below I really don't understand.

It's saying that the domain->ops->map() returned -22.

But as far as I can see all the map routines we have on cell return
success unconditionally. So something strange is going on. I'll have to
dig into it a bit more.
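
As a side note for anyone decoding the number in the log: on Linux, errno
22 is EINVAL ("Invalid argument"). A tiny userspace helper like the one
below (hypothetical name, not kernel code) turns a negative kernel-style
return value into the matching message.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper: kernel functions report failure as a negative
 * errno value, so flip the sign before asking libc for the message.
 */
static const char *describe_rc(int rc)
{
	return strerror(rc < 0 ? -rc : rc);
}
```

Calling describe_rc(-22) gives the same string as strerror(EINVAL).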

> [0.490734] irq: irq-93==>hwirq-0x5d mapping failed: -22
> [0.522130] ------------[ cut here ]------------
> [0.549706] WARNING: at /usr/src/linux-3.8.6-aufs/kernel/irq/irqdomain.c:467
> [0.591895] Modules linked in:
> [0.610126] NIP: c00bdeac LR: c00bdea8 CTR: c0025670
> [0.652317] REGS: c000fe667190 TRAP: 0700   Not tainted  (3.8.6-aufs)
> [0.692940] MSR: 90029032   CR: 4424  XER: 
> [0.739817] SOFTE: 1
> [0.752840] TASK = c000fe668000[1] 'swapper/0' THREAD: c000fe664000 CPU: 0
> GPR00: c00bdea8 c000fe667410 c0699cb0 002c 
> GPR04:  78e26d37 0008  
> GPR08: 78eef1c4    
> GPR12: d700 cfffb000 c000a460  
> GPR16:     
> GPR20:   c000fe006078 005e 
> GPR24: 0174 005d 005d c000fe65fc00 
> GPR28: c000fe006060 005d c0643be0 005d 
> [1.119009] NIP [c00bdeac] .irq_domain_associate_many+0x264/0x290
> [1.159628] LR [c00bdea8] .irq_domain_associate_many+0x260/0x290
> [1.199731] Call Trace:
> [1.214318] [c000fe667410] [c00bdea8] .irq_domain_associate_many+0x260/0x290 (unreliable)
> [1.269528] [c000fe6674e0] [c00be928] .irq_create_mapping+0xc8/0x1d0
> [1.313801] [c000fe667580] [c00bead8] .irq_create_of_mapping+0xa8/0x170
> [1.359639] [c000fe667630] [c0290c30] .irq_of_parse_and_map+0x40/0x58
> [1.404429] [c000fe6676c0] [c0290df0] .of_irq_count+0x30/0x58
> [1.445057] [c000fe667750] [c029182c] .of_device_alloc+0x1ec/0x288
> [1.488287] [c000fe667850] [c029191c] .of_platform_device_create_pdata+0x54/0xf8
> [1.538810] [c000fe6678f0] [c0291b04] .of_platform_bus_create+0x144/0x1e0
> [1.585687] [c000fe6679e0] [c0291b60] .of_platform_bus_create+0x1a0/0x1e0
> [1.632564] [c000fe667ad0] [c0291d50] .of_platform_bus_probe+0xd0/0x140
> [1.678404] [c000fe667b70] [c04109e4] .__machine_initcall_cell_cell_publish_devices+0x54/0x1b0
> [1.736219] [c000fe667c40] [c0009e70] .do_one_initcall+0x168/0x1d0
> [1.779445] [c000fe667d00] [c03ffb6c] .kernel_init_freeable+0x14c/0x21c
> [1.825281] [c000fe667db0] [c000a47c] .kernel_init+0x1c/0x108
> [1.865907] [c000fe667e30] [c0008cd8] .ret_from_kernel_thread+0x64/0x8c
> [1.911738] Instruction dump:
> [1.929448] 7fa4eb78 7ca507b4 4828c965 6000 0fe0 3860ffea 4b80 
> e87e8020 
> [1.975803] 7fa4eb78 7fe5fb78 4828c945 6000 <0fe0> 3920 
> 7f83e378 7f44d378 
> [2.023213] ---[ end trace 093b23e74665976f ]---

...

> [   23.270411] Setting up PCI bus 
> /axon@300/plb5/pciex@a0a0
> [   23.312464] PCI host bridge /axon@300/plb5/pciex@a0a0  
> ranges:
> [   23.357765]   IO 0x03a1..0x03a1 -> 
> 0x
> [   23.400470]  MEM 0x03c08000..0x03c0bfff -> 
> 0x8000 
> [   23.443702]  MEM 0x03c0c000..0x03c0 -> 
> 0xc000 Prefetch
> [   23.491188] of-pci D3802400.pciex: PCI host bridge to bus 0004:00
> [   23.529643] pci_bus 0004:00: root bus resource [io  0x54000-0x63fff] (bus 
> address [0x-0x])
> [   23.583289] pci_bus 0004:00: root bus resource [mem 
> 0x3c08000-0x3c0bfff] (bus address [0x8000-0xbfff])
> [   23.647356] pci_bus 0004:00: root bus resource [mem 
> 0x3c0c000-0x3c0 pref] (bus address [0xc000-0x])
> [   23.714024] pci_bus 0004:00: root bus resource [bus 00-ff]
> [   23.746836] pci_bus 0004:00: busn_res: [bus 00-ff] end is updated to ff
> [   23.746935] pci 0004:00:0

Re: [PATCH] powerpc/cell: Only iterate over online nodes in cbe_init_pm_irq()

2013-04-23 Thread Dennis Schridde
Hello!

Please find an up-to-date dmesg log attached. It was created from a Linux 
3.8.8 kernel with these two patches applied:
-   for_each_node(node) {
+   for_each_online_node(node) {

-   return distance;
+   return ((a == b) ? LOCAL_DISTANCE : REMOTE_DISTANCE);
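
For context, the second change collapses the node distance lookup into a
flat two-level scheme. A standalone sketch of that behaviour follows; the
function name is made up, and the two distance values are the ones I
believe the generic kernel headers use, so treat them as an assumption.

```c
/* Assumed to match the generic kernel defaults; check your tree. */
#define LOCAL_DISTANCE	10
#define REMOTE_DISTANCE	20

/*
 * Hypothetical stand-in for the patched function: every node is
 * "local" to itself and equally "remote" from every other node.
 */
static int flat_node_distance(int a, int b)
{
	return (a == b) ? LOCAL_DISTANCE : REMOTE_DISTANCE;
}
```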

More information on the setup (incl. config and lspci) can be found in:
Subject: PROBLEM: Only 2 of 4 cores used on IBM Cell blades and no threads 
shown in spufs
Message-ID: <1470334.YUWOQ37ijW@ernie>

Am Dienstag, 23. April 2013, 23:02:19 schrieb Michael Ellerman:
> On Tue, Apr 23, 2013 at 02:45:50PM +0200, Dennis Schridde wrote:
> This is probably not good, but I'll have to compare to an old kernel to
> be sure. Have you noticed that PCI is broken in any way?

How do I find out?

The only thing I noticed (and someone else, who has been contacting me about 
cellminer questions recently) is that there might be something weird going on 
with the network.
For example, on RHEL5 we had issues with NFS losing its connection to the
server (the x86-64 server directly in the same rack).
And now the other guy and I notice that the connection to the Eligius bitcoin
mining pool is a bit flaky - apparently not transmitting all our work.
But this might also be caused by bugs in cellminer or Ruby - and the RHEL5
issue is most certainly not related, since now on Gentoo/Linux, with a current
kernel, the problem is gone.

--Dennis


Re: [PATCH] powerpc/cell: Only iterate over online nodes in cbe_init_pm_irq()

2013-05-17 Thread Dennis Schridde
Hello!

Just wanted to remind you: The patch to fix cbe_init_pm_irq() that Michael and 
Grant sent me is still not included in Linux 3.8.12.

--Dennis

Am Dienstag, 23. April 2013, 22:14:51 schrieb Michael Ellerman:
> None of the cell platforms support CPU hotplug, so we should iterate
> only over online nodes when setting PMU interrupts.
> 
> This also fixes a warning during boot when NODES_SHIFT is large enough:
> 
> WARNING: at /scratch/michael/src/kmk/linus/kernel/irq/irqdomain.c:766
> ...
> NIP [c00db278] .irq_linear_revmap+0x30/0x58
> LR [c00dc2a0] .irq_create_mapping+0x38/0x1a8
> Call Trace:
> [c003fc9c3af0] [c00dc2a0] .irq_create_mapping+0x38/0x1a8 (unreliable)
> [c003fc9c3b80] [c0655c1c] .__machine_initcall_cell_cbe_init_pm_irq+0x84/0x158
> [c003fc9c3c20] [c000afb4] .do_one_initcall+0x5c/0x1e0
> [c003fc9c3cd0] [c0644580] .kernel_init_freeable+0x238/0x328
> [c003fc9c3db0] [c000b784] .kernel_init+0x1c/0x120
> [c003fc9c3e30] [c0009fb8] .ret_from_kernel_thread+0x64/0xac
> 
> This is caused by us overflowing our linear revmap because we're
> requesting too many interrupts.
> 
> Reported-by: Dennis Schridde 
> Signed-off-by: Michael Ellerman 
> ---
>  arch/powerpc/platforms/cell/pmu.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/platforms/cell/pmu.c b/arch/powerpc/platforms/cell/pmu.c
> index 59c1a16..348a27b 100644
> --- a/arch/powerpc/platforms/cell/pmu.c
> +++ b/arch/powerpc/platforms/cell/pmu.c
> @@ -382,7 +382,7 @@ static int __init cbe_init_pm_irq(void)
>   unsigned int irq;
>   int rc, node;
> 
> - for_each_node(node) {
> + for_each_online_node(node) {
>   irq = irq_create_mapping(NULL, IIC_IRQ_IOEX_PMI |
>  (node << IIC_IRQ_NODE_SHIFT));
>   if (irq == NO_IRQ) {


Re: [PATCH] powerpc/cell: Only iterate over online nodes in cbe_init_pm_irq()

2013-05-21 Thread Michael Ellerman
On Fri, May 17, 2013 at 05:45:05PM +0200, Dennis Schridde wrote:
> Hello!
> 
> Just wanted to remind you: The patch to fix cbe_init_pm_irq() that Michael and 
> Grant sent me is still not included in Linux 3.8.12.

I didn't push that one to stable because it just fixes a warning. If you
want it you'll have to grab it yourself.

cheers