[RFC] [PATCH] powerpc: Add MSR_DE to MSR_KERNEL
Emulators such as the BDI2000 and CodeWarrior need to have MSR_DE set in
order to support breakpoints. This adds MSR_DE for kernel space only.
---
I have tested this briefly with a BDI2000 on a P2010 (e500) and
it works for me. I don't know if there are any bad side effects,
therefore this RFC.

 arch/powerpc/include/asm/reg.h       |    2 +-
 arch/powerpc/include/asm/reg_booke.h |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 7fdc2c0..25c8554 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -108,7 +108,7 @@
 #define MSR_USER64	MSR_USER32 | MSR_64BIT
 #elif defined(CONFIG_PPC_BOOK3S_32) || defined(CONFIG_8xx)
 /* Default MSR for kernel mode. */
-#define MSR_KERNEL	(MSR_ME|MSR_RI|MSR_IR|MSR_DR)
+#define MSR_KERNEL	(MSR_ME|MSR_RI|MSR_IR|MSR_DR|MSR_DE)
 #define MSR_USER	(MSR_KERNEL|MSR_PR|MSR_EE)
 #endif
 
diff --git a/arch/powerpc/include/asm/reg_booke.h b/arch/powerpc/include/asm/reg_booke.h
index 500fe1d..0cb259b 100644
--- a/arch/powerpc/include/asm/reg_booke.h
+++ b/arch/powerpc/include/asm/reg_booke.h
@@ -37,7 +37,7 @@
 #define MSR_KERNEL	(MSR_ME|MSR_RI|MSR_IR|MSR_DR|MSR_CE)
 #define MSR_USER	(MSR_KERNEL|MSR_PR|MSR_EE)
 #else
-#define MSR_KERNEL	(MSR_ME|MSR_RI|MSR_CE)
+#define MSR_KERNEL	(MSR_ME|MSR_RI|MSR_CE|MSR_DE)
 #define MSR_USER	(MSR_KERNEL|MSR_PR|MSR_EE)
 #endif
 
-- 
1.7.3.4

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [RFC] [PATCH] powerpc: Add MSR_DE to MSR_KERNEL
Hi Joakim.

On May 30, 2012, at 12:43 AM, Joakim Tjernlund wrote:

> I have tested this briefly with BDI2000 on P2010(e500) and
> it works for me. I don't know if there are any bad side effects,
> therefore this RFC.

We used to have MSR_DE surrounded by CONFIG_something
to ensure it wasn't set under normal operation. IIRC, if MSR_DE
is set, you will have problems with software debuggers that
utilize the debugging registers in the chip itself. You only want
to force this to be set when using the BDI, not at other times.

Thanks.

	-- Dan
[RFC PATCH powerpc] make CONFIG_NUMA depends on CONFIG_SMP
I'm not sure whether it makes sense to add this dependency to avoid
CONFIG_NUMA && !CONFIG_SMP.

I want to do this because I saw some build errors on the next tree when
compiling with CONFIG_SMP disabled, and it seems they are caused by some
code under the CONFIG_NUMA #ifdefs.

Signed-off-by: Li Zhong
---
 arch/powerpc/Kconfig |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 050cb37..b2aa74b 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -394,7 +394,7 @@ config IRQ_ALL_CPUS
 
 config NUMA
 	bool "NUMA support"
-	depends on PPC64
+	depends on PPC64 && SMP
 	default y if SMP && PPC_PSERIES
 
 config NODES_SHIFT
-- 
1.7.1
RE: pread() and pwrite() system calls
> > We have a system with linux 2.6.32 and the somewhat archaic
> > uClibc 0.9.27 (but I'm not sure the current version is
> > any better, and I think there are binary compatibility
> > issues if we update).
> >
> > I've just discovered that pread() is 'implemented'
> > by using 3 lseek() system calls and read().
> > (the same is true for the 64bit versions).
...
> I think that it is an uClibc problem.

It seems that uClibc wasn't updated when the names of the constants
used for the system calls were changed from __NR_pread to
__NR_pread64.

	David
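For readers who have not seen this kind of fallback, the difference can be sketched in user space. This is only an illustration under assumptions: `emulated_pread()` and `pread_variants_agree()` are hypothetical names, and the save/seek/read/restore sequence is a guess at what a "3 lseek() plus read()" fallback looks like, not uClibc's actual code.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical fallback: emulate pread() with three lseek() calls and a
 * read(). Unlike the real system call, this is not atomic with respect
 * to other users of the same open file description. */
static ssize_t emulated_pread(int fd, void *buf, size_t count, off_t offset)
{
    assert(buf != NULL);

    off_t saved = lseek(fd, 0, SEEK_CUR);           /* lseek #1: remember position */
    if (saved == (off_t)-1)
        return -1;
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1)   /* lseek #2: go to target */
        return -1;
    ssize_t n = read(fd, buf, count);
    if (lseek(fd, saved, SEEK_SET) == (off_t)-1)    /* lseek #3: restore position */
        return -1;
    return n;
}

/* Returns 1 if the emulation and the real pread() read the same bytes
 * and both leave the file offset where it was. */
static int pread_variants_agree(void)
{
    char path[] = "/tmp/pread_demo_XXXXXX";
    char a[4], b[4];
    int ok = 0;
    int fd = mkstemp(path);

    if (fd < 0)
        return 0;
    if (write(fd, "0123456789", 10) == 10 && lseek(fd, 2, SEEK_SET) == 2) {
        ok = emulated_pread(fd, a, sizeof a, 5) == 4 &&
             pread(fd, b, sizeof b, 5) == 4 &&
             memcmp(a, b, sizeof a) == 0 &&
             memcmp(a, "5678", 4) == 0 &&
             lseek(fd, 0, SEEK_CUR) == 2;
    }
    close(fd);
    unlink(path);
    return ok;
}
```

The results match in the single-threaded case; the cost of the emulation is four system calls instead of one, plus a window in which another thread sharing the descriptor can observe or disturb the temporary file offset.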
Re: ppc/sata-fsl: orphan config value: CONFIG_MPC8315_DS
On Wed, May 30, 2012 at 6:57 AM, Scott Wood wrote:
> On 05/29/2012 05:07 PM, Anthony Foiani wrote:
>> Scott Wood writes:
>>
>>> CONFIG_MPC831x_RDB doesn't mean that you're running on such a board,
>>> only that the kernel supports those boards. It should be a runtime
>>> test.
>>
>> Point taken.
>>
>> If that SATA check is CPU/SOC-based, then it should be easy enough to
>> test. The cpuinfo for my board is:
>>
>> # cat /proc/cpuinfo
>> processor : 0
>> cpu : e300c3
>> clock : 266.64MHz
>> revision : 2.0 (pvr 8085 0020)
>> bogomips : 66.66
>> timebase :
>>
>> On the other hand, if the problem is actually caused by board trace
>> routing (or other hardware that's outside the control of the CPU/SOC),
>> then I don't know how possible a runtime check will be.
>
> Board information is available from the device tree, and from platform
> code that was selected based on the device tree.
>
>> Do you know if there is a specific erratum that the MPC8315_DS ran
>> across that required this fix, or was it a band-aid in the first
>> place?
>
> I don't know the history of this, sorry. It looks like Yang Li added
> this code -- Yang, can you answer this?

The original code was there before I touched the driver, so
unfortunately I don't know the history of the problem either. Judging
from the comment in the code and the current test results, I guess it is
a board-related issue. I agree with Anthony that the best action for
now is to remove the workaround completely.

Leo
Re[2]: [RFC] [PATCH] powerpc: Add MSR_DE to MSR_KERNEL
>> I have tested this briefly with BDI2000 on P2010(e500) and
>> it works for me. I don't know if there are any bad side effects,
>> therefore this RFC.

> We used to have MSR_DE surrounded by CONFIG_something
> to ensure it wasn't set under normal operation. IIRC, if MSR_DE
> is set, you will have problems with software debuggers that
> utilize the debugging registers in the chip itself. You only want
> to force this to be set when using the BDI, not at other times.

MSR_DE is also relevant to, and used by, software debuggers that
make use of the debug registers. Debug interrupts are generated only
if MSR_DE is set.

Whether a debug event leads to a debug interrupt handled by a software
debugger or to a debug halt handled by a JTAG tool is selected with
DBCR0_EDM / DBCR0_IDM.

Chapter 8, "Debug Support", of the "e500 Core Family Reference Manual"
explains the effect of MSR_DE in detail.

Ruedi
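The selection described above can be written down as a small decision function. This is a deliberately simplified model of the description in this message, not a statement of what the e500 core does in every case. The bit values are as I recall them from arch/powerpc/include/asm/reg_booke.h and should be checked against your tree; `debug_event_outcome()` and the enum are hypothetical names, and giving EDM priority over IDM is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit values (recalled from reg_booke.h; verify before relying on them). */
#define MSR_DE     (1u << 9)      /* debug exception enable */
#define DBCR0_EDM  0x80000000u    /* external debug mode (JTAG tool) */
#define DBCR0_IDM  0x40000000u    /* internal debug mode (software debugger) */

enum debug_outcome {
    DEBUG_NOT_TAKEN,      /* event is not acted upon */
    DEBUG_HALT_JTAG,      /* debug halt, handled by a JTAG tool */
    DEBUG_INTERRUPT_SW    /* debug interrupt, handled by software */
};

/* Simplified model: MSR_DE gates everything, DBCR0 selects the handler. */
static enum debug_outcome debug_event_outcome(uint32_t msr, uint32_t dbcr0)
{
    if (!(msr & MSR_DE))
        return DEBUG_NOT_TAKEN;
    if (dbcr0 & DBCR0_EDM)          /* assumption: EDM wins if both are set */
        return DEBUG_HALT_JTAG;
    if (dbcr0 & DBCR0_IDM)
        return DEBUG_INTERRUPT_SW;
    return DEBUG_NOT_TAKEN;
}
```

In this model, setting MSR_DE in MSR_KERNEL does not by itself decide who handles the event; that remains a matter of how DBCR0 is programmed.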
Re: [RFC] [PATCH] powerpc: Add MSR_DE to MSR_KERNEL
On 05/30/2012 03:43 AM, Joakim Tjernlund wrote:
> Emulators such as BDI2000 and CodeWarrior need to have MSR_DE set in
> order to support breakpoints. This adds MSR_DE for kernel space only.
> ---
> I have tested this briefly with BDI2000 on P2010(e500) and
> it works for me. I don't know if there are any bad side effects,
> therefore this RFC.
>
>  arch/powerpc/include/asm/reg.h       |    2 +-
>  arch/powerpc/include/asm/reg_booke.h |    2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)

snip

I believe that additional patches are required for CodeWarrior to work
properly (e.g., assembly start up).

I think the patches should come from Freescale. For whatever reason,
they include them in their SDK, but haven't submitted them for inclusion
in the mainline.

As a developer on Freescale Power products, I would like to see
Freescale offer up a CodeWarrior patch set, so I don't have to manage
the patches myself when working outside the SDK (i.e., on a more recent
kernel).
Gianfar TX problems
Hello,

I'm working on a P1021MDS-based board and I see strange behaviour
regarding the TX packets from the gianfar driver (linux 3.0.4-rc5).

Rx is working correctly but Tx is not. In particular, I noticed that
the buffer descriptors are correctly filled but the packets simply do
NOT exit from the TSEC (I looked at the TSEC's tx counters)!

What puzzles me is that sometimes both TX and RX work correctly, but
most of the time only Rx is working! Also, the same uImage works well
on a Freescale P1021MDS devboard!

Maybe something related to DMA/RAM coherence?

Thanks in advance,

Rodolfo

-- 

GNU/Linux Solutions                  e-mail: giome...@enneenne.com
Linux Device Driver                          giome...@linux.it
Embedded Systems                     phone:  +39 349 2432127
UNIX programming                     skype:  rodolfo.giometti
Freelance ICT Italia - Consulente ICT Italia - www.consulenti-ict.it
kernel panic during kernel module load (powerpc specific part)
Hi,

We have seen the following kernel panic, which happened while loading a
kernel module:

[ 536.107430] Unable to handle kernel paging request for data at address 0xd76a907c
[ 536.114922] Faulting instruction address: 0xc770
[ 536.119891] Oops: Kernel access of bad area, sig: 11 [#1]
[ 536.125291] CCEP MPC8541E
[ 536.127908] Modules linked in: pppoe(+) nf_conntrack_ipv6 ...
[ 536.155705] NIP: c770 LR: c770 CTR: d76ab0d4
[ 536.160674] REGS: d76a8f24 TRAP: 0300 Not tainted (2.6.33-ccep)
[ 536.166857] MSR: 00021000 CR: 24000482 XER: 2000
[ 536.172718] DEAR: d76a907c, ESR: 0080
[ 536.176728] TASK = cbd7f9f0[972] 'insmod' THREAD: cbeb2000
[ 536.182041] GPR00: 83cbfff8 d76a8fd4 cbd7f9f0 83cbfff8 cbeb3e1e
[ 536.190438] GPR08: 327b d76aa000 24000482 d76a8fd4 cbd7fc08 10019b04 100c2e5c
[ 536.198836] GPR16: 1009bafc 100c44a4 100c2ed4 100c 100d1e60 100d1ca8 100017ab
[ 536.207235] GPR24: 100017ae 10001936 c0343ae8 10012018 834bffe8 836bffec 838bfff0
[ 536.215819] NIP [c770] InstructionStorage+0xb0/0xc0
[ 536.221048] LR [c770] InstructionStorage+0xb0/0xc0
[ 536.226188] Call Trace:
[ 536.228630] Instruction dump:
[ 536.231600] 90eb002c 910b0030 7cbe0aa6 90ab00b8 7d846378 38a0 39400401 914b00b0
[ 536.239386] 3d42 614a1002 512a0420 4800d6f5 c000e65c 6000 6000
[ 536.247348] Kernel panic - not syncing: Fatal exception in interrupt
[ 536.253704] Call Trace:
[ 536.256149] Rebooting in 10 seconds..

The system crashes inside the return of the init entry point of the
kernel module.

I've found the following root cause:

(1) The system has a high number of NAT rules configured, which created
    a bigger vmalloc area. I've checked this by looking at
    /proc/vmallocinfo.

(2) The kernel module ELF file contains the separate section .init.text
    for the init entry point, which is marked with __init, as usual.

(3) The kernel module ELF file contains the function prologue and
    epilogue in the .text section.

(4) The epilogue is also called from the init entry point, in order to
    return to the caller. It is intended to restore the non-volatile
    registers from the stack and to jump to the caller.

(5) Because of (1), it is no longer possible to jump with a relative
    branch instruction. The distance is too big. Instead, the
    trampoline method is applied, which allows longer jumps via a
    register (please see do_plt_call() in
    arch/powerpc/kernel/module_32.c).

(6) Unfortunately, the trampoline code (do_plt_call()) uses register
    r11 to set up the jump. It looks like the prologue and epilogue
    also use register r11, in order to point to the previous stack
    frame. This is a conflict !!! The trampoline code damages the
    content of r11.

According to the current EABI definitions, register r11 has a dedicated
function (pointer to the previous stack frame).

Parts of the prologue/epilogue, as generated by the compiler, are shown
below:

...
0084 <_rest32gpr_28>:
  84:	83 8b ff f0 	lwz     r28,-16(r11)
0088 <_rest32gpr_29>:
  88:	83 ab ff f4 	lwz     r29,-12(r11)
008c <_rest32gpr_30>:
  8c:	83 cb ff f8 	lwz     r30,-8(r11)
0090 <_rest32gpr_31>:
  90:	83 eb ff fc 	lwz     r31,-4(r11)
  94:	4e 80 00 20 	blr
0098 <_rest32gpr_14_x>:
  98:	81 cb ff b8 	lwz     r14,-72(r11)
...

I'd suggest using register r12 instead of r11 in the trampoline
generation code, in do_plt_call() (arch/powerpc/kernel/module_32.c).

I'm using kernel 2.6.33, but I think it is also relevant for the
current kernel release.

Below is the complete debug session, showing more details.

Thanks
Steffen Rumler

--

0xd54990c0:	addi	r11,r1,48
0xd54990c4:	mr	r3,r29
0xd54990c8:	b	0xd5499100    <-- going to return from the init entry point

(gdb) bt
#0  0xd5499100 in ?? ()
#1  0xd54990ac in ?? ()
#2  0xc0001db0 in do_one_initcall (fn=0, wait=1131130) at init/main.c:719
#3  0xc0059e50 in sys_init_module (umod=, ...
#4  0xc000e038 in syscall_dotrace_cont () at arch/powerpc/kernel/entry_32.S:331
Backtrace stopped: frame did not save the PC

(gdb) info reg r1
r1             0xcbbdbed0	3418210000

(gdb) x/2x 0xcbbdbed0
0xcbbdbed0:	0xcbbdbf00	0xd54990ac

(gdb) x/2x 0xcbbdbf00
0xcbbdbf00:	0xcbbdbf20	0xc0001db0

(gdb) info reg r11
r11            0xcbbdbf00	3418210048

--> the stack is OK here
--> r11 is OK, pointing to the previous stack frame

0xd5499100:	lis	r11,-10381    <-- this is the trampoline code using/changing r11 (do_plt_call()).
0
MPC8315 PCI express lockup
(I apologise for this not having much to do with linux...)

We have a system with an MPC8315 ppc running linux 2.6.32 that uses
the PCI express interface in RC mode to interface to an Altera FPGA.
This uses both PIO and the PEX DMA interfaces (locally written dma
driver).

Under normal circumstances this all works fine.

However under some circumstances (e.g. DMA reads from addresses that
don't have actual slaves on the fpga [1]) the dma transfer requests
don't complete. There are no obvious error bits set in the hisr or
csmisr registers and the csb_status shows 'dma in progress'.

The dma transfer itself can be cancelled (by setting the SUS bit
in the dma_ctrl register), and the relevant status bits are set to
show the transfer has been aborted.

Once in this state all further PCI express transfers fail.
DMA requests time out (the driver gives up waiting for completion) and
PIO requests fault (Oops: Machine check, sig 7 [#1]), locking the
kernel solid.

This looks very much like the MPC8315's erratum PEX7, except that
I don't see the CSMISR[RST] bit set.
I'm not at all sure the recovery for that erratum is actually writable!
I'm certainly not going to write it just to find out if it would help.
In any case it is quite likely that the driver's ISR will try to
do a PIO read while a dma transfer is timing out.

We can look at the fpga side and possibly find out what it is doing,
but it would be useful to know more about the status on the ppc side.

I presume there is a way of doing a 'probe' type memory cycle
that won't panic on a fault? Although that may not help me keep
linux running, as the ISR needs to do a PCIe write to remove the
level-sensitive interrupt.

Any thoughts on ways to progress?

	David

[1] PIO reads are ok and just return 0x.
Re: ppc/sata-fsl: orphan config value: CONFIG_MPC8315_DS
Li Yang writes:

> The original code was there before I touched the driver. So
> unfortunately I also don't know the history of the problem.

Alas.

> Judging from the comment in code and current test result I guess it
> is a board related issue.

I wonder if anyone on the 8315_DS project knows where the limitation
came from, since that's the origin of the workaround...

Regardless, it's recommended by at least one vendor who based their
design on the 8315 RDB. If it's board-related, then that seems a
reasonable conditional.

> I agree with Anthony that the best action for now is to remove the
> workaround completely.

Eeek. I'm pretty sure that it needs to stay. (I can't guarantee that
it has fixed my problem, but it's been a week or two without the hang,
so I'm becoming more confident).

I think the question is how to best conditionalize it. The options
seem to be:

1. at compile time, via kconfig bits;

2. at runtime via probing / discovery; or

3. at runtime via device tree.

Given that this is a relatively old platform, and only 2-3 of us have
run into this issue in 5 years, I'm inclined to just go with option 1.

That's exactly what Adrian's patch (from 2008!) does:

  http://old.nabble.com/-2.6-patch--sata_fsl.c%3A-fix-8315DS-workaround-td18807647.html

Using CONFIG_831x_RDB seems like a reasonable choice.

Anyway. To be clear, my project is currently in good shape (by
adopting Adrian's patch) so I don't have any actual urgency for fixing
this. I was hoping that someone might know the "correct" answer
offhand, but I honestly think that this isn't worth spending too much
time on.

(But I do think that Adrian's patch is an improvement over the current
state of affairs.)

Thanks again to everyone that's chimed in.

Best regards,
Anthony Foiani
Re: ppc/sata-fsl: orphan config value: CONFIG_MPC8315_DS
Scott Wood writes:

> Board information is available from the device tree, and from
> platform code that was selected based on the device tree.

You're right, of course; I was focusing on discovery/probing, and
completely forgot about "provided information".

However, as I just mentioned in my reply to Yang, I'm pretty happy
with the kconfig solution (Adrian's patch, basically).

If we find that this is a more widespread problem, we can revisit this
discussion; but if only a handful of us have encountered this in a
5-year-old design, then I don't think it's worth the extra effort of
making it dynamic.

Maybe someone who knows devtree really well could crank that out in a
few minutes... but I'm not that person. :)

Regardless, thanks very much for helping out on this. I do advocate
that Adrian's patch get put into place, so that we don't have
undocumented / unconnected kconfig symbols in the tree. If we ever do
find out more details about the workaround, we can at least add some
comments at the code site.

Thanks again!

Best regards,
Anthony Foiani
Re: ppc/sata-fsl: orphan config value: CONFIG_MPC8315_DS
On 05/30/2012 03:14 PM, Anthony Foiani wrote:
> Scott Wood writes:
>
>> Board information is available from the device tree, and from
>> platform code that was selected based on the device tree.
>
> You're right, of course; I was focusing on discovery/probing, and
> completely forgot about "provided information".
>
> However, as I just mentioned in my reply to Yang, I'm pretty happy
> with the kconfig solution (Adrian's patch, basically).
>
> If we find that this is a more widespread problem, we can revisit this
> discussion; but if only a handful of us have encountered this in a
> 5-year-old design, then I don't think it's worth the extra effort of
> making it dynamic.

We currently support building one kernel that supports a bunch of
different boards. The hardcoding of this workaround was harmless so far
because it was conditional on a symbol that was never defined, but now
you'll be enabling this workaround on any kernel that simply has support
for mpc8315erdb. That is not acceptable unless you show it's harmless
on all those other boards.

-Scott
Re: ppc/sata-fsl: orphan config value: CONFIG_MPC8315_DS
Scott Wood writes:

> We currently support building one kernel that supports a bunch of
> different boards. The hardcoding of this workaround was harmless so
> far because it was conditional on a symbol that was never defined,
> but now you'll be enabling this workaround on any kernel that simply
> has support for mpc8315erdb. That is not acceptable unless you show
> it's harmless on all those other boards.

Ok, I see your point now. Sorry for being dense.

At the moment, I'm building a kernel that is only going to run on this
particular board, so the kconfig solution works *for me*.

Unfortunately, I'm not sure I can help develop a more generic
solution. I can't reliably reproduce the problem, so I can't even
offer to help test for it.

Even more unfortunately, I don't currently have the bandwidth to do
any more investigation or experimenting with the devtree option (as
much as I would like to!).

At this point in my project, I probably can't even justify trying to
switch to a more current kernel, so I couldn't try out a new release
regardless.

Sorry I can't be more help.

Thanks again,
Tony
Re: kernel panic during kernel module load (powerpc specific part)
On Wed, 2012-05-30 at 16:33 +0200, Steffen Rumler wrote:
> Hi,
>
> The system crashes inside the return of the init entry point of the
> kernel module.
>
> I've found the following root cause:
>
> (6) Unfortunately, the trampoline code (do_plt_call()) is using
>     register r11 to setup the jump.
>     It looks like the prologue and epilogue are using also the
>     register r11, in order to point to the previous stack frame.
>     This is a conflict !!! The trampoline code is damaging the
>     content of r11.

Hi Steffen,

Great bug report!

I can't quite work out what the standards say, the versions I'm looking
at are probably old anyway.

Have you tried the obvious fix?

cheers

diff --git a/arch/powerpc/kernel/module_32.c b/arch/powerpc/kernel/module_32.c
index 0b6d796..989d79a 100644
--- a/arch/powerpc/kernel/module_32.c
+++ b/arch/powerpc/kernel/module_32.c
@@ -205,9 +205,9 @@ static uint32_t do_plt_call(void *location,
 	}
 
 	/* Stolen from Paul Mackerras as well... */
-	entry->jump[0] = 0x3d600000+((val+0x8000)>>16);	/* lis r11,sym@ha */
-	entry->jump[1] = 0x396b0000 + (val&0xffff);	/* addi r11,r11,sym@l*/
-	entry->jump[2] = 0x7d6903a6;			/* mtctr r11 */
+	entry->jump[0] = 0x3d800000+((val+0x8000)>>16);	/* lis r12,sym@ha */
+	entry->jump[1] = 0x398c0000 + (val&0xffff);	/* addi r12,r12,sym@l*/
+	entry->jump[2] = 0x7d8903a6;			/* mtctr r12 */
 	entry->jump[3] = 0x4e800420;			/* bctr */
 
 	DEBUGP("Initialized plt for 0x%x at %p\n", val, entry);
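For anyone checking the magic numbers in the trampoline, they can be derived from the instruction formats rather than taken on faith. The sketch below is a stand-alone user-space illustration, not kernel code: `ppc_dform()`, `ppc_mtctr()` and `build_plt_jump()` are hypothetical helpers, and the opcode values (addis = 15, addi = 14, mtctr as mtspr 9) are my reading of the Power ISA encodings, to be verified against the manual.

```c
#include <assert.h>
#include <stdint.h>

#define OP_ADDIS 15u   /* "lis rD,imm" is "addis rD,0,imm" */
#define OP_ADDI  14u

/* D-form instruction: opcode | rD | rA | 16-bit immediate. */
static uint32_t ppc_dform(uint32_t opcode, unsigned rd, unsigned ra, uint32_t imm16)
{
    return (opcode << 26) | ((uint32_t)rd << 21) | ((uint32_t)ra << 16) |
           (imm16 & 0xffffu);
}

/* mtctr rS, i.e. mtspr 9,rS. */
static uint32_t ppc_mtctr(unsigned rs)
{
    return 0x7c0903a6u | ((uint32_t)rs << 21);
}

/* Build the four-instruction trampoline that jumps to 'val' through
 * the given scratch register. */
static void build_plt_jump(uint32_t jump[4], uint32_t val, unsigned scratch)
{
    assert(scratch < 32);

    /* sym@ha: high half, adjusted because addi sign-extends the low half */
    jump[0] = ppc_dform(OP_ADDIS, scratch, 0, (val + 0x8000u) >> 16);
    jump[1] = ppc_dform(OP_ADDI, scratch, scratch, val);   /* sym@l */
    jump[2] = ppc_mtctr(scratch);
    jump[3] = 0x4e800420u;                                  /* bctr */
}
```

With scratch register 11 this yields the 0x3d60.../0x396b.../0x7d6903a6 family, and with 12 the 0x3d80.../0x398c.../0x7d8903a6 family, so switching registers changes only the register fields and leaves the rest of the sequence intact.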
[PATCH] powerpc/mm: dereference OF node "/chosen"
The form affinity for NUMA is set to 1 if the firmware supports OPAL.
Otherwise, we have to retrieve it from the OF node "/chosen". In the
latter case, a reference to the OF node "/chosen" was taken but never
released. The patch drops that reference with of_node_put() when
necessary.

Signed-off-by: Gavin Shan
---
 arch/powerpc/mm/numa.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index b6edbb3..5ca3a15 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -340,6 +340,8 @@ static int __init find_min_common_depth(void)
 				dbg("Using form 1 affinity\n");
 				form1_affinity = 1;
 			}
+
+			of_node_put(chosen);
 		}
 	}
 
-- 
1.7.9.5
[PATCH] powerpc/pci: cleanup on duplicate assignment
When the PCI root bus is created through pci_create_root_bus() in the
PCI core, the secondary bus number for the newly created root bus
should already have been assigned. Thus we needn't explicitly assign
the secondary bus number again in pcibios_scan_phb().

Signed-off-by: Gavin Shan
---
 arch/powerpc/kernel/pci-common.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 8e78e93..0f75bd5 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -1646,7 +1646,6 @@ void __devinit pcibios_scan_phb(struct pci_controller *hose)
 		pci_free_resource_list(&resources);
 		return;
 	}
-	bus->secondary = hose->first_busno;
 	hose->bus = bus;
 
 	/* Get probe mode and perform scan */
-- 
1.7.9.5
[PATCH] powerpc: Use enhanced touch instructions in POWER7 copy_to_user/copy_from_user
Version 2.06 of the POWER ISA introduced enhanced touch instructions,
allowing us to specify a number of attributes including the length of
a stream.

This patch adds a software stream for both loads and stores in the
POWER7 copy_tofrom_user loop. Since the setup is quite complicated
and we have to use an eieio to ensure correct ordering of the "GO"
command we only do this for copies above 4kB.

To quantify any performance improvements we need a working set
bigger than the caches so we operate on a 1GB file:

# dd if=/dev/zero of=/tmp/foo bs=1M count=1024

And we compare how fast we can read the file:

# dd if=/tmp/foo of=/dev/null bs=1M

before: 7.7 GB/s
after:  9.6 GB/s

A 25% improvement.

The worst case for this patch will be a completely L1 cache contained
copy of just over 4kB. We can test this with the copy_to_user
testcase we used to tune copy_tofrom_user originally:

http://ozlabs.org/~anton/junkcode/copy_to_user.c

# time ./copy_to_user2 -l 4224 -i 1000

before: 6.807 s
after:  6.946 s

A 2% slowdown, which seems reasonable considering our data is unlikely
to be completely L1 contained.

Signed-off-by: Anton Blanchard
---

v2: Use cr1 in the comparison so we don't corrupt the compare/branch
    to select vmx vs non vmx loops.

Index: linux-build/arch/powerpc/lib/copyuser_power7.S
===================================================================
--- linux-build.orig/arch/powerpc/lib/copyuser_power7.S	2012-05-29 21:22:40.445551834 +1000
+++ linux-build/arch/powerpc/lib/copyuser_power7.S	2012-05-31 15:28:35.336354208 +1000
@@ -298,6 +298,37 @@ err1;	stb	r0,0(r3)
 	ld	r5,STACKFRAMESIZE+64(r1)
 	mtlr	r0
 
+	/*
+	 * We prefetch both the source and destination using enhanced touch
+	 * instructions. We use a stream ID of 0 for the load side and
+	 * 1 for the store side.
+	 */
+	clrrdi	r6,r4,7
+	clrrdi	r9,r3,7
+	ori	r9,r9,1		/* stream=1 */
+
+	srdi	r7,r5,7		/* length in cachelines, capped at 0x3FF */
+	cmpldi	cr1,r7,0x3FF
+	ble	cr1,1f
+	li	r7,0x3FF
+1:	lis	r0,0x0E00	/* depth=7 */
+	sldi	r7,r7,7
+	or	r7,r7,r0
+	ori	r10,r7,1	/* stream=1 */
+
+	lis	r8,0x8000	/* GO=1 */
+	clrldi	r8,r8,32
+
+.machine push
+.machine "power4"
+	dcbt	r0,r6,0b01000
+	dcbt	r0,r7,0b01010
+	dcbtst	r0,r9,0b01000
+	dcbtst	r0,r10,0b01010
+	eieio
+	dcbt	r0,r8,0b01010	/* GO */
+.machine pop
+
 	beq	.Lunwind_stack_nonvmx_copy
 
 	/*
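The register arithmetic that sets up the two streams is easier to follow in C. The sketch below mirrors only the integer operations visible in the patch (clrrdi, srdi, the 0x3FF cap, sldi, or, ori); the helper names and macro names are hypothetical, and it does not issue any dcbt/dcbtst instructions or claim anything about how the hardware interprets the resulting words beyond the comments in the patch itself.

```c
#include <assert.h>
#include <stdint.h>

#define CACHELINE_SHIFT  7            /* 128-byte lines: srdi r7,r5,7 */
#define MAX_STREAM_LINES 0x3FFu       /* cap applied by cmpldi/li */
#define DEPTH_7_BITS     0x0E000000u  /* lis r0,0x0E00, "depth=7" */
#define STREAM_GO        0x80000000u  /* lis r8,0x8000; clrldi r8,r8,32 */

/* First operand: cacheline-aligned start address, stream ID in the low
 * bits (clrrdi rX,rY,7 followed by ori rX,rX,id). */
static uint64_t stream_start(uint64_t addr, unsigned stream_id)
{
    assert(stream_id <= 1);   /* the patch only uses IDs 0 and 1 */
    return (addr & ~(uint64_t)127) | stream_id;
}

/* Second operand: length in cachelines (capped), depth and stream ID. */
static uint64_t stream_control(uint64_t nbytes, unsigned stream_id)
{
    uint64_t lines = nbytes >> CACHELINE_SHIFT;

    assert(stream_id <= 1);
    if (lines > MAX_STREAM_LINES)
        lines = MAX_STREAM_LINES;
    return (lines << CACHELINE_SHIFT) | DEPTH_7_BITS | stream_id;
}
```

For an 8kB copy, for example, the load stream control word works out to 0x0E002000 and the store stream word to 0x0E002001; anything of 128kB or more saturates at 0x3FF cachelines.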
[PATCH] powerpc: POWER7 optimised memcpy using VMX and enhanced prefetch
Implement a POWER7 optimised memcpy using VMX and enhanced prefetch
instructions.

This is a copy of the POWER7 optimised copy_to_user/copy_from_user
loop. Detailed implementation and performance details can be found in
commit a66086b8197d (powerpc: POWER7 optimised
copy_to_user/copy_from_user using VMX).

I noticed memcpy issues when profiling a RAID6 workload:

	.memcpy
	.async_memcpy
	.async_copy_data
	.__raid_run_ops
	.handle_stripe
	.raid5d
	.md_thread

I created a simplified testcase by building a RAID6 array with 4 1GB
ramdisks (booting with brd.rd_size=1048576):

# mdadm -CR -e 1.2 /dev/md0 --level=6 -n4 /dev/ram[0-3]

I then timed how long it took to write to the entire array:

# dd if=/dev/zero of=/dev/md0 bs=1M

Before: 892 MB/s
After:  999 MB/s

A 12% improvement.

Signed-off-by: Anton Blanchard
---

Index: linux-build/arch/powerpc/lib/Makefile
===================================================================
--- linux-build.orig/arch/powerpc/lib/Makefile	2012-05-30 15:27:30.0 +1000
+++ linux-build/arch/powerpc/lib/Makefile	2012-05-31 09:12:27.574372864 +1000
@@ -17,7 +17,8 @@ obj-$(CONFIG_HAS_IOMEM)	+= devres.o
 obj-$(CONFIG_PPC64)	+= copypage_64.o copyuser_64.o \
 			   memcpy_64.o usercopy_64.o mem_64.o string.o \
 			   checksum_wrappers_64.o hweight_64.o \
-			   copyuser_power7.o string_64.o copypage_power7.o
+			   copyuser_power7.o string_64.o copypage_power7.o \
+			   memcpy_power7.o
 obj-$(CONFIG_XMON)	+= sstep.o ldstfp.o
 obj-$(CONFIG_KPROBES)	+= sstep.o ldstfp.o
 obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= sstep.o ldstfp.o
Index: linux-build/arch/powerpc/lib/memcpy_64.S
===================================================================
--- linux-build.orig/arch/powerpc/lib/memcpy_64.S	2012-05-30 09:39:59.0 +1000
+++ linux-build/arch/powerpc/lib/memcpy_64.S	2012-05-31 09:12:00.093876936 +1000
@@ -11,7 +11,11 @@
 	.align	7
 _GLOBAL(memcpy)
+BEGIN_FTR_SECTION
 	std	r3,48(r1)	/* save destination pointer for return value */
+FTR_SECTION_ELSE
+	b	memcpy_power7
+ALT_FTR_SECTION_END_IFCLR(CPU_FTR_VMX_COPY)
 	PPC_MTOCRF(0x01,r5)
 	cmpldi	cr1,r5,16
 	neg	r6,r3		# LS 3 bits = # bytes to 8-byte dest bdry
Index: linux-build/arch/powerpc/lib/memcpy_power7.S
===================================================================
--- /dev/null	1970-01-01 00:00:00.0 +
+++ linux-build/arch/powerpc/lib/memcpy_power7.S	2012-05-31 15:28:03.495781127 +1000
@@ -0,0 +1,650 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2012
+ *
+ * Author: Anton Blanchard
+ */
+#include
+
+#define STACKFRAMESIZE	256
+#define STK_REG(i)	(112 + ((i)-14)*8)
+
+_GLOBAL(memcpy_power7)
+#ifdef CONFIG_ALTIVEC
+	cmpldi	r5,16
+	cmpldi	cr1,r5,4096
+
+	std	r3,48(r1)
+
+	blt	.Lshort_copy
+	bgt	cr1,.Lvmx_copy
+#else
+	cmpldi	r5,16
+
+	std	r3,48(r1)
+
+	blt	.Lshort_copy
+#endif
+
+.Lnonvmx_copy:
+	/* Get the source 8B aligned */
+	neg	r6,r4
+	mtocrf	0x01,r6
+	clrldi	r6,r6,(64-3)
+
+	bf	cr7*4+3,1f
+	lbz	r0,0(r4)
+	addi	r4,r4,1
+	stb	r0,0(r3)
+	addi	r3,r3,1
+
+1:	bf	cr7*4+2,2f
+	lhz	r0,0(r4)
+	addi	r4,r4,2
+	sth	r0,0(r3)
+	addi	r3,r3,2
+
+2:	bf	cr7*4+1,3f
+	lwz	r0,0(r4)
+	addi	r4,r4,4
+	stw	r0,0(r3)
+	addi	r3,r3,4
+
+3:	sub	r5,r5,r6
+	cmpldi	r5,128
+	blt	5f
+
+	mflr	r0
+	stdu	r1,-STACKFRAMESIZE(r1)
+	std	r14,STK_REG(r14)(r1)
+	std	r15,STK_REG(r15)(r1)
+	std	r16,STK_REG(r16)(r1)
+	std	r17,STK_REG(r17)(r1)
+	std	r18,STK_REG(r18)(r1)
+	std	r19,STK_REG(r19)(r1)
+	std	r20,STK_REG(r20)(r1)
+	std	r21,STK_REG(r21)(r1)
+	std	r22,STK_REG(r22)(r1)
+	std	r0,STACKFRAMESIZE+16(r1)
+
+	srdi	r6,r5,7
+	mtctr	r6
+
+	/* Now do cacheline (128B) sized loa