[PATCH] powerpc: Avoid taking a data miss on every userspace instruction miss
From: Anton Blanchard

Early on in do_page_fault() we call store_updates_sp(),
regardless of the type of exception. For an instruction miss this
doesn't make sense, because we only use this information to detect
if a data miss is the result of a stack expansion instruction or not.

Worse still, it results in a data miss within every userspace
instruction miss handler, because we try and load the very instruction
we are about to install a pte for!

A simple exec microbenchmark runs 6% faster on POWER8 with this fix:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
	unsigned long left = atol(argv[1]);
	char leftstr[16];

	if (left-- == 0)
		return 0;

	sprintf(leftstr, "%ld", left);
	execlp(argv[0], argv[0], leftstr, NULL);
	perror("exec failed\n");

	return 0;
}

Pass the number of iterations on the command line (eg 1) and time
how long it takes to execute.

Signed-off-by: Anton Blanchard
---
 arch/powerpc/mm/fault.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index fd6484fc2fa9..3a7d580fdc59 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -287,7 +287,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
 	 * can result in fault, which will cause a deadlock when called with
 	 * mmap_sem held
 	 */
-	if (user_mode(regs))
+	if (!is_exec && user_mode(regs))
 		store_update_sp = store_updates_sp(regs);
 
 	if (user_mode(regs))
-- 
2.11.0
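The effect of the one-line change can be sketched in plain C outside the kernel. This is only a toy model: struct model_fault and the model_* helpers are hypothetical stand-ins, not kernel code, and the counter simply records how often the faulting instruction would have been loaded.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the state do_page_fault() sees. */
struct model_fault {
	bool user_mode;   /* fault was taken from userspace */
	bool is_exec;     /* instruction miss rather than data miss */
	int insn_loads;   /* how many times the faulting instruction was "read" */
};

/* Stand-in for store_updates_sp(): it has to load the faulting instruction. */
static bool model_store_updates_sp(struct model_fault *f)
{
	f->insn_loads++;
	return false;   /* the decoded result is irrelevant to this sketch */
}

/* Behaviour before the patch: decode on every userspace fault. */
static bool model_check_before(struct model_fault *f)
{
	bool store_update_sp = false;

	if (f->user_mode)
		store_update_sp = model_store_updates_sp(f);
	return store_update_sp;
}

/* Behaviour after the patch: skip the decode for instruction misses. */
static bool model_check_after(struct model_fault *f)
{
	bool store_update_sp = false;

	if (!f->is_exec && f->user_mode)
		store_update_sp = model_store_updates_sp(f);
	return store_update_sp;
}
```

For a userspace instruction miss, model_check_before() bumps insn_loads while model_check_after() leaves it at zero; data misses behave the same in both versions.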
Re: [PATCH kernel] powerpc/iommu: Do not call PageTransHuge() on tail pages
On Tue, 2017-03-28 at 16:25 +1100, Alexey Kardashevskiy wrote:
> The CMA pages migration code does not support compound pages at
> the moment so it performs few tests before proceeding to actual page
> migration.
> 
> One of the tests - PageTransHuge() - has VM_BUG_ON_PAGE(PageTail()) as
> it should be called on head pages. Since we also test for PageCompound(),
> and it contains PageTail(), we can simply move PageCompound() in front
> of PageTransHuge() and therefore avoid possible VM_BUG_ON_PAGE.
> 
> Signed-off-by: Alexey Kardashevskiy
> ---

The fix looks reasonable to me. I suspect the checks can be simplified
and we can support split and move of THP in the future. For now, looks good

Acked-by: Balbir Singh
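The reordering relies on short-circuit evaluation, which a small userspace model can illustrate. Everything below is a hypothetical stand-in (struct model_page and the model_* helpers are not the kernel's page flag API), with assert() playing the role of VM_BUG_ON_PAGE().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical page model: a compound page is either a head or a tail. */
struct model_page {
	bool head;
	bool tail;
};

static bool model_page_tail(const struct model_page *p)
{
	return p->tail;
}

static bool model_page_compound(const struct model_page *p)
{
	return p->head || p->tail;
}

/* Like the check described above: must only be called on non-tail pages. */
static bool model_page_trans_huge(const struct model_page *p)
{
	assert(!model_page_tail(p));   /* stands in for VM_BUG_ON_PAGE() */
	return p->head;
}

/*
 * Compound test first: for a tail page the || short-circuits, so the
 * asserting helper is never reached.
 */
static bool model_cma_page_ok(const struct model_page *p)
{
	if (model_page_compound(p) || model_page_trans_huge(p))
		return false;
	return true;
}
```

With the two tests swapped, a tail page would hit the assertion before the compound check had a chance to reject it.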
[PATCH] powerpc/misc: fix exported functions that reference the TOC
When the kernel is compiled to use 64bit ABIv2 the _GLOBAL() macro does not
include a global entry point. A function's global entry point is used when the
function is called from a different TOC context and in the kernel this
typically means a call from a module into the vmlinux (or vice versa).

There are a few exported ASM functions declared with _GLOBAL() and calling
them from a module will likely crash the kernel since any TOC relative
load will yield garbage.

To fix this use _GLOBAL_TOC() for exported asm functions rather than
_GLOBAL() and add some documentation about when to use each.

Signed-off-by: Oliver O'Halloran
---
 arch/powerpc/include/asm/ppc_asm.h | 12 ++++++++++++
 arch/powerpc/kernel/misc_64.S      |  4 ++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
index 359c443..3abf8c3 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -198,6 +198,18 @@ END_FW_FTR_SECTION_IFSET(FW_FEATURE_SPLPAR)
 
 #ifdef PPC64_ELF_ABI_v2
 
+/*
+ * When to use _GLOBAL_TOC() instead of _GLOBAL():
+ *
+ * a) The function is exported using EXPORT_SYMBOL_*()
+ *  *and*
+ * b) The function, or any function that it calls, references the TOC.
+ *
+ * In this situation _GLOBAL_TOC() is required because exported functions are
+ * callable from modules which may a different TOC to the kernel proper and the
+ * _GLOBAL() macro skips the TOC setup which is required on ELF ABIv2.
+ */
+
 #define _GLOBAL(name) \
 	.align 2 ; \
 	.type name,@function; \
diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S
index ec94aef..d18da8c 100644
--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -67,7 +67,7 @@ PPC64_CACHES:
  *   flush all bytes from start through stop-1 inclusive
  */
 
-_GLOBAL(flush_icache_range)
+_GLOBAL_TOC(flush_icache_range)
 BEGIN_FTR_SECTION
 	PURGE_PREFETCHED_INS
 	blr
@@ -120,7 +120,7 @@ EXPORT_SYMBOL(flush_icache_range)
  *
  *   flush all bytes from start to stop-1 inclusive
  */
-_GLOBAL(flush_dcache_range)
+_GLOBAL_TOC(flush_dcache_range)
 
 /*
  * Flush the data cache to memory
-- 
2.9.3
Re: [PATCH 12/12] powerpc/kvm: Native usage of the XIVE interrupt controller
On Tue, 2017-03-28 at 16:26 +1100, Paul Mackerras wrote:
> 
> > --- a/arch/powerpc/include/asm/kvm_book3s_asm.h
> > +++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
> > @@ -111,6 +111,8 @@ struct kvmppc_host_state {
> >  	struct kvm_vcpu *kvm_vcpu;
> >  	struct kvmppc_vcore *kvm_vcore;
> >  	void __iomem *xics_phys;
> > +	void __iomem *xive_tm_area_phys;
> > +	void __iomem *xive_tm_area_virt;
> 
> Does this cause the paca to become a cacheline larger?  (Not that
> there is much alternative to having these fields.)

It does, though as you said, there's little I can do here.

 .../...

> > 
> > +/* QW0 and QW1 of a context */
> > +union xive_qw01 {
> > +	struct {
> > +		u8	nsr;
> > +		u8	cppr;
> > +		u8	ipb;
> > +		u8	lsmfb;
> > +		u8	ack;
> > +		u8	inc;
> > +		u8	age;
> > +		u8	pipr;
> > +	};
> > +	__be64 qw;
> > +};
> 
> This is slightly confusing because a "QW" (quadword) would normally
> be 128 bits, but this union is 64 bits.

It's me being wrong. It's not QW0 and QW1, it's word 0 and 1 of the QW.
Word 2 is used for setting up the CAM and Word 3 is unused.

I'll fixup the naming.

> > 
> > +extern int kvmppc_xive_set_xive(struct kvm *kvm, u32 irq, u32
> > server,
> > +				u32 priority);
> > +extern int kvmppc_xive_get_xive(struct kvm *kvm, u32 irq, u32
> > *server,
> > +				u32 *priority);
> 
> Might be worth a comment here to explain that the first xive is
> eXternal Interrupt Virtualization Engine and the second xive is
> eXternal Interrupt Vector Entry.

Haha, indeed ;-) I'll add something.

> > 
> > +static inline void kvmppc_set_xive_tm_area_phys(int cpu, unsigned
> > long addr)
> > +{}
> 
> Shouldn't this be kvmppc_set_xive_tm_area to match the other
> definition?

Yup. Bit-rot from earlier versions of the patch that only had "phys"
(real mode only).

> > --- a/arch/powerpc/include/asm/xive.h
> > +++ b/arch/powerpc/include/asm/xive.h
> > @@ -55,7 +55,8 @@ struct xive_q {
> >  #define XIVE_ESB_SET_PQ_01	0xd00
> >  #define XIVE_ESB_SET_PQ_10	0xe00
> >  #define XIVE_ESB_SET_PQ_11	0xf00
> > -#define XIVE_ESB_MASK		XIVE_ESB_SET_PQ_01
> > +#define XIVE_ESB_SOFT_MASK	XIVE_ESB_SET_PQ_10
> > +#define XIVE_ESB_HARD_MASK	XIVE_ESB_SET_PQ_01
> 
> What's the difference between a "soft" mask and a "hard" mask?

I'll document, though I may not use the "aliases" anymore if it's
just confusing. (Basically soft mask will remember in Q if something
happens while masked, hard mask will not).

> > 
> > -	kvmppc_xics_set_mapped(kvm, guest_gsi, desc-
> > >irq_data.hwirq);
> > +	if (xive_enabled())
> > +		rc = kvmppc_xive_set_mapped(kvm, guest_gsi, desc);
> > +	else
> > +		kvmppc_xics_set_mapped(kvm, guest_gsi, desc-
> > >irq_data.hwirq);
> > +	printk("set mapped for IRQ %d -> %d returned %d\n",
> > +	       host_irq, guest_gsi, rc);
> 
> This seems like a debugging thing that should be removed or turned
> into a DBG().

Yup, forgot about it.

> > @@ -398,6 +422,9 @@ static long kvmppc_read_one_intr(bool *again)
> >  	u8 host_ipi;
> >  	int64_t rc;
> >  
> > +	if (xive_enabled())
> > +		return 1;
> 
> Why not do this in kvmppc_read_intr() rather than here?

Dunno, probably missed that loop. I'll change it

> > paca */
> > +#ifdef CONFIG_KVM_XICS
> > +	/* We are exiting, pull the VP from the XIVE */
> > +	lwz	r0, VCPU_XIVE_PUSHED(r9)
> > +	cmpwi	cr0, r0, 0
> > +	beq	1f
> > +	li	r7, TM_SPC_PULL_OS_CTX
> > +	li	r6, TM_QW1_OS
> > +	mfmsr	r0
> > +	andi.	r0, r0, MSR_IR		/* in real
> > mode? */
> > +	beq	2f
> > +	ld	r10, HSTATE_XIVE_TM_AREA_VIRT(r13)
> > +	cmpldi	cr0, r10, 0
> > +	beq	1f
> > +	lwzx	r11, r7, r10
> > +	eieio
> > +	ldx	r11, r6, r10
> 
> I assume you meant to do these two loads into the same target
> register, but I don't know why, so a comment would be useful.

Right. We don't care about the result of the first one. It's the special
side-effect load to perform the pull. It doesn't return useful info
(the spec isn't clear there, so I should document it).

Once we have pulled, the TM OS area is frozen so I can do a 64-bit
load to get W0 and W1 & back them up.

> > +	b	3f
> > +2:	ld	r10, HSTATE_XIVE_TM_AREA_PHYS(r13)
> > +	cmpldi	cr0, r10, 0
> > +	beq	1f
> > +	lwzcix	r11, r7, r10
> > +	eieio
> > +	ldcix	r11, r6, r10
> > +3:	std	r11, VCPU_XIVE_SAVED_STATE(r9)
> > +	/* Fixup some of the state for the next load */
> > +	li	r10, 0
> > +	li	r0, 0xff
> > +	stw	r10, VCPU_XIVE_PUSHED(r9)
> > +	stb	r10, (VCPU_XIVE_SAVED_STATE+3)(r9)
> > +	stb	r0, (VCPU_XIVE_SAVED_STATE+4)(r9)
> > +1:
> > +#endif /* CONFIG_KVM_XICS */
> >  	/* Save more register state */
> >  	mfdar	r6
> >  	mfdsisr	r7
> > @@ -2035,7 +2086,7 @@ hcall_real_table:
> >  	.long	D
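As a side note on the naming confusion discussed earlier in this reply, a userspace mirror of the quoted union makes the sizes and byte offsets concrete. This is a sketch only: union model_xive_w01 and model_fixup_saved_state() are hypothetical names, and uint8_t/uint64_t stand in for the kernel's u8 and __be64.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors the field order of the quoted union; names are illustrative. */
union model_xive_w01 {
	struct {
		uint8_t nsr;
		uint8_t cppr;
		uint8_t ipb;
		uint8_t lsmfb;
		uint8_t ack;
		uint8_t inc;
		uint8_t age;
		uint8_t pipr;
	};
	uint64_t w01;   /* words 0 and 1 together: 64 bits, not a 128-bit quadword */
};

/*
 * Same byte-level fixup as the two stb instructions quoted above:
 * clear the byte at offset 3 and set the byte at offset 4 to 0xff.
 */
static void model_fixup_saved_state(union model_xive_w01 *s)
{
	uint8_t *bytes = (uint8_t *)s;

	bytes[3] = 0;
	bytes[4] = 0xff;
}
```

In this layout the union is 8 bytes wide, and offsets 3 and 4 fall on the lsmfb and ack fields, which is where the saved-state fixup stores land.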
[PATCH] powerpc/nohash: Fix use of mmu_has_feature() in setup_initial_memory_limit()
setup_initial_memory_limit() is called from early_init_devtree(), which runs
prior to feature patching. If the kernel is built with CONFIG_JUMP_LABEL=y and
CONFIG_JUMP_LABEL_FEATURE_CHECKS=y then we will potentially get the wrong value.

If we also have CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG=y we get a warning and
backtrace:

  Warning! mmu_has_feature() used prior to jump label init!
  CPU: 0 PID: 0 Comm: swapper Not tainted 4.11.0-rc4-gccN-next-20170331-g6af2434 #1
  Call Trace:
  [c0fc3d50] [c0a26c30] .dump_stack+0xa8/0xe8 (unreliable)
  [c0fc3de0] [c002e6b8] .setup_initial_memory_limit+0xa4/0x104
  [c0fc3e60] [c0d5c23c] .early_init_devtree+0xd0/0x2f8
  [c0fc3f00] [c0d5d3b0] .early_setup+0x90/0x11c
  [c0fc3f90] [c520] start_here_multiplatform+0x68/0x80

Fix it by using early_mmu_has_feature().

Fixes: c12e6f24d413 ("powerpc: Add option to use jump label for mmu_has_feature()")
Signed-off-by: Michael Ellerman
---
 arch/powerpc/mm/tlb_nohash.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/tlb_nohash.c b/arch/powerpc/mm/tlb_nohash.c
index ba28fcb98597..bfc4a0869609 100644
--- a/arch/powerpc/mm/tlb_nohash.c
+++ b/arch/powerpc/mm/tlb_nohash.c
@@ -770,7 +770,7 @@ void setup_initial_memory_limit(phys_addr_t first_memblock_base,
 	 * avoid going over total available memory just in case...
 	 */
 #ifdef CONFIG_PPC_FSL_BOOK3E
-	if (mmu_has_feature(MMU_FTR_TYPE_FSL_E)) {
+	if (early_mmu_has_feature(MMU_FTR_TYPE_FSL_E)) {
 		unsigned long linear_sz;
 		unsigned int num_cams;
 
-- 
2.7.4
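Why the two helpers can disagree early in boot can be sketched with a toy model. All names below are hypothetical, and the pre-patch default of true is an assumption made for illustration, not something the patch states; the point is only that a patched branch reports a fixed default until patching has run, while the early variant reads the feature mask directly.

```c
#include <stdbool.h>

#define MODEL_FTR_TYPE_FSL_E 0x1u          /* hypothetical bit value */

static unsigned int model_cur_features;    /* feature bits found at boot */
static bool model_key_fsl_e = true;        /* assumed pre-patch default of the branch */

/* Stand-in for feature patching: bake the real answer into the "branch". */
static void model_feature_patching(void)
{
	model_key_fsl_e = (model_cur_features & MODEL_FTR_TYPE_FSL_E) != 0;
}

/* Fast path: only trustworthy after model_feature_patching() has run. */
static bool model_mmu_has_feature_fsl_e(void)
{
	return model_key_fsl_e;
}

/* Early path: always tests the live feature mask. */
static bool model_early_mmu_has_feature(unsigned int ftr)
{
	return (model_cur_features & ftr) != 0;
}
```

On a model machine without the feature bit, the fast path answers true before patching and false afterwards, whereas the early path answers false throughout.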
Re: [PATCH 4/5] crypto/nx: Add P9 NX support for 842 compression engine.
Haren Myneni writes:
> @@ -656,13 +953,21 @@ static __init int nx842_powernv_init(void)
>  	BUILD_BUG_ON(DDE_BUFFER_ALIGN % DDE_BUFFER_SIZE_MULT);
>  	BUILD_BUG_ON(DDE_BUFFER_SIZE_MULT % DDE_BUFFER_LAST_MULT);
>  
> -	for_each_compatible_node(dn, NULL, "ibm,power-nx")
> -		nx842_powernv_probe(dn);
> +	if (is_vas_available()) {
> +		for_each_compatible_node(dn, NULL, "ibm,xscom")
> +			nx842_powernv_probe_vas(dn);

I'm not keen on how the device bindings work, instead, I think firmware
should provide a 'ibm,vas' compatible node, rather than simply searching
through all the ibm,xscom nodes.

XSCOMs aren't something that Linux should really know about, it's a
debug interface, and one we use through PRD to do PRD-things, XSCOMs
aren't part of the architecture.

-- 
Stewart Smith
OPAL Architect, IBM.
Re: [PATCH] powerpc: Add POWER9 copy_page() loop
On Mon, 2017-04-03 at 10:54 +1000, Anton Blanchard wrote:
> > > Good idea, I hadn't thought of embedding it all in a feature
> > > section.
> > 
> > It may not work currently because you get those ftr_alt_97 relocation
> > errors with the "else" parts because relative branches to other code
> > need to be direct and I think reachable from both places.
> 
> I thought about this a bit more. One potential issue will be
> profiling - perf annotate will match the samples against the unpatched
> code which could be very confusing.

Could we make all those functions a dynamic-linker style stub ? IE, they
"find" the right target function and call a helper to patch the calling
site to call directly into the right one on the first call.

Cheers,
Ben.
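A rough userspace analogue of that idea, as a sketch only: instead of patching the call site, which is what the suggestion above describes, this version rewrites a function pointer on the first call, which is the closest portable C equivalent. All names (model_copy_page, model_cpu_is_new, and so on) are hypothetical.

```c
#include <stddef.h>
#include <string.h>

typedef void (*model_copy_fn)(void *dst, const void *src, size_t n);

static int model_resolve_calls;   /* how often the resolver has run */
static int model_cpu_is_new;      /* hypothetical CPU feature flag */

/* Two interchangeable implementations; both just copy in this sketch. */
static void model_copy_generic(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
}

static void model_copy_new(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
}

static void model_copy_resolver(void *dst, const void *src, size_t n);

/* Starts out pointing at the resolver, like a lazily bound stub. */
static model_copy_fn model_copy_impl = model_copy_resolver;

/* First call: pick the target, rebind the pointer, then do the work. */
static void model_copy_resolver(void *dst, const void *src, size_t n)
{
	model_resolve_calls++;
	model_copy_impl = model_cpu_is_new ? model_copy_new : model_copy_generic;
	model_copy_impl(dst, src, n);
}

/* Callers always go through the pointer. */
static void model_copy_page(void *dst, const void *src, size_t n)
{
	model_copy_impl(dst, src, n);
}
```

After the first model_copy_page() call the resolver is out of the path, so later calls go straight to the chosen implementation and a profiler would attribute samples to the function that actually ran.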
Re: [PATCH] powerpc: Add POWER9 copy_page() loop
Hi Nick,

> > Good idea, I hadn't thought of embedding it all in a feature
> > section.
> 
> It may not work currently because you get those ftr_alt_97 relocation
> errors with the "else" parts because relative branches to other code
> need to be direct and I think reachable from both places.

I thought about this a bit more. One potential issue will be
profiling - perf annotate will match the samples against the unpatched
code which could be very confusing.

Anton
Re: [PATCH v4 04/11] VAS: Define vas_init() and vas_exit()
Hi Sukadev,

[auto build test ERROR on powerpc/next]
[also build test ERROR on v4.11-rc4 next-20170331]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Sukadev-Bhattiprolu/Add-Power9-PVR/20170402-155232
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-allmodconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=powerpc

All errors (new ones prefixed by >>):

   In file included from arch/powerpc/platforms/powernv/vas.c:17:0:
>> arch/powerpc/platforms/powernv/vas.h:17:3: error: #error "TODO: Compute RMA/Paste-address for 4K pages."
    # error "TODO: Compute RMA/Paste-address for 4K pages."
      ^

vim +17 arch/powerpc/platforms/powernv/vas.h

8d1d7c15 Sukadev Bhattiprolu 2017-03-30  11  #define _VAS_H
8d1d7c15 Sukadev Bhattiprolu 2017-03-30  12  #include
8d1d7c15 Sukadev Bhattiprolu 2017-03-30  13  #include
8d1d7c15 Sukadev Bhattiprolu 2017-03-30  14  #include
8d1d7c15 Sukadev Bhattiprolu 2017-03-30  15
8d1d7c15 Sukadev Bhattiprolu 2017-03-30  16  #ifdef CONFIG_PPC_4K_PAGES
8d1d7c15 Sukadev Bhattiprolu 2017-03-30 @17  # error "TODO: Compute RMA/Paste-address for 4K pages."
8d1d7c15 Sukadev Bhattiprolu 2017-03-30  18  #else
8d1d7c15 Sukadev Bhattiprolu 2017-03-30  19  #ifndef CONFIG_PPC_64K_PAGES
8d1d7c15 Sukadev Bhattiprolu 2017-03-30  20  # error "Unexpected Page size."

:: The code at line 17 was first introduced by commit
:: 8d1d7c159a85bf75ef7b06edf8f27ef56d6b4b3f VAS: Define macros, register fields and structures

:: TO: Sukadev Bhattiprolu
:: CC: 0day robot

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
Linux 4.11: Reported regressions as of Tuesday, 2017-04-02
Hi! Find below my second regression report for Linux 4.11. It lists 13
regressions I'm currently aware of. It lists 6 fixed regressions. Some of
them were in the first report from three weeks ago; a few were supposed
to go into a second report I prepared last week, but wasn't able to
finish :-/

As always: Are you aware of any other regressions? Then please let me
know (simply CC regressi...@leemhuis.info). And please tell me if there
is anything in the report that shouldn't be there.

Ciao, Thorsten

P.S.: Thx to all those that CCed me on regression reports, that makes
compiling these reports a whole lot easier!

== Current regressions ==

Desc: malta_defconfig regressions
Repo: 2017-03-31 https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1367470.html
Stat: n/a
Note: some patches already heading mainline

Desc: Commit d8514d8edb5b ("ovl: copy up regular file using O_TMPFILE") breaks ubifs
Repo: 2017-03-28 https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1363879.html
Stat: 2017-03-30 https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1366190.html https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1366208.html
Note: patches being tested; tests looking good so far

Desc: 07ec51480b5e ("virtio_pci: use shared interrupts for virtqueues") causes some kworker grief in -rt too
Repo: 2017-03-27 https://www.mail-archive.com/search?l=mid&q=1490605644.14634.50.ca...@gmx.de
Stat: 2017-03-31 https://www.mail-archive.com/search?l=mid&q=20170331082049.ga4...@lst.de
Note: hch is looking into this

Desc: HP 820 G3 becomes unstable after resume from suspend
Repo: 2017-03-25 https://bugzilla.kernel.org/show_bug.cgi?id=195041
Stat: 2017-03-26
Note: might be a duplicate of https://bugzilla.kernel.org/show_bug.cgi?id=194801 ; revert for that bug is heading upstream via davem (see https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1361534.html )

Desc: NVMe APST? Samsung PM951 NVMe sudden controller death
Repo: 2017-03-25 https://bugzilla.kernel.org/show_bug.cgi?id=195039
Stat: 2017-03-29
Note: Got luto and axboe into the loop, investigation ongoing, maybe a blacklist update is needed; issue might be the same as https://bugzilla.kernel.org/show_bug.cgi?id=194921 (see below)

Desc: NVMe APST? NVMe resets lead to capacity change to 0, leading to panics; Samsung SSD as well
Repo: 2017-03-18 https://bugzilla.kernel.org/show_bug.cgi?id=194921
Stat: 2017-03-28
Note: Got luto and axboe into the loop, investigation ongoing; maybe a blacklist update is needed; issue might be related to https://bugzilla.kernel.org/show_bug.cgi?id=195039 (see above)

Desc: Perf regression after enabling nvme APST
Repo: 2017-03-17 https://lkml.org/lkml/2017/3/17/177
Stat: 2017-03-20 https://lkml.org/lkml/2017/3/20/998
Note: luto: lying disk?

Desc: i915 gpu hangs under load
Repo: 2017-03-22 https://www.mail-archive.com/intel-gfx@lists.freedesktop.org/msg116227.html https://bugs.freedesktop.org/show_bug.cgi?id=100181
Stat: 2017-04-02 https://www.mail-archive.com/intel-gfx@lists.freedesktop.org/msg117315.html
Note: Reporter: "there's a fix out there. I don't know if it's in rc5 though."

Desc: 4.10/4.11: mmc: core: HS DDR switch, don't change timing before checking status, as it might lead to boot problems
Repo: 2017-03-10 https://patchwork.kernel.org/patch/9617489/
Stat: 2017-03-24
Note: looks like the real root cause was found, but then the discussion stalled afaics

Desc: Synaptics RMI4 touchpad regression in 4.11-rc1: pointer jumps
Repo: 2017-03-11 https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1351561.html
Stat: 2017-03-31 https://www.mail-archive.com/search?l=mid&q=20170331085751.gf22...@mail.corp.redhat.com
Note: two patches to improve the situation available; discussion which to use

Desc: Synaptics RMI4 touchpad regression in 4.11-rc1: palm detection
Repo: 2017-03-11 https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1351561.html
Stat: 2017-03-19 https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1356832.html
Note: discussion stalled; asked for an update

Desc: e1000e: __pci_enable_msi_range fails before/after resume
Repo: 2017-03-06 https://bugzilla.kernel.org/show_bug.cgi?id=194801
Stat: 2017-03-14 https://bugzilla.kernel.org/show_bug.cgi?id=194801#c1
Note: revert for that bug is heading upstream via davem (see https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1361534.html )

Desc: thinkpad x220: GPU hang
Repo: 2017-03-05 https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1345689.html
Stat: 2017-03-25 https://www.mail-archive.com/dri-devel@lists.freedesktop.org/msg180860.html
Note: Ignored by DRI people? Pavel wrote: "We know where the bug is, but there's no fix for it. There was one patch, but it was quickly withdrawn."

== Stalled, waiting for feedback from reporter ==

Desc: pine64 defconfig: WARNING: CPU: 0 PID: 86 at drivers/base/dd.c:349 driver_probe_device+0x258/0x2c0
Repo: 2017-03-25 https://b
[PATCH] powerpc/sequoia: fix NAND partitions not to overlap
Fix overlapping NAND partitions.

Signed-off-by: Pavel Machek

diff --git a/arch/powerpc/boot/dts/sequoia.dts b/arch/powerpc/boot/dts/sequoia.dts
index b1d3292..e41b88a 100644
--- a/arch/powerpc/boot/dts/sequoia.dts
+++ b/arch/powerpc/boot/dts/sequoia.dts
@@ -229,7 +229,7 @@
 				};
 				partition@84000 {
 					label = "user";
-					reg = <0x 0x01f7c000>;
+					reg = <0x00084000 0x01f7c000>;
 				};
 			};
 		};

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
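The arithmetic behind the corrected reg property can be checked with a few lines of C. This is a sketch under stated assumptions: struct model_part and its helpers are hypothetical, and the extent of the preceding partitions (taken here to end at 0x84000, as the node name partition@84000 suggests) is not shown in the hunk above.

```c
#include <stdbool.h>
#include <stdint.h>

/* A partition as a <base size> pair, like a reg property with two cells. */
struct model_part {
	uint32_t base;
	uint32_t size;
};

/* First address past the end of the partition. */
static uint32_t model_part_end(struct model_part p)
{
	return p.base + p.size;
}

/* Two half-open ranges overlap when each starts before the other ends. */
static bool model_parts_overlap(struct model_part a, struct model_part b)
{
	return a.base < model_part_end(b) && b.base < model_part_end(a);
}
```

With base 0x00084000 and size 0x01f7c000 the "user" partition ends at 0x02000000, so it starts where the assumed earlier region stops; a base below 0x84000 with the same size would run into that region.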