[PATCH] cxl: Force context lock during EEH flow
During an EEH event, when the cxl card is fenced and the card sysfs attribute perst_reloads_same_image is set, the following warning message is seen in the kernel logs:

[   60.622727] Adapter context unlocked with 0 active contexts
[   60.622762] [ cut here ]
[   60.622771] WARNING: CPU: 12 PID: 627 at ../drivers/misc/cxl/main.c:325 cxl_adapter_context_unlock+0x60/0x80 [cxl]

Even though this warning is harmless, it clutters the kernel log during an EEH event. The warning is triggered because the EEH callback cxl_pci_error_detected doesn't obtain a context-lock before forcibly detaching all active contexts, so when the context-lock is released during the call to cxl_configure_adapter from cxl_pci_slot_reset, the warning in cxl_adapter_context_unlock is triggered.

To fix this warning, and also to prevent activation of any context during EEH, the patch introduces a new function, cxl_adapter_context_force_lock, that forcefully acquires the context_lock and warns if any active contexts exist when this happens. After the EEH flow concludes with the call to cxl_pci_resume, the context-lock is released with a call to cxl_adapter_context_unlock.

Cc: sta...@vger.kernel.org
Fixes: 70b565bbdb91 ("cxl: Prevent adapter reset if an active context exists")
Reported-by: Andrew Donnellan
Signed-off-by: Vaibhav Jain
Reviewed-by: Andrew Donnellan
---
 drivers/misc/cxl/cxl.h  |  3 +++
 drivers/misc/cxl/main.c | 10 ++++++++++
 drivers/misc/cxl/pci.c  | 18 ++++++++++++++----
 3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index b24d767..34d2ca0 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -970,4 +970,7 @@ int cxl_adapter_context_lock(struct cxl *adapter);
 /* Unlock the contexts-lock if taken. Warn and force unlock otherwise */
 void cxl_adapter_context_unlock(struct cxl *adapter);
 
+/* Force contexts-lock to be taken */
+void cxl_adapter_context_force_lock(struct cxl *adapter);
+
 #endif

diff --git a/drivers/misc/cxl/main.c b/drivers/misc/cxl/main.c
index 62e0dfb..1754a0c 100644
--- a/drivers/misc/cxl/main.c
+++ b/drivers/misc/cxl/main.c
@@ -301,6 +301,16 @@ void cxl_adapter_context_put(struct cxl *adapter)
 	atomic_dec_if_positive(&adapter->contexts_num);
 }
 
+void cxl_adapter_context_force_lock(struct cxl *adapter)
+{
+	int count = atomic_read(&adapter->contexts_num);
+
+	if (count > 0)
+		pr_warn("Forcing context lock with %d active contexts\n",
+			count);
+	atomic_set(&adapter->contexts_num, -1);
+}
+
 int cxl_adapter_context_lock(struct cxl *adapter)
 {
 	int rc;

diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 80a87ab..d76fd4d 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -1487,8 +1487,6 @@ static int cxl_configure_adapter(struct cxl *adapter, struct pci_dev *dev)
 	if ((rc = cxl_native_register_psl_err_irq(adapter)))
 		goto err;
 
-	/* Release the context lock as adapter is configured */
-	cxl_adapter_context_unlock(adapter);
 	return 0;
 
 err:
@@ -1587,6 +1585,9 @@ static struct cxl *cxl_pci_init_adapter(struct pci_dev *dev)
 	if ((rc = cxl_sysfs_adapter_add(adapter)))
 		goto err_put1;
 
+	/* Release the context lock as adapter is configured */
+	cxl_adapter_context_unlock(adapter);
+
 	return adapter;
 
 err_put1:
@@ -1607,6 +1608,9 @@ static void cxl_pci_remove_adapter(struct cxl *adapter)
 {
 	pr_devel("cxl_remove_adapter\n");
 
+	/* Forcibly take the adapter context lock */
+	cxl_adapter_context_force_lock(adapter);
+
 	cxl_sysfs_adapter_remove(adapter);
 	cxl_debugfs_adapter_remove(adapter);
 
@@ -1778,6 +1782,9 @@ static pci_ers_result_t cxl_pci_error_detected(struct pci_dev *pdev,
 	 */
 	schedule();
 
+	/* forcibly take the context lock to prevent new context activation */
+	cxl_adapter_context_force_lock(adapter);
+
 	/* If we're permanently dead, give up. */
 	if (state == pci_channel_io_perm_failure) {
 		/* Tell the AFU drivers; but we don't care what they
@@ -1879,11 +1886,15 @@ static pci_ers_result_t cxl_pci_error_detected(struct pci_dev *pdev,
 		/* Only continue if everyone agrees on NEED_RESET */
 		if (result != PCI_ERS_RESULT_NEED_RESET)
 			return result;
+	}
 
+	for (i = 0; i < adapter->slices; i++) {
+		afu = adapter->afu[i];
 		cxl_context_detach_all(afu);
 		cxl_ops->afu_deactivate_mode(afu, afu->current_mode);
 		pci_deconfigure_afu(afu);
 	}
+
 	cxl_deconfigure_adapter(adapter);
 
 	return result;
@@ -1979,6 +1990,9 @@ static void cxl_pci_resume(struct pci_dev *pdev)
 			afu_dev->driver->err_handler->resume(afu_dev);
 	}
Re: [PATCH] powerpc/mm: Fix spurious segfaults on radix with Autonuma
Benjamin Herrenschmidt writes:

> When autonuma marks a PTE inaccessible it clears all the protection
> bits but leaves the PTE valid.
>
> With the Radix MMU, an attempt at executing from such a PTE will
> take a fault with bit 35 of SRR1 set "SRR1_ISI_N_OR_G".
>
> It is thus incorrect to treat all such faults as errors. We should
> pass them to handle_mm_fault() for autonuma to deal with. The case
> of pages that are really not executable is handled by the existing
> test for VM_EXEC further down.
>
> That leaves us with catching the kernel attempts at executing user
> pages. We can catch that earlier, even before we do find_vma.
>
> It is never valid on powerpc for the kernel to take an exec fault
> to begin with. So fold that test with the existing test for the
> kernel faulting on kernel addresses to bail out early.
>
> Signed-off-by: Benjamin Herrenschmidt
> Fixes: 1d18ad0 ("powerpc/mm: Detect instruction fetch denied and report")
> Fixes: 0ab5171 ("powerpc/mm: Fix no execute fault handling on pre-POWER5")

Reviewed-by: Aneesh Kumar K.V

> ---
>
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 6fd30ac..62a50d6 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -253,8 +253,11 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
>  	if (unlikely(debugger_fault_handler(regs)))
>  		goto bail;
>
> -	/* On a kernel SLB miss we can only check for a valid exception entry */
> -	if (!user_mode(regs) && (address >= TASK_SIZE)) {
> +	/*
> +	 * The kernel should never take an execute fault nor should it
> +	 * take a page fault to a kernel address.
> +	 */
> +	if (!user_mode(regs) && (is_exec || (address >= TASK_SIZE))) {
>  		rc = SIGSEGV;
>  		goto bail;
>  	}
> @@ -391,20 +394,6 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
>
>  	if (is_exec) {
>  		/*
> -		 * An execution fault + no execute ?
> -		 *
> -		 * On CPUs that don't have CPU_FTR_COHERENT_ICACHE we
> -		 * deliberately create NX mappings, and use the fault to do the
> -		 * cache flush. This is usually handled in hash_page_do_lazy_icache()
> -		 * but we could end up here if that races with a concurrent PTE
> -		 * update. In that case we need to fall through here to the VMA
> -		 * check below.
> -		 */
> -		if (cpu_has_feature(CPU_FTR_COHERENT_ICACHE) &&
> -		    (regs->msr & SRR1_ISI_N_OR_G))
> -			goto bad_area;
> -
> -		/*
>  		 * Allow execution from readable areas if the MMU does not
>  		 * provide separate controls over reading and executing.
>  		 *
Re: [PATCH] powerpc/powernv: implement NMI IPI with OPAL_SIGNAL_SYSTEM_RESET
On Fri, 03 Feb 2017 17:00:19 +1100 Alistair Popple wrote:

> On Fri, 3 Feb 2017 03:20:48 PM Nicholas Piggin wrote:
> > On Fri, 3 Feb 2017 00:25:11 +1000
> > Nicholas Piggin wrote:
> >
> > > This goes with the previous NMI IPI series, and a new version of
> > > Alistair's opal API I posted to the skiboot list.
> >
> > And here is the incremental bit that is required for Alistair's
> > hardware implementation to work.
> >
> > If the opal broadcast call fails with OPAL_PARTIAL, then we designate
> > a bouncer CPU on another core to send NMI IPIs back to our sibling
> > threads.
> >
> > Probably needs more discussion and testing about how to detect and
> > handle failure cases and future compatibility for different types of
> > restrictions, but at least it works in mambo.
>
> Looking at the Mambo implementation you recently posted I couldn't see
> where OPAL_PARTIAL was returned. So I guess you have other Skiboot
> patches to test this path using Mambo? I'm wondering because they
> could be useful for testing the skiboot HW side as well...

Oh, I just did it all on the Linux side: I forced it to always do the bounce rather than try broadcast first (I was lazy). But we'll make mambo match the hardware implementation as far as possible once you've got something.

> > Of course the other option rather than doing this in Linux is to call
> > into opal in the system reset handler and have it do the bouncing.
> > Something to consider before we finalise the API.
>
> That might not be such a bad idea. OPAL could queue up the threads it
> couldn't reset and then wait until opal_sreset_complete() is called
> from an eligible thread to reset the ones it couldn't do in the
> original call.

Yeah, IIRC we thought it might have been harder to do in firmware, but that may not be the case. It may not be a bad idea in general for opal to have a token available to hook system resets too.

> I might try prototyping something like this when I get some time. One
> issue would be if there is only a single core in the system, but
> that's unlikely and I think that's probably something we can't support
> in any case as cores can't sreset threads on the same core, at least
> on P8.

Yes, on powernv it would be unusual, except maybe mambo where we can make it work.

Thanks,
Nick
[PATCH] powerpc/mm: Fix spurious segfaults on radix with Autonuma
When autonuma marks a PTE inaccessible it clears all the protection bits but leaves the PTE valid.

With the Radix MMU, an attempt at executing from such a PTE will take a fault with bit 35 of SRR1 set, "SRR1_ISI_N_OR_G".

It is thus incorrect to treat all such faults as errors. We should pass them to handle_mm_fault() for autonuma to deal with. The case of pages that are really not executable is handled by the existing test for VM_EXEC further down.

That leaves us with catching the kernel attempts at executing user pages. We can catch that earlier, even before we do find_vma. It is never valid on powerpc for the kernel to take an exec fault to begin with. So fold that test with the existing test for the kernel faulting on kernel addresses to bail out early.

Signed-off-by: Benjamin Herrenschmidt
Fixes: 1d18ad0 ("powerpc/mm: Detect instruction fetch denied and report")
Fixes: 0ab5171 ("powerpc/mm: Fix no execute fault handling on pre-POWER5")
---

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 6fd30ac..62a50d6 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -253,8 +253,11 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
 	if (unlikely(debugger_fault_handler(regs)))
 		goto bail;
 
-	/* On a kernel SLB miss we can only check for a valid exception entry */
-	if (!user_mode(regs) && (address >= TASK_SIZE)) {
+	/*
+	 * The kernel should never take an execute fault nor should it
+	 * take a page fault to a kernel address.
+	 */
+	if (!user_mode(regs) && (is_exec || (address >= TASK_SIZE))) {
 		rc = SIGSEGV;
 		goto bail;
 	}
@@ -391,20 +394,6 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
 
 	if (is_exec) {
 		/*
-		 * An execution fault + no execute ?
-		 *
-		 * On CPUs that don't have CPU_FTR_COHERENT_ICACHE we
-		 * deliberately create NX mappings, and use the fault to do the
-		 * cache flush. This is usually handled in hash_page_do_lazy_icache()
-		 * but we could end up here if that races with a concurrent PTE
-		 * update. In that case we need to fall through here to the VMA
-		 * check below.
-		 */
-		if (cpu_has_feature(CPU_FTR_COHERENT_ICACHE) &&
-		    (regs->msr & SRR1_ISI_N_OR_G))
-			goto bad_area;
-
-		/*
 		 * Allow execution from readable areas if the MMU does not
 		 * provide separate controls over reading and executing.
 		 *
Re: [PATCH] powerpc/powernv: implement NMI IPI with OPAL_SIGNAL_SYSTEM_RESET
On Fri, 3 Feb 2017 03:20:48 PM Nicholas Piggin wrote:

> On Fri, 3 Feb 2017 00:25:11 +1000
> Nicholas Piggin wrote:
>
> > This goes with the previous NMI IPI series, and a new version of
> > Alistair's opal API I posted to the skiboot list.
>
> And here is the incremental bit that is required for Alistair's
> hardware implementation to work.
>
> If the opal broadcast call fails with OPAL_PARTIAL, then we designate
> a bouncer CPU on another core to send NMI IPIs back to our sibling
> threads.
>
> Probably needs more discussion and testing about how to detect and
> handle failure cases and future compatibility for different types of
> restrictions, but at least it works in mambo.

Looking at the Mambo implementation you recently posted I couldn't see where OPAL_PARTIAL was returned. So I guess you have other Skiboot patches to test this path using Mambo? I'm wondering because they could be useful for testing the skiboot HW side as well...

> Of course the other option rather than doing this in Linux is to call
> into opal in the system reset handler and have it do the bouncing.
> Something to consider before we finalise the API.

That might not be such a bad idea. OPAL could queue up the threads it couldn't reset and then wait until opal_sreset_complete() is called from an eligible thread to reset the ones it couldn't do in the original call.

I might try prototyping something like this when I get some time. One issue would be if there is only a single core in the system, but that's unlikely and I think that's probably something we can't support in any case as cores can't sreset threads on the same core, at least on P8.
- Alistair
Re: [PATCH] powerpc/powernv: implement NMI IPI with OPAL_SIGNAL_SYSTEM_RESET
On Fri, 3 Feb 2017 00:25:11 +1000 Nicholas Piggin wrote:

> This goes with the previous NMI IPI series, and a new version of
> Alistair's opal API I posted to the skiboot list.

And here is the incremental bit that is required for Alistair's hardware implementation to work.

If the opal broadcast call fails with OPAL_PARTIAL, then we designate a bouncer CPU on another core to send NMI IPIs back to our sibling threads.

Probably needs more discussion and testing about how to detect and handle failure cases and future compatibility for different types of restrictions, but at least it works in mambo.

Of course the other option rather than doing this in Linux is to call into opal in the system reset handler and have it do the bouncing. Something to consider before we finalise the API.

Thanks,
Nick

---
 arch/powerpc/platforms/powernv/smp.c | 88 +++++++++++++++++++++++++++++++++---
 1 file changed, 82 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index f90555f75723..4241fda9df9e 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -241,8 +241,32 @@ static int pnv_cpu_bootable(unsigned int nr)
 	return smp_generic_cpu_bootable(nr);
 }
 
+static int nmi_ipi_bounce_cpu;
+static int nmi_ipi_bounce_cpu_done;
+static int nmi_ipi_bounce_target_core;
+static int nmi_ipi_bounce_target_exclude;
+
 int pnv_system_reset_exception(struct pt_regs *regs)
 {
+	smp_rmb();
+	if (nmi_ipi_bounce_cpu == smp_processor_id()) {
+		int64_t rc;
+		int c;
+
+		for_each_online_cpu(c) {
+			if (!cpumask_test_cpu(c, cpu_sibling_mask(nmi_ipi_bounce_target_core)))
+				continue;
+			if (c == nmi_ipi_bounce_target_exclude)
+				continue;
+			rc = opal_signal_system_reset(c);
+			if (rc != OPAL_SUCCESS) {
+				nmi_ipi_bounce_cpu_done = -1;
+				return 1;
+			}
+		}
+		nmi_ipi_bounce_cpu_done = 1;
+	}
+
 	if (smp_handle_nmi_ipi(regs))
 		return 1;
 	return 0;
@@ -252,13 +276,65 @@ static int pnv_cause_nmi_ipi(int cpu)
 {
 	int64_t rc;
 
-	rc = opal_signal_system_reset(cpu);
-	if (rc == OPAL_SUCCESS)
-		return 1;
+	if (cpu >= 0) {
+		rc = opal_signal_system_reset(cpu);
+		if (rc == OPAL_SUCCESS)
+			return 1;
+	} else {
+		/*
+		 * Test bounce behavior with broadcast IPI.
+		 */
+		rc = OPAL_PARTIAL;
+	}
+	if (rc == OPAL_PARTIAL) {
+		int c;
 
-	/*
-	 * Don't cope with OPAL_PARTIAL yet (just punt to regular IPI)
-	 */
+		/*
+		 * Some platforms can not send NMI to sibling threads in
+		 * the same core. We can designate one inter-core target
+		 * to bounce NMIs back to our sibling threads.
+		 */
+
+		if (cpu >= 0) {
+			/*
+			 * Don't support bouncing unicast NMIs yet (because
+			 * that would have to raise an NMI on an unrelated
+			 * CPU. Revisit this if callers start using unicast.
+			 */
+			return 0;
+		}
+
+		nmi_ipi_bounce_cpu = -1;
+		nmi_ipi_bounce_cpu_done = 0;
+		nmi_ipi_bounce_target_core = -1;
+		nmi_ipi_bounce_target_exclude = -1;
+		smp_wmb();
+
+		for_each_online_cpu(c) {
+			if (cpumask_test_cpu(c, cpu_sibling_mask(smp_processor_id())))
+				continue;
+
+			if (nmi_ipi_bounce_cpu == -1) {
+				nmi_ipi_bounce_cpu = c;
+				nmi_ipi_bounce_target_core = smp_processor_id();
+				if (cpu == NMI_IPI_ALL_OTHERS)
+					nmi_ipi_bounce_target_exclude = smp_processor_id();
+			}
+
+			rc = opal_signal_system_reset(c);
+			if (rc != OPAL_SUCCESS)
+				return 0;
+		}
+
+		if (nmi_ipi_bounce_cpu == -1)
+			return 0; /* could not find a bouncer */
+
+		while (!nmi_ipi_bounce_cpu_done)
+			cpu_relax();
+
+		if (nmi_ipi_bounce_cpu_done == 1)
+			return 1; /* bounce worked */
+	}
 
 	return 0;
 }
-- 
2.11.0
Re: [PATCH v2] EDAC: mpc85xx: Add T2080 l2-cache support
Chris Packham writes:

> On 03/02/17 12:55, Michael Ellerman wrote:
>> Chris if you want to send a patch to add the compatible string to the
>> l2cache.txt I would merge that, but honestly it doesn't achieve much
>> other than possibly catching a typo in the compatible name.
>
> I think catching a typo might be worthwhile. It's 5 minutes work for me
> to grep/sed through the code to find existing compatible strings and
> update the document. Which might save someone else a lot of time
> debugging only to find out they've transposed some digits in the dts.
>
> I'll whip something up and send it out shortly.

Thanks.

cheers
[PATCH v2 1/1] powerpc: mm: support ARCH_MMAP_RND_BITS
powerpc's arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset for the mmap base address. This value represents a compromise between increased ASLR effectiveness and avoiding address-space fragmentation. Replace it with a Kconfig option, which is sensibly bounded, so that platform developers may choose where to place this compromise. Keep the default values as the new minimums.

This patch makes the powerpc arch_mmap_rnd() approach similar to other architectures like x86, arm64 and arm.

Cc: Alexander Graf
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: Anatolij Gustschin
Cc: Alistair Popple
Cc: Matt Porter
Cc: Vitaly Bordug
Cc: Scott Wood
Cc: Kumar Gala
Cc: Daniel Cashman
Signed-off-by: Bhupesh Sharma
Reviewed-by: Kees Cook
---
Changes since v1:
v1 can be seen here (https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-February/153594.html)
- No functional change in this patch.
- Added R-b from Kees.
- Dropped PATCH 2/2 from v1 as recommended by Kees Cook.

 arch/powerpc/Kconfig   | 34 ++++++++++++++++++++++++++++++++++
 arch/powerpc/mm/mmap.c |  7 ++++---
 2 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index a8ee573fe610..b4a843f68705 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -22,6 +22,38 @@ config MMU
 	bool
 	default y
 
+config ARCH_MMAP_RND_BITS_MIN
+	default 5 if PPC_256K_PAGES && 32BIT
+	default 12 if PPC_256K_PAGES && 64BIT
+	default 7 if PPC_64K_PAGES && 32BIT
+	default 14 if PPC_64K_PAGES && 64BIT
+	default 9 if PPC_16K_PAGES && 32BIT
+	default 16 if PPC_16K_PAGES && 64BIT
+	default 11 if PPC_4K_PAGES && 32BIT
+	default 18 if PPC_4K_PAGES && 64BIT
+
+# max bits determined by the following formula:
+# VA_BITS - PAGE_SHIFT - 4
+# e.g. for 64K pages and 64BIT: 48 - 16 - 4 = 28
+config ARCH_MMAP_RND_BITS_MAX
+	default 10 if PPC_256K_PAGES && 32BIT
+	default 26 if PPC_256K_PAGES && 64BIT
+	default 12 if PPC_64K_PAGES && 32BIT
+	default 28 if PPC_64K_PAGES && 64BIT
+	default 14 if PPC_16K_PAGES && 32BIT
+	default 30 if PPC_16K_PAGES && 64BIT
+	default 16 if PPC_4K_PAGES && 32BIT
+	default 32 if PPC_4K_PAGES && 64BIT
+
+config ARCH_MMAP_RND_COMPAT_BITS_MIN
+	default 5 if PPC_256K_PAGES
+	default 7 if PPC_64K_PAGES
+	default 9 if PPC_16K_PAGES
+	default 11
+
+config ARCH_MMAP_RND_COMPAT_BITS_MAX
+	default 16
+
 config HAVE_SETUP_PER_CPU_AREA
 	def_bool PPC64
 
@@ -100,6 +132,8 @@ config PPC
 	select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && POWER7_CPU)
 	select HAVE_KPROBES
 	select HAVE_ARCH_KGDB
+	select HAVE_ARCH_MMAP_RND_BITS
+	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
 	select HAVE_KRETPROBES
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_MEMBLOCK

diff --git a/arch/powerpc/mm/mmap.c b/arch/powerpc/mm/mmap.c
index 2f1e44362198..babf59faab3b 100644
--- a/arch/powerpc/mm/mmap.c
+++ b/arch/powerpc/mm/mmap.c
@@ -60,11 +60,12 @@ unsigned long arch_mmap_rnd(void)
 {
 	unsigned long rnd;
 
-	/* 8MB for 32bit, 1GB for 64bit */
+#ifdef CONFIG_COMPAT
 	if (is_32bit_task())
-		rnd = get_random_long() % (1<<(23-PAGE_SHIFT));
+		rnd = get_random_long() & ((1UL << mmap_rnd_compat_bits) - 1);
 	else
-		rnd = get_random_long() % (1UL<<(30-PAGE_SHIFT));
+#endif
+		rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1);
 
 	return rnd << PAGE_SHIFT;
 }
-- 
2.7.4
Re: modversions: redefine kcrctab entries as 32-bit values
+++ Jessica Yu [02/02/17 22:54 -0500]:
> +++ Ard Biesheuvel [24/01/17 16:16 +]:
> > This v4 is a followup to [0] 'modversions: redefine kcrctab entries as
> > relative CRC pointers', but since relative CRC pointers do not work in
> > modules, and are actually only needed by powerpc with CONFIG_RELOCATABLE=y,
> > I have made it a Kconfig selectable feature instead.
> >
> > Patch #1 introduces the MODULE_REL_CRCS Kconfig symbol, and adds the
> > kbuild handling of it, i.e., modpost, genksyms and kallsyms.
> >
> > Patch #2 switches all architectures to 32-bit CRC entries in kcrctab,
> > where all architectures except powerpc with CONFIG_RELOCATABLE=y use
> > absolute ELF symbol references as before.
> >
> > v4: make relative CRCs kconfig selectable
> >     use absolute CRC symbols in modules regardless of kconfig selection
> >     split into two patches
>
> This asymmetry threw me off a bit, especially the Kconfig naming (only
> vmlinux crcs get the relative offsets, and only on powerpc atm, but all
> modules keep the absolute syms, yet it is called MODULE_REL_CRCS...). If
> we keep this asymmetric crc treatment, it would be really nice to note
> this discrepancy somewhere, perhaps in the Kconfig, to keep our heads
> from spinning :-)
>
> I'm still catching up on the previous discussion threads, but can you
> explain a bit more why you switched away from full blown relative crcs
> from your last patchset [1]? I had lightly tested your v3 on ppc64le
> previously, and relative offsets with modules seemed to work very well.
> I'm probably missing something very obvious.

Ah, I just saw your other comment about other arches not having support for the rel32 offsets :-/

The asymmetry still bothers me though. Can't we have something that just switches relative crcs on and off, that applies to *both* vmlinux and modules? Then we can get rid of the crc_owner check in check_version() and just have something like:

    if (IS_ENABLED(CONFIG_RELATIVE_CRCS))
        crcval = resolve_crc(crc);

Also we could get rid of the '&& !defined(MODULE)' checks scattered in export.h. Then the arches that want relative crcs and that *do* have rel32 relocation support can turn relative crcs on, and powerpc can enable it, right? Would that work, or is there another reason this won't work with modules (assuming that the arches that select this option support the relative offsets)?

Jessica
Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls
> On 02-Feb-2017, at 9:25 PM, Peter Zijlstra <pet...@infradead.org> wrote:
>
> On Tue, Jan 31, 2017 at 10:22:47AM -0700, Ross Zwisler wrote:
>> On Tue, Jan 31, 2017 at 4:48 AM, Mike Galbraith <efa...@gmx.de> wrote:
>>> On Tue, 2017-01-31 at 16:30 +0530, Sachin Sant wrote:
>
> Could some of you test this? It seems to cure things in my (very)
> limited testing.

I ran a few cycles of CPU hot(un)plug tests. In most cases it works, except one where I ran into an RCU stall:

[  173.493453] INFO: rcu_sched detected stalls on CPUs/tasks:
[  173.493473] 8-...: (2 GPs behind) idle=006/140/0 softirq=0/0 fqs=2996
[  173.493476] (detected by 0, t=6002 jiffies, g=885, c=884, q=6350)
[  173.493482] Task dump for CPU 8:
[  173.493484] cpuhp/8 R running task 0 3416 2 0x0884
[  173.493489] Call Trace:
[  173.493492] [c004f7b834a0] [c004f7b83560] 0xc004f7b83560 (unreliable)
[  173.493498] [c004f7b83670] [c0008d28] alignment_common+0x128/0x130
[  173.493503] --- interrupt: 600 at _raw_spin_lock+0x2c/0xc0
[  173.493503] LR = try_to_wake_up+0x204/0x5c0
[  173.493507] [c004f7b83960] [c004f4d8084c] 0xc004f4d8084c (unreliable)
[  173.493511] [c004f7b83990] [c00fef54] try_to_wake_up+0x204/0x5c0
[  173.493515] [c004f7b83a10] [c00e2b88] create_worker+0x148/0x250
[  173.493519] [c004f7b83ab0] [c00e6e1c] alloc_unbound_pwq+0x3bc/0x4c0
[  173.493522] [c004f7b83b10] [c00e7084] wq_update_unbound_numa+0x164/0x270
[  173.493526] [c004f7b83bb0] [c00e8990] workqueue_online_cpu+0x250/0x3b0
[  173.493529] [c004f7b83c70] [c00c2758] cpuhp_invoke_callback+0x148/0x5b0
[  173.493533] [c004f7b83ce0] [c00c2df8] cpuhp_up_callbacks+0x48/0x140
[  173.493536] [c004f7b83d30] [c00c3e98] cpuhp_thread_fun+0x148/0x180
[  173.493540] [c004f7b83d60] [c00f3930] smpboot_thread_fn+0x290/0x2a0
[  173.493544] [c004f7b83dc0] [c00edb3c] kthread+0x14c/0x190
[  173.493547] [c004f7b83e30] [c000b4e8] ret_from_kernel_thread+0x5c/0x74
[  243.913715] INFO: task kworker/0:2:380 blocked for more than 120 seconds.
[  243.913732] Not tainted 4.10.0-rc6-next-20170202 #6
[  243.913735] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  243.913738] kworker/0:2 D 0 380 2 0x0800
[  243.913746] Workqueue: events vmstat_shepherd
[  243.913748] Call Trace:
[  243.913752] [c000ff07f820] [c011135c] enqueue_entity+0x81c/0x1200 (unreliable)
[  243.913757] [c000ff07f9f0] [c001a660] __switch_to+0x300/0x400
[  243.913762] [c000ff07fa50] [c08df4f4] __schedule+0x314/0xb10
[  243.913766] [c000ff07fb20] [c08dfd30] schedule+0x40/0xb0
[  243.913769] [c000ff07fb50] [c08e02b8] schedule_preempt_disabled+0x18/0x30
[  243.913773] [c000ff07fb70] [c08e1654] __mutex_lock.isra.6+0x1a4/0x660
[  243.913777] [c000ff07fc00] [c00c3828] get_online_cpus+0x48/0x90
[  243.913780] [c000ff07fc30] [c025fd78] vmstat_shepherd+0x38/0x150
[  243.913784] [c000ff07fc80] [c00e5794] process_one_work+0x1a4/0x4d0
[  243.913788] [c000ff07fd20] [c00e5b58] worker_thread+0x98/0x5a0
[  243.913791] [c000ff07fdc0] [c00edb3c] kthread+0x14c/0x190
[  243.913795] [c000ff07fe30] [c000b4e8] ret_from_kernel_thread+0x5c/0x74
[  243.913824] INFO: task drmgr:3413 blocked for more than 120 seconds.
[  243.913826] Not tainted 4.10.0-rc6-next-20170202 #6
[  243.913829] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  243.913831] drmgr D 0 3413 3114 0x00040080
[  243.913834] Call Trace:
[  243.913836] [c00257ff3380] [c00257ff3440] 0xc00257ff3440 (unreliable)
[  243.913840] [c00257ff3550] [c001a660] __switch_to+0x300/0x400
[  243.913844] [c00257ff35b0] [c08df4f4] __schedule+0x314/0xb10
[  243.913847] [c00257ff3680] [c08dfd30] schedule+0x40/0xb0
[  243.913851] [c00257ff36b0] [c08e4594] schedule_timeout+0x274/0x470
[  243.913855] [c00257ff37b0] [c08e0efc] wait_for_common+0x1ac/0x2c0
[  243.913858] [c00257ff3830] [c00c50e4] bringup_cpu+0x84/0xe0
[  243.913862] [c00257ff3860] [c00c2758] cpuhp_invoke_callback+0x148/0x5b0
[  243.913865] [c00257ff38d0] [c00c2df8] cpuhp_up_callbacks+0x48/0x140
[  243.913868] [c00257ff3920] [c00c5438] _cpu_up+0xe8/0x1c0
[  243.913872] [c00257ff3980] [c00c5630] do_cpu_up+0x120/0x150
[  243.913876] [c00257ff3a00] [c05c005c] cpu_subsys_online+0x5c/0xe0
[  243.913879] [c00257ff3a50] [c05b7d84] device_online+0xb4/0x120
[  243.913883] [c00257ff3a90] [c0093424] dlpar_online_cpu+0x144/0x1e0
[  243.913887]
Re: modversions: redefine kcrctab entries as 32-bit values
+++ Ard Biesheuvel [24/01/17 16:16 +]:
> This v4 is a followup to [0] 'modversions: redefine kcrctab entries as
> relative CRC pointers', but since relative CRC pointers do not work in
> modules, and are actually only needed by powerpc with CONFIG_RELOCATABLE=y,
> I have made it a Kconfig selectable feature instead.
>
> Patch #1 introduces the MODULE_REL_CRCS Kconfig symbol, and adds the
> kbuild handling of it, i.e., modpost, genksyms and kallsyms.
>
> Patch #2 switches all architectures to 32-bit CRC entries in kcrctab,
> where all architectures except powerpc with CONFIG_RELOCATABLE=y use
> absolute ELF symbol references as before.
>
> v4: make relative CRCs kconfig selectable
>     use absolute CRC symbols in modules regardless of kconfig selection
>     split into two patches

This asymmetry threw me off a bit, especially the Kconfig naming (only vmlinux crcs get the relative offsets, and only on powerpc atm, but all modules keep the absolute syms, yet it is called MODULE_REL_CRCS...). If we keep this asymmetric crc treatment, it would be really nice to note this discrepancy somewhere, perhaps in the Kconfig, to keep our heads from spinning :-)

I'm still catching up on the previous discussion threads, but can you explain a bit more why you switched away from full blown relative crcs from your last patchset [1]? I had lightly tested your v3 on ppc64le previously, and relative offsets with modules seemed to work very well. I'm probably missing something very obvious.

[1] http://marc.info/?l=linux-arch=148493613415294=2

> v3: emit CRCs into .rodata rather than .rodata.modver, given that the
>     latter will be emitted with read-write permissions, making the CRCs
>     end up in a writable module segment.
>     fold the modpost fix to ensure that the section address is only
>     subtracted from the symbol address when the ELF object in question
>     is fully linked (i.e., ET_DYN or ET_EXEC, and not ET_REL)
>
> v2: update modpost as well, so that genksyms no longer has to emit
>     symbols for both the actual CRC value and the reference to where it
>     is stored in the image
>
> [0] http://marc.info/?l=linux-arch=148493613415294=2
>
> Ard Biesheuvel (2):
>   kbuild: modversions: add infrastructure for emitting relative CRCs
>   modversions: treat symbol CRCs as 32 bit quantities
>
>  arch/powerpc/Kconfig              |  1 +
>  arch/powerpc/include/asm/module.h |  4 --
>  arch/powerpc/kernel/module_64.c   |  8
>  include/asm-generic/export.h      | 11 ++---
>  include/linux/export.h            | 14 ++
>  include/linux/module.h            | 14 +++---
>  init/Kconfig                      |  4 ++
>  kernel/module.c                   | 45 ++--
>  scripts/Makefile.build            |  2 +
>  scripts/genksyms/genksyms.c       | 19 ++---
>  scripts/kallsyms.c                | 12 ++
>  scripts/mod/modpost.c             | 10 +
>  12 files changed, 93 insertions(+), 51 deletions(-)
>
> --
> 2.7.4
Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls
On Thu, 2017-02-02 at 16:55 +0100, Peter Zijlstra wrote:
> On Tue, Jan 31, 2017 at 10:22:47AM -0700, Ross Zwisler wrote:
> > On Tue, Jan 31, 2017 at 4:48 AM, Mike Galbraith wrote:
> > > On Tue, 2017-01-31 at 16:30 +0530, Sachin Sant wrote:
> 
> Could some of you test this? It seems to cure things in my (very)
> limited testing.

Hotplug stress gripe is gone here.

	-Mike
Re: xmon memory dump does not handle LE
I'm referring to the three commands listed in the help:

  d	dump bytes
  df	dump float values
  dd	dump double values

As it turns out, all three of these commands do exactly the same thing, and it's certainly not what I'd expect based on experience with other debuggers. Maybe the original intent was only to simply output bytes of memory, but to me the help implies otherwise and certainly something more is needed. I'll take a look at Balbir's patch and see if I can help move it along.

Thanks, Doug

On 02/02/2017 07:46 PM, Michael Ellerman wrote:
Douglas Miller writes:

I was hoping this would be an easy fix, but now it looks like it will be more difficult. The basic problem is that xmon memory commands like 'dd' do not properly display the data on LE instances. This means that not only is it difficult to read but one cannot copy-paste addresses from the output. This severely encumbers debugging using xmon on LE systems.

What do you mean by "properly"? I think what you're saying is that dumping bytes of memory on LE doesn't give you nice u64 pointers, but that's not a bug, that's just how memory is laid out on LE. Also memory isn't always filled with u64 pointers, so just byte swapping at that size is not correct. Sometimes you'll be looking at ints, in which case you need to swap 4 bytes at a time. What we need is dump commands that do a byte swap of a given size. Balbir was working on this but I think he got diverted. His first attempt was here: https://patchwork.ozlabs.org/patch/696348/ But as I said in my reply: So as discussed, let's add d1, d2, d4, d8, which dump 1/2/4/8 bytes at a time, in cpu endian, with each value separated by a space. Here's a quick hack to get it going. Let me know if that works for you; it's not pretty, but we could probably merge it with a bit of cleanup.
cheers

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index a44b049b9cf6..3903af5fe276 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -2334,10 +2334,43 @@ static void dump_pacas(void)
 }
 #endif
 
+static void dump_by_size(unsigned long addr, long count, int size)
+{
+	unsigned char temp[16];
+	int i, j;
+	u64 val;
+
+	count = ALIGN(count, 16);
+
+	for (i = 0; i < count; i += 16, addr += 16) {
+		printf(REG, addr);
+
+		if (mread(addr, temp, 16) != 16) {
+			printf("Faulted reading %d bytes from 0x"REG"\n", 16, addr);
+			return;
+		}
+
+		for (j = 0; j < 16; j += size) {
+			putchar(' ');
+			switch (size) {
+			case 1: val = temp[j]; break;
+			case 2: val = *(u16 *)&temp[j]; break;
+			case 4: val = *(u32 *)&temp[j]; break;
+			case 8: val = *(u64 *)&temp[j]; break;
+			default: val = 0;
+			}
+
+			printf("%0*lx", size * 2, val);
+		}
+
+		printf("\n");
+	}
+}
+
 static void
 dump(void)
 {
-	int c;
+	int size, c;
 
 	c = inchar();
@@ -2350,8 +2383,9 @@ dump(void)
 	}
 #endif
 
-	if ((isxdigit(c) && c != 'f' && c != 'd') || c == '\n')
+	if (c == '\n')
 		termch = c;
+
 	scanhex((void *)&adrs);
 	if (termch != '\n')
 		termch = 0;
@@ -2383,9 +2417,21 @@ dump(void)
 		ndump = 64;
 	else if (ndump > MAX_DUMP)
 		ndump = MAX_DUMP;
-	prdump(adrs, ndump);
+
+	size = 0;
+	switch (c) {
+	case '8': size += 4;
+	case '4': size += 2;
+	case '2': size += 1;
+	case '1': size += 1;
+		dump_by_size(adrs, ndump, size);
+		break;
+	default:
+		prdump(adrs, ndump);
+		last_cmd = "d\n";
+	}
+
 	adrs += ndump;
-	last_cmd = "d\n";
 	}
 }
linux-next: manual merge of the rcu tree with the powerpc tree
Hi all,

Today's linux-next merge of the rcu tree got a conflict in:

  arch/powerpc/Kconfig

between commit:

  d6c569b99558 ("powerpc/64: Move HAVE_CONTEXT_TRACKING from pseries to common Kconfig")

from the powerpc tree and commit:

  c7327406b3c3 ("rcu: Make arch select smp_mb__after_unlock_lock() strength")

from the rcu tree.

I fixed it up (see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc arch/powerpc/Kconfig
index a47e2b22df67,9fecd004fee8..
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@@ -164,10 -164,11 +164,11 @@@ config PP
  	select ARCH_HAS_SCALED_CPUTIME if VIRT_CPU_ACCOUNTING_NATIVE
  	select HAVE_ARCH_HARDENED_USERCOPY
  	select HAVE_KERNEL_GZIP
- 	select HAVE_CC_STACKPROTECTOR
+ 	select HAVE_CONTEXT_TRACKING if PPC64
+ 	select ARCH_WEAK_RELEASE_ACQUIRE
  
  config GENERIC_CSUM
- 	def_bool CPU_LITTLE_ENDIAN
+ 	def_bool n
  
  config EARLY_PRINTK
  	bool
Re: xmon memory dump does not handle LE
Douglas Miller writes:
> I was hoping this would be an easy fix, but now it looks like it will
> be more difficult.
>
> The basic problem is that xmon memory commands like 'dd' do not properly
> display the data on LE instances. This means that not only is it
> difficult to read but one cannot copy-paste addresses from the output.
> This severely encumbers debugging using xmon on LE systems.

What do you mean by "properly"? I think what you're saying is that dumping bytes of memory on LE doesn't give you nice u64 pointers, but that's not a bug, that's just how memory is laid out on LE.

Also memory isn't always filled with u64 pointers, so just byte swapping at that size is not correct. Sometimes you'll be looking at ints, in which case you need to swap 4 bytes at a time. What we need is dump commands that do a byte swap of a given size.

Balbir was working on this but I think he got diverted. His first attempt was here: https://patchwork.ozlabs.org/patch/696348/ But as I said in my reply: So as discussed, let's add d1, d2, d4, d8, which dump 1/2/4/8 bytes at a time, in cpu endian, with each value separated by a space.

Here's a quick hack to get it going. Let me know if that works for you; it's not pretty, but we could probably merge it with a bit of cleanup.
cheers

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index a44b049b9cf6..3903af5fe276 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -2334,10 +2334,43 @@ static void dump_pacas(void)
 }
 #endif
 
+static void dump_by_size(unsigned long addr, long count, int size)
+{
+	unsigned char temp[16];
+	int i, j;
+	u64 val;
+
+	count = ALIGN(count, 16);
+
+	for (i = 0; i < count; i += 16, addr += 16) {
+		printf(REG, addr);
+
+		if (mread(addr, temp, 16) != 16) {
+			printf("Faulted reading %d bytes from 0x"REG"\n", 16, addr);
+			return;
+		}
+
+		for (j = 0; j < 16; j += size) {
+			putchar(' ');
+			switch (size) {
+			case 1: val = temp[j]; break;
+			case 2: val = *(u16 *)&temp[j]; break;
+			case 4: val = *(u32 *)&temp[j]; break;
+			case 8: val = *(u64 *)&temp[j]; break;
+			default: val = 0;
+			}
+
+			printf("%0*lx", size * 2, val);
+		}
+
+		printf("\n");
+	}
+}
+
 static void
 dump(void)
 {
-	int c;
+	int size, c;
 
 	c = inchar();
@@ -2350,8 +2383,9 @@ dump(void)
 	}
 #endif
 
-	if ((isxdigit(c) && c != 'f' && c != 'd') || c == '\n')
+	if (c == '\n')
 		termch = c;
+
 	scanhex((void *)&adrs);
 	if (termch != '\n')
 		termch = 0;
@@ -2383,9 +2417,21 @@ dump(void)
 		ndump = 64;
 	else if (ndump > MAX_DUMP)
 		ndump = MAX_DUMP;
-	prdump(adrs, ndump);
+
+	size = 0;
+	switch (c) {
+	case '8': size += 4;
+	case '4': size += 2;
+	case '2': size += 1;
+	case '1': size += 1;
+		dump_by_size(adrs, ndump, size);
+		break;
+	default:
+		prdump(adrs, ndump);
+		last_cmd = "d\n";
+	}
+
 	adrs += ndump;
-	last_cmd = "d\n";
 	}
 }
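The grouping behaviour the patch aims for can be sketched in plain userspace C. The helper below is an illustrative stand-in for xmon's dump_by_size(), not the xmon code itself, and it assumes a little-endian host so that reinterpreting the same buffer at different sizes shows the effect Michael describes:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format `len` bytes of `buf` as zero-padded hex values of `size` bytes
 * each, read in CPU endianness, with a space after every value.
 * NOTE: copying into the low bytes of a zeroed u64 gives the right value
 * only on a little-endian host; this sketch assumes one. */
static void dump_grouped(const unsigned char *buf, size_t len, size_t size,
			 char *out, size_t outlen)
{
	out[0] = '\0';
	for (size_t j = 0; j + size <= len; j += size) {
		uint64_t val = 0;
		char tmp[24];

		memcpy(&val, &buf[j], size); /* CPU-endian reinterpretation */
		snprintf(tmp, sizeof(tmp), "%0*llx ", (int)(size * 2),
			 (unsigned long long)val);
		strncat(out, tmp, outlen - strlen(out) - 1);
	}
}
```

On a little-endian machine the bytes 78 56 34 12 read back as 12345678 when grouped as a u32 — the "nice pointer" view — which is exactly why a single fixed-size byte swap cannot serve all of d1/d2/d4/d8.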
[PATCH] Documentation: powerpc/fsl: Update compatible for l2cache binding
List all the current valid compatible strings for the l2cache binding. This should stop checkpatch.pl from complaining and will hopefully save someone from having to debug a typo in their dts.

Signed-off-by: Chris Packham
---
 .../devicetree/bindings/powerpc/fsl/l2cache.txt | 42 --
 1 file changed, 40 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/powerpc/fsl/l2cache.txt b/Documentation/devicetree/bindings/powerpc/fsl/l2cache.txt
index c41b2187eaa8..dc9bb3182525 100644
--- a/Documentation/devicetree/bindings/powerpc/fsl/l2cache.txt
+++ b/Documentation/devicetree/bindings/powerpc/fsl/l2cache.txt
@@ -5,8 +5,46 @@ The cache bindings explained below are ePAPR compliant
 
 Required Properties:
 
-- compatible : Should include "fsl,chip-l2-cache-controller" and "cache"
-  where chip is the processor (bsc9132, npc8572 etc.)
+- compatible : Should include one of the following:
+  "fsl,8540-l2-cache-controller"
+  "fsl,8541-l2-cache-controller"
+  "fsl,8544-l2-cache-controller"
+  "fsl,8548-l2-cache-controller"
+  "fsl,8555-l2-cache-controller"
+  "fsl,8568-l2-cache-controller"
+  "fsl,b4420-l2-cache-controller"
+  "fsl,b4860-l2-cache-controller"
+  "fsl,bsc9131-l2-cache-controller"
+  "fsl,bsc9132-l2-cache-controller"
+  "fsl,c293-l2-cache-controller"
+  "fsl,mpc8536-l2-cache-controller"
+  "fsl,mpc8540-l2-cache-controller"
+  "fsl,mpc8541-l2-cache-controller"
+  "fsl,mpc8544-l2-cache-controller"
+  "fsl,mpc8548-l2-cache-controller"
+  "fsl,mpc8555-l2-cache-controller"
+  "fsl,mpc8560-l2-cache-controller"
+  "fsl,mpc8568-l2-cache-controller"
+  "fsl,mpc8569-l2-cache-controller"
+  "fsl,mpc8572-l2-cache-controller"
+  "fsl,p1010-l2-cache-controller"
+  "fsl,p1011-l2-cache-controller"
+  "fsl,p1012-l2-cache-controller"
+  "fsl,p1013-l2-cache-controller"
+  "fsl,p1014-l2-cache-controller"
+  "fsl,p1015-l2-cache-controller"
+  "fsl,p1016-l2-cache-controller"
+  "fsl,p1020-l2-cache-controller"
+  "fsl,p1021-l2-cache-controller"
+  "fsl,p1022-l2-cache-controller"
+  "fsl,p1023-l2-cache-controller"
+  "fsl,p1024-l2-cache-controller"
+  "fsl,p1025-l2-cache-controller"
+  "fsl,p2010-l2-cache-controller"
+  "fsl,p2020-l2-cache-controller"
+  "fsl,t2080-l2-cache-controller"
+  "fsl,t4240-l2-cache-controller"
+  and "cache".
 - reg : Address and size of L2 cache controller registers
 - cache-size : Size of the entire L2 cache
 - interrupts : Error interrupt of L2 controller
-- 
2.11.0.24.ge6920cf
Re: [PATCH v2] EDAC: mpc85xx: Add T2080 l2-cache support
On 03/02/17 12:55, Michael Ellerman wrote: > Chris if you want to send a patch to add the compatible string to the > l2cache.txt I would merge that, but honestly it doesn't achieve much > other than possibly catching a typo in the compatible name. I think catching a typo might be worthwhile. It's 5 minutes work for me to grep/sed through the code to find existing compatible strings and update the document. Which might save someone else a lot of time debugging only to find out they've transposed some digits in the dts. I'll whip something up and send it out shortly.
Re: [kernel-hardening] Re: [PATCH 1/2] powerpc: mm: support ARCH_MMAP_RND_BITS
Balbir Singh writes:
> On Thu, Feb 02, 2017 at 09:23:33PM +1100, Michael Ellerman wrote:
>> +config ARCH_MMAP_RND_BITS_MIN
>> +	# On 64-bit up to 1G of address space (2^30)
>> +	default 12 if 64BIT && PPC_256K_PAGES	# 256K (2^18), = 30 - 18 = 12
>> +	default 14 if 64BIT && PPC_64K_PAGES	# 64K  (2^16), = 30 - 16 = 14
>> +	default 16 if 64BIT && PPC_16K_PAGES	# 16K  (2^14), = 30 - 14 = 16
>> +	default 18 if 64BIT			# 4K   (2^12), = 30 - 12 = 18
>> +	default ARCH_MMAP_RND_COMPAT_BITS_MIN
>> +
>> +config ARCH_MMAP_RND_BITS_MAX
>> +	# On 64-bit up to 32T of address space (2^45)
>
> I thought it was 64T, TASK_SIZE_USER64 is 2^46?

The virtual address space is 64T. The comment is talking about how much can be taken up by the randomisation factor.

cheers
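The arithmetic in those Kconfig comments is simply "address-space bits devoted to randomisation minus PAGE_SHIFT". A one-line helper makes the computation explicit (hypothetical helper mirroring the comments, not kernel code):

```c
#include <assert.h>

/* With up to 2^aslr_bits of address space given over to randomisation,
 * each page consumes page_shift bits of the offset, leaving
 * aslr_bits - page_shift bits of page-granular entropy. */
static int mmap_rnd_bits(int aslr_bits, int page_shift)
{
	return aslr_bits - page_shift;
}
```

With 1G (2^30) of randomised space, 256K pages give 12 bits, 64K give 14, 16K give 16 and 4K give 18 — matching the quoted minimum defaults line for line.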
Re: [PATCH v2] EDAC: mpc85xx: Add T2080 l2-cache support
Borislav Petkov writes:
> On Wed, Feb 01, 2017 at 11:46:23PM +, Chris Packham wrote:
>> >> diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
>> >> index 8f66cbed70b7..67f7bc3fe5b3 100644
>> >> --- a/drivers/edac/mpc85xx_edac.c
>> >> +++ b/drivers/edac/mpc85xx_edac.c
>> >> @@ -629,6 +629,7 @@ static const struct of_device_id mpc85xx_l2_err_of_match[] = {
>> >> 	{ .compatible = "fsl,p1020-l2-cache-controller", },
>> >> 	{ .compatible = "fsl,p1021-l2-cache-controller", },
>> >> 	{ .compatible = "fsl,p2020-l2-cache-controller", },
>> >> +	{ .compatible = "fsl,t2080-l2-cache-controller", },
>> >
>> > WARNING: DT compatible string "fsl,t2080-l2-cache-controller" appears un-documented -- check ./Documentation/devicetree/bindings/
>> > #58: FILE: drivers/edac/mpc85xx_edac.c:632:
>> > +	{ .compatible = "fsl,t2080-l2-cache-controller", },
>> >
>> > What is checkpatch.pl trying to tell me here?
>>
>> checkpatch.pl is confused by Documentation/devicetree/bindings/powerpc/fsl/l2cache.txt which says
>>
>> - compatible: Should include "fsl,chip-l2-cache-controller" and "cache"
>>   where chip is the processor (bsc9132, npc8572 etc.)
>>
>> So none of the fsl cache controllers pass the checkpatch.pl test.
>
> Hmm, so others do list those names explicitly. For example:
>
> Documentation/devicetree/bindings/pinctrl/allwinner,sunxi-pinctrl.txt
>
> And the patch that added that check to cp:
>
> bff5da433525 ("checkpatch: add DT compatible string documentation checks")
>
> is basically to enforce explicit compatible names.
>
> So I'd like to have an ACK from a PPC maintainer here first before I apply this.

It's fine with me:

Acked-by: Michael Ellerman

Chris if you want to send a patch to add the compatible string to the l2cache.txt I would merge that, but honestly it doesn't achieve much other than possibly catching a typo in the compatible name.

cheers
linux-next: manual merge of the powerpc tree with the powerpc-fixes tree
Hi all,

Today's linux-next merge of the powerpc tree got a conflict in:

  arch/powerpc/Kconfig

between commit:

  f2574030b0e3 ("powerpc: Revert the initial stack protector support")

from the powerpc-fixes tree and commit:

  d6c569b99558 ("powerpc/64: Move HAVE_CONTEXT_TRACKING from pseries to common Kconfig")

from the powerpc tree.

I fixed it up (see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc arch/powerpc/Kconfig
index a46d1c0d14d3,33f5b8380a7d..
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@@ -164,9 -164,11 +164,10 @@@ config PP
  	select ARCH_HAS_SCALED_CPUTIME if VIRT_CPU_ACCOUNTING_NATIVE
  	select HAVE_ARCH_HARDENED_USERCOPY
  	select HAVE_KERNEL_GZIP
- 	select HAVE_CC_STACKPROTECTOR
+ 	select HAVE_CONTEXT_TRACKING if PPC64
  
  config GENERIC_CSUM
- 	def_bool CPU_LITTLE_ENDIAN
+ 	def_bool n
  
  config EARLY_PRINTK
  	bool
Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls
On Thu, 02 Feb, at 04:55:06PM, Peter Zijlstra wrote:
> On Tue, Jan 31, 2017 at 10:22:47AM -0700, Ross Zwisler wrote:
> > On Tue, Jan 31, 2017 at 4:48 AM, Mike Galbraith wrote:
> > > On Tue, 2017-01-31 at 16:30 +0530, Sachin Sant wrote:
>
> Could some of you test this? It seems to cure things in my (very)
> limited testing.

I haven't tested it but this looks like the correct fix to me.

Reviewed-by: Matt Fleming
Re: [PATCH 0/2] RFC: Adjust powerpc ASLR elf randomness
On 3 Feb 2017 00:49, "Kees Cook" wrote:

On Thu, Feb 2, 2017 at 10:08 AM, Bhupesh Sharma wrote:
> On Thu, Feb 2, 2017 at 7:51 PM, Kees Cook wrote:
>> On Wed, Feb 1, 2017 at 9:42 PM, Bhupesh Sharma wrote:
>>> The 2nd patch increases the ELF_ET_DYN_BASE value from the current
>>> hardcoded value of 0x2000_ to something more practical,
>>> i.e. TASK_SIZE - PAGE_SHIFT (which makes sense especially for
>>> 64-bit platforms which would like to utilize more randomization
>>> in the load address of a PIE elf).
>>
>> I don't think you want this second patch. Moving ELF_ET_DYN_BASE to
>> the top of TASK_SIZE means you'll be constantly colliding with stack
>> and mmap randomization. 0x2000 is way better since it randomizes
>> up from there towards the mmap area.
>>
>> Is there a reason to avoid the 32-bit memory range for the ELF addresses?
>
> I think you are right. Hmm, I think I was going by my particular use
> case which might not be required for generic PPC platforms.
>
> I have one doubt though - I have primarily worked on arm64 and x86
> architectures and there I see 64-bit user space applications
> using the 64-bit load addresses/ranges. I am not sure why PPC64 is
> different historically.

x86's ELF_ET_DYN_BASE is (TASK_SIZE / 3 * 2), so it puts the ET_DYN base at the top third of the address space. (In theory, this is to avoid interpreter collisions, but I'm working on removing that restriction, as it seems pointless.) Other architectures have small ELF_ET_DYN_BASEs, which is good: it allows them to have larger entropy for ET_DYN.

Fair enough. I will drop the 2nd patch then and spin a v2.

Regards, Bhupesh
Re: [PATCH 0/2] RFC: Adjust powerpc ASLR elf randomness
On Thu, Feb 2, 2017 at 10:08 AM, Bhupesh Sharma wrote:
> On Thu, Feb 2, 2017 at 7:51 PM, Kees Cook wrote:
>> On Wed, Feb 1, 2017 at 9:42 PM, Bhupesh Sharma wrote:
>>> The 2nd patch increases the ELF_ET_DYN_BASE value from the current
>>> hardcoded value of 0x2000_ to something more practical,
>>> i.e. TASK_SIZE - PAGE_SHIFT (which makes sense especially for
>>> 64-bit platforms which would like to utilize more randomization
>>> in the load address of a PIE elf).
>>
>> I don't think you want this second patch. Moving ELF_ET_DYN_BASE to
>> the top of TASK_SIZE means you'll be constantly colliding with stack
>> and mmap randomization. 0x2000 is way better since it randomizes
>> up from there towards the mmap area.
>>
>> Is there a reason to avoid the 32-bit memory range for the ELF addresses?
>
> I think you are right. Hmm, I think I was going by my particular use
> case which might not be required for generic PPC platforms.
>
> I have one doubt though - I have primarily worked on arm64 and x86
> architectures and there I see 64-bit user space applications
> using the 64-bit load addresses/ranges. I am not sure why PPC64 is
> different historically.

x86's ELF_ET_DYN_BASE is (TASK_SIZE / 3 * 2), so it puts the ET_DYN base at the top third of the address space. (In theory, this is to avoid interpreter collisions, but I'm working on removing that restriction, as it seems pointless.) Other architectures have small ELF_ET_DYN_BASEs, which is good: it allows them to have larger entropy for ET_DYN.

-Kees

-- 
Kees Cook
Pixel Security
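Kees's description of the x86 placement is two lines of integer arithmetic. A standalone sketch of the "top third" rule (the helper is illustrative; on x86 the kernel simply defines ELF_ET_DYN_BASE as TASK_SIZE / 3 * 2):

```c
#include <assert.h>

/* Top-third placement: integer division first, matching the kernel's
 * (TASK_SIZE / 3 * 2) macro expansion order. */
static unsigned long et_dyn_base(unsigned long task_size)
{
	return task_size / 3 * 2;
}
```

The point of keeping the base well below TASK_SIZE is that everything above it — at least a third of the address space — remains available for randomising ET_DYN loads without colliding with the stack and mmap regions.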
Re: [PATCH v3] Fix loading of module radeonfb on PowerMac
Hi, On Wednesday, February 01, 2017 07:55:25 AM Mathieu Malaterre wrote: > Any chance this patch could be considered for inclusion this time ? > > Thanks > > On Wed, Nov 23, 2016 at 8:26 AM, Mathieu Malaterre wrote: > > When the linux kernel is built with (typical kernel ship with Debian > > installer): > > > > CONFIG_FB_OF=y > > CONFIG_VT_HW_CONSOLE_BINDING=y > > CONFIG_FB_RADEON=m > > > > The offb driver takes precedence over module radeonfb. It is then > > impossible to load the module, error reported is: > > > > [ 96.551486] radeonfb :00:10.0: enabling device (0006 -> 0007) > > [ 96.551526] radeonfb :00:10.0: BAR 0: can't reserve [mem > > 0x9800-0x9fff pref] > > [ 96.551531] radeonfb (:00:10.0): cannot request region 0. > > [ 96.551545] radeonfb: probe of :00:10.0 failed with error -16 > > > > This patch reproduces the behavior of the module radeon, so as to make it > > possible to load radeonfb when offb is first loaded. > > > > The problem is that offb calls pci_request_region first, and then radeonfb > > tries to do it, and since one is trying to take over from the other, it > > can't > > do that because the area is already reserved. > > > > It should be noticed that `offb_destroy` is never called which explains the > > need to skip error detection on the radeonfb side. 
> > > > Signed-off-by: Mathieu Malaterre > > Link: https://bugs.debian.org/826629#57 > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=119741 > > Suggested-by: Lennart Sorensen > > --- > > > > v2: Remove compilation warning > > > > v3: Hide error messages on PPC > > > > drivers/video/fbdev/aty/radeon_base.c | 27 +++ > > 1 file changed, 27 insertions(+) > > > > diff --git a/drivers/video/fbdev/aty/radeon_base.c > > b/drivers/video/fbdev/aty/radeon_base.c > > index 218339a..837c86a 100644 > > --- a/drivers/video/fbdev/aty/radeon_base.c > > +++ b/drivers/video/fbdev/aty/radeon_base.c > > @@ -2259,6 +2259,22 @@ static struct bin_attribute edid2_attr = { > > .read = radeon_show_edid2, > > }; > > > > +static int radeon_kick_out_firmware_fb(struct pci_dev *pdev) > > +{ > > + struct apertures_struct *ap; > > + > > + ap = alloc_apertures(1); > > + if (!ap) > > + return -ENOMEM; > > + > > + ap->ranges[0].base = pci_resource_start(pdev, 0); > > + ap->ranges[0].size = pci_resource_len(pdev, 0); > > + > > + remove_conflicting_framebuffers(ap, KBUILD_MODNAME, false); > > + kfree(ap); > > + > > + return 0; > > +} > > > > static int radeonfb_pci_register(struct pci_dev *pdev, > > const struct pci_device_id *ent) > > @@ -2314,20 +2330,29 @@ static int radeonfb_pci_register(struct pci_dev > > *pdev, > > rinfo->fb_base_phys = pci_resource_start (pdev, 0); > > rinfo->mmio_base_phys = pci_resource_start (pdev, 2); > > > > + ret = radeon_kick_out_firmware_fb(pdev); > > + if (ret) > > + return ret; > > + > > /* request the mem regions */ > > ret = pci_request_region(pdev, 0, "radeonfb framebuffer"); > > + /* this is not an error on PowerMac where offb already requested mem > > regions */ > > +#ifndef CONFIG_PPC This doesn't look correct: - PPC is not only PowerMac and offb supports more cards than radeon (while radeonfb can be used for secondary graphics card) - offb can be disabled (i.e. 
in CONFIG_FB_OF=n && CONFIG_FB_RADEON=y case error checking will be missing now) The last put_fb_info() on fb_info should call ->fb_destroy (offb_destroy in our case) and remove_conflicting_framebuffers() is calling put_fb_info() so there is some extra reference on fb_info somewhere preventing it from going away. Please look into fixing this. > > if (ret < 0) { > > printk( KERN_ERR "radeonfb (%s): cannot request region > > 0.\n", > > pci_name(rinfo->pdev)); > > goto err_release_fb; > > } > > +#endif > > > > ret = pci_request_region(pdev, 2, "radeonfb mmio"); > > +#ifndef CONFIG_PPC > > if (ret < 0) { > > printk( KERN_ERR "radeonfb (%s): cannot request region > > 2.\n", > > pci_name(rinfo->pdev)); > > goto err_release_pci0; > > } > > +#endif > > > > /* map the regions */ > > rinfo->mmio_base = ioremap(rinfo->mmio_base_phys, RADEON_REGSIZE); > > @@ -2511,10 +2536,12 @@ static int radeonfb_pci_register(struct pci_dev > > *pdev, > > iounmap(rinfo->mmio_base); > > err_release_pci2: > > pci_release_region(pdev, 2); > > +#ifndef CONFIG_PPC > > err_release_pci0: > > pci_release_region(pdev, 0); > > err_release_fb: > > framebuffer_release(info); > > +#endif > > err_disable: > > err_out: > > return ret; > > -- > > 2.1.4 Best regards, -- Bartlomiej
Re: [PATCH 0/2] RFC: Adjust powerpc ASLR elf randomness
Hi Balbir,

On Thu, Feb 2, 2017 at 12:14 PM, Balbir Singh wrote:
> On Thu, Feb 02, 2017 at 11:12:46AM +0530, Bhupesh Sharma wrote:
>> This RFC patchset tries to make the powerpc ASLR elf randomness
>> implementation similar to other ARCHs (like x86).
>>
>> The 1st patch introduces the support of ARCH_MMAP_RND_BITS in powerpc
>> mmap implementation to allow a sane balance between increased randomness
>> in the mmap address of ASLR elfs and increased address space
>> fragmentation.
>
> From what I see we get 28 bits of entropy right for 64k pages
> bits as compared to 14 bits earlier?

That's correct. We can go up to 28 bits of entropy for 64BIT platforms using 64K pages with the current approach. I see arm64 using > 28 bits of entropy randomness in some cases, but I think 28-bit MAX entropy is sensible for the 64BIT/64K combination on PPC.

>> The 2nd patch increases the ELF_ET_DYN_BASE value from the current
>> hardcoded value of 0x2000_ to something more practical,
>> i.e. TASK_SIZE - PAGE_SHIFT (which makes sense especially for
>> 64-bit platforms which would like to utilize more randomization
>> in the load address of a PIE elf).
>
> This helps PIE executables as such and leaves other not impacted?

It basically affects all shared object files (as noted in [1]). However as Kees noted in one of his reviews, I think this 2nd patch might not be needed for all generic ppc platforms.

[1] http://lxr.free-electrons.com/source/arch/powerpc/include/asm/elf.h#L26.

Regards, Bhupesh
Re: [PATCH 0/2] RFC: Adjust powerpc ASLR elf randomness
Hi Kees,

Thanks for the review. Please see my comments inline.

On Thu, Feb 2, 2017 at 7:51 PM, Kees Cook wrote:
> On Wed, Feb 1, 2017 at 9:42 PM, Bhupesh Sharma wrote:
>> This RFC patchset tries to make the powerpc ASLR elf randomness
>> implementation similar to other ARCHs (like x86).
>>
>> The 1st patch introduces the support of ARCH_MMAP_RND_BITS in powerpc
>> mmap implementation to allow a sane balance between increased randomness
>> in the mmap address of ASLR elfs and increased address space
>> fragmentation.
>>
>> The 2nd patch increases the ELF_ET_DYN_BASE value from the current
>> hardcoded value of 0x2000_ to something more practical,
>> i.e. TASK_SIZE - PAGE_SHIFT (which makes sense especially for
>> 64-bit platforms which would like to utilize more randomization
>> in the load address of a PIE elf).
>
> I don't think you want this second patch. Moving ELF_ET_DYN_BASE to
> the top of TASK_SIZE means you'll be constantly colliding with stack
> and mmap randomization. 0x2000 is way better since it randomizes
> up from there towards the mmap area.
>
> Is there a reason to avoid the 32-bit memory range for the ELF addresses?
>
> -Kees

I think you are right. Hmm, I think I was going by my particular use case which might not be required for generic PPC platforms.

I have one doubt though - I have primarily worked on arm64 and x86 architectures and there I see 64-bit user space applications using the 64-bit load addresses/ranges. I am not sure why PPC64 is different historically.

Regards, Bhupesh
Re: [PATCH 1/2] powerpc: mm: support ARCH_MMAP_RND_BITS
Hi Balbir,

On Thu, Feb 2, 2017 at 2:41 PM, Balbir Singh wrote:
>> @@ -100,6 +132,8 @@ config PPC
>> 	select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && POWER7_CPU)
>> 	select HAVE_KPROBES
>> 	select HAVE_ARCH_KGDB
>> +	select HAVE_ARCH_MMAP_RND_BITS
>> +	select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
>
> COMPAT is on for ppc64 by default, so we'll end up with COMPAT_BITS same
> as before all the time.

No, actually the 'ARCH_MMAP_RND_COMPAT_BITS' values can be changed after boot using the '/proc/sys/vm/mmap_rnd_compat_bits' tunable: http://lxr.free-electrons.com/source/arch/Kconfig#L624

Regards, Bhupesh
Re: [PATCH v3 2/2] cpufreq: qoriq: Don't look at clock implementation details
On Tue, Jul 19, 2016 at 10:02 PM, Yuantian Tang wrote: > > PING. > > Regards, > Yuantian > > > -Original Message- > > From: Scott Wood [mailto:o...@buserror.net] > > Sent: Saturday, July 09, 2016 5:07 AM > > To: Michael Turquette ; Russell King > > ; Stephen Boyd ; Viresh > > Kumar ; Rafael J. Wysocki > > Cc: linux-...@vger.kernel.org; linux...@vger.kernel.org; linuxppc- > > d...@lists.ozlabs.org; Yuantian Tang ; Yang-Leo Li > > ; Xiaofeng Ren > > Subject: Re: [PATCH v3 2/2] cpufreq: qoriq: Don't look at clock > > implementation details > > > > On Thu, 2016-07-07 at 19:26 -0700, Michael Turquette wrote: > > > Quoting Scott Wood (2016-07-06 21:13:23) > > > > > > > > On Wed, 2016-07-06 at 18:30 -0700, Michael Turquette wrote: > > > > > > > > > > Quoting Scott Wood (2016-06-15 23:21:25) > > > > > > > > > > > > > > > > > > -static struct device_node *cpu_to_clk_node(int cpu) > > > > > > +static struct clk *cpu_to_clk(int cpu) > > > > > > { > > > > > > - struct device_node *np, *clk_np; > > > > > > + struct device_node *np; > > > > > > + struct clk *clk; > > > > > > > > > > > > if (!cpu_present(cpu)) > > > > > > return NULL; > > > > > > @@ -112,37 +80,28 @@ static struct device_node > > > > > > *cpu_to_clk_node(int > > > > > > cpu) > > > > > > if (!np) > > > > > > return NULL; > > > > > > > > > > > > - clk_np = of_parse_phandle(np, "clocks", 0); > > > > > > - if (!clk_np) > > > > > > - return NULL; > > > > > > - > > > > > > + clk = of_clk_get(np, 0); > > > > > Why not use devm_clk_get here? > > > > devm_clk_get() is a wrapper around clk_get() which is not the same > > > > as of_clk_get(). What device would you pass to devm_clk_get(), and > > > > what name would you pass? > > > I'm fuzzy on whether or not you get a struct device from a cpufreq > > > driver. If so, then that would be the one to use. I would hope that > > > cpufreq drivers model cpus as devices, but I'm really not sure without > > > looking into the code. 
> > > > It's not the cpufreq code that provides it, but get_cpu_device() could be > > used. > > > > Do you have any comments on the first patch of this set? Any action on this patch? This patch is still a dependency for cpufreq to work on all QorIQ platforms. Regards, Leo
Re: [PATCH 1/2] powerpc: mm: support ARCH_MMAP_RND_BITS
Hi Kees, On Thu, Feb 2, 2017 at 7:55 PM, Kees Cook wrote: > On Wed, Feb 1, 2017 at 9:42 PM, Bhupesh Sharma wrote: >> powerpc: arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for >> 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset >> for the mmap base address. >> >> This value represents a compromise between increased >> ASLR effectiveness and avoiding address-space fragmentation. >> Replace it with a Kconfig option, which is sensibly bounded, so that >> platform developers may choose where to place this compromise. >> Keep default values as new minimums. >> >> This patch makes sure that now powerpc mmap arch_mmap_rnd() approach >> is similar to other ARCHs like x86, arm64 and arm. >> >> Cc: Alexander Graf >> Cc: Benjamin Herrenschmidt >> Cc: Paul Mackerras >> Cc: Michael Ellerman >> Cc: Anatolij Gustschin >> Cc: Alistair Popple >> Cc: Matt Porter >> Cc: Vitaly Bordug >> Cc: Scott Wood >> Cc: Kumar Gala >> Cc: Daniel Cashman >> Cc: Kees Cook >> Signed-off-by: Bhupesh Sharma >> --- >> arch/powerpc/Kconfig | 34 ++ >> arch/powerpc/mm/mmap.c | 7 --- >> 2 files changed, 38 insertions(+), 3 deletions(-) >> >> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig >> index a8ee573fe610..b4a843f68705 100644 >> --- a/arch/powerpc/Kconfig >> +++ b/arch/powerpc/Kconfig >> @@ -22,6 +22,38 @@ config MMU >> bool >> default y >> >> +config ARCH_MMAP_RND_BITS_MIN >> + default 5 if PPC_256K_PAGES && 32BIT >> + default 12 if PPC_256K_PAGES && 64BIT >> + default 7 if PPC_64K_PAGES && 32BIT >> + default 14 if PPC_64K_PAGES && 64BIT >> + default 9 if PPC_16K_PAGES && 32BIT >> + default 16 if PPC_16K_PAGES && 64BIT >> + default 11 if PPC_4K_PAGES && 32BIT >> + default 18 if PPC_4K_PAGES && 64BIT >> + >> +# max bits determined by the following formula: >> +# VA_BITS - PAGE_SHIFT - 4 >> +# for e.g for 64K page and 64BIT = 48 - 16 - 4 = 28 >> +config ARCH_MMAP_RND_BITS_MAX >> + default 10 if PPC_256K_PAGES && 32BIT >> + default 26 if PPC_256K_PAGES && 64BIT >> + 
default 12 if PPC_64K_PAGES && 32BIT >> + default 28 if PPC_64K_PAGES && 64BIT >> + default 14 if PPC_16K_PAGES && 32BIT >> + default 30 if PPC_16K_PAGES && 64BIT >> + default 16 if PPC_4K_PAGES && 32BIT >> + default 32 if PPC_4K_PAGES && 64BIT >> + >> +config ARCH_MMAP_RND_COMPAT_BITS_MIN >> + default 5 if PPC_256K_PAGES >> + default 7 if PPC_64K_PAGES >> + default 9 if PPC_16K_PAGES >> + default 11 >> + >> +config ARCH_MMAP_RND_COMPAT_BITS_MAX >> + default 16 >> + >> config HAVE_SETUP_PER_CPU_AREA >> def_bool PPC64 >> >> @@ -100,6 +132,8 @@ config PPC >> select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && >> POWER7_CPU) >> select HAVE_KPROBES >> select HAVE_ARCH_KGDB >> + select HAVE_ARCH_MMAP_RND_BITS >> + select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT >> select HAVE_KRETPROBES >> select HAVE_ARCH_TRACEHOOK >> select HAVE_MEMBLOCK >> diff --git a/arch/powerpc/mm/mmap.c b/arch/powerpc/mm/mmap.c >> index 2f1e44362198..babf59faab3b 100644 >> --- a/arch/powerpc/mm/mmap.c >> +++ b/arch/powerpc/mm/mmap.c >> @@ -60,11 +60,12 @@ unsigned long arch_mmap_rnd(void) >> { >> unsigned long rnd; >> >> - /* 8MB for 32bit, 1GB for 64bit */ >> +#ifdef CONFIG_COMPAT >> if (is_32bit_task()) >> - rnd = get_random_long() % (1<<(23-PAGE_SHIFT)); >> + rnd = get_random_long() & ((1UL << mmap_rnd_compat_bits) - >> 1); >> else >> - rnd = get_random_long() % (1UL<<(30-PAGE_SHIFT)); >> +#endif >> + rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1); >> >> return rnd << PAGE_SHIFT; >> } > > Awesome! This looks good to me based on my earlier analysis. > > Reviewed-by: Kees Cook Many thanks. Regards, Bhupesh
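The effect of the patch's masking expression is easy to demonstrate in isolation. A userspace sketch with the parameters made explicit (in the kernel, `bits` comes from mmap_rnd_bits/mmap_rnd_compat_bits and `page_shift` from PAGE_SHIFT):

```c
#include <assert.h>

/* Mask a raw random value down to `bits` bits of page-number entropy,
 * then scale it to an address offset.  The result is page-aligned and
 * strictly below 2^(bits + page_shift). */
static unsigned long mmap_rnd(unsigned long raw, unsigned int bits,
			      unsigned int page_shift)
{
	unsigned long rnd = raw & ((1UL << bits) - 1);
	return rnd << page_shift;
}
```

With the 64BIT/64K defaults (14 bits, PAGE_SHIFT 16) the offset never exceeds 2^30 — the "up to 1G of address space" the Kconfig comments describe — while remaining a whole number of pages.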
Re: [PATCH v2] EDAC: mpc85xx: Add T2080 l2-cache support
On Wed, Feb 01, 2017 at 11:46:23PM +, Chris Packham wrote: > >> diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c > >> index 8f66cbed70b7..67f7bc3fe5b3 100644 > >> --- a/drivers/edac/mpc85xx_edac.c > >> +++ b/drivers/edac/mpc85xx_edac.c > >> @@ -629,6 +629,7 @@ static const struct of_device_id > >> mpc85xx_l2_err_of_match[] = { > >>{ .compatible = "fsl,p1020-l2-cache-controller", }, > >>{ .compatible = "fsl,p1021-l2-cache-controller", }, > >>{ .compatible = "fsl,p2020-l2-cache-controller", }, > >> + { .compatible = "fsl,t2080-l2-cache-controller", }, > > > > WARNING: DT compatible string "fsl,t2080-l2-cache-controller" appears > > un-documented -- check ./Documentation/devicetree/bindings/ > > #58: FILE: drivers/edac/mpc85xx_edac.c:632: > > + { .compatible = "fsl,t2080-l2-cache-controller", }, > > > > What is checkpatch.pl trying to tell me here? > > > > checkpatch.pl is confused by > Documentation/devicetree/bindings/powerpc/fsl/l2cache.txt which says > > - compatible: Should include "fsl,chip-l2-cache-controller" and "cache" > where chip is the processor (bsc9132, npc8572 etc.) > > So none of the fsl cache controllers pass the checkpatch.pl test. Hmm, so others do list those names explicitly. For example: Documentation/devicetree/bindings/pinctrl/allwinner,sunxi-pinctrl.txt And the patch that added that check to cp: bff5da433525 ("checkpatch: add DT compatible string documentation checks") is basically to enforce explicit compatible names. So I'd like to have an ACK from a PPC maintainer here first before I apply this. -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply.
Re: gcc trunk fails to build kernel on PowerPC64 due to oprofile warnings
On 01/26/2017 11:06 AM, Robert Richter wrote: > On 26.01.17 10:46:43, William Cohen wrote: >> From 7e46dbd7dc5bc941926a4a63c28ccebf46493e8d Mon Sep 17 00:00:00 2001 >> From: William Cohen >> Date: Thu, 26 Jan 2017 10:33:59 -0500 >> Subject: [PATCH] Avoid hypothetical string truncation in oprofile stats buffer >> MIME-Version: 1.0 >> Content-Type: text/plain; charset=UTF-8 >> Content-Transfer-Encoding: 8bit >> >> Increased the size of an internal oprofile driver buffer ensuring that >> the string was never truncated for any possible int value to avoid the >> following gcc-7 compiler error on ppc when building the kernel: > > Please test gcc7 for other archs first. I don't think this is the only > change needed to avoid this warning in oprofile code. > > Thanks, > > -Robert > Hi Robert, I looked through the oprofile arch-specific code for other snprintf uses with small character arrays and added those to the patch. Attached is the current patch to increase the size of the buffers to make sure that they will not be truncated. OProfile since 1.0.0 has used the kernel's perf infrastructure rather than the oprofile kernel driver. OProfile 1.0 was released in September 2014, over two years ago. Would it make sense to deprecate and at some point remove the oprofile driver from the kernel? Recent Fedora distributions already have CONFIG_OPROFILE unset in the kernel configurations.
-Will From e5490da918186cbd42b8609da146946fbdadf0e5 Mon Sep 17 00:00:00 2001 From: William Cohen Date: Thu, 2 Feb 2017 12:02:51 -0500 Subject: [PATCH] Avoid hypothetical string truncation in various oprofile buffers MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Increased the size of internal oprofile driver buffers ensuring that the strings were never truncated for any possible values to avoid warnings/errors like the following GCC 7 compiler error on ppc when building the kernel: linux/arch/powerpc/oprofile/../../../drivers/oprofile/oprofile_stats.c: In function 'oprofile_create_stats_files': linux/arch/powerpc/oprofile/../../../drivers/oprofile/oprofile_stats.c:55:25: error: '%d' directive output may be truncated writing between 1 and 11 bytes into a region of size 7 [-Werror=format-truncation=] snprintf(buf, 10, "cpu%d", i); ^~ linux/arch/powerpc/oprofile/../../../drivers/oprofile/oprofile_stats.c:55:21: note: using the range [1, -2147483648] for directive argument snprintf(buf, 10, "cpu%d", i); ^~~ linux/arch/powerpc/oprofile/../../../drivers/oprofile/oprofile_stats.c:55:3: note: format output between 5 and 15 bytes into a destination of size 10 snprintf(buf, 10, "cpu%d", i); ^ LD crypto/async_tx/built-in.o CC lib/random32.o cc1: all warnings being treated as errors Signed-off-by: William Cohen --- arch/alpha/oprofile/common.c | 4 ++-- arch/avr32/oprofile/op_model_avr32.c | 2 +- arch/mips/oprofile/common.c | 4 ++-- arch/powerpc/oprofile/common.c | 4 ++-- arch/x86/oprofile/nmi_int.c | 2 +- drivers/oprofile/oprofile_perf.c | 4 ++-- drivers/oprofile/oprofile_stats.c | 4 ++-- 7 files changed, 12 insertions(+), 12 deletions(-) diff --git a/arch/alpha/oprofile/common.c b/arch/alpha/oprofile/common.c index 310a4ce..a1704ee 100644 --- a/arch/alpha/oprofile/common.c +++ b/arch/alpha/oprofile/common.c @@ -112,9 +112,9 @@ op_axp_create_files(struct dentry *root) for (i = 0; i < model->num_counters; ++i) { struct dentry *dir; -
char buf[4]; + char buf[32]; - snprintf(buf, sizeof buf, "%d", i); + snprintf(buf, sizeof(buf), "%d", i); dir = oprofilefs_mkdir(root, buf); oprofilefs_create_ulong(dir, "enabled", [i].enabled); diff --git a/arch/avr32/oprofile/op_model_avr32.c b/arch/avr32/oprofile/op_model_avr32.c index 08308be..af1249b 100644 --- a/arch/avr32/oprofile/op_model_avr32.c +++ b/arch/avr32/oprofile/op_model_avr32.c @@ -101,7 +101,7 @@ static int avr32_perf_counter_create_files(struct dentry *root) { struct dentry *dir; unsigned int i; - char filename[4]; + char filename[32]; for (i = 0; i < NR_counter; i++) { snprintf(filename, sizeof(filename), "%u", i); diff --git a/arch/mips/oprofile/common.c b/arch/mips/oprofile/common.c index 2f33992..20583cd 100644 --- a/arch/mips/oprofile/common.c +++ b/arch/mips/oprofile/common.c @@ -41,9 +41,9 @@ static int op_mips_create_files(struct dentry *root) for (i = 0; i < model->num_counters; ++i) { struct dentry *dir; - char buf[4]; + char buf[32]; - snprintf(buf, sizeof buf, "%d", i); + snprintf(buf, sizeof(buf), "%d", i); dir = oprofilefs_mkdir(root, buf); oprofilefs_create_ulong(dir, "enabled", [i].enabled); diff --git a/arch/powerpc/oprofile/common.c b/arch/powerpc/oprofile/common.c index bf094c5..5ac7b88 100644 --- a/arch/powerpc/oprofile/common.c +++
Re: [PATCH] vfio: Fix build break when SPAPR_TCE_IOMMU=n
On Thu, 02 Feb 2017 20:50:48 +1100 Michael Ellerman wrote: > Michael Ellerman writes: > > > Currently the kconfig logic for VFIO_IOMMU_SPAPR_TCE and VFIO_SPAPR_EEH > > is broken when SPAPR_TCE_IOMMU=n. Leading to: > > > > warning: (VFIO) selects VFIO_IOMMU_SPAPR_TCE which has unmet direct > > dependencies (VFIO && SPAPR_TCE_IOMMU) > > warning: (VFIO) selects VFIO_IOMMU_SPAPR_TCE which has unmet direct > > dependencies (VFIO && SPAPR_TCE_IOMMU) > > drivers/vfio/vfio_iommu_spapr_tce.c:113:8: error: implicit declaration > > of function 'mm_iommu_find' > > > > This stems from the fact that VFIO selects VFIO_IOMMU_SPAPR_TCE, and > > although it has an if clause, the condition is not correct. > > > > We could fix it by doing select VFIO_IOMMU_SPAPR_TCE if SPAPR_TCE_IOMMU, > > but the cleaner fix is to drop the selects and tie VFIO_IOMMU_SPAPR_TCE > > to the value of VFIO, and express the dependencies in only one place. > > > > Do the same for VFIO_SPAPR_EEH. > > > > The end result is that the values of VFIO_IOMMU_SPAPR_TCE and > > VFIO_SPAPR_EEH follow the value of VFIO, except when SPAPR_TCE_IOMMU=n > > and/or EEH=n. Which is exactly what we want to happen. > > Ping? > > There was a bit of discussion on this patch but I think we decided it > was correct in the end. If there are no other comments or objections, I'll queue this for v4.11. Based on this last comment: e.g., using def_tristate you get: # CONFIG_VFIO_IOMMU_SPAPR_TCE is not set # CONFIG_VFIO is not set Whereas using depends all you get is: # CONFIG_VFIO is not set I prefer the solution here that doesn't leave extra unselected config entries when CONFIG_VFIO is also not selected. Thanks, Alex
Re: [tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls
On Tue, Jan 31, 2017 at 10:22:47AM -0700, Ross Zwisler wrote: > On Tue, Jan 31, 2017 at 4:48 AM, Mike Galbraithwrote: > > On Tue, 2017-01-31 at 16:30 +0530, Sachin Sant wrote: Could some of you test this? It seems to cure things in my (very) limited testing. --- diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 96e4ccc..b773821 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -5609,7 +5609,7 @@ static void migrate_tasks(struct rq *dead_rq) { struct rq *rq = dead_rq; struct task_struct *next, *stop = rq->stop; - struct rq_flags rf, old_rf; + struct rq_flags rf; int dest_cpu; /* @@ -5628,7 +5628,9 @@ static void migrate_tasks(struct rq *dead_rq) * class method both need to have an up-to-date * value of rq->clock[_task] */ + rq_pin_lock(rq, ); update_rq_clock(rq); + rq_unpin_lock(rq, ); for (;;) { /* @@ -5641,7 +5643,7 @@ static void migrate_tasks(struct rq *dead_rq) /* * pick_next_task assumes pinned rq->lock. */ - rq_pin_lock(rq, ); + rq_repin_lock(rq, ); next = pick_next_task(rq, _task, ); BUG_ON(!next); next->sched_class->put_prev_task(rq, next); @@ -5670,13 +5672,6 @@ static void migrate_tasks(struct rq *dead_rq) continue; } - /* -* __migrate_task() may return with a different -* rq->lock held and a new cookie in 'rf', but we need -* to preserve rf::clock_update_flags for 'dead_rq'. -*/ - old_rf = rf; - /* Find suitable destination for @next, with force if needed. */ dest_cpu = select_fallback_rq(dead_rq->cpu, next); @@ -5685,7 +5680,6 @@ static void migrate_tasks(struct rq *dead_rq) raw_spin_unlock(>lock); rq = dead_rq; raw_spin_lock(>lock); - rf = old_rf; } raw_spin_unlock(>pi_lock); }
RE: ibmvtpm byteswapping inconsistency
From: Michal Suchánek > Sent: 02 February 2017 11:30 ... > The word is marked correctly as __be64 in that patch because count and > handle are swapped to BE when saved to it and the whole word is then > swapped again when loaded. If you just load ((u64)IBMVTPM_VALID_CMD << > 56) | ((u64)VTPM_TPM_COMMAND << 48) | ((u64)count << 32) | > ibmvtpm->rtce_dma_handle in a register it works equally well > without any __be and swaps involved. And that version will almost certainly generate much better code. David
Re: [PATCH v4 00/15] livepatch: hybrid consistency model
On Wed 2017-02-01 14:02:43, Josh Poimboeuf wrote: > On Thu, Jan 19, 2017 at 09:46:08AM -0600, Josh Poimboeuf wrote: > > Here's v4, based on linux-next/master. Mostly minor changes this time, > > primarily due to Petr's v3 comments. > > So far, the only review comments have been related to the first patch, > of which I just posted an updated version. > > If there are no more comments, it would be great to get these patches in > for the 4.11 merge window. Any objections to that? I have finally finished the review. We still need to address some issues but I think that we are much closer now. I found only a few small problems. Otherwise, the consistency handling looks pretty consistent to me :-) Best Regards, Petr
xmon memory dump does not handle LE
I was hoping this would be an easy fix, but now it looks like it will be more difficult. The basic problem is that xmon memory commands like 'dd' do not properly display the data on LE instances. This means that not only is the output difficult to read, but one cannot copy-paste addresses from it. This severely encumbers debugging with xmon on LE systems. It looks like prdump() is highly optimized but only works on BE. Have I missed something, or will it take a significant rewrite of this code to fix the LE problem? Thanks, Doug
Re: [PATCH v4.1 01/15] stacktrace/x86: add function for detecting reliable stack traces
On Wed, 1 Feb 2017, Josh Poimboeuf wrote: > For live patching and possibly other use cases, a stack trace is only > useful if it can be assured that it's completely reliable. Add a new > save_stack_trace_tsk_reliable() function to achieve that. > > Note that if the target task isn't the current task, and the target task > is allowed to run, then it could be writing the stack while the unwinder > is reading it, resulting in possible corruption. So the caller of > save_stack_trace_tsk_reliable() must ensure that the task is either > 'current' or inactive. > > save_stack_trace_tsk_reliable() relies on the x86 unwinder's detection > of pt_regs on the stack. If the pt_regs are not user-mode registers > from a syscall, then they indicate an in-kernel interrupt or exception > (e.g. preemption or a page fault), in which case the stack is considered > unreliable due to the nature of frame pointers. > > It also relies on the x86 unwinder's detection of other issues, such as: > > - corrupted stack data > - stack grows the wrong way > - stack walk doesn't reach the bottom > - user didn't provide a large enough entries array > > Such issues are reported by checking unwind_error() and !unwind_done(). > > Also add CONFIG_HAVE_RELIABLE_STACKTRACE so arch-independent code can > determine at build time whether the function is implemented. > > Signed-off-by: Josh Poimboeuf

Looks good to me. Reviewed-by: Miroslav Benes

Miroslav
[PATCH] powerpc/powernv: implement NMI IPI with OPAL_SIGNAL_SYSTEM_RESET
Signed-off-by: Nicholas Piggin--- Hi, This goes with the previous NMI IPI series, and a new version of Alistair's opal API I posted to the skiboot list. Only RFC until the firmware API is agreed on. Thanks, Nick arch/powerpc/include/asm/opal-api.h| 21 - arch/powerpc/include/asm/opal.h| 1 + arch/powerpc/platforms/powernv/opal-wrappers.S | 1 + arch/powerpc/platforms/powernv/powernv.h | 1 + arch/powerpc/platforms/powernv/setup.c | 3 +++ arch/powerpc/platforms/powernv/smp.c | 24 6 files changed, 50 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h index 0e2e57bcab50..53f1c09cab5d 100644 --- a/arch/powerpc/include/asm/opal-api.h +++ b/arch/powerpc/include/asm/opal-api.h @@ -167,7 +167,26 @@ #define OPAL_INT_EOI 124 #define OPAL_INT_SET_MFRR 125 #define OPAL_PCI_TCE_KILL 126 -#define OPAL_LAST 126 +#define OPAL_NMMU_SET_PTCR 127 +#define OPAL_XIVE_RESET128 +#define OPAL_XIVE_GET_IRQ_INFO 129 +#define OPAL_XIVE_GET_IRQ_CONFIG 130 +#define OPAL_XIVE_SET_IRQ_CONFIG 131 +#define OPAL_XIVE_GET_QUEUE_INFO 132 +#define OPAL_XIVE_SET_QUEUE_INFO 133 +#define OPAL_XIVE_DONATE_PAGE 134 +#define OPAL_XIVE_ALLOCATE_VP_BLOCK135 +#define OPAL_XIVE_FREE_VP_BLOCK136 +#define OPAL_XIVE_GET_VP_INFO 137 +#define OPAL_XIVE_SET_VP_INFO 138 +#define OPAL_XIVE_ALLOCATE_IRQ 139 +#define OPAL_XIVE_FREE_IRQ 140 +#define OPAL_XIVE_RESERVED1141 +#define OPAL_XIVE_RESERVED2142 +#define OPAL_XIVE_RESERVED3143 +#define OPAL_XIVE_RESERVED4144 +#define OPAL_SIGNAL_SYSTEM_RESET 145 +#define OPAL_LAST 145 /* Device tree flags */ diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h index 5c7db0f1a708..893d422febf6 100644 --- a/arch/powerpc/include/asm/opal.h +++ b/arch/powerpc/include/asm/opal.h @@ -232,6 +232,7 @@ int64_t opal_pci_tce_kill(uint64_t phb_id, uint32_t kill_type, int64_t opal_rm_pci_tce_kill(uint64_t phb_id, uint32_t kill_type, uint32_t pe_num, uint32_t tce_size, uint64_t dma_addr, uint32_t npages); 
+int64_t opal_signal_system_reset(int32_t cpu); /* Internal functions */ extern int early_init_dt_scan_opal(unsigned long node, const char *uname, diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S index 3aa40f1b20f5..e1e9ef11f100 100644 --- a/arch/powerpc/platforms/powernv/opal-wrappers.S +++ b/arch/powerpc/platforms/powernv/opal-wrappers.S @@ -312,3 +312,4 @@ OPAL_CALL(opal_int_set_mfrr, OPAL_INT_SET_MFRR); OPAL_CALL_REAL(opal_rm_int_set_mfrr, OPAL_INT_SET_MFRR); OPAL_CALL(opal_pci_tce_kill, OPAL_PCI_TCE_KILL); OPAL_CALL_REAL(opal_rm_pci_tce_kill, OPAL_PCI_TCE_KILL); +OPAL_CALL(opal_signal_system_reset,OPAL_SIGNAL_SYSTEM_RESET); diff --git a/arch/powerpc/platforms/powernv/powernv.h b/arch/powerpc/platforms/powernv/powernv.h index da7c843ac7f1..6b17bf7190e7 100644 --- a/arch/powerpc/platforms/powernv/powernv.h +++ b/arch/powerpc/platforms/powernv/powernv.h @@ -3,6 +3,7 @@ #ifdef CONFIG_SMP extern void pnv_smp_init(void); +extern int pnv_system_reset_exception(struct pt_regs *regs); #else static inline void pnv_smp_init(void) { } #endif diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c index d50c7d99baaf..2ebb429e272e 100644 --- a/arch/powerpc/platforms/powernv/setup.c +++ b/arch/powerpc/platforms/powernv/setup.c @@ -260,6 +260,9 @@ static void __init pnv_setup_machdep_opal(void) ppc_md.restart = pnv_restart; pm_power_off = pnv_power_off; ppc_md.halt = pnv_halt; +#ifdef CONFIG_SMP + ppc_md.system_reset_exception = pnv_system_reset_exception; +#endif ppc_md.machine_check_exception = opal_machine_check; ppc_md.mce_check_early_recovery = opal_mce_check_early_recovery; ppc_md.hmi_exception_early = opal_hmi_exception_early; diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c index 092ec1f7b58d..f90555f75723 100644 --- a/arch/powerpc/platforms/powernv/smp.c +++ b/arch/powerpc/platforms/powernv/smp.c @@ -241,6 +241,28 @@ static int 
pnv_cpu_bootable(unsigned int nr) return smp_generic_cpu_bootable(nr); } +int pnv_system_reset_exception(struct pt_regs *regs) +{ + if (smp_handle_nmi_ipi(regs)) +
Re: [PATCH 1/2] powerpc: mm: support ARCH_MMAP_RND_BITS
On Wed, Feb 1, 2017 at 9:42 PM, Bhupesh Sharmawrote: > powerpc: arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for > 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset > for the mmap base address. > > This value represents a compromise between increased > ASLR effectiveness and avoiding address-space fragmentation. > Replace it with a Kconfig option, which is sensibly bounded, so that > platform developers may choose where to place this compromise. > Keep default values as new minimums. > > This patch makes sure that now powerpc mmap arch_mmap_rnd() approach > is similar to other ARCHs like x86, arm64 and arm. > > Cc: Alexander Graf > Cc: Benjamin Herrenschmidt > Cc: Paul Mackerras > Cc: Michael Ellerman > Cc: Anatolij Gustschin > Cc: Alistair Popple > Cc: Matt Porter > Cc: Vitaly Bordug > Cc: Scott Wood > Cc: Kumar Gala > Cc: Daniel Cashman > Cc: Kees Cook > Signed-off-by: Bhupesh Sharma > --- > arch/powerpc/Kconfig | 34 ++ > arch/powerpc/mm/mmap.c | 7 --- > 2 files changed, 38 insertions(+), 3 deletions(-) > > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig > index a8ee573fe610..b4a843f68705 100644 > --- a/arch/powerpc/Kconfig > +++ b/arch/powerpc/Kconfig > @@ -22,6 +22,38 @@ config MMU > bool > default y > > +config ARCH_MMAP_RND_BITS_MIN > + default 5 if PPC_256K_PAGES && 32BIT > + default 12 if PPC_256K_PAGES && 64BIT > + default 7 if PPC_64K_PAGES && 32BIT > + default 14 if PPC_64K_PAGES && 64BIT > + default 9 if PPC_16K_PAGES && 32BIT > + default 16 if PPC_16K_PAGES && 64BIT > + default 11 if PPC_4K_PAGES && 32BIT > + default 18 if PPC_4K_PAGES && 64BIT > + > +# max bits determined by the following formula: > +# VA_BITS - PAGE_SHIFT - 4 > +# for e.g for 64K page and 64BIT = 48 - 16 - 4 = 28 > +config ARCH_MMAP_RND_BITS_MAX > + default 10 if PPC_256K_PAGES && 32BIT > + default 26 if PPC_256K_PAGES && 64BIT > + default 12 if PPC_64K_PAGES && 32BIT > + default 28 if PPC_64K_PAGES && 64BIT > + default 14 if PPC_16K_PAGES && 
32BIT > + default 30 if PPC_16K_PAGES && 64BIT > + default 16 if PPC_4K_PAGES && 32BIT > + default 32 if PPC_4K_PAGES && 64BIT > + > +config ARCH_MMAP_RND_COMPAT_BITS_MIN > + default 5 if PPC_256K_PAGES > + default 7 if PPC_64K_PAGES > + default 9 if PPC_16K_PAGES > + default 11 > + > +config ARCH_MMAP_RND_COMPAT_BITS_MAX > + default 16 > + > config HAVE_SETUP_PER_CPU_AREA > def_bool PPC64 > > @@ -100,6 +132,8 @@ config PPC > select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && > POWER7_CPU) > select HAVE_KPROBES > select HAVE_ARCH_KGDB > + select HAVE_ARCH_MMAP_RND_BITS > + select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT > select HAVE_KRETPROBES > select HAVE_ARCH_TRACEHOOK > select HAVE_MEMBLOCK > diff --git a/arch/powerpc/mm/mmap.c b/arch/powerpc/mm/mmap.c > index 2f1e44362198..babf59faab3b 100644 > --- a/arch/powerpc/mm/mmap.c > +++ b/arch/powerpc/mm/mmap.c > @@ -60,11 +60,12 @@ unsigned long arch_mmap_rnd(void) > { > unsigned long rnd; > > - /* 8MB for 32bit, 1GB for 64bit */ > +#ifdef CONFIG_COMPAT > if (is_32bit_task()) > - rnd = get_random_long() % (1<<(23-PAGE_SHIFT)); > + rnd = get_random_long() & ((1UL << mmap_rnd_compat_bits) - 1); > else > - rnd = get_random_long() % (1UL<<(30-PAGE_SHIFT)); > +#endif > + rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1); > > return rnd << PAGE_SHIFT; > } Awesome! This looks good to me based on my earlier analysis. Reviewed-by: Kees Cook -Kees -- Kees Cook Pixel Security
Re: [PATCH 0/2] RFC: Adjust powerpc ASLR elf randomness
On Wed, Feb 1, 2017 at 9:42 PM, Bhupesh Sharma wrote: > This RFC patchset tries to make the powerpc ASLR elf randomness > implementation similar to other ARCHs (like x86). > > The 1st patch introduces the support of ARCH_MMAP_RND_BITS in powerpc > mmap implementation to allow a sane balance between increased randomness > in the mmap address of ASLR elfs and increased address space > fragmentation. > > The 2nd patch increases the ELF_ET_DYN_BASE value from the current > hardcoded value of 0x2000_ to something more practical, > i.e. TASK_SIZE - PAGE_SHIFT (which makes sense especially for > 64-bit platforms which would like to utilize more randomization > in the load address of a PIE elf). I don't think you want this second patch. Moving ELF_ET_DYN_BASE to the top of TASK_SIZE means you'll be constantly colliding with stack and mmap randomization. 0x2000 is way better since it randomizes up from there towards the mmap area. Is there a reason to avoid the 32-bit memory range for the ELF addresses? -Kees -- Kees Cook Pixel Security
Re: [PATCH 1/2] powerpc: mm: support ARCH_MMAP_RND_BITS
On Thu, Feb 02, 2017 at 09:23:33PM +1100, Michael Ellerman wrote: > +config ARCH_MMAP_RND_BITS_MIN > + # On 64-bit up to 1G of address space (2^30) > + default 12 if 64BIT && PPC_256K_PAGES # 256K (2^18), = 30 - 18 = 12 > + default 14 if 64BIT && PPC_64K_PAGES# 64K (2^16), = 30 - 16 = 14 > + default 16 if 64BIT && PPC_16K_PAGES# 16K (2^14), = 30 - 14 = 16 > + default 18 if 64BIT # 4K (2^12), = 30 - 12 = 18 > + default ARCH_MMAP_RND_COMPAT_BITS_MIN > + > +config ARCH_MMAP_RND_BITS_MAX > + # On 64-bit up to 32T of address space (2^45) I thought it was 64T, TASK_SIZE_USER64 is 2^46? > I also have what I think is a better hunk for that: > > unsigned long arch_mmap_rnd(void) > { > - unsigned long rnd; > + unsigned long shift, rnd; > > - /* 8MB for 32bit, 1GB for 64bit */ > + shift = mmap_rnd_bits; > +#ifdef CONFIG_COMPAT > if (is_32bit_task()) > - rnd = (unsigned long)get_random_int() % (1<<(23-PAGE_SHIFT)); > - else > - rnd = (unsigned long)get_random_int() % (1<<(30-PAGE_SHIFT)); > + shift = mmap_rnd_compat_bits; > +#endif > + > + rnd = (unsigned long)get_random_int() % (1 << shift); > > But I'm just nit picking I guess :) > No.. the version above is nicer IMHO Balbir
Re: [PATCH v4 13/15] livepatch: change to a per-task consistency model
!!! This is the right version. I am sorry again for the confusion. !!! > Change livepatch to use a basic per-task consistency model. This is the > foundation which will eventually enable us to patch those ~10% of > security patches which change function or data semantics. This is the > biggest remaining piece needed to make livepatch more generally useful. > > diff --git a/Documentation/livepatch/livepatch.txt > b/Documentation/livepatch/livepatch.txt > index 7f04e13..fb00d66 100644 > --- a/Documentation/livepatch/livepatch.txt > +++ b/Documentation/livepatch/livepatch.txt > +3.1 Adding consistency model support to new architectures > +- > + > +For adding consistency model support to new architectures, there are a > +few options: > + > +1) Add CONFIG_HAVE_RELIABLE_STACKTRACE. This means porting objtool, and > + for non-DWARF unwinders, also making sure there's a way for the stack > + tracing code to detect interrupts on the stack. > + > +2) Alternatively, figure out a way to patch kthreads without stack > + checking. If all kthreads sleep in the same place, then we can > + designate that place as a patching point. I think Petr M has been > + working on that? Here is a version with some more details: Alternatively, every kthread has to explicitly switch current->patch_state at a safe place. Kthreads are typically "infinite" loops that do some action repeatedly. The safe location is between the loops when there are no locks taken and all data structures are in a well defined state. The location is clear when using the workqueues or the kthread worker API. These kthreads process "independent" works in a generic loop. It is much more complicated with kthreads using a custom loop. There the safe place must be carefully localized case by case.
> + In that case, arches without > + HAVE_RELIABLE_STACKTRACE would still be able to use the > + non-stack-checking parts of the consistency model: > + > + a) patching user tasks when they cross the kernel/user space > + boundary; and > + > + b) patching kthreads and idle tasks at their designated patch points. > + > + This option isn't as good as option 1 because it requires signaling > + most of the tasks to patch them. Kthreads need to be woken because most of them ignore signals. And idle tasks might need to be explicitly scheduled if the load is too high. Mirek knows more about this. Well, I am not sure if you want to get into such details. > + But it could still be a good backup > + option for those architectures which don't have reliable stack traces > + yet. > + > +In the meantime, patches for such architectures can bypass the > +consistency model by setting klp_patch.immediate to true. I would add that this is perfectly fine for patches that do not change the semantics of the patched functions. In practice, this is usable in about 90% of security and critical fixes.
Once all tasks have > +been patched, the 'transition' value changes to '0'. For more > +information about this process, see the "Consistency model" section. > + > +If an original function is patched for the first time, a function > +specific struct klp_ops is created and an universal ftrace handler is > +registered. > > Functions might be patched multiple times. The ftrace handler is registered > only once for the given function. Further patches just add an entry to the > @@ -261,6 +375,12 @@ by writing '0' to /sys/kernel/livepatch//enabled. > At this stage > either the code from the previously enabled patch or even the original > code gets used. > > +When a patch is disabled, livepatch enters into a transition state where > +tasks are converging to the unpatched state. This is indicated by a > +value of '1' in /sys/kernel/livepatch//transition. Once all tasks > +have been unpatched, the 'transition' value changes to '0'. For more > +information about this process, see the "Consistency model" section. > + > Here all the functions (struct klp_func) associated with the to-be-disabled > patch are removed
Re: [PATCH v4 13/15] livepatch: change to a per-task consistency model
IMPORTANT: Please, forget this version. It is few days old and incomplete and probably wrong. I am sorry for confusion. Best Regards, Petr
Re: [PATCH v4 13/15] livepatch: change to a per-task consistency model
On Thu 2017-01-19 09:46:21, Josh Poimboeuf wrote: > Change livepatch to use a basic per-task consistency model. This is the > foundation which will eventually enable us to patch those ~10% of > security patches which change function or data semantics. This is the > biggest remaining piece needed to make livepatch more generally useful. > > diff --git a/Documentation/livepatch/livepatch.txt > b/Documentation/livepatch/livepatch.txt > index 7f04e13..fb00d66 100644 > --- a/Documentation/livepatch/livepatch.txt > +++ b/Documentation/livepatch/livepatch.txt > +3.1 Adding consistency model support to new architectures > +- > + > +For adding consistency model support to new architectures, there are a > +few options: > + > +1) Add CONFIG_HAVE_RELIABLE_STACKTRACE. This means porting objtool, and > + for non-DWARF unwinders, also making sure there's a way for the stack > + tracing code to detect interrupts on the stack. > + > +2) Alternatively, figure out a way to patch kthreads without stack > + checking. If all kthreads sleep in the same place, then we can > + designate that place as a patching point. I think Petr M has been > + working on that? Alternatively, every kthread has to explicitely switch current->patch_state on a safe place. Kthreads are typically "infinite" loops that do some action repeatedly. The safe location is between the loops when there are no locks taken and all data structures are in a well defined state. The location is well defined when using the workqueues or kthread worker API. These kthreads process "independent" works in a generic loop. It is much more complicated with kthreads using a custom loop. There the safe place must be carefully localized case by case. 
> + In that case, arches without > + HAVE_RELIABLE_STACKTRACE would still be able to use the > + non-stack-checking parts of the consistency model: > + > + a) patching user tasks when they cross the kernel/user space > + boundary; and > + > + b) patching kthreads and idle tasks at their designated patch points. > + > + This option isn't as good as option 1 because it requires signaling > + most of the tasks to patch them. Kthreads need to be waken because most of them ignore signals. And idle tasks might need to be explicitely scheduled if there is too big load. Mirek knows more about this. Well, I am not sure if you want to get into such details. > + But it could still be a good backup > + option for those architectures which don't have reliable stack traces > + yet. > + > +In the meantime, patches for such architectures can bypass the > +consistency model by setting klp_patch.immediate to true. I would add that this is perfectly fine for patches that do not change semantic of the patched functions. In practice, this is usable in about 90% of security and critical fixes. > 4. Livepatch module > @@ -134,7 +242,7 @@ Documentation/livepatch/module-elf-format.txt for more > details. > > > 4.2. Metadata > - > +- > > The patch is described by several structures that split the information > into three levels: > @@ -239,9 +347,15 @@ Registered patches might be enabled either by calling > klp_enable_patch() or > by writing '1' to /sys/kernel/livepatch//enabled. The system will > start using the new implementation of the patched functions at this stage. > > -In particular, if an original function is patched for the first time, a > -function specific struct klp_ops is created and an universal ftrace handler > -is registered. > +When a patch is enabled, livepatch enters into a transition state where > +tasks are converging to the patched state. This is indicated by a value > +of '1' in /sys/kernel/livepatch//transition. 
Once all tasks have > +been patched, the 'transition' value changes to '0'. For more > +information about this process, see the "Consistency model" section. > + > +If an original function is patched for the first time, a function > +specific struct klp_ops is created and an universal ftrace handler is > +registered. > > Functions might be patched multiple times. The ftrace handler is registered > only once for the given function. Further patches just add an entry to the > @@ -261,6 +375,12 @@ by writing '0' to /sys/kernel/livepatch//enabled. > At this stage > either the code from the previously enabled patch or even the original > code gets used. > > +When a patch is disabled, livepatch enters into a transition state where > +tasks are converging to the unpatched state. This is indicated by a > +value of '1' in /sys/kernel/livepatch//transition. Once all tasks > +have been unpatched, the 'transition' value changes to '0'. For more > +information about this process, see the "Consistency model" section. > + > Here all the functions (struct klp_func) associated with the to-be-disabled > patch are removed from the corresponding struct klp_ops. The ftrace handler >
Re: powerpc/boot: Update .gitignore
On Wed, 2017-02-01 at 06:00:11 UTC, Michael Ellerman wrote: > Add a few things that have been missed from .gitignore over the years. > > Signed-off-by: Michael Ellerman

Applied to powerpc next.

https://git.kernel.org/powerpc/c/4eb43875a1859b8d4fb6c56a441a18

cheers
Re: powerpc/pseries: Report DLPAR capabilities
On Wed, 2017-01-11 at 17:00:58 UTC, Nathan Fontenot wrote: > As we add the ability to do DLPAR of additional devices through > the sysfs interface we need to know which devices are supported. > This adds the reporting of supported devices with a comma separated > list reported in the existing /sys/kernel/dlpar. > > Signed-off-by: Nathan Fontenot

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/673bc4354d42731018494bb69d63b6

cheers
Re: powerpc/debug: PTDUMP should depend on DEBUG_FS
On Wed, 2017-02-01 at 02:23:44 UTC, Michael Ellerman wrote: > CONFIG_PPC_PTDUMP currently selects CONFIG_DEBUG_FS. But CONFIG_DEBUG_FS > is user-selectable, so we shouldn't select it. Instead depend on it. > > Signed-off-by: Michael Ellerman

Applied to powerpc next.

https://git.kernel.org/powerpc/c/1c877f71b7b9c0a5144e29d599eac2

cheers
Re: powerpc/xmon: Cleanup to use is_kernel_addr macro
On Thu, 2017-01-05 at 11:08:15 UTC, Madhavan Srinivasan wrote: > Signed-off-by: Madhavan Srinivasan

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/e71ff89c712cb387914abff373ac83

cheers
Re: [v3, 1/3] powerpc/pseries: Make the acquire/release of the drc for memory a separate step
On Fri, 2017-01-06 at 19:25:53 UTC, John Allen wrote: > When adding and removing LMBs we should make the acquire/release of > the DRC a separate step to allow for a few improvements. First > this will ensure that LMBs removed during a remove by count operation > are all available if an error occurs and we need to add them back. By > first removing all the LMBs from the kernel before releasing their > DRCs the LMBs are available to add back should an error occur. > > Also, this will allow for faster re-add operations of memory for > PRRN event handling since we can skip the unneeded step of having > to release the DRC and then acquire it back. > > Signed-off-by: Nathan Fontenot > Signed-off-by: John Allen

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/c21f515c743687c6c2b3d38227e6ad

cheers
Re: [1/3] powerpc/sparse: constify the address pointer in __get_user_check
On Mon, 2017-01-30 at 06:41:53 UTC, Daniel Axtens wrote: > In __get_user_check, we create an intermediate pointer for the > user address we're about to fetch. We currently don't tag this > pointer as const. Make it const, as we are simply dereferencing > it, and its scope is limited to the __get_user_check macro. > > Signed-off-by: Daniel Axtens

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/f84ed59a612d866cde0bd17ad2a52a

cheers
Re: [1/2] powerpc/64: Move HAVE_CONTEXT_TRACKING from pseries to common Kconfig
On Thu, 2017-01-12 at 10:17:33 UTC, Anton Blanchard wrote: > From: Anton Blanchard > > We added support for HAVE_CONTEXT_TRACKING, but placed the option inside > PPC_PSERIES. > > This has the undesirable effect that NO_HZ_FULL can be enabled on a > kernel with both powernv and pseries support, but cannot on a kernel > with powernv only support. > > Signed-off-by: Anton Blanchard

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/d6c569b99558b219fcf0ce0d3af8ec

cheers
Re: ibmvtpm byteswapping inconsistency
On Wed, 1 Feb 2017 23:40:33 -0500 Vicky wrote: > > On Jan 26, 2017, at 5:58 PM, Ashley Lai > > wrote: > > > > Adding Vicky from IBM. > > > > > > On 01/26/2017 04:05 PM, Jason Gunthorpe wrote: > >> On Thu, Jan 26, 2017 at 09:22:48PM +0100, Michal Suchánek wrote: > >> > >>> This is repeated a few times in the driver so I added memset to > >>> quiet gcc and make behavior deterministic in case the unused > >>> fields get some meaning in the future. > >> Yep, reserved certainly needs to be zeroed.. Can you send a patch? > >> memset is overkill... > >> > >>> However, in tpm_ibmvtpm_send the structure is initialized as > >>> > >>> struct ibmvtpm_crq crq; > >>> __be64 *word = (__be64 *) > >>> ... > >>> crq.valid = (u8)IBMVTPM_VALID_CMD; > >>> crq.msg = (u8)VTPM_TPM_COMMAND; > >>> crq.len = cpu_to_be16(count); > >>> crq.data = cpu_to_be32(ibmvtpm->rtce_dma_handle); > >>> > >>> and submitted with > >>> > >>> rc = ibmvtpm_send_crq(ibmvtpm->vdev, be64_to_cpu(word[0]), > >>> be64_to_cpu(word[1])); > >>> meaning it is swapped twice. > >> No idea, Nayna may know. > >> > >> My guess is that '__be64 *word' should be 'u64 *word'... > >> > >> Jason > > > > I don’t think we want ‘word' to be changed back to be of type > ‘u64’. Please see commit 62dfd912ab3b5405b6fe72d0135c37e9648071f1

The word is marked correctly as __be64 in that patch because count and handle are swapped to BE when saved to it and the whole word is then swapped again when loaded. If you just load ((u64)IBMVTPM_VALID_CMD << 56) | ((u64)VTPM_TPM_COMMAND << 48) | ((u64)count << 32) | ibmvtpm->rtce_dma_handle in a register it works equally well without any __be and swaps involved. Note however that __be64 and u64 are all the same to the compiler. It's just a note for the reader and analysis tools.

Thanks Michal
Re: ibmvtpm byteswapping inconsistency
Vicky writes: >> On Jan 26, 2017, at 5:58 PM, Ashley Lai wrote: >> >> Adding Vicky from IBM. >> >> >> On 01/26/2017 04:05 PM, Jason Gunthorpe wrote: >>> On Thu, Jan 26, 2017 at 09:22:48PM +0100, Michal Suchánek wrote: >>> This is repeated a few times in the driver so I added memset to quiet gcc and make behavior deterministic in case the unused fields get some meaning in the future. >>> Yep, reserved certainly needs to be zeroed.. Can you send a patch? >>> memset is overkill... >>> However, in tpm_ibmvtpm_send the structure is initialized as struct ibmvtpm_crq crq; __be64 *word = (__be64 *) ... crq.valid = (u8)IBMVTPM_VALID_CMD; crq.msg = (u8)VTPM_TPM_COMMAND; crq.len = cpu_to_be16(count); crq.data = cpu_to_be32(ibmvtpm->rtce_dma_handle); and submitted with rc = ibmvtpm_send_crq(ibmvtpm->vdev, be64_to_cpu(word[0]), be64_to_cpu(word[1])); meaning it is swapped twice. >>> No idea, Nayna may know. >>> >>> My guess is that '__be64 *word' should be 'u64 *word'... >>> >>> Jason >> > > I don’t think we want ‘word' to be changed back to be of type ‘u64’. Please > see commit 62dfd912ab3b5405b6fe72d0135c37e9648071f1

The type of word is basically irrelevant.
Unless you're running sparse and actually checking the errors, which it seems you're not doing: drivers/char/tpm/tpm_ibmvtpm.c:90:30: warning: cast removes address space of expression drivers/char/tpm/tpm_ibmvtpm.c:91:23: warning: incorrect type in argument 1 (different address spaces) drivers/char/tpm/tpm_ibmvtpm.c:91:23:expected void * drivers/char/tpm/tpm_ibmvtpm.c:91:23:got void [noderef] *rtce_buf drivers/char/tpm/tpm_ibmvtpm.c:136:17: warning: cast removes address space of expression drivers/char/tpm/tpm_ibmvtpm.c:188:46: warning: incorrect type in argument 2 (different base types) drivers/char/tpm/tpm_ibmvtpm.c:188:46:expected unsigned long long [unsigned] [usertype] w1 drivers/char/tpm/tpm_ibmvtpm.c:188:46:got restricted __be64 [usertype] drivers/char/tpm/tpm_ibmvtpm.c:189:31: warning: incorrect type in argument 3 (different base types) drivers/char/tpm/tpm_ibmvtpm.c:189:31:expected unsigned long long [unsigned] [usertype] w2 drivers/char/tpm/tpm_ibmvtpm.c:189:31:got restricted __be64 [usertype] drivers/char/tpm/tpm_ibmvtpm.c:215:46: warning: incorrect type in argument 2 (different base types) drivers/char/tpm/tpm_ibmvtpm.c:215:46:expected unsigned long long [unsigned] [usertype] w1 drivers/char/tpm/tpm_ibmvtpm.c:215:46:got restricted __be64 [usertype] drivers/char/tpm/tpm_ibmvtpm.c:216:31: warning: incorrect type in argument 3 (different base types) drivers/char/tpm/tpm_ibmvtpm.c:216:31:expected unsigned long long [unsigned] [usertype] w2 drivers/char/tpm/tpm_ibmvtpm.c:216:31:got restricted __be64 [usertype] drivers/char/tpm/tpm_ibmvtpm.c:294:30: warning: incorrect type in argument 1 (different address spaces) drivers/char/tpm/tpm_ibmvtpm.c:294:30:expected void const * drivers/char/tpm/tpm_ibmvtpm.c:294:30:got void [noderef] *rtce_buf drivers/char/tpm/tpm_ibmvtpm.c:342:46: warning: incorrect type in argument 2 (different base types) drivers/char/tpm/tpm_ibmvtpm.c:342:46:expected unsigned long long [unsigned] [usertype] w1 
drivers/char/tpm/tpm_ibmvtpm.c:342:46:got restricted __be64 [usertype] drivers/char/tpm/tpm_ibmvtpm.c:343:31: warning: incorrect type in argument 3 (different base types) drivers/char/tpm/tpm_ibmvtpm.c:343:31:expected unsigned long long [unsigned] [usertype] w2 drivers/char/tpm/tpm_ibmvtpm.c:343:31:got restricted __be64 [usertype] drivers/char/tpm/tpm_ibmvtpm.c:494:43: warning: incorrect type in assignment (different address spaces) drivers/char/tpm/tpm_ibmvtpm.c:494:43:expected void [noderef] *rtce_buf drivers/char/tpm/tpm_ibmvtpm.c:494:43:got void * drivers/char/tpm/tpm_ibmvtpm.c:501:52: warning: incorrect type in argument 2 (different address spaces) drivers/char/tpm/tpm_ibmvtpm.c:501:52:expected void *ptr drivers/char/tpm/tpm_ibmvtpm.c:501:52:got void [noderef] *rtce_buf drivers/char/tpm/tpm_ibmvtpm.c:507:46: warning: incorrect type in argument 1 (different address spaces) drivers/char/tpm/tpm_ibmvtpm.c:507:46:expected void const * drivers/char/tpm/tpm_ibmvtpm.c:507:46:got void [noderef] *rtce_buf What matters is how you actually do the byte swaps. cheers
Re: [PATCH 1/2] powerpc: mm: support ARCH_MMAP_RND_BITS
Bhupesh Sharma writes: > powerpc: arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for > 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset > for the mmap base address. > > This value represents a compromise between increased > ASLR effectiveness and avoiding address-space fragmentation. > Replace it with a Kconfig option, which is sensibly bounded, so that > platform developers may choose where to place this compromise. > Keep default values as new minimums. > > This patch makes sure that now powerpc mmap arch_mmap_rnd() approach > is similar to other ARCHs like x86, arm64 and arm.

Thanks for looking at this, it's been on my TODO for a while. I have a half completed version locally, but never got around to testing it thoroughly.

> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig > index a8ee573fe610..b4a843f68705 100644 > --- a/arch/powerpc/Kconfig > +++ b/arch/powerpc/Kconfig > @@ -22,6 +22,38 @@ config MMU > bool > default y > > +config ARCH_MMAP_RND_BITS_MIN > + default 5 if PPC_256K_PAGES && 32BIT > + default 12 if PPC_256K_PAGES && 64BIT > + default 7 if PPC_64K_PAGES && 32BIT > + default 14 if PPC_64K_PAGES && 64BIT > + default 9 if PPC_16K_PAGES && 32BIT > + default 16 if PPC_16K_PAGES && 64BIT > + default 11 if PPC_4K_PAGES && 32BIT > + default 18 if PPC_4K_PAGES && 64BIT > + > +# max bits determined by the following formula: > +# VA_BITS - PAGE_SHIFT - 4 > +# for e.g for 64K page and 64BIT = 48 - 16 - 4 = 28 > +config ARCH_MMAP_RND_BITS_MAX > + default 10 if PPC_256K_PAGES && 32BIT > + default 26 if PPC_256K_PAGES && 64BIT > + default 12 if PPC_64K_PAGES && 32BIT > + default 28 if PPC_64K_PAGES && 64BIT > + default 14 if PPC_16K_PAGES && 32BIT > + default 30 if PPC_16K_PAGES && 64BIT > + default 16 if PPC_4K_PAGES && 32BIT > + default 32 if PPC_4K_PAGES && 64BIT > + > +config ARCH_MMAP_RND_COMPAT_BITS_MIN > + default 5 if PPC_256K_PAGES > + default 7 if PPC_64K_PAGES > + default 9 if PPC_16K_PAGES > + default 11 > +
+config ARCH_MMAP_RND_COMPAT_BITS_MAX > + default 16 > + This is what I have below, which is a bit neater I think because each value is only there once (by defaulting to the COMPAT value). My max values are different to yours, I don't really remember why I chose those values, so we can argue about which is right. +config ARCH_MMAP_RND_BITS_MIN + # On 64-bit up to 1G of address space (2^30) + default 12 if 64BIT && PPC_256K_PAGES # 256K (2^18), = 30 - 18 = 12 + default 14 if 64BIT && PPC_64K_PAGES# 64K (2^16), = 30 - 16 = 14 + default 16 if 64BIT && PPC_16K_PAGES# 16K (2^14), = 30 - 14 = 16 + default 18 if 64BIT # 4K (2^12), = 30 - 12 = 18 + default ARCH_MMAP_RND_COMPAT_BITS_MIN + +config ARCH_MMAP_RND_BITS_MAX + # On 64-bit up to 32T of address space (2^45) + default 27 if 64BIT && PPC_256K_PAGES # 256K (2^18), = 45 - 18 = 27 + default 29 if 64BIT && PPC_64K_PAGES# 64K (2^16), = 45 - 16 = 29 + default 31 if 64BIT && PPC_16K_PAGES# 16K (2^14), = 45 - 14 = 31 + default 33 if 64BIT # 4K (2^12), = 45 - 12 = 33 + default ARCH_MMAP_RND_COMPAT_BITS_MAX + +config ARCH_MMAP_RND_COMPAT_BITS_MIN + # Up to 8MB of address space (2^23) + default 5 if PPC_256K_PAGES # 256K (2^18), = 23 - 18 = 5 + default 7 if PPC_64K_PAGES # 64K (2^16), = 23 - 16 = 7 + default 9 if PPC_16K_PAGES # 16K (2^14), = 23 - 14 = 9 + default 11 # 4K (2^12), = 23 - 12 = 11 + +config ARCH_MMAP_RND_COMPAT_BITS_MAX + # Up to 2G of address space (2^31) + default 13 if PPC_256K_PAGES# 256K (2^18), = 31 - 18 = 13 + default 15 if PPC_64K_PAGES # 64K (2^16), = 31 - 16 = 15 + default 17 if PPC_16K_PAGES # 16K (2^14), = 31 - 14 = 17 + default 19 # 4K (2^12), = 31 - 12 = 19 > diff --git a/arch/powerpc/mm/mmap.c b/arch/powerpc/mm/mmap.c > index 2f1e44362198..babf59faab3b 100644 > --- a/arch/powerpc/mm/mmap.c > +++ b/arch/powerpc/mm/mmap.c > @@ -60,11 +60,12 @@ unsigned long arch_mmap_rnd(void) > { > unsigned long rnd; > > - /* 8MB for 32bit, 1GB for 64bit */ > +#ifdef CONFIG_COMPAT > if (is_32bit_task()) > - rnd = 
get_random_long() % (1<<(23-PAGE_SHIFT)); > + rnd = get_random_long() & ((1UL << mmap_rnd_compat_bits) - 1); > else > - rnd = get_random_long() % (1UL<<(30-PAGE_SHIFT)); > +#endif > + rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1); I also have what I think is a better hunk for that: unsigned long arch_mmap_rnd(void) { - unsigned long rnd; + unsigned long shift, rnd; - /* 8MB for 32bit,
Re: [PATCH] vfio: Fix build break when SPAPR_TCE_IOMMU=n
Michael Ellerman writes: > Currently the kconfig logic for VFIO_IOMMU_SPAPR_TCE and VFIO_SPAPR_EEH > is broken when SPAPR_TCE_IOMMU=n. Leading to: > > warning: (VFIO) selects VFIO_IOMMU_SPAPR_TCE which has unmet direct > dependencies (VFIO && SPAPR_TCE_IOMMU) > warning: (VFIO) selects VFIO_IOMMU_SPAPR_TCE which has unmet direct > dependencies (VFIO && SPAPR_TCE_IOMMU) > drivers/vfio/vfio_iommu_spapr_tce.c:113:8: error: implicit declaration of > function 'mm_iommu_find' > > This stems from the fact that VFIO selects VFIO_IOMMU_SPAPR_TCE, and > although it has an if clause, the condition is not correct. > > We could fix it by doing select VFIO_IOMMU_SPAPR_TCE if SPAPR_TCE_IOMMU, > but the cleaner fix is to drop the selects and tie VFIO_IOMMU_SPAPR_TCE > to the value of VFIO, and express the dependencies in only one place. > > Do the same for VFIO_SPAPR_EEH. > > The end result is that the values of VFIO_IOMMU_SPAPR_TCE and > VFIO_SPAPR_EEH follow the value of VFIO, except when SPAPR_TCE_IOMMU=n > and/or EEH=n. Which is exactly what we want to happen.

Ping? There was a bit of discussion on this patch but I think we decided it was correct in the end.

cheers
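The shape of the fix described above, sketched as a Kconfig fragment (illustrative only, not the exact committed text): the symbol defaults to the value of VFIO, and the dependency is stated once, so it silently stays off when SPAPR_TCE_IOMMU=n instead of tripping a select with unmet dependencies.

```kconfig
config VFIO_IOMMU_SPAPR_TCE
	tristate
	depends on VFIO && SPAPR_TCE_IOMMU
	default VFIO

config VFIO_SPAPR_EEH
	tristate
	depends on EEH && VFIO_IOMMU_SPAPR_TCE
	default VFIO
```

Unlike select, a default is bounded by the depends on line, which is why this expresses "follow VFIO, except when the prerequisites are missing" without duplicating the condition.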
Re: [PATCH 1 4/4] PCI: rcar: Use of_device_get_match_data() to simplify probe
On Tue, Jan 31, 2017 at 02:20:20PM -0600, Bjorn Helgaas wrote: > This is a DT-only driver, so the only way to call rcar_pcie_probe() is to > match an entry in rcar_pcie_of_match[], so of_id cannot be NULL. > > Furthermore, of_id->data can only be NULL if an rcar_pcie_of_match[] entry > has a NULL .data member. That's a driver defect, and we don't want to > return -EINVAL, which is easy to ignore. We'd rather take the NULL pointer > dereference so we notice the problem and fix it. > > Use of_device_get_match_data() to retrieve the hw_init_fn pointer. No > functional change intended. > > Suggested-by: Geert Uytterhoeven> Signed-off-by: Bjorn Helgaas Acked-by: Simon Horman > --- > drivers/pci/host/pcie-rcar.c |7 +-- > 1 file changed, 1 insertion(+), 6 deletions(-) > > diff --git a/drivers/pci/host/pcie-rcar.c b/drivers/pci/host/pcie-rcar.c > index aca85be101f8..b3b6d5273347 100644 > --- a/drivers/pci/host/pcie-rcar.c > +++ b/drivers/pci/host/pcie-rcar.c > @@ -1125,7 +1125,6 @@ static int rcar_pcie_probe(struct platform_device *pdev) > struct device *dev = >dev; > struct rcar_pcie *pcie; > unsigned int data; > - const struct of_device_id *of_id; > int err; > int (*hw_init_fn)(struct rcar_pcie *); > > @@ -1149,11 +1148,6 @@ static int rcar_pcie_probe(struct platform_device > *pdev) > if (err) > return err; > > - of_id = of_match_device(rcar_pcie_of_match, dev); > - if (!of_id || !of_id->data) > - return -EINVAL; > - hw_init_fn = of_id->data; > - > pm_runtime_enable(dev); > err = pm_runtime_get_sync(dev); > if (err < 0) { > @@ -1162,6 +1156,7 @@ static int rcar_pcie_probe(struct platform_device *pdev) > } > > /* Failure to get a link might just be that no cards are inserted */ > + hw_init_fn = of_device_get_match_data(dev); > err = hw_init_fn(pcie); > if (err) { > dev_info(dev, "PCIe link down\n"); >
Re: [PATCH 1/2] powerpc: mm: support ARCH_MMAP_RND_BITS
> @@ -100,6 +132,8 @@ config PPC > select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && > POWER7_CPU) > select HAVE_KPROBES > select HAVE_ARCH_KGDB > + select HAVE_ARCH_MMAP_RND_BITS > + select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT COMPAT is on for ppc64 by default, so we'll end up with COMPAT_BITS same as before all the time. Balbir Singh.
Re: [PATCH 1/3] powerpc/xmon Update ppc-dis/opc.c and ppc.h
Hi Balbir, [auto build test ERROR on powerpc/next] [also build test ERROR on v4.10-rc6 next-20170201] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Balbir-Singh/Update-xmon-disassembly/20170202-135830 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next config: powerpc-allmodconfig (attached as .config) compiler: powerpc64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705 reproduce: wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=powerpc Note: the linux-review/Balbir-Singh/Update-xmon-disassembly/20170202-135830 HEAD aab7a78909e32ae5743164fc79d9f5eda993f0e7 builds fine. It only hurts bisectibility. All errors (new ones prefixed by >>): >> arch/powerpc/xmon/ppc-dis.c:22:19: fatal error: stdio.h: No such file or >> directory #include <stdio.h> ^ compilation terminated. -- >> arch/powerpc/xmon/ppc-opc.c:23:19: fatal error: stdio.h: No such file or >> directory #include <stdio.h> ^ compilation terminated. vim +22 arch/powerpc/xmon/ppc-dis.c 16 the GNU General Public License for more details. 17 18 You should have received a copy of the GNU General Public License 19 along with this file; see the file COPYING. If not, write to the Free 20 Software Foundation, 51 Franklin Street - Fifth Floor, Boston, MA 02110-1301, USA. */ 21 > 22 #include <stdio.h> 23 #include "sysdep.h" 24 #include "dis-asm.h" 25 #include "opcode/ppc.h" --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation
Re: [PATCH 2/3] powerpc/xmon: Apply binutils changes to upgrade disassembly
Hi Balbir, [auto build test ERROR on powerpc/next] [also build test ERROR on v4.10-rc6 next-20170201] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Balbir-Singh/Update-xmon-disassembly/20170202-135830 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next config: powerpc-defconfig (attached as .config) compiler: powerpc64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705 reproduce: wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=powerpc Note: the linux-review/Balbir-Singh/Update-xmon-disassembly/20170202-135830 HEAD aab7a78909e32ae5743164fc79d9f5eda993f0e7 builds fine. It only hurts bisectibility. All errors (new ones prefixed by >>): >> arch/powerpc/xmon/ppc-dis.c:21:20: fatal error: sysdep.h: No such file or >> directory #include "sysdep.h" ^ compilation terminated. -- >> arch/powerpc/xmon/ppc-opc.c:22:20: fatal error: sysdep.h: No such file or >> directory #include "sysdep.h" ^ compilation terminated. vim +21 arch/powerpc/xmon/ppc-dis.c ^1da177e arch/ppc64/xmon/ppc-dis.c Linus Torvalds 2005-04-16 15 the GNU General Public License for more details. ^1da177e arch/ppc64/xmon/ppc-dis.c Linus Torvalds 2005-04-16 16 ^1da177e arch/ppc64/xmon/ppc-dis.c Linus Torvalds 2005-04-16 17 You should have received a copy of the GNU General Public License ^1da177e arch/ppc64/xmon/ppc-dis.c Linus Torvalds 2005-04-16 18 along with this file; see the file COPYING. If not, write to the Free 897f112b arch/powerpc/xmon/ppc-dis.c Michael Ellerman 2006-11-23 19 Software Foundation, 51 Franklin Street - Fifth Floor, Boston, MA 02110-1301, USA. 
*/ ^1da177e arch/ppc64/xmon/ppc-dis.c Linus Torvalds 2005-04-16 20 36a211d8 arch/powerpc/xmon/ppc-dis.c Balbir Singh 2017-02-02 @21 #include "sysdep.h" c4d6af82 arch/powerpc/xmon/ppc-dis.c Balbir Singh 2017-02-02 22 #include <stdio.h> e0426047 arch/powerpc/xmon/ppc-dis.c Michael Ellerman 2006-11-23 23 #include "dis-asm.h" c4d6af82 arch/powerpc/xmon/ppc-dis.c Balbir Singh 2017-02-02 24 #include "elf-bfd.h" :: The code at line 21 was first introduced by commit :: 36a211d8b7faed8b0730f7f4b19bf2a546c40d06 powerpc/xmon Update ppc-dis/opc.c and ppc.h :: TO: Balbir Singh <bsinghar...@gmail.com> :: CC: 0day robot <fengguang...@intel.com> --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation
Re: [PATCH 2/3] powerpc: allow compilation on cross-endian toolchain
On 2016/11/27 01:46PM, Nicholas Piggin wrote: > On Sat, 26 Nov 2016 18:30:15 +1100 > Michael Ellermanwrote: > > > Nicholas Piggin writes: > > > On Thu, 24 Nov 2016 00:02:08 +1100 > > > diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile > > > index 617dece..8828807 100644 > > > --- a/arch/powerpc/Makefile > > > +++ b/arch/powerpc/Makefile > > > @@ -73,13 +73,18 @@ MULTIPLEWORD := -mmultiple > > > endif > > > > > > cflags-$(CONFIG_CPU_BIG_ENDIAN) += $(call > > > cc-option,-mbig-endian) > > > +cflags-$(CONFIG_CPU_BIG_ENDIAN) += $(call cc-option,-mabi=elfv1) > > > +cflags-$(CONFIG_CPU_BIG_ENDIAN) += $(call > > > cc-option,-mcall-aixdesc) > > > > This blew up the world: > > > > > > http://kisskb.ellerman.id.au/kisskb/head/1ba4cb3d67e181bdc9a911d7be81f64e3d7597d2/ > > > > Successful: 24% 63/258 > > > > I suspect you need to make -mcall-aixdesc 64-bit only. > > Yes, I forgot 32-bit will pick those up. 3rd time's a charm, this compiles > 64 bit be/le and 32-bit now. > > -- > > Subject: [PATCH] powerpc: allow compilation on cross-endian toolchain > > GCC can compile with either endian, but the ABI version always > defaults to the default endian. Alan Modra says: > > you need both -mbig and -mabi=elfv1 to make a powerpc64le gcc > generate powerpc64 code > > The opposite is true for powerpc64 when generating -mlittle it > requires -mabi=elfv2 to generate v2 ABI. This change adds ABI > annotations together with endianness. The kernel with ELFv2 ABI > also uses -mcall-aixdesc, but boot/ does not. 
> > Signed-off-by: Nicholas Piggin

FWIW, this fixes the issue with the vmx build when doing a BE build with an LE toolchain: /tmp/ccR5lr0U.s: Error: .size expression for aes_p8_set_encrypt_key does not evaluate to a constant /tmp/ccR5lr0U.s: Error: .size expression for .aes_p8_set_encrypt_key does not evaluate to a constant /tmp/ccR5lr0U.s: Error: .size expression for aes_p8_set_decrypt_key does not evaluate to a constant /tmp/ccR5lr0U.s: Error: .size expression for .aes_p8_set_decrypt_key does not evaluate to a constant /tmp/ccR5lr0U.s: Error: .size expression for aes_p8_encrypt does not evaluate to a constant /tmp/ccR5lr0U.s: Error: .size expression for .aes_p8_encrypt does not evaluate to a constant

Tested-by: Naveen N. Rao

Thanks, Naveen