Re: linux-next: build failure after merge of the powerpc tree
On Wed, 2016-01-13 at 11:16 +0530, Aneesh Kumar K.V wrote: > Michael Ellerman writes: > > On Thu, 2016-01-07 at 19:16 +1100, Stephen Rothwell wrote: > > > After merging the powerpc tree, today's linux-next build (powerpc64 > > > allnoconfig) failed like this: > > > > > > arch/powerpc/mm/hash_utils_64.c: In function 'get_paca_psize': > > > arch/powerpc/mm/hash_utils_64.c:869:19: error: 'struct paca_struct' has > > > no member named 'context' > > > return get_paca()->context.user_psize; > > >^ > > > arch/powerpc/mm/hash_utils_64.c:870:1: error: control reaches end of > > > non-void function [-Werror=return-type] > > > } > > > ^ > > > > > > Caused by commit > > > > > > 2fc251a8dda5 ("powerpc: Copy only required pieces of the mm_context_t > > > to the paca") > > > > Well that's rather embarrassing, for Mikey ;D > > > This build has CONFIG_PPC_MM_SLICES not set ... > > > > Ugh, but it would seem none of our defconfigs do :/ > > 4K page size with hugetlb disabled will get that Yeah, but none of our defconfigs do that. I've got a kisskb target for it now: http://kisskb.ellerman.id.au/kisskb/target/28577/ cheers ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [RFC PATCH V1 14/33] powerpc/mm: Use helper for finding pte bits mapping I/O area
On Wed, 2016-01-13 at 11:37 +0530, Aneesh Kumar K.V wrote:
> Benjamin Herrenschmidt writes:
>
> > On Tue, 2016-01-12 at 10:42 +0300, Denis Kirjanov wrote:
> > > > +static inline unsigned long pte_io_cache_bits(void)
> > > > +{
> > > > +	return _PAGE_NO_CACHE | _PAGE_GUARDED;
> > > > +}
> > > This could be just plain #define
> >
> > Or just use pgprot_noncached()
> >
> #define pgprot_noncached(prot) (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
>                                 _PAGE_NO_CACHE | _PAGE_GUARDED))
>
> That will return me a pgprot_t. I can fix that by using
> pgprot_val(pgprot_noncached(0)). Is that what you are suggesting ?

Shouldn't ioremap just use pgprot_noncached(PAGE_KERNEL) or similar ?

Cheers,
Ben.
[PATCH v3 2/2] KVM: PPC: Exit guest upon MCE when FWNMI capability is enabled
Enhance KVM to cause a guest exit with KVM_EXIT_NMI exit reasons upon a machine check exception (MCE) in the guest address space if the KVM_CAP_PPC_FWNMI capability is enabled (instead of delivering 0x200 interrupt to guest). This enables QEMU to build error log and deliver machine check exception to guest via guest registered machine check handler. This approach simplifies the delivering of machine check exception to guest OS compared to the earlier approach of KVM directly invoking 0x200 guest interrupt vector. In the earlier approach QEMU was enhanced to patch the 0x200 interrupt vector during boot. The patched code at 0x200 issued a private hcall to pass the control to QEMU to build the error log. This design/approach is based on the feedback for the QEMU patches to handle machine check exception. Details of earlier approach of handling machine check exception in QEMU and related discussions can be found at: https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html Signed-off-by: Aravinda Prasad --- arch/powerpc/kvm/book3s_hv.c| 12 ++-- arch/powerpc/kvm/book3s_hv_rmhandlers.S | 48 +++ 2 files changed, 26 insertions(+), 34 deletions(-) diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index a7352b5..4fa03d0 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -858,15 +858,9 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu, r = RESUME_GUEST; break; case BOOK3S_INTERRUPT_MACHINE_CHECK: - /* -* Deliver a machine check interrupt to the guest. -* We have to do this, even if the host has handled the -* machine check, because machine checks use SRR0/1 and -* the interrupt might have trashed guest state in them. 
-*/ - kvmppc_book3s_queue_irqprio(vcpu, - BOOK3S_INTERRUPT_MACHINE_CHECK); - r = RESUME_GUEST; + /* Exit to guest with KVM_EXIT_NMI as exit reason */ + run->exit_reason = KVM_EXIT_NMI; + r = RESUME_HOST; break; case BOOK3S_INTERRUPT_PROGRAM: { diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S index 3c6badc..84e32a3 100644 --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S @@ -133,21 +133,18 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) stb r0, HSTATE_HWTHREAD_REQ(r13) /* -* For external and machine check interrupts, we need -* to call the Linux handler to process the interrupt. -* We do that by jumping to absolute address 0x500 for -* external interrupts, or the machine_check_fwnmi label -* for machine checks (since firmware might have patched -* the vector area at 0x200). The [h]rfid at the end of the -* handler will return to the book3s_hv_interrupts.S code. -* For other interrupts we do the rfid to get back -* to the book3s_hv_interrupts.S code here. +* For external interrupts we need to call the Linux +* handler to process the interrupt. We do that by jumping +* to absolute address 0x500 for external interrupts. +* The [h]rfid at the end of the handler will return to +* the book3s_hv_interrupts.S code. For other interrupts +* we do the rfid to get back to the book3s_hv_interrupts.S +* code here. 
*/ ld r8, 112+PPC_LR_STKOFF(r1) addir1, r1, 112 ld r7, HSTATE_HOST_MSR(r13) - cmpwi cr1, r12, BOOK3S_INTERRUPT_MACHINE_CHECK cmpwi r12, BOOK3S_INTERRUPT_EXTERNAL beq 11f cmpwi r12, BOOK3S_INTERRUPT_H_DOORBELL @@ -162,7 +159,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) mtmsrd r6, 1 /* Clear RI in MSR */ mtsrr0 r8 mtsrr1 r7 - beq cr1, 13f/* machine check */ RFI /* On POWER7, we have external interrupts set to use HSRR0/1 */ @@ -170,8 +166,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) mtspr SPRN_HSRR1, r7 ba 0x500 -13:b machine_check_fwnmi - 14:mtspr SPRN_HSRR0, r8 mtspr SPRN_HSRR1, r7 b hmi_exception_after_realmode @@ -2390,15 +2384,13 @@ machine_check_realmode: ld r9, HSTATE_KVM_VCPU(r13) li r12, BOOK3S_INTERRUPT_MACHINE_CHECK /* -* Deliver unhandled/fatal (e.g. UE) MCE errors to guest through -* machine check interrupt (set HSRR0 to 0x200). And for handled -* errors (no-fatal), just go back to guest execution with current -* HSRR0 instead of exiting guest. This new approach will inject -* machine check to guest for fatal error causing guest to
[PATCH v3 1/2] KVM: PPC: New capability to control MCE behaviour
This patch introduces a new KVM capability to control how KVM behaves on machine check exception (MCE). Without this capability, KVM redirects machine check exceptions to guest's 0x200 vector if the address in error belongs to the guest. With this capability KVM causes a guest exit with NMI exit reason. This is required to avoid problems if a new kernel/KVM is used with an old QEMU for guests that don't issue "ibm,nmi-register". As old QEMU does not understand the NMI exit type, it treats it as a fatal error. However, the guest could have handled the machine check error if the exception was delivered to guest's 0x200 interrupt vector instead of NMI exit in case of old QEMU. QEMU part can be found at: http://lists.nongnu.org/archive/html/qemu-ppc/2015-12/msg00199.html Change Log v3: - Split the patch into 2. First patch introduces the new capability while the second one enhances KVM to redirect MCE. - Fix access width bug - Rebased to v4.4-rc7 Change Log v2: - Added KVM capability Signed-off-by: Aravinda Prasad --- arch/powerpc/include/asm/kvm_host.h |1 + arch/powerpc/kernel/asm-offsets.c |1 + arch/powerpc/kvm/powerpc.c |7 +++ include/uapi/linux/kvm.h|1 + 4 files changed, 10 insertions(+) diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index cfa758c..9ac2b84 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -243,6 +243,7 @@ struct kvm_arch { int hpt_cma_alloc; struct dentry *debugfs_dir; struct dentry *htab_dentry; + u8 fwnmi_enabled; #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */ #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE struct mutex hpt_mutex; diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c index 221d584..6a4e81a 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -506,6 +506,7 @@ int main(void) DEFINE(KVM_ENABLED_HCALLS, offsetof(struct kvm, arch.enabled_hcalls)); DEFINE(KVM_LPCR, offsetof(struct kvm, arch.lpcr)); 
DEFINE(KVM_VRMA_SLB_V, offsetof(struct kvm, arch.vrma_slb_v)); + DEFINE(KVM_FWNMI, offsetof(struct kvm, arch.fwnmi_enabled)); DEFINE(VCPU_DSISR, offsetof(struct kvm_vcpu, arch.shregs.dsisr)); DEFINE(VCPU_DAR, offsetof(struct kvm_vcpu, arch.shregs.dar)); DEFINE(VCPU_VPA, offsetof(struct kvm_vcpu, arch.vpa.pinned_addr)); diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index 6fd2405..a8399b5 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -570,6 +570,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) r = 1; break; #endif + case KVM_CAP_PPC_FWNMI: + r = 1; + break; default: r = 0; break; @@ -1132,6 +1135,10 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu, break; } #endif /* CONFIG_KVM_XICS */ + case KVM_CAP_PPC_FWNMI: + r = 0; + vcpu->kvm->arch.fwnmi_enabled = true; + break; default: r = -EINVAL; break; diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 03f3618..d8a07b5 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -831,6 +831,7 @@ struct kvm_ppc_smmu_info { #define KVM_CAP_GUEST_DEBUG_HW_WPS 120 #define KVM_CAP_SPLIT_IRQCHIP 121 #define KVM_CAP_IOEVENTFD_ANY_LENGTH 122 +#define KVM_CAP_PPC_FWNMI 123 #ifdef KVM_CAP_IRQ_ROUTING ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
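The behaviour described in the changelog above can be modelled in a few lines of plain C. This is only a userspace sketch, not kernel code: every `sketch_*` name and the two-valued enum are made up for illustration, and it assumes the choice depends on nothing but the per-VM flag that the patch adds.

```c
#include <assert.h>
#include <stdbool.h>

/* What the hypervisor does with a machine check that hit guest memory. */
enum sketch_mce_action {
	SKETCH_INJECT_0X200,	/* old behaviour: redirect to the guest's 0x200 vector */
	SKETCH_EXIT_NMI,	/* new behaviour: exit to userspace with an NMI reason */
};

/* Hypothetical stand-in for the per-VM state; mirrors the new fwnmi_enabled field. */
struct sketch_vm {
	bool fwnmi_enabled;
};

/* Models the enable-capability path: userspace opts in once per VM. */
static int sketch_enable_fwnmi(struct sketch_vm *vm)
{
	vm->fwnmi_enabled = true;
	return 0;
}

/* Models the decision the changelog describes. */
static enum sketch_mce_action sketch_on_guest_mce(const struct sketch_vm *vm)
{
	return vm->fwnmi_enabled ? SKETCH_EXIT_NMI : SKETCH_INJECT_0X200;
}

/* Small usage example: an old userspace never opts in and keeps the old path. */
int sketch_kvm_demo(void)
{
	struct sketch_vm old_userspace = { .fwnmi_enabled = false };
	struct sketch_vm new_userspace = { .fwnmi_enabled = false };

	sketch_enable_fwnmi(&new_userspace);
	assert(sketch_on_guest_mce(&old_userspace) == SKETCH_INJECT_0X200);
	assert(sketch_on_guest_mce(&new_userspace) == SKETCH_EXIT_NMI);
	return 0;
}
```

The point of the opt-in is visible in the demo: a userspace that never enables the capability keeps getting the 0x200 delivery, so it never sees an exit reason it does not understand.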
[PATCH V2] powerpc/powernv: Remove support for p5ioc2
"p5ioc2 is used by approximately 2 machines in the world, and has never ever been a supported configuration." The code for p5ioc2 is essentially unused and complicates what is already a very complicated codebase. Its removal is essentially a "free win" in the effort to simplify the powernv PCI code. In addition, support for p5ioc2 has been dropped from skiboot. There's no reason to keep it around in the kernel. Signed-off-by: Russell Currey --- V2: Remove pointless union and rebase on -next Tested on a P7IOC machine and a PHB3 machine. Skiboot p5ioc2 removal patch: https://patchwork.ozlabs.org/patch/544898/ --- arch/powerpc/platforms/powernv/Makefile | 2 +- arch/powerpc/platforms/powernv/pci-p5ioc2.c | 271 arch/powerpc/platforms/powernv/pci.c| 15 +- arch/powerpc/platforms/powernv/pci.h| 152 4 files changed, 74 insertions(+), 366 deletions(-) delete mode 100644 arch/powerpc/platforms/powernv/pci-p5ioc2.c diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile index f1516b5..cd9711e 100644 --- a/arch/powerpc/platforms/powernv/Makefile +++ b/arch/powerpc/platforms/powernv/Makefile @@ -5,7 +5,7 @@ obj-y += opal-msglog.o opal-hmi.o opal-power.o opal-irqchip.o obj-y += opal-kmsg.o obj-$(CONFIG_SMP) += smp.o subcore.o subcore-asm.o -obj-$(CONFIG_PCI) += pci.o pci-p5ioc2.o pci-ioda.o npu-dma.o +obj-$(CONFIG_PCI) += pci.o pci-ioda.o npu-dma.o obj-$(CONFIG_EEH) += eeh-powernv.o obj-$(CONFIG_PPC_SCOM) += opal-xscom.o obj-$(CONFIG_MEMORY_FAILURE) += opal-memory-errors.o diff --git a/arch/powerpc/platforms/powernv/pci-p5ioc2.c b/arch/powerpc/platforms/powernv/pci-p5ioc2.c deleted file mode 100644 index f2bdfea..000 --- a/arch/powerpc/platforms/powernv/pci-p5ioc2.c +++ /dev/null @@ -1,271 +0,0 @@ -/* - * Support PCI/PCIe on PowerNV platforms - * - * Currently supports only P5IOC2 - * - * Copyright 2011 Benjamin Herrenschmidt, IBM Corp. 
- * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public License - * as published by the Free Software Foundation; either version - * 2 of the License, or (at your option) any later version. - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include "powernv.h" -#include "pci.h" - -/* For now, use a fixed amount of TCE memory for each p5ioc2 - * hub, 16M will do - */ -#define P5IOC2_TCE_MEMORY 0x0100 - -#ifdef CONFIG_PCI_MSI -static int pnv_pci_p5ioc2_msi_setup(struct pnv_phb *phb, struct pci_dev *dev, - unsigned int hwirq, unsigned int virq, - unsigned int is_64, struct msi_msg *msg) -{ - if (WARN_ON(!is_64)) - return -ENXIO; - msg->data = hwirq - phb->msi_base; - msg->address_hi = 0x1000; - msg->address_lo = 0; - - return 0; -} - -static void pnv_pci_init_p5ioc2_msis(struct pnv_phb *phb) -{ - unsigned int count; - const __be32 *prop = of_get_property(phb->hose->dn, -"ibm,opal-msi-ranges", NULL); - if (!prop) - return; - - /* Don't do MSI's on p5ioc2 PCI-X are they are not properly -* verified in HW -*/ - if (of_device_is_compatible(phb->hose->dn, "ibm,p5ioc2-pcix")) - return; - phb->msi_base = be32_to_cpup(prop); - count = be32_to_cpup(prop + 1); - if (msi_bitmap_alloc(&phb->msi_bmp, count, phb->hose->dn)) { - pr_err("PCI %d: Failed to allocate MSI bitmap !\n", - phb->hose->global_number); - return; - } - phb->msi_setup = pnv_pci_p5ioc2_msi_setup; - phb->msi32_support = 0; - pr_info(" Allocated bitmap for %d MSIs (base IRQ 0x%x)\n", - count, phb->msi_base); -} -#else -static void pnv_pci_init_p5ioc2_msis(struct pnv_phb *phb) { } -#endif /* CONFIG_PCI_MSI */ - -static struct iommu_table_ops pnv_p5ioc2_iommu_ops = { - .set = pnv_tce_build, -#ifdef CONFIG_IOMMU_API - .exchange = pnv_tce_xchg, -#endif - .clear = pnv_tce_free, - .get = pnv_tce_get, 
-}; - -static void pnv_pci_p5ioc2_dma_dev_setup(struct pnv_phb *phb, -struct pci_dev *pdev) -{ - struct iommu_table *tbl = phb->p5ioc2.table_group.tables[0]; - - if (!tbl->it_map) { - tbl->it_ops = &pnv_p5ioc2_iommu_ops; - iommu_init_table(tbl, phb->hose->node); - iommu_register_group(&phb->p5ioc2.table_group, - pci_domain_nr(phb->hose->bus), phb->opal_id); - INIT_LIST_HEAD_RCU(&tbl->it_group_list); - pnv_pc
Re: [PATCH] powerpc/powernv: Remove support for p5ioc2
On Wed, 2016-01-13 at 17:39 +1100, Andrew Donnellan wrote: > On 13/01/16 17:10, Russell Currey wrote: > > "p5ioc2 is used by approximately 2 machines in the world, and has never > > ever been a supported configuration." > > > > The code for p5ioc2 is essentially unused and complicates what is already > > a very complicated codebase. Its removal is essentially a "free win" in > > the effort to simplify the powernv PCI code. > > > > In addition, support for p5ioc2 has been dropped from skiboot. There's no > > reason to keep it around in the kernel. > > > > Signed-off-by: Russell Currey > > Doesn't apply cleanly on next, but that's minor. Going to do a V2 to address your other comment, so I might as well fix the next issue. > > > @@ -117,11 +115,6 @@ struct pnv_phb { > > > > union { > > struct { > > - struct iommu_table iommu_table; > > - struct iommu_table_group table_group; > > - } p5ioc2; > > - > > - struct { > > /* Global bridge info */ > > unsigned inttotal_pe; > > unsigned intreserved_pe; > > Given this leaves struct ioda as the only member of the union, do we > want to get rid of the union? > Probably. I was going to leave that for future patches (which will be a proper refactoring rather than a pure removal), but given it makes no difference I should just get rid of it now. Thanks for the review. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH] powerpc/powernv: Remove support for p5ioc2
On 13/01/16 17:10, Russell Currey wrote: "p5ioc2 is used by approximately 2 machines in the world, and has never ever been a supported configuration." The code for p5ioc2 is essentially unused and complicates what is already a very complicated codebase. Its removal is essentially a "free win" in the effort to simplify the powernv PCI code. In addition, support for p5ioc2 has been dropped from skiboot. There's no reason to keep it around in the kernel. Signed-off-by: Russell Currey Doesn't apply cleanly on next, but that's minor. @@ -117,11 +115,6 @@ struct pnv_phb { union { struct { - struct iommu_table iommu_table; - struct iommu_table_group table_group; - } p5ioc2; - - struct { /* Global bridge info */ unsigned inttotal_pe; unsigned intreserved_pe; Given this leaves struct ioda as the only member of the union, do we want to get rid of the union? -- Andrew Donnellan Software Engineer, OzLabs andrew.donnel...@au1.ibm.com Australia Development Lab, Canberra +61 2 6201 8874 (work)IBM Australia Limited ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH] powerpc/powernv: Remove support for p5ioc2
"p5ioc2 is used by approximately 2 machines in the world, and has never ever been a supported configuration." The code for p5ioc2 is essentially unused and complicates what is already a very complicated codebase. Its removal is essentially a "free win" in the effort to simplify the powernv PCI code. In addition, support for p5ioc2 has been dropped from skiboot. There's no reason to keep it around in the kernel. Signed-off-by: Russell Currey --- Tested on a P7IOC machine and a PHB3 machine. Skiboot p5ioc2 removal patch: https://patchwork.ozlabs.org/patch/544898/ --- arch/powerpc/platforms/powernv/Makefile | 2 +- arch/powerpc/platforms/powernv/pci-p5ioc2.c | 271 arch/powerpc/platforms/powernv/pci.c| 15 +- arch/powerpc/platforms/powernv/pci.h| 12 +- 4 files changed, 5 insertions(+), 295 deletions(-) delete mode 100644 arch/powerpc/platforms/powernv/pci-p5ioc2.c diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile index 1c8cdb6..8a65c9c 100644 --- a/arch/powerpc/platforms/powernv/Makefile +++ b/arch/powerpc/platforms/powernv/Makefile @@ -4,7 +4,7 @@ obj-y += rng.o opal-elog.o opal-dump.o opal-sysparam.o opal-sensor.o obj-y += opal-msglog.o opal-hmi.o opal-power.o opal-irqchip.o obj-$(CONFIG_SMP) += smp.o subcore.o subcore-asm.o -obj-$(CONFIG_PCI) += pci.o pci-p5ioc2.o pci-ioda.o +obj-$(CONFIG_PCI) += pci.o pci-ioda.o obj-$(CONFIG_EEH) += eeh-powernv.o obj-$(CONFIG_PPC_SCOM) += opal-xscom.o obj-$(CONFIG_MEMORY_FAILURE) += opal-memory-errors.o diff --git a/arch/powerpc/platforms/powernv/pci-p5ioc2.c b/arch/powerpc/platforms/powernv/pci-p5ioc2.c deleted file mode 100644 index f2bdfea..000 --- a/arch/powerpc/platforms/powernv/pci-p5ioc2.c +++ /dev/null @@ -1,271 +0,0 @@ -/* - * Support PCI/PCIe on PowerNV platforms - * - * Currently supports only P5IOC2 - * - * Copyright 2011 Benjamin Herrenschmidt, IBM Corp. 
- * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public License - * as published by the Free Software Foundation; either version - * 2 of the License, or (at your option) any later version. - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include "powernv.h" -#include "pci.h" - -/* For now, use a fixed amount of TCE memory for each p5ioc2 - * hub, 16M will do - */ -#define P5IOC2_TCE_MEMORY 0x0100 - -#ifdef CONFIG_PCI_MSI -static int pnv_pci_p5ioc2_msi_setup(struct pnv_phb *phb, struct pci_dev *dev, - unsigned int hwirq, unsigned int virq, - unsigned int is_64, struct msi_msg *msg) -{ - if (WARN_ON(!is_64)) - return -ENXIO; - msg->data = hwirq - phb->msi_base; - msg->address_hi = 0x1000; - msg->address_lo = 0; - - return 0; -} - -static void pnv_pci_init_p5ioc2_msis(struct pnv_phb *phb) -{ - unsigned int count; - const __be32 *prop = of_get_property(phb->hose->dn, -"ibm,opal-msi-ranges", NULL); - if (!prop) - return; - - /* Don't do MSI's on p5ioc2 PCI-X are they are not properly -* verified in HW -*/ - if (of_device_is_compatible(phb->hose->dn, "ibm,p5ioc2-pcix")) - return; - phb->msi_base = be32_to_cpup(prop); - count = be32_to_cpup(prop + 1); - if (msi_bitmap_alloc(&phb->msi_bmp, count, phb->hose->dn)) { - pr_err("PCI %d: Failed to allocate MSI bitmap !\n", - phb->hose->global_number); - return; - } - phb->msi_setup = pnv_pci_p5ioc2_msi_setup; - phb->msi32_support = 0; - pr_info(" Allocated bitmap for %d MSIs (base IRQ 0x%x)\n", - count, phb->msi_base); -} -#else -static void pnv_pci_init_p5ioc2_msis(struct pnv_phb *phb) { } -#endif /* CONFIG_PCI_MSI */ - -static struct iommu_table_ops pnv_p5ioc2_iommu_ops = { - .set = pnv_tce_build, -#ifdef CONFIG_IOMMU_API - .exchange = pnv_tce_xchg, -#endif - .clear = pnv_tce_free, - .get = pnv_tce_get, 
-}; - -static void pnv_pci_p5ioc2_dma_dev_setup(struct pnv_phb *phb, -struct pci_dev *pdev) -{ - struct iommu_table *tbl = phb->p5ioc2.table_group.tables[0]; - - if (!tbl->it_map) { - tbl->it_ops = &pnv_p5ioc2_iommu_ops; - iommu_init_table(tbl, phb->hose->node); - iommu_register_group(&phb->p5ioc2.table_group, - pci_domain_nr(phb->hose->bus), phb->opal_id); - INIT_LIST_HEAD_RCU(&tbl->it_group_list); - pnv_pci_link_table_and_group(phb->hose->n
Re: [RFC PATCH V1 14/33] powerpc/mm: Use helper for finding pte bits mapping I/O area
Benjamin Herrenschmidt writes: > On Tue, 2016-01-12 at 10:42 +0300, Denis Kirjanov wrote: >> > +static inline unsigned long pte_io_cache_bits(void) >> > +{ >> > + return _PAGE_NO_CACHE | _PAGE_GUARDED; >> > +} >> This could be just plain #define > > Or just use pgprot_noncached() > #define pgprot_noncached(prot)(__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \ _PAGE_NO_CACHE | _PAGE_GUARDED)) That will return me a pgprot_t. I can fix that by using pgprot_val(pgprot_noncached(0)). Is that what you are suggesting ? -aneesh ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
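For readers following the thread, the equivalence being discussed can be checked with a standalone model. This is a sketch under stated assumptions: the bit values below are invented, the `sketch_*` types only imitate the kernel's pgprot_t wrapper, and _PAGE_CACHE_CTL is assumed to be the OR of the four cache-control bits.

```c
#include <assert.h>

/* Invented bit values for the model; the real ones live in the powerpc headers. */
#define SKETCH_PAGE_GUARDED	0x008UL
#define SKETCH_PAGE_COHERENT	0x010UL
#define SKETCH_PAGE_NO_CACHE	0x020UL
#define SKETCH_PAGE_WRITETHRU	0x040UL
#define SKETCH_PAGE_CACHE_CTL	(SKETCH_PAGE_GUARDED | SKETCH_PAGE_COHERENT | \
				 SKETCH_PAGE_NO_CACHE | SKETCH_PAGE_WRITETHRU)

/* Struct wrapper, so a protection value cannot be mixed up with a plain integer. */
typedef struct { unsigned long pgprot; } sketch_pgprot_t;

static inline sketch_pgprot_t sketch_mk_pgprot(unsigned long v)
{
	sketch_pgprot_t p = { v };
	return p;
}

static inline unsigned long sketch_pgprot_val(sketch_pgprot_t p)
{
	return p.pgprot;
}

/* Same shape as the macro quoted above: clear the cache-control bits, then set two. */
static inline sketch_pgprot_t sketch_pgprot_noncached(sketch_pgprot_t prot)
{
	return sketch_mk_pgprot((sketch_pgprot_val(prot) & ~SKETCH_PAGE_CACHE_CTL) |
				SKETCH_PAGE_NO_CACHE | SKETCH_PAGE_GUARDED);
}

/* The helper proposed in the patch under review. */
static inline unsigned long sketch_pte_io_cache_bits(void)
{
	return SKETCH_PAGE_NO_CACHE | SKETCH_PAGE_GUARDED;
}

/* Usage example: starting from an empty protection value, both routes agree. */
int sketch_pgprot_demo(void)
{
	unsigned long via_macro =
		sketch_pgprot_val(sketch_pgprot_noncached(sketch_mk_pgprot(0)));

	assert(via_macro == sketch_pte_io_cache_bits());
	return 0;
}
```

In this model the two spellings give the same bits only when the input carries no other flags; with a non-empty input the macro form also preserves the unrelated bits, which the bare helper does not.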
Re: [PATCH] powerpc/eeh: Validate arch in eeh_add_device_early()
On Sun, 2016-01-10 at 01:08 -0200, Guilherme G. Piccoli wrote:
> This patch just changes the way the arch checking is done in function
> eeh_add_device_early(): we use no more eeh_enabled(), but instead
> we check the running architecture by using the macro machine_is(). If we are
> running on pSeries or PowerNV, the EEH mechanism can be enabled; otherwise,
> we bail out the function. This way, we don't enable EEH on Cell and we don't
> hit the oops on DLPAR either.

Can't we just check for eeh_ops being NULL ?

Cheers,
Ben.

> Fixes: 89a51df5ab1d ("powerpc/eeh: Fix crash in eeh_add_device_early() on Cell")
> Signed-off-by: Guilherme G. Piccoli
> ---
>  arch/powerpc/kernel/eeh.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> index 40e4d4a..81e2d3e 100644
> --- a/arch/powerpc/kernel/eeh.c
> +++ b/arch/powerpc/kernel/eeh.c
> @@ -1072,7 +1072,13 @@ void eeh_add_device_early(struct pci_dn *pdn)
>  	struct pci_controller *phb;
>  	struct eeh_dev *edev = pdn_to_eeh_dev(pdn);
>
> -	if (!edev || !eeh_enabled())
> +	if (!edev)
> +		return;
> +
> +	/* Some platforms (like Cell) don't have EEH capabilities, so we
> +	 * need to abort here. In case of pseries or powernv, we have EEH
> +	 * so we can continue. */
> +	if (!machine_is(pseries) && !machine_is(powernv))
>  		return;
>
>  	if (!eeh_has_flag(EEH_PROBE_MODE_DEVTREE))
Re: [RFC PATCH V1 01/33] powerpc/mm: add _PAGE_HASHPTE similar to 4K hash
Balbir Singh writes:
> On Tue, 12 Jan 2016 12:45:36 +0530
> "Aneesh Kumar K.V" wrote:
>
>> Not really needed. But this brings it back to as it was before
>>
>
> Could you expand on not really needed. Could the changelog describe how
> the bits will be used in the follow on patches.
>

What confused me in the beginning was the difference between the 4k and 64k page sizes. I was trying to find out whether we miss a hpte flush in any scenario because of this, i.e., a pte update on a linux pte for which we are doing a parallel hash pte insert. After looking at it closer, my understanding is that this won't happen, because pte update also looks at _PAGE_BUSY and we will wait for the hash pte insert to finish before going ahead with the pte update. But to avoid further confusion I was wondering whether we should keep this closer to what we have with __hash_page_4k. Hence the statement "Not really needed".

-aneesh
Re: linux-next: build failure after merge of the powerpc tree
Michael Ellerman writes: > On Thu, 2016-01-07 at 19:16 +1100, Stephen Rothwell wrote: >> Hi all, >> >> After merging the powerpc tree, today's linux-next build (powerpc64 >> allnoconfig) failed like this: >> >> arch/powerpc/mm/hash_utils_64.c: In function 'get_paca_psize': >> arch/powerpc/mm/hash_utils_64.c:869:19: error: 'struct paca_struct' has no >> member named 'context' >> return get_paca()->context.user_psize; >>^ >> arch/powerpc/mm/hash_utils_64.c:870:1: error: control reaches end of >> non-void function [-Werror=return-type] >> } >> ^ >> >> Caused by commit >> >> 2fc251a8dda5 ("powerpc: Copy only required pieces of the mm_context_t to >> the paca") > > Well that's rather embarrassing, for Mikey ;D > >> This build has CONFIG_PPC_MM_SLICES not set ... > > Ugh, but it would seem none of our defconfigs do :/ 4K page size with hugetlb disabled will get that -aneesh ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [RFC PATCH V1 14/33] powerpc/mm: Use helper for finding pte bits mapping I/O area
On Tue, 2016-01-12 at 10:42 +0300, Denis Kirjanov wrote: > > +static inline unsigned long pte_io_cache_bits(void) > > +{ > > + return _PAGE_NO_CACHE | _PAGE_GUARDED; > > +} > This could be just plain #define Or just use pgprot_noncached() Cheers, Ben. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [RFC PATCH V1 01/33] powerpc/mm: add _PAGE_HASHPTE similar to 4K hash
On Tue, 12 Jan 2016 12:45:36 +0530 "Aneesh Kumar K.V" wrote: > Not really needed. But this brings it back to as it was before > Could you expand on not really needed. Could the changelog describe how the bits will be used in the follow on patches. Balbir ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [RFC PATCH kernel] powerpc/ioda: Set "read" permission when "write" is set
On 01/12/2016 05:07 PM, Benjamin Herrenschmidt wrote: On Tue, 2016-01-12 at 15:40 +1100, Alexey Kardashevskiy wrote: Quite often drivers set only "write" permission assuming that this includes "read" permission as well and this works on plenty platforms. However IODA2 is strict about this and produces an EEH when "read" permission is not and reading happens. This adds a workaround in IODA code to always add the "read" bit when the "write" bit is set. Cc: Benjamin Herrenschmidt Signed-off-by: Alexey Kardashevskiy --- Ben, what was the driver which did not set "read" and caused EEH? aacraid Cheers, Ben. Just to be precise, the driver wasn't responsible for setting READ. The driver called scsi_dma_map() and the scsicmd was set (by scsi layer) as DMA_FROM_DEVICE so the current code would set the permissions to WRITE-ONLY. Previously, and in other architectures, this scsicmd would have resulted in READ+WRITE permissions on the DMA map. --- arch/powerpc/platforms/powernv/pci.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c index f2dd772..c7dcae5 100644 --- a/arch/powerpc/platforms/powernv/pci.c +++ b/arch/powerpc/platforms/powernv/pci.c @@ -601,6 +601,9 @@ int pnv_tce_build(struct iommu_table *tbl, long index, long npages, u64 rpn = __pa(uaddr) >> tbl->it_page_shift; long i; + if (proto_tce & TCE_PCI_WRITE) + proto_tce |= TCE_PCI_READ; + for (i = 0; i < npages; i++) { unsigned long newtce = proto_tce | ((rpn + i) << tbl->it_page_shift); @@ -622,6 +625,9 @@ int pnv_tce_xchg(struct iommu_table *tbl, long index, BUG_ON(*hpa & ~IOMMU_PAGE_MASK(tbl)); + if (newtce & TCE_PCI_WRITE) + newtce |= TCE_PCI_READ; + oldtce = xchg(pnv_tce(tbl, idx), cpu_to_be64(newtce)); *hpa = be64_to_cpu(oldtce) & ~(TCE_PCI_READ | TCE_PCI_WRITE); *direction = iommu_tce_direction(oldtce);
Re: [PATCH] powerpc/powernv: Fix OPAL_CONSOLE_FLUSH prototype and usages
On 13/01/16 12:04, Russell Currey wrote: The recently added OPAL API call, OPAL_CONSOLE_FLUSH, originally took no parameters and returned nothing. The call was updated to accept the terminal number to flush, and returned various values depending on the state of the output buffer. The prototype has been updated and its usage in the OPAL kmsg dumper has been modified to support its new behaviour as an incremental flush. Signed-off-by: Russell Currey Looks fine to me. Reviewed-by: Andrew Donnellan -- Andrew Donnellan Software Engineer, OzLabs andrew.donnel...@au1.ibm.com Australia Development Lab, Canberra +61 2 6201 8874 (work)IBM Australia Limited ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH] powerpc/powernv: Fix OPAL_CONSOLE_FLUSH prototype and usages
The recently added OPAL API call, OPAL_CONSOLE_FLUSH, originally took no parameters and returned nothing. The call was updated to accept the terminal number to flush, and returned various values depending on the state of the output buffer. The prototype has been updated and its usage in the OPAL kmsg dumper has been modified to support its new behaviour as an incremental flush. Signed-off-by: Russell Currey --- This patch should be applied on top of "powerpc/powernv: Add a kmsg_dumper that flushes console output on panic", which was recently merged into powerpc-next. --- arch/powerpc/include/asm/opal.h| 2 +- arch/powerpc/platforms/powernv/opal-kmsg.c | 9 - 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h index a5fd407..07a99e6 100644 --- a/arch/powerpc/include/asm/opal.h +++ b/arch/powerpc/include/asm/opal.h @@ -35,7 +35,7 @@ int64_t opal_console_read(int64_t term_number, __be64 *length, uint8_t *buffer); int64_t opal_console_write_buffer_space(int64_t term_number, __be64 *length); -void opal_console_flush(void); +int64_t opal_console_flush(int64_t term_number); int64_t opal_rtc_read(__be32 *year_month_day, __be64 *hour_minute_second_millisecond); int64_t opal_rtc_write(uint32_t year_month_day, diff --git a/arch/powerpc/platforms/powernv/opal-kmsg.c b/arch/powerpc/platforms/powernv/opal-kmsg.c index bd3b2ee..6f1214d 100644 --- a/arch/powerpc/platforms/powernv/opal-kmsg.c +++ b/arch/powerpc/platforms/powernv/opal-kmsg.c @@ -27,6 +27,7 @@ static void force_opal_console_flush(struct kmsg_dumper *dumper, enum kmsg_dump_reason reason) { int i; + int64_t ret; /* * Outside of a panic context the pollers will continue to run, @@ -36,7 +37,13 @@ static void force_opal_console_flush(struct kmsg_dumper *dumper, return; if (opal_check_token(OPAL_CONSOLE_FLUSH)) { - opal_console_flush(); + ret = opal_console_flush(0); + + if (ret == OPAL_UNSUPPORTED || ret == OPAL_PARAMETER) + return; + + /* 
Incrementally flush until there's nothing left */ + while (opal_console_flush(0) != OPAL_SUCCESS); } else { /* * If OPAL_CONSOLE_FLUSH is not implemented in the firmware, -- 2.7.0 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
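The control flow of the new flush path can be tried out in userspace with a fake firmware call. This is a sketch only: the `sketch_*` names, the return-code values and the "drain a fixed chunk per call" behaviour are all assumptions made for the example, not taken from the OPAL headers or from skiboot.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder return codes for the model; not the values from opal-api.h. */
#define SKETCH_SUCCESS		0
#define SKETCH_MORE_PENDING	(-1)
#define SKETCH_UNSUPPORTED	(-2)

/* Hypothetical firmware console: `pending` bytes queued, at most `chunk` drained per call. */
struct sketch_console {
	int64_t pending;
	int64_t chunk;
	int supported;
	int calls;	/* how many times the fake firmware call was made */
};

/* Fake incremental flush: drains one chunk and reports whether output remains. */
static int64_t sketch_console_flush(struct sketch_console *c)
{
	c->calls++;
	if (!c->supported)
		return SKETCH_UNSUPPORTED;
	if (c->pending > c->chunk) {
		c->pending -= c->chunk;
		return SKETCH_MORE_PENDING;
	}
	c->pending = 0;
	return SKETCH_SUCCESS;
}

/* Same shape as the patched dumper: probe once, bail on a hard error, then loop. */
static int sketch_force_flush(struct sketch_console *c)
{
	int64_t ret = sketch_console_flush(c);

	if (ret == SKETCH_UNSUPPORTED)
		return -1;

	/* Incrementally flush until there's nothing left */
	while (sketch_console_flush(c) != SKETCH_SUCCESS)
		;
	return 0;
}

/* Usage example: 10 queued bytes at 4 bytes per call need three calls in total. */
int sketch_flush_demo(void)
{
	struct sketch_console c = { .pending = 10, .chunk = 4, .supported = 1, .calls = 0 };

	assert(sketch_force_flush(&c) == 0);
	assert(c.pending == 0 && c.calls == 3);
	return 0;
}
```

The early return matters in this model: if the first call reports a hard error, the loop would otherwise never see a success code and would spin forever.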
Re: [v3,11/41] mips: reuse asm-generic/barrier.h
On 01/12/2016 01:40 PM, Peter Zijlstra wrote:

It is selectable only for MIPS R2 but not MIPS R6. The reason is that most MIPS R2 CPUs have a short pipeline, so that SYNC is just a waste of CPU resources, especially taking into account that "lightweight syncs" are converted to a heavy "SYNC 0" in many of those CPUs. However the latest MIPS/Imagination CPUs have a pipeline long enough to hit a problem - absence of SYNC at LL/SC inside atomics, barriers etc.

What ?! Are you saying that because R2 has short pipelines it's unlikely to hit the reordering issues and we can omit barriers?

It was my guess to explain why barriers were not included originally. You can check with Ralf, he knows more about the MIPS Linux code of that time. I have been bothered by this for more than 2 years and I am just trying to solve that issue - in recent CPUs the load after an LL/SC synchronization instruction loop can get ahead of the SC for sure, it was tested.

And reading the MIPS64 v6.04 instruction set manual, I think 0x11/0x12 are _NOT_ transitive and therefore cannot be used to implement the smp_mb__{before,after} stuff. That is, in MIPS speak, those SYNC types are Ordering Barriers, not Completion Barriers.

Please see above, point 2.

That did not in fact enlighten things. Are they transitive/multi-copy atomic or not?

Peter Zijlstra recently wrote: "In particular we're very much all 'confused' about the various notions of transitivity". I am actually confused too and need some examples here.

(and here Will will go into great detail on the differences between the two and make our collective brains explode :-)

That is, currently all architectures -- with exception of PPC -- have RCsc locks, but using these non-transitive things will get you RCpc locks. So yes, MIPS can go RCpc for its locks and share the burden of pain with PPC, but that needs to be a very conscious decision.

I don't understand that - I tried hard but I can't find any word like "RCsc", "RCpc" in Documents/ directory. Web search goes nowhere, of course.

From: lkml.kernel.org/r/20150828153921.gf19...@twins.programming.kicks-ass.net

Yes, the difference between RCpc and RCsc is in the meaning of RELEASE + ACQUIRE. With RCsc that implies a full memory barrier, with RCpc it does not.

MIPS Arch starting from R2 requires that. If some CPU can't, it should execute a full "SYNC 0" instead, which is a full memory barrier.

Currently PowerPC is the only arch that (can, and) does RCpc and gives a weaker RELEASE + ACQUIRE. Only the CPU who did the ACQUIRE is guaranteed to see the stores of the CPU which did the RELEASE in order.

Yes, it was a goal for SYNC_ACQUIRE and SYNC_RELEASE.

Caveats:

- "Full memory barrier" on MIPS means a full barrier for any device in the coherent domain. In MIPS Tech/Imagination Tech MIPS-based CPUs it is "for any device connected to CM or IOCU + directly connected memory".

- It is not applied to instruction fetch. However, I-Cache flushes and SYNCI are consistent with that. There are also hazard barrier instructions to clear the CPU pipeline to some extent - to help with this limitation.

I don't think that these caveats prevent a correct Acquire/Release semantic.

- Leonid.
Re: [RFC PATCH kernel] powerpc/ioda: Set "read" permission when "write" is set
On Tue, 2016-01-12 at 15:40 +1100, Alexey Kardashevskiy wrote:
> Quite often drivers set only "write" permission assuming that this
> includes "read" permission as well and this works on plenty of
> platforms.
> However IODA2 is strict about this and produces an EEH when "read"
> permission is not set and a read happens.
>
> This adds a workaround in IODA code to always add the "read" bit when
> the "write" bit is set.
>
> Cc: Benjamin Herrenschmidt
> Signed-off-by: Alexey Kardashevskiy
> ---
>
>
> Ben, what was the driver which did not set "read" and caused EEH?

aacraid

Cheers,
Ben.

>
> ---
>  arch/powerpc/platforms/powernv/pci.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> index f2dd772..c7dcae5 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -601,6 +601,9 @@ int pnv_tce_build(struct iommu_table *tbl, long index, long npages,
>  	u64 rpn = __pa(uaddr) >> tbl->it_page_shift;
>  	long i;
>
> +	if (proto_tce & TCE_PCI_WRITE)
> +		proto_tce |= TCE_PCI_READ;
> +
>  	for (i = 0; i < npages; i++) {
>  		unsigned long newtce = proto_tce |
>  			((rpn + i) << tbl->it_page_shift);
> @@ -622,6 +625,9 @@ int pnv_tce_xchg(struct iommu_table *tbl, long index,
>
>  	BUG_ON(*hpa & ~IOMMU_PAGE_MASK(tbl));
>
> +	if (newtce & TCE_PCI_WRITE)
> +		newtce |= TCE_PCI_READ;
> +
>  	oldtce = xchg(pnv_tce(tbl, idx), cpu_to_be64(newtce));
>  	*hpa = be64_to_cpu(oldtce) & ~(TCE_PCI_READ | TCE_PCI_WRITE);
>  	*direction = iommu_tce_direction(oldtce);
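The workaround in the quoted patch comes down to one bit rule: whenever the "write" permission bit is set in a TCE, the "read" bit is forced on as well. A minimal user-space sketch of that rule follows. The bit values and the helper names tce_imply_read() and tce_imply_read_selfcheck() are assumptions made for illustration and are not taken from the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative permission bits; the values are assumptions for this sketch. */
#define TCE_PCI_READ	0x1ULL
#define TCE_PCI_WRITE	0x2ULL

/*
 * Hypothetical helper: return the TCE with "read" forced on whenever
 * "write" is requested, mirroring the check added in the patch.
 */
static inline uint64_t tce_imply_read(uint64_t tce)
{
	if (tce & TCE_PCI_WRITE)
		tce |= TCE_PCI_READ;
	return tce;
}

/* Small self-check of the three interesting cases. */
static inline void tce_imply_read_selfcheck(void)
{
	/* write-only gains the read bit */
	assert(tce_imply_read(TCE_PCI_WRITE) == (TCE_PCI_READ | TCE_PCI_WRITE));
	/* read-only is left alone */
	assert(tce_imply_read(TCE_PCI_READ) == TCE_PCI_READ);
	/* no permissions stays no permissions */
	assert(tce_imply_read(0) == 0);
}
```

Applying the rule at the point where the TCE is built, as the patch does, means a driver that asks only for "write" does not have to be fixed individually.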
Re: cxl: Fix DSI misses when the context owning task exits
On Tue, 2016-01-12 at 13:29 +, David Laight wrote: > From: Michael Ellerman > > Sent: 11 January 2016 09:14 > > On Tue, 2015-24-11 at 10:56:18 UTC, Vaibhav Jain wrote: > > > Presently when a user-space process issues CXL_IOCTL_START_WORK ioctl we > > > store the pid of the current task_struct and use it to get pointer to > > > the mm_struct of the process, while processing page or segment faults > > > from the capi card. However this causes issues when the thread that had > > > originally issued the start-work ioctl exits in which case the stored > > > pid is no more valid and the cxl driver is unable to handle faults as > > > the mm_struct corresponding to process is no more accessible. > > > > > > This patch fixes this issue by using the mm_struct of the next alive > > > task in the thread group. This is done by iterating over all the tasks > > > in the thread group starting from thread group leader and calling > > > get_task_mm on each one of them. When a valid mm_struct is obtained the > > > pid of the associated task is stored in the context replacing the > > > exiting one for handling future faults. > > I don't even claim to understand the linux model for handling process > address maps, nor what the cxl driver is doing, but the above looks > more than dodgy. Thanks for reviewing it! cheers ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 1/2] scripts/recordmcount.pl: support data in text section on powerpc
On Tue, 2016-01-12 at 10:42 -0500, Steven Rostedt wrote:
> On Tue, 12 Jan 2016 23:14:22 +1100
> Michael Ellerman wrote:
> > From: Ulrich Weigand
> >
> > If a text section starts out with a data blob before the first
> > function start label, disassembly parsing done in recordmcount.pl
> > gets confused on powerpc, leading to creation of corrupted module
> > objects.
> >
> > This was not a problem so far since the compiler would never create
> > such text sections. However, this has changed with a recent change
> > in GCC 6 to support distances of > 2GB between a function and its
> > associated TOC in the ELFv2 ABI, exposing this problem.
> >
> > There is already code in recordmcount.pl to handle such data blobs
> > on the sparc64 platform. This patch uses the same method to handle
> > those on powerpc as well.
> >
> > Cc: sta...@vger.kernel.org
> > Signed-off-by: Ulrich Weigand
> > Signed-off-by: Michael Ellerman
> > ---
> > scripts/recordmcount.pl | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > Steve can we get an ack for this one, to go via powerpc? cheers
>
> Acked-by: Steven Rostedt

Thanks.
Re: [v3,11/41] mips: reuse asm-generic/barrier.h
On Tue, Jan 12, 2016 at 12:45:14PM -0800, Leonid Yegoshin wrote: > (I try to answer on multiple mails in one) > > First of all, it seems like some generic notes should be given here: > > 1. Generic MIPS "SYNC" (aka "SYNC 0") instruction is a very heavy in some > CPUs. On that CPUs it basically kills pipelines in each CPU, can do a > special memory/IO bus transaction (similar to "fence") and hold a system > until all R/W is completed. It is like Big Kernel Lock but worse. So, the > move to SMP_* kind of barriers is needed to improve performance, especially > on newest CPUs with long pipelines. The MIPS SYNC isn't any worse than the PPC SYNC, x86 MFENCE or arm DSB SY, yes they're heavy, so what. > 2. MIPS Arch document may be misleading because words "ordering" and > "completion" means different from Linux, the SYNC instruction description is > written for HW engineers. I wrote that in a separate patch of the same > patchset - http://patchwork.linux-mips.org/patch/10505/ "MIPS: R6: Use > lightweight SYNC instruction in smp_* memory barriers": Did you actually say anything here? > >This instructions were specifically designed to work for smp_*() sort of > >memory barriers in MIPS R2/R3/R5 and R6. > > > >Unfortunately, it's description is very cryptic and is done in HW engineering > >style which prevents use of it by SW. > > 3. I bother MIPS Arch team long time until I completely understood that MIPS > SYNC_WMB, SYNC_MB, SYNC_RMB, SYNC_RELEASE and SYNC_ACQUIRE do an exactly > that is required in Documentation/memory-barriers.txt Ha! and you think that document covers all the really fun details? In particular we're very much all 'confused' about the various notions of transitivity and what barriers imply how much of it. > In Peter Zijlstra mail: > > >1) you do not make such things selectable; either the hardware needs > >them or it doesn't. If it does you_must_ use them, however unlikely. > It is selectable only for MIPS R2 but not MIPS R6. 
The reason is - most of > MIPS R2 CPUs have short pipeline and that SYNC is just waste of CPU > resource, especially taking into account that "lightweight syncs" are > converted to a heavy "SYNC 0" in many of that CPUs. However the latest > MIPS/Imagination CPU have a pipeline long enough to hit a problem - absence > of SYNC at LL/SC inside atomics, barriers etc. What ?! Are you saying that because R2 has short pipelines its unlikely to hit the reordering issues and we can omit barriers? > >And reading the MIPS64 v6.04 instruction set manual, I think 0x11/0x12 > >are_NOT_ transitive and therefore cannot be used to implement the > >smp_mb__{before,after} stuff. > > > >That is, in MIPS speak, those SYNC types are Ordering Barriers, not > >Completion Barriers. > > Please see above, point 2. That did not in fact enlighten things. Are they transitive/multi-copy atomic or not? (and here Will will go into great detail on the differences between the two and make our collective brains explode :-) > >That is, currently all architectures -- with exception of PPC -- have > >RCsc locks, but using these non-transitive things will get you RCpc > >locks. > > > >So yes, MIPS can go RCpc for its locks and share the burden of pain with > >PPC, but that needs to be a very concious decision. > > I don't understand that - I tried hard but I can't find any word like > "RCsc", "RCpc" in Documents/ directory. Web search goes nowhere, of course. From: lkml.kernel.org/r/20150828153921.gf19...@twins.programming.kicks-ass.net Yes, the difference between RCpc and RCsc is in the meaning of RELEASE + ACQUIRE. With RCsc that implies a full memory barrier, with RCpc it does not. Currently PowerPC is the only arch that (can, and) does RCpc and gives a weaker RELEASE + ACQUIRE. Only the CPU who did the ACQUIRE is guaranteed to see the stores of the CPU which did the RELEASE in order. 
As it stands, RCU is the only _known_ codebase where this matters, but we did in fact write code for a fair number of years 'assuming' RELEASE + ACQUIRE was a full barrier, so who knows what else is out there.

RCsc - release consistency sequential consistency
RCpc - release consistency processor consistency

https://en.wikipedia.org/wiki/Processor_consistency
Re: [v3,11/41] mips: reuse asm-generic/barrier.h
(I try to answer multiple mails in one)

First of all, it seems like some generic notes should be given here:

1. The generic MIPS "SYNC" (aka "SYNC 0") instruction is very heavy on some CPUs. On those CPUs it basically kills the pipelines in each CPU, can do a special memory/IO bus transaction (similar to "fence") and holds the system until all R/W is completed. It is like the Big Kernel Lock but worse. So, the move to SMP_* kind of barriers is needed to improve performance, especially on the newest CPUs with long pipelines.

2. The MIPS Arch document may be misleading because the words "ordering" and "completion" mean something different there than in Linux; the SYNC instruction description is written for HW engineers. I wrote that in a separate patch of the same patchset - http://patchwork.linux-mips.org/patch/10505/ "MIPS: R6: Use lightweight SYNC instruction in smp_* memory barriers":

> These instructions were specifically designed to work for smp_*() sort of
> memory barriers in MIPS R2/R3/R5 and R6.
>
> Unfortunately, their description is very cryptic and is done in HW engineering
> style which prevents use of it by SW.

3. I bothered the MIPS Arch team for a long time until I completely understood that MIPS SYNC_WMB, SYNC_MB, SYNC_RMB, SYNC_RELEASE and SYNC_ACQUIRE do exactly what is required in Documentation/memory-barriers.txt

In Peter Zijlstra mail:

> 1) you do not make such things selectable; either the hardware needs
> them or it doesn't. If it does you _must_ use them, however unlikely.

It is selectable only for MIPS R2 but not MIPS R6. The reason is that most MIPS R2 CPUs have a short pipeline and SYNC is just a waste of CPU resources there, especially taking into account that "lightweight syncs" are converted to a heavy "SYNC 0" in many of those CPUs. However the latest MIPS/Imagination CPUs have a pipeline long enough to hit a problem: the absence of SYNC at LL/SC inside atomics, barriers etc.

> And reading the MIPS64 v6.04 instruction set manual, I think 0x11/0x12
> are _NOT_ transitive and therefore cannot be used to implement the
> smp_mb__{before,after} stuff.
>
> That is, in MIPS speak, those SYNC types are Ordering Barriers, not
> Completion Barriers.

Please see above, point 2.

> That is, currently all architectures -- with exception of PPC -- have
> RCsc locks, but using these non-transitive things will get you RCpc
> locks.
>
> So yes, MIPS can go RCpc for its locks and share the burden of pain with
> PPC, but that needs to be a very conscious decision.

I don't understand that - I tried hard but I can't find any word like "RCsc", "RCpc" in Documents/ directory. Web search goes nowhere, of course.

In Will Deacon mail:

> The issue I have with the SYNC description in the text above is that it
> describes the single CPU (program order) and the dual-CPU (confusingly
> named global order) cases, but then doesn't generalise any further. That
> means we can't sensibly reason about transitivity properties when a third
> agent is involved. For example, the WRC+sync+addr test:
>
> P0:
> Wx = 1
>
> P1:
> Rx == 1
> SYNC
> Wy = 1
>
> P2:
> Ry == 1
> Rx == 0
>
> I can't find anything to forbid that, given the text. The main problem
> is having the SYNC on P1 affect the write by P0.

As I understand that test, the visibility of P0: W[x] = 1 is identical to P1 and P2 here. If P1 got X before SYNC and wrote to Y after SYNC, then instruction source register dependency tracking in P2 prevents a speculative load of X before P2 obtains Y from the same place as P0/P1 and calculates the address of X. If some load of X in P2 happens before the address dependency calculation, its result is discarded.

Yes, you can't find that in the MIPS SYNC instruction description; it is more likely in the CM (Coherence Manager) area. I just pointed this out to our arch team member responsible for documents and he will think about how to explain that.

- Leonid.
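For readers who want to experiment with the WRC ("write to read causality") shape quoted above, here is a rough user-space analogue in C11 with pthreads. It is only a sketch under stated assumptions: the release store on P1 stands in for SYNC and the acquire load on P2 stands in for the address dependency, so it says nothing about what the MIPS SYNC 0x11/0x12 types guarantee on real hardware. The function names p0(), p1(), p2() and run_wrc() are made up for this example. Under the C11 memory model the outcome r1 == 1, r2 == 1, r3 == 0 is forbidden by release/acquire synchronisation combined with read-read coherence.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_int x, y;		/* shared locations of the litmus test */
static int r1, r2, r3;		/* values observed by P1 and P2 */

/* P0: Wx = 1 */
static void *p0(void *arg)
{
	(void)arg;
	atomic_store_explicit(&x, 1, memory_order_relaxed);
	return NULL;
}

/* P1: Rx; barrier; Wy = 1 (the release store stands in for SYNC) */
static void *p1(void *arg)
{
	(void)arg;
	r1 = atomic_load_explicit(&x, memory_order_relaxed);
	atomic_store_explicit(&y, 1, memory_order_release);
	return NULL;
}

/* P2: Ry; Rx (the acquire load stands in for the address dependency) */
static void *p2(void *arg)
{
	(void)arg;
	r2 = atomic_load_explicit(&y, memory_order_acquire);
	r3 = atomic_load_explicit(&x, memory_order_relaxed);
	return NULL;
}

/* Run the test repeatedly; return how often r1 == 1, r2 == 1, r3 == 0 was seen. */
int run_wrc(int iterations)
{
	void *(*body[3])(void *) = { p0, p1, p2 };
	int forbidden = 0;

	for (int i = 0; i < iterations; i++) {
		pthread_t t[3];
		int rc;

		/* Reset shared state before the three threads start. */
		atomic_store(&x, 0);
		atomic_store(&y, 0);
		r1 = r2 = r3 = -1;

		for (int n = 0; n < 3; n++) {
			rc = pthread_create(&t[n], NULL, body[n], NULL);
			assert(rc == 0);
			(void)rc;
		}
		/* Joining makes r1..r3 safely readable from this thread. */
		for (int n = 0; n < 3; n++) {
			rc = pthread_join(t[n], NULL);
			assert(rc == 0);
			(void)rc;
		}
		if (r1 == 1 && r2 == 1 && r3 == 0)
			forbidden++;
	}
	return forbidden;
}
```

Calling run_wrc() with a few hundred iterations should always return 0 on a conforming C11 implementation. The open question in the thread is whether the lightweight SYNC types give the equivalent guarantee when a third CPU is involved, which this sketch cannot answer.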
Re: [PATCH v3 01/41] lcoking/barriers, arch: Use smp barriers in smp_store_release()
On Tue, Jan 12, 2016 at 08:28:44AM -0800, Paul E. McKenney wrote: > On Sun, Jan 10, 2016 at 04:16:32PM +0200, Michael S. Tsirkin wrote: > > From: Davidlohr Bueso > > > > With commit b92b8b35a2e ("locking/arch: Rename set_mb() to smp_store_mb()") > > it was made clear that the context of this call (and thus set_mb) > > is strictly for CPU ordering, as opposed to IO. As such all archs > > should use the smp variant of mb(), respecting the semantics and > > saving a mandatory barrier on UP. > > > > Signed-off-by: Davidlohr Bueso > > Signed-off-by: Peter Zijlstra (Intel) > > Cc: > > Cc: Andrew Morton > > Cc: Benjamin Herrenschmidt > > Cc: Heiko Carstens > > Cc: Linus Torvalds > > Cc: Paul E. McKenney > > Cc: Peter Zijlstra > > Cc: Thomas Gleixner > > Cc: Tony Luck > > Cc: d...@stgolabs.net > > Link: > > http://lkml.kernel.org/r/1445975631-17047-3-git-send-email-d...@stgolabs.net > > Signed-off-by: Ingo Molnar > > Aside from a need for s/lcoking/locking/ in the subject line: > > Reviewed-by: Paul E. McKenney Thanks! Though Ingo already put this in tip tree like this, and I need a copy in my tree to avoid breaking bisect, so I will probably keep it exactly the same to avoid confusion. 
> > --- > > arch/ia64/include/asm/barrier.h| 2 +- > > arch/powerpc/include/asm/barrier.h | 2 +- > > arch/s390/include/asm/barrier.h| 2 +- > > include/asm-generic/barrier.h | 2 +- > > 4 files changed, 4 insertions(+), 4 deletions(-) > > > > diff --git a/arch/ia64/include/asm/barrier.h > > b/arch/ia64/include/asm/barrier.h > > index df896a1..209c4b8 100644 > > --- a/arch/ia64/include/asm/barrier.h > > +++ b/arch/ia64/include/asm/barrier.h > > @@ -77,7 +77,7 @@ do { > > \ > > ___p1; \ > > }) > > > > -#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); mb(); } > > while (0) > > +#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); smp_mb(); } > > while (0) > > > > /* > > * The group barrier in front of the rsm & ssm are necessary to ensure > > diff --git a/arch/powerpc/include/asm/barrier.h > > b/arch/powerpc/include/asm/barrier.h > > index 0eca6ef..a7af5fb 100644 > > --- a/arch/powerpc/include/asm/barrier.h > > +++ b/arch/powerpc/include/asm/barrier.h > > @@ -34,7 +34,7 @@ > > #define rmb() __asm__ __volatile__ ("sync" : : : "memory") > > #define wmb() __asm__ __volatile__ ("sync" : : : "memory") > > > > -#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); mb(); } > > while (0) > > +#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); smp_mb(); } > > while (0) > > > > #ifdef __SUBARCH_HAS_LWSYNC > > #define SMPWMB LWSYNC > > diff --git a/arch/s390/include/asm/barrier.h > > b/arch/s390/include/asm/barrier.h > > index d68e11e..7ffd0b1 100644 > > --- a/arch/s390/include/asm/barrier.h > > +++ b/arch/s390/include/asm/barrier.h > > @@ -36,7 +36,7 @@ > > #define smp_mb__before_atomic()smp_mb() > > #define smp_mb__after_atomic() smp_mb() > > > > -#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); > > mb(); } while (0) > > +#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); smp_mb(); > > } while (0) > > > > #define smp_store_release(p, v) > > \ > > do { > > \ > > diff --git a/include/asm-generic/barrier.h 
b/include/asm-generic/barrier.h > > index b42afad..0f45f93 100644 > > --- a/include/asm-generic/barrier.h > > +++ b/include/asm-generic/barrier.h > > @@ -93,7 +93,7 @@ > > #endif /* CONFIG_SMP */ > > > > #ifndef smp_store_mb > > -#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); mb(); } > > while (0) > > +#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); smp_mb(); } > > while (0) > > #endif > > > > #ifndef smp_mb__before_atomic > > -- > > MST > > ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
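The effect of switching smp_store_mb() from mb() to smp_mb() can be seen in a small user-space mock. This is a sketch only: the macro bodies below are simplified stand-ins written for this example (a GCC/Clang builtin replaces the arch fence, and WRITE_ONCE() is reduced to a volatile store), and publish_flag() is a hypothetical helper. The point it illustrates is that on a uniprocessor build smp_mb() collapses to a compiler barrier, while mb() always emits a real fence.

```c
#include <assert.h>

/* Compiler-only barrier: stops compiler reordering, emits no instruction. */
#define barrier()	__asm__ __volatile__("" : : : "memory")

/* Stand-in for the mandatory barrier; a real arch emits its own fence here. */
#define mb()		__sync_synchronize()

/* smp_mb() only needs a real fence when other CPUs exist. */
#ifdef CONFIG_SMP
#define smp_mb()	mb()
#else
#define smp_mb()	barrier()
#endif

/* Simplified WRITE_ONCE(): a volatile store the compiler may not elide. */
#define WRITE_ONCE(var, value) \
	(*(volatile __typeof__(var) *)&(var) = (value))

/* The form the patch switches to: store, then an SMP-conditional barrier. */
#define smp_store_mb(var, value) \
	do { WRITE_ONCE(var, value); smp_mb(); } while (0)

/* Hypothetical demo: publish a flag and return what was stored. */
int publish_flag(int *flag)
{
	smp_store_mb(*flag, 1);
	assert(*flag == 1);	/* the store itself is unconditional */
	return *flag;
}
```

Building this with and without -DCONFIG_SMP and comparing the generated assembly shows the fence instruction disappearing in the uniprocessor case, which is the saving the commit message refers to.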
Re: [PATCH 2/2] ASoC: fsl: select SND_SOC_FSL_SAI or SND_SOC_FSL_SSI depending on SoC type
Lothar Waßmann wrote:
> -	select SND_SOC_FSL_SSI
> +	select SND_SOC_FSL_SAI if SOC_IMX6UL
> +	select SND_SOC_FSL_SSI if SOC_IMX6Q || SOC_IMX6SL || SOC_IMX6SX

I don't think this is compatible with a multiarch kernel.
Re: [PATCH 1/2] ASoC: fsl: imx-sgtl5000: make audmux optional for imx sound driver
On Tue, Jan 12, 2016 at 07:13:30PM +0100, Lothar Waßmann wrote: > i.MX6UL does not have the audio multiplexer (AUDMUX) like e.g. i.MX6Q, > but apart from that can use the same audio driver. Make audmux > optional for the imx-sgtl5000 driver, so it can be used on i.MX6UL > too. Also i.MX6UL requires use of the SAI interface rather than SSI. If it doesn't have the audmux can you use simple-card?
[PATCH 1/2] ASoC: fsl: imx-sgtl5000: make audmux optional for imx sound driver
i.MX6UL does not have the audio multiplexer (AUDMUX) like e.g. i.MX6Q, but apart from that can use the same audio driver. Make audmux optional for the imx-sgtl5000 driver, so it can be used on i.MX6UL too. Also i.MX6UL requires use of the SAI interface rather than SSI. Signed-off-by: Lothar Waßmann --- sound/soc/fsl/imx-sgtl5000.c | 70 +++- 1 file changed, 36 insertions(+), 34 deletions(-) diff --git a/sound/soc/fsl/imx-sgtl5000.c b/sound/soc/fsl/imx-sgtl5000.c index b99e0b5..7cefb40 100644 --- a/sound/soc/fsl/imx-sgtl5000.c +++ b/sound/soc/fsl/imx-sgtl5000.c @@ -65,40 +65,42 @@ static int imx_sgtl5000_probe(struct platform_device *pdev) int int_port, ext_port; int ret; - ret = of_property_read_u32(np, "mux-int-port", &int_port); - if (ret) { - dev_err(&pdev->dev, "mux-int-port missing or invalid\n"); - return ret; - } - ret = of_property_read_u32(np, "mux-ext-port", &ext_port); - if (ret) { - dev_err(&pdev->dev, "mux-ext-port missing or invalid\n"); - return ret; - } - - /* -* The port numbering in the hardware manual starts at 1, while -* the audmux API expects it starts at 0. 
-*/ - int_port--; - ext_port--; - ret = imx_audmux_v2_configure_port(int_port, - IMX_AUDMUX_V2_PTCR_SYN | - IMX_AUDMUX_V2_PTCR_TFSEL(ext_port) | - IMX_AUDMUX_V2_PTCR_TCSEL(ext_port) | - IMX_AUDMUX_V2_PTCR_TFSDIR | - IMX_AUDMUX_V2_PTCR_TCLKDIR, - IMX_AUDMUX_V2_PDCR_RXDSEL(ext_port)); - if (ret) { - dev_err(&pdev->dev, "audmux internal port setup failed\n"); - return ret; - } - ret = imx_audmux_v2_configure_port(ext_port, - IMX_AUDMUX_V2_PTCR_SYN, - IMX_AUDMUX_V2_PDCR_RXDSEL(int_port)); - if (ret) { - dev_err(&pdev->dev, "audmux external port setup failed\n"); - return ret; + if (!of_property_read_bool(np, "fsl,no-audmux")) { + ret = of_property_read_u32(np, "mux-int-port", &int_port); + if (ret) { + dev_err(&pdev->dev, "mux-int-port missing or invalid\n"); + return ret; + } + ret = of_property_read_u32(np, "mux-ext-port", &ext_port); + if (ret) { + dev_err(&pdev->dev, "mux-ext-port missing or invalid\n"); + return ret; + } + + /* +* The port numbering in the hardware manual starts at 1, while +* the audmux API expects it starts at 0. +*/ + int_port--; + ext_port--; + ret = imx_audmux_v2_configure_port(int_port, + IMX_AUDMUX_V2_PTCR_SYN | + IMX_AUDMUX_V2_PTCR_TFSEL(ext_port) | + IMX_AUDMUX_V2_PTCR_TCSEL(ext_port) | + IMX_AUDMUX_V2_PTCR_TFSDIR | + IMX_AUDMUX_V2_PTCR_TCLKDIR, + IMX_AUDMUX_V2_PDCR_RXDSEL(ext_port)); + if (ret) { + dev_err(&pdev->dev, "audmux internal port setup failed\n"); + return ret; + } + ret = imx_audmux_v2_configure_port(ext_port, + IMX_AUDMUX_V2_PTCR_SYN, + IMX_AUDMUX_V2_PDCR_RXDSEL(int_port)); + if (ret) { + dev_err(&pdev->dev, "audmux external port setup failed\n"); + return ret; + } } ssi_np = of_parse_phandle(pdev->dev.of_node, "ssi-controller", 0); -- 2.1.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 2/2] ASoC: fsl: select SND_SOC_FSL_SAI or SND_SOC_FSL_SSI depending on SoC type
i.MX6UL does not provide an SSI interface like the other i.MX6 SoCs, but only an SAI interface. Select the appropriate interface(s) depending on the enabled SoC types. Signed-off-by: Lothar Waßmann --- sound/soc/fsl/Kconfig | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/sound/soc/fsl/Kconfig b/sound/soc/fsl/Kconfig index 14dfdee..c128823 100644 --- a/sound/soc/fsl/Kconfig +++ b/sound/soc/fsl/Kconfig @@ -258,7 +258,8 @@ config SND_SOC_IMX_SGTL5000 select SND_SOC_SGTL5000 select SND_SOC_IMX_PCM_DMA select SND_SOC_IMX_AUDMUX - select SND_SOC_FSL_SSI + select SND_SOC_FSL_SAI if SOC_IMX6UL + select SND_SOC_FSL_SSI if SOC_IMX6Q || SOC_IMX6SL || SOC_IMX6SX help Say Y if you want to add support for SoC audio on an i.MX board with a sgtl5000 codec. -- 2.1.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 0/2] ASoC: fsl: make snd-soc-imx-sgtl5000 driver useable on i.MX6UL
This patchset adds support for the i.MX6UL SoC to the imx-sgtl5000 sound driver. The first patch makes the audmux setup optional for the driver, since i.MX6UL does not have this unit. The second patch selects the SAI interface rather than the SSI interface for the i.MX6UL SoC. A patch to make the corresponding DTB changes has been sent separately.
[PATCH 2/2] clk: imx: add kpp clock for i.MX6UL
Add the necessary clock to use the KPP interface on i.MX6UL. Signed-off-by: Lothar Waßmann --- drivers/clk/imx/clk-imx6ul.c | 1 + include/dt-bindings/clock/imx6ul-clock.h | 3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/clk/imx/clk-imx6ul.c b/drivers/clk/imx/clk-imx6ul.c index 3e31ec0..1ee28d3 100644 --- a/drivers/clk/imx/clk-imx6ul.c +++ b/drivers/clk/imx/clk-imx6ul.c @@ -365,6 +365,7 @@ static void __init imx6ul_clocks_init(struct device_node *ccm_node) /* CCGR5 */ clks[IMX6UL_CLK_ROM]= imx_clk_gate2("rom", "ahb", base + 0x7c,0); clks[IMX6UL_CLK_SDMA] = imx_clk_gate2("sdma", "ahb", base + 0x7c,6); + clks[IMX6UL_CLK_KPP]= imx_clk_gate2("kpp", "ipg", base + 0x7c,8); clks[IMX6UL_CLK_WDOG2] = imx_clk_gate2("wdog2","ipg", base + 0x7c,10); clks[IMX6UL_CLK_SPBA] = imx_clk_gate2("spba", "ipg", base + 0x7c,12); clks[IMX6UL_CLK_SPDIF] = imx_clk_gate2_shared("spdif", "spdif_podf", base + 0x7c,14, &share_count_audio); diff --git a/include/dt-bindings/clock/imx6ul-clock.h b/include/dt-bindings/clock/imx6ul-clock.h index 08ce4a7..fd8aee8 100644 --- a/include/dt-bindings/clock/imx6ul-clock.h +++ b/include/dt-bindings/clock/imx6ul-clock.h @@ -234,7 +234,8 @@ #define IMX6UL_CLK_CSI_SEL 221 #define IMX6UL_CLK_CSI_PODF222 #define IMX6UL_CLK_PLL3_120M 223 +#define IMX6UL_CLK_KPP 224 -#define IMX6UL_CLK_END 224 +#define IMX6UL_CLK_END 225 #endif /* __DT_BINDINGS_CLOCK_IMX6UL_H */ -- 2.1.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
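The header change in this patch follows a common pattern: a new clock ID takes the old value of the end marker, and the end marker moves up by one so that anything sized by it still covers every ID. A minimal sketch of that invariant is below. The clk_names[] table and clk_set_name() are hypothetical and exist only to show why the end marker has to be bumped; the three ID values are the ones from the patch.

```c
#include <assert.h>

/* Tail of the ID list as it looks after the patch. */
#define IMX6UL_CLK_PLL3_120M	223
#define IMX6UL_CLK_KPP		224
#define IMX6UL_CLK_END		225

/* Every real clock ID has to stay below the end marker. */
static_assert(IMX6UL_CLK_KPP < IMX6UL_CLK_END, "KPP must precede CLK_END");

/* Hypothetical table sized by the end marker, as assumed for this sketch. */
static const char *clk_names[IMX6UL_CLK_END];

/* Hypothetical helper: record a name, rejecting IDs outside the table. */
int clk_set_name(unsigned int id, const char *name)
{
	if (id >= IMX6UL_CLK_END)
		return -1;
	clk_names[id] = name;
	return 0;
}
```

Had IMX6UL_CLK_END stayed at 224, the compile-time check above would fail and an index of IMX6UL_CLK_KPP would fall outside any table sized by the end marker.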
[PATCH 1/2] clk: imx: whitespace cleanup; no functional change
remove whitespace before TAB. Signed-off-by: Lothar Waßmann --- drivers/clk/imx/clk-imx6ul.c | 62 ++--- include/dt-bindings/clock/imx6ul-clock.h | 146 +++ 2 files changed, 104 insertions(+), 104 deletions(-) diff --git a/drivers/clk/imx/clk-imx6ul.c b/drivers/clk/imx/clk-imx6ul.c index 08692d7..3e31ec0 100644 --- a/drivers/clk/imx/clk-imx6ul.c +++ b/drivers/clk/imx/clk-imx6ul.c @@ -157,9 +157,9 @@ static void __init imx6ul_clocks_init(struct device_node *ccm_node) clk_set_parent(clks[IMX6UL_PLL7_BYPASS], clks[IMX6UL_CLK_PLL7]); clks[IMX6UL_CLK_PLL1_SYS] = imx_clk_fixed_factor("pll1_sys", "pll1_bypass", 1, 1); - clks[IMX6UL_CLK_PLL2_BUS] = imx_clk_gate("pll2_bus", "pll2_bypass", base + 0x30, 13); - clks[IMX6UL_CLK_PLL3_USB_OTG] = imx_clk_gate("pll3_usb_otg", "pll3_bypass", base + 0x10, 13); - clks[IMX6UL_CLK_PLL4_AUDIO] = imx_clk_gate("pll4_audio", "pll4_bypass", base + 0x70, 13); + clks[IMX6UL_CLK_PLL2_BUS] = imx_clk_gate("pll2_bus", "pll2_bypass", base + 0x30, 13); + clks[IMX6UL_CLK_PLL3_USB_OTG] = imx_clk_gate("pll3_usb_otg", "pll3_bypass", base + 0x10, 13); + clks[IMX6UL_CLK_PLL4_AUDIO] = imx_clk_gate("pll4_audio", "pll4_bypass", base + 0x70, 13); clks[IMX6UL_CLK_PLL5_VIDEO] = imx_clk_gate("pll5_video", "pll5_bypass", base + 0xa0, 13); clks[IMX6UL_CLK_PLL6_ENET] = imx_clk_gate("pll6_enet", "pll6_bypass", base + 0xe0, 13); clks[IMX6UL_CLK_PLL7_USB_HOST] = imx_clk_gate("pll7_usb_host", "pll7_bypass", base + 0x20, 13); @@ -196,8 +196,8 @@ static void __init imx6ul_clocks_init(struct device_node *ccm_node) base + 0xe0, 2, 2, 0, clk_enet_ref_table, &imx_ccm_lock); clks[IMX6UL_CLK_ENET2_REF_125M] = imx_clk_gate("enet_ref_125m", "enet2_ref", base + 0xe0, 20); - clks[IMX6UL_CLK_ENET_PTP_REF] = imx_clk_fixed_factor("enet_ptp_ref", "pll6_enet", 1, 20); - clks[IMX6UL_CLK_ENET_PTP] = imx_clk_gate("enet_ptp", "enet_ptp_ref", base + 0xe0, 21); + clks[IMX6UL_CLK_ENET_PTP_REF] = imx_clk_fixed_factor("enet_ptp_ref", "pll6_enet", 1, 20); + clks[IMX6UL_CLK_ENET_PTP] = 
imx_clk_gate("enet_ptp", "enet_ptp_ref", base + 0xe0, 21); clks[IMX6UL_CLK_PLL4_POST_DIV] = clk_register_divider_table(NULL, "pll4_post_div", "pll4_audio", CLK_SET_RATE_PARENT | CLK_SET_RATE_GATE, base + 0x70, 19, 2, 0, post_div_table, &imx_ccm_lock); @@ -210,8 +210,8 @@ static void __init imx6ul_clocks_init(struct device_node *ccm_node) /* name parent_name mult div */ clks[IMX6UL_CLK_PLL2_198M] = imx_clk_fixed_factor("pll2_198m", "pll2_pfd2_396m", 1, 2); - clks[IMX6UL_CLK_PLL3_80M] = imx_clk_fixed_factor("pll3_80m", "pll3_usb_otg", 1, 6); - clks[IMX6UL_CLK_PLL3_60M] = imx_clk_fixed_factor("pll3_60m", "pll3_usb_otg", 1, 8); + clks[IMX6UL_CLK_PLL3_80M] = imx_clk_fixed_factor("pll3_80m", "pll3_usb_otg", 1, 6); + clks[IMX6UL_CLK_PLL3_60M] = imx_clk_fixed_factor("pll3_60m", "pll3_usb_otg", 1, 8); clks[IMX6UL_CLK_GPT_3M]= imx_clk_fixed_factor("gpt_3m", "osc", 1, 8); np = ccm_node; @@ -219,34 +219,34 @@ static void __init imx6ul_clocks_init(struct device_node *ccm_node) WARN_ON(!base); clks[IMX6UL_CA7_SECONDARY_SEL]= imx_clk_mux("ca7_secondary_sel", base + 0xc, 3, 1, ca7_secondary_sels, ARRAY_SIZE(ca7_secondary_sels)); - clks[IMX6UL_CLK_STEP] = imx_clk_mux("step", base + 0x0c, 8, 1, step_sels, ARRAY_SIZE(step_sels)); - clks[IMX6UL_CLK_PLL1_SW] = imx_clk_mux_flags("pll1_sw", base + 0x0c, 2, 1, pll1_sw_sels, ARRAY_SIZE(pll1_sw_sels), 0); + clks[IMX6UL_CLK_STEP] = imx_clk_mux("step", base + 0x0c, 8, 1, step_sels, ARRAY_SIZE(step_sels)); + clks[IMX6UL_CLK_PLL1_SW] = imx_clk_mux_flags("pll1_sw", base + 0x0c, 2, 1, pll1_sw_sels, ARRAY_SIZE(pll1_sw_sels), 0); clks[IMX6UL_CLK_AXI_ALT_SEL] = imx_clk_mux("axi_alt_sel", base + 0x14, 7, 1, axi_alt_sels, ARRAY_SIZE(axi_alt_sels)); - clks[IMX6UL_CLK_AXI_SEL] = imx_clk_mux_flags("axi_sel", base + 0x14, 6, 1, axi_sels, ARRAY_SIZE(axi_sels), 0); - clks[IMX6UL_CLK_PERIPH_PRE] = imx_clk_mux("periph_pre", base + 0x18, 18, 2, periph_pre_sels, ARRAY_SIZE(periph_pre_sels)); - clks[IMX6UL_CLK_PERIPH2_PRE] = imx_clk_mux("periph2_pre", base + 
0x18, 21, 2, periph2_pre_sels, ARRAY_SIZE(periph2_pre_sels)); + clks[IMX6UL_CLK_AXI_SEL] = imx_clk_mux_flags("axi_sel", base + 0x14, 6, 1, axi_sels, ARRAY_SIZE(axi_sels), 0); + clks[IMX6UL_CLK_PERIPH_PRE] = imx_clk_mux("periph_pre", base + 0x18, 18, 2, periph_pre_sels, ARRAY_
[PATCH 0/2] clk: imx6: add kpp clock for i.MX6UL
This patchset adds the clock which is necessary to operate the KPP unit on i.MX6UL. The first patch removes bogus whitespace before TABs in indentation. The second patch adds the clock definition.
Re: [PATCH] Add hwcap2 bits for POWER9
On 01/12/2016 11:39 AM, Steven Munroe wrote:
>> That's the rule. There are no other discussions to be had.
>>
> Well it was posted to powerpc next:
> https://git.kernel.org/powerpc/c/e708c24cd01ce80b1609d8bacc
>
> We have agreement between the kernel and GLIBC (and the ABI).
>
> The issue is just coordination across communities and individuals that
> may not be paying attention to other communities' deadlines.
>
> Have you ever tried to push a string, up hill. That is open source
> development in a nutshell. ;)

I know exactly what this is like.

> So it is in flight and glibc is soft/slush freeze. I would hate to
> revert this one day just to add it back the next. Especially if those
> days straddle the hard freeze ...
>
> So can we let this ride a day or two?

Sure. I'm not an unreasonable person.

My goal as a glibc steward is to remind IBM that our best practice is that we *wait* until it goes into mainline before committing to glibc master. There really isn't any reason to check this in to glibc master right now. It could wait.

Adhemerval as a release manager is also not an unreasonable person. I have already discussed with Tulio that he should have just waited to commit these changes, but gotten an exception from Adhemerval to check in the fairly low-risk patches late in the freeze.

That's exactly the purpose of a release manager's job, to grant you exceptions as we approach release, particularly when schedules don't quite line up.

Cheers,
Carlos.
Re: [PATCH v3 05/41] powerpc: reuse asm-generic/barrier.h
On Sun, Jan 10, 2016 at 04:17:09PM +0200, Michael S. Tsirkin wrote: > On powerpc read_barrier_depends, smp_read_barrier_depends > smp_store_mb(), smp_mb__before_atomic and smp_mb__after_atomic match the > asm-generic variants exactly. Drop the local definitions and pull in > asm-generic/barrier.h instead. > > This is in preparation to refactoring this code area. > > Signed-off-by: Michael S. Tsirkin > Acked-by: Arnd Bergmann Looks sane to me. Reviewed-by: Paul E. McKenney > --- > arch/powerpc/include/asm/barrier.h | 9 ++--- > 1 file changed, 2 insertions(+), 7 deletions(-) > > diff --git a/arch/powerpc/include/asm/barrier.h > b/arch/powerpc/include/asm/barrier.h > index a7af5fb..980ad0c 100644 > --- a/arch/powerpc/include/asm/barrier.h > +++ b/arch/powerpc/include/asm/barrier.h > @@ -34,8 +34,6 @@ > #define rmb() __asm__ __volatile__ ("sync" : : : "memory") > #define wmb() __asm__ __volatile__ ("sync" : : : "memory") > > -#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); smp_mb(); } > while (0) > - > #ifdef __SUBARCH_HAS_LWSYNC > #define SMPWMB LWSYNC > #else > @@ -60,9 +58,6 @@ > #define smp_wmb()barrier() > #endif /* CONFIG_SMP */ > > -#define read_barrier_depends() do { } while (0) > -#define smp_read_barrier_depends() do { } while (0) > - > /* > * This is a barrier which prevents following instructions from being > * started until the value of the argument x is known. For example, if > @@ -87,8 +82,8 @@ do { > \ > ___p1; \ > }) > > -#define smp_mb__before_atomic() smp_mb() > -#define smp_mb__after_atomic() smp_mb() > #define smp_mb__before_spinlock() smp_mb() > > +#include > + > #endif /* _ASM_POWERPC_BARRIER_H */ > -- > MST > ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH] Add hwcap2 bits for POWER9
On Mon, 2016-01-11 at 15:48 -0500, Carlos O'Donell wrote:
> On 01/11/2016 02:55 PM, Tulio Magno Quites Machado Filho wrote:
> > "Carlos O'Donell" writes:
> >
> >> On 01/11/2016 10:16 AM, Tulio Magno Quites Machado Filho wrote:
> >>> Adhemerval Zanella writes:
> >>>
> >>>> On 08-01-2016 13:36, Peter Bergner wrote:
> >>>>> On Fri, 2016-01-08 at 11:25 -0200, Tulio Magno Quites Machado Filho wrote:
> >>>>>> Peter, this solves the issue you reported previously [1].
> >>>>>>
> >>>>>> [1] https://sourceware.org/ml/libc-alpha/2015-12/msg00522.html
> >>>>>
> >>>>> Agreed, thanks. I'll also add the POWER9 support to the GCC side
> >>>>> of the patch now that the glibc code is upstream.
> >>>>
> >>>> I do not see these bits being added on the kernel side yet, and GLIBC
> >>>> usually only syncs these kinds of bits *after* they are included on the
> >>>> kernel side. So I would advise either getting these pieces (kernel
> >>>> support and hwcap advertisement) into the kernel before the 2.23
> >>>> release, or reverting the patches.
> >>>
> >>> Ack.
> >>> It has just been sent to the corresponding Linux mailing list:
> >>> https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-January/137763.html
> >>
> >> Please revert the changes from glibc until you check in support to linux
> >> kernel mainline.
> >>
> >> Leaving these bits in increases the risk that someone deploys a glibc
> >> that then may have the wrong value.
> >
> > Could you clarify this statement, please?
> > I fail to see how they could have the wrong value.
>
> Until it is checked into the mainline kernel it is not canonical.
>
> That's the rule. There are no other discussions to be had.

Well, it was posted to powerpc next:
https://git.kernel.org/powerpc/c/e708c24cd01ce80b1609d8bacc

We have agreement between the kernel and GLIBC (and the ABI).

The issue is just coordination across communities and individuals that
may not be paying attention to other communities' deadlines.

Have you ever tried to push a string uphill? That is open source
development in a nutshell. ;)

So it is in flight and glibc is in soft/slush freeze. I would hate to
revert this one day just to add it back the next. Especially if those
days straddle the hard freeze ...

So can we let this ride a day or two?

> The single rule avoids discussions like "it can never be wrong because that's
> what our ABI says it is."
Re: [PATCH v3 01/41] lcoking/barriers, arch: Use smp barriers in smp_store_release()
On Sun, Jan 10, 2016 at 04:16:32PM +0200, Michael S. Tsirkin wrote:
> From: Davidlohr Bueso
>
> With commit b92b8b35a2e ("locking/arch: Rename set_mb() to smp_store_mb()")
> it was made clear that the context of this call (and thus set_mb)
> is strictly for CPU ordering, as opposed to IO. As such all archs
> should use the smp variant of mb(), respecting the semantics and
> saving a mandatory barrier on UP.
>
> Signed-off-by: Davidlohr Bueso
> Signed-off-by: Peter Zijlstra (Intel)
> Cc:
> Cc: Andrew Morton
> Cc: Benjamin Herrenschmidt
> Cc: Heiko Carstens
> Cc: Linus Torvalds
> Cc: Paul E. McKenney
> Cc: Peter Zijlstra
> Cc: Thomas Gleixner
> Cc: Tony Luck
> Cc: d...@stgolabs.net
> Link: http://lkml.kernel.org/r/1445975631-17047-3-git-send-email-d...@stgolabs.net
> Signed-off-by: Ingo Molnar

Aside from a need for s/lcoking/locking/ in the subject line:

Reviewed-by: Paul E. McKenney

> ---
>  arch/ia64/include/asm/barrier.h    | 2 +-
>  arch/powerpc/include/asm/barrier.h | 2 +-
>  arch/s390/include/asm/barrier.h    | 2 +-
>  include/asm-generic/barrier.h      | 2 +-
>  4 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/ia64/include/asm/barrier.h b/arch/ia64/include/asm/barrier.h
> index df896a1..209c4b8 100644
> --- a/arch/ia64/include/asm/barrier.h
> +++ b/arch/ia64/include/asm/barrier.h
> @@ -77,7 +77,7 @@ do { \
>  	___p1; \
>  })
>
> -#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); mb(); } while (0)
> +#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); smp_mb(); } while (0)
>
>  /*
>   * The group barrier in front of the rsm & ssm are necessary to ensure
> diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h
> index 0eca6ef..a7af5fb 100644
> --- a/arch/powerpc/include/asm/barrier.h
> +++ b/arch/powerpc/include/asm/barrier.h
> @@ -34,7 +34,7 @@
>  #define rmb() __asm__ __volatile__ ("sync" : : : "memory")
>  #define wmb() __asm__ __volatile__ ("sync" : : : "memory")
>
> -#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); mb(); } while (0)
> +#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); smp_mb(); } while (0)
>
>  #ifdef __SUBARCH_HAS_LWSYNC
>  #define SMPWMB LWSYNC
> diff --git a/arch/s390/include/asm/barrier.h b/arch/s390/include/asm/barrier.h
> index d68e11e..7ffd0b1 100644
> --- a/arch/s390/include/asm/barrier.h
> +++ b/arch/s390/include/asm/barrier.h
> @@ -36,7 +36,7 @@
>  #define smp_mb__before_atomic() smp_mb()
>  #define smp_mb__after_atomic() smp_mb()
>
> -#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); mb(); } while (0)
> +#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); smp_mb(); } while (0)
>
>  #define smp_store_release(p, v) \
>  do { \
> diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
> index b42afad..0f45f93 100644
> --- a/include/asm-generic/barrier.h
> +++ b/include/asm-generic/barrier.h
> @@ -93,7 +93,7 @@
>  #endif /* CONFIG_SMP */
>
>  #ifndef smp_store_mb
> -#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); mb(); } while (0)
> +#define smp_store_mb(var, value) do { WRITE_ONCE(var, value); smp_mb(); } while (0)
>  #endif
>
>  #ifndef smp_mb__before_atomic
> --
> MST
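The point of the change ("saving a mandatory barrier on UP") can be illustrated with a small user-space analogy built on C11 atomics. This is a sketch under assumptions, not the kernel implementation: `SKETCH_SMP` is a hypothetical stand-in for CONFIG_SMP, the `sketch_*` names are made up, and C11 fences only approximate the kernel primitives.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for CONFIG_SMP; build with -DSKETCH_SMP=0 to model UP. */
#ifndef SKETCH_SMP
#define SKETCH_SMP 1
#endif

/* "Mandatory" barrier: always a real hardware fence, whatever the config. */
#define sketch_mb() atomic_thread_fence(memory_order_seq_cst)

#if SKETCH_SMP
/* SMP: other CPUs can observe ordering, so a real fence is needed. */
#define sketch_smp_mb() atomic_thread_fence(memory_order_seq_cst)
#else
/* UP: only the compiler has to be kept from reordering. */
#define sketch_smp_mb() atomic_signal_fence(memory_order_seq_cst)
#endif

/* Store followed by the smp variant, as in the patched smp_store_mb(). */
#define sketch_store_mb(var, value) \
	do { \
		atomic_store_explicit(&(var), (value), memory_order_relaxed); \
		sketch_smp_mb(); \
	} while (0)

static atomic_int sketch_flag;

/* Stores 1 through sketch_store_mb() and reads it back. */
static int store_mb_demo(void)
{
	int seen;

	sketch_store_mb(sketch_flag, 1);
	seen = atomic_load_explicit(&sketch_flag, memory_order_relaxed);
	assert(seen == 1);
	return seen;
}
```

With the pre-patch definition the store would always be followed by the equivalent of `sketch_mb()`, even in a uniprocessor build where no other CPU can observe the ordering.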
Re: [PATCH 1/2] scripts/recordmcount.pl: support data in text section on powerpc
On Tue, 12 Jan 2016 23:14:22 +1100 Michael Ellerman wrote:
> From: Ulrich Weigand
>
> If a text section starts out with a data blob before the first
> function start label, disassembly parsing done in recordmcount.pl
> gets confused on powerpc, leading to creation of corrupted module
> objects.
>
> This was not a problem so far since the compiler would never create
> such text sections. However, this has changed with a recent change
> in GCC 6 to support distances of > 2GB between a function and its
> associated TOC in the ELFv2 ABI, exposing this problem.
>
> There is already code in recordmcount.pl to handle such data blobs
> on the sparc64 platform. This patch uses the same method to handle
> those on powerpc as well.
>
> Cc: sta...@vger.kernel.org
> Signed-off-by: Ulrich Weigand
> Signed-off-by: Michael Ellerman
> ---
>  scripts/recordmcount.pl | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> Steve can we get an ack for this one, to go via powerpc?
>
> cheers

Acked-by: Steven Rostedt

-- Steve

>
> diff --git a/scripts/recordmcount.pl b/scripts/recordmcount.pl
> index 826470d7f000..96e2486a6fc4 100755
> --- a/scripts/recordmcount.pl
> +++ b/scripts/recordmcount.pl
> @@ -263,7 +263,8 @@ if ($arch eq "x86_64") {
>
>  } elsif ($arch eq "powerpc") {
>      $local_regex = "^[0-9a-fA-F]+\\s+t\\s+(\\.?\\S+)";
> -    $function_regex = "^([0-9a-fA-F]+)\\s+<(\\.?.*?)>:";
> +    # See comment in the sparc64 section for why we use '\w'.
> +    $function_regex = "^([0-9a-fA-F]+)\\s+<(\\.?\\w*?)>:";
>      $mcount_regex = "^\\s*([0-9a-fA-F]+):.*\\s\\.?_mcount\$";
>
>      if ($bits == 64) {
Re: [PATCH v3 13/41] x86: reuse asm-generic/barrier.h
On Sun, 10 Jan 2016, Michael S. Tsirkin wrote:
> As on most architectures, on x86 read_barrier_depends and
> smp_read_barrier_depends are empty. Drop the local definitions and pull
> the generic ones from asm-generic/barrier.h instead: they are identical.
>
> This is in preparation to refactoring this code area.
>
> Signed-off-by: Michael S. Tsirkin
> Acked-by: Arnd Bergmann

Reviewed-by: Thomas Gleixner
Re: [PATCH v3 27/41] x86: define __smp_xxx
On Sun, 10 Jan 2016, Michael S. Tsirkin wrote:
> This defines __smp_xxx barriers for x86,
> for use by virtualization.
>
> smp_xxx barriers are removed as they are
> defined correctly by asm-generic/barrier.h
>
> Signed-off-by: Michael S. Tsirkin
> Acked-by: Arnd Bergmann

Reviewed-by: Thomas Gleixner
RE: cxl: Fix DSI misses when the context owning task exits
From: Michael Ellerman
> Sent: 11 January 2016 09:14
> On Tue, 2015-24-11 at 10:56:18 UTC, Vaibhav Jain wrote:
> > Presently when a user-space process issues the CXL_IOCTL_START_WORK ioctl
> > we store the pid of the current task_struct and use it to get a pointer
> > to the mm_struct of the process while processing page or segment faults
> > from the capi card. However, this causes issues when the thread that
> > originally issued the start-work ioctl exits, in which case the stored
> > pid is no longer valid and the cxl driver is unable to handle faults, as
> > the mm_struct corresponding to the process is no longer accessible.
> >
> > This patch fixes this issue by using the mm_struct of the next alive
> > task in the thread group. This is done by iterating over all the tasks
> > in the thread group starting from the thread group leader and calling
> > get_task_mm on each one of them. When a valid mm_struct is obtained the
> > pid of the associated task is stored in the context, replacing the
> > exiting one for handling future faults.

I don't even claim to understand the linux model for handling process
address maps, nor what the cxl driver is doing, but the above looks
more than dodgy.

David
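To make the approach under discussion concrete, here is a simplified user-space model of "walk the thread group and take the mm of the first task that still has one". It is a sketch under assumptions: `struct sk_task`, `struct sk_mm`, `sk_find_alive_mm()` and `sk_demo_pid()` are hypothetical stand-ins, and the `users++` only models the reference that get_task_mm() takes. It says nothing about locking or lifetime, which is where the concern raised above lies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for mm_struct and task_struct. */
struct sk_mm { int users; };
struct sk_task { int pid; struct sk_mm *mm; };	/* mm == NULL models an exited thread */

/*
 * Walk the thread group (leader first) and return the mm of the first
 * task that still has one, recording that task's pid in *ctx_pid for
 * later faults. Returns NULL if every thread has exited.
 */
static struct sk_mm *sk_find_alive_mm(struct sk_task *group, size_t n, int *ctx_pid)
{
	size_t i;

	assert(group != NULL && ctx_pid != NULL);
	for (i = 0; i < n; i++) {
		if (group[i].mm != NULL) {
			group[i].mm->users++;	/* models the reference taken by get_task_mm() */
			*ctx_pid = group[i].pid;
			return group[i].mm;
		}
	}
	return NULL;
}

/* Leader (pid 100) and one sibling have exited; pid 102 is still alive. */
static int sk_demo_pid(void)
{
	struct sk_mm mm = { 1 };
	struct sk_task group[3] = { { 100, NULL }, { 101, NULL }, { 102, &mm } };
	int ctx_pid = 100;	/* pid stored at start-work time */

	if (sk_find_alive_mm(group, 3, &ctx_pid) == NULL)
		return -1;
	return ctx_pid;
}
```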
Re: [PATCH v3 00/41] arch: barrier cleanup + barriers for virt
On Sun, Jan 10, 2016 at 04:16:22PM +0200, Michael S. Tsirkin wrote:
> I parked this in vhost tree for now, though the inclusion of patch 1 from tip
> creates a merge conflict - but one that is trivial to resolve.
>
> So I intend to just merge it all through my tree, including the
> duplicate patch, and assume conflict will be resolved.
>
> I would really appreciate some feedback on arch bits (especially the x86 bits),
> and acks for merging this through the vhost tree.

Thanks for doing this, looks good to me.

Acked-by: Peter Zijlstra (Intel)
Re: powerpc/powernv: Remove misleading comment in pci.c
On Fri, 2016-08-01 at 05:16:47 UTC, Russell Currey wrote:
> PCI in powernv now supports quite a bit more than p5ioc2, so remove the
> outdated comment.
>
> Signed-off-by: Russell Currey
> Acked-by: Stewart Smith

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/b0eab5b29a55fd9f31fad28df5

cheers
Re: powerpc: Implement save_stack_trace_regs() to enable kprobe stack tracing (was [RFC] ppc: Implement save_stack_trace_regs())
On Mon, 2016-11-01 at 03:30:31 UTC, Michael Ellerman wrote:
> On Fri, 2016-01-08 at 17:50 -0500, Steven Rostedt wrote:
>
> > Are you going to take this, or do you want me to?
>
> Sorry, yep I'll take it.
>
> I trimmed the change log a bit, final version below.
>
> powerpc: Implement save_stack_trace_regs() to enable kprobe stack tracing
>
> It has come to my attention that kprobe event stack tracing does not
> work on powerpc. You can see with the following:
>
>  # cd /sys/kernel/debug/tracing
>  # echo stacktrace > trace_options
>  # echo 'p kfree' > kprobe_events
>  # echo 1 > events/kprobes/enable
>
> Will print the following warning:
>
>  save_stack_trace_regs() not implemented yet.
>
> Although save_stack_trace() (which normal event stack traces use) is
> implemented, save_stack_trace_regs() which kprobe events use is not.
> This is a cheap attempt to implement that function.
>
> Note, this may have issues if a task tries to get a stack trace from
> another task with its regs, because it just passes in "current" to
> save_context_stack(). But this does solve the issue with stack tracing
> kprobe events.
>
> Reported-by: Chunyu Hu
> Signed-off-by: Steven Rostedt
> Signed-off-by: Michael Ellerman

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/35de3b1aa16842214e0cd7c603

cheers
Re: powerpc: Add HWCAP bits for Power9
On Mon, 2016-11-01 at 02:59:04 UTC, Michael Ellerman wrote:
> In order to support Power9 we need two new HWCAP bits. We are merging
> these ahead of the cputable entry so that glibc can start referring to
> them.
>
> Signed-off-by: Michael Ellerman

Applied to powerpc next.

https://git.kernel.org/powerpc/c/e708c24cd01ce80b1609d8bacc

cheers
Re: platforms/powernv: Fix update of NVLink DMA mask
On Fri, 2016-08-01 at 00:35:09 UTC, Alistair Popple wrote:
> The emulated NVLink PCI devices share the same IODA2 TCE tables but
> only support a single TVT (instead of the normal two for PCI
> devices). This requires the kernel to manually replace windows with
> either the bypass or non-bypass window depending on what the driver
> has requested.
>
> Unfortunately an incorrect optimisation was made in
> pnv_pci_ioda_dma_set_mask() which caused updating of some NPU device
> PEs to be skipped in certain configurations due to an incorrect
> assumption that a NULL peer PE in the array indicated there were no
> more peers present. This patch fixes the problem by ensuring all peer
> PEs are updated.
>
> Signed-off-by: Alistair Popple

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/419dbd5e1ff0e45a6e1d28c1f7

cheers
Re: [next] powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff
On Sun, 2016-10-01 at 00:54:59 UTC, Hugh Dickins wrote:
> Swapoff after swapping hangs on the G5, when CONFIG_CHECKPOINT_RESTORE=y
> but CONFIG_MEM_SOFT_DIRTY is not set. That's because the non-zero
> _PAGE_SWP_SOFT_DIRTY bit, added by CONFIG_HAVE_ARCH_SOFT_DIRTY=y, is not
> discounted when CONFIG_MEM_SOFT_DIRTY is not set: so swap ptes cannot be
> recognized.
>
> (I suspect that the peculiar dependence of HAVE_ARCH_SOFT_DIRTY on
> CHECKPOINT_RESTORE in arch/powerpc/Kconfig comes from an incomplete
> attempt to solve this problem.)
>
> It's true that the relationship between CONFIG_HAVE_ARCH_SOFT_DIRTY
> and CONFIG_MEM_SOFT_DIRTY is too confusing, and it's true that swapoff
> should be made more robust; but nevertheless, fix up the powerpc ifdefs
> as x86_64 and s390 (which met the same problem) have them, defining the
> bits as 0 if CONFIG_MEM_SOFT_DIRTY is not set.
>
> Signed-off-by: Hugh Dickins
> Reviewed-by: Cyrill Gorcunov
> Acked-by: Laurent Dufour

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/2f10f1a7884e97a68e52c4b6f7

cheers
Re: [V2] mm/powerpc: Fix _PAGE_PTE breaking swapoff
On Mon, 2016-11-01 at 15:49:34 UTC, "Aneesh Kumar K.V" wrote:
> Core kernel expects swp_entry_t to consist of
> only swap type and swap offset. We should not leak pte bits to
> swp_entry_t. This breaks swapoff, which uses the swap type and offset
> to build a swp_entry_t and later compares that to the swp_entry_t
> obtained from the linux page table pte. Leaking pte bits to swp_entry_t
> breaks that comparison and results in us looping in try_to_unuse.
>
> The stack trace can be anywhere below try_to_unuse() in mm/swapfile.c,
> since swapoff is circling around and around that function, reading from
> each used swap block into a page, then trying to find where that page
> belongs, looking at every non-file pte of every mm that ever swapped.
>
> Reported-by: Hugh Dickins
> Suggested-by: Hugh Dickins
> Signed-off-by: Aneesh Kumar K.V
> Acked-by: Hugh Dickins

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/44734f23de2465c3c0d39e4a16

cheers
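The failure mode described in the changelog can be modelled in a few lines of user-space C. This is a sketch under assumptions: every `sk_*` name and every bit position is hypothetical and chosen only for illustration, with `SK_PTE_FLAG` standing in for a pte-only bit such as _PAGE_PTE. It shows why an exact comparison fails once such a bit leaks into the entry, and why masking it out during conversion repairs the comparison.

```c
#include <assert.h>
#include <stdint.h>

/* All bit positions below are hypothetical, chosen only for illustration. */
#define SK_PTE_FLAG     0x1ull	/* stands in for a pte-only bit */
#define SK_TYPE_SHIFT   8
#define SK_TYPE_MASK    0x1full
#define SK_OFFSET_SHIFT 16

typedef struct { uint64_t val; } sk_swp_entry_t;

/* What swapoff does: build an entry from swap type and offset alone. */
static sk_swp_entry_t sk_swp_entry(uint64_t type, uint64_t offset)
{
	sk_swp_entry_t e;

	assert(type <= SK_TYPE_MASK);
	e.val = (type << SK_TYPE_SHIFT) | (offset << SK_OFFSET_SHIFT);
	return e;
}

/* A swap pte as stored in the page table: the entry plus the pte-only flag. */
static uint64_t sk_swp_entry_to_pte(sk_swp_entry_t e)
{
	return e.val | SK_PTE_FLAG;
}

/* Leaky conversion: the pte-only flag ends up inside the swp_entry_t. */
static sk_swp_entry_t sk_pte_to_swp_entry_leaky(uint64_t pte)
{
	sk_swp_entry_t e = { pte };
	return e;
}

/* Fixed conversion: strip pte-only bits before building the entry. */
static sk_swp_entry_t sk_pte_to_swp_entry(uint64_t pte)
{
	sk_swp_entry_t e = { pte & ~SK_PTE_FLAG };
	return e;
}

/* The exact comparison a swapoff-style scan depends on. */
static int sk_same_entry(sk_swp_entry_t a, sk_swp_entry_t b)
{
	return a.val == b.val;
}
```

In this model a scan that compares entries built by `sk_swp_entry()` against entries produced by the leaky conversion never finds a match, which corresponds to the endless loop in try_to_unuse() described above.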
Re: linux-next: build failure after merge of the powerpc tree
On Thu, 2016-07-01 at 08:16:13 UTC, Stephen Rothwell wrote:
> Hi all,
>
> After merging the powerpc tree, today's linux-next build (powerpc64
> allnoconfig) failed like this:
>
> arch/powerpc/mm/hash_utils_64.c: In function 'get_paca_psize':
> arch/powerpc/mm/hash_utils_64.c:869:19: error: 'struct paca_struct' has no member named 'context'
>   return get_paca()->context.user_psize;
>                    ^
> arch/powerpc/mm/hash_utils_64.c:870:1: error: control reaches end of non-void function [-Werror=return-type]
>  }
>  ^
>
> Caused by commit
>
>   2fc251a8dda5 ("powerpc: Copy only required pieces of the mm_context_t to the paca")
>
> This build has CONFIG_PPC_MM_SLICES not set ...
>
> I have applied the following patch for today:
>
> From: Stephen Rothwell
> Date: Thu, 7 Jan 2016 19:07:18 +1100
> Subject: [PATCH] powerpc: restore the user_psize member of the mm_context_t in the paca
>
> It is used when CONFIG_PPC_MM_SLICES is not set.
>
> Fixes: 2fc251a8dda5 ("powerpc: Copy only required pieces of the mm_context_t to the paca")
> Signed-off-by: Stephen Rothwell

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/c33e54fafacaf83b3e345aae0e

cheers
Re: cxl: Enable PCI device ID for future IBM CXL adapter
On Mon, 2015-07-12 at 22:03:32 UTC, Uma Krishnan wrote:
> Add support for future IBM Coherent Accelerator (CXL) device
> with ID of 0x0601.
>
> Signed-off-by: Uma Krishnan
> Reviewed-by: Matthew R. Ochs

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/68adb7bfd66504e97364651fb7

cheers
Re: [2/2] powerpc/powernv: Reserve PE#0 on NPU
On Mon, 2016-11-01 at 05:53:50 UTC, Alistair Popple wrote:
> P8+ hardware reports all errors on PE#0. This patch ensures PE#0 is
> not assigned to NPU devices so that it can be used for EEH.
>
> Signed-off-by: Alistair Popple

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/08f48f3234a79bca86c2283a16

cheers
Re: [v2,2/2] cxl: use -Werror only with CONFIG_PPC_WERROR
On Fri, 2016-08-01 at 18:30:10 UTC, Brian Norris wrote:
> Some developers really like to have -Werror enabled for their code, as
> it helps to ensure warning free code. Others don't want -Werror, as it
> (for example) can cause problems when newer (or older) compilers have
> different sets of warnings, or new warnings can appear just when turning
> up the warning level (e.g., make W=1 or W=2). Thus, it seems prudent to
> have the use of -Werror be configurable.
>
> It so happens that cxl is only built on PowerPC, and PowerPC already
> has a nice set of Kconfig options for this, under CONFIG_PPC_WERROR. So
> let's use that, and the world is a happy place again! (Note that
> PPC_WERROR defaults to =y, so the common case compile should still be
> enforcing -Werror.)
>
> Fixes: d3d73f4b38a8 ("cxl: Compile with -Werror")
> Signed-off-by: Brian Norris

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/57f7c3932516b9f7908d9b0a24

cheers
Re: [v2,1/2] cxl: fix build for GCC 4.6.x
On Fri, 2016-08-01 at 18:30:09 UTC, Brian Norris wrote:
> GCC 4.6.3 does not support -Wno-unused-const-variable. Instead, use the
> kbuild infrastructure that checks if this option exists.
>
> Fixes: 2cd55c68c0a4 ("cxl: Fix build failure due to -Wunused-variable behaviour change")
> Suggested-by: Michal Marek
> Suggested-by: Arnd Bergmann
> Signed-off-by: Brian Norris

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/aa09545589ceeff884421d8eb3

cheers
Re: [1/2] powerpc/powernv: Change NPU PE# assignment
On Mon, 2016-11-01 at 05:53:49 UTC, Alistair Popple wrote:
> The P8+ hardware supports four partitionable endpoints (PEs); however,
> the hardware reports all errors as occurring on PE#0. This means we
> need to reserve this PE for error handling (EEH) and not assign it to
> an NPU device, implying that some devices will need to share PEs.
>
> This patch changes the PE assignment for NPU devices such that NPU
> devices which connect to the same GPU are assigned to the same
> PE#.
>
> Signed-off-by: Alistair Popple

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/b521549a09ddfac3bed38e2611

cheers
[PATCH 2/2] powerpc/module: Handle R_PPC64_ENTRY relocations
From: Ulrich Weigand

GCC 6 will include changes to generated code with -mcmodel=large,
which is used to build kernel modules on powerpc64le. This was
necessary because the large model is supposed to allow arbitrary
sizes and locations of the code and data sections, but the ELFv2
global entry point prolog still made the unconditional assumption
that the TOC associated with any particular function can be found
within 2 GB of the function entry point:

func:
	addis r2,r12,(.TOC.-func)@ha
	addi  r2,r2,(.TOC.-func)@l
	.localentry func, .-func

To remove this assumption, GCC will now generate instead this global
entry point prolog sequence when using -mcmodel=large:

	.quad .TOC.-func
func:
	.reloc ., R_PPC64_ENTRY
	ld    r2, -8(r12)
	add   r2, r2, r12
	.localentry func, .-func

The new .reloc triggers an optimization in the linker that will
replace this new prolog with the original code (see above) if the
linker determines that the distance between .TOC. and func is in
range after all.

Since this new relocation is now present in module object files,
the kernel module loader is required to handle them too. This
patch adds support for the new relocation and implements the
same optimization done by the GNU linker.

Cc: sta...@vger.kernel.org
Signed-off-by: Ulrich Weigand
Signed-off-by: Michael Ellerman
---
 arch/powerpc/include/uapi/asm/elf.h |  2 ++
 arch/powerpc/kernel/module_64.c     | 27 +++
 2 files changed, 29 insertions(+)

diff --git a/arch/powerpc/include/uapi/asm/elf.h b/arch/powerpc/include/uapi/asm/elf.h
index 59dad113897b..c2d21d11c2d2 100644
--- a/arch/powerpc/include/uapi/asm/elf.h
+++ b/arch/powerpc/include/uapi/asm/elf.h
@@ -295,6 +295,8 @@ do { \
 #define R_PPC64_TLSLD		108
 #define R_PPC64_TOCSAVE		109
 
+#define R_PPC64_ENTRY		118
+
 #define R_PPC64_REL16		249
 #define R_PPC64_REL16_LO	250
 #define R_PPC64_REL16_HI	251
diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
index 68384514506b..59663af9315f 100644
--- a/arch/powerpc/kernel/module_64.c
+++ b/arch/powerpc/kernel/module_64.c
@@ -635,6 +635,33 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
 			 */
 			break;
 
+		case R_PPC64_ENTRY:
+			/*
+			 * Optimize ELFv2 large code model entry point if
+			 * the TOC is within 2GB range of current location.
+			 */
+			value = my_r2(sechdrs, me) - (unsigned long)location;
+			if (value + 0x80008000 > 0xffffffff)
+				break;
+			/*
+			 * Check for the large code model prolog sequence:
+			 *	ld r2, ...(r12)
+			 *	add r2, r2, r12
+			 */
+			if ((((uint32_t *)location)[0] & ~0xfffc)
+			    != 0xe84c0000)
+				break;
+			if (((uint32_t *)location)[1] != 0x7c426214)
+				break;
+			/*
+			 * If found, replace it with:
+			 *	addis r2, r12, (.TOC.-func)@ha
+			 *	addi r2, r2, (.TOC.-func)@l
+			 */
+			((uint32_t *)location)[0] = 0x3c4c0000 + PPC_HA(value);
+			((uint32_t *)location)[1] = 0x38420000 + PPC_LO(value);
+			break;
+
 		case R_PPC64_REL16_HA:
 			/* Subtract location pointer */
 			value -= (unsigned long)location;
-- 
2.5.0
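The prolog rewrite described in the changelog can be exercised outside the kernel with a small stand-alone model. This is a sketch under assumptions, not the loader code: the `sk_*` and `SK_*` names are hypothetical, and the instruction templates are worked out here from the Power ISA encodings of ld, add, addis and addi with the registers named in the changelog, with the immediate fields left as zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* High-adjusted and low 16-bit halves of a displacement (the @ha/@l split). */
#define SK_HA(v) ((((v) + 0x8000) >> 16) & 0xffff)
#define SK_LO(v) ((v) & 0xffff)

/* Instruction templates with zero immediates (derived from the ISA encodings). */
#define SK_LD_R2_R12     0xe84c0000u	/* ld    r2, <disp>(r12) */
#define SK_ADD_R2_R2_R12 0x7c426214u	/* add   r2, r2, r12     */
#define SK_ADDIS_R2_R12  0x3c4c0000u	/* addis r2, r12, <imm>  */
#define SK_ADDI_R2_R2    0x38420000u	/* addi  r2, r2, <imm>   */

/*
 * insn points at the two-instruction prolog; toc_minus_func is the
 * distance from the entry point to the TOC pointer. Returns 1 if the
 * prolog was rewritten, 0 if it was left alone.
 */
static int sk_optimize_entry(uint32_t *insn, int64_t toc_minus_func)
{
	uint64_t value = (uint64_t)toc_minus_func;

	assert(insn != NULL);
	/* addis+addi reach [-0x80008000, 0x7fff7fff]; outside that, keep ld/add. */
	if (value + 0x80008000ull > 0xffffffffull)
		return 0;
	/* Only touch the exact large-model sequence; ignore the ld displacement bits. */
	if ((insn[0] & ~0xfffcu) != SK_LD_R2_R12)
		return 0;
	if (insn[1] != SK_ADD_R2_R2_R12)
		return 0;
	insn[0] = SK_ADDIS_R2_R12 + (uint32_t)SK_HA(value);
	insn[1] = SK_ADDI_R2_R2 + (uint32_t)SK_LO(value);
	return 1;
}

/* Run the rewrite on a fresh "ld r2,-8(r12); add r2,r2,r12" pair and pack both words. */
static uint64_t sk_demo(int64_t toc_minus_func)
{
	uint32_t insn[2] = { SK_LD_R2_R12 | 0xfff8u, SK_ADD_R2_R2_R12 };

	sk_optimize_entry(insn, toc_minus_func);
	return ((uint64_t)insn[0] << 32) | insn[1];
}
```

For a distance of 0x12348000 the high-adjusted half is 0x1235 and the low half is 0x8000 (read as -32768 by addi), so 0x12350000 - 0x8000 lands back on 0x12348000. A distance of 4 GB is out of range and the ld/add pair is left untouched.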
[PATCH 1/2] scripts/recordmcount.pl: support data in text section on powerpc
From: Ulrich Weigand

If a text section starts out with a data blob before the first
function start label, disassembly parsing done in recordmcount.pl
gets confused on powerpc, leading to creation of corrupted module
objects.

This was not a problem so far since the compiler would never create
such text sections. However, this has changed with a recent change
in GCC 6 to support distances of > 2GB between a function and its
associated TOC in the ELFv2 ABI, exposing this problem.

There is already code in recordmcount.pl to handle such data blobs
on the sparc64 platform. This patch uses the same method to handle
those on powerpc as well.

Cc: sta...@vger.kernel.org
Signed-off-by: Ulrich Weigand
Signed-off-by: Michael Ellerman
---
 scripts/recordmcount.pl | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Steve can we get an ack for this one, to go via powerpc?

cheers

diff --git a/scripts/recordmcount.pl b/scripts/recordmcount.pl
index 826470d7f000..96e2486a6fc4 100755
--- a/scripts/recordmcount.pl
+++ b/scripts/recordmcount.pl
@@ -263,7 +263,8 @@ if ($arch eq "x86_64") {
 
 } elsif ($arch eq "powerpc") {
     $local_regex = "^[0-9a-fA-F]+\\s+t\\s+(\\.?\\S+)";
-    $function_regex = "^([0-9a-fA-F]+)\\s+<(\\.?.*?)>:";
+    # See comment in the sparc64 section for why we use '\w'.
+    $function_regex = "^([0-9a-fA-F]+)\\s+<(\\.?\\w*?)>:";
     $mcount_regex = "^\\s*([0-9a-fA-F]+):.*\\s\\.?_mcount\$";
 
     if ($bits == 64) {
-- 
2.5.0
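The effect of narrowing the pattern from '.*?' to '\w*?' can be tried out with POSIX regular expressions in C. This is a sketch under assumptions: the patterns are POSIX ERE translations of the Perl ones (`[[:alnum:]_]` standing in for `\w`), `sk_matches()` is a hypothetical helper, and the sample lines assume that objdump labels a leading data blob relative to the following symbol, as in `<my_func-0x8>`; the sample symbol names are made up.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* POSIX ERE spellings of the old and new function_regex. */
#define SK_OLD_RE "^([0-9a-fA-F]+)[[:space:]]+<(\\.?.*)>:"
#define SK_NEW_RE "^([0-9a-fA-F]+)[[:space:]]+<(\\.?[[:alnum:]_]*)>:"

/* Returns 1 if line matches pattern, 0 if not, -1 if the pattern does not compile. */
static int sk_matches(const char *pattern, const char *line)
{
	regex_t re;
	int rc;

	assert(pattern != NULL && line != NULL);
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return -1;
	rc = regexec(&re, line, 0, NULL, 0);
	regfree(&re);
	return rc == 0;
}
```

With these patterns a plain label such as "0000000000000008 <my_func>:" matches both, while "0000000000000000 <my_func-0x8>:" matches only the old pattern, so with the new pattern the blob is no longer taken for a function start.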
Re: [V3] powerpc/powernv: Add a kmsg_dumper that flushes console output on panic
On Tue, 2016-01-12 at 15:17 +1100, Russell Currey wrote:
> On Tue, 2016-01-12 at 14:44 +1100, Stewart Smith wrote:
> > Michael Ellerman writes:
> > > On Fri, 2015-27-11 at 06:23:07 UTC, Russell Currey wrote:
> > > > On BMC machines, console output is controlled by the OPAL firmware and is
> > > > only flushed when its pollers are called. When the kernel is in a panic
> > > > state, it no longer calls these pollers and thus console output does not
> > > > completely flush, causing some output from the panic to be lost.
> > > >
> > > > Output is only actually lost when the kernel is configured to not power off
> > > > or reboot after panic (i.e. CONFIG_PANIC_TIMEOUT is set to 0) since OPAL
> > > > flushes the console buffer as part of its power down routines. Before this
> > > > patch, however, only partial output would be printed during the timeout
> > > > wait.
> > > >
> > > > This patch adds a new kmsg_dumper which gets called at panic time to ensure
> > > > panic output is not lost. It accomplishes this by calling OPAL_CONSOLE_FLUSH
> > > > in the OPAL API, and if that is not available, the pollers are called enough
> > > > times to (hopefully) completely flush the buffer.
> > > >
> > > > The flushing mechanism will only affect output printed at and before the
> > > > kmsg_dump call in kernel/panic.c:panic(). As such, the "end Kernel panic"
> > > > message may still be truncated as follows:
> > > >
> > > > > Call Trace:
> > > > > [c00f1f603b00] [c08e9458] dump_stack+0x90/0xbc (unreliable)
> > > > > [c00f1f603b30] [c08e7e78] panic+0xf8/0x2c4
> > > > > [c00f1f603bc0] [c0be4860] mount_block_root+0x288/0x33c
> > > > > [c00f1f603c80] [c0be4d14] prepare_namespace+0x1f4/0x254
> > > > > [c00f1f603d00] [c0be43e8] kernel_init_freeable+0x318/0x350
> > > > > [c00f1f603dc0] [c000bd74] kernel_init+0x24/0x130
> > > > > [c00f1f603e30] [c00095b0] ret_from_kernel_thread+0x5c/0xac
> > > > > ---[ end Kernel panic - not
> > > >
> > > > This functionality is implemented as a kmsg_dumper as it seems to be the
> > > > most sensible way to introduce platform-specific functionality to the
> > > > panic function.
> > > >
> > > > Signed-off-by: Russell Currey
> > > > Reviewed-by: Andrew Donnellan
> > >
> > > Applied to powerpc next, thanks.
> > >
> > > https://git.kernel.org/powerpc/c/affddff69c55eb68969448f35f
> >
> > The firmware interface changed slightly since this kernel patch[1], it
> > added a parameter to OPAL_CONSOLE_FLUSH which accepted the terminal
> > number to flush, theoretically allowing this to be plumbed into TTY
> > layer or something too.
> >
> > So, we'll either have to update this patch or replace it with an updated
> > one.
> >
> > [1] i'm pushing the accepted skiboot patch now.
>
> I'm working on an updated kernel patch to use the new parameter and additional
> return values, so I suppose it's up to mpe whether or not this patch gets
> merged now and another gets sent later to amend it, or if this patch gets
> reverted in next and I can send a V4 adding the new stuff.

Doh.

I'd rather not revert it, unless we have to.

Basically we're passing junk in r3, which skiboot is expecting to be the
terminal number. So running the current kernel code on the updated skiboot
shouldn't crash and burn, it just won't actually work the way it's supposed
to.

So my preference would be just an incremental patch ASAP to fix the kernel
to do the right thing with the new interface.

cheers
Re: [PATCH v3 2/2] powerpc: tracing: don't trace hcalls on offline CPUs
On Mon, 2015-12-14 at 23:18 +0300, Denis Kirjanov wrote:
> ./drmgr -c cpu -a -r gives the following warning:
>
> [ 2327.035563]
> RCU used illegally from offline CPU!
> rcu_scheduler_active = 1, debug_locks = 1
> [ 2327.035564] no locks held by swapper/12/0.
> [ 2327.035565]
> stack backtrace:
> [ 2327.035567] CPU: 12 PID: 0 Comm: swapper/12 Tainted: G S 4.3.0-rc3-00060-g353169a #5
> [ 2327.035568] Call Trace:
> [ 2327.035573] [c001d62578e0] [c08977fc] .dump_stack+0x98/0xd4 (unreliable)
> [ 2327.035577] [c001d6257960] [c0109bd8] .lockdep_rcu_suspicious+0x108/0x170
> [ 2327.035580] [c001d62579f0] [c006a1d0] .__trace_hcall_exit+0x2b0/0x2c0
> [ 2327.035584] [c001d6257ab0] [c006a2e8] plpar_hcall_norets_trace+0x70/0x8c
> [ 2327.035588] [c001d6257b20] [c0067a14] .icp_hv_set_cpu_priority+0x54/0xc0
> [ 2327.035592] [c001d6257ba0] [c0066c5c] .xics_teardown_cpu+0x5c/0xa0
> [ 2327.035595] [c001d6257c20] [c00747ac] .pseries_mach_cpu_die+0x6c/0x320
> [ 2327.035598] [c001d6257cd0] [c00439cc] .cpu_die+0x3c/0x60
> [ 2327.035602] [c001d6257d40] [c00183d8] .arch_cpu_idle_dead+0x28/0x40
> [ 2327.035606] [c001d6257db0] [c00ff1dc] .cpu_startup_entry+0x4fc/0x560
> [ 2327.035610] [c001d6257ed0] [c0043728] .start_secondary+0x328/0x360
> [ 2327.035614] [c001d6257f90] [c0008a6c] start_secondary_prolog+0x10/0x14
> [ 2327.035620] cpu 12 (hwid 12) Ready to die...
> [ 2327.144463] cpu 13 (hwid 13) Ready to die...
> [ 2327.294180] cpu 14 (hwid 14) Ready to die...
> [ 2327.403599] cpu 15 (hwid 15) Ready to die...
>
> Make the hypervisor tracepoints conditional
> by using TRACE_EVENT_FN_COND
>
> Signed-off-by: Denis Kirjanov

Acked-by: Michael Ellerman

cheers
RE: [PATCH v2] perf/probe: Search both .eh_frame and .debug_frame sections for probe location
Hi Hemant,

>From: Hemant Kumar [mailto:hem...@linux.vnet.ibm.com]
>
>perf probe through debuginfo__find_probes() in util/probe-finder.c
>checks for the functions' frame descriptions in either .eh_frame section
>of an ELF or the .debug_frame. The check is based on whether either one
>of these sections is present. Depending on distro, toolchain defaults,
>architecture, build flags, etc., CFI might be found in either .eh_frame
>and/or .debug_frame. Sometimes, it may happen that .eh_frame, even if
>present, may not be complete and may miss some descriptions. Therefore,
>to be sure, to find the CFI covering an address we will always have to
>investigate both if available.

OK, so we'd better check both cfi's.

[...]
>+/* Find probe points from debuginfo */
>+static int debuginfo__find_probes(struct debuginfo *dbg,
>+				  struct probe_finder *pf)
>+{
>+	int ret = 0;
>+
>+#if _ELFUTILS_PREREQ(0, 142)
>+	Elf *elf;
>+	GElf_Ehdr ehdr;
>+	GElf_Shdr shdr;
>+
>+	if (pf->cfi_eh || pf->cfi_dbg)
>+		return debuginfo__find_probe_location(dbg, pf);
>+
>+	/* Get the call frame information from this dwarf */
>+	elf = dwarf_getelf(dbg->dbg);
>+	if (elf == NULL)
>+		return -EINVAL;
>+
>+	if (gelf_getehdr(elf, &ehdr) == NULL)
>+		return -EINVAL;
>+
>+	if (elf_section_by_name(elf, &ehdr, &shdr, ".eh_frame", NULL) &&
>+	    shdr.sh_type == SHT_PROGBITS) {
>+		pf->cfi_eh = dwarf_getcfi_elf(elf);
>+	} else {
>+		pf->cfi_dbg = dwarf_getcfi(dbg->dbg);
>+	}

Hmm, if you want to check both of those cfi's, don't we have to do the
following?

	if (elf_section_by_name(elf, &ehdr, &shdr, ".eh_frame", NULL) &&
	    shdr.sh_type == SHT_PROGBITS)
		pf->cfi_eh = dwarf_getcfi_elf(elf);

	pf->cfi_dbg = dwarf_getcfi(dbg->dbg);

Then, both of pf->cfi_* will be filled (if the elf has ".eh_frame").

Thanks!
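The difference between the two lookups can be shown with a stand-alone model that does not need elfutils. This is a sketch under assumptions: `struct sk_cfi`, `struct sk_finder`, `struct sk_object` and the `sk_*` functions are hypothetical stand-ins for the real Dwarf_CFI handling, and a NULL pointer models an absent section.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real code works with Dwarf_CFI from elfutils. */
struct sk_cfi { const char *origin; };
struct sk_finder { struct sk_cfi *cfi_eh; struct sk_cfi *cfi_dbg; };
struct sk_object { struct sk_cfi *eh_frame; struct sk_cfi *debug_frame; };	/* NULL = absent */

/* Either/or lookup, as in the hunk under review. */
static void sk_load_cfi_either(struct sk_finder *pf, const struct sk_object *obj)
{
	assert(pf != NULL && obj != NULL);
	if (obj->eh_frame)
		pf->cfi_eh = obj->eh_frame;
	else
		pf->cfi_dbg = obj->debug_frame;
}

/* Lookup as suggested in the reply: .debug_frame is always consulted too. */
static void sk_load_cfi_both(struct sk_finder *pf, const struct sk_object *obj)
{
	assert(pf != NULL && obj != NULL);
	if (obj->eh_frame)
		pf->cfi_eh = obj->eh_frame;
	pf->cfi_dbg = obj->debug_frame;
}

/* For an object carrying both sections, count how many CFI sources get loaded. */
static int sk_sources_loaded(int use_both)
{
	struct sk_cfi eh = { ".eh_frame" }, dbg = { ".debug_frame" };
	struct sk_object obj = { &eh, &dbg };
	struct sk_finder pf = { NULL, NULL };

	if (use_both)
		sk_load_cfi_both(&pf, &obj);
	else
		sk_load_cfi_either(&pf, &obj);
	return (pf.cfi_eh != NULL) + (pf.cfi_dbg != NULL);
}
```

With the either/or form an object that carries both sections ends up with only the .eh_frame CFI, so an address described only in .debug_frame would still be missed.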
Re: [v3,11/41] mips: reuse asm-generic/barrier.h
On Tue, Jan 12, 2016 at 11:40:12AM +0100, Peter Zijlstra wrote:
> On Tue, Jan 12, 2016 at 11:25:55AM +0100, Peter Zijlstra wrote:
> > On Tue, Jan 12, 2016 at 10:27:11AM +0100, Peter Zijlstra wrote:
> > > 2) the changelog _completely_ fails to explain the sync 0x11 and sync
> > > 0x12 semantics nor does it provide a publicly accessible link to
> > > documentation that does.
> >
> > Ralf pointed me at: https://imgtec.com/mips/architectures/mips64/
> >
> > > 3) it really should have explained what you did with
> > > smp_llsc_mb/smp_mb__before_llsc() in _detail_.
> >
> > And reading the MIPS64 v6.04 instruction set manual, I think 0x11/0x12
> > are _NOT_ transitive and therefore cannot be used to implement the
> > smp_mb__{before,after} stuff.
> >
> > That is, in MIPS speak, those SYNC types are Ordering Barriers, not
> > Completion Barriers. They need not be globally performed.
>
> Which if true; and I know Will has some questions here; would also mean
> that you 'cannot' use the ACQUIRE/RELEASE barriers for your locks as was
> recently suggested by David Daney.

The issue I have with the SYNC description in the text above is that it
describes the single CPU (program order) and the dual-CPU (confusingly
named global order) cases, but then doesn't generalise any further. That
means we can't sensibly reason about transitivity properties when a third
agent is involved. For example, the WRC+sync+addr test:

P0:
Wx = 1

P1:
Rx == 1
SYNC
Wy = 1

P2:
Ry == 1
Rx == 0

I can't find anything to forbid that, given the text. The main problem
is having the SYNC on P1 affect the write by P0.

> That is, currently all architectures -- with exception of PPC -- have
> RCsc locks, but using these non-transitive things will get you RCpc
> locks.
>
> So yes, MIPS can go RCpc for its locks and share the burden of pain with
> PPC, but that needs to be a very conscious decision.

I think it's much worse than RCpc, given my interpretation of the
wording.

Will
Re: [PATCH RESEND v4 4/4] cpufreq: powernv: Add sysfs attributes to show throttle stats
Hi Shilpa,

On Tue, Jan 12, 2016 at 04:24:27AM -0600, Shilpasri G Bhat wrote:
> +static inline int get_chip_index(struct kobject *kobj)

Probably have "get_chip_index(int id)". See the reason below.

> +{
> +	int i, id;
> +
> +	i = kstrtoint(kobj->name + 4, 0, &id);
> +	if (i)
> +		return i;
> +
> +	for (i = 0; i < nr_chips; i++)
> +		if (chips[i].id == id)
> +			return i;

This pattern to obtain a chip index from the chip id is repeated in
multiple places inside this file. Might be worthwhile to move this to a
helper function, i.e. get_chip_index(id)!

> +	return -EINVAL;
> +}
> +
> +static ssize_t throttle_freq_show(struct kobject *kobj,
> +				  struct kobj_attribute *attr, char *buf)
> +{
> +	int i, count = 0, id;
> +

We obtain the id from kobj here and then obtain the index from id via
the function below.

> +	id = get_chip_index(kobj);
> +	if (id < 0)
> +		return id;
> +
> +	for (i = 0; i < powernv_pstate_info.nr_pstates; i++)
> +		count += sprintf(&buf[count], "%d %d\n",
> +				 powernv_freqs[i].frequency,
> +				 chips[id].pstate_stat[i]);
> +
> +	return count;
> +}
> +
> +static struct kobj_attribute attr_throttle_frequencies =
> +__ATTR(throttle_frequencies, 0444, throttle_freq_show, NULL);
> +

--
Thanks and Regards
gautham.
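A minimal user-space sketch of the helper suggested in this review, taking the chip id directly rather than a kobject. It is not the driver's code: struct chip is trimmed to the one field the lookup needs, and the contents of chips[] are made-up sample ids so that the function can be exercised on its own.

```c
#include <errno.h>
#include <stddef.h>

/* Trimmed-down stand-in for the driver's struct chip. */
struct chip {
	unsigned int id;
};

/* Hypothetical sample data; the driver fills this in init_chip_info(). */
static struct chip chips[] = { { 0 }, { 1 }, { 16 }, { 17 } };
static const int nr_chips = sizeof(chips) / sizeof(chips[0]);

/* Map a chip id to its index in chips[], or -EINVAL if it is unknown. */
static int get_chip_index(unsigned int id)
{
	int i;

	for (i = 0; i < nr_chips; i++)
		if (chips[i].id == id)
			return i;

	return -EINVAL;
}
```

With the id as the parameter, the sysfs show routines would parse the id from the kobject name once and then share this lookup with the throttle-check path, which already has the id in hand.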
Re: [PATCH RESEND v4 1/4] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path
On 01/12/2016 03:54 PM, Shilpasri G Bhat wrote:
> cpu_to_chip_id() does a DT walk through to find out the chip id by taking a
> contended device tree lock. This adds an unnecessary overhead in a hot-path.
> So instead of cpu_to_chip_id() use PIR of the cpu to find the chip id.
>
> Reported-by: Anton Blanchard
> Signed-off-by: Shilpasri G Bhat
> ---
>  drivers/cpufreq/powernv-cpufreq.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
> index cb50138..597a084 100644
> --- a/drivers/cpufreq/powernv-cpufreq.c
> +++ b/drivers/cpufreq/powernv-cpufreq.c
> @@ -39,6 +39,7 @@
>  #define PMSR_PSAFE_ENABLE	(1UL << 30)
>  #define PMSR_SPR_EM_DISABLE	(1UL << 31)
>  #define PMSR_MAX(x)		((x >> 32) & 0xFF)
> +#define pir_to_chip_id(pir)	(((pir) >> 7) & 0x3f)

Since this is platform specific and true only for power8, this is not the
right place to put it. Either you can move this to arch/powerpc or you can
maintain a cpu to chip map within the driver.

>
>  static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
>  static bool rebooting, throttled, occ_reset;
> @@ -312,13 +313,14 @@ static inline unsigned int get_nominal_index(void)
>  static void powernv_cpufreq_throttle_check(void *data)
>  {
>  	unsigned int cpu = smp_processor_id();
> +	unsigned int chip_id = pir_to_chip_id(hard_smp_processor_id());
>  	unsigned long pmsr;
>  	int pmsr_pmax, i;
>
>  	pmsr = get_pmspr(SPRN_PMSR);
>
>  	for (i = 0; i < nr_chips; i++)
> -		if (chips[i].id == cpu_to_chip_id(cpu))
> +		if (chips[i].id == chip_id)
>  			break;
>
>  	/* Check for Pmax Capping */
> @@ -558,7 +560,8 @@ static int init_chip_info(void)
>  	unsigned int prev_chip_id = UINT_MAX;
>
>  	for_each_possible_cpu(cpu) {
> -		unsigned int id = cpu_to_chip_id(cpu);
> +		unsigned int id =
> +			pir_to_chip_id(get_hard_smp_processor_id(cpu));
>
>  		if (prev_chip_id != id) {
>  			prev_chip_id = id;
>

Thanks,
Shreyas
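The second option mentioned in this review, a cpu to chip map kept inside the driver, could look roughly like the user-space sketch below. Only the pir_to_chip_id() macro comes from the patch; NR_CPUS_SKETCH, cpu_pir[], cpu_chip_map[], init_cpu_chip_map() and cached_chip_id() are hypothetical, and the PIR values are invented sample data rather than anything read from hardware.

```c
#include <stdint.h>

/* POWER8-specific layout taken from the patch: chip id is PIR bits 7..12. */
#define pir_to_chip_id(pir)	(((pir) >> 7) & 0x3f)

#define NR_CPUS_SKETCH 8	/* hypothetical CPU count for this sketch */

/* Hypothetical per-CPU PIR values; the kernel gets these from hardware. */
static const uint32_t cpu_pir[NR_CPUS_SKETCH] = {
	0x000, 0x008, 0x080, 0x088, 0x800, 0x808, 0x880, 0x888,
};

static unsigned int cpu_chip_map[NR_CPUS_SKETCH];

/* Fill the map once at init time so the hot path is a plain array read. */
static void init_cpu_chip_map(void)
{
	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		cpu_chip_map[cpu] = pir_to_chip_id(cpu_pir[cpu]);
}

/* Hot-path lookup: no device tree walk and no lock. */
static unsigned int cached_chip_id(int cpu)
{
	return cpu_chip_map[cpu];
}
```

The design trade-off is that the POWER8-specific bit layout is then touched in exactly one place at init time, so it could later be swapped for cpu_to_chip_id() there without putting the DT walk back into the throttle check.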
Re: [PATCH RESEND v4 3/4] cpufreq: powernv: Add a trace print for the throttle event
Hi Shilpa, Just saw this resend! On Tue, Jan 12, 2016 at 04:24:26AM -0600, Shilpasri G Bhat wrote: > Record the throttle event with a trace print replacing the printk, > except for events like throttling below nominal and occ reset > event which print a warning message. > > Signed-off-by: Shilpasri G Bhat > --- [..snip..] > > -static void powernv_cpufreq_throttle_check(void *data) > +static void powernv_cpufreq_check_pmax(void) ^^^ This function only contains code moved from powernv_cpufreq_throttle_check with pr_crit/pr_warns replaced by trace_powernv_throttle. Furthermore, it is not called from any other place. Given that the original function was ~60 lines do we really need to split it into two separate functions ? If yes, could it be an inline function ? > { > unsigned int cpu = smp_processor_id(); > unsigned int chip_id = pir_to_chip_id(hard_smp_processor_id()); > - unsigned long pmsr; > int pmsr_pmax, i; > > - pmsr = get_pmspr(SPRN_PMSR); > + pmsr_pmax = (s8)PMSR_MAX(get_pmspr(SPRN_PMSR)); > > for (i = 0; i < nr_chips; i++) > if (chips[i].id == chip_id) > break; > > - /* Check for Pmax Capping */ > - pmsr_pmax = (s8)PMSR_MAX(pmsr); > if (pmsr_pmax != powernv_pstate_info.max) { > if (chips[i].throttled) > - goto next; > + return; > + > chips[i].throttled = true; > if (pmsr_pmax < powernv_pstate_info.nominal) > - pr_crit("CPU %d on Chip %u has Pmax reduced below > nominal frequency (%d < %d)\n", > - cpu, chips[i].id, pmsr_pmax, > - powernv_pstate_info.nominal); > - else > - pr_info("CPU %d on Chip %u has Pmax reduced below turbo > frequency (%d < %d)\n", > - cpu, chips[i].id, pmsr_pmax, > - powernv_pstate_info.max); > + pr_warn_once("CPU %d on Chip %u has Pmax reduced below > nominal frequency (%d < %d)\n", > + cpu, chips[i].id, pmsr_pmax, > + powernv_pstate_info.nominal); > + > + trace_powernv_throttle(chips[i].id, > +throttle_reason[chips[i].throt_reason], > +pmsr_pmax); > } else if (chips[i].throttled) { > chips[i].throttled = false; > - pr_info("CPU %d on 
Chip %u has Pmax restored to %d\n", cpu, > - chips[i].id, pmsr_pmax); > + trace_powernv_throttle(chips[i].id, > +throttle_reason[chips[i].throt_reason], > +pmsr_pmax); > } > +} > + > +static void powernv_cpufreq_throttle_check(void *data) > +{ > + unsigned long pmsr; > + > + pmsr = get_pmspr(SPRN_PMSR); > + > + /* Check for Pmax Capping */ > + powernv_cpufreq_check_pmax(); If you want to retain this function, you could pass pmsr as an argument instead of computing it afresh in powernv_cpufreq_check_pmax() > /* Check if Psafe_mode_active is set in PMSR. */ > -next: > if (pmsr & PMSR_PSAFE_ENABLE) { > throttled = true; > pr_info("Pstate set to safe frequency\n"); -- Thanks and Regards gautham. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
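As a rough illustration of the last suggestion, passing the already-read PMSR value down instead of reading the register twice, here is a user-space sketch. The bit position comes from the PMSR_MAX() macro in the patch (with the argument parenthesised here); pmsr_to_pmax() and pmax_is_capped() are hypothetical helpers, and the int8_t cast plays the role of the (s8) cast in the driver, relying on the usual two's-complement narrowing that GCC and Clang provide.

```c
#include <stdint.h>

/* From the patch: Pmax lives in bits 32..39 of the PMSR. */
#define PMSR_MAX(x)	(((x) >> 32) & 0xFF)

/*
 * Hypothetical helper: decode the signed Pmax pstate from a PMSR value
 * that the caller has already read.
 */
static int pmsr_to_pmax(uint64_t pmsr)
{
	return (int8_t)PMSR_MAX(pmsr);
}

/* Returns 1 if Pmax differs from the expected maximum pstate, else 0. */
static int pmax_is_capped(uint64_t pmsr, int expected_max)
{
	return pmsr_to_pmax(pmsr) != expected_max;
}
```

A caller shaped like powernv_cpufreq_throttle_check() would read the register once and hand the same value to both the Pmax check and the Psafe check.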
Re: [PATCH v3 2/2] powerpc: tracing: don't trace hcalls on offline CPUs
On 12/23/15, Steven Rostedt wrote: > On Mon, 14 Dec 2015 23:18:06 +0300 > Denis Kirjanov wrote: > >> ./drmgr -c cpu -a -r gives the following warning: >> >> [ 2327.035563] >> RCU used illegally from offline CPU! >> rcu_scheduler_active = 1, debug_locks = 1 >> [ 2327.035564] no locks held by swapper/12/0. >> [ 2327.035565] >> stack backtrace: >> [ 2327.035567] CPU: 12 PID: 0 Comm: swapper/12 Tainted: G S >> 4.3.0-rc3-00060-g353169a #5 >> [ 2327.035568] Call Trace: >> [ 2327.035573] [c001d62578e0] [c08977fc] .dump_stack+0x98/0xd4 >> (unreliable) >> [ 2327.035577] [c001d6257960] [c0109bd8] >> .lockdep_rcu_suspicious+0x108/0x170 >> [ 2327.035580] [c001d62579f0] [c006a1d0] >> .__trace_hcall_exit+0x2b0/0x2c0 >> [ 2327.035584] [c001d6257ab0] [c006a2e8] >> plpar_hcall_norets_trace+0x70/0x8c >> [ 2327.035588] [c001d6257b20] [c0067a14] >> .icp_hv_set_cpu_priority+0x54/0xc0 >> [ 2327.035592] [c001d6257ba0] [c0066c5c] >> .xics_teardown_cpu+0x5c/0xa0 >> [ 2327.035595] [c001d6257c20] [c00747ac] >> .pseries_mach_cpu_die+0x6c/0x320 >> [ 2327.035598] [c001d6257cd0] [c00439cc] .cpu_die+0x3c/0x60 >> [ 2327.035602] [c001d6257d40] [c00183d8] >> .arch_cpu_idle_dead+0x28/0x40 >> [ 2327.035606] [c001d6257db0] [c00ff1dc] >> .cpu_startup_entry+0x4fc/0x560 >> [ 2327.035610] [c001d6257ed0] [c0043728] >> .start_secondary+0x328/0x360 >> [ 2327.035614] [c001d6257f90] [c0008a6c] >> start_secondary_prolog+0x10/0x14 >> [ 2327.035620] cpu 12 (hwid 12) Ready to die... >> [ 2327.144463] cpu 13 (hwid 13) Ready to die... >> [ 2327.294180] cpu 14 (hwid 14) Ready to die... >> [ 2327.403599] cpu 15 (hwid 15) Ready to die... >> >> Make the hypervisor tracepoints conditional >> by using TRACE_EVENT_FN_COND >> >> Signed-off-by: Denis Kirjanov > > I applied the first patch, but I need Acks from the powerpc maintainers > to take this one. > Hi Michael, Could you please put your ack to the second patch. Thanks! 
> -- Steve > > >> >> v2 changes: >> - Use raw_smp_processor_id as suggested by BenH >> since since hcalls can be called from preemptable sections >> >> v3 changes: >> - Fix the subject line >> --- >> arch/powerpc/include/asm/trace.h | 8 ++-- >> 1 file changed, 6 insertions(+), 2 deletions(-) >> >> diff --git a/arch/powerpc/include/asm/trace.h >> b/arch/powerpc/include/asm/trace.h >> index 8e86b48..32e36b1 100644 >> --- a/arch/powerpc/include/asm/trace.h >> +++ b/arch/powerpc/include/asm/trace.h >> @@ -57,12 +57,14 @@ DEFINE_EVENT(ppc64_interrupt_class, >> timer_interrupt_exit, >> extern void hcall_tracepoint_regfunc(void); >> extern void hcall_tracepoint_unregfunc(void); >> >> -TRACE_EVENT_FN(hcall_entry, >> +TRACE_EVENT_FN_COND(hcall_entry, >> >> TP_PROTO(unsigned long opcode, unsigned long *args), >> >> TP_ARGS(opcode, args), >> >> +TP_CONDITION(cpu_online(raw_smp_processor_id())), >> + >> TP_STRUCT__entry( >> __field(unsigned long, opcode) >> ), >> @@ -76,13 +78,15 @@ TRACE_EVENT_FN(hcall_entry, >> hcall_tracepoint_regfunc, hcall_tracepoint_unregfunc >> ); >> >> -TRACE_EVENT_FN(hcall_exit, >> +TRACE_EVENT_FN_COND(hcall_exit, >> >> TP_PROTO(unsigned long opcode, unsigned long retval, >> unsigned long *retbuf), >> >> TP_ARGS(opcode, retval, retbuf), >> >> +TP_CONDITION(cpu_online(raw_smp_processor_id())), >> + >> TP_STRUCT__entry( >> __field(unsigned long, opcode) >> __field(unsigned long, retval) > > ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [v3,11/41] mips: reuse asm-generic/barrier.h
On Tue, Jan 12, 2016 at 11:25:55AM +0100, Peter Zijlstra wrote:
> On Tue, Jan 12, 2016 at 10:27:11AM +0100, Peter Zijlstra wrote:
> > 2) the changelog _completely_ fails to explain the sync 0x11 and sync
> > 0x12 semantics nor does it provide a publicly accessible link to
> > documentation that does.
>
> Ralf pointed me at: https://imgtec.com/mips/architectures/mips64/
>
> > 3) it really should have explained what you did with
> > smp_llsc_mb/smp_mb__before_llsc() in _detail_.
>
> And reading the MIPS64 v6.04 instruction set manual, I think 0x11/0x12
> are _NOT_ transitive and therefore cannot be used to implement the
> smp_mb__{before,after} stuff.
>
> That is, in MIPS speak, those SYNC types are Ordering Barriers, not
> Completion Barriers. They need not be globally performed.

Which if true; and I know Will has some questions here; would also mean
that you 'cannot' use the ACQUIRE/RELEASE barriers for your locks as was
recently suggested by David Daney.

That is, currently all architectures -- with exception of PPC -- have
RCsc locks, but using these non-transitive things will get you RCpc
locks.

So yes, MIPS can go RCpc for its locks and share the burden of pain with
PPC, but that needs to be a very conscious decision.
[PATCH RESEND v4 4/4] cpufreq: powernv: Add sysfs attributes to show throttle stats
Create sysfs attributes to export throttle information in /sys/devices/system/cpu/cpufreq/chipN. The newly added sysfs files are as follows: 1)/sys/devices/system/cpu/cpufreq/chip0/throttle_frequencies This gives the throttle stats for each of the available frequencies. The throttle stat of a frequency is the total number of times the max frequency is reduced to that frequency. # cat /sys/devices/system/cpu/cpufreq/chip0/throttle_frequencies 4023000 0 399 0 3956000 1 3923000 0 389 0 3857000 2 3823000 0 379 0 3757000 2 3724000 1 369 1 ... 2)/sys/devices/system/cpu/cpufreq/chip0/throttle_reasons This directory contains throttle reason files. Each file gives the total number of times the max frequency is throttled, except for 'throttle_reset', which gives the total number of times the max frequency is unthrottled after being throttled. # cd /sys/devices/system/cpu/cpufreq/chip0/throttle_reasons # cat cpu_over_temperature 7 # cat occ_reset 0 # cat over_current 0 # cat power_cap 0 # cat power_supply_failure 0 # cat throttle_reset 7 3)/sys/devices/system/cpu/cpufreq/chip0/throttle_stat This gives the total number of events of max frequency throttling to lower frequencies in the turbo range of frequencies and the sub-turbo(at and below nominal) range of frequencies. # cat /sys/devices/system/cpu/cpufreq/chip0/throttle_stat turbo 7 sub-turbo 0 Signed-off-by: Shilpasri G Bhat --- Changes from v3: - Seperate the patch to contain only the throttle sysfs attribute changes. - Add helper inline function get_chip_index() Changes from v2: - Fixed kbuild test warning. drivers/cpufreq/powernv-cpufreq.c:609:2: warning: ignoring return value of 'kstrtoint', declared with attribute warn_unused_result [-Wunused-result] Changes from v1: - Added a kobject to struct chip - Grouped the throttle reasons under a separate attribute_group and exported each reason as individual file. 
- Moved the sysfs files from /sys/devices/system/node/nodeN to /sys/devices/system/cpu/cpufreq/chipN - As suggested by Paul Clarke replaced 'Nominal' with 'sub-turbo'. - Modified the commit message. drivers/cpufreq/powernv-cpufreq.c | 177 +- 1 file changed, 173 insertions(+), 4 deletions(-) diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c index c98a6e7..40ccd9d 100644 --- a/drivers/cpufreq/powernv-cpufreq.c +++ b/drivers/cpufreq/powernv-cpufreq.c @@ -54,6 +54,16 @@ static const char * const throttle_reason[] = { "OCC Reset" }; +enum throt_reason_type { + NO_THROTTLE = 0, + POWERCAP, + CPU_OVERTEMP, + POWER_SUPPLY_FAILURE, + OVERCURRENT, + OCC_RESET_THROTTLE, + OCC_MAX_REASON +}; + static struct chip { unsigned int id; bool throttled; @@ -61,6 +71,11 @@ static struct chip { u8 throt_reason; cpumask_t mask; struct work_struct throttle; + int throt_turbo; + int throt_nominal; + int reason[OCC_MAX_REASON]; + int *pstate_stat; + struct kobject *kobj; } *chips; static int nr_chips; @@ -195,6 +210,113 @@ static struct freq_attr *powernv_cpu_freq_attr[] = { NULL, }; +static inline int get_chip_index(struct kobject *kobj) +{ + int i, id; + + i = kstrtoint(kobj->name + 4, 0, &id); + if (i) + return i; + + for (i = 0; i < nr_chips; i++) + if (chips[i].id == id) + return i; + return -EINVAL; +} + +static ssize_t throttle_freq_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf) +{ + int i, count = 0, id; + + id = get_chip_index(kobj); + if (id < 0) + return id; + + for (i = 0; i < powernv_pstate_info.nr_pstates; i++) + count += sprintf(&buf[count], "%d %d\n", + powernv_freqs[i].frequency, + chips[id].pstate_stat[i]); + + return count; +} + +static struct kobj_attribute attr_throttle_frequencies = +__ATTR(throttle_frequencies, 0444, throttle_freq_show, NULL); + +static ssize_t throttle_stat_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf) +{ + int id, count = 0; + + id = get_chip_index(kobj); + if (id < 0) + 
return id; + + count += sprintf(&buf[count], "turbo %d\n", chips[id].throt_turbo); + count += sprintf(&buf[count], "sub-turbo %d\n", + chips[id].throt_nominal); + + return count; +} + +static struct kobj_attribute attr_throttle_stat = +__ATTR(throttle_stat, 0444, throttle_stat_show, NULL); + +#define define_throttle_reason_attr(attr_name, val) \ +static ssize_t attr_name##_show(struct kobject *kobj,\ + st
[PATCH RESEND v4 2/4] cpufreq: powernv/tracing: Add powernv_throttle tracepoint
This patch adds the powernv_throttle tracepoint to trace the CPU frequency
throttling event, which is used by the powernv-cpufreq driver in POWER8.

Signed-off-by: Shilpasri G Bhat
CC: Ingo Molnar
CC: Steven Rostedt
---
No changes from v2 and v3.

 include/trace/events/power.h | 22 ++
 kernel/trace/power-traces.c  |  1 +
 2 files changed, 23 insertions(+)

diff --git a/include/trace/events/power.h b/include/trace/events/power.h
index 284244e..19e5030 100644
--- a/include/trace/events/power.h
+++ b/include/trace/events/power.h
@@ -38,6 +38,28 @@ DEFINE_EVENT(cpu, cpu_idle,
 	TP_ARGS(state, cpu_id)
 );

+TRACE_EVENT(powernv_throttle,
+
+	TP_PROTO(int chip_id, const char *reason, int pmax),
+
+	TP_ARGS(chip_id, reason, pmax),
+
+	TP_STRUCT__entry(
+		__field(int, chip_id)
+		__string(reason, reason)
+		__field(int, pmax)
+	),
+
+	TP_fast_assign(
+		__entry->chip_id = chip_id;
+		__assign_str(reason, reason);
+		__entry->pmax = pmax;
+	),
+
+	TP_printk("Chip %d Pmax %d %s", __entry->chip_id,
+		  __entry->pmax, __get_str(reason))
+);
+
 TRACE_EVENT(pstate_sample,

 	TP_PROTO(u32 core_busy,
diff --git a/kernel/trace/power-traces.c b/kernel/trace/power-traces.c
index eb4220a..81b8745 100644
--- a/kernel/trace/power-traces.c
+++ b/kernel/trace/power-traces.c
@@ -15,4 +15,5 @@
 EXPORT_TRACEPOINT_SYMBOL_GPL(suspend_resume);
 EXPORT_TRACEPOINT_SYMBOL_GPL(cpu_idle);
+EXPORT_TRACEPOINT_SYMBOL_GPL(powernv_throttle);

--
1.9.1
[PATCH RESEND v4 3/4] cpufreq: powernv: Add a trace print for the throttle event
Record the throttle event with a trace print replacing the printk, except for events like throttling below nominal and occ reset event which print a warning message. Signed-off-by: Shilpasri G Bhat --- Changes from v3: - Separate this patch to contain trace_point changes - Move struct chip member 'restore' of type bool above 'mask' to reduce structure padding. No changes from v2. Changes from v1: - As suggested by Paul Clarke replaced char * throttle_reason[][30] by const char * const throttle_reason[]. drivers/cpufreq/powernv-cpufreq.c | 95 --- 1 file changed, 49 insertions(+), 46 deletions(-) diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c index 597a084..c98a6e7 100644 --- a/drivers/cpufreq/powernv-cpufreq.c +++ b/drivers/cpufreq/powernv-cpufreq.c @@ -28,6 +28,7 @@ #include #include #include +#include #include #include @@ -44,12 +45,22 @@ static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1]; static bool rebooting, throttled, occ_reset; +static const char * const throttle_reason[] = { + "No throttling", + "Power Cap", + "Processor Over Temperature", + "Power Supply Failure", + "Over Current", + "OCC Reset" +}; + static struct chip { unsigned int id; bool throttled; + bool restore; + u8 throt_reason; cpumask_t mask; struct work_struct throttle; - bool restore; } *chips; static int nr_chips; @@ -310,41 +321,49 @@ static inline unsigned int get_nominal_index(void) return powernv_pstate_info.max - powernv_pstate_info.nominal; } -static void powernv_cpufreq_throttle_check(void *data) +static void powernv_cpufreq_check_pmax(void) { unsigned int cpu = smp_processor_id(); unsigned int chip_id = pir_to_chip_id(hard_smp_processor_id()); - unsigned long pmsr; int pmsr_pmax, i; - pmsr = get_pmspr(SPRN_PMSR); + pmsr_pmax = (s8)PMSR_MAX(get_pmspr(SPRN_PMSR)); for (i = 0; i < nr_chips; i++) if (chips[i].id == chip_id) break; - /* Check for Pmax Capping */ - pmsr_pmax = (s8)PMSR_MAX(pmsr); if (pmsr_pmax != 
powernv_pstate_info.max) { if (chips[i].throttled) - goto next; + return; + chips[i].throttled = true; if (pmsr_pmax < powernv_pstate_info.nominal) - pr_crit("CPU %d on Chip %u has Pmax reduced below nominal frequency (%d < %d)\n", - cpu, chips[i].id, pmsr_pmax, - powernv_pstate_info.nominal); - else - pr_info("CPU %d on Chip %u has Pmax reduced below turbo frequency (%d < %d)\n", - cpu, chips[i].id, pmsr_pmax, - powernv_pstate_info.max); + pr_warn_once("CPU %d on Chip %u has Pmax reduced below nominal frequency (%d < %d)\n", +cpu, chips[i].id, pmsr_pmax, +powernv_pstate_info.nominal); + + trace_powernv_throttle(chips[i].id, + throttle_reason[chips[i].throt_reason], + pmsr_pmax); } else if (chips[i].throttled) { chips[i].throttled = false; - pr_info("CPU %d on Chip %u has Pmax restored to %d\n", cpu, - chips[i].id, pmsr_pmax); + trace_powernv_throttle(chips[i].id, + throttle_reason[chips[i].throt_reason], + pmsr_pmax); } +} + +static void powernv_cpufreq_throttle_check(void *data) +{ + unsigned long pmsr; + + pmsr = get_pmspr(SPRN_PMSR); + + /* Check for Pmax Capping */ + powernv_cpufreq_check_pmax(); /* Check if Psafe_mode_active is set in PMSR. */ -next: if (pmsr & PMSR_PSAFE_ENABLE) { throttled = true; pr_info("Pstate set to safe frequency\n"); @@ -358,7 +377,7 @@ next: if (throttled) { pr_info("PMSR = %16lx\n", pmsr); - pr_crit("CPU Frequency could be throttled\n"); + pr_warn("CPU Frequency could be throttled\n"); } } @@ -449,15 +468,6 @@ void powernv_cpufreq_work_fn(struct work_struct *work) } } -static char throttle_reason[][30] = { - "No throttling", - "Power Cap", - "Processor Over Temperature", - "Power Supply Failure", - "Over Current", - "OCC Reset" -}; - static int
[PATCH RESEND v4 1/4] cpufreq: powernv: Remove cpu_to_chip_id() from hot-path
cpu_to_chip_id() does a DT walk through to find out the chip id by taking a
contended device tree lock. This adds an unnecessary overhead in a hot-path.
So instead of cpu_to_chip_id() use PIR of the cpu to find the chip id.

Reported-by: Anton Blanchard
Signed-off-by: Shilpasri G Bhat
---
 drivers/cpufreq/powernv-cpufreq.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index cb50138..597a084 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -39,6 +39,7 @@
 #define PMSR_PSAFE_ENABLE	(1UL << 30)
 #define PMSR_SPR_EM_DISABLE	(1UL << 31)
 #define PMSR_MAX(x)		((x >> 32) & 0xFF)
+#define pir_to_chip_id(pir)	(((pir) >> 7) & 0x3f)

 static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
 static bool rebooting, throttled, occ_reset;
@@ -312,13 +313,14 @@ static inline unsigned int get_nominal_index(void)
 static void powernv_cpufreq_throttle_check(void *data)
 {
 	unsigned int cpu = smp_processor_id();
+	unsigned int chip_id = pir_to_chip_id(hard_smp_processor_id());
 	unsigned long pmsr;
 	int pmsr_pmax, i;

 	pmsr = get_pmspr(SPRN_PMSR);

 	for (i = 0; i < nr_chips; i++)
-		if (chips[i].id == cpu_to_chip_id(cpu))
+		if (chips[i].id == chip_id)
 			break;

 	/* Check for Pmax Capping */
@@ -558,7 +560,8 @@ static int init_chip_info(void)
 	unsigned int prev_chip_id = UINT_MAX;

 	for_each_possible_cpu(cpu) {
-		unsigned int id = cpu_to_chip_id(cpu);
+		unsigned int id =
+			pir_to_chip_id(get_hard_smp_processor_id(cpu));

 		if (prev_chip_id != id) {
 			prev_chip_id = id;
--
1.9.1
[PATCH RESEND v4 0/4] cpufreq: powernv: Redesign the presentation of throttle notification
In POWER8, OCC (On-Chip-Controller) can throttle the frequency of the
CPU when the chip crosses its thermal and power limits. Currently,
powernv-cpufreq driver detects and reports this event as a console
message. Some machines may not sustain the max turbo frequency in all
conditions and can be throttled frequently. This can lead to the
flooding of console with throttle messages. So this patchset aims to
redesign the presentation of this event via sysfs counters and
tracepoints.

Patches [2] to [4] will add a perf trace point "power:powernv_throttle"
and sysfs throttle counter stats in /sys/devices/system/cpu/cpufreq/chipN.
Patch [1] solves a bug in powernv_cpufreq_throttle_check(), which calls
into cpu_to_chip_id() in the hot path, which reads the DT every time to
find the chip id.

Resending the patchset as I had cc'ed sta...@vger.kernel.org in the
development cycle and used --in-reply-to to post a new version.

Changes from v3:
- Add a fix to replace cpu_to_chip_id() with simpler PIR shift to obtain
  the chip id.
- Break patch2 into two patches separating the tracepoint and sysfs
  attribute changes.

Changes from v2:
- Fixed kbuild test warning.
  drivers/cpufreq/powernv-cpufreq.c:609:2: warning: ignoring return
  value of 'kstrtoint', declared with attribute warn_unused_result
  [-Wunused-result]

Shilpasri G Bhat (4):
  cpufreq: powernv: Remove cpu_to_chip_id() from hot-path
  cpufreq: powernv/tracing: Add powernv_throttle tracepoint
  cpufreq: powernv: Add a trace print for the throttle event
  cpufreq: powernv: Add sysfs attributes to show throttle stats

 drivers/cpufreq/powernv-cpufreq.c | 279 +++---
 include/trace/events/power.h      |  22 +++
 kernel/trace/power-traces.c       |   1 +
 3 files changed, 250 insertions(+), 52 deletions(-)

--
1.9.1
Re: [v3,11/41] mips: reuse asm-generic/barrier.h
On Tue, Jan 12, 2016 at 10:27:11AM +0100, Peter Zijlstra wrote:
> 2) the changelog _completely_ fails to explain the sync 0x11 and sync
> 0x12 semantics nor does it provide a publicly accessible link to
> documentation that does.

Ralf pointed me at: https://imgtec.com/mips/architectures/mips64/

> 3) it really should have explained what you did with
> smp_llsc_mb/smp_mb__before_llsc() in _detail_.

And reading the MIPS64 v6.04 instruction set manual, I think 0x11/0x12
are _NOT_ transitive and therefore cannot be used to implement the
smp_mb__{before,after} stuff.

That is, in MIPS speak, those SYNC types are Ordering Barriers, not
Completion Barriers. They need not be globally performed.
Re: [v3,11/41] mips: reuse asm-generic/barrier.h
On Tue, Jan 12, 2016 at 10:43:36AM +0200, Michael S. Tsirkin wrote:
> On Mon, Jan 11, 2016 at 05:14:14PM -0800, Leonid Yegoshin wrote:
> > On 01/10/2016 06:18 AM, Michael S. Tsirkin wrote:
> > >On mips dma_rmb, dma_wmb, smp_store_mb, read_barrier_depends,
> > >smp_read_barrier_depends, smp_store_release and smp_load_acquire match
> > >the asm-generic variants exactly. Drop the local definitions and pull in
> > >asm-generic/barrier.h instead.
> > >
> > This statement doesn't fit MIPS barriers variations. Moreover, there is a
> > reason to extend that even more specific, at least for smp_store_release and
> > smp_load_acquire, look into
> >
> > http://patchwork.linux-mips.org/patch/10506/
> >
> > - Leonid.
>
> Fine, but it matches what current code is doing. Since that
> MIPS_LIGHTWEIGHT_SYNC patch didn't go into linux-next yet, do
> you see a problem reworking it on top of this patchset?

That patch is a complete doorstop atm. It needs a lot more work before
it can go anywhere. Don't worry about it.
Re: [v3,11/41] mips: reuse asm-generic/barrier.h
On Mon, Jan 11, 2016 at 05:14:14PM -0800, Leonid Yegoshin wrote:
> This statement doesn't fit MIPS barriers variations. Moreover, there is a
> reason to extend that even more specific, at least for smp_store_release and
> smp_load_acquire, look into
>
> http://patchwork.linux-mips.org/patch/10506/

Dude, that's one horrible patch.

1) you do not make such things selectable; either the hardware needs
   them or it doesn't. If it does you _must_ use them, however unlikely.

2) the changelog _completely_ fails to explain the sync 0x11 and sync
   0x12 semantics nor does it provide a publicly accessible link to
   documentation that does.

3) it really should have explained what you did with
   smp_llsc_mb/smp_mb__before_llsc() in _detail_.

And I agree that ideally it should be split into parts.

Seriously, this is _NOT_ OK.
Re: [v3,11/41] mips: reuse asm-generic/barrier.h
On Mon, Jan 11, 2016 at 05:14:14PM -0800, Leonid Yegoshin wrote:
> On 01/10/2016 06:18 AM, Michael S. Tsirkin wrote:
> >On mips dma_rmb, dma_wmb, smp_store_mb, read_barrier_depends,
> >smp_read_barrier_depends, smp_store_release and smp_load_acquire match
> >the asm-generic variants exactly. Drop the local definitions and pull in
> >asm-generic/barrier.h instead.
> >
> This statement doesn't fit MIPS barriers variations. Moreover, there is a
> reason to extend that even more specific, at least for smp_store_release and
> smp_load_acquire, look into
>
> http://patchwork.linux-mips.org/patch/10506/
>
> - Leonid.

Fine, but it matches what current code is doing. Since that
MIPS_LIGHTWEIGHT_SYNC patch didn't go into linux-next yet, do
you see a problem reworking it on top of this patchset?

--
MST
Re: [RFC PATCH V1 18/33] powerpc/mm: Add helper for update page flags during ioremap
On 1/12/16, Aneesh Kumar K.V wrote:
> They differ between radix and hash. Hence we need a helper
>
> Signed-off-by: Aneesh Kumar K.V
> ---
>  arch/powerpc/include/asm/book3s/32/pgtable.h | 11 +++
>  arch/powerpc/include/asm/book3s/64/hash.h    | 11 +++
>  arch/powerpc/include/asm/nohash/pgtable.h    | 20
>  arch/powerpc/mm/pgtable_64.c                 | 16 +---
>  4 files changed, 43 insertions(+), 15 deletions(-)

Can we put it alone in some common header file?

>
> diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h
> b/arch/powerpc/include/asm/book3s/32/pgtable.h
> index c0898e26ed4a..b53d7504d6f6 100644
> --- a/arch/powerpc/include/asm/book3s/32/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
> @@ -491,6 +491,17 @@ static inline unsigned long gup_pte_filter(int write)
>  		mask |= _PAGE_RW;
>  	return mask;
>  }
> +
> +static inline unsigned long ioremap_prot_flags(unsigned long flags)
> +{
> +	/* writeable implies dirty for kernel addresses */
> +	if (flags & _PAGE_RW)
> +		flags |= _PAGE_DIRTY;
> +
> +	/* we don't want to let _PAGE_USER and _PAGE_EXEC leak out */
> +	flags &= ~(_PAGE_USER | _PAGE_EXEC);
> +	return flags;
> +}
>  #endif /* !__ASSEMBLY__ */
>
>  #endif /* _ASM_POWERPC_BOOK3S_32_PGTABLE_H */
> diff --git a/arch/powerpc/include/asm/book3s/64/hash.h
> b/arch/powerpc/include/asm/book3s/64/hash.h
> index d51709dad729..4f0fdb9a5d19 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash.h
> @@ -592,6 +592,17 @@ static inline unsigned long gup_pte_filter(int write)
>  	return mask;
>  }
>
> +static inline unsigned long ioremap_prot_flags(unsigned long flags)
> +{
> +	/* writeable implies dirty for kernel addresses */
> +	if (flags & _PAGE_RW)
> +		flags |= _PAGE_DIRTY;
> +
> +	/* we don't want to let _PAGE_USER and _PAGE_EXEC leak out */
> +	flags &= ~(_PAGE_USER | _PAGE_EXEC);
> +	return flags;
> +}
> +
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  extern void hpte_do_hugepage_flush(struct mm_struct *mm, unsigned long
> addr,
>  			   pmd_t *pmdp, unsigned long old_pmd);
> diff --git a/arch/powerpc/include/asm/nohash/pgtable.h
> b/arch/powerpc/include/asm/nohash/pgtable.h
> index e4173cb06e5b..8861ec146985 100644
> --- a/arch/powerpc/include/asm/nohash/pgtable.h
> +++ b/arch/powerpc/include/asm/nohash/pgtable.h
> @@ -238,6 +238,26 @@ static inline unsigned long gup_pte_filter(int write)
>  	return mask;
>  }
>
> +static inline unsigned long ioremap_prot_flags(unsigned long flags)
> +{
> +	/* writeable implies dirty for kernel addresses */
> +	if (flags & _PAGE_RW)
> +		flags |= _PAGE_DIRTY;
> +
> +	/* we don't want to let _PAGE_USER and _PAGE_EXEC leak out */
> +	flags &= ~(_PAGE_USER | _PAGE_EXEC);
> +
> +#ifdef _PAGE_BAP_SR
> +	/* _PAGE_USER contains _PAGE_BAP_SR on BookE using the new PTE format
> +	 * which means that we just cleared supervisor access... oops ;-) This
> +	 * restores it
> +	 */
> +	flags |= _PAGE_BAP_SR;
> +#endif
> +
> +	return flags;
> +}
> +
>  #ifdef CONFIG_HUGETLB_PAGE
>  static inline int hugepd_ok(hugepd_t hpd)
>  {
> diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
> index 21a9a171c267..aa8ff4c74563 100644
> --- a/arch/powerpc/mm/pgtable_64.c
> +++ b/arch/powerpc/mm/pgtable_64.c
> @@ -188,21 +188,7 @@ void __iomem * ioremap_prot(phys_addr_t addr, unsigned
> long size,
>  {
>  	void *caller = __builtin_return_address(0);
>
> -	/* writeable implies dirty for kernel addresses */
> -	if (flags & _PAGE_RW)
> -		flags |= _PAGE_DIRTY;
> -
> -	/* we don't want to let _PAGE_USER and _PAGE_EXEC leak out */
> -	flags &= ~(_PAGE_USER | _PAGE_EXEC);
> -
> -#ifdef _PAGE_BAP_SR
> -	/* _PAGE_USER contains _PAGE_BAP_SR on BookE using the new PTE format
> -	 * which means that we just cleared supervisor access... oops ;-) This
> -	 * restores it
> -	 */
> -	flags |= _PAGE_BAP_SR;
> -#endif
> -
> +	flags = ioremap_prot_flags(flags);
>  	if (ppc_md.ioremap)
>  		return ppc_md.ioremap(addr, size, flags, caller);
>  	return __ioremap_caller(addr, size, flags, caller);
> --
> 2.5.0
>
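To make the review question concrete: the three copies of ioremap_prot_flags() in the patch differ only in the _PAGE_BAP_SR fixup, so one possible shape is a single shared helper plus a small per-platform hook. The sketch below is only an illustration under assumptions. The bit values are stand-ins rather than the real PTE encodings, and arch_ioremap_fixup_flags() is a hypothetical name that does not appear in the patch. Note also that the patch description says the flags will differ between radix and hash, which may be the reason the series keeps per-header copies.

```c
#include <assert.h>

/* Stand-in bit values for illustration only (assumption); the real
 * encodings are per-platform and live in the arch headers. */
#define _PAGE_RW	0x001UL
#define _PAGE_DIRTY	0x002UL
#define _PAGE_EXEC	0x004UL
#define _PAGE_BAP_SR	0x010UL	/* supervisor read, BookE-style */
#define _PAGE_BAP_UR	0x020UL	/* user read, BookE-style */
#define _PAGE_USER	(_PAGE_BAP_SR | _PAGE_BAP_UR)

/* Hypothetical per-platform hook: put back bits that the common code
 * cleared but the platform still needs. */
static inline unsigned long arch_ioremap_fixup_flags(unsigned long flags)
{
#ifdef _PAGE_BAP_SR
	/* _PAGE_USER contains _PAGE_BAP_SR here, so restore it */
	flags |= _PAGE_BAP_SR;
#endif
	return flags;
}

/* Common part, written once instead of once per header. */
static inline unsigned long ioremap_prot_flags(unsigned long flags)
{
	/* writeable implies dirty for kernel addresses */
	if (flags & _PAGE_RW)
		flags |= _PAGE_DIRTY;

	/* keep _PAGE_USER and _PAGE_EXEC out of kernel I/O mappings */
	flags &= ~(_PAGE_USER | _PAGE_EXEC);

	flags = arch_ioremap_fixup_flags(flags);
	assert((flags & _PAGE_EXEC) == 0);
	return flags;
}
```

With this split, a platform without _PAGE_BAP_SR would provide a hook that returns flags unchanged, and ioremap_prot() would keep calling ioremap_prot_flags() exactly as in the patch.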
Re: [RFC PATCH V1 14/33] powerpc/mm: Use helper for finding pte bits mapping I/O area
On 1/12/16, Aneesh Kumar K.V wrote:
> We will have different values for hash and radix. Hence we
> cannot use #define constants. Add helper
>
> Signed-off-by: Aneesh Kumar K.V
> ---
>  arch/powerpc/include/asm/book3s/32/pgtable.h | 5 +
>  arch/powerpc/include/asm/book3s/64/hash.h    | 5 +
>  arch/powerpc/include/asm/nohash/pgtable.h    | 5 +
>  arch/powerpc/kernel/isa-bridge.c             | 4 ++--
>  arch/powerpc/kernel/pci_64.c                 | 2 +-
>  arch/powerpc/mm/pgtable_64.c                 | 2 +-
>  6 files changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h
> b/arch/powerpc/include/asm/book3s/32/pgtable.h
> index 3ed3303c1295..77adada2f3b4 100644
> --- a/arch/powerpc/include/asm/book3s/32/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
> @@ -478,6 +478,11 @@ static inline pgprot_t pgprot_writecombine(pgprot_t
> prot)
>  	return pgprot_noncached_wc(prot);
>  }
>
> +static inline unsigned long pte_io_cache_bits(void)
> +{
> +	return _PAGE_NO_CACHE | _PAGE_GUARDED;
> +}

This could just be a plain #define.

> +
>  #endif /* !__ASSEMBLY__ */
>
>  #endif /* _ASM_POWERPC_BOOK3S_32_PGTABLE_H */
> diff --git a/arch/powerpc/include/asm/book3s/64/hash.h
> b/arch/powerpc/include/asm/book3s/64/hash.h
> index ced3aed63af2..1b27c0c8effa 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash.h
> @@ -578,6 +578,11 @@ static inline pgprot_t pgprot_writecombine(pgprot_t
> prot)
>  extern pgprot_t vm_get_page_prot(unsigned long vm_flags);
>  #define vm_get_page_prot vm_get_page_prot
>
> +static inline unsigned long pte_io_cache_bits(void)
> +{
> +	return _PAGE_NO_CACHE | _PAGE_GUARDED;
> +}
> +
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  extern void hpte_do_hugepage_flush(struct mm_struct *mm, unsigned long
> addr,
>  			   pmd_t *pmdp, unsigned long old_pmd);
> diff --git a/arch/powerpc/include/asm/nohash/pgtable.h
> b/arch/powerpc/include/asm/nohash/pgtable.h
> index 11e3767216c0..8c4bb8fda0de 100644
> --- a/arch/powerpc/include/asm/nohash/pgtable.h
> +++ b/arch/powerpc/include/asm/nohash/pgtable.h
> @@ -224,6 +224,11 @@ extern pgprot_t phys_mem_access_prot(struct file *file,
> unsigned long pfn,
>  				     unsigned long size, pgprot_t vma_prot);
>  #define __HAVE_PHYS_MEM_ACCESS_PROT
>
> +static inline unsigned long pte_io_cache_bits(void)
> +{
> +	return _PAGE_NO_CACHE | _PAGE_GUARDED;
> +}
> +
>  #ifdef CONFIG_HUGETLB_PAGE
>  static inline int hugepd_ok(hugepd_t hpd)
>  {
> diff --git a/arch/powerpc/kernel/isa-bridge.c
> b/arch/powerpc/kernel/isa-bridge.c
> index 0f1997097960..d81185f025fa 100644
> --- a/arch/powerpc/kernel/isa-bridge.c
> +++ b/arch/powerpc/kernel/isa-bridge.c
> @@ -109,14 +109,14 @@ static void pci_process_ISA_OF_ranges(struct
> device_node *isa_node,
>  		size = 0x10000;
>
>  	__ioremap_at(phb_io_base_phys, (void *)ISA_IO_BASE,
> -		     size, _PAGE_NO_CACHE|_PAGE_GUARDED);
> +		     size, pte_io_cache_bits());
>  	return;
>
>  inval_range:
>  	printk(KERN_ERR "no ISA IO ranges or unexpected isa range, "
>  	       "mapping 64k\n");
>  	__ioremap_at(phb_io_base_phys, (void *)ISA_IO_BASE,
> -		     0x10000, _PAGE_NO_CACHE|_PAGE_GUARDED);
> +		     0x10000, pte_io_cache_bits());
>  }
>
>
> diff --git a/arch/powerpc/kernel/pci_64.c b/arch/powerpc/kernel/pci_64.c
> index 60bb187cb46a..7fe1dfd214a1 100644
> --- a/arch/powerpc/kernel/pci_64.c
> +++ b/arch/powerpc/kernel/pci_64.c
> @@ -159,7 +159,7 @@ static int pcibios_map_phb_io_space(struct
> pci_controller *hose)
>
>  	/* Establish the mapping */
>  	if (__ioremap_at(phys_page, area->addr, size_page,
> -			 _PAGE_NO_CACHE | _PAGE_GUARDED) == NULL)
> +			 pte_io_cache_bits()) == NULL)
>  		return -ENOMEM;
>
>  	/* Fixup hose IO resource */
> diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
> index e5f600d19326..6d161cec2e32 100644
> --- a/arch/powerpc/mm/pgtable_64.c
> +++ b/arch/powerpc/mm/pgtable_64.c
> @@ -253,7 +253,7 @@ void __iomem * __ioremap(phys_addr_t addr, unsigned long
> size,
>
>  void __iomem * ioremap(phys_addr_t addr, unsigned long size)
>  {
> -	unsigned long flags = _PAGE_NO_CACHE | _PAGE_GUARDED;
> +	unsigned long flags = pte_io_cache_bits();
>  	void *caller = __builtin_return_address(0);
>
>  	if (ppc_md.ioremap)
> --
> 2.5.0
>
[RFC PATCH V1 29/33] powerpc/mm: Hash linux abstraction for THP
Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/include/asm/book3s/64/hash-64k.h | 42 --- arch/powerpc/include/asm/book3s/64/hash.h | 14 +++ arch/powerpc/include/asm/book3s/64/pgtable.h | 154 +- arch/powerpc/mm/pgtable-hash64.c | 58 +- 4 files changed, 198 insertions(+), 70 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h index 8008c9a89416..e697fc528c0a 100644 --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h @@ -190,11 +190,19 @@ static inline int hugepd_ok(hugepd_t hpd) #endif /* CONFIG_HUGETLB_PAGE */ #ifdef CONFIG_TRANSPARENT_HUGEPAGE -extern unsigned long pmd_hugepage_update(struct mm_struct *mm, -unsigned long addr, -pmd_t *pmdp, -unsigned long clr, -unsigned long set); + +extern pmd_t pfn_hlpmd(unsigned long pfn, pgprot_t pgprot); +extern pmd_t mk_hlpmd(struct page *page, pgprot_t pgprot); +extern pmd_t hlpmd_modify(pmd_t pmd, pgprot_t newprot); +extern int hl_has_transparent_hugepage(void); +extern void set_hlpmd_at(struct mm_struct *mm, unsigned long addr, +pmd_t *pmdp, pmd_t pmd); + +extern unsigned long hlpmd_hugepage_update(struct mm_struct *mm, + unsigned long addr, + pmd_t *pmdp, + unsigned long clr, + unsigned long set); static inline char *get_hpte_slot_array(pmd_t *pmdp) { /* @@ -253,51 +261,55 @@ static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array, * that for explicit huge pages. 
* */ -static inline int pmd_trans_huge(pmd_t pmd) +static inline int hlpmd_trans_huge(pmd_t pmd) { return !!((pmd_val(pmd) & (H_PAGE_PTE | H_PAGE_THP_HUGE)) == (H_PAGE_PTE | H_PAGE_THP_HUGE)); } -static inline int pmd_large(pmd_t pmd) +static inline int hlpmd_large(pmd_t pmd) { return !!(pmd_val(pmd) & H_PAGE_PTE); } -static inline pmd_t pmd_mknotpresent(pmd_t pmd) +static inline pmd_t hlpmd_mknotpresent(pmd_t pmd) { return __pmd(pmd_val(pmd) & ~H_PAGE_PRESENT); } -#define __HAVE_ARCH_PMD_SAME -static inline int pmd_same(pmd_t pmd_a, pmd_t pmd_b) +static inline pmd_t hlpmd_mkhuge(pmd_t pmd) +{ + return __pmd(pmd_val(pmd) | (H_PAGE_PTE | H_PAGE_THP_HUGE)); +} + +static inline int hlpmd_same(pmd_t pmd_a, pmd_t pmd_b) { return (((pmd_val(pmd_a) ^ pmd_val(pmd_b)) & ~H_PAGE_HPTEFLAGS) == 0); } -static inline int __pmdp_test_and_clear_young(struct mm_struct *mm, +static inline int __hlpmdp_test_and_clear_young(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp) { unsigned long old; if ((pmd_val(*pmdp) & (H_PAGE_ACCESSED | H_PAGE_HASHPTE)) == 0) return 0; - old = pmd_hugepage_update(mm, addr, pmdp, H_PAGE_ACCESSED, 0); + old = hlpmd_hugepage_update(mm, addr, pmdp, H_PAGE_ACCESSED, 0); return ((old & H_PAGE_ACCESSED) != 0); } -#define __HAVE_ARCH_PMDP_SET_WRPROTECT -static inline void pmdp_set_wrprotect(struct mm_struct *mm, unsigned long addr, +static inline void hlpmdp_set_wrprotect(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp) { if ((pmd_val(*pmdp) & H_PAGE_RW) == 0) return; - pmd_hugepage_update(mm, addr, pmdp, H_PAGE_RW, 0); + hlpmd_hugepage_update(mm, addr, pmdp, H_PAGE_RW, 0); } #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ + #endif /* __ASSEMBLY__ */ #endif /* _ASM_POWERPC_BOOK3S_64_HASH_64K_H */ diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h index 20bb9da200c6..f43b26c4d319 100644 --- a/arch/powerpc/include/asm/book3s/64/hash.h +++ b/arch/powerpc/include/asm/book3s/64/hash.h @@ -600,6 +600,20 @@ static 
inline void hpte_do_hugepage_flush(struct mm_struct *mm, } #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ +extern int hlpmdp_set_access_flags(struct vm_area_struct *vma, + unsigned long address, pmd_t *pmdp, + pmd_t entry, int dirty); +extern int hlpmdp_test_and_clear_young(struct vm_area_struct *vma, + unsigned long address, pmd_t *pmdp); +extern pmd_t hlpmdp_huge_get_and_clear(struct mm_struct *mm, + unsigned long addr, pmd_t *pmdp); +extern pmd_t hlpmdp_collapse_flush(struct vm_area_struct *vma, + unsigned long address, pmd_t *pmdp); +extern void hlpgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp, +