[PATCH] powerpc/pci: Fix endian bug in fixed PHB numbering
The recent commit 63a72284b159 ("powerpc/pci: Assign fixed PHB number based on device-tree properties"), added code to read a 64-bit property from the device tree, and if not found read a 32-bit property (reg). There was a bug in the 32-bit case, on big endian machines, due to the use of the 64-bit value to read the 32-bit property. The cast of &prop means we end up writing to the high 32-bit of prop, leaving the low 32-bits containing whatever junk was on the stack. If that junk value was non-zero, and < MAX_PHBS, we would end up using it as the PHB id. This results in users seeing what appear to be random PHB ids. Fix it by reading into a u32 property and then assigning that to the u64 value, letting the CPU do the correct conversions for us. Fixes: 63a72284b159 ("powerpc/pci: Assign fixed PHB number based on device-tree properties") Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/pci-common.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c index a5c0153ede37..7fdf324d5b51 100644 --- a/arch/powerpc/kernel/pci-common.c +++ b/arch/powerpc/kernel/pci-common.c @@ -78,6 +78,7 @@ EXPORT_SYMBOL(get_pci_dma_ops); static int get_phb_number(struct device_node *dn) { int ret, phb_id = -1; + u32 prop_32; u64 prop; /* @@ -86,8 +87,10 @@ static int get_phb_number(struct device_node *dn) * reading "ibm,opal-phbid", only present in OPAL environment. */ ret = of_property_read_u64(dn, "ibm,opal-phbid", &prop); - if (ret) - ret = of_property_read_u32_index(dn, "reg", 1, (u32 *)&prop); + if (ret) { + ret = of_property_read_u32_index(dn, "reg", 1, &prop_32); + prop = prop_32; + } if (!ret) phb_id = (int)(prop & (MAX_PHBS - 1)); -- 2.7.4
Re: [PATCH kernel 05/15] powerpc/iommu: Stop using @current in mm_iommu_xxx
On 09/08/16 16:04, Nicholas Piggin wrote: > On Tue, 9 Aug 2016 14:43:00 +1000 > Balbir Singh wrote: > >> On 03/08/16 18:40, Alexey Kardashevskiy wrote: > >>> -long mm_iommu_get(unsigned long ua, unsigned long entries, >>> +long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long >>> entries, >>> struct mm_iommu_table_group_mem_t **pmem) >>> { >>> struct mm_iommu_table_group_mem_t *mem; >>> long i, j, ret = 0, locked_entries = 0; >>> struct page *page = NULL; >>> >>> - if (!current || !current->mm) >>> - return -ESRCH; /* process exited */ >> >> VM_BUG_ON(mm == NULL)? > > >>> @@ -128,10 +129,17 @@ static long tce_iommu_register_pages(struct >>> tce_container *container, >>> ((vaddr + size) < vaddr)) >>> return -EINVAL; >>> >>> - ret = mm_iommu_get(vaddr, entries, &mem); >>> + if (!container->mm) { >>> + if (!current->mm) >>> + return -ESRCH; /* process exited */ >> >> You may even want to check for PF_EXITING and ignore those tasks? > > > These are related to some of the questions I had about the patch. > > But I think it makes sense just to take this approach as a minimal > bug fix without changing logic too much or adding BUG_ONs, and then > if we we can consider how iommu takes references to mm and uses it > (if anybody finds the time). > Agreed Balbir
Re: [PATCH kernel 05/15] powerpc/iommu: Stop using @current in mm_iommu_xxx
On Tue, 9 Aug 2016 14:43:00 +1000 Balbir Singh wrote: > On 03/08/16 18:40, Alexey Kardashevskiy wrote: > > -long mm_iommu_get(unsigned long ua, unsigned long entries, > > +long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long > > entries, > > struct mm_iommu_table_group_mem_t **pmem) > > { > > struct mm_iommu_table_group_mem_t *mem; > > long i, j, ret = 0, locked_entries = 0; > > struct page *page = NULL; > > > > - if (!current || !current->mm) > > - return -ESRCH; /* process exited */ > > VM_BUG_ON(mm == NULL)? > > @@ -128,10 +129,17 @@ static long tce_iommu_register_pages(struct > > tce_container *container, > > ((vaddr + size) < vaddr)) > > return -EINVAL; > > > > - ret = mm_iommu_get(vaddr, entries, &mem); > > + if (!container->mm) { > > + if (!current->mm) > > + return -ESRCH; /* process exited */ > > You may even want to check for PF_EXITING and ignore those tasks? These are related to some of the questions I had about the patch. But I think it makes sense just to take this approach as a minimal bug fix without changing logic too much or adding BUG_ONs, and then if we we can consider how iommu takes references to mm and uses it (if anybody finds the time). Thanks, Nick
Re: [PATCH] powerpc: rebuild vdsos correctly
On Tue, 09 Aug 2016 14:49:25 +1000 Michael Ellerman wrote: > Nicholas Piggin writes: > > > diff --git a/arch/powerpc/kernel/vdso32/Makefile > > b/arch/powerpc/kernel/vdso32/Makefile > > index cbabd14..ae1f245 100644 > > --- a/arch/powerpc/kernel/vdso32/Makefile > > +++ b/arch/powerpc/kernel/vdso32/Makefile > > @@ -39,14 +39,14 @@ $(obj)/%.so: $(obj)/%.so.dbg FORCE > > $(call if_changed,objcopy) > > > > # assembly rules for the .S files > > -$(obj-vdso32): %.o: %.S > > +$(obj-vdso32): %.o: %.S FORCE > > $(call if_changed_dep,vdso32as) > > > > # actual build commands > > quiet_cmd_vdso32ld = VDSO32L $@ > > - cmd_vdso32ld = $(CROSS32CC) $(c_flags) -Wl,-T $^ -o $@ > > + cmd_vdso32ld = $(CROSS32CC) $(c_flags) -o $@ -Wl,-T$(filter > > %.lds,$^) $(filter %.o,$^) > > quiet_cmd_vdso32as = VDSO32A $@ > > - cmd_vdso32as = $(CROSS32CC) $(a_flags) -c -o $@ $< > > + cmd_vdso32as = $(CROSS32CC) $(a_flags) -o $@ -c $< > > Are the two changes above required, they aren't obviously related. The vdso32ld change is required because otherwise "FORCE" gets put on the end of the command line. vdso32as... I think is not required. Want a new version without it?
Re: [PATCH kernel 05/15] powerpc/iommu: Stop using @current in mm_iommu_xxx
On 05/08/16 17:00, Michael Ellerman wrote: > Alexey Kardashevskiy writes: > >> In some situations the userspace memory context may live longer than >> the userspace process itself so if we need to do proper memory context >> cleanup, we better cache @mm and use it later when the process is gone >> (@current or @current->mm are NULL). >> >> This changes mm_iommu_xxx API to receive mm_struct instead of using one >> from @current. >> >> This is needed by the following patch to do proper cleanup in time. >> This depends on "powerpc/powernv/ioda: Fix endianness when reading TCEs" >> to do proper cleanup via tce_iommu_clear() patch. >> >> To keep API consistent, this replaces mm_context_t with mm_struct; >> we stick to mm_struct as mm_iommu_adjust_locked_vm() helper needs >> access to &mm->mmap_sem. >> >> This should cause no behavioral change. > > Is this a theoretical bug, or do we hit it in practice? Actual bug. > > In other words, should I merge this as a fix for 4.8, or can it wait for > 4.9 with the rest of the series? Assuming this does not have "rb" or "ab" from anyone familiar with IOMMU on powernv, this has to wait :-/ > >> Signed-off-by: Alexey Kardashevskiy >> --- >> arch/powerpc/include/asm/mmu_context.h | 20 +++-- >> arch/powerpc/kernel/setup-common.c | 2 +- >> arch/powerpc/mm/mmu_context_book3s64.c | 4 +-- >> arch/powerpc/mm/mmu_context_iommu.c| 54 >> ++ > >> drivers/vfio/vfio_iommu_spapr_tce.c| 41 -- > > I'd need an ACK from Alex for that part. -- Alexey
4.8.0-rc1: Invalid memory access at $SRR0: 0140f86c $SRR1: 00003030
Hi, while trying to upgrade this PowerBook G4 from 4.7-rc7 to 4.8-rc1, it's unable to boot the Yaboot (v1.3.16 from Debian/stable) boot loader: copying OF device tree... Building dt strings... Building dt stucture... Device tree strings 0x01e72000 -> 0x01e73615 Device tree struct 0x01e74000 -> 0x01e7e000 Quiescing Open Firmware... Bootng Linux via __start()... Invalid memory access at $SRR0: 0140f86c $SRR1: 3030 Apple PowerBook6,8 4.9.0f0 BootROM built on 01/10/05 at 10:39:14 [...] ok 0:> _ Going back to 4.7-rc7 (w/o installing Yaboot again) works just fine. The config is mostly the same (used "make oldconfig" from 4.7), but I've said YES to CONFIG_SLAB_FREELIST_RANDOM - could this be causing the boot failure? Full .config and screen shot: http://nerdbynature.de/bits/4.8.0-rc1/ Thanks, Christian. -- BOFH excuse #319: Your computer hasn't been returning all the bits it gets from the Internet.
Re: [PATCH kernel 14/15] vfio/spapr_tce: Export container API for external users
On 09/08/16 02:43, Alex Williamson wrote: > On Wed, 3 Aug 2016 18:40:55 +1000 > Alexey Kardashevskiy wrote: > >> This exports helpers which are needed to keep a VFIO container in >> memory while there are external users such as KVM. >> >> Signed-off-by: Alexey Kardashevskiy >> --- >> drivers/vfio/vfio.c | 30 ++ >> drivers/vfio/vfio_iommu_spapr_tce.c | 16 +++- >> include/linux/vfio.h| 6 ++ >> 3 files changed, 51 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c >> index d1d70e0..baf6a9c 100644 >> --- a/drivers/vfio/vfio.c >> +++ b/drivers/vfio/vfio.c >> @@ -1729,6 +1729,36 @@ long vfio_external_check_extension(struct vfio_group >> *group, unsigned long arg) >> EXPORT_SYMBOL_GPL(vfio_external_check_extension); >> >> /** >> + * External user API for containers, exported by symbols to be linked >> + * dynamically. >> + * >> + */ >> +struct vfio_container *vfio_container_get_ext(struct file *filep) >> +{ >> +struct vfio_container *container = filep->private_data; >> + >> +if (filep->f_op != &vfio_fops) >> +return ERR_PTR(-EINVAL); >> + >> +vfio_container_get(container); >> + >> +return container; >> +} >> +EXPORT_SYMBOL_GPL(vfio_container_get_ext); >> + >> +void vfio_container_put_ext(struct vfio_container *container) >> +{ >> +vfio_container_put(container); >> +} >> +EXPORT_SYMBOL_GPL(vfio_container_put_ext); >> + >> +void *vfio_container_get_iommu_data_ext(struct vfio_container *container) >> +{ >> +return container->iommu_data; >> +} >> +EXPORT_SYMBOL_GPL(vfio_container_get_iommu_data_ext); >> + >> +/** >> * Sub-module support >> */ >> /* >> diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c >> b/drivers/vfio/vfio_iommu_spapr_tce.c >> index 3594ad3..fceea3d 100644 >> --- a/drivers/vfio/vfio_iommu_spapr_tce.c >> +++ b/drivers/vfio/vfio_iommu_spapr_tce.c >> @@ -1331,6 +1331,21 @@ const struct vfio_iommu_driver_ops >> tce_iommu_driver_ops = { >> .detach_group = tce_iommu_detach_group, >> }; >> >> +struct iommu_table *vfio_container_spapr_tce_table_get_ext(void *iommu_data, >> +u64 offset) >> +{ >> +struct tce_container *container = iommu_data; >> +struct iommu_table *tbl = NULL; >> + >> +if (tce_iommu_find_table(container, offset, &tbl) < 0) >> +return NULL; >> + >> +iommu_table_get(tbl); >> + >> +return tbl; >> +} >> +EXPORT_SYMBOL_GPL(vfio_container_spapr_tce_table_get_ext); >> + >> static int __init tce_iommu_init(void) >> { >> return vfio_register_iommu_driver(&tce_iommu_driver_ops); >> @@ -1348,4 +1363,3 @@ MODULE_VERSION(DRIVER_VERSION); >> MODULE_LICENSE("GPL v2"); >> MODULE_AUTHOR(DRIVER_AUTHOR); >> MODULE_DESCRIPTION(DRIVER_DESC); >> - >> diff --git a/include/linux/vfio.h b/include/linux/vfio.h >> index 0ecae0b..1c2138a 100644 >> --- a/include/linux/vfio.h >> +++ b/include/linux/vfio.h >> @@ -91,6 +91,12 @@ extern void vfio_group_put_external_user(struct >> vfio_group *group); >> extern int vfio_external_user_iommu_id(struct vfio_group *group); >> extern long vfio_external_check_extension(struct vfio_group *group, >>unsigned long arg); >> +extern struct vfio_container *vfio_container_get_ext(struct file *filep); >> +extern void vfio_container_put_ext(struct vfio_container *container); >> +extern void *vfio_container_get_iommu_data_ext( >> +struct vfio_container *container); >> +extern struct iommu_table *vfio_container_spapr_tce_table_get_ext( >> +void *iommu_data, u64 offset); >> >> /* >> * Sub-module helpers > > > I think you need to take a closer look of the lifecycle of a container, > having a reference means the container itself won't go away, but only > having a group set within that container holds the actual IOMMU > references. container->iommu_data is going to be NULL once the > groups are lost. Thanks, Container owns the iommu tables and this is what I care about here, groups attached or not - this is handled separately via IOMMU group list in a specific iommu_table struct, these groups get detached from iommu_table when they are removed from a container. -- Alexey
Re: [PATCH 0/7] ima: carry the measurement list across kexec
On 04/08/16 22:24, Mimi Zohar wrote: > The TPM PCRs are only reset on a hard reboot. In order to validate a > TPM's quote after a soft reboot (eg. kexec -e), the IMA measurement list > of the running kernel must be saved and then restored on the subsequent > boot. > > The existing securityfs binary_runtime_measurements file conveniently > provides a serialized format of the IMA measurement list. This patch > set serializes the measurement list in this format and restores it. > > This patch set pre-req's Thiago Bauermann's "kexec_file: Add buffer > hand-over for the next kernel" patch set* for actually carrying the > serialized measurement list across the kexec. > > Mimi > Hi, Mimi I am trying to convince myself of the security of the solution. I asked Thiago as well, but may be I am be lagging behind in understanding. We trust the kernel to hand over PCR values of the old kernel (which cannot be validated) to the IMA subsystem in the new kernel for storage. I guess the idea is for ima_add_boot_aggregate to do the right thing? How do we validate what the old kernel is giving us? Why do we care for the old measurement list? Is it still of significance in the new kernel? Balbir Singh.
[PATCH v2] powerpc/book3s: Fix MCE console messages for unrecoverable MCE.
From: Mahesh Salgaonkar When machine check occurs with MSR(RI=0), it means MC interrupt is unrecoverable and kernel goes down to panic path. But the console message still shows it as recovered. This patch fixes the MCE console messages. Fixes: 36df96f8acaf ("powerpc/book3s: Decode and save machine check event.") Cc: sta...@vger.kernel.org Signed-off-by: Mahesh Salgaonkar --- Changes in v2: - Used pr_err() instead of printk(). --- arch/powerpc/kernel/mce.c |3 ++- arch/powerpc/platforms/powernv/opal.c |1 + 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c index ef267fd..5e7ece0 100644 --- a/arch/powerpc/kernel/mce.c +++ b/arch/powerpc/kernel/mce.c @@ -92,7 +92,8 @@ void save_mce_event(struct pt_regs *regs, long handled, mce->in_use = 1; mce->initiator = MCE_INITIATOR_CPU; - if (handled) + /* Mark it recovered if we have handled it and MSR(RI=1). */ + if (handled && (regs->msr & MSR_RI)) mce->disposition = MCE_DISPOSITION_RECOVERED; else mce->disposition = MCE_DISPOSITION_NOT_RECOVERED; diff --git a/arch/powerpc/platforms/powernv/opal.c b/arch/powerpc/platforms/powernv/opal.c index 738c9b1..48fa512 100644 --- a/arch/powerpc/platforms/powernv/opal.c +++ b/arch/powerpc/platforms/powernv/opal.c @@ -399,6 +399,7 @@ static int opal_recover_mce(struct pt_regs *regs, if (!(regs->msr & MSR_RI)) { /* If MSR_RI isn't set, we cannot recover */ + pr_err("Machine check interrupt unrecoverable: MSR(RI=0)\n"); recovered = 0; } else if (evt->disposition == MCE_DISPOSITION_RECOVERED) { /* Platform corrected itself */
Re: [PATCH] powerpc: rebuild vdsos correctly
Nicholas Piggin writes: > diff --git a/arch/powerpc/kernel/vdso32/Makefile > b/arch/powerpc/kernel/vdso32/Makefile > index cbabd14..ae1f245 100644 > --- a/arch/powerpc/kernel/vdso32/Makefile > +++ b/arch/powerpc/kernel/vdso32/Makefile > @@ -39,14 +39,14 @@ $(obj)/%.so: $(obj)/%.so.dbg FORCE > $(call if_changed,objcopy) > > # assembly rules for the .S files > -$(obj-vdso32): %.o: %.S > +$(obj-vdso32): %.o: %.S FORCE > $(call if_changed_dep,vdso32as) > > # actual build commands > quiet_cmd_vdso32ld = VDSO32L $@ > - cmd_vdso32ld = $(CROSS32CC) $(c_flags) -Wl,-T $^ -o $@ > + cmd_vdso32ld = $(CROSS32CC) $(c_flags) -o $@ -Wl,-T$(filter %.lds,$^) > $(filter %.o,$^) > quiet_cmd_vdso32as = VDSO32A $@ > - cmd_vdso32as = $(CROSS32CC) $(a_flags) -c -o $@ $< > + cmd_vdso32as = $(CROSS32CC) $(a_flags) -o $@ -c $< Are the two changes above required, they aren't obviously related. cheers
Re: [PATCH] powerpc/pci: Only do fixed PHB numbering on powernv
"Guilherme G. Piccoli" writes: > On 08/08/2016 09:32 PM, Michael Ellerman wrote: >> "Guilherme G. Piccoli" writes: >>> >>> (i) What is the specific issue? Do you have some logs or at least a >>> "high-level" description of the problem in Xorg? I took a look in its >>> code and PCI domain is coded as u16, which is correct/expected. So it >>> seems a subtle bug we should investigate and hopefully fix. >> >> It was reported here: >> >>https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-August/147062.html >> >> It seems xorg just has a hard coded limit of 256 domains. > > Thanks for the link Michael. I guess Xorg _had_ this limit in the > "past", since the function that was logged on error - xf86MapLegacyIO() > - was removed by a commit of 2014: > > https://lists.x.org/archives/xorg-devel/2014-July/043224.html Aha, nice work. In fact it seems to be better than that, the array of domains was removed in 2011 in: https://cgit.freedesktop.org/xorg/xserver/commit/?id=858fbbb40d7c69540cd1fb5315cebf811c6e7b3f Which is officially ancient history as far as I'm concerned. >>> (ii) Why is it related to the absence of pseries check?! You said this >>> was your bad, but as far as I understand, Xorg runs in pSeries too so >>> the issue should also be there heheh >> >> Well yes I guess it would, if anyone had tested Xorg on pseries :) > > We use to test Xorg on pSeries regularly; in fact, I made a quick test > today: > > http://imgur.com/a/l1lP8 > > I forced the domain to be 0x as in the above image, and everything > worked fine. Awesome. >> I think for now I'm going to apply this, and we'll work out something >> else later. > > OK, I guess your solution is fine and solves the pasemi issue quickly, No given the above info on xorg I'll drop this, and merge just the endian fix. cheers
Re: [PATCH kernel 05/15] powerpc/iommu: Stop using @current in mm_iommu_xxx
On 03/08/16 18:40, Alexey Kardashevskiy wrote: > In some situations the userspace memory context may live longer than > the userspace process itself so if we need to do proper memory context > cleanup, we better cache @mm and use it later when the process is gone > (@current or @current->mm are NULL). > > This changes mm_iommu_xxx API to receive mm_struct instead of using one > from @current. > > This is needed by the following patch to do proper cleanup in time. > This depends on "powerpc/powernv/ioda: Fix endianness when reading TCEs" > to do proper cleanup via tce_iommu_clear() patch. > > To keep API consistent, this replaces mm_context_t with mm_struct; > we stick to mm_struct as mm_iommu_adjust_locked_vm() helper needs > access to &mm->mmap_sem. > > This should cause no behavioral change. > Looks good, minor nits below Acked-by: Balbir Singh > Signed-off-by: Alexey Kardashevskiy > --- > arch/powerpc/include/asm/mmu_context.h | 20 +++-- > arch/powerpc/kernel/setup-common.c | 2 +- > arch/powerpc/mm/mmu_context_book3s64.c | 4 +-- > arch/powerpc/mm/mmu_context_iommu.c| 54 > ++ > drivers/vfio/vfio_iommu_spapr_tce.c| 41 -- > 5 files changed, 62 insertions(+), 59 deletions(-) > > diff --git a/arch/powerpc/include/asm/mmu_context.h > b/arch/powerpc/include/asm/mmu_context.h > index 9d2cd0c..b85cc7b 100644 > --- a/arch/powerpc/include/asm/mmu_context.h > +++ b/arch/powerpc/include/asm/mmu_context.h > @@ -18,16 +18,18 @@ extern void destroy_context(struct mm_struct *mm); > #ifdef CONFIG_SPAPR_TCE_IOMMU > struct mm_iommu_table_group_mem_t; > > -extern bool mm_iommu_preregistered(void); > -extern long mm_iommu_get(unsigned long ua, unsigned long entries, > +extern bool mm_iommu_preregistered(struct mm_struct *mm); > +extern long mm_iommu_get(struct mm_struct *mm, > + unsigned long ua, unsigned long entries, > struct mm_iommu_table_group_mem_t **pmem); > -extern long mm_iommu_put(struct mm_iommu_table_group_mem_t *mem); > -extern void mm_iommu_init(mm_context_t *ctx); > -extern void mm_iommu_cleanup(mm_context_t *ctx); > -extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup(unsigned long ua, > - unsigned long size); > -extern struct mm_iommu_table_group_mem_t *mm_iommu_find(unsigned long ua, > - unsigned long entries); > +extern long mm_iommu_put(struct mm_struct *mm, > + struct mm_iommu_table_group_mem_t *mem); > +extern void mm_iommu_init(struct mm_struct *mm); > +extern void mm_iommu_cleanup(struct mm_struct *mm); > +extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup(struct mm_struct > *mm, > + unsigned long ua, unsigned long size); > +extern struct mm_iommu_table_group_mem_t *mm_iommu_find(struct mm_struct *mm, > + unsigned long ua, unsigned long entries); > extern long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem, > unsigned long ua, unsigned long *hpa); > extern long mm_iommu_mapped_inc(struct mm_iommu_table_group_mem_t *mem); > diff --git a/arch/powerpc/kernel/setup-common.c > b/arch/powerpc/kernel/setup-common.c > index 714b4ba..e90b68a 100644 > --- a/arch/powerpc/kernel/setup-common.c > +++ b/arch/powerpc/kernel/setup-common.c > @@ -905,7 +905,7 @@ void __init setup_arch(char **cmdline_p) > init_mm.context.pte_frag = NULL; > #endif > #ifdef CONFIG_SPAPR_TCE_IOMMU > - mm_iommu_init(&init_mm.context); > + mm_iommu_init(&init_mm); > #endif > irqstack_early_init(); > exc_lvl_early_init(); > diff --git a/arch/powerpc/mm/mmu_context_book3s64.c > b/arch/powerpc/mm/mmu_context_book3s64.c > index b114f8b..ad82735 100644 > --- a/arch/powerpc/mm/mmu_context_book3s64.c > +++ b/arch/powerpc/mm/mmu_context_book3s64.c > @@ -115,7 +115,7 @@ int init_new_context(struct task_struct *tsk, struct > mm_struct *mm) > mm->context.pte_frag = NULL; > #endif > #ifdef CONFIG_SPAPR_TCE_IOMMU > - mm_iommu_init(&mm->context); > + mm_iommu_init(mm); > #endif > return 0; > } > @@ -160,7 +160,7 @@ static inline void destroy_pagetable_page(struct > mm_struct *mm) > void destroy_context(struct mm_struct *mm) > { > #ifdef CONFIG_SPAPR_TCE_IOMMU > - mm_iommu_cleanup(&mm->context); > + mm_iommu_cleanup(mm); > #endif > > #ifdef CONFIG_PPC_ICSWX > diff --git a/arch/powerpc/mm/mmu_context_iommu.c > b/arch/powerpc/mm/mmu_context_iommu.c > index da6a216..ee6685b 100644 > --- a/arch/powerpc/mm/mmu_context_iommu.c > +++ b/arch/powerpc/mm/mmu_context_iommu.c > @@ -53,7 +53,7 @@ static long mm_iommu_adjust_locked_vm(struct mm_struct *mm, > } > > pr_debug("[%d] RLIMIT_MEMLOCK HASH64 %c%ld %ld/%ld\n", > - current->pid, > + current ? current->pid : 0, > incr ? '+' : '-', > npages << PAGE_SHIFT, > mm->locked_vm << PAGE_SHIFT, > @@ -63,28 +
Re: [PATCH 2/5] kbuild: allow archs to select build for link dead code/data elimination
> I don't understand what led Andi Kleen to also move .text.hot and > .text.unlikely together with .text [2], but this may have > been a related issue. The goal was just to move .hot and .unlikely all together, so that they are clustered and use the minimum amount of cache. On x86 doesn't matter where they are exactly, as long as each is together. If they are not explicitely listed then the linker interleaves them with the normal text, which defeats the purpose. -Andi
Re: [PATCH] powerpc/pci: Only do fixed PHB numbering on powernv
On 08/08/2016 09:32 PM, Michael Ellerman wrote: "Guilherme G. Piccoli" writes: On 08/07/2016 08:48 PM, Gavin Shan wrote: On Fri, Aug 05, 2016 at 04:40:56PM +1000, Michael Ellerman wrote: The recent commit 63a72284b159 ("powerpc/pci: Assign fixed PHB number based on device-tree properties"), added code to read a 64-bit property >from the device tree, and if not found read a 32-bit property (reg). There was a bug in the 32-bit case, on big endian machines, due to the use of the 64-bit value to read the 32-bit property. The cast of &prop means we end up writing to the high 32-bit of prop, leaving the low 32-bits containing whatever junk was on the stack. If that junk value was non-zero, and < MAX_PHBS, we would end up using it as the PHB id. This exposed a second problem, which is that Xorg can't cope with a domain number > 256. So for now drop the fallback to reg, and only use "ibm,opal-phbid" for PHB numbering. Fixes: 63a72284b159 ("powerpc/pci: Assign fixed PHB number based on device-tree properties") Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/pci-common.c | 24 +--- 1 file changed, 9 insertions(+), 15 deletions(-) This is my bad. Guilherme limited the reg case to pseries only, but I made it generic. I tested it on G5, Cell etc. which all booted - but that's not really a good enough test. Michael and Gavin, thanks very much for fixing and commenting on the issue. I'm sorry about the bug on Big Endian machines, my mistake! I'll fix it in a future patch, but right now I have 2 questions so I can investigate better the issue found on Xorg: (i) What is the specific issue? Do you have some logs or at least a "high-level" description of the problem in Xorg? I took a look in its code and PCI domain is coded as u16, which is correct/expected. So it seems a subtle bug we should investigate and hopefully fix. It was reported here: https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-August/147062.html It seems xorg just has a hard coded limit of 256 domains. Thanks for the link Michael. I guess Xorg _had_ this limit in the "past", since the function that was logged on error - xf86MapLegacyIO() - was removed by a commit of 2014: https://lists.x.org/archives/xorg-devel/2014-July/043224.html (ii) Why is it related to the absence of pseries check?! You said this was your bad, but as far as I understand, Xorg runs in pSeries too so the issue should also be there heheh Well yes I guess it would, if anyone had tested Xorg on pseries :) We use to test Xorg on pSeries regularly; in fact, I made a quick test today: http://imgur.com/a/l1lP8 I forced the domain to be 0x as in the above image, and everything worked fine. The bug was reported/found in another platform? Yeah pasemi, see the email above. On closer inspection I also see it on my G5, ie. the domain numbers are random, but the machine doesn't mind (because I don't run xorg). As Gavin pointed, the important/interesting part of the patch is related to pSeries, in which we have PHB hotplug and so the patch allows network adapters to work well in PHB hotplug scenario, even if using predictable naming. For now, I guess this fix is pretty good, but would be really important to have this feature on pSeries too - I'll put some effort in solving the Xorg bug. I think for now I'm going to apply this, and we'll work out something else later. OK, I guess your solution is fine and solves the pasemi issue quickly, but we should really consider performing fixed assignment of domains on pSeries later, since the PHB hotplug operation then can work fine with predictable naming of network interfaces. I'll work in a fix for the endian issue and will test more the patch with Xorg, specially in "old" distros still supported in pSeries. Cheers, Guilherme cheers
Build and qemu test results for v4.8-rc1
Build results: total: 148 pass: 142 fail: 6 Failed builds: h8300:allnoconfig h8300:edosk2674_defconfig h8300:h8300h-sim_defconfig h8300:h8s-sim_defconfig unicore32:defconfig unicore32:allnoconfig Qemu test results: total: 107 pass: 104 fail: 3 Failed tests: arm:kzm:imx_v6_v7_defconfig powerpc:nosmp:ppc64_e5500_defconfig powerpc:smp:ppc64_e5500_defconfig === h8300: arch/h8300/include/asm/io.h:9:15: error: unknown type name ‘u8’ arch/h8300/include/asm/io.h:9:58: error: expected ‘;’, ‘,’ or ‘)’ before ‘*’ token [and many more similar errors] Fixed with [1] --- unicore32: mm/gup.c: In function ‘check_vma_flags’: mm/gup.c:458: error: too many arguments to function ‘arch_vma_access_permitted’ mm/gup.c: In function ‘vma_permits_fault’: gup.c:642: error: too many arguments to function ‘arch_vma_access_permitted’ Fixed with [2] --- qemu arm:kzm:imx_v6_v7_defconfig: Failed to create /dev/root: -14 ... Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) See [3] --- qemu powerpc:nosmp:qemu_ppc64_e5500_defconfig: qemu powerpc:smp:qemu_ppc64_e5500_defconfig: ./arch/powerpc/include/asm/cputhreads.h: In function 'get_tensr': ./arch/powerpc/include/asm/cputhreads.h:101:2: error: implicit declaration of function 'cpu_has_feature' Fixed with [4] --- [1] https://patchwork.kernel.org/patch/9166177/ [2] https://patchwork.kernel.org/patch/8631791/ [3] https://lkml.org/lkml/2016/8/2/2085 [4] https://patchwork.kernel.org/patch/9266131/
Re: [PATCH] powerpc: vdso: Add missing include file
On 08/08/2016 06:46 PM, Michael Ellerman wrote: Guenter Roeck writes: Some powerpc builds fail with the following buld error. Is that a randconfig? No, it is a configuration I use for qemu testing. https://github.com/groeck/linux-build-test/blob/master/rootfs/ppc64/qemu_ppc64_e5500_defconfig Guenter
Re: [PATCH] powerpc: vdso: Add missing include file
Guenter Roeck writes: > Some powerpc builds fail with the following buld error. Is that a randconfig? All my builds are green. Thanks for the patch anyway. cheers
Re: [PATCH] powerpc/pci: Only do fixed PHB numbering on powernv
"Guilherme G. Piccoli" writes: > On 08/07/2016 08:48 PM, Gavin Shan wrote: >> On Fri, Aug 05, 2016 at 04:40:56PM +1000, Michael Ellerman wrote: >>> The recent commit 63a72284b159 ("powerpc/pci: Assign fixed PHB number >>> based on device-tree properties"), added code to read a 64-bit property >>>from the device tree, and if not found read a 32-bit property (reg). >>> >>> There was a bug in the 32-bit case, on big endian machines, due to the >>> use of the 64-bit value to read the 32-bit property. The cast of &prop >>> means we end up writing to the high 32-bit of prop, leaving the low >>> 32-bits containing whatever junk was on the stack. >>> >>> If that junk value was non-zero, and < MAX_PHBS, we would end up using >>> it as the PHB id. >>> >>> This exposed a second problem, which is that Xorg can't cope with a >>> domain number > 256. >>> >>> So for now drop the fallback to reg, and only use "ibm,opal-phbid" for >>> PHB numbering. >>> >>> Fixes: 63a72284b159 ("powerpc/pci: Assign fixed PHB number based on >>> device-tree properties") >>> Signed-off-by: Michael Ellerman >>> --- >>> arch/powerpc/kernel/pci-common.c | 24 +--- >>> 1 file changed, 9 insertions(+), 15 deletions(-) >>> >>> >>> This is my bad. Guilherme limited the reg case to pseries only, but I made >>> it >>> generic. I tested it on G5, Cell etc. which all booted - but that's not >>> really >>> a good enough test. > > Michael and Gavin, thanks very much for fixing and commenting on the > issue. I'm sorry about the bug on Big Endian machines, my mistake! I'll > fix it in a future patch, but right now I have 2 questions so I can > investigate better the issue found on Xorg: > > (i) What is the specific issue? Do you have some logs or at least a > "high-level" description of the problem in Xorg? I took a look in its > code and PCI domain is coded as u16, which is correct/expected. So it > seems a subtle bug we should investigate and hopefully fix. It was reported here: https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-August/147062.html It seems xorg just has a hard coded limit of 256 domains. > (ii) Why is it related to the absence of pseries check?! You said this > was your bad, but as far as I understand, Xorg runs in pSeries too so > the issue should also be there heheh Well yes I guess it would, if anyone had tested Xorg on pseries :) > The bug was reported/found in another platform? Yeah pasemi, see the email above. On closer inspection I also see it on my G5, ie. the domain numbers are random, but the machine doesn't mind (because I don't run xorg). > As Gavin pointed, the important/interesting part of the patch is related > to pSeries, in which we have PHB hotplug and so the patch allows network > adapters to work well in PHB hotplug scenario, even if using predictable > naming. For now, I guess this fix is pretty good, but would be really > important to have this feature on pSeries too - I'll put some effort in > solving the Xorg bug. I think for now I'm going to apply this, and we'll work out something else later. cheers
Re: [PATCH v2] cxl: Use fixed width predefined types in data structure.
Frederic Barrat writes: > Le 05/08/2016 à 14:02, Philippe Bergheaud a écrit : >> This patch fixes a regression introduced by commit b810253. >> >> It substitutes the type __u8 to u8 in the uapi header cxl.h, >> because the latter is not always defined in userland build >> environments, in particular when cross-compiling libcxl on >> x86_64 linux machines (RHEL6.7 and Ubuntu 16.04). >> >> This patch also changes the size of the field data_size, and >> makes it constant, to support 32-bit userland applications >> running on big-endian ppc64 kernels transparently. >> >> This breaks the (young) API that has been merged in v4.8. >> >> Signed-off-by: Philippe Bergheaud >> --- >> Changes since v1: >> Added an explanation for the proposed API change in the log. >> >> Note: >> As far as I know, cxlflash is the only known user of the API. > > > Yes, ideally, we'd like to change the type of 'data_size' to something > smaller/constant and were expecting it's still doable since the API was > merged to 4.8 and the expected user (cxlflash) hasn't started using the > API yet. Yep that's fine, it just needed to be mentioned in the change log. We don't guarantee that an ABI merged in -rc1 doesn't change in -rc2. But we do (in general) guarantee that an ABI in a released kernel (ie. 4.8 final), never changes. cheers
Re: [PATCH 2/5] kbuild: allow archs to select build for link dead code/data elimination
On Mon, Aug 08, 2016 at 05:14:27PM +0200, Arnd Bergmann wrote: > I have reverted that patch now, so ARM uses ".fixup" again like every > other architecture does, and now "*(.fixup) *(.text .text.*)" works > correctly, while ""*(.fixup) *(.text .fixup .text.*)" also fails > the same way that I saw before: That is really odd. The linker isn't supposed to treat those script snippets differently. First match for .fixup wins. $ cat > fixup1.s <<\EOF .global _start .text _start: .dc.a .L2 .L1: .section ".fixup","ax",%progbits .L2: .dc.a .L1 EOF $ cat > fixup2.s <<\EOF .section ".text.xyz","ax",%progbits .dc.a .L2 .L1: .section ".fixup","ax",%progbits .L2: .dc.a .L1 EOF $ cat > fixup.lnk <<\EOF SECTIONS { .text : { *(.fixup) *(.text .fixup .text.*) } } EOF $ as -o fixup1.o fixup1.s $ as -o fixup2.o fixup2.s $ ld -o fixup -T fixup.lnk -Map fixup.map fixup1.o fixup2.o $ cat fixup.map Memory Configuration Name Origin Length Attributes *default*0x 0x Linker script and memory map .text 0x 0x10 *(.fixup) .fixup 0x0x4 fixup1.o .fixup 0x00040x4 fixup2.o *(.text .fixup .text.*) .text 0x00080x4 fixup1.o 0x0008_start .text 0x000c0x0 fixup2.o .text.xyz 0x000c0x4 fixup2.o [snip] -- Alan Modra Australia Development Lab, IBM
Re: [PATCH] crypto: powerpc - CRYPT_CRC32C_VPMSUM should depend on ALTIVEC
Herbert Xu writes: > On Fri, Aug 05, 2016 at 01:28:03PM +1000, Michael Ellerman wrote: >> Anton Blanchard writes: >> >> > Hi Michael, >> > >> >> The optimised crc32c implementation depends on VMX (aka. Altivec) >> >> instructions, so the kernel must be built with Altivec support in >> >> order for the crc32c code to build. >> > >> > Thanks for that, looks good. >> > >> > Acked-by: Anton Blanchard >> >> Thanks. Herbert, expecting you'll pick this up. > > In that case Anton can you please post it to linux-crypto? Sorry that's my fault, will repost to linux-crypto. cheers
[PATCH v2] crypto: powerpc - CRYPT_CRC32C_VPMSUM should depend on ALTIVEC
The optimised crc32c implementation depends on VMX (aka. Altivec) instructions, so the kernel must be built with Altivec support in order for the crc32c code to build. Fixes: 6dd7a82cc54e ("crypto: powerpc - Add POWER8 optimised crc32c") Acked-by: Anton Blanchard Signed-off-by: Michael Ellerman --- crypto/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) v2: No change. Send to linux-crypto. diff --git a/crypto/Kconfig b/crypto/Kconfig index a9377bef25e3..84d71482bf08 100644 --- a/crypto/Kconfig +++ b/crypto/Kconfig @@ -439,7 +439,7 @@ config CRYPTO_CRC32C_INTEL config CRYPT_CRC32C_VPMSUM tristate "CRC32c CRC algorithm (powerpc64)" - depends on PPC64 + depends on PPC64 && ALTIVEC select CRYPTO_HASH select CRC32 help -- 2.7.4
Re: [5.3] ucc_geth: Fix to avoid IS_ERR_VALUE abuses and dead code on 64bit systems.
On Monday, August 8, 2016 3:49:22 PM CEST David Laight wrote: > From: Arnd Bergmann > > Sent: 08 August 2016 16:13 > > > > On Monday, August 8, 2016 2:49:11 PM CEST David Laight wrote: > > > > > > > If qe_muram_alloc will return any error, Then IS_ERR_VALUE will always > > > > return 0. it'll not call ucc_fast_free for any failure. Inside 'if code' > > > > will be a dead code on 64bit. Even qe_muram_addr will return wrong > > > > virtual address. Which can cause an error. > > > > > > > > kfree((void *)ugeth->tx_bd_ring_offset[i]); > > > > > > Erm, kfree() isn't the right function for things allocated by > > > qe_muram_alloc(). > > > > > > I still thing you need to stop this code using IS_ERR_VALUE() at all. > > > > Those are two separate issues: > > > > a) The ucc_geth driver mixing kmalloc() memory with muram, and assigning > >the result to "u32" and "void __iomem *" variables, both of which > >are wrong at least half of the time. > > > > b) calling conventions of qe_muram_alloc() being defined in a way that > >requires the use of IS_ERR_VALUE(), because '0' is a valid address > >here. > > Yep, it is all a big bag of worms... > '0' being valid is going to make tidying up after failure 'problematic'. > > > The first one can be solved by updating the network driver, ideally > > by getting rid of the casts and using proper types and accessors, > > while the second would require updating all users of that interface. > > It might be worth (at least as a compilation option) of embedding the > 'muram offset' in a structure (passed and returned by value). > > The compiler can then check that the driver code is never be looking > directly at the value. > > For 'b' zero can be made invalid by changing the places where the > offset is added/subtracted. > It could even be used to offset the saved physical and virtual > addresses of the area - so not needing any extra code when the values > are converted to physical/virtual addresses. Agreed. For this driver, we don't actually seem to use the value returned from the allocation function, only the virtual __iomem address we get after calling qe_muram_addr(), so it would be a big improvement to just store the virtual address as a pointer, and wrap the calls to qe_muram_alloc/qe_muram_addr/qe_muram_free with an appropriate helper that doesn't even show the offset. However, I'd also separate the normal kmalloc pointer from the muram_alloc() pointer because only the latter is __iomem, and we shouldn't really call MMIO accessor functions on RAM in portable code. Arnd
Re: Problem booting Kernel 4.8-rc1 on Sam460ex ( Canyonlands)
On Mon, Aug 8, 2016 at 2:25 PM, Julian Margetson wrote: > On 8/8/2016 9:39 AM, Kefeng Wang wrote: >> >> +cc Rob >> >> On 2016/8/8 20:51, Julian Margetson wrote: >>> >>> On 8/8/2016 8:22 AM, Kefeng Wang wrote: On 2016/8/8 19:52, Julian Margetson wrote: > > Problem booting Kernel 4.8-rc1 on Sam460ex ( Canyonlands) machine . > > This is what I get with the bisect. Hi Julian I have no Sam460ex machine, could you show the good and bad kernel bootlog? >> >> I have no experience about PowerPC, the status shows below, >> version boot >> v4.7ok >> v4.8-rc1fail >> v4.7 + this patch fail(right?) >> >> Does PowerPC support earlyprintk/earlycon? Please set it, and let's show >> somethings from earlyprintk. > > Not getting any output with earlyprintk so not sure if it works or if my > settings are incorrect . This should work: earlycon=uart8250,mmio,ef600300,115200n8 Adding "initcall_debug debug" would also be useful here. > I tried to revert the commit but the result is the same so I suspect that > this one may not really be the problem . > Looks like the real problem may be something affecting the the device tree > loading . That's strange. If it is this commit, the init order changes as it changed from device to arch initcall. I haven't seen any other platforms having problems. Rob
Re: [PATCH -net] net/ethernet: tundra: fix dump_eth_one warning in tsi108_eth
From: Paul Gortmaker Date: Thu, 4 Aug 2016 16:07:58 -0400 > The call site for this function appears as: > > #ifdef DEBUG > data->msg_enable = DEBUG; > dump_eth_one(dev); > #endif > > ...leading to the following warning for !DEBUG builds: > > drivers/net/ethernet/tundra/tsi108_eth.c:169:13: warning: 'dump_eth_one' > defined but not used [-Wunused-function] > static void dump_eth_one(struct net_device *dev) > ^ > > ...when using the arch/powerpc/configs/mpc7448_hpc2_defconfig > > Put the function definition under the same #ifdef as the call site > to avoid the warning. > > Cc: "David S. Miller" > Cc: net...@vger.kernel.org > Cc: linuxppc-dev@lists.ozlabs.org > Signed-off-by: Paul Gortmaker Applied, thanks.
Re: [PATCH v2] powerpc: Do not make the entire heap executable
On Mon, Aug 8, 2016 at 7:55 AM, Denys Vlasenko wrote: > On 32-bit powerps the ELF PLT sections of binaries (built with --bss-plt, > or with a toolchain which defaults to it) look like this: > > [17] .sbss NOBITS 0002aff8 01aff8 14 00 WA 0 0 > 4 > [18] .plt NOBITS 0002b00c 01aff8 84 00 WAX 0 0 > 4 > [19] .bss NOBITS 0002b090 01aff8 a4 00 WA 0 0 > 4 > > Which results in an ELF load header: > > Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align > LOAD 0x019c70 0x00029c70 0x00029c70 0x01388 0x014c4 RWE 0x1 > > This is all correct, the load region containing the PLT is marked as > executable. Note that the PLT starts at 0002b00c but the file mapping ends at > 0002aff8, so the PLT falls in the 0 fill section described by the load header, > and after a page boundary. > > Unfortunately the generic ELF loader ignores the X bit in the load headers > when it creates the 0 filled non-file backed mappings. It assumes all of these > mappings are RW BSS sections, which is not the case for PPC. > > gcc/ld has an option (--secure-plt) to not do this, this is said to incur > a small performance penalty. > > Currently, to support 32-bit binaries with PLT in BSS kernel maps *entire > brk area* with executable rights for all binaries, even --secure-plt ones. > > Stop doing that. > > Teach the ELF loader to check the X bit in the relevant load header > and create 0 filled anonymous mappings that are executable > if the load header requests that. > > The patch was originally posted in 2012 by Jason Gunthorpe > and apparently ignored: > > https://lkml.org/lkml/2012/9/30/138 > > Lightly run-tested. > > Signed-off-by: Jason Gunthorpe > Signed-off-by: Denys Vlasenko > CC: Benjamin Herrenschmidt > CC: Paul Mackerras > CC: Kees Cook > CC: Oleg Nesterov , > CC: Michael Ellerman > CC: Florian Weimer > CC: linux...@kvack.org, > CC: linuxppc-dev@lists.ozlabs.org > CC: linux-ker...@vger.kernel.org > --- > Changes since v1: > * wrapped lines to not exceed 79 chars > * improved comment > * expanded CC list > > arch/powerpc/include/asm/page.h| 10 +-- > arch/powerpc/include/asm/page_32.h | 2 -- > arch/powerpc/include/asm/page_64.h | 4 --- > fs/binfmt_elf.c| 56 > ++ > 4 files changed, 45 insertions(+), 27 deletions(-) > > diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h > index 56398e7..42d7ea1 100644 > --- a/arch/powerpc/include/asm/page.h > +++ b/arch/powerpc/include/asm/page.h > @@ -225,15 +225,7 @@ extern long long virt_phys_offset; > #endif > #endif > > -/* > - * Unfortunately the PLT is in the BSS in the PPC32 ELF ABI, > - * and needs to be executable. This means the whole heap ends > - * up being executable. > - */ > -#define VM_DATA_DEFAULT_FLAGS32(VM_READ | VM_WRITE | VM_EXEC | \ > -VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC) > - > -#define VM_DATA_DEFAULT_FLAGS64(VM_READ | VM_WRITE | \ > +#define VM_DATA_DEFAULT_FLAGS (VM_READ | VM_WRITE | \ > VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC) > > #ifdef __powerpc64__ > diff --git a/arch/powerpc/include/asm/page_32.h > b/arch/powerpc/include/asm/page_32.h > index 6a8e179..6113fa8 100644 > --- a/arch/powerpc/include/asm/page_32.h > +++ b/arch/powerpc/include/asm/page_32.h > @@ -9,8 +9,6 @@ > #endif > #endif > > -#define VM_DATA_DEFAULT_FLAGS VM_DATA_DEFAULT_FLAGS32 > - > #ifdef CONFIG_NOT_COHERENT_CACHE > #define ARCH_DMA_MINALIGN L1_CACHE_BYTES > #endif > diff --git a/arch/powerpc/include/asm/page_64.h > b/arch/powerpc/include/asm/page_64.h > index dd5f071..52d8e9c 100644 > --- a/arch/powerpc/include/asm/page_64.h > +++ b/arch/powerpc/include/asm/page_64.h > @@ -159,10 +159,6 @@ do { \ > > #endif /* !CONFIG_HUGETLB_PAGE */ > > -#define VM_DATA_DEFAULT_FLAGS \ > - (is_32bit_task() ? \ > -VM_DATA_DEFAULT_FLAGS32 : VM_DATA_DEFAULT_FLAGS64) > - > /* > * This is the default if a program doesn't have a PT_GNU_STACK > * program header entry. The PPC64 ELF ABI has a non executable stack > diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c > index a7a28110..50006d0 100644 > --- a/fs/binfmt_elf.c > +++ b/fs/binfmt_elf.c > @@ -91,14 +91,25 @@ static struct linux_binfmt elf_format = { > > #define BAD_ADDR(x) ((unsigned long)(x) >= TASK_SIZE) > > -static int set_brk(unsigned long start, unsigned long end) > +static int set_brk(unsigned long start, unsigned long end, int prot) > { > start = ELF_PAGEALIGN(start); > end = ELF_PAGEALIGN(end); > if (end > start) { > - int error = vm_brk(start, end - start); > - if (error) > - return error; > + /* Map the non-file portion of the last load header. If the > + heade
[PATCH 1/4] dt-bindings: add doc for ibm,hotplug-aperture
Signed-off-by: Reza Arbab --- .../bindings/powerpc/opal/hotplug-aperture.txt | 26 ++ 1 file changed, 26 insertions(+) create mode 100644 Documentation/devicetree/bindings/powerpc/opal/hotplug-aperture.txt diff --git a/Documentation/devicetree/bindings/powerpc/opal/hotplug-aperture.txt b/Documentation/devicetree/bindings/powerpc/opal/hotplug-aperture.txt new file mode 100644 index 000..b8dffaa --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/opal/hotplug-aperture.txt @@ -0,0 +1,26 @@ +Designated hotplug memory +- + +This binding describes a region of hotplug memory which is not present at boot, +allowing its eventual NUMA associativity to be prespecified. + +Required properties: + +- compatible + "ibm,hotplug-aperture" + +- reg + base address and size of the region (standard definition) + +- ibm,associativity + NUMA associativity (standard definition) + +Example: + +A 2 GiB aperture at 0x1, to be part of nid 3 when hotplugged: + + hotplug-memory@1 { + compatible = "ibm,hotplug-aperture"; + reg = <0x0 0x1 0x0 0x8000>; + ibm,associativity = <0x4 0x0 0x0 0x0 0x3>; + }; -- 1.8.3.1
[PATCH 3/4] powerpc/mm: allow memory hotplug into a memoryless node
Remove the check which prevents us from hotplugging into an empty node. Signed-off-by: Reza Arbab --- arch/powerpc/mm/numa.c | 13 + 1 file changed, 1 insertion(+), 12 deletions(-) diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c index 80d067d..bc70c4f 100644 --- a/arch/powerpc/mm/numa.c +++ b/arch/powerpc/mm/numa.c @@ -1127,7 +1127,7 @@ static int hot_add_node_scn_to_nid(unsigned long scn_addr) int hot_add_scn_to_nid(unsigned long scn_addr) { struct device_node *memory = NULL; - int nid, found = 0; + int nid; if (!numa_enabled || (min_common_depth < 0)) return first_online_node; @@ -1143,17 +1143,6 @@ int hot_add_scn_to_nid(unsigned long scn_addr) if (nid < 0 || !node_online(nid)) nid = first_online_node; - if (NODE_DATA(nid)->node_spanned_pages) - return nid; - - for_each_online_node(nid) { - if (NODE_DATA(nid)->node_spanned_pages) { - found = 1; - break; - } - } - - BUG_ON(!found); return nid; } -- 1.8.3.1
[PATCH 4/4] mm: enable CONFIG_MOVABLE_NODE on powerpc
Onlining memory into ZONE_MOVABLE requires CONFIG_MOVABLE_NODE. Enable the use of this config option on PPC64 platforms. Signed-off-by: Reza Arbab --- Documentation/kernel-parameters.txt | 2 +- mm/Kconfig | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 46c030a..07fefd8 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2344,7 +2344,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted. that the amount of memory usable for all allocations is not too small. - movable_node[KNL,X86] Boot-time switch to enable the effects + movable_node[KNL,X86,PPC] Boot-time switch to enable the effects of CONFIG_MOVABLE_NODE=y. See mm/Kconfig for details. MTD_Partition= [MTD] diff --git a/mm/Kconfig b/mm/Kconfig index 78a23c5..4154638 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -153,7 +153,7 @@ config MOVABLE_NODE bool "Enable to assign a node which has only movable memory" depends on HAVE_MEMBLOCK depends on NO_BOOTMEM - depends on X86_64 + depends on X86_64 || PPC64 depends on NUMA default n help -- 1.8.3.1
[PATCH 0/4] powerpc/mm: movable hotplug memory nodes
These changes enable onlining memory into ZONE_MOVABLE on power, and the creation of discrete nodes of movable memory. Node hotplug is not supported on power [1]. The approach taken instead is to create a memoryless placeholder node for the designated address range at boot. Hotplug and onlining of the memory are then done in the usual way. The numa code on power currently prevents hotplugging to a memoryless node. This limitation has been questioned before [2], and judging by the response, there doesn't seem to be a reason we can't remove it. No issues have been found in light testing. [1] commit 3af229f ("powerpc/numa: Reset node_possible_map to only node_online_map") [2] http://lkml.kernel.org/r/cagzkibrmksa1yyhbf5hwgxubcjse5smksmy4tpanerme2ug...@mail.gmail.com http://lkml.kernel.org/r/20160511215051.gf22...@arbab-laptop.austin.ibm.com Reza Arbab (4): dt-bindings: add doc for ibm,hotplug-aperture powerpc/mm: create numa nodes for hotplug memory powerpc/mm: allow memory hotplug into a memoryless node mm: enable CONFIG_MOVABLE_NODE on powerpc .../bindings/powerpc/opal/hotplug-aperture.txt | 26 ++ Documentation/kernel-parameters.txt| 2 +- arch/powerpc/mm/numa.c | 23 --- mm/Kconfig | 2 +- 4 files changed, 37 insertions(+), 16 deletions(-) create mode 100644 Documentation/devicetree/bindings/powerpc/opal/hotplug-aperture.txt -- 1.8.3.1
[PATCH 2/4] powerpc/mm: create numa nodes for hotplug memory
When scanning the device tree to initialize the system NUMA topology, process dt elements with compatible id "ibm,hotplug-aperture" to create memoryless numa nodes. These nodes will be filled when hotplug occurs within the associated address range. Signed-off-by: Reza Arbab --- arch/powerpc/mm/numa.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c index 75b9cd6..80d067d 100644 --- a/arch/powerpc/mm/numa.c +++ b/arch/powerpc/mm/numa.c @@ -708,6 +708,12 @@ static void __init parse_drconf_memory(struct device_node *memory) } } +static const struct of_device_id memory_match[] = { + { .type = "memory" }, + { .compatible = "ibm,hotplug-aperture" }, + { /* sentinel */ } +}; + static int __init parse_numa_properties(void) { struct device_node *memory; @@ -752,7 +758,7 @@ static int __init parse_numa_properties(void) get_n_mem_cells(&n_mem_addr_cells, &n_mem_size_cells); - for_each_node_by_type(memory, "memory") { + for_each_matching_node(memory, memory_match) { unsigned long start; unsigned long size; int nid; @@ -1080,7 +1086,7 @@ static int hot_add_node_scn_to_nid(unsigned long scn_addr) struct device_node *memory; int nid = -1; - for_each_node_by_type(memory, "memory") { + for_each_matching_node(memory, memory_match) { unsigned long start, size; int ranges; const __be32 *memcell_buf; -- 1.8.3.1
Re: [PATCH v13 3/6] CPM/QE: use genalloc to manage CPM/QE muram
Le 08/08/2016 à 05:00, Qiang Zhao a écrit : On 6/8/2016 03:48AM, Christophe Leroy wrote : -Original Message- From: Christophe Leroy [mailto:christophe.le...@c-s.fr] Sent: Saturday, August 06, 2016 12:59 AM To: Zhao Qiang ; lau...@codeaurora.org Cc: catalin.mari...@arm.com; linux-ker...@vger.kernel.org; Scott Wood ; o...@lixom.net; a...@linux-foundation.org; linuxppc- d...@lists.ozlabs.org; x@freescale.com Subject: Re: [PATCH v13 3/6] CPM/QE: use genalloc to manage CPM/QE muram Le 30/11/2015 à 03:48, Zhao Qiang a écrit : Use genalloc to manage CPM/QE muram instead of rheap. Signed-off-by: Zhao Qiang --- Changes for v9: - splitted from patch 3/5, modify cpm muram management functions. Changes for v10: - modify cpm muram first, then move to qe_common - modify commit. Changes for v11: - factor out the common alloc code - modify min_alloc_order to zero for cpm_muram_alloc_fixed. Changes for v12: - Nil Changes for v13: - rebase arch/powerpc/include/asm/cpm.h | 3 + arch/powerpc/platforms/Kconfig | 4 +- arch/powerpc/sysdev/cpm_common.c | 126 +++ lib/genalloc.c | 2 +- 4 files changed, 94 insertions(+), 41 deletions(-) With that patch applied, I get the following Oops on a 8xx (Which has a CPM1). cpm_muram_init() is called from setup_arch() It seems that gen_pool_add() tries to kmalloc() memory but the SLAB is not available yet. Thank you for your comments, I can't find a 8xx board, would you like to test the patch Attached on your board? Thanks for your support (indeed I only received your mail a few minutes, that is after I proposed another patch). Your patch will not work, because initcalls are called too late. The 8xx needs the SMCs from the CPM for console, that is long before initcalls are called. I sent a proposed patch approximatly 2 hours ago, it is called "[PATCH] soc: fsl/qe: fix Oops on CPM1 (and likely CPM2)" That one works. Could you have a look ? Christophe --- L'absence de virus dans ce courrier électronique a été vérifiée par le logiciel antivirus Avast. https://www.avast.com/antivirus
Re: [Patch v3 01/11] arch/powerpc/pci: Fix compiling error for mpc85xx_edac
On 08/05/2016 12:01 AM, Borislav Petkov wrote: > On Fri, Aug 05, 2016 at 04:26:26AM +, york sun wrote: >> I don't have deep knowledge of this driver. What I am trying is to >> separate the common DDR part and share it with ARM platforms. Along the >> way, I found the compiling error if build a module. If exposing these >> functions becomes a concern, I can live without it. > > Perhaps you or Johannes could fix this properly to use pci_get_device() > as the rest of the EDAC drivers do, instead of exporting core PCI > functions... > Boris, I'd like to separate the first two patches from this set. They are not really related to the DDR part I am working on. It will take me a while to sort out the correct fix. York
Re: [PATCH kernel 14/15] vfio/spapr_tce: Export container API for external users
On Wed, 3 Aug 2016 18:40:55 +1000 Alexey Kardashevskiy wrote: > This exports helpers which are needed to keep a VFIO container in > memory while there are external users such as KVM. > > Signed-off-by: Alexey Kardashevskiy > --- > drivers/vfio/vfio.c | 30 ++ > drivers/vfio/vfio_iommu_spapr_tce.c | 16 +++- > include/linux/vfio.h| 6 ++ > 3 files changed, 51 insertions(+), 1 deletion(-) > > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c > index d1d70e0..baf6a9c 100644 > --- a/drivers/vfio/vfio.c > +++ b/drivers/vfio/vfio.c > @@ -1729,6 +1729,36 @@ long vfio_external_check_extension(struct vfio_group > *group, unsigned long arg) > EXPORT_SYMBOL_GPL(vfio_external_check_extension); > > /** > + * External user API for containers, exported by symbols to be linked > + * dynamically. > + * > + */ > +struct vfio_container *vfio_container_get_ext(struct file *filep) > +{ > + struct vfio_container *container = filep->private_data; > + > + if (filep->f_op != &vfio_fops) > + return ERR_PTR(-EINVAL); > + > + vfio_container_get(container); > + > + return container; > +} > +EXPORT_SYMBOL_GPL(vfio_container_get_ext); > + > +void vfio_container_put_ext(struct vfio_container *container) > +{ > + vfio_container_put(container); > +} > +EXPORT_SYMBOL_GPL(vfio_container_put_ext); > + > +void *vfio_container_get_iommu_data_ext(struct vfio_container *container) > +{ > + return container->iommu_data; > +} > +EXPORT_SYMBOL_GPL(vfio_container_get_iommu_data_ext); > + > +/** > * Sub-module support > */ > /* > diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c > b/drivers/vfio/vfio_iommu_spapr_tce.c > index 3594ad3..fceea3d 100644 > --- a/drivers/vfio/vfio_iommu_spapr_tce.c > +++ b/drivers/vfio/vfio_iommu_spapr_tce.c > @@ -1331,6 +1331,21 @@ const struct vfio_iommu_driver_ops > tce_iommu_driver_ops = { > .detach_group = tce_iommu_detach_group, > }; > > +struct iommu_table *vfio_container_spapr_tce_table_get_ext(void *iommu_data, > + u64 offset) > +{ > + struct tce_container *container = iommu_data; > + struct iommu_table *tbl = NULL; > + > + if (tce_iommu_find_table(container, offset, &tbl) < 0) > + return NULL; > + > + iommu_table_get(tbl); > + > + return tbl; > +} > +EXPORT_SYMBOL_GPL(vfio_container_spapr_tce_table_get_ext); > + > static int __init tce_iommu_init(void) > { > return vfio_register_iommu_driver(&tce_iommu_driver_ops); > @@ -1348,4 +1363,3 @@ MODULE_VERSION(DRIVER_VERSION); > MODULE_LICENSE("GPL v2"); > MODULE_AUTHOR(DRIVER_AUTHOR); > MODULE_DESCRIPTION(DRIVER_DESC); > - > diff --git a/include/linux/vfio.h b/include/linux/vfio.h > index 0ecae0b..1c2138a 100644 > --- a/include/linux/vfio.h > +++ b/include/linux/vfio.h > @@ -91,6 +91,12 @@ extern void vfio_group_put_external_user(struct vfio_group > *group); > extern int vfio_external_user_iommu_id(struct vfio_group *group); > extern long vfio_external_check_extension(struct vfio_group *group, > unsigned long arg); > +extern struct vfio_container *vfio_container_get_ext(struct file *filep); > +extern void vfio_container_put_ext(struct vfio_container *container); > +extern void *vfio_container_get_iommu_data_ext( > + struct vfio_container *container); > +extern struct iommu_table *vfio_container_spapr_tce_table_get_ext( > + void *iommu_data, u64 offset); > > /* > * Sub-module helpers I think you need to take a closer look of the lifecycle of a container, having a reference means the container itself won't go away, but only having a group set within that container holds the actual IOMMU references. container->iommu_data is going to be NULL once the groups are lost. Thanks, Alex
[PATCH] soc: fsl/qe: fix Oops on CPM1 (and likely CPM2)
Commit 0e6e01ff694ee ("CPM/QE: use genalloc to manage CPM/QE muram") has changed the way muram is managed. genalloc uses kmalloc(), hence requires the SLAB to be up and running. On powerpc 8xx, cpm_reset() is called early during startup. cpm_reset() then calls cpm_muram_init() before SLAB is available, hence the following Oops. cpm_reset() cannot be called during initcalls because the CPM is needed for console This patch splits cpm_muram_init() in two parts. The first part, related to mappings, is kept as cpm_muram_init() The second part is named cpm_muram_pool_init() and is called the first time cpm_muram_alloc() is used [0.00] Unable to handle kernel paging request for data at address 0x0008 [0.00] Faulting instruction address: 0xc01acce0 [0.00] Oops: Kernel access of bad area, sig: 11 [#1] [0.00] PREEMPT CMPC885 [0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 4.4.14-g0886ed8 #5 [0.00] task: c05183e0 ti: c0536000 task.ti: c0536000 [0.00] NIP: c01acce0 LR: c0011068 CTR: [0.00] REGS: c0537e50 TRAP: 0300 Not tainted (4.4.14-s3k-dev-g0886ed8-svn) [0.00] MSR: 1032 CR: 28044428 XER: [0.00] DAR: 0008 DSISR: c000 GPR00: c0011068 c0537f00 c05183e0 9000 0bc0 GPR08: ff003000 ff00b000 ff003bbf 22044422 100d43a8 07ff94e8 GPR16: 07bb5d70 07ff81f4 07ff81f4 07ff81f4 GPR24: 07ffb3a0 07fe7628 c055 c7ffa190 c054 ff003bbf 0001 [0.00] NIP [c01acce0] gen_pool_add_virt+0x14/0xdc [0.00] LR [c0011068] cpm_muram_init+0xd4/0x18c [0.00] Call Trace: [0.00] [c0537f00] [0200] 0x200 (unreliable) [0.00] [c0537f20] [c0011068] cpm_muram_init+0xd4/0x18c [0.00] [c0537f70] [c0494684] cpm_reset+0xb4/0xc8 [0.00] [c0537f90] [c0494c64] cmpc885_setup_arch+0x10/0x30 [0.00] [c0537fa0] [c0493cd4] setup_arch+0x130/0x168 [0.00] [c0537fb0] [c04906bc] start_kernel+0x88/0x380 [0.00] [c0537ff0] [c0002224] start_here+0x38/0x98 [0.00] Instruction dump: [0.00] 91430010 91430014 80010014 83e1000c 7c0803a6 38210010 4e800020 7c0802a6 [0.00] 9421ffe0 bf61000c 90010024 7c7e1b78 <80630008> 7c9c2378 7cc31c30 3863001f [0.00] ---[ end trace dc8fa200cb88537f ]--- fixes: 0e6e01ff694ee ("CPM/QE: use genalloc to manage CPM/QE muram") Cc: sta...@vger.linux.org Signed-off-by: Christophe Leroy --- drivers/soc/fsl/qe/qe_common.c | 58 -- 1 file changed, 44 insertions(+), 14 deletions(-) diff --git a/drivers/soc/fsl/qe/qe_common.c b/drivers/soc/fsl/qe/qe_common.c index 41eff80..531cfcf 100644 --- a/drivers/soc/fsl/qe/qe_common.c +++ b/drivers/soc/fsl/qe/qe_common.c @@ -45,6 +45,44 @@ static LIST_HEAD(muram_block_list); #define OF_MAX_ADDR_CELLS 4 #define GENPOOL_OFFSET (4096 * 8) +static int cpm_muram_pool_init(void) +{ + struct device_node *np; + struct resource r; + int i = 0; + int ret = 0; + + if (muram_pool) + return 0; + + np = of_find_compatible_node(NULL, NULL, "fsl,cpm-muram-data"); + if (!np) { + /* try legacy bindings */ + np = of_find_node_by_name(NULL, "data-only"); + if (!np) { + pr_err("Cannot find CPM muram data node"); + ret = -ENODEV; + goto out; + } + } + + muram_pool = gen_pool_create(0, -1); + + while (of_address_to_resource(np, i++, &r) == 0) { + ret = gen_pool_add(muram_pool, r.start - muram_pbase + + GENPOOL_OFFSET, resource_size(&r), -1); + if (ret) { + pr_err("QE: couldn't add muram to pool!\n"); + gen_pool_destroy(muram_pool); + goto out; + } + } + +out: + of_node_put(np); + return ret; +} + int cpm_muram_init(void) { struct device_node *np; @@ -65,39 +103,28 @@ int cpm_muram_init(void) if (!np) { pr_err("Cannot find CPM muram data node"); ret = -ENODEV; - goto out_muram; + goto out; } } - muram_pool = gen_pool_create(0, -1); muram_pbase = of_translate_address(np, zero); if (muram_pbase == (phys_addr_t)OF_BAD_ADDR) { pr_err("Cannot translate zero through CPM muram node"); ret = -ENODEV; - goto out_pool; + goto out; } while (of_address_to_resource(np, i++, &r) == 0) { if (r.end > max) max = r.end; - ret = gen_pool_add(muram_pool, r.start - muram_pbase + - GENPOOL_OFFSET, r
Re: [PATCH] crypto: powerpc - CRYPT_CRC32C_VPMSUM should depend on ALTIVEC
On Fri, Aug 05, 2016 at 01:28:03PM +1000, Michael Ellerman wrote: > Anton Blanchard writes: > > > Hi Michael, > > > >> The optimised crc32c implementation depends on VMX (aka. Altivec) > >> instructions, so the kernel must be built with Altivec support in > >> order for the crc32c code to build. > > > > Thanks for that, looks good. > > > > Acked-by: Anton Blanchard > > Thanks. Herbert, expecting you'll pick this up. In that case Anton can you please post it to linux-crypto? Thanks, -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
RE: [5.3] ucc_geth: Fix to avoid IS_ERR_VALUE abuses and dead code on 64bit systems.
From: Arnd Bergmann > Sent: 08 August 2016 16:13 > > On Monday, August 8, 2016 2:49:11 PM CEST David Laight wrote: > > > > > If qe_muram_alloc will return any error, Then IS_ERR_VALUE will always > > > return 0. it'll not call ucc_fast_free for any failure. Inside 'if code' > > > will be a dead code on 64bit. Even qe_muram_addr will return wrong > > > virtual address. Which can cause an error. > > > > > > kfree((void *)ugeth->tx_bd_ring_offset[i]); > > > > Erm, kfree() isn't the right function for things allocated by > > qe_muram_alloc(). > > > > I still thing you need to stop this code using IS_ERR_VALUE() at all. > > Those are two separate issues: > > a) The ucc_geth driver mixing kmalloc() memory with muram, and assigning >the result to "u32" and "void __iomem *" variables, both of which >are wrong at least half of the time. > > b) calling conventions of qe_muram_alloc() being defined in a way that >requires the use of IS_ERR_VALUE(), because '0' is a valid address >here. Yep, it is all a big bag of worms... '0' being valid is going to make tidying up after failure 'problematic'. > The first one can be solved by updating the network driver, ideally > by getting rid of the casts and using proper types and accessors, > while the second would require updating all users of that interface. It might be worth (at least as a compilation option) of embedding the 'muram offset' in a structure (passed and returned by value). The compiler can then check that the driver code is never be looking directly at the value. For 'b' zero can be made invalid by changing the places where the offset is added/subtracted. It could even be used to offset the saved physical and virtual addresses of the area - so not needing any extra code when the values are converted to physical/virtual addresses. David
Re: [PATCH] crypto: crc32c-vpmsum - Convert to CPU feature based module autoloading
Michael Ellerman wrote: > > This is actually an arch/powerpc patch, so I'll merge it unless Herbert > objects. It's fine with me. Cheers, -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH 2/5] kbuild: allow archs to select build for link dead code/data elimination
On Monday, August 8, 2016 9:19:47 AM CEST Alan Modra wrote: > On Sun, Aug 07, 2016 at 10:26:19PM +0200, Arnd Bergmann wrote: > > On Sunday, August 7, 2016 7:27:39 PM CEST Alan Modra wrote: > > > > > > If it can, then Nicholas' patch should be: > > > > > > *(.text.hot .text.hot.*) *(.text.unlikely .text.unlikely.*) > > > *(.text .text.*) > > > > > > If you can't put .text.fixup too far away then you may as well just use > > > > > > *(.text .text.*) > > > > I tried this version: > > > > diff --git a/include/asm-generic/vmlinux.lds.h > > b/include/asm-generic/vmlinux.lds.h > > index b1f8828e9eac..fc210dacac9a 100644 > > --- a/include/asm-generic/vmlinux.lds.h > > +++ b/include/asm-generic/vmlinux.lds.h > > @@ -438,7 +438,9 @@ > > * during second ld run in second ld pass when generating System.map */ > > #define TEXT_TEXT\ > > ALIGN_FUNCTION(); \ > > - *(.text.hot .text .text.fixup .text.unlikely .text.*) \ > > + *(.text.hot .text.hot.*)\ > > + *(.text.unlikely .text.fixup .text.unlikely.*) \ > > + *(.text .text.*)\ > > *(.ref.text)\ > > MEM_KEEP(init.text) \ > > MEM_KEEP(exit.text) \ > > > > but that failed to link an allyesconfig kernel because of references > > from .fixup to .text.*. Trying your version now: > > Well then, that proves you can't put .text.fixup too far aways from > the associated input section. > > > *(.text.hot .text.hot.*) *(.text.unlikely .text.unlikely.*) *(.text .text.*) > > Which means this is guaranteed to fail when you test it properly using > gcc's profiling options, in order to generate .text.hot* and/or > .text.unlikely* sections. I've investigated further and it seems that "*(.text.fixup) *(.text .text.*)" fails just because we list .text.fixup twice. The .text.fixup section was originally[1] introduced to work around the same link error that it is causing now: if we use recursive linking, merging .text and .text.fixup helps avoid the problems of sections that are >32MB before the final link. I have reverted that patch now, so ARM uses ".fixup" again like every other architecture does, and now "*(.fixup) *(.text .text.*)" works correctly, while ""*(.fixup) *(.text .fixup .text.*)" also fails the same way that I saw before: drivers/scsi/sg.o:(.fixup+0x4): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl' drivers/scsi/sg.o:(.fixup+0xc): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl' drivers/scsi/sg.o:(.fixup+0x14): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl' drivers/scsi/sg.o:(.fixup+0x1c): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl' drivers/scsi/sg.o:(.fixup+0x24): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl' drivers/scsi/sg.o:(.fixup+0x2c): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl' drivers/scsi/sg.o:(.fixup+0x34): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl' drivers/scsi/sg.o:(.fixup+0x3c): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl' drivers/scsi/sg.o:(.fixup+0x44): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl' I don't understand what led Andi Kleen to also move .text.hot and .text.unlikely together with .text [2], but this may have been a related issue. > It seems to me the right thing to do would be to change kernel asm to > generate .text.foo.fixup for any .text.foo section. A gas feature > available with binutils-2.26 enabled by --sectname-subst might help > with implementing that. I think this what Nico wanted to use anyway to eliminate more functions from the output. Arnd [1] http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=8321 http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=8322 [2] https://lkml.org/lkml/2015/7/19/377
Re: [5.3] ucc_geth: Fix to avoid IS_ERR_VALUE abuses and dead code on 64bit systems.
On Monday, August 8, 2016 2:49:11 PM CEST David Laight wrote: > > > If qe_muram_alloc will return any error, Then IS_ERR_VALUE will always > > return 0. it'll not call ucc_fast_free for any failure. Inside 'if code' > > will be a dead code on 64bit. Even qe_muram_addr will return wrong > > virtual address. Which can cause an error. > > > > kfree((void *)ugeth->tx_bd_ring_offset[i]); > > Erm, kfree() isn't the right function for things allocated by > qe_muram_alloc(). > > I still thing you need to stop this code using IS_ERR_VALUE() at all. Those are two separate issues: a) The ucc_geth driver mixing kmalloc() memory with muram, and assigning the result to "u32" and "void __iomem *" variables, both of which are wrong at least half of the time. b) calling conventions of qe_muram_alloc() being defined in a way that requires the use of IS_ERR_VALUE(), because '0' is a valid address here. The first one can be solved by updating the network driver, ideally by getting rid of the casts and using proper types and accessors, while the second would require updating all users of that interface. Arnd
Re: [PATCH v2] cxl: Use fixed width predefined types in data structure.
Le 05/08/2016 à 14:02, Philippe Bergheaud a écrit : This patch fixes a regression introduced by commit b810253. It substitutes the type __u8 to u8 in the uapi header cxl.h, because the latter is not always defined in userland build environments, in particular when cross-compiling libcxl on x86_64 linux machines (RHEL6.7 and Ubuntu 16.04). This patch also changes the size of the field data_size, and makes it constant, to support 32-bit userland applications running on big-endian ppc64 kernels transparently. This breaks the (young) API that has been merged in v4.8. Signed-off-by: Philippe Bergheaud --- Changes since v1: Added an explanation for the proposed API change in the log. Note: As far as I know, cxlflash is the only known user of the API. Yes, ideally, we'd like to change the type of 'data_size' to something smaller/constant and were expecting it's still doable since the API was merged to 4.8 and the expected user (cxlflash) hasn't started using the API yet. Reviewed-by: Frederic Barrat Fred
[PATCH v2] powerpc: Do not make the entire heap executable
On 32-bit powerps the ELF PLT sections of binaries (built with --bss-plt, or with a toolchain which defaults to it) look like this: [17] .sbss NOBITS 0002aff8 01aff8 14 00 WA 0 0 4 [18] .plt NOBITS 0002b00c 01aff8 84 00 WAX 0 0 4 [19] .bss NOBITS 0002b090 01aff8 a4 00 WA 0 0 4 Which results in an ELF load header: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align LOAD 0x019c70 0x00029c70 0x00029c70 0x01388 0x014c4 RWE 0x1 This is all correct, the load region containing the PLT is marked as executable. Note that the PLT starts at 0002b00c but the file mapping ends at 0002aff8, so the PLT falls in the 0 fill section described by the load header, and after a page boundary. Unfortunately the generic ELF loader ignores the X bit in the load headers when it creates the 0 filled non-file backed mappings. It assumes all of these mappings are RW BSS sections, which is not the case for PPC. gcc/ld has an option (--secure-plt) to not do this, this is said to incur a small performance penalty. Currently, to support 32-bit binaries with PLT in BSS kernel maps *entire brk area* with executable rights for all binaries, even --secure-plt ones. Stop doing that. Teach the ELF loader to check the X bit in the relevant load header and create 0 filled anonymous mappings that are executable if the load header requests that. The patch was originally posted in 2012 by Jason Gunthorpe and apparently ignored: https://lkml.org/lkml/2012/9/30/138 Lightly run-tested. Signed-off-by: Jason Gunthorpe Signed-off-by: Denys Vlasenko CC: Benjamin Herrenschmidt CC: Paul Mackerras CC: Kees Cook CC: Oleg Nesterov , CC: Michael Ellerman CC: Florian Weimer CC: linux...@kvack.org, CC: linuxppc-dev@lists.ozlabs.org CC: linux-ker...@vger.kernel.org --- Changes since v1: * wrapped lines to not exceed 79 chars * improved comment * expanded CC list arch/powerpc/include/asm/page.h| 10 +-- arch/powerpc/include/asm/page_32.h | 2 -- arch/powerpc/include/asm/page_64.h | 4 --- fs/binfmt_elf.c| 56 ++ 4 files changed, 45 insertions(+), 27 deletions(-) diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h index 56398e7..42d7ea1 100644 --- a/arch/powerpc/include/asm/page.h +++ b/arch/powerpc/include/asm/page.h @@ -225,15 +225,7 @@ extern long long virt_phys_offset; #endif #endif -/* - * Unfortunately the PLT is in the BSS in the PPC32 ELF ABI, - * and needs to be executable. This means the whole heap ends - * up being executable. - */ -#define VM_DATA_DEFAULT_FLAGS32(VM_READ | VM_WRITE | VM_EXEC | \ -VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC) - -#define VM_DATA_DEFAULT_FLAGS64(VM_READ | VM_WRITE | \ +#define VM_DATA_DEFAULT_FLAGS (VM_READ | VM_WRITE | \ VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC) #ifdef __powerpc64__ diff --git a/arch/powerpc/include/asm/page_32.h b/arch/powerpc/include/asm/page_32.h index 6a8e179..6113fa8 100644 --- a/arch/powerpc/include/asm/page_32.h +++ b/arch/powerpc/include/asm/page_32.h @@ -9,8 +9,6 @@ #endif #endif -#define VM_DATA_DEFAULT_FLAGS VM_DATA_DEFAULT_FLAGS32 - #ifdef CONFIG_NOT_COHERENT_CACHE #define ARCH_DMA_MINALIGN L1_CACHE_BYTES #endif diff --git a/arch/powerpc/include/asm/page_64.h b/arch/powerpc/include/asm/page_64.h index dd5f071..52d8e9c 100644 --- a/arch/powerpc/include/asm/page_64.h +++ b/arch/powerpc/include/asm/page_64.h @@ -159,10 +159,6 @@ do { \ #endif /* !CONFIG_HUGETLB_PAGE */ -#define VM_DATA_DEFAULT_FLAGS \ - (is_32bit_task() ? \ -VM_DATA_DEFAULT_FLAGS32 : VM_DATA_DEFAULT_FLAGS64) - /* * This is the default if a program doesn't have a PT_GNU_STACK * program header entry. The PPC64 ELF ABI has a non executable stack diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index a7a28110..50006d0 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -91,14 +91,25 @@ static struct linux_binfmt elf_format = { #define BAD_ADDR(x) ((unsigned long)(x) >= TASK_SIZE) -static int set_brk(unsigned long start, unsigned long end) +static int set_brk(unsigned long start, unsigned long end, int prot) { start = ELF_PAGEALIGN(start); end = ELF_PAGEALIGN(end); if (end > start) { - int error = vm_brk(start, end - start); - if (error) - return error; + /* Map the non-file portion of the last load header. If the + header is requesting these pages to be executeable then + we have to honour that, otherwise assume they are bss. */ + if (prot & PROT_EXEC) { + unsigned long addr; + addr = vm_mmap(0, start, end - start, prot, +
RE: [5.3] ucc_geth: Fix to avoid IS_ERR_VALUE abuses and dead code on 64bit systems.
From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org] On Behalf Of Arvind Yadav > IS_ERR_VALUE() assumes that parameter is an unsigned long. > It can not be used to check if 'unsigned int' is passed insted. > Which tends to reflect an error. > In 64bit architectures sizeof (int) == 4 && sizeof (long) == 8. > IS_ERR_VALUE(x) is ((x) >= (unsigned long)-4095). > IS_ERR_VALUE() of 'unsigned int' is always false because the 32bit > value is zero extended to 64 bits. You are being far too wordy above, and definitely below. > > Now problem in Freescale QEGigabit Ethernet-: > drivers/net/ethernet/freescale/ucc_geth.c > ... > qe_muram_addr(init_enet_pram_offset); > > qe_muram_alloc (a.k.a. cpm_muram_alloc) returns unsigned long. > Return value store in a u32 (init_enet_offset, exf_glbl_param_offset, > rx_glbl_pram_offset, tx_glbl_pram_offset, send_q_mem_reg_offset, > thread_dat_tx_offset, thread_dat_rx_offset, scheduler_offset, > tx_fw_statistics_pram_offset, rx_fw_statistics_pram_offset, > rx_irq_coalescing_tbl_offset, rx_bd_qs_tbl_offset, tx_bd_ring_offset, > init_enet_pram_offset and rx_bd_ring_offset). Inpenetrable... > If qe_muram_alloc will return any error, Then IS_ERR_VALUE will always > return 0. it'll not call ucc_fast_free for any failure. Inside 'if code' > will be a dead code on 64bit. Even qe_muram_addr will return wrong > virtual address. Which can cause an error. > > kfree((void *)ugeth->tx_bd_ring_offset[i]); Erm, kfree() isn't the right function for things allocated by qe_muram_alloc(). I still thing you need to stop this code using IS_ERR_VALUE() at all. David
[5.3] ucc_geth: Fix to avoid IS_ERR_VALUE abuses and dead code on 64bit systems.
IS_ERR_VALUE() assumes that parameter is an unsigned long. It can not be used to check if 'unsigned int' is passed insted. Which tends to reflect an error. In 64bit architectures sizeof (int) == 4 && sizeof (long) == 8. IS_ERR_VALUE(x) is ((x) >= (unsigned long)-4095). IS_ERR_VALUE() of 'unsigned int' is always false because the 32bit value is zero extended to 64 bits. Now problem in Freescale QEGigabit Ethernet-: drivers/net/ethernet/freescale/ucc_geth.c init_enet_offset = qe_muram_alloc(thread_size, thread_alignment); if (IS_ERR_VALUE(init_enet_offset)) { if (netif_msg_ifup(ugeth)) pr_err("Can not allocate DPRAM memory\n"); qe_put_snum((u8) snum); return -ENOMEM; } ugeth->tx_bd_ring_offset[j] = qe_muram_alloc(length, UCC_GETH_TX_BD_RING_ALIGNMENT); if (!IS_ERR_VALUE(ugeth->tx_bd_ring_offset[j])) ugeth->p_tx_bd_ring[j] = (u8 __iomem *) qe_muram_addr(ugeth-> tx_bd_ring_offset[j]); ugeth->rx_bd_ring_offset[j] = qe_muram_alloc(length, UCC_GETH_RX_BD_RING_ALIGNMENT); if (!IS_ERR_VALUE(ugeth->rx_bd_ring_offset[j])) ugeth->p_rx_bd_ring[j] = (u8 __iomem *) qe_muram_addr(ugeth-> rx_bd_ring_offset[j]); /* Allocate global tx parameter RAM page */ ugeth->tx_glbl_pram_offset = qe_muram_alloc(sizeof(struct ucc_geth_tx_global_pram), UCC_GETH_TX_GLOBAL_PRAM_ALIGNMENT); if (IS_ERR_VALUE(ugeth->tx_glbl_pram_offset)) { if (netif_msg_ifup(ugeth)) pr_err("Can not allocate DPRAM memory for p_tx_glbl_pram\n"); return -ENOMEM; } /* Size varies with number of Tx threads */ ugeth->thread_dat_tx_offset = qe_muram_alloc(numThreadsTxNumerical * sizeof(struct ucc_geth_thread_data_tx) + 32 * (numThreadsTxNumerical == 1), UCC_GETH_THREAD_DATA_ALIGNMENT); if (IS_ERR_VALUE(ugeth->thread_dat_tx_offset)) { if (netif_msg_ifup(ugeth)) pr_err ("Can not allocate DPRAM memory for p_thread_data_tx\n"); return -ENOMEM; } /* Size varies with number of Tx queues */ ugeth->send_q_mem_reg_offset = qe_muram_alloc(ug_info->numQueuesTx * sizeof(struct ucc_geth_send_queue_qd), UCC_GETH_SEND_QUEUE_QUEUE_DESCRIPTOR_ALIGNMENT); if (IS_ERR_VALUE(ugeth->send_q_mem_reg_offset)) { if (netif_msg_ifup(ugeth)) pr_err("Can not allocate DPRAM memory for p_send_q_mem_reg\n"); return -ENOMEM; } ugeth->scheduler_offset = qe_muram_alloc(sizeof(struct ucc_geth_scheduler), UCC_GETH_SCHEDULER_ALIGNMENT); if (IS_ERR_VALUE(ugeth->scheduler_offset)) { if (netif_msg_ifup(ugeth)) pr_err("Can not allocate DPRAM memory for p_scheduler\n"); return -ENOMEM; } ugeth->tx_fw_statistics_pram_offset = qe_muram_alloc(sizeof (struct ucc_geth_tx_firmware_statistics_pram), UCC_GETH_TX_STATISTICS_ALIGNMENT); if (IS_ERR_VALUE(ugeth->tx_fw_statistics_pram_offset)) { if (netif_msg_ifup(ugeth)) pr_err( "Can not allocate DPRAM memory for p_tx_fw_statistics_pram\n"); return -ENOMEM; } /* Allocate global rx parameter RAM page */ ugeth->rx_glbl_pram_offset = qe_muram_alloc(sizeof(struct ucc_geth_rx_global_pram), UCC_GETH_RX_GLOBAL_PRAM_ALIGNMENT); if (IS_ERR_VALUE(ugeth->rx_glbl_pram_offset)) { if (netif_msg_ifup(ugeth)) pr_err("Can not allocate DPRAM memory for p_rx_glbl_pram\n"); return -ENOMEM; } /* Size varies with number of Rx threads */ ugeth->thread_dat_rx_offset = qe_muram_alloc(numThreadsRxNumerical * sizeof(struct ucc_geth_thread_data_rx), UCC_GETH_THREAD_DATA_ALIGNMENT); if (IS_ERR_VALUE(ugeth->thread_dat_rx_offset)) { if (netif_msg_ifup(ugeth)) pr_err("Can not allocate DPRAM memory for p_thread_data_rx\n"); return -ENOMEM; } ugeth->rx_fw_statistics_pram_offset = qe_muram_alloc(sizeof (struct ucc_geth_rx_firmware_statistics_pram), UCC_GETH_RX_STATISTICS_ALIGNMENT); if (IS_ERR_VALUE(ugeth->rx_fw_statistics_pram_offset)) { if (netif_msg_ifup(ugeth)) pr_err(
Re: Problem booting Kernel 4.8-rc1 on Sam460ex ( Canyonlands)
On 2016/8/8 19:52, Julian Margetson wrote: > Problem booting Kernel 4.8-rc1 on Sam460ex ( Canyonlands) machine . > > This is what I get with the bisect. Hi Julian I have no Sam460ex machine, could you show the good and bad kernel bootlog? Thanks, Kefeng > > Regards > > Julian > > > 44a7185c2ae6383c88ff5b1ef2e2969f35d7b8b7 is the first bad commit > commit 44a7185c2ae6383c88ff5b1ef2e2969f35d7b8b7 > Author: Kefeng Wang > Date: Wed Jun 1 14:52:54 2016 +0800 > > of/platform: Add common method to populate default bus > > The arch code calls of_platform_populate() with default match table > when it wants to populate default bus. > > This patch introduce a new of_platform_default_populate_init() and make it > arch_initcall_sync(it should be later than some iommu configration, eg, > of_iommu_init() and swiotlb_late_init in arm64), then we can finish above > job in common method. > > In order to avoid the default bus being populated twice, simply checking > the flag of bus node whether has be set OF_POPULATED_BUS or not. > > After that, we can safely remove the caller in arch code. > > Btw, add debug print in of_platform_populate(), and use __func__ to > print function's name of of_platform_bus_create(). > > Cc: Rob Herring > Cc: Frank Rowand > Cc: Grant Likely > Signed-off-by: Kefeng Wang > Signed-off-by: Rob Herring > > :04 04 2d2438c33110bd5d0cba9540357c7e81f74485dd > 3ad033aad77298c3a6f50ed4c651d7336825d700 Mdrivers > root@julian-VirtualBox:/usr/src/linux-test# > > > git bisect log > git bisect start > # good: [523d939ef98fd712632d93a5a2b588e477a7565e] Linux 4.7 > git bisect good 523d939ef98fd712632d93a5a2b588e477a7565e > # bad: [29b4817d4018df78086157ea3a55c1d9424a7cfc] Linux 4.8-rc1 > git bisect bad 29b4817d4018df78086157ea3a55c1d9424a7cfc > # good: [1c88e19b0f6a8471ee50d5062721ba30b8fd4ba9] Merge branch 'akpm' > (patches from Andrew) > git bisect good 1c88e19b0f6a8471ee50d5062721ba30b8fd4ba9 > # bad: [77a87824ed676ca8ff8482e4157d3adb284fd381] > clocksource/drivers/clps_711x: fixup for "ARM: clps711x: > git bisect bad 77a87824ed676ca8ff8482e4157d3adb284fd381 > # bad: [c9b95e5961c0294e0efffeaa847c1a1e6369204c] Merge tag 'sound-4.8-rc1' > of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound > git bisect bad c9b95e5961c0294e0efffeaa847c1a1e6369204c > # good: [1056c9bd2702ea1bb79abf9bd1e78c578589d247] Merge tag > 'clk-for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux > git bisect good 1056c9bd2702ea1bb79abf9bd1e78c578589d247 > # bad: [bad60e6f259a01cf9f29a1ef8d435ab6c60b2de9] Merge tag 'powerpc-4.8-1' > of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux > git bisect bad bad60e6f259a01cf9f29a1ef8d435ab6c60b2de9 > # bad: [bad60e6f259a01cf9f29a1ef8d435ab6c60b2de9] Merge tag 'powerpc-4.8-1' > of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux > git bisect bad bad60e6f259a01cf9f29a1ef8d435ab6c60b2de9 > # good: [f7b9ebb79e90b19bf6a2cb805a536258437fc3fa] powerpc: Re-order > setup_panic() > git bisect good f7b9ebb79e90b19bf6a2cb805a536258437fc3fa > # good: [f7b9ebb79e90b19bf6a2cb805a536258437fc3fa] powerpc: Re-order > setup_panic() > git bisect good f7b9ebb79e90b19bf6a2cb805a536258437fc3fa > # good: [944171cbf499d3445c749f7c13c46de0a564a905] pNFS: Actively set > attributes as invalid if LAYOUTCOMMIT is outstanding > git bisect good 944171cbf499d3445c749f7c13c46de0a564a905 > # bad: [b325e04ea21081439f0f3e7fe1117e883a9031d8] Merge branch > 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip > git bisect bad b325e04ea21081439f0f3e7fe1117e883a9031d8 > # bad: [b325e04ea21081439f0f3e7fe1117e883a9031d8] Merge branch > 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip > git bisect bad b325e04ea21081439f0f3e7fe1117e883a9031d8 > # bad: [0e13f99d3a4816a3996a69721388b4cf1a982cf8] Documentation: dt: i2c: use > correct STMicroelectronics vendor prefix > git bisect bad 0e13f99d3a4816a3996a69721388b4cf1a982cf8 > # bad: [1a1d2f9968660e016883b21c499158c08d9c87bf] mips: use > of_platform_default_populate() to populate default bus > git bisect bad 1a1d2f9968660e016883b21c499158c08d9c87bf > # bad: [db41252cf8b58c8a57226204e113b9dffaebf693] cris: Remove unnecessary > of_platform_populate with default match table > git bisect bad db41252cf8b58c8a57226204e113b9dffaebf693 > # bad: [61c78644e7f1bd9445b3d7ddc3d35989a38985ee] arc: Remove unnecessary > of_platform_populate with default match table > git bisect bad 61c78644e7f1bd9445b3d7ddc3d35989a38985ee > # bad: [44a7185c2ae6383c88ff5b1ef2e2969f35d7b8b7] of/platform: Add common > method to populate default bus > git bisect bad 44a7185c2ae6383c88ff5b1ef2e2969f35d7b8b7 > # good: [bb8e15d60462a84a25a3bf33e8bc29b46c6d470a] of: iommu: make > of_iommu_init() postcore_initcall_sync > git bisect good bb8e15d60462a84a25a3bf33e8bc29b46c6d470a > # first bad commit: [44a7185c2ae638
Re: [PATCH] powerpc/pci: Only do fixed PHB numbering on powernv
On 08/07/2016 08:48 PM, Gavin Shan wrote: On Fri, Aug 05, 2016 at 04:40:56PM +1000, Michael Ellerman wrote: The recent commit 63a72284b159 ("powerpc/pci: Assign fixed PHB number based on device-tree properties"), added code to read a 64-bit property from the device tree, and if not found read a 32-bit property (reg). There was a bug in the 32-bit case, on big endian machines, due to the use of the 64-bit value to read the 32-bit property. The cast of &prop means we end up writing to the high 32-bit of prop, leaving the low 32-bits containing whatever junk was on the stack. If that junk value was non-zero, and < MAX_PHBS, we would end up using it as the PHB id. This exposed a second problem, which is that Xorg can't cope with a domain number > 256. So for now drop the fallback to reg, and only use "ibm,opal-phbid" for PHB numbering. Fixes: 63a72284b159 ("powerpc/pci: Assign fixed PHB number based on device-tree properties") Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/pci-common.c | 24 +--- 1 file changed, 9 insertions(+), 15 deletions(-) This is my bad. Guilherme limited the reg case to pseries only, but I made it generic. I tested it on G5, Cell etc. which all booted - but that's not really a good enough test. Michael and Gavin, thanks very much for fixing and commenting on the issue. I'm sorry about the bug on Big Endian machines, my mistake! I'll fix it in a future patch, but right now I have 2 questions so I can investigate better the issue found on Xorg: (i) What is the specific issue? Do you have some logs or at least a "high-level" description of the problem in Xorg? I took a look in its code and PCI domain is coded as u16, which is correct/expected. So it seems a subtle bug we should investigate and hopefully fix. (ii) Why is it related to the absence of pseries check?! You said this was your bad, but as far as I understand, Xorg runs in pSeries too so the issue should also be there heheh The bug was reported/found in another platform? As Gavin pointed, the important/interesting part of the patch is related to pSeries, in which we have PHB hotplug and so the patch allows network adapters to work well in PHB hotplug scenario, even if using predictable naming. For now, I guess this fix is pretty good, but would be really important to have this feature on pSeries too - I'll put some effort in solving the Xorg bug. Thanks, Guilherme diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c index f93942b4b6a6..f9c2ffaa1884 100644 --- a/arch/powerpc/kernel/pci-common.c +++ b/arch/powerpc/kernel/pci-common.c @@ -77,29 +77,23 @@ EXPORT_SYMBOL(get_pci_dma_ops); */ static int get_phb_number(struct device_node *dn) { - int ret, phb_id = -1; + int ret, phb_id; u64 prop; /* -* Try fixed PHB numbering first, by checking archs and reading -* the respective device-tree properties. Firstly, try powernv by -* reading "ibm,opal-phbid", only present in OPAL environment. +* On powernv machines we should have the "ibm,opal-phbid" property, if +* so use that so that PHB numbers are predictable. */ ret = of_property_read_u64(dn, "ibm,opal-phbid", &prop); - if (ret) - ret = of_property_read_u32_index(dn, "reg", 1, (u32 *)&prop); - - if (!ret) + if (ret == 0) { phb_id = (int)(prop & (MAX_PHBS - 1)); Michael, It's going to ignore the issue addressed by commit 63a72284b159 ("powerpc/pci: Assign fixed PHB number based on device-tree properties"). The commit tried to get the PCI domain number from "reg" on pSeries. We still need check "reg" and try to get PCI domain number from it on pSeries only: if (ret && machine_is(pseries)) ret = of_property_read_u32_index(dn, "reg", 1, (u32 *)&prop); Thanks, Gavin - /* We need to be sure to not use the same PHB number twice. */ - if ((phb_id >= 0) && !test_and_set_bit(phb_id, phb_bitmap)) - return phb_id; + /* Make sure never to use the same PHB number twice. */ + if (!test_and_set_bit(phb_id, phb_bitmap)) + return phb_id; + } - /* -* If not pseries nor powernv, or if fixed PHB numbering tried to add -* the same PHB number twice, then fallback to dynamic PHB numbering. -*/ + /* If fixed PHB numbering failed, fallback to dynamic PHB numbering. */ phb_id = find_first_zero_bit(phb_bitmap, MAX_PHBS); BUG_ON(phb_id >= MAX_PHBS); set_bit(phb_id, phb_bitmap); -- 2.7.4
Re: Problem booting Kernel 4.8-rc1 on Sam460ex ( Canyonlands)
On 8/8/2016 8:22 AM, Kefeng Wang wrote: On 2016/8/8 19:52, Julian Margetson wrote: Problem booting Kernel 4.8-rc1 on Sam460ex ( Canyonlands) machine . This is what I get with the bisect. Hi Julian I have no Sam460ex machine, could you show the good and bad kernel bootlog? Thanks, Kefeng Regards Julian 44a7185c2ae6383c88ff5b1ef2e2969f35d7b8b7 is the first bad commit commit 44a7185c2ae6383c88ff5b1ef2e2969f35d7b8b7 Author: Kefeng Wang Date: Wed Jun 1 14:52:54 2016 +0800 of/platform: Add common method to populate default bus The arch code calls of_platform_populate() with default match table when it wants to populate default bus. This patch introduce a new of_platform_default_populate_init() and make it arch_initcall_sync(it should be later than some iommu configration, eg, of_iommu_init() and swiotlb_late_init in arm64), then we can finish above job in common method. In order to avoid the default bus being populated twice, simply checking the flag of bus node whether has be set OF_POPULATED_BUS or not. After that, we can safely remove the caller in arch code. Btw, add debug print in of_platform_populate(), and use __func__ to print function's name of of_platform_bus_create(). Cc: Rob Herring Cc: Frank Rowand Cc: Grant Likely Signed-off-by: Kefeng Wang Signed-off-by: Rob Herring :04 04 2d2438c33110bd5d0cba9540357c7e81f74485dd 3ad033aad77298c3a6f50ed4c651d7336825d700 Mdrivers root@julian-VirtualBox:/usr/src/linux-test# git bisect log git bisect start # good: [523d939ef98fd712632d93a5a2b588e477a7565e] Linux 4.7 git bisect good 523d939ef98fd712632d93a5a2b588e477a7565e # bad: [29b4817d4018df78086157ea3a55c1d9424a7cfc] Linux 4.8-rc1 git bisect bad 29b4817d4018df78086157ea3a55c1d9424a7cfc # good: [1c88e19b0f6a8471ee50d5062721ba30b8fd4ba9] Merge branch 'akpm' (patches from Andrew) git bisect good 1c88e19b0f6a8471ee50d5062721ba30b8fd4ba9 # bad: [77a87824ed676ca8ff8482e4157d3adb284fd381] clocksource/drivers/clps_711x: fixup for "ARM: clps711x: git bisect bad 77a87824ed676ca8ff8482e4157d3adb284fd381 # bad: [c9b95e5961c0294e0efffeaa847c1a1e6369204c] Merge tag 'sound-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound git bisect bad c9b95e5961c0294e0efffeaa847c1a1e6369204c # good: [1056c9bd2702ea1bb79abf9bd1e78c578589d247] Merge tag 'clk-for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux git bisect good 1056c9bd2702ea1bb79abf9bd1e78c578589d247 # bad: [bad60e6f259a01cf9f29a1ef8d435ab6c60b2de9] Merge tag 'powerpc-4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux git bisect bad bad60e6f259a01cf9f29a1ef8d435ab6c60b2de9 # bad: [bad60e6f259a01cf9f29a1ef8d435ab6c60b2de9] Merge tag 'powerpc-4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux git bisect bad bad60e6f259a01cf9f29a1ef8d435ab6c60b2de9 # good: [f7b9ebb79e90b19bf6a2cb805a536258437fc3fa] powerpc: Re-order setup_panic() git bisect good f7b9ebb79e90b19bf6a2cb805a536258437fc3fa # good: [f7b9ebb79e90b19bf6a2cb805a536258437fc3fa] powerpc: Re-order setup_panic() git bisect good f7b9ebb79e90b19bf6a2cb805a536258437fc3fa # good: [944171cbf499d3445c749f7c13c46de0a564a905] pNFS: Actively set attributes as invalid if LAYOUTCOMMIT is outstanding git bisect good 944171cbf499d3445c749f7c13c46de0a564a905 # bad: [b325e04ea21081439f0f3e7fe1117e883a9031d8] Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip git bisect bad b325e04ea21081439f0f3e7fe1117e883a9031d8 # bad: [b325e04ea21081439f0f3e7fe1117e883a9031d8] Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip git bisect bad b325e04ea21081439f0f3e7fe1117e883a9031d8 # bad: [0e13f99d3a4816a3996a69721388b4cf1a982cf8] Documentation: dt: i2c: use correct STMicroelectronics vendor prefix git bisect bad 0e13f99d3a4816a3996a69721388b4cf1a982cf8 # bad: [1a1d2f9968660e016883b21c499158c08d9c87bf] mips: use of_platform_default_populate() to populate default bus git bisect bad 1a1d2f9968660e016883b21c499158c08d9c87bf # bad: [db41252cf8b58c8a57226204e113b9dffaebf693] cris: Remove unnecessary of_platform_populate with default match table git bisect bad db41252cf8b58c8a57226204e113b9dffaebf693 # bad: [61c78644e7f1bd9445b3d7ddc3d35989a38985ee] arc: Remove unnecessary of_platform_populate with default match table git bisect bad 61c78644e7f1bd9445b3d7ddc3d35989a38985ee # bad: [44a7185c2ae6383c88ff5b1ef2e2969f35d7b8b7] of/platform: Add common method to populate default bus git bisect bad 44a7185c2ae6383c88ff5b1ef2e2969f35d7b8b7 # good: [bb8e15d60462a84a25a3bf33e8bc29b46c6d470a] of: iommu: make of_iommu_init() postcore_initcall_sync git bisect good bb8e15d60462a84a25a3bf33e8bc29b46c6d470a # first bad commit: [44a7185c2ae6383c88ff5b1ef2e2969f35d7b8b7] of/platform: Add common method to populate default bus . On all Bad Boots no output after Loading Devic
Problem booting Kernel 4.8-rc1 on Sam460ex ( Canyonlands)
Problem booting Kernel 4.8-rc1 on Sam460ex ( Canyonlands) machine . This is what I get with the bisect. Regards Julian 44a7185c2ae6383c88ff5b1ef2e2969f35d7b8b7 is the first bad commit commit 44a7185c2ae6383c88ff5b1ef2e2969f35d7b8b7 Author: Kefeng Wang Date: Wed Jun 1 14:52:54 2016 +0800 of/platform: Add common method to populate default bus The arch code calls of_platform_populate() with default match table when it wants to populate default bus. This patch introduce a new of_platform_default_populate_init() and make it arch_initcall_sync(it should be later than some iommu configration, eg, of_iommu_init() and swiotlb_late_init in arm64), then we can finish above job in common method. In order to avoid the default bus being populated twice, simply checking the flag of bus node whether has be set OF_POPULATED_BUS or not. After that, we can safely remove the caller in arch code. Btw, add debug print in of_platform_populate(), and use __func__ to print function's name of of_platform_bus_create(). Cc: Rob Herring Cc: Frank Rowand Cc: Grant Likely Signed-off-by: Kefeng Wang Signed-off-by: Rob Herring :04 04 2d2438c33110bd5d0cba9540357c7e81f74485dd 3ad033aad77298c3a6f50ed4c651d7336825d700 Mdrivers root@julian-VirtualBox:/usr/src/linux-test# git bisect log git bisect start # good: [523d939ef98fd712632d93a5a2b588e477a7565e] Linux 4.7 git bisect good 523d939ef98fd712632d93a5a2b588e477a7565e # bad: [29b4817d4018df78086157ea3a55c1d9424a7cfc] Linux 4.8-rc1 git bisect bad 29b4817d4018df78086157ea3a55c1d9424a7cfc # good: [1c88e19b0f6a8471ee50d5062721ba30b8fd4ba9] Merge branch 'akpm' (patches from Andrew) git bisect good 1c88e19b0f6a8471ee50d5062721ba30b8fd4ba9 # bad: [77a87824ed676ca8ff8482e4157d3adb284fd381] clocksource/drivers/clps_711x: fixup for "ARM: clps711x: git bisect bad 77a87824ed676ca8ff8482e4157d3adb284fd381 # bad: [c9b95e5961c0294e0efffeaa847c1a1e6369204c] Merge tag 'sound-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound git bisect bad c9b95e5961c0294e0efffeaa847c1a1e6369204c # good: [1056c9bd2702ea1bb79abf9bd1e78c578589d247] Merge tag 'clk-for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux git bisect good 1056c9bd2702ea1bb79abf9bd1e78c578589d247 # bad: [bad60e6f259a01cf9f29a1ef8d435ab6c60b2de9] Merge tag 'powerpc-4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux git bisect bad bad60e6f259a01cf9f29a1ef8d435ab6c60b2de9 # bad: [bad60e6f259a01cf9f29a1ef8d435ab6c60b2de9] Merge tag 'powerpc-4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux git bisect bad bad60e6f259a01cf9f29a1ef8d435ab6c60b2de9 # good: [f7b9ebb79e90b19bf6a2cb805a536258437fc3fa] powerpc: Re-order setup_panic() git bisect good f7b9ebb79e90b19bf6a2cb805a536258437fc3fa # good: [f7b9ebb79e90b19bf6a2cb805a536258437fc3fa] powerpc: Re-order setup_panic() git bisect good f7b9ebb79e90b19bf6a2cb805a536258437fc3fa # good: [944171cbf499d3445c749f7c13c46de0a564a905] pNFS: Actively set attributes as invalid if LAYOUTCOMMIT is outstanding git bisect good 944171cbf499d3445c749f7c13c46de0a564a905 # bad: [b325e04ea21081439f0f3e7fe1117e883a9031d8] Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip git bisect bad b325e04ea21081439f0f3e7fe1117e883a9031d8 # bad: [b325e04ea21081439f0f3e7fe1117e883a9031d8] Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip git bisect bad b325e04ea21081439f0f3e7fe1117e883a9031d8 # bad: [0e13f99d3a4816a3996a69721388b4cf1a982cf8] Documentation: dt: i2c: use correct STMicroelectronics vendor prefix git bisect bad 0e13f99d3a4816a3996a69721388b4cf1a982cf8 # bad: [1a1d2f9968660e016883b21c499158c08d9c87bf] mips: use of_platform_default_populate() to populate default bus git bisect bad 1a1d2f9968660e016883b21c499158c08d9c87bf # bad: [db41252cf8b58c8a57226204e113b9dffaebf693] cris: Remove unnecessary of_platform_populate with default match table git bisect bad db41252cf8b58c8a57226204e113b9dffaebf693 # bad: [61c78644e7f1bd9445b3d7ddc3d35989a38985ee] arc: Remove unnecessary of_platform_populate with default match table git bisect bad 61c78644e7f1bd9445b3d7ddc3d35989a38985ee # bad: [44a7185c2ae6383c88ff5b1ef2e2969f35d7b8b7] of/platform: Add common method to populate default bus git bisect bad 44a7185c2ae6383c88ff5b1ef2e2969f35d7b8b7 # good: [bb8e15d60462a84a25a3bf33e8bc29b46c6d470a] of: iommu: make of_iommu_init() postcore_initcall_sync git bisect good bb8e15d60462a84a25a3bf33e8bc29b46c6d470a # first bad commit: [44a7185c2ae6383c88ff5b1ef2e2969f35d7b8b7] of/platform: Add common method to populate default bus
RE: [PATCH v13 3/6] CPM/QE: use genalloc to manage CPM/QE muram
On 6/8/2016 03:48AM, Christophe Leroy wrote : > -Original Message- > From: Christophe Leroy [mailto:christophe.le...@c-s.fr] > Sent: Saturday, August 06, 2016 12:59 AM > To: Zhao Qiang ; lau...@codeaurora.org > Cc: catalin.mari...@arm.com; linux-ker...@vger.kernel.org; Scott Wood > ; o...@lixom.net; a...@linux-foundation.org; linuxppc- > d...@lists.ozlabs.org; x@freescale.com > Subject: Re: [PATCH v13 3/6] CPM/QE: use genalloc to manage CPM/QE muram > > > > Le 30/11/2015 à 03:48, Zhao Qiang a écrit : > > Use genalloc to manage CPM/QE muram instead of rheap. > > > > Signed-off-by: Zhao Qiang > > --- > > Changes for v9: > > - splitted from patch 3/5, modify cpm muram management functions. > > Changes for v10: > > - modify cpm muram first, then move to qe_common > > - modify commit. > > Changes for v11: > > - factor out the common alloc code > > - modify min_alloc_order to zero for cpm_muram_alloc_fixed. > > Changes for v12: > > - Nil > > Changes for v13: > > - rebase > > > > arch/powerpc/include/asm/cpm.h | 3 + > > arch/powerpc/platforms/Kconfig | 4 +- > > arch/powerpc/sysdev/cpm_common.c | 126 > +++ > > lib/genalloc.c | 2 +- > > 4 files changed, 94 insertions(+), 41 deletions(-) > > > > With that patch applied, I get the following Oops on a 8xx (Which has a CPM1). > > cpm_muram_init() is called from setup_arch() > > It seems that gen_pool_add() tries to kmalloc() memory but the SLAB is not > available yet. > Thank you for your comments, I can't find a 8xx board, would you like to test the patch Attached on your board? > [0.00] Unable to handle kernel paging request for data at > address 0x0008 > [0.00] Faulting instruction address: 0xc01acce0 > [0.00] Oops: Kernel access of bad area, sig: 11 [#1] > [0.00] PREEMPT CMPC885 > [0.00] CPU: 0 PID: 0 Comm: swapper Not tainted > 4.4.14-s3k-dev-g0886ed8-svn #5 > [0.00] task: c05183e0 ti: c0536000 task.ti: c0536000 > [0.00] NIP: c01acce0 LR: c0011068 CTR: > [0.00] REGS: c0537e50 TRAP: 0300 Not tainted > (4.4.14-s3k-dev-g0886ed8-svn) > [0.00] MSR: 1032 CR: 28044428 XER: > [0.00] DAR: 0008 DSISR: c000 > GPR00: c0011068 c0537f00 c05183e0 9000 0bc0 > GPR08: ff003000 ff00b000 ff003bbf 22044422 100d43a8 > 07ff94e8 > GPR16: 07bb5d70 07ff81f4 07ff81f4 07ff81f4 > > GPR24: 07ffb3a0 07fe7628 c055 c7ffa190 c054 ff003bbf > 0001 > [0.00] NIP [c01acce0] gen_pool_add_virt+0x14/0xdc > [0.00] LR [c0011068] cpm_muram_init+0xd4/0x18c > [0.00] Call Trace: > [0.00] [c0537f00] [0200] 0x200 (unreliable) > [0.00] [c0537f20] [c0011068] cpm_muram_init+0xd4/0x18c > [0.00] [c0537f70] [c0494684] cpm_reset+0xb4/0xc8 > [0.00] [c0537f90] [c0494c64] cmpc885_setup_arch+0x10/0x30 > [0.00] [c0537fa0] [c0493cd4] setup_arch+0x130/0x168 > [0.00] [c0537fb0] [c04906bc] start_kernel+0x88/0x380 > [0.00] [c0537ff0] [c0002224] start_here+0x38/0x98 > [0.00] Instruction dump: > [0.00] 91430010 91430014 80010014 83e1000c 7c0803a6 38210010 > 4e800020 7c0802a6 > [0.00] 9421ffe0 bf61000c 90010024 7c7e1b78 <80630008> 7c9c2378 > 7cc31c30 3863001f > [0.00] ---[ end trace dc8fa200cb88537f ]--- 0001-CPM1-use-subsys_initcall-to-init-cpm1.patch Description: 0001-CPM1-use-subsys_initcall-to-init-cpm1.patch
[PATCH] cxl: Set psl_fir_cntl to production environment value
Switch the setting of psl_fir_cntl from debug to production environment recommended value. It mostly affects the PSL behavior when an error is raised in psl_fir1/2. Tested with cxlflash. Signed-off-by: Frederic Barrat --- drivers/misc/cxl/pci.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c index d152e2d..4be25c4 100644 --- a/drivers/misc/cxl/pci.c +++ b/drivers/misc/cxl/pci.c @@ -379,7 +379,7 @@ static int calc_capp_routing(struct pci_dev *dev, u64 *chipid, u64 *capp_unit_id static int init_implementation_adapter_psl_regs(struct cxl *adapter, struct pci_dev *dev) { - u64 psl_dsnctl; + u64 psl_dsnctl, psl_fircntl; u64 chipid; u64 capp_unit_id; int rc; @@ -398,8 +398,11 @@ static int init_implementation_adapter_psl_regs(struct cxl *adapter, struct pci_ cxl_p1_write(adapter, CXL_PSL_RESLCKTO, 0x2000200ULL); /* snoop write mask */ cxl_p1_write(adapter, CXL_PSL_SNWRALLOC, 0xULL); - /* set fir_accum */ - cxl_p1_write(adapter, CXL_PSL_FIR_CNTL, 0x0800ULL); + /* set fir_cntl to recommended value for production env */ + psl_fircntl = (0x2ULL << (63-3)); /* ce_report */ + psl_fircntl |= (0x1ULL << (63-6)); /* FIR_report */ + psl_fircntl |= 0x1ULL; /* ce_thresh */ + cxl_p1_write(adapter, CXL_PSL_FIR_CNTL, psl_fircntl); /* for debugging with trace arrays */ cxl_p1_write(adapter, CXL_PSL_TRACE, 0xFF7CULL); -- 2.7.4
[PATCH] powerpc: rebuild vdsos correctly
When using if_changed, we need to add FORCE to dependencies, otherwise we don't get command line change checking amongst other things. This has resulted in vdsos not being rebuilt when switching between big and little endian. Signed-off-by: Nicholas Piggin --- There look to be some other cases in arch/powerpc/ of if_changed being used without FORCE too, which should also be investigated. Also, why are we using if_changed_deps? --- arch/powerpc/kernel/vdso32/Makefile | 8 arch/powerpc/kernel/vdso64/Makefile | 8 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/kernel/vdso32/Makefile b/arch/powerpc/kernel/vdso32/Makefile index cbabd14..ae1f245 100644 --- a/arch/powerpc/kernel/vdso32/Makefile +++ b/arch/powerpc/kernel/vdso32/Makefile @@ -30,7 +30,7 @@ CPPFLAGS_vdso32.lds += -P -C -Upowerpc $(obj)/vdso32_wrapper.o : $(obj)/vdso32.so # link rule for the .so file, .lds has to be first -$(obj)/vdso32.so.dbg: $(src)/vdso32.lds $(obj-vdso32) +$(obj)/vdso32.so.dbg: $(src)/vdso32.lds $(obj-vdso32) FORCE $(call if_changed,vdso32ld) # strip rule for the .so file @@ -39,14 +39,14 @@ $(obj)/%.so: $(obj)/%.so.dbg FORCE $(call if_changed,objcopy) # assembly rules for the .S files -$(obj-vdso32): %.o: %.S +$(obj-vdso32): %.o: %.S FORCE $(call if_changed_dep,vdso32as) # actual build commands quiet_cmd_vdso32ld = VDSO32L $@ - cmd_vdso32ld = $(CROSS32CC) $(c_flags) -Wl,-T $^ -o $@ + cmd_vdso32ld = $(CROSS32CC) $(c_flags) -o $@ -Wl,-T$(filter %.lds,$^) $(filter %.o,$^) quiet_cmd_vdso32as = VDSO32A $@ - cmd_vdso32as = $(CROSS32CC) $(a_flags) -c -o $@ $< + cmd_vdso32as = $(CROSS32CC) $(a_flags) -o $@ -c $< # install commands for the unstripped file quiet_cmd_vdso_install = INSTALL $@ diff --git a/arch/powerpc/kernel/vdso64/Makefile b/arch/powerpc/kernel/vdso64/Makefile index c710802..0d5f04a 100644 --- a/arch/powerpc/kernel/vdso64/Makefile +++ b/arch/powerpc/kernel/vdso64/Makefile @@ -23,7 +23,7 @@ CPPFLAGS_vdso64.lds += -P -C -U$(ARCH) $(obj)/vdso64_wrapper.o : $(obj)/vdso64.so # link rule for the .so file, .lds has to be first -$(obj)/vdso64.so.dbg: $(src)/vdso64.lds $(obj-vdso64) +$(obj)/vdso64.so.dbg: $(src)/vdso64.lds $(obj-vdso64) FORCE $(call if_changed,vdso64ld) # strip rule for the .so file @@ -32,14 +32,14 @@ $(obj)/%.so: $(obj)/%.so.dbg FORCE $(call if_changed,objcopy) # assembly rules for the .S files -$(obj-vdso64): %.o: %.S +$(obj-vdso64): %.o: %.S FORCE $(call if_changed_dep,vdso64as) # actual build commands quiet_cmd_vdso64ld = VDSO64L $@ - cmd_vdso64ld = $(CC) $(c_flags) -Wl,-T $^ -o $@ + cmd_vdso64ld = $(CC) $(c_flags) -o $@ -Wl,-T$(filter %.lds,$^) $(filter %.o,$^) quiet_cmd_vdso64as = VDSO64A $@ - cmd_vdso64as = $(CC) $(a_flags) -c -o $@ $< + cmd_vdso64as = $(CC) $(a_flags) -o $@ -c $< # install commands for the unstripped file quiet_cmd_vdso_install = INSTALL $@ -- 2.8.1
[PATCH] powerpc/64be: must build vdso32 as BE on LE host
vdso32 should be built with `gcc -m32 --big-endian` when compiling a big endian kernel. Signed-off-by: Nicholas Piggin --- arch/powerpc/Makefile | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile index 709a22a..bf183b2a 100644 --- a/arch/powerpc/Makefile +++ b/arch/powerpc/Makefile @@ -82,6 +82,8 @@ else ifeq ($(call cc-option-yn,-mbig-endian),y) override CC+= -mbig-endian override AS+= -mbig-endian +override CROSS32CC += -mbig-endian +override CROSS32AS += -mbig-endian endif override LD+= -EB LDEMULATION:= ppc -- 2.8.1
Re: [RESEND][PATCH v2 2/2] powerpc/fadump: parse fadump reserve memory size based on memory range
On Monday 08 August 2016 02:26 PM, Michael Ellerman wrote: Hari Bathini writes: On Friday 05 August 2016 12:23 AM, Hari Bathini wrote: On Thursday 04 August 2016 03:15 PM, Michael Ellerman wrote: The code already knows how to reserve 5% based on the size of the machine's memory, as long as no commandline parameter is passed. So why can't we just use that logic? That is the default value reserved but not a good enough value for every case. It is a bit difficult to come up with a robust formula that works for every case as new kernel changes could make the values obsolete. But it won't be all that difficult to find values that work for different memory ranges for a given kernel version. Passing that as range based input with "fadump_reserve_mem" parameter would work for every memory configuration on a given system, which is what this patch is trying to provide.. You want me to add this to the changelog on respin? Hi Michael, I'm not really convinced. Distros are going to want to specify a fixed set of values for different memory sizes, at least that's what I've seen in the past with kdump. So I don't see why we can't just do that in the kernel with a formula based on memory size, and maybe some other information. Agreed. Such support would be great but this patch is adding support for a new syntax for an existing parameter which should still be good to have? Maybe the formula is more complicated than 5% of RAM, but it shouldn't be *that* much more complicated. Depending on what all kernel versions that need support, this can get ugly? I could be completely wrong though.. Thanks Hari cheers ___ kexec mailing list ke...@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH 1/5] kbuild: allow architectures to use thin archives instead of ld -r
On Monday, August 8, 2016 1:25:32 PM CEST Nicholas Piggin wrote: > On Sat, 6 Aug 2016 22:10:45 +0200 > Sam Ravnborg wrote: > > > Hi Nicholas. > > > > On Fri, Aug 05, 2016 at 10:11:59PM +1000, Nicholas Piggin wrote: > > > From: Stephen Rothwell > > > > > > ld -r is an incremental link used to create built-in.o files in build > > > subdirectories. It produces relocatable object files containing all > > > its input files, and these are are then pulled together and relocated > > > in the final link. Aside from the bloat, this constrains the final > > > link relocations, which has bitten large powerpc builds with > > > unresolvable relocations in the final link. > > > > > > Alan Modra has recommended the kernel use thin archives for linking. > > > This is an alternative and means that the linker has more information > > > available to it when it links the kernel. > > > > > > This patch enables a config option architectures can select, > > If we want to do this, then I suggest to make the logic reverse. > > Architectures that for some reasons cannot use this should > > have the possibility to avoid it. But let it be enabled by default. > > I was thinking the build matrix (architectures x build options x toolchains) > is a bit too large to switch it for everybody. I've far from even tested it > for a fraction of powerpc builds. I would prefer arch maintainers to switch > it themselves, but I do hope we can move everybody and just remove the old > method within a few releases. > > But I'm happy to go with whatever arch and kbuild maintainers prefer, so I > appreciate any discussion on it. How about making the option visible for everyone with CONFIG_EXPERT? That way you don't get it by default, but you do get it with randconfig and allmodconfig build testing. Arnd
Re: [RESEND PATCH v3 2/2] powernv: Fix MCE handler to avoid trashing CR0/CR1 registers.
On 08/08/2016 02:28 PM, Michael Ellerman wrote: > Mahesh J Salgaonkar writes: > >> From: Mahesh Salgaonkar >> >> The current implementation of MCE early handling modifies CR0/1 registers >> without saving its old values. Fix this by moving early check for >> powersaving mode to machine_check_handle_early(). >> >> The power architecture 2.06 or later allows the possibility of getting >> machine check while in nap/sleep/winkle. The last bit of HSPRG0 is set >> to 1, if thread is woken up from winkle. Hence, clear the last bit of >> HSPRG0 (r13) before MCE handler starts using it as paca pointer. >> >> Also, the current code always puts the thread into nap state irrespective >> of whatever idle state it woke up from. Fix that by looking at >> paca->thread_idle_state and put the thread back into same state where it >> came from. >> >> Cc: sta...@vger.kernel.org > > The information I need is "which commit introduced the bug". It fixes commit 1c51089: powerpc/book3s: Return from interrupt if coming from evil context. > Given that I can work out which stable releases we should backport the > patch to. It will need an backport to stable once it hits upstream. -Mahesh. > > cheers >
Re: [RESEND PATCH v3 2/2] powernv: Fix MCE handler to avoid trashing CR0/CR1 registers.
Mahesh J Salgaonkar writes: > From: Mahesh Salgaonkar > > The current implementation of MCE early handling modifies CR0/1 registers > without saving its old values. Fix this by moving early check for > powersaving mode to machine_check_handle_early(). > > The power architecture 2.06 or later allows the possibility of getting > machine check while in nap/sleep/winkle. The last bit of HSPRG0 is set > to 1, if thread is woken up from winkle. Hence, clear the last bit of > HSPRG0 (r13) before MCE handler starts using it as paca pointer. > > Also, the current code always puts the thread into nap state irrespective > of whatever idle state it woke up from. Fix that by looking at > paca->thread_idle_state and put the thread back into same state where it > came from. > > Cc: sta...@vger.kernel.org The information I need is "which commit introduced the bug". Given that I can work out which stable releases we should backport the patch to. cheers
Re: [RESEND][PATCH v2 2/2] powerpc/fadump: parse fadump reserve memory size based on memory range
Hari Bathini writes: > On Friday 05 August 2016 12:23 AM, Hari Bathini wrote: >> On Thursday 04 August 2016 03:15 PM, Michael Ellerman wrote: >>> The code already knows how to reserve 5% based on the size of the >>> machine's >>> memory, as long as no commandline parameter is passed. So why can't we >>> just use that logic? >> >> That is the default value reserved but not a good enough value for >> every case. It is a bit difficult to come up with a robust formula >> that works for every case as new kernel changes could make the >> values obsolete. But it won't be all that difficult to find values that >> work for different memory ranges for a given kernel version. >> Passing that as range based input with "fadump_reserve_mem" >> parameter would work for every memory configuration on a >> given system, which is what this patch is trying to provide.. > > You want me to add this to the changelog on respin? I'm not really convinced. Distros are going to want to specify a fixed set of values for different memory sizes, at least that's what I've seen in the past with kdump. So I don't see why we can't just do that in the kernel with a formula based on memory size, and maybe some other information. Maybe the formula is more complicated than 5% of RAM, but it shouldn't be *that* much more complicated. cheers
[RESEND PATCH v3 2/2] powernv: Fix MCE handler to avoid trashing CR0/CR1 registers.
From: Mahesh Salgaonkar The current implementation of MCE early handling modifies CR0/1 registers without saving its old values. Fix this by moving early check for powersaving mode to machine_check_handle_early(). The power architecture 2.06 or later allows the possibility of getting machine check while in nap/sleep/winkle. The last bit of HSPRG0 is set to 1, if thread is woken up from winkle. Hence, clear the last bit of HSPRG0 (r13) before MCE handler starts using it as paca pointer. Also, the current code always puts the thread into nap state irrespective of whatever idle state it woke up from. Fix that by looking at paca->thread_idle_state and put the thread back into same state where it came from. Cc: sta...@vger.kernel.org Reported-by: Paul Mackerras Signed-off-by: Mahesh Salgaonkar Reviewed-by: Shreyas B. Prabhu --- Change in v3: - Rebase to Linus' master. Change in v2: - Call IDLE_STATE_ENTER_SEQ(PPC_NAP) instead of power7_enter_nap_mode() to be consistent with other part of code. --- arch/powerpc/kernel/exceptions-64s.S | 69 -- 1 file changed, 40 insertions(+), 29 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 694def6..a59c9cc 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -144,29 +144,14 @@ machine_check_pSeries_1: * vector */ SET_SCRATCH0(r13) /* save r13 */ -#ifdef CONFIG_PPC_P7_NAP -BEGIN_FTR_SECTION - /* Running native on arch 2.06 or later, check if we are -* waking up from nap. We only handle no state loss and -* supervisor state loss. We do -not- handle hypervisor -* state loss at this time. + /* +* Running native on arch 2.06 or later, we may wakeup from winkle +* inside machine check. If yes, then last bit of HSPGR0 would be set +* to 1. Hence clear it unconditionally. */ - mfspr r13,SPRN_SRR1 - rlwinm. r13,r13,47-31,30,31 - OPT_GET_SPR(r13, SPRN_CFAR, CPU_FTR_CFAR) - beq 9f - - mfspr r13,SPRN_SRR1 - rlwinm. r13,r13,47-31,30,31 - /* waking up from powersave (nap) state */ - cmpwi cr1,r13,2 - /* Total loss of HV state is fatal. let's just stay stuck here */ - OPT_GET_SPR(r13, SPRN_CFAR, CPU_FTR_CFAR) - bgt cr1,. -9: - OPT_SET_SPR(r13, SPRN_CFAR, CPU_FTR_CFAR) -END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206) -#endif /* CONFIG_PPC_P7_NAP */ + GET_PACA(r13) + clrrdi r13,r13,1 + SET_PACA(r13) EXCEPTION_PROLOG_0(PACA_EXMC) BEGIN_FTR_SECTION b machine_check_powernv_early @@ -1273,25 +1258,51 @@ machine_check_handle_early: * Check if thread was in power saving mode. We come here when any * of the following is true: * a. thread wasn't in power saving mode -* b. thread was in power saving mode with no state loss or -*supervisor state loss +* b. thread was in power saving mode with no state loss, +*supervisor state loss or hypervisor state loss. * -* Go back to nap again if (b) is true. +* Go back to nap/sleep/winkle mode again if (b) is true. */ rlwinm. r11,r12,47-31,30,31 /* Was it in power saving mode? */ beq 4f /* No, it wasn;t */ /* Thread was in power saving mode. Go back to nap again. */ cmpwi r11,2 - bne 3f - /* Supervisor state loss */ + blt 3f + /* Supervisor/Hypervisor state loss */ li r0,1 stb r0,PACA_NAPSTATELOST(r13) 3: bl machine_check_queue_event MACHINE_CHECK_HANDLER_WINDUP GET_PACA(r13) ld r1,PACAR1(r13) - li r3,PNV_THREAD_NAP - b pnv_enter_arch207_idle_mode + /* +* Check what idle state this CPU was in and go back to same mode +* again. +*/ + lbz r3,PACA_THREAD_IDLE_STATE(r13) + cmpwi r3,PNV_THREAD_NAP + bgt 10f + IDLE_STATE_ENTER_SEQ(PPC_NAP) + /* No return */ +10: + cmpwi r3,PNV_THREAD_SLEEP + bgt 2f + IDLE_STATE_ENTER_SEQ(PPC_SLEEP) + /* No return */ + +2: + /* +* Go back to winkle. Please note that this thread was woken up in +* machine check from winkle and have not restored the per-subcore +* state. Hence before going back to winkle, set last bit of HSPGR0 +* to 1. This will make sure that if this thread gets woken up +* again at reset vector 0x100 then it will get chance to restore +* the subcore state. +*/ + ori r13,r13,1 + SET_PACA(r13) + IDLE_STATE_ENTER_SEQ(PPC_WINKLE) + /* No return */ 4: #endif /*
[RESEND PATCH v3 1/2] powernv: Move IDLE_STATE_ENTER_SEQ macro to cpuidle.h
From: Mahesh Salgaonkar Move IDLE_STATE_ENTER_SEQ macro to cpuidle.h so that MCE handler changes in subsequent patch can use it. No functionality change. Cc: sta...@vger.kernel.org Signed-off-by: Mahesh Salgaonkar --- Change in v3: - Rebase to Linus' master. --- arch/powerpc/include/asm/cpuidle.h | 13 + arch/powerpc/kernel/idle_book3s.S | 12 2 files changed, 13 insertions(+), 12 deletions(-) diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h index 3d7fc06..01b8a13 100644 --- a/arch/powerpc/include/asm/cpuidle.h +++ b/arch/powerpc/include/asm/cpuidle.h @@ -19,4 +19,17 @@ extern u64 pnv_first_deep_stop_state; #endif +/* Idle state entry routines */ +#ifdef CONFIG_PPC_P7_NAP +#defineIDLE_STATE_ENTER_SEQ(IDLE_INST) \ + /* Magic NAP/SLEEP/WINKLE mode enter sequence */\ + std r0,0(r1); \ + ptesync;\ + ld r0,0(r1); \ +1: cmp cr0,r0,r0; \ + bne 1b; \ + IDLE_INST; \ + b . +#endif /* CONFIG_PPC_P7_NAP */ + #endif diff --git a/arch/powerpc/kernel/idle_book3s.S b/arch/powerpc/kernel/idle_book3s.S index 45784ec..748b912 100644 --- a/arch/powerpc/kernel/idle_book3s.S +++ b/arch/powerpc/kernel/idle_book3s.S @@ -44,18 +44,6 @@ PSSCR_PSLL_MASK | PSSCR_TR_MASK | \ PSSCR_MTL_MASK -/* Idle state entry routines */ - -#defineIDLE_STATE_ENTER_SEQ(IDLE_INST) \ - /* Magic NAP/SLEEP/WINKLE mode enter sequence */\ - std r0,0(r1); \ - ptesync;\ - ld r0,0(r1); \ -1: cmp cr0,r0,r0; \ - bne 1b; \ - IDLE_INST; \ - b . - .text /*
Re: [RESEND][PATCH v2 2/2] powerpc/fadump: parse fadump reserve memory size based on memory range
On Friday 05 August 2016 12:23 AM, Hari Bathini wrote: On Thursday 04 August 2016 03:15 PM, Michael Ellerman wrote: Hari Bathini writes: ... /** * fadump_calculate_reserve_size(): reserve variable boot area 5% of System RAM * @@ -212,12 +262,17 @@ static inline unsigned long fadump_calculate_reserve_size(void) { unsigned long size; +/* sets fw_dump.reserve_bootvar */ +parse_fadump_reserve_mem(); + /* * Check if the size is specified through fadump_reserve_mem= cmdline * option. If yes, then use that. */ if (fw_dump.reserve_bootvar) return fw_dump.reserve_bootvar; +else +printk(KERN_INFO "fadump: calculating default boot size\n"); /* divide by 20 to get 5% of value */ size = memblock_end_of_DRAM() / 20; The code already knows how to reserve 5% based on the size of the machine's memory, as long as no commandline parameter is passed. So why can't we just use that logic? Hi Michael, That is the default value reserved but not a good enough value for every case. It is a bit difficult to come up with a robust formula that works for every case as new kernel changes could make the values obsolete. But it won't be all that difficult to find values that work for different memory ranges for a given kernel version. Passing that as range based input with "fadump_reserve_mem" parameter would work for every memory configuration on a given system, which is what this patch is trying to provide.. Hi Michael, You want me to add this to the changelog on respin? Thanks Hari Thanks Hari cheers
Re: [PATCH v3 19/21] powerpc: Add option to use jump label for mmu_has_feature()
Hi, > This patch causes an oops when building with the gold linker: Found the problem. On binutils .meminit.text is within _stext/_etext: [Nr] Name TypeAddress OffSize ES Flg Lk Inf Al [ 3] .meminit.text PROGBITSc0989d14 999d14 00225c 00 AX 0 0 4 c099 R _etext But on gold it is not: c097 A _etext [Nr] Name TypeAddress OffSize ES Flg Lk Inf Al [ 3] .meminit.text PROGBITSc0970bcc 980bcc 002220 00 AX 0 0 4 As a result kernel_text_address() returns false, and __jump_label_update() fails to update: if (entry->code && kernel_text_address(entry->code)) arch_jump_label_transform(entry, jump_label_type(entry)); Seems like we get the correct layout on binutils by luck and we need to explicitly handle .meminit.text. Anton
Re: [PATCH] powernv: Load correct TOC pointer while waking up from winkle.
* Benjamin Herrenschmidt [2016-08-06 08:38:53]: > On Fri, 2016-08-05 at 19:13 +0530, Mahesh J Salgaonkar wrote: > > From: Mahesh Salgaonkar > > > > The function pnv_restore_hyp_resource() loads the TOC into r2 from > > the invalid PACA pointer before fixing r13 value. This do not affect > > POWER ISA 3.0 but it does have an impact on POWER ISA 2.07 or less > > leading CPU to get stuck forever. > > When was this broken ? Should this get backported to stable ? Broken for 4.8-rc1 only after moving the SPRN_HSPRG0 checking to idle_book3s.S. Hence no back ports needed. --Vaidy