Re: [PATCH v2 5/6] X86: remove redundant cpuidle_idle_call()
On Wed, Jan 29, 2014 at 03:14:40PM -0500, Nicolas Pitre wrote: Looking into some cpuidle drivers for x86 I found at least one that doesn't respect this convention. Damn. Which one? We should probably fix it :-) ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v2 0/6] setting the table for integration of cpuidle with the scheduler
On Wed, Jan 29, 2014 at 12:45:07PM -0500, Nicolas Pitre wrote: As everyone should know by now, we want to integrate the cpuidle governor with the scheduler for a more efficient idling of CPUs. In order to help the transition, this small patch series moves the existing interaction with cpuidle from architecture code to generic core code. The ARM, PPC, SH and X86 architectures are concerned. No functional change should have occurred yet. @peterz: Are you willing to pick up those patches? Yeah.. no objections. Should I pick these up or will you be sending another round?
Re: [RFC PATCH 07/10] KVM: PPC: BOOK3S: PR: Emulate facility status and control register
On 30.01.2014 at 07:00, Paul Mackerras pau...@samba.org wrote: On Tue, Jan 28, 2014 at 10:14:12PM +0530, Aneesh Kumar K.V wrote: We allow priv-mode update of this. The guest value is saved in fscr, and the value actually used is saved in shadow_fscr. shadow_fscr only contains values that are allowed by the host. On a facility unavailable interrupt, if the facility is allowed by fscr but disabled in shadow_fscr, we need to emulate the support. Currently all but EBB is disabled. We still don't support performance monitoring in PR guest. ...
+/*
+ * Save the current fscr in shadow fscr
+ */
+mfspr r3,SPRN_FSCR
+PPC_STL r3, VCPU_SHADOW_FSCR(r7)
I don't think you need to do this. What could possibly have changed FSCR since we loaded it on the way into the guest? The interrupt cause is part of fscr. But yes, we only need to store that on an fscr interrupt. Do we use anything from fscr inside the kernel? Could we switch it lazily on vcpu_load/put? Alex Paul.
Re: [RFC PATCH 02/10] KVM: PPC: BOOK3S: PR: Emulate virtual timebase register
On 30.01.2014 at 06:49, Paul Mackerras pau...@samba.org wrote: On Tue, Jan 28, 2014 at 10:14:07PM +0530, Aneesh Kumar K.V wrote: The virtual time base register is a per-VM register and needs to be saved and restored on VM exit and entry. Writing to VTB is not allowed in the privileged mode. ...
+#ifdef CONFIG_PPC_BOOK3S_64
+#define mfvtb()	({unsigned long rval;	\
+	asm volatile("mfspr %0, %1" :	\
+		"=r" (rval) : "i" (SPRN_VTB)); rval;})
The mfspr will be a no-op on anything before POWER8, meaning the result will be whatever value was in the destination GPR before the mfspr. I suppose that may not matter if the result is only ever used when we're running on a POWER8 host, but I would feel more comfortable if we had explicit feature tests to make sure of that, rather than possibly doing computations with unpredictable values. With your patch, a guest on a POWER7 or a PPC970 could do a read from VTB and get garbage -- first, there is nothing to stop userspace from requesting POWER8 emulation on an older machine, and secondly, even if the virtual machine is a PPC970 (say) you don't implement unimplemented SPR semantics for VTB (no-op if PR=0, illegal instruction interrupt if PR=1). On the whole I think it is reasonable to reject an attempt to set the virtual PVR to a POWER8 PVR value if we are not running on a POWER8 host, because emulating all the new POWER8 features in software (particularly transactional memory) would not be feasible. Alex may disagree. :) We don't have a good feature flag indicator that tells kvm what the guest cpu is capable of. So yes, I think it's reasonable to just not expose p8 registers on p8 for now. In theory it's of course possible to emulate a lot of p8 features on pre-p8 hardware, but I'm not sure it's worth the effort. If anyone wants to spend the time to work on it I'd be happy to take patches though ;) Alex Paul.
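The "explicit feature test" Paul asks for above can be sketched in portable C. This is only an illustration under stated assumptions: struct cpu_caps, spr_reader, read_vtb_checked() and fake_vtb() are hypothetical names made up here and are not the kernel's feature-test API; the callback stands in for the real mfspr of SPRN_VTB so the sketch can run on any host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a CPU feature mask; not the kernel's API. */
struct cpu_caps {
	bool has_vtb;	/* true only on hosts that implement VTB */
};

/* Hypothetical callback standing in for mfspr(SPRN_VTB). */
typedef uint64_t (*spr_reader)(void);

/*
 * Read VTB only when the host implements it. On failure *val is left
 * untouched, so callers never compute with a stale register value.
 */
static bool read_vtb_checked(const struct cpu_caps *caps, spr_reader rd,
			     uint64_t *val)
{
	assert(caps && rd && val);
	if (!caps->has_vtb)
		return false;
	*val = rd();
	return true;
}

/* Illustrative reader returning a fixed value. */
static uint64_t fake_vtb(void)
{
	return 0x1234;
}
```

A caller would check the return value and fall back to rejecting or emulating the access when it is false, rather than using whatever happened to be in the destination.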
PCIe Access - achieve bursts without DMA
Hello PPC-developers, I'm currently trying to benchmark access speeds to our PCIe-connected IP cores located inside our FPGA. On x86-based systems I was able to achieve bursts for both read and write access. On PPC32, using an e500v2, I have had no success at all so far. For writing I tried ioremap_wc(), as I did on x86, but my writes still go out as single requests, one after another. For reads, I noticed I could not use ioremap_cache() on PPC, so I used plain ioremap() here. I tried several ways to read from the device, from simple readl(), memcpy_fromio() and memcpy() to cacheable_memcpy(), with no improvement. Even issuing a batch of prefetch() calls for all the memory to be read did not result in read bursts. The results are really poor: I can write at around 40 MiByte/s, but read at only about 3 MiByte/s. After hours of studying the reference manual from Freescale, looking into other code and searching the web, I'm close to giving up. Maybe one of you has some more pointers for me; I'd appreciate every hint that leads me to a solution. Maybe I just missed something, or lack knowledge about this architecture in general. Thanks for reading. Michael
Re: [PATCH 2/2] Fix compile error of pgtable-ppc64.h
On Thu, Jan 30, 2014 at 09:57:36AM +1100, Benjamin Herrenschmidt wrote: On Wed, 2014-01-29 at 10:45 -0800, Greg KH wrote: On Tue, Jan 28, 2014 at 05:52:42PM +0530, Aneesh Kumar K.V wrote: From: Li Zhong zh...@linux.vnet.ibm.com It seems that a forward declaration doesn't work well with a typedef; use struct spinlock directly to avoid the following build errors:
In file included from include/linux/spinlock.h:81,
from include/linux/seqlock.h:35,
from include/linux/time.h:5,
from include/uapi/linux/timex.h:56,
from include/linux/timex.h:56,
from include/linux/sched.h:17,
from arch/powerpc/kernel/asm-offsets.c:17:
include/linux/spinlock_types.h:76: error: redefinition of typedef 'spinlock_t'
/root/linux-next/arch/powerpc/include/asm/pgtable-ppc64.h:563: note: previous declaration of 'spinlock_t' was here
build fix for upstream SHA1: b3084f4db3aeb991c507ca774337c7e7893ed04f for 3.13 stable series I don't understand, why is this needed? Is there a corresponding patch upstream that already does this? What went wrong with a normal backport of the patch to 3.13? There's a corresponding patch in powerpc-next that I'm about to send to Linus today, but for the backport, the fix could be folded into the original offending patch. Oh come on, you know better than to try to send me a patch that isn't in Linus's tree already. Crap, I can't take that at all. Send me the git commit id when it is in Linus's tree, otherwise I'm not taking it. And no, don't fold in anything, that's not ok either. I'll just go drop this patch entirely from all of my -stable trees for now. Feel free to resend them when all of the needed stuff is upstream. greg k-h
[PATCH] powerpc/eeh: drop taken reference to driver on eeh_rmv_device
Commit f5c57710dd62dd06f176934a8b4b8accbf00f9f8 (powerpc/eeh: Use partial hotplug for EEH unaware drivers) introduces eeh_rmv_device, which may grab a reference to a driver, but not release it. That prevents a driver from being removed after it has gone through EEH recovery. This patch drops the reference in either exit path if it was taken.

Signed-off-by: Thadeu Lima de Souza Cascardo casca...@linux.vnet.ibm.com
---
 arch/powerpc/kernel/eeh_driver.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 7bb30dc..afe7337 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -364,7 +364,7 @@ static void *eeh_rmv_device(void *data, void *userdata)
 		return NULL;
 	driver = eeh_pcid_get(dev);
 	if (driver && driver->err_handler)
-		return NULL;
+		goto out;
 	/* Remove it from PCI subsystem */
 	pr_debug("EEH: Removing %s without EEH sensitive driver\n",
@@ -377,6 +377,9 @@ static void *eeh_rmv_device(void *data, void *userdata)
 	pci_stop_and_remove_bus_device(dev);
 	pci_unlock_rescan_remove();
+out:
+	if (driver)
+		eeh_pcid_put(dev);
 	return NULL;
 }
--
1.7.1
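The shape of the fix above (take a reference, then funnel every exit through one label that drops it) can be tried outside the kernel with a small model. The names below (struct fake_driver, drv_get(), drv_put(), rmv_device()) are hypothetical stand-ins for illustration only, not the EEH or PCI API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver object with a reference count. */
struct fake_driver {
	int refs;
	int has_err_handler;
};

/* Take a reference if a driver is bound; mirrors the role of a "get" helper. */
static struct fake_driver *drv_get(struct fake_driver *d)
{
	if (d)
		d->refs++;
	return d;
}

/* Drop a reference previously taken with drv_get(). */
static void drv_put(struct fake_driver *d)
{
	assert(d->refs > 0);
	d->refs--;
}

/* Returns 1 if the device was "removed", 0 if removal was skipped. */
static int rmv_device(struct fake_driver *bound)
{
	int removed = 0;
	struct fake_driver *driver = drv_get(bound);

	if (driver && driver->has_err_handler)
		goto out;	/* skip removal, but still drop the reference */

	removed = 1;		/* stands in for the actual device removal */
out:
	if (driver)
		drv_put(driver);
	return removed;
}
```

With an early `return` in place of `goto out`, the first branch would leave refs at 1, which is the leak the commit message describes.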
Re: [PATCH v2 0/6] setting the table for integration of cpuidle with the scheduler
On Thu, 30 Jan 2014, Peter Zijlstra wrote: On Wed, Jan 29, 2014 at 12:45:07PM -0500, Nicolas Pitre wrote: As everyone should know by now, we want to integrate the cpuidle governor with the scheduler for a more efficient idling of CPUs. In order to help the transition, this small patch series moves the existing interaction with cpuidle from architecture code to generic core code. The ARM, PPC, SH and X86 architectures are concerned. No functional change should have occurred yet. @peterz: Are you willing to pick up those patches? Yeah.. no objections. Should I pick these up or will you be sending another round? I think you could pick them now, taking care of picking up the amended #1/6. Nicolas
Re: [PATCH v2 1/6] idle: move the cpuidle entry point to the generic idle loop
On 01/30/2014 06:28 AM, Nicolas Pitre wrote: On Thu, 30 Jan 2014, Preeti U Murthy wrote: Hi Nicolas, On 01/30/2014 02:01 AM, Nicolas Pitre wrote: On Wed, 29 Jan 2014, Nicolas Pitre wrote: In order to integrate cpuidle with the scheduler, we must have a better proximity in the core code with what cpuidle is doing and not delegate such interaction to arch code. Architectures implementing arch_cpu_idle() should simply enter a cheap idle mode in the absence of a proper cpuidle driver. Signed-off-by: Nicolas Pitre n...@linaro.org Acked-by: Daniel Lezcano daniel.lezc...@linaro.org As mentioned in my reply to Olof's comment on patch #5/6, here's a new version of this patch adding the safety local_irq_enable() to the core code. - 8 From: Nicolas Pitre nicolas.pi...@linaro.org Subject: idle: move the cpuidle entry point to the generic idle loop In order to integrate cpuidle with the scheduler, we must have a better proximity in the core code with what cpuidle is doing and not delegate such interaction to arch code. Architectures implementing arch_cpu_idle() should simply enter a cheap idle mode in the absence of a proper cpuidle driver. In both cases i.e. whether it is a cpuidle driver or the default arch_cpu_idle(), the calling convention expects IRQs to be disabled on entry and enabled on exit. There is a warning in place already but let's add a forced IRQ enable here as well. This will allow for removing the forced IRQ enable some implementations do locally and Why would this patch allow for removing the forced IRQ enable that are being done on some archs in arch_cpu_idle()? Isn't this patch expecting the default arch_cpu_idle() to have re-enabled the interrupts after exiting from the default idle state? Its supposed to only catch faulty cpuidle drivers that haven't enabled IRQs on exit from idle state but are expected to have done so, isn't it? Exact. 
However x86 currently does this: if (cpuidle_idle_call()) x86_idle(); else local_irq_enable(); So whenever cpuidle_idle_call() is successful, IRQs are unconditionally enabled whether or not the underlying cpuidle driver has properly done it. And the reason is that some of the x86 cpuidle drivers fail to enable IRQs before returning. So the idea is to get rid of this unconditional IRQ enabling and let the core issue a warning instead (as well as enabling IRQs to allow the system to run). But what I don't get with your comment is that the local_irq_enable is done from the cpuidle common framework in 'cpuidle_enter_state'; it is not done from the arch-specific backend cpuidle driver. So the code above could be: if (cpuidle_idle_call()) x86_idle(); without the else section; this local_irq_enable is pointless. Or maybe I missed something?
RE: PCIe Access - achieve bursts without DMA
From Moese, Michael Hello PPC-developers, I'm currently trying to benchmark access speeds to our PCIe-connected IP-cores located inside our FPGA. On x86-based systems I was able to achieve bursts for both read and write access. On PPC32, using an e500v2, I had no success at all so far. I'm not sure that you can. I had to write a simple driver for the PCIe CSB bridge dma on a 83xx ppc. I think that might be the one in the e500v2. I don't know how fast 'normal' PCIe slaves are, but we were accessing an Altera fpga and the latency is less than pedestrian. I think an ISA bus can run faster! With moderate length transfers, the throughput was more than adequate. David
[PATCH v2] kexec/ppc64 fix device tree endianess issues for memory attributes
All the attributes exposed in the device tree are in Big Endian format. This patch adds the byte swap operation for some entries which were not yet processed, including those fixed by the following kernel patch: https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-January/114720.html

To work in PPC64 Little Endian mode, kexec now requires that the kernel patch mentioned above is applied to the kexecing kernel.

Tested on ppc64 LPAR (kexec/dump) and ppc64le in a Qemu/KVM guest (kexec)

Changes from v1:
* add processing of the following entries:
  - ibm,dynamic-reconfiguration-memory
  - chosen/linux,kernel-end
  - chosen/linux,crashkernel-base & size
  - chosen/linux,memory-limit
  - chosen/linux,htab-base & size
  - linux,tce-base & size
  - memory@/reg

Signed-off-by: Laurent Dufour lduf...@linux.vnet.ibm.com
---
 kexec/arch/ppc64/crashdump-ppc64.c |    9 ---
 kexec/arch/ppc64/kexec-ppc64.c     |   44 +++-
 kexec/fs2dt.c                      |   19
 3 files changed, 48 insertions(+), 24 deletions(-)

diff --git a/kexec/arch/ppc64/crashdump-ppc64.c b/kexec/arch/ppc64/crashdump-ppc64.c
index e31dd6d..c0d575d 100644
--- a/kexec/arch/ppc64/crashdump-ppc64.c
+++ b/kexec/arch/ppc64/crashdump-ppc64.c
@@ -146,12 +146,12 @@ static int get_dyn_reconf_crash_memory_ranges(void)
 			return -1;
 		}
-		start = ((uint64_t *)buf)[DRCONF_ADDR];
+		start = be64_to_cpu(((uint64_t *)buf)[DRCONF_ADDR]);
 		end = start + lmb_size;
 		if (start == 0 && end >= (BACKUP_SRC_END + 1))
 			start = BACKUP_SRC_END + 1;
-		flags = (*((uint32_t *)&buf[DRCONF_FLAGS]));
+		flags = be32_to_cpu((*((uint32_t *)&buf[DRCONF_FLAGS])));
 		/* skip this block if the reserved bit is set in flags (0x80)
 		   or if the block is not assigned to this partition (0x8) */
 		if ((flags & 0x80) || !(flags & 0x8))
@@ -252,8 +252,9 @@ static int get_crash_memory_ranges(struct memory_range **range, int *ranges)
 				goto err;
 			}
-			start = ((unsigned long long *)buf)[0];
-			end = start + ((unsigned long long *)buf)[1];
+			start = be64_to_cpu(((unsigned long long *)buf)[0]);
+			end = start +
+				be64_to_cpu(((unsigned long long *)buf)[1]);
 			if (start == 0 && end >= (BACKUP_SRC_END + 1))
 				start = BACKUP_SRC_END + 1;
diff --git a/kexec/arch/ppc64/kexec-ppc64.c b/kexec/arch/ppc64/kexec-ppc64.c
index af9112b..49b291d 100644
--- a/kexec/arch/ppc64/kexec-ppc64.c
+++ b/kexec/arch/ppc64/kexec-ppc64.c
@@ -167,7 +167,7 @@ static int get_dyn_reconf_base_ranges(void)
 	 * lmb_size, num_of_lmbs(global variables) are
 	 * initialized once here.
 	 */
-	lmb_size = ((uint64_t *)buf)[0];
+	lmb_size = be64_to_cpu(((uint64_t *)buf)[0]);
 	fclose(file);
 	strcpy(fname, "/proc/device-tree/");
@@ -183,7 +183,7 @@ static int get_dyn_reconf_base_ranges(void)
 		fclose(file);
 		return -1;
 	}
-	num_of_lmbs = ((unsigned int *)buf)[0];
+	num_of_lmbs = be32_to_cpu(((unsigned int *)buf)[0]);
 	for (i = 0; i < num_of_lmbs; i++) {
 		if ((n = fread(buf, 1, 24, file)) < 0) {
@@ -194,7 +194,7 @@ static int get_dyn_reconf_base_ranges(void)
 		if (nr_memory_ranges >= max_memory_ranges)
 			return -1;
-		start = ((uint64_t *)buf)[0];
+		start = be64_to_cpu(((uint64_t *)buf)[0]);
 		end = start + lmb_size;
 		add_base_memory_range(start, end);
 	}
@@ -278,8 +278,8 @@ static int get_base_ranges(void)
 			if (realloc_memory_ranges() < 0)
 				break;
 		}
-		start = ((uint64_t *)buf)[0];
-		end = start + ((uint64_t *)buf)[1];
+		start = be64_to_cpu(((uint64_t *)buf)[0]);
+		end = start + be64_to_cpu(((uint64_t *)buf)[1]);
 		add_base_memory_range(start, end);
 		fclose(file);
 	}
@@ -363,6 +363,7 @@ static int get_devtree_details(unsigned long kexec_flags)
 			goto error_openfile;
 		}
 		fclose(file);
+		kernel_end = be64_to_cpu(kernel_end);
 		/* Add kernel memory to exclude_range */
 		exclude_range[i].start = 0x0UL;
@@ -386,6 +387,7 @@ static int get_devtree_details(unsigned long kexec_flags)
 			goto error_openfile;
 		}
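The conversions in the patch above come down to one rule: device tree cells are big-endian regardless of the host. As a rough illustration, the sketch below decodes a (start, size) pair byte by byte instead of calling be64_to_cpu(); that is a deliberately different technique, chosen so the example does not depend on any byte-order header. The helper names read_be64(), read_be32() and parse_reg() are hypothetical and do not appear in kexec-tools.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a big-endian 64-bit value byte by byte; host byte order is irrelevant. */
static uint64_t read_be64(const unsigned char *p)
{
	uint64_t v = 0;

	for (size_t i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Same idea for a 32-bit cell, such as a flags word. */
static uint32_t read_be32(const unsigned char *p)
{
	uint32_t v = 0;

	for (size_t i = 0; i < 4; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Parse a "reg"-style property made of two 64-bit cells: start and size. */
static void parse_reg(const unsigned char *buf, uint64_t *start, uint64_t *end)
{
	assert(buf && start && end);
	*start = read_be64(buf);
	*end = *start + read_be64(buf + 8);
}
```

Casting the buffer to `uint64_t *` and dereferencing it, as the pre-patch code did, yields the same numbers only on a big-endian host.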
Re: [PATCH v2 6/6] cpu/idle.c: move to sched/idle.c
On Wed, Jan 29, 2014 at 12:45:13PM -0500, Nicolas Pitre wrote: Integration of cpuidle with the scheduler requires that the idle loop be closely integrated with the scheduler proper. Moving cpu/idle.c into the sched directory will allow for a smoother integration, and eliminate a subdirectory which contained only one source file. Signed-off-by: Nicolas Pitre n...@linaro.org
---
 kernel/Makefile              | 1 -
 kernel/cpu/Makefile          | 1 -
 kernel/sched/Makefile        | 2 +-
 kernel/{cpu => sched}/idle.c | 0
 4 files changed, 1 insertion(+), 3 deletions(-)
 delete mode 100644 kernel/cpu/Makefile
 rename kernel/{cpu => sched}/idle.c (100%)

--- a/kernel/sched/Makefile
+++ b/kernel/sched/Makefile
@@ -11,7 +11,7 @@ ifneq ($(CONFIG_SCHED_OMIT_FRAME_POINTER),y)
 CFLAGS_core.o := $(PROFILING) -fno-omit-frame-pointer
 endif
-obj-y += core.o proc.o clock.o cputime.o idle_task.o fair.o rt.o stop_task.o
+obj-y += core.o proc.o clock.o cputime.o idle_task.o idle.o fair.o rt.o stop_task.o
 obj-y += wait.o completion.o
 obj-$(CONFIG_SMP) += cpupri.o
 obj-$(CONFIG_SCHED_AUTOGROUP) += auto_group.o
diff --git a/kernel/cpu/idle.c b/kernel/sched/idle.c
similarity index 100%
rename from kernel/cpu/idle.c
rename to kernel/sched/idle.c

This is not a valid patch for PATCH(1). Please try again.
[RESEND PATCH] powerpc/relocate fix relocate processing in LE mode
The relocation code does not work in little endian mode because the r_info field, which is a 64-bit value, must be read from the right offset. The current code is optimized to read the r_info field as a 32-bit value starting at the middle of the double word (offset 12). When running in LE mode, the value read is not correct since only the MSB is read. This patch removes this optimization, which consists of dealing with a 32-bit value instead of a 64-bit one. This way it works in both big and little endian mode.

Signed-off-by: Laurent Dufour lduf...@linux.vnet.ibm.com
---
 arch/powerpc/kernel/reloc_64.S |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/reloc_64.S b/arch/powerpc/kernel/reloc_64.S
index b47a0e1..1482327 100644
--- a/arch/powerpc/kernel/reloc_64.S
+++ b/arch/powerpc/kernel/reloc_64.S
@@ -69,8 +69,8 @@ _GLOBAL(relocate)
 	 * R_PPC64_RELATIVE ones.
 	 */
 	mtctr	r8
-5:	lwz	r0,12(9)	/* ELF64_R_TYPE(reloc->r_info) */
-	cmpwi	r0,R_PPC64_RELATIVE
+5:	ld	r0,8(9)		/* ELF64_R_TYPE(reloc->r_info) */
+	cmpdi	r0,R_PPC64_RELATIVE
 	bne	6f
 	ld	r6,0(r9)	/* reloc->r_offset */
 	ld	r0,16(r9)	/* reloc->r_addend */
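Why the 32-bit read at offset 12 only works on big endian can be checked with a small byte-level model. The sketch below is an illustration, not kernel code: store64(), load() and make_rela() are hypothetical helpers that lay out a 24-byte entry shaped like Elf64_Rela (r_offset at 0, r_info at 8, r_addend at 16) in an explicitly chosen byte order, and R_PPC64_RELATIVE is defined locally with its ELF ABI value of 22.

```c
#include <assert.h>
#include <stdint.h>

#define R_PPC64_RELATIVE 22

/* Serialize a 64-bit value into 8 bytes in the requested byte order. */
static void store64(unsigned char *p, uint64_t v, int little_endian)
{
	for (int i = 0; i < 8; i++) {
		int shift = little_endian ? 8 * i : 8 * (7 - i);
		p[i] = (unsigned char)(v >> shift);
	}
}

/* Load n bytes (4 or 8) in the requested byte order, like lwz or ld would. */
static uint64_t load(const unsigned char *p, int n, int little_endian)
{
	uint64_t v = 0;

	assert(n == 4 || n == 8);
	for (int i = 0; i < n; i++) {
		int idx = little_endian ? n - 1 - i : i;
		v = (v << 8) | p[idx];
	}
	return v;
}

/* Build an Elf64_Rela-shaped entry with zero r_offset and r_addend. */
static void make_rela(unsigned char *ent, uint64_t r_info, int little_endian)
{
	store64(ent, 0, little_endian);
	store64(ent + 8, r_info, little_endian);
	store64(ent + 16, 0, little_endian);
}
```

On the big-endian layout the word at offset 12 is the low half of r_info, so the old `lwz` sees the relocation type; on the little-endian layout the same offset holds the upper half. Note that comparing the whole doubleword, as the patched code does, matches only when the upper half (the symbol index) is zero, which this sketch assumes for relative relocations.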
Re: [PATCH v2 6/6] cpu/idle.c: move to sched/idle.c
On Thu, 30 Jan 2014, Peter Zijlstra wrote: On Wed, Jan 29, 2014 at 12:45:13PM -0500, Nicolas Pitre wrote: Integration of cpuidle with the scheduler requires that the idle loop be closely integrated with the scheduler proper. Moving cpu/idle.c into the sched directory will allow for a smoother integration, and eliminate a subdirectory which contained only one source file. Signed-off-by: Nicolas Pitre n...@linaro.org --- kernel/Makefile | 1 - kernel/cpu/Makefile | 1 - kernel/sched/Makefile| 2 +- kernel/{cpu = sched}/idle.c | 0 4 files changed, 1 insertion(+), 3 deletions(-) delete mode 100644 kernel/cpu/Makefile rename kernel/{cpu = sched}/idle.c (100%) --- a/kernel/sched/Makefile +++ b/kernel/sched/Makefile @@ -11,7 +11,7 @@ ifneq ($(CONFIG_SCHED_OMIT_FRAME_POINTER),y) CFLAGS_core.o := $(PROFILING) -fno-omit-frame-pointer endif -obj-y += core.o proc.o clock.o cputime.o idle_task.o fair.o rt.o stop_task.o +obj-y += core.o proc.o clock.o cputime.o idle_task.o idle.o fair.o rt.o stop_task.o obj-y += wait.o completion.o obj-$(CONFIG_SMP) += cpupri.o obj-$(CONFIG_SCHED_AUTOGROUP) += auto_group.o diff --git a/kernel/cpu/idle.c b/kernel/sched/idle.c similarity index 100% rename from kernel/cpu/idle.c rename to kernel/sched/idle.c This is not a valid patch for PATCH(1). Please try again. Don't you use git? ;-) Here's a plain patch: - 8 From 1bf40eb80a44633094e94986a74bd5ffa222f9d4 Mon Sep 17 00:00:00 2001 From: Nicolas Pitre nicolas.pi...@linaro.org Date: Sun, 26 Jan 2014 23:42:01 -0500 Subject: [PATCH] cpu/idle.c: move to sched/idle.c Integration of cpuidle with the scheduler requires that the idle loop be closely integrated with the scheduler proper. Moving cpu/idle.c into the sched directory will allow for a smoother integration, and eliminate a subdirectory which contained only one source file. 
Signed-off-by: Nicolas Pitre n...@linaro.org
---
 kernel/Makefile       | 1 -
 kernel/cpu/Makefile   | 1 -
 kernel/cpu/idle.c     | 144 --
 kernel/sched/Makefile | 2 +-
 kernel/sched/idle.c   | 144 ++
 5 files changed, 145 insertions(+), 147 deletions(-)
 delete mode 100644 kernel/cpu/Makefile
 delete mode 100644 kernel/cpu/idle.c
 create mode 100644 kernel/sched/idle.c

diff --git a/kernel/Makefile b/kernel/Makefile
index bc010ee272..6f1c7e5cfc 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -22,7 +22,6 @@ obj-y += sched/
 obj-y += locking/
 obj-y += power/
 obj-y += printk/
-obj-y += cpu/
 obj-y += irq/
 obj-y += rcu/
diff --git a/kernel/cpu/Makefile b/kernel/cpu/Makefile
deleted file mode 100644
index 59ab052ef7..00
--- a/kernel/cpu/Makefile
+++ /dev/null
@@ -1 +0,0 @@
-obj-y = idle.o
diff --git a/kernel/cpu/idle.c b/kernel/cpu/idle.c
deleted file mode 100644
index 14ca43430a..00
--- a/kernel/cpu/idle.c
+++ /dev/null
@@ -1,144 +0,0 @@
-/*
- * Generic entry point for the idle threads
- */
-#include <linux/sched.h>
-#include <linux/cpu.h>
-#include <linux/cpuidle.h>
-#include <linux/tick.h>
-#include <linux/mm.h>
-#include <linux/stackprotector.h>
-
-#include <asm/tlb.h>
-
-#include <trace/events/power.h>
-
-static int __read_mostly cpu_idle_force_poll;
-
-void cpu_idle_poll_ctrl(bool enable)
-{
-	if (enable) {
-		cpu_idle_force_poll++;
-	} else {
-		cpu_idle_force_poll--;
-		WARN_ON_ONCE(cpu_idle_force_poll < 0);
-	}
-}
-
-#ifdef CONFIG_GENERIC_IDLE_POLL_SETUP
-static int __init cpu_idle_poll_setup(char *__unused)
-{
-	cpu_idle_force_poll = 1;
-	return 1;
-}
-__setup("nohlt", cpu_idle_poll_setup);
-
-static int __init cpu_idle_nopoll_setup(char *__unused)
-{
-	cpu_idle_force_poll = 0;
-	return 1;
-}
-__setup("hlt", cpu_idle_nopoll_setup);
-#endif
-
-static inline int cpu_idle_poll(void)
-{
-	rcu_idle_enter();
-	trace_cpu_idle_rcuidle(0, smp_processor_id());
-	local_irq_enable();
-	while (!tif_need_resched())
-		cpu_relax();
-	trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id());
-	rcu_idle_exit();
-	return 1;
-}
-
-/* Weak implementations for optional arch specific functions */
-void __weak arch_cpu_idle_prepare(void) { }
-void __weak arch_cpu_idle_enter(void) { }
-void __weak arch_cpu_idle_exit(void) { }
-void __weak arch_cpu_idle_dead(void) { }
-void __weak arch_cpu_idle(void)
-{
-	cpu_idle_force_poll = 1;
-	local_irq_enable();
-}
-
-/*
- * Generic idle loop implementation
- */
-static void cpu_idle_loop(void)
-{
-	while (1) {
-		tick_nohz_idle_enter();
-
-		while (!need_resched()) {
-			check_pgt_cache();
-			rmb();
-
-			if (cpu_is_offline(smp_processor_id()))
-				arch_cpu_idle_dead();
-
-
Re: [PATCH v2 1/6] idle: move the cpuidle entry point to the generic idle loop
On Thu, 30 Jan 2014, Daniel Lezcano wrote: On 01/30/2014 06:28 AM, Nicolas Pitre wrote: On Thu, 30 Jan 2014, Preeti U Murthy wrote: Hi Nicolas, On 01/30/2014 02:01 AM, Nicolas Pitre wrote: On Wed, 29 Jan 2014, Nicolas Pitre wrote: In order to integrate cpuidle with the scheduler, we must have a better proximity in the core code with what cpuidle is doing and not delegate such interaction to arch code. Architectures implementing arch_cpu_idle() should simply enter a cheap idle mode in the absence of a proper cpuidle driver. Signed-off-by: Nicolas Pitre n...@linaro.org Acked-by: Daniel Lezcano daniel.lezc...@linaro.org As mentioned in my reply to Olof's comment on patch #5/6, here's a new version of this patch adding the safety local_irq_enable() to the core code. - 8 From: Nicolas Pitre nicolas.pi...@linaro.org Subject: idle: move the cpuidle entry point to the generic idle loop In order to integrate cpuidle with the scheduler, we must have a better proximity in the core code with what cpuidle is doing and not delegate such interaction to arch code. Architectures implementing arch_cpu_idle() should simply enter a cheap idle mode in the absence of a proper cpuidle driver. In both cases i.e. whether it is a cpuidle driver or the default arch_cpu_idle(), the calling convention expects IRQs to be disabled on entry and enabled on exit. There is a warning in place already but let's add a forced IRQ enable here as well. This will allow for removing the forced IRQ enable some implementations do locally and Why would this patch allow for removing the forced IRQ enable that are being done on some archs in arch_cpu_idle()? Isn't this patch expecting the default arch_cpu_idle() to have re-enabled the interrupts after exiting from the default idle state? Its supposed to only catch faulty cpuidle drivers that haven't enabled IRQs on exit from idle state but are expected to have done so, isn't it? Exact. 
However x86 currently does this: if (cpuidle_idle_call()) x86_idle(); else local_irq_enable(); So whenever cpuidle_idle_call() is successful, IRQs are unconditionally enabled whether or not the underlying cpuidle driver has properly done it. And the reason is that some of the x86 cpuidle drivers fail to enable IRQs before returning. So the idea is to get rid of this unconditional IRQ enabling and let the core issue a warning instead (as well as enabling IRQs to allow the system to run). But what I don't get with your comment is that the local_irq_enable is done from the cpuidle common framework in 'cpuidle_enter_state'; it is not done from the arch-specific backend cpuidle driver. Oh well... This certainly means we'll have to clean up this mess, as some drivers do it on their own while some others don't. Some drivers also loop on !need_resched() while some others simply return on the first interrupt. So the code above could be: if (cpuidle_idle_call()) x86_idle(); without the else section; this local_irq_enable is pointless. Or maybe I missed something? A later patch removes it anyway. But if it is really necessary to enable interrupts then the core will do it, but with a warning now. Nicolas
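The calling convention discussed in this thread (IRQs disabled on entry to the idle routine, enabled on exit, with the core warning and forcing them on when a callee forgets) can be modelled in a few lines of userspace C. Everything here is a simulation under stated assumptions: irqs_enabled and warnings are plain variables standing in for the CPU interrupt state and for WARN_ON_ONCE(), and idle_once(), good_idle() and bad_idle() are hypothetical names, not the kernel's functions.

```c
#include <assert.h>
#include <stdbool.h>

static bool irqs_enabled;	/* simulated CPU interrupt state */
static int warnings;		/* counts simulated warning hits */

/* An idle routine that honours the convention. */
static void good_idle(void)
{
	irqs_enabled = true;
}

/* An idle routine that returns with IRQs still disabled. */
static void bad_idle(void)
{
}

/* One pass through a generic idle entry point. */
static void idle_once(void (*enter_idle)(void))
{
	assert(enter_idle);
	irqs_enabled = false;		/* convention: IRQs off on entry */
	enter_idle();
	if (!irqs_enabled) {		/* callee broke the convention */
		warnings++;
		irqs_enabled = true;	/* forced enable keeps the system running */
	}
}
```

The point of moving the forced enable into the core, as the thread describes, is that a misbehaving routine becomes visible through the warning instead of being silently papered over in each architecture.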
Re: [PATCH] slub: Don't throw away partial remote slabs if there is no local memory
On Wed, 29 Jan 2014, Nishanth Aravamudan wrote: exactly what the caller intends.
int searchnode = node;
if (node == NUMA_NO_NODE)
	searchnode = numa_mem_id();
if (!node_present_pages(node))
	searchnode = local_memory_node(node);
The difference in semantics from the previous is that here, if we have a memoryless node, rather than using the CPU's nearest NUMA node, we use the NUMA node closest to the requested one? The idea here is that the page allocator will do the fallback to other nodes. This check for !node_present should not be necessary. SLUB needs to accept the page from whatever node the page allocator returned and work with that. The problem is that the check for having a slab from the right node may fail again after another attempt to allocate from the same node. SLUB will then push the slab from the *wrong* node back to the partial lists and may attempt another allocation that will again be successful but return memory from another node. That way the partial lists from a particular node are growing uselessly. One way to solve this may be to check if memory is actually allocated from the requested node and fall back to NUMA_NO_NODE (which will use the last allocated slab) for future allocs if the page allocator returned memory from a different node (unless GFP_THIS_NODE is set of course). Otherwise we end up replicating the page allocator logic in slub like in slab. That is what I wanted to avoid.
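The fallback proposed at the end of the message above can be written down as a small decision function. This is only a reading of the idea as stated in the email, not SLUB code: next_search_node() and its parameters are hypothetical, NUMA_NO_NODE is defined locally as -1, and the this_node_only flag stands for the "GFP_THIS_NODE is set" case mentioned in the text.

```c
#include <assert.h>
#include <stdbool.h>

#define NUMA_NO_NODE (-1)

/*
 * Decide which node future allocations should ask for, given the node
 * that was requested and the node the page allocator actually returned.
 */
static int next_search_node(int requested, int got, bool this_node_only)
{
	assert(requested >= NUMA_NO_NODE);
	if (requested == NUMA_NO_NODE || this_node_only)
		return requested;	/* nothing to relax */
	if (got != requested)
		return NUMA_NO_NODE;	/* stop insisting on a node that has no memory */
	return requested;
}
```

Under this sketch a request for a memoryless node degrades to "any node" after the first mismatch, so slabs from other nodes stop being pushed back onto the partial lists as wrong-node slabs.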
Re: [PATCH v2 6/6] cpu/idle.c: move to sched/idle.c
On Thu, Jan 30, 2014 at 11:03:31AM -0500, Nicolas Pitre wrote: This is not a valid patch for PATCH(1). Please try again. Don't you use git? ;-) Nah, git and me don't get along well. Here's a plain patch: Thanks!
Re: [PATCH v2 6/6] cpu/idle.c: move to sched/idle.c
On Thu, 2014-01-30 at 17:27 +0100, Peter Zijlstra wrote: On Thu, Jan 30, 2014 at 11:03:31AM -0500, Nicolas Pitre wrote: This is not a valid patch for PATCH(1). Please try again. Don't you use git? ;-) Nah, git and me don't get along well. Perhaps you could use a newer version of patch http://savannah.gnu.org/forum/forum.php?forum_id=7361 GNU patch version 2.7 released Item posted by Andreas Gruenbacher agruen on Wed 12 Sep 2012 02:18:14 PM UTC. I am pleased to announce that version 2.7 of GNU patch has been released. The following significant changes have happened since the last stable release in December 2009: * Support for most features of the diff --git format, including renames and copies, permission changes, and symlink diffs. Binary diffs are not supported yet; patch will complain and skip them.
Re: [PATCH v2 6/6] cpu/idle.c: move to sched/idle.c
On Thu, Jan 30, 2014 at 08:41:16AM -0800, Joe Perches wrote: Perhaps you could use a newer version of patch GNU patch version 2.7 released Yeah, I know about that, I'll wait until it's common in all distros; updating all the machines I use by hand is just painful.
Re: [PATCH v2 1/6] idle: move the cpuidle entry point to the generic idle loop
On 01/30/2014 05:07 PM, Nicolas Pitre wrote: On Thu, 30 Jan 2014, Daniel Lezcano wrote: On 01/30/2014 06:28 AM, Nicolas Pitre wrote: On Thu, 30 Jan 2014, Preeti U Murthy wrote: Hi Nicolas, On 01/30/2014 02:01 AM, Nicolas Pitre wrote: On Wed, 29 Jan 2014, Nicolas Pitre wrote: In order to integrate cpuidle with the scheduler, we must have a better proximity in the core code with what cpuidle is doing and not delegate such interaction to arch code. Architectures implementing arch_cpu_idle() should simply enter a cheap idle mode in the absence of a proper cpuidle driver. Signed-off-by: Nicolas Pitre n...@linaro.org Acked-by: Daniel Lezcano daniel.lezc...@linaro.org As mentioned in my reply to Olof's comment on patch #5/6, here's a new version of this patch adding the safety local_irq_enable() to the core code. - 8 From: Nicolas Pitre nicolas.pi...@linaro.org Subject: idle: move the cpuidle entry point to the generic idle loop In order to integrate cpuidle with the scheduler, we must have a better proximity in the core code with what cpuidle is doing and not delegate such interaction to arch code. Architectures implementing arch_cpu_idle() should simply enter a cheap idle mode in the absence of a proper cpuidle driver. In both cases i.e. whether it is a cpuidle driver or the default arch_cpu_idle(), the calling convention expects IRQs to be disabled on entry and enabled on exit. There is a warning in place already but let's add a forced IRQ enable here as well. This will allow for removing the forced IRQ enable some implementations do locally and Why would this patch allow for removing the forced IRQ enable that are being done on some archs in arch_cpu_idle()? Isn't this patch expecting the default arch_cpu_idle() to have re-enabled the interrupts after exiting from the default idle state? Its supposed to only catch faulty cpuidle drivers that haven't enabled IRQs on exit from idle state but are expected to have done so, isn't it? Exact. 
However x86 currently does this: if (cpuidle_idle_call()) x86_idle(); else local_irq_enable(); So whenever cpuidle_idle_call() is successful then IRQs are unconditionally enabled whether or not the underlying cpuidle driver has properly done it or not. And the reason is that some of the x86 cpuidle do fail to enable IRQs before returning. So the idea is to get rid of this unconditional IRQ enabling and let the core issue a warning instead (as well as enabling IRQs to allow the system to run). But what I don't get with your comment is the local_irq_enable is done from the cpuidle common framework in 'cpuidle_enter_state' it is not done from the arch specific backend cpuidle driver. Oh well... This certainly means we'll have to clean this mess as some drivers do it on their own while some others don't. Some drivers also loop on !need_resched() while some others simply return on the first interrupt. Ok, I think the mess is coming from 'default_idle' which does not re-enable the local_irq but used from different places like amd_e400_idle and apm_cpu_idle. void default_idle(void) { trace_cpu_idle_rcuidle(1, smp_processor_id()); safe_halt(); trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id()); } Considering the system configured without cpuidle because this one *always* enable the local irq, we have the different cases: x86_idle = default_idle(); == local_irq_enable is missing x86_idle = amd_e400_idle(); == it calls local_irq_disable(); but in the idle loop context where the local irqs are already disabled. == if amd_e400_c1e_detected is true, the local_irq are enabled == otherwise no == default_idle is called from there and does not enable local_irqs So the code above could be: if (cpuidle_idle_call()) x86_idle(); without the else section, this local_irq_enable is pointless. Or may be I missed something ? A later patch removes it anyway. But if it is really necessary to enable interrupts then the core will do it but with a warning now. This WARN should disappear. 
It was there because it was up to the backend cpuidle driver to enable the irq. But in the meantime, that was consolidated into a single place in the cpuidle framework so no need to try to catch errors.

What about (based on this patchset).

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 4505e2a..2d60cbb 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -299,6 +299,7 @@ void arch_cpu_idle_dead(void)
 void arch_cpu_idle(void)
 {
 	x86_idle();
+	local_irq_enable();
 }
 
 /*
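The calling convention debated in this thread (idle routines are entered with IRQs disabled and must return with IRQs enabled, with the core warning and forcing them on otherwise) can be modelled in user space. This is only a sketch under stated assumptions: a plain boolean stands in for the CPU interrupt flag, and `model_idle_once`, `good_idle` and `faulty_idle` are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for the CPU interrupt-enable flag (a model, not kernel code). */
static bool irqs_enabled;
static int warnings;

static void model_irq_disable(void) { irqs_enabled = false; }
static void model_irq_enable(void)  { irqs_enabled = true; }

/* Well-behaved idle routine: returns with IRQs enabled, like sti; hlt. */
static void good_idle(void) { model_irq_enable(); }

/* Faulty idle routine: forgets to re-enable IRQs before returning. */
static void faulty_idle(void) { }

/*
 * One pass of a generic idle loop. IRQs are off when the idle routine is
 * entered and are expected to be on when it returns. If they are not,
 * count a warning and force them on so the system keeps running.
 */
static void model_idle_once(void (*idle)(void))
{
	model_irq_disable();
	idle();
	if (!irqs_enabled) {
		warnings++;
		fprintf(stderr, "WARN: idle routine returned with IRQs disabled\n");
		model_irq_enable();
	}
}
```

Calling `model_idle_once(faulty_idle)` leaves the flag set and bumps `warnings`, which is the behaviour the core-side check is meant to provide for drivers that break the convention.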
Re: [PATCH 2/2] Fix compile error of pgtable-ppc64.h
Greg KH g...@kroah.com writes: On Thu, Jan 30, 2014 at 09:57:36AM +1100, Benjamin Herrenschmidt wrote: On Wed, 2014-01-29 at 10:45 -0800, Greg KH wrote: On Tue, Jan 28, 2014 at 05:52:42PM +0530, Aneesh Kumar K.V wrote: From: Li Zhong zh...@linux.vnet.ibm.com It seems that forward declaration couldn't work well with typedef, use struct spinlock directly to avoiding following build errors: In file included from include/linux/spinlock.h:81, from include/linux/seqlock.h:35, from include/linux/time.h:5, from include/uapi/linux/timex.h:56, from include/linux/timex.h:56, from include/linux/sched.h:17, from arch/powerpc/kernel/asm-offsets.c:17: include/linux/spinlock_types.h:76: error: redefinition of typedef 'spinlock_t' /root/linux-next/arch/powerpc/include/asm/pgtable-ppc64.h:563: note: previous declaration of 'spinlock_t' was here build fix for upstream SHA1: b3084f4db3aeb991c507ca774337c7e7893ed04f for 3.13 stable series I don't understand, why is this needed? Is there a corrisponding patch upstream that already does this? What went wrong with a normal backport of the patch to 3.13? There's a corresponding patch in powerpc-next that I'm about to send to Linus today, but for the backport, the fix could be folded into the original offending patch. Oh come on, you know better than to try to send me a patch that isn't in Linus's tree already. Crap, I can't take that at all. Send me the git commit id when it is in Linus's tree, otherwise I'm not taking it. And no, don't fold in anything, that's not ok either. I'll just go drop this patch entirely from all of my -stable trees for now. Feel free to resend them when all of the needed stuff is upstream. The fix for mremap crash is already in Linus tree. It is the build failure for older gcc compiler version that is not in linus tree. We missed that in the first pull request. Do we really need to drop the patch from 3.11 and 3.12 trees ? The patch their is a variant, and don't require this build fix. 
-aneesh
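The build error quoted above comes from a C rule: before C11, repeating a typedef in the same scope is a constraint violation, and older gcc versions reject it, whereas a struct tag may be forward-declared any number of times. The following sketch illustrates the "use struct spinlock directly" approach; `page_table_model` and `model_lock_of` are hypothetical names made up for the illustration, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>

/* Forward declarations of a struct tag may be repeated freely. */
struct spinlock;
struct spinlock;

/* A header that only needs a pointer can use the tag directly ... */
struct page_table_model {
	struct spinlock *lock;	/* hypothetical field, for illustration only */
};

/* ... while the single definition and the typedef live elsewhere. */
struct spinlock {
	int locked;
};
typedef struct spinlock spinlock_t;

/* Hypothetical helper: both spellings name the same type. */
static spinlock_t *model_lock_of(struct page_table_model *pt)
{
	return pt->lock;
}
```

Because the first header never spells `spinlock_t`, the order in which the two headers are included no longer matters.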
Re: [PATCH 2/2] Fix compile error of pgtable-ppc64.h
On Thu, Jan 30, 2014 at 11:08:52PM +0530, Aneesh Kumar K.V wrote: Greg KH g...@kroah.com writes: On Thu, Jan 30, 2014 at 09:57:36AM +1100, Benjamin Herrenschmidt wrote: On Wed, 2014-01-29 at 10:45 -0800, Greg KH wrote: On Tue, Jan 28, 2014 at 05:52:42PM +0530, Aneesh Kumar K.V wrote: From: Li Zhong zh...@linux.vnet.ibm.com It seems that forward declaration couldn't work well with typedef, use struct spinlock directly to avoiding following build errors: In file included from include/linux/spinlock.h:81, from include/linux/seqlock.h:35, from include/linux/time.h:5, from include/uapi/linux/timex.h:56, from include/linux/timex.h:56, from include/linux/sched.h:17, from arch/powerpc/kernel/asm-offsets.c:17: include/linux/spinlock_types.h:76: error: redefinition of typedef 'spinlock_t' /root/linux-next/arch/powerpc/include/asm/pgtable-ppc64.h:563: note: previous declaration of 'spinlock_t' was here build fix for upstream SHA1: b3084f4db3aeb991c507ca774337c7e7893ed04f for 3.13 stable series I don't understand, why is this needed? Is there a corrisponding patch upstream that already does this? What went wrong with a normal backport of the patch to 3.13? There's a corresponding patch in powerpc-next that I'm about to send to Linus today, but for the backport, the fix could be folded into the original offending patch. Oh come on, you know better than to try to send me a patch that isn't in Linus's tree already. Crap, I can't take that at all. Send me the git commit id when it is in Linus's tree, otherwise I'm not taking it. And no, don't fold in anything, that's not ok either. I'll just go drop this patch entirely from all of my -stable trees for now. Feel free to resend them when all of the needed stuff is upstream. The fix for mremap crash is already in Linus tree. What is the git commit id? It is the build failure for older gcc compiler version that is not in linus tree. That is what I can not take. We missed that in the first pull request. 
Do we really need to drop the patch from 3.11 and 3.12 trees ? I already did. The patch their is a variant, and don't require this build fix. Don't give me a variant, give me the exact same patch, only changed to handle the fuzz/differences of older kernels, don't make different changes to the original patch to make up for things you found out later on, otherwise everyone is confused as to why the fix for the fix is not in the tree. So, when both patches get in Linus's tree, please send me the properly backported patches and I'll be glad to apply them. greg k-h ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 2/2] Fix compile error of pgtable-ppc64.h
Greg KH g...@kroah.com writes: On Thu, Jan 30, 2014 at 11:08:52PM +0530, Aneesh Kumar K.V wrote: Greg KH g...@kroah.com writes: On Thu, Jan 30, 2014 at 09:57:36AM +1100, Benjamin Herrenschmidt wrote: On Wed, 2014-01-29 at 10:45 -0800, Greg KH wrote: On Tue, Jan 28, 2014 at 05:52:42PM +0530, Aneesh Kumar K.V wrote: From: Li Zhong zh...@linux.vnet.ibm.com It seems that forward declaration couldn't work well with typedef, use struct spinlock directly to avoiding following build errors: In file included from include/linux/spinlock.h:81, from include/linux/seqlock.h:35, from include/linux/time.h:5, from include/uapi/linux/timex.h:56, from include/linux/timex.h:56, from include/linux/sched.h:17, from arch/powerpc/kernel/asm-offsets.c:17: include/linux/spinlock_types.h:76: error: redefinition of typedef 'spinlock_t' /root/linux-next/arch/powerpc/include/asm/pgtable-ppc64.h:563: note: previous declaration of 'spinlock_t' was here build fix for upstream SHA1: b3084f4db3aeb991c507ca774337c7e7893ed04f for 3.13 stable series I don't understand, why is this needed? Is there a corrisponding patch upstream that already does this? What went wrong with a normal backport of the patch to 3.13? There's a corresponding patch in powerpc-next that I'm about to send to Linus today, but for the backport, the fix could be folded into the original offending patch. Oh come on, you know better than to try to send me a patch that isn't in Linus's tree already. Crap, I can't take that at all. Send me the git commit id when it is in Linus's tree, otherwise I'm not taking it. And no, don't fold in anything, that's not ok either. I'll just go drop this patch entirely from all of my -stable trees for now. Feel free to resend them when all of the needed stuff is upstream. The fix for mremap crash is already in Linus tree. What is the git commit id? upstream SHA1: b3084f4db3aeb991c507ca774337c7e7893ed04f That is patch 1 in this series. 
It is the build failure for older gcc compiler version that is not in linus tree. That is what I can not take. We missed that in the first pull request. Do we really need to drop the patch from 3.11 and 3.12 trees ? I already did. The patch their is a variant, and don't require this build fix. Don't give me a variant, give me the exact same patch, only changed to handle the fuzz/differences of older kernels, don't make different changes to the original patch to make up for things you found out later on, otherwise everyone is confused as to why the fix for the fix is not in the tree. In this specific case it may be difficult. 3.13 have other changes around the code path. It has split pmd locks etc which result in us doing a withdraw and deposit even on x86. For 3.11 and 3.12, we need to do that extra withdraw and deposit only for ppc64. Hence the variant which used #ifdef around that code. So, when both patches get in Linus's tree, please send me the properly backported patches and I'll be glad to apply them. -aneesh ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v2 1/6] idle: move the cpuidle entry point to the generic idle loop
On Thu, Jan 30, 2014 at 06:28:52PM +0100, Daniel Lezcano wrote:
> Ok, I think the mess is coming from 'default_idle' which does not re-enable the local_irq but used from different places like amd_e400_idle and apm_cpu_idle.
>
> void default_idle(void)
> {
> 	trace_cpu_idle_rcuidle(1, smp_processor_id());
> 	safe_halt();
> 	trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id());
> }
>
> Considering the system configured without cpuidle because this one *always* enable the local irq, we have the different cases:
>
> x86_idle = default_idle(); == local_irq_enable is missing

safe_halt() is sti; hlt and so very much does the irq_enable.
[PATCH] powerpc/pseries: Disable relocation on exception while going down during crash.
From: Mahesh Salgaonkar mah...@linux.vnet.ibm.com

Disable relocation on exception while going down even in kdump case. This is because we are about to clear htab mappings while kexec-ing into kdump kernel and we may run into issues if we still have AIL ON.

Signed-off-by: Mahesh Salgaonkar mah...@linux.vnet.ibm.com
---
 arch/powerpc/platforms/pseries/setup.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index c1f1908..3925173 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -430,8 +430,7 @@ static void pSeries_machine_kexec(struct kimage *image)
 {
 	long rc;
 
-	if (firmware_has_feature(FW_FEATURE_SET_MODE) &&
-	    (image->type != KEXEC_TYPE_CRASH)) {
+	if (firmware_has_feature(FW_FEATURE_SET_MODE)) {
 		rc = pSeries_disable_reloc_on_exc();
 		if (rc != H_SUCCESS)
 			pr_warning("Warning: Failed to disable relocation on "
[PATCH] powerpc: Fix kdump hang issue on p8 with relocation on exception enabled.
From: Mahesh Salgaonkar mah...@linux.vnet.ibm.com

On p8 systems, with the relocation on exception feature enabled, we are seeing the kdump kernel hang at interrupt vector 0xc*4400. The reason is that, with this feature enabled, exceptions are raised with the MMU (IR=DR=1) ON with the default offset of 0xc*4000. Since the exception is raised in virtual mode, it requires the vector region to be executable, without which it fails to fetch and execute the instruction at 0xc*4xxx. For the default kernel, since the kernel is loaded at real 0, the htab mappings set the entire kernel text region executable. But for a relocatable kernel (e.g. kdump case) we only copy interrupt vectors down to real 0 and never marked that region as executable, because in p7 and below we always get exceptions in real mode.

This patch fixes this issue by marking the htab mapping range that overlaps with the interrupt vector region as executable for a relocatable kernel.

Thanks to Ben who helped me to debug this issue and find the root cause.

Signed-off-by: Mahesh Salgaonkar mah...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/sections.h |   12 ++++++++++++
 arch/powerpc/mm/hash_utils_64.c     |   14 ++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/arch/powerpc/include/asm/sections.h b/arch/powerpc/include/asm/sections.h
index 4ee06fe..d0e784e 100644
--- a/arch/powerpc/include/asm/sections.h
+++ b/arch/powerpc/include/asm/sections.h
@@ -8,6 +8,7 @@
 
 #ifdef __powerpc64__
 
+extern char __start_interrupts[];
 extern char __end_interrupts[];
 
 extern char __prom_init_toc_start[];
@@ -21,6 +22,17 @@ static inline int in_kernel_text(unsigned long addr)
 	return 0;
 }
 
+static inline int overlaps_interrupt_vector_text(unsigned long start,
+							unsigned long end)
+{
+	unsigned long real_start, real_end;
+	real_start = __start_interrupts - _stext;
+	real_end = __end_interrupts - _stext;
+
+	return start < (unsigned long)__va(real_end) &&
+		(unsigned long)__va(real_start) < end;
+}
+
 static inline int overlaps_kernel_text(unsigned long start, unsigned long end)
 {
 	return start < (unsigned long)__init_end &&
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 6176b3c..50e21af 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -206,6 +206,20 @@ int htab_bolt_mapping(unsigned long vstart, unsigned long vend,
 		if (overlaps_kernel_text(vaddr, vaddr + step))
 			tprot &= ~HPTE_R_N;
 
+		/*
+		 * If relocatable, check if it overlaps interrupt vectors that
+		 * are copied down to real 0. For relocatable kernel
+		 * (e.g. kdump case) we copy interrupt vectors down to real
+		 * address 0. Mark that region as executable. This is
+		 * because on p8 system with relocation on exception feature
+		 * enabled, exceptions are raised with MMU (IR=DR=1) ON. Hence
+		 * in order to execute the interrupt handlers in virtual
+		 * mode the vector region need to be marked as executable.
+		 */
+		if ((PHYSICAL_START > MEMORY_START) &&
+			overlaps_interrupt_vector_text(vaddr, vaddr + step))
+				tprot &= ~HPTE_R_N;
+
 		hash = hpt_hash(vpn, shift, ssize);
 		hpteg = ((hash & htab_hash_mask) * HPTES_PER_GROUP);
 
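The overlap helper in the patch above is an instance of the usual range-overlap test: two half-open ranges intersect exactly when each one starts before the other ends. A standalone sketch of that comparison follows; `ranges_overlap` and its parameter names are chosen here for illustration and do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Half-open ranges [start, end) and [lo, hi) overlap exactly when each
 * range begins before the other one ends. Ranges that merely touch at a
 * boundary do not count as overlapping.
 */
static bool ranges_overlap(unsigned long start, unsigned long end,
			   unsigned long lo, unsigned long hi)
{
	return start < hi && lo < end;
}
```

With `lo` and `hi` set to the virtual addresses of the copied vector region, each mapping step `[vaddr, vaddr + step)` can be checked the same way the patch checks it before clearing the no-execute bit.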
Re: [PATCH 2/2] Fix compile error of pgtable-ppc64.h
On Thu, 2014-01-30 at 09:55 -0800, Greg KH wrote: On Thu, Jan 30, 2014 at 11:08:52PM +0530, Aneesh Kumar K.V wrote: Greg KH g...@kroah.com writes: On Thu, Jan 30, 2014 at 09:57:36AM +1100, Benjamin Herrenschmidt wrote: On Wed, 2014-01-29 at 10:45 -0800, Greg KH wrote: On Tue, Jan 28, 2014 at 05:52:42PM +0530, Aneesh Kumar K.V wrote: From: Li Zhong zh...@linux.vnet.ibm.com It seems that forward declaration couldn't work well with typedef, use struct spinlock directly to avoiding following build errors: In file included from include/linux/spinlock.h:81, from include/linux/seqlock.h:35, from include/linux/time.h:5, from include/uapi/linux/timex.h:56, from include/linux/timex.h:56, from include/linux/sched.h:17, from arch/powerpc/kernel/asm-offsets.c:17: include/linux/spinlock_types.h:76: error: redefinition of typedef 'spinlock_t' /root/linux-next/arch/powerpc/include/asm/pgtable-ppc64.h:563: note: previous declaration of 'spinlock_t' was here build fix for upstream SHA1: b3084f4db3aeb991c507ca774337c7e7893ed04f for 3.13 stable series I don't understand, why is this needed? Is there a corrisponding patch upstream that already does this? What went wrong with a normal backport of the patch to 3.13? There's a corresponding patch in powerpc-next that I'm about to send to Linus today, but for the backport, the fix could be folded into the original offending patch. Oh come on, you know better than to try to send me a patch that isn't in Linus's tree already. Crap, I can't take that at all. Send me the git commit id when it is in Linus's tree, otherwise I'm not taking it. And no, don't fold in anything, that's not ok either. I'll just go drop this patch entirely from all of my -stable trees for now. Feel free to resend them when all of the needed stuff is upstream. The fix for mremap crash is already in Linus tree. What is the git commit id? 
Relax Greg :-) The submissions all had the commit ID of the original patch upsteam: b3084f4db3aeb991c507ca774337c7e7893ed04f The only *thing* here is due to churn upstream in 3.13, the backport is a bit different for 3.13 vs. earlier versions. The earlier ones are perfectly kosher and you should have no reason not to take them. The 3.13, well, Mahesh was a bit quick here, he sent you the actual patch that went upstream ... and a second patch to fix a problem with older gcc's that it introduces. Because it's a simple build fix of the previous patch, I suggested folding it in instead. That build fix is what is not yet upstream, it's in my -next branch which Linus hasn't pulled just yet. If that's an issue for you, just drop the 3.13 variant of the patch and we'll send it again with the build fix as soon as Linus has pulled the latter. It is the build failure for older gcc compiler version that is not in linus tree. That is what I can not take. We missed that in the first pull request. Do we really need to drop the patch from 3.11 and 3.12 trees ? I already did. The patch their is a variant, and don't require this build fix. Don't give me a variant, give me the exact same patch, only changed to handle the fuzz/differences of older kernels, don't make different changes to the original patch to make up for things you found out later on, otherwise everyone is confused as to why the fix for the fix is not in the tree. The backport patch is a variant because of changes in the affected function that went into 3.13. So, when both patches get in Linus's tree, please send me the properly backported patches and I'll be glad to apply them. Ben. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 1/2][v7] driver/memory:Move Freescale IFC driver to a common driver
On Tue, 2014-01-28 at 10:59 +0530, Prabhakar Kushwaha wrote: Freescale IFC controller has been used for mpc8xxx. It will be used for ARM-based SoC as well. This patch moves the driver to driver/memory and fix the header file includes. Also remove module_platform_driver() and instead call platform_driver_register() from subsys_initcall() to make sure this module has been loaded before MTD partition parsing starts. Signed-off-by: Prabhakar Kushwaha prabha...@freescale.com Acked-by: Arnd Bergmann a...@arndb.de When did Arnd ack this? Especially in v7 form... and I don't see him on CC. +config FSL_IFC + bool Freescale Integrated Flash Controller + depends on FSL_SOC + help + This driver is for the Integrated Flash Controller Controller(IFC) Controller Controller? + module available in Freescale SoCs. This controller allows to handle flash + devices such as NOR, NAND, FPGA and ASIC etc FPGA and ASIC are not (necessarily) flash devices. -Scott ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 2/2] Fix compile error of pgtable-ppc64.h
On Fri, Jan 31, 2014 at 07:59:01AM +1100, Benjamin Herrenschmidt wrote:
> If that's an issue for you, just drop the 3.13 variant of the patch and we'll send it again with the build fix as soon as Linus has pulled the latter.

I have done that.

thanks,

greg k-h
Re: [PATCH] powerpc/eeh: drop taken reference to driver on eeh_rmv_device
On Thu, Jan 30, 2014 at 11:00:48AM -0200, Thadeu Lima de Souza Cascardo wrote:
> Commit f5c57710dd62dd06f176934a8b4b8accbf00f9f8 (powerpc/eeh: Use partial hotplug for EEH unaware drivers) introduces eeh_rmv_device, which may grab a reference to a driver, but not release it. That prevents a driver from being removed after it has gone through EEH recovery.
>
> This patch drops the reference in either exit path if it was taken.
>
> Signed-off-by: Thadeu Lima de Souza Cascardo casca...@linux.vnet.ibm.com
> ---
>  arch/powerpc/kernel/eeh_driver.c |    5 ++++-
>  1 files changed, 4 insertions(+), 1 deletions(-)
>
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index 7bb30dc..afe7337 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -364,7 +364,7 @@ static void *eeh_rmv_device(void *data, void *userdata)
>  		return NULL;
>  	driver = eeh_pcid_get(dev);
>  	if (driver && driver->err_handler)
> -		return NULL;
> +		goto out;
>
>  	/* Remove it from PCI subsystem */
>  	pr_debug("EEH: Removing %s without EEH sensitive driver\n",
> @@ -377,6 +377,9 @@ static void *eeh_rmv_device(void *data, void *userdata)

For normal case (driver without EEH support), we probably release the reference to the driver before pci_stop_and_remove_bus_device().

>  	pci_stop_and_remove_bus_device(dev);
>  	pci_unlock_rescan_remove();
> +out:
> +	if (driver)
> +		eeh_pcid_put(dev);
>  	return NULL;

We needn't if (driver) here as eeh_pcid_put() already had the check.

>  }

Thanks,
Gavin
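The leak and its fix follow a common pattern: once a reference is taken, every exit path has to go through a single release point. Below is a user-space sketch of that pattern, assuming a simple integer refcount. `model_driver`, `model_get`, `model_put` and `model_rmv_device` are hypothetical stand-ins and not the EEH API; the put helper checks for NULL itself, in line with the review comment that the caller-side check is unnecessary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver with a reference count. */
struct model_driver {
	int refcount;
	int has_err_handler;
};

/* Take a reference if there is a driver at all. */
static struct model_driver *model_get(struct model_driver *drv)
{
	if (drv)
		drv->refcount++;
	return drv;
}

/* Drop a reference; tolerates NULL so callers need no check of their own. */
static void model_put(struct model_driver *drv)
{
	if (drv)
		drv->refcount--;
}

static void model_rmv_device(struct model_driver *bound, int *removed)
{
	struct model_driver *drv = model_get(bound);

	if (drv && drv->has_err_handler)
		goto out;	/* error-aware driver: leave the device alone */

	*removed = 1;		/* stands in for the actual device removal */
out:
	model_put(drv);		/* single release point for every exit path */
}
```

An early `return` in place of `goto out` would reproduce the original leak: the count would stay elevated and the driver could never be unloaded.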
RE: [PATCH 1/2][v7] driver/memory:Move Freescale IFC driver to a common driver
-Original Message- From: Wood Scott-B07421 Sent: Friday, January 31, 2014 3:01 AM To: Kushwaha Prabhakar-B32579 Cc: linuxppc-dev@lists.ozlabs.org Subject: Re: [PATCH 1/2][v7] driver/memory:Move Freescale IFC driver to a common driver On Tue, 2014-01-28 at 10:59 +0530, Prabhakar Kushwaha wrote: Freescale IFC controller has been used for mpc8xxx. It will be used for ARM-based SoC as well. This patch moves the driver to driver/memory and fix the header file includes. Also remove module_platform_driver() and instead call platform_driver_register() from subsys_initcall() to make sure this module has been loaded before MTD partition parsing starts. Signed-off-by: Prabhakar Kushwaha prabha...@freescale.com Acked-by: Arnd Bergmann a...@arndb.de When did Arnd ack this? Especially in v7 form... and I don't see him on CC. +config FSL_IFC + bool Freescale Integrated Flash Controller + depends on FSL_SOC + help + This driver is for the Integrated Flash Controller Controller(IFC) Controller Controller? I will fix it + module available in Freescale SoCs. This controller allows to handle flash + devices such as NOR, NAND, FPGA and ASIC etc FPGA and ASIC are not (necessarily) flash devices. Yes it true. I am not sure this folder is only for flash controller. I can see references of FPGA, SRAM in same Kconfigs. Regards, Prabhakar ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH 1/2][v7] driver/memory:Move Freescale IFC driver to a common driver
On Thu, 2014-01-30 at 21:23 -0600, Kushwaha Prabhakar-B32579 wrote: -Original Message- From: Wood Scott-B07421 Sent: Friday, January 31, 2014 3:01 AM To: Kushwaha Prabhakar-B32579 Cc: linuxppc-dev@lists.ozlabs.org Subject: Re: [PATCH 1/2][v7] driver/memory:Move Freescale IFC driver to a common driver On Tue, 2014-01-28 at 10:59 +0530, Prabhakar Kushwaha wrote: Freescale IFC controller has been used for mpc8xxx. It will be used for ARM-based SoC as well. This patch moves the driver to driver/memory and fix the header file includes. Also remove module_platform_driver() and instead call platform_driver_register() from subsys_initcall() to make sure this module has been loaded before MTD partition parsing starts. Signed-off-by: Prabhakar Kushwaha prabha...@freescale.com Acked-by: Arnd Bergmann a...@arndb.de When did Arnd ack this? Especially in v7 form... and I don't see him on CC. +config FSL_IFC + bool Freescale Integrated Flash Controller + depends on FSL_SOC + help + This driver is for the Integrated Flash Controller Controller(IFC) Controller Controller? I will fix it + module available in Freescale SoCs. This controller allows to handle flash + devices such as NOR, NAND, FPGA and ASIC etc FPGA and ASIC are not (necessarily) flash devices. Yes it true. I am not sure this folder is only for flash controller. I can see references of FPGA, SRAM in same Kconfigs. Right, just fix the help text. s/handle flash devices/handle devices/ -Scott ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 0/3] powerpc: Free up an IPI message slot for tick broadcast IPIs
This patchset is a precursor for enabling deep idle states on powerpc, when the local CPU timers stop. The tick broadcast framework in the Linux Kernel today handles wakeup of such CPUs at their next timer event by using an external clock device. At the expiry of this clock device, IPIs are sent to the CPUs in deep idle states so that they wakeup to handle their respective timers. This patchset frees up one of the IPI slots on powerpc so as to be used to handle the tick broadcast IPI. On certain implementations of powerpc, such an external clock device is absent. Adding support to the tick broadcast framework to handle wakeup of CPUs from deep idle states on such implementations is currently under discussion. https://lkml.org/lkml/2014/1/15/86 https://lkml.org/lkml/2014/1/24/28 Either way this patchset is essential to enable handling the tick broadcast IPIs. --- Preeti U Murthy (1): cpuidle/ppc: Split timer_interrupt() into timer handling and interrupt handling routines Srivatsa S. Bhat (2): powerpc: Free up the slot of PPC_MSG_CALL_FUNC_SINGLE IPI message powerpc: Implement tick broadcast IPI as a fixed IPI message arch/powerpc/include/asm/smp.h |2 - arch/powerpc/include/asm/time.h |1 arch/powerpc/kernel/smp.c | 23 ++-- arch/powerpc/kernel/time.c | 86 ++- arch/powerpc/platforms/cell/interrupt.c |2 - arch/powerpc/platforms/ps3/smp.c|2 - 6 files changed, 71 insertions(+), 45 deletions(-) -- ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 1/3] powerpc: Free up the slot of PPC_MSG_CALL_FUNC_SINGLE IPI message
From: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com

The IPI handlers for both PPC_MSG_CALL_FUNC and PPC_MSG_CALL_FUNC_SINGLE map to a common implementation - generic_smp_call_function_single_interrupt(). So, we can consolidate them and save one of the IPI message slots, (which are precious on powerpc, since only 4 of those slots are available).

So, implement the functionality of PPC_MSG_CALL_FUNC_SINGLE using PPC_MSG_CALL_FUNC itself and release its IPI message slot, so that it can be used for something else in the future, if desired.

Signed-off-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
Signed-off-by: Preeti U. Murthy pre...@linux.vnet.ibm.com
Acked-by: Geoff Levand ge...@infradead.org [For the PS3 part]
---
 arch/powerpc/include/asm/smp.h          |    2 +-
 arch/powerpc/kernel/smp.c               |   12 +++++-------
 arch/powerpc/platforms/cell/interrupt.c |    2 +-
 arch/powerpc/platforms/ps3/smp.c        |    2 +-
 4 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 084e080..9f7356b 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -120,7 +120,7 @@ extern int cpu_to_core_id(int cpu);
  * in /proc/interrupts will be wrong!!! --Troy */
 #define PPC_MSG_CALL_FUNCTION 0
 #define PPC_MSG_RESCHEDULE 1
-#define PPC_MSG_CALL_FUNC_SINGLE 2
+#define PPC_MSG_UNUSED 2
 #define PPC_MSG_DEBUGGER_BREAK 3
 
 /* for irq controllers that have dedicated ipis per message (4) */
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index ac2621a..ee7d76b 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -145,9 +145,9 @@ static irqreturn_t reschedule_action(int irq, void *data)
 	return IRQ_HANDLED;
 }
 
-static irqreturn_t call_function_single_action(int irq, void *data)
+static irqreturn_t unused_action(int irq, void *data)
 {
-	generic_smp_call_function_single_interrupt();
+	/* This slot is unused and hence available for use, if needed */
 	return IRQ_HANDLED;
 }
 
@@ -168,14 +168,14 @@ static irqreturn_t debug_ipi_action(int irq, void *data)
 static irq_handler_t smp_ipi_action[] = {
 	[PPC_MSG_CALL_FUNCTION] = call_function_action,
 	[PPC_MSG_RESCHEDULE] = reschedule_action,
-	[PPC_MSG_CALL_FUNC_SINGLE] = call_function_single_action,
+	[PPC_MSG_UNUSED] = unused_action,
 	[PPC_MSG_DEBUGGER_BREAK] = debug_ipi_action,
 };
 
 const char *smp_ipi_name[] = {
 	[PPC_MSG_CALL_FUNCTION] = "ipi call function",
 	[PPC_MSG_RESCHEDULE] = "ipi reschedule",
-	[PPC_MSG_CALL_FUNC_SINGLE] = "ipi call function single",
+	[PPC_MSG_UNUSED] = "ipi unused",
 	[PPC_MSG_DEBUGGER_BREAK] = "ipi debugger",
 };
 
@@ -251,8 +251,6 @@ irqreturn_t smp_ipi_demux(void)
 			generic_smp_call_function_interrupt();
 		if (all & IPI_MESSAGE(PPC_MSG_RESCHEDULE))
 			scheduler_ipi();
-		if (all & IPI_MESSAGE(PPC_MSG_CALL_FUNC_SINGLE))
-			generic_smp_call_function_single_interrupt();
 		if (all & IPI_MESSAGE(PPC_MSG_DEBUGGER_BREAK))
 			debug_ipi_action(0, NULL);
 	} while (info->messages);
@@ -280,7 +278,7 @@ EXPORT_SYMBOL_GPL(smp_send_reschedule);
 
 void arch_send_call_function_single_ipi(int cpu)
 {
-	do_message_pass(cpu, PPC_MSG_CALL_FUNC_SINGLE);
+	do_message_pass(cpu, PPC_MSG_CALL_FUNCTION);
 }
 
 void arch_send_call_function_ipi_mask(const struct cpumask *mask)
diff --git a/arch/powerpc/platforms/cell/interrupt.c b/arch/powerpc/platforms/cell/interrupt.c
index 2d42f3b..adf3726 100644
--- a/arch/powerpc/platforms/cell/interrupt.c
+++ b/arch/powerpc/platforms/cell/interrupt.c
@@ -215,7 +215,7 @@ void iic_request_IPIs(void)
 {
 	iic_request_ipi(PPC_MSG_CALL_FUNCTION);
 	iic_request_ipi(PPC_MSG_RESCHEDULE);
-	iic_request_ipi(PPC_MSG_CALL_FUNC_SINGLE);
+	iic_request_ipi(PPC_MSG_UNUSED);
 	iic_request_ipi(PPC_MSG_DEBUGGER_BREAK);
 }
 
diff --git a/arch/powerpc/platforms/ps3/smp.c b/arch/powerpc/platforms/ps3/smp.c
index 4b35166..00d1a7c 100644
--- a/arch/powerpc/platforms/ps3/smp.c
+++ b/arch/powerpc/platforms/ps3/smp.c
@@ -76,7 +76,7 @@ static int __init ps3_smp_probe(void)
 
 	BUILD_BUG_ON(PPC_MSG_CALL_FUNCTION != 0);
 	BUILD_BUG_ON(PPC_MSG_RESCHEDULE != 1);
-	BUILD_BUG_ON(PPC_MSG_CALL_FUNC_SINGLE != 2);
+	BUILD_BUG_ON(PPC_MSG_UNUSED != 2);
 	BUILD_BUG_ON(PPC_MSG_DEBUGGER_BREAK != 3);
 
 	for (i = 0; i < MSG_COUNT; i++) {
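The consolidation above can be pictured with a small user-space model of a message demultiplexer. This is a deliberately simplified sketch: it assumes one bit per message in a pending word, whereas the kernel's IPI_MESSAGE() encoding differs, and every name here (`MSG_*`, `model_send`, `model_demux`, `calls`) is hypothetical. It only shows that routing the single-target call through the CALL_FUNCTION slot leaves slot 2 idle.

```c
#include <assert.h>

/* Four message slots, mirroring the numbering used in the patch. */
enum {
	MSG_CALL_FUNCTION,
	MSG_RESCHEDULE,
	MSG_UNUSED,
	MSG_DEBUGGER_BREAK,
	MSG_COUNT
};

#define MSG_BIT(m) (1u << (m))

/* Per-slot counters standing in for the real handlers. */
static int calls[MSG_COUNT];

/* Mark a message as pending for the (single, modelled) target CPU. */
static void model_send(unsigned int *pending, int msg)
{
	*pending |= MSG_BIT(msg);
}

/* The single-target call now reuses the CALL_FUNCTION slot. */
static void model_send_call_function_single(unsigned int *pending)
{
	model_send(pending, MSG_CALL_FUNCTION);
}

/* Drain the pending word and dispatch every message that was set. */
static void model_demux(unsigned int *pending)
{
	unsigned int all = *pending;

	*pending = 0;
	for (int m = 0; m < MSG_COUNT; m++)
		if (all & MSG_BIT(m))
			calls[m]++;
}
```

After a single-target call and a reschedule request are sent and demultiplexed, only the first two counters move, so the third slot is free to be reassigned.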
[PATCH 2/3] powerpc: Implement tick broadcast IPI as a fixed IPI message
From: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com

For scalability and performance reasons, we want the tick broadcast IPIs to be handled as efficiently as possible. Fixed IPI messages are one of the most efficient mechanisms available - they are faster than the smp_call_function mechanism because the IPI handlers are fixed and hence they don't involve costly operations such as adding IPI handlers to the target CPU's function queue, acquiring locks for synchronization etc.

Luckily we have an unused IPI message slot, so use that to implement tick broadcast IPIs efficiently.

Signed-off-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
[Functions renamed to tick_broadcast* and Changelog modified by Preeti U. Murthy pre...@linux.vnet.ibm.com]
Signed-off-by: Preeti U. Murthy pre...@linux.vnet.ibm.com
Acked-by: Geoff Levand ge...@infradead.org [For the PS3 part]
---
 arch/powerpc/include/asm/smp.h          |    2 +-
 arch/powerpc/include/asm/time.h         |    1 +
 arch/powerpc/kernel/smp.c               |   19 +++
 arch/powerpc/kernel/time.c              |    5 +
 arch/powerpc/platforms/cell/interrupt.c |    2 +-
 arch/powerpc/platforms/ps3/smp.c        |    2 +-
 6 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 9f7356b..ff51046 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -120,7 +120,7 @@ extern int cpu_to_core_id(int cpu);
  * in /proc/interrupts will be wrong!!! --Troy */
 #define PPC_MSG_CALL_FUNCTION 0
 #define PPC_MSG_RESCHEDULE 1
-#define PPC_MSG_UNUSED 2
+#define PPC_MSG_TICK_BROADCAST 2
 #define PPC_MSG_DEBUGGER_BREAK 3

 /* for irq controllers that have dedicated ipis per message (4) */
diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index c1f2676..1d428e6 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -28,6 +28,7 @@ extern struct clock_event_device decrementer_clockevent;
 struct rtc_time;
 extern void to_tm(int tim, struct rtc_time * tm);
 extern void GregorianDay(struct rtc_time *tm);
+extern void tick_broadcast_ipi_handler(void);

 extern void generic_calibrate_decr(void);

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index ee7d76b..6f06f05 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -35,6 +35,7 @@
 #include <asm/ptrace.h>
 #include <linux/atomic.h>
 #include <asm/irq.h>
+#include <asm/hw_irq.h>
 #include <asm/page.h>
 #include <asm/pgtable.h>
 #include <asm/prom.h>
@@ -145,9 +146,9 @@ static irqreturn_t reschedule_action(int irq, void *data)
 	return IRQ_HANDLED;
 }

-static irqreturn_t unused_action(int irq, void *data)
+static irqreturn_t tick_broadcast_ipi_action(int irq, void *data)
 {
-	/* This slot is unused and hence available for use, if needed */
+	tick_broadcast_ipi_handler();
 	return IRQ_HANDLED;
 }

@@ -168,14 +169,14 @@ static irqreturn_t debug_ipi_action(int irq, void *data)
 static irq_handler_t smp_ipi_action[] = {
 	[PPC_MSG_CALL_FUNCTION] = call_function_action,
 	[PPC_MSG_RESCHEDULE] = reschedule_action,
-	[PPC_MSG_UNUSED] = unused_action,
+	[PPC_MSG_TICK_BROADCAST] = tick_broadcast_ipi_action,
 	[PPC_MSG_DEBUGGER_BREAK] = debug_ipi_action,
 };

 const char *smp_ipi_name[] = {
 	[PPC_MSG_CALL_FUNCTION] = "ipi call function",
 	[PPC_MSG_RESCHEDULE] = "ipi reschedule",
-	[PPC_MSG_UNUSED] = "ipi unused",
+	[PPC_MSG_TICK_BROADCAST] = "ipi tick-broadcast",
 	[PPC_MSG_DEBUGGER_BREAK] = "ipi debugger",
 };

@@ -251,6 +252,8 @@ irqreturn_t smp_ipi_demux(void)
 			generic_smp_call_function_interrupt();
 		if (all & IPI_MESSAGE(PPC_MSG_RESCHEDULE))
 			scheduler_ipi();
+		if (all & IPI_MESSAGE(PPC_MSG_TICK_BROADCAST))
+			tick_broadcast_ipi_handler();
 		if (all & IPI_MESSAGE(PPC_MSG_DEBUGGER_BREAK))
 			debug_ipi_action(0, NULL);
 	} while (info->messages);
@@ -289,6 +292,14 @@ void arch_send_call_function_ipi_mask(const struct cpumask *mask)
 		do_message_pass(cpu, PPC_MSG_CALL_FUNCTION);
 }

+void tick_broadcast(const struct cpumask *mask)
+{
+	unsigned int cpu;
+
+	for_each_cpu(cpu, mask)
+		do_message_pass(cpu, PPC_MSG_TICK_BROADCAST);
+}
+
 #if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC)
 void smp_send_debugger_break(void)
 {
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index b3dab20..3ff97db 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -825,6 +825,11 @@ static void decrementer_set_mode(enum clock_event_mode mode,
 		decrementer_set_next_event(DECREMENTER_MAX, dev);
 }

+/* Interrupt handler for the timer broadcast IPI */
+void tick_broadcast_ipi_handler(void)
+{
+}
+
 static void
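To illustrate why a fixed IPI message is cheap, here is a minimal single-threaded userspace sketch of the idea, not the kernel code. The message numbers follow the patch; `struct model_cpu`, `model_send()` and `model_demux()` are hypothetical names invented for this sketch, and the real smp_ipi_demux() takes the pending bits atomically rather than with a plain load and store.

```c
#include <assert.h>
#include <stdint.h>

/* Message numbers mirror the patch; everything else is a hypothetical model. */
enum {
	MSG_CALL_FUNCTION  = 0,
	MSG_RESCHEDULE     = 1,
	MSG_TICK_BROADCAST = 2,
	MSG_DEBUGGER_BREAK = 3,
	MSG_COUNT          = 4,
};

#define MSG_BIT(n) (1u << (n))

struct model_cpu {
	uint32_t pending;                /* one bit per pending message */
	unsigned int handled[MSG_COUNT]; /* how often each fixed handler ran */
};

/* Sender side: set a bit. No queue entry, no lock, no allocation. */
static void model_send(struct model_cpu *cpu, int msg)
{
	assert(msg >= 0 && msg < MSG_COUNT);
	cpu->pending |= MSG_BIT(msg);
}

/*
 * Receiver side: take all pending bits, run the fixed handler for each
 * one, and repeat while new bits arrived. Returns the number of handler
 * invocations.
 */
static unsigned int model_demux(struct model_cpu *cpu)
{
	unsigned int ran = 0;

	while (cpu->pending) {
		uint32_t all = cpu->pending;

		cpu->pending = 0;
		for (int msg = 0; msg < MSG_COUNT; msg++) {
			if (all & MSG_BIT(msg)) {
				cpu->handled[msg]++;
				ran++;
			}
		}
	}
	return ran;
}
```

In this model, two sends of the same message before the receiver runs collapse into a single pending bit, so the handler runs once. That is acceptable for a tick broadcast, where the handler only needs to notice that timers may have expired.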
[PATCH 3/3] cpuidle/ppc: Split timer_interrupt() into timer handling and interrupt handling routines
From: Preeti U Murthy pre...@linux.vnet.ibm.com

Split timer_interrupt(), which is the local timer interrupt handler on ppc, into routines called during regular interrupt handling and __timer_interrupt(), which takes care of running local timers and collecting time related stats. This will enable callers interested only in running expired local timers to directly call into __timer_interrupt(). One of the use cases of this is the tick broadcast IPI handling in which the sleeping CPUs need to handle the local timers that have expired.

Signed-off-by: Preeti U Murthy pre...@linux.vnet.ibm.com
---
 arch/powerpc/kernel/time.c | 81 +---
 1 file changed, 46 insertions(+), 35 deletions(-)

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 3ff97db..df2989b 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -478,6 +478,47 @@ void arch_irq_work_raise(void)

 #endif /* CONFIG_IRQ_WORK */

+void __timer_interrupt(void)
+{
+	struct pt_regs *regs = get_irq_regs();
+	u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
+	struct clock_event_device *evt = &__get_cpu_var(decrementers);
+	u64 now;
+
+	trace_timer_interrupt_entry(regs);
+
+	if (test_irq_work_pending()) {
+		clear_irq_work_pending();
+		irq_work_run();
+	}
+
+	now = get_tb_or_rtc();
+	if (now >= *next_tb) {
+		*next_tb = ~(u64)0;
+		if (evt->event_handler)
+			evt->event_handler(evt);
+		__get_cpu_var(irq_stat).timer_irqs_event++;
+	} else {
+		now = *next_tb - now;
+		if (now <= DECREMENTER_MAX)
+			set_dec((int)now);
+		/* We may have raced with new irq work */
+		if (test_irq_work_pending())
+			set_dec(1);
+		__get_cpu_var(irq_stat).timer_irqs_others++;
+	}
+
+#ifdef CONFIG_PPC64
+	/* collect purr register values often, for accurate calculations */
+	if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
+		struct cpu_usage *cu = &__get_cpu_var(cpu_usage_array);
+		cu->current_tb = mfspr(SPRN_PURR);
+	}
+#endif
+
+	trace_timer_interrupt_exit(regs);
+}
+
 /*
  * timer_interrupt - gets called when the decrementer overflows,
  * with interrupts disabled.
@@ -486,8 +527,6 @@ void timer_interrupt(struct pt_regs * regs)
 {
 	struct pt_regs *old_regs;
 	u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
-	struct clock_event_device *evt = &__get_cpu_var(decrementers);
-	u64 now;

 	/* Ensure a positive value is written to the decrementer, or else
 	 * some CPUs will continue to take decrementer exceptions.
@@ -519,39 +558,7 @@ void timer_interrupt(struct pt_regs * regs)
 	old_regs = set_irq_regs(regs);
 	irq_enter();

-	trace_timer_interrupt_entry(regs);
-
-	if (test_irq_work_pending()) {
-		clear_irq_work_pending();
-		irq_work_run();
-	}
-
-	now = get_tb_or_rtc();
-	if (now >= *next_tb) {
-		*next_tb = ~(u64)0;
-		if (evt->event_handler)
-			evt->event_handler(evt);
-		__get_cpu_var(irq_stat).timer_irqs_event++;
-	} else {
-		now = *next_tb - now;
-		if (now <= DECREMENTER_MAX)
-			set_dec((int)now);
-		/* We may have raced with new irq work */
-		if (test_irq_work_pending())
-			set_dec(1);
-		__get_cpu_var(irq_stat).timer_irqs_others++;
-	}
-
-#ifdef CONFIG_PPC64
-	/* collect purr register values often, for accurate calculations */
-	if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
-		struct cpu_usage *cu = &__get_cpu_var(cpu_usage_array);
-		cu->current_tb = mfspr(SPRN_PURR);
-	}
-#endif
-
-	trace_timer_interrupt_exit(regs);
-
+	__timer_interrupt();
 	irq_exit();
 	set_irq_regs(old_regs);
 }
@@ -828,6 +835,10 @@ static void decrementer_set_mode(enum clock_event_mode mode,
 /* Interrupt handler for the timer broadcast IPI */
 void tick_broadcast_ipi_handler(void)
 {
+	u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
+
+	*next_tb = get_tb_or_rtc();
+	__timer_interrupt();
 }

 static void register_decrementer_clockevent(int cpu)
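The effect of the split can be sketched with a small userspace model. This is only an illustration under stated assumptions: `struct model_timer`, `model_timer_core()`, `model_timer_interrupt()`, `model_tick_broadcast_ipi()` and `MODEL_DEC_MAX` are hypothetical stand-ins, the time base is a plain field instead of get_tb_or_rtc(), and irq work, tracing and PURR accounting are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_DEC_MAX 0x7fffffffu /* hypothetical stand-in for DECREMENTER_MAX */

struct model_timer {
	uint64_t now;        /* stand-in for the time base reading */
	uint64_t next_tb;    /* next expiry; UINT64_MAX means nothing programmed */
	uint64_t dec;        /* value last written to the decrementer */
	unsigned int events; /* times the event handler path ran */
	unsigned int others; /* early wakeups that only reprogrammed the timer */
};

/* Shared core, playing the role of __timer_interrupt(). */
static void model_timer_core(struct model_timer *t)
{
	assert(t != NULL);
	if (t->now >= t->next_tb) {
		/* Expired: clear the expiry and run the event handler path. */
		t->next_tb = UINT64_MAX;
		t->events++;
	} else {
		/* Not yet due: reprogram the decrementer for the remainder. */
		uint64_t delta = t->next_tb - t->now;

		if (delta <= MODEL_DEC_MAX)
			t->dec = delta;
		t->others++;
	}
}

/* Regular decrementer interrupt: just runs the shared core. */
static void model_timer_interrupt(struct model_timer *t)
{
	model_timer_core(t);
}

/*
 * Tick broadcast IPI: set the expiry to "now" first, so the shared
 * core always takes the expired branch.
 */
static void model_tick_broadcast_ipi(struct model_timer *t)
{
	t->next_tb = t->now;
	model_timer_core(t);
}
```

The point of the design, as the model shows, is that the IPI path reuses the same core as the decrementer path and only differs in forcing the expiry, so a sleeping CPU woken by the broadcast runs its event handler even though its own decrementer did not fire.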
Re: [PATCH v2] kexec/ppc64 fix device tree endianess issues for memory attributes
On Thu, Jan 30, 2014 at 04:06:22PM +0100, Laurent Dufour wrote:

All the attributes exposed in the device tree are in Big Endian format. This patch adds the byte swap operation for some entries which were not yet processed, including those fixed by the following kernel patch:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-January/114720.html

To work in PPC64 Little Endian mode, kexec now requires that the kernel patch mentioned above is applied to the kexecing kernel.

Tested on ppc64 LPAR (kexec/dump) and ppc64le in a Qemu/KVM guest (kexec)

Changes from v1:
* add processing of the following entries:
  - ibm,dynamic-reconfiguration-memory
  - chosen/linux,kernel-end
  - chosen/linux,crashkernel-base size
  - chosen/linux,memory-limit
  - chosen/linux,htab-base size
  - linux,tce-base size
  - memory@/reg

Signed-off-by: Laurent Dufour lduf...@linux.vnet.ibm.com

Thanks, applied.
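As a rough illustration of the byte swap issue discussed above, the sketch below reads big-endian device tree cells in a way that gives the same result on big-endian and little-endian hosts. It is not taken from the kexec patch; `dt_read_be32()` and `dt_read_be64()` are hypothetical helper names, and the patch itself may well use different conversion helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Read a 32-bit big-endian cell byte by byte. Because the value is
 * assembled with shifts rather than a cast, host endianness is irrelevant.
 */
static uint32_t dt_read_be32(const unsigned char *p)
{
	uint32_t v = 0;

	assert(p != NULL);
	for (size_t i = 0; i < 4; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Same idea for a 64-bit value stored as two consecutive cells. */
static uint64_t dt_read_be64(const unsigned char *p)
{
	uint64_t v = 0;

	assert(p != NULL);
	for (size_t i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}
```

A little-endian host that instead copied the raw bytes into an integer would see the bytes reversed, which is the kind of bug the byte swap in the patch is meant to avoid.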