Re: [PATCH v2 5/6] X86: remove redundant cpuidle_idle_call()

2014-01-30 Thread Peter Zijlstra
On Wed, Jan 29, 2014 at 03:14:40PM -0500, Nicolas Pitre wrote:
 Looking into some cpuidle drivers for x86 I found at least one that 
 doesn't respect this convention.  Damn.

Which one? We should probably fix it :-)
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH v2 0/6] setting the table for integration of cpuidle with the scheduler

2014-01-30 Thread Peter Zijlstra
On Wed, Jan 29, 2014 at 12:45:07PM -0500, Nicolas Pitre wrote:
 As everyone should know by now, we want to integrate the cpuidle
 governor with the scheduler for a more efficient idling of CPUs.
 In order to help the transition, this small patch series moves the
 existing interaction with cpuidle from architecture code to generic
 core code.  The ARM, PPC, SH and X86 architectures are concerned.
 No functional change should have occurred yet.
 
 @peterz: Are you willing to pick up those patches?

Yeah.. no objections. Should I pick these up or will you be sending
another round?


Re: [RFC PATCH 07/10] KVM: PPC: BOOK3S: PR: Emulate facility status and control register

2014-01-30 Thread Alexander Graf


 Am 30.01.2014 um 07:00 schrieb Paul Mackerras pau...@samba.org:
 
 On Tue, Jan 28, 2014 at 10:14:12PM +0530, Aneesh Kumar K.V wrote:
 We allow priv-mode update of this. The guest value is saved in fscr,
 and the value actually used is saved in shadow_fscr. shadow_fscr
 only contains values that are allowed by the host. On
 facility unavailable interrupt, if the facility is allowed by fscr
 but disabled in shadow_fscr we need to emulate the support. Currently
 all but EBB is disabled. We still don't support performance monitoring
 in PR guest.
 
 ...
 
 +/*
 + * Save the current fscr in shadow fscr
 + */
 +mfspr r3,SPRN_FSCR
 +PPC_STL r3, VCPU_SHADOW_FSCR(r7)
 
 I don't think you need to do this.  What could possibly have changed
 FSCR since we loaded it on the way into the guest?

The interrupt cause is part of fscr. But yes, we only need to store that on an 
fscr interrupt.

Do we use anything from fscr inside the kernel? Could we switch it lazily on 
vcpu_load/put?

Alex

 
 Paul.


Re: [RFC PATCH 02/10] KVM: PPC: BOOK3S: PR: Emulate virtual timebase register

2014-01-30 Thread Alexander Graf


 Am 30.01.2014 um 06:49 schrieb Paul Mackerras pau...@samba.org:
 
 On Tue, Jan 28, 2014 at 10:14:07PM +0530, Aneesh Kumar K.V wrote:
 virtual time base register is a per vm register and need to saved
 and restored on vm exit and entry. Writing to VTB is not allowed
 in the privileged mode.
 ...
 
 +#ifdef CONFIG_PPC_BOOK3S_64
 +#define mfvtb()         ({unsigned long rval;                           \
 +                        asm volatile("mfspr %0, %1" :                   \
 +                                     "=r" (rval) : "i" (SPRN_VTB)); rval;})
 
 The mfspr will be a no-op on anything before POWER8, meaning the
 result will be whatever value was in the destination GPR before the
 mfspr.  I suppose that may not matter if the result is only ever used
 when we're running on a POWER8 host, but I would feel more comfortable
 if we had explicit feature tests to make sure of that, rather than
 possibly doing computations with unpredictable values.
 
 With your patch, a guest on a POWER7 or a PPC970 could do a read from
 VTB and get garbage -- first, there is nothing to stop userspace from
 requesting POWER8 emulation on an older machine, and secondly, even if
 the virtual machine is a PPC970 (say) you don't implement
 unimplemented SPR semantics for VTB (no-op if PR=0, illegal
 instruction interrupt if PR=1).
 
 On the whole I think it is reasonable to reject an attempt to set the
 virtual PVR to a POWER8 PVR value if we are not running on a POWER8
 host, because emulating all the new POWER8 features in software
 (particularly transactional memory) would not be feasible.  Alex may
 disagree. :)

We don't have a good feature flag indicator that tells kvm what the guest cpu 
is capable of. So yes, I think it's reasonable to just not expose p8 registers 
on pre-p8 hosts for now.

In theory it's of course possible to emulate a lot of p8 features on pre-p8 
hardware, but I'm not sure it's worth the effort. If anyone wants to spend the 
time to work on it I'd be happy to take patches though ;)

Alex

 
 Paul.
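
For illustration, the concern about unpredictable values can be sketched in
plain userspace C: a value read from an optional register is only trusted when
a feature test says the register exists. This is a minimal sketch, not kernel
code; struct cpu_model, has_vtb, read_vtb_raw() and read_vtb_checked() are
hypothetical names introduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical CPU description; the kernel uses its own feature bits. */
struct cpu_model {
    bool has_vtb;        /* true on a POWER8-like host in this sketch */
    uint64_t vtb_value;  /* what the hardware register would return */
};

/*
 * Stand-in for the raw mfspr: on a CPU without the register the
 * instruction is a no-op, so the destination keeps its stale content.
 */
static uint64_t read_vtb_raw(const struct cpu_model *cpu, uint64_t stale)
{
    return cpu->has_vtb ? cpu->vtb_value : stale;
}

/* Guarded read: report failure instead of handing back a stale value. */
static bool read_vtb_checked(const struct cpu_model *cpu, uint64_t *out)
{
    if (!cpu->has_vtb)
        return false;
    *out = read_vtb_raw(cpu, 0);
    return true;
}
```

A caller that gets false from read_vtb_checked() can then reject the request
(for example, refuse a POWER8 virtual PVR) rather than compute with garbage.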


PCIe Access - achieve bursts without DMA

2014-01-30 Thread Moese, Michael
Hello PPC-developers,
I'm currently trying to benchmark access speeds to our PCIe-connected IP cores
located inside our FPGA. On x86-based systems I was able to achieve bursts for
both read and write access. On PPC32, using an e500v2, I have had no success
at all so far.

For writes I tried ioremap_wc(), as I did on x86, but my writes still go out
as single requests, one after another.
For reads, I noticed that I could not use ioremap_cache() on PPC, so I used
plain ioremap() here.
I tried several ways to read from the device, from simple readl() and
memcpy_fromio() through memcpy() to cacheable_memcpy(), with no improvement.
Even issuing a batch of prefetch() calls for all the memory to be read did
not result in read bursts.

The results are really poor: I can write at around 40 MiB/s, whereas I can
read at only about 3 MiB/s.
After hours of studying the Freescale reference manual, looking into other
code and searching the web, I'm close to giving up.

Maybe one of you can point me in the right direction. I'd appreciate any hint
that leads me to a solution; maybe I just missed something or lack knowledge
about this architecture in general.

Thanks for reading.


Michael


Re: [PATCH 2/2] Fix compile error of pgtable-ppc64.h

2014-01-30 Thread Greg KH
On Thu, Jan 30, 2014 at 09:57:36AM +1100, Benjamin Herrenschmidt wrote:
 On Wed, 2014-01-29 at 10:45 -0800, Greg KH wrote:
  On Tue, Jan 28, 2014 at 05:52:42PM +0530, Aneesh Kumar K.V wrote:
   From: Li Zhong zh...@linux.vnet.ibm.com
   
   It seems that forward declaration doesn't work well with typedef; use
   struct spinlock directly to avoid the following build errors:
   
   In file included from include/linux/spinlock.h:81,
from include/linux/seqlock.h:35,
from include/linux/time.h:5,
from include/uapi/linux/timex.h:56,
from include/linux/timex.h:56,
from include/linux/sched.h:17,
from arch/powerpc/kernel/asm-offsets.c:17:
   include/linux/spinlock_types.h:76: error: redefinition of typedef 
   'spinlock_t'
   /root/linux-next/arch/powerpc/include/asm/pgtable-ppc64.h:563: note: 
   previous declaration of 'spinlock_t' was here
   
   build fix for upstream SHA1: b3084f4db3aeb991c507ca774337c7e7893ed04f
   for 3.13 stable series
  
  I don't understand, why is this needed?  Is there a corresponding patch
  upstream that already does this?  What went wrong with a normal
  backport of the patch to 3.13?
 
 There's a corresponding patch in powerpc-next that I'm about to send to
 Linus today, but for the backport, the fix could be folded into the
 original offending patch.

Oh come on, you know better than to try to send me a patch that isn't in
Linus's tree already.  Crap, I can't take that at all.

Send me the git commit id when it is in Linus's tree, otherwise I'm not
taking it.

And no, don't fold in anything, that's not ok either.  I'll just go
drop this patch entirely from all of my -stable trees for now.  Feel
free to resend them when all of the needed stuff is upstream.

greg k-h


[PATCH] powerpc/eeh: drop taken reference to driver on eeh_rmv_device

2014-01-30 Thread Thadeu Lima de Souza Cascardo
Commit f5c57710dd62dd06f176934a8b4b8accbf00f9f8 (powerpc/eeh: Use
partial hotplug for EEH unaware drivers) introduces eeh_rmv_device,
which may grab a reference to a driver, but not release it.

That prevents a driver from being removed after it has gone through EEH
recovery.

This patch drops the reference in either exit path if it was taken.

Signed-off-by: Thadeu Lima de Souza Cascardo casca...@linux.vnet.ibm.com
---
 arch/powerpc/kernel/eeh_driver.c |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 7bb30dc..afe7337 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -364,7 +364,7 @@ static void *eeh_rmv_device(void *data, void *userdata)
                return NULL;
        driver = eeh_pcid_get(dev);
        if (driver && driver->err_handler)
-               return NULL;
+               goto out;
 
        /* Remove it from PCI subsystem */
        pr_debug("EEH: Removing %s without EEH sensitive driver\n",
@@ -377,6 +377,9 @@ static void *eeh_rmv_device(void *data, void *userdata)
        pci_stop_and_remove_bus_device(dev);
        pci_unlock_rescan_remove();
 
+out:
+       if (driver)
+               eeh_pcid_put(dev);
        return NULL;
 }
 
-- 
1.7.1
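
The get/put balancing this patch restores can be sketched in self-contained
userspace C. This is a minimal illustration with a plain counter; struct drv,
drv_get(), drv_put() and remove_if_unaware() are hypothetical stand-ins, not
the EEH or PCI API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical driver object with a plain reference counter. */
struct drv {
    int refcount;
    int has_err_handler;
};

static struct drv *drv_get(struct drv *d)
{
    if (d)
        d->refcount++;
    return d;
}

static void drv_put(struct drv *d)
{
    if (d)
        d->refcount--;
}

/* Returns 1 if the device was removed, 0 if it was left alone. */
static int remove_if_unaware(struct drv *candidate)
{
    int removed = 0;
    struct drv *d = drv_get(candidate);

    if (d && d->has_err_handler)
        goto out;          /* error-aware driver: keep the device */

    removed = 1;           /* stand-in for the actual removal */
out:
    drv_put(d);            /* balanced on both exit paths */
    return removed;
}
```

Routing every exit through a single label keeps the reference count balanced
even if more early exits are added later.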



Re: [PATCH v2 0/6] setting the table for integration of cpuidle with the scheduler

2014-01-30 Thread Nicolas Pitre
On Thu, 30 Jan 2014, Peter Zijlstra wrote:

 On Wed, Jan 29, 2014 at 12:45:07PM -0500, Nicolas Pitre wrote:
  As everyone should know by now, we want to integrate the cpuidle
  governor with the scheduler for a more efficient idling of CPUs.
  In order to help the transition, this small patch series moves the
  existing interaction with cpuidle from architecture code to generic
  core code.  The ARM, PPC, SH and X86 architectures are concerned.
  No functional change should have occurred yet.
  
  @peterz: Are you willing to pick up those patches?
 
 Yeah.. no objections. Should I pick these up or will you be sending
 another round?

I think you could pick them now, taking care of picking up the amended 
#1/6.


Nicolas


Re: [PATCH v2 1/6] idle: move the cpuidle entry point to the generic idle loop

2014-01-30 Thread Daniel Lezcano

On 01/30/2014 06:28 AM, Nicolas Pitre wrote:
 On Thu, 30 Jan 2014, Preeti U Murthy wrote:

  Hi Nicolas,

  On 01/30/2014 02:01 AM, Nicolas Pitre wrote:
   On Wed, 29 Jan 2014, Nicolas Pitre wrote:

    In order to integrate cpuidle with the scheduler, we must have a better
    proximity in the core code with what cpuidle is doing and not delegate
    such interaction to arch code.

    Architectures implementing arch_cpu_idle() should simply enter
    a cheap idle mode in the absence of a proper cpuidle driver.

    Signed-off-by: Nicolas Pitre n...@linaro.org
    Acked-by: Daniel Lezcano daniel.lezc...@linaro.org

   As mentioned in my reply to Olof's comment on patch #5/6, here's a new
   version of this patch adding the safety local_irq_enable() to the core
   code.

   ----- >8

   From: Nicolas Pitre nicolas.pi...@linaro.org
   Subject: idle: move the cpuidle entry point to the generic idle loop

   In order to integrate cpuidle with the scheduler, we must have a better
   proximity in the core code with what cpuidle is doing and not delegate
   such interaction to arch code.

   Architectures implementing arch_cpu_idle() should simply enter
   a cheap idle mode in the absence of a proper cpuidle driver.

   In both cases i.e. whether it is a cpuidle driver or the default
   arch_cpu_idle(), the calling convention expects IRQs to be disabled
   on entry and enabled on exit. There is a warning in place already but
   let's add a forced IRQ enable here as well.  This will allow for
   removing the forced IRQ enable some implementations do locally and

  Why would this patch allow for removing the forced IRQ enable that are
  being done on some archs in arch_cpu_idle()? Isn't this patch expecting
  the default arch_cpu_idle() to have re-enabled the interrupts after
  exiting from the default idle state? Its supposed to only catch faulty
  cpuidle drivers that haven't enabled IRQs on exit from idle state but
  are expected to have done so, isn't it?

 Exact.  However x86 currently does this:

        if (cpuidle_idle_call())
                x86_idle();
        else
                local_irq_enable();

 So whenever cpuidle_idle_call() is successful then IRQs are
 unconditionally enabled whether or not the underlying cpuidle driver has
 properly done it or not.  And the reason is that some of the x86 cpuidle
 do fail to enable IRQs before returning.

 So the idea is to get rid of this unconditional IRQ enabling and let the
 core issue a warning instead (as well as enabling IRQs to allow the
 system to run).


But what I don't get with your comment is that local_irq_enable() is called
from the cpuidle common framework, in cpuidle_enter_state(); it is not
called from the arch-specific backend cpuidle driver.

So the code above could be:

        if (cpuidle_idle_call())
                x86_idle();

without the else section: this local_irq_enable() is pointless. Or maybe
I missed something?



--
 http://www.linaro.org/ Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  http://www.facebook.com/pages/Linaro Facebook |
http://twitter.com/#!/linaroorg Twitter |
http://www.linaro.org/linaro-blog/ Blog


RE: PCIe Access - achieve bursts without DMA

2014-01-30 Thread David Laight
From Moese, Michael
 Hello PPC-developers,
 I'm currently trying to benchmark access speeds to our PCIe-connected IP-cores
 located inside our FPGA. On x86-based systems I was able to achieve bursts for
 both read and write access. On PPC32, using an e500v2, I had no success at all
 so far.

I'm not sure that you can.
I had to write a simple driver for the PCIe CSB bridge dma on a 83xx ppc.
I think that might be the one in the e500v2.

I don't know how fast 'normal' PCIe slaves are, but we were accessing
an Altera fpga and the latency is less than pedestrian.
I think an ISA bus can run faster!
With moderate length transfers, the throughput was more than adequate.

David





[PATCH v2] kexec/ppc64 fix device tree endianess issues for memory attributes

2014-01-30 Thread Laurent Dufour
All the attributes exposed in the device tree are in Big Endian format.

This patch adds the byte swap operation for some entries which were not yet
processed, including those fixed by the following kernel patch:

https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-January/114720.html

To work in PPC64 Little Endian mode, kexec now requires that the kernel
patch mentioned above be applied to the kexecing kernel.

Tested on ppc64 LPAR (kexec/dump) and ppc64le in a Qemu/KVM guest (kexec)

Changes from v1:
 * add processing of the following entries:
   - ibm,dynamic-reconfiguration-memory
   - chosen/linux,kernel-end
   - chosen/linux,crashkernel-base & size
   - chosen/linux,memory-limit
   - chosen/linux,htab-base & size
   - linux,tce-base & size
   - memory@/reg
Signed-off-by: Laurent Dufour lduf...@linux.vnet.ibm.com
---
 kexec/arch/ppc64/crashdump-ppc64.c |9 ---
 kexec/arch/ppc64/kexec-ppc64.c |   44 +++-
 kexec/fs2dt.c  |   19 
 3 files changed, 48 insertions(+), 24 deletions(-)

diff --git a/kexec/arch/ppc64/crashdump-ppc64.c b/kexec/arch/ppc64/crashdump-ppc64.c
index e31dd6d..c0d575d 100644
--- a/kexec/arch/ppc64/crashdump-ppc64.c
+++ b/kexec/arch/ppc64/crashdump-ppc64.c
@@ -146,12 +146,12 @@ static int get_dyn_reconf_crash_memory_ranges(void)
return -1;
}
 
-   start = ((uint64_t *)buf)[DRCONF_ADDR];
+   start = be64_to_cpu(((uint64_t *)buf)[DRCONF_ADDR]);
end = start + lmb_size;
if (start == 0 && end >= (BACKUP_SRC_END + 1))
start = BACKUP_SRC_END + 1;
 
-   flags = (*((uint32_t *)&buf[DRCONF_FLAGS]));
+   flags = be32_to_cpu((*((uint32_t *)&buf[DRCONF_FLAGS])));
/* skip this block if the reserved bit is set in flags (0x80)
   or if the block is not assigned to this partition (0x8) */
if ((flags & 0x80) || !(flags & 0x8))
@@ -252,8 +252,9 @@ static int get_crash_memory_ranges(struct memory_range **range, int *ranges)
goto err;
}
 
-   start = ((unsigned long long *)buf)[0];
-   end = start + ((unsigned long long *)buf)[1];
+   start = be64_to_cpu(((unsigned long long *)buf)[0]);
+   end = start +
+   be64_to_cpu(((unsigned long long *)buf)[1]);
if (start == 0 && end >= (BACKUP_SRC_END + 1))
start = BACKUP_SRC_END + 1;
 
diff --git a/kexec/arch/ppc64/kexec-ppc64.c b/kexec/arch/ppc64/kexec-ppc64.c
index af9112b..49b291d 100644
--- a/kexec/arch/ppc64/kexec-ppc64.c
+++ b/kexec/arch/ppc64/kexec-ppc64.c
@@ -167,7 +167,7 @@ static int get_dyn_reconf_base_ranges(void)
 * lmb_size, num_of_lmbs(global variables) are
 * initialized once here.
 */
-   lmb_size = ((uint64_t *)buf)[0];
+   lmb_size = be64_to_cpu(((uint64_t *)buf)[0]);
fclose(file);
 
strcpy(fname, "/proc/device-tree/");
@@ -183,7 +183,7 @@ static int get_dyn_reconf_base_ranges(void)
fclose(file);
return -1;
}
-   num_of_lmbs = ((unsigned int *)buf)[0];
+   num_of_lmbs = be32_to_cpu(((unsigned int *)buf)[0]);
 
for (i = 0; i < num_of_lmbs; i++) {
if ((n = fread(buf, 1, 24, file)) < 0) {
@@ -194,7 +194,7 @@ static int get_dyn_reconf_base_ranges(void)
if (nr_memory_ranges >= max_memory_ranges)
return -1;
 
-   start = ((uint64_t *)buf)[0];
+   start = be64_to_cpu(((uint64_t *)buf)[0]);
end = start + lmb_size;
add_base_memory_range(start, end);
}
@@ -278,8 +278,8 @@ static int get_base_ranges(void)
if (realloc_memory_ranges() < 0)
break;
}
-   start = ((uint64_t *)buf)[0];
-   end = start + ((uint64_t *)buf)[1];
+   start =  be64_to_cpu(((uint64_t *)buf)[0]);
+   end = start + be64_to_cpu(((uint64_t *)buf)[1]);
add_base_memory_range(start, end);
fclose(file);
}
@@ -363,6 +363,7 @@ static int get_devtree_details(unsigned long kexec_flags)
goto error_openfile;
}
fclose(file);
+   kernel_end = be64_to_cpu(kernel_end);
 
/* Add kernel memory to exclude_range */
exclude_range[i].start = 0x0UL;
@@ -386,6 +387,7 @@ static int get_devtree_details(unsigned long kexec_flags)
goto error_openfile;
}
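
As a host-independent illustration of what the conversions above achieve,
device-tree cells can also be decoded byte by byte. This is a minimal sketch
and not what the patch does (the patch uses be64_to_cpu() and be32_to_cpu());
read_be64(), read_be32() and parse_reg() are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a big-endian 64-bit cell byte by byte; works on any host. */
static uint64_t read_be64(const unsigned char *p)
{
    uint64_t v = 0;
    for (size_t i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Decode a big-endian 32-bit cell the same way. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* A "reg"-style property: a 64-bit base followed by a 64-bit size. */
static void parse_reg(const unsigned char *buf, uint64_t *start, uint64_t *end)
{
    *start = read_be64(buf);
    *end = *start + read_be64(buf + 8);
}
```

Because the bytes are combined explicitly, the result is the same on a big
endian and on a little endian host, which is the property kexec needs here.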
  

Re: [PATCH v2 6/6] cpu/idle.c: move to sched/idle.c

2014-01-30 Thread Peter Zijlstra
On Wed, Jan 29, 2014 at 12:45:13PM -0500, Nicolas Pitre wrote:
 Integration of cpuidle with the scheduler requires that the idle loop be
 closely integrated with the scheduler proper. Moving cpu/idle.c into the
 sched directory will allow for a smoother integration, and eliminate a
 subdirectory which contained only one source file.
 
 Signed-off-by: Nicolas Pitre n...@linaro.org
 ---
  kernel/Makefile  | 1 -
  kernel/cpu/Makefile  | 1 -
  kernel/sched/Makefile| 2 +-
  kernel/{cpu => sched}/idle.c | 0
  4 files changed, 1 insertion(+), 3 deletions(-)
  delete mode 100644 kernel/cpu/Makefile
  rename kernel/{cpu => sched}/idle.c (100%)

 --- a/kernel/sched/Makefile
 +++ b/kernel/sched/Makefile
 @@ -11,7 +11,7 @@ ifneq ($(CONFIG_SCHED_OMIT_FRAME_POINTER),y)
  CFLAGS_core.o := $(PROFILING) -fno-omit-frame-pointer
  endif
  
 -obj-y += core.o proc.o clock.o cputime.o idle_task.o fair.o rt.o stop_task.o
 +obj-y += core.o proc.o clock.o cputime.o idle_task.o idle.o fair.o rt.o 
 stop_task.o
  obj-y += wait.o completion.o
  obj-$(CONFIG_SMP) += cpupri.o
  obj-$(CONFIG_SCHED_AUTOGROUP) += auto_group.o
 diff --git a/kernel/cpu/idle.c b/kernel/sched/idle.c
 similarity index 100%
 rename from kernel/cpu/idle.c
 rename to kernel/sched/idle.c

This is not a valid patch for PATCH(1). Please try again.


[RESEND PATCH] powerpc/relocate fix relocate processing in LE mode

2014-01-30 Thread Laurent Dufour
The relocation code does not work in little endian mode because the r_info
field, which is a 64-bit value, must be read from the right offset.

The current code is optimized to read the r_info field as a 32-bit value
starting at the middle of the doubleword (offset 12). When running in LE
mode, the value read is not correct, since only the most significant half of
the field is read.

This patch removes this optimization, which consisted of handling a 32-bit
value instead of a 64-bit one. This way the code works in both big and little
endian mode.

Signed-off-by: Laurent Dufour lduf...@linux.vnet.ibm.com
---
 arch/powerpc/kernel/reloc_64.S |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/reloc_64.S b/arch/powerpc/kernel/reloc_64.S
index b47a0e1..1482327 100644
--- a/arch/powerpc/kernel/reloc_64.S
+++ b/arch/powerpc/kernel/reloc_64.S
@@ -69,8 +69,8 @@ _GLOBAL(relocate)
 * R_PPC64_RELATIVE ones.
 */
mtctr   r8
-5: lwz r0,12(9)        /* ELF64_R_TYPE(reloc->r_info) */
-   cmpwi   r0,R_PPC64_RELATIVE
+5: ld  r0,8(9)         /* ELF64_R_TYPE(reloc->r_info) */
+   cmpdi   r0,R_PPC64_RELATIVE
bne 6f
ld  r6,0(r9)           /* reloc->r_offset */
ld  r0,16(r9)          /* reloc->r_addend */
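
The effect described in the changelog can be reproduced in userspace C. This
is a minimal sketch: struct rela64 mirrors the three 64-bit fields of an ELF64
RELA entry, and rela_type_full(), rela_type_half() and host_is_little_endian()
are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Layout of an ELF64 RELA entry: three 64-bit fields, 24 bytes. */
struct rela64 {
    uint64_t r_offset;
    uint64_t r_info;    /* symbol index in the high half, type in the low half */
    int64_t  r_addend;
};

/* Type taken from the full 64-bit field: independent of byte order. */
static uint32_t rela_type_full(const struct rela64 *r)
{
    return (uint32_t)(r->r_info & 0xffffffffu);
}

/* Type taken from a 32-bit load at byte offset 12, as the old code did. */
static uint32_t rela_type_half(const struct rela64 *r)
{
    uint32_t w;
    memcpy(&w, (const unsigned char *)r + 12, sizeof(w));
    return w;
}

/* Returns 1 when the host stores the least significant byte first. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

On a big endian host both helpers return the relocation type; on a little
endian host rela_type_half() returns the upper half of r_info instead, which
is why the 32-bit shortcut only worked in BE mode.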



Re: [PATCH v2 6/6] cpu/idle.c: move to sched/idle.c

2014-01-30 Thread Nicolas Pitre
On Thu, 30 Jan 2014, Peter Zijlstra wrote:

 On Wed, Jan 29, 2014 at 12:45:13PM -0500, Nicolas Pitre wrote:
  Integration of cpuidle with the scheduler requires that the idle loop be
  closely integrated with the scheduler proper. Moving cpu/idle.c into the
  sched directory will allow for a smoother integration, and eliminate a
  subdirectory which contained only one source file.
  
  Signed-off-by: Nicolas Pitre n...@linaro.org
  ---
   kernel/Makefile  | 1 -
   kernel/cpu/Makefile  | 1 -
   kernel/sched/Makefile| 2 +-
   kernel/{cpu => sched}/idle.c | 0
   4 files changed, 1 insertion(+), 3 deletions(-)
   delete mode 100644 kernel/cpu/Makefile
   rename kernel/{cpu => sched}/idle.c (100%)
 
  --- a/kernel/sched/Makefile
  +++ b/kernel/sched/Makefile
  @@ -11,7 +11,7 @@ ifneq ($(CONFIG_SCHED_OMIT_FRAME_POINTER),y)
   CFLAGS_core.o := $(PROFILING) -fno-omit-frame-pointer
   endif
   
  -obj-y += core.o proc.o clock.o cputime.o idle_task.o fair.o rt.o 
  stop_task.o
  +obj-y += core.o proc.o clock.o cputime.o idle_task.o idle.o fair.o rt.o 
  stop_task.o
   obj-y += wait.o completion.o
   obj-$(CONFIG_SMP) += cpupri.o
   obj-$(CONFIG_SCHED_AUTOGROUP) += auto_group.o
  diff --git a/kernel/cpu/idle.c b/kernel/sched/idle.c
  similarity index 100%
  rename from kernel/cpu/idle.c
  rename to kernel/sched/idle.c
 
 This is not a valid patch for PATCH(1). Please try again.

Don't you use git?  ;-)

Here's a plain patch:

----- >8

From 1bf40eb80a44633094e94986a74bd5ffa222f9d4 Mon Sep 17 00:00:00 2001
From: Nicolas Pitre nicolas.pi...@linaro.org
Date: Sun, 26 Jan 2014 23:42:01 -0500
Subject: [PATCH] cpu/idle.c: move to sched/idle.c

Integration of cpuidle with the scheduler requires that the idle loop be
closely integrated with the scheduler proper. Moving cpu/idle.c into the
sched directory will allow for a smoother integration, and eliminate a
subdirectory which contained only one source file.

Signed-off-by: Nicolas Pitre n...@linaro.org
---
 kernel/Makefile   |   1 -
 kernel/cpu/Makefile   |   1 -
 kernel/cpu/idle.c | 144 --
 kernel/sched/Makefile |   2 +-
 kernel/sched/idle.c   | 144 ++
 5 files changed, 145 insertions(+), 147 deletions(-)
 delete mode 100644 kernel/cpu/Makefile
 delete mode 100644 kernel/cpu/idle.c
 create mode 100644 kernel/sched/idle.c

diff --git a/kernel/Makefile b/kernel/Makefile
index bc010ee272..6f1c7e5cfc 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -22,7 +22,6 @@ obj-y += sched/
 obj-y += locking/
 obj-y += power/
 obj-y += printk/
-obj-y += cpu/
 obj-y += irq/
 obj-y += rcu/
 
diff --git a/kernel/cpu/Makefile b/kernel/cpu/Makefile
deleted file mode 100644
index 59ab052ef7..00
--- a/kernel/cpu/Makefile
+++ /dev/null
@@ -1 +0,0 @@
-obj-y  = idle.o
diff --git a/kernel/cpu/idle.c b/kernel/cpu/idle.c
deleted file mode 100644
index 14ca43430a..00
--- a/kernel/cpu/idle.c
+++ /dev/null
@@ -1,144 +0,0 @@
-/*
- * Generic entry point for the idle threads
- */
-#include <linux/sched.h>
-#include <linux/cpu.h>
-#include <linux/cpuidle.h>
-#include <linux/tick.h>
-#include <linux/mm.h>
-#include <linux/stackprotector.h>
-
-#include <asm/tlb.h>
-
-#include <trace/events/power.h>
-
-static int __read_mostly cpu_idle_force_poll;
-
-void cpu_idle_poll_ctrl(bool enable)
-{
-   if (enable) {
-   cpu_idle_force_poll++;
-   } else {
-   cpu_idle_force_poll--;
-   WARN_ON_ONCE(cpu_idle_force_poll < 0);
-   }
-}
-
-#ifdef CONFIG_GENERIC_IDLE_POLL_SETUP
-static int __init cpu_idle_poll_setup(char *__unused)
-{
-   cpu_idle_force_poll = 1;
-   return 1;
-}
-__setup("nohlt", cpu_idle_poll_setup);
-
-static int __init cpu_idle_nopoll_setup(char *__unused)
-{
-   cpu_idle_force_poll = 0;
-   return 1;
-}
-__setup("hlt", cpu_idle_nopoll_setup);
-#endif
-
-static inline int cpu_idle_poll(void)
-{
-   rcu_idle_enter();
-   trace_cpu_idle_rcuidle(0, smp_processor_id());
-   local_irq_enable();
-   while (!tif_need_resched())
-   cpu_relax();
-   trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id());
-   rcu_idle_exit();
-   return 1;
-}
-
-/* Weak implementations for optional arch specific functions */
-void __weak arch_cpu_idle_prepare(void) { }
-void __weak arch_cpu_idle_enter(void) { }
-void __weak arch_cpu_idle_exit(void) { }
-void __weak arch_cpu_idle_dead(void) { }
-void __weak arch_cpu_idle(void)
-{
-   cpu_idle_force_poll = 1;
-   local_irq_enable();
-}
-
-/*
- * Generic idle loop implementation
- */
-static void cpu_idle_loop(void)
-{
-   while (1) {
-   tick_nohz_idle_enter();
-
-   while (!need_resched()) {
-   check_pgt_cache();
-   rmb();
-
-   if (cpu_is_offline(smp_processor_id()))
-   arch_cpu_idle_dead();
-
-   

Re: [PATCH v2 1/6] idle: move the cpuidle entry point to the generic idle loop

2014-01-30 Thread Nicolas Pitre
On Thu, 30 Jan 2014, Daniel Lezcano wrote:

 On 01/30/2014 06:28 AM, Nicolas Pitre wrote:
  On Thu, 30 Jan 2014, Preeti U Murthy wrote:
 
   Hi Nicolas,
  
   On 01/30/2014 02:01 AM, Nicolas Pitre wrote:
On Wed, 29 Jan 2014, Nicolas Pitre wrote:
   
 In order to integrate cpuidle with the scheduler, we must have a
 better
 proximity in the core code with what cpuidle is doing and not delegate
 such interaction to arch code.

 Architectures implementing arch_cpu_idle() should simply enter
 a cheap idle mode in the absence of a proper cpuidle driver.

 Signed-off-by: Nicolas Pitre n...@linaro.org
 Acked-by: Daniel Lezcano daniel.lezc...@linaro.org
   
As mentioned in my reply to Olof's comment on patch #5/6, here's a new
version of this patch adding the safety local_irq_enable() to the core
code.
   
----- >8
   
From: Nicolas Pitre nicolas.pi...@linaro.org
Subject: idle: move the cpuidle entry point to the generic idle loop
   
In order to integrate cpuidle with the scheduler, we must have a better
proximity in the core code with what cpuidle is doing and not delegate
such interaction to arch code.
   
Architectures implementing arch_cpu_idle() should simply enter
a cheap idle mode in the absence of a proper cpuidle driver.
   
In both cases i.e. whether it is a cpuidle driver or the default
arch_cpu_idle(), the calling convention expects IRQs to be disabled
on entry and enabled on exit. There is a warning in place already but
let's add a forced IRQ enable here as well.  This will allow for
removing the forced IRQ enable some implementations do locally and
  
   Why would this patch allow for removing the forced IRQ enable that are
   being done on some archs in arch_cpu_idle()? Isn't this patch expecting
   the default arch_cpu_idle() to have re-enabled the interrupts after
   exiting from the default idle state? Its supposed to only catch faulty
   cpuidle drivers that haven't enabled IRQs on exit from idle state but
   are expected to have done so, isn't it?
 
  Exact.  However x86 currently does this:
 
   if (cpuidle_idle_call())
   x86_idle();
   else
   local_irq_enable();
 
  So whenever cpuidle_idle_call() is successful then IRQs are
  unconditionally enabled whether or not the underlying cpuidle driver has
  properly done it or not.  And the reason is that some of the x86 cpuidle
  do fail to enable IRQs before returning.
 
  So the idea is to get rid of this unconditional IRQ enabling and let the
  core issue a warning instead (as well as enabling IRQs to allow the
  system to run).
 
 But what I don't get with your comment is the local_irq_enable is done from
 the cpuidle common framework in 'cpuidle_enter_state' it is not done from the
 arch specific backend cpuidle driver.

Oh well... This certainly means we'll have to clean this mess as some 
drivers do it on their own while some others don't.  Some drivers also 
loop on !need_resched() while some others simply return on the first 
interrupt.

 So the code above could be:
 
   if (cpuidle_idle_call())
   x86_idle();
 
 without the else section, this local_irq_enable is pointless. Or may be I
 missed something ?

A later patch removes it anyway.  But if it is really necessary to 
enable interrupts then the core will do it but with a warning now.


Nicolas
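
The calling convention discussed in this thread (IRQs disabled on entry,
enabled on exit, with the core warning and forcing the enable when an idle
routine gets it wrong) can be modelled in userspace C. This is a minimal
simulation under stated assumptions, not the kernel's idle loop;
irqs_enabled, warnings, idle_good(), idle_faulty() and core_idle_step() are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated per-CPU state; the real kernel queries the hardware. */
static bool irqs_enabled;
static int warnings;

/* A well-behaved idle routine: enables IRQs before returning. */
static void idle_good(void) { irqs_enabled = true; }

/* A faulty idle routine: returns with IRQs still disabled. */
static void idle_faulty(void) { /* forgets to enable IRQs */ }

/* Core idle step: enforce the convention instead of trusting the driver. */
static void core_idle_step(void (*enter_idle)(void))
{
    irqs_enabled = false;     /* convention: IRQs disabled on entry */
    enter_idle();
    if (!irqs_enabled) {      /* convention violated on exit */
        warnings++;           /* stand-in for a kernel warning */
        irqs_enabled = true;  /* forced enable so the system keeps running */
    }
}
```

With the check living in one place, arch code no longer needs its own
unconditional enable, and a misbehaving driver shows up as a warning.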


Re: [PATCH] slub: Don't throw away partial remote slabs if there is no local memory

2014-01-30 Thread Christoph Lameter
On Wed, 29 Jan 2014, Nishanth Aravamudan wrote:

 exactly what the caller intends.

 int searchnode = node;
 if (node == NUMA_NO_NODE)
   searchnode = numa_mem_id();
 if (!node_present_pages(node))
   searchnode = local_memory_node(node);

 The difference in semantics from the previous is that here, if we have a
 memoryless node, rather than using the CPU's nearest NUMA node, we use
 the NUMA node closest to the requested one?

The idea here is that the page allocator will do the fallback to other
nodes. This check for !node_present should not be necessary. SLUB needs to
accept the page from whatever node the page allocator returned and work
with that.

The problem is the check for having a slab from the right node may fail
again after another attempt to allocate from the same node. SLUB will then
push the slab from the *wrong* node back to the partial lists and may
attempt another allocation that will again be successful but return memory
from another node. That way the partial lists from a particular node are
growing uselessly.

One way to solve this may be to check if memory is actually allocated
from the requested node and fall back to NUMA_NO_NODE (which will use the
last allocated slab) for future allocs if the page allocator returned
memory from a different node (unless __GFP_THISNODE is set of course).
Otherwise we end up replicating the page allocator logic in slub like in
slab. That is what I wanted to avoid.
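
The approach suggested above can be sketched in userspace C. This is only a
model under stated assumptions, not SLUB code: struct topo,
alloc_page_node() and next_search_node() are hypothetical, and NO_NODE stands
in for NUMA_NO_NODE.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_NODE (-1)   /* stand-in for NUMA_NO_NODE */

/* Hypothetical model: which nodes have memory, and where each falls back. */
struct topo {
    bool has_memory[4];
    int fallback[4];
};

/* Stand-in for the page allocator: it performs the node fallback itself. */
static int alloc_page_node(const struct topo *t, int node)
{
    return t->has_memory[node] ? node : t->fallback[node];
}

/*
 * Node to match against on later allocations. If the allocator had to
 * fall back, stop insisting on the requested node, unless the caller
 * demanded that exact node.
 */
static int next_search_node(int requested, int got, bool this_node_only)
{
    if (requested == NO_NODE || this_node_only)
        return requested;
    return got == requested ? requested : NO_NODE;
}
```

In this model a request for a memoryless node is served from its fallback
node once, and later requests stop treating that slab as coming from the
wrong node, so the partial lists do not keep growing.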



Re: [PATCH v2 6/6] cpu/idle.c: move to sched/idle.c

2014-01-30 Thread Peter Zijlstra
On Thu, Jan 30, 2014 at 11:03:31AM -0500, Nicolas Pitre wrote:
  This is not a valid patch for PATCH(1). Please try again.
 
 Don't you use git?  ;-)

Nah, git and me don't get along well.

 Here's a plain patch:

Thanks!


Re: [PATCH v2 6/6] cpu/idle.c: move to sched/idle.c

2014-01-30 Thread Joe Perches
On Thu, 2014-01-30 at 17:27 +0100, Peter Zijlstra wrote:
 On Thu, Jan 30, 2014 at 11:03:31AM -0500, Nicolas Pitre wrote:
   This is not a valid patch for PATCH(1). Please try again.
  
  Don't you use git?  ;-)
 
 Nah, git and me don't get along well.

Perhaps you could use a newer version of patch

http://savannah.gnu.org/forum/forum.php?forum_id=7361

GNU patch version 2.7 released

Item posted by Andreas Gruenbacher agruen on Wed 12 Sep 2012 02:18:14
PM UTC.

I am pleased to announce that version 2.7 of GNU patch has been
released. The following significant changes have happened since the last
stable release in December 2009: 

  * Support for most features of the diff --git format, including
renames and copies, permission changes, and symlink diffs.
Binary diffs are not supported yet; patch will complain and skip
them.




Re: [PATCH v2 6/6] cpu/idle.c: move to sched/idle.c

2014-01-30 Thread Peter Zijlstra
On Thu, Jan 30, 2014 at 08:41:16AM -0800, Joe Perches wrote:
 Perhaps you could use a newer version of patch
 
 GNU patch version 2.7 released

Yeah, I know about that, I'll wait until it's common in all distros,
updating all machines I use by hand is just painful.


Re: [PATCH v2 1/6] idle: move the cpuidle entry point to the generic idle loop

2014-01-30 Thread Daniel Lezcano

On 01/30/2014 05:07 PM, Nicolas Pitre wrote:

On Thu, 30 Jan 2014, Daniel Lezcano wrote:


On 01/30/2014 06:28 AM, Nicolas Pitre wrote:

On Thu, 30 Jan 2014, Preeti U Murthy wrote:


Hi Nicolas,

On 01/30/2014 02:01 AM, Nicolas Pitre wrote:

On Wed, 29 Jan 2014, Nicolas Pitre wrote:


In order to integrate cpuidle with the scheduler, we must have a better
proximity in the core code with what cpuidle is doing and not delegate
such interaction to arch code.

Architectures implementing arch_cpu_idle() should simply enter
a cheap idle mode in the absence of a proper cpuidle driver.

Signed-off-by: Nicolas Pitre n...@linaro.org
Acked-by: Daniel Lezcano daniel.lezc...@linaro.org


As mentioned in my reply to Olof's comment on patch #5/6, here's a new
version of this patch adding the safety local_irq_enable() to the core
code.

- 8

From: Nicolas Pitre nicolas.pi...@linaro.org
Subject: idle: move the cpuidle entry point to the generic idle loop

In order to integrate cpuidle with the scheduler, we must have a better
proximity in the core code with what cpuidle is doing and not delegate
such interaction to arch code.

Architectures implementing arch_cpu_idle() should simply enter
a cheap idle mode in the absence of a proper cpuidle driver.

In both cases i.e. whether it is a cpuidle driver or the default
arch_cpu_idle(), the calling convention expects IRQs to be disabled
on entry and enabled on exit. There is a warning in place already but
let's add a forced IRQ enable here as well.  This will allow for
removing the forced IRQ enable some implementations do locally and


Why would this patch allow for removing the forced IRQ enables that are
being done on some archs in arch_cpu_idle()? Isn't this patch expecting
the default arch_cpu_idle() to have re-enabled the interrupts after
exiting from the default idle state? It's supposed to only catch faulty
cpuidle drivers that haven't enabled IRQs on exit from idle state but
are expected to have done so, isn't it?


Exactly.  However x86 currently does this:

  if (cpuidle_idle_call())
  x86_idle();
  else
  local_irq_enable();

So whenever cpuidle_idle_call() is successful, IRQs are
unconditionally enabled whether or not the underlying cpuidle driver has
properly done it.  And the reason is that some of the x86 cpuidle
drivers do fail to enable IRQs before returning.

So the idea is to get rid of this unconditional IRQ enabling and let the
core issue a warning instead (as well as enabling IRQs to allow the
system to run).


But what I don't get with your comment is that the local_irq_enable is done from
the cpuidle common framework in 'cpuidle_enter_state'; it is not done from the
arch specific backend cpuidle driver.


Oh well... This certainly means we'll have to clean this mess as some
drivers do it on their own while some others don't.  Some drivers also
loop on !need_resched() while some others simply return on the first
interrupt.


Ok, I think the mess is coming from 'default_idle' which does not
re-enable the local_irq but is used from different places like
amd_e400_idle and apm_cpu_idle.


void default_idle(void)
{
trace_cpu_idle_rcuidle(1, smp_processor_id());
safe_halt();
trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id());
}

Considering a system configured without cpuidle (because cpuidle
*always* enables the local irqs), we have the following cases:

x86_idle = default_idle();
==> local_irq_enable is missing

x86_idle = amd_e400_idle();
==> it calls local_irq_disable(), but in the idle loop context where the
local irqs are already disabled.

==> if amd_e400_c1e_detected is true, the local irqs are enabled
==> otherwise not
==> default_idle is called from there and does not enable local irqs



So the code above could be:

if (cpuidle_idle_call())
x86_idle();

without the else section; this local_irq_enable is pointless. Or maybe I
missed something?


A later patch removes it anyway.  But if it is really necessary to
enable interrupts then the core will do it but with a warning now.


This WARN should disappear. It was there because it was up to the 
backend cpuidle driver to enable the irq. But in the meantime, that was 
consolidated into a single place in the cpuidle framework so no need to 
try to catch errors.


What about the following (based on this patchset)?

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 4505e2a..2d60cbb 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -299,6 +299,7 @@ void arch_cpu_idle_dead(void)
 void arch_cpu_idle(void)
 {
x86_idle();
+   local_irq_enable();
 }

 /*




Re: [PATCH 2/2] Fix compile error of pgtable-ppc64.h

2014-01-30 Thread Aneesh Kumar K.V
Greg KH g...@kroah.com writes:

 On Thu, Jan 30, 2014 at 09:57:36AM +1100, Benjamin Herrenschmidt wrote:
 On Wed, 2014-01-29 at 10:45 -0800, Greg KH wrote:
  On Tue, Jan 28, 2014 at 05:52:42PM +0530, Aneesh Kumar K.V wrote:
   From: Li Zhong zh...@linux.vnet.ibm.com
   
   It seems that forward declaration couldn't work well with typedef, use
   struct spinlock directly to avoiding following build errors:
   
   In file included from include/linux/spinlock.h:81,
from include/linux/seqlock.h:35,
from include/linux/time.h:5,
from include/uapi/linux/timex.h:56,
from include/linux/timex.h:56,
from include/linux/sched.h:17,
from arch/powerpc/kernel/asm-offsets.c:17:
   include/linux/spinlock_types.h:76: error: redefinition of typedef 
   'spinlock_t'
   /root/linux-next/arch/powerpc/include/asm/pgtable-ppc64.h:563: note: 
   previous declaration of 'spinlock_t' was here
   
   build fix for upstream SHA1: b3084f4db3aeb991c507ca774337c7e7893ed04f
   for 3.13 stable series
  
  I don't understand, why is this needed?  Is there a corrisponding patch
  upstream that already does this?  What went wrong with a normal
  backport of the patch to 3.13?
 
 There's a corresponding patch in powerpc-next that I'm about to send to
 Linus today, but for the backport, the fix could be folded into the
 original offending patch.

 Oh come on, you know better than to try to send me a patch that isn't in
 Linus's tree already.  Crap, I can't take that at all.

 Send me the git commit id when it is in Linus's tree, otherwise I'm not
 taking it.

 And no, don't fold in anything, that's not ok either.  I'll just go
 drop this patch entirely from all of my -stable trees for now.  Feel
 free to resend them when all of the needed stuff is upstream.

The fix for the mremap crash is already in Linus's tree. It is the build
failure fix for older gcc compiler versions that is not in Linus's tree. We
missed that in the first pull request. Do we really need to drop the
patch from the 3.11 and 3.12 trees? The patch there is a variant, and doesn't
require this build fix.

-aneesh
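
The build error quoted above comes from a typedef name being declared twice. Before C11, repeating a typedef in the same scope is not allowed even when both declarations name the same type, which is why older compilers reject it, whereas a struct forward declaration may be repeated freely. A minimal sketch of the pattern follows; struct demo_lock, demo_lock_t and demo_lock_is_held() are hypothetical names chosen for illustration, not the kernel's spinlock types.

```c
/* First "header": only a pointer is needed, so a forward declaration
 * of the struct tag is enough. No typedef is introduced here. */
struct demo_lock;
int demo_lock_is_held(const struct demo_lock *lock);

/* Second "header": repeating a struct forward declaration is legal
 * in every C standard, unlike repeating a typedef before C11. */
struct demo_lock;

/* The full definition arrives later, with the one and only typedef. */
typedef struct demo_lock {
	int held;
} demo_lock_t;

int demo_lock_is_held(const struct demo_lock *lock)
{
	return lock->held != 0;
}
```

Using the struct tag in the early header keeps the single typedef in one place, so the order in which the headers get included no longer matters.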



Re: [PATCH 2/2] Fix compile error of pgtable-ppc64.h

2014-01-30 Thread Greg KH
On Thu, Jan 30, 2014 at 11:08:52PM +0530, Aneesh Kumar K.V wrote:
 Greg KH g...@kroah.com writes:
 
  On Thu, Jan 30, 2014 at 09:57:36AM +1100, Benjamin Herrenschmidt wrote:
  On Wed, 2014-01-29 at 10:45 -0800, Greg KH wrote:
   On Tue, Jan 28, 2014 at 05:52:42PM +0530, Aneesh Kumar K.V wrote:
From: Li Zhong zh...@linux.vnet.ibm.com

It seems that forward declaration couldn't work well with typedef, use
struct spinlock directly to avoiding following build errors:

In file included from include/linux/spinlock.h:81,
 from include/linux/seqlock.h:35,
 from include/linux/time.h:5,
 from include/uapi/linux/timex.h:56,
 from include/linux/timex.h:56,
 from include/linux/sched.h:17,
 from arch/powerpc/kernel/asm-offsets.c:17:
include/linux/spinlock_types.h:76: error: redefinition of typedef 
'spinlock_t'
/root/linux-next/arch/powerpc/include/asm/pgtable-ppc64.h:563: note: 
previous declaration of 'spinlock_t' was here

build fix for upstream SHA1: b3084f4db3aeb991c507ca774337c7e7893ed04f
for 3.13 stable series
   
   I don't understand, why is this needed?  Is there a corrisponding patch
   upstream that already does this?  What went wrong with a normal
   backport of the patch to 3.13?
  
  There's a corresponding patch in powerpc-next that I'm about to send to
  Linus today, but for the backport, the fix could be folded into the
  original offending patch.
 
  Oh come on, you know better than to try to send me a patch that isn't in
  Linus's tree already.  Crap, I can't take that at all.
 
  Send me the git commit id when it is in Linus's tree, otherwise I'm not
  taking it.
 
  And no, don't fold in anything, that's not ok either.  I'll just go
  drop this patch entirely from all of my -stable trees for now.  Feel
  free to resend them when all of the needed stuff is upstream.
 
 The fix for mremap crash is already in Linus tree.

What is the git commit id?

 It is the build failure for older gcc compiler version that is not in
 linus tree.

That is what I can not take.

 We missed that in the first pull request. Do we really need to drop
 the patch from 3.11 and 3.12 trees ?

I already did.

 The patch their is a variant, and don't require this build fix.

Don't give me a variant, give me the exact same patch, only changed to
handle the fuzz/differences of older kernels, don't make different
changes to the original patch to make up for things you found out later
on, otherwise everyone is confused as to why the fix for the fix is not
in the tree.

So, when both patches get in Linus's tree, please send me the properly
backported patches and I'll be glad to apply them.

greg k-h


Re: [PATCH 2/2] Fix compile error of pgtable-ppc64.h

2014-01-30 Thread Aneesh Kumar K.V
Greg KH g...@kroah.com writes:

 On Thu, Jan 30, 2014 at 11:08:52PM +0530, Aneesh Kumar K.V wrote:
 Greg KH g...@kroah.com writes:
 
  On Thu, Jan 30, 2014 at 09:57:36AM +1100, Benjamin Herrenschmidt wrote:
  On Wed, 2014-01-29 at 10:45 -0800, Greg KH wrote:
   On Tue, Jan 28, 2014 at 05:52:42PM +0530, Aneesh Kumar K.V wrote:
From: Li Zhong zh...@linux.vnet.ibm.com

It seems that forward declaration couldn't work well with typedef, use
struct spinlock directly to avoiding following build errors:

In file included from include/linux/spinlock.h:81,
 from include/linux/seqlock.h:35,
 from include/linux/time.h:5,
 from include/uapi/linux/timex.h:56,
 from include/linux/timex.h:56,
 from include/linux/sched.h:17,
 from arch/powerpc/kernel/asm-offsets.c:17:
include/linux/spinlock_types.h:76: error: redefinition of typedef 
'spinlock_t'
/root/linux-next/arch/powerpc/include/asm/pgtable-ppc64.h:563: note: 
previous declaration of 'spinlock_t' was here

build fix for upstream SHA1: b3084f4db3aeb991c507ca774337c7e7893ed04f
for 3.13 stable series
   
   I don't understand, why is this needed?  Is there a corrisponding patch
   upstream that already does this?  What went wrong with a normal
   backport of the patch to 3.13?
  
  There's a corresponding patch in powerpc-next that I'm about to send to
  Linus today, but for the backport, the fix could be folded into the
  original offending patch.
 
  Oh come on, you know better than to try to send me a patch that isn't in
  Linus's tree already.  Crap, I can't take that at all.
 
  Send me the git commit id when it is in Linus's tree, otherwise I'm not
  taking it.
 
  And no, don't fold in anything, that's not ok either.  I'll just go
  drop this patch entirely from all of my -stable trees for now.  Feel
  free to resend them when all of the needed stuff is upstream.
 
 The fix for mremap crash is already in Linus tree.

 What is the git commit id?

upstream SHA1: b3084f4db3aeb991c507ca774337c7e7893ed04f

That is patch 1 in this series.



 It is the build failure for older gcc compiler version that is not in
 linus tree.

 That is what I can not take.

 We missed that in the first pull request. Do we really need to drop
 the patch from 3.11 and 3.12 trees ?

 I already did.

 The patch their is a variant, and don't require this build fix.

 Don't give me a variant, give me the exact same patch, only changed to
 handle the fuzz/differences of older kernels, don't make different
 changes to the original patch to make up for things you found out later
 on, otherwise everyone is confused as to why the fix for the fix is not
 in the tree.

In this specific case it may be difficult. 3.13 has other changes
around the code path. It has split pmd locks etc., which result in us
doing a withdraw and deposit even on x86. For 3.11 and 3.12, we need to
do that extra withdraw and deposit only for ppc64. Hence the variant,
which uses #ifdef around that code.


 So, when both patches get in Linus's tree, please send me the properly
 backported patches and I'll be glad to apply them.


-aneesh



Re: [PATCH v2 1/6] idle: move the cpuidle entry point to the generic idle loop

2014-01-30 Thread Peter Zijlstra
On Thu, Jan 30, 2014 at 06:28:52PM +0100, Daniel Lezcano wrote:
 Ok, I think the mess is coming from 'default_idle' which does not re-enable
 the local_irq but used from different places like amd_e400_idle and
 apm_cpu_idle.
 
 void default_idle(void)
 {
 trace_cpu_idle_rcuidle(1, smp_processor_id());
 safe_halt();
 trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id());
 }
 
 Considering the system configured without cpuidle because this one *always*
 enable the local irq, we have the different cases:
 
 x86_idle = default_idle();
 == local_irq_enable is missing
 

safe_halt() is sti; hlt and so very much does the irq_enable.
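
The calling convention discussed in this thread (interrupts disabled on entry to the idle routine, enabled on exit, with the core warning and force-enabling otherwise) can be modelled in user-space C. This is a sketch under stated assumptions: every fake_* name is hypothetical, the interrupt flag is just a boolean, and fake_safe_halt() only imitates the "sti; hlt" behaviour mentioned above.

```c
#include <stdbool.h>

/* Hypothetical model of the per-CPU interrupt-enable flag. */
static bool fake_irqs_enabled;
static int fake_warnings;

static void fake_local_irq_disable(void) { fake_irqs_enabled = false; }
static void fake_local_irq_enable(void)  { fake_irqs_enabled = true; }

/* Models "sti; hlt": interrupts are enabled as part of halting. */
static void fake_safe_halt(void)
{
	fake_irqs_enabled = true;
	/* the CPU would sleep here until an interrupt arrives */
}

/* Follows the convention because fake_safe_halt() enables interrupts. */
static void fake_default_idle(void) { fake_safe_halt(); }

/* A faulty idle routine that forgets to re-enable interrupts. */
static void fake_broken_idle(void) { }

/* One pass of the core loop: enter with IRQs off, expect them on after. */
static void fake_idle_once(void (*idle)(void))
{
	fake_local_irq_disable();
	idle();
	if (!fake_irqs_enabled) {
		fake_warnings++;         /* stands in for the core's warning */
		fake_local_irq_enable(); /* keep the system running */
	}
}
```

Running fake_idle_once() with fake_default_idle leaves the warning counter untouched, while fake_broken_idle trips it once and still ends with interrupts enabled.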


[PATCH] powerpc/pseries: Disable relocation on exception while going down during crash.

2014-01-30 Thread Mahesh J Salgaonkar
From: Mahesh Salgaonkar mah...@linux.vnet.ibm.com

Disable relocation on exception while going down even in the kdump case. This
is because we are about to clear htab mappings while kexec-ing into the kdump
kernel and we may run into issues if we still have AIL ON.

Signed-off-by: Mahesh Salgaonkar mah...@linux.vnet.ibm.com
---
 arch/powerpc/platforms/pseries/setup.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/setup.c 
b/arch/powerpc/platforms/pseries/setup.c
index c1f1908..3925173 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -430,8 +430,7 @@ static void pSeries_machine_kexec(struct kimage *image)
 {
long rc;
 
-   if (firmware_has_feature(FW_FEATURE_SET_MODE) &&
-   (image->type != KEXEC_TYPE_CRASH)) {
+   if (firmware_has_feature(FW_FEATURE_SET_MODE)) {
 rc = pSeries_disable_reloc_on_exc();
 if (rc != H_SUCCESS)
 pr_warning("Warning: Failed to disable relocation on "



[PATCH] powerpc: Fix kdump hang issue on p8 with relocation on exception enabled.

2014-01-30 Thread Mahesh J Salgaonkar
From: Mahesh Salgaonkar mah...@linux.vnet.ibm.com

On p8 systems with the relocation on exception feature enabled, we are seeing
the kdump kernel hang at interrupt vector 0xc*4400. The reason is that, with
this feature enabled, exceptions are raised with the MMU ON (IR=DR=1) at the
default offset of 0xc*4000. Since the exception is raised in virtual mode, it
requires the vector region to be executable, without which it fails to
fetch and execute instructions at 0xc*4xxx. For the default kernel, since it
is loaded at real 0, the htab mappings set the entire kernel text region
executable. But for a relocatable kernel (e.g. the kdump case) we only copy
the interrupt vectors down to real 0 and never marked that region as
executable, because on p7 and below we always get exceptions in real mode.

This patch fixes this issue by marking htab mapping range as executable
that overlaps with the interrupt vector region for relocatable kernel.

Thanks to Ben who helped me to debug this issue and find the root cause.

Signed-off-by: Mahesh Salgaonkar mah...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/sections.h |   12 
 arch/powerpc/mm/hash_utils_64.c |   14 ++
 2 files changed, 26 insertions(+)

diff --git a/arch/powerpc/include/asm/sections.h 
b/arch/powerpc/include/asm/sections.h
index 4ee06fe..d0e784e 100644
--- a/arch/powerpc/include/asm/sections.h
+++ b/arch/powerpc/include/asm/sections.h
@@ -8,6 +8,7 @@
 
 #ifdef __powerpc64__
 
+extern char __start_interrupts[];
 extern char __end_interrupts[];
 
 extern char __prom_init_toc_start[];
@@ -21,6 +22,17 @@ static inline int in_kernel_text(unsigned long addr)
return 0;
 }
 
+static inline int overlaps_interrupt_vector_text(unsigned long start,
+   unsigned long end)
+{
+   unsigned long real_start, real_end;
+   real_start = __start_interrupts - _stext;
+   real_end = __end_interrupts - _stext;
+
+   return start < (unsigned long)__va(real_end) &&
+   (unsigned long)__va(real_start) < end;
+}
+
 static inline int overlaps_kernel_text(unsigned long start, unsigned long end)
 {
 return start < (unsigned long)__init_end &&
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 6176b3c..50e21af 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -206,6 +206,20 @@ int htab_bolt_mapping(unsigned long vstart, unsigned long 
vend,
if (overlaps_kernel_text(vaddr, vaddr + step))
 tprot &= ~HPTE_R_N;
 
+   /*
+* If relocatable, check if it overlaps interrupt vectors that
+* are copied down to real 0. For relocatable kernel
+* (e.g. kdump case) we copy interrupt vectors down to real
+* address 0. Mark that region as executable. This is
+* because on p8 system with relocation on exception feature
+* enabled, exceptions are raised with MMU (IR=DR=1) ON. Hence
+* in order to execute the interrupt handlers in virtual
+* mode the vector region need to be marked as executable.
+*/
+   if ((PHYSICAL_START > MEMORY_START) &&
+   overlaps_interrupt_vector_text(vaddr, vaddr + step))
+   tprot &= ~HPTE_R_N;
+
 hash = hpt_hash(vpn, shift, ssize);
 hpteg = ((hash & htab_hash_mask) * HPTES_PER_GROUP);
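
The overlap test used by the helper in this patch is the usual two-comparison check for half-open ranges. A standalone sketch, with the hypothetical name demo_ranges_overlap() and plain integers standing in for the kernel addresses:

```c
/*
 * Half-open ranges [a_start, a_end) and [b_start, b_end) overlap
 * exactly when each one starts before the other one ends.
 */
static int demo_ranges_overlap(unsigned long a_start, unsigned long a_end,
			       unsigned long b_start, unsigned long b_end)
{
	return a_start < b_end && b_start < a_end;
}
```

Ranges that merely touch, such as [0x0, 0x1000) and [0x1000, 0x2000), do not count as overlapping, so a mapping step that ends where the vector region begins would not be marked executable by a check of this form.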
 



Re: [PATCH 2/2] Fix compile error of pgtable-ppc64.h

2014-01-30 Thread Benjamin Herrenschmidt
On Thu, 2014-01-30 at 09:55 -0800, Greg KH wrote:
 On Thu, Jan 30, 2014 at 11:08:52PM +0530, Aneesh Kumar K.V wrote:
  Greg KH g...@kroah.com writes:
  
   On Thu, Jan 30, 2014 at 09:57:36AM +1100, Benjamin Herrenschmidt wrote:
   On Wed, 2014-01-29 at 10:45 -0800, Greg KH wrote:
On Tue, Jan 28, 2014 at 05:52:42PM +0530, Aneesh Kumar K.V wrote:
 From: Li Zhong zh...@linux.vnet.ibm.com
 
 It seems that forward declaration couldn't work well with typedef, 
 use
 struct spinlock directly to avoiding following build errors:
 
 In file included from include/linux/spinlock.h:81,
  from include/linux/seqlock.h:35,
  from include/linux/time.h:5,
  from include/uapi/linux/timex.h:56,
  from include/linux/timex.h:56,
  from include/linux/sched.h:17,
  from arch/powerpc/kernel/asm-offsets.c:17:
 include/linux/spinlock_types.h:76: error: redefinition of typedef 
 'spinlock_t'
 /root/linux-next/arch/powerpc/include/asm/pgtable-ppc64.h:563: note: 
 previous declaration of 'spinlock_t' was here
 
 build fix for upstream SHA1: b3084f4db3aeb991c507ca774337c7e7893ed04f
 for 3.13 stable series

I don't understand, why is this needed?  Is there a corrisponding patch
upstream that already does this?  What went wrong with a normal
backport of the patch to 3.13?
   
   There's a corresponding patch in powerpc-next that I'm about to send to
   Linus today, but for the backport, the fix could be folded into the
   original offending patch.
  
   Oh come on, you know better than to try to send me a patch that isn't in
   Linus's tree already.  Crap, I can't take that at all.
  
   Send me the git commit id when it is in Linus's tree, otherwise I'm not
   taking it.
  
   And no, don't fold in anything, that's not ok either.  I'll just go
   drop this patch entirely from all of my -stable trees for now.  Feel
   free to resend them when all of the needed stuff is upstream.
  
  The fix for mremap crash is already in Linus tree.
 
 What is the git commit id?

Relax Greg :-) The submissions all had the commit ID of the original
patch upstream: b3084f4db3aeb991c507ca774337c7e7893ed04f

The only *thing* here is due to churn upstream in 3.13, the backport
is a bit different for 3.13 vs. earlier versions.

The earlier ones are perfectly kosher and you should have no reason
not to take them.

The 3.13, well, Mahesh was a bit quick here, he sent you the actual
patch that went upstream ... and a second patch to fix a problem
with older gcc's that it introduces. Because it's a simple build fix of
the previous patch, I suggested folding it in instead.

That build fix is what is not yet upstream, it's in my -next branch
which Linus hasn't pulled just yet.

If that's an issue for you, just drop the 3.13 variant of the patch and
we'll send it again with the build fix as soon as Linus has pulled the
latter.

  It is the build failure for older gcc compiler version that is not in
  linus tree.
 
 That is what I can not take.
 
  We missed that in the first pull request. Do we really need to drop
  the patch from 3.11 and 3.12 trees ?
 
 I already did.
 
  The patch their is a variant, and don't require this build fix.
 
 Don't give me a variant, give me the exact same patch, only changed to
 handle the fuzz/differences of older kernels, don't make different
 changes to the original patch to make up for things you found out later
 on, otherwise everyone is confused as to why the fix for the fix is not
 in the tree.

The backport patch is a variant because of changes in the affected
function that went into 3.13.

 So, when both patches get in Linus's tree, please send me the properly
 backported patches and I'll be glad to apply them.

Ben.




Re: [PATCH 1/2][v7] driver/memory:Move Freescale IFC driver to a common driver

2014-01-30 Thread Scott Wood
On Tue, 2014-01-28 at 10:59 +0530, Prabhakar Kushwaha wrote:
  Freescale IFC controller has been used for mpc8xxx. It will be used
  for ARM-based SoC as well. This patch moves the driver to driver/memory
  and fix the header file includes.
 
  Also remove module_platform_driver() and  instead call
  platform_driver_register() from subsys_initcall() to make sure this module
  has been loaded before MTD partition parsing starts.
 
 Signed-off-by: Prabhakar Kushwaha prabha...@freescale.com
 Acked-by: Arnd Bergmann a...@arndb.de

When did Arnd ack this?  Especially in v7 form... and I don't see him on
CC.

 +config FSL_IFC
 + bool "Freescale Integrated Flash Controller"
 + depends on FSL_SOC
 + help
 +   This driver is for the Integrated Flash Controller Controller(IFC)

Controller Controller?

 +   module available in Freescale SoCs. This controller allows to handle 
 flash
 +   devices such as NOR, NAND, FPGA and ASIC etc

FPGA and ASIC are not (necessarily) flash devices.

-Scott




Re: [PATCH 2/2] Fix compile error of pgtable-ppc64.h

2014-01-30 Thread Greg KH
On Fri, Jan 31, 2014 at 07:59:01AM +1100, Benjamin Herrenschmidt wrote:
 If that's an issue for you, just drop the 3.13 variant of the patch and
 we'll send it again with the build fix as soon as Linus has pulled the
 latter.

I have done that.

thanks,

greg k-h


Re: [PATCH] powerpc/eeh: drop taken reference to driver on eeh_rmv_device

2014-01-30 Thread Gavin Shan
On Thu, Jan 30, 2014 at 11:00:48AM -0200, Thadeu Lima de Souza Cascardo wrote:
Commit f5c57710dd62dd06f176934a8b4b8accbf00f9f8 (powerpc/eeh: Use
partial hotplug for EEH unaware drivers) introduces eeh_rmv_device,
which may grab a reference to a driver, but not release it.

That prevents a driver from being removed after it has gone through EEH
recovery.

This patch drops the reference in either exit path if it was taken.

Signed-off-by: Thadeu Lima de Souza Cascardo casca...@linux.vnet.ibm.com
---
 arch/powerpc/kernel/eeh_driver.c |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_driver.c 
b/arch/powerpc/kernel/eeh_driver.c
index 7bb30dc..afe7337 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -364,7 +364,7 @@ static void *eeh_rmv_device(void *data, void *userdata)
   return NULL;
   driver = eeh_pcid_get(dev);
   if (driver && driver->err_handler)
-  return NULL;
+  goto out;

   /* Remove it from PCI subsystem */
   pr_debug("EEH: Removing %s without EEH sensitive driver\n",
@@ -377,6 +377,9 @@ static void *eeh_rmv_device(void *data, void *userdata)

For the normal case (driver without EEH support), we should probably release
the reference to the driver before pci_stop_and_remove_bus_device().

   pci_stop_and_remove_bus_device(dev);
   pci_unlock_rescan_remove();

+out:
+  if (driver)
+  eeh_pcid_put(dev);
   return NULL;

We don't need the if (driver) check here, as eeh_pcid_put() already does it.

 }


Thanks,
Gavin
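
One reading of the two review comments above, sketched as a user-space model: the reference is dropped on every exit path, it is released before the device is removed, and the put helper is called unconditionally because it checks for a bound driver itself. All demo_* names are hypothetical and only imitate the get/put pairing; this is not the EEH code.

```c
/* Hypothetical device with a reference-counted driver binding. */
struct demo_dev {
	int driver_refs;      /* outstanding references to the driver */
	int has_driver;       /* non-zero if a driver is bound */
	int has_err_handler;  /* non-zero if the driver handles errors */
	int removed;          /* set once the device has been removed */
};

/* Takes a reference and returns non-zero only if a driver is bound. */
static int demo_get(struct demo_dev *dev)
{
	if (!dev->has_driver)
		return 0;
	dev->driver_refs++;
	return 1;
}

/* Safe to call unconditionally: does nothing without a bound driver. */
static void demo_put(struct demo_dev *dev)
{
	if (dev->has_driver)
		dev->driver_refs--;
}

static void demo_rmv_device(struct demo_dev *dev)
{
	int bound = demo_get(dev);

	if (bound && dev->has_err_handler)
		goto out;  /* driver handles errors itself: keep the device */

	/* Drop the reference before the device and its binding go away. */
	demo_put(dev);
	dev->removed = 1;
	return;
out:
	demo_put(dev);
}
```

Whichever path is taken, driver_refs returns to its starting value, which is the property the original patch is after.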



RE: [PATCH 1/2][v7] driver/memory:Move Freescale IFC driver to a common driver

2014-01-30 Thread prabha...@freescale.com


 -Original Message-
 From: Wood Scott-B07421
 Sent: Friday, January 31, 2014 3:01 AM
 To: Kushwaha Prabhakar-B32579
 Cc: linuxppc-dev@lists.ozlabs.org
 Subject: Re: [PATCH 1/2][v7] driver/memory:Move Freescale IFC driver to a 
 common
 driver
 
 On Tue, 2014-01-28 at 10:59 +0530, Prabhakar Kushwaha wrote:
   Freescale IFC controller has been used for mpc8xxx. It will be used
  for ARM-based SoC as well. This patch moves the driver to
  driver/memory  and fix the header file includes.
 
   Also remove module_platform_driver() and  instead call
   platform_driver_register() from subsys_initcall() to make sure this
  module  has been loaded before MTD partition parsing starts.
 
  Signed-off-by: Prabhakar Kushwaha prabha...@freescale.com
  Acked-by: Arnd Bergmann a...@arndb.de
 
 When did Arnd ack this?  Especially in v7 form... and I don't see him on CC.
 
  +config FSL_IFC
  +   bool "Freescale Integrated Flash Controller"
  +   depends on FSL_SOC
  +   help
  + This driver is for the Integrated Flash Controller Controller(IFC)
 
 Controller Controller?

I will fix it

 
  + module available in Freescale SoCs. This controller allows to handle
 flash
  + devices such as NOR, NAND, FPGA and ASIC etc
 
 FPGA and ASIC are not (necessarily) flash devices.
 

Yes, it's true.
I am not sure this folder is only for flash controllers.
I can see references to FPGA and SRAM in the same Kconfig.

Regards,
Prabhakar


Re: [PATCH 1/2][v7] driver/memory:Move Freescale IFC driver to a common driver

2014-01-30 Thread Scott Wood
On Thu, 2014-01-30 at 21:23 -0600, Kushwaha Prabhakar-B32579 wrote:
 
  -Original Message-
  From: Wood Scott-B07421
  Sent: Friday, January 31, 2014 3:01 AM
  To: Kushwaha Prabhakar-B32579
  Cc: linuxppc-dev@lists.ozlabs.org
  Subject: Re: [PATCH 1/2][v7] driver/memory:Move Freescale IFC driver to a 
  common
  driver
  
  On Tue, 2014-01-28 at 10:59 +0530, Prabhakar Kushwaha wrote:
Freescale IFC controller has been used for mpc8xxx. It will be used
   for ARM-based SoC as well. This patch moves the driver to
   driver/memory  and fix the header file includes.
  
Also remove module_platform_driver() and  instead call
platform_driver_register() from subsys_initcall() to make sure this
   module  has been loaded before MTD partition parsing starts.
  
   Signed-off-by: Prabhakar Kushwaha prabha...@freescale.com
   Acked-by: Arnd Bergmann a...@arndb.de
  
  When did Arnd ack this?  Especially in v7 form... and I don't see him on CC.
  
   +config FSL_IFC
   + bool "Freescale Integrated Flash Controller"
   + depends on FSL_SOC
   + help
   +   This driver is for the Integrated Flash Controller Controller(IFC)
  
  Controller Controller?
 
 I will fix it
 
  
   +   module available in Freescale SoCs. This controller allows to handle
  flash
   +   devices such as NOR, NAND, FPGA and ASIC etc
  
  FPGA and ASIC are not (necessarily) flash devices.
  
 
 Yes it true. 
 I am not sure this folder is only for flash controller.
 I can see references of FPGA, SRAM in same Kconfigs. 

Right, just fix the help text.
s/handle flash devices/handle devices/

-Scott




[PATCH 0/3] powerpc: Free up an IPI message slot for tick broadcast IPIs

2014-01-30 Thread Preeti U Murthy
This patchset is a precursor for enabling deep idle states on powerpc,
in which the local CPU timers stop. The tick broadcast framework in
the Linux kernel today handles wakeup of such CPUs at their next timer event
by using an external clock device. At the expiry of this clock device, IPIs
are sent to the CPUs in deep idle states so that they wake up to handle their
respective timers. This patchset frees up one of the IPI slots on powerpc
so that it can be used to handle the tick broadcast IPI.

On certain implementations of powerpc, such an external clock device is absent.
Adding support to the tick broadcast framework to handle wakeup of CPUs from
deep idle states on such implementations is currently under discussion.
https://lkml.org/lkml/2014/1/15/86
https://lkml.org/lkml/2014/1/24/28

Either way this patchset is essential to enable handling the tick broadcast 
IPIs.
---

Preeti U Murthy (1):
  cpuidle/ppc: Split timer_interrupt() into timer handling and interrupt 
handling routines

Srivatsa S. Bhat (2):
  powerpc: Free up the slot of PPC_MSG_CALL_FUNC_SINGLE IPI message
  powerpc: Implement tick broadcast IPI as a fixed IPI message


 arch/powerpc/include/asm/smp.h  |2 -
 arch/powerpc/include/asm/time.h |1 
 arch/powerpc/kernel/smp.c   |   23 ++--
 arch/powerpc/kernel/time.c  |   86 ++-
 arch/powerpc/platforms/cell/interrupt.c |2 -
 arch/powerpc/platforms/ps3/smp.c|2 -
 6 files changed, 71 insertions(+), 45 deletions(-)

-- 



[PATCH 1/3] powerpc: Free up the slot of PPC_MSG_CALL_FUNC_SINGLE IPI message

2014-01-30 Thread Preeti U Murthy
From: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com

The IPI handlers for both PPC_MSG_CALL_FUNCTION and PPC_MSG_CALL_FUNC_SINGLE map
to a common implementation - generic_smp_call_function_single_interrupt(). So,
we can consolidate them and save one of the IPI message slots (which are
precious on powerpc, since only 4 of those slots are available).

So, implement the functionality of PPC_MSG_CALL_FUNC_SINGLE using
PPC_MSG_CALL_FUNCTION itself and release its IPI message slot, so that it can be
used for something else in the future, if desired.

Signed-off-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
Signed-off-by: Preeti U. Murthy pre...@linux.vnet.ibm.com
Acked-by: Geoff Levand ge...@infradead.org [For the PS3 part]
---

 arch/powerpc/include/asm/smp.h  |2 +-
 arch/powerpc/kernel/smp.c   |   12 +---
 arch/powerpc/platforms/cell/interrupt.c |2 +-
 arch/powerpc/platforms/ps3/smp.c|2 +-
 4 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 084e080..9f7356b 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -120,7 +120,7 @@ extern int cpu_to_core_id(int cpu);
  * in /proc/interrupts will be wrong!!! --Troy */
 #define PPC_MSG_CALL_FUNCTION   0
 #define PPC_MSG_RESCHEDULE  1
-#define PPC_MSG_CALL_FUNC_SINGLE   2
+#define PPC_MSG_UNUSED 2
 #define PPC_MSG_DEBUGGER_BREAK  3
 
 /* for irq controllers that have dedicated ipis per message (4) */
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index ac2621a..ee7d76b 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -145,9 +145,9 @@ static irqreturn_t reschedule_action(int irq, void *data)
 	return IRQ_HANDLED;
 }
 
-static irqreturn_t call_function_single_action(int irq, void *data)
+static irqreturn_t unused_action(int irq, void *data)
 {
-	generic_smp_call_function_single_interrupt();
+	/* This slot is unused and hence available for use, if needed */
 	return IRQ_HANDLED;
 }
 
@@ -168,14 +168,14 @@ static irqreturn_t debug_ipi_action(int irq, void *data)
 static irq_handler_t smp_ipi_action[] = {
 	[PPC_MSG_CALL_FUNCTION] =  call_function_action,
 	[PPC_MSG_RESCHEDULE] = reschedule_action,
-	[PPC_MSG_CALL_FUNC_SINGLE] = call_function_single_action,
+	[PPC_MSG_UNUSED] = unused_action,
 	[PPC_MSG_DEBUGGER_BREAK] = debug_ipi_action,
 };
 
 const char *smp_ipi_name[] = {
 	[PPC_MSG_CALL_FUNCTION] =  "ipi call function",
 	[PPC_MSG_RESCHEDULE] = "ipi reschedule",
-	[PPC_MSG_CALL_FUNC_SINGLE] = "ipi call function single",
+	[PPC_MSG_UNUSED] = "ipi unused",
 	[PPC_MSG_DEBUGGER_BREAK] = "ipi debugger",
 };
 
@@ -251,8 +251,6 @@ irqreturn_t smp_ipi_demux(void)
 			generic_smp_call_function_interrupt();
 		if (all & IPI_MESSAGE(PPC_MSG_RESCHEDULE))
 			scheduler_ipi();
-		if (all & IPI_MESSAGE(PPC_MSG_CALL_FUNC_SINGLE))
-			generic_smp_call_function_single_interrupt();
 		if (all & IPI_MESSAGE(PPC_MSG_DEBUGGER_BREAK))
 			debug_ipi_action(0, NULL);
 	} while (info->messages);
@@ -280,7 +278,7 @@ EXPORT_SYMBOL_GPL(smp_send_reschedule);
 
 void arch_send_call_function_single_ipi(int cpu)
 {
-	do_message_pass(cpu, PPC_MSG_CALL_FUNC_SINGLE);
+	do_message_pass(cpu, PPC_MSG_CALL_FUNCTION);
 }
 
 void arch_send_call_function_ipi_mask(const struct cpumask *mask)
diff --git a/arch/powerpc/platforms/cell/interrupt.c b/arch/powerpc/platforms/cell/interrupt.c
index 2d42f3b..adf3726 100644
--- a/arch/powerpc/platforms/cell/interrupt.c
+++ b/arch/powerpc/platforms/cell/interrupt.c
@@ -215,7 +215,7 @@ void iic_request_IPIs(void)
 {
 	iic_request_ipi(PPC_MSG_CALL_FUNCTION);
 	iic_request_ipi(PPC_MSG_RESCHEDULE);
-	iic_request_ipi(PPC_MSG_CALL_FUNC_SINGLE);
+	iic_request_ipi(PPC_MSG_UNUSED);
 	iic_request_ipi(PPC_MSG_DEBUGGER_BREAK);
 }
 
diff --git a/arch/powerpc/platforms/ps3/smp.c b/arch/powerpc/platforms/ps3/smp.c
index 4b35166..00d1a7c 100644
--- a/arch/powerpc/platforms/ps3/smp.c
+++ b/arch/powerpc/platforms/ps3/smp.c
@@ -76,7 +76,7 @@ static int __init ps3_smp_probe(void)
 
 	BUILD_BUG_ON(PPC_MSG_CALL_FUNCTION    != 0);
 	BUILD_BUG_ON(PPC_MSG_RESCHEDULE       != 1);
-	BUILD_BUG_ON(PPC_MSG_CALL_FUNC_SINGLE != 2);
+	BUILD_BUG_ON(PPC_MSG_UNUSED           != 2);
 	BUILD_BUG_ON(PPC_MSG_DEBUGGER_BREAK   != 3);
 
 	for (i = 0; i < MSG_COUNT; i++) {

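The consolidation in this patch can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct cpu_model`, `message_pass()`, `ipi_demux()` and the `MSG_*` names are hypothetical stand-ins that echo the patch, not kernel code, and the "handler" merely counts invocations. It shows both the single-target and the multi-target call-function requests landing in the same slot, which leaves slot 2 idle.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the four fixed IPI message slots. */
enum {
	MSG_CALL_FUNCTION,
	MSG_RESCHEDULE,
	MSG_UNUSED,          /* freed by folding "single" into CALL_FUNCTION */
	MSG_DEBUGGER_BREAK,
	MSG_COUNT
};

#define MSG_BIT(m) (1u << (m))

struct cpu_model {
	uint32_t pending;             /* one pending bit per message slot */
	unsigned handled[MSG_COUNT];  /* how often each slot's handler ran */
};

/* Stand-in for do_message_pass(): mark a message pending on the target. */
static void message_pass(struct cpu_model *cpu, int msg)
{
	assert(msg >= 0 && msg < MSG_COUNT);
	cpu->pending |= MSG_BIT(msg);
}

/* After the patch, both request flavours use the same slot. */
static void send_call_function_single(struct cpu_model *cpu)
{
	message_pass(cpu, MSG_CALL_FUNCTION);
}

static void send_call_function_many(struct cpu_model *cpu)
{
	message_pass(cpu, MSG_CALL_FUNCTION);
}

/* Demultiplex: grab all pending bits, then run one handler per set bit. */
static void ipi_demux(struct cpu_model *cpu)
{
	while (cpu->pending) {
		uint32_t all = cpu->pending;

		cpu->pending = 0;
		for (int m = 0; m < MSG_COUNT; m++)
			if (all & MSG_BIT(m))
				cpu->handled[m]++;
	}
}
```

In this model two requests raised before the demux runs coalesce into one handler invocation, which is harmless only if the shared handler processes every queued request each time it runs; the model does not verify that property of the real handler.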


[PATCH 2/3] powerpc: Implement tick broadcast IPI as a fixed IPI message

2014-01-30 Thread Preeti U Murthy
From: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com

For scalability and performance reasons, we want the tick broadcast IPIs
to be handled as efficiently as possible. Fixed IPI messages
are one of the most efficient mechanisms available - they are faster than
the smp_call_function mechanism because the IPI handlers are fixed and hence
they don't involve costly operations such as adding IPI handlers to the target
CPU's function queue, acquiring locks for synchronization etc.

Luckily we have an unused IPI message slot, so use that to implement
tick broadcast IPIs efficiently.

Signed-off-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
[Functions renamed to tick_broadcast* and Changelog modified by
 Preeti U. Murthy pre...@linux.vnet.ibm.com]
Signed-off-by: Preeti U. Murthy pre...@linux.vnet.ibm.com
Acked-by: Geoff Levand ge...@infradead.org [For the PS3 part]
---

 arch/powerpc/include/asm/smp.h  |2 +-
 arch/powerpc/include/asm/time.h |1 +
 arch/powerpc/kernel/smp.c   |   19 +++
 arch/powerpc/kernel/time.c  |5 +
 arch/powerpc/platforms/cell/interrupt.c |2 +-
 arch/powerpc/platforms/ps3/smp.c|2 +-
 6 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 9f7356b..ff51046 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -120,7 +120,7 @@ extern int cpu_to_core_id(int cpu);
  * in /proc/interrupts will be wrong!!! --Troy */
 #define PPC_MSG_CALL_FUNCTION   0
 #define PPC_MSG_RESCHEDULE  1
-#define PPC_MSG_UNUSED 2
+#define PPC_MSG_TICK_BROADCAST 2
 #define PPC_MSG_DEBUGGER_BREAK  3
 
 /* for irq controllers that have dedicated ipis per message (4) */
diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index c1f2676..1d428e6 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -28,6 +28,7 @@ extern struct clock_event_device decrementer_clockevent;
 struct rtc_time;
 extern void to_tm(int tim, struct rtc_time * tm);
 extern void GregorianDay(struct rtc_time *tm);
+extern void tick_broadcast_ipi_handler(void);
 
 extern void generic_calibrate_decr(void);
 
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index ee7d76b..6f06f05 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -35,6 +35,7 @@
 #include <asm/ptrace.h>
 #include <linux/atomic.h>
 #include <asm/irq.h>
+#include <asm/hw_irq.h>
 #include <asm/page.h>
 #include <asm/pgtable.h>
 #include <asm/prom.h>
@@ -145,9 +146,9 @@ static irqreturn_t reschedule_action(int irq, void *data)
 	return IRQ_HANDLED;
 }
 
-static irqreturn_t unused_action(int irq, void *data)
+static irqreturn_t tick_broadcast_ipi_action(int irq, void *data)
 {
-	/* This slot is unused and hence available for use, if needed */
+	tick_broadcast_ipi_handler();
 	return IRQ_HANDLED;
 }
 
@@ -168,14 +169,14 @@ static irqreturn_t debug_ipi_action(int irq, void *data)
 static irq_handler_t smp_ipi_action[] = {
 	[PPC_MSG_CALL_FUNCTION] =  call_function_action,
 	[PPC_MSG_RESCHEDULE] = reschedule_action,
-	[PPC_MSG_UNUSED] = unused_action,
+	[PPC_MSG_TICK_BROADCAST] = tick_broadcast_ipi_action,
 	[PPC_MSG_DEBUGGER_BREAK] = debug_ipi_action,
 };
 
 const char *smp_ipi_name[] = {
 	[PPC_MSG_CALL_FUNCTION] =  "ipi call function",
 	[PPC_MSG_RESCHEDULE] = "ipi reschedule",
-	[PPC_MSG_UNUSED] = "ipi unused",
+	[PPC_MSG_TICK_BROADCAST] = "ipi tick-broadcast",
 	[PPC_MSG_DEBUGGER_BREAK] = "ipi debugger",
 };
 
@@ -251,6 +252,8 @@ irqreturn_t smp_ipi_demux(void)
 			generic_smp_call_function_interrupt();
 		if (all & IPI_MESSAGE(PPC_MSG_RESCHEDULE))
 			scheduler_ipi();
+		if (all & IPI_MESSAGE(PPC_MSG_TICK_BROADCAST))
+			tick_broadcast_ipi_handler();
 		if (all & IPI_MESSAGE(PPC_MSG_DEBUGGER_BREAK))
 			debug_ipi_action(0, NULL);
 	} while (info->messages);
@@ -289,6 +292,14 @@ void arch_send_call_function_ipi_mask(const struct cpumask *mask)
 		do_message_pass(cpu, PPC_MSG_CALL_FUNCTION);
 }
 
+void tick_broadcast(const struct cpumask *mask)
+{
+	unsigned int cpu;
+
+	for_each_cpu(cpu, mask)
+		do_message_pass(cpu, PPC_MSG_TICK_BROADCAST);
+}
+
 #if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC)
 void smp_send_debugger_break(void)
 {
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index b3dab20..3ff97db 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -825,6 +825,11 @@ static void decrementer_set_mode(enum clock_event_mode mode,
 		decrementer_set_next_event(DECREMENTER_MAX, dev);
 }
 
+/* Interrupt handler for the timer broadcast IPI */
+void tick_broadcast_ipi_handler(void)
+{
+}
+
 static void 

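The tick_broadcast() helper in this patch walks a CPU mask and posts the same fixed message to every CPU in it. A rough user-space sketch of that loop follows. It assumes a plain 32-bit bitmap in place of struct cpumask; `NR_CPUS_MODEL`, `pending[]`, `message_pass()` and `tick_broadcast_model()` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_MODEL      8   /* hypothetical CPU count for the model */
#define MSG_TICK_BROADCAST 2   /* slot number used by the patch */

/* Per-CPU pending-message bits; stands in for the real per-CPU state. */
static uint32_t pending[NR_CPUS_MODEL];

/* Stand-in for do_message_pass(): mark one message pending on one CPU. */
static void message_pass(unsigned int cpu, int msg)
{
	assert(cpu < NR_CPUS_MODEL);
	pending[cpu] |= 1u << msg;
}

/* Model of tick_broadcast(): post the fixed message to each CPU in mask. */
static void tick_broadcast_model(uint32_t mask)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		if (mask & (1u << cpu))
			message_pass(cpu, MSG_TICK_BROADCAST);
}
```

Posting the message only sets a bit on the target in this model; nothing is queued or allocated per request, which is the property the changelog relies on when it calls fixed IPI messages cheap.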
[PATCH 3/3] cpuidle/ppc: Split timer_interrupt() into timer handling and interrupt handling routines

2014-01-30 Thread Preeti U Murthy
From: Preeti U Murthy pre...@linux.vnet.ibm.com

Split timer_interrupt(), which is the local timer interrupt handler on ppc,
into routines called during regular interrupt handling and __timer_interrupt(),
which takes care of running local timers and collecting time related stats.

This will enable callers interested only in running expired local timers to
directly call into __timer_interrupt(). One of the use cases of this is the
tick broadcast IPI handling in which the sleeping CPUs need to handle the local
timers that have expired.

Signed-off-by: Preeti U Murthy pre...@linux.vnet.ibm.com
---

 arch/powerpc/kernel/time.c |   81 +---
 1 file changed, 46 insertions(+), 35 deletions(-)

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 3ff97db..df2989b 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -478,6 +478,47 @@ void arch_irq_work_raise(void)
 
 #endif /* CONFIG_IRQ_WORK */
 
+void __timer_interrupt(void)
+{
+	struct pt_regs *regs = get_irq_regs();
+	u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
+	struct clock_event_device *evt = &__get_cpu_var(decrementers);
+	u64 now;
+
+	trace_timer_interrupt_entry(regs);
+
+	if (test_irq_work_pending()) {
+		clear_irq_work_pending();
+		irq_work_run();
+	}
+
+	now = get_tb_or_rtc();
+	if (now >= *next_tb) {
+		*next_tb = ~(u64)0;
+		if (evt->event_handler)
+			evt->event_handler(evt);
+		__get_cpu_var(irq_stat).timer_irqs_event++;
+	} else {
+		now = *next_tb - now;
+		if (now <= DECREMENTER_MAX)
+			set_dec((int)now);
+		/* We may have raced with new irq work */
+		if (test_irq_work_pending())
+			set_dec(1);
+		__get_cpu_var(irq_stat).timer_irqs_others++;
+	}
+
+#ifdef CONFIG_PPC64
+	/* collect purr register values often, for accurate calculations */
+	if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
+		struct cpu_usage *cu = &__get_cpu_var(cpu_usage_array);
+		cu->current_tb = mfspr(SPRN_PURR);
+	}
+#endif
+
+	trace_timer_interrupt_exit(regs);
+}
+
 /*
  * timer_interrupt - gets called when the decrementer overflows,
  * with interrupts disabled.
@@ -486,8 +527,6 @@ void timer_interrupt(struct pt_regs * regs)
 {
 	struct pt_regs *old_regs;
 	u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
-	struct clock_event_device *evt = &__get_cpu_var(decrementers);
-	u64 now;
 
 	/* Ensure a positive value is written to the decrementer, or else
 	 * some CPUs will continue to take decrementer exceptions.
@@ -519,39 +558,7 @@ void timer_interrupt(struct pt_regs * regs)
 	old_regs = set_irq_regs(regs);
 	irq_enter();
 
-	trace_timer_interrupt_entry(regs);
-
-	if (test_irq_work_pending()) {
-		clear_irq_work_pending();
-		irq_work_run();
-	}
-
-	now = get_tb_or_rtc();
-	if (now >= *next_tb) {
-		*next_tb = ~(u64)0;
-		if (evt->event_handler)
-			evt->event_handler(evt);
-		__get_cpu_var(irq_stat).timer_irqs_event++;
-	} else {
-		now = *next_tb - now;
-		if (now <= DECREMENTER_MAX)
-			set_dec((int)now);
-		/* We may have raced with new irq work */
-		if (test_irq_work_pending())
-			set_dec(1);
-		__get_cpu_var(irq_stat).timer_irqs_others++;
-	}
-
-#ifdef CONFIG_PPC64
-	/* collect purr register values often, for accurate calculations */
-	if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
-		struct cpu_usage *cu = &__get_cpu_var(cpu_usage_array);
-		cu->current_tb = mfspr(SPRN_PURR);
-	}
-#endif
-
-	trace_timer_interrupt_exit(regs);
-
+	__timer_interrupt();
 	irq_exit();
 	set_irq_regs(old_regs);
 }
@@ -828,6 +835,10 @@ static void decrementer_set_mode(enum clock_event_mode mode,
 /* Interrupt handler for the timer broadcast IPI */
 void tick_broadcast_ipi_handler(void)
 {
+	u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
+
+	*next_tb = get_tb_or_rtc();
+	__timer_interrupt();
 }
 
 static void register_decrementer_clockevent(int cpu)

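The interplay between the split-out timer core and the broadcast IPI handler can be modelled in user space. The following is a minimal sketch, not the kernel implementation: `struct timer_model`, `timer_core()`, `broadcast_ipi_handler()` and `DEC_MAX_MODEL` are hypothetical names, the timebase is a plain field rather than a hardware register, and irq work, tracing and PURR accounting are left out. It shows why the handler sets next_tb to the current time first: doing so forces the core down the "timer expired" branch.

```c
#include <assert.h>
#include <stdint.h>

#define DEC_MAX_MODEL 0x7fffffffULL   /* assumed decrementer range for the model */

/* Hypothetical single-CPU model; field names echo the patch. */
struct timer_model {
	uint64_t now;       /* stand-in for get_tb_or_rtc() */
	uint64_t next_tb;   /* next programmed expiry */
	uint64_t dec;       /* value last written to the decrementer */
	unsigned events;    /* times the event handler would have run */
	unsigned others;    /* early wakeups that only re-armed the decrementer */
};

/* Model of __timer_interrupt(): run the event or re-arm for the remainder. */
static void timer_core(struct timer_model *t)
{
	if (t->now >= t->next_tb) {
		t->next_tb = ~(uint64_t)0;   /* nothing armed until re-programmed */
		t->events++;
	} else {
		uint64_t delta = t->next_tb - t->now;

		if (delta <= DEC_MAX_MODEL)
			t->dec = delta;
		t->others++;
	}
}

/* Model of tick_broadcast_ipi_handler(): pretend the timer is due now. */
static void broadcast_ipi_handler(struct timer_model *t)
{
	t->next_tb = t->now;
	timer_core(t);
}
```

Calling `timer_core()` directly with an expiry still in the future only re-arms the model's decrementer, whereas going through `broadcast_ipi_handler()` always counts as an event.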


Re: [PATCH v2] kexec/ppc64 fix device tree endianess issues for memory attributes

2014-01-30 Thread Simon Horman
On Thu, Jan 30, 2014 at 04:06:22PM +0100, Laurent Dufour wrote:
 All the attributes exposed in the device tree are in Big Endian format.
 
 This patch adds the byte swap operation for some entries which were not yet
 processed, including those fixed by the following kernel patch:
 
 https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-January/114720.html
 
 To work in PPC64 Little Endian mode, kexec now requires that the kernel
 patch mentioned above be applied to the kexecing kernel.
 
 Tested on ppc64 LPAR (kexec/dump) and ppc64le in a Qemu/KVM guest (kexec)
 
 Changes from v1:
  * add processing of the following entries:
    - ibm,dynamic-reconfiguration-memory
    - chosen/linux,kernel-end
    - chosen/linux,crashkernel-base & size
    - chosen/linux,memory-limit
    - chosen/linux,htab-base & size
    - linux,tce-base & size
    - memory@/reg
 Signed-off-by: Laurent Dufour lduf...@linux.vnet.ibm.com

Thanks, applied.
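
For readers following the endianness point: device tree property values are stored big-endian, so a little-endian host must convert them before use. A minimal sketch of a host-independent decode is shown below. The helper names `read_be()`, `read_be32()` and `read_be64()` are hypothetical and are not taken from kexec-tools, which may well use different helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assemble a big-endian integer of nbytes (at most 8) from raw property
 * bytes. Works the same on big- and little-endian hosts because it never
 * reinterprets the buffer as a wider integer type.
 */
static uint64_t read_be(const unsigned char *p, size_t nbytes)
{
	uint64_t v = 0;

	assert(p != NULL && nbytes <= 8);
	for (size_t i = 0; i < nbytes; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Convenience wrappers for the two common cell widths. */
static uint64_t read_be64(const unsigned char *p)
{
	return read_be(p, 8);
}

static uint32_t read_be32(const unsigned char *p)
{
	return (uint32_t)read_be(p, 4);
}
```

Reading the bytes one at a time avoids both alignment problems and any dependence on host byte order, at the cost of a short loop per value.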