RE: [PATCH] powerpc: mitigate impact of decrementer reset

2014-10-08 Thread Heinz Wrobel
Paul,

What if your tb wraps during the test?

 -----Original Message-----
 From: Linuxppc-dev [mailto:linuxppc-dev-
 bounces+heinz.wrobel=freescale@lists.ozlabs.org] On Behalf Of Paul
 Clarke
 Sent: Tuesday, October 07, 2014 21:13
 To: linuxppc-dev@lists.ozlabs.org
 Subject: [PATCH] powerpc: mitigate impact of decrementer reset
 
 The POWER ISA defines an always-running decrementer which can be used to
 schedule interrupts after a certain time interval has elapsed.
 The decrementer counts down at the same frequency as the Time Base, which
 is 512 MHz.  The maximum value of the decrementer is 0x7fffffff.
 This works out to a maximum interval of about 4.19 seconds.
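
As a quick check of the arithmetic above, the following standalone C sketch recomputes the maximum interval. It assumes the 512 MHz Time Base quoted here and takes the decrementer maximum to be the largest positive 32-bit value, 0x7fffffff, which is what an interval of about 4.19 seconds implies. The helper names are made up for illustration and are not kernel functions.

```c
#include <stdint.h>

/* Values taken from the patch description; assumptions for any other system. */
#define TB_FREQ_HZ      512000000ULL   /* Time Base frequency: 512 MHz */
#define DECREMENTER_MAX 0x7fffffffULL  /* largest positive 32-bit value */

/* Longest interval a single decrementer load can cover, in microseconds. */
static uint64_t dec_max_interval_us(void)
{
	return DECREMENTER_MAX * 1000000ULL / TB_FREQ_HZ;
}

/* Number of decrementer expiries needed to cover `ticks` Time Base ticks
 * (hypothetical helper, rounds up). */
static uint64_t dec_reloads_needed(uint64_t ticks)
{
	return (ticks + DECREMENTER_MAX - 1) / DECREMENTER_MAX;
}
```

Under those assumptions a single load covers roughly 4.19 seconds, so a 10 second timer (5,120,000,000 ticks) would need three decrementer expiries.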
 
 If a larger interval is desired, the kernel will set the decrementer to its
 maximum value and reset it after it expires (underflows) a sufficient number
 of times until the desired interval has elapsed.
 
 The negative effect of this is that an unwanted latency spike will impact
 normal processing at most every 4.19 seconds.  On an IBM POWER8-based system,
 this spike was measured at about 25-30 microseconds, much of which was basic,
 opportunistic housekeeping tasks that could otherwise have waited.
 
 This patch short-circuits the reset of the decrementer, exiting after the
 decrementer reset, but before the housekeeping tasks if the only need for the
 interrupt is simply to reset it.  After this patch, the latency spike was
 measured at about 150 nanoseconds.
 
 Signed-off-by: Paul A. Clarke p...@us.ibm.com
 ---
   arch/powerpc/kernel/time.c | 13 +
   1 file changed, 13 insertions(+)
 
 diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
 index 368ab37..962a06b 100644
 --- a/arch/powerpc/kernel/time.c
 +++ b/arch/powerpc/kernel/time.c
 @@ -528,6 +528,7 @@ void timer_interrupt(struct pt_regs * regs)
   {
   struct pt_regs *old_regs;
   u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
 + u64 now;
 
   /* Ensure a positive value is written to the decrementer, or else
* some CPUs will continue to take decrementer exceptions.
 @@ -550,6 +551,18 @@ void timer_interrupt(struct pt_regs * regs)
*/
   may_hard_irq_enable();
 
 + /* If this is simply the decrementer expiring (underflow) due to
 +  * the limited size of the decrementer, and not a set timer,
 +  * reset (if needed) and return
 +  */
 + now = get_tb_or_rtc();
 + if (now < *next_tb) {

What if now and *next_tb are not on the same wrap count? They are both modulo
values AFAICS.
Shouldn't this be right here more like an if ((*next_tb - now) < 2^63) style
test to check for deltas within the range instead of absolute values?
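
A delta-based test of the kind suggested here could be sketched as follows. This is a standalone illustration, not code from the patch: tb_before() is a hypothetical helper, and it assumes both values are free-running 64-bit counters that wrap modulo 2^64.

```c
#include <stdint.h>
#include <stdbool.h>

/* True if `now` lies strictly before `next`, judged by their distance
 * rather than by their absolute values, so a wrap between the two
 * samples does not invert the result. Hypothetical helper. */
static bool tb_before(uint64_t now, uint64_t next)
{
	uint64_t delta = next - now;   /* unsigned subtraction wraps modulo 2^64 */

	return delta != 0 && delta < (UINT64_C(1) << 63);
}
```

The comparison stays correct as long as the two values are less than 2^63 ticks apart, which is the range restriction the question above alludes to.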

 + now = *next_tb - now;
 + if (now <= DECREMENTER_MAX)
 + set_dec((int)now);
 + __get_cpu_var(irq_stat).timer_irqs_others++;
 + return;
 + }
 
   #if defined(CONFIG_PPC32) && defined(CONFIG_PPC_PMAC)
   if (atomic_read(&ppc_n_lost_interrupts) != 0)
 --
 2.1.2.330.g565301e

BR,

Heinz
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 4/7] sound/radeon: Add quirk for broken 64-bit MSI

2014-10-08 Thread Alex Deucher
On Wed, Oct 8, 2014 at 1:28 AM, Benjamin Herrenschmidt
b...@kernel.crashing.org wrote:
 On Tue, 2014-10-07 at 19:47 -0400, Alex Deucher wrote:
  This moves the setting of the quirk flag to the audio driver.
 
  While recent ASICs have that problem fixed, they don't seem to
  be listed in the PCI IDs of the current driver, so let's quirk all
  the ATI HDMI for now. The consequences are nil on x86 anyway.
 
  Signed-off-by: Alex Deucher alexdeuc...@gmail.com
  Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
  CC: sta...@vger.kernel.org

 Further discussion with the hw teams has revealed that this is still
 an issue on newer asics so I think your original patch is correct
 after all.  Just disable 64 bit MSIs on all AMD audio PCI ids.

 All right, I won't resend the whole series, I can just pick up my previous
 patch. Takashi, Bjorn, Dave, this series covers your 3 areas of
 maintainership, how do you want to proceed? I'm happy to merge the
 whole lot via powerpc ASAP (since it's all CC'ed stable) if you guys
 send me the appropriate acks, otherwise, let me know.


I don't remember if I gave my formal review of your original patch, so if not,
Reviewed-by: Alex Deucher alexander.deuc...@amd.com

Alex

Re: [PATCH v3 4/7] sound/radeon: Add quirk for broken 64-bit MSI

2014-10-08 Thread Benjamin Herrenschmidt
On Tue, 2014-10-07 at 19:47 -0400, Alex Deucher wrote:
  This moves the setting of the quirk flag to the audio driver.
 
  While recent ASICs have that problem fixed, they don't seem to
  be listed in the PCI IDs of the current driver, so let's quirk all
  the ATI HDMI for now. The consequences are nil on x86 anyway.
 
  Signed-off-by: Alex Deucher alexdeuc...@gmail.com
  Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
  CC: sta...@vger.kernel.org
 
 Further discussion with the hw teams has revealed that this is still
 an issue on newer asics so I think your original patch is correct
 after all.  Just disable 64 bit MSIs on all AMD audio PCI ids.

All right, I won't resend the whole series, I can just pick up my previous
patch. Takashi, Bjorn, Dave, this series covers your 3 areas of
maintainership, how do you want to proceed? I'm happy to merge the
whole lot via powerpc ASAP (since it's all CC'ed stable) if you guys
send me the appropriate acks, otherwise, let me know.

Cheers,
Ben.



Re: [PATCH v3 4/7] sound/radeon: Add quirk for broken 64-bit MSI

2014-10-08 Thread Takashi Iwai
At Wed, 08 Oct 2014 16:28:16 +1100,
Benjamin Herrenschmidt wrote:
 
 On Tue, 2014-10-07 at 19:47 -0400, Alex Deucher wrote:
   This moves the setting of the quirk flag to the audio driver.
  
   While recent ASICs have that problem fixed, they don't seem to
   be listed in the PCI IDs of the current driver, so let's quirk all
   the ATI HDMI for now. The consequences are nil on x86 anyway.
  
   Signed-off-by: Alex Deucher alexdeuc...@gmail.com
   Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
   CC: sta...@vger.kernel.org
  
  Further discussion with the hw teams has revealed that this is still
  an issue on newer asics so I think your original patch is correct
  after all.  Just disable 64 bit MSIs on all AMD audio PCI ids.
 
 All right, I won't resend the whole series, I can just pick up my previous
 patch. Takashi, Bjorn, Dave, this series covers your 3 areas of
 maintainership, how do you want to proceed? I'm happy to merge the
 whole lot via powerpc ASAP (since it's all CC'ed stable) if you guys
 send me the appropriate acks, otherwise, let me know.

Feel free to merge through your tree.
  Reviewed-by: Takashi Iwai ti...@suse.de


thanks,

Takashi

Re: [PATCH 08/44] kernel: Move pm_power_off to common code

2014-10-08 Thread Jesper Nilsson
On Tue, Oct 07, 2014 at 07:28:10AM +0200, Guenter Roeck wrote:
 pm_power_off is defined for all architectures. Move it to common code.
 
 Have all architectures call do_kernel_poweroff instead of pm_power_off.
 Some architectures point pm_power_off to machine_power_off. For those,
 call do_kernel_poweroff from machine_power_off instead.

For the CRIS parts:

  arch/cris/kernel/process.c |  4 +---

Acked-by: Jesper Nilsson jesper.nils...@axis.com


/^JN - Jesper Nilsson
-- 
   Jesper Nilsson -- jesper.nils...@axis.com

Re: [PATCH] powerpc: Reimplement __get_SP() as a function not a define

2014-10-08 Thread Li Zhong
On Wed, 2014-10-01 at 15:10 +1000, Anton Blanchard wrote:
 Li Zhong points out an issue with our current __get_SP()
 implementation. If ftrace function tracing is enabled (ie -pg
 profiling using _mcount) we spill a stack frame on 64bit all the
 time.
 
 If a function calls __get_SP() and later calls a function that is
 tail call optimised, we will pop the stack frame and the value
 returned by __get_SP() is no longer valid. An example from Li can
 be found in save_stack_trace -> save_context_stack:
 
 c00432c0 <.save_stack_trace>:
 c00432c0:   mflr    r0
 c00432c4:   std     r0,16(r1)
 c00432c8:   stdu    r1,-128(r1) <-- stack frame for _mcount
 c00432cc:   std     r3,112(r1)
 c00432d0:   bl      ._mcount
 c00432d4:   nop
 
 c00432d8:   mr      r4,r1 <-- __get_SP()
 
 c00432dc:   ld      r5,632(r13)
 c00432e0:   ld      r3,112(r1)
 c00432e4:   li      r6,1
 
 c00432e8:   addi    r1,r1,128 <-- pop stack frame
 
 c00432ec:   ld      r0,16(r1)
 c00432f0:   mtlr    r0
 c00432f4:   b       .save_context_stack <-- tail call optimized
 
 save_context_stack ends up with a stack pointer below the current
 one, and it is likely to be scribbled over.
 
 Fix this by making __get_SP() a function which returns the
 callers stack frame. Also replace inline assembly which grabs
 the stack pointer in save_stack_trace and show_stack with
 __get_SP().
 
 Reported-by: Li Zhong zh...@linux.vnet.ibm.com
 Signed-off-by: Anton Blanchard an...@samba.org
 ---
  arch/powerpc/include/asm/reg.h   | 3 +--
  arch/powerpc/kernel/misc.S   | 4 
  arch/powerpc/kernel/process.c| 2 +-
  arch/powerpc/kernel/stacktrace.c | 2 +-
  4 files changed, 7 insertions(+), 4 deletions(-)
 
 diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
 index 0c05059..0f973c0 100644
 --- a/arch/powerpc/include/asm/reg.h
 +++ b/arch/powerpc/include/asm/reg.h
 @@ -1264,8 +1264,7 @@ static inline unsigned long mfvtb (void)
  
  #define proc_trap()  asm volatile("trap")
  
 -#define __get_SP()   ({unsigned long sp; \
 - asm volatile("mr %0,1": "=r" (sp)); sp;})
 +extern unsigned long __get_SP(void);

It seems that some module code is using __get_SP, e.g. xfs in the
example below:
ERROR: .__get_SP [fs/xfs/xfs.ko] undefined!

Maybe we need to export this symbol in arch/powerpc/kernel/ppc_ksyms.c?

diff --git a/arch/powerpc/kernel/ppc_ksyms.c b/arch/powerpc/kernel/ppc_ksyms.c
index 48d17d6f..eebd4e4 100644
--- a/arch/powerpc/kernel/ppc_ksyms.c
+++ b/arch/powerpc/kernel/ppc_ksyms.c
@@ -207,3 +207,5 @@ EXPORT_SYMBOL_GPL(mmu_psize_defs);
 #ifdef CONFIG_EPAPR_PARAVIRT
 EXPORT_SYMBOL(epapr_hypercall_start);
 #endif
+
+EXPORT_SYMBOL(__get_SP);

With the above compile error fixed, this patch solved the SP issue I saw, so

Tested-by: Li Zhong zh...@linux.vnet.ibm.com

  
  extern unsigned long scom970_read(unsigned int address);
  extern void scom970_write(unsigned int address, unsigned long value);
 diff --git a/arch/powerpc/kernel/misc.S b/arch/powerpc/kernel/misc.S
 index 7ce26d4..120deb7 100644
 --- a/arch/powerpc/kernel/misc.S
 +++ b/arch/powerpc/kernel/misc.S
 @@ -114,3 +114,7 @@ _GLOBAL(longjmp)
   mtlr    r0
   mr  r3,r4
   blr
 +
 +_GLOBAL(__get_SP)
 + PPC_LL  r3,0(r1)
 + blr
 diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
 index aa1df89..3cc6439 100644
 --- a/arch/powerpc/kernel/process.c
 +++ b/arch/powerpc/kernel/process.c
 @@ -1545,7 +1545,7 @@ void show_stack(struct task_struct *tsk, unsigned long *stack)
   tsk = current;
   if (sp == 0) {
   if (tsk == current)
 - asm("mr %0,1" : "=r" (sp));
 + sp = __get_SP();
   else
   sp = tsk->thread.ksp;
   }
 diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
 index 3d30ef1..7f65bae 100644
 --- a/arch/powerpc/kernel/stacktrace.c
 +++ b/arch/powerpc/kernel/stacktrace.c
 @@ -50,7 +50,7 @@ void save_stack_trace(struct stack_trace *trace)
  {
   unsigned long sp;
  
 - asm("mr %0,1" : "=r" (sp));
 + sp = __get_SP();
  
   save_context_stack(trace, sp, current, 1);
  }



[PATCH v4 0/16] POWER8 Coherent Accelerator device driver

2014-10-08 Thread Michael Neuling
This is the latest version of the cxl driver.  Change log below:

v4:
 - Updates based on comments from mpe (offline and online).
 - Refactor the sstp lock to be an entry lock.
 - Fixed error paths on new status_mutex in start_work
 - added some missing include files
 - moved associating pid/mm from open() to start_work ioctl.
 - improved IDR setup and destroy
 - fix block comments.
 - remove #undef at top of files
 - wed -> work_element_descriptor on user visible interfaces
 - Lots of documentation updates.
 - Device name changes.
   - No longer has a default dev name /dev/afuM.N for each mode.
   - Dedicated, slave and master all have distinct char devs.
 - Prevent AFU reset when contexts active.
 - Endian bug fix for find_free_sste().
 - Fix locking on reset_store_afu.
 - Make CXL_IOCTL_GET_PROCESS_ELEMENT return a __u32 instead of int.
 - Rename event.afu_err.err to error
 - Fixed master specific sysfs attribute creation
 - fix sparse errors with debugfs.  Was passing iomem ptrs to userspace.

v3:
 - Updates based on comments from mpe, benh, aneesh and offline reviews.
 - Fixed bug freeing AFU IRQs that also freed the multiplexed PSL IRQ
 - Change copro_flush_all_slbs to a static inline as suggested by mpe
 - Implement sanitisation routines to clear out more registers and do full
   adapter wide tlbia and slbia when initialising hardware
 - Add self testcase to msi_bitmap to test allocations are aligned to a power of
   2 and cleanup comment as suggested by mpe
 - Clean up cxl_use_count
 - Split out detach_process_native into two logical functions
 - Improve comment in set_msi_irq_chip as requested by mpe
 - Move cxl functions in pci-ioda.c to be under just one #ifdef CONFIG_CXL_BASE
 - Cleanup hash_page and hash_page_mm from mpe's and Aneesh's reviews
 - Remove dead code in cxl_alloc_sst
 - Add timeout in afu_slbia_native
 - Remove cxl backend and driver ops abstractions
   - Removed separate cxl-pci module
   - Merged cxl pci module init calls into main driver init
 - Refactor afu_read() to be a bit simpler and more closely follow existing
   patterns in the kernel
 - Userspace API updates from reviews:
   - Added ioctl to get the process element number, and removed it as a return
     from the start work ioctl
   - Alter cxl_event to have one common header struct
   - Dropped check error ioctl
   - Added current and binary compatible API version numbers to sysfs
   - read() now takes a 4K (or greater) buffer
   - Pack event structs to reduce unnecessary reserved fields
     - Event sizes can now differ
     - All event sizes are 64bit multiples to allow future event coalescing
   - Add flags fields to indicate which fields contain valid data
   - Add BUILD_BUG_ONs to protect against inadvertently changing API without
     bumping version number and/or flags
   - Update documentation
 - Skip CXL SLBIA codepath if CXL is not in use
 - Split cxl_slbia_core into two functions to be easier to read
 - Refactor copro_data_segment (renamed to copro_calc_slb) since we are no
   longer merging with hash_page and cleaned up parameters.
 - Some renames:
   - struct cxl_t -> struct cxl
   - struct cxl_afu_t -> struct cxl_afu
   - struct cxl_context_t -> struct cxl_context
   - copro_data_segment -> copro_calc_slb
   - ctx->ph -> ctx->pe
 - Added ctx->status mutex lock around for start and release context
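
The userspace API items in the v3 list (one common event header, event sizes that are multiples of 64 bits, and build-time checks against accidental layout changes) can be illustrated with a small standalone sketch. The struct layout and field names below are hypothetical and are not the cxl API; C11 _Static_assert stands in for the kernel's BUILD_BUG_ON.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical common header shared by every event type. */
struct demo_event_header {
	uint16_t type;             /* which event this is */
	uint16_t size;             /* total event size in bytes */
	uint16_t process_element;  /* illustrative field only */
	uint16_t reserved;
};

/* Hypothetical event: common header followed by one 64-bit payload. */
struct demo_event {
	struct demo_event_header header;
	uint64_t payload;
};

/* Build-time guards: the build fails if someone changes the layout
 * without revisiting these checks (and, in a real API, the version). */
_Static_assert(sizeof(struct demo_event_header) == 8,
	       "header layout changed");
_Static_assert(offsetof(struct demo_event, payload) == 8,
	       "payload offset changed");
_Static_assert(sizeof(struct demo_event) % sizeof(uint64_t) == 0,
	       "event size must be a multiple of 64 bits");
```

Keeping every event a multiple of 8 bytes means several events can be packed back to back in one read() buffer without misaligning the next header.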

v2:
 - Updates based on comments from Anton, Gavin, Aneesh, jk and offline reviews
 - Simplified copro_data_segment() and merged code with hash_page_mm()
(New patch 10/17)
 - PCIe code simplifications based on Gavin's review
 - Removed redundant comment in msi_bitmap_alloc_hwirqs()
 - Fix for locking in idr_remove in core driver
 - Ensure PSL is enabled when PHB is flipped to CXL mode
 - Added CONFIG_PPC_COPRO_BASE to compile copro_fault.c
 - Merged SPU and cxl slb flushing calls into copro_flush_all_slbs()
(New patch 03/17)
 - Moved slb_vsid_shift() to static inline from #define
 - Don't write paca->context when demoting segments and mm != current
 - Fix minor typos in documentation

v1:
 - Initial post

This adds support for the Coherent Accelerator (cxl) attached to POWER8
processors.  This coherent accelerator interface is designed to allow the
coherent connection of FPGA based accelerators (and other devices) to POWER
systems.

IBM refers to this as the Coherent Accelerator Processor Interface or CAPI.  In
this driver it's referred to by the name cxl to avoid confusion with the ISDN
CAPI subsystem.

An overview of the patches:
  Patches  1-3:  Split some of the old Cell co-processor code out so it can be
                 reused.
  Patches  4-10: Add infrastructure to arch/powerpc needed by cxl.
  Patch    11:   Add call backs needed for invalidating cxl mm contexts.
  Patch    12:   Add cxl specific support that needs to be built in to the
                 kernel (can't be a module).
  Patches 13-15: Add the majority of the device driver and API header.
  Patch    16:   Documentation.

The documentation 

[PATCH v4 01/16] powerpc/cell: Move spu_handle_mm_fault() out of cell platform

2014-10-08 Thread Michael Neuling
From: Ian Munsie imun...@au1.ibm.com

Currently spu_handle_mm_fault() is in the cell platform.

This code is generically useful for other non-cell co-processors on powerpc.

This patch moves this function out of the cell platform into arch/powerpc/mm so
that others may use it.

Signed-off-by: Ian Munsie imun...@au1.ibm.com
Signed-off-by: Michael Neuling mi...@neuling.org
---
 arch/powerpc/Kconfig |  4 
 arch/powerpc/include/asm/copro.h | 16 
 arch/powerpc/include/asm/spu.h   |  5 ++---
 arch/powerpc/mm/Makefile |  1 +
 .../{platforms/cell/spu_fault.c => mm/copro_fault.c} | 14 ++
 arch/powerpc/platforms/cell/Kconfig  |  1 +
 arch/powerpc/platforms/cell/Makefile |  2 +-
 arch/powerpc/platforms/cell/spufs/fault.c|  4 ++--
 8 files changed, 33 insertions(+), 14 deletions(-)
 create mode 100644 arch/powerpc/include/asm/copro.h
 rename arch/powerpc/{platforms/cell/spu_fault.c => mm/copro_fault.c} (89%)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 4bc7b62..8f094e9 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -603,6 +603,10 @@ config PPC_SUBPAGE_PROT
  to set access permissions (read/write, readonly, or no access)
  on the 4k subpages of each 64k page.
 
+config PPC_COPRO_BASE
+   bool
+   default n
+
 config SCHED_SMT
bool "SMT (Hyperthreading) scheduler support"
depends on PPC64 && SMP
diff --git a/arch/powerpc/include/asm/copro.h b/arch/powerpc/include/asm/copro.h
new file mode 100644
index 000..51cae85
--- /dev/null
+++ b/arch/powerpc/include/asm/copro.h
@@ -0,0 +1,16 @@
+/*
+ * Copyright 2014 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _ASM_POWERPC_COPRO_H
+#define _ASM_POWERPC_COPRO_H
+
+int copro_handle_mm_fault(struct mm_struct *mm, unsigned long ea,
+ unsigned long dsisr, unsigned *flt);
+
+#endif /* _ASM_POWERPC_COPRO_H */
diff --git a/arch/powerpc/include/asm/spu.h b/arch/powerpc/include/asm/spu.h
index 37b7ca3..a6e6e2b 100644
--- a/arch/powerpc/include/asm/spu.h
+++ b/arch/powerpc/include/asm/spu.h
@@ -27,6 +27,8 @@
 #include <linux/workqueue.h>
 #include <linux/device.h>
 #include <linux/mutex.h>
+#include <asm/reg.h>
+#include <asm/copro.h>
 
 #define LS_SIZE (256 * 1024)
 #define LS_ADDR_MASK (LS_SIZE - 1)
@@ -277,9 +279,6 @@ void spu_remove_dev_attr(struct device_attribute *attr);
 int spu_add_dev_attr_group(struct attribute_group *attrs);
 void spu_remove_dev_attr_group(struct attribute_group *attrs);
 
-int spu_handle_mm_fault(struct mm_struct *mm, unsigned long ea,
-   unsigned long dsisr, unsigned *flt);
-
 /*
  * Notifier blocks:
  *
diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
index d0130ff..325e861 100644
--- a/arch/powerpc/mm/Makefile
+++ b/arch/powerpc/mm/Makefile
@@ -34,3 +34,4 @@ obj-$(CONFIG_TRANSPARENT_HUGEPAGE) += hugepage-hash64.o
 obj-$(CONFIG_PPC_SUBPAGE_PROT) += subpage-prot.o
 obj-$(CONFIG_NOT_COHERENT_CACHE) += dma-noncoherent.o
 obj-$(CONFIG_HIGHMEM)  += highmem.o
+obj-$(CONFIG_PPC_COPRO_BASE)   += copro_fault.o
diff --git a/arch/powerpc/platforms/cell/spu_fault.c b/arch/powerpc/mm/copro_fault.c
similarity index 89%
rename from arch/powerpc/platforms/cell/spu_fault.c
rename to arch/powerpc/mm/copro_fault.c
index 641e727..ba7df14 100644
--- a/arch/powerpc/platforms/cell/spu_fault.c
+++ b/arch/powerpc/mm/copro_fault.c
@@ -1,5 +1,5 @@
 /*
- * SPU mm fault handler
+ * CoProcessor (SPU/AFU) mm fault handler
  *
  * (C) Copyright IBM Deutschland Entwicklung GmbH 2007
  *
@@ -23,16 +23,14 @@
 #include <linux/sched.h>
 #include <linux/mm.h>
 #include <linux/export.h>
-
-#include <asm/spu.h>
-#include <asm/spu_csa.h>
+#include <asm/reg.h>
 
 /*
  * This ought to be kept in sync with the powerpc specific do_page_fault
  * function. Currently, there are a few corner cases that we haven't had
  * to handle fortunately.
  */
-int spu_handle_mm_fault(struct mm_struct *mm, unsigned long ea,
+int copro_handle_mm_fault(struct mm_struct *mm, unsigned long ea,
unsigned long dsisr, unsigned *flt)
 {
struct vm_area_struct *vma;
@@ -58,12 +56,12 @@ int spu_handle_mm_fault(struct mm_struct *mm, unsigned long ea,
goto out_unlock;
}
 
-   is_write = dsisr & MFC_DSISR_ACCESS_PUT;
+   is_write = dsisr & DSISR_ISSTORE;
if (is_write) {
if (!(vma->vm_flags & VM_WRITE))
goto out_unlock;
} else {
-   if (dsisr & MFC_DSISR_ACCESS_DENIED)
+   if (dsisr & DSISR_PROTFAULT)
goto out_unlock;
 

[PATCH v4 02/16] powerpc/cell: Move data segment faulting code out of cell platform

2014-10-08 Thread Michael Neuling
From: Ian Munsie imun...@au1.ibm.com

__spu_trap_data_seg() currently contains code to determine the VSID and ESID
required for a particular EA and mm struct.

This code is generically useful for other co-processors. This moves the code
out of the cell platform so it can be used by other powerpc code. It also adds 1TB
segment handling which Cell didn't support.  The new function is called
copro_calculate_slb().

This also moves the internal struct spu_slb to a generic struct copro_slb which
is now used in the Cell and copro code.  We use this new struct instead of
passing around esid and vsid parameters.

Signed-off-by: Ian Munsie imun...@au1.ibm.com
Signed-off-by: Michael Neuling mi...@neuling.org
---
 arch/powerpc/include/asm/copro.h   |  7 +
 arch/powerpc/include/asm/mmu-hash64.h  |  7 +
 arch/powerpc/mm/copro_fault.c  | 46 
 arch/powerpc/mm/slb.c  |  3 --
 arch/powerpc/platforms/cell/spu_base.c | 55 ++
 5 files changed, 69 insertions(+), 49 deletions(-)

diff --git a/arch/powerpc/include/asm/copro.h b/arch/powerpc/include/asm/copro.h
index 51cae85..b0e6a18 100644
--- a/arch/powerpc/include/asm/copro.h
+++ b/arch/powerpc/include/asm/copro.h
@@ -10,7 +10,14 @@
 #ifndef _ASM_POWERPC_COPRO_H
 #define _ASM_POWERPC_COPRO_H
 
+struct copro_slb
+{
+   u64 esid, vsid;
+};
+
 int copro_handle_mm_fault(struct mm_struct *mm, unsigned long ea,
  unsigned long dsisr, unsigned *flt);
 
+int copro_calculate_slb(struct mm_struct *mm, u64 ea, struct copro_slb *slb);
+
 #endif /* _ASM_POWERPC_COPRO_H */
diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index d765144..aeabd02 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -190,6 +190,13 @@ static inline unsigned int mmu_psize_to_shift(unsigned int mmu_psize)
 
 #ifndef __ASSEMBLY__
 
+static inline int slb_vsid_shift(int ssize)
+{
+   if (ssize == MMU_SEGSIZE_256M)
+   return SLB_VSID_SHIFT;
+   return SLB_VSID_SHIFT_1T;
+}
+
 static inline int segment_shift(int ssize)
 {
if (ssize == MMU_SEGSIZE_256M)
diff --git a/arch/powerpc/mm/copro_fault.c b/arch/powerpc/mm/copro_fault.c
index ba7df14..a15a23e 100644
--- a/arch/powerpc/mm/copro_fault.c
+++ b/arch/powerpc/mm/copro_fault.c
@@ -24,6 +24,7 @@
 #include <linux/mm.h>
 #include <linux/export.h>
 #include <asm/reg.h>
+#include <asm/copro.h>
 
 /*
  * This ought to be kept in sync with the powerpc specific do_page_fault
@@ -90,3 +91,48 @@ out_unlock:
return ret;
 }
 EXPORT_SYMBOL_GPL(copro_handle_mm_fault);
+
+int copro_calculate_slb(struct mm_struct *mm, u64 ea, struct copro_slb *slb)
+{
+   u64 vsid;
+   int psize, ssize;
+
+   slb->esid = (ea & ESID_MASK) | SLB_ESID_V;
+
+   switch (REGION_ID(ea)) {
+   case USER_REGION_ID:
+   pr_devel("%s: 0x%llx -- USER_REGION_ID\n", __func__, ea);
+   psize = get_slice_psize(mm, ea);
+   ssize = user_segment_size(ea);
+   vsid = get_vsid(mm->context.id, ea, ssize);
+   break;
+   case VMALLOC_REGION_ID:
+   pr_devel("%s: 0x%llx -- VMALLOC_REGION_ID\n", __func__, ea);
+   if (ea < VMALLOC_END)
+   psize = mmu_vmalloc_psize;
+   else
+   psize = mmu_io_psize;
+   ssize = mmu_kernel_ssize;
+   vsid = get_kernel_vsid(ea, mmu_kernel_ssize);
+   break;
+   case KERNEL_REGION_ID:
+   pr_devel("%s: 0x%llx -- KERNEL_REGION_ID\n", __func__, ea);
+   psize = mmu_linear_psize;
+   ssize = mmu_kernel_ssize;
+   vsid = get_kernel_vsid(ea, mmu_kernel_ssize);
+   break;
+   default:
+   pr_debug("%s: invalid region access at %016llx\n", __func__, ea);
+   return 1;
+   }
+
+   vsid = (vsid << slb_vsid_shift(ssize)) | SLB_VSID_USER;
+
+   vsid |= mmu_psize_defs[psize].sllp |
+   ((ssize == MMU_SEGSIZE_1T) ? SLB_VSID_B_1T : 0);
+
+   slb->vsid = vsid;
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(copro_calculate_slb);
diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
index 0399a67..6e450ca 100644
--- a/arch/powerpc/mm/slb.c
+++ b/arch/powerpc/mm/slb.c
@@ -46,9 +46,6 @@ static inline unsigned long mk_esid_data(unsigned long ea, int ssize,
return (ea & slb_esid_mask(ssize)) | SLB_ESID_V | slot;
 }
 
-#define slb_vsid_shift(ssize)  \
-   ((ssize) == MMU_SEGSIZE_256M? SLB_VSID_SHIFT: SLB_VSID_SHIFT_1T)
-
 static inline unsigned long mk_vsid_data(unsigned long ea, int ssize,
 unsigned long flags)
 {
diff --git a/arch/powerpc/platforms/cell/spu_base.c b/arch/powerpc/platforms/cell/spu_base.c
index 2930d1e..ffcbd24 100644
--- a/arch/powerpc/platforms/cell/spu_base.c
+++ b/arch/powerpc/platforms/cell/spu_base.c
@@ 

[PATCH v4 03/16] powerpc/cell: Make spu_flush_all_slbs() generic

2014-10-08 Thread Michael Neuling
From: Ian Munsie imun...@au1.ibm.com

This moves spu_flush_all_slbs() into a generic call copro_flush_all_slbs().

This will be useful when we add cxl which also needs a similar SLB flush call.
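
The header change in this patch follows a common pattern: declare the real function when the config option is set, and otherwise provide an empty static inline stub so callers need no #ifdef. A minimal standalone sketch of that pattern is shown here; CONFIG_DEMO_COPRO, demo_flush_all() and demo_flush_count are hypothetical stand-ins, not kernel symbols.

```c
/* Counts how often the real flush ran, so the effect is observable. */
static int demo_flush_count;

#ifdef CONFIG_DEMO_COPRO
/* Real implementation, built only when the option is enabled. */
static void demo_flush_all(void)
{
	demo_flush_count++;
}
#else
/* Empty stub: the call compiles away when the option is disabled. */
static inline void demo_flush_all(void) {}
#endif

/* The call site stays free of #ifdef blocks either way. */
static void demo_caller(void)
{
	demo_flush_all();
}
```

Built without -DCONFIG_DEMO_COPRO, demo_caller() leaves demo_flush_count at zero; built with it, each call increments the counter.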

Signed-off-by: Ian Munsie imun...@au1.ibm.com
Signed-off-by: Michael Neuling mi...@neuling.org
---
 arch/powerpc/include/asm/copro.h |  6 ++
 arch/powerpc/mm/copro_fault.c|  9 +
 arch/powerpc/mm/hash_utils_64.c  | 10 +++---
 arch/powerpc/mm/slice.c  | 10 +++---
 4 files changed, 21 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/include/asm/copro.h b/arch/powerpc/include/asm/copro.h
index b0e6a18..ce216df 100644
--- a/arch/powerpc/include/asm/copro.h
+++ b/arch/powerpc/include/asm/copro.h
@@ -20,4 +20,10 @@ int copro_handle_mm_fault(struct mm_struct *mm, unsigned long ea,
 
 int copro_calculate_slb(struct mm_struct *mm, u64 ea, struct copro_slb *slb);
 
+
+#ifdef CONFIG_PPC_COPRO_BASE
+void copro_flush_all_slbs(struct mm_struct *mm);
+#else
+static inline void copro_flush_all_slbs(struct mm_struct *mm) {}
+#endif
 #endif /* _ASM_POWERPC_COPRO_H */
diff --git a/arch/powerpc/mm/copro_fault.c b/arch/powerpc/mm/copro_fault.c
index a15a23e..f2aa5a8 100644
--- a/arch/powerpc/mm/copro_fault.c
+++ b/arch/powerpc/mm/copro_fault.c
@@ -25,6 +25,7 @@
 #include <linux/export.h>
 #include <asm/reg.h>
 #include <asm/copro.h>
+#include <asm/spu.h>
 
 /*
  * This ought to be kept in sync with the powerpc specific do_page_fault
@@ -136,3 +137,11 @@ int copro_calculate_slb(struct mm_struct *mm, u64 ea, struct copro_slb *slb)
return 0;
 }
 EXPORT_SYMBOL_GPL(copro_calculate_slb);
+
+void copro_flush_all_slbs(struct mm_struct *mm)
+{
+#ifdef CONFIG_SPU_BASE
+   spu_flush_all_slbs(mm);
+#endif
+}
+EXPORT_SYMBOL_GPL(copro_flush_all_slbs);
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index daee7f4..5c0738d 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -51,7 +51,7 @@
 #include <asm/cacheflush.h>
 #include <asm/cputable.h>
 #include <asm/sections.h>
-#include <asm/spu.h>
+#include <asm/copro.h>
 #include <asm/udbg.h>
 #include <asm/code-patching.h>
 #include <asm/fadump.h>
@@ -901,9 +901,7 @@ void demote_segment_4k(struct mm_struct *mm, unsigned long addr)
if (get_slice_psize(mm, addr) == MMU_PAGE_4K)
return;
slice_set_range_psize(mm, addr, 1, MMU_PAGE_4K);
-#ifdef CONFIG_SPU_BASE
-   spu_flush_all_slbs(mm);
-#endif
+   copro_flush_all_slbs(mm);
if (get_paca_psize(addr) != MMU_PAGE_4K) {
get_paca()->context = mm->context;
slb_flush_and_rebolt();
@@ -1141,9 +1139,7 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap)
   "to 4kB pages because of "
   "non-cacheable mapping\n");
psize = mmu_vmalloc_psize = MMU_PAGE_4K;
-#ifdef CONFIG_SPU_BASE
-   spu_flush_all_slbs(mm);
-#endif
+   copro_flush_all_slbs(mm);
}
}
 
diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index b0c75cc..a81791c 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -32,7 +32,7 @@
 #include <linux/export.h>
 #include <asm/mman.h>
 #include <asm/mmu.h>
-#include <asm/spu.h>
+#include <asm/copro.h>
 
 /* some sanity checks */
 #if (PGTABLE_RANGE >> 43) > SLICE_MASK_SIZE
@@ -232,9 +232,7 @@ static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psiz
 
spin_unlock_irqrestore(&slice_convert_lock, flags);
 
-#ifdef CONFIG_SPU_BASE
-   spu_flush_all_slbs(mm);
-#endif
+   copro_flush_all_slbs(mm);
 }
 
 /*
@@ -671,9 +669,7 @@ void slice_set_psize(struct mm_struct *mm, unsigned long address,
 
spin_unlock_irqrestore(&slice_convert_lock, flags);
 
-#ifdef CONFIG_SPU_BASE
-   spu_flush_all_slbs(mm);
-#endif
+   copro_flush_all_slbs(mm);
 }
 
 void slice_set_range_psize(struct mm_struct *mm, unsigned long start,
-- 
1.9.1


[PATCH v4 04/16] powerpc/msi: Improve IRQ bitmap allocator

2014-10-08 Thread Michael Neuling
From: Ian Munsie imun...@au1.ibm.com

Currently msi_bitmap_alloc_hwirqs() will round up any IRQ allocation requests
to the next power of 2, e.g. ask for 5 IRQs and you'll get 8. This wastes a
lot of IRQs which can be a scarce resource.

For cxl we may require multiple IRQs for every context that is attached to the
accelerator. There may be 1000s of contexts attached, hence we can easily run
out of IRQs, especially if we are needlessly wasting them.

This changes the msi_bitmap_alloc_hwirqs() to allocate only the required number
of IRQs, hence avoiding this wastage. It keeps the natural alignment
requirement though.
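
The allocation strategy described above (claim only the requested number of IRQs, but start at an offset aligned to the next power of two) can be sketched on a toy 64-bit bitmap. This is a standalone illustration, not the kernel implementation: the demo_* names and the fixed 64-bit map are assumptions, and the search is a plain linear scan.

```c
#include <stdint.h>

#define DEMO_NBITS 64u  /* toy bitmap: one bit per hardware IRQ */

/* Smallest order with (1 << order) >= num; for num >= 1 this matches
 * what the kernel's get_count_order() returns. */
static unsigned int demo_count_order(unsigned int num)
{
	unsigned int order = 0;

	while ((1u << order) < num)
		order++;
	return order;
}

/* Nonzero if bits [start, start + num) are all clear. */
static int demo_area_is_free(uint64_t map, unsigned int start, unsigned int num)
{
	for (unsigned int i = 0; i < num; i++)
		if (map & (UINT64_C(1) << (start + i)))
			return 0;
	return 1;
}

/* Claim exactly num bits (1..64) at a naturally aligned offset.
 * Returns the offset, or -1 if no suitable area is free. */
static int demo_alloc_hwirqs(uint64_t *map, unsigned int num)
{
	unsigned int align = 1u << demo_count_order(num);

	for (unsigned int start = 0; start + num <= DEMO_NBITS; start += align) {
		if (!demo_area_is_free(*map, start, num))
			continue;
		for (unsigned int i = 0; i < num; i++)
			*map |= UINT64_C(1) << (start + i);
		return (int)start;
	}
	return -1;
}
```

With a 5-IRQ request on an empty map, the sketch returns offset 0 and leaves bits 5-7 free for later small requests, where rounding up to 8 would have consumed them.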

Signed-off-by: Ian Munsie imun...@au1.ibm.com
Signed-off-by: Michael Neuling mi...@neuling.org
---
 arch/powerpc/sysdev/msi_bitmap.c | 36 +---
 1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/sysdev/msi_bitmap.c b/arch/powerpc/sysdev/msi_bitmap.c
index 2ff6302..871d94b 100644
--- a/arch/powerpc/sysdev/msi_bitmap.c
+++ b/arch/powerpc/sysdev/msi_bitmap.c
@@ -20,32 +20,37 @@ int msi_bitmap_alloc_hwirqs(struct msi_bitmap *bmp, int num)
int offset, order = get_count_order(num);
 
spin_lock_irqsave(&bmp->lock, flags);
-   /*
-* This is fast, but stricter than we need. We might want to add
-* a fallback routine which does a linear search with no alignment.
-*/
-   offset = bitmap_find_free_region(bmp->bitmap, bmp->irq_count, order);
+
+   offset = bitmap_find_next_zero_area(bmp->bitmap, bmp->irq_count, 0,
+   num, (1 << order) - 1);
+   if (offset > bmp->irq_count)
+   goto err;
+
+   bitmap_set(bmp->bitmap, offset, num);
spin_unlock_irqrestore(&bmp->lock, flags);
 
-   pr_debug("msi_bitmap: allocated 0x%x (2^%d) at offset 0x%x\n",
-num, order, offset);
+   pr_debug("msi_bitmap: allocated 0x%x at offset 0x%x\n", num, offset);
 
return offset;
+err:
+   spin_unlock_irqrestore(&bmp->lock, flags);
+   return -ENOMEM;
 }
+EXPORT_SYMBOL(msi_bitmap_alloc_hwirqs);
 
 void msi_bitmap_free_hwirqs(struct msi_bitmap *bmp, unsigned int offset,
unsigned int num)
 {
unsigned long flags;
-   int order = get_count_order(num);
 
-	pr_debug("msi_bitmap: freeing 0x%x (2^%d) at offset 0x%x\n",
-		 num, order, offset);
+	pr_debug("msi_bitmap: freeing 0x%x at offset 0x%x\n",
+		 num, offset);
 
 	spin_lock_irqsave(&bmp->lock, flags);
-	bitmap_release_region(bmp->bitmap, offset, order);
+	bitmap_clear(bmp->bitmap, offset, num);
 	spin_unlock_irqrestore(&bmp->lock, flags);
 }
+EXPORT_SYMBOL(msi_bitmap_free_hwirqs);
 
 void msi_bitmap_reserve_hwirq(struct msi_bitmap *bmp, unsigned int hwirq)
 {
@@ -180,6 +185,15 @@ void __init test_basics(void)
msi_bitmap_free_hwirqs(bmp, size / 2, 1);
check(msi_bitmap_alloc_hwirqs(bmp, 1) == size / 2);
 
+   /* Check we get a naturally aligned offset */
+   check(msi_bitmap_alloc_hwirqs(bmp, 2) % 2 == 0);
+   check(msi_bitmap_alloc_hwirqs(bmp, 4) % 4 == 0);
+   check(msi_bitmap_alloc_hwirqs(bmp, 8) % 8 == 0);
+   check(msi_bitmap_alloc_hwirqs(bmp, 9) % 16 == 0);
+   check(msi_bitmap_alloc_hwirqs(bmp, 3) % 4 == 0);
+   check(msi_bitmap_alloc_hwirqs(bmp, 7) % 8 == 0);
+   check(msi_bitmap_alloc_hwirqs(bmp, 121) % 128 == 0);
+
msi_bitmap_free(bmp);
 
/* Clients may check bitmap == NULL for not-allocated */
-- 
1.9.1
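
For illustration only (not part of the patch): a minimal user-space sketch of the allocation policy described above, i.e. reserve exactly `num` entries but start them at an offset aligned to the next power of two. The names `toy_alloc`, `toy_free`, `toy_used` and `TOY_IRQ_COUNT` are hypothetical, and a plain flag array stands in for the kernel bitmap helpers.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_IRQ_COUNT 64	/* hypothetical bitmap size */

static bool toy_used[TOY_IRQ_COUNT];	/* one flag per hwirq, for clarity */

/* Smallest power of two that is >= num (num must be >= 1). */
static unsigned int toy_natural_align(unsigned int num)
{
	unsigned int align = 1;

	while (align < num)
		align <<= 1;
	return align;
}

/*
 * Reserve exactly num entries starting at a naturally aligned offset.
 * Returns the offset, or -1 if no suitable free area exists.
 */
static int toy_alloc(unsigned int num)
{
	unsigned int align, start, i;

	if (num == 0 || num > TOY_IRQ_COUNT)
		return -1;
	align = toy_natural_align(num);

	/* Only aligned start positions are candidates. */
	for (start = 0; start + num <= TOY_IRQ_COUNT; start += align) {
		for (i = 0; i < num; i++)
			if (toy_used[start + i])
				break;
		if (i == num) {
			/* Mark only num entries, not the rounded-up size. */
			for (i = 0; i < num; i++)
				toy_used[start + i] = true;
			return (int)start;
		}
	}
	return -1;
}

/* Release num entries starting at offset. */
static void toy_free(unsigned int offset, unsigned int num)
{
	unsigned int i;

	for (i = 0; i < num && offset + i < TOY_IRQ_COUNT; i++)
		toy_used[offset + i] = false;
}
```

With this policy a request for 5 entries occupies offsets 0 to 4 only, so a later single-entry request can land at offset 5 rather than being pushed past offset 8.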

[PATCH v4 05/16] powerpc/mm: Export mmu_kernel_ssize and mmu_linear_psize

2014-10-08 Thread Michael Neuling
From: Ian Munsie imun...@au1.ibm.com

Export mmu_kernel_ssize and mmu_linear_psize.  These are needed by the cxl
driver which has its own MMU.  To set up the MMU, cxl needs access to these.

Signed-off-by: Ian Munsie imun...@au1.ibm.com
Signed-off-by: Michael Neuling mi...@neuling.org
---
 arch/powerpc/mm/hash_utils_64.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 5c0738d..bbdb054 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -98,6 +98,7 @@ unsigned long htab_size_bytes;
 unsigned long htab_hash_mask;
 EXPORT_SYMBOL_GPL(htab_hash_mask);
 int mmu_linear_psize = MMU_PAGE_4K;
+EXPORT_SYMBOL_GPL(mmu_linear_psize);
 int mmu_virtual_psize = MMU_PAGE_4K;
 int mmu_vmalloc_psize = MMU_PAGE_4K;
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
@@ -105,6 +106,7 @@ int mmu_vmemmap_psize = MMU_PAGE_4K;
 #endif
 int mmu_io_psize = MMU_PAGE_4K;
 int mmu_kernel_ssize = MMU_SEGSIZE_256M;
+EXPORT_SYMBOL_GPL(mmu_kernel_ssize);
 int mmu_highuser_ssize = MMU_SEGSIZE_256M;
 u16 mmu_slb_size = 64;
 EXPORT_SYMBOL_GPL(mmu_slb_size);
-- 
1.9.1

[PATCH v4 06/16] powerpc/powernv: Split out set MSI IRQ chip code

2014-10-08 Thread Michael Neuling
From: Ian Munsie imun...@au1.ibm.com

Some of the MSI IRQ code in pnv_pci_ioda_msi_setup() is generically useful so
split it out.

This will be used by some of the cxl PCIe code later.

Signed-off-by: Ian Munsie imun...@au1.ibm.com
Signed-off-by: Michael Neuling mi...@neuling.org
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 42 ++-
 1 file changed, 24 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index df241b1..baf3de6 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1306,14 +1306,35 @@ static void pnv_ioda2_msi_eoi(struct irq_data *d)
icp_native_eoi(d);
 }
 
+
+static void set_msi_irq_chip(struct pnv_phb *phb, unsigned int virq)
+{
+   struct irq_data *idata;
+   struct irq_chip *ichip;
+
+	if (phb->type != PNV_PHB_IODA2)
+		return;
+
+	if (!phb->ioda.irq_chip_init) {
+		/*
+		 * First time we setup an MSI IRQ, we need to setup the
+		 * corresponding IRQ chip to route correctly.
+		 */
+		idata = irq_get_irq_data(virq);
+		ichip = irq_data_get_irq_chip(idata);
+		phb->ioda.irq_chip_init = 1;
+		phb->ioda.irq_chip = *ichip;
+		phb->ioda.irq_chip.irq_eoi = pnv_ioda2_msi_eoi;
+	}
+	irq_set_chip(virq, &phb->ioda.irq_chip);
+}
+
 static int pnv_pci_ioda_msi_setup(struct pnv_phb *phb, struct pci_dev *dev,
  unsigned int hwirq, unsigned int virq,
  unsigned int is_64, struct msi_msg *msg)
 {
struct pnv_ioda_pe *pe = pnv_ioda_get_pe(dev);
struct pci_dn *pdn = pci_get_pdn(dev);
-   struct irq_data *idata;
-   struct irq_chip *ichip;
 	unsigned int xive_num = hwirq - phb->msi_base;
__be32 data;
int rc;
@@ -1365,22 +1386,7 @@ static int pnv_pci_ioda_msi_setup(struct pnv_phb *phb, struct pci_dev *dev,
 	}
 	msg->data = be32_to_cpu(data);
 
-	/*
-	 * Change the IRQ chip for the MSI interrupts on PHB3.
-	 * The corresponding IRQ chip should be populated for
-	 * the first time.
-	 */
-	if (phb->type == PNV_PHB_IODA2) {
-		if (!phb->ioda.irq_chip_init) {
-			idata = irq_get_irq_data(virq);
-			ichip = irq_data_get_irq_chip(idata);
-			phb->ioda.irq_chip_init = 1;
-			phb->ioda.irq_chip = *ichip;
-			phb->ioda.irq_chip.irq_eoi = pnv_ioda2_msi_eoi;
-		}
-
-		irq_set_chip(virq, &phb->ioda.irq_chip);
-	}
+	set_msi_irq_chip(phb, virq);
 
 	pr_devel("%s: %s-bit MSI on hwirq %x (xive #%d),"
 		 " address=%x_%08x data=%x PE# %d\n",
-- 
1.9.1

[PATCH v4 07/16] cxl: Add new header for call backs and structs

2014-10-08 Thread Michael Neuling
From: Ian Munsie imun...@au1.ibm.com

This new header adds callbacks and structs needed by the rest of the kernel to
hook into the cxl infrastructure.

This adds the cxl_ctx_in_use() function for use in the mm code to see if any
cxl contexts are currently in use. This is used by the tlbie() to determine if
it can do local TLB invalidations or not. This also adds get/put calls for the
cxl driver module to refcount the active cxl contexts.

cxl_ctx_get/put/in_use are static inlined here as they are called in tlbie
which we want to be fast (mpe's suggestion).

Empty functions are provided when CONFIG_CXL_BASE is not enabled.

Signed-off-by: Ian Munsie imun...@au1.ibm.com
Signed-off-by: Michael Neuling mi...@neuling.org
---
 include/misc/cxl.h | 48 
 1 file changed, 48 insertions(+)
 create mode 100644 include/misc/cxl.h

diff --git a/include/misc/cxl.h b/include/misc/cxl.h
new file mode 100644
index 000..975cc78
--- /dev/null
+++ b/include/misc/cxl.h
@@ -0,0 +1,48 @@
+/*
+ * Copyright 2014 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _MISC_CXL_H
+#define _MISC_CXL_H
+
+#ifdef CONFIG_CXL_BASE
+
+#define CXL_IRQ_RANGES 4
+
+struct cxl_irq_ranges {
+   irq_hw_number_t offset[CXL_IRQ_RANGES];
+   irq_hw_number_t range[CXL_IRQ_RANGES];
+};
+
+extern atomic_t cxl_use_count;
+
+static inline bool cxl_ctx_in_use(void)
+{
+	return (atomic_read(&cxl_use_count) != 0);
+}
+
+static inline void cxl_ctx_get(void)
+{
+	atomic_inc(&cxl_use_count);
+}
+
+static inline void cxl_ctx_put(void)
+{
+	atomic_dec(&cxl_use_count);
+}
+
+void cxl_slbia(struct mm_struct *mm);
+
+#else /* CONFIG_CXL_BASE */
+
+static inline bool cxl_ctx_in_use(void) { return false; }
+static inline void cxl_slbia(struct mm_struct *mm) {}
+
+#endif /* CONFIG_CXL_BASE */
+
+#endif
-- 
1.9.1
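
For illustration only (not part of the patch): a user-space sketch of the get/put/in_use counting that this header describes. It assumes C11 `<stdatomic.h>` in place of the kernel's `atomic_t`, and the `toy_` names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Number of active contexts; statically zero-initialised. */
static atomic_int toy_use_count;

/* True while at least one context holds a reference. */
static inline bool toy_ctx_in_use(void)
{
	return atomic_load(&toy_use_count) != 0;
}

/* Called when a context is attached. */
static inline void toy_ctx_get(void)
{
	atomic_fetch_add(&toy_use_count, 1);
}

/* Called when a context is detached. */
static inline void toy_ctx_put(void)
{
	atomic_fetch_sub(&toy_use_count, 1);
}
```

The check is a single atomic load, which is why it is cheap enough to sit on a hot path such as TLB invalidation.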

[PATCH v4 08/16] powerpc/powerpc: Add new PCIe functions for allocating cxl interrupts

2014-10-08 Thread Michael Neuling
From: Ian Munsie imun...@au1.ibm.com

This adds a number of functions for allocating IRQs under powernv PCIe for cxl.

Signed-off-by: Ian Munsie imun...@au1.ibm.com
Signed-off-by: Michael Neuling mi...@neuling.org
---
 arch/powerpc/include/asm/pnv-pci.h|  31 ++
 arch/powerpc/platforms/powernv/pci-ioda.c | 154 ++
 2 files changed, 185 insertions(+)
 create mode 100644 arch/powerpc/include/asm/pnv-pci.h

diff --git a/arch/powerpc/include/asm/pnv-pci.h b/arch/powerpc/include/asm/pnv-pci.h
new file mode 100644
index 000..f09a22f
--- /dev/null
+++ b/arch/powerpc/include/asm/pnv-pci.h
@@ -0,0 +1,31 @@
+/*
+ * Copyright 2014 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _ASM_PNV_PCI_H
+#define _ASM_PNV_PCI_H
+
+#include <linux/pci.h>
+#include <misc/cxl.h>
+
+int pnv_phb_to_cxl(struct pci_dev *dev);
+int pnv_cxl_ioda_msi_setup(struct pci_dev *dev, unsigned int hwirq,
+  unsigned int virq);
+int pnv_cxl_alloc_hwirqs(struct pci_dev *dev, int num);
+void pnv_cxl_release_hwirqs(struct pci_dev *dev, int hwirq, int num);
+int pnv_cxl_get_irq_count(struct pci_dev *dev);
+struct device_node *pnv_pci_to_phb_node(struct pci_dev *dev);
+
+#ifdef CONFIG_CXL_BASE
+int pnv_cxl_alloc_hwirq_ranges(struct cxl_irq_ranges *irqs,
+  struct pci_dev *dev, int num);
+void pnv_cxl_release_hwirq_ranges(struct cxl_irq_ranges *irqs,
+ struct pci_dev *dev);
+#endif
+
+#endif
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index baf3de6..2dfc857 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -37,6 +37,9 @@
 #include <asm/xics.h>
 #include <asm/debug.h>
 #include <asm/firmware.h>
+#include <asm/pnv-pci.h>
+
+#include <misc/cxl.h>
 
 #include "powernv.h"
 #include "pci.h"
@@ -1329,6 +1332,157 @@ static void set_msi_irq_chip(struct pnv_phb *phb, unsigned int virq)
 	irq_set_chip(virq, &phb->ioda.irq_chip);
 }
 
+#ifdef CONFIG_CXL_BASE
+
+struct device_node *pnv_pci_to_phb_node(struct pci_dev *dev)
+{
+	struct pci_controller *hose = pci_bus_to_host(dev->bus);
+
+	return hose->dn;
+}
+EXPORT_SYMBOL(pnv_pci_to_phb_node);
+
+int pnv_phb_to_cxl(struct pci_dev *dev)
+{
+	struct pci_controller *hose = pci_bus_to_host(dev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	struct pnv_ioda_pe *pe;
+	int rc;
+
+	pe = pnv_ioda_get_pe(dev);
+	if (!pe)
+		return -ENODEV;
+
+	pe_info(pe, "Switching PHB to CXL\n");
+
+	rc = opal_pci_set_phb_cxl_mode(phb->opal_id, 1, pe->pe_number);
+	if (rc)
+		dev_err(&dev->dev, "opal_pci_set_phb_cxl_mode failed: %i\n", rc);
+
+	return rc;
+}
+EXPORT_SYMBOL(pnv_phb_to_cxl);
+
+/* Find PHB for cxl dev and allocate MSI hwirqs?
+ * Returns the absolute hardware IRQ number
+ */
+int pnv_cxl_alloc_hwirqs(struct pci_dev *dev, int num)
+{
+	struct pci_controller *hose = pci_bus_to_host(dev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	int hwirq = msi_bitmap_alloc_hwirqs(&phb->msi_bmp, num);
+
+	if (hwirq < 0) {
+		dev_warn(&dev->dev, "Failed to find a free MSI\n");
+		return -ENOSPC;
+	}
+
+	return phb->msi_base + hwirq;
+}
+EXPORT_SYMBOL(pnv_cxl_alloc_hwirqs);
+
+void pnv_cxl_release_hwirqs(struct pci_dev *dev, int hwirq, int num)
+{
+	struct pci_controller *hose = pci_bus_to_host(dev->bus);
+	struct pnv_phb *phb = hose->private_data;
+
+	msi_bitmap_free_hwirqs(&phb->msi_bmp, hwirq - phb->msi_base, num);
+}
+EXPORT_SYMBOL(pnv_cxl_release_hwirqs);
+
+void pnv_cxl_release_hwirq_ranges(struct cxl_irq_ranges *irqs,
+				  struct pci_dev *dev)
+{
+	struct pci_controller *hose = pci_bus_to_host(dev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	int i, hwirq;
+
+	for (i = 1; i < CXL_IRQ_RANGES; i++) {
+		if (!irqs->range[i])
+			continue;
+		pr_devel("cxl release irq range 0x%x: offset: 0x%lx  limit: %ld\n",
+			 i, irqs->offset[i],
+			 irqs->range[i]);
+		hwirq = irqs->offset[i] - phb->msi_base;
+		msi_bitmap_free_hwirqs(&phb->msi_bmp, hwirq,
+				       irqs->range[i]);
+	}
+}
+EXPORT_SYMBOL(pnv_cxl_release_hwirq_ranges);
+
+int pnv_cxl_alloc_hwirq_ranges(struct cxl_irq_ranges *irqs,
+			       struct pci_dev *dev, int num)
+{
+	struct pci_controller *hose = pci_bus_to_host(dev->bus);
+	struct pnv_phb *phb = hose->private_data;
+   int i, hwirq, try;
+
+   memset(irqs, 0, sizeof(struct 

[PATCH v4 09/16] powerpc/mm: Add new hash_page_mm()

2014-10-08 Thread Michael Neuling
From: Ian Munsie imun...@au1.ibm.com

This adds a new function hash_page_mm() based on the existing hash_page().
This version allows any struct mm to be passed in, rather than assuming
current. This is useful for servicing co-processor faults which are not in the
context of the current running process.

We need to be careful here as the current hash_page() assumes current in a few
places.

Signed-off-by: Ian Munsie imun...@au1.ibm.com
Signed-off-by: Michael Neuling mi...@neuling.org
---
 arch/powerpc/include/asm/mmu-hash64.h |  1 +
 arch/powerpc/mm/hash_utils_64.c   | 24 +---
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index aeabd02..764e141 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -324,6 +324,7 @@ extern int __hash_page_64K(unsigned long ea, unsigned long access,
 			   unsigned int local, int ssize);
 struct mm_struct;
 unsigned int hash_page_do_lazy_icache(unsigned int pp, pte_t pte, int trap);
+extern int hash_page_mm(struct mm_struct *mm, unsigned long ea, unsigned long access, unsigned long trap);
 extern int hash_page(unsigned long ea, unsigned long access, unsigned long trap);
 int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 pte_t *ptep, unsigned long trap, int local, int ssize,
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index bbdb054..698834d 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -904,7 +904,7 @@ void demote_segment_4k(struct mm_struct *mm, unsigned long addr)
 		return;
 	slice_set_range_psize(mm, addr, 1, MMU_PAGE_4K);
 	copro_flush_all_slbs(mm);
-	if (get_paca_psize(addr) != MMU_PAGE_4K) {
+	if ((get_paca_psize(addr) != MMU_PAGE_4K) && (current->mm == mm)) {
 		get_paca()->context = mm->context;
 		slb_flush_and_rebolt();
 	}
@@ -989,12 +989,11 @@ static void check_paca_psize(unsigned long ea, struct mm_struct *mm,
  * -1 - critical hash insertion error
  * -2 - access not permitted by subpage protection mechanism
  */
-int hash_page(unsigned long ea, unsigned long access, unsigned long trap)
+int hash_page_mm(struct mm_struct *mm, unsigned long ea, unsigned long access, unsigned long trap)
 {
 	enum ctx_state prev_state = exception_enter();
 	pgd_t *pgdir;
 	unsigned long vsid;
-	struct mm_struct *mm;
 	pte_t *ptep;
 	unsigned hugeshift;
 	const struct cpumask *tmp;
@@ -1008,7 +1007,6 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap)
 	switch (REGION_ID(ea)) {
 	case USER_REGION_ID:
 		user_region = 1;
-		mm = current->mm;
 		if (! mm) {
 			DBG_LOW(" user region with no mm !\n");
 			rc = 1;
@@ -1019,7 +1017,6 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap)
 		vsid = get_vsid(mm->context.id, ea, ssize);
 		break;
 	case VMALLOC_REGION_ID:
-		mm = &init_mm;
 		vsid = get_kernel_vsid(ea, mmu_kernel_ssize);
 		if (ea < VMALLOC_END)
 			psize = mmu_vmalloc_psize;
@@ -1104,7 +1101,8 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap)
 			WARN_ON(1);
 		}
 #endif
-		check_paca_psize(ea, mm, psize, user_region);
+		if (current->mm == mm)
+			check_paca_psize(ea, mm, psize, user_region);
 
 		goto bail;
 	}
@@ -1145,7 +1143,8 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap)
 		}
 	}
 
-	check_paca_psize(ea, mm, psize, user_region);
+	if (current->mm == mm)
+		check_paca_psize(ea, mm, psize, user_region);
 #endif /* CONFIG_PPC_64K_PAGES */
 
 #ifdef CONFIG_PPC_HAS_HASH_64K
@@ -1180,6 +1179,17 @@ bail:
exception_exit(prev_state);
return rc;
 }
+EXPORT_SYMBOL_GPL(hash_page_mm);
+
+int hash_page(unsigned long ea, unsigned long access, unsigned long trap)
+{
+	struct mm_struct *mm = current->mm;
+
+	if (REGION_ID(ea) == VMALLOC_REGION_ID)
+		mm = &init_mm;
+
+	return hash_page_mm(mm, ea, access, trap);
+}
 EXPORT_SYMBOL_GPL(hash_page);
 
 void hash_preload(struct mm_struct *mm, unsigned long ea,
-- 
1.9.1

[PATCH v4 10/16] powerpc/opal: Add PHB to cxl mode call

2014-10-08 Thread Michael Neuling
From: Ian Munsie imun...@au1.ibm.com

This adds the OPAL call to change a PHB into cxl mode.

Signed-off-by: Ian Munsie imun...@au1.ibm.com
Signed-off-by: Michael Neuling mi...@neuling.org
---
 arch/powerpc/include/asm/opal.h| 2 ++
 arch/powerpc/platforms/powernv/opal-wrappers.S | 1 +
 2 files changed, 3 insertions(+)

diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 86055e5..84c37c4dbc 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -146,6 +146,7 @@ struct opal_sg_list {
 #define OPAL_GET_PARAM				89
 #define OPAL_SET_PARAM				90
 #define OPAL_DUMP_RESEND			91
+#define OPAL_PCI_SET_PHB_CXL_MODE		93
 #define OPAL_DUMP_INFO2				94
 #define OPAL_PCI_EEH_FREEZE_SET			97
 #define OPAL_HANDLE_HMI				98
@@ -924,6 +925,7 @@ int64_t opal_sensor_read(uint32_t sensor_hndl, int token, __be32 *sensor_data);
 int64_t opal_handle_hmi(void);
 int64_t opal_register_dump_region(uint32_t id, uint64_t start, uint64_t end);
 int64_t opal_unregister_dump_region(uint32_t id);
+int64_t opal_pci_set_phb_cxl_mode(uint64_t phb_id, uint64_t mode, uint64_t pe_number);
 
 /* Internal functions */
 extern int early_init_dt_scan_opal(unsigned long node, const char *uname,
diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S
index 2e6ce1b..0fb56dc 100644
--- a/arch/powerpc/platforms/powernv/opal-wrappers.S
+++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
@@ -247,3 +247,4 @@ OPAL_CALL(opal_set_param,   OPAL_SET_PARAM);
 OPAL_CALL(opal_handle_hmi, OPAL_HANDLE_HMI);
 OPAL_CALL(opal_register_dump_region,   OPAL_REGISTER_DUMP_REGION);
 OPAL_CALL(opal_unregister_dump_region, OPAL_UNREGISTER_DUMP_REGION);
+OPAL_CALL(opal_pci_set_phb_cxl_mode,   OPAL_PCI_SET_PHB_CXL_MODE);
-- 
1.9.1

[PATCH v4 11/16] powerpc/mm: Add hooks for cxl

2014-10-08 Thread Michael Neuling
From: Ian Munsie imun...@au1.ibm.com

This adds hooks into the core powerpc mm code for cxl.

The core powerpc code sometimes uses local tlbie. Unfortunately this won't
work with the current cxl driver as it relies on snooping tlbie broadcasts.

The cxl hardware can have TLB entries invalidated via MMIO but this is not
currently supported by the driver. In future we can make local tlbie smarter so
that it invalidates cxl contexts via MMIO when it needs to but for now we have
this workaround.

This workaround checks for any active cxl contexts and if so, disables local
tlbie.

This also adds a hook for when SLBs are invalidated. This ensures any
corresponding SLBs in cxl are also invalidated at the same time. This is
required for segment demotion.

Signed-off-by: Ian Munsie imun...@au1.ibm.com
Signed-off-by: Michael Neuling mi...@neuling.org
---
 arch/powerpc/mm/copro_fault.c| 2 ++
 arch/powerpc/mm/hash_native_64.c | 6 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/copro_fault.c b/arch/powerpc/mm/copro_fault.c
index f2aa5a8..0f9939e 100644
--- a/arch/powerpc/mm/copro_fault.c
+++ b/arch/powerpc/mm/copro_fault.c
@@ -26,6 +26,7 @@
 #include <asm/reg.h>
 #include <asm/copro.h>
 #include <asm/spu.h>
+#include <misc/cxl.h>
 
 /*
  * This ought to be kept in sync with the powerpc specific do_page_fault
@@ -143,5 +144,6 @@ void copro_flush_all_slbs(struct mm_struct *mm)
 #ifdef CONFIG_SPU_BASE
spu_flush_all_slbs(mm);
 #endif
+   cxl_slbia(mm);
 }
 EXPORT_SYMBOL_GPL(copro_flush_all_slbs);
diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c
index afc0a82..ae4962a 100644
--- a/arch/powerpc/mm/hash_native_64.c
+++ b/arch/powerpc/mm/hash_native_64.c
@@ -29,6 +29,8 @@
 #include <asm/kexec.h>
 #include <asm/ppc-opcode.h>
 
+#include <misc/cxl.h>
+
 #ifdef DEBUG_LOW
 #define DBG_LOW(fmt...) udbg_printf(fmt)
 #else
@@ -149,9 +151,11 @@ static inline void __tlbiel(unsigned long vpn, int psize, int apsize, int ssize)
 static inline void tlbie(unsigned long vpn, int psize, int apsize,
 			 int ssize, int local)
 {
-	unsigned int use_local = local && mmu_has_feature(MMU_FTR_TLBIEL);
+	unsigned int use_local;
 	int lock_tlbie = !mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE);
 
+	use_local = local && mmu_has_feature(MMU_FTR_TLBIEL) && !cxl_ctx_in_use();
+
 	if (use_local)
 		use_local = mmu_psize_defs[psize].tlbiel;
 	if (lock_tlbie && !use_local)
-- 
1.9.1
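
For illustration only (not part of the patch): the workaround reduces to one extra condition on the decision between a local and a broadcast invalidation. The sketch below restates that decision as a pure function; the function name and its boolean parameters are hypothetical stand-ins for the kernel's feature checks.

```c
#include <stdbool.h>

/*
 * Decide whether a local TLB invalidation may be used.
 *
 * local:                 the caller asked for a local invalidation
 * cpu_has_tlbiel:        the CPU supports the local form at all
 * cxl_in_use:            at least one cxl context is active
 * psize_supports_tlbiel: the page size in question supports the local form
 */
static bool toy_use_local_tlbie(bool local, bool cpu_has_tlbiel,
				bool cxl_in_use, bool psize_supports_tlbiel)
{
	/* Any active cxl context forces a broadcast it can snoop. */
	bool use_local = local && cpu_has_tlbiel && !cxl_in_use;

	if (use_local)
		use_local = psize_supports_tlbiel;
	return use_local;
}
```

Note the cost of the workaround: while any cxl context is active, every invalidation on the system takes the broadcast path, not only those touching memory the accelerator uses.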

[PATCH v4 12/16] cxl: Add base builtin support

2014-10-08 Thread Michael Neuling
From: Ian Munsie imun...@au1.ibm.com

This adds the base cxl support that cannot be built as a module. Specifically
it adds the cxl callbacks that are called from the core powerpc mm code, which
must always exist regardless of whether the cxl module is loaded. This is
similar to how cell works with CONFIG_SPU_BASE.

This adds a cxl_slbia() call (similar to spu_flush_all_slbs()) which checks if
the cxl module is loaded and in use, returning immediately if it is not. If it
is in use it calls into the cxl SLB invalidation code.

Signed-off-by: Ian Munsie imun...@au1.ibm.com
Signed-off-by: Michael Neuling mi...@neuling.org
---
 drivers/misc/Kconfig  |  1 +
 drivers/misc/Makefile |  1 +
 drivers/misc/cxl/Kconfig  |  8 +
 drivers/misc/cxl/Makefile |  1 +
 drivers/misc/cxl/base.c   | 86 +++
 5 files changed, 97 insertions(+)
 create mode 100644 drivers/misc/cxl/Kconfig
 create mode 100644 drivers/misc/cxl/Makefile
 create mode 100644 drivers/misc/cxl/base.c

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index b841180..bbeb451 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -527,4 +527,5 @@ source "drivers/misc/vmw_vmci/Kconfig"
 source "drivers/misc/mic/Kconfig"
 source "drivers/misc/genwqe/Kconfig"
 source "drivers/misc/echo/Kconfig"
+source "drivers/misc/cxl/Kconfig"
 endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 5497d02..7d5c4cd 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -55,3 +55,4 @@ obj-y += mic/
 obj-$(CONFIG_GENWQE)   += genwqe/
 obj-$(CONFIG_ECHO) += echo/
 obj-$(CONFIG_VEXPRESS_SYSCFG)  += vexpress-syscfg.o
+obj-$(CONFIG_CXL_BASE) += cxl/
diff --git a/drivers/misc/cxl/Kconfig b/drivers/misc/cxl/Kconfig
new file mode 100644
index 000..5cdd319
--- /dev/null
+++ b/drivers/misc/cxl/Kconfig
@@ -0,0 +1,8 @@
+#
+# IBM Coherent Accelerator (CXL) compatible devices
+#
+
+config CXL_BASE
+   bool
+   default n
+   select PPC_COPRO_BASE
diff --git a/drivers/misc/cxl/Makefile b/drivers/misc/cxl/Makefile
new file mode 100644
index 000..e30ad0a
--- /dev/null
+++ b/drivers/misc/cxl/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_CXL_BASE) += base.o
diff --git a/drivers/misc/cxl/base.c b/drivers/misc/cxl/base.c
new file mode 100644
index 000..0654ad8
--- /dev/null
+++ b/drivers/misc/cxl/base.c
@@ -0,0 +1,86 @@
+/*
+ * Copyright 2014 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/module.h>
+#include <linux/rcupdate.h>
+#include <asm/errno.h>
+#include <misc/cxl.h>
+#include "cxl.h"
+
+/* protected by rcu */
+static struct cxl_calls *cxl_calls;
+
+atomic_t cxl_use_count = ATOMIC_INIT(0);
+EXPORT_SYMBOL(cxl_use_count);
+
+#ifdef CONFIG_CXL_MODULE
+
+static inline struct cxl_calls *cxl_calls_get(void)
+{
+	struct cxl_calls *calls = NULL;
+
+	rcu_read_lock();
+	calls = rcu_dereference(cxl_calls);
+	if (calls && !try_module_get(calls->owner))
+		calls = NULL;
+	rcu_read_unlock();
+
+	return calls;
+}
+
+static inline void cxl_calls_put(struct cxl_calls *calls)
+{
+	BUG_ON(calls != cxl_calls);
+
+	/* we don't need to rcu this, as we hold a reference to the module */
+	module_put(cxl_calls->owner);
+}
+
+#else /* !defined CONFIG_CXL_MODULE */
+
+static inline struct cxl_calls *cxl_calls_get(void)
+{
+   return cxl_calls;
+}
+
+static inline void cxl_calls_put(struct cxl_calls *calls) { }
+
+#endif /* CONFIG_CXL_MODULE */
+
+void cxl_slbia(struct mm_struct *mm)
+{
+	struct cxl_calls *calls;
+
+	calls = cxl_calls_get();
+	if (!calls)
+		return;
+
+	if (cxl_ctx_in_use())
+		calls->cxl_slbia(mm);
+
+	cxl_calls_put(calls);
+}
+
+int register_cxl_calls(struct cxl_calls *calls)
+{
+	if (cxl_calls)
+		return -EBUSY;
+
+	rcu_assign_pointer(cxl_calls, calls);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(register_cxl_calls);
+
+void unregister_cxl_calls(struct cxl_calls *calls)
+{
+	BUG_ON(cxl_calls->owner != calls->owner);
+	RCU_INIT_POINTER(cxl_calls, NULL);
+	synchronize_rcu();
+}
+EXPORT_SYMBOL_GPL(unregister_cxl_calls);
-- 
1.9.1
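
For illustration only (not part of the patch): a single-threaded user-space sketch of the register/call/unregister pattern used by base.c. It deliberately leaves out RCU and module reference counting, which the real code needs for safety against concurrent unload; all `toy_` names are hypothetical.

```c
#include <stddef.h>

/* Callback table supplied by the loadable part of the driver. */
struct toy_calls {
	void (*slbia)(void *mm);
};

static struct toy_calls *toy_registered;	/* NULL until registered */
static int toy_active_contexts;			/* stands in for the use count */

/* Returns 0 on success, -1 if a table is already registered. */
static int toy_register_calls(struct toy_calls *calls)
{
	if (toy_registered)
		return -1;	/* the kernel code returns -EBUSY here */
	toy_registered = calls;
	return 0;
}

static void toy_unregister_calls(void)
{
	toy_registered = NULL;
}

/* Always-present hook: a no-op unless a table is registered and in use. */
static void toy_slbia(void *mm)
{
	struct toy_calls *calls = toy_registered;

	if (!calls)
		return;
	if (toy_active_contexts > 0)
		calls->slbia(mm);
}

/* Example callback that only counts how often it was invoked. */
static int toy_hook_hits;

static void toy_count_hook(void *mm)
{
	(void)mm;
	toy_hook_hits++;
}
```

The point of the split is that core code can call the hook unconditionally; the cost when the driver is absent is one pointer test.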

[PATCH v4 14/16] cxl: Add userspace header file

2014-10-08 Thread Michael Neuling
From: Ian Munsie imun...@au1.ibm.com

This adds a header file for use by userspace programs wanting to interact with
the kernel cxl driver.  It defines structs and magic numbers required for
userspace to interact with devices in /dev/cxl/afuM.N.

Further documentation on this interface is added in a subsequent patch in
Documentation/powerpc/cxl.txt.

It also adds this new userspace header file to Kbuild so it's exported when
doing make headers_install.

Signed-off-by: Ian Munsie imun...@au1.ibm.com
Signed-off-by: Michael Neuling mi...@neuling.org
---
 include/uapi/Kbuild  |  1 +
 include/uapi/misc/Kbuild |  2 ++
 include/uapi/misc/cxl.h  | 87 
 3 files changed, 90 insertions(+)
 create mode 100644 include/uapi/misc/Kbuild
 create mode 100644 include/uapi/misc/cxl.h

diff --git a/include/uapi/Kbuild b/include/uapi/Kbuild
index 81d2106..245aa6e 100644
--- a/include/uapi/Kbuild
+++ b/include/uapi/Kbuild
@@ -12,3 +12,4 @@ header-y += video/
 header-y += drm/
 header-y += xen/
 header-y += scsi/
+header-y += misc/
diff --git a/include/uapi/misc/Kbuild b/include/uapi/misc/Kbuild
new file mode 100644
index 000..e96cae7
--- /dev/null
+++ b/include/uapi/misc/Kbuild
@@ -0,0 +1,2 @@
+# misc Header export list
+header-y += cxl.h
diff --git a/include/uapi/misc/cxl.h b/include/uapi/misc/cxl.h
new file mode 100644
index 000..c232be6
--- /dev/null
+++ b/include/uapi/misc/cxl.h
@@ -0,0 +1,87 @@
+/*
+ * Copyright 2014 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _UAPI_MISC_CXL_H
+#define _UAPI_MISC_CXL_H
+
+#include <linux/types.h>
+#include <linux/ioctl.h>
+
+/* Structs for IOCTLS for userspace to talk to the kernel */
+struct cxl_ioctl_start_work {
+   __u64 flags;
+   __u64 work_element_descriptor;
+   __u64 amr;
+   __s16 num_interrupts;
+   __s16 reserved1;
+   __s32 reserved2;
+   __u64 reserved3;
+   __u64 reserved4;
+   __u64 reserved5;
+   __u64 reserved6;
+};
+#define CXL_START_WORK_AMR		0x0001ULL
+#define CXL_START_WORK_NUM_IRQS	0x0002ULL
+#define CXL_START_WORK_ALL (CXL_START_WORK_AMR |\
+CXL_START_WORK_NUM_IRQS)
+
+/* IOCTL numbers */
+#define CXL_MAGIC 0xCA
+#define CXL_IOCTL_START_WORK		_IOW(CXL_MAGIC, 0x00, struct cxl_ioctl_start_work)
+#define CXL_IOCTL_GET_PROCESS_ELEMENT  _IOR(CXL_MAGIC, 0x01, __u32)
+
+/* Events from read() */
+#define CXL_READ_MIN_SIZE 0x1000 /* 4K */
+
+enum cxl_event_type {
+   CXL_EVENT_RESERVED  = 0,
+   CXL_EVENT_AFU_INTERRUPT = 1,
+   CXL_EVENT_DATA_STORAGE  = 2,
+   CXL_EVENT_AFU_ERROR = 3,
+};
+
+struct cxl_event_header {
+   __u16 type;
+   __u16 size;
+   __u16 process_element;
+   __u16 reserved1;
+};
+
+struct cxl_event_afu_interrupt {
+   __u16 flags;
+   __u16 irq; /* Raised AFU interrupt number */
+   __u32 reserved1;
+};
+
+struct cxl_event_data_storage {
+   __u16 flags;
+   __u16 reserved1;
+   __u32 reserved2;
+   __u64 addr;
+   __u64 dsisr;
+   __u64 reserved3;
+};
+
+struct cxl_event_afu_error {
+   __u16 flags;
+   __u16 reserved1;
+   __u32 reserved2;
+   __u64 error;
+};
+
+struct cxl_event {
+   struct cxl_event_header header;
+   union {
+   struct cxl_event_afu_interrupt irq;
+   struct cxl_event_data_storage fault;
+   struct cxl_event_afu_error afu_error;
+   };
+};
+
+#endif /* _UAPI_MISC_CXL_H */
-- 
1.9.1
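
For illustration only (not part of the patch): a sketch of how userspace might walk a buffer of events returned by read(). It uses a local mirror of `struct cxl_event_header` so that it builds without the new header, and it assumes that `size` holds the total length of an event including its header and that events are packed back to back. Both points are assumptions to check against Documentation/powerpc/cxl.txt. All `toy_`/`TOY_` names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct cxl_event_header from the patch (8 bytes). */
struct toy_event_header {
	uint16_t type;
	uint16_t size;
	uint16_t process_element;
	uint16_t reserved1;
};

/* Mirror of the non-reserved enum cxl_event_type values. */
enum {
	TOY_EVENT_AFU_INTERRUPT = 1,
	TOY_EVENT_DATA_STORAGE  = 2,
	TOY_EVENT_AFU_ERROR     = 3,
};

/*
 * Count the events of the given type in buf[0..len).
 * Stops at the first malformed or truncated event.
 */
static int toy_count_events(const unsigned char *buf, size_t len,
			    uint16_t type)
{
	size_t pos = 0;
	int count = 0;

	while (len - pos >= sizeof(struct toy_event_header)) {
		struct toy_event_header hdr;

		/* memcpy avoids unaligned access into the byte buffer. */
		memcpy(&hdr, buf + pos, sizeof(hdr));
		if (hdr.size < sizeof(hdr) || hdr.size > len - pos)
			break;
		if (hdr.type == type)
			count++;
		pos += hdr.size;
	}
	return count;
}
```

A real consumer would pass a buffer of at least CXL_READ_MIN_SIZE bytes to read() and dispatch on `type` to the matching member of the union in `struct cxl_event`.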

[PATCH v4 15/16] cxl: Add driver to Kbuild and Makefiles

2014-10-08 Thread Michael Neuling
From: Ian Munsie imun...@au1.ibm.com

Signed-off-by: Ian Munsie imun...@au1.ibm.com
Signed-off-by: Michael Neuling mi...@neuling.org
---
 drivers/misc/cxl/Kconfig  | 17 +
 drivers/misc/cxl/Makefile |  2 ++
 2 files changed, 19 insertions(+)

diff --git a/drivers/misc/cxl/Kconfig b/drivers/misc/cxl/Kconfig
index 5cdd319..a990b39 100644
--- a/drivers/misc/cxl/Kconfig
+++ b/drivers/misc/cxl/Kconfig
@@ -6,3 +6,20 @@ config CXL_BASE
bool
default n
select PPC_COPRO_BASE
+
+config CXL
+	tristate "Support for IBM Coherent Accelerators (CXL)"
+	depends on PPC_POWERNV && PCI_MSI
+   select CXL_BASE
+   default m
+   help
+ Select this option to enable driver support for IBM Coherent
+ Accelerators (CXL).  CXL is otherwise known as Coherent Accelerator
+ Processor Interface (CAPI).  CAPI allows accelerators in FPGAs to be
+ coherently attached to a CPU via an MMU.  This driver enables
+ userspace programs to access these accelerators via /dev/cxl/afuM.N
+ devices.
+
+ CAPI adapters are found in POWER8 based systems.
+
+ If unsure, say N.
diff --git a/drivers/misc/cxl/Makefile b/drivers/misc/cxl/Makefile
index e30ad0a..165e98f 100644
--- a/drivers/misc/cxl/Makefile
+++ b/drivers/misc/cxl/Makefile
@@ -1 +1,3 @@
+cxl-y				+= main.o file.o irq.o fault.o native.o context.o sysfs.o debugfs.o pci.o
+obj-$(CONFIG_CXL)  += cxl.o
 obj-$(CONFIG_CXL_BASE) += base.o
-- 
1.9.1

[PATCH v4 16/16] cxl: Add documentation for userspace APIs

2014-10-08 Thread Michael Neuling
From: Ian Munsie imun...@au1.ibm.com

This documentation gives an overview of the hardware architecture, userspace
APIs via /dev/cxl/afuM.N and the sysfs files. It also adds a MAINTAINERS file
entry for cxl.

Signed-off-by: Ian Munsie imun...@au1.ibm.com
Signed-off-by: Michael Neuling mi...@neuling.org
---
 Documentation/ABI/testing/sysfs-class-cxl | 130 ++
 Documentation/ioctl/ioctl-number.txt  |   1 +
 Documentation/powerpc/00-INDEX|   2 +
 Documentation/powerpc/cxl.txt | 379 ++
 MAINTAINERS   |  12 +
 include/uapi/misc/cxl.h   |   7 +-
 6 files changed, 528 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-class-cxl
 create mode 100644 Documentation/powerpc/cxl.txt

diff --git a/Documentation/ABI/testing/sysfs-class-cxl b/Documentation/ABI/testing/sysfs-class-cxl
new file mode 100644
index 000..0a5508c
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-cxl
@@ -0,0 +1,130 @@
+Slave contexts (eg. /sys/class/cxl/afu0.0s):
+
+What:   /sys/class/cxl/afu/irqs_max
+Date:   September 2014
+Contact:linuxppc-dev@lists.ozlabs.org
+Description:read/write
+Decimal value of maximum number of interrupts that can be
+requested by userspace.  The default on probe is the maximum
+that hardware can support (eg. 2037). Write values will limit
+userspace applications to that many userspace interrupts. Must
+be >= irqs_min.
+
+What:   /sys/class/cxl/<afu>/irqs_min
+Date:   September 2014
+Contact:linuxppc-dev@lists.ozlabs.org
+Description:read only
+Decimal value of the minimum number of interrupts that
+userspace must request on a CXL_START_WORK ioctl. Userspace may
+omit the num_interrupts field in the START_WORK IOCTL to get
+this minimum automatically.
+
+What:   /sys/class/cxl/<afu>/mmio_size
+Date:   September 2014
+Contact:linuxppc-dev@lists.ozlabs.org
+Description:read only
+Decimal value of the size of the MMIO space that may be mmaped
+by userspace.
+
+What:   /sys/class/cxl/<afu>/modes_supported
+Date:   September 2014
+Contact:linuxppc-dev@lists.ozlabs.org
+Description:read only
+List of the modes this AFU supports. One per line.
+Valid entries are: dedicated_process and afu_directed
+
+What:   /sys/class/cxl/<afu>/mode
+Date:   September 2014
+Contact:linuxppc-dev@lists.ozlabs.org
+Description:read/write
+The current mode the AFU is using. Will be one of the modes
+given in modes_supported. Writing will change the mode
+provided that no user contexts are attached.
+
+
+What:   /sys/class/cxl/<afu>/prefault_mode
+Date:   September 2014
+Contact:linuxppc-dev@lists.ozlabs.org
+Description:read/write
+Set the mode for prefaulting in segments into the segment table
+when performing the START_WORK ioctl. Possible values:
+none: No prefaulting (default)
+work_element_descriptor: Treat the work element
+ descriptor as an effective address and
+ prefault what it points to.
+all: prefault all segments that the process calling START_WORK maps.
+
+What:   /sys/class/cxl/<afu>/reset
+Date:   September 2014
+Contact:linuxppc-dev@lists.ozlabs.org
+Description:write only
+Writing 1 here will reset the AFU provided there are no
+contexts active on the AFU.
+
+What:   /sys/class/cxl/<afu>/api_version
+Date:   September 2014
+Contact:linuxppc-dev@lists.ozlabs.org
+Description:read only
+Decimal value of the current version of the kernel/user API.
+
+What:   /sys/class/cxl/<afu>/api_version_com
+Date:   September 2014
+Contact:linuxppc-dev@lists.ozlabs.org
+Description:read only
+Decimal value of the lowest version of the userspace API
+this kernel supports.
+
+
+
+Master contexts (eg. /sys/class/cxl/afu0.0m)
+
+What:   /sys/class/cxl/<afu>m/mmio_size
+Date:   September 2014
+Contact:linuxppc-dev@lists.ozlabs.org
+Description:read only
+Decimal value of the size of the MMIO space that may be mmaped
+by userspace. This includes all slave contexts space also.
+
+What:   /sys/class/cxl/<afu>m/pp_mmio_len
+Date:   September 2014
+Contact:linuxppc-dev@lists.ozlabs.org
+Description:read only
+Decimal value of the Per Process MMIO space length.
+
+What:   

Re: [PATCH] tools/perf/powerpc: Fix build break

2014-10-08 Thread Aneesh Kumar K.V
Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com writes:

   CC   arch/powerpc/util/skip-callchain-idx.o
 arch/powerpc/util/skip-callchain-idx.c: In function ‘check_return_reg’:
 arch/powerpc/util/skip-callchain-idx.c:55:3: error: implicit declaration of function ‘pr_debug’ [-Werror=implicit-function-declaration]
    pr_debug("dwarf_frame_register() %s\n", dwarf_errmsg(-1));

 Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
 ---
  tools/perf/arch/powerpc/util/skip-callchain-idx.c | 1 +
  1 file changed, 1 insertion(+)

 diff --git a/tools/perf/arch/powerpc/util/skip-callchain-idx.c 
 b/tools/perf/arch/powerpc/util/skip-callchain-idx.c
 index a7c23a4b3778..d73ef8bb08c7 100644
 --- a/tools/perf/arch/powerpc/util/skip-callchain-idx.c
 +++ b/tools/perf/arch/powerpc/util/skip-callchain-idx.c
 @@ -15,6 +15,7 @@

  #include "util/thread.h"
  #include "util/callchain.h"
 +#include "util/debug.h"

  /*
   * When saving the callchain on Power, the kernel conservatively saves

We still have this broken upstream.

-aneesh


Re: powerpc: mitigate impact of decrementer reset

2014-10-08 Thread Preeti U Murthy
On 10/08/2014 08:22 AM, Michael Ellerman wrote:
 On Tue, 2014-07-10 at 19:13:24 UTC, Paul Clarke wrote:
 The POWER ISA defines an always-running decrementer which can be used
 to schedule interrupts after a certain time interval has elapsed.
 The decrementer counts down at the same frequency as the Time Base,
 which is 512 MHz.  The maximum value of the decrementer is 0x7fff.
 This works out to a maximum interval of about 4.19 seconds.

 If a larger interval is desired, the kernel will set the decrementer
 to its maximum value and reset it after it expires (underflows)
 a sufficient number of times until the desired interval has elapsed.

 The negative effect of this is that an unwanted latency spike will
 impact normal processing at most every 4.19 seconds.  On an IBM
 POWER8-based system, this spike was measured at about 25-30
 microseconds, much of which was basic, opportunistic housekeeping
 tasks that could otherwise have waited.

 This patch short-circuits the reset of the decrementer, exiting after
 the decrementer reset, but before the housekeeping tasks if the only
 need for the interrupt is simply to reset it.  After this patch,
 the latency spike was measured at about 150 nanoseconds.
 
 Hi Paul,
 
 Thanks for the excellent changelog. But this patch makes me a bit nervous :)
 
 Do you know where the latency is coming from? Is it primarily the irq work?
 
 If so I'd prefer if we could move the short circuit into __timer_interrupt()
 itself. That way we'd still have the trace points usable, and it would
 hopefully result in less duplicated logic.

I agree, this is perhaps the better approach.

Regards
Preeti U Murthy
 
 cheers

Re: [PATCH v4 13/16] cxl: Driver code for powernv PCIe based cards for userspace access

2014-10-08 Thread Ian Munsie
Excerpts from Michael Neuling's message of 2014-10-08 19:55:02 +1100:
 +static ssize_t afu_read(struct file *file, char __user *buf, size_t count,
 +			loff_t *off)
...
 +	for (;;) {
 +		prepare_to_wait(&ctx->wq, &wait, TASK_INTERRUPTIBLE);
 +		if (ctx_event_pending(ctx))
 +			break;
 +
 +		spin_unlock_irqrestore(&ctx->lock, flags);
 +		if (file->f_flags & O_NONBLOCK)
 +			return -EAGAIN;
 +
 +		if (signal_pending(current))
 +			return -ERESTARTSYS;

Looks like I mucked this up while refactoring - these two cases no
longer call finish_wait() which can lead to a crash if something later
wakes up the ctx->wq... I'll post a fix in a separate patch shortly.

-Ian
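
For illustration only, the single-exit cleanup that such a fix needs can
be sketched in userspace C. This is not the cxl driver code: the mock_*
names and the struct below are hypothetical stand-ins for the wait queue
and the spinlock, the sleep is simulated by pretending an event arrived,
and -EINTR stands in for the kernel-internal -ERESTARTSYS.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the context state; not the driver's type. */
struct mock_ctx {
	bool queued;		/* true while "on the wait queue" */
	bool locked;		/* true while the "spinlock" is held */
	bool event_pending;
};

static void mock_lock(struct mock_ctx *c)   { assert(!c->locked); c->locked = true; }
static void mock_unlock(struct mock_ctx *c) { assert(c->locked); c->locked = false; }
static void mock_prepare_to_wait(struct mock_ctx *c) { c->queued = true; }
static void mock_finish_wait(struct mock_ctx *c)     { c->queued = false; }

/* Returns 0 once an event was consumed, or a negative errno value. */
static int mock_read(struct mock_ctx *c, bool nonblock, bool sig_pending)
{
	int rc = 0;

	mock_lock(c);
	for (;;) {
		mock_prepare_to_wait(c);
		if (c->event_pending)
			break;

		if (nonblock) {
			rc = -EAGAIN;
			goto out;	/* early exit still reaches the cleanup */
		}

		if (sig_pending) {
			rc = -EINTR;
			goto out;
		}

		/* Stand-in for unlock + schedule() + relock. */
		mock_unlock(c);
		c->event_pending = true;
		mock_lock(c);
	}
	c->event_pending = false;	/* consume the event */

out:
	/* Every path leaves the wait queue and drops the lock exactly once. */
	mock_finish_wait(c);
	mock_unlock(c);
	return rc;
}
```

Calling mock_read() with nonblock or sig_pending set returns the error
code and still leaves the context dequeued and unlocked, which is the
property the two early returns quoted above did not have.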


[PATCH] CXL: Fix afu_read() not doing finish_wait() on signal or non-blocking

2014-10-08 Thread Ian Munsie
If afu_read() returned due to a signal or the AFU file descriptor being
opened non-blocking it would not call finish_wait() before returning,
which could lead to a crash later when something else wakes up the wait
queue.

This patch restructures the wait logic to ensure that the cleanup is
done correctly.

Signed-off-by: Ian Munsie imun...@au1.ibm.com
---
 drivers/misc/cxl/file.c | 20 +++-
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/misc/cxl/file.c b/drivers/misc/cxl/file.c
index 847b7e6..378b099 100644
--- a/drivers/misc/cxl/file.c
+++ b/drivers/misc/cxl/file.c
@@ -273,6 +273,7 @@ static ssize_t afu_read(struct file *file, char __user *buf, size_t count,
 	struct cxl_context *ctx = file->private_data;
 	struct cxl_event event;
 	unsigned long flags;
+	int rc;
 	DEFINE_WAIT(wait);
 
 	if (count < CXL_READ_MIN_SIZE)
@@ -285,13 +286,17 @@ static ssize_t afu_read(struct file *file, char __user *buf, size_t count,
 		if (ctx_event_pending(ctx))
 			break;
 
-		spin_unlock_irqrestore(&ctx->lock, flags);
-		if (file->f_flags & O_NONBLOCK)
-			return -EAGAIN;
+		if (file->f_flags & O_NONBLOCK) {
+			rc = -EAGAIN;
+			goto out;
+		}
 
-		if (signal_pending(current))
-			return -ERESTARTSYS;
+		if (signal_pending(current)) {
+			rc = -ERESTARTSYS;
+			goto out;
+		}
 
+		spin_unlock_irqrestore(&ctx->lock, flags);
 		pr_devel("afu_read going to sleep...\n");
 		schedule();
 		pr_devel("afu_read woken up\n");
@@ -336,6 +341,11 @@ static ssize_t afu_read(struct file *file, char __user *buf, size_t count,
 	if (copy_to_user(buf, &event, event.header.size))
 		return -EFAULT;
 	return event.header.size;
+
+out:
+	finish_wait(&ctx->wq, &wait);
+	spin_unlock_irqrestore(&ctx->lock, flags);
+	return rc;
 }
 
 static const struct file_operations afu_fops = {
-- 
2.1.0


Re: [PATCH 08/44] kernel: Move pm_power_off to common code

2014-10-08 Thread Xuetao Guan

- Guenter Roeck li...@roeck-us.net wrote:
 pm_power_off is defined for all architectures. Move it to common code.
 
 Have all architectures call do_kernel_poweroff instead of pm_power_off.
 Some architectures point pm_power_off to machine_power_off. For those,
 call do_kernel_poweroff from machine_power_off instead.
 

For UniCore32 part,

Acked-by: Xuetao Guan g...@mprc.pku.edu.cn

Thanks
Xuetao

Re: [PATCH] powerpc: mitigate impact of decrementer reset

2014-10-08 Thread Paul Clarke

On 10/08/2014 12:37 AM, Heinz Wrobel wrote:

what if your tb wraps during the  test?


Per the Power ISA, Time Base is 64 bits, monotonically increasing, and 
is writable only in hypervisor state.  To my understanding, it is set to 
zero at boot (although this is not prescribed).


Also, as noted by others, the logic is roughly duplicated (with some 
differences) from the analogous code in __timer_interrupt just above it.


I don't see wrapping as a concern.

PC
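
As a rough check of the numbers in this thread, the sketch below works
out the decrementer's maximum interval and the time a 64-bit Time Base
takes to wrap at 512 MHz. The decrementer maximum of 0x7fffffff is an
assumption: the archived text shows a truncated value, and this is the
31-bit maximum consistent with the quoted 4.19 seconds. The function
names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define TB_HZ   512000000ULL	/* Time Base frequency quoted in the thread */
#define DEC_MAX 0x7fffffffULL	/* assumed decrementer maximum (see above) */

/* Convert a tick count to seconds at the given frequency. */
static double ticks_to_seconds(double ticks, uint64_t hz)
{
	assert(hz != 0);
	return ticks / (double)hz;
}

/* Longest interval one decrementer load can cover, in seconds. */
static double dec_max_interval_s(void)
{
	return ticks_to_seconds((double)DEC_MAX, TB_HZ);
}

/* Years until a 64-bit counter starting at zero wraps. */
static double tb_wrap_years(void)
{
	/* 2^64 does not fit in uint64_t, so it is written as a double. */
	const double ticks = 18446744073709551616.0;
	const double seconds_per_year = 365.25 * 24.0 * 3600.0;

	return ticks_to_seconds(ticks, TB_HZ) / seconds_per_year;
}
```

Under these assumptions dec_max_interval_s() comes out at about 4.19
seconds, and tb_wrap_years() is on the order of a thousand years, which
supports the view that Time Base wrap is not a practical concern.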


Re: [PATCH V3 3/3] powerpc, ptrace: Enable support for miscellaneous registers

2014-10-08 Thread Anshuman Khandual
On 08/28/2014 03:05 AM, Sukadev Bhattiprolu wrote:
 
 Anshuman Khandual [khand...@linux.vnet.ibm.com] wrote:
 | This patch enables get and set of miscellaneous registers through ptrace
 | PTRACE_GETREGSET/PTRACE_SETREGSET interface by implementing new powerpc
 | specific register set REGSET_MISC support corresponding to the new ELF
 | core note NT_PPC_MISC added previously in this regard.
 | 
 | Signed-off-by: Anshuman Khandual khand...@linux.vnet.ibm.com
 | ---
 |  arch/powerpc/kernel/ptrace.c | 81 
 
 |  1 file changed, 81 insertions(+)
 | 
 | diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
 | index 17642ef..63b883a 100644
 | --- a/arch/powerpc/kernel/ptrace.c
 | +++ b/arch/powerpc/kernel/ptrace.c
 | @@ -1149,6 +1149,76 @@ static int tm_cvmx_set(struct task_struct *target, 
 const struct user_regset *reg
 |  #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
 |  
 |  /*
 | + * Miscellaneous Registers
 | + *
 | + * struct {
 | + * unsigned long dscr;
 | + * unsigned long ppr;
 | + * unsigned long tar;
 | + * };
 | + */
 | +static int misc_get(struct task_struct *target, const struct user_regset *regset,
 | +		   unsigned int pos, unsigned int count,
 | +		   void *kbuf, void __user *ubuf)
 | +{
 | +	int ret;
 | +
 | +	/* DSCR register */
 | +	ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf,
 | +				  &target->thread.dscr, 0,
 | +				  sizeof(unsigned long));
 | +
 | +	BUILD_BUG_ON(offsetof(struct thread_struct, dscr) + sizeof(unsigned long) +
 | +		     sizeof(unsigned long) != offsetof(struct thread_struct, ppr));
 
 
 I see these in  arch/powerpc/include/asm/processor.h
 
 #ifdef CONFIG_PPC64
 unsigned long   dscr;
 int dscr_inherit;
 unsigned long   ppr;/* used to save/restore SMT priority */
 #endif
 
 where there is an 'int' between ppr and dscr. So, should one of
 the above sizeof(unsigned long) be changed to sizeof(int) ?

Right, I understand that but strangely I get this compile time error
when it is changed to sizeof(int).

 error: call to ‘__compiletime_assert_1350’ declared with attribute error:
  BUILD_BUG_ON failed: TSO(dscr) + sizeof(unsigned long) + sizeof(int) != 
TSO(ppr)
  BUILD_BUG_ON(TSO(dscr) + sizeof(unsigned long) + sizeof(int) != TSO(ppr));

Maybe I am missing something here.

 
 Also, since we use offsetof(struct thread_struct, field) heavily, a
 macro local to the file, may simplify the code.

Right, will do that.

 #define   TSO(f)  (offsetof(struct thread_struct, f))
 
 | +
 | +	/* PPR register */
 | +	if (!ret)
 | +		ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf,
 | +					  &target->thread.ppr, sizeof(unsigned long),
 | +					  2 * sizeof(unsigned long));
 | +
 | +	BUILD_BUG_ON(offsetof(struct thread_struct, ppr) + sizeof(unsigned long)
 | +		     != offsetof(struct thread_struct, tar));
 | +	/* TAR register */
 | +	if (!ret)
 | +		ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf,
 | +					  &target->thread.tar, 2 * sizeof(unsigned long),
 | +					  3 * sizeof(unsigned long));
 | +	return ret;
 | +}
 | +
 | +static int misc_set(struct task_struct *target, const struct user_regset *regset,
 | +		   unsigned int pos, unsigned int count,
 | +		   const void *kbuf, const void __user *ubuf)
 | +{
 | +	int ret;
 | +
 | +	/* DSCR register */
 | +	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
 | +				 &target->thread.dscr, 0,
 | +				 sizeof(unsigned long));
 | +
 | +	BUILD_BUG_ON(offsetof(struct thread_struct, dscr) + sizeof(unsigned long) +
 | +		     sizeof(unsigned long) != offsetof(struct thread_struct, ppr));
 | +
 | +	/* PPR register */
 | +	if (!ret)
 | +		ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
 | +					 &target->thread.ppr, sizeof(unsigned long),
 | +					 2 * sizeof(unsigned long));
 | +
 | +	BUILD_BUG_ON(offsetof(struct thread_struct, ppr) + sizeof(unsigned long)
 | +		     != offsetof(struct thread_struct, tar));
 | +
 | +	/* TAR register */
 | +	if (!ret)
 | +		ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
 | +					 &target->thread.tar, 2 * sizeof(unsigned long),
 | +					 3 * sizeof(unsigned long));
 | +	return ret;
 | +}
 | +
 | +/*
 |   * These are our native regset flavors.
 |   */
 |  enum powerpc_regset {
 | @@ -1169,6 +1239,7 @@ enum powerpc_regset {
 | REGSET_TM_CFPR, /* TM checkpointed FPR */
 | 

Re: [PATCH v2 1/2] spi: fsl-spi: Fix parameter ram offset setup for CPM1

2014-10-08 Thread leroy christophe


On 07/10/2014 02:15, Scott Wood wrote:

On Sat, 2014-10-04 at 14:02 +0200, christophe leroy wrote:

On 03/10/2014 22:29, Scott Wood wrote:

On Fri, 2014-10-03 at 18:49 +0200, Christophe Leroy wrote:

On CPM1, the SPI parameter RAM has a default location. In fsl_spi_cpm_get_pram()
there was a confusion between the SPI_BASE register and the base of the SPI
parameter RAM. Fortunately, it was working properly with MPC866 and MPC885
because they do set SPI_BASE, but on MPC860 and other old MPC8xx that don't
set SPI_BASE, pram_ofs was not properly set. This patch fixes this confusion.

Signed-off-by: Christophe Leroy christophe.le...@c-s.fr

---
Changes from v1 to v2: none

   drivers/spi/spi-fsl-cpm.c | 9 -
   1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/spi/spi-fsl-cpm.c b/drivers/spi/spi-fsl-cpm.c
index 54b0637..0f3a912 100644
--- a/drivers/spi/spi-fsl-cpm.c
+++ b/drivers/spi/spi-fsl-cpm.c
@@ -262,15 +262,14 @@ static unsigned long fsl_spi_cpm_get_pram(struct mpc8xxx_spi *mspi)
 		pram_ofs = cpm_muram_alloc(SPI_PRAM_SIZE, 64);
 		out_be16(spi_base, pram_ofs);
 	} else {
-		struct spi_pram __iomem *pram = spi_base;
-		u16 rpbase = in_be16(&pram->rpbase);
+		u16 rpbase = in_be16(spi_base);
 
-		/* Microcode relocation patch applied? */
+		/* Microcode relocation patch applied | rpbase set by default */
 		if (rpbase) {
 			pram_ofs = rpbase;
 		} else {
-			pram_ofs = cpm_muram_alloc(SPI_PRAM_SIZE, 64);
-			out_be16(spi_base, pram_ofs);
+			pram_ofs = offsetof(cpm8xx_t, cp_dparam[PROFF_SPI]) -
+				   offsetof(cpm8xx_t, cp_dpmem[0]);
 		}
 	}

Why is PROFF_SPI not coming from the device tree?

That's where it starts to become tricky.

PROFF_SPI is defined in cpm1.h which is included by the driver already.

Yes, but those values shouldn't be used.  It's a leftover from the old
way of hardcoding things and describing the hardware with kconfig rather
than the device tree.


It provides the default offset from the start of the parameter RAM.
Previously I had the following in my device tree, and the last part of
the source above (the one for rpbase == 0) could not work.

  spi: spi@a80 {
          cell-index = <0>;
          compatible = "fsl,spi", "fsl,cpm1-spi";
          reg = <0xa80 0x30 0x3d80 0x30>;

First reg area was the area for SPI registers. Second area was the
parameter RAM zone, which was just mapped to get access to the SPI_BASE
pointer (rpbase)

Now I have

          compatible = "fsl,spi", "fsl,cpm1-spi-reloc";
          reg = <0xa80 0x30 0x3dac 0x2>;

First reg area is the area for SPI registers. Second area is the
SPI_BASE, as for the CPM2.

On recent 8xx (885 and 866 at least) it contains the offset (=0x1D80) of
the parameter RAM. But on old ones (860, ...) it contains 0. Therefore
we have to get the default index in another way.
What I wanted was to keep something similar to what's done with CPM2.

What should it look like if that offset had to be in the device tree ?

If the offset is not relocatable or discoverable, it should stay in the
device tree.  If you have an old chip you wouldn't have
fsl,cpm1-spi-reloc and thus you'd still have 0x3d80 0x30 in reg.
This index is from the start of the dual port RAM. It is 0x2000 above 
the start of the CPM area.

In the DTS, we have:

soc@ff00 {
	compatible = "fsl,mpc885", "fsl,pq1-soc";
	#address-cells = <1>;
	#size-cells = <1>;
	device_type = "soc";
	ranges = <0x0 0xff00 0x28000>;
	bus-frequency = <0>;
	clock-frequency = <0>;

	cpm@9c0 {
		#address-cells = <1>;
		#size-cells = <1>;
		compatible = "fsl,mpc885-cpm", "fsl,cpm1";
		ranges;
		reg = <0x9c0 0x40>;
		brg-frequency = <0>;
		interrupts = <0>;	// cpm error interrupt
		interrupt-parent = <&CPM_PIC>;

		muram@2000 {
			#address-cells = <1>;
			#size-cells = <1>;
			ranges = <0x0 0x2000 0x2000>;

			data@0 {
				compatible = "fsl,cpm-muram-data";
				reg = <0x0 0x1c00>;
			};
		};

		spi: spi@a80 {
			#address-cells = <1>;
			#size-cells = <0>;
			cell-index = <0>;
			compatible = "fsl,spi", "fsl,cpm1-spi";
			reg = <0xa80 0x30 0x3d80 0x30>;
			interrupts = <5>;
			interrupt-parent = <&CPM_PIC>;
			mode = "cpu";


The binding allows me to do an of_iomap() on the parameter RAM, hence to 
get access to the relocation index which is inside it.
But if the relocation index is 0, I have to calculate it by myself 
because the calling function expects it in return.
The binding is 

Re: [PATCH v2 1/2] spi: fsl-spi: Fix parameter ram offset setup for CPM1

2014-10-08 Thread Scott Wood
On Wed, 2014-10-08 at 18:21 +0200, leroy christophe wrote:
 On 07/10/2014 02:15, Scott Wood wrote:
  On Sat, 2014-10-04 at 14:02 +0200, christophe leroy wrote:
  What should it look like if that offset had to be in the device tree ?
  If the offset is not relocatable or discoverable, it should stay in the
  device tree.  If you have an old chip you wouldn't have
  fsl,cpm1-spi-reloc and thus you'd still have 0x3d80 0x30 in reg.
 This index is from the start of the dual port RAM. It is 0x2000 above 
 the start of the CPM area.
 In the DTS, we have:
 
  soc@ff00 {
  	compatible = "fsl,mpc885", "fsl,pq1-soc";
  	#address-cells = <1>;
  	#size-cells = <1>;
  	device_type = "soc";
  	ranges = <0x0 0xff00 0x28000>;
  	bus-frequency = <0>;
  	clock-frequency = <0>;
 
  	cpm@9c0 {
  		#address-cells = <1>;
  		#size-cells = <1>;
  		compatible = "fsl,mpc885-cpm", "fsl,cpm1";
  		ranges;
  		reg = <0x9c0 0x40>;
  		brg-frequency = <0>;
  		interrupts = <0>;	// cpm error interrupt
  		interrupt-parent = <&CPM_PIC>;
 
  		muram@2000 {
  			#address-cells = <1>;
  			#size-cells = <1>;
  			ranges = <0x0 0x2000 0x2000>;
 
  			data@0 {
  				compatible = "fsl,cpm-muram-data";
  				reg = <0x0 0x1c00>;
  			};
  		};
 
  		spi: spi@a80 {
  			#address-cells = <1>;
  			#size-cells = <0>;
  			cell-index = <0>;
  			compatible = "fsl,spi", "fsl,cpm1-spi";
  			reg = <0xa80 0x30 0x3d80 0x30>;
  			interrupts = <5>;
  			interrupt-parent = <&CPM_PIC>;
  			mode = "cpu";
 
 
 The binding allows me to do an of_iomap() on the parameter RAM, hence to 
 get access to the relocation index which is inside it.
 But if the relocation index is 0, I have to calculate it by myself 
 because the calling function expects it in return.
 The binding is also supposed to tell that the muram is at 0xff002000. 
 But I don't know how I can get this info and use it to calculate the 
 index of my param RAM ? I need to calculate the index which is 1d80 
 (0x3d80 - 0x2000)

What binding are you talking about?  There is no published binding for
this yet.

As for what the driver should do, it should do an of_iomap(), but what
it does with the resulting memory depends on the compatible.  For
fsl,cpm1-spi, the result would be the parameter RAM for the device.  For
fsl,cpm1-spi-reloc and fsl,cpm2-spi, it would be the relocation
register.  The driver would either read the contents of the register, or
write a different offset.

My understanding is that the relocation register would only be zero on
the chips where we'd use fsl,cpm1-spi, not fsl,cpm1-spi-reloc.

-Scott



Re: [PATCH 2/2] spi: fsl-spi: Allow dynamic allocation of CPM1 parameter RAM

2014-10-08 Thread leroy christophe


On 07/10/2014 02:19, Scott Wood wrote:

On Sat, 2014-10-04 at 12:15 +0200, christophe leroy wrote:

On 03/10/2014 22:24, Scott Wood wrote:

On Fri, 2014-10-03 at 22:15 +0200, christophe leroy wrote:

On 03/10/2014 16:44, Mark Brown wrote:

On Fri, Oct 03, 2014 at 02:56:09PM +0200, Christophe Leroy wrote:


+config CPM1_RELOCSPI
+	bool "Dynamic SPI relocation"
+	default n
+	help
+	  On recent MPC8xx (at least MPC866 and MPC885) SPI can be relocated
+	  without micropatch. This activates relocation to a dynamically
+	  allocated area in the CPM Dual port RAM.
+	  When combined with SPI relocation patch (for older MPC8xx) it avoids
+	  the loss of additional Dual port RAM space just above the patch,
+	  which might be needed for example when using the CPM QMC.

Something like this shouldn't be a compile time option.  Either it
should be unconditional or it should be triggered in some system
specific manner (from DT, from knowing about other users or similar).

Can't be unconditional as older versions of mpc8xx (eg MPC860) don't
support relocation without a micropatch.
I have therefore submitted a v2 based on a DTS compatible property.

So the device tree change is about whether relocation is supported, not
whether it is required?

Indeed no, my intention is to say that relocation is requested. Do you
mean that it should then not use a compatible?

The device tree describes hardware.  It doesn't tell software how to use
that hardware.

Based on one of your other e-mails, I think what you want to say here is
that the old binding didn't describe the registers needed for
relocation, so the new compatible describes the new binding, rather than
requesting that software do a relocation.  Software that sees the new
binding could choose to relocate, or just choose to read the current
offset from the register.

Not exactly.
The old binding does describe the entire default param RAM (0x3d80 size 
0x30). The relocation index is within this param RAM at 0x3dac.

So the old binding is enough to allow relocation.
The issue today with the driver (hence my first patch) is that the 
driver reads the relocation index but makes the wrong decision if the 
index is 0: it assumes that a null index means that a param RAM shall be 
allocated, which is wrong. A null index means that the component doesn't 
support relocation, so the default param RAM shall be used. The function 
used for that is supposed to return the index. So when the index is 
null, I need to calculate it.
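
The decision described above can be sketched in plain C. This is only an
illustration under the addresses quoted in this thread (default
parameter RAM at 0x3d80, dual port RAM starting at 0x2000); the macro
and function names are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants, taken from the addresses discussed above. */
#define DPRAM_START_OFS   0x2000u	/* start of the dual port RAM */
#define SPI_PRAM_DEFAULT  0x3d80u	/* default SPI parameter RAM */

/*
 * Hypothetical helper: return the parameter RAM offset relative to the
 * start of the dual port RAM. A non-zero rpbase is used as read; a zero
 * rpbase means "no relocation", so the default location is returned
 * instead of allocating a new area.
 */
static uint16_t spi_pram_offset(uint16_t rpbase)
{
	uint16_t default_ofs = SPI_PRAM_DEFAULT - DPRAM_START_OFS;

	assert(default_ofs == 0x1d80);
	return rpbase ? rpbase : default_ofs;
}
```

With these values spi_pram_offset(0) yields 0x1d80, the same offset that
the newer chips are said to report in SPI_BASE by default.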


Now, it can't be the SPI driver by itself that decides whether to 
relocate or not, because that depends on whether I need to relocate. 
There is no point in wasting another area of the dual port RAM if I 
don't need to use SCC2 in a mode that overlaps the SPI parameter RAM.


Today on the old MPC8xx, a microcode patch is needed in order to be able 
to relocate, and the relocated address is directly fixed by the code 
handling the patch (sysdev/micropatch.c). The patch loading function is 
called very early in the boot process by cpm_reset(), which is called by 
xxx_setup_arch().

I have two issues with the way it is done today:
1/ the address which is hard coded in the micropatch loading function 
is within the area for descriptors for the QMC, so I would need to use 
another address.
2/ for new MPC8xx which don't need a microcode patch, I have no way today 
to relocate.


I have the same issue with the relocation of SMC1. Today when we 
activate SMC1 relocation microcode patch, the loading function has a 
hard coded relocation area for SMC1 which is the area dedicated to the 
MPC8xx DSP. It means that I need to change it as I want to use the DSP.


Would it be acceptable to define a fixed relocation address in the 
Kconfig in which we select microcode patch (arch/powerpc/platforms/8xx), 
instead of having it hardcoded in micropatch.c ?


Or maybe it would be possible to select which microcode patch we 
want/need via the device tree and which address shall be used for 
relocation ? What would you suggest to describe it ?



How about checking for the existing specific-SoC compatibles?

What do you mean ?

Look for fsl,mpc885-cpm-i2c etc.  Or, if you didn't follow that
pattern (remember, I can't see your device tree!), look for
fsl,mpc885-cpm or fsl,mpc866-cpm in the parent node.  It's moot
though, if the device tree also needs to be modified to describe the
register used to relocate.

-Scott


I'm not sure I understood your question.
My full device tree below

Christophe

/*
 * MIA ethernet Device Tree Source
 *
 * Copyright 2011 CSSI, Inc
 */

/dts-v1/;

/ {
	model = "MIAE";
	compatible = "fsl,cmpc885", "fsl,mod885";
	#address-cells = <1>;
	#size-cells = <1>;

	aliases {
		ethernet0 = &eth0;
		ethernet1 = &eth1;
		mdio = &phy;
		serial0 = &smc1;
	};

	cpus {
		#address-cells = <1>;
		#size-cells = <0>;

		PowerPC,885@0 {
			device_type = "cpu";
  

Re: [PATCH V3 3/3] powerpc, ptrace: Enable support for miscellaneous registers

2014-10-08 Thread Sukadev Bhattiprolu
Anshuman Khandual [khand...@linux.vnet.ibm.com] wrote:
| On 08/28/2014 03:05 AM, Sukadev Bhattiprolu wrote:
|  
|  I see these in  arch/powerpc/include/asm/processor.h
|  
|  #ifdef CONFIG_PPC64
|  unsigned long   dscr;
|  int dscr_inherit;
|  unsigned long   ppr;/* used to save/restore SMT priority */
|  #endif
|  
|  where there is an 'int' between ppr and dscr. So, should one of
|  the above sizeof(unsigned long) be changed to sizeof(int) ?
| 
| Right, I understand that but strangely I get this compile time error
| when it is changed to sizeof(int).
| 
|  error: call to ‘__compiletime_assert_1350’ declared with attribute error:
|   BUILD_BUG_ON failed: TSO(dscr) + sizeof(unsigned long) + sizeof(int) != 
TSO(ppr)
|   BUILD_BUG_ON(TSO(dscr) + sizeof(unsigned long) + sizeof(int) != TSO(ppr));
| 
| Maybe I am missing something here.

I guess there is a 4-byte padding after dscr_inherit. We could make that
explicit by adding a field or just go with the sizeof(unsigned long).

Thanks,

Sukadev
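
The padding guess above can be checked with a small userspace sketch.
The struct below is a reduced copy of the three fields quoted from
processor.h, not the real thread_struct, and the helper names are
hypothetical. On an LP64 target, where unsigned long is 8 bytes and int
is 4 bytes, the compiler is expected to insert 4 bytes after
dscr_inherit so that ppr stays naturally aligned.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced copy of the fields under discussion; not the full thread_struct. */
struct thread_misc {
	unsigned long dscr;
	int dscr_inherit;
	unsigned long ppr;
};

/* Bytes the compiler inserts between dscr_inherit and ppr. */
static size_t pad_before_ppr(void)
{
	size_t end_of_inherit = offsetof(struct thread_misc, dscr_inherit) +
				sizeof(int);
	size_t start_of_ppr = offsetof(struct thread_misc, ppr);

	assert(start_of_ppr >= end_of_inherit);
	return start_of_ppr - end_of_inherit;
}

/* Distance from the start of dscr to the start of ppr. */
static size_t dscr_to_ppr(void)
{
	return offsetof(struct thread_misc, ppr) -
	       offsetof(struct thread_misc, dscr);
}
```

If the padding is present, dscr_to_ppr() equals 2 * sizeof(unsigned
long), which would explain why the BUILD_BUG_ON with sizeof(int) fails
while the version with two sizeof(unsigned long) terms passes.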


Re: [PATCH] tools/perf/powerpc: Fix build break

2014-10-08 Thread Sukadev Bhattiprolu
Aneesh Kumar K.V [aneesh.ku...@linux.vnet.ibm.com] wrote:
| Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com writes:
| 
|CC   arch/powerpc/util/skip-callchain-idx.o
|  arch/powerpc/util/skip-callchain-idx.c: In function ‘check_return_reg’:
|  arch/powerpc/util/skip-callchain-idx.c:55:3: error: implicit declaration of function ‘pr_debug’ [-Werror=implicit-function-declaration]
|     pr_debug("dwarf_frame_register() %s\n", dwarf_errmsg(-1));
| 
|  Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
|  ---
|   tools/perf/arch/powerpc/util/skip-callchain-idx.c | 1 +
|   1 file changed, 1 insertion(+)
| 
|  diff --git a/tools/perf/arch/powerpc/util/skip-callchain-idx.c 
b/tools/perf/arch/powerpc/util/skip-callchain-idx.c
|  index a7c23a4b3778..d73ef8bb08c7 100644
|  --- a/tools/perf/arch/powerpc/util/skip-callchain-idx.c
|  +++ b/tools/perf/arch/powerpc/util/skip-callchain-idx.c
|  @@ -15,6 +15,7 @@
| 
|   #include "util/thread.h"
|   #include "util/callchain.h"
|  +#include "util/debug.h"
| 
|   /*
|* When saving the callchain on Power, the kernel conservatively saves
| 
| We still have this broken upstream.

The fix is in Ingo's tree, commit ad7e767.

Ingo, can you push this fix to Linus - it fixes a build failure in
Powerpc.

Sukadev


| 
| -aneesh
| 

Re: [PATCH 2/2] spi: fsl-spi: Allow dynamic allocation of CPM1 parameter RAM

2014-10-08 Thread Scott Wood
On Wed, 2014-10-08 at 18:46 +0200, leroy christophe wrote:
 On 07/10/2014 02:19, Scott Wood wrote:
  On Sat, 2014-10-04 at 12:15 +0200, christophe leroy wrote:
  On 03/10/2014 22:24, Scott Wood wrote:
  On Fri, 2014-10-03 at 22:15 +0200, christophe leroy wrote:
  On 03/10/2014 16:44, Mark Brown wrote:
  On Fri, Oct 03, 2014 at 02:56:09PM +0200, Christophe Leroy wrote:
 
  +config CPM1_RELOCSPI
  +	bool "Dynamic SPI relocation"
  +	default n
  +	help
  +	  On recent MPC8xx (at least MPC866 and MPC885) SPI can be relocated
  +	  without micropatch. This activates relocation to a dynamically
  +	  allocated area in the CPM Dual port RAM.
  +	  When combined with SPI relocation patch (for older MPC8xx) it avoids
  +	  the loss of additional Dual port RAM space just above the patch,
  +	  which might be needed for example when using the CPM QMC.
  Something like this shouldn't be a compile time option.  Either it
  should be unconditional or it should be triggered in some system
  specific manner (from DT, from knowing about other users or similar).
  Can't be unconditional as older versions of mpc8xx (eg MPC860) don't
  support relocation without a micropatch.
  I have therefore submitted a v2 based on a DTS compatible property.
  So the device tree change is about whether relocation is supported, not
  whether it is required?
  Indeed no, my intention is to say that relocation is requested. Do you
  mean that it should then not use a compatible?
  The device tree describes hardware.  It doesn't tell software how to use
  that hardware.
 
  Based on one of your other e-mails, I think what you want to say here is
  that the old binding didn't describe the registers needed for
  relocation, so the new compatible describes the new binding, rather than
  requesting that software do a relocation.  Software that sees the new
  binding could choose to relocate, or just choose to read the current
  offset from the register.
 Not exactly.
 The old binding does describe the entire default param RAM (0x3d80 size 
 0x30). The relocation index is within this param RAM at 0x3dac.
 So the old binding is enough to allow relocation.

Oh, so the relocation register is part of the region?  If you relocate
the region, does the relocation register move, or stay at 0x3dac?  I
checked the manual and it wasn't clear.  I had assumed it worked the
same as cpm2, where the relocation register does not move.

 The issue today with the driver (hence my first patch) is that the
 driver reads the relocation index but takes the wrong decision if the
 index is 0: it assumes that a null index means that a param RAM shall be
 allocated, which is wrong. A null index means that the component doesn't
 support relocation, so the default param RAM shall be used. The function
 used for that is supposed to return the index. So when the index is
 null, I need to calculate it.
 
 Now, it can't be the SPI driver by itself that decides whether it has to
 relocate or not, because that depends on whether I need to relocate.
 There is no point in wasting another area of the dual-port RAM if I
 don't need to use SCC2 in a mode that overlaps the SPI parameter RAM.
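
The decision described above can be sketched in plain C. This is only an illustration under stated assumptions: the helper name spi_pram_offset() and the macro names are hypothetical, and the offsets (default param RAM at 0x3d80, size 0x30, relocation index at 0x3dac) are the ones quoted in this message. It follows the reading given above, where an index of 0 means "relocation not supported".

```c
#include <stdint.h>

/* Offsets quoted in the thread; the macro names are hypothetical. */
#define SPI_PRAM_DEFAULT_OFFSET 0x3d80u /* default SPI parameter RAM */
#define SPI_PRAM_SIZE           0x30u   /* size of that area */
#define SPI_RPBASE_OFFSET       0x3dacu /* where the relocation index lives */

/*
 * Hypothetical helper: map the relocation index read from the register
 * to the parameter RAM offset the driver should use.
 */
static uint16_t spi_pram_offset(uint16_t rpbase)
{
	/*
	 * An index of 0 is taken to mean "relocation not supported", so the
	 * default area is used instead of allocating a new one.
	 */
	if (rpbase == 0)
		return SPI_PRAM_DEFAULT_OFFSET;

	/* Otherwise the index already points at the relocated area. */
	return rpbase;
}
```

With these values the relocation index sits 0x2c bytes into the default area, which is why the existing binding already covers it.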

Is the DPRAM currently fully utilized?

If it's really important to not waste 48 bytes of DPRAM, could you make
the policy decision in platform code, or check at runtime what mode SCC2
is in?

 Today on the old MPC8xx, a microcode patch is needed in order to be able
 to relocate, and the relocated address is fixed directly by the code
 handling the patch (sysdev/micropatch.c). The patch loading function is
 called very early in the boot process by cpm_reset(), which is called by
 xxx_setup_arch().
 I have two issues with the way it is done today:
 1/ the address which is hard coded in the micropatch loading function
 is within the area for descriptors for the QMC, so I would need to use
 another address.
 2/ for new MPC8xx which don't need a microcode patch, I have no way today
 to relocate.
 
 I have the same issue with the relocation of SMC1. Today when we 
 activate SMC1 relocation microcode patch, the loading function has a 
 hard coded relocation area for SMC1 which is the area dedicated to the 
 MPC8xx DSP. It means that I need to change it as I want to use the DSP.
 
 Would it be acceptable to define a fixed relocation address in the 
 Kconfig in which we select microcode patch (arch/powerpc/platforms/8xx), 
 instead of having it hardcoded in micropatch.c ?

No, that would prevent the ability to build support for all 8xx in one
kernel.

 Or maybe it would be possible to select which microcode patch we 
 want/need via the device tree and which address shall be used for 
 relocation ? What would you suggest to describe it ?

Yes, use the existing information in the device tree, or use PVR, to
determine which chip you're on and thus which microcode to use.
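
A table-driven lookup is one way to picture this suggestion. The sketch below is not kernel code: the struct, the function name and the table entries are hypothetical, and in particular the assumption that an MPC885 needs no micropatch while an MPC860 does is only inferred from the discussion above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-chip description, keyed by SoC compatible string. */
struct chip_info {
	const char *compatible;
	bool needs_micropatch; /* relocation requires a microcode patch */
};

/* Placeholder entries; a real table would list every supported 8xx. */
static const struct chip_info chip_table[] = {
	{ "fsl,mpc860", true  },
	{ "fsl,mpc885", false },
};

/* Return the entry matching the compatible string, or NULL if unknown. */
static const struct chip_info *chip_lookup(const char *compatible)
{
	size_t i;

	for (i = 0; i < sizeof(chip_table) / sizeof(chip_table[0]); i++) {
		if (strcmp(chip_table[i].compatible, compatible) == 0)
			return &chip_table[i];
	}
	return NULL;
}
```

Because the choice is made at run time from the table, one kernel image can still support all 8xx variants.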

 
  How about checking for the existing specific-SoC compatibles?
  What do 

Re: [PATCH 0/2] net: fs_enet: Remove non NAPI RX and add NAPI for TX

2014-10-08 Thread David Miller
From: Christophe Leroy christophe.le...@c-s.fr
Date: Tue,  7 Oct 2014 15:04:53 +0200 (CEST)

 When using a MPC8xx as a router, 'perf' shows a significant time spent in 
 fs_enet_interrupt() and fs_enet_start_xmit().
 'perf annotate' shows that the time spent in fs_enet_start_xmit is indeed
 spent between spin_unlock_irqrestore() and the following instruction, hence
 in interrupt handling. This is due to the TX complete interrupt that fires
 after each transmitted packet.
 This patchset first removes all non NAPI handling as NAPI has become the only
 mode for RX, then adds NAPI for handling TX complete.
 This improves NAT TCP throughput by 21% on MPC885 with FEC.
 
 Tested on MPC885 with FEC.
 
 [PATCH 1/2] net: fs_enet: Remove non NAPI RX
 [PATCH 2/2] net: fs_enet: Add NAPI TX
 
 Signed-off-by: Christophe Leroy christophe.le...@c-s.fr

Series applied, thanks.

Any particular reason you didn't just put the TX reclaim calls into
the existing NAPI handler?

That's what other drivers do, because TX reclaim can make SKBs
available for RX packet receive on the local cpu.  So generally you
have one NAPI context that first does any pending TX reclaim, then
polls the RX ring for new packets.
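
The ordering described above (TX reclaim first, then RX polling within the budget) can be modelled in user-space C. This is a simplified sketch, not the fs_enet or NAPI API: struct model_dev and both functions are hypothetical, and buffers are reduced to counters.

```c
#include <stddef.h>

/* Hypothetical model of a device; not the fs_enet data structures. */
struct model_dev {
	size_t tx_done;    /* transmitted buffers waiting to be reclaimed */
	size_t free_bufs;  /* buffers available for reuse */
	size_t rx_pending; /* received packets waiting in the RX ring */
};

/* Reclaim every completed TX buffer; returns how many were freed. */
static size_t model_tx_reclaim(struct model_dev *dev)
{
	size_t n = dev->tx_done;

	dev->free_bufs += n;
	dev->tx_done = 0;
	return n;
}

/* Single poll routine: reclaim TX first, then receive up to budget. */
static size_t model_poll(struct model_dev *dev, size_t budget)
{
	size_t received = 0;

	/* Freed TX buffers become usable by the RX loop below. */
	model_tx_reclaim(dev);

	while (received < budget && dev->rx_pending > 0 && dev->free_bufs > 0) {
		dev->rx_pending--;
		dev->free_bufs--;
		received++;
	}
	return received;
}
```

In this model a device that starts with no free buffers can still receive packets in the same poll, because the reclaim step runs first.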

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH] powerpc/fsl: Add support for pci(e) machine check exception on E500MC / E5500

2014-10-08 Thread Scott Wood
On Tue, 2014-10-07 at 22:08 -0500, Jia Hongtao-B38951 wrote:
 
  -Original Message-
  From: Wood Scott-B07421
  Sent: Tuesday, September 30, 2014 2:36 AM
  To: Guenter Roeck
  Cc: Benjamin Herrenschmidt; Paul Mackerras; Michael Ellerman; linuxppc-
  d...@lists.ozlabs.org; linux-ker...@vger.kernel.org; Jojy G Varghese;
  Guenter Roeck; Jia Hongtao-B38951
  Subject: Re: [PATCH] powerpc/fsl: Add support for pci(e) machine check
  exception on E500MC / E5500
  
  On Mon, 2014-09-29 at 09:48 -0700, Guenter Roeck wrote:
   From: Jojy G Varghese jo...@juniper.net
  
   For E500MC and E5500, a machine check exception in pci(e) memory space
   crashes the kernel.
  
   Testing shows that the MCAR(U) register is zero on a MC exception for
   the
   E5500 core. At the same time, DEAR register has been found to have the
   address of the faulty load address during an MC exception for this core.
  
   This fix changes the current behavior to fixup the result register and
   instruction pointers in the case of a load operation on a faulty PCI
   address.
  
   The changes are:
   - Added the hook to pci machine check handling to the e500mc machine
     check exception handler.
   - For the E5500 core, load faulting address from SPRN_DEAR register.
 As mentioned above, this is necessary because the E5500 core does not
 report the fault address in the MCAR register.
  
   Cc: Scott Wood scottw...@freescale.com
   Signed-off-by: Jojy G Varghese jo...@juniper.net [Guenter Roeck:
   updated description]
   Signed-off-by: Guenter Roeck gro...@juniper.net
   Signed-off-by: Guenter Roeck li...@roeck-us.net
   ---
arch/powerpc/kernel/traps.c   | 3 ++-
arch/powerpc/sysdev/fsl_pci.c | 5 +
2 files changed, 7 insertions(+), 1 deletion(-)
  
   diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
   index 0dc43f9..ecb709b 100644
   --- a/arch/powerpc/kernel/traps.c
   +++ b/arch/powerpc/kernel/traps.c
   @@ -494,7 +494,8 @@ int machine_check_e500mc(struct pt_regs *regs)
 int recoverable = 1;
  
 if (reason & MCSR_LD) {
   - recoverable = fsl_rio_mcheck_exception(regs);
   + recoverable = fsl_rio_mcheck_exception(regs) ||
   + fsl_pci_mcheck_exception(regs);
 if (recoverable == 1)
 goto silent_out;
 }
   diff --git a/arch/powerpc/sysdev/fsl_pci.c
   b/arch/powerpc/sysdev/fsl_pci.c index c507767..bdb956b 100644
   --- a/arch/powerpc/sysdev/fsl_pci.c
   +++ b/arch/powerpc/sysdev/fsl_pci.c
   @@ -1021,6 +1021,11 @@ int fsl_pci_mcheck_exception(struct pt_regs
   *regs)  #endif
 addr += mfspr(SPRN_MCAR);
  
   +#ifdef CONFIG_E5500_CPU
   + if (mfspr(SPRN_EPCR) & SPRN_EPCR_ICM)
   + addr = PFN_PHYS(vmalloc_to_pfn((void *)mfspr(SPRN_DEAR)));
  #endif
  
  Kconfig tells you what hardware is supported, not what hardware you're
  actually running on.
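
  One way to act on this remark is to choose the fault-address source at run time from the core type instead of from CONFIG_E5500_CPU. The sketch below is a user-space illustration only: the enum and function names are hypothetical, and the PVR version values are assumed from memory of asm/reg.h, so they should be checked before use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed PVR version values (upper 16 bits of the PVR); verify them. */
#define PVR_VER_E500MC 0x8023u
#define PVR_VER_E5500  0x8024u

/* Hypothetical: which register holds the faulting address. */
enum fault_addr_src { FAULT_ADDR_MCAR, FAULT_ADDR_DEAR };

/* Extract the version field from a raw PVR value. */
static uint32_t pvr_version(uint32_t pvr)
{
	return (pvr >> 16) & 0xffffu;
}

/* True when the running core reports itself as an e5500. */
static bool core_is_e5500(uint32_t pvr)
{
	return pvr_version(pvr) == PVR_VER_E5500;
}

/* Decide at run time, from the core actually present. */
static enum fault_addr_src pick_fault_addr_src(uint32_t pvr)
{
	return core_is_e5500(pvr) ? FAULT_ADDR_DEAR : FAULT_ADDR_MCAR;
}
```

  A kernel built with e5500 support but booted on an e500mc would then still read MCAR.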
  
  Jia Hongtao, do you know anything about this issue?  Is there an erratum?
 
 Sorry for the late response, I just returned from my vacation.
 I don't know about this issue.
 
  What chips are affected by the erratum covered by
  http://patchwork.ozlabs.org/patch/240239/?
 
 MPC8544, MPC8548, MPC8572 are affected by this erratum.

What is the erratum number?

 I checked P4080, which uses e500mc, and no such erratum is found.

What is the erratum behavior, and how does it differ from the problem
that Jojy and Guenter are trying to solve?

-Scott



[PATCH] CXL: Fix afu_read() not doing finish_wait() on signal or non-blocking

2014-10-08 Thread Ian Munsie
From: Ian Munsie imun...@au1.ibm.com

If afu_read() returned due to a signal or the AFU file descriptor being
opened non-blocking it would not call finish_wait() before returning,
which could lead to a crash later when something else wakes up the wait
queue.

This patch restructures the wait logic to ensure that the cleanup is
done correctly.

Signed-off-by: Ian Munsie imun...@au1.ibm.com
---

Resending with correct whitespace as my mailer decided to replace tabs with
spaces on the last try.

 drivers/misc/cxl/file.c | 20 +++-
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/misc/cxl/file.c b/drivers/misc/cxl/file.c
index 847b7e6..378b099 100644
--- a/drivers/misc/cxl/file.c
+++ b/drivers/misc/cxl/file.c
@@ -273,6 +273,7 @@ static ssize_t afu_read(struct file *file, char __user *buf, size_t count,
struct cxl_context *ctx = file->private_data;
struct cxl_event event;
unsigned long flags;
+   int rc;
DEFINE_WAIT(wait);
 
if (count < CXL_READ_MIN_SIZE)
@@ -285,13 +286,17 @@ static ssize_t afu_read(struct file *file, char __user *buf, size_t count,
if (ctx_event_pending(ctx))
break;
 
-   spin_unlock_irqrestore(&ctx->lock, flags);
-   if (file->f_flags & O_NONBLOCK)
-   return -EAGAIN;
+   if (file->f_flags & O_NONBLOCK) {
+   rc = -EAGAIN;
+   goto out;
+   }
 
-   if (signal_pending(current))
-   return -ERESTARTSYS;
+   if (signal_pending(current)) {
+   rc = -ERESTARTSYS;
+   goto out;
+   }
 
+   spin_unlock_irqrestore(&ctx->lock, flags);
pr_devel("afu_read going to sleep...\n");
schedule();
pr_devel("afu_read woken up\n");
@@ -336,6 +341,11 @@ static ssize_t afu_read(struct file *file, char __user *buf, size_t count,
if (copy_to_user(buf, &event, event.header.size))
return -EFAULT;
return event.header.size;
+
+out:
+   finish_wait(&ctx->wq, &wait);
+   spin_unlock_irqrestore(&ctx->lock, flags);
+   return rc;
 }
 
 static const struct file_operations afu_fops = {
-- 
2.1.0
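
The idea behind the patch, routing every early return through one cleanup label, can be shown with a small user-space model. This is a sketch under assumptions, not the cxl driver: all names are hypothetical, the wait queue and lock are reduced to flags, and -EINTR stands in for -ERESTARTSYS, which is not available in user space. Unlike the real function, the model also sends the success path through the same label.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver context; flags replace real state. */
struct model_ctx {
	bool on_wait_queue;  /* set by prepare, cleared by finish */
	bool locked;         /* models the spinlock */
	bool event_pending;
	bool nonblocking;
	bool signal_pending;
};

static void model_prepare_to_wait(struct model_ctx *c) { c->on_wait_queue = true; }
static void model_finish_wait(struct model_ctx *c)     { c->on_wait_queue = false; }
static void model_lock(struct model_ctx *c)            { c->locked = true; }
static void model_unlock(struct model_ctx *c)          { c->locked = false; }

/* Returns 1 if an event is ready, 0 if it would sleep, or a negative errno. */
static int model_read(struct model_ctx *c)
{
	int rc;

	model_lock(c);
	model_prepare_to_wait(c);

	if (c->event_pending) {
		rc = 1;
		goto out;
	}
	if (c->nonblocking) {
		rc = -EAGAIN;
		goto out;
	}
	if (c->signal_pending) {
		rc = -EINTR;
		goto out;
	}
	rc = 0; /* a real driver would drop the lock, sleep and retry here */

out:
	/* Every exit leaves the wait queue and releases the lock. */
	model_finish_wait(c);
	model_unlock(c);
	return rc;
}
```

Whichever branch is taken, the context is never left on the wait queue, which is the property the original early returns lacked.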


RE: [PATCH] powerpc/fsl: Add support for pci(e) machine check exception on E500MC / E5500

2014-10-08 Thread Hongtao Jia


 -Original Message-
 From: Wood Scott-B07421
 Sent: Thursday, October 09, 2014 7:48 AM
 To: Jia Hongtao-B38951
 Cc: Guenter Roeck; Benjamin Herrenschmidt; Paul Mackerras; Michael
 Ellerman; linuxppc-dev@lists.ozlabs.org; linux-ker...@vger.kernel.org;
 Jojy G Varghese; Guenter Roeck
 Subject: Re: [PATCH] powerpc/fsl: Add support for pci(e) machine check
 exception on E500MC / E5500
 
 On Tue, 2014-10-07 at 22:08 -0500, Jia Hongtao-B38951 wrote:
 
   -Original Message-
   From: Wood Scott-B07421
   Sent: Tuesday, September 30, 2014 2:36 AM
   To: Guenter Roeck
   Cc: Benjamin Herrenschmidt; Paul Mackerras; Michael Ellerman;
   linuxppc- d...@lists.ozlabs.org; linux-ker...@vger.kernel.org; Jojy G
   Varghese; Guenter Roeck; Jia Hongtao-B38951
   Subject: Re: [PATCH] powerpc/fsl: Add support for pci(e) machine
   check exception on E500MC / E5500
  
   On Mon, 2014-09-29 at 09:48 -0700, Guenter Roeck wrote:
From: Jojy G Varghese jo...@juniper.net
   
For E500MC and E5500, a machine check exception in pci(e) memory
space crashes the kernel.
   
Testing shows that the MCAR(U) register is zero on a MC exception
for the
E5500 core. At the same time, DEAR register has been found to have
the address of the faulty load address during an MC exception for
 this core.
   
This fix changes the current behavior to fixup the result register
and instruction pointers in the case of a load operation on a
faulty PCI address.
   
The changes are:
- Added the hook to pci machine check handling to the e500mc machine
  check exception handler.
- For the E5500 core, load faulting address from SPRN_DEAR register.
  As mentioned above, this is necessary because the E5500 core does not
  report the fault address in the MCAR register.
   
Cc: Scott Wood scottw...@freescale.com
Signed-off-by: Jojy G Varghese jo...@juniper.net [Guenter Roeck:
updated description]
Signed-off-by: Guenter Roeck gro...@juniper.net
Signed-off-by: Guenter Roeck li...@roeck-us.net
---
 arch/powerpc/kernel/traps.c   | 3 ++-
 arch/powerpc/sysdev/fsl_pci.c | 5 +
 2 files changed, 7 insertions(+), 1 deletion(-)
   
diff --git a/arch/powerpc/kernel/traps.c
b/arch/powerpc/kernel/traps.c index 0dc43f9..ecb709b 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -494,7 +494,8 @@ int machine_check_e500mc(struct pt_regs *regs)
int recoverable = 1;
   
if (reason & MCSR_LD) {
-   recoverable = fsl_rio_mcheck_exception(regs);
+   recoverable = fsl_rio_mcheck_exception(regs) ||
+   fsl_pci_mcheck_exception(regs);
if (recoverable == 1)
goto silent_out;
}
diff --git a/arch/powerpc/sysdev/fsl_pci.c
b/arch/powerpc/sysdev/fsl_pci.c index c507767..bdb956b 100644
--- a/arch/powerpc/sysdev/fsl_pci.c
+++ b/arch/powerpc/sysdev/fsl_pci.c
@@ -1021,6 +1021,11 @@ int fsl_pci_mcheck_exception(struct pt_regs
*regs)  #endif
addr += mfspr(SPRN_MCAR);
   
+#ifdef CONFIG_E5500_CPU
+   if (mfspr(SPRN_EPCR) & SPRN_EPCR_ICM)
+   addr = PFN_PHYS(vmalloc_to_pfn((void
 *)mfspr(SPRN_DEAR)));
   #endif
  
   Kconfig tells you what hardware is supported, not what hardware
   you're actually running on.
  
   Jia Hongtao, do you know anything about this issue?  Is there an
 erratum?
 
  Sorry for the late response, I just returned from my vacation.
  I don't know about this issue.
 
   What chips are affected by the erratum covered by
   http://patchwork.ozlabs.org/patch/240239/?
 
  MPC8544, MPC8548, MPC8572 are affected by this erratum.
 
 What is the erratum number?

The erratum number differs from chip to chip:
MPC8544: PCIe 4
MPC8548: PCI-Ex 39
MPC8572: PCI-Ex 3

 
  I checked P4080, which uses e500mc, and no such erratum is found.
 
 What is the erratum behavior, and how does it differ from the problem
 that Jojy and Guenter are trying to solve?

Here is the description of the erratum:

When its link goes down, the PCI Express controller clears all outstanding
transactions with an error indicator and sends a link down exception to the
interrupt controller if PEX_PME_MES_DISR[LDDD] = 0. If, however, any
transactions are sent to the controller after the link down event, they will
be accepted by the controller and wait for the link to come back up before
starting any timeout counters (e.g. completion timeout). There is no
mechanism to cancel the new transactions short of a device HRESET.

For e500mc, what Jojy and Guenter described looks like the same erratum as on
e500, but I am not 100% sure.

For e5500, I don't fully understand the behavior yet.

 
 -Scott
 


Re: [PATCH V3 2/3] powerpc, ptrace: Enable support for transactional memory register sets

2014-10-08 Thread Anshuman Khandual
On 08/28/2014 03:05 AM, Sukadev Bhattiprolu wrote:
 Anshuman Khandual [khand...@linux.vnet.ibm.com] wrote:
 | This patch enables get and set of transactional memory related register
 | sets through PTRACE_GETREGSET/PTRACE_SETREGSET interface by implementing
 | four new powerpc specific register sets i.e REGSET_TM_SPR, REGSET_TM_CGPR,
 | REGSET_TM_CFPR, REGSET_CVMX support corresponding to these following new
 | ELF core note types added previously in this regard.
 | 
 | (1) NT_PPC_TM_SPR
 | (2) NT_PPC_TM_CGPR
 | (3) NT_PPC_TM_CFPR
 | (4) NT_PPC_TM_CVMX
 | 
 | Signed-off-by: Anshuman Khandual khand...@linux.vnet.ibm.com
 | ---
 |  arch/powerpc/include/asm/switch_to.h |   8 +
 |  arch/powerpc/kernel/process.c|  24 ++
 |  arch/powerpc/kernel/ptrace.c | 792 
 +--
 |  3 files changed, 795 insertions(+), 29 deletions(-)
 | 
 | diff --git a/arch/powerpc/include/asm/switch_to.h 
 b/arch/powerpc/include/asm/switch_to.h
 | index 0e83e7d..2737f46 100644
 | --- a/arch/powerpc/include/asm/switch_to.h
 | +++ b/arch/powerpc/include/asm/switch_to.h
 | @@ -80,6 +80,14 @@ static inline void flush_spe_to_thread(struct 
 task_struct *t)
 |  }
 |  #endif
 |  
 | +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 | +extern void flush_tmregs_to_thread(struct task_struct *);
 | +#else
 | +static inline void flush_tmregs_to_thread(struct task_struct *t)
 | +{
 | +}
 | +#endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
 | +
 |  static inline void clear_task_ebb(struct task_struct *t)
 |  {
 |  #ifdef CONFIG_PPC_BOOK3S_64
 | diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
 | index 31d0215..e247898 100644
 | --- a/arch/powerpc/kernel/process.c
 | +++ b/arch/powerpc/kernel/process.c
 | @@ -695,6 +695,30 @@ static inline void __switch_to_tm(struct task_struct 
 *prev)
 | }
 |  }
 |  
 | +void flush_tmregs_to_thread(struct task_struct *tsk)
 | +{
 | +   /*
 | +* If task is not current, it should have been flushed
 | +* already to its thread_struct during __switch_to().
 | +*/
 | +   if (tsk != current)
 | +   return;
 | +
 | +   preempt_disable();
 | +   if (tsk->thread.regs) {
 | +   /*
 | +* If we are still current, the TM state needs to
 | +* be flushed to thread_struct as it will be still
 | +* present in the current cpu.
 | +*/
 | +   if (MSR_TM_ACTIVE(tsk->thread.regs->msr)) {
 | +   __switch_to_tm(tsk);
 | +   tm_recheckpoint_new_task(tsk);
 | +   }
 | +   }
 | +   preempt_enable();
 | +}
 | +
 |  /*
 |   * This is called if we are on the way out to userspace and the
 |   * TIF_RESTORE_TM flag is set.  It checks if we need to reload
 | diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
 | index 2e3d2bf..17642ef 100644
 | --- a/arch/powerpc/kernel/ptrace.c
 | +++ b/arch/powerpc/kernel/ptrace.c
 | @@ -357,6 +357,17 @@ static int gpr_set(struct task_struct *target, const 
 struct user_regset *regset,
 | return ret;
 |  }
 |  
 | +/*
 | + * When any transaction is active, thread_struct->transact_fp holds
 | + * the current running value of all FPR registers and thread_struct->
 | + * fp_state holds the last checkpointed FPR registers state for the
 | + * current transaction.
 | + *
 | + * struct data {
 | + * u64 fpr[32];
 | + * u64 fpscr;
 | + * };
 | + */
 
 Maybe a reference to 'struct thread_fp_state' in the comments will help ?

Okay, will try to add.

 
 
 |  static int fpr_get(struct task_struct *target, const struct user_regset 
 *regset,
 |unsigned int pos, unsigned int count,
 |void *kbuf, void __user *ubuf)
 | @@ -365,21 +376,41 @@ static int fpr_get(struct task_struct *target, const 
 struct user_regset *regset,
 | u64 buf[33];
 | int i;
 |  #endif
 | -   flush_fp_to_thread(target);
 | +   if (MSR_TM_ACTIVE(target->thread.regs->msr)) {
 | +   flush_fp_to_thread(target);
 | +   flush_altivec_to_thread(target);
 | +   flush_tmregs_to_thread(target);
 | +   } else {
 | +   flush_fp_to_thread(target);
 | +   }
 
 flush_fp_to_thread(target) is unconditional - so could be outside
 the if and else blocks ?

yes
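
The hoisting agreed on here can be sketched as follows. This is a user-space illustration only: the counters and model_* functions are hypothetical stand-ins for flush_fp_to_thread(), flush_altivec_to_thread() and flush_tmregs_to_thread(), and a boolean replaces the MSR_TM_ACTIVE() test.

```c
#include <stdbool.h>

/* Hypothetical counters standing in for the kernel flush helpers. */
struct flush_counts {
	int fp;
	int altivec;
	int tm;
};

static void model_flush_fp(struct flush_counts *c)      { c->fp++; }
static void model_flush_altivec(struct flush_counts *c) { c->altivec++; }
static void model_flush_tmregs(struct flush_counts *c)  { c->tm++; }

/* The FP flush runs on both paths, so it moves out of the if/else. */
static void model_flush_for_fpr_get(struct flush_counts *c, bool tm_active)
{
	model_flush_fp(c);

	/* Only the transactional case needs the extra flushes. */
	if (tm_active) {
		model_flush_altivec(c);
		model_flush_tmregs(c);
	}
}
```

The behaviour is the same as in the posted hunk; only the duplicated call disappears.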

 
 |  
 |  #ifdef CONFIG_VSX
 | /* copy to local buffer then write that out */
 | -   for (i = 0; i < 32 ; i++)
 | -   buf[i] = target->thread.TS_FPR(i);
 | -   buf[32] = target->thread.fp_state.fpscr;
 | +   if (MSR_TM_ACTIVE(target->thread.regs->msr)) {
 | +   for (i = 0; i < 32 ; i++)
 | +   buf[i] = target->thread.TS_TRANS_FPR(i);
 | +   buf[32] = target->thread.transact_fp.fpscr;
 | +   } else {
 | +   for (i = 0; i < 32 ; i++)
 | +   buf[i] = target->thread.TS_FPR(i);
 | +   buf[32] = target->thread.fp_state.fpscr;
 | +   }
 | return user_regset_copyout(pos, count, kbuf, ubuf, buf, 0, -1);
 |  
 |  #else
 | -   BUILD_BUG_ON(offsetof(struct 

Re: [PATCH 0/2] net: fs_enet: Remove non NAPI RX and add NAPI for TX

2014-10-08 Thread leroy christophe


On 08/10/2014 22:03, David Miller wrote:

From: Christophe Leroy christophe.le...@c-s.fr
Date: Tue,  7 Oct 2014 15:04:53 +0200 (CEST)


When using a MPC8xx as a router, 'perf' shows a significant time spent in
fs_enet_interrupt() and fs_enet_start_xmit().
'perf annotate' shows that the time spent in fs_enet_start_xmit is indeed spent
between spin_unlock_irqrestore() and the following instruction, hence in
interrupt handling. This is due to the TX complete interrupt that fires after
each transmitted packet.
This patchset first removes all non NAPI handling as NAPI has become the only
mode for RX, then adds NAPI for handling TX complete.
This improves NAT TCP throughput by 21% on MPC885 with FEC.

Tested on MPC885 with FEC.

[PATCH 1/2] net: fs_enet: Remove non NAPI RX
[PATCH 2/2] net: fs_enet: Add NAPI TX

Signed-off-by: Christophe Leroy christophe.le...@c-s.fr

Series applied, thanks.

Any particular reason you didn't just put the TX reclaim calls into
the existing NAPI handler?

Not really. I used the gianfar.c driver as a model.


That's what other drivers do, because TX reclaim can make SKBs
available for RX packet receive on the local cpu.  So generally you
have one NAPI context that first does any pending TX reclaim, then
polls the RX ring for new packets.


Is that a better approach?
