Re: [PATCH] powerpc/8xx: Build fix with Hugetlbfs enabled

2018-04-17 Thread Michael Ellerman
"Aneesh Kumar K.V" writes: > 8xx use slice code when hugetlbfs is enabled. We missed a header include on > 8xx which resulted in the below build failure. > > config: mpc885_ads_defconfig + CONFIG_HUGETLBFS > >CC arch/powerpc/mm/slice.o >

Re: [RFC] virtio: Use DMA MAP API for devices without an IOMMU

2018-04-17 Thread Anshuman Khandual
On 04/15/2018 05:41 PM, Christoph Hellwig wrote: > On Fri, Apr 06, 2018 at 06:37:18PM +1000, Benjamin Herrenschmidt wrote: implemented as DMA API which the virtio core understands. There is no need for an IOMMU to be involved for the device representation in this case IMHO. >>> >>>

[PATCH v2 7/7] ocxl: Document new OCXL IOCTLs

2018-04-17 Thread Alastair D'Silva
From: Alastair D'Silva Signed-off-by: Alastair D'Silva --- Documentation/accelerators/ocxl.rst | 11 +++ 1 file changed, 11 insertions(+) diff --git a/Documentation/accelerators/ocxl.rst b/Documentation/accelerators/ocxl.rst index

[PATCH v2 1/7] powerpc: Add TIDR CPU feature for Power9

2018-04-17 Thread Alastair D'Silva
From: Alastair D'Silva This patch adds a CPU feature bit to show whether the CPU has the TIDR register available, enabling as_notify/wait in userspace. Signed-off-by: Alastair D'Silva --- arch/powerpc/include/asm/cputable.h | 3 ++-

[PATCH v2 6/7] ocxl: Add an IOCTL so userspace knows what CPU features are available

2018-04-17 Thread Alastair D'Silva
From: Alastair D'Silva In order for a userspace AFU driver to call the Power9 specific OCXL_IOCTL_ENABLE_P9_WAIT, it needs to verify that it can actually make that call. Signed-off-by: Alastair D'Silva --- Documentation/accelerators/ocxl.rst | 1 -

[PATCH v2 5/7] ocxl: Expose the thread_id needed for wait on p9

2018-04-17 Thread Alastair D'Silva
From: Alastair D'Silva In order to successfully issue as_notify, an AFU needs to know the TID to notify, which in turn means that this information should be available in userspace so it can be communicated to the AFU. Signed-off-by: Alastair D'Silva

[PATCH v2 3/7] powerpc: use task_pid_nr() for TID allocation

2018-04-17 Thread Alastair D'Silva
From: Alastair D'Silva The current implementation of TID allocation, using a global IDR, may result in an errant process starving the system of available TIDs. Instead, use task_pid_nr(), as mentioned by the original author. The scenario described which prevented its use

[PATCH v2 0/7] ocxl: Implement Power9 as_notify/wait for OpenCAPI

2018-04-17 Thread Alastair D'Silva
From: Alastair D'Silva The Power 9 as_notify/wait feature provides a lower latency way to signal a thread that work is complete. This series enables the use of this feature from OpenCAPI adapters, as well as addressing a potential starvation issue when allocating thread

[PATCH v2 4/7] ocxl: Rename pnv_ocxl_spa_remove_pe to clarify it's action

2018-04-17 Thread Alastair D'Silva
From: Alastair D'Silva The function removes the process element from NPU cache. Signed-off-by: Alastair D'Silva --- arch/powerpc/include/asm/pnv-ocxl.h | 2 +- arch/powerpc/platforms/powernv/ocxl.c | 4 ++-- drivers/misc/ocxl/link.c |

[PATCH v2 2/7] powerpc: Use TIDR CPU feature to control TIDR allocation

2018-04-17 Thread Alastair D'Silva
From: Alastair D'Silva Switch the use of TIDR on its CPU feature, rather than assuming it is available based on architecture. Signed-off-by: Alastair D'Silva --- arch/powerpc/kernel/process.c | 6 +++--- 1 file changed, 3 insertions(+), 3

Re: [PATCH] misc: cxl: Change return type to vm_fault_t

2018-04-17 Thread Andrew Donnellan
On 18/04/18 00:53, Souptick Joarder wrote: Use new return type vm_fault_t for fault handler. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. Reference id -> 1c8f422059ae

Re: [PATCH V1 00/11] powerpc/mm/book3s64: Support for split pmd ptlock

2018-04-17 Thread Balbir Singh
On Mon, 16 Apr 2018 16:57:12 +0530 "Aneesh Kumar K.V" wrote: > This patch series adds split pmd pagetable lock for book3s64. nohash64 also > should > be able to switch to this. I need to work out the code dependency. This series > also might have broken the build on

Re: [PATCH 1/2] powernv/npu: Do a PID GPU TLB flush when invalidating a large address range

2018-04-17 Thread Balbir Singh
On Tue, Apr 17, 2018 at 7:17 PM, Balbir Singh wrote: > On Tue, Apr 17, 2018 at 7:11 PM, Alistair Popple > wrote: >> The NPU has a limited number of address translation shootdown (ATSD) >> registers and the GPU has limited bandwidth to process ATSDs.

[PATCH] powerpc: platform: cell: spufs: Change return type to vm_fault_t

2018-04-17 Thread Souptick Joarder
Use new return type vm_fault_t for fault handler. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. Reference id -> 1c8f422059ae ("mm: change return type to vm_fault_t")
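The distinction this patch documents, a fault handler returning a VM_FAULT value rather than an errno, can be illustrated with a small user-space sketch. Everything below is an assumption made for illustration: `sketch_vm_fault_t`, the `SKETCH_VM_FAULT_*` values, and both functions are hypothetical stand-ins, not the kernel's definitions and not the code of this patch.

```c
#include <errno.h>

/* Hypothetical stand-in for a dedicated fault-result type. */
typedef unsigned int sketch_vm_fault_t;

/* Hypothetical fault codes; the real VM_FAULT_* values may differ. */
#define SKETCH_VM_FAULT_OOM    0x1u
#define SKETCH_VM_FAULT_SIGBUS 0x2u
#define SKETCH_VM_FAULT_NOPAGE 0x100u

/* Helper that reports failure the usual way: 0 or a negative errno. */
static int sketch_insert_page(int simulate_errno)
{
    return simulate_errno ? -simulate_errno : 0;
}

/* Fault handler: translates the errno into a fault code instead of
 * returning the errno itself. */
static sketch_vm_fault_t sketch_fault_handler(int simulate_errno)
{
    int err = sketch_insert_page(simulate_errno);

    if (err == -ENOMEM)
        return SKETCH_VM_FAULT_OOM;
    if (err < 0)
        return SKETCH_VM_FAULT_SIGBUS;
    return SKETCH_VM_FAULT_NOPAGE;
}
```

Giving the return value its own type makes it harder to return a negative errno from a handler by accident, which is the class of mistake the conversion is meant to expose.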

[PATCH] misc: cxl: Change return type to vm_fault_t

2018-04-17 Thread Souptick Joarder
Use new return type vm_fault_t for fault handler. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. Reference id -> 1c8f422059ae ("mm: change return type to vm_fault_t")

Re: [PATCH 2/2] powernv/npu: Add a debugfs setting to change ATSD threshold

2018-04-17 Thread Balbir Singh
On Tue, 17 Apr 2018 19:11:29 +1000 Alistair Popple wrote: > The threshold at which it becomes more efficient to coalesce a range of > ATSDs into a single per-PID ATSD is currently not well understood due to a > lack of real-world work loads. This patch adds a debugfs

Re: [PATCH] powerpc: platform: cell: spufs: Change return type to vm_fault_t

2018-04-17 Thread Matthew Wilcox
On Wed, Apr 18, 2018 at 12:50:38AM +0530, Souptick Joarder wrote: > Use new return type vm_fault_t for fault handler. For > now, this is just documenting that the function returns > a VM_FAULT value rather than an errno. Once all instances > are converted, vm_fault_t will become a distinct type. >

Re: [PATCH] powerpc: platform: cell: spufs: Change return type to vm_fault_t

2018-04-17 Thread Arnd Bergmann
On Tue, Apr 17, 2018 at 9:20 PM, Souptick Joarder wrote: > Use new return type vm_fault_t for fault handler. For > now, this is just documenting that the function returns > a VM_FAULT value rather than an errno. Once all instances > are converted, vm_fault_t will become a

Re: [RFC PATCH 1/3] signal: Ensure every siginfo we send has all bits initialized

2018-04-17 Thread Eric W. Biederman
Dave Martin writes: > Hmmm > > memset()/clear_siginfo() may ensure that there are no uninitialised > explicit fields except for those in inactive union members, but I'm not > sure that this approach is guaranteed to sanitise the padding seen by > userspace. > > Rationale

Re: [PATCH] powerpc: Allow selection of CONFIG_LD_DEAD_CODE_DATA_ELIMINATION

2018-04-17 Thread Mathieu Malaterre
On Tue, Apr 17, 2018 at 6:49 PM, Christophe LEROY wrote: > > > Le 17/04/2018 à 18:45, Mathieu Malaterre a écrit : >> >> On Tue, Apr 17, 2018 at 12:49 PM, Christophe Leroy >> wrote: >>> >>> This option does dead code and data elimination with the

[PATCH v2 2/2] powerpc/32be: use stmw/lmw for registers save/restore in asm

2018-04-17 Thread Christophe Leroy
arch/powerpc/Makefile activates -mmultiple on BE PPC32 configs in order to use multiple word instructions in functions entry/exit. The patch does the same for the asm parts, for consistency. On processors like the 8xx on which insn fetching is pretty slow, this speeds up registers save/restore

[PATCH v2 1/2] powerpc: avoid an unnecessary test and branch in longjmp()

2018-04-17 Thread Christophe Leroy
Doing the test at exit of the function avoids an unnecessary test and branch inside longjmp() Signed-off-by: Christophe Leroy --- v2: Swapped both patches in the series to reduce number of impacted lines arch/powerpc/kernel/misc.S | 9 - 1 file changed, 4

Re: [PATCH 2/6 v2] iommu: of: make of_pci_map_rid() available for other devices too

2018-04-17 Thread Robin Murphy
On 17/04/18 11:21, Nipun Gupta wrote: iommu-map property is also used by devices with fsl-mc. This patch moves the of_pci_map_rid to generic location, so that it can be used by other busses too. Signed-off-by: Nipun Gupta --- drivers/iommu/of_iommu.c | 106

Re: [PATCH] powerpc: Allow selection of CONFIG_LD_DEAD_CODE_DATA_ELIMINATION

2018-04-17 Thread Christophe LEROY
Le 17/04/2018 à 18:45, Mathieu Malaterre a écrit : On Tue, Apr 17, 2018 at 12:49 PM, Christophe Leroy wrote: This option does dead code and data elimination with the linker by compiling with -ffunction-sections -fdata-sections and linking with --gc-sections. By

Re: [PATCH] powerpc: Allow selection of CONFIG_LD_DEAD_CODE_DATA_ELIMINATION

2018-04-17 Thread Mathieu Malaterre
On Tue, Apr 17, 2018 at 12:49 PM, Christophe Leroy wrote: > This option does dead code and data elimination with the linker by > compiling with -ffunction-sections -fdata-sections and linking with > --gc-sections. > > By selecting this option on mpc885_ads_defconfig, >

Re: powerpc/modules: Fix crashes by adding CONFIG_RELOCATABLE to vermagic

2018-04-17 Thread Ard Biesheuvel
On 16 April 2018 at 16:10, Michael Ellerman wrote: > Ard Biesheuvel writes: > >> On 11 April 2018 at 16:49, Michael Ellerman >> wrote: >>> On Tue, 2018-04-10 at 01:22:06 UTC, Michael Ellerman wrote: If you

[PATCH v10 22/25] mm: speculative page fault handler return VMA

2018-04-17 Thread Laurent Dufour
When the speculative page fault handler is returning VM_RETRY, there is a chance that VMA fetched without grabbing the mmap_sem can be reused by the legacy page fault handler. By reusing it, we avoid calling find_vma() again. To achieve, that we must ensure that the VMA structure will not be

[PATCH v10 25/25] powerpc/mm: add speculative page fault

2018-04-17 Thread Laurent Dufour
This patch enables the speculative page fault on the PowerPC architecture. This will try a speculative page fault without holding the mmap_sem; if it returns with VM_FAULT_RETRY, the mmap_sem is acquired and the traditional page fault processing is done. The speculative path is only tried for
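The try-then-fall-back flow described in this patch can be sketched in user-space C. This is only an illustration under stated assumptions: `struct sketch_mm`, the `SKETCH_FAULT_*` codes and all three functions are hypothetical, the lock is modeled by a counter rather than a real semaphore, and the `vma_stable` flag stands in for whatever validation the real speculative handler performs.

```c
#include <stdbool.h>

#define SKETCH_FAULT_DONE  0
#define SKETCH_FAULT_RETRY 1

/* Hypothetical address-space object; the lock is modeled by a counter. */
struct sketch_mm {
    unsigned int lock_acquisitions; /* times the "mmap_sem" was taken */
    unsigned int locked_faults;     /* faults handled under the lock */
    unsigned int spec_faults;       /* faults handled speculatively */
};

/* Lock-free attempt: gives up with RETRY when validation fails. */
static int sketch_speculative_fault(struct sketch_mm *mm, bool vma_stable)
{
    if (!vma_stable)
        return SKETCH_FAULT_RETRY;
    mm->spec_faults++;
    return SKETCH_FAULT_DONE;
}

/* Traditional path: must only be called with the lock held. */
static int sketch_locked_fault(struct sketch_mm *mm)
{
    mm->locked_faults++;
    return SKETCH_FAULT_DONE;
}

/* Try the speculative path first; take the lock only on RETRY. */
static int sketch_handle_fault(struct sketch_mm *mm, bool vma_stable)
{
    int ret = sketch_speculative_fault(mm, vma_stable);

    if (ret != SKETCH_FAULT_RETRY)
        return ret;

    mm->lock_acquisitions++;        /* stand-in for taking mmap_sem */
    ret = sketch_locked_fault(mm);
    /* stand-in for releasing mmap_sem */
    return ret;
}
```

The point of the ordering is that the common case never touches the lock, so threads faulting on unrelated addresses do not serialize on it.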

[PATCH v10 24/25] x86/mm: add speculative pagefault handling

2018-04-17 Thread Laurent Dufour
From: Peter Zijlstra Try a speculative fault before acquiring mmap_sem, if it returns with VM_FAULT_RETRY continue with the mmap_sem acquisition and do the traditional fault. Signed-off-by: Peter Zijlstra (Intel) [Clearing of FAULT_FLAG_ALLOW_RETRY

[PATCH v10 23/25] mm: add speculative page fault vmstats

2018-04-17 Thread Laurent Dufour
Add speculative_pgfault vmstat counter to count successful speculative page fault handling. Also fixing a minor typo in include/linux/vm_event_item.h. Signed-off-by: Laurent Dufour --- include/linux/vm_event_item.h | 3 +++ mm/memory.c | 1 +

[PATCH v10 21/25] perf tools: add support for the SPF perf event

2018-04-17 Thread Laurent Dufour
Add support for the new speculative faults event. Acked-by: David Rientjes Signed-off-by: Laurent Dufour --- tools/include/uapi/linux/perf_event.h | 1 + tools/perf/util/evsel.c | 1 + tools/perf/util/parse-events.c| 4

[PATCH v10 20/25] perf: add a speculative page fault sw event

2018-04-17 Thread Laurent Dufour
Add a new software event to count succeeded speculative page faults. Acked-by: David Rientjes Signed-off-by: Laurent Dufour --- include/uapi/linux/perf_event.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/uapi/linux/perf_event.h

[PATCH v10 19/25] mm: adding speculative page fault failure trace events

2018-04-17 Thread Laurent Dufour
This patch adds a set of new trace events to collect the speculative page fault event failures. Signed-off-by: Laurent Dufour --- include/trace/events/pagefault.h | 88 mm/memory.c | 62

[PATCH v10 18/25] mm: provide speculative fault infrastructure

2018-04-17 Thread Laurent Dufour
From: Peter Zijlstra Provide infrastructure to do a speculative fault (not holding mmap_sem). The not holding of mmap_sem means we can race against VMA change/removal and page-table destruction. We use the SRCU VMA freeing to keep the VMA around. We use the VMA seqcount to

[PATCH v10 17/25] mm: protect mm_rb tree with a rwlock

2018-04-17 Thread Laurent Dufour
This change is inspired by Peter's proposed patch [1], which was protecting the VMA using SRCU. Unfortunately, SRCU is not scaling well in that particular case, and it is introducing major performance degradation due to excessive scheduling operations. To allow access to the mm_rb tree without

[PATCH v10 16/25] mm: introduce __page_add_new_anon_rmap()

2018-04-17 Thread Laurent Dufour
When dealing with the speculative page fault handler, we may race with the VMA being split or merged. In this case the vma->vm_start and vma->vm_end fields may not match the address at which the page fault is occurring. This can only happen when the VMA is split but in that case, the anon_vma pointer of the new

[PATCH v10 15/25] mm: introduce __vm_normal_page()

2018-04-17 Thread Laurent Dufour
When dealing with the speculative fault path we should use the VMA's field cached value stored in the vm_fault structure. Currently vm_normal_page() is using the pointer to the VMA to fetch the vm_flags value. This patch provides a new __vm_normal_page() which is receiving the vm_flags flags

[PATCH v10 14/25] mm: introduce __lru_cache_add_active_or_unevictable

2018-04-17 Thread Laurent Dufour
The speculative page fault handler which is run without holding the mmap_sem is calling lru_cache_add_active_or_unevictable() but the vm_flags is not guaranteed to remain constant. Introducing __lru_cache_add_active_or_unevictable() which has the vma flags value parameter instead of the vma

[PATCH v10 13/25] mm/migrate: Pass vm_fault pointer to migrate_misplaced_page()

2018-04-17 Thread Laurent Dufour
migrate_misplaced_page() is only called during the page fault handling so it's better to pass the pointer to the struct vm_fault instead of the vma. This way during the speculative page fault path the saved vma->vm_flags could be used. Acked-by: David Rientjes

[PATCH v10 12/25] mm: cache some VMA fields in the vm_fault structure

2018-04-17 Thread Laurent Dufour
When handling a speculative page fault, the vma->vm_flags and vma->vm_page_prot fields are read once the page table lock is released. So there is no longer any guarantee that these fields will not change behind our back. They will be saved in the vm_fault structure before the VMA is checked for changes.

[PATCH v10 11/25] mm: protect SPF handler against anon_vma changes

2018-04-17 Thread Laurent Dufour
The speculative page fault handler must be protected against anon_vma changes. This is because page_add_new_anon_rmap() is called during the speculative path. In addition, don't try a speculative page fault if the VMA doesn't have an anon_vma structure allocated because its allocation should be

[PATCH v10 10/25] mm: protect mremap() against SPF hanlder

2018-04-17 Thread Laurent Dufour
If a thread is remapping an area while another one is faulting on the destination area, the SPF handler may fetch the vma from the RB tree before the pte has been moved by the other thread. This means that the moved ptes will overwrite those created by the page fault handler, leading to leaked pages.

[PATCH v10 09/25] mm: protect VMA modifications using VMA sequence count

2018-04-17 Thread Laurent Dufour
The VMA sequence count has been introduced to allow fast detection of VMA modification when running a page fault handler without holding the mmap_sem. This patch provides protection against the VMA modification done in : - madvise() - mpol_rebind_policy() -

[PATCH v10 08/25] mm: VMA sequence count

2018-04-17 Thread Laurent Dufour
From: Peter Zijlstra Wrap the VMA modifications (vma_adjust/unmap_page_range) with sequence counts such that we can easily test if a VMA is changed. The unmap_page_range() one allows us to make assumptions about page-tables; when we find the seqcount hasn't changed we can
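The sequence-count idea described here can be sketched with C11 atomics. This is a minimal user-space illustration, not the kernel's seqcount API: `struct sketch_vma` and the three functions are hypothetical names, and the sketch assumes writers are already serialized against each other by some other lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical VMA-like object guarded by a sequence count. */
struct sketch_vma {
    atomic_uint seq;               /* odd while a writer is in progress */
    _Atomic unsigned long start;
    _Atomic unsigned long end;
};

/* Writer: bump the count before and after the modification. */
static void sketch_vma_write(struct sketch_vma *vma,
                             unsigned long start, unsigned long end)
{
    atomic_fetch_add(&vma->seq, 1);    /* count becomes odd */
    atomic_store(&vma->start, start);
    atomic_store(&vma->end, end);
    atomic_fetch_add(&vma->seq, 1);    /* count is even again */
}

/* Reader: sample the count before reading any field. */
static unsigned int sketch_read_begin(struct sketch_vma *vma)
{
    return atomic_load(&vma->seq);
}

/* Reader: true if the fields read since sketch_read_begin() cannot be
 * trusted, either because a write was in progress or one has happened. */
static bool sketch_read_retry(struct sketch_vma *vma, unsigned int seq)
{
    return (seq & 1u) || atomic_load(&vma->seq) != seq;
}
```

A reader calls `sketch_read_begin()`, reads the fields it needs, then calls `sketch_read_retry()`; an unchanged even count means no writer touched the object in between, which is the cheap "has this VMA changed?" test the patch description refers to.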

[PATCH v10 07/25] mm: introduce INIT_VMA()

2018-04-17 Thread Laurent Dufour
Some VMA struct fields need to be initialized once the VMA structure is allocated. Currently this only concerns anon_vma_chain field but some other will be added to support the speculative page fault. Instead of spreading the initialization calls all over the code, let's introduce a dedicated

[PATCH v10 06/25] mm: make pte_unmap_same compatible with SPF

2018-04-17 Thread Laurent Dufour
pte_unmap_same() is making the assumption that the page tables are still around because the mmap_sem is held. This is no longer the case when running a speculative page fault, and an additional check must be made to ensure that the final page table is still there. This is now done by calling

[PATCH v10 05/25] mm: introduce pte_spinlock for FAULT_FLAG_SPECULATIVE

2018-04-17 Thread Laurent Dufour
When handling a page fault without holding the mmap_sem, the fetch of the pte lock pointer and the locking will have to be done while ensuring that the VMA is not touched behind our back. So move the fetch and locking operations into a dedicated function. Signed-off-by: Laurent Dufour

[PATCH v10 04/25] mm: prepare for FAULT_FLAG_SPECULATIVE

2018-04-17 Thread Laurent Dufour
From: Peter Zijlstra When speculating faults (without holding mmap_sem) we need to validate that the vma against which we loaded pages is still valid when we're ready to install the new PTE. Therefore, replace the pte_offset_map_lock() calls that (re)take the PTL with

[PATCH v10 03/25] powerpc/mm: set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT

2018-04-17 Thread Laurent Dufour
Set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT for BOOK3S_64. This enables the Speculative Page Fault handler. Support is only provided for BOOK3S_64 currently because: - require CONFIG_PPC_STD_MMU because checks done in set_access_flags_filter() - require BOOK3S because we can't support for

[PATCH v10 02/25] x86/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT

2018-04-17 Thread Laurent Dufour
Set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT which turns on the Speculative Page Fault handler when building for 64bit. Cc: Thomas Gleixner Signed-off-by: Laurent Dufour --- arch/x86/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git

[PATCH v10 01/25] mm: introduce CONFIG_SPECULATIVE_PAGE_FAULT

2018-04-17 Thread Laurent Dufour
This configuration variable will be used to build the code needed to handle speculative page fault. By default it is turned off, and activated depending on architecture support, SMP and MMU. Suggested-by: Thomas Gleixner Suggested-by: David Rientjes

[PATCH v10 00/25] Speculative page faults

2018-04-17 Thread Laurent Dufour
This is a port on kernel 4.16 of the work done by Peter Zijlstra to handle page fault without holding the mm semaphore [1]. The idea is to try to handle user space page faults without holding the mmap_sem. This should allow better concurrency for massively threaded process since the page fault

Re: [RFC PATCH 1/3] signal: Ensure every siginfo we send has all bits initialized

2018-04-17 Thread Dave Martin
On Sun, Apr 15, 2018 at 10:57:33AM -0500, Eric W. Biederman wrote: > > Call clear_siginfo to ensure every stack allocated siginfo is properly > initialized before being passed to the signal sending functions. > > Note: It is not safe to depend on C initializers to initialize struct > siginfo on

[PATCH] powerpc/time: remove to_tm and use RTC_LIB

2018-04-17 Thread Christophe Leroy
RTC_LIB includes a generic function to convert RTC data into struct rtc_time. Use it and remove to_tm(). Signed-off-by: Christophe Leroy --- arch/powerpc/Kconfig| 1 + arch/powerpc/include/asm/time.h | 1 - arch/powerpc/kernel/rtas-proc.c

Re: [PATCH] powerpc/8xx: Build fix with Hugetlbfs enabled

2018-04-17 Thread Christophe LEROY
Le 16/04/2018 à 13:27, Aneesh Kumar K.V a écrit : 8xx use slice code when hugetlbfs is enabled. We missed a header include on 8xx which resulted in the below build failure. config: mpc885_ads_defconfig + CONFIG_HUGETLBFS CC arch/powerpc/mm/slice.o arch/powerpc/mm/slice.c: In

[PATCH] powerpc/8xx: Remove RTC clock on 88x

2018-04-17 Thread Christophe Leroy
The 885 family processors don't have the Real Time Clock Signed-off-by: Christophe Leroy --- arch/powerpc/platforms/8xx/adder875.c| 2 -- arch/powerpc/platforms/8xx/ep88xc.c | 2 -- arch/powerpc/platforms/8xx/mpc885ads_setup.c | 2 -- 3 files changed,

[PATCH] powerpc/boot: remove unused variable in mpc8xx

2018-04-17 Thread Christophe Leroy
Variable div is set but never used. Remove it. Signed-off-by: Christophe Leroy --- arch/powerpc/boot/mpc8xx.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/powerpc/boot/mpc8xx.c b/arch/powerpc/boot/mpc8xx.c index add55a7f184f..c9bd9285c548

[PATCH] powerpc/misc: merge reloc_offset() and add_reloc_offset()

2018-04-17 Thread Christophe Leroy
reloc_offset() is the same as add_reloc_offset(0) Signed-off-by: Christophe Leroy --- arch/powerpc/kernel/misc.S | 17 +++-- 1 file changed, 3 insertions(+), 14 deletions(-) diff --git a/arch/powerpc/kernel/misc.S b/arch/powerpc/kernel/misc.S index

[PATCH] powerpc: Allow selection of CONFIG_LD_DEAD_CODE_DATA_ELIMINATION

2018-04-17 Thread Christophe Leroy
This option does dead code and data elimination with the linker by compiling with -ffunction-sections -fdata-sections and linking with --gc-sections. By selecting this option on mpc885_ads_defconfig, vmlinux LOAD segment size gets reduced by 10% Program Header before the patch: LOAD off

[PATCH 6/6 v2] arm64: dts: ls208xa: comply with the iommu map binding for fsl_mc

2018-04-17 Thread Nipun Gupta
Fsl-mc bus now supports the iommu-map property. Comply with this binding for fsl_mc bus. This patch also updates the dts w.r.t. the DMA configuration. Signed-off-by: Nipun Gupta --- arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi | 6 +- 1 file changed, 5 insertions(+), 1

[PATCH 5/6 v2] bus: fsl-mc: supoprt dma configure for devices on fsl-mc bus

2018-04-17 Thread Nipun Gupta
Signed-off-by: Nipun Gupta --- drivers/bus/fsl-mc/fsl-mc-bus.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c index 5d8266c..624828b 100644 ---

[PATCH 4/6 v2] iommu: arm-smmu: Add support for the fsl-mc bus

2018-04-17 Thread Nipun Gupta
Implement bus specific support for the fsl-mc bus including registering arm_smmu_ops and bus specific device add operations. Signed-off-by: Nipun Gupta --- drivers/iommu/arm-smmu.c | 7 +++ drivers/iommu/iommu.c| 21 + include/linux/fsl/mc.h |

[PATCH 3/6 v2] iommu: support iommu configuration for fsl-mc devices

2018-04-17 Thread Nipun Gupta
Signed-off-by: Nipun Gupta --- drivers/iommu/of_iommu.c | 20 1 file changed, 20 insertions(+) diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c index 4e7712f..af4fc3b 100644 --- a/drivers/iommu/of_iommu.c +++ b/drivers/iommu/of_iommu.c

[PATCH 2/6 v2] iommu: of: make of_pci_map_rid() available for other devices too

2018-04-17 Thread Nipun Gupta
iommu-map property is also used by devices with fsl-mc. This patch moves the of_pci_map_rid to generic location, so that it can be used by other busses too. Signed-off-by: Nipun Gupta --- drivers/iommu/of_iommu.c | 106 +--

[PATCH 1/6 v2] Docs: dt: add fsl-mc iommu-map device-tree binding

2018-04-17 Thread Nipun Gupta
The existing IOMMU bindings cannot be used to specify the relationship between fsl-mc devices and IOMMUs. This patch adds a generic binding for mapping fsl-mc devices to IOMMUs, using iommu-map property. Signed-off-by: Nipun Gupta ---

[PATCH 0/6 v2] Support for fsl-mc bus and its devices in SMMU

2018-04-17 Thread Nipun Gupta
This patchset defines IOMMU DT binding for fsl-mc bus and adds support in SMMU for fsl-mc bus. This patch series is dependent on patchset: https://patchwork.kernel.org/patch/10317337/ These patches - Define property 'iommu-map' for fsl-mc bus (patch 1) - Integrates the fsl-mc bus with the SMMU

Re: [1/5] powerpc/lib: Fix off-by-one in alternate feature patching

2018-04-17 Thread Michael Ellerman
On Mon, 2018-04-16 at 14:39:01 UTC, Michael Ellerman wrote: > When we patch an alternate feature section, we have to adjust any > relative branches that branch out of the alternate section. > > But currently we have a bug if we have a branch that points to past > the last instruction of the

Re: powerpc/64s: Default l1d_size to 64K in RFI fallback flush

2018-04-17 Thread Michael Ellerman
On Tue, 2018-04-17 at 01:49:20 UTC, Michael Ellerman wrote: > From: Madhavan Srinivasan > > If there is no d-cache-size property in the device tree, l1d_size could > be zero. We don't actually expect that to happen, it's only been seen > on mambo (simulator) in some

[RESEND PATCH 1/3] powerpc: dts: use 'atmel' as at24 anufacturer for pdm360ng

2018-04-17 Thread Bartosz Golaszewski
Using 'at' as the part of the compatible string is now deprecated. Use a correct string: 'atmel,'. Signed-off-by: Bartosz Golaszewski --- arch/powerpc/boot/dts/pdm360ng.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/boot/dts/pdm360ng.dts

[RESEND PATCH 3/3] powerpc: dts: use a correct at24 compatible fallback in ac14xx

2018-04-17 Thread Bartosz Golaszewski
Using 'at24' as fallback is now deprecated - use the full 'atmel,' string. Signed-off-by: Bartosz Golaszewski --- arch/powerpc/boot/dts/ac14xx.dts | 20 ++-- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/boot/dts/ac14xx.dts

[RESEND PATCH 2/3] powerpc: dts: use 'atmel' as at24 manufacturer for kmcent2

2018-04-17 Thread Bartosz Golaszewski
Using compatible strings without the part for at24 is now deprecated. Use a correct 'atmel,' value. Signed-off-by: Bartosz Golaszewski --- arch/powerpc/boot/dts/fsl/kmcent2.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git

Re: [PATCH] powerpc/misc: get rid of add_reloc_offset()

2018-04-17 Thread Paul Mackerras
On Tue, Apr 17, 2018 at 09:56:24AM +0200, Christophe Leroy wrote: > add_reloc_offset() is almost redundant with reloc_offset() > > Signed-off-by: Christophe Leroy > --- > arch/powerpc/include/asm/setup.h | 3 +-- > arch/powerpc/kernel/misc.S | 16

Re: [PATCH 1/2] powernv/npu: Do a PID GPU TLB flush when invalidating a large address range

2018-04-17 Thread Balbir Singh
On Tue, Apr 17, 2018 at 7:11 PM, Alistair Popple wrote: > The NPU has a limited number of address translation shootdown (ATSD) > registers and the GPU has limited bandwidth to process ATSDs. This can > result in contention of ATSD registers leading to soft lockups on some >

[PATCH 2/2] powernv/npu: Add a debugfs setting to change ATSD threshold

2018-04-17 Thread Alistair Popple
The threshold at which it becomes more efficient to coalesce a range of ATSDs into a single per-PID ATSD is currently not well understood due to a lack of real-world work loads. This patch adds a debugfs parameter allowing the threshold to be altered at runtime in order to aid future development

[PATCH 1/2] powernv/npu: Do a PID GPU TLB flush when invalidating a large address range

2018-04-17 Thread Alistair Popple
The NPU has a limited number of address translation shootdown (ATSD) registers and the GPU has limited bandwidth to process ATSDs. This can result in contention of ATSD registers leading to soft lockups on some threads, particularly when invalidating a large address range in

[PATCH] powerpc/misc: get rid of add_reloc_offset()

2018-04-17 Thread Christophe Leroy
add_reloc_offset() is almost redundant with reloc_offset() Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/setup.h | 3 +-- arch/powerpc/kernel/misc.S | 16 arch/powerpc/kernel/prom_init_check.sh | 2 +- 3 files changed,

[PATCH 7/7] powerpc/lib: Remove .balign inside string functions for PPC32

2018-04-17 Thread Christophe Leroy
commit 87a156fb18fe1 ("Align hot loops of some string functions") degraded the performance of string functions by adding useless nops. A simple benchmark on an 8xx calling 10x a memchr() that matches the first byte runs in 41668 TB ticks before this patch and in 35986 TB ticks after this

[PATCH 6/7] powerpc/lib: inline more NUL size verifications

2018-04-17 Thread Christophe Leroy
strncmp(), strncpy(), memchr() are often called with constant size. This patch gives GCC a chance to optimise NULL size verification out Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/string.h | 24 arch/powerpc/lib/string.S

[PATCH 5/7] powerpc/lib: optimise 32 bits __clear_user()

2018-04-17 Thread Christophe Leroy
Rewrite clear_user() on the same principle as memset(0), making use of dcbz to clear complete cache lines. This code is a copy/paste of memset(), with some modifications in order to retrieve remaining number of bytes to be cleared, as it needs to be returned in case of error. On a MPC885,

[PATCH 4/7] powerpc/lib: inline memcmp() for small constant sizes

2018-04-17 Thread Christophe Leroy
In my 8xx configuration, I get 208 calls to memcmp(). Within those 208 calls, about half of them have constant sizes, 46 have a size of 8, 17 have a size of 16, only a few have a size over 16. Other fixed sizes are mostly 4, 6 and 10. This patch inlines calls to memcmp() when size is constant and

[PATCH 3/7] powerpc/lib: optimise PPC32 memcmp

2018-04-17 Thread Christophe Leroy
At the time being, memcmp() compares two chunks of memory byte by byte. This patch optimises the comparison by comparing word by word. A small benchmark performed on an 8xx based on the comparison of two chunks of 512 bytes performed 10 times gives: Before : 5852274 TB ticks After:
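The word-by-word approach can be sketched in portable C. This is not the patch's PPC32 assembly: `sketch_memcmp_words` is a hypothetical name, and the sketch falls back to a byte loop to locate the differing byte so that the result does not depend on host endianness.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Compare n bytes, testing 32-bit words for equality where possible.
 * Returns a negative, zero or positive value like memcmp(). */
static int sketch_memcmp_words(const void *a, const void *b, size_t n)
{
    const unsigned char *p = a;
    const unsigned char *q = b;

    /* Skip over equal words while at least one full word remains. */
    while (n >= sizeof(uint32_t)) {
        uint32_t x, y;

        memcpy(&x, p, sizeof(x));   /* memcpy avoids unaligned loads */
        memcpy(&y, q, sizeof(y));
        if (x != y)
            break;                  /* the differing byte is in this word */
        p += sizeof(x);
        q += sizeof(y);
        n -= sizeof(x);
    }

    /* Byte loop: pinpoints the difference, or handles the short tail. */
    while (n--) {
        if (*p != *q)
            return *p < *q ? -1 : 1;
        p++;
        q++;
    }
    return 0;
}
```

The saving comes from the first loop, which does one comparison per four bytes for as long as the buffers match; on a big-endian target the ordering could presumably be taken directly from the word comparison instead of the byte fallback.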

[PATCH 2/7] powerpc/lib: inline memcmp() NUL size verification

2018-04-17 Thread Christophe Leroy
Many calls to memcmp() are done with constant size. This patch gives GCC a chance to optimise out the NULL size verification. Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/string.h | 10 ++ arch/powerpc/lib/memcmp_64.S | 4
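One way an inline wrapper can give the compiler this chance is shown below. It is a sketch under assumptions, since the snippet does not show the patch body: `sketch_memcmp` is a hypothetical name, and `__builtin_constant_p` is a GCC/Clang builtin rather than standard C.

```c
#include <stddef.h>
#include <string.h>

/* Inline wrapper: when n is a compile-time constant, the compiler can
 * resolve the zero-size test at build time, keeping only the early
 * return (n == 0) or only the call (n != 0). */
static inline int sketch_memcmp(const void *p, const void *q, size_t n)
{
    if (__builtin_constant_p(n) && n == 0)
        return 0;
    return memcmp(p, q, n);
}
```

With the check visible at the call site, the out-of-line routine no longer has to be entered just to discover that there is nothing to compare.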

[PATCH 1/7] powerpc/lib: move PPC32 specific functions out of string.S

2018-04-17 Thread Christophe Leroy
In preparation of optimisation patches, move PPC32 specific memcmp() and __clear_user() into string_32.S Signed-off-by: Christophe Leroy --- arch/powerpc/lib/Makefile| 5 +-- arch/powerpc/lib/string.S| 61 -