[PATCH] powerpc/mm/pgtable: Split mappings on hot-unplug

2018-02-06 Thread Balbir Singh
This patch splits a linear mapping if the hot-unplug range is smaller than the mapping size. The code detects if the mapping needs to be split into a smaller size and if so, uses the stop machine infrastructure to map the current linear mapping with a smaller size mapping. Then the requested
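
The split decision described in this snippet can be sketched in a few lines of C. This is a minimal illustration, not code from the patch: the helper name needs_split() and its parameters are hypothetical, and it assumes mapping sizes are powers of two. It also generalises the stated condition slightly, treating any unplug range whose start or end is not aligned to the mapping size as one that cuts through a mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): returns true when removing
 * [start, start + len) would cut through a linear mapping of map_size
 * bytes, so that the mapping would first have to be split.
 */
static bool needs_split(uint64_t start, uint64_t len, uint64_t map_size)
{
	/* Assumption: mapping sizes are non-zero powers of two. */
	assert(map_size != 0 && (map_size & (map_size - 1)) == 0);

	uint64_t mask = map_size - 1;

	/* A misaligned start or end means some mapping is only partly covered. */
	return ((start | (start + len)) & mask) != 0;
}
```

With a 1 GB mapping, a 16 MB unplug request would report true here, while a request covering exactly one aligned 1 GB mapping would report false.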

Re: [PATCH v2] powerpc/npu: Cleanup MMIO ATSD flushing

2018-02-06 Thread Alistair Popple
On Tuesday, 16 January 2018 3:15:05 PM AEDT Alistair Popple wrote: > Thanks Balbir, one question below. I have no way of testing this at present > but > it looks ok to me. Thanks! The below are more future optimisations once we can test. So in the meantime: Acked-by: Alistair Popple

Re: [PATCH] char: nvram: disable on ARM

2018-02-06 Thread Alexandre Belloni
On 06/02/2018 at 23:55:02 +0100, Arnd Bergmann wrote: > * arch/arm/kernel/time.c has this code > > #if defined(CONFIG_RTC_DRV_CMOS) || defined(CONFIG_RTC_DRV_CMOS_MODULE) || \ > defined(CONFIG_NVRAM) || defined(CONFIG_NVRAM_MODULE) > /* this needs a better home */ > DEFINE_SPINLOCK(rtc_lock);

[PATCH] powerpc/64s/radix: kernel boot-time NULL pointer protection using a guard-PID

2018-02-06 Thread Nicholas Piggin
This change restores and formalises the behaviour that access to NULL or other user addresses by the kernel during boot should fault rather than succeed and modify memory. This was inadvertently broken when fixing another bug, because it was previously not well defined and only worked by chance.

Re: [PATCH] char: nvram: disable on ARM

2018-02-06 Thread Arnd Bergmann
On Tue, Feb 6, 2018 at 11:05 PM, Alexandre Belloni wrote: > /dev/nvram was never meant to be used alongside the RTC CMOS driver from > drivers/rtc as it already exposes the NVRAM through another interface. > Anyway, the last defconfig to enable it properly was

Re: [PATCH v7 04/24] mm: Dont assume page-table invariance during faults

2018-02-06 Thread Matthew Wilcox
On Tue, Feb 06, 2018 at 05:49:50PM +0100, Laurent Dufour wrote: > From: Peter Zijlstra > > One of the side effects of speculating on faults (without holding > mmap_sem) is that we can race with free_pgtables() and therefore we > cannot assume the page-tables will stick

[PATCH v7 21/24] perf tools: Add support for the SPF perf event

2018-02-06 Thread Laurent Dufour
Add support for the new speculative faults event. Signed-off-by: Laurent Dufour --- tools/include/uapi/linux/perf_event.h | 1 + tools/perf/util/evsel.c | 1 + tools/perf/util/parse-events.c| 4 tools/perf/util/parse-events.l| 1 +

[PATCH v7 24/24] powerpc/mm: Add speculative page fault

2018-02-06 Thread Laurent Dufour
This patch enables the speculative page fault on the PowerPC architecture. This will try a speculative page fault without holding the mmap_sem; if it returns with VM_FAULT_RETRY, the mmap_sem is acquired and the traditional page fault processing is done. The speculative path is only tried for
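
The try-then-fall-back flow described in this snippet might look roughly like the following user-space sketch. Everything here is a hypothetical stand-in: struct mm_sim, its fields, and the three functions are not kernel names, the lock is simulated with a plain flag, and the code is single-threaded, so it only shows the control flow, not the real synchronisation.

```c
#include <assert.h>
#include <stdbool.h>

enum fault_result { FAULT_DONE, FAULT_RETRY };

/* Hypothetical stand-in for an mm; none of these fields are kernel names. */
struct mm_sim {
	int  mmap_sem_held;     /* 1 while the simulated mmap_sem is held */
	bool vma_changed;       /* set when the speculative path must give up */
	int  speculative_done;  /* faults completed without the lock */
	int  classic_done;      /* faults completed under the lock */
};

/* Speculative attempt: runs without the lock and may ask for a retry. */
static enum fault_result speculative_fault(struct mm_sim *mm)
{
	assert(!mm->mmap_sem_held);
	if (mm->vma_changed)
		return FAULT_RETRY;
	mm->speculative_done++;
	return FAULT_DONE;
}

/* Traditional path: only valid while the lock is held. */
static void classic_fault(struct mm_sim *mm)
{
	assert(mm->mmap_sem_held);
	mm->classic_done++;
}

static void handle_fault(struct mm_sim *mm)
{
	if (speculative_fault(mm) == FAULT_DONE)
		return;                 /* fast path succeeded, lock never taken */

	mm->mmap_sem_held = 1;          /* stands in for acquiring mmap_sem */
	classic_fault(mm);
	mm->mmap_sem_held = 0;          /* stands in for releasing it */
}
```

The point of the shape is that the lock is only touched when the speculative attempt reports a retry, which is where the hoped-for concurrency gain would come from.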

[PATCH v7 23/24] x86/mm: Add speculative pagefault handling

2018-02-06 Thread Laurent Dufour
From: Peter Zijlstra Try a speculative fault before acquiring mmap_sem; if it returns with VM_FAULT_RETRY, continue with the mmap_sem acquisition and do the traditional fault. Signed-off-by: Peter Zijlstra (Intel) [Clearing of FAULT_FLAG_ALLOW_RETRY

[PATCH v7 12/24] mm/migrate: Pass vm_fault pointer to migrate_misplaced_page()

2018-02-06 Thread Laurent Dufour
migrate_misplaced_page() is only called during the page fault handling so it's better to pass the pointer to the struct vm_fault instead of the vma. This way during the speculative page fault path the saved vma->vm_flags could be used. Signed-off-by: Laurent Dufour

[PATCH v7 22/24] mm: Speculative page fault handler return VMA

2018-02-06 Thread Laurent Dufour
When the speculative page fault handler is returning VM_RETRY, there is a chance that the VMA fetched without grabbing the mmap_sem can be reused by the legacy page fault handler. By reusing it, we avoid calling find_vma() again. To achieve that, we must ensure that the VMA structure will not be

[PATCH v7 19/24] mm: Adding speculative page fault failure trace events

2018-02-06 Thread Laurent Dufour
This patch adds a set of new trace events to collect the speculative page fault event failures. Signed-off-by: Laurent Dufour --- include/trace/events/pagefault.h | 87 mm/memory.c | 62

[PATCH v7 20/24] perf: Add a speculative page fault sw event

2018-02-06 Thread Laurent Dufour
Add a new software event to count successful speculative page faults. Signed-off-by: Laurent Dufour --- include/uapi/linux/perf_event.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index

[PATCH v7 18/24] mm: Provide speculative fault infrastructure

2018-02-06 Thread Laurent Dufour
From: Peter Zijlstra Provide infrastructure to do a speculative fault (not holding mmap_sem). The not holding of mmap_sem means we can race against VMA change/removal and page-table destruction. We use the SRCU VMA freeing to keep the VMA around. We use the VMA seqcount to

[PATCH v7 17/24] mm: Protect mm_rb tree with a rwlock

2018-02-06 Thread Laurent Dufour
This change is inspired by Peter's proposed patch [1], which protected the VMA using SRCU. Unfortunately, SRCU does not scale well in that particular case, and it introduces major performance degradation due to excessive scheduling operations. To allow access to the mm_rb tree without

[PATCH v7 16/24] mm: Introduce __page_add_new_anon_rmap()

2018-02-06 Thread Laurent Dufour
When dealing with the speculative page fault handler, we may race with the VMA being split or merged. In this case the vma->vm_start and vma->vm_end fields may not match the address at which the page fault is occurring. This can only happen when the VMA is split but in that case, the anon_vma pointer of the new

[PATCH v7 15/24] mm: Introduce __vm_normal_page()

2018-02-06 Thread Laurent Dufour
When dealing with the speculative fault path we should use the VMA's field cached value stored in the vm_fault structure. Currently vm_normal_page() is using the pointer to the VMA to fetch the vm_flags value. This patch provides a new __vm_normal_page() which is receiving the vm_flags flags

[PATCH v7 14/24] mm: Introduce __maybe_mkwrite()

2018-02-06 Thread Laurent Dufour
The current maybe_mkwrite() is passed the pointer to the vma structure to fetch the vm_flags field. When dealing with the speculative page fault handler, it will be better to rely on the cached vm_flags value stored in the vm_fault structure. This patch introduces a __maybe_mkwrite()

[PATCH v7 13/24] mm: Introduce __lru_cache_add_active_or_unevictable

2018-02-06 Thread Laurent Dufour
The speculative page fault handler which is run without holding the mmap_sem is calling lru_cache_add_active_or_unevictable() but the vm_flags is not guaranteed to remain constant. Introducing __lru_cache_add_active_or_unevictable() which has the vma flags value parameter instead of the vma

[PATCH v7 11/24] mm: Cache some VMA fields in the vm_fault structure

2018-02-06 Thread Laurent Dufour
When handling a speculative page fault, the vma->vm_flags and vma->vm_page_prot fields are read once the page table lock is released. So there is no longer any guarantee that these fields will not change behind our back. They will be saved in the vm_fault structure before the VMA is checked for changes.

[PATCH v7 10/24] mm: Protect SPF handler against anon_vma changes

2018-02-06 Thread Laurent Dufour
The speculative page fault handler must be protected against anon_vma changes. This is because page_add_new_anon_rmap() is called during the speculative path. In addition, don't try a speculative page fault if the VMA doesn't have an anon_vma structure allocated because its allocation should be

[PATCH v7 09/24] mm: protect mremap() against SPF hanlder

2018-02-06 Thread Laurent Dufour
If a thread is remapping an area while another one is faulting on the destination area, the SPF handler may fetch the vma from the RB tree before the pte has been moved by the other thread. This means that the moved ptes will overwrite those created by the page fault handler, leading to leaked pages.

[PATCH v7 08/24] mm: Protect VMA modifications using VMA sequence count

2018-02-06 Thread Laurent Dufour
The VMA sequence count has been introduced to allow fast detection of VMA modification when running a page fault handler without holding the mmap_sem. This patch provides protection against the VMA modification done in : - madvise() - mpol_rebind_policy() -

[PATCH v7 07/24] mm: VMA sequence count

2018-02-06 Thread Laurent Dufour
From: Peter Zijlstra Wrap the VMA modifications (vma_adjust/unmap_page_range) with sequence counts such that we can easily test if a VMA is changed. The unmap_page_range() one allows us to make assumptions about page-tables; when we find the seqcount hasn't changed we can
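
The sequence-count idea in this snippet can be illustrated with C11 atomics in user space. This is a sketch under stated assumptions, not the kernel's seqcount API: struct vma_sim and the vma_*() helpers are hypothetical names, the counter is odd while a writer is mid-update, and the data fields are plain variables, so the example is meant to be read and run single-threaded.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a VMA guarded by a sequence count. */
struct vma_sim {
	atomic_uint   seq;      /* odd while a writer is modifying the VMA */
	unsigned long vm_start;
	unsigned long vm_end;
};

static void vma_init(struct vma_sim *v, unsigned long start, unsigned long end)
{
	atomic_init(&v->seq, 0);
	v->vm_start = start;
	v->vm_end = end;
}

/* Writer side: bump the count before and after every modification. */
static void vma_write_begin(struct vma_sim *v)
{
	atomic_fetch_add(&v->seq, 1);
}

static void vma_write_end(struct vma_sim *v)
{
	unsigned prev = atomic_fetch_add(&v->seq, 1);
	assert(prev & 1);       /* must pair with vma_write_begin() */
}

/* Reader side: sample the count, do the work, then check it again. */
static unsigned vma_read_begin(struct vma_sim *v)
{
	return atomic_load(&v->seq);
}

static bool vma_read_retry(struct vma_sim *v, unsigned start)
{
	/* Give up if a write was in progress or completed in between. */
	return (start & 1) || atomic_load(&v->seq) != start;
}
```

A reader that sees vma_read_retry() return false can treat what it read between the two calls as a consistent snapshot; otherwise it would discard its work and fall back to the locked path.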

[PATCH v7 06/24] mm: Introduce pte_spinlock for FAULT_FLAG_SPECULATIVE

2018-02-06 Thread Laurent Dufour
When handling a page fault without holding the mmap_sem, the fetch of the pte lock pointer and the locking will have to be done while ensuring that the VMA is not touched behind our back. So move the fetch and locking operations into a dedicated function. Signed-off-by: Laurent Dufour

[PATCH v7 04/24] mm: Dont assume page-table invariance during faults

2018-02-06 Thread Laurent Dufour
From: Peter Zijlstra One of the side effects of speculating on faults (without holding mmap_sem) is that we can race with free_pgtables() and therefore we cannot assume the page-tables will stick around. Remove the reliance on the pte pointer. Signed-off-by: Peter

[PATCH v7 01/24] mm: Introduce CONFIG_SPECULATIVE_PAGE_FAULT

2018-02-06 Thread Laurent Dufour
This configuration variable will be used to build the code needed to handle speculative page fault. By default it is turned off, and activated depending on architecture support. Suggested-by: Thomas Gleixner Signed-off-by: Laurent Dufour ---

[PATCH v7 05/24] mm: Prepare for FAULT_FLAG_SPECULATIVE

2018-02-06 Thread Laurent Dufour
From: Peter Zijlstra When speculating faults (without holding mmap_sem) we need to validate that the vma against which we loaded pages is still valid when we're ready to install the new PTE. Therefore, replace the pte_offset_map_lock() calls that (re)take the PTL with

[PATCH v7 03/24] powerpc/mm: Define CONFIG_SPECULATIVE_PAGE_FAULT

2018-02-06 Thread Laurent Dufour
Define CONFIG_SPECULATIVE_PAGE_FAULT for BOOK3S_64 and SMP. This enables the Speculative Page Fault handler. Support is only provided for BOOK3S_64 currently because: - require CONFIG_PPC_STD_MMU because checks done in set_access_flags_filter() - require BOOK3S because we can't support for

[PATCH v7 02/24] x86/mm: Define CONFIG_SPECULATIVE_PAGE_FAULT

2018-02-06 Thread Laurent Dufour
Introduce CONFIG_SPECULATIVE_PAGE_FAULT, which turns on the Speculative Page Fault handler when building for 64-bit with SMP. Cc: Thomas Gleixner Signed-off-by: Laurent Dufour --- arch/x86/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git

[PATCH v7 00/24] Speculative page faults

2018-02-06 Thread Laurent Dufour
This is a port to kernel 4.15 of the work done by Peter Zijlstra to handle page faults without holding the mm semaphore [1]. The idea is to try to handle user space page faults without holding the mmap_sem. This should allow better concurrency for massively threaded processes since the page fault

Re: rtc-opal: Fix handling of firmware error codes, prevent busy loops

2018-02-06 Thread Michael Ellerman
Alexandre Belloni writes: > On 06/02/2018 at 16:22:47 +1100, Michael Ellerman wrote: >> > Just a note to let you know that this patch should have gone through my >> > tree but it was not sent to linux-rtc or me. >> >> Sorry, I saw it had been languishing for a

Re: [GIT PULL] Please pull powerpc/linux.git powerpc-4.16-1 tag

2018-02-06 Thread Michael Ellerman
Linus Torvalds writes: > Hmm. This adds a > >static inline void pci_uevent_ers(struct pci_dev *pdev, .. > > to include/linux/pci.h. > > Why? > > You do realize that that header file is included by almost every > driver out there. Why is that magical function

Re: Kernel 4.15 lost set_robust_list support on POWER 9

2018-02-06 Thread Mauricio Faria de Oliveira
On 02/05/2018 11:06 PM, Nicholas Piggin wrote: Does this help? powerpc/64s/radix: allocate guard-PID for kernel contexts at boot Yes, the test-case passes: # strace -e set_robust_list -f ./test set_robust_list(0x7fff8d453910, 24) = 0 +++ exited with 0 +++ # uname -r

Re: [PATCH] powerpc/64s: Fix MASKABLE_RELON_EXCEPTION_HV_OOL macro

2018-02-06 Thread Alexey Kardashevskiy
On 06/02/18 23:36, Madhavan Srinivasan wrote: > Commit f14e953b191f ("powerpc/64s: Add support to take additional parameter > in MASKABLE_* macro") > messed up MASKABLE_RELON_EXCEPTION_HV_OOL macro by adding the wrong > __SOFTEN__ test which caused guest kernel trash at boot. Patch to fix > the

Re: rtc-opal: Fix handling of firmware error codes, prevent busy loops

2018-02-06 Thread Alexandre Belloni
On 06/02/2018 at 16:22:47 +1100, Michael Ellerman wrote: > > Just a note to let you know that this patch should have gone through my > > tree but it was not sent to linux-rtc or me. > > Sorry, I saw it had been languishing for a long time and assumed you'd > missed it. > > Happy to revert/rework

[PATCH] powerpc: wii: Probe the whole devicetree

2018-02-06 Thread Jonathan Neuschäfer
Previously, wii_device_probe would only initialize devices under the /hollywood node. After this patch, platform devices placed outside of /hollywood will also be initialized. The intended use case for this is devices located outside of the Hollywood chip, such as GPIO LEDs and GPIO buttons.

[PATCH] powerpc/64s: Fix MASKABLE_RELON_EXCEPTION_HV_OOL macro

2018-02-06 Thread Madhavan Srinivasan
Commit f14e953b191f ("powerpc/64s: Add support to take additional parameter in MASKABLE_* macro") messed up MASKABLE_RELON_EXCEPTION_HV_OOL macro by adding the wrong __SOFTEN__ test which caused guest kernel trash at boot. Patch to fix the macro to use SOFTEN_TEST_HV instead of SOFTEN_NOTEST_HV.

Re: rtc-opal: Fix handling of firmware error codes, prevent busy loops

2018-02-06 Thread Alexandre Belloni
Hi, On 02/08/2016 at 11:50:16 +1000, Stewart Smith wrote: > According to the OPAL docs: > https://github.com/open-power/skiboot/blob/skiboot-5.2.5/doc/opal-api/opal-rtc-read-3.txt > https://github.com/open-power/skiboot/blob/skiboot-5.2.5/doc/opal-api/opal-rtc-write-4.txt > OPAL_HARDWARE may be

Re: [PATCH] powerpc/64s: fix may_hard_irq_enable for PMI soft masking

2018-02-06 Thread Nicholas Piggin
On Tue, 6 Feb 2018 15:30:43 +0530 Madhavan Srinivasan wrote: > On Saturday 03 February 2018 12:47 PM, Nicholas Piggin wrote: > > The soft IRQ masking code has to hard-disable interrupts in cases > > where the exception is not cleared by the masked handler. External > >

Re: DPAA Ethernet traffice troubles with Linux kernel

2018-02-06 Thread Christian Zigotzky
Hello, I have tried to figure out why there is a problem with the buffer space but unfortunately without any success. Any ideas? Could you please watch Skateman's video? [1] Thanks, Christian [1] https://drive.google.com/file/d/18RhksfcavRJPr86asQDTzrmsN20D0Xim/view On 03 February 2018 at

Re: [PATCH] powerpc/64s: fix may_hard_irq_enable for PMI soft masking

2018-02-06 Thread Madhavan Srinivasan
On Saturday 03 February 2018 12:47 PM, Nicholas Piggin wrote: The soft IRQ masking code has to hard-disable interrupts in cases where the exception is not cleared by the masked handler. External interrupts used this approach for soft masking. Now recently PMU interrupts do the same thing.