This patch splits a linear mapping if the hot-unplug range
is smaller than the mapping size. The code detects if the mapping
needs to be split into a smaller size and if so, uses the stop
machine infrastructure to map the current linear mapping with
a smaller size mapping. Then the requested
On Tuesday, 16 January 2018 3:15:05 PM AEDT Alistair Popple wrote:
> Thanks Balbir, one question below. I have no way of testing this at present
> but it looks ok to me. Thanks!
The below are more future optimisations once we can test. So in the meantime:
Acked-by: Alistair Popple
On 06/02/2018 at 23:55:02 +0100, Arnd Bergmann wrote:
> * arch/arm/kernel/time.c has this code
>
> #if defined(CONFIG_RTC_DRV_CMOS) || defined(CONFIG_RTC_DRV_CMOS_MODULE) || \
> defined(CONFIG_NVRAM) || defined(CONFIG_NVRAM_MODULE)
> /* this needs a better home */
> DEFINE_SPINLOCK(rtc_lock);
This change restores and formalises the behaviour that access to NULL or
other user addresses by the kernel during boot should fault rather than
succeed and modify memory. This was inadvertently broken when fixing
another bug, because it was previously not well defined and only worked
by chance.
On Tue, Feb 6, 2018 at 11:05 PM, Alexandre Belloni
wrote:
> /dev/nvram was never meant to be used alongside the RTC CMOS driver from
> drivers/rtc as it already exposes the NVRAM through another interface.
> Anyway, the last defconfig to enable it properly was
On Tue, Feb 06, 2018 at 05:49:50PM +0100, Laurent Dufour wrote:
> From: Peter Zijlstra
>
> One of the side effects of speculating on faults (without holding
> mmap_sem) is that we can race with free_pgtables() and therefore we
> cannot assume the page-tables will stick
Add support for the new speculative faults event.
Signed-off-by: Laurent Dufour
---
tools/include/uapi/linux/perf_event.h | 1 +
tools/perf/util/evsel.c | 1 +
tools/perf/util/parse-events.c| 4
tools/perf/util/parse-events.l| 1 +
This patch enables the speculative page fault on the PowerPC
architecture.
This will try a speculative page fault without holding the mmap_sem. If it
returns VM_FAULT_RETRY, the mmap_sem is acquired and the traditional page
fault processing is done.
The speculative path is only tried for
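The fallback described in this patch (try without mmap_sem, then redo the
fault the traditional way on VM_FAULT_RETRY) can be sketched in user-space C
as follows. This is only an illustration of the control flow, not the kernel
code: every sketch_* name, the SKETCH_VM_FAULT_RETRY value and the counter
that stands in for taking mmap_sem are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's VM_FAULT_RETRY bit (sketch only). */
#define SKETCH_VM_FAULT_RETRY 0x0400

/* Hypothetical model of an address space for this sketch. */
struct sketch_mm {
    int  mmap_sem_acquisitions; /* how often the slow path "took" mmap_sem */
    bool vma_changed;           /* simulates a concurrent VMA modification */
};

/* Speculative attempt: runs without the lock, gives up if the VMA changed. */
static int sketch_speculative_fault(struct sketch_mm *mm, unsigned long addr)
{
    (void)addr;
    return mm->vma_changed ? SKETCH_VM_FAULT_RETRY : 0;
}

/* Traditional path: "acquires" mmap_sem (modelled as a counter) and succeeds. */
static int sketch_locked_fault(struct sketch_mm *mm, unsigned long addr)
{
    (void)addr;
    mm->mmap_sem_acquisitions++;
    return 0;
}

/* Try the speculative path first; fall back only on a retry request. */
static int sketch_handle_fault(struct sketch_mm *mm, unsigned long addr)
{
    int ret;

    assert(mm != NULL);
    ret = sketch_speculative_fault(mm, addr);
    if (!(ret & SKETCH_VM_FAULT_RETRY))
        return ret;
    return sketch_locked_fault(mm, addr);
}
```

In this model the lock is only touched when the speculative attempt asks for
a retry, which is the property the series relies on for better concurrency.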
From: Peter Zijlstra
Try a speculative fault before acquiring mmap_sem. If it returns
VM_FAULT_RETRY, continue with the mmap_sem acquisition and do the
traditional fault.
Signed-off-by: Peter Zijlstra (Intel)
[Clearing of FAULT_FLAG_ALLOW_RETRY
migrate_misplaced_page() is only called during the page fault handling so
it's better to pass the pointer to the struct vm_fault instead of the vma.
This way during the speculative page fault path the saved vma->vm_flags
could be used.
Signed-off-by: Laurent Dufour
When the speculative page fault handler is returning VM_RETRY, there is a
chance that VMA fetched without grabbing the mmap_sem can be reused by the
legacy page fault handler. By reusing it, we avoid calling find_vma()
again. To achieve that, we must ensure that the VMA structure will not be
This patch adds a set of new trace events to collect the speculative page fault
event failures.
Signed-off-by: Laurent Dufour
---
include/trace/events/pagefault.h | 87
mm/memory.c | 62
Add a new software event to count successful speculative page faults.
Signed-off-by: Laurent Dufour
---
include/uapi/linux/perf_event.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index
From: Peter Zijlstra
Provide infrastructure to do a speculative fault (not holding
mmap_sem).
The not holding of mmap_sem means we can race against VMA
change/removal and page-table destruction. We use the SRCU VMA freeing
to keep the VMA around. We use the VMA seqcount to
This change is inspired by Peter's proposed patch [1] which was
protecting the VMA using SRCU. Unfortunately, SRCU is not scaling well in
that particular case, and it is introducing major performance degradation
due to excessive scheduling operations.
To allow access to the mm_rb tree without
When dealing with the speculative page fault handler, we may race with the VMA
being split or merged. In this case the vma->vm_start and vma->vm_end
fields may not match the address at which the page fault is occurring.
This can only happen when the VMA is split but in that case, the
anon_vma pointer of the new
When dealing with the speculative fault path we should use the VMA's field
cached value stored in the vm_fault structure.
Currently vm_normal_page() is using the pointer to the VMA to fetch the
vm_flags value. This patch provides a new __vm_normal_page() which is
receiving the vm_flags flags
The current maybe_mkwrite() is getting passed the pointer to the vma
structure to fetch the vm_flags field.
When dealing with the speculative page fault handler, it will be better to
rely on the cached vm_flags value stored in the vm_fault structure.
This patch introduces a __maybe_mkwrite()
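The difference between passing the vma pointer and passing a cached vm_flags
value can be illustrated with a small user-space sketch. It is not the
kernel's maybe_mkwrite(): the sketch_* types and functions and the
SKETCH_VM_WRITE value are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical write-permission flag for this sketch. */
#define SKETCH_VM_WRITE 0x2UL

/* Minimal stand-ins for a PTE and a VMA. */
struct sketch_pte { bool writable; };
struct sketch_vma { unsigned long vm_flags; };

/* Flag-based variant: depends only on the value the caller passes in,
 * so a caller can hand over a value it cached earlier. */
static struct sketch_pte sketch_mkwrite_by_flags(struct sketch_pte pte,
                                                 unsigned long vma_flags)
{
    if (vma_flags & SKETCH_VM_WRITE)
        pte.writable = true;
    return pte;
}

/* Pointer-based variant: a thin wrapper that reads the flags at call time. */
static struct sketch_pte sketch_mkwrite_by_vma(struct sketch_pte pte,
                                               const struct sketch_vma *vma)
{
    assert(vma != NULL);
    return sketch_mkwrite_by_flags(pte, vma->vm_flags);
}
```

If the VMA's flags are modified after the caller cached them, the two
variants can give different answers. The flag-based one stays consistent
with the snapshot the fault was started with.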
The speculative page fault handler which is run without holding the
mmap_sem is calling lru_cache_add_active_or_unevictable() but the vm_flags
is not guaranteed to remain constant.
Introducing __lru_cache_add_active_or_unevictable() which has the vma flags
value parameter instead of the vma
When handling a speculative page fault, the vma->vm_flags and
vma->vm_page_prot fields are read once the page table lock is released, so
there is no longer any guarantee that these fields will not change behind
our back. They will be saved in the vm_fault structure before the VMA is
checked for changes.
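A minimal single-threaded sketch of this "snapshot first, validate later"
idea is shown below. It assumes a sequence counter on the VMA, as used
elsewhere in this series. All sketch_* names and field layouts are
hypothetical, and real kernel code would also need the memory barriers
provided by the seqcount API, which this model omits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical VMA model: seq is even when stable, odd while being modified. */
struct sketch_vma {
    unsigned int  seq;
    unsigned long vm_flags;
    unsigned long vm_page_prot;
};

/* Hypothetical fault context holding the cached copies. */
struct sketch_vm_fault {
    unsigned int  seq;           /* sequence value seen at snapshot time */
    unsigned long vma_flags;     /* cached vm_flags */
    unsigned long vma_page_prot; /* cached vm_page_prot */
};

/* Writer side: bump the counter around every modification. */
static void sketch_vma_update(struct sketch_vma *vma,
                              unsigned long flags, unsigned long prot)
{
    vma->seq++;                  /* now odd: modification in progress */
    vma->vm_flags = flags;
    vma->vm_page_prot = prot;
    vma->seq++;                  /* even again: modification finished */
}

/* Reader side: copy the fields, then confirm no writer interfered. */
static bool sketch_snapshot(const struct sketch_vma *vma,
                            struct sketch_vm_fault *vmf)
{
    unsigned int seq;

    assert(vma != NULL && vmf != NULL);
    seq = vma->seq;
    if (seq & 1)
        return false;            /* writer active, do not speculate */
    vmf->seq = seq;
    vmf->vma_flags = vma->vm_flags;
    vmf->vma_page_prot = vma->vm_page_prot;
    return vma->seq == seq;
}

/* Later check: has the VMA changed since the snapshot was taken? */
static bool sketch_still_valid(const struct sketch_vma *vma,
                               const struct sketch_vm_fault *vmf)
{
    return vma->seq == vmf->seq;
}
```

The rest of the fault would then use only the cached values in the fault
context, and would abandon the speculative attempt if the later check fails.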
The speculative page fault handler must be protected against anon_vma
changes. This is because page_add_new_anon_rmap() is called during the
speculative path.
In addition, don't try a speculative page fault if the VMA doesn't have an
anon_vma structure allocated because its allocation should be
If a thread is remapping an area while another one is faulting on the
destination area, the SPF handler may fetch the vma from the RB tree before
the pte has been moved by the other thread. This means that the moved ptes
will overwrite those created by the page fault handler, leading to leaked
pages.
The VMA sequence count has been introduced to allow fast detection of
VMA modification when running a page fault handler without holding
the mmap_sem.
This patch provides protection against the VMA modification done in:
- madvise()
- mpol_rebind_policy()
-
From: Peter Zijlstra
Wrap the VMA modifications (vma_adjust/unmap_page_range) with sequence
counts such that we can easily test if a VMA is changed.
The unmap_page_range() one allows us to make assumptions about
page-tables; when we find the seqcount hasn't changed we can
When handling a page fault without holding the mmap_sem, the fetch of the
pte lock pointer and the locking will have to be done while ensuring
that the VMA is not touched behind our back.
So move the fetch and locking operations into a dedicated function.
Signed-off-by: Laurent Dufour
From: Peter Zijlstra
One of the side effects of speculating on faults (without holding
mmap_sem) is that we can race with free_pgtables() and therefore we
cannot assume the page-tables will stick around.
Remove the reliance on the pte pointer.
Signed-off-by: Peter
This configuration variable will be used to build the code needed to
handle speculative page fault.
By default it is turned off, and activated depending on architecture
support.
Suggested-by: Thomas Gleixner
Signed-off-by: Laurent Dufour
---
From: Peter Zijlstra
When speculating faults (without holding mmap_sem) we need to validate
that the vma against which we loaded pages is still valid when we're
ready to install the new PTE.
Therefore, replace the pte_offset_map_lock() calls that (re)take the
PTL with
Define CONFIG_SPECULATIVE_PAGE_FAULT for BOOK3S_64 and SMP. This enables
the Speculative Page Fault handler.
Support is only provided for BOOK3S_64 currently because:
- require CONFIG_PPC_STD_MMU because checks done in
set_access_flags_filter()
- require BOOK3S because we can't support for
Introduce CONFIG_SPECULATIVE_PAGE_FAULT which turns on the Speculative Page
Fault handler when building for 64-bit with SMP.
Cc: Thomas Gleixner
Signed-off-by: Laurent Dufour
---
arch/x86/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git
This is a port to kernel 4.15 of the work done by Peter Zijlstra to
handle page faults without holding the mm semaphore [1].
The idea is to try to handle user space page faults without holding the
mmap_sem. This should allow better concurrency for massively threaded
processes since the page fault
Alexandre Belloni writes:
> On 06/02/2018 at 16:22:47 +1100, Michael Ellerman wrote:
>> > Just a note to let you know that this patch should have gone through my
>> > tree but it was not sent to linux-rtc or me.
>>
>> Sorry, I saw it had been languishing for a
Linus Torvalds writes:
> Hmm. This adds a
>
>static inline void pci_uevent_ers(struct pci_dev *pdev, ..
>
> to include/linux/pci.h.
>
> Why?
>
> You do realize that that header file is included by almost every
> driver out there. Why is that magical function
On 02/05/2018 11:06 PM, Nicholas Piggin wrote:
Does this help?
powerpc/64s/radix: allocate guard-PID for kernel contexts at boot
Yes, the test-case passes:
# strace -e set_robust_list -f ./test
set_robust_list(0x7fff8d453910, 24) = 0
+++ exited with 0 +++
# uname -r
On 06/02/18 23:36, Madhavan Srinivasan wrote:
> Commit f14e953b191f ("powerpc/64s: Add support to take additional parameter
> in MASKABLE_* macro")
> messed up MASKABLE_RELON_EXCEPTION_HV_OOL macro by adding the wrong
> __SOFTEN__ test which caused guest kernel trash at boot. Patch to fix
> the
On 06/02/2018 at 16:22:47 +1100, Michael Ellerman wrote:
> > Just a note to let you know that this patch should have gone through my
> > tree but it was not sent to linux-rtc or me.
>
> Sorry, I saw it had been languishing for a long time and assumed you'd
> missed it.
>
> Happy to revert/rework
Previously, wii_device_probe would only initialize devices under the
/hollywood node. After this patch, platform devices placed outside of
/hollywood will also be initialized.
The intended use case is devices located outside of the
Hollywood chip, such as GPIO LEDs and GPIO buttons.
Commit f14e953b191f ("powerpc/64s: Add support to take additional parameter
in MASKABLE_* macro") messed up the MASKABLE_RELON_EXCEPTION_HV_OOL macro by
adding the wrong __SOFTEN__ test, which caused the guest kernel to crash at
boot. Fix the macro to use SOFTEN_TEST_HV instead of SOFTEN_NOTEST_HV.
Hi,
On 02/08/2016 at 11:50:16 +1000, Stewart Smith wrote:
> According to the OPAL docs:
> https://github.com/open-power/skiboot/blob/skiboot-5.2.5/doc/opal-api/opal-rtc-read-3.txt
> https://github.com/open-power/skiboot/blob/skiboot-5.2.5/doc/opal-api/opal-rtc-write-4.txt
> OPAL_HARDWARE may be
On Tue, 6 Feb 2018 15:30:43 +0530
Madhavan Srinivasan wrote:
> On Saturday 03 February 2018 12:47 PM, Nicholas Piggin wrote:
> > The soft IRQ masking code has to hard-disable interrupts in cases
> > where the exception is not cleared by the masked handler. External
> >
Hello,
I have tried to figure out why there is a problem with the buffer space
but unfortunately without any success. Any ideas? Could you please watch
Skateman's video? [1]
Thanks,
Christian
[1] https://drive.google.com/file/d/18RhksfcavRJPr86asQDTzrmsN20D0Xim/view
On 03 February 2018 at
On Saturday 03 February 2018 12:47 PM, Nicholas Piggin wrote:
The soft IRQ masking code has to hard-disable interrupts in cases
where the exception is not cleared by the masked handler. External
interrupts used this approach for soft masking. More recently, PMU
interrupts do the same thing.