[PATCH v2 7/7] powerpc/64s/radix: avoid ptesync after set_pte and ptep_set_access_flags

2018-05-19 Thread Nicholas Piggin
The ISA suggests ptesync after setting a pte, to prevent a table walk initiated by a subsequent access from missing that store and causing a spurious fault. This is an architectural allowance that allows an implementation's page table walker to be incoherent with the store queue. However there is

[PATCH v2 6/7] powerpc/64s/radix: prefetch user address in update_mmu_cache

2018-05-19 Thread Nicholas Piggin
Prefetch the faulting address in update_mmu_cache to give the page table walker a head start of perhaps 100 cycles while locks are dropped and the interrupt completes. Signed-off-by: Nicholas Piggin --- arch/powerpc/mm/mem.c | 4 +++- arch/powerpc/mm/pgtable-book3s64.c |

[PATCH v2 5/7] powerpc/64s/radix: optimise pte_update

2018-05-19 Thread Nicholas Piggin
Implementing pte_update with pte_xchg (which uses cmpxchg) is inefficient. A single larx/stcx. works fine; there is no need for the less efficient cmpxchg sequence. Then remove the memory barriers from the operation. There is a requirement for TLB flushing to load mm_cpumask after the store that reduces

[PATCH v2 4/7] powerpc/64s/radix: make ptep_get_and_clear_full non-atomic for the full case

2018-05-19 Thread Nicholas Piggin
This matches other architectures: when we know there will be no further accesses to the address (e.g., for teardown), page table entries can be cleared non-atomically. The comments about NMMU are bogus: all MMU notifiers (including NMMU) are released at this point, with their TLBs flushed. An
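
The distinction between the atomic and non-atomic clear can be sketched in portable C11. This is only an illustration of the idea, not the kernel code: `pte_slot_t` and `sketch_get_and_clear_full` are hypothetical names, and a plain 64-bit atomic word stands in for a real PTE.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a page table entry slot. */
typedef _Atomic uint64_t pte_slot_t;

/*
 * Hypothetical helper: return the old entry and leave the slot cleared.
 * full != 0 models teardown, where nothing else can touch the entry,
 * so a separate load and store are enough.
 */
static uint64_t sketch_get_and_clear_full(pte_slot_t *ptep, int full)
{
    if (full) {
        /* Non-atomic path: two independent relaxed accesses. */
        uint64_t old = atomic_load_explicit(ptep, memory_order_relaxed);
        atomic_store_explicit(ptep, 0, memory_order_relaxed);
        return old;
    }
    /* General path: a single atomic read-modify-write. */
    return atomic_exchange(ptep, 0);
}
```

Both paths return the same value when there is no concurrent writer; the non-atomic path simply avoids the cost of the read-modify-write.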

[PATCH v2 3/7] powerpc/64s/radix: make single threaded mms always flush all translations from non-local CPUs

2018-05-19 Thread Nicholas Piggin
Go one step further: if we're going to put a tlbie on the bus at all, make it count. Make any global invalidation from a single threaded mm do a full PID flush so the mm_cpumask can be reset. The tradeoff is that it will over-flush the local CPU's TLB one time if there was a small number of pages

[PATCH v2 2/7] powerpc/64s/radix: reset mm_cpumask for single thread process when possible

2018-05-19 Thread Nicholas Piggin
When a single-threaded process has a non-local mm_cpumask and requires a full PID tlbie invalidation, use that as an opportunity to reset the cpumask back to the current CPU we're running on. No other thread can concurrently switch to this mm, because it must have been given a reference to

[PATCH v2 1/7] powerpc/64s/radix: do not flush TLB on spurious fault

2018-05-19 Thread Nicholas Piggin
In the case of a spurious fault (which can happen due to a race with another thread that changes the page table), the default Linux mm code calls flush_tlb_page for that address. This is not required because the pte will be re-fetched. Hash does not wire this up to a hardware TLB flush for this

[PATCH v2 0/7] Various TLB and PTE improvements

2018-05-19 Thread Nicholas Piggin
I've posted most of these separately at one time or another, but I will send them as a series. There have been some bug fixes and changelog and comment improvements, and I got some more numbers. Most of the patches are logically independent (except 2 and 3 AFAIKS). Thanks, Nick Nicholas Piggin

Re: pkeys on POWER: Access rights not reset on execve

2018-05-19 Thread Andy Lutomirski
On Sat, May 19, 2018 at 1:28 PM Ram Pai wrote: > You got it mostly right. Filling in some more details below for > completeness. > [...] Okay, so I guess I was correct as to what the functionality was but not as to the encoding or the name of UAMOR. Can you also confirm

Re: pkeys on POWER: Access rights not reset on execve

2018-05-19 Thread Ram Pai
On Fri, May 18, 2018 at 06:50:39PM -0700, Andy Lutomirski wrote: > On Fri, May 18, 2018 at 6:19 PM Ram Pai wrote: > > > However the fundamental issue is still the same, as mentioned in the > > other thread. > > > "Should the permissions on a key be allowed to be changed, if

Re: pkeys on POWER: Access rights not reset on execve

2018-05-19 Thread Florian Weimer
On 05/19/2018 03:19 AM, Ram Pai wrote: The issue you may be talking about here is that -- "when you set the AMR register to 0x, it just sets it to 0x0c00." To me it looks like, exec/fork are not related to the issue. Or are they also somehow connected to the issue?