Re: [RFC PATCH 2/2] mm, fs: daxfile, an interface for byte-addressable updates to pmem

2017-06-20 Thread Andy Lutomirski
Hence > they want to use userspace data sync primitives to avoid this > overhead and so filesystems need to make it possible to provide this > userspace data sync capability. If I were using DAX in production, I'd have exactly this issue. Let me quote myself: On Tue, Jun 20, 2017 at 9

[PATCH v3 04/11] x86/mm: Give each mm TLB flush generation a unique ID

2017-06-20 Thread Andy Lutomirski
. Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/mmu.h | 25 +++-- arch/x86/include/asm/mmu_context.h | 5 + arch/x86/include/asm/tlbflush.h| 18 ++ arch/x86/mm/tlb.c | 6 -- 4 files changed, 50 insertions(+), 4

[PATCH v3 08/11] x86/mm: Disable PCID on 32-bit kernels

2017-06-20 Thread Andy Lutomirski
32-bit kernels on new hardware will see PCID in CPUID, but PCID can only be used in 64-bit mode. Rather than making all PCID code conditional, just disable the feature on 32-bit builds. Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/disabled-features.h | 4 +++- arch/x86/kernel/cpu

[PATCH v3 00/11] PCID and improved laziness

2017-06-20 Thread Andy Lutomirski
ered (Nadav) - Fix ASID corruption on unlazying (kbuild bot) - Move Xen init to the right place - Misc cleanups Andy Lutomirski (11): x86/mm: Don't reenter flush_tlb_func_common() x86/ldt: Simplify LDT switching logic x86/mm: Remove reset_lazy_tlbstate() x86/mm: Give each mm TLB flus

[PATCH v3 01/11] x86/mm: Don't reenter flush_tlb_func_common()

2017-06-20 Thread Andy Lutomirski
y TLB to track the actual loaded mm") Signed-off-by: Andy Lutomirski --- arch/x86/mm/tlb.c | 15 +-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 2a5e851f2035..f06239c6919f 100644 --- a/arch/x86/mm/tlb.c +++ b/ar

[PATCH v3 07/11] x86/mm: Stop calling leave_mm() in idle code

2017-06-20 Thread Andy Lutomirski
aders, since it has no callers any more. Signed-off-by: Andy Lutomirski --- arch/ia64/include/asm/acpi.h | 2 -- arch/x86/include/asm/acpi.h | 2 -- arch/x86/mm/tlb.c | 19 +++ drivers/acpi/processor_idle.c | 2 -- drivers/idle/intel_idle.c | 9 - 5

[PATCH v3 06/11] x86/mm: Rework lazy TLB mode and TLB freshness tracking

2017-06-20 Thread Andy Lutomirski
s to find all the CPUs it needs to twiddle. The UV tlbflush code is rather dated and should be changed. Cc: Andrew Banman Cc: Mike Travis Cc: Dimitri Sivanich Cc: Juergen Gross Cc: Boris Ostrovsky Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/mmu_context.h | 6 +- arch/x86/includ

[PATCH v3 11/11] x86/mm: Try to preserve old TLB entries using PCID

2017-06-20 Thread Andy Lutomirski
LB, in which case we reuse a recent value. This seems to save about 100ns on context switches between mms. Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/mmu_context.h | 3 ++ arch/x86/include/asm/processor-flags.h | 2 + arch/x86/include/asm/tlbflush.h| 18 +++- arch/x86

[PATCH v3 10/11] x86/mm: Enable CR4.PCIDE on supported systems

2017-06-20 Thread Andy Lutomirski
We can use PCID if the CPU has PCID and PGE and we're not on Xen. By itself, this has no effect. The next patch will start using PCID. Cc: Juergen Gross Cc: Boris Ostrovsky Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/tlbflush.h | 8 arch/x86/kernel/cpu/common.c
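The condition stated in this patch description can be sketched in a few lines of C. This is only an illustrative model: the struct `cpu_caps` and the function `should_enable_pcide()` are hypothetical names, not kernel APIs (the quoted review of this patch shows the kernel testing `cpu_has(c, X86_FEATURE_PCID)` and `cpu_has(c, X86_FEATURE_PGE)` instead).

```c
#include <stdbool.h>

/* Hypothetical summary of the three facts the patch description names. */
struct cpu_caps {
    bool has_pcid;  /* CPUID reports PCID */
    bool has_pge;   /* CPUID reports PGE (global pages) */
    bool on_xen;    /* running on Xen */
};

/* PCID is usable only if the CPU has PCID and PGE and we are not on Xen. */
static bool should_enable_pcide(const struct cpu_caps *caps)
{
    return caps->has_pcid && caps->has_pge && !caps->on_xen;
}
```

As the description says, setting CR4.PCIDE by itself changes nothing; a later patch in the series is what starts using PCID.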

[PATCH v3 09/11] x86/mm: Add nopcid to turn off PCID

2017-06-20 Thread Andy Lutomirski
The parameter is only present on x86_64 systems to save a few bytes, as PCID is always disabled on x86_32. Signed-off-by: Andy Lutomirski --- Documentation/admin-guide/kernel-parameters.txt | 2 ++ arch/x86/kernel/cpu/common.c| 18 ++ 2 files changed, 20

[PATCH v3 03/11] x86/mm: Remove reset_lazy_tlbstate()

2017-06-20 Thread Andy Lutomirski
The only call site also calls idle_task_exit(), and idle_task_exit() puts us into a clean state by explicitly switching to init_mm. Reviewed-by: Rik van Riel Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/tlbflush.h | 8 arch/x86/kernel/smpboot.c | 1 - 2 files changed

[PATCH v3 05/11] x86/mm: Track the TLB's tlb_gen and update the flushing algorithm

2017-06-20 Thread Andy Lutomirski
unnecessary local flushes later on. We can address this if it becomes a problem by carefully updating the target CPU's tlb_gen directly. By itself, this patch is a very minor optimization that avoids unnecessary flushes when multiple TLB flushes targeting the same CPU race. Signed-off-by:
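A rough model of how a generation counter can avoid redundant flushes when several requests race is sketched below. It is an assumption-laden simplification, not the patch's code: `mm_model`, `cpu_model`, `request_flush()` and `handle_flush()` are hypothetical names, and the real algorithm also deals with partial flushes, atomics and multiple contexts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: each mm carries a generation bumped per flush request. */
struct mm_model {
    uint64_t tlb_gen;
};

/* Hypothetical model: each CPU remembers the generation it has caught up to. */
struct cpu_model {
    uint64_t local_tlb_gen;
    unsigned int flushes;   /* number of real flushes performed */
};

/* A flush request bumps the mm's generation and returns the new value. */
static uint64_t request_flush(struct mm_model *mm)
{
    return ++mm->tlb_gen;
}

/*
 * Flush handler: if this CPU already caught up to the mm's current
 * generation (because an earlier handler flushed on behalf of a later
 * request), skip the flush. Returns true if a flush was performed.
 */
static bool handle_flush(struct cpu_model *cpu, const struct mm_model *mm)
{
    if (cpu->local_tlb_gen >= mm->tlb_gen)
        return false;

    cpu->flushes++;
    cpu->local_tlb_gen = mm->tlb_gen;
    return true;
}
```

In this model, two requests issued back to back followed by two handler invocations result in a single flush: the first handler catches up to the latest generation and the second finds nothing left to do.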

[PATCH v3 02/11] x86/ldt: Simplify LDT switching logic

2017-06-20 Thread Andy Lutomirski
heless have a stale LDT descriptor. Simplify the code to update LDTR if either the previous or the next mm has an LDT, i.e. effectively restore the historical logic. While we're at it, clean up the code by moving all the ifdeffery to a header where it belongs. Acked-by: Rik van Riel Signed

Re: [PATCH] fs: Reorder inode_owner_or_capable() to avoid needless

2017-06-21 Thread Andy Lutomirski
On Tue, Jun 20, 2017 at 2:40 PM, Kees Cook wrote: > Checking for capabilities should be the last operation when performing > access control tests so that PF_SUPERPRIV is set only when it was required > for success (implying that the capability was needed for the operation). > Revie

Re: [PATCH v3 05/11] x86/mm: Track the TLB's tlb_gen and update the flushing algorithm

2017-06-21 Thread Andy Lutomirski
On Wed, Jun 21, 2017 at 1:32 AM, Thomas Gleixner wrote: > On Tue, 20 Jun 2017, Andy Lutomirski wrote: >> struct flush_tlb_info { >> + /* >> + * We support several kinds of flushes. >> + * >> + * - Fully flush a single mm. flush_mm will be se

Re: [PATCH v3 01/11] x86/mm: Don't reenter flush_tlb_func_common()

2017-06-21 Thread Andy Lutomirski
On Wed, Jun 21, 2017 at 1:49 AM, Borislav Petkov wrote: > On Tue, Jun 20, 2017 at 10:22:07PM -0700, Andy Lutomirski wrote: >> It was historically possible to have two concurrent TLB flushes >> targeting the same CPU: one initiated locally and one initiated >> remotely.

Re: [PATCH v3 07/11] x86/mm: Stop calling leave_mm() in idle code

2017-06-21 Thread Andy Lutomirski
On Wed, Jun 21, 2017 at 2:22 AM, Thomas Gleixner wrote: > On Tue, 20 Jun 2017, Andy Lutomirski wrote: >> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c >> index 216d7ec88c0c..2ae43f59091d 100644 >> --- a/drivers/idle/intel_idle.c >> +++

Re: [PATCH v3 06/11] x86/mm: Rework lazy TLB mode and TLB freshness tracking

2017-06-21 Thread Andy Lutomirski
On Wed, Jun 21, 2017 at 2:01 AM, Thomas Gleixner wrote: > On Tue, 20 Jun 2017, Andy Lutomirski wrote: >> -/* >> - * The flush IPI assumes that a thread switch happens in this order: >> - * [cpu0: the cpu that switches] >> - * 1) switch_mm() either 1a) or 1b)

Re: [PATCH v3 04/11] x86/mm: Give each mm TLB flush generation a unique ID

2017-06-21 Thread Andy Lutomirski
On Wed, Jun 21, 2017 at 3:33 AM, Borislav Petkov wrote: > On Tue, Jun 20, 2017 at 10:22:10PM -0700, Andy Lutomirski wrote: >> +#define INIT_MM_CONTEXT(mm) \ >> + .context = {\ >>

Re: [PATCH v3 10/11] x86/mm: Enable CR4.PCIDE on supported systems

2017-06-21 Thread Andy Lutomirski
On Wed, Jun 21, 2017 at 2:39 AM, Thomas Gleixner wrote: > On Tue, 20 Jun 2017, Andy Lutomirski wrote: >> + /* Set up PCID */ >> + if (cpu_has(c, X86_FEATURE_PCID)) { >> + if (cpu_has(c, X86_FEATURE_PGE)) { >> + cr4_set_bits(X86_

Re: [PATCH v3 01/11] x86/mm: Don't reenter flush_tlb_func_common()

2017-06-21 Thread Andy Lutomirski
On Wed, Jun 21, 2017 at 4:26 PM, Nadav Amit wrote: > Andy Lutomirski wrote: > >> index 2a5e851f2035..f06239c6919f 100644 >> --- a/arch/x86/mm/tlb.c >> +++ b/arch/x86/mm/tlb.c >> @@ -208,6 +208,9 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct >

Re: [PATCH v3 04/11] x86/mm: Give each mm TLB flush generation a unique ID

2017-06-21 Thread Andy Lutomirski
On Wed, Jun 21, 2017 at 10:43 AM, Borislav Petkov wrote: > On Tue, Jun 20, 2017 at 10:22:10PM -0700, Andy Lutomirski wrote: >> - * The x86 doesn't have a mmu context, but >> - * we put the segment information here. >> + * x86 has arch-specific MMU state beyo

Re: [PATCH v3 05/11] x86/mm: Track the TLB's tlb_gen and update the flushing algorithm

2017-06-21 Thread Andy Lutomirski
On Wed, Jun 21, 2017 at 11:44 AM, Borislav Petkov wrote: > On Tue, Jun 20, 2017 at 10:22:11PM -0700, Andy Lutomirski wrote: >> + this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, next->context.ctx_id); >> + this_cpu_write(cpu_tlbstate.ctxs[0].tlb_gen, >> +

Re: [PATCH v3 11/11] x86/mm: Try to preserve old TLB entries using PCID

2017-06-21 Thread Andy Lutomirski
On Wed, Jun 21, 2017 at 6:38 AM, Thomas Gleixner wrote: > On Tue, 20 Jun 2017, Andy Lutomirski wrote: >> This patch uses PCID differently. We use a PCID to identify a >> recently-used mm on a per-cpu basis. An mm has no fixed PCID >> binding at all; instead, we give it a

Re: [RFC PATCH 2/2] mm, fs: daxfile, an interface for byte-addressable updates to pmem

2017-06-21 Thread Andy Lutomirski
On Wed, Jun 21, 2017 at 5:02 PM, Dave Chinner wrote: > > You seem to be calling the "fdatasync on every page fault" the It's the opposite of fdatasync(). It needs to sync whatever metadata is needed to find the data. The data doesn't need to be synced. > "lightweight" option. That's the brute-

Re: [PATCH v3 00/11] PCID and improved laziness

2017-06-21 Thread Andy Lutomirski
On Wed, Jun 21, 2017 at 11:23 AM, Linus Torvalds wrote: > On Tue, Jun 20, 2017 at 10:22 PM, Andy Lutomirski wrote: >> There are three performance benefits here: > > Side note: can you post the actual performance numbers, even if only > from some silly test program on just on

Re: [RFC][PATCH] x86, syscalls: use SYSCALL_DEFINE() macros for sys_modify_ldt()

2017-10-13 Thread Andy Lutomirski
On Fri, Oct 13, 2017 at 4:49 PM, Brian Gerst wrote: > On Fri, Oct 13, 2017 at 5:03 PM, Andy Lutomirski wrote: >> On Fri, Oct 13, 2017 at 1:39 PM, Dave Hansen >> wrote: >>> >>> I noticed that we don't have tracepoints for sys_modify_ldt(). I >>

Re: [RFC][PATCH] x86, syscalls: use SYSCALL_DEFINE() macros for sys_modify_ldt()

2017-10-14 Thread Andy Lutomirski
On Fri, Oct 13, 2017 at 11:25 PM, Brian Gerst wrote: > On Sat, Oct 14, 2017 at 12:42 AM, Andy Lutomirski wrote: >> On Fri, Oct 13, 2017 at 4:49 PM, Brian Gerst wrote: >>> On Fri, Oct 13, 2017 at 5:03 PM, Andy Lutomirski wrote: >>>> On Fri, Oct 13, 2017 at 1:

Re: [tip:x86/urgent] x86/mm: Flush more aggressively in lazy TLB mode

2017-10-14 Thread Andy Lutomirski
On Sat, Oct 14, 2017 at 3:49 AM, tip-bot for Andy Lutomirski wrote: > Commit-ID: b956575bed91ecfb136a8300742ecbbf451471ab > Gitweb: > https://git.kernel.org/tip/b956575bed91ecfb136a8300742ecbbf451471ab > Author: Andy Lutomirski > AuthorDate: Mon, 9 Oct 2017 09:50:49 -0

[PATCH 0/3] Fixes on top of the TLB fix in x86/urgent

2017-10-14 Thread Andy Lutomirski
I redid the TLB fix to resolve Boris' comments, and then Ingo applied the old version :-/ Here are my fixes respun on top of the version in -tip. If they pass muster (they're quite straightforward), can one of you get them to Linus? (If you disagree with Boris, then skip patch

[PATCH 3/3] x86/mm: Remove debug/x86/tlb_defer_switch_to_init_mm

2017-10-14 Thread Andy Lutomirski
Borislav thinks that we don't need this knob in a released kernel. Get rid of it. Fixes: b956575bed91 ("x86/mm: Flush more aggressively in lazy TLB mode") Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/tlbflush.h | 20 -- arch/x86/mm/tlb.c

[PATCH 2/3] x86/mm: Tidy up "x86/mm: Flush more aggressively in lazy TLB mode"

2017-10-14 Thread Andy Lutomirski
o_init_mm", which describes what it actually means. - Move the static_branch crap into a helper. - Improve comments. Actually removing the debugfs option is in the next patch. Fixes: b956575bed91 ("x86/mm: Flush more aggressively in lazy TLB mode") Signed-off-by: Andy Lut

[PATCH 1/3] x86/mm/64: Remove the last VM_BUG_ON from the TLB code

2017-10-14 Thread Andy Lutomirski
Let's avoid hard-to-diagnose crashes in the future. Signed-off-by: Andy Lutomirski --- arch/x86/mm/tlb.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 658bf0090565..7db23f9f804e 100644 --- a/arch/x86/mm/tlb.c +++ b/arc

Re: [tip:x86/urgent] x86/mm: Flush more aggressively in lazy TLB mode

2017-10-14 Thread Andy Lutomirski
On Sat, Oct 14, 2017 at 9:34 AM, Andy Lutomirski wrote: > On Sat, Oct 14, 2017 at 3:49 AM, tip-bot for Andy Lutomirski > wrote: >> Commit-ID: b956575bed91ecfb136a8300742ecbbf451471ab >> Gitweb: >> https://git.kernel.org/tip/b956575bed91ecfb136a8300742ecbbf4514

Re: [PATCH] x86/mm: Rip out the TLB benchmarking knob

2017-10-14 Thread Andy Lutomirski
On Sat, Oct 14, 2017 at 5:34 AM, Borislav Petkov wrote: > On Sat, Oct 14, 2017 at 03:49:12AM -0700, tip-bot for Andy Lutomirski wrote: >> Commit-ID: b956575bed91ecfb136a8300742ecbbf451471ab >> Gitweb: >> https://git.kernel.org/tip/b956575bed91ecfb136a8300742ecbbf451471a

Re: [PATCH v4] pidns: introduce syscall translate_pid

2017-10-16 Thread Andy Lutomirski
On Mon, Oct 16, 2017 at 3:54 PM, prakash.sangappa wrote: > > > On 10/16/2017 03:07 PM, Nagarathnam Muthusamy wrote: >> >> >> >> On 10/16/2017 02:36 PM, Andrew Morton wrote: >>> >>> On Sat, 14 Oct 2017 11:17:47 +0300 Konstantin Khlebnikov >>> wrote: >>> >>> pid_t translate_pid(pid_t pid, int s

Re: [lkp-robot] [x86/mm] c4c3c3c2d0: will-it-scale.per_process_ops -61.0% regression

2017-10-16 Thread Andy Lutomirski
On Mon, Oct 16, 2017 at 3:15 AM, Borislav Petkov wrote: > On Mon, Oct 16, 2017 at 10:39:17AM +0800, kernel test robot wrote: >> >> Greeting, >> >> FYI, we noticed a -61.0% regression of will-it-scale.per_process_ops due to >> commit: >> >> >> commit: c4c3c3c2d00826c88b5c02c20e80704664424b9b ("x86

Re: [PATCH] x86, syscalls: use SYSCALL_DEFINE() macros for sys_modify_ldt()

2017-10-17 Thread Andy Lutomirski
-EINVAL from a function returning 'unsigned long', we end > up with 0xffffffffffffffea in %rax, which is wrong. > > To work around this and maintain the 'int' behavior while using > the SYSCALL_DEFINEx() macros, we add a cast to 'unsigned int' > in both impleme
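The effect of the 'unsigned int' cast described in the quoted text can be illustrated with plain C integer conversions. This is a standalone sketch, not the kernel code: `ret_as_long()` and `ret_as_uint()` are hypothetical helpers that model the 64-bit value left in %rax, and `EINVAL_VALUE` hard-codes 22, the Linux value of EINVAL, to avoid depending on errno.h.

```c
#include <stdint.h>

#define EINVAL_VALUE 22  /* assumption: EINVAL as defined on Linux */

/* Sign-extending a negative int into a 64-bit register fills the top bits. */
static uint64_t ret_as_long(int err)
{
    return (uint64_t)(int64_t)err;
}

/* Casting through a 32-bit unsigned type zero-extends instead. */
static uint64_t ret_as_uint(int err)
{
    return (uint64_t)(uint32_t)err;
}
```

With `-EINVAL_VALUE` as input, `ret_as_long()` yields 0xffffffffffffffea while `ret_as_uint()` yields 0x00000000ffffffea, which is the historical 'int' behavior the cast preserves.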

Re: [PATCH v4] pidns: introduce syscall translate_pid

2017-10-17 Thread Andy Lutomirski
On Tue, Oct 17, 2017 at 8:38 AM, Prakash Sangappa wrote: > > > On 10/16/17 5:52 PM, Andy Lutomirski wrote: >> >> On Mon, Oct 16, 2017 at 3:54 PM, prakash.sangappa >> wrote: >>> >>> >>> On 10/16/2017 03:07 PM, Nagarathnam Muthusamy wrote: >

Re: [lkp-robot] [x86/mm] c4c3c3c2d0: will-it-scale.per_process_ops -61.0% regression

2017-10-17 Thread Andy Lutomirski
On Tue, Oct 17, 2017 at 1:00 AM, Borislav Petkov wrote: > On Tue, Oct 17, 2017 at 06:57:43AM +0200, Markus Trippelsdorf wrote: >> On 2017.10.16 at 18:06 -0700, Andy Lutomirski wrote: >> > On Mon, Oct 16, 2017 at 3:15 AM, Borislav Petkov wrote: >> > > On Mon, Oc

Re: [PATCH v4] pidns: introduce syscall translate_pid

2017-10-17 Thread Andy Lutomirski
On Tue, Oct 17, 2017 at 3:35 PM, prakash sangappa wrote: > > On 10/17/2017 3:02 PM, Andy Lutomirski wrote: >> >> On Tue, Oct 17, 2017 at 8:38 AM, Prakash Sangappa >> wrote: >>> >>> >>> On 10/16/17 5:52 PM, Andy Lutomirski wrote: >>

Re: [PATCHv1, RFC 0/8] Boot-time switching between 4- and 5-level paging

2017-05-25 Thread Andy Lutomirski
On Thu, May 25, 2017 at 4:24 PM, Linus Torvalds wrote: > On Thu, May 25, 2017 at 1:33 PM, Kirill A. Shutemov > wrote: >> Here's my first attempt to bring boot-time switching between 4- and 5-level paging. >> It looks not too terrible to me. I've expected it to be worse. > > If I read this right, you just ma

[PATCH v3 0/8] x86 TLB flush cleanups, moving toward PCID support

2017-05-25 Thread Andy Lutomirski
: - Rebased onto tip:x86/mm to pick up UV and Xen changes. - Drop the patches that Ingo already applied. Changes from RFC: - Fixed missing call to arch_tlbbatch_flush(). - "Be more consistent wrt PAGE_SHIFT vs PAGE_SIZE in tlb flush code" is new - Misc typos fixed. - Actually

[PATCH v3 5/8] x86/mm: Remove the UP tlbflush code; always use the formerly SMP code

2017-05-25 Thread Andy Lutomirski
since it means that any change to the TLB flush code had to make sure not to break it. Simplify everything by deleting the UP code. Cc: Rik van Riel Cc: Dave Hansen Cc: Nadav Amit Cc: Michal Hocko Cc: Andrew Morton Cc: Arjan van de Ven Signed-off-by: Andy Lutomirski --- arch/x86/Kconfig

[PATCH v3 3/8] x86/mm: Refactor flush_tlb_mm_range() to merge local and remote cases

2017-05-25 Thread Andy Lutomirski
Nadav Amit Cc: Michal Hocko Cc: Andrew Morton Cc: Arjan van de Ven Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/tlbflush.h | 1 - arch/x86/mm/tlb.c | 113 +--- 2 files changed, 48 insertions(+), 66 deletions(-) diff --git a/arc

[PATCH v3 7/8] x86/mm: Be more consistent wrt PAGE_SHIFT vs PAGE_SIZE in tlb flush code

2017-05-25 Thread Andy Lutomirski
: Andy Lutomirski --- arch/x86/mm/tlb.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 4bfadb869a1e..4bcec510174c 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -220,8 +220,7 @@ static void flush_tlb_func_common(const

[PATCH v3 8/8] x86,kvm: Teach KVM's VMX code that CR3 isn't a constant

2017-05-25 Thread Andy Lutomirski
rankly, it also scares me a bit that KVM ever treated CR3 as constant, but it looks like it was okay before.) Cc: Paolo Bonzini Cc: Radim Krčmář Cc: k...@vger.kernel.org Cc: Rik van Riel Cc: Dave Hansen Cc: Nadav Amit Cc: Michal Hocko Cc: Andrew Morton Cc: Arjan van de Ven Signed-off-by: A

[PATCH v3 1/8] x86/mm: Pass flush_tlb_info to flush_tlb_others() etc

2017-05-25 Thread Andy Lutomirski
ushes in a future patch. Cc: Rik van Riel Cc: Dave Hansen Cc: Nadav Amit Cc: Michal Hocko Cc: Andrew Morton Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/paravirt.h | 6 ++-- arch/x86/include/asm/paravirt_types.h | 5 ++- arch/x86/include/asm/tlbflush.h

[PATCH v3 6/8] x86/mm: Rework lazy TLB to track the actual loaded mm

2017-05-25 Thread Andy Lutomirski
his change, perf may behave a bit erratically if it tries to read user memory in kernel thread context. We should build on this patch to teach perf to never look at user memory when cpu_tlbstate.loaded_mm != current->mm. Cc: Rik van Riel Cc: Dave Hansen Cc: Nadav Amit Cc: Michal Hocko Cc: An

[PATCH v3 4/8] x86/mm: Use new merged flush logic in arch_tlbbatch_flush()

2017-05-25 Thread Andy Lutomirski
Amit Cc: Michal Hocko Cc: Arjan van de Ven Cc: Andrew Morton Signed-off-by: Andy Lutomirski --- arch/x86/mm/tlb.c | 8 ++-- 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 12b8812e8926..c03b4a0ce58c 100644 --- a/arch/x86/mm/tlb.c

[PATCH v3 2/8] x86/mm: Change the leave_mm() condition for local TLB flushes

2017-05-25 Thread Andy Lutomirski
te flush code, but for ease of verifying and bisecting the patch, I want the local and remote flush behavior to match first. This patch changes the local code to match the remote code. Cc: Rik van Riel Cc: Dave Hansen Cc: Nadav Amit Cc: Michal Hocko Cc: Arjan van de Ven Signed-off-by: An

Re: [PATCH v3 2/8] x86/mm: Change the leave_mm() condition for local TLB flushes

2017-05-25 Thread Andy Lutomirski
On Thu, May 25, 2017 at 6:39 PM, Rik van Riel wrote: > On Thu, 2017-05-25 at 17:47 -0700, Andy Lutomirski wrote: >> >> +++ b/arch/x86/mm/tlb.c >> @@ -311,7 +311,7 @@ void flush_tlb_mm_range(struct mm_struct *mm, >> unsigned long start, >> go

Re: [PATCHv1, RFC 0/8] Boot-time switching between 4- and 5-level paging

2017-05-26 Thread Andy Lutomirski
On Thu, May 25, 2017 at 9:18 PM, Kevin Easton wrote: > (If it weren't for that, maybe you could point the last entry in the PML4 > at the PML4 itself, so it also works as a PML5 for accessing kernel > addresses? And of course make sure nothing gets loaded above > 0xff80). This was an

Re: [x86/mm] e2a7dcce31: kernel_BUG_at_arch/x86/mm/tlb.c

2017-05-27 Thread Andy Lutomirski
On Sat, May 27, 2017 at 6:31 AM, kernel test robot wrote: > > FYI, we noticed the following commit: > > commit: e2a7dcce31f10bd7471b4245a6d1f2de344e7adf ("x86/mm: Rework lazy TLB to > track the actual loaded mm") > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git > x86/tlbflush_cleanu

Re: [PATCH 16/16] platform/x86: dell-wmi: Convert to the WMI bus infrastructure

2017-05-27 Thread Andy Lutomirski
On Sat, May 27, 2017 at 3:50 AM, Pali Rohár wrote: > On Saturday 27 May 2017 07:31:30 Darren Hart wrote: >> - dell_wmi_input_dev->name = "Dell WMI hotkeys"; >> - dell_wmi_input_dev->phys = "wmi/input0"; >> - dell_wmi_input_dev->id.bustype = BUS_HOST; >> + priv->input_dev->name = "D

Re: [PATCH] platform/x86: dell-rbtn: Improve explanation about DELLABC6

2017-05-27 Thread Andy Lutomirski
On Sat, May 27, 2017 at 4:01 AM, Pali Rohár wrote: > On Saturday 27 May 2017 07:16:19 Darren Hart wrote: >> From: Andy Lutomirski >> >> According to Mario at Dell, the DELLABC6 device should not be used on >> a Linux system. It also conflicts with Intel-HID and its

Re: [PATCH 1/2] nvme: Wait at least 6000ms before entering the deepest idle state

2017-05-27 Thread Andy Lutomirski
On Fri, May 26, 2017 at 1:52 AM, Christoph Hellwig wrote: > On Wed, May 24, 2017 at 03:06:30PM -0700, Andy Lutomirski wrote: >> This should at least make vendors less nervous about Linux's APST >> policy. I'm not aware of any concrete bugs it would fix (although I >

Re: [x86/mm] e2a7dcce31: kernel_BUG_at_arch/x86/mm/tlb.c

2017-05-27 Thread Andy Lutomirski
On Sat, May 27, 2017 at 9:00 AM, Andy Lutomirski wrote: > On Sat, May 27, 2017 at 6:31 AM, kernel test robot > wrote: >> >> FYI, we noticed the following commit: >> >> commit: e2a7dcce31f10bd7471b4245a6d1f2de344e7adf ("x86/mm: Rework lazy TLB >>

Re: [PATCH 16/16] platform/x86: dell-wmi: Convert to the WMI bus infrastructure

2017-05-27 Thread Andy Lutomirski
On Sat, May 27, 2017 at 9:17 AM, Dmitry Torokhov wrote: > On May 27, 2017 9:04:38 AM PDT, Andy Lutomirski wrote: >>On Sat, May 27, 2017 at 3:50 AM, Pali Rohár >>wrote: >>> On Saturday 27 May 2017 07:31:30 Darren Hart wrote: >>>> - dell_wmi

Re: [PATCH 15/16] platform/x86: wmi-mof: New driver to expose embedded WMI MOF metadata

2017-05-27 Thread Andy Lutomirski
; > On Saturday 27 May 2017 07:31:29 Darren Hart wrote: >> From: Andy Lutomirski >> >> Quite a few laptops (and maybe servers?) have embedded WMI MOF > > Not "a few", but "lots of" :-) > >> metadata. I think that Samba has tools to interpret

Re: [x86/mm] e2a7dcce31: kernel_BUG_at_arch/x86/mm/tlb.c

2017-05-28 Thread Andy Lutomirski
On Sun, May 28, 2017 at 7:11 AM, Rik van Riel wrote: > On Sat, 2017-05-27 at 09:00 -0700, Andy Lutomirski wrote: >> On Sat, May 27, 2017 at 6:31 AM, kernel test robot >> wrote: >> > >> > FYI, we noticed the following commit: >> > >> > commit

Re: [tip:x86/urgent] x86/PAT: Fix Xorg regression on CPUs that don't support PAT

2017-05-28 Thread Andy Lutomirski
On Sun, May 28, 2017 at 11:18 AM, Bernhard Held wrote: > Hi, > > this patch breaks the boot of my kernel. The last message is "Booting > the kernel.". > > My setup might be unusual: I'm running a Xeon E5450 (LGA 771) in a > Gigabyte G33-DS3R board (LGA 775). The BIOS is patched with the > microco

Re: [PATCH 00/23] KAISER: unmap most of the kernel from userspace page tables

2017-11-01 Thread Andy Lutomirski
On Tue, Oct 31, 2017 at 4:44 PM, Dave Hansen wrote: > On 10/31/2017 04:27 PM, Linus Torvalds wrote: >> Inconveniently, the people you cc'd on the actual patches did *not* >> get cc'd with this 00/23 cover letter email. > > Urg, sorry about that. > >> (a) is this on top of Andy's entry cleanups? >

Re: [PATCH 12/23] x86, kaiser: map dynamically-allocated LDTs

2017-11-01 Thread Andy Lutomirski
On Tue, Oct 31, 2017 at 3:32 PM, Dave Hansen wrote: > > Normally, a process just has a NULL mm->context.ldt. But, we > have a syscall for a process to set a new one. If a process does > that, we need to map the new LDT. > > The original KAISER patch missed this case. Tglx suggested that we inst

Re: [PATCH 04/23] x86, tlb: make CR4-based TLB flushes more robust

2017-11-01 Thread Andy Lutomirski
On Tue, Oct 31, 2017 at 3:31 PM, Dave Hansen wrote: > > Our CR4-based TLB flush currently requires global pages to be > supported *and* enabled. But, we really only need for them to be > supported. Make the code more robust by allowing X86_CR4_PGE to > clear as well as set. > > This change was

Re: [PATCH 21/23] x86, pcid, kaiser: allow flushing for future ASID switches

2017-11-01 Thread Andy Lutomirski
On Tue, Oct 31, 2017 at 3:32 PM, Dave Hansen wrote: > > If we change the page tables in such a way that we need an > invalidation of all contexts (aka. PCIDs / ASIDs) we can > actively invalidate them by: > 1. INVPCID for each PCID (works for single pages too). > 2. Load CR3 with each PCID witho

Re: [PATCH] x86, mm: make alternatives code do stronger TLB flush

2017-11-01 Thread Andy Lutomirski
On Tue, Oct 31, 2017 at 11:07 AM, Dave Hansen wrote: > > From: Dave Hansen > > local_flush_tlb() does a CR3 write. But, that kind of TLB flush is > not guaranteed to invalidate global pages. The entire kernel is > mapped with global pages. > > Also, now that we have PCIDs, local_flush_tlb() wil

Re: [PATCH 04/23] x86, tlb: make CR4-based TLB flushes more robust

2017-11-01 Thread Andy Lutomirski
On Wed, Nov 1, 2017 at 3:11 AM, Kirill A. Shutemov wrote: > On Wed, Nov 01, 2017 at 01:01:45AM -0700, Andy Lutomirski wrote: >> On Tue, Oct 31, 2017 at 3:31 PM, Dave Hansen >> wrote: >> > >> > Our CR4-based TLB flush currently requires global pages to be >

Re: [PATCH 17/18] x86/asm/64: Remove thread_struct::sp0

2017-11-01 Thread Andy Lutomirski
On Wed, Nov 1, 2017 at 3:23 AM, Borislav Petkov wrote: > On Thu, Oct 26, 2017 at 01:26:49AM -0700, Andy Lutomirski wrote: >> On x86_64, we can easily calculate sp0 when needed instead of >> storing it in thread_struct. >> >> On x86_32, a similar cleanup would be pos

Re: [PATCH 18/18] x86/traps: Use a new on_thread_stack() helper to clean up an assertion

2017-11-01 Thread Andy Lutomirski
I added a commit message :) On Wed, Nov 1, 2017 at 3:31 AM, Borislav Petkov wrote: > On Thu, Oct 26, 2017 at 01:26:50AM -0700, Andy Lutomirski wrote: >> Signed-off-by: Andy Lutomirski >> --- >> arch/x86/include/asm/processor.h | 6 ++ >> arch/x86/kernel/traps.c

Re: [PATCH 04/23] x86, tlb: make CR4-based TLB flushes more robust

2017-11-01 Thread Andy Lutomirski
On Wed, Nov 1, 2017 at 3:56 AM, Kirill A. Shutemov wrote: > On Wed, Nov 01, 2017 at 03:38:23AM -0700, Andy Lutomirski wrote: >> On Wed, Nov 1, 2017 at 3:11 AM, Kirill A. Shutemov >> wrote: >> > On Wed, Nov 01, 2017 at 01:01:45AM -0700, Andy Lutomirski wrote: >>

Re: [PATCH 07/18] x86/asm/64: Merge the fast and slow SYSRET paths

2017-11-01 Thread Andy Lutomirski
On Fri, Oct 27, 2017 at 1:11 PM, Borislav Petkov wrote: > On Thu, Oct 26, 2017 at 01:26:39AM -0700, Andy Lutomirski wrote: >> They did almost the same thing. Remove a bunch of pointless >> instructions (mostly hidden in macros) and reduce cognitive load by >> merging them

Re: [PATCH 02/18] x86/asm/64: Split the iret-to-user and iret-to-kernel paths

2017-11-01 Thread Andy Lutomirski
On Fri, Oct 27, 2017 at 1:04 PM, Borislav Petkov wrote: > On Thu, Oct 26, 2017 at 01:26:34AM -0700, Andy Lutomirski wrote: >> These code paths will diverge soon. >> >> Signed-off-by: Andy Lutomirski >> --- >> arch/x86/entry/entry_64.S| 32

Re: [PATCH 02/18] x86/asm/64: Split the iret-to-user and iret-to-kernel paths

2017-11-01 Thread Andy Lutomirski
On Fri, Oct 27, 2017 at 11:05 AM, Dave Hansen wrote: > On 10/26/2017 01:26 AM, Andy Lutomirski wrote: >> +GLOBAL(restore_regs_and_return_to_usermode) >> +#ifdef CONFIG_DEBUG_ENTRY >> + testl $3, CS(%rsp) >> + jnz 1f >> + ud2 > > A nit from

Re: [PATCH 21/23] x86, pcid, kaiser: allow flushing for future ASID switches

2017-11-01 Thread Andy Lutomirski
On Wed, Nov 1, 2017 at 7:17 AM, Dave Hansen wrote: > On 11/01/2017 01:03 AM, Andy Lutomirski wrote: >>> This ensures that any future context switches will do a full flush >>> of the TLB so they pick up the changes. >> I'm confused. What was wrong with the old co

Re: [PATCH 00/23] KAISER: unmap most of the kernel from userspace page tables

2017-11-01 Thread Andy Lutomirski
On Wed, Nov 1, 2017 at 12:05 PM, Linus Torvalds wrote: > On Wed, Nov 1, 2017 at 11:46 AM, Dave Hansen > wrote: >> >> The vmalloc()'d stacks definitely need the page table walk. > > Ugh, yes. Nasty. > > Andy at some point mentioned a per-cpu initial stack trampoline thing > for his exception patch

Re: [PATCH 07/18] x86/asm/64: Merge the fast and slow SYSRET paths

2017-11-01 Thread Andy Lutomirski
On Wed, Nov 1, 2017 at 10:25 AM, Brian Gerst wrote: > On Thu, Oct 26, 2017 at 4:26 AM, Andy Lutomirski wrote: >> They did almost the same thing. Remove a bunch of pointless >> instructions (mostly hidden in macros) and reduce cognitive load by >> merging them. >

Re: [PATCH 21/23] x86, pcid, kaiser: allow flushing for future ASID switches

2017-11-01 Thread Andy Lutomirski
On Wed, Nov 1, 2017 at 1:59 PM, Dave Hansen wrote: > On 11/01/2017 01:31 PM, Andy Lutomirski wrote: >> On Wed, Nov 1, 2017 at 7:17 AM, Dave Hansen >> wrote: >>> On 11/01/2017 01:03 AM, Andy Lutomirski wrote: >>>>> This ensures that any future context swi

Re: [PATCH 02/23] x86, kaiser: do not set _PAGE_USER for init_mm page tables

2017-11-01 Thread Andy Lutomirski
On Wed, Nov 1, 2017 at 2:11 PM, Thomas Gleixner wrote: > On Tue, 31 Oct 2017, Dave Hansen wrote: > >> >> init_mm is for kernel-exclusive use. If someone is allocating page >> tables in it, do not set _PAGE_USER on them. This ensures that >> we do *not* set NX on these page tables in the KAISER c

Re: [PATCH 02/23] x86, kaiser: do not set _PAGE_USER for init_mm page tables

2017-11-02 Thread Andy Lutomirski
On Wed, Nov 1, 2017 at 2:28 PM, Thomas Gleixner wrote: > On Wed, 1 Nov 2017, Andy Lutomirski wrote: > >> On Wed, Nov 1, 2017 at 2:11 PM, Thomas Gleixner wrote: >> > On Tue, 31 Oct 2017, Dave Hansen wrote: >> > >> >> >> >> init_mm is fo

Re: [PATCH 02/23] x86, kaiser: do not set _PAGE_USER for init_mm page tables

2017-11-02 Thread Andy Lutomirski
On Wed, Nov 1, 2017 at 3:20 PM, Thomas Gleixner wrote: > On Wed, 1 Nov 2017, Linus Torvalds wrote: >> On Wed, Nov 1, 2017 at 2:52 PM, Dave Hansen >> wrote: >> > On 11/01/2017 02:28 PM, Thomas Gleixner wrote: >> >> On Wed, 1 Nov 2017, Andy Lutomirski wrote:

Re: [PATCH 00/23] KAISER: unmap most of the kernel from userspace page tables

2017-11-02 Thread Andy Lutomirski
On Wed, Nov 1, 2017 at 1:33 PM, Andy Lutomirski wrote: > On Wed, Nov 1, 2017 at 12:05 PM, Linus Torvalds > wrote: >> On Wed, Nov 1, 2017 at 11:46 AM, Dave Hansen >> wrote: >>> >>> The vmalloc()'d stacks definitely need the page table walk. >

Re: [PATCH 00/23] KAISER: unmap most of the kernel from userspace page tables

2017-11-02 Thread Andy Lutomirski
On Thu, Nov 2, 2017 at 12:32 AM, Andy Lutomirski wrote: > On Wed, Nov 1, 2017 at 1:33 PM, Andy Lutomirski wrote: >> On Wed, Nov 1, 2017 at 12:05 PM, Linus Torvalds >> wrote: >>> On Wed, Nov 1, 2017 at 11:46 AM, Dave Hansen >>> wrote: >>>> >>

[PATCH v2 00/20] Pile o' entry/exit/sp0 changes

2017-11-02 Thread Andy Lutomirski
ch - Add some assertions - Cleanups Andy Lutomirski (19): x86/asm/64: Remove the restore_c_regs_and_iret label x86/asm/64: Split the iret-to-user and iret-to-kernel paths x86/asm/64: Move SWAPGS into the common iret-to-usermode path x86/asm/64: Simplify reg restore code in the standard IRET p

[PATCH v2 01/20] x86/asm/64: Remove the restore_c_regs_and_iret label

2017-11-02 Thread Andy Lutomirski
The only user was the 64-bit opportunistic SYSRET failure path, and that path didn't really need it. This change makes the opportunistic SYSRET code a bit more straightforward and gets rid of the label. Reviewed-by: Borislav Petkov Signed-off-by: Andy Lutomirski --- arch/x86/entry/entry

[PATCH v2 03/20] x86/asm/64: Move SWAPGS into the common iret-to-usermode path

2017-11-02 Thread Andy Lutomirski
All of the code paths that ended up doing IRET to usermode did SWAPGS immediately beforehand. Move the SWAPGS into the common code. Signed-off-by: Andy Lutomirski --- arch/x86/entry/entry_64.S| 32 ++-- arch/x86/entry/entry_64_compat.S | 3 +-- 2 files

[PATCH v2 02/20] x86/asm/64: Split the iret-to-user and iret-to-kernel paths

2017-11-02 Thread Andy Lutomirski
These code paths will diverge soon. Signed-off-by: Andy Lutomirski --- arch/x86/entry/entry_64.S| 34 +- arch/x86/entry/entry_64_compat.S | 2 +- arch/x86/kernel/head_64.S| 2 +- 3 files changed, 27 insertions(+), 11 deletions(-) diff --git a

[PATCH v2 04/20] x86/asm/64: Simplify reg restore code in the standard IRET paths

2017-11-02 Thread Andy Lutomirski
The old code restored all the registers with movq instead of pop. In theory, this was done because some CPUs have higher movq throughput, but any gain there would be tiny and is almost certainly outweighed by the higher text size. This saves 96 bytes of text. Signed-off-by: Andy Lutomirski

[PATCH v2 13/20] x86/asm/64: Pass sp0 directly to load_sp0()

2017-11-02 Thread Andy Lutomirski
load_sp0() had an odd signature: void load_sp0(struct tss_struct *tss, struct thread_struct *thread); Simplify it to: void load_sp0(unsigned long sp0); Reviewed-by: Borislav Petkov Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/paravirt.h | 5 ++--- arch/x86/include/asm
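
To illustrate the signature change described in this snippet, here is a minimal userspace sketch. The two structures are hypothetical one-field stand-ins, not the kernel definitions, and `cpu_tss` here is an ordinary global that only models a per-CPU TSS; the name `load_sp0_old` is made up to keep both shapes side by side.

```c
#include <assert.h>

/* Hypothetical stand-ins: only the one field this sketch needs. */
struct thread_struct { unsigned long sp0; };
struct tss_struct    { unsigned long sp0; };

/* Models the TSS of the current CPU (assumption: a plain global). */
static struct tss_struct cpu_tss;

/* Old shape: the caller hands over both structures and the callee
 * copies the single value it actually needs. */
static void load_sp0_old(struct tss_struct *tss, struct thread_struct *thread)
{
    assert(tss != 0 && thread != 0);
    tss->sp0 = thread->sp0;
}

/* New shape: the caller passes the value itself; the callee already
 * knows which TSS it has to update. */
static void load_sp0(unsigned long sp0)
{
    cpu_tss.sp0 = sp0;
}
```

With the narrower signature, callers no longer need a thread_struct in hand just to update the stack pointer slot.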

[PATCH v2 06/20] x86/asm/64: Use pop instead of movq in syscall_return_via_sysret

2017-11-02 Thread Andy Lutomirski
Saves 64 bytes. Reviewed-by: Borislav Petkov Signed-off-by: Andy Lutomirski --- arch/x86/entry/entry_64.S | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index e70303258daf..86fdce00e682 100644 --- a/arch

[PATCH v2 16/20] x86/boot/64: Stop initializing TSS.sp0 at boot

2017-11-02 Thread Andy Lutomirski
0 and stop initializing it on CPU init. The comment text mostly comes from Dave Hansen. Thanks! Signed-off-by: Andy Lutomirski --- arch/x86/kernel/cpu/common.c | 12 ++-- arch/x86/kernel/process.c| 8 +++- 2 files changed, 17 insertions(+), 3 deletions(-) diff --git a/arch/x86/k

[PATCH v2 17/20] x86/asm/64: Remove all remaining direct thread_struct::sp0 reads

2017-11-02 Thread Andy Lutomirski
The only remaining readers are in context switch code or vm86(), and they all just want to update TSS.sp0 to match the current task. Replace them all with a new helper update_sp0(). Reviewed-by: Borislav Petkov Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/switch_to.h | 6 ++ arch

[PATCH v2 08/20] x86/entry/64: Use POP instead of MOV to restore regs on NMI return

2017-11-02 Thread Andy Lutomirski
This gets rid of the last user of the old RESTORE_..._REGS infrastructure. Signed-off-by: Andy Lutomirski --- arch/x86/entry/entry_64.S | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index 2e7f4952af94

[PATCH v2 11/20] x86/asm/64: De-Xen-ify our NMI code

2017-11-02 Thread Andy Lutomirski
Gross Signed-off-by: Andy Lutomirski --- arch/x86/entry/entry_64.S | 30 ++ 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index b4df83177d14..b58fb6335850 100644 --- a/arch/x86/entry/entry_64.S

[PATCH v2 14/20] x86/asm: Add task_top_of_stack() to find the top of a task's stack

2017-11-02 Thread Andy Lutomirski
This will let us get rid of a few places that hardcode accesses to thread.sp0. Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/processor.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 064b84722166

[PATCH v2 12/20] x86/asm/32: Pull MSR_IA32_SYSENTER_CS update code out of native_load_sp0()

2017-11-02 Thread Andy Lutomirski
uest, as lguest didn't have any SYSENTER support at all. Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/processor.h | 7 --- arch/x86/include/asm/switch_to.h | 12 arch/x86/kernel/process_32.c | 4 +++- arch/x86/kernel/process_64.c | 2 +- arch/x86/kernel/v

[PATCH v2 10/20] xen: add xen nmi trap entry

2017-11-02 Thread Andy Lutomirski
t rid of the very fragile and questionable dependencies between the bare metal NMI handler and Xen assumptions believed to be broken anyway. Signed-off-by: Juergen Gross Signed-off-by: Andy Lutomirski --- arch/x86/entry/entry_64.S| 2 +- arch/x86/include/asm/traps.h | 2 +- arch/x8

[PATCH v2 18/20] x86/boot/32: Fix cpu_current_top_of_stack initialization at boot

2017-11-02 Thread Andy Lutomirski
cpu_current_top_of_stack's initialization forgot about TOP_OF_KERNEL_STACK_PADDING. This bug didn't matter because the idle threads never enter user mode. Reviewed-by: Borislav Petkov Signed-off-by: Andy Lutomirski --- arch/x86/kernel/smpboot.c | 3 +-- 1 file changed, 1 inser

[PATCH v2 20/20] x86/traps: Use a new on_thread_stack() helper to clean up an assertion

2017-11-02 Thread Andy Lutomirski
Let's keep the stack-related logic together rather than open-coding a comparison in an assertion in the traps code. Reviewed-by: Borislav Petkov Signed-off-by: Andy Lutomirski --- arch/x86/include/asm/processor.h | 6 ++ arch/x86/kernel/traps.c | 3 +-- 2 files chang
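
A helper of this kind usually boils down to a single range check. The sketch below is an assumption about the general technique, not the code from the patch: `sketch_on_thread_stack` and `SKETCH_THREAD_SIZE` are hypothetical names, and the 16 KiB size is picked only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed stack size for the sketch; not taken from the patch. */
#define SKETCH_THREAD_SIZE 16384UL

/* Returns true when sp lies within SKETCH_THREAD_SIZE bytes below
 * (or exactly at) top_of_stack. The subtraction is unsigned, so an
 * sp above the top wraps around to a huge value and fails the test. */
static bool sketch_on_thread_stack(uintptr_t top_of_stack, uintptr_t sp)
{
    return top_of_stack - sp < SKETCH_THREAD_SIZE;
}
```

Relying on unsigned wraparound lets one comparison cover both bounds, which is why such checks are often written this way instead of as two separate comparisons.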

[PATCH v2 19/20] x86/asm/64: Remove thread_struct::sp0

2017-11-02 Thread Andy Lutomirski
On x86_64, we can easily calculate sp0 when needed instead of storing it in thread_struct. On x86_32, a similar cleanup would be possible, but it would require cleaning up the vm86 code first, and that can wait for a later cleanup series. Signed-off-by: Andy Lutomirski --- arch/x86/include/asm
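
The idea of computing sp0 on demand rather than storing it can be sketched in userspace as follows. Everything here is a labeled assumption: `struct sketch_task`, `sketch_top_of_stack`, the 16 KiB stack size, and the zero padding value are illustrative and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only. */
#define SKETCH_STACK_SIZE    16384UL  /* size of one task stack */
#define SKETCH_STACK_PADDING 0UL      /* bytes reserved at the very top */

/* Hypothetical task record: it stores only the base of its stack. */
struct sketch_task {
    void *stack;
};

/* Derive the top-of-stack value from the base each time it is needed,
 * so no separate sp0 field has to be kept in sync. */
static uintptr_t sketch_top_of_stack(const struct sketch_task *t)
{
    assert(t != 0 && t->stack != 0);
    return (uintptr_t)t->stack + SKETCH_STACK_SIZE - SKETCH_STACK_PADDING;
}
```

Because the result depends only on the stack base and two constants, there is no cached copy that can go stale.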
