Re: [PATCH] syscalls: Update the syscall #defines to match uapi

2019-08-11 Thread Andy Lutomirski
On Fri, Aug 9, 2019 at 6:11 PM Alistair Francis wrote: > > Update the #defines around sys_fstat64() and sys_fstatat64() to match > the #defines around the __NR3264_fstatat and __NR3264_fstat definitions > in include/uapi/asm-generic/unistd.h. This avoids compiler failures if > one is defined.

[WIP 4/4] bpf: Allow creating all program types without privilege

2019-08-05 Thread Andy Lutomirski
verifiers. Signed-off-by: Andy Lutomirski --- kernel/bpf/syscall.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 23f8f89d2a86..730afa2be786 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1649,8 +1649,7

[WIP 1/4] bpf: Respect persistent map and prog access modes

2019-08-05 Thread Andy Lutomirski
In the interest of making bpf() more useful by unprivileged users, this patch teaches bpf to respect access modes on map and prog inodes. The permissions are: R on a map: read the map W on a map: write the map Referencing a map from a program should require RW. R on a prog: Read or introspect
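The map rules listed in the snippet above can be sketched as a small permission check. This is only an illustration: the bit values and every `sketch_` name are hypothetical and do not come from the patch, which works on inode access modes. The prog rules are left out because the snippet is cut off before they are fully stated.

```c
#include <stdbool.h>

/*
 * Hypothetical permission bits for illustration only; none of these
 * names or values are taken from the patch.
 */
#define SKETCH_MAY_WRITE 0x2u
#define SKETCH_MAY_READ  0x4u

/* R on a map: the holder may read the map. */
bool sketch_map_may_read(unsigned int mode)
{
	return (mode & SKETCH_MAY_READ) != 0;
}

/* W on a map: the holder may write the map. */
bool sketch_map_may_write(unsigned int mode)
{
	return (mode & SKETCH_MAY_WRITE) != 0;
}

/* Referencing a map from a program should require both R and W. */
bool sketch_prog_may_reference_map(unsigned int mode)
{
	unsigned int rw = SKETCH_MAY_READ | SKETCH_MAY_WRITE;

	return (mode & rw) == rw;
}
```

Under these assumptions a read-only handle can look up map entries but cannot be used to load a program that references the map.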

[WIP 2/4] bpf: Don't require mknod() permission to pin an object

2019-08-05 Thread Andy Lutomirski
is currently the only user in the kernel outside of mknod() itself that uses it to create regular (i.e. S_IFREG) files. Signed-off-by: Andy Lutomirski --- kernel/bpf/inode.c | 4 1 file changed, 4 deletions(-) diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c index cb07736b33ae

[WIP 0/4] bpf: A bit of progress toward unprivileged use

2019-08-05 Thread Andy Lutomirski
Other than the mknod() patch, this is not ready for prime time. These patches try to make progress toward making bpf() more useful without privilege Andy Lutomirski (4): bpf: Respect persistent map and prog access modes bpf: Don't require mknod() permission to pin an object bpf: Add a way

[WIP 3/4] bpf: Add a way to mark functions as requiring privilege

2019-08-05 Thread Andy Lutomirski
don't want to inadvertently generate audit events for privileges that are never used. So it's the idea that counts :) Signed-off-by: Andy Lutomirski --- include/linux/bpf.h | 15 +++ include/linux/bpf_verifier.h | 1 + kernel/bpf/verifier.c| 8

Re: [RFC PATCH v1 1/5] fs: Add support for an O_MAYEXEC flag on sys_open()

2019-08-04 Thread Andy Lutomirski
On Wed, Dec 12, 2018 at 6:43 AM Jan Kara wrote: > > On Wed 12-12-18 09:17:08, Mickaël Salaün wrote: > > When the O_MAYEXEC flag is passed, sys_open() may be subject to > > additional restrictions depending on a security policy implemented by an > > LSM through the inode_permission hook. > > > >

Re: [patch 2/5] x86/kvm: Handle task_work on VMENTER/EXIT

2019-08-02 Thread Andy Lutomirski
> On Aug 2, 2019, at 3:22 PM, Thomas Gleixner wrote: > >> On Fri, 2 Aug 2019, Paolo Bonzini wrote: >>> On 01/08/19 23:47, Thomas Gleixner wrote: >>> Right you are about cond_resched() being called, but for SRCU this does not >>> matter unless there is some way to do a synchronize operation on

Re: [PATCHv5 25/37] x86/vdso: Switch image on setns()/clone()

2019-08-01 Thread Andy Lutomirski
On Wed, Jul 31, 2019 at 11:09 PM wrote: > > On July 31, 2019 10:34:26 PM PDT, Andy Lutomirski wrote: > >On Mon, Jul 29, 2019 at 2:58 PM Dmitry Safonov wrote: > >> > >> As it has been discussed on timens RFC, adding a new conditional > >branch > >> `i

Re: [patch 1/5] tracehook: Provide TIF_NOTIFY_RESUME handling for KVM

2019-08-01 Thread Andy Lutomirski
> On Aug 1, 2019, at 7:48 AM, Peter Zijlstra wrote: > >> On Thu, Aug 01, 2019 at 04:32:51PM +0200, Thomas Gleixner wrote: >> +#ifdef CONFIG_HAVE_ARCH_TRACEHOOK >> +/** >> + * tracehook_handle_notify_resume - Notify resume handling for virt >> + * >> + * Called with interrupts and preemption

Re: [PATCHv5 25/37] x86/vdso: Switch image on setns()/clone()

2019-07-31 Thread Andy Lutomirski
On Mon, Jul 29, 2019 at 2:58 PM Dmitry Safonov wrote: > > As it has been discussed on timens RFC, adding a new conditional branch > `if (inside_time_ns)` on VDSO for all processes is undesirable. > It will add a penalty for everybody as branch predictor may mispredict > the jump. Also there are

Re: [PATCHv5 01/37] ns: Introduce Time Namespace

2019-07-31 Thread Andy Lutomirski
On Mon, Jul 29, 2019 at 2:58 PM Dmitry Safonov wrote: > > From: Andrei Vagin > > Time Namespace isolates clock values. > +static int timens_install(struct nsproxy *nsproxy, struct ns_common *new) > +{ > + struct time_namespace *ns = to_time_ns(new); > + > + if

Re: [PATCHv5 21/37] x86/vdso: Restrict splitting VVAR VMA

2019-07-31 Thread Andy Lutomirski
On Mon, Jul 29, 2019 at 2:58 PM Dmitry Safonov wrote: > > Although, time namespace can work with VVAR VMA split, it seems worth > to forbid splitting VVAR resulting in stricter ABI and reducing amount > of corner-cases to consider while working further on VDSO. > > I don't think there is any

Re: [PATCHv5 23/37] x86/vdso: Add offsets page in vvar

2019-07-31 Thread Andy Lutomirski
On Mon, Jul 29, 2019 at 2:58 PM Dmitry Safonov wrote: > > From: Andrei Vagin > > As modern applications fetch time from VDSO without entering the kernel, > it's needed to provide offsets for userspace code inside time namespace. > > A page for timens offsets is allocated on time namespace

Re: [PATCHv5 28/37] x86/vdso: Enable static branches for the timens vdso

2019-07-31 Thread Andy Lutomirski
On Mon, Jul 29, 2019 at 2:58 PM Dmitry Safonov wrote: > > From: Andrei Vagin > > As it has been discussed on timens RFC, adding a new conditional branch > `if (inside_time_ns)` on VDSO for all processes is undesirable. > > Addressing those problems, there are two versions of VDSO's .so: > for

Re: [patch V2 3/5] lib/vdso/32: Provide legacy syscall fallbacks

2019-07-30 Thread Andy Lutomirski
On Tue, Jul 30, 2019 at 2:39 AM Thomas Gleixner wrote: > > To address the regression which causes seccomp to deny applications the > access to clock_gettime64() and clock_getres64() syscalls because they > are not enabled in the existing filters. > > That trips over the fact that 32bit VDSOs use

Re: [PATCH v2] sched/core: Don't use dying mm as active_mm of kthreads

2019-07-29 Thread Andy Lutomirski
> On Jul 29, 2019, at 8:03 AM, Peter Zijlstra wrote: > >> On Mon, Jul 29, 2019 at 10:51:51AM -0400, Waiman Long wrote: >>> On 7/29/19 4:52 AM, Peter Zijlstra wrote: On Sat, Jul 27, 2019 at 01:10:47PM -0400, Waiman Long wrote: It was found that a dying mm_struct where the owning task

Re: [PATCH] tracing: Prevent RCU EQS breakage in preemptirq events

2019-07-28 Thread Andy Lutomirski
On Sun, Jul 28, 2019 at 6:08 PM Eiichi Tsukata wrote: > > If context tracking is enabled, causing page fault in preemptirq > irq_enable or irq_disable events triggers the following RCU EQS warning. > Yuck. > diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c > index

Re: [PATCH] x86: panic when a kernel stack overflow is detected

2019-07-28 Thread Andy Lutomirski
On Sun, Jul 28, 2019 at 6:59 PM Daniel Axtens wrote: > > Currently, when a kernel stack overflow is detected via VMAP_STACK, > the task is killed with die(). > > This isn't safe, because we don't know how that process has affected > kernel state. In particular, we don't know what locks have been

Re: [patch 4/5] x86/vdso/32: Use 32bit syscall fallback

2019-07-28 Thread Andy Lutomirski
mplement the 32bit variants which use the legacy syscalls and select the > variant in the core library. > > The 64bit time variants are not removed because they are required for the > time64 based vdso accessors. Reviewed-by: Andy Lutomirski

Re: [patch 3/5] lib/vdso/32: Provide legacy syscall fallbacks

2019-07-28 Thread Andy Lutomirski
at 32bit VDSOs use the new clock_gettime64() and > clock_getres64() syscalls in the fallback path. > > Implement a __cvdso_clock_get*time32() variants which invokes the legacy > 32bit syscalls when the architecture requests it. > > The conditional can go away once all architectur

Re: [patch 2/5] lib/vdso: Move fallback invocation to the callers

2019-07-28 Thread Andy Lutomirski
; syscall fallback in the 64bit and 32bit variants. > > Preparatory work for using legacy syscalls in 32bit VDSO. No functional > change. Reviewed-by: Andy Lutomirski

Re: [patch 1/5] lib/vdso/32: Remove inconsistent NULL pointer checks

2019-07-28 Thread Andy Lutomirski
ast > which only works because the pointer is NULL. Reviewed-by: Andy Lutomirski FWIW, the equivalent change to gettimeofday would be an ABI break, since we historically have that check, and it even makes sense there.

Re: [5.2 REGRESSION] Generic vDSO breaks seccomp-enabled userspace on i386

2019-07-27 Thread Andy Lutomirski
On Sat, Jul 27, 2019 at 2:52 PM Thomas Gleixner wrote: > > On Sat, 27 Jul 2019, Thomas Gleixner wrote: > > On Sat, 27 Jul 2019, Andy Lutomirski wrote: > > > > > > I think it's getting quite late to start inventing new seccomp > > > features to fix th

Re: [5.2 REGRESSION] Generic vDSO breaks seccomp-enabled userspace on i386

2019-07-27 Thread Andy Lutomirski
On Fri, Jul 26, 2019 at 11:01 AM Sean Christopherson wrote: > > +cc Paul > > On Wed, Jul 24, 2019 at 01:56:34AM +0200, Thomas Gleixner wrote: > > On Tue, 23 Jul 2019, Kees Cook wrote: > > > > > On Wed, Jul 24, 2019 at 12:59:03AM +0200, Thomas Gleixner wrote: > > > > And as we have

Re: [RFC PATCH 04/21] x86/sgx: Add /dev/sgx/virt_epc device to allocate "raw" EPC for VMs

2019-07-27 Thread Andy Lutomirski
On Fri, Jul 26, 2019 at 10:52 PM Sean Christopherson wrote: > > Add an SGX device to enable userspace to allocate EPC without an > associated enclave. The intended and only known use case for direct EPC > allocation is to expose EPC to a KVM guest, hence the virt_epc moniker, > virt.{c,h} files

Re: [RFC PATCH 21/21] KVM: x86: Add capability to grant VM access to privileged SGX attribute

2019-07-27 Thread Andy Lutomirski
On Fri, Jul 26, 2019 at 10:52 PM Sean Christopherson wrote: > > The SGX subsystem restricts access to a subset of enclave attributes to > provide additional security for an uncompromised kernel, e.g. to prevent > malware from using the PROVISIONKEY to ensure its nodes are running > inside a

Re: [RFC PATCH 08/21] KVM: x86: Add kvm_x86_ops hook to short circuit emulation

2019-07-27 Thread Andy Lutomirski
On Fri, Jul 26, 2019 at 10:52 PM Sean Christopherson wrote: > > Similar to the existing AMD #NPF case where emulation of the current > instruction is not possible due to lack of information, virtualization > of Intel SGX will introduce a scenario where emulation is not possible > due to the

[PATCH] x86/hw_breakpoint: Prevent data breakpoints on cpu_entry_area

2019-07-25 Thread Andy Lutomirski
A data breakpoint near the top of an IST stack will cause unrecoverable recursion. A data breakpoint on the GDT, IDT, or TSS is terrifying. Prevent either of these from happening. Co-developed-by: Peter Zijlstra Signed-off-by: Andy Lutomirski --- The rest of my series is still in progress

Re: [5.2 REGRESSION] Generic vDSO breaks seccomp-enabled userspace on i386

2019-07-23 Thread Andy Lutomirski
On Tue, Jul 23, 2019 at 2:55 PM Kees Cook wrote: > > On Mon, Jul 22, 2019 at 04:47:36PM -0700, Andy Lutomirski wrote: > > On Mon, Jul 22, 2019 at 4:28 PM Kees Cook wrote: > > > I've built a straw-man for this idea... but I have to say I don't > > > like it. This

Re: [5.2 REGRESSION] Generic vDSO breaks seccomp-enabled userspace on i386

2019-07-23 Thread Andy Lutomirski
> On Jul 23, 2019, at 2:18 AM, Peter Zijlstra wrote: > >> On Mon, Jul 22, 2019 at 04:47:36PM -0700, Andy Lutomirski wrote: >> >> I don't love this whole concept, but I also don't have a better idea. > > Are we really talking about changing the kernel bec

Re: [5.2 REGRESSION] Generic vDSO breaks seccomp-enabled userspace on i386

2019-07-22 Thread Andy Lutomirski
On Mon, Jul 22, 2019 at 4:28 PM Kees Cook wrote: > > On Mon, Jul 22, 2019 at 12:17:16PM -0700, Andy Lutomirski wrote: > > On Mon, Jul 22, 2019 at 11:39 AM Kees Cook wrote: > > > > > > On Mon, Jul 22, 2019 at 08:31:32PM +0200, Thomas Gleixner wrote: > > >

Re: [5.2 REGRESSION] Generic vDSO breaks seccomp-enabled userspace on i386

2019-07-22 Thread Andy Lutomirski
On Mon, Jul 22, 2019 at 11:39 AM Kees Cook wrote: > > On Mon, Jul 22, 2019 at 08:31:32PM +0200, Thomas Gleixner wrote: > > On Mon, 22 Jul 2019, Kees Cook wrote: > > > Just so I'm understanding: the vDSO change introduced code to make an > > > actual syscall on i386, which for most seccomp filters

[tip:x86/entry] x86/syscalls: Split the x32 syscalls into their own table

2019-07-22 Thread tip-bot for Andy Lutomirski
Commit-ID: 6365b842aae4490ebfafadfc6bb27a6d3cc54757 Gitweb: https://git.kernel.org/tip/6365b842aae4490ebfafadfc6bb27a6d3cc54757 Author: Andy Lutomirski AuthorDate: Wed, 3 Jul 2019 13:34:04 -0700 Committer: Thomas Gleixner CommitDate: Mon, 22 Jul 2019 10:31:23 +0200 x86/syscalls: Split

[tip:x86/entry] x86/syscalls: Disallow compat entries for all types of 64-bit syscalls

2019-07-22 Thread tip-bot for Andy Lutomirski
Commit-ID: f85a8573ceb225e606fcf38a9320782316f47c71 Gitweb: https://git.kernel.org/tip/f85a8573ceb225e606fcf38a9320782316f47c71 Author: Andy Lutomirski AuthorDate: Wed, 3 Jul 2019 13:34:03 -0700 Committer: Thomas Gleixner CommitDate: Mon, 22 Jul 2019 10:31:22 +0200 x86/syscalls

[tip:x86/entry] x86/syscalls: Use the compat versions of rt_sigsuspend() and rt_sigprocmask()

2019-07-22 Thread tip-bot for Andy Lutomirski
Commit-ID: a8d03c3f300eefff3b5c14798409e4b43e37dd9b Gitweb: https://git.kernel.org/tip/a8d03c3f300eefff3b5c14798409e4b43e37dd9b Author: Andy Lutomirski AuthorDate: Wed, 3 Jul 2019 13:34:02 -0700 Committer: Thomas Gleixner CommitDate: Mon, 22 Jul 2019 10:31:22 +0200 x86/syscalls: Use

[tip:x86/entry] x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long

2019-07-22 Thread tip-bot for Andy Lutomirski
Commit-ID: 45e29d119e9923ff14dfb840e3482bef1667bbfb Gitweb: https://git.kernel.org/tip/45e29d119e9923ff14dfb840e3482bef1667bbfb Author: Andy Lutomirski AuthorDate: Wed, 3 Jul 2019 13:34:05 -0700 Committer: Thomas Gleixner CommitDate: Mon, 22 Jul 2019 10:31:22 +0200 x86/syscalls: Make

[tip:x86/apic] x86/apic: Initialize TPR to block interrupts 16-31

2019-07-22 Thread tip-bot for Andy Lutomirski
Commit-ID: 229b969b3d38bc28bcd55841ee7ca9a9afb922f3 Gitweb: https://git.kernel.org/tip/229b969b3d38bc28bcd55841ee7ca9a9afb922f3 Author: Andy Lutomirski AuthorDate: Sun, 14 Jul 2019 08:23:14 -0700 Committer: Thomas Gleixner CommitDate: Mon, 22 Jul 2019 10:12:32 +0200 x86/apic

Re: [PATCH v3 5/9] x86/mm/tlb: Privatize cpu_tlbstate

2019-07-20 Thread Andy Lutomirski
On Fri, Jul 19, 2019 at 11:54 AM Nadav Amit wrote: > > > On Jul 19, 2019, at 11:48 AM, Dave Hansen wrote: > > > > On 7/19/19 11:43 AM, Nadav Amit wrote: > >> Andy said that for the lazy tlb optimizations there might soon be more > >> shared state. If you prefer, I can move is_lazy outside of

Re: [PATCH v3 0/6] Tracing vs CR2

2019-07-20 Thread Andy Lutomirski
On Fri, Jul 19, 2019 at 8:59 PM Eiichi Tsukata wrote: > > > On 2019/07/19 5:27, Andy Lutomirski wrote: > > Hi all- > > > > I suspect that a bunch of the bugs you're all finding boil down to: > > > > - Nested debug exceptions could corrupt the outer exceptio

Re: [5.2 REGRESSION] Generic vDSO breaks seccomp-enabled userspace on i386

2019-07-19 Thread Andy Lutomirski
> On Jul 19, 2019, at 1:03 PM, Sean Christopherson > wrote: > > The generic vDSO implementation, starting with commit > > 7ac870747988 ("x86/vdso: Switch to generic vDSO implementation") > > breaks seccomp-enabled userspace on 32-bit x86 (i386) kernels. Prior to > the generic

Re: [PATCH 3/3] mm/vmalloc: Sync unmappings in vunmap_page_range()

2019-07-19 Thread Andy Lutomirski
On Fri, Jul 19, 2019 at 5:21 AM Joerg Roedel wrote: > > On Thu, Jul 18, 2019 at 12:04:49PM -0700, Andy Lutomirski wrote: > > I find it problematic that there is no meaningful documentation as to > > what vmalloc_sync_all() is supposed to do. > > Yeah, I found that too,

Re: [PATCH v3 0/6] Tracing vs CR2

2019-07-18 Thread Andy Lutomirski
Hi all- I suspect that a bunch of the bugs you're all finding boil down to: - Nested debug exceptions could corrupt the outer exception's DR6. - Nested debug exceptions in which *both* exceptions came from the kernel were probably all kinds of buggy - Data breakpoints in bad places in the

Re: [PATCH 3/3] mm/vmalloc: Sync unmappings in vunmap_page_range()

2019-07-18 Thread Andy Lutomirski
On Thu, Jul 18, 2019 at 2:17 AM Joerg Roedel wrote: > > Hi Andy, > > On Wed, Jul 17, 2019 at 02:24:09PM -0700, Andy Lutomirski wrote: > > On Wed, Jul 17, 2019 at 12:14 AM Joerg Roedel wrote: > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > > > ind

Re: [PATCH 3/3] mm/vmalloc: Sync unmappings in vunmap_page_range()

2019-07-17 Thread Andy Lutomirski
On Wed, Jul 17, 2019 at 12:14 AM Joerg Roedel wrote: > > From: Joerg Roedel > > On x86-32 with PTI enabled, parts of the kernel page-tables > are not shared between processes. This can cause mappings in > the vmalloc/ioremap area to persist in some page-tables > after the region is unmapped and

Re: [PATCH v3 0/6] Tracing vs CR2

2019-07-16 Thread Andy Lutomirski
On Tue, Jul 16, 2019 at 2:53 PM Vegard Nossum wrote: > > > On 7/16/19 9:33 PM, Vegard Nossum wrote: > > > > On 7/11/19 1:40 PM, Peter Zijlstra wrote: > >> Hi, > >> > >> Here's the latest (and hopefully final) set of tracing vs CR2 patches. > >> > >> They are basically the same as v2, with only

Re: [PATCH v2] x86/paravirt: Drop {read,write}_cr8() hooks

2019-07-15 Thread Andy Lutomirski
On Mon, Jul 15, 2019 at 4:30 PM Andrew Cooper wrote: > > On 15/07/2019 19:17, Nadav Amit wrote: > >> On Jul 15, 2019, at 8:16 AM, Andrew Cooper > >> wrote: > >> > >> There is a lot of infrastructure for functionality which is used > >> exclusively in __{save,restore}_processor_state() on the

Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets

2019-07-15 Thread Andy Lutomirski
On Mon, Jul 15, 2019 at 3:53 PM Andi Kleen wrote: > > > I haven't tested on a real kernel with i915. Does i915 really hit > > this code path? Does it happen more than once or twice at boot? > > Yes some workloads allocate/free a lot of write combined memory > for graphics objects. > But where

Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets

2019-07-15 Thread Andy Lutomirski
On Mon, Jul 15, 2019 at 12:38 PM Andi Kleen wrote: > > > > > That does not answer the question whether it's worthwhile to do that. > > It's likely worthwhile for (Intel integrated) graphics. > > There was also a recent issue with 3dxp/dax, which uses ioremap in some > cases. > FWIW, I applied

Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets

2019-07-15 Thread Andy Lutomirski
On Mon, Jul 15, 2019 at 12:39 PM Andi Kleen wrote: > > > Right, we don't know where the PAT invocation comes from and whether they > > are safe to omit flushing the cache. The module load code would be one > > obvious candidate. > > Module load just changes the writable/executable status, right?

Re: [PATCH 0/2] Remove 32-bit Xen PV guest support

2019-07-15 Thread Andy Lutomirski
On Mon, Jul 15, 2019 at 9:34 AM Andi Kleen wrote: > > Juergen Gross writes: > > > The long term plan has been to replace Xen PV guests by PVH. The first > > victim of that plan are now 32-bit PV guests, as those are used only > > rather seldom these days. Xen on x86 requires 64-bit support and

[tip:x86/urgent] Revert "x86/ptrace: Prevent ptrace from clearing the FS/GS selector" and fix the test

2019-07-15 Thread tip-bot for Andy Lutomirski
Commit-ID: c7ca0b614513afba57824cae68447f9c32b1ee61 Gitweb: https://git.kernel.org/tip/c7ca0b614513afba57824cae68447f9c32b1ee61 Author: Andy Lutomirski AuthorDate: Mon, 15 Jul 2019 07:21:44 -0700 Committer: Thomas Gleixner CommitDate: Mon, 15 Jul 2019 17:12:31 +0200 Revert "x86/p

Re: [PATCH] x86/cpu/intel: Skip CPA cache flush on CPUs with cache self-snooping

2019-07-15 Thread Andy Lutomirski
On Mon, Jul 15, 2019 at 7:21 AM Uros Bizjak wrote: > > CPUs which have self-snooping capability can handle conflicting > memory type across CPUs by snooping its own cache. Commit #fd329f276ecaa > ("x86/mtrr: Skip cache flushes on CPUs with cache self-snooping") > avoids cache flushes when MTRR

Re: [PATCH] x86/paravirt: Drop {read,write}_cr8() hooks

2019-07-15 Thread Andy Lutomirski
On Mon, Jul 15, 2019 at 6:23 AM Juergen Gross wrote: > > On 15.07.19 15:00, Andrew Cooper wrote: > > There is a lot of infrastructure for functionality which is used > > exclusively in __{save,restore}_processor_state() on the suspend/resume > > path. > > > > cr8 is an alias of APIC_TASKPRI, and

[PATCH] Revert "x86/ptrace: Prevent ptrace from clearing the FS/GS selector" and fix the test

2019-07-15 Thread Andy Lutomirski
modifies the test case so that it tests the preexisting behavior. Signed-off-by: Andy Lutomirski --- arch/x86/kernel/ptrace.c | 14 -- tools/testing/selftests/x86/fsgsbase.c | 22 -- 2 files changed, 16 insertions(+), 20 deletions(-) diff --git a/arch

[PATCH] x86/apic: Initialize TPR to block interrupts 16-31

2019-07-14 Thread Andy Lutomirski
IOMMU that remaps interrupts. The purpose of this patch is to reduce the chance that a certain class of device malfunctions crashes the kernel in hard-to-debug ways. Cc: Nadav Amit Cc: Stephane Eranian Cc: Feng Tang Suggested-by: Andrew Cooper Signed-off-by: Andy Lutomirski --- arch/x86/kernel

Re: [RFC v2 00/27] Kernel Address Space Isolation

2019-07-14 Thread Andy Lutomirski
On Fri, Jul 12, 2019 at 12:06 PM Peter Zijlstra wrote: > > On Fri, Jul 12, 2019 at 06:37:47PM +0200, Alexandre Chartre wrote: > > On 7/12/19 5:16 PM, Thomas Gleixner wrote: > > > > Right. If we decide to expose more parts of the kernel mappings then > > > that's > > > just adding more stuff to

Re: [RFC v2 00/27] Kernel Address Space Isolation

2019-07-12 Thread Andy Lutomirski
> On Jul 12, 2019, at 10:37 AM, Alexandre Chartre > wrote: > > > >> On 7/12/19 5:16 PM, Thomas Gleixner wrote: >>> On Fri, 12 Jul 2019, Peter Zijlstra wrote: On Fri, Jul 12, 2019 at 01:56:44PM +0200, Alexandre Chartre wrote: I think that's precisely what makes ASI and PTI

Re: [RFC v2 00/27] Kernel Address Space Isolation

2019-07-12 Thread Andy Lutomirski
On Fri, Jul 12, 2019 at 6:45 AM Alexandre Chartre wrote: > > > On 7/12/19 2:50 PM, Peter Zijlstra wrote: > > On Fri, Jul 12, 2019 at 01:56:44PM +0200, Alexandre Chartre wrote: > > > >> I think that's precisely what makes ASI and PTI different and independent. > >> PTI is just about switching

Re: On

2019-07-11 Thread Andy Lutomirski
On Thu, Jul 11, 2019 at 5:01 PM Carlo Wood wrote: > > I believe that the only safe solution is to let the Event Loop > Thread do the deleting. So, if all else fails I'll have to add > objects that a Worker Thread thinks need to be deleted to a > FIFO that is processed by the Event Loop Thread

Re: [RFC v2 02/26] mm/asi: Abort isolation on interrupt, exception and context switch

2019-07-11 Thread Andy Lutomirski
> On Jul 11, 2019, at 8:25 AM, Alexandre Chartre > wrote: > > Address space isolation should be aborted if there is an interrupt, > an exception or a context switch. Interrupt/exception handlers and > context switch code need to run with the full kernel address space. > Address space

Re: [PATCH v3 6/6] x86/entry/64: Remove TRACE_IRQS_*_DEBUG

2019-07-11 Thread Andy Lutomirski
On Thu, Jul 11, 2019 at 4:51 AM Peter Zijlstra wrote: > > Since INT3/#BP no longer runs on an IST, this workaround is no longer > required. > > Tested by running lockdep+ftrace as described in the initial commit: > > 5963e317b1e9 ("ftrace/x86: Do not change stacks in DEBUG when calling >

Re: [RFC PATCH, x86]: Disable CPA cache flush for selfsnoop targets

2019-07-11 Thread Andy Lutomirski
On Thu, Jul 11, 2019 at 1:13 AM Uros Bizjak wrote: > > Recent patch [1] disabled a self-snoop feature on a list of processor > models with a known errata, so we are confident that the feature > should work on remaining models also for other purposes than to speed > up MTRR programming. > > I

Re: [PATCH 2/2] x86/numa: instance all parsed numa node

2019-07-09 Thread Andy Lutomirski
> On Jul 9, 2019, at 1:24 AM, Pingfan Liu wrote: > >> On Tue, Jul 9, 2019 at 2:12 PM Thomas Gleixner wrote: >> >>> On Tue, 9 Jul 2019, Pingfan Liu wrote: On Mon, Jul 8, 2019 at 5:35 PM Thomas Gleixner wrote: It can and it does. That's the whole point why we bring up

Re: [PATCH] selftests/seccomp/seccomp_bpf: update for PTRACE_GET_SYSCALL_INFO

2019-07-08 Thread Andy Lutomirski
On Mon, Jul 8, 2019 at 11:29 AM Dmitry V. Levin wrote: > > The syscall entry/exit is now exposed via PTRACE_GETEVENTMSG, > update the test accordingly. Reviewed-by: Andy Lutomirski

Re: [PATCH 2/2] x86/numa: instance all parsed numa node

2019-07-08 Thread Andy Lutomirski
> On Jul 8, 2019, at 3:35 AM, Thomas Gleixner wrote: > >> On Mon, 8 Jul 2019, Pingfan Liu wrote: >>> On Mon, Jul 8, 2019 at 3:44 AM Thomas Gleixner wrote: >>> On Fri, 5 Jul 2019, Pingfan Liu wrote: I hit a bug on an AMD machine, with kexec -l nr_cpus=4 option. nr_cpus

Re: [PATCH v2 5/7] x86/mm, tracing: Fix CR2 corruption

2019-07-07 Thread Andy Lutomirski
On Sun, Jul 7, 2019 at 8:10 AM Andy Lutomirski wrote: > > On Thu, Jul 4, 2019 at 1:03 PM Peter Zijlstra wrote: > > > > Despite the current efforts to read CR2 before tracing happens there > > still exist a number of possible holes: > > > > idtentry

Re: [PATCH v2 5/7] x86/mm, tracing: Fix CR2 corruption

2019-07-07 Thread Andy Lutomirski
read_cr2(); /* whoopsie */ > > And similar for i386. > > Fix it by pulling the CR2 read into the entry code, before any of that > stuff gets a chance to run and ruin things. Reviewed-by: Andy Lutomirski Subject to the discussion as to whether this is the right approach at all.

Re: [PATCH v2 5/7] x86/mm, tracing: Fix CR2 corruption

2019-07-06 Thread Andy Lutomirski
> On Jul 6, 2019, at 6:08 PM, Linus Torvalds > wrote: > > On Sat, Jul 6, 2019 at 3:41 PM Linus Torvalds > wrote: >> >>> On Sat, Jul 6, 2019 at 3:27 PM Steven Rostedt wrote: >>> >>> We also have to deal with reading vmalloc'd data as that can fault too. >> >> Ahh, that may be a better

Re: [PATCH v2 5/7] x86/mm, tracing: Fix CR2 corruption

2019-07-06 Thread Andy Lutomirski
On Sat, Jul 6, 2019 at 3:27 PM Steven Rostedt wrote: > > On Sat, 6 Jul 2019 14:41:22 -0700 > Linus Torvalds wrote: > > > On Fri, Jul 5, 2019 at 6:50 AM Peter Zijlstra wrote: > > > > > > Also; all previous attempts at fixing this have been about pushing the > > > read_cr2() earlier; notably: > >

Re: [patch V2 04/25] x86/apic: Make apic_pending_intr_clear() more robust

2019-07-05 Thread Andy Lutomirski
On Fri, Jul 5, 2019 at 1:36 PM Thomas Gleixner wrote: > > On Fri, 5 Jul 2019, Andy Lutomirski wrote: > > On Fri, Jul 5, 2019 at 8:47 AM Andrew Cooper > > wrote: > > > Because TPR is 0, an incoming IPI can trigger #AC, #CP, #VC or #SX > > > without an er

Re: [patch V2 04/25] x86/apic: Make apic_pending_intr_clear() more robust

2019-07-05 Thread Andy Lutomirski
On Fri, Jul 5, 2019 at 1:25 PM Thomas Gleixner wrote: > > Andrew, > > > > > These can be addressed by setting TPR to 0x10, which will inhibit > > Right, that's easy and obvious. > This boots: diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 177aa8ef2afa..5257c40bde6c

Re: [patch V2 04/25] x86/apic: Make apic_pending_intr_clear() more robust

2019-07-05 Thread Andy Lutomirski
On Fri, Jul 5, 2019 at 8:47 AM Andrew Cooper wrote: > > On 04/07/2019 16:51, Thomas Gleixner wrote: > > 2) The loop termination logic is interesting at best. > > > > If the machine has no TSC or cpu_khz is not known yet it tries 1 > > million times to ack stale IRR/ISR bits. What? > >

Re: [PATCH v2 5/7] x86/mm, tracing: Fix CR2 corruption

2019-07-04 Thread Andy Lutomirski
> On Jul 4, 2019, at 7:18 PM, Linus Torvalds > wrote: > >> On Fri, Jul 5, 2019 at 5:03 AM Peter Zijlstra wrote: >> >> Despite the current efforts to read CR2 before tracing happens there >> still exist a number of possible holes: > > So this whole series disturbs me for the simple reason

Re: [PATCH v2 4/7] x86/entry/64: Update comments and sanity tests for create_gap

2019-07-04 Thread Andy Lutomirski
On Thu, Jul 4, 2019 at 1:03 PM Peter Zijlstra wrote: > > Acked-by: Andy Lutomirski

Re: [PATCH v2 3/7] x86/entry/64: Simplify idtentry a little

2019-07-04 Thread Andy Lutomirski
/* get error code */ > - movq $-1, ORIG_RAX(%rsp) /* no syscall to restart */ > - .else > - xorl %esi, %esi /* no error code */ > + idtentry_part \do_sym, \has_error_code, 0 Nice! You are adding an extra UNWIND_HINT_REGS that wasn't here before, but I think that's fine. However, can you please make it paranoid=0 instead of just 0? You could go all the way verbose and say do_sym=\do_sym, etc, but that seems like overkill. Other than that nitpick, Acked-by: Andy Lutomirski --Andy

Re: [PATCH v2 2/7] x86/entry/32: Simplify common_exception

2019-07-04 Thread Andy Lutomirski
On Thu, Jul 4, 2019 at 1:03 PM Peter Zijlstra wrote: > > By adding one more option to SAVE_ALL we can make use of it in > common_exception and simplify things. This saves duplication later > where page_fault will no longer use common_exception. > Reviewed-by: Andy Lutomirski Alt

Re: [PATCH v2 1/7] x86/paravirt: Make read_cr2() CALLEE_SAVE

2019-07-04 Thread Andy Lutomirski
On Thu, Jul 4, 2019 at 1:03 PM Peter Zijlstra wrote: > > The one paravirt read_cr2() implementation (Xen) is actually quite > trivial and doesn't need to clobber anything other than the return > register. By making read_cr2() CALLEE_SAVE we avoid all the PUSH/POP > nonsense and allow more

Re: [PATCH 3/3] x86/mm, tracing: Fix CR2 corruption

2019-07-03 Thread Andy Lutomirski
On Wed, Jul 3, 2019 at 3:01 PM Peter Zijlstra wrote: > > On Wed, Jul 03, 2019 at 01:27:09PM -0700, Andy Lutomirski wrote: > > On Wed, Jul 3, 2019 at 3:28 AM root wrote: > > > > @@ -1338,18 +1347,9 @@ ENTRY(error_entry) > > > movq %rax, %rsp

[PATCH 2/4] x86/syscalls: Disallow compat entries for all types of 64-bit syscalls

2019-07-03 Thread Andy Lutomirski
A "compat" entry in the syscall tables means to use a different entry on 32-bit and 64-bit builds. This only makes sense for syscalls that exist in the first place in 32-bit builds, so disallow it for anything other than i386. Signed-off-by: Andy Lutomirski --- arch/x86/entr

[PATCH 1/4] x86/syscalls: Use the compat versions of rt_sigsuspend() and rt_sigprocmask()

2019-07-03 Thread Andy Lutomirski
the compat versions. sendfile64() is more complicated, and I'll address it separately. Signed-off-by: Andy Lutomirski --- arch/x86/entry/syscalls/syscall_32.tbl | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls

[PATCH 3/4] x86/syscalls: Split the x32 syscalls into their own table

2019-07-03 Thread Andy Lutomirski
that need special handling on x32 can share the same number on x32 and x86_64. This means that the special syscall range 512-547 can be treated as a legacy wart instead of something that may need to be extended in the future. This patch also adds a selftest to verify the new behavior. Signed-of
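To make the numbering scheme in the snippet above concrete, here is a minimal sketch of how an x32 syscall number could be decoded and checked against the legacy 512-547 range. The `sketch_` names are hypothetical helpers, not kernel functions. The bit value 0x40000000 is assumed to match the uapi `__X32_SYSCALL_BIT`, and the range bounds are taken from the snippet.

```c
#include <stdbool.h>

/* Assumed to match the value of the uapi __X32_SYSCALL_BIT. */
#define SKETCH_X32_SYSCALL_BIT  0x40000000UL

/* Legacy x32-only range mentioned in the patch description. */
#define SKETCH_X32_LEGACY_FIRST 512UL
#define SKETCH_X32_LEGACY_LAST  547UL

/* True if the number carries the x32 marker bit. */
bool sketch_is_x32_call(unsigned long nr)
{
	return (nr & SKETCH_X32_SYSCALL_BIT) != 0;
}

/* Strip the marker bit to get the table index. */
unsigned long sketch_base_nr(unsigned long nr)
{
	return nr & ~SKETCH_X32_SYSCALL_BIT;
}

/* True if the table index falls in the legacy x32-only range. */
bool sketch_in_legacy_x32_range(unsigned long nr)
{
	unsigned long base = sketch_base_nr(nr);

	return base >= SKETCH_X32_LEGACY_FIRST && base <= SKETCH_X32_LEGACY_LAST;
}
```

With a separate x32 table, a new syscall that needs x32-specific handling would not have to be placed in that range, which is why the snippet can call it a legacy wart.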

[PATCH 0/4] x32 and compat syscall improvements

2019-07-03 Thread Andy Lutomirski
:) Andy Lutomirski (4): x86/syscalls: Use the compat versions of rt_sigsuspend() and rt_sigprocmask() x86/syscalls: Disallow compat entries for all types of 64-bit syscalls x86/syscalls: Split the x32 syscalls into their own table x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long arch

[PATCH 4/4] x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long

2019-07-03 Thread Andy Lutomirski
to be. Syscall numbers are, for all practical purposes, unsigned long, so make __X32_SYSCALL_BIT be unsigned long. Signed-off-by: Andy Lutomirski --- arch/x86/include/uapi/asm/unistd.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/include/uapi/asm/unistd.h b/arch/x86/include
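The type difference the snippet above refers to can be seen with C11 `_Generic`. This is a rough sketch, not the patch itself: the `SKETCH_` macros are hypothetical stand-ins for the old and new forms of the constant, and the example assumes `int` is at least 32 bits wide, as it is on Linux.

```c
/* Unsuffixed hex constant: fits in a 32-bit int, so its type is int. */
#define SKETCH_BIT_PLAIN    0x40000000

/* Suffixed constant: its type is unsigned long. */
#define SKETCH_BIT_SUFFIXED 0x40000000UL

/* Compile-time type probes built on C11 _Generic. */
#define SKETCH_IS_INT(x)   _Generic((x), int: 1, default: 0)
#define SKETCH_IS_ULONG(x) _Generic((x), unsigned long: 1, default: 0)

int sketch_plain_is_int(void)
{
	return SKETCH_IS_INT(SKETCH_BIT_PLAIN);
}

int sketch_plain_is_ulong(void)
{
	return SKETCH_IS_ULONG(SKETCH_BIT_PLAIN);
}

int sketch_suffixed_is_ulong(void)
{
	return SKETCH_IS_ULONG(SKETCH_BIT_SUFFIXED);
}
```

Expressions that mix the constant with an `unsigned long` syscall number are converted to `unsigned long` either way; the suffix mainly matters where the constant is used on its own, for example in format strings or shifts.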

Re: [PATCH 3/3] x86/mm, tracing: Fix CR2 corruption

2019-07-03 Thread Andy Lutomirski
On Wed, Jul 3, 2019 at 3:28 AM root wrote: > > Despite the current efforts to read CR2 before tracing happens there > still exist a number of possible holes: > > idtentry page_fault do_page_fault has_error_code=1 > call error_entry > TRACE_IRQS_OFF > call

[PATCH] selftests/x86: Don't muck with ftrace in mpx_mini_test

2019-07-03 Thread Andy Lutomirski
I don't know why mpx_mini_test tries to reprogram ftrace, but it seems rude and it makes the test crash if run as non-root. Comment it out. Cc: Dave Hansen Signed-off-by: Andy Lutomirski --- tools/testing/selftests/x86/mpx-mini-test.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion

[tip:x86/cpu] selftests/x86/fsgsbase: Fix some test case bugs

2019-07-03 Thread tip-bot for Andy Lutomirski
Commit-ID: 697096b1f458fb81212d1c82d7846e932455 Gitweb: https://git.kernel.org/tip/697096b1f458fb81212d1c82d7846e932455 Author: Andy Lutomirski AuthorDate: Tue, 2 Jul 2019 20:43:04 -0700 Committer: Thomas Gleixner CommitDate: Wed, 3 Jul 2019 16:24:56 +0200 selftests/x86

[PATCH v2] selftests/x86/fsgsbase: Fix some test case bugs

2019-07-02 Thread Andy Lutomirski
Andi Kleen Cc: H. Peter Anvin Cc: "BaeChang Seok" Signed-off-by: Andy Lutomirski --- Changes from v1: - Fix one more 0x7 (Chang) - This time, I explicitly tested it with modify_ldt() disabled tools/testing/selftests/x86/fsgsbase.c | 74 ++ 1 file changed,

[PATCH] selftests/x86/fsgsbase: Fix some test case bugs

2019-07-02 Thread Andy Lutomirski
: Andi Kleen Cc: Ravi Shankar Cc: H. Peter Anvin Cc: "BaeChang Seok" Signed-off-by: Andy Lutomirski --- tools/testing/selftests/x86/fsgsbase.c | 72 ++ 1 file changed, 39 insertions(+), 33 deletions(-) diff --git a/tools/testing/selftests/x86/fsgsbase.c

[tip:x86/cpu] x86/entry/64: Fix and clean up paranoid_exit

2019-07-02 Thread tip-bot for Andy Lutomirski
Commit-ID: 539bca535decb11a0861b6205c6684b8e908589b Gitweb: https://git.kernel.org/tip/539bca535decb11a0861b6205c6684b8e908589b Author: Andy Lutomirski AuthorDate: Mon, 1 Jul 2019 20:43:21 -0700 Committer: Thomas Gleixner CommitDate: Tue, 2 Jul 2019 08:45:20 +0200 x86/entry/64: Fix

[tip:x86/cpu] x86/entry/64: Don't compile ignore_sysret if 32-bit emulation is enabled

2019-07-02 Thread tip-bot for Andy Lutomirski
Commit-ID: dffb3f9db6b593f3ed6ab4c8d8f10e0aa6aa7a88 Gitweb: https://git.kernel.org/tip/dffb3f9db6b593f3ed6ab4c8d8f10e0aa6aa7a88 Author: Andy Lutomirski AuthorDate: Mon, 1 Jul 2019 20:43:20 -0700 Committer: Thomas Gleixner CommitDate: Tue, 2 Jul 2019 08:45:20 +0200 x86/entry/64: Don't

[tip:x86/cpu] selftests/x86: Test SYSCALL and SYSENTER manually with TF set

2019-07-02 Thread tip-bot for Andy Lutomirski
Commit-ID: 9402eaf4c11f0b892eda7b2bcb4654ab34ce34f9 Gitweb: https://git.kernel.org/tip/9402eaf4c11f0b892eda7b2bcb4654ab34ce34f9 Author: Andy Lutomirski AuthorDate: Mon, 1 Jul 2019 20:43:19 -0700 Committer: Thomas Gleixner CommitDate: Tue, 2 Jul 2019 08:45:20 +0200 selftests/x86: Test

Re: [PATCH 0/3] FSGSBASE fix, test, and a semi-related cleanup

2019-07-01 Thread Andy Lutomirski
On Mon, Jul 1, 2019 at 8:43 PM Andy Lutomirski wrote: > > In -tip, if FSGSBASE and PTI are on, the kernel crashes if SYSENTER > happens with TF set. It also crashes if a non-NMI paranoid > entry happens for any other reason from kernel mode with user GSBASE > and user CR3,

[PATCH 0/3] FSGSBASE fix, test, and a semi-related cleanup

2019-07-01 Thread Andy Lutomirski
minutes while debugging this wondering whether I was accidentally triggering ignore_sysret. Andy Lutomirski (3): selftests/x86: Test SYSCALL and SYSENTER manually with TF set x86/entry/64: Don't compile ignore_sysret if 32-bit emulation is enabled x86/entry/64: Fix and clean up paranoid_exit

[PATCH 3/3] x86/entry/64: Fix and clean up paranoid_exit

2019-07-01 Thread Andy Lutomirski
Kleen Cc: Ravi Shankar Cc: "Bae, Chang Seok" Signed-off-by: Andy Lutomirski --- arch/x86/entry/entry_64.S | 33 + 1 file changed, 17 insertions(+), 16 deletions(-) diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index 54b1b0468b2b.

[PATCH 2/3] x86/entry/64: Don't compile ignore_sysret if 32-bit emulation is enabled

2019-07-01 Thread Andy Lutomirski
It's only used if !CONFIG_IA32_EMULATION, so disable it in normal configs. This will save a few bytes of text and reduce confusion. Cc: "Bae, Chang Seok" Signed-off-by: Andy Lutomirski --- arch/x86/entry/entry_64.S | 6 ++ 1 file changed, 6 insertions(+) diff --git a/arch

[PATCH 1/3] selftests/x86: Test SYSCALL and SYSENTER manually with TF set

2019-07-01 Thread Andy Lutomirski
kernel.org Signed-off-by: Andy Lutomirski --- tools/testing/selftests/x86/Makefile | 5 +- .../testing/selftests/x86/syscall_arg_fault.c | 112 +- 2 files changed, 110 insertions(+), 7 deletions(-) diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/sel

Re: [PATCH][next] selftests/x86: fix spelling mistake "FAILT" -> "FAIL"

2019-07-01 Thread Andy Lutomirski
#PF(0x%lx)\n", > + printf("[FAIL]\tExecution failed with the wrong error: > #PF(0x%lx)\n", > segv_err); > return 1; > } > -- > 2.20.1 > Acked-by: Andy Lutomirski

Re: [PATCH V33 24/30] bpf: Restrict bpf when kernel lockdown is in confidentiality mode

2019-06-29 Thread Andy Lutomirski
On Fri, Jun 28, 2019 at 11:47 AM Matthew Garrett wrote: > > On Thu, Jun 27, 2019 at 4:27 PM Andy Lutomirski wrote: > > They're really quite similar in my mind. Certainly some things in the > > "integrity" category give absolutely trivial control over the kernel >

Re: [RFC PATCH 3/3] Prevent user from writing to IBT bitmap.

2019-06-29 Thread Andy Lutomirski
On Fri, Jun 28, 2019 at 12:50 PM Yu-cheng Yu wrote: > > The IBT bitmap is visible from user-mode, but not writable. > > Signed-off-by: Yu-cheng Yu > > --- > arch/x86/mm/fault.c | 7 +++ > 1 file changed, 7 insertions(+) > > diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c > index
