[PATCH v24 11/12] samples/landlock: Add a sandbox manager example

2020-11-12 Thread Mickaël Salaün
From: Mickaël Salaün 

Add a basic sandbox tool to launch a command which can only access a
whitelist of file hierarchies in a read-only or read-write way.
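
As a rough illustration of the idea (this is not the code of sandboxer.c), the sketch below splits a colon-separated list of paths, such as a sandbox tool might receive for its read-only or read-write hierarchies, into individual entries. The helper name split_path_list and the caller-provided capacity are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: splits a colon-separated path list in place.
 * Stores up to max pointers in out and returns the number of non-empty
 * entries found, capped at max.  The input buffer is modified: each ':'
 * is replaced with a NUL terminator.
 */
static size_t split_path_list(char *list, char **out, size_t max)
{
	size_t count = 0;
	char *cursor = list;

	assert(list != NULL && out != NULL);
	while (cursor && count < max) {
		char *sep = strchr(cursor, ':');

		if (sep)
			*sep = '\0';
		/* Skips empty entries such as the one in "a::b". */
		if (*cursor != '\0')
			out[count++] = cursor;
		cursor = sep ? sep + 1 : NULL;
	}
	return count;
}
```

Each resulting entry could then be opened and attached to a ruleset with the access rights chosen for its list.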

Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v23:
* Re-add hints to help users understand the required kernel
  configuration.  This was removed with the removal of
  landlock_get_features(2).

Changes since v21:
* Remove LANDLOCK_ACCESS_FS_CHROOT.
* Clean up help.

Changes since v20:
* Update with new syscalls and type names.
* Update errno check for EOPNOTSUPP.
* Use the full syscall interfaces: explicitly set the "flags" field to
  zero.

Changes since v19:
* Update with the new Landlock syscalls.
* Comply with commit 5f2fb52fac15 ("kbuild: rename hostprogs-y/always to
  hostprogs/always-y").

Changes since v16:
* Switch syscall attribute pointer and size arguments.

Changes since v15:
* Update access right names.
* Properly assign access right to files according to the new related
  syscall restriction.
* Replace "select" with "depends on" HEADERS_INSTALL (suggested by Randy
  Dunlap).

Changes since v14:
* Fix Kconfig dependency.
* Remove access rights that may be required for FD-only requests:
  mmap, truncate, getattr, lock, chmod, chown, chgrp, ioctl.
* Fix useless hardcoded syscall number.
* Use execvpe().
* Follow symlinks.
* Extend help with common file paths.
* Constify variables.
* Clean up comments.
* Improve error message.

Changes since v11:
* Add back the filesystem sandbox manager and update it to work with the
  new Landlock syscall.

Previous changes:
https://lore.kernel.org/lkml/20190721213116.23476-9-...@digikod.net/
---
 samples/Kconfig  |   7 ++
 samples/Makefile |   1 +
 samples/landlock/.gitignore  |   1 +
 samples/landlock/Makefile|  15 +++
 samples/landlock/sandboxer.c | 236 +++
 5 files changed, 260 insertions(+)
 create mode 100644 samples/landlock/.gitignore
 create mode 100644 samples/landlock/Makefile
 create mode 100644 samples/landlock/sandboxer.c

diff --git a/samples/Kconfig b/samples/Kconfig
index 0ed6e4d71d87..e6129496ced5 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -124,6 +124,13 @@ config SAMPLE_HIDRAW
bool "hidraw sample"
depends on CC_CAN_LINK && HEADERS_INSTALL
 
+config SAMPLE_LANDLOCK
+   bool "Build Landlock sample code"
+   depends on HEADERS_INSTALL
+   help
+ Build a simple Landlock sandbox manager able to launch a process
+ restricted by a user-defined filesystem access control.
+
 config SAMPLE_PIDFD
bool "pidfd sample"
depends on CC_CAN_LINK && HEADERS_INSTALL
diff --git a/samples/Makefile b/samples/Makefile
index c3392a595e4b..087e0988ccc5 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -11,6 +11,7 @@ obj-$(CONFIG_SAMPLE_KDB)  += kdb/
 obj-$(CONFIG_SAMPLE_KFIFO) += kfifo/
 obj-$(CONFIG_SAMPLE_KOBJECT)   += kobject/
 obj-$(CONFIG_SAMPLE_KPROBES)   += kprobes/
+subdir-$(CONFIG_SAMPLE_LANDLOCK)   += landlock
 obj-$(CONFIG_SAMPLE_LIVEPATCH) += livepatch/
 subdir-$(CONFIG_SAMPLE_PIDFD)  += pidfd
 obj-$(CONFIG_SAMPLE_QMI_CLIENT)+= qmi/
diff --git a/samples/landlock/.gitignore b/samples/landlock/.gitignore
new file mode 100644
index ..f43668b2d318
--- /dev/null
+++ b/samples/landlock/.gitignore
@@ -0,0 +1 @@
+/sandboxer
diff --git a/samples/landlock/Makefile b/samples/landlock/Makefile
new file mode 100644
index ..21eda5774948
--- /dev/null
+++ b/samples/landlock/Makefile
@@ -0,0 +1,15 @@
+# SPDX-License-Identifier: BSD-3-Clause
+
+hostprogs := sandboxer
+
+always-y := $(hostprogs)
+
+KBUILD_HOSTCFLAGS += -I$(objtree)/usr/include
+
+.PHONY: all clean
+
+all:
+   $(MAKE) -C ../.. samples/landlock/
+
+clean:
+   $(MAKE) -C ../.. M=samples/landlock/ clean
diff --git a/samples/landlock/sandboxer.c b/samples/landlock/sandboxer.c
new file mode 100644
index ..127fb063c23a
--- /dev/null
+++ b/samples/landlock/sandboxer.c
@@ -0,0 +1,236 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/*
+ * Simple Landlock sandbox manager able to launch a process restricted by a
+ * user-defined filesystem access control.
+ *
+ * Copyright © 2017-2020 Mickaël Salaün 
+ * Copyright © 2020 ANSSI
+ */
+
+#define _GNU_SOURCE
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#ifndef landlock_create_ruleset
+static inline int landlock_create_ruleset(
+   const struct landlock_ruleset_attr *const attr,
+   const size_t size, const __u32 flags)
+{
+   errno = 0;
+   return syscall(__NR_landlock_create_ruleset, attr, size, flags);
+}
+#endif
+
+#ifndef landlock_add_rule
+static inline int landlock_

[PATCH v24 05/12] LSM: Infrastructure management of the superblock

2020-11-12 Thread Mickaël Salaün
From: Casey Schaufler 

Move management of the superblock->sb_security blob out of the
individual security modules and into the security infrastructure.
Instead of allocating the blobs from within the modules, the modules
tell the infrastructure how much space is required, and the space is
allocated there.
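
The size negotiation described above can be sketched in user space as follows. This is a simplified model under stated assumptions, not the kernel implementation: the name reserve_blob is hypothetical, and each module's requested size is turned into its offset inside one shared allocation.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Model of a blob size request: reserves *need bytes in the shared blob
 * whose running total is *total.  On return, *need holds the module's
 * offset inside the shared blob and *total has grown by the requested
 * size.  A module asking for nothing is left untouched.
 */
static void reserve_blob(int *need, int *total)
{
	assert(need != NULL && total != NULL);
	if (*need > 0) {
		const int offset = *total;

		*total += *need;
		*need = offset;
	}
}
```

With this scheme the infrastructure performs a single allocation of the final total, and each module finds its own data by adding its offset to the blob pointer.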

Cc: Kees Cook 
Cc: John Johansen 
Signed-off-by: Casey Schaufler 
Signed-off-by: Mickaël Salaün 
Reviewed-by: Stephen Smalley 
---

Changes since v20:
* Remove all Reviewed-by except Stephen Smalley:
  https://lore.kernel.org/lkml/CAEjxPJ7ARJO57MBW66=xsBzMMRb=9ulgqock5eskhcaivmx...@mail.gmail.com/
* Cosmetic fix in the commit message.

Changes since v17:
* Rebase the original LSM stacking patch from v5.3 to v5.7: I fixed some
  diff conflicts caused by code moves and function renames in
  selinux/include/objsec.h and selinux/hooks.c .  I checked that it
  builds but I didn't test the changes for SELinux nor SMACK.
  https://lore.kernel.org/r/20190829232935.7099-2-ca...@schaufler-ca.com
---
 include/linux/lsm_hooks.h |  1 +
 security/security.c   | 46 
 security/selinux/hooks.c  | 58 ---
 security/selinux/include/objsec.h |  6 
 security/selinux/ss/services.c|  3 +-
 security/smack/smack.h|  6 
 security/smack/smack_lsm.c| 35 +--
 7 files changed, 85 insertions(+), 70 deletions(-)

diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index c503f7ab8afb..ff0f03a45c56 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1563,6 +1563,7 @@ struct lsm_blob_sizes {
int lbs_cred;
int lbs_file;
int lbs_inode;
+   int lbs_superblock;
int lbs_ipc;
int lbs_msg_msg;
int lbs_task;
diff --git a/security/security.c b/security/security.c
index a28045dc9e7f..4ffd6c3af9d7 100644
--- a/security/security.c
+++ b/security/security.c
@@ -202,6 +202,7 @@ static void __init lsm_set_blob_sizes(struct lsm_blob_sizes *needed)
lsm_set_blob_size(&needed->lbs_inode, &blob_sizes.lbs_inode);
lsm_set_blob_size(&needed->lbs_ipc, &blob_sizes.lbs_ipc);
lsm_set_blob_size(&needed->lbs_msg_msg, &blob_sizes.lbs_msg_msg);
+   lsm_set_blob_size(&needed->lbs_superblock, &blob_sizes.lbs_superblock);
lsm_set_blob_size(&needed->lbs_task, &blob_sizes.lbs_task);
 }
 
@@ -332,12 +333,13 @@ static void __init ordered_lsm_init(void)
for (lsm = ordered_lsms; *lsm; lsm++)
prepare_lsm(*lsm);
 
-   init_debug("cred blob size = %d\n", blob_sizes.lbs_cred);
-   init_debug("file blob size = %d\n", blob_sizes.lbs_file);
-   init_debug("inode blob size= %d\n", blob_sizes.lbs_inode);
-   init_debug("ipc blob size  = %d\n", blob_sizes.lbs_ipc);
-   init_debug("msg_msg blob size  = %d\n", blob_sizes.lbs_msg_msg);
-   init_debug("task blob size = %d\n", blob_sizes.lbs_task);
+   init_debug("cred blob size   = %d\n", blob_sizes.lbs_cred);
+   init_debug("file blob size   = %d\n", blob_sizes.lbs_file);
+   init_debug("inode blob size  = %d\n", blob_sizes.lbs_inode);
+   init_debug("ipc blob size= %d\n", blob_sizes.lbs_ipc);
+   init_debug("msg_msg blob size= %d\n", blob_sizes.lbs_msg_msg);
+   init_debug("superblock blob size = %d\n", blob_sizes.lbs_superblock);
+   init_debug("task blob size   = %d\n", blob_sizes.lbs_task);
 
/*
 * Create any kmem_caches needed for blobs
@@ -669,6 +671,27 @@ static void __init lsm_early_task(struct task_struct *task)
panic("%s: Early task alloc failed.\n", __func__);
 }
 
+/**
+ * lsm_superblock_alloc - allocate a composite superblock blob
+ * @sb: the superblock that needs a blob
+ *
+ * Allocate the superblock blob for all the modules
+ *
+ * Returns 0, or -ENOMEM if memory can't be allocated.
+ */
+static int lsm_superblock_alloc(struct super_block *sb)
+{
+   if (blob_sizes.lbs_superblock == 0) {
+   sb->s_security = NULL;
+   return 0;
+   }
+
+   sb->s_security = kzalloc(blob_sizes.lbs_superblock, GFP_KERNEL);
+   if (sb->s_security == NULL)
+   return -ENOMEM;
+   return 0;
+}
+
 /*
  * The default value of the LSM hook is defined in linux/lsm_hook_defs.h and
  * can be accessed with:
@@ -866,12 +889,21 @@ int security_fs_context_parse_param(struct fs_context *fc, struct fs_parameter *
 
 int security_sb_alloc(struct super_block *sb)
 {
-   return call_int_hook(sb_alloc_security, 0, sb);
+   int rc = lsm_superblock_alloc(sb);
+
+   if (unlikely(rc))
+   return rc;
+   rc = call_int_hook(sb_alloc_security, 0, sb);
+   if (unlikely(rc))
+   security_sb_free(sb);
+   return rc;
 }
 
 

[PATCH v24 04/12] landlock: Add ptrace restrictions

2020-11-12 Thread Mickaël Salaün
From: Mickaël Salaün 

Using ptrace(2) and related debug features on a target process can lead
to a privilege escalation.  Indeed, ptrace(2) can be used by an attacker
to impersonate another task and to remain undetected while performing
malicious activities.  Thanks to ptrace_may_access(), various parts of
the kernel can check if a tracer is more privileged than a tracee.

A landlocked process has fewer privileges than a non-landlocked process
and must then be subject to additional restrictions when manipulating
processes. To be allowed to use ptrace(2) and related syscalls on a
target process, a landlocked process must have a subset of the target
process's rules (i.e. the tracee must be in a sub-domain of the tracer).
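
The ordering rule described above can be modeled in user space as a walk up a chain of parent pointers. This is only a sketch: struct domain_model and may_trace are illustrative names, and locking and RCU are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for a domain node; not the kernel structure. */
struct domain_model {
	const struct domain_model *parent;
};

/*
 * Returns true if the tracer may trace the tracee under the rule above:
 * a tracer without a domain (NULL) is not restricted by this check, and a
 * sandboxed tracer needs its domain to be the tracee's domain or one of
 * its ancestors.
 */
static bool may_trace(const struct domain_model *tracer,
		const struct domain_model *tracee)
{
	const struct domain_model *walker;

	if (!tracer)
		return true;
	for (walker = tracee; walker; walker = walker->parent) {
		if (walker == tracer)
			return true;
	}
	/* The two domains are unrelated, or the tracee has no domain. */
	return false;
}
```

In this model a sandboxed process can trace processes in its own domain or in domains nested below it, but not its parent's domain or a sibling's.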

Cc: James Morris 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
Reviewed-by: Jann Horn 
---

Changes since v21:
* Fix copyright dates.

Changes since v14:
* Constify variables.

Changes since v13:
* Make the ptrace restriction mandatory, like in the v10.
* Remove the eBPF dependency.

Previous changes:
https://lore.kernel.org/lkml/20191104172146.30797-5-...@digikod.net/
---
 security/landlock/Makefile |   2 +-
 security/landlock/ptrace.c | 120 +
 security/landlock/ptrace.h |  14 +
 security/landlock/setup.c  |   2 +
 4 files changed, 137 insertions(+), 1 deletion(-)
 create mode 100644 security/landlock/ptrace.c
 create mode 100644 security/landlock/ptrace.h

diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index 041ea242e627..f1d1eb72fa76 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -1,4 +1,4 @@
 obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
 
 landlock-y := setup.o object.o ruleset.o \
-   cred.o
+   cred.o ptrace.o
diff --git a/security/landlock/ptrace.c b/security/landlock/ptrace.c
new file mode 100644
index ..77c77bb1fe97
--- /dev/null
+++ b/security/landlock/ptrace.c
@@ -0,0 +1,120 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - Ptrace hooks
+ *
+ * Copyright © 2017-2020 Mickaël Salaün 
+ * Copyright © 2019-2020 ANSSI
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "common.h"
+#include "cred.h"
+#include "ptrace.h"
+#include "ruleset.h"
+#include "setup.h"
+
+/**
+ * domain_scope_le - Checks domain ordering for scoped ptrace
+ *
+ * @parent: Parent domain.
+ * @child: Potential child of @parent.
+ *
+ * Checks if the @parent domain is less or equal to (i.e. an ancestor, which
+ * means a subset of) the @child domain.
+ */
+static bool domain_scope_le(const struct landlock_ruleset *const parent,
+   const struct landlock_ruleset *const child)
+{
+   const struct landlock_hierarchy *walker;
+
+   if (!parent)
+   return true;
+   if (!child)
+   return false;
+   for (walker = child->hierarchy; walker; walker = walker->parent) {
+   if (walker == parent->hierarchy)
+   /* @parent is in the scoped hierarchy of @child. */
+   return true;
+   }
+   /* There is no relationship between @parent and @child. */
+   return false;
+}
+
+static bool task_is_scoped(const struct task_struct *const parent,
+   const struct task_struct *const child)
+{
+   bool is_scoped;
+   const struct landlock_ruleset *dom_parent, *dom_child;
+
+   rcu_read_lock();
+   dom_parent = landlock_get_task_domain(parent);
+   dom_child = landlock_get_task_domain(child);
+   is_scoped = domain_scope_le(dom_parent, dom_child);
+   rcu_read_unlock();
+   return is_scoped;
+}
+
+static int task_ptrace(const struct task_struct *const parent,
+   const struct task_struct *const child)
+{
+   /* Quick return for non-landlocked tasks. */
+   if (!landlocked(parent))
+   return 0;
+   if (task_is_scoped(parent, child))
+   return 0;
+   return -EPERM;
+}
+
+/**
+ * hook_ptrace_access_check - Determines whether the current process may access
+ *   another
+ *
+ * @child: Process to be accessed.
+ * @mode: Mode of attachment.
+ *
+ * If the current task has Landlock rules, then the child must have at least
+ * the same rules.  Else denied.
+ *
+ * Determines whether a process may access another, returning 0 if permission
+ * granted, -errno if denied.
+ */
+static int hook_ptrace_access_check(struct task_struct *const child,
+   const unsigned int mode)
+{
+   return task_ptrace(current, child);
+}
+
+/**
+ * hook_ptrace_traceme - Determines whether another process may trace the
+ *  current one
+ *
+ * @parent: Task proposed to be the tracer.
+ *
+ * If the parent has Landlock rules, then the current task must have the same
+ * or more rules.  Else denied.
+ *
+ * Determines whether the nominated task

[PATCH v24 01/12] landlock: Add object management

2020-11-12 Thread Mickaël Salaün
From: Mickaël Salaün 

A Landlock object makes it possible to identify a kernel object (e.g. an inode).
A Landlock rule is a set of access rights allowed on an object.  Rules
are grouped in rulesets that may be tied to a set of processes (i.e.
subjects) to enforce a scoped access-control (i.e. a domain).

Because Landlock's goal is to empower any process (especially
unprivileged ones) to sandbox themselves, we cannot rely on a
system-wide object identification such as file extended attributes.
Indeed, we need innocuous, composable and modular access-controls.

The main challenge with these constraints is to identify kernel objects
while this identification is useful (i.e. when a security policy makes
use of this object).  But this identification data should be freed once
no policy is using it.  This ephemeral tagging should not and may not be
written in the filesystem.  We then need to manage the lifetime of a
rule according to the lifetime of its objects.  To avoid a global lock,
this implementation makes use of RCU and counters to safely reference
objects.
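
The counter half of this scheme can be sketched with C11 atomics: a reference is taken only while the counter is still non-zero, so an object that is already being released is never revived. This is a user-space model with hypothetical names (object_model, get_object_ref, put_object_ref); the RCU part that keeps the memory valid during the attempt is not modeled here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative object with a usage counter; not struct landlock_object. */
struct object_model {
	atomic_uint usage;
};

/*
 * Takes a reference only if the object is still alive (usage > 0).
 * Returns false if the counter already dropped to zero, which means the
 * object is being released and must not be used.
 */
static bool get_object_ref(struct object_model *obj)
{
	unsigned int old = atomic_load(&obj->usage);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&obj->usage, &old, old + 1))
			return true;
	}
	return false;
}

/* Drops a reference; returns true if this was the last one. */
static bool put_object_ref(struct object_model *obj)
{
	return atomic_fetch_sub(&obj->usage, 1) == 1;
}
```

A caller that gets false from get_object_ref() would typically retry its lookup or create a fresh object instead of touching the dying one.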

A following commit uses this generic object management for inodes.

Cc: James Morris 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
Reviewed-by: Jann Horn 
---

Changes since v23:
* Update landlock_create_object() to return error codes instead of NULL.
  This helps error handling in callers.
* When using make oldconfig with a previous configuration already
  including the CONFIG_LSM variable, no question is asked to update its
  content.  Update the Kconfig help to warn about LSM stacking
  configuration.
* Constify variable (spotted by Vincent Dagonneau).

Changes since v22:
* Fix spelling (spotted by Jann Horn).

Changes since v21:
* Update Kconfig help.
* Clean up comments.

Changes since v18:
* Account objects to kmemcg.

Changes since v14:
* Simplify the object, rule and ruleset management at the expense of a
  less aggressive memory freeing (contributed by Jann Horn, with
  additional modifications):
  - Remove object->list aggregating the rules tied to an object.
  - Remove landlock_get_object(), landlock_drop_object(),
{get,put}_object_cleaner() and landlock_rule_is_disabled().
  - Rewrite landlock_put_object() to use a more simple mechanism
(no tricky RCU).
  - Replace enum landlock_object_type and landlock_release_object() with
landlock_object_underops->release()
  - Adjust unions and Sparse annotations.
  Cf. https://lore.kernel.org/lkml/cag48ez21ben0wl1bbmtiiu8j9jp5iewthowz4turuj+ki0y...@mail.gmail.com/
* Merge struct landlock_rule into landlock_ruleset_elem to simplify the
  rule management.
* Constify variables.
* Improve kernel documentation.
* Cosmetic variable renames.
* Remove the "default" in the Kconfig (suggested by Jann Horn).
* Only use refcount_inc() through getter helpers.
* Update Kconfig description.

Changes since v13:
* New dedicated implementation, removing the need for eBPF.

Previous changes:
https://lore.kernel.org/lkml/20190721213116.23476-6-...@digikod.net/
---
 MAINTAINERS| 10 +
 security/Kconfig   |  1 +
 security/Makefile  |  2 +
 security/landlock/Kconfig  | 21 +
 security/landlock/Makefile |  3 ++
 security/landlock/object.c | 67 
 security/landlock/object.h | 91 ++
 7 files changed, 195 insertions(+)
 create mode 100644 security/landlock/Kconfig
 create mode 100644 security/landlock/Makefile
 create mode 100644 security/landlock/object.c
 create mode 100644 security/landlock/object.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 3da6d8c154e4..33482cbe56e7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9826,6 +9826,16 @@ F:   net/core/sock_map.c
 F: net/ipv4/tcp_bpf.c
 F: net/ipv4/udp_bpf.c
 
+LANDLOCK SECURITY MODULE
+M:     Mickaël Salaün 
+L: linux-security-mod...@vger.kernel.org
+S: Supported
+W: https://landlock.io
+T: git https://github.com/landlock-lsm/linux.git
+F: security/landlock/
+K: landlock
+K: LANDLOCK
+
 LANTIQ / INTEL Ethernet drivers
 M: Hauke Mehrtens 
 L: net...@vger.kernel.org
diff --git a/security/Kconfig b/security/Kconfig
index 7561f6f99f1d..15a4342b5d01 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -238,6 +238,7 @@ source "security/loadpin/Kconfig"
 source "security/yama/Kconfig"
 source "security/safesetid/Kconfig"
 source "security/lockdown/Kconfig"
+source "security/landlock/Kconfig"
 
 source "security/integrity/Kconfig"
 
diff --git a/security/Makefile b/security/Makefile
index 3baf435de541..c688f4907a1b 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -13,6 +13,7 @@ subdir-$(CONFIG_SECURITY_LOADPIN) += loadpin
 subdir-$(CONFIG_SECURITY_SAFESETID)+= safesetid
 subdir-$(CONFIG_SECURITY_LOCKDOWN_LSM) += lockdown
 subdir-$(CONFIG_BPF_LSM)   += bpf
+subdir-$(CONFIG_SECURITY_LAND

[PATCH v24 10/12] selftests/landlock: Add user space tests

2020-11-12 Thread Mickaël Salaün
From: Mickaël Salaün 

Test all Landlock system calls, ptrace hooks semantic and filesystem
access-control.

Test coverage for security/landlock/ is 94.8% of lines.  The code not
covered only deals with internal kernel errors (e.g. memory allocation)
and race conditions.

Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Cc: Shuah Khan 
Signed-off-by: Mickaël Salaün 
Reviewed-by: Vincent Dagonneau 
---

Changes since v23:
* Add an interleaved_masked_accesses test to check corner cases for
  interleaved layered ruleset combinations.
* Update ruleset_overlap and inherit_subset tests to follow the new
  intersect access rights behavior.
* Extend the inherit_superset test to check that layers are handled as
  expected in the superset use case, which completes the inherit_subset
  checks.
* Fix comment (spotted by Vincent Dagonneau).

Changes since v22:
* Extend and add a new test to better check rules applied to the root
  directory: rule_over_root_allow_then_deny, rule_over_root_deny.
* Change the signature of test_path*() to make the calls clearer.

Changes since v21:
* Remove layout1.chroot test and update layout1.unhandled_access to not
  rely on LANDLOCK_ACCESS_FS_CHROOT.
* Clean up comments.

Changes since v20:
* Update with new syscalls and type names.
* Use the full syscall interfaces: explicitly set the "flags" field to
  zero.
* Update the empty_path_beneath_attr test to check for EFAULT.
* Update and merge tests for the simplified copy_min_struct_from_user().
* Clean up makefile.
* Rename some types and variables in a more consistent way.

Changes since v19:
* Update with the new Landlock syscalls.
* Fix device creation.
* Check the new landlock_attr_features members: last_rule_type and
  last_target_type .
* Constify variables.

Changes since v18:
* Replace ruleset_rw.inval with layout1.inval to avoid inexistent test
  layout.
* Use the new FIXTURE_VARIANT for ptrace_test: makes the tests more
  readable and usable.
* Add ARRAY_SIZE() macro to please checkpatch.

Changes since v17:
* Add new test for mknod with a zero mode.
* Use memset(3) to initialize attr_features in base_test.

Changes since v16:
* Add new unpriv_enforce_without_no_new_privs test: check that ruleset
  enforcing is forbidden without no_new_privs and CAP_SYS_ADMIN.
* Drop capabilities when useful.
* Check the new size_attr_features field from struct
  landlock_attr_features.
* Update the empty_or_same_ruleset test to check complementary empty
  ruleset.
* Update base_test according to the new attribute structures and fix the
  inconsistent_attr test accordingly.
* Switch syscall attribute pointer and size arguments.
* Rename test files with a "_test" suffix.

Changes since v14:
* Add new tests:
  - superset: check new layer bitmask.
  - max_layers: check maximum number of layers.
  - release_inodes: check that umount works well.
  - empty_or_same_ruleset.
  - inconsistent_attr: checks copy_to_user limits.
  - in ruleset_rw.inval to check ruleset FD.
  - proc_unlinked_file: check file access through /proc/self/fd .
  - file_access_rights: check that a file can only get consistent access
rights.
  - unpriv: check that NO_NEW_PRIVS or CAP_SYS_ADMIN is required.
  - check pipe access through /proc/self/fd .
  - check move_mount(2).
  - check ruleset file descriptor properties.
  - proc_nsfs: extend to check that internal filesystems (e.g. nsfs) are
allowed.
* Double-check read and write effective actions.
* Fix potential desynchronization between the kernel sources and
  installed headers by overriding the build step in the Makefile.  This
  also enables building with Clang.
* Add two files in the test directories (for link test and rename test).
* Remove test for ruleset's show_fdinfo().
* Replace EBADR with EBADFD.
* Update tests accordingly to the changes of rename and link rights.
* Fix (now) illegal access rights tied to files.
* Update rename and link tests.
* Remove superfluous '\n' in TH_LOG() calls.
* Make assert calls consistent and readable.
* Fix the execute test.
* Make tests future-proof.
* Cosmetic fixes.

Changes since v14:
* Add new tests:
  - Compatibility: empty_attr_{ruleset,path_beneath,enforce} to check
minimal attr size.
  - Access types: link_to, rename_from, rename_to, rmdir, unlink,
make_char, make_block, make_reg, make_sock, make_fifo, make_sym,
make_dir, chroot, execute.
  - Test privilege escalation prevention by enforcing a nested rule, on
a parent directory, with less restrictions than one on a child
directory.
  - Test for empty and more than 32-bits allowed_access
* Merge the two test mount hierarchies.
* Complete relative path tests by combining chdir and chroot.
* Adjust tests:
  - Remove the layout1/extend_ruleset_with_denied_path test.
  - Extend layout1/whitelist test with checks on file.
  - Add and use create_dir_and_file().
* Only use read/write checks but not stat(2) for tests.
* Rename test.h to common.h and improve it.
* Rena

[PATCH v24 00/12] Landlock LSM

2020-11-12 Thread Mickaël Salaün
Hi,

This patch series simplifies the code, makes stacked access-control more
consistent (from the user point of view), properly handles memory
allocation errors, and adds more tests (covering layered ruleset corner
cases).  Most of these changes were sent as a separate patch series:
https://lore.kernel.org/lkml/2020213442.434639-1-...@digikod.net/

James and Jann, could you please take a look at the main changes
(patches 2, 7 and 8)?

The SLOC count is 1185 for security/landlock/ and 1767 for
tools/testing/selftests/landlock/ .  Test coverage for security/landlock/
is 94.8% of lines.  The code not covered only deals with internal kernel
errors (e.g. memory allocation) and race conditions.

The compiled documentation is available here:
https://landlock.io/linux-doc/landlock-v24/userspace-api/landlock.html

This series can be applied on top of v5.10-rc3 .  This can be tested
with CONFIG_SECURITY_LANDLOCK, CONFIG_SAMPLE_LANDLOCK and by prepending
"landlock," to CONFIG_LSM.  This patch series can be found in a Git
repository here:
https://github.com/landlock-lsm/linux/commits/landlock-v24
I would really appreciate constructive comments on this patch series.


# Landlock LSM

The goal of Landlock is to make it possible to restrict ambient rights
(e.g. global filesystem access) for a set of processes.  Because Landlock
is a stackable LSM [1], it makes it possible to create safe security
sandboxes as new security layers in addition to the existing system-wide
access-controls. This kind of sandbox is expected to help mitigate the
security impact of bugs or unexpected/malicious behaviors in user-space
applications. Landlock empowers any process, including unprivileged
ones, to securely restrict themselves.

Landlock is inspired by seccomp-bpf but instead of filtering syscalls
and their raw arguments, a Landlock rule can restrict the use of kernel
objects like file hierarchies, according to the kernel semantic.
Landlock also takes inspiration from other OS sandbox mechanisms: XNU
Sandbox, FreeBSD Capsicum or OpenBSD Pledge/Unveil.

In its current form, Landlock lacks some access-control features.
This keeps the patch series small and eases review.  This series
still addresses multiple use cases, especially with the combined use of
seccomp-bpf: applications with built-in sandboxing, init systems,
security sandbox tools and security-oriented APIs [2].

Previous version:
https://lore.kernel.org/lkml/20201103182109.1014179-1-...@digikod.net/

[1] https://lore.kernel.org/lkml/50db058a-7dde-441b-a7f9-f6837fe8b...@schaufler-ca.com/
[2] https://lore.kernel.org/lkml/f646e1c7-33cf-333f-070c-0a40ad046...@digikod.net/


Casey Schaufler (1):
  LSM: Infrastructure management of the superblock

Mickaël Salaün (11):
  landlock: Add object management
  landlock: Add ruleset and domain management
  landlock: Set up the security framework and manage credentials
  landlock: Add ptrace restrictions
  fs,security: Add sb_delete hook
  landlock: Support filesystem access-control
  landlock: Add syscall implementations
  arch: Wire up Landlock syscalls
  selftests/landlock: Add user space tests
  samples/landlock: Add a sandbox manager example
  landlock: Add user and kernel documentation

 Documentation/security/index.rst  |1 +
 Documentation/security/landlock.rst   |   79 +
 Documentation/userspace-api/index.rst |1 +
 Documentation/userspace-api/landlock.rst  |  275 +++
 MAINTAINERS   |   13 +
 arch/Kconfig  |7 +
 arch/alpha/kernel/syscalls/syscall.tbl|3 +
 arch/arm/tools/syscall.tbl|3 +
 arch/arm64/include/asm/unistd.h   |2 +-
 arch/arm64/include/asm/unistd32.h |6 +
 arch/ia64/kernel/syscalls/syscall.tbl |3 +
 arch/m68k/kernel/syscalls/syscall.tbl |3 +
 arch/microblaze/kernel/syscalls/syscall.tbl   |3 +
 arch/mips/kernel/syscalls/syscall_n32.tbl |3 +
 arch/mips/kernel/syscalls/syscall_n64.tbl |3 +
 arch/mips/kernel/syscalls/syscall_o32.tbl |3 +
 arch/parisc/kernel/syscalls/syscall.tbl   |3 +
 arch/powerpc/kernel/syscalls/syscall.tbl  |3 +
 arch/s390/kernel/syscalls/syscall.tbl |3 +
 arch/sh/kernel/syscalls/syscall.tbl   |3 +
 arch/sparc/kernel/syscalls/syscall.tbl|3 +
 arch/um/Kconfig   |1 +
 arch/x86/entry/syscalls/syscall_32.tbl|3 +
 arch/x86/entry/syscalls/syscall_64.tbl|3 +
 arch/xtensa/kernel/syscalls/syscall.tbl   |3 +
 fs/super.c|1 +
 include/linux/lsm_hook_defs.h |1 +
 include/linux/lsm_hooks.h |3 +
 include/linux/security.h  |4 +
 include/linux/syscalls.h  |7 +
 include/uapi/asm-generic/unistd.h |8 +-
 include/uapi/linux/

[PATCH v1 5/9] landlock: Add extra checks when inserting a rule

2020-11-11 Thread Mickaël Salaün
Make sure that there is always an (allocated) object in each used rule.
This could have prevented the bug fixed with a previous commit.

When the rules from a ruleset are merged in a domain with
landlock_enforce_ruleset_current(2), these new rules should be assigned
to the last layer.  However, when a rule is just extending a ruleset
with landlock_add_rule(2), the number of layers of this updated ruleset
should always be 0.  Checking such use of landlock_insert_rule() could
help detect bugs in future developments.
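
The two sanity checks described above can be modeled in user space as shown below. This sketch only mirrors the conditions and error codes from the diff that follows; the structure and function names are illustrative, and the kernel's WARN_ON_ONCE() reporting is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins; not the kernel structures. */
struct rule_model {
	const void *object;
};

struct ruleset_model {
	unsigned int nb_layers;
};

/*
 * Returns 0 if the insertion looks sane, -ENOENT if the rule has no
 * object, or -EINVAL if a plain (non-merge) insertion targets a ruleset
 * that already has layers.
 */
static int check_insert(const struct ruleset_model *ruleset,
		const struct rule_model *rule, bool is_merge)
{
	assert(ruleset != NULL && rule != NULL);
	if (!rule->object)
		return -ENOENT;
	if (!is_merge && ruleset->nb_layers != 0)
		return -EINVAL;
	return 0;
}
```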

Replace the hardcoded 1 with SINGLE_DEPTH_NESTING.

Cc: James Morris 
Cc: Jann Horn 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---
 security/landlock/ruleset.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
index 7654a66cea43..1fb85daeb750 100644
--- a/security/landlock/ruleset.c
+++ b/security/landlock/ruleset.c
@@ -102,6 +102,10 @@ int landlock_insert_rule(struct landlock_ruleset *const ruleset,
 
might_sleep();
lockdep_assert_held(&ruleset->lock);
+   if (WARN_ON_ONCE(!rule->object))
+   return -ENOENT;
+   if (!is_merge && WARN_ON_ONCE(ruleset->nb_layers != 0))
+   return -EINVAL;
walker_node = &(ruleset->root.rb_node);
while (*walker_node) {
struct landlock_rule *const this = rb_entry(*walker_node,
@@ -223,12 +227,7 @@ static struct landlock_ruleset *inherit_ruleset(
return new_ruleset;
 
mutex_lock(&new_ruleset->lock);
-   mutex_lock_nested(&parent->lock, 1);
-   new_ruleset->nb_layers = parent->nb_layers;
-   new_ruleset->fs_access_mask = parent->fs_access_mask;
-   WARN_ON_ONCE(!parent->hierarchy);
-   get_hierarchy(parent->hierarchy);
-   new_ruleset->hierarchy->parent = parent->hierarchy;
+   mutex_lock_nested(&parent->lock, SINGLE_DEPTH_NESTING);
 
/* Copies the @parent tree. */
rbtree_postorder_for_each_entry_safe(walker_rule, next_rule,
@@ -237,6 +236,12 @@ static struct landlock_ruleset *inherit_ruleset(
if (err)
goto out_unlock;
}
+   new_ruleset->nb_layers = parent->nb_layers;
+   new_ruleset->fs_access_mask = parent->fs_access_mask;
+   WARN_ON_ONCE(!parent->hierarchy);
+   get_hierarchy(parent->hierarchy);
+   new_ruleset->hierarchy->parent = parent->hierarchy;
+
mutex_unlock(&parent->lock);
mutex_unlock(&new_ruleset->lock);
return new_ruleset;
-- 
2.29.2



[PATCH v1 2/9] landlock: Cosmetic fixes for filesystem management

2020-11-11 Thread Mickaël Salaün
Improve comments and make get_inode_object() more readable.  The kfree()
call is correct but we should minimize lock windows as much as possible.
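
The point about lock windows can be illustrated with a small user-space analogue, assuming a POSIX mutex in place of the kernel spinlock. The names slot_model and install_or_free are hypothetical; the idea shown is only that the losing candidate is freed after the lock is dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Illustrative shared slot protected by a lock. */
struct slot_model {
	pthread_mutex_t lock;
	int *value;
};

/*
 * Installs candidate in the slot if it is empty.  If another value is
 * already there, the candidate is freed after the lock is released, so
 * the critical section stays as short as possible.
 * Returns true if the candidate was installed.
 */
static bool install_or_free(struct slot_model *slot, int *candidate)
{
	bool installed = false;

	assert(slot != NULL);
	pthread_mutex_lock(&slot->lock);
	if (!slot->value) {
		slot->value = candidate;
		installed = true;
	}
	pthread_mutex_unlock(&slot->lock);

	if (!installed)
		free(candidate);
	return installed;
}
```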

Cc: James Morris 
Cc: Jann Horn 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---
 security/landlock/fs.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index b67c821bb40b..33fc7ae17c7f 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -85,8 +85,8 @@ static struct landlock_object *get_inode_object(struct inode *const inode)
return object;
}
/*
-* We're racing with release_inode(), the object is going away.
-* Wait for release_inode(), then retry.
+* We are racing with release_inode(), the object is going
+* away.  Wait for release_inode(), then retry.
 */
spin_lock(&object->lock);
spin_unlock(&object->lock);
@@ -107,21 +107,21 @@ static struct landlock_object *get_inode_object(struct inode *const inode)
lockdep_is_held(&inode->i_lock));
if (unlikely(object)) {
/* Someone else just created the object, bail out and retry. */
-   kfree(new_object);
spin_unlock(&inode->i_lock);
+   kfree(new_object);
 
rcu_read_lock();
goto retry;
-   } else {
-   rcu_assign_pointer(inode_sec->object, new_object);
-   /*
-* @inode will be released by hook_sb_delete() on its
-* superblock shutdown.
-*/
-   ihold(inode);
-   spin_unlock(&inode->i_lock);
-   return new_object;
}
+
+   rcu_assign_pointer(inode_sec->object, new_object);
+   /*
+* @inode will be released by hook_sb_delete() on its superblock
+* shutdown.
+*/
+   ihold(inode);
+   spin_unlock(&inode->i_lock);
+   return new_object;
 }
 
 /* All access rights which can be tied to files. */
-- 
2.29.2



[PATCH v1 4/9] landlock: Always intersect access rights

2020-11-11 Thread Mickaël Salaün
Following the previous commit logic, make ruleset updates more
consistent by always intersecting access rights (boolean AND) instead of
combining them (boolean OR) for the same layer.

This defensive approach could also help prevent user space from
inadvertently allowing multiple access rights for the same object (e.g.
write and execute access on a path hierarchy) instead of dealing with
such inconsistency.  This can happen when there is no deduplication of
objects (e.g. paths and underlying inodes) while they get different
access rights with landlock_add_rule(2).

Update layout1.ruleset_overlap and layout1.inherit_subset tests
accordingly.
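
As an illustration of the behavior described above, here is a minimal
user-space sketch (not the kernel code).  The access bits and the
sketch_* names are hypothetical and only mirror the idea that adding a
second rule for the same object keeps the common rights:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical access bits, for illustration only (not the UAPI values). */
#define SKETCH_ACCESS_READ_FILE		(1u << 0)
#define SKETCH_ACCESS_WRITE_FILE	(1u << 1)
#define SKETCH_ACCESS_READ_DIR		(1u << 2)

/* Simplified stand-in for a rule tied to one object. */
struct sketch_rule {
	uint32_t access;
};

/* A second rule for the same object only keeps the common rights (AND). */
static void sketch_insert_same_object(struct sketch_rule *const existing,
		const struct sketch_rule *const incoming)
{
	existing->access &= incoming->access;
}

/* Returns the rights left after adding two rules for the same object. */
static uint32_t sketch_overlap(const uint32_t first, const uint32_t second)
{
	struct sketch_rule existing = { .access = first };
	const struct sketch_rule incoming = { .access = second };

	sketch_insert_same_object(&existing, &incoming);
	return existing.access;
}
```

With the two rules used by the updated layout1.ruleset_overlap test
(read+write files, then read files+read directories), only the file read
right would remain in this sketch.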

Cc: James Morris 
Cc: Jann Horn 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---
 security/landlock/ruleset.c| 17 -
 tools/testing/selftests/landlock/fs_test.c | 41 +++---
 2 files changed, 34 insertions(+), 24 deletions(-)

diff --git a/security/landlock/ruleset.c b/security/landlock/ruleset.c
index 9fe92b2f5fbd..7654a66cea43 100644
--- a/security/landlock/ruleset.c
+++ b/security/landlock/ruleset.c
@@ -82,11 +82,11 @@ static void put_rule(struct landlock_rule *const rule)
 /**
  * landlock_insert_rule - Insert a rule in a ruleset
  *
+ * Intersects access rights of the rule with those of the ruleset.
+ *
  * @ruleset: The ruleset to be updated.
  * @rule: Read-only payload to be inserted (not owned by this function).
- * @is_merge: If true, intersects access rights and updates the rule's layers
- *(e.g. merge two rulesets), else do a union of access rights and
- *keep the rule's layers (e.g. extend a ruleset)
+ * @is_merge: If true, handle the rule layers.
  *
  * Assumptions:
  *
@@ -117,16 +117,11 @@ int landlock_insert_rule(struct landlock_ruleset *const 
ruleset,
}
 
/* If there is a matching rule, updates it. */
-   if (is_merge) {
-   /* Intersects access rights. */
-   this->access &= rule->access;
-
+   if (is_merge)
/* Updates the rule layers with the next one. */
this->layers |= BIT_ULL(ruleset->nb_layers);
-   } else {
-   /* Extends access rights. */
-   this->access |= rule->access;
-   }
+   /* Intersects access rights. */
+   this->access &= rule->access;
return 0;
}
 
diff --git a/tools/testing/selftests/landlock/fs_test.c 
b/tools/testing/selftests/landlock/fs_test.c
index ade0ad8728d8..1885174b2770 100644
--- a/tools/testing/selftests/landlock/fs_test.c
+++ b/tools/testing/selftests/landlock/fs_test.c
@@ -591,14 +591,16 @@ TEST_F(layout1, unhandled_access)
 TEST_F(layout1, ruleset_overlap)
 {
const struct rule rules[] = {
-   /* These rules should be ORed among them. */
+   /* These rules should be ANDed among them. */
{
.path = dir_s1d2,
-   .access = LANDLOCK_ACCESS_FS_WRITE_FILE,
+   .access = LANDLOCK_ACCESS_FS_READ_FILE |
+   LANDLOCK_ACCESS_FS_WRITE_FILE,
},
{
.path = dir_s1d2,
-   .access = LANDLOCK_ACCESS_FS_READ_DIR,
+   .access = LANDLOCK_ACCESS_FS_READ_FILE |
+   LANDLOCK_ACCESS_FS_READ_DIR,
},
{}
};
@@ -609,24 +611,37 @@ TEST_F(layout1, ruleset_overlap)
enforce_ruleset(_metadata, ruleset_fd);
EXPECT_EQ(0, close(ruleset_fd));
 
+   /* Checks s1d1 hierarchy. */
+   ASSERT_EQ(-1, open(file1_s1d1, O_RDONLY | O_CLOEXEC));
+   ASSERT_EQ(EACCES, errno);
ASSERT_EQ(-1, open(file1_s1d1, O_WRONLY | O_CLOEXEC));
ASSERT_EQ(EACCES, errno);
+   ASSERT_EQ(-1, open(file1_s1d1, O_RDWR | O_CLOEXEC));
+   ASSERT_EQ(EACCES, errno);
ASSERT_EQ(-1, open(dir_s1d1, O_RDONLY | O_DIRECTORY | O_CLOEXEC));
ASSERT_EQ(EACCES, errno);
 
-   open_fd = open(file1_s1d2, O_WRONLY | O_CLOEXEC);
-   ASSERT_LE(0, open_fd);
-   EXPECT_EQ(0, close(open_fd));
-   open_fd = open(dir_s1d2, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
+   /* Checks s1d2 hierarchy. */
+   open_fd = open(file1_s1d2, O_RDONLY | O_CLOEXEC);
ASSERT_LE(0, open_fd);
EXPECT_EQ(0, close(open_fd));
+   ASSERT_EQ(-1, open(file1_s1d2, O_WRONLY | O_CLOEXEC));
+   ASSERT_EQ(EACCES, errno);
+   ASSERT_EQ(-1, open(file1_s1d2, O_RDWR | O_CLOEXEC));
+   ASSERT_EQ(EACCES, errno);
+   ASSERT_EQ(-1, open(dir_s1d2, O_RDONLY | O_DIRECTORY | O_CLOEXEC));
+   ASSERT_EQ(EACCES, errno);
 
-   open_fd = open(file1_s1d3, O_WRONLY | O_CLOEXEC);
-   ASSERT_LE(0, open_fd);
-   EXPECT_EQ(0, close(open_fd));
-   open_fd = open(dir_s1d3, O_

[PATCH v1 7/9] landlock: Clean up get_ruleset_from_fd()

2020-11-11 Thread Mickaël Salaün
Rewrite get_ruleset_from_fd() to please the 0-DAY CI Kernel Test Service
that reported an uninitialized variable (false positive).  Anyway, it is
cleaner like this.
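
The single-exit shape of the rewritten function can be sketched in user
space as follows.  This is only an illustration: struct sketch_file, its
fields and the reference counter standing in for fdget()/fdput() are
assumptions, and the sketch returns an errno value instead of a pointer:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for an opened file. */
struct sketch_file {
	int is_ruleset;		/* Stands in for the f_op comparison. */
	unsigned int f_mode;	/* Access mode bits of the descriptor. */
	int refs;		/* Counts fdget()/fdput()-like pairs. */
};

static int sketch_get_ruleset(struct sketch_file *const file,
		const unsigned int mode)
{
	int ret;

	if (!file)
		return -EBADF;
	file->refs++;		/* Stands in for fdget(). */

	/* Checks FD type and access right. */
	if (!file->is_ruleset) {
		ret = -EBADFD;
		goto out_put;
	}
	if (!(file->f_mode & mode)) {
		ret = -EPERM;
		goto out_put;
	}
	ret = 0;

out_put:
	file->refs--;		/* Stands in for fdput(), on every path. */
	return ret;
}
```

Every return path after the reference is taken goes through the same
label, so the result variable is always assigned before it is read.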

Cc: James Morris 
Cc: Jann Horn 
Cc: Serge E. Hallyn 
Link: 
https://lore.kernel.org/linux-security-module/202011101854.zgbwwusk-...@intel.com/
Reported-by: kernel test robot 
Signed-off-by: Mickaël Salaün 
---
 security/landlock/syscall.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/security/landlock/syscall.c b/security/landlock/syscall.c
index 486136d4f46e..543ae36cd339 100644
--- a/security/landlock/syscall.c
+++ b/security/landlock/syscall.c
@@ -196,24 +196,26 @@ static struct landlock_ruleset *get_ruleset_from_fd(const 
int fd,
 {
struct fd ruleset_f;
struct landlock_ruleset *ruleset;
-   int err;
 
ruleset_f = fdget(fd);
if (!ruleset_f.file)
return ERR_PTR(-EBADF);
 
/* Checks FD type and access right. */
-   err = 0;
-   if (ruleset_f.file->f_op != &ruleset_fops)
-   err = -EBADFD;
-   else if (!(ruleset_f.file->f_mode & mode))
-   err = -EPERM;
-   if (!err) {
-   ruleset = ruleset_f.file->private_data;
-   landlock_get_ruleset(ruleset);
+   if (ruleset_f.file->f_op != &ruleset_fops) {
+   ruleset = ERR_PTR(-EBADFD);
+   goto out_fdput;
}
+   if (!(ruleset_f.file->f_mode & mode)) {
+   ruleset = ERR_PTR(-EPERM);
+   goto out_fdput;
+   }
+   ruleset = ruleset_f.file->private_data;
+   landlock_get_ruleset(ruleset);
+
+out_fdput:
fdput(ruleset_f);
-   return err ? ERR_PTR(err) : ruleset;
+   return ruleset;
 }
 
 /* Path handling */
-- 
2.29.2



[PATCH v1 3/9] landlock: Enforce deterministic interleaved path rules

2020-11-11 Thread Mickaël Salaün
To have consistent layered rules, granting access to a path implies that
all accesses tied to inodes, from the requested file to the real root,
must be checked.  Otherwise, stacked rules may result in overzealous
restrictions.  By excluding the ability to add exceptions in the same
layer (e.g. /a allowed, /a/b denied, and /a/b/c allowed), we get
deterministic interleaved path rules.  This removes an optimization
which could be replaced by a proper cache mechanism.  This also further
simplifies and explains check_access_path_continue().

Add a layout1.interleaved_masked_accesses test to check corner-case
layered rule combinations.
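
The walk described above can be approximated by the following
user-space sketch.  It is a simplified model under stated assumptions:
the path is given as an array ordered from the requested file up to the
real root, and struct sketch_rule and sketch_check_path() are
hypothetical names, not kernel interfaces:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rule attached to one element of the walked path. */
struct sketch_rule {
	bool present;		/* False when the inode is not tagged. */
	uint32_t access;	/* Rights allowed by this rule. */
	uint64_t layers;	/* Bitfield of layers owning this rule. */
};

/*
 * Walks from the requested file (index 0) to the root (last index).
 * @layer_mask starts with one bit set per enforced layer.
 */
static bool sketch_check_path(const struct sketch_rule *const walk,
		const size_t len, const uint32_t request, uint64_t layer_mask)
{
	for (size_t i = 0; i < len; i++) {
		/* Continues to walk if there is no rule for this inode. */
		if (!walk[i].present)
			continue;
		/* Stops if a rule does not allow all requested accesses. */
		if ((walk[i].access & request) != request)
			return false;
		/* Validates the layers for which the request is allowed. */
		layer_mask &= ~walk[i].layers;
	}
	/* At the root: allowed only if every layer granted access. */
	return layer_mask == 0;
}
```

In this model the walk never stops early on success, which is the
optimization the commit removes in exchange for deterministic results.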

Cc: James Morris 
Cc: Jann Horn 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---
 security/landlock/fs.c | 38 +
 tools/testing/selftests/landlock/fs_test.c | 95 ++
 2 files changed, 115 insertions(+), 18 deletions(-)

diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 33fc7ae17c7f..2ca4dce1e9ed 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -180,16 +180,24 @@ static bool check_access_path_continue(
rcu_dereference(landlock_inode(inode)->object));
rcu_read_unlock();
 
-   /* Checks for matching layers. */
-   if (rule && (rule->layers | *layer_mask)) {
-   if ((rule->access & access_request) == access_request) {
-   *layer_mask &= ~rule->layers;
-   return true;
-   } else {
-   return false;
-   }
+   if (!rule)
+   /* Continues to walk if there is no rule for this inode. */
+   return true;
+   /*
+* We must check all layers for each inode because we may encounter
+* multiple different accesses from the same layer in a walk.  Each
+* layer must at least allow the access request one time (i.e. with one
+* inode).  This gives deterministic behavior whichever inode is
+* tagged within interleaved layers.
+*/
+   if ((rule->access & access_request) == access_request) {
+   /* Validates layers for which all accesses are allowed. */
+   *layer_mask &= ~rule->layers;
+   /* Continues to walk until all layers are validated. */
+   return true;
}
-   return true;
+   /* Stops if a rule in the path doesn't allow all requested access. */
+   return false;
 }
 
 static int check_access_path(const struct landlock_ruleset *const domain,
@@ -231,12 +239,6 @@ static int check_access_path(const struct landlock_ruleset 
*const domain,
&layer_mask)) {
struct dentry *parent_dentry;
 
-   /* Stops when a rule from each layer granted access. */
-   if (layer_mask == 0) {
-   allowed = true;
-   break;
-   }
-
 jump_up:
/*
 * Does not work with orphaned/private mounts like overlayfs
@@ -248,10 +250,10 @@ static int check_access_path(const struct 
landlock_ruleset *const domain,
goto jump_up;
} else {
/*
-* Stops at the real root.  Denies access
-* because not all layers have granted access.
+* Stops at the real root.  Denies access if
+* not all layers granted access.
 */
-   allowed = false;
+   allowed = (layer_mask == 0);
break;
}
}
diff --git a/tools/testing/selftests/landlock/fs_test.c 
b/tools/testing/selftests/landlock/fs_test.c
index 8aed28081ec8..ade0ad8728d8 100644
--- a/tools/testing/selftests/landlock/fs_test.c
+++ b/tools/testing/selftests/landlock/fs_test.c
@@ -629,6 +629,101 @@ TEST_F(layout1, ruleset_overlap)
EXPECT_EQ(0, close(open_fd));
 }
 
+TEST_F(layout1, interleaved_masked_accesses)
+{
+   /*
+* Checks overly restrictive rules:
+* layer 1: allows s1d1/s1d2/s1d3/file1
+* layer 2: allows s1d1/s1d2/s1d3
+*  denies s1d1/s1d2
+* layer 3: allows s1d1
+* layer 4: allows s1d1/s1d2
+*/
+   const struct rule layer1[] = {
+   /* Allows access to file1_s1d3 with the first layer. */
+   {
+   .path = file1_s1d3,
+   .access = LANDLOCK_ACCESS_FS_READ_FILE,
+   },
+   {}
+   };
+   const struct rule layer2[] = {
+   /* Start by granting access to file1_s1d3 with this rule... */
+   {
+   .path = dir_s1d3,
+   

[PATCH v1 6/9] selftests/landlock: Extend layout1.inherit_superset

2020-11-11 Thread Mickaël Salaün
These additional checks test that layers are handled as expected in the
superset use case, which complements the inherit_subset checks.

Cc: James Morris 
Cc: Jann Horn 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---
 tools/testing/selftests/landlock/fs_test.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/tools/testing/selftests/landlock/fs_test.c 
b/tools/testing/selftests/landlock/fs_test.c
index 1885174b2770..c2f65034e4ee 100644
--- a/tools/testing/selftests/landlock/fs_test.c
+++ b/tools/testing/selftests/landlock/fs_test.c
@@ -907,6 +907,10 @@ TEST_F(layout1, inherit_superset)
open_fd = open(dir_s1d3, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
ASSERT_LE(0, open_fd);
ASSERT_EQ(0, close(open_fd));
+   /* File access is allowed for file1_s1d3. */
+   open_fd = open(file1_s1d3, O_RDONLY | O_CLOEXEC);
+   ASSERT_LE(0, open_fd);
+   ASSERT_EQ(0, close(open_fd));
 
/* Now dir_s1d2, parent of dir_s1d3, gets a new rule tied to it. */
add_path_beneath(_metadata, ruleset_fd, LANDLOCK_ACCESS_FS_READ_FILE |
@@ -922,6 +926,10 @@ TEST_F(layout1, inherit_superset)
open_fd = open(dir_s1d3, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
ASSERT_LE(0, open_fd);
ASSERT_EQ(0, close(open_fd));
+   /* File access is now denied for file1_s1d3. */
+   open_fd = open(file1_s1d3, O_RDONLY | O_CLOEXEC);
+   ASSERT_EQ(-1, open_fd);
+   ASSERT_EQ(EACCES, errno);
 }
 
 TEST_F(layout1, max_layers)
-- 
2.29.2



[PATCH v1 1/9] landlock: Fix memory allocation error handling

2020-11-11 Thread Mickaël Salaün
Handle memory allocation errors in the landlock_create_object() call.  This
prevents inadvertently holding an inode.  Also, make get_inode_object()
more readable.
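
For readers unfamiliar with the error-pointer convention this fix relies
on, here is a user-space approximation.  The sketch_* helpers are
simplified re-implementations written for this example (assumptions, not
the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() definitions), and the
allocation failure is simulated with a flag:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_ERRNO 4095

/* Encodes a negative errno value in a pointer. */
static inline void *sketch_err_ptr(const intptr_t error)
{
	return (void *)error;
}

/* Decodes the errno value carried by an error pointer. */
static inline intptr_t sketch_ptr_err(const void *const ptr)
{
	return (intptr_t)ptr;
}

/* True if the pointer lies in the last SKETCH_MAX_ERRNO addresses. */
static inline bool sketch_is_err(const void *const ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

struct sketch_object {
	int usage;
};

/* Returns a new object, or an error pointer instead of NULL. */
static struct sketch_object *sketch_create_object(const bool simulate_oom)
{
	struct sketch_object *new_object;

	if (simulate_oom)
		return sketch_err_ptr(-ENOMEM);
	new_object = calloc(1, sizeof(*new_object));
	if (!new_object)
		return sketch_err_ptr(-ENOMEM);
	new_object->usage = 1;
	return new_object;
}

/* The caller checks the error before using (or holding) anything. */
static int sketch_append_rule(const bool simulate_oom)
{
	struct sketch_object *const object =
		sketch_create_object(simulate_oom);

	if (sketch_is_err(object))
		return (int)sketch_ptr_err(object);
	free(object);
	return 0;
}
```

Returning an error pointer lets the caller distinguish failure causes
and bail out before taking any further reference.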

Cc: James Morris 
Cc: Jann Horn 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---
 security/landlock/fs.c | 5 +
 security/landlock/object.c | 5 +++--
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index d8c5d19ac2af..b67c821bb40b 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -98,6 +99,8 @@ static struct landlock_object *get_inode_object(struct inode 
*const inode)
 * holding any locks).
 */
new_object = landlock_create_object(&landlock_fs_underops, inode);
+   if (IS_ERR(new_object))
+   return new_object;
 
spin_lock(&inode->i_lock);
object = rcu_dereference_protected(inode_sec->object,
@@ -145,6 +148,8 @@ int landlock_append_fs_rule(struct landlock_ruleset *const 
ruleset,
access_rights |= _LANDLOCK_ACCESS_FS_MASK & ~ruleset->fs_access_mask;
rule.access = access_rights;
rule.object = get_inode_object(d_backing_inode(path->dentry));
+   if (IS_ERR(rule.object))
+   return PTR_ERR(rule.object);
mutex_lock(&ruleset->lock);
err = landlock_insert_rule(ruleset, &rule, false);
mutex_unlock(&ruleset->lock);
diff --git a/security/landlock/object.c b/security/landlock/object.c
index a71644ee72a7..54ba0327002a 100644
--- a/security/landlock/object.c
+++ b/security/landlock/object.c
@@ -8,6 +8,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -23,10 +24,10 @@ struct landlock_object *landlock_create_object(
struct landlock_object *new_object;
 
if (WARN_ON_ONCE(!underops || !underobj))
-   return NULL;
+   return ERR_PTR(-ENOENT);
new_object = kzalloc(sizeof(*new_object), GFP_KERNEL_ACCOUNT);
if (!new_object)
-   return NULL;
+   return ERR_PTR(-ENOMEM);
refcount_set(&new_object->usage, 1);
spin_lock_init(&new_object->lock);
new_object->underops = underops;
-- 
2.29.2



[PATCH v1 0/9] Landlock fixes

2020-11-11 Thread Mickaël Salaün
Hi,

This patch series fixes some issues and makes the Landlock filesystem
access-control more consistent and deterministic when stacking multiple
rulesets.  This is checked by current and new tests.  I also extended
documentation and example to help users.

This series can be applied on top of
https://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git/log/?h=landlock_lsm

Regards,

Mickaël Salaün (9):
  landlock: Fix memory allocation error handling
  landlock: Cosmetic fixes for filesystem management
  landlock: Enforce deterministic interleaved path rules
  landlock: Always intersect access rights
  landlock: Add extra checks when inserting a rule
  selftests/landlock: Extend layout1.inherit_superset
  landlock: Clean up get_ruleset_from_fd()
  landlock: Add help to enable Landlock as a stacked LSM
  landlock: Extend documentation about limitations

 Documentation/userspace-api/landlock.rst   |  17 +++
 samples/landlock/sandboxer.c   |  21 +++-
 security/landlock/Kconfig  |   4 +-
 security/landlock/fs.c |  67 +-
 security/landlock/object.c |   5 +-
 security/landlock/ruleset.c|  34 ++---
 security/landlock/syscall.c|  24 ++--
 tools/testing/selftests/landlock/fs_test.c | 140 +++--
 8 files changed, 239 insertions(+), 73 deletions(-)


base-commit: 96b3198c4025c11347651700b77e45a686d78553
-- 
2.29.2



[PATCH v1 8/9] landlock: Add help to enable Landlock as a stacked LSM

2020-11-11 Thread Mickaël Salaün
When using make oldconfig with a previous configuration already
including the CONFIG_LSM variable, no question is asked to update its
content.

Update the Kconfig help and add hints to the sample to help users
understand the required configuration.

This also cuts long strings to fit in 100 columns.
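
The errno-to-hint mapping added to the sample can be summarized by this
small sketch.  sketch_landlock_hint() is a hypothetical helper written
for this illustration, and the strings are shortened versions of the
hints in the patch:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns a short hint for the errno set by a failed ruleset creation,
 * or NULL when there is no specific hint.
 */
static const char *sketch_landlock_hint(const int err)
{
	switch (err) {
	case ENOSYS:
		/* The syscall does not exist: Landlock is not built in. */
		return "Landlock is not supported by the current kernel.";
	case EOPNOTSUPP:
		/* Built in, but not listed in CONFIG_LSM or lsm=. */
		return "Landlock is currently disabled.";
	default:
		return NULL;
	}
}
```

A caller could print the returned string after perror() when it is not
NULL.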

Cc: James Morris 
Cc: Jann Horn 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---
 samples/landlock/sandboxer.c | 21 +++--
 security/landlock/Kconfig|  4 +++-
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/samples/landlock/sandboxer.c b/samples/landlock/sandboxer.c
index ee5ec1203cb7..127fb063c23a 100644
--- a/samples/landlock/sandboxer.c
+++ b/samples/landlock/sandboxer.c
@@ -169,7 +169,8 @@ int main(const int argc, char *const argv[], char *const 
*const envp)
fprintf(stderr, "usage: %s=\"...\" %s=\"...\" %s <cmd> [args]...\n\n",
ENV_FS_RO_NAME, ENV_FS_RW_NAME, argv[0]);
fprintf(stderr, "Launch a command in a restricted 
environment.\n\n");
-   fprintf(stderr, "Environment variables containing paths, each 
separated by a colon:\n");
+   fprintf(stderr, "Environment variables containing paths, "
+   "each separated by a colon:\n");
fprintf(stderr, "* %s: list of paths allowed to be used in a 
read-only way.\n",
ENV_FS_RO_NAME);
fprintf(stderr, "* %s: list of paths allowed to be used in a 
read-write way.\n",
@@ -185,6 +186,21 @@ int main(const int argc, char *const argv[], char *const 
*const envp)
ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
if (ruleset_fd < 0) {
perror("Failed to create a ruleset");
+   switch (errno) {
+   case ENOSYS:
+   fprintf(stderr, "Hint: Landlock is not supported by the 
current kernel. "
+   "To support it, build the kernel with "
+   "CONFIG_SECURITY_LANDLOCK=y and prepend 
"
+   "\"landlock,\" to the content of 
CONFIG_LSM.\n");
+   break;
+   case EOPNOTSUPP:
+   fprintf(stderr, "Hint: Landlock is currently disabled. "
+   "It can be enabled in the kernel 
configuration by "
+   "prepending \"landlock,\" to the 
content of CONFIG_LSM, "
+   "or at boot time by setting the same 
content to the "
+   "\"lsm\" kernel parameter.\n");
+   break;
+   }
return 1;
}
if (populate_ruleset(ENV_FS_RO_NAME, ruleset_fd,
@@ -210,7 +226,8 @@ int main(const int argc, char *const argv[], char *const 
*const envp)
execvpe(cmd_path, cmd_argv, envp);
fprintf(stderr, "Failed to execute \"%s\": %s\n", cmd_path,
strerror(errno));
-   fprintf(stderr, "Hint: access to the binary, the interpreter or shared 
libraries may be denied.\n");
+   fprintf(stderr, "Hint: access to the binary, the interpreter or "
+   "shared libraries may be denied.\n");
return 1;
 
 err_close_ruleset:
diff --git a/security/landlock/Kconfig b/security/landlock/Kconfig
index cbf88bb7fd97..43e5b0bb0706 100644
--- a/security/landlock/Kconfig
+++ b/security/landlock/Kconfig
@@ -16,4 +16,6 @@ config SECURITY_LANDLOCK
 
  See Documentation/userspace-api/landlock.rst for further information.
 
- If you are unsure how to answer this question, answer N.
+ If you are unsure how to answer this question, answer N.  Otherwise, 
you
+ should also prepend "landlock," to the content of CONFIG_LSM to enable
+ Landlock at boot time.
-- 
2.29.2



[PATCH v1 9/9] landlock: Extend documentation about limitations

2020-11-11 Thread Mickaël Salaün
Explain the limit on the maximum number of stacked rulesets, and the
memory usage restrictions.
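
A rough user-space model of the layer limit follows.  It assumes that
layers are tracked as bits of a 64-bit field (the series sets
BIT_ULL(ruleset->nb_layers) when merging rules); struct sketch_domain
and sketch_enforce() are hypothetical names for this illustration:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_MAX_LAYERS 64

/* Hypothetical domain: one bit per stacked ruleset. */
struct sketch_domain {
	uint32_t nb_layers;
	uint64_t layers;
};

/* Stacks one more layer, or fails once all 64 bits are in use. */
static int sketch_enforce(struct sketch_domain *const dom)
{
	if (dom->nb_layers >= SKETCH_MAX_LAYERS)
		return -E2BIG;
	dom->layers |= UINT64_C(1) << dom->nb_layers;
	dom->nb_layers++;
	return 0;
}
```

In this model the 65th call fails with -E2BIG, which matches the error
documented for landlock_enforce_ruleset_current().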

Cc: James Morris 
Cc: Jann Horn 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---
 Documentation/userspace-api/landlock.rst | 17 +
 security/landlock/syscall.c  |  2 ++
 2 files changed, 19 insertions(+)

diff --git a/Documentation/userspace-api/landlock.rst 
b/Documentation/userspace-api/landlock.rst
index 8f727de20479..7e83e5def1bc 100644
--- a/Documentation/userspace-api/landlock.rst
+++ b/Documentation/userspace-api/landlock.rst
@@ -186,6 +186,23 @@ Enforcing a ruleset
 Current limitations
 ===
 
+Ruleset layers
+--
+
+There is a limit of 64 layers of stacked rulesets.  This can be an issue for a
+task willing to enforce a new ruleset in complement to its 64 inherited
+rulesets.  Once this limit is reached, sys_landlock_enforce_ruleset_current()
+returns E2BIG.  It is then strongly suggested to carefully build rulesets once
+in the life of a thread, especially for applications able to launch other
+applications which may also want to sandbox themselves (e.g. shells, container
+managers, etc.).
+
+Memory usage
+
+
+Kernel memory allocated to create rulesets is accounted and can be restricted
+by the :doc:`/admin-guide/cgroup-v1/memory`.
+
 File renaming and linking
 -
 
diff --git a/security/landlock/syscall.c b/security/landlock/syscall.c
index 543ae36cd339..045bcac79e17 100644
--- a/security/landlock/syscall.c
+++ b/security/landlock/syscall.c
@@ -361,6 +361,8 @@ SYSCALL_DEFINE4(landlock_add_rule,
  * - EPERM: @ruleset_fd has no read access to the underlying ruleset, or the
  *   current thread is not running with no_new_privs, or it doesn't have
  *   CAP_SYS_ADMIN in its namespace.
+ * - E2BIG: The maximum number of stacked rulesets is reached for the current
+ *   task.
  */
 SYSCALL_DEFINE2(landlock_enforce_ruleset_current,
const int, ruleset_fd, const __u32, flags)
-- 
2.29.2



Re: [PATCH v23 00/12] Landlock LSM

2020-11-10 Thread Mickaël Salaün


On 10/11/2020 07:47, James Morris wrote:
> On Tue, 3 Nov 2020, Mickaël Salaün wrote:
> 
>> Hi,
>>
>> Can you please consider to merge this into the tree?
>>
> 
> I've added this to my tree:
> git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git 
> landlock_lsm
> 
> and merged into next-testing (which is pulled into linux-next).
> 
> 
> Please make any further changes against the branch in my tree.

Great, thanks!


[PATCH v23 07/12] landlock: Support filesystem access-control

2020-11-03 Thread Mickaël Salaün
From: Mickaël Salaün 

Thanks to the Landlock objects and ruleset, it is possible to identify
inodes according to a process's domain.  To enable an unprivileged
process to express a file hierarchy, it first needs to open a directory
(or a file) and pass this file descriptor to the kernel through
landlock_add_rule(2).  When checking if a file access request is
allowed, we walk from the requested dentry to the real root, following
the different mount layers.  The accesses of each "tagged" inode are
collected according to their rule layer level, and ANDed to create
access to the requested file hierarchy.  This makes it possible to
identify a lot of files without tagging every inode or modifying the
filesystem, while still following the view and understanding the user
has of the filesystem.

Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not
keep the same struct inodes for the same inodes while these inodes are
in use.

This commit adds a minimal set of supported filesystem access-control
which does not make it possible to restrict all file-related actions.
This is the result of multiple discussions to minimize the code of
Landlock to ease review.  Thanks to the Landlock design, extending this
access-control without breaking user space will not be a problem.
Moreover, seccomp filters can be used to restrict the use of syscall
families which may not be currently handled by Landlock.

Cc: Al Viro 
Cc: Anton Ivanov 
Cc: James Morris 
Cc: Jann Horn 
Cc: Jeff Dike 
Cc: Kees Cook 
Cc: Richard Weinberger 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v22:
* Simplify check_access_path_continue() (suggested by Jann Horn).
* Remove prefetch() call for now (suggested by Jann Horn).
* Fix spelling and remove superfluous comment (spotted by Jann Horn).
* Cosmetic variable renaming.

Changes since v21:
* Rename ARCH_EPHEMERAL_STATES to ARCH_EPHEMERAL_INODES (suggested by
  James Morris).
* Remove the LANDLOCK_ACCESS_FS_CHROOT right because chroot(2) (which
  requires CAP_SYS_CHROOT) does not make it possible to bypass Landlock
  (as tests demonstrate), and because it is often used by sandboxes, it
  would be counterproductive to forbid it.  This also reduces the code
  size.
* Clean up documentation.

Changes since v19:
* Fix spelling (spotted by Randy Dunlap).

Changes since v18:
* Remove useless include.
* Fix spelling.

Changes since v17:
* Replace landlock_release_inodes() with security_sb_delete() (requested
  by James Morris).
* Replace struct super_block->s_landlock_inode_refs with the LSM
  infrastructure management of the superblock (requested by James
  Morris).
* Fix mknod restriction with a zero mode (spotted by Vincent Dagonneau).
* Minimize executed code in path_mknod and file_open hooks when the
  current task is not sandboxed.
* Remove useless checks on the file pointer and inode in
  hook_file_open() .
* Constify domain pointers.
* Rename inode_landlock() to landlock_inode().
* Import include/uapi/linux/landlock.h and _LANDLOCK_ACCESS_FS_* from
  the ruleset and domain management patch.
* Explain the rationale of this minimal set of access-control.
  https://lore.kernel.org/lkml/f646e1c7-33cf-333f-070c-0a40ad046...@digikod.net/

Changes since v16:
* Add ARCH_EPHEMERAL_STATES and enable it for UML.

Changes since v15:
* Replace layer_levels and layer_depth with a bitfield of layers: this
  enables to properly manage superset and subset of access rights,
  whatever their order in the stack of layers.
  Cf. 
https://lore.kernel.org/lkml/e07fe473-1801-01cc-12ae-b3167f952...@digikod.net/
* Allow opening pipes and similar special files through /proc/self/fd/.
* Properly handle internal filesystems such as nsfs: always allow these
  kinds of roots because disconnected paths cannot be evaluated.
* Remove the LANDLOCK_ACCESS_FS_LINK_TO and
  LANDLOCK_ACCESS_FS_RENAME_{TO,FROM}, but use the
  LANDLOCK_ACCESS_FS_REMOVE_{FILE,DIR} and LANDLOCK_ACCESS_FS_MAKE_*
  instead.  Indeed, it is not possible for now (and not really useful)
  to express the semantic of a source and a destination.
* Check access rights to remove a directory or a file with rename(2).
* Forbid reparenting when linking or renaming.  This is needed to easily
  protect against possible privilege escalation by changing the place of
  a file or directory in relation to an enforced access policy (from the
  set of layers).  This will be relaxed in the future.
* Update hooks to take into account replacement of the object's self and
  beneath access bitfields with one.  Simplify the code.
* Check file related access rights.
* Check d_is_negative() instead of !d_backing_inode() in
  check_access_path_continue(), and continue the path walk while there
  is no mapped inode e.g., with rename(2).
* Check private inode in check_access_path().
* Optimize get_file_access() when dealing with a directory.
* Add missing atomic.h .

Changes since v14:
* Simplify the object, rule and ruleset management at the expense of a
  le

[PATCH v23 04/12] landlock: Add ptrace restrictions

2020-11-03 Thread Mickaël Salaün
From: Mickaël Salaün 

Using ptrace(2) and related debug features on a target process can lead
to a privilege escalation.  Indeed, ptrace(2) can be used by an attacker
to impersonate another task and to remain undetected while performing
malicious activities.  Thanks to ptrace_may_access(), various parts of
the kernel can check if a tracer is more privileged than a tracee.

A landlocked process has fewer privileges than a non-landlocked process
and must then be subject to additional restrictions when manipulating
processes. To be allowed to use ptrace(2) and related syscalls on a
target process, a landlocked process must have a subset of the target
process's rules (i.e. the tracee must be in a sub-domain of the tracer).
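
The domain ordering rule can be modeled in user space with a parent
pointer chain.  This is a simplified sketch: struct sketch_hierarchy and
sketch_scope_le() are hypothetical names, and a NULL pointer stands for
a task without any Landlock domain:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical domain node: each domain points to the one it extends. */
struct sketch_hierarchy {
	const struct sketch_hierarchy *parent;
};

/*
 * Returns true if @parent is @child or one of its ancestors, i.e. if a
 * tracer in @parent may trace a tracee in @child.
 */
static bool sketch_scope_le(const struct sketch_hierarchy *const parent,
		const struct sketch_hierarchy *const child)
{
	const struct sketch_hierarchy *walker;

	/* A tracer without a domain is not restricted. */
	if (!parent)
		return true;
	/* With a NULL @child the loop is skipped and access is denied. */
	for (walker = child; walker; walker = walker->parent) {
		if (walker == parent)
			return true;
	}
	/* There is no relationship between @parent and @child. */
	return false;
}
```

A sandboxed tracer can therefore only reach tasks that carry at least
its own restrictions, never a less restricted or unrelated task.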

Cc: James Morris 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
Reviewed-by: Jann Horn 
---

Changes since v21:
* Fix copyright dates.

Changes since v14:
* Constify variables.

Changes since v13:
* Make the ptrace restriction mandatory, like in the v10.
* Remove the eBPF dependency.

Previous changes:
https://lore.kernel.org/lkml/20191104172146.30797-5-...@digikod.net/
---
 security/landlock/Makefile |   2 +-
 security/landlock/ptrace.c | 120 +
 security/landlock/ptrace.h |  14 +
 security/landlock/setup.c  |   2 +
 4 files changed, 137 insertions(+), 1 deletion(-)
 create mode 100644 security/landlock/ptrace.c
 create mode 100644 security/landlock/ptrace.h

diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index 041ea242e627..f1d1eb72fa76 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -1,4 +1,4 @@
 obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
 
 landlock-y := setup.o object.o ruleset.o \
-   cred.o
+   cred.o ptrace.o
diff --git a/security/landlock/ptrace.c b/security/landlock/ptrace.c
new file mode 100644
index ..77c77bb1fe97
--- /dev/null
+++ b/security/landlock/ptrace.c
@@ -0,0 +1,120 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - Ptrace hooks
+ *
+ * Copyright © 2017-2020 Mickaël Salaün 
+ * Copyright © 2019-2020 ANSSI
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "common.h"
+#include "cred.h"
+#include "ptrace.h"
+#include "ruleset.h"
+#include "setup.h"
+
+/**
+ * domain_scope_le - Checks domain ordering for scoped ptrace
+ *
+ * @parent: Parent domain.
+ * @child: Potential child of @parent.
+ *
+ * Checks if the @parent domain is less or equal to (i.e. an ancestor, which
+ * means a subset of) the @child domain.
+ */
+static bool domain_scope_le(const struct landlock_ruleset *const parent,
+   const struct landlock_ruleset *const child)
+{
+   const struct landlock_hierarchy *walker;
+
+   if (!parent)
+   return true;
+   if (!child)
+   return false;
+   for (walker = child->hierarchy; walker; walker = walker->parent) {
+   if (walker == parent->hierarchy)
+   /* @parent is in the scoped hierarchy of @child. */
+   return true;
+   }
+   /* There is no relationship between @parent and @child. */
+   return false;
+}
+
+static bool task_is_scoped(const struct task_struct *const parent,
+   const struct task_struct *const child)
+{
+   bool is_scoped;
+   const struct landlock_ruleset *dom_parent, *dom_child;
+
+   rcu_read_lock();
+   dom_parent = landlock_get_task_domain(parent);
+   dom_child = landlock_get_task_domain(child);
+   is_scoped = domain_scope_le(dom_parent, dom_child);
+   rcu_read_unlock();
+   return is_scoped;
+}
+
+static int task_ptrace(const struct task_struct *const parent,
+   const struct task_struct *const child)
+{
+   /* Quick return for non-landlocked tasks. */
+   if (!landlocked(parent))
+   return 0;
+   if (task_is_scoped(parent, child))
+   return 0;
+   return -EPERM;
+}
+
+/**
+ * hook_ptrace_access_check - Determines whether the current process may access
+ *   another
+ *
+ * @child: Process to be accessed.
+ * @mode: Mode of attachment.
+ *
+ * If the current task has Landlock rules, then the child must have at least
+ * the same rules.  Else denied.
+ *
+ * Determines whether a process may access another, returning 0 if permission
+ * granted, -errno if denied.
+ */
+static int hook_ptrace_access_check(struct task_struct *const child,
+   const unsigned int mode)
+{
+   return task_ptrace(current, child);
+}
+
+/**
+ * hook_ptrace_traceme - Determines whether another process may trace the
+ *  current one
+ *
+ * @parent: Task proposed to be the tracer.
+ *
+ * If the parent has Landlock rules, then the current task must have the same
+ * or more rules.  Else denied.
+ *
+ * Determines whether the nominated task

[PATCH v23 06/12] fs,security: Add sb_delete hook

2020-11-03 Thread Mickaël Salaün
From: Mickaël Salaün 

The sb_delete security hook is called when shutting down a superblock,
which may be useful to release kernel objects tied to the superblock's
lifetime (e.g. inodes).

This new hook is needed by Landlock to release (ephemerally) tagged
struct inodes.  This comes from the unprivileged nature of Landlock
described in the next commit.

Cc: Al Viro 
Cc: James Morris 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
Reviewed-by: Jann Horn 
---

Changes since v17:
* Initial patch to replace the direct call to landlock_release_inodes()
  (requested by James Morris).
  https://lore.kernel.org/lkml/alpine.lrh.2.21.2005150536440.7...@namei.org/
---
 fs/super.c| 1 +
 include/linux/lsm_hook_defs.h | 1 +
 include/linux/lsm_hooks.h | 2 ++
 include/linux/security.h  | 4 
 security/security.c   | 5 +
 5 files changed, 13 insertions(+)

diff --git a/fs/super.c b/fs/super.c
index a51c2083cd6b..fea9475189f7 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -454,6 +454,7 @@ void generic_shutdown_super(struct super_block *sb)
evict_inodes(sb);
/* only nonzero refcount inodes can have marks */
fsnotify_sb_delete(sb);
+   security_sb_delete(sb);
 
if (sb->s_dio_done_wq) {
destroy_workqueue(sb->s_dio_done_wq);
diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index 32a940117e7a..1ba9b4dfecb3 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -59,6 +59,7 @@ LSM_HOOK(int, 0, fs_context_dup, struct fs_context *fc,
 LSM_HOOK(int, -ENOPARAM, fs_context_parse_param, struct fs_context *fc,
 struct fs_parameter *param)
 LSM_HOOK(int, 0, sb_alloc_security, struct super_block *sb)
+LSM_HOOK(void, LSM_RET_VOID, sb_delete, struct super_block *sb)
 LSM_HOOK(void, LSM_RET_VOID, sb_free_security, struct super_block *sb)
 LSM_HOOK(void, LSM_RET_VOID, sb_free_mnt_opts, void *mnt_opts)
 LSM_HOOK(int, 0, sb_eat_lsm_opts, char *orig, void **mnt_opts)
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index ff0f03a45c56..dbfcec05a176 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -108,6 +108,8 @@
  * allocated.
  * @sb contains the super_block structure to be modified.
  * Return 0 if operation was successful.
+ * @sb_delete:
+ * Release objects tied to a superblock (e.g. inodes).
  * @sb_free_security:
  * Deallocate and clear the sb->s_security field.
  * @sb contains the super_block structure to be modified.
diff --git a/include/linux/security.h b/include/linux/security.h
index bc2725491560..a4603b62d444 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -287,6 +287,7 @@ void security_bprm_committed_creds(struct linux_binprm *bprm);
 int security_fs_context_dup(struct fs_context *fc, struct fs_context *src_fc);
 int security_fs_context_parse_param(struct fs_context *fc, struct fs_parameter *param);
 int security_sb_alloc(struct super_block *sb);
+void security_sb_delete(struct super_block *sb);
 void security_sb_free(struct super_block *sb);
 void security_free_mnt_opts(void **mnt_opts);
 int security_sb_eat_lsm_opts(char *options, void **mnt_opts);
@@ -619,6 +620,9 @@ static inline int security_sb_alloc(struct super_block *sb)
return 0;
 }
 
+static inline void security_sb_delete(struct super_block *sb)
+{ }
+
 static inline void security_sb_free(struct super_block *sb)
 { }
 
diff --git a/security/security.c b/security/security.c
index 4ffd6c3af9d7..4563e7a79216 100644
--- a/security/security.c
+++ b/security/security.c
@@ -899,6 +899,11 @@ int security_sb_alloc(struct super_block *sb)
return rc;
 }
 
+void security_sb_delete(struct super_block *sb)
+{
+   call_void_hook(sb_delete, sb);
+}
+
 void security_sb_free(struct super_block *sb)
 {
call_void_hook(sb_free_security, sb);
-- 
2.28.0



[PATCH v23 02/12] landlock: Add ruleset and domain management

2020-11-03 Thread Mickaël Salaün
From: Mickaël Salaün 

A Landlock ruleset is mainly a red-black tree with Landlock rules as
nodes.  This enables quick update and lookup to match a requested
access, e.g. to a file.  A ruleset is usable through a dedicated file
descriptor (cf. following commit implementing syscalls) which enables a
process to create and populate a ruleset with new rules.

A domain is a ruleset tied to a set of processes.  This group of rules
defines the security policy enforced on these processes and their future
children.  A domain can transition to a new domain which is the
intersection of all its constraints and those of a ruleset provided by
the current process.  This modification only impacts the current process.
This means that a process can only gain more constraints (i.e. lose
accesses) over time.
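
The "only lose accesses" property can be sketched in user space as follows.
This is a simplified model under stated assumptions, not the ruleset.c code:
each layer is reduced to one bitmask of accesses it grants on a single
object, and the names toy_domain_allows() and TOY_ACCESS_* are hypothetical.
A request passes only if every layer grants all requested bits, so stacking
one more layer can never widen what is allowed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical access bits, for illustration only. */
#define TOY_ACCESS_READ  (1u << 0)
#define TOY_ACCESS_WRITE (1u << 1)

/*
 * layer_allowed[i] is the set of accesses that layer i grants on one
 * object.  The request is granted only if no layer denies any requested
 * bit.  With zero layers (no sandbox) everything is allowed.
 */
static bool toy_domain_allows(const uint32_t *layer_allowed,
			      size_t nr_layers, uint32_t request)
{
	size_t i;

	assert(layer_allowed != NULL || nr_layers == 0);
	for (i = 0; i < nr_layers; i++) {
		if ((request & ~layer_allowed[i]) != 0)
			return false;
	}
	return true;
}
```

For example, with a first layer granting read and write and a second layer
granting read only, a write request passes against the first layer alone but
is denied once the second layer is stacked on top.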

Cc: James Morris 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
Reviewed-by: Jann Horn 
---

Changes since v22:
* Explicitly use RB_ROOT and SINGLE_DEPTH_NESTING (suggested by Jann
  Horn).
* Improve comments and fix spelling (suggested by Jann Horn).

Changes since v21:
* Add and clean up comments.

Changes since v18:
* Account rulesets to kmemcg.
* Remove struct holes.
* Cosmetic changes.

Changes since v17:
* Move include/uapi/linux/landlock.h and _LANDLOCK_ACCESS_FS_* to a
  following patch.

Changes since v16:
* Allow enforcement of empty ruleset, which enables deny-all policies.

Changes since v15:
* Replace layer_levels and layer_depth with a bitfield of layers, cf.
  filesystem commit.
* Rename the LANDLOCK_ACCESS_FS_{UNLINK,RMDIR} with
  LANDLOCK_ACCESS_FS_REMOVE_{FILE,DIR} because it makes sense to use
  them for the action of renaming a file or a directory, which may lead
  to the removal of the source file or directory.  Removes the
  LANDLOCK_ACCESS_FS_{LINK_TO,RENAME_FROM,RENAME_TO} which are now
  replaced with LANDLOCK_ACCESS_FS_REMOVE_{FILE,DIR} and
  LANDLOCK_ACCESS_FS_MAKE_* .
* Update the documentation accordingly and highlight how the access
  rights are taken into account.
* Change nb_rules from atomic_t to u32 because it is not used anymore by
  show_fdinfo().
* Add safeguard for level variables types.
* Check max number of rules.
* Replace struct landlock_access (self and beneath bitfields) with one
  bitfield.
* Remove useless variable.
* Add comments.

Changes since v14:
* Simplify the object, rule and ruleset management at the expense of a
  less aggressive memory freeing (contributed by Jann Horn, with
  additional modifications):
  - Make a domain immutable (remove the opportunistic cleaning).
  - Remove RCU pointers.
  - Merge struct landlock_ref and struct landlock_ruleset_elem into
    landlock_rule: get rid of rule's RCU.
  - Adjust union.
  - Remove the landlock_insert_rule() check about a new object with the
same address as a previously disabled one, because it is not
possible to disable a rule anymore.
  Cf. 
https://lore.kernel.org/lkml/cag48ez21ben0wl1bbmtiiu8j9jp5iewthowz4turuj+ki0y...@mail.gmail.com/
* Fix nested domains by implementing a notion of layer level and depth:
  - Update landlock_insert_rule() to manage such layers.
  - Add an inherit_ruleset() helper to properly create a new domain.
  - Rename landlock_find_access() to landlock_find_rule() and return a
full rule reference.
  - Add a layer_level and a layer_depth fields to struct landlock_rule.
  - Add a top_layer_level field to struct landlock_ruleset.
* Remove access rights that may be required for FD-only requests:
  truncate, getattr, lock, chmod, chown, chgrp, ioctl.  This will be
  handled in a future evolution of Landlock, but right now the goal is to
  lighten the code to ease review.
* Remove LANDLOCK_ACCESS_FS_OPEN and rename
  LANDLOCK_ACCESS_FS_{READ,WRITE} with a FILE suffix.
* Rename LANDLOCK_ACCESS_FS_READDIR to match the *_FILE pattern.
* Remove LANDLOCK_ACCESS_FS_MAP which was useless.
* Fix memory leak in put_hierarchy() (reported by Jann Horn).
* Fix use-after-free and rename free_ruleset() (reported by Jann Horn).
* Replace the for loops with rbtree_postorder_for_each_entry_safe().
* Constify variables.
* Only use refcount_inc() through getter helpers.
* Change Landlock_insert_ruleset_access() to
  Landlock_insert_ruleset_rule().
* Rename landlock_put_ruleset_enqueue() to landlock_put_ruleset_deferred().
* Improve kernel documentation and add a warning about the unhandled
  access/syscall families.
* Move ABI check to syscall.c .

Changes since v13:
* New implementation, inspired by the previous inode eBPF map, but
  agnostic to the underlying kernel object.

Previous changes:
https://lore.kernel.org/lkml/20190721213116.23476-7-...@digikod.net/
---
 security/landlock/Makefile  |   2 +-
 security/landlock/ruleset.c | 355 
 security/landlock/ruleset.h | 158 
 3 files changed, 514 insertions(+), 1 deletion(-)
 create mode 100644 security/landlock/ruleset.c
 create mode 100644 security/landlock/ruleset.h

diff --git

[PATCH v23 10/12] selftests/landlock: Add user space tests

2020-11-03 Thread Mickaël Salaün
From: Mickaël Salaün 

Test all Landlock system calls, ptrace hook semantics and filesystem
access-control.

Test coverage for security/landlock/ is 95.5% of lines.  The code not
covered only deals with internal kernel errors (e.g. memory allocation)
and race conditions.

Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Cc: Shuah Khan 
Signed-off-by: Mickaël Salaün 
Reviewed-by: Vincent Dagonneau 
---

Changes since v22:
* Extend and add a new test to better check rules applied to the root
  directory: rule_over_root_allow_then_deny, rule_over_root_deny.
* Change the signature of test_path*() to make the calls clearer.

Changes since v21:
* Remove layout1.chroot test and update layout1.unhandled_access to not
  rely on LANDLOCK_ACCESS_FS_CHROOT.
* Clean up comments.

Changes since v20:
* Update with new syscalls and type names.
* Use the full syscall interfaces: explicitly set the "flags" field to
  zero.
* Update the empty_path_beneath_attr test to check for EFAULT.
* Update and merge tests for the simplified copy_min_struct_from_user().
* Clean up makefile.
* Rename some types and variables in a more consistent way.

Changes since v19:
* Update with the new Landlock syscalls.
* Fix device creation.
* Check the new landlock_attr_features members: last_rule_type and
  last_target_type .
* Constify variables.

Changes since v18:
* Replace ruleset_rw.inval with layout1.inval to avoid a nonexistent test
  layout.
* Use the new FIXTURE_VARIANT for ptrace_test: makes the tests more
  readable and usable.
* Add ARRAY_SIZE() macro to please checkpatch.

Changes since v17:
* Add new test for mknod with a zero mode.
* Use memset(3) to initialize attr_features in base_test.

Changes since v16:
* Add new unpriv_enforce_without_no_new_privs test: check that ruleset
  enforcing is forbidden without no_new_privs and CAP_SYS_ADMIN.
* Drop capabilities when useful.
* Check the new size_attr_features field from struct
  landlock_attr_features.
* Update the empty_or_same_ruleset test to check complementary empty
  ruleset.
* Update base_test according to the new attribute structures and fix the
  inconsistent_attr test accordingly.
* Switch syscall attribute pointer and size arguments.
* Rename test files with a "_test" suffix.

Changes since v14:
* Add new tests:
  - superset: check new layer bitmask.
  - max_layers: check maximum number of layers.
  - release_inodes: check that umount works well.
  - empty_or_same_ruleset.
  - inconsistent_attr: checks copy_to_user limits.
  - in ruleset_rw.inval to check ruleset FD.
  - proc_unlinked_file: check file access through /proc/self/fd .
  - file_access_rights: check that a file can only get consistent access
rights.
  - unpriv: check that NO_NEW_PRIVS or CAP_SYS_ADMIN is required.
  - check pipe access through /proc/self/fd .
  - check move_mount(2).
  - check ruleset file descriptor properties.
  - proc_nsfs: extend to check that internal filesystems (e.g. nsfs) are
allowed.
* Double-check read and write effective actions.
* Fix potential desynchronization between the kernel sources and
  installed headers by overriding the build step in the Makefile.  This
  also makes it possible to build with Clang.
* Add two files in the test directories (for link test and rename test).
* Remove test for ruleset's show_fdinfo().
* Replace EBADR with EBADFD.
* Update tests accordingly to the changes of rename and link rights.
* Fix (now) illegal access rights tied to files.
* Update rename and link tests.
* Remove superfluous '\n' in TH_LOG() calls.
* Make assert calls consistent and readable.
* Fix the execute test.
* Make tests future-proof.
* Cosmetic fixes.

Changes since v14:
* Add new tests:
  - Compatibility: empty_attr_{ruleset,path_beneath,enforce} to check
minimal attr size.
  - Access types: link_to, rename_from, rename_to, rmdir, unlink,
make_char, make_block, make_reg, make_sock, make_fifo, make_sym,
make_dir, chroot, execute.
  - Test privilege escalation prevention by enforcing a nested rule, on
a parent directory, with less restrictions than one on a child
directory.
  - Test for empty and more than 32-bits allowed_access
* Merge the two test mount hierarchies.
* Complete relative path tests by combining chdir and chroot.
* Adjust tests:
  - Remove the layout1/extend_ruleset_with_denied_path test.
  - Extend layout1/whitelist test with checks on file.
  - Add and use create_dir_and_file().
* Only use read/write checks but not stat(2) for tests.
* Rename test.h to common.h and improve it.
* Rename path name to make them more consistent, easy to understand and
  make them in a common directory.
* Make create_ruleset() more generic.
* Constify variables.
* Re-add static global variables.
* Remove useless openat(2).
* Fix and complete kernel config.
* Set umask and clean up file modes.
* Clean up open flags.
* Improve Makefile.
* Fix spelling.
* Improve comments and error messages.

Changes since v13:
* Add back the filesy

[PATCH v23 08/12] landlock: Add syscall implementations

2020-11-03 Thread Mickaël Salaün
From: Mickaël Salaün 

These 3 system calls are designed to be used by unprivileged processes
to sandbox themselves:
* landlock_create_ruleset(2): Creates a ruleset and returns its file
  descriptor.
* landlock_add_rule(2): Adds a rule (e.g. file hierarchy access) to a
  ruleset, identified by the dedicated file descriptor.
* landlock_enforce_ruleset_current(2): Enforces a ruleset on the current
  thread and its future children (similar to seccomp).  This syscall has
  the same usage restrictions as seccomp(2): the caller must have the
  no_new_privs attribute set or have CAP_SYS_ADMIN in the current user
  namespace.
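
For the no_new_privs prerequisite mentioned above, a caller without
CAP_SYS_ADMIN would typically set the attribute first with prctl(2).  The
sketch below only shows that step, using the existing PR_SET_NO_NEW_PRIVS
and PR_GET_NO_NEW_PRIVS operations; the helper names are hypothetical and
the Landlock syscalls themselves are deliberately left out.

```c
#include <sys/prctl.h>

/*
 * Set the no_new_privs attribute on the calling thread.
 * Returns 0 on success, -1 on error (errno is set by prctl).
 */
static int toy_set_no_new_privs(void)
{
	return prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
}

/* Returns 1 if no_new_privs is set, 0 if not, -1 on error. */
static int toy_get_no_new_privs(void)
{
	return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}
```

Once set, the attribute cannot be cleared and is inherited across fork(2)
and execve(2), which is why it is a suitable precondition for an
unprivileged, irreversible sandboxing step.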

All these syscalls have a "flags" argument (not currently used) to
enable extensibility.

Here are the motivations for these new syscalls:
* A sandboxed process may not have access to file systems, including
  /dev, /sys or /proc, but it should still be able to add more
  restrictions to itself.
* Neither prctl(2) nor seccomp(2) (which was used in a previous version)
  fit well with the current definition of a Landlock security policy.

All passed structs (attributes) are checked at build time to ensure that
they don't contain holes and that they are aligned the same way for each
architecture.
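
One way to express such build-time checks in plain C11 is sketched below.
The struct toy_rule_attr and its fields are hypothetical (they are not the
Landlock UAPI structs), and _Static_assert stands in for the kernel's
BUILD_BUG_ON(); the idea is simply that the size must equal the sum of the
field sizes and each field must sit at a fixed offset.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute struct, laid out so that it needs no padding. */
struct toy_rule_attr {
	uint64_t allowed_access;
	int32_t parent_fd;
	uint32_t reserved;	/* explicit field instead of implicit padding */
};

/* No hole: the struct is exactly as large as the sum of its fields. */
_Static_assert(sizeof(struct toy_rule_attr) ==
	       sizeof(uint64_t) + sizeof(int32_t) + sizeof(uint32_t),
	       "toy_rule_attr must not contain holes");

/* Same layout everywhere: each field sits at a fixed offset. */
_Static_assert(offsetof(struct toy_rule_attr, allowed_access) == 0,
	       "unexpected offset for allowed_access");
_Static_assert(offsetof(struct toy_rule_attr, parent_fd) == 8,
	       "unexpected offset for parent_fd");
_Static_assert(offsetof(struct toy_rule_attr, reserved) == 12,
	       "unexpected offset for reserved");

/* Size that user space would pass alongside the struct pointer. */
static size_t toy_rule_attr_size(void)
{
	return sizeof(struct toy_rule_attr);
}
```

If a later change introduces padding or moves a field, the build fails
instead of silently changing the user-space ABI.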

See the user and kernel documentation for more details (provided by a
following commit):
* Documentation/userspace-api/landlock.rst
* Documentation/security/landlock.rst

Cc: Arnd Bergmann 
Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v22:
* Replace security_capable() with ns_capable_noaudit() (suggested by
  Jann Horn) and explicitly return EPERM.
* Fix landlock_enforce_ruleset_current(2)'s out_put_creds (spotted by
  Jann Horn).
* Add __always_inline to copy_min_struct_from_user() to make its
  BUILD_BUG_ON() checks reliable (suggested by Jann Horn).
* Simplify path assignment in get_path_from_fd() (suggested by Jann
  Horn).
* Fix spelling (spotted by Jann Horn).

Changes since v21:
* Fix and improve comments.

Changes since v20:
* Remove two arguments to landlock_enforce_ruleset(2) (requested by Arnd
  Bergmann) and rename it to landlock_enforce_ruleset_current(2): remove
  the enum landlock_target_type and the target file descriptor (not used
  for now).  A ruleset can only be enforced on the current thread.
* Remove the size argument in landlock_add_rule() (requested by Arnd
  Bergmann).
* Remove landlock_get_features(2) (suggested by Arnd Bergmann).
* Simplify and rename copy_struct_if_any_from_user() to
  copy_min_struct_from_user().
* Rename "options" to "flags" to align with current syscalls.
* Rename some types and variables in a more consistent way.
* Fix missing type declarations in syscalls.h .

Changes since v19:
* Replace the landlock(2) syscall with 4 syscalls (one for each
  command): landlock_get_features(2), landlock_create_ruleset(2),
  landlock_add_rule(2) and landlock_enforce_ruleset(2) (suggested by
  Arnd Bergmann).
  https://lore.kernel.org/lkml/56d15841-e2c1-2d58-59b8-3a6a09b23...@digikod.net/
* Return EOPNOTSUPP (instead of ENOPKG) when Landlock is disabled.
* Add two new fields to landlock_attr_features to fit with the new
  syscalls: last_rule_type and last_target_type.  This makes it easy to
  identify which types are supported.
* Pack landlock_attr_path_beneath struct because of the removed
  ruleset_fd.
* Update documentation and fix spelling.

Changes since v18:
* Remove useless include.
* Remove LLATTR_SIZE() which was only used to shorten lines. Cf. commit
  bdc48fa11e46 ("checkpatch/coding-style: deprecate 80-column warning").

Changes since v17:
* Synchronize syscall declaration.
* Fix comment.

Changes since v16:
* Add a size_attr_features field to struct landlock_attr_features for
  self-introspection, and move the access_fs field to be more
  consistent.
* Replace __aligned_u64 types of attribute fields with __u16, __s32,
  __u32 and __u64, and check at build time that these structures do
  not contain holes and that they are aligned the same way (8-bits) on
  all architectures.  This shrinks the size of the userspace ABI, which
  may be appreciated especially for struct landlock_attr_features which
  could grow a lot in the future.  For instance, struct
  landlock_attr_features shrinks from 72 bytes to 32 bytes.  This change
  also allows removing 64-bits to 32-bits conversion checks.
* Switch syscall attribute pointer and size arguments to follow similar
  syscall argument order (e.g. bpf, clone3, openat2).
* Set LANDLOCK_OPT_* types to 32-bits.
* Allow enforcement of empty ruleset, which enables deny-all policies.
* Fix documentation inconsistency.

Changes since v15:
* Do not add file descriptors referring to internal filesystems (e.g.
  nsfs) in a ruleset.
* Replace is_user_mountable() with in-place clean checks.
* Replace EBADR with EBADFD in get_ruleset_from_fd() and
  get_path_from_fd().
* Remove ruleset's show_fdinfo

[PATCH v23 09/12] arch: Wire up Landlock syscalls

2020-11-03 Thread Mickaël Salaün
From: Mickaël Salaün 

Wire up the following system calls for all architectures:
* landlock_create_ruleset(2)
* landlock_add_rule(2)
* landlock_enforce_ruleset_current(2)

Cc: Arnd Bergmann 
Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v21:
* Rebase and leave space for watch_mount(2) from -next.

Changes since v20:
* Remove landlock_get_features(2).
* Decrease syscall numbers to stick to process_madvise(2) in -next.
* Rename landlock_enforce_ruleset(2) to
  landlock_enforce_ruleset_current(2).

Changes since v19:
* Increase syscall numbers by 4 to leave space for new ones (in
  linux-next): watch_mount(2), watch_sb(2), fsinfo(2) and
  process_madvise(2) (requested by Arnd Bergmann).
* Replace the previous multiplexor landlock(2) with 4 syscalls:
  landlock_get_features(2), landlock_create_ruleset(2),
  landlock_add_rule(2) and landlock_enforce_ruleset(2).

Changes since v18:
* Increase the syscall number because of the new faccessat2(2).

Changes since v14:
* Add all architectures.

Changes since v13:
* New implementation.
---
 arch/alpha/kernel/syscalls/syscall.tbl  | 3 +++
 arch/arm/tools/syscall.tbl  | 3 +++
 arch/arm64/include/asm/unistd.h | 2 +-
 arch/arm64/include/asm/unistd32.h   | 6 ++
 arch/ia64/kernel/syscalls/syscall.tbl   | 3 +++
 arch/m68k/kernel/syscalls/syscall.tbl   | 3 +++
 arch/microblaze/kernel/syscalls/syscall.tbl | 3 +++
 arch/mips/kernel/syscalls/syscall_n32.tbl   | 3 +++
 arch/mips/kernel/syscalls/syscall_n64.tbl   | 3 +++
 arch/mips/kernel/syscalls/syscall_o32.tbl   | 3 +++
 arch/parisc/kernel/syscalls/syscall.tbl | 3 +++
 arch/powerpc/kernel/syscalls/syscall.tbl| 3 +++
 arch/s390/kernel/syscalls/syscall.tbl   | 3 +++
 arch/sh/kernel/syscalls/syscall.tbl | 3 +++
 arch/sparc/kernel/syscalls/syscall.tbl  | 3 +++
 arch/x86/entry/syscalls/syscall_32.tbl  | 3 +++
 arch/x86/entry/syscalls/syscall_64.tbl  | 3 +++
 arch/xtensa/kernel/syscalls/syscall.tbl | 3 +++
 include/uapi/asm-generic/unistd.h   | 8 +++-
 19 files changed, 62 insertions(+), 2 deletions(-)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index ee7b01bb7346..7ef9966fc654 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -480,3 +480,6 @@
 548  common  pidfd_getfd                       sys_pidfd_getfd
 549  common  faccessat2                        sys_faccessat2
 550  common  process_madvise                   sys_process_madvise
+552  common  landlock_create_ruleset           sys_landlock_create_ruleset
+553  common  landlock_add_rule                 sys_landlock_add_rule
+554  common  landlock_enforce_ruleset_current  sys_landlock_enforce_ruleset_current
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index d056a548358e..5bde774cef96 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -454,3 +454,6 @@
 438  common  pidfd_getfd                       sys_pidfd_getfd
 439  common  faccessat2                        sys_faccessat2
 440  common  process_madvise                   sys_process_madvise
+442  common  landlock_create_ruleset           sys_landlock_create_ruleset
+443  common  landlock_add_rule                 sys_landlock_add_rule
+444  common  landlock_enforce_ruleset_current  sys_landlock_enforce_ruleset_current
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index b3b2019f8d16..64ebdc1ec581 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -38,7 +38,7 @@
 #define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE + 5)
 #define __ARM_NR_COMPAT_END     (__ARM_NR_COMPAT_BASE + 0x800)
 
-#define __NR_compat_syscalls   441
+#define __NR_compat_syscalls   445
 #endif
 
 #define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index 107f08e03b9f..253521adb064 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -889,6 +889,12 @@ __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
 __SYSCALL(__NR_faccessat2, sys_faccessat2)
 #define __NR_process_madvise 440
 __SYSCALL(__NR_process_madvise, sys_process_madvise)
+#define __NR_landlock_create_ruleset 442
+__SYSCALL(__NR_landlock_create_ruleset, sys_landlock_create_ruleset)
+#define __NR_landlock_add_rule 443
+__SYSCALL(__NR_landlock_add_rule, sys_landlock_add_rule)
+#define __NR_landlock_enforce_ruleset_current 444
+__SYSCALL(__NR_landlock_enforce_ruleset_current, sys_landlock_enforce_ruleset_current)
 
 /*
  * Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl 
b/arch/ia64/kernel/syscalls/syscall.tbl
index b96ed8b8a508

[PATCH v23 12/12] landlock: Add user and kernel documentation

2020-11-03 Thread Mickaël Salaün
From: Mickaël Salaün 

This documentation can be built with the Sphinx framework.

Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
Reviewed-by: Vincent Dagonneau 
---

Changes since v22:
* Fix spelling and remove obsolete sentence (spotted by Jann Horn).
* Bump date.

Changes since v21:
* Move the user space documentation to userspace-api/landlock.rst and
  the kernel documentation to security/landlock.rst .
* Add license headers.
* Add last update dates.
* Update MAINTAINERS file.
* Add (back) links to git.kernel.org .
* Fix spelling.

Changes since v20:
* Update examples and documentation with the new syscalls.

Changes since v19:
* Update examples and documentation with the new syscalls.

Changes since v15:
* Add current limitations.

Changes since v14:
* Fix spelling (contributed by Randy Dunlap).
* Extend documentation about inheritance and explain layer levels.
* Remove the use of now-removed access rights.
* Use GitHub links.
* Improve kernel documentation.
* Add section for tests.
* Update example.

Changes since v13:
* Rewrote the documentation according to the major revamp.

Previous changes:
https://lore.kernel.org/lkml/20191104172146.30797-8-...@digikod.net/
---
 Documentation/security/index.rst |   1 +
 Documentation/security/landlock.rst  |  79 +++
 Documentation/userspace-api/index.rst|   1 +
 Documentation/userspace-api/landlock.rst | 258 +++
 MAINTAINERS  |   2 +
 5 files changed, 341 insertions(+)
 create mode 100644 Documentation/security/landlock.rst
 create mode 100644 Documentation/userspace-api/landlock.rst

diff --git a/Documentation/security/index.rst b/Documentation/security/index.rst
index 8129405eb2cc..16335de04e8c 100644
--- a/Documentation/security/index.rst
+++ b/Documentation/security/index.rst
@@ -16,3 +16,4 @@ Security Documentation
siphash
tpm/index
digsig
+   landlock
diff --git a/Documentation/security/landlock.rst b/Documentation/security/landlock.rst
new file mode 100644
index 000000000000..4c88a67a6958
--- /dev/null
+++ b/Documentation/security/landlock.rst
@@ -0,0 +1,79 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. Copyright © 2017-2020 Mickaël Salaün 
+.. Copyright © 2019-2020 ANSSI
+
+==================================
+Landlock LSM: kernel documentation
+==================================
+
+:Author: Mickaël Salaün
+:Date: November 2020
+
+Landlock's goal is to create scoped access-control (i.e. sandboxing).  To
+harden a whole system, this feature should be available to any process,
+including unprivileged ones.  Because such a process may be compromised or
+backdoored (i.e. untrusted), Landlock's features must be safe to use from the
+kernel's and other processes' point of view.  Landlock's interface must therefore
+expose a minimal attack surface.
+
+Landlock is designed to be usable by unprivileged processes while following the
+system security policy enforced by other access control mechanisms (e.g. DAC,
+LSM).  Indeed, a Landlock rule shall not interfere with other access-controls
+enforced on the system, only add more restrictions.
+
+Any user can enforce Landlock rulesets on their processes.  They are merged and
+evaluated according to the inherited ones in a way that ensures that only more
+constraints can be added.
+
+User space documentation can be found here: :doc:`/userspace-api/landlock`.
+
+Guiding principles for safe access controls
+===========================================
+
+* A Landlock rule shall be focused on access control on kernel objects instead
+  of syscall filtering (i.e. syscall arguments), which is the purpose of
+  seccomp-bpf.
+* To avoid multiple kinds of side-channel attacks (e.g. leak of security
+  policies, CPU-based attacks), Landlock rules shall not be able to
+  programmatically communicate with user space.
+* Kernel access check shall not slow down access request from unsandboxed
+  processes.
+* Computation related to Landlock operations (e.g. enforcing a ruleset) shall
+  only impact the processes requesting them.
+
+Tests
+=====
+
+Userspace tests for backward compatibility, ptrace restrictions and filesystem
+support can be found here: `tools/testing/selftests/landlock/`_.
+
+Kernel structures
+=================
+
+Object
+------
+
+.. kernel-doc:: security/landlock/object.h
+    :identifiers:
+
+Ruleset and domain
+------------------
+
+A domain is a read-only ruleset tied to a set of subjects (i.e. tasks'
+credentials).  Each time a ruleset is enforced on a task, the current domain is
+duplicated and the ruleset is imported as a new layer of rules in the new
+domain.  Indeed, once in a domain, each rule is tied to a layer level.  To
+grant access to an object, at least one rule of each layer must allow the
+requested action on the object.  A task can then only transit to a new domain
+which is the intersection of the constraints from the current domain and those
+of a ruleset provided

[PATCH v23 00/12] Landlock LSM

2020-11-03 Thread Mickaël Salaün
Hi,

Could you please consider merging this into the tree?

This new patch series fixes some spelling, improves comments, simplifies
the code, adds one more test, and adds some Reviewed-by tags.

The SLOC count is 1180 for security/landlock/ and 1680 for
tools/testing/selftests/landlock/ .  Test coverage for security/landlock/
is 95.5% of lines.  The code not covered only deals with internal kernel
errors (e.g. memory allocation) and race conditions.

The compiled documentation is available here:
https://landlock.io/linux-doc/landlock-v23/userspace-api/landlock.html

This series can be applied on top of v5.10-rc1 .  This can be tested with
CONFIG_SECURITY_LANDLOCK and CONFIG_SAMPLE_LANDLOCK.  This patch series
can be found in a Git repository here:
https://github.com/landlock-lsm/linux/commits/landlock-v23
I would really appreciate constructive comments on this patch series.


# Landlock LSM

The goal of Landlock is to make it possible to restrict ambient rights (e.g.
global filesystem access) for a set of processes.  Because Landlock is a
stackable LSM [1], it makes it possible to create safe security sandboxes
as new security layers in addition to the existing system-wide
access-controls. This kind of sandbox is expected to help mitigate the
security impact of bugs or unexpected/malicious behaviors in user-space
applications. Landlock empowers any process, including unprivileged
ones, to securely restrict themselves.

Landlock is inspired by seccomp-bpf but instead of filtering syscalls
and their raw arguments, a Landlock rule can restrict the use of kernel
objects like file hierarchies, according to the kernel semantic.
Landlock also takes inspiration from other OS sandbox mechanisms: XNU
Sandbox, FreeBSD Capsicum or OpenBSD Pledge/Unveil.

In this current form, Landlock misses some access-control features.
This keeps the patch series minimal and eases review.  This series
still addresses multiple use cases, especially with the combined use of
seccomp-bpf: applications with built-in sandboxing, init systems,
security sandbox tools and security-oriented APIs [2].

Previous version:
https://lore.kernel.org/lkml/20201027200358.557003-1-...@digikod.net/

[1] 
https://lore.kernel.org/lkml/50db058a-7dde-441b-a7f9-f6837fe8b...@schaufler-ca.com/
[2] 
https://lore.kernel.org/lkml/f646e1c7-33cf-333f-070c-0a40ad046...@digikod.net/


Casey Schaufler (1):
  LSM: Infrastructure management of the superblock

Mickaël Salaün (11):
  landlock: Add object management
  landlock: Add ruleset and domain management
  landlock: Set up the security framework and manage credentials
  landlock: Add ptrace restrictions
  fs,security: Add sb_delete hook
  landlock: Support filesystem access-control
  landlock: Add syscall implementations
  arch: Wire up Landlock syscalls
  selftests/landlock: Add user space tests
  samples/landlock: Add a sandbox manager example
  landlock: Add user and kernel documentation

 Documentation/security/index.rst  |1 +
 Documentation/security/landlock.rst   |   79 +
 Documentation/userspace-api/index.rst |1 +
 Documentation/userspace-api/landlock.rst  |  258 +++
 MAINTAINERS   |   13 +
 arch/Kconfig  |7 +
 arch/alpha/kernel/syscalls/syscall.tbl|3 +
 arch/arm/tools/syscall.tbl|3 +
 arch/arm64/include/asm/unistd.h   |2 +-
 arch/arm64/include/asm/unistd32.h |6 +
 arch/ia64/kernel/syscalls/syscall.tbl |3 +
 arch/m68k/kernel/syscalls/syscall.tbl |3 +
 arch/microblaze/kernel/syscalls/syscall.tbl   |3 +
 arch/mips/kernel/syscalls/syscall_n32.tbl |3 +
 arch/mips/kernel/syscalls/syscall_n64.tbl |3 +
 arch/mips/kernel/syscalls/syscall_o32.tbl |3 +
 arch/parisc/kernel/syscalls/syscall.tbl   |3 +
 arch/powerpc/kernel/syscalls/syscall.tbl  |3 +
 arch/s390/kernel/syscalls/syscall.tbl |3 +
 arch/sh/kernel/syscalls/syscall.tbl   |3 +
 arch/sparc/kernel/syscalls/syscall.tbl|3 +
 arch/um/Kconfig   |1 +
 arch/x86/entry/syscalls/syscall_32.tbl|3 +
 arch/x86/entry/syscalls/syscall_64.tbl|3 +
 arch/xtensa/kernel/syscalls/syscall.tbl   |3 +
 fs/super.c|1 +
 include/linux/lsm_hook_defs.h |1 +
 include/linux/lsm_hooks.h |3 +
 include/linux/security.h  |4 +
 include/linux/syscalls.h  |7 +
 include/uapi/asm-generic/unistd.h |8 +-
 include/uapi/linux/landlock.h |  128 ++
 kernel/sys_ni.c   |5 +
 samples/Kconfig   |7 +
 samples/Makefile  |1 +
 samples/landlock/.gitignore   |1 +
 samples/landlock/Makefile

[PATCH v23 05/12] LSM: Infrastructure management of the superblock

2020-11-03 Thread Mickaël Salaün
From: Casey Schaufler 

Move management of the superblock->sb_security blob out of the
individual security modules and into the security infrastructure.
Instead of allocating the blobs from within the modules, the modules
tell the infrastructure how much space is required, and the space is
allocated there.
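
As a rough userspace model of this scheme (not the kernel code itself),
each module states how much space it needs, and the infrastructure turns
that request into an offset inside one shared allocation.  The names
blob_sizes_model, set_blob_size(), superblock_blob_alloc() and
module_blob() are assumptions made for this sketch; only the
lbs_superblock field name comes from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the superblock member of struct lsm_blob_sizes. */
struct blob_sizes_model {
	int lbs_superblock;
};

/*
 * Turn a module's size request into its offset within the shared blob and
 * grow the running total.  After the call, *need holds the offset.
 */
static void set_blob_size(int *need, int *total)
{
	assert(need != NULL && total != NULL);
	if (*need > 0) {
		const int offset = *total;

		*total += *need;
		*need = offset;
	}
}

/*
 * Allocate one zeroed blob covering every module.  Returns NULL when no
 * module asked for space, or when the allocation fails.
 */
static void *superblock_blob_alloc(const struct blob_sizes_model *totals)
{
	if (totals->lbs_superblock == 0)
		return NULL;
	return calloc(1, (size_t)totals->lbs_superblock);
}

/* A module finds its own data by adding its offset to the shared blob. */
static void *module_blob(void *blob, const struct blob_sizes_model *module)
{
	return (char *)blob + module->lbs_superblock;
}
```

With two modules asking for 16 and 8 bytes, the first ends up at offset 0,
the second at offset 16, and a single 24-byte blob is allocated for both.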

Cc: Kees Cook 
Cc: John Johansen 
Signed-off-by: Casey Schaufler 
Signed-off-by: Mickaël Salaün 
Reviewed-by: Stephen Smalley 
---

Changes since v20:
* Remove all Reviewed-by except Stephen Smalley:
  https://lore.kernel.org/lkml/CAEjxPJ7ARJO57MBW66=xsBzMMRb=9ulgqock5eskhcaivmx...@mail.gmail.com/
* Cosmetic fix in the commit message.

Changes since v17:
* Rebase the original LSM stacking patch from v5.3 to v5.7: I fixed some
  diff conflicts caused by code moves and function renames in
  selinux/include/objsec.h and selinux/hooks.c.  I checked that it
  builds, but I tested the changes for neither SELinux nor SMACK.
  https://lore.kernel.org/r/20190829232935.7099-2-ca...@schaufler-ca.com
---
 include/linux/lsm_hooks.h |  1 +
 security/security.c   | 46 
 security/selinux/hooks.c  | 58 ---
 security/selinux/include/objsec.h |  6 
 security/selinux/ss/services.c|  3 +-
 security/smack/smack.h|  6 
 security/smack/smack_lsm.c| 35 +--
 7 files changed, 85 insertions(+), 70 deletions(-)

diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index c503f7ab8afb..ff0f03a45c56 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1563,6 +1563,7 @@ struct lsm_blob_sizes {
int lbs_cred;
int lbs_file;
int lbs_inode;
+   int lbs_superblock;
int lbs_ipc;
int lbs_msg_msg;
int lbs_task;
diff --git a/security/security.c b/security/security.c
index a28045dc9e7f..4ffd6c3af9d7 100644
--- a/security/security.c
+++ b/security/security.c
@@ -202,6 +202,7 @@ static void __init lsm_set_blob_sizes(struct lsm_blob_sizes *needed)
lsm_set_blob_size(&needed->lbs_inode, &blob_sizes.lbs_inode);
lsm_set_blob_size(&needed->lbs_ipc, &blob_sizes.lbs_ipc);
lsm_set_blob_size(&needed->lbs_msg_msg, &blob_sizes.lbs_msg_msg);
+   lsm_set_blob_size(&needed->lbs_superblock, &blob_sizes.lbs_superblock);
lsm_set_blob_size(&needed->lbs_task, &blob_sizes.lbs_task);
 }
 
@@ -332,12 +333,13 @@ static void __init ordered_lsm_init(void)
for (lsm = ordered_lsms; *lsm; lsm++)
prepare_lsm(*lsm);
 
-   init_debug("cred blob size = %d\n", blob_sizes.lbs_cred);
-   init_debug("file blob size = %d\n", blob_sizes.lbs_file);
-   init_debug("inode blob size= %d\n", blob_sizes.lbs_inode);
-   init_debug("ipc blob size  = %d\n", blob_sizes.lbs_ipc);
-   init_debug("msg_msg blob size  = %d\n", blob_sizes.lbs_msg_msg);
-   init_debug("task blob size = %d\n", blob_sizes.lbs_task);
+   init_debug("cred blob size   = %d\n", blob_sizes.lbs_cred);
+   init_debug("file blob size   = %d\n", blob_sizes.lbs_file);
+   init_debug("inode blob size  = %d\n", blob_sizes.lbs_inode);
+   init_debug("ipc blob size= %d\n", blob_sizes.lbs_ipc);
+   init_debug("msg_msg blob size= %d\n", blob_sizes.lbs_msg_msg);
+   init_debug("superblock blob size = %d\n", blob_sizes.lbs_superblock);
+   init_debug("task blob size   = %d\n", blob_sizes.lbs_task);
 
/*
 * Create any kmem_caches needed for blobs
@@ -669,6 +671,27 @@ static void __init lsm_early_task(struct task_struct *task)
panic("%s: Early task alloc failed.\n", __func__);
 }
 
+/**
+ * lsm_superblock_alloc - allocate a composite superblock blob
+ * @sb: the superblock that needs a blob
+ *
+ * Allocate the superblock blob for all the modules
+ *
+ * Returns 0, or -ENOMEM if memory can't be allocated.
+ */
+static int lsm_superblock_alloc(struct super_block *sb)
+{
+   if (blob_sizes.lbs_superblock == 0) {
+   sb->s_security = NULL;
+   return 0;
+   }
+
+   sb->s_security = kzalloc(blob_sizes.lbs_superblock, GFP_KERNEL);
+   if (sb->s_security == NULL)
+   return -ENOMEM;
+   return 0;
+}
+
 /*
  * The default value of the LSM hook is defined in linux/lsm_hook_defs.h and
  * can be accessed with:
@@ -866,12 +889,21 @@ int security_fs_context_parse_param(struct fs_context *fc, struct fs_parameter *
 
 int security_sb_alloc(struct super_block *sb)
 {
-   return call_int_hook(sb_alloc_security, 0, sb);
+   int rc = lsm_superblock_alloc(sb);
+
+   if (unlikely(rc))
+   return rc;
+   rc = call_int_hook(sb_alloc_security, 0, sb);
+   if (unlikely(rc))
+   security_sb_free(sb);
+   return rc;
 }
 
 

[PATCH v23 11/12] samples/landlock: Add a sandbox manager example

2020-11-03 Thread Mickaël Salaün
From: Mickaël Salaün 

Add a basic sandbox tool to launch a command which can only access a
whitelist of file hierarchies in a read-only or read-write way.

Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v21:
* Remove LANDLOCK_ACCESS_FS_CHROOT.
* Clean up help.

Changes since v20:
* Update with new syscalls and type names.
* Update errno check for EOPNOTSUPP.
* Use the full syscall interfaces: explicitly set the "flags" field to
  zero.

Changes since v19:
* Update with the new Landlock syscalls.
* Comply with commit 5f2fb52fac15 ("kbuild: rename hostprogs-y/always to
  hostprogs/always-y").

Changes since v16:
* Switch syscall attribute pointer and size arguments.

Changes since v15:
* Update access right names.
* Properly assign access right to files according to the new related
  syscall restriction.
* Replace "select" with "depends on" HEADERS_INSTALL (suggested by Randy
  Dunlap).

Changes since v14:
* Fix Kconfig dependency.
* Remove access rights that may be required for FD-only requests:
  mmap, truncate, getattr, lock, chmod, chown, chgrp, ioctl.
* Fix useless hardcoded syscall number.
* Use execvpe().
* Follow symlinks.
* Extend help with common file paths.
* Constify variables.
* Clean up comments.
* Improve error message.

Changes since v11:
* Add back the filesystem sandbox manager and update it to work with the
  new Landlock syscall.

Previous changes:
https://lore.kernel.org/lkml/20190721213116.23476-9-...@digikod.net/
---
 samples/Kconfig  |   7 ++
 samples/Makefile |   1 +
 samples/landlock/.gitignore  |   1 +
 samples/landlock/Makefile|  15 +++
 samples/landlock/sandboxer.c | 219 +++
 5 files changed, 243 insertions(+)
 create mode 100644 samples/landlock/.gitignore
 create mode 100644 samples/landlock/Makefile
 create mode 100644 samples/landlock/sandboxer.c

diff --git a/samples/Kconfig b/samples/Kconfig
index 0ed6e4d71d87..e6129496ced5 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -124,6 +124,13 @@ config SAMPLE_HIDRAW
bool "hidraw sample"
depends on CC_CAN_LINK && HEADERS_INSTALL
 
+config SAMPLE_LANDLOCK
+   bool "Build Landlock sample code"
+   depends on HEADERS_INSTALL
+   help
+ Build a simple Landlock sandbox manager able to launch a process
+ restricted by a user-defined filesystem access control.
+
 config SAMPLE_PIDFD
bool "pidfd sample"
depends on CC_CAN_LINK && HEADERS_INSTALL
diff --git a/samples/Makefile b/samples/Makefile
index c3392a595e4b..087e0988ccc5 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -11,6 +11,7 @@ obj-$(CONFIG_SAMPLE_KDB)  += kdb/
 obj-$(CONFIG_SAMPLE_KFIFO) += kfifo/
 obj-$(CONFIG_SAMPLE_KOBJECT)   += kobject/
 obj-$(CONFIG_SAMPLE_KPROBES)   += kprobes/
+subdir-$(CONFIG_SAMPLE_LANDLOCK)   += landlock
 obj-$(CONFIG_SAMPLE_LIVEPATCH) += livepatch/
 subdir-$(CONFIG_SAMPLE_PIDFD)  += pidfd
 obj-$(CONFIG_SAMPLE_QMI_CLIENT)+= qmi/
diff --git a/samples/landlock/.gitignore b/samples/landlock/.gitignore
new file mode 100644
index ..f43668b2d318
--- /dev/null
+++ b/samples/landlock/.gitignore
@@ -0,0 +1 @@
+/sandboxer
diff --git a/samples/landlock/Makefile b/samples/landlock/Makefile
new file mode 100644
index ..21eda5774948
--- /dev/null
+++ b/samples/landlock/Makefile
@@ -0,0 +1,15 @@
+# SPDX-License-Identifier: BSD-3-Clause
+
+hostprogs := sandboxer
+
+always-y := $(hostprogs)
+
+KBUILD_HOSTCFLAGS += -I$(objtree)/usr/include
+
+.PHONY: all clean
+
+all:
+   $(MAKE) -C ../.. samples/landlock/
+
+clean:
+   $(MAKE) -C ../.. M=samples/landlock/ clean
diff --git a/samples/landlock/sandboxer.c b/samples/landlock/sandboxer.c
new file mode 100644
index ..ee5ec1203cb7
--- /dev/null
+++ b/samples/landlock/sandboxer.c
@@ -0,0 +1,219 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/*
+ * Simple Landlock sandbox manager able to launch a process restricted by a
+ * user-defined filesystem access control.
+ *
+ * Copyright © 2017-2020 Mickaël Salaün 
+ * Copyright © 2020 ANSSI
+ */
+
+#define _GNU_SOURCE
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#ifndef landlock_create_ruleset
+static inline int landlock_create_ruleset(
+   const struct landlock_ruleset_attr *const attr,
+   const size_t size, const __u32 flags)
+{
+   errno = 0;
+   return syscall(__NR_landlock_create_ruleset, attr, size, flags);
+}
+#endif
+
+#ifndef landlock_add_rule
+static inline int landlock_add_rule(const int ruleset_fd,
+   const enum landlock_rule_type rule_type,
+   const void *const rule_attr, const __u32 flags)
+{
+

[PATCH v23 03/12] landlock: Set up the security framework and manage credentials

2020-11-03 Thread Mickaël Salaün
From: Mickaël Salaün 

A process's credentials point to a Landlock domain, which is implemented
underneath as a ruleset.  In the following commits, this domain is
used to check and enforce the ptrace and filesystem security policies.
A domain is inherited from a parent to its child the same way a thread
inherits a seccomp policy.
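
The inheritance described above can be pictured with a small userspace
model: when credentials are prepared for a child, the new credentials
take a reference on the parent's domain instead of copying it.  This is
only a sketch; ruleset_model, cred_security_model and the two helpers are
hypothetical names, and a plain counter stands in for the kernel's atomic
reference counting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified ruleset: only the reference counter is modelled. */
struct ruleset_model {
	unsigned int usage;
};

/* Hypothetical per-credential security blob pointing at the domain. */
struct cred_security_model {
	struct ruleset_model *domain;
};

/* Called when a child's credentials are prepared from the parent's. */
static void cred_prepare_model(struct cred_security_model *new_cred,
		const struct cred_security_model *old_cred)
{
	struct ruleset_model *dom;

	assert(new_cred != NULL && old_cred != NULL);
	dom = old_cred->domain;
	if (dom)
		dom->usage++;	/* the child now shares the parent's domain */
	new_cred->domain = dom;
}

/* Called when credentials are released; returns 1 if the domain died. */
static int cred_free_model(struct cred_security_model *cred)
{
	struct ruleset_model *const dom = cred->domain;

	cred->domain = NULL;
	if (dom && --dom->usage == 0)
		return 1;
	return 0;
}
```

In this model the domain outlives the parent's credentials for as long as
a child still references it, which is the property the commit message
relies on.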

Cc: James Morris 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
Reviewed-by: Jann Horn 
---

Changes since v21:
* Fix copyright dates.

Changes since v17:
* Constify returned domain pointers from landlock_get_current_domain()
  and landlock_get_task_domain() helpers.

Changes since v15:
* Optimize landlocked() for current thread.
* Display the greeting message when everything is initialized.

Changes since v14:
* Use pr_fmt from common.h.
* Constify variables.
* Remove useless NULL initialization.

Changes since v13:
* Totally get rid of the seccomp dependency.
* Only keep credential management and LSM setup.

Previous changes:
https://lore.kernel.org/lkml/20191104172146.30797-4-...@digikod.net/
---
 security/Kconfig   | 10 +++
 security/landlock/Makefile |  3 +-
 security/landlock/common.h | 20 +
 security/landlock/cred.c   | 46 ++
 security/landlock/cred.h   | 58 ++
 security/landlock/setup.c  | 31 
 security/landlock/setup.h  | 16 +++
 7 files changed, 178 insertions(+), 6 deletions(-)
 create mode 100644 security/landlock/common.h
 create mode 100644 security/landlock/cred.c
 create mode 100644 security/landlock/cred.h
 create mode 100644 security/landlock/setup.c
 create mode 100644 security/landlock/setup.h

diff --git a/security/Kconfig b/security/Kconfig
index 15a4342b5d01..0ced7fd33e4d 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -278,11 +278,11 @@ endchoice
 
 config LSM
string "Ordered list of enabled LSMs"
-   default "lockdown,yama,loadpin,safesetid,integrity,smack,selinux,tomoyo,apparmor,bpf" if DEFAULT_SECURITY_SMACK
-   default "lockdown,yama,loadpin,safesetid,integrity,apparmor,selinux,smack,tomoyo,bpf" if DEFAULT_SECURITY_APPARMOR
-   default "lockdown,yama,loadpin,safesetid,integrity,tomoyo,bpf" if DEFAULT_SECURITY_TOMOYO
-   default "lockdown,yama,loadpin,safesetid,integrity,bpf" if DEFAULT_SECURITY_DAC
-   default "lockdown,yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor,bpf"
+   default "landlock,lockdown,yama,loadpin,safesetid,integrity,smack,selinux,tomoyo,apparmor,bpf" if DEFAULT_SECURITY_SMACK
+   default "landlock,lockdown,yama,loadpin,safesetid,integrity,apparmor,selinux,smack,tomoyo,bpf" if DEFAULT_SECURITY_APPARMOR
+   default "landlock,lockdown,yama,loadpin,safesetid,integrity,tomoyo,bpf" if DEFAULT_SECURITY_TOMOYO
+   default "landlock,lockdown,yama,loadpin,safesetid,integrity,bpf" if DEFAULT_SECURITY_DAC
+   default "landlock,lockdown,yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor,bpf"
help
  A comma-separated list of LSMs, in initialization order.
  Any LSMs left off this list will be ignored. This can be
diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index d846eba445bb..041ea242e627 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -1,3 +1,4 @@
 obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
 
-landlock-y := object.o ruleset.o
+landlock-y := setup.o object.o ruleset.o \
+   cred.o
diff --git a/security/landlock/common.h b/security/landlock/common.h
new file mode 100644
index ..5dc0fe15707d
--- /dev/null
+++ b/security/landlock/common.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Landlock LSM - Common constants and helpers
+ *
+ * Copyright © 2016-2020 Mickaël Salaün 
+ * Copyright © 2018-2020 ANSSI
+ */
+
+#ifndef _SECURITY_LANDLOCK_COMMON_H
+#define _SECURITY_LANDLOCK_COMMON_H
+
+#define LANDLOCK_NAME "landlock"
+
+#ifdef pr_fmt
+#undef pr_fmt
+#endif
+
+#define pr_fmt(fmt) LANDLOCK_NAME ": " fmt
+
+#endif /* _SECURITY_LANDLOCK_COMMON_H */
diff --git a/security/landlock/cred.c b/security/landlock/cred.c
new file mode 100644
index ..7074149d2517
--- /dev/null
+++ b/security/landlock/cred.c
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - Credential hooks
+ *
+ * Copyright © 2017-2020 Mickaël Salaün 
+ * Copyright © 2018-2020 ANSSI
+ */
+
+#include 
+#include 
+
+#include "common.h"
+#include "cred.h"
+#include "ruleset.h"
+#include "setup.h"
+
+static int hook_cred_prepare(struct cred *const new,
+   const struct cred *const old, const gfp_t gfp)
+{
+   const struct landlock_cred_security *cred_old = landlock_cred(old);
+   struct landlock_cred_security *cred_new

[PATCH v23 01/12] landlock: Add object management

2020-11-03 Thread Mickaël Salaün
From: Mickaël Salaün 

A Landlock object makes it possible to identify a kernel object (e.g. an
inode).  A Landlock rule is a set of access rights allowed on an object.
Rules are grouped in rulesets that may be tied to a set of processes
(i.e. subjects) to enforce a scoped access-control (i.e. a domain).

Because Landlock's goal is to empower any process (especially
unprivileged ones) to sandbox themselves, we cannot rely on a
system-wide object identification such as file extended attributes.
Indeed, we need innocuous, composable and modular access-controls.

The main challenge with these constraints is to identify kernel objects
while this identification is useful (i.e. when a security policy makes
use of this object).  But this identification data should be freed once
no policy is using it.  This ephemeral tagging should not and may not be
written in the filesystem.  We then need to manage the lifetime of a
rule according to the lifetime of its objects.  To avoid a global lock,
this implementation makes use of RCU and counters to safely reference
objects.
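
To make the lifetime idea concrete, here is a simplified userspace model
of an ephemeral tag that is detached from its underlying object when the
last reference goes away.  It is a sketch only: it uses a plain counter
where the kernel relies on RCU and atomic counters, and object_model,
underops_model, get_object(), put_object() and release_inode_model() are
hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct object_model;

/* Hypothetical callback table: how to detach from the underlying object. */
struct underops_model {
	void (*release)(struct object_model *object);
};

/* Hypothetical ephemeral tag tied to a kernel object such as an inode. */
struct object_model {
	unsigned int usage;			/* references held by rules */
	const struct underops_model *underops;
	void *underobj;				/* e.g. the tagged inode */
};

/* Take a reference; an object whose count reached zero must not be revived. */
static void get_object(struct object_model *object)
{
	assert(object != NULL && object->usage > 0);
	object->usage++;
}

/* Drop one reference; on the last one, untag the underlying object. */
static void put_object(struct object_model *object)
{
	if (object && --object->usage == 0)
		object->underops->release(object);
}

/* Example release callback: forget the underlying object. */
static void release_inode_model(struct object_model *object)
{
	object->underobj = NULL;
}

static const struct underops_model inode_ops_model = {
	.release = release_inode_model,
};
```

As long as at least one rule holds a reference, the tag keeps pointing at
the underlying object; the release callback runs exactly once, when the
count drops to zero.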

A following commit uses this generic object management for inodes.

Cc: James Morris 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
Reviewed-by: Jann Horn 
---

Changes since v22:
* Fix spelling (spotted by Jann Horn).

Changes since v21:
* Update Kconfig help.
* Clean up comments.

Changes since v18:
* Account objects to kmemcg.

Changes since v14:
* Simplify the object, rule and ruleset management at the expense of a
  less aggressive memory freeing (contributed by Jann Horn, with
  additional modifications):
  - Remove object->list aggregating the rules tied to an object.
  - Remove landlock_get_object(), landlock_drop_object(),
{get,put}_object_cleaner() and landlock_rule_is_disabled().
  - Rewrite landlock_put_object() to use a simpler mechanism
    (no tricky RCU).
  - Replace enum landlock_object_type and landlock_release_object() with
landlock_object_underops->release()
  - Adjust unions and Sparse annotations.
  Cf. https://lore.kernel.org/lkml/cag48ez21ben0wl1bbmtiiu8j9jp5iewthowz4turuj+ki0y...@mail.gmail.com/
* Merge struct landlock_rule into landlock_ruleset_elem to simplify the
  rule management.
* Constify variables.
* Improve kernel documentation.
* Cosmetic variable renames.
* Remove the "default" in the Kconfig (suggested by Jann Horn).
* Only use refcount_inc() through getter helpers.
* Update Kconfig description.

Changes since v13:
* New dedicated implementation, removing the need for eBPF.

Previous changes:
https://lore.kernel.org/lkml/20190721213116.23476-6-...@digikod.net/
---
 MAINTAINERS| 10 +
 security/Kconfig   |  1 +
 security/Makefile  |  2 +
 security/landlock/Kconfig  | 19 
 security/landlock/Makefile |  3 ++
 security/landlock/object.c | 66 +++
 security/landlock/object.h | 91 ++
 7 files changed, 192 insertions(+)
 create mode 100644 security/landlock/Kconfig
 create mode 100644 security/landlock/Makefile
 create mode 100644 security/landlock/object.c
 create mode 100644 security/landlock/object.h

diff --git a/MAINTAINERS b/MAINTAINERS
index b516bb34a8d5..38f7d50008e7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9826,6 +9826,16 @@ F:   net/core/sock_map.c
 F: net/ipv4/tcp_bpf.c
 F: net/ipv4/udp_bpf.c
 
+LANDLOCK SECURITY MODULE
+M:     Mickaël Salaün 
+L: linux-security-mod...@vger.kernel.org
+S: Supported
+W: https://landlock.io
+T: git https://github.com/landlock-lsm/linux.git
+F: security/landlock/
+K: landlock
+K: LANDLOCK
+
 LANTIQ / INTEL Ethernet drivers
 M: Hauke Mehrtens 
 L: net...@vger.kernel.org
diff --git a/security/Kconfig b/security/Kconfig
index 7561f6f99f1d..15a4342b5d01 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -238,6 +238,7 @@ source "security/loadpin/Kconfig"
 source "security/yama/Kconfig"
 source "security/safesetid/Kconfig"
 source "security/lockdown/Kconfig"
+source "security/landlock/Kconfig"
 
 source "security/integrity/Kconfig"
 
diff --git a/security/Makefile b/security/Makefile
index 3baf435de541..c688f4907a1b 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -13,6 +13,7 @@ subdir-$(CONFIG_SECURITY_LOADPIN) += loadpin
 subdir-$(CONFIG_SECURITY_SAFESETID)+= safesetid
 subdir-$(CONFIG_SECURITY_LOCKDOWN_LSM) += lockdown
 subdir-$(CONFIG_BPF_LSM)   += bpf
+subdir-$(CONFIG_SECURITY_LANDLOCK) += landlock
 
 # always enable default capabilities
 obj-y  += commoncap.o
@@ -32,6 +33,7 @@ obj-$(CONFIG_SECURITY_SAFESETID)   += safesetid/
 obj-$(CONFIG_SECURITY_LOCKDOWN_LSM)+= lockdown/
 obj-$(CONFIG_CGROUPS)  += device_cgroup.o
 obj-$(CONFIG_BPF_LSM)  += bpf/
+obj-$(CONFIG_SECURITY_LANDLOCK)+= landlock/
 
 # Object in

Re: [PATCH v22 07/12] landlock: Support filesystem access-control

2020-11-03 Thread Mickaël Salaün


On 29/10/2020 02:06, Jann Horn wrote:
> (On Tue, Oct 27, 2020 at 9:04 PM Mickaël Salaün  wrote:

>> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> [...]
>> +static inline u32 get_file_access(const struct file *const file)
>> +{
>> +   u32 access = 0;
>> +
>> +   if (file->f_mode & FMODE_READ) {
>> +   /* A directory can only be opened in read mode. */
>> +   if (S_ISDIR(file_inode(file)->i_mode))
>> +   return LANDLOCK_ACCESS_FS_READ_DIR;
>> +   access = LANDLOCK_ACCESS_FS_READ_FILE;
>> +   }
>> +   /*
>> +* A LANDLOCK_ACCESS_FS_APPEND could be added but we also need to check
>> +* fcntl(2).
>> +*/
> 
> Once https://lore.kernel.org/linux-api/20200831153207.go3...@brightrain.aerifal.cx/
> lands, pwritev2() with RWF_NOAPPEND will also be problematic for
> classifying "write" vs "append"; you may want to include that in the
> comment. (Or delete the comment.)

Contrary to fcntl(2), pwritev2(2) doesn't seem to modify the file
description. Otherwise, other LSMs would need to be patched.
I'll remove this comment anyway.

> 
>> +   if (file->f_mode & FMODE_WRITE)
>> +   access |= LANDLOCK_ACCESS_FS_WRITE_FILE;
>> +   /* __FMODE_EXEC is indeed part of f_flags, not f_mode. */
>> +   if (file->f_flags & __FMODE_EXEC)
>> +   access |= LANDLOCK_ACCESS_FS_EXECUTE;
>> +   return access;
>> +}
> [...]
> 
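
For readers following from userspace, the classification done by the
quoted get_file_access() can be approximated from open(2) flags.  The
sketch below is not the kernel function: the SKETCH_ACCESS_* bit values
and sketch_open_access() are made up for illustration, and execution is
left out.  It also shows why the append discussion matters: without a
dedicated append right, an O_APPEND open is classified as a plain write.

```c
#include <fcntl.h>
#include <stdbool.h>

/* Illustrative bit values for this sketch only. */
#define SKETCH_ACCESS_WRITE_FILE	(1U << 0)
#define SKETCH_ACCESS_READ_FILE		(1U << 1)
#define SKETCH_ACCESS_READ_DIR		(1U << 2)

/* Map the access mode of an open(2) call to the rights it would need. */
static unsigned int sketch_open_access(const int open_flags, const bool is_dir)
{
	unsigned int access = 0;
	const int mode = open_flags & O_ACCMODE;

	if (mode == O_RDONLY || mode == O_RDWR) {
		/* A directory can only be opened in read mode. */
		if (is_dir)
			return SKETCH_ACCESS_READ_DIR;
		access = SKETCH_ACCESS_READ_FILE;
	}
	/* O_APPEND does not change the result: it is still a write. */
	if (mode == O_WRONLY || mode == O_RDWR)
		access |= SKETCH_ACCESS_WRITE_FILE;
	return access;
}
```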


Re: [PATCH v1 1/2] ptrace: Set PF_SUPERPRIV when checking capability

2020-10-30 Thread Mickaël Salaün


On 30/10/2020 16:47, Jann Horn wrote:
> On Fri, Oct 30, 2020 at 1:39 PM Mickaël Salaün  wrote:
>> Commit 69f594a38967 ("ptrace: do not audit capability check when outputing
>> /proc/pid/stat") replaced the use of ns_capable() with
>> has_ns_capability{,_noaudit}() which doesn't set PF_SUPERPRIV.
>>
>> Commit 6b3ad6649a4c ("ptrace: reintroduce usage of subjective credentials in
>> ptrace_has_cap()") replaced has_ns_capability{,_noaudit}() with
>> security_capable(), which doesn't set PF_SUPERPRIV either.
>>
>> Since commit 98f368e9e263 ("kernel: Add noaudit variant of ns_capable()"), a
>> new ns_capable_noaudit() helper is available.  Let's use it!
>>
>> As a result, the signature of ptrace_has_cap() is restored to its original one.
>>
>> Cc: Christian Brauner 
>> Cc: Eric Paris 
>> Cc: Jann Horn 
>> Cc: Kees Cook 
>> Cc: Oleg Nesterov 
>> Cc: Serge E. Hallyn 
>> Cc: Tyler Hicks 
>> Cc: sta...@vger.kernel.org
>> Fixes: 6b3ad6649a4c ("ptrace: reintroduce usage of subjective credentials in ptrace_has_cap()")
>> Fixes: 69f594a38967 ("ptrace: do not audit capability check when outputing /proc/pid/stat")
>> Signed-off-by: Mickaël Salaün 
> 
> Yeah... I guess this makes sense. (We'd have to undo or change it if
> we ever end up needing to use a different set of credentials, e.g.
> from ->f_cred, but I guess that's really something we should avoid
> anyway.)
> 
> Reviewed-by: Jann Horn 
> 
> with one nit:
> 
> 
> [...]
>>  /* Returns 0 on success, -errno on denial. */
>>  static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
>>  {
>> -   const struct cred *cred = current_cred(), *tcred;
>> +   const struct cred *const cred = current_cred(), *tcred;
> 
> This is an unrelated change, and almost no kernel code marks local
> pointer variables as "const". I would drop this change from the patch.

This guarantees that the cred variable will not be used for anything
other than current_cred(), which helps show that this patch doesn't
change the behavior of __ptrace_may_access() by no longer passing cred
to ptrace_has_cap(). It doesn't hurt, and I think it could help spot
issues when backporting.

> 
>> struct mm_struct *mm;
>> kuid_t caller_uid;
>> kgid_t caller_gid;


Re: [PATCH v22 08/12] landlock: Add syscall implementations

2020-10-30 Thread Mickaël Salaün


On 30/10/2020 04:07, Jann Horn wrote:
> On Thu, Oct 29, 2020 at 12:30 PM Mickaël Salaün  wrote:
>> On 29/10/2020 02:06, Jann Horn wrote:
>>> On Tue, Oct 27, 2020 at 9:04 PM Mickaël Salaün  wrote:
>>>> These 3 system calls are designed to be used by unprivileged processes
>>>> to sandbox themselves:
> [...]
>>>> +   /*
>>>> +* Similar checks as for seccomp(2), except that an -EPERM may be
>>>> +* returned.
>>>> +*/
>>>> +   if (!task_no_new_privs(current)) {
>>>> +   err = security_capable(current_cred(), current_user_ns(),
>>>> +   CAP_SYS_ADMIN, CAP_OPT_NOAUDIT);
>>>
>>> I think this should be ns_capable_noaudit(current_user_ns(), CAP_SYS_ADMIN)?
>>
>> Right. The main difference is that ns_capable*() set PF_SUPERPRIV in
>> current->flags. I guess seccomp should use ns_capable_noaudit() as well?
> 
> Yeah. That seccomp code is from commit e2cfabdfd0756, with commit date
> in April 2012, while ns_capable_noaudit() was introduced in commit
> 98f368e9e263, with commit date in June 2016; the seccomp code predates
> the availability of that API.
> 
> Do you want to send a patch to Kees for that, or should I?
> 

I found another case of this inconsistency in ptrace. I sent patches:
https://lore.kernel.org/lkml/20201030123849.770769-1-...@digikod.net/


[PATCH v1 1/2] ptrace: Set PF_SUPERPRIV when checking capability

2020-10-30 Thread Mickaël Salaün
From: Mickaël Salaün 

Commit 69f594a38967 ("ptrace: do not audit capability check when outputing
/proc/pid/stat") replaced the use of ns_capable() with
has_ns_capability{,_noaudit}() which doesn't set PF_SUPERPRIV.

Commit 6b3ad6649a4c ("ptrace: reintroduce usage of subjective credentials in
ptrace_has_cap()") replaced has_ns_capability{,_noaudit}() with
security_capable(), which doesn't set PF_SUPERPRIV either.

Since commit 98f368e9e263 ("kernel: Add noaudit variant of ns_capable()"), a
new ns_capable_noaudit() helper is available.  Let's use it!

As a result, the signature of ptrace_has_cap() is restored to its original one.
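
The behavioural difference this patch is about can be modelled in a few
lines of userspace C: a bare security check only answers yes or no,
whereas the ns_capable() family also records on the task that privilege
was used.  Everything below is a simplified sketch; task_model, the
*_model() helpers and the PF_SUPERPRIV_MODEL value are stand-ins chosen
for illustration, not the kernel definitions.

```c
#include <stdbool.h>

#define PF_SUPERPRIV_MODEL 0x100u	/* illustrative flag value */

struct task_model {
	unsigned int flags;
	bool has_cap;	/* stands in for the real capability sets */
};

/* Like security_capable(): a pure check, 0 on success, no side effect. */
static int security_capable_model(const struct task_model *task)
{
	return task->has_cap ? 0 : -1;
}

/* Like ns_capable{,_noaudit}(): also records that privilege was used. */
static bool ns_capable_model(struct task_model *task)
{
	if (security_capable_model(task) != 0)
		return false;
	task->flags |= PF_SUPERPRIV_MODEL;
	return true;
}
```

In this model, calling only security_capable_model() leaves the task's
flags untouched even when the check succeeds, which mirrors the
inconsistency the patch removes.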

Cc: Christian Brauner 
Cc: Eric Paris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Oleg Nesterov 
Cc: Serge E. Hallyn 
Cc: Tyler Hicks 
Cc: sta...@vger.kernel.org
Fixes: 6b3ad6649a4c ("ptrace: reintroduce usage of subjective credentials in ptrace_has_cap()")
Fixes: 69f594a38967 ("ptrace: do not audit capability check when outputing /proc/pid/stat")
Signed-off-by: Mickaël Salaün 
---
 kernel/ptrace.c | 18 ++
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 43d6179508d6..aa3c2fd6e41b 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -264,23 +264,17 @@ static int ptrace_check_attach(struct task_struct *child, bool ignore_state)
return ret;
 }
 
-static bool ptrace_has_cap(const struct cred *cred, struct user_namespace *ns,
-  unsigned int mode)
+static bool ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
 {
-   int ret;
-
if (mode & PTRACE_MODE_NOAUDIT)
-   ret = security_capable(cred, ns, CAP_SYS_PTRACE, CAP_OPT_NOAUDIT);
-   else
-   ret = security_capable(cred, ns, CAP_SYS_PTRACE, CAP_OPT_NONE);
-
-   return ret == 0;
+   return ns_capable_noaudit(ns, CAP_SYS_PTRACE);
+   return ns_capable(ns, CAP_SYS_PTRACE);
 }
 
 /* Returns 0 on success, -errno on denial. */
 static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
 {
-   const struct cred *cred = current_cred(), *tcred;
+   const struct cred *const cred = current_cred(), *tcred;
struct mm_struct *mm;
kuid_t caller_uid;
kgid_t caller_gid;
@@ -326,7 +320,7 @@ static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
gid_eq(caller_gid, tcred->sgid) &&
gid_eq(caller_gid, tcred->gid))
goto ok;
-   if (ptrace_has_cap(cred, tcred->user_ns, mode))
+   if (ptrace_has_cap(tcred->user_ns, mode))
goto ok;
rcu_read_unlock();
return -EPERM;
@@ -345,7 +339,7 @@ static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
mm = task->mm;
if (mm &&
((get_dumpable(mm) != SUID_DUMP_USER) &&
-!ptrace_has_cap(cred, mm->user_ns, mode)))
+!ptrace_has_cap(mm->user_ns, mode)))
return -EPERM;
 
return security_ptrace_access_check(task, mode);
-- 
2.28.0



[PATCH v1 2/2] seccomp: Set PF_SUPERPRIV when checking capability

2020-10-30 Thread Mickaël Salaün
From: Mickaël Salaün 

Replace the use of security_capable(current_cred(), ...) with
ns_capable_noaudit(), which sets PF_SUPERPRIV.

Since commit 98f368e9e263 ("kernel: Add noaudit variant of
ns_capable()"), a new ns_capable_noaudit() helper is available.  Let's
use it!

Cc: Jann Horn 
Cc: Kees Cook 
Cc: Tyler Hicks 
Cc: Will Drewry 
Cc: sta...@vger.kernel.org
Fixes: e2cfabdfd075 ("seccomp: add system call filtering using BPF")
Signed-off-by: Mickaël Salaün 
---
 kernel/seccomp.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 8ad7a293255a..53a7d1512dd7 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -38,7 +38,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
@@ -558,8 +558,7 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
 * behavior of privileged children.
 */
if (!task_no_new_privs(current) &&
-   security_capable(current_cred(), current_user_ns(),
-CAP_SYS_ADMIN, CAP_OPT_NOAUDIT) != 0)
+   !ns_capable_noaudit(current_user_ns(), CAP_SYS_ADMIN))
return ERR_PTR(-EACCES);
 
/* Allocate a new seccomp_filter */
-- 
2.28.0



[PATCH v1 0/2] Fix misuse of security_capable()

2020-10-30 Thread Mickaël Salaün
This series replaces all uses of security_capable(current_cred(), ...)
with ns_capable{,_noaudit}(), which set PF_SUPERPRIV.

This initially came from a review of Landlock by Jann Horn:
https://lore.kernel.org/lkml/cag48ez1fqvkt78129wozbwfbvhapyar9ojahfhabbnxebr9...@mail.gmail.com/

Mickaël Salaün (2):
  ptrace: Set PF_SUPERPRIV when checking capability
  seccomp: Set PF_SUPERPRIV when checking capability

 kernel/ptrace.c  | 18 ++
 kernel/seccomp.c |  5 ++---
 2 files changed, 8 insertions(+), 15 deletions(-)


base-commit: 3650b228f83adda7e5ee532e2b90429c03f7b9ec
-- 
2.28.0



[PATCH] selftests/seccomp: Update kernel config

2020-10-30 Thread Mickaël Salaün
seccomp_bpf.c uses unshare(CLONE_NEWPID), which requires CONFIG_PID_NS
to be set.

Cc: Kees Cook 
Cc: Shuah Khan 
Cc: Tycho Andersen 
Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
Signed-off-by: Mickaël Salaün 
---
 tools/testing/selftests/seccomp/config | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/testing/selftests/seccomp/config b/tools/testing/selftests/seccomp/config
index 64c19d8eba79..ad431a5178fb 100644
--- a/tools/testing/selftests/seccomp/config
+++ b/tools/testing/selftests/seccomp/config
@@ -1,3 +1,4 @@
+CONFIG_PID_NS=y
 CONFIG_SECCOMP=y
 CONFIG_SECCOMP_FILTER=y
 CONFIG_USER_NS=y

base-commit: 3650b228f83adda7e5ee532e2b90429c03f7b9ec
-- 
2.28.0



Re: [PATCH v22 00/12] Landlock LSM

2020-10-29 Thread Mickaël Salaün


On 29/10/2020 02:05, Jann Horn wrote:
> On Tue, Oct 27, 2020 at 9:04 PM Mickaël Salaün  wrote:
>> This new patch series improves documentation, cleans up comments,
>> renames ARCH_EPHEMERAL_STATES to ARCH_EPHEMERAL_INODES and removes
>> LANDLOCK_ACCESS_FS_CHROOT.
> 
> Thanks for continuing to work on this! This is going to be really
> valuable for sandboxing.
> 
> I hadn't looked at this series for a while; but I've now read through
> it, and I don't see any major problems left. :) That said, there still
> are a couple small things...
> 

Thanks Jann, I really appreciate your reviews!


Re: [PATCH v22 12/12] landlock: Add user and kernel documentation

2020-10-29 Thread Mickaël Salaün


On 29/10/2020 02:07, Jann Horn wrote:
> On Tue, Oct 27, 2020 at 9:04 PM Mickaël Salaün  wrote:
>> This documentation can be built with the Sphinx framework.
>>
>> Cc: James Morris 
>> Cc: Jann Horn 
>> Cc: Kees Cook 
>> Cc: Serge E. Hallyn 
>> Signed-off-by: Mickaël Salaün 
>> Reviewed-by: Vincent Dagonneau 
> [...]
>> diff --git a/Documentation/userspace-api/landlock.rst b/Documentation/userspace-api/landlock.rst
> [...]
>> +Landlock rules
>> +==============
>> +
>> +A Landlock rule enables to describe an action on an object.  An object is
> 
> s/enables to describe/describes/

OK.

> 
>> +currently a file hierarchy, and the related filesystem actions are defined in
>> +`Access rights`_.  A set of rules is aggregated in a ruleset, which can then
>> +restrict the thread enforcing it, and its future children.
>> +
>> +Defining and enforcing a security policy
>> +----------------------------------------
>> +
>> +We first need to create the ruleset that will contain our rules.  For this
>> +example, the ruleset will contain rules which only allow read actions, but
>> +write actions will be denied.  The ruleset then needs to handle both of these
>> +kind of actions.  To have a backward compatibility, these actions should be
>> +ANDed with the supported ones.
> 
> This sounds as if there is a way for userspace to discover which
> actions are supported by the running kernel; but we don't have
> anything like that, right?

Right, it dates from landlock_get_features(2), which is now gone but
may be replaced by something else in the future. I'll remove that.

> 
> If we want to make that possible, we could maybe change
> sys_landlock_create_ruleset() so that if
> ruleset_attr.handled_access_fs contains bits we don't know, we clear
> those bits and then copy the struct back to userspace? And then
> userspace can retry the syscall with the cleared bits? Or something
> along those lines?

Yes, but I would prefer clear syscalls which don't read from and write
to the same argument. I'm working on a more generic solution. It should
not be an issue for now.
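
Purely as an illustration of the "ANDed with the supported ones" wording
in the quoted documentation, the masking step itself is straightforward
once a set of supported bits is known.  How userspace would learn that
set is exactly the open question in this thread, so the `supported`
argument below is an assumption, and restrict_to_supported() is a
hypothetical helper rather than any Landlock interface.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Keep only the requested access bits that are supported.  If @dropped is
 * not NULL, it receives the requested bits that had to be cleared.
 */
static uint64_t restrict_to_supported(const uint64_t requested,
		const uint64_t supported, uint64_t *const dropped)
{
	if (dropped != NULL)
		*dropped = requested & ~supported;
	return requested & supported;
}
```

A caller could inspect the dropped bits to decide whether running with a
weaker policy is acceptable or whether it should refuse to start.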

> 
> [...]
>> +We can now add a new rule to this ruleset thanks to the returned file
>> +descriptor referring to this ruleset.  The rule will only enable to read the
> 
> s/enable to read/allow reading/

OK.

> 
>> +file hierarchy ``/usr``.  Without another rule, write actions would then be
>> +denied by the ruleset.  To add ``/usr`` to the ruleset, we open it with the
>> +``O_PATH`` flag and fill the &struct landlock_path_beneath_attr with this file
>> +descriptor.
> [...]
>> +Inheritance
>> +-----------
>> +
>> +Every new thread resulting from a :manpage:`clone(2)` inherits Landlock domain
>> +restrictions from its parent.  This is similar to the seccomp inheritance (cf.
>> +:doc:`/userspace-api/seccomp_filter`) or any other LSM dealing with task's
>> +:manpage:`credentials(7)`.  For instance, one process's thread may apply
>> +Landlock rules to itself, but they will not be automatically applied to other
>> +sibling threads (unlike POSIX thread credential changes, cf.
>> +:manpage:`nptl(7)`).
>> +
>> +When a thread sandbox itself, we have the grantee that the related security
> 
> s/sandbox/sandboxes/
> s/grantee/guarantee/

OK.

> 
>> +policy will stay enforced on all this thread's descendants.  This enables to
>> +create standalone and modular security policies per application, which will
> 
> s/enables to create/allows creating/

OK.

> 
> 
>> +automatically be composed between themselves according to their runtime parent
>> +policies.
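
The composition described in the quoted text can be modelled in a few
lines of user-space C. This is a deliberately simplified sketch: it
tracks a single file hierarchy, and the sketch_* names, the two access
bits and the fixed layer limit are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_LAYERS 8
#define SKETCH_READ  (1U << 0)
#define SKETCH_WRITE (1U << 1)

/* One allowed-access mask per enforced policy, oldest first. */
struct sketch_domain {
	size_t nb_layers;
	uint32_t allowed[SKETCH_MAX_LAYERS];
};

/* A child starts from a copy of its parent's domain and adds one layer. */
static struct sketch_domain sketch_enforce(const struct sketch_domain *parent,
		uint32_t allowed)
{
	struct sketch_domain child = *parent;

	assert(child.nb_layers < SKETCH_MAX_LAYERS);
	child.allowed[child.nb_layers++] = allowed;
	return child;
}

/* A request is granted only if every inherited layer grants it. */
static bool sketch_is_allowed(const struct sketch_domain *domain,
		uint32_t request)
{
	size_t i;

	for (i = 0; i < domain->nb_layers; i++) {
		if ((domain->allowed[i] & request) != request)
			return false;
	}
	return true;
}
```

Because each call returns a new domain, the parent's domain is left
untouched, and a later, more permissive layer cannot give back what an
earlier layer removed.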


Re: [PATCH v22 08/12] landlock: Add syscall implementations

2020-10-29 Thread Mickaël Salaün


On 29/10/2020 02:06, Jann Horn wrote:
> On Tue, Oct 27, 2020 at 9:04 PM Mickaël Salaün  wrote:
>> These 3 system calls are designed to be used by unprivileged processes
>> to sandbox themselves:
>> * landlock_create_ruleset(2): Creates a ruleset and returns its file
>>   descriptor.
>> * landlock_add_rule(2): Adds a rule (e.g. file hierarchy access) to a
>>   ruleset, identified by the dedicated file descriptor.
>> * landlock_enforce_ruleset_current(2): Enforces a ruleset on the current
>>   thread and its future children (similar to seccomp).  This syscall has
>>   the same usage restrictions as seccomp(2): the caller must have the
>>   no_new_privs attribute set or have CAP_SYS_ADMIN in the current user
>>   namespace.
> [...]
>> Cc: Arnd Bergmann 
>> Cc: James Morris 
>> Cc: Jann Horn 
>> Cc: Kees Cook 
>> Cc: Serge E. Hallyn 
>> Signed-off-by: Mickaël Salaün 
> [...]
>> diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
> [...]
>> +/**
>> + * struct landlock_path_beneath_attr - Path hierarchy definition
>> + *
>> + * Argument of sys_landlock_add_rule().
>> + */
>> +struct landlock_path_beneath_attr {
>> +   /**
>> +* @allowed_access: Bitmask of allowed actions for this file hierarchy
>> +* (cf. `Filesystem flags`_).
>> +*/
>> +   __u64 allowed_access;
>> +   /**
>> +* @parent_fd: File descriptor, open with ``O_PATH``, which identify
> 
> nit: "identifies"

OK

> 
>> +* the parent directory of a file hierarchy, or just a file.
>> +*/
>> +   __s32 parent_fd;
>> +   /*
>> +* This struct is packed to avoid trailing reserved members.
>> +* Cf. security/landlock/syscall.c:build_check_abi()
>> +*/
>> +} __attribute__((packed));
> [...]
>> diff --git a/security/landlock/syscall.c b/security/landlock/syscall.c
> [...]
>> +static int copy_min_struct_from_user(void *const dst, const size_t ksize,
>> +   const size_t ksize_min, const void __user *const src,
>> +   const size_t usize)
>> +{
>> +   /* Checks buffer inconsistencies. */
>> +   BUILD_BUG_ON(!dst);
>> +   if (!src)
>> +   return -EFAULT;
>> +
>> +   /* Checks size ranges. */
>> +   BUILD_BUG_ON(ksize <= 0);
>> +   BUILD_BUG_ON(ksize < ksize_min);
> 
> To make these checks work reliably, you should add __always_inline to
> the function.

Done.

> 
>> +   if (usize < ksize_min)
>> +   return -EINVAL;
>> +   if (usize > PAGE_SIZE)
>> +   return -E2BIG;
>> +
>> +   /* Copies user buffer and fills with zeros. */
>> +   return copy_struct_from_user(dst, ksize, src, usize);
>> +}
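
For readers unfamiliar with the extensible-struct convention used by the
quoted helper, the following user-space sketch mimics its size checks and
the zero-fill behaviour of copy_struct_from_user(). It uses memcpy()
instead of a real user copy, and sketch_copy_min_struct() and
SKETCH_PAGE_SIZE (assumed to be 4096) are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096	/* Assumption: stands in for PAGE_SIZE. */

static int sketch_copy_min_struct(void *dst, size_t ksize, size_t ksize_min,
		const void *src, size_t usize)
{
	const unsigned char *bytes = src;
	size_t copied = usize < ksize ? usize : ksize;
	size_t i;

	assert(dst != NULL && ksize > 0 && ksize >= ksize_min);
	if (!src)
		return -EFAULT;

	/* Checks size ranges. */
	if (usize < ksize_min)
		return -EINVAL;
	if (usize > SKETCH_PAGE_SIZE)
		return -E2BIG;

	/* A larger caller struct is accepted only if its tail is zeroed. */
	for (i = ksize; i < usize; i++) {
		if (bytes[i])
			return -E2BIG;
	}

	/* Copies what the caller provided and zero-fills the rest. */
	memset(dst, 0, ksize);
	memcpy(dst, src, copied);
	return 0;
}
```

An older caller passing a shorter struct gets the missing fields zeroed,
while a newer caller setting fields unknown to this side is rejected.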
> [...]
>> +static int get_path_from_fd(const s32 fd, struct path *const path)
>> +{
>> +   struct fd f;
>> +   int err = 0;
>> +
>> +   BUILD_BUG_ON(!__same_type(fd,
>> +   ((struct landlock_path_beneath_attr *)NULL)->parent_fd));
>> +
>> +   /* Handles O_PATH. */
>> +   f = fdget_raw(fd);
>> +   if (!f.file)
>> +   return -EBADF;
>> +   /*
>> +* Only allows O_PATH file descriptor: enables to restrict ambient
>> +* filesystem access without requiring to open and risk leaking or
>> +* misusing a file descriptor.  Forbid internal filesystems (e.g.
>> +* nsfs), including pseudo filesystems that will never be mountable
>> +* (e.g. sockfs, pipefs).
>> +*/
>> +   if (!(f.file->f_mode & FMODE_PATH) ||
>> +   (f.file->f_path.mnt->mnt_flags & MNT_INTERNAL) ||
>> +   (f.file->f_path.dentry->d_sb->s_flags & SB_NOUSER) ||
>> +   d_is_negative(f.file->f_path.dentry) ||
>> +   IS_PRIVATE(d_backing_inode(f.file->f_path.dentry))) {
>> +   err = -EBADFD;
>> +   goto out_fdput;
>> +   }
>> +   path->mnt = f.file->f_path.mnt;
>> +   path->dentry = f.file->f_path.dentry;
> 
> those two lines can be replaced with "*path = f.file->f_path"

Done.

> 
>> +   path_get(path);
>> +
>> +out_fdput:
>> +   fdput(f);
>> +   return err;
>> +}
> [...]
>> +/**
>> + * sys_landlock_enforce_ruleset_current - Enforce a ruleset on the current 
>> tas

Re: [PATCH v22 07/12] landlock: Support filesystem access-control

2020-10-29 Thread Mickaël Salaün


On 29/10/2020 02:06, Jann Horn wrote:
> (On Tue, Oct 27, 2020 at 9:04 PM Mickaël Salaün  wrote:
>> Thanks to the Landlock objects and ruleset, it is possible to identify
>> inodes according to a process's domain.  To enable an unprivileged
>> process to express a file hierarchy, it first needs to open a directory
>> (or a file) and pass this file descriptor to the kernel through
>> landlock_add_rule(2).  When checking if a file access request is
>> allowed, we walk from the requested dentry to the real root, following
>> the different mount layers.  The access to each "tagged" inodes are
>> collected according to their rule layer level, and ANDed to create
>> access to the requested file hierarchy.  This makes possible to identify
>> a lot of files without tagging every inodes nor modifying the
>> filesystem, while still following the view and understanding the user
>> has from the filesystem.
>>
>> Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not
>> keep the same struct inodes for the same inodes whereas these inodes are
>> in use.
>>
>> This commit adds a minimal set of supported filesystem access-control
>> which doesn't enable to restrict all file-related actions.  This is the
>> result of multiple discussions to minimize the code of Landlock to ease
>> review.  Thanks to the Landlock design, extending this access-control
>> without breaking user space will not be a problem.  Moreover, seccomp
>> filters can be used to restrict the use of syscall families which may
>> not be currently handled by Landlock.
> [...]
>> diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
> [...]
>> +/**
>> + * DOC: fs_access
>> + *
>> + * A set of actions on kernel objects may be defined by an attribute (e.g.
>> + * &struct landlock_path_beneath_attr) including a bitmask of access.
>> + *
>> + * Filesystem flags
>> + * ~~~~~~~~~~~~~~~~
>> + *
>> + * These flags enable to restrict a sandbox process to a set of actions on
> 
> s/sandbox/sandboxed/

OK

> 
> [...]
>> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> [...]
>> +static const struct landlock_object_underops landlock_fs_underops = {
>> +   .release = release_inode
>> +};
> [...]
>> +/* Access-control management */
>> +
>> +static bool check_access_path_continue(
>> +   const struct landlock_ruleset *const domain,
>> +   const struct path *const path, const u32 access_request,
>> +   bool *const allow, u64 *const layer_mask)
>> +{
>> +   const struct landlock_rule *rule;
>> +   const struct inode *inode;
>> +   bool next = true;
>> +
>> +   prefetch(path->dentry->d_parent);
> 
> IIRC software prefetch() turned out to only rarely actually have a
> performance benefit, and they often actually make things worse; see
> e.g. <https://lwn.net/Articles/444336/>. Unless you have strong
> evidence that this actually brings a performance benefit, I'd probably
> get rid of this.

I took inspiration from the fs/d_path.c:prepend_path() but I agree. I'll
remove prefetch() calls in the next series. I'll add them later if a
benchmark shows an interesting performance impact.

> 
>> +   if (d_is_negative(path->dentry))
>> +   /* Continues to walk while there is no mapped inode. */
>> +   return true;
>> +   inode = d_backing_inode(path->dentry);
>> +   rcu_read_lock();
>> +   rule = landlock_find_rule(domain,
>> +   rcu_dereference(landlock_inode(inode)->object));
>> +   rcu_read_unlock();
>> +
>> +   /* Checks for matching layers. */
>> +   if (rule && (rule->layers | *layer_mask)) {
>> +   *allow = (rule->access & access_request) == access_request;
>> +   if (*allow) {
>> +   *layer_mask &= ~rule->layers;
>> +   /* Stops when a rule from each layer granted access. */
>> +   next = !!*layer_mask;
>> +   } else {
>> +   next = false;
>> +   }
>> +   }
>> +   return next;
>> +}
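
The walk described in the commit message (from the requested file up to
the root, stopping once a rule from each layer has granted access) can be
sketched in user space as below. The array stands in for the dentry walk,
struct sketch_rule and the access bits are hypothetical, and the sketch
tests for overlapping layers with a bitwise AND, which is an assumption of
this model rather than a statement about the quoted code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_READ  (1U << 0)
#define SKETCH_WRITE (1U << 1)

/* Rule tagged on one path component; layers == 0 means "no rule". */
struct sketch_rule {
	uint64_t layers;
	uint32_t access;
};

/*
 * rules[0] belongs to the requested file, rules[n - 1] to the root.
 * layer_mask has one bit set per layer that still has to grant access.
 */
static bool sketch_check_access(const struct sketch_rule *rules, size_t n,
		uint64_t layer_mask, uint32_t access_request)
{
	size_t i;

	assert(rules != NULL);
	for (i = 0; i < n; i++) {
		const struct sketch_rule *rule = &rules[i];

		/* Continues to walk while no pending layer is matched. */
		if (!(rule->layers & layer_mask))
			continue;
		if ((rule->access & access_request) != access_request)
			return false;
		layer_mask &= ~rule->layers;
		/* Stops when a rule from each layer granted access. */
		if (!layer_mask)
			return true;
	}
	return false;
}
```

Reaching the root with bits still set in layer_mask means that at least
one layer never granted the request, so the access is denied.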
>> +
>> +static int check_access_path(const struct landlock_ruleset *const domain,
>> +   const struct path *const path, u32 access_request)
>> +{
>> +   bool allow = false;
>> +   struct path walker_path;
>> +   u64 layer_mask;
>> +
>> +   if (WARN_ON_ONCE(!domain || !path)

Re: [PATCH v22 02/12] landlock: Add ruleset and domain management

2020-10-29 Thread Mickaël Salaün


On 29/10/2020 02:05, Jann Horn wrote:
> On Tue, Oct 27, 2020 at 9:04 PM Mickaël Salaün  wrote:
>> A Landlock ruleset is mainly a red-black tree with Landlock rules as
>> nodes.  This enables quick update and lookup to match a requested access
>> e.g., to a file.  A ruleset is usable through a dedicated file
>> descriptor (cf. following commit implementing syscalls) which enables a
>> process to create and populate a ruleset with new rules.
>>
>> A domain is a ruleset tied to a set of processes.  This group of rules
>> defines the security policy enforced on these processes and their future
>> children.  A domain can transition to a new domain which is the
>> intersection of all its constraints and those of a ruleset provided by
>> the current process.  This modification only impact the current process.
>> This means that a process can only gain more constraints (i.e. lose
>> accesses) over time.
>>
>> Cc: James Morris 
>> Cc: Jann Horn 
>> Cc: Kees Cook 
>> Cc: Serge E. Hallyn 
>> Signed-off-by: Mickaël Salaün 
> 
> Reviewed-by: Jann Horn 

Thanks.

> 
> with some nits:
> 
> [...]
>> +static struct landlock_ruleset *create_ruleset(void)
>> +{
>> +   struct landlock_ruleset *new_ruleset;
>> +
>> +   new_ruleset = kzalloc(sizeof(*new_ruleset), GFP_KERNEL_ACCOUNT);
>> +   if (!new_ruleset)
>> +   return ERR_PTR(-ENOMEM);
>> +   refcount_set(&new_ruleset->usage, 1);
>> +   mutex_init(&new_ruleset->lock);
>> +   /*
>> +* root = RB_ROOT
> 
> This should probably be done explicitly, even though it's currently a
> no-op, in case the implementation of RB_ROOT changes in the future.

OK, I'll do it for RB_ROOT.

> 
>> +* hierarchy = NULL
>> +* nb_rules = 0
>> +* nb_layers = 0
>> +* fs_access_mask = 0
>> +*/
>> +   return new_ruleset;
>> +}
> [...]
>> +/**
>> + * landlock_insert_rule - Insert a rule in a ruleset
>> + *
>> + * @ruleset: The ruleset to be updated.
>> + * @rule: Read-only payload to be inserted (not own by this function).
> 
> s/own/owned/

OK

> 
>> + * @is_merge: If true, intersects access rights and updates the rule's layers
>> + *            (e.g. merge two rulesets), else do a union of access rights and
>> + *            keep the rule's layers (e.g. extend a ruleset)
>> + *
>> + * Assumptions:
>> + *
>> + * - An inserted rule cannot be removed.
>> + * - The underlying kernel object must be held by the caller.
>> + */
>> +int landlock_insert_rule(struct landlock_ruleset *const ruleset,
>> +   struct landlock_rule *const rule, const bool is_merge)
> [...]
>> +static int merge_ruleset(struct landlock_ruleset *const dst,
>> +   struct landlock_ruleset *const src)
>> +{
>> +   struct landlock_rule *walker_rule, *next_rule;
>> +   int err = 0;
>> +
>> +   might_sleep();
>> +   if (!src)
>> +   return 0;
>> +   /* Only merge into a domain. */
>> +   if (WARN_ON_ONCE(!dst || !dst->hierarchy))
>> +   return -EFAULT;
>> +
>> +   mutex_lock(&dst->lock);
>> +   mutex_lock_nested(&src->lock, 1);
> 
> Maybe add a comment like this above these two lines? "Ruleset locks
> are ordered by time of ruleset creation; dst is newer than src."

OK

> 
> Also, maybe s/1/SINGLE_DEPTH_NESTING/.

OK

> 
> 
>> +   /*
>> +* Makes a new layer, but only increments the number of layers after
>> +* the rules are inserted.
>> +*/
>> +   if (dst->nb_layers == sizeof(walker_rule->layers) * BITS_PER_BYTE) {
>> +   err = -E2BIG;
>> +   goto out_unlock;
>> +   }
>> +   dst->fs_access_mask |= src->fs_access_mask;
>> +
>> +   /* Merges the @src tree. */
>> +   rbtree_postorder_for_each_entry_safe(walker_rule, next_rule,
>> +   &src->root, node) {
>> +   err = landlock_insert_rule(dst, walker_rule, true);
>> +   if (err)
>> +   goto out_unlock;
>> +   }
>> +   dst->nb_layers++;
>> +
>> +out_unlock:
>> +   mutex_unlock(&src->lock);
>> +   mutex_unlock(&dst->lock);
>> +   return err;
>> +}
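
As a reading aid, here is a simplified user-space model of the merge
step, following the @is_merge description quoted above (intersect the
access rights of a matching rule and record the new layer). A fixed array
indexed by a small object ID replaces the red-black tree, locking is left
out, and every sketch_* name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NB_OBJECTS 4

struct sketch_rule {
	bool used;		/* A rule exists for this object. */
	uint64_t layers;	/* Bit N set: layer N has a rule here. */
	uint32_t access;
};

struct sketch_ruleset {
	uint32_t nb_layers;
	uint32_t fs_access_mask;
	struct sketch_rule rules[SKETCH_NB_OBJECTS];
};

static int sketch_merge(struct sketch_ruleset *dst,
		const struct sketch_ruleset *src)
{
	size_t i;

	assert(dst != NULL);
	if (!src)
		return 0;
	/* No free bit left to identify a new layer. */
	if (dst->nb_layers == sizeof(dst->rules[0].layers) * CHAR_BIT)
		return -E2BIG;
	dst->fs_access_mask |= src->fs_access_mask;

	for (i = 0; i < SKETCH_NB_OBJECTS; i++) {
		if (!src->rules[i].used)
			continue;
		if (dst->rules[i].used) {
			/* Matching rule: intersects access rights. */
			dst->rules[i].access &= src->rules[i].access;
		} else {
			dst->rules[i].used = true;
			dst->rules[i].access = src->rules[i].access;
		}
		dst->rules[i].layers |= 1ULL << dst->nb_layers;
	}
	/* Only counts the new layer once all rules are inserted. */
	dst->nb_layers++;
	return 0;
}
```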
> [...]
>> diff --git a/security/landlock/ruleset.h b/security/landlock/ruleset.h
> [...]
>> +struct landlock_rule {
>> +   /**
>> +* @node:

Re: [PATCH v22 01/12] landlock: Add object management

2020-10-29 Thread Mickaël Salaün


On 29/10/2020 02:05, Jann Horn wrote:
> On Tue, Oct 27, 2020 at 9:04 PM Mickaël Salaün  wrote:
>> A Landlock object enables to identify a kernel object (e.g. an inode).
>> A Landlock rule is a set of access rights allowed on an object.  Rules
>> are grouped in rulesets that may be tied to a set of processes (i.e.
>> subjects) to enforce a scoped access-control (i.e. a domain).
>>
>> Because Landlock's goal is to empower any process (especially
>> unprivileged ones) to sandbox themselves, we cannot rely on a
>> system-wide object identification such as file extended attributes.
>> Indeed, we need innocuous, composable and modular access-controls.
>>
>> The main challenge with these constraints is to identify kernel objects
>> while this identification is useful (i.e. when a security policy makes
>> use of this object).  But this identification data should be freed once
>> no policy is using it.  This ephemeral tagging should not and may not be
>> written in the filesystem.  We then need to manage the lifetime of a
>> rule according to the lifetime of its objects.  To avoid a global lock,
>> this implementation make use of RCU and counters to safely reference
>> objects.
>>
>> A following commit uses this generic object management for inodes.
>>
>> Cc: James Morris 
>> Cc: Jann Horn 
>> Cc: Kees Cook 
>> Cc: Serge E. Hallyn 
>> Signed-off-by: Mickaël Salaün 
> 
> Reviewed-by: Jann Horn 

Thanks for the review.

> 
> except for some minor nits:
> 
> [...]
>> diff --git a/security/landlock/object.c b/security/landlock/object.c
> [...]
>> +void landlock_put_object(struct landlock_object *const object)
>> +{
>> +   /*
>> +* The call to @object->underops->release(object) might sleep e.g.,
> 
> s/ e.g.,/, e.g./

I indeed prefer the comma preceding the "e.g.", but it seems that there
is a difference between UK English and US English:
https://english.stackexchange.com/questions/16172/should-i-always-use-a-comma-after-e-g-or-i-e
Looking at the kernel documentation makes it clear:
$ git grep -F 'e.g. ' | wc -l
1179
$ git grep -F 'e.g., ' | wc -l
160

I'll apply your fix in the whole patch series.

> 
>> +* because of iput().
>> +*/
>> +   might_sleep();
>> +   if (!object)
>> +   return;
> [...]
>> +}
>> diff --git a/security/landlock/object.h b/security/landlock/object.h
> [...]
>> +struct landlock_object {
>> +   /**
>> +* @usage: This counter is used to tie an object to the rules matching
>> +* it or to keep it alive while adding a new rule.  If this counter
>> +* reaches zero, this struct must not be modified, but this counter can
>> +* still be read from within an RCU read-side critical section.  When
>> +* adding a new rule to an object with a usage counter of zero, we must
>> +* wait until the pointer to this object is set to NULL (or recycled).
>> +*/
>> +   refcount_t usage;
>> +   /**
>> +* @lock: Guards against concurrent modifications.  This lock must be
> 
> s/must be/must be held/ ?

Right.

> 
>> +* from the time @usage drops to zero until any weak references from
>> +* @underobj to this object have been cleaned up.
>> +*
>> +* Lock ordering: inode->i_lock nests inside this.
>> +*/
>> +   spinlock_t lock;
> [...]
>> +};
>> +
>> +struct landlock_object *landlock_create_object(
>> +   const struct landlock_object_underops *const underops,
>> +   void *const underojb);
> 
> nit: "underobj"
> 

Good catch!
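
The @usage rule quoted above (an object whose counter reached zero must
not be revived) is commonly implemented with an increment-unless-zero
operation. The sketch below shows that pattern with C11 atomics in user
space; it is an illustration under that assumption, not the kernel's
refcount_t code, and the sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_object {
	atomic_uint usage;
};

/* Takes a reference unless the counter already dropped to zero. */
static bool sketch_get_object(struct sketch_object *object)
{
	unsigned int old;

	assert(object != NULL);
	old = atomic_load(&object->usage);
	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&object->usage, &old,
					old + 1))
			return true;
	}
	return false;
}

/* Drops a reference; returns true if it was the last one. */
static bool sketch_put_object(struct sketch_object *object)
{
	assert(object != NULL);
	return atomic_fetch_sub(&object->usage, 1) == 1;
}
```

A caller that gets false from sketch_get_object() has to wait for the
stale pointer to be cleared and then create a fresh object, which is the
situation the @usage comment describes.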


[PATCH v22 12/12] landlock: Add user and kernel documentation

2020-10-27 Thread Mickaël Salaün
From: Mickaël Salaün 

This documentation can be built with the Sphinx framework.

Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
Reviewed-by: Vincent Dagonneau 
---

Changes since v21:
* Move the user space documentation to userspace-api/landlock.rst and
  the kernel documentation to security/landlock.rst .
* Add license headers.
* Add last update dates.
* Update MAINTAINERS file.
* Add (back) links to git.kernel.org .
* Fix spelling.

Changes since v20:
* Update examples and documentation with the new syscalls.

Changes since v19:
* Update examples and documentation with the new syscalls.

Changes since v15:
* Add current limitations.

Changes since v14:
* Fix spelling (contributed by Randy Dunlap).
* Extend documentation about inheritance and explain layer levels.
* Remove the use of now-removed access rights.
* Use GitHub links.
* Improve kernel documentation.
* Add section for tests.
* Update example.

Changes since v13:
* Rewrote the documentation according to the major revamp.

Previous changes:
https://lore.kernel.org/lkml/20191104172146.30797-8-...@digikod.net/
---
 Documentation/security/index.rst |   1 +
 Documentation/security/landlock.rst  |  79 +++
 Documentation/userspace-api/index.rst|   1 +
 Documentation/userspace-api/landlock.rst | 259 +++
 MAINTAINERS  |   2 +
 5 files changed, 342 insertions(+)
 create mode 100644 Documentation/security/landlock.rst
 create mode 100644 Documentation/userspace-api/landlock.rst

diff --git a/Documentation/security/index.rst b/Documentation/security/index.rst
index 8129405eb2cc..16335de04e8c 100644
--- a/Documentation/security/index.rst
+++ b/Documentation/security/index.rst
@@ -16,3 +16,4 @@ Security Documentation
siphash
tpm/index
digsig
+   landlock
diff --git a/Documentation/security/landlock.rst b/Documentation/security/landlock.rst
new file mode 100644
index 000000000000..9b619eb4fe55
--- /dev/null
+++ b/Documentation/security/landlock.rst
@@ -0,0 +1,79 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. Copyright © 2017-2020 Mickaël Salaün 
+.. Copyright © 2019-2020 ANSSI
+
+==================================
+Landlock LSM: kernel documentation
+==================================
+
+:Author: Mickaël Salaün
+:Date: October 2020
+
+Landlock's goal is to create scoped access-control (i.e. sandboxing).  To
+harden a whole system, this feature should be available to any process,
+including unprivileged ones.  Because such a process may be compromised or
+backdoored (i.e. untrusted), Landlock's features must be safe to use from the
+kernel's and other processes' point of view.  Landlock's interface must therefore
+expose a minimal attack surface.
+
+Landlock is designed to be usable by unprivileged processes while following the
+system security policy enforced by other access control mechanisms (e.g. DAC,
+LSM).  Indeed, a Landlock rule shall not interfere with other access-controls
+enforced on the system, only add more restrictions.
+
+Any user can enforce Landlock rulesets on their processes.  They are merged and
+evaluated according to the inherited ones in a way that ensures that only more
+constraints can be added.
+
+User space documentation can be found here: :doc:`/userspace-api/landlock`.
+
+Guiding principles for safe access controls
+===========================================
+
+* A Landlock rule shall be focused on access control on kernel objects instead
+  of syscall filtering (i.e. syscall arguments), which is the purpose of
+  seccomp-bpf.
+* To avoid multiple kinds of side-channel attacks (e.g. leak of security
+  policies, CPU-based attacks), Landlock rules shall not be able to
+  programmatically communicate with user space.
+* Kernel access checks shall not slow down access requests from unsandboxed
+  processes.
+* Computation related to Landlock operations (e.g. enforcing a ruleset) shall
+  only impact the processes requesting them.
+
+Tests
+=====
+
+Userspace tests for backward compatibility, ptrace restrictions and filesystem
+support can be found here: `tools/testing/selftests/landlock/`_.
+
+Kernel structures
+=
+
+Object
+------
+
+.. kernel-doc:: security/landlock/object.h
+    :identifiers:
+
+Ruleset and domain
+------------------
+
+A domain is a read-only ruleset tied to a set of subjects (i.e. tasks'
+credentials).  Each time a ruleset is enforced on a task, the current domain is
+duplicated and the ruleset is imported as a new layer of rules in the new
+domain.  Indeed, once in a domain, each rule is tied to a layer level.  To
+grant access to an object, at least one rule of each layer must allow the
+requested action on the object.  A task can then only transit to a new domain
+which is the intersection of the constraints from the current domain and those
+of a ruleset provided by the task.
+
+The definition of a subject is implicit for a task sandboxing itself, which
+makes

[PATCH v22 01/12] landlock: Add object management

2020-10-27 Thread Mickaël Salaün
From: Mickaël Salaün 

A Landlock object enables to identify a kernel object (e.g. an inode).
A Landlock rule is a set of access rights allowed on an object.  Rules
are grouped in rulesets that may be tied to a set of processes (i.e.
subjects) to enforce a scoped access-control (i.e. a domain).

Because Landlock's goal is to empower any process (especially
unprivileged ones) to sandbox themselves, we cannot rely on a
system-wide object identification such as file extended attributes.
Indeed, we need innocuous, composable and modular access-controls.

The main challenge with these constraints is to identify kernel objects
while this identification is useful (i.e. when a security policy makes
use of this object).  But this identification data should be freed once
no policy is using it.  This ephemeral tagging should not and may not be
written in the filesystem.  We then need to manage the lifetime of a
rule according to the lifetime of its objects.  To avoid a global lock,
this implementation makes use of RCU and counters to safely reference
objects.

A following commit uses this generic object management for inodes.

Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v21:
* Update Kconfig help.
* Clean up comments.

Changes since v18:
* Account objects to kmemcg.

Changes since v14:
* Simplify the object, rule and ruleset management at the expense of a
  less aggressive memory freeing (contributed by Jann Horn, with
  additional modifications):
  - Remove object->list aggregating the rules tied to an object.
  - Remove landlock_get_object(), landlock_drop_object(),
{get,put}_object_cleaner() and landlock_rule_is_disabled().
  - Rewrite landlock_put_object() to use a more simple mechanism
(no tricky RCU).
  - Replace enum landlock_object_type and landlock_release_object() with
landlock_object_underops->release()
  - Adjust unions and Sparse annotations.
  Cf. https://lore.kernel.org/lkml/cag48ez21ben0wl1bbmtiiu8j9jp5iewthowz4turuj+ki0y...@mail.gmail.com/
* Merge struct landlock_rule into landlock_ruleset_elem to simplify the
  rule management.
* Constify variables.
* Improve kernel documentation.
* Cosmetic variable renames.
* Remove the "default" in the Kconfig (suggested by Jann Horn).
* Only use refcount_inc() through getter helpers.
* Update Kconfig description.

Changes since v13:
* New dedicated implementation, removing the need for eBPF.

Previous changes:
https://lore.kernel.org/lkml/20190721213116.23476-6-...@digikod.net/
---
 MAINTAINERS| 10 +
 security/Kconfig   |  1 +
 security/Makefile  |  2 +
 security/landlock/Kconfig  | 19 
 security/landlock/Makefile |  3 ++
 security/landlock/object.c | 66 +++
 security/landlock/object.h | 91 ++
 7 files changed, 192 insertions(+)
 create mode 100644 security/landlock/Kconfig
 create mode 100644 security/landlock/Makefile
 create mode 100644 security/landlock/object.c
 create mode 100644 security/landlock/object.h

diff --git a/MAINTAINERS b/MAINTAINERS
index e73636b75f29..06c77076214a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9846,6 +9846,16 @@ F:   net/core/sock_map.c
 F: net/ipv4/tcp_bpf.c
 F: net/ipv4/udp_bpf.c
 
+LANDLOCK SECURITY MODULE
+M:     Mickaël Salaün 
+L: linux-security-mod...@vger.kernel.org
+S: Supported
+W: https://landlock.io
+T: git https://github.com/landlock-lsm/linux.git
+F: security/landlock/
+K: landlock
+K: LANDLOCK
+
 LANTIQ / INTEL Ethernet drivers
 M: Hauke Mehrtens 
 L: net...@vger.kernel.org
diff --git a/security/Kconfig b/security/Kconfig
index 7561f6f99f1d..15a4342b5d01 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -238,6 +238,7 @@ source "security/loadpin/Kconfig"
 source "security/yama/Kconfig"
 source "security/safesetid/Kconfig"
 source "security/lockdown/Kconfig"
+source "security/landlock/Kconfig"
 
 source "security/integrity/Kconfig"
 
diff --git a/security/Makefile b/security/Makefile
index 3baf435de541..c688f4907a1b 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -13,6 +13,7 @@ subdir-$(CONFIG_SECURITY_LOADPIN) += loadpin
 subdir-$(CONFIG_SECURITY_SAFESETID)+= safesetid
 subdir-$(CONFIG_SECURITY_LOCKDOWN_LSM) += lockdown
 subdir-$(CONFIG_BPF_LSM)   += bpf
+subdir-$(CONFIG_SECURITY_LANDLOCK) += landlock
 
 # always enable default capabilities
 obj-y  += commoncap.o
@@ -32,6 +33,7 @@ obj-$(CONFIG_SECURITY_SAFESETID)   += safesetid/
 obj-$(CONFIG_SECURITY_LOCKDOWN_LSM)+= lockdown/
 obj-$(CONFIG_CGROUPS)  += device_cgroup.o
 obj-$(CONFIG_BPF_LSM)  += bpf/
+obj-$(CONFIG_SECURITY_LANDLOCK)+= landlock/
 
 # Object integrity file lists
 subdir-$(CONFIG_INTEGRITY) += inte

[PATCH v22 09/12] arch: Wire up Landlock syscalls

2020-10-27 Thread Mickaël Salaün
From: Mickaël Salaün 

Wire up the following system calls for all architectures:
* landlock_create_ruleset(2)
* landlock_add_rule(2)
* landlock_enforce_ruleset_current(2)

Cc: Arnd Bergmann 
Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v21:
* Rebase and leave space for watch_mount(2) from -next.

Changes since v20:
* Remove landlock_get_features(2).
* Decrease syscall numbers to stick to process_madvise(2) in -next.
* Rename landlock_enforce_ruleset(2) to
  landlock_enforce_ruleset_current(2).

Changes since v19:
* Increase syscall numbers by 4 to leave space for new ones (in
  linux-next): watch_mount(2), watch_sb(2), fsinfo(2) and
  process_madvise(2) (requested by Arnd Bergmann).
* Replace the previous multiplexor landlock(2) with 4 syscalls:
  landlock_get_features(2), landlock_create_ruleset(2),
  landlock_add_rule(2) and landlock_enforce_ruleset(2).

Changes since v18:
* Increase the syscall number because of the new faccessat2(2).

Changes since v14:
* Add all architectures.

Changes since v13:
* New implementation.
---
 arch/alpha/kernel/syscalls/syscall.tbl  | 3 +++
 arch/arm/tools/syscall.tbl  | 3 +++
 arch/arm64/include/asm/unistd.h | 2 +-
 arch/arm64/include/asm/unistd32.h   | 6 ++
 arch/ia64/kernel/syscalls/syscall.tbl   | 3 +++
 arch/m68k/kernel/syscalls/syscall.tbl   | 3 +++
 arch/microblaze/kernel/syscalls/syscall.tbl | 3 +++
 arch/mips/kernel/syscalls/syscall_n32.tbl   | 3 +++
 arch/mips/kernel/syscalls/syscall_n64.tbl   | 3 +++
 arch/mips/kernel/syscalls/syscall_o32.tbl   | 3 +++
 arch/parisc/kernel/syscalls/syscall.tbl | 3 +++
 arch/powerpc/kernel/syscalls/syscall.tbl| 3 +++
 arch/s390/kernel/syscalls/syscall.tbl   | 3 +++
 arch/sh/kernel/syscalls/syscall.tbl | 3 +++
 arch/sparc/kernel/syscalls/syscall.tbl  | 3 +++
 arch/x86/entry/syscalls/syscall_32.tbl  | 3 +++
 arch/x86/entry/syscalls/syscall_64.tbl  | 3 +++
 arch/xtensa/kernel/syscalls/syscall.tbl | 3 +++
 include/uapi/asm-generic/unistd.h   | 8 +++-
 19 files changed, 62 insertions(+), 2 deletions(-)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index ee7b01bb7346..7ef9966fc654 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -480,3 +480,6 @@
 548 common  pidfd_getfd                         sys_pidfd_getfd
 549 common  faccessat2                          sys_faccessat2
 550 common  process_madvise                     sys_process_madvise
+552 common  landlock_create_ruleset             sys_landlock_create_ruleset
+553 common  landlock_add_rule                   sys_landlock_add_rule
+554 common  landlock_enforce_ruleset_current    sys_landlock_enforce_ruleset_current
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index d056a548358e..5bde774cef96 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -454,3 +454,6 @@
 438 common  pidfd_getfd                         sys_pidfd_getfd
 439 common  faccessat2                          sys_faccessat2
 440 common  process_madvise                     sys_process_madvise
+442 common  landlock_create_ruleset             sys_landlock_create_ruleset
+443 common  landlock_add_rule                   sys_landlock_add_rule
+444 common  landlock_enforce_ruleset_current    sys_landlock_enforce_ruleset_current
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index b3b2019f8d16..64ebdc1ec581 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -38,7 +38,7 @@
 #define __ARM_NR_compat_set_tls    (__ARM_NR_COMPAT_BASE + 5)
 #define __ARM_NR_COMPAT_END        (__ARM_NR_COMPAT_BASE + 0x800)
 
-#define __NR_compat_syscalls   441
+#define __NR_compat_syscalls   445
 #endif
 
 #define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index 107f08e03b9f..253521adb064 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -889,6 +889,12 @@ __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
 __SYSCALL(__NR_faccessat2, sys_faccessat2)
 #define __NR_process_madvise 440
 __SYSCALL(__NR_process_madvise, sys_process_madvise)
+#define __NR_landlock_create_ruleset 442
+__SYSCALL(__NR_landlock_create_ruleset, sys_landlock_create_ruleset)
+#define __NR_landlock_add_rule 443
+__SYSCALL(__NR_landlock_add_rule, sys_landlock_add_rule)
+#define __NR_landlock_enforce_ruleset_current 444
+__SYSCALL(__NR_landlock_enforce_ruleset_current, sys_landlock_enforce_ruleset_current)
 
 /*
  * Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index b96ed8b8a508

[PATCH v22 05/12] LSM: Infrastructure management of the superblock

2020-10-27 Thread Mickaël Salaün
From: Casey Schaufler 

Move management of the superblock->sb_security blob out of the
individual security modules and into the security infrastructure.
Instead of allocating the blobs from within the modules, the modules
tell the infrastructure how much space is required, and the space is
allocated there.
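
The bookkeeping behind "the modules tell the infrastructure how much
space is required" can be sketched in user space as follows. The helper
is modelled on how lsm_set_blob_size() is understood to work (a requested
size is turned into an offset while the total grows); the sketch_* names
and the reduced two-field struct are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-in for struct lsm_blob_sizes. */
struct sketch_blob_sizes {
	int lbs_inode;
	int lbs_superblock;
};

/*
 * Converts a module's requested size (*need) into the offset of its
 * data inside the shared blob, and grows the shared total (*lbs).
 */
static void sketch_set_blob_size(int *need, int *lbs)
{
	int offset;

	assert(need != NULL && lbs != NULL);
	if (*need > 0) {
		offset = *lbs;
		*lbs += *need;
		*need = offset;
	}
}
```

After every module has been processed, the total gives the size to
allocate once per superblock, and each module keeps only the offset at
which its own data starts.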

Cc: Kees Cook 
Cc: John Johansen 
Signed-off-by: Casey Schaufler 
Signed-off-by: Mickaël Salaün 
Reviewed-by: Stephen Smalley 
---

Changes since v20:
* Remove all Reviewed-by except Stephen Smalley:
  
https://lore.kernel.org/lkml/CAEjxPJ7ARJO57MBW66=xsBzMMRb=9ulgqock5eskhcaivmx...@mail.gmail.com/
* Cosmetic fix in the commit message.

Changes since v17:
* Rebase the original LSM stacking patch from v5.3 to v5.7: I fixed some
  diff conflicts caused by code moves and function renames in
  selinux/include/objsec.h and selinux/hooks.c .  I checked that it
  builds but I didn't test the changes for SELinux nor SMACK.
  https://lore.kernel.org/r/20190829232935.7099-2-ca...@schaufler-ca.com
---
 include/linux/lsm_hooks.h |  1 +
 security/security.c   | 46 
 security/selinux/hooks.c  | 58 ---
 security/selinux/include/objsec.h |  6 
 security/selinux/ss/services.c|  3 +-
 security/smack/smack.h|  6 
 security/smack/smack_lsm.c| 35 +--
 7 files changed, 85 insertions(+), 70 deletions(-)

diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index c503f7ab8afb..ff0f03a45c56 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1563,6 +1563,7 @@ struct lsm_blob_sizes {
int lbs_cred;
int lbs_file;
int lbs_inode;
+   int lbs_superblock;
int lbs_ipc;
int lbs_msg_msg;
int lbs_task;
diff --git a/security/security.c b/security/security.c
index a28045dc9e7f..4ffd6c3af9d7 100644
--- a/security/security.c
+++ b/security/security.c
@@ -202,6 +202,7 @@ static void __init lsm_set_blob_sizes(struct lsm_blob_sizes *needed)
    lsm_set_blob_size(&needed->lbs_inode, &blob_sizes.lbs_inode);
    lsm_set_blob_size(&needed->lbs_ipc, &blob_sizes.lbs_ipc);
    lsm_set_blob_size(&needed->lbs_msg_msg, &blob_sizes.lbs_msg_msg);
+   lsm_set_blob_size(&needed->lbs_superblock, &blob_sizes.lbs_superblock);
    lsm_set_blob_size(&needed->lbs_task, &blob_sizes.lbs_task);
 }
 
@@ -332,12 +333,13 @@ static void __init ordered_lsm_init(void)
for (lsm = ordered_lsms; *lsm; lsm++)
prepare_lsm(*lsm);
 
-   init_debug("cred blob size = %d\n", blob_sizes.lbs_cred);
-   init_debug("file blob size = %d\n", blob_sizes.lbs_file);
-   init_debug("inode blob size= %d\n", blob_sizes.lbs_inode);
-   init_debug("ipc blob size  = %d\n", blob_sizes.lbs_ipc);
-   init_debug("msg_msg blob size  = %d\n", blob_sizes.lbs_msg_msg);
-   init_debug("task blob size = %d\n", blob_sizes.lbs_task);
+   init_debug("cred blob size   = %d\n", blob_sizes.lbs_cred);
+   init_debug("file blob size   = %d\n", blob_sizes.lbs_file);
+   init_debug("inode blob size  = %d\n", blob_sizes.lbs_inode);
+   init_debug("ipc blob size= %d\n", blob_sizes.lbs_ipc);
+   init_debug("msg_msg blob size= %d\n", blob_sizes.lbs_msg_msg);
+   init_debug("superblock blob size = %d\n", blob_sizes.lbs_superblock);
+   init_debug("task blob size   = %d\n", blob_sizes.lbs_task);
 
/*
 * Create any kmem_caches needed for blobs
@@ -669,6 +671,27 @@ static void __init lsm_early_task(struct task_struct *task)
panic("%s: Early task alloc failed.\n", __func__);
 }
 
+/**
+ * lsm_superblock_alloc - allocate a composite superblock blob
+ * @sb: the superblock that needs a blob
+ *
+ * Allocate the superblock blob for all the modules
+ *
+ * Returns 0, or -ENOMEM if memory can't be allocated.
+ */
+static int lsm_superblock_alloc(struct super_block *sb)
+{
+   if (blob_sizes.lbs_superblock == 0) {
+   sb->s_security = NULL;
+   return 0;
+   }
+
+   sb->s_security = kzalloc(blob_sizes.lbs_superblock, GFP_KERNEL);
+   if (sb->s_security == NULL)
+   return -ENOMEM;
+   return 0;
+}
+
 /*
  * The default value of the LSM hook is defined in linux/lsm_hook_defs.h and
  * can be accessed with:
@@ -866,12 +889,21 @@ int security_fs_context_parse_param(struct fs_context *fc, struct fs_parameter *
 
 int security_sb_alloc(struct super_block *sb)
 {
-   return call_int_hook(sb_alloc_security, 0, sb);
+   int rc = lsm_superblock_alloc(sb);
+
+   if (unlikely(rc))
+   return rc;
+   rc = call_int_hook(sb_alloc_security, 0, sb);
+   if (unlikely(rc))
+   security_sb_free(sb);
+   return rc;
 }
 
 

[PATCH v22 11/12] samples/landlock: Add a sandbox manager example

2020-10-27 Thread Mickaël Salaün
From: Mickaël Salaün 

Add a basic sandbox tool to launch a command which can only access a
whitelist of file hierarchies in a read-only or read-write way.

Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v21:
* Remove LANDLOCK_ACCESS_FS_CHROOT.
* Clean up help.

Changes since v20:
* Update with new syscalls and type names.
* Update errno check for EOPNOTSUPP.
* Use the full syscall interfaces: explicitly set the "flags" field to
  zero.

Changes since v19:
* Update with the new Landlock syscalls.
* Comply with commit 5f2fb52fac15 ("kbuild: rename hostprogs-y/always to
  hostprogs/always-y").

Changes since v16:
* Switch syscall attribute pointer and size arguments.

Changes since v15:
* Update access right names.
* Properly assign access right to files according to the new related
  syscall restriction.
* Replace "select" with "depends on" HEADERS_INSTALL (suggested by Randy
  Dunlap).

Changes since v14:
* Fix Kconfig dependency.
* Remove access rights that may be required for FD-only requests:
  mmap, truncate, getattr, lock, chmod, chown, chgrp, ioctl.
* Fix useless hardcoded syscall number.
* Use execvpe().
* Follow symlinks.
* Extend help with common file paths.
* Constify variables.
* Clean up comments.
* Improve error message.

Changes since v11:
* Add back the filesystem sandbox manager and update it to work with the
  new Landlock syscall.

Previous changes:
https://lore.kernel.org/lkml/20190721213116.23476-9-...@digikod.net/
---
 samples/Kconfig  |   7 ++
 samples/Makefile |   1 +
 samples/landlock/.gitignore  |   1 +
 samples/landlock/Makefile|  15 +++
 samples/landlock/sandboxer.c | 219 +++
 5 files changed, 243 insertions(+)
 create mode 100644 samples/landlock/.gitignore
 create mode 100644 samples/landlock/Makefile
 create mode 100644 samples/landlock/sandboxer.c

diff --git a/samples/Kconfig b/samples/Kconfig
index 0ed6e4d71d87..e6129496ced5 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -124,6 +124,13 @@ config SAMPLE_HIDRAW
bool "hidraw sample"
depends on CC_CAN_LINK && HEADERS_INSTALL
 
+config SAMPLE_LANDLOCK
+   bool "Build Landlock sample code"
+   depends on HEADERS_INSTALL
+   help
+ Build a simple Landlock sandbox manager able to launch a process
+ restricted by a user-defined filesystem access control.
+
 config SAMPLE_PIDFD
bool "pidfd sample"
depends on CC_CAN_LINK && HEADERS_INSTALL
diff --git a/samples/Makefile b/samples/Makefile
index c3392a595e4b..087e0988ccc5 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -11,6 +11,7 @@ obj-$(CONFIG_SAMPLE_KDB)  += kdb/
 obj-$(CONFIG_SAMPLE_KFIFO) += kfifo/
 obj-$(CONFIG_SAMPLE_KOBJECT)   += kobject/
 obj-$(CONFIG_SAMPLE_KPROBES)   += kprobes/
+subdir-$(CONFIG_SAMPLE_LANDLOCK)   += landlock
 obj-$(CONFIG_SAMPLE_LIVEPATCH) += livepatch/
 subdir-$(CONFIG_SAMPLE_PIDFD)  += pidfd
 obj-$(CONFIG_SAMPLE_QMI_CLIENT)+= qmi/
diff --git a/samples/landlock/.gitignore b/samples/landlock/.gitignore
new file mode 100644
index ..f43668b2d318
--- /dev/null
+++ b/samples/landlock/.gitignore
@@ -0,0 +1 @@
+/sandboxer
diff --git a/samples/landlock/Makefile b/samples/landlock/Makefile
new file mode 100644
index ..21eda5774948
--- /dev/null
+++ b/samples/landlock/Makefile
@@ -0,0 +1,15 @@
+# SPDX-License-Identifier: BSD-3-Clause
+
+hostprogs := sandboxer
+
+always-y := $(hostprogs)
+
+KBUILD_HOSTCFLAGS += -I$(objtree)/usr/include
+
+.PHONY: all clean
+
+all:
+   $(MAKE) -C ../.. samples/landlock/
+
+clean:
+   $(MAKE) -C ../.. M=samples/landlock/ clean
diff --git a/samples/landlock/sandboxer.c b/samples/landlock/sandboxer.c
new file mode 100644
index ..ee5ec1203cb7
--- /dev/null
+++ b/samples/landlock/sandboxer.c
@@ -0,0 +1,219 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/*
+ * Simple Landlock sandbox manager able to launch a process restricted by a
+ * user-defined filesystem access control.
+ *
+ * Copyright © 2017-2020 Mickaël Salaün 
+ * Copyright © 2020 ANSSI
+ */
+
+#define _GNU_SOURCE
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#ifndef landlock_create_ruleset
+static inline int landlock_create_ruleset(
+   const struct landlock_ruleset_attr *const attr,
+   const size_t size, const __u32 flags)
+{
+   errno = 0;
+   return syscall(__NR_landlock_create_ruleset, attr, size, flags);
+}
+#endif
+
+#ifndef landlock_add_rule
+static inline int landlock_add_rule(const int ruleset_fd,
+   const enum landlock_rule_type rule_type,
+   const void *const rule_attr, const __u32 flags)
+{
+

[PATCH v22 03/12] landlock: Set up the security framework and manage credentials

2020-10-27 Thread Mickaël Salaün
From: Mickaël Salaün 

A process's credentials point to a Landlock domain, which is implemented
underneath as a ruleset.  In the following commits, this domain is
used to check and enforce the ptrace and filesystem security policies.
A domain is inherited from a parent to its child the same way a thread
inherits a seccomp policy.
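
The inheritance described above can be pictured with a minimal userspace model. This is a sketch, not the kernel code: struct model_domain, struct model_cred and all helper names are hypothetical. The child's credentials share the parent's domain and take a reference on it, so the domain lives until the last credential pointing to it is released.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a Landlock domain: a ruleset with a usage count. */
struct model_domain {
	unsigned int usage;
};

/* Hypothetical stand-in for the per-credential security blob. */
struct model_cred {
	struct model_domain *domain;
};

/* Creates a domain owned by one credential, or returns NULL on failure. */
static struct model_domain *model_domain_new(void)
{
	struct model_domain *dom = calloc(1, sizeof(*dom));

	if (dom)
		dom->usage = 1;
	return dom;
}

/* Drops one reference and frees the domain when nobody uses it anymore. */
static void model_domain_put(struct model_domain *dom)
{
	if (dom && --dom->usage == 0)
		free(dom);
}

/* New credentials share the old domain and take a reference on it. */
static void model_cred_inherit(struct model_cred *new_cred,
			       const struct model_cred *old_cred)
{
	new_cred->domain = old_cred->domain;
	if (new_cred->domain)
		new_cred->domain->usage++;
}

/* Releases the credential's reference to its domain. */
static void model_cred_release(struct model_cred *cred)
{
	model_domain_put(cred->domain);
	cred->domain = NULL;
}
```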

Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v21:
* Fix copyright dates.

Changes since v17:
* Constify returned domain pointers from landlock_get_current_domain()
  and landlock_get_task_domain() helpers.

Changes since v15:
* Optimize landlocked() for current thread.
* Display the greeting message when everything is initialized.

Changes since v14:
* Uses pr_fmt from common.h .
* Constify variables.
* Remove useless NULL initialization.

Changes since v13:
* Totally get rid of the seccomp dependency.
* Only keep credential management and LSM setup.

Previous changes:
https://lore.kernel.org/lkml/20191104172146.30797-4-...@digikod.net/
---
 security/Kconfig   | 10 +++
 security/landlock/Makefile |  3 +-
 security/landlock/common.h | 20 +
 security/landlock/cred.c   | 46 ++
 security/landlock/cred.h   | 58 ++
 security/landlock/setup.c  | 31 
 security/landlock/setup.h  | 16 +++
 7 files changed, 178 insertions(+), 6 deletions(-)
 create mode 100644 security/landlock/common.h
 create mode 100644 security/landlock/cred.c
 create mode 100644 security/landlock/cred.h
 create mode 100644 security/landlock/setup.c
 create mode 100644 security/landlock/setup.h

diff --git a/security/Kconfig b/security/Kconfig
index 15a4342b5d01..0ced7fd33e4d 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -278,11 +278,11 @@ endchoice
 
 config LSM
string "Ordered list of enabled LSMs"
-   default "lockdown,yama,loadpin,safesetid,integrity,smack,selinux,tomoyo,apparmor,bpf" if DEFAULT_SECURITY_SMACK
-   default "lockdown,yama,loadpin,safesetid,integrity,apparmor,selinux,smack,tomoyo,bpf" if DEFAULT_SECURITY_APPARMOR
-   default "lockdown,yama,loadpin,safesetid,integrity,tomoyo,bpf" if DEFAULT_SECURITY_TOMOYO
-   default "lockdown,yama,loadpin,safesetid,integrity,bpf" if DEFAULT_SECURITY_DAC
-   default "lockdown,yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor,bpf"
+   default "landlock,lockdown,yama,loadpin,safesetid,integrity,smack,selinux,tomoyo,apparmor,bpf" if DEFAULT_SECURITY_SMACK
+   default "landlock,lockdown,yama,loadpin,safesetid,integrity,apparmor,selinux,smack,tomoyo,bpf" if DEFAULT_SECURITY_APPARMOR
+   default "landlock,lockdown,yama,loadpin,safesetid,integrity,tomoyo,bpf" if DEFAULT_SECURITY_TOMOYO
+   default "landlock,lockdown,yama,loadpin,safesetid,integrity,bpf" if DEFAULT_SECURITY_DAC
+   default "landlock,lockdown,yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor,bpf"
help
  A comma-separated list of LSMs, in initialization order.
  Any LSMs left off this list will be ignored. This can be
diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index d846eba445bb..041ea242e627 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -1,3 +1,4 @@
 obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
 
-landlock-y := object.o ruleset.o
+landlock-y := setup.o object.o ruleset.o \
+   cred.o
diff --git a/security/landlock/common.h b/security/landlock/common.h
new file mode 100644
index ..5dc0fe15707d
--- /dev/null
+++ b/security/landlock/common.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Landlock LSM - Common constants and helpers
+ *
+ * Copyright © 2016-2020 Mickaël Salaün 
+ * Copyright © 2018-2020 ANSSI
+ */
+
+#ifndef _SECURITY_LANDLOCK_COMMON_H
+#define _SECURITY_LANDLOCK_COMMON_H
+
+#define LANDLOCK_NAME "landlock"
+
+#ifdef pr_fmt
+#undef pr_fmt
+#endif
+
+#define pr_fmt(fmt) LANDLOCK_NAME ": " fmt
+
+#endif /* _SECURITY_LANDLOCK_COMMON_H */
diff --git a/security/landlock/cred.c b/security/landlock/cred.c
new file mode 100644
index ..7074149d2517
--- /dev/null
+++ b/security/landlock/cred.c
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - Credential hooks
+ *
+ * Copyright © 2017-2020 Mickaël Salaün 
+ * Copyright © 2018-2020 ANSSI
+ */
+
+#include 
+#include 
+
+#include "common.h"
+#include "cred.h"
+#include "ruleset.h"
+#include "setup.h"
+
+static int hook_cred_prepare(struct cred *const new,
+   const struct cred *const old, const gfp_t gfp)
+{
+   const struct landlock_cred_security *cred_old = landlock_cred(old);
+   struct landlock_cred_security *cred_new

[PATCH v22 10/12] selftests/landlock: Add user space tests

2020-10-27 Thread Mickaël Salaün
From: Mickaël Salaün 

Test all Landlock system calls, ptrace hook semantics and filesystem
access-control.

Test coverage for security/landlock/ is 95.2% of lines.  The code not
covered only deals with internal kernel errors (e.g. memory allocation)
and race conditions.

Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Cc: Shuah Khan 
Signed-off-by: Mickaël Salaün 
Reviewed-by: Vincent Dagonneau 
---

Changes since v21:
* Remove layout1.chroot test and update layout1.unhandled_access to not
  rely on LANDLOCK_ACCESS_FS_CHROOT.
* Clean up comments.

Changes since v20:
* Update with new syscalls and type names.
* Use the full syscall interfaces: explicitly set the "flags" field to
  zero.
* Update the empty_path_beneath_attr test to check for EFAULT.
* Update and merge tests for the simplified copy_min_struct_from_user().
* Clean up makefile.
* Rename some types and variables in a more consistent way.

Changes since v19:
* Update with the new Landlock syscalls.
* Fix device creation.
* Check the new landlock_attr_features members: last_rule_type and
  last_target_type .
* Constify variables.

Changes since v18:
* Replace ruleset_rw.inval with layout1.inval to avoid a nonexistent test
  layout.
* Use the new FIXTURE_VARIANT for ptrace_test: makes the tests more
  readable and usable.
* Add ARRAY_SIZE() macro to please checkpatch.

Changes since v17:
* Add new test for mknod with a zero mode.
* Use memset(3) to initialize attr_features in base_test.

Changes since v16:
* Add new unpriv_enforce_without_no_new_privs test: check that ruleset
  enforcing is forbidden without no_new_privs and CAP_SYS_ADMIN.
* Drop capabilities when useful.
* Check the new size_attr_features field from struct
  landlock_attr_features.
* Update the empty_or_same_ruleset test to check complementary empty
  ruleset.
* Update base_test according to the new attribute structures and fix the
  inconsistent_attr test accordingly.
* Switch syscall attribute pointer and size arguments.
* Rename test files with a "_test" suffix.

Changes since v14:
* Add new tests:
  - superset: check new layer bitmask.
  - max_layers: check maximum number of layers.
  - release_inodes: check that umount works well.
  - empty_or_same_ruleset.
  - inconsistent_attr: checks copy_to_user limits.
  - in ruleset_rw.inval to check ruleset FD.
  - proc_unlinked_file: check file access through /proc/self/fd .
  - file_access_rights: check that a file can only get consistent access
rights.
  - unpriv: check that NO_NEW_PRIVS or CAP_SYS_ADMIN is required.
  - check pipe access through /proc/self/fd .
  - check move_mount(2).
  - check ruleset file descriptor properties.
  - proc_nsfs: extend to check that internal filesystems (e.g. nsfs) are
allowed.
* Double-check read and write effective actions.
* Fix potential desynchronization between the kernel sources and
  installed headers by overriding the build step in the Makefile.  This
  also makes it possible to build with Clang.
* Add two files in the test directories (for link test and rename test).
* Remove test for ruleset's show_fdinfo().
* Replace EBADR with EBADFD.
* Update tests accordingly to the changes of rename and link rights.
* Fix (now) illegal access rights tied to files.
* Update rename and link tests.
* Remove superfluous '\n' in TH_LOG() calls.
* Make assert calls consistent and readable.
* Fix the execute test.
* Make tests future-proof.
* Cosmetic fixes.

Changes since v14:
* Add new tests:
  - Compatibility: empty_attr_{ruleset,path_beneath,enforce} to check
minimal attr size.
  - Access types: link_to, rename_from, rename_to, rmdir, unlink,
make_char, make_block, make_reg, make_sock, make_fifo, make_sym,
make_dir, chroot, execute.
  - Test privilege escalation prevention by enforcing a nested rule, on
a parent directory, with less restrictions than one on a child
directory.
  - Test for empty and more than 32-bits allowed_access
* Merge the two test mount hierarchies.
* Complete relative path tests by combining chdir and chroot.
* Adjust tests:
  - Remove the layout1/extend_ruleset_with_denied_path test.
  - Extend layout1/whitelist test with checks on file.
  - Add and use create_dir_and_file().
* Only use read/write checks but not stat(2) for tests.
* Rename test.h to common.h and improve it.
* Rename path name to make them more consistent, easy to understand and
  make them in a common directory.
* Make create_ruleset() more generic.
* Constify variables.
* Re-add static global variables.
* Remove useless openat(2).
* Fix and complete kernel config.
* Set umask and clean up file modes.
* Clean up open flags.
* Improve Makefile.
* Fix spelling.
* Improve comments and error messages.

Changes since v13:
* Add back the filesystem tests (from v10) and extend them.
* Add tests for the new syscall.

Previous changes:
https://lore.kernel.org/lkml/20191104172146.30797-7-...@digikod.net/
---
 tools/testing/selftests/Makefile  |1 +
 too

[PATCH v22 07/12] landlock: Support filesystem access-control

2020-10-27 Thread Mickaël Salaün
From: Mickaël Salaün 

Thanks to the Landlock objects and ruleset, it is possible to identify
inodes according to a process's domain.  To enable an unprivileged
process to express a file hierarchy, it first needs to open a directory
(or a file) and pass this file descriptor to the kernel through
landlock_add_rule(2).  When checking if a file access request is
allowed, we walk from the requested dentry to the real root, following
the different mount layers.  The access rights of each "tagged" inode are
collected according to their rule layer level, and ANDed to compute the
access granted to the requested file hierarchy.  This makes it possible to
identify a lot of files without tagging every inode or modifying the
filesystem, while still following the view and understanding the user
has of the filesystem.
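
The walk described above can be sketched as a simplified userspace model. Everything here is an assumption made for illustration (struct model_node, the ACCESS_* bits, MAX_LAYERS and access_allowed() are hypothetical names, and mount points are not modeled): starting from the requested node, each ancestor is visited up to the root, and a request is granted only if every layer has at least one rule on the path that allows it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LAYERS 4

#define ACCESS_READ  (1u << 0)
#define ACCESS_WRITE (1u << 1)

/* Hypothetical model of a file hierarchy node with its parent link. */
struct model_node {
	const struct model_node *parent;  /* NULL at the real root */
	uint32_t tagged_layers;           /* bit i set: layer i has a rule here */
	uint32_t allowed[MAX_LAYERS];     /* rights granted by layer i's rule */
};

/*
 * Walks from the target up to the root.  A layer grants the request when one
 * of its rules on the path allows all requested rights; the request is
 * allowed only if all layers grant it (the AND across layers).
 * num_layers must not exceed MAX_LAYERS; zero layers means "not sandboxed".
 */
static bool access_allowed(const struct model_node *target, uint32_t request,
			   unsigned int num_layers)
{
	const uint32_t all_layers = (1u << num_layers) - 1;
	uint32_t granted_layers = 0;

	for (const struct model_node *n = target; n; n = n->parent) {
		for (unsigned int i = 0; i < num_layers; i++) {
			if ((n->tagged_layers & (1u << i)) &&
			    (n->allowed[i] & request) == request)
				granted_layers |= 1u << i;
		}
		if (granted_layers == all_layers)
			return true;
	}
	return granted_layers == all_layers;
}
```

Only the nodes named in rules carry a tag in this model; a file deep in the hierarchy inherits the decision from its tagged ancestors, which is why no per-file marking is needed.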

Add a new ARCH_EPHEMERAL_INODES for UML because it currently does not
keep the same struct inodes for the same inodes while these inodes are
in use.

This commit adds a minimal set of supported filesystem access controls,
which cannot restrict all file-related actions.  This is the
result of multiple discussions to minimize the code of Landlock to ease
review.  Thanks to the Landlock design, extending this access-control
without breaking user space will not be a problem.  Moreover, seccomp
filters can be used to restrict the use of syscall families which may
not be currently handled by Landlock.

Cc: Al Viro 
Cc: Anton Ivanov 
Cc: James Morris 
Cc: Jann Horn 
Cc: Jeff Dike 
Cc: Kees Cook 
Cc: Richard Weinberger 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v21:
* Rename ARCH_EPHEMERAL_STATES to ARCH_EPHEMERAL_INODES (suggested by
  James Morris).
* Remove the LANDLOCK_ACCESS_FS_CHROOT right because chroot(2) (which
  requires CAP_SYS_CHROOT) doesn't make it possible to bypass Landlock
  (as the tests demonstrate), and because it is often used by sandboxes,
  it would be counterproductive to forbid it.  This also reduces the
  code size.
* Clean up documentation.

Changes since v19:
* Fix spelling (spotted by Randy Dunlap).

Changes since v18:
* Remove useless include.
* Fix spelling.

Changes since v17:
* Replace landlock_release_inodes() with security_sb_delete() (requested
  by James Morris).
* Replace struct super_block->s_landlock_inode_refs with the LSM
  infrastructure management of the superblock (requested by James
  Morris).
* Fix mknod restriction with a zero mode (spotted by Vincent Dagonneau).
* Minimize executed code in path_mknod and file_open hooks when the
  current task is not sandboxed.
* Remove useless checks on the file pointer and inode in
  hook_file_open() .
* Constify domain pointers.
* Rename inode_landlock() to landlock_inode().
* Import include/uapi/linux/landlock.h and _LANDLOCK_ACCESS_FS_* from
  the ruleset and domain management patch.
* Explain the rationale of this minimal set of access-control.
  https://lore.kernel.org/lkml/f646e1c7-33cf-333f-070c-0a40ad046...@digikod.net/

Changes since v16:
* Add ARCH_EPHEMERAL_STATES and enable it for UML.

Changes since v15:
* Replace layer_levels and layer_depth with a bitfield of layers: this
  makes it possible to properly manage supersets and subsets of access
  rights, whatever their order in the stack of layers.
  Cf. 
https://lore.kernel.org/lkml/e07fe473-1801-01cc-12ae-b3167f952...@digikod.net/
* Allow to open pipes and similar special files through /proc/self/fd/.
* Properly handle internal filesystems such as nsfs: always allow this
  kind of root because a disconnected path cannot be evaluated.
* Remove the LANDLOCK_ACCESS_FS_LINK_TO and
  LANDLOCK_ACCESS_FS_RENAME_{TO,FROM}, but use the
  LANDLOCK_ACCESS_FS_REMOVE_{FILE,DIR} and LANDLOCK_ACCESS_FS_MAKE_*
  instead.  Indeed, it is not possible for now (and not really useful)
  to express the semantic of a source and a destination.
* Check access rights to remove a directory or a file with rename(2).
* Forbid reparenting when linking or renaming.  This is needed to easily
  protect against possible privilege escalation by changing the place of
  a file or directory in relation to an enforced access policy (from the
  set of layers).  This will be relaxed in the future.
* Update hooks to take into account replacement of the object's self and
  beneath access bitfields with one.  Simplify the code.
* Check file related access rights.
* Check d_is_negative() instead of !d_backing_inode() in
  check_access_path_continue(), and continue the path walk while there
  is no mapped inode e.g., with rename(2).
* Check private inode in check_access_path().
* Optimize get_file_access() when dealing with a directory.
* Add missing atomic.h .

Changes since v14:
* Simplify the object, rule and ruleset management at the expense of a
  less aggressive memory freeing (contributed by Jann Horn, with
  additional modifications):
  - Rewrite release_inode() to use inode->sb->s_landlock_inode_refs.
  - Remove useless checks in landlock_release_inodes(), clean object
pointer a

[PATCH v22 08/12] landlock: Add syscall implementations

2020-10-27 Thread Mickaël Salaün
From: Mickaël Salaün 

These 3 system calls are designed to be used by unprivileged processes
to sandbox themselves:
* landlock_create_ruleset(2): Creates a ruleset and returns its file
  descriptor.
* landlock_add_rule(2): Adds a rule (e.g. file hierarchy access) to a
  ruleset, identified by the dedicated file descriptor.
* landlock_enforce_ruleset_current(2): Enforces a ruleset on the current
  thread and its future children (similar to seccomp).  This syscall has
  the same usage restrictions as seccomp(2): the caller must have the
  no_new_privs attribute set or have CAP_SYS_ADMIN in the current user
  namespace.
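
The precondition in the last item can be met from user space with prctl(2). The following is a minimal sketch: the two wrapper names are hypothetical, while PR_SET_NO_NEW_PRIVS and PR_GET_NO_NEW_PRIVS are the existing prctl(2) operations. An unprivileged caller would run the first helper before asking the kernel to enforce a ruleset.

```c
#include <sys/prctl.h>

/*
 * Hypothetical wrapper: sets the no_new_privs attribute on the calling
 * thread.  Returns 0 on success, or -1 with errno set on error.  The
 * attribute cannot be cleared once set.
 */
static int prepare_for_enforcement(void)
{
	return prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
}

/*
 * Hypothetical wrapper: returns 1 if no_new_privs is set for the calling
 * thread, 0 if it is not, or -1 with errno set on error.
 */
static int has_no_new_privs(void)
{
	return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}
```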

All these syscalls have a "flags" argument (not currently used) to
enable extensibility.

Here are the motivations for these new syscalls:
* A sandboxed process may not have access to file systems, including
  /dev, /sys or /proc, but it should still be able to add more
  restrictions to itself.
* Neither prctl(2) nor seccomp(2) (which was used in a previous version)
  fit well with the current definition of a Landlock security policy.

All passed structs (attributes) are checked at build time to ensure that
they don't contain holes and that they are aligned the same way for each
architecture.
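
A build-time check of this kind can be sketched in plain C11 with static_assert. The struct below is hypothetical and is not the Landlock UAPI layout; it only shows the technique of comparing the struct size with the sum of its member sizes and pinning member offsets, so that a hole or an architecture-dependent layout breaks the build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute struct used only to illustrate the checks. */
struct example_attr {
	uint64_t allowed_access;
	int32_t parent_fd;
	uint32_t reserved;
};

/* No hole: the struct is exactly as large as the sum of its members. */
static_assert(sizeof(struct example_attr) ==
	      sizeof(uint64_t) + sizeof(int32_t) + sizeof(uint32_t),
	      "struct example_attr must not contain padding");

/* Same layout everywhere: each member sits at a fixed offset. */
static_assert(offsetof(struct example_attr, allowed_access) == 0,
	      "allowed_access must be the first member");
static_assert(offsetof(struct example_attr, parent_fd) == 8,
	      "parent_fd must follow allowed_access");
static_assert(offsetof(struct example_attr, reserved) == 12,
	      "reserved must follow parent_fd");
```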

See the user and kernel documentation for more details (provided by a
following commit):
* Documentation/userspace-api/landlock.rst
* Documentation/security/landlock.rst

Cc: Arnd Bergmann 
Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v21:
* Fix and improve comments.

Changes since v20:
* Remove two arguments to landlock_enforce_ruleset(2) (requested by Arnd
  Bergmann) and rename it to landlock_enforce_ruleset_current(2): remove
  the enum landlock_target_type and the target file descriptor (not used
  for now).  A ruleset can only be enforced on the current thread.
* Remove the size argument in landlock_add_rule() (requested by Arnd
  Bergmann).
* Remove landlock_get_features(2) (suggested by Arnd Bergmann).
* Simplify and rename copy_struct_if_any_from_user() to
  copy_min_struct_from_user().
* Rename "options" to "flags" to align with current syscalls.
* Rename some types and variables in a more consistent way.
* Fix missing type declarations in syscalls.h .

Changes since v19:
* Replace the landlock(2) syscall with 4 syscalls (one for each
  command): landlock_get_features(2), landlock_create_ruleset(2),
  landlock_add_rule(2) and landlock_enforce_ruleset(2) (suggested by
  Arnd Bergmann).
  https://lore.kernel.org/lkml/56d15841-e2c1-2d58-59b8-3a6a09b23...@digikod.net/
* Return EOPNOTSUPP (instead of ENOPKG) when Landlock is disabled.
* Add two new fields to landlock_attr_features to fit with the new
  syscalls: last_rule_type and last_target_type.  This makes it easy to
  identify which types are supported.
* Pack landlock_attr_path_beneath struct because of the removed
  ruleset_fd.
* Update documentation and fix spelling.

Changes since v18:
* Remove useless include.
* Remove LLATTR_SIZE() which was only used to shorten lines. Cf. commit
  bdc48fa11e46 ("checkpatch/coding-style: deprecate 80-column warning").

Changes since v17:
* Synchronize syscall declaration.
* Fix comment.

Changes since v16:
* Add a size_attr_features field to struct landlock_attr_features for
  self-introspection, and move the access_fs field to be more
  consistent.
* Replace __aligned_u64 types of attribute fields with __u16, __s32,
  __u32 and __u64, and check at build time that these structures do
  not contain holes and that they are aligned the same way (8-bits) on
  all architectures.  This shrinks the size of the userspace ABI, which
  may be appreciated especially for struct landlock_attr_features which
  could grow a lot in the future.  For instance, struct
  landlock_attr_features shrinks from 72 bytes to 32 bytes.  This change
  also makes it possible to remove 64-bits to 32-bits conversion checks.
* Switch syscall attribute pointer and size arguments to follow similar
  syscall argument order (e.g. bpf, clone3, openat2).
* Set LANDLOCK_OPT_* types to 32-bits.
* Allow enforcement of empty ruleset, which enables deny-all policies.
* Fix documentation inconsistency.

Changes since v15:
* Do not add file descriptors referring to internal filesystems (e.g.
  nsfs) in a ruleset.
* Replace is_user_mountable() with in-place clean checks.
* Replace EBADR with EBADFD in get_ruleset_from_fd() and
  get_path_from_fd().
* Remove ruleset's show_fdinfo() for now.

Changes since v14:
* Remove the security_file_open() check in get_path_from_fd(): an
  opened FD should not be restricted here, and even less with this hook.
  As a result, it is now allowed to add a path to a ruleset even if the
  access to this path is not allowed (without O_PATH). This doesn't
  change the fact that enforcing a ruleset can't grant any right, only
  remove some rights.  The new layer levels add more consistent
  restrictions.
* Ch

[PATCH v22 06/12] fs,security: Add sb_delete hook

2020-10-27 Thread Mickaël Salaün
From: Mickaël Salaün 

The sb_delete security hook is called when shutting down a superblock,
which may be useful to release kernel objects tied to the superblock's
lifetime (e.g. inodes).

This new hook is needed by Landlock to release (ephemerally) tagged
struct inodes.  This comes from the unprivileged nature of Landlock
described in the next commit.

Cc: Al Viro 
Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v17:
* Initial patch to replace the direct call to landlock_release_inodes()
  (requested by James Morris).
  https://lore.kernel.org/lkml/alpine.lrh.2.21.2005150536440.7...@namei.org/
---
 fs/super.c| 1 +
 include/linux/lsm_hook_defs.h | 1 +
 include/linux/lsm_hooks.h | 2 ++
 include/linux/security.h  | 4 
 security/security.c   | 5 +
 5 files changed, 13 insertions(+)

diff --git a/fs/super.c b/fs/super.c
index a51c2083cd6b..fea9475189f7 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -454,6 +454,7 @@ void generic_shutdown_super(struct super_block *sb)
evict_inodes(sb);
/* only nonzero refcount inodes can have marks */
fsnotify_sb_delete(sb);
+   security_sb_delete(sb);
 
if (sb->s_dio_done_wq) {
destroy_workqueue(sb->s_dio_done_wq);
diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index 32a940117e7a..1ba9b4dfecb3 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -59,6 +59,7 @@ LSM_HOOK(int, 0, fs_context_dup, struct fs_context *fc,
 LSM_HOOK(int, -ENOPARAM, fs_context_parse_param, struct fs_context *fc,
 struct fs_parameter *param)
 LSM_HOOK(int, 0, sb_alloc_security, struct super_block *sb)
+LSM_HOOK(void, LSM_RET_VOID, sb_delete, struct super_block *sb)
 LSM_HOOK(void, LSM_RET_VOID, sb_free_security, struct super_block *sb)
 LSM_HOOK(void, LSM_RET_VOID, sb_free_mnt_opts, void *mnt_opts)
 LSM_HOOK(int, 0, sb_eat_lsm_opts, char *orig, void **mnt_opts)
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index ff0f03a45c56..dbfcec05a176 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -108,6 +108,8 @@
  * allocated.
  * @sb contains the super_block structure to be modified.
  * Return 0 if operation was successful.
+ * @sb_delete:
+ * Release objects tied to a superblock (e.g. inodes).
  * @sb_free_security:
  * Deallocate and clear the sb->s_security field.
  * @sb contains the super_block structure to be modified.
diff --git a/include/linux/security.h b/include/linux/security.h
index bc2725491560..a4603b62d444 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -287,6 +287,7 @@ void security_bprm_committed_creds(struct linux_binprm 
*bprm);
 int security_fs_context_dup(struct fs_context *fc, struct fs_context *src_fc);
 int security_fs_context_parse_param(struct fs_context *fc, struct fs_parameter 
*param);
 int security_sb_alloc(struct super_block *sb);
+void security_sb_delete(struct super_block *sb);
 void security_sb_free(struct super_block *sb);
 void security_free_mnt_opts(void **mnt_opts);
 int security_sb_eat_lsm_opts(char *options, void **mnt_opts);
@@ -619,6 +620,9 @@ static inline int security_sb_alloc(struct super_block *sb)
return 0;
 }
 
+static inline void security_sb_delete(struct super_block *sb)
+{ }
+
 static inline void security_sb_free(struct super_block *sb)
 { }
 
diff --git a/security/security.c b/security/security.c
index 4ffd6c3af9d7..4563e7a79216 100644
--- a/security/security.c
+++ b/security/security.c
@@ -899,6 +899,11 @@ int security_sb_alloc(struct super_block *sb)
return rc;
 }
 
+void security_sb_delete(struct super_block *sb)
+{
+   call_void_hook(sb_delete, sb);
+}
+
 void security_sb_free(struct super_block *sb)
 {
call_void_hook(sb_free_security, sb);
-- 
2.28.0



[PATCH v22 04/12] landlock: Add ptrace restrictions

2020-10-27 Thread Mickaël Salaün
From: Mickaël Salaün 

Using ptrace(2) and related debug features on a target process can lead
to a privilege escalation.  Indeed, ptrace(2) can be used by an attacker
to impersonate another task and to remain undetected while performing
malicious activities.  Thanks to ptrace_may_access(), various parts of
the kernel can check if a tracer is more privileged than a tracee.

A landlocked process has fewer privileges than a non-landlocked process
and must then be subject to additional restrictions when manipulating
processes. To be allowed to use ptrace(2) and related syscalls on a
target process, a landlocked process must have a subset of the target
process's rules (i.e. the tracee must be in a sub-domain of the tracer).
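
The subset rule described above can be pictured as a walk up a chain of
parent-linked domains.  The following is only a user-space sketch under
stated assumptions, not the kernel implementation: struct domain,
domain_is_ancestor_or_equal() and may_ptrace() are hypothetical names
chosen for illustration, a NULL domain stands for a non-landlocked task,
and all locking (RCU) is left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical user-space model of a domain: each domain only records the
 * domain it was derived from (NULL for a first-level domain).
 */
struct domain {
	const struct domain *parent;
};

/*
 * Returns true if @tracer_dom is @tracee_dom or one of its ancestors, i.e.
 * the tracee carries at least all the restrictions of the tracer.
 */
bool domain_is_ancestor_or_equal(const struct domain *tracer_dom,
				 const struct domain *tracee_dom)
{
	const struct domain *walker;

	/* A non-landlocked tracer gets no additional restriction. */
	if (!tracer_dom)
		return true;
	for (walker = tracee_dom; walker; walker = walker->parent) {
		if (walker == tracer_dom)
			return true;
	}
	/* No relationship, or the tracee is not landlocked at all. */
	return false;
}

/* Returns 0 if tracing is allowed in this model, -EPERM otherwise. */
int may_ptrace(const struct domain *tracer_dom,
	       const struct domain *tracee_dom)
{
	return domain_is_ancestor_or_equal(tracer_dom, tracee_dom) ?
		0 : -EPERM;
}
```

In this model, a task in a first-level domain may trace a task in a
nested domain derived from it, but neither the reverse nor a task in a
sibling domain.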

Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v21:
* Fix copyright dates.

Changes since v14:
* Constify variables.

Changes since v13:
* Make the ptrace restriction mandatory, like in the v10.
* Remove the eBPF dependency.

Previous changes:
https://lore.kernel.org/lkml/20191104172146.30797-5-...@digikod.net/
---
 security/landlock/Makefile |   2 +-
 security/landlock/ptrace.c | 120 +
 security/landlock/ptrace.h |  14 +
 security/landlock/setup.c  |   2 +
 4 files changed, 137 insertions(+), 1 deletion(-)
 create mode 100644 security/landlock/ptrace.c
 create mode 100644 security/landlock/ptrace.h

diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index 041ea242e627..f1d1eb72fa76 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -1,4 +1,4 @@
 obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
 
 landlock-y := setup.o object.o ruleset.o \
-   cred.o
+   cred.o ptrace.o
diff --git a/security/landlock/ptrace.c b/security/landlock/ptrace.c
new file mode 100644
index ..77c77bb1fe97
--- /dev/null
+++ b/security/landlock/ptrace.c
@@ -0,0 +1,120 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - Ptrace hooks
+ *
+ * Copyright © 2017-2020 Mickaël Salaün 
+ * Copyright © 2019-2020 ANSSI
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "common.h"
+#include "cred.h"
+#include "ptrace.h"
+#include "ruleset.h"
+#include "setup.h"
+
+/**
+ * domain_scope_le - Checks domain ordering for scoped ptrace
+ *
+ * @parent: Parent domain.
+ * @child: Potential child of @parent.
+ *
+ * Checks if the @parent domain is less or equal to (i.e. an ancestor, which
+ * means a subset of) the @child domain.
+ */
+static bool domain_scope_le(const struct landlock_ruleset *const parent,
+   const struct landlock_ruleset *const child)
+{
+   const struct landlock_hierarchy *walker;
+
+   if (!parent)
+   return true;
+   if (!child)
+   return false;
+   for (walker = child->hierarchy; walker; walker = walker->parent) {
+   if (walker == parent->hierarchy)
+   /* @parent is in the scoped hierarchy of @child. */
+   return true;
+   }
+   /* There is no relationship between @parent and @child. */
+   return false;
+}
+
+static bool task_is_scoped(const struct task_struct *const parent,
+   const struct task_struct *const child)
+{
+   bool is_scoped;
+   const struct landlock_ruleset *dom_parent, *dom_child;
+
+   rcu_read_lock();
+   dom_parent = landlock_get_task_domain(parent);
+   dom_child = landlock_get_task_domain(child);
+   is_scoped = domain_scope_le(dom_parent, dom_child);
+   rcu_read_unlock();
+   return is_scoped;
+}
+
+static int task_ptrace(const struct task_struct *const parent,
+   const struct task_struct *const child)
+{
+   /* Quick return for non-landlocked tasks. */
+   if (!landlocked(parent))
+   return 0;
+   if (task_is_scoped(parent, child))
+   return 0;
+   return -EPERM;
+}
+
+/**
+ * hook_ptrace_access_check - Determines whether the current process may access
+ *   another
+ *
+ * @child: Process to be accessed.
+ * @mode: Mode of attachment.
+ *
+ * If the current task has Landlock rules, then the child must have at least
+ * the same rules.  Else denied.
+ *
+ * Determines whether a process may access another, returning 0 if permission
+ * granted, -errno if denied.
+ */
+static int hook_ptrace_access_check(struct task_struct *const child,
+   const unsigned int mode)
+{
+   return task_ptrace(current, child);
+}
+
+/**
+ * hook_ptrace_traceme - Determines whether another process may trace the
+ *  current one
+ *
+ * @parent: Task proposed to be the tracer.
+ *
+ * If the parent has Landlock rules, then the current task must have the same
+ * or more rules.  Else denied.
+ *
+ * Determines whether the nominated task is permitted 

[PATCH v22 02/12] landlock: Add ruleset and domain management

2020-10-27 Thread Mickaël Salaün
From: Mickaël Salaün 

A Landlock ruleset is mainly a red-black tree with Landlock rules as
nodes.  This enables quick update and lookup to match a requested access
e.g., to a file.  A ruleset is usable through a dedicated file
descriptor (cf. following commit implementing syscalls) which enables a
process to create and populate a ruleset with new rules.

A domain is a ruleset tied to a set of processes.  This group of rules
defines the security policy enforced on these processes and their future
children.  A domain can transition to a new domain which is the
intersection of all its constraints and those of a ruleset provided by
the current process.  This modification only impacts the current process.
This means that a process can only gain more constraints (i.e. lose
accesses) over time.
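
The "intersection of constraints" property can be illustrated with a toy
model in which each ruleset is reduced to a bitmask of allowed accesses
for a single object.  This is a simplified sketch, not the data
structure of this patch: the TOY_ACCESS_* bits, struct toy_domain and
the toy_domain_*() helpers are hypothetical, and the red-black tree and
per-object rules are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical access bits, for illustration only. */
#define TOY_ACCESS_READ		(1U << 0)
#define TOY_ACCESS_WRITE	(1U << 1)
#define TOY_ACCESS_EXECUTE	(1U << 2)

/* Toy domain: the accesses still allowed after stacking all layers. */
struct toy_domain {
	uint32_t allowed;
	unsigned int nb_layers;
};

/* A task that never enforced a ruleset is unrestricted. */
struct toy_domain toy_domain_unrestricted(void)
{
	struct toy_domain dom = { .allowed = UINT32_MAX, .nb_layers = 0 };

	return dom;
}

/*
 * Returns the domain obtained by stacking @ruleset_allowed on top of @dom.
 * The result is the intersection of both, so accesses can only be lost.
 */
struct toy_domain toy_domain_restrict(struct toy_domain dom,
				      uint32_t ruleset_allowed)
{
	dom.allowed &= ruleset_allowed;
	dom.nb_layers++;
	return dom;
}

/* Returns true if every bit of @request is still allowed by @dom. */
bool toy_domain_allows(struct toy_domain dom, uint32_t request)
{
	return (dom.allowed & request) == request;
}
```

Stacking a read/write ruleset and then a read/execute ruleset leaves
only read access in this model, and stacking an empty ruleset (mask 0)
yields a deny-all domain.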

Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v21:
* Add and clean up comments.

Changes since v18:
* Account rulesets to kmemcg.
* Remove struct holes.
* Cosmetic changes.

Changes since v17:
* Move include/uapi/linux/landlock.h and _LANDLOCK_ACCESS_FS_* to a
  following patch.

Changes since v16:
* Allow enforcement of empty ruleset, which enables deny-all policies.

Changes since v15:
* Replace layer_levels and layer_depth with a bitfield of layers, cf.
  filesystem commit.
* Rename the LANDLOCK_ACCESS_FS_{UNLINK,RMDIR} with
  LANDLOCK_ACCESS_FS_REMOVE_{FILE,DIR} because it makes sense to use
  them for the action of renaming a file or a directory, which may lead
  to the removal of the source file or directory.  Removes the
  LANDLOCK_ACCESS_FS_{LINK_TO,RENAME_FROM,RENAME_TO} which are now
  replaced with LANDLOCK_ACCESS_FS_REMOVE_{FILE,DIR} and
  LANDLOCK_ACCESS_FS_MAKE_* .
* Update the documentation accordingly and highlight how the access
  rights are taken into account.
* Change nb_rules from atomic_t to u32 because it is not used anymore by
  show_fdinfo().
* Add safeguard for level variables types.
* Check max number of rules.
* Replace struct landlock_access (self and beneath bitfields) with one
  bitfield.
* Remove useless variable.
* Add comments.

Changes since v14:
* Simplify the object, rule and ruleset management at the expense of a
  less aggressive memory freeing (contributed by Jann Horn, with
  additional modifications):
  - Make a domain immutable (remove the opportunistic cleaning).
  - Remove RCU pointers.
  - Merge struct landlock_ref and struct landlock_ruleset_elem into
landlock_rule: get rid of rule's RCU.
  - Adjust union.
  - Remove the landlock_insert_rule() check about a new object with the
same address as a previously disabled one, because it is not
possible to disable a rule anymore.
  Cf. 
https://lore.kernel.org/lkml/cag48ez21ben0wl1bbmtiiu8j9jp5iewthowz4turuj+ki0y...@mail.gmail.com/
* Fix nested domains by implementing a notion of layer level and depth:
  - Update landlock_insert_rule() to manage such layers.
  - Add an inherit_ruleset() helper to properly create a new domain.
  - Rename landlock_find_access() to landlock_find_rule() and return a
full rule reference.
  - Add a layer_level and a layer_depth fields to struct landlock_rule.
  - Add a top_layer_level field to struct landlock_ruleset.
* Remove access rights that may be required for FD-only requests:
  truncate, getattr, lock, chmod, chown, chgrp, ioctl.  This will be
  handled in a future evolution of Landlock, but right now the goal is to
  lighten the code to ease review.
* Remove LANDLOCK_ACCESS_FS_OPEN and rename
  LANDLOCK_ACCESS_FS_{READ,WRITE} with a FILE suffix.
* Rename LANDLOCK_ACCESS_FS_READDIR to match the *_FILE pattern.
* Remove LANDLOCK_ACCESS_FS_MAP which was useless.
* Fix memory leak in put_hierarchy() (reported by Jann Horn).
* Fix use-after-free and rename free_ruleset() (reported by Jann Horn).
* Replace the for loops with rbtree_postorder_for_each_entry_safe().
* Constify variables.
* Only use refcount_inc() through getter helpers.
* Change Landlock_insert_ruleset_access() to
  Landlock_insert_ruleset_rule().
* Rename landlock_put_ruleset_enqueue() to landlock_put_ruleset_deferred().
* Improve kernel documentation and add a warning about the unhandled
  access/syscall families.
* Move ABI check to syscall.c .

Changes since v13:
* New implementation, inspired by the previous inode eBPF map, but
  agnostic to the underlying kernel object.

Previous changes:
https://lore.kernel.org/lkml/20190721213116.23476-7-...@digikod.net/
---
 security/landlock/Makefile  |   2 +-
 security/landlock/ruleset.c | 350 
 security/landlock/ruleset.h | 157 
 3 files changed, 508 insertions(+), 1 deletion(-)
 create mode 100644 security/landlock/ruleset.c
 create mode 100644 security/landlock/ruleset.h

diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index cb6deefbf4c0..d846eba445bb 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -1,3

[PATCH v22 00/12] Landlock LSM

2020-10-27 Thread Mickaël Salaün
Hi,

Can you please consider merging this into the tree?

This new patch series improves documentation, cleans up comments,
renames ARCH_EPHEMERAL_STATES to ARCH_EPHEMERAL_INODES and removes
LANDLOCK_ACCESS_FS_CHROOT.

The SLOC count is 1183 for security/landlock/ and 1657 for
tools/testing/selftests/landlock/ .  Test coverage for security/landlock/
is 95.2% of lines.  The code not covered only deals with internal kernel
errors (e.g. memory allocation) and race conditions.

The compiled documentation is available here:
https://landlock.io/linux-doc/landlock-v22/userspace-api/landlock.html

This series can be applied on top of v5.10-rc1 .  This can be tested with
CONFIG_SECURITY_LANDLOCK and CONFIG_SAMPLE_LANDLOCK.  This patch series
can be found in a Git repository here:
https://github.com/landlock-lsm/linux/commits/landlock-v22
I would really appreciate constructive comments on this patch series.


# Landlock LSM

The goal of Landlock is to restrict ambient rights (e.g. global
filesystem access) for a set of processes.  Because Landlock is a
stackable LSM [1], it makes it possible to create safe security sandboxes
as new security layers in addition to the existing system-wide
access-controls.  This kind of sandbox is expected to help mitigate the
security impact of bugs or unexpected/malicious behaviors in user-space
applications.  Landlock empowers any process, including unprivileged
ones, to securely restrict themselves.

Landlock is inspired by seccomp-bpf but instead of filtering syscalls
and their raw arguments, a Landlock rule can restrict the use of kernel
objects like file hierarchies, according to the kernel semantics.
Landlock also takes inspiration from other OS sandbox mechanisms: XNU
Sandbox, FreeBSD Capsicum or OpenBSD Pledge/Unveil.

In its current form, Landlock lacks some access-control features.
This keeps the patch series small and eases review.  This series
still addresses multiple use cases, especially with the combined use of
seccomp-bpf: applications with built-in sandboxing, init systems,
security sandbox tools and security-oriented APIs [2].

Previous version:
https://lore.kernel.org/lkml/20201008153103.1155388-1-...@digikod.net/

[1] 
https://lore.kernel.org/lkml/50db058a-7dde-441b-a7f9-f6837fe8b...@schaufler-ca.com/
[2] 
https://lore.kernel.org/lkml/f646e1c7-33cf-333f-070c-0a40ad046...@digikod.net/


Casey Schaufler (1):
  LSM: Infrastructure management of the superblock

Mickaël Salaün (11):
  landlock: Add object management
  landlock: Add ruleset and domain management
  landlock: Set up the security framework and manage credentials
  landlock: Add ptrace restrictions
  fs,security: Add sb_delete hook
  landlock: Support filesystem access-control
  landlock: Add syscall implementations
  arch: Wire up Landlock syscalls
  selftests/landlock: Add user space tests
  samples/landlock: Add a sandbox manager example
  landlock: Add user and kernel documentation

 Documentation/security/index.rst  |1 +
 Documentation/security/landlock.rst   |   79 +
 Documentation/userspace-api/index.rst |1 +
 Documentation/userspace-api/landlock.rst  |  259 +++
 MAINTAINERS   |   13 +
 arch/Kconfig  |7 +
 arch/alpha/kernel/syscalls/syscall.tbl|3 +
 arch/arm/tools/syscall.tbl|3 +
 arch/arm64/include/asm/unistd.h   |2 +-
 arch/arm64/include/asm/unistd32.h |6 +
 arch/ia64/kernel/syscalls/syscall.tbl |3 +
 arch/m68k/kernel/syscalls/syscall.tbl |3 +
 arch/microblaze/kernel/syscalls/syscall.tbl   |3 +
 arch/mips/kernel/syscalls/syscall_n32.tbl |3 +
 arch/mips/kernel/syscalls/syscall_n64.tbl |3 +
 arch/mips/kernel/syscalls/syscall_o32.tbl |3 +
 arch/parisc/kernel/syscalls/syscall.tbl   |3 +
 arch/powerpc/kernel/syscalls/syscall.tbl  |3 +
 arch/s390/kernel/syscalls/syscall.tbl |3 +
 arch/sh/kernel/syscalls/syscall.tbl   |3 +
 arch/sparc/kernel/syscalls/syscall.tbl|3 +
 arch/um/Kconfig   |1 +
 arch/x86/entry/syscalls/syscall_32.tbl|3 +
 arch/x86/entry/syscalls/syscall_64.tbl|3 +
 arch/xtensa/kernel/syscalls/syscall.tbl   |3 +
 fs/super.c|1 +
 include/linux/lsm_hook_defs.h |1 +
 include/linux/lsm_hooks.h |3 +
 include/linux/security.h  |4 +
 include/linux/syscalls.h  |7 +
 include/uapi/asm-generic/unistd.h |8 +-
 include/uapi/linux/landlock.h |  128 ++
 kernel/sys_ni.c   |5 +
 samples/Kconfig   |7 +
 samples/Makefile  |1 +
 samples/landlock/.gitignore   |1 +
 samples/landlock

Re: [RESEND PATCH v11 0/3] Add trusted_for(2) (was O_MAYEXEC)

2020-10-27 Thread Mickaël Salaün
Andrew, could you please merge this into your tree?

On 19/10/2020 18:49, Mickaël Salaün wrote:
> Hi,
> 
> Can you please consider merging this into the tree?
> 
> 
> Overview
> ========
> 
> The final goal of this patch series is to enable the kernel to be a
> global policy manager by entrusting processes with access control at
> their level.  To reach this goal, two complementary parts are required:
> * user space needs to be able to know if it can trust some file
>   descriptor content for a specific usage;
> * and the kernel needs to make available some part of the policy
>   configured by the system administrator.
> 
> Primary goal of trusted_for(2)
> ==============================
> 
> This new syscall enables user space to ask the kernel: is this file
> descriptor's content trusted to be used for this purpose?  The set of
> usages currently only contains "execution", but others may follow (e.g.
> "configuration", "sensitive_data").  If the kernel identifies the file
> descriptor as trustworthy for this usage, user space should then take
> this information into account.  The "execution" usage means that the
> content of the file descriptor is trusted according to the system policy
> to be executed by user space, which means that it interprets the content
> or (tries to) map it as executable memory.
> 
> A simple system-wide security policy can be enforced by the system
> administrator through a sysctl configuration consistent with the mount
> points or the file access rights.  The documentation patch explains the
> prerequisites.
> 
> It is important to note that this can only extend the access control
> managed by the kernel.  Hence it enables current access control
> mechanisms to be extended and become a superset of what they can
> currently control.  Indeed, the security policy could also be delegated
> to an LSM, either a MAC system or an integrity system.  For instance,
> this is required to close a major IMA measurement/appraisal interpreter
> integrity gap by bringing the ability to check the use of scripts [1].
> Other uses are expected, such as for magic-links [2], SGX integration
> [3], bpffs [4].
> 
> Complementary W^X protections can be brought by SELinux, IPE [5] and
> trampfd [6].
> 
> Prerequisite of its use
> =======================
> 
> User space needs to adapt to take advantage of this new feature.  For
> example, the PEP 578 [7] (Runtime Audit Hooks) enables Python 3.8 to be
> extended with policy enforcement points related to code interpretation,
> which can be used to align with the PowerShell audit features.
> Additional Python security improvements (e.g. a limited interpreter
> without -c, stdin piping of code) are on their way [8].
> 
> Examples
> ========
> 
> The initial idea comes from CLIP OS 4 and the original implementation
> has been used for more than 12 years:
> https://github.com/clipos-archive/clipos4_doc
> Chrome OS has a similar approach:
> https://chromium.googlesource.com/chromiumos/docs/+/master/security/noexec_shell_scripts.md
> 
> Userland patches can be found here:
> https://github.com/clipos-archive/clipos4_portage-overlay/search?q=O_MAYEXEC
> Actually, there are more than the O_MAYEXEC changes (which match this search)
> e.g., to prevent Python interactive execution. There are patches for
> Bash, Wine, Java (Icedtea), Busybox's ash, Perl and Python. There are
> also some related patches which do not directly rely on O_MAYEXEC but
> which restrict the use of browser plugins and extensions, which may be
> seen as scripts too:
> https://github.com/clipos-archive/clipos4_portage-overlay/tree/master/www-client
> 
> An introduction to O_MAYEXEC was given at the Linux Security Summit
> Europe 2018 - Linux Kernel Security Contributions by ANSSI:
> https://www.youtube.com/watch?v=chNjCRtPKQY&t=17m15s
> The "write xor execute" principle was explained at Kernel Recipes 2018 -
> CLIP OS: a defense-in-depth OS:
> https://www.youtube.com/watch?v=PjRE0uBtkHU&t=11m14s
> See also a first LWN article about O_MAYEXEC and a new one about
> trusted_for(2) and its background:
> * https://lwn.net/Articles/82/
> * https://lwn.net/Articles/832959/
> 
> This patch series can be applied on top of v5.9 .  This can be tested
> with CONFIG_SYSCTL.  I would really appreciate constructive comments on
> this patch series.
> 
> Previous series:
> https://lore.kernel.org/lkml/20201001170232.522331-1-...@digikod.net/
> 
> [1] https://lore.kernel.org/lkml/1544647356.4028.105.ca...@linux.ibm.com/
> [2] https://lore.kernel.org/lkml/20190904201933.10736-6-cyp...@cyphar.com/
> [3] 
> https://lore.kernel.org/lkml/CALCETrVovr8XNZSroey7pHF46O=

[PATCH v3] dm verity: Add support for signature verification with 2nd keyring

2020-10-23 Thread Mickaël Salaün
From: Mickaël Salaün 

Add a new configuration DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING
to enable dm-verity signatures to be verified against the secondary
trusted keyring.  Instead of relying on the builtin trusted keyring
(with hard-coded certificates), the second trusted keyring can include
certificate authorities from the builtin trusted keyring and child
certificates loaded at run time.  Using the secondary trusted keyring
makes it possible to use dm-verity disks (e.g. loop devices) signed by keys which
did not exist at kernel build time, leveraging the certificate chain of
trust model.  In practice, this makes it possible to update certificates
without kernel update and reboot, aligning with module and kernel
(kexec) signature verification which already use the secondary trusted
keyring.
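
The chain of trust model described above can be sketched with a toy
keyring in which names stand in for real keys and signatures.  This is
only an illustration under stated assumptions, not the kernel keyring
API: struct toy_cert, struct toy_keyring and the toy_*() helpers are
hypothetical, and no cryptographic verification is performed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_KEYRING_MAX 8

/* Toy certificate: names stand in for real keys and signatures. */
struct toy_cert {
	const char *subject;
	const char *issuer;
};

struct toy_keyring {
	struct toy_cert certs[TOY_KEYRING_MAX];
	size_t nb_certs;
};

/* Returns true if @ring holds a certificate whose subject is @subject. */
bool toy_keyring_has(const struct toy_keyring *ring, const char *subject)
{
	size_t i;

	for (i = 0; i < ring->nb_certs; i++) {
		if (strcmp(ring->certs[i].subject, subject) == 0)
			return true;
	}
	return false;
}

/*
 * Adds @cert to the secondary keyring only if its issuer is already
 * trusted, either through the builtin keyring or the secondary one.
 */
bool toy_secondary_add(const struct toy_keyring *builtin,
		       struct toy_keyring *secondary, struct toy_cert cert)
{
	if (secondary->nb_certs >= TOY_KEYRING_MAX)
		return false;
	if (!toy_keyring_has(builtin, cert.issuer) &&
	    !toy_keyring_has(secondary, cert.issuer))
		return false;
	secondary->certs[secondary->nb_certs++] = cert;
	return true;
}

/*
 * A signer is accepted if the builtin keyring vouches for it, or the
 * secondary keyring does when @use_secondary is set (the new option).
 */
bool toy_signer_trusted(const struct toy_keyring *builtin,
			const struct toy_keyring *secondary,
			const char *signer, bool use_secondary)
{
	if (toy_keyring_has(builtin, signer))
		return true;
	return use_secondary && toy_keyring_has(secondary, signer);
}
```

In this model, a key loaded at run time and issued by a builtin CA is
accepted only when the secondary keyring is used, while a key with an
unknown issuer can never be added.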

Cc: Alasdair Kergon 
Cc: Andrew Morton 
Cc: Jarkko Sakkinen 
Cc: Jaskaran Khurana 
Cc: Mike Snitzer 
Cc: Milan Broz 
Signed-off-by: Mickaël Salaün 
---

Previous version:
https://lore.kernel.org/lkml/20201015150504.1319098-1-...@digikod.net/

Changes since v2:
* Add documentation about the builtin and the secondary trusted keyrings
  (requested by Mike Snitzer).

Changes since v1:
* Extend the commit message (asked by Jarkko Sakkinen).
* Rename the Kconfig "help" keyword according to commit 84af7a6194e4
  ("checkpatch: kconfig: prefer 'help' over '---help---'").
---
 Documentation/admin-guide/device-mapper/verity.rst |  7 ++-
 drivers/md/Kconfig | 13 -
 drivers/md/dm-verity-verify-sig.c  |  9 +++--
 3 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/device-mapper/verity.rst 
b/Documentation/admin-guide/device-mapper/verity.rst
index 66f71f0dab1b..b088a647acb7 100644
--- a/Documentation/admin-guide/device-mapper/verity.rst
+++ b/Documentation/admin-guide/device-mapper/verity.rst
@@ -134,7 +134,12 @@ root_hash_sig_key_desc 
 the pkcs7 signature of the roothash. The pkcs7 signature is used to 
validate
 the root hash during the creation of the device mapper block device.
 Verification of roothash depends on the config 
DM_VERITY_VERIFY_ROOTHASH_SIG
-being set in the kernel.
+being set in the kernel.  The signatures are checked against the builtin
+trusted keyring by default, or the secondary trusted keyring if
+DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING is set.  The secondary
+trusted keyring includes by default the builtin trusted keyring, and it can
+also gain new certificates at run time if they are signed by a certificate
+already in the secondary trusted keyring.
 
 Theory of operation
 ===
diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig
index 30ba3573626c..1d68935e45ef 100644
--- a/drivers/md/Kconfig
+++ b/drivers/md/Kconfig
@@ -530,11 +530,22 @@ config DM_VERITY_VERIFY_ROOTHASH_SIG
bool "Verity data device root hash signature verification support"
depends on DM_VERITY
select SYSTEM_DATA_VERIFICATION
- help
+   help
  Add ability for dm-verity device to be validated if the
  pre-generated tree of cryptographic checksums passed has a pkcs#7
  signature file that can validate the roothash of the tree.
 
+ By default, rely on the builtin trusted keyring.
+
+ If unsure, say N.
+
+config DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING
+   bool "Verity data device root hash signature verification with 
secondary keyring"
+   depends on DM_VERITY_VERIFY_ROOTHASH_SIG
+   depends on SECONDARY_TRUSTED_KEYRING
+   help
+ Rely on the secondary trusted keyring to verify dm-verity signatures.
+
  If unsure, say N.
 
 config DM_VERITY_FEC
diff --git a/drivers/md/dm-verity-verify-sig.c 
b/drivers/md/dm-verity-verify-sig.c
index 614e43db93aa..29385dc470d5 100644
--- a/drivers/md/dm-verity-verify-sig.c
+++ b/drivers/md/dm-verity-verify-sig.c
@@ -119,8 +119,13 @@ int verity_verify_root_hash(const void *root_hash, size_t 
root_hash_len,
}
 
ret = verify_pkcs7_signature(root_hash, root_hash_len, sig_data,
-   sig_len, NULL, VERIFYING_UNSPECIFIED_SIGNATURE,
-   NULL, NULL);
+   sig_len,
+#ifdef CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING
+   VERIFY_USE_SECONDARY_KEYRING,
+#else
+   NULL,
+#endif
+   VERIFYING_UNSPECIFIED_SIGNATURE, NULL, NULL);
 
return ret;
 }

base-commit: bbf5c979011a099af5dc76498918ed7df445635b
-- 
2.28.0



Re: [PATCH v2] dm verity: Add support for signature verification with 2nd keyring

2020-10-23 Thread Mickaël Salaün
It seems that there are no more questions. Mike, Alasdair, could you
please consider merging this into the tree?

On 16/10/2020 14:19, Mickaël Salaün wrote:
> 
> On 16/10/2020 13:08, Milan Broz wrote:
>> On 16/10/2020 10:49, Mickaël Salaün wrote:
>>> On 16/10/2020 10:29, Mickaël Salaün wrote:
>>>>
>>>> On 15/10/2020 18:52, Mike Snitzer wrote:
>>>>> Can you please explain why you've decided to make this a Kconfig CONFIG
>>>>> knob?  Why not either add: a dm-verity table argument? A dm-verity
>>>>> kernel module parameter? or both (to allow a particular default but
>>>>> then per-device override)?
>>>>
>>>> The purpose of signed dm-verity images is to authenticate files, or said
>>>> in another way, to enable the kernel to trust disk images in a flexible
>>>> way (i.e. thanks to the certificate chain of trust). Being able to update
>>>> such a chain at run time requires using the second trusted keyring. This
>>>> keyring automatically includes the certificate authorities from the
>>>> builtin trusted keyring, which are required to dynamically populate the
>>>> secondary trusted keyring with certificates signed by an already trusted
>>>> authority. The roots of trust must then be included at build time in the
>>>> builtin trusted keyring.
>>>>
>>>> To be meaningful, using dm-verity signatures implies having a
>>>> restricted user space, i.e. even the root user has limited power over
>>>> the kernel and the rest of the system. Blindly trusting data provided by
>>>> user space (e.g. dm-verity table argument, kernel module parameter)
>>>> defeats the purpose of (mandatory) authenticated images.
>>>>
>>>>>
>>>>> Otherwise, _all_ DM verity devices will be configured to use secondary
>>>>> keyring fallback.  Is that really desirable?
>>>>
>>>> That is already the current state (on purpose).
>>>
>>> I meant that when DM_VERITY_VERIFY_ROOTHASH_SIG is set, dm-verity
>>> signature becomes mandatory. This new configuration
>>> DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING extends this trust to the
>>> secondary trusted keyring, which contains certificates signed (directly
>>> or indirectly) by CA from the builtin trusted keyring.
>>>
>>> So yes, this new (optional) configuration *extends* the source of trust
>>> for all dm-verity devices, and yes, it is desirable. I think it should
>>> have been this way from the beginning (as for other authentication
>>> mechanisms) but it wasn't necessary at that time.
>>
>> Well, I understand why you need a config option here.
>> And using the secondary keyring actually makes much more sense to me than
>> the original approach.
>>
>> But please do not forget that dm-verity is sometimes used in different
>> contexts where such strict in-kernel certificate trust is unnecessary.
>> With your configure options set, you deliberately remove the possibility
>> to configure such devices.
> It doesn't make sense to set DM_VERITY_VERIFY_ROOTHASH_SIG in a generic
> distro because such a policy is configured at build time in the kernel
> with hardcoded CAs. If the new option is not set then nothing changes. I
> don't see why it could be an issue for use cases we previously defined
> (with DM_VERITY_VERIFY_ROOTHASH_SIG).
> 
>> I understand that it is needed for "trusted" systems, but we should be
>> clear in the documentation.
>> Maybe also add note to
>> /Documentation/admin-guide/device-mapper/verity.rst ?
>> We already mention DM_VERITY_VERIFY_ROOTHASH_SIG there.
> 
> The current documentation remains true.
> DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING depends on
> DM_VERITY_VERIFY_ROOTHASH_SIG.
> 
>>
>> The current userspace configuration through veritysetup does not need
>> any patches for your patch, correct?
> 
> Right, it's only different from the kernel point of view.
> 
>>
>> Thanks,
>> Milan
>>


[RESEND PATCH v11 3/3] selftest/interpreter: Add tests for trusted_for(2) policies

2020-10-19 Thread Mickaël Salaün
From: Mickaël Salaün 

Test that checks performed by trusted_for(2) on file descriptors are
consistent with noexec mount points and file execute permissions,
according to the policy configured with the fs.trust_policy sysctl.

Signed-off-by: Mickaël Salaün 
Reviewed-by: Thibaut Sautereau 
Cc: Al Viro 
Cc: Arnd Bergmann 
Cc: Andrew Morton 
Cc: Kees Cook 
Cc: Shuah Khan 
Cc: Vincent Strubel 
---

Changes since v10:
* Update selftest Makefile.

Changes since v9:
* Rename the syscall and the sysctl.
* Update tests for enum trusted_for_usage

Changes since v8:
* Update with the dedicated syscall introspect_access(2) and the renamed
  fs.introspection_policy sysctl.
* Remove the symlink check, which can't be used as is anymore.
* Use socketpair(2) to test UNIX socket.

Changes since v7:
* Update tests with faccessat2/AT_INTERPRETED, including new ones to
  check that setting R_OK or W_OK returns EINVAL.
* Add tests for memfd, pipefs and nsfs.
* Rename and move back tests to a standalone directory.

Changes since v6:
* Add full combination tests for all file types, including block
  devices, character devices, fifos, sockets and symlinks.
* Properly save and restore initial sysctl value for all tests.

Changes since v5:
* Refactor with FIXTURE_VARIANT, which makes the tests much easier to
  read and maintain.
* Save and restore initial sysctl value (suggested by Kees Cook).
* Test with a sysctl value of 0.
* Check errno in sysctl_access_write test.
* Update tests for the CAP_SYS_ADMIN switch.
* Update tests to check -EISDIR (replacing -EACCES).
* Replace FIXTURE_DATA() with FIXTURE() (spotted by Kees Cook).
* Use global const strings.

Changes since v3:
* Replace RESOLVE_MAYEXEC with O_MAYEXEC.
* Add tests to check that O_MAYEXEC is ignored by open(2) and openat(2).

Changes since v2:
* Move tests from exec/ to openat2/ .
* Replace O_MAYEXEC with RESOLVE_MAYEXEC from openat2(2).
* Cleanup tests.

Changes since v1:
* Move tests from yama/ to exec/ .
* Fix _GNU_SOURCE in kselftest_harness.h .
* Add a new test sysctl_access_write to check if CAP_MAC_ADMIN is taken
  into account.
* Test directory execution which is always forbidden since commit
  73601ea5b7b1 ("fs/open.c: allow opening only regular files during
  execve()"), and also check that even the root user can not bypass file
  execution checks.
* Make sure delete_workspace() always has enough rights to succeed.
* Cosmetic cleanup.
---
 tools/testing/selftests/Makefile  |   1 +
 .../testing/selftests/interpreter/.gitignore  |   2 +
 tools/testing/selftests/interpreter/Makefile  |  21 +
 tools/testing/selftests/interpreter/config|   1 +
 .../selftests/interpreter/trust_policy_test.c | 362 ++
 5 files changed, 387 insertions(+)
 create mode 100644 tools/testing/selftests/interpreter/.gitignore
 create mode 100644 tools/testing/selftests/interpreter/Makefile
 create mode 100644 tools/testing/selftests/interpreter/config
 create mode 100644 tools/testing/selftests/interpreter/trust_policy_test.c

diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 9018f45d631d..5a7cf8dd7ce2 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -21,6 +21,7 @@ TARGETS += ftrace
 TARGETS += futex
 TARGETS += gpio
 TARGETS += intel_pstate
+TARGETS += interpreter
 TARGETS += ipc
 TARGETS += ir
 TARGETS += kcmp
diff --git a/tools/testing/selftests/interpreter/.gitignore 
b/tools/testing/selftests/interpreter/.gitignore
new file mode 100644
index ..82a4846cbc4b
--- /dev/null
+++ b/tools/testing/selftests/interpreter/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+/*_test
diff --git a/tools/testing/selftests/interpreter/Makefile 
b/tools/testing/selftests/interpreter/Makefile
new file mode 100644
index ..dbca8ebda67e
--- /dev/null
+++ b/tools/testing/selftests/interpreter/Makefile
@@ -0,0 +1,21 @@
+# SPDX-License-Identifier: GPL-2.0-or-later
+
+CFLAGS += -Wall -O2
+LDLIBS += -lcap
+
+src_test := $(wildcard *_test.c)
+TEST_GEN_PROGS := $(src_test:.c=)
+
+KSFT_KHDR_INSTALL := 1
+include ../lib.mk
+
+khdr_dir = $(top_srcdir)/usr/include
+
+$(khdr_dir)/asm-generic/unistd.h: khdr
+   @:
+
+$(khdr_dir)/linux/trusted-for.h: khdr
+   @:
+
+$(OUTPUT)/%_test: %_test.c $(khdr_dir)/asm-generic/unistd.h 
$(khdr_dir)/linux/trusted-for.h ../kselftest_harness.h
+   $(LINK.c) $< $(LDLIBS) -o $@ -I$(khdr_dir)
diff --git a/tools/testing/selftests/interpreter/config 
b/tools/testing/selftests/interpreter/config
new file mode 100644
index ..dd53c266bf52
--- /dev/null
+++ b/tools/testing/selftests/interpreter/config
@@ -0,0 +1 @@
+CONFIG_SYSCTL=y
diff --git a/tools/testing/selftests/interpreter/trust_policy_test.c 
b/tools/testing/selftests/interpreter/trust_policy_test.c
new file mode 100644
index ..4818c5524ec0
--- /dev/null
+++ b/tools/testing/selftests/interpreter/trust_policy_test.c
@@ -0,0 +1,362 @@
+// SPDX-Licen

[RESEND PATCH v11 0/3] Add trusted_for(2) (was O_MAYEXEC)

2020-10-19 Thread Mickaël Salaün
Hi,

Could you please consider merging this into the tree?


Overview
========

The final goal of this patch series is to enable the kernel to be a
global policy manager by entrusting processes with access control at
their level.  To reach this goal, two complementary parts are required:
* user space needs to be able to know if it can trust some file
  descriptor content for a specific usage;
* and the kernel needs to make available some part of the policy
  configured by the system administrator.

Primary goal of trusted_for(2)
==============================

This new syscall enables user space to ask the kernel: is this file
descriptor's content trusted to be used for this purpose?  The set of
usages currently contains only "execution", but others may follow (e.g.
"configuration", "sensitive_data").  If the kernel identifies the file
descriptor as trustworthy for this usage, user space should then take
this information into account.  The "execution" usage means that the
content of the file descriptor is trusted according to the system policy
to be executed by user space, which means that user space interprets the
content or (tries to) map it as executable memory.

A simple system-wide security policy can be enforced by the system
administrator through a sysctl configuration consistent with the mount
points or the file access rights.  The documentation patch explains the
prerequisites.

It is important to note that this can only extend the access control
managed by the kernel.  Hence, it enables current access control
mechanisms to be extended and become a superset of what they can
currently control.  Indeed, the security policy could also be delegated
to an LSM, either a MAC system or an integrity system.  For instance,
this is required to close a major IMA measurement/appraisal interpreter
integrity gap by bringing the ability to check the use of scripts [1].
Other uses are expected, such as for magic-links [2], SGX integration
[3], bpffs [4].

Complementary W^X protections can be brought by SELinux, IPE [5] and
trampfd [6].

Prerequisite of its use
=======================

User space needs to adapt to take advantage of this new feature.  For
example, the PEP 578 [7] (Runtime Audit Hooks) enables Python 3.8 to be
extended with policy enforcement points related to code interpretation,
which can be used to align with the PowerShell audit features.
Additional Python security improvements (e.g. a limited interpreter
without -c, stdin piping of code) are on their way [8].

Examples
========

The initial idea comes from CLIP OS 4 and the original implementation
has been used for more than 12 years:
https://github.com/clipos-archive/clipos4_doc
Chrome OS has a similar approach:
https://chromium.googlesource.com/chromiumos/docs/+/master/security/noexec_shell_scripts.md

Userland patches can be found here:
https://github.com/clipos-archive/clipos4_portage-overlay/search?q=O_MAYEXEC
Actually, there are more than just the O_MAYEXEC changes (which match this
search), e.g. to prevent Python interactive execution. There are patches for
Bash, Wine, Java (Icedtea), Busybox's ash, Perl and Python. There are
also some related patches which do not directly rely on O_MAYEXEC but
which restrict the use of browser plugins and extensions, which may be
seen as scripts too:
https://github.com/clipos-archive/clipos4_portage-overlay/tree/master/www-client

An introduction to O_MAYEXEC was given at the Linux Security Summit
Europe 2018 - Linux Kernel Security Contributions by ANSSI:
https://www.youtube.com/watch?v=chNjCRtPKQY&t=17m15s
The "write xor execute" principle was explained at Kernel Recipes 2018 -
CLIP OS: a defense-in-depth OS:
https://www.youtube.com/watch?v=PjRE0uBtkHU&t=11m14s
See also a first LWN article about O_MAYEXEC and a new one about
trusted_for(2) and its background:
* https://lwn.net/Articles/82/
* https://lwn.net/Articles/832959/

This patch series can be applied on top of v5.9.  This can be tested
with CONFIG_SYSCTL.  I would really appreciate constructive comments on
this patch series.

Previous series:
https://lore.kernel.org/lkml/20201001170232.522331-1-...@digikod.net/

[1] https://lore.kernel.org/lkml/1544647356.4028.105.ca...@linux.ibm.com/
[2] https://lore.kernel.org/lkml/20190904201933.10736-6-cyp...@cyphar.com/
[3] 
https://lore.kernel.org/lkml/CALCETrVovr8XNZSroey7pHF46O=kj_c5D9K8h=z2t_cnrpv...@mail.gmail.com/
[4] 
https://lore.kernel.org/lkml/calcetrvez0euffxwfhtag_j+advbzewe0m3wjxmwveo7pj+...@mail.gmail.com/
[5] 
https://lore.kernel.org/lkml/20200406221439.1469862-12-deven.de...@linux.microsoft.com/
[6] 
https://lore.kernel.org/lkml/20200922215326.4603-1-madve...@linux.microsoft.com/
[7] https://www.python.org/dev/peps/pep-0578/
[8] 
https://lore.kernel.org/lkml/0c70debd-e79e-d514-06c6-4cd1e021f...@python.org/

Regards,

Mickaël Salaün (3):
  fs: Add trusted_for(2) syscall implementation and related sysctl
  arch: Wire up trusted_for(2)
  selftest/

[RESEND PATCH v11 1/3] fs: Add trusted_for(2) syscall implementation and related sysctl

2020-10-19 Thread Mickaël Salaün
From: Mickaël Salaün 

The trusted_for() syscall enables user space tasks to check that files
are trusted to be executed or interpreted by user space.  This may allow
script interpreters to check execution permission before reading
commands from a file, or dynamic linkers to allow shared object loading.
This may be seen as a way for a trusted task (e.g. interpreter) to check
the trustworthiness of files (e.g. scripts) before extending its control
flow graph with new ones originating from these files.

The security policy is consistently managed by the kernel through the
new sysctl: fs.trust_policy .  This enables system administrators to
enforce two complementary security policies according to the installed
system: enforce the noexec mount option, and enforce executable file
permission.  Indeed, because of compatibility with installed systems,
only system administrators are able to check that this new enforcement
is in line with the system mount points and file permissions.
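
The two complementary policies described above can be modelled in user
space.  The following is only a sketch of the decision logic, not the
kernel implementation: the bit values, the struct file_facts type and
the function name are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values for this sketch; not taken from the patch. */
#define POLICY_ENFORCE_MOUNT (1U << 0) /* honor the noexec mount option */
#define POLICY_ENFORCE_FILE  (1U << 1) /* honor the executable file permission */

/* Facts about a file that the policy decision depends on. */
struct file_facts {
	bool on_noexec_mount; /* the file lives on a mount with noexec */
	bool has_exec_perm;   /* the caller has execute permission on the file */
};

/* Return true if the file may be used for execution under the policy. */
static bool trusted_for_execution(unsigned int policy,
				  const struct file_facts *f)
{
	if ((policy & POLICY_ENFORCE_MOUNT) && f->on_noexec_mount)
		return false;
	if ((policy & POLICY_ENFORCE_FILE) && !f->has_exec_perm)
		return false;
	return true;
}
```

With no bit set, every file is accepted, which matches the idea that
systems which do not configure the sysctl see no behavior change.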

For this to be possible, script interpreters must use trusted_for(2)
with the TRUSTED_FOR_EXECUTION usage.  To be fully effective, these
interpreters also need to handle the other ways to execute code: command
line parameters (e.g., option -e for Perl), module loading (e.g., option
-m for Python), stdin, file sourcing, environment variables,
configuration files, etc.  According to the threat model, it may be
acceptable to allow some script interpreters (e.g. Bash) to interpret
commands from stdin, be it a TTY or a pipe, because it may not be
enough to (directly) perform syscalls.

Even without an enforced security policy, user space interpreters can use
this syscall to try as much as possible to enforce the system policy at
their level, knowing that it will not break anything on running systems
which do not care about this feature.  However, on systems which want
this feature enforced, there will be knowledgeable people (i.e. system
administrators who configured fs.trust_policy deliberately) to manage it.
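
The best-effort behavior described above could look like the following
sketch on the interpreter side.  The check is passed in as a callback so
that the example does not depend on a syscall number.  The callback
type, the stand-in functions and the use of ENOSYS and EACCES as error
codes are assumptions made for illustration, not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical callback: returns 0 if trusted, -1 with errno set otherwise. */
typedef int (*trust_check_fn)(int fd);

/* Stand-in for a kernel that does not provide the syscall. */
static int check_unsupported(int fd)
{
	(void)fd;
	errno = ENOSYS;
	return -1;
}

/* Stand-in for a kernel whose policy denies the file (EACCES is assumed). */
static int check_denied(int fd)
{
	(void)fd;
	errno = EACCES;
	return -1;
}

/* Stand-in for a kernel whose policy trusts the file. */
static int check_allowed(int fd)
{
	(void)fd;
	return 0;
}

/* Decide whether the interpreter should read commands from fd. */
static bool may_interpret(int fd, trust_check_fn check)
{
	if (check(fd) == 0)
		return true;
	/* Best effort: a missing syscall must not break existing systems. */
	return errno == ENOSYS;
}
```

An interpreter would call may_interpret() right after opening a script
and before reading any command from it, and refuse the script when it
returns false.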

Because trusted_for(2) is a means to enforce a system-wide security
policy (but not application-centric policies), it does not make sense
for user space to check the sysctl value.  Indeed, this new syscall only
extends the system's ability to enforce a policy thanks to (some
trusted) user space collaboration.  Moreover, additional security
policies could be managed by LSMs.  This is a best-effort approach from
the application developer's point of view:
https://lore.kernel.org/lkml/1477d3d7-4b36-afad-7077-a38f42322...@digikod.net/

trusted_for(2) with TRUSTED_FOR_EXECUTION should not be confused with
the O_EXEC flag (for open) which is intended for execute-only, which
obviously doesn't work for scripts.  However, a similar behavior could
be implemented in user space with O_PATH:
https://lore.kernel.org/lkml/1e2f6913-42f2-3578-28ed-567f6a4bd...@digikod.net/

Being able to restrict execution also helps protect the kernel by
restricting arbitrary syscalls that an attacker could perform with a
crafted binary or certain script languages.  It also improves multilevel
isolation by reducing the ability of an attacker to use side channels
with specific code.  These restrictions can natively be enforced for ELF
binaries (with the noexec mount option) but require this kernel
extension to properly handle scripts (e.g. Python, Perl).  To get a
consistent execution policy, additional memory restrictions should also
be enforced (e.g. thanks to SELinux).

This is a new implementation of a patch initially written by
Vincent Strubel for CLIP OS 4:
https://github.com/clipos-archive/src_platform_clip-patches/blob/f5cb330d6b684752e403b4e41b39f7004d88e561/1901_open_mayexec.patch
This patch has been used for more than 12 years with customized script
interpreters.  Some examples (with the original O_MAYEXEC) can be found
here:
https://github.com/clipos-archive/clipos4_portage-overlay/search?q=O_MAYEXEC

Co-developed-by: Thibaut Sautereau 
Signed-off-by: Thibaut Sautereau 
Signed-off-by: Mickaël Salaün 
Cc: Al Viro 
Cc: Andrew Morton 
Cc: Arnd Bergmann 
Cc: Jonathan Corbet 
Cc: Kees Cook 
Cc: Vincent Strubel 
---

Changes since v10:
* Add enum definition to syscalls.h .

Changes since v9:
* Rename the syscall to trusted_for(2) and the sysctl to fs.trust_policy
* Add a dedicated enum trusted_for_usage with include/uapi/linux/trusted-for.h
* Remove the extra MAY_INTROSPECTION_EXEC bit.  LSMs can still implement
  this feature themselves.

Changes since v8:
* Add a dedicated syscall introspect_access() (requested by Al Viro).
* Rename MAY_INTERPRETED_EXEC to MAY_INTROSPECTION_EXEC .
* Rename the sysctl fs.interpreted_access to fs.introspection_policy .
* Update documentation.

Changes since v7:
* Replace openat2/O_MAYEXEC with faccessat2/X_OK/AT_INTERPRETED .
  Switching to an FD-based syscall was suggested by Al Viro and Jann
  Horn.
* Handle special file descriptors.
* Add a compatibility mode for execute/read check.
* Move the sysctl policy from fs

[RESEND PATCH v11 2/3] arch: Wire up trusted_for(2)

2020-10-19 Thread Mickaël Salaün
From: Mickaël Salaün 

Wire up trusted_for(2) for all architectures.

Signed-off-by: Mickaël Salaün 
Reviewed-by: Thibaut Sautereau 
Cc: Al Viro 
Cc: Andrew Morton 
Cc: Arnd Bergmann 
Cc: Kees Cook 
Cc: Vincent Strubel 
---

Changes since v9:
* Rename introspect_access(2) to trusted_for(2).
* Increase syscall number to leave space for memfd_secret(2) in -next.

Changes since v7:
* New patch for the new syscall.
* Increase syscall numbers by 2 to leave space for new ones (in
  linux-next): watch_mount(2) and process_madvise(2).
---
 arch/alpha/kernel/syscalls/syscall.tbl  | 1 +
 arch/arm/tools/syscall.tbl  | 1 +
 arch/arm64/include/asm/unistd.h | 2 +-
 arch/arm64/include/asm/unistd32.h   | 2 ++
 arch/ia64/kernel/syscalls/syscall.tbl   | 1 +
 arch/m68k/kernel/syscalls/syscall.tbl   | 1 +
 arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   | 1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   | 1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   | 1 +
 arch/parisc/kernel/syscalls/syscall.tbl | 1 +
 arch/powerpc/kernel/syscalls/syscall.tbl| 1 +
 arch/s390/kernel/syscalls/syscall.tbl   | 1 +
 arch/sh/kernel/syscalls/syscall.tbl | 1 +
 arch/sparc/kernel/syscalls/syscall.tbl  | 1 +
 arch/x86/entry/syscalls/syscall_32.tbl  | 1 +
 arch/x86/entry/syscalls/syscall_64.tbl  | 1 +
 arch/xtensa/kernel/syscalls/syscall.tbl | 1 +
 include/uapi/asm-generic/unistd.h   | 4 +++-
 19 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index ec8bed9e7b75..0175cfc0f66f 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -479,3 +479,4 @@
 547  common  openat2      sys_openat2
 548  common  pidfd_getfd  sys_pidfd_getfd
 549  common  faccessat2   sys_faccessat2
+553  common  trusted_for  sys_trusted_for
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 171077cbf419..db9c8d35e75b 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -453,3 +453,4 @@
 437  common  openat2      sys_openat2
 438  common  pidfd_getfd  sys_pidfd_getfd
 439  common  faccessat2   sys_faccessat2
+443  common  trusted_for  sys_trusted_for
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 3b859596840d..d1f7d35f986e 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -38,7 +38,7 @@
 #define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE + 5)
 #define __ARM_NR_COMPAT_END     (__ARM_NR_COMPAT_BASE + 0x800)
 
-#define __NR_compat_syscalls   440
+#define __NR_compat_syscalls   444
 #endif
 
 #define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index 734860ac7cf9..33716dd2c04c 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -887,6 +887,8 @@ __SYSCALL(__NR_openat2, sys_openat2)
 __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
 #define __NR_faccessat2 439
 __SYSCALL(__NR_faccessat2, sys_faccessat2)
+#define __NR_trusted_for 443
+__SYSCALL(__NR_trusted_for, sys_trusted_for)
 
 /*
  * Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index f52a41f4c340..68e56436b611 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -360,3 +360,4 @@
 437  common  openat2      sys_openat2
 438  common  pidfd_getfd  sys_pidfd_getfd
 439  common  faccessat2   sys_faccessat2
+443  common  trusted_for  sys_trusted_for
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index 81fc799d8392..67f0bc2fc4d0 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -439,3 +439,4 @@
 437  common  openat2      sys_openat2
 438  common  pidfd_getfd  sys_pidfd_getfd
 439  common  faccessat2   sys_faccessat2
+443  common  trusted_for  sys_trusted_for
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index b4e263916f41..acd3057886b7 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -445,3 +445,4 @@
 437  common  openat2      sys_openat2
 438  common  pidfd_getfd  sys_pidfd_getfd
 439  common  faccessat2   sys_faccessat2
+443  common  trusted_for

Re: [PATCH v2] dm verity: Add support for signature verification with 2nd keyring

2020-10-16 Thread Mickaël Salaün


On 16/10/2020 13:08, Milan Broz wrote:
> On 16/10/2020 10:49, Mickaël Salaün wrote:
>> On 16/10/2020 10:29, Mickaël Salaün wrote:
>>>
>>> On 15/10/2020 18:52, Mike Snitzer wrote:
>>>> Can you please explain why you've decided to make this a Kconfig CONFIG
>>>> knob?  Why not either add: a dm-verity table argument? A dm-verity
>>>> kernel module parameter? or both (to allow a particular default but
>>>> then
>>>> per-device override)?
>>>
>>> The purpose of signed dm-verity images is to authenticate files, or said
>>> in another way, to enable the kernel to trust disk images in a flexible
>>> way (i.e. thanks to certificate's chain of trust). Being able to update
>>> such chain at run time requires to use the second trusted keyring. This
>>> keyring automatically includes the certificate authorities from the
>>> builtin trusted keyring, which are required to dynamically populate the
>>> secondary trusted keyring with certificates signed by an already trusted
>>> authority. The roots of trust must then be included at build time in the
>>> builtin trusted keyring.
>>>
>>> To be meaningful, using dm-verity signatures implies to have a
>>> restricted user space, i.e. even the root user has limited power over
>>> the kernel and the rest of the system. Blindly trusting data provided by
>>> user space (e.g. dm-verity table argument, kernel module parameter)
>>> defeat the purpose of (mandatory) authenticated images.
>>>
>>>>
>>>> Otherwise, _all_ DM verity devices will be configured to use secondary
>>>> keyring fallback.  Is that really desirable?
>>>
>>> That is already the current state (on purpose).
>>
>> I meant that when DM_VERITY_VERIFY_ROOTHASH_SIG is set, dm-verity
>> signature becomes mandatory. This new configuration
>> DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING extend this trust to the
>> secondary trusted keyring, which contains certificates signed (directly
>> or indirectly) by CA from the builtin trusted keyring.
>>
>> So yes, this new (optional) configuration *extends* the source of trust
>> for all dm-verity devices, and yes, it is desirable. I think it should
>> have been this way from the beginning (as for other authentication
>> mechanisms) but it wasn't necessary at that time.
> 
> Well, I understand why you need a config option here.
> And using the secondary keyring actually makes much more sense to me than
> the original approach.
> 
> But please do not forget that dm-verity is sometimes used in different
> contexts where such strict in-kernel certificate trust is unnecessary.
> With your configure options set, you deliberately remove the possibility
> to configure such devices.
It doesn't make sense to set DM_VERITY_VERIFY_ROOTHASH_SIG in a generic
distro because such a policy is configured at build time in the kernel
with hardcoded CAs. If the new option is not set, then nothing changes. I
don't see why it could be an issue for the use cases we previously
defined (with DM_VERITY_VERIFY_ROOTHASH_SIG).
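
To illustrate why nothing changes when the new option is unset, the
compile-time selection made by the patch can be mirrored in a small
user-space sketch.  The helper name verity_trusted_keys() is
hypothetical, and the sentinel definition below is only a stand-in for
the one provided by the kernel headers.

```c
#include <assert.h>
#include <stddef.h>

struct key; /* opaque in this sketch */

/* Stand-in sentinel for this sketch; the real one lives in kernel headers. */
#define VERIFY_USE_SECONDARY_KEYRING ((struct key *)1UL)

/* Hypothetical helper: pick the keyring argument at build time. */
static struct key *verity_trusted_keys(void)
{
#ifdef CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING
	return VERIFY_USE_SECONDARY_KEYRING;
#else
	/* Same value the patch passes when the option is unset. */
	return NULL;
#endif
}
```

Built without the new option, the helper returns NULL, which is exactly
what the code passed to verify_pkcs7_signature() before the patch.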

> I understand that it is needed for "trusted" systems, but we should be
> clear
> in the documentation.
> Maybe also add note to
> /Documentation/admin-guide/device-mapper/verity.rst ?
> We already mention DM_VERITY_VERIFY_ROOTHASH_SIG there.

The current documentation remains true.
DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING depends on
DM_VERITY_VERIFY_ROOTHASH_SIG.

> 
> The current userspace configuration through veritysetup does not need
> any patches for your patch, correct?

Right, it's only different from the kernel point of view.

> 
> Thanks,
> Milan
> 


Re: [PATCH v2] dm verity: Add support for signature verification with 2nd keyring

2020-10-16 Thread Mickaël Salaün



On 16/10/2020 10:29, Mickaël Salaün wrote:
> 
> On 15/10/2020 18:52, Mike Snitzer wrote:
>> On Thu, Oct 15 2020 at 11:05am -0400,
>> Mickaël Salaün  wrote:
>>
>>> From: Mickaël Salaün 
>>>
>>> Add a new configuration DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING
>>> to enable dm-verity signatures to be verified against the secondary
>>> trusted keyring.  Instead of relying on the builtin trusted keyring
>>> (with hard-coded certificates), the second trusted keyring can include
>>> certificate authorities from the builtin trusted keyring and child
>>> certificates loaded at run time.  Using the secondary trusted keyring
>>> enables to use dm-verity disks (e.g. loop devices) signed by keys which
>>> did not exist at kernel build time, leveraging the certificate chain of
>>> trust model.  In practice, this makes it possible to update certificates
>>> without kernel update and reboot, aligning with module and kernel
>>> (kexec) signature verification which already use the secondary trusted
>>> keyring.
>>>
>>> Cc: Alasdair Kergon 
>>> Cc: Andrew Morton 
>>> Cc: Jarkko Sakkinen 
>>> Cc: Jaskaran Khurana 
>>> Cc: Mike Snitzer 
>>> Cc: Milan Broz 
>>> Signed-off-by: Mickaël Salaün 
>>> ---
>>>
>>> Previous version:
>>> https://lore.kernel.org/lkml/20201002071802.535023-1-...@digikod.net/
>>>
>>> Changes since v1:
>>> * Extend the commit message (asked by Jarkko Sakkinen).
>>> * Rename the Kconfig "help" keyword according to commit 84af7a6194e4
>>>   ("checkpatch: kconfig: prefer 'help' over '---help---'").
>>
>> Can you please explain why you've decided to make this a Kconfig CONFIG
>> knob?  Why not either add: a dm-verity table argument? A dm-verity
>> kernel module parameter? or both (to allow a particular default but then
>> per-device override)?
> 
> The purpose of signed dm-verity images is to authenticate files, or said
> in another way, to enable the kernel to trust disk images in a flexible
> way (i.e. thanks to certificate's chain of trust). Being able to update
> such chain at run time requires to use the second trusted keyring. This
> keyring automatically includes the certificate authorities from the
> builtin trusted keyring, which are required to dynamically populate the
> secondary trusted keyring with certificates signed by an already trusted
> authority. The roots of trust must then be included at build time in the
> builtin trusted keyring.
> 
> To be meaningful, using dm-verity signatures implies to have a
> restricted user space, i.e. even the root user has limited power over
> the kernel and the rest of the system. Blindly trusting data provided by
> user space (e.g. dm-verity table argument, kernel module parameter)
> defeat the purpose of (mandatory) authenticated images.
> 
>>
>> Otherwise, _all_ DM verity devices will be configured to use secondary
>> keyring fallback.  Is that really desirable?
> 
> That is already the current state (on purpose).

I meant that when DM_VERITY_VERIFY_ROOTHASH_SIG is set, the dm-verity
signature becomes mandatory. This new configuration
DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING extends this trust to the
secondary trusted keyring, which contains certificates signed (directly
or indirectly) by CAs from the builtin trusted keyring.

So yes, this new (optional) configuration *extends* the source of trust
for all dm-verity devices, and yes, it is desirable. I think it should
have been this way from the beginning (as for other authentication
mechanisms) but it wasn't necessary at that time.

> 
>>
>> Regardless, I really don't see why a Kconfig knob is appropriate.
> 
> Moreover, a Kconfig knob makes sense as much as
> DM_VERITY_VERIFY_ROOTHASH_SIG,
> IMA_KEYRINGS_PERMIT_SIGNED_BY_BUILTIN_OR_SECONDARY, MODULE_SIG_FORCE and
> other similar authentication mechanisms. Indeed, when using these
> configurations, we want the kernel to enforce a specific policy.
> 
> Obviously, we can't make the DM_VERITY_VERIFY_ROOTHASH_SIG relies on the
> secondary trusted keyring without important security implications for
> systems already using this configuration (and then the builtin trusted
> keyring as the unique source of trust).
> 


Re: [PATCH v2] dm verity: Add support for signature verification with 2nd keyring

2020-10-16 Thread Mickaël Salaün


On 15/10/2020 18:52, Mike Snitzer wrote:
> On Thu, Oct 15 2020 at 11:05am -0400,
> Mickaël Salaün  wrote:
> 
>> From: Mickaël Salaün 
>>
>> Add a new configuration DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING
>> to enable dm-verity signatures to be verified against the secondary
>> trusted keyring.  Instead of relying on the builtin trusted keyring
>> (with hard-coded certificates), the second trusted keyring can include
>> certificate authorities from the builtin trusted keyring and child
>> certificates loaded at run time.  Using the secondary trusted keyring
>> enables to use dm-verity disks (e.g. loop devices) signed by keys which
>> did not exist at kernel build time, leveraging the certificate chain of
>> trust model.  In practice, this makes it possible to update certificates
>> without kernel update and reboot, aligning with module and kernel
>> (kexec) signature verification which already use the secondary trusted
>> keyring.
>>
>> Cc: Alasdair Kergon 
>> Cc: Andrew Morton 
>> Cc: Jarkko Sakkinen 
>> Cc: Jaskaran Khurana 
>> Cc: Mike Snitzer 
>> Cc: Milan Broz 
>> Signed-off-by: Mickaël Salaün 
>> ---
>>
>> Previous version:
>> https://lore.kernel.org/lkml/20201002071802.535023-1-...@digikod.net/
>>
>> Changes since v1:
>> * Extend the commit message (asked by Jarkko Sakkinen).
>> * Rename the Kconfig "help" keyword according to commit 84af7a6194e4
>>   ("checkpatch: kconfig: prefer 'help' over '---help---'").
> 
> Can you please explain why you've decided to make this a Kconfig CONFIG
> knob?  Why not either add: a dm-verity table argument? A dm-verity
> kernel module parameter? or both (to allow a particular default but then
> per-device override)?

The purpose of signed dm-verity images is to authenticate files, or, put
another way, to enable the kernel to trust disk images in a flexible
way (i.e. thanks to the certificates' chain of trust). Being able to
update such a chain at run time requires using the secondary trusted
keyring. This keyring automatically includes the certificate authorities
from the builtin trusted keyring, which are required to dynamically
populate the secondary trusted keyring with certificates signed by an
already trusted authority. The roots of trust must then be included at
build time in the builtin trusted keyring.

To be meaningful, using dm-verity signatures implies having a
restricted user space, i.e. even the root user has limited power over
the kernel and the rest of the system. Blindly trusting data provided by
user space (e.g. dm-verity table argument, kernel module parameter)
defeats the purpose of (mandatory) authenticated images.
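
The chain-of-trust rule described above can be sketched as a toy model.
This is not kernel code: a signature is reduced to an issuer identifier,
and struct cert, struct keyring and secondary_add() are hypothetical
names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CERTS 8

struct cert {
	int id;        /* identifier of this certificate */
	int issuer_id; /* identifier of the certificate that signed it */
};

struct keyring {
	struct cert certs[MAX_CERTS];
	size_t count;
};

/* Return true if the keyring holds a certificate with this identifier. */
static bool keyring_has(const struct keyring *kr, int id)
{
	for (size_t i = 0; i < kr->count; i++)
		if (kr->certs[i].id == id)
			return true;
	return false;
}

/* Add a certificate at run time only if a trusted authority signed it. */
static bool secondary_add(const struct keyring *builtin,
			  struct keyring *secondary, struct cert c)
{
	if (secondary->count >= MAX_CERTS)
		return false;
	if (!keyring_has(builtin, c.issuer_id) &&
	    !keyring_has(secondary, c.issuer_id))
		return false;
	secondary->certs[secondary->count++] = c;
	return true;
}
```

In this model the builtin keyring is fixed at build time, while the
secondary keyring grows at run time but never accepts a certificate
whose issuer is unknown to both keyrings.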

> 
> Otherwise, _all_ DM verity devices will be configured to use secondary
> keyring fallback.  Is that really desirable?

That is already the current state (on purpose).

> 
> Regardless, I really don't see why a Kconfig knob is appropriate.

Moreover, a Kconfig knob makes as much sense as
DM_VERITY_VERIFY_ROOTHASH_SIG,
IMA_KEYRINGS_PERMIT_SIGNED_BY_BUILTIN_OR_SECONDARY, MODULE_SIG_FORCE and
other similar authentication mechanisms. Indeed, when using these
configurations, we want the kernel to enforce a specific policy.

Obviously, we can't make DM_VERITY_VERIFY_ROOTHASH_SIG rely on the
secondary trusted keyring without important security implications for
systems already using this configuration (and thus the builtin trusted
keyring as the unique source of trust).


[PATCH v2] dm verity: Add support for signature verification with 2nd keyring

2020-10-15 Thread Mickaël Salaün
From: Mickaël Salaün 

Add a new configuration DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING
to enable dm-verity signatures to be verified against the secondary
trusted keyring.  Instead of relying on the builtin trusted keyring
(with hard-coded certificates), the secondary trusted keyring can include
certificate authorities from the builtin trusted keyring and child
certificates loaded at run time.  Using the secondary trusted keyring
makes it possible to use dm-verity disks (e.g. loop devices) signed by
keys which did not exist at kernel build time, leveraging the
certificate chain of trust model.  In practice, this makes it possible
to update certificates without kernel update and reboot, aligning with
module and kernel (kexec) signature verification which already use the
secondary trusted keyring.

Cc: Alasdair Kergon 
Cc: Andrew Morton 
Cc: Jarkko Sakkinen 
Cc: Jaskaran Khurana 
Cc: Mike Snitzer 
Cc: Milan Broz 
Signed-off-by: Mickaël Salaün 
---

Previous version:
https://lore.kernel.org/lkml/20201002071802.535023-1-...@digikod.net/

Changes since v1:
* Extend the commit message (asked by Jarkko Sakkinen).
* Rename the Kconfig "help" keyword according to commit 84af7a6194e4
  ("checkpatch: kconfig: prefer 'help' over '---help---'").
---
 drivers/md/Kconfig| 13 -
 drivers/md/dm-verity-verify-sig.c |  9 +++--
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig
index 30ba3573626c..1d68935e45ef 100644
--- a/drivers/md/Kconfig
+++ b/drivers/md/Kconfig
@@ -530,11 +530,22 @@ config DM_VERITY_VERIFY_ROOTHASH_SIG
bool "Verity data device root hash signature verification support"
depends on DM_VERITY
select SYSTEM_DATA_VERIFICATION
- help
+   help
  Add ability for dm-verity device to be validated if the
  pre-generated tree of cryptographic checksums passed has a pkcs#7
  signature file that can validate the roothash of the tree.
 
+ By default, rely on the builtin trusted keyring.
+
+ If unsure, say N.
+
+config DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING
+   bool "Verity data device root hash signature verification with secondary keyring"
+   depends on DM_VERITY_VERIFY_ROOTHASH_SIG
+   depends on SECONDARY_TRUSTED_KEYRING
+   help
+ Rely on the secondary trusted keyring to verify dm-verity signatures.
+
  If unsure, say N.
 
 config DM_VERITY_FEC
diff --git a/drivers/md/dm-verity-verify-sig.c b/drivers/md/dm-verity-verify-sig.c
index 614e43db93aa..29385dc470d5 100644
--- a/drivers/md/dm-verity-verify-sig.c
+++ b/drivers/md/dm-verity-verify-sig.c
@@ -119,8 +119,13 @@ int verity_verify_root_hash(const void *root_hash, size_t root_hash_len,
}
 
ret = verify_pkcs7_signature(root_hash, root_hash_len, sig_data,
-   sig_len, NULL, VERIFYING_UNSPECIFIED_SIGNATURE,
-   NULL, NULL);
+   sig_len,
+#ifdef CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING
+   VERIFY_USE_SECONDARY_KEYRING,
+#else
+   NULL,
+#endif
+   VERIFYING_UNSPECIFIED_SIGNATURE, NULL, NULL);
 
return ret;
 }

base-commit: bbf5c979011a099af5dc76498918ed7df445635b
-- 
2.28.0



Re: [PATCH v21 12/12] landlock: Add user and kernel documentation

2020-10-15 Thread Mickaël Salaün


On 08/10/2020 17:31, Mickaël Salaün wrote:
> From: Mickaël Salaün 
> 
> This documentation can be built with the Sphinx framework.
> 
> Cc: James Morris 
> Cc: Jann Horn 
> Cc: Kees Cook 
> Cc: Serge E. Hallyn 
> Signed-off-by: Mickaël Salaün 
> Reviewed-by: Vincent Dagonneau 
> ---
> 
> Changes since v20:
> * Update examples and documentation with the new syscalls.
> 
> Changes since v19:
> * Update examples and documentation with the new syscalls.
> 
> Changes since v15:
> * Add current limitations.
> 
> Changes since v14:
> * Fix spelling (contributed by Randy Dunlap).
> * Extend documentation about inheritance and explain layer levels.
> * Remove the use of now-removed access rights.
> * Use GitHub links.
> * Improve kernel documentation.
> * Add section for tests.
> * Update example.
> 
> Changes since v13:
> * Rewrote the documentation according to the major revamp.
> 
> Previous changes:
> https://lore.kernel.org/lkml/20191104172146.30797-8-...@digikod.net/
> ---
>  Documentation/security/index.rst   |   1 +
>  Documentation/security/landlock/index.rst  |  18 ++
>  Documentation/security/landlock/kernel.rst |  69 ++
>  Documentation/security/landlock/user.rst   | 242 +
>  4 files changed, 330 insertions(+)
>  create mode 100644 Documentation/security/landlock/index.rst
>  create mode 100644 Documentation/security/landlock/kernel.rst
>  create mode 100644 Documentation/security/landlock/user.rst
> 
> diff --git a/Documentation/security/index.rst 
> b/Documentation/security/index.rst
> index 8129405eb2cc..e3f2bf4fef77 100644
> --- a/Documentation/security/index.rst
> +++ b/Documentation/security/index.rst
> @@ -16,3 +16,4 @@ Security Documentation
> siphash
> tpm/index
> digsig
> +   landlock/index
> diff --git a/Documentation/security/landlock/index.rst 
> b/Documentation/security/landlock/index.rst
> new file mode 100644
> index ..2520f8f33f5e
> --- /dev/null
> +++ b/Documentation/security/landlock/index.rst
> @@ -0,0 +1,18 @@
> +=
> +Landlock LSM: unprivileged access control
> +=
> +
> +:Author: Mickaël Salaün
> +
> +The goal of Landlock is to enable to restrict ambient rights (e.g.  global
> +filesystem access) for a set of processes.  Because Landlock is a stackable
> +LSM, it makes possible to create safe security sandboxes as new security 
> layers
> +in addition to the existing system-wide access-controls. This kind of sandbox
> +is expected to help mitigate the security impact of bugs or
> +unexpected/malicious behaviors in user-space applications. Landlock empowers
> +any process, including unprivileged ones, to securely restrict themselves.
> +
> +.. toctree::
> +
> +user
> +kernel
> diff --git a/Documentation/security/landlock/kernel.rst 
> b/Documentation/security/landlock/kernel.rst
> new file mode 100644
> index ..27c0933a0b6e
> --- /dev/null
> +++ b/Documentation/security/landlock/kernel.rst
> @@ -0,0 +1,69 @@
> +==============================
> +Landlock: kernel documentation
> +==============================
Cf. https://landlock.io/linux-doc/landlock-v21/security/landlock/kernel.html

I guess this is the right place for the kernel API documentation.

> diff --git a/Documentation/security/landlock/user.rst 
> b/Documentation/security/landlock/user.rst
> new file mode 100644
> index ..e6fbc75c1af1
> --- /dev/null
> +++ b/Documentation/security/landlock/user.rst
> @@ -0,0 +1,242 @@
> +=================================
> +Landlock: userspace documentation
> +=================================
Cf. https://landlock.io/linux-doc/landlock-v21/security/landlock/user.html

Shouldn't this go in Documentation/userspace-api/ instead?

Documentation/security/lsm-development.rst says that LSM documentation
should go to Documentation/admin-guide/LSM/, but this (like the seccomp
documentation) is not admin documentation.
Should Documentation/userspace-api/landlock.rst be linked from
Documentation/admin-guide/LSM/index.rst too?


Re: [PATCH v21 07/12] landlock: Support filesystem access-control

2020-10-14 Thread Mickaël Salaün


On 14/10/2020 20:52, Mickaël Salaün wrote:
> 
> On 14/10/2020 20:07, James Morris wrote:
>> On Thu, 8 Oct 2020, Mickaël Salaün wrote:
>>
>>> +config ARCH_EPHEMERAL_STATES
>>> +   def_bool n
>>> +   help
>>> + An arch should select this symbol if it does not keep an internal 
>>> kernel
>>> + state for kernel objects such as inodes, but instead relies on 
>>> something
>>> + else (e.g. the host kernel for an UML kernel).
>>> +
>>
>> This is used to disable Landlock for UML, correct?
> 
> Yes
> 
>> I wonder if it could be 
>> more specific: "ephemeral states" is a very broad term.
>>
>> How about something like ARCH_OWN_INODES ?
> 
> Sounds good. We may need to add new ones (e.g. for network socket, UID,
> etc.) in the future though.
> 

Because UML is the exception here, it would be more convenient to keep
the inverted semantics. What about ARCH_NO_OWN_INODES or
ARCH_EPHEMERAL_INODES?


Re: [PATCH v21 07/12] landlock: Support filesystem access-control

2020-10-14 Thread Mickaël Salaün


On 14/10/2020 20:07, James Morris wrote:
> On Thu, 8 Oct 2020, Mickaël Salaün wrote:
> 
>> +config ARCH_EPHEMERAL_STATES
>> +def_bool n
>> +help
>> +  An arch should select this symbol if it does not keep an internal 
>> kernel
>> +  state for kernel objects such as inodes, but instead relies on 
>> something
>> +  else (e.g. the host kernel for an UML kernel).
>> +
> 
> This is used to disable Landlock for UML, correct?

Yes

> I wonder if it could be 
> more specific: "ephemeral states" is a very broad term.
> 
> How about something like ARCH_OWN_INODES ?

Sounds good. We may need to add new ones (e.g. for network socket, UID,
etc.) in the future though.


Re: [PATCH v1] dm verity: Add support for signature verification with 2nd keyring

2020-10-13 Thread Mickaël Salaün


On 13/10/2020 01:55, Jarkko Sakkinen wrote:
> On Fri, Oct 09, 2020 at 11:50:03AM +0200, Mickaël Salaün wrote:
>> Hi,
>>
>> What do you think about this patch?
>>
>> Regards,
>>  Mickaël
>>
>> On 02/10/2020 09:18, Mickaël Salaün wrote:
>>> From: Mickaël Salaün 
>>>
>>> Add a new DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING configuration
>>> to enable dm-verity signatures to be verified against the secondary
>>> trusted keyring.  This allows certificate updates without kernel update
>>> and reboot, aligning with module and kernel (kexec) signature
>>> verifications.
> 
> I'd prefer a bit more verbose phrasing, not least because I have never
> really even peeked at dm-verity, but it is also a good practice.
> 
> You have the middle part of the story missing - explaining the semantics
> of how the feature leads to the aimed solution.

OK, what about:

Add a new configuration DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING
to enable dm-verity signatures to be verified against the secondary
trusted keyring. Instead of relying on the builtin trusted keyring (with
hard-coded certificates), the secondary trusted keyring can include
certificate authorities from the builtin trusted keyring and child
certificates loaded at run time. Using the secondary trusted keyring
makes it possible to use dm-verity disks (e.g. loop devices) signed by
keys which did not exist at kernel build time, leveraging the certificate
chain of trust model. In practice, this allows certificates to be updated
without a kernel update and reboot, aligning with module and kernel
(kexec) signature verifications, which already use the secondary trusted
keyring.


Re: [PATCH v1] dm verity: Add support for signature verification with 2nd keyring

2020-10-09 Thread Mickaël Salaün
Hi,

What do you think about this patch?

Regards,
 Mickaël

On 02/10/2020 09:18, Mickaël Salaün wrote:
> From: Mickaël Salaün 
> 
> Add a new DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING configuration
> to enable dm-verity signatures to be verified against the secondary
> trusted keyring.  This allows certificate updates without kernel update
> and reboot, aligning with module and kernel (kexec) signature
> verifications.
> 
> Signed-off-by: Mickaël Salaün 
> Cc: Jaskaran Khurana 
> Cc: Mike Snitzer 
> Cc: Milan Broz 
> ---
>  drivers/md/Kconfig| 13 -
>  drivers/md/dm-verity-verify-sig.c |  9 +++--
>  2 files changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig
> index 30ba3573626c..63870fdfe8ce 100644
> --- a/drivers/md/Kconfig
> +++ b/drivers/md/Kconfig
> @@ -530,11 +530,22 @@ config DM_VERITY_VERIFY_ROOTHASH_SIG
>   bool "Verity data device root hash signature verification support"
>   depends on DM_VERITY
>   select SYSTEM_DATA_VERIFICATION
> -   help
> + ---help---
> Add ability for dm-verity device to be validated if the
> pre-generated tree of cryptographic checksums passed has a pkcs#7
> signature file that can validate the roothash of the tree.
>  
> +   By default, rely on the builtin trusted keyring.
> +
> +   If unsure, say N.
> +
> +config DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING
> + bool "Verity data device root hash signature verification with 
> secondary keyring"
> + depends on DM_VERITY_VERIFY_ROOTHASH_SIG
> + depends on SECONDARY_TRUSTED_KEYRING
> + ---help---
> +   Rely on the secondary trusted keyring to verify dm-verity signatures.
> +
> If unsure, say N.
>  
>  config DM_VERITY_FEC
> diff --git a/drivers/md/dm-verity-verify-sig.c 
> b/drivers/md/dm-verity-verify-sig.c
> index 614e43db93aa..29385dc470d5 100644
> --- a/drivers/md/dm-verity-verify-sig.c
> +++ b/drivers/md/dm-verity-verify-sig.c
> @@ -119,8 +119,13 @@ int verity_verify_root_hash(const void *root_hash, 
> size_t root_hash_len,
>   }
>  
>   ret = verify_pkcs7_signature(root_hash, root_hash_len, sig_data,
> - sig_len, NULL, VERIFYING_UNSPECIFIED_SIGNATURE,
> - NULL, NULL);
> + sig_len,
> +#ifdef CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING
> + VERIFY_USE_SECONDARY_KEYRING,
> +#else
> + NULL,
> +#endif
> + VERIFYING_UNSPECIFIED_SIGNATURE, NULL, NULL);
>  
>   return ret;
>  }
> 


[PATCH v21 06/12] fs,security: Add sb_delete hook

2020-10-08 Thread Mickaël Salaün
From: Mickaël Salaün 

The sb_delete security hook is called when shutting down a superblock,
which may be useful to release kernel objects tied to the superblock's
lifetime (e.g. inodes).

This new hook is needed by Landlock to release (ephemerally) tagged
struct inodes.  This comes from the unprivileged nature of Landlock
described in the next commit.

Cc: Al Viro 
Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v17:
* Initial patch to replace the direct call to landlock_release_inodes()
  (requested by James Morris).
  https://lore.kernel.org/lkml/alpine.lrh.2.21.2005150536440.7...@namei.org/
---
 fs/super.c| 1 +
 include/linux/lsm_hook_defs.h | 1 +
 include/linux/lsm_hooks.h | 2 ++
 include/linux/security.h  | 4 
 security/security.c   | 5 +
 5 files changed, 13 insertions(+)

diff --git a/fs/super.c b/fs/super.c
index 904459b35119..d5517e49ccdf 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -454,6 +454,7 @@ void generic_shutdown_super(struct super_block *sb)
evict_inodes(sb);
/* only nonzero refcount inodes can have marks */
fsnotify_sb_delete(sb);
+   security_sb_delete(sb);
 
if (sb->s_dio_done_wq) {
destroy_workqueue(sb->s_dio_done_wq);
diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index 2a8c74d99015..a512d4796409 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -59,6 +59,7 @@ LSM_HOOK(int, 0, fs_context_dup, struct fs_context *fc,
 LSM_HOOK(int, -ENOPARAM, fs_context_parse_param, struct fs_context *fc,
 struct fs_parameter *param)
 LSM_HOOK(int, 0, sb_alloc_security, struct super_block *sb)
+LSM_HOOK(void, LSM_RET_VOID, sb_delete, struct super_block *sb)
 LSM_HOOK(void, LSM_RET_VOID, sb_free_security, struct super_block *sb)
 LSM_HOOK(void, LSM_RET_VOID, sb_free_mnt_opts, void *mnt_opts)
 LSM_HOOK(int, 0, sb_eat_lsm_opts, char *orig, void **mnt_opts)
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 29df5075b35d..81f21861651e 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -108,6 +108,8 @@
  * allocated.
  * @sb contains the super_block structure to be modified.
  * Return 0 if operation was successful.
+ * @sb_delete:
+ * Release objects tied to a superblock (e.g. inodes).
  * @sb_free_security:
  * Deallocate and clear the sb->s_security field.
  * @sb contains the super_block structure to be modified.
diff --git a/include/linux/security.h b/include/linux/security.h
index 0a0a03b36a3b..28b0ee6c7239 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -286,6 +286,7 @@ void security_bprm_committed_creds(struct linux_binprm 
*bprm);
 int security_fs_context_dup(struct fs_context *fc, struct fs_context *src_fc);
 int security_fs_context_parse_param(struct fs_context *fc, struct fs_parameter 
*param);
 int security_sb_alloc(struct super_block *sb);
+void security_sb_delete(struct super_block *sb);
 void security_sb_free(struct super_block *sb);
 void security_free_mnt_opts(void **mnt_opts);
 int security_sb_eat_lsm_opts(char *options, void **mnt_opts);
@@ -614,6 +615,9 @@ static inline int security_sb_alloc(struct super_block *sb)
return 0;
 }
 
+static inline void security_sb_delete(struct super_block *sb)
+{ }
+
 static inline void security_sb_free(struct super_block *sb)
 { }
 
diff --git a/security/security.c b/security/security.c
index d60aa835b670..9337b511306c 100644
--- a/security/security.c
+++ b/security/security.c
@@ -898,6 +898,11 @@ int security_sb_alloc(struct super_block *sb)
return rc;
 }
 
+void security_sb_delete(struct super_block *sb)
+{
+   call_void_hook(sb_delete, sb);
+}
+
 void security_sb_free(struct super_block *sb)
 {
call_void_hook(sb_free_security, sb);
-- 
2.28.0



[PATCH v21 02/12] landlock: Add ruleset and domain management

2020-10-08 Thread Mickaël Salaün
From: Mickaël Salaün 

A Landlock ruleset is mainly a red-black tree with Landlock rules as
nodes.  This enables quick updates and lookups to match a requested
access, e.g. to a file.  A ruleset is usable through a dedicated file
descriptor (cf. following commit implementing the syscall) which enables
a process to create and populate a ruleset with new rules.

A domain is a ruleset tied to a set of processes.  This group of rules
defines the security policy enforced on these processes and their future
children.  A domain can transition to a new domain which is the
intersection of all its constraints and those of a ruleset provided by
the current process.  This modification only impacts the current process.
This means that a process can only gain more constraints (i.e. lose
accesses) over time.
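
The domain transition described above can be modeled in a few lines of
userspace C.  This is only a sketch under simplifying assumptions: a
domain is reduced to a single bitmask of allowed accesses (the real
implementation stores rules in a red-black tree), and the names ACCESS_*,
struct domain and domain_restrict() are hypothetical, not part of this
patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical access bits, for illustration only. */
#define ACCESS_READ    (1u << 0)
#define ACCESS_WRITE   (1u << 1)
#define ACCESS_EXECUTE (1u << 2)

/* Toy model: a domain is just the set of accesses it still allows. */
struct domain {
	uint32_t allowed;
};

/*
 * Return the domain obtained by stacking a ruleset on top of the current
 * one.  The result is the intersection of both sets, so it can never allow
 * an access that the current domain already denies.
 */
static struct domain domain_restrict(const struct domain current,
		const uint32_t ruleset_allowed)
{
	const struct domain next = {
		.allowed = current.allowed & ruleset_allowed,
	};

	/* Invariant: a transition never adds an access. */
	assert((next.allowed & ~current.allowed) == 0);
	return next;
}
```

In this model, stacking a ruleset that allows read and execute on a
domain that allows read and write leaves only read.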

Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v18:
* Account rulesets to kmemcg.
* Remove struct holes.
* Cosmetic changes.

Changes since v17:
* Move include/uapi/linux/landlock.h and _LANDLOCK_ACCESS_FS_* to a
  following patch.

Changes since v16:
* Allow enforcement of empty ruleset, which enables deny-all policies.

Changes since v15:
* Replace layer_levels and layer_depth with a bitfield of layers, cf.
  filesystem commit.
* Rename LANDLOCK_ACCESS_FS_{UNLINK,RMDIR} to
  LANDLOCK_ACCESS_FS_REMOVE_{FILE,DIR} because it makes sense to use
  them for the action of renaming a file or a directory, which may lead
  to the removal of the source file or directory.  Remove
  LANDLOCK_ACCESS_FS_{LINK_TO,RENAME_FROM,RENAME_TO}, which are now
  replaced with LANDLOCK_ACCESS_FS_REMOVE_{FILE,DIR} and
  LANDLOCK_ACCESS_FS_MAKE_* .
* Update the documentation accordingly and highlight how the access
  rights are taken into account.
* Change nb_rules from atomic_t to u32 because it is not used anymore by
  show_fdinfo().
* Add safeguard for level variables types.
* Check max number of rules.
* Replace struct landlock_access (self and beneath bitfields) with one
  bitfield.
* Remove useless variable.
* Add comments.

Changes since v14:
* Simplify the object, rule and ruleset management at the expense of a
  less aggressive memory freeing (contributed by Jann Horn, with
  additional modifications):
  - Make a domain immutable (remove the opportunistic cleaning).
  - Remove RCU pointers.
  - Merge struct landlock_ref and struct landlock_ruleset_elem into
    landlock_rule: get rid of rule's RCU.
  - Adjust union.
  - Remove the landlock_insert_rule() check about a new object with the
same address as a previously disabled one, because it is not
possible to disable a rule anymore.
  Cf. 
https://lore.kernel.org/lkml/cag48ez21ben0wl1bbmtiiu8j9jp5iewthowz4turuj+ki0y...@mail.gmail.com/
* Fix nested domains by implementing a notion of layer level and depth:
  - Update landlock_insert_rule() to manage such layers.
  - Add an inherit_ruleset() helper to properly create a new domain.
  - Rename landlock_find_access() to landlock_find_rule() and return a
full rule reference.
  - Add a layer_level and a layer_depth fields to struct landlock_rule.
  - Add a top_layer_level field to struct landlock_ruleset.
* Remove access rights that may be required for FD-only requests:
  truncate, getattr, lock, chmod, chown, chgrp, ioctl.  This will be
  handled in a future evolution of Landlock, but right now the goal is to
  lighten the code to ease review.
* Remove LANDLOCK_ACCESS_FS_OPEN and rename
  LANDLOCK_ACCESS_FS_{READ,WRITE} with a FILE suffix.
* Rename LANDLOCK_ACCESS_FS_READDIR to match the *_FILE pattern.
* Remove LANDLOCK_ACCESS_FS_MAP which was useless.
* Fix memory leak in put_hierarchy() (reported by Jann Horn).
* Fix use-after-free and rename free_ruleset() (reported by Jann Horn).
* Replace the for loops with rbtree_postorder_for_each_entry_safe().
* Constify variables.
* Only use refcount_inc() through getter helpers.
* Change Landlock_insert_ruleset_access() to
  Landlock_insert_ruleset_rule().
* Rename landlock_put_ruleset_enqueue() to landlock_put_ruleset_deferred().
* Improve kernel documentation and add a warning about the unhandled
  access/syscall families.
* Move ABI check to syscall.c .

Changes since v13:
* New implementation, inspired by the previous inode eBPF map, but
  agnostic to the underlying kernel object.

Previous changes:
https://lore.kernel.org/lkml/20190721213116.23476-7-...@digikod.net/
---
 MAINTAINERS |   1 +
 security/landlock/Makefile  |   2 +-
 security/landlock/ruleset.c | 342 
 security/landlock/ruleset.h | 157 +
 4 files changed, 501 insertions(+), 1 deletion(-)
 create mode 100644 security/landlock/ruleset.c
 create mode 100644 security/landlock/ruleset.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 40b0ad2b101e..3b951d6b7622 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9769,6 +9769,7 @@ L:linux-security-mod...@vger.kernel.org
 S

[PATCH v21 03/12] landlock: Set up the security framework and manage credentials

2020-10-08 Thread Mickaël Salaün
From: Mickaël Salaün 

A process's credentials point to a Landlock domain, which is implemented
underneath as a ruleset.  In the following commits, this domain is
used to check and enforce the ptrace and filesystem security policies.
A domain is inherited from a parent to its child the same way a thread
inherits a seccomp policy.

Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v17:
* Constify returned domain pointers from landlock_get_current_domain()
  and landlock_get_task_domain() helpers.

Changes since v15:
* Optimize landlocked() for current thread.
* Display the greeting message when everything is initialized.

Changes since v14:
* Use pr_fmt from common.h .
* Constify variables.
* Remove useless NULL initialization.

Changes since v13:
* Totally get rid of the seccomp dependency.
* Only keep credential management and LSM setup.

Previous changes:
https://lore.kernel.org/lkml/20191104172146.30797-4-...@digikod.net/
---
 security/Kconfig   | 10 +++
 security/landlock/Makefile |  3 +-
 security/landlock/common.h | 20 +
 security/landlock/cred.c   | 46 ++
 security/landlock/cred.h   | 58 ++
 security/landlock/setup.c  | 31 
 security/landlock/setup.h  | 16 +++
 7 files changed, 178 insertions(+), 6 deletions(-)
 create mode 100644 security/landlock/common.h
 create mode 100644 security/landlock/cred.c
 create mode 100644 security/landlock/cred.h
 create mode 100644 security/landlock/setup.c
 create mode 100644 security/landlock/setup.h

diff --git a/security/Kconfig b/security/Kconfig
index 15a4342b5d01..0ced7fd33e4d 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -278,11 +278,11 @@ endchoice
 
 config LSM
string "Ordered list of enabled LSMs"
-   default 
"lockdown,yama,loadpin,safesetid,integrity,smack,selinux,tomoyo,apparmor,bpf" 
if DEFAULT_SECURITY_SMACK
-   default 
"lockdown,yama,loadpin,safesetid,integrity,apparmor,selinux,smack,tomoyo,bpf" 
if DEFAULT_SECURITY_APPARMOR
-   default "lockdown,yama,loadpin,safesetid,integrity,tomoyo,bpf" if 
DEFAULT_SECURITY_TOMOYO
-   default "lockdown,yama,loadpin,safesetid,integrity,bpf" if 
DEFAULT_SECURITY_DAC
-   default 
"lockdown,yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor,bpf"
+   default 
"landlock,lockdown,yama,loadpin,safesetid,integrity,smack,selinux,tomoyo,apparmor,bpf"
 if DEFAULT_SECURITY_SMACK
+   default 
"landlock,lockdown,yama,loadpin,safesetid,integrity,apparmor,selinux,smack,tomoyo,bpf"
 if DEFAULT_SECURITY_APPARMOR
+   default "landlock,lockdown,yama,loadpin,safesetid,integrity,tomoyo,bpf" 
if DEFAULT_SECURITY_TOMOYO
+   default "landlock,lockdown,yama,loadpin,safesetid,integrity,bpf" if 
DEFAULT_SECURITY_DAC
+   default 
"landlock,lockdown,yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor,bpf"
help
  A comma-separated list of LSMs, in initialization order.
  Any LSMs left off this list will be ignored. This can be
diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index d846eba445bb..041ea242e627 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -1,3 +1,4 @@
 obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
 
-landlock-y := object.o ruleset.o
+landlock-y := setup.o object.o ruleset.o \
+   cred.o
diff --git a/security/landlock/common.h b/security/landlock/common.h
new file mode 100644
index ..5dc0fe15707d
--- /dev/null
+++ b/security/landlock/common.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Landlock LSM - Common constants and helpers
+ *
+ * Copyright © 2016-2020 Mickaël Salaün 
+ * Copyright © 2018-2020 ANSSI
+ */
+
+#ifndef _SECURITY_LANDLOCK_COMMON_H
+#define _SECURITY_LANDLOCK_COMMON_H
+
+#define LANDLOCK_NAME "landlock"
+
+#ifdef pr_fmt
+#undef pr_fmt
+#endif
+
+#define pr_fmt(fmt) LANDLOCK_NAME ": " fmt
+
+#endif /* _SECURITY_LANDLOCK_COMMON_H */
diff --git a/security/landlock/cred.c b/security/landlock/cred.c
new file mode 100644
index ..7074149d2517
--- /dev/null
+++ b/security/landlock/cred.c
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - Credential hooks
+ *
+ * Copyright © 2017-2020 Mickaël Salaün 
+ * Copyright © 2018-2020 ANSSI
+ */
+
+#include 
+#include 
+
+#include "common.h"
+#include "cred.h"
+#include "ruleset.h"
+#include "setup.h"
+
+static int hook_cred_prepare(struct cred *const new,
+   const struct cred *const old, const gfp_t gfp)
+{
+   const struct landlock_cred_security *cred_old = landlock_cred(old);
+   struct landlock_cred_security *cred_new = landlock_cred(new);
+   struct landloc

[PATCH v21 11/12] samples/landlock: Add a sandbox manager example

2020-10-08 Thread Mickaël Salaün
From: Mickaël Salaün 

Add a basic sandbox tool to launch a command which can only access a
whitelist of file hierarchies in a read-only or read-write way.

Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v20:
* Update with new syscalls and type names.
* Update errno check for EOPNOTSUPP.
* Use the full syscall interfaces: explicitly set the "flags" field to
  zero.

Changes since v19:
* Update with the new Landlock syscalls.
* Comply with commit 5f2fb52fac15 ("kbuild: rename hostprogs-y/always to
  hostprogs/always-y").

Changes since v16:
* Switch syscall attribute pointer and size arguments.

Changes since v15:
* Update access right names.
* Properly assign access rights to files according to the new related
  syscall restriction.
* Replace "select" with "depends on" HEADERS_INSTALL (suggested by Randy
  Dunlap).

Changes since v14:
* Fix Kconfig dependency.
* Remove access rights that may be required for FD-only requests:
  mmap, truncate, getattr, lock, chmod, chown, chgrp, ioctl.
* Fix useless hardcoded syscall number.
* Use execvpe().
* Follow symlinks.
* Extend help with common file paths.
* Constify variables.
* Clean up comments.
* Improve error message.

Changes since v11:
* Add back the filesystem sandbox manager and update it to work with the
  new Landlock syscall.

Previous changes:
https://lore.kernel.org/lkml/20190721213116.23476-9-...@digikod.net/
---
 samples/Kconfig  |   7 ++
 samples/Makefile |   1 +
 samples/landlock/.gitignore  |   1 +
 samples/landlock/Makefile|  15 +++
 samples/landlock/sandboxer.c | 220 +++
 5 files changed, 244 insertions(+)
 create mode 100644 samples/landlock/.gitignore
 create mode 100644 samples/landlock/Makefile
 create mode 100644 samples/landlock/sandboxer.c

diff --git a/samples/Kconfig b/samples/Kconfig
index 0ed6e4d71d87..092962924f0d 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -124,6 +124,13 @@ config SAMPLE_HIDRAW
bool "hidraw sample"
depends on CC_CAN_LINK && HEADERS_INSTALL
 
+config SAMPLE_LANDLOCK
+   bool "Build Landlock sample code"
+   depends on HEADERS_INSTALL
+   help
+ Build a simple Landlock sandbox manager able to launch a process
+ restricted by a user-defined filesystem access-control security 
policy.
+
 config SAMPLE_PIDFD
bool "pidfd sample"
depends on CC_CAN_LINK && HEADERS_INSTALL
diff --git a/samples/Makefile b/samples/Makefile
index 754553597581..4a6ce8f64a4c 100644
--- a/samples/Makefile
+++ b/samples/Makefile
@@ -11,6 +11,7 @@ obj-$(CONFIG_SAMPLE_KDB)  += kdb/
 obj-$(CONFIG_SAMPLE_KFIFO) += kfifo/
 obj-$(CONFIG_SAMPLE_KOBJECT)   += kobject/
 obj-$(CONFIG_SAMPLE_KPROBES)   += kprobes/
+subdir-$(CONFIG_SAMPLE_LANDLOCK)   += landlock
 obj-$(CONFIG_SAMPLE_LIVEPATCH) += livepatch/
 subdir-$(CONFIG_SAMPLE_PIDFD)  += pidfd
 obj-$(CONFIG_SAMPLE_QMI_CLIENT)+= qmi/
diff --git a/samples/landlock/.gitignore b/samples/landlock/.gitignore
new file mode 100644
index ..f43668b2d318
--- /dev/null
+++ b/samples/landlock/.gitignore
@@ -0,0 +1 @@
+/sandboxer
diff --git a/samples/landlock/Makefile b/samples/landlock/Makefile
new file mode 100644
index ..21eda5774948
--- /dev/null
+++ b/samples/landlock/Makefile
@@ -0,0 +1,15 @@
+# SPDX-License-Identifier: BSD-3-Clause
+
+hostprogs := sandboxer
+
+always-y := $(hostprogs)
+
+KBUILD_HOSTCFLAGS += -I$(objtree)/usr/include
+
+.PHONY: all clean
+
+all:
+   $(MAKE) -C ../.. samples/landlock/
+
+clean:
+   $(MAKE) -C ../.. M=samples/landlock/ clean
diff --git a/samples/landlock/sandboxer.c b/samples/landlock/sandboxer.c
new file mode 100644
index ..53ebc19aad3d
--- /dev/null
+++ b/samples/landlock/sandboxer.c
@@ -0,0 +1,220 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/*
+ * Simple Landlock sandbox manager able to launch a process restricted by a
+ * user-defined filesystem access-control security policy.
+ *
+ * Copyright © 2017-2020 Mickaël Salaün 
+ * Copyright © 2020 ANSSI
+ */
+
+#define _GNU_SOURCE
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#ifndef landlock_create_ruleset
+static inline int landlock_create_ruleset(
+   const struct landlock_ruleset_attr *const attr,
+   const size_t size, const __u32 flags)
+{
+   errno = 0;
+   return syscall(__NR_landlock_create_ruleset, attr, size, flags);
+}
+#endif
+
+#ifndef landlock_add_rule
+static inline int landlock_add_rule(const int ruleset_fd,
+   const enum landlock_rule_type rule_type,
+   const void *const rule_attr, const __u32 flags)
+{
+   errno = 0;
+   return syscall(__NR_landlock_

[PATCH v21 09/12] arch: Wire up Landlock syscalls

2020-10-08 Thread Mickaël Salaün
From: Mickaël Salaün 

Wire up the following system calls for all architectures:
* landlock_create_ruleset(2)
* landlock_add_rule(2)
* landlock_enforce_ruleset_current(2)

Cc: Arnd Bergmann 
Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v20:
* Remove landlock_get_features(2).
* Decrease syscall numbers to stick to process_madvise(2) in -next.
* Rename landlock_enforce_ruleset(2) to
  landlock_enforce_ruleset_current(2).

Changes since v19:
* Increase syscall numbers by 4 to leave space for new ones (in
  linux-next): watch_mount(2), watch_sb(2), fsinfo(2) and
  process_madvise(2) (requested by Arnd Bergmann).
* Replace the previous multiplexor landlock(2) with 4 syscalls:
  landlock_get_features(2), landlock_create_ruleset(2),
  landlock_add_rule(2) and landlock_enforce_ruleset(2).

Changes since v18:
* Increase the syscall number because of the new faccessat2(2).

Changes since v14:
* Add all architectures.

Changes since v13:
* New implementation.
---
 arch/alpha/kernel/syscalls/syscall.tbl  | 3 +++
 arch/arm/tools/syscall.tbl  | 3 +++
 arch/arm64/include/asm/unistd.h | 2 +-
 arch/arm64/include/asm/unistd32.h   | 6 ++
 arch/ia64/kernel/syscalls/syscall.tbl   | 3 +++
 arch/m68k/kernel/syscalls/syscall.tbl   | 3 +++
 arch/microblaze/kernel/syscalls/syscall.tbl | 3 +++
 arch/mips/kernel/syscalls/syscall_n32.tbl   | 3 +++
 arch/mips/kernel/syscalls/syscall_n64.tbl   | 3 +++
 arch/mips/kernel/syscalls/syscall_o32.tbl   | 3 +++
 arch/parisc/kernel/syscalls/syscall.tbl | 3 +++
 arch/powerpc/kernel/syscalls/syscall.tbl| 3 +++
 arch/s390/kernel/syscalls/syscall.tbl   | 3 +++
 arch/sh/kernel/syscalls/syscall.tbl | 3 +++
 arch/sparc/kernel/syscalls/syscall.tbl  | 3 +++
 arch/x86/entry/syscalls/syscall_32.tbl  | 3 +++
 arch/x86/entry/syscalls/syscall_64.tbl  | 3 +++
 arch/xtensa/kernel/syscalls/syscall.tbl | 3 +++
 include/uapi/asm-generic/unistd.h   | 8 +++-
 19 files changed, 62 insertions(+), 2 deletions(-)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl 
b/arch/alpha/kernel/syscalls/syscall.tbl
index ec8bed9e7b75..227027a0c6a8 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -479,3 +479,6 @@
 547common  openat2 sys_openat2
 548common  pidfd_getfd sys_pidfd_getfd
 549common  faccessat2  sys_faccessat2
+552common  landlock_create_ruleset 
sys_landlock_create_ruleset
+553common  landlock_add_rule   
sys_landlock_add_rule
+554common  landlock_enforce_ruleset_current
sys_landlock_enforce_ruleset_current
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 171077cbf419..fa06bad9b5c2 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -453,3 +453,6 @@
 437common  openat2 sys_openat2
 438common  pidfd_getfd sys_pidfd_getfd
 439common  faccessat2  sys_faccessat2
+442common  landlock_create_ruleset 
sys_landlock_create_ruleset
+443common  landlock_add_rule   
sys_landlock_add_rule
+444common  landlock_enforce_ruleset_current
sys_landlock_enforce_ruleset_current
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 3b859596840d..64ebdc1ec581 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -38,7 +38,7 @@
 #define __ARM_NR_compat_set_tls(__ARM_NR_COMPAT_BASE + 5)
 #define __ARM_NR_COMPAT_END(__ARM_NR_COMPAT_BASE + 0x800)
 
-#define __NR_compat_syscalls   440
+#define __NR_compat_syscalls   445
 #endif
 
 #define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h 
b/arch/arm64/include/asm/unistd32.h
index 734860ac7cf9..77b4445ef502 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -887,6 +887,12 @@ __SYSCALL(__NR_openat2, sys_openat2)
 __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
 #define __NR_faccessat2 439
 __SYSCALL(__NR_faccessat2, sys_faccessat2)
+#define __NR_landlock_create_ruleset 442
+__SYSCALL(__NR_landlock_create_ruleset, sys_landlock_create_ruleset)
+#define __NR_landlock_add_rule 443
+__SYSCALL(__NR_landlock_add_rule, sys_landlock_add_rule)
+#define __NR_landlock_enforce_ruleset_current 444
+__SYSCALL(__NR_landlock_enforce_ruleset_current, 
sys_landlock_enforce_ruleset_current)
 
 /*
  * Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl 
b/arch/ia64/kernel/syscalls/syscall.tbl
index f52a41f4c340..d6b2a1352c54 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl

[PATCH v21 08/12] landlock: Add syscall implementations

2020-10-08 Thread Mickaël Salaün
From: Mickaël Salaün 

These 3 system calls are designed to be used by unprivileged processes
to sandbox themselves:
* landlock_create_ruleset(2): Creates a ruleset and returns its file
  descriptor.
* landlock_add_rule(2): Adds a rule (e.g. file hierarchy access) to a
  ruleset, identified by the dedicated file descriptor.
* landlock_enforce_ruleset_current(2): Enforces a ruleset on the current
  thread and its future children (similar to seccomp).  This syscall has
  the same usage restrictions as seccomp(2): the caller must have the
  no_new_privs attribute set or have CAP_SYS_ADMIN in the current user
  namespace.
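
For an unprivileged caller, the no_new_privs requirement above is
typically satisfied with prctl(2) before enforcing a ruleset.  A minimal
sketch showing only that step; set_no_new_privs() is a hypothetical
helper name, and the Landlock call itself is left out because its
interface still changes between versions of this series.

```c
#include <stdio.h>
#include <sys/prctl.h>

/*
 * Set the no_new_privs attribute on the calling thread.  It is inherited
 * by children and cannot be unset.  Returns 0 on success, -1 on error.
 */
static int set_no_new_privs(void)
{
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
		perror("prctl(PR_SET_NO_NEW_PRIVS)");
		return -1;
	}
	return 0;
}
```

A caller holding CAP_SYS_ADMIN in its user namespace would not need this
step, according to the restriction described above.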

All these syscalls have a "flags" argument (not currently used) to
enable extensibility.
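
Keeping an unused "flags" argument extensible usually means rejecting any
bit that is not understood yet, so that a future kernel can give it a
meaning without breaking existing callers.  A rough userspace model of
that check, not the code from this patch; check_flags() and KNOWN_FLAGS
are hypothetical names.

```c
#include <errno.h>
#include <stdint.h>

/* No flag is defined yet in this model, so every bit is unknown. */
#define KNOWN_FLAGS 0u

/* Return 0 if only known flags are set, -EINVAL otherwise. */
static int check_flags(const uint32_t flags)
{
	if (flags & ~KNOWN_FLAGS)
		return -EINVAL;
	return 0;
}
```

With this convention, userspace has to pass zero today, which matches the
sample code explicitly setting the "flags" field to zero.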

Here are the motivations for these new syscalls:
* A sandboxed process may not have access to file systems, including
  /dev, /sys or /proc, but it should still be able to add more
  restrictions to itself.
* Neither prctl(2) nor seccomp(2) (which was used in a previous version)
  fit well with the current definition of a Landlock security policy.

All passed structs (attributes) are checked at build time to ensure that
they don't contain holes and that they are aligned the same way for each
architecture.
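
One way to express such a build-time check in plain C11 is to compare the
size of a structure with the sum of the sizes of its fields.  The sketch
below uses a hypothetical struct example_attr, not one of the Landlock
structures, and _Static_assert rather than the kernel's own build-time
assertion helpers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute structure, laid out to avoid padding. */
struct example_attr {
	uint64_t access_mask;
	uint32_t flags;
	uint32_t reserved;
};

/* No hole: the fields account for every byte of the structure. */
_Static_assert(sizeof(struct example_attr) ==
		sizeof(uint64_t) + 2 * sizeof(uint32_t),
		"struct example_attr contains a hole");

/* The fixed-width fields are expected at the same offsets everywhere. */
_Static_assert(offsetof(struct example_attr, flags) == 8,
		"unexpected offset for flags");
_Static_assert(offsetof(struct example_attr, reserved) == 12,
		"unexpected offset for reserved");
```

If a field were added without keeping the layout packed, the build would
fail instead of silently changing the userspace ABI.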

See the user and kernel documentation for more details (provided by a
following commit): Documentation/security/landlock/

Cc: Arnd Bergmann 
Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v20:
* Remove two arguments to landlock_enforce_ruleset(2) (requested by Arnd
  Bergmann) and rename it to landlock_enforce_ruleset_current(2): remove
  the enum landlock_target_type and the target file descriptor (not used
  for now).  A ruleset can only be enforced on the current thread.
* Remove the size argument in landlock_add_rule() (requested by Arnd
  Bergmann).
* Remove landlock_get_features(2) (suggested by Arnd Bergmann).
* Simplify and rename copy_struct_if_any_from_user() to
  copy_min_struct_from_user().
* Rename "options" to "flags" to align with current syscalls.
* Rename some types and variables in a more consistent way.
* Fix missing type declarations in syscalls.h .

Changes since v19:
* Replace the landlock(2) syscall with 4 syscalls (one for each
  command): landlock_get_features(2), landlock_create_ruleset(2),
  landlock_add_rule(2) and landlock_enforce_ruleset(2) (suggested by
  Arnd Bergmann).
  https://lore.kernel.org/lkml/56d15841-e2c1-2d58-59b8-3a6a09b23...@digikod.net/
* Return EOPNOTSUPP (instead of ENOPKG) when Landlock is disabled.
* Add two new fields to landlock_attr_features to fit with the new
  syscalls: last_rule_type and last_target_type.  This makes it easy to
  identify which types are supported.
* Pack landlock_attr_path_beneath struct because of the removed
  ruleset_fd.
* Update documentation and fix spelling.

Changes since v18:
* Remove useless include.
* Remove LLATTR_SIZE() which was only used to shorten lines. Cf. commit
  bdc48fa11e46 ("checkpatch/coding-style: deprecate 80-column warning").

Changes since v17:
* Synchronize syscall declaration.
* Fix comment.

Changes since v16:
* Add a size_attr_features field to struct landlock_attr_features for
  self-introspection, and move the access_fs field to be more
  consistent.
* Replace __aligned_u64 types of attribute fields with __u16, __s32,
  __u32 and __u64, and check at build time that these structures do
  not contain holes and that they are aligned the same way (8-bits) on
  all architectures.  This shrinks the size of the userspace ABI, which
  may be appreciated especially for struct landlock_attr_features which
  could grow a lot in the future.  For instance, struct
  landlock_attr_features shrinks from 72 bytes to 32 bytes.  This change
  also makes it possible to remove the 64-bit to 32-bit conversion
  checks.
* Switch syscall attribute pointer and size arguments to follow similar
  syscall argument order (e.g. bpf, clone3, openat2).
* Set LANDLOCK_OPT_* types to 32-bits.
* Allow enforcement of empty ruleset, which enables deny-all policies.
* Fix documentation inconsistency.

Changes since v15:
* Do not add file descriptors referring to internal filesystems (e.g.
  nsfs) in a ruleset.
* Replace is_user_mountable() with in-place clean checks.
* Replace EBADR with EBADFD in get_ruleset_from_fd() and
  get_path_from_fd().
* Remove ruleset's show_fdinfo() for now.

Changes since v14:
* Remove the security_file_open() check in get_path_from_fd(): an
  opened FD should not be restricted here, and even less with this hook.
  As a result, it is now allowed to add a path to a ruleset even if the
  access to this path is not allowed (without O_PATH). This doesn't
  change the fact that enforcing a ruleset can't grant any right, only
  remove some rights.  The new layer levels add more consistent
  restrictions.
* Check minimal landlock_attr_* size/content. This fixes the case when
  no data was provided and e.g., F

[PATCH v21 04/12] landlock: Add ptrace restrictions

2020-10-08 Thread Mickaël Salaün
From: Mickaël Salaün 

Using ptrace(2) and related debug features on a target process can lead
to a privilege escalation.  Indeed, ptrace(2) can be used by an attacker
to impersonate another task and to remain undetected while performing
malicious activities.  Thanks to ptrace_may_access(), various parts of
the kernel can check if a tracer is more privileged than a tracee.

A landlocked process has fewer privileges than a non-landlocked process
and must then be subject to additional restrictions when manipulating
processes. To be allowed to use ptrace(2) and related syscalls on a
target process, a landlocked process must have a subset of the target
process' rules (i.e. the tracee must be in a sub-domain of the tracer).
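
The sub-domain rule above can be sketched in user space as a walk up a
parent chain.  This is a simplified model, not the kernel code: struct
domain, domain_is_ancestor_or_same() and may_ptrace() are names chosen
for this example, and RCU protection is left out:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified stand-in for a Landlock domain (assumption for this
 * example): each domain points to the domain it was derived from.
 * A NULL domain pointer means "not sandboxed".
 */
struct domain {
	const struct domain *parent;
};

/* True if @tracer is @tracee's domain or one of its ancestors. */
static bool domain_is_ancestor_or_same(const struct domain *tracer,
				       const struct domain *tracee)
{
	const struct domain *walker;

	/* A non-sandboxed tracer is never restricted by this check. */
	if (!tracer)
		return true;
	for (walker = tracee; walker; walker = walker->parent) {
		if (walker == tracer)
			return true;
	}
	/* No relationship, or the tracee is not sandboxed at all. */
	return false;
}

/* Returns 0 if tracing is allowed, -EPERM otherwise. */
static int may_ptrace(const struct domain *tracer,
		      const struct domain *tracee)
{
	return domain_is_ancestor_or_same(tracer, tracee) ? 0 : -EPERM;
}

/* Usage: one nested domain and one unrelated domain. */
void ptrace_scope_example(void)
{
	const struct domain outer = { .parent = NULL };
	const struct domain inner = { .parent = &outer };
	const struct domain other = { .parent = NULL };

	assert(may_ptrace(NULL, &inner) == 0);        /* free tracer */
	assert(may_ptrace(&outer, &inner) == 0);      /* sub-domain */
	assert(may_ptrace(&inner, &outer) == -EPERM); /* less restricted */
	assert(may_ptrace(&outer, NULL) == -EPERM);   /* free tracee */
	assert(may_ptrace(&other, &inner) == -EPERM); /* unrelated */
}
```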

Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v14:
* Constify variables.

Changes since v13:
* Make the ptrace restriction mandatory, as in v10.
* Remove the eBPF dependency.

Previous changes:
https://lore.kernel.org/lkml/20191104172146.30797-5-...@digikod.net/
---
 security/landlock/Makefile |   2 +-
 security/landlock/ptrace.c | 120 +
 security/landlock/ptrace.h |  14 +
 security/landlock/setup.c  |   2 +
 4 files changed, 137 insertions(+), 1 deletion(-)
 create mode 100644 security/landlock/ptrace.c
 create mode 100644 security/landlock/ptrace.h

diff --git a/security/landlock/Makefile b/security/landlock/Makefile
index 041ea242e627..f1d1eb72fa76 100644
--- a/security/landlock/Makefile
+++ b/security/landlock/Makefile
@@ -1,4 +1,4 @@
 obj-$(CONFIG_SECURITY_LANDLOCK) := landlock.o
 
 landlock-y := setup.o object.o ruleset.o \
-   cred.o
+   cred.o ptrace.o
diff --git a/security/landlock/ptrace.c b/security/landlock/ptrace.c
new file mode 100644
index 000000000000..61df38b13f5c
--- /dev/null
+++ b/security/landlock/ptrace.c
@@ -0,0 +1,120 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Landlock LSM - Ptrace hooks
+ *
+ * Copyright © 2017-2020 Mickaël Salaün 
+ * Copyright © 2020 ANSSI
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "common.h"
+#include "cred.h"
+#include "ptrace.h"
+#include "ruleset.h"
+#include "setup.h"
+
+/**
+ * domain_scope_le - Checks domain ordering for scoped ptrace
+ *
+ * @parent: Parent domain.
+ * @child: Potential child of @parent.
+ *
+ * Checks if the @parent domain is less or equal to (i.e. an ancestor, which
+ * means a subset of) the @child domain.
+ */
+static bool domain_scope_le(const struct landlock_ruleset *const parent,
+   const struct landlock_ruleset *const child)
+{
+   const struct landlock_hierarchy *walker;
+
+   if (!parent)
+   return true;
+   if (!child)
+   return false;
+   for (walker = child->hierarchy; walker; walker = walker->parent) {
+   if (walker == parent->hierarchy)
+   /* @parent is in the scoped hierarchy of @child. */
+   return true;
+   }
+   /* There is no relationship between @parent and @child. */
+   return false;
+}
+
+static bool task_is_scoped(const struct task_struct *const parent,
+   const struct task_struct *const child)
+{
+   bool is_scoped;
+   const struct landlock_ruleset *dom_parent, *dom_child;
+
+   rcu_read_lock();
+   dom_parent = landlock_get_task_domain(parent);
+   dom_child = landlock_get_task_domain(child);
+   is_scoped = domain_scope_le(dom_parent, dom_child);
+   rcu_read_unlock();
+   return is_scoped;
+}
+
+static int task_ptrace(const struct task_struct *const parent,
+   const struct task_struct *const child)
+{
+   /* Quick return for non-landlocked tasks. */
+   if (!landlocked(parent))
+   return 0;
+   if (task_is_scoped(parent, child))
+   return 0;
+   return -EPERM;
+}
+
+/**
+ * hook_ptrace_access_check - Determines whether the current process may access
+ *   another
+ *
+ * @child: Process to be accessed.
+ * @mode: Mode of attachment.
+ *
+ * If the current task has Landlock rules, then the child must have at least
+ * the same rules.  Else denied.
+ *
+ * Determines whether a process may access another, returning 0 if permission
+ * granted, -errno if denied.
+ */
+static int hook_ptrace_access_check(struct task_struct *const child,
+   const unsigned int mode)
+{
+   return task_ptrace(current, child);
+}
+
+/**
+ * hook_ptrace_traceme - Determines whether another process may trace the
+ *  current one
+ *
+ * @parent: Task proposed to be the tracer.
+ *
+ * If the parent has Landlock rules, then the current task must have the same
+ * or more rules.  Else denied.
+ *
+ * Determines whether the nominated task is permitted to trace the current
+ * process, returning 0 i

[PATCH v21 07/12] landlock: Support filesystem access-control

2020-10-08 Thread Mickaël Salaün
From: Mickaël Salaün 

Thanks to the Landlock objects and ruleset, it is possible to identify
inodes according to a process's domain.  To enable an unprivileged
process to express a file hierarchy, it first needs to open a directory
(or a file) and pass this file descriptor to the kernel through
landlock_add_rule(2).  When checking if a file access request is
allowed, we walk from the requested dentry to the real root, following
the different mount layers.  The access rights of each "tagged" inode
are collected according to their rule layer level, and ANDed to compute
the access allowed to the requested file hierarchy.  This makes it
possible to identify a lot of files without tagging every inode or
modifying the filesystem, while still following the view and
understanding the user has of the filesystem.
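
The walk described above can be modelled in user space as follows.
This is a deliberately simplified sketch, not the kernel
implementation: the tree of struct node, the fixed number of layers and
the ACCESS_* bits are assumptions made for the example, and mount
points are ignored.  A request is granted only if every layer grants it
somewhere on the path from the target up to the root:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_LAYERS   2          /* hypothetical number of stacked layers */
#define ACCESS_READ  (1u << 0)  /* hypothetical access bits */
#define ACCESS_WRITE (1u << 1)

/* Stand-in for an inode in a file hierarchy (assumption). */
struct node {
	const struct node *parent;    /* NULL for the root */
	uint32_t allowed[NUM_LAYERS]; /* per-layer rule; 0 means untagged */
};

/* Walks from @target to the root and ANDs the per-layer results. */
static bool check_access(const struct node *target, uint32_t request)
{
	bool layer_ok[NUM_LAYERS] = { false };
	const struct node *walker;
	size_t i;

	for (walker = target; walker; walker = walker->parent) {
		for (i = 0; i < NUM_LAYERS; i++) {
			if ((walker->allowed[i] & request) == request)
				layer_ok[i] = true;
		}
	}
	/* Every layer must grant the request: one refusal denies it. */
	for (i = 0; i < NUM_LAYERS; i++) {
		if (!layer_ok[i])
			return false;
	}
	return true;
}

/* Usage: only two directories are tagged, yet files below are covered. */
void fs_walk_example(void)
{
	const struct node root = { NULL, { 0, 0 } };
	const struct node home = { &root,
		{ ACCESS_READ | ACCESS_WRITE, ACCESS_READ } };
	const struct node doc = { &home, { 0, 0 } };
	const struct node tmp = { &root,
		{ ACCESS_READ | ACCESS_WRITE, 0 } };

	assert(check_access(&doc, ACCESS_READ));   /* both layers allow */
	assert(!check_access(&doc, ACCESS_WRITE)); /* layer 1 refuses */
	assert(!check_access(&tmp, ACCESS_READ));  /* no rule in layer 1 */
	assert(!check_access(&root, ACCESS_READ)); /* untagged */
}
```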

Add a new ARCH_EPHEMERAL_STATES for UML because it currently does not
keep the same struct inode for a given inode while this inode is in
use.

This commit adds a minimal set of supported filesystem access-control
which does not make it possible to restrict all file-related actions.
This is the result of multiple discussions to minimize the code of
Landlock to ease review.  Thanks to the Landlock design, extending this
access-control without breaking user space will not be a problem.
Moreover, seccomp filters can be used to restrict the use of syscall
families which may not be currently handled by Landlock.

Cc: Al Viro 
Cc: Anton Ivanov 
Cc: James Morris 
Cc: Jann Horn 
Cc: Jeff Dike 
Cc: Kees Cook 
Cc: Richard Weinberger 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v19:
* Fix spelling (spotted by Randy Dunlap).

Changes since v18:
* Remove useless include.
* Fix spelling.

Changes since v17:
* Replace landlock_release_inodes() with security_sb_delete() (requested
  by James Morris).
* Replace struct super_block->s_landlock_inode_refs with the LSM
  infrastructure management of the superblock (requested by James
  Morris).
* Fix mknod restriction with a zero mode (spotted by Vincent Dagonneau).
* Minimize executed code in path_mknod and file_open hooks when the
  current task is not sandboxed.
* Remove useless checks on the file pointer and inode in
  hook_file_open() .
* Constify domain pointers.
* Rename inode_landlock() to landlock_inode().
* Import include/uapi/linux/landlock.h and _LANDLOCK_ACCESS_FS_* from
  the ruleset and domain management patch.
* Explain the rationale of this minimal set of access-control.
  https://lore.kernel.org/lkml/f646e1c7-33cf-333f-070c-0a40ad046...@digikod.net/

Changes since v16:
* Add ARCH_EPHEMERAL_STATES and enable it for UML.

Changes since v15:
* Replace layer_levels and layer_depth with a bitfield of layers: this
  makes it possible to properly manage supersets and subsets of access
  rights, whatever their order in the stack of layers.
  Cf. 
https://lore.kernel.org/lkml/e07fe473-1801-01cc-12ae-b3167f952...@digikod.net/
* Allow to open pipes and similar special files through /proc/self/fd/.
* Properly handle internal filesystems such as nsfs: always allow this
  kind of root because disconnected paths cannot be evaluated.
* Remove the LANDLOCK_ACCESS_FS_LINK_TO and
  LANDLOCK_ACCESS_FS_RENAME_{TO,FROM}, but use the
  LANDLOCK_ACCESS_FS_REMOVE_{FILE,DIR} and LANDLOCK_ACCESS_FS_MAKE_*
  instead.  Indeed, it is not possible for now (and not really useful)
  to express the semantic of a source and a destination.
* Check access rights to remove a directory or a file with rename(2).
* Forbid reparenting when linking or renaming.  This is needed to easily
  protect against possible privilege escalation by changing the place of
  a file or directory in relation to an enforced access policy (from the
  set of layers).  This will be relaxed in the future.
* Update hooks to take into account replacement of the object's self and
  beneath access bitfields with one.  Simplify the code.
* Check file related access rights.
* Check d_is_negative() instead of !d_backing_inode() in
  check_access_path_continue(), and continue the path walk while there
  is no mapped inode e.g., with rename(2).
* Check private inode in check_access_path().
* Optimize get_file_access() when dealing with a directory.
* Add missing atomic.h .

Changes since v14:
* Simplify the object, rule and ruleset management at the expense of a
  less aggressive memory freeing (contributed by Jann Horn, with
  additional modifications):
  - Rewrite release_inode() to use inode->sb->s_landlock_inode_refs.
  - Remove useless checks in landlock_release_inodes(), clean object
pointer according to the new struct landlock_object and wait for all
iput() to complete.
  - Rewrite get_inode_object() according to the new struct
landlock_object.  If there is a race-condition when cleaning up an
object, we retry until the concurrent thread finished the object
cleaning.
  Cf. 
https://lore.kernel.org/lkml/cag48ez21ben0wl1bbmtiiu8j9jp5iewthowz4turuj+ki0y...@mail.gmail.com/
* Fix nested doma

[PATCH v21 05/12] LSM: Infrastructure management of the superblock

2020-10-08 Thread Mickaël Salaün
From: Casey Schaufler 

Move management of the superblock->sb_security blob out of the
individual security modules and into the security infrastructure.
Instead of allocating the blobs from within the modules, the modules
tell the infrastructure how much space is required, and the space is
allocated there.
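
A user-space sketch of this scheme is shown below.  It is modelled on
the description above rather than copied from security/security.c:
every name here is chosen for the example, and calloc(3) stands in for
kzalloc().  Each module declares how much space it needs; the
infrastructure turns that size into an offset inside one composite blob
and allocates the total once:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Per-module request, and running total kept by the infrastructure. */
struct blob_sizes {
	int lbs_superblock;
};

/*
 * Reserves *need bytes in the composite blob.  On return, *need holds
 * the offset of the module's region and *total has grown accordingly.
 */
static void set_blob_size(int *need, int *total)
{
	int offset;

	if (*need > 0) {
		offset = *total;
		*total += *need;
		*need = offset;
	}
}

/* Allocates one zeroed blob for all modules, or NULL if none is needed. */
static void *superblock_blob_alloc(const struct blob_sizes *total)
{
	if (total->lbs_superblock == 0)
		return NULL;
	return calloc(1, (size_t)total->lbs_superblock);
}

/* Usage: two hypothetical modules asking for 16 and 24 bytes. */
void blob_example(void)
{
	struct blob_sizes total = { 0 };
	struct blob_sizes mod_a = { 16 };
	struct blob_sizes mod_b = { 24 };
	void *blob;

	set_blob_size(&mod_a.lbs_superblock, &total.lbs_superblock);
	set_blob_size(&mod_b.lbs_superblock, &total.lbs_superblock);
	assert(mod_a.lbs_superblock == 0);  /* offset of module A */
	assert(mod_b.lbs_superblock == 16); /* offset of module B */
	assert(total.lbs_superblock == 40); /* size of the whole blob */

	blob = superblock_blob_alloc(&total);
	assert(blob != NULL);
	free(blob);
}
```

After registration, a module finds its own data by adding its offset to
the blob pointer, so no module has to allocate or free anything itself.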

Cc: Kees Cook 
Cc: John Johansen 
Signed-off-by: Casey Schaufler 
Signed-off-by: Mickaël Salaün 
Reviewed-by: Stephen Smalley 
---

Changes since v20:
* Remove all Reviewed-by except Stephen Smalley:
  
https://lore.kernel.org/lkml/CAEjxPJ7ARJO57MBW66=xsBzMMRb=9ulgqock5eskhcaivmx...@mail.gmail.com/
* Cosmetic fix in the commit message.

Changes since v17:
* Rebase the original LSM stacking patch from v5.3 to v5.7: I fixed some
  diff conflicts caused by code moves and function renames in
  selinux/include/objsec.h and selinux/hooks.c .  I checked that it
  builds but I didn't test the changes for SELinux nor SMACK.
  https://lore.kernel.org/r/20190829232935.7099-2-ca...@schaufler-ca.com
---
 include/linux/lsm_hooks.h |  1 +
 security/security.c   | 46 
 security/selinux/hooks.c  | 58 ---
 security/selinux/include/objsec.h |  6 
 security/selinux/ss/services.c|  3 +-
 security/smack/smack.h|  6 
 security/smack/smack_lsm.c| 35 +--
 7 files changed, 85 insertions(+), 70 deletions(-)

diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 9e2e3e63719d..29df5075b35d 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1550,6 +1550,7 @@ struct lsm_blob_sizes {
int lbs_cred;
int lbs_file;
int lbs_inode;
+   int lbs_superblock;
int lbs_ipc;
int lbs_msg_msg;
int lbs_task;
diff --git a/security/security.c b/security/security.c
index 70a7ad357bc6..d60aa835b670 100644
--- a/security/security.c
+++ b/security/security.c
@@ -201,6 +201,7 @@ static void __init lsm_set_blob_sizes(struct lsm_blob_sizes *needed)
lsm_set_blob_size(&needed->lbs_inode, &blob_sizes.lbs_inode);
lsm_set_blob_size(&needed->lbs_ipc, &blob_sizes.lbs_ipc);
lsm_set_blob_size(&needed->lbs_msg_msg, &blob_sizes.lbs_msg_msg);
+   lsm_set_blob_size(&needed->lbs_superblock, &blob_sizes.lbs_superblock);
lsm_set_blob_size(&needed->lbs_task, &blob_sizes.lbs_task);
 }
 
@@ -331,12 +332,13 @@ static void __init ordered_lsm_init(void)
for (lsm = ordered_lsms; *lsm; lsm++)
prepare_lsm(*lsm);
 
-   init_debug("cred blob size = %d\n", blob_sizes.lbs_cred);
-   init_debug("file blob size = %d\n", blob_sizes.lbs_file);
-   init_debug("inode blob size= %d\n", blob_sizes.lbs_inode);
-   init_debug("ipc blob size  = %d\n", blob_sizes.lbs_ipc);
-   init_debug("msg_msg blob size  = %d\n", blob_sizes.lbs_msg_msg);
-   init_debug("task blob size = %d\n", blob_sizes.lbs_task);
+   init_debug("cred blob size   = %d\n", blob_sizes.lbs_cred);
+   init_debug("file blob size   = %d\n", blob_sizes.lbs_file);
+   init_debug("inode blob size  = %d\n", blob_sizes.lbs_inode);
+   init_debug("ipc blob size= %d\n", blob_sizes.lbs_ipc);
+   init_debug("msg_msg blob size= %d\n", blob_sizes.lbs_msg_msg);
+   init_debug("superblock blob size = %d\n", blob_sizes.lbs_superblock);
+   init_debug("task blob size   = %d\n", blob_sizes.lbs_task);
 
/*
 * Create any kmem_caches needed for blobs
@@ -668,6 +670,27 @@ static void __init lsm_early_task(struct task_struct *task)
panic("%s: Early task alloc failed.\n", __func__);
 }
 
+/**
+ * lsm_superblock_alloc - allocate a composite superblock blob
+ * @sb: the superblock that needs a blob
+ *
+ * Allocate the superblock blob for all the modules
+ *
+ * Returns 0, or -ENOMEM if memory can't be allocated.
+ */
+static int lsm_superblock_alloc(struct super_block *sb)
+{
+   if (blob_sizes.lbs_superblock == 0) {
+   sb->s_security = NULL;
+   return 0;
+   }
+
+   sb->s_security = kzalloc(blob_sizes.lbs_superblock, GFP_KERNEL);
+   if (sb->s_security == NULL)
+   return -ENOMEM;
+   return 0;
+}
+
 /*
  * The default value of the LSM hook is defined in linux/lsm_hook_defs.h and
  * can be accessed with:
@@ -865,12 +888,21 @@ int security_fs_context_parse_param(struct fs_context *fc, struct fs_parameter *
 
 int security_sb_alloc(struct super_block *sb)
 {
-   return call_int_hook(sb_alloc_security, 0, sb);
+   int rc = lsm_superblock_alloc(sb);
+
+   if (unlikely(rc))
+   return rc;
+   rc = call_int_hook(sb_alloc_security, 0, sb);
+   if (unlikely(rc))
+   security_sb_free(sb);
+   return rc;
 }
 
 

[PATCH v21 12/12] landlock: Add user and kernel documentation

2020-10-08 Thread Mickaël Salaün
From: Mickaël Salaün 

This documentation can be built with the Sphinx framework.

Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
Reviewed-by: Vincent Dagonneau 
---

Changes since v20:
* Update examples and documentation with the new syscalls.

Changes since v19:
* Update examples and documentation with the new syscalls.

Changes since v15:
* Add current limitations.

Changes since v14:
* Fix spelling (contributed by Randy Dunlap).
* Extend documentation about inheritance and explain layer levels.
* Remove the use of now-removed access rights.
* Use GitHub links.
* Improve kernel documentation.
* Add section for tests.
* Update example.

Changes since v13:
* Rewrote the documentation according to the major revamp.

Previous changes:
https://lore.kernel.org/lkml/20191104172146.30797-8-...@digikod.net/
---
 Documentation/security/index.rst   |   1 +
 Documentation/security/landlock/index.rst  |  18 ++
 Documentation/security/landlock/kernel.rst |  69 ++
 Documentation/security/landlock/user.rst   | 242 +
 4 files changed, 330 insertions(+)
 create mode 100644 Documentation/security/landlock/index.rst
 create mode 100644 Documentation/security/landlock/kernel.rst
 create mode 100644 Documentation/security/landlock/user.rst

diff --git a/Documentation/security/index.rst b/Documentation/security/index.rst
index 8129405eb2cc..e3f2bf4fef77 100644
--- a/Documentation/security/index.rst
+++ b/Documentation/security/index.rst
@@ -16,3 +16,4 @@ Security Documentation
siphash
tpm/index
digsig
+   landlock/index
diff --git a/Documentation/security/landlock/index.rst b/Documentation/security/landlock/index.rst
new file mode 100644
index 000000000000..2520f8f33f5e
--- /dev/null
+++ b/Documentation/security/landlock/index.rst
@@ -0,0 +1,18 @@
+=========================================
+Landlock LSM: unprivileged access control
+=========================================
+
+:Author: Mickaël Salaün
+
+The goal of Landlock is to enable to restrict ambient rights (e.g.  global
+filesystem access) for a set of processes.  Because Landlock is a stackable
+LSM, it makes possible to create safe security sandboxes as new security layers
+in addition to the existing system-wide access-controls. This kind of sandbox
+is expected to help mitigate the security impact of bugs or
+unexpected/malicious behaviors in user-space applications. Landlock empowers
+any process, including unprivileged ones, to securely restrict themselves.
+
+.. toctree::
+
+user
+kernel
diff --git a/Documentation/security/landlock/kernel.rst b/Documentation/security/landlock/kernel.rst
new file mode 100644
index 000000000000..27c0933a0b6e
--- /dev/null
+++ b/Documentation/security/landlock/kernel.rst
@@ -0,0 +1,69 @@
+==============================
+Landlock: kernel documentation
+==============================
+
+Landlock's goal is to create scoped access-control (i.e. sandboxing).  To
+harden a whole system, this feature should be available to any process,
+including unprivileged ones.  Because such process may be compromised or
+backdoored (i.e. untrusted), Landlock's features must be safe to use from the
+kernel and other processes point of view.  Landlock's interface must therefore
+expose a minimal attack surface.
+
+Landlock is designed to be usable by unprivileged processes while following the
+system security policy enforced by other access control mechanisms (e.g. DAC,
+LSM).  Indeed, a Landlock rule shall not interfere with other access-controls
+enforced on the system, only add more restrictions.
+
+Any user can enforce Landlock rulesets on their processes.  They are merged and
+evaluated according to the inherited ones in a way that ensures that only more
+constraints can be added.
+
+Guiding principles for safe access controls
+===========================================
+
+* A Landlock rule shall be focused on access control on kernel objects instead
+  of syscall filtering (i.e. syscall arguments), which is the purpose of
+  seccomp-bpf.
+* To avoid multiple kinds of side-channel attacks (e.g. leak of security
+  policies, CPU-based attacks), Landlock rules shall not be able to
+  programmatically communicate with user space.
+* Kernel access check shall not slow down access request from unsandboxed
+  processes.
+* Computation related to Landlock operations (e.g. enforcing a ruleset) shall
+  only impact the processes requesting them.
+
+Tests
+=====
+
+Userspace tests for backward compatibility, ptrace restrictions and filesystem
+support can be found here: `tools/testing/selftests/landlock/`_.
+
+Kernel structures
+=================
+
+Object
+------
+
+.. kernel-doc:: security/landlock/object.h
+:identifiers:
+
+Ruleset and domain
+------------------
+
+A domain is a read-only ruleset tied to a set of subjects (i.e. tasks'
+credentials).  Each time a ruleset is enforced on a task, the current domain is
+duplicated and the ruleset

[PATCH v21 10/12] selftests/landlock: Add initial tests

2020-10-08 Thread Mickaël Salaün
From: Mickaël Salaün 

Test all Landlock system calls, ptrace hook semantics and filesystem
access-control.

Test coverage for security/landlock/ is 95.4% of lines.  The code not
covered only deals with internal kernel errors (e.g. memory allocation)
and race conditions.

Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Cc: Shuah Khan 
Signed-off-by: Mickaël Salaün 
Reviewed-by: Vincent Dagonneau 
---

Changes since v20:
* Update with new syscalls and type names.
* Use the full syscall interfaces: explicitly set the "flags" field to
  zero.
* Update the empty_path_beneath_attr test to check for EFAULT.
* Update and merge tests for the simplified copy_min_struct_from_user().
* Clean up makefile.
* Rename some types and variables in a more consistent way.

Changes since v19:
* Update with the new Landlock syscalls.
* Fix device creation.
* Check the new landlock_attr_features members: last_rule_type and
  last_target_type .
* Constify variables.

Changes since v18:
* Replace ruleset_rw.inval with layout1.inval to avoid a nonexistent
  test layout.
* Use the new FIXTURE_VARIANT for ptrace_test: makes the tests more
  readable and usable.
* Add ARRAY_SIZE() macro to please checkpatch.

Changes since v17:
* Add new test for mknod with a zero mode.
* Use memset(3) to initialize attr_features in base_test.

Changes since v16:
* Add new unpriv_enforce_without_no_new_privs test: check that ruleset
  enforcing is forbidden without no_new_privs and CAP_SYS_ADMIN.
* Drop capabilities when useful.
* Check the new size_attr_features field from struct
  landlock_attr_features.
* Update the empty_or_same_ruleset test to check complementary empty
  ruleset.
* Update base_test according to the new attribute structures and fix the
  inconsistent_attr test accordingly.
* Switch syscall attribute pointer and size arguments.
* Rename test files with a "_test" suffix.

Changes since v14:
* Add new tests:
  - superset: check new layer bitmask.
  - max_layers: check maximum number of layers.
  - release_inodes: check that umount works well.
  - empty_or_same_ruleset.
  - inconsistent_attr: checks copy_to_user limits.
  - in ruleset_rw.inval to check ruleset FD.
  - proc_unlinked_file: check file access through /proc/self/fd .
  - file_access_rights: check that a file can only get consistent access
rights.
  - unpriv: check that NO_NEW_PRIVS or CAP_SYS_ADMIN is required.
  - check pipe access through /proc/self/fd .
  - check move_mount(2).
  - check ruleset file descriptor properties.
  - proc_nsfs: extend to check that internal filesystems (e.g. nsfs) are
allowed.
* Double-check read and write effective actions.
* Fix potential desynchronization between the kernel sources and
  installed headers by overriding the build step in the Makefile.  This
  also makes it possible to build with Clang.
* Add two files in the test directories (for link test and rename test).
* Remove test for ruleset's show_fdinfo().
* Replace EBADR with EBADFD.
* Update tests according to the changes of rename and link rights.
* Fix (now) illegal access rights tied to files.
* Update rename and link tests.
* Remove superfluous '\n' in TH_LOG() calls.
* Make assert calls consistent and readable.
* Fix the execute test.
* Make tests future-proof.
* Cosmetic fixes.

Changes since v14:
* Add new tests:
  - Compatibility: empty_attr_{ruleset,path_beneath,enforce} to check
minimal attr size.
  - Access types: link_to, rename_from, rename_to, rmdir, unlink,
make_char, make_block, make_reg, make_sock, make_fifo, make_sym,
make_dir, chroot, execute.
  - Test privilege escalation prevention by enforcing a nested rule, on
a parent directory, with less restrictions than one on a child
directory.
  - Test for empty and more than 32-bits allowed_access
* Merge the two test mount hierarchies.
* Complete relative path tests by combining chdir and chroot.
* Adjust tests:
  - Remove the layout1/extend_ruleset_with_denied_path test.
  - Extend layout1/whitelist test with checks on file.
  - Add and use create_dir_and_file().
* Only use read/write checks but not stat(2) for tests.
* Rename test.h to common.h and improve it.
* Rename path names to make them more consistent and easier to
  understand, and put them in a common directory.
* Make create_ruleset() more generic.
* Constify variables.
* Re-add static global variables.
* Remove useless openat(2).
* Fix and complete kernel config.
* Set umask and clean up file modes.
* Clean up open flags.
* Improve Makefile.
* Fix spelling.
* Improve comments and error messages.

Changes since v13:
* Add back the filesystem tests (from v10) and extend them.
* Add tests for the new syscall.

Previous changes:
https://lore.kernel.org/lkml/20191104172146.30797-7-...@digikod.net/
---
 tools/testing/selftests/Makefile  |1 +
 tools/testing/selftests/landlock/.gitignore   |2 +
 tools/testing/selftests/landlock/Makefile |   24 +
 tools/testing/selftests/landlock

[PATCH v21 00/12] Landlock LSM

2020-10-08 Thread Mickaël Salaün
Hi,

This new patch series mainly simplifies the syscalls thanks to Arnd
Bergmann at LPC.

The SLOC count is 1188 for security/landlock/ and 1673 for
tools/testing/selftests/landlock/ .  Test coverage for security/landlock/
is 95.4% of lines.  The code not covered only deals with internal kernel
errors (e.g. memory allocation) and race conditions.

The compiled documentation is available here:
https://landlock.io/linux-doc/landlock-v21/security/landlock/index.html

This series can be applied on top of v5.9-rc8 .  This can be tested with
CONFIG_SECURITY_LANDLOCK and CONFIG_SAMPLE_LANDLOCK.  This patch series
can be found in a Git repository here:
https://github.com/landlock-lsm/linux/commits/landlock-v21
I would really appreciate constructive comments on this patch series.


# Landlock LSM

The goal of Landlock is to make it possible to restrict ambient rights
(e.g. global filesystem access) for a set of processes.  Because
Landlock is a stackable LSM [1], it makes it possible to create safe
security sandboxes as new security layers in addition to the existing
system-wide access-controls. This kind of sandbox is expected to help
mitigate the security impact of bugs or unexpected/malicious behaviors
in user-space applications. Landlock empowers any process, including
unprivileged ones, to securely restrict themselves.

Landlock is inspired by seccomp-bpf but instead of filtering syscalls
and their raw arguments, a Landlock rule can restrict the use of kernel
objects like file hierarchies, according to the kernel semantic.
Landlock also takes inspiration from other OS sandbox mechanisms: XNU
Sandbox, FreeBSD Capsicum or OpenBSD Pledge/Unveil.

In its current form, Landlock lacks some access-control features.
This keeps this patch series minimal and eases review.  This series
still addresses multiple use cases, especially with the combined use of
seccomp-bpf: applications with built-in sandboxing, init systems,
security sandbox tools and security-oriented APIs [2].

Previous version:
https://lore.kernel.org/lkml/20200802215903.91936-1-...@digikod.net/

[1] 
https://lore.kernel.org/lkml/50db058a-7dde-441b-a7f9-f6837fe8b...@schaufler-ca.com/
[2] 
https://lore.kernel.org/lkml/f646e1c7-33cf-333f-070c-0a40ad046...@digikod.net/


Casey Schaufler (1):
  LSM: Infrastructure management of the superblock

Mickaël Salaün (11):
  landlock: Add object management
  landlock: Add ruleset and domain management
  landlock: Set up the security framework and manage credentials
  landlock: Add ptrace restrictions
  fs,security: Add sb_delete hook
  landlock: Support filesystem access-control
  landlock: Add syscall implementations
  arch: Wire up Landlock syscalls
  selftests/landlock: Add initial tests
  samples/landlock: Add a sandbox manager example
  landlock: Add user and kernel documentation

 Documentation/security/index.rst  |1 +
 Documentation/security/landlock/index.rst |   18 +
 Documentation/security/landlock/kernel.rst|   69 +
 Documentation/security/landlock/user.rst  |  242 +++
 MAINTAINERS   |   11 +
 arch/Kconfig  |7 +
 arch/alpha/kernel/syscalls/syscall.tbl|3 +
 arch/arm/tools/syscall.tbl|3 +
 arch/arm64/include/asm/unistd.h   |2 +-
 arch/arm64/include/asm/unistd32.h |6 +
 arch/ia64/kernel/syscalls/syscall.tbl |3 +
 arch/m68k/kernel/syscalls/syscall.tbl |3 +
 arch/microblaze/kernel/syscalls/syscall.tbl   |3 +
 arch/mips/kernel/syscalls/syscall_n32.tbl |3 +
 arch/mips/kernel/syscalls/syscall_n64.tbl |3 +
 arch/mips/kernel/syscalls/syscall_o32.tbl |3 +
 arch/parisc/kernel/syscalls/syscall.tbl   |3 +
 arch/powerpc/kernel/syscalls/syscall.tbl  |3 +
 arch/s390/kernel/syscalls/syscall.tbl |3 +
 arch/sh/kernel/syscalls/syscall.tbl   |3 +
 arch/sparc/kernel/syscalls/syscall.tbl|3 +
 arch/um/Kconfig   |1 +
 arch/x86/entry/syscalls/syscall_32.tbl|3 +
 arch/x86/entry/syscalls/syscall_64.tbl|3 +
 arch/xtensa/kernel/syscalls/syscall.tbl   |3 +
 fs/super.c|1 +
 include/linux/lsm_hook_defs.h |1 +
 include/linux/lsm_hooks.h |3 +
 include/linux/security.h  |4 +
 include/linux/syscalls.h  |7 +
 include/uapi/asm-generic/unistd.h |8 +-
 include/uapi/linux/landlock.h |  131 ++
 kernel/sys_ni.c   |5 +
 samples/Kconfig   |7 +
 samples/Makefile  |1 +
 samples/landlock/.gitignore   |1 +
 samples/landlock/Makefile |   15 +
 samples/landlock/sandboxer.c  |  220 +++
 security/Kconfig

[PATCH v21 01/12] landlock: Add object management

2020-10-08 Thread Mickaël Salaün
From: Mickaël Salaün 

A Landlock object makes it possible to identify a kernel object (e.g.
an inode).  A Landlock rule is a set of access rights allowed on an
object.  Rules are grouped in rulesets that may be tied to a set of
processes (i.e. subjects) to enforce a scoped access-control (i.e. a
domain).

Because Landlock's goal is to empower any process (especially
unprivileged ones) to sandbox themselves, we can't rely on a system-wide
object identification such as file extended attributes.  Indeed, we need
innocuous, composable and modular access-controls.

The main challenge with these constraints is to identify kernel objects
while this identification is useful (i.e. when a security policy makes
use of this object).  But this identification data should be freed once
no policy is using it.  This ephemeral tagging should not and may not be
written in the filesystem.  We then need to manage the lifetime of a
rule according to the lifetime of its object.  To avoid a global lock,
this implementation makes use of RCU and counters to safely reference
objects.
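
As a rough illustration only (a user-space sketch with C11 atomics, not
the kernel code from this patch), the counter part of this scheme can be
pictured as a reference-counted object whose type-specific release()
operation runs when the last reference is dropped.  All sketch_* names
are hypothetical, and RCU is deliberately not modeled here:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct sketch_object;

/* Per-type operations; only release() is modeled in this sketch. */
struct sketch_underops {
	void (*release)(struct sketch_object *obj);
};

struct sketch_object {
	atomic_uint usage;			/* number of live references */
	const struct sketch_underops *underops;
	void *underobj;				/* stand-in for e.g. an inode */
};

/* Allocates an object holding one initial reference. */
static struct sketch_object *
sketch_create(const struct sketch_underops *ops, void *underobj)
{
	struct sketch_object *obj = calloc(1, sizeof(*obj));

	if (!obj)
		return NULL;
	atomic_init(&obj->usage, 1);
	obj->underops = ops;
	obj->underobj = underobj;
	return obj;
}

/* Takes an additional reference. */
static void sketch_get(struct sketch_object *obj)
{
	atomic_fetch_add(&obj->usage, 1);
}

/* Drops a reference; returns true if it was the last one. */
static bool sketch_put(struct sketch_object *obj)
{
	/* atomic_fetch_sub() returns the value held before the decrement. */
	if (atomic_fetch_sub(&obj->usage, 1) != 1)
		return false;
	if (obj->underops && obj->underops->release)
		obj->underops->release(obj);
	free(obj);
	return true;
}

/* Hypothetical release operation that only counts its invocations. */
static int sketch_release_count;

static void sketch_count_release(struct sketch_object *obj)
{
	(void)obj;
	sketch_release_count++;
}

static const struct sketch_underops sketch_counting_ops = {
	.release = sketch_count_release,
};
```

A caller would pair every sketch_get() with a sketch_put(); the
underlying resource is then released exactly once, by whoever drops the
last reference.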

A following commit uses this generic object management for inodes.

Cc: James Morris 
Cc: Jann Horn 
Cc: Kees Cook 
Cc: Serge E. Hallyn 
Signed-off-by: Mickaël Salaün 
---

Changes since v18:
* Account objects to kmemcg.

Changes since v14:
* Simplify the object, rule and ruleset management at the expense of a
  less aggressive memory freeing (contributed by Jann Horn, with
  additional modifications):
  - Remove object->list aggregating the rules tied to an object.
  - Remove landlock_get_object(), landlock_drop_object(),
    {get,put}_object_cleaner() and landlock_rule_is_disabled().
  - Rewrite landlock_put_object() to use a simpler mechanism
    (no tricky RCU).
  - Replace enum landlock_object_type and landlock_release_object() with
    landlock_object_underops->release()
  - Adjust unions and Sparse annotations.
  Cf. 
https://lore.kernel.org/lkml/cag48ez21ben0wl1bbmtiiu8j9jp5iewthowz4turuj+ki0y...@mail.gmail.com/
* Merge struct landlock_rule into landlock_ruleset_elem to simplify the
  rule management.
* Constify variables.
* Improve kernel documentation.
* Cosmetic variable renames.
* Remove the "default" in the Kconfig (suggested by Jann Horn).
* Only use refcount_inc() through getter helpers.
* Update Kconfig description.

Changes since v13:
* New dedicated implementation, removing the need for eBPF.

Previous changes:
https://lore.kernel.org/lkml/20190721213116.23476-6-...@digikod.net/
---
 MAINTAINERS| 10 +
 security/Kconfig   |  1 +
 security/Makefile  |  2 +
 security/landlock/Kconfig  | 18 
 security/landlock/Makefile |  3 ++
 security/landlock/object.c | 66 +++
 security/landlock/object.h | 91 ++
 7 files changed, 191 insertions(+)
 create mode 100644 security/landlock/Kconfig
 create mode 100644 security/landlock/Makefile
 create mode 100644 security/landlock/object.c
 create mode 100644 security/landlock/object.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 33b27e62ce19..40b0ad2b101e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9763,6 +9763,16 @@ F:   net/core/sock_map.c
 F: net/ipv4/tcp_bpf.c
 F: net/ipv4/udp_bpf.c
 
+LANDLOCK SECURITY MODULE
+M:     Mickaël Salaün 
+L: linux-security-mod...@vger.kernel.org
+S: Supported
+W: https://landlock.io
+T: git https://github.com/landlock-lsm/linux.git
+F: security/landlock/
+K: landlock
+K: LANDLOCK
+
 LANTIQ / INTEL Ethernet drivers
 M: Hauke Mehrtens 
 L: net...@vger.kernel.org
diff --git a/security/Kconfig b/security/Kconfig
index 7561f6f99f1d..15a4342b5d01 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -238,6 +238,7 @@ source "security/loadpin/Kconfig"
 source "security/yama/Kconfig"
 source "security/safesetid/Kconfig"
 source "security/lockdown/Kconfig"
+source "security/landlock/Kconfig"
 
 source "security/integrity/Kconfig"
 
diff --git a/security/Makefile b/security/Makefile
index 3baf435de541..c688f4907a1b 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -13,6 +13,7 @@ subdir-$(CONFIG_SECURITY_LOADPIN) += loadpin
 subdir-$(CONFIG_SECURITY_SAFESETID)+= safesetid
 subdir-$(CONFIG_SECURITY_LOCKDOWN_LSM) += lockdown
 subdir-$(CONFIG_BPF_LSM)   += bpf
+subdir-$(CONFIG_SECURITY_LANDLOCK) += landlock
 
 # always enable default capabilities
 obj-y  += commoncap.o
@@ -32,6 +33,7 @@ obj-$(CONFIG_SECURITY_SAFESETID)   += safesetid/
 obj-$(CONFIG_SECURITY_LOCKDOWN_LSM)+= lockdown/
 obj-$(CONFIG_CGROUPS)  += device_cgroup.o
 obj-$(CONFIG_BPF_LSM)  += bpf/
+obj-$(CONFIG_SECURITY_LANDLOCK)+= landlock/
 
 # Object integrity file lists
 subdir-$(CONFIG_INTEGRITY) += integrity
diff --git a/security/landlock/Kconfig b/security/landlock

[PATCH v1] dm verity: Add support for signature verification with 2nd keyring

2020-10-02 Thread Mickaël Salaün
From: Mickaël Salaün 

Add a new DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING configuration
to enable dm-verity signatures to be verified against the secondary
trusted keyring.  This allows certificate updates without kernel update
and reboot, aligning with module and kernel (kexec) signature
verifications.
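
As a hedged sketch of the same selection logic (not part of this patch),
the keyring choice can be read as a small helper.  The helper name
sketch_verity_trusted_keys() is hypothetical, struct key is left opaque,
and the sentinel definition is copied here only so that the fragment
builds outside the kernel tree:

```c
#include <stddef.h>

struct key;	/* opaque stand-in for the kernel's struct key */

/* Sentinel as defined in include/linux/verification.h. */
#define VERIFY_USE_SECONDARY_KEYRING ((struct key *)1UL)

/*
 * Returns the trusted_keys argument to hand to verify_pkcs7_signature():
 * NULL selects the builtin trusted keyring only, whereas the sentinel
 * also accepts keys from the secondary trusted keyring.
 */
static inline struct key *sketch_verity_trusted_keys(void)
{
#ifdef CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING
	return VERIFY_USE_SECONDARY_KEYRING;
#else
	return NULL;
#endif
}
```

Built without the new option, the helper returns NULL, which matches the
behavior before this patch.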

Signed-off-by: Mickaël Salaün 
Cc: Jaskaran Khurana 
Cc: Mike Snitzer 
Cc: Milan Broz 
---
 drivers/md/Kconfig| 13 -
 drivers/md/dm-verity-verify-sig.c |  9 +++--
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig
index 30ba3573626c..63870fdfe8ce 100644
--- a/drivers/md/Kconfig
+++ b/drivers/md/Kconfig
@@ -530,11 +530,22 @@ config DM_VERITY_VERIFY_ROOTHASH_SIG
bool "Verity data device root hash signature verification support"
depends on DM_VERITY
select SYSTEM_DATA_VERIFICATION
- help
+   ---help---
  Add ability for dm-verity device to be validated if the
  pre-generated tree of cryptographic checksums passed has a pkcs#7
  signature file that can validate the roothash of the tree.
 
+ By default, rely on the builtin trusted keyring.
+
+ If unsure, say N.
+
+config DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING
+   bool "Verity data device root hash signature verification with secondary keyring"
+   depends on DM_VERITY_VERIFY_ROOTHASH_SIG
+   depends on SECONDARY_TRUSTED_KEYRING
+   ---help---
+ Rely on the secondary trusted keyring to verify dm-verity signatures.
+
  If unsure, say N.
 
 config DM_VERITY_FEC
diff --git a/drivers/md/dm-verity-verify-sig.c b/drivers/md/dm-verity-verify-sig.c
index 614e43db93aa..29385dc470d5 100644
--- a/drivers/md/dm-verity-verify-sig.c
+++ b/drivers/md/dm-verity-verify-sig.c
@@ -119,8 +119,13 @@ int verity_verify_root_hash(const void *root_hash, size_t root_hash_len,
}
 
ret = verify_pkcs7_signature(root_hash, root_hash_len, sig_data,
-   sig_len, NULL, VERIFYING_UNSPECIFIED_SIGNATURE,
-   NULL, NULL);
+   sig_len,
+#ifdef CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING
+   VERIFY_USE_SECONDARY_KEYRING,
+#else
+   NULL,
+#endif
+   VERIFYING_UNSPECIFIED_SIGNATURE, NULL, NULL);
 
return ret;
 }
-- 
2.28.0



Re: [PATCH v11 2/3] arch: Wire up trusted_for(2)

2020-10-01 Thread Mickaël Salaün


On 01/10/2020 21:33, Tycho Andersen wrote:
> On Thu, Oct 01, 2020 at 07:02:31PM +0200, Mickaël Salaün wrote:
>> --- a/include/uapi/asm-generic/unistd.h
>> +++ b/include/uapi/asm-generic/unistd.h
>> @@ -859,9 +859,11 @@ __SYSCALL(__NR_openat2, sys_openat2)
>>  __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
>>  #define __NR_faccessat2 439
>>  __SYSCALL(__NR_faccessat2, sys_faccessat2)
>> +#define __NR_trusted_for 443
>> +__SYSCALL(__NR_trusted_for, sys_trusted_for)
>>  
>>  #undef __NR_syscalls
>> -#define __NR_syscalls 440
>> +#define __NR_syscalls 444
> 
> Looks like a rebase problem here?

No, it is a synchronization with the -next tree (cf. changelog) as asked
(and acked for a previous version) by Arnd.
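
The numbering in the quoted hunk can be checked with a little
arithmetic.  The following is only a sketch with hypothetical SKETCH_*
names; the values are the ones visible in the hunk above:

```c
/* Values taken from the quoted asm-generic/unistd.h hunk. */
#define SKETCH_NR_faccessat2	439	/* last syscall before the gap */
#define SKETCH_NR_trusted_for	443	/* number chosen by this series */
#define SKETCH_NR_syscalls	444	/* new table size */

/* The table size is always the highest syscall number plus one. */
_Static_assert(SKETCH_NR_syscalls == SKETCH_NR_trusted_for + 1,
	       "table size must follow the last syscall number");

/* Count of numbers left unused between faccessat2 and trusted_for. */
static int sketch_reserved_gap(void)
{
	return SKETCH_NR_trusted_for - SKETCH_NR_faccessat2 - 1;
}
```

Three numbers (440 to 442) are left unused for syscalls queued in -next,
which is why the jump from 440 to 444 is intentional.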


[PATCH v11 3/3] selftest/interpreter: Add tests for trusted_for(2) policies

2020-10-01 Thread Mickaël Salaün
From: Mickaël Salaün 

Test that checks performed by trusted_for(2) on file descriptors are
consistent with noexec mount points and file execute permissions,
according to the policy configured with the fs.trust_policy sysctl.

Signed-off-by: Mickaël Salaün 
Reviewed-by: Thibaut Sautereau 
Cc: Al Viro 
Cc: Arnd Bergmann 
Cc: Andrew Morton 
Cc: Kees Cook 
Cc: Shuah Khan 
Cc: Vincent Strubel 
---

Changes since v10:
* Update selftest Makefile.

Changes since v9:
* Rename the syscall and the sysctl.
* Update tests for enum trusted_for_usage

Changes since v8:
* Update with the dedicated syscall introspect_access(2) and the renamed
  fs.introspection_policy sysctl.
* Remove check symlink which can't be used as is anymore.
* Use socketpair(2) to test UNIX socket.

Changes since v7:
* Update tests with faccessat2/AT_INTERPRETED, including new ones to
  check that setting R_OK or W_OK returns EINVAL.
* Add tests for memfd, pipefs and nsfs.
* Rename and move back tests to a standalone directory.

Changes since v6:
* Add full combination tests for all file types, including block
  devices, character devices, fifos, sockets and symlinks.
* Properly save and restore initial sysctl value for all tests.

Changes since v5:
* Refactor with FIXTURE_VARIANT, which makes the tests much easier to
  read and maintain.
* Save and restore initial sysctl value (suggested by Kees Cook).
* Test with a sysctl value of 0.
* Check errno in sysctl_access_write test.
* Update tests for the CAP_SYS_ADMIN switch.
* Update tests to check -EISDIR (replacing -EACCES).
* Replace FIXTURE_DATA() with FIXTURE() (spotted by Kees Cook).
* Use global const strings.

Changes since v3:
* Replace RESOLVE_MAYEXEC with O_MAYEXEC.
* Add tests to check that O_MAYEXEC is ignored by open(2) and openat(2).

Changes since v2:
* Move tests from exec/ to openat2/ .
* Replace O_MAYEXEC with RESOLVE_MAYEXEC from openat2(2).
* Cleanup tests.

Changes since v1:
* Move tests from yama/ to exec/ .
* Fix _GNU_SOURCE in kselftest_harness.h .
* Add a new test sysctl_access_write to check if CAP_MAC_ADMIN is taken
  into account.
* Test directory execution which is always forbidden since commit
  73601ea5b7b1 ("fs/open.c: allow opening only regular files during
  execve()"), and also check that even the root user can not bypass file
  execution checks.
* Make sure delete_workspace() always has enough rights to succeed.
* Cosmetic cleanup.
---
 tools/testing/selftests/Makefile  |   1 +
 .../testing/selftests/interpreter/.gitignore  |   2 +
 tools/testing/selftests/interpreter/Makefile  |  21 +
 tools/testing/selftests/interpreter/config|   1 +
 .../selftests/interpreter/trust_policy_test.c | 362 ++
 5 files changed, 387 insertions(+)
 create mode 100644 tools/testing/selftests/interpreter/.gitignore
 create mode 100644 tools/testing/selftests/interpreter/Makefile
 create mode 100644 tools/testing/selftests/interpreter/config
 create mode 100644 tools/testing/selftests/interpreter/trust_policy_test.c

diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 9018f45d631d..5a7cf8dd7ce2 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -21,6 +21,7 @@ TARGETS += ftrace
 TARGETS += futex
 TARGETS += gpio
 TARGETS += intel_pstate
+TARGETS += interpreter
 TARGETS += ipc
 TARGETS += ir
 TARGETS += kcmp
diff --git a/tools/testing/selftests/interpreter/.gitignore b/tools/testing/selftests/interpreter/.gitignore
new file mode 100644
index ..82a4846cbc4b
--- /dev/null
+++ b/tools/testing/selftests/interpreter/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+/*_test
diff --git a/tools/testing/selftests/interpreter/Makefile b/tools/testing/selftests/interpreter/Makefile
new file mode 100644
index ..dbca8ebda67e
--- /dev/null
+++ b/tools/testing/selftests/interpreter/Makefile
@@ -0,0 +1,21 @@
+# SPDX-License-Identifier: GPL-2.0-or-later
+
+CFLAGS += -Wall -O2
+LDLIBS += -lcap
+
+src_test := $(wildcard *_test.c)
+TEST_GEN_PROGS := $(src_test:.c=)
+
+KSFT_KHDR_INSTALL := 1
+include ../lib.mk
+
+khdr_dir = $(top_srcdir)/usr/include
+
+$(khdr_dir)/asm-generic/unistd.h: khdr
+   @:
+
+$(khdr_dir)/linux/trusted-for.h: khdr
+   @:
+
+$(OUTPUT)/%_test: %_test.c $(khdr_dir)/asm-generic/unistd.h $(khdr_dir)/linux/trusted-for.h ../kselftest_harness.h
+   $(LINK.c) $< $(LDLIBS) -o $@ -I$(khdr_dir)
diff --git a/tools/testing/selftests/interpreter/config b/tools/testing/selftests/interpreter/config
new file mode 100644
index ..dd53c266bf52
--- /dev/null
+++ b/tools/testing/selftests/interpreter/config
@@ -0,0 +1 @@
+CONFIG_SYSCTL=y
diff --git a/tools/testing/selftests/interpreter/trust_policy_test.c b/tools/testing/selftests/interpreter/trust_policy_test.c
new file mode 100644
index ..4818c5524ec0
--- /dev/null
+++ b/tools/testing/selftests/interpreter/trust_policy_test.c
@@ -0,0 +1,362 @@
+// SPDX-Licen

[PATCH v11 0/3] Add trusted_for(2) (was O_MAYEXEC)

2020-10-01 Thread Mickaël Salaün
Hi,

This eleventh patch series brings small fixes.

Andrew, should this be merged with your tree?

The final goal of this patch series is to enable the kernel to be a
global policy manager by entrusting processes with access control at
their level.  To reach this goal, two complementary parts are required:
* user space needs to be able to know if it can trust some file
  descriptor content for a specific usage;
* and the kernel needs to make available some part of the policy
  configured by the system administrator.

Primary goal of trusted_for(2)
==============================

This new syscall enables user space to ask the kernel: is this file
descriptor's content trusted to be used for this purpose?  The set of
usages currently only contains "execution", but others may follow (e.g.
"configuration", "sensitive_data").  If the kernel identifies the file
descriptor as trustworthy for this usage, user space should then take
this information into account.  The "execution" usage means that the
content of the file descriptor is trusted according to the system policy
to be executed by user space, which means that it interprets the content
or (tries to) map it as executable memory.
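
To make the question above concrete, here is a minimal user-space
sketch of how an interpreter might ask it.  Several things are
assumptions rather than facts from this cover letter: the argument
order (fd, usage, flags), the value 1 for the "execution" usage, and the
choice to treat a kernel without the syscall (ENOSYS) as "no policy to
enforce".  The sketch_* names are hypothetical.  On headers that do not
define __NR_trusted_for, the wrapper never enters the kernel:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdbool.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed value of the "execution" usage from the proposed UAPI header. */
#define SKETCH_TRUSTED_FOR_EXECUTION 1

/* Thin wrapper; fails with ENOSYS when the headers lack the syscall. */
static int sketch_trusted_for(int fd, int usage, unsigned int flags)
{
#ifdef __NR_trusted_for
	return (int)syscall(__NR_trusted_for, fd, usage, flags);
#else
	(void)fd;
	(void)usage;
	(void)flags;
	errno = ENOSYS;
	return -1;
#endif
}

/*
 * Best-effort check before interpreting the content behind fd: deny
 * only when the kernel explicitly refuses, and keep working on kernels
 * that do not know the syscall.
 */
static bool sketch_may_interpret(int fd)
{
	if (sketch_trusted_for(fd, SKETCH_TRUSTED_FOR_EXECUTION, 0) == 0)
		return true;
	return errno == ENOSYS;
}
```

An interpreter would call sketch_may_interpret() on the already opened
script file descriptor, before reading any command from it, so that the
check and the read apply to the same file.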

A simple system-wide security policy can be enforced by the system
administrator through a sysctl configuration consistent with the mount
points or the file access rights.  The documentation patch explains the
prerequisites.

It is important to note that this can only extend the access control
managed by the kernel.  Hence it enables current access control
mechanisms to be extended and become a superset of what they can
currently control.  Indeed, the security policy could also be delegated
to an LSM, either a MAC system or an integrity system.  For instance,
this is required to close a major IMA measurement/appraisal interpreter
integrity gap by bringing the ability to check the use of scripts [1].
Other uses are expected, such as for magic-links [2], SGX integration
[3], bpffs [4].

Complementary W^X protections can be brought by SELinux, IPE [5] and
trampfd [6].

Prerequisite of its use
=======================

User space needs to adapt to take advantage of this new feature.  For
example, the PEP 578 [7] (Runtime Audit Hooks) enables Python 3.8 to be
extended with policy enforcement points related to code interpretation,
which can be used to align with the PowerShell audit features.
Additional Python security improvements (e.g. a limited interpreter
without -c, stdin piping of code) are on their way [8].

Examples
========

The initial idea comes from CLIP OS 4 and the original implementation
has been used for more than 12 years:
https://github.com/clipos-archive/clipos4_doc
Chrome OS has a similar approach:
https://chromium.googlesource.com/chromiumos/docs/+/master/security/noexec_shell_scripts.md

Userland patches can be found here:
https://github.com/clipos-archive/clipos4_portage-overlay/search?q=O_MAYEXEC
Actually, there is more than the O_MAYEXEC changes (which match this search),
e.g., to prevent Python interactive execution. There are patches for
Bash, Wine, Java (Icedtea), Busybox's ash, Perl and Python. There are
also some related patches which do not directly rely on O_MAYEXEC but
which restrict the use of browser plugins and extensions, which may be
seen as scripts too:
https://github.com/clipos-archive/clipos4_portage-overlay/tree/master/www-client

An introduction to O_MAYEXEC was given at the Linux Security Summit
Europe 2018 - Linux Kernel Security Contributions by ANSSI:
https://www.youtube.com/watch?v=chNjCRtPKQY&t=17m15s
The "write xor execute" principle was explained at Kernel Recipes 2018 -
CLIP OS: a defense-in-depth OS:
https://www.youtube.com/watch?v=PjRE0uBtkHU&t=11m14s
See also an overview article: https://lwn.net/Articles/82/

This patch series can be applied on top of v5.9-rc7 .  This can be tested
with CONFIG_SYSCTL.  I would really appreciate constructive comments on
this patch series.

Previous version:
https://lore.kernel.org/lkml/20200924153228.387737-1-...@digikod.net/

[1] https://lore.kernel.org/lkml/1544647356.4028.105.ca...@linux.ibm.com/
[2] https://lore.kernel.org/lkml/20190904201933.10736-6-cyp...@cyphar.com/
[3] 
https://lore.kernel.org/lkml/CALCETrVovr8XNZSroey7pHF46O=kj_c5D9K8h=z2t_cnrpv...@mail.gmail.com/
[4] 
https://lore.kernel.org/lkml/calcetrvez0euffxwfhtag_j+advbzewe0m3wjxmwveo7pj+...@mail.gmail.com/
[5] 
https://lore.kernel.org/lkml/20200406221439.1469862-12-deven.de...@linux.microsoft.com/
[6] 
https://lore.kernel.org/lkml/20200922215326.4603-1-madve...@linux.microsoft.com/
[7] https://www.python.org/dev/peps/pep-0578/
[8] 
https://lore.kernel.org/lkml/0c70debd-e79e-d514-06c6-4cd1e021f...@python.org/

Regards,

Mickaël Salaün (3):
  fs: Add trusted_for(2) syscall implementation and related sysctl
  arch: Wire up trusted_for(2)
  selftest/interpreter: Add tests for trusted_for(2) policies

 Documentation/admin-guide/sysctl/

[PATCH v11 1/3] fs: Add trusted_for(2) syscall implementation and related sysctl

2020-10-01 Thread Mickaël Salaün
From: Mickaël Salaün 

The trusted_for() syscall enables user space tasks to check that files
are trusted to be executed or interpreted by user space.  This may allow
script interpreters to check execution permission before reading
commands from a file, or dynamic linkers to allow shared object loading.
This may be seen as a way for a trusted task (e.g. interpreter) to check
the trustworthiness of files (e.g. scripts) before extending its control
flow graph with new ones originating from these files.

The security policy is consistently managed by the kernel through the
new sysctl: fs.trust_policy .  This enables system administrators to
enforce two complementary security policies according to the installed
system: enforce the noexec mount option, and enforce executable file
permission.  Indeed, because of compatibility with installed systems,
only system administrators are able to check that this new enforcement
is in line with the system mount points and file permissions.
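
The way the two policies combine can be modeled as a pure function.
This is only a sketch of the decision described above, not the kernel
implementation: the bit values (1 for the mount check, 2 for the file
permission check) and all sketch_* names are assumptions.  It models
the kernel-side decision and is not something user space is expected to
compute from the sysctl value:

```c
#include <stdbool.h>

/* Assumed bit layout of the fs.trust_policy sysctl. */
#define SKETCH_POLICY_EXEC_MOUNT	0x1	/* honor the noexec mount option */
#define SKETCH_POLICY_EXEC_FILE		0x2	/* honor the file execute permission */

/*
 * Returns true if execution would be trusted under the given policy.
 * A policy of 0 enforces nothing, so every file is accepted.
 */
static bool sketch_policy_allows(unsigned int policy, bool mount_noexec,
				 bool file_executable)
{
	if ((policy & SKETCH_POLICY_EXEC_MOUNT) && mount_noexec)
		return false;
	if ((policy & SKETCH_POLICY_EXEC_FILE) && !file_executable)
		return false;
	return true;
}
```

With both bits set, a script is trusted only if it lives on a mount
point without noexec and is executable by the calling process.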

For this to be possible, script interpreters must use trusted_for(2)
with the TRUSTED_FOR_EXECUTION usage.  To be fully effective, these
interpreters also need to handle the other ways to execute code: command
line parameters (e.g., option -e for Perl), module loading (e.g., option
-m for Python), stdin, file sourcing, environment variables,
configuration files, etc.  According to the threat model, it may be
acceptable to allow some script interpreters (e.g. Bash) to interpret
commands from stdin, may it be a TTY or a pipe, because it may not be
enough to (directly) perform syscalls.

Even without enforced security policy, user space interpreters can use
this syscall to try as much as possible to enforce the system policy at
their level, knowing that it will not break anything on running systems
which do not care about this feature.  However, on systems which want
this feature enforced, there will be knowledgeable people (i.e. system
administrator who configured fs.trust_policy deliberately) to manage it.

Because trusted_for(2) is a means to enforce a system-wide security
policy (but not application-centric policies), it does not make sense
for user space to check the sysctl value.  Indeed, this new syscall only
extends the system's ability to enforce a policy thanks to (some
trusted) user space collaboration.  Moreover, additional security
policies could be managed by LSMs.  This is a best-effort approach from
the application developer point of view:
https://lore.kernel.org/lkml/1477d3d7-4b36-afad-7077-a38f42322...@digikod.net/

trusted_for(2) with TRUSTED_FOR_EXECUTION should not be confused with
the O_EXEC flag (for open) which is intended for execute-only, which
obviously doesn't work for scripts.  However, a similar behavior could
be implemented in user space with O_PATH:
https://lore.kernel.org/lkml/1e2f6913-42f2-3578-28ed-567f6a4bd...@digikod.net/

Being able to restrict execution also helps protect the kernel by
restricting arbitrary syscalls that an attacker could perform with a
crafted binary or certain script languages.  It also improves multilevel
isolation by reducing the ability of an attacker to use side channels
with specific code.  These restrictions can natively be enforced for ELF
binaries (with the noexec mount option) but require this kernel
extension to properly handle scripts (e.g. Python, Perl).  To get a
consistent execution policy, additional memory restrictions should also
be enforced (e.g. thanks to SELinux).

This is a new implementation of a patch initially written by
Vincent Strubel for CLIP OS 4:
https://github.com/clipos-archive/src_platform_clip-patches/blob/f5cb330d6b684752e403b4e41b39f7004d88e561/1901_open_mayexec.patch
This patch has been used for more than 12 years with customized script
interpreters.  Some examples (with the original O_MAYEXEC) can be found
here:
https://github.com/clipos-archive/clipos4_portage-overlay/search?q=O_MAYEXEC

Co-developed-by: Thibaut Sautereau 
Signed-off-by: Thibaut Sautereau 
Signed-off-by: Mickaël Salaün 
Cc: Al Viro 
Cc: Andrew Morton 
Cc: Arnd Bergmann 
Cc: Jonathan Corbet 
Cc: Kees Cook 
Cc: Vincent Strubel 
---

Changes since v10:
* Add enum definition to syscalls.h .

Changes since v9:
* Rename the syscall to trusted_for(2) and the sysctl to fs.trust_policy
* Add a dedicated enum trusted_for_usage with include/uapi/linux/trusted-for.h
* Remove the extra MAY_INTROSPECTION_EXEC bit.  LSMs can still implement
  this feature themselves.

Changes since v8:
* Add a dedicated syscall introspect_access() (requested by Al Viro).
* Rename MAY_INTERPRETED_EXEC to MAY_INTROSPECTION_EXEC .
* Rename the sysctl fs.interpreted_access to fs.introspection_policy .
* Update documentation.

Changes since v7:
* Replaces openat2/O_MAYEXEC with faccessat2/X_OK/AT_INTERPRETED .
  Switching to an FD-based syscall was suggested by Al Viro and Jann
  Horn.
* Handle special file descriptors.
* Add a compatibility mode for execute/read check.
* Move the sysctl policy from fs

[PATCH v11 2/3] arch: Wire up trusted_for(2)

2020-10-01 Thread Mickaël Salaün
From: Mickaël Salaün 

Wire up trusted_for(2) for all architectures.

Signed-off-by: Mickaël Salaün 
Reviewed-by: Thibaut Sautereau 
Cc: Al Viro 
Cc: Andrew Morton 
Cc: Arnd Bergmann 
Cc: Kees Cook 
Cc: Vincent Strubel 
---

Changes since v9:
* Rename introspect_access(2) to trusted_for(2).
* Increase syscall number to leave space for memfd_secret(2) in -next.

Changes since v7:
* New patch for the new syscall.
* Increase syscall numbers by 2 to leave space for new ones (in
  linux-next): watch_mount(2) and process_madvise(2).
---
 arch/alpha/kernel/syscalls/syscall.tbl  | 1 +
 arch/arm/tools/syscall.tbl  | 1 +
 arch/arm64/include/asm/unistd.h | 2 +-
 arch/arm64/include/asm/unistd32.h   | 2 ++
 arch/ia64/kernel/syscalls/syscall.tbl   | 1 +
 arch/m68k/kernel/syscalls/syscall.tbl   | 1 +
 arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   | 1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   | 1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   | 1 +
 arch/parisc/kernel/syscalls/syscall.tbl | 1 +
 arch/powerpc/kernel/syscalls/syscall.tbl| 1 +
 arch/s390/kernel/syscalls/syscall.tbl   | 1 +
 arch/sh/kernel/syscalls/syscall.tbl | 1 +
 arch/sparc/kernel/syscalls/syscall.tbl  | 1 +
 arch/x86/entry/syscalls/syscall_32.tbl  | 1 +
 arch/x86/entry/syscalls/syscall_64.tbl  | 1 +
 arch/xtensa/kernel/syscalls/syscall.tbl | 1 +
 include/uapi/asm-generic/unistd.h   | 4 +++-
 19 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index ec8bed9e7b75..0175cfc0f66f 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -479,3 +479,4 @@
 547	common	openat2	sys_openat2
 548	common	pidfd_getfd	sys_pidfd_getfd
 549	common	faccessat2	sys_faccessat2
+553	common	trusted_for	sys_trusted_for
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 171077cbf419..db9c8d35e75b 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -453,3 +453,4 @@
 437	common	openat2	sys_openat2
 438	common	pidfd_getfd	sys_pidfd_getfd
 439	common	faccessat2	sys_faccessat2
+443	common	trusted_for	sys_trusted_for
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 3b859596840d..d1f7d35f986e 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -38,7 +38,7 @@
 #define __ARM_NR_compat_set_tls		(__ARM_NR_COMPAT_BASE + 5)
 #define __ARM_NR_COMPAT_END		(__ARM_NR_COMPAT_BASE + 0x800)
 
-#define __NR_compat_syscalls   440
+#define __NR_compat_syscalls   444
 #endif
 
 #define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index 734860ac7cf9..33716dd2c04c 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -887,6 +887,8 @@ __SYSCALL(__NR_openat2, sys_openat2)
 __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
 #define __NR_faccessat2 439
 __SYSCALL(__NR_faccessat2, sys_faccessat2)
+#define __NR_trusted_for 443
+__SYSCALL(__NR_trusted_for, sys_trusted_for)
 
 /*
  * Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index f52a41f4c340..68e56436b611 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -360,3 +360,4 @@
 437	common	openat2	sys_openat2
 438	common	pidfd_getfd	sys_pidfd_getfd
 439	common	faccessat2	sys_faccessat2
+443	common	trusted_for	sys_trusted_for
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index 81fc799d8392..67f0bc2fc4d0 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -439,3 +439,4 @@
 437	common	openat2	sys_openat2
 438	common	pidfd_getfd	sys_pidfd_getfd
 439	common	faccessat2	sys_faccessat2
+443	common	trusted_for	sys_trusted_for
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index b4e263916f41..acd3057886b7 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -445,3 +445,4 @@
 437	common	openat2	sys_openat2
 438	common	pidfd_getfd	sys_pidfd_getfd
 439	common	faccessat2	sys_faccessat2
+443	common	trusted_for

Re: [PATCH v2 0/4] [RFC] Implement Trampoline File Descriptor

2020-09-25 Thread Mickaël Salaün


On 25/09/2020 00:05, Pavel Machek wrote:
> Hi!
> 
>>>>> I believe you should simply delete confusing "introduction" and
>>>>> provide details of super-secure system where your patches would be
>>>>> useful, instead.
>>>>
>>>> This RFC talks about converting dynamic code (which cannot be authenticated)
>>>> to static code that can be authenticated using signature verification. That
>>>> is the scope of this RFC.
>>>>
>>>> If I have not been clear before, by dynamic code, I mean machine code that is
>>>> dynamic in nature. Scripts are beyond the scope of this RFC.
>>>>
>>>> Also, malware compiled from sources is not dynamic code. That is orthogonal
>>>> to this RFC. If such malware has a valid signature that the kernel permits its
>>>> execution, we have a systemic problem.
>>>>
>>>> I am not saying that script authentication or compiled malware are not problems.
>>>> I am just saying that this RFC is not trying to solve all of the security problems.
>>>> It is trying to define one way to convert dynamic code to static code to address
>>>> one class of problems.
>>>
>>> Well, you don't have to solve all problems at once.
>>>
>>> But solutions have to exist, and AFAIK in this case they don't. You
>>> are armoring doors, but ignoring open windows.
>>
>> FYI, script execution is being addressed (for the kernel part) by this
>> patch series:
>> https://lore.kernel.org/lkml/20200924153228.387737-1-...@digikod.net/
> 
> Ok.
> 
>>> Or very probably you are thinking about something different than
>>> normal desktop distros (Debian 10). Because on my systems, I have
>>> python, gdb and gcc...
>>
>> It doesn't make sense for a tailored security system to leave all these
>> tools available to an attacker.
> 
> And it also does not make sense to use "trampoline file descriptor" on
> generic system... while W^X should make sense there.

Well, as said before, (full/original/system-wide) W^X may require
trampfd (as well as other building-blocks).

I guess most Linux deployments are not on "generic systems"
anyway (even if they may be based on generic distros), and W^X
contradicts the fact that users/attackers can do whatever they want on
the system.

> 
>>> It would be nice to specify what other pieces need to be present for
>>> this to make sense -- because it makes no sense on Debian 10.
>>
>> Not all kernel features make sense for a generic/undefined usage,
>> especially specific security mechanisms (e.g. SELinux, Smack, Tomoyo,
>> SafeSetID, LoadPin, IMA, IPE, secure/trusted boot, lockdown, etc.), but
>> they can still be definitely useful.
> 
> Yep... so... I'd expect something like... "so you have single-purpose
> system

No one talked about a single-purpose system.

> with all script interpreters removed,

Not necessarily with the patch series I pointed out just before.

> IMA hashing all the files
> to make sure they are not modified, and W^X enabled.

System-wide W^X is not only for memory, and as Madhavan said: "this RFC
pertains to converting dynamic [writable] machine code to static
[non-writable] code".

> Attacker can
> still execute code after buffer overflow by  and trampoline file
> descriptor addresses that"... so that people running generic systems
> can stop reading after first sentence.

Are you proposing to add a
"[feature-not-useful-without-a-proper-system-configuration]" tag in
subjects? :)


Re: [PATCH v2 0/4] [RFC] Implement Trampoline File Descriptor

2020-09-24 Thread Mickaël Salaün


On 23/09/2020 22:51, Pavel Machek wrote:
> Hi!
> 
>>>> Scenario 2
>>>> ----------
>>>>
>>>> We know what code we need in advance. User trampolines are a good example of
>>>> this. It is possible to define such code statically with some help from the
>>>> kernel.
>>>>
>>>> This RFC addresses (2). (1) needs a general purpose trusted code generator
>>>> and is out of scope for this RFC.
>>>
>>> This is slightly less crazy talk than introduction talking about holes
>>> in W^X. But it is very, very far from normal Unix system, where you
>>> have selection of interpreters to run your malware on (sh, python,
>>> awk, emacs, ...) and often you can even compile malware from sources. 
>>>
>>> And as you noted, we don't have "a general purpose trusted code
>>> generator" for our systems.
>>>
>>> I believe you should simply delete confusing "introduction" and
>>> provide details of super-secure system where your patches would be
>>> useful, instead.
>>
>> This RFC talks about converting dynamic code (which cannot be authenticated)
>> to static code that can be authenticated using signature verification. That
>> is the scope of this RFC.
>>
>> If I have not been clear before, by dynamic code, I mean machine code that is
>> dynamic in nature. Scripts are beyond the scope of this RFC.
>>
>> Also, malware compiled from sources is not dynamic code. That is orthogonal
>> to this RFC. If such malware has a valid signature that the kernel permits its
>> execution, we have a systemic problem.
>>
>> I am not saying that script authentication or compiled malware are not problems.
>> I am just saying that this RFC is not trying to solve all of the security problems.
>> It is trying to define one way to convert dynamic code to static code to address
>> one class of problems.
> 
> Well, you don't have to solve all problems at once.
> 
> But solutions have to exist, and AFAIK in this case they don't. You
> are armoring doors, but ignoring open windows.

FYI, script execution is being addressed (for the kernel part) by this
patch series:
https://lore.kernel.org/lkml/20200924153228.387737-1-...@digikod.net/

> 
> Or very probably you are thinking about something different than
> normal desktop distros (Debian 10). Because on my systems, I have
> python, gdb and gcc...

It doesn't make sense for a tailored security system to leave all these
tools available to an attacker.

> 
> It would be nice to specify what other pieces need to be present for
> this to make sense -- because it makes no sense on Debian 10.

Not all kernel features make sense for a generic/undefined usage,
especially specific security mechanisms (e.g. SELinux, Smack, Tomoyo,
SafeSetID, LoadPin, IMA, IPE, secure/trusted boot, lockdown, etc.), but
they can still be definitely useful.

> 
> Best regards,
>   Pavel
> 


[PATCH v10 3/3] selftest/interpreter: Add tests for trusted_for(2) policies

2020-09-24 Thread Mickaël Salaün
From: Mickaël Salaün 

Test that checks performed by trusted_for(2) on file descriptors are
consistent with noexec mount points and file execute permissions,
according to the policy configured with the fs.trust_policy sysctl.

Signed-off-by: Mickaël Salaün 
Reviewed-by: Thibaut Sautereau 
Cc: Al Viro 
Cc: Arnd Bergmann 
Cc: Andrew Morton 
Cc: Kees Cook 
Cc: Shuah Khan 
Cc: Vincent Strubel 
---

Changes since v9:
* Rename the syscall and the sysctl.
* Update tests for enum trusted_for_usage

Changes since v8:
* Update with the dedicated syscall introspect_access(2) and the renamed
  fs.introspection_policy sysctl.
* Remove the symlink check, which can't be used as is anymore.
* Use socketpair(2) to test UNIX socket.

Changes since v7:
* Update tests with faccessat2/AT_INTERPRETED, including new ones to
  check that setting R_OK or W_OK returns EINVAL.
* Add tests for memfd, pipefs and nsfs.
* Rename and move back tests to a standalone directory.

Changes since v6:
* Add full combination tests for all file types, including block
  devices, character devices, fifos, sockets and symlinks.
* Properly save and restore initial sysctl value for all tests.

Changes since v5:
* Refactor with FIXTURE_VARIANT, which makes the tests much easier to
  read and maintain.
* Save and restore initial sysctl value (suggested by Kees Cook).
* Test with a sysctl value of 0.
* Check errno in sysctl_access_write test.
* Update tests for the CAP_SYS_ADMIN switch.
* Update tests to check -EISDIR (replacing -EACCES).
* Replace FIXTURE_DATA() with FIXTURE() (spotted by Kees Cook).
* Use global const strings.

Changes since v3:
* Replace RESOLVE_MAYEXEC with O_MAYEXEC.
* Add tests to check that O_MAYEXEC is ignored by open(2) and openat(2).

Changes since v2:
* Move tests from exec/ to openat2/ .
* Replace O_MAYEXEC with RESOLVE_MAYEXEC from openat2(2).
* Cleanup tests.

Changes since v1:
* Move tests from yama/ to exec/ .
* Fix _GNU_SOURCE in kselftest_harness.h .
* Add a new test sysctl_access_write to check if CAP_MAC_ADMIN is taken
  into account.
* Test directory execution which is always forbidden since commit
  73601ea5b7b1 ("fs/open.c: allow opening only regular files during
  execve()"), and also check that even the root user can not bypass file
  execution checks.
* Make sure delete_workspace() always has enough rights to succeed.
* Cosmetic cleanup.
---
 .../testing/selftests/interpreter/.gitignore  |   2 +
 tools/testing/selftests/interpreter/Makefile  |  21 +
 tools/testing/selftests/interpreter/config|   1 +
 .../selftests/interpreter/trust_policy_test.c | 362 ++
 4 files changed, 386 insertions(+)
 create mode 100644 tools/testing/selftests/interpreter/.gitignore
 create mode 100644 tools/testing/selftests/interpreter/Makefile
 create mode 100644 tools/testing/selftests/interpreter/config
 create mode 100644 tools/testing/selftests/interpreter/trust_policy_test.c

diff --git a/tools/testing/selftests/interpreter/.gitignore b/tools/testing/selftests/interpreter/.gitignore
new file mode 100644
index ..82a4846cbc4b
--- /dev/null
+++ b/tools/testing/selftests/interpreter/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+/*_test
diff --git a/tools/testing/selftests/interpreter/Makefile b/tools/testing/selftests/interpreter/Makefile
new file mode 100644
index ..dbca8ebda67e
--- /dev/null
+++ b/tools/testing/selftests/interpreter/Makefile
@@ -0,0 +1,21 @@
+# SPDX-License-Identifier: GPL-2.0-or-later
+
+CFLAGS += -Wall -O2
+LDLIBS += -lcap
+
+src_test := $(wildcard *_test.c)
+TEST_GEN_PROGS := $(src_test:.c=)
+
+KSFT_KHDR_INSTALL := 1
+include ../lib.mk
+
+khdr_dir = $(top_srcdir)/usr/include
+
+$(khdr_dir)/asm-generic/unistd.h: khdr
+   @:
+
+$(khdr_dir)/linux/trusted-for.h: khdr
+   @:
+
+$(OUTPUT)/%_test: %_test.c $(khdr_dir)/asm-generic/unistd.h $(khdr_dir)/linux/trusted-for.h ../kselftest_harness.h
+   $(LINK.c) $< $(LDLIBS) -o $@ -I$(khdr_dir)
diff --git a/tools/testing/selftests/interpreter/config b/tools/testing/selftests/interpreter/config
new file mode 100644
index ..dd53c266bf52
--- /dev/null
+++ b/tools/testing/selftests/interpreter/config
@@ -0,0 +1 @@
+CONFIG_SYSCTL=y
diff --git a/tools/testing/selftests/interpreter/trust_policy_test.c b/tools/testing/selftests/interpreter/trust_policy_test.c
new file mode 100644
index ..4818c5524ec0
--- /dev/null
+++ b/tools/testing/selftests/interpreter/trust_policy_test.c
@@ -0,0 +1,362 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Test trusted_for(2) with fs.trust_policy sysctl
+ *
+ * Copyright © 2018-2020 ANSSI
+ *
+ * Author: Mickaël Salaün 
+ */
+
+#define _GNU_SOURCE
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "../kselftest_harness.h"
+
+#ifndef trusted_for
+static int trusted_for(const int fd, const enum 

[PATCH v10 1/3] fs: Add trusted_for(2) syscall implementation and related sysctl

2020-09-24 Thread Mickaël Salaün
From: Mickaël Salaün 

The trusted_for() syscall enables user space tasks to check that files
are trusted to be executed or interpreted by user space.  This may allow
script interpreters to check execution permission before reading
commands from a file, or dynamic linkers to allow shared object loading.
This may be seen as a way for a trusted task (e.g. interpreter) to check
the trustworthiness of files (e.g. scripts) before extending its control
flow graph with new ones originating from these files.

The security policy is consistently managed by the kernel through the
new sysctl: fs.trust_policy .  This enables system administrators to
enforce two complementary security policies according to the installed
system: enforce the noexec mount option, and enforce executable file
permission.  Indeed, because of compatibility with installed systems,
only system administrators are able to check that this new enforcement
is in line with the system mount points and file permissions.

For this to be possible, script interpreters must use trusted_for(2)
with the TRUSTED_FOR_EXECUTION usage.  To be fully effective, these
interpreters also need to handle the other ways to execute code: command
line parameters (e.g., option -e for Perl), module loading (e.g., option
-m for Python), stdin, file sourcing, environment variables,
configuration files, etc.  According to the threat model, it may be
acceptable to allow some script interpreters (e.g. Bash) to interpret
commands from stdin, may it be a TTY or a pipe, because it may not be
enough to (directly) perform syscalls.
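
A rough sketch of what this could look like on the interpreter side is
given below.  It is not the code of any real interpreter: the
sketch_trust_fn callback is a hypothetical stand-in for the kernel
query, and sketch_trust_exec_bit() is a deliberately naive substitute
(it only looks at the mode bits) so that the example can run on a
kernel without the new syscall.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical callback standing in for the kernel query: 0 means trusted. */
typedef int (*sketch_trust_fn)(int fd);

/* Stand-in check for this sketch: accept files with any execute bit set. */
static int sketch_trust_exec_bit(int fd)
{
	struct stat st;

	if (fstat(fd, &st) != 0)
		return -errno;
	return (st.st_mode & (S_IXUSR | S_IXGRP | S_IXOTH)) ? 0 : -EACCES;
}

/*
 * Open a script and read from it only if the callback accepts the open
 * file descriptor.  Returns the number of bytes read, or a negative
 * errno value.
 */
static ssize_t sketch_load_script(const char *path, sketch_trust_fn trusted,
				  char *buf, size_t len)
{
	ssize_t n;
	int fd, ret;

	assert(path != NULL && trusted != NULL && buf != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	/* Check the descriptor, not the path: what is checked is what is read. */
	ret = trusted(fd);
	if (ret != 0) {
		close(fd);
		return ret < 0 ? ret : -EACCES;
	}

	n = read(fd, buf, len);
	if (n < 0)
		n = -errno;
	close(fd);
	return n;
}
```

The check is made on the already opened file descriptor, in line with
the FD-based design of the syscall.  The other code sources listed
above (command line, stdin, environment) are not covered by this
sketch.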

Even without enforced security policy, user space interpreters can use
this syscall to try as much as possible to enforce the system policy at
their level, knowing that it will not break anything on running systems
which do not care about this feature.  However, on systems which want
this feature enforced, there will be knowledgeable people (i.e. system
administrator who configured fs.trust_policy deliberately) to manage it.

Because trusted_for(2) is a means to enforce a system-wide security
policy (but not application-centric policies), it does not make sense
for user space to check the sysctl value.  Indeed, this new syscall only
extends the system's ability to enforce a policy thanks to (some
trusted) user space collaboration.  Moreover, additional security
policies could be managed by LSMs.  This is a best-effort approach from
the application developer point of view:
https://lore.kernel.org/lkml/1477d3d7-4b36-afad-7077-a38f42322...@digikod.net/

trusted_for(2) with TRUSTED_FOR_EXECUTION should not be confused with
the O_EXEC flag (for open) which is intended for execute-only, which
obviously doesn't work for scripts.  However, a similar behavior could
be implemented in user space with O_PATH:
https://lore.kernel.org/lkml/1e2f6913-42f2-3578-28ed-567f6a4bd...@digikod.net/

Being able to restrict execution also helps protect the kernel by
restricting arbitrary syscalls that an attacker could perform with a
crafted binary or certain script languages.  It also improves multilevel
isolation by reducing the ability of an attacker to use side channels
with specific code.  These restrictions can natively be enforced for ELF
binaries (with the noexec mount option) but require this kernel
extension to properly handle scripts (e.g. Python, Perl).  To get a
consistent execution policy, additional memory restrictions should also
be enforced (e.g. thanks to SELinux).

This is a new implementation of a patch initially written by
Vincent Strubel for CLIP OS 4:
https://github.com/clipos-archive/src_platform_clip-patches/blob/f5cb330d6b684752e403b4e41b39f7004d88e561/1901_open_mayexec.patch
This patch has been used for more than 12 years with customized script
interpreters.  Some examples (with the original O_MAYEXEC) can be found
here:
https://github.com/clipos-archive/clipos4_portage-overlay/search?q=O_MAYEXEC

Co-developed-by: Thibaut Sautereau 
Signed-off-by: Thibaut Sautereau 
Signed-off-by: Mickaël Salaün 
Cc: Al Viro 
Cc: Andrew Morton 
Cc: Arnd Bergmann 
Cc: Jonathan Corbet 
Cc: Kees Cook 
Cc: Vincent Strubel 
---

Changes since v9:
* Rename the syscall to trusted_for(2) and the sysctl to fs.trust_policy
* Add a dedicated enum trusted_for_usage with include/uapi/linux/trusted-for.h
* Remove the extra MAY_INTROSPECTION_EXEC bit.  LSMs can still implement
  this feature themselves.

Changes since v8:
* Add a dedicated syscall introspect_access() (requested by Al Viro).
* Rename MAY_INTERPRETED_EXEC to MAY_INTROSPECTION_EXEC .
* Rename the sysctl fs.interpreted_access to fs.introspection_policy .
* Update documentation.

Changes since v7:
* Replaces openat2/O_MAYEXEC with faccessat2/X_OK/AT_INTERPRETED .
  Switching to an FD-based syscall was suggested by Al Viro and Jann
  Horn.
* Handle special file descriptors.
* Add a compatibility mode for execute/read check.
* Move the sysctl policy from fs/namei.c to fs/open.c for the new
  faccessat2

[PATCH v10 2/3] arch: Wire up trusted_for(2)

2020-09-24 Thread Mickaël Salaün
From: Mickaël Salaün 

Wire up trusted_for(2) for all architectures.

Signed-off-by: Mickaël Salaün 
Reviewed-by: Thibaut Sautereau 
Cc: Al Viro 
Cc: Andrew Morton 
Cc: Arnd Bergmann 
Cc: Kees Cook 
Cc: Vincent Strubel 
---

Changes since v9:
* Rename introspect_access(2) to trusted_for(2).
* Increase syscall number to leave space for memfd_secret(2) in -next.

Changes since v7:
* New patch for the new syscall.
* Increase syscall numbers by 2 to leave space for new ones (in
  linux-next): watch_mount(2) and process_madvise(2).
---
 arch/alpha/kernel/syscalls/syscall.tbl  | 1 +
 arch/arm/tools/syscall.tbl  | 1 +
 arch/arm64/include/asm/unistd.h | 2 +-
 arch/arm64/include/asm/unistd32.h   | 2 ++
 arch/ia64/kernel/syscalls/syscall.tbl   | 1 +
 arch/m68k/kernel/syscalls/syscall.tbl   | 1 +
 arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   | 1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   | 1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   | 1 +
 arch/parisc/kernel/syscalls/syscall.tbl | 1 +
 arch/powerpc/kernel/syscalls/syscall.tbl| 1 +
 arch/s390/kernel/syscalls/syscall.tbl   | 1 +
 arch/sh/kernel/syscalls/syscall.tbl | 1 +
 arch/sparc/kernel/syscalls/syscall.tbl  | 1 +
 arch/x86/entry/syscalls/syscall_32.tbl  | 1 +
 arch/x86/entry/syscalls/syscall_64.tbl  | 1 +
 arch/xtensa/kernel/syscalls/syscall.tbl | 1 +
 include/uapi/asm-generic/unistd.h   | 4 +++-
 19 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index ec8bed9e7b75..0175cfc0f66f 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -479,3 +479,4 @@
 547     common  openat2                 sys_openat2
 548     common  pidfd_getfd             sys_pidfd_getfd
 549     common  faccessat2              sys_faccessat2
+553     common  trusted_for             sys_trusted_for
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 171077cbf419..db9c8d35e75b 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -453,3 +453,4 @@
 437     common  openat2                 sys_openat2
 438     common  pidfd_getfd             sys_pidfd_getfd
 439     common  faccessat2              sys_faccessat2
+443     common  trusted_for             sys_trusted_for
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 3b859596840d..d1f7d35f986e 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -38,7 +38,7 @@
 #define __ARM_NR_compat_set_tls         (__ARM_NR_COMPAT_BASE + 5)
 #define __ARM_NR_COMPAT_END             (__ARM_NR_COMPAT_BASE + 0x800)
 
-#define __NR_compat_syscalls   440
+#define __NR_compat_syscalls   444
 #endif
 
 #define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index 734860ac7cf9..33716dd2c04c 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -887,6 +887,8 @@ __SYSCALL(__NR_openat2, sys_openat2)
 __SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
 #define __NR_faccessat2 439
 __SYSCALL(__NR_faccessat2, sys_faccessat2)
+#define __NR_trusted_for 443
+__SYSCALL(__NR_trusted_for, sys_trusted_for)
 
 /*
  * Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index f52a41f4c340..68e56436b611 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -360,3 +360,4 @@
 437     common  openat2                 sys_openat2
 438     common  pidfd_getfd             sys_pidfd_getfd
 439     common  faccessat2              sys_faccessat2
+443     common  trusted_for             sys_trusted_for
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index 81fc799d8392..67f0bc2fc4d0 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -439,3 +439,4 @@
 437     common  openat2                 sys_openat2
 438     common  pidfd_getfd             sys_pidfd_getfd
 439     common  faccessat2              sys_faccessat2
+443     common  trusted_for             sys_trusted_for
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index b4e263916f41..acd3057886b7 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -445,3 +445,4 @@
 437     common  openat2                 sys_openat2
 438     common  pidfd_getfd             sys_pidfd_getfd
 439     common  faccessat2              sys_faccessat2
+443     common  trusted_for

[PATCH v10 0/3] Add trusted_for(2) (was O_MAYEXEC)

2020-09-24 Thread Mickaël Salaün
Hi,

This tenth patch series renames the syscall from introspect_access(2) to
trusted_for(2) and the sysctl from fs.introspection_policy to
fs.trust_policy.  Indeed, the final goal is to enable the kernel to be a
global policy manager by entrusting processes with access control at
their level.  To reach this goal, two complementary parts are required:
* user space needs to be able to know if it can trust some file
  descriptor content for a specific usage;
* and the kernel needs to make available some part of the policy
  configured by the system administrator.

We removed the MAY_INTROSPECTION_EXEC flag, which was passed to
inode_permission().  LSMs wishing to use this new syscall will need to
implement such a flag themselves.

Primary goal of trusted_for(2)
==============================

This new syscall enables user space to ask the kernel: is this file
descriptor's content trusted to be used for this purpose?  The set of
usages currently contains only "execution", but others may follow (e.g.
"configuration", "sensitive_data").  If the kernel identifies the file
descriptor as trustworthy for this usage, user space should then take
this information into account.  The "execution" usage means that the
content of the file descriptor is trusted according to the system policy
to be executed by user space, which means that user space interprets the
content or (tries to) map it as executable memory.

A simple system-wide security policy can be enforced by the system
administrator through a sysctl configuration consistent with the mount
points or the file access rights.  The documentation patch explains the
prerequisites.

It is important to note that this can only extend the access control
managed by the kernel.  Hence it enables current access control
mechanisms to be extended and to become a superset of what they can
currently control.  Indeed, the security policy could also be delegated
to an LSM, either a MAC system or an integrity system.  For instance,
this is required to close a major IMA measurement/appraisal interpreter
integrity gap by bringing the ability to check the use of scripts [1].
Other uses are expected, such as for magic-links [2], SGX integration
[3], bpffs [4].

Complementary W^X protections can be brought by SELinux, IPE [5] and
trampfd [6].

Prerequisite of its use
=======================

User space needs to adapt to take advantage of this new feature.  For
example, the PEP 578 [7] (Runtime Audit Hooks) enables Python 3.8 to be
extended with policy enforcement points related to code interpretation,
which can be used to align with the PowerShell audit features.
Additional Python security improvements (e.g. a limited interpreter
without -c, stdin piping of code) are on their way [8].

Examples
========

The initial idea comes from CLIP OS 4 and the original implementation
has been used for more than 12 years:
https://github.com/clipos-archive/clipos4_doc
Chrome OS has a similar approach:
https://chromium.googlesource.com/chromiumos/docs/+/master/security/noexec_shell_scripts.md

Userland patches can be found here:
https://github.com/clipos-archive/clipos4_portage-overlay/search?q=O_MAYEXEC
Actually, there are more changes than the O_MAYEXEC ones matched by this
search, e.g. to prevent Python interactive execution. There are patches for
Bash, Wine, Java (Icedtea), Busybox's ash, Perl and Python. There are
also some related patches which do not directly rely on O_MAYEXEC but
which restrict the use of browser plugins and extensions, which may be
seen as scripts too:
https://github.com/clipos-archive/clipos4_portage-overlay/tree/master/www-client

An introduction to O_MAYEXEC was given at the Linux Security Summit
Europe 2018 - Linux Kernel Security Contributions by ANSSI:
https://www.youtube.com/watch?v=chNjCRtPKQY&t=17m15s
The "write xor execute" principle was explained at Kernel Recipes 2018 -
CLIP OS: a defense-in-depth OS:
https://www.youtube.com/watch?v=PjRE0uBtkHU&t=11m14s
See also an overview article: https://lwn.net/Articles/82/

This patch series can be applied on top of v5.9-rc6 .  This can be tested
with CONFIG_SYSCTL.  I would really appreciate constructive comments on
this patch series.

Previous version:
https://lore.kernel.org/kernel-hardening/20200910164612.114215-1-...@digikod.net/

[1] https://lore.kernel.org/lkml/1544647356.4028.105.ca...@linux.ibm.com/
[2] https://lore.kernel.org/lkml/20190904201933.10736-6-cyp...@cyphar.com/
[3] https://lore.kernel.org/lkml/CALCETrVovr8XNZSroey7pHF46O=kj_c5D9K8h=z2t_cnrpv...@mail.gmail.com/
[4] https://lore.kernel.org/lkml/calcetrvez0euffxwfhtag_j+advbzewe0m3wjxmwveo7pj+...@mail.gmail.com/
[5] https://lore.kernel.org/lkml/20200406221439.1469862-12-deven.de...@linux.microsoft.com/
[6] https://lore.kernel.org/lkml/20200922215326.4603-1-madve...@linux.microsoft.com/
[7] https://www.python.org/dev/peps/pep-0578/
[8] https://lore.kernel.org/lkml/0c70debd-e79e-d514-06c6-4cd1e021f...@python.org/

Regards,

Mi

Re: [PATCH v20 05/12] LSM: Infrastructure management of the superblock

2020-09-16 Thread Mickaël Salaün


On 04/09/2020 16:06, Stephen Smalley wrote:
> On Thu, Aug 13, 2020 at 2:39 PM Stephen Smalley
>  wrote:
>>
>> On Thu, Aug 13, 2020 at 10:17 AM Mickaël Salaün  wrote:
>>>
>>>
>>> On 12/08/2020 21:16, Stephen Smalley wrote:
>>>> On 8/2/20 5:58 PM, Mickaël Salaün wrote:
>>>>> From: Casey Schaufler 
>>>>>
>>>>> Move management of the superblock->sb_security blob out
>>>>> of the individual security modules and into the security
>>>>> infrastructure. Instead of allocating the blobs from within
>>>>> the modules the modules tell the infrastructure how much
>>>>> space is required, and the space is allocated there.
>>>>>
>>>>> Signed-off-by: Casey Schaufler 
>>>>> Reviewed-by: Kees Cook 
>>>>> Reviewed-by: John Johansen 
>>>>> Reviewed-by: Stephen Smalley 
>>>>> Reviewed-by: Mickaël Salaün 
>>>>> Link:
>>>>> https://lore.kernel.org/r/20190829232935.7099-2-ca...@schaufler-ca.com
>>>>> ---
>>>>>
>>>>> Changes since v17:
>>>>> * Rebase the original LSM stacking patch from v5.3 to v5.7: I fixed some
>>>>>diff conflicts caused by code moves and function renames in
>>>>>selinux/include/objsec.h and selinux/hooks.c .  I checked that it
>>>>>builds but I didn't test the changes for SELinux nor SMACK.
>>>>
>>>> You shouldn't retain Signed-off-by and Reviewed-by lines from an earlier
>>>> patch if you made non-trivial changes to it (even more so if you didn't
>>>> test them).
>>>
>>> I think I made trivial changes according to the original patch. But
>>> without reply from other people with Signed-off-by or Reviewed-by
>>> (Casey, Kees, John), I'll remove them. I guess you don't want your
>>> Reviewed-by to be kept, so I'll remove it, except if you want to review
>>> this patch (or the modified part).
>>
>> At the very least your Reviewed-by line is wrong - yours should be
>> Signed-off-by because the patch went through you and you modified it.
>> I'll try to take a look as time permits but FYI you should use this
>> address (already updated in MAINTAINERS) going forward.
> 
> I finally got around to reviewing your updated patch.  You can drop
> the old line and add:
> Reviewed-by: Stephen Smalley 
> 

Thanks! I'll send a new series soon.
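
For readers following the quoted commit message: the blob management it
describes (each module reports how much space it needs, and the
infrastructure allocates a single blob) could be sketched in user space
roughly as follows.  All sketch_* names are hypothetical and the
pointer-size rounding is an assumption of this example; this is not the
kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Called once per module at registration time: records how much space
 * the module needs and returns the offset of its area in the blob.
 */
static size_t sketch_reserve(size_t *total, size_t need)
{
	size_t offset = *total;

	/* Round up so that every module's area stays pointer-aligned. */
	need = (need + sizeof(void *) - 1) & ~(sizeof(void *) - 1);
	*total += need;
	return offset;
}

/* Called by the infrastructure when an object (e.g. a superblock) is set up. */
static void *sketch_alloc_blob(size_t total)
{
	return total ? calloc(1, total) : NULL;
}

/* Each module finds its own area through the offset it was given. */
static void *sketch_module_blob(void *blob, size_t offset)
{
	assert(blob != NULL);
	return (char *)blob + offset;
}
```

The point of the design is that modules no longer allocate the blob
themselves; they only keep an offset into a blob that is owned by the
infrastructure.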


Re: [RFC PATCH v9 0/3] Add introspect_access(2) (was O_MAYEXEC)

2020-09-14 Thread Mickaël Salaün
Arnd and Michael,

What do you think of "should_faccessat" or "entrusted_faccessat" for
this new system call?


On 12/09/2020 02:28, James Morris wrote:
> On Thu, 10 Sep 2020, Matthew Wilcox wrote:
> 
>> On Thu, Sep 10, 2020 at 08:38:21PM +0200, Mickaël Salaün wrote:
>>> There is also the use case of noexec mounts and file permissions. From
>>> user space point of view, it doesn't matter which kernel component is in
>>> charge of defining the policy. The syscall should then not be tied with
>>> a verification/integrity/signature/appraisal vocabulary, but simply an
>>> access control one.
>>
>> permission()?
>>
> 
> The caller is not asking the kernel to grant permission, it's asking 
> "SHOULD I access this file?"
> 
> The caller doesn't know, for example, if the script file it's about to 
> execute has been signed, or if it's from a noexec mount. It's asking the 
> kernel, which does know. (Note that this could also be extended to reading 
> configuration files).
> 
> How about: should_faccessat ?
> 

Sounds good to me.


Re: [RFC PATCH v9 0/3] Add introspect_access(2) (was O_MAYEXEC)

2020-09-11 Thread Mickaël Salaün


On 10/09/2020 22:05, Matthew Wilcox wrote:
> On Thu, Sep 10, 2020 at 09:00:10PM +0100, Al Viro wrote:
>> On Thu, Sep 10, 2020 at 07:40:33PM +0100, Matthew Wilcox wrote:
>>> On Thu, Sep 10, 2020 at 08:38:21PM +0200, Mickaël Salaün wrote:
>>>> There is also the use case of noexec mounts and file permissions. From
>>>> user space point of view, it doesn't matter which kernel component is in
>>>> charge of defining the policy. The syscall should then not be tied with
>>>> a verification/integrity/signature/appraisal vocabulary, but simply an
>>>> access control one.
>>>
>>> permission()?
>>
>> int lsm(int fd, const char *how, char *error, int size);
>>
>> Seriously, this is "ask LSM to apply special policy to file"; let's
>> _not_ mess with flags, etc. for that; give it decent bandwidth
>> and since it's completely opaque for the rest of the kernel,
>> just a pass a string to be parsed by LSM as it sees fit.

Well, I don't know why you're so angry against LSM, but as noticed by
Matthew, the main focus of this patch series is not about LSM (no hook,
no security/* code, only file permission and mount option checks,
nothing fancy). Moreover, the syscall you're proposing doesn't make
sense, but I guess it's yet another sarcastic reply. Please, cool down.
We asked for constructive comments and already followed your previous
requests (even if we didn't get answers for some questions), but
seriously, this one is nonsense.

> 
> Hang on, it does have some things which aren't BD^W^WLSM.  It lets
> the interpreter honour the mount -o noexec option.  I presume it's
> not easily defeated by
>   cat /home/salaun/bin/bad.pl | perl -
> 

Funny. I know there is a lot of text and links but please read the
commit messages before further comments.

