On Thu, Aug 31, 2017 at 5:02 PM, Eric Biggers <ebigge...@gmail.com> wrote:
> From: Eric Biggers <ebigg...@google.com>
>
> Perhaps long overdue, add a documentation file for filesystem-level
> encryption, a.k.a. fscrypt or fs/crypto/, to the Documentation
> directory.  The new file is based loosely on the latest version of the
> "EXT4 Encryption Design Document (public version)" Google Doc, but with
> many improvements made, including:
>
> - Reflect the reality that it is not specific to ext4 anymore.
> - More thoroughly document the design and user-visible API/behavior.
> - Replace outdated information, such as the outdated explanation of how
>   encrypted filenames are hashed for indexed directories and how
>   encrypted filenames are presented to userspace without the key.
>   (This was changed just before release.)
>
> For now the focus is on the design and user-visible API/behavior, not on
> how to add encryption support to a filesystem --- since the internal API
> is still pretty messy and any standalone documentation for it would
> become outdated as things get refactored over time.
>
> Signed-off-by: Eric Biggers <ebigg...@google.com>
> ---
> Changes since v1:
> - Mention that using existing userspace tools is preferred
> - Don't start an argument about the best way to get random numbers
> - Make it clear that backup/restore of encrypted files without key is
>   not supported yet
> - Mention a reason why the encryption xattr should not be exposed
>   via the xattr system calls
>
>  Documentation/filesystems/fscrypt.rst | 597 ++++++++++++++++++++++++++++++++++
>  Documentation/filesystems/index.rst   |  11 +
>  MAINTAINERS                           |   1 +
>  3 files changed, 609 insertions(+)
>  create mode 100644 Documentation/filesystems/fscrypt.rst
>
> diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
> new file mode 100644
> index 000000000000..ec4cad049dde
> --- /dev/null
> +++ b/Documentation/filesystems/fscrypt.rst
> @@ -0,0 +1,597 @@
> +=====================================
> +Filesystem-level encryption (fscrypt)
> +=====================================
> +
> +Introduction
> +============
> +
> +fscrypt is a library which filesystems can hook into to support
> +transparent encryption of files and directories.
> +
> +Note: "fscrypt" in this document refers to the kernel-level portion,
> +implemented in ``fs/crypto/``, as opposed to the userspace tool
> +`fscrypt <https://github.com/google/fscrypt>`_.  This document only
> +covers the kernel-level portion.  For command-line examples of how to
> +use encryption, see the documentation for the userspace tool `fscrypt
> +<https://github.com/google/fscrypt>`_.  Also, it is strongly
> +recommended to use the fscrypt userspace tool, or other existing
> +userspace tools such as Android's key management system, over using
> +the kernel's API directly.  Using existing tools reduces the chance of
> +introducing your own security bugs.  (Nevertheless, for completeness
> +this documentation covers the kernel's API anyway.)

I think we should mention fscryptctl (https://github.com/google/fscryptctl)
here as well. While we should definitely emphasize that fscrypt is the preferred
solution, Richard Weinberger mentioned a need for a smaller, simpler tool when
you just want to apply a static key to a directory. fscryptctl also has no
shared library dependencies, unlike fscrypt.

- Joe Richey <joeric...@google.com>

> +
> +Unlike dm-crypt, fscrypt operates at the filesystem level rather than
> +at the block device level.  This allows it to encrypt different files
> +with different keys and to have unencrypted files on the same
> +filesystem.  This is useful for multi-user systems where each user's
> +data-at-rest needs to be cryptographically isolated from the others.
> +However, except for filenames, fscrypt does not encrypt filesystem
> +metadata.
> +
> +Unlike eCryptfs, which is a stacked filesystem, fscrypt is integrated
> +directly into supported filesystems --- currently ext4, F2FS, and
> +UBIFS.  This allows encrypted files to be read and written without
> +caching both the decrypted and encrypted pages in the pagecache,
> +thereby halving the memory used and bringing it in line with
> +unencrypted files.  Similarly, half as many dentries and inodes are
> +needed.  eCryptfs also limits filenames to 143 bytes, causing
> +application compatibility issues; fscrypt allows the full 255 bytes
> +(NAME_MAX).  Finally, unlike eCryptfs, the fscrypt API can be used by
> +unprivileged users, with no need to mount anything.
> +
> +fscrypt does not support encrypting files in-place.  Instead, it
> +supports marking an empty directory as encrypted.  Then, after
> +userspace provides the key, all regular files, directories, and
> +symbolic links created in that directory tree are transparently
> +encrypted.
> +
> +Threat model
> +============
> +
> +Offline attacks
> +---------------
> +
> +Provided that userspace chooses a strong encryption key, fscrypt
> +protects the confidentiality of file contents and filenames in the
> +event of a single point-in-time permanent offline compromise of the
> +block device content.  fscrypt does not protect the confidentiality of
> +non-filename metadata, e.g. file sizes, file permissions, file
> +timestamps, and extended attributes.  Also, the existence and location
> +of holes (unallocated blocks which logically contain all zeroes) in
> +files is not protected.
> +
> +fscrypt is not guaranteed to protect confidentiality or authenticity
> +if an attacker is able to manipulate the filesystem offline prior to
> +an authorized user later accessing the filesystem.
> +
> +Online attacks
> +--------------
> +
> +fscrypt (and storage encryption in general) can only provide limited
> +protection, if any at all, against online attacks.  In detail:
> +
> +fscrypt is only resistant to side-channel attacks, such as timing or
> +electromagnetic attacks, to the extent that the underlying Linux
> +Cryptographic API algorithms are.  If a vulnerable algorithm is used,
> +such as a table-based implementation of AES, it may be possible for an
> +attacker to mount a side channel attack against the online system.
> +
> +After an encryption key has been provided, fscrypt is not designed to
> +hide the plaintext file contents or filenames from other users on the
> +same system, regardless of the visibility of the keyring key.
> +Instead, existing access control mechanisms such as file mode bits,
> +POSIX ACLs, or SELinux should be used for this purpose.  Also note
> +that as long as the encryption keys are *anywhere* in memory, an
> +online attacker can necessarily compromise them by mounting a physical
> +attack or by exploiting any kernel security vulnerability which
> +provides an arbitrary memory read primitive.
> +
> +While it is ostensibly possible to "evict" keys from the system,
> +recently accessed encrypted files will remain accessible at least
> +until the filesystem is unmounted or the VFS caches are dropped, e.g.
> +using ``echo 2 > /proc/sys/vm/drop_caches``.  Even after that, if the
> +RAM is compromised before being powered off, it will likely still be
> +possible to recover portions of the plaintext file contents, if not
> +some of the encryption keys as well.  (Since Linux v4.12, all
> +in-kernel keys related to fscrypt are sanitized before being freed.
> +However, userspace would need to do its part as well.)
> +
> +Currently, fscrypt does not prevent a user from maliciously providing
> +an incorrect key for another user's existing encrypted files.  A
> +protection against this is planned.
> +
> +Key hierarchy
> +=============
> +
> +Master Keys
> +-----------
> +
> +Each encrypted directory tree is protected by a *master key*.  Master
> +keys can be up to 64 bytes long, and must be at least as long as the
> +greater of the key length needed by the contents and filenames
> +encryption modes being used.  For example, if AES-256-XTS is used for
> +contents encryption, the master key must be 64 bytes (512 bits).  Note
> +that the XTS mode is defined to require a key twice as long as that
> +required by the underlying block cipher.
> +
> +To "unlock" an encrypted directory tree, userspace must provide the
> +appropriate master key.  There can be any number of master keys, each
> +of which protects any number of directory trees on any number of
> +filesystems.
> +
> +Userspace should generate master keys either using a cryptographically
> +secure random number generator, or by using a KDF (Key Derivation
> +Function).  Note that whenever a KDF is used to "stretch" a
> +lower-entropy secret such as a passphrase, it is critical that a KDF
> +designed for this purpose be used, such as scrypt, PBKDF2, or Argon2.
> +
> +Per-file keys
> +-------------
> +
> +Master keys are not used to encrypt file contents or names directly.
> +Instead, a unique key is derived for each encrypted file, including
> +each regular file, directory, and symbolic link.  This has several
> +advantages:
> +
> +- In cryptosystems, the same key material should never be used for
> +  different purposes.  Using the master key as both an XTS key for
> +  contents encryption and as a CTS-CBC key for filenames encryption
> +  would violate this rule.
> +- Per-file keys simplify the choice of IVs (Initialization Vectors)
> +  for contents encryption.  Without per-file keys, to ensure IV
> +  uniqueness both the inode and logical block number would need to be
> +  encoded in the IVs.  This would make it impossible to renumber
> +  inodes, which e.g. ``resize2fs`` can do when resizing an ext4
> +  filesystem.  With per-file keys, it is sufficient to encode just the
> +  logical block number in the IVs.
> +- Per-file keys strengthen the encryption of filenames, where IVs are
> +  reused out of necessity.  With a unique key per directory, IV reuse
> +  is limited to within a single directory.
> +- Per-file keys allow individual files to be securely erased simply by
> +  securely erasing their keys.  (Not yet implemented.)
> +
> +A KDF (Key Derivation Function) is used to derive per-file keys from
> +the master key.  This is done instead of wrapping a randomly-generated
> +key for each file because it reduces the size of the encryption xattr,
> +which for some filesystems makes the xattr more likely to fit in-line
> +in the filesystem's inode table.  With a KDF, only a 16-byte nonce is
> +required --- long enough to make key reuse extremely unlikely.  A
> +wrapped key, on the other hand, would need to be up to 64 bytes ---
> +the length of an AES-256-XTS key.  Furthermore, currently there is no
> +requirement to support unlocking a file with multiple alternative
> +master keys or to support rotating master keys.  Instead, the master
> +keys may be wrapped in userspace, e.g. as done by the `fscrypt
> +<https://github.com/google/fscrypt>`_ tool.
> +
> +The current KDF encrypts the master key using the 16-byte nonce as an
> +AES-128-ECB key.  The output is used as the derived key.  If the
> +output is longer than needed, then it is truncated to the needed
> +length.  Truncation is the norm for directories and symlinks, since
> +those use the CTS-CBC encryption mode which requires a key half as
> +long as that required by the XTS encryption mode.
> +
> +Note: this KDF meets the primary security requirement, which is to
> +produce unique derived keys that preserve the entropy of the master
> +key, assuming that the master key is already a good pseudorandom key.
> +However, it is nonstandard and has some theoretical problems such as
> +being reversible, so it is generally considered to be a mistake!  It
> +may be replaced with HKDF or another more standard KDF in the future.
> +
> +Encryption modes and usage
> +==========================
> +
> +fscrypt allows one encryption mode to be specified for file contents
> +and one encryption mode to be specified for filenames.  Different
> +directory trees are permitted to use different encryption modes.
> +Currently, the following pairs of encryption modes are supported:
> +
> +- AES-256-XTS for contents and AES-256-CTS-CBC for filenames
> +- AES-128-CBC for contents and AES-128-CTS-CBC for filenames
> +
> +It is strongly recommended to use AES-256-XTS for contents encryption.
> +AES-128-CBC was added only for low-powered embedded devices with
> +crypto accelerators such as CAAM or CESA that do not support XTS.
> +
> +New encryption modes can be added relatively easily, without changes
> +to individual filesystems.  However, authenticated encryption (AE)
> +modes are not currently supported because of the difficulty of dealing
> +with ciphertext expansion.
> +
> +For file contents, each filesystem block is encrypted independently.
> +Currently, only the case where the filesystem block size is equal to
> +the system's page size (usually 4096 bytes) is supported.  With the
> +XTS mode of operation (recommended), the logical block number within
> +the file is used as the IV.  With the CBC mode of operation (not
> +recommended), ESSIV is used; specifically, the IV for CBC is the
> +logical block number encrypted with AES-256, where the AES-256 key is
> +the SHA-256 hash of the inode's data encryption key.
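
To make the XTS case above concrete, here is a minimal sketch of building
the per-block IV from the logical block number.  The function and macro
names are hypothetical, and the byte layout (a 64-bit little-endian block
number followed by zero padding) is an assumption of this sketch rather
than something the text above specifies.  The ESSIV step for CBC is not
shown.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_IV_SIZE 16      /* one AES block; hypothetical name */

/*
 * Build the IV for one filesystem block of file contents in XTS mode.
 * Assumed layout: logical block number in the first 8 bytes,
 * little-endian, with the remaining 8 bytes set to zero.
 */
void example_contents_iv(uint64_t lblk_num, uint8_t iv[EXAMPLE_IV_SIZE])
{
        int i;

        assert(iv != NULL);

        memset(iv, 0, EXAMPLE_IV_SIZE);
        for (i = 0; i < 8; i++)
                iv[i] = (uint8_t)(lblk_num >> (8 * i));
}
```

Because each file has its own key, two files may use the same IV for the
same logical block number without reusing a (key, IV) pair.
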
> +
> +For filenames, the full filename is encrypted at once.  Because of the
> +requirements to retain support for efficient directory lookups and
> +filenames of up to 255 bytes, a constant initialization vector (IV) is
> +used.  However, each encrypted directory uses a unique key, which
> +limits IV reuse to within a single directory.
> +
> +Since filenames are encrypted with the CTS-CBC mode of operation, the
> +plaintext and ciphertext filenames need not be multiples of the AES
> +block size, i.e. 16 bytes.  However, the minimum size that can be
> +encrypted is 16 bytes, so shorter filenames are NUL-padded to 16 bytes
> +before being encrypted.  In addition, to reduce leakage of filename
> +lengths via their ciphertexts, all filenames are NUL-padded to the
> +next 4, 8, 16, or 32-byte boundary (configurable).  32 is recommended
> +since this provides the best confidentiality, at the cost of making
> +directory entries consume slightly more space.  Note that since NUL
> +(``\0``) is not otherwise a valid character in filenames, the padding
> +will never produce duplicate plaintexts.
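
The padding rule described above can be sketched as a small length
calculation.  This is only an illustration: the function name is
hypothetical, and the sketch ignores any upper limit a filesystem may
impose on the stored name length.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_AES_BLOCK_SIZE 16       /* hypothetical name */

/* Length of a filename after NUL-padding, before CTS-CBC encryption. */
size_t example_padded_name_len(size_t name_len, size_t padding)
{
        size_t len = name_len;

        /* Only the four configurable boundaries are valid. */
        assert(padding == 4 || padding == 8 || padding == 16 || padding == 32);

        /* CTS-CBC cannot encrypt less than one AES block. */
        if (len < EXAMPLE_AES_BLOCK_SIZE)
                len = EXAMPLE_AES_BLOCK_SIZE;

        /* Round up to the next multiple of the padding boundary. */
        return (len + padding - 1) / padding * padding;
}
```

With 32-byte padding, every name of 1 to 32 bytes yields a 32-byte
ciphertext, so only a coarse length bucket is visible.  With 4-byte
padding, a 17-byte name is stored as 20 bytes.
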
> +
> +Symbolic link targets are considered a type of filename and are
> +encrypted in the same way as filenames in directory entries.  Each
> +symlink also uses a unique key; hence, the hardcoded IV is not a
> +problem for symlinks.
> +
> +User API
> +========
> +
> +Setting an encryption policy
> +----------------------------
> +
> +The FS_IOC_SET_ENCRYPTION_POLICY ioctl sets an encryption policy on an
> +empty directory or verifies that a directory or regular file already
> +has the specified encryption policy.  It takes in a pointer to a
> +:c:type:`struct fscrypt_policy`, defined as follows::
> +
> +    #define FS_KEY_DESCRIPTOR_SIZE  8
> +
> +    struct fscrypt_policy {
> +            __u8 version;
> +            __u8 contents_encryption_mode;
> +            __u8 filenames_encryption_mode;
> +            __u8 flags;
> +            __u8 master_key_descriptor[FS_KEY_DESCRIPTOR_SIZE];
> +    };
> +
> +This structure must be initialized as follows:
> +
> +- ``version`` must be 0.
> +
> +- ``contents_encryption_mode`` and ``filenames_encryption_mode`` must
> +  be set to constants from ``<linux/fs.h>`` which identify the
> +  encryption modes to use.  If unsure, use
> +  FS_ENCRYPTION_MODE_AES_256_XTS (1) for ``contents_encryption_mode``
> +  and FS_ENCRYPTION_MODE_AES_256_CTS (4) for
> +  ``filenames_encryption_mode``.
> +
> +- ``flags`` must be set to a value from ``<linux/fs.h>`` which
> +  identifies the amount of NUL-padding to use when encrypting
> +  filenames.  If unsure, use FS_POLICY_FLAGS_PAD_32 (0x3).
> +
> +- ``master_key_descriptor`` specifies how to find the master key in
> +  the keyring; see `Adding keys`_.  It is up to userspace to choose a
> +  unique ``master_key_descriptor`` for each master key.  The e4crypt
> +  and fscrypt tools use the first 8 bytes of
> +  ``SHA-512(SHA-512(master_key))``, but this particular scheme is not
> +  required.  Also, the master key need not be in the keyring yet when
> +  FS_IOC_SET_ENCRYPTION_POLICY is executed.  However, it must be added
> +  before any files can be created in the encrypted directory.
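
As a rough sketch of the initialization rules above, the following fills
in a policy with the "if unsure" values and applies it to an open
directory.  The two function names are hypothetical, and the sketch
assumes the struct and constants are available from <linux/fs.h> as
described.  How the descriptor is chosen is left to the caller.

```c
#include <assert.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

/* Fill in a policy using the "if unsure" values suggested above. */
void example_fill_policy(struct fscrypt_policy *policy,
                         const unsigned char descriptor[FS_KEY_DESCRIPTOR_SIZE])
{
        assert(policy != NULL && descriptor != NULL);

        memset(policy, 0, sizeof(*policy));
        policy->version = 0;
        policy->contents_encryption_mode = FS_ENCRYPTION_MODE_AES_256_XTS;
        policy->filenames_encryption_mode = FS_ENCRYPTION_MODE_AES_256_CTS;
        policy->flags = FS_POLICY_FLAGS_PAD_32;
        memcpy(policy->master_key_descriptor, descriptor,
               FS_KEY_DESCRIPTOR_SIZE);
}

/*
 * Apply the policy to an open directory.  Returns 0 on success, or -1
 * with errno set to one of the errors documented for the ioctl.
 */
int example_set_policy(int dir_fd,
                       const unsigned char descriptor[FS_KEY_DESCRIPTOR_SIZE])
{
        struct fscrypt_policy policy;

        example_fill_policy(&policy, descriptor);
        return ioctl(dir_fd, FS_IOC_SET_ENCRYPTION_POLICY, &policy);
}
```

A caller would open the empty directory with open(path, O_RDONLY), pass
the resulting descriptor as dir_fd, and check errno on failure.
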
> +
> +If the file is not yet encrypted, then FS_IOC_SET_ENCRYPTION_POLICY
> +verifies that the file is an empty directory.  If so, the specified
> +encryption policy is assigned to the directory, turning it into an
> +encrypted directory.  After that, and after providing the
> +corresponding master key as described in `Adding keys`_, all regular
> +files, directories (recursively), and symlinks created in the
> +directory will be encrypted, inheriting the same encryption policy.
> +The filenames in the directory's entries will be encrypted as well.
> +
> +Alternatively, if the file is already encrypted, then
> +FS_IOC_SET_ENCRYPTION_POLICY validates that the specified encryption
> +policy exactly matches the actual one.  If they match, then the ioctl
> +returns 0.  Otherwise, it fails with EEXIST.  This works on both
> +regular files and directories, including nonempty directories.
> +
> +Note that the ext4 filesystem does not allow the root directory to be
> +encrypted, even if it is empty.  Users who want to encrypt an entire
> +filesystem with one key should consider using dm-crypt instead.
> +
> +FS_IOC_SET_ENCRYPTION_POLICY can fail with the following errors:
> +
> +- ``EACCES``: the file is not owned by the process's uid, nor does the
> +  process have the CAP_FOWNER capability in a namespace with the file
> +  owner's uid mapped
> +- ``EEXIST``: the file is already encrypted with an encryption policy
> +  different from the one specified
> +- ``EINVAL``: an invalid encryption policy was specified (invalid
> +  version, mode(s), or flags)
> +- ``ENOTDIR``: the file is unencrypted and is a regular file, not a
> +  directory
> +- ``ENOTEMPTY``: the file is unencrypted and is a nonempty directory
> +- ``ENOTTY``: this type of filesystem does not implement encryption
> +- ``EOPNOTSUPP``: the kernel was not configured with encryption
> +  support for this filesystem, or the filesystem superblock has not
> +  had encryption enabled on it.  (For example, to use encryption on an
> +  ext4 filesystem, CONFIG_EXT4_ENCRYPTION must be enabled in the
> +  kernel config, and the superblock must have had the "encrypt"
> +  feature flag enabled using ``tune2fs -O encrypt`` or ``mkfs.ext4 -O
> +  encrypt``.)
> +- ``EPERM``: this directory may not be encrypted, e.g. because it is
> +  the root directory of an ext4 filesystem
> +- ``EROFS``: the filesystem is readonly
> +
> +Getting an encryption policy
> +----------------------------
> +
> +The FS_IOC_GET_ENCRYPTION_POLICY ioctl retrieves the :c:type:`struct
> +fscrypt_policy`, if any, for a directory or regular file.  See above
> +for the struct definition.  No additional permissions are required
> +beyond the ability to open the file.
> +
> +FS_IOC_GET_ENCRYPTION_POLICY can fail with the following errors:
> +
> +- ``EINVAL``: the file is encrypted, but it uses an unrecognized
> +  encryption context format
> +- ``ENODATA``: the file is not encrypted
> +- ``ENOTTY``: this type of filesystem does not implement encryption
> +- ``EOPNOTSUPP``: the kernel was not configured with encryption
> +  support for this filesystem
> +
> +Note: if you only need to know whether a file is encrypted or not, on
> +most filesystems it is also possible to use the FS_IOC_GETFLAGS ioctl
> +and check for FS_ENCRYPT_FL, or to use the statx() system call and
> +check for STATX_ATTR_ENCRYPTED in stx_attributes.
> +
> +Getting the per-filesystem salt
> +-------------------------------
> +
> +Some filesystems, such as ext4 and F2FS, also support the deprecated
> +ioctl FS_IOC_GET_ENCRYPTION_PWSALT.  This ioctl retrieves a randomly
> +generated 16-byte value stored in the filesystem superblock.  This
> +value is intended to be used as a salt when deriving an encryption key
> +from a passphrase or other low-entropy user credential.
> +
> +FS_IOC_GET_ENCRYPTION_PWSALT is deprecated.  Instead, prefer to
> +generate and manage any needed salt(s) in userspace.
> +
> +Adding keys
> +-----------
> +
> +To provide a master key, userspace must add it to an appropriate
> +keyring using the add_key() system call (see:
> +``Documentation/security/keys/core.rst``).  The key type must be
> +"logon"; keys of this type are kept in kernel memory and cannot be
> +read back by userspace.  The key description must be "fscrypt:"
> +followed by the 16-character lower case hex representation of the
> +``master_key_descriptor`` that was set in the encryption policy.  The
> +key payload must conform to the following structure::
> +
> +    #define FS_MAX_KEY_SIZE 64
> +
> +    struct fscrypt_key {
> +            u32 mode;
> +            u8 raw[FS_MAX_KEY_SIZE];
> +            u32 size;
> +    };
> +
> +``mode`` is ignored; just set it to 0.  The actual key is provided in
> +``raw`` with ``size`` indicating its size in bytes.  That is, the
> +bytes ``raw[0..size-1]`` (inclusive) are the actual key.
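
The key description and payload rules above can be sketched as follows.
All names prefixed with example_ or EXAMPLE_ are hypothetical, and the
struct is a local mirror of the layout shown above using <stdint.h>
types, not a definition taken from a kernel header.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define EXAMPLE_DESCRIPTOR_SIZE 8
#define EXAMPLE_MAX_KEY_SIZE    64
/* "fscrypt:" (8 chars) + 16 hex digits + terminating NUL */
#define EXAMPLE_DESCRIPTION_LEN (8 + 2 * EXAMPLE_DESCRIPTOR_SIZE + 1)

/* Local mirror of the payload layout shown above. */
struct example_fscrypt_key {
        uint32_t mode;
        uint8_t raw[EXAMPLE_MAX_KEY_SIZE];
        uint32_t size;
};

/* Write "fscrypt:" followed by the descriptor in lower-case hex. */
void example_key_description(const uint8_t descriptor[EXAMPLE_DESCRIPTOR_SIZE],
                             char out[EXAMPLE_DESCRIPTION_LEN])
{
        int i;

        assert(descriptor != NULL && out != NULL);

        strcpy(out, "fscrypt:");
        for (i = 0; i < EXAMPLE_DESCRIPTOR_SIZE; i++)
                snprintf(out + 8 + 2 * i, 3, "%02x", (unsigned)descriptor[i]);
}

/* Fill the payload; returns 0 on success, -1 if the key is too long. */
int example_fill_payload(struct example_fscrypt_key *payload,
                         const uint8_t *key, size_t key_size)
{
        assert(payload != NULL && key != NULL);

        if (key_size > EXAMPLE_MAX_KEY_SIZE)
                return -1;
        memset(payload, 0, sizeof(*payload));
        payload->mode = 0;              /* ignored, per the text above */
        memcpy(payload->raw, key, key_size);
        payload->size = (uint32_t)key_size;
        return 0;
}
```

The resulting description and payload would then be passed to add_key()
with the key type "logon" and a suitable keyring.  Userspace should also
wipe its own copy of the key material once the key has been added.
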
> +
> +The key description prefix "fscrypt:" may alternatively be replaced
> +with a filesystem-specific prefix such as "ext4:".  However, the
> +filesystem-specific prefixes are deprecated and should not be used in
> +new programs.
> +
> +There are several different types of keyrings in which encryption keys
> +may be placed, such as a session keyring, a user session keyring, or a
> +user keyring.  Each key must be placed in a keyring that is "attached"
> +to all processes that might need to access files encrypted with it, in
> +the sense that request_key() will find the key.  Generally, if only
> +processes belonging to a specific user need to access a given
> +encrypted directory and no session keyring has been installed, then
> +that directory's key should be placed in that user's user session
> +keyring or user keyring.  Otherwise, a session keyring should be
> +installed if needed, and the key should be linked into that session
> +keyring, or in a keyring linked into that session keyring.
> +
> +Note: introducing the complex visibility semantics of keyrings here
> +was arguably a mistake --- especially given that by design, after any
> +process successfully opens an encrypted file (thereby setting up the
> +per-file key), possessing the keyring key is not actually required for
> +any process to read/write the file until its in-memory inode is
> +evicted.  In the future there probably should be a way to provide keys
> +directly to the filesystem instead, which would make the intended
> +semantics clearer.
> +
> +Access semantics
> +================
> +
> +With the key
> +------------
> +
> +With the encryption key, encrypted regular files, directories, and
> +symlinks behave very similarly to their unencrypted counterparts ---
> +after all, the encryption is intended to be transparent.  However,
> +astute users may notice some differences in behavior:
> +
> +- Unencrypted files, or files encrypted with a different encryption
> +  policy (i.e. different key, modes, or flags), cannot be renamed or
> +  linked into an encrypted directory; see `Encryption policy
> +  enforcement`_.  Attempts to do so will fail with EPERM.  However,
> +  encrypted files can be renamed within an encrypted directory, or
> +  into an unencrypted directory.
> +
> +- Direct I/O is not supported on encrypted files.  Attempts to use
> +  direct I/O on such files will fall back to buffered I/O.
> +
> +- The fallocate operations FALLOC_FL_COLLAPSE_RANGE,
> +  FALLOC_FL_INSERT_RANGE, and FALLOC_FL_ZERO_RANGE are not supported
> +  on encrypted files and will fail with EOPNOTSUPP.
> +
> +- Online defragmentation of encrypted files is not supported.  The
> +  EXT4_IOC_MOVE_EXT and F2FS_IOC_MOVE_RANGE ioctls will fail with
> +  EOPNOTSUPP.
> +
> +- The ext4 filesystem does not support data journaling with encrypted
> +  regular files.  It will fall back to ordered data mode instead.
> +
> +- DAX (Direct Access) is not supported on encrypted files.
> +
> +- The st_size of an encrypted symlink will not necessarily give the
> +  length of the symlink target as required by POSIX.  It will actually
> +  give the length of the ciphertext, which may be slightly longer than
> +  the plaintext due to the NUL-padding.
> +
> +Note that mmap *is* supported.  This is possible because the pagecache
> +for an encrypted file contains the plaintext, not the ciphertext.
> +
> +Without the key
> +---------------
> +
> +Some filesystem operations may be performed on encrypted regular
> +files, directories, and symlinks even before their encryption key has
> +been provided:
> +
> +- File metadata may be read, e.g. using stat().
> +
> +- Directories may be listed, in which case the filenames will be
> +  listed in an encoded form derived from their ciphertext.  The
> +  current encoding algorithm is described in `Filename hashing and
> +  encoding`_.  The algorithm is subject to change, but it is
> +  guaranteed that the presented filenames will be no longer than
> +  NAME_MAX bytes, will not contain the ``/`` or ``\0`` characters, and
> +  will uniquely identify directory entries.
> +
> +  The ``.`` and ``..`` directory entries are special.  They are always
> +  present and are not encrypted or encoded.
> +
> +- Files may be deleted.  That is, nondirectory files may be deleted
> +  with unlink() as usual, and empty directories may be deleted with
> +  rmdir() as usual.  Therefore, ``rm`` and ``rm -r`` will work as
> +  expected.
> +
> +- Symlink targets may be read and followed, but they will be presented
> +  in encrypted form, similar to filenames in directories.  Hence, they
> +  are unlikely to point to anywhere useful.
> +
> +Without the key, regular files cannot be opened or truncated.
> +Attempts to do so will fail with ENOKEY.  This implies that any
> +regular file operations that require a file descriptor, such as
> +read(), write(), mmap(), fallocate(), and ioctl(), are also forbidden.
> +
> +Also without the key, files of any type (including directories) cannot
> +be created or linked into an encrypted directory, nor can a name in an
> +encrypted directory be the source or target of a rename, nor can an
> +O_TMPFILE temporary file be created in an encrypted directory.  All
> +such operations will fail with ENOKEY.
> +
> +It is not currently possible to back up and restore encrypted files
> +without the encryption key.  This would require special APIs which
> +have not yet been implemented.
> +
> +Encryption policy enforcement
> +=============================
> +
> +After an encryption policy has been set on a directory, all regular
> +files, directories, and symbolic links created in that directory
> +(recursively) will inherit that encryption policy.  Special files ---
> +that is, named pipes, device nodes, and UNIX domain sockets --- will
> +not be encrypted.
> +
> +Except for those special files, it is forbidden to have unencrypted
> +files, or files encrypted with a different encryption policy, in an
> +encrypted directory tree.  Attempts to link or rename such a file into
> +an encrypted directory will fail with EPERM.  This is also enforced
> +during ->lookup() to provide limited protection against offline
> +attacks that try to disable or downgrade encryption in known locations
> +where applications may later write sensitive data.
> +
> +Implementation details
> +======================
> +
> +Encryption context
> +------------------
> +
> +An encryption policy is represented on-disk by a :c:type:`struct
> +fscrypt_context`.  It is up to individual filesystems to decide where
> +to store it, but normally it would be stored in a hidden extended
> +attribute.  It should *not* be exposed by the xattr-related system
> +calls such as getxattr() and setxattr() because of the special
> +semantics of the encryption xattr.  (In particular, there would be
> +much confusion if an encryption policy were to be added to or removed
> +from anything other than an empty directory.)  The struct is defined
> +as follows::
> +
> +    #define FS_KEY_DESCRIPTOR_SIZE  8
> +    #define FS_KEY_DERIVATION_NONCE_SIZE 16
> +
> +    struct fscrypt_context {
> +            u8 format;
> +            u8 contents_encryption_mode;
> +            u8 filenames_encryption_mode;
> +            u8 flags;
> +            u8 master_key_descriptor[FS_KEY_DESCRIPTOR_SIZE];
> +            u8 nonce[FS_KEY_DERIVATION_NONCE_SIZE];
> +    };
> +
> +Note that :c:type:`struct fscrypt_context` contains the same
> +information as :c:type:`struct fscrypt_policy` (see `Setting an
> +encryption policy`_), except that :c:type:`struct fscrypt_context`
> +also contains a nonce.  The nonce is randomly generated by the kernel
> +and is used to derive the inode's encryption key as described in
> +`Per-file keys`_.
> +
> +Data path changes
> +-----------------
> +
> +For the read path (->readpage()) of regular files, filesystems can
> +read the ciphertext into the page cache and decrypt it in-place.  The
> +page lock must be held until decryption has finished, to prevent the
> +page from becoming visible to userspace prematurely.
> +
> +For the write path (->writepage()) of regular files, filesystems
> +cannot encrypt data in-place in the page cache, since the cached
> +plaintext must be preserved.  Instead, filesystems must encrypt into a
> +temporary buffer or "bounce page", then write out the temporary
> +buffer.  Some filesystems, such as UBIFS, already use temporary
> +buffers regardless of encryption.  Other filesystems, such as ext4 and
> +F2FS, have to allocate bounce pages specially for encryption.
> +
> +Filename hashing and encoding
> +-----------------------------
> +
> +Modern filesystems accelerate directory lookups by using indexed
> +directories.  An indexed directory is organized as a tree keyed by
> +filename hashes.  When a ->lookup() is requested, the filesystem
> +normally hashes the filename being looked up so that it can quickly
> +find the corresponding directory entry, if any.
> +
> +With encryption, lookups must be supported and efficient both with and
> +without the encryption key.  Clearly, it would not work to hash the
> +plaintext filenames, since the plaintext filenames are unavailable
> +without the key.  (Hashing the plaintext filenames would also make it
> +impossible for the filesystem's fsck tool to optimize encrypted
> +directories.)  Instead, filesystems hash the ciphertext filenames,
> +i.e. the bytes actually stored on-disk in the directory entries.  When
> +asked to do a ->lookup() with the key, the filesystem just encrypts
> +the user-supplied name to get the ciphertext.
> +
> +Lookups without the key are more complicated.  The raw ciphertext may
> +contain the ``\0`` and ``/`` characters, which are illegal in
> +filenames.  Therefore, readdir() must base64-encode the ciphertext for
> +presentation.  For most filenames, this works fine; on ->lookup(), the
> +filesystem just base64-decodes the user-supplied name to get back to
> +the raw ciphertext.
> +
> +However, for very long filenames, base64 encoding would cause the
> +filename length to exceed NAME_MAX.  To prevent this, readdir()
> +actually presents long filenames in an abbreviated form which encodes
> +a strong "hash" of the ciphertext filename, along with the optional
> +filesystem-specific hash(es) needed for directory lookups.  This
> +allows the filesystem to still, with a high degree of confidence, map
> +the filename given in ->lookup() back to a particular directory entry
> +that was previously listed by readdir().  See :c:type:`struct
> +fscrypt_digested_name` in the source for more details.
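
To illustrate why long names cannot simply be base64-encoded, here is a
small length calculation.  It assumes a base64 variant without '='
padding, and the function names are hypothetical.  The point at which the
kernel actually switches to the abbreviated form may differ from the
limit computed here.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_NAME_MAX 255    /* mirrors NAME_MAX; hypothetical name */

/* Characters needed to base64-encode n bytes without '=' padding. */
size_t example_base64_len(size_t n)
{
        return (4 * n + 2) / 3;
}

/* Nonzero if the encoded ciphertext would still fit in NAME_MAX. */
int example_fits_in_name_max(size_t ciphertext_len)
{
        assert(ciphertext_len <= EXAMPLE_NAME_MAX);

        return example_base64_len(ciphertext_len) <= EXAMPLE_NAME_MAX;
}
```

Under these assumptions a 255-byte ciphertext would need 340 characters,
and the longest ciphertext whose encoding still fits in 255 characters is
191 bytes.
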
> +
> +Note that the precise way that filenames are presented to userspace
> +without the key is subject to change in the future.  It is only meant
> +as a way to temporarily present valid filenames so that commands like
> +``rm -r`` work as expected on encrypted directories.
> diff --git a/Documentation/filesystems/index.rst b/Documentation/filesystems/index.rst
> index 256e10eedba4..53b89d0edc15 100644
> --- a/Documentation/filesystems/index.rst
> +++ b/Documentation/filesystems/index.rst
> @@ -315,3 +315,14 @@ exported for use by modules.
>     :internal:
>
>  .. kernel-doc:: fs/pipe.c
> +
> +Encryption API
> +==============
> +
> +A library which filesystems can hook into to support transparent
> +encryption of files and directories.
> +
> +.. toctree::
> +    :maxdepth: 2
> +
> +    fscrypt
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1c3feffb1c1c..beee181ec84e 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -5558,6 +5558,7 @@ T:        git git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt.git
>  S:     Supported
>  F:     fs/crypto/
>  F:     include/linux/fscrypt*.h
> +F:     Documentation/filesystems/fscrypt.rst
>
>  FUJITSU FR-V (FRV) PORT
>  S:     Orphan
> --
> 2.14.1.581.gf28d330327-goog
>

On Thu, Aug 31, 2017 at 5:02 PM, Eric Biggers <ebigge...@gmail.com> wrote:
> From: Eric Biggers <ebigg...@google.com>
>
> Perhaps long overdue, add a documentation file for filesystem-level
> encryption, a.k.a. fscrypt or fs/crypto/, to the Documentation
> directory.  The new file is based loosely on the latest version of the
> "EXT4 Encryption Design Document (public version)" Google Doc, but with
> many improvements made, including:
>
> - Reflect the reality that it is not specific to ext4 anymore.
> - More thoroughly document the design and user-visible API/behavior.
> - Replace outdated information, such as the outdated explanation of how
>   encrypted filenames are hashed for indexed directories and how
>   encrypted filenames are presented to userspace without the key.
>   (This was changed just before release.)
>
> For now the focus is on the design and user-visible API/behavior, not on
> how to add encryption support to a filesystem --- since the internal API
> is still pretty messy and any standalone documentation for it would
> become outdated as things get refactored over time.
>
> Signed-off-by: Eric Biggers <ebigg...@google.com>
> ---
> Changes since v1:
> - Mention that using existing userspace tools is preferred
> - Don't start an argument about the best way to get random numbers
> - Make it clear that backup/restore of encrypted files without key is
>   not supported yet
> - Mention a reason why the encryption xattr should not be exposed
>   via the xattr system calls
>
>  Documentation/filesystems/fscrypt.rst | 597 
> ++++++++++++++++++++++++++++++++++
>  Documentation/filesystems/index.rst   |  11 +
>  MAINTAINERS                           |   1 +
>  3 files changed, 609 insertions(+)
>  create mode 100644 Documentation/filesystems/fscrypt.rst
>
> diff --git a/Documentation/filesystems/fscrypt.rst 
> b/Documentation/filesystems/fscrypt.rst
> new file mode 100644
> index 000000000000..ec4cad049dde
> --- /dev/null
> +++ b/Documentation/filesystems/fscrypt.rst
> @@ -0,0 +1,597 @@
> +=====================================
> +Filesystem-level encryption (fscrypt)
> +=====================================
> +
> +Introduction
> +============
> +
> +fscrypt is a library which filesystems can hook into to support
> +transparent encryption of files and directories.
> +
> +Note: "fscrypt" in this document refers to the kernel-level portion,
> +implemented in ``fs/crypto/``, as opposed to the userspace tool
> +`fscrypt <https://github.com/google/fscrypt>`_.  This document only
> +covers the kernel-level portion.  For command-line examples of how to
> +use encryption, see the documentation for the userspace tool `fscrypt
> +<https://github.com/google/fscrypt>`_.  Also, it is strongly
> +recommended to use the fscrypt userspace tool, or other existing
> +userspace tools such as Android's key management system, over using
> +the kernel's API directly.  Using existing tools reduces the chance of
> +introducing your own security bugs.  (Nevertheless, for completeness
> +this documentation covers the kernel's API anyway.)
> +
> +Unlike dm-crypt, fscrypt operates at the filesystem level rather than
> +at the block device level.  This allows it to encrypt different files
> +with different keys and to have unencrypted files on the same
> +filesystem.  This is useful for multi-user systems where each user's
> +data-at-rest needs to be cryptographically isolated from the others.
> +However, except for filenames, fscrypt does not encrypt filesystem
> +metadata.
> +
> +Unlike eCryptfs, which is a stacked filesystem, fscrypt is integrated
> +directly into supported filesystems --- currently ext4, F2FS, and
> +UBIFS.  This allows encrypted files to be read and written without
> +caching both the decrypted and encrypted pages in the pagecache,
> +thereby halving the memory used and bringing it in line with
> +unencrypted files.  Similarly, half as many dentries and inodes are
> +needed.  eCryptfs also limits filenames to 143 bytes, causing
> +application compatibility issues; fscrypt allows the full 255 bytes
> +(NAME_MAX).  Finally, unlike eCryptfs, the fscrypt API can be used by
> +unprivileged users, with no need to mount anything.
> +
> +fscrypt does not support encrypting files in-place.  Instead, it
> +supports marking an empty directory as encrypted.  Then, after
> +userspace provides the key, all regular files, directories, and
> +symbolic links created in that directory tree are transparently
> +encrypted.
> +
> +Threat model
> +============
> +
> +Offline attacks
> +---------------
> +
> +Provided that userspace chooses a strong encryption key, fscrypt
> +protects the confidentiality of file contents and filenames in the
> +event of a single point-in-time permanent offline compromise of the
> +block device content.  fscrypt does not protect the confidentiality of
> +non-filename metadata, e.g. file sizes, file permissions, file
> +timestamps, and extended attributes.  Also, the existence and location
> +of holes (unallocated blocks which logically contain all zeroes) in
> +files is not protected.
> +
> +fscrypt is not guaranteed to protect confidentiality or authenticity
> +if an attacker is able to manipulate the filesystem offline prior to
> +an authorized user later accessing the filesystem.
> +
> +Online attacks
> +--------------
> +
> +fscrypt (and storage encryption in general) can only provide limited
> +protection, if any at all, against online attacks.  In detail:
> +
> +fscrypt is only resistant to side-channel attacks, such as timing or
> +electromagnetic attacks, to the extent that the underlying Linux
> +Cryptographic API algorithms are.  If a vulnerable algorithm is used,
> +such as a table-based implementation of AES, it may be possible for an
> +attacker to mount a side channel attack against the online system.
> +
> +After an encryption key has been provided, fscrypt is not designed to
> +hide the plaintext file contents or filenames from other users on the
> +same system, regardless of the visibility of the keyring key.
> +Instead, existing access control mechanisms such as file mode bits,
> +POSIX ACLs, or SELinux should be used for this purpose.  Also note
> +that as long as the encryption keys are *anywhere* in memory, an
> +online attacker can necessarily compromise them by mounting a physical
> +attack or by exploiting any kernel security vulnerability which
> +provides an arbitrary memory read primitive.
> +
> +While it is ostensibly possible to "evict" keys from the system,
> +recently accessed encrypted files will remain accessible at least
> +until the filesystem is unmounted or the VFS caches are dropped, e.g.
> +using ``echo 2 > /proc/sys/vm/drop_caches``.  Even after that, if the
> +RAM is compromised before being powered off, it will likely still be
> +possible to recover portions of the plaintext file contents, if not
> +some of the encryption keys as well.  (Since Linux v4.12, all
> +in-kernel keys related to fscrypt are sanitized before being freed.
> +However, userspace would need to do its part as well.)
> +
> +Currently, fscrypt does not prevent a user from maliciously providing
> +an incorrect key for another user's existing encrypted files.  A
> +protection against this is planned.
> +
> +Key hierarchy
> +=============
> +
> +Master Keys
> +-----------
> +
> +Each encrypted directory tree is protected by a *master key*.  Master
> +keys can be up to 64 bytes long, and must be at least as long as the
> +greater of the key length needed by the contents and filenames
> +encryption modes being used.  For example, if AES-256-XTS is used for
> +contents encryption, the master key must be 64 bytes (512 bits).  Note
> +that the XTS mode is defined to require a key twice as long as that
> +required by the underlying block cipher.
> +
> +To "unlock" an encrypted directory tree, userspace must provide the
> +appropriate master key.  There can be any number of master keys, each
> +of which protects any number of directory trees on any number of
> +filesystems.
> +
> +Userspace should generate master keys either using a cryptographically
> +secure random number generator, or by using a KDF (Key Derivation
> +Function).  Note that whenever a KDF is used to "stretch" a
> +lower-entropy secret such as a passphrase, it is critical that a KDF
> +designed for this purpose be used, such as scrypt, PBKDF2, or Argon2.
> +
> +Per-file keys
> +-------------
> +
> +Master keys are not used to encrypt file contents or names directly.
> +Instead, a unique key is derived for each encrypted file, including
> +each regular file, directory, and symbolic link.  This has several
> +advantages:
> +
> +- In cryptosystems, the same key material should never be used for
> +  different purposes.  Using the master key as both an XTS key for
> +  contents encryption and as a CTS-CBC key for filenames encryption
> +  would violate this rule.
> +- Per-file keys simplify the choice of IVs (Initialization Vectors)
> +  for contents encryption.  Without per-file keys, to ensure IV
> +  uniqueness both the inode and logical block number would need to be
> +  encoded in the IVs.  This would make it impossible to renumber
> +  inodes, which e.g. ``resize2fs`` can do when resizing an ext4
> +  filesystem.  With per-file keys, it is sufficient to encode just the
> +  logical block number in the IVs.
> +- Per-file keys strengthen the encryption of filenames, where IVs are
> +  reused out of necessity.  With a unique key per directory, IV reuse
> +  is limited to within a single directory.
> +- Per-file keys allow individual files to be securely erased simply by
> +  securely erasing their keys.  (Not yet implemented.)
> +
> +A KDF (Key Derivation Function) is used to derive per-file keys from
> +the master key.  This is done instead of wrapping a randomly-generated
> +key for each file because it reduces the size of the encryption xattr,
> +which for some filesystems makes the xattr more likely to fit in-line
> +in the filesystem's inode table.  With a KDF, only a 16-byte nonce is
> +required --- long enough to make key reuse extremely unlikely.  A
> +wrapped key, on the other hand, would need to be up to 64 bytes ---
> +the length of an AES-256-XTS key.  Furthermore, currently there is no
> +requirement to support unlocking a file with multiple alternative
> +master keys or to support rotating master keys.  Instead, the master
> +keys may be wrapped in userspace, e.g. as done by the `fscrypt
> +<https://github.com/google/fscrypt>`_ tool.
> +
> +The current KDF encrypts the master key using the 16-byte nonce as an
> +AES-128-ECB key.  The output is used as the derived key.  If the
> +output is longer than needed, then it is truncated to the needed
> +length.  Truncation is the norm for directories and symlinks, since
> +those use the CTS-CBC encryption mode which requires a key half as
> +long as that required by the XTS encryption mode.
> +
> +Note: this KDF meets the primary security requirement, which is to
> +produce unique derived keys that preserve the entropy of the master
> +key, assuming that the master key is already a good pseudorandom key.
> +However, it is nonstandard and has some theoretical problems such as
> +being reversible, so it is generally considered to be a mistake!  It
> +may be replaced with HKDF or another more standard KDF in the future.
> +
> +Encryption modes and usage
> +==========================
> +
> +fscrypt allows one encryption mode to be specified for file contents
> +and one encryption mode to be specified for filenames.  Different
> +directory trees are permitted to use different encryption modes.
> +Currently, the following pairs of encryption modes are supported:
> +
> +- AES-256-XTS for contents and AES-256-CTS-CBC for filenames
> +- AES-128-CBC for contents and AES-128-CTS-CBC for filenames
> +
> +It is strongly recommended to use AES-256-XTS for contents encryption.
> +AES-128-CBC was added only for low-powered embedded devices with
> +crypto accelerators such as CAAM or CESA that do not support XTS.
> +
> +New encryption modes can be added relatively easily, without changes
> +to individual filesystems.  However, authenticated encryption (AE)
> +modes are not currently supported because of the difficulty of dealing
> +with ciphertext expansion.
> +
> +For file contents, each filesystem block is encrypted independently.
> +Currently, only the case where the filesystem block size is equal to
> +the system's page size (usually 4096 bytes) is supported.  With the
> +XTS mode of operation (recommended), the logical block number within
> +the file is used as the IV.  With the CBC mode of operation (not
> +recommended), ESSIV is used; specifically, the IV for CBC is the
> +logical block number encrypted with AES-256, where the AES-256 key is
> +the SHA-256 hash of the inode's data encryption key.
> +
> +For filenames, the full filename is encrypted at once.  Because of the
> +requirements to retain support for efficient directory lookups and
> +filenames of up to 255 bytes, a constant initialization vector (IV) is
> +used.  However, each encrypted directory uses a unique key, which
> +limits IV reuse to within a single directory.
> +
> +Since filenames are encrypted with the CTS-CBC mode of operation, the
> +plaintext and ciphertext filenames need not be multiples of the AES
> +block size, i.e. 16 bytes.  However, the minimum size that can be
> +encrypted is 16 bytes, so shorter filenames are NUL-padded to 16 bytes
> +before being encrypted.  In addition, to reduce leakage of filename
> +lengths via their ciphertexts, all filenames are NUL-padded to the
> +next 4, 8, 16, or 32-byte boundary (configurable).  32 is recommended
> +since this provides the best confidentiality, at the cost of making
> +directory entries consume slightly more space.  Note that since NUL
> +(``\0``) is not otherwise a valid character in filenames, the padding
> +will never produce duplicate plaintexts.
> +
> +Symbolic link targets are considered a type of filename and are
> +encrypted in the same way as filenames in directory entries.  Each
> +symlink also uses a unique key; hence, the hardcoded IV is not a
> +problem for symlinks.
> +
> +User API
> +========
> +
> +Setting an encryption policy
> +----------------------------
> +
> +The FS_IOC_SET_ENCRYPTION_POLICY ioctl sets an encryption policy on an
> +empty directory or verifies that a directory or regular file already
> +has the specified encryption policy.  It takes in a pointer to a
> +:c:type:`struct fscrypt_policy`, defined as follows::
> +
> +    #define FS_KEY_DESCRIPTOR_SIZE  8
> +
> +    struct fscrypt_policy {
> +            __u8 version;
> +            __u8 contents_encryption_mode;
> +            __u8 filenames_encryption_mode;
> +            __u8 flags;
> +            __u8 master_key_descriptor[FS_KEY_DESCRIPTOR_SIZE];
> +    };
> +
> +This structure must be initialized as follows:
> +
> +- ``version`` must be 0.
> +
> +- ``contents_encryption_mode`` and ``filenames_encryption_mode`` must
> +  be set to constants from ``<linux/fs.h>`` which identify the
> +  encryption modes to use.  If unsure, use
> +  FS_ENCRYPTION_MODE_AES_256_XTS (1) for ``contents_encryption_mode``
> +  and FS_ENCRYPTION_MODE_AES_256_CTS (4) for
> +  ``filenames_encryption_mode``.
> +
> +- ``flags`` must be set to a value from ``<linux/fs.h>`` which
> +  identifies the amount of NUL-padding to use when encrypting
> +  filenames.  If unsure, use FS_POLICY_FLAGS_PAD_32 (0x3).
> +
> +- ``master_key_descriptor`` specifies how to find the master key in
> +  the keyring; see `Adding keys`_.  It is up to userspace to choose a
> +  unique ``master_key_descriptor`` for each master key.  The e4crypt
> +  and fscrypt tools use the first 8 bytes of
> +  ``SHA-512(SHA-512(master_key))``, but this particular scheme is not
> +  required.  Also, the master key need not be in the keyring yet when
> +  FS_IOC_SET_ENCRYPTION_POLICY is executed.  However, it must be added
> +  before any files can be created in the encrypted directory.
> +
> +If the file is not yet encrypted, then FS_IOC_SET_ENCRYPTION_POLICY
> +verifies that the file is an empty directory.  If so, the specified
> +encryption policy is assigned to the directory, turning it into an
> +encrypted directory.  After that, and after providing the
> +corresponding master key as described in `Adding keys`_, all regular
> +files, directories (recursively), and symlinks created in the
> +directory will be encrypted, inheriting the same encryption policy.
> +The filenames in the directory's entries will be encrypted as well.
> +
> +Alternatively, if the file is already encrypted, then
> +FS_IOC_SET_ENCRYPTION_POLICY validates that the specified encryption
> +policy exactly matches the actual one.  If they match, then the ioctl
> +returns 0.  Otherwise, it fails with EEXIST.  This works on both
> +regular files and directories, including nonempty directories.
> +
> +Note that the ext4 filesystem does not allow the root directory to be
> +encrypted, even if it is empty.  Users who want to encrypt an entire
> +filesystem with one key should consider using dm-crypt instead.
> +
> +FS_IOC_SET_ENCRYPTION_POLICY can fail with the following errors:
> +
> +- ``EACCES``: the file is not owned by the process's uid, nor does the
> +  process have the CAP_FOWNER capability in a namespace with the file
> +  owner's uid mapped
> +- ``EEXIST``: the file is already encrypted with an encryption policy
> +  different from the one specified
> +- ``EINVAL``: an invalid encryption policy was specified (invalid
> +  version, mode(s), or flags)
> +- ``ENOTDIR``: the file is unencrypted and is a regular file, not a
> +  directory
> +- ``ENOTEMPTY``: the file is unencrypted and is a nonempty directory
> +- ``ENOTTY``: this type of filesystem does not implement encryption
> +- ``EOPNOTSUPP``: the kernel was not configured with encryption
> +  support for this filesystem, or the filesystem superblock has not
> +  had encryption enabled on it.  (For example, to use encryption on an
> +  ext4 filesystem, CONFIG_EXT4_ENCRYPTION must be enabled in the
> +  kernel config, and the superblock must have had the "encrypt"
> +  feature flag enabled using ``tune2fs -O encrypt`` or ``mkfs.ext4 -O
> +  encrypt``.)
> +- ``EPERM``: this directory may not be encrypted, e.g. because it is
> +  the root directory of an ext4 filesystem
> +- ``EROFS``: the filesystem is readonly
> +
> +Getting an encryption policy
> +----------------------------
> +
> +The FS_IOC_GET_ENCRYPTION_POLICY ioctl retrieves the :c:type:`struct
> +fscrypt_policy`, if any, for a directory or regular file.  See above
> +for the struct definition.  No additional permissions are required
> +beyond the ability to open the file.
> +
> +FS_IOC_GET_ENCRYPTION_POLICY can fail with the following errors:
> +
> +- ``EINVAL``: the file is encrypted, but it uses an unrecognized
> +  encryption context format
> +- ``ENODATA``: the file is not encrypted
> +- ``ENOTTY``: this type of filesystem does not implement encryption
> +- ``EOPNOTSUPP``: the kernel was not configured with encryption
> +  support for this filesystem
> +
> +Note: if you only need to know whether a file is encrypted or not, on
> +most filesystems it is also possible to use the FS_IOC_GETFLAGS ioctl
> +and check for FS_ENCRYPT_FL, or to use the statx() system call and
> +check for STATX_ATTR_ENCRYPTED in stx_attributes.
> +
> +Getting the per-filesystem salt
> +-------------------------------
> +
> +Some filesystems, such as ext4 and F2FS, also support the deprecated
> +ioctl FS_IOC_GET_ENCRYPTION_PWSALT.  This ioctl retrieves a randomly
> +generated 16-byte value stored in the filesystem superblock.  This
> +value is intended to used as a salt when deriving an encryption key
> +from a passphrase or other low-entropy user credential.
> +
> +FS_IOC_GET_ENCRYPTION_PWSALT is deprecated.  Instead, prefer to
> +generate and manage any needed salt(s) in userspace.
> +
> +Adding keys
> +-----------
> +
> +To provide a master key, userspace must add it to an appropriate
> +keyring using the add_key() system call (see:
> +``Documentation/security/keys/core.rst``).  The key type must be
> +"logon"; keys of this type are kept in kernel memory and cannot be
> +read back by userspace.  The key description must be "fscrypt:"
> +followed by the 16-character lower case hex representation of the
> +``master_key_descriptor`` that was set in the encryption policy.  The
> +key payload must conform to the following structure::
> +
> +    #define FS_MAX_KEY_SIZE 64
> +
> +    struct fscrypt_key {
> +            u32 mode;
> +            u8 raw[FS_MAX_KEY_SIZE];
> +            u32 size;
> +    };
> +
> +``mode`` is ignored; just set it to 0.  The actual key is provided in
> +``raw`` with ``size`` indicating its size in bytes.  That is, the
> +bytes ``raw[0..size-1]`` (inclusive) are the actual key.
> +
> +The key description prefix "fscrypt:" may alternatively be replaced
> +with a filesystem-specific prefix such as "ext4:".  However, the
> +filesystem-specific prefixes are deprecated and should not be used in
> +new programs.
> +
> +There are several different types of keyrings in which encryption keys
> +may be placed, such as a session keyring, a user session keyring, or a
> +user keyring.  Each key must be placed in a keyring that is "attached"
> +to all processes that might need to access files encrypted with it, in
> +the sense that request_key() will find the key.  Generally, if only
> +processes belonging to a specific user need to access a given
> +encrypted directory and no session keyring has been installed, then
> +that directory's key should be placed in that user's user session
> +keyring or user keyring.  Otherwise, a session keyring should be
> +installed if needed, and the key should be linked into that session
> +keyring, or in a keyring linked into that session keyring.
> +
> +Note: introducing the complex visibility semantics of keyrings here
> +was arguably a mistake --- especially given that by design, after any
> +process successfully opens an encrypted file (thereby setting up the
> +per-file key), possessing the keyring key is not actually required for
> +any process to read/write the file until its in-memory inode is
> +evicted.  In the future there probably should be a way to provide keys
> +directly to the filesystem instead, which would make the intended
> +semantics clearer.
> +
> +Access semantics
> +================
> +
> +With the key
> +------------
> +
> +With the encryption key, encrypted regular files, directories, and
> +symlinks behave very similarly to their unencrypted counterparts ---
> +after all, the encryption is intended to be transparent.  However,
> +astute users may notice some differences in behavior:
> +
> +- Unencrypted files, or files encrypted with a different encryption
> +  policy (i.e. different key, modes, or flags), cannot be renamed or
> +  linked into an encrypted directory; see `Encryption policy
> +  enforcement`_.  Attempts to do so will fail with EPERM.  However,
> +  encrypted files can be renamed within an encrypted directory, or
> +  into an unencrypted directory.
> +
> +- Direct I/O is not supported on encrypted files.  Attempts to use
> +  direct I/O on such files will fall back to buffered I/O.
> +
> +- The fallocate operations FALLOC_FL_COLLAPSE_RANGE,
> +  FALLOC_FL_INSERT_RANGE, and FALLOC_FL_ZERO_RANGE are not supported
> +  on encrypted files and will fail with EOPNOTSUPP.
> +
> +- Online defragmentation of encrypted files is not supported.  The
> +  EXT4_IOC_MOVE_EXT and F2FS_IOC_MOVE_RANGE ioctls will fail with
> +  EOPNOTSUPP.
> +
> +- The ext4 filesystem does not support data journaling with encrypted
> +  regular files.  It will fall back to ordered data mode instead.
> +
> +- DAX (Direct Access) is not supported on encrypted files.
> +
> +- The st_size of an encrypted symlink will not necessarily give the
> +  length of the symlink target as required by POSIX.  It will actually
> +  give the length of the ciphertext, which may be slightly longer than
> +  the plaintext due to the NUL-padding.
> +
> +Note that mmap *is* supported.  This is possible because the pagecache
> +for an encrypted file contains the plaintext, not the ciphertext.
> +
> +Without the key
> +---------------
> +
> +Some filesystem operations may be performed on encrypted regular
> +files, directories, and symlinks even before their encryption key has
> +been provided:
> +
> +- File metadata may be read, e.g. using stat().
> +
> +- Directories may be listed, in which case the filenames will be
> +  listed in an encoded form derived from their ciphertext.  The
> +  current encoding algorithm is described in `Filename hashing and
> +  encoding`_.  The algorithm is subject to change, but it is
> +  guaranteed that the presented filenames will be no longer than
> +  NAME_MAX bytes, will not contain the ``/`` or ``\0`` characters, and
> +  will uniquely identify directory entries.
> +
> +  The ``.`` and ``..`` directory entries are special.  They are always
> +  present and are not encrypted or encoded.
> +
> +- Files may be deleted.  That is, nondirectory files may be deleted
> +  with unlink() as usual, and empty directories may be deleted with
> +  rmdir() as usual.  Therefore, ``rm`` and ``rm -r`` will work as
> +  expected.
> +
> +- Symlink targets may be read and followed, but they will be presented
> +  in encrypted form, similar to filenames in directories.  Hence, they
> +  are unlikely to point to anywhere useful.
> +
> +Without the key, regular files cannot be opened or truncated.
> +Attempts to do so will fail with ENOKEY.  This implies that any
> +regular file operations that require a file descriptor, such as
> +read(), write(), mmap(), fallocate(), and ioctl(), are also forbidden.
> +
> +Also without the key, files of any type (including directories) cannot
> +be created or linked into an encrypted directory, nor can a name in an
> +encrypted directory be the source or target of a rename, nor can an
> +O_TMPFILE temporary file be created in an encrypted directory.  All
> +such operations will fail with ENOKEY.
> +
> +It is not currently possible to backup and restore encrypted files
> +without the encryption key.  This would require special APIs which
> +have not yet been implemented.
> +
> +Encryption policy enforcement
> +=============================
> +
> +After an encryption policy has been set on a directory, all regular
> +files, directories, and symbolic links created in that directory
> +(recursively) will inherit that encryption policy.  Special files ---
> +that is, named pipes, device nodes, and UNIX domain sockets --- will
> +not be encrypted.
> +
> +Except for those special files, it is forbidden to have unencrypted
> +files, or files encrypted with a different encryption policy, in an
> +encrypted directory tree.  Attempts to link or rename such a file into
> +an encrypted directory will fail with EPERM.  This is also enforced
> +during ->lookup() to provide limited protection against offline
> +attacks that try to disable or downgrade encryption in known locations
> +where applications may later write sensitive data.
> +
> +Implementation details
> +======================
> +
> +Encryption context
> +------------------
> +
> +An encryption policy is represented on-disk by a :c:type:`struct
> +fscrypt_context`.  It is up to individual filesystems to decide where
> +to store it, but normally it would be stored in a hidden extended
> +attribute.  It should *not* be exposed by the xattr-related system
> +calls such as getxattr() and setxattr() because of the special
> +semantics of the encryption xattr.  (In particular, there would be
> +much confusion if an encryption policy were to be added to or removed
> +from anything other than an empty directory.)  The struct is defined
> +as follows::
> +
> +    #define FS_KEY_DESCRIPTOR_SIZE  8
> +    #define FS_KEY_DERIVATION_NONCE_SIZE 16
> +
> +    struct fscrypt_context {
> +            u8 format;
> +            u8 contents_encryption_mode;
> +            u8 filenames_encryption_mode;
> +            u8 flags;
> +            u8 master_key_descriptor[FS_KEY_DESCRIPTOR_SIZE];
> +            u8 nonce[FS_KEY_DERIVATION_NONCE_SIZE];
> +    };
> +
> +Note that :c:type:`struct fscrypt_context` contains the same
> +information as :c:type:`struct fscrypt_policy` (see `Setting an
> +encryption policy`_), except that :c:type:`struct fscrypt_context`
> +also contains a nonce.  The nonce is randomly generated by the kernel
> +and is used to derive the inode's encryption key as described in
> +`Per-file keys`_.
> +
> +Data path changes
> +-----------------
> +
> +For the read path (->readpage()) of regular files, filesystems can
> +read the ciphertext into the page cache and decrypt it in-place.  The
> +page lock must be held until decryption has finished, to prevent the
> +page from becoming visible to userspace prematurely.
> +
> +For the write path (->writepage()) of regular files, filesystems
> +cannot encrypt data in-place in the page cache, since the cached
> +plaintext must be preserved.  Instead, filesystems must encrypt into a
> +temporary buffer or "bounce page", then write out the temporary
> +buffer.  Some filesystems, such as UBIFS, already use temporary
> +buffers regardless of encryption.  Other filesystems, such as ext4 and
> +F2FS, have to allocate bounce pages specially for encryption.
> +
> +Filename hashing and encoding
> +-----------------------------
> +
> +Modern filesystems accelerate directory lookups by using indexed
> +directories.  An indexed directory is organized as a tree keyed by
> +filename hashes.  When a ->lookup() is requested, the filesystem
> +normally hashes the filename being looked up so that it can quickly
> +find the corresponding directory entry, if any.
> +
> +With encryption, lookups must be supported and efficient both with and
> +without the encryption key.  Clearly, it would not work to hash the
> +plaintext filenames, since the plaintext filenames are unavailable
> +without the key.  (Hashing the plaintext filenames would also make it
> +impossible for the filesystem's fsck tool to optimize encrypted
> +directories.)  Instead, filesystems hash the ciphertext filenames,
> +i.e. the bytes actually stored on-disk in the directory entries.  When
> +asked to do a ->lookup() with the key, the filesystem just encrypts
> +the user-supplied name to get the ciphertext.
> +
> +Lookups without the key are more complicated.  The raw ciphertext may
> +contain the ``\0`` and ``/`` characters, which are illegal in
> +filenames.  Therefore, readdir() must base64-encode the ciphertext for
> +presentation.  For most filenames, this works fine; on ->lookup(), the
> +filesystem just base64-decodes the user-supplied name to get back to
> +the raw ciphertext.
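
A minimal sketch of that round trip, in Python.  One detail worth keeping
in mind is that the standard base64 alphabet itself contains ``/``, so the
encoding has to use an alphabet without it.  The ``+,`` alphabet and the
dropped padding below are assumptions made for this example, and the two
function names are hypothetical.

```python
import base64


def encode_for_readdir(ciphertext: bytes) -> str:
    """Present raw ciphertext as a name that is legal in a directory listing."""
    # altchars replaces the last two alphabet characters '+' and '/' with '+' and ','.
    encoded = base64.b64encode(ciphertext, altchars=b"+,")
    return encoded.rstrip(b"=").decode("ascii")


def decode_for_lookup(presented: str) -> bytes:
    """Map a name previously produced by encode_for_readdir() back to ciphertext."""
    padded = presented + "=" * (-len(presented) % 4)
    return base64.b64decode(padded, altchars=b"+,")
```

With this, a name returned by the listing can be passed straight back to
a lookup and decodes to exactly the bytes stored in the directory entry.
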
> +
> +However, for very long filenames, base64 encoding would cause the
> +filename length to exceed NAME_MAX.  To prevent this, readdir()
> +actually presents long filenames in an abbreviated form which encodes
> +a strong "hash" of the ciphertext filename, along with the optional
> +filesystem-specific hash(es) needed for directory lookups.  With a
> +high degree of confidence, this still allows the filesystem to map
> +the filename given in ->lookup() back to a particular directory entry
> +that was previously listed by readdir().  See :c:type:`struct
> +fscrypt_digested_name` in the source for more details.
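
For readers who want a feel for the abbreviated form, a hedged Python
sketch follows.  It is not the layout of struct fscrypt_digested_name: the
``_`` prefix, the single 4-byte filesystem hash, SHA-256 as the strong
"hash", and the simple "does the base64 form fit in NAME_MAX" test are all
assumptions chosen for illustration, as are the function names.

```python
import base64
import hashlib
import struct

NAME_MAX = 255


def _b64(data: bytes) -> str:
    # Filename-safe alphabet (no '/'), padding dropped; an assumption here.
    return base64.b64encode(data, altchars=b"+,").rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    return base64.b64decode(text + "=" * (-len(text) % 4), altchars=b"+,")


def present_name(ciphertext: bytes, fs_hash: int) -> str:
    """Name shown by a listing without the key."""
    encoded = _b64(ciphertext)
    if len(encoded) <= NAME_MAX:
        return encoded
    # Too long: present the filesystem hash plus a digest of the ciphertext.
    digest = hashlib.sha256(ciphertext).digest()
    return "_" + _b64(struct.pack("<I", fs_hash) + digest)


def resolve(presented: str, entries):
    """Map a presented name back to an entry; entries are (ciphertext, fs_hash)."""
    if not presented.startswith("_"):
        wanted = _unb64(presented)
        return next((ct for ct, _ in entries if ct == wanted), None)
    raw = _unb64(presented[1:])
    (fs_hash,) = struct.unpack("<I", raw[:4])
    digest = raw[4:]
    for ct, h in entries:
        # The filesystem hash narrows the search; the digest picks the entry.
        if h == fs_hash and hashlib.sha256(ct).digest() == digest:
            return ct
    return None
```

Since ``_`` is not in the base64 alphabet used here, the two forms cannot
be confused with each other.  The match on the abbreviated form is only
"high confidence" because two different ciphertexts could in principle
share both the hash and the digest.
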
> +
> +Note that the precise way that filenames are presented to userspace
> +without the key is subject to change in the future.  It is only meant
> +as a way to temporarily present valid filenames so that commands like
> +``rm -r`` work as expected on encrypted directories.
> diff --git a/Documentation/filesystems/index.rst b/Documentation/filesystems/index.rst
> index 256e10eedba4..53b89d0edc15 100644
> --- a/Documentation/filesystems/index.rst
> +++ b/Documentation/filesystems/index.rst
> @@ -315,3 +315,14 @@ exported for use by modules.
>     :internal:
>
>  .. kernel-doc:: fs/pipe.c
> +
> +Encryption API
> +==============
> +
> +A library which filesystems can hook into to support transparent
> +encryption of files and directories.
> +
> +.. toctree::
> +    :maxdepth: 2
> +
> +    fscrypt
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1c3feffb1c1c..beee181ec84e 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -5558,6 +5558,7 @@ T:        git git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt.git
>  S:     Supported
>  F:     fs/crypto/
>  F:     include/linux/fscrypt*.h
> +F:     Documentation/filesystems/fscrypt.rst
>
>  FUJITSU FR-V (FRV) PORT
>  S:     Orphan
> --
> 2.14.1.581.gf28d330327-goog
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fscrypt" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
