Re: overlayfs vs. fscrypt

2019-03-14 Thread Miklos Szeredi
On Wed, Mar 13, 2019 at 11:42 PM Richard Weinberger wrote:
>
> On Wednesday, 13 March 2019, 23:26:11 CET, Eric Biggers wrote:

> > What specifically is wrong with supporting the ciphertext "view" of encrypted
> > directories, and why do you want to opt UBIFS out of it specifically but not
> > ext4 and f2fs?  (The fscrypt_operations are per-filesystem type, not
> > per-filesystem instance, so I assume that's what you had in mind.)  Note that we
> > can't unconditionally remove it because people need it to delete files without
> > the key.  We could add a mount option to disable it, but why exactly?
>
> You are right, fscrypt_operations is the wrong structure.
> My plan was having it per filesystem instance. So a mount-option seems like
> a good option. Of course for all filesystems that support fscrypt, not just UBIFS.

Yes, please.  Changing filesystem contents based on a mount option is
orders of magnitude more sane than doing so on key insertion/removal.

Thanks,
Miklos


Re: overlayfs vs. fscrypt

2019-03-13 Thread Eric Biggers
On Wed, Mar 13, 2019 at 03:29:43PM -0700, James Bottomley wrote:
> On Wed, 2019-03-13 at 15:13 -0700, Eric Biggers wrote:
> > On Wed, Mar 13, 2019 at 02:04:29PM -0700, James Bottomley wrote:
> > > On Wed, 2019-03-13 at 13:25 -0700, Eric Biggers wrote:
> > > > On Wed, Mar 13, 2019 at 01:06:06PM -0700, James Bottomley wrote:
> > > > > On Wed, 2019-03-13 at 12:57 -0700, Eric Biggers wrote:
> > > 
> > > [...]
> > > > > > fscrypt would allow the data to be stored encrypted on the
> > > > > > local disk, so it's protected against offline compromise of
> > > > > > the disk.
> > > > > 
> > > > > Container images are essentially tars of the overlays.  They
> > > > > only become actual filesystems when instantiated at
> > > > > runtime.  The current encrypted container image is an overlay
> > > > > or set of overlays which is tarred then encrypted.  So to
> > > > > instantiate it is decrypted then untarred.
> > > > > 
> > > > > The thing I was wondering about was whether instead of a tar
> > > > > encrypt we could instead produce an encrypted image from a
> > > > > fscrypt filesystem.
> > > > > 
> > > > 
> > > > Why do you care whether the container image is encrypted on the
> > > > local disk, when you're extracting it in plaintext onto the local
> > > > disk anyway each time it runs? Even after the runtime files are
> > > > "deleted", they may still be recoverable from the disk.  Are you
> > > > using shred and BLKSECDISCARD, and a non-COW filesystem?
> > > > 
> > > > Now, if you wanted to avoid writing the plaintext to disk
> > > > entirely (and thereby use encryption to actually achieve a useful
> > > > security property that can't be achieved through file
> > > > permissions), fscrypt is a good solution for that.
> > > 
> > > OK let's start with a cloud and container 101: A container is an
> > > exactly transportable IaaS environment containing an
> > > application.  The format for the exact transport is the "container
> > > image" I've been describing (layered tar file set deployed with
> > > overlays).  These images are usually stored in cloud based
> > > registries which may or may not have useful access controls.  I
> > > take it the reason for image encryption to protect confidentiality
> > > within the registry is obvious.
> > > 
> > > Because of the exact transport, the deployment may be on my laptop,
> > > on my test system or in some type of public or private cloud.  In
> > > all cases bar the laptop, I won't actually own the physical system
> > > which ends up deploying the container.  So in exchange for security
> > > guarantees from the physical system owner, I agree to turn over my
> > > decryption key and possibly a cash payment.  One of these
> > > guarantees is usually that they shred the key after use and that
> > > they deploy a useful key escrow system like vault or keyprotect to
> > > guard it even while the decryption is being done.
> > 
> > 
> > > Another is that all traces of the container be shredded after the
> > > execution is finished.
> > 
> > Well, sounds like that's not the case currently even with an
> > encrypted container image, because the actual runtime files are not
> > encrypted on disk.
> 
> Shredding means destroying all trace including in the on-disk image. 
> However, one problem with the current implementation is there's a
> window between container run and container stop where the unencrypted
> files are in memory and on local disk.  Access or cockup in that window
> can leak confidential data.

Well, another problem is that it almost certainly doesn't actually work, because
some plaintext will still be recoverable from disk or the raw flash.  Actually
erasing data on modern filesystems and storage devices is extremely difficult.
To start, you'd have to run BLKSECDISCARD on every single block ever written to
disk to prepare the container, or written by the container during its execution.
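
To make the scale of that concrete, here is a minimal sketch of what a single secure-discard request looks like from userspace. The helper name secure_discard_range() is made up for illustration; BLKSECDISCARD itself is the real block-device ioctl, which takes a {start, length} pair in bytes. Whether the device honours it, and what alignment it requires, depends on the hardware and driver, so treat this as a sketch under those assumptions rather than a working erase tool.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/fs.h>   /* BLKSECDISCARD */

/*
 * Hypothetical helper: ask the block device behind fd to securely
 * discard len bytes starting at byte offset start.
 * Returns 0 on success or a negative errno value on failure.
 */
int secure_discard_range(int fd, uint64_t start, uint64_t len)
{
    /* The ioctl takes an array of two 64-bit values: offset and length. */
    uint64_t range[2] = { start, len };

    assert(len > 0);
    if (ioctl(fd, BLKSECDISCARD, range) != 0)
        return -errno;
    return 0;
}
```

You would have to issue something like this for every extent the container's data ever occupied, which is exactly the bookkeeping that is impractical on a COW or log-structured filesystem.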

That's one reason to use storage encryption: it reduces the secure deletion
problem to just erasing the encryption key.

> 
> >   Encrypting the runtime files using fscrypt with an ephemeral key
> > would be useful here.  IOW, randomly generate an encryption key when
> > the container starts, never store it anywhere, and wipe it when the
> > container stops.
> > 
> > Note that this is separate from the container *image* encryption.
> 
> Actually, that was my original thought: it needn't be.  If fscrypt can
> usefully add runtime security, then we could have the encrypted layer
> be simply an fscrypt image ... I presume without the key we can create
> a tar image of an fscrypt that is encrypted and would still be visible
> on untar if we did have the key?  So the encrypted layer would be a tar
> of the fscrypt filesystem without the key.
> 

fscrypt doesn't support backup and restore without the key, so no you can't do
that.  But there wouldn't be much benefit for doing it that way anyway, since
you'd still have to untar the container image each time the container is
started, then 

Re: overlayfs vs. fscrypt

2019-03-13 Thread Richard Weinberger
On Wednesday, 13 March 2019, 23:26:11 CET, Eric Biggers wrote:
> On Wed, Mar 13, 2019 at 09:33:10PM +0100, Richard Weinberger wrote:
> > On Wednesday, 13 March 2019, 15:26:54 CET, Amir Goldstein wrote:
> > > IMO, the best thing for UBIFS to do would be to modify fscrypt to support
> > > opting out of the revalidate behavior, IOW, sanitize your hack to an API.
> > 
> > Given the WTF/s rate this thread has, this might be a good option.
> > Actually people already asked me how to disable this feature because
> > they saw no use of it.
> > Being able to delete encrypted files looks good on the feature list but in
> > reality it has very few users but causes confusion, IMHO.
> > 
> > I propose a new fscrypt_operations flag, FS_CFLG_NO_CRYPT_FNAMES.
> > If this flag is set, a) fscrypt_setup_filename() will return -EPERM if
> > no key is found.
> > And b) __fscrypt_prepare_lookup() will not attach fscrypt_d_ops to the dentry.
> > 
> > Eric, what do you think?
> > 
> > Thanks,
> > //richard
> > 
> 
> What specifically is wrong with supporting the ciphertext "view" of encrypted
> directories, and why do you want to opt UBIFS out of it specifically but not
> ext4 and f2fs?  (The fscrypt_operations are per-filesystem type, not
> per-filesystem instance, so I assume that's what you had in mind.)  Note that we
> can't unconditionally remove it because people need it to delete files without
> the key.  We could add a mount option to disable it, but why exactly?

You are right, fscrypt_operations is the wrong structure.
My plan was having it per filesystem instance. So a mount-option seems like
a good option. Of course for all filesystems that support fscrypt, not just UBIFS.

Over the last year I've converted many embedded systems to fscrypt and it happened
more than once that users, and more importantly, applications got confused that
you can mount and walk the filesystem even if you don't have the key loaded yet.
For them it felt more natural that you cannot even readdir if you don't have the key.

In my opinion having such a mount option is useful to match these expectations.
And it is also useful because you can enable only the features you actually need.
On the embedded systems that I have in mind you never delete files without having
the key, and since fscrypt is used for the whole filesystem you can just recreate it
if you really lost the key.
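
To illustrate the per-instance idea, here is a minimal userspace model, not kernel code. All names in it (model_fs_type, model_mount, no_ciphertext_view, model_readdir_encrypted) are hypothetical, and returning -EPERM is only an assumption borrowed from my earlier flag proposal. The point is that two mounts of the same filesystem type can behave differently.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* One per filesystem type (e.g. ubifs, ext4); fscrypt_operations lives here. */
struct model_fs_type {
    const char *name;
};

/* One per mounted instance; the flag stands for a hypothetical mount option. */
struct model_mount {
    const struct model_fs_type *type;
    bool no_ciphertext_view;   /* hypothetical option, not an existing one */
};

/*
 * Decide whether an encrypted directory may be listed on this mount.
 * Returns 0 if readdir is allowed, or -EPERM (an assumed error code)
 * when the key is missing and the mount disabled the ciphertext view.
 */
int model_readdir_encrypted(const struct model_mount *mnt, bool key_loaded)
{
    assert(mnt != NULL);
    if (key_loaded)
        return 0;              /* plaintext view, always allowed */
    return mnt->no_ciphertext_view ? -EPERM : 0;
}
```

A system that still wants to delete files without the key would simply mount without the option and keep today's behavior.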

Thanks,
//richard




Re: overlayfs vs. fscrypt

2019-03-13 Thread James Bottomley
On Wed, 2019-03-13 at 15:13 -0700, Eric Biggers wrote:
> On Wed, Mar 13, 2019 at 02:04:29PM -0700, James Bottomley wrote:
> > On Wed, 2019-03-13 at 13:25 -0700, Eric Biggers wrote:
> > > On Wed, Mar 13, 2019 at 01:06:06PM -0700, James Bottomley wrote:
> > > > On Wed, 2019-03-13 at 12:57 -0700, Eric Biggers wrote:
> > 
> > [...]
> > > > > fscrypt would allow the data to be stored encrypted on the
> > > > > local disk, so it's protected against offline compromise of
> > > > > the disk.
> > > > 
> > > > Container images are essentially tars of the overlays.  They
> > > > only become actual filesystems when instantiated at
> > > > runtime.  The current encrypted container image is an overlay
> > > > or set of overlays which is tarred then encrypted.  So to
> > > > instantiate it is decrypted then untarred.
> > > > 
> > > > The thing I was wondering about was whether instead of a tar
> > > > encrypt we could instead produce an encrypted image from a
> > > > fscrypt filesystem.
> > > > 
> > > 
> > > Why do you care whether the container image is encrypted on the
> > > local disk, when you're extracting it in plaintext onto the local
> > > disk anyway each time it runs? Even after the runtime files are
> > > "deleted", they may still be recoverable from the disk.  Are you
> > > using shred and BLKSECDISCARD, and a non-COW filesystem?
> > > 
> > > Now, if you wanted to avoid writing the plaintext to disk
> > > entirely (and thereby use encryption to actually achieve a useful
> > > security property that can't be achieved through file
> > > permissions), fscrypt is a good solution for that.
> > 
> > OK let's start with a cloud and container 101: A container is an
> > exactly transportable IaaS environment containing an
> > application.  The format for the exact transport is the "container
> > image" I've been describing (layered tar file set deployed with
> > overlays).  These images are usually stored in cloud based
> > registries which may or may not have useful access controls.  I
> > take it the reason for image encryption to protect confidentiality
> > within the registry is obvious.
> > 
> > Because of the exact transport, the deployment may be on my laptop,
> > on my test system or in some type of public or private cloud.  In
> > all cases bar the laptop, I won't actually own the physical system
> > which ends up deploying the container.  So in exchange for security
> > guarantees from the physical system owner, I agree to turn over my
> > decryption key and possibly a cash payment.  One of these
> > guarantees is usually that they shred the key after use and that
> > they deploy a useful key escrow system like vault or keyprotect to
> > guard it even while the decryption is being done.
> 
> 
> > Another is that all traces of the container be shredded after the
> > execution is finished.
> 
> Well, sounds like that's not the case currently even with an
> encrypted container image, because the actual runtime files are not
> encrypted on disk.

Shredding means destroying all trace including in the on-disk image. 
However, one problem with the current implementation is there's a
window between container run and container stop where the unencrypted
files are in memory and on local disk.  Access or cockup in that window
can leak confidential data.

>   Encrypting the runtime files using fscrypt with an ephemeral key
> would be useful here.  IOW, randomly generate an encryption key when
> the container starts, never store it anywhere, and wipe it when the
> container stops.
> 
> Note that this is separate from the container *image* encryption.

Actually, that was my original thought: it needn't be.  If fscrypt can
usefully add runtime security, then we could have the encrypted layer
be simply an fscrypt image ... I presume without the key we can create
a tar image of an fscrypt that is encrypted and would still be visible
on untar if we did have the key?  So the encrypted layer would be a tar
of the fscrypt filesystem without the key.

> > considering is could I be protected against either cloud provider
> > cockups that might leak the image (the misconfigured backup
> > scenario I suggested) or malicious actions of other tenants.
> 
> If the container image is encrypted with a key not on the system,
> then its confidentiality is protected from anything that may happen
> on that system.
> 
> But if the container image encryption key *is* on the system, your
> container image may be leaked either accidentally or maliciously.

Well, yes, but that's like saying if you don't want to pick up a virus
from your network unplug it.  We have to look at ways of deploying the
filesystem and the key such that it's hard to exfiltrate ... which
seems to be similar to your android fscrypt use case.

James



Re: overlayfs vs. fscrypt

2019-03-13 Thread Eric Biggers
On Wed, Mar 13, 2019 at 09:33:10PM +0100, Richard Weinberger wrote:
> On Wednesday, 13 March 2019, 15:26:54 CET, Amir Goldstein wrote:
> > IMO, the best thing for UBIFS to do would be to modify fscrypt to support
> > opting out of the revalidate behavior, IOW, sanitize your hack to an API.
> 
> Given the WTF/s rate this thread has, this might be a good option.
> Actually people already asked me how to disable this feature because
> they saw no use of it.
> Being able to delete encrypted files looks good on the feature list but in
> reality it has very few users but causes confusion, IMHO.
> 
> I propose a new fscrypt_operations flag, FS_CFLG_NO_CRYPT_FNAMES.
> If this flag is set, a) fscrypt_setup_filename() will return -EPERM if
> no key is found.
> And b) __fscrypt_prepare_lookup() will not attach fscrypt_d_ops to the dentry.
> 
> Eric, what do you think?
> 
> Thanks,
> //richard
> 

What specifically is wrong with supporting the ciphertext "view" of encrypted
directories, and why do you want to opt UBIFS out of it specifically but not
ext4 and f2fs?  (The fscrypt_operations are per-filesystem type, not
per-filesystem instance, so I assume that's what you had in mind.)  Note that we
can't unconditionally remove it because people need it to delete files without
the key.  We could add a mount option to disable it, but why exactly?

By the way, I suggest that people read Documentation/filesystems/fscrypt.rst for
more information about what fscrypt is supposed to do, as there seem to be a
lot of misconceptions.

- Eric


Re: overlayfs vs. fscrypt

2019-03-13 Thread Eric Biggers
On Wed, Mar 13, 2019 at 02:04:29PM -0700, James Bottomley wrote:
> On Wed, 2019-03-13 at 13:25 -0700, Eric Biggers wrote:
> > On Wed, Mar 13, 2019 at 01:06:06PM -0700, James Bottomley wrote:
> > > On Wed, 2019-03-13 at 12:57 -0700, Eric Biggers wrote:
> [...]
> > > > fscrypt would allow the data to be stored encrypted on the local
> > > > disk, so it's protected against offline compromise of the disk.
> > > 
> > > Container images are essentially tars of the overlays.  They only
> > > become actual filesystems when instantiated at runtime.  The
> > > current encrypted container image is an overlay or set of overlays
> > > which is tarred then encrypted.  So to instantiate it is decrypted
> > > then untarred.
> > > 
> > > The thing I was wondering about was whether instead of a tar
> > > encrypt we could instead produce an encrypted image from a fscrypt
> > > filesystem.
> > > 
> > 
> > Why do you care whether the container image is encrypted on the local
> > disk, when you're extracting it in plaintext onto the local disk
> > anyway each time it runs? Even after the runtime files are "deleted",
> > they may still be recoverable from the disk.  Are you using shred and
> > BLKSECDISCARD, and a non-COW filesystem?
> > 
> > Now, if you wanted to avoid writing the plaintext to disk entirely
> > (and thereby use encryption to actually achieve a useful security
> > property that can't be achieved through file permissions), fscrypt is
> > a good solution for that.
> 
> OK let's start with a cloud and container 101: A container is an
> exactly transportable IaaS environment containing an application.  The
> format for the exact transport is the "container image" I've been
> describing (layered tar file set deployed with overlays).  These images
> are usually stored in cloud based registries which may or may not have
> useful access controls.  I take it the reason for image encryption to
> protect confidentiality within the registry is obvious.
> 
> Because of the exact transport, the deployment may be on my laptop, on
> my test system or in some type of public or private cloud.  In all
> cases bar the laptop, I won't actually own the physical system which
> ends up deploying the container.  So in exchange for security
> guarantees from the physical system owner, I agree to turn over my
> decryption key and possibly a cash payment.  One of these guarantees is
> usually that they shred the key after use and that they deploy a useful
> key escrow system like vault or keyprotect to guard it even while the
> decryption is being done.


> Another is that all traces of the container be shredded after the execution is
> finished.

Well, sounds like that's not the case currently even with an encrypted container
image, because the actual runtime files are not encrypted on disk.  Encrypting
the runtime files using fscrypt with an ephemeral key would be useful here.
IOW, randomly generate an encryption key when the container starts, never store
it anywhere, and wipe it when the container stops.
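
As a rough sketch of that key lifecycle (userspace C, glibc 2.25 or later assumed for getrandom() and explicit_bzero()): the struct and function names are hypothetical, the 64-byte size is an assumption, and the step that actually hands the key to the kernel is deliberately left out.

```c
#define _DEFAULT_SOURCE        /* for explicit_bzero() under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/random.h>
#include <sys/types.h>

#define EPHEMERAL_KEY_SIZE 64  /* assumed key size, for illustration only */

/* Hypothetical container for a key that lives only in memory. */
struct ephemeral_key {
    unsigned char raw[EPHEMERAL_KEY_SIZE];
    int valid;
};

/* Container start: fill the key from the kernel CSPRNG. 0 on success, -1 on error. */
int ephemeral_key_generate(struct ephemeral_key *k)
{
    size_t done = 0;

    assert(k != NULL);
    while (done < sizeof(k->raw)) {
        ssize_t n = getrandom(k->raw + done, sizeof(k->raw) - done, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;                  /* interrupted, retry */
            explicit_bzero(k, sizeof(*k)); /* leave no partial key behind */
            return -1;
        }
        done += (size_t)n;
    }
    k->valid = 1;
    return 0;
}

/* Container stop: erase the key; it was never written to storage. */
void ephemeral_key_wipe(struct ephemeral_key *k)
{
    assert(k != NULL);
    explicit_bzero(k, sizeof(*k));
}
```

Once the key is wiped (and removed from the kernel), whatever ciphertext is left on disk is unreadable, so nothing needs to be shredded block by block.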

Note that this is separate from the container *image* encryption.

> considering is could I be protected against either cloud provider
> cockups that might leak the image (the misconfigured backup scenario I
> suggested) or malicious actions of other tenants.

If the container image is encrypted with a key not on the system, then its
confidentiality is protected from anything that may happen on that system.

But if the container image encryption key *is* on the system, your container
image may be leaked either accidentally or maliciously.

- Eric


Re: overlayfs vs. fscrypt

2019-03-13 Thread James Bottomley
On Wed, 2019-03-13 at 13:25 -0700, Eric Biggers wrote:
> On Wed, Mar 13, 2019 at 01:06:06PM -0700, James Bottomley wrote:
> > On Wed, 2019-03-13 at 12:57 -0700, Eric Biggers wrote:
[...]
> > > fscrypt would allow the data to be stored encrypted on the local
> > > disk, so it's protected against offline compromise of the disk.
> > 
> > Container images are essentially tars of the overlays.  They only
> > become actual filesystems when instantiated at runtime.  The
> > current encrypted container image is an overlay or set of overlays
> > which is tarred then encrypted.  So to instantiate it is decrypted
> > then untarred.
> > 
> > The thing I was wondering about was whether instead of a tar
> > encrypt we could instead produce an encrypted image from a fscrypt
> > filesystem.
> > 
> 
> Why do you care whether the container image is encrypted on the local
> disk, when you're extracting it in plaintext onto the local disk
> anyway each time it runs? Even after the runtime files are "deleted",
> they may still be recoverable from the disk.  Are you using shred and
> BLKSECDISCARD, and a non-COW filesystem?
> 
> Now, if you wanted to avoid writing the plaintext to disk entirely
> (and thereby use encryption to actually achieve a useful security
> property that can't be achieved through file permissions), fscrypt is
> a good solution for that.

OK let's start with a cloud and container 101: A container is an
exactly transportable IaaS environment containing an application.  The
format for the exact transport is the "container image" I've been
describing (layered tar file set deployed with overlays).  These images
are usually stored in cloud based registries which may or may not have
useful access controls.  I take it the reason for image encryption to
protect confidentiality within the registry is obvious.

Because of the exact transport, the deployment may be on my laptop, on
my test system or in some type of public or private cloud.  In all
cases bar the laptop, I won't actually own the physical system which
ends up deploying the container.  So in exchange for security
guarantees from the physical system owner, I agree to turn over my
decryption key and possibly a cash payment.  One of these guarantees is
usually that they shred the key after use and that they deploy a useful
key escrow system like vault or keyprotect to guard it even while the
decryption is being done.  Another is that all traces of the container
be shredded after the execution is finished.  The scenario I'm
considering is whether I could be protected against either cloud provider
cockups that might leak the image (the misconfigured backup scenario I
suggested) or malicious actions of other tenants.

James



Re: overlayfs vs. fscrypt

2019-03-13 Thread Richard Weinberger
On Wednesday, 13 March 2019, 15:26:54 CET, Amir Goldstein wrote:
> IMO, the best thing for UBIFS to do would be to modify fscrypt to support
> opting out of the revalidate behavior, IOW, sanitize your hack to an API.

Given the WTF/s rate this thread has, this might be a good option.
Actually people already asked me how to disable this feature because
they saw no use of it.
Being able to delete encrypted files looks good on the feature list but in
reality it has very few users but causes confusion, IMHO.

I propose a new fscrypt_operations flag, FS_CFLG_NO_CRYPT_FNAMES.
If this flag is set, a) fscrypt_setup_filename() will return -EPERM if
no key is found.
And b) __fscrypt_prepare_lookup() will not attach fscrypt_d_ops to the dentry.
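
Roughly, the intended semantics would look like the following userspace model. This is only a sketch: FS_CFLG_NO_CRYPT_FNAMES is the flag proposed above and does not exist in the kernel, and the model_* types and functions are simplified stand-ins, not the real fscrypt structures or signatures.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Proposed flag from this mail; hypothetical, not present in the kernel. */
#define FS_CFLG_NO_CRYPT_FNAMES (1u << 0)

/* Simplified stand-in for the per-filesystem-type fscrypt_operations. */
struct model_fscrypt_ops {
    unsigned int flags;
};

/* Simplified stand-in for an encrypted directory. */
struct model_dir {
    const struct model_fscrypt_ops *ops;
    bool has_key;
};

enum name_view { VIEW_PLAINTEXT, VIEW_CIPHERTEXT };

/*
 * Part a): with the key, names are plaintext. Without the key, either
 * fall back to the ciphertext view (current behavior) or fail with
 * -EPERM when the flag is set. Returns 0 and sets *view on success.
 */
int model_setup_filename(const struct model_dir *dir, enum name_view *view)
{
    assert(dir != NULL && dir->ops != NULL && view != NULL);
    if (dir->has_key) {
        *view = VIEW_PLAINTEXT;
        return 0;
    }
    if (dir->ops->flags & FS_CFLG_NO_CRYPT_FNAMES)
        return -EPERM;
    *view = VIEW_CIPHERTEXT;
    return 0;
}

/* Part b): whether lookup would attach fscrypt_d_ops to the dentry. */
bool model_attach_d_ops(const struct model_dir *dir)
{
    assert(dir != NULL && dir->ops != NULL);
    return !(dir->ops->flags & FS_CFLG_NO_CRYPT_FNAMES);
}
```

With the flag set there is no ciphertext view at all, so there is nothing that needs revalidating when a key shows up later.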

Eric, what do you think?

Thanks,
//richard




Re: overlayfs vs. fscrypt

2019-03-13 Thread Eric Biggers
On Wed, Mar 13, 2019 at 01:06:06PM -0700, James Bottomley wrote:
> On Wed, 2019-03-13 at 12:57 -0700, Eric Biggers wrote:
> > On Wed, Mar 13, 2019 at 12:17:52PM -0700, James Bottomley wrote:
> > > On Wed, 2019-03-13 at 14:58 -0400, Theodore Ts'o wrote:
> > > > On Wed, Mar 13, 2019 at 10:45:04AM -0700, James Bottomley wrote:
> > > > > >   If they can't break root, then the OS's user-id based
> > > > > > access control checks (or SELinux checks if you are using
> > > > > > SELinux) will still protect you.
> > > > > 
> > > > > Well, that's what one would think about the recent runc exploit
> > > > > as well.  The thing I was looking to do was reduce the chances
> > > > > that unencrypted data would be lying around to be
> > > > > discovered.  I suppose the potentially biggest problem is
> > > > > leaking the image after it's decrypted by admin means like a
> > > > > > badly configured backup, but unencrypted data is potentially
> > > > > discoverable by breakouts as well.
> > > > 
> > > > But while the container is running, the key is available and
> > > > instantiated in the kernel, and the kernel is free to decrypt any
> > > > encrypted file/block.
> > > 
> > > In the current encrypted tar file implementation, while the
> > > container is running the decrypted tar file is extracted into the
> > > container root and available for all to see.
> > > 
> > > The main security benefit of this implementation, as I said, is
> > > security of at rest images and the runtime security is guaranteed
> > > by other systems.
> > 
> > That's not security at rest, because you're decrypting the data and
> > storing it onto the local disk.
> 
> I mean image at rest and image running.  The local disk untar only
> happens for running image.
> 
> > fscrypt would allow the data to be stored encrypted on the local
> > disk, so it's protected against offline compromise of the disk.
> 
> Container images are essentially tars of the overlays.  They only
> become actual filesystems when instantiated at runtime.  The current
> encrypted container image is an overlay or set of overlays which is
> tarred then encrypted.  So to instantiate it is decrypted then
> untarred.
> 
> The thing I was wondering about was whether instead of a tar encrypt we
> could instead produce an encrypted image from a fscrypt filesystem.
> 

Why do you care whether the container image is encrypted on the local disk, when
you're extracting it in plaintext onto the local disk anyway each time it runs?
Even after the runtime files are "deleted", they may still be recoverable from
the disk.  Are you using shred and BLKSECDISCARD, and a non-COW filesystem?

Now, if you wanted to avoid writing the plaintext to disk entirely (and thereby
use encryption to actually achieve a useful security property that can't be
achieved through file permissions), fscrypt is a good solution for that.

- Eric


Re: overlayfs vs. fscrypt

2019-03-13 Thread James Bottomley
On Wed, 2019-03-13 at 12:57 -0700, Eric Biggers wrote:
> On Wed, Mar 13, 2019 at 12:17:52PM -0700, James Bottomley wrote:
> > On Wed, 2019-03-13 at 14:58 -0400, Theodore Ts'o wrote:
> > > On Wed, Mar 13, 2019 at 10:45:04AM -0700, James Bottomley wrote:
> > > > >   If they can't break root, then the OS's user-id based
> > > > > access control checks (or SELinux checks if you are using
> > > > > SELinux) will still protect you.
> > > > 
> > > > Well, that's what one would think about the recent runc exploit
> > > > as well.  The thing I was looking to do was reduce the chances
> > > > that unencrypted data would be lying around to be
> > > > discovered.  I suppose the potentially biggest problem is
> > > > leaking the image after it's decrypted by admin means like a
> > > > > badly configured backup, but unencrypted data is potentially
> > > > discoverable by breakouts as well.
> > > 
> > > But while the container is running, the key is available and
> > > instantiated in the kernel, and the kernel is free to decrypt any
> > > encrypted file/block.
> > 
> > In the current encrypted tar file implementation, while the
> > container is running the decrypted tar file is extracted into the
> > container root and available for all to see.
> > 
> > The main security benefit of this implementation, as I said, is
> > security of at rest images and the runtime security is guaranteed
> > by other systems.
> 
> That's not security at rest, because you're decrypting the data and
> storing it onto the local disk.

I mean image at rest and image running.  The local disk untar only
happens for running image.

> fscrypt would allow the data to be stored encrypted on the local
> disk, so it's protected against offline compromise of the disk.

Container images are essentially tars of the overlays.  They only
become actual filesystems when instantiated at runtime.  The current
encrypted container image is an overlay or set of overlays which is
tarred then encrypted.  So to instantiate it is decrypted then
untarred.

The thing I was wondering about was whether instead of a tar encrypt we
could instead produce an encrypted image from a fscrypt filesystem.

James


> It would not prevent an attacker who has escalated to root or kernel
> privileges from reading the data while the container is running,
> because that would be impossible.
> 
> It would also not prevent non-root users from reading the data,
> because the kernel already has a huge variety of access control
> mechanisms that can do this and can be used alongside fscrypt.
> 
> > 
> > >   The reason why the kernel won't do this is because of its
> > > access control checks.
> > > 
> > > And we're talking about this within the context of the overlayfs.
> > > When in the container world will we have persistent data that
> > > lasts beyond the lifetime of the running container that will be
> > > using overlayfs?  I didn't think that existed; if you are using,
> > > say, a Docker storage volume, does overlayfs ever get into the
> > > act?  And if so, how, and what are the desired security
> > > properties?
> > 
> > Are you asking about persistent volumes?  I can answer, but that's
> > not the current use case.  The current use case is encrypted
> > images, which are overlays.  If you mean the misconfigured backup
> > comment then I was thinking a backup that wrongly sweeps container
> > root while the container is running.
> > 
> > Let's go back to basics: can fscrypt provide equivalent or better
> > protection than the current encrypted tarfile approach?  If the
> > answer is no because it's too tightly tied to the android use case
> > then perhaps there's not much point discussing it further.
> > 
> 
> It's not tied to the Android use case.  As I mentioned, fscrypt has
> many other users, and it wasn't even originally designed for Android.
> 
> - Eric
> 



Re: overlayfs vs. fscrypt

2019-03-13 Thread Eric Biggers
On Wed, Mar 13, 2019 at 12:17:52PM -0700, James Bottomley wrote:
> On Wed, 2019-03-13 at 14:58 -0400, Theodore Ts'o wrote:
> > On Wed, Mar 13, 2019 at 10:45:04AM -0700, James Bottomley wrote:
> > > >   If they can't break root, then the OS's user-id based access
> > > > control checks (or SELinux checks if you are using SELinux) will
> > > > still protect you.
> > > 
> > > Well, that's what one would think about the recent runc exploit as
> > > well.  The thing I was looking to do was reduce the chances that
> > > unencrypted data would be lying around to be discovered.  I suppose
> > > the potentially biggest problem is leaking the image after it's
> > > decrypted by admin means like a badly configured backup, but
> > > > unencrypted data is potentially discoverable by breakouts as well.
> > 
> > But while the container is running, the key is available and
> > instantiated in the kernel, and the kernel is free to decrypt any
> > encrypted file/block.
> 
> In the current encrypted tar file implementation, while the container
> is running the decrypted tar file is extracted into the container root
> and available for all to see.
> 
> The main security benefit of this implementation, as I said, is
> security of at rest images and the runtime security is guaranteed by
> other systems.

That's not security at rest, because you're decrypting the data and storing it
onto the local disk.

fscrypt would allow the data to be stored encrypted on the local disk, so it's
protected against offline compromise of the disk.

It would not prevent an attacker who has escalated to root or kernel privileges
from reading the data while the container is running, because that would be
impossible.

It would also not prevent non-root users from reading the data, because the
kernel already has a huge variety of access control mechanisms that can do this
and can be used alongside fscrypt.

> 
> >   The reason why the kernel won't do this is because of its access
> > control checks.
> > 
> > And we're talking about this within the context of the overlayfs.
> > When in the container world will we have persistent data that lasts
> > beyond the lifetime of the running container that will be using
> > overlayfs?  I didn't think that existed; if you are using, say, a
> > Docker storage volume, does overlayfs ever get into the act?  And if
> > so, how, and what are the desired security properties?
> 
> Are you asking about persistent volumes?  I can answer, but that's not
> the current use case.  The current use case is encrypted images, which
> are overlays.  If you mean the misconfigured backup comment then I was
> thinking a backup that wrongly sweeps container root while the
> container is running.
> 
> Let's go back to basics: can fscrypt provide equivalent or better
> protection than the current encrypted tarfile approach?  If the answer
> is no because it's too tightly tied to the android use case then
> perhaps there's not much point discussing it further.
> 

It's not tied to the Android use case.  As I mentioned, fscrypt has many other
users, and it wasn't even originally designed for Android.

- Eric


Re: overlayfs vs. fscrypt

2019-03-13 Thread Eric Biggers
On Wed, Mar 13, 2019 at 07:19:46PM +, Al Viro wrote:
> On Wed, Mar 13, 2019 at 09:44:33AM -0700, Eric Biggers wrote:
> 
> > > Just to make sure - you do realize that the ban on multiple dentries
> > > referring to the same directory inode is *NOT* conditional upon those
> > > dentries being hashed, right?
> > 
> > Isn't this handled by d_splice_alias() already, by moving the old dentry to
> > the new name?
> 
> ... which means that if somebody without the key chdirs into subdirectory
> they only see by encrypted name and waits for proper owner to look it up,
> they suddenly see it by _un_encrypted name.  Or does O_PATH open, for
> that matter, so exec permissions on that thing are not required.

Is there a real problem here?  After the key is added, the filenames are
supposed to be shown in plaintext, not ciphertext.  This is intrinsic to the
fact that we don't support both "views" at the same time.  Either the directory
has the key or it does not.

If someone is using ciphertext view (e.g. doing a directory traversal)
concurrently with the key being added, that can certainly break things.  But the
ciphertext view only allows a very restricted set of actions such as deleting
files.  And if such actions are necessary, the system userspace is meant to be
designed in such a way that adding the key can't race with it.

- Eric


Re: overlayfs vs. fscrypt

2019-03-13 Thread James Bottomley
On Wed, 2019-03-13 at 14:58 -0400, Theodore Ts'o wrote:
> On Wed, Mar 13, 2019 at 10:45:04AM -0700, James Bottomley wrote:
> > >   If they can't break root, then the OS's user-id based access
> > > control checks (or SELinux checks if you are using SELinux) will
> > > still protect you.
> > 
> > Well, that's what one would think about the recent runc exploit as
> > well.  The thing I was looking to do was reduce the chances that
> > unencrypted data would be lying around to be discovered.  I suppose
> > the potentially biggest problem is leaking the image after it's
> > decrypted by admin means like a badly configured backup, but
> > > unencrypted data is potentially discoverable by breakouts as well.
> 
> But while the container is running, the key is available and
> instantiated in the kernel, and the kernel is free to decrypt any
> encrypted file/block.

In the current encrypted tar file implementation, while the container
is running the decrypted tar file is extracted into the container root
and available for all to see.

The main security benefit of this implementation, as I said, is
security of at rest images and the runtime security is guaranteed by
other systems.

>   The reason why the kernel won't do this is because of its access
> control checks.
> 
> And we're talking about this within the context of the overlayfs.
> When in the container world will we have persistent data that lasts
> beyond the lifetime of the running container that will be using
> overlayfs?  I didn't think that existed; if you are using, say, a
> Docker storage volume, does overlayfs ever get into the act?  And if
> so, how, and what are the desired security properties?

Are you asking about persistent volumes?  I can answer, but that's not
the current use case.  The current use case is encrypted images, which
are overlays.  If you mean the misconfigured backup comment then I was
thinking a backup that wrongly sweeps container root while the
container is running.

Let's go back to basics: can fscrypt provide equivalent or better
protection than the current encrypted tarfile approach?  If the answer
is no because it's too tightly tied to the android use case then
perhaps there's not much point discussing it further.

James



Re: overlayfs vs. fscrypt

2019-03-13 Thread Al Viro
On Wed, Mar 13, 2019 at 09:44:33AM -0700, Eric Biggers wrote:

> > Just to make sure - you do realize that the ban on multiple dentries
> > referring to the same directory inode is *NOT* conditional upon those
> > dentries being hashed, right?
> 
> Isn't this handled by d_splice_alias() already, by moving the old dentry to
> the new name?

... which means that if somebody without the key chdirs into subdirectory
they only see by encrypted name and waits for proper owner to look it up,
they suddenly see it by _un_encrypted name.  Or does O_PATH open, for
that matter, so exec permissions on that thing are not required.


Re: overlayfs vs. fscrypt

2019-03-13 Thread Theodore Ts'o
On Wed, Mar 13, 2019 at 10:45:04AM -0700, James Bottomley wrote:
> >   If they can't break root, then the OS's user-id based access
> > control checks (or SELinux checks if you are using SELinux) will
> > still protect you.
> 
> Well, that's what one would think about the recent runc exploit as
> well.  The thing I was looking to do was reduce the chances that
> unencrypted data would be lying around to be discovered.  I suppose the
> potentially biggest problem is leaking the image after it's decrypted
> by admin means like a badly configured backup, but unencrypted data is
> potentially discoverable by breakouts as well.

But while the container is running, the key is available and
instantiated in the kernel, and the kernel is free to decrypt any
encrypted file/block.  The reason why the kernel won't do this is
because of its access control checks.

And we're talking about this within the context of the overlayfs.
When in the container world will we have persistent data that lasts
beyond the lifetime of the running container that will be using
overlayfs?  I didn't think that existed; if you are using, say, a
Docker storage volume, does overlayfs ever get into the act?  And if
so, how, and what are the desired security properties?

  - Ted


Re: overlayfs vs. fscrypt

2019-03-13 Thread James Bottomley
On Wed, 2019-03-13 at 12:44 -0400, Theodore Ts'o wrote:
> On Wed, Mar 13, 2019 at 08:36:34AM -0700, James Bottomley wrote:
> > On Wed, 2019-03-13 at 11:16 -0400, Theodore Ts'o wrote:
> > > So before we talk about how to make things work from a technical
> > > perspective, we should consider what the use case happens to be,
> > > and what are the security requirements.  *Why* are we trying to
> > > use the combination of overlayfs and fscrypt, and what are the
> > > security properties we are trying to provide to someone who is
> > > relying on this combination?
> > 
> > I can give one: encrypted containers:
> > 
> > https://github.com/opencontainers/image-spec/issues/747
> > 
> > The current proposal imagines that the key would be delivered to
> > the physical node and the physical node containerd would decrypt
> > > all the layers before handing them off to the kubelet.  However,
> > one could imagine a slightly more secure use case where the layers
> > were constructed as an encrypted filesystem tar and so the key
> > would go into the kernel and the layers would be constructed with
> > encryption in place using fscrypt.
> > 
> > Most of the desired security properties are in image at rest but
> > one can imagine that the running image wants some protection
> > against containment breaches by other tenants and using fscrypt
> > could provide that.
> 
> What kind of containment breaches?  If they can break root, it's all
> over no matter what sort of encryption you are using.

With me it's always unprivileged containers inside a user_ns, so
containment breach means non-root.  I hope eventually this will be the
norm for the container industry as well.

>   If they can't break root, then the OS's user-id based access
> control checks (or SELinux checks if you are using SELinux) will
> still protect you.

Well, that's what one would think about the recent runc exploit as
well.  The thing I was looking to do was reduce the chances that
unencrypted data would be lying around to be discovered.  I suppose the
potentially biggest problem is leaking the image after it's decrypted
by admin means like a badly configured backup, but unencrypted data is
potentially discoverable by breakouts as well.

James



Re: overlayfs vs. fscrypt

2019-03-13 Thread Theodore Ts'o
On Wed, Mar 13, 2019 at 08:36:34AM -0700, James Bottomley wrote:
> On Wed, 2019-03-13 at 11:16 -0400, Theodore Ts'o wrote:
> > So before we talk about how to make things work from a technical
> > perspective, we should consider what the use case happens to be, and
> > what are the security requirements.  *Why* are we trying to use the
> > combination of overlayfs and fscrypt, and what are the security
> > properties we are trying to provide to someone who is relying on this
> > combination?
> 
> I can give one: encrypted containers:
> 
> https://github.com/opencontainers/image-spec/issues/747
> 
> The current proposal imagines that the key would be delivered to the
> physical node and the physical node containerd would decrypt all the
> layers before handing them off to the kubelet.  However, one could
> imagine a slightly more secure use case where the layers were
> constructed as an encrypted filesystem tar and so the key would go into
> the kernel and the layers would be constructed with encryption in place
> using fscrypt.
> 
> Most of the desired security properties are in image at rest but one
> can imagine that the running image wants some protection against
> containment breaches by other tenants and using fscrypt could provide
> that.

What kind of containment breaches?  If they can break root, it's all
over no matter what sort of encryption you are using.  If they can't
break root, then the OS's user-id based access control checks (or
SELinux checks if you are using SELinux) will still protect you.

  - Ted


Re: overlayfs vs. fscrypt

2019-03-13 Thread Eric Biggers
On Wed, Mar 13, 2019 at 04:06:16PM +, Al Viro wrote:
> On Wed, Mar 13, 2019 at 11:16:33AM -0400, Theodore Ts'o wrote:
> > Actually, the original use was for ChromeOS, but the primary
> > assumption is that keying is per user (or profile), and that users are
> > mutually distrustful.  So when Alice logs out of the system, her keys
> > will be invalidated and removed from the kernel.  We can (and do) try
> > to flush cache entries via "echo 3 > /proc/sys/vm/drop_caches" on
> > logout.  However, this does not guarantee that all dcache entries will
> > be removed --- a dcache entry can be pinned due to an open file, a
> > process's current working directory, a bind mount, etc.
> > 
> > The other issue is negative dentries; if you try to open a file in an
> > encrypted directory, the file system won't even *know* whether or not a
> > file exists, since the directory entries are encrypted; hence, there
> > may be some negative dentries that need to be invalidated.
> > 
> > So a fundamental assumption with fscrypt is that keys will be added
> > and removed, and that when this happens, dentries will need to be
> > invalidated.  This is going to surprise overlayfs, so if overlayfs is
> > going to support fscrypt it *has* to be aware of the fact that this
> > can happen.  It's not even clear what the proper security semantics
> > should be; *especially* if the upper and lower directories aren't
> > similarly protected using the same fscrypt encryption key.  Suppose
> > the lower directory is encrypted, and the upper is not.  Now on a copy
> > up operation, the previously encrypted file, which might contain
> > credit card numbers, medical records, or other things that would cause
> > a GDPR regulator to have a freak out attack, would *poof* become
> > decrypted.
> 
> Just to make sure - you do realize that ban on multiple dentries referring
> to the same directory inode is *NOT* conditional upon those dentries being
> hashed, right?

Isn't this handled by d_splice_alias() already, by moving the old dentry to the
new name?

- Eric


Re: overlayfs vs. fscrypt

2019-03-13 Thread Eric Biggers
On Wed, Mar 13, 2019 at 04:11:48PM +, Al Viro wrote:
> On Wed, Mar 13, 2019 at 08:01:27AM -0700, Eric Biggers wrote:
> 
> > What do you think about this?
> 
> That fscrypt might have some very deep flaws.  I'll need to RTFS and
> review its model, but what I've seen in this thread so far is not
> promising anything good.
> 
> It's not just overlayfs - there are all kinds of interesting trouble
> possible just with fscrypt, unless I'm misparsing what had been said
> so far.

FYI, there *is* a known bug I was very recently made aware of and am planning to
fix.  When ->lookup() finds the plaintext name for a directory and the
ciphertext name is already in the dcache, d_splice_alias() will __d_move() the
existing dentry to the plaintext name.  But it doesn't set
DCACHE_ENCRYPTED_WITH_KEY, so the dentry incorrectly is still marked as a
ciphertext name and will be invalidated on the next lookup.  That's especially
problematic if the lookup that caused the __d_move() came from sys_mount().

I'm thinking the best fix is to have __d_move() propagate
DCACHE_ENCRYPTED_WITH_KEY from 'target' to 'dentry'.
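
A rough userspace model of that idea, for illustration only.  It is not kernel
code: the struct, the flag value and the model_* helpers are hypothetical, and
the name handling is simplified to a plain copy.  It only shows why a moved
dentry that keeps a cleared "looked up with key" bit is treated as stale once
the directory has its key, and how copying the bit from 'target' avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical flag value; only the flag's role comes from the thread. */
#define MODEL_ENCRYPTED_WITH_KEY 0x1u

/* Minimal stand-in for a dentry: a name plus a flags word. */
struct model_dentry {
	char name[64];
	unsigned int flags;
};

/*
 * Move 'dentry' to the name held by 'target', loosely mirroring what
 * d_splice_alias() does through __d_move() in the scenario described above.
 * With propagate == true, the "looked up with key" bit is taken from 'target'.
 */
static void model_d_move(struct model_dentry *dentry,
			 const struct model_dentry *target, bool propagate)
{
	assert(dentry && target);
	memcpy(dentry->name, target->name, sizeof(dentry->name));
	if (propagate) {
		dentry->flags &= ~MODEL_ENCRYPTED_WITH_KEY;
		dentry->flags |= target->flags & MODEL_ENCRYPTED_WITH_KEY;
	}
}

/*
 * A dentry without the bit counts as a ciphertext name, and a ciphertext
 * name is invalid as soon as the parent directory has its key.
 */
static bool model_is_stale(const struct model_dentry *dentry, bool dir_has_key)
{
	return dir_has_key && !(dentry->flags & MODEL_ENCRYPTED_WITH_KEY);
}
```

Without propagation the moved dentry carries the plaintext name but still
reports stale; with propagation it does not.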

- Eric


Re: overlayfs vs. fscrypt

2019-03-13 Thread Richard Weinberger
Am Mittwoch, 13. März 2019, 17:13:52 CET schrieb James Bottomley:
> > What do you mean by "containment breaches by other tenants"?  Note
> > that while the key is added, fscrypt doesn't prevent access to the
> > encrypted files.
> 
> You mean it's not multiuser safe?  Even if user a owns the key they add
> user b can still see the decrypted contents?

If user a has read the file beforehand, yes. User b then sees it because the
contents got cached.
That's why you still need to make sure that your access control is sane.

Thanks,
//richard




Re: overlayfs vs. fscrypt

2019-03-13 Thread James Bottomley
On Wed, 2019-03-13 at 08:51 -0700, Eric Biggers wrote:
> Hi James,
> 
> On Wed, Mar 13, 2019 at 08:36:34AM -0700, James Bottomley wrote:
> > On Wed, 2019-03-13 at 11:16 -0400, Theodore Ts'o wrote:
> > > So before we talk about how to make things work from a technical
> > > perspective, we should consider what the use case happens to be,
> > > and what are the security requirements.  *Why* are we trying to
> > > use the combination of overlayfs and fscrypt, and what are the
> > > security properties we are trying to provide to someone who is
> > > relying on this combination?
> > 
> > I can give one: encrypted containers:
> > 
> > https://github.com/opencontainers/image-spec/issues/747
> > 
> > The current proposal imagines that the key would be delivered to
> > the physical node and the physical node containerd would decrypt
> > > all the layers before handing them off to the kubelet.  However,
> > one could imagine a slightly more secure use case where the layers
> > were constructed as an encrypted filesystem tar and so the key
> > would go into the kernel and the layers would be constructed with
> > encryption in place using fscrypt.
> > 
> > Most of the desired security properties are in image at rest but
> > one can imagine that the running image wants some protection
> > against containment breaches by other tenants and using fscrypt
> > could provide that.
> > 
> 
> What do you mean by "containment breaches by other tenants"?  Note
> that while the key is added, fscrypt doesn't prevent access to the
> encrypted files.

You mean it's not multiuser safe?  Even if user a owns the key they add
user b can still see the decrypted contents?

>   fscrypt is orthogonal to OS-level access control (UNIX mode bits,
> ACLs, SELinux, etc.), which can and should be used alongside
> fscrypt.  fscrypt is a storage encryption mechanism, not an OS-level
> access control mechanism.

I was assuming in the multi-user case that if you don't own the keyring
you can't see the files. I suppose absent that it boils down to a
possible way to do the layering then as an fscrypt image rather than
tar then encrypt.

James



Re: overlayfs vs. fscrypt

2019-03-13 Thread Al Viro
On Wed, Mar 13, 2019 at 08:01:27AM -0700, Eric Biggers wrote:

> What do you think about this?

That fscrypt might have some very deep flaws.  I'll need to RTFS and
review its model, but what I've seen in this thread so far is not
promising anything good.

It's not just overlayfs - there are all kinds of interesting trouble
possible just with fscrypt, unless I'm misparsing what had been said
so far.


Re: overlayfs vs. fscrypt

2019-03-13 Thread Al Viro
On Wed, Mar 13, 2019 at 11:16:33AM -0400, Theodore Ts'o wrote:
> Actually, the original use was for ChromeOS, but the primary
> assumption is that keying is per user (or profile), and that users are
> mutually distrustful.  So when Alice logs out of the system, her keys
> will be invalidated and removed from the kernel.  We can (and do) try
> to flush cache entries via "echo 3 > /proc/sys/vm/drop_caches" on
> logout.  However, this does not guarantee that all dcache entries will
> be removed --- a dcache entry can be pinned due to an open file, a
> process's current working directory, a bind mount, etc.
> 
> The other issue is negative dentries; if you try to open a file in an
> encrypted directory, the file system won't even *know* whether or not a
> file exists, since the directory entries are encrypted; hence, there
> may be some negative dentries that need to be invalidated.
> 
> So a fundamental assumption with fscrypt is that keys will be added
> and removed, and that when this happens, dentries will need to be
> invalidated.  This is going to surprise overlayfs, so if overlayfs is
> going to support fscrypt it *has* to be aware of the fact that this
> can happen.  It's not even clear what the proper security semantics
> should be; *especially* if the upper and lower directories aren't
> similarly protected using the same fscrypt encryption key.  Suppose
> the lower directory is encrypted, and the upper is not.  Now on a copy
> up operation, the previously encrypted file, which might contain
> credit card numbers, medical records, or other things that would cause
> a GDPR regulator to have a freak out attack, would *poof* become
> decrypted.

Just to make sure - you do realize that ban on multiple dentries referring
to the same directory inode is *NOT* conditional upon those dentries being
hashed, right?


Re: overlayfs vs. fscrypt

2019-03-13 Thread Eric Biggers
Hi James,

On Wed, Mar 13, 2019 at 08:36:34AM -0700, James Bottomley wrote:
> On Wed, 2019-03-13 at 11:16 -0400, Theodore Ts'o wrote:
> > So before we talk about how to make things work from a technical
> > perspective, we should consider what the use case happens to be, and
> > what are the security requirements.  *Why* are we trying to use the
> > combination of overlayfs and fscrypt, and what are the security
> > properties we are trying to provide to someone who is relying on this
> > combination?
> 
> I can give one: encrypted containers:
> 
> https://github.com/opencontainers/image-spec/issues/747
> 
> The current proposal imagines that the key would be delivered to the
> physical node and the physical node containerd would decrypt all the
> layers before handing them off to the kubelet.  However, one could
> imagine a slightly more secure use case where the layers were
> constructed as an encrypted filesystem tar and so the key would go into
> the kernel and the layers would be constructed with encryption in place
> using fscrypt.
> 
> Most of the desired security properties are in image at rest but one
> can imagine that the running image wants some protection against
> containment breaches by other tenants and using fscrypt could provide
> that.
> 

What do you mean by "containment breaches by other tenants"?  Note that while
the key is added, fscrypt doesn't prevent access to the encrypted files.
fscrypt is orthogonal to OS-level access control (UNIX mode bits, ACLs, SELinux,
etc.), which can and should be used alongside fscrypt.  fscrypt is a storage
encryption mechanism, not an OS-level access control mechanism.

- Eric


Re: overlayfs vs. fscrypt

2019-03-13 Thread James Bottomley
On Wed, 2019-03-13 at 11:16 -0400, Theodore Ts'o wrote:
> So before we talk about how to make things work from a technical
> perspective, we should consider what the use case happens to be, and
> what are the security requirements.  *Why* are we trying to use the
> combination of overlayfs and fscrypt, and what are the security
> properties we are trying to provide to someone who is relying on this
> combination?

I can give one: encrypted containers:

https://github.com/opencontainers/image-spec/issues/747

The current proposal imagines that the key would be delivered to the
physical node and the physical node containerd would decrypt all the
layers before handing them off to the kubelet.  However, one could
imagine a slightly more secure use case where the layers were
constructed as an encrypted filesystem tar and so the key would go into
the kernel and the layers would be constructed with encryption in place
using fscrypt.

Most of the desired security properties are in image at rest but one
can imagine that the running image wants some protection against
containment breaches by other tenants and using fscrypt could provide
that.

James



Re: overlayfs vs. fscrypt

2019-03-13 Thread Richard Weinberger
Am Mittwoch, 13. März 2019, 16:16:33 CET schrieb Theodore Ts'o:
> So before we talk about how to make things work from a technical
> perspective, we should consider what the use case happens to be, and
> what are the security requirements.  *Why* are we trying to use the
> combination of overlayfs and fscrypt, and what are the security
> properties we are trying to provide to someone who is relying on this
> combination?

Well, as stated, on (deeply) embedded systems overlayfs is common.
You have a lowerdir with read-only files and a read-write upper dir.
Of course both lower and upper directory need to be encrypted.
In my case ubifs+fscrypt, sometimes also combined with an
encrypted+authenticated squashfs.

Thanks,
//richard




Re: overlayfs vs. fscrypt

2019-03-13 Thread Eric Biggers
On Wed, Mar 13, 2019 at 04:26:54PM +0200, Amir Goldstein wrote:
> On Wed, Mar 13, 2019 at 3:34 PM Richard Weinberger  wrote:
> >
> > Am Mittwoch, 13. März 2019, 14:24:47 CET schrieb Miklos Szeredi:
> > > > The use case is that you can delete these files if the DAC/MAC
> > > > permissions allow it.
> > > > Just like on NTFS. If a user encrypts files, the admin cannot read them
> > > > but can remove them if the user is gone or loses the key.
> > >
> > > There's the underlying filesystem view where admin can delete files,
> > > etc.   And there's the fscrypt layer stacked on top of the underlying
> > > fs, which en/decrypts files *in case the user has the key*.  What if
> > > one user has a key, but the other one doesn't?  Will d_revalidate
> > > constantly switch the set of dentries between the encrypted filenames
> > > and the decrypted ones?  Sounds crazy.  And the fact that NTFS does
> > > this doesn't make it any less crazy...
> >
> > Well, I didn't come up with this feature. :-)
> >
> > If one user has the key and the other not, a classic multi-user
> > system, then you need to make sure that the affected fscrypt instances
> > are not visible by both.
> > For example by using mount namespaces to make sure that user a can only
> > see /home/foo and user b only /home/bar.
> > Or removing the search permission on /home/foo and /home/bar.
> >
> > I know, I know, but that's how it is...
> > Maybe Ted or Eric can give more details on why they chose this approach.
> >
> 
> AFAIK, this feature was born to tailor Android's file based encryption.
> https://source.android.com/security/encryption#file-based
> It is meant to protect data at rest and what happens when user enters
> the screen lock password IIRC, is that some service will get restarted.
> IOW, there should NOT be any processes in Android accessing the
> encrypted user data folders with and without the key simultaneously.

See my response to Miklos.  Even if some processes had the key in their keyring
and some didn't, which isn't the case on Android since on Android the fscrypt
keys are placed in a "global" keyring, there's still only one cached inode per
file/directory/symlink, and it either has the key (->i_crypt_info != NULL) or it
doesn't (->i_crypt_info == NULL).  And it can only go from ->i_crypt_info == NULL
to ->i_crypt_info != NULL, not vice versa.

Also to be clear, there are other fscrypt users besides Android.  E.g. Chrome OS
where it replaced eCryptfs for home directory encryption and was actually the
original use case, people using it on "regular" Linux distros like Ubuntu via
the userspace tool https://github.com/google/fscrypt, and Richard using UBIFS
encryption on embedded devices.  It's not just for Android.

> Also, like OpenWRT, in Android the key does not get removed
> (until boot) AFAIK(?).

On Android, the fscrypt keys are removed when you switch users on a multi-user
device, or when you turn off work mode on a device with a work profile.  This is
currently accompanied by a 'sync && echo 3 > /proc/sys/vm/drop_caches', so the
inodes get evicted too and the files revert to their ciphertext "view".  I'd
like to replace this with my proposed new ioctl FS_IOC_REMOVE_ENCRYPTION_KEY,
which avoids the drop_caches hack: https://patchwork.kernel.org/patch/10821455/.

> 
> That dcache behavior remind me of the proposal to make case
> insensitive a per mount option (also for an Android use case).
> Eventually, that was replaced with per directory flag, which plays
> much better with dcache.
> 
> IMO, the best thing for UBIFS to do would be to modify fscrypt to support
> opting out of the revalidate behavior, IOW, sanitize your hack to an API.

As noted in my other response, a better solution (if this is really needed at
all) would probably be to move a stripped-down version of fscrypt_d_revalidate()
to the VFS, so fscrypt won't need to use any dentry_operations at all.

> 
> It's good that you are thinking about what will happen with overlayfs
> over ext4/f2fs, but I think that it will be messy if dentry names would be
> changing in underlying fs and the fact the overlayfs accessed the underlying
> dirs with different credentials at times makes this even more messy.
> 
> The way out of this mess IMO would be for ext4/f2fs to also conditionally
> opt-out of d_revalidate behavior at mount time if the fs is expected to be
> used under overlayfs.
> In Android, for example, I think the use case of "admin deleting
> the encrypted directories" is only relevant on "reset to default" and that
> happens in recovery boot that could potentially opt-out of encryption
> altogether (because there is no user to enter the password anyway).
> 
> I could be oversimplifying things for the Android use case and my
> information could be severely outdated.
> CC Paul Lawrence to fill in my Android knowledge gaps.
> 
> Thanks,
> Amir.

- Eric


Re: overlayfs vs. fscrypt

2019-03-13 Thread Theodore Ts'o
On Wed, Mar 13, 2019 at 04:26:54PM +0200, Amir Goldstein wrote:
> AFAIK, this feature was born to tailor Android's file based encryption.
> https://source.android.com/security/encryption#file-based
> It is meant to protect data at rest and what happens when user enters
> the screen lock password IIRC, is that some service will get restarted.
> IOW, there should NOT be any processes in Android accessing the
> encrypted user data folders with and without the key simultaneously.
> Also, like OpenWRT, in Android the key does not get removed
> (until boot) AFAIK(?).

Actually, the original use was for ChromeOS, but the primary
assumption is that keying is per user (or profile), and that users are
mutually distrustful.  So when Alice logs out of the system, her keys
will be invalidated and removed from the kernel.  We can (and do) try
to flush cache entries via "echo 3 > /proc/sys/vm/drop_caches" on
logout.  However, this does not guarantee that all dcache entries will
be removed --- a dcache entry can be pinned due to an open file, a
process's current working directory, a bind mount, etc.

The other issue is negative dentries; if you try to open a file in an
encrypted directory, the file system won't even *know* whether or not a
file exists, since the directory entries are encrypted; hence, there
may be some negative dentries that need to be invalidated.

So a fundamental assumption with fscrypt is that keys will be added
and removed, and that when this happens, dentries will need to be
invalidated.  This is going to surprise overlayfs, so if overlayfs is
going to support fscrypt it *has* to be aware of the fact that this
can happen.  It's not even clear what the proper security semantics
should be; *especially* if the upper and lower directories aren't
similarly protected using the same fscrypt encryption key.  Suppose
the lower directory is encrypted, and the upper is not.  Now on a copy
up operation, the previously encrypted file, which might contain
credit card numbers, medical records, or other things that would cause
a GDPR regulator to have a freak out attack, would *poof* become
decrypted.
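
The copy-up concern can be made concrete with a small userspace sketch.
Nothing here is an existing overlayfs or fscrypt interface: struct
model_policy, the 8-byte key descriptor and model_copy_up_weakens() are
assumptions made up for illustration, and treating "different key" as a
mismatch is one possible policy, not an agreed one.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical descriptor size, used only by this sketch. */
#define MODEL_KEY_DESC_SIZE 8

/* Hypothetical summary of how a directory tree is protected. */
struct model_policy {
	bool encrypted;
	unsigned char key_desc[MODEL_KEY_DESC_SIZE];
};

/*
 * Return true if copying a file up from 'lower' to 'upper' would change
 * how its contents are protected at rest.
 */
static bool model_copy_up_weakens(const struct model_policy *lower,
				  const struct model_policy *upper)
{
	assert(lower && upper);
	if (!lower->encrypted)
		return false;	/* nothing to lose */
	if (!upper->encrypted)
		return true;	/* plaintext would land on the upper layer */
	/* Both encrypted: a different key is a different protection domain. */
	return memcmp(lower->key_desc, upper->key_desc,
		      MODEL_KEY_DESC_SIZE) != 0;
}
```

A check of this shape would have to run before copy up, and what to do when
it fires (refuse, or copy anyway) is exactly the open semantics question.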

So before we talk about how to make things work from a technical
perspective, we should consider what the use case happens to be, and
what are the security requirements.  *Why* are we trying to use the
combination of overlayfs and fscrypt, and what are the security
properties we are trying to provide to someone who is relying on this
combination?

- Ted


Re: overlayfs vs. fscrypt

2019-03-13 Thread Eric Biggers
Hi Miklos,

On Wed, Mar 13, 2019 at 02:24:47PM +0100, Miklos Szeredi wrote:
> On Wed, Mar 13, 2019 at 2:00 PM Richard Weinberger  wrote:
> >
> > Am Mittwoch, 13. März 2019, 13:58:11 CET schrieb Miklos Szeredi:
> > > On Wed, Mar 13, 2019 at 1:47 PM Richard Weinberger  wrote:
> > > >
> > > > Am Mittwoch, 13. März 2019, 13:36:02 CET schrieb Miklos Szeredi:
> > > > > I don't get it.  Does fscrypt try to check permissions via
> > > > > ->d_revalidate?  Why is it not doing that via ->permission()?
> > > >
> > > > Please let me explain. Suppose we have a fscrypto directory /mnt and
> > > > I *don't* have the key.
> > > >
> > > > Reading the directory contents of /mnt will then return an encrypted
> > > > filename,
> > > > e.g.
> > > > # ls /mnt
> > > > +mcQ46ne5Y8U6JMV9Wdq2C
> > >
> > > Why does showing the encrypted contents make any sense?  It could just
> > > return -EPERM on all operations?
> >
> > The use case is that you can delete these files if the DAC/MAC permissions
> > allow it.
> > Just like on NTFS. If a user encrypts files, the admin cannot read them but
> > can remove them if the user is gone or loses the key.
> 
> There's the underlying filesystem view where admin can delete files,
> etc.   And there's the fscrypt layer stacked on top of the underlying
> fs, which en/decrypts files *in case the user has the key*.  What if
> one user has a key, but the other one doesn't?  Will d_revalidate
> constantly switch the set of dentries between the encrypted filenames
> and the decrypted ones?  Sounds crazy.  And the fact that NTFS does
> this doesn't make it any less crazy...
> 

fscrypt (aka ext4/f2fs/ubifs encryption) isn't a stacked filesystem.  I think
you're confusing it with eCryptfs.  There's only one "view" of the filesystem.

It's true that different processes can put different keys in their
process-subscribed keyrings, e.g. their session keyrings.  But that doesn't
change the fact that each cached inode either has the key or it doesn't, and all
users share those same cached inodes.  The mistake here is not making the keys
be provided at the filesystem level too, and I've proposed to fix that:
https://patchwork.kernel.org/cover/10821413/

Note that the key (->i_crypt_info) is never removed from a cached inode
without evicting it.  It used to be done, but it was broken and removed.  Now a
cached inode can only have the key added.  For this reason and others, I think
fscrypt_d_revalidate() contains unneeded checks and can be simplified to this:

static int fscrypt_d_revalidate(struct dentry *dentry, unsigned int flags)
{
	struct dentry *dir;
	bool valid;

	if (flags & LOOKUP_RCU)
		return -ECHILD;

	if (dentry->d_flags & DCACHE_ENCRYPTED_WITH_KEY)
		return 1;

	dir = dget_parent(dentry);
	valid = (d_inode(dir)->i_crypt_info == NULL);
	dput(dir);
	return valid;
}

I think we can even support RCU mode too.

Then, one possibility is to move fscrypt_d_revalidate() to the VFS.  If we
replace DCACHE_ENCRYPTED_WITH_KEY with the opposite meaning, say
DCACHE_CIPHERTEXT_NAME, then the VFS will have everything it needs to just do
the equivalent of fscrypt_d_revalidate() directly in d_revalidate() in
fs/namei.c.  So fscrypt_d_ops won't be needed at all.  Something like this:

#ifdef CONFIG_FS_ENCRYPTION
static inline int fscrypt_d_revalidate(struct dentry *dentry,
				       unsigned int flags)
{
	struct dentry *dir;
	struct inode *dir_inode;

	if (!(READ_ONCE(dentry->d_flags) & DCACHE_CIPHERTEXT_NAME))
		return 1;

	dir = READ_ONCE(dentry->d_parent);
	dir_inode = READ_ONCE(dir->d_inode);
	return READ_ONCE(dir_inode->i_crypt_info) == NULL;
}
#else
static inline int fscrypt_d_revalidate(struct dentry *dentry,
				       unsigned int flags)
{
	return 1;
}
#endif

static inline int d_revalidate(struct dentry *dentry, unsigned int flags)
{
	int status;

	if (unlikely(dentry->d_flags & DCACHE_OP_REVALIDATE))
		status = dentry->d_op->d_revalidate(dentry, flags);
	else
		status = 1;

	if (status > 0)
		status = fscrypt_d_revalidate(dentry, flags);
	return status;
}
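
To make the control flow above concrete, here is a minimal userspace sketch of
the same idea. It is not kernel code: the model_* structs, the MODEL_* flag
values, the has_key field and the model_fs_reject() callback are all made up
for illustration, standing in for struct dentry, DCACHE_CIPHERTEXT_NAME,
DCACHE_OP_REVALIDATE and ->i_crypt_info. It only shows that the filesystem's
own callback runs first, and that the fscrypt check then invalidates a
ciphertext-name dentry once its parent directory has the key.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel's definitions. */
#define MODEL_CIPHERTEXT_NAME 0x1u	/* dentry was created without the key */
#define MODEL_OP_REVALIDATE   0x2u	/* filesystem supplies its own callback */

struct model_inode {
	bool has_key;			/* models i_crypt_info != NULL */
};

struct model_dentry {
	unsigned int flags;
	struct model_dentry *parent;
	struct model_inode *inode;
	int (*fs_revalidate)(struct model_dentry *dentry);
};

/* Example filesystem callback that always reports the dentry as stale. */
static int model_fs_reject(struct model_dentry *dentry)
{
	(void)dentry;
	return 0;
}

/* A ciphertext-name dentry stays valid only while its directory has no key. */
static int model_fscrypt_revalidate(const struct model_dentry *dentry)
{
	assert(dentry != NULL);
	if (!(dentry->flags & MODEL_CIPHERTEXT_NAME))
		return 1;
	return !dentry->parent->inode->has_key;
}

/* Filesystem callback first; the fscrypt check runs only if it said "valid". */
static int model_d_revalidate(struct model_dentry *dentry)
{
	int status = 1;

	if (dentry->flags & MODEL_OP_REVALIDATE)
		status = dentry->fs_revalidate(dentry);
	if (status > 0)
		status = model_fscrypt_revalidate(dentry);
	return status;
}
```

In this model, a ciphertext-name child under a directory without the key
revalidates to 1. Once has_key is set on the directory it revalidates to 0,
which in the real VFS would force a fresh lookup. A dentry without the
ciphertext flag is never invalidated by the fscrypt check.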


What do you think about this?

- Eric


Re: overlayfs vs. fscrypt

2019-03-13 Thread Amir Goldstein
On Wed, Mar 13, 2019 at 3:34 PM Richard Weinberger  wrote:
>
> On Wednesday, 13 March 2019, 14:24:47 CET, Miklos Szeredi wrote:
> > > The use case is that you can delete these files if the DAC/MAC
> > > permissions allow it.
> > > Just like on NTFS. If a user encrypts files, the admin cannot read them
> > > but can remove them if the user is gone or loses the key.
> >
> > There's the underlying filesystem view where admin can delete files,
> > etc.   And there's the fscrypt layer stacked on top of the underlying
> > fs, which en/decrypts files *in case the user has the key*.  What if
> > one user has a key, but the other one doesn't?  Will d_revalidate
> > constantly switch the set of dentries between the encrypted filenames
> > and the decrypted ones?  Sounds crazy.  And the fact that NTFS does
> > this doesn't make it any less crazy...
>
> Well, I didn't come up with this feature. :-)
>
> If one user has the key and the other not, a classic multi-user
> system, then you need to make sure that the affected fscrypt instances
> are not visible by both.
> For example by using mount namespaces to make sure that user a can only
> see /home/foo and user b only /home/bar.
> Or removing the search permission on /home/foo and /home/bar.
>
> I know, I know, but that's how it is...
> Maybe Ted or Eric can give more details on why they chose this approach.
>

AFAIK, this feature was born to tailor Android's file based encryption.
https://source.android.com/security/encryption#file-based
It is meant to protect data at rest and what happens when user enters
the screen lock password IIRC, is that some service will get restarted.
IOW, there should NOT be any processes in Android accessing the
encrypted user data folders with and without the key simultaneously.
Also, like OpenWRT, in Android the key does not get removed
(until boot) AFAIK(?).

That dcache behavior reminds me of the proposal to make case
insensitivity a per-mount option (also for an Android use case).
Eventually, that was replaced with a per-directory flag, which plays
much better with the dcache.

IMO, the best thing for UBIFS to do would be to modify fscrypt to support
opting out of the revalidate behavior, IOW, sanitize your hack into an API.

It's good that you are thinking about what will happen with overlayfs
over ext4/f2fs, but I think that it will be messy if dentry names change
in the underlying fs, and the fact that overlayfs accesses the underlying
dirs with different credentials at times makes this even messier.

The way out of this mess IMO would be for ext4/f2fs to also conditionally
opt-out of d_revalidate behavior at mount time if the fs is expected to be
used under overlayfs.
In Android, for example, I think the use case of "admin deleting
the encrypted directories" is only relevant on "reset to default" and that
happens in recovery boot that could potentially opt-out of encryption
altogether (because there is no user to enter the password anyway).

I could be oversimplifying things for the Android use case and my
information could be severely outdated.
CC Paul Lawrence to fill in my Android knowledge gaps.

Thanks,
Amir.


Re: overlayfs vs. fscrypt

2019-03-13 Thread Richard Weinberger
On Wednesday, 13 March 2019, 14:24:47 CET, Miklos Szeredi wrote:
> > The use case is that you can delete these files if the DAC/MAC permissions
> > allow it.
> > Just like on NTFS. If a user encrypts files, the admin cannot read them but
> > can remove them if the user is gone or loses the key.
> 
> There's the underlying filesystem view where admin can delete files,
> etc.   And there's the fscrypt layer stacked on top of the underlying
> fs, which en/decrypts files *in case the user has the key*.  What if
> one user has a key, but the other one doesn't?  Will d_revalidate
> constantly switch the set of dentries between the encrypted filenames
> and the decrypted ones?  Sounds crazy.  And the fact that NTFS does
> this doesn't make it any less crazy...

Well, I didn't come up with this feature. :-)

If one user has the key and the other not, a classic multi-user
system, then you need to make sure that the affected fscrypt instances
are not visible by both.
For example by using mount namespaces to make sure that user a can only
see /home/foo and user b only /home/bar.
Or removing the search permission on /home/foo and /home/bar.

I know, I know, but that's how it is...
Maybe Ted or Eric can give more details on why they chose this approach.

Thanks,
//richard







Re: overlayfs vs. fscrypt

2019-03-13 Thread Miklos Szeredi
On Wed, Mar 13, 2019 at 2:00 PM Richard Weinberger  wrote:
>
> On Wednesday, 13 March 2019, 13:58:11 CET, Miklos Szeredi wrote:
> > On Wed, Mar 13, 2019 at 1:47 PM Richard Weinberger  wrote:
> > >
> > > On Wednesday, 13 March 2019, 13:36:02 CET, Miklos Szeredi wrote:
> > > > I don't get it.  Does fscrypt try to check permissions via
> > > > ->d_revalidate?  Why is it not doing that via ->permission()?
> > >
> > > Please let me explain. Suppose we have a fscrypto directory /mnt and
> > > I *don't* have the key.
> > >
> > > Reading the directory contents of /mnt will return an encrypted filename,
> > > e.g.
> > > # ls /mnt
> > > +mcQ46ne5Y8U6JMV9Wdq2C
> >
> > Why does showing the encrypted contents make any sense?  It could just
> > return -EPERM on all operations?
>
> The use case is that you can delete these files if the DAC/MAC permissions
> allow it.
> Just like on NTFS. If a user encrypts files, the admin cannot read them but
> can remove them if the user is gone or loses the key.

There's the underlying filesystem view where admin can delete files,
etc.   And there's the fscrypt layer stacked on top of the underlying
fs, which en/decrypts files *in case the user has the key*.  What if
one user has a key, but the other one doesn't?  Will d_revalidate
constantly switch the set of dentries between the encrypted filenames
and the decrypted ones?  Sounds crazy.  And the fact that NTFS does
this doesn't make it any less crazy...

Thanks,
Miklos


Re: overlayfs vs. fscrypt

2019-03-13 Thread Richard Weinberger
On Wednesday, 13 March 2019, 13:58:11 CET, Miklos Szeredi wrote:
> On Wed, Mar 13, 2019 at 1:47 PM Richard Weinberger  wrote:
> >
> > On Wednesday, 13 March 2019, 13:36:02 CET, Miklos Szeredi wrote:
> > > I don't get it.  Does fscrypt try to check permissions via
> > > ->d_revalidate?  Why is it not doing that via ->permission()?
> >
> > Please let me explain. Suppose we have a fscrypto directory /mnt and
> > I *don't* have the key.
> >
> > Reading the directory contents of /mnt will return an encrypted filename,
> > e.g.
> > # ls /mnt
> > +mcQ46ne5Y8U6JMV9Wdq2C
> 
> Why does showing the encrypted contents make any sense?  It could just
> return -EPERM on all operations?

The use case is that you can delete these files if the DAC/MAC permissions
allow it.
Just like on NTFS. If a user encrypts files, the admin cannot read them but can
remove them if the user is gone or loses the key.

Thanks,
//richard





Re: overlayfs vs. fscrypt

2019-03-13 Thread Miklos Szeredi
On Wed, Mar 13, 2019 at 1:47 PM Richard Weinberger  wrote:
>
> On Wednesday, 13 March 2019, 13:36:02 CET, Miklos Szeredi wrote:
> > I don't get it.  Does fscrypt try to check permissions via
> > ->d_revalidate?  Why is it not doing that via ->permission()?
>
> Please let me explain. Suppose we have a fscrypto directory /mnt and
> I *don't* have the key.
>
> Reading the directory contents of /mnt will return an encrypted filename,
> e.g.
> # ls /mnt
> +mcQ46ne5Y8U6JMV9Wdq2C

Why does showing the encrypted contents make any sense?  It could just
return -EPERM on all operations?

Thanks,
Miklos


Re: overlayfs vs. fscrypt

2019-03-13 Thread Richard Weinberger
On Wednesday, 13 March 2019, 13:36:02 CET, Miklos Szeredi wrote:
> I don't get it.  Does fscrypt try to check permissions via
> ->d_revalidate?  Why is it not doing that via ->permission()?

Please let me explain. Suppose we have a fscrypto directory /mnt and
I *don't* have the key.

Reading the directory contents of /mnt will return an encrypted filename,
e.g.
# ls /mnt
+mcQ46ne5Y8U6JMV9Wdq2C

As soon as I load my key, the real name is shown and I can read the file
contents too.
That's why fscrypt has ->d_revalidate(). It checks for the key: if the key is
still not there, it stays with the old encrypted name; if the key is present,
it reveals the real name.

The same happens in the other direction if I unlink my key from the keyring.
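
The key-load direction of that flow can be sketched with a toy userspace
model. Everything here is hypothetical (the toy_* names, the one-file
directory, the plaintext name "secret.txt"); it is not how the dcache is
implemented, only an illustration of "reuse the cached entry while it is
still valid, otherwise look the name up again". Key removal is deliberately
left out of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model: one directory containing one file. */
struct toy_dir {
	bool key_loaded;		/* has the key been added? */
	const char *plain_name;		/* name shown once the key is there */
	const char *cipher_name;	/* encoded name shown without the key */
};

struct toy_dentry {
	bool cached;			/* is there a cached entry at all? */
	bool made_with_key;		/* created while the key was loaded? */
	const char *name;
};

/* Slow path: build a fresh entry from the directory's current key state. */
static void toy_lookup(const struct toy_dir *dir, struct toy_dentry *d)
{
	d->cached = true;
	d->made_with_key = dir->key_loaded;
	d->name = dir->key_loaded ? dir->plain_name : dir->cipher_name;
}

/* A cached ciphertext entry becomes stale as soon as the key shows up. */
static bool toy_revalidate(const struct toy_dir *dir,
			   const struct toy_dentry *d)
{
	return d->made_with_key || !dir->key_loaded;
}

/* Name a listing would show: reuse the cache if valid, else look up again. */
static const char *toy_visible_name(const struct toy_dir *dir,
				    struct toy_dentry *d)
{
	assert(dir != NULL && d != NULL);
	if (!d->cached || !toy_revalidate(dir, d))
		toy_lookup(dir, d);
	return d->name;
}
```

With key_loaded false, toy_visible_name() hands back the ciphertext name and
caches it. After key_loaded is flipped to true, the cached entry fails
revalidation, a new lookup runs, and the plaintext name is returned from then
on.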

> >
> > 2. Teach overlayfs to deal with a upper that has ->d_revalidate().
> > Given the complexity of overlayfs I'm not sure how feasible this is.
> > But I'm no overlayfs expert, maybe I miss something.
> 
> I don't think it would be too complex.  But first I'd like to
> understand exactly why fscrypt is (ab) using d_revalidate().

I hope my answer makes things more clear.

Thanks,
//richard




Re: overlayfs vs. fscrypt

2019-03-13 Thread Miklos Szeredi
On Wed, Mar 13, 2019 at 1:31 PM Richard Weinberger  wrote:
>
> Hi!
>
> overlayfs and fscrypt are not friends.
> Currently it is not possible to use a fscrypt encrypted directory as upper
> directory with overlayfs.
> The reason for that is, fscrypt implements ->d_revalidate().
>
> From fscrypt's point of view having ->d_revalidate() makes sense because it
> wants to hide/show encrypted filenames if someone loads or unlinks a key.
>
> On the other hand, overlayfs makes sure that the upper directory cannot
> change beneath it. Therefore it checks whether the upper directory is a remote
> filesystem by checking for ->d_revalidate() and refuses to mount if so.
>
> In my little embedded Linux world it is common to use both UBIFS and
> overlayfs. Now with UBIFS being encrypted using fscrypt, overlayfs is a
> problem.
> My current hack is not using fscrypt_d_ops in UBIFS. This works because on a
> typical embedded target you setup your crypto keys exactly once, right before
> you mount overlayfs in an initramfs.
>
> But I'm sure this problem will hit sooner or later users of ext4 and f2fs too.
> Therefore I'd like to discuss possible solutions.
>
> So far I see two options:
>
> 1. Get rid of ->d_revalidate() in fscrypt.
> Maybe we find a way to return a dentry via ->lookup() which is not cached at
> all and therefore no ->d_revalidate() is needed. If unreadable and encrypted
> filename lookups are slow, so what?
> AFAIU this approach is impossible in the current dcache design since it is not
> allowed to have more than one dentry to the same file.

I don't get it.  Does fscrypt try to check permissions via
->d_revalidate?  Why is it not doing that via ->permission()?

>
> 2. Teach overlayfs to deal with a upper that has ->d_revalidate().
> Given the complexity of overlayfs I'm not sure how feasible this is.
> But I'm no overlayfs expert, maybe I miss something.

I don't think it would be too complex.  But first I'd like to
understand exactly why fscrypt is (ab) using d_revalidate().

Thanks,
Miklos


overlayfs vs. fscrypt

2019-03-13 Thread Richard Weinberger
Hi!

overlayfs and fscrypt are not friends.
Currently it is not possible to use a fscrypt encrypted directory as upper 
directory with overlayfs.
The reason for that is, fscrypt implements ->d_revalidate().

From fscrypt's point of view having ->d_revalidate() makes sense because it 
wants to hide/show encrypted filenames if someone loads or unlinks a key.

On the other hand, overlayfs makes sure that the upper directory cannot
change beneath it. Therefore it checks whether the upper directory is a remote 
filesystem by checking for ->d_revalidate() and refuses to mount if so.
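
As a rough illustration of that mount-time check, here is a small userspace
sketch. The names (model_dentry, MODEL_OP_REVALIDATE, model_dentry_remote(),
model_check_upperdir()) and the -EINVAL return value are assumptions chosen
for the example, not overlayfs's actual code. The only point is that the mere
presence of a revalidate hook on the upper directory is enough for the mount
to be refused, regardless of what the hook would actually do.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for "this dentry has a ->d_revalidate() hook". */
#define MODEL_OP_REVALIDATE 0x1u

struct model_dentry {
	unsigned int flags;
};

/* Any dentry with a revalidate hook is treated as possibly changing. */
static int model_dentry_remote(const struct model_dentry *dentry)
{
	assert(dentry != NULL);
	return (dentry->flags & MODEL_OP_REVALIDATE) != 0;
}

/* Returns 0 if the directory may serve as upper dir, -EINVAL otherwise. */
static int model_check_upperdir(const struct model_dentry *upper_root)
{
	return model_dentry_remote(upper_root) ? -EINVAL : 0;
}
```

In this model a plain directory passes the check, while a directory whose
dentry carries the revalidate flag (as an fscrypt-encrypted one would) is
rejected before its contents are ever looked at.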

In my little embedded Linux world it is common to use both UBIFS and 
overlayfs. Now with UBIFS being encrypted using fscrypt, overlayfs is a 
problem.
My current hack is not using fscrypt_d_ops in UBIFS. This works because on a 
typical embedded target you set up your crypto keys exactly once, right before 
you mount overlayfs in an initramfs.

But I'm sure this problem will hit sooner or later users of ext4 and f2fs too.
Therefore I'd like to discuss possible solutions.

So far I see two options:

1. Get rid of ->d_revalidate() in fscrypt.
Maybe we find a way to return a dentry via ->lookup() which is not cached at 
all and therefore no ->d_revalidate() is needed. If unreadable and encrypted
filename lookups are slow, so what?
AFAIU this approach is impossible in the current dcache design since it is not 
allowed to have more than one dentry to the same file.

2. Teach overlayfs to deal with an upper that has ->d_revalidate().
Given the complexity of overlayfs I'm not sure how feasible this is.
But I'm no overlayfs expert, maybe I'm missing something.

What else could we do?

Thanks,
//richard