Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-07-13 Thread Vivek Goyal
On Mon, Jul 13, 2020 at 05:54:19PM +0900, Chirantan Ekbote wrote:
> On Thu, Jun 25, 2020 at 9:55 PM Vivek Goyal  wrote:
> >
> > On Thu, Jun 25, 2020 at 12:19:39PM +0900, Chirantan Ekbote wrote:
> > [..]
> > > > Chirantan,
> > > >
> > > > So you ended up renaming all "trusted", "security" and "system" xattrs?
> > > > Only "user" xattrs are complete passthrough?
> > > >
> > >
> > > No, we only rename "security" xattrs (except for selinux).
> > >
> > > >
> > > > IOW, security.selinux will be renamed to user.virtiofs.security.selinux
> > > > on host?
> > > >
> > >
> > > We don't relabel security.selinux because it only requires CAP_FOWNER
> > > in the process's user namespace and it also does its own MAC-based
> > > checks.  Also we have some tools that label files beforehand so it
> > > seemed easier to leave them unchanged.
> >
> > If we rename selinux xattr also, then we can support selinux both in
> > guest and host and they both can have their own independent policies.
> >
> 
> This works as long as you don't need to support setfscreatecon, which
> exists to guarantee that an inode is atomically created with the
> correct selinux context.  It's kind of possible for files with
> O_TMPFILE + fsetxattr + linkat but there is no equivalent for
> directories.  You're basically forced to create the directory and then
> set the xattr and while it's possible to prevent other threads in the
> server from seeing the unfinished directories with a rwlock or similar
> there is no protection against power loss.  If the machine loses power
> after the directory is created but before the selinux xattr is set
> then that directory will have the wrong selinux context and the guest
> would need to run restorecon at boot to assign the correct label.

Overlayfs has this notion of work directory where they create directory
(and file if O_TMPFILE is not supported), and then rename it to correct
place. I am wondering if we could do something similar. If server is
given a work directory where.

A. Create new directory
B. set selinux lables
C. rename directory

That way if machine crashes after step A or step B, that directory
never becomes visible to guest and no relabeling is required.

Thanks
Vivek




Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-07-13 Thread Chirantan Ekbote
On Thu, Jun 25, 2020 at 9:55 PM Vivek Goyal  wrote:
>
> On Thu, Jun 25, 2020 at 12:19:39PM +0900, Chirantan Ekbote wrote:
> [..]
> > > Chirantan,
> > >
> > > So you ended up renaming all "trusted", "security" and "system" xattrs?
> > > Only "user" xattrs are complete passthrough?
> > >
> >
> > No, we only rename "security" xattrs (except for selinux).
> >
> > >
> > > IOW, security.selinux will be renamed to user.virtiofs.security.selinux
> > > on host?
> > >
> >
> > We don't relabel security.selinux because it only requires CAP_FOWNER
> > in the process's user namespace and it also does its own MAC-based
> > checks.  Also we have some tools that label files beforehand so it
> > seemed easier to leave them unchanged.
>
> If we rename selinux xattr also, then we can support selinux both in
> guest and host and they both can have their own independent policies.
>

This works as long as you don't need to support setfscreatecon, which
exists to guarantee that an inode is atomically created with the
correct selinux context.  It's kind of possible for files with
O_TMPFILE + fsetxattr + linkat but there is no equivalent for
directories.  You're basically forced to create the directory and then
set the xattr and while it's possible to prevent other threads in the
server from seeing the unfinished directories with a rwlock or similar
there is no protection against power loss.  If the machine loses power
after the directory is created but before the selinux xattr is set
then that directory will have the wrong selinux context and the guest
would need to run restorecon at boot to assign the correct label.

> Otherwise we either have to disable selinux on host (if we want to
> support it in guest) or somehow guest and how policies will have
> to know about each other and be able to work together (which will
> be hard for a generic use case).

Yes, I agree this is hard to do for a generic case but unfortunately
the more I understand how selinux works the less I feel that it works
well with a passthrough style file system.  As you said it either
needs to be turned off on the host or the host and guest need to work
together.

Chirantan



Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-26 Thread Dr. David Alan Gilbert
* Vivek Goyal (vgo...@redhat.com) wrote:
> On Fri, Jun 19, 2020 at 12:39:14PM +0100, Daniel P. Berrangé wrote:
> [..]
> > The CAP_SYS_ADMIN requirement for 'trusted.' xattrs is simply a useful
> > mechanism for applications to control access. The host kernel doesn'
> > tuse this namespace itself. Linux has four namespaces for xattrs:
> > 
> >  -  user - for userspace apps. accessible based on read/write permissions
> >  -  trusted - for userspace apps. accessible by CAP_SYS_ADMIN processes only
> >  -  system - for kernel only. used by ACLs
> >  -  security - for kernel only. used by SELinux
> > 
> > The use case for "trusted" xattrs is thus where a privileged management
> > application or service wants to store metadata against the file, but
> > also needs to grant an unprivileged process access to write to this file
> > while not allowing that unprivileged process the ability to change the
> > metadata. This is mentioned in the man page:
> > 
> > [man xattr(7)]
> >Trusted extended attributes
> >Trusted  extended attributes are visible and accessible only to pro‐
> >cesses that have the CAP_SYS_ADMIN capability.  Attributes  in  this
> >class  are used to implement mechanisms in user space (i.e., outside
> >the kernel) which keep information in extended attributes  to  which
> >ordinary processes should not have access.
> > 
> >User extended attributes
> >User  extended  attributes  may be assigned to files and directories
> >for storing arbitrary additional information such as the mime  type,
> >character  set  or  encoding  of a file.  The access permissions for
> >user attributes are defined by the file permission bits:  read  per‐
> >mission is required to retrieve the attribute value, and writer per‐
> >mission is required to change it.
> > [/man]
> > 
> > Libvirtd uses the "trusted." xattr namespace to record information against
> > disk images for QEMU, because we need to grant QEMU access to read/write
> > the disk iamges, but don't want QEMU to be able to alter our xattrs.
> > 
> > It is unfortunate that this namespace is tied to the CAP_SYS_ADMIN cap.
> > It really ought to have had its own dedicated capability :-( Such is
> > life with anything that uses CAP_SYS_ADMIN...
> > 
> > With this in mind we really should have both trusted. & user. xattrs
> > allowed to the guest by default.
> > 
> > Conversely, we'll need to block usage of the security. and system.
> > namespaces.
> 
> I am wondering can we block usage of "system" and "security"?  What
> about guest setting acls over virtiofs files. These will have to
> go through and that means we need to allow system xattrs.
> 
> Similarly setting file capabilities inside should trigger
> setxattr(security.capability) and that means we need to allow security
> xattr as well.

Yep, we see that when people install Fedora packages, when rpm
unpackgs /usr/bin/newgidmap which has:

$ getfattr -d '--match=.*' /usr/bin/newgidmap 
getfattr: Removing leading '/' from absolute path names
# file: usr/bin/newgidmap
security.capability=0sAQAAAkA=
security.selinux="system_u:object_r:bin_t:s0"


Dave

> Thanks
> Vivek
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-25 Thread Vivek Goyal
On Thu, Jun 25, 2020 at 12:19:39PM +0900, Chirantan Ekbote wrote:
[..]
> > Chirantan,
> >
> > So you ended up renaming all "trusted", "security" and "system" xattrs?
> > Only "user" xattrs are complete passthrough?
> >
> 
> No, we only rename "security" xattrs (except for selinux).
> 
> >
> > IOW, security.selinux will be renamed to user.virtiofs.security.selinux
> > on host?
> >
> 
> We don't relabel security.selinux because it only requires CAP_FOWNER
> in the process's user namespace and it also does its own MAC-based
> checks.  Also we have some tools that label files beforehand so it
> seemed easier to leave them unchanged.

If we rename selinux xattr also, then we can support selinux both in
guest and host and they both can have their own independent policies.

Otherwise we either have to disable selinux on host (if we want to
support it in guest) or somehow guest and how policies will have
to know about each other and be able to work together (which will
be hard for a generic use case).

Vivek




Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-24 Thread Chirantan Ekbote
On Sat, Jun 20, 2020 at 4:15 AM Vivek Goyal  wrote:
>
> On Fri, Jun 19, 2020 at 01:46:20PM +0900, Chirantan Ekbote wrote:
> > On Fri, Jun 19, 2020 at 4:27 AM Vivek Goyal  wrote:
> > >
> > > On Thu, Jun 18, 2020 at 08:16:55PM +0100, Dr. David Alan Gilbert wrote:
> > > > * Vivek Goyal (vgo...@redhat.com) wrote:
> > > > > On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> > > > > > virtiofsd doesn't need of all Linux capabilities(7) available to 
> > > > > > root.  Keep a
> > > > > > whitelisted set of capabilities that we require.  This improves 
> > > > > > security in
> > > > > > case virtiofsd is compromised by making it hard for an attacker to 
> > > > > > gain further
> > > > > > access to the system.
> > > > >
> > > > > Hi Stefan,
> > > > >
> > > > > I just noticed that this patch set breaks overlayfs on top of 
> > > > > virtiofs.
> > > > >
> > > > > overlayfs sets "trusted.overlay.*" and xattrs in trusted domain
> > > > > need CAP_SYS_ADMIN.
> > > > >
> >
> > Not just that but it needs CAP_SYS_ADMIN in the init namespace[1].  We
> > have the same problem.  Our virtiofs process has CAP_SYS_ADMIN in a
> > user namespace and it cannot set any trusted or security xattrs.  The
> > security xattrs check is at least namespace aware so you only need
> > CAP_SYS_ADMIN in the namespace that mounted the fs but that doesn't
> > help us much.
> >
> > We ended up working around it by prefixing "user.virtiofs." to the
> > xattr name[2], which has its own problems but there was pretty much no
> > chance that we would be able to give the fs device CAP_SYS_ADMIN in
> > the init namespace.
>
> Chirantan,
>
> So you ended up renaming all "trusted", "security" and "system" xattrs?
> Only "user" xattrs are complete passthrough?
>

No, we only rename "security" xattrs (except for selinux).

>
> IOW, security.selinux will be renamed to user.virtiofs.security.selinux
> on host?
>

We don't relabel security.selinux because it only requires CAP_FOWNER
in the process's user namespace and it also does its own MAC-based
checks.  Also we have some tools that label files beforehand so it
seemed easier to leave them unchanged.

Chirantan



Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-19 Thread Vivek Goyal
On Fri, Jun 19, 2020 at 01:46:20PM +0900, Chirantan Ekbote wrote:
> On Fri, Jun 19, 2020 at 4:27 AM Vivek Goyal  wrote:
> >
> > On Thu, Jun 18, 2020 at 08:16:55PM +0100, Dr. David Alan Gilbert wrote:
> > > * Vivek Goyal (vgo...@redhat.com) wrote:
> > > > On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> > > > > virtiofsd doesn't need of all Linux capabilities(7) available to 
> > > > > root.  Keep a
> > > > > whitelisted set of capabilities that we require.  This improves 
> > > > > security in
> > > > > case virtiofsd is compromised by making it hard for an attacker to 
> > > > > gain further
> > > > > access to the system.
> > > >
> > > > Hi Stefan,
> > > >
> > > > I just noticed that this patch set breaks overlayfs on top of virtiofs.
> > > >
> > > > overlayfs sets "trusted.overlay.*" and xattrs in trusted domain
> > > > need CAP_SYS_ADMIN.
> > > >
> 
> Not just that but it needs CAP_SYS_ADMIN in the init namespace[1].  We
> have the same problem.  Our virtiofs process has CAP_SYS_ADMIN in a
> user namespace and it cannot set any trusted or security xattrs.  The
> security xattrs check is at least namespace aware so you only need
> CAP_SYS_ADMIN in the namespace that mounted the fs but that doesn't
> help us much.
> 
> We ended up working around it by prefixing "user.virtiofs." to the
> xattr name[2], which has its own problems but there was pretty much no
> chance that we would be able to give the fs device CAP_SYS_ADMIN in
> the init namespace.

Chirantan, 

So you ended up renaming all "trusted", "security" and "system" xattrs?
Only "user" xattrs are complete passthrough?

IOW, security.selinux will be renamed to user.virtiofs.security.selinux
on host?

Thanks
Vivek




Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-19 Thread Vivek Goyal
On Fri, Jun 19, 2020 at 12:39:14PM +0100, Daniel P. Berrangé wrote:
[..]
> The CAP_SYS_ADMIN requirement for 'trusted.' xattrs is simply a useful
> mechanism for applications to control access. The host kernel doesn'
> tuse this namespace itself. Linux has four namespaces for xattrs:
> 
>  -  user - for userspace apps. accessible based on read/write permissions
>  -  trusted - for userspace apps. accessible by CAP_SYS_ADMIN processes only
>  -  system - for kernel only. used by ACLs
>  -  security - for kernel only. used by SELinux
> 
> The use case for "trusted" xattrs is thus where a privileged management
> application or service wants to store metadata against the file, but
> also needs to grant an unprivileged process access to write to this file
> while not allowing that unprivileged process the ability to change the
> metadata. This is mentioned in the man page:
> 
> [man xattr(7)]
>Trusted extended attributes
>Trusted  extended attributes are visible and accessible only to pro‐
>cesses that have the CAP_SYS_ADMIN capability.  Attributes  in  this
>class  are used to implement mechanisms in user space (i.e., outside
>the kernel) which keep information in extended attributes  to  which
>ordinary processes should not have access.
> 
>User extended attributes
>User  extended  attributes  may be assigned to files and directories
>for storing arbitrary additional information such as the mime  type,
>character  set  or  encoding  of a file.  The access permissions for
>user attributes are defined by the file permission bits:  read  per‐
>mission is required to retrieve the attribute value, and writer per‐
>mission is required to change it.
> [/man]
> 
> Libvirtd uses the "trusted." xattr namespace to record information against
> disk images for QEMU, because we need to grant QEMU access to read/write
> the disk iamges, but don't want QEMU to be able to alter our xattrs.
> 
> It is unfortunate that this namespace is tied to the CAP_SYS_ADMIN cap.
> It really ought to have had its own dedicated capability :-( Such is
> life with anything that uses CAP_SYS_ADMIN...
> 
> With this in mind we really should have both trusted. & user. xattrs
> allowed to the guest by default.
> 
> Conversely, we'll need to block usage of the security. and system.
> namespaces.

I am wondering can we block usage of "system" and "security"?  What
about guest setting acls over virtiofs files. These will have to
go through and that means we need to allow system xattrs.

Similarly setting file capabilities inside should trigger
setxattr(security.capability) and that means we need to allow security
xattr as well.

Thanks
Vivek




Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-19 Thread Vivek Goyal
On Fri, Jun 19, 2020 at 01:05:47PM +0100, Daniel P. Berrangé wrote:
> On Fri, Jun 19, 2020 at 12:49:57PM +0100, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrangé (berra...@redhat.com) wrote:
> > > On Fri, Jun 19, 2020 at 09:27:46AM +0100, Dr. David Alan Gilbert wrote:
> > > > * Vivek Goyal (vgo...@redhat.com) wrote:
> > > > > On Thu, Jun 18, 2020 at 08:16:55PM +0100, Dr. David Alan Gilbert 
> > > > > wrote:
> > > > > > * Vivek Goyal (vgo...@redhat.com) wrote:
> > > > > > > On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> > > > > > > > virtiofsd doesn't need of all Linux capabilities(7) available 
> > > > > > > > to root.  Keep a
> > > > > > > > whitelisted set of capabilities that we require.  This improves 
> > > > > > > > security in
> > > > > > > > case virtiofsd is compromised by making it hard for an attacker 
> > > > > > > > to gain further
> > > > > > > > access to the system.
> > > > > > > 
> > > > > > > Hi Stefan,
> > > > > > > 
> > > > > > > I just noticed that this patch set breaks overlayfs on top of 
> > > > > > > virtiofs.
> > > > > > > 
> > > > > > > overlayfs sets "trusted.overlay.*" and xattrs in trusted domain
> > > > > > > need CAP_SYS_ADMIN.
> > > > > > > 
> > > > > > > man xattr says.
> > > > > > > 
> > > > > > >Trusted extended attributes
> > > > > > >Trusted  extended  attributes  are  visible and accessible 
> > > > > > > only to pro‐
> > > > > > >cesses that have the  CAP_SYS_ADMIN  capability.   
> > > > > > > Attributes  in  this
> > > > > > >class are used to implement mechanisms in user space 
> > > > > > > (i.e., outside the
> > > > > > >kernel) which keep information in extended attributes to 
> > > > > > > which ordinary
> > > > > > >processes should not have access.
> > > > > > > 
> > > > > > > There is a chance that overlay moves away from trusted xattr in 
> > > > > > > future.
> > > > > > > But for now we need to make it work. This is an important use 
> > > > > > > case for
> > > > > > > kata docker in docker build.
> > > > > > > 
> > > > > > > May be we can add an option to virtiofsd say "--add-cap 
> > > > > > > " and
> > > > > > > ask user to pass in "--add-cap cap_sys_admin" if they need to run 
> > > > > > > daemon
> > > > > > > with this capaibility.
> > > > > > 
> > > > > > I'll admit I don't like the idea of giving it cap_sys_admin.
> > > > > > Can you explain:
> > > > > >   a) What overlayfs uses trusted for?
> > > > > 
> > > > > overlayfs stores bunch of metadata and uses "trusted" xattrs for it.
> > > > 
> > > > Tell me more about this metadata.
> > > > Taking a juicy looking one, what does OVL_XATTR_REDIRECT do?
> > > > Or what happens if I was to write random numbers into OVL_XATTR_NLINK?
> > > > 
> > > > > >   b) If something nasty was to write junk into the trusted 
> > > > > > attributes,
> > > > > > what would happen?
> > > > > 
> > > > > This directory is owned by guest. So it should be able to write
> > > > > anything it wants, as long as process in guest has CAP_SYS_ADMIN, 
> > > > > right?
> > > > 
> > > > Well, we shouldn't be able to break/crash/escape into the host; how
> > > > much does overlayfs validate trusted.* it uses?
> > > > 
> > > > > >   c) I see overlayfs has a fallback check if xattr isn't supported 
> > > > > > at
> > > > > > all - what is the consequence?
> > > > > 
> > > > > It falls back to I think read only mode. 
> > > > 
> > > > It looks like the fallback is more subtle to me:
> > > > /*
> > > >  * Check if upper/work fs supports trusted.overlay.* xattr
> > > >  */
> > > > err = ovl_do_setxattr(ofs->workdir, OVL_XATTR_OPAQUE, "0", 1, 
> > > > 0);
> > > > if (err) {
> > > > ofs->noxattr = true;
> > > > ofs->config.index = false;
> > > > ofs->config.metacopy = false;
> > > > pr_warn("upper fs does not support xattr, falling back 
> > > > to index=off and metacopy=off.\n");
> > > > 
> > > > but I don't know what index and metacopy are.
> > > > 
> > > > > For a moment forget about overlayfs. Say a user process in guest with
> > > > > CAP_SYS_ADMIN is writing trusted.foo. Should that succeed? Is a
> > > > > passthrough filesystem, so it should go through. But currently it
> > > > > wont.
> > > > 
> > > > As long as any effects of what it writes are contained to the area of
> > > > the filesystem exposed to the guest, yes - however it worries me what
> > > > the consequences of broken trusted metadata is.  If it's delicate enough
> > > > that it's guarded by CAP_SYS_ADMIN someone must have worried about it.
> > > 
> > > The CAP_SYS_ADMIN requirement for 'trusted.' xattrs is simply a useful
> > > mechanism for applications to control access. The host kernel doesn'
> > > tuse this namespace itself. Linux has four namespaces for xattrs:
> > > 
> > >  -  user - for userspace apps. accessible based on read/write permissions
> > >  -  trusted - for userspace apps. acc

Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-19 Thread Vivek Goyal
On Fri, Jun 19, 2020 at 05:16:48PM +0100, Dr. David Alan Gilbert wrote:

[..]
> > > > >   b) If something nasty was to write junk into the trusted attributes,
> > > > > what would happen?
> > > > 
> > > > This directory is owned by guest. So it should be able to write
> > > > anything it wants, as long as process in guest has CAP_SYS_ADMIN, right?
> > > 
> > > Well, we shouldn't be able to break/crash/escape into the host; how
> > > much does overlayfs validate trusted.* it uses?
> > 
> > I thought qemu and kvm are the one who should ensure we should not be
> > able to break out of sandbox. Kernel implementation could be as 
> > buggy as it wanted to be. We are working with this security model
> > that kernel is completely untrusted.
> 
> But with virtiofs we allow the guest to do a lot of filesystem
> operations on the host.  It's the virtiofsd that has to ensure that
> these are safe and contained within the fs it's exposed; the qemu/kvm
> can't protect us from that.

Fair enough. I should have added virtiofsd to list. Its an attack
vector and is of concern.

> 
> That's why we sandbox the virtiofsd like we do - if we allow a
> priviliged guest to perform calls to an unconstrained virtiofsd it would
> be able to escape.  That's what I want to check.

Sure. So does giving CAP_SYS_ADMIN to virtiofsd allow daemon to escape
the jail. If it does we need to implement what crossvm folks did,
remapping of trusted xattr. That will also allow us to run inside
user namespace and still be able to support trusted xattr emulation
for guest.

Vivek




Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-19 Thread Dr. David Alan Gilbert
* Vivek Goyal (vgo...@redhat.com) wrote:
> On Fri, Jun 19, 2020 at 05:16:48PM +0100, Dr. David Alan Gilbert wrote:
> 
> [..]
> > > > > >   b) If something nasty was to write junk into the trusted 
> > > > > > attributes,
> > > > > > what would happen?
> > > > > 
> > > > > This directory is owned by guest. So it should be able to write
> > > > > anything it wants, as long as process in guest has CAP_SYS_ADMIN, 
> > > > > right?
> > > > 
> > > > Well, we shouldn't be able to break/crash/escape into the host; how
> > > > much does overlayfs validate trusted.* it uses?
> > > 
> > > I thought qemu and kvm are the one who should ensure we should not be
> > > able to break out of sandbox. Kernel implementation could be as 
> > > buggy as it wanted to be. We are working with this security model
> > > that kernel is completely untrusted.
> > 
> > But with virtiofs we allow the guest to do a lot of filesystem
> > operations on the host.  It's the virtiofsd that has to ensure that
> > these are safe and contained within the fs it's exposed; the qemu/kvm
> > can't protect us from that.
> 
> Fair enough. I should have added virtiofsd to list. Its an attack
> vector and is of concern.
> 
> > 
> > That's why we sandbox the virtiofsd like we do - if we allow a
> > priviliged guest to perform calls to an unconstrained virtiofsd it would
> > be able to escape.  That's what I want to check.
> 
> Sure. So does giving CAP_SYS_ADMIN to virtiofsd allow daemon to escape
> the jail.

So that's *my* question - what bad things can someone do by setting
attributes (trusted/system/security) - it's fine if they break they
screwup the security inside the container, because they'd need to be
CAP_SYS_ADMIN inside the container to do it - as long as they can't
break the host.  So what happens if someone starts doing bad things to
trusted.* attributes on an overlayfs - no other fs uses them as far as I
know.

> If it does we need to implement what crossvm folks did,
> remapping of trusted xattr. That will also allow us to run inside
> user namespace and still be able to support trusted xattr emulation
> for guest.

I think we need to do that anyway, it's just going to take a while to
figure out.

Dave

> 
> Vivek
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-19 Thread Dr. David Alan Gilbert
* Vivek Goyal (vgo...@redhat.com) wrote:
> On Fri, Jun 19, 2020 at 09:27:46AM +0100, Dr. David Alan Gilbert wrote:
> > * Vivek Goyal (vgo...@redhat.com) wrote:
> > > On Thu, Jun 18, 2020 at 08:16:55PM +0100, Dr. David Alan Gilbert wrote:
> > > > * Vivek Goyal (vgo...@redhat.com) wrote:
> > > > > On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> > > > > > virtiofsd doesn't need of all Linux capabilities(7) available to 
> > > > > > root.  Keep a
> > > > > > whitelisted set of capabilities that we require.  This improves 
> > > > > > security in
> > > > > > case virtiofsd is compromised by making it hard for an attacker to 
> > > > > > gain further
> > > > > > access to the system.
> > > > > 
> > > > > Hi Stefan,
> > > > > 
> > > > > I just noticed that this patch set breaks overlayfs on top of 
> > > > > virtiofs.
> > > > > 
> > > > > overlayfs sets "trusted.overlay.*" and xattrs in trusted domain
> > > > > need CAP_SYS_ADMIN.
> > > > > 
> > > > > man xattr says.
> > > > > 
> > > > >Trusted extended attributes
> > > > >Trusted  extended  attributes  are  visible and accessible 
> > > > > only to pro‐
> > > > >cesses that have the  CAP_SYS_ADMIN  capability.   Attributes  
> > > > > in  this
> > > > >class are used to implement mechanisms in user space (i.e., 
> > > > > outside the
> > > > >kernel) which keep information in extended attributes to which 
> > > > > ordinary
> > > > >processes should not have access.
> > > > > 
> > > > > There is a chance that overlay moves away from trusted xattr in 
> > > > > future.
> > > > > But for now we need to make it work. This is an important use case for
> > > > > kata docker in docker build.
> > > > > 
> > > > > May be we can add an option to virtiofsd say "--add-cap " 
> > > > > and
> > > > > ask user to pass in "--add-cap cap_sys_admin" if they need to run 
> > > > > daemon
> > > > > with this capaibility.
> > > > 
> > > > I'll admit I don't like the idea of giving it cap_sys_admin.
> > > > Can you explain:
> > > >   a) What overlayfs uses trusted for?
> > > 
> > > overlayfs stores bunch of metadata and uses "trusted" xattrs for it.
> > 
> > Tell me more about this metadata.
> > Taking a juicy looking one, what does OVL_XATTR_REDIRECT do?
> 
> It contains path information which is used for lookup into lower layer.
> 
> > Or what happens if I was to write random numbers into OVL_XATTR_NLINK?
> 
> Overlay is storing its metadata in trusted.* xattrs. If a user modifies
> metadata, then various kind of bad things can happen. I think one can
> do some kind of checks on metadata to make sure it does not crash
> atleast.
> 
> And that's true for any filesystem. Isn't. If user manages to modify
> metadata outside of filesystem, then lot of bad things can happen. I
> thought that's the reason that people are not comfortable with the
> idea of allowing mounting filesystem from inside user namespace because
> it makes it easy to mount a hand crafted filesystem.
> 
> Anyway, I think overlayfs is just one use case of trusted xattr. Even
> if overlayfs moves away from trusted xattr, what about other users.
> We need to have a story around how will we support trusted xattrs
> safely.
> 
> 
> > 
> > > >   b) If something nasty was to write junk into the trusted attributes,
> > > > what would happen?
> > > 
> > > This directory is owned by guest. So it should be able to write
> > > anything it wants, as long as process in guest has CAP_SYS_ADMIN, right?
> > 
> > Well, we shouldn't be able to break/crash/escape into the host; how
> > much does overlayfs validate trusted.* it uses?
> 
> I thought qemu and kvm are the one who should ensure we should not be
> able to break out of sandbox. Kernel implementation could be as 
> buggy as it wanted to be. We are working with this security model
> that kernel is completely untrusted.

But with virtiofs we allow the guest to do a lot of filesystem
operations on the host.  It's the virtiofsd that has to ensure that
these are safe and contained within the fs it's exposed; the qemu/kvm
can't protect us from that.

That's why we sandbox the virtiofsd like we do - if we allow a
priviliged guest to perform calls to an unconstrained virtiofsd it would
be able to escape.  That's what I want to check.

Dave

> > 
> > > >   c) I see overlayfs has a fallback check if xattr isn't supported at
> > > > all - what is the consequence?
> > > 
> > > It falls back to I think read only mode. 
> > 
> > It looks like the fallback is more subtle to me:
> > /*
> >  * Check if upper/work fs supports trusted.overlay.* xattr
> >  */
> > err = ovl_do_setxattr(ofs->workdir, OVL_XATTR_OPAQUE, "0", 1, 0);
> > if (err) {
> > ofs->noxattr = true;
> > ofs->config.index = false;
> > ofs->config.metacopy = false;
> > pr_warn("upper fs does not support xattr, falling back to 
> > index=off and metacopy=off.\n");

Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-19 Thread Vivek Goyal
On Fri, Jun 19, 2020 at 09:27:46AM +0100, Dr. David Alan Gilbert wrote:
> * Vivek Goyal (vgo...@redhat.com) wrote:
> > On Thu, Jun 18, 2020 at 08:16:55PM +0100, Dr. David Alan Gilbert wrote:
> > > * Vivek Goyal (vgo...@redhat.com) wrote:
> > > > On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> > > > > virtiofsd doesn't need of all Linux capabilities(7) available to 
> > > > > root.  Keep a
> > > > > whitelisted set of capabilities that we require.  This improves 
> > > > > security in
> > > > > case virtiofsd is compromised by making it hard for an attacker to 
> > > > > gain further
> > > > > access to the system.
> > > > 
> > > > Hi Stefan,
> > > > 
> > > > I just noticed that this patch set breaks overlayfs on top of virtiofs.
> > > > 
> > > > overlayfs sets "trusted.overlay.*" and xattrs in trusted domain
> > > > need CAP_SYS_ADMIN.
> > > > 
> > > > man xattr says.
> > > > 
> > > >Trusted extended attributes
> > > >Trusted  extended  attributes  are  visible and accessible only 
> > > > to pro‐
> > > >cesses that have the  CAP_SYS_ADMIN  capability.   Attributes  
> > > > in  this
> > > >class are used to implement mechanisms in user space (i.e., 
> > > > outside the
> > > >kernel) which keep information in extended attributes to which 
> > > > ordinary
> > > >processes should not have access.
> > > > 
> > > > There is a chance that overlay moves away from trusted xattr in future.
> > > > But for now we need to make it work. This is an important use case for
> > > > kata docker in docker build.
> > > > 
> > > > May be we can add an option to virtiofsd say "--add-cap " 
> > > > and
> > > > ask user to pass in "--add-cap cap_sys_admin" if they need to run daemon
> > > > with this capaibility.
> > > 
> > > I'll admit I don't like the idea of giving it cap_sys_admin.
> > > Can you explain:
> > >   a) What overlayfs uses trusted for?
> > 
> > overlayfs stores bunch of metadata and uses "trusted" xattrs for it.
> 
> Tell me more about this metadata.
> Taking a juicy looking one, what does OVL_XATTR_REDIRECT do?

It contains path information which is used for lookup into lower layer.

> Or what happens if I was to write random numbers into OVL_XATTR_NLINK?

Overlay is storing its metadata in trusted.* xattrs. If a user modifies
metadata, then various kind of bad things can happen. I think one can
do some kind of checks on metadata to make sure it does not crash
atleast.

And that's true for any filesystem. Isn't. If user manages to modify
metadata outside of filesystem, then lot of bad things can happen. I
thought that's the reason that people are not comfortable with the
idea of allowing mounting filesystem from inside user namespace because
it makes it easy to mount a hand crafted filesystem.

Anyway, I think overlayfs is just one use case of trusted xattr. Even
if overlayfs moves away from trusted xattr, what about other users.
We need to have a story around how will we support trusted xattrs
safely.


> 
> > >   b) If something nasty was to write junk into the trusted attributes,
> > > what would happen?
> > 
> > This directory is owned by guest. So it should be able to write
> > anything it wants, as long as process in guest has CAP_SYS_ADMIN, right?
> 
> Well, we shouldn't be able to break/crash/escape into the host; how
> much does overlayfs validate trusted.* it uses?

I thought qemu and kvm are the one who should ensure we should not be
able to break out of sandbox. Kernel implementation could be as 
buggy as it wanted to be. We are working with this security model
that kernel is completely untrusted.

> 
> > >   c) I see overlayfs has a fallback check if xattr isn't supported at
> > > all - what is the consequence?
> > 
> > It falls back to I think read only mode. 
> 
> It looks like the fallback is more subtle to me:
> /*
>  * Check if upper/work fs supports trusted.overlay.* xattr
>  */
> err = ovl_do_setxattr(ofs->workdir, OVL_XATTR_OPAQUE, "0", 1, 0);
> if (err) {
> ofs->noxattr = true;
> ofs->config.index = false;
> ofs->config.metacopy = false;
> pr_warn("upper fs does not support xattr, falling back to 
> index=off and metacopy=off.\n");
> 
> but I don't know what index and metacopy are.

They enable certain features in overlayfs. In fact, we fall back to
lesser capability on if we are running on ext4/xfs. For virtiofs, 
we deny the mount completely.

/*
 * We allowed sub-optimal upper fs configuration and don't want to break
 * users over kernel upgrade, but we never allowed remote upper fs, so
 * we can enforce strict requirements for remote upper fs.
 */
if (ovl_dentry_remote(ofs->workdir) &&
(!d_type || !rename_whiteout || ofs->noxattr)) {
pr_err("upper fs missing required features.\n");
err = -EINVAL;
goto out;
   

Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-19 Thread Vivek Goyal
On Fri, Jun 19, 2020 at 05:26:37PM +0200, Miklos Szeredi wrote:
> On Fri, Jun 19, 2020 at 4:25 PM Vivek Goyal  wrote:
> >
> > On Fri, Jun 19, 2020 at 04:16:30PM +0200, Miklos Szeredi wrote:
> > > On Thu, Jun 18, 2020 at 9:08 PM Vivek Goyal  wrote:
> > > >
> > > > On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> > > > > virtiofsd doesn't need of all Linux capabilities(7) available to 
> > > > > root.  Keep a
> > > > > whitelisted set of capabilities that we require.  This improves 
> > > > > security in
> > > > > case virtiofsd is compromised by making it hard for an attacker to 
> > > > > gain further
> > > > > access to the system.
> > > >
> > > > Hi Stefan,
> > > >
> > > > I just noticed that this patch set breaks overlayfs on top of virtiofs.
> > >
> > > How so?  Virtiofs isn't mounting overlayfs, is it?  Only the mounter
> > > requires CAP_SYS_ADMIN, not the accessor.
> >
> > virtiofsd needs CAP_SYS_ADMIN, otherwise fsetxattr(trusted.overlay.opaque)
> > fails in lo_setxattr().
> >
> > This is triggered when we mount overlayfs on top of virtiofs and overlayfs
> > tries to set OVL_XATTR_OPAQUE on upper to check if trusted xattrs are
> > supported or not.
> 
> Ah, right.
> 
> Plan is to use "user.*" xattr for unprivileged overlay.  This would be
> a good way to eliminate this attack surface in the overlay on virtiofs
> case as well.

So unpriviliged overlay is one which is mounted from inside a user
namespace. But in this case we might be mounting it from init_user_ns
of guest. So from overlayfs perspective this will still be treated
as priviliged overlay instance and it will use trusted xattrs, IIUC?

Thanks
Vivek

> 
> Other ways to minimize risk is to separate operations requiring
> CAP_SYS_ADMIN into a separate process, preferably a separate
> executable, that communicates with virtiofsd using a pipe and contains
> the minimum amount of code.

> 
> Thanks,
> Miklos
> 




Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-19 Thread Miklos Szeredi
On Fri, Jun 19, 2020 at 4:25 PM Vivek Goyal  wrote:
>
> On Fri, Jun 19, 2020 at 04:16:30PM +0200, Miklos Szeredi wrote:
> > On Thu, Jun 18, 2020 at 9:08 PM Vivek Goyal  wrote:
> > >
> > > On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> > > > virtiofsd doesn't need of all Linux capabilities(7) available to root.  
> > > > Keep a
> > > > whitelisted set of capabilities that we require.  This improves 
> > > > security in
> > > > case virtiofsd is compromised by making it hard for an attacker to gain 
> > > > further
> > > > access to the system.
> > >
> > > Hi Stefan,
> > >
> > > I just noticed that this patch set breaks overlayfs on top of virtiofs.
> >
> > How so?  Virtiofs isn't mounting overlayfs, is it?  Only the mounter
> > requires CAP_SYS_ADMIN, not the accessor.
>
> virtiofsd needs CAP_SYS_ADMIN, otherwise fsetxattr(trusted.overlay.opaque)
> fails in lo_setxattr().
>
> This is triggered when we mount overlayfs on top of virtiofs and overlayfs
> tries to set OVL_XATTR_OPAQUE on upper to check if trusted xattrs are
> supported or not.

Ah, right.

Plan is to use "user.*" xattr for unprivileged overlay.  This would be
a good way to eliminate this attack surface in the overlay on virtiofs
case as well.

Other ways to minimize risk is to separate operations requiring
CAP_SYS_ADMIN into a separate process, preferably a separate
executable, that communicates with virtiofsd using a pipe and contains
the minimum amount of code.

Thanks,
Miklos



Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-19 Thread Vivek Goyal
On Fri, Jun 19, 2020 at 04:16:30PM +0200, Miklos Szeredi wrote:
> On Thu, Jun 18, 2020 at 9:08 PM Vivek Goyal  wrote:
> >
> > On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> > > virtiofsd doesn't need of all Linux capabilities(7) available to root.  
> > > Keep a
> > > whitelisted set of capabilities that we require.  This improves security 
> > > in
> > > case virtiofsd is compromised by making it hard for an attacker to gain 
> > > further
> > > access to the system.
> >
> > Hi Stefan,
> >
> > I just noticed that this patch set breaks overlayfs on top of virtiofs.
> 
> How so?  Virtiofs isn't mounting overlayfs, is it?  Only the mounter
> requires CAP_SYS_ADMIN, not the accessor.

Are you thinking of virtiofs on top of overlayfs? I was referring to
mounting overlayfs on top of virtiofs inside VM.

Thanks
Vivek




Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-19 Thread Vivek Goyal
On Fri, Jun 19, 2020 at 04:16:30PM +0200, Miklos Szeredi wrote:
> On Thu, Jun 18, 2020 at 9:08 PM Vivek Goyal  wrote:
> >
> > On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> > > virtiofsd doesn't need of all Linux capabilities(7) available to root.  
> > > Keep a
> > > whitelisted set of capabilities that we require.  This improves security 
> > > in
> > > case virtiofsd is compromised by making it hard for an attacker to gain 
> > > further
> > > access to the system.
> >
> > Hi Stefan,
> >
> > I just noticed that this patch set breaks overlayfs on top of virtiofs.
> 
> How so?  Virtiofs isn't mounting overlayfs, is it?  Only the mounter
> requires CAP_SYS_ADMIN, not the accessor.

virtiofsd needs CAP_SYS_ADMIN, otherwise fsetxattr(trusted.overlay.opaque)
fails in lo_setxattr().

This is triggered when we mount overlayfs on top of virtiofs and overlayfs
tries to set OVL_XATTR_OPAQUE on upper to check if trusted xattrs are
supported or not.

Thanks
Vivek




Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-19 Thread Miklos Szeredi
On Thu, Jun 18, 2020 at 9:08 PM Vivek Goyal  wrote:
>
> On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> > virtiofsd doesn't need of all Linux capabilities(7) available to root.  
> > Keep a
> > whitelisted set of capabilities that we require.  This improves security in
> > case virtiofsd is compromised by making it hard for an attacker to gain 
> > further
> > access to the system.
>
> Hi Stefan,
>
> I just noticed that this patch set breaks overlayfs on top of virtiofs.

How so?  Virtiofs isn't mounting overlayfs, is it?  Only the mounter
requires CAP_SYS_ADMIN, not the accessor.

Thanks,
Miklos



Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-19 Thread Chirantan Ekbote
On Fri, Jun 19, 2020 at 4:27 AM Vivek Goyal  wrote:
>
> On Thu, Jun 18, 2020 at 08:16:55PM +0100, Dr. David Alan Gilbert wrote:
> > * Vivek Goyal (vgo...@redhat.com) wrote:
> > > On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> > > > virtiofsd doesn't need of all Linux capabilities(7) available to root.  
> > > > Keep a
> > > > whitelisted set of capabilities that we require.  This improves 
> > > > security in
> > > > case virtiofsd is compromised by making it hard for an attacker to gain 
> > > > further
> > > > access to the system.
> > >
> > > Hi Stefan,
> > >
> > > I just noticed that this patch set breaks overlayfs on top of virtiofs.
> > >
> > > overlayfs sets "trusted.overlay.*" and xattrs in trusted domain
> > > need CAP_SYS_ADMIN.
> > >

Not just that but it needs CAP_SYS_ADMIN in the init namespace[1].  We
have the same problem.  Our virtiofs process has CAP_SYS_ADMIN in a
user namespace and it cannot set any trusted or security xattrs.  The
security xattrs check is at least namespace aware so you only need
CAP_SYS_ADMIN in the namespace that mounted the fs but that doesn't
help us much.

We ended up working around it by prefixing "user.virtiofs." to the
xattr name[2], which has its own problems but there was pretty much no
chance that we would be able to give the fs device CAP_SYS_ADMIN in
the init namespace.


[1]: 
https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux/+/5e857ce6eae7ca21b2055cca4885545e29228fe2/fs/xattr.c#116
[2]: 
https://chromium-review.googlesource.com/c/chromiumos/platform/crosvm/+/2243111



Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-19 Thread Chirantan Ekbote
On Fri, Jun 19, 2020 at 5:40 PM Dr. David Alan Gilbert
 wrote:
>
> * Chirantan Ekbote (chiran...@chromium.org) wrote:
>
> > We ended up working around it by prefixing "user.virtiofs." to the
> > xattr name[2], which has its own problems but there was pretty much no
> > chance that we would be able to give the fs device CAP_SYS_ADMIN in
> > the init namespace.
>
>
> What problems did you hit with that?  We should standardise the renaming
> so we make an on-disc format that's compatible.
>

I guess what I meant by problems is that it made what was previously a
simple and straightforward implementation into something more complex
and added some limitations.  For example, we now need to parse the
result of the listxattr system call and strip out the prefix from any
name in the list.  It also means that we cannot allow the guest to
directly set or remove any "user.virtiofs." xattr as this would allow
an unprivileged process in the vm to modify an xattr that it wouldn't
otherwise be allowed to modify.  On top of being a somewhat arbitrary
restriction this also means that you can't have stacked virtiofs
instances as the lower instance would reject attempts by the upper one
to set those xattrs.  These limitations aren't really a problem for us
but I can see how they might be a problem for others.

The change was also merged just yesterday so there may be other
problems with it that haven't surfaced yet.

I didn't mention it before because I figured this was something that
we brought upon ourselves as chrome os is a bit extreme about
sandboxing.  If we can come up with a standardized way to handle this
I think we'll gladly switch the chrome os implementation to use it.

Chirantan



Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-19 Thread Daniel P . Berrangé
On Fri, Jun 19, 2020 at 12:49:57PM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berra...@redhat.com) wrote:
> > On Fri, Jun 19, 2020 at 09:27:46AM +0100, Dr. David Alan Gilbert wrote:
> > > * Vivek Goyal (vgo...@redhat.com) wrote:
> > > > On Thu, Jun 18, 2020 at 08:16:55PM +0100, Dr. David Alan Gilbert wrote:
> > > > > * Vivek Goyal (vgo...@redhat.com) wrote:
> > > > > > On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> > > > > > > virtiofsd doesn't need of all Linux capabilities(7) available to 
> > > > > > > root.  Keep a
> > > > > > > whitelisted set of capabilities that we require.  This improves 
> > > > > > > security in
> > > > > > > case virtiofsd is compromised by making it hard for an attacker 
> > > > > > > to gain further
> > > > > > > access to the system.
> > > > > > 
> > > > > > Hi Stefan,
> > > > > > 
> > > > > > I just noticed that this patch set breaks overlayfs on top of 
> > > > > > virtiofs.
> > > > > > 
> > > > > > overlayfs sets "trusted.overlay.*" and xattrs in trusted domain
> > > > > > need CAP_SYS_ADMIN.
> > > > > > 
> > > > > > man xattr says.
> > > > > > 
> > > > > >Trusted extended attributes
> > > > > >Trusted  extended  attributes  are  visible and accessible 
> > > > > > only to pro‐
> > > > > >cesses that have the  CAP_SYS_ADMIN  capability.   
> > > > > > Attributes  in  this
> > > > > >class are used to implement mechanisms in user space (i.e., 
> > > > > > outside the
> > > > > >kernel) which keep information in extended attributes to 
> > > > > > which ordinary
> > > > > >processes should not have access.
> > > > > > 
> > > > > > There is a chance that overlay moves away from trusted xattr in 
> > > > > > future.
> > > > > > But for now we need to make it work. This is an important use case 
> > > > > > for
> > > > > > kata docker in docker build.
> > > > > > 
> > > > > > May be we can add an option to virtiofsd say "--add-cap 
> > > > > > " and
> > > > > > ask user to pass in "--add-cap cap_sys_admin" if they need to run 
> > > > > > daemon
> > > > > > with this capaibility.
> > > > > 
> > > > > I'll admit I don't like the idea of giving it cap_sys_admin.
> > > > > Can you explain:
> > > > >   a) What overlayfs uses trusted for?
> > > > 
> > > > overlayfs stores bunch of metadata and uses "trusted" xattrs for it.
> > > 
> > > Tell me more about this metadata.
> > > Taking a juicy looking one, what does OVL_XATTR_REDIRECT do?
> > > Or what happens if I was to write random numbers into OVL_XATTR_NLINK?
> > > 
> > > > >   b) If something nasty was to write junk into the trusted attributes,
> > > > > what would happen?
> > > > 
> > > > This directory is owned by guest. So it should be able to write
> > > > anything it wants, as long as process in guest has CAP_SYS_ADMIN, right?
> > > 
> > > Well, we shouldn't be able to break/crash/escape into the host; how
> > > much does overlayfs validate trusted.* it uses?
> > > 
> > > > >   c) I see overlayfs has a fallback check if xattr isn't supported at
> > > > > all - what is the consequence?
> > > > 
> > > > It falls back to I think read only mode. 
> > > 
> > > It looks like the fallback is more subtle to me:
> > > /*
> > >  * Check if upper/work fs supports trusted.overlay.* xattr
> > >  */
> > > err = ovl_do_setxattr(ofs->workdir, OVL_XATTR_OPAQUE, "0", 1, 0);
> > > if (err) {
> > > ofs->noxattr = true;
> > > ofs->config.index = false;
> > > ofs->config.metacopy = false;
> > > pr_warn("upper fs does not support xattr, falling back to 
> > > index=off and metacopy=off.\n");
> > > 
> > > but I don't know what index and metacopy are.
> > > 
> > > > For a moment forget about overlayfs. Say a user process in guest with
> > > > CAP_SYS_ADMIN is writing trusted.foo. Should that succeed? Is a
> > > > passthrough filesystem, so it should go through. But currently it
> > > > wont.
> > > 
> > > As long as any effects of what it writes are contained to the area of
> > > the filesystem exposed to the guest, yes - however it worries me what
> > > the consequences of broken trusted metadata is.  If it's delicate enough
> > > that it's guarded by CAP_SYS_ADMIN someone must have worried about it.
> > 
> > The CAP_SYS_ADMIN requirement for 'trusted.' xattrs is simply a useful
> > mechanism for applications to control access. The host kernel doesn'
> > tuse this namespace itself. Linux has four namespaces for xattrs:
> > 
> >  -  user - for userspace apps. accessible based on read/write permissions
> >  -  trusted - for userspace apps. accessible by CAP_SYS_ADMIN processes only
> >  -  system - for kernel only. used by ACLs
> >  -  security - for kernel only. used by SELinux
> > 
> > The use case for "trusted" xattrs is thus where a privileged management
> > application or service wants to store metadata against the file, but
> > also needs to grant an unprivile

Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-19 Thread Dr. David Alan Gilbert
* Daniel P. Berrangé (berra...@redhat.com) wrote:
> On Fri, Jun 19, 2020 at 09:27:46AM +0100, Dr. David Alan Gilbert wrote:
> > * Vivek Goyal (vgo...@redhat.com) wrote:
> > > On Thu, Jun 18, 2020 at 08:16:55PM +0100, Dr. David Alan Gilbert wrote:
> > > > * Vivek Goyal (vgo...@redhat.com) wrote:
> > > > > On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> > > > > > virtiofsd doesn't need of all Linux capabilities(7) available to 
> > > > > > root.  Keep a
> > > > > > whitelisted set of capabilities that we require.  This improves 
> > > > > > security in
> > > > > > case virtiofsd is compromised by making it hard for an attacker to 
> > > > > > gain further
> > > > > > access to the system.
> > > > > 
> > > > > Hi Stefan,
> > > > > 
> > > > > I just noticed that this patch set breaks overlayfs on top of 
> > > > > virtiofs.
> > > > > 
> > > > > overlayfs sets "trusted.overlay.*" and xattrs in trusted domain
> > > > > need CAP_SYS_ADMIN.
> > > > > 
> > > > > man xattr says.
> > > > > 
> > > > >Trusted extended attributes
> > > > >Trusted  extended  attributes  are  visible and accessible 
> > > > > only to pro‐
> > > > >cesses that have the  CAP_SYS_ADMIN  capability.   Attributes  
> > > > > in  this
> > > > >class are used to implement mechanisms in user space (i.e., 
> > > > > outside the
> > > > >kernel) which keep information in extended attributes to which 
> > > > > ordinary
> > > > >processes should not have access.
> > > > > 
> > > > > There is a chance that overlay moves away from trusted xattr in 
> > > > > future.
> > > > > But for now we need to make it work. This is an important use case for
> > > > > kata docker in docker build.
> > > > > 
> > > > > May be we can add an option to virtiofsd say "--add-cap " 
> > > > > and
> > > > > ask user to pass in "--add-cap cap_sys_admin" if they need to run 
> > > > > daemon
> > > > > with this capaibility.
> > > > 
> > > > I'll admit I don't like the idea of giving it cap_sys_admin.
> > > > Can you explain:
> > > >   a) What overlayfs uses trusted for?
> > > 
> > > overlayfs stores bunch of metadata and uses "trusted" xattrs for it.
> > 
> > Tell me more about this metadata.
> > Taking a juicy looking one, what does OVL_XATTR_REDIRECT do?
> > Or what happens if I was to write random numbers into OVL_XATTR_NLINK?
> > 
> > > >   b) If something nasty was to write junk into the trusted attributes,
> > > > what would happen?
> > > 
> > > This directory is owned by guest. So it should be able to write
> > > anything it wants, as long as process in guest has CAP_SYS_ADMIN, right?
> > 
> > Well, we shouldn't be able to break/crash/escape into the host; how
> > much does overlayfs validate trusted.* it uses?
> > 
> > > >   c) I see overlayfs has a fallback check if xattr isn't supported at
> > > > all - what is the consequence?
> > > 
> > > It falls back to I think read only mode. 
> > 
> > It looks like the fallback is more subtle to me:
> > /*
> >  * Check if upper/work fs supports trusted.overlay.* xattr
> >  */
> > err = ovl_do_setxattr(ofs->workdir, OVL_XATTR_OPAQUE, "0", 1, 0);
> > if (err) {
> > ofs->noxattr = true;
> > ofs->config.index = false;
> > ofs->config.metacopy = false;
> > pr_warn("upper fs does not support xattr, falling back to 
> > index=off and metacopy=off.\n");
> > 
> > but I don't know what index and metacopy are.
> > 
> > > For a moment forget about overlayfs. Say a user process in guest with
> > > CAP_SYS_ADMIN is writing trusted.foo. Should that succeed? Is a
> > > passthrough filesystem, so it should go through. But currently it
> > > wont.
> > 
> > As long as any effects of what it writes are contained to the area of
> > the filesystem exposed to the guest, yes - however it worries me what
> > the consequences of broken trusted metadata is.  If it's delicate enough
> > that it's guarded by CAP_SYS_ADMIN someone must have worried about it.
> 
> The CAP_SYS_ADMIN requirement for 'trusted.' xattrs is simply a useful
> mechanism for applications to control access. The host kernel doesn'
> tuse this namespace itself. Linux has four namespaces for xattrs:
> 
>  -  user - for userspace apps. accessible based on read/write permissions
>  -  trusted - for userspace apps. accessible by CAP_SYS_ADMIN processes only
>  -  system - for kernel only. used by ACLs
>  -  security - for kernel only. used by SELinux
> 
> The use case for "trusted" xattrs is thus where a privileged management
> application or service wants to store metadata against the file, but
> also needs to grant an unprivileged process access to write to this file
> while not allowing that unprivileged process the ability to change the
> metadata. This is mentioned in the man page:
> 
> [man xattr(7)]
>Trusted extended attributes
>Trusted  extended attributes are visible and accessible only to pro‐
>

Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-19 Thread Daniel P . Berrangé
On Fri, Jun 19, 2020 at 09:27:46AM +0100, Dr. David Alan Gilbert wrote:
> * Vivek Goyal (vgo...@redhat.com) wrote:
> > On Thu, Jun 18, 2020 at 08:16:55PM +0100, Dr. David Alan Gilbert wrote:
> > > * Vivek Goyal (vgo...@redhat.com) wrote:
> > > > On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> > > > > virtiofsd doesn't need of all Linux capabilities(7) available to 
> > > > > root.  Keep a
> > > > > whitelisted set of capabilities that we require.  This improves 
> > > > > security in
> > > > > case virtiofsd is compromised by making it hard for an attacker to 
> > > > > gain further
> > > > > access to the system.
> > > > 
> > > > Hi Stefan,
> > > > 
> > > > I just noticed that this patch set breaks overlayfs on top of virtiofs.
> > > > 
> > > > overlayfs sets "trusted.overlay.*" and xattrs in trusted domain
> > > > need CAP_SYS_ADMIN.
> > > > 
> > > > man xattr says.
> > > > 
> > > >Trusted extended attributes
> > > >Trusted  extended  attributes  are  visible and accessible only 
> > > > to pro‐
> > > >cesses that have the  CAP_SYS_ADMIN  capability.   Attributes  
> > > > in  this
> > > >class are used to implement mechanisms in user space (i.e., 
> > > > outside the
> > > >kernel) which keep information in extended attributes to which 
> > > > ordinary
> > > >processes should not have access.
> > > > 
> > > > There is a chance that overlay moves away from trusted xattr in future.
> > > > But for now we need to make it work. This is an important use case for
> > > > kata docker in docker build.
> > > > 
> > > > May be we can add an option to virtiofsd say "--add-cap " 
> > > > and
> > > > ask user to pass in "--add-cap cap_sys_admin" if they need to run daemon
> > > > with this capaibility.
> > > 
> > > I'll admit I don't like the idea of giving it cap_sys_admin.
> > > Can you explain:
> > >   a) What overlayfs uses trusted for?
> > 
> > overlayfs stores bunch of metadata and uses "trusted" xattrs for it.
> 
> Tell me more about this metadata.
> Taking a juicy looking one, what does OVL_XATTR_REDIRECT do?
> Or what happens if I was to write random numbers into OVL_XATTR_NLINK?
> 
> > >   b) If something nasty was to write junk into the trusted attributes,
> > > what would happen?
> > 
> > This directory is owned by guest. So it should be able to write
> > anything it wants, as long as process in guest has CAP_SYS_ADMIN, right?
> 
> Well, we shouldn't be able to break/crash/escape into the host; how
> much does overlayfs validate trusted.* it uses?
> 
> > >   c) I see overlayfs has a fallback check if xattr isn't supported at
> > > all - what is the consequence?
> > 
> > It falls back to I think read only mode. 
> 
> It looks like the fallback is more subtle to me:
> /*
>  * Check if upper/work fs supports trusted.overlay.* xattr
>  */
> err = ovl_do_setxattr(ofs->workdir, OVL_XATTR_OPAQUE, "0", 1, 0);
> if (err) {
> ofs->noxattr = true;
> ofs->config.index = false;
> ofs->config.metacopy = false;
> pr_warn("upper fs does not support xattr, falling back to 
> index=off and metacopy=off.\n");
> 
> but I don't know what index and metacopy are.
> 
> > For a moment forget about overlayfs. Say a user process in guest with
> > CAP_SYS_ADMIN is writing trusted.foo. Should that succeed? Is a
> > passthrough filesystem, so it should go through. But currently it
> > wont.
> 
> As long as any effects of what it writes are contained to the area of
> the filesystem exposed to the guest, yes - however it worries me what
> the consequences of broken trusted metadata is.  If it's delicate enough
> that it's guarded by CAP_SYS_ADMIN someone must have worried about it.

The CAP_SYS_ADMIN requirement for 'trusted.' xattrs is simply a useful
mechanism for applications to control access. The host kernel doesn'
tuse this namespace itself. Linux has four namespaces for xattrs:

 -  user - for userspace apps. accessible based on read/write permissions
 -  trusted - for userspace apps. accessible by CAP_SYS_ADMIN processes only
 -  system - for kernel only. used by ACLs
 -  security - for kernel only. used by SELinux

The use case for "trusted" xattrs is thus where a privileged management
application or service wants to store metadata against the file, but
also needs to grant an unprivileged process access to write to this file
while not allowing that unprivileged process the ability to change the
metadata. This is mentioned in the man page:

[man xattr(7)]
   Trusted extended attributes
   Trusted  extended attributes are visible and accessible only to pro‐
   cesses that have the CAP_SYS_ADMIN capability.  Attributes  in  this
   class  are used to implement mechanisms in user space (i.e., outside
   the kernel) which keep information in extended attributes  to  which
   ordinary processes should not have access.

   User extended attribut

Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-19 Thread Dr. David Alan Gilbert
* Chirantan Ekbote (chiran...@chromium.org) wrote:
> On Fri, Jun 19, 2020 at 5:40 PM Dr. David Alan Gilbert
>  wrote:
> >
> > * Chirantan Ekbote (chiran...@chromium.org) wrote:
> >
> > > We ended up working around it by prefixing "user.virtiofs." to the
> > > xattr name[2], which has its own problems but there was pretty much no
> > > chance that we would be able to give the fs device CAP_SYS_ADMIN in
> > > the init namespace.
> >
> >
> > What problems did you hit with that?  We should standardise the renaming
> > so we make an on-disc format that's compatible.
> >
> 
> I guess what I meant by problems is that it made what was previously a
> simple and straightforward implementation into something more complex
> and added some limitations.

Yeh.

>  For example, we now need to parse the
> result of the listxattr system call and strip out the prefix from any
> name in the list.  It also means that we cannot allow the guest to
> directly set or remove any "user.virtiofs." xattr as this would allow
> an unprivileged process in the vm to modify an xattr that it wouldn't
> otherwise be allowed to modify.  On top of being a somewhat arbitrary
> restriction this also means that you can't have stacked virtiofs
> instances as the lower instance would reject attempts by the upper one
> to set those xattrs.  These limitations aren't really a problem for us
> but I can see how they might be a problem for others.

Isn't this a classic escaping problem?
Would it work if you prepended  'user.virtiofs.' onto any xattr
that started with 'trusted' or 'user.virtiofs.' ?

> The change was also merged just yesterday so there may be other
> problems with it that haven't surfaced yet.
> 
> I didn't mention it before because I figured this was something that
> we brought upon ourselves as chrome os is a bit extreme about
> sandboxing. 

I think we're trying to be as extreme in virtiofsd, but it is causing us
similar problems.

> If we can come up with a standardized way to handle this
> I think we'll gladly switch the chrome os implementation to use it.

Great!

Dave

> 
> Chirantan
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-19 Thread Dr. David Alan Gilbert
* Chirantan Ekbote (chiran...@chromium.org) wrote:
> On Fri, Jun 19, 2020 at 4:27 AM Vivek Goyal  wrote:
> >
> > On Thu, Jun 18, 2020 at 08:16:55PM +0100, Dr. David Alan Gilbert wrote:
> > > * Vivek Goyal (vgo...@redhat.com) wrote:
> > > > On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> > > > > virtiofsd doesn't need of all Linux capabilities(7) available to 
> > > > > root.  Keep a
> > > > > whitelisted set of capabilities that we require.  This improves 
> > > > > security in
> > > > > case virtiofsd is compromised by making it hard for an attacker to 
> > > > > gain further
> > > > > access to the system.
> > > >
> > > > Hi Stefan,
> > > >
> > > > I just noticed that this patch set breaks overlayfs on top of virtiofs.
> > > >
> > > > overlayfs sets "trusted.overlay.*" and xattrs in trusted domain
> > > > need CAP_SYS_ADMIN.
> > > >
> 
> Not just that but it needs CAP_SYS_ADMIN in the init namespace[1].  We
> have the same problem.  Our virtiofs process has CAP_SYS_ADMIN in a
> user namespace and it cannot set any trusted or security xattrs.  The
> security xattrs check is at least namespace aware so you only need
> CAP_SYS_ADMIN in the namespace that mounted the fs but that doesn't
> help us much.

It would have been good if you'd mentioned that; it would have saved
Vivek some confusion!

> We ended up working around it by prefixing "user.virtiofs." to the
> xattr name[2], which has its own problems but there was pretty much no
> chance that we would be able to give the fs device CAP_SYS_ADMIN in
> the init namespace.


What problems did you hit with that?  We should standardise the renaming
so we make an on-disc format that's compatible.

Dave

> 
> [1]: 
> https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux/+/5e857ce6eae7ca21b2055cca4885545e29228fe2/fs/xattr.c#116
> [2]: 
> https://chromium-review.googlesource.com/c/chromiumos/platform/crosvm/+/2243111
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-19 Thread Dr. David Alan Gilbert
* Vivek Goyal (vgo...@redhat.com) wrote:
> On Thu, Jun 18, 2020 at 08:16:55PM +0100, Dr. David Alan Gilbert wrote:
> > * Vivek Goyal (vgo...@redhat.com) wrote:
> > > On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> > > > virtiofsd doesn't need of all Linux capabilities(7) available to root.  
> > > > Keep a
> > > > whitelisted set of capabilities that we require.  This improves 
> > > > security in
> > > > case virtiofsd is compromised by making it hard for an attacker to gain 
> > > > further
> > > > access to the system.
> > > 
> > > Hi Stefan,
> > > 
> > > I just noticed that this patch set breaks overlayfs on top of virtiofs.
> > > 
> > > overlayfs sets "trusted.overlay.*" and xattrs in trusted domain
> > > need CAP_SYS_ADMIN.
> > > 
> > > man xattr says.
> > > 
> > >Trusted extended attributes
> > >Trusted  extended  attributes  are  visible and accessible only to 
> > > pro‐
> > >cesses that have the  CAP_SYS_ADMIN  capability.   Attributes  in  
> > > this
> > >class are used to implement mechanisms in user space (i.e., 
> > > outside the
> > >kernel) which keep information in extended attributes to which 
> > > ordinary
> > >processes should not have access.
> > > 
> > > There is a chance that overlay moves away from trusted xattr in future.
> > > But for now we need to make it work. This is an important use case for
> > > kata docker in docker build.
> > > 
> > > May be we can add an option to virtiofsd say "--add-cap " and
> > > ask user to pass in "--add-cap cap_sys_admin" if they need to run daemon
> > > with this capaibility.
> > 
> > I'll admit I don't like the idea of giving it cap_sys_admin.
> > Can you explain:
> >   a) What overlayfs uses trusted for?
> 
> overlayfs stores bunch of metadata and uses "trusted" xattrs for it.

Tell me more about this metadata.
Taking a juicy looking one, what does OVL_XATTR_REDIRECT do?
Or what happens if I was to write random numbers into OVL_XATTR_NLINK?

> >   b) If something nasty was to write junk into the trusted attributes,
> > what would happen?
> 
> This directory is owned by guest. So it should be able to write
> anything it wants, as long as process in guest has CAP_SYS_ADMIN, right?

Well, we shouldn't be able to break/crash/escape into the host; how
much does overlayfs validate trusted.* it uses?

> >   c) I see overlayfs has a fallback check if xattr isn't supported at
> > all - what is the consequence?
> 
> It falls back to I think read only mode. 

It looks like the fallback is more subtle to me:
/*
 * Check if upper/work fs supports trusted.overlay.* xattr
 */
err = ovl_do_setxattr(ofs->workdir, OVL_XATTR_OPAQUE, "0", 1, 0);
if (err) {
ofs->noxattr = true;
ofs->config.index = false;
ofs->config.metacopy = false;
pr_warn("upper fs does not support xattr, falling back to 
index=off and metacopy=off.\n");

but I don't know what index and metacopy are.

> For a moment forget about overlayfs. Say a user process in guest with
> CAP_SYS_ADMIN is writing trusted.foo. Should that succeed? Is a
> passthrough filesystem, so it should go through. But currently it
> wont.

As long as any effects of what it writes are contained to the area of
the filesystem exposed to the guest, yes - however it worries me what
the consequences of broken trusted metadata is.  If it's delicate enough
that it's guarded by CAP_SYS_ADMIN someone must have worried about it.

Dave

> Thanks
> Vivek
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-18 Thread Vivek Goyal
On Thu, Jun 18, 2020 at 08:16:55PM +0100, Dr. David Alan Gilbert wrote:
> * Vivek Goyal (vgo...@redhat.com) wrote:
> > On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> > > virtiofsd doesn't need of all Linux capabilities(7) available to root.  
> > > Keep a
> > > whitelisted set of capabilities that we require.  This improves security 
> > > in
> > > case virtiofsd is compromised by making it hard for an attacker to gain 
> > > further
> > > access to the system.
> > 
> > Hi Stefan,
> > 
> > I just noticed that this patch set breaks overlayfs on top of virtiofs.
> > 
> > overlayfs sets "trusted.overlay.*" and xattrs in trusted domain
> > need CAP_SYS_ADMIN.
> > 
> > man xattr says.
> > 
> >Trusted extended attributes
> >Trusted  extended  attributes  are  visible and accessible only to 
> > pro‐
> >cesses that have the  CAP_SYS_ADMIN  capability.   Attributes  in  
> > this
> >class are used to implement mechanisms in user space (i.e., outside 
> > the
> >kernel) which keep information in extended attributes to which 
> > ordinary
> >processes should not have access.
> > 
> > There is a chance that overlay moves away from trusted xattr in future.
> > But for now we need to make it work. This is an important use case for
> > kata docker in docker build.
> > 
> > May be we can add an option to virtiofsd say "--add-cap " and
> > ask user to pass in "--add-cap cap_sys_admin" if they need to run daemon
> > with this capaibility.
> 
> I'll admit I don't like the idea of giving it cap_sys_admin.
> Can you explain:
>   a) What overlayfs uses trusted for?

overlayfs stores bunch of metadata and uses "trusted" xattrs for it.

>   b) If something nasty was to write junk into the trusted attributes,
> what would happen?

This directory is owned by guest. So it should be able to write
anything it wants, as long as process in guest has CAP_SYS_ADMIN, right?

>   c) I see overlayfs has a fallback check if xattr isn't supported at
> all - what is the consequence?

It falls back to I think read only mode. 

For a moment forget about overlayfs. Say a user process in guest with
CAP_SYS_ADMIN is writing trusted.foo. Should that succeed? Is a
passthrough filesystem, so it should go through. But currently it
wont.

Thanks
Vivek




Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-18 Thread Dr. David Alan Gilbert
* Vivek Goyal (vgo...@redhat.com) wrote:
> On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> > virtiofsd doesn't need of all Linux capabilities(7) available to root.  
> > Keep a
> > whitelisted set of capabilities that we require.  This improves security in
> > case virtiofsd is compromised by making it hard for an attacker to gain 
> > further
> > access to the system.
> 
> Hi Stefan,
> 
> I just noticed that this patch set breaks overlayfs on top of virtiofs.
> 
> overlayfs sets "trusted.overlay.*" and xattrs in trusted domain
> need CAP_SYS_ADMIN.
> 
> man xattr says.
> 
>Trusted extended attributes
>Trusted  extended  attributes  are  visible and accessible only to pro‐
>cesses that have the  CAP_SYS_ADMIN  capability.   Attributes  in  this
>class are used to implement mechanisms in user space (i.e., outside the
>kernel) which keep information in extended attributes to which ordinary
>processes should not have access.
> 
> There is a chance that overlay moves away from trusted xattr in future.
> But for now we need to make it work. This is an important use case for
> kata docker in docker build.
> 
> May be we can add an option to virtiofsd say "--add-cap " and
> ask user to pass in "--add-cap cap_sys_admin" if they need to run daemon
> with this capaibility.

I'll admit I don't like the idea of giving it cap_sys_admin.
Can you explain:
  a) What overlayfs uses trusted for?
  b) If something nasty was to write junk into the trusted attributes,
what would happen?
  c) I see overlayfs has a fallback check if xattr isn't supported at
all - what is the consequence?

Dave

> Thanks
> Vivek
> 
> > 
> > Stefan Hajnoczi (2):
> >   virtiofsd: only retain file system capabilities
> >   virtiofsd: drop all capabilities in the wait parent process
> > 
> >  tools/virtiofsd/passthrough_ll.c | 51 
> >  1 file changed, 51 insertions(+)
> > 
> > -- 
> > 2.25.1
> > 
> > ___
> > Virtio-fs mailing list
> > virtio...@redhat.com
> > https://www.redhat.com/mailman/listinfo/virtio-fs
> 
> ___
> Virtio-fs mailing list
> virtio...@redhat.com
> https://www.redhat.com/mailman/listinfo/virtio-fs
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [Virtio-fs] [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-06-18 Thread Vivek Goyal
On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> virtiofsd doesn't need of all Linux capabilities(7) available to root.  Keep a
> whitelisted set of capabilities that we require.  This improves security in
> case virtiofsd is compromised by making it hard for an attacker to gain 
> further
> access to the system.

Hi Stefan,

I just noticed that this patch set breaks overlayfs on top of virtiofs.

overlayfs sets "trusted.overlay.*" and xattrs in trusted domain
need CAP_SYS_ADMIN.

man xattr says.

   Trusted extended attributes
   Trusted  extended  attributes  are  visible and accessible only to pro‐
   cesses that have the  CAP_SYS_ADMIN  capability.   Attributes  in  this
   class are used to implement mechanisms in user space (i.e., outside the
   kernel) which keep information in extended attributes to which ordinary
   processes should not have access.

There is a chance that overlay moves away from trusted xattr in future.
But for now we need to make it work. This is an important use case for
kata docker in docker build.

May be we can add an option to virtiofsd say "--add-cap " and
ask user to pass in "--add-cap cap_sys_admin" if they need to run daemon
with this capaibility.

Thanks
Vivek

> 
> Stefan Hajnoczi (2):
>   virtiofsd: only retain file system capabilities
>   virtiofsd: drop all capabilities in the wait parent process
> 
>  tools/virtiofsd/passthrough_ll.c | 51 
>  1 file changed, 51 insertions(+)
> 
> -- 
> 2.25.1
> 
> ___
> Virtio-fs mailing list
> virtio...@redhat.com
> https://www.redhat.com/mailman/listinfo/virtio-fs




Re: [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-05-01 Thread Dr. David Alan Gilbert
* Stefan Hajnoczi (stefa...@redhat.com) wrote:
> virtiofsd doesn't need of all Linux capabilities(7) available to root.  Keep a
> whitelisted set of capabilities that we require.  This improves security in
> case virtiofsd is compromised by making it hard for an attacker to gain 
> further
> access to the system.

Queued.

> Stefan Hajnoczi (2):
>   virtiofsd: only retain file system capabilities
>   virtiofsd: drop all capabilities in the wait parent process
> 
>  tools/virtiofsd/passthrough_ll.c | 51 
>  1 file changed, 51 insertions(+)
> 
> -- 
> 2.25.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-04-17 Thread Stefan Hajnoczi
On Thu, Apr 16, 2020 at 04:10:22PM -0400, Vivek Goyal wrote:
> On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> > virtiofsd doesn't need of all Linux capabilities(7) available to root.  
> > Keep a
> > whitelisted set of capabilities that we require.  This improves security in
> > case virtiofsd is compromised by making it hard for an attacker to gain 
> > further
> > access to the system.
> 
> Hi Stefan,
> 
> Good to see this patch. We needed to limit capabilities to reduce attack
> surface.
> 
> What tests have you run to make sure this current set of whitelisted
> capabilities is good enough.

Booting and light usage of Fedora 29 and running blogbench.

I would appreciate it if others could try it out with their
tests/workloads.

Stefan


signature.asc
Description: PGP signature


Re: [PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-04-16 Thread Vivek Goyal
On Thu, Apr 16, 2020 at 05:49:05PM +0100, Stefan Hajnoczi wrote:
> virtiofsd doesn't need of all Linux capabilities(7) available to root.  Keep a
> whitelisted set of capabilities that we require.  This improves security in
> case virtiofsd is compromised by making it hard for an attacker to gain 
> further
> access to the system.

Hi Stefan,

Good to see this patch. We needed to limit capabilities to reduce attack
surface.

What tests have you run to make sure this current set of whitelisted
capabilities is good enough.

Vivek

> 
> Stefan Hajnoczi (2):
>   virtiofsd: only retain file system capabilities
>   virtiofsd: drop all capabilities in the wait parent process
> 
>  tools/virtiofsd/passthrough_ll.c | 51 
>  1 file changed, 51 insertions(+)
> 
> -- 
> 2.25.1
> 




[PATCH 0/2] virtiofsd: drop Linux capabilities(7)

2020-04-16 Thread Stefan Hajnoczi
virtiofsd doesn't need of all Linux capabilities(7) available to root.  Keep a
whitelisted set of capabilities that we require.  This improves security in
case virtiofsd is compromised by making it hard for an attacker to gain further
access to the system.

Stefan Hajnoczi (2):
  virtiofsd: only retain file system capabilities
  virtiofsd: drop all capabilities in the wait parent process

 tools/virtiofsd/passthrough_ll.c | 51 
 1 file changed, 51 insertions(+)

-- 
2.25.1