Re: [lxc-devel] [RFC PATCH 11/11] loop: Allow priveleged operations for root in the namespace which owns a device

2014-05-27 Thread Serge Hallyn
Quoting Michael H. Warfield (m...@wittsend.com):
> On Tue, 2014-05-27 at 03:36 +0200, Serge E. Hallyn wrote:
> > Quoting Michael H. Warfield (m...@wittsend.com):
> > > On Mon, 2014-05-26 at 11:16 +0200, Seth Forshee wrote:
> > > > On Fri, May 23, 2014 at 08:48:25AM +0300, Marian Marinov wrote:
> > > > > -BEGIN PGP SIGNED MESSAGE-
> > > > > Hash: SHA1
> > > > > 
> > > > > One question about this patch.
> > > > > 
> > > > > Why don't you use the devices cgroup check if the root user in that 
> > > > > namespace is allowed to use this device?
> > > > > 
> > > > > This way you can be sure that the root in that namespace can not 
> > > > > access devices to which the host system did not gave
> > > > > him access to.
> > > 
> > > > That might be possible, but I don't want to require something on the
> > > > host to whitelist the device for the container. Then loop would need to
> > > > automatically add the device to devices.allow, which doesn't seem
> > > > desirable to me. But I'm not entirely opposed to the idea if others
> > > > think this is a better way to go.
> > > 
> > > I don't see any safe way to avoid it.  The host has to be in control of
> > > what devices can and can not be accessed by the container.
> 
> > Disagree.  loop%d is meaningless until it is attached to a file.  So
> > whether a container can use loop2 vs loop9 is meaningless.  The point
> > of Seth's loopfs as I understood it is that the container simply gets a
> > unique (not visible to host or any other containers) set of loop devices
> > which it can attach to files which it owns.  So long as the host can't
> > see the container's loop devices (i.e. so it unwittently mounts it when
> > looking for a particular UUID for /var), it won't get fooled by them.
> 
> > So in this case *if* we can do it, a purely namespaced approach - meaning
> > that we restrict visibility of a particular loopdev to one container - is
> > perfect.  
> 
> And in that "*if" is a cloud that says "then a miracle occurs" and that
> miracle needs a lot more detail.

Naturally.  Which is why as Seth says we'll need concrete code to discuss.
But the concept that a well implemented namespace which prevents addressing
a given resource in the first place would suffice is, I think, a well
accepted premise of security in Linux.  And in this case it is more
appropriate than trying to finagle it into the devices cgroup.  Note that
Marian said "to check if root user in that namespace is allowed to use
this device."  This first off does not address the concern of root on the
host being tricked by the contents of loop0 which happens to be legitimately
used by container N.  In contrast, making it so that loop0 is only
addressable by container N, and not by the host, does.

Anyway I was reading the above as why don't we *base* the containerized loop
on devices cgroups.  I object to that.  Well, at least until we rule out
more elegant solutions.  Of course I don't object to defense in depth.

-serge
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [lxc-devel] [RFC PATCH 11/11] loop: Allow priveleged operations for root in the namespace which owns a device

2014-05-27 Thread Seth Forshee
On Mon, May 26, 2014 at 10:39:22PM -0400, Michael H. Warfield wrote:
> On Tue, 2014-05-27 at 03:36 +0200, Serge E. Hallyn wrote:
> > Quoting Michael H. Warfield (m...@wittsend.com):
> > > On Mon, 2014-05-26 at 11:16 +0200, Seth Forshee wrote:
> > > > On Fri, May 23, 2014 at 08:48:25AM +0300, Marian Marinov wrote:
> > > > > -BEGIN PGP SIGNED MESSAGE-
> > > > > Hash: SHA1
> > > > > 
> > > > > One question about this patch.
> > > > > 
> > > > > Why don't you use the devices cgroup check if the root user in that 
> > > > > namespace is allowed to use this device?
> > > > > 
> > > > > This way you can be sure that the root in that namespace can not 
> > > > > access devices to which the host system did not gave
> > > > > him access to.
> > > 
> > > > That might be possible, but I don't want to require something on the
> > > > host to whitelist the device for the container. Then loop would need to
> > > > automatically add the device to devices.allow, which doesn't seem
> > > > desirable to me. But I'm not entirely opposed to the idea if others
> > > > think this is a better way to go.
> > > 
> > > I don't see any safe way to avoid it.  The host has to be in control of
> > > what devices can and can not be accessed by the container.
> 
> > Disagree.  loop%d is meaningless until it is attached to a file.  So
> > whether a container can use loop2 vs loop9 is meaningless.  The point
> > of Seth's loopfs as I understood it is that the container simply gets a
> > unique (not visible to host or any other containers) set of loop devices
> > which it can attach to files which it owns.  So long as the host can't
> > see the container's loop devices (i.e. so it unwittently mounts it when
> > looking for a particular UUID for /var), it won't get fooled by them.
> 
> > So in this case *if* we can do it, a purely namespaced approach - meaning
> > that we restrict visibility of a particular loopdev to one container - is
> > perfect.  
> 
> And in that "*if" is a cloud that says "then a miracle occurs" and that
> miracle needs a lot more detail.  How that translates into what is and
> is not visible and what can be mimiced in a container becomes important
> (to say nothing of notifying its udev).  I think this loopfs thing is
> the answer for the loop device case, we just need to clear up those
> details and exorcise the devils we find in them.  The loop devices are
> unique while they strangely seem to work with minimal leakage already
> (all meta data at this time).
> 
> Seth remarked that, maybe, he's not paranoid enough.  You know that I'm
> a well trained professional paranoid and I accept if people think I'm
> overly paranoid (is that even possible?).  Even paranoids have enemies
> and just because you're paranoid it doesn't mean they're not out to get
> you.  While I admit that total isolation is virtually (excuse the pun)
> impossible that doesn't mean I don't strive to maximize the isolation
> and analyze the possibilities and consequences of compromise.
> 
> As I stated, "I don't see any way to avoid it".  I would love to be
> proven wrong.  It would permit my life to be so much more easy.  But how
> can we allow this without the host in control of it and directing things
> to the containers?  A container may request something and the host can
> grant it but the container should not be capable of demanding a device
> over and above the control of the host.  How do we define the rules that
> say what a container can do and what it cannot do without it involving
> knowledge in the host (whitelisting as Seth call's it) of what is and is
> not allowed in the container?

I need to post some code so we have something concrete to discuss, just
haven't gotten to it yet due to travel and meetings. I'll try to work
the code into something presentable and send it out later today.

But in loopfs the kernel would ultimately control directing loop devs
to containers based on the mount. A container can request a free loop
device via loop-control, and one gets assigned to that mount. Userspace
can ask for a specific loop device number, but if that device is
associated with a different mount that will fail, so the container can't
gain access to another context's loop device. The container has access
to only its loop devices by virtue of not having device nodes for any
other devices.
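
The ownership rule in the paragraph above can be sketched as a toy model
in Python. This is not the loopfs code (none had been posted at this point
in the thread); the class name, method names, and the use of a plain string
as a mount id are all hypothetical, and the model only illustrates "one
loop device number belongs to at most one mount".

```python
class LoopfsModel:
    """Toy model: each loop device number is owned by at most one mount."""

    def __init__(self):
        # loop device number -> id of the mount that owns it (hypothetical ids)
        self._owner = {}

    def get_free(self, mount_id):
        """Mimic asking loop-control for any free device."""
        n = 0
        while n in self._owner:
            n += 1
        self._owner[n] = mount_id
        return n

    def add(self, mount_id, n):
        """Mimic asking for a specific loop device number."""
        owner = self._owner.setdefault(n, mount_id)
        if owner != mount_id:
            # The device is associated with a different mount, so the request fails.
            raise PermissionError(f"loop{n} belongs to another mount")
        return n

    def visible(self, mount_id):
        """Device numbers that would have nodes inside this mount."""
        return sorted(n for n, m in self._owner.items() if m == mount_id)
```

In this model a container never learns about another mount's devices
through visible(), and a request for a number owned elsewhere fails rather
than handing over the device.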

> We already have the problem that the container devices.allow and
> devices.deny are major and minor based, which we know is fundamentally
> flawed in a udev environment.  We specify major:minor in the
> configuration files as if they are cast in cement (which they are in all
> common cases) but they are not in the general case.  Greg K-H hammers on
> this frequently.
> 
> The loop devices are unique and deserve a unique solution, I'll agree.
> But I'm also comfortable that the host should have rules and procedures
> to whitelist hard devices and loop devices and manage their transfer
> and/or sharing into the containers.

I'm aiming for being able to use the same tools to manage loop devices in
a container as on the host. If the whole thing needs to be managed by a
process on the host then I suspect we 

Re: [lxc-devel] [RFC PATCH 11/11] loop: Allow priveleged operations for root in the namespace which owns a device

2014-05-26 Thread Michael H. Warfield
On Tue, 2014-05-27 at 03:36 +0200, Serge E. Hallyn wrote:
> Quoting Michael H. Warfield (m...@wittsend.com):
> > On Mon, 2014-05-26 at 11:16 +0200, Seth Forshee wrote:
> > > On Fri, May 23, 2014 at 08:48:25AM +0300, Marian Marinov wrote:
> > > > -BEGIN PGP SIGNED MESSAGE-
> > > > Hash: SHA1
> > > > 
> > > > One question about this patch.
> > > > 
> > > > Why don't you use the devices cgroup check if the root user in that 
> > > > namespace is allowed to use this device?
> > > > 
> > > > This way you can be sure that the root in that namespace can not access 
> > > > devices to which the host system did not gave
> > > > him access to.
> > 
> > > That might be possible, but I don't want to require something on the
> > > host to whitelist the device for the container. Then loop would need to
> > > automatically add the device to devices.allow, which doesn't seem
> > > desirable to me. But I'm not entirely opposed to the idea if others
> > > think this is a better way to go.
> > 
> > I don't see any safe way to avoid it.  The host has to be in control of
> > what devices can and can not be accessed by the container.

> Disagree.  loop%d is meaningless until it is attached to a file.  So
> whether a container can use loop2 vs loop9 is meaningless.  The point
> of Seth's loopfs as I understood it is that the container simply gets a
> unique (not visible to host or any other containers) set of loop devices
> which it can attach to files which it owns.  So long as the host can't
> see the container's loop devices (i.e. so it unwittently mounts it when
> looking for a particular UUID for /var), it won't get fooled by them.

> So in this case *if* we can do it, a purely namespaced approach - meaning
> that we restrict visibility of a particular loopdev to one container - is
> perfect.  

And in that "*if*" is a cloud that says "then a miracle occurs" and that
miracle needs a lot more detail.  How that translates into what is and
is not visible and what can be mimicked in a container becomes important
(to say nothing of notifying its udev).  I think this loopfs thing is
the answer for the loop device case; we just need to clear up those
details and exorcise the devils we find in them.  The loop devices are
unique while they strangely seem to work with minimal leakage already
(all metadata at this time).

Seth remarked that, maybe, he's not paranoid enough.  You know that I'm
a well trained professional paranoid and I accept if people think I'm
overly paranoid (is that even possible?).  Even paranoids have enemies
and just because you're paranoid it doesn't mean they're not out to get
you.  While I admit that total isolation is virtually (excuse the pun)
impossible that doesn't mean I don't strive to maximize the isolation
and analyze the possibilities and consequences of compromise.

As I stated, "I don't see any way to avoid it".  I would love to be
proven wrong.  It would make my life so much easier.  But how
can we allow this without the host in control of it and directing things
to the containers?  A container may request something and the host can
grant it but the container should not be capable of demanding a device
over and above the control of the host.  How do we define the rules that
say what a container can do and what it cannot do without it involving
knowledge in the host (whitelisting as Seth calls it) of what is and is
not allowed in the container?

We already have the problem that the container devices.allow and
devices.deny are major and minor based, which we know is fundamentally
flawed in a udev environment.  We specify major:minor in the
configuration files as if they are cast in cement (which they are in all
common cases) but they are not in the general case.  Greg K-H hammers on
this frequently.
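
One way a host-side tool could avoid hard-coding a major number is to look
it up at run time from /proc/devices. The following is a minimal Python
sketch under that assumption; the function name block_major is
hypothetical, and nothing in this thread says LXC does it this way.

```python
def block_major(name, proc_devices_text):
    """Return the block major registered under `name`, or None if absent.

    `proc_devices_text` is the content of /proc/devices, which lists
    character devices and block devices in two separate sections.
    """
    in_block = False
    for line in proc_devices_text.splitlines():
        line = line.strip()
        if line == "Block devices:":
            in_block = True
        elif line == "Character devices:":
            in_block = False
        elif in_block and line:
            major, dev_name = line.split(None, 1)
            if dev_name == name:
                return int(major)
    return None
```

On a live system this would be called as
block_major("loop", open("/proc/devices").read()).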

The loop devices are unique and deserve a unique solution, I'll agree.
But I'm also comfortable that the host should have rules and procedures
to whitelist hard devices and loop devices and manage their transfer
and/or sharing into the containers.

Regards,
Mike
-- 
Michael H. Warfield (AI4NB) | (770) 978-7061 |  m...@wittsend.com
   /\/\|=mhw=|\/\/  | (678) 463-0932 |  http://www.wittsend.com/mhw/
   NIC whois: MHW9  | An optimist believes we live in the best of all
 PGP Key: 0x674627FF| possible worlds.  A pessimist is sure of it!





Re: [lxc-devel] [RFC PATCH 11/11] loop: Allow priveleged operations for root in the namespace which owns a device

2014-05-26 Thread Serge E. Hallyn
Quoting Michael H. Warfield (m...@wittsend.com):
> On Mon, 2014-05-26 at 11:16 +0200, Seth Forshee wrote:
> > On Fri, May 23, 2014 at 08:48:25AM +0300, Marian Marinov wrote:
> > > -BEGIN PGP SIGNED MESSAGE-
> > > Hash: SHA1
> > > 
> > > One question about this patch.
> > > 
> > > Why don't you use the devices cgroup check if the root user in that 
> > > namespace is allowed to use this device?
> > > 
> > > This way you can be sure that the root in that namespace can not access 
> > > devices to which the host system did not gave
> > > him access to.
> 
> > That might be possible, but I don't want to require something on the
> > host to whitelist the device for the container. Then loop would need to
> > automatically add the device to devices.allow, which doesn't seem
> > desirable to me. But I'm not entirely opposed to the idea if others
> > think this is a better way to go.
> 
> I don't see any safe way to avoid it.  The host has to be in control of
> what devices can and can not be accessed by the container.

Disagree.  loop%d is meaningless until it is attached to a file.  So
whether a container can use loop2 vs loop9 is meaningless.  The point
of Seth's loopfs as I understood it is that the container simply gets a
unique (not visible to host or any other containers) set of loop devices
which it can attach to files which it owns.  So long as the host can't
see the container's loop devices (i.e. so that it can't unwittingly mount
one when looking for a particular UUID for /var), it won't get fooled by them.

So in this case *if* we can do it, a purely namespaced approach - meaning
that we restrict visibility of a particular loopdev to one container - is
perfect.  


Re: [lxc-devel] [RFC PATCH 11/11] loop: Allow priveleged operations for root in the namespace which owns a device

2014-05-26 Thread Seth Forshee
On Mon, May 26, 2014 at 11:32:05AM -0400, Michael H. Warfield wrote:
> On Mon, 2014-05-26 at 11:16 +0200, Seth Forshee wrote:
> > On Fri, May 23, 2014 at 08:48:25AM +0300, Marian Marinov wrote:
> > > -BEGIN PGP SIGNED MESSAGE-
> > > Hash: SHA1
> > > 
> > > One question about this patch.
> > > 
> > > Why don't you use the devices cgroup check if the root user in that 
> > > namespace is allowed to use this device?
> > > 
> > > This way you can be sure that the root in that namespace can not access 
> > > devices to which the host system did not gave
> > > him access to.
> 
> > That might be possible, but I don't want to require something on the
> > host to whitelist the device for the container. Then loop would need to
> > automatically add the device to devices.allow, which doesn't seem
> > desirable to me. But I'm not entirely opposed to the idea if others
> > think this is a better way to go.
> 
> I don't see any safe way to avoid it.  The host has to be in control of
> what devices can and can not be accessed by the container.

Hmm, for testing I've been giving access to 7:* block devices since my
containers can't mknod and only see device nodes for loop devices they
have access to, but maybe I'm not being sufficiently paranoid.
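
For reference, "7:*" above means block major 7 (loop) with any minor. In
the cgroup v1 devices controller such a rule is written as a line like
"b 7:* rwm". The Python sketch below shows how a rule of that shape
matches a device; the function name is hypothetical, and the "rwm" access
string is an assumption, since the message does not say which access was
granted.

```python
def rule_matches(rule, dev_type, major, minor):
    """Return True if a devices.allow-style rule covers the given device.

    `rule` has the form "<type> <major>:<minor> <access>", where type is
    "a" (all), "b" (block) or "c" (char), and major or minor may be "*".
    The access field is parsed but not checked here.
    """
    r_type, r_nums, _access = rule.split()
    r_major, r_minor = r_nums.split(":")
    return (
        r_type in ("a", dev_type)
        and r_major in ("*", str(major))
        and r_minor in ("*", str(minor))
    )
```

With this rule every loop device number matches, which is why the
isolation in the test setup rests on the container having no device nodes
for other loop devices rather than on the cgroup.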



Re: [lxc-devel] [RFC PATCH 11/11] loop: Allow priveleged operations for root in the namespace which owns a device

2014-05-26 Thread Michael H. Warfield
On Mon, 2014-05-26 at 11:16 +0200, Seth Forshee wrote:
> On Fri, May 23, 2014 at 08:48:25AM +0300, Marian Marinov wrote:
> > -BEGIN PGP SIGNED MESSAGE-
> > Hash: SHA1
> > 
> > One question about this patch.
> > 
> > Why don't you use the devices cgroup check if the root user in that 
> > namespace is allowed to use this device?
> > 
> > This way you can be sure that the root in that namespace can not access 
> > devices to which the host system did not gave
> > him access to.

> That might be possible, but I don't want to require something on the
> host to whitelist the device for the container. Then loop would need to
> automatically add the device to devices.allow, which doesn't seem
> desirable to me. But I'm not entirely opposed to the idea if others
> think this is a better way to go.

I don't see any safe way to avoid it.  The host has to be in control of
what devices can and can not be accessed by the container.

> Seth

Regards,
Mike
-- 
Michael H. Warfield (AI4NB) | (770) 978-7061 |  m...@wittsend.com
   /\/\|=mhw=|\/\/  | (678) 463-0932 |  http://www.wittsend.com/mhw/
   NIC whois: MHW9  | An optimist believes we live in the best of all
 PGP Key: 0x674627FF| possible worlds.  A pessimist is sure of it!




