On Tue, Jul 21, 2015 at 01:37:21PM -0400, J. Bruce Fields wrote:
> On Fri, Jul 17, 2015 at 12:47:35PM +1000, Dave Chinner wrote:
> > On Thu, Jul 16, 2015 at 07:42:03PM -0500, Eric W. Biederman wrote:
> > > Dave Chinner <da...@fromorbit.com> writes:
> > > > The key difference is that desktops only do this when you physically
> > > > plug in a device. With unprivileged mounts, a hostile attacker
> > > > doesn't need physical access to the machine to exploit lurking
> > > > kernel filesystem bugs. i.e. they can just use loopback mounts, and
> > > > they can keep mounting corrupted images until they find something
> > > > that works.
> > > 
> > > Yep.  That magnifies the problem quite a bit.
> > > 
> > > > User namespaces are supposed to provide trust separation.  The
> > > > kernel filesystems simply aren't hardened against unprivileged
> > > > attacks from below - there is a trust relationship between root and
> > > > the filesystem in that they are the only things that can write to
> > > > the disk. Mounts from within a userns destroy this relationship as
> > > > the userns root, by definition, is not a trusted actor.
> > > 
> > > I talked to Ted Tso a while back and ext4 is at least in principle
> > > already hardened against that kind of attack.  I am not certain I
> > > believe it, but if it is true I think it is fantastic.
> > 
> > No, it's not. No filesystem is, because to harden against such
> > attacks requires complete verification of all metadata when it is
> > read from disk, before it is used, or some method of ensuring the
> > block was not tampered with. CRCs are not sufficient, because they
> > can be tampered with, too.
> > 
> > The only way a filesystem would be able to trust that what it reads from
> > disk has not been tampered with in a system with untrusted mounts is
> > if it has some kind of cryptographically secure signature in the
> > metadata and the attacker is unable to access the key for that
> > signature.
> 
> Preventing tampering is a little different from protecting the kernel
> from attack, isn't it?  I thought the latter was what people were asking
> about.

People might be asking for the latter, but the only attack vector
against filesystems from below is tampering with the on-disk
structure.

An untrusted user in an untrusted container can construct arbitrary
untrusted filesystem structures and get them parsed by a context
running as $DEITY that assumes the structure is from a trusted
source.  What can possibly go wrong?

IOWs, to protect the kernel against attack from untrusted filesystem
images, we either have to be able to guarantee the image cannot be
modified by untrusted parties (i.e. it needs to be created with
signed tools, contain only signed filesystem metadata and
signed/encrypted data), or we have to sandbox the filesystem parsing
code completely (i.e. fuse).
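
As a rough illustration of why a checksum is not a signature, here is
a small userspace sketch in Python (not kernel code). The record
format, KEY and all the helper names are made up for the example, and
HMAC-SHA256 merely stands in for "some cryptographically secure
signature"; no real filesystem is being described.

```python
import hashlib
import hmac
import zlib


def seal_crc(payload: bytes) -> bytes:
    # Append a CRC32; anyone who can write the block can recompute it.
    return payload + zlib.crc32(payload).to_bytes(4, "little")


def check_crc(block: bytes) -> bool:
    payload, stored = block[:-4], block[-4:]
    return zlib.crc32(payload).to_bytes(4, "little") == stored


def seal_mac(payload: bytes, key: bytes) -> bytes:
    # Append a keyed tag; forging it requires knowing the key.
    return payload + hmac.new(key, payload, hashlib.sha256).digest()


def check_mac(block: bytes, key: bytes) -> bool:
    payload, stored = block[:-32], block[-32:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, stored)


KEY = b"held-by-the-trusted-side-only"  # hypothetical key
good = b"inode=12 size=4096 next=13"    # hypothetical metadata record
evil = b"inode=12 size=4096 next=12"    # tampered: now points at itself

# The attacker rewrites the payload and simply recomputes the CRC.
forged_crc = seal_crc(evil)

# Without the key, the best the attacker can do is reuse the old tag.
forged_mac = evil + seal_mac(good, KEY)[-32:]
```

With these definitions check_crc(forged_crc) succeeds, while
check_mac(forged_mac, KEY) fails. That only holds for as long as the
attacker cannot read KEY, which is the hard part in practice.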

> So, for example, a screwed up on-disk directory structure shouldn't
> result in creating a cycle in the dcache and then deadlocking.

Therein lies the problem: how do you detect such structural defects
without doing a full structure validation? e.g. cyclic links may
only manifest when completely unrelated pieces of metadata are linked
together in a specific way.
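
A toy model of the problem, again as a Python sketch with an invented
layout: "dirs" maps a directory inode number to the inode numbers of
its subdirectories. entry_looks_sane() and find_cycle() are
hypothetical helpers, not anything a real filesystem provides.

```python
def entry_looks_sane(dirs, ino):
    # Local check: every child pointer refers to a directory that exists.
    return all(child in dirs for child in dirs[ino])


def find_cycle(dirs, root):
    # Iterative depth-first walk of everything reachable from root.
    # Returns True if some directory is reachable from itself.
    # Assumes every entry already passed entry_looks_sane().
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    state = {ino: WHITE for ino in dirs}
    state[root] = GREY
    stack = [(root, iter(dirs[root]))]
    while stack:
        ino, children = stack[-1]
        for child in children:
            if state[child] == GREY:
                return True        # child is one of our own ancestors
            if state[child] == WHITE:
                state[child] = GREY
                stack.append((child, iter(dirs[child])))
                break
        else:
            # All children handled; this directory is finished.
            state[ino] = BLACK
            stack.pop()
    return False


# Directory 4 lists its own parent, directory 2, as a subdirectory.
image = {1: [2, 3], 2: [4], 3: [], 4: [2]}
clean = {1: [2, 3], 2: [4], 3: [], 4: []}
```

Every entry in "image" passes the local check, yet find_cycle(image, 1)
reports a loop. Finding it meant walking the whole tree, which is the
cost being argued about here.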

Further, the problem is not restricted to validation at mount time -
if the user can write to the filesystem image file, then they can
modify it after it has been mounted, too. The attacker may also be
someone who has broken into a container, not necessarily the user
you trusted with unprivileged mounts. So every cold metadata read
needs to be treated with suspicion, not just the reads done at mount
time.
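
To sketch the difference between validating once at mount and
validating every cold read (Python, with an invented block format of
one length byte followed by name bytes; MountedImage and
tamper_after_mount() are hypothetical names for this example only):

```python
class MountedImage:
    """Toy model: each block is 1 length byte followed by name bytes."""

    def __init__(self, backing, block_size=8):
        self.backing = backing        # bytearray the user can still write
        self.block_size = block_size
        self.cache = {}               # blocks already read since mount
        # Mount-time pass: validate every block exactly once.
        for n in range(len(backing) // block_size):
            if not self.block_ok(self.raw(n)):
                raise ValueError("corrupt block %d at mount" % n)

    def raw(self, n):
        start = n * self.block_size
        return bytes(self.backing[start:start + self.block_size])

    @staticmethod
    def block_ok(block):
        # The length byte must not point past the end of the block.
        return block[0] <= len(block) - 1

    def read(self, n, verify):
        if n in self.cache:
            return self.cache[n]
        block = self.raw(n)           # cold read from the backing file
        if verify and not self.block_ok(block):
            raise ValueError("corrupt block %d on cold read" % n)
        self.cache[n] = block
        return block


def tamper_after_mount(verify):
    backing = bytearray(b"\x03abc\x00\x00\x00\x00"
                        b"\x02hi\x00\x00\x00\x00\x00")
    img = MountedImage(backing)       # mount-time validation passes
    backing[8] = 200                  # image modified after the mount
    try:
        return img.read(1, verify)[0]
    except ValueError:
        return None
```

tamper_after_mount(False) hands back the bogus length of 200 even
though the mount-time check passed; tamper_after_mount(True) rejects
the block. The sketch only covers structural checks on a single block,
so it says nothing about the cross-block defects discussed above.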

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com