Re: [Qemu-devel] Connecting virtio-9p-pci to a remote 9p server

2012-10-30 Thread Chris Webb
Eric Van Hensbergen eri...@gmail.com writes:

 A passthrough makes perfect sense, a couple summers ago we had an
 Extreme Blue team working on using 9p for a cloud hosting environment
 -- while they were primarily working on gatewaying through a host
 operating system we also discussed doing 9p passthrough (primarily for
 test, but my other motive was looking at direct 9p to back-end server
 connections).  I'm copying that team on this message to see if they
 have any additional thoughts.  On one end you lose a bit in that you
 are no longer taking advantage of the host file system cache, which
 can be useful, particularly if there is any consolidation among the
 different guests -- but as you point out, you eliminate several copies
 and transitions through kernel space by just going direct.

[Sorry for the extremely slow followup here. I got caught up bug squashing!]

Yes, that was my feeling. It also allows things like mounting a filesystem
from one VM that's exported by another on the same host. Doing this via the
host vfs would risk deadlock under memory pressure.

I would still be very interested to hear any thoughts from your team on the
best way to get access to the 9p streams from qemu directly if they did any
work in this area. If we're going to fund development work, I'm keen to
produce something as general-purpose and as widely-applicable for other
virtio-9p users as possible, rather than just a local hack for us.

 b) have qemu snoop and validate attach operations -- this may be what
 you were suggesting.  Essentially you can hardcode the attach to only
 validate from a single user (or restrict it to a set of users).  An
 alternative is to overload protocol semantics and have the initial
 version & attach (which could be sent by qemu) carry some significance
 with the server -- hardcoding the protocol parameters and the user under
 whose authority all subsequent requests fall.  This leaves most of the
 implementation details to the server.

Yes, you're right, this was what I had in mind. However, I want to be able
to boot linux kernels with these filesystems as rootfs, so things that
involve auth and the like aren't ideal. I'd prefer not to modify the server
either. My plan was to filter the attach to only allow a specific path (or a
set of specific paths) which I can specify in the qemu command line. This
wouldn't require any server modifications, and would allow me to restrict
the guest to the right mountpoint(s) exported by the 9p server. Does that
sound sane?
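
To make that concrete, here's roughly the shape of the check I have in mind.
This is only a minimal sketch: it assumes the filter sits on the byte stream
between guest and server and sees one complete 9p message at a time, and
allowed_paths stands in for whatever a (hypothetical) qemu option would hand
it.  The message layout is as documented for 9P2000/9P2000.L.

/* Sketch of an attach filter: inspect a complete 9p message and, if it is a
 * Tattach, only let it through when its aname is on an allowlist.  Framing
 * per 9P2000/.L: size[4] type[1] tag[2] fid[4] afid[4] uname[s] aname[s]
 * n_uname[4], all little-endian, strings length-prefixed.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define P9_TATTACH 104

static uint16_t get_le16(const uint8_t *p) { return p[0] | p[1] << 8; }
static uint32_t get_le32(const uint8_t *p)
{
    return p[0] | p[1] << 8 | p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Advance past a length-prefixed 9p string; return false if it overruns. */
static bool skip_p9_string(const uint8_t *msg, size_t len, size_t *off)
{
    if (*off + 2 > len) return false;
    uint16_t slen = get_le16(msg + *off);
    if (*off + 2 + slen > len) return false;
    *off += 2 + slen;
    return true;
}

/* Return true if the message may be forwarded to the server. */
bool attach_allowed(const uint8_t *msg, size_t len,
                    const char *const *allowed_paths, size_t n_allowed)
{
    if (len < 7 || get_le32(msg) != len) return false;   /* bad framing */
    if (msg[4] != P9_TATTACH) return true;               /* not an attach */

    size_t off = 7 + 4 + 4;        /* past header, fid[4] and afid[4] */
    if (!skip_p9_string(msg, len, &off)) return false;   /* uname */
    if (off + 2 > len) return false;
    uint16_t alen = get_le16(msg + off);                  /* aname */
    const uint8_t *aname = msg + off + 2;
    if (off + 2 + alen > len) return false;

    for (size_t i = 0; i < n_allowed; i++) {
        if (strlen(allowed_paths[i]) == alen &&
            memcmp(allowed_paths[i], aname, alen) == 0) {
            return true;
        }
    }
    return false;   /* reject: caller would answer with Rlerror(EACCES) */
}

Everything other than Tattach would pass through untouched: since every fid a
client uses is ultimately walked from a fid it established at attach time,
policing the attach should be enough to pin the guest to the exported subtree
(symlink behaviour aside).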

Cheers,

Chris.



Re: [Qemu-devel] Connecting virtio-9p-pci to a remote 9p server

2012-10-16 Thread Eric Van Hensbergen
On Mon, Oct 15, 2012 at 6:36 AM, Chris Webb ch...@arachsys.com wrote:

 Whilst we can mount the shares on each host and then use qemu's 9p
 passthrough/proxy support to access the mountpoint, going via the host
 kernel and vfs like this feels quite inefficient. We would be converting
 back and forth between vfs and 9p models several times needlessly.

 Instead, I'm wondering about the feasibility of connecting the 9p stream
 directly from qemu's virtio-9p-pci device to a socket opened on a
 9p-over-TCP export from the fileserver. Am I right in thinking that qemu's
 -fsdev proxy gives me access to a file descriptor attached to the 9p stream
 to/from the guest, or is the protocol between virtfs-proxy-helper and qemu
 re-encoded within qemu first?


A passthrough makes perfect sense, a couple summers ago we had an
Extreme Blue team working on using 9p for a cloud hosting environment
-- while they were primarily working on gatewaying through a host
operating system we also discussed doing 9p passthrough (primarily for
test, but my other motive was looking at direct 9p to back-end server
connections).  I'm copying that team on this message to see if they
have any additional thoughts.  On one end you lose a bit in that you
are no longer taking advantage of the host file system cache, which
can be useful, particularly if there is any consolidation among the
different guests -- but as you point out, you eliminate several copies
and transitions through kernel space by just going direct.

 Secondly, assuming I can somehow get at the 9p streams directly (either with
 an existing option or by adding a new one), I'd like to restrict guests to
 the relevant user's subdirectory on the fileserver, and have been thinking
 about doing this by filtering the 9p stream to restrict 'attach' operations.

There's all sorts of magic you can work here; almost all of it can be
implemented on the file server, depending on how much you trust your guests
and the intermediate host.  I'm by no means a security expert, but
there are three relatively easy paths:
a) start a server instance per user on the file server ahead of time.
While this is a little obnoxious, it's by far the quickest path -- you
control the port that the virtual images connect to, so as long as the
user is only able to connect to his own file server, you are good.
There are all the issues involved with uid mapping/etc. on the server
side, but there are multiple ways of securing the file server to
constrain the user to his/her own hierarchy.  Some of the uid mapping
and other security techniques in the qemu server could probably be
extracted to their own stand-alone server relatively easily.
b) have qemu snoop and validate attach operations -- this may be what
you were suggesting.  Essentially you can hardcode the attach to only
validate from a single user (or restrict it to a set of users).  An
alternative is to overload protocol semantics and have the initial
version & attach (which could be sent by qemu) carry some significance
with the server -- hardcoding the protocol parameters and the user under
whose authority all subsequent requests fall.  This leaves most of the
implementation details to the server (a rough sketch of the idea follows
this list).
c) you can use the authentication mechanisms within the protocol
(Tauth/Rauth and the afid) to independently authenticate users on the
server.  There are some examples of this in the xcpu code and of
course in the original Plan 9 server/client/auth system.  This is
probably the most work-intensive option, but it would be protocol and
gateway (in this case qemu) neutral, putting most of the work on the
client and server.
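
To illustrate (b) a little further, here's a rough sketch of the two messages
qemu could send on the server connection before splicing the guest's stream
through.  The layouts are straight from the 9P2000/.L documentation; the
uname/aname/n_uname values are whatever your provisioning decides, and what
the server then does with the guest's own version/attach is deliberately left
as server policy.  Error handling, response reads and bounds checking are
all elided.

#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define P9_TVERSION 100
#define P9_TATTACH  104
#define P9_NOTAG    0xffff
#define P9_NOFID    0xffffffffu

static uint8_t *put_le16(uint8_t *p, uint16_t v)
{
    p[0] = v; p[1] = v >> 8; return p + 2;
}
static uint8_t *put_le32(uint8_t *p, uint32_t v)
{
    p[0] = v; p[1] = v >> 8; p[2] = v >> 16; p[3] = v >> 24; return p + 4;
}
static uint8_t *put_str(uint8_t *p, const char *s)
{
    uint16_t n = strlen(s);
    p = put_le16(p, n);
    memcpy(p, s, n);
    return p + n;
}

/* size[4] Tversion tag[2] msize[4] version[s] */
static ssize_t send_version(int fd, uint32_t msize, const char *version)
{
    uint8_t buf[64], *p = buf + 4;
    *p++ = P9_TVERSION;
    p = put_le16(p, P9_NOTAG);
    p = put_le32(p, msize);
    p = put_str(p, version);            /* e.g. "9P2000.L" */
    put_le32(buf, (uint32_t)(p - buf));
    return write(fd, buf, p - buf);
}

/* size[4] Tattach tag[2] fid[4] afid[4] uname[s] aname[s] n_uname[4] */
static ssize_t send_attach(int fd, uint32_t fid, const char *uname,
                           const char *aname, uint32_t n_uname)
{
    uint8_t buf[256], *p = buf + 4;     /* assumes short uname/aname */
    *p++ = P9_TATTACH;
    p = put_le16(p, 0);                 /* tag */
    p = put_le32(p, fid);
    p = put_le32(p, P9_NOFID);          /* no auth fid in this sketch */
    p = put_str(p, uname);
    p = put_str(p, aname);
    p = put_le32(p, n_uname);
    put_le32(buf, (uint32_t)(p - buf));
    return write(fd, buf, p - buf);
}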

 Fortunately, 9p uses client-chosen fids rather than server filesystem inode
 numbers which would immediately scupper any simple attempts to implement a
 secure chroot proxy of this kind. Looking at the 9p2000.L protocol, it
 doesn't look obviously difficult, but I've not really worked with 9p before,
 and could well be missing security complications. (I'm not sure whether
 there's risk of symlinks being interpreted server side rather than client
 side, for example.)

The embedded server in qemu should have all the bits you need to
restrict the hierarchy; you can alternatively use private namespaces and/or
chroot games to further guarantee isolation -- but since the qemu
server also deals with the uid mapping issues, it is probably the better
starting point, as the team that built it was looking at doing
something very similar to what you want to do (albeit by proxying
a distributed file system mounted on the host).
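
For the chroot flavour of that, a stand-alone per-user server could confine
itself before serving anything.  A minimal sketch, assuming the process
starts with enough privilege and that export_path/uid/gid come from your
provisioning:

#include <grp.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Confine the server to the user's export directory and drop privileges
 * before it touches any client data. */
static void confine(const char *export_path, uid_t uid, gid_t gid)
{
    if (chroot(export_path) != 0 || chdir("/") != 0) {
        perror("chroot");
        exit(1);
    }
    /* Drop supplementary groups first, then gid, then uid -- order matters. */
    if (setgroups(0, NULL) != 0 || setgid(gid) != 0 || setuid(uid) != 0) {
        perror("drop privileges");
        exit(1);
    }
}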

Good luck, and feel free to ping me with any 9p questions; I may be
less helpful on any qemu-side implementation details, I'm afraid.

   -eric



[Qemu-devel] Connecting virtio-9p-pci to a remote 9p server

2012-10-15 Thread Chris Webb
We're planning to implement shared filesystems for guests on our virtualized
hosting platform, stored on a central fileserver separate from the hosts.

Whilst we can mount the shares on each host and then use qemu's 9p
passthrough/proxy support to access the mountpoint, going via the host
kernel and vfs like this feels quite inefficient. We would be converting
back and forth between vfs and 9p models several times needlessly.

Instead, I'm wondering about the feasibility of connecting the 9p stream
directly from qemu's virtio-9p-pci device to a socket opened on a
9p-over-TCP export from the fileserver. Am I right in thinking that qemu's
-fsdev proxy gives me access to a file descriptor attached to the 9p stream
to/from the guest, or is the protocol between virtfs-proxy-helper and qemu
re-encoded within qemu first?
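
To make the goal concrete: assuming qemu could be persuaded to hand over the
raw 9p byte stream on a file descriptor (which is exactly the question
above), the glue to the fileserver is nothing more than a byte relay between
that descriptor and a TCP socket, along these lines (a sketch only, with all
error handling and reconnection elided):

#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/* Copy whatever is readable on 'from' to 'to'; false on EOF or error. */
static bool pump(int from, int to)
{
    char buf[65536];
    ssize_t n = read(from, buf, sizeof buf);
    if (n <= 0) return false;
    for (ssize_t off = 0; off < n; ) {
        ssize_t w = write(to, buf + off, n - off);
        if (w <= 0) return false;
        off += w;
    }
    return true;
}

/* Relay the 9p stream between the qemu-side fd and the fileserver socket. */
void relay(int guest_fd, int server_fd)
{
    struct pollfd fds[2] = {
        { .fd = guest_fd,  .events = POLLIN },
        { .fd = server_fd, .events = POLLIN },
    };
    for (;;) {
        if (poll(fds, 2, -1) < 0) return;
        if (fds[0].revents && !pump(guest_fd, server_fd)) return;
        if (fds[1].revents && !pump(server_fd, guest_fd)) return;
    }
}

The attach filtering I describe below would replace the guest-to-server
direction of this loop with something that reassembles whole 9p messages
before forwarding them.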

Secondly, assuming I can somehow get at the 9p streams directly (either with
an existing option or by adding a new one), I'd like to restrict guests to
the relevant user's subdirectory on the fileserver, and have been thinking
about doing this by filtering the 9p stream to restrict 'attach' operations.

Fortunately, 9p uses client-chosen fids rather than server filesystem inode
numbers which would immediately scupper any simple attempts to implement a
secure chroot proxy of this kind. Looking at the 9p2000.L protocol, it
doesn't look obviously difficult, but I've not really worked with 9p before,
and could well be missing security complications. (I'm not sure whether
there's risk of symlinks being interpreted server side rather than client
side, for example.)

I'd also be interested in any more general thoughts on this kind of thing.
If we're going to work on it, it would be nice for us to write something
that would be more widely useful to others rather than just create an
in-house hack.

Cheers,

Chris.



Re: [Qemu-devel] Connecting virtio-9p-pci to a remote 9p server

2012-10-15 Thread Troy Benjegerdes
On Mon, Oct 15, 2012 at 12:36:08PM +0100, Chris Webb wrote:
 We're planning to implement shared filesystems for guests on our virtualized
 hosting platform, stored on a central fileserver separate from the hosts.
 
 Whilst we can mount the shares on each host and then use qemu's 9p
 passthrough/proxy support to access the mountpoint, going via the host
 kernel and vfs like this feels quite inefficient. We would be converting
 back and forth between vfs and 9p models several times needlessly.
 
 Instead, I'm wondering about the feasibility of connecting the 9p stream
 directly from qemu's virtio-9p-pci device to a socket opened on a
 9p-over-TCP export from the fileserver. Am I right in thinking that qemu's
 -fsdev proxy gives me access to a file descriptor attached to the 9p stream
 to/from the guest, or is the protocol between virtfs-proxy-helper and qemu
 re-encoded within qemu first?
 
 Secondly, assuming I can somehow get at the 9p streams directly (either with
 an existing option or by adding a new one), I'd like to restrict guests to
 the relevant user's subdirectory on the fileserver, and have been thinking
 about doing this by filtering the 9p stream to restrict 'attach' operations.
 
 Fortunately, 9p uses client-chosen fids rather than server filesystem inode
 numbers which would immediately scupper any simple attempts to implement a
 secure chroot proxy of this kind. Looking at the 9p2000.L protocol, it
 doesn't look obviously difficult, but I've not really worked with 9p before,
 and could well be missing security complications. (I'm not sure whether
 there's risk of symlinks being interpreted server side rather than client
 side, for example.)
 
 I'd also be interested in any more general thoughts on this kind of thing.
 If we're going to work on it, it would be nice for us to write something
 that would be more widely useful to others rather than just create an
 in-house hack.
 
 Cheers,
 
 Chris.
 

If scalability and security are long-term goals, I'd suggest you take a look
at OpenAFS (openafs.org). There are other complications, in that you start
getting into stuff like how to authenticate your users to the filesystem, and
imho it's a PITA to switch and get used to this model, but once you do, you
don't have to worry about whether you've got some complicated (and one-off)
security filtering configured correctly.

I've been playing with booting Debian kernels & initrds directly from AFS as
the root filesystem ( http://bitspjoule.org/hg/initramfs-tools ). What's nice
(and also a PITA) is that normally the VM client cannot modify the root
filesystem, so I know there's no magic configuration on some VM disk image,
but if I wanted to make a change to multiple VMs, I could authenticate to
AFS as administrator on one of the nodes, make the change, and then restart
the daemons on all the other VMs to pick up the change.

The other (potential) advantage of AFS in a virtualized environment is
client-side caching: instead of having a VM disk image for the OS, and
worrying about backing it up, you just use that disk image as a client-side
cache local to the VM host machine. The authoritative data is stored on the
AFS server, so if you have it cached locally you never have to hit the
network, and if the disk the cache is on dies, you just restart the VM on a
different disk. (If you wanted to go overboard, you could modify the
client-side caching code to go back to the server if the cache gets a read
error.)

I can't say that AFS has really *solved* all the hard problems here, but
there is at least some history of how to deal with them effectively. With
what you are describing with 9p in a production environment, I think you'll
end up re-discovering all the hard problems, inventing new 9p-specific ways
of dealing with them, and finishing up with the same complexity as AFS.

- Troy