Re: [Qemu-devel] Connecting virtio-9p-pci to a remote 9p server
Eric Van Hensbergen eri...@gmail.com writes:

> A passthrough makes perfect sense. A couple of summers ago we had an
> Extreme Blue team working on using 9p for a cloud hosting environment --
> while they were primarily working on gatewaying through a host operating
> system, we also discussed doing 9p passthrough (primarily for test, but
> my other motive was looking at direct 9p to back-end server
> connections). I'm copying that team on this message to see if they have
> any additional thoughts. On one end you lose a bit in that you are no
> longer taking advantage of the host file system cache, which can be
> useful, particularly if there is any consolidation among the different
> guests -- but as you point out, you eliminate several copies and
> transitions through kernel space by just going direct.

[Sorry for the extremely slow follow-up here. I got caught up bug squashing!]

Yes, that was my feeling. It also allows things like mounting a filesystem from one VM that's exported by another on the same host; doing this via the host vfs would risk deadlock under memory pressure.

I would still be very interested to hear any thoughts from your team on the best way to get access to the 9p streams from qemu directly, if they did any work in this area. If we're going to fund development work, I'm keen to produce something as general-purpose and as widely applicable for other virtio-9p users as possible, rather than just a local hack for us.

> b) have qemu snoop and validate attach operations -- this may be what
> you were suggesting. Essentially you can hardcode the attach to only
> validate from a single user (or restrict it to a set of users). An
> alternative is to overload protocol semantics and have the initial
> version/attach (which could be sent by qemu) carry some significance
> with the server -- hardcoding the protocol parameters and the user
> under whose authority all subsequent requests fall.
> This leaves much of the implementation detail to the server.

Yes, you're right, this was what I had in mind. However, I want to be able to boot linux kernels with these filesystems as rootfs, so things that involve auth and the like aren't ideal. I'd prefer not to modify the server either.

My plan was to filter the attach to allow only a specific path (or a set of specific paths), which I can specify on the qemu command line. This wouldn't require any server modifications, and would allow me to restrict the guest to the right mountpoint(s) exported by the 9p server. Does that sound sane?

Cheers, Chris.
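The attach filtering being proposed can be sketched in a few lines. This is a hypothetical illustration rather than qemu code: it assumes whole 9P2000.L messages arrive as complete byte buffers, and the helper names (`attach_allowed`, `make_tattach`) are invented for the example. A Tattach (type 104) is laid out as size[4] type[1] tag[2] fid[4] afid[4] uname[s] aname[s] n_uname[4], all little-endian, with strings encoded as a 2-byte length followed by UTF-8 bytes.

```python
import struct

TATTACH = 104  # 9P Tattach message type

def parse_string(buf, off):
    """Decode a 9P string: 2-byte little-endian length, then UTF-8 bytes."""
    (n,) = struct.unpack_from('<H', buf, off)
    return buf[off + 2:off + 2 + n].decode('utf-8'), off + 2 + n

def attach_allowed(msg, allowed_paths):
    """Pass through any non-Tattach message; for a Tattach, permit it only
    if its aname (the requested export path) is on the allow-list."""
    size, mtype, tag = struct.unpack_from('<IBH', msg, 0)
    if mtype != TATTACH:
        return True
    off = 7 + 4 + 4                       # skip header, fid, afid
    uname, off = parse_string(msg, off)
    aname, off = parse_string(msg, off)
    return aname in allowed_paths

def make_tattach(fid, afid, uname, aname, n_uname=0xFFFFFFFF):
    """Build a 9P2000.L Tattach message (test helper, invented here)."""
    def s(x):
        b = x.encode('utf-8')
        return struct.pack('<H', len(b)) + b
    body = (struct.pack('<BHII', TATTACH, 1, fid, afid)
            + s(uname) + s(aname) + struct.pack('<I', n_uname))
    return struct.pack('<I', 4 + len(body)) + body

if __name__ == '__main__':
    msg = make_tattach(0, 0xFFFFFFFF, 'guest', '/exports/customer1')
    print(attach_allowed(msg, {'/exports/customer1'}))  # True
    print(attach_allowed(msg, {'/exports/other'}))      # False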
Re: [Qemu-devel] Connecting virtio-9p-pci to a remote 9p server
On Mon, Oct 15, 2012 at 6:36 AM, Chris Webb ch...@arachsys.com wrote:

> Whilst we can mount the shares on each host and then use qemu's 9p
> passthrough/proxy support to access the mountpoint, going via the host
> kernel and vfs like this feels quite inefficient. We would be
> converting back and forth between vfs and 9p models several times
> needlessly. Instead, I'm wondering about the feasibility of connecting
> the 9p stream directly from qemu's virtio-9p-pci device to a socket
> opened on a 9p-over-TCP export from the fileserver. Am I right in
> thinking that qemu's -fsdev proxy gives me access to a file descriptor
> attached to the 9p stream to/from the guest, or is the protocol between
> virtfs-proxy-helper and qemu re-encoded within qemu first?

A passthrough makes perfect sense. A couple of summers ago we had an Extreme Blue team working on using 9p for a cloud hosting environment -- while they were primarily working on gatewaying through a host operating system, we also discussed doing 9p passthrough (primarily for test, but my other motive was looking at direct 9p to back-end server connections). I'm copying that team on this message to see if they have any additional thoughts. On one end you lose a bit in that you are no longer taking advantage of the host file system cache, which can be useful, particularly if there is any consolidation among the different guests -- but as you point out, you eliminate several copies and transitions through kernel space by just going direct.

> Secondly, assuming I can somehow get at the 9p streams directly (either
> with an existing option or by adding a new one), I'd like to restrict
> guests to the relevant user's subdirectory on the fileserver, and have
> been thinking about doing this by filtering the 9p stream to restrict
> 'attach' operations.

There's all sorts of magic you can work here; almost all of it can be implemented on the file server, depending on how much you trust your guests and the intermediate host.
I'm by no means a security expert, but there are three relatively easy paths:

a) Start a server instance per user on the file server ahead of time. While this is a little obnoxious, it's by far the quickest path -- you control the port that the virtual images connect to, so as long as the user is only able to connect to his own file server, you are good. There are all the issues involved with uid mapping etc. on the server side, but there are multiple ways of securing the file server to constrain the user to his/her own hierarchy. Some of the uid mapping and other security techniques in the qemu server could probably be extracted into their own stand-alone server relatively easily.

b) Have qemu snoop and validate attach operations -- this may be what you were suggesting. Essentially you can hardcode the attach to only validate from a single user (or restrict it to a set of users). An alternative is to overload protocol semantics and have the initial version/attach (which could be sent by qemu) carry some significance with the server -- hardcoding the protocol parameters and the user under whose authority all subsequent requests fall. This leaves much of the implementation detail to the server.

c) You can use the authentication mechanisms within the protocol (Tauth/Rauth and the afid) to independently authenticate users on the server. There are some examples of this in the xcpu code, and of course in the original Plan 9 server/client/auth system. This is probably the most work-intensive, but it would be protocol- and gateway-neutral (the gateway in this case being qemu), putting most of the work on the client and server.

> Fortunately, 9p uses client-chosen fids rather than server filesystem
> inode numbers, which would immediately scupper any simple attempts to
> implement a secure chroot proxy of this kind. Looking at the 9p2000.L
> protocol, it doesn't look obviously difficult, but I've not really
> worked with 9p before, and could well be missing security
> complications.
> (I'm not sure whether there's a risk of symlinks being interpreted
> server side rather than client side, for example.)

The embedded server in qemu should have all the bits you need to restrict hierarchy; you can alternatively use private namespace and/or chroot games to further guarantee isolation. But since the qemu server also deals with the uid mapping issues, it might be the better starting point, since the team that built it was looking at doing something very similar to what you want to do (albeit through proxying a host-mounted distributed file system).

Good luck, and feel free to ping me with any 9p questions; I may be less helpful on qemu-side implementation details, I'm afraid.

-eric
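As a plumbing sketch only: a 9p-over-TCP export can be relayed to a local unix socket with socat, assuming the fileserver listens on the registered 9P port (564). The hostname and socket path below are placeholders invented for the example, and qemu would still need a backend that speaks raw 9p on that socket -- which is exactly the missing piece under discussion here.

```shell
# Hypothetical bridge: relay a local unix socket to a remote 9p-over-TCP
# export. fileserver.example.com and the socket path are placeholders;
# 564 is the registered 9P port, adjust for your server.
socat UNIX-LISTEN:/run/guest1-9p.sock,fork TCP:fileserver.example.com:564
```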
[Qemu-devel] Connecting virtio-9p-pci to a remote 9p server
We're planning to implement shared filesystems for guests on our virtualized hosting platform, stored on a central fileserver separate from the hosts. Whilst we can mount the shares on each host and then use qemu's 9p passthrough/proxy support to access the mountpoint, going via the host kernel and vfs like this feels quite inefficient: we would be converting back and forth between vfs and 9p models several times needlessly.

Instead, I'm wondering about the feasibility of connecting the 9p stream directly from qemu's virtio-9p-pci device to a socket opened on a 9p-over-TCP export from the fileserver. Am I right in thinking that qemu's -fsdev proxy gives me access to a file descriptor attached to the 9p stream to/from the guest, or is the protocol between virtfs-proxy-helper and qemu re-encoded within qemu first?

Secondly, assuming I can somehow get at the 9p streams directly (either with an existing option or by adding a new one), I'd like to restrict guests to the relevant user's subdirectory on the fileserver, and have been thinking about doing this by filtering the 9p stream to restrict 'attach' operations. Fortunately, 9p uses client-chosen fids rather than server filesystem inode numbers, which would immediately scupper any simple attempts to implement a secure chroot proxy of this kind. Looking at the 9p2000.L protocol, it doesn't look obviously difficult, but I've not really worked with 9p before, and could well be missing security complications. (I'm not sure whether there's a risk of symlinks being interpreted server side rather than client side, for example.)

I'd also be interested in any more general thoughts on this kind of thing. If we're going to work on it, it would be nice for us to write something that would be more widely useful to others rather than just create an in-house hack.

Cheers, Chris.
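For concreteness, the passthrough/proxy setup in question looks roughly like the following (the ids, paths, and socket name are invented for the example). With the local backend, qemu's built-in 9p server performs the file operations itself; with the proxy backend, as far as I can tell, qemu still runs the 9p server and hands the file operations to virtfs-proxy-helper over a private protocol rather than exposing a raw 9p stream.

```shell
# Passthrough via the local backend: qemu's built-in 9p server serves
# the host directory directly (paths and ids are examples).
qemu-system-x86_64 \
  -fsdev local,id=share0,path=/srv/guest1,security_model=mapped-xattr \
  -device virtio-9p-pci,fsdev=share0,mount_tag=share

# Proxy backend: a separate helper process performs the file I/O.
virtfs-proxy-helper -p /srv/guest1 -u root -g root -s /run/guest1.sock &
qemu-system-x86_64 \
  -fsdev proxy,id=share0,socket=/run/guest1.sock \
  -device virtio-9p-pci,fsdev=share0,mount_tag=share

# Inside the guest, the export is mounted over virtio:
mount -t 9p -o trans=virtio,version=9p2000.L share /mnt
```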
Re: [Qemu-devel] Connecting virtio-9p-pci to a remote 9p server
On Mon, Oct 15, 2012 at 12:36:08PM +0100, Chris Webb wrote:

> We're planning to implement shared filesystems for guests on our
> virtualized hosting platform, stored on a central fileserver separate
> from the hosts. Whilst we can mount the shares on each host and then
> use qemu's 9p passthrough/proxy support to access the mountpoint, going
> via the host kernel and vfs like this feels quite inefficient. We would
> be converting back and forth between vfs and 9p models several times
> needlessly.
>
> Instead, I'm wondering about the feasibility of connecting the 9p
> stream directly from qemu's virtio-9p-pci device to a socket opened on
> a 9p-over-TCP export from the fileserver. Am I right in thinking that
> qemu's -fsdev proxy gives me access to a file descriptor attached to
> the 9p stream to/from the guest, or is the protocol between
> virtfs-proxy-helper and qemu re-encoded within qemu first?
>
> Secondly, assuming I can somehow get at the 9p streams directly (either
> with an existing option or by adding a new one), I'd like to restrict
> guests to the relevant user's subdirectory on the fileserver, and have
> been thinking about doing this by filtering the 9p stream to restrict
> 'attach' operations. Fortunately, 9p uses client-chosen fids rather
> than server filesystem inode numbers, which would immediately scupper
> any simple attempts to implement a secure chroot proxy of this kind.
> Looking at the 9p2000.L protocol, it doesn't look obviously difficult,
> but I've not really worked with 9p before, and could well be missing
> security complications. (I'm not sure whether there's a risk of
> symlinks being interpreted server side rather than client side, for
> example.)
>
> I'd also be interested in any more general thoughts on this kind of
> thing. If we're going to work on it, it would be nice for us to write
> something that would be more widely useful to others rather than just
> create an in-house hack.
>
> Cheers, Chris.
If scalability and security are long-term goals, I'd suggest you take a look at OpenAFS (openafs.org). There are other complications, in that you start getting into stuff like how to authenticate your users to the filesystem. Imho it's a PITA to switch and get used to this model, but once you do, you don't have to worry about whether you've got some complicated (and one-off) security filtering configured right.

I've been playing with booting debian kernels and initrds directly from AFS as the root filesystem ( http://bitspjoule.org/hg/initramfs-tools ). What's nice (and also a PITA) is that normally the VM client cannot modify the root filesystem, so I know there's no magic configuration on some VM disk image; but if I wanted to make a change to multiple VMs, I could authenticate to AFS as administrator on one of the nodes, make the change, and then restart the daemons on all the other VMs to load the new change.

The other (potential) advantage of AFS in a virtualized environment is client-side caching: instead of a VM disk image for the OS, and worrying about backing it up, you just use that disk image as the client-side cache local to the VM host machine. The actual authoritative data is stored on the AFS server, so if you have it cached locally, you never have to hit the network, and if the disk the cache is on dies, you just restart the VM on a different disk. (If you wanted to go overboard, you could modify the client-caching code to go back to the server if the cache gets a read error.)

I can't say that AFS has really *solved* all the hard problems here, but there is at least some history of dealing with them effectively. With what you are describing with 9p in a production environment, I think you'll end up re-discovering all the hard problems, having to invent new 9p-specific ways of dealing with them, and you'll end up with the same complexity as AFS.

- Troy