On Friday, November 22, 2013 11:34:41 AM Stefan Hajnoczi wrote:
> IMO this seccomp approach is doomed since QEMU does not practice
> privilege separation. QEMU is monolithic so it's really hard to create
> a meaningful sets of system calls.

I'm a big fan of decomposing QEMU, but based on previous discussions there seems to be a lot of fear from the core QEMU folks around decomposition; enough that I'm not sure it is worth the time and effort to pursue it at this point. While I agree that a decomposed QEMU would be able to make better use of syscall filtering (and LSM/SELinux protection, and ...), I don't believe it means syscall filtering is a complete lost cause with a monolithic QEMU. Any improvement you can make, no matter how small, is still an improvement.

> To avoid breaking stuff you need to be too liberal, defeating the purpose of
> seccomp.

Even if you can only disable a few syscalls you are still better off than you were before. Could it be done better? Of course it could, but that doesn't mean you shouldn't try for some benefit.

> For each QEMU command-line there may be a different set of syscalls that
> should be allowed/forbidden.

I'm not sure if you missed it or not, but I had an email exchange with Eduardo on this list about making the syscall whitelist a bit more "intelligent" and dependent on what functionality was enabled for a given QEMU instance. This should help a bit with the problems you are describing.

> The existing approach clearly doesn't support the full range of options
> that users specify on the command-line.

Bugs. They will get fixed in time with more testing/debugging. Eduardo is working on improving the testing and RH's QA folks are working hard to shake out the bugs too. I just posted another bug fix patch to the whitelist a few days ago.

> So I guess the options are:
>
> 1. Don't make it the default since it breaks stuff but use it for very
> specific scenarios (e.g. libvirt use cases that have been well tested).

In my opinion, it was probably a bit premature to enable it by default, but at some point in the future I think we do need to do this.

> 2. Provide a kind of syscall set for various QEMU options and apply the
> union of them at launch.
> This still seems fragile but in theory it
> could work.

This is what I was discussing above. I think this is likely the next big improvement.

--
paul moore
security and virtualization @ redhat