Re: [Qemu-devel] [libvirt] [RFC 0/5] block: File descriptor passing using -open-hook-fd

2012-07-09 Thread Anthony Liguori

On 05/17/2012 09:14 AM, Eric Blake wrote:

On 05/17/2012 07:42 AM, Stefan Hajnoczi wrote:



The -open-hook-fd approach allows QEMU to support file descriptor passing
without changing -drive.  It also supports snapshot_blkdev and other commands

By the way, how will it support them?


The problem with snapshot_blkdev is that closing a file and opening a
new file cannot be done by the QEMU process when an SELinux policy is in
place to prevent opening files.


snapshot_blkdev can take an fd:name instead of a /path/to/file for the
file to open, in which case libvirt can pass in the named fd _prior_ to
the snapshot_blkdev using the 'getfd' monitor command.



The -open-hook-fd approach works even when the QEMU process is not
allowed to open files since file descriptor passing over a UNIX domain
socket is used to open files on behalf of QEMU.
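
For readers unfamiliar with the mechanism, descriptor passing over a UNIX domain socket can be sketched as follows. This is a minimal illustration, not the protocol from the patches: the helper names (send_open_reply, recv_open_reply, demo) are made up here, and it relies on Python 3.9+ for socket.send_fds()/socket.recv_fds(), which wrap SCM_RIGHTS.

```python
import os
import socket
import tempfile


def send_open_reply(sock, fd):
    """Privileged side: hand an already-open descriptor to the peer."""
    # At least one byte of ordinary data must travel with the SCM_RIGHTS message.
    socket.send_fds(sock, [b"\0"], [fd])


def recv_open_reply(sock):
    """Restricted side: receive a descriptor it did not open itself."""
    _data, fds, _flags, _addr = socket.recv_fds(sock, 1, 1)
    if not fds:
        raise OSError("peer closed the socket without sending an fd")
    return fds[0]


def demo():
    # A socketpair stands in for the hook socket (hypothetical setup).
    helper, restricted = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with tempfile.NamedTemporaryFile() as image:
        image.write(b"disk-image-bytes")
        image.flush()
        fd = os.open(image.name, os.O_RDONLY)
        try:
            send_open_reply(helper, fd)
        finally:
            os.close(fd)  # the receiver gets its own duplicate
        received = recv_open_reply(restricted)
        try:
            return os.read(received, 16)
        finally:
            os.close(received)
            helper.close()
            restricted.close()
```

The receiving process never calls open() on the path, which is the property that matters when policy forbids it from opening files.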


The -open-hook-fd approach would indeed allow snapshot_blkdev to ask
for the fd after the fact, but it's much more painful.  Consider a case
with a two-disk snapshot:

with the fd:name approach, the sequence is:

libvirt calls getfd:name1 over normal monitor
qemu responds
libvirt calls getfd:name2 over normal monitor
qemu responds
libvirt calls transaction around blockdev-snapshot-sync over normal
monitor, using fd:name1 and fd:name2
qemu responds
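
As a rough sketch, the fd:name sequence above might look like this on the monitor. The 'getfd' and 'transaction' commands exist in QMP, but accepting "fd:name" as the snapshot-file value is the proposal under discussion, not confirmed behavior, and the device names are placeholders. The descriptor itself would have to accompany each 'getfd' via SCM_RIGHTS, which is not shown.

```python
import json


def getfd_command(fdname):
    # The fd itself travels as ancillary data alongside this message.
    return {"execute": "getfd", "arguments": {"fdname": fdname}}


def snapshot_transaction(disks):
    # "fd:<name>" as the snapshot target is an assumption taken from the thread.
    actions = [
        {
            "type": "blockdev-snapshot-sync",
            "data": {"device": device, "snapshot-file": "fd:" + fdname},
        }
        for device, fdname in disks
    ]
    return {"execute": "transaction", "arguments": {"actions": actions}}


def monitor_sequence(disks):
    """Return the JSON messages in the order libvirt would send them."""
    messages = [getfd_command(fdname) for _device, fdname in disks]
    messages.append(snapshot_transaction(disks))
    return [json.dumps(message) for message in messages]
```

For two disks this yields three messages: two 'getfd' calls, then one 'transaction'.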

but with -open-hook-fd, the approach would be:

libvirt calls transaction
qemu calls open(file1) over hook
libvirt responds
qemu calls open(file2) over hook
libvirt responds
qemu responds to the original transaction

The 'transaction' operation is thus blocked by the time it takes to do
two intermediate opens over a second channel, which kind of defeats the
purpose of making the transaction take effect with minimal guest
downtime.


How are you defining "guest down time"?

It's important to note that code running in QEMU does not equate to guest 
visible down time unless QEMU does an explicit vm_stop() which is not happening 
here.


Instead, a VCPU may become blocked *if* it attempts to acquire qemu_mutex while 
QEMU is holding it.


If your concern is qemu_mutex being held while waiting for libvirt, it would be 
fairly easy to implement a qemu_open_async() that allowed dropping back to the 
main loop and then called a callback when the open completes.


It would be pretty trivial to convert qmp_transaction to use such a command.
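
To make the idea concrete, here is a small sketch of the pattern in Python rather than QEMU's C: the blocking open runs on a worker thread, and the callback fires later from the main loop. MainLoop, open_async and run_once are illustrative names only; qemu_open_async() is a suggestion in this thread, not existing code.

```python
import os
import queue
from concurrent.futures import ThreadPoolExecutor


class MainLoop:
    """Toy main loop: completions are queued and dispatched by run_once()."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=2)
        self._completions = queue.Queue()

    def open_async(self, path, flags, callback):
        def worker():
            # The potentially slow open happens off the main loop.
            try:
                result = os.open(path, flags)
            except OSError as err:
                result = err
            self._completions.put((callback, result))

        self._pool.submit(worker)

    def run_once(self, timeout=5.0):
        # The callback runs in the caller's thread, i.e. the main loop.
        callback, result = self._completions.get(timeout=timeout)
        callback(result)

    def close(self):
        self._pool.shutdown(wait=True)
```

The caller receives either a descriptor or the OSError, so error handling stays in the main loop as well.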

But this is all speculative.  There's no reason to believe that an RPC would 
have a noticeable guest-visible latency unless you assume there's lock contention. 
I would strongly suspect that the bdrv_flush() is going to be a much greater 
source of lock contention than the RPC would be.  An RPC is only bound by 
scheduler latency whereas synchronous disk I/O is bound by spinning a platter.



And libvirt code becomes a lot trickier to deal with the fact
that two channels are in use, and that the channel that issued the
'transaction' command must block while the other channel for handling
hooks must be responsive.


All libvirt needs to do is listen on a socket and delegate access according to a 
white list.  Whatever is providing fd's needs to have no knowledge of anything 
other than what the guest is allowed to access which shouldn't depend on an 
executing command.


Regards,

Anthony Liguori


I'm really disliking the hook-fd approach, when a better solution is to
make use of 'getfd' in advance of any operation that will need to open
new fds.






Re: [Qemu-devel] [libvirt] [RFC 0/5] block: File descriptor passing using -open-hook-fd

2012-07-09 Thread Eric Blake
On 07/09/2012 02:00 PM, Anthony Liguori wrote:

>> with the fd:name approach, the sequence is:
>>
>> libvirt calls getfd:name1 over normal monitor
>> qemu responds
>> libvirt calls getfd:name2 over normal monitor
>> qemu responds
>> libvirt calls transaction around blockdev-snapshot-sync over normal
>> monitor, using fd:name1 and fd:name2
>> qemu responds

This general layout is true whether we rewrite all commands to
understand fd:nnn (proposal 1) or whether we add new magic parsing
(/dev/fd/nnn of proposal 3, or even /dev/fdset/nnn of proposal 5), all
as called out in these messages:

https://lists.gnu.org/archive/html/qemu-devel/2012-07/msg00227.html
https://lists.gnu.org/archive/html/qemu-devel/2012-07/msg01098.html

>>
>> but with -open-hook-fd, the approach would be:
>>
>> libvirt calls transaction
>> qemu calls open(file1) over hook
>> libvirt responds
>> qemu calls open(file2) over hook
>> libvirt responds
>> qemu responds to the original transaction

whereas this approach is quite different in semantics, but may indeed be
easier for qemu to implement, at the expense of some more complexity on
the part of libvirt.

At the high level, I think both approaches have one thing in common: by
refactoring all qemu code to go through qemu_open(), we can then
implement our desired complexity (whether fd:nnn, /dev/fd/nnn,
/dev/fdset/nnn, or some other magic name parsing; or whether it is an
rpc call over a second socket in parallel to the monitor socket) in just
one location.  Likewise, both approaches have to deal with libvirtd
restarts (magic name parsing by changing an 'inuse' flag when the
monitor detects EOF; rpc passing by failing a qemu_open() when the rpc
socket detects EOF).

>>
>> The 'transaction' operation is thus blocked by the time it takes to do
>> two intermediate opens over a second channel, which kind of defeats the
>> purpose of making the transaction take effect with minimal guest
>> downtime.
> 
> How are you defining "guest down time"?
> 
> It's important to note that code running in QEMU does not equate to
> guest visible down time unless QEMU does an explicit vm_stop() which is
> not happening here.
> 
> Instead, a VCPU may become blocked *if* it attempts to acquire qemu_mutex
> while QEMU is holding it.
> 
> If your concern is qemu_mutex being held while waiting for libvirt, it
> would be fairly easy to implement a qemu_open_async() that allowed
> dropping back to the main loop and then called a callback when the open
> completes.
> 
> It would be pretty trivial to convert qmp_transaction to use such a
> command.

In other words, remembering that transactions are divided into phases:

phase 1 - prepare: obtain all needed fds (whether by pre-opening them
via 'pass-fd' or other new 'getfd' relative, or whether by RPC calls);
no guest downtime, and with cleanup that avoids any leaks on any failures
phase 2 - commit: flush all devices and actually make the changes in
qemu state to use the fds obtained in phase 1

and where the guest downtime (if any) is more likely due to flushing
changes in phase 2
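
The two phases can be sketched like this. It is only an outline under assumptions: open_fn stands for whichever mechanism supplies descriptors (pre-passed fds or an RPC), commit_fn stands for the state switch in qemu, and run_transaction is a made-up name.

```python
import os


def run_transaction(paths, open_fn, commit_fn):
    """Obtain every fd first; switch state only if all of them arrived."""
    fds = []
    # Phase 1 - prepare: may be slow, but nothing has changed yet.
    try:
        for path in paths:
            fds.append(open_fn(path))
    except OSError:
        # Cleanup on failure so no descriptor leaks.
        for fd in fds:
            os.close(fd)
        raise
    # Phase 2 - commit: the flush and the switch to the new fds happen here.
    commit_fn(fds)
    return fds
```

Any waiting on the management tool is confined to phase 1, before the flush in phase 2.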

> 
> But this is all speculative.  There's no reason to believe that an RPC
> would have a noticeable guest-visible latency unless you assume there's
> lock contention.  I would strongly suspect that the bdrv_flush() is going
> to be a much greater source of lock contention than the RPC would be.
> An RPC is only bound by scheduler latency whereas synchronous disk I/O
> is bound by spinning a platter.
> 
>> And libvirt code becomes a lot trickier to deal with the fact
>> that two channels are in use, and that the channel that issued the
>> 'transaction' command must block while the other channel for handling
>> hooks must be responsive.
> 
> All libvirt needs to do is listen on a socket and delegate access
> according to a white list.  Whatever is providing fd's needs to have no
> knowledge of anything other than what the guest is allowed to access
> which shouldn't depend on an executing command.

That's not quite accurate.  What the guest is allowed to access should
indeed change depending on the executing command.  That is, if I start a
guest with:

base <- delta

then I only want to permit O_RDONLY access to base but O_RDWR access to
delta.  If I then call 'blockdev-snapshot-sync', I want to change to the
situation:

base <- delta <- snap

and give O_RDWR permissions to snap; it would also be nice if qemu
attempts to reopen delta with O_RDONLY permissions (although from a
trust perspective, libvirt must assume that delta is still O_RDWR
because qemu may have been compromised and lie about the tightening of
permissions); at any rate, depending on SELinux capabilities of the
file, libvirt may be able to enforce no further writes to 'delta' by
toggling a SELinux label (obviously, this should only be done after
'blockdev-snapshot-sync' completes).
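
A sketch of how such a changing policy might be tracked on the libvirt side. The class and method names are hypothetical, and the file names mirror the base/delta/snap example above; real enforcement would also involve SELinux labels, which this does not model.

```python
import os


class GuestAccessTable:
    """Tracks which access mode each image of one guest may be opened with."""

    def __init__(self, read_only, read_write):
        self._modes = {path: os.O_RDONLY for path in read_only}
        self._modes.update({path: os.O_RDWR for path in read_write})

    def permits(self, path, flags):
        allowed = self._modes.get(path)
        if allowed is None:
            return False  # not part of this guest's chain
        wants_write = (flags & os.O_ACCMODE) != os.O_RDONLY
        return allowed == os.O_RDWR or not wants_write

    def begin_snapshot(self, new_top):
        # Grant access to the new file before the command is issued.
        self._modes[new_top] = os.O_RDWR

    def finish_snapshot(self, old_top):
        # Tighten the old active layer only after the command completes.
        self._modes[old_top] = os.O_RDONLY
```

Between begin_snapshot() and finish_snapshot() both delta and snap are writable, matching the point that delta must be assumed O_RDWR until the command completes.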

On the other hand, the user could decide to do a 'block-commit', to
squash things into:

base

where base is now O_RDWR.  But libvirt doesn't w

Re: [Qemu-devel] [libvirt] [RFC 0/5] block: File descriptor passing using -open-hook-fd

2012-07-09 Thread Anthony Liguori

On 07/09/2012 03:29 PM, Eric Blake wrote:

On 07/09/2012 02:00 PM, Anthony Liguori wrote:


with the fd:name approach, the sequence is:

libvirt calls getfd:name1 over normal monitor
qemu responds
libvirt calls getfd:name2 over normal monitor
qemu responds
libvirt calls transaction around blockdev-snapshot-sync over normal
monitor, using fd:name1 and fd:name2
qemu responds


This general layout is true whether we rewrite all commands to
understand fd:nnn (proposal 1) or whether we add new magic parsing
(/dev/fd/nnn of proposal 3, or even /dev/fdset/nnn of proposal 5), all
as called out in these messages:

https://lists.gnu.org/archive/html/qemu-devel/2012-07/msg00227.html
https://lists.gnu.org/archive/html/qemu-devel/2012-07/msg01098.html



but with -open-hook-fd, the approach would be:

libvirt calls transaction
qemu calls open(file1) over hook
libvirt responds
qemu calls open(file2) over hook
libvirt responds
qemu responds to the original transaction


whereas this approach is quite different in semantics, but may indeed be
easier for qemu to implement, at the expense of some more complexity on
the part of libvirt.

At the high level, I think both approaches have one thing in common: by
refactoring all qemu code to go through qemu_open(), we can then
implement our desired complexity (whether fd:nnn, /dev/fd/nnn,
/dev/fdset/nnn, or some other magic name parsing; or whether it is an
rpc call over a second socket in parallel to the monitor socket) in just
one location.  Likewise, both approaches have to deal with libvirtd
restarts (magic name parsing by changing an 'inuse' flag when the
monitor detects EOF; rpc passing by failing a qemu_open() when the rpc
socket detects EOF).


Ack.





The 'transaction' operation is thus blocked by the time it takes to do
two intermediate opens over a second channel, which kind of defeats the
purpose of making the transaction take effect with minimal guest
downtime.


How are you defining "guest down time"?

It's important to note that code running in QEMU does not equate to
guest visible down time unless QEMU does an explicit vm_stop() which is
not happening here.

Instead, a VCPU may become blocked *if* it attempts to acquire qemu_mutex
while QEMU is holding it.

If your concern is qemu_mutex being held while waiting for libvirt, it
would be fairly easy to implement a qemu_open_async() that allowed
dropping back to the main loop and then called a callback when the open
completes.

It would be pretty trivial to convert qmp_transaction to use such a
command.


In other words, remembering that transactions are divided into phases:

phase 1 - prepare: obtain all needed fds (whether by pre-opening them
via 'pass-fd' or other new 'getfd' relative, or whether by RPC calls);
no guest downtime, and with cleanup that avoids any leaks on any failures
phase 2 - commit: flush all devices and actually make the changes in
qemu state to use the fds obtained in phase 1

and where the guest downtime (if any) is more likely due to flushing
changes in phase 2


Not quite.  A synchronous flush can cause lock contention.  We need to separate 
out the problem of lock contention from guest down time.


Also, there's no obvious need to move the flushes before opens.  The main issue 
is that we use qemu_mutex to effectively create a write queue.


You can imagine a simple write queueing mechanism that would obviate the need 
for this such that we could flush, queue upcoming writes, and drop 
qemu_mutex to sleep waiting for libvirt to send us our fds.



But this is all speculative.  There's no reason to believe that an RPC
would have a noticeable guest-visible latency unless you assume there's
lock contention.  I would strongly suspect that the bdrv_flush() is going
to be a much greater source of lock contention than the RPC would be.
An RPC is only bound by scheduler latency whereas synchronous disk I/O
is bound by spinning a platter.


And libvirt code becomes a lot trickier to deal with the fact
that two channels are in use, and that the channel that issued the
'transaction' command must block while the other channel for handling
hooks must be responsive.


All libvirt needs to do is listen on a socket and delegate access
according to a white list.  Whatever is providing fd's needs to have no
knowledge of anything other than what the guest is allowed to access
which shouldn't depend on an executing command.


That's not quite accurate.  What the guest is allowed to access should
indeed change depending on the executing command.  That is, if I start a
guest with:


I should have spoken more clearly.  libvirt may change the white list for various 
reasons dynamically.  But there shouldn't be a direct dependency on whatever is 
serving up fd's and whatever is changing the white list.


Basically, you just need a shared hash table for each guest.  It should be quite 
simple.
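
One way to picture the decoupling: whatever edits the white list and whatever serves fd's touch the same per-guest table and nothing else. The sketch below is only an illustration; GuestWhitelist and its methods are invented names, and the "ro"/"rw" strings are placeholders for whatever the real entries would hold.

```python
import threading


class GuestWhitelist:
    """Per-guest table shared by the policy side and the fd-serving side."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}

    def allow(self, path, mode):
        # Called by whatever changes the white list (policy side).
        with self._lock:
            self._entries[path] = mode

    def revoke(self, path):
        with self._lock:
            self._entries.pop(path, None)

    def lookup(self, path):
        # Called by whatever serves fd's; it needs no other knowledge.
        with self._lock:
            return self._entries.get(path)
```

The fd server only ever calls lookup(), so it never needs to know which monitor command is in flight.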



Maybe the only reason that I'm still leaning towards a 'pass-fd'
solution instead of a hook fd solution is that

Re: [Qemu-devel] [libvirt] [RFC 0/5] block: File descriptor passing using -open-hook-fd

2012-05-01 Thread Eric Blake
On 05/01/2012 02:25 PM, Anthony Liguori wrote:
> Thanks for sending this out Stefan.

Indeed.


>> This series adds the -open-hook-fd command-line option.  Whenever QEMU
>> needs to open an image file it sends a request over the given UNIX
>> domain socket.  The response includes the file descriptor or an errno
>> on failure.  Please see the patches for details on the protocol.
>>
>> The -open-hook-fd approach allows QEMU to support file descriptor passing
>> without changing -drive.  It also supports snapshot_blkdev and other
>> commands that re-open image files.
>>
>> Anthony Liguori wrote most of these patches.  I added a demo
>> -open-hook-fd server and added some small fixes.  Since Anthony is
>> traveling right now I'm sending the RFC for discussion.
> 
> What I like about this approach is that it's useful outside the block
> layer and is conceptually simple from a QEMU PoV.  We simply delegate
> open() to libvirt and let libvirt enforce whatever rules it wants.
> 
> This is not meant to be an alternative to blockdev, but even with
> blockdev, I think we still want to use a mechanism like this.

The overall series looks like it would be rather interesting.  What sort
of timing restrictions are there?  For example, the proposed
'drive-reopen' command (probably now delegated to qemu 1.2) would mean
that qemu would be calling back into libvirt in order to do the reopen.
 If libvirt takes its time in passing back an open fd, is it going to
starve qemu from answering unrelated monitor commands in the meantime?
I definitely want to make sure we avoid deadlock where libvirt is
waiting on a monitor command, but the monitor command is waiting on
libvirt to pass an fd.

Is this also an opportunity to request whether a particular fd must be
seekable vs. acceptable as a one-pass read or write, perhaps by whether
the command is 1 (seekable open) or 2 (one-pass open)?  For example,
migration is one-pass (and therefore libvirt passes a pipe which is
hooked up to a helper app that uses O_DIRECT), while block devices must
be seekable.
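
The distinction can be checked directly on a descriptor, as in the sketch below: lseek() fails on a pipe but succeeds on a regular file. The function names are illustrative only, and nothing here is part of the proposed protocol.

```python
import os
import tempfile


def is_seekable(fd):
    """Return True if the descriptor supports lseek(), False for pipes etc."""
    try:
        os.lseek(fd, 0, os.SEEK_CUR)
    except OSError:
        return False
    return True


def demo():
    read_end, write_end = os.pipe()
    try:
        with tempfile.TemporaryFile() as regular:
            return is_seekable(regular.fileno()), is_seekable(read_end)
    finally:
        os.close(read_end)
        os.close(write_end)
```

A receiver that needs a seekable fd could run such a check on whatever it is handed instead of trusting the sender.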

-- 
Eric Blake   ebl...@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-devel] [libvirt] [RFC 0/5] block: File descriptor passing using -open-hook-fd

2012-05-01 Thread Anthony Liguori

On 05/01/2012 03:56 PM, Eric Blake wrote:

On 05/01/2012 02:25 PM, Anthony Liguori wrote:

Thanks for sending this out Stefan.


Indeed.



This series adds the -open-hook-fd command-line option.  Whenever QEMU
needs to open an image file it sends a request over the given UNIX
domain socket.  The response includes the file descriptor or an errno
on failure.  Please see the patches for details on the protocol.

The -open-hook-fd approach allows QEMU to support file descriptor passing
without changing -drive.  It also supports snapshot_blkdev and other
commands that re-open image files.

Anthony Liguori wrote most of these patches.  I added a demo
-open-hook-fd server and added some small fixes.  Since Anthony is
traveling right now I'm sending the RFC for discussion.


What I like about this approach is that it's useful outside the block
layer and is conceptually simple from a QEMU PoV.  We simply delegate
open() to libvirt and let libvirt enforce whatever rules it wants.

This is not meant to be an alternative to blockdev, but even with
blockdev, I think we still want to use a mechanism like this.


The overall series looks like it would be rather interesting.  What sort
of timing restrictions are there?  For example, the proposed
'drive-reopen' command (probably now delegated to qemu 1.2) would mean
that qemu would be calling back into libvirt in order to do the reopen.
  If libvirt takes its time in passing back an open fd, is it going to
starve qemu from answering unrelated monitor commands in the meantime?


s/libvirt/kernel/g and your concerns are equally valid.

Doing open() should never be done in a path that could block things.  There's 
always the possibility that we're on top of NFS and the open could time out.


For something like drive_reopen, we should use an asynchronous open() that 
dispatches the open() in the posix-aio thread pool.


That's part of what's nice about this approach, we could still call file_open() 
in the posix-aio thread pool...



I definitely want to make sure we avoid deadlock where libvirt is
waiting on a monitor command, but the monitor command is waiting on
libvirt to pass an fd.

Is this also an opportunity to request whether a particular fd must be
seekable vs. acceptable as a one-pass read or write, perhaps by whether
the command is 1 (seekable open) or 2 (one-pass open)?


I'm not really sure where the distinction lies...

I want the RPC to behave exactly like open().  So if we're assuming that open() 
of a /dev/ file returns something that is ioctl()'able, then that's what libvirt 
should return.


If we want to sort of do fd-transformation where a special protocol is used for 
things like ioctl, that's fine, but it ought to be a different mechanism (that's 
probably not nearly as generic).



For example,
migration is one-pass (and therefore libvirt passes a pipe which is
hooked up to a helper app that uses O_DIRECT), while block devices must
be seekable.


But migration doesn't involve doing an open().  This is not a replacement for fd 
passing.  This is a replacement for open() to make up for the facts that (1) 
some management tools like libvirt cannot isolate guests with DAC and (2) 
SELinux cannot be used to isolate guests across all file systems.


I would really prefer that the kernel fix this problem for us, but from what I'm 
told, the problem lies in the NFS standards committee so short of forking the 
NFS protocol, there isn't much that the kernel can do.


Regards,

Anthony Liguori








Re: [Qemu-devel] [libvirt] [RFC 0/5] block: File descriptor passing using -open-hook-fd

2012-05-01 Thread Eric Blake
On 05/01/2012 03:53 PM, Anthony Liguori wrote:

>> I think (correct me if I'm wrong) libvirt should be aware of any file
>> that qemu asks it to open. So from a security point of view, libvirt can
>> prevent opening a file if it isn't affiliated with the guest.
> 
> Right, libvirt can maintain a whitelist of files QEMU is allowed to open
> (which it already has because it needs to label these files).

Indeed.

>  The only
> complexity is that it's not a straight strcmp().  The path needs to be
> (carefully) broken into components with '.' and '..' handled
> appropriately.  But this shouldn't be that difficult to do.

Libvirt would probably canonicalize path names, both when sticking them
in the whitelist, and in validating the requests from qemu - agreed that
it's not difficult.
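
A minimal sketch of that canonicalization, assuming the check runs in the process that serves the requests. PathWhitelist and demo are made-up names; os.path.realpath() resolves '.', '..' and symlinks, which goes a little further than the component handling described above.

```python
import os
import tempfile


class PathWhitelist:
    """Compare canonical paths rather than raw strings."""

    def __init__(self, paths):
        self._allowed = {os.path.realpath(path) for path in paths}

    def permits(self, requested):
        # Canonicalize the request the same way as the stored entries.
        return os.path.realpath(requested) in self._allowed


def demo():
    with tempfile.TemporaryDirectory() as root:
        images = os.path.join(root, "images")
        os.mkdir(images)
        image = os.path.join(images, "guest.img")
        open(image, "w").close()
        whitelist = PathWhitelist([image])
        indirect = os.path.join(images, "..", "images", ".", "guest.img")
        escape = os.path.join(images, "..", "other.img")
        return whitelist.permits(indirect), whitelist.permits(escape)
```

Canonicalizing both the stored entries and each request means an indirect spelling of an allowed file matches, while a '..' escape to another file does not.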

More importantly, libvirt needs to start tracking the backing chain of
any qcow2 or qed file as part of the domain XML; and operations like
'block-stream' would update not only the chain, but also the whitelist.
 In the drive-reopen case, this means that libvirt would have to be
careful when to change labeling - provide access to the new files before
drive-reopen, then revoke access to files after drive-reopen completes.
 In other words, having the -open-hook-fd client pass a command to
libvirt at the time it is closing an fd would help libvirt know when
qemu has quit using a file, which might make it easier to revoke SELinux
labels at that time.

-- 
Eric Blake   ebl...@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-devel] [libvirt] [RFC 0/5] block: File descriptor passing using -open-hook-fd

2012-05-01 Thread Anthony Liguori

On 05/01/2012 05:15 PM, Eric Blake wrote:

On 05/01/2012 03:53 PM, Anthony Liguori wrote:


I think (correct me if I'm wrong) libvirt should be aware of any file
that qemu asks it to open. So from a security point of view, libvirt can
prevent opening a file if it isn't affiliated with the guest.


Right, libvirt can maintain a whitelist of files QEMU is allowed to open
(which it already has because it needs to label these files).


Indeed.


The only complexity is that it's not a straight strcmp().  The path needs
to be (carefully) broken into components with '.' and '..' handled
appropriately.  But this shouldn't be that difficult to do.


Libvirt would probably canonicalize path names, both when sticking them
in the whitelist, and in validating the requests from qemu - agreed that
it's not difficult.

More importantly, libvirt needs to start tracking the backing chain of
any qcow2 or qed file as part of the domain XML; and operations like
'block-stream' would update not only the chain, but also the whitelist.
  In the drive-reopen case, this means that libvirt would have to be
careful when to change labeling


Would you give QEMU open access or change the way you label to only allow 
read/write access?  I think the latter is probably the better approach.


So presumably, you'll need to adjust the sVirt policy too...

You'll need to detect if a file is on NFS too and figure out what the default 
label is that was given so you can build the rules correctly.


Regards,

Anthony Liguori



Re: [Qemu-devel] [libvirt] [RFC 0/5] block: File descriptor passing using -open-hook-fd

2012-05-02 Thread Daniel P. Berrange
On Wed, May 02, 2012 at 10:20:17AM +0200, Kevin Wolf wrote:
> Am 01.05.2012 22:25, schrieb Anthony Liguori:
> > Thanks for sending this out Stefan.
> > 
> > On 05/01/2012 10:31 AM, Stefan Hajnoczi wrote:
> >> Libvirt can take advantage of SELinux to restrict the QEMU process and
> >> prevent it from opening files that it should not have access to.  This
> >> improves security because it prevents the attacker from escaping the
> >> QEMU process if they manage to gain control.
> >>
> >> NFS has been a pain point for SELinux because it does not support labels
> >> (which I believe are stored in extended attributes).  In other words,
> >> it's not possible to use SELinux goodness on QEMU when image files are
> >> located on NFS.  Today we have to allow QEMU access to any file on the
> >> NFS export rather than restricting specifically to the image files that
> >> the guest requires.
> >>
> >> File descriptor passing is a solution to this problem and might also
> >> come in handy elsewhere.  Libvirt or another external process chooses
> >> files which QEMU is allowed to access and provides just those file
> >> descriptors - QEMU cannot open the files itself.
> >>
> >> This series adds the -open-hook-fd command-line option.  Whenever QEMU
> >> needs to open an image file it sends a request over the given UNIX
> >> domain socket.  The response includes the file descriptor or an errno
> >> on failure.  Please see the patches for details on the protocol.
> >>
> >> The -open-hook-fd approach allows QEMU to support file descriptor passing
> >> without changing -drive.  It also supports snapshot_blkdev and other
> >> commands that re-open image files.
> >>
> >> Anthony Liguori wrote most of these patches.  I added a demo
> >> -open-hook-fd server and added some small fixes.  Since Anthony is
> >> traveling right now I'm sending the RFC for discussion.
> > 
> > What I like about this approach is that it's useful outside the block
> > layer and is conceptually simple from a QEMU PoV.  We simply delegate
> > open() to libvirt and let libvirt enforce whatever rules it wants.
> > 
> > This is not meant to be an alternative to blockdev, but even with
> > blockdev, I think we still want to use a mechanism like this.
> 
> What does it provide on top?
> 
> This doesn't look like something that I'd like a lot. qemu should be
> able to continue to run no matter what the management tool does, whether
> it responds to RPCs properly or whether it has crashed. You need a
> really good use case for the RPC that cannot be covered otherwise in
> order to justify this.

Indeed, this solution breaks if you stop or restart libvirtd while
QEMU is running.  Restarting libvirt while QEMU is running is something
we must support, since installing RPM updates will restart libvirtd
and we cannot let guests die in this case.

I would much prefer to see us be able to pass FDs in directly alongside
the disk config as we do for netdev TAP/etc, and for QEMU / kernel to be
fixed so that you do not need to re-open FDs on the fly.

Daniel
-- 
|: http://berrange.com  -o-  http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [libvirt] [RFC 0/5] block: File descriptor passing using -open-hook-fd

2012-05-02 Thread Kevin Wolf
Am 02.05.2012 10:53, schrieb Daniel P. Berrange:
> On Wed, May 02, 2012 at 10:20:17AM +0200, Kevin Wolf wrote:
>> Am 01.05.2012 22:25, schrieb Anthony Liguori:
>>> Thanks for sending this out Stefan.
>>>
>>> On 05/01/2012 10:31 AM, Stefan Hajnoczi wrote:
>>>> Libvirt can take advantage of SELinux to restrict the QEMU process and
>>>> prevent it from opening files that it should not have access to.  This
>>>> improves security because it prevents the attacker from escaping the
>>>> QEMU process if they manage to gain control.
>>>>
>>>> NFS has been a pain point for SELinux because it does not support labels
>>>> (which I believe are stored in extended attributes).  In other words,
>>>> it's not possible to use SELinux goodness on QEMU when image files are
>>>> located on NFS.  Today we have to allow QEMU access to any file on the
>>>> NFS export rather than restricting specifically to the image files that
>>>> the guest requires.
>>>>
>>>> File descriptor passing is a solution to this problem and might also
>>>> come in handy elsewhere.  Libvirt or another external process chooses
>>>> files which QEMU is allowed to access and provides just those file
>>>> descriptors - QEMU cannot open the files itself.
>>>>
>>>> This series adds the -open-hook-fd command-line option.  Whenever QEMU
>>>> needs to open an image file it sends a request over the given UNIX
>>>> domain socket.  The response includes the file descriptor or an errno
>>>> on failure.  Please see the patches for details on the protocol.
>>>>
>>>> The -open-hook-fd approach allows QEMU to support file descriptor passing
>>>> without changing -drive.  It also supports snapshot_blkdev and other
>>>> commands that re-open image files.
>>>>
>>>> Anthony Liguori wrote most of these patches.  I added a demo
>>>> -open-hook-fd server and added some small fixes.  Since Anthony is
>>>> traveling right now I'm sending the RFC for discussion.
>>>
>>> What I like about this approach is that it's useful outside the block
>>> layer and is conceptually simple from a QEMU PoV.  We simply delegate
>>> open() to libvirt and let libvirt enforce whatever rules it wants.
>>>
>>> This is not meant to be an alternative to blockdev, but even with
>>> blockdev, I think we still want to use a mechanism like this.
>>
>> What does it provide on top?
>>
>> This doesn't look like something that I'd like a lot. qemu should be
>> able to continue to run no matter what the management tool does, whether
>> it responds to RPCs properly or whether it has crashed. You need a
>> really good use case for the RPC that cannot be covered otherwise in
>> order to justify this.
> 
> Indeed, this solution breaks if you stop or restart libvirtd while
> QEMU is running.  Restarting libvirt while QEMU is running is something
> we must support, since installing RPM updates will restart libvirtd
> and we cannot let guests die in this case.
> 
> I would much prefer to see us be able to pass FDs in directly alongside
> the disk config as we do for netdev TAP/etc, and for QEMU / kernel to be
> fixed so that you do not need to re-open FDs on the fly.

I agree, and this is what -blockdev would give us.

Part of why I don't like the RFC (apart from RPCing the management tool
being just wrong) is that once again it's trying to take shortcuts and
only provide a hack for the urgent need instead of doing it properly and
implementing -blockdev. I suspect that if we take something half-baked
like this, we will keep being unhappy with the situation in the block
layer, but it won't hurt enough any more to actually spend effort on it,
so that we'll go another five years with it.

Kevin



Re: [Qemu-devel] [libvirt] [RFC 0/5] block: File descriptor passing using -open-hook-fd

2012-05-02 Thread Daniel P. Berrange
On Wed, May 02, 2012 at 11:45:26AM +0200, Kevin Wolf wrote:
> Am 02.05.2012 10:53, schrieb Daniel P. Berrange:
> > On Wed, May 02, 2012 at 10:20:17AM +0200, Kevin Wolf wrote:
> >> Am 01.05.2012 22:25, schrieb Anthony Liguori:
> >>> Thanks for sending this out Stefan.
> >>>
> >>> On 05/01/2012 10:31 AM, Stefan Hajnoczi wrote:
> >>>> Libvirt can take advantage of SELinux to restrict the QEMU process and
> >>>> prevent it from opening files that it should not have access to.  This
> >>>> improves security because it prevents the attacker from escaping the
> >>>> QEMU process if they manage to gain control.
> >>>>
> >>>> NFS has been a pain point for SELinux because it does not support labels
> >>>> (which I believe are stored in extended attributes).  In other words,
> >>>> it's not possible to use SELinux goodness on QEMU when image files are
> >>>> located on NFS.  Today we have to allow QEMU access to any file on the
> >>>> NFS export rather than restricting specifically to the image files that
> >>>> the guest requires.
> >>>>
> >>>> File descriptor passing is a solution to this problem and might also
> >>>> come in handy elsewhere.  Libvirt or another external process chooses
> >>>> files which QEMU is allowed to access and provides just those file
> >>>> descriptors - QEMU cannot open the files itself.
> >>>>
> >>>> This series adds the -open-hook-fd command-line option.  Whenever QEMU
> >>>> needs to open an image file it sends a request over the given UNIX
> >>>> domain socket.  The response includes the file descriptor or an errno
> >>>> on failure.  Please see the patches for details on the protocol.
> >>>>
> >>>> The -open-hook-fd approach allows QEMU to support file descriptor passing
> >>>> without changing -drive.  It also supports snapshot_blkdev and other
> >>>> commands that re-open image files.
> >>>>
> >>>> Anthony Liguori wrote most of these patches.  I added a demo
> >>>> -open-hook-fd server and added some small fixes.  Since Anthony is
> >>>> traveling right now I'm sending the RFC for discussion.
> >>>
> >>> What I like about this approach is that it's useful outside the block layer
> >>> and is conceptually simple from a QEMU PoV.  We simply delegate open() to
> >>> libvirt and let libvirt enforce whatever rules it wants.
> >>>
> >>> This is not meant to be an alternative to blockdev; even with blockdev, I
> >>> think we would still want a mechanism like this.
> >>
> >> What does it provide on top?
> >>
> >> This doesn't look like something that I'd like a lot. qemu should be
> >> able to continue to run no matter what the management tool does, whether
> >> it responds to RPCs properly or whether it has crashed. You need a
> >> really good use case for the RPC that cannot be covered otherwise in
> >> order to justify this.
> > 
> > Indeed, this solution breaks if you stop or restart libvirtd while
> > QEMU is running.  Restarting libvirt while QEMU is running is something
> > we must support, since installing RPM updates will restart libvirtd
> > and we cannot let guests die in this case.
> > 
> > I would much prefer to see us be able to pass FDs in directly alongside
> > the disk config as we do for netdev TAP/etc, and for QEMU / kernel to be
> > fixed so that you do not need to re-open FDs on the fly.
> 
> I agree, and this is what -blockdev would give us.
> 
> Part of why I don't like the RFC (apart from RPCing the management tool
> being just wrong) is that once again it's trying to take shortcuts and
> only provide a hack for the urgent need instead of doing it properly and
> implementing -blockdev. I suspect that if we take something half-baked
> like this, we will keep being unhappy with the situation in the block
> layer, but it won't hurt enough any more to actually spend effort on it,
> so that we'll go another five years with it.

I tend to agree - we have been talking about -blockdev for far too long
without (AFAICT) making any real progress towards getting it done. I'd
love to see someone bite the bullet & have a go at implementing it.


Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [Qemu-devel] [libvirt] [RFC 0/5] block: File descriptor passing using -open-hook-fd

2012-05-02 Thread Paolo Bonzini
On 01/05/2012 22:56, Eric Blake wrote:
> What sort
> of timing restrictions are there?  For example, the proposed
> 'drive-reopen' command (probably now delegated to qemu 1.2) would mean
> that qemu would be calling back into libvirt in order to do the reopen.
>  If libvirt takes its time in passing back an open fd, is it going to
> starve qemu from answering unrelated monitor commands in the meantime?
> I definitely want to make sure we avoid deadlock where libvirt is
> waiting on a monitor command, but the monitor command is waiting on
> libvirt to pass an fd.

FWIW I'm going to kill drive-reopen in favor of something like
block-job-complete that will not require reopening (it will require
opening the backing files though, and that can also take time).

Paolo



Re: [Qemu-devel] [libvirt] [RFC 0/5] block: File descriptor passing using -open-hook-fd

2012-05-02 Thread Paolo Bonzini
On 02/05/2012 11:56, Daniel P. Berrange wrote:
> I tend to agree - we have been talking about -blockdev for far too long
> without (AFAICT) making any real progress towards getting it done. I'd
> love to see someone bite the bullet & have a go at implementing it.

Having a spec would help somewhat...

Paolo



Re: [Qemu-devel] [libvirt] [RFC 0/5] block: File descriptor passing using -open-hook-fd

2012-05-03 Thread Anthony Liguori

On 05/02/2012 04:45 AM, Kevin Wolf wrote:

On 02.05.2012 10:53, Daniel P. Berrange wrote:

I would much prefer to see us be able to pass FDs in directly alongside
the disk config as we do for netdev TAP/etc, and for QEMU / kernel to be
fixed so that you do not need to re-open FDs on the fly.


I agree, and this is what -blockdev would give us.

Part of why I don't like the RFC (apart from RPCing the management tool
being just wrong) is that once again it's trying to take shortcuts and
only provide a hack for the urgent need instead of doing it properly and
implementing -blockdev.


The proper way to address this problem is *not* -blockdev.  -blockdev is another 
short cut.


The proper way to solve this problem is to add extended attribute to SELinux. 
Another proper solution is for libvirt to launch guests with different UIDs and 
use DAC to prevent guests from opening files.



I suspect that if we take something half-baked
like this, we will keep being unhappy with the situation in the block
layer, but it won't hurt enough any more to actually spend effort on it,
so that we'll go another five years with it.


Wanting to refactor the block layer is great.  I am fully in support of it.  But 
holding practical features hostage is not reasonable.


There is nothing intrinsically cleaner about using -blockdev fd=X versus using
an RPC like this.  -blockdev has a lot of nice characteristics but solving this 
problem is not one of them.


Regards,

Anthony Liguori


Kevin






Re: [Qemu-devel] [libvirt] [RFC 0/5] block: File descriptor passing using -open-hook-fd

2012-05-03 Thread Zhi Yong Wu
On Tue, May 1, 2012 at 11:31 PM, Stefan Hajnoczi wrote:
> Libvirt can take advantage of SELinux to restrict the QEMU process and prevent
> it from opening files that it should not have access to.  This improves
> security because it prevents the attacker from escaping the QEMU process if
> they manage to gain control.
>
> NFS has been a pain point for SELinux because it does not support labels 
> (which
> I believe are stored in extended attributes).  In other words, it's not
> possible to use SELinux goodness on QEMU when image files are located on NFS.
> Today we have to allow QEMU access to any file on the NFS export rather than
> restricting specifically to the image files that the guest requires.
>
> File descriptor passing is a solution to this problem and might also come in
> handy elsewhere.  Libvirt or another external process chooses files which QEMU
> is allowed to access and provides just those file descriptors - QEMU cannot
> open the files itself.
>
> This series adds the -open-hook-fd command-line option.  Whenever QEMU needs 
> to
> open an image file it sends a request over the given UNIX domain socket.  The
> response includes the file descriptor or an errno on failure.  Please see the
> patches for details on the protocol.
>
> The -open-hook-fd approach allows QEMU to support file descriptor passing
> without changing -drive.  It also supports snapshot_blkdev and other commands
By the way, how will it support them?
> that re-open image files.
>
> Anthony Liguori wrote most of these patches.  I added a
> demo -open-hook-fd server and added some small fixes.  Since Anthony is
> traveling right now I'm sending the RFC for discussion.
>
> Anthony Liguori (3):
>  block: add open() wrapper that can be hooked by libvirt
>  block: add new command line parameter that and protocol description
>  block: plumb up open-hook-fd option
>
> Stefan Hajnoczi (2):
>  osdep: add qemu_recvmsg() wrapper
>  Example -open-hook-fd server
>
>  block.c           |  107 ++
>  block.h           |    2 +
>  block/raw-posix.c |   18 +++
>  block/raw-win32.c |    2 +-
>  block/vdi.c       |    2 +-
>  block/vmdk.c      |    6 +--
>  block/vpc.c       |    2 +-
>  block/vvfat.c     |    4 +-
>  block_int.h       |   12 +
>  osdep.c           |   46 +
>  qemu-common.h     |    2 +
>  qemu-options.hx   |   42 +++
>  test-fd-passing.c |  147 +
>  vl.c              |    3 ++
>  14 files changed, 378 insertions(+), 17 deletions(-)
>  create mode 100644 test-fd-passing.c
>
> --
> 1.7.10
>



-- 
Regards,

Zhi Yong Wu



Re: [Qemu-devel] [libvirt] [RFC 0/5] block: File descriptor passing using -open-hook-fd

2012-05-07 Thread Corey Bryant



On 05/01/2012 06:15 PM, Eric Blake wrote:

On 05/01/2012 03:53 PM, Anthony Liguori wrote:


I think (correct me if I'm wrong) libvirt should be aware of any file that qemu
asks it to open. So from a security point of view, libvirt can prevent opening a
file if it isn't affiliated with the guest.


Right, libvirt can maintain a whitelist of files QEMU is allowed to open
(which it already has because it needs to label these files).


Indeed.


The only complexity is that it's not a straight strcmp().  The path needs to be
(carefully) broken into components with '.' and '..' handled
appropriately.  But this shouldn't be that difficult to do.


Libvirt would probably canonicalize path names, both when sticking them
in the whitelist, and in validating the requests from qemu - agreed that
it's not difficult.
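
To make the "not a straight strcmp()" point concrete, here is a minimal
Python sketch of a canonicalize-then-compare check.  It is only an
illustration: the ImageWhitelist class and the image paths are hypothetical,
and os.path.realpath() stands in for whatever canonicalization libvirt would
actually use.

```python
import os


def canonicalize(path):
    """Return an absolute path with '.' and '..' collapsed and symlinks resolved."""
    return os.path.realpath(path)


class ImageWhitelist:
    """Hypothetical set of files that QEMU may ask to have opened."""

    def __init__(self, paths=()):
        # Store canonical forms so later lookups compare like with like.
        self._allowed = {canonicalize(p) for p in paths}

    def add(self, path):
        self._allowed.add(canonicalize(path))

    def allows(self, requested):
        # Compare canonical forms rather than raw strings.
        return canonicalize(requested) in self._allowed
```

Canonicalizing on both insertion and lookup means two spellings of the same
file match, while a request that climbs out of the image directory with '..'
does not.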

More importantly, libvirt needs to start tracking the backing chain of
any qcow2 or qed file as part of the domain XML; and operations like
'block-stream' would update not only the chain, but also the whitelist.
  In the drive-reopen case, this means that libvirt would have to be
careful when to change labeling - provide access to the new files before
drive-reopen, then revoke access to files after drive-reopen completes.
  In other words, having the -open-hook-fd client pass a command to
libvirt at the time it is closing an fd would help libvirt know when
qemu has quit using a file, which might make it easier to revoke SELinux
labels at that time.



If we were to go with this approach, I think the following updates would 
be required for libvirt.  Could you let me know if I'm missing anything?


libvirt tasks:
- Introduce a data structure to store file whitelist per guest
- Add -open-hook-fd option to QEMU command line and pass Unix
  domain socket fd to QEMU
- Create open() handler that handles requests from QEMU to open
  files and passes back fd
- Potentially also handle close requests from QEMU?  Would allow
  libvirt to update XML and whitelist (as well as SELinux labels).
- Canonicalize path names when putting them in whitelist and
  when validating requests from QEMU
- XML updates to track backing chain of qcow2 and qed files
- Update whitelist and XML chain when QEMU monitor commands are
  used to open new files: block-stream, drive-reopen, drive_add,
  savevm, snapshot_blkdev, change
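
As a rough illustration of the first few tasks above (per-guest whitelist,
open handler, optional close handling), here is a small Python sketch.
Everything in it is an assumption made for the example: GuestFileTracker and
its method names are hypothetical and are not libvirt data structures.

```python
import os
from collections import defaultdict


class GuestFileTracker:
    """Hypothetical per-guest bookkeeping for a libvirt-like manager."""

    def __init__(self):
        self._allowed = defaultdict(set)  # guest name -> canonical paths
        self._in_use = defaultdict(set)   # guest name -> paths QEMU has open

    def allow(self, guest, path):
        """Whitelist a file, e.g. when a backing chain gains a new image."""
        self._allowed[guest].add(os.path.realpath(path))

    def on_open(self, guest, path):
        """Validate an open request; return the canonical path to open."""
        path = os.path.realpath(path)
        if path not in self._allowed[guest]:
            raise PermissionError(path)
        self._in_use[guest].add(path)
        return path

    def on_close(self, guest, path):
        """Record that QEMU reported closing a file."""
        self._in_use[guest].discard(os.path.realpath(path))

    def revocable(self, guest):
        """Whitelisted files QEMU no longer holds open."""
        return self._allowed[guest] - self._in_use[guest]
```

The revocable() set is where close notifications would pay off: it is the
list of files whose access could be withdrawn after something like
drive-reopen completes.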

Updates would also be required for SELinux and AppArmor policy to allow
libvirt to open NFS files, and to allow QEMU to read and write (but not
open) NFS files.


--
Regards,
Corey




Re: [Qemu-devel] [libvirt] [RFC 0/5] block: File descriptor passing using -open-hook-fd

2012-05-17 Thread Stefan Hajnoczi
On Fri, May 04, 2012 at 11:28:47AM +0800, Zhi Yong Wu wrote:
> On Tue, May 1, 2012 at 11:31 PM, Stefan Hajnoczi wrote:
> > Libvirt can take advantage of SELinux to restrict the QEMU process and 
> > prevent
> > it from opening files that it should not have access to.  This improves
> > security because it prevents the attacker from escaping the QEMU process if
> > they manage to gain control.
> >
> > NFS has been a pain point for SELinux because it does not support labels 
> > (which
> > I believe are stored in extended attributes).  In other words, it's not
> > possible to use SELinux goodness on QEMU when image files are located on 
> > NFS.
> > Today we have to allow QEMU access to any file on the NFS export rather than
> > restricting specifically to the image files that the guest requires.
> >
> > File descriptor passing is a solution to this problem and might also come in
> > handy elsewhere.  Libvirt or another external process chooses files which 
> > QEMU
> > is allowed to access and provides just those file descriptors - QEMU cannot
> > open the files itself.
> >
> > This series adds the -open-hook-fd command-line option.  Whenever QEMU 
> > needs to
> > open an image file it sends a request over the given UNIX domain socket.  
> > The
> > response includes the file descriptor or an errno on failure.  Please see 
> > the
> > patches for details on the protocol.
> >
> > The -open-hook-fd approach allows QEMU to support file descriptor passing
> > without changing -drive.  It also supports snapshot_blkdev and other 
> > commands
> By the way, how will it support them?

The problem with snapshot_blkdev is that closing a file and opening a
new file cannot be done by the QEMU process when an SELinux policy is in
place to prevent opening files.

The -open-hook-fd approach works even when the QEMU process is not
allowed to open files since file descriptor passing over a UNIX domain
socket is used to open files on behalf of QEMU.
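
For readers unfamiliar with the mechanism, here is a minimal Python sketch of
opening a file on behalf of another process and handing the descriptor over a
UNIX domain socket with SCM_RIGHTS.  The framing is an assumption made for
the example (a bare path as the request, a native int errno plus an optional
fd as the reply); the RFC's real wire protocol is defined in the patches and
is not reproduced here.  The function names are hypothetical too.

```python
import array
import errno
import os
import socket
import struct


def send_reply(sock, err, fd=None):
    """Send an errno (0 on success) and, on success, the fd as SCM_RIGHTS data."""
    payload = struct.pack("i", err)
    if fd is None:
        sock.sendmsg([payload])
    else:
        rights = array.array("i", [fd])
        sock.sendmsg([payload], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, rights)])


def serve_one_open(sock, allowed):
    """Manager side: read one path request and answer it."""
    path = sock.recv(4096).decode()
    if path not in allowed:
        send_reply(sock, errno.EACCES)
        return
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as e:
        send_reply(sock, e.errno)
        return
    try:
        send_reply(sock, 0, fd)
    finally:
        os.close(fd)  # the receiver gets its own descriptor


def recv_reply(sock):
    """Confined side: return the passed fd, or raise OSError with the errno."""
    fds = array.array("i")
    data, ancdata, _flags, _addr = sock.recvmsg(
        struct.calcsize("i"), socket.CMSG_LEN(fds.itemsize))
    (err,) = struct.unpack("i", data)
    if err:
        raise OSError(err, os.strerror(err))
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            fds.frombytes(cdata[:fds.itemsize])
    if not fds:
        raise OSError(errno.EBADMSG, "reply carried no file descriptor")
    return fds[0]
```

The confined process never calls open() itself; it only receives a
descriptor that the manager decided to hand over, which is why a policy that
forbids open() does not get in the way.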

Stefan




Re: [Qemu-devel] [libvirt] [RFC 0/5] block: File descriptor passing using -open-hook-fd

2012-05-17 Thread Zhi Yong Wu
On Thu, May 17, 2012 at 9:42 PM, Stefan Hajnoczi wrote:
> On Fri, May 04, 2012 at 11:28:47AM +0800, Zhi Yong Wu wrote:
>> On Tue, May 1, 2012 at 11:31 PM, Stefan Hajnoczi wrote:
>> > Libvirt can take advantage of SELinux to restrict the QEMU process and 
>> > prevent
>> > it from opening files that it should not have access to.  This improves
>> > security because it prevents the attacker from escaping the QEMU process if
>> > they manage to gain control.
>> >
>> > NFS has been a pain point for SELinux because it does not support labels 
>> > (which
>> > I believe are stored in extended attributes).  In other words, it's not
>> > possible to use SELinux goodness on QEMU when image files are located on 
>> > NFS.
>> > Today we have to allow QEMU access to any file on the NFS export rather 
>> > than
>> > restricting specifically to the image files that the guest requires.
>> >
>> > File descriptor passing is a solution to this problem and might also come 
>> > in
>> > handy elsewhere.  Libvirt or another external process chooses files which 
>> > QEMU
>> > is allowed to access and provides just those file descriptors - QEMU cannot
>> > open the files itself.
>> >
>> > This series adds the -open-hook-fd command-line option.  Whenever QEMU 
>> > needs to
>> > open an image file it sends a request over the given UNIX domain socket.  
>> > The
>> > response includes the file descriptor or an errno on failure.  Please see 
>> > the
>> > patches for details on the protocol.
>> >
>> > The -open-hook-fd approach allows QEMU to support file descriptor passing
>> > without changing -drive.  It also supports snapshot_blkdev and other 
>> > commands
>> By the way, how will it support them?
>
> The problem with snapshot_blkdev is that closing a file and opening a
> new file cannot be done by the QEMU process when an SELinux policy is in
> place to prevent opening files.
>
> The -open-hook-fd approach works even when the QEMU process is not
> allowed to open files since file descriptor passing over a UNIX domain
> socket is used to open files on behalf of QEMU.
Do you mean that libvirt will provide a service to QEMU, so that when QEMU
needs to open or close a file it can send a request to libvirt?
>
> Stefan
>



-- 
Regards,

Zhi Yong Wu



Re: [Qemu-devel] [libvirt] [RFC 0/5] block: File descriptor passing using -open-hook-fd

2012-05-17 Thread Zhi Yong Wu
On Thu, May 17, 2012 at 9:42 PM, Stefan Hajnoczi wrote:
> On Fri, May 04, 2012 at 11:28:47AM +0800, Zhi Yong Wu wrote:
>> On Tue, May 1, 2012 at 11:31 PM, Stefan Hajnoczi wrote:
>> > Libvirt can take advantage of SELinux to restrict the QEMU process and 
>> > prevent
>> > it from opening files that it should not have access to.  This improves
>> > security because it prevents the attacker from escaping the QEMU process if
>> > they manage to gain control.
>> >
>> > NFS has been a pain point for SELinux because it does not support labels 
>> > (which
>> > I believe are stored in extended attributes).  In other words, it's not
>> > possible to use SELinux goodness on QEMU when image files are located on 
>> > NFS.
>> > Today we have to allow QEMU access to any file on the NFS export rather 
>> > than
>> > restricting specifically to the image files that the guest requires.
>> >
>> > File descriptor passing is a solution to this problem and might also come 
>> > in
>> > handy elsewhere.  Libvirt or another external process chooses files which 
>> > QEMU
>> > is allowed to access and provides just those file descriptors - QEMU cannot
>> > open the files itself.
>> >
>> > This series adds the -open-hook-fd command-line option.  Whenever QEMU 
>> > needs to
>> > open an image file it sends a request over the given UNIX domain socket.  
>> > The
>> > response includes the file descriptor or an errno on failure.  Please see 
>> > the
>> > patches for details on the protocol.
>> >
>> > The -open-hook-fd approach allows QEMU to support file descriptor passing
>> > without changing -drive.  It also supports snapshot_blkdev and other 
>> > commands
>> By the way, how will it support them?
>
> The problem with snapshot_blkdev is that closing a file and opening a
> new file cannot be done by the QEMU process when an SELinux policy is in
> place to prevent opening files.
>
> The -open-hook-fd approach works even when the QEMU process is not
> allowed to open files since file descriptor passing over a UNIX domain
> socket is used to open files on behalf of QEMU.
I thought the patch set only lets QEMU passively receive an fd that the
upper-layer application passes in.
>
> Stefan
>



-- 
Regards,

Zhi Yong Wu



Re: [Qemu-devel] [libvirt] [RFC 0/5] block: File descriptor passing using -open-hook-fd

2012-05-17 Thread Eric Blake
On 05/17/2012 07:42 AM, Stefan Hajnoczi wrote:

>>>
>>> The -open-hook-fd approach allows QEMU to support file descriptor passing
>>> without changing -drive.  It also supports snapshot_blkdev and other 
>>> commands
>> By the way, how will it support them?
> 
> The problem with snapshot_blkdev is that closing a file and opening a
> new file cannot be done by the QEMU process when an SELinux policy is in
> place to prevent opening files.

snapshot_blkdev can take an fd:name instead of a /path/to/file for the
file to open, in which case libvirt can pass in the named fd _prior_ to
the snapshot_blkdev using the 'getfd' monitor command.

> 
> The -open-hook-fd approach works even when the QEMU process is not
> allowed to open files since file descriptor passing over a UNIX domain
> socket is used to open files on behalf of QEMU.

The -open-hook-fd approach would indeed allow snapshot_blkdev to ask
for the fd after the fact, but it's much more painful.  Consider a case
with a two-disk snapshot:

with the fd:name approach, the sequence is:

libvirt calls getfd:name1 over normal monitor
qemu responds
libvirt calls getfd:name2 over normal monitor
qemu responds
libvirt calls transaction around blockdev-snapshot-sync over normal
monitor, using fd:name1 and fd:name2
qemu responds

but with -open-hook-fd, the approach would be:

libvirt calls transaction
qemu calls open(file1) over hook
libvirt responds
qemu calls open(file2) over hook
libvirt responds
qemu responds to the original transaction

The 'transaction' operation is thus blocked by the time it takes to do
two intermediate opens over a second channel, which kind of defeats the
purpose of making the transaction take effect with minimal guest
downtime.  And libvirt code becomes a lot trickier to deal with the fact
that two channels are in use, and that the channel that issued the
'transaction' command must block while the other channel for handling
hooks must be responsive.

I'm really disliking the hook-fd approach, when a better solution is to
make use of 'getfd' in advance of any operation that will need to open
new fds.
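
To spell out the getfd-first sequence, here is a small Python sketch that
builds the monitor messages in order.  The device names are hypothetical, and
using fd:name as the snapshot file follows the proposal in this mail; whether
a given QEMU build accepts it there is an assumption.  A real client would
also have to attach the descriptor to each getfd command as SCM_RIGHTS
ancillary data, which this sketch leaves out.

```python
import json


def getfd(name):
    """QMP command that registers a passed descriptor under a name."""
    return {"execute": "getfd", "arguments": {"fdname": name}}


def snapshot_transaction(disks):
    """One transaction snapshotting every (device, fdname) pair."""
    actions = [
        {
            "type": "blockdev-snapshot-sync",
            "data": {
                "device": device,
                "snapshot-file": "fd:" + name,  # assumption: fd:name accepted here
                "format": "qcow2",
            },
        }
        for device, name in disks
    ]
    return {"execute": "transaction", "arguments": {"actions": actions}}


def getfd_first_sequence(disks):
    """All getfd calls first, then a single transaction with no callbacks."""
    messages = [getfd(name) for _device, name in disks]
    messages.append(snapshot_transaction(disks))
    return messages


if __name__ == "__main__":
    for msg in getfd_first_sequence([("virtio0", "name1"), ("virtio1", "name2")]):
        print(json.dumps(msg))
```

Every descriptor is in place before the transaction is issued, so nothing in
the transaction has to wait on a second channel.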

-- 
Eric Blake   ebl...@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-devel] [libvirt] [RFC 0/5] block: File descriptor passing using -open-hook-fd

2012-05-18 Thread Stefan Hajnoczi
On Thu, May 17, 2012 at 08:14:15AM -0600, Eric Blake wrote:
> On 05/17/2012 07:42 AM, Stefan Hajnoczi wrote:
> 
> >>>
> >>> The -open-hook-fd approach allows QEMU to support file descriptor passing
> >>> without changing -drive.  It also supports snapshot_blkdev and other 
> >>> commands
> >> By the way, how will it support them?
> > 
> > The problem with snapshot_blkdev is that closing a file and opening a
> > new file cannot be done by the QEMU process when an SELinux policy is in
> > place to prevent opening files.
> 
> snapshot_blkdev can take an fd:name instead of a /path/to/file for the
> file to open, in which case libvirt can pass in the named fd _prior_ to
> the snapshot_blkdev using the 'getfd' monitor command.
> 
> > 
> > The -open-hook-fd approach works even when the QEMU process is not
> > allowed to open files since file descriptor passing over a UNIX domain
> > socket is used to open files on behalf of QEMU.
> 
> The -open-hook-fd approach would indeed allow snapshot_blkdev to ask
> for the fd after the fact, but it's much more painful.  Consider a case
> with a two-disk snapshot:
> 
> with the fd:name approach, the sequence is:
> 
> libvirt calls getfd:name1 over normal monitor
> qemu responds
> libvirt calls getfd:name2 over normal monitor
> qemu responds
> libvirt calls transaction around blockdev-snapshot-sync over normal
> monitor, using fd:name1 and fd:name2
> qemu responds
> 
> but with -open-hook-fd, the approach would be:
> 
> libvirt calls transaction
> qemu calls open(file1) over hook
> libvirt responds
> qemu calls open(file2) over hook
> libvirt responds
> qemu responds to the original transaction
> 
> The 'transaction' operation is thus blocked by the time it takes to do
> two intermediate opens over a second channel, which kind of defeats the
> purpose of making the transaction take effect with minimal guest
> downtime.  And libvirt code becomes a lot trickier to deal with the fact
> that two channels are in use, and that the channel that issued the
> 'transaction' command must block while the other channel for handling
> hooks must be responsive.
> 
> I'm really disliking the hook-fd approach, when a better solution is to
> make use of 'getfd' in advance of any operation that will need to open
> new fds.

This is a good technical argument for using getfd.  I agree with you.

Stefan




Re: [Qemu-devel] [libvirt] [RFC 0/5] block: File descriptor passing using -open-hook-fd

2012-05-18 Thread Stefan Hajnoczi
On Thu, May 17, 2012 at 10:02:01PM +0800, Zhi Yong Wu wrote:
> On Thu, May 17, 2012 at 9:42 PM, Stefan Hajnoczi wrote:
> > On Fri, May 04, 2012 at 11:28:47AM +0800, Zhi Yong Wu wrote:
> >> On Tue, May 1, 2012 at 11:31 PM, Stefan Hajnoczi wrote:
> >> > Libvirt can take advantage of SELinux to restrict the QEMU process and 
> >> > prevent
> >> > it from opening files that it should not have access to.  This improves
> >> > security because it prevents the attacker from escaping the QEMU process 
> >> > if
> >> > they manage to gain control.
> >> >
> >> > NFS has been a pain point for SELinux because it does not support labels 
> >> > (which
> >> > I believe are stored in extended attributes).  In other words, it's not
> >> > possible to use SELinux goodness on QEMU when image files are located on 
> >> > NFS.
> >> > Today we have to allow QEMU access to any file on the NFS export rather 
> >> > than
> >> > restricting specifically to the image files that the guest requires.
> >> >
> >> > File descriptor passing is a solution to this problem and might also 
> >> > come in
> >> > handy elsewhere.  Libvirt or another external process chooses files 
> >> > which QEMU
> >> > is allowed to access and provides just those file descriptors - QEMU 
> >> > cannot
> >> > open the files itself.
> >> >
> >> > This series adds the -open-hook-fd command-line option.  Whenever QEMU 
> >> > needs to
> >> > open an image file it sends a request over the given UNIX domain socket. 
> >> >  The
> >> > response includes the file descriptor or an errno on failure.  Please 
> >> > see the
> >> > patches for details on the protocol.
> >> >
> >> > The -open-hook-fd approach allows QEMU to support file descriptor passing
> >> > without changing -drive.  It also supports snapshot_blkdev and other 
> >> > commands
> >> By the way, how will it support them?
> >
> > The problem with snapshot_blkdev is that closing a file and opening a
> > new file cannot be done by the QEMU process when an SELinux policy is in
> > place to prevent opening files.
> >
> > The -open-hook-fd approach works even when the QEMU process is not
> > allowed to open files since file descriptor passing over a UNIX domain
> > socket is used to open files on behalf of QEMU.
> I thought the patch set only lets QEMU passively receive an fd that the
> upper-layer application passes in.

No.  What this patch series does is make QEMU request file descriptors
from an external process (e.g. libvirt) each time it wants to open an
image file.

Stefan