Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-10 Thread Kevin Wolf
Am 09.07.2012 19:35, schrieb Corey Bryant:
 
 
 On 07/09/2012 11:46 AM, Kevin Wolf wrote:
 Am 09.07.2012 17:05, schrieb Corey Bryant:
 I'm not sure this is an issue with current design.  I know things have
 changed a bit as the email threads evolved, so I'll paste the current
 design that I am working from.  Please let me know if you still see any
 issues.

 FD passing:
 ---
 New monitor commands enable adding/removing an fd to/from a set.  New
 monitor command query-fdsets enables querying of current monitor fdsets.
The set of fds should all refer to the same file, with each fd having
 different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can then dup
 the fd that has the matching access mode flags.

 Design points:
 --
 1. add-fd
 - fd is passed via SCM rights and qemu adds fd to first unused fdset
 (e.g. /dev/fdset/1)
 - add-fd monitor function initializes the monitor inuse flag for the
 fdset to true
 - add-fd monitor function initializes the remove flag for the fd to false
 - add-fd returns fdset number and received fd number (e.g fd=3) to caller

 2. drive_add file=/dev/fdset/1
 - qemu_open uses the first fd in fdset1 that has access flags matching
 the qemu_open action flags and has remove flag set to false
 - qemu_open increments refcount for the fdset
 - Need to make sure that if a command like 'device-add' fails that
 refcount is not incremented

 3. add-fd fdset=1
 - fd is passed via SCM rights
 - add-fd monitor function adds the received fd to the specified fdset
 (or fails if fdset doesn't exist)
 - add-fd monitor function initializes the remove flag for the fd to false
 - add-fd returns fdset number and received fd number (e.g fd=4) to caller

 4. block-commit
 - qemu_open performs reopen by using the first fd from the fdset that
 has access flags matching the qemu_open action flags and has remove flag
 set to false
 - qemu_open increments refcount for the fdset
 - Need to make sure that if a command like 'block-commit' fails that
 refcount is not incremented

 5. remove-fd fdset=1 fd=4
 - remove-fd monitor function fails if fdset doesn't exist
 - remove-fd monitor function turns on remove flag for fd=4

 What was again the reason why we keep removed fds in the fdset at all?
 
 Because if refcount is  0 for the fd set, then the fd could be in use 
 by a block device.  So we keep it around until refcount is decremented 
 to zero, at which point it is safe to close.
 

 The removed flag would make sense for a fdset after a hypothetical
 close-fdset call because the fdset needs to be kept around until the
 last user closes it, but I think removed fds can be deleted immediately.
 
 fds in an fd set really need to be kept around until zero block devices 
 reference them.  At that point, if '(refcount == 0  (!inuse || 
 remove))' is true, then we'll officially close the fd.

Block devices don't reference an fd in the fdset. There are two
references in a block device. The first one is obviously the file
descriptor they are using; it is a fd dup()ed from an fd in the fdset,
but it's now independent of it. The other reference is the file name
that is kept in the BlockDriverState, and it always points to
/dev/fdset/X, that is, the whole fdset instead of a single fd.

What happens if you remove a file descriptor from an fdset that is in
use, is that you can't reopen the fdset with the flags of the removed
file descriptor any more. Which I believe is exactly the expected
behaviour. libvirt would use this to revoke r/w access, for example (and
which behaviour you already provide by checking removed in qemu_open).

Are there any other use cases where it makes a difference whether a file
descriptor is kept in the fdset with removed=1 or whether it's actually
removed from the fdset?

 I think I might have confused remove-fd and close-fdset in earlier
 emails in this thread, so I hope this isn't inconsistent with what I
 said before.

 
 Ok no problem.
 
 6. qemu_close (need to replace all close calls in block layer with
 qemu_close)
 - qemu_close decrements refcount for fdset
 - qemu_close closes all fds that have (refcount == 0  (!inuse || remove))
 - qemu_close frees the fdset if no fds remain in it

 7. disconnecting the QMP monitor
 - monitor disconnect visits all fdsets on monitor and turns off monitor
 in-use flag for fdset

 And close all fds with refcount == 0.

 
 Yes, this makes sense.
 
 It also makes sense to close removed fds with refcount == 0 in the 
 remove-fd function.  Basically this will be the same thing we do in 
 qemu_close.  We'll close any fds that evaulate the following as true:
 
 (refcount == 0  (!inuse || remove))

Yes, whatever condition we'll come up with, but it should be the same
and checked in all places where its value might change.

Kevin

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-09 Thread Luiz Capitulino
On Thu, 05 Jul 2012 11:06:56 -0400
Corey Bryant cor...@linux.vnet.ibm.com wrote:

 
 
 On 07/04/2012 04:09 AM, Kevin Wolf wrote:
  Am 03.07.2012 20:21, schrieb Corey Bryant:
  On 07/03/2012 02:00 PM, Eric Blake wrote:
  On 07/03/2012 11:46 AM, Corey Bryant wrote:
 
 
  Yes, I think adding a +1 to the refcount for the monitor makes sense.
 
  I'm a bit unsure how to increment the refcount when a monitor reconnects
  though.  Maybe it is as simple as adding a +1 to each fd's refcount when
  the next QMP monitor connects.
 
  Or maybe delay a +1 until after a 'query-fds' - it is not until the
  monitor has reconnected and learned what fds it should be aware of that
  incrementing the refcount again makes sense.  But that would mean making
  'query-fds' track whether this is the first call since the monitor
  reconnected, as it shouldn't normally increase refcounts.
 
  This doesn't sound ideal.
 
  Yes, it's less than ideal.
 
  The other alternative is that the monitor never re-increments a
  refcount.  Once a monitor disconnects, that fd is lost to the monitor,
  and a reconnected monitor must pass in a new fd to be re-associated with
  the fdset.  In other words, the monitor's use of an fd is a one-way
  operation, starting life in use but ending at the first disconnect or
  remove-fd.
 
  I would vote for this 2nd alternative.  As long as we're not introducing
  an fd leak.  And I don't think we are if we decrement the refcount on
  remove-fd or on QMP disconnect.
 
  In fact, I believe this one is even worse. I can already see hacks like
  adding a dummy FD with invalid flags and removing it again just to
  regain control over the fdset...
 
  You earlier suggestion made a lot of sense to me: Whenever a new QMP
  monitor is connected, increase the refcount. That is, as long as any
  monitor is there, don't drop any fdsets unless explicitly requested via QMP.
 
 Ok.  So refcount would be incremented (for the fd or fdset, whatever we 
 decide on) when QMP reconnects.  I'm assuming we wouldn't wait until 
 after a query-fds call.

I'm not sure this is a good idea because we will leak fds if the client forgets
about the fds when re-connecting (ie. it was restarted) or if a different
client connects to QMP.

If we really want to do that, I think that the right way of doing this is to
add a command for clients to re-again ownership of the fds on reconnection.

But to be honest, I don't fully understand why this is needed.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-09 Thread Kevin Wolf
Am 06.07.2012 19:40, schrieb Corey Bryant:
 
 
 On 07/06/2012 05:11 AM, Kevin Wolf wrote:
 Am 05.07.2012 19:00, schrieb Eric Blake:
 On 07/05/2012 10:35 AM, Corey Bryant wrote:
 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
 refcount of 0; fd=4's in-use flag is turned on
 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
 so qemu_open() increments the refcount of fdset1 to 1
 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
 longer in-use by the monitor, and is left open because it is in use by
 the block device (refcount is 1)
 4. client crashes, so all tracked fds are visited; fd=4 is not in-use
 but refcount is 1 so it is not closed
 5. client re-establishes QMP connection, so all tracked fds are visited.
   What happens to the fd=4 in-use flag?

 ...but what are the semantics here?

 If we _always_ turn the in-use flag back on, then that says that even
 though libvirt successfully asked to close the fd, the reconnection
 means that libvirt once again has to ask to close things.

 If we _never_ turn the in-use flag back on, then we've broken the first
 case above where we want an in-use fd to come back into use after a crash.

 Maybe that argues for two flags per fd: an in-use flag (there is
 currently a monitor connection that can manipulate the fd, either
 because it passed in the fd or because it reconnected) and a removed
 flag (a monitor called remove-fd, and no longer wants to know about the
 fd, even if it crashes and reconnects).

 I was in fact just going to suggest a removed flag as well, however
 combined with including the monitor connections in the refcount instead
 of an additional flag. This would also allow to have (the currently
 mostly hypothetical case of) multiple QMP monitors without interesting
 things happening.

 Maybe I'm missing some point that the inuse flag would allow and
 including monitors in the refcount doesn't. Is there one?

 Kevin

 
 Ok let me try this again. I was going through some of the examples and I 
 think we need a separate in-use flag.  Otherwise, when refcount gets to 
 1, we don't know if it is because of a monitor reference or a block 
 device reference.  

Does it matter?

 I think it would cause fds to sit on the monitor 
 until refcount gets to zero (monitor disconnects). Here's an example 
 without the in-use flag:
 
 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
 refcount of 1 (incremented because of monitor reference); fd=4's remove 
 flag is initialized to off
 2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
 qemu_open() increments the refcount of fdset1 to 2
 3. client crashes, so all fdsets are visited; fd=4 had not yet been
 passed to 'remove-fd', so it's remove flag is off; refcount for fdset1 
 is decremented to 1; fd=4 is left open because it is still in use by the 
 block device (refcount is 1)
 4. client re-establishes QMP connection, refcount for fdset1 is 
 incremented to 2; 'query-fds' lets client learn about fd=4 still being 
 open as part of fdset1
 5. client calls 'remove-fd fdset=1 fd=4'; qemu turns on remove flag for 
 fd=4; but fd=4 remains open because refcount of fdset1 is 2

It also decreases the reference count because the monitor doesn't use it
any more.

 6. qemu_close is called for fd=4; refcount for fdset1 is decremented to 
 1; fd=4 remains open because monitor still references fdset1 (refcount 
 of fdset1 is 1)

So here the refcount becomes 0 and the fdset is closed.

 7. Sometime later.. QMP disconnects; refcount for fdset is decremented 
 to zero; fd=4 is closed

The only question that is a bit unclear to me is whether a remove-fd on
one monitor drops the refcount only for this monitor or for all
monitors. However, both options can be implemented without additional
flags or counters.

Kevin

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-09 Thread Corey Bryant



On 07/09/2012 10:05 AM, Luiz Capitulino wrote:

On Thu, 05 Jul 2012 11:06:56 -0400
Corey Bryant cor...@linux.vnet.ibm.com wrote:




On 07/04/2012 04:09 AM, Kevin Wolf wrote:

Am 03.07.2012 20:21, schrieb Corey Bryant:

On 07/03/2012 02:00 PM, Eric Blake wrote:

On 07/03/2012 11:46 AM, Corey Bryant wrote:



Yes, I think adding a +1 to the refcount for the monitor makes sense.

I'm a bit unsure how to increment the refcount when a monitor reconnects
though.  Maybe it is as simple as adding a +1 to each fd's refcount when
the next QMP monitor connects.


Or maybe delay a +1 until after a 'query-fds' - it is not until the
monitor has reconnected and learned what fds it should be aware of that
incrementing the refcount again makes sense.  But that would mean making
'query-fds' track whether this is the first call since the monitor
reconnected, as it shouldn't normally increase refcounts.


This doesn't sound ideal.


Yes, it's less than ideal.


The other alternative is that the monitor never re-increments a
refcount.  Once a monitor disconnects, that fd is lost to the monitor,
and a reconnected monitor must pass in a new fd to be re-associated with
the fdset.  In other words, the monitor's use of an fd is a one-way
operation, starting life in use but ending at the first disconnect or
remove-fd.


I would vote for this 2nd alternative.  As long as we're not introducing
an fd leak.  And I don't think we are if we decrement the refcount on
remove-fd or on QMP disconnect.


In fact, I believe this one is even worse. I can already see hacks like
adding a dummy FD with invalid flags and removing it again just to
regain control over the fdset...

You earlier suggestion made a lot of sense to me: Whenever a new QMP
monitor is connected, increase the refcount. That is, as long as any
monitor is there, don't drop any fdsets unless explicitly requested via QMP.


Ok.  So refcount would be incremented (for the fd or fdset, whatever we
decide on) when QMP reconnects.  I'm assuming we wouldn't wait until
after a query-fds call.


I'm not sure this is a good idea because we will leak fds if the client forgets
about the fds when re-connecting (ie. it was restarted) or if a different
client connects to QMP.

If we really want to do that, I think that the right way of doing this is to
add a command for clients to re-again ownership of the fds on reconnection.

But to be honest, I don't fully understand why this is needed.



I'm not sure this is an issue with current design.  I know things have 
changed a bit as the email threads evolved, so I'll paste the current 
design that I am working from.  Please let me know if you still see any 
issues.


FD passing:
---
New monitor commands enable adding/removing an fd to/from a set.  New 
monitor command query-fdsets enables querying of current monitor fdsets. 
 The set of fds should all refer to the same file, with each fd having 
different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can then dup 
the fd that has the matching access mode flags.


Design points:
--
1. add-fd
- fd is passed via SCM rights and qemu adds fd to first unused fdset 
(e.g. /dev/fdset/1)
- add-fd monitor function initializes the monitor inuse flag for the 
fdset to true

- add-fd monitor function initializes the remove flag for the fd to false
- add-fd returns fdset number and received fd number (e.g fd=3) to caller

2. drive_add file=/dev/fdset/1
- qemu_open uses the first fd in fdset1 that has access flags matching 
the qemu_open action flags and has remove flag set to false

- qemu_open increments refcount for the fdset
- Need to make sure that if a command like 'device-add' fails that 
refcount is not incremented


3. add-fd fdset=1
- fd is passed via SCM rights
- add-fd monitor function adds the received fd to the specified fdset 
(or fails if fdset doesn't exist)

- add-fd monitor function initializes the remove flag for the fd to false
- add-fd returns fdset number and received fd number (e.g fd=4) to caller

4. block-commit
- qemu_open performs reopen by using the first fd from the fdset that 
has access flags matching the qemu_open action flags and has remove flag 
set to false

- qemu_open increments refcount for the fdset
- Need to make sure that if a command like 'block-commit' fails that 
refcount is not incremented


5. remove-fd fdset=1 fd=4
- remove-fd monitor function fails if fdset doesn't exist
- remove-fd monitor function turns on remove flag for fd=4

6. qemu_close (need to replace all close calls in block layer with 
qemu_close)

- qemu_close decrements refcount for fdset
- qemu_close closes all fds that have (refcount == 0  (!inuse || remove))
- qemu_close frees the fdset if no fds remain in it

7. disconnecting the QMP monitor
- monitor disconnect visits all fdsets on monitor and turns off monitor 
in-use flag for fdset


8. connecting the QMP monitor
- monitor connect visits all fdsets on monitor and turns on monitor 
in-use flag for fdset


9. 

Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-09 Thread Corey Bryant



On 07/09/2012 10:04 AM, Kevin Wolf wrote:

Am 06.07.2012 19:40, schrieb Corey Bryant:



On 07/06/2012 05:11 AM, Kevin Wolf wrote:

Am 05.07.2012 19:00, schrieb Eric Blake:

On 07/05/2012 10:35 AM, Corey Bryant wrote:

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount of fdset1 to 1
3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in-use by the monitor, and is left open because it is in use by
the block device (refcount is 1)
4. client crashes, so all tracked fds are visited; fd=4 is not in-use
but refcount is 1 so it is not closed

5. client re-establishes QMP connection, so all tracked fds are visited.
   What happens to the fd=4 in-use flag?

...but what are the semantics here?

If we _always_ turn the in-use flag back on, then that says that even
though libvirt successfully asked to close the fd, the reconnection
means that libvirt once again has to ask to close things.

If we _never_ turn the in-use flag back on, then we've broken the first
case above where we want an in-use fd to come back into use after a crash.

Maybe that argues for two flags per fd: an in-use flag (there is
currently a monitor connection that can manipulate the fd, either
because it passed in the fd or because it reconnected) and a removed
flag (a monitor called remove-fd, and no longer wants to know about the
fd, even if it crashes and reconnects).


I was in fact just going to suggest a removed flag as well, however
combined with including the monitor connections in the refcount instead
of an additional flag. This would also allow to have (the currently
mostly hypothetical case of) multiple QMP monitors without interesting
things happening.

Maybe I'm missing some point that the inuse flag would allow and
including monitors in the refcount doesn't. Is there one?

Kevin



Ok let me try this again. I was going through some of the examples and I
think we need a separate in-use flag.  Otherwise, when refcount gets to
1, we don't know if it is because of a monitor reference or a block
device reference.


Does it matter?



I think it would cause fds to sit on the monitor
until refcount gets to zero (monitor disconnects). Here's an example
without the in-use flag:

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 1 (incremented because of monitor reference); fd=4's remove
flag is initialized to off
2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
qemu_open() increments the refcount of fdset1 to 2
3. client crashes, so all fdsets are visited; fd=4 had not yet been
passed to 'remove-fd', so it's remove flag is off; refcount for fdset1
is decremented to 1; fd=4 is left open because it is still in use by the
block device (refcount is 1)
4. client re-establishes QMP connection, refcount for fdset1 is
incremented to 2; 'query-fds' lets client learn about fd=4 still being
open as part of fdset1
5. client calls 'remove-fd fdset=1 fd=4'; qemu turns on remove flag for
fd=4; but fd=4 remains open because refcount of fdset1 is 2


It also decreases the reference count because the monitor doesn't use it
any more.


I don't think that will work because refcount is for the entire fdset. 
So we can't decrement the refcount for every fd that is removed from the 
fdset.


I think it is much simpler if we only increment refcount for an fdset on 
qemu_open, and only decrement refcount on qemu_close.





6. qemu_close is called for fd=4; refcount for fdset1 is decremented to
1; fd=4 remains open because monitor still references fdset1 (refcount
of fdset1 is 1)


So here the refcount becomes 0 and the fdset is closed.




7. Sometime later.. QMP disconnects; refcount for fdset is decremented
to zero; fd=4 is closed


The only question that is a bit unclear to me is whether a remove-fd on
one monitor drops the refcount only for this monitor or for all
monitors. However, both options can be implemented without additional
flags or counters.


Before we go back and forth on this thread, would you mind taking a look 
at the last email I sent to Luiz?  It includes all the design points 
that I'm currently working from.  I think it's a good level set and we 
can work off that thread if there are still any issues.


--
Regards,
Corey


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-09 Thread Kevin Wolf
Am 09.07.2012 17:23, schrieb Corey Bryant:
 I think it would cause fds to sit on the monitor
 until refcount gets to zero (monitor disconnects). Here's an example
 without the in-use flag:

 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
 refcount of 1 (incremented because of monitor reference); fd=4's remove
 flag is initialized to off
 2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
 qemu_open() increments the refcount of fdset1 to 2
 3. client crashes, so all fdsets are visited; fd=4 had not yet been
 passed to 'remove-fd', so it's remove flag is off; refcount for fdset1
 is decremented to 1; fd=4 is left open because it is still in use by the
 block device (refcount is 1)
 4. client re-establishes QMP connection, refcount for fdset1 is
 incremented to 2; 'query-fds' lets client learn about fd=4 still being
 open as part of fdset1
 5. client calls 'remove-fd fdset=1 fd=4'; qemu turns on remove flag for
 fd=4; but fd=4 remains open because refcount of fdset1 is 2

 It also decreases the reference count because the monitor doesn't use it
 any more.
 
 I don't think that will work because refcount is for the entire fdset. 
 So we can't decrement the refcount for every fd that is removed from the 
 fdset.
 
 I think it is much simpler if we only increment refcount for an fdset on 
 qemu_open, and only decrement refcount on qemu_close.

Ah right... So this would only work if we had explicit
fdset-create/close commands, where the former would increase the
refcount and the latter decrease it (fdset-open would be optional but I
like symmetry)

Maybe we need (or want) that anyway, but I need to think more about it
first.

 6. qemu_close is called for fd=4; refcount for fdset1 is decremented to
 1; fd=4 remains open because monitor still references fdset1 (refcount
 of fdset1 is 1)

 So here the refcount becomes 0 and the fdset is closed.

 
 7. Sometime later.. QMP disconnects; refcount for fdset is decremented
 to zero; fd=4 is closed

 The only question that is a bit unclear to me is whether a remove-fd on
 one monitor drops the refcount only for this monitor or for all
 monitors. However, both options can be implemented without additional
 flags or counters.
 
 Before we go back and forth on this thread, would you mind taking a look 
 at the last email I sent to Luiz?  It includes all the design points 
 that I'm currently working from.  I think it's a good level set and we 
 can work off that thread if there are still any issues.

Ok, I'll have a look.

Kevin

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-09 Thread Kevin Wolf
Am 09.07.2012 17:05, schrieb Corey Bryant:
 I'm not sure this is an issue with current design.  I know things have 
 changed a bit as the email threads evolved, so I'll paste the current 
 design that I am working from.  Please let me know if you still see any 
 issues.
 
 FD passing:
 ---
 New monitor commands enable adding/removing an fd to/from a set.  New 
 monitor command query-fdsets enables querying of current monitor fdsets. 
   The set of fds should all refer to the same file, with each fd having 
 different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can then dup 
 the fd that has the matching access mode flags.
 
 Design points:
 --
 1. add-fd
 - fd is passed via SCM rights and qemu adds fd to first unused fdset 
 (e.g. /dev/fdset/1)
 - add-fd monitor function initializes the monitor inuse flag for the 
 fdset to true
 - add-fd monitor function initializes the remove flag for the fd to false
 - add-fd returns fdset number and received fd number (e.g fd=3) to caller
 
 2. drive_add file=/dev/fdset/1
 - qemu_open uses the first fd in fdset1 that has access flags matching 
 the qemu_open action flags and has remove flag set to false
 - qemu_open increments refcount for the fdset
 - Need to make sure that if a command like 'device-add' fails that 
 refcount is not incremented
 
 3. add-fd fdset=1
 - fd is passed via SCM rights
 - add-fd monitor function adds the received fd to the specified fdset 
 (or fails if fdset doesn't exist)
 - add-fd monitor function initializes the remove flag for the fd to false
 - add-fd returns fdset number and received fd number (e.g fd=4) to caller
 
 4. block-commit
 - qemu_open performs reopen by using the first fd from the fdset that 
 has access flags matching the qemu_open action flags and has remove flag 
 set to false
 - qemu_open increments refcount for the fdset
 - Need to make sure that if a command like 'block-commit' fails that 
 refcount is not incremented
 
 5. remove-fd fdset=1 fd=4
 - remove-fd monitor function fails if fdset doesn't exist
 - remove-fd monitor function turns on remove flag for fd=4

What was again the reason why we keep removed fds in the fdset at all?

The removed flag would make sense for a fdset after a hypothetical
close-fdset call because the fdset needs to be kept around until the
last user closes it, but I think removed fds can be deleted immediately.

I think I might have confused remove-fd and close-fdset in earlier
emails in this thread, so I hope this isn't inconsistent with what I
said before.

 6. qemu_close (need to replace all close calls in block layer with 
 qemu_close)
 - qemu_close decrements refcount for fdset
 - qemu_close closes all fds that have (refcount == 0  (!inuse || remove))
 - qemu_close frees the fdset if no fds remain in it
 
 7. disconnecting the QMP monitor
 - monitor disconnect visits all fdsets on monitor and turns off monitor 
 in-use flag for fdset

And close all fds with refcount == 0.

 8. connecting the QMP monitor
 - monitor connect visits all fdsets on monitor and turns on monitor 
 in-use flag for fdset
 
 9. query-fdsets
 - returns all fdsets and fds that don't have remove flag on

Kevin

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-09 Thread Luiz Capitulino
On Mon, 09 Jul 2012 17:46:00 +0200
Kevin Wolf kw...@redhat.com wrote:

 Am 09.07.2012 17:05, schrieb Corey Bryant:
  I'm not sure this is an issue with current design.  I know things have 
  changed a bit as the email threads evolved, so I'll paste the current 
  design that I am working from.  Please let me know if you still see any 
  issues.
  
  FD passing:
  ---
  New monitor commands enable adding/removing an fd to/from a set.  New 
  monitor command query-fdsets enables querying of current monitor fdsets. 
The set of fds should all refer to the same file, with each fd having 
  different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can then dup 
  the fd that has the matching access mode flags.
  
  Design points:
  --
  1. add-fd
  - fd is passed via SCM rights and qemu adds fd to first unused fdset 
  (e.g. /dev/fdset/1)

The fdset should be specified by the client, like:

 { execute: add-fd-set, arguments: { set-name: /dev/fdset/1 } }

  - add-fd monitor function initializes the monitor inuse flag for the 
  fdset to true

Why do we need the inuse flag?

  - add-fd monitor function initializes the remove flag for the fd to false
  - add-fd returns fdset number and received fd number (e.g fd=3) to caller
  
  2. drive_add file=/dev/fdset/1
  - qemu_open uses the first fd in fdset1 that has access flags matching 
  the qemu_open action flags and has remove flag set to false
  - qemu_open increments refcount for the fdset
  - Need to make sure that if a command like 'device-add' fails that 
  refcount is not incremented
  
  3. add-fd fdset=1
  - fd is passed via SCM rights
  - add-fd monitor function adds the received fd to the specified fdset 
  (or fails if fdset doesn't exist)
  - add-fd monitor function initializes the remove flag for the fd to false
  - add-fd returns fdset number and received fd number (e.g fd=4) to caller
  
  4. block-commit
  - qemu_open performs reopen by using the first fd from the fdset that 
  has access flags matching the qemu_open action flags and has remove flag 
  set to false
  - qemu_open increments refcount for the fdset
  - Need to make sure that if a command like 'block-commit' fails that 
  refcount is not incremented
  
  5. remove-fd fdset=1 fd=4
  - remove-fd monitor function fails if fdset doesn't exist
  - remove-fd monitor function turns on remove flag for fd=4
 
 What was again the reason why we keep removed fds in the fdset at all?
 
 The removed flag would make sense for a fdset after a hypothetical
 close-fdset call because the fdset needs to be kept around until the
 last user closes it, but I think removed fds can be deleted immediately.

Agreed.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-09 Thread Corey Bryant



On 07/09/2012 11:46 AM, Kevin Wolf wrote:

Am 09.07.2012 17:05, schrieb Corey Bryant:

I'm not sure this is an issue with current design.  I know things have
changed a bit as the email threads evolved, so I'll paste the current
design that I am working from.  Please let me know if you still see any
issues.

FD passing:
---
New monitor commands enable adding/removing an fd to/from a set.  New
monitor command query-fdsets enables querying of current monitor fdsets.
   The set of fds should all refer to the same file, with each fd having
different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can then dup
the fd that has the matching access mode flags.

Design points:
--
1. add-fd
- fd is passed via SCM rights and qemu adds fd to first unused fdset
(e.g. /dev/fdset/1)
- add-fd monitor function initializes the monitor inuse flag for the
fdset to true
- add-fd monitor function initializes the remove flag for the fd to false
- add-fd returns fdset number and received fd number (e.g fd=3) to caller

2. drive_add file=/dev/fdset/1
- qemu_open uses the first fd in fdset1 that has access flags matching
the qemu_open action flags and has remove flag set to false
- qemu_open increments refcount for the fdset
- Need to make sure that if a command like 'device-add' fails that
refcount is not incremented

3. add-fd fdset=1
- fd is passed via SCM rights
- add-fd monitor function adds the received fd to the specified fdset
(or fails if fdset doesn't exist)
- add-fd monitor function initializes the remove flag for the fd to false
- add-fd returns fdset number and received fd number (e.g fd=4) to caller

4. block-commit
- qemu_open performs reopen by using the first fd from the fdset that
has access flags matching the qemu_open action flags and has remove flag
set to false
- qemu_open increments refcount for the fdset
- Need to make sure that if a command like 'block-commit' fails that
refcount is not incremented

5. remove-fd fdset=1 fd=4
- remove-fd monitor function fails if fdset doesn't exist
- remove-fd monitor function turns on remove flag for fd=4


What was again the reason why we keep removed fds in the fdset at all?


Because if refcount is  0 for the fd set, then the fd could be in use 
by a block device.  So we keep it around until refcount is decremented 
to zero, at which point it is safe to close.




The removed flag would make sense for a fdset after a hypothetical
close-fdset call because the fdset needs to be kept around until the
last user closes it, but I think removed fds can be deleted immediately.


fds in an fd set really need to be kept around until zero block devices 
reference them.  At that point, if '(refcount == 0  (!inuse || 
remove))' is true, then we'll officially close the fd.




I think I might have confused remove-fd and close-fdset in earlier
emails in this thread, so I hope this isn't inconsistent with what I
said before.



Ok no problem.


6. qemu_close (need to replace all close calls in block layer with
qemu_close)
- qemu_close decrements refcount for fdset
- qemu_close closes all fds that have (refcount == 0  (!inuse || remove))
- qemu_close frees the fdset if no fds remain in it

7. disconnecting the QMP monitor
- monitor disconnect visits all fdsets on monitor and turns off monitor
in-use flag for fdset


And close all fds with refcount == 0.



Yes, this makes sense.

It also makes sense to close removed fds with refcount == 0 in the 
remove-fd function.  Basically this will be the same thing we do in 
qemu_close.  We'll close any fds that evaulate the following as true:


(refcount == 0  (!inuse || remove))


8. connecting the QMP monitor
- monitor connect visits all fdsets on monitor and turns on monitor
in-use flag for fdset

9. query-fdsets
- returns all fdsets and fds that don't have remove flag on


--
Regards,
Corey


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-09 Thread Luiz Capitulino
On Mon, 09 Jul 2012 13:35:19 -0400
Corey Bryant cor...@linux.vnet.ibm.com wrote:

 
 
 On 07/09/2012 11:46 AM, Kevin Wolf wrote:
  Am 09.07.2012 17:05, schrieb Corey Bryant:
  I'm not sure this is an issue with current design.  I know things have
  changed a bit as the email threads evolved, so I'll paste the current
  design that I am working from.  Please let me know if you still see any
  issues.
 
  FD passing:
  ---
  New monitor commands enable adding/removing an fd to/from a set.  New
  monitor command query-fdsets enables querying of current monitor fdsets.
 The set of fds should all refer to the same file, with each fd having
  different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can then dup
  the fd that has the matching access mode flags.
 
  Design points:
  --
  1. add-fd
  - fd is passed via SCM rights and qemu adds fd to first unused fdset
  (e.g. /dev/fdset/1)
  - add-fd monitor function initializes the monitor inuse flag for the
  fdset to true
  - add-fd monitor function initializes the remove flag for the fd to false
  - add-fd returns fdset number and received fd number (e.g fd=3) to caller
 
  2. drive_add file=/dev/fdset/1
  - qemu_open uses the first fd in fdset1 that has access flags matching
  the qemu_open action flags and has remove flag set to false
  - qemu_open increments refcount for the fdset
  - Need to make sure that if a command like 'device-add' fails that
  refcount is not incremented
 
  3. add-fd fdset=1
  - fd is passed via SCM rights
  - add-fd monitor function adds the received fd to the specified fdset
  (or fails if fdset doesn't exist)
  - add-fd monitor function initializes the remove flag for the fd to false
  - add-fd returns fdset number and received fd number (e.g fd=4) to caller
 
  4. block-commit
  - qemu_open performs reopen by using the first fd from the fdset that
  has access flags matching the qemu_open action flags and has remove flag
  set to false
  - qemu_open increments refcount for the fdset
  - Need to make sure that if a command like 'block-commit' fails that
  refcount is not incremented
 
  5. remove-fd fdset=1 fd=4
  - remove-fd monitor function fails if fdset doesn't exist
  - remove-fd monitor function turns on remove flag for fd=4
 
  What was again the reason why we keep removed fds in the fdset at all?
 
 Because if refcount is  0 for the fd set, then the fd could be in use 
 by a block device.  So we keep it around until refcount is decremented 
 to zero, at which point it is safe to close.

But then the refcount is associated with the set, not with any particular fd.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-09 Thread Corey Bryant



On 07/09/2012 12:18 PM, Luiz Capitulino wrote:

On Mon, 09 Jul 2012 17:46:00 +0200
Kevin Wolf kw...@redhat.com wrote:


Am 09.07.2012 17:05, schrieb Corey Bryant:

I'm not sure this is an issue with current design.  I know things have
changed a bit as the email threads evolved, so I'll paste the current
design that I am working from.  Please let me know if you still see any
issues.

FD passing:
---
New monitor commands enable adding/removing an fd to/from a set.  New
monitor command query-fdsets enables querying of current monitor fdsets.
   The set of fds should all refer to the same file, with each fd having
different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can then dup
the fd that has the matching access mode flags.

Design points:
--
1. add-fd
- fd is passed via SCM rights and qemu adds fd to first unused fdset
(e.g. /dev/fdset/1)


The fdset should be specified by the client, like:

  { execute: add-fd-set, arguments: { set-name: /dev/fdset/1 } }



We could make the fdset name configurable.  Then we wouldn't force 
clients into using the file=/dev/fdset/1 alias on commands like 
device_add.  The risk with this is that clients need to be careful and 
use a unique name that doesn't conflict with any existing file names.


The way it is currently, if add-fd is not given an fdset name, it will 
generate a new fdset and add the fd to it.  add-fd always returns the 
fdset (int) and received fd (int) on success.  (e.g. fdset=1 corresponds 
to file=/dev/fdset/1).  Then the 2nd time you want to add an fd to this 
set, you have to specify fdset=1 on add-fd.


I'll do whatever you all prefer.  I think there are advantages in both 
approaches, however I'm leaning toward the current approach and forcing 
use of /dev/fdset/1 to keep all fd set names consistent.



- add-fd monitor function initializes the monitor inuse flag for the
fdset to true


Why do we need the inuse flag?



This helps to prevent fd leakage.  Let's say the client adds fd=3 to 
/dev/fdset/1 and then the QMP monitor disconnects.  Since the following 
evaluates to true when the monitor disconnects, the fd will be closed:


(refcount == 0  (!inuse || remove))

Note: refcount is incremented for the fdset on qemu_open and decremented 
on qemu_close, and no commands caused it to be incremented from zero in 
this example.



- add-fd monitor function initializes the remove flag for the fd to false
- add-fd returns fdset number and received fd number (e.g fd=3) to caller

2. drive_add file=/dev/fdset/1
- qemu_open uses the first fd in fdset1 that has access flags matching
the qemu_open action flags and has remove flag set to false
- qemu_open increments refcount for the fdset
- Need to make sure that if a command like 'device-add' fails that
refcount is not incremented

3. add-fd fdset=1
- fd is passed via SCM rights
- add-fd monitor function adds the received fd to the specified fdset
(or fails if fdset doesn't exist)
- add-fd monitor function initializes the remove flag for the fd to false
- add-fd returns fdset number and received fd number (e.g fd=4) to caller

4. block-commit
- qemu_open performs reopen by using the first fd from the fdset that
has access flags matching the qemu_open action flags and has remove flag
set to false
- qemu_open increments refcount for the fdset
- Need to make sure that if a command like 'block-commit' fails that
refcount is not incremented

5. remove-fd fdset=1 fd=4
- remove-fd monitor function fails if fdset doesn't exist
- remove-fd monitor function turns on remove flag for fd=4


What was again the reason why we keep removed fds in the fdset at all?

The removed flag would make sense for a fdset after a hypothetical
close-fdset call because the fdset needs to be kept around until the
last user closes it, but I think removed fds can be deleted immediately.


Agreed.



Please take a look at my recent reply to Kevin about this and let me 
know if it clears things up.


--
Regards,
Corey


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-09 Thread Corey Bryant



On 07/09/2012 01:48 PM, Luiz Capitulino wrote:

On Mon, 09 Jul 2012 13:35:19 -0400
Corey Bryant cor...@linux.vnet.ibm.com wrote:




On 07/09/2012 11:46 AM, Kevin Wolf wrote:

Am 09.07.2012 17:05, schrieb Corey Bryant:

I'm not sure this is an issue with current design.  I know things have
changed a bit as the email threads evolved, so I'll paste the current
design that I am working from.  Please let me know if you still see any
issues.

FD passing:
---
New monitor commands enable adding/removing an fd to/from a set.  New
monitor command query-fdsets enables querying of current monitor fdsets.
The set of fds should all refer to the same file, with each fd having
different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can then dup
the fd that has the matching access mode flags.

Design points:
--
1. add-fd
- fd is passed via SCM rights and qemu adds fd to first unused fdset
(e.g. /dev/fdset/1)
- add-fd monitor function initializes the monitor inuse flag for the
fdset to true
- add-fd monitor function initializes the remove flag for the fd to false
- add-fd returns fdset number and received fd number (e.g fd=3) to caller

2. drive_add file=/dev/fdset/1
- qemu_open uses the first fd in fdset1 that has access flags matching
the qemu_open action flags and has remove flag set to false
- qemu_open increments refcount for the fdset
- Need to make sure that if a command like 'device-add' fails that
refcount is not incremented

3. add-fd fdset=1
- fd is passed via SCM rights
- add-fd monitor function adds the received fd to the specified fdset
(or fails if fdset doesn't exist)
- add-fd monitor function initializes the remove flag for the fd to false
- add-fd returns fdset number and received fd number (e.g fd=4) to caller

4. block-commit
- qemu_open performs reopen by using the first fd from the fdset that
has access flags matching the qemu_open action flags and has remove flag
set to false
- qemu_open increments refcount for the fdset
- Need to make sure that if a command like 'block-commit' fails that
refcount is not incremented

5. remove-fd fdset=1 fd=4
- remove-fd monitor function fails if fdset doesn't exist
- remove-fd monitor function turns on remove flag for fd=4


What was again the reason why we keep removed fds in the fdset at all?


Because if refcount is  0 for the fd set, then the fd could be in use
by a block device.  So we keep it around until refcount is decremented
to zero, at which point it is safe to close.


But then the refcount is associated with the set, not with any particular fd.



Exactly, yes that's what we're doing.  Sorry, I thought that was clear 
in the design overview I sent earlier today.


--
Regards,
Corey


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-09 Thread Corey Bryant



On 07/09/2012 11:05 AM, Corey Bryant wrote:



On 07/09/2012 10:05 AM, Luiz Capitulino wrote:

On Thu, 05 Jul 2012 11:06:56 -0400
Corey Bryant cor...@linux.vnet.ibm.com wrote:




On 07/04/2012 04:09 AM, Kevin Wolf wrote:

Am 03.07.2012 20:21, schrieb Corey Bryant:

On 07/03/2012 02:00 PM, Eric Blake wrote:

On 07/03/2012 11:46 AM, Corey Bryant wrote:



Yes, I think adding a +1 to the refcount for the monitor makes
sense.

I'm a bit unsure how to increment the refcount when a monitor
reconnects
though.  Maybe it is as simple as adding a +1 to each fd's
refcount when
the next QMP monitor connects.


Or maybe delay a +1 until after a 'query-fds' - it is not until the
monitor has reconnected and learned what fds it should be aware of
that
incrementing the refcount again makes sense.  But that would mean
making
'query-fds' track whether this is the first call since the monitor
reconnected, as it shouldn't normally increase refcounts.


This doesn't sound ideal.


Yes, it's less than ideal.


The other alternative is that the monitor never re-increments a
refcount.  Once a monitor disconnects, that fd is lost to the
monitor,
and a reconnected monitor must pass in a new fd to be
re-associated with
the fdset.  In other words, the monitor's use of an fd is a one-way
operation, starting life in use but ending at the first disconnect or
remove-fd.


I would vote for this 2nd alternative.  As long as we're not
introducing
an fd leak.  And I don't think we are if we decrement the refcount on
remove-fd or on QMP disconnect.


In fact, I believe this one is even worse. I can already see hacks like
adding a dummy FD with invalid flags and removing it again just to
regain control over the fdset...

You earlier suggestion made a lot of sense to me: Whenever a new QMP
monitor is connected, increase the refcount. That is, as long as any
monitor is there, don't drop any fdsets unless explicitly requested
via QMP.


Ok.  So refcount would be incremented (for the fd or fdset, whatever we
decide on) when QMP reconnects.  I'm assuming we wouldn't wait until
after a query-fds call.


I'm not sure this is a good idea because we will leak fds if the
client forgets
about the fds when re-connecting (ie. it was restarted) or if a different
client connects to QMP.

If we really want to do that, I think that the right way of doing this
is to
add a command for clients to re-again ownership of the fds on
reconnection.

But to be honest, I don't fully understand why this is needed.



I'm not sure this is an issue with current design.  I know things have
changed a bit as the email threads evolved, so I'll paste the current
design that I am working from.  Please let me know if you still see any
issues.

FD passing:
---
New monitor commands enable adding/removing an fd to/from a set.  New
monitor command query-fdsets enables querying of current monitor fdsets.
  The set of fds should all refer to the same file, with each fd having
different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can then dup
the fd that has the matching access mode flags.

Design points:
--
1. add-fd
- fd is passed via SCM rights and qemu adds fd to first unused fdset
(e.g. /dev/fdset/1)
- add-fd monitor function initializes the monitor inuse flag for the
fdset to true
- add-fd monitor function initializes the remove flag for the fd to false
- add-fd returns fdset number and received fd number (e.g fd=3) to caller

2. drive_add file=/dev/fdset/1
- qemu_open uses the first fd in fdset1 that has access flags matching
the qemu_open action flags and has remove flag set to false
- qemu_open increments refcount for the fdset
- Need to make sure that if a command like 'device-add' fails that
refcount is not incremented

3. add-fd fdset=1
- fd is passed via SCM rights
- add-fd monitor function adds the received fd to the specified fdset
(or fails if fdset doesn't exist)
- add-fd monitor function initializes the remove flag for the fd to false
- add-fd returns fdset number and received fd number (e.g fd=4) to caller

4. block-commit
- qemu_open performs reopen by using the first fd from the fdset that
has access flags matching the qemu_open action flags and has remove flag
set to false
- qemu_open increments refcount for the fdset
- Need to make sure that if a command like 'block-commit' fails that
refcount is not incremented

5. remove-fd fdset=1 fd=4
- remove-fd monitor function fails if fdset doesn't exist
- remove-fd monitor function turns on remove flag for fd=4

6. qemu_close (need to replace all close calls in block layer with
qemu_close)
- qemu_close decrements refcount for fdset
- qemu_close closes all fds that have (refcount == 0  (!inuse ||
remove))
- qemu_close frees the fdset if no fds remain in it

7. disconnecting the QMP monitor
- monitor disconnect visits all fdsets on monitor and turns off monitor
in-use flag for fdset

8. connecting the QMP monitor
- monitor connect visits all fdsets on monitor and turns on monitor
in-use flag for 

Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-06 Thread Kevin Wolf
Am 05.07.2012 19:00, schrieb Eric Blake:
 On 07/05/2012 10:35 AM, Corey Bryant wrote:
 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
 refcount of 0; fd=4's in-use flag is turned on
 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
 so qemu_open() increments the refcount of fdset1 to 1
 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
 longer in-use by the monitor, and is left open because it is in use by
 the block device (refcount is 1)
 4. client crashes, so all tracked fds are visited; fd=4 is not in-use
 but refcount is 1 so it is not closed
 5. client re-establishes QMP connection, so all tracked fds are visited.
  What happens to the fd=4 in-use flag?
 
 ...but what are the semantics here?
 
 If we _always_ turn the in-use flag back on, then that says that even
 though libvirt successfully asked to close the fd, the reconnection
 means that libvirt once again has to ask to close things.
 
 If we _never_ turn the in-use flag back on, then we've broken the first
 case above where we want an in-use fd to come back into use after a crash.
 
 Maybe that argues for two flags per fd: an in-use flag (there is
 currently a monitor connection that can manipulate the fd, either
 because it passed in the fd or because it reconnected) and a removed
 flag (a monitor called remove-fd, and no longer wants to know about the
 fd, even if it crashes and reconnects).  

I was in fact just going to suggest a removed flag as well, however
combined with including the monitor connections in the refcount instead
of an additional flag. This would also allow to have (the currently
mostly hypothetical case of) multiple QMP monitors without interesting
things happening.

Maybe I'm missing some point that the inuse flag would allow and
including monitors in the refcount doesn't. Is there one?

Kevin

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-06 Thread Kevin Wolf
Am 05.07.2012 18:37, schrieb Corey Bryant:
 There is one case I'm aware of where we need to be careful: Before
 opening an image, qemu may probe the format. In this case, the image
 gets opened twice, and the first close comes before the second open.
 I'm
 not entirely sure how hard it would be to get rid of that behaviour.

 If we can't get rid of it, we have a small window that the refcount
 doesn't really cover, and if we weren't careful it would become racy.
 This is why I mentioned earlier that maybe we need to defer the
 refcount
 decrease a bit. However, I can't see how the in-use flag would make a
 difference there. If the refcount can't cover it, the in-use flag can't
 either.

 Yeah this is a problem.  Could we introduce another flag to cover this?

 Adding more refcounts or flags is not a problem, but it doesn't solve it
 either. The hard question is when to set that flag.

 I believe it may be easier to just change the block layer so that it
 opens files only once during bdrv_open().
 
 Can this fix be delivered after the fd passing patch series?

Sure, we can't fix everything at once.

Kevin

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-06 Thread Corey Bryant



On 07/06/2012 05:11 AM, Kevin Wolf wrote:

Am 05.07.2012 19:00, schrieb Eric Blake:

On 07/05/2012 10:35 AM, Corey Bryant wrote:

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount of fdset1 to 1
3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in-use by the monitor, and is left open because it is in use by
the block device (refcount is 1)
4. client crashes, so all tracked fds are visited; fd=4 is not in-use
but refcount is 1 so it is not closed

5. client re-establishes QMP connection, so all tracked fds are visited.
  What happens to the fd=4 in-use flag?

...but what are the semantics here?

If we _always_ turn the in-use flag back on, then that says that even
though libvirt successfully asked to close the fd, the reconnection
means that libvirt once again has to ask to close things.

If we _never_ turn the in-use flag back on, then we've broken the first
case above where we want an in-use fd to come back into use after a crash.

Maybe that argues for two flags per fd: an in-use flag (there is
currently a monitor connection that can manipulate the fd, either
because it passed in the fd or because it reconnected) and a removed
flag (a monitor called remove-fd, and no longer wants to know about the
fd, even if it crashes and reconnects).


I was in fact just going to suggest a removed flag as well, however
combined with including the monitor connections in the refcount instead
of an additional flag. This would also allow to have (the currently
mostly hypothetical case of) multiple QMP monitors without interesting
things happening.

Maybe I'm missing some point that the inuse flag would allow and
including monitors in the refcount doesn't. Is there one?

Kevin



I think we need the granularity of having an in-use flag per fd.  Here's 
an example of why:


1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 1; fd=4's remove flag is initialized to off
2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
qemu_open() increments the refcount of fdset1 to 2
3. client crashes, so all fdsets are visited; fd=4 had not yet been
passed to 'remove-fd', so it's remove flag is off; refcount for fdset1 
is decremented to 1; fd=4 is left open because it is still in use by the 
block device (refcount is 1)
4. client re-establishes QMP connection, refcount for fdset1 is 
incremented to 2; 'query-fds' lets client learn about fd=4 still being 
open as part of fdset1
5. qemu_close is called for fd=4; refcount for fdset1 is decremented to 
1; fd=4 remains open as part of fdset1
6. QMP disconnects; refcount for fdset is decremented to zero;  fd=4 is 
closed and fdset1 is freed


1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 1 (because of monitor reference); fd=4's remove flag is 
initialized to off

2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
but the command fails for some other reason, so the refcount is still 1
at the end of the command (although it may have been temporarily
incremented then decremented during the command)
3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or
QMP connection is closed), so qemu turns on the remove flag for fd=4; 
refcount is 0 so all fds in fdset1 are closed



I think we need the granularity of having an in-use flag per fd.  If we 
track monitor in-use in the reference count, then we are tracking it for 
the fdset and I think this could cause leaks.


In the following example, we have a refcount for the fdset, an in-use 
flag per fd, and a remove flag per fd.  We're only 
incrementing/decrementing refcount in qemu_open/qemu_close.






In the following example, we have a refcount for the fdset, and a remove 
flag per fd.  We're incrementing refcount when the fdset is first 
created, when QMP re-connects, and in qemu_open.  We're decrementing 
refcount when QMP disconnects, and in qemu_close.


1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 1; fd=4's remove flag is initialized to off
2. client crashes, so all tracked fdsets are visited; fd=4 has
not yet been passed to 'remove-fd', so its remove flag is off; in-use
flags are turned off and both fds are left open because the set is still
in use by the block device (refcount is 1)








1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 1; fd=4's remove flag is initialized to off
2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
qemu_open() increments the refcount of fdset1 to 2
3. client crashes, so all fdsets are visited; fd=4 had not yet been
passed to 'remove-fd', so it's remove flag is off; refcount for fdset1 
is decremented to 1; fd=4 is left open because it is still in use by the 
block device (refcount is 1)
4. client re-establishes 

Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-06 Thread Corey Bryant

Ugh... please disregard this.  I hit send accidentally.

On 07/06/2012 01:14 PM, Corey Bryant wrote:



On 07/06/2012 05:11 AM, Kevin Wolf wrote:

Am 05.07.2012 19:00, schrieb Eric Blake:

On 07/05/2012 10:35 AM, Corey Bryant wrote:

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount of fdset1 to 1
3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in-use by the monitor, and is left open because it is in use by
the block device (refcount is 1)
4. client crashes, so all tracked fds are visited; fd=4 is not in-use
but refcount is 1 so it is not closed

5. client re-establishes QMP connection, so all tracked fds are visited.
  What happens to the fd=4 in-use flag?

...but what are the semantics here?

If we _always_ turn the in-use flag back on, then that says that even
though libvirt successfully asked to close the fd, the reconnection
means that libvirt once again has to ask to close things.

If we _never_ turn the in-use flag back on, then we've broken the first
case above where we want an in-use fd to come back into use after a
crash.

Maybe that argues for two flags per fd: an in-use flag (there is
currently a monitor connection that can manipulate the fd, either
because it passed in the fd or because it reconnected) and a removed
flag (a monitor called remove-fd, and no longer wants to know about the
fd, even if it crashes and reconnects).


I was in fact just going to suggest a removed flag as well, however
combined with including the monitor connections in the refcount instead
of an additional flag. This would also allow to have (the currently
mostly hypothetical case of) multiple QMP monitors without interesting
things happening.

Maybe I'm missing some point that the inuse flag would allow and
including monitors in the refcount doesn't. Is there one?

Kevin



I think we need the granularity of having an in-use flag per fd.  Here's
an example of why:

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 1; fd=4's remove flag is initialized to off
2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
qemu_open() increments the refcount of fdset1 to 2
3. client crashes, so all fdsets are visited; fd=4 had not yet been
passed to 'remove-fd', so it's remove flag is off; refcount for fdset1
is decremented to 1; fd=4 is left open because it is still in use by the
block device (refcount is 1)
4. client re-establishes QMP connection, refcount for fdset1 is
incremented to 2; 'query-fds' lets client learn about fd=4 still being
open as part of fdset1
5. qemu_close is called for fd=4; refcount for fdset1 is decremented to
1; fd=4 remains open as part of fdset1
6. QMP disconnects; refcount for fdset is decremented to zero;  fd=4 is
closed and fdset1 is freed

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 1 (because of monitor reference); fd=4's remove flag is
initialized to off
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
but the command fails for some other reason, so the refcount is still 1
at the end of the command (although it may have been temporarily
incremented then decremented during the command)
3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or
QMP connection is closed), so qemu turns on the remove flag for fd=4;
refcount is 0 so all fds in fdset1 are closed


I think we need the granularity of having an in-use flag per fd.  If we
track monitor in-use in the reference count, then we are tracking it for
the fdset and I think this could cause leaks.

In the following example, we have a refcount for the fdset, an in-use
flag per fd, and a remove flag per fd.  We're only
incrementing/decrementing refcount in qemu_open/qemu_close.





In the following example, we have a refcount for the fdset, and a remove
flag per fd.  We're incrementing refcount when the fdset is first
created, when QMP re-connects, and in qemu_open.  We're decrementing
refcount when QMP disconnects, and in qemu_close.

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 1; fd=4's remove flag is initialized to off
2. client crashes, so all tracked fdsets are visited; fd=4 has
not yet been passed to 'remove-fd', so its remove flag is off; in-use
flags are turned off and both fds are left open because the set is still
in use by the block device (refcount is 1)








1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 1; fd=4's remove flag is initialized to off
2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
qemu_open() increments the refcount of fdset1 to 2
3. client crashes, so all fdsets are visited; fd=4 had not yet been
passed to 'remove-fd', so it's remove flag is off; refcount for fdset1
is decremented to 1; fd=4 is left open because it 

Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-06 Thread Corey Bryant



On 07/06/2012 05:11 AM, Kevin Wolf wrote:

Am 05.07.2012 19:00, schrieb Eric Blake:

On 07/05/2012 10:35 AM, Corey Bryant wrote:

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount of fdset1 to 1
3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in-use by the monitor, and is left open because it is in use by
the block device (refcount is 1)
4. client crashes, so all tracked fds are visited; fd=4 is not in-use
but refcount is 1 so it is not closed

5. client re-establishes QMP connection, so all tracked fds are visited.
  What happens to the fd=4 in-use flag?

...but what are the semantics here?

If we _always_ turn the in-use flag back on, then that says that even
though libvirt successfully asked to close the fd, the reconnection
means that libvirt once again has to ask to close things.

If we _never_ turn the in-use flag back on, then we've broken the first
case above where we want an in-use fd to come back into use after a crash.

Maybe that argues for two flags per fd: an in-use flag (there is
currently a monitor connection that can manipulate the fd, either
because it passed in the fd or because it reconnected) and a removed
flag (a monitor called remove-fd, and no longer wants to know about the
fd, even if it crashes and reconnects).


I was in fact just going to suggest a removed flag as well, however
combined with including the monitor connections in the refcount instead
of an additional flag. This would also allow to have (the currently
mostly hypothetical case of) multiple QMP monitors without interesting
things happening.

Maybe I'm missing some point that the inuse flag would allow and
including monitors in the refcount doesn't. Is there one?

Kevin



Ok let me try this again. I was going through some of the examples and I 
think we need a separate in-use flag.  Otherwise, when refcount gets to 
1, we don't know if it is because of a monitor reference or a block 
device reference.  I think it would cause fds to sit on the monitor 
until refcount gets to zero (monitor disconnects). Here's an example 
without the in-use flag:


1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 1 (incremented because of monitor reference); fd=4's remove 
flag is initialized to off

2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
qemu_open() increments the refcount of fdset1 to 2
3. client crashes, so all fdsets are visited; fd=4 had not yet been
passed to 'remove-fd', so it's remove flag is off; refcount for fdset1 
is decremented to 1; fd=4 is left open because it is still in use by the 
block device (refcount is 1)
4. client re-establishes QMP connection, refcount for fdset1 is 
incremented to 2; 'query-fds' lets client learn about fd=4 still being 
open as part of fdset1
5. client calls 'remove-fd fdset=1 fd=4'; qemu turns on remove flag for 
fd=4; but fd=4 remains open because refcount of fdset1 is 2
6. qemu_close is called for fd=4; refcount for fdset1 is decremented to 
1; fd=4 remains open because monitor still references fdset1 (refcount 
of fdset1 is 1)
7. Sometime later.. QMP disconnects; refcount for fdset is decremented 
to zero; fd=4 is closed


In the following case, we have an in-use and remove flag per fd and we 
only increment/decrement refcount on qemu_open/qemu_close:


1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 0; fd=4's remove flag is initialized to off and in-use flag 
is initialized to on

2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
qemu_open() increments the refcount of fdset1 to 1
3. client crashes, so all fdsets are visited; fd=4 had not yet been
passed to 'remove-fd', so it's remove flag is off; fd=4's in-use flag is 
turned off; fd=4 is left open because it is still in-use by the block 
device (refcount is still 1)
4. client re-establishes QMP connection, refcount for fdset1 is still 1; 
fd=4's in-use flag is turned on; 'query-fds' lets client learn about 
fd=4 still being open as part of fdset1
5. client calls 'remove-fd fdset=1 fd=4'; qemu turns on remove flag for 
fd=4; but fd=4 remains open because refcount of fdset1 is 1
6. qemu_close is called for fd=4; refcount for fdset1 is decremented to 
0; fd=4 is closed because (refcount == 0  (!inuse || removed)) is true

7. Sometime later.. QMP disconnects

--
Regards,
Corey


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-06 Thread Corey Bryant



On 07/06/2012 01:40 PM, Corey Bryant wrote:



On 07/06/2012 05:11 AM, Kevin Wolf wrote:

Am 05.07.2012 19:00, schrieb Eric Blake:

On 07/05/2012 10:35 AM, Corey Bryant wrote:

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount of fdset1 to 1
3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in-use by the monitor, and is left open because it is in use by
the block device (refcount is 1)
4. client crashes, so all tracked fds are visited; fd=4 is not in-use
but refcount is 1 so it is not closed

5. client re-establishes QMP connection, so all tracked fds are visited.
  What happens to the fd=4 in-use flag?

...but what are the semantics here?

If we _always_ turn the in-use flag back on, then that says that even
though libvirt successfully asked to close the fd, the reconnection
means that libvirt once again has to ask to close things.

If we _never_ turn the in-use flag back on, then we've broken the first
case above where we want an in-use fd to come back into use after a
crash.

Maybe that argues for two flags per fd: an in-use flag (there is
currently a monitor connection that can manipulate the fd, either
because it passed in the fd or because it reconnected) and a removed
flag (a monitor called remove-fd, and no longer wants to know about the
fd, even if it crashes and reconnects).


I was in fact just going to suggest a removed flag as well, however
combined with including the monitor connections in the refcount instead
of an additional flag. This would also allow to have (the currently
mostly hypothetical case of) multiple QMP monitors without interesting
things happening.

Maybe I'm missing some point that the inuse flag would allow and
including monitors in the refcount doesn't. Is there one?

Kevin



Ok let me try this again. I was going through some of the examples and I
think we need a separate in-use flag.  Otherwise, when refcount gets to
1, we don't know if it is because of a monitor reference or a block
device reference.  I think it would cause fds to sit on the monitor
until refcount gets to zero (monitor disconnects). Here's an example
without the in-use flag:

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 1 (incremented because of monitor reference); fd=4's remove
flag is initialized to off
2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
qemu_open() increments the refcount of fdset1 to 2
3. client crashes, so all fdsets are visited; fd=4 had not yet been
passed to 'remove-fd', so it's remove flag is off; refcount for fdset1
is decremented to 1; fd=4 is left open because it is still in use by the
block device (refcount is 1)
4. client re-establishes QMP connection, refcount for fdset1 is
incremented to 2; 'query-fds' lets client learn about fd=4 still being
open as part of fdset1
5. client calls 'remove-fd fdset=1 fd=4'; qemu turns on remove flag for
fd=4; but fd=4 remains open because refcount of fdset1 is 2
6. qemu_close is called for fd=4; refcount for fdset1 is decremented to
1; fd=4 remains open because monitor still references fdset1 (refcount
of fdset1 is 1)
7. Sometime later.. QMP disconnects; refcount for fdset is decremented
to zero; fd=4 is closed

In the following case, we have an in-use and remove flag per fd and we
only increment/decrement refcount on qemu_open/qemu_close:

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 0; fd=4's remove flag is initialized to off and in-use flag
is initialized to on
2. client calls 'device-add' with /dev/fdset/1 as the backing filename;
qemu_open() increments the refcount of fdset1 to 1
3. client crashes, so all fdsets are visited; fd=4 had not yet been
passed to 'remove-fd', so it's remove flag is off; fd=4's in-use flag is
turned off; fd=4 is left open because it is still in-use by the block
device (refcount is still 1)
4. client re-establishes QMP connection, refcount for fdset1 is still 1;
fd=4's in-use flag is turned on; 'query-fds' lets client learn about
fd=4 still being open as part of fdset1
5. client calls 'remove-fd fdset=1 fd=4'; qemu turns on remove flag for
fd=4; but fd=4 remains open because refcount of fdset1 is 1
6. qemu_close is called for fd=4; refcount for fdset1 is decremented to
0; fd=4 is closed because (refcount == 0  (!inuse || removed)) is true
7. Sometime later.. QMP disconnects



I have one modification to this.  It looks the inuse flag could be at 
the fdset level, rather than an inuse flag per fd.  Although.. you did 
mention multiple monitors, so perhaps inuse should be a counter?


--
Regards,
Corey


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-05 Thread Corey Bryant



On 07/04/2012 04:00 AM, Kevin Wolf wrote:

Am 03.07.2012 19:03, schrieb Eric Blake:

2. drive_add file=/dev/fdset/1 - qemu_open uses the first fd from the
set that has access flags matching the qemu_open action flags.
qemu_open increments refcount for this fd.
3. add-fd /dev/fdset/1 FDSET={M} - qemu adds fd to set named
/dev/fdset/1 - command returns qemu fd to caller (e.g fd=5).  libvirt
in-use flag turned on for fd.
3. block-commit - qemu_open reopens /dev/fdset/1 by using the first
fd from the set that has access flags matching the qemu_open action
flags.  qemu_open increments refcount for this fd.
4. remove-fd /dev/fdset/1 5 - caller requests fd==5 be removed from the
set.  turns libvirt in-use flag off marking the fd ready to be closed
when qemu is done with it.


If we decided to not return the individual fd numbers to libvirt, file
descriptors would be uniquely identified by an fdset/flags pair here.



Are you saying we'd pass the fdset name and flags parameters on
remove-fd to somehow identify the fds to remove?


Passing the flag parameters is not trivial, as that would mean the QMP
code would have to define constants mapping to all of the O_* flags that
qemu_open supports.  It's easier to support closing by fd number.


Good point. So pass-fd or whatever it will be called needs to returnn
both fdset number and the individual fd number.


5. qemu_close decrements refcount for fd, and closes fd when refcount is
zero and libvirt in use flag is off.


The monitor could just hold another reference, then we save the
additional flag. But that's a qemu implementation detail.



I'm not sure I understand what you mean.


pass-fd (or add-fd, whatever name we give it) adds an fd to an fdset,
with initial use count of 1 (the use is the monitor).  qemu_open()
increments the use count.  A new qemu_close() wrapper would decrement
the use count.  And both calling 'remove-fd', or closing the QMP monitor
of an fd that has not yet been passed through 'remove-fd', serves as a
way to decrement the use count.  You'd still have to track whether the
monitor is using an fd (to avoid over-decrementing on QMP monitor
close), but by having the monitor's use also tracked under the refcount,
then refcount reaching 0 is sufficient to auto-close an fd.



I think
that also means that re-establishing the client QMP connection would
increment


This is an interesting idea. I've never thought about what to do with a
reconnect. It felt a bit odd at first, but now your proposal totally
makes sense to me.



For some examples:

1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
use by monitor, as member of fdset1
2. client crashes, so all tracked fds are visited; fd=4 had not yet been
passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is
now 0 so qemu closes it


Doesn't the refcount belong to an fdset rather than a single fd? As long
as the fdset has still a reference, it's possible to reach the other fds
in the set through a reopen. For example:

1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) as a member
of fdset1, in use by monitor, refcount 1
2. client calls 'add-fd', qemu is now tracing fd=5 (O_RDONLY) as a
member of fdset, in use by monitor, refcount is still 1
3. client calls 'device-add' with /dev/fdset/1 as the backing filename
and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 2
4. client crashes, so all tracked fdsets are visited; fdset1 gets its
refcount decreased to 1, and both member fds 4 and 5 stay open.

If you had refcounted a single fd, fd 5 would be closed now and qemu
couldn't switch to O_RDONLY any more.



If the O_RDONLY is for a future command, it would make more sense to add 
that fd before that future command.  Or are you passing it at step 2 
because it is needed for the probe re-open issue?



1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
use by monitor, as member of fdset1
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount to 2
3. client crashes, so all tracked fds are visited; fd=4 had not yet been
passed to 'remove-fd', so qemu decrements refcount to 1, but leaves fd=4
open because it is still in use by the block device
4. client re-establishes QMP connection, and 'query-fds' lets client
learn about fd=4 still being open as part of fdset1, but also informs
client that fd is not in use by the monitor

1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
use by monitor, as member of fdset1
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount to 2
3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in use by the monitor, refcount decremented to 1 but still left
open because it is in use by the block device
4. client crashes, so all tracked fds are visited; but fd=4 is already
marked as not in use by the monitor, so its refcount is unchanged

1. client 

Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-05 Thread Kevin Wolf
Am 05.07.2012 16:22, schrieb Corey Bryant:
 For some examples:

 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
 use by monitor, as member of fdset1
 2. client crashes, so all tracked fds are visited; fd=4 had not yet been
 passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is
 now 0 so qemu closes it

 Doesn't the refcount belong to an fdset rather than a single fd? As long
 as the fdset has still a reference, it's possible to reach the other fds
 in the set through a reopen. For example:

 1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) as a member
 of fdset1, in use by monitor, refcount 1
 2. client calls 'add-fd', qemu is now tracing fd=5 (O_RDONLY) as a
 member of fdset, in use by monitor, refcount is still 1
 3. client calls 'device-add' with /dev/fdset/1 as the backing filename
 and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 2
 4. client crashes, so all tracked fdsets are visited; fdset1 gets its
 refcount decreased to 1, and both member fds 4 and 5 stay open.

 If you had refcounted a single fd, fd 5 would be closed now and qemu
 couldn't switch to O_RDONLY any more.

 
 If the O_RDONLY is for a future command, it would make more sense to add 
 that fd before that future command.  Or are you passing it at step 2 
 because it is needed for the probe re-open issue?

Come on, requiring realistic examples isn't fair. ;-)

Swap steps 2 and 3 in the example, then step 3 is just the preparation
for a command that uses the new fd. The problem stays the same. Or a
live commit operation could be in flight so that qemu must be able to
switch on completion without libvirt interaction.

There are enough reasons that an fd in an fdset exists and is not
opened, but I can't think of one why it should be dropped.

 In use by whom? If it's still in use in qemu (as in in-use flag would
 be set) and we have a refcount of zero, then that's a bug.


 In use by qemu.  I don't think it's a bug.  I think there are situations
 where refcount gets to zero but qemu is still using the fd.

 I think the refcount being non-zero _is_ what defines an fd as being in
 use by qemu (including use by the monitor).  Any place you have to close
 an fd before reopening it is dangerous; the safe way is always to open
 with the new permissions before closing the old permissions.

 There is one case I'm aware of where we need to be careful: Before
 opening an image, qemu may probe the format. In this case, the image
 gets opened twice, and the first close comes before the second open. I'm
 not entirely sure how hard it would be to get rid of that behaviour.

 If we can't get rid of it, we have a small window that the refcount
 doesn't really cover, and if we weren't careful it would become racy.
 This is why I mentioned earlier that maybe we need to defer the refcount
 decrease a bit. However, I can't see how the in-use flag would make a
 difference there. If the refcount can't cover it, the in-use flag can't
 either.
 
 Yeah this is a problem.  Could we introduce another flag to cover this?

Adding more refcounts or flags is not a problem, but it doesn't solve it
either. The hard question is when to set that flag.

I believe it may be easier to just change the block layer so that it
opens files only once during bdrv_open().

Kevin

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-05 Thread Corey Bryant



On 07/04/2012 04:09 AM, Kevin Wolf wrote:

Am 03.07.2012 20:21, schrieb Corey Bryant:

On 07/03/2012 02:00 PM, Eric Blake wrote:

On 07/03/2012 11:46 AM, Corey Bryant wrote:



Yes, I think adding a +1 to the refcount for the monitor makes sense.

I'm a bit unsure how to increment the refcount when a monitor reconnects
though.  Maybe it is as simple as adding a +1 to each fd's refcount when
the next QMP monitor connects.


Or maybe delay a +1 until after a 'query-fds' - it is not until the
monitor has reconnected and learned what fds it should be aware of that
incrementing the refcount again makes sense.  But that would mean making
'query-fds' track whether this is the first call since the monitor
reconnected, as it shouldn't normally increase refcounts.


This doesn't sound ideal.


Yes, it's less than ideal.


The other alternative is that the monitor never re-increments a
refcount.  Once a monitor disconnects, that fd is lost to the monitor,
and a reconnected monitor must pass in a new fd to be re-associated with
the fdset.  In other words, the monitor's use of an fd is a one-way
operation, starting life in use but ending at the first disconnect or
remove-fd.


I would vote for this 2nd alternative.  As long as we're not introducing
an fd leak.  And I don't think we are if we decrement the refcount on
remove-fd or on QMP disconnect.


In fact, I believe this one is even worse. I can already see hacks like
adding a dummy FD with invalid flags and removing it again just to
regain control over the fdset...

You earlier suggestion made a lot of sense to me: Whenever a new QMP
monitor is connected, increase the refcount. That is, as long as any
monitor is there, don't drop any fdsets unless explicitly requested via QMP.


Ok.  So refcount would be incremented (for the fd or fdset, whatever we 
decide on) when QMP reconnects.  I'm assuming we wouldn't wait until 
after a query-fds call.


I'll go with this approach until someone objects.

--
Regards,
Corey


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-05 Thread Corey Bryant



On 07/05/2012 10:51 AM, Kevin Wolf wrote:

Am 05.07.2012 16:22, schrieb Corey Bryant:

For some examples:

1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
use by monitor, as member of fdset1
2. client crashes, so all tracked fds are visited; fd=4 had not yet been
passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is
now 0 so qemu closes it


Doesn't the refcount belong to an fdset rather than a single fd? As long
as the fdset has still a reference, it's possible to reach the other fds
in the set through a reopen. For example:

1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) as a member
of fdset1, in use by monitor, refcount 1
2. client calls 'add-fd', qemu is now tracing fd=5 (O_RDONLY) as a
member of fdset, in use by monitor, refcount is still 1
3. client calls 'device-add' with /dev/fdset/1 as the backing filename
and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 2
4. client crashes, so all tracked fdsets are visited; fdset1 gets its
refcount decreased to 1, and both member fds 4 and 5 stay open.

If you had refcounted a single fd, fd 5 would be closed now and qemu
couldn't switch to O_RDONLY any more.



If the O_RDONLY is for a future command, it would make more sense to add
that fd before that future command.  Or are you passing it at step 2
because it is needed for the probe re-open issue?


Come on, requiring realistic examples isn't fair. ;-)


Heh :)



Swap steps 2 and 3 in the example, then step 3 is just the preparation
for a command that uses the new fd. The problem stays the same. Or a
live commit operation could be in flight so that qemu must be able to
switch on completion without libvirt interaction.


Good point.

I thought it would be useful to run through the examples again with each 
fdset having a refcount, and each fd having an in-use flag.  One 
difference though is that refcount is only incremented/decremented when 
qemu_open/qemu_close are called.  I think this works and covers Kevin's 
concerns.  Actually it might be a bit simpler too.


1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with 
refcount of 0; fd=4's in-use flag is turned on

2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount of fdset1 to 1
3. client crashes, so all tracked fds are visited; fd=4 had not yet been
passed to 'remove-fd', so it's in-use flag is on;  in-use flag is turned 
off and fd=4 is left open because it is still in use by the block device 
(refcount is 1)
4. client re-establishes QMP connection, so all tracked fds are visited, 
and in-use flags are turned back on; 'query-fds' lets client learn about 
fd=4 still being open as part of fdset1


1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with 
refcount of 0; fd=4's in-use flag is turned on

2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount of fdset1 to 1
3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in-use by the monitor, and is left open because it is in use by 
the block device (refcount is 1)
4. client crashes, so all tracked fds are visited; fd=4 is not in-use 
but refcount is 1 so it is not closed


1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with 
refcount of 0; fd=4's in-use flag is turned on

2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
but the command fails for some other reason, so the refcount is still 0
at the end of the command (although it may have been temporarily
incremented then decremented during the command)
3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or
QMP connection is closed), so qemu marks fd=4 as no longer in-use by the 
monitor; refcount is 0 so all fds in fdset1 are closed


1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with 
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with 
refcount still 0; fd=5's in-use flag is turned on
3. client crashes, so all tracked fds are visited; fd=4 and fd=5 had not 
yet been passed to 'remove-fd', so their in-use flags are on; in-use 
flags are turned off; refcount of fdset1 is 0 so qemu closes all fds in 
fdset1


1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with 
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with 
refcount still 0; fd=5's in-use flag is turned on

3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in-use by the monitor, and fd=4 is closed since the refcount is 
0; fd=5 remains open


1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) in fdset1 
with refcount of 0; fd=4's in-use flag is turned on
2. client calls 'add-fd', qemu is now tracking fd=5 (O_RDONLY) in fdset1 
with refcount still 0; fd=5's in-use flag is turned on
3. client calls 'device-add' with /dev/fdset/1 

Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-05 Thread Corey Bryant



On 07/05/2012 12:35 PM, Corey Bryant wrote:



On 07/05/2012 10:51 AM, Kevin Wolf wrote:

Am 05.07.2012 16:22, schrieb Corey Bryant:

For some examples:

1. client calls 'add-fd', qemu is now tracking fd=4 with refcount
1, in
use by monitor, as member of fdset1
2. client crashes, so all tracked fds are visited; fd=4 had not yet
been
passed to 'remove-fd', so qemu decrements refcount; refcount of
fd=4 is
now 0 so qemu closes it


Doesn't the refcount belong to an fdset rather than a single fd? As
long
as the fdset has still a reference, it's possible to reach the other
fds
in the set through a reopen. For example:

1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) as a
member
of fdset1, in use by monitor, refcount 1
2. client calls 'add-fd', qemu is now tracing fd=5 (O_RDONLY) as a
member of fdset, in use by monitor, refcount is still 1
3. client calls 'device-add' with /dev/fdset/1 as the backing filename
and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 2
4. client crashes, so all tracked fdsets are visited; fdset1 gets its
refcount decreased to 1, and both member fds 4 and 5 stay open.

If you had refcounted a single fd, fd 5 would be closed now and qemu
couldn't switch to O_RDONLY any more.



If the O_RDONLY is for a future command, it would make more sense to add
that fd before that future command.  Or are you passing it at step 2
because it is needed for the probe re-open issue?


Come on, requiring realistic examples isn't fair. ;-)


Heh :)



Swap steps 2 and 3 in the example, then step 3 is just the preparation
for a command that uses the new fd. The problem stays the same. Or a
live commit operation could be in flight so that qemu must be able to
switch on completion without libvirt interaction.


Good point.

I thought it would be useful to run through the examples again with each
fdset having a refcount, and each fd having an in-use flag.  One
difference though is that refcount is only incremented/decremented when
qemu_open/qemu_close are called.  I think this works and covers Kevin's
concerns.  Actually it might be a bit simpler too.

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount of fdset1 to 1
3. client crashes, so all tracked fds are visited; fd=4 had not yet been
passed to 'remove-fd', so it's in-use flag is on;  in-use flag is turned
off and fd=4 is left open because it is still in use by the block device
(refcount is 1)
4. client re-establishes QMP connection, so all tracked fds are visited,
and in-use flags are turned back on; 'query-fds' lets client learn about
fd=4 still being open as part of fdset1

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount of fdset1 to 1
3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in-use by the monitor, and is left open because it is in use by
the block device (refcount is 1)
4. client crashes, so all tracked fds are visited; fd=4 is not in-use
but refcount is 1 so it is not closed

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
but the command fails for some other reason, so the refcount is still 0
at the end of the command (although it may have been temporarily
incremented then decremented during the command)
3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or
QMP connection is closed), so qemu marks fd=4 as no longer in-use by the
monitor; refcount is 0 so all fds in fdset1 are closed

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with
refcount still 0; fd=5's in-use flag is turned on
3. client crashes, so all tracked fds are visited; fd=4 and fd=5 had not
yet been passed to 'remove-fd', so their in-use flags are on; in-use
flags are turned off; refcount of fdset1 is 0 so qemu closes all fds in
fdset1

1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'add-fd', qemu is now tracking fd=5 in fdset1 with
refcount still 0; fd=5's in-use flag is turned on
3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in-use by the monitor, and fd=4 is closed since the refcount is
0; fd=5 remains open

1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) in fdset1
with refcount of 0; fd=4's in-use flag is turned on
2. client calls 'add-fd', qemu is now tracking fd=5 (O_RDONLY) in fdset1
with refcount still 0; fd=5's in-use flag is turned on
3. client calls 'device-add' with 

Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-05 Thread Eric Blake
On 07/05/2012 10:35 AM, Corey Bryant wrote:

 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
 refcount of 0; fd=4's in-use flag is turned on
 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
 so qemu_open() increments the refcount of fdset1 to 1
 3. client crashes, so all tracked fds are visited; fd=4 had not yet been
 passed to 'remove-fd', so it's in-use flag is on;  in-use flag is turned
 off and fd=4 is left open because it is still in use by the block device
 (refcount is 1)
 4. client re-establishes QMP connection, so all tracked fds are visited,
 and in-use flags are turned back on; 'query-fds' lets client learn about
 fd=4 still being open as part of fdset1

This says that an in-use fd comes back into use after a crash...

 
 1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
 refcount of 0; fd=4's in-use flag is turned on
 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
 so qemu_open() increments the refcount of fdset1 to 1
 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
 longer in-use by the monitor, and is left open because it is in use by
 the block device (refcount is 1)
 4. client crashes, so all tracked fds are visited; fd=4 is not in-use
 but refcount is 1 so it is not closed
5. client re-establishes QMP connection, so all tracked fds are visited.
 What happens to the fd=4 in-use flag?

...but what are the semantics here?

If we _always_ turn the in-use flag back on, then that says that even
though libvirt successfully asked to close the fd, the reconnection
means that libvirt once again has to ask to close things.

If we _never_ turn the in-use flag back on, then we've broken the first
case above where we want an in-use fd to come back into use after a crash.

Maybe that argues for two flags per fd: an in-use flag (there is
currently a monitor connection that can manipulate the fd, either
because it passed in the fd or because it reconnected) and a removed
flag (a monitor called remove-fd, and no longer wants to know about the
fd, even if it crashes and reconnects).  An fd is then closed when
'refcount == 0  (!inuse || removed)'.  On monitor disconnect, the
inuse flag is cleared, and on reconnect, it is set; but that does not
impact the removed flag.  And the 'query-fds' command would omit any fd
with the 'removed' flag, even when the fd is still kept open internally
thanks to the refcount being nonzero.

But I'm definitely agreeing that tying the refcount to the set rather
than to individual fds within the set makes sense; you still avoid the
fd leak in that all fds in the set are closed when both the monitor has
disavowed the set and when qemu_close() has finished using any of the fds.

-- 
Eric Blake   ebl...@redhat.com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





signature.asc
Description: OpenPGP digital signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-05 Thread Corey Bryant



On 07/05/2012 01:00 PM, Eric Blake wrote:

On 07/05/2012 10:35 AM, Corey Bryant wrote:


1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount of fdset1 to 1
3. client crashes, so all tracked fds are visited; fd=4 had not yet been
passed to 'remove-fd', so it's in-use flag is on;  in-use flag is turned
off and fd=4 is left open because it is still in use by the block device
(refcount is 1)
4. client re-establishes QMP connection, so all tracked fds are visited,
and in-use flags are turned back on; 'query-fds' lets client learn about
fd=4 still being open as part of fdset1


This says that an in-use fd comes back into use after a crash...



1. client calls 'add-fd', qemu is now tracking fd=4 in fdset1 with
refcount of 0; fd=4's in-use flag is turned on
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount of fdset1 to 1
3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in-use by the monitor, and is left open because it is in use by
the block device (refcount is 1)
4. client crashes, so all tracked fds are visited; fd=4 is not in-use
but refcount is 1 so it is not closed

5. client re-establishes QMP connection, so all tracked fds are visited.
  What happens to the fd=4 in-use flag?

...but what are the semantics here?

If we _always_ turn the in-use flag back on, then that says that even
though libvirt successfully asked to close the fd, the reconnection
means that libvirt once again has to ask to close things.



Ah, yeah I missed that.


If we _never_ turn the in-use flag back on, then we've broken the first
case above where we want an in-use fd to come back into use after a crash.

Maybe that argues for two flags per fd: an in-use flag (there is
currently a monitor connection that can manipulate the fd, either
because it passed in the fd or because it reconnected) and a removed
flag (a monitor called remove-fd, and no longer wants to know about the
fd, even if it crashes and reconnects).  An fd is then closed when
'refcount == 0  (!inuse || removed)'.  On monitor disconnect, the
inuse flag is cleared, and on reconnect, it is set; but that does not
impact the removed flag.  And the 'query-fds' command would omit any fd
with the 'removed' flag, even when the fd is still kept open internally
thanks to the refcount being nonzero.



I agree.  Having the 2 flags makes sense and solves the issues you 
mentioned above.



--
Regards,
Corey



But I'm definitely agreeing that tying the refcount to the set rather
than to individual fds within the set makes sense; you still avoid the
fd leak in that all fds in the set are closed when both the monitor has
disavowed the set and when qemu_close() has finished using any of the fds.



--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-04 Thread Kevin Wolf
Am 03.07.2012 19:03, schrieb Eric Blake:
 2. drive_add file=/dev/fdset/1 - qemu_open uses the first fd from the
 set that has access flags matching the qemu_open action flags.
 qemu_open increments refcount for this fd.
 3. add-fd /dev/fdset/1 FDSET={M} - qemu adds fd to set named
 /dev/fdset/1 - command returns qemu fd to caller (e.g fd=5).  libvirt
 in-use flag turned on for fd.
 3. block-commit - qemu_open reopens /dev/fdset/1 by using the first
 fd from the set that has access flags matching the qemu_open action
 flags.  qemu_open increments refcount for this fd.
 4. remove-fd /dev/fdset/1 5 - caller requests fd==5 be removed from the
 set.  turns libvirt in-use flag off marking the fd ready to be closed
 when qemu is done with it.

 If we decided to not return the individual fd numbers to libvirt, file
 descriptors would be uniquely identified by an fdset/flags pair here.


 Are you saying we'd pass the fdset name and flags parameters on
 remove-fd to somehow identify the fds to remove?
 
 Passing the flag parameters is not trivial, as that would mean the QMP
 code would have to define constants mapping to all of the O_* flags that
 qemu_open supports.  It's easier to support closing by fd number.

Good point. So pass-fd or whatever it will be called needs to returnn
both fdset number and the individual fd number.

 5. qemu_close decrements refcount for fd, and closes fd when refcount is
 zero and libvirt in use flag is off.

 The monitor could just hold another reference, then we save the
 additional flag. But that's a qemu implementation detail.


 I'm not sure I understand what you mean.
 
 pass-fd (or add-fd, whatever name we give it) adds an fd to an fdset,
 with initial use count of 1 (the use is the monitor).  qemu_open()
 increments the use count.  A new qemu_close() wrapper would decrement
 the use count.  And both calling 'remove-fd', or closing the QMP monitor
 of an fd that has not yet been passed through 'remove-fd', serves as a
 way to decrement the use count.  You'd still have to track whether the
 monitor is using an fd (to avoid over-decrementing on QMP monitor
 close), but by having the monitor's use also tracked under the refcount,
 then refcount reaching 0 is sufficient to auto-close an fd.  

 I think
 that also means that re-establishing the client QMP connection would
 increment 

This is an interesting idea. I've never thought about what to do with a
reconnect. It felt a bit odd at first, but now your proposal totally
makes sense to me.


 For some examples:
 
 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
 use by monitor, as member of fdset1
 2. client crashes, so all tracked fds are visited; fd=4 had not yet been
 passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is
 now 0 so qemu closes it

Doesn't the refcount belong to an fdset rather than a single fd? As long
as the fdset has still a reference, it's possible to reach the other fds
in the set through a reopen. For example:

1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) as a member
of fdset1, in use by monitor, refcount 1
2. client calls 'add-fd', qemu is now tracing fd=5 (O_RDONLY) as a
member of fdset, in use by monitor, refcount is still 1
3. client calls 'device-add' with /dev/fdset/1 as the backing filename
and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 2
4. client crashes, so all tracked fdsets are visited; fdset1 gets its
refcount decreased to 1, and both member fds 4 and 5 stay open.

If you had refcounted a single fd, fd 5 would be closed now and qemu
couldn't switch to O_RDONLY any more.

 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
 use by monitor, as member of fdset1
 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
 so qemu_open() increments the refcount to 2
 3. client crashes, so all tracked fds are visited; fd=4 had not yet been
 passed to 'remove-fd', so qemu decrements refcount to 1, but leaves fd=4
 open because it is still in use by the block device
 4. client re-establishes QMP connection, and 'query-fds' lets client
 learn about fd=4 still being open as part of fdset1, but also informs
 client that fd is not in use by the monitor
 
 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
 use by monitor, as member of fdset1
 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
 so qemu_open() increments the refcount to 2
 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
 longer in use by the monitor, refcount decremented to 1 but still left
 open because it is in use by the block device
 4. client crashes, so all tracked fds are visited; but fd=4 is already
 marked as not in use by the monitor, so its refcount is unchanged
 
 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
 use by monitor, as member of fdset1
 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
 but the command 

Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-04 Thread Kevin Wolf
Am 03.07.2012 20:21, schrieb Corey Bryant:
 On 07/03/2012 02:00 PM, Eric Blake wrote:
 On 07/03/2012 11:46 AM, Corey Bryant wrote:


 Yes, I think adding a +1 to the refcount for the monitor makes sense.

 I'm a bit unsure how to increment the refcount when a monitor reconnects
 though.  Maybe it is as simple as adding a +1 to each fd's refcount when
 the next QMP monitor connects.

 Or maybe delay a +1 until after a 'query-fds' - it is not until the
 monitor has reconnected and learned what fds it should be aware of that
 incrementing the refcount again makes sense.  But that would mean making
 'query-fds' track whether this is the first call since the monitor
 reconnected, as it shouldn't normally increase refcounts.
 
 This doesn't sound ideal.

Yes, it's less than ideal.

 The other alternative is that the monitor never re-increments a
 refcount.  Once a monitor disconnects, that fd is lost to the monitor,
 and a reconnected monitor must pass in a new fd to be re-associated with
 the fdset.  In other words, the monitor's use of an fd is a one-way
 operation, starting life in use but ending at the first disconnect or
 remove-fd.
 
 I would vote for this 2nd alternative.  As long as we're not introducing 
 an fd leak.  And I don't think we are if we decrement the refcount on 
 remove-fd or on QMP disconnect.

In fact, I believe this one is even worse. I can already see hacks like
adding a dummy FD with invalid flags and removing it again just to
regain control over the fdset...

You earlier suggestion made a lot of sense to me: Whenever a new QMP
monitor is connected, increase the refcount. That is, as long as any
monitor is there, don't drop any fdsets unless explicitly requested via QMP.

Kevin

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-03 Thread Daniel P. Berrange
On Mon, Jul 02, 2012 at 04:31:09PM -0600, Eric Blake wrote:
 On 07/02/2012 04:02 PM, Corey Bryant wrote:
 
  Here's another option that Kevin and I discussed today on IRC.  I've
  modified a few minor details since the discussion. And Kevin please
  correct me if anything is wrong.
  
  Proposal Four: Pass a set of fds via 'pass-fds'.  The group of fds
  should all refer to the same file, but may have different access flags
  (ie. O_RDWR, O_RDONLY).  qemu_open can then dup the fd that has the
  matching access mode flags.
 
 But this means that libvirt has to open a file O_RDWR up front for any
 file that it _might_ need qemu to reopen later, and that qemu is now
 hanging on to 2 fds per fdset instead of 1 fd for the life of any client
 of the fdset.
 
 I see no reason why libvirt can't pass in an O_RDWR fd when qemu only
 needs to use an O_RDONLY fd;

If libvirt has only granted read-only access to the file with sVirt, then
passing a O_RDWR file handle to QEMU will result in an SELinux denial,
even if QEMU doesn't try to do I/O on it. So this is out of the question.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-03 Thread Kevin Wolf
Am 03.07.2012 00:31, schrieb Eric Blake:
 On 07/02/2012 04:02 PM, Corey Bryant wrote:
 
 Here's another option that Kevin and I discussed today on IRC.  I've
 modified a few minor details since the discussion. And Kevin please
 correct me if anything is wrong.

 Proposal Four: Pass a set of fds via 'pass-fds'.  The group of fds
 should all refer to the same file, but may have different access flags
 (ie. O_RDWR, O_RDONLY).  qemu_open can then dup the fd that has the
 matching access mode flags.
 
 But this means that libvirt has to open a file O_RDWR up front for any
 file that it _might_ need qemu to reopen later, and that qemu is now
 hanging on to 2 fds per fdset instead of 1 fd for the life of any client
 of the fdset.

It doesn't have to be up front. There's no reason not to have monitor
commands that add or remove fds from a given fdset later.

 I see no reason why libvirt can't pass in an O_RDWR fd when qemu only
 needs to use an O_RDONLY fd; making qemu insist on an O_RDONLY fd makes
 life harder for libvirt.  On the other hand, I can see from a safety
 standpoint that passing in an O_RDWR fd opens up more potential for
 errors than passing in an O_RDONLY fd, 

Yes, this is exactly my consideration. Having a read-only file opened as
O_RDWR gives us the chance to make stupid mistakes.

 but if you don't know up front
 whether you will ever need to write into a file, then it would be better
 to pass in O_RDONLY.  At least I don't see it as a security risk:
 passing in O_RDWR but setting SELinux permissions on the fd to only
 allow read() but not write() via the labeling of the fd may be possible,
 so that libvirt could still prevent accidental writes into an O_RDWR
 file that starts life only needing to service reads.

But this would assume that libvirt knows exactly when a reopen happens,
for each current and future qemu version. I wouldn't want to tie qemu's
internals so closely to the management application, even if libvirt
could probably get it reasonably right (allow rw before sending a
monitor command; revoke rw when receiving a QMP event that the commit
has completed). It wouldn't work if we had a qemu-initiated ro-rw
transition, but I don't think we have one at the moment.

 pass-fds:
 { 'command': 'pass-fds',
   'data': {'fdsetname': 'str', '*close': 'bool'},
   'returns': 'string' }
 
 This still doesn't solve Dan's complaint that things are not atomic; if
 the monitor dies between the pass-fds and the later use of the fdset, we
 would need a counterpart monitor command to query what fdsets are
 currently known by qemu. 

If you want a query-fdsets, that should be easy enough.

Actually, we might be able to even make the command transactionable.
This would of course require blockdev-add to be transactionable as well
to be of any use.

 And like you pointed out, you didn't make it
 clear how a timeout mechanism would be implemented to auto-close fds
 that are not consumed in a fixed amount of time - would that be another
 optional parameter to 'pass-fds'?

Do you think a timeout is helpful? Can't we just drop libvirt's
reference automatically if the monitor connection gets closed?

The BH/timer thing Corey mentioned is more about the qemu internal
problem that during a reopen there may be a short period where the old
fd is closed, but the new one not yet opened, so the fdset might need to
survive a short period with refcount 0.

 Or do we need a way to initially create a set with only one O_RDONLY fd,
 but later pass in a new O_RDWR fd but associate it with the existing set
 rather than creating a new set (akin to the 'force' option of 'pass-fd'
 in proposal two)?

As I said above, yes, I think this makes a lot sense.

Kevin

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-03 Thread Corey Bryant



On 07/02/2012 06:31 PM, Eric Blake wrote:

On 07/02/2012 04:02 PM, Corey Bryant wrote:


Here's another option that Kevin and I discussed today on IRC.  I've
modified a few minor details since the discussion. And Kevin please
correct me if anything is wrong.

Proposal Four: Pass a set of fds via 'pass-fds'.  The group of fds
should all refer to the same file, but may have different access flags
(ie. O_RDWR, O_RDONLY).  qemu_open can then dup the fd that has the
matching access mode flags.


But this means that libvirt has to open a file O_RDWR up front for any
file that it _might_ need qemu to reopen later, and that qemu is now
hanging on to 2 fds per fdset instead of 1 fd for the life of any client
of the fdset.

I see no reason why libvirt can't pass in an O_RDWR fd when qemu only
needs to use an O_RDONLY fd; making qemu insist on an O_RDONLY fd makes
life harder for libvirt.  On the other hand, I can see from a safety
standpoint that passing in an O_RDWR fd opens up more potential for
errors than passing in an O_RDONLY fd, but if you don't know up front
whether you will ever need to write into a file, then it would be better
to pass in O_RDONLY.  At least I don't see it as a security risk:
passing in O_RDWR but setting SELinux permissions on the fd to only
allow read() but not write() via the labeling of the fd may be possible,
so that libvirt could still prevent accidental writes into an O_RDWR
file that starts life only needing to service reads.


pass-fds:
 { 'command': 'pass-fds',
   'data': {'fdsetname': 'str', '*close': 'bool'},
   'returns': 'string' }


This still doesn't solve Dan's complaint that things are not atomic; if
the monitor dies between the pass-fds and the later use of the fdset, we
would need a counterpart monitor command to query what fdsets are
currently known by qemu.  And like you pointed out, you didn't make it
clear how a timeout mechanism would be implemented to auto-close fds
that are not consumed in a fixed amount of time - would that be another
optional parameter to 'pass-fds'?


Yes see the description of the close bool on pass-fds below which 
determines how the fds are closed.


It's not an atomic operation, but it does handle Dan's concern of 
preventing fd leaks.


If libvirt did crash after the pass-fd, then couldn't it just abandon 
the fd set (which will get closed) and libvirt could send a new fd set 
and work with that?




Or do we need a way to initially create a set with only one O_RDONLY fd,
but later pass in a new O_RDWR fd but associate it with the existing set
rather than creating a new set (akin to the 'force' option of 'pass-fd'
in proposal two)?


Yeah I see what you're saying.

I was also wondering if it would make sense for qemu to consume the 
passed fds and use those (ie. no dup()).  This would prevent the issues 
of having to close the fds that linger on the monitor.  But I don't know 
if it's realistic for libvirt to know how many open calls qemu will need 
to make (this would need 1 fd passed per open call).





close-fds:
 { 'command': 'close-fds',
   'data': {'fdsetname': 'str' }
where:
  @fdsetname - the file descriptor set name
  @close - *if true QEMU closes the monitor fds when they expire.
   if false, fds remain on the monitor until close-fds
   command.
PRO: Supports reopen
PRO: All other commands work without impact by using qemu_open()
PRO: No fd leakage if close==true specified
CON: Determining when to close fds when close==true may be tricky
USAGE:
1. pass-fd FDSET={M} - returns a string /dev/fdset/1) that refers to
the passed set of fds.
2. drive_add file=/dev/fdset/1 -- qemu_open() uses the first fd from the
list that has access flags matching the qemu_open() action flags.
3. block-commit -- qemu_open() reopens /dev/fdset/1 by using the first
fd from the list that has access flags matching the qemu_open() action
flags.
4. The monitor fds are closed:
A) *If @close == true, qemu closes the set of fds when the timer
   expires
B) If @close == false, libvirt must issue close-fds command to close
   the set of fds

*How to solve this has yet to be determined.  Kevin mentioned
potentially using a bottom-half or a timer in qemu_close().


Still, it is certainly an option worth thinking about, and I'm hoping we
can come to a solid design that everyone agrees provides the desired
security and flexibility without having to recode every single existing
command.



I agree.

--
Regards,
Corey


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-03 Thread Corey Bryant



On 07/02/2012 06:02 PM, Corey Bryant wrote:



On 06/26/2012 06:54 PM, Eric Blake wrote:

On 06/26/2012 04:28 PM, Corey Bryant wrote:


With this proposed series, we have usage akin to:

1. pass_fd FDSET={M} - returns a string /dev/fd/N showing
QEMU's
   view of the FD
2. drive_add file=/dev/fd/N
3. if failure:
 close_fd /dev/fd/N


In fact, there are more steps:

4. use it successfully
5. close_fd /dev/fd/N

I think it would well be possible that qemu just closes the fd when
it's
not used internally any more.


pass-fd could have a flag indicating qemu to do that.



It seems like libvirt would be in a better position to understand when a
file is no longer in use, and then it can call close_fd.  No?  Of course
the the only fd that needs to be closed is the originally passed fd. The
dup'd fd's are closed by QEMU.


The more we discuss this, the more I'm convinced that commands like
'block-commit' that will reopen a file to change from O_RDONLY to O_RDWR
will need to have an optional argument that says the name of the file to
reopen.  That is, I've seen three proposals:

Proposal One: enhance every command to take an fd:name protocol.
PRO: Successful use of a name removes the fd from the 'getfd' list.
CON: Lots of commands to change.
CON: The command sequence is not atomic.
CON: Block layer does not have a file name tied to the fd, so the reopen
operation also needs to learn the fd:name protocol, but since we're
already touching the command it's not much more work.
USAGE:
1. getfd FDSET={M}, ties new fd to name
2. drive_add fd=name - looks up the fd by name from the list
3a. on success, fd is no longer in the list, drive_add consumed it
3b. on failure, libvirt must call 'closefd name' to avoid a leak
4. getfd FDSET={M2}, ties new fd to newname
5. block-commit fd=newname - looks up fd by name from the list
6a. on success, fd is no longer in the list, block-commit consumed it
6b. on failure, libvirt must call 'closefd newname' to avoid a leak

Proposal Two: Create a permanent name via 'getfd' or 'pass-fd', then
update that name to the new fd in advance of any operation that needs to
do a reopen.
PRO: All other commands work without impact by using qemu_open(), by
getting a clone of the permanent name.
CON: The permanent name must remain open as long as qemu might need to
reopen it, and reopening needs the pass-fd force option.
CON: The command sequence is not atomic.
USAGE:
1. pass_fd FDSET={M} - returns an integer N (or a string /dev/fd/N)
showing QEMU's permanent name of the FD
2. drive_add file=permanent name means that qemu_open() will clone the
fd, and drive_add is now using yet another FD while N remains open;
meanwhile, the block layer knows that the drive is tied to the permanent
name
3. pass-fd force FDSET={M2} - replaces fd N with the passed M2, and
still returns /dev/fd/N
4. block-commit - looks up the block-device name (/dev/fd/N), which maps
back to the permanent fd, and gets a copy of the newly passed M2
5. on completion (success or failure), libvirt must call 'closefd name'
to avoid a leak

Proposal Three: Use thread-local fds passed alongside any command,
rather than passing via a separate command
PRO: All commands can now atomically handle fd passing
PRO: Commands that only need a one-shot open can now use fd
CON: Commands that need to reopen still need modification to allow a
name override during the reopen
1. drive_add nfds=1 file=/dev/fdlist/1 FDSET={M} - on success, the fd
is used as the block file, on failure it is atomically closed, so there
is no leak either way. However, the block file has no permanent name.
2. block-commit nfds=1 file-override=/dev/fdlist/1 FDSET={M2} - file
override must be passed again (since we have no guarantee that the
/dev/fdlist/1 of _this_ command matches the command-local naming used in
the previous command), but the fd is again used atomically

Under proposal 3, there is no need to dup fd's, merely only to check
that qemu_open(/dev/fdlist/n, flags) has compatible flags with the fd
received via FDSET.  And the fact that things are made atomic means that
libvirt no longer has to worry about calling closefd, nor does it have
to worry about being interrupted between two monitor commands and not
knowing what state qemu is in.  But it does mean teaching every possible
command that wants to do a reopen to learn how to use fd overrides
rather than to reuse the permanent name that was originally in place on
the first open.



Here's another option that Kevin and I discussed today on IRC.  I've
modified a few minor details since the discussion. And Kevin please
correct me if anything is wrong.

Proposal Four: Pass a set of fds via 'pass-fds'.  The group of fds
should all refer to the same file, but may have different access flags
(ie. O_RDWR, O_RDONLY).  qemu_open can then dup the fd that has the
matching access mode flags.
pass-fds:
 { 'command': 'pass-fds',
   'data': {'fdsetname': 'str', '*close': 'bool'},
   'returns': 

Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-03 Thread Kevin Wolf
Am 03.07.2012 17:40, schrieb Corey Bryant:
 Thanks again for taking time to discuss this at today's QEMU community call.
 
 Here's the proposal we discussed at the call.  Please let me know if I 
 missed anything or if there are any issues with this design.
 
 Proposal Five:  New monitor commands enable adding/removing an fd 
 to/from a set.  The set of fds should all refer to the same file, but 
 may have different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can 
 then dup the fd that has the matching access mode flags.
 PRO: Supports reopen
 PRO: All other commands work without impact by using qemu_open()
 PRO: No fd leakage (fds are associated with monitor connection and, if 
 not in use, closed when monitor disconnects)
 PRO: Security-wise this is ok since libvirt can manage the set of fd's 
 (ie. it can add/remove an O_RDWR fd to/from the set as needed).
 CON: Not atomic (e.g. doesn't add an fd with single drive_add command).
 USAGE:
 1. add-fd /dev/fdset/1 FDSET={M} - qemu adds fd to set named 
 /dev/fdset/1 - command returns qemu fd (e.g fd=4) to caller.  libvirt 
 in-use flag turned on for fd.

I thought qemu would rather return the number of the fdset (which it
also assigns if none it passed, i.e. for fdset creation). Does libvirt
need the number of an individual fd?

If libvirt prefers to assign fdset numbers itself, I'm not against it,
it's just something that wasn't clear to me yet.

 2. drive_add file=/dev/fdset/1 - qemu_open uses the first fd from the
 set that has access flags matching the qemu_open action flags. 
 qemu_open increments refcount for this fd.
 3. add-fd /dev/fdset/1 FDSET={M} - qemu adds fd to set named 
 /dev/fdset/1 - command returns qemu fd to caller (e.g fd=5).  libvirt 
 in-use flag turned on for fd.
 3. block-commit - qemu_open reopens /dev/fdset/1 by using the first
 fd from the set that has access flags matching the qemu_open action 
 flags.  qemu_open increments refcount for this fd.
 4. remove-fd /dev/fdset/1 5 - caller requests fd==5 be removed from the 
 set.  turns libvirt in-use flag off marking the fd ready to be closed 
 when qemu is done with it.

If we decided to not return the individual fd numbers to libvirt, file
descriptors would be uniquely identified by an fdset/flags pair here.

 5. qemu_close decrements refcount for fd, and closes fd when refcount is 
 zero and libvirt in use flag is off.

The monitor could just hold another reference, then we save the
additional flag. But that's a qemu implementation detail.

 More functional details:
 -If libvirt crashes it can call query-fd /dev/fdset/1 to determine 
 which fds are open in the set.

We also need a query-fdsets command that lists all fdsets that exist. If
we add information about single fds to the return value of it, we
probably don't need a separate query-fd that operates on a single fdset.

 -If monitor connection closes, qemu will close fds that have a refcount 
 of zero.  Do we also need a qemu in-use flag in case refcount is zero 
 and fd is still in use?

In use by whom? If it's still in use in qemu (as in in-use flag would
be set) and we have a refcount of zero, then that's a bug.

 -This support requires introduction of qemu_close function that will be 
 called everywhere in block layer that close is currently called.
 
 Notes:
 -Patch series 1 will include support for all of the above.  This will be 
 my initial focus.
 -Patch series 2 will include command line support that enables 
 association of command line fd with a monitor set.  This will be my 
 secondary focus, most likely after patch series 1 is applied.

Thanks, this is a good and as far as I can tell complete summary of what
we discussed.

Kevin

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-03 Thread Corey Bryant



On 07/03/2012 11:59 AM, Kevin Wolf wrote:

Am 03.07.2012 17:40, schrieb Corey Bryant:

Thanks again for taking time to discuss this at today's QEMU community call.

Here's the proposal we discussed at the call.  Please let me know if I
missed anything or if there are any issues with this design.

Proposal Five:  New monitor commands enable adding/removing an fd
to/from a set.  The set of fds should all refer to the same file, but
may have different access flags (ie. O_RDWR, O_RDONLY).  qemu_open can
then dup the fd that has the matching access mode flags.
PRO: Supports reopen
PRO: All other commands work without impact by using qemu_open()
PRO: No fd leakage (fds are associated with monitor connection and, if
not in use, closed when monitor disconnects)
PRO: Security-wise this is ok since libvirt can manage the set of fd's
(ie. it can add/remove an O_RDWR fd to/from the set as needed).
CON: Not atomic (e.g. doesn't add an fd with single drive_add command).
USAGE:
1. add-fd /dev/fdset/1 FDSET={M} - qemu adds fd to set named
/dev/fdset/1 - command returns qemu fd (e.g fd=4) to caller.  libvirt
in-use flag turned on for fd.


I thought qemu would rather return the number of the fdset (which it
also assigns if none it passed, i.e. for fdset creation). Does libvirt
need the number of an individual fd?

If libvirt prefers to assign fdset numbers itself, I'm not against it,
it's just something that wasn't clear to me yet.



That's fine.  QEMU can return the fdset number or a string 
(/dev/fdset/1) if none is specified.  And an fdset will need to be 
specified if adding to an existing set.


I think libvirt will need the fd returned by add-fd so that it can 
evaluate fds returned by query-fd.  It's also useful for remove-fd.



2. drive_add file=/dev/fdset/1 - qemu_open uses the first fd from the
set that has access flags matching the qemu_open action flags.
qemu_open increments refcount for this fd.
3. add-fd /dev/fdset/1 FDSET={M} - qemu adds fd to set named
/dev/fdset/1 - command returns qemu fd to caller (e.g fd=5).  libvirt
in-use flag turned on for fd.
3. block-commit - qemu_open reopens /dev/fdset/1 by using the first
fd from the set that has access flags matching the qemu_open action
flags.  qemu_open increments refcount for this fd.
4. remove-fd /dev/fdset/1 5 - caller requests fd==5 be removed from the
set.  turns libvirt in-use flag off marking the fd ready to be closed
when qemu is done with it.


If we decided to not return the individual fd numbers to libvirt, file
descriptors would be uniquely identified by an fdset/flags pair here.



Are you saying we'd pass the fdset name and flags parameters on 
remove-fd to somehow identify the fds to remove?



5. qemu_close decrements refcount for fd, and closes fd when refcount is
zero and libvirt in use flag is off.


The monitor could just hold another reference, then we save the
additional flag. But that's a qemu implementation detail.



I'm not sure I understand what you mean.


More functional details:
-If libvirt crashes it can call query-fd /dev/fdset/1 to determine
which fds are open in the set.


We also need a query-fdsets command that lists all fdsets that exist. If
we add information about single fds to the return value of it, we
probably don't need a separate query-fd that operates on a single fdset.



Yes, good point.  And maybe we don't need 2 commands.  query-fdsets 
could return all the sets and all the fds that are in those sets.



-If monitor connection closes, qemu will close fds that have a refcount
of zero.  Do we also need a qemu in-use flag in case refcount is zero
and fd is still in use?


In use by whom? If it's still in use in qemu (as in in-use flag would
be set) and we have a refcount of zero, then that's a bug.



In use by qemu.  I don't think it's a bug.  I think there are situations 
where refcount gets to zero but qemu is still using the fd.



-This support requires introduction of qemu_close function that will be
called everywhere in block layer that close is currently called.

Notes:
-Patch series 1 will include support for all of the above.  This will be
my initial focus.
-Patch series 2 will include command line support that enables
association of command line fd with a monitor set.  This will be my
secondary focus, most likely after patch series 1 is applied.


Thanks, this is a good and as far as I can tell complete summary of what
we discussed.

Kevin



Definitely!  Thank you for all the input.

--
Regards,
Corey


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-03 Thread Eric Blake
On 07/03/2012 10:25 AM, Corey Bryant wrote:

 I thought qemu would rather return the number of the fdset (which it
 also assigns if none it passed, i.e. for fdset creation). Does libvirt
 need the number of an individual fd?

 If libvirt prefers to assign fdset numbers itself, I'm not against it,
 it's just something that wasn't clear to me yet.

 
 That's fine.  QEMU can return the fdset number or a string
 (/dev/fdset/1) if none is specified.  And an fdset will need to be
 specified if adding to an existing set.
 
 I think libvirt will need the fd returned by add-fd so that it can
 evaluate fds returned by query-fd.  It's also useful for remove-fd.

Correct - since we will be adding a remove-fd, then that command needs
to know both the fdset name and the individual fd within the set to be
removed.

 
 2. drive_add file=/dev/fdset/1 - qemu_open uses the first fd from the
 set that has access flags matching the qemu_open action flags.
 qemu_open increments refcount for this fd.
 3. add-fd /dev/fdset/1 FDSET={M} - qemu adds fd to set named
 /dev/fdset/1 - command returns qemu fd to caller (e.g fd=5).  libvirt
 in-use flag turned on for fd.
 3. block-commit - qemu_open reopens /dev/fdset/1 by using the first
 fd from the set that has access flags matching the qemu_open action
 flags.  qemu_open increments refcount for this fd.
 4. remove-fd /dev/fdset/1 5 - caller requests fd==5 be removed from the
 set.  turns libvirt in-use flag off marking the fd ready to be closed
 when qemu is done with it.

 If we decided to not return the individual fd numbers to libvirt, file
 descriptors would be uniquely identified by an fdset/flags pair here.

 
 Are you saying we'd pass the fdset name and flags parameters on
 remove-fd to somehow identify the fds to remove?

Passing the flag parameters is not trivial, as that would mean the QMP
code would have to define constants mapping to all of the O_* flags that
qemu_open supports.  It's easier to support closing by fd number.

 
 5. qemu_close decrements refcount for fd, and closes fd when refcount is
 zero and libvirt in use flag is off.

 The monitor could just hold another reference, then we save the
 additional flag. But that's a qemu implementation detail.

 
 I'm not sure I understand what you mean.

pass-fd (or add-fd, whatever name we give it) adds an fd to an fdset,
with initial use count of 1 (the use is the monitor).  qemu_open()
increments the use count.  A new qemu_close() wrapper would decrement
the use count.  And both calling 'remove-fd', or closing the QMP monitor
of an fd that has not yet been passed through 'remove-fd', serves as a
way to decrement the use count.  You'd still have to track whether the
monitor is using an fd (to avoid over-decrementing on QMP monitor
close), but by having the monitor's use also tracked under the refcount,
then refcount reaching 0 is sufficient to auto-close an fd.  I think
that also means that re-establishing the client QMP connection would
increment  For some examples:

1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
use by monitor, as member of fdset1
2. client crashes, so all tracked fds are visited; fd=4 had not yet been
passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is
now 0 so qemu closes it

1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
use by monitor, as member of fdset1
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount to 2
3. client crashes, so all tracked fds are visited; fd=4 had not yet been
passed to 'remove-fd', so qemu decrements refcount to 1, but leaves fd=4
open because it is still in use by the block device
4. client re-establishes QMP connection, and 'query-fds' lets client
learn about fd=4 still being open as part of fdset1, but also informs
client that fd is not in use by the monitor

1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
use by monitor, as member of fdset1
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount to 2
3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in use by the monitor, refcount decremented to 1 but still left
open because it is in use by the block device
4. client crashes, so all tracked fds are visited; but fd=4 is already
marked as not in use by the monitor, so its refcount is unchanged

1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
use by monitor, as member of fdset1
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
but the command fails for some other reason, so the refcount is still 1
at the end of the command (although it may have been temporarily
incremented then decremented during the command)
3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or
QMP connection is closed), so qemu marks fd=4 as no longer in use by the
monitor, refcount is now decremented to 0 and fd=4 

Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-03 Thread Corey Bryant



On 07/03/2012 01:03 PM, Eric Blake wrote:

On 07/03/2012 10:25 AM, Corey Bryant wrote:


I thought qemu would rather return the number of the fdset (which it
also assigns if none it passed, i.e. for fdset creation). Does libvirt
need the number of an individual fd?

If libvirt prefers to assign fdset numbers itself, I'm not against it,
it's just something that wasn't clear to me yet.



That's fine.  QEMU can return the fdset number or a string
(/dev/fdset/1) if none is specified.  And an fdset will need to be
specified if adding to an existing set.

I think libvirt will need the fd returned by add-fd so that it can
evaluate fds returned by query-fd.  It's also useful for remove-fd.


Correct - since we will be adding a remove-fd, then that command needs
to know both the fdset name and the individual fd within the set to be
removed.



Ok




2. drive_add file=/dev/fdset/1 - qemu_open uses the first fd from the
set that has access flags matching the qemu_open action flags.
qemu_open increments refcount for this fd.
3. add-fd /dev/fdset/1 FDSET={M} - qemu adds fd to set named
/dev/fdset/1 - command returns qemu fd to caller (e.g fd=5).  libvirt
in-use flag turned on for fd.
3. block-commit - qemu_open reopens /dev/fdset/1 by using the first
fd from the set that has access flags matching the qemu_open action
flags.  qemu_open increments refcount for this fd.
4. remove-fd /dev/fdset/1 5 - caller requests fd==5 be removed from the
set.  turns libvirt in-use flag off marking the fd ready to be closed
when qemu is done with it.


If we decided to not return the individual fd numbers to libvirt, file
descriptors would be uniquely identified by an fdset/flags pair here.



Are you saying we'd pass the fdset name and flags parameters on
remove-fd to somehow identify the fds to remove?


Passing the flag parameters is not trivial, as that would mean the QMP
code would have to define constants mapping to all of the O_* flags that
qemu_open supports.  It's easier to support closing by fd number.



I understand what you were saying now, although I guess it's not 
applicable at this point.  I'll plan on returning the fd from add-fd and 
passing the fd on the remove-fd command.





5. qemu_close decrements refcount for fd, and closes fd when refcount is
zero and libvirt in use flag is off.


The monitor could just hold another reference, then we save the
additional flag. But that's a qemu implementation detail.



I'm not sure I understand what you mean.


pass-fd (or add-fd, whatever name we give it) adds an fd to an fdset,
with initial use count of 1 (the use is the monitor).  qemu_open()
increments the use count.  A new qemu_close() wrapper would decrement
the use count.  And both calling 'remove-fd', or closing the QMP monitor
of an fd that has not yet been passed through 'remove-fd', serves as a
way to decrement the use count.  You'd still have to track whether the
monitor is using an fd (to avoid over-decrementing on QMP monitor
close), but by having the monitor's use also tracked under the refcount,
then refcount reaching 0 is sufficient to auto-close an fd.  I think
that also means that re-establishing the client QMP connection would
increment  For some examples:


Yes, I think adding a +1 to the refcount for the monitor makes sense.

I'm a bit unsure how to increment the refcount when a monitor reconnects 
though.  Maybe it is as simple as adding a +1 to each fd's refcount when 
the next QMP monitor connects.




1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
use by monitor, as member of fdset1
2. client crashes, so all tracked fds are visited; fd=4 had not yet been
passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is
now 0 so qemu closes it


Just a note that the fd above also hasn't yet been referenced by a 
drive-add/device-add, so it will be closed in step 2.




1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
use by monitor, as member of fdset1
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount to 2
3. client crashes, so all tracked fds are visited; fd=4 had not yet been
passed to 'remove-fd', so qemu decrements refcount to 1, but leaves fd=4
open because it is still in use by the block device
4. client re-establishes QMP connection, and 'query-fds' lets client
learn about fd=4 still being open as part of fdset1, but also informs
client that fd is not in use by the monitor


And in step 4 the QMP connection will increment the refcount +1 for all 
fds that persisted through the QMP disconnect. (?)




1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
use by monitor, as member of fdset1
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount to 2
3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
longer in use by the monitor, refcount decremented to 1 but still left
open because it is 

Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-03 Thread Eric Blake
On 07/03/2012 11:46 AM, Corey Bryant wrote:

 
 Yes, I think adding a +1 to the refcount for the monitor makes sense.
 
 I'm a bit unsure how to increment the refcount when a monitor reconnects
 though.  Maybe it is as simple as adding a +1 to each fd's refcount when
 the next QMP monitor connects.

Or maybe delay a +1 until after a 'query-fds' - it is not until the
monitor has reconnected and learned what fds it should be aware of that
incrementing the refcount again makes sense.  But that would mean making
'query-fds' track whether this is the first call since the monitor
reconnected, as it shouldn't normally increase refcounts.

The other alternative is that the monitor never re-increments a
refcount.  Once a monitor disconnects, that fd is lost to the monitor,
and a reconnected monitor must pass in a new fd to be re-associated with
the fdset.  In other words, the monitor's use of an fd is a one-way
operation, starting life in use but ending at the first disconnect or
remove-fd.


 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
 use by monitor, as member of fdset1
 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
 so qemu_open() increments the refcount to 2
 3. client crashes, so all tracked fds are visited; fd=4 had not yet been
 passed to 'remove-fd', so qemu decrements refcount to 1, but leaves fd=4
 open because it is still in use by the block device
 4. client re-establishes QMP connection, and 'query-fds' lets client
 learn about fd=4 still being open as part of fdset1, but also informs
 client that fd is not in use by the monitor
 
 And in step 4 the QMP connection will increment the refcount +1 for all
 fds that persisted through the QMP disconnect. (?)

I'm not sure we need the refcount increment on reconnect.  'query-fds'
should provide enough information for the new monitor to know what
fdsets are still in use by qemu, even though they are no longer
available to 'remove-fd' from the monitor, and if the monitor is worried
about keeping the fd set alive it can call 'add-fd' to again associate a
new fd with the set.  The lifetime of a set is thus as long as any of
its associated fds have a non-zero refcount.

-- 
Eric Blake   ebl...@redhat.com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





signature.asc
Description: OpenPGP digital signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-03 Thread Corey Bryant



On 07/03/2012 02:00 PM, Eric Blake wrote:

On 07/03/2012 11:46 AM, Corey Bryant wrote:



Yes, I think adding a +1 to the refcount for the monitor makes sense.

I'm a bit unsure how to increment the refcount when a monitor reconnects
though.  Maybe it is as simple as adding a +1 to each fd's refcount when
the next QMP monitor connects.


Or maybe delay a +1 until after a 'query-fds' - it is not until the
monitor has reconnected and learned what fds it should be aware of that
incrementing the refcount again makes sense.  But that would mean making
'query-fds' track whether this is the first call since the monitor
reconnected, as it shouldn't normally increase refcounts.


This doesn't sound ideal.



The other alternative is that the monitor never re-increments a
refcount.  Once a monitor disconnects, that fd is lost to the monitor,
and a reconnected monitor must pass in a new fd to be re-associated with
the fdset.  In other words, the monitor's use of an fd is a one-way
operation, starting life in use but ending at the first disconnect or
remove-fd.




I would vote for this 2nd alternative.  As long as we're not introducing 
an fd leak.  And I don't think we are if we decrement the refcount on 
remove-fd or on QMP disconnect.



1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
use by monitor, as member of fdset1
2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
so qemu_open() increments the refcount to 2
3. client crashes, so all tracked fds are visited; fd=4 had not yet been
passed to 'remove-fd', so qemu decrements refcount to 1, but leaves fd=4
open because it is still in use by the block device
4. client re-establishes QMP connection, and 'query-fds' lets client
learn about fd=4 still being open as part of fdset1, but also informs
client that fd is not in use by the monitor


And in step 4 the QMP connection will increment the refcount +1 for all
fds that persisted through the QMP disconnect. (?)


I'm not sure we need the refcount increment on reconnect.  'query-fds'
should provide enough information for the new monitor to know what
fdsets are still in use by qemu, even though they are no longer
available to 'remove-fd' from the monitor, and if the monitor is worried
about keeping the fd set alive it can call 'add-fd' to again associate a
new fd with the set.  The lifetime of a set is thus as long as any of
its associated fds have a non-zero refcount.



This sounds good to me.

And qemu_open will need to make sure the monitor in_use flag is true 
before dup'ing an fd.


--
Regards,
Corey


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-02 Thread Corey Bryant



On 06/26/2012 06:54 PM, Eric Blake wrote:

On 06/26/2012 04:28 PM, Corey Bryant wrote:


With this proposed series, we have usage akin to:

1. pass_fd FDSET={M} - returns a string /dev/fd/N showing QEMU's
   view of the FD
2. drive_add file=/dev/fd/N
3. if failure:
 close_fd /dev/fd/N


In fact, there are more steps:

4. use it successfully
5. close_fd /dev/fd/N

I think it would well be possible that qemu just closes the fd when it's
not used internally any more.


pass-fd could have a flag indicating qemu to do that.



It seems like libvirt would be in a better position to understand when a
file is no longer in use, and then it can call close_fd.  No?  Of course
the the only fd that needs to be closed is the originally passed fd. The
dup'd fd's are closed by QEMU.


The more we discuss this, the more I'm convinced that commands like
'block-commit' that will reopen a file to change from O_RDONLY to O_RDWR
will need to have an optional argument that says the name of the file to
reopen.  That is, I've seen three proposals:

Proposal One: enhance every command to take an fd:name protocol.
PRO: Successful use of a name removes the fd from the 'getfd' list.
CON: Lots of commands to change.
CON: The command sequence is not atomic.
CON: Block layer does not have a file name tied to the fd, so the reopen
operation also needs to learn the fd:name protocol, but since we're
already touching the command it's not much more work.
USAGE:
1. getfd FDSET={M}, ties new fd to name
2. drive_add fd=name - looks up the fd by name from the list
3a. on success, fd is no longer in the list, drive_add consumed it
3b. on failure, libvirt must call 'closefd name' to avoid a leak
4. getfd FDSET={M2}, ties new fd to newname
5. block-commit fd=newname - looks up fd by name from the list
6a. on success, fd is no longer in the list, block-commit consumed it
6b. on failure, libvirt must call 'closefd newname' to avoid a leak

Proposal Two: Create a permanent name via 'getfd' or 'pass-fd', then
update that name to the new fd in advance of any operation that needs to
do a reopen.
PRO: All other commands work without impact by using qemu_open(), by
getting a clone of the permanent name.
CON: The permanent name must remain open as long as qemu might need to
reopen it, and reopening needs the pass-fd force option.
CON: The command sequence is not atomic.
USAGE:
1. pass_fd FDSET={M} - returns an integer N (or a string /dev/fd/N)
showing QEMU's permanent name of the FD
2. drive_add file=permanent name means that qemu_open() will clone the
fd, and drive_add is now using yet another FD while N remains open;
meanwhile, the block layer knows that the drive is tied to the permanent
name
3. pass-fd force FDSET={M2} - replaces fd N with the passed M2, and
still returns /dev/fd/N
4. block-commit - looks up the block-device name (/dev/fd/N), which maps
back to the permanent fd, and gets a copy of the newly passed M2
5. on completion (success or failure), libvirt must call 'closefd name'
to avoid a leak

Proposal Three: Use thread-local fds passed alongside any command,
rather than passing via a separate command
PRO: All commands can now atomically handle fd passing
PRO: Commands that only need a one-shot open can now use fd
CON: Commands that need to reopen still need modification to allow a
name override during the reopen
1. drive_add nfds=1 file=/dev/fdlist/1 FDSET={M} - on success, the fd
is used as the block file, on failure it is atomically closed, so there
is no leak either way. However, the block file has no permanent name.
2. block-commit nfds=1 file-override=/dev/fdlist/1 FDSET={M2} - file
override must be passed again (since we have no guarantee that the
/dev/fdlist/1 of _this_ command matches the command-local naming used in
the previous command), but the fd is again used atomically

Under proposal 3, there is no need to dup fd's, merely only to check
that qemu_open(/dev/fdlist/n, flags) has compatible flags with the fd
received via FDSET.  And the fact that things are made atomic means that
libvirt no longer has to worry about calling closefd, nor does it have
to worry about being interrupted between two monitor commands and not
knowing what state qemu is in.  But it does mean teaching every possible
command that wants to do a reopen to learn how to use fd overrides
rather than to reuse the permanent name that was originally in place on
the first open.



Here's another option that Kevin and I discussed today on IRC.  I've 
modified a few minor details since the discussion. And Kevin please 
correct me if anything is wrong.


Proposal Four: Pass a set of fds via 'pass-fds'.  The group of fds 
should all refer to the same file, but may have different access flags 
(ie. O_RDWR, O_RDONLY).  qemu_open can then dup the fd that has the 
matching access mode flags.

pass-fds:
{ 'command': 'pass-fds',
  'data': {'fdsetname': 'str', '*close': 'bool'},
  'returns': 'string' }
close-fds:
{ 'command': 

Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-07-02 Thread Eric Blake
On 07/02/2012 04:02 PM, Corey Bryant wrote:

 Here's another option that Kevin and I discussed today on IRC.  I've
 modified a few minor details since the discussion. And Kevin please
 correct me if anything is wrong.
 
 Proposal Four: Pass a set of fds via 'pass-fds'.  The group of fds
 should all refer to the same file, but may have different access flags
 (ie. O_RDWR, O_RDONLY).  qemu_open can then dup the fd that has the
 matching access mode flags.

But this means that libvirt has to open a file O_RDWR up front for any
file that it _might_ need qemu to reopen later, and that qemu is now
hanging on to 2 fds per fdset instead of 1 fd for the life of any client
of the fdset.

I see no reason why libvirt can't pass in an O_RDWR fd when qemu only
needs to use an O_RDONLY fd; making qemu insist on an O_RDONLY fd makes
life harder for libvirt.  On the other hand, I can see from a safety
standpoint that passing in an O_RDWR fd opens up more potential for
errors than passing in an O_RDONLY fd, but if you don't know up front
whether you will ever need to write into a file, then it would be better
to pass in O_RDONLY.  At least I don't see it as a security risk:
passing in O_RDWR but setting SELinux permissions on the fd to only
allow read() but not write() via the labeling of the fd may be possible,
so that libvirt could still prevent accidental writes into an O_RDWR
file that starts life only needing to service reads.

 pass-fds:
 { 'command': 'pass-fds',
   'data': {'fdsetname': 'str', '*close': 'bool'},
   'returns': 'string' }

This still doesn't solve Dan's complaint that things are not atomic; if
the monitor dies between the pass-fds and the later use of the fdset, we
would need a counterpart monitor command to query what fdsets are
currently known by qemu.  And like you pointed out, you didn't make it
clear how a timeout mechanism would be implemented to auto-close fds
that are not consumed in a fixed amount of time - would that be another
optional parameter to 'pass-fds'?

Or do we need a way to initially create a set with only one O_RDONLY fd,
but later pass in a new O_RDWR fd but associate it with the existing set
rather than creating a new set (akin to the 'force' option of 'pass-fd'
in proposal two)?

 close-fds:
 { 'command': 'close-fds',
   'data': {'fdsetname': 'str' }
 where:
  @fdsetname - the file descriptor set name
  @close - *if true QEMU closes the monitor fds when they expire.
   if false, fds remain on the monitor until close-fds
   command.
 PRO: Supports reopen
 PRO: All other commands work without impact by using qemu_open()
 PRO: No fd leakage if close==true specified
 CON: Determining when to close fds when close==true may be tricky
 USAGE:
 1. pass-fd FDSET={M} - returns a string /dev/fdset/1) that refers to
 the passed set of fds.
 2. drive_add file=/dev/fdset/1 -- qemu_open() uses the first fd from the
 list that has access flags matching the qemu_open() action flags.
 3. block-commit -- qemu_open() reopens /dev/fdset/1 by using the first
 fd from the list that has access flags matching the qemu_open() action
 flags.
 4. The monitor fds are closed:
A) *If @close == true, qemu closes the set of fds when the timer
   expires
B) If @close == false, libvirt must issue close-fds command to close
   the set of fds
 
 *How to solve this has yet to be determined.  Kevin mentioned
 potentially using a bottom-half or a timer in qemu_close().

Still, it is certainly an option worth thinking about, and I'm hoping we
can come to a solid design that everyone agrees provides the desired
security and flexibility without having to recode every single existing
command.

-- 
Eric Blake   ebl...@redhat.com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





signature.asc
Description: OpenPGP digital signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-06-28 Thread Luiz Capitulino
On Wed, 27 Jun 2012 10:58:01 +0200
Kevin Wolf kw...@redhat.com wrote:

 Am 26.06.2012 20:40, schrieb Corey Bryant:
  Here is a quick proof of concept (ie untested) patch to demonstrate
  what I mean. It relies on Cory's patch which converts everything
  to use qemu_open. It is also still valuable to make the change
  to qemu_open() to support /dev/fd/N for passing FDs during
  QEMU initial startup for CLI args.
 
  IMHO, what I propose here is preferrable for QMP clients that
  our current plan of requiring use of 3 monitor commands (passfd,
  X, closefd).
 
  Thanks for the PoC.
 
  Two other required updates that I can think of would be:
 
  1) Update change, block_stream, block_reopen, snapshot_blkdev, and
  perhaps other monitor commands to support receiving fd's via SCM_RIGHTS.
 
  
  Nevermind my comment.  I see that your PoC supports passing nfds for any 
  QMP command.
  
  I'm curious what Kevin's thoughts are on this and the overall approach.
 
 I'm not against introducing this nfd thing as a general feature for QMP
 commands instead of a separate pass-fd command. It's not obvious to me
 that everyone would agree with that, so let's CC Luiz at least.

I don't have objections either, but I want to point out two things.

First, we've agreed on adding new commands instead of extending existing
ones. I feel that not everyone is agreement with this and maybe we could
revisit this topic, but in principle new commands should be added.

Secondly, this is just a small suggestion, but we're in a point in time
where several block commands are being added. I think it's a good moment
to think hard how the ideal block API would look like and try to design
new commands based on that.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-06-28 Thread Corey Bryant



On 06/27/2012 05:51 AM, Kevin Wolf wrote:

Am 27.06.2012 11:20, schrieb Daniel P. Berrange:

On Wed, Jun 27, 2012 at 11:16:32AM +0200, Kevin Wolf wrote:

The really bad case that nobody thought of is that, when block-commit
has finished, we need to switch back to read-only. This is an event that
is not triggered by a certain monitor command, but that comes from
inside qemu. I'm almost sure that we'll see more of this as we add more
asynchronous commands.

This only works if we can pass the new file descriptor in advance. It
would work nicely if you go with pass-fd and actually maintain a list of
file descriptors for each /dev/fd/N, along with the different flags the
file descriptors are meant for. I can't see how it would work with the
temporary /dev/fdlist/N or the fd:name approach because they both imply
that the original file descriptors are closed by the time that the QMP
command returns.


What I don't understand though is that when you're reopening FDs, with
the pass-fd command approach, you're merely dup'ing the original FDs that
was passed in from the client.  Why can't you alternatively just dup the
FD you already have.


That's easy: Because I don't know that I'm dealing with an FD. I think
originally the whole point in this /dev/fd/ thing was that it would
transparently enable the functionality for all parts of qemu: The block
layer, of course, but also new char devices, migration targets,
screenshot files or whatever else can deal with files. In this design
the block layer doesn't even know that it's not a regular file.

If we now decided to go for a design where the /dev/fd path isn't
enough, but changes are needed to each subsystem where it should work,
then this point doesn't apply any more and I cast my vote for abandoning
everything starting with /dev/ and introducing a proper fd: protocol in
the block layer, so that the block layer at least has a chance to know
that it has to do treat this file specially.


I don't see why we need to keep the original FD
around forever.  If the QMP command handler nees the temporary /dev/fdlist/N
file to exist for longer than the duration of the command, it can simply
dup() it to get a permanent copy of its own.


If we're not going to make it transparent and if we don't care about QMP
command bloat, we can choose completely different designs, yes. And then
qemu could probably try to internally work around a suboptimal API.

Kevin



Are there any more thoughts on how to move forward with this?  It seems 
like we're at a stand-still with the discussion.


--
Regards,
Corey


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-06-27 Thread Kevin Wolf
Am 27.06.2012 00:28, schrieb Corey Bryant:
 
 
 On 06/26/2012 04:50 PM, Luiz Capitulino wrote:
 On Tue, 26 Jun 2012 13:45:52 +0200
 Kevin Wolf kw...@redhat.com wrote:

 Am 26.06.2012 11:10, schrieb Daniel P. Berrange:
 I was thinking about some of the sources complexity when using
 FD passing from libvirt and wanted to raise one idea for discussion
 before we continue.

 With this proposed series, we have usage akin to:

1. pass_fd FDSET={M} - returns a string /dev/fd/N showing QEMU's
   view of the FD
2. drive_add file=/dev/fd/N
3. if failure:
 close_fd /dev/fd/N

 In fact, there are more steps:

 4. use it successfully
 5. close_fd /dev/fd/N

 I think it would well be possible that qemu just closes the fd when it's
 not used internally any more.

 pass-fd could have a flag indicating qemu to do that.

 
 It seems like libvirt would be in a better position to understand when a 
 file is no longer in use, and then it can call close_fd.  No?  Of course 
 the the only fd that needs to be closed is the originally passed fd. 
 The dup'd fd's are closed by QEMU.

No, libvirt doesn't know it, because the original file descriptor is
still needed when qemu decides to reopen the file. So I think qemu needs
some kind of refcounting anyway. One of the references is held by
libvirt and it can drop it with close_fd, and the other one would be for
the BlockDriverState or whatever you use the FD with. (There's a tricky
part: When do you actually close the FD? If libvirt has dropped its
reference and qemu reopens, for example because it has just probed the
image format, we have a short time where the refcount would be 0, but we
can't drop it anyway.)

Kevin

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-06-27 Thread Kevin Wolf
Am 26.06.2012 20:40, schrieb Corey Bryant:
 Here is a quick proof of concept (ie untested) patch to demonstrate
 what I mean. It relies on Cory's patch which converts everything
 to use qemu_open. It is also still valuable to make the change
 to qemu_open() to support /dev/fd/N for passing FDs during
 QEMU initial startup for CLI args.

 IMHO, what I propose here is preferrable for QMP clients that
 our current plan of requiring use of 3 monitor commands (passfd,
 X, closefd).

 Thanks for the PoC.

 Two other required updates that I can think of would be:

 1) Update change, block_stream, block_reopen, snapshot_blkdev, and
 perhaps other monitor commands to support receiving fd's via SCM_RIGHTS.

 
 Nevermind my comment.  I see that your PoC supports passing nfds for any 
 QMP command.
 
 I'm curious what Kevin's thoughts are on this and the overall approach.

I'm not against introducing this nfd thing as a general feature for QMP
commands instead of a separate pass-fd command. It's not obvious to me
that everyone would agree with that, so let's CC Luiz at least.

The that I'm unsure about is what we should do with qemu reopening the
file. If you close the fd immediately, you obviously can't do that any
more. Even worse, libvirt doesn't have a unique ID for each passed file
descriptor any more, so even though we have introduced a QMP feature for
file descriptor passing, we would still need to touch all commands to
allow assigning a new fd.

I think having one stable original fd that libvirt can refer to is much
nicer.

Kevin

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-06-27 Thread Kevin Wolf
Am 27.06.2012 00:54, schrieb Eric Blake:
 It seems like libvirt would be in a better position to understand when a
 file is no longer in use, and then it can call close_fd.  No?  Of course
 the the only fd that needs to be closed is the originally passed fd. The
 dup'd fd's are closed by QEMU.
 
 The more we discuss this, the more I'm convinced that commands like
 'block-commit' that will reopen a file to change from O_RDONLY to O_RDWR
 will need to have an optional argument that says the name of the file to
 reopen.  That is, I've seen three proposals:

Thanks for the summary. In fact, after having read this, I start
thinking that we're oversimplifying things because we're only thinking
about one use case, block-commit.

There are more: Live snapshot doesn't only put a new image on top, it
must also make the old top-level image read-only. This isn't bad per se,
but it shows that QMP commands can easily become bloated if you decide
to change every command.

The really bad case that nobody thought of is that, when block-commit
has finished, we need to switch back to read-only. This is an event that
is not triggered by a certain monitor command, but that comes from
inside qemu. I'm almost sure that we'll see more of this as we add more
asynchronous commands.

This only works if we can pass the new file descriptor in advance. It
would work nicely if you go with pass-fd and actually maintain a list of
file descriptors for each /dev/fd/N, along with the different flags the
file descriptors are meant for. I can't see how it would work with the
temporary /dev/fdlist/N or the fd:name approach because they both imply
that the original file descriptors are closed by the time that the QMP
command returns.

Kevin

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-06-27 Thread Daniel P. Berrange
On Wed, Jun 27, 2012 at 11:16:32AM +0200, Kevin Wolf wrote:
 Am 27.06.2012 00:54, schrieb Eric Blake:
  It seems like libvirt would be in a better position to understand when a
  file is no longer in use, and then it can call close_fd.  No?  Of course
  the the only fd that needs to be closed is the originally passed fd. The
  dup'd fd's are closed by QEMU.
  
  The more we discuss this, the more I'm convinced that commands like
  'block-commit' that will reopen a file to change from O_RDONLY to O_RDWR
  will need to have an optional argument that says the name of the file to
  reopen.  That is, I've seen three proposals:
 
 Thanks for the summary. In fact, after having read this, I start
 thinking that we're oversimplifying things because we're only thinking
 about one use case, block-commit.
 
 There are more: Live snapshot doesn't only put a new image on top, it
 must also make the old top-level image read-only. This isn't bad per se,
 but it shows that QMP commands can easily become bloated if you decide
 to change every command.
 
 The really bad case that nobody thought of is that, when block-commit
 has finished, we need to switch back to read-only. This is an event that
 is not triggered by a certain monitor command, but that comes from
 inside qemu. I'm almost sure that we'll see more of this as we add more
 asynchronous commands.
 
 This only works if we can pass the new file descriptor in advance. It
 would work nicely if you go with pass-fd and actually maintain a list of
 file descriptors for each /dev/fd/N, along with the different flags the
 file descriptors are meant for. I can't see how it would work with the
 temporary /dev/fdlist/N or the fd:name approach because they both imply
 that the original file descriptors are closed by the time that the QMP
 command returns.

What I don't understand though is that when you're reopening FDs, with
the pass-fd command approach, you're merely dup'ing the original FDs that
was passed in from the client.  Why can't you alternatively just dup the
FD you already have. I don't see why we need to keep the original FD
around forever.  If the QMP command handler nees the temporary /dev/fdlist/N
file to exist for longer than the duration of the command, it can simply
dup() it to get a permanent copy of its own.

Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-06-27 Thread Kevin Wolf
Am 27.06.2012 11:20, schrieb Daniel P. Berrange:
 On Wed, Jun 27, 2012 at 11:16:32AM +0200, Kevin Wolf wrote:
 The really bad case that nobody thought of is that, when block-commit
 has finished, we need to switch back to read-only. This is an event that
 is not triggered by a certain monitor command, but that comes from
 inside qemu. I'm almost sure that we'll see more of this as we add more
 asynchronous commands.

 This only works if we can pass the new file descriptor in advance. It
 would work nicely if you go with pass-fd and actually maintain a list of
 file descriptors for each /dev/fd/N, along with the different flags the
 file descriptors are meant for. I can't see how it would work with the
 temporary /dev/fdlist/N or the fd:name approach because they both imply
 that the original file descriptors are closed by the time that the QMP
 command returns.
 
 What I don't understand though is that when you're reopening FDs, with
 the pass-fd command approach, you're merely dup'ing the original FDs that
 was passed in from the client.  Why can't you alternatively just dup the
 FD you already have. 

That's easy: Because I don't know that I'm dealing with an FD. I think
originally the whole point in this /dev/fd/ thing was that it would
transparently enable the functionality for all parts of qemu: The block
layer, of course, but also new char devices, migration targets,
screenshot files or whatever else can deal with files. In this design
the block layer doesn't even know that it's not a regular file.

If we now decided to go for a design where the /dev/fd path isn't
enough, but changes are needed to each subsystem where it should work,
then this point doesn't apply any more and I cast my vote for abandoning
everything starting with /dev/ and introducing a proper fd: protocol in
the block layer, so that the block layer at least has a chance to know
that it has to do treat this file specially.

 I don't see why we need to keep the original FD
 around forever.  If the QMP command handler nees the temporary /dev/fdlist/N
 file to exist for longer than the duration of the command, it can simply
 dup() it to get a permanent copy of its own.

If we're not going to make it transparent and if we don't care about QMP
command bloat, we can choose completely different designs, yes. And then
qemu could probably try to internally work around a suboptimal API.

Kevin

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-06-27 Thread Corey Bryant



On 06/27/2012 04:43 AM, Kevin Wolf wrote:

Am 27.06.2012 00:28, schrieb Corey Bryant:



On 06/26/2012 04:50 PM, Luiz Capitulino wrote:

On Tue, 26 Jun 2012 13:45:52 +0200
Kevin Wolf kw...@redhat.com wrote:


Am 26.06.2012 11:10, schrieb Daniel P. Berrange:

I was thinking about some of the sources complexity when using
FD passing from libvirt and wanted to raise one idea for discussion
before we continue.

With this proposed series, we have usage akin to:

1. pass_fd FDSET={M} - returns a string /dev/fd/N showing QEMU's
   view of the FD
2. drive_add file=/dev/fd/N
3. if failure:
 close_fd /dev/fd/N


In fact, there are more steps:

4. use it successfully
5. close_fd /dev/fd/N

I think it would well be possible that qemu just closes the fd when it's
not used internally any more.


pass-fd could have a flag indicating qemu to do that.



It seems like libvirt would be in a better position to understand when a
file is no longer in use, and then it can call close_fd.  No?  Of course
the the only fd that needs to be closed is the originally passed fd.
The dup'd fd's are closed by QEMU.


No, libvirt doesn't know it, because the original file descriptor is
still needed when qemu decides to reopen the file. So I think qemu needs
some kind of refcounting anyway. One of the references is held by
libvirt and it can drop it with close_fd, and the other one would be for
the BlockDriverState or whatever you use the FD with. (There's a tricky
part: When do you actually close the FD? If libvirt has dropped its
reference and qemu reopens, for example because it has just probed the
image format, we have a short time where the refcount would be 0, but we
can't drop it anyway.)

Kevin



Yes, the refcount getting to 0 while the fd is still in use is a tough 
one.  It just seems like the management app would be better equipped to 
understand when a drive is no longer needed.  Isn't this what drive_del 
is for?  If qemu is never told that the drive is no longer needed, then 
the fd remains on the monitor, and it's not a leak.


--
Regards,
Corey


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-06-26 Thread Kevin Wolf
Am 26.06.2012 11:10, schrieb Daniel P. Berrange:
 I was thinking about some of the sources complexity when using
 FD passing from libvirt and wanted to raise one idea for discussion
 before we continue.
 
 With this proposed series, we have usage akin to:
 
   1. pass_fd FDSET={M} - returns a string /dev/fd/N showing QEMU's
  view of the FD
   2. drive_add file=/dev/fd/N
   3. if failure:
close_fd /dev/fd/N

In fact, there are more steps:

4. use it successfully
5. close_fd /dev/fd/N

I think it would well be possible that qemu just closes the fd when it's
not used internally any more. The point of close_fd is more that libvirt
still keeps a handle and can make use of this fd. One example where this
is necessary is reopening the FD of a backing file in order to perform a
live commit operation.

Kevin

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-06-26 Thread Daniel P. Berrange
On Tue, Jun 26, 2012 at 09:52:51AM -0400, Corey Bryant wrote:
 So now from a client's POV you'd have a flow like
 
 * drive_add file=/dev/fd/N  FDSET={N}
 
 IIUC then drive_add would loop and pass each fd in the set via SCM_RIGHTS?

Yes, you'd probably use the JSON to tell QEMU exactly
how many FDs you're intending to pass with the command,
and then once the command is received, you'd have N*SCM_RIGHTS
messages sent/received.

 
 And in QEMU you'd have something like
 
 * handle_monitor_command
  - recvmsg all FDs, and stash them in a thread local FDContext
context
  - invoke monitor command handler
- Sees file=/dev/fd/N
- Fetch /dev/fd/N from FDContext
- If success remove /dev/fd/N from FDContext
  - close() all FDs left in FDContext
 
 The key point with this is that because the FDs are directly
 associated with a monitor command, QEMU can /guarantee/ that
 FDs are never leaked, regardless of client behaviour.
 
 Wouldn't this leak fds if libvirt crashed part way through sending
 the set of fds?

No, because you've scoped the FDs to the current monitor instance,
and the current command being processed you know to close all FDs
when the associated command has finished, as well as closing them
all if you see EOF on the monitor socket while in the middle of
receiving a command.

Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-06-26 Thread Daniel P. Berrange
On Tue, Jun 26, 2012 at 03:11:40PM +0100, Daniel P. Berrange wrote:
 On Tue, Jun 26, 2012 at 09:52:51AM -0400, Corey Bryant wrote:
  So now from a client's POV you'd have a flow like
  
  * drive_add file=/dev/fd/N  FDSET={N}
  
  IIUC then drive_add would loop and pass each fd in the set via SCM_RIGHTS?
 
 Yes, you'd probably use the JSON to tell QEMU exactly
 how many FDs you're intending to pass with the command,
 and then once the command is received, you'd have N*SCM_RIGHTS
 messages sent/received.
 
  
  And in QEMU you'd have something like
  
  * handle_monitor_command
   - recvmsg all FDs, and stash them in a thread local FDContext
 context
   - invoke monitor command handler
 - Sees file=/dev/fd/N
 - Fetch /dev/fd/N from FDContext
 - If success remove /dev/fd/N from FDContext
   - close() all FDs left in FDContext
  
  The key point with this is that because the FDs are directly
  associated with a monitor command, QEMU can /guarantee/ that
  FDs are never leaked, regardless of client behaviour.
  
  Wouldn't this leak fds if libvirt crashed part way through sending
  the set of fds?
 
 No, because you've scoped the FDs to the current monitor instance,
 and the current command being processed you know to close all FDs
 when the associated command has finished, as well as closing them
 all if you see EOF on the monitor socket while in the middle of
 receiving a command.

Here is a quick proof of concept (ie untested) patch to demonstrate
what I mean. It relies on Cory's patch which converts everything
to use qemu_open. It is also still valuable to make the change
to qemu_open() to support /dev/fd/N for passing FDs during
QEMU initial startup for CLI args.

IMHO, what I propose here is preferrable for QMP clients that
our current plan of requiring use of 3 monitor commands (passfd,
X, closefd).

From 55a264b647d90a30c1cc00cb1896535246bcd9b7 Mon Sep 17 00:00:00 2001
From: Daniel P. Berrange berra...@redhat.com
Date: Tue, 26 Jun 2012 15:48:13 +0100
Subject: [PATCH] Support passing FDs with the monitor command that uses them

Add support for a syntax /dev/fdlist/N, where 'N' refers to
the N'th FD that was passed along with the currently processing
command in this thread.

The QMP client sends

{
  execute: drive_add,
  arguments: {
   filename: /dev/fdlist/0,
   backingfilename: /dev/fdlist/1 },
  nfds: 2
}

followed by nfds * SCM_RIGHTS packets

The FDs received along with the drive_add command are placed
in a thread-local list, then the QMP handler function is invoked.
The qemu_open() function will resolve any path /dev/fdlist/0
to use an FD stored in the  thread-local list, and dup() it.
One the QMP handler function is finished, all FDs in the thread
local list are closed.

THe reason for choice of /dev/fdlist/N, is to allow the
/dev/fd/N syntax to be used for FDs that are passed to QEMU
at startup time, which will be in the global FD number namespace,
not the thread-local index.

The reason for using a thread local is to ensure the FDs are
only made available to the currently executing monitor command
handler in this thread, and not any other threads.

The advantage over using a separate 'passfd' command, is that
this guarentees all FDs are closed in QEMU once the QMP command
handler is finished. There is no risk of leaks being triggered
by the client either intentionally, or via it crashing.

NB: this is a proof of concept demonstration of the overall
architectural design. The patch would need more work before
it was suitable for merging

 - Pick a better place/file for the fdlist APIs
 - Do proper error reporting in various places
 - Check for possibility that qemu_chr_fe_get_msgfd
   can block or return EAGAIN.

diff --git a/monitor.c b/monitor.c
index f6107ba..a3a4bd4 100644
--- a/monitor.c
+++ b/monitor.c
@@ -4490,6 +4490,8 @@ static void handle_qmp_command(JSONMessageParser *parser, 
QList *tokens)
 const mon_cmd_t *cmd;
 const char *cmd_name;
 Monitor *mon = cur_mon;
+int nfds;
+size_t i;
 
 args = input = NULL;
 
@@ -4535,6 +4537,17 @@ static void handle_qmp_command(JSONMessageParser 
*parser, QList *tokens)
 goto err_out;
 }
 
+nfds = qdict_get_int(input, nfds);
+for (i = 0 ; i  nfds ; i++) {
+   int fd = qemu_chr_fe_get_msgfd(mon-chr);
+   if (fd == -1) {
+   qerror_report(QERR_FD_NOT_SUPPLIED);
+   goto err_out;
+   }
+
+   qemu_fdlist_add(fd);
+}
+
 if (handler_is_async(cmd)) {
 err = qmp_async_cmd_handler(mon, cmd, args);
 if (err) {
@@ -4552,6 +4565,7 @@ err_out:
 out:
 QDECREF(input);
 QDECREF(args);
+qemu_fdlist_closeall();
 }
 
 /**
diff --git a/osdep.c b/osdep.c
index 03817f0..0849cb6 100644
--- a/osdep.c
+++ b/osdep.c
@@ -75,6 +75,81 @@ int qemu_madvise(void *addr, size_t len, int advice)
 #endif
 }
 
+typedef 

Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-06-26 Thread Corey Bryant



On 06/26/2012 11:03 AM, Daniel P. Berrange wrote:

On Tue, Jun 26, 2012 at 03:11:40PM +0100, Daniel P. Berrange wrote:

On Tue, Jun 26, 2012 at 09:52:51AM -0400, Corey Bryant wrote:

So now from a client's POV you'd have a flow like

* drive_add file=/dev/fd/N  FDSET={N}


IIUC then drive_add would loop and pass each fd in the set via SCM_RIGHTS?


Yes, you'd probably use the JSON to tell QEMU exactly
how many FDs you're intending to pass with the command,
and then once the command is received, you'd have N*SCM_RIGHTS
messages sent/received.



And in QEMU you'd have something like

* handle_monitor_command
 - recvmsg all FDs, and stash them in a thread local FDContext
   context
 - invoke monitor command handler
   - Sees file=/dev/fd/N
   - Fetch /dev/fd/N from FDContext
   - If success remove /dev/fd/N from FDContext
 - close() all FDs left in FDContext

The key point with this is that because the FDs are directly
associated with a monitor command, QEMU can /guarantee/ that
FDs are never leaked, regardless of client behaviour.


Wouldn't this leak fds if libvirt crashed part way through sending
the set of fds?


No, because you've scoped the FDs to the current monitor instance,
and the current command being processed you know to close all FDs
when the associated command has finished, as well as closing them
all if you see EOF on the monitor socket while in the middle of
receiving a command.


Here is a quick proof of concept (ie untested) patch to demonstrate
what I mean. It relies on Cory's patch which converts everything
to use qemu_open. It is also still valuable to make the change
to qemu_open() to support /dev/fd/N for passing FDs during
QEMU initial startup for CLI args.

IMHO, what I propose here is preferrable for QMP clients that
our current plan of requiring use of 3 monitor commands (passfd,
X, closefd).


Thanks for the PoC.

Two other required updates that I can think of would be:

1) Update change, block_stream, block_reopen, snapshot_blkdev, and 
perhaps other monitor commands to support receiving fd's via SCM_RIGHTS.


2) Support re-opening files with different access modes (O_RDONLY, 
O_WRONLY, or O_RDWR).  This would be similar to the force option for 
pass-fd.


--
Regards,
Corey



 From 55a264b647d90a30c1cc00cb1896535246bcd9b7 Mon Sep 17 00:00:00 2001
From: Daniel P. Berrange berra...@redhat.com
Date: Tue, 26 Jun 2012 15:48:13 +0100
Subject: [PATCH] Support passing FDs with the monitor command that uses them

Add support for a syntax /dev/fdlist/N, where 'N' refers to
the N'th FD that was passed along with the currently processing
command in this thread.

The QMP client sends

 {
   execute: drive_add,
   arguments: {
filename: /dev/fdlist/0,
backingfilename: /dev/fdlist/1 },
   nfds: 2
 }

followed by nfds * SCM_RIGHTS packets

The FDs received along with the drive_add command are placed
in a thread-local list, then the QMP handler function is invoked.
The qemu_open() function will resolve any path /dev/fdlist/0
to use an FD stored in the  thread-local list, and dup() it.
One the QMP handler function is finished, all FDs in the thread
local list are closed.

THe reason for choice of /dev/fdlist/N, is to allow the
/dev/fd/N syntax to be used for FDs that are passed to QEMU
at startup time, which will be in the global FD number namespace,
not the thread-local index.

The reason for using a thread local is to ensure the FDs are
only made available to the currently executing monitor command
handler in this thread, and not any other threads.

The advantage over using a separate 'passfd' command, is that
this guarentees all FDs are closed in QEMU once the QMP command
handler is finished. There is no risk of leaks being triggered
by the client either intentionally, or via it crashing.

NB: this is a proof of concept demonstration of the overall
architectural design. The patch would need more work before
it was suitable for merging

  - Pick a better place/file for the fdlist APIs
  - Do proper error reporting in various places
  - Check for possibility that qemu_chr_fe_get_msgfd
can block or return EAGAIN.

diff --git a/monitor.c b/monitor.c
index f6107ba..a3a4bd4 100644
--- a/monitor.c
+++ b/monitor.c
@@ -4490,6 +4490,8 @@ static void handle_qmp_command(JSONMessageParser *parser, 
QList *tokens)
  const mon_cmd_t *cmd;
  const char *cmd_name;
  Monitor *mon = cur_mon;
+int nfds;
+size_t i;

  args = input = NULL;

@@ -4535,6 +4537,17 @@ static void handle_qmp_command(JSONMessageParser 
*parser, QList *tokens)
  goto err_out;
  }

+nfds = qdict_get_int(input, nfds);
+for (i = 0 ; i  nfds ; i++) {
+   int fd = qemu_chr_fe_get_msgfd(mon-chr);
+   if (fd == -1) {
+   qerror_report(QERR_FD_NOT_SUPPLIED);
+   goto err_out;
+   }
+
+   qemu_fdlist_add(fd);
+}

Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-06-26 Thread Corey Bryant



On 06/26/2012 11:37 AM, Corey Bryant wrote:



On 06/26/2012 11:03 AM, Daniel P. Berrange wrote:

On Tue, Jun 26, 2012 at 03:11:40PM +0100, Daniel P. Berrange wrote:

On Tue, Jun 26, 2012 at 09:52:51AM -0400, Corey Bryant wrote:

So now from a client's POV you'd have a flow like

* drive_add file=/dev/fd/N  FDSET={N}


IIUC then drive_add would loop and pass each fd in the set via
SCM_RIGHTS?


Yes, you'd probably use the JSON to tell QEMU exactly
how many FDs you're intending to pass with the command,
and then once the command is received, you'd have N*SCM_RIGHTS
messages sent/received.



And in QEMU you'd have something like

* handle_monitor_command
 - recvmsg all FDs, and stash them in a thread local
FDContext
   context
 - invoke monitor command handler
   - Sees file=/dev/fd/N
   - Fetch /dev/fd/N from FDContext
   - If success remove /dev/fd/N from FDContext
 - close() all FDs left in FDContext

The key point with this is that because the FDs are directly
associated with a monitor command, QEMU can /guarantee/ that
FDs are never leaked, regardless of client behaviour.


Wouldn't this leak fds if libvirt crashed part way through sending
the set of fds?


No, because you've scoped the FDs to the current monitor instance,
and the current command being processed you know to close all FDs
when the associated command has finished, as well as closing them
all if you see EOF on the monitor socket while in the middle of
receiving a command.


Here is a quick proof of concept (ie untested) patch to demonstrate
what I mean. It relies on Cory's patch which converts everything
to use qemu_open. It is also still valuable to make the change
to qemu_open() to support /dev/fd/N for passing FDs during
QEMU initial startup for CLI args.

IMHO, what I propose here is preferrable for QMP clients that
our current plan of requiring use of 3 monitor commands (passfd,
X, closefd).


Thanks for the PoC.

Two other required updates that I can think of would be:

1) Update change, block_stream, block_reopen, snapshot_blkdev, and
perhaps other monitor commands to support receiving fd's via SCM_RIGHTS.



Nevermind my comment.  I see that your PoC supports passing nfds for any 
QMP command.


I'm curious what Kevin's thoughts are on this and the overall approach.


2) Support re-opening files with different access modes (O_RDONLY,
O_WRONLY, or O_RDWR).  This would be similar to the force option for
pass-fd.



I'm still not quite sure how we'd go about this.  We need a way to 
determine the existing QEMU fd that is to be re-associated with the new 
fd, when using a /dev/fdlist/0 filename.  In this new approach, 0 
corresponds to an index, not an fd.  The prior approach of using the 
/dev/fd/nnn, where nnn corresponded to an actual QEMU fd, made this easy.






 From 55a264b647d90a30c1cc00cb1896535246bcd9b7 Mon Sep 17 00:00:00 2001
From: Daniel P. Berrange berra...@redhat.com
Date: Tue, 26 Jun 2012 15:48:13 +0100
Subject: [PATCH] Support passing FDs with the monitor command that
uses them

Add support for a syntax /dev/fdlist/N, where 'N' refers to
the N'th FD that was passed along with the currently processing
command in this thread.

The QMP client sends

 {
   execute: drive_add,
   arguments: {
filename: /dev/fdlist/0,
backingfilename: /dev/fdlist/1 },
   nfds: 2
 }

followed by nfds * SCM_RIGHTS packets

The FDs received along with the drive_add command are placed
in a thread-local list, then the QMP handler function is invoked.
The qemu_open() function will resolve any path /dev/fdlist/0
to use an FD stored in the  thread-local list, and dup() it.
One the QMP handler function is finished, all FDs in the thread
local list are closed.

THe reason for choice of /dev/fdlist/N, is to allow the
/dev/fd/N syntax to be used for FDs that are passed to QEMU
at startup time, which will be in the global FD number namespace,
not the thread-local index.

The reason for using a thread local is to ensure the FDs are
only made available to the currently executing monitor command
handler in this thread, and not any other threads.

The advantage over using a separate 'passfd' command, is that
this guarentees all FDs are closed in QEMU once the QMP command
handler is finished. There is no risk of leaks being triggered
by the client either intentionally, or via it crashing.

NB: this is a proof of concept demonstration of the overall
architectural design. The patch would need more work before
it was suitable for merging

  - Pick a better place/file for the fdlist APIs
  - Do proper error reporting in various places
  - Check for possibility that qemu_chr_fe_get_msgfd
can block or return EAGAIN.

diff --git a/monitor.c b/monitor.c
index f6107ba..a3a4bd4 100644
--- a/monitor.c
+++ b/monitor.c
@@ -4490,6 +4490,8 @@ static void handle_qmp_command(JSONMessageParser
*parser, QList *tokens)
 

Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-06-26 Thread Daniel P. Berrange
On Tue, Jun 26, 2012 at 02:40:03PM -0400, Corey Bryant wrote:
 
 
 On 06/26/2012 11:37 AM, Corey Bryant wrote:
 
 
 On 06/26/2012 11:03 AM, Daniel P. Berrange wrote:
 On Tue, Jun 26, 2012 at 03:11:40PM +0100, Daniel P. Berrange wrote:
 On Tue, Jun 26, 2012 at 09:52:51AM -0400, Corey Bryant wrote:
 So now from a client's POV you'd have a flow like
 
 * drive_add file=/dev/fd/N  FDSET={N}
 
 IIUC then drive_add would loop and pass each fd in the set via
 SCM_RIGHTS?
 
 Yes, you'd probably use the JSON to tell QEMU exactly
 how many FDs you're intending to pass with the command,
 and then once the command is received, you'd have N*SCM_RIGHTS
 messages sent/received.
 
 
 And in QEMU you'd have something like
 
 * handle_monitor_command
  - recvmsg all FDs, and stash them in a thread local
 FDContext
context
  - invoke monitor command handler
- Sees file=/dev/fd/N
- Fetch /dev/fd/N from FDContext
- If success remove /dev/fd/N from FDContext
  - close() all FDs left in FDContext
 
 The key point with this is that because the FDs are directly
 associated with a monitor command, QEMU can /guarantee/ that
 FDs are never leaked, regardless of client behaviour.
 
 Wouldn't this leak fds if libvirt crashed part way through sending
 the set of fds?
 
 No, because you've scoped the FDs to the current monitor instance,
 and the current command being processed you know to close all FDs
 when the associated command has finished, as well as closing them
 all if you see EOF on the monitor socket while in the middle of
 receiving a command.
 
 Here is a quick proof of concept (ie untested) patch to demonstrate
 what I mean. It relies on Cory's patch which converts everything
 to use qemu_open. It is also still valuable to make the change
 to qemu_open() to support /dev/fd/N for passing FDs during
 QEMU initial startup for CLI args.
 
 IMHO, what I propose here is preferrable for QMP clients that
 our current plan of requiring use of 3 monitor commands (passfd,
 X, closefd).
 
 Thanks for the PoC.
 
 Two other required updates that I can think of would be:
 
 1) Update change, block_stream, block_reopen, snapshot_blkdev, and
 perhaps other monitor commands to support receiving fd's via SCM_RIGHTS.
 
 
 Nevermind my comment.  I see that your PoC supports passing nfds for
 any QMP command.
 
 I'm curious what Kevin's thoughts are on this and the overall approach.
 
 2) Support re-opening files with different access modes (O_RDONLY,
 O_WRONLY, or O_RDWR).  This would be similar to the force option for
 pass-fd.
 
 
 I'm still not quite sure how we'd go about this.  We need a way to
 determine the existing QEMU fd that is to be re-associated with the
 new fd, when using a /dev/fdlist/0 filename.  In this new approach,
 0 corresponds to an index, not an fd.  The prior approach of using
 the /dev/fd/nnn, where nnn corresponded to an actual QEMU fd, made
 this easy.

Hmm, I'm not sure I understand what the use case is for the 'force'
parameter ? In my proof of concept I left out the  block of code
from qemu_open() you had for dup'ing the FD with various different
flags set, but that was just for brevity. I think it ought to fit
in the same way, unless you're refering to a different area of the
code.

 @@ -83,6 +158,26 @@ int qemu_open(const char *name, int flags, ...)
   {
   int ret;
   int mode = 0;
 +const char *p;
 +
 +/* Attempt dup of fd for pre-opened file */
 +if (strstart(name, /dev/fdlist/, p)) {
 +int idx;
 +int fd;
 +
 +idx = qemu_parse_fd(p);
 +if (idx == -1) {
 +return -1;
 +}
 +
 +fd = qemu_fdlist_dup(idx);
 +if (fd == -1) {
 +return -1;
 +}
 +
 +/* XXX verify flags match */

eg, this part of the code you had some work related to
'flags'

 +return fd;
 +}
 
   if (flags  O_CREAT) {
   va_list ap;

Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-06-26 Thread Luiz Capitulino
On Tue, 26 Jun 2012 13:45:52 +0200
Kevin Wolf kw...@redhat.com wrote:

 Am 26.06.2012 11:10, schrieb Daniel P. Berrange:
  I was thinking about some of the sources complexity when using
  FD passing from libvirt and wanted to raise one idea for discussion
  before we continue.
  
  With this proposed series, we have usage akin to:
  
1. pass_fd FDSET={M} - returns a string /dev/fd/N showing QEMU's
   view of the FD
2. drive_add file=/dev/fd/N
3. if failure:
 close_fd /dev/fd/N
 
 In fact, there are more steps:
 
 4. use it successfully
 5. close_fd /dev/fd/N
 
 I think it would well be possible that qemu just closes the fd when it's
 not used internally any more.

pass-fd could have a flag indicating qemu to do that.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-06-26 Thread Corey Bryant



On 06/26/2012 04:50 PM, Luiz Capitulino wrote:

On Tue, 26 Jun 2012 13:45:52 +0200
Kevin Wolf kw...@redhat.com wrote:


Am 26.06.2012 11:10, schrieb Daniel P. Berrange:

I was thinking about some of the sources complexity when using
FD passing from libvirt and wanted to raise one idea for discussion
before we continue.

With this proposed series, we have usage akin to:

   1. pass_fd FDSET={M} - returns a string /dev/fd/N showing QEMU's
  view of the FD
   2. drive_add file=/dev/fd/N
   3. if failure:
close_fd /dev/fd/N


In fact, there are more steps:

4. use it successfully
5. close_fd /dev/fd/N

I think it would well be possible that qemu just closes the fd when it's
not used internally any more.


pass-fd could have a flag indicating qemu to do that.



It seems like libvirt would be in a better position to understand when a 
file is no longer in use, and then it can call close_fd.  No?  Of course 
the the only fd that needs to be closed is the originally passed fd. 
The dup'd fd's are closed by QEMU.


--
Regards,
Corey


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd

2012-06-26 Thread Corey Bryant



On 06/26/2012 04:42 PM, Daniel P. Berrange wrote:

On Tue, Jun 26, 2012 at 02:40:03PM -0400, Corey Bryant wrote:



On 06/26/2012 11:37 AM, Corey Bryant wrote:



On 06/26/2012 11:03 AM, Daniel P. Berrange wrote:

On Tue, Jun 26, 2012 at 03:11:40PM +0100, Daniel P. Berrange wrote:

On Tue, Jun 26, 2012 at 09:52:51AM -0400, Corey Bryant wrote:

So now from a client's POV you'd have a flow like

* drive_add file=/dev/fd/N  FDSET={N}


IIUC then drive_add would loop and pass each fd in the set via
SCM_RIGHTS?


Yes, you'd probably use the JSON to tell QEMU exactly
how many FDs you're intending to pass with the command,
and then once the command is received, you'd have N*SCM_RIGHTS
messages sent/received.



And in QEMU you'd have something like

* handle_monitor_command
 - recvmsg all FDs, and stash them in a thread local
FDContext
   context
 - invoke monitor command handler
   - Sees file=/dev/fd/N
   - Fetch /dev/fd/N from FDContext
   - If success remove /dev/fd/N from FDContext
 - close() all FDs left in FDContext

The key point with this is that because the FDs are directly
associated with a monitor command, QEMU can /guarantee/ that
FDs are never leaked, regardless of client behaviour.


Wouldn't this leak fds if libvirt crashed part way through sending
the set of fds?


No, because you've scoped the FDs to the current monitor instance,
and the current command being processed you know to close all FDs
when the associated command has finished, as well as closing them
all if you see EOF on the monitor socket while in the middle of
receiving a command.


Here is a quick proof of concept (ie untested) patch to demonstrate
what I mean. It relies on Cory's patch which converts everything
to use qemu_open. It is also still valuable to make the change
to qemu_open() to support /dev/fd/N for passing FDs during
QEMU initial startup for CLI args.

IMHO, what I propose here is preferrable for QMP clients that
our current plan of requiring use of 3 monitor commands (passfd,
X, closefd).


Thanks for the PoC.

Two other required updates that I can think of would be:

1) Update change, block_stream, block_reopen, snapshot_blkdev, and
perhaps other monitor commands to support receiving fd's via SCM_RIGHTS.



Nevermind my comment.  I see that your PoC supports passing nfds for
any QMP command.

I'm curious what Kevin's thoughts are on this and the overall approach.


2) Support re-opening files with different access modes (O_RDONLY,
O_WRONLY, or O_RDWR).  This would be similar to the force option for
pass-fd.



I'm still not quite sure how we'd go about this.  We need a way to
determine the existing QEMU fd that is to be re-associated with the
new fd, when using a /dev/fdlist/0 filename.  In this new approach,
0 corresponds to an index, not an fd.  The prior approach of using
the /dev/fd/nnn, where nnn corresponded to an actual QEMU fd, made
this easy.


Hmm, I'm not sure I understand what the use case is for the 'force'
parameter ? In my proof of concept I left out the  block of code
from qemu_open() you had for dup'ing the FD with various different
flags set, but that was just for brevity. I think it ought to fit
in the same way, unless you're refering to a different area of the
code.



The access modes (O_RDONLY, O_WRONLY, or O_RDWR) can only be set on the 
open() call, which is performed by libvirt.  fcntl(F_SETFL) can't change 
them.  So the point of pass-fd's force option is to allow libvirt to 
open() the same file with a new access mode, pass it to qemu, and copy 
it to the existing qemu fd. For example:


fd1 = open(/path/to/file.img, O_RDONLY)
fd2 = open(/path/to/file.img, O_RDWR)
pass-fd disk0 (fd1) returns fd=3 (/dev/fd/3 in qemu has O_RDONLY)
drive_add file:/dev/fd/3
pass-fd disk0 (fd2) returns fd=3 (/dev/fd/3 in qemu now has O_RDWR)

This is required because the access_mode flags used by qemu_open() are 
the same as the passed fd's access_mode flags.  And qemu will re-open 
files with different access modes.



@@ -83,6 +158,26 @@ int qemu_open(const char *name, int flags, ...)

  {
  int ret;
  int mode = 0;
+const char *p;
+
+/* Attempt dup of fd for pre-opened file */
+if (strstart(name, /dev/fdlist/, p)) {
+int idx;
+int fd;
+
+idx = qemu_parse_fd(p);
+if (idx == -1) {
+return -1;
+}
+
+fd = qemu_fdlist_dup(idx);
+if (fd == -1) {
+return -1;
+}
+
+/* XXX verify flags match */


eg, this part of the code you had some work related to
'flags'



Yes, no problem and thanks again for the code. :)  I'm hoping we can get 
some more input soon so I can get moving in one direction or another.



+return fd;
+}

  if (flags  O_CREAT) {
  va_list ap;


Daniel



--
Regards,
Corey


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list