Re: KVM call agenda for Apr 27

2010-04-27 Thread Avi Kivity

On 04/27/2010 01:36 AM, Anthony Liguori wrote:


A few comments:

1) The problem was not block watermark itself but generating a 
notification on the watermark threshold.  It's a heuristic and should 
be implemented based on polling block stats. 


Polling for an event that never happens is bad engineering.  At what 
frequency do you poll?  You're forcing the user to make a lose-lose 
tradeoff.


Otherwise, we'll be adding tons of events to qemu that we'll struggle 
to maintain.


That's not a valid reason to reject a user requirement.  We may argue 
the requirement is bogus, or that the suggested implementation is wrong 
and point in a different direction, but saying that we may have to add 
more code in the future due to other requirements is ... well I can't 
find a word for it.




2) A block plugin doesn't solve the problem if it's just at the 
BlockDriverState level because it can't interact with qcow2.


Why not?  We have a layered model.  guest - qcow2 - plugin (sends 
event) - raw-posix.  Just need to insert the plugin at the appropriate 
layer.




3) For general block plugins, it's probably better to tackle userspace 
block devices.  We have CUSE and FUSE already, a BUSE is a logical 
conclusion.


We also have an nbd client.

Here's another option:  an nbd-like protocol that remotes all 
BlockDriver operations except read and write over a unix domain socket.  
The open operation returns an fd (SCM_RIGHTS strikes again) that is used 
for read and write.  This can be used to implement snapshots over LVM, 
for example.
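
As a rough illustration of the fd-passing step in the proposal above, here is a minimal sketch in C of handing an open descriptor across a unix domain socket with SCM_RIGHTS. The helper names send_fd and recv_fd are made up for this example and are not part of qemu or nbd; sendmsg(), recvmsg() and the CMSG_* macros are the standard socket interfaces. One byte of ordinary data is sent alongside the control message, since some systems will not deliver ancillary data on its own.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control buffer sized for exactly one int, aligned as struct cmsghdr. */
union fd_ctrl {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: send one open descriptor over a connected
 * unix domain socket. Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
    char dummy = 'F';   /* one byte of real data carried with the fd */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(fd >= 0);
    memset(&msg, 0, sizeof(msg));
    memset(&ctrl, 0, sizeof(ctrl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor sent by send_fd().
 * Returns the new descriptor, or -1 on error. */
int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    memset(&ctrl, 0, sizeof(ctrl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) <= 0) {
        return -1;
    }
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
        return -1;
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

In the scheme described above, the helper process would presumably call something like send_fd() in its reply to the open operation, and qemu would then read and write the received descriptor directly, so the data itself never crosses the control socket.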


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: KVM call agenda for Apr 27

2010-04-27 Thread Dor Laor

On 04/27/2010 11:14 AM, Avi Kivity wrote:

On 04/27/2010 01:36 AM, Anthony Liguori wrote:


A few comments:

1) The problem was not block watermark itself but generating a
notification on the watermark threshold. It's a heuristic and should
be implemented based on polling block stats.


Polling for an event that never happens is bad engineering. What
frequency do you poll? you're forcing the user to make a lose-lose
tradeoff.


Otherwise, we'll be adding tons of events to qemu that we'll struggle
to maintain.


That's not a valid reason to reject a user requirement. We may argue the
requirement is bogus, or that the suggested implementation is wrong and
point in a different direction, but saying that we may have to add more
code in the future due to other requirements is ... well I can't find a
word for it.



2) A block plugin doesn't solve the problem if it's just at the
BlockDriverState level because it can't interact with qcow2.


Why not? We have a layered model. guest - qcow2 - plugin (sends event)
- raw-posix. Just need to insert the plugin at the appropriate layer.



3) For general block plugins, it's probably better to tackle userspace
block devices. We have CUSE and FUSE already, a BUSE is a logical
conclusion.


We also have an nbd client.

Here's another option: an nbd-like protocol that remotes all BlockDriver
operations except read and write over a unix domain socket. The open
operation returns an fd (SCM_RIGHTS strikes again) that is used for read
and write. This can be used to implement snapshots over LVM, for example.



Why w/o read/writes? The watermark code needs them too (as info, not the 
actual buffer).


IMHO the whole thing is way over-engineered:
 a) Having another channel into qemu complicates management
software. Shouldn't the monitor be the channel? Otherwise we'll
need to create another QMP (or an nbd-like protocol, as Avi suggests)
for these actions. It's extra work for mgmt, and they will have a hard
time understanding how events interleave across the various channels.
 b) How are the plugins defined? Are they scripts? Binaries? Do they open
their own sockets?

So I suggest either sticking with qmp, or having a new block layer but 
letting qmp pass events from it - this is actually the nbd-like approach 
but with the qmp socket.


Thanks,
Dor


Re: [Qemu-devel] Re: KVM call agenda for Apr 27

2010-04-27 Thread Kevin Wolf
Am 27.04.2010 00:36, schrieb Anthony Liguori:
 On 04/26/2010 05:12 PM, Chris Wright wrote:
 * Anthony Liguori (anth...@codemonkey.ws) wrote:

 On 04/26/2010 12:26 PM, Chris Wright wrote:
  
 Please send in any agenda items you are interested in covering.

 While I don't expect it to be the case this week, if we have a
 lack of agenda items I'll cancel the week's call.

 - qemu management interface (and libvirt)
 - stable tree policy (push vs. pull and call for stable volunteers)
  
 block plug in (follow-on from qmp block watermark)

 
 A few comments:
 
 1) The problem was not block watermark itself but generating a 
 notification on the watermark threshold.  It's a heuristic and should be 
 implemented based on polling block stats.  Otherwise, we'll be adding 
 tons of events to qemu that we'll struggle to maintain.

Polling just feels completely wrong. You're almost guaranteed to poll in
the wrong intervals because depending on what the guest is doing you
might need it every couple of seconds (installation) or you may not need
it in days (just working on a fully allocated image).

 2) A block plugin doesn't solve the problem if it's just at the 
 BlockDriverState level because it can't interact with qcow2.

There's no interaction with qcow2 needed, it would just need to remember
the highest offset it was requested to read or write. But I'd really
hate to stick another protocol between qcow2 and file.

It doesn't solve the problem nicely though because it still needs to
generate some event which is not present in a normal qemu without the
plugin.
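
To make the "remember the highest offset" idea concrete, here is a small sketch in C, assuming a per-device tracker that is updated on every request passed down to the file layer. HighOffsetTracker and its functions are hypothetical names for illustration only, not existing qemu code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device state; not an existing qemu structure. */
typedef struct HighOffsetTracker {
    uint64_t highest_end;   /* one past the highest byte touched so far */
} HighOffsetTracker;

void tracker_init(HighOffsetTracker *t)
{
    assert(t != NULL);
    t->highest_end = 0;
}

/* Record a read or write request covering [offset, offset + len). */
void tracker_note_request(HighOffsetTracker *t, uint64_t offset, uint64_t len)
{
    uint64_t end;

    assert(t != NULL);
    end = offset + len;
    if (end < offset) {
        end = UINT64_MAX;   /* clamp if offset + len wrapped around */
    }
    if (end > t->highest_end) {
        t->highest_end = end;
    }
}
```

A stats query could then report highest_end, leaving it to the management tool to compare the value against whatever threshold it cares about.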

Kevin


Re: KVM call agenda for Apr 27

2010-04-27 Thread Avi Kivity

On 04/27/2010 11:48 AM, Dor Laor wrote:

Here's another option: an nbd-like protocol that remotes all BlockDriver
operations except read and write over a unix domain socket. The open
operation returns an fd (SCM_RIGHTS strikes again) that is used for read
and write. This can be used to implement snapshots over LVM, for 
example.





Why w/o read/writes? 


To avoid the copying.


the watermark code needs them too (as info, not the actual buffer).


Yeah.  It works for lvm snapshots, not for watermarks.



IMHO the whole thing is way over engineered:
 a) Having another channel into qemu is complicating management
software. Isn't the monitor should be the channel? Otherwise we'll
need to create another QMP (or nbd like Avi suggest) for these
actions. It's extra work for mgmt and they will have hard time to
understand events interleaving of the various channels


block layer plugins allow intercepting all interesting block layer 
events, not just write-past-a-watermark, and allow actions based on 
those events.  It's a more general solution.



 b) How the plugins are defined? Is it scripts? Binaries? Do they open
their own sockets?


Shared objects.


--
error compiling committee.c: too many arguments to function



Re: KVM call agenda for Apr 27

2010-04-27 Thread Dor Laor

On 04/27/2010 11:56 AM, Avi Kivity wrote:

On 04/27/2010 11:48 AM, Dor Laor wrote:

Here's another option: an nbd-like protocol that remotes all BlockDriver
operations except read and write over a unix domain socket. The open
operation returns an fd (SCM_RIGHTS strikes again) that is used for read
and write. This can be used to implement snapshots over LVM, for
example.




Why w/o read/writes?


To avoid the copying.


Of course, just pass the offset+len on read/write too




the watermark code needs them too (as info, not the actual buffer).


Yeah. It works for lvm snapshots, not for watermarks.



IMHO the whole thing is way over engineered:
a) Having another channel into qemu is complicating management
software. Isn't the monitor should be the channel? Otherwise we'll
need to create another QMP (or nbd like Avi suggest) for these
actions. It's extra work for mgmt and they will have hard time to
understand events interleaving of the various channels


block layer plugins allow intercepting all interesting block layer
events, not just write-past-a-watermark, and allow actions based on
those events. It's a more general solution.


No problem there, as long as we try to use the single existing QMP 
with the plugins. Otherwise we'll be creating a QMP2 for the block events 
a year from now.





b) How the plugins are defined? Is it scripts? Binaries? Do they open
their own sockets?


Shared objects.






Re: KVM call agenda for Apr 27

2010-04-27 Thread Kevin Wolf
Am 27.04.2010 10:56, schrieb Avi Kivity:
 On 04/27/2010 11:48 AM, Dor Laor wrote:
 Here's another option: an nbd-like protocol that remotes all BlockDriver
 operations except read and write over a unix domain socket. The open
 operation returns an fd (SCM_RIGHTS strikes again) that is used for read
 and write. This can be used to implement snapshots over LVM, for 
 example.



 Why w/o read/writes? 
 
 To avoid the copying.

Hm, stupid question: What problem does this nbd-like thing solve? What can
we do with it that we currently can't do inside qemu?

 the watermark code needs them too (as info, not the actual buffer).
 
 Yeah.  It works for lvm snapshots, not for watermarks.

So even if it solves anything, it doesn't solve the watermark problem.
At least I'm not sure how you would use LVM snapshots to dynamically
grow the volume on which a qcow2 image is stored.

 IMHO the whole thing is way over engineered:
  a) Having another channel into qemu is complicating management
 software. Isn't the monitor should be the channel? Otherwise we'll
 need to create another QMP (or nbd like Avi suggest) for these
 actions. It's extra work for mgmt and they will have hard time to
 understand events interleaving of the various channels

I agree. But if everyone insists on overengineering what about allowing
QMP clients to request an event for any change to a particular query-*
result? (Or rather a specific field of it, if the watermark is going to
be a blockstat.) Would need to be supported by the respective
implementation behind that query subcommand, but we can enable it one
after another without changes to the interface (except that more
parameter values start working).
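
A minimal sketch of the change-detection part of this idea, in C, assuming the watched field is a single integer (such as a block stat) that the implementation re-evaluates whenever it may have changed. FieldWatch and field_watch_update are hypothetical names; nothing like this exists in QMP as described in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subscription to one numeric field of a query-* result. */
typedef struct FieldWatch {
    int64_t last_value;   /* value most recently seen by the watcher */
    bool has_value;       /* false until the first sample is recorded */
} FieldWatch;

void field_watch_init(FieldWatch *w)
{
    assert(w != NULL);
    w->last_value = 0;
    w->has_value = false;
}

/* Feed the field's current value. Returns true when an event should be
 * emitted, i.e. the value differs from the last one seen. The first
 * sample only primes the watch and never emits. */
bool field_watch_update(FieldWatch *w, int64_t current)
{
    assert(w != NULL);
    if (!w->has_value) {
        w->last_value = current;
        w->has_value = true;
        return false;
    }
    if (current == w->last_value) {
        return false;
    }
    w->last_value = current;
    return true;
}
```

Each query implementation that supports the feature would call the update function at the point where its field changes, which is the "enable it one after another" part of the suggestion.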

If this is not overengineered enough yet, add a scripting engine to
allow specifying more complex conditions for event generation, depending
on multiple fields and so on. :-)

Kevin


Re: KVM call agenda for Apr 27

2010-04-27 Thread Avi Kivity

On 04/27/2010 12:08 PM, Dor Laor wrote:

On 04/27/2010 11:56 AM, Avi Kivity wrote:

On 04/27/2010 11:48 AM, Dor Laor wrote:
Here's another option: an nbd-like protocol that remotes all 
BlockDriver

operations except read and write over a unix domain socket. The open
operation returns an fd (SCM_RIGHTS strikes again) that is used for 
read

and write. This can be used to implement snapshots over LVM, for
example.




Why w/o read/writes?


To avoid the copying.


Of course, just pass the offset+len on read/write too


There will be a large performance impact.



IMHO the whole thing is way over engineered:
a) Having another channel into qemu is complicating management
software. Isn't the monitor should be the channel? Otherwise we'll
need to create another QMP (or nbd like Avi suggest) for these
actions. It's extra work for mgmt and they will have hard time to
understand events interleaving of the various channels


block layer plugins allow intercepting all interesting block layer
events, not just write-past-a-watermark, and allow actions based on
those events. It's a more general solution.


No problem there, as long as we do try to use the single existing QMP 
with the plugins. Otherwise we'll create QMP2 for the block events in 
a year from now.


I don't see how we can interleave messages from the plugin into the qmp 
stream without causing confusion.


--
error compiling committee.c: too many arguments to function



Re: KVM call agenda for Apr 27

2010-04-27 Thread Avi Kivity

On 04/27/2010 12:16 PM, Kevin Wolf wrote:

Am 27.04.2010 10:56, schrieb Avi Kivity:
   

On 04/27/2010 11:48 AM, Dor Laor wrote:
 

Here's another option: an nbd-like protocol that remotes all BlockDriver
operations except read and write over a unix domain socket. The open
operation returns an fd (SCM_RIGHTS strikes again) that is used for read
and write. This can be used to implement snapshots over LVM, for
example.

 


Why w/o read/writes?
   

To avoid the copying.
 

Hm, stupid question: What problem does this NFS thing solve? What can we
do with it that we currently can't do inside qemu?
   


For example, you can't create an lvm snapshot due to privilege problems.


the watermark code needs them too (as info, not the actual buffer).
   

Yeah.  It works for lvm snapshots, not for watermarks.
 

So even if it solves anything, it doesn't solve the watermark problem.
At least I'm not sure how you would use LVM snapshots to dynamically
grow the volume on which a qcow2 image is stored.
   


It's a separate issue.  Consider a fat-provisioned lvm volume: you can't 
snapshot it today from within qemu.


It doesn't solve the watermark problem (qcow2 on fat-provisioned 
backing); I didn't think it through.



IMHO the whole thing is way over engineered:
  a) Having another channel into qemu is complicating management
 software. Isn't the monitor should be the channel? Otherwise we'll
 need to create another QMP (or nbd like Avi suggest) for these
 actions. It's extra work for mgmt and they will have hard time to
 understand events interleaving of the various channels
   

I agree. But if everyone insists on overengineering what about allowing
QMP clients to request an event for any change to a particular query-*
result? (Or rather a specific field of it, if the watermark is going to
be a blockstat.) Would need to be supported by the respective
implementation behind that query subcommand, but we can enable it one
after another without changes to the interface (except that more
parameter values start working).

If this is not overengineered enough yet, add a scripting engine to
allow specifying more complex conditions for event generation, depending
on multiple fields and so on. :-)
   


Any simple question has a complicated answer...

--
error compiling committee.c: too many arguments to function



Re: KVM call agenda for Apr 27

2010-04-27 Thread Dor Laor

On 04/27/2010 12:22 PM, Avi Kivity wrote:

On 04/27/2010 12:08 PM, Dor Laor wrote:

On 04/27/2010 11:56 AM, Avi Kivity wrote:

On 04/27/2010 11:48 AM, Dor Laor wrote:

Here's another option: an nbd-like protocol that remotes all
BlockDriver
operations except read and write over a unix domain socket. The open
operation returns an fd (SCM_RIGHTS strikes again) that is used for
read
and write. This can be used to implement snapshots over LVM, for
example.




Why w/o read/writes?


To avoid the copying.


Of course, just pass the offset+len on read/write too


There will be a large performance impact.



IMHO the whole thing is way over engineered:
a) Having another channel into qemu is complicating management
software. Isn't the monitor should be the channel? Otherwise we'll
need to create another QMP (or nbd like Avi suggest) for these
actions. It's extra work for mgmt and they will have hard time to
understand events interleaving of the various channels


block layer plugins allow intercepting all interesting block layer
events, not just write-past-a-watermark, and allow actions based on
those events. It's a more general solution.


No problem there, as long as we do try to use the single existing QMP
with the plugins. Otherwise we'll create QMP2 for the block events in
a year from now.


I don't see how we can interleave messages from the plugin into the qmp
stream without causing confusion.


Those are QMP async events.

Since Kevin suggested adding even more events (was it cynical?), maybe we 
can use optional, opaque QMP block events that the plugin issues; they 
would travel over the standard QMP connection as async events to the 
interested mgmt app.

Once stabilized, each event can go into the official QMP protocol.




Re: KVM call agenda for Apr 27

2010-04-27 Thread Kevin Wolf
Am 27.04.2010 11:32, schrieb Dor Laor:
 On 04/27/2010 12:22 PM, Avi Kivity wrote:
 On 04/27/2010 12:08 PM, Dor Laor wrote:
 On 04/27/2010 11:56 AM, Avi Kivity wrote:
 On 04/27/2010 11:48 AM, Dor Laor wrote:
 IMHO the whole thing is way over engineered:
 a) Having another channel into qemu is complicating management
 software. Isn't the monitor should be the channel? Otherwise we'll
 need to create another QMP (or nbd like Avi suggest) for these
 actions. It's extra work for mgmt and they will have hard time to
 understand events interleaving of the various channels

 block layer plugins allow intercepting all interesting block layer
 events, not just write-past-a-watermark, and allow actions based on
 those events. It's a more general solution.

 No problem there, as long as we do try to use the single existing QMP
 with the plugins. Otherwise we'll create QMP2 for the block events in
 a year from now.

 I don't see how we can interleave messages from the plugin into the qmp
 stream without causing confusion.
 
 Those are QMP async events.
 
 Since Kevin suggested adding even more events (was is cynical?) 

The part about adding a scripting engine was.

The idea of adding a generic event (one event, not even more!) for a QMP
query-* result change doesn't sound that bad on second thought, though.
It's not specific for watermarks and looks less complicated than all the
plugin, NBD and QMP2 stuff.

It's almost the same as Anthony's polling suggestion (works with query-*
results from user perspective), just without polling.

Kevin


Re: KVM call agenda for Apr 27

2010-04-27 Thread Gleb Natapov
On Mon, Apr 26, 2010 at 05:36:52PM -0500, Anthony Liguori wrote:
 On 04/26/2010 05:12 PM, Chris Wright wrote:
 * Anthony Liguori (anth...@codemonkey.ws) wrote:
 On 04/26/2010 12:26 PM, Chris Wright wrote:
 Please send in any agenda items you are interested in covering.
 
 While I don't expect it to be the case this week, if we have a
 lack of agenda items I'll cancel the week's call.
 - qemu management interface (and libvirt)
 - stable tree policy (push vs. pull and call for stable volunteers)
 block plug in (follow-on from qmp block watermark)
 
 A few comments:
 
 1) The problem was not block watermark itself but generating a
 notification on the watermark threshold.  It's a heuristic and
 should be implemented based on polling block stats.  Otherwise,
 we'll be adding tons of events to qemu that we'll struggle to
 maintain.
 
Network cards have a "low number of rx/tx buffers" interrupt. This is also
a heuristic. Do you think drivers should poll for this event instead, and
that NIC designers just wasted their time designing the feature?

--
Gleb.


Re: KVM call agenda for Apr 27

2010-04-27 Thread Anthony Liguori

On 04/27/2010 06:11 AM, Gleb Natapov wrote:

Network cards have low number of rx/tx buffers interrupt. This is also
heuristic. Do you think driver should poll for this event instead and
NIC designers just wasted their time designing the feature?
   


I don't see how the two cases are at all similar.

More importantly, I don't see what the burden is of polling when you're 
talking about a very unusual statistic that has a very limited use case.


Regards,

Anthony Liguori


--
Gleb.
   




Re: KVM call agenda for Apr 27

2010-04-27 Thread Anthony Liguori

On 04/27/2010 03:14 AM, Avi Kivity wrote:

On 04/27/2010 01:36 AM, Anthony Liguori wrote:


A few comments:

1) The problem was not block watermark itself but generating a 
notification on the watermark threshold.  It's a heuristic and should 
be implemented based on polling block stats. 


Polling for an event that never happens is bad engineering.  What 
frequency do you poll?  you're forcing the user to make a lose-lose 
tradeoff.


Otherwise, we'll be adding tons of events to qemu that we'll struggle 
to maintain.


That's not a valid reason to reject a user requirement.  We may argue 
the requirement is bogus, or that the suggested implementation is 
wrong and point in a different direction, but saying that we may have 
to add more code in the future due to other requirements is ... well I 
can't find a word for it.


Polling is the best solution because it offers the most flexibility.  
Baking the heuristic into qemu just removes flexibility for all consumers.




2) A block plugin doesn't solve the problem if it's just at the 
BlockDriverState level because it can't interact with qcow2.


Why not?  We have a layered model.  guest - qcow2 - plugin (sends 
event) - raw-posix.  Just need to insert the plugin at the 
appropriate layer.


All of the qcow2 information is static to the qcow2 driver and I don't 
think changing that for plugins is a good idea.




3) For general block plugins, it's probably better to tackle 
userspace block devices.  We have CUSE and FUSE already, a BUSE is a 
logical conclusion.


We also have an nbd client.

Here's another option:  an nbd-like protocol that remotes all 
BlockDriver operations except read and write over a unix domain 
socket.  The open operation returns an fd (SCM_RIGHTS strikes again) 
that is used for read and write.  This can be used to implement 
snapshots over LVM, for example.


How does it address the watermark problem?

Regards,

Anthony Liguori


Re: KVM call agenda for Apr 27

2010-04-27 Thread Avi Kivity

On 04/27/2010 04:03 PM, Anthony Liguori wrote:

On 04/27/2010 03:14 AM, Avi Kivity wrote:

On 04/27/2010 01:36 AM, Anthony Liguori wrote:


A few comments:

1) The problem was not block watermark itself but generating a 
notification on the watermark threshold.  It's a heuristic and 
should be implemented based on polling block stats. 


Polling for an event that never happens is bad engineering.  What 
frequency do you poll?  you're forcing the user to make a lose-lose 
tradeoff.


Otherwise, we'll be adding tons of events to qemu that we'll 
struggle to maintain.


That's not a valid reason to reject a user requirement.  We may argue 
the requirement is bogus, or that the suggested implementation is 
wrong and point in a different direction, but saying that we may have 
to add more code in the future due to other requirements is ... well 
I can't find a word for it.


Polling is the best solution because it offers the most flexibility.  
Baking the heuristic into qemu just removes flexibility for all 
consumers.


Can you explain?  The ability to poll is not removed.  Nor are any 
heuristics performed by qemu; it just sends a notification when a write 
exceeds a user-defined threshold.  The distance from the threshold to 
the top-of-file is not known to qemu.






2) A block plugin doesn't solve the problem if it's just at the 
BlockDriverState level because it can't interact with qcow2.


Why not?  We have a layered model.  guest - qcow2 - plugin (sends 
event) - raw-posix.  Just need to insert the plugin at the 
appropriate layer.


All of the qcow2 information is static to the qcow2 driver and I don't 
think changing that for plugins is a good idea.


There is no information that the plugin needs to access:

  if (write_offset > watermark && watermark_enabled) {
      send_notification();
      watermark_enabled = false;
  }
  bdrv_write(backing_blockstate, ..., write_offset, ...);

Note write_offset is not the guest's offset, but the offset into the 
backing file.  It will trigger on guest writes and qcow2 metadata writes 
alike.
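
For readers who want something compilable, the check above can be fleshed out roughly as follows. This is only a sketch under assumptions: WatermarkFilter, watermark_arm, watermark_check_write and the counting callback are invented names, and the real plugin would still forward the write to the backing BlockDriverState afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical notification hook; in qemu this might raise a QMP event. */
typedef void (*WatermarkNotify)(uint64_t write_offset, void *opaque);

/* Hypothetical state of a filter layer sitting between qcow2 and the file. */
typedef struct WatermarkFilter {
    uint64_t watermark;     /* user-defined threshold, in backing-file bytes */
    bool enabled;           /* cleared after firing; the user must re-arm */
    WatermarkNotify notify;
    void *opaque;
} WatermarkFilter;

void watermark_arm(WatermarkFilter *f, uint64_t watermark,
                   WatermarkNotify notify, void *opaque)
{
    assert(f != NULL);
    f->watermark = watermark;
    f->enabled = true;
    f->notify = notify;
    f->opaque = opaque;
}

/* Call for every write headed to the backing file, before passing it down.
 * Returns true if this write triggered the notification. */
bool watermark_check_write(WatermarkFilter *f, uint64_t write_offset)
{
    assert(f != NULL);
    if (f->enabled && write_offset > f->watermark) {
        f->enabled = false;   /* one-shot: stay quiet until re-armed */
        if (f->notify != NULL) {
            f->notify(write_offset, f->opaque);
        }
        return true;
    }
    return false;
}

/* Example callback: counts notifications in an int supplied via opaque. */
void watermark_count_cb(uint64_t write_offset, void *opaque)
{
    (void)write_offset;
    (*(int *)opaque)++;
}
```

The one-shot flag mirrors watermark_enabled = false in the snippet above: once the threshold is crossed, later writes do not flood the management tool with repeated notifications.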






3) For general block plugins, it's probably better to tackle 
userspace block devices.  We have CUSE and FUSE already, a BUSE is a 
logical conclusion.


We also have an nbd client.

Here's another option:  an nbd-like protocol that remotes all 
BlockDriver operations except read and write over a unix domain 
socket.  The open operation returns an fd (SCM_RIGHTS strikes again) 
that is used for read and write.  This can be used to implement 
snapshots over LVM, for example.


How does it address the watermark problem?


It doesn't. Mea confusa.

--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] Re: KVM call agenda for Apr 27

2010-04-27 Thread Anthony Liguori

On 04/27/2010 03:53 AM, Kevin Wolf wrote:

Am 27.04.2010 00:36, schrieb Anthony Liguori:
   

On 04/26/2010 05:12 PM, Chris Wright wrote:
 

* Anthony Liguori (anth...@codemonkey.ws) wrote:

   

On 04/26/2010 12:26 PM, Chris Wright wrote:

 

Please send in any agenda items you are interested in covering.

While I don't expect it to be the case this week, if we have a
lack of agenda items I'll cancel the week's call.

   

- qemu management interface (and libvirt)
- stable tree policy (push vs. pull and call for stable volunteers)

 

block plug in (follow-on from qmp block watermark)

   

A few comments:

1) The problem was not block watermark itself but generating a
notification on the watermark threshold.  It's a heuristic and should be
implemented based on polling block stats.  Otherwise, we'll be adding
tons of events to qemu that we'll struggle to maintain.
 

Polling just feels completely wrong. You're almost guaranteed to poll in
the wrong intervals because depending on what the guest is doing you
might need it every couple of seconds (installation) or you may not need
it in days (just working on a fully allocated image).
   


The event basically boils down to: when some threshold is reached, raise 
an event.  What statistics go into this computation and what algorithm 
is used to compute the threshold depends on the management tool.


The memory ballooning daemon we're developing basically does nothing but 
this.  It monitors stats from the guest to generate events when we've 
reached a certain memory condition that will likely require some action 
(like ballooning).  Why would we do the block watermarking in qemu and 
not do the ballooning heuristics too?


I know of quite a few tools that all do some form of this.


2) A block plugin doesn't solve the problem if it's just at the
BlockDriverState level because it can't interact with qcow2.
 

There's no interaction with qcow2 needed, it would just need to remember
the highest offset it was requested to read or write. But I'd really
hate to stick another protocol between qcow2 and file.

It doesn't solve the problem nicely though because it still needs to
generate some event which is not present in a normal qemu without the
plugin.
   


Polling is really the right solution.  It gives the management tool 
ultimate flexibility in tweaking the heuristics as they see fit.


Regards,

Anthony Liguori


Kevin
   




Re: KVM call agenda for Apr 27

2010-04-27 Thread Daniel P. Berrange
On Tue, Apr 27, 2010 at 08:03:42AM -0500, Anthony Liguori wrote:
 On 04/27/2010 03:14 AM, Avi Kivity wrote:
 On 04/27/2010 01:36 AM, Anthony Liguori wrote:
 
 A few comments:
 
 1) The problem was not block watermark itself but generating a 
 notification on the watermark threshold.  It's a heuristic and should 
 be implemented based on polling block stats. 
 
 Polling for an event that never happens is bad engineering.  What 
 frequency do you poll?  you're forcing the user to make a lose-lose 
 tradeoff.
 
 Otherwise, we'll be adding tons of events to qemu that we'll struggle 
 to maintain.
 
 That's not a valid reason to reject a user requirement.  We may argue 
 the requirement is bogus, or that the suggested implementation is 
 wrong and point in a different direction, but saying that we may have 
 to add more code in the future due to other requirements is ... well I 
 can't find a word for it.
 
 Polling is the best solution because it offers the most flexibility.  
 Baking the heuristic into qemu just removes flexibility for all consumers.

Polling has the added advantage that you can recover better if the
app talking to QMP is offline for a period, e.g. if libvirt were 
disconnected from QMP at the time the high watermark event was
triggered, the next thing you'd know about is an ENOSPC event. If the app
were able to poll on the allocation value, then it could immediately
see the watermark had been passed the first time it polled after
libvirt reconnected to QMP. As you say, it's also more flexible because
you can invent a usage where you have 2 or 3 watermarks, where you
could try harder to get more space as you pass each watermark.
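
A sketch of the multi-watermark polling idea in C, assuming the management app obtains the current allocation value from a stats query on each poll (that part is not shown). WatermarkPoller and both functions are hypothetical names for illustration. Because the level is recomputed from the polled value, a client that missed several polls while disconnected sees every crossed watermark on its first poll after reconnecting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count how many of the ascending thresholds the allocation has passed. */
size_t watermarks_passed(const uint64_t *thresholds, size_t n,
                         uint64_t allocation)
{
    size_t passed = 0;

    assert(thresholds != NULL || n == 0);
    while (passed < n && allocation > thresholds[passed]) {
        passed++;
    }
    return passed;
}

/* Hypothetical client-side state kept between polls. */
typedef struct WatermarkPoller {
    size_t last_level;   /* watermarks already passed at the previous poll */
} WatermarkPoller;

/* Feed one polled allocation value. Returns how many watermarks were
 * newly crossed since the previous poll (0 if none). */
size_t poller_update(WatermarkPoller *p, const uint64_t *thresholds,
                     size_t n, uint64_t allocation)
{
    size_t level;
    size_t newly;

    assert(p != NULL);
    level = watermarks_passed(thresholds, n, allocation);
    newly = level > p->last_level ? level - p->last_level : 0;
    p->last_level = level;
    return newly;
}
```

The caller could escalate its response according to last_level, for example extending the volume by a larger amount once the second or third watermark has been passed.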


Daniel
-- 
|: Red Hat, Engineering, London-o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org-o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|


Re: KVM call agenda for Apr 27

2010-04-27 Thread Anthony Liguori

On 04/27/2010 04:41 AM, Kevin Wolf wrote:

Am 27.04.2010 11:32, schrieb Dor Laor:
   

On 04/27/2010 12:22 PM, Avi Kivity wrote:
 

On 04/27/2010 12:08 PM, Dor Laor wrote:
   

On 04/27/2010 11:56 AM, Avi Kivity wrote:
 

On 04/27/2010 11:48 AM, Dor Laor wrote:
   

IMHO the whole thing is way over-engineered:
a) Having another channel into qemu complicates management
software. Shouldn't the monitor be the channel? Otherwise we'll
need to create another QMP (or nbd, like Avi suggests) for these
actions. It's extra work for mgmt, and they will have a hard time
understanding how events interleave across the various channels.
 

block layer plugins allow intercepting all interesting block layer
events, not just write-past-a-watermark, and allow actions based on
those events. It's a more general solution.
   

No problem there, as long as we do try to use the single existing QMP
with the plugins. Otherwise we'll create QMP2 for the block events
a year from now.
 

I don't see how we can interleave messages from the plugin into the qmp
stream without causing confusion.
   

Those are QMP async events.

Since Kevin suggested adding even more events (was it cynical?)
 

The part about adding a scripting engine was.

The idea of adding a generic event (one event, not even more!) for a QMP
query-* result change doesn't sound that bad on second thought, though.
It's not specific to watermarks and looks less complicated than all the
plugin, NBD and QMP2 stuff.

It's almost the same as Anthony's polling suggestion (works with query-*
results from the user's perspective), just without polling.
   


Is this really necessary other than making people feel less bad about 
polling?


As I understand the use case, polling every five seconds would be 
completely reasonable.  It might not be as sexy as a generic event 
notification mechanism, but not everything can be JSON.
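
A minimal sketch of such a fixed-interval poll, in Python. The `read_value` callable is a hypothetical stand-in for whatever query returns the statistic; the five-second default mirrors the interval suggested above, and `max_polls` exists only so the loop can be bounded in tests.

```python
import time


def poll_until_above(read_value, threshold, interval=5.0,
                     sleep=time.sleep, max_polls=None):
    """Poll read_value() every `interval` seconds until it exceeds threshold.

    Returns the first value above the threshold, or None if max_polls
    polls complete without crossing it.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        value = read_value()
        if value > threshold:
            return value
        polls += 1
        sleep(interval)
    return None
```

The worst-case detection latency of this loop is one interval, which is the tradeoff being argued about in this thread.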


Regards,

Anthony Liguori


Kevin
   




Re: KVM call agenda for Apr 27

2010-04-27 Thread Gleb Natapov
On Tue, Apr 27, 2010 at 02:11:46PM +0100, Daniel P. Berrange wrote:
 On Tue, Apr 27, 2010 at 08:03:42AM -0500, Anthony Liguori wrote:
  On 04/27/2010 03:14 AM, Avi Kivity wrote:
  On 04/27/2010 01:36 AM, Anthony Liguori wrote:
  
  A few comments:
  
  1) The problem was not block watermark itself but generating a 
  notification on the watermark threshold.  It's a heuristic and should 
  be implemented based on polling block stats. 
  
  Polling for an event that never happens is bad engineering.  What 
  frequency do you poll?  you're forcing the user to make a lose-lose 
  tradeoff.
  
  Otherwise, we'll be adding tons of events to qemu that we'll struggle 
  to maintain.
  
  That's not a valid reason to reject a user requirement.  We may argue 
  the requirement is bogus, or that the suggested implementation is 
  wrong and point in a different direction, but saying that we may have 
  to add more code in the future due to other requirements is ... well I 
  can't find a word for it.
  
  Polling is the best solution because it offers the most flexibility.  
  Baking the heuristic into qemu just removes flexibility for all consumers.
 
 Polling has the added advantage that you can recover better if the
 app talking to QMP is offline for a period. E.g. if libvirt were
 disconnected from QMP at the time the high watermark event was
 triggered, the next thing you'd know is an ENOSPC event. If the app
 were able to poll on the allocation value, then it could immediately
 see the watermark had been passed the first time it polled after
 libvirt reconnected to QMP. As you say, it's also more flexible because
 you can invent a usage where you have 2 or 3 watermarks where you
 could try harder to get more space as you pass each watermark.
 
When libvirt reconnects, it should poll once and then wait for a
notification. If you want to have several watermarks, configure the
first one and, after getting the notification for it, configure the
second one, and so on.
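
The poll-once-then-wait scheme with sequentially armed watermarks could look roughly like this in Python. Everything here is hypothetical: the class, its method names, and the `arm` callback, which stands in for whatever command would configure a threshold in qemu.

```python
class SequentialWatermarks:
    """Arm one watermark at a time; arm the next after each notification."""

    def __init__(self, watermarks, arm):
        self.pending = sorted(watermarks)
        self.arm = arm  # hypothetical callback that configures one threshold

    def on_reconnect(self, allocation):
        # Poll once: retire every watermark already passed, arm the next one.
        passed = [w for w in self.pending if allocation >= w]
        self.pending = [w for w in self.pending if allocation < w]
        if self.pending:
            self.arm(self.pending[0])
        return passed

    def on_notification(self):
        # The armed watermark fired: retire it and arm the following one.
        fired = self.pending.pop(0)
        if self.pending:
            self.arm(self.pending[0])
        return fired
```

Only one threshold is ever configured at a time, so the emulator side needs no knowledge of how many watermarks the management tool uses.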

--
Gleb.


Re: [Qemu-devel] Re: KVM call agenda for Apr 27

2010-04-27 Thread Kevin Wolf
Am 27.04.2010 15:10, schrieb Anthony Liguori:
 On 04/27/2010 03:53 AM, Kevin Wolf wrote:
 Am 27.04.2010 00:36, schrieb Anthony Liguori:

 On 04/26/2010 05:12 PM, Chris Wright wrote:
  
 * Anthony Liguori (anth...@codemonkey.ws) wrote:


 On 04/26/2010 12:26 PM, Chris Wright wrote:

  
 Please send in any agenda items you are interested in covering.

 While I don't expect it to be the case this week, if we have a
 lack of agenda items I'll cancel the week's call.


 - qemu management interface (and libvirt)
 - stable tree policy (push vs. pull and call for stable volunteers)

  
 block plug in (follow-on from qmp block watermark)


 A few comments:

 1) The problem was not block watermark itself but generating a
 notification on the watermark threshold.  It's a heuristic and should be
 implemented based on polling block stats.  Otherwise, we'll be adding
 tons of events to qemu that we'll struggle to maintain.
  
 Polling just feels completely wrong. You're almost guaranteed to poll in
 the wrong intervals because depending on what the guest is doing you
 might need it every couple of seconds (installation) or you may not need
 it in days (just working on a fully allocated image).

 
 The event basically boils down to: when some threshold is reached, raise 
 an event.  What statistics go into this computation and what algorithm 
 is used to compute the threshold depends on the management tool.

The watermark is not some complex computed value, but actually the
statistic itself. We can get rid of handling a threshold in qemu by just
signalling something has changed with this stat.

I'm really not arguing that qemu should do anything complex or even
define policy. It's just about avoiding polling all the time when
nothing has changed and polling too late when things are changing quickly.

 Polling is really the right solution.  It gives the management tool 
 ultimate flexibility in tweaking the heuristics as they see fit.

Isn't providing this flexibility completely orthogonal to polling vs.
event-based?

Kevin


Re: KVM call agenda for Apr 27

2010-04-27 Thread Anthony Liguori

On 04/27/2010 08:05 AM, Gleb Natapov wrote:

On Tue, Apr 27, 2010 at 08:00:02AM -0500, Anthony Liguori wrote:
   

On 04/27/2010 06:11 AM, Gleb Natapov wrote:
 

Network cards have a low-number-of-rx/tx-buffers interrupt. This is also
a heuristic. Do you think the driver should poll for this event instead, and
that NIC designers just wasted their time designing the feature?
   

I don't see how the two cases are at all similar.

 

They are the same. They send a notification when a resource is low.

   

More importantly, I don't see what the burden is of polling when
you're talking about a very unusual statistic that has a very
limited use case.

 

Polling is the wrong answer. Always. The statistic is very common and has
a wide use case unless you have unlimited storage.
   


Every management tool does polling in some form.  They'll poll CPU 
stats, I/O stats, etc.


The typical use-case for overcommitting storage is using file backed 
images.  Using a dynamically growing LVM volume is almost certainly 
unique to RHEV-M.


Regards,

Anthony Liguori


--
Gleb.
   




Re: [Qemu-devel] Re: KVM call agenda for Apr 27

2010-04-27 Thread Anthony Liguori

On 04/27/2010 08:18 AM, Kevin Wolf wrote:


The watermark is not some complex computed value, but actually the
statistic itself. We can get rid of handling a threshold in qemu by just
signalling something has changed with this stat.

I'm really not arguing that qemu should do anything complex or even
define policy. It's just about avoiding polling all the time when
nothing has changed and polling too late when things are changing quickly.

   

Polling is really the right solution.  It gives the management tool
ultimate flexibility in tweaking the heuristics as they see fit.
 

Isn't providing this flexibility completely orthogonal to polling vs.
event-based?
   


Except then we need to offer a generic statistics mechanism which seems 
like it's going to add a fair bit of complexity.  So far, the only 
argument for it seems to be a misplaced notion that polling is evil.


Regards,

Anthony Liguori


Kevin
   




Re: KVM call agenda for Apr 27

2010-04-27 Thread Gleb Natapov
On Tue, Apr 27, 2010 at 08:19:06AM -0500, Anthony Liguori wrote:
 On 04/27/2010 08:05 AM, Gleb Natapov wrote:
 On Tue, Apr 27, 2010 at 08:00:02AM -0500, Anthony Liguori wrote:
 On 04/27/2010 06:11 AM, Gleb Natapov wrote:
 Network cards have a low-number-of-rx/tx-buffers interrupt. This is also
 a heuristic. Do you think the driver should poll for this event instead, and
 that NIC designers just wasted their time designing the feature?
 I don't see how the two cases are at all similar.
 
 They are the same. They send a notification when a resource is low.
 
 More importantly, I don't see what the burden is of polling when
 you're talking about a very unusual statistic that has a very
 limited use case.
 
 Polling is the wrong answer. Always. The statistic is very common and has
 a wide use case unless you have unlimited storage.
 
 Every management tool does polling in some form.  They'll poll CPU
 stats, I/O stats, etc.
 
When there is no other way to get a statistic, polling is unavoidable, so
management tools do that. But here you propose to force management to
poll for no good reason.

 The typical use-case for overcommitting storage is using file backed
 images.  Using a dynamically growing LVM volume is almost certainly
 unique to RHEV-M.
 

What's the difference? If storage is overcommitted management wants to
know ahead of time that it needs to extend storage space. It may do
that by polling storage/qemu or by getting notifications asynchronously.

--
Gleb.


Re: KVM call agenda for Apr 27

2010-04-27 Thread Daniel P. Berrange
On Tue, Apr 27, 2010 at 04:15:54PM +0300, Gleb Natapov wrote:
 On Tue, Apr 27, 2010 at 02:11:46PM +0100, Daniel P. Berrange wrote:
  On Tue, Apr 27, 2010 at 08:03:42AM -0500, Anthony Liguori wrote:
   On 04/27/2010 03:14 AM, Avi Kivity wrote:
   On 04/27/2010 01:36 AM, Anthony Liguori wrote:
   
   A few comments:
   
   1) The problem was not block watermark itself but generating a 
   notification on the watermark threshold.  It's a heuristic and should 
   be implemented based on polling block stats. 
   
   Polling for an event that never happens is bad engineering.  What 
   frequency do you poll?  you're forcing the user to make a lose-lose 
   tradeoff.
   
   Otherwise, we'll be adding tons of events to qemu that we'll struggle 
   to maintain.
   
   That's not a valid reason to reject a user requirement.  We may argue 
   the requirement is bogus, or that the suggested implementation is 
   wrong and point in a different direction, but saying that we may have 
   to add more code in the future due to other requirements is ... well I 
   can't find a word for it.
   
   Polling is the best solution because it offers the most flexibility.  
   Baking the heuristic into qemu just removes flexibility for all consumers.
  
  Polling has the added advantage that you can recover better if the
  app talking to QMP is offline for a period. E.g. if libvirt were
  disconnected from QMP at the time the high watermark event was
  triggered, the next thing you'd know is an ENOSPC event. If the app
  were able to poll on the allocation value, then it could immediately
  see the watermark had been passed the first time it polled after
  libvirt reconnected to QMP. As you say, it's also more flexible because
  you can invent a usage where you have 2 or 3 watermarks where you
  could try harder to get more space as you pass each watermark.
  
 When libvirt reconnects, it should poll once and then wait for a
 notification. If you want to have several watermarks, configure the
 first one and, after getting the notification for it, configure the
 second one, and so on.

So regardless of whether polling or events are 'best', we need to have the
pollable QMP command implemented to get rid of the potential for a missed
event for a watermark threshold that has already been passed. The same race
problem exists with updating the thresholds on the fly as one is passed.
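
One way to close that race is to arm the threshold first and poll immediately afterwards. The Python sketch below assumes an edge-triggered event and uses two hypothetical callables, `set_threshold` and `read_allocation`, in place of real QMP commands.

```python
def arm_threshold(set_threshold, read_allocation, threshold):
    """Arm an event threshold, then poll once to catch an already-passed value.

    Returns True if the allocation is already at or past the threshold, in
    which case the caller should act now rather than wait for an event.
    """
    # Arm before reading: a crossing that happens between the two steps is
    # then reported by the event, the poll, or both, but never by neither.
    set_threshold(threshold)
    return read_allocation() >= threshold
```

Reversing the two steps would reopen the window: a write landing after the poll but before the threshold is armed would go unnoticed.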

Daniel
-- 
|: Red Hat, Engineering, London-o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org-o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|


Re: [Qemu-devel] Re: KVM call agenda for Apr 27

2010-04-27 Thread Kevin Wolf
Am 27.04.2010 15:21, schrieb Anthony Liguori:
 On 04/27/2010 08:18 AM, Kevin Wolf wrote:

 The watermark is not some complex computed value, but actually the
 statistic itself. We can get rid of handling a threshold in qemu by just
 signalling something has changed with this stat.

 I'm really not arguing that qemu should do anything complex or even
 define policy. It's just about avoiding polling all the time when
 nothing has changed and polling too late when things are changing quickly.


 Polling is really the right solution.  It gives the management tool
 ultimate flexibility in tweaking the heuristics as they see fit.
  
 Isn't providing this flexibility completely orthogonal to polling vs.
 event-based?

 
 Except then we need to offer a generic statistics mechanism which seems 
 like it's going to add a fair bit of complexity.  So far, the only 
 argument for it seems to be a misplaced notion that polling is evil.

I'm not sure if "adding events is evil" is a much better position. :-)

The natural thing is really events here, because we want to get informed
every time something changes. Polling is a workaround for cases where
you can't get these events. So I think it's you who should explain why
polling is so much better than using events.

Note that IIUC the case here is different from the ballooning you
mentioned. The statistics for ballooning change all the time and you
don't want to get informed about changes but monitor the statistics all
the time, right? This is indeed a scenario where polling seems more
natural. But in contrast, the watermark usually doesn't change most of
the time and we want to know when changes happen.

Kevin


Re: [Qemu-devel] Re: KVM call agenda for Apr 27

2010-04-27 Thread Anthony Liguori

On 04/27/2010 08:42 AM, Kevin Wolf wrote:

Am 27.04.2010 15:21, schrieb Anthony Liguori:
   

On 04/27/2010 08:18 AM, Kevin Wolf wrote:
 

The watermark is not some complex computed value, but actually the
statistic itself. We can get rid of handling a threshold in qemu by just
signalling something has changed with this stat.

I'm really not arguing that qemu should do anything complex or even
define policy. It's just about avoiding polling all the time when
nothing has changed and polling too late when things are changing quickly.


   

Polling is really the right solution.  It gives the management tool
ultimate flexibility in tweaking the heuristics as they see fit.

 

Isn't providing this flexibility completely orthogonal to polling vs.
event-based?

   

Except then we need to offer a generic statistics mechanism which seems
like it's going to add a fair bit of complexity.  So far, the only
argument for it seems to be a misplaced notion that polling is evil.
 

I'm not sure if "adding events is evil" is a much better position. :-)

The natural thing is really events here, because we want to get informed
every time something changes.


You want to be informed every time something has changed and the value 
has met a boolean condition (value > threshold).



  Polling is a workaround for cases where
you can't get these events. So I think it's you who should explain why
polling is so much better than using events.
   


Polling gives a management tool more flexibility in implementing the 
evaluation condition.


Adding something to the protocol is a long term support statement.  We 
shouldn't add events unless we think they are events that we'll want to 
support in the long term.  If RHEV-M decides that instead of doing value >
threshold, they want to include additional information (maybe 
factoring in I/O rate), then a new event needs to be added and we're 
stuck supporting the old event forever.


Polling lets us avoid introducing new protocol operations as heuristics 
change.
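
As an illustration of a heuristic that goes beyond a plain threshold, the Python sketch below derives a time-to-full estimate from two polled samples. The function name and the (timestamp, allocation) sample format are assumptions made for this example, not anything defined by qemu.

```python
def seconds_until_full(prev, curr, capacity):
    """Estimate seconds until the volume fills from two polled samples.

    Each sample is a (timestamp_seconds, allocated_bytes) pair.
    """
    (t0, a0), (t1, a1) = prev, curr
    rate = (a1 - a0) / (t1 - t0)  # bytes per second between the two polls
    if rate <= 0:
        return float("inf")  # allocation is not growing
    return (capacity - a1) / rate


# 50 bytes written in 5 s gives 10 bytes/s, with 850 bytes of headroom left.
print(seconds_until_full((0, 100), (5, 150), 1000))  # 85.0
```

A management tool could swap this in for a fixed threshold without any change to the protocol, which is the flexibility argument made above.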



Note that IIUC the case here is different from the ballooning you
mentioned. The statistics for ballooning change all the time and you
don't want to get informed about changes but monitor the statistics all
the time, right?


No, generally speaking, you care about threshold conditions.  For 
instance, you want to know when guest reported free memory is < some 
percentage of total memory.  It's very similar really.



  This is indeed a scenario where polling seems more
natural. But in contrast, the watermark usually doesn't change most of
the time and we want to know when changes happen.
   


Regards,

Anthony Liguori


Kevin
   




Re: [Qemu-devel] Re: KVM call agenda for Apr 27

2010-04-27 Thread Kevin Wolf
Am 27.04.2010 15:48, schrieb Anthony Liguori:
 On 04/27/2010 08:42 AM, Kevin Wolf wrote:
 Am 27.04.2010 15:21, schrieb Anthony Liguori:

 On 04/27/2010 08:18 AM, Kevin Wolf wrote:
  
 The watermark is not some complex computed value, but actually the
 statistic itself. We can get rid of handling a threshold in qemu by just
 signalling something has changed with this stat.

 I'm really not arguing that qemu should do anything complex or even
 define policy. It's just about avoiding polling all the time when
 nothing has changed and polling too late when things are changing quickly.



 Polling is really the right solution.  It gives the management tool
 ultimate flexibility in tweaking the heuristics as they see fit.

  
 Isn't providing this flexibility completely orthogonal to polling vs.
 event-based?


 Except then we need to offer a generic statistics mechanism which seems
 like it's going to add a fair bit of complexity.  So far, the only
 argument for it seems to be a misplaced notion that polling is evil.
  
 I'm not sure if "adding events is evil" is a much better position. :-)

 The natural thing is really events here, because we want to get informed
 every time something changes.
 
 You want to be informed every time something has changed and the value 
 has met a boolean condition (value > threshold).
 
   Polling is a workaround for cases where
 you can't get these events. So I think it's you who should explain why
 polling is so much better than using events.

 
 Polling gives a management tool more flexibility in implementing the 
 evaluation condition.
 
 Adding something to the protocol is a long term support statement.  We 
 shouldn't add events unless we think they are events that we'll want to 
 support in the long term.  If RHEV-M decides that instead of doing value >
 threshold, they want to include additional information (maybe 
 factoring in I/O rate), then a new event needs to be added and we're 
 stuck supporting the old event forever.
 
 Polling lets us avoid introducing new protocol operations as heuristics 
 change.

This is what I meant with the flexibility being orthogonal to polling
vs. event-based. You're comparing apples and oranges.

You can either compare sending an event when value > threshold with
polling a boolean that says if this threshold is reached. Or you can
compare sending an event on changes with polling the absolute value. But
comparing sending an event on a threshold to polling the absolute value
mixes two different things.

 Note that IIUC the case here is different from the ballooning you
 mentioned. The statistics for ballooning change all the time and you
 don't want to get informed about changes but monitor the statistics all
 the time, right?
 
 No, generally speaking, you care about threshold conditions.  For 
 instance, you want to know when guest reported free memory is < some 
 percentage of total memory.  It's very similar really.

Hm, maybe implementing something generic with thresholds actually
wouldn't be a bad idea then.

But still you have the fact that it's changing all the time which is
very different from the watermark. The watermark only ever grows and
often doesn't change at all. So the watermark thing can live without
thresholds, it's enough to get informed about any changes.
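
A change-only notification for a value that never shrinks can be sketched in Python as follows. The `HighWatermark` class and its `notify` callback are hypothetical; the callback stands in for an asynchronous event sent to the management tool.

```python
class HighWatermark:
    """Track the highest written offset and report only when it grows."""

    def __init__(self, notify):
        self.value = 0
        self.notify = notify  # hypothetical stand-in for an async event

    def on_write(self, offset, length):
        end = offset + length
        if end > self.value:
            # Only writes that extend past the previous maximum are reported.
            self.value = end
            self.notify(end)
```

Writes inside the already allocated range produce no notification at all, so a fully allocated image stays silent.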

Kevin


Re: [Qemu-devel] Re: KVM call agenda for Apr 27

2010-04-27 Thread Anthony Liguori

On 04/27/2010 08:58 AM, Kevin Wolf wrote:

Am 27.04.2010 15:48, schrieb Anthony Liguori:
   

On 04/27/2010 08:42 AM, Kevin Wolf wrote:
 

Am 27.04.2010 15:21, schrieb Anthony Liguori:

   

On 04/27/2010 08:18 AM, Kevin Wolf wrote:

 

The watermark is not some complex computed value, but actually the
statistic itself. We can get rid of handling a threshold in qemu by just
signalling something has changed with this stat.

I'm really not arguing that qemu should do anything complex or even
define policy. It's just about avoiding polling all the time when
nothing has changed and polling too late when things are changing quickly.



   

Polling is really the right solution.  It gives the management tool
ultimate flexibility in tweaking the heuristics as they see fit.


 

Isn't providing this flexibility completely orthogonal to polling vs.
event-based?


   

Except then we need to offer a generic statistics mechanism which seems
like it's going to add a fair bit of complexity.  So far, the only
argument for it seems to be a misplaced notion that polling is evil.

 

I'm not sure if "adding events is evil" is a much better position. :-)

The natural thing is really events here, because we want to get informed
every time something changes.
   

You want to be informed every time something has changed and the value
has met a boolean condition (value > threshold).

 

   Polling is a workaround for cases where
you can't get these events. So I think it's you who should explain why
polling is so much better than using events.

   

Polling gives a management tool more flexibility in implementing the
evaluation condition.

Adding something to the protocol is a long term support statement.  We
shouldn't add events unless we think they are events that we'll want to
support in the long term.  If RHEV-M decides that instead of doing value >
threshold, they want to include additional information (maybe
factoring in I/O rate), then a new event needs to be added and we're
stuck supporting the old event forever.

Polling lets us avoid introducing new protocol operations as heuristics
change.
 

This is what I meant with the flexibility being orthogonal to polling
vs. event-based. You're comparing apples and oranges.

You can either compare sending an event when value > threshold with
polling a boolean that says if this threshold is reached. Or you can
compare sending an event on changes with polling the absolute value. But
comparing sending an event on a threshold to polling the absolute value
mixes two different things.
   


But is a statistic-change event really useful?  During a guest install, you'd 
get hundreds and hundreds of these events.



Note that IIUC the case here is different from the ballooning you
mentioned. The statistics for ballooning change all the time and you
don't want to get informed about changes but monitor the statistics all
the time, right?
   

No, generally speaking, you care about threshold conditions.  For
instance, you want to know when guest reported free memory is < some
percentage of total memory.  It's very similar really.
 

Hm, maybe implementing something generic with thresholds actually
wouldn't be a bad idea then.

But still you have the fact that it's changing all the time which is
very different from the watermark. The watermark only ever grows and
often doesn't change at all. So the watermark thing can live without
thresholds, it's enough to get informed about any changes.
   


It actually changes quite often early in its lifetime.

Regards,

Anthony Liguori


Kevin
   




Re: KVM call agenda for Apr 27

2010-04-27 Thread Gleb Natapov
On Tue, Apr 27, 2010 at 02:38:17PM +0100, Daniel P. Berrange wrote:
 On Tue, Apr 27, 2010 at 04:15:54PM +0300, Gleb Natapov wrote:
  On Tue, Apr 27, 2010 at 02:11:46PM +0100, Daniel P. Berrange wrote:
   On Tue, Apr 27, 2010 at 08:03:42AM -0500, Anthony Liguori wrote:
On 04/27/2010 03:14 AM, Avi Kivity wrote:
On 04/27/2010 01:36 AM, Anthony Liguori wrote:

A few comments:

1) The problem was not block watermark itself but generating a 
notification on the watermark threshold.  It's a heuristic and should 
be implemented based on polling block stats. 

Polling for an event that never happens is bad engineering.  What 
frequency do you poll?  you're forcing the user to make a lose-lose 
tradeoff.

Otherwise, we'll be adding tons of events to qemu that we'll struggle 
to maintain.

That's not a valid reason to reject a user requirement.  We may argue 
the requirement is bogus, or that the suggested implementation is 
wrong and point in a different direction, but saying that we may have 
to add more code in the future due to other requirements is ... well I 
can't find a word for it.

Polling is the best solution because it offers the most flexibility.  
Baking the heuristic into qemu just removes flexibility for all 
consumers.
   
   Polling has the added advantage that you can recover better if the
   app talking to QMP is offline for a period. E.g. if libvirt were
   disconnected from QMP at the time the high watermark event was
   triggered, the next thing you'd know is an ENOSPC event. If the app
   were able to poll on the allocation value, then it could immediately
   see the watermark had been passed the first time it polled after
   libvirt reconnected to QMP. As you say, it's also more flexible because
   you can invent a usage where you have 2 or 3 watermarks where you
   could try harder to get more space as you pass each watermark.
   
  When libvirt reconnects, it should poll once and then wait for a
  notification. If you want to have several watermarks, configure the
  first one and, after getting the notification for it, configure the
  second one, and so on.
 
 So regardless of whether polling or events are 'best', we need to have the
 pollable QMP command implemented to get rid of the potential for a missed
 event for a watermark threshold that has already been passed. The same race
 problem exists with updating the thresholds on the fly as one is passed.
 
Of course. A polling command is needed in any case.

--
Gleb.


KVM call agenda for Apr 27

2010-04-26 Thread Chris Wright
Please send in any agenda items you are interested in covering.

While I don't expect it to be the case this week, if we have a 
lack of agenda items I'll cancel the week's call.

thanks,
-chris


Re: KVM call agenda for Apr 27

2010-04-26 Thread Anthony Liguori

On 04/26/2010 12:26 PM, Chris Wright wrote:

Please send in any agenda items you are interested in covering.

While I don't expect it to be the case this week, if we have a
lack of agenda items I'll cancel the week's call.
   


- qemu management interface (and libvirt)
- stable tree policy (push vs. pull and call for stable volunteers)

Regards,

Anthony Liguori


thanks,
-chris
   




Re: KVM call agenda for Apr 27

2010-04-26 Thread Chris Wright
* Anthony Liguori (anth...@codemonkey.ws) wrote:
 On 04/26/2010 12:26 PM, Chris Wright wrote:
 Please send in any agenda items you are interested in covering.
 
 While I don't expect it to be the case this week, if we have a
 lack of agenda items I'll cancel the week's call.
 
 - qemu management interface (and libvirt)
 - stable tree policy (push vs. pull and call for stable volunteers)

block plug in (follow-on from qmp block watermark)


Re: KVM call agenda for Apr 27

2010-04-26 Thread Anthony Liguori

On 04/26/2010 05:12 PM, Chris Wright wrote:

* Anthony Liguori (anth...@codemonkey.ws) wrote:
   

On 04/26/2010 12:26 PM, Chris Wright wrote:
 

Please send in any agenda items you are interested in covering.

While I don't expect it to be the case this week, if we have a
lack of agenda items I'll cancel the week's call.
   

- qemu management interface (and libvirt)
- stable tree policy (push vs. pull and call for stable volunteers)
 

block plug in (follow-on from qmp block watermark)
   


A few comments:

1) The problem was not block watermark itself but generating a 
notification on the watermark threshold.  It's a heuristic and should be 
implemented based on polling block stats.  Otherwise, we'll be adding 
tons of events to qemu that we'll struggle to maintain.


2) A block plugin doesn't solve the problem if it's just at the 
BlockDriverState level because it can't interact with qcow2.


3) For general block plugins, it's probably better to tackle userspace 
block devices.  We have CUSE and FUSE already, a BUSE is a logical 
conclusion.


Regards,

Anthony Liguori



Re: [Qemu-devel] Re: KVM call agenda for Apr 27

2010-04-26 Thread Luiz Capitulino
On Mon, 26 Apr 2010 12:51:08 -0500
Anthony Liguori anth...@codemonkey.ws wrote:

 On 04/26/2010 12:26 PM, Chris Wright wrote:
  Please send in any agenda items you are interested in covering.
 
  While I don't expect it to be the case this week, if we have a
  lack of agenda items I'll cancel the week's call.
 
 
 - qemu management interface (and libvirt)
 - stable tree policy (push vs. pull and call for stable volunteers)

 What do you mean by push vs. pull?

 Anyway, Aurelien was working on a stable release last week, maybe
he's interested in helping with the stables (or not :)).


Re: [Qemu-devel] Re: KVM call agenda for Apr 27

2010-04-26 Thread Aurelien Jarno
On Mon, Apr 26, 2010 at 10:15:58PM -0300, Luiz Capitulino wrote:
 On Mon, 26 Apr 2010 12:51:08 -0500
 Anthony Liguori anth...@codemonkey.ws wrote:
 
  On 04/26/2010 12:26 PM, Chris Wright wrote:
   Please send in any agenda items you are interested in covering.
  
   While I don't expect it to be the case this week, if we have a
   lack of agenda items I'll cancel the week's call.
  
  
  - qemu management interface (and libvirt)
  - stable tree policy (push vs. pull and call for stable volunteers)
 
  What do you mean by push vs. pull?
 
  Anyway, Aurelien was working on a stable release last week, maybe
 he's interested in helping with the stables (or not :)).
 

I didn't find the time to do the stable release, but we should be very
close now.

I am interested in having stable releases, but if someone else wants to
work on that, that is fine with me.

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net