On Tue, Sep 14, 2010 at 11:41:55AM -0300, Eduardo Habkost wrote:
> On Tue, Sep 14, 2010 at 09:24:11AM -0500, Adam Litke wrote:
> > On Tue, 2010-09-14 at 11:09 -0300, Eduardo Habkost wrote:
> > > On Wed, Sep 08, 2010 at 09:21:16AM -0500, Adam Litke wrote:
> [...]
> > > >  static uint32_t virtio_balloon_get_features(VirtIODevice *vdev, 
> > > > uint32_t f)
> > > >  {
> > > > -    f |= (1 << VIRTIO_BALLOON_F_STATS_VQ);
> > > > +    /*
> > > > +     * The addition of memory stats reporting to the virtio balloon 
> > > > causes
> > > > +     * the 'info balloon' command to become asynchronous.  This is a 
> > > > regression
> > > > +     * because in some cases it can hang the user monitor.
> > > > +     *
> > > > +     * Disable this feature until a better interface for asynchronous 
> > > > commands
> > > > +     * can be worked out.
> > > > +     *
> > > > +     * -aglitke
> > > > +     */
> > > > +    /* f |= (1 << VIRTIO_BALLOON_F_STATS_VQ); */
> > > 
> > > 
> > > This field is guest-visible, won't this cause problems on migration?
> > 
> > I haven't followed migration very closely, but isn't this a common
> > problem whenever one migrates a vm to a newer version of qemu that has
> > more features.  I thought that virtio feature negotiation would ensure
> > that stats have been disabled at the device level and would remain
> > disabled post migration.  Please correct me if I am mistaken.
> 
> If this is the case, then it's another reason to keep the feature bit
> enabled: the feature bit just let the guest know that the host may
> request balloon stats at any moment, and it's better to keep that
> capability in case the guest is migrated, isn't it?
> 
> Also, what happens if we try to migrate from a qemu version that had the
> feature bit set to a qemu version without this feature bit?
> 
> I don't know the details either, but even if this works gracefully for
> migration in both directions, it sound much simpler to not change the
> guest-visible machine model and just change the "info balloon" behavior.

It looks worse than that: by disabling this feature you will break migration
from older qemu versions that had the old "info balloon" behavior.

This is the incoming virtio migration code:

int virtio_load(VirtIODevice *vdev, QEMUFile *f)
{
[...]
    uint32_t supported_features =
        vdev->binding->get_features(vdev->binding_opaque);
[...]
    qemu_get_8s(f, &vdev->status);
    qemu_get_8s(f, &vdev->isr);
    qemu_get_be16s(f, &vdev->queue_sel);
    qemu_get_be32s(f, &features);
    if (features & ~supported_features) {
        fprintf(stderr, "Features 0x%x unsupported. Allowed features: 0x%x\n",
                features, supported_features);
        return -1;
    }
[...]

To make things worse: virtio_net_load() doesn't check the return value of
virtio_load(), so it will probably crash in an unpredictable way.

I keep my suggestion: if all we need is to change the way "info balloon"
behaves, then we can simply change the "info balloon" behavior, intead
of changing the guest-visible machine model.

-- 
Eduardo

Reply via email to