Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'

2020-06-24 Thread David Hildenbrand



> On 24.06.2020 at 22:36, Michael S. Tsirkin wrote:
> 
> On Wed, Jun 24, 2020 at 06:01:02PM +0200, David Hildenbrand wrote:
>> On 24.06.20 17:37, Michael S. Tsirkin wrote:
>>> On Wed, Jun 24, 2020 at 05:28:59PM +0200, David Hildenbrand wrote:
>>>>> So at the high level the idea was simple, we just clear the dirty bit
>>>>> when page is hinted, unless we sent a new command since. Implementation
>>>>> was reviewed by migration maintainers. If there's a consensus the code
>>>>> is written so badly we can't maintain it, maybe we should remove it.
>>>>> Which parts are unmaintainable in your eyes - migration or virtio ones?
>>>>
>>>> QEMU implementation without a proper virtio specification. I hope that
>>>> we can *at least* finally document the expected behavior. Alex gave it a
>>>> shot, and I was hoping that Wei could jump in to clarify, help move this
>>>> forward ... after all he implemented (+designed?) the feature and the
>>>> virtio interface.
>>>>
>>>>> Or maybe it's the general thing that interface was never specced
>>>>> properly.
>>>>
>>>> Yes, a spec would be definitely a good starter ...
>>>>
>>>> [...]
>>>>
>>>>>>
>>>>>> 1. If migration fails during RAM precopy, the guest will never receive a
>>>>>> DONE notification. Probably easy to fix.
>>>>>>
>>>>>> 2. Unclear semantics. Alex tried to document what the actual semantics
>>>>>> of hinted pages are.
>>>>>
>>>>> I'll reply to that now.
>>>>>
>>>>>> Assume the following in the guest to a previously
>>>>>> hinted page
>>>>>>
>>>>>> /* page was hinted and is reused now */
>>>>>> if (page[x] != Y)
>>>>>>     page[x] = Y;
>>>>>> /* migration ends, we now run on the destination */
>>>>>> BUG_ON(page[x] != Y);
>>>>>> /* BUG, because the content chan
>>>>>
>>>>> The assumption hinting makes is that data in page is written to before
>>>>> it's used.
>>>>>
>>>>>
>>>>>> A guest can observe that. And that could be a random driver that just
>>>>>> allocated a page.
>>>>>>
>>>>>> (I *assume* in Linux we might catch that using kasan, but I am not 100%
>>>>>> sure, also, the actual semantics to document are unclear - e.g., for
>>>>>> other guests)
>>>>>
>>>>> I think it's basically simple: hinting means it's ok to
>>>>> fill page with trash unless it has been modified since the command
>>>>> ID supplied.
>>>>
>>>> Yeah, I quite dislike the semantics, especially, as they are different
>>>> to well-known semantics as, e.g., represented by MADV_FREE. Getting changed
>>>> content when reading is really weird. But it seemed to be easier to
>>>> implement (low hanging fruit) and nobody complained back then. Well, now
>>>> we are stuck with it.
>>>>
>>>> [..]
>>> 
>>> The difference with MADV_FREE is
>>> - asynchronous (using cmd id to synchronize)
>>> - zero not guaranteed
>>> 
>>> right?
>> 
>> *looking into man page*, yes, when reading you either get the old
>> content or zero.
>> 
>> (I remember that a re-read also makes the content stable, but looks like
>> you really have to write to a page)
>> 
>> We should most probably do what Alex suggested and initialize pages (at
>> least write a single byte) when leaking them from the shrinker in the
>> guest while hinting is active, such that the content is stable for
>> anybody to allocate and reuse a page.
> 
> Drivers ignore old content from slab though, so I don't really see
> the point.
> 

That's what we're hoping for and what we would expect. Maybe we should just
live with that assumption and hope for the best ...

>> -- 
>> Thanks,
>> 
>> David / dhildenb
> 




Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'

2020-06-24 Thread Michael S. Tsirkin
On Wed, Jun 24, 2020 at 06:01:02PM +0200, David Hildenbrand wrote:
> On 24.06.20 17:37, Michael S. Tsirkin wrote:
> > On Wed, Jun 24, 2020 at 05:28:59PM +0200, David Hildenbrand wrote:
> >>> So at the high level the idea was simple, we just clear the dirty bit
> >>> when page is hinted, unless we sent a new command since. Implementation
> >>> was reviewed by migration maintainers. If there's a consensus the code
> >>> is written so badly we can't maintain it, maybe we should remove it.
> >>> Which parts are unmaintainable in your eyes - migration or virtio ones?
> >>
> >> QEMU implementation without a proper virtio specification. I hope that
> >> we can *at least* finally document the expected behavior. Alex gave it a
> >> shot, and I was hoping that Wei could jump in to clarify, help move this
> >> forward ... after all he implemented (+designed?) the feature and the
> >> virtio interface.
> >>
> >>> Or maybe it's the general thing that interface was never specced
> >>> properly.
> >>
> >> Yes, a spec would be definitely a good starter ...
> >>
> >> [...]
> >>
> 
> >>>> 1. If migration fails during RAM precopy, the guest will never receive a
> >>>> DONE notification. Probably easy to fix.
> >>>>
> >>>> 2. Unclear semantics. Alex tried to document what the actual semantics
> >>>> of hinted pages are.
> >>>
> >>> I'll reply to that now.
> >>>
> >>>> Assume the following in the guest to a previously
> >>>> hinted page
> >>>>
> >>>> /* page was hinted and is reused now */
> >>>> if (page[x] != Y)
> >>>>     page[x] = Y;
> >>>> /* migration ends, we now run on the destination */
> >>>> BUG_ON(page[x] != Y);
> >>>> /* BUG, because the content chan
> >>>
> >>> The assumption hinting makes is that data in page is written to before
> >>> it's used.
> >>>
> >>>
> >>>> A guest can observe that. And that could be a random driver that just
> >>>> allocated a page.
> >>>>
> >>>> (I *assume* in Linux we might catch that using kasan, but I am not 100%
> >>>> sure, also, the actual semantics to document are unclear - e.g., for
> >>>> other guests)
> >>>
> >>> I think it's basically simple: hinting means it's ok to
> >>> fill page with trash unless it has been modified since the command
> >>> ID supplied.
> >>
> >> Yeah, I quite dislike the semantics, especially, as they are different
> >> to well-known semantics as, e.g., represented by MADV_FREE. Getting changed
> >> content when reading is really weird. But it seemed to be easier to
> >> implement (low hanging fruit) and nobody complained back then. Well, now
> >> we are stuck with it.
> >>
> >> [..]
> > 
> > The difference with MADV_FREE is
> > - asynchronous (using cmd id to synchronize)
> > - zero not guaranteed
> > 
> > right?
> 
> *looking into man page*, yes, when reading you either get the old
> content or zero.
> 
> (I remember that a re-read also makes the content stable, but looks like
> you really have to write to a page)
> 
> We should most probably do what Alex suggested and initialize pages (at
> least write a single byte) when leaking them from the shrinker in the
> guest while hinting is active, such that the content is stable for
> anybody to allocate and reuse a page.

Drivers ignore old content from slab though, so I don't really see
the point.

> -- 
> Thanks,
> 
> David / dhildenb




Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'

2020-06-24 Thread David Hildenbrand
On 24.06.20 17:37, Michael S. Tsirkin wrote:
> On Wed, Jun 24, 2020 at 05:28:59PM +0200, David Hildenbrand wrote:
>>> So at the high level the idea was simple, we just clear the dirty bit
>>> when page is hinted, unless we sent a new command since. Implementation
>>> was reviewed by migration maintainers. If there's a consensus the code
>>> is written so badly we can't maintain it, maybe we should remove it.
>>> Which parts are unmaintainable in your eyes - migration or virtio ones?
>>
>> QEMU implementation without a proper virtio specification. I hope that
>> we can *at least* finally document the expected behavior. Alex gave it a
>> shot, and I was hoping that Wei could jump in to clarify, help move this
>> forward ... after all he implemented (+designed?) the feature and the
>> virtio interface.
>>
>>> Or maybe it's the general thing that interface was never specced
>>> properly.
>>
>> Yes, a spec would be definitely a good starter ...
>>
>> [...]
>>

>>>> 1. If migration fails during RAM precopy, the guest will never receive a
>>>> DONE notification. Probably easy to fix.
>>>>
>>>> 2. Unclear semantics. Alex tried to document what the actual semantics
>>>> of hinted pages are.
>>>
>>> I'll reply to that now.
>>>
>>>> Assume the following in the guest to a previously
>>>> hinted page
>>>>
>>>> /* page was hinted and is reused now */
>>>> if (page[x] != Y)
>>>>     page[x] = Y;
>>>> /* migration ends, we now run on the destination */
>>>> BUG_ON(page[x] != Y);
>>>> /* BUG, because the content chan
>>>
>>> The assumption hinting makes is that data in page is written to before
>>> it's used.
>>>
>>>
>>>> A guest can observe that. And that could be a random driver that just
>>>> allocated a page.
>>>>
>>>> (I *assume* in Linux we might catch that using kasan, but I am not 100%
>>>> sure, also, the actual semantics to document are unclear - e.g., for
>>>> other guests)
>>>
>>> I think it's basically simple: hinting means it's ok to
>>> fill page with trash unless it has been modified since the command
>>> ID supplied.
>>
>> Yeah, I quite dislike the semantics, especially, as they are different
>> to well-known semantics as, e.g., represented by MADV_FREE. Getting changed
>> content when reading is really weird. But it seemed to be easier to
>> implement (low hanging fruit) and nobody complained back then. Well, now
>> we are stuck with it.
>>
>> [..]
> 
> The difference with MADV_FREE is
> - asynchronous (using cmd id to synchronize)
> - zero not guaranteed
> 
> right?

*looking into man page*, yes, when reading you either get the old
content or zero.

(I remember that a re-read also makes the content stable, but looks like
you really have to write to a page)
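
For reference, the MADV_FREE behaviour described in the man page can be probed with a small userspace test. This is only a sketch: `madv_free_demo()` is a name made up for this example, it assumes a Linux-like system where `MADV_FREE` and `MAP_ANONYMOUS` are available, and it reports -1 instead of failing when they are not. Without memory pressure the read will usually return the old content, so it cannot show the "zero" case on demand.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MADV_FREE and MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Returns 1 if the observed behaviour matches the man page description,
 * 0 if it does not, and -1 if MADV_FREE cannot be used on this system.
 */
static int madv_free_demo(void)
{
#if defined(MADV_FREE) && defined(MAP_ANONYMOUS)
    long psz = sysconf(_SC_PAGESIZE);
    if (psz <= 0) {
        return -1;
    }
    size_t len = (size_t)psz;
    volatile unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return -1;
    }

    int ok = -1;
    p[0] = 0x5a; /* give the page some "old" content */
    if (madvise((void *)p, len, MADV_FREE) == 0) {
        /* A read returns either the old content or zero. */
        unsigned char seen = p[0];
        ok = (seen == 0x5a || seen == 0);
        /* A write cancels the pending free, so this value must stick. */
        p[0] = 0x7e;
        ok = ok && (p[0] == 0x7e);
    }

    int rc = munmap((void *)p, len);
    assert(rc == 0);
    (void)rc;
    return ok;
#else
    return -1;
#endif
}
```

Calling `madv_free_demo()` and checking for a non-zero result is enough to confirm the "old content or zero, stable after a write" rule on a system that supports the flag.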

We should most probably do what Alex suggested and initialize pages (at
least write a single byte) when leaking them from the shrinker in the
guest while hinting is active, such that the content is stable for
anybody to allocate and reuse a page.

-- 
Thanks,

David / dhildenb




Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'

2020-06-24 Thread Michael S. Tsirkin
On Wed, Jun 24, 2020 at 05:28:59PM +0200, David Hildenbrand wrote:
> > So at the high level the idea was simple, we just clear the dirty bit
> > when page is hinted, unless we sent a new command since. Implementation
> > was reviewed by migration maintainers. If there's a consensus the code
> > is written so badly we can't maintain it, maybe we should remove it.
> > Which parts are unmaintainable in your eyes - migration or virtio ones?
> 
> QEMU implementation without a proper virtio specification. I hope that
> we can *at least* finally document the expected behavior. Alex gave it a
> shot, and I was hoping that Wei could jump in to clarify, help move this
> forward ... after all he implemented (+designed?) the feature and the
> virtio interface.
> 
> > Or maybe it's the general thing that interface was never specced
> > properly.
> 
> Yes, a spec would be definitely a good starter ...
> 
> [...]
> 
> >>
> >> 1. If migration fails during RAM precopy, the guest will never receive a
> >> DONE notification. Probably easy to fix.
> >>
> >> 2. Unclear semantics. Alex tried to document what the actual semantics
> >> of hinted pages are.
> > 
> > I'll reply to that now.
> > 
> >> Assume the following in the guest to a previously
> >> hinted page
> >>
> >> /* page was hinted and is reused now */
> >> if (page[x] != Y)
> >>     page[x] = Y;
> >> /* migration ends, we now run on the destination */
> >> BUG_ON(page[x] != Y);
> >> /* BUG, because the content chan
> > 
> > The assumption hinting makes is that data in page is written to before
> > it's used.
> > 
> > 
> >> A guest can observe that. And that could be a random driver that just
> >> allocated a page.
> >>
> >> (I *assume* in Linux we might catch that using kasan, but I am not 100%
> >> sure, also, the actual semantics to document are unclear - e.g., for
> >> other guests)
> > 
> > I think it's basically simple: hinting means it's ok to
> > fill page with trash unless it has been modified since the command
> > ID supplied.
> 
> Yeah, I quite dislike the semantics, especially, as they are different
> to well-known semantics as, e.g., represented by MADV_FREE. Getting changed
> content when reading is really weird. But it seemed to be easier to
> implement (low hanging fruit) and nobody complained back then. Well, now
> we are stuck with it.
> 
> [..]

The difference with MADV_FREE is
- asynchronous (using cmd id to synchronize)
- zero not guaranteed

right?

> > 
> >> There are other concerns I had regarding the iothread (e.g., while
> >> reporting is active, virtio_ballloon_get_free_page_hints() is
> >> essentially a busy loop, in contrast to documented -
> >> continue_to_get_hints will always be true).
> > 
> > So that would be a performance issue you are suggesting, right?
> 
> I misread the code, so that comment no longer applies (see other
> message).
> 
> > 
> >>> The appeal of hinting is that it's 0 overhead outside migration,
> >>> and pains were taken to avoid keeping pages locked while
> >>> hypervisor is busy.
> >>>
> >>> If we are to drop hinting completely we need to show that reporting
> >>> can be comparable, and we'll probably want to add a mode for
> >>> reporting that behaves somewhat similarly.
> >>
> >> Depends on the actual users. If we're dropping a feature that nobody is
> >> actively using, I don't think we have to show anything.
> > 
> > 
> > I don't know how to find out. So far it doesn't look like we found
> > any common data corruptions that would indicate no one can use it safely.
> > Races around reset aren't all that uncommon but I don't think that
> > qualifies as a deal breaker.
> 
> As I said, there are no libvirt bindings, so at least anything using
> libvirt does not use it. I'd be curious about actual users.
> 
> > 
> > I find the idea of asynchronously sending hints to host without
> > waiting for them to be processed intriguing. Not something
> > I'd work on implementing if we had reporting originally,
> > but since it's there I'm not sure we should just discard it
> > at this point.
> > 
> >> This feature obviously saw no proper review.
> > 
> > I did my best but obviously missed some things.
> 
> Yeah, definitely not your fault. People cannot expect maintainers to
> review everything in detail.
> 
> -- 
> Thanks,
> 
> David / dhildenb




Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'

2020-06-24 Thread David Hildenbrand
> So at the high level the idea was simple, we just clear the dirty bit
> when page is hinted, unless we sent a new command since. Implementation
> was reviewed by migration maintainers. If there's a consensus the code
> is written so badly we can't maintain it, maybe we should remove it.
> Which parts are unmaintainable in your eyes - migration or virtio ones?

QEMU implementation without a proper virtio specification. I hope that
we can *at least* finally document the expected behavior. Alex gave it a
shot, and I was hoping that Wei could jump in to clarify, help move this
forward ... after all he implemented (+designed?) the feature and the
virtio interface.

> Or maybe it's the general thing that interface was never specced
> properly.

Yes, a spec would be definitely a good starter ...

[...]

>>
>> 1. If migration fails during RAM precopy, the guest will never receive a
>> DONE notification. Probably easy to fix.
>>
>> 2. Unclear semantics. Alex tried to document what the actual semantics
>> of hinted pages are.
> 
> I'll reply to that now.
> 
>> Assume the following in the guest to a previously
>> hinted page
>>
>> /* page was hinted and is reused now */
>> if (page[x] != Y)
>>     page[x] = Y;
>> /* migration ends, we now run on the destination */
>> BUG_ON(page[x] != Y);
>> /* BUG, because the content chan
> 
> The assumption hinting makes is that data in page is written to before it's
> used.
> 
> 
>> A guest can observe that. And that could be a random driver that just
>> allocated a page.
>>
>> (I *assume* in Linux we might catch that using kasan, but I am not 100%
>> sure, also, the actual semantics to document are unclear - e.g., for
>> other guests)
> 
> I think it's basically simple: hinting means it's ok to
> fill page with trash unless it has been modified since the command
> ID supplied.

Yeah, I quite dislike the semantics, especially, as they are different
to well-known semantics as, e.g., represented by MADV_FREE. Getting changed
content when reading is really weird. But it seemed to be easier to
implement (low hanging fruit) and nobody complained back then. Well, now
we are stuck with it.

[..]

> 
>> There are other concerns I had regarding the iothread (e.g., while
>> reporting is active, virtio_ballloon_get_free_page_hints() is
>> essentially a busy loop, in contrast to documented -
>> continue_to_get_hints will always be true).
> 
> So that would be a performance issue you are suggesting, right?

I misread the code, so that comment no longer applies (see other
message).

> 
>>> The appeal of hinting is that it's 0 overhead outside migration,
>>> and pains were taken to avoid keeping pages locked while
>>> hypervisor is busy.
>>>
>>> If we are to drop hinting completely we need to show that reporting
>>> can be comparable, and we'll probably want to add a mode for
>>> reporting that behaves somewhat similarly.
>>
>> Depends on the actual users. If we're dropping a feature that nobody is
>> actively using, I don't think we have to show anything.
> 
> 
> I don't know how to find out. So far it doesn't look like we found
> any common data corruptions that would indicate no one can use it safely.
> Races around reset aren't all that uncommon but I don't think that
> qualifies as a deal breaker.

As I said, there are no libvirt bindings, so at least anything using
libvirt does not use it. I'd be curious about actual users.

> 
> I find the idea of asynchronously sending hints to host without
> waiting for them to be processed intriguing. Not something
> I'd work on implementing if we had reporting originally,
> but since it's there I'm not sure we should just discard it
> at this point.
> 
>> This feature obviously saw no proper review.
> 
> I did my best but obviously missed some things.

Yeah, definitely not your fault. People cannot expect maintainers to
review everything in detail.

-- 
Thanks,

David / dhildenb




Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'

2020-06-24 Thread Michael S. Tsirkin
On Thu, Jun 18, 2020 at 07:10:43PM +0200, David Hildenbrand wrote:
> >>
> >> Ugh, ...
> >>
> >> @MST, you might have missed that in another discussion, what's your
> >> general opinion about removing free page hinting in QEMU (and Linux)? We
> >> keep finding issues in the QEMU implementation, including non-trivial
> >> ones, and have to speculate about the actual semantics. I can see that
> >> e.g., libvirt does not support it yet.
> > 
> > Not maintaining two similar features sounds attractive.
> 
> I consider free page hinting (in QEMU) to be in an unmaintainable state
> (and it looks like Alex and I are fixing a feature we don't actually
> intend to use / not aware of users). In contrast to that, the free page
> reporting functionality/implementation is a walk in the park.

So at the high level the idea was simple, we just clear the dirty bit
when page is hinted, unless we sent a new command since. Implementation
was reviewed by migration maintainers. If there's a consensus the code
is written so badly we can't maintain it, maybe we should remove it.
Which parts are unmaintainable in your eyes - migration or virtio ones?
Or maybe it's the general thing that interface was never specced
properly.
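
As a rough illustration of that rule, here is a minimal C sketch of a dirty bitmap that honours a hint only when it carries the current command ID. It is not QEMU code: `struct hint_model`, `model_init()`, `model_new_command()`, `model_guest_write()` and `model_hint()` are invented names, and the 8-page bitmap is an arbitrary size picked for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NPAGES 8 /* arbitrary size for the example */

/* Hypothetical model of migration dirty tracking; not QEMU's structures. */
struct hint_model {
    bool dirty[NPAGES];  /* true: page still has to be sent */
    uint32_t cur_cmd_id; /* ID of the most recently sent command */
};

/* Start of migration: every page is pending. */
static void model_init(struct hint_model *m)
{
    for (size_t i = 0; i < NPAGES; i++) {
        m->dirty[i] = true;
    }
    m->cur_cmd_id = 0;
}

/* The host sends a new command; hints tagged with older IDs become stale. */
static uint32_t model_new_command(struct hint_model *m)
{
    return ++m->cur_cmd_id;
}

/* A guest write always marks the page dirty again. */
static void model_guest_write(struct hint_model *m, size_t page)
{
    assert(page < NPAGES);
    m->dirty[page] = true;
}

/*
 * The guest hints a page as free under command `cmd_id`.
 * Returns true if the hint was applied, false if it was stale and ignored.
 */
static bool model_hint(struct hint_model *m, uint32_t cmd_id, size_t page)
{
    assert(page < NPAGES);
    if (cmd_id != m->cur_cmd_id) {
        return false; /* a new command was sent since: keep the dirty bit */
    }
    m->dirty[page] = false;
    return true;
}
```

In this model a hint tagged with an old ID leaves the page pending, which is the "unless we sent a new command since" part of the description.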

> > 
> > I'm still trying to get my head around the list of issues.  So far they
> > all look kind of minor to me.  Would you like to summarize them
> > somewhere?
> 
> Some things I still have in my mind

Thanks for the summary!

> 
> 1. If migration fails during RAM precopy, the guest will never receive a
> DONE notification. Probably easy to fix.
> 
> 2. Unclear semantics. Alex tried to document what the actual semantics
> of hinted pages are.

I'll reply to that now.

> Assume the following in the guest to a previously
> hinted page
> 
> /* page was hinted and is reused now */
> if (page[x] != Y)
>     page[x] = Y;
> /* migration ends, we now run on the destination */
> BUG_ON(page[x] != Y);
> /* BUG, because the content chan

The assumption hinting makes is that data in a page is written to before it's
used.


> A guest can observe that. And that could be a random driver that just
> allocated a page.
> 
> (I *assume* in Linux we might catch that using kasan, but I am not 100%
> sure, also, the actual semantics to document are unclear - e.g., for
> other guests)

I think it's basically simple: hinting means it's ok to
fill page with trash unless it has been modified since the command
ID supplied.
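
To make the consequence of that rule concrete, the read-before-write sequence quoted above can be replayed in a toy model. Everything here is an assumption made for illustration: the 16-byte "page", the `0xAA` trash pattern on the destination, and the names `struct page_model`, `hint_page()`, `guest_write()`, `guest_read()`, `migrate()` and `reuse_then_migrate()`. None of it is QEMU or kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_SIZE 16 /* tiny page, just for the model */

struct page_model {
    unsigned char src[MODEL_PAGE_SIZE]; /* page on the migration source */
    unsigned char dst[MODEL_PAGE_SIZE]; /* same page on the destination */
    bool dirty;                         /* must be sent before completion */
};

/* The guest hinted the page as free: the host may skip sending it. */
static void hint_page(struct page_model *p)
{
    p->dirty = false;
}

/* Guest store: write tracking marks the page dirty again. */
static void guest_write(struct page_model *p, size_t x, unsigned char v)
{
    assert(x < MODEL_PAGE_SIZE);
    p->src[x] = v;
    p->dirty = true;
}

/* Guest load: reads never dirty a page. */
static unsigned char guest_read(const struct page_model *p, size_t x)
{
    assert(x < MODEL_PAGE_SIZE);
    return p->src[x];
}

/* Final migration pass: only dirty pages are copied. */
static void migrate(struct page_model *p)
{
    if (p->dirty) {
        memcpy(p->dst, p->src, MODEL_PAGE_SIZE);
    }
}

/*
 * Replays the sequence from the thread and returns the byte the guest
 * would find at offset 0 once it runs on the destination.
 */
static unsigned char reuse_then_migrate(unsigned char old, unsigned char y)
{
    struct page_model p;
    memset(p.src, old, MODEL_PAGE_SIZE);
    memset(p.dst, 0xAA, MODEL_PAGE_SIZE); /* arbitrary "trash" */
    p.dirty = true;

    hint_page(&p);
    if (guest_read(&p, 0) != y) {
        guest_write(&p, 0, y);
    }
    migrate(&p);
    return p.dst[0];
}
```

If the old content differs from Y, the write dirties the page and the destination sees Y. If the old content already equals Y, nothing is written, the page is skipped, and the destination sees the trash value instead.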

> As Alex mentioned, it is not even guaranteed in QEMU that we receive a
> zero page on the destination, it could also be something else (e.g.,
> previously migrated values).


Absolutely.

> 3. If I am not wrong, the iothread works in
> virtio_ballloon_get_free_page_hints() on the virtqueue only with holding
> the free_page_lock (no BQL).
> 
> Assume we're migrating, the iothread is active, and the guest triggers a
> device reset.
> 
> virtio_balloon_device_reset() will trigger a
> virtio_balloon_free_page_stop(s). That won't actually wait for the
> iothread to stop, it will only temporarily lock free_page_lock and
> update s->free_page_report_status.
> 
> I think there can be a race between the device reset and the iothread.
> Once virtio_balloon_free_page_stop() returned,
> virtio_ballloon_get_free_page_hints() can still call
> - virtio_queue_set_notification(vq, 0);
> - virtio_queue_set_notification(vq, 1);
> - virtio_notify(vdev, vq);
> - virtqueue_pop()
> 
> I doubt this is very nice.

Reset is notoriously hard to get right.

> There are other concerns I had regarding the iothread (e.g., while
> reporting is active, virtio_ballloon_get_free_page_hints() is
> essentially a busy loop, in contrast to documented -
> continue_to_get_hints will always be true).

So that would be a performance issue you are suggesting, right?

> > The appeal of hinting is that it's 0 overhead outside migration,
> > and pains were taken to avoid keeping pages locked while
> > hypervisor is busy.
> > 
> > If we are to drop hinting completely we need to show that reporting
> > can be comparable, and we'll probably want to add a mode for
> > reporting that behaves somewhat similarly.
> 
> Depends on the actual users. If we're dropping a feature that nobody is
> actively using, I don't think we have to show anything.


I don't know how to find out. So far it doesn't look like we found
any common data corruptions that would indicate no one can use it safely.
Races around reset aren't all that uncommon but I don't think that
qualifies as a deal breaker.

I find the idea of asynchronously sending hints to host without
waiting for them to be processed intriguing. Not something
I'd work on implementing if we had reporting originally,
but since it's there I'm not sure we should just discard it
at this point.

> This feature obviously saw no proper review.

I did my best but obviously missed some things.

> -- 
> Thanks,
> 
> David / dhildenb




Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'

2020-06-18 Thread David Hildenbrand
> 
>> 2. Unclear semantics. Alex tried to document what the actual semantics
>> of hinted pages are. Assume the following in the guest to a previously
>> hinted page
>> 
>> /* page was hinted and is reused now */
>> if (page[x] != Y)
>>     page[x] = Y;
>> /* migration ends, we now run on the destination */
>> BUG_ON(page[x] != Y);
>> /* BUG, because the content chan
>> 
>> A guest can observe that. And that could be a random driver that just
>> allocated a page.
>> 
>> (I *assume* in Linux we might catch that using kasan, but I am not 100%
>> sure, also, the actual semantics to document are unclear - e.g., for
>> other guests)
>> 
>> As Alex mentioned, it is not even guaranteed in QEMU that we receive a
>> zero page on the destination, it could also be something else (e.g.,
>> previously migrated values).
> 
> So this is only an issue for pages that are pushed out of the balloon
> as a part of the shrinker process though. So fixing it would be pretty
> straightforward as we would just have to initialize or at least dirty
> pages that are leaked as a part of the shrinker. That may have an
> impact on performance though as it would result in us dirtying pages
> that are freed as a result of the shrinker being triggered.
> 

It really depends on the desired semantics, which are unclear because there is 
no doc/spec. Either QEMU is buggy or the kernel is buggy.

>> 3. If I am not wrong, the iothread works in
>> virtio_ballloon_get_free_page_hints() on the virtqueue only with holding
>> the free_page_lock (no BQL).
>> 
>> Assume we're migrating, the iothread is active, and the guest triggers a
>> device reset.
>> 
>> virtio_balloon_device_reset() will trigger a
>> virtio_balloon_free_page_stop(s). That won't actually wait for the
>> iothread to stop, it will only temporarily lock free_page_lock and
>> update s->free_page_report_status.
>> 
>> I think there can be a race between the device reset and the iothread.
>> Once virtio_balloon_free_page_stop() returned,
>> virtio_ballloon_get_free_page_hints() can still call
>> - virtio_queue_set_notification(vq, 0);
>> - virtio_queue_set_notification(vq, 1);
>> - virtio_notify(vdev, vq);
>> - virtqueue_pop()
>> 
>> I doubt this is very nice.
> 
> And our conversation had me start looking through references to
> virtio_balloon_free_page_stop. It looks like we call it when we
> unrealize the device or reset the device. It might make more sense for
> us to look at pushing the status to DONE and forcing the iothread to
> be flushed out.
> 
>> There are other concerns I had regarding the iothread (e.g., while
>> reporting is active, virtio_ballloon_get_free_page_hints() is
>> essentially a busy loop, in contrast to documented -
>> continue_to_get_hints will always be true).
>> 
>>> The appeal of hinting is that it's 0 overhead outside migration,
>>> and pains were taken to avoid keeping pages locked while
>>> hypervisor is busy.
>>> 
>>> If we are to drop hinting completely we need to show that reporting
>>> can be comparable, and we'll probably want to add a mode for
>>> reporting that behaves somewhat similarly.
>> 
>> Depends on the actual users. If we're dropping a feature that nobody is
>> actively using, I don't think we have to show anything.
>> 
>> This feature obviously saw no proper review.
> 
> I'm pretty sure it had some, as it went through several iterations as
> I recall. However I don't think the review of the virtio interface was
> very detailed as I think most of the attention was on the kernel
> interface.

Yes, that's what I meant. The kernel side and the migration code (QEMU) got a
lot of attention.




Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'

2020-06-18 Thread Alexander Duyck
On Thu, Jun 18, 2020 at 10:10 AM David Hildenbrand  wrote:
>
> >>
> >> Ugh, ...
> >>
> >> @MST, you might have missed that in another discussion, what's your
> >> general opinion about removing free page hinting in QEMU (and Linux)? We
> >> keep finding issues in the QEMU implementation, including non-trivial
> >> ones, and have to speculate about the actual semantics. I can see that
> >> e.g., libvirt does not support it yet.
> >
> > Not maintaining two similar features sounds attractive.

Agreed. Just to make sure we are all on the same page I am adding Wei
Wang since he was the original author for page hinting.

> I consider free page hinting (in QEMU) to be in an unmaintainable state
> (and it looks like Alex and I are fixing a feature we don't actually
> intend to use / not aware of users). In contrast to that, the free page
> reporting functionality/implementation is a walk in the park.
>
> >
> > I'm still trying to get my head around the list of issues.  So far they
> > all look kind of minor to me.  Would you like to summarize them
> > somewhere?
>
> Some things I still have in my mind
>
>
> 1. If migration fails during RAM precopy, the guest will never receive a
> DONE notification. Probably easy to fix.

Agreed. It is just a matter of finding the right point to add a hook
so that if we abort the migration we can report DONE.
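
A minimal sketch of what such a hook would change, written as a pure state-transition function. The state and event names (`HINT_S_*`, `MIG_EV_*`) and the `done_on_failure` switch are hypothetical and only loosely follow the wording in this thread; they are not the identifiers used in QEMU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical hinting states as seen by the guest. */
enum hint_status { HINT_S_STOP, HINT_S_START, HINT_S_DONE };

/* Hypothetical migration events delivered to the balloon device. */
enum mig_event {
    MIG_EV_PRECOPY_START,
    MIG_EV_PRECOPY_COMPLETE,
    MIG_EV_PRECOPY_FAILED
};

/*
 * Computes the next hinting status. With done_on_failure == false a failed
 * precopy leaves the status untouched, so the guest keeps waiting for DONE.
 * With done_on_failure == true the abort path reports DONE as well.
 */
static enum hint_status hint_next_status(enum hint_status cur,
                                         enum mig_event ev,
                                         bool done_on_failure)
{
    assert(cur == HINT_S_STOP || cur == HINT_S_START || cur == HINT_S_DONE);

    switch (ev) {
    case MIG_EV_PRECOPY_START:
        return HINT_S_START;
    case MIG_EV_PRECOPY_COMPLETE:
        return cur == HINT_S_START ? HINT_S_DONE : cur;
    case MIG_EV_PRECOPY_FAILED:
        if (done_on_failure && cur == HINT_S_START) {
            return HINT_S_DONE;
        }
        return cur;
    }
    return cur;
}
```

The open question in the real code is where the failure event is raised, which this model deliberately leaves out.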

> 2. Unclear semantics. Alex tried to document what the actual semantics
> of hinted pages are. Assume the following in the guest to a previously
> hinted page
>
> /* page was hinted and is reused now */
> if (page[x] != Y)
>     page[x] = Y;
> /* migration ends, we now run on the destination */
> BUG_ON(page[x] != Y);
> /* BUG, because the content chan
>
> A guest can observe that. And that could be a random driver that just
> allocated a page.
>
> (I *assume* in Linux we might catch that using kasan, but I am not 100%
> sure, also, the actual semantics to document are unclear - e.g., for
> other guests)
>
> As Alex mentioned, it is not even guaranteed in QEMU that we receive a
> zero page on the destination, it could also be something else (e.g.,
> previously migrated values).

So this is only an issue for pages that are pushed out of the balloon
as a part of the shrinker process though. So fixing it would be pretty
straightforward as we would just have to initialize or at least dirty
pages that are leaked as a part of the shrinker. That may have an
impact on performance though as it would result in us dirtying pages
that are freed as a result of the shrinker being triggered.
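
Roughly, the fix would look like the following toy model, where writing a single byte on release is what makes the page dirty again. The names (`struct balloon_page`, `shrinker_release()`, `content_stable_after_migration()`), the 16-byte page and the `0xAA` destination pattern are all made up for the example and do not correspond to the kernel's balloon code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 16 /* tiny page, just for the model */

struct balloon_page {
    unsigned char src[MODEL_PAGE_SIZE]; /* page on the migration source */
    unsigned char dst[MODEL_PAGE_SIZE]; /* same page on the destination */
    bool dirty;                         /* will be sent by migration */
};

/* Hypothetical release path: the page leaves the balloon via the shrinker. */
static void shrinker_release(struct balloon_page *p, bool hinting_active,
                             bool touch_on_release)
{
    assert(p != NULL);
    if (hinting_active && touch_on_release) {
        p->src[0] = 0;   /* a single-byte write is enough to dirty the page */
        p->dirty = true; /* this is the extra cost mentioned above */
    }
}

/*
 * Returns true if the next owner of the page, which only reads it,
 * sees the same content before and after migration.
 */
static bool content_stable_after_migration(bool touch_on_release)
{
    struct balloon_page p;
    memset(p.src, 0x11, MODEL_PAGE_SIZE);
    memset(p.dst, 0xAA, MODEL_PAGE_SIZE); /* arbitrary stale content */
    p.dirty = false; /* page was hinted, so the host may skip it */

    shrinker_release(&p, true, touch_on_release);
    unsigned char seen_before = p.src[1]; /* new owner only reads */

    if (p.dirty) {
        memcpy(p.dst, p.src, MODEL_PAGE_SIZE);
    }
    return p.dst[1] == seen_before;
}
```

With the touch the content is stable for whoever allocates the page next; without it the reader can observe a change across migration.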

> 3. If I am not wrong, the iothread works in
> virtio_ballloon_get_free_page_hints() on the virtqueue only with holding
> the free_page_lock (no BQL).
>
> Assume we're migrating, the iothread is active, and the guest triggers a
> device reset.
>
> virtio_balloon_device_reset() will trigger a
> virtio_balloon_free_page_stop(s). That won't actually wait for the
> iothread to stop, it will only temporarily lock free_page_lock and
> update s->free_page_report_status.
>
> I think there can be a race between the device reset and the iothread.
> Once virtio_balloon_free_page_stop() returned,
> virtio_ballloon_get_free_page_hints() can still call
> - virtio_queue_set_notification(vq, 0);
> - virtio_queue_set_notification(vq, 1);
> - virtio_notify(vdev, vq);
> - virtqueue_pop()
>
> I doubt this is very nice.

And our conversation had me start looking through references to
virtio_balloon_free_page_stop. It looks like we call it when we
unrealize the device or reset the device. It might make more sense for
us to look at pushing the status to DONE and forcing the iothread to
be flushed out.
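
The difference between "update the status under the lock" and "flush the worker out" can be sketched with plain pthreads. This is a simplified stand-in, not the QEMU iothread API: `struct hint_worker`, `worker_fn()`, `worker_stop_nowait()` and `worker_stop_and_flush()` are hypothetical, and the `queue_ops` counter merely stands in for the virtqueue accesses listed above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct hint_worker {
    pthread_mutex_t lock;
    bool stop_requested;     /* protected by lock */
    bool exited;             /* set by the worker just before it returns */
    unsigned long queue_ops; /* only touched by the worker thread */
};

static void worker_init(struct hint_worker *w)
{
    int rc = pthread_mutex_init(&w->lock, NULL);
    assert(rc == 0);
    (void)rc;
    w->stop_requested = false;
    w->exited = false;
    w->queue_ops = 0;
}

/* Worker loop: checks the flag under the lock, works on the queue outside. */
static void *worker_fn(void *opaque)
{
    struct hint_worker *w = opaque;

    for (;;) {
        pthread_mutex_lock(&w->lock);
        bool stop = w->stop_requested;
        pthread_mutex_unlock(&w->lock);
        if (stop) {
            break;
        }
        w->queue_ops++; /* stands in for virtqueue accesses */
    }
    w->exited = true;
    return NULL;
}

/* Only flips the flag: the worker may still be in the middle of an iteration. */
static void worker_stop_nowait(struct hint_worker *w)
{
    pthread_mutex_lock(&w->lock);
    w->stop_requested = true;
    pthread_mutex_unlock(&w->lock);
}

/* Flips the flag and waits: once this returns, the worker is gone. */
static void worker_stop_and_flush(struct hint_worker *w, pthread_t tid)
{
    worker_stop_nowait(w);
    int rc = pthread_join(tid, NULL);
    assert(rc == 0);
    (void)rc;
}
```

A reset path built on `worker_stop_nowait()` alone has the window described above, while `worker_stop_and_flush()` guarantees that no further queue access happens after it returns, at the price of blocking the caller.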

> There are other concerns I had regarding the iothread (e.g., while
> reporting is active, virtio_ballloon_get_free_page_hints() is
> essentially a busy loop, in contrast to documented -
> continue_to_get_hints will always be true).
>
> > The appeal of hinting is that it's 0 overhead outside migration,
> > and pains were taken to avoid keeping pages locked while
> > hypervisor is busy.
> >
> > If we are to drop hinting completely we need to show that reporting
> > can be comparable, and we'll probably want to add a mode for
> > reporting that behaves somewhat similarly.
>
> Depends on the actual users. If we're dropping a feature that nobody is
> actively using, I don't think we have to show anything.
>
> This feature obviously saw no proper review.

I'm pretty sure it had some, as it went through several iterations as
I recall. However I don't think the review of the virtio interface was
very detailed as I think most of the attention was on the kernel
interface.

As far as trying to do this with page reporting it would be doable,
but I would need to use something like the command interface so that I
would have a way to tell the driver when to drop the reported bit from
the pages and when to stop/resume hinting. However it still wouldn't
resolve the issue of copy on write style pages where the page is only
read and 

Re: [virtio-dev] Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'

2020-06-18 Thread Dr. David Alan Gilbert
* Alexander Duyck (alexander.du...@gmail.com) wrote:
> On Tue, May 26, 2020 at 9:14 PM Alexander Duyck
>  wrote:
> >
> > From: Alexander Duyck 
> >
> > In an upcoming patch a feature named Free Page Reporting is about to be
> > added. In order to avoid any confusion we should drop the use of the word
> > 'report' when referring to Free Page Hinting. So what this patch does is go
> > through and replace all instances of 'report' with 'hint" when we are
> > referring to free page hinting.
> >
> > Acked-by: David Hildenbrand 
> > Signed-off-by: Alexander Duyck 
> > ---
> >  hw/virtio/virtio-balloon.c |   78 
> > ++--
> >  include/hw/virtio/virtio-balloon.h |   20 +
> >  2 files changed, 49 insertions(+), 49 deletions(-)
> >
> > diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> > index 3e2ac1104b5f..dc15409b0bb6 100644
> > --- a/hw/virtio/virtio-balloon.c
> > +++ b/hw/virtio/virtio-balloon.c
> 
> ...
> 
> > @@ -817,14 +817,14 @@ static int virtio_balloon_post_load_device(void 
> > *opaque, int version_id)
> >  return 0;
> >  }
> >
> > -static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
> > +static const VMStateDescription vmstate_virtio_balloon_free_page_hint = {
> >  .name = "virtio-balloon-device/free-page-report",
> >  .version_id = 1,
> >  .minimum_version_id = 1,
> >  .needed = virtio_balloon_free_page_support,
> >  .fields = (VMStateField[]) {
> > -VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
> > -VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
> > +VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
> > +VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
> >  VMSTATE_END_OF_LIST()
> >  }
> >  };
> 
> So I noticed this patch wasn't in the list of patches pulled, but that
> is probably for the best since I believe the change above might have
> broken migration as VMSTATE_UINT32 does a stringify on the first
> parameter.
> Any advice on how to address it, or should I just give up on renaming
> free_page_report_cmd_id and free_page_report_status?

The field names never hit the wire; the migration format is trivial
binary, especially of things like integers - that lands as just 4 bytes
on the wire [ hopefully in the place where the destination expects to
receive them ].
You need to be careful of the names of top level vmstate devices, and
the names of subsections; I don't think any other naming is in the
stream.
(We've even done hacks in the past of converting a VMSTATE_UINT32 to a
pair of UINT16 )
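
As an illustration of why the rename is harmless, here is a small
self-contained sketch. It is not QEMU's VMState code: struct field_desc,
FIELD_UINT32 and save_fields are made-up, heavily simplified stand-ins,
and the big-endian 4-byte encoding is an assumption of the sketch. The
stringified member name is kept in the descriptor (e.g. for traces) but
is never written to the stream, so old and new names produce identical
bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified field descriptor; not QEMU's VMStateField. */
struct field_desc {
    const char *name;   /* only useful for traces and error messages */
    size_t offset;      /* where the value lives inside the device struct */
};

/* Stringifies the member name, similar in spirit to VMSTATE_UINT32. */
#define FIELD_UINT32(member, type) { #member, offsetof(type, member) }

struct dev_old {
    uint32_t free_page_report_cmd_id;
    uint32_t free_page_report_status;
};

struct dev_new {
    uint32_t free_page_hint_cmd_id;
    uint32_t free_page_hint_status;
};

/* Write one 32-bit value as 4 big-endian bytes; the name is not written. */
static size_t put_be32(uint8_t *out, const void *dev,
                       const struct field_desc *f)
{
    uint32_t v;

    memcpy(&v, (const uint8_t *)dev + f->offset, sizeof(v));
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
    return 4;
}

/* Serialize all fields in declaration order and return the stream length. */
static size_t save_fields(uint8_t *out, const void *dev,
                          const struct field_desc *fields, size_t n)
{
    size_t len = 0;

    for (size_t i = 0; i < n; i++) {
        len += put_be32(out + len, dev, &fields[i]);
    }
    return len;
}
```

What does matter in this model is the order and width of the fields; the
sketch has no equivalent of the top-level device or subsection names,
which are the names to be careful about.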

Dave

> Looking at this I wonder why we even need to migrate these values? It
> seems like if we are completing a migration the cmd_id should always
> be "DONE" shouldn't it? It isn't as if we are going to migrate the
> hinting from one host to another. We will have to start over which is
> essentially the signal that the "DONE" value provides. Same thing for
> the status. We shouldn't be able to migrate unless both of these are
> already in the "DONE" state so if anything I wonder if we shouldn't
> have that as the initial state for the device and just drop the
> migration info.
> 
> Thanks.
> 
> - Alex
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'

2020-06-18 Thread David Hildenbrand
> There are other concerns I had regarding the iothread (e.g., while
> reporting is active, virtio_ballloon_get_free_page_hints() is
> essentially a busy loop, in contrast to documented -
> continue_to_get_hints will always be true).

FWIW, I just double checked this and my memory was bad.

 -
-- 
Thanks,

David / dhildenb




Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'

2020-06-18 Thread David Hildenbrand
>>
>> Ugh, ...
>>
>> @MST, you might have missed that in another discussion, what's your
>> general opinion about removing free page hinting in QEMU (and Linux)? We
>> keep finding issues in the QEMU implementation, including non-trivial
>> ones, and have to speculate about the actual semantics. I can see that
>> e.g., libvirt does not support it yet.
> 
> Not maintaining two similar features sounds attractive.

I consider free page hinting (in QEMU) to be in an unmaintainable state
(and it looks like Alex and I are fixing a feature we don't actually
intend to use / not aware of users). In contrast to that, the free page
reporting functionality/implementation is a walk in the park.

> 
> I'm still trying to get my head around the list of issues.  So far they
> all look kind of minor to me.  Would you like to summarize them
> somewhere?

Some things I still have in my mind


1. If migration fails during RAM precopy, the guest will never receive a
DONE notification. Probably easy to fix.

2. Unclear semantics. Alex tried to document what the actual semantics
of hinted pages are. Assume the following in the guest to a previously
hinted page

/* page was hinted and is reused now */
if (page[x] != Y)
	page[x] = Y;
/* migration ends, we now run on the destination */
BUG_ON(page[x] != Y);
/* BUG, because the content chan

A guest can observe that. And that could be a random driver that just
allocated a page.

(I *assume* in Linux we might catch that using kasan, but I am not 100%
sure, also, the actual semantics to document are unclear - e.g., for
other guests)

As Alex mentioned, it is not even guaranteed in QEMU that we receive a
zero page on the destination, it could also be something else (e.g.,
previously migrated values).

3. If I am not wrong, the iothread works in
virtio_ballloon_get_free_page_hints() on the virtqueue only with holding
the free_page_lock (no BQL).

Assume we're migrating, the iothread is active, and the guest triggers a
device reset.

virtio_balloon_device_reset() will trigger a
virtio_balloon_free_page_stop(s). That won't actually wait for the
iothread to stop, it will only temporarily lock free_page_lock and
update s->free_page_report_status.

I think there can be a race between the device reset and the iothread.
Once virtio_balloon_free_page_stop() returned,
virtio_ballloon_get_free_page_hints() can still call
- virtio_queue_set_notification(vq, 0);
- virtio_queue_set_notification(vq, 1);
- virtio_notify(vdev, vq);
- virtqueue_pop()

I doubt this is very nice.

There are other concerns I had regarding the iothread (e.g., while
reporting is active, virtio_ballloon_get_free_page_hints() is
essentially a busy loop, in contrast to documented -
continue_to_get_hints will always be true).

> The appeal of hinting is that it's 0 overhead outside migration,
> and pains were taken to avoid keeping pages locked while
> hypervisor is busy.
> 
> If we are to drop hinting completely we need to show that reporting
> can be comparable, and we'll probably want to add a mode for
> reporting that behaves somewhat similarly.

Depends on the actual users. If we're dropping a feature that nobody is
actively using, I don't think we have to show anything.

This feature obviously saw no proper review.

-- 
Thanks,

David / dhildenb




Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'

2020-06-18 Thread Michael S. Tsirkin
On Thu, Jun 18, 2020 at 05:58:28PM +0200, David Hildenbrand wrote:
> On 18.06.20 17:14, Alexander Duyck wrote:
> > On Thu, Jun 18, 2020 at 5:54 AM David Hildenbrand  wrote:
> >>
> >> On 13.06.20 22:07, Alexander Duyck wrote:
> >>> On Tue, May 26, 2020 at 9:14 PM Alexander Duyck
> >>>  wrote:
> 
> >>>> From: Alexander Duyck
> >>>>
> >>>> In an upcoming patch a feature named Free Page Reporting is about to be
> >>>> added. In order to avoid any confusion we should drop the use of the word
> >>>> 'report' when referring to Free Page Hinting. So what this patch does is
> >>>> go
> >>>> through and replace all instances of 'report' with 'hint" when we are
> >>>> referring to free page hinting.
> >>>>
> >>>> Acked-by: David Hildenbrand
> >>>> Signed-off-by: Alexander Duyck
> >>>> ---
> >>>>  hw/virtio/virtio-balloon.c |   78
> >>>> ++--
> >>>>  include/hw/virtio/virtio-balloon.h |   20 +
> >>>>  2 files changed, 49 insertions(+), 49 deletions(-)
> >>>>
> >>>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> >>>> index 3e2ac1104b5f..dc15409b0bb6 100644
> >>>> --- a/hw/virtio/virtio-balloon.c
> >>>> +++ b/hw/virtio/virtio-balloon.c
> >>>
> >>> ...
> >>>
> >>>> @@ -817,14 +817,14 @@ static int virtio_balloon_post_load_device(void
> >>>> *opaque, int version_id)
> >>>>  return 0;
> >>>>  }
> >>>>
> >>>> -static const VMStateDescription vmstate_virtio_balloon_free_page_report
> >>>> = {
> >>>> +static const VMStateDescription vmstate_virtio_balloon_free_page_hint =
> >>>> {
> >>>>  .name = "virtio-balloon-device/free-page-report",
> >>>>  .version_id = 1,
> >>>>  .minimum_version_id = 1,
> >>>>  .needed = virtio_balloon_free_page_support,
> >>>>  .fields = (VMStateField[]) {
> >>>> -VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
> >>>> -VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
> >>>> +VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
> >>>> +VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
> >>>>  VMSTATE_END_OF_LIST()
> >>>>  }
> >>>>  };
> >>>
> >>> So I noticed this patch wasn't in the list of patches pulled, but that
> >>> is probably for the best since I believe the change above might have
> >>> broken migration as VMSTATE_UINT32 does a stringify on the first
> >>> parameter.
> >>
> >> Indeed, it's the name of the vmstate field. But I don't think it is
> >> relevant for migration. It's an indicator if a field is valid and it's
> >> used in traces/error messages.
> >>
> >> See git grep "field->name"
> >>
> >> I don't think renaming this is problematic. Can you rebase and resend?
> >> Thanks!
> > 
> > Okay, I will.
> > 
> >>> Any advice on how to address it, or should I just give up on renaming
> >>> free_page_report_cmd_id and free_page_report_status?
> >>>
> >>> Looking at this I wonder why we even need to migrate these values? It
> >>> seems like if we are completing a migration the cmd_id should always
> >>> be "DONE" shouldn't it? It isn't as if we are going to migrate the
> >>
> >> The *status* should be DONE IIUC. The cmd_id might be relevant, no? It's
> >> always incremented until it wraps.
> > 
> > The thing is, the cmd_id visible to the driver if the status is DONE
> > is the cmd_id value for DONE. So as long as the driver acknowledges
> > the value we could essentially start over the cmd_id without any
> > negative effect. The driver would have to put down a new descriptor to
> > start a block of hinting in order to begin reporting again so there
> > shouldn't be any risk of us falsely hinting pages that were in a
> > previous epoch.
> > 
> > Ugh, although now looking at it I think we might have a bug in the
> > QEMU code in that the driver could in theory force its way past a
> > "STOP" by just replaying the last command_id descriptor and then keep
> > going. Should be a pretty easy fix though as we should only allow a
> > transition to S_START if the status is S_REQUESTED/
> 
> Ugh, ...
> 
> @MST, you might have missed that in another discussion, what's your
> general opinion about removing free page hinting in QEMU (and Linux)? We
> keep finding issues in the QEMU implementation, including non-trivial
> ones, and have to speculate about the actual semantics. I can see that
> e.g., libvirt does not support it yet.

Not maintaining two similar features sounds attractive.

I'm still trying to get my head around the list of issues.  So far they
all look kind of minor to me.  Would you like to summarize them
somewhere?
The appeal of hinting is that it's 0 overhead outside migration,
and pains were taken to avoid keeping pages locked while
hypervisor is busy.

If we are to drop hinting completely we need to show that reporting
can be comparable, and we'll probably want to add a mode for
reporting that behaves somewhat similarly.

-- 
MST




Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'

2020-06-18 Thread David Hildenbrand
On 18.06.20 17:14, Alexander Duyck wrote:
> On Thu, Jun 18, 2020 at 5:54 AM David Hildenbrand  wrote:
>>
>> On 13.06.20 22:07, Alexander Duyck wrote:
>>> On Tue, May 26, 2020 at 9:14 PM Alexander Duyck
>>>  wrote:

>>>> From: Alexander Duyck
>>>>
>>>> In an upcoming patch a feature named Free Page Reporting is about to be
>>>> added. In order to avoid any confusion we should drop the use of the word
>>>> 'report' when referring to Free Page Hinting. So what this patch does is go
>>>> through and replace all instances of 'report' with 'hint" when we are
>>>> referring to free page hinting.
>>>>
>>>> Acked-by: David Hildenbrand
>>>> Signed-off-by: Alexander Duyck
>>>> ---
>>>>  hw/virtio/virtio-balloon.c |   78
>>>> ++--
>>>>  include/hw/virtio/virtio-balloon.h |   20 +
>>>>  2 files changed, 49 insertions(+), 49 deletions(-)
>>>>
>>>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
>>>> index 3e2ac1104b5f..dc15409b0bb6 100644
>>>> --- a/hw/virtio/virtio-balloon.c
>>>> +++ b/hw/virtio/virtio-balloon.c
>>>
>>> ...
>>>
>>>> @@ -817,14 +817,14 @@ static int virtio_balloon_post_load_device(void
>>>> *opaque, int version_id)
>>>>  return 0;
>>>>  }
>>>>
>>>> -static const VMStateDescription vmstate_virtio_balloon_free_page_report =
>>>> {
>>>> +static const VMStateDescription vmstate_virtio_balloon_free_page_hint = {
>>>>  .name = "virtio-balloon-device/free-page-report",
>>>>  .version_id = 1,
>>>>  .minimum_version_id = 1,
>>>>  .needed = virtio_balloon_free_page_support,
>>>>  .fields = (VMStateField[]) {
>>>> -VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
>>>> -VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
>>>> +VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
>>>> +VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
>>>>  VMSTATE_END_OF_LIST()
>>>>  }
>>>>  };
>>>
>>> So I noticed this patch wasn't in the list of patches pulled, but that
>>> is probably for the best since I believe the change above might have
>>> broken migration as VMSTATE_UINT32 does a stringify on the first
>>> parameter.
>>
>> Indeed, it's the name of the vmstate field. But I don't think it is
>> relevant for migration. It's an indicator if a field is valid and it's
>> used in traces/error messages.
>>
>> See git grep "field->name"
>>
>> I don't think renaming this is problematic. Can you rebase and resend?
>> Thanks!
> 
> Okay, I will.
> 
>>> Any advice on how to address it, or should I just give up on renaming
>>> free_page_report_cmd_id and free_page_report_status?
>>>
>>> Looking at this I wonder why we even need to migrate these values? It
>>> seems like if we are completing a migration the cmd_id should always
>>> be "DONE" shouldn't it? It isn't as if we are going to migrate the
>>
>> The *status* should be DONE IIUC. The cmd_id might be relevant, no? It's
>> always incremented until it wraps.
> 
> The thing is, the cmd_id visible to the driver if the status is DONE
> is the cmd_id value for DONE. So as long as the driver acknowledges
> the value we could essentially start over the cmd_id without any
> negative effect. The driver would have to put down a new descriptor to
> start a block of hinting in order to begin reporting again so there
> shouldn't be any risk of us falsely hinting pages that were in a
> previous epoch.
> 
> Ugh, although now looking at it I think we might have a bug in the
> QEMU code in that the driver could in theory force its way past a
> "STOP" by just replaying the last command_id descriptor and then keep
> going. Should be a pretty easy fix though as we should only allow a
> transition to S_START if the status is S_REQUESTED/

Ugh, ...

@MST, you might have missed that in another discussion, what's your
general opinion about removing free page hinting in QEMU (and Linux)? We
keep finding issues in the QEMU implementation, including non-trivial
ones, and have to speculate about the actual semantics. I can see that
e.g., libvirt does not support it yet.

-- 
Thanks,

David / dhildenb




Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'

2020-06-18 Thread Alexander Duyck
On Thu, Jun 18, 2020 at 5:54 AM David Hildenbrand  wrote:
>
> On 13.06.20 22:07, Alexander Duyck wrote:
> > On Tue, May 26, 2020 at 9:14 PM Alexander Duyck
> >  wrote:
> >>
> >> From: Alexander Duyck 
> >>
> >> In an upcoming patch a feature named Free Page Reporting is about to be
> >> added. In order to avoid any confusion we should drop the use of the word
> >> 'report' when referring to Free Page Hinting. So what this patch does is go
> >> through and replace all instances of 'report' with 'hint" when we are
> >> referring to free page hinting.
> >>
> >> Acked-by: David Hildenbrand 
> >> Signed-off-by: Alexander Duyck 
> >> ---
> >>  hw/virtio/virtio-balloon.c |   78 
> >> ++--
> >>  include/hw/virtio/virtio-balloon.h |   20 +
> >>  2 files changed, 49 insertions(+), 49 deletions(-)
> >>
> >> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> >> index 3e2ac1104b5f..dc15409b0bb6 100644
> >> --- a/hw/virtio/virtio-balloon.c
> >> +++ b/hw/virtio/virtio-balloon.c
> >
> > ...
> >
> >> @@ -817,14 +817,14 @@ static int virtio_balloon_post_load_device(void 
> >> *opaque, int version_id)
> >>  return 0;
> >>  }
> >>
> >> -static const VMStateDescription vmstate_virtio_balloon_free_page_report = 
> >> {
> >> +static const VMStateDescription vmstate_virtio_balloon_free_page_hint = {
> >>  .name = "virtio-balloon-device/free-page-report",
> >>  .version_id = 1,
> >>  .minimum_version_id = 1,
> >>  .needed = virtio_balloon_free_page_support,
> >>  .fields = (VMStateField[]) {
> >> -VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
> >> -VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
> >> +VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
> >> +VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
> >>  VMSTATE_END_OF_LIST()
> >>  }
> >>  };
> >
> > So I noticed this patch wasn't in the list of patches pulled, but that
> > is probably for the best since I believe the change above might have
> > broken migration as VMSTATE_UINT32 does a stringify on the first
> > parameter.
>
> Indeed, it's the name of the vmstate field. But I don't think it is
> relevant for migration. It's an indicator if a field is valid and it's
> used in traces/error messages.
>
> See git grep "field->name"
>
> I don't think renaming this is problematic. Can you rebase and resend?
> Thanks!

Okay, I will.

> > Any advice on how to address it, or should I just give up on renaming
> > free_page_report_cmd_id and free_page_report_status?
> >
> > Looking at this I wonder why we even need to migrate these values? It
> > seems like if we are completing a migration the cmd_id should always
> > be "DONE" shouldn't it? It isn't as if we are going to migrate the
>
> The *status* should be DONE IIUC. The cmd_id might be relevant, no? It's
> always incremented until it wraps.

The thing is, the cmd_id visible to the driver if the status is DONE
is the cmd_id value for DONE. So as long as the driver acknowledges
the value we could essentially start over the cmd_id without any
negative effect. The driver would have to put down a new descriptor to
start a block of hinting in order to begin reporting again so there
shouldn't be any risk of us falsely hinting pages that were in a
previous epoch.

Ugh, although now looking at it I think we might have a bug in the
QEMU code in that the driver could in theory force its way past a
"STOP" by just replaying the last command_id descriptor and then keep
going. Should be a pretty easy fix though as we should only allow a
transition to S_START if the status is S_REQUESTED/
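
A rough sketch of the kind of guard I have in mind, as standalone C
rather than a patch against virtio-balloon.c. The names hint_state,
hint_handle_cmd_id and the HINT_S_* values are hypothetical, and the
sketch assumes that S_REQUESTED is the only state from which a matching
id may start hinting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the status values; not the real QEMU enum. */
enum hint_status { HINT_S_STOP, HINT_S_REQUESTED, HINT_S_START, HINT_S_DONE };

struct hint_state {
    uint32_t cmd_id;          /* id the device most recently asked for */
    enum hint_status status;
};

/*
 * Handle a command id descriptor from the driver.
 * Returns true if the hints that follow should be accepted.
 */
static bool hint_handle_cmd_id(struct hint_state *s, uint32_t id)
{
    if (id == s->cmd_id) {
        /* Only a pending request may start hinting; a replay is ignored. */
        if (s->status == HINT_S_REQUESTED) {
            s->status = HINT_S_START;
        }
    } else if (s->status == HINT_S_START) {
        /* A different id while running ends the current block. */
        s->status = HINT_S_STOP;
    }
    return s->status == HINT_S_START;
}
```

With that check, replaying the last command id after the device moved to
STOP leaves the status at STOP instead of flipping it back to START.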

> > hinting from one host to another. We will have to start over which is
> > essentially the signal that the "DONE" value provides. Same thing for
> > the status. We shouldn't be able to migrate unless both of these are
> > already in the "DONE" state so if anything I wonder if we shouldn't
> > have that as the initial state for the device and just drop the
> > migration info.
>
> We'll have to glue that to a compat machine unfortunately, so we can
> just keep migrating it ... :(

Yeah, I kind of figured that would be the case. However if the name
change is not an issue then it should not be a problem.

Thanks.

- Alex



Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'

2020-06-18 Thread David Hildenbrand
On 13.06.20 22:07, Alexander Duyck wrote:
> On Tue, May 26, 2020 at 9:14 PM Alexander Duyck
>  wrote:
>>
>> From: Alexander Duyck 
>>
>> In an upcoming patch a feature named Free Page Reporting is about to be
>> added. In order to avoid any confusion we should drop the use of the word
>> 'report' when referring to Free Page Hinting. So what this patch does is go
>> through and replace all instances of 'report' with 'hint" when we are
>> referring to free page hinting.
>>
>> Acked-by: David Hildenbrand 
>> Signed-off-by: Alexander Duyck 
>> ---
>>  hw/virtio/virtio-balloon.c |   78 
>> ++--
>>  include/hw/virtio/virtio-balloon.h |   20 +
>>  2 files changed, 49 insertions(+), 49 deletions(-)
>>
>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
>> index 3e2ac1104b5f..dc15409b0bb6 100644
>> --- a/hw/virtio/virtio-balloon.c
>> +++ b/hw/virtio/virtio-balloon.c
> 
> ...
> 
>> @@ -817,14 +817,14 @@ static int virtio_balloon_post_load_device(void 
>> *opaque, int version_id)
>>  return 0;
>>  }
>>
>> -static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
>> +static const VMStateDescription vmstate_virtio_balloon_free_page_hint = {
>>  .name = "virtio-balloon-device/free-page-report",
>>  .version_id = 1,
>>  .minimum_version_id = 1,
>>  .needed = virtio_balloon_free_page_support,
>>  .fields = (VMStateField[]) {
>> -VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
>> -VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
>> +VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
>> +VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
>>  VMSTATE_END_OF_LIST()
>>  }
>>  };
> 
> So I noticed this patch wasn't in the list of patches pulled, but that
> is probably for the best since I believe the change above might have
> broken migration as VMSTATE_UINT32 does a stringify on the first
> parameter.

Indeed, it's the name of the vmstate field. But I don't think it is
relevant for migration. It's an indicator if a field is valid and it's
used in traces/error messages.

See git grep "field->name"

I don't think renaming this is problematic. Can you rebase and resend?
Thanks!

> Any advice on how to address it, or should I just give up on renaming
> free_page_report_cmd_id and free_page_report_status?
> 
> Looking at this I wonder why we even need to migrate these values? It
> seems like if we are completing a migration the cmd_id should always
> be "DONE" shouldn't it? It isn't as if we are going to migrate the

The *status* should be DONE IIUC. The cmd_id might be relevant, no? It's
always incremented until it wraps.
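
For reference, the wrap behaviour boils down to something like the
following standalone sketch. HINT_CMD_ID_MIN and hint_next_cmd_id are
illustrative names, and the 0x80000000 lower bound is an assumption of
the sketch (ids below it being treated as reserved), not something taken
from this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed lower bound of usable command ids; smaller values are reserved. */
#define HINT_CMD_ID_MIN 0x80000000u

/* Next command id: increment, wrapping back to the minimum after the max. */
static uint32_t hint_next_cmd_id(uint32_t cur)
{
    return cur == UINT32_MAX ? HINT_CMD_ID_MIN : cur + 1;
}
```

So a destination that restarted from the minimum would still hand out
valid ids; the question is only whether the guest could confuse them with
an id it saw before migration.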

> hinting from one host to another. We will have to start over which is
> essentially the signal that the "DONE" value provides. Same thing for
> the status. We shouldn't be able to migrate unless both of these are
> already in the "DONE" state so if anything I wonder if we shouldn't
> have that as the initial state for the device and just drop the
> migration info.

We'll have to glue that to a compat machine unfortunately, so we can
just keep migrating it ... :(


-- 
Thanks,

David / dhildenb




Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'

2020-06-13 Thread Alexander Duyck
On Tue, May 26, 2020 at 9:14 PM Alexander Duyck
 wrote:
>
> From: Alexander Duyck 
>
> In an upcoming patch a feature named Free Page Reporting is about to be
> added. In order to avoid any confusion we should drop the use of the word
> 'report' when referring to Free Page Hinting. So what this patch does is go
> through and replace all instances of 'report' with 'hint" when we are
> referring to free page hinting.
>
> Acked-by: David Hildenbrand 
> Signed-off-by: Alexander Duyck 
> ---
>  hw/virtio/virtio-balloon.c |   78 
> ++--
>  include/hw/virtio/virtio-balloon.h |   20 +
>  2 files changed, 49 insertions(+), 49 deletions(-)
>
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 3e2ac1104b5f..dc15409b0bb6 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c

...

> @@ -817,14 +817,14 @@ static int virtio_balloon_post_load_device(void 
> *opaque, int version_id)
>  return 0;
>  }
>
> -static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
> +static const VMStateDescription vmstate_virtio_balloon_free_page_hint = {
>  .name = "virtio-balloon-device/free-page-report",
>  .version_id = 1,
>  .minimum_version_id = 1,
>  .needed = virtio_balloon_free_page_support,
>  .fields = (VMStateField[]) {
> -VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
> -VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
> +VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
> +VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
>  VMSTATE_END_OF_LIST()
>  }
>  };

So I noticed this patch wasn't in the list of patches pulled, but that
is probably for the best since I believe the change above might have
broken migration as VMSTATE_UINT32 does a stringify on the first
parameter.
Any advice on how to address it, or should I just give up on renaming
free_page_report_cmd_id and free_page_report_status?

Looking at this I wonder why we even need to migrate these values? It
seems like if we are completing a migration the cmd_id should always
be "DONE" shouldn't it? It isn't as if we are going to migrate the
hinting from one host to another. We will have to start over which is
essentially the signal that the "DONE" value provides. Same thing for
the status. We shouldn't be able to migrate unless both of these are
already in the "DONE" state so if anything I wonder if we shouldn't
have that as the initial state for the device and just drop the
migration info.

Thanks.

- Alex



[PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'

2020-05-26 Thread Alexander Duyck
From: Alexander Duyck 

In an upcoming patch a feature named Free Page Reporting is about to be
added. In order to avoid any confusion we should drop the use of the word
'report' when referring to Free Page Hinting. So what this patch does is go
through and replace all instances of 'report' with 'hint' when we are
referring to free page hinting.

Acked-by: David Hildenbrand 
Signed-off-by: Alexander Duyck 
---
 hw/virtio/virtio-balloon.c |   78 ++--
 include/hw/virtio/virtio-balloon.h |   20 +
 2 files changed, 49 insertions(+), 49 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 3e2ac1104b5f..dc15409b0bb6 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -527,21 +527,21 @@ static bool get_free_page_hints(VirtIOBalloon *dev)
 ret = false;
 goto out;
 }
-if (id == dev->free_page_report_cmd_id) {
-dev->free_page_report_status = FREE_PAGE_REPORT_S_START;
+if (id == dev->free_page_hint_cmd_id) {
+dev->free_page_hint_status = FREE_PAGE_HINT_S_START;
 } else {
 /*
  * Stop the optimization only when it has started. This
  * avoids a stale stop sign for the previous command.
  */
-if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START) {
-dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
+if (dev->free_page_hint_status == FREE_PAGE_HINT_S_START) {
+dev->free_page_hint_status = FREE_PAGE_HINT_S_STOP;
 }
 }
 }
 
 if (elem->in_num) {
-if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START) {
+if (dev->free_page_hint_status == FREE_PAGE_HINT_S_START) {
 qemu_guest_free_page_hint(elem->in_sg[0].iov_base,
   elem->in_sg[0].iov_len);
 }
@@ -567,11 +567,11 @@ static void virtio_ballloon_get_free_page_hints(void *opaque)
 qemu_mutex_unlock(&dev->free_page_lock);
 virtio_notify(vdev, vq);
   /*
-   * Start to poll the vq once the reporting started. Otherwise, continue
+   * Start to poll the vq once the hinting started. Otherwise, continue
* only when there are entries on the vq, which need to be given back.
*/
 } while (continue_to_get_hints ||
- dev->free_page_report_status == FREE_PAGE_REPORT_S_START);
+ dev->free_page_hint_status == FREE_PAGE_HINT_S_START);
 virtio_queue_set_notification(vq, 1);
 }
 
@@ -592,14 +592,14 @@ static void virtio_balloon_free_page_start(VirtIOBalloon *s)
 return;
 }
 
-if (s->free_page_report_cmd_id == UINT_MAX) {
-s->free_page_report_cmd_id =
-   VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN;
+if (s->free_page_hint_cmd_id == UINT_MAX) {
+s->free_page_hint_cmd_id =
+   VIRTIO_BALLOON_FREE_PAGE_HINT_CMD_ID_MIN;
 } else {
-s->free_page_report_cmd_id++;
+s->free_page_hint_cmd_id++;
 }
 
-s->free_page_report_status = FREE_PAGE_REPORT_S_REQUESTED;
+s->free_page_hint_status = FREE_PAGE_HINT_S_REQUESTED;
 virtio_notify_config(vdev);
 }
 
@@ -607,18 +607,18 @@ static void virtio_balloon_free_page_stop(VirtIOBalloon *s)
 {
 VirtIODevice *vdev = VIRTIO_DEVICE(s);
 
-if (s->free_page_report_status != FREE_PAGE_REPORT_S_STOP) {
+if (s->free_page_hint_status != FREE_PAGE_HINT_S_STOP) {
 /*
  * The lock also guarantees us that the
  * virtio_ballloon_get_free_page_hints exits after the
- * free_page_report_status is set to S_STOP.
+ * free_page_hint_status is set to S_STOP.
  */
 qemu_mutex_lock(&s->free_page_lock);
 /*
- * The guest hasn't done the reporting, so host sends a notification
- * to the guest to actively stop the reporting.
+ * The guest isn't done hinting, so send a notification
+ * to the guest to actively stop the hinting.
  */
-s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
+s->free_page_hint_status = FREE_PAGE_HINT_S_STOP;
 qemu_mutex_unlock(&s->free_page_lock);
 virtio_notify_config(vdev);
 }
@@ -628,15 +628,15 @@ static void virtio_balloon_free_page_done(VirtIOBalloon *s)
 {
 VirtIODevice *vdev = VIRTIO_DEVICE(s);
 
-s->free_page_report_status = FREE_PAGE_REPORT_S_DONE;
+s->free_page_hint_status = FREE_PAGE_HINT_S_DONE;
 virtio_notify_config(vdev);
 }
 
 static int
-virtio_balloon_free_page_report_notify(NotifierWithReturn *n, void *data)
+virtio_balloon_free_page_hint_notify(NotifierWithReturn *n, void *data)
 {
 VirtIOBalloon *dev = container_of(n, VirtIOBalloon,
-  free_page_report_notify);
+  free_page_hint_notify);
 VirtIODevice *vdev =