On Wed, Apr 15, 2020 at 1:17 AM David Hildenbrand <da...@redhat.com> wrote: > > On 10.04.20 05:41, Alexander Duyck wrote: > > From: Alexander Duyck <alexander.h.du...@linux.intel.com> > > > > Add support for free page reporting. The idea is to function very similar > > to how the balloon works in that we basically end up madvising the page as > > not being used. However we don't really need to bother with any deflate > > type logic since the page will be faulted back into the guest when it is > > read or written to. > > > > This provides a new way of letting the guest proactively report free > > pages to the hypervisor, so the hypervisor can reuse them. In contrast to > > inflate/deflate that is triggered via the hypervisor explicitly. > > Much better, thanks! > > > > > Signed-off-by: Alexander Duyck <alexander.h.du...@linux.intel.com> > > --- > > hw/virtio/virtio-balloon.c | 63 > > +++++++++++++++++++++++++++++++++++- > > include/hw/virtio/virtio-balloon.h | 2 + > > 2 files changed, 62 insertions(+), 3 deletions(-) > > > > diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c > > index 1c6d36a29a04..86d8b48a8e3a 100644 > > --- a/hw/virtio/virtio-balloon.c > > +++ b/hw/virtio/virtio-balloon.c > > @@ -321,6 +321,57 @@ static void balloon_stats_set_poll_interval(Object > > *obj, Visitor *v, > > balloon_stats_change_timer(s, 0); > > } > > > > +static void virtio_balloon_handle_report(VirtIODevice *vdev, VirtQueue *vq) > > +{ > > + VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); > > + VirtQueueElement *elem; > > + > > + while ((elem = virtqueue_pop(vq, sizeof(VirtQueueElement)))) { > > + unsigned int i; > > + > > + for (i = 0; i < elem->in_num; i++) { > > + void *addr = elem->in_sg[i].iov_base; > > + size_t size = elem->in_sg[i].iov_len; > > + ram_addr_t ram_offset; > > + size_t rb_page_size; > > + RAMBlock *rb; > > + > > + if (qemu_balloon_is_inhibited() || dev->poison_val) { > > + continue; > > actually, you want to do that in the outer loop, no?
I'll move that. Odds are compiler was doing that anyway. > > + } > > + > > + /* > > + * There is no need to check the memory section to see if > > + * it is ram/readonly/romd like there is for handle_output > > + * below. If the region is not meant to be written to then > > + * address_space_map will have allocated a bounce buffer > > + * and it will be freed in address_space_unmap and trigger > > + * and unassigned_mem_write before failing to copy over the > > + * buffer. If more than one bad descriptor is provided it > > + * will return NULL after the first bounce buffer and fail > > + * to map any resources. > > + */ > > + rb = qemu_ram_block_from_host(addr, false, &ram_offset); > > + if (!rb) { > > + trace_virtio_balloon_bad_addr(elem->in_addr[i]); > > + continue; > > + } > > + > > + /* For now we will simply ignore unaligned memory regions */ > > + rb_page_size = qemu_ram_pagesize(rb); > > + if (!QEMU_IS_ALIGNED(ram_offset | size, rb_page_size)) { > > /me thinks you can drop rb_page_size You mean just fold it into the statement? I guess I can do that. > I *think* there is still one remaining case to handle: Crossing RAM blocks. > > Most probably you should check > > /* For now, ignore crossing RAM blocks. */ > if (ram_offset + size >= qemu_ram_get_used_length()) { > continue; > } > > otherwise ram_block_discard_range() will report an error. Makes sense. I can add that into the QEMU_IS_ALIGNED check.