On 2024/04/15 17:49, Dmitry Osipenko wrote:
On 4/15/24 11:13, Akihiko Odaki wrote:
On 2024/04/15 17:03, Dmitry Osipenko wrote:
Hello,
On 4/13/24 14:57, Akihiko Odaki wrote:
...
+static void
+virtio_gpu_virgl_unmap_resource_blob(VirtIOGPU *g,
+ struct
virtio_gpu_simple_resource *res)
+{
+ VirtIOGPUBase *b = VIRTIO_GPU_BASE(g);
+
+ if (!res->mr) {
+ return;
+ }
+
+ memory_region_set_enabled(res->mr, false);
+ memory_region_del_subregion(&b->hostmem, res->mr);
+
+ /* memory region owns res->mr object and frees it when mr is
released */
+ res->mr = NULL;
+
+ virgl_renderer_resource_unmap(res->resource_id);
Hi,
First, thanks for keeping working on this.
This patch has some changes since the previous version, but it is still
vulnerable to the race condition pointed out. The memory region is
asynchronously unmapped from the guest address space, but the backing
memory on the host address space is unmapped synchronously before that.
This results in use-after-free. The whole unmapping operation needs to
be implemented in an asynchronous manner.
Thanks for the clarification! I missed this point from the previous
discussion.
Could you please clarify what do you mean by the "asynchronous manner"?
Virglrenderer API works only in the virtio-gpu-gl context, it can't be
accessed from other places.
The memory_region_del_subregion() should remove the region as long as
nobody references it, isn't it? On Linux guest nobody should reference
hostmem regions besides virtio-gpu device on the unmap, don't know about
other guests.
We can claim it a guest's fault if MR lives after the deletion and in
that case exit Qemu with a noisy error msg or leak resource. WDYT?
We need to be prepared for a faulty guest for reliability and security
as they are common goals of virtualization, and it is nice to have them
after all.
You need to call virgl_renderer_resource_unmap() after the MR actually
gets freed. The virtio-gpu-gl context is just a context with BQL so it
is fine to call virgl functions in most places.
Do you have example of a legit use-case where hostmem MR could outlive
resource mapping?
MR outliving after memory_region_del_subregion() is not a use-case, but
a situation that occurs due to the limitation of the memory subsystem.
It is not easy to answer how often such a situation happens.
Turning it into a error condition is much more reasonable to do than to
to worry about edge case that nobody cares about, which can't be tested
easily and that not trivial to support, IMO.
I'm not sure what you mean by turning into an error condition. I doubt
it's possible to emit errors when someone touches an unmapped region.
Reproducing this issue is not easy as it's often cases for
use-after-free bugs, but fixing it is not that complicated in my opinion
since you already have an implementation which asynchronously unmaps the
region in v6. I write my suggestions to fix problems in v6:
- Remove ref member in virgl_gpu_resource, vres_get_ref(),
vres_put_ref(), and virgl_resource_unmap().
- Change virtio_gpu_virgl_process_cmd(),
virgl_cmd_resource_unmap_blob(), and virgl_cmd_resource_unref() to
return a bool, which tells if the command was processed or suspended.
- In virtio_gpu_process_cmdq(), break if the command was suspended.
- In virgl_resource_blob_async_unmap(), call virtio_gpu_gl_block(g, false).
- In virgl_cmd_resource_unmap_blob() and virgl_cmd_resource_unref(),
call memory_region_del_subregion() and virtio_gpu_gl_block(g, true), and
tell that the command was suspended if the reference counter of
MemoryRegion > 0. Free and unmap the MR otherwise.
Regards,
Akihiko Odaki