On Wed, Aug 21, 2024 at 2:56 PM Jonah Palmer <jonah.pal...@oracle.com> wrote:
>
> Decouples the IOVA allocator from the IOVA->HVA tree and instead adds
> the allocated IOVA range to an IOVA-only tree (iova_map). This IOVA tree
> will hold all IOVA ranges that have been allocated (e.g. in the
> IOVA->HVA tree) and are removed when any IOVA ranges are deallocated.
>
> A new API function vhost_iova_tree_insert() is also created to add a
> IOVA->HVA mapping into the IOVA->HVA tree.
>

I think this is a good first iteration but we can take steps to
simplify it. Also, it is great to be able to make points on real code
instead of designs on the air :).

I expected a split of vhost_iova_tree_map_alloc between the current
vhost_iova_tree_map_alloc and vhost_iova_tree_map_alloc_gpa, or
similar. Similarly, a vhost_iova_tree_remove and
vhost_iova_tree_remove_gpa would be needed.

The first one is used for regions that don't exist in the guest, like
SVQ vrings or CVQ buffers. The second one is the one used by the
memory listener to map the guest regions into the vdpa device.

Implementation wise, only two trees are actually needed:
* Current iova_taddr_map that contains all IOVA->vaddr translations as
seen by the device, so both allocation functions can work on a single
tree. The function iova_tree_find_iova keeps using this one, so the
user does not need to know if the address is from the guest or only
exists in QEMU by using RAMBlock etc. All insert and remove functions
use this tree.
* A new tree that relates IOVA to GPA, that only
vhost_iova_tree_map_alloc_gpa and vhost_iova_tree_remove_gpa uses.

The ideal case is that the key in this new tree is the GPA and the
value is the IOVA. But IOVATree's DMA is named the reverse: iova is
the key and translated_addr is the vaddr. We can create a new tree
struct for that, use GTree directly, or translate the reverse
linearly. As memory add / remove should not be frequent, I think the
simpler is the last one, but I'd be ok with creating a new tree.

vhost_iova_tree_map_alloc_gpa needs to add the map to this new tree
also. Similarly, vhost_iova_tree_remove_gpa must look for the GPA in
this tree, and only remove the associated DMAMap in iova_taddr_map
that matches the IOVA.

Does it make sense to you?

> Signed-off-by: Jonah Palmer <jonah.pal...@oracle.com>
> ---
>  hw/virtio/vhost-iova-tree.c | 38 ++++++++++++++++++++++++++++++++-----
>  hw/virtio/vhost-iova-tree.h |  1 +
>  hw/virtio/vhost-vdpa.c      | 31 ++++++++++++++++++++++++------
>  net/vhost-vdpa.c            | 13 +++++++++++--
>  4 files changed, 70 insertions(+), 13 deletions(-)
>
> diff --git a/hw/virtio/vhost-iova-tree.c b/hw/virtio/vhost-iova-tree.c
> index 3d03395a77..32c03db2f5 100644
> --- a/hw/virtio/vhost-iova-tree.c
> +++ b/hw/virtio/vhost-iova-tree.c
> @@ -28,12 +28,17 @@ struct VhostIOVATree {
>
>      /* IOVA address to qemu memory maps. */
>      IOVATree *iova_taddr_map;
> +
> +    /* IOVA tree (IOVA allocator) */
> +    IOVATree *iova_map;
>  };
>
>  /**
> - * Create a new IOVA tree
> + * Create a new VhostIOVATree with a new set of IOVATree's:

s/IOVA tree/VhostIOVATree/ is good, but I think the rest is more an
implementation detail.

> + * - IOVA allocator (iova_map)
> + * - IOVA->HVA tree (iova_taddr_map)
>   *
> - * Returns the new IOVA tree
> + * Returns the new VhostIOVATree
>   */
>  VhostIOVATree *vhost_iova_tree_new(hwaddr iova_first, hwaddr iova_last)
>  {
> @@ -44,6 +49,7 @@ VhostIOVATree *vhost_iova_tree_new(hwaddr iova_first, 
> hwaddr iova_last)
>      tree->iova_last = iova_last;
>
>      tree->iova_taddr_map = iova_tree_new();
> +    tree->iova_map = iova_tree_new();
>      return tree;
>  }
>
> @@ -53,6 +59,7 @@ VhostIOVATree *vhost_iova_tree_new(hwaddr iova_first, 
> hwaddr iova_last)
>  void vhost_iova_tree_delete(VhostIOVATree *iova_tree)
>  {
>      iova_tree_destroy(iova_tree->iova_taddr_map);
> +    iova_tree_destroy(iova_tree->iova_map);
>      g_free(iova_tree);
>  }
>
> @@ -88,13 +95,12 @@ int vhost_iova_tree_map_alloc(VhostIOVATree *tree, DMAMap 
> *map)
>      /* Some vhost devices do not like addr 0. Skip first page */
>      hwaddr iova_first = tree->iova_first ?: qemu_real_host_page_size();
>
> -    if (map->translated_addr + map->size < map->translated_addr ||

Why remove this condition? If the request is invalid we still need to
return an error here.

Maybe we should move it to iova_tree_alloc_map though.

> -        map->perm == IOMMU_NONE) {
> +    if (map->perm == IOMMU_NONE) {
>          return IOVA_ERR_INVALID;
>      }
>
>      /* Allocate a node in IOVA address */
> -    return iova_tree_alloc_map(tree->iova_taddr_map, map, iova_first,
> +    return iova_tree_alloc_map(tree->iova_map, map, iova_first,
>                                 tree->iova_last);
>  }
>
> @@ -107,4 +113,26 @@ int vhost_iova_tree_map_alloc(VhostIOVATree *tree, 
> DMAMap *map)
>  void vhost_iova_tree_remove(VhostIOVATree *iova_tree, DMAMap map)
>  {
>      iova_tree_remove(iova_tree->iova_taddr_map, map);
> +    iova_tree_remove(iova_tree->iova_map, map);
> +}
> +
> +/**
> + * Insert a new mapping to the IOVA->HVA tree
> + *
> + * @tree: The VhostIOVATree
> + * @map: The iova map
> + *
> + * Returns:
> + * - IOVA_OK if the map fits in the container
> + * - IOVA_ERR_INVALID if the map does not make sense (like size overflow)
> + * - IOVA_ERR_OVERLAP if the IOVA range overlaps with an existing range
> + */
> +int vhost_iova_tree_insert(VhostIOVATree *iova_tree, DMAMap *map)
> +{
> +    if (map->translated_addr + map->size < map->translated_addr ||
> +        map->perm == IOMMU_NONE) {
> +        return IOVA_ERR_INVALID;
> +    }
> +
> +    return iova_tree_insert(iova_tree->iova_taddr_map, map);
>  }
> diff --git a/hw/virtio/vhost-iova-tree.h b/hw/virtio/vhost-iova-tree.h
> index 4adfd79ff0..8bf7b64786 100644
> --- a/hw/virtio/vhost-iova-tree.h
> +++ b/hw/virtio/vhost-iova-tree.h
> @@ -23,5 +23,6 @@ const DMAMap *vhost_iova_tree_find_iova(const VhostIOVATree 
> *iova_tree,
>                                          const DMAMap *map);
>  int vhost_iova_tree_map_alloc(VhostIOVATree *iova_tree, DMAMap *map);
>  void vhost_iova_tree_remove(VhostIOVATree *iova_tree, DMAMap map);
> +int vhost_iova_tree_insert(VhostIOVATree *iova_tree, DMAMap *map);
>
>  #endif
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index 3cdaa12ed5..6702459065 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -361,10 +361,10 @@ static void 
> vhost_vdpa_listener_region_add(MemoryListener *listener,
>      if (s->shadow_data) {
>          int r;
>
> -        mem_region.translated_addr = (hwaddr)(uintptr_t)vaddr,
>          mem_region.size = int128_get64(llsize) - 1,
>          mem_region.perm = IOMMU_ACCESS_FLAG(true, section->readonly),
>
> +        /* Allocate IOVA range and add the mapping to the IOVA tree */
>          r = vhost_iova_tree_map_alloc(s->iova_tree, &mem_region);
>          if (unlikely(r != IOVA_OK)) {
>              error_report("Can't allocate a mapping (%d)", r);
> @@ -372,6 +372,14 @@ static void 
> vhost_vdpa_listener_region_add(MemoryListener *listener,
>          }
>
>          iova = mem_region.iova;
> +
> +        /* Add mapping to the IOVA->HVA tree */
> +        mem_region.translated_addr = (hwaddr)(uintptr_t)vaddr;
> +        r = vhost_iova_tree_insert(s->iova_tree, &mem_region);
> +        if (unlikely(r != IOVA_OK)) {
> +            error_report("Can't add listener region mapping (%d)", r);
> +            goto fail_map;
> +        }

I'd say it is not intuitive for the caller code.

>      }
>
>      vhost_vdpa_iotlb_batch_begin_once(s);
> @@ -1142,19 +1150,30 @@ static void vhost_vdpa_svq_unmap_rings(struct 
> vhost_dev *dev,
>   *
>   * @v: Vhost-vdpa device
>   * @needle: The area to search iova
> + * @taddr: The translated address (SVQ HVA)
>   * @errorp: Error pointer
>   */
>  static bool vhost_vdpa_svq_map_ring(struct vhost_vdpa *v, DMAMap *needle,
> -                                    Error **errp)
> +                                    hwaddr taddr, Error **errp)
>  {
>      int r;
>
> +    /* Allocate IOVA range and add the mapping to the IOVA tree */
>      r = vhost_iova_tree_map_alloc(v->shared->iova_tree, needle);
>      if (unlikely(r != IOVA_OK)) {
>          error_setg(errp, "Cannot allocate iova (%d)", r);
>          return false;
>      }
>
> +    /* Add mapping to the IOVA->HVA tree */
> +    needle->translated_addr = taddr;
> +    r = vhost_iova_tree_insert(v->shared->iova_tree, needle);
> +    if (unlikely(r != IOVA_OK)) {
> +        error_setg(errp, "Cannot add SVQ vring mapping (%d)", r);
> +        vhost_iova_tree_remove(v->shared->iova_tree, *needle);
> +        return false;
> +    }
> +
>      r = vhost_vdpa_dma_map(v->shared, v->address_space_id, needle->iova,
>                             needle->size + 1,
>                             (void *)(uintptr_t)needle->translated_addr,
> @@ -1192,11 +1211,11 @@ static bool vhost_vdpa_svq_map_rings(struct vhost_dev 
> *dev,
>      vhost_svq_get_vring_addr(svq, &svq_addr);
>
>      driver_region = (DMAMap) {
> -        .translated_addr = svq_addr.desc_user_addr,
>          .size = driver_size - 1,
>          .perm = IOMMU_RO,
>      };
> -    ok = vhost_vdpa_svq_map_ring(v, &driver_region, errp);
> +    ok = vhost_vdpa_svq_map_ring(v, &driver_region, svq_addr.desc_user_addr,
> +                                 errp);
>      if (unlikely(!ok)) {
>          error_prepend(errp, "Cannot create vq driver region: ");
>          return false;
> @@ -1206,11 +1225,11 @@ static bool vhost_vdpa_svq_map_rings(struct vhost_dev 
> *dev,
>      addr->avail_user_addr = driver_region.iova + avail_offset;
>
>      device_region = (DMAMap) {
> -        .translated_addr = svq_addr.used_user_addr,
>          .size = device_size - 1,
>          .perm = IOMMU_RW,
>      };
> -    ok = vhost_vdpa_svq_map_ring(v, &device_region, errp);
> +    ok = vhost_vdpa_svq_map_ring(v, &device_region, svq_addr.used_user_addr,
> +                                 errp);
>      if (unlikely(!ok)) {
>          error_prepend(errp, "Cannot create vq device region: ");
>          vhost_vdpa_svq_unmap_ring(v, driver_region.translated_addr);
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index 03457ead66..81da956b92 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -512,15 +512,24 @@ static int vhost_vdpa_cvq_map_buf(struct vhost_vdpa *v, 
> void *buf, size_t size,
>      DMAMap map = {};
>      int r;
>
> -    map.translated_addr = (hwaddr)(uintptr_t)buf;
>      map.size = size - 1;
>      map.perm = write ? IOMMU_RW : IOMMU_RO,
> +
> +    /* Allocate IOVA range and add the mapping to the IOVA tree */
>      r = vhost_iova_tree_map_alloc(v->shared->iova_tree, &map);
>      if (unlikely(r != IOVA_OK)) {
> -        error_report("Cannot map injected element");
> +        error_report("Cannot allocate IOVA range for injected element");
>          return r;
>      }
>
> +    /* Add mapping to the IOVA->HVA tree */
> +    map.translated_addr = (hwaddr)(uintptr_t)buf;
> +    r = vhost_iova_tree_insert(v->shared->iova_tree, &map);
> +    if (unlikely(r != IOVA_OK)) {
> +        error_report("Cannot map injected element into IOVA->HVA tree");
> +        goto dma_map_err;
> +    }
> +
>      r = vhost_vdpa_dma_map(v->shared, v->address_space_id, map.iova,
>                             vhost_vdpa_net_cvq_cmd_page_len(), buf, !write);
>      if (unlikely(r < 0)) {
> --
> 2.43.5
>


Reply via email to