On Wed, Apr 24, 2024 at 12:21 AM Si-Wei Liu <si-wei....@oracle.com> wrote: > > > > On 4/22/2024 1:49 AM, Eugenio Perez Martin wrote: > > On Sat, Apr 20, 2024 at 1:50 AM Si-Wei Liu <si-wei....@oracle.com> wrote: > >> > >> > >> On 4/19/2024 1:29 AM, Eugenio Perez Martin wrote: > >>> On Thu, Apr 18, 2024 at 10:46 PM Si-Wei Liu <si-wei....@oracle.com> wrote: > >>>> > >>>> On 4/10/2024 3:03 AM, Eugenio Pérez wrote: > >>>>> IOVA tree is also used to track the mappings of virtio-net shadow > >>>>> virtqueue. This mappings may not match with the GPA->HVA ones. > >>>>> > >>>>> This causes a problem when overlapped regions (different GPA but same > >>>>> translated HVA) exists in the tree, as looking them by HVA will return > >>>>> them twice. To solve this, create an id member so we can assign unique > >>>>> identifiers (GPA) to the maps. > >>>>> > >>>>> Signed-off-by: Eugenio Pérez <epere...@redhat.com> > >>>>> --- > >>>>> include/qemu/iova-tree.h | 5 +++-- > >>>>> util/iova-tree.c | 3 ++- > >>>>> 2 files changed, 5 insertions(+), 3 deletions(-) > >>>>> > >>>>> diff --git a/include/qemu/iova-tree.h b/include/qemu/iova-tree.h > >>>>> index 2a10a7052e..34ee230e7d 100644 > >>>>> --- a/include/qemu/iova-tree.h > >>>>> +++ b/include/qemu/iova-tree.h > >>>>> @@ -36,6 +36,7 @@ typedef struct DMAMap { > >>>>> hwaddr iova; > >>>>> hwaddr translated_addr; > >>>>> hwaddr size; /* Inclusive */ > >>>>> + uint64_t id; > >>>>> IOMMUAccessFlags perm; > >>>>> } QEMU_PACKED DMAMap; > >>>>> typedef gboolean (*iova_tree_iterator)(DMAMap *map); > >>>>> @@ -100,8 +101,8 @@ const DMAMap *iova_tree_find(const IOVATree *tree, > >>>>> const DMAMap *map); > >>>>> * @map: the mapping to search > >>>>> * > >>>>> * Search for a mapping in the iova tree that translated_addr > >>>>> overlaps with the > >>>>> - * mapping range specified. Only the first found mapping will be > >>>>> - * returned. > >>>>> + * mapping range specified and map->id is equal. Only the first found > >>>>> + * mapping will be returned. > >>>>> * > >>>>> * Return: DMAMap pointer if found, or NULL if not found. Note that > >>>>> * the returned DMAMap pointer is maintained internally. User > >>>>> should > >>>>> diff --git a/util/iova-tree.c b/util/iova-tree.c > >>>>> index 536789797e..0863e0a3b8 100644 > >>>>> --- a/util/iova-tree.c > >>>>> +++ b/util/iova-tree.c > >>>>> @@ -97,7 +97,8 @@ static gboolean > >>>>> iova_tree_find_address_iterator(gpointer key, gpointer value, > >>>>> > >>>>> needle = args->needle; > >>>>> if (map->translated_addr + map->size < needle->translated_addr > >>>>> || > >>>>> - needle->translated_addr + needle->size < map->translated_addr) > >>>>> { > >>>>> + needle->translated_addr + needle->size < map->translated_addr > >>>>> || > >>>>> + needle->id != map->id) { > >>>> It looks this iterator can also be invoked by SVQ from > >>>> vhost_svq_translate_addr() -> iova_tree_find_iova(), where guest GPA > >>>> space will be searched on without passing in the ID (GPA), and exact > >>>> match for the same GPA range is not actually needed unlike the mapping > >>>> removal case. Could we create an API variant, for the SVQ lookup case > >>>> specifically? Or alternatively, add a special flag, say skip_id_match to > >>>> DMAMap, and the id match check may look like below: > >>>> > >>>> (!needle->skip_id_match && needle->id != map->id) > >>>> > >>>> I think vhost_svq_translate_addr() could just call the API variant or > >>>> pass DMAmap with skip_id_match set to true to svq_iova_tree_find_iova(). > >>>> > >>> I think you're totally right. But I'd really like to not complicate > >>> the API of the iova_tree more. > >>> > >>> I think we can look for the hwaddr using memory_region_from_host and > >>> then get the hwaddr. It is another lookup though... > >> Yeah, that will be another means of doing translation without having to > >> complicate the API around iova_tree. I wonder how the lookup through > >> memory_region_from_host() may perform compared to the iova tree one, the > >> former looks to be an O(N) linear search on a linked list while the > >> latter would be roughly O(log N) on an AVL tree? > > Even worse, as the reverse lookup (from QEMU vaddr to SVQ IOVA) is > > linear too. It is not even ordered. > Oh Sorry, I misread the code and I should look for g_tree_foreach () > instead of g_tree_search_node(). So the former is indeed linear > iteration, but it looks to be ordered? > > https://github.com/GNOME/glib/blob/main/glib/gtree.c#L1115
The GPA / IOVA are ordered but we're looking by QEMU's vaddr. If we have these translations: [0x1000, 0x2000] -> [0x10000, 0x11000] [0x2000, 0x3000] -> [0x6000, 0x7000] We will see them in this order, so we cannot stop the search at the first node. > > > > But apart from this detail you're right, I have the same concerns with > > this solution too. If we see a hard performance regression we could go > > to more complicated solutions, like maintaining a reverse IOVATree in > > vhost-iova-tree too. First RFCs of SVQ did that actually. > Agreed, yeap we can use memory_region_from_host for now. Any reason why > reverse IOVATree was dropped, lack of users? But now we have one! > No, it is just simplicity. We already have an user in the hot patch in the master branch, vhost_svq_vring_write_descs. But I never profiled enough to find if it is a bottleneck or not to be honest. I'll send the new series by today, thank you for finding these issues! > Thanks, > -Siwei > > > > Thanks! > > > >> Of course, > >> memory_region_from_host() won't search out of the guest memory space for > >> sure. As this could be on the hot data path I have a little bit > >> hesitance over the potential cost or performance regression this change > >> could bring in, but maybe I'm overthinking it too much... > >> > >> Thanks, > >> -Siwei > >> > >>>> Thanks, > >>>> -Siwei > >>>>> return false; > >>>>> } > >>>>> >