Hi,

It seems that this patch has been stale for more than a month. From my
understanding, it is closely related to [1] and feedback from the
maintainers is needed, so I am sending this email as a gentle reminder.
Thanks!

Kind regards,
Henry 

[1] https://patchwork.kernel.org/project/xen-devel/list/?series=642393

> -----Original Message-----
> Subject: [RFC PATCH] xen/docs: Document acquire resource interface
> 
> This commit creates a new document describing the acquire resource
> interface. This is a reference document.
> 
> Signed-off-by: Matias Ezequiel Vara Larsen <matias.v...@vates.fr>
> ---
> RFC: The current document still contains TODOs. I am not really sure why
> different resources are implemented differently. I would like to understand it
> better so I can document it and then easily build new resources. I structured
> the document in two sections but I am not sure if that is the right way to do
> it.
> 
> ---
>  .../acquire_resource_reference.rst            | 337 ++++++++++++++++++
>  docs/hypervisor-guide/index.rst               |   2 +
>  2 files changed, 339 insertions(+)
>  create mode 100644 docs/hypervisor-guide/acquire_resource_reference.rst
> 
> diff --git a/docs/hypervisor-guide/acquire_resource_reference.rst b/docs/hypervisor-guide/acquire_resource_reference.rst
> new file mode 100644
> index 0000000000..a9944aae1d
> --- /dev/null
> +++ b/docs/hypervisor-guide/acquire_resource_reference.rst
> @@ -0,0 +1,337 @@
> +.. SPDX-License-Identifier: CC-BY-4.0
> +
> +Acquire resource reference
> +==========================
> +
> +Acquire resource allows a domain's resources to be shared with a Dom0 PV
> +tool. Resources are generally represented by pages that are mapped into
> +the PV tool's memory space. These pages are accessed by Xen and may or
> +may not be accessed by the DomU itself. This document describes the API
> +used to build PV tools, as well as the software components required to
> +create and expose a domain's resource. This is not a tutorial or a
> +how-to guide; it merely describes the machinery that is already
> +described in the code itself.
> +
> +.. warning::
> +
> +    The code in this document may already be out of date; however, it
> +    should still be enough to illustrate how the acquire resource
> +    interface works.
> +
> +
> +PV tool API
> +-----------
> +
> +This section describes the API used by a PV tool to map a resource. The
> +API is based on the following functions:
> +
> +* xenforeignmemory_open()
> +
> +* xenforeignmemory_resource_size()
> +
> +* xenforeignmemory_map_resource()
> +
> +* xenforeignmemory_unmap_resource()
> +
> +The ``xenforeignmemory_open()`` function gets the handle that is used by
> +the rest of the functions:
> +
> +.. code-block:: c
> +
> +   fh = xenforeignmemory_open(NULL, 0);
> +
> +The ``xenforeignmemory_resource_size()`` function gets the size of a
> +resource. For example, the following code gets the size of the
> +``XENMEM_resource_vmtrace_buf`` resource:
> +
> +.. code-block:: c
> +
> +    rc = xenforeignmemory_resource_size(fh, domid, XENMEM_resource_vmtrace_buf,
> +                                        vcpu, &size);
> +
> +The size of the resource is returned in ``size`` in bytes.
> +
> +The ``xenforeignmemory_map_resource()`` function maps a domain's
> +resource. The function is declared as follows:
> +
> +.. code-block:: c
> +
> +    xenforeignmemory_resource_handle *xenforeignmemory_map_resource(
> +        xenforeignmemory_handle *fmem, domid_t domid, unsigned int type,
> +        unsigned int id, unsigned long frame, unsigned long nr_frames,
> +        void **paddr, int prot, int flags);
> +
> +The offset into the resource (``frame``) and the size of the mapping
> +(``nr_frames``) are expressed in number of frames. For example, **QEMU**
> +uses this function to map the ioreq server pages shared between the
> +domain and QEMU:
> +
> +.. code-block:: c
> +
> +    fres = xenforeignmemory_map_resource(xen_fmem, xen_domid,
> +        XENMEM_resource_ioreq_server, state->ioservid, 0, 2, &addr,
> +        PROT_READ | PROT_WRITE, 0);
> +
> +
> +The third parameter selects the resource to be requested from the
> +domain, e.g., ``XENMEM_resource_ioreq_server``. The seventh parameter is
> +a pointer-to-pointer through which the address of the mapped resource is
> +returned.
> +
> +Finally, the ``xenforeignmemory_unmap_resource()`` function unmaps the
> +region:
> +
> +.. code-block:: c
> +    :caption: tools/misc/xen-vmtrace.c
> +
> +    if ( fres && xenforeignmemory_unmap_resource(fh, fres) )
> +        perror("xenforeignmemory_unmap_resource()");
> +
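> +Putting these four calls together, a minimal tool that maps a vcpu's
> +vmtrace buffer could look like the sketch below. This is illustrative
> +code rather than code from the tree: error handling is reduced to early
> +returns, and ``domid`` is assumed to name a domain that was created with
> +a vmtrace buffer (``XC_PAGE_SHIFT`` comes from ``xenctrl.h``):
> +
> +.. code-block:: c
> +
> +    #include <sys/mman.h>
> +
> +    #include <xenctrl.h>
> +    #include <xenforeignmemory.h>
> +
> +    /* Map the vmtrace buffer of the given vcpu, use it, and unmap it. */
> +    static int map_vmtrace(domid_t domid, unsigned int vcpu)
> +    {
> +        xenforeignmemory_handle *fh = xenforeignmemory_open(NULL, 0);
> +        xenforeignmemory_resource_handle *fres;
> +        size_t size;
> +        void *data = NULL;
> +
> +        if ( !fh )
> +            return -1;
> +
> +        /* The size is returned in bytes; the mapping is made in frames. */
> +        if ( xenforeignmemory_resource_size(
> +                 fh, domid, XENMEM_resource_vmtrace_buf, vcpu, &size) )
> +            return -1;
> +
> +        fres = xenforeignmemory_map_resource(
> +            fh, domid, XENMEM_resource_vmtrace_buf, vcpu,
> +            0, size >> XC_PAGE_SHIFT, &data, PROT_READ, 0);
> +        if ( !fres )
> +            return -1;
> +
> +        /* ... consume the trace data at 'data' ... */
> +
> +        xenforeignmemory_unmap_resource(fh, fres);
> +        xenforeignmemory_close(fh);
> +        return 0;
> +    }
> +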
> +Sharing a resource with a PV tool
> +---------------------------------
> +
> +This section describes how to build a new resource and share it with a
> +PV tool. Resources are defined in ``xen/include/public/memory.h``. In
> +Xen 4.16, there are three resources:
> +
> +.. code-block:: c
> +    :caption: xen/include/public/memory.h
> +
> +    #define XENMEM_resource_ioreq_server 0
> +    #define XENMEM_resource_grant_table 1
> +    #define XENMEM_resource_vmtrace_buf 2
> +
> +The ``resource_max_frames()`` function returns the size of a resource in
> +frames. Each resource type may provide its own handler to compute that
> +size. This is the definition of the ``resource_max_frames()`` function:
> +
> +.. code-block:: c
> +    :linenos:
> +    :caption: xen/common/memory.c
> +
> +    static unsigned int resource_max_frames(const struct domain *d,
> +                                            unsigned int type, unsigned int id)
> +    {
> +        switch ( type )
> +        {
> +        case XENMEM_resource_grant_table:
> +            return gnttab_resource_max_frames(d, id);
> +
> +        case XENMEM_resource_ioreq_server:
> +            return ioreq_server_max_frames(d);
> +
> +        case XENMEM_resource_vmtrace_buf:
> +            return d->vmtrace_size >> PAGE_SHIFT;
> +
> +        default:
> +            return -EOPNOTSUPP;
> +        }
> +    }
> +
> +The ``_acquire_resource()`` function invokes the handler that acquires
> +the resource. It relies on ``type`` to select the right handler:
> +
> +.. code-block:: c
> +    :linenos:
> +    :caption: xen/common/memory.c
> +
> +    static int _acquire_resource(
> +        struct domain *d, unsigned int type, unsigned int id,
> +        unsigned int frame, unsigned int nr_frames, xen_pfn_t mfn_list[])
> +    {
> +        switch ( type )
> +        {
> +        case XENMEM_resource_grant_table:
> +            return gnttab_acquire_resource(d, id, frame, nr_frames, mfn_list);
> +
> +        case XENMEM_resource_ioreq_server:
> +            return acquire_ioreq_server(d, id, frame, nr_frames, mfn_list);
> +
> +        case XENMEM_resource_vmtrace_buf:
> +            return acquire_vmtrace_buf(d, id, frame, nr_frames, mfn_list);
> +
> +        default:
> +            return -EOPNOTSUPP;
> +        }
> +    }
> +
> +Note that adding a new resource requires modifying both of these
> +functions; a sketch of such an addition follows below. All handlers
> +share the following common declaration:
> +
> +.. code-block:: c
> +    :linenos:
> +    :caption: xen/common/memory.c
> +
> +    static int acquire_vmtrace_buf(
> +        struct domain *d, unsigned int id, unsigned int frame,
> +        unsigned int nr_frames, xen_pfn_t mfn_list[])
> +    {
> +
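> +For illustration, wiring a hypothetical new resource into the interface
> +would touch both switches shown above. ``XENMEM_resource_foo``,
> +``foo_max_frames()`` and ``acquire_foo()`` below are made-up names, not
> +code from the tree:
> +
> +.. code-block:: c
> +
> +    /* xen/include/public/memory.h: allocate the next resource number. */
> +    #define XENMEM_resource_foo 3
> +
> +    /* resource_max_frames(): report the size of the resource in frames. */
> +    case XENMEM_resource_foo:
> +        return foo_max_frames(d);
> +
> +    /* _acquire_resource(): return the MFNs backing the resource. */
> +    case XENMEM_resource_foo:
> +        return acquire_foo(d, id, frame, nr_frames, mfn_list);
> +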
> +A handler returns in ``mfn_list[]`` the MFNs of the ``nr_frames``
> +requested frames. For example, for the ``XENMEM_resource_vmtrace_buf``
> +resource, the handler is defined as follows:
> +
> +.. code-block:: c
> +    :linenos:
> +    :caption: xen/common/memory.c
> +
> +    static int acquire_vmtrace_buf(
> +        struct domain *d, unsigned int id, unsigned int frame,
> +        unsigned int nr_frames, xen_pfn_t mfn_list[])
> +    {
> +        const struct vcpu *v = domain_vcpu(d, id);
> +        unsigned int i;
> +        mfn_t mfn;
> +
> +        if ( !v )
> +            return -ENOENT;
> +
> +        if ( !v->vmtrace.pg ||
> +             (frame + nr_frames) > (d->vmtrace_size >> PAGE_SHIFT) )
> +            return -EINVAL;
> +
> +        mfn = page_to_mfn(v->vmtrace.pg);
> +
> +        for ( i = 0; i < nr_frames; i++ )
> +            mfn_list[i] = mfn_x(mfn) + frame + i;
> +
> +        return nr_frames;
> +    }
> +
> +Note that the handler only returns MFNs of pages that have previously
> +been allocated in ``vmtrace.pg``. The allocation of the resource happens
> +when each vcpu is instantiated. The page is allocated from the domain
> +heap with the ``MEMF_no_refcount`` flag:
> +
> +.. What do we require to set this flag?
> +
> +.. code-block:: c
> +
> +    v->vmtrace.pg = alloc_domheap_page(s->target, MEMF_no_refcount);
> +
> +To access a page in the context of Xen, the page has to be mapped by
> +using:
> +
> +.. code-block:: c
> +
> +    va_page = __map_domain_page_global(page);
> +
> +The ``va_page`` pointer is used in the context of Xen. The function that
> +allocates the pages runs the following verification after allocation.
> +For example, the following code is from ``vmtrace_alloc_buffer()``,
> +which allocates the vmtrace pages for a given vcpu:
> +
> +.. Why is this verification required after allocation?
> +
> +.. code-block:: c
> +
> +    for ( i = 0; i < (d->vmtrace_size >> PAGE_SHIFT); i++ )
> +        if ( unlikely(!get_page_and_type(&pg[i], d, PGT_writable_page)) )
> +            /*
> +             * The domain can't possibly know about this page yet, so failure
> +             * here is a clear indication of something fishy going on.
> +             */
> +            goto refcnt_err;
> +
> +The allocated pages are released by first unmapping them with
> +``unmap_domain_page_global()`` and then freeing them with
> +``free_domheap_page()``. Note that the releasing of these resources may
> +vary depending on how they are allocated.
> +
> +Acquire Resources
> +-----------------
> +
> +This section briefly describes the resources that rely on the acquire
> +resource interface. These resources are mapped by PV tools like QEMU.
> +
> +Intel Processor Trace (IPT)
> +```````````````````````````
> +
> +This resource is named ``XENMEM_resource_vmtrace_buf`` and its size in
> +bytes is set in ``d->vmtrace_size``. It contains the traces that IPT
> +generates for each vcpu. The pages are allocated during
> +``vcpu_create()`` and stored in the ``vcpu`` structure in ``sched.h``:
> +
> +.. code-block:: c
> +
> +    struct {
> +        struct page_info *pg; /* One contiguous allocation of d->vmtrace_size */
> +    } vmtrace;
> +
> +During ``vcpu_create()``, ``pg`` is allocated from the domain heap:
> +
> +.. code-block:: c
> +
> +    pg = alloc_domheap_pages(d, get_order_from_bytes(d->vmtrace_size),
> +                             MEMF_no_refcount);
> +
> +For a given vcpu, the address of the buffer is loaded into the guest at
> +``vmx_restore_guest_msrs()``:
> +
> +.. code-block:: c
> +    :caption: xen/arch/x86/hvm/vmx/vmx.c
> +
> +    wrmsrl(MSR_RTIT_OUTPUT_BASE, page_to_maddr(v->vmtrace.pg));
> +
> +The pages are released during vcpu teardown.
> +
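> +As a sketch, the teardown drops, for every page, the references taken at
> +allocation time by the verification loop shown earlier. This is modelled
> +on the hypervisor's vmtrace cleanup; treat it as an illustration rather
> +than a verbatim copy of the tree:
> +
> +.. code-block:: c
> +
> +    for ( i = 0; i < (d->vmtrace_size >> PAGE_SHIFT); i++ )
> +    {
> +        put_page_alloc_ref(&pg[i]);  /* drop the allocation reference */
> +        put_page_and_type(&pg[i]);   /* drop the writable-type reference */
> +    }
> +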
> +Grant Table
> +```````````
> +
> +The grant tables are represented by the ``XENMEM_resource_grant_table``
> +resource. Grant tables are special since guests can map grant tables.
> +Dom0 also needs to write into the grant table to set up the grants for
> +xenstored and xenconsoled. When acquiring the resource, the pages are
> +allocated from the Xen heap in ``gnttab_get_shared_frame_mfn()``:
> +
> +.. code-block:: c
> +    :linenos:
> +    :caption: xen/common/grant_table.c
> +
> +    gt->shared_raw[i] = alloc_xenheap_page();
> +    share_xen_page_with_guest(virt_to_page(gt->shared_raw[i]), d, SHARE_rw);
> +
> +The pages are then shared with the guest and converted from virtual
> +addresses to MFNs before returning:
> +
> +.. code-block:: c
> +    :linenos:
> +
> +    for ( i = 0; i < nr_frames; ++i )
> +         mfn_list[i] = virt_to_mfn(vaddrs[frame + i]);
> +
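> +For the grant table resource, the ``id`` parameter selects between the
> +shared frames and the status frames; the identifiers are defined in the
> +public header:
> +
> +.. code-block:: c
> +    :caption: xen/include/public/memory.h
> +
> +    #define XENMEM_resource_grant_table_id_shared 0
> +    #define XENMEM_resource_grant_table_id_status 1
> +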
> +Ioreq server
> +````````````
> +
> +The ioreq server is represented by the ``XENMEM_resource_ioreq_server``
> +resource. An ioreq server provides emulated devices to HVM and PVH
> +guests. The allocation is done in ``ioreq_server_alloc_mfn()``. The
> +following code partially shows the allocation of the pages that
> +represent the ioreq server:
> +
> +.. code-block:: c
> +    :linenos:
> +    :caption: xen/common/ioreq.c
> +
> +    page = alloc_domheap_page(s->target, MEMF_no_refcount);
> +
> +    iorp->va = __map_domain_page_global(page);
> +    if ( !iorp->va )
> +        goto fail;
> +
> +    iorp->page = page;
> +    clear_page(iorp->va);
> +    return 0;
> +
> +The function above is invoked from ``ioreq_server_get_frame()``, which
> +is called from ``acquire_ioreq_server()``. During acquisition, the
> +function returns the allocated pages as follows:
> +
> +.. code-block:: c
> +
> +    *mfn = page_to_mfn(s->bufioreq.page);
> +
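> +The ``frame`` index selects which of the server's pages is returned:
> +index 0 is the buffered ioreq page and subsequent indices are the
> +synchronous ioreq pages, as encoded in the public header:
> +
> +.. code-block:: c
> +    :caption: xen/include/public/memory.h
> +
> +    #define XENMEM_resource_ioreq_server_frame_bufioreq 0
> +    #define XENMEM_resource_ioreq_server_frame_ioreq(n) (1 + (n))
> +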
> +The ``ioreq_server_free_mfn()`` function releases the pages as follows:
> +
> +.. code-block:: c
> +    :linenos:
> +    :caption: xen/common/ioreq.c
> +
> +    unmap_domain_page_global(iorp->va);
> +    iorp->va = NULL;
> +
> +    put_page_alloc_ref(page);
> +    put_page_and_type(page);
> +
> +.. TODO: Why unmap() and free() are not used instead?
> diff --git a/docs/hypervisor-guide/index.rst b/docs/hypervisor-guide/index.rst
> index e4393b0697..961a11525f 100644
> --- a/docs/hypervisor-guide/index.rst
> +++ b/docs/hypervisor-guide/index.rst
> @@ -9,3 +9,5 @@ Hypervisor documentation
>     code-coverage
> 
>     x86/index
> +
> +   acquire_resource_reference
> --
> 2.25.1
> 

