Re: Enabling hypervisor agnosticism for VirtIO backends

2021-09-14 Thread Alex Bennée


Stefano Stabellini  writes:

> On Mon, 6 Sep 2021, AKASHI Takahiro wrote:
>> > the second is how many context switches are involved in a transaction.
>> > Of course with all things there is a trade off. Things involving the
>> > very tightest latency would probably opt for a bare metal backend which
>> > I think would imply hypervisor knowledge in the backend binary.
>> 
>> In the configuration phase of a virtio device, latency won't be a big
>> issue. In device operations (i.e. read/write to block devices), if we can
>> resolve the 'mmap' issue, as Oleksandr is proposing right now, the only
>> issue is how efficiently we can deliver notifications to the opposite
>> side. Right? And this is a very common problem whatever approach we take.
>> 
>> Anyhow, if we do care about latency in my approach, most of the
>> virtio-proxy-related code can be re-implemented just as a stub (or shim?)
>> library since the protocols are defined as RPCs.
>> In this case, however, we would lose the benefit of providing a "single
>> binary" BE.
>> (I know this is an arguable requirement, though.)
>
> In my experience, latency, performance, and security are far more
> important than providing a single binary.
>
> In my opinion, we should optimize for the best performance and security,
> then be practical on the topic of hypervisor agnosticism. For instance,
> a shared source with a small hypervisor-specific component, with one
> implementation of the small component for each hypervisor, would provide
> a good enough hypervisor abstraction. It is good to be hypervisor
> agnostic, but I wouldn't go to extra lengths to have a single binary.

I agree it shouldn't be a primary goal, although a single binary working
with helpers to bridge the gap would make a cool demo. The real aim of
agnosticism is to avoid having multiple implementations of the backend
itself for no other reason than a change in hypervisor.

> I cannot picture a case where a BE binary needs to be moved between
> different hypervisors and a recompilation is impossible (BE, not FE).
> Instead, I can definitely imagine detailed requirements on IRQ latency
> having to be lower than 10us or bandwidth higher than 500 MB/sec.
>
> Instead of virtio-proxy, my suggestion is to work together on a common
> project and common source with others interested in the same problem.
>
> I would pick something like kvmtool as a basis. It doesn't have to be
> kvmtool, and kvmtool specifically is GPL-licensed, which is
> unfortunate because it would help if the license were BSD-style for ease
> of integration with Zephyr and other RTOSes.

This does imply making some choices, especially the implementation
language. However, I feel that C is really the lowest common denominator
here, and I get the sense that people would rather avoid it if they
could, given the potential security implications of a bug-prone backend.
This is what is prompting interest in Rust.

> As long as the project is open to working together on multiple
> hypervisors and deployment models then it is fine. For instance, the
> shared source could be based on OpenAMP kvmtool [1] (the original
> kvmtool likely prefers to stay small and narrow-focused on KVM). OpenAMP
> kvmtool was created to add support for hypervisor-less virtio but they
> are very open to hypervisors too. It could be a good place to add a Xen
> implementation, a KVM fatqueue implementation, a Jailhouse
> implementation, etc. -- work together toward the common goal of a single
> BE source (not binary) supporting multiple different deployment models.
>
>
> [1] https://github.com/OpenAMP/kvmtool


-- 
Alex Bennée



Re: Enabling hypervisor agnosticism for VirtIO backends

2021-09-13 Thread Stefano Stabellini
On Mon, 6 Sep 2021, AKASHI Takahiro wrote:
> > the second is how many context switches are involved in a transaction.
> > Of course with all things there is a trade off. Things involving the
> > very tightest latency would probably opt for a bare metal backend which
> > I think would imply hypervisor knowledge in the backend binary.
> 
> In the configuration phase of a virtio device, latency won't be a big
> issue. In device operations (i.e. read/write to block devices), if we can
> resolve the 'mmap' issue, as Oleksandr is proposing right now, the only
> issue is how efficiently we can deliver notifications to the opposite
> side. Right? And this is a very common problem whatever approach we take.
> 
> Anyhow, if we do care about latency in my approach, most of the
> virtio-proxy-related code can be re-implemented just as a stub (or shim?)
> library since the protocols are defined as RPCs.
> In this case, however, we would lose the benefit of providing a "single
> binary" BE.
> (I know this is an arguable requirement, though.)

In my experience, latency, performance, and security are far more
important than providing a single binary.

In my opinion, we should optimize for the best performance and security,
then be practical on the topic of hypervisor agnosticism. For instance,
a shared source with a small hypervisor-specific component, with one
implementation of the small component for each hypervisor, would provide
a good enough hypervisor abstraction. It is good to be hypervisor
agnostic, but I wouldn't go to extra lengths to have a single binary. I
cannot picture a case where a BE binary needs to be moved between
different hypervisors and a recompilation is impossible (BE, not FE).
Instead, I can definitely imagine detailed requirements on IRQ latency
having to be lower than 10us or bandwidth higher than 500 MB/sec.
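
To illustrate the kind of split I mean, here is a minimal sketch in C of
a shared backend core sitting on top of a small per-hypervisor ops table.
All names are hypothetical, not an existing API:

#include <stddef.h>
#include <stdint.h>

/* The small hypervisor-specific component, one implementation per
 * hypervisor, hidden behind a narrow interface. */
struct hyp_ops {
    /* Map nr_pages of frontend guest memory at gpa; NULL on failure. */
    void *(*map_guest)(uint64_t guest_id, uint64_t gpa, size_t nr_pages);
    void  (*unmap_guest)(void *addr, size_t nr_pages);
    /* Kick the frontend: event channel on Xen, irqfd on KVM, etc. */
    int   (*kick_frontend)(uint64_t guest_id, uint32_t queue_idx);
    /* Block until the frontend kicks the backend. */
    int   (*wait_for_kick)(uint64_t guest_id);
};

/* The shared, hypervisor-agnostic backend code only sees hyp_ops. */
int be_handle_queue(const struct hyp_ops *ops, uint64_t guest_id,
                    uint64_t vring_gpa, size_t nr_pages)
{
    void *vring = ops->map_guest(guest_id, vring_gpa, nr_pages);
    if (!vring)
        return -1;
    ops->wait_for_kick(guest_id);
    /* ... process available descriptors in 'vring' ... */
    ops->kick_frontend(guest_id, 0);
    ops->unmap_guest(vring, nr_pages);
    return 0;
}

Each hypervisor would ship its own hyp_ops instance, while everything
above that line stays common source.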

Instead of virtio-proxy, my suggestion is to work together on a common
project and common source with others interested in the same problem.

I would pick something like kvmtool as a basis. It doesn't have to be
kvmtool, and kvmtool specifically is GPL-licensed, which is
unfortunate because it would help if the license were BSD-style for ease
of integration with Zephyr and other RTOSes.

As long as the project is open to working together on multiple
hypervisors and deployment models then it is fine. For instance, the
shared source could be based on OpenAMP kvmtool [1] (the original
kvmtool likely prefers to stay small and narrow-focused on KVM). OpenAMP
kvmtool was created to add support for hypervisor-less virtio but they
are very open to hypervisors too. It could be a good place to add a Xen
implementation, a KVM fatqueue implementation, a Jailhouse
implementation, etc. -- work together toward the common goal of a single
BE source (not binary) supporting multiple different deployment models.


[1] https://github.com/OpenAMP/kvmtool



Re: Enabling hypervisor agnosticism for VirtIO backends

2021-09-09 Thread AKASHI Takahiro
Hi Christopher,

On Tue, Sep 07, 2021 at 11:09:34AM -0700, Christopher Clark wrote:
> On Tue, Sep 7, 2021 at 4:55 AM AKASHI Takahiro 
> wrote:
> 
> > Hi,
> >
> > I have not covered all your comments below yet.
> > So just one comment:
> >
> > On Mon, Sep 06, 2021 at 05:57:43PM -0700, Christopher Clark wrote:
> > > On Thu, Sep 2, 2021 at 12:19 AM AKASHI Takahiro <
> > takahiro.aka...@linaro.org>
> > > wrote:
> >
> > (snip)
> >
> > > >It appears that, on FE side, at least three hypervisor calls (and
> > > >data copying) need to be invoked at every request, right?
> > > >
> > >
> > > For a write, counting FE sendv ops:
> > > 1: the write data payload is sent via the "Argo ring for writes"
> > > 2: the descriptor is sent via a sync of the available/descriptor ring
> > >   -- is there a third one that I am missing?
> >
> > In the picture, I can see
> > a) Data transmitted by Argo sendv
> > b) Descriptor written after data sendv
> > c) VirtIO ring sync'd to back-end via separate sendv
> >
> > Oops, (b) is not a hypervisor call, is it?
> >
> 
> That's correct, it is not - the blue arrows in the diagram are not
> hypercalls, they are intended to show data movement or action in the flow
> of performing the operation, and (b) is a data write within the guest's
> address space into the descriptor ring.
> 
> 
> 
> > (But I guess that you will have to have yet another call for notification
> > since there is no config register of QueueNotify?)
> >
> 
> Reasoning about hypercalls necessary for data movement:
> 
> VirtIO transport drivers are responsible for instantiating virtqueues
> (setup_vq) and are able to populate the notify function pointer in the
> virtqueue that they supply. The virtio-argo transport driver can provide a
> suitable notify function implementation that will issue the Argo
> sendv hypercall(s) for sending data from the guest frontend to the backend.
> By issuing the sendv at the time of the queuenotify, rather than as each
> buffer is added to the virtqueue, the cost of the sendv hypercall can be
> amortized over multiple buffer additions to the virtqueue.
> 
> I also understand that there has been some recent work in the Linaro
> Project Stratos on "Fat Virtqueues", where the data to be transmitted is
> included within an expanded virtqueue, which could further reduce the
> number of hypercalls required, since the data can be transmitted inline
> with the descriptors.
> Reference here:
> https://linaro.atlassian.net/wiki/spaces/STR/pages/25626313982/2021-01-21+Project+Stratos+Sync+Meeting+notes
> https://linaro.atlassian.net/browse/STR-25

Ah, yes. Obviously, "fatvirtqueue" has pros and cons.
One of the cons is that it won't be suitable for bigger payloads,
given the limited space in descriptors.

> As a result of the above, I think that a single hypercall could be
> sufficient for communicating data for multiple requests, and that a
> two-hypercall-per-request (worst case) upper bound could also be
> established.

When it comes to the payload or data plane, "fatvirtqueue" as well as
Argo utilizes copying. You dub it "DMA operations".
A similar approach can also be seen in virtio-over-ivshmem, where
a limited amount of memory is shared and the FE will allocate some space
in this buffer and copy the payload into it. Those allocations will be
done via the dma_ops of the virtio_ivshmem driver. The BE, on the other
hand, fetches the data from the shared memory using the "offset"
described in a descriptor.
Shared memory is divided into a couple of different groups:
one for read/write for all, others for one writer with many readers.
(I hope I'm right here :)
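
As a rough sketch of that BE-side fetch (the structure and names are
invented for illustration, not the actual virtio_ivshmem code):

#include <stdint.h>
#include <string.h>

struct shmem_desc {
    uint64_t offset;   /* offset of the payload within the shared region */
    uint32_t len;      /* payload length in bytes */
    uint16_t flags;
    uint16_t next;
};

/* shmem_base: the one region mapped at setup time; shmem_size: its size.
 * The BE never maps FE memory; it only does arithmetic plus a copy. */
int be_fetch_payload(const void *shmem_base, uint64_t shmem_size,
                     const struct shmem_desc *desc, void *out)
{
    /* Reject descriptors pointing outside the shared region. */
    if (desc->offset > shmem_size || desc->len > shmem_size - desc->offset)
        return -1;
    memcpy(out, (const uint8_t *)shmem_base + desc->offset, desc->len);
    return 0;
}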

Looks close to Argo, doesn't it? What is different is who is responsible
for copying data: the kernel or the hypervisor.
(Yeah, I know that Argo has more crucial aspects like access controls.)

In this sense, ivshmem can also be a candidate for a hypervisor-agnostic
framework. Jailhouse doesn't say so explicitly, AFAIK.
Jan may have some more to say.

Thanks,
-Takahiro Akashi


> Christopher
> 
> 
> 
> >
> > Thanks,
> > -Takahiro Akashi
> >
> >
> > > Christopher
> > >
> > >
> > > >
> > > > Thanks,
> > > > -Takahiro Akashi
> > > >
> > > >
> > > > > * Here are the design documents for building VirtIO-over-Argo, to
> > > > support a
> > > > >   hypervisor-agnostic frontend VirtIO transport driver using Argo.
> > > > >
> > > > > The Development Plan to build VirtIO virtual device support over Argo
> > > > > transport:
> > > > >
> > > >
> > https://openxt.atlassian.net/wiki/spaces/DC/pages/1696169985/VirtIO-Argo+Development+Phase+1
> > > > >
> > > > > A design for using VirtIO over Argo, describing how VirtIO data
> > > > structures
> > > > > and communication is handled over the Argo transport:
> > > > >
> > https://openxt.atlassian.net/wiki/spaces/DC/pages/1348763698/VirtIO+Argo
> > > > >
> > > > > Diagram (from the above document) showing how VirtIO rings are
> > > > synchronized
> > > > > between domains without using shared memory:
> > > > >
> > > >
> > 

Re: Enabling hypervisor agnosticism for VirtIO backends

2021-09-07 Thread Christopher Clark
On Tue, Sep 7, 2021 at 4:55 AM AKASHI Takahiro 
wrote:

> Hi,
>
> I have not covered all your comments below yet.
> So just one comment:
>
> On Mon, Sep 06, 2021 at 05:57:43PM -0700, Christopher Clark wrote:
> > On Thu, Sep 2, 2021 at 12:19 AM AKASHI Takahiro <
> takahiro.aka...@linaro.org>
> > wrote:
>
> (snip)
>
> > >It appears that, on FE side, at least three hypervisor calls (and
> > >data copying) need to be invoked at every request, right?
> > >
> >
> > For a write, counting FE sendv ops:
> > 1: the write data payload is sent via the "Argo ring for writes"
> > 2: the descriptor is sent via a sync of the available/descriptor ring
> >   -- is there a third one that I am missing?
>
> In the picture, I can see
> a) Data transmitted by Argo sendv
> b) Descriptor written after data sendv
> c) VirtIO ring sync'd to back-end via separate sendv
>
> Oops, (b) is not a hypervisor call, is it?
>

That's correct, it is not - the blue arrows in the diagram are not
hypercalls, they are intended to show data movement or action in the flow
of performing the operation, and (b) is a data write within the guest's
address space into the descriptor ring.



> (But I guess that you will have to have yet another call for notification
> since there is no config register of QueueNotify?)
>

Reasoning about hypercalls necessary for data movement:

VirtIO transport drivers are responsible for instantiating virtqueues
(setup_vq) and are able to populate the notify function pointer in the
virtqueue that they supply. The virtio-argo transport driver can provide a
suitable notify function implementation that will issue the Argo
sendv hypercall(s) for sending data from the guest frontend to the backend.
By issuing the sendv at the time of the queuenotify, rather than as each
buffer is added to the virtqueue, the cost of the sendv hypercall can be
amortized over multiple buffer additions to the virtqueue.
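
A sketch of what that notify-time batching could look like; argo_sendv()
and the types here are stand-ins for the real Argo interface, not the
actual virtio-argo driver code:

#include <stdbool.h>
#include <stddef.h>

#define ARGO_VQ_MAX 64

struct iovec_like { void *base; size_t len; };

struct argo_vq {
    struct iovec_like pending[ARGO_VQ_MAX]; /* buffers since the last kick */
    unsigned int npending;
    unsigned int dst_port;                  /* backend's Argo ring */
};

/* Hypothetical wrapper around the Argo sendv hypercall. */
extern int argo_sendv(unsigned int dst_port,
                      const struct iovec_like *iov, unsigned int niov);

/* Called as the driver adds buffers: no hypercall, just bookkeeping. */
static bool argo_vq_add(struct argo_vq *vq, void *base, size_t len)
{
    if (vq->npending >= ARGO_VQ_MAX)
        return false;
    vq->pending[vq->npending++] = (struct iovec_like){ base, len };
    return true;
}

/* The notify callback installed at setup_vq time: one sendv delivers
 * every buffer added since the previous notification. */
static bool argo_vq_notify(struct argo_vq *vq)
{
    int rc = argo_sendv(vq->dst_port, vq->pending, vq->npending);
    vq->npending = 0;
    return rc >= 0;
}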

I also understand that there has been some recent work in the Linaro
Project Stratos on "Fat Virtqueues", where the data to be transmitted is
included within an expanded virtqueue, which could further reduce the
number of hypercalls required, since the data can be transmitted inline
with the descriptors.
Reference here:
https://linaro.atlassian.net/wiki/spaces/STR/pages/25626313982/2021-01-21+Project+Stratos+Sync+Meeting+notes
https://linaro.atlassian.net/browse/STR-25

As a result of the above, I think that a single hypercall could be
sufficient for communicating data for multiple requests, and that a
two-hypercall-per-request (worst case) upper bound could also be
established.
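
For illustration only, a "fat" descriptor could reserve room for the
payload itself, so small requests travel inline with the descriptor ring,
falling back to an address for larger buffers. This layout is invented
for the sketch, not the Stratos design:

#include <stdint.h>

#define FATQ_INLINE_MAX 512          /* assumed per-descriptor payload room */
#define FATQ_F_INLINE   (1u << 15)   /* hypothetical "data is inline" flag */

struct fatq_desc {
    uint64_t addr;                   /* used when the data is NOT inline */
    uint32_t len;
    uint16_t flags;                  /* FATQ_F_INLINE selects inline_data */
    uint16_t next;
    uint8_t  inline_data[FATQ_INLINE_MAX]; /* payload travels with the ring */
};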

Christopher



>
> Thanks,
> -Takahiro Akashi
>
>
> > Christopher
> >
> >
> > >
> > > Thanks,
> > > -Takahiro Akashi
> > >
> > >
> > > > * Here are the design documents for building VirtIO-over-Argo, to
> > > support a
> > > >   hypervisor-agnostic frontend VirtIO transport driver using Argo.
> > > >
> > > > The Development Plan to build VirtIO virtual device support over Argo
> > > > transport:
> > > >
> > >
> https://openxt.atlassian.net/wiki/spaces/DC/pages/1696169985/VirtIO-Argo+Development+Phase+1
> > > >
> > > > A design for using VirtIO over Argo, describing how VirtIO data
> > > structures
> > > > and communication is handled over the Argo transport:
> > > >
> https://openxt.atlassian.net/wiki/spaces/DC/pages/1348763698/VirtIO+Argo
> > > >
> > > > Diagram (from the above document) showing how VirtIO rings are
> > > synchronized
> > > > between domains without using shared memory:
> > > >
> > >
> https://openxt.atlassian.net/46e1c93b-2b87-4cb2-951e-abd4377a1194#media-blob-url=true=01f7d0e1-7686-4f0b-88e1-457c1d30df40=contentId-1348763698=1348763698=image%2Fpng=device-buffer-access-virtio-argo.png=243175=1106=1241
> > > >
> > > > Please note that the above design documents show that the existing
> VirtIO
> > > > device drivers, and both vring and virtqueue data structures can be
> > > > preserved
> > > > while interdomain communication can be performed with no shared
> memory
> > > > required
> > > > for most drivers; (the exceptions where further design is required
> are
> > > those
> > > > such as virtual framebuffer devices where shared memory regions are
> > > > intentionally
> > > > added to the communication structure beyond the vrings and
> virtqueues).
> > > >
> > > > An analysis of VirtIO and Argo, informing the design:
> > > >
> > >
> https://openxt.atlassian.net/wiki/spaces/DC/pages/1333428225/Analysis+of+Argo+as+a+transport+medium+for+VirtIO
> > > >
> > > > * Argo can be used for a communication path for configuration
> between the
> > > > backend
> > > >   and the toolstack, avoiding the need for a dependency on XenStore,
> > > which
> > > > is an
> > > >   advantage for any hypervisor-agnostic design. It is also amenable
> to a
> > > > notification
> > > >   mechanism that is not based on Xen event channels.
> > > >
> > > > * Argo does not use or require shared memory between VMs and
> provides an
> > > > 

Re: Enabling hypervisor agnosticism for VirtIO backends

2021-09-07 Thread AKASHI Takahiro
Hi,

I have not covered all your comments below yet.
So just one comment:

On Mon, Sep 06, 2021 at 05:57:43PM -0700, Christopher Clark wrote:
> On Thu, Sep 2, 2021 at 12:19 AM AKASHI Takahiro 
> wrote:

(snip)

> >It appears that, on FE side, at least three hypervisor calls (and data
> >copying) need to be invoked at every request, right?
> >
> 
> For a write, counting FE sendv ops:
> 1: the write data payload is sent via the "Argo ring for writes"
> 2: the descriptor is sent via a sync of the available/descriptor ring
>   -- is there a third one that I am missing?

In the picture, I can see
a) Data transmitted by Argo sendv
b) Descriptor written after data sendv
c) VirtIO ring sync'd to back-end via separate sendv

Oops, (b) is not a hypervisor call, is it?
(But I guess that you will have to have yet another call for notification
since there is no config register of QueueNotify?)

Thanks,
-Takahiro Akashi


> Christopher
> 
> 
> >
> > Thanks,
> > -Takahiro Akashi
> >
> >
> > > * Here are the design documents for building VirtIO-over-Argo, to
> > support a
> > >   hypervisor-agnostic frontend VirtIO transport driver using Argo.
> > >
> > > The Development Plan to build VirtIO virtual device support over Argo
> > > transport:
> > >
> > https://openxt.atlassian.net/wiki/spaces/DC/pages/1696169985/VirtIO-Argo+Development+Phase+1
> > >
> > > A design for using VirtIO over Argo, describing how VirtIO data
> > structures
> > > and communication is handled over the Argo transport:
> > > https://openxt.atlassian.net/wiki/spaces/DC/pages/1348763698/VirtIO+Argo
> > >
> > > Diagram (from the above document) showing how VirtIO rings are
> > synchronized
> > > between domains without using shared memory:
> > >
> > https://openxt.atlassian.net/46e1c93b-2b87-4cb2-951e-abd4377a1194#media-blob-url=true=01f7d0e1-7686-4f0b-88e1-457c1d30df40=contentId-1348763698=1348763698=image%2Fpng=device-buffer-access-virtio-argo.png=243175=1106=1241
> > >
> > > Please note that the above design documents show that the existing VirtIO
> > > device drivers, and both vring and virtqueue data structures can be
> > > preserved
> > > while interdomain communication can be performed with no shared memory
> > > required
> > > for most drivers; (the exceptions where further design is required are
> > those
> > > such as virtual framebuffer devices where shared memory regions are
> > > intentionally
> > > added to the communication structure beyond the vrings and virtqueues).
> > >
> > > An analysis of VirtIO and Argo, informing the design:
> > >
> > https://openxt.atlassian.net/wiki/spaces/DC/pages/1333428225/Analysis+of+Argo+as+a+transport+medium+for+VirtIO
> > >
> > > * Argo can be used for a communication path for configuration between the
> > > backend
> > >   and the toolstack, avoiding the need for a dependency on XenStore,
> > which
> > > is an
> > >   advantage for any hypervisor-agnostic design. It is also amenable to a
> > > notification
> > >   mechanism that is not based on Xen event channels.
> > >
> > > * Argo does not use or require shared memory between VMs and provides an
> > > alternative
> > >   to the use of foreign shared memory mappings. It avoids some of the
> > > complexities
> > >   involved with using grants (eg. XSA-300).
> > >
> > > * Argo supports Mandatory Access Control by the hypervisor, satisfying a
> > > common
> > >   certification requirement.
> > >
> > > * The Argo headers are BSD-licensed and the Xen hypervisor implementation
> > > is GPLv2 but
> > >   accessible via the hypercall interface. The licensing should not
> > present
> > > an obstacle
> > >   to adoption of Argo in guest software or implementation by other
> > > hypervisors.
> > >
> > > * Since the interface that Argo presents to a guest VM is similar to
> > DMA, a
> > > VirtIO-Argo
> > >   frontend transport driver should be able to operate with a physical
> > > VirtIO-enabled
> > >   smart-NIC if the toolstack and an Argo-aware backend provide support.
> > >
> > > The next Xen Community Call is next week and I would be happy to answer
> > > questions
> > > about Argo and on this topic. I will also be following this thread.
> > >
> > > Christopher
> > > (Argo maintainer, Xen Community)
> > >
> > >
> > 
> > > [1]
> > > An introduction to Argo:
> > >
> > https://static.sched.com/hosted_files/xensummit19/92/Argo%20and%20HMX%20-%20OpenXT%20-%20Christopher%20Clark%20-%20Xen%20Summit%202019.pdf
> > > https://www.youtube.com/watch?v=cnC0Tg3jqJQ
> > > Xen Wiki page for Argo:
> > >
> > https://wiki.xenproject.org/wiki/Argo:_Hypervisor-Mediated_Exchange_(HMX)_for_Xen
> > >
> > > [2]
> > > OpenXT Linux Argo driver and userspace library:
> > > https://github.com/openxt/linux-xen-argo
> > >
> > > Windows V4V at OpenXT wiki:
> > > https://openxt.atlassian.net/wiki/spaces/DC/pages/14844007/V4V
> > > Windows v4v driver source:
> > > 

Re: Enabling hypervisor agnosticism for VirtIO backends

2021-09-06 Thread Christopher Clark
On Thu, Sep 2, 2021 at 12:19 AM AKASHI Takahiro 
wrote:

> Hi Christopher,
>
> Thank you for your feedback.
>
> On Mon, Aug 30, 2021 at 12:53:00PM -0700, Christopher Clark wrote:
> > [ resending message to ensure delivery to the CCd mailing lists
> > post-subscription ]
> >
> > Apologies for being late to this thread, but I hope to be able to
> > contribute to
> > this discussion in a meaningful way. I am grateful for the level of
> > interest in
> > this topic. I would like to draw your attention to Argo as a suitable
> > technology for development of VirtIO's hypervisor-agnostic interfaces.
> >
> > * Argo is an interdomain communication mechanism in Xen (on x86 and Arm)
> > that
> >   can send and receive hypervisor-mediated notifications and messages
> > between
> >   domains (VMs). [1] The hypervisor can enforce Mandatory Access Control
> > over
> >   all communication between domains. It is derived from the earlier v4v,
> > which
> >   has been deployed on millions of machines with the HP/Bromium uXen
> > hypervisor
> >   and with OpenXT.
> >
> > * Argo has a simple interface with a small number of operations that was
> >   designed for ease of integration into OS primitives on both Linux
> > (sockets)
> >   and Windows (ReadFile/WriteFile) [2].
> > - A unikernel example of using it has also been developed for XTF.
> [3]
> >
> > * There has been recent discussion and support in the Xen community for
> > making
> >   revisions to the Argo interface to make it hypervisor-agnostic, and
> > support
> >   implementations of Argo on other hypervisors. This will enable a single
> >   interface for an OS kernel binary to use for inter-VM communication
> that
> > will
> >   work on multiple hypervisors -- this applies equally to both backends
> and
> >   frontend implementations. [4]
>
> Regarding virtio-over-Argo, let me ask a few questions:
> (In figure "Virtual device buffer access:Virtio+Argo" in [4])
>

(for ref, this diagram is from this document:
 https://openxt.atlassian.net/wiki/spaces/DC/pages/1348763698 )

Takahiro, thanks for reading the Virtio-Argo materials.

Some relevant context before answering your questions below: the Argo
request interface that the hypervisor presents to a guest, which is
currently exposed only via a dedicated hypercall op, has been discussed
within the Xen community and is open to being changed in order to better
enable support for guest VM access to Argo functions in a
hypervisor-agnostic way.

The proposal is to allow hypervisors the option to implement and expose
any of multiple access mechanisms for Argo, and then enable a guest
device driver to probe the hypervisor for methods that it is aware of and
able to use. The hypercall op is likely to be retained (in some form),
and complemented at least on x86 with another interface via MSRs
presented to the guests.



> 1) How is the configuration managed?
>On either virtio-mmio or virtio-pci, some negotiation always takes
>place between the FE and BE through the "configuration" space.
>How can this be done in virtio-over-Argo?
>

Just to be clear about my understanding: your question, in the context of a
Linux kernel virtio device driver implementation, is about how a virtio-argo
transport driver would implement the get_features function of the
virtio_config_ops, as a parallel to the work that vp_get_features does for
virtio-pci, and vm_get_features does for virtio-mmio.

The design is still open on this and options have been discussed, including:

* an extension to Argo to allow the system toolstack (which is responsible
  for managing guest VMs and enabling connections from front-to-backends)
  to manage a table of "implicit destinations", so a guest can transmit
  Argo messages to e.g. a "my storage service" port and the hypervisor
  will deliver it based on a destination table pre-programmed by the
  toolstack for the VM. [1]
 - ref: Notes from the December 2019 Xen F2F meeting in Cambridge, UK:
   [1] https://lists.archive.carbon60.com/xen/devel/577800#577800

  So within that feature negotiation function, communication with the
  backend via that Argo channel will occur.

* IOREQ
The Xen IOREQ implementation is not currently appropriate for virtio-argo,
since it requires the use of foreign memory mappings of frontend memory in
the backend guest. However, a new HMX interface from the hypervisor could
support a new DMA Device Model Op to allow the backend to request the
hypervisor to retrieve specified bytes from the frontend guest, which
would enable plumbing for device configuration between an IOREQ server
(device model backend implementation) and the guest driver. [2]

Feature negotiation in the front end in this case would look very similar
to the virtio-mmio implementation.

ref: Argo HMX Transport for VirtIO meeting minutes, from January 2021:
[2] https://lists.xenproject.org/archives/html/xen-devel/2021-02/msg01422.html

* guest ACPI tables that surface the address of a remote Argo endpoint
  on 

Re: Enabling hypervisor agnosticism for VirtIO backends

2021-09-05 Thread AKASHI Takahiro
Alex,

On Fri, Sep 03, 2021 at 10:28:06AM +0100, Alex Bennée wrote:
> 
> AKASHI Takahiro  writes:
> 
> > Alex,
> >
> > On Wed, Sep 01, 2021 at 01:53:34PM +0100, Alex Bennée wrote:
> >> 
> >> Stefan Hajnoczi  writes:
> >> 
> >> > [[PGP Signed Part:Undecided]]
> >> > On Wed, Aug 04, 2021 at 12:20:01PM -0700, Stefano Stabellini wrote:
> >> >> > Could we consider the kernel internally converting IOREQ messages from
> >> >> > the Xen hypervisor to eventfd events? Would this scale with other
> >> >> > kernel hypercall interfaces?
> >> >> > 
> >> >> > So any thoughts on what directions are worth experimenting with?
> >> >>  
> >> >> One option we should consider is for each backend to connect to Xen via
> >> >> the IOREQ interface. We could generalize the IOREQ interface and make it
> >> >> hypervisor agnostic. The interface is really trivial and easy to add.
> >> >> The only Xen-specific part is the notification mechanism, which is an
> >> >> event channel. If we replaced the event channel with something else the
> >> >> interface would be generic. See:
> >> >> https://gitlab.com/xen-project/xen/-/blob/staging/xen/include/public/hvm/ioreq.h#L52
> >> >
> >> > There have been experiments with something kind of similar in KVM
> >> > recently (see struct ioregionfd_cmd):
> >> > https://lore.kernel.org/kvm/dad3d025bcf15ece11d9df0ff685e8ab0a4f2edd.1613828727.git.eafanas...@gmail.com/
> >> 
> >> Reading the cover letter was very useful in showing how this provides a
> >> separate channel for signalling IO events to userspace instead of using
> >> the normal type-2 vmexit type event. I wonder how deeply tied the
> >> userspace facing side of this is to KVM? Could it provide a common FD
> >> type interface to IOREQ?
> >
> > Why do you stick to a "FD" type interface?
> 
> I mean most user space interfaces on POSIX start with a file descriptor
> and the usual read/write semantics or a series of ioctls.

Who do you assume is responsible for implementing this kind of
fd semantics, the OS on the BE side or the hypervisor itself?

I think such interfaces can only be easily implemented on type-2 hypervisors.

# In this sense, I don't think rust-vmm, as it is, can be
# a general solution.

> >> As I understand IOREQ this is currently a direct communication between
> >> userspace and the hypervisor using the existing Xen message bus. My
> >
> > With IOREQ server, IO event occurrences are notified to the BE via Xen's
> > event channel, while the actual contexts of IO events (see struct ioreq
> > in ioreq.h) are put in a queue on a single shared memory page which is to
> > be assigned beforehand with the xenforeignmemory_map_resource hypervisor
> > call.
> 
> If we abstracted the IOREQ via the kernel interface you would probably
> just want to put the ioreq structure on a queue rather than expose the
> shared page to userspace. 

Where is that queue?

> >> worry would be that by adding knowledge of what the underlying
> >> hypervisor is we'd end up with excess complexity in the kernel. For one
> >> thing we certainly wouldn't want an API version dependency on the kernel
> >> to understand which version of the Xen hypervisor it was running on.
> >
> > That's exactly what virtio-proxy in my proposal [1] does; all the
> > hypervisor-specific details of IO event handling are contained in
> > virtio-proxy, and the virtio BE will communicate with virtio-proxy
> > through a virtqueue (yes, virtio-proxy is seen as yet another virtio
> > device on the BE) and will get IO event-related *RPC* callbacks, either
> > MMIO read or write, from virtio-proxy.
> >
> > See page 8 (protocol flow) and 10 (interfaces) in [1].
> 
> There are two areas of concern with the proxy approach at the moment.
> The first is how the bootstrap of the virtio-proxy channel happens and

As I said, from the BE's point of view, virtio-proxy would be seen
as yet another virtio device by which the BE could talk to the
"virtio-proxy" VM or whatever else.

This way we guarantee the BE's hypervisor-agnosticism instead of having
"common" hypervisor interfaces. That is the basis of my idea.

> the second is how many context switches are involved in a transaction.
> Of course with all things there is a trade off. Things involving the
> very tightest latency would probably opt for a bare metal backend which
> I think would imply hypervisor knowledge in the backend binary.

In the configuration phase of a virtio device, latency won't be a big
issue. In device operations (i.e. read/write to block devices), if we can
resolve the 'mmap' issue, as Oleksandr is proposing right now, the only
issue is how efficiently we can deliver notifications to the opposite
side. Right? And this is a very common problem whatever approach we take.

Anyhow, if we do care about latency in my approach, most of the
virtio-proxy-related code can be re-implemented just as a stub (or shim?)
library since the protocols are defined as RPCs.
In this case, however, we would lose the benefit of providing a "single
binary" BE.
(I know this is an arguable requirement, though.)

Re: Enabling hypervisor agnosticism for VirtIO backends

2021-09-03 Thread Alex Bennée


AKASHI Takahiro  writes:

> Alex,
>
> On Wed, Sep 01, 2021 at 01:53:34PM +0100, Alex Bennée wrote:
>> 
>> Stefan Hajnoczi  writes:
>> 
>> > [[PGP Signed Part:Undecided]]
>> > On Wed, Aug 04, 2021 at 12:20:01PM -0700, Stefano Stabellini wrote:
>> >> > Could we consider the kernel internally converting IOREQ messages from
>> >> > the Xen hypervisor to eventfd events? Would this scale with other kernel
>> >> > hypercall interfaces?
>> >> > 
>> >> > So any thoughts on what directions are worth experimenting with?
>> >>  
>> >> One option we should consider is for each backend to connect to Xen via
>> >> the IOREQ interface. We could generalize the IOREQ interface and make it
>> >> hypervisor agnostic. The interface is really trivial and easy to add.
>> >> The only Xen-specific part is the notification mechanism, which is an
>> >> event channel. If we replaced the event channel with something else the
>> >> interface would be generic. See:
>> >> https://gitlab.com/xen-project/xen/-/blob/staging/xen/include/public/hvm/ioreq.h#L52
>> >
>> > There have been experiments with something kind of similar in KVM
>> > recently (see struct ioregionfd_cmd):
>> > https://lore.kernel.org/kvm/dad3d025bcf15ece11d9df0ff685e8ab0a4f2edd.1613828727.git.eafanas...@gmail.com/
>> 
>> Reading the cover letter was very useful in showing how this provides a
>> separate channel for signalling IO events to userspace instead of using
>> the normal type-2 vmexit type event. I wonder how deeply tied the
>> userspace facing side of this is to KVM? Could it provide a common FD
>> type interface to IOREQ?
>
> Why do you stick to a "FD" type interface?

I mean most user space interfaces on POSIX start with a file descriptor
and the usual read/write semantics or a series of ioctls.

>> As I understand IOREQ this is currently a direct communication between
>> userspace and the hypervisor using the existing Xen message bus. My
>
> With IOREQ server, IO event occurrences are notified to the BE via Xen's
> event channel, while the actual contexts of IO events (see struct ioreq in
> ioreq.h) are put in a queue on a single shared memory page which is to be
> assigned beforehand with the xenforeignmemory_map_resource hypervisor call.

If we abstracted the IOREQ via the kernel interface you would probably
just want to put the ioreq structure on a queue rather than expose the
shared page to userspace. 
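
Something along these lines, purely as a sketch: /dev/ioreq and struct
ioreq_msg are hypothetical, only the shape of the interface matters.

#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* A kernel driver would copy each request onto an internal queue; the
 * backend consumes it with plain read()/write(), never seeing the
 * hypervisor-specific shared page. */
struct ioreq_msg {
    uint64_t addr;      /* MMIO address the guest touched */
    uint64_t data;      /* write payload, or space for the read reply */
    uint32_t size;
    uint8_t  dir;       /* 0 = guest read, 1 = guest write */
};

int be_event_loop(void)
{
    int fd = open("/dev/ioreq", O_RDWR);   /* hypothetical device node */
    if (fd < 0)
        return -1;
    struct ioreq_msg msg;
    while (read(fd, &msg, sizeof(msg)) == sizeof(msg)) {  /* blocks */
        if (msg.dir == 0) {
            msg.data = 0;  /* emulate the read: fill in the register value */
        } else {
            /* apply the guest's write to the emulated device state */
        }
        write(fd, &msg, sizeof(msg));      /* post the completion */
    }
    close(fd);
    return 0;
}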

>> worry would be that by adding knowledge of what the underlying
>> hypervisor is we'd end up with excess complexity in the kernel. For one
>> thing we certainly wouldn't want an API version dependency on the kernel
>> to understand which version of the Xen hypervisor it was running on.
>
> That's exactly what virtio-proxy in my proposal [1] does; all the
> hypervisor-specific details of IO event handling are contained in
> virtio-proxy, and the virtio BE will communicate with virtio-proxy through
> a virtqueue (yes, virtio-proxy is seen as yet another virtio device on the
> BE) and will get IO event-related *RPC* callbacks, either MMIO read or
> write, from virtio-proxy.
>
> See page 8 (protocol flow) and 10 (interfaces) in [1].

There are two areas of concern with the proxy approach at the moment.
The first is how the bootstrap of the virtio-proxy channel happens and
the second is how many context switches are involved in a transaction.
Of course with all things there is a trade off. Things involving the
very tightest latency would probably opt for a bare metal backend which
I think would imply hypervisor knowledge in the backend binary.

>
> If kvm's ioregionfd can fit into this protocol, virtio-proxy for kvm
> will hopefully be implemented using ioregionfd.
>
> -Takahiro Akashi
>
> [1] https://op-lists.linaro.org/pipermail/stratos-dev/2021-August/000548.html

-- 
Alex Bennée



Re: Enabling hypervisor agnosticism for VirtIO backends

2021-09-03 Thread AKASHI Takahiro
Alex,

On Wed, Sep 01, 2021 at 01:53:34PM +0100, Alex Bennée wrote:
> 
> Stefan Hajnoczi  writes:
> 
> > [[PGP Signed Part:Undecided]]
> > On Wed, Aug 04, 2021 at 12:20:01PM -0700, Stefano Stabellini wrote:
> >> > Could we consider the kernel internally converting IOREQ messages from
> >> > the Xen hypervisor to eventfd events? Would this scale with other kernel
> >> > hypercall interfaces?
> >> > 
> >> > So any thoughts on what directions are worth experimenting with?
> >>  
> >> One option we should consider is for each backend to connect to Xen via
> >> the IOREQ interface. We could generalize the IOREQ interface and make it
> >> hypervisor agnostic. The interface is really trivial and easy to add.
> >> The only Xen-specific part is the notification mechanism, which is an
> >> event channel. If we replaced the event channel with something else the
> >> interface would be generic. See:
> >> https://gitlab.com/xen-project/xen/-/blob/staging/xen/include/public/hvm/ioreq.h#L52
> >
> > There have been experiments with something kind of similar in KVM
> > recently (see struct ioregionfd_cmd):
> > https://lore.kernel.org/kvm/dad3d025bcf15ece11d9df0ff685e8ab0a4f2edd.1613828727.git.eafanas...@gmail.com/
> 
> Reading the cover letter was very useful in showing how this provides a
> separate channel for signalling IO events to userspace instead of using
> the normal type-2 vmexit type event. I wonder how deeply tied the
> userspace facing side of this is to KVM? Could it provide a common FD
> type interface to IOREQ?

Why do you stick to a "FD" type interface?

> As I understand IOREQ this is currently a direct communication between
> userspace and the hypervisor using the existing Xen message bus. My

With IOREQ server, IO event occurrences are notified to the BE via Xen's
event channel, while the actual contexts of IO events (see struct ioreq in
ioreq.h) are put in a queue on a single shared memory page which is to be
assigned beforehand with the xenforeignmemory_map_resource hypervisor call.
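
Roughly, the BE side then looks like the loop below. This is a sketch
over a simplified subset of struct ioreq; the state constants mirror
those in Xen's public ioreq.h, but check the real header, and the
evtchn_* functions are stand-ins for the event-channel calls:

#include <stdint.h>

#define STATE_IOREQ_READY     1
#define STATE_IOREQ_INPROCESS 2
#define STATE_IORESP_READY    3

struct ioreq {
    uint64_t addr;      /* address of the trapped access */
    uint64_t data;
    uint32_t count;
    uint32_t size;
    uint8_t  state;
    uint8_t  dir;       /* read or write */
    uint8_t  type;
};

extern void evtchn_wait(void);    /* stand-in: block on the event channel */
extern void evtchn_notify(void);  /* stand-in: signal completion to Xen */

/* 'req' points into the pre-mapped shared page for one vCPU. */
void be_serve_vcpu(volatile struct ioreq *req)
{
    for (;;) {
        evtchn_wait();                       /* Xen signals a pending IO */
        if (req->state != STATE_IOREQ_READY)
            continue;
        req->state = STATE_IOREQ_INPROCESS;
        /* emulate: req->dir, req->addr, req->size describe the access */
        req->state = STATE_IORESP_READY;
        evtchn_notify();                     /* let the vCPU resume */
    }
}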

> worry would be that by adding knowledge of what the underlying
> hypervisor is we'd end up with excess complexity in the kernel. For one
> thing we certainly wouldn't want an API version dependency on the kernel
> to understand which version of the Xen hypervisor it was running on.

That's exactly what virtio-proxy in my proposal [1] does; all the
hypervisor-specific details of IO event handling are contained in
virtio-proxy, and the virtio BE will communicate with virtio-proxy through
a virtqueue (yes, virtio-proxy is seen as yet another virtio device on the
BE) and will get IO event-related *RPC* callbacks, either MMIO read or
write, from virtio-proxy.

See page 8 (protocol flow) and 10 (interfaces) in [1].
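
As a sketch of the framing this implies (the message layout here is
invented for illustration; the real protocol is defined in [1]):

#include <stdint.h>

enum vproxy_op { VPROXY_MMIO_READ = 1, VPROXY_MMIO_WRITE = 2 };

/* One RPC message carried over the virtio-proxy virtqueue. */
struct vproxy_rpc {
    uint32_t op;        /* enum vproxy_op */
    uint32_t size;      /* access width: 1, 2, 4 or 8 bytes */
    uint64_t offset;    /* offset within the virtio-mmio register block */
    uint64_t value;     /* write payload, or the read reply on return */
};

/* BE-side dispatch: the same function body works on any hypervisor,
 * because everything hypervisor-specific happened before this point. */
void be_handle_rpc(struct vproxy_rpc *rpc)
{
    if (rpc->op == VPROXY_MMIO_READ) {
        rpc->value = 0;  /* read the emulated register at rpc->offset */
    } else {
        /* apply rpc->value to the emulated register at rpc->offset */
    }
}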

If kvm's ioregionfd can fit into this protocol, virtio-proxy for kvm
will hopefully be implemented using ioregionfd.

-Takahiro Akashi

[1] https://op-lists.linaro.org/pipermail/stratos-dev/2021-August/000548.html

> >> There is also another problem. IOREQ is probably not the only
> >> interface needed. Have a look at
> >> https://marc.info/?l=xen-devel=162373754705233=2. Don't we also need
> >> an interface for the backend to inject interrupts into the frontend? And
> >> if the backend requires dynamic memory mappings of frontend pages, then
> >> we would also need an interface to map/unmap domU pages.
> >> 
> >> These interfaces are a lot more problematic than IOREQ: IOREQ is tiny
> >> and self-contained. It is easy to add anywhere. A new interface to
> >> inject interrupts or map pages is more difficult to manage because it
> >> would require changes scattered across the various emulators.
> >
> > Something like ioreq is indeed necessary to implement arbitrary devices,
> > but if you are willing to restrict yourself to VIRTIO then other
> > interfaces are possible too because the VIRTIO device model is different
> > from the general purpose x86 PIO/MMIO that Xen's ioreq seems to
> > support.
> 
> It's true our focus is just VirtIO, which does support alternative
> transport options; however, most implementations seem to be targeting
> virtio-mmio for its relative simplicity and understood semantics
> (modulo a desire for MSI to reduce round-trip latency when handling
> signalling).
> 
> >
> > Stefan
> >
> > [[End of PGP Signed Part]]
> 
> 
> -- 
> Alex Bennée



Re: Enabling hypervisor agnosticism for VirtIO backends

2021-09-02 Thread Stefan Hajnoczi
On Wed, Sep 01, 2021 at 01:53:34PM +0100, Alex Bennée wrote:
> 
> Stefan Hajnoczi  writes:
> 
> > [[PGP Signed Part:Undecided]]
> > On Wed, Aug 04, 2021 at 12:20:01PM -0700, Stefano Stabellini wrote:
> >> > Could we consider the kernel internally converting IOREQ messages from
> >> > the Xen hypervisor to eventfd events? Would this scale with other kernel
> >> > hypercall interfaces?
> >> > 
> >> > So any thoughts on what directions are worth experimenting with?
> >>  
> >> One option we should consider is for each backend to connect to Xen via
> >> the IOREQ interface. We could generalize the IOREQ interface and make it
> >> hypervisor agnostic. The interface is really trivial and easy to add.
> >> The only Xen-specific part is the notification mechanism, which is an
> >> event channel. If we replaced the event channel with something else the
> >> interface would be generic. See:
> >> https://gitlab.com/xen-project/xen/-/blob/staging/xen/include/public/hvm/ioreq.h#L52
> >
> > There have been experiments with something kind of similar in KVM
> > recently (see struct ioregionfd_cmd):
> > https://lore.kernel.org/kvm/dad3d025bcf15ece11d9df0ff685e8ab0a4f2edd.1613828727.git.eafanas...@gmail.com/
> 
> Reading the cover letter was very useful in showing how this provides a
> separate channel for signalling IO events to userspace instead of using
> the normal type-2 vmexit type event. I wonder how deeply tied the
> userspace facing side of this is to KVM? Could it provide a common FD
> type interface to IOREQ?

I wondered this too after reading Stefano's link to Xen's ioreq. They
seem to be quite similar. ioregionfd is closer to how PIO/MMIO vmexits
are handled in KVM, while I guess ioreq is closer to how Xen handles
them, but those are small details.

It may be possible to use the ioreq struct instead of ioregionfd in KVM,
but I haven't checked each field.

> As I understand IOREQ this is currently a direct communication between
> userspace and the hypervisor using the existing Xen message bus. My
> worry would be that by adding knowledge of what the underlying
> hypervisor is we'd end up with excess complexity in the kernel. For one
> thing we certainly wouldn't want an API version dependency on the kernel
> to understand which version of the Xen hypervisor it was running on.
> 
> >> There is also another problem. IOREQ is probably not the only
> >> interface needed. Have a look at
> >> https://marc.info/?l=xen-devel=162373754705233=2. Don't we also need
> >> an interface for the backend to inject interrupts into the frontend? And
> >> if the backend requires dynamic memory mappings of frontend pages, then
> >> we would also need an interface to map/unmap domU pages.
> >> 
> >> These interfaces are a lot more problematic than IOREQ: IOREQ is tiny
> >> and self-contained. It is easy to add anywhere. A new interface to
> >> inject interrupts or map pages is more difficult to manage because it
> >> would require changes scattered across the various emulators.
> >
> > Something like ioreq is indeed necessary to implement arbitrary devices,
> > but if you are willing to restrict yourself to VIRTIO then other
> > interfaces are possible too because the VIRTIO device model is different
> > from the general purpose x86 PIO/MMIO that Xen's ioreq seems to
> > support.
> 
> It's true our focus is just VirtIO, which does support alternative
> transport options; however, most implementations seem to be targeting
> virtio-mmio for its relative simplicity and understood semantics
> (modulo a desire for MSI to reduce round-trip latency when handling
> signalling).

Okay.

Stefan




Re: Enabling hypervisor agnosticism for VirtIO backends

2021-09-02 Thread AKASHI Takahiro
>
> * The Argo headers are BSD-licensed and the Xen hypervisor implementation
> is GPLv2 but
>   accessible via the hypercall interface. The licensing should not present
> an obstacle
>   to adoption of Argo in guest software or implementation by other
> hypervisors.
> 
> * Since the interface that Argo presents to a guest VM is similar to DMA, a
> VirtIO-Argo
>   frontend transport driver should be able to operate with a physical
> VirtIO-enabled
>   smart-NIC if the toolstack and an Argo-aware backend provide support.
> 
> The next Xen Community Call is next week and I would be happy to answer
> questions
> about Argo and on this topic. I will also be following this thread.
> 
> Christopher
> (Argo maintainer, Xen Community)
> 
> 
> [1]
> An introduction to Argo:
> https://static.sched.com/hosted_files/xensummit19/92/Argo%20and%20HMX%20-%20OpenXT%20-%20Christopher%20Clark%20-%20Xen%20Summit%202019.pdf
> https://www.youtube.com/watch?v=cnC0Tg3jqJQ
> Xen Wiki page for Argo:
> https://wiki.xenproject.org/wiki/Argo:_Hypervisor-Mediated_Exchange_(HMX)_for_Xen
> 
> [2]
> OpenXT Linux Argo driver and userspace library:
> https://github.com/openxt/linux-xen-argo
> 
> Windows V4V at OpenXT wiki:
> https://openxt.atlassian.net/wiki/spaces/DC/pages/14844007/V4V
> Windows v4v driver source:
> https://github.com/OpenXT/xc-windows/tree/master/xenv4v
> 
> HP/Bromium uXen V4V driver:
> https://github.com/uxen-virt/uxen/tree/ascara/windows/uxenv4vlib
> 
> [3]
> v2 of the Argo test unikernel for XTF:
> https://lists.xenproject.org/archives/html/xen-devel/2021-01/msg02234.html
> 
> [4]
> Argo HMX Transport for VirtIO meeting minutes:
> https://lists.xenproject.org/archives/html/xen-devel/2021-02/msg01422.html
> 
> VirtIO-Argo Development wiki page:
> https://openxt.atlassian.net/wiki/spaces/DC/pages/1696169985/VirtIO-Argo+Development+Phase+1

RE: Enabling hypervisor agnosticism for VirtIO backends

2021-09-01 Thread Wei Chen
Hi Akashi, Oleksandr,


RE: Enabling hypervisor agnosticism for VirtIO backends

2021-09-01 Thread Wei Chen
Hi Akashi,


Re: Enabling hypervisor agnosticism for VirtIO backends

2021-09-01 Thread Oleksandr Tyshchenko
Hi Akashi,

I am sorry for the possible format issues.


>
> > > >
> > > > It's a RFC discussion. We have tested this RFC patch internally.
> > > > https://lists.xenproject.org/archives/html/xen-devel/2021-
> > > 07/msg01532.html
> > >
> > > I'm afraid that I miss something here, but I don't know
> > > why this proposed API will lead to eliminating 'mmap' in accessing
> > > the queued payload at every request?
> > >
> >
> > This API gives the Xen device model (QEMU or kvmtool) the ability to map
> > the whole guest RAM in the device model's address space. In this case,
> > the device model doesn't need a dynamic hypercall to map/unmap payload
> > memory. It can use a flat offset to access payload memory in its address
> > space directly, just like the KVM device model does now.
>
Yes!


>
> Thank you. Quickly, let me make sure one thing:
> This API itself doesn't do any mapping operations, right?


Right. The only purpose of that "API" is to query the hypervisor to find
unallocated address space ranges to map the foreign pages into (instead of
stealing real RAM pages).
In a nutshell, if you try to map the whole guest memory in the backend
address space on Arm (or even cache some mappings) you might end up with
memory exhaustion in the backend domain (XSA-300), and the possibility of
hitting XSA-300 is higher if your backend needs to serve several Guests.
Of course, this depends on the memory assigned to the backend domain and
to the Guest(s) it serves...
We believe that with the proposed solution the backend will be able to
handle Guest(s) without wasting its real RAM. However, please note that
the proposed Xen + Linux changes which are under review now [1] are far
from the final solution and require rework and some prereq work to
operate in a proper and safe way.


>
> So I suppose that the virtio BE guest is responsible for:
> 1) fetching the information about all the memory regions in the FE,
> 2) calling this API to allocate a big chunk of unused space in the BE,
> 3) creating grant/foreign mappings for the FE onto this region(s),
> in the initialization/configuration of emulated virtio devices.
>
> Is this the way this API is expected to be used?
>

Not really: the userspace backend doesn't need to call this API at all. All
the backend calls are still
xenforeignmemory_map()/xenforeignmemory_unmap(), so let's say the "magic" is
done by Linux and Xen internally.
You can take a look at the virtio-disk PoC [2] (the last 4 patches) to better
understand what Wei and I are talking about. There we map the Guest memory
at the beginning and just calculate a pointer at runtime. Again, the code
is not in good shape, but it is enough to demonstrate the feasibility of
the improvement.
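
To make the map-at-init / pointer-at-runtime idea concrete, here is a minimal
sketch against the stable libxenforeignmemory API. The single-RAM-bank
assumption, the bank discovery, and the names backend_map_guest_ram() /
guest_pa_to_va() are illustrative only and are not taken from the PoC:

/* A minimal sketch of the map-once pattern, assuming one contiguous
 * guest RAM bank whose base/size the backend has already learned
 * (point 1 above). */
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <xenforeignmemory.h>

#define PAGE_SHIFT 12

static void *ram;                    /* backend-local view of the bank */
static uint64_t ram_base, ram_size;

int backend_map_guest_ram(xenforeignmemory_handle *fmem, uint32_t domid,
                          uint64_t base, size_t pages)
{
    xen_pfn_t *pfns = calloc(pages, sizeof(*pfns));
    if (!pfns)
        return -1;
    for (size_t i = 0; i < pages; i++)
        pfns[i] = (base >> PAGE_SHIFT) + i;

    /* One bulk foreign mapping at init instead of map/unmap per request. */
    ram = xenforeignmemory_map(fmem, domid, PROT_READ | PROT_WRITE,
                               pages, pfns, NULL);
    free(pfns);
    if (!ram)
        return -1;

    ram_base = base;
    ram_size = (uint64_t)pages << PAGE_SHIFT;
    return 0;
}

/* Runtime translation becomes pure arithmetic: no hypercall on the
 * data path when fetching a queued payload. */
static inline void *guest_pa_to_va(uint64_t gpa)
{
    if (gpa < ram_base || gpa >= ram_base + ram_size)
        return NULL;
    return (uint8_t *)ram + (gpa - ram_base);
}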



> Does Xen already have an interface for (1)?
>

I am not aware of anything existing. For the PoC I guessed the Guest memory
layout in a really hackish way (I got the total Guest memory size and, with
GUEST_RAMX_BASE/GUEST_RAMX_SIZE in hand, just performed the calculation).
Definitely, it is a no-go, so 1) deserves additional discussion/design.

[1]
https://lore.kernel.org/xen-devel/1627489110-25633-1-git-send-email-olekst...@gmail.com/
https://lore.kernel.org/lkml/1627490656-1267-1-git-send-email-olekst...@gmail.com/
https://lore.kernel.org/lkml/1627490656-1267-2-git-send-email-olekst...@gmail.com/
[2]
https://github.com/otyshchenko1/virtio-disk/commits/map_opt_next
-- 
Regards,

Oleksandr Tyshchenko


Re: Enabling hypervisor agnosticism for VirtIO backends

2021-09-01 Thread Alex Bennée


Stefan Hajnoczi  writes:

> On Wed, Aug 04, 2021 at 12:20:01PM -0700, Stefano Stabellini wrote:
>> > Could we consider the kernel internally converting IOREQ messages from
>> > the Xen hypervisor to eventfd events? Would this scale with other kernel
>> > hypercall interfaces?
>> > 
>> > So any thoughts on what directions are worth experimenting with?
>>  
>> One option we should consider is for each backend to connect to Xen via
>> the IOREQ interface. We could generalize the IOREQ interface and make it
>> hypervisor agnostic. The interface is really trivial and easy to add.
>> The only Xen-specific part is the notification mechanism, which is an
>> event channel. If we replaced the event channel with something else the
>> interface would be generic. See:
>> https://gitlab.com/xen-project/xen/-/blob/staging/xen/include/public/hvm/ioreq.h#L52
>
> There have been experiments with something kind of similar in KVM
> recently (see struct ioregionfd_cmd):
> https://lore.kernel.org/kvm/dad3d025bcf15ece11d9df0ff685e8ab0a4f2edd.1613828727.git.eafanas...@gmail.com/

Reading the cover letter was very useful in showing how this provides a
separate channel for signalling IO events to userspace instead of using
the normal type-2 vmexit event. I wonder how deeply tied the
userspace-facing side of this is to KVM? Could it provide a common FD
type interface to IOREQ?
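
As a thought experiment, a hedged sketch of what such a common FD interface
could look like from the backend's side: the backend would not care whether
KVM (ioregionfd) or a Xen IOREQ shim sits behind the descriptor. Note that
struct io_request/io_reply and the blocking read/write protocol are invented
here for illustration and match neither the real ioregionfd_cmd layout nor
Xen's struct ioreq:

#include <stdint.h>
#include <unistd.h>

struct io_request {
    uint64_t addr;       /* guest-physical address of the access */
    uint64_t data;       /* payload for writes */
    uint32_t size;       /* access width in bytes */
    uint8_t  is_write;   /* 1 = write, 0 = read */
};

struct io_reply {
    uint64_t data;       /* value returned for reads */
};

/* Device-model hooks supplied elsewhere by the backend. */
extern void handle_write(uint64_t addr, uint64_t data, uint32_t size);
extern uint64_t handle_read(uint64_t addr, uint32_t size);

void serve(int fd)
{
    struct io_request req;

    /* Each read() blocks until the hypervisor traps a guest access
     * to a region this backend registered for. */
    while (read(fd, &req, sizeof(req)) == (ssize_t)sizeof(req)) {
        struct io_reply rep = { 0 };

        if (req.is_write)
            handle_write(req.addr, req.data, req.size);
        else
            rep.data = handle_read(req.addr, req.size);

        /* Writing the completion lets the trapped vCPU resume. */
        write(fd, &rep, sizeof(rep));
    }
}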

As I understand IOREQ, this is currently a direct communication between
userspace and the hypervisor using the existing Xen message bus. My
worry would be that by adding knowledge of what the underlying
hypervisor is, we'd end up with excess complexity in the kernel. For one
thing, we certainly wouldn't want an API version dependency on the kernel
to understand which version of the Xen hypervisor it was running on.

>> There is also another problem. IOREQ is probably not the only
>> interface needed. Have a look at
>> https://marc.info/?l=xen-devel&m=162373754705233&w=2. Don't we also need
>> an interface for the backend to inject interrupts into the frontend? And
>> if the backend requires dynamic memory mappings of frontend pages, then
>> we would also need an interface to map/unmap domU pages.
>> 
>> These interfaces are a lot more problematic than IOREQ: IOREQ is tiny
>> and self-contained. It is easy to add anywhere. A new interface to
>> inject interrupts or map pages is more difficult to manage because it
>> would require changes scattered across the various emulators.
>
> Something like ioreq is indeed necessary to implement arbitrary devices,
> but if you are willing to restrict yourself to VIRTIO then other
> interfaces are possible too because the VIRTIO device model is different
> from the general purpose x86 PIO/MMIO that Xen's ioreq seems to
> support.

It's true our focus is just VirtIO, which does support alternative
transport options; however, most implementations seem to be targeting
virtio-mmio for its relative simplicity and understood semantics
(modulo a desire for MSI to reduce round-trip latency when handling
signalling).

>
> Stefan


-- 
Alex Bennée



Re: Enabling hypervisor agnosticism for VirtIO backends

2021-09-01 Thread AKASHI Takahiro
Hi Wei,

On Wed, Sep 01, 2021 at 11:12:58AM +0000, Wei Chen wrote:
> Hi Akashi,
> 

RE: Enabling hypervisor agnosticism for VirtIO backends

2021-09-01 Thread Wei Chen
Hi Akashi,

> On Tue, Aug 31, 2021, AKASHI Takahiro wrote:

Re: Enabling hypervisor agnosticism for VirtIO backends

2021-08-31 Thread AKASHI Takahiro
Wei,

On Thu, Aug 26, 2021 at 12:10:19PM +0000, Wei Chen wrote:
> Hi Akashi,
> 
> > Hi Wei,
> >
> > On Fri, Aug 20, 2021 at 03:41:50PM +0900, AKASHI Takahiro wrote:
> > > On Wed, Aug 18, 2021 at 08:35:51AM +0000, Wei Chen wrote:
> > > > Hi Akashi,
> > > >
> > > > > On Tue, Aug 17, 2021 at 08:39:09AM +0000, Wei Chen wrote:
> > > > > > Hi Akashi,
> > > > > >
> > > > > > > Hi Wei, Oleksandr,
> > > > > > >
> > > > > > > On Mon, Aug 16, 2021 at 10:04:03AM +0000, Wei Chen wrote:
> > > > > > > > Hi All,
> > > > > > > >
> > > > > > > > Thanks to Stefano for linking my kvmtool for Xen proposal here.
> > > > > > > > This proposal is still being discussed in the Xen and KVM communities.
> > > > > > > > The main work is to decouple kvmtool from KVM so that
> > > > > > > > other hypervisors can reuse the virtual device implementations.
> > > > > > > >
> > > > > > > > In this case, we need to introduce an intermediate hypervisor
> > > > > > > > layer for VMM abstraction, which is, I think, very close
> > > > > > > > to Stratos' virtio hypervisor agnosticism work.
> > > > > > >
> > > > > > > # My proposal[1] comes from my own idea and doesn't always represent
> > > > > > > # Linaro's view

Re: Enabling hypervisor agnosticism for VirtIO backends

2021-08-30 Thread Christopher Clark
[2]
uXen v4v driver:
https://github.com/uxen-virt/uxen/tree/ascara/windows/uxenv4vlib

[3]
v2 of the Argo test unikernel for XTF:
https://lists.xenproject.org/archives/html/xen-devel/2021-01/msg02234.html

[4]
Argo HMX Transport for VirtIO meeting minutes:
https://lists.xenproject.org/archives/html/xen-devel/2021-02/msg01422.html

VirtIO-Argo Development wiki page:
https://openxt.atlassian.net/wiki/spaces/DC/pages/1696169985/VirtIO-Argo+Development+Phase+1


> On Thu, Aug 26, 2021 at 5:11 AM Wei Chen wrote:
>
>> Hi Akashi,

Re: Enabling hypervisor agnosticism for VirtIO backends

2021-08-30 Thread Christopher Clark
[3]
v2 of the Argo test unikernel for XTF:
https://lists.xenproject.org/archives/html/xen-devel/2021-01/msg02234.html

[4]
Argo HMX Transport for VirtIO meeting minutes:
https://lists.xenproject.org/archives/html/xen-devel/2021-02/msg01422.html

VirtIO-Argo Development wiki page:
https://openxt.atlassian.net/wiki/spaces/DC/pages/1696169985/VirtIO-Argo+Development+Phase+1


On Thu, Aug 26, 2021 at 5:11 AM Wei Chen wrote:

> Hi Akashi,
>

RE: Enabling hypervisor agnosticism for VirtIO backends

2021-08-26 Thread Wei Chen
Hi Akashi,

> On Thu, Aug 26, 2021, AKASHI Takahiro wrote:
>
> Hi Wei,
>
> On Fri, Aug 20, 2021 at 03:41:50PM +0900, AKASHI Takahiro wrote:
> > On Wed, Aug 18, 2021 at 08:35:51AM +0000, Wei Chen wrote:
> > > Hi Akashi,
> > >
> > > > On Tue, Aug 17, 2021 at 08:39:09AM +0000, Wei Chen wrote:
> > > > > Hi Akashi,
> > > > >
> > > > > > Hi Wei, Oleksandr,
> > > > > >
> > > > > > On Mon, Aug 16, 2021 at 10:04:03AM +0000, Wei Chen wrote:
> > > > > > > Hi All,
> > > > > > >
> > > > > > > Thanks to Stefano for linking my kvmtool for Xen proposal here.
> > > > > > > This proposal is still being discussed in the Xen and KVM communities.
> > > > > > > The main work is to decouple kvmtool from KVM so that
> > > > > > > other hypervisors can reuse the virtual device implementations.
> > > > > > >
> > > > > > > In this case, we need to introduce an intermediate hypervisor
> > > > > > > layer for VMM abstraction, which is, I think, very close
> > > > > > > to Stratos' virtio hypervisor agnosticism work.
> > > > > >
> > > > > > # My proposal[1] comes from my own idea and doesn't always represent
> > > > > > # Linaro's view on this subject nor reflect Alex's concerns. Nevertheless,
> > > > > >
> > > > > > Your idea and my proposal seem to share the same background.
> > > > > > Both have a similar goal and currently start with, at first, Xen,
> > > > > > and are based on kvm-tool. (Actually, my work is derived from
> > > > > > EPAM's virtio-disk, which is also based on kvm-tool.)
> > > > > >
> > > > > > In particular, the abstractio

Re: Enabling hypervisor agnosticism for VirtIO backends

2021-08-26 Thread AKASHI Takahiro
Hi Wei,

On Fri, Aug 20, 2021 at 03:41:50PM +0900, AKASHI Takahiro wrote:
> On Wed, Aug 18, 2021 at 08:35:51AM +0000, Wei Chen wrote:
> > Hi Akashi,
> > 
> > > On Tue, Aug 17, 2021 at 08:39:09AM +0000, Wei Chen wrote:
> > > > Hi Akashi,
> > > >
> > > > > Hi Wei, Oleksandr,
> > > > >
> > > > > On Mon, Aug 16, 2021 at 10:04:03AM +0000, Wei Chen wrote:
> > > > > > Hi All,
> > > > > >
> > > > > > Thanks to Stefano for linking my kvmtool for Xen proposal here.
> > > > > > This proposal is still being discussed in the Xen and KVM communities.
> > > > > > The main work is to decouple kvmtool from KVM so that
> > > > > > other hypervisors can reuse the virtual device implementations.
> > > > > >
> > > > > > In this case, we need to introduce an intermediate hypervisor
> > > > > > layer for VMM abstraction, which is, I think, very close
> > > > > > to Stratos' virtio hypervisor agnosticism work.
> > > > >
> > > > > # My proposal[1] comes from my own idea and doesn't always represent
> > > > > # Linaro's view on this subject nor reflect Alex's concerns. Nevertheless,
> > > > >
> > > > > Your idea and my proposal seem to share the same background.
> > > > > Both have a similar goal and currently start with, at first, Xen,
> > > > > and are based on kvm-tool. (Actually, my work is derived from
> > > > > EPAM's virtio-disk, which is also based on kvm-tool.)
> > > > >
> > > > > In particular, the abstraction of hypervisor interfaces has the same
> > > > > set of interfaces (for your "struct vmm_impl" and my "RPC interfaces").
> > > > > This is no coincidence, as we both share the same origin, as I said
> > > > > above.
> > > > > And so we will also share the same issues. One of them is a way of
> > > > > "sharing/mapping FE's memory". There is some trade-off between
> > > > > the portability and the performance impact.
> > > > > So we can discuss the topic here in this ML, too.
> > > > > (See Alex's original email, too.)
> > > > >
> > > > Yes, I agree.
> > > >
> > > > > On the other hand, my approach aims to create a "single-binary"
> > > > > solution in which the same binary of the BE VM could run on any
> > > > > hypervisor. Somehow similar to your "proposal-#2" in [2], but in
> > > > > my solution, all the hypervisor-specific code would be 

Re: Enabling hypervisor agnosticism for VirtIO backends

2021-08-25 Thread Stefan Hajnoczi
On Wed, Aug 25, 2021 at 07:29:45PM +0900, AKASHI Takahiro wrote:
> On Mon, Aug 23, 2021 at 10:58:46AM +0100, Stefan Hajnoczi wrote:
> > On Mon, Aug 23, 2021 at 03:25:00PM +0900, AKASHI Takahiro wrote:
> > > Hi Stefan,
> > > 
> > > On Tue, Aug 17, 2021 at 11:41:01AM +0100, Stefan Hajnoczi wrote:
> > > > On Wed, Aug 04, 2021 at 12:20:01PM -0700, Stefano Stabellini wrote:
> > > > > > Could we consider the kernel internally converting IOREQ messages 
> > > > > > from
> > > > > > the Xen hypervisor to eventfd events? Would this scale with other 
> > > > > > kernel
> > > > > > hypercall interfaces?
> > > > > > 
> > > > > > So any thoughts on what directions are worth experimenting with?
> > > > >  
> > > > > One option we should consider is for each backend to connect to Xen 
> > > > > via
> > > > > the IOREQ interface. We could generalize the IOREQ interface and make 
> > > > > it
> > > > > hypervisor agnostic. The interface is really trivial and easy to add.
> > > > > The only Xen-specific part is the notification mechanism, which is an
> > > > > event channel. If we replaced the event channel with something else 
> > > > > the
> > > > > interface would be generic. See:
> > > > > https://gitlab.com/xen-project/xen/-/blob/staging/xen/include/public/hvm/ioreq.h#L52
> > > > 
> > > > There have been experiments with something kind of similar in KVM
> > > > recently (see struct ioregionfd_cmd):
> > > > https://lore.kernel.org/kvm/dad3d025bcf15ece11d9df0ff685e8ab0a4f2edd.1613828727.git.eafanas...@gmail.com/
> > > 
> > > Do you know the current status of Elena's work?
> > > It was last February that she posted her latest patch
> > > and it has not been merged upstream yet.
> > 
> > Elena worked on this during her Outreachy internship. At the moment no
> > one is actively working on the patches.
> 
> Does Red Hat plan to take over or follow up on her work hereafter?
> # I'm simply asking out of curiosity.

At the moment I'm not aware of anyone from Red Hat working on it. If
someone decides they need this KVM API then that could change.

> > > > > There is also another problem. IOREQ is probably not the only
> > > > > interface needed. Have a look at
> > > > > https://marc.info/?l=xen-devel&m=162373754705233&w=2. Don't we also
> > > > > need
> > > > > an interface for the backend to inject interrupts into the frontend? 
> > > > > And
> > > > > if the backend requires dynamic memory mappings of frontend pages, 
> > > > > then
> > > > > we would also need an interface to map/unmap domU pages.
> > > > > 
> > > > > These interfaces are a lot more problematic than IOREQ: IOREQ is tiny
> > > > > and self-contained. It is easy to add anywhere. A new interface to
> > > > > inject interrupts or map pages is more difficult to manage because it
> > > > > would require changes scattered across the various emulators.
> > > > 
> > > > Something like ioreq is indeed necessary to implement arbitrary devices,
> > > > but if you are willing to restrict yourself to VIRTIO then other
> > > > interfaces are possible too because the VIRTIO device model is different
> > > > from the general purpose x86 PIO/MMIO that Xen's ioreq seems to support.
> > > 
> > > Can you please elaborate your thoughts a bit more here?
> > > 
> > > It seems to me that trapping MMIOs to the configuration space and
> > > forwarding those events to the BE (or device emulation) is quite a
> > > straightforward way to emulate device MMIOs.
> > > Or are you thinking of something like the protocols used in vhost-user?
> > > 
> > > # On the contrary, virtio-ivshmem only requires a driver to explicitly
> > > # forward a "write" request of MMIO accesses to BE. But I don't think
> > > # it's your point. 
> > 
> > See my first reply to this email thread about alternative interfaces for
> > VIRTIO device emulation. The main thing to note was that although the
> > shared memory vring is used by VIRTIO transports today, the device model
> > actually allows transports to implement virtqueues differently (e.g.
> > making it possible to create a VIRTIO over TCP transport without shared
> > memory in the future).
> 
> Do you have any example of such use cases or systems?

This aspect of VIRTIO isn't being exploited today AFAIK. But the
layering to allow other virtqueue implementations is there. For example,
Linux's virtqueue API is independent of struct vring, so existing
drivers generally aren't tied to vrings.
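
To illustrate that layering, here is a hypothetical sketch of a
transport-independent virtqueue interface in C. The names are made up for
this example; in Linux, virtqueue_add_sgs()/virtqueue_kick()/
virtqueue_get_buf() play the analogous role, which is why drivers aren't
tied to struct vring:

#include <stddef.h>
#include <stdbool.h>

struct virtqueue;                    /* opaque to drivers */

struct scatterlist { void *buf; size_t len; };

/* Hypothetical transport-independent virtqueue operations. */
struct virtqueue_ops {
    /* Expose a buffer chain (out = readable, in = writable) to the device. */
    int   (*add_buf)(struct virtqueue *vq, struct scatterlist *sg,
                     unsigned int out, unsigned int in, void *cookie);
    /* Tell the device that new buffers are available. */
    bool  (*kick)(struct virtqueue *vq);
    /* Reap a completed chain; returns the cookie passed to add_buf(). */
    void *(*get_buf)(struct virtqueue *vq, unsigned int *len);
};

/* One implementation backs these with a split vring in shared memory;
 * another could issue a hypercall per operation, or even marshal them
 * over TCP -- the driver side stays unchanged. */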

> > It's possible to define a hypercall interface as a new VIRTIO transport
> > that provides higher-level virtqueue operations. Doing this is more work
> > than using vrings though since existing guest driver and device
> > emulation code already supports vrings.
> 
> Personally, I'm open to discussing your point, but
> 
> > I don't know the requirements of Stratos so I can't say if creating a
> > new hypervisor-independent interface (VIRTIO transport) that doesn't
> > rely on shared memory vrings makes sense. I just wanted to raise the
> > idea in case you find that VIRTIO's vrings don't 

Re: Enabling hypervisor agnosticism for VirtIO backends

2021-08-25 Thread AKASHI Takahiro
Hi Stefan,

On Mon, Aug 23, 2021 at 10:58:46AM +0100, Stefan Hajnoczi wrote:
> On Mon, Aug 23, 2021 at 03:25:00PM +0900, AKASHI Takahiro wrote:
> > Hi Stefan,
> > 
> > On Tue, Aug 17, 2021 at 11:41:01AM +0100, Stefan Hajnoczi wrote:
> > > On Wed, Aug 04, 2021 at 12:20:01PM -0700, Stefano Stabellini wrote:
> > > > > Could we consider the kernel internally converting IOREQ messages from
> > > > > the Xen hypervisor to eventfd events? Would this scale with other 
> > > > > kernel
> > > > > hypercall interfaces?
> > > > > 
> > > > > So any thoughts on what directions are worth experimenting with?
> > > >  
> > > > One option we should consider is for each backend to connect to Xen via
> > > > the IOREQ interface. We could generalize the IOREQ interface and make it
> > > > hypervisor agnostic. The interface is really trivial and easy to add.
> > > > The only Xen-specific part is the notification mechanism, which is an
> > > > event channel. If we replaced the event channel with something else the
> > > > interface would be generic. See:
> > > > https://gitlab.com/xen-project/xen/-/blob/staging/xen/include/public/hvm/ioreq.h#L52
> > > 
> > > There have been experiments with something kind of similar in KVM
> > > recently (see struct ioregionfd_cmd):
> > > https://lore.kernel.org/kvm/dad3d025bcf15ece11d9df0ff685e8ab0a4f2edd.1613828727.git.eafanas...@gmail.com/
> > 
> > Do you know the current status of Elena's work?
> > It was last February that she posted her latest patch
> > and it has not been merged upstream yet.
> 
> Elena worked on this during her Outreachy internship. At the moment no
> one is actively working on the patches.

Does Red Hat plan to take over or follow up on her work hereafter?
# I'm simply asking out of curiosity.

> > > > There is also another problem. IOREQ is probably not the only
> > > > interface needed. Have a look at
> > > > https://marc.info/?l=xen-devel&m=162373754705233&w=2. Don't we also need
> > > > an interface for the backend to inject interrupts into the frontend? And
> > > > if the backend requires dynamic memory mappings of frontend pages, then
> > > > we would also need an interface to map/unmap domU pages.
> > > > 
> > > > These interfaces are a lot more problematic than IOREQ: IOREQ is tiny
> > > > and self-contained. It is easy to add anywhere. A new interface to
> > > > inject interrupts or map pages is more difficult to manage because it
> > > > would require changes scattered across the various emulators.
> > > 
> > > Something like ioreq is indeed necessary to implement arbitrary devices,
> > > but if you are willing to restrict yourself to VIRTIO then other
> > > interfaces are possible too because the VIRTIO device model is different
> > > from the general purpose x86 PIO/MMIO that Xen's ioreq seems to support.
> > 
> > Can you please elaborate your thoughts a bit more here?
> > 
> > It seems to me that trapping MMIOs to the configuration space and
> > forwarding those events to the BE (or device emulation) is quite a
> > straightforward way to emulate device MMIOs.
> > Or are you thinking of something like the protocols used in vhost-user?
> > 
> > # On the contrary, virtio-ivshmem only requires a driver to explicitly
> > # forward a "write" request of MMIO accesses to BE. But I don't think
> > # it's your point. 
> 
> See my first reply to this email thread about alternative interfaces for
> VIRTIO device emulation. The main thing to note was that although the
> shared memory vring is used by VIRTIO transports today, the device model
> actually allows transports to implement virtqueues differently (e.g.
> making it possible to create a VIRTIO over TCP transport without shared
> memory in the future).

Do you have any example of such use cases or systems?

> It's possible to define a hypercall interface as a new VIRTIO transport
> that provides higher-level virtqueue operations. Doing this is more work
> than using vrings though since existing guest driver and device
> emulation code already supports vrings.

Personally, I'm open to discussing your point, but

> I don't know the requirements of Stratos so I can't say if creating a
> new hypervisor-independent interface (VIRTIO transport) that doesn't
> rely on shared memory vrings makes sense. I just wanted to raise the
> idea in case you find that VIRTIO's vrings don't meet your requirements.

While I cannot represent the project's view, here is what the JIRA
task assigned to me describes:
  Deliverables
  * Low level library allowing:
    * management of virtio rings and buffers
      [and so on]
So supporting the shared memory-based vring is one of our assumptions.

In my understanding, the goal of the Stratos project is that we would
have several VMs consolidated onto a SoC, yet sharing most of the
physical IPs, where shared memory should be, I assume, the most
efficient transport for virtio.
One of the target applications would be automotive, I guess.

Alex and Mike should have more to say here.

-Takahiro Akashi

> Stefan





Re: Enabling hypervisor agnosticism for VirtIO backends

2021-08-23 Thread Stefan Hajnoczi
On Mon, Aug 23, 2021 at 03:25:00PM +0900, AKASHI Takahiro wrote:
> Hi Stefan,
> 
> On Tue, Aug 17, 2021 at 11:41:01AM +0100, Stefan Hajnoczi wrote:
> > On Wed, Aug 04, 2021 at 12:20:01PM -0700, Stefano Stabellini wrote:
> > > > Could we consider the kernel internally converting IOREQ messages from
> > > > the Xen hypervisor to eventfd events? Would this scale with other kernel
> > > > hypercall interfaces?
> > > > 
> > > > So any thoughts on what directions are worth experimenting with?
> > >  
> > > One option we should consider is for each backend to connect to Xen via
> > > the IOREQ interface. We could generalize the IOREQ interface and make it
> > > hypervisor agnostic. The interface is really trivial and easy to add.
> > > The only Xen-specific part is the notification mechanism, which is an
> > > event channel. If we replaced the event channel with something else the
> > > interface would be generic. See:
> > > https://gitlab.com/xen-project/xen/-/blob/staging/xen/include/public/hvm/ioreq.h#L52
> > 
> > There have been experiments with something kind of similar in KVM
> > recently (see struct ioregionfd_cmd):
> > https://lore.kernel.org/kvm/dad3d025bcf15ece11d9df0ff685e8ab0a4f2edd.1613828727.git.eafanas...@gmail.com/
> 
> Do you know the current status of Elena's work?
> It was last February that she posted her latest patch
> and it has not been merged upstream yet.

Elena worked on this during her Outreachy internship. At the moment no
one is actively working on the patches.

> > > There is also another problem. IOREQ is probably not the only
> > > interface needed. Have a look at
> > > https://marc.info/?l=xen-devel&m=162373754705233&w=2. Don't we also need
> > > an interface for the backend to inject interrupts into the frontend? And
> > > if the backend requires dynamic memory mappings of frontend pages, then
> > > we would also need an interface to map/unmap domU pages.
> > > 
> > > These interfaces are a lot more problematic than IOREQ: IOREQ is tiny
> > > and self-contained. It is easy to add anywhere. A new interface to
> > > inject interrupts or map pages is more difficult to manage because it
> > > would require changes scattered across the various emulators.
> > 
> > Something like ioreq is indeed necessary to implement arbitrary devices,
> > but if you are willing to restrict yourself to VIRTIO then other
> > interfaces are possible too because the VIRTIO device model is different
> > from the general purpose x86 PIO/MMIO that Xen's ioreq seems to support.
> 
> Can you please elaborate your thoughts a bit more here?
> 
> It seems to me that trapping MMIOs to the configuration space and
> forwarding those events to the BE (or device emulation) is quite a
> straightforward way to emulate device MMIOs.
> Or are you thinking of something like the protocols used in vhost-user?
> 
> # On the contrary, virtio-ivshmem only requires a driver to explicitly
> # forward a "write" request of MMIO accesses to BE. But I don't think
> # it's your point. 

See my first reply to this email thread about alternative interfaces for
VIRTIO device emulation. The main thing to note was that although the
shared memory vring is used by VIRTIO transports today, the device model
actually allows transports to implement virtqueues differently (e.g.
making it possible to create a VIRTIO over TCP transport without shared
memory in the future).

It's possible to define a hypercall interface as a new VIRTIO transport
that provides higher-level virtqueue operations. Doing this is more work
than using vrings though since existing guest driver and device
emulation code already supports vrings.

I don't know the requirements of Stratos so I can't say if creating a
new hypervisor-independent interface (VIRTIO transport) that doesn't
rely on shared memory vrings makes sense. I just wanted to raise the
idea in case you find that VIRTIO's vrings don't meet your requirements.

Stefan




Re: Enabling hypervisor agnosticism for VirtIO backends

2021-08-23 Thread AKASHI Takahiro
Hi Stefan,

On Tue, Aug 17, 2021 at 11:41:01AM +0100, Stefan Hajnoczi wrote:
> On Wed, Aug 04, 2021 at 12:20:01PM -0700, Stefano Stabellini wrote:
> > > Could we consider the kernel internally converting IOREQ messages from
> > > the Xen hypervisor to eventfd events? Would this scale with other kernel
> > > hypercall interfaces?
> > > 
> > > So any thoughts on what directions are worth experimenting with?
> >  
> > One option we should consider is for each backend to connect to Xen via
> > the IOREQ interface. We could generalize the IOREQ interface and make it
> > hypervisor agnostic. The interface is really trivial and easy to add.
> > The only Xen-specific part is the notification mechanism, which is an
> > event channel. If we replaced the event channel with something else the
> > interface would be generic. See:
> > https://gitlab.com/xen-project/xen/-/blob/staging/xen/include/public/hvm/ioreq.h#L52
> 
> There have been experiments with something kind of similar in KVM
> recently (see struct ioregionfd_cmd):
> https://lore.kernel.org/kvm/dad3d025bcf15ece11d9df0ff685e8ab0a4f2edd.1613828727.git.eafanas...@gmail.com/

Do you know the current status of Elena's work?
It was last February that she posted her latest patch
and it has not been merged upstream yet.

> > There is also another problem. IOREQ is probably not the only
> > interface needed. Have a look at
> > https://marc.info/?l=xen-devel&m=162373754705233&w=2. Don't we also need
> > an interface for the backend to inject interrupts into the frontend? And
> > if the backend requires dynamic memory mappings of frontend pages, then
> > we would also need an interface to map/unmap domU pages.
> > 
> > These interfaces are a lot more problematic than IOREQ: IOREQ is tiny
> > and self-contained. It is easy to add anywhere. A new interface to
> > inject interrupts or map pages is more difficult to manage because it
> > would require changes scattered across the various emulators.
> 
> Something like ioreq is indeed necessary to implement arbitrary devices,
> but if you are willing to restrict yourself to VIRTIO then other
> interfaces are possible too because the VIRTIO device model is different
> from the general purpose x86 PIO/MMIO that Xen's ioreq seems to support.

Can you please elaborate your thoughts a bit more here?

It seems to me that trapping MMIOs to the configuration space and
forwarding those events to the BE (or device emulation) is quite a
straightforward way to emulate device MMIOs.
Or are you thinking of something like the protocols used in vhost-user?

# On the contrary, virtio-ivshmem only requires a driver to explicitly
# forward a "write" request of MMIO accesses to BE. But I don't think
# it's your point. 
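
For reference, a minimal sketch of that trap-and-forward model on the BE
side for virtio-mmio. The register offsets and magic value come from the
virtio-mmio specification; the dispatch function and device struct around
them are invented for illustration:

#include <stdint.h>

/* virtio-mmio register offsets per the VIRTIO specification. */
#define VIRTIO_MMIO_MAGIC_VALUE   0x000   /* reads as 0x74726976 ("virt") */
#define VIRTIO_MMIO_VERSION       0x004
#define VIRTIO_MMIO_DEVICE_ID     0x008
#define VIRTIO_MMIO_QUEUE_NOTIFY  0x050

struct virtio_dev {
    uint32_t device_id;
    void (*process_queue)(struct virtio_dev *dev, uint32_t queue);
};

/* Called once per trapped guest access that the hypervisor (IOREQ or
 * similar) forwarded to the BE.  Only a fragment of the register map
 * is shown; a real device also handles status, features, queue setup. */
uint64_t virtio_mmio_access(struct virtio_dev *dev, uint64_t offset,
                            uint64_t wdata, int is_write)
{
    if (!is_write) {
        switch (offset) {
        case VIRTIO_MMIO_MAGIC_VALUE:  return 0x74726976;
        case VIRTIO_MMIO_VERSION:      return 2;
        case VIRTIO_MMIO_DEVICE_ID:    return dev->device_id;
        }
        return 0;
    }
    /* A write to QUEUE_NOTIFY is the FE's "kick": go process the vring. */
    if (offset == VIRTIO_MMIO_QUEUE_NOTIFY)
        dev->process_queue(dev, (uint32_t)wdata);
    return 0;
}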

-Takahiro Akashi

> Stefan





Re: Enabling hypervisor agnosticism for VirtIO backends

2021-08-20 Thread AKASHI Takahiro
On Wed, Aug 18, 2021 at 08:35:51AM +0000, Wei Chen wrote:
> Hi Akashi,
> 
> > On Tue, Aug 17, 2021 at 08:39:09AM +0000, Wei Chen wrote:
> > > Hi Akashi,
> > >
> > > > Hi Wei, Oleksandr,
> > > >
> > > > On Mon, Aug 16, 2021 at 10:04:03AM +0000, Wei Chen wrote:
> > > > > Hi All,
> > > > >
> > > > > Thanks to Stefano for linking my kvmtool for Xen proposal here.
> > > > > This proposal is still being discussed in the Xen and KVM communities.
> > > > > The main work is to decouple kvmtool from KVM so that
> > > > > other hypervisors can reuse the virtual device implementations.
> > > > >
> > > > > In this case, we need to introduce an intermediate hypervisor
> > > > > layer for VMM abstraction, which is, I think, very close
> > > > > to Stratos' virtio hypervisor agnosticism work.
> > > >
> > > > # My proposal[1] comes from my own idea and doesn't always represent
> > > > # Linaro's view on this subject nor reflect Alex's concerns. Nevertheless,
> > > >
> > > > Your idea and my proposal seem to share the same background.
> > > > Both have a similar goal and currently start with, at first, Xen,
> > > > and are based on kvm-tool. (Actually, my work is derived from
> > > > EPAM's virtio-disk, which is also based on kvm-tool.)
> > > >
> > > > In particular, the abstraction of hypervisor interfaces has the same
> > > > set of interfaces (for your "struct vmm_impl" and my "RPC interfaces").
> > > > This is no coincidence, as we both share the same origin, as I said
> > > > above.
> > > > And so we will also share the same issues. One of them is a way of
> > > > "sharing/mapping FE's memory". There is some trade-off between
> > > > the portability and the performance impact.
> > > > So we can discuss the topic here in this ML, too.
> > > > (See Alex's original email, too.)
> > > >
> > > Yes, I agree.
> > >
> > > > On the other hand, my approach aims to create a "single-binary"
> > > > solution in which the same binary of the BE VM could run on any
> > > > hypervisor. Somehow similar to your "proposal-#2" in [2], but in my
> > > > solution, all the hypervisor-specific code would be put into another
> > > > entity (VM), named "virtio-proxy", and the abstracted operations are
> > > > served via RPC. (In this sense, the BE is hypervisor-agnostic but
> > > > might have an OS dependency.)
> > > > But I know that we need to discuss whether this is a requirement
> > > > even in the Stratos project or not. (Maybe not.)
> > > >
> > > Sorry, I haven't had time to finish reading your virtio-proxy completely
> > > (I will do it ASAP). But from

RE: Enabling hypervisor agnosticism for VirtIO backends

2021-08-18 Thread Wei Chen
Hi Akashi,

> On Wed, Aug 18, 2021, AKASHI Takahiro wrote:
>
> On Tue, Aug 17, 2021 at 08:39:09AM +0000, Wei Chen wrote:
> > Hi Akashi,
> >
> > > Hi Wei, Oleksandr,
> > >
> > > On Mon, Aug 16, 2021 at 10:04:03AM +0000, Wei Chen wrote:
> > > > Hi All,
> > > >
> > > > Thanks to Stefano for linking my kvmtool for Xen proposal here.
> > > > This proposal is still being discussed in the Xen and KVM communities.
> > > > The main work is to decouple kvmtool from KVM so that
> > > > other hypervisors can reuse the virtual device implementations.
> > > >
> > > > In this case, we need to introduce an intermediate hypervisor
> > > > layer for VMM abstraction, which is, I think, very close
> > > > to Stratos' virtio hypervisor agnosticism work.
> > >
> > > # My proposal[1] comes from my own idea and doesn't always represent
> > > # Linaro's view on this subject nor reflect Alex's concerns. Nevertheless,
> > >
> > > Your idea and my proposal seem to share the same background.
> > > Both have a similar goal and currently start with, at first, Xen,
> > > and are based on kvm-tool. (Actually, my work is derived from
> > > EPAM's virtio-disk, which is also based on kvm-tool.)
> > >
> > > In particular, the abstraction of hypervisor interfaces has the same
> > > set of interfaces (for your "struct vmm_impl" and my "RPC interfaces").
> > > This is no coincidence, as we both share the same origin, as I said
> > > above.
> > > And so we will also share the same issues. One of them is a way of
> > > "sharing/mapping FE's memory". There is some trade-off between
> > > the portability and the performance impact.
> > > So we can discuss the topic here in this ML, too.
> > > (See Alex's original email, too.)
> > >
> > Yes, I agree.
> >
> > > On the other hand, my approach aims to create a "single-binary"
> > > solution in which the same binary of the BE VM could run on any
> > > hypervisor. Somehow similar to your "proposal-#2" in [2], but in my
> > > solution, all the hypervisor-specific code would be put into another
> > > entity (VM), named "virtio-proxy", and the abstracted operations are
> > > served via RPC. (In this sense, the BE is hypervisor-agnostic but
> > > might have an OS dependency.)
> > > But I know that we need to discuss whether this is a requirement
> > > even in the Stratos project or not. (Maybe not.)
> > >
> > Sorry, I haven't had time to finish reading your virtio-proxy completely
> > (I will do it ASAP). But from your description, it seems we need a
> > 3rd VM between FE and BE? My concern is that, if my assumption is right,
> > it will increase the latency in the data transport path, even if we're
> > using some lightweight guest like an RTOS or a unikernel.
>
> Yes, you're right. But I'm afraid that it is a matter of degree.
> As far as we execute 'mapping' operations at every fetch of payload,
> we will see a latency issue (even in your case), and if we have some
> solution for it, we won't see it in my proposal either :)

Re: Enabling hypervisor agnosticism for VirtIO backends

2021-08-17 Thread AKASHI Takahiro
On Tue, Aug 17, 2021 at 08:39:09AM +0000, Wei Chen wrote:
> Hi Akashi,
> 
> > Hi Wei, Oleksandr,
> >
> > On Mon, Aug 16, 2021 at 10:04:03AM +0000, Wei Chen wrote:
> > > Hi All,
> > >
> > > Thanks to Stefano for linking my kvmtool for Xen proposal here.
> > > This proposal is still being discussed in the Xen and KVM communities.
> > > The main work is to decouple kvmtool from KVM so that
> > > other hypervisors can reuse the virtual device implementations.
> > >
> > > In this case, we need to introduce an intermediate hypervisor
> > > layer for VMM abstraction, which is, I think, very close
> > > to Stratos' virtio hypervisor-agnosticism work.
> >
> > # My proposal[1] comes from my own idea and doesn't necessarily represent
> > # Linaro's view on this subject nor reflect Alex's concerns. Nevertheless,
> >
> > Your idea and my proposal seem to share the same background.
> > Both have a similar goal and currently start with Xen,
> > and both are based on kvm-tool. (Actually, my work is derived from
> > EPAM's virtio-disk, which is also based on kvm-tool.)
> >
> > In particular, the abstraction of hypervisor interfaces has the same
> > set of interfaces (your "struct vmm_impl" and my "RPC interfaces").
> > This is no coincidence, as we both share the same origin, as I said above.
> > And so we will also share the same issues. One of them is how to
> > "share/map the FE's memory". There is some trade-off between
> > portability and performance impact.
> > So we can discuss the topic here in this ML, too.
> > (See Alex's original email, too.)
> >
> Yes, I agree.
> 
> > On the other hand, my approach aims to create a "single-binary" solution
> > in which the same BE VM binary could run on any hypervisor.
> > It is somewhat similar to your "proposal-#2" in [2], but in my solution,
> > all the hypervisor-specific code would be put into another entity (VM),
> > named "virtio-proxy", and the abstracted operations are served via RPC.
> > (In this sense, the BE is hypervisor-agnostic but might have an OS dependency.)
> > But I know that we still need to discuss whether this is a requirement
> > in the Stratos project or not. (Maybe not.)
> >
> 
> Sorry, I haven't had time to finish reading your virtio-proxy completely
> (I will do it ASAP). But from your description, it seems we need a
> 3rd VM between FE and BE? My concern is that, if my assumption is right,
> it will increase the latency in the data transport path, even if we're
> using some lightweight guest like an RTOS or unikernel.

Yes, you're right. But I'm afraid that it is a matter of degree.
As long as we execute 'mapping' operations at every fetch of payload,
we will see a latency issue (even in your case), and if we have some
solution for it, we won't see it in my proposal either :)
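(To make the trade-off concrete, here is a rough C sketch of the two mapping strategies being contrasted above. The hyp_map_guest()/hyp_unmap() calls are hypothetical placeholders for whatever hypervisor-specific primitive is available, not a real API.)

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical hypervisor-specific primitives -- placeholders only. */
    void *hyp_map_guest(uint64_t guest_paddr, size_t len);
    void  hyp_unmap(void *ptr, size_t len);
    void  process(void *buf, size_t len);

    /* Mapping at every fetch of payload: each request pays for two
     * hypervisor round-trips plus TLB/page-table churn. */
    void handle_request_slow(uint64_t gpa, size_t len)
    {
        void *buf = hyp_map_guest(gpa, len);
        process(buf, len);
        hyp_unmap(buf, len);
    }

    /* Persistent mapping set up once at device initialisation:
     * the per-request path involves no hypervisor calls at all. */
    static uint8_t  *region;       /* mapped once at setup time */
    static uint64_t  region_base;  /* guest-physical base of the region */

    void handle_request_fast(uint64_t gpa, size_t len)
    {
        process(region + (gpa - region_base), len);
    }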

> > Specifically speaking about kvm-tool, I have a concern about its
> > license terms; targeting different hypervisors and different OSs
> > (which I assume includes RTOSes), the resultant library should be
> > permissively licensed, and kvm-tool's GPL might be an issue.
> > Any thoughts?
> >
> 
> Yes. If a user wants to implement a FreeBSD device model but the virtio
> library is GPL, then the GPL would be a problem. If we have another good
> candidate, I am open to it.

I have some candidates in mind, particularly for vq/vring:
* Open-AMP, or
* the corresponding FreeBSD code
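(Whichever code base is picked, it ends up implementing the split-virtqueue layout mandated by the VIRTIO spec; roughly, in C, with all fields little-endian on the wire:)

    #include <stdint.h>

    #define VRING_DESC_F_NEXT   1   /* chain continues via 'next' */
    #define VRING_DESC_F_WRITE  2   /* buffer is device write-only */

    /* Descriptor table entry (VIRTIO 1.1, sec. 2.6.5). */
    struct vring_desc {
        uint64_t addr;    /* guest-physical address of the buffer */
        uint32_t len;     /* buffer length in bytes */
        uint16_t flags;   /* VRING_DESC_F_* */
        uint16_t next;    /* index of the next descriptor, if F_NEXT */
    };

    /* Available ring: the driver publishes descriptor chain heads here. */
    struct vring_avail {
        uint16_t flags;
        uint16_t idx;      /* next slot the driver will fill */
        uint16_t ring[];   /* head indices, ring[i % queue_size] */
    };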

-Takahiro Akashi


> > -Takahiro Akashi
> >
> >
> > [1] https://op-lists.linaro.org/pipermail/stratos-dev/2021-
> > August/000548.html
> > [2] https://marc.info/?l=xen-devel&m=162373754705233&w=2
> >
> > >
> > > > From: Oleksandr Tyshchenko 
> > > > Sent: 14 August 2021 23:38
> > > > To: AKASHI Takahiro ; Stefano Stabellini
> > 
> > > > Cc: Alex Bennée ; Stratos Mailing List

Re: Enabling hypervisor agnosticism for VirtIO backends

2021-08-17 Thread Stefan Hajnoczi
On Wed, Aug 04, 2021 at 12:20:01PM -0700, Stefano Stabellini wrote:
> > Could we consider the kernel internally converting IOREQ messages from
> > the Xen hypervisor to eventfd events? Would this scale with other kernel
> > hypercall interfaces?
> > 
> > So any thoughts on what directions are worth experimenting with?
>  
> One option we should consider is for each backend to connect to Xen via
> the IOREQ interface. We could generalize the IOREQ interface and make it
> hypervisor agnostic. The interface is really trivial and easy to add.
> The only Xen-specific part is the notification mechanism, which is an
> event channel. If we replaced the event channel with something else the
> interface would be generic. See:
> https://gitlab.com/xen-project/xen/-/blob/staging/xen/include/public/hvm/ioreq.h#L52

There have been experiments with something kind of similar in KVM
recently (see struct ioregionfd_cmd):
https://lore.kernel.org/kvm/dad3d025bcf15ece11d9df0ff685e8ab0a4f2edd.1613828727.git.eafanas...@gmail.com/
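(To show how small the surface under discussion is, here is a hedged sketch of what a generalized, hypervisor-agnostic request descriptor might look like. The fields are paraphrased from Xen's public ioreq.h linked above, with the event-channel field replaced by an abstract notification interface; none of this is an existing API.)

    #include <stdint.h>

    /* Sketch of a hypervisor-agnostic I/O request, loosely following
     * Xen's struct ioreq (xen/include/public/hvm/ioreq.h). */
    struct generic_ioreq {
        uint64_t addr;    /* guest-physical address being accessed */
        uint64_t data;    /* value written, or to be filled for reads */
        uint32_t size;    /* access width in bytes */
        uint8_t  dir;     /* 0 = write, 1 = read */
        uint8_t  state;   /* free / pending / completed */
    };

    /* The only hypervisor-specific part is notification, so hide it
     * behind an ops table: an event channel on Xen, an eventfd on KVM. */
    struct notify_ops {
        void (*kick)(void *opaque);   /* signal the other side */
        int  (*wait)(void *opaque);   /* block until signalled */
    };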

> There is also another problem. IOREQ is probably not the only
> interface needed. Have a look at
> https://marc.info/?l=xen-devel&m=162373754705233&w=2. Don't we also need
> an interface for the backend to inject interrupts into the frontend? And
> if the backend requires dynamic memory mappings of frontend pages, then
> we would also need an interface to map/unmap domU pages.
> 
> These interfaces are a lot more problematic than IOREQ: IOREQ is tiny
> and self-contained. It is easy to add anywhere. A new interface to
> inject interrupts or map pages is more difficult to manage because it
> would require changes scattered across the various emulators.

Something like ioreq is indeed necessary to implement arbitrary devices,
but if you are willing to restrict yourself to VIRTIO then other
interfaces are possible too because the VIRTIO device model is different
from the general purpose x86 PIO/MMIO that Xen's ioreq seems to support.
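(In other words, a VIRTIO-restricted transport only has to communicate ring locations and a notification channel rather than trap arbitrary accesses; an illustrative sketch, not an existing interface:)

    #include <stdint.h>

    /* Illustrative only: everything a VIRTIO-level backend needs to
     * know about one virtqueue, in the style of vhost-user's vring
     * setup messages -- no general MMIO trapping required. */
    struct vq_transport {
        uint64_t desc_addr;   /* guest-physical address of descriptor table */
        uint64_t avail_addr;  /* ... of the available ring */
        uint64_t used_addr;   /* ... of the used ring */
        uint32_t num;         /* queue size in descriptors */
        int      kick_fd;     /* backend wakes when the FE kicks */
        int      call_fd;     /* FE is interrupted when the backend is done */
    };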

Stefan




RE: Enabling hypervisor agnosticism for VirtIO backends

2021-08-17 Thread Wei Chen
Hi Akashi,

> -Original Message-
> From: AKASHI Takahiro 
> Sent: 17 August 2021 16:08
> To: Wei Chen 
> Cc: Oleksandr Tyshchenko ; Stefano Stabellini
> ; Alex Bennée ; Stratos
> Mailing List ; virtio-dev@lists.oasis-
> open.org; Arnd Bergmann ; Viresh Kumar
> ; Stefano Stabellini
> ; stefa...@redhat.com; Jan Kiszka
> ; Carl van Schaik ;
> prat...@quicinc.com; Srivatsa Vaddagiri ; Jean-
> Philippe Brucker ; Mathieu Poirier
> ; Oleksandr Tyshchenko
> ; Bertrand Marquis
> ; Artem Mygaiev ; Julien
> Grall ; Juergen Gross ; Paul Durrant
> ; Xen Devel 
> Subject: Re: Enabling hypervisor agnosticism for VirtIO backends
>
> Hi Wei, Oleksandr,
>
> On Mon, Aug 16, 2021 at 10:04:03AM +0000, Wei Chen wrote:
> > Hi All,
> >
> > Thanks to Stefano for linking my kvmtool for Xen proposal here.
> > This proposal is still being discussed in the Xen and KVM communities.
> > The main work is to decouple kvmtool from KVM so that
> > other hypervisors can reuse the virtual device implementations.
> >
> > In this case, we need to introduce an intermediate hypervisor
> > layer for VMM abstraction, which is, I think, very close
> > to Stratos' virtio hypervisor-agnosticism work.
>
> # My proposal[1] comes from my own idea and doesn't necessarily represent
> # Linaro's view on this subject nor reflect Alex's concerns. Nevertheless,
>
> Your idea and my proposal seem to share the same background.
> Both have a similar goal and currently start with Xen,
> and both are based on kvm-tool. (Actually, my work is derived from
> EPAM's virtio-disk, which is also based on kvm-tool.)
>
> In particular, the abstraction of hypervisor interfaces has the same
> set of interfaces (your "struct vmm_impl" and my "RPC interfaces").
> This is no coincidence, as we both share the same origin, as I said above.
> And so we will also share the same issues. One of them is how to
> "share/map the FE's memory". There is some trade-off between
> portability and performance impact.
> So we can discuss the topic here in this ML, too.
> (See Alex's original email, too.)
>
Yes, I agree.

> On the other hand, my approach aims to create a "single-binary" solution
> in which the same BE VM binary could run on any hypervisor.
> It is somewhat similar to your "proposal-#2" in [2], but in my solution,
> all the hypervisor-specific code would be put into another entity (VM),
> named "virtio-proxy", and the abstracted operations are served via RPC.
> (In this sense, the BE is hypervisor-agnostic but might have an OS dependency.)
> But I know that we still need to discuss whether this is a requirement
> in the Stratos project or not. (Maybe not.)
>

Sorry, I haven't had time to finish reading your virtio-proxy completely
(I will do it ASAP). But from your description, it seems we need a
3rd VM between FE and BE? My concern is that, if my assumption is right,
it will increase the latency in the data transport path, even if we're
using some lightweight guest like an RTOS or unikernel.

> Specifically speaking about kvm-tool, I have a concern about its
> license terms; targeting different hypervisors and different OSs
> (which I assume includes RTOSes), the resultant library should be
> permissively licensed, and kvm-tool's GPL might be an issue.
> Any thoughts?
>

Yes. If a user wants to implement a FreeBSD device model but the virtio
library is GPL, then the GPL would be a problem. If we have another good
candidate, I am open to it.

> -Takahiro Akashi
>
>
> [1] https://op-lists.linaro.org/pipermail/stratos-dev/2021-
> August/000548.html
> [2] https://marc.info/?l=xen-devel&m=162373754705233&w=2
>
> >
> > > From: Oleksandr Tyshchenko 
> > > Sent: 14 August 2021 23:38
> > > To: AKASHI Takahiro ; Stefano Stabellini
> 
> > > Cc: Alex Bennée ; Stratos Mailing List
> ; virtio-...@lists.oasis-open.org; Arnd
> Bergmann ; Viresh Kumar
> ; Stefano Stabellini
> ; stefa...@redhat.com; Jan Kiszka
> ; Carl van Schaik ;
> prat...@quicinc.com; Srivatsa Vaddagiri ; Jean-
> Philippe Brucker ; Mathieu Poirier
> ; Wei Chen ; Oleksandr
> Tyshchenko ; Bertrand Marquis
> ; Artem Mygaiev ; Julien
> Grall ; Juergen Gross ; Paul Durrant
> ; Xen Devel 
> > > Subject: Re: Enabling hypervisor agnosticism for VirtIO backends
> > >
> > > Hello, all.
> > >
> > > Please see some comments below. And sorry for the possible format
> issues.
> > >
> > > > On Wed, Aug 11, 2021 at 9:27 AM AKASHI Takahiro
> <mailto:takahiro.aka...@linaro.org> wrote:
> > > > On Wed, Aug 04, 2021 at 12:20:01PM -0700, Stefano Stabellini wrote:
> > > > > CCing people working on Xen+VirtIO and IOREQs.

Re: Enabling hypervisor agnosticism for VirtIO backends

2021-08-17 Thread AKASHI Takahiro
Hi Wei, Oleksandr,

On Mon, Aug 16, 2021 at 10:04:03AM +0000, Wei Chen wrote:
> Hi All,
> 
> Thanks to Stefano for linking my kvmtool for Xen proposal here.
> This proposal is still being discussed in the Xen and KVM communities.
> The main work is to decouple kvmtool from KVM so that
> other hypervisors can reuse the virtual device implementations.
> 
> In this case, we need to introduce an intermediate hypervisor
> layer for VMM abstraction, which is, I think, very close
> to Stratos' virtio hypervisor-agnosticism work.

# My proposal[1] comes from my own idea and doesn't necessarily represent
# Linaro's view on this subject nor reflect Alex's concerns. Nevertheless,

Your idea and my proposal seem to share the same background.
Both have a similar goal and currently start with Xen,
and both are based on kvm-tool. (Actually, my work is derived from
EPAM's virtio-disk, which is also based on kvm-tool.)

In particular, the abstraction of hypervisor interfaces has the same
set of interfaces (your "struct vmm_impl" and my "RPC interfaces").
This is no coincidence, as we both share the same origin, as I said above.
And so we will also share the same issues. One of them is how to
"share/map the FE's memory". There is some trade-off between
portability and performance impact.
So we can discuss the topic here in this ML, too.
(See Alex's original email, too.)

On the other hand, my approach aims to create a "single-binary" solution
in which the same BE VM binary could run on any hypervisor.
It is somewhat similar to your "proposal-#2" in [2], but in my solution,
all the hypervisor-specific code would be put into another entity (VM),
named "virtio-proxy", and the abstracted operations are served via RPC.
(In this sense, the BE is hypervisor-agnostic but might have an OS dependency.)
But I know that we still need to discuss whether this is a requirement
in the Stratos project or not. (Maybe not.)

Specifically speaking about kvm-tool, I have a concern about its
license terms; targeting different hypervisors and different OSs
(which I assume includes RTOSes), the resultant library should be
permissively licensed, and kvm-tool's GPL might be an issue.
Any thoughts?

-Takahiro Akashi


[1] https://op-lists.linaro.org/pipermail/stratos-dev/2021-August/000548.html
[2] https://marc.info/?l=xen-devel&m=162373754705233&w=2

> 
> > From: Oleksandr Tyshchenko 
> > Sent: 14 August 2021 23:38
> > To: AKASHI Takahiro ; Stefano Stabellini 
> > 
> > Cc: Alex Bennée ; Stratos Mailing List 
> > ; virtio-...@lists.oasis-open.org; Arnd 
> > Bergmann ; Viresh Kumar 
> > ; Stefano Stabellini 
> > ; stefa...@redhat.com; Jan Kiszka 
> > ; Carl van Schaik ; 
> > prat...@quicinc.com; Srivatsa Vaddagiri ; 
> > Jean-Philippe Brucker ; Mathieu Poirier 
> > ; Wei Chen ; Oleksandr 
> > Tyshchenko ; Bertrand Marquis 
> > ; Artem Mygaiev ; Julien 
> > Grall ; Juergen Gross ; Paul Durrant 
> > ; Xen Devel 
> > Subject: Re: Enabling hypervisor agnosticism for VirtIO backends
> >
> > Hello, all.
> >
> > Please see some comments below. And sorry for the possible format issues.
> >
> > > On Wed, Aug 11, 2021 at 9:27 AM AKASHI Takahiro 
> > > <mailto:takahiro.aka...@linaro.org> wrote:
> > > On Wed, Aug 04, 2021 at 12:20:01PM -0700, Stefano Stabellini wrote:
> > > > CCing people working on Xen+VirtIO and IOREQs. Not trimming the original
> > > > email to let them read the full context.
> > > >
> > > > My comments below are related to a potential Xen implementation, not
> > > > because it is the only implementation that matters, but because it is
> > > > the one I know best.
> > >
> > > Please note that my proposal (and hence the working prototype)[1]
> > > is based on Xen's virtio implementation (i.e. IOREQ) and particularly
> > > EPAM's virtio-disk application (backend server).
> > > It has been, I believe, well generalized but is still a bit biased
> > > toward this original design.
> > >
> > > So I hope you like my approach :)
> > >
> > > [1] 
> > > https://op-lists.linaro.org/pipermail/stratos-dev/2021-August/000546.html
> > >
> > > Let me take this opportunity to explain a bit more about my approach 
> > > below.
> > >
> > > > Also, please see this relevant email thread:
> > > > https://marc.info/?l=xen-devel&m=162373754705233&w=2
> > > >
> > > >
> > > > On Wed, 4 Aug 2021, Alex Bennée wrote:
> > > > > Hi,
> > > > >
> > > > > One of the goals of Project Stratos is to enable hypervisor agnostic
>

RE: Enabling hypervisor agnosticism for VirtIO backends

2021-08-16 Thread Wei Chen
Hi All,

Thanks to Stefano for linking my kvmtool for Xen proposal here.
This proposal is still being discussed in the Xen and KVM communities.
The main work is to decouple kvmtool from KVM so that
other hypervisors can reuse the virtual device implementations.

In this case, we need to introduce an intermediate hypervisor
layer for VMM abstraction, which is, I think, very close
to Stratos' virtio hypervisor-agnosticism work.


> From: Oleksandr Tyshchenko 
> Sent: 14 August 2021 23:38
> To: AKASHI Takahiro ; Stefano Stabellini 
> 
> Cc: Alex Bennée ; Stratos Mailing List 
> ; virtio-...@lists.oasis-open.org; Arnd 
> Bergmann ; Viresh Kumar ; 
> Stefano Stabellini ; stefa...@redhat.com; Jan 
> Kiszka ; Carl van Schaik ; 
> prat...@quicinc.com; Srivatsa Vaddagiri ; Jean-Philippe 
> Brucker ; Mathieu Poirier 
> ; Wei Chen ; Oleksandr 
> Tyshchenko ; Bertrand Marquis 
> ; Artem Mygaiev ; Julien 
> Grall ; Juergen Gross ; Paul Durrant 
> ; Xen Devel 
> Subject: Re: Enabling hypervisor agnosticism for VirtIO backends
>
> Hello, all.
>
> Please see some comments below. And sorry for the possible format issues.
>
> > On Wed, Aug 11, 2021 at 9:27 AM AKASHI Takahiro 
> > <mailto:takahiro.aka...@linaro.org> wrote:
> > On Wed, Aug 04, 2021 at 12:20:01PM -0700, Stefano Stabellini wrote:
> > > CCing people working on Xen+VirtIO and IOREQs. Not trimming the original
> > > email to let them read the full context.
> > >
> > > My comments below are related to a potential Xen implementation, not
> > > because it is the only implementation that matters, but because it is
> > > the one I know best.
> >
> > Please note that my proposal (and hence the working prototype)[1]
> > is based on Xen's virtio implementation (i.e. IOREQ) and particularly
> > EPAM's virtio-disk application (backend server).
> > It has been, I believe, well generalized but is still a bit biased
> > toward this original design.
> >
> > So I hope you like my approach :)
> >
> > [1] 
> > https://op-lists.linaro.org/pipermail/stratos-dev/2021-August/000546.html
> >
> > Let me take this opportunity to explain a bit more about my approach below.
> >
> > > Also, please see this relevant email thread:
> > > https://marc.info/?l=xen-devel&m=162373754705233&w=2
> > >
> > >
> > > On Wed, 4 Aug 2021, Alex Bennée wrote:
> > > > Hi,
> > > >
> > > > One of the goals of Project Stratos is to enable hypervisor agnostic
> > > > backends so we can enable as much re-use of code as possible and avoid
> > > > repeating ourselves. This is the flip side of the front end where
> > > > multiple front-end implementations are required - one per OS, assuming
> > > > you don't just want Linux guests. The resultant guests are trivially
> > > > movable between hypervisors modulo any abstracted paravirt type
> > > > interfaces.
> > > >
> > > > In my original thumbnail sketch of a solution I envisioned vhost-user
> > > > daemons running in a broadly POSIX like environment. The interface to
> > > > the daemon is fairly simple requiring only some mapped memory and some
> > > > sort of signalling for events (on Linux this is eventfd). The idea was a
> > > > stub binary would be responsible for any hypervisor specific setup and
> > > > then launch a common binary to deal with the actual virtqueue requests
> > > > themselves.
> > > >
> > > > Since that original sketch we've seen an expansion in the sort of ways
> > > > backends could be created. There is interest in encapsulating backends
> > > > in RTOSes or unikernels for solutions like SCMI. The interest in Rust
> > > > has prompted ideas of using the trait interface to abstract differences
> > > > away as well as the idea of bare-metal Rust backends.
> > > >
> > > > We have a card (STR-12) called "Hypercall Standardisation" which
> > > > calls for a description of the APIs needed from the hypervisor side to
> > > > support VirtIO guests and their backends. However we are some way off
> > > > from that at the moment as I think we need to at least demonstrate one
> > > > portable backend before we start codifying requirements. To that end I
> > > > want to think about what we need for a backend to function.
> > > >
> > > > Configuration
> > > > =
> > > >
> > > > In the type-2 setup this is typically fairly simple because the host
> > 

Re: Enabling hypervisor agnosticism for VirtIO backends

2021-08-14 Thread Oleksandr Tyshchenko
Hello, all.

Please see some comments below. And sorry for the possible format issues.

On Wed, Aug 11, 2021 at 9:27 AM AKASHI Takahiro 
wrote:

> On Wed, Aug 04, 2021 at 12:20:01PM -0700, Stefano Stabellini wrote:
> > CCing people working on Xen+VirtIO and IOREQs. Not trimming the original
> > email to let them read the full context.
> >
> > My comments below are related to a potential Xen implementation, not
> > because it is the only implementation that matters, but because it is
> > the one I know best.
>
> Please note that my proposal (and hence the working prototype)[1]
> is based on Xen's virtio implementation (i.e. IOREQ) and particularly
> EPAM's virtio-disk application (backend server).
> It has been, I believe, well generalized but is still a bit biased
> toward this original design.
>
> So I hope you like my approach :)
>
> [1]
> https://op-lists.linaro.org/pipermail/stratos-dev/2021-August/000546.html
>
> Let me take this opportunity to explain a bit more about my approach below.
>
> > Also, please see this relevant email thread:
> > https://marc.info/?l=xen-devel&m=162373754705233&w=2
> >
> >
> > On Wed, 4 Aug 2021, Alex Bennée wrote:
> > > Hi,
> > >
> > > One of the goals of Project Stratos is to enable hypervisor agnostic
> > > backends so we can enable as much re-use of code as possible and avoid
> > > repeating ourselves. This is the flip side of the front end where
> > > multiple front-end implementations are required - one per OS, assuming
> > > you don't just want Linux guests. The resultant guests are trivially
> > > movable between hypervisors modulo any abstracted paravirt type
> > > interfaces.
> > >
> > > In my original thumbnail sketch of a solution I envisioned vhost-user
> > > daemons running in a broadly POSIX like environment. The interface to
> > > the daemon is fairly simple requiring only some mapped memory and some
> > > sort of signalling for events (on Linux this is eventfd). The idea was a
> > > stub binary would be responsible for any hypervisor specific setup and
> > > then launch a common binary to deal with the actual virtqueue requests
> > > themselves.
> > >
> > > Since that original sketch we've seen an expansion in the sort of ways
> > > backends could be created. There is interest in encapsulating backends
> > > in RTOSes or unikernels for solutions like SCMI. The interest in Rust
> > > has prompted ideas of using the trait interface to abstract differences
> > > away as well as the idea of bare-metal Rust backends.
> > >
> > > We have a card (STR-12) called "Hypercall Standardisation" which
> > > calls for a description of the APIs needed from the hypervisor side to
> > > support VirtIO guests and their backends. However we are some way off
> > > from that at the moment as I think we need to at least demonstrate one
> > > portable backend before we start codifying requirements. To that end I
> > > want to think about what we need for a backend to function.
> > >
> > > Configuration
> > > =
> > >
> > > In the type-2 setup this is typically fairly simple because the host
> > > system can orchestrate the various modules that make up the complete
> > > system. In the type-1 case (or even type-2 with delegated service VMs)
> > > we need some sort of mechanism to inform the backend VM about key
> > > details about the system:
> > >
> > >   - where virt queue memory is in its address space
> > >   - how it's going to receive (interrupt) and trigger (kick) events
> > >   - what (if any) resources the backend needs to connect to
> > >
> > > Obviously you can elide configuration issues by having static
> > > configurations and baking the assumptions into your guest images;
> > > however, this isn't scalable in the long term. The obvious solution seems to be
> > > extending a subset of Device Tree data to user space but perhaps there
> > > are other approaches?
> > >
> > > Before any virtio transactions can take place the appropriate memory
> > > mappings need to be made between the FE guest and the BE guest.
> >
> > > Currently the whole of the FE guest's address space needs to be visible
> > > to whatever is serving the virtio requests. I can envision 3 approaches:
> > >
> > >  * BE guest boots with memory already mapped
> > >
> > >  This would entail the guest OS knowing which parts of its Guest
> > >  Physical Address space are already taken up and avoiding clashes.
> > >  I would assume in this case you would want a standard interface to
> > >  userspace to then make that address space visible to the backend daemon.
>
> Yet another way here is that we would have a well-known "shared memory"
> between VMs. I think that Jailhouse's ivshmem gives us good insights on this
> matter and that it can even be an alternative for a hypervisor-agnostic
> solution.
>
> (Please note memory regions in ivshmem appear as a PCI device and can be
> mapped locally.)
>
> I want to add this shared memory aspect to my virtio-proxy, but
> the resultant solution would eventually look similar to ivshmem.

Re: Enabling hypervisor agnosticism for VirtIO backends

2021-08-11 Thread AKASHI Takahiro
On Wed, Aug 04, 2021 at 12:20:01PM -0700, Stefano Stabellini wrote:
> CCing people working on Xen+VirtIO and IOREQs. Not trimming the original
> email to let them read the full context.
> 
> My comments below are related to a potential Xen implementation, not
> because it is the only implementation that matters, but because it is
> the one I know best.

Please note that my proposal (and hence the working prototype)[1]
is based on Xen's virtio implementation (i.e. IOREQ) and particularly
EPAM's virtio-disk application (backend server).
It has been, I believe, well generalized but is still a bit biased
toward this original design.

So I hope you like my approach :)

[1] https://op-lists.linaro.org/pipermail/stratos-dev/2021-August/000546.html

Let me take this opportunity to explain a bit more about my approach below.

> Also, please see this relevant email thread:
> https://marc.info/?l=xen-devel&m=162373754705233&w=2
> 
> 
> On Wed, 4 Aug 2021, Alex Bennée wrote:
> > Hi,
> > 
> > One of the goals of Project Stratos is to enable hypervisor agnostic
> > backends so we can enable as much re-use of code as possible and avoid
> > repeating ourselves. This is the flip side of the front end where
> > multiple front-end implementations are required - one per OS, assuming
> > you don't just want Linux guests. The resultant guests are trivially
> > movable between hypervisors modulo any abstracted paravirt type
> > interfaces.
> > 
> > In my original thumbnail sketch of a solution I envisioned vhost-user
> > daemons running in a broadly POSIX like environment. The interface to
> > the daemon is fairly simple requiring only some mapped memory and some
> > sort of signalling for events (on Linux this is eventfd). The idea was a
> > stub binary would be responsible for any hypervisor specific setup and
> > then launch a common binary to deal with the actual virtqueue requests
> > themselves.
> > 
> > Since that original sketch we've seen an expansion in the sort of ways
> > backends could be created. There is interest in encapsulating backends
> > in RTOSes or unikernels for solutions like SCMI. The interest in Rust
> > has prompted ideas of using the trait interface to abstract differences
> > away as well as the idea of bare-metal Rust backends.
> > 
> > We have a card (STR-12) called "Hypercall Standardisation" which
> > calls for a description of the APIs needed from the hypervisor side to
> > support VirtIO guests and their backends. However we are some way off
> > from that at the moment as I think we need to at least demonstrate one
> > portable backend before we start codifying requirements. To that end I
> > want to think about what we need for a backend to function.
> > 
> > Configuration
> > =
> > 
> > In the type-2 setup this is typically fairly simple because the host
> > system can orchestrate the various modules that make up the complete
> > system. In the type-1 case (or even type-2 with delegated service VMs)
> > we need some sort of mechanism to inform the backend VM about key
> > details about the system:
> > 
> >   - where virt queue memory is in its address space
> >   - how it's going to receive (interrupt) and trigger (kick) events
> >   - what (if any) resources the backend needs to connect to
> > 
> > Obviously you can elide configuration issues by having static
> > configurations and baking the assumptions into your guest images; however,
> > this isn't scalable in the long term. The obvious solution seems to be
> > extending a subset of Device Tree data to user space but perhaps there
> > are other approaches?
> > 
> > Before any virtio transactions can take place the appropriate memory
> > mappings need to be made between the FE guest and the BE guest.
> 
> > Currently the whole of the FE guest's address space needs to be visible
> > to whatever is serving the virtio requests. I can envision 3 approaches:
> > 
> >  * BE guest boots with memory already mapped
> > 
> >  This would entail the guest OS knowing which parts of its Guest Physical
> >  Address space are already taken up and avoiding clashes. I would assume
> >  in this case you would want a standard interface to userspace to then
> >  make that address space visible to the backend daemon.

Yet another way here is that we would have a well-known "shared memory" between
VMs. I think that Jailhouse's ivshmem gives us good insights on this matter
and that it can even be an alternative for a hypervisor-agnostic solution.

(Please note memory regions in ivshmem appear as a PCI device and can be
mapped locally.)

I want to add this shared memory aspect to my virtio-proxy, but
the resultant solution would eventually look similar to ivshmem.
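(For concreteness, this is roughly how an ivshmem-style region is reached from userspace on Linux: the shared memory is a PCI BAR, which sysfs exposes as a mappable resource file. The PCI address below is a made-up example.)

    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        /* BAR2 of an ivshmem device is the shared-memory region; the
         * PCI address here is a made-up example. */
        const char *bar = "/sys/bus/pci/devices/0000:00:05.0/resource2";
        int fd = open(bar, O_RDWR);
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        fstat(fd, &st);                /* BAR size == shared region size */

        uint8_t *shm = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
        if (shm == MAP_FAILED) { perror("mmap"); return 1; }

        /* Both VMs now see the same bytes here, e.g. for rings/payloads. */
        shm[0] = 0x42;

        munmap(shm, st.st_size);
        close(fd);
        return 0;
    }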

> >  * BE guest boots with a hypervisor handle to memory
> > 
> >  The BE guest is then free to map the FE's memory to where it wants in
> >  the BE's guest physical address space.
> 
> I cannot see how this could work for Xen. There is no "handle" to give
> to the backend if the backend is not running in dom0.

Re: Enabling hypervisor agnosticism for VirtIO backends

2021-08-04 Thread Stefano Stabellini
CCing people working on Xen+VirtIO and IOREQs. Not trimming the original
email to let them read the full context.

My comments below are related to a potential Xen implementation, not
because it is the only implementation that matters, but because it is
the one I know best.

Also, please see this relevant email thread:
https://marc.info/?l=xen-devel&m=162373754705233&w=2


On Wed, 4 Aug 2021, Alex Bennée wrote:
> Hi,
> 
> One of the goals of Project Stratos is to enable hypervisor agnostic
> backends so we can enable as much re-use of code as possible and avoid
> repeating ourselves. This is the flip side of the front end where
> multiple front-end implementations are required - one per OS, assuming
> you don't just want Linux guests. The resultant guests are trivially
> movable between hypervisors modulo any abstracted paravirt type
> interfaces.
> 
> In my original thumbnail sketch of a solution I envisioned vhost-user
> daemons running in a broadly POSIX like environment. The interface to
> the daemon is fairly simple requiring only some mapped memory and some
> sort of signalling for events (on Linux this is eventfd). The idea was a
> stub binary would be responsible for any hypervisor specific setup and
> then launch a common binary to deal with the actual virtqueue requests
> themselves.
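(A minimal sketch of the eventfd signalling referred to above -- one side writes to kick, the other reads to wait; this is standard Linux eventfd usage, with both ends shown in one process for brevity:)

    #include <stdint.h>
    #include <stdio.h>
    #include <sys/eventfd.h>
    #include <unistd.h>

    int main(void)
    {
        /* In a real setup the fd would be passed between processes,
         * e.g. over a UNIX socket, as vhost-user does. */
        int kick = eventfd(0, EFD_CLOEXEC);
        if (kick < 0) { perror("eventfd"); return 1; }

        uint64_t one = 1;
        (void)write(kick, &one, sizeof(one));   /* "kick": signal an event */

        uint64_t count;
        (void)read(kick, &count, sizeof(count)); /* consumer wakes; count >= 1 */
        printf("received %llu kick(s)\n", (unsigned long long)count);

        close(kick);
        return 0;
    }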
> 
> Since that original sketch we've seen an expansion in the sort of ways
> backends could be created. There is interest in encapsulating backends
> in RTOSes or unikernels for solutions like SCMI. The interest in Rust
> has prompted ideas of using the trait interface to abstract differences
> away as well as the idea of bare-metal Rust backends.
> 
> We have a card (STR-12) called "Hypercall Standardisation" which
> calls for a description of the APIs needed from the hypervisor side to
> support VirtIO guests and their backends. However we are some way off
> from that at the moment as I think we need to at least demonstrate one
> portable backend before we start codifying requirements. To that end I
> want to think about what we need for a backend to function.
> 
> Configuration
> =
> 
> In the type-2 setup this is typically fairly simple because the host
> system can orchestrate the various modules that make up the complete
> system. In the type-1 case (or even type-2 with delegated service VMs)
> we need some sort of mechanism to inform the backend VM about key
> details about the system:
> 
>   - where virt queue memory is in its address space
>   - how it's going to receive (interrupt) and trigger (kick) events
>   - what (if any) resources the backend needs to connect to
> 
> Obviously you can elide configuration issues by having static
> configurations and baking the assumptions into your guest images; however,
> this isn't scalable in the long term. The obvious solution seems to be
> extending a subset of Device Tree data to user space but perhaps there
> are other approaches?
> 
> Before any virtio transactions can take place the appropriate memory
> mappings need to be made between the FE guest and the BE guest.

> Currently the whole of the FE guest's address space needs to be visible
> to whatever is serving the virtio requests. I can envision 3 approaches:
> 
>  * BE guest boots with memory already mapped
> 
>  This would entail the guest OS knowing which parts of its Guest Physical
>  Address space are already taken up and avoiding clashes. I would assume
>  in this case you would want a standard interface to userspace to then
>  make that address space visible to the backend daemon.
> 
>  * BE guest boots with a hypervisor handle to memory
> 
>  The BE guest is then free to map the FE's memory to where it wants in
>  the BE's guest physical address space.

I cannot see how this could work for Xen. There is no "handle" to give
to the backend if the backend is not running in dom0. So for Xen I think
the memory has to be already mapped and the mapping probably done by the
toolstack (also see below.) Or we would have to invent a new Xen
hypervisor interface and Xen virtual machine privileges to allow this
kind of mapping.

If we run the backend in Dom0 then we have no problems, of course.
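(For reference, a dom0 backend can already do such mappings today through libxenforeignmemory; a minimal sketch, where the domid and frame number are made-up example values:)

    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <xenforeignmemory.h>

    int main(void)
    {
        /* domid and frame number are made-up example values. */
        uint32_t domid = 1;
        xen_pfn_t gfn = 0x80000;
        int err;

        xenforeignmemory_handle *fmem = xenforeignmemory_open(NULL, 0);
        if (!fmem) { perror("xenforeignmemory_open"); return 1; }

        /* Map one page of the frontend guest into our address space. */
        void *p = xenforeignmemory_map(fmem, domid, PROT_READ | PROT_WRITE,
                                       1, &gfn, &err);
        if (!p) { perror("xenforeignmemory_map"); return 1; }

        /* ... the backend can now read/write the FE's page directly ... */

        xenforeignmemory_unmap(fmem, p, 1);
        xenforeignmemory_close(fmem);
        return 0;
    }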


> To activate the mapping will
>  require some sort of hypercall to the hypervisor. I can see two options
>  at this point:
> 
>   - expose the handle to userspace for daemon/helper to trigger the
> mapping via existing hypercall interfaces. If using a helper you
> would have a hypervisor specific one to avoid the daemon having to
> care too much about the details or push that complexity into a
> compile time option for the daemon which would result in different
> binaries, albeit from a common source base.
> 
>   - expose a new kernel ABI to abstract the hypercall differences away
> in the guest kernel. In this case the userspace would essentially
> ask for an abstract "map guest N memory to userspace ptr" and let
> the kernel deal with the different hypercall interfaces. This of