Re: [virtio-dev] [PATCH] virtio-net: support timestamp received packet

2022-10-18 Thread Jason Wang
On Wed, Oct 19, 2022 at 9:32 AM Shuyi Cheng wrote:
>
> This patch introduces VIRTIO_NET_F_RX_TSTAMP to enhance the
> observability of network packet delay between virtio device and virtio
> driver.
>
> We have encountered many network jitter problems on the path from the
> virtio device to the virtio driver in our production environment. Due to
> the lack of relevant indicators on this path, we often spend a lot of
> effort locating such problems. If the virtio device can provide the
> packet receive timestamp, then we can easily calculate the network jitter
> between the virtio device and the virtio driver. When such a problem is
> encountered again, it is easy to determine the problem boundary.
>
> Thanks and looking forward to your response!

While at it, do we need a tx timestamp as well?

>
> Signed-off-by: Shuyi Cheng 
> ---
>   content.tex | 14 ++
>   1 file changed, 14 insertions(+)
>
> diff --git a/content.tex b/content.tex
> index e863709..472acf3 100644
> --- a/content.tex
> +++ b/content.tex
> @@ -3097,6 +3097,8 @@ \subsection{Feature bits}\label{sec:Device Types /
> Network Device / Feature bits
>   \item[VIRTIO_NET_F_HASH_REPORT(57)] Device can report per-packet hash
>   value and a type of calculated hash.
>
> +\item[VIRTIO_NET_F_RX_TSTAMP(58)] Device can timestamp received packets.
> +
>   \item[VIRTIO_NET_F_GUEST_HDRLEN(59)] Driver can provide the exact
> \field{hdr_len}
>   value. Device benefits from knowing the exact header length.
>
> @@ -3371,6 +3373,7 @@ \subsection{Device Operation}\label{sec:Device
> Types / Network Device / Device O
>   #define VIRTIO_NET_HDR_F_NEEDS_CSUM  1
>   #define VIRTIO_NET_HDR_F_DATA_VALID  2
>   #define VIRTIO_NET_HDR_F_RSC_INFO    4
> +#define VIRTIO_NET_HDR_F_TSTAMP      8
>   u8 flags;
>   #define VIRTIO_NET_HDR_GSO_NONE      0
>   #define VIRTIO_NET_HDR_GSO_TCPV4     1
> @@ -3387,6 +3390,7 @@ \subsection{Device Operation}\label{sec:Device
> Types / Network Device / Device O
>   le32 hash_value;        (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>   le16 hash_report;       (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
>   le16 padding_reserved;  (Only if VIRTIO_NET_F_HASH_REPORT negotiated)
> +le64 tstamp;            (Only if VIRTIO_NET_F_RX_TSTAMP negotiated)
>   };
>   \end{lstlisting}
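
Read as a C struct, the receive header with the proposed field would lay
out roughly as below. This is only an illustrative sketch assembled from
the existing virtio_net_hdr_v1_hash layout plus the proposed tstamp; the
struct name is mine, not part of the patch.

#include <stdint.h>

/* Sketch of the receive header with the proposed tstamp field appended.
 * All multi-byte fields are little-endian on the wire. */
struct virtio_net_hdr_rx_tstamp_sketch {
        uint8_t  flags;            /* VIRTIO_NET_HDR_F_TSTAMP (8) set here */
        uint8_t  gso_type;
        uint16_t hdr_len;
        uint16_t gso_size;
        uint16_t csum_start;
        uint16_t csum_offset;
        uint16_t num_buffers;
        uint32_t hash_value;       /* only if VIRTIO_NET_F_HASH_REPORT */
        uint16_t hash_report;      /* only if VIRTIO_NET_F_HASH_REPORT */
        uint16_t padding_reserved; /* only if VIRTIO_NET_F_HASH_REPORT */
        uint64_t tstamp;           /* only if VIRTIO_NET_F_RX_TSTAMP */
};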
>
> @@ -3809,6 +3813,13 @@ \subsubsection{Processing of Incoming
> Packets}\label{sec:Device Types / Network
>   checksum (in case of multiple encapsulated protocols, one level
>   of checksums is validated).
>
> +If VIRTIO_NET_F_RX_TSTAMP was not negotiated, the device MUST NOT set
> +the VIRTIO_NET_HDR_F_TSTAMP bit in \field{flags}.
> +
> +If VIRTIO_NET_F_RX_TSTAMP was negotiated, the device MUST set
> +the VIRTIO_NET_HDR_F_TSTAMP bit in \field{flags} and set
> +\field{tstamp} to the time at which the packet was received.

What kind of time should we use? Willem tried to propose something
like this via TAI:

https://lists.linuxfoundation.org/pipermail/virtualization/2021-February/052476.html

Thanks
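
To make the clock question concrete: if the device were to report
\field{tstamp} as nanoseconds on the TAI clock (as in Willem's proposal
linked above), a driver-side consumer could compute the per-packet delay
roughly like this. It is only a sketch; the helper name and the TAI
assumption are mine, not something the patch defines.

#include <stdint.h>
#include <stdbool.h>
#include <time.h>

#define VIRTIO_NET_HDR_F_TSTAMP 8

/* Sketch: per-packet device-to-driver delay, assuming the device reports
 * tstamp in TAI nanoseconds. The driver must not touch tstamp unless the
 * flag is set. CLOCK_TAI is Linux-specific. */
static bool rx_delay_ns(uint8_t flags, uint64_t dev_tstamp, int64_t *delay)
{
        struct timespec now;

        if (!(flags & VIRTIO_NET_HDR_F_TSTAMP))
                return false;

        clock_gettime(CLOCK_TAI, &now);          /* driver-side receive time */
        *delay = (int64_t)now.tv_sec * 1000000000LL + now.tv_nsec
                 - (int64_t)dev_tstamp;
        return true;
}

Jitter would then be the variation of successive delays; which clock the
device should use (TAI, the guest's realtime clock, or something
virtio-specific) is exactly the open question here.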

> +
>   \drivernormative{\paragraph}{Processing of Incoming
>   Packets}{Device Types / Network Device / Device Operation /
>   Processing of Incoming Packets}
> @@ -3831,6 +3842,9 @@ \subsubsection{Processing of Incoming
> Packets}\label{sec:Device Types / Network
>   VIRTIO_NET_HDR_F_DATA_VALID is set, the driver MUST NOT
>   rely on the packet checksum being correct.
>
> +If the VIRTIO_NET_HDR_F_TSTAMP bit in \field{flags} is not set, the
> +driver MUST NOT use \field{tstamp}.
> +
>   \paragraph{Hash calculation for incoming packets}
>   \label{sec:Device Types / Network Device / Device Operation /
> Processing of Incoming Packets / Hash calculation for incoming packets}
>
> --
> 2.27.0
>
>
> -
> To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org
>


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



Re: [virtio-dev] [PATCH 0/2] introduce virtio-ism: internal shared memory device

2022-10-18 Thread Jason Wang
On Wed, Oct 19, 2022 at 12:35 PM Xuan Zhuo  wrote:
>
> On Wed, 19 Oct 2022 11:56:52 +0800, Jason Wang  wrote:
> > On Wed, Oct 19, 2022 at 10:42 AM Xuan Zhuo  
> > wrote:
> > >
> > > On Mon, 17 Oct 2022 16:17:31 +0800, Jason Wang  
> > > wrote:
> > >
> > >
> > > Hi Jason,
> > >
> > > I think there may be some problems with the direction we are discussing.
> >
> > Probably not.
> >
> > As far as we are focusing on technology, there's nothing wrong from my
> > perspective. And this is how the community works. Your idea needs to
> > be justified and people are free to raise any technical questions
> > especially considering you've posted a spec change with prototype
> > codes but not only the idea.
> >
> > > Our
> > > goal is to add a new ism device. As far as the spec is concerned, we are
> > > not concerned with the implementation of the backend.
> > >
> > > The direction we should discuss is what is the difference between the ism 
> > > device
> > > and other devices such as virtio-net, and whether it is necessary to 
> > > introduce
> > > this new device.
> >
> > This is somehow what I want to ask, actually it's not a comparison
> > with virtio-net but:
> >
> > - virtio-roce
> > - virtio-vhost-user
> > - virtio-(p)mem
> >
> > or whether we can simply add features to those devices to achieve what
> > you want to do here.
> >
> > > How to share the backend with other device is another problem.
> >
> > Yes, anything that is used for your virtio-ism prototype can be used
> > for other devices.
> >
> > >
> > > Our goal is to dynamically obtain a piece of memory to share with other 
> > > vms.
> >
> > So at this level, I don't see the exact difference compared to
> > virtio-vhost-user. Let's just focus on the API that carries on the
> > semantic:
> >
> > - map/unmap
> > - permission update
> >
> > The only missing piece is the per region notification.
>
>
>
> I want to know how we can share a region based on vvu:
>
> |---------|   |-----------|
> |         |   |           |
> |  -----  |   |  -------  |
> |  | ? |  |   |  | vvu |  |
> |---------|   |-----------|
>      |             |
>      |             |
>      |-------------|
>
> Can you describe this process in the vvu scenario you are considering?
>
>
> The flow of ism we consider is as follows:
> 1. SMC calls the interface ism_alloc_region() of the ism driver to return 
> the
>location of a memory region in the PCI space and a token.

Can virtio-vhost-user be backed on the memory you've used for ISM?
It's just a matter of the command name:

VHOST_IOTLB_UPDATE(or other) vs VIRTIO_ISM_CTRL_ALLOC.

Or we can consider this from another angle: can virtio-vhost-user be
built on top of ISM?

> 2. The ism driver mmap the memory region and return to SMC with the token

This part should be the same as long as we add token to a specific region.

> 3. SMC passes the token to the connected peer

Should be the same.

> 4. the peer calls the ism driver interface ism_attach_region(token) to
>get the location of the PCI space of the shared memory

Ditto.

Thanks
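
For concreteness, the four steps quoted above could be sketched as the
following driver-facing interface. ism_alloc_region()/ism_attach_region()
are the names used in the cover letter, while the argument and return
types here are only guesses, not part of the proposal.

#include <stddef.h>
#include <stdint.h>

/* Sketch of the driver interface implied by the quoted flow. */
struct ism_region {
        uint64_t pci_offset;  /* location of the region in the device's PCI space */
        size_t   len;
        uint64_t token;       /* handed to the connected peer out of band */
};

/* Steps 1-2: SMC asks the ism driver for a region plus a token, then
 * mmap()s the returned PCI-space location into its own address space. */
int ism_alloc_region(size_t len, struct ism_region *out);

/* Steps 3-4: after receiving the token from the peer, attach to the same
 * region and learn its location in the local device's PCI space. */
int ism_attach_region(uint64_t token, struct ism_region *out);

The comparison with virtio-vhost-user is then whether VHOST_IOTLB_UPDATE /
VHOST_IOTLB_UNMAP plus a token-passing convention can provide the same two
entry points.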

>
> Thanks.
>
>
> >
> > >
> > > In a connection, this memory will be used repeatedly. As far as SMC is 
> > > concerned,
> > > it will use it as a ring. Of course, we also need a notify mechanism.
> > >
> > > That's what we're aiming for, so we should first discuss whether this
> > > requirement is reasonable.
> >
> > So unless somebody said "no", it is fine until now.
> >
> > > I think it's a feature currently not supported by
> > > other devices specified by the current virtio spec.
> >
> > Probably, but we've already had rfcs for roce and vhost-user.
> >
> > Thanks
> >
> > >
> > > Thanks.
> > >
> > >
> >
>


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



Re: [virtio-dev] [PATCH 0/2] introduce virtio-ism: internal shared memory device

2022-10-18 Thread Jason Wang
On Wed, Oct 19, 2022 at 12:22 PM Xuan Zhuo  wrote:
>
> On Wed, 19 Oct 2022 11:56:52 +0800, Jason Wang  wrote:
> > On Wed, Oct 19, 2022 at 10:42 AM Xuan Zhuo  
> > wrote:
> > >
> > > On Mon, 17 Oct 2022 16:17:31 +0800, Jason Wang  
> > > wrote:
> > >
> > >
> > > Hi Jason,
> > >
> > > I think there may be some problems with the direction we are discussing.
> >
> > Probably not.
> >
> > As far as we are focusing on technology, there's nothing wrong from my
> > perspective. And this is how the community works. Your idea needs to
> > be justified and people are free to raise any technical questions
> > especially considering you've posted a spec change with prototype
> > codes but not only the idea.
> >
> > > Our
> > > goal is to add a new ism device. As far as the spec is concerned, we are
> > > not concerned with the implementation of the backend.
> > >
> > > The direction we should discuss is what is the difference between the ism 
> > > device
> > > and other devices such as virtio-net, and whether it is necessary to 
> > > introduce
> > > this new device.
> >
> > This is somehow what I want to ask, actually it's not a comparison
> > with virtio-net but:
> >
> > - virtio-roce
> > - virtio-vhost-user
> > - virtio-(p)mem
> >
> > or whether we can simply add features to those devices to achieve what
> > you want to do here.
>
>
> Yes, this is my priority to discuss.
>
> At the moment, I think the closest thing to ism is the Vhost-user Device
> Backend of virtio-vhost-user.
>
> My understanding of it is to map any virtio device to another vm as a vvu
> device.

Yes, so a possible way is to have a device with memory zone/region
provision and management then map it via virtio-vhost-user.

>
> From this design purpose, I think the two are different.
>
> Of course, you might want to extend it, it does have some similarities and 
> uses
> a lot of similar techniques.

I don't have any preference so far. If you think your idea makes more
sense, then try your best to justify it in the list.

> So we can discuss in this direction: whether the vvu device can be
> extended to achieve the purpose of ism, or whether the design goals can
> be agreed on.

I've added Stefan in the loop, let's hear from him.

>
> Or, in the direction of memory sharing in the backend, can ism and vvu be 
> merged?
> Should device/driver APIs remain independent?

Btw, you mentioned that one possible user of ism is the smc, but I
don't see how it connects to that with your prototype driver.

Thanks

>
> Thanks.
>
>
> >
> > > How to share the backend with other device is another problem.
> >
> > Yes, anything that is used for your virtio-ism prototype can be used
> > for other devices.
> >
> > >
> > > Our goal is to dynamically obtain a piece of memory to share with other 
> > > vms.
> >
> > So at this level, I don't see the exact difference compared to
> > virtio-vhost-user. Let's just focus on the API that carries on the
> > semantic:
> >
> > - map/unmap
> > - permission update
> >
> > The only missing piece is the per region notification.
> >
> > >
> > > In a connection, this memory will be used repeatedly. As far as SMC is 
> > > concerned,
> > > it will use it as a ring. Of course, we also need a notify mechanism.
> > >
> > > That's what we're aiming for, so we should first discuss whether this
> > > requirement is reasonable.
> >
> > So unless somebody said "no", it is fine until now.
> >
> > > I think it's a feature currently not supported by
> > > other devices specified by the current virtio spec.
> >
> > Probably, but we've already had rfcs for roce and vhost-user.
> >
> > Thanks
> >
> > >
> > > Thanks.
> > >
> > >
> >
>


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



Re: [virtio-dev] [PATCH 0/2] introduce virtio-ism: internal shared memory device

2022-10-18 Thread Jason Wang



On 2022/10/18 16:55, He Rongguang wrote:



On 2022/10/18 14:54, Jason Wang wrote:
On Mon, Oct 17, 2022 at 8:31 PM Xuan Zhuo wrote:


On Mon, 17 Oct 2022 16:17:31 +0800, Jason Wang wrote:

Adding Stefan.


On Mon, Oct 17, 2022 at 3:47 PM Xuan Zhuo wrote:


Hello everyone,

# Background

Nowadays, there is a common scenario to accelerate communication between
different VMs and containers, including light weight virtual machine based
containers. One way to achieve this is to colocate them on the same host.
However, the performance of inter-VM communication through network stack is
not optimal and may also waste extra CPU cycles. This scenario has been
discussed many times, but still no generic solution available [1] [2] [3].

With pci-ivshmem + SMC(Shared Memory Communications: [4]) based PoC[5],
we found that by changing the communication channel between VMs from TCP to
SMC with shared memory, we can achieve superior performance for a common
socket-based application[5]:
   - latency reduced by about 50%
   - throughput increased by about 300%
   - CPU consumption reduced by about 50%

Since there is no particularly suitable shared memory management solution
that matches the need for SMC (see ## Comparison with existing technology),
and virtio is the standard for communication in the virtualization world,
we want to implement a virtio-ism device based on virtio, which can support
on-demand memory sharing across VMs, containers or VM-container. To match
the needs of SMC, the virtio-ism device needs to support:

1. Dynamic provision: shared memory regions are dynamically allocated and
   provisioned.
2. Multi-region management: the shared memory is divided into regions,
   and a peer may allocate one or more regions from the same shared memory
   device.
3. Permission control: the permission of each region can be set separately.


Looks like virtio-ROCE

https://lore.kernel.org/all/20220511095900.343-1-xieyon...@bytedance.com/T/ 



and virtio-vhost-user can satisfy the requirement?



# Virtio ism device

ISM devices provide the ability to share memory between different guests on
a host. A guest's memory got from ism device can be shared with multiple
peers at the same time. This shared relationship can be dynamically created
and released.

The shared memory obtained from the device is divided into multiple ism
regions for share. ISM device provides a mechanism to notify other ism
region referrers of content update events.

# Usage (SMC as example)

Here is one possible use case:

1. SMC calls the interface ism_alloc_region() of the ism driver to return
   the location of a memory region in the PCI space and a token.
2. The ism driver mmaps the memory region and returns to SMC with the token.
3. SMC passes the token to the connected peer.
4. The peer calls the ism driver interface ism_attach_region(token) to
   get the location of the PCI space of the shared memory.


# About hot plugging of the ism device

   Hot plugging of devices is a heavier, possibly failing, time-consuming,
   and less scalable operation. So, we don't plan to support it for now.

# Comparison with existing technology

## ivshmem or ivshmem 2.0 of Qemu

   1. ivshmem 1.0 is a large piece of memory that can be seen by all devices
   that use this VM, so the security is not enough.

   2. ivshmem 2.0 is a shared memory belonging to a VM that can be read-only
   by all other VMs that use the ivshmem 2.0 shared memory device, which
   also does not meet our needs in terms of security.

## vhost-pci and virtiovhostuser

   Does not support dynamic allocation and therefore not suitable for SMC.


I think this is an implementation issue; if we support the VHOST IOTLB
message, then the regions could be added/removed on demand.



1. After the attacker connects with the victim, if the attacker does not
   dereference memory, the memory will be occupied under virtiovhostuser.
   In the case of ism devices, the victim can directly release the
   reference, and the maliciously referenced region only occupies the
   attacker's resources


Let's define the security boundary here. E.g. do we trust the device or
not? If yes, in the case of virtiovhostuser, can we simply do
VHOST_IOTLB_UNMAP so that we can safely release the memory from the
attacker.



2. The ism device of a VM can be shared with multiple (1000+) VMs at the
   same time, which is a challenge for virtiovhostuser


Please elaborate more on the challenges; what makes virtiovhostuser
different?


Hi, besides that, I think there's another distinctive difference between
virtio-ism+smc and virtiovhostuser: in virtiovhostuser, one end is a
frontend (virtio-net device), the other end is a vhost backend, thus it's a
one-frontend-to-one-backend model, whereas in our business scenario, we need
a dynamic network communication model, in which one end that runs for a long
time may connect and communicate to a just booted VM, i.e., each end 

Re: [virtio-dev] [PATCH 0/2] introduce virtio-ism: internal shared memory device

2022-10-18 Thread Jason Wang
On Wed, Oct 19, 2022 at 10:42 AM Xuan Zhuo  wrote:
>
> On Mon, 17 Oct 2022 16:17:31 +0800, Jason Wang  wrote:
>
>
> Hi Jason,
>
> I think there may be some problems with the direction we are discussing.

Probably not.

As far as we are focusing on technology, there's nothing wrong from my
perspective. And this is how the community works. Your idea needs to
be justified and people are free to raise any technical questions
especially considering you've posted a spec change with prototype
codes but not only the idea.

> Our
> goal is to add a new ism device. As far as the spec is concerned, we are not
> concerned with the implementation of the backend.
>
> The direction we should discuss is what is the difference between the ism 
> device
> and other devices such as virtio-net, and whether it is necessary to introduce
> this new device.

This is somehow what I want to ask, actually it's not a comparison
with virtio-net but:

- virtio-roce
- virtio-vhost-user
- virtio-(p)mem

or whether we can simply add features to those devices to achieve what
you want to do here.

> How to share the backend with other device is another problem.

Yes, anything that is used for your virtio-ism prototype can be used
for other devices.

>
> Our goal is to dynamically obtain a piece of memory to share with other vms.

So at this level, I don't see the exact difference compared to
virtio-vhost-user. Let's just focus on the API that carries on the
semantic:

- map/unmap
- permission update

The only missing piece is the per region notification.
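
Put differently, the semantic being compared here could be captured by an
interface shaped roughly like the one below; it is purely an illustrative
sketch, not an API defined by either proposal.

#include <stddef.h>
#include <stdint.h>

/* Hypothetical common abstraction for shared regions: the map/unmap and
 * permission-update semantics exist in both approaches, while per-region
 * notification is the piece virtio-ism adds. */
enum region_perm { REGION_DENY, REGION_RO, REGION_RW };

struct shared_region_ops {
        int (*map)(uint64_t token, void **addr, size_t *len);
        int (*unmap)(uint64_t token);
        int (*set_perm)(uint64_t token, enum region_perm perm);
        int (*notify)(uint64_t token);   /* per-region doorbell/event */
};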

>
> In a connection, this memory will be used repeatedly. As far as SMC is 
> concerned,
> it will use it as a ring. Of course, we also need a notify mechanism.
>
> That's what we're aiming for, so we should first discuss whether this
> requirement is reasonable.

So unless somebody said "no", it is fine until now.

> I think it's a feature currently not supported by
> other devices specified by the current virtio spec.

Probably, but we've already had rfcs for roce and vhost-user.

Thanks

>
> Thanks.
>
>


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



Re: [virtio-dev] [PATCH 0/2] introduce virtio-ism: internal shared memory device

2022-10-18 Thread Jason Wang



On 2022/10/18 16:33, Gerry wrote:




On Oct 18, 2022, at 14:54, Jason Wang wrote:

On Mon, Oct 17, 2022 at 8:31 PM Xuan Zhuo 
 wrote:


On Mon, 17 Oct 2022 16:17:31 +0800, Jason Wang  
wrote:

Adding Stefan.


On Mon, Oct 17, 2022 at 3:47 PM Xuan Zhuo 
 wrote:


Hello everyone,

# Background

Nowadays, there is a common scenario to accelerate communication 
between
different VMs and containers, including light weight virtual 
machine based
containers. One way to achieve this is to colocate them on the 
same host.
However, the performance of inter-VM communication through network 
stack is not
optimal and may also waste extra CPU cycles. This scenario has 
been discussed

many times, but still no generic solution available [1] [2] [3].

With pci-ivshmem + SMC(Shared Memory Communications: [4]) based 
PoC[5],
We found that by changing the communication channel between VMs 
from TCP to SMC

with shared memory, we can achieve superior performance for a common
socket-based application[5]:
 - latency reduced by about 50%
 - throughput increased by about 300%
 - CPU consumption reduced by about 50%

Since there is no particularly suitable shared memory management 
solution
matches the need for SMC(See ## Comparison with existing 
technology), and virtio
is the standard for communication in the virtualization world, we 
want to
implement a virtio-ism device based on virtio, which can support 
on-demand
memory sharing across VMs, containers or VM-container. To match 
the needs of SMC,

the virtio-ism device need to support:

1. Dynamic provision: shared memory regions are dynamically 
allocated and

  provisioned.
2. Multi-region management: the shared memory is divided into regions,
  and a peer may allocate one or more regions from the same shared 
memory

  device.
3. Permission control: The permission of each region can be set 
seperately.


Looks like virtio-ROCE

https://lore.kernel.org/all/20220511095900.343-1-xieyon...@bytedance.com/T/

and virtio-vhost-user can satisfy the requirement?



# Virtio ism device

ISM devices provide the ability to share memory between different 
guests on a
host. A guest's memory got from ism device can be shared with 
multiple peers at
the same time. This shared relationship can be dynamically created 
and released.


The shared memory obtained from the device is divided into 
multiple ism regions
for share. ISM device provides a mechanism to notify other ism 
region referrers

of content update events.

# Usage (SMC as example)

Maybe there is one of possible use cases:

1. SMC calls the interface ism_alloc_region() of the ism driver to 
return the

  location of a memory region in the PCI space and a token.
2. The ism driver mmap the memory region and return to SMC with 
the token

3. SMC passes the token to the connected peer
3. the peer calls the ism driver interface ism_attach_region(token) to
  get the location of the PCI space of the shared memory


# About hot plugging of the ism device

  Hot plugging of devices is a heavier, possibly failed, 
time-consuming, and

  less scalable operation. So, we don't plan to support it for now.

# Comparison with existing technology

## ivshmem or ivshmem 2.0 of Qemu

  1. ivshmem 1.0 is a large piece of memory that can be seen by 
all devices that

  use this VM, so the security is not enough.

  2. ivshmem 2.0 is a shared memory belonging to a VM that can be 
read-only by all
  other VMs that use the ivshmem 2.0 shared memory device, which 
also does not

  meet our needs in terms of security.

## vhost-pci and virtiovhostuser

  Does not support dynamic allocation and therefore not suitable 
for SMC.


I think this is an implementation issue, we can support VHOST IOTLB
message then the regions could be added/removed on demand.



1. After the attacker connects with the victim, if the attacker does not
  dereference memory, the memory will be occupied under 
virtiovhostuser. In the
  case of ism devices, the victim can directly release the 
reference, and the

  maliciously referenced region only occupies the attacker's resources


Let's define the security boundary here. E.g. do we trust the device or
not? If yes, in the case of virtiovhostuser, can we simply do
VHOST_IOTLB_UNMAP so that we can safely release the memory from the
attacker.

Thanks, Jason:)
In our design, there are several roles involved:
1) a virtio-ism-smc front-end driver
2) a virtio-ism backend device driver and its associated vmm
3) a global device manager
4) a group of remote/peer virtio-ism backend devices/vmms
5) a group of remote/peer virtio-ism-smc front-end drivers

Among these, we treat 1, 2 and 3 as trusted, and 4 and 5 as untrusted.



It looks to me like VIRTIO_ISM_PERM_MANAGE violates what you've described
here. E.g. what happens if 1 grants this permission to 5?



Because 4 and 5 are untrusted, we can't guarantee that IOTLB Invalidate
requests have been executed as expected.



Interesting, I wonder how this is guaranteed by ISM. Anything that can 
work for ISM but 

Re: [virtio-dev] [PATCH 0/2] introduce virtio-ism: internal shared memory device

2022-10-18 Thread Xuan Zhuo
On Mon, 17 Oct 2022 16:17:31 +0800, Jason Wang  wrote:


Hi Jason,

I think there may be some problems with the direction we are discussing. Our
goal is to add a new ism device. As far as the spec is concerned, we are not
concerned with the implementation of the backend.

The direction we should discuss is what is the difference between the ism device
and other devices such as virtio-net, and whether it is necessary to introduce
this new device. How to share the backend with other device is another problem.

Our goal is to dynamically obtain a piece of memory to share with other vms.

In a connection, this memory will be used repeatedly. As far as SMC is
concerned, it will use it as a ring. Of course, we also need a notify
mechanism.

That's what we're aiming for, so we should first discuss whether this
requirement is reasonable. I think it's a feature currently not supported by
other devices specified by the current virtio spec.

Thanks.



-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



Re: [virtio-dev] [PATCH 0/2] introduce virtio-ism: internal shared memory device

2022-10-18 Thread Jan Kiszka
On 17.10.22 09:47, Xuan Zhuo wrote:
> Hello everyone,
> 
> # Background
> 
> Nowadays, there is a common scenario to accelerate communication between
> different VMs and containers, including light weight virtual machine based
> containers. One way to achieve this is to colocate them on the same host.
> However, the performance of inter-VM communication through network stack is 
> not
> optimal and may also waste extra CPU cycles. This scenario has been discussed
> many times, but still no generic solution available [1] [2] [3].
> 
> With pci-ivshmem + SMC(Shared Memory Communications: [4]) based PoC[5],
> We found that by changing the communication channel between VMs from TCP to 
> SMC
> with shared memory, we can achieve superior performance for a common
> socket-based application[5]:
>   - latency reduced by about 50%
>   - throughput increased by about 300%
>   - CPU consumption reduced by about 50%
> 
> Since there is no particularly suitable shared memory management solution
> matches the need for SMC(See ## Comparison with existing technology), and 
> virtio
> is the standard for communication in the virtualization world, we want to
> implement a virtio-ism device based on virtio, which can support on-demand
> memory sharing across VMs, containers or VM-container. To match the needs of 
> SMC,
> the virtio-ism device need to support:
> 
> 1. Dynamic provision: shared memory regions are dynamically allocated and
>provisioned.
> 2. Multi-region management: the shared memory is divided into regions,
>and a peer may allocate one or more regions from the same shared memory
>device.
> 3. Permission control: The permission of each region can be set separately.
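
As a purely illustrative sketch of how these three requirements might
surface on a control queue: VIRTIO_ISM_CTRL_ALLOC is a name that comes up
later in this thread, while every other name and the layout here are
hypothetical.

#include <stdint.h>

/* Hypothetical command set; only VIRTIO_ISM_CTRL_ALLOC is a name actually
 * used elsewhere in this discussion. */
#define VIRTIO_ISM_CTRL_ALLOC   0   /* dynamic provision of one region       */
#define VIRTIO_ISM_CTRL_ATTACH  1   /* attach to an existing region by token */
#define VIRTIO_ISM_CTRL_DETACH  2   /* drop this guest's reference           */
#define VIRTIO_ISM_CTRL_PERM    3   /* per-region permission control         */

struct virtio_ism_ctrl_alloc_sketch {
        uint64_t size;    /* in:  requested region size                      */
        uint64_t token;   /* out: token the allocating guest hands to peers  */
        uint64_t offset;  /* out: region location within the device memory   */
};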
> 
> # Virtio ism device
> 
> ISM devices provide the ability to share memory between different guests on a
> host. A guest's memory got from ism device can be shared with multiple peers 
> at
> the same time. This shared relationship can be dynamically created and 
> released.
> 
> The shared memory obtained from the device is divided into multiple ism 
> regions
> for share. ISM device provides a mechanism to notify other ism region 
> referrers
> of content update events.
> 
> # Usage (SMC as example)
> 
> Maybe there is one of possible use cases:
> 
> 1. SMC calls the interface ism_alloc_region() of the ism driver to return the
>location of a memory region in the PCI space and a token.
> 2. The ism driver mmap the memory region and return to SMC with the token
> 3. SMC passes the token to the connected peer
> 4. the peer calls the ism driver interface ism_attach_region(token) to
>get the location of the PCI space of the shared memory
> 
> 
> # About hot plugging of the ism device
> 
>Hot plugging of devices is a heavier, possibly failed, time-consuming, and
>less scalable operation. So, we don't plan to support it for now.
> 
> # Comparison with existing technology
> 
> ## ivshmem or ivshmem 2.0 of Qemu
> 
>1. ivshmem 1.0 is a large piece of memory that can be seen by all devices 
> that
>use this VM, so the security is not enough.
> 
>2. ivshmem 2.0 is a shared memory belonging to a VM that can be read-only 
> by all
>other VMs that use the ivshmem 2.0 shared memory device, which also does 
> not
>meet our needs in terms of security.

This is addressed by establishing separate links between VMs (modeled
with separate devices). That is a trade-off between simplicity of the
model and convenience, for sure.

Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux


-
To unsubscribe, e-mail: virtio-dev-unsubscr...@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-h...@lists.oasis-open.org



Re: [virtio-dev] [PATCH 0/2] introduce virtio-ism: internal shared memory device

2022-10-18 Thread Jason Wang



On 2022/10/18 11:15, dust.li wrote:

On Mon, Oct 17, 2022 at 04:17:31PM +0800, Jason Wang wrote:

Adding Stefan.


On Mon, Oct 17, 2022 at 3:47 PM Xuan Zhuo  wrote:

Hello everyone,

# Background

Nowadays, there is a common scenario to accelerate communication between
different VMs and containers, including light weight virtual machine based
containers. One way to achieve this is to colocate them on the same host.
However, the performance of inter-VM communication through network stack is not
optimal and may also waste extra CPU cycles. This scenario has been discussed
many times, but still no generic solution available [1] [2] [3].

With pci-ivshmem + SMC(Shared Memory Communications: [4]) based PoC[5],
We found that by changing the communication channel between VMs from TCP to SMC
with shared memory, we can achieve superior performance for a common
socket-based application[5]:
   - latency reduced by about 50%
   - throughput increased by about 300%
   - CPU consumption reduced by about 50%

Since there is no particularly suitable shared memory management solution
matches the need for SMC(See ## Comparison with existing technology), and virtio
is the standard for communication in the virtualization world, we want to
implement a virtio-ism device based on virtio, which can support on-demand
memory sharing across VMs, containers or VM-container. To match the needs
of SMC, the virtio-ism device needs to support:

1. Dynamic provision: shared memory regions are dynamically allocated and
provisioned.
2. Multi-region management: the shared memory is divided into regions,
and a peer may allocate one or more regions from the same shared memory
device.
3. Permission control: The permission of each region can be set separately.

Looks like virtio-ROCE

https://lore.kernel.org/all/20220511095900.343-1-xieyon...@bytedance.com/T/

Thanks for your reply!

Yes, RoCE is OK for SMC and can support all those features.
And SMC already supports RoCE now.

The biggest advantage of virtio-ism compared to RoCE is performance
when 2 VMs are on the same host. With RoCE, the RDMA device still needs
to do a memory copy to transfer the data from one VM to another, regardless
of whether the device is implemented in software or hardware.
But with the virtio-ism device, the memory can be truly shared between
2 VMs, and no memory copy is needed in the datapath.



Adding Yong Ji for more thoughts.






and virtio-vhost-user can satisfy the requirement?

XuanZhuo has already listed the reasons, but I want to say something
more about that.

We thought about virtio-vhost-user before, and I think the biggest
difference between virtio-vhost-user and the virtio-ism device is where
the shared memory comes from.

IIUC, with virtio-vhost-user, the shared memory belongs to the front-end
VM and is mapped to the backend VM. But with the virtio-ism device, the
shared memory comes from the device and is mapped to both VMs.



It doesn't differ from the view of the host (qemu)? Even if it does, it
should not be hard to mandate that virtio-vhost-user use memory belonging
to the device.





So, with virtio-vhost-user, if the front-end VM wants to disconnect from
the back-end VM, it has no way to do it. If the front-end VM has
disconnected and released its reference to the shared memory, but the
back-end VM didn't (intentionally or unintentionally), the front-end VM
cannot reuse that memory.



This can be mandated by the hypervisor (Qemu), can't it?

Thanks



This creates a big security hole.

With virtio-ism, we can avoid that by using a backend server to account
for the shared memory usage of each VM. Since the shared memory belongs
to the device, any VM that has released its reference to the shared
memory will no longer be accounted for it, and thus can allocate new
memory from the device.

Thanks.


# Virtio ism device

ISM devices provide the ability to share memory between different guests on a
host. A guest's memory got from ism device can be shared with multiple peers at
the same time. This shared relationship can be dynamically created and released.

The shared memory obtained from the device is divided into multiple ism regions
for share. ISM device provides a mechanism to notify other ism region referrers
of content update events.

# Usage (SMC as example)

Maybe there is one of possible use cases:

1. SMC calls the interface ism_alloc_region() of the ism driver to return the
location of a memory region in the PCI space and a token.
2. The ism driver mmap the memory region and return to SMC with the token
3. SMC passes the token to the connected peer
4. the peer calls the ism driver interface ism_attach_region(token) to
get the location of the PCI space of the shared memory


# About hot plugging of the ism device

Hot plugging of devices is a heavier, possibly failed, time-consuming, and
less scalable operation. So, we don't plan to support it for now.

# Comparison with existing technology

## ivshmem or ivshmem 2.0 of Qemu

1. ivshmem 1.0 is a large piece of 

Re: [virtio-dev] [PATCH 0/2] introduce virtio-ism: internal shared memory device

2022-10-18 Thread Jason Wang
On Mon, Oct 17, 2022 at 8:31 PM Xuan Zhuo  wrote:
>
> On Mon, 17 Oct 2022 16:17:31 +0800, Jason Wang  wrote:
> > Adding Stefan.
> >
> >
> > On Mon, Oct 17, 2022 at 3:47 PM Xuan Zhuo  
> > wrote:
> > >
> > > Hello everyone,
> > >
> > > # Background
> > >
> > > Nowadays, there is a common scenario to accelerate communication between
> > > different VMs and containers, including light weight virtual machine based
> > > containers. One way to achieve this is to colocate them on the same host.
> > > However, the performance of inter-VM communication through network stack 
> > > is not
> > > optimal and may also waste extra CPU cycles. This scenario has been 
> > > discussed
> > > many times, but still no generic solution available [1] [2] [3].
> > >
> > > With pci-ivshmem + SMC(Shared Memory Communications: [4]) based PoC[5],
> > > We found that by changing the communication channel between VMs from TCP 
> > > to SMC
> > > with shared memory, we can achieve superior performance for a common
> > > socket-based application[5]:
> > >   - latency reduced by about 50%
> > >   - throughput increased by about 300%
> > >   - CPU consumption reduced by about 50%
> > >
> > > Since there is no particularly suitable shared memory management solution
> > > matches the need for SMC(See ## Comparison with existing technology), and 
> > > virtio
> > > is the standard for communication in the virtualization world, we want to
> > > implement a virtio-ism device based on virtio, which can support on-demand
> > > memory sharing across VMs, containers or VM-container. To match the needs 
> > > of SMC,
> > > the virtio-ism device need to support:
> > >
> > > 1. Dynamic provision: shared memory regions are dynamically allocated and
> > >provisioned.
> > > 2. Multi-region management: the shared memory is divided into regions,
> > >and a peer may allocate one or more regions from the same shared memory
> > >device.
> > > 3. Permission control: The permission of each region can be set 
> > > separately.
> >
> > Looks like virtio-ROCE
> >
> > https://lore.kernel.org/all/20220511095900.343-1-xieyon...@bytedance.com/T/
> >
> > and virtio-vhost-user can satisfy the requirement?
> >
> > >
> > > # Virtio ism device
> > >
> > > ISM devices provide the ability to share memory between different guests 
> > > on a
> > > host. A guest's memory got from ism device can be shared with multiple 
> > > peers at
> > > the same time. This shared relationship can be dynamically created and 
> > > released.
> > >
> > > The shared memory obtained from the device is divided into multiple ism 
> > > regions
> > > for share. ISM device provides a mechanism to notify other ism region 
> > > referrers
> > > of content update events.
> > >
> > > # Usage (SMC as example)
> > >
> > > Maybe there is one of possible use cases:
> > >
> > > 1. SMC calls the interface ism_alloc_region() of the ism driver to return 
> > > the
> > >location of a memory region in the PCI space and a token.
> > > 2. The ism driver mmap the memory region and return to SMC with the token
> > > 3. SMC passes the token to the connected peer
> > > 4. the peer calls the ism driver interface ism_attach_region(token) to
> > >get the location of the PCI space of the shared memory
> > >
> > >
> > > # About hot plugging of the ism device
> > >
> > >Hot plugging of devices is a heavier, possibly failed, time-consuming, 
> > > and
> > >less scalable operation. So, we don't plan to support it for now.
> > >
> > > # Comparison with existing technology
> > >
> > > ## ivshmem or ivshmem 2.0 of Qemu
> > >
> > >1. ivshmem 1.0 is a large piece of memory that can be seen by all 
> > > devices that
> > >use this VM, so the security is not enough.
> > >
> > >2. ivshmem 2.0 is a shared memory belonging to a VM that can be 
> > > read-only by all
> > >other VMs that use the ivshmem 2.0 shared memory device, which also 
> > > does not
> > >meet our needs in terms of security.
> > >
> > > ## vhost-pci and virtiovhostuser
> > >
> > >Does not support dynamic allocation and therefore not suitable for SMC.
> >
> > I think this is an implementation issue, we can support VHOST IOTLB
> > message then the regions could be added/removed on demand.
>
>
> 1. After the attacker connects with the victim, if the attacker does not
>dereference memory, the memory will be occupied under virtiovhostuser. In 
> the
>case of ism devices, the victim can directly release the reference, and the
>maliciously referenced region only occupies the attacker's resources

Let's define the security boundary here. E.g. do we trust the device or
not? If yes, in the case of virtiovhostuser, can we simply do
VHOST_IOTLB_UNMAP so that we can safely release the memory from the
attacker.

>
> 2. The ism device of a VM can be shared with multiple (1000+) VMs at the same
>time, which is a challenge for virtiovhostuser

Please elaborate more on the challenges, anything make
virtiovhostuser