Re: [PATCH v7 11/12] vduse: Introduce VDUSE - vDPA Device in Userspace

2021-05-27 Thread Jason Wang


在 2021/5/28 上午11:54, Yongji Xie 写道:

On Fri, May 28, 2021 at 9:33 AM Jason Wang  wrote:


在 2021/5/27 下午6:14, Yongji Xie 写道:

On Thu, May 27, 2021 at 4:43 PM Jason Wang  wrote:

在 2021/5/27 下午4:41, Jason Wang 写道:

在 2021/5/27 下午3:34, Yongji Xie 写道:

On Thu, May 27, 2021 at 1:40 PM Jason Wang  wrote:

在 2021/5/27 下午1:08, Yongji Xie 写道:

On Thu, May 27, 2021 at 1:00 PM Jason Wang 
wrote:

在 2021/5/27 下午12:57, Yongji Xie 写道:

On Thu, May 27, 2021 at 12:13 PM Jason Wang 
wrote:

在 2021/5/17 下午5:55, Xie Yongji 写道:

+
+static int vduse_dev_msg_sync(struct vduse_dev *dev,
+   struct vduse_dev_msg *msg)
+{
+ init_waitqueue_head(&msg->waitq);
+ spin_lock(&dev->msg_lock);
+ vduse_enqueue_msg(&dev->send_list, msg);
+ wake_up(&dev->waitq);
+ spin_unlock(&dev->msg_lock);
+ wait_event_killable(msg->waitq, msg->completed);

What happens if the userspace(malicous) doesn't give a response
forever?

It looks like a DOS. If yes, we need to consider a way to fix that.


How about using wait_event_killable_timeout() instead?

Probably, and then we need choose a suitable timeout and more
important,
need to report the failure to virtio.


Makes sense to me. But it looks like some
vdpa_config_ops/virtio_config_ops such as set_status() didn't have a
return value.  Now I add a WARN_ON() for the failure. Do you mean we
need to add some change for virtio core to handle the failure?

Maybe, but I'm not sure how hard we can do that.


We need to change all virtio device drivers in this way.

Probably.



We had NEEDS_RESET but it looks we don't implement it.


Could it handle the failure of get_feature() and get/set_config()?

Looks not:

"

The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state
that a reset is needed. If DRIVER_OK is set, after it sets
DEVICE_NEEDS_RESET, the device MUST send a device configuration change
notification to the driver.

"

This looks implies that NEEDS_RESET may only work after device is
probed. But in the current design, even the reset() is not reliable.



Or a rough idea is that maybe need some relaxing to be coupled loosely
with userspace. E.g the device (control path) is implemented in the
kernel but the datapath is implemented in the userspace like TUN/TAP.


I think it can work for most cases. One problem is that the set_config
might change the behavior of the data path at runtime, e.g.
virtnet_set_mac_address() in the virtio-net driver and
cache_type_store() in the virtio-blk driver. Not sure if this path is
able to return before the datapath is aware of this change.

Good point.

But set_config() should be rare:

E.g in the case of virtio-net with VERSION_1, config space is read
only, and it was set via control vq.

For block, we can

1) start from without WCE or
2) we add a config change notification to userspace or
3) extend the spec to use vq instead of config space

Thanks

Another thing if we want to go this way:

We need find a way to terminate the data path from the kernel side, to
implement to reset semantic.


Do you mean terminate the data path in vdpa_reset().


Yes.



   Is it ok to just
notify userspace to stop data path asynchronously?


For well-behaved userspace, yes but no for buggy or malicious ones.


But the buggy or malicious daemons can't do anything if my
understanding is correct.



You're right. I originally thought there can still have bouncing. But 
consider we don't do that during fault.


It should be safe.





I had an idea, how about terminate IOTLB in this case? Then we're in
fact turn datapath off.


Sorry, I didn't get your point here. What do you mean by terminating
IOTLB?



I meant terminate the bouncing but it looks safe after a second thought :)

Thanks



  Remove iotlb mapping? But userspace can still access the mapped
region.

Thanks,
Yongji



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[RFC v4] virtio-vsock: add description for datagram type

2021-05-27 Thread Jiang Wang
From: "jiang.wang" 

Add supports for datagram type for virtio-vsock. Datagram
sockets are connectionless and unreliable. To avoid contention
with stream and other sockets, add two more virtqueues and
a new feature bit to identify if those two new queues exist or not.

Also add descriptions for resource management of datagram, which
does not use the existing credit update mechanism associated with
stream sockets.

Signed-off-by: Jiang Wang 
---

V2: addressed the comments for the previous version.
V3: add description for the mergeable receive buffer.
V4: add a feature bit for stream and reserver a bit for seqpacket.
Fix mrg_rxbuf related sentences.

 virtio-vsock.tex | 155 ++-
 1 file changed, 142 insertions(+), 13 deletions(-)

diff --git a/virtio-vsock.tex b/virtio-vsock.tex
index da7e641..bacac3c 100644
--- a/virtio-vsock.tex
+++ b/virtio-vsock.tex
@@ -9,14 +9,41 @@ \subsection{Device ID}\label{sec:Device Types / Socket Device 
/ Device ID}
 
 \subsection{Virtqueues}\label{sec:Device Types / Socket Device / Virtqueues}
 \begin{description}
-\item[0] rx
-\item[1] tx
+\item[0] stream rx
+\item[1] stream tx
+\item[2] datagram rx
+\item[3] datagram tx
+\item[4] event
+\end{description}
+The virtio socket device uses 5 queues if feature bit VIRTIO_VSOCK_F_DRGAM is 
set. Otherwise, it
+only uses 3 queues, as the following.
+
+\begin{description}
+\item[0] stream rx
+\item[1] stream tx
 \item[2] event
 \end{description}
 
+When behavior differs between stream and datagram rx/tx virtqueues
+their full names are used. Common behavior is simply described in
+terms of rx/tx virtqueues and applies to both stream and datagram
+virtqueues.
+
 \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature 
bits}
 
-There are currently no feature bits defined for this device.
+\begin{description}
+\item[VIRTIO_VSOCK_F_STREAM (0)] Device has support for stream socket type.
+\end{description}
+
+\begin{description}
+\item[VIRTIO_VSOCK_F_DGRAM (2)] Device has support for datagram socket type.
+\end{description}
+
+\begin{description}
+\item[VIRTIO_VSOCK_F_MRG_RXBUF (3)] Driver can merge receive buffers.
+\end{description}
+
+If no feature bits are defined, then assume only VIRTIO_VSOCK_F_STREAM is set.
 
 \subsection{Device configuration layout}\label{sec:Device Types / Socket 
Device / Device configuration layout}
 
@@ -64,6 +91,8 @@ \subsection{Device Operation}\label{sec:Device Types / Socket 
Device / Device Op
 
 Packets transmitted or received contain a header before the payload:
 
+If feature VIRTIO_VSOCK_F_MRG_RXBUF is not negotiated, use the following 
header.
+
 \begin{lstlisting}
 struct virtio_vsock_hdr {
le64 src_cid;
@@ -79,6 +108,15 @@ \subsection{Device Operation}\label{sec:Device Types / 
Socket Device / Device Op
 };
 \end{lstlisting}
 
+If feature VIRTIO_VSOCK_F_MRG_RXBUF is negotiated, use the following header.
+\begin{lstlisting}
+struct virtio_vsock_hdr_mrg_rxbuf {
+   struct virtio_vsock_hdr hdr;
+   le16 num_buffers;
+};
+\end{lstlisting}
+
+
 The upper 32 bits of src_cid and dst_cid are reserved and zeroed.
 
 Most packets simply transfer data but control packets are also used for
@@ -107,6 +145,9 @@ \subsection{Device Operation}\label{sec:Device Types / 
Socket Device / Device Op
 
 \subsubsection{Virtqueue Flow Control}\label{sec:Device Types / Socket Device 
/ Device Operation / Virtqueue Flow Control}
 
+Flow control applies to stream sockets; datagram sockets do not have
+flow control.
+
 The tx virtqueue carries packets initiated by applications and replies to
 received packets.  The rx virtqueue carries packets initiated by the device and
 replies to previously transmitted packets.
@@ -140,12 +181,15 @@ \subsubsection{Addressing}\label{sec:Device Types / 
Socket Device / Device Opera
 consists of a (cid, port number) tuple. The header fields used for this are
 \field{src_cid}, \field{src_port}, \field{dst_cid}, and \field{dst_port}.
 
-Currently only stream sockets are supported. \field{type} is 1 for stream
-socket types.
+Currently stream and datagram (dgram) sockets are supported. \field{type} is 1 
for stream
+socket types. \field{type} is 3 for dgram socket types.
 
 Stream sockets provide in-order, guaranteed, connection-oriented delivery
 without message boundaries.
 
+Datagram sockets provide unordered, unreliable, connectionless messages 
+with message boundaries and a maximum length.
+
 \subsubsection{Buffer Space Management}\label{sec:Device Types / Socket Device 
/ Device Operation / Buffer Space Management}
 \field{buf_alloc} and \field{fwd_cnt} are used for buffer space management of
 stream sockets. The guest and the device publish how much buffer space is
@@ -162,7 +206,7 @@ \subsubsection{Buffer Space Management}\label{sec:Device 
Types / Socket Device /
 u32 peer_free = peer_buf_alloc - (tx_cnt - peer_fwd_cnt);
 \end{lstlisting}
 
-If there is insufficient buffer space, the sender waits until v

Re: Re: [virtio-comment] [RFC v3] virtio-vsock: add description for datagram type

2021-05-27 Thread Jiang Wang .
On Thu, May 27, 2021 at 6:21 AM Stefano Garzarella  wrote:
>
> Re-send my thoughts on this new series...
>
> On Wed, May 26, 2021 at 05:50:35PM +, Jiang Wang wrote:
> >From: "jiang.wang" 
> >
> >Add supports for datagram type for virtio-vsock. Datagram
> >sockets are connectionless and unreliable. To avoid contention
> >with stream and other sockets, add two more virtqueues and
> >a new feature bit to identify if those two new queues exist or not.
> >
> >Also add descriptions for resource management of datagram, which
> >does not use the existing credit update mechanism associated with
> >stream sockets.
> >
> >Signed-off-by: Jiang Wang 
> >---
> >V2: addressed the comments for the previous version.
> >V3: add description for the mergeable receive buffer.
> >
> >btw: send the same patch again to include virtio-comment@
> >
> > virtio-vsock.tex | 137 
> > +--
> > 1 file changed, 124 insertions(+), 13 deletions(-)
> >
> >diff --git a/virtio-vsock.tex b/virtio-vsock.tex
> >index da7e641..7eb3596 100644
> >--- a/virtio-vsock.tex
> >+++ b/virtio-vsock.tex
> >@@ -9,14 +9,36 @@ \subsection{Device ID}\label{sec:Device Types / Socket 
> >Device / Device ID}
> >
> > \subsection{Virtqueues}\label{sec:Device Types / Socket Device / Virtqueues}
> > \begin{description}
> >-\item[0] rx
> >-\item[1] tx
> >+\item[0] stream rx
> >+\item[1] stream tx
> >+\item[2] datagram rx
> >+\item[3] datagram tx
> >+\item[4] event
> >+\end{description}
> >+The virtio socket device uses 5 queues if feature bit VIRTIO_VSOCK_F_DRGAM 
> >is set. Otherwise, it
> >+only uses 3 queues, as the following.
> >+
> >+\begin{description}
> >+\item[0] stream rx
> >+\item[1] stream tx
> > \item[2] event
> > \end{description}
> >
> >+When behavior differs between stream and datagram rx/tx virtqueues
> >+their full names are used. Common behavior is simply described in
> >+terms of rx/tx virtqueues and applies to both stream and datagram
> >+virtqueues.
> >+
> > \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature 
> > bits}
> >
> >-There are currently no feature bits defined for this device.
> >+\begin{description}
> >+\item[VIRTIO_VSOCK_F_DGRAM (0)] Device has support for datagram socket type.
> >+\end{description}
>
> As suggested by Michael here [1] we should add also a feature bit for
> stream, and maybe is better to reserve bit 0 for it.
>
> Arseny already sent an implementation of SEQPACKET using bit 1 for
> VIRTIO_VSOCK_F_SEQPACKET, so I think we can use bit 2 for DGRAM and bit
> 3 for MRG_RXBUF.
>
> [1]
> https://lists.oasis-open.org/archives/virtio-comment/202104/msg00016.html

Sure.

> >+
> >+\begin{description}
> >+\item[VIRTIO_VSOCK_F_MRG_RXBUF (1)] Driver can merge receive buffers.
> >+\end{description}
> >+
> >
> > \subsection{Device configuration layout}\label{sec:Device Types / Socket 
> > Device / Device configuration layout}
> >
> >@@ -64,6 +86,8 @@ \subsection{Device Operation}\label{sec:Device Types / 
> >Socket Device / Device Op
> >
> > Packets transmitted or received contain a header before the payload:
> >
> >+If feature VIRTIO_VSOCK_F_MRG_RXBUF is not negotiated, use the following 
> >header.
> >+
> > \begin{lstlisting}
> > struct virtio_vsock_hdr {
> >   le64 src_cid;
> >@@ -79,6 +103,15 @@ \subsection{Device Operation}\label{sec:Device Types / 
> >Socket Device / Device Op
> > };
> > \end{lstlisting}
> >
> >+If feature VIRTIO_VSOCK_F_MRG_RXBUF is negotianted, use the following 
> >header.
> >+\begin{lstlisting}
> >+struct virtio_vsock_hdr_mrg_rxbuf {
> >+  struct virtio_vsock_hdr hdr;
> >+  le16 num_buffers;
> >+};
> >+\end{lstlisting}
> >+
> >+
> > The upper 32 bits of src_cid and dst_cid are reserved and zeroed.
> >
> > Most packets simply transfer data but control packets are also used for
> >@@ -107,6 +140,9 @@ \subsection{Device Operation}\label{sec:Device Types / 
> >Socket Device / Device Op
> >
> > \subsubsection{Virtqueue Flow Control}\label{sec:Device Types / Socket 
> > Device / Device Operation / Virtqueue Flow Control}
> >
> >+Flow control applies to stream sockets; datagram sockets do not have
> >+flow control.
> >+
> > The tx virtqueue carries packets initiated by applications and replies to
> > received packets.  The rx virtqueue carries packets initiated by the device 
> > and
> > replies to previously transmitted packets.
> >@@ -140,12 +176,15 @@ \subsubsection{Addressing}\label{sec:Device Types / 
> >Socket Device / Device Opera
> > consists of a (cid, port number) tuple. The header fields used for this are
> > \field{src_cid}, \field{src_port}, \field{dst_cid}, and \field{dst_port}.
> >
> >-Currently only stream sockets are supported. \field{type} is 1 for stream
> >-socket types.
> >+Currently stream and datagram (dgram) sockets are supported. \field{type} 
> >is 1 for stream
> >+socket types. \field{type} is 3 for dgram socket types.
> >
> > Stream sockets provide in-order, guaranteed, connection-oriented delivery
> 

Re: [PATCH v7 11/12] vduse: Introduce VDUSE - vDPA Device in Userspace

2021-05-27 Thread Jason Wang


在 2021/5/27 下午9:17, Yongji Xie 写道:

On Thu, May 27, 2021 at 4:41 PM Jason Wang  wrote:


在 2021/5/27 下午3:34, Yongji Xie 写道:

On Thu, May 27, 2021 at 1:40 PM Jason Wang  wrote:

在 2021/5/27 下午1:08, Yongji Xie 写道:

On Thu, May 27, 2021 at 1:00 PM Jason Wang  wrote:

在 2021/5/27 下午12:57, Yongji Xie 写道:

On Thu, May 27, 2021 at 12:13 PM Jason Wang  wrote:

在 2021/5/17 下午5:55, Xie Yongji 写道:

+
+static int vduse_dev_msg_sync(struct vduse_dev *dev,
+   struct vduse_dev_msg *msg)
+{
+ init_waitqueue_head(&msg->waitq);
+ spin_lock(&dev->msg_lock);
+ vduse_enqueue_msg(&dev->send_list, msg);
+ wake_up(&dev->waitq);
+ spin_unlock(&dev->msg_lock);
+ wait_event_killable(msg->waitq, msg->completed);

What happens if the userspace(malicous) doesn't give a response forever?

It looks like a DOS. If yes, we need to consider a way to fix that.


How about using wait_event_killable_timeout() instead?

Probably, and then we need choose a suitable timeout and more important,
need to report the failure to virtio.


Makes sense to me. But it looks like some
vdpa_config_ops/virtio_config_ops such as set_status() didn't have a
return value.  Now I add a WARN_ON() for the failure. Do you mean we
need to add some change for virtio core to handle the failure?

Maybe, but I'm not sure how hard we can do that.


We need to change all virtio device drivers in this way.


Probably.



We had NEEDS_RESET but it looks we don't implement it.


Could it handle the failure of get_feature() and get/set_config()?


Looks not:

"

The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state
that a reset is needed. If DRIVER_OK is set, after it sets
DEVICE_NEEDS_RESET, the device MUST send a device configuration change
notification to the driver.

"

This looks implies that NEEDS_RESET may only work after device is
probed. But in the current design, even the reset() is not reliable.



Or a rough idea is that maybe need some relaxing to be coupled loosely
with userspace. E.g the device (control path) is implemented in the
kernel but the datapath is implemented in the userspace like TUN/TAP.


I think it can work for most cases. One problem is that the set_config
might change the behavior of the data path at runtime, e.g.
virtnet_set_mac_address() in the virtio-net driver and
cache_type_store() in the virtio-blk driver. Not sure if this path is
able to return before the datapath is aware of this change.


Good point.

But set_config() should be rare:

E.g in the case of virtio-net with VERSION_1, config space is read only,
and it was set via control vq.

For block, we can

1) start from without WCE or
2) we add a config change notification to userspace or

I prefer this way. And I think we also need to do similar things for
set/get_vq_state().



Yes, I agree.

Thanks




Thanks,
Yongji



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v7 11/12] vduse: Introduce VDUSE - vDPA Device in Userspace

2021-05-27 Thread Jason Wang


在 2021/5/27 下午6:14, Yongji Xie 写道:

On Thu, May 27, 2021 at 4:43 PM Jason Wang  wrote:


在 2021/5/27 下午4:41, Jason Wang 写道:

在 2021/5/27 下午3:34, Yongji Xie 写道:

On Thu, May 27, 2021 at 1:40 PM Jason Wang  wrote:

在 2021/5/27 下午1:08, Yongji Xie 写道:

On Thu, May 27, 2021 at 1:00 PM Jason Wang 
wrote:

在 2021/5/27 下午12:57, Yongji Xie 写道:

On Thu, May 27, 2021 at 12:13 PM Jason Wang 
wrote:

在 2021/5/17 下午5:55, Xie Yongji 写道:

+
+static int vduse_dev_msg_sync(struct vduse_dev *dev,
+   struct vduse_dev_msg *msg)
+{
+ init_waitqueue_head(&msg->waitq);
+ spin_lock(&dev->msg_lock);
+ vduse_enqueue_msg(&dev->send_list, msg);
+ wake_up(&dev->waitq);
+ spin_unlock(&dev->msg_lock);
+ wait_event_killable(msg->waitq, msg->completed);

What happens if the userspace(malicous) doesn't give a response
forever?

It looks like a DOS. If yes, we need to consider a way to fix that.


How about using wait_event_killable_timeout() instead?

Probably, and then we need choose a suitable timeout and more
important,
need to report the failure to virtio.


Makes sense to me. But it looks like some
vdpa_config_ops/virtio_config_ops such as set_status() didn't have a
return value.  Now I add a WARN_ON() for the failure. Do you mean we
need to add some change for virtio core to handle the failure?

Maybe, but I'm not sure how hard we can do that.


We need to change all virtio device drivers in this way.


Probably.



We had NEEDS_RESET but it looks we don't implement it.


Could it handle the failure of get_feature() and get/set_config()?


Looks not:

"

The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state
that a reset is needed. If DRIVER_OK is set, after it sets
DEVICE_NEEDS_RESET, the device MUST send a device configuration change
notification to the driver.

"

This looks implies that NEEDS_RESET may only work after device is
probed. But in the current design, even the reset() is not reliable.



Or a rough idea is that maybe need some relaxing to be coupled loosely
with userspace. E.g the device (control path) is implemented in the
kernel but the datapath is implemented in the userspace like TUN/TAP.


I think it can work for most cases. One problem is that the set_config
might change the behavior of the data path at runtime, e.g.
virtnet_set_mac_address() in the virtio-net driver and
cache_type_store() in the virtio-blk driver. Not sure if this path is
able to return before the datapath is aware of this change.


Good point.

But set_config() should be rare:

E.g in the case of virtio-net with VERSION_1, config space is read
only, and it was set via control vq.

For block, we can

1) start from without WCE or
2) we add a config change notification to userspace or
3) extend the spec to use vq instead of config space

Thanks


Another thing if we want to go this way:

We need find a way to terminate the data path from the kernel side, to
implement to reset semantic.


Do you mean terminate the data path in vdpa_reset().



Yes.



  Is it ok to just
notify userspace to stop data path asynchronously?



For well-behaved userspace, yes but no for buggy or malicious ones.

I had an idea, how about terminate IOTLB in this case? Then we're in 
fact turn datapath off.


Thanks



  Userspace should
not be able to do any I/O at that time because the iotlb mapping is
already removed.

Thanks,
Yongji



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 5/5] vhost: fix up vhost_work coding style

2021-05-27 Thread Stefano Garzarella

On Tue, May 25, 2021 at 12:47:33PM -0500, Mike Christie wrote:

Switch from a mix of tabs and spaces to just tabs.

Signed-off-by: Mike Christie 
---
drivers/vhost/vhost.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 575c8180caad..7d5306d1229d 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -20,9 +20,9 @@ typedef void (*vhost_work_fn_t)(struct vhost_work *work);

#define VHOST_WORK_QUEUED 1
struct vhost_work {
-   struct llist_node node;
-   vhost_work_fn_t   fn;
-   unsigned long flags;
+   struct llist_node   node;
+   vhost_work_fn_t fn;
+   unsigned long   flags;
};

/* Poll a file (eventfd or socket) */
--
2.25.1



Reviewed-by: Stefano Garzarella 

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/5] vhost: remove work arg from vhost_work_flush

2021-05-27 Thread Stefano Garzarella

On Tue, May 25, 2021 at 12:47:29PM -0500, Mike Christie wrote:

vhost_work_flush doesn't do anything with the work arg. This patch drops
it and then renames vhost_work_flush to vhost_work_dev_flush to reflect
that the function flushes all the works in the dev and not just a
specific queue or work item.

Signed-off-by: Mike Christie 
Acked-by: Jason Wang 
Reviewed-by: Chaitanya Kulkarni 
---
drivers/vhost/scsi.c  | 4 ++--
drivers/vhost/vhost.c | 8 
drivers/vhost/vhost.h | 2 +-
drivers/vhost/vsock.c | 2 +-
4 files changed, 8 insertions(+), 8 deletions(-)


Reviewed-by: Stefano Garzarella 

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [virtio-comment] [RFC v3] virtio-vsock: add description for datagram type

2021-05-27 Thread Stefano Garzarella

Re-send my thoughts on this new series...

On Wed, May 26, 2021 at 05:50:35PM +, Jiang Wang wrote:

From: "jiang.wang" 

Add supports for datagram type for virtio-vsock. Datagram
sockets are connectionless and unreliable. To avoid contention
with stream and other sockets, add two more virtqueues and
a new feature bit to identify if those two new queues exist or not.

Also add descriptions for resource management of datagram, which
does not use the existing credit update mechanism associated with
stream sockets.

Signed-off-by: Jiang Wang 
---
V2: addressed the comments for the previous version.
V3: add description for the mergeable receive buffer.

btw: send the same patch again to include virtio-comment@

virtio-vsock.tex | 137 +--
1 file changed, 124 insertions(+), 13 deletions(-)

diff --git a/virtio-vsock.tex b/virtio-vsock.tex
index da7e641..7eb3596 100644
--- a/virtio-vsock.tex
+++ b/virtio-vsock.tex
@@ -9,14 +9,36 @@ \subsection{Device ID}\label{sec:Device Types / Socket Device 
/ Device ID}

\subsection{Virtqueues}\label{sec:Device Types / Socket Device / Virtqueues}
\begin{description}
-\item[0] rx
-\item[1] tx
+\item[0] stream rx
+\item[1] stream tx
+\item[2] datagram rx
+\item[3] datagram tx
+\item[4] event
+\end{description}
+The virtio socket device uses 5 queues if feature bit VIRTIO_VSOCK_F_DRGAM is 
set. Otherwise, it
+only uses 3 queues, as the following.
+
+\begin{description}
+\item[0] stream rx
+\item[1] stream tx
\item[2] event
\end{description}

+When behavior differs between stream and datagram rx/tx virtqueues
+their full names are used. Common behavior is simply described in
+terms of rx/tx virtqueues and applies to both stream and datagram
+virtqueues.
+
\subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits}

-There are currently no feature bits defined for this device.
+\begin{description}
+\item[VIRTIO_VSOCK_F_DGRAM (0)] Device has support for datagram socket type.
+\end{description}


As suggested by Michael here [1] we should add also a feature bit for
stream, and maybe is better to reserve bit 0 for it.

Arseny already sent an implementation of SEQPACKET using bit 1 for
VIRTIO_VSOCK_F_SEQPACKET, so I think we can use bit 2 for DGRAM and bit
3 for MRG_RXBUF.

[1]
https://lists.oasis-open.org/archives/virtio-comment/202104/msg00016.html


+
+\begin{description}
+\item[VIRTIO_VSOCK_F_MRG_RXBUF (1)] Driver can merge receive buffers.
+\end{description}
+

\subsection{Device configuration layout}\label{sec:Device Types / Socket Device 
/ Device configuration layout}

@@ -64,6 +86,8 @@ \subsection{Device Operation}\label{sec:Device Types / Socket 
Device / Device Op

Packets transmitted or received contain a header before the payload:

+If feature VIRTIO_VSOCK_F_MRG_RXBUF is not negotiated, use the following 
header.
+
\begin{lstlisting}
struct virtio_vsock_hdr {
le64 src_cid;
@@ -79,6 +103,15 @@ \subsection{Device Operation}\label{sec:Device Types / 
Socket Device / Device Op
};
\end{lstlisting}

+If feature VIRTIO_VSOCK_F_MRG_RXBUF is negotianted, use the following header.
+\begin{lstlisting}
+struct virtio_vsock_hdr_mrg_rxbuf {
+   struct virtio_vsock_hdr hdr;
+   le16 num_buffers;
+};
+\end{lstlisting}
+
+
The upper 32 bits of src_cid and dst_cid are reserved and zeroed.

Most packets simply transfer data but control packets are also used for
@@ -107,6 +140,9 @@ \subsection{Device Operation}\label{sec:Device Types / 
Socket Device / Device Op

\subsubsection{Virtqueue Flow Control}\label{sec:Device Types / Socket Device / 
Device Operation / Virtqueue Flow Control}

+Flow control applies to stream sockets; datagram sockets do not have
+flow control.
+
The tx virtqueue carries packets initiated by applications and replies to
received packets.  The rx virtqueue carries packets initiated by the device and
replies to previously transmitted packets.
@@ -140,12 +176,15 @@ \subsubsection{Addressing}\label{sec:Device Types / 
Socket Device / Device Opera
consists of a (cid, port number) tuple. The header fields used for this are
\field{src_cid}, \field{src_port}, \field{dst_cid}, and \field{dst_port}.

-Currently only stream sockets are supported. \field{type} is 1 for stream
-socket types.
+Currently stream and datagram (dgram) sockets are supported. \field{type} is 1 
for stream
+socket types. \field{type} is 3 for dgram socket types.

Stream sockets provide in-order, guaranteed, connection-oriented delivery
without message boundaries.

+Datagram sockets provide unordered, unreliable, connectionless messages
+with message boundaries and a maximum length.
+
\subsubsection{Buffer Space Management}\label{sec:Device Types / Socket Device 
/ Device Operation / Buffer Space Management}
\field{buf_alloc} and \field{fwd_cnt} are used for buffer space management of
stream sockets. The guest and the device publish how much buffer space is
@@ -162,7 +201,7 @@ \subsubsection{Buffer Space Management}

Re: [virtio-comment] [PATCH v6 0/2] virtio-vsock: introduce SOCK_SEQPACKET description

2021-05-27 Thread Stefano Garzarella

On Mon, May 24, 2021 at 09:32:29PM +0300, Arseny Krasnov wrote:

This adds description of SOCK_SEQPACKET type for virtio-vsock.

Here is latest RFC implementation for Linux, with more details:

https://lkml.org/lkml/2021/5/20/2386

Also this patchset has patch which replaces enums to defines in
virtio-vsock part of spec. SOCK_SEQPACKET patch is based on it.

Arseny Krasnov(2):
 virtio-vsock: use C style defines for constants
 virtio-vsock: SOCK_SEQPACKET description

virtio-vsock.tex | 74 --
1 file changed, 46 insertions(+), 28 deletions(-)



The series LGTM!

Let's wait for other comments.

I'll send a follow up patch next week about feature bit for stream 
socket (bit 0).


Thanks,
Stefano

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v7 11/12] vduse: Introduce VDUSE - vDPA Device in Userspace

2021-05-27 Thread Jason Wang


在 2021/5/27 下午4:41, Jason Wang 写道:


在 2021/5/27 下午3:34, Yongji Xie 写道:

On Thu, May 27, 2021 at 1:40 PM Jason Wang  wrote:


在 2021/5/27 下午1:08, Yongji Xie 写道:
On Thu, May 27, 2021 at 1:00 PM Jason Wang  
wrote:

在 2021/5/27 下午12:57, Yongji Xie 写道:
On Thu, May 27, 2021 at 12:13 PM Jason Wang  
wrote:

在 2021/5/17 下午5:55, Xie Yongji 写道:

+
+static int vduse_dev_msg_sync(struct vduse_dev *dev,
+   struct vduse_dev_msg *msg)
+{
+ init_waitqueue_head(&msg->waitq);
+ spin_lock(&dev->msg_lock);
+ vduse_enqueue_msg(&dev->send_list, msg);
+ wake_up(&dev->waitq);
+ spin_unlock(&dev->msg_lock);
+ wait_event_killable(msg->waitq, msg->completed);
What happens if the userspace(malicous) doesn't give a response 
forever?


It looks like a DOS. If yes, we need to consider a way to fix that.


How about using wait_event_killable_timeout() instead?
Probably, and then we need choose a suitable timeout and more 
important,

need to report the failure to virtio.


Makes sense to me. But it looks like some
vdpa_config_ops/virtio_config_ops such as set_status() didn't have a
return value.  Now I add a WARN_ON() for the failure. Do you mean we
need to add some change for virtio core to handle the failure?


Maybe, but I'm not sure how hard we can do that.


We need to change all virtio device drivers in this way.



Probably.





We had NEEDS_RESET but it looks we don't implement it.


Could it handle the failure of get_feature() and get/set_config()?



Looks not:

"

The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state 
that a reset is needed. If DRIVER_OK is set, after it sets 
DEVICE_NEEDS_RESET, the device MUST send a device configuration change 
notification to the driver.


"

This looks implies that NEEDS_RESET may only work after device is 
probed. But in the current design, even the reset() is not reliable.






Or a rough idea is that maybe need some relaxing to be coupled loosely
with userspace. E.g the device (control path) is implemented in the
kernel but the datapath is implemented in the userspace like TUN/TAP.


I think it can work for most cases. One problem is that the set_config
might change the behavior of the data path at runtime, e.g.
virtnet_set_mac_address() in the virtio-net driver and
cache_type_store() in the virtio-blk driver. Not sure if this path is
able to return before the datapath is aware of this change.



Good point.

But set_config() should be rare:

E.g in the case of virtio-net with VERSION_1, config space is read 
only, and it was set via control vq.


For block, we can

1) start from without WCE or
2) we add a config change notification to userspace or
3) extend the spec to use vq instead of config space

Thanks



Another thing if we want to go this way:

We need find a way to terminate the data path from the kernel side, to 
implement to reset semantic.


Thanks







Thanks,
Yongji



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v7 11/12] vduse: Introduce VDUSE - vDPA Device in Userspace

2021-05-27 Thread Jason Wang


在 2021/5/27 下午3:34, Yongji Xie 写道:

On Thu, May 27, 2021 at 1:40 PM Jason Wang  wrote:


在 2021/5/27 下午1:08, Yongji Xie 写道:

On Thu, May 27, 2021 at 1:00 PM Jason Wang  wrote:

在 2021/5/27 下午12:57, Yongji Xie 写道:

On Thu, May 27, 2021 at 12:13 PM Jason Wang  wrote:

在 2021/5/17 下午5:55, Xie Yongji 写道:

+
+static int vduse_dev_msg_sync(struct vduse_dev *dev,
+   struct vduse_dev_msg *msg)
+{
+ init_waitqueue_head(&msg->waitq);
+ spin_lock(&dev->msg_lock);
+ vduse_enqueue_msg(&dev->send_list, msg);
+ wake_up(&dev->waitq);
+ spin_unlock(&dev->msg_lock);
+ wait_event_killable(msg->waitq, msg->completed);

What happens if the userspace(malicous) doesn't give a response forever?

It looks like a DOS. If yes, we need to consider a way to fix that.


How about using wait_event_killable_timeout() instead?

Probably, and then we need choose a suitable timeout and more important,
need to report the failure to virtio.


Makes sense to me. But it looks like some
vdpa_config_ops/virtio_config_ops such as set_status() didn't have a
return value.  Now I add a WARN_ON() for the failure. Do you mean we
need to add some change for virtio core to handle the failure?


Maybe, but I'm not sure how hard we can do that.


We need to change all virtio device drivers in this way.



Probably.





We had NEEDS_RESET but it looks we don't implement it.


Could it handle the failure of get_feature() and get/set_config()?



Looks not:

"

The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state 
that a reset is needed. If DRIVER_OK is set, after it sets 
DEVICE_NEEDS_RESET, the device MUST send a device configuration change 
notification to the driver.


"

This looks implies that NEEDS_RESET may only work after device is 
probed. But in the current design, even the reset() is not reliable.






Or a rough idea is that maybe need some relaxing to be coupled loosely
with userspace. E.g the device (control path) is implemented in the
kernel but the datapath is implemented in the userspace like TUN/TAP.


I think it can work for most cases. One problem is that the set_config
might change the behavior of the data path at runtime, e.g.
virtnet_set_mac_address() in the virtio-net driver and
cache_type_store() in the virtio-blk driver. Not sure if this path is
able to return before the datapath is aware of this change.



Good point.

But set_config() should be rare:

E.g in the case of virtio-net with VERSION_1, config space is read only, 
and it was set via control vq.


For block, we can

1) start from without WCE or
2) we add a config change notification to userspace or
3) extend the spec to use vq instead of config space

Thanks




Thanks,
Yongji



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization