Re: [PATCH v7 11/12] vduse: Introduce VDUSE - vDPA Device in Userspace
在 2021/5/28 上午11:54, Yongji Xie 写道: On Fri, May 28, 2021 at 9:33 AM Jason Wang wrote: 在 2021/5/27 下午6:14, Yongji Xie 写道: On Thu, May 27, 2021 at 4:43 PM Jason Wang wrote: 在 2021/5/27 下午4:41, Jason Wang 写道: 在 2021/5/27 下午3:34, Yongji Xie 写道: On Thu, May 27, 2021 at 1:40 PM Jason Wang wrote: 在 2021/5/27 下午1:08, Yongji Xie 写道: On Thu, May 27, 2021 at 1:00 PM Jason Wang wrote: 在 2021/5/27 下午12:57, Yongji Xie 写道: On Thu, May 27, 2021 at 12:13 PM Jason Wang wrote: 在 2021/5/17 下午5:55, Xie Yongji 写道: + +static int vduse_dev_msg_sync(struct vduse_dev *dev, + struct vduse_dev_msg *msg) +{ + init_waitqueue_head(&msg->waitq); + spin_lock(&dev->msg_lock); + vduse_enqueue_msg(&dev->send_list, msg); + wake_up(&dev->waitq); + spin_unlock(&dev->msg_lock); + wait_event_killable(msg->waitq, msg->completed); What happens if the userspace(malicous) doesn't give a response forever? It looks like a DOS. If yes, we need to consider a way to fix that. How about using wait_event_killable_timeout() instead? Probably, and then we need choose a suitable timeout and more important, need to report the failure to virtio. Makes sense to me. But it looks like some vdpa_config_ops/virtio_config_ops such as set_status() didn't have a return value. Now I add a WARN_ON() for the failure. Do you mean we need to add some change for virtio core to handle the failure? Maybe, but I'm not sure how hard we can do that. We need to change all virtio device drivers in this way. Probably. We had NEEDS_RESET but it looks we don't implement it. Could it handle the failure of get_feature() and get/set_config()? Looks not: " The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state that a reset is needed. If DRIVER_OK is set, after it sets DEVICE_NEEDS_RESET, the device MUST send a device configuration change notification to the driver. " This looks implies that NEEDS_RESET may only work after device is probed. But in the current design, even the reset() is not reliable. Or a rough idea is that maybe need some relaxing to be coupled loosely with userspace. E.g the device (control path) is implemented in the kernel but the datapath is implemented in the userspace like TUN/TAP. I think it can work for most cases. One problem is that the set_config might change the behavior of the data path at runtime, e.g. virtnet_set_mac_address() in the virtio-net driver and cache_type_store() in the virtio-blk driver. Not sure if this path is able to return before the datapath is aware of this change. Good point. But set_config() should be rare: E.g in the case of virtio-net with VERSION_1, config space is read only, and it was set via control vq. For block, we can 1) start from without WCE or 2) we add a config change notification to userspace or 3) extend the spec to use vq instead of config space Thanks Another thing if we want to go this way: We need find a way to terminate the data path from the kernel side, to implement to reset semantic. Do you mean terminate the data path in vdpa_reset(). Yes. Is it ok to just notify userspace to stop data path asynchronously? For well-behaved userspace, yes but no for buggy or malicious ones. But the buggy or malicious daemons can't do anything if my understanding is correct. You're right. I originally thought there can still have bouncing. But consider we don't do that during fault. It should be safe. I had an idea, how about terminate IOTLB in this case? Then we're in fact turn datapath off. Sorry, I didn't get your point here. What do you mean by terminating IOTLB? I meant terminate the bouncing but it looks safe after a second thought :) Thanks Remove iotlb mapping? But userspace can still access the mapped region. Thanks, Yongji ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[RFC v4] virtio-vsock: add description for datagram type
From: "jiang.wang" Add supports for datagram type for virtio-vsock. Datagram sockets are connectionless and unreliable. To avoid contention with stream and other sockets, add two more virtqueues and a new feature bit to identify if those two new queues exist or not. Also add descriptions for resource management of datagram, which does not use the existing credit update mechanism associated with stream sockets. Signed-off-by: Jiang Wang --- V2: addressed the comments for the previous version. V3: add description for the mergeable receive buffer. V4: add a feature bit for stream and reserver a bit for seqpacket. Fix mrg_rxbuf related sentences. virtio-vsock.tex | 155 ++- 1 file changed, 142 insertions(+), 13 deletions(-) diff --git a/virtio-vsock.tex b/virtio-vsock.tex index da7e641..bacac3c 100644 --- a/virtio-vsock.tex +++ b/virtio-vsock.tex @@ -9,14 +9,41 @@ \subsection{Device ID}\label{sec:Device Types / Socket Device / Device ID} \subsection{Virtqueues}\label{sec:Device Types / Socket Device / Virtqueues} \begin{description} -\item[0] rx -\item[1] tx +\item[0] stream rx +\item[1] stream tx +\item[2] datagram rx +\item[3] datagram tx +\item[4] event +\end{description} +The virtio socket device uses 5 queues if feature bit VIRTIO_VSOCK_F_DRGAM is set. Otherwise, it +only uses 3 queues, as the following. + +\begin{description} +\item[0] stream rx +\item[1] stream tx \item[2] event \end{description} +When behavior differs between stream and datagram rx/tx virtqueues +their full names are used. Common behavior is simply described in +terms of rx/tx virtqueues and applies to both stream and datagram +virtqueues. + \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits} -There are currently no feature bits defined for this device. +\begin{description} +\item[VIRTIO_VSOCK_F_STREAM (0)] Device has support for stream socket type. +\end{description} + +\begin{description} +\item[VIRTIO_VSOCK_F_DGRAM (2)] Device has support for datagram socket type. +\end{description} + +\begin{description} +\item[VIRTIO_VSOCK_F_MRG_RXBUF (3)] Driver can merge receive buffers. +\end{description} + +If no feature bits are defined, then assume only VIRTIO_VSOCK_F_STREAM is set. \subsection{Device configuration layout}\label{sec:Device Types / Socket Device / Device configuration layout} @@ -64,6 +91,8 @@ \subsection{Device Operation}\label{sec:Device Types / Socket Device / Device Op Packets transmitted or received contain a header before the payload: +If feature VIRTIO_VSOCK_F_MRG_RXBUF is not negotiated, use the following header. + \begin{lstlisting} struct virtio_vsock_hdr { le64 src_cid; @@ -79,6 +108,15 @@ \subsection{Device Operation}\label{sec:Device Types / Socket Device / Device Op }; \end{lstlisting} +If feature VIRTIO_VSOCK_F_MRG_RXBUF is negotiated, use the following header. +\begin{lstlisting} +struct virtio_vsock_hdr_mrg_rxbuf { + struct virtio_vsock_hdr hdr; + le16 num_buffers; +}; +\end{lstlisting} + + The upper 32 bits of src_cid and dst_cid are reserved and zeroed. Most packets simply transfer data but control packets are also used for @@ -107,6 +145,9 @@ \subsection{Device Operation}\label{sec:Device Types / Socket Device / Device Op \subsubsection{Virtqueue Flow Control}\label{sec:Device Types / Socket Device / Device Operation / Virtqueue Flow Control} +Flow control applies to stream sockets; datagram sockets do not have +flow control. + The tx virtqueue carries packets initiated by applications and replies to received packets. The rx virtqueue carries packets initiated by the device and replies to previously transmitted packets. @@ -140,12 +181,15 @@ \subsubsection{Addressing}\label{sec:Device Types / Socket Device / Device Opera consists of a (cid, port number) tuple. The header fields used for this are \field{src_cid}, \field{src_port}, \field{dst_cid}, and \field{dst_port}. -Currently only stream sockets are supported. \field{type} is 1 for stream -socket types. +Currently stream and datagram (dgram) sockets are supported. \field{type} is 1 for stream +socket types. \field{type} is 3 for dgram socket types. Stream sockets provide in-order, guaranteed, connection-oriented delivery without message boundaries. +Datagram sockets provide unordered, unreliable, connectionless messages +with message boundaries and a maximum length. + \subsubsection{Buffer Space Management}\label{sec:Device Types / Socket Device / Device Operation / Buffer Space Management} \field{buf_alloc} and \field{fwd_cnt} are used for buffer space management of stream sockets. The guest and the device publish how much buffer space is @@ -162,7 +206,7 @@ \subsubsection{Buffer Space Management}\label{sec:Device Types / Socket Device / u32 peer_free = peer_buf_alloc - (tx_cnt - peer_fwd_cnt); \end{lstlisting} -If there is insufficient buffer space, the sender waits until v
Re: Re: [virtio-comment] [RFC v3] virtio-vsock: add description for datagram type
On Thu, May 27, 2021 at 6:21 AM Stefano Garzarella wrote: > > Re-send my thoughts on this new series... > > On Wed, May 26, 2021 at 05:50:35PM +, Jiang Wang wrote: > >From: "jiang.wang" > > > >Add supports for datagram type for virtio-vsock. Datagram > >sockets are connectionless and unreliable. To avoid contention > >with stream and other sockets, add two more virtqueues and > >a new feature bit to identify if those two new queues exist or not. > > > >Also add descriptions for resource management of datagram, which > >does not use the existing credit update mechanism associated with > >stream sockets. > > > >Signed-off-by: Jiang Wang > >--- > >V2: addressed the comments for the previous version. > >V3: add description for the mergeable receive buffer. > > > >btw: send the same patch again to include virtio-comment@ > > > > virtio-vsock.tex | 137 > > +-- > > 1 file changed, 124 insertions(+), 13 deletions(-) > > > >diff --git a/virtio-vsock.tex b/virtio-vsock.tex > >index da7e641..7eb3596 100644 > >--- a/virtio-vsock.tex > >+++ b/virtio-vsock.tex > >@@ -9,14 +9,36 @@ \subsection{Device ID}\label{sec:Device Types / Socket > >Device / Device ID} > > > > \subsection{Virtqueues}\label{sec:Device Types / Socket Device / Virtqueues} > > \begin{description} > >-\item[0] rx > >-\item[1] tx > >+\item[0] stream rx > >+\item[1] stream tx > >+\item[2] datagram rx > >+\item[3] datagram tx > >+\item[4] event > >+\end{description} > >+The virtio socket device uses 5 queues if feature bit VIRTIO_VSOCK_F_DRGAM > >is set. Otherwise, it > >+only uses 3 queues, as the following. > >+ > >+\begin{description} > >+\item[0] stream rx > >+\item[1] stream tx > > \item[2] event > > \end{description} > > > >+When behavior differs between stream and datagram rx/tx virtqueues > >+their full names are used. Common behavior is simply described in > >+terms of rx/tx virtqueues and applies to both stream and datagram > >+virtqueues. > >+ > > \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature > > bits} > > > >-There are currently no feature bits defined for this device. > >+\begin{description} > >+\item[VIRTIO_VSOCK_F_DGRAM (0)] Device has support for datagram socket type. > >+\end{description} > > As suggested by Michael here [1] we should add also a feature bit for > stream, and maybe is better to reserve bit 0 for it. > > Arseny already sent an implementation of SEQPACKET using bit 1 for > VIRTIO_VSOCK_F_SEQPACKET, so I think we can use bit 2 for DGRAM and bit > 3 for MRG_RXBUF. > > [1] > https://lists.oasis-open.org/archives/virtio-comment/202104/msg00016.html Sure. > >+ > >+\begin{description} > >+\item[VIRTIO_VSOCK_F_MRG_RXBUF (1)] Driver can merge receive buffers. > >+\end{description} > >+ > > > > \subsection{Device configuration layout}\label{sec:Device Types / Socket > > Device / Device configuration layout} > > > >@@ -64,6 +86,8 @@ \subsection{Device Operation}\label{sec:Device Types / > >Socket Device / Device Op > > > > Packets transmitted or received contain a header before the payload: > > > >+If feature VIRTIO_VSOCK_F_MRG_RXBUF is not negotiated, use the following > >header. > >+ > > \begin{lstlisting} > > struct virtio_vsock_hdr { > > le64 src_cid; > >@@ -79,6 +103,15 @@ \subsection{Device Operation}\label{sec:Device Types / > >Socket Device / Device Op > > }; > > \end{lstlisting} > > > >+If feature VIRTIO_VSOCK_F_MRG_RXBUF is negotianted, use the following > >header. > >+\begin{lstlisting} > >+struct virtio_vsock_hdr_mrg_rxbuf { > >+ struct virtio_vsock_hdr hdr; > >+ le16 num_buffers; > >+}; > >+\end{lstlisting} > >+ > >+ > > The upper 32 bits of src_cid and dst_cid are reserved and zeroed. > > > > Most packets simply transfer data but control packets are also used for > >@@ -107,6 +140,9 @@ \subsection{Device Operation}\label{sec:Device Types / > >Socket Device / Device Op > > > > \subsubsection{Virtqueue Flow Control}\label{sec:Device Types / Socket > > Device / Device Operation / Virtqueue Flow Control} > > > >+Flow control applies to stream sockets; datagram sockets do not have > >+flow control. > >+ > > The tx virtqueue carries packets initiated by applications and replies to > > received packets. The rx virtqueue carries packets initiated by the device > > and > > replies to previously transmitted packets. > >@@ -140,12 +176,15 @@ \subsubsection{Addressing}\label{sec:Device Types / > >Socket Device / Device Opera > > consists of a (cid, port number) tuple. The header fields used for this are > > \field{src_cid}, \field{src_port}, \field{dst_cid}, and \field{dst_port}. > > > >-Currently only stream sockets are supported. \field{type} is 1 for stream > >-socket types. > >+Currently stream and datagram (dgram) sockets are supported. \field{type} > >is 1 for stream > >+socket types. \field{type} is 3 for dgram socket types. > > > > Stream sockets provide in-order, guaranteed, connection-oriented delivery >
Re: [PATCH v7 11/12] vduse: Introduce VDUSE - vDPA Device in Userspace
在 2021/5/27 下午9:17, Yongji Xie 写道: On Thu, May 27, 2021 at 4:41 PM Jason Wang wrote: 在 2021/5/27 下午3:34, Yongji Xie 写道: On Thu, May 27, 2021 at 1:40 PM Jason Wang wrote: 在 2021/5/27 下午1:08, Yongji Xie 写道: On Thu, May 27, 2021 at 1:00 PM Jason Wang wrote: 在 2021/5/27 下午12:57, Yongji Xie 写道: On Thu, May 27, 2021 at 12:13 PM Jason Wang wrote: 在 2021/5/17 下午5:55, Xie Yongji 写道: + +static int vduse_dev_msg_sync(struct vduse_dev *dev, + struct vduse_dev_msg *msg) +{ + init_waitqueue_head(&msg->waitq); + spin_lock(&dev->msg_lock); + vduse_enqueue_msg(&dev->send_list, msg); + wake_up(&dev->waitq); + spin_unlock(&dev->msg_lock); + wait_event_killable(msg->waitq, msg->completed); What happens if the userspace(malicous) doesn't give a response forever? It looks like a DOS. If yes, we need to consider a way to fix that. How about using wait_event_killable_timeout() instead? Probably, and then we need choose a suitable timeout and more important, need to report the failure to virtio. Makes sense to me. But it looks like some vdpa_config_ops/virtio_config_ops such as set_status() didn't have a return value. Now I add a WARN_ON() for the failure. Do you mean we need to add some change for virtio core to handle the failure? Maybe, but I'm not sure how hard we can do that. We need to change all virtio device drivers in this way. Probably. We had NEEDS_RESET but it looks we don't implement it. Could it handle the failure of get_feature() and get/set_config()? Looks not: " The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state that a reset is needed. If DRIVER_OK is set, after it sets DEVICE_NEEDS_RESET, the device MUST send a device configuration change notification to the driver. " This looks implies that NEEDS_RESET may only work after device is probed. But in the current design, even the reset() is not reliable. Or a rough idea is that maybe need some relaxing to be coupled loosely with userspace. E.g the device (control path) is implemented in the kernel but the datapath is implemented in the userspace like TUN/TAP. I think it can work for most cases. One problem is that the set_config might change the behavior of the data path at runtime, e.g. virtnet_set_mac_address() in the virtio-net driver and cache_type_store() in the virtio-blk driver. Not sure if this path is able to return before the datapath is aware of this change. Good point. But set_config() should be rare: E.g in the case of virtio-net with VERSION_1, config space is read only, and it was set via control vq. For block, we can 1) start from without WCE or 2) we add a config change notification to userspace or I prefer this way. And I think we also need to do similar things for set/get_vq_state(). Yes, I agree. Thanks Thanks, Yongji ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v7 11/12] vduse: Introduce VDUSE - vDPA Device in Userspace
在 2021/5/27 下午6:14, Yongji Xie 写道: On Thu, May 27, 2021 at 4:43 PM Jason Wang wrote: 在 2021/5/27 下午4:41, Jason Wang 写道: 在 2021/5/27 下午3:34, Yongji Xie 写道: On Thu, May 27, 2021 at 1:40 PM Jason Wang wrote: 在 2021/5/27 下午1:08, Yongji Xie 写道: On Thu, May 27, 2021 at 1:00 PM Jason Wang wrote: 在 2021/5/27 下午12:57, Yongji Xie 写道: On Thu, May 27, 2021 at 12:13 PM Jason Wang wrote: 在 2021/5/17 下午5:55, Xie Yongji 写道: + +static int vduse_dev_msg_sync(struct vduse_dev *dev, + struct vduse_dev_msg *msg) +{ + init_waitqueue_head(&msg->waitq); + spin_lock(&dev->msg_lock); + vduse_enqueue_msg(&dev->send_list, msg); + wake_up(&dev->waitq); + spin_unlock(&dev->msg_lock); + wait_event_killable(msg->waitq, msg->completed); What happens if the userspace(malicous) doesn't give a response forever? It looks like a DOS. If yes, we need to consider a way to fix that. How about using wait_event_killable_timeout() instead? Probably, and then we need choose a suitable timeout and more important, need to report the failure to virtio. Makes sense to me. But it looks like some vdpa_config_ops/virtio_config_ops such as set_status() didn't have a return value. Now I add a WARN_ON() for the failure. Do you mean we need to add some change for virtio core to handle the failure? Maybe, but I'm not sure how hard we can do that. We need to change all virtio device drivers in this way. Probably. We had NEEDS_RESET but it looks we don't implement it. Could it handle the failure of get_feature() and get/set_config()? Looks not: " The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state that a reset is needed. If DRIVER_OK is set, after it sets DEVICE_NEEDS_RESET, the device MUST send a device configuration change notification to the driver. " This looks implies that NEEDS_RESET may only work after device is probed. But in the current design, even the reset() is not reliable. Or a rough idea is that maybe need some relaxing to be coupled loosely with userspace. E.g the device (control path) is implemented in the kernel but the datapath is implemented in the userspace like TUN/TAP. I think it can work for most cases. One problem is that the set_config might change the behavior of the data path at runtime, e.g. virtnet_set_mac_address() in the virtio-net driver and cache_type_store() in the virtio-blk driver. Not sure if this path is able to return before the datapath is aware of this change. Good point. But set_config() should be rare: E.g in the case of virtio-net with VERSION_1, config space is read only, and it was set via control vq. For block, we can 1) start from without WCE or 2) we add a config change notification to userspace or 3) extend the spec to use vq instead of config space Thanks Another thing if we want to go this way: We need find a way to terminate the data path from the kernel side, to implement to reset semantic. Do you mean terminate the data path in vdpa_reset(). Yes. Is it ok to just notify userspace to stop data path asynchronously? For well-behaved userspace, yes but no for buggy or malicious ones. I had an idea, how about terminate IOTLB in this case? Then we're in fact turn datapath off. Thanks Userspace should not be able to do any I/O at that time because the iotlb mapping is already removed. Thanks, Yongji ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 5/5] vhost: fix up vhost_work coding style
On Tue, May 25, 2021 at 12:47:33PM -0500, Mike Christie wrote: Switch from a mix of tabs and spaces to just tabs. Signed-off-by: Mike Christie --- drivers/vhost/vhost.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h index 575c8180caad..7d5306d1229d 100644 --- a/drivers/vhost/vhost.h +++ b/drivers/vhost/vhost.h @@ -20,9 +20,9 @@ typedef void (*vhost_work_fn_t)(struct vhost_work *work); #define VHOST_WORK_QUEUED 1 struct vhost_work { - struct llist_node node; - vhost_work_fn_t fn; - unsigned long flags; + struct llist_node node; + vhost_work_fn_t fn; + unsigned long flags; }; /* Poll a file (eventfd or socket) */ -- 2.25.1 Reviewed-by: Stefano Garzarella ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 1/5] vhost: remove work arg from vhost_work_flush
On Tue, May 25, 2021 at 12:47:29PM -0500, Mike Christie wrote: vhost_work_flush doesn't do anything with the work arg. This patch drops it and then renames vhost_work_flush to vhost_work_dev_flush to reflect that the function flushes all the works in the dev and not just a specific queue or work item. Signed-off-by: Mike Christie Acked-by: Jason Wang Reviewed-by: Chaitanya Kulkarni --- drivers/vhost/scsi.c | 4 ++-- drivers/vhost/vhost.c | 8 drivers/vhost/vhost.h | 2 +- drivers/vhost/vsock.c | 2 +- 4 files changed, 8 insertions(+), 8 deletions(-) Reviewed-by: Stefano Garzarella ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [virtio-comment] [RFC v3] virtio-vsock: add description for datagram type
Re-send my thoughts on this new series... On Wed, May 26, 2021 at 05:50:35PM +, Jiang Wang wrote: From: "jiang.wang" Add supports for datagram type for virtio-vsock. Datagram sockets are connectionless and unreliable. To avoid contention with stream and other sockets, add two more virtqueues and a new feature bit to identify if those two new queues exist or not. Also add descriptions for resource management of datagram, which does not use the existing credit update mechanism associated with stream sockets. Signed-off-by: Jiang Wang --- V2: addressed the comments for the previous version. V3: add description for the mergeable receive buffer. btw: send the same patch again to include virtio-comment@ virtio-vsock.tex | 137 +-- 1 file changed, 124 insertions(+), 13 deletions(-) diff --git a/virtio-vsock.tex b/virtio-vsock.tex index da7e641..7eb3596 100644 --- a/virtio-vsock.tex +++ b/virtio-vsock.tex @@ -9,14 +9,36 @@ \subsection{Device ID}\label{sec:Device Types / Socket Device / Device ID} \subsection{Virtqueues}\label{sec:Device Types / Socket Device / Virtqueues} \begin{description} -\item[0] rx -\item[1] tx +\item[0] stream rx +\item[1] stream tx +\item[2] datagram rx +\item[3] datagram tx +\item[4] event +\end{description} +The virtio socket device uses 5 queues if feature bit VIRTIO_VSOCK_F_DRGAM is set. Otherwise, it +only uses 3 queues, as the following. + +\begin{description} +\item[0] stream rx +\item[1] stream tx \item[2] event \end{description} +When behavior differs between stream and datagram rx/tx virtqueues +their full names are used. Common behavior is simply described in +terms of rx/tx virtqueues and applies to both stream and datagram +virtqueues. + \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits} -There are currently no feature bits defined for this device. +\begin{description} +\item[VIRTIO_VSOCK_F_DGRAM (0)] Device has support for datagram socket type. +\end{description} As suggested by Michael here [1] we should add also a feature bit for stream, and maybe is better to reserve bit 0 for it. Arseny already sent an implementation of SEQPACKET using bit 1 for VIRTIO_VSOCK_F_SEQPACKET, so I think we can use bit 2 for DGRAM and bit 3 for MRG_RXBUF. [1] https://lists.oasis-open.org/archives/virtio-comment/202104/msg00016.html + +\begin{description} +\item[VIRTIO_VSOCK_F_MRG_RXBUF (1)] Driver can merge receive buffers. +\end{description} + \subsection{Device configuration layout}\label{sec:Device Types / Socket Device / Device configuration layout} @@ -64,6 +86,8 @@ \subsection{Device Operation}\label{sec:Device Types / Socket Device / Device Op Packets transmitted or received contain a header before the payload: +If feature VIRTIO_VSOCK_F_MRG_RXBUF is not negotiated, use the following header. + \begin{lstlisting} struct virtio_vsock_hdr { le64 src_cid; @@ -79,6 +103,15 @@ \subsection{Device Operation}\label{sec:Device Types / Socket Device / Device Op }; \end{lstlisting} +If feature VIRTIO_VSOCK_F_MRG_RXBUF is negotianted, use the following header. +\begin{lstlisting} +struct virtio_vsock_hdr_mrg_rxbuf { + struct virtio_vsock_hdr hdr; + le16 num_buffers; +}; +\end{lstlisting} + + The upper 32 bits of src_cid and dst_cid are reserved and zeroed. Most packets simply transfer data but control packets are also used for @@ -107,6 +140,9 @@ \subsection{Device Operation}\label{sec:Device Types / Socket Device / Device Op \subsubsection{Virtqueue Flow Control}\label{sec:Device Types / Socket Device / Device Operation / Virtqueue Flow Control} +Flow control applies to stream sockets; datagram sockets do not have +flow control. + The tx virtqueue carries packets initiated by applications and replies to received packets. The rx virtqueue carries packets initiated by the device and replies to previously transmitted packets. @@ -140,12 +176,15 @@ \subsubsection{Addressing}\label{sec:Device Types / Socket Device / Device Opera consists of a (cid, port number) tuple. The header fields used for this are \field{src_cid}, \field{src_port}, \field{dst_cid}, and \field{dst_port}. -Currently only stream sockets are supported. \field{type} is 1 for stream -socket types. +Currently stream and datagram (dgram) sockets are supported. \field{type} is 1 for stream +socket types. \field{type} is 3 for dgram socket types. Stream sockets provide in-order, guaranteed, connection-oriented delivery without message boundaries. +Datagram sockets provide unordered, unreliable, connectionless messages +with message boundaries and a maximum length. + \subsubsection{Buffer Space Management}\label{sec:Device Types / Socket Device / Device Operation / Buffer Space Management} \field{buf_alloc} and \field{fwd_cnt} are used for buffer space management of stream sockets. The guest and the device publish how much buffer space is @@ -162,7 +201,7 @@ \subsubsection{Buffer Space Management}
Re: [virtio-comment] [PATCH v6 0/2] virtio-vsock: introduce SOCK_SEQPACKET description
On Mon, May 24, 2021 at 09:32:29PM +0300, Arseny Krasnov wrote: This adds description of SOCK_SEQPACKET type for virtio-vsock. Here is latest RFC implementation for Linux, with more details: https://lkml.org/lkml/2021/5/20/2386 Also this patchset has patch which replaces enums to defines in virtio-vsock part of spec. SOCK_SEQPACKET patch is based on it. Arseny Krasnov(2): virtio-vsock: use C style defines for constants virtio-vsock: SOCK_SEQPACKET description virtio-vsock.tex | 74 -- 1 file changed, 46 insertions(+), 28 deletions(-) The series LGTM! Let's wait for other comments. I'll send a follow up patch next week about feature bit for stream socket (bit 0). Thanks, Stefano ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v7 11/12] vduse: Introduce VDUSE - vDPA Device in Userspace
在 2021/5/27 下午4:41, Jason Wang 写道: 在 2021/5/27 下午3:34, Yongji Xie 写道: On Thu, May 27, 2021 at 1:40 PM Jason Wang wrote: 在 2021/5/27 下午1:08, Yongji Xie 写道: On Thu, May 27, 2021 at 1:00 PM Jason Wang wrote: 在 2021/5/27 下午12:57, Yongji Xie 写道: On Thu, May 27, 2021 at 12:13 PM Jason Wang wrote: 在 2021/5/17 下午5:55, Xie Yongji 写道: + +static int vduse_dev_msg_sync(struct vduse_dev *dev, + struct vduse_dev_msg *msg) +{ + init_waitqueue_head(&msg->waitq); + spin_lock(&dev->msg_lock); + vduse_enqueue_msg(&dev->send_list, msg); + wake_up(&dev->waitq); + spin_unlock(&dev->msg_lock); + wait_event_killable(msg->waitq, msg->completed); What happens if the userspace(malicous) doesn't give a response forever? It looks like a DOS. If yes, we need to consider a way to fix that. How about using wait_event_killable_timeout() instead? Probably, and then we need choose a suitable timeout and more important, need to report the failure to virtio. Makes sense to me. But it looks like some vdpa_config_ops/virtio_config_ops such as set_status() didn't have a return value. Now I add a WARN_ON() for the failure. Do you mean we need to add some change for virtio core to handle the failure? Maybe, but I'm not sure how hard we can do that. We need to change all virtio device drivers in this way. Probably. We had NEEDS_RESET but it looks we don't implement it. Could it handle the failure of get_feature() and get/set_config()? Looks not: " The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state that a reset is needed. If DRIVER_OK is set, after it sets DEVICE_NEEDS_RESET, the device MUST send a device configuration change notification to the driver. " This looks implies that NEEDS_RESET may only work after device is probed. But in the current design, even the reset() is not reliable. Or a rough idea is that maybe need some relaxing to be coupled loosely with userspace. E.g the device (control path) is implemented in the kernel but the datapath is implemented in the userspace like TUN/TAP. I think it can work for most cases. One problem is that the set_config might change the behavior of the data path at runtime, e.g. virtnet_set_mac_address() in the virtio-net driver and cache_type_store() in the virtio-blk driver. Not sure if this path is able to return before the datapath is aware of this change. Good point. But set_config() should be rare: E.g in the case of virtio-net with VERSION_1, config space is read only, and it was set via control vq. For block, we can 1) start from without WCE or 2) we add a config change notification to userspace or 3) extend the spec to use vq instead of config space Thanks Another thing if we want to go this way: We need find a way to terminate the data path from the kernel side, to implement to reset semantic. Thanks Thanks, Yongji ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v7 11/12] vduse: Introduce VDUSE - vDPA Device in Userspace
在 2021/5/27 下午3:34, Yongji Xie 写道: On Thu, May 27, 2021 at 1:40 PM Jason Wang wrote: 在 2021/5/27 下午1:08, Yongji Xie 写道: On Thu, May 27, 2021 at 1:00 PM Jason Wang wrote: 在 2021/5/27 下午12:57, Yongji Xie 写道: On Thu, May 27, 2021 at 12:13 PM Jason Wang wrote: 在 2021/5/17 下午5:55, Xie Yongji 写道: + +static int vduse_dev_msg_sync(struct vduse_dev *dev, + struct vduse_dev_msg *msg) +{ + init_waitqueue_head(&msg->waitq); + spin_lock(&dev->msg_lock); + vduse_enqueue_msg(&dev->send_list, msg); + wake_up(&dev->waitq); + spin_unlock(&dev->msg_lock); + wait_event_killable(msg->waitq, msg->completed); What happens if the userspace(malicous) doesn't give a response forever? It looks like a DOS. If yes, we need to consider a way to fix that. How about using wait_event_killable_timeout() instead? Probably, and then we need choose a suitable timeout and more important, need to report the failure to virtio. Makes sense to me. But it looks like some vdpa_config_ops/virtio_config_ops such as set_status() didn't have a return value. Now I add a WARN_ON() for the failure. Do you mean we need to add some change for virtio core to handle the failure? Maybe, but I'm not sure how hard we can do that. We need to change all virtio device drivers in this way. Probably. We had NEEDS_RESET but it looks we don't implement it. Could it handle the failure of get_feature() and get/set_config()? Looks not: " The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state that a reset is needed. If DRIVER_OK is set, after it sets DEVICE_NEEDS_RESET, the device MUST send a device configuration change notification to the driver. " This looks implies that NEEDS_RESET may only work after device is probed. But in the current design, even the reset() is not reliable. Or a rough idea is that maybe need some relaxing to be coupled loosely with userspace. E.g the device (control path) is implemented in the kernel but the datapath is implemented in the userspace like TUN/TAP. I think it can work for most cases. One problem is that the set_config might change the behavior of the data path at runtime, e.g. virtnet_set_mac_address() in the virtio-net driver and cache_type_store() in the virtio-blk driver. Not sure if this path is able to return before the datapath is aware of this change. Good point. But set_config() should be rare: E.g in the case of virtio-net with VERSION_1, config space is read only, and it was set via control vq. For block, we can 1) start from without WCE or 2) we add a config change notification to userspace or 3) extend the spec to use vq instead of config space Thanks Thanks, Yongji ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization