[RFC] virtio_scsi: to poll and kick the virtqueue in timeout handler

2021-05-22 Thread Dongli Zhang
This RFC is to trigger the discussion about to poll and kick the
virtqueue on purpose in virtio-scsi timeout handler.

The virtio-scsi relies on the virtio vring shared between VM and host.
The VM side produces requests to vring and kicks the virtqueue, while the
host side produces responses to vring and interrupts the VM side.

By default the virtio-scsi handler depends on the host timeout handler
by BLK_EH_RESET_TIMER to give host a chance to perform EH.

However, this is not helpful for the case that the responses are available
on vring but the notification from host to VM is lost.


Would you mind share your feedback on this idea to poll the vring on
purpose in timeout handler to give VM a chance to recover from stalled
IO? In addition, how about to kick for an extra time, in case the
stalled IO is due to the loss of VM-to-host notification?


I have tested the IO can be recovered after interrupt is lost on purpose
with below debug patch.

https://github.com/finallyjustice/patchset/blob/master/scsi-virtio_scsi-to-lose-an-interrupt-on-purpose.patch


In addition, the virtio-blk may implement the timeout handler as well,
with the helper function in below patchset from Stefan.

https://lore.kernel.org/linux-block/20210520141305.355961-1-stefa...@redhat.com/T/#t


Thank you very much!

Dongli Zhang



Considering there might be loss of interrupt or kick issue, the timeout
handler may poll or kick the virtqueue on purpose, in order to recover the
IO.

If the response is already available on vring, it indicates the host side
has already processed the request and it is not helpful by giving host a
chance to perform EH.

There will be syslog like below by the timeout handler:

[  135.830746] virtio_scsi: Virtio SCSI HBA 0: I/O 196 QID 3 timeout, 
completion polled

Signed-off-by: Dongli Zhang 
---
 drivers/scsi/virtio_scsi.c | 32 
 1 file changed, 32 insertions(+)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index b9c86a7e3b97..633950b6336a 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -36,6 +36,9 @@
 #define VIRTIO_SCSI_EVENT_LEN 8
 #define VIRTIO_SCSI_VQ_BASE 2
 
+static bool __read_mostly timed_out_enabled;
+module_param(timed_out_enabled, bool, 0644);
+
 /* Command queue element */
 struct virtio_scsi_cmd {
struct scsi_cmnd *sc;
@@ -732,9 +735,38 @@ static void virtscsi_commit_rqs(struct Scsi_Host *shost, 
u16 hwq)
  * The host guarantees to respond to each command, although I/O
  * latencies might be higher than on bare metal.  Reset the timer
  * unconditionally to give the host a chance to perform EH.
+ *
+ * If 'timed_out_enabled' is set, the timeout handler polls or kicks the
+ * virtqueue on purpose, in order to recover the IO, in case there is loss
+ * of interrupt or kick issue with virtio.
  */
 static enum blk_eh_timer_return virtscsi_eh_timed_out(struct scsi_cmnd *scmnd)
 {
+   struct Scsi_Host *shost;
+   struct virtio_scsi *vscsi;
+   struct virtio_scsi_vq *req_vq;
+
+   if (!timed_out_enabled)
+   return BLK_EH_RESET_TIMER;
+
+   shost = scmnd->device->host;
+   vscsi = shost_priv(shost);
+   req_vq = virtscsi_pick_vq_mq(vscsi, scmnd);
+
+   virtscsi_vq_done(vscsi, req_vq, virtscsi_complete_cmd);
+
+   if (test_bit(SCMD_STATE_COMPLETE, &scmnd->state)) {
+   pr_warn("Virtio SCSI HBA %u: I/O %u QID %d timeout, completion 
polled\n",
+   shost->host_no, scmnd->tag, req_vq->vq->index);
+   return BLK_EH_DONE;
+   }
+
+   /*
+* To kick the virtqueue in case the timeout is due to the loss of
+* one prior notification.
+*/
+   virtqueue_notify(req_vq->vq);
+
return BLK_EH_RESET_TIMER;
 }
 
-- 
2.17.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[RFC v3] virtio-vsock: add description for datagram type

2021-05-22 Thread Jiang Wang
From: "jiang.wang" 

Add supports for datagram type for virtio-vsock. Datagram
sockets are connectionless and unreliable. To avoid contention
with stream and other sockets, add two more virtqueues and
a new feature bit to identify if those two new queues exist or not.

Also add descriptions for resource management of datagram, which
does not use the existing credit update mechanism associated with
stream sockets.

Signed-off-by: Jiang Wang 
---
V2: addressed the comments for the previous version.
V3: add description for the mergeable receive buffer.

 virtio-vsock.tex | 137 +--
 1 file changed, 124 insertions(+), 13 deletions(-)

diff --git a/virtio-vsock.tex b/virtio-vsock.tex
index da7e641..7eb3596 100644
--- a/virtio-vsock.tex
+++ b/virtio-vsock.tex
@@ -9,14 +9,36 @@ \subsection{Device ID}\label{sec:Device Types / Socket Device 
/ Device ID}
 
 \subsection{Virtqueues}\label{sec:Device Types / Socket Device / Virtqueues}
 \begin{description}
-\item[0] rx
-\item[1] tx
+\item[0] stream rx
+\item[1] stream tx
+\item[2] datagram rx
+\item[3] datagram tx
+\item[4] event
+\end{description}
+The virtio socket device uses 5 queues if feature bit VIRTIO_VSOCK_F_DRGAM is 
set. Otherwise, it
+only uses 3 queues, as the following.
+
+\begin{description}
+\item[0] stream rx
+\item[1] stream tx
 \item[2] event
 \end{description}
 
+When behavior differs between stream and datagram rx/tx virtqueues
+their full names are used. Common behavior is simply described in
+terms of rx/tx virtqueues and applies to both stream and datagram
+virtqueues.
+
 \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature 
bits}
 
-There are currently no feature bits defined for this device.
+\begin{description}
+\item[VIRTIO_VSOCK_F_DGRAM (0)] Device has support for datagram socket type.
+\end{description}
+
+\begin{description}
+\item[VIRTIO_VSOCK_F_MRG_RXBUF (1)] Driver can merge receive buffers.
+\end{description}
+
 
 \subsection{Device configuration layout}\label{sec:Device Types / Socket 
Device / Device configuration layout}
 
@@ -64,6 +86,8 @@ \subsection{Device Operation}\label{sec:Device Types / Socket 
Device / Device Op
 
 Packets transmitted or received contain a header before the payload:
 
+If feature VIRTIO_VSOCK_F_MRG_RXBUF is not negotiated, use the following 
header.
+
 \begin{lstlisting}
 struct virtio_vsock_hdr {
le64 src_cid;
@@ -79,6 +103,15 @@ \subsection{Device Operation}\label{sec:Device Types / 
Socket Device / Device Op
 };
 \end{lstlisting}
 
+If feature VIRTIO_VSOCK_F_MRG_RXBUF is negotianted, use the following header.
+\begin{lstlisting}
+struct virtio_vsock_hdr_mrg_rxbuf {
+   struct virtio_vsock_hdr hdr;
+   le16 num_buffers;
+};
+\end{lstlisting}
+
+
 The upper 32 bits of src_cid and dst_cid are reserved and zeroed.
 
 Most packets simply transfer data but control packets are also used for
@@ -107,6 +140,9 @@ \subsection{Device Operation}\label{sec:Device Types / 
Socket Device / Device Op
 
 \subsubsection{Virtqueue Flow Control}\label{sec:Device Types / Socket Device 
/ Device Operation / Virtqueue Flow Control}
 
+Flow control applies to stream sockets; datagram sockets do not have
+flow control.
+
 The tx virtqueue carries packets initiated by applications and replies to
 received packets.  The rx virtqueue carries packets initiated by the device and
 replies to previously transmitted packets.
@@ -140,12 +176,15 @@ \subsubsection{Addressing}\label{sec:Device Types / 
Socket Device / Device Opera
 consists of a (cid, port number) tuple. The header fields used for this are
 \field{src_cid}, \field{src_port}, \field{dst_cid}, and \field{dst_port}.
 
-Currently only stream sockets are supported. \field{type} is 1 for stream
-socket types.
+Currently stream and datagram (dgram) sockets are supported. \field{type} is 1 
for stream
+socket types. \field{type} is 3 for dgram socket types.
 
 Stream sockets provide in-order, guaranteed, connection-oriented delivery
 without message boundaries.
 
+Datagram sockets provide unordered, unreliable, connectionless messages 
+with message boundaries and a maximum length.
+
 \subsubsection{Buffer Space Management}\label{sec:Device Types / Socket Device 
/ Device Operation / Buffer Space Management}
 \field{buf_alloc} and \field{fwd_cnt} are used for buffer space management of
 stream sockets. The guest and the device publish how much buffer space is
@@ -162,7 +201,7 @@ \subsubsection{Buffer Space Management}\label{sec:Device 
Types / Socket Device /
 u32 peer_free = peer_buf_alloc - (tx_cnt - peer_fwd_cnt);
 \end{lstlisting}
 
-If there is insufficient buffer space, the sender waits until virtqueue buffers
+For stream sockets, if there is insufficient buffer space, the sender waits 
until virtqueue buffers
 are returned and checks \field{buf_alloc} and \field{fwd_cnt} again. Sending
 the VIRTIO_VSOCK_OP_CREDIT_REQUEST packet queries how much buffer space is
 available. The reply to t

Re: [PATCH v4 2/5] fuse: Call vfs_get_tree() for submounts

2021-05-22 Thread kernel test robot
Hi Greg,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on fuse/for-next]
[also build test WARNING on linux/master linus/master v5.13-rc2 next-20210521]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Greg-Kurz/virtiofs-propagate-sync-to-file-server/20210522-210652
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git 
for-next
config: sparc64-randconfig-p002-20210522 (attached as .config)
compiler: sparc64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/0day-ci/linux/commit/ee3cc45c5a2311efc82021bd5463271507bef828
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Greg-Kurz/virtiofs-propagate-sync-to-file-server/20210522-210652
git checkout ee3cc45c5a2311efc82021bd5463271507bef828
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
ARCH=sparc64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

   fs/fuse/dir.c: In function 'fuse_dentry_automount':
>> fs/fuse/dir.c:312:21: warning: variable 'fm' set but not used 
>> [-Wunused-but-set-variable]
 312 |  struct fuse_mount *fm;
 | ^~

Kconfig warnings: (for reference only)
   WARNING: unmet direct dependencies detected for FRAME_POINTER
   Depends on DEBUG_KERNEL && (M68K || UML || SUPERH) || 
ARCH_WANT_FRAME_POINTERS || MCOUNT
   Selected by
   - LOCKDEP && DEBUG_KERNEL && LOCK_DEBUGGING_SUPPORT && !MIPS && !PPC && !ARM 
&& !S390 && !MICROBLAZE && !ARC && !X86


vim +/fm +312 fs/fuse/dir.c

8fab010644363f Miklos Szeredi 2018-08-15  303  
bf109c64040f5b Max Reitz  2020-04-21  304  /*
bf109c64040f5b Max Reitz  2020-04-21  305   * Create a fuse_mount object 
with a new superblock (with path->dentry
bf109c64040f5b Max Reitz  2020-04-21  306   * as the root), and return that 
mount so it can be auto-mounted on
bf109c64040f5b Max Reitz  2020-04-21  307   * @path.
bf109c64040f5b Max Reitz  2020-04-21  308   */
bf109c64040f5b Max Reitz  2020-04-21  309  static struct vfsmount 
*fuse_dentry_automount(struct path *path)
bf109c64040f5b Max Reitz  2020-04-21  310  {
bf109c64040f5b Max Reitz  2020-04-21  311   struct fs_context *fsc;
bf109c64040f5b Max Reitz  2020-04-21 @312   struct fuse_mount *fm;
bf109c64040f5b Max Reitz  2020-04-21  313   struct vfsmount *mnt;
bf109c64040f5b Max Reitz  2020-04-21  314   struct fuse_inode *mp_fi = 
get_fuse_inode(d_inode(path->dentry));
bf109c64040f5b Max Reitz  2020-04-21  315   int err;
bf109c64040f5b Max Reitz  2020-04-21  316  
bf109c64040f5b Max Reitz  2020-04-21  317   fsc = 
fs_context_for_submount(path->mnt->mnt_sb->s_type, path->dentry);
bf109c64040f5b Max Reitz  2020-04-21  318   if (IS_ERR(fsc)) {
bf109c64040f5b Max Reitz  2020-04-21  319   err = PTR_ERR(fsc);
bf109c64040f5b Max Reitz  2020-04-21  320   goto out;
bf109c64040f5b Max Reitz  2020-04-21  321   }
bf109c64040f5b Max Reitz  2020-04-21  322  
ee3cc45c5a2311 Greg Kurz  2021-05-20  323   /*
ee3cc45c5a2311 Greg Kurz  2021-05-20  324* Hijack fsc->fs_private to 
pass the mount point inode to
ee3cc45c5a2311 Greg Kurz  2021-05-20  325* fuse_get_tree_submount(). It 
*must* be NULLified afterwards
ee3cc45c5a2311 Greg Kurz  2021-05-20  326* to avoid the inode pointer 
to be passed to kfree() when
ee3cc45c5a2311 Greg Kurz  2021-05-20  327* the context gets freed.
ee3cc45c5a2311 Greg Kurz  2021-05-20  328*/
ee3cc45c5a2311 Greg Kurz  2021-05-20  329   fsc->fs_private = mp_fi;
ee3cc45c5a2311 Greg Kurz  2021-05-20  330   err = vfs_get_tree(fsc);
ee3cc45c5a2311 Greg Kurz  2021-05-20  331   fsc->fs_private = NULL;
ee3cc45c5a2311 Greg Kurz  2021-05-20  332   if (err)
bf109c64040f5b Max Reitz  2020-04-21  333   goto out_put_fsc;
bf109c64040f5b Max Reitz  2020-04-21  334  
ee3cc45c5a2311 Greg Kurz  2021-05-20  335   fm = 
get_fuse_mount_super(fsc->root->d_sb);
bf109c64040f5b Max Reitz  2020-04-21  336  
bf109c64040f5b Max Reitz  2020-04-21  337   /* Create the submount */
bf109c64040f5b Max Reitz  2020-04-21  338   mnt = vfs_create_mount(fsc);
bf109c64040f5b Max Reitz  2020-04-21  339   if (IS_ERR(mnt)) {
bf109c64040f5b Max Reitz  2020-04-21  340   err = P

Re: [PATCH v4 2/5] fuse: Call vfs_get_tree() for submounts

2021-05-22 Thread kernel test robot
Hi Greg,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on fuse/for-next]
[also build test WARNING on linux/master linus/master v5.13-rc2 next-20210521]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Greg-Kurz/virtiofs-propagate-sync-to-file-server/20210522-210652
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git 
for-next
config: nds32-randconfig-r011-20210522 (attached as .config)
compiler: nds32le-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/0day-ci/linux/commit/ee3cc45c5a2311efc82021bd5463271507bef828
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Greg-Kurz/virtiofs-propagate-sync-to-file-server/20210522-210652
git checkout ee3cc45c5a2311efc82021bd5463271507bef828
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
ARCH=nds32 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

   fs/fuse/dir.c: In function 'fuse_dentry_automount':
>> fs/fuse/dir.c:312:21: warning: variable 'fm' set but not used 
>> [-Wunused-but-set-variable]
 312 |  struct fuse_mount *fm;
 | ^~


vim +/fm +312 fs/fuse/dir.c

8fab010644363f Miklos Szeredi 2018-08-15  303  
bf109c64040f5b Max Reitz  2020-04-21  304  /*
bf109c64040f5b Max Reitz  2020-04-21  305   * Create a fuse_mount object 
with a new superblock (with path->dentry
bf109c64040f5b Max Reitz  2020-04-21  306   * as the root), and return that 
mount so it can be auto-mounted on
bf109c64040f5b Max Reitz  2020-04-21  307   * @path.
bf109c64040f5b Max Reitz  2020-04-21  308   */
bf109c64040f5b Max Reitz  2020-04-21  309  static struct vfsmount 
*fuse_dentry_automount(struct path *path)
bf109c64040f5b Max Reitz  2020-04-21  310  {
bf109c64040f5b Max Reitz  2020-04-21  311   struct fs_context *fsc;
bf109c64040f5b Max Reitz  2020-04-21 @312   struct fuse_mount *fm;
bf109c64040f5b Max Reitz  2020-04-21  313   struct vfsmount *mnt;
bf109c64040f5b Max Reitz  2020-04-21  314   struct fuse_inode *mp_fi = 
get_fuse_inode(d_inode(path->dentry));
bf109c64040f5b Max Reitz  2020-04-21  315   int err;
bf109c64040f5b Max Reitz  2020-04-21  316  
bf109c64040f5b Max Reitz  2020-04-21  317   fsc = 
fs_context_for_submount(path->mnt->mnt_sb->s_type, path->dentry);
bf109c64040f5b Max Reitz  2020-04-21  318   if (IS_ERR(fsc)) {
bf109c64040f5b Max Reitz  2020-04-21  319   err = PTR_ERR(fsc);
bf109c64040f5b Max Reitz  2020-04-21  320   goto out;
bf109c64040f5b Max Reitz  2020-04-21  321   }
bf109c64040f5b Max Reitz  2020-04-21  322  
ee3cc45c5a2311 Greg Kurz  2021-05-20  323   /*
ee3cc45c5a2311 Greg Kurz  2021-05-20  324* Hijack fsc->fs_private to 
pass the mount point inode to
ee3cc45c5a2311 Greg Kurz  2021-05-20  325* fuse_get_tree_submount(). It 
*must* be NULLified afterwards
ee3cc45c5a2311 Greg Kurz  2021-05-20  326* to avoid the inode pointer 
to be passed to kfree() when
ee3cc45c5a2311 Greg Kurz  2021-05-20  327* the context gets freed.
ee3cc45c5a2311 Greg Kurz  2021-05-20  328*/
ee3cc45c5a2311 Greg Kurz  2021-05-20  329   fsc->fs_private = mp_fi;
ee3cc45c5a2311 Greg Kurz  2021-05-20  330   err = vfs_get_tree(fsc);
ee3cc45c5a2311 Greg Kurz  2021-05-20  331   fsc->fs_private = NULL;
ee3cc45c5a2311 Greg Kurz  2021-05-20  332   if (err)
bf109c64040f5b Max Reitz  2020-04-21  333   goto out_put_fsc;
bf109c64040f5b Max Reitz  2020-04-21  334  
ee3cc45c5a2311 Greg Kurz  2021-05-20  335   fm = 
get_fuse_mount_super(fsc->root->d_sb);
bf109c64040f5b Max Reitz  2020-04-21  336  
bf109c64040f5b Max Reitz  2020-04-21  337   /* Create the submount */
bf109c64040f5b Max Reitz  2020-04-21  338   mnt = vfs_create_mount(fsc);
bf109c64040f5b Max Reitz  2020-04-21  339   if (IS_ERR(mnt)) {
bf109c64040f5b Max Reitz  2020-04-21  340   err = PTR_ERR(mnt);
bf109c64040f5b Max Reitz  2020-04-21  341   goto out_put_fsc;
bf109c64040f5b Max Reitz  2020-04-21  342   }
bf109c64040f5b Max Reitz  2020-04-21  343   mntget(mnt);
bf109c64040f5b Max Reitz  2020-04-21  344   put_fs_context(fsc);
bf109c64040f5b Max Reitz  2020-04-21  345   return mnt;
bf109c64040f5b Max Reitz  2020-04-21  346  
bf109c64040f5b Max Reitz  2020-04-21