Hi Kevin,

Could you please spend some time reviewing and commenting on this patch series?

Thanks,
Jiahui Cen

On 2020/9/30 17:45, Jiahui Cen wrote:
> A VM in a cloud environment may use a virtual disk as its backend storage,
> and there are usually filesystems on the virtual block device. When the
> backend storage is temporarily down, any I/O issued to the virtual block
> device will cause an error. For example, an error in an ext4 filesystem can
> make the filesystem read-only. However, cloud backend storage can often
> recover quickly.
> For example, an IP-SAN may go down due to a network failure and come back
> online soon after the network recovers. The error in the filesystem may not
> be recovered without a device reattach or a system restart. So an I/O
> rehandle mechanism is needed to implement self-healing.
> 
> This patch series proposes a feature called I/O hang. It can rehandle AIOs
> that fail with EIO without sending the error back to the guest. From the
> guest's point of view, it looks as if an I/O is hanging and has not returned.
> With this feature enabled, the guest can resume running smoothly once I/O
> has recovered.
> 
> v1->v2:
> * Rebase to fix compile problems.
> * Fix incorrect removal of rehandle list.
> * Provide rehandle pause interface.
> 
> Jiahui Cen (8):
>   block-backend: introduce I/O rehandle info
>   block-backend: rehandle block aios when EIO
>   block-backend: add I/O hang timeout
>   block-backend: add I/O rehandle pause/unpause
>   block-backend: enable I/O hang when timeout is set
>   virtio-blk: pause I/O hang when resetting
>   qemu-option: add I/O hang timeout option
>   qapi: add I/O hang and I/O hang timeout qapi event
> 
>  block/block-backend.c          | 300 +++++++++++++++++++++++++++++++++
>  blockdev.c                     |  11 ++
>  hw/block/virtio-blk.c          |   8 +
>  include/sysemu/block-backend.h |   5 +
>  qapi/block-core.json           |  26 +++
>  5 files changed, 350 insertions(+)
> 
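For anyone skimming the cover letter, the rehandle-on-EIO idea with a timeout can be illustrated roughly as below. This is only a minimal standalone sketch, not code from the patches: the names (RehandlePolicy, submit_with_rehandle, FlakyBackend, flaky_io) are hypothetical, and it assumes a synchronous retry loop with a simulated clock, whereas the series works on asynchronous AIOs inside block-backend.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for one backend request: returns 0 or -errno. */
typedef int (*io_fn)(void *opaque);

/* Hypothetical policy: how long an I/O may "hang" before the error is reported. */
typedef struct {
    int64_t timeout_ms;        /* 0 means: report -EIO immediately */
    int64_t retry_interval_ms; /* simulated delay between rehandles */
} RehandlePolicy;

/*
 * Issue the request. On -EIO, rehandle it instead of reporting the error,
 * until the request succeeds or the simulated wait reaches the timeout.
 * Any other result is returned to the caller unchanged.
 */
static int submit_with_rehandle(io_fn fn, void *opaque,
                                const RehandlePolicy *p, int *attempts)
{
    /* Guard against a non-positive interval so the loop always terminates. */
    int64_t step = p->retry_interval_ms > 0 ? p->retry_interval_ms : 1;
    int64_t waited = 0;
    int ret;

    *attempts = 0;
    for (;;) {
        ret = fn(opaque);
        (*attempts)++;
        if (ret != -EIO) {
            return ret;        /* success, or an error we do not rehandle */
        }
        if (waited >= p->timeout_ms) {
            return ret;        /* hang timeout reached: surface the EIO */
        }
        waited += step;        /* simulated clock; no real sleep here */
    }
}

/* Toy backend that fails a fixed number of times, then recovers. */
typedef struct {
    int failures_left;
} FlakyBackend;

static int flaky_io(void *opaque)
{
    FlakyBackend *b = opaque;

    if (b->failures_left > 0) {
        b->failures_left--;
        return -EIO;
    }
    return 0;
}
```

In this toy model, a backend that recovers within the timeout is invisible to the caller apart from the delay, while a backend that stays down still yields -EIO once the timeout expires.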
