Re: [PATCH v5 0/9] block: Add retry for werror=/rerror= mechanism

Eric Blake Tue, 23 Feb 2021 05:41:56 -0800

On 2/23/21 3:40 AM, Stefan Hajnoczi wrote:
> On Fri, Feb 05, 2021 at 06:13:06PM +0800, Jiahui Cen wrote:
>> This patch series proposes to extend the werror=/rerror= mechanism to add
>> a 'retry' feature. It can automatically retry failed I/O requests on error
>> without sending the error back to the guest, and the guest can resume
>> running smoothly when I/O is recovered.
> 
> This patch series implements a retry followed by werror/rerror=report
> after a timeout. This mechanism could be made more generic (and the code
> could be simplified) by removing the new werror/rerror=retry action and
> instead implementing the retry/timeout followed by *any* werror=/rerror=
> policy chosen by the user.
> 
> In other words, if the retry interval is non-zero, retry the request and
> check for timeouts. When the timeout is reached, obey the
> werror=/rerror= action.
> 
> This is more flexible than hard-coding werror=retry to mean retry
> timeout followed by werror=report.
> 
> For example:
> 
>   werror=stop,write-retry-interval=1000,write-retry-timeout=15000,
>   rerror=report,read-retry-interval=1000,read-retry-timeout=15000
> 
> Failed write requests will be retried once a second for 15 seconds.
> If the timeout is reached the guest is stopped.
> 
> Failed read requests will be retried once a second for 15 seconds. If
> the timeout is reached the error is reported to the guest.
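
To make the semantics proposed above concrete, here is a minimal sketch in
Python (not QEMU code) of "retry at a fixed interval until the timeout, then
obey the configured werror=/rerror= action". The names RetryPolicy and
submit_with_retry are hypothetical, time is simulated rather than slept, and
the choice to still allow a retry that lands exactly on the timeout is an
assumption; the patch series may define the boundary differently.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class RetryPolicy:
    """Hypothetical per-direction settings, mirroring the example options."""
    interval_ms: int  # e.g. write-retry-interval=1000
    timeout_ms: int   # e.g. write-retry-timeout=15000
    on_error: str     # the werror=/rerror= action, e.g. "stop" or "report"


def submit_with_retry(do_io: Callable[[], bool],
                      policy: RetryPolicy) -> Tuple[str, int]:
    """Run do_io(); on failure retry every interval_ms until timeout_ms.

    Returns (outcome, retries), where outcome is "ok" or policy.on_error.
    Elapsed time is simulated by counting intervals, so nothing sleeps.
    """
    if do_io():
        return "ok", 0
    # A zero interval means retry is disabled: apply the policy at once.
    if policy.interval_ms <= 0:
        return policy.on_error, 0

    elapsed = 0
    retries = 0
    # Assumption: a retry landing exactly on the timeout is still attempted.
    while elapsed + policy.interval_ms <= policy.timeout_ms:
        elapsed += policy.interval_ms
        retries += 1
        if do_io():
            return "ok", retries
    # Timeout reached: fall back to whatever action the user configured.
    return policy.on_error, retries


if __name__ == "__main__":
    write_policy = RetryPolicy(interval_ms=1000, timeout_ms=15000,
                               on_error="stop")
    read_policy = RetryPolicy(interval_ms=1000, timeout_ms=15000,
                              on_error="report")
    print(submit_with_retry(lambda: False, write_policy))
    print(submit_with_retry(lambda: False, read_policy))
```

Under these assumptions the retry loop never needs to know which action it
falls back to, which is the simplification being suggested: the action is just
data carried in the policy.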


You may also want to look at what the NBD block device already
implements for retries, and see if making retry generic to the block
layer in general can do everything already possible in the NBD code, at
which point the NBD code can be simplified.  Vladimir (added in cc) is
the best point of contact there.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v5 0/9] block: Add retry for werror=/rerror= mechanism
