Re: [Qemu-block] question: about introducing a new feature named “I/O hang”

2019-07-08 Thread Maxim Levitsky
On Fri, 2019-07-05 at 09:50 +0200, Kevin Wolf wrote:
> Am 04.07.2019 um 17:16 hat wangjie (P) geschrieben:
> > Hi, everybody:
> > 
> > I developed a feature named "I/O hang"; my intention is to solve the
> > following problem:
> > If the backend storage of a VM disk is far-end storage such as IP SAN or
> > FC SAN, the storage network link can disconnect and cause I/O requests to
> > return EIO to the guest, which leaves the guest filesystem read-only.
> > Even if the link recovers after a while, the filesystem status in the
> > guest does not recover.
> 
> The standard solution for this is configuring the guest device with
> werror=stop,rerror=stop so that the error is not delivered to the guest,
> but the VM is stopped. When you run 'cont', the request is then retried.
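
For reference, such a configuration might look roughly like this on the QEMU
command line (the disk path and device names below are only placeholders):

    qemu-system-x86_64 ... \
        -drive file=/path/to/disk.qcow2,format=qcow2,if=none,id=drive0,werror=stop,rerror=stop \
        -device virtio-blk-pci,drive=drive0

With this, a failing request pauses the VM instead of delivering EIO to the
guest; once the storage link is back, 'cont' resumes the VM and the request
is retried.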
> 
> > So I developed a feature named "I/O hang" to solve this problem. The
> > solution works like this:
> > When an I/O request returns EIO from the backend, "I/O hang" catches the
> > request in the QEMU block layer and inserts it into a rehandle queue
> > instead of returning EIO to the guest. The I/O request hangs in the guest,
> > but this does not make the guest filesystem read-only. "I/O hang" then
> > periodically (e.g. every 5 seconds) retries the queued requests until they
> > stop returning EIO (i.e. once the backend storage link has recovered).
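
To make the described mechanism concrete, here is a small self-contained toy
model of the retry loop for a single request. This is not the actual patch and
uses no QEMU APIs; all names, intervals, and the link-recovery simulation are
made up for illustration, and a real implementation would keep a queue of
parked requests rather than a single one:

    /*
     * Toy model of the described "rehandle" idea: park a failed request
     * and retry it periodically until it succeeds or a timeout expires.
     */
    #include <errno.h>
    #include <stdio.h>
    #include <unistd.h>

    #define RETRY_INTERVAL_S 5      /* retry period, as in the proposal */
    #define GIVE_UP_AFTER_S  30     /* per-disk timeout before failing with EIO */

    /* Stand-in for the backend: fails with EIO while the "link" is down. */
    static int backend_submit(int link_up)
    {
        return link_up ? 0 : -EIO;
    }

    /* Complete the request towards the guest (here: just print the outcome). */
    static void complete_to_guest(int request_id, int ret)
    {
        printf("request %d completed with %d\n", request_id, ret);
    }

    int main(void)
    {
        int request_id = 1;
        int waited = 0;

        for (;;) {
            /* Pretend the storage link comes back after 10 seconds. */
            int link_up = waited >= 10;
            int ret = backend_submit(link_up);

            if (ret != -EIO) {
                complete_to_guest(request_id, ret);   /* success: hand back */
                return 0;
            }
            if (waited >= GIVE_UP_AFTER_S) {
                complete_to_guest(request_id, -EIO);  /* timeout: give up */
                return 1;
            }
            /* EIO: keep the request parked and retry after the interval. */
            sleep(RETRY_INTERVAL_S);
            waited += RETRY_INTERVAL_S;
        }
    }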
> 
> Letting requests hang without stopping the VM risks the guest running
> into timeouts and deciding that its disk is broken.
I came to say exactly this.
While developing nvme-mdev I ran into the same problem: due to assumptions
built into the block layer, you can't just let the guest wait forever for a
request.

Note that Linux's nvme driver does know how to retry failed requests,
including those that timed out, if that helps in any way.
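
For what it's worth, the Linux NVMe driver's timeout and retry behaviour can
also be tuned through nvme_core module parameters; the values below are only
examples, not recommendations:

    # e.g. on the kernel command line or via modprobe options
    nvme_core.io_timeout=300 nvme_core.max_retries=10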

Best regards,
Maxim Levitsky


> 
> As you say, your "hang" and retry logic sits in the block layer, so what do
> you do when you encounter a bdrv_drain() request?
> 
> > In addition to the behaviour described above, "I/O hang" can also send an
> > event to libvirt when the backend storage status changes.
> > 
> > Configuration methods:
> > 1. The "I/O hang" capability can be configured per disk, as a disk attribute.
> > 2. The "I/O hang" timeout can also be configured per disk; if the storage
> >    link does not recover within that timeout, "I/O hang" stops rehandling
> >    the I/O requests and returns EIO to the guest (an illustrative example
> >    of such per-disk options follows below).
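
Purely to make the interface concrete, such per-disk options might look
something like the line below; these option names are illustrative only and
are neither an existing QEMU syntax nor necessarily the syntax of the proposal:

    -drive file=/path/to/disk.qcow2,if=none,id=drive0,rehandle=on,rehandle-timeout=5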
> > 
> > Are you interested in this feature? I intend to submit it upstream to the
> > QEMU project; what is your opinion?
> 
> Were you aware of werror/rerror? Before we add another mechanism, we
> need to be sure how the features compare, that the new mechanism
> provides a significant advantage and that we keep code duplication as
> low as possible.
> 
> Kevin
> 




