On 08.09.2015 at 13:27, Denis V. Lunev wrote:
> interesting point. Yes, it flushes all requests and most likely
> hangs inside waiting requests to complete. But fortunately
> this happens after the switch to paused state thus
> the guest becomes paused. That's why I have missed this
> fact.
>
> This (could) be considered as a problem but I have no (good)
> solution at the moment. Should think a bit on.

Let me suggest a radically different design. Note that I don't say this
is necessarily how things should be done, I'm just trying to introduce
some new ideas and broaden the discussion, so that we have a larger set
of ideas from which we can pick the right solution(s).

The core of my idea would be a new filter block driver 'timeout' that
can be added on top of each BDS that could potentially fail, like a
raw-posix BDS pointing to a file on NFS. This way most pieces of the
solution are nicely modularised and don't touch the block layer core.

During normal operation the driver would just be passing through
requests to the lower layer. When it detects a timeout, however, it
completes the request it received with -ETIMEDOUT. It also completes any
new request it receives with -ETIMEDOUT without passing the request on
until the request that originally timed out returns. This is our safety
measure against anyone seeing whether or how the timed out request
modified data.

We need to make sure that bdrv_drain() doesn't wait for this request.
Possibly we need to introduce a .bdrv_drain callback that replaces the
default handling, because bdrv_requests_pending() in the default
handling considers bs->file, which would still have the timed out
request. We don't want to see this; bdrv_drain_all() should complete
even though that request is still pending internally (externally, we
returned -ETIMEDOUT, so we can consider it completed). This way the
monitor stays responsive and background jobs can go on if they don't use
the failing block device.

And then we essentially reuse the rerror/werror mechanism that we
already have to stop the VM. The device models would be extended to
always stop the VM on -ETIMEDOUT, regardless of the error policy.

In this state, the VM would even be migratable if you make sure that the
pending request can't modify the image on the destination host any more.

Do you think this could work, or did I miss something important?

Kevin
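
To make the state machine described above a bit more concrete, here is
a rough toy model in plain C. It is only a sketch and not QEMU code: it
does not use the real BlockDriver callbacks, and every name in it
(TimeoutFilter, tf_submit, tf_timeout, tf_late_completion,
tf_requests_pending, tf_demo) is hypothetical. It assumes that a request
counts as completed externally the moment -ETIMEDOUT is returned for it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical toy model of the proposed 'timeout' filter; not QEMU code. */
typedef struct TimeoutFilter {
    int inflight; /* passed to the lower layer, not yet completed */
    int stuck;    /* timed out externally, still pending in the lower layer */
} TimeoutFilter;

/* Submit a request: pass it through, or fail fast while a timed-out
 * request is still pending below. */
static int tf_submit(TimeoutFilter *f)
{
    if (f->stuck > 0) {
        return -ETIMEDOUT;
    }
    f->inflight++;
    return 0;
}

/* Normal completion of a passed-through request. */
static void tf_complete(TimeoutFilter *f)
{
    assert(f->inflight > 0);
    f->inflight--;
}

/* The timeout for one in-flight request fired: report -ETIMEDOUT to the
 * caller, but remember that the lower layer still owns the request. */
static int tf_timeout(TimeoutFilter *f)
{
    assert(f->inflight > 0);
    f->inflight--;
    f->stuck++;
    return -ETIMEDOUT;
}

/* The lower layer finally returns a request that had timed out. */
static void tf_late_completion(TimeoutFilter *f)
{
    assert(f->stuck > 0);
    f->stuck--;
}

/* What a drain would have to wait for: only requests that have not been
 * completed externally. Timed-out requests are deliberately ignored. */
static bool tf_requests_pending(const TimeoutFilter *f)
{
    return f->inflight > 0;
}

/* Walk through the scenario; returns 0 if every step behaves as described. */
static int tf_demo(void)
{
    TimeoutFilter f = { 0, 0 };

    if (tf_submit(&f) != 0) return 1;           /* normal pass-through */
    tf_complete(&f);
    if (tf_submit(&f) != 0) return 2;
    if (!tf_requests_pending(&f)) return 3;     /* drain would wait here */
    if (tf_timeout(&f) != -ETIMEDOUT) return 4; /* request times out */
    if (tf_requests_pending(&f)) return 5;      /* drain need not wait */
    if (tf_submit(&f) != -ETIMEDOUT) return 6;  /* new requests fail fast */
    tf_late_completion(&f);                     /* lower layer returns */
    if (tf_submit(&f) != 0) return 7;           /* pass-through resumes */
    return 0;
}
```

The point of keeping two counters is that the drain check looks only at
inflight, so it can finish while stuck is still non-zero, and stuck alone
decides whether new requests are passed on.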