On 02/06/2015 18:43, ronnie sahlberg wrote:
> If we change this to iSCSI, we can actually avoid this by using task
> management functions:
>       guest -> QEMU        write A to sector 1
>       QEMU -> iSCSI        write A to sector 1
>       ... timeout ...
>       QEMU -> iSCSI        task management: abort task for write A     (**A)
>       QEMU -> guest        write A to sector 1 timed out
>       guest -> QEMU        write B to sector 1         (**B)
> 
> I think that IF a task times out, and IF you then immediately send a
> task management abort task to the target before you tell the guest
> the I/O failed, then all should be good.

You still have to wait for the answer to the TMF, so this doesn't help
much. :-(

Paolo
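
For readers following the thread, the race under discussion (a timed-out
write A landing after a later write B to the same sector) can be replayed
with a toy model. This is only a sketch: the Sector class and replay()
function are hypothetical names made up for illustration, not QEMU,
libiscsi or libnfs code, and the model assumes that the storage simply
keeps whichever write arrives last.

```python
from dataclasses import dataclass, field


@dataclass
class Sector:
    """Toy model of one sector: the write that arrives last wins."""
    value: str = ""
    log: list = field(default_factory=list)

    def apply(self, payload: str) -> None:
        self.value = payload
        self.log.append(payload)


def replay(arrival_order):
    """Apply writes in the order the storage sees them.

    Returns (value on disk, value the guest believes it wrote).
    """
    sector = Sector()
    for payload in arrival_order:
        sector.apply(payload)
    # The guest saw A fail with a timeout and B complete successfully,
    # so it believes the sector holds B whatever the storage did.
    guest_view = "B"
    return sector.value, guest_view


# Write A is delayed in flight and reaches the storage after B.
on_disk, guest_view = replay(["B", "A"])
print(on_disk, guest_view)  # A B
```

If A is known to be dead (completed or aborted) before B is issued, the
arrival order is ["A", "B"] or just ["B"], and both views agree.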

> That should guarantee that **A is always sent to the target before **B,
> so the race should not happen.
> 
> 
> 
> 
>         At this point the two outstanding writes are for the same
>         sector and with different payloads, so it's undefined which one
>         wins.
> 
>               QEMU -> NFS          write B to sector 1
>               NFS -> QEMU          write B to sector 1 completed
>               QEMU -> guest        write B to sector 1 completed
>               NFS -> QEMU          write A to sector 1 completed
>                                    (QEMU doesn't report this to the guest)
> 
>         The guest thinks it has written B, but it's possible that the
>         storage has written A.
> 
> 
>     So you would go for infinite reconnecting? We can SIGKILL then anyway.
> 
>     As said before, my idea would be a default of 5000 ms for all sync
>     calls and no timeout for all async calls coming from the block layer.
> 
>     A user-settable timeout can optionally be specified via cmdline
>     options to define a timeout for both sync and async calls.
> 
> 
> Sounds sane to me.
> 
> As for infinite reconnect: I guess that since these disks are not
> exposed as "removable" to the guest, there is not really much recovery
> that the guest kernel can do if the disk goes away and never returns,
> so there might not be much point in not having infinite reconnect
> attempts.
> 
> 
> 

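
The timeout policy proposed in the quoted text (5000 ms default for sync
calls, no timeout for async calls from the block layer, and an optional
user-supplied value that applies to both) could be sketched roughly as
below. The function and constant names are hypothetical and do not
correspond to any existing QEMU option or API; rejecting negative values
is an extra assumption, not something stated in the thread.

```python
from typing import Optional

# Default proposed in the thread for synchronous calls.
DEFAULT_SYNC_TIMEOUT_MS = 5000


def effective_timeout_ms(is_sync: bool,
                         user_timeout_ms: Optional[int] = None) -> Optional[int]:
    """Return the timeout in milliseconds, or None for "wait forever"."""
    if user_timeout_ms is not None:
        # Assumption: a negative timeout is treated as a usage error.
        if user_timeout_ms < 0:
            raise ValueError("timeout must not be negative")
        # A user-supplied value overrides the default for sync and async.
        return user_timeout_ms
    # No user value: sync calls get the default, async calls never time out.
    return DEFAULT_SYNC_TIMEOUT_MS if is_sync else None


print(effective_timeout_ms(is_sync=True))          # 5000
print(effective_timeout_ms(is_sync=False))         # None
print(effective_timeout_ms(is_sync=False, user_timeout_ms=1000))  # 1000
```

With no timeout on async calls, a write that never completes is never
reported as failed to the guest, which avoids the late-completion race
at the cost of a potentially indefinite hang.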