Il 12/12/2012 16:25, Michael S. Tsirkin ha scritto: > On Wed, Dec 12, 2012 at 04:01:27PM +0100, Paolo Bonzini wrote: >> Il 12/12/2012 15:47, Michael S. Tsirkin ha scritto: >>>> Ok, so we need some API for virtio-{blk,scsi} to communicate back the >>>> indexes of in-flight requests to virtio. The indexes are known from the >>>> VirtQueueElement, so that's fine. >>>> >>>> Even better would be a virtio_save_request/virtio_load_request API... >>> >>> So you are saying this is a bug then? Great. >> >> I'm not sure what you mean by "it will never put the missing heads >> in the used ring". The serialized requests are put in the used rings >> when they are completed and virtio-{blk,scsi} calls virtqueue_push. Is >> the problem if you have a vring that looks like this: >> >> A A U A U U A A >> >> ? Which heads are leaked? {0,1}, {2} or {6,7}? Or a combination thereof? > > I don't know what A A U A U U A A means.
Right, not very clear.... It means that descriptor 2 and 4 and 5 are free, while the others are in-flight. > To make it work, complete all requests when vm is stopped. That's not a choice, sorry. >>> This is exactly what the assert above is out there to catch. >>> And you really can't fix it without breaking migration compatibility. >> >> Why not? The index in the vring is in the migration data. > > index is not enough if requests are outstanding. Sorry, I meant the descriptor index. > Pls check the example in the log of the patch. I'm likewise not sure what you meant by A 1 A 2 U 2 A 2 U 2 A 2 U 2 A 2 <--- U 2 If I understand it, before the point marked with the arrow, the avail ring is 1 2 2 2 vring_avail_idx(vq) == 3 last_avail_idx == 3 After, it is 2 2 2 2 vring_avail_idx(vq) == 4 last_avail_idx == 3 What's wrong with that? You wrote "the only way to know head 1 is outstanding is because backend has stored this info somewhere". But the backend _is_ tracking it (by serializing and then restoring the VirtQueueElement) and no leak happens because virtqueue_fill/flush will put the head on the used ring sooner or later. >>> As step 1, I think we should just complete all outstanding >>> requests when VM stops. >>> >>> Yes it means you can't do the retry hack after migration >>> but this is hardly common scenario. >> >> I disagree... > > Disagree with what? You are saying it's common? It's not common, but you cannot block migration because you have an I/O error. Solving the error may involve migrating the guests away from that host. Paolo