On Tue, Feb 08, 2022 at 04:44:33PM +0200, Nir Soffer wrote: > On Tue, Feb 8, 2022 at 3:27 PM Richard W.M. Jones <rjo...@redhat.com> wrote: > > virt-v2v 1.44.2: > > Complete log: http://oirase.annexia.org/tmp/virt-v2v-1.44.2-rhv-upload.log > > > > [ 63.1] Copying disk 1/1 > > ... > > transfer 7aecd359-3706-49f5-8a0c-c8799f7b100a is finalizing_success > > transfer 7aecd359-3706-49f5-8a0c-c8799f7b100a is finished_success > > transfer 7aecd359-3706-49f5-8a0c-c8799f7b100a finalized in 6.105 seconds > > [ 89.2] Creating output metadata > > [ 89.8] Finishing off > > > > virt-v2v 1.45.97: > > Complete log: http://oirase.annexia.org/tmp/virt-v2v-1.45.97-rhv-upload.log > > > > [ 71.0] Copying disk 1/1 > > [ 82.7] Creating output metadata > > ... > > transfer 6ea3d724-16f9-4bda-a33e-69a783480abc is finalizing_success > > transfer 6ea3d724-16f9-4bda-a33e-69a783480abc is finished_success > > transfer 6ea3d724-16f9-4bda-a33e-69a783480abc finalized in 61.552 seconds > > [ 144.9] Finishing off > > This happens because virt-v2v try to finalize the transfer *before* > closing the connections to imageio server. > > Current imageio release mark the a ticket as canceled, but will not > remove it if the ticket has open connections from clients. If the > clients are idle, the connection closes after 60 seconds, so when > engine tries again to remove the ticket, the operation succeeds.
Ah ha! I should be able to fix this easily enough ... Let me try a few things and then I'll come back to the rest of your message. Rich. > Upstream version improved this flow to wait only for ongoing requests. If > there > are no ongoing requests the ticket is removed immediately, ignoring the open > connections. The connections will be closed when the client closes the > connection > on when reading from the socket times out. > > You can test upstream version from here: > https://github.com/oVirt/ovirt-imageio/actions/runs/1811058094 > > But if you want to be compatible with released imageio version, you > must close the connections to imageio *before* finalizing the transfers. > > We already discussed this issue last year, when we handled the bug > when vddk blocks for many minutes during block status and imageio > connection is droped, and you opened a bug for this. > > > It's not a completely like-for-like comparison because the rhv-upload > > backend changed a lot between these versions. In particular if I was > > going to pick some suspicious change it would be my refactoring here > > which was supposed to be neutral but maybe isn't: > > > > > > https://github.com/libguestfs/virt-v2v/commit/143a22860216b94d3a81706193088d50c03fc35c > > > > Unfortunately this commit is almost impossible to revert because of > > deep restructuring of surrounding code. > > The refactoring is not the issue, the issue is that we do not terminate > the output nbdkit before finalizing. > > For some output we cannot do this since we want to query nbdkit for > more info (e.g block status) but for rhv upload we don't have anything > to do with nbkdit instance and we must close the connection before > we finalize, so we should close the output right after we finish the copy. > > > Another idea: > > > > Old virt-v2v uses qemu-img convert which does not flush by default. > > New virt-v2v uses nbdcopy with the --flush option, so it will call > > imageio PATCH /path ... "op":"flush" at the end. However removing > > nbdcopy --flush didn't help very much (only a couple of seconds off). > > Calling flush is right, we cannot remove it. Although there is bug in nbdkit, > and if flushes during close() flow even if you did not send a flush. > > > Do you have any other ideas? > > > > What exactly does imageio do during the finalizing_success state? > > The flow is this: > - engine send request to vdsm to delete the ticket > - vdsm connects to imageio control socket and send DELETE /tickets/{ticket-id} > - imageio mark the ticket as canceled, so no new request can succeed > and no new connection to attach to the ticket > - imageio mark all ongoing requests as canceled, so if the request is in > read/write loop, it will abort on the the next read/write complete > - upstream: imageio waits until all ongoing operations complete > - release: imageio waits until all connections are closed > - if the ticket could not be removed in 60 seconds, imageio returns 409 > Conflict > - imageio returns 200 OK > - engine retries the delete if it failed > - when engine succeeds and finish other finalization tasks, the image > transfer will > be reported as finished. > > I would expect the flow to time out on virt-v2v side, since we use 60 seconds > timeout, and when removing a ticket times out, it takes 60 seconds, so you > should not succeed to finalize in 61 seconds. There is a bug in ovirt > engine that cause this flow to succeed. > > Nir -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org _______________________________________________ Libguestfs mailing list Libguestfs@redhat.com https://listman.redhat.com/mailman/listinfo/libguestfs