On 03/19/2013 09:45 AM, Paolo Bonzini wrote:
This is because of downtime: You have to drain the queue anyway at the
very end, and if you don't drain it in advance after each iteration, then
the queue will have lots of bytes in it waiting for transmission and the
Virtual Machine will be stopped for a much longer period of time during
the last iteration waiting for the RDMA card to finish transmitting all
those bytes.
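
To make the downtime argument above concrete, here is a toy model in C.
It is not QEMU code: ToyQueue, toy_post, toy_progress, toy_drain and
toy_stopped_bytes are hypothetical names, and "bytes sent per iteration"
is an assumed stand-in for whatever the card manages to transmit in the
background while the guest keeps running.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not QEMU code: bytes queued on the RDMA card. */
typedef struct {
    uint64_t pending;   /* bytes posted but not yet transmitted */
} ToyQueue;

/* Queue more bytes for transmission. */
static void toy_post(ToyQueue *q, uint64_t bytes)
{
    q->pending += bytes;
}

/* Background transmission that happens while the guest keeps running. */
static void toy_progress(ToyQueue *q, uint64_t bytes)
{
    q->pending -= (bytes < q->pending) ? bytes : q->pending;
}

/* Block until the queue is empty; returns the bytes we had to wait for. */
static uint64_t toy_drain(ToyQueue *q)
{
    uint64_t waited = q->pending;
    q->pending = 0;
    return waited;
}

/*
 * Returns how many bytes must be drained while the VM is stopped.
 * All parameters are assumptions of this model.
 */
uint64_t toy_stopped_bytes(int iterations, uint64_t dirty_per_iter,
                           uint64_t sent_per_iter, uint64_t final_dirty,
                           int drain_each_iteration)
{
    ToyQueue q = { 0 };

    assert(iterations >= 0);
    for (int i = 0; i < iterations; i++) {
        toy_post(&q, dirty_per_iter);
        toy_progress(&q, sent_per_iter);
        if (drain_each_iteration) {
            (void)toy_drain(&q);    /* wait here, VM is still running */
        }
    }

    /* Last iteration: the VM is stopped from here on. */
    toy_post(&q, final_dirty);
    return toy_drain(&q);
}
```

In this model, draining after each iteration leaves only the final
iteration's bytes to wait for while the VM is stopped; without it, any
backlog left over from earlier iterations is added to the downtime.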
Shouldn't the "current chunk full" case take care of it too?
Of course if you disable chunking you have to add a different condition,
perhaps directly into save_rdma_page.
No, we don't want to flush on "chunk full" - that has a different meaning.
We want to have as many chunks submitted to the hardware for transmission
as possible to keep the bytes moving.
3. And also during qemu_savevm_state_complete(), again using qemu_fflush.
This would be caught by put_buffer, but (2) would not.
I'm not sure this is good enough either - we don't want to flush
the queue *frequently*, only when it's necessary for performance.
We do want the queue to have some meat to it so the hardware
can write bytes as fast as possible.
If we flush inside put_buffer (which is called very frequently):
Is it called at any time during RAM migration?
I don't understand the question: the flushing we've been discussing
is *only* for RAM migration - not for the non-live state.
I haven't introduced any "new" flushes for non-live state other than
when it's absolutely necessary to flush for RAM migration.
then we have no way to distinguish *where* put_buffer was called from
(either from qemu_savevm_state_complete() or from a device-level
function call that's using QEMUFile).
Can you make drain a no-op if there is nothing in flight? Then every
call to put_buffer after the first should not have any overhead.
Paolo
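
For reference, the "no-op if nothing is in flight" idea suggested above
could look roughly like the sketch below. This is a minimal illustration
under stated assumptions, not the actual patch: ToyRdma, its in_flight
counter, and the toy_rdma_* helpers are hypothetical, and a real drain
would poll the completion queue rather than decrement a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state, not the real RDMA migration structures. */
typedef struct {
    int in_flight;  /* posted writes that have not completed yet */
    int polls;      /* completions waited for (illustration only) */
} ToyRdma;

/* Record that one more write has been handed to the hardware. */
static void toy_rdma_post_write(ToyRdma *r)
{
    r->in_flight++;
}

/* Stand-in for waiting on a single completion. */
static void toy_rdma_poll_one(ToyRdma *r)
{
    assert(r->in_flight > 0);
    r->in_flight--;
    r->polls++;
}

/*
 * Drain the queue. Returns false without doing any work when nothing
 * is in flight, so repeated calls from a hot path stay cheap.
 */
bool toy_rdma_drain(ToyRdma *r)
{
    if (r->in_flight == 0) {
        return false;               /* nothing to wait for: no-op */
    }
    while (r->in_flight > 0) {
        toy_rdma_poll_one(r);
    }
    return true;
}
```

Note that this only removes the cost of redundant calls. It does not
tell the drain who the caller is, which is the objection raised in the
reply that follows.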
That still doesn't solve the problem: If there is nothing in flight,
then there is no reason to call qemu_fflush() in the first place.
This is why I avoided using fflush() in the beginning, because it
sort of "confuses" who is using it: from the perspective of fflush(),
you can't tell if the user is calling it for RAM or for non-live state.
The flushes we need are only for RAM, not the rest of it......
Make sense?