> > > First explanation, why I think this doesn't fix the full problem.
> > > With this patch, we fix the problem where we have a dirty block
> > > layer but basically nothing dirtying the memory on the guest (we are
> > > moving the 20 seconds from max_downtime for the block layer flush,
> > > to 20 seconds until we have decided that the amount of dirty memory
> > > is small enough to be transferred during max_downtime). But it is
> > > still going to take 20 seconds to flush the block layer, and during
> > > those 20 seconds, the amount of memory that can be dirtied is HUGE.
> >
> > It's true.
>
> What kind of cache is it actually that takes 20s to flush here?
>
I run a script in the guest which does a dd operation in a loop, like this:

#!/bin/sh
for i in {1..1000000}
do
    time dd if=/dev/zero of=/time.bdf bs=4k count=200000
    rm /time.bdf
done

It's an extreme case.