On 08/02/2018 12:29 PM, Dr. David Alan Gilbert wrote:
> * Denis V. Lunev (d...@openvz.org) wrote:
>> On 08/01/2018 09:55 PM, Dr. David Alan Gilbert wrote:
>>> * Denis V. Lunev (d...@openvz.org) wrote:
>>>> On 08/01/2018 08:40 PM, Dr. David Alan Gilbert wrote:
>>>>> * John Snow (js...@redhat.com) wrote:
>>>>>> On 08/01/2018 06:20 AM, Dr. David Alan Gilbert wrote:
>>>>>>> * John Snow (js...@redhat.com) wrote:
>>>>>>>
>>>>>>> <snip>
>>>>>>>
>>>>>>>> I'd rather do something like this:
>>>>>>>> - Always flush bitmaps to disk on inactivate.
>>>>>>> Does that increase the time taken by the inactivate measurably?
>>>>>>> If it's small relative to everything else that's fine; it's just I
>>>>>>> always worry a little since I think this happens after we've stopped
>>>>>>> the CPU on the source, so is part of the 'downtime'.
>>>>>>>
>>>>>>> Dave
>>>>>>> --
>>>>>>> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
>>>>>>>
>>>>>> I'm worried that if we don't, we're leaving behind unusable, partially
>>>>>> complete files. That's a bad design and we shouldn't push for it just
>>>>>> because it's theoretically faster.
>>>>> Oh I don't care about theoretical speed; but if it's actually unusably
>>>>> slow in practice then it needs fixing.
>>>>>
>>>>> Dave
>>>> This is not "theoretical" speed. This is real practical speed and
>>>> instability.
>>>> EACH IO operation can be unpredictably slow, and thus with IO
>>>> operations in mind you cannot even calculate or predict the downtime,
>>>> which should be done according to the migration protocol.
>>> We end up doing some IO anyway, even ignoring these new bitmaps:
>>> at the end of the migration, when we pause the CPU, we do a
>>> bdrv_inactivate_all to flush any outstanding writes; so we've already
>>> got that unpredictable slowness.
>>>
>>> So, not being a block person, but with some interest in making sure
>>> downtime doesn't increase, I just wanted to understand whether the
>>> amount of writes we're talking about here is comparable to that
>>> which already exists, or a lot smaller, or a lot larger.
>>> If the amount of IO you're talking about is much smaller than what
>>> we typically already do, then John has a point and you may as well
>>> do the write.
>>> If the amount of IO for the bitmap is much larger and would slow
>>> the downtime a lot, then you've got a point and that would be
>>> unworkable.
>>>
>>> Dave
>> This is not a theoretical difference.
>>
>> For a 1 Tb drive and 64 kb bitmap granularity, the size of the bitmap
>> is 2 Mb + some metadata (64 Kb). Thus we will have to write
>> 2 Mb of data per bitmap.
> OK, this was about my starting point; I think your Mb here is Byte not
> Bit; so assuming a drive of 200MByte/s, that's 2/200 = 1/100th of a
> second = 10ms; now 10ms I'd say is small enough not to worry about
> downtime increases, since the number we normally hope for is in the
> 300ms ish range.
>
>> In some cases there are 2-3-5 bitmaps,
>> thus we will have 10 Mb of data.
> OK, remembering I'm not a block person, can you just explain why
> you need 5 bitmaps?
> But with 5 bitmaps that's 50ms, and that's starting to get worrying.
>
>> With a 16 Tb drive the amount of
>> data to write is multiplied by 16, which gives 160 Mb to
>> write. More disks and bigger sizes - more data to write.
> Yeh, and that's going on for a second and way too big.
>
> (Although it feels like you could fix that by adding bitmaps on your
> bitmaps hierarchically so you didn't write them all; but that's
> getting way more complex.)
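To make the arithmetic above easier to follow, here is a small
back-of-the-envelope sketch. It is not QEMU code; the 64 kb granularity
and the 200 MByte/s drive speed are the figures assumed in the quoted
messages, not measurements, and the helper names are made up for the
illustration.

# One bit of dirty bitmap per granularity-sized cluster.
def bitmap_bytes(disk_size, granularity=64 * 1024):
    clusters = -(-disk_size // granularity)   # ceil division
    return -(-clusters // 8)                  # bits -> bytes, rounded up

# Time to write nbytes sequentially at the assumed drive speed, in ms.
def flush_ms(nbytes, throughput=200 * 1024 * 1024):
    return nbytes / throughput * 1000

TB = 1024 ** 4
one_tb = bitmap_bytes(1 * TB)        # ~2 MiB per bitmap
sixteen_tb = bitmap_bytes(16 * TB)   # ~32 MiB per bitmap

print(flush_ms(one_tb))              # 1 Tb disk, 1 bitmap:   ~10 ms
print(flush_ms(5 * one_tb))          # 1 Tb disk, 5 bitmaps:  ~50 ms
print(flush_ms(5 * sixteen_tb))      # 16 Tb disk, 5 bitmaps: ~800 ms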
>
>> The above amount should be multiplied by 2 - x Mb to be written
>> on the source, x Mb to be read on the target - which gives 320 Mb
>> in total.
>>
>> That is why this is not good - we have a linear increase with the
>> size and the number of disks.
>>
>> There are also some thoughts on normal guest IO. Theoretically
>> we could replay IO on the target and close the file immediately,
>> or block writes to changed areas and notify the target upon IO
>> completion, or invent other fancy dances.
>> At least we are thinking right now about these optimizations for
>> regular migration paths.
>>
>> The problem is that such things are not needed for CBT right now,
>> but they will become necessary, and yet pretty much useless, once
>> this stuff is introduced.
> I don't quite understand the last two paragraphs.
We are thinking right now about how to eliminate the delay from regular
guest IO during migration. There are some ideas and internal work in
progress. That is why I am worried.
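Putting the quoted totals in one place - a small sketch, using the
illustrative 16 Tb, 5-bitmap case from above and counting both the write
on the source and the read on the target:

MiB = 1024 ** 2
per_bitmap = 32 * MiB        # one bitmap of a 16 Tb disk at 64 kb granularity
bitmaps_per_disk = 5

per_side = bitmaps_per_disk * per_bitmap   # written on the source: 160 MiB
total = 2 * per_side                       # plus the read on the target
print(per_side // MiB, total // MiB)       # -> 160 320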
> However, coming back to my question; it was really saying that
> normal guest IO during the end of the migration will cause
> a delay; I'm expecting that to be fairly unrelated to the size
> of the disk and more to do with the workload; so I guess in your
> case the worry is big disks giving big bitmaps.
Exactly!

Den