So here's my understanding: "-rtc base=" specifies the RTC value when the guest starts. This value is only used by qemu_get_timedate, and most RTCs only call it on startup or reset. However, there are exceptions (the PC RTC's host clock notifier, the ds1338's set time functionality, and all reads of m41t80/m48t59/twl92230) and these cause the bug.

On 19/07/19 14:36, Dr. David Alan Gilbert wrote:
> d) The host clock jump detection (b) is broken - it correctly detects
> backwards jumps; but it's detection of a forward jump is based
> on two readings of the host clock being more than 60s apart - but
> often on a qemu running a Linux guest the host clock isn't read at all;
> so reading hwclock, waiting a minute and reading it again will trigger
> the jump code.

Oops. Back when the detection was added, there were two QEMU_CLOCK_HOST
timers firing every second so the clock jump detection happened promptly.
These timers were then removed as a power-saving optimization, and that
broke the jump detection.

> 1) Tell people to do what libvirt does and specify base= differently
> on the dest.

This is racy; the user does not have a good way to know the exact base on
the destination.

> 2) Migrate the offset value such that the base= on the destination
> is ignored

At least on some RTCs the offset is already being migrated indirectly.
For example on x86 the (base_rtc, last_update) pair might be usable to
reconstruct the offset?

> 3) Fix the host clock jump detection
>
> (3) is probably independent - the easiest fix would seem to be just
> to set a timer to read the host clock at say 20 second intervals
> which is wasteful but would avoid the false trigger.
>
> Is (2) worth it or do we just go with (1) - I'm tempted to just
> specify the behaviour.
>
> Mind you, we could kill the host clock jump detection code - only
> the mc148618 registers on the notifier for it - so presumably
> aarch/ppc/s390 etc dont see it.

I would just remove the host clock jump detection code. IIUC that should
fix your bug so you don't even need to do the above-mentioned
reconstruction of the offset (let's call it 2b) in the PC RTC.

That still leaves the problem that the base goes out of sync on migration
on m41t80/m48t59/twl92230. For that, I think that the simplest thing to do
would be to fix those to store and migrate the offset themselves just like
all other RTC implementations.

Paolo