On Fri, 12 Apr 2019 13:44:07 +0800 Wei Yang <richardw.y...@linux.intel.com> wrote:
> On Tue, Apr 09, 2019 at 04:54:15PM +0200, Igor Mammedov wrote: > >> > >> Let's see a normal hotplug case first. > >> > >> I did one test to see the how a guest with hot-plug cpu migrate to > >> destination. It looks the migration fails if I just do hot-plug at > >> source. So I have to do hot-plug both at source and at destination. > >> This > >> will expand the table_mr both at source and destination. > >> > >> Then let's see the effect of hotplug more devices to exceed original padded > >> size. There are two cases, before re-sizable MemoryRegion and after. > >> > >> 1) Before re-sizable MemoryRegion introduced > >> > >> Before re-sizable MemoryRegion introduced, we just pad table_mr to 4K. > >> And > >> this size never gets bigger, if I am right. To be accurate, the > >> table_blob > >> would grow to next padded size if we hot-add more cpus/pci bridge, but > >> we > >> just copy the original size of MemoryRegion. Even without migration, > >> the > >> ACPI table is corrupted when we expand to next padded size. > >> > >> Is my understanding correct here? > >> > >> 2) After re-sizable MemoryRegion introduced > >> > >> This time both tabl_blob and MemoryRegion grows when it expand to next > >> padded size. Since we need to hot-add device at both side, ACPI table > >> grows at the same pace. > >> > >> Every thing looks good, until one of it exceed the resizable > >> MemoryRegion's max size. (Not sure this is possible in reality, while > >> possible in theory). Actually, this looks like case 1) when resizable > >> MemoryRegion is not introduced. The too big ACPI table get corrupted. > >> > >> So if my understanding is correct, the procedure you mentioned "expand from > >> initial padded size to next padded size" only applies to two different max > >> size resizable MemoryRegion. For other cases, the procedure corrupt the > >> ACPI > >> table itself. > >> > >> Then when we look at > >> > >> commit 07fb61760cdea7c3f1b9c897513986945bca8e89 > >> Author: Paolo Bonzini <pbonz...@redhat.com> > >> Date: Mon Jul 28 17:34:15 2014 +0200 > >> > >> pc: hack for migration compatibility from QEMU 2.0 > >> > >> This fix ACPI migration issue before resizable MemoryRegion is > >> introduced(introduced in 2015-01-08). This looks expand to next padded size > >> always corrupt ACPI table at that time. And it make me think expand to next > >> padded size is not the procedure we should do? > >> > >> And my colleague Wei Wang(in cc) mentioned, to make migration succeed, the > >> MemoryRegion has to be the same size at both side. So I guess the problem > >> doesn't lie in hotplug but in "main table" size difference. > > > >It's true only for pre-resizable MemoryRegion QEMU versions, > >after that size doesn't affect migration anymore. > > > > > >> For example, we have two version of Qemu: v1 and v2. Their "main table" > >> size > >> is: > >> > >> v1: 3990 > >> v2: 4020 > >> > >> At this point, their ACPI table all padded to 4k, which is the same. > >> > >> Then we create a machine with 1 more vcpu by these two versions. This will > >> expand the table to: > >> > >> v1: 4095 > >> v2: 4125 > >> > >> After padding, v1's ACPI table size is still 4k but v2's is 8k. Now the > >> migration is broken. > >> > >> If this analysis is correct, the relationship between migration failure and > >> ACPI table is "the change of ACPI table size". Any size change of any > >you should make distinction between used_length and max_length here. > >Migration puts on wire used_length and that's what matter for keeping > >migration > >working. > > > >> ACPI table would break migration. While of course, since we pad the table, > >> only some combinations of tables would result in a visible real size > >> change in > >> MemoryRegion. > >> > >> Then the principle for future ACPI development is to keep all ACPI table > >> size > >> unchanged. > >once again it applies only for QEMU (versions < 2.1) and that was > >the problem (i.e. there always would be configurations that would create > >differently sized tables regardless of arbitrary size we would preallocate) > >resizable MemoryRegions solved. > > > >> Now let's back to mcfg table. As the comment mentioned, guest could > >> enable/disable MCFG, so the code here reserve table no matter it is > >> enabled or > >> not. This behavior ensures ACPI table size not changed. So do we need to > >> find > >> the machine type as you suggested before? > >We should be able to drop mcgf 'padding' hack since machine version > >which was introduced in the QEMU version that introduced resizable > >MemoryRegion > >as well. > > > >I'll send a patch to address that > > Hi, Igor, > > We have found the qemu version 2.1 which is with resizable MemoryRegion > enabled and q35 will stop support version before 2.3. The concern about ACPI > mcfg table breaking live migration is solved, right? > > If so, I would prepare mcfg refactor patch based on your cleanup. yes, just base your patches on top of "[PATCH for-4.1] q35: acpi: do not create dummy MCFG table"