On 9/29/21 10:12 AM, Thomas Huth wrote:
On 07/09/2021 14.45, Pierre Morel wrote:
On 9/7/21 9:32 AM, Thomas Huth wrote:
On 22/07/2021 19.42, Pierre Morel wrote:
We use new objects to have a dynamic administration of the CPU
topology.
The highest level object is the S390 book. In a first implementation
I didn't spot any migration related code in here ... is this already
migration-safe?
Not sure at all.
The topology may change at any moment and we interpret PTF, the
instruction which tells us if the topology changed.
Obviously the topology on the target may not be the same as on the
source.
So what I propose is to disable topology change during the migration:
- on migration start, disable PTF interpretation and block the
topology_change_report in the emulation.
- on migration end, re-enable PTF interpretation and unblock the
emulation.
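
The two steps above can be sketched as a small state model. This is only
an illustration under stated assumptions, not QEMU code: the struct and
all function names are hypothetical, and the emulated check is reduced to
a boolean "change reported / no change" result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, not QEMU code: every name here is invented. */
struct topo_state {
    bool ptf_interpreted;  /* PTF is interpreted rather than emulated */
    bool report_blocked;   /* emulation must not report a change */
    bool change_pending;   /* a topology change is not yet reported */
};

/* Migration start: disable interpretation and block the report. */
static void topo_migration_start(struct topo_state *s)
{
    assert(s != NULL);
    s->ptf_interpreted = false;
    s->report_blocked = true;
}

/* Migration end: unblock the report and re-enable interpretation. */
static void topo_migration_end(struct topo_state *s)
{
    assert(s != NULL);
    s->report_blocked = false;
    s->ptf_interpreted = true;
}

/*
 * Emulated "did the topology change?" check. Returns true if a change
 * is reported to the guest. While blocked, it always answers "no
 * change" and leaves the pending change in place for later.
 */
static bool topo_emulate_ptf_check(struct topo_state *s)
{
    assert(s != NULL);
    if (s->report_blocked || !s->change_pending) {
        return false;
    }
    s->change_pending = false;
    return true;
}
```

In this model a change that happens during the migration is not lost: it
stays pending and is reported by the first check after migration end.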
If, as discussed with David for KVM, we do not emulate PTF for hosts
without stfl(11), we can make it even simpler in QEMU by always
reporting "no change" for PTF 2 in the emulation.
Note that the Linux kernel, even though the topology can change at any
moment, only polls once a minute to check for topology changes, so I
guess we can ignore the optimization during the migration.
What do you think?
I don't have much clue, this topology stuff is still mostly a black box
to me - so there is no interrupt or something similar presented to the
guest when the topology changes? The guest really has to poll for
changes? ... that sounds like a weird design to me... if the guest polls
too frequently, it wastes cycles due to the polling - but if it polls
not often enough, it could run for a while with wrong topology information?
Yes, it is so.
There is no interrupt for a topology change, no event or any other
notification. I guess the overhead has been considered too high.
I guess that a change to the topology is done when (1) a CPU that exists
in the configuration and used to be running is moved to the stopped
state, so its environment can be safely migrated to another CPU for the
next scheduling slice, or (2) a CPU is added to or removed from the
configuration.
I also guess that it would be nice to get the information in the guest
when it needs to make scheduling decisions.
I had a try at this but it was not done right and I must think a little
bit more about it. Currently the Linux kernel polls every minute using
PTF(2), which is sped up to every 100ms in case the admin voluntarily
changes the topology.
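
The polling scheme described above can be modelled roughly as follows.
This is a sketch and not the kernel's code: the names are hypothetical,
and the number of fast polls after an admin-triggered change is an
assumption chosen by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Intervals taken from the description above, in milliseconds. */
enum { POLL_SLOW_MS = 60 * 1000, POLL_FAST_MS = 100 };

/* Hypothetical poller state, invented for this sketch. */
struct topo_poller {
    int fast_polls_left;  /* remaining polls at the fast interval */
};

/* Called when the admin changes the topology: poll faster for a while. */
static void topo_expect_change(struct topo_poller *p, int fast_polls)
{
    assert(p != NULL && fast_polls >= 0);
    p->fast_polls_left = fast_polls;
}

/* Returns the delay before the next PTF(2) poll. */
static int topo_next_poll_delay_ms(struct topo_poller *p)
{
    assert(p != NULL);
    if (p->fast_polls_left > 0) {
        p->fast_polls_left--;
        return POLL_FAST_MS;
    }
    return POLL_SLOW_MS;
}
```

With no change expected, every call returns the one-minute interval.
After topo_expect_change(&p, n), the next n calls return 100ms before the
poller falls back to the slow interval.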
Pierre
Thomas
--
Pierre Morel
IBM Lab Boeblingen