Gleb Natapov wrote:
It's not a matter of correctness IMHO.

Ah now I see what you mean. New FW has to be used after reboot because old
one may not be able to init HW any longer.

I think we have two ways to view firmware. The first would be to treat guest firmware as part of the guest. What that means is that we should store all firmware in an nvram file, migrate the nvram file during migration, and provide a mechanism for the user to individually upgrade/replace the firmware. We would never replace it from underneath the guest. We only put things in nvram the first time it is initialized. After that, it's entirely based on what the guest does.

On the one hand, this simplifies live migration, gives the user more control over when firmware changes happen, and gives us a very clear set of rules to follow with respect to compatibility. On the other hand, it means we must provide backwards compatibility at the fw_cfg level and it means that any firmware upgrade has to be initiated by the user. That means it's very likely that a user will not upgrade their firmware when upgrading qemu. We could version the nvram and if we detect a guest image being moved between versions (without the use of -M xxx) we could automatically clear the nvram, causing new firmware to be loaded. Stable fixes for firmware though would require explicit action on the part of the guest. This could be considered an advantage depending on your perspective.
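
A rough sketch of the nvram versioning idea, for illustration only. Nothing here is existing qemu code: the Nvram record, its version field, and the machine_pinned flag (standing in for "the user passed -M xxx") are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Nvram:
    # Hypothetical layout: the qemu version that populated the image,
    # plus the firmware blob the guest boots from.
    version: str
    firmware: bytes


def select_nvram(nvram: Optional[Nvram], qemu_version: str,
                 bundled_fw: bytes, machine_pinned: bool) -> Nvram:
    """Decide which nvram contents the guest starts with."""
    if nvram is None:
        # First initialization: the only time we populate nvram ourselves.
        return Nvram(qemu_version, bundled_fw)
    if nvram.version != qemu_version and not machine_pinned:
        # Image moved between versions without -M: clear the nvram so
        # the new firmware gets loaded.
        return Nvram(qemu_version, bundled_fw)
    # Otherwise never replace the firmware underneath the guest.
    return nvram
```

With this policy a pinned machine type keeps its old firmware indefinitely, which is exactly the case where stable firmware fixes need explicit action.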

The other option would be to treat guest firmware as part of the machine state. The problem with this approach is that the firmware maintains state within the guest that qemu has no visibility into. This makes it impossible for us to upgrade the firmware unless the guest is at a very well defined point in time (IOW, start-up or reset). This has very different semantics than the rest of our machine and that's concerning.
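
To make the "only at start-up or reset" constraint concrete, here is a minimal sketch under the assumption that the firmware is modelled as an opaque blob. The Machine class and its method names are hypothetical and do not correspond to anything in qemu.

```python
class Machine:
    """Toy model: firmware is machine state, but swappable only at reset."""

    def __init__(self, firmware: bytes):
        self.active_fw = firmware   # what the running guest booted with
        self.pending_fw = None      # newer firmware waiting for a reset

    def migrate_to(self, host_fw: bytes) -> None:
        # The running guest keeps its firmware across migration, because
        # we cannot see the state the firmware holds inside the guest.
        if host_fw != self.active_fw:
            self.pending_fw = host_fw

    def reset(self) -> None:
        # Reset is the well-defined point where a swap is safe.
        if self.pending_fw is not None:
            self.active_fw = self.pending_fw
            self.pending_fw = None
```

The sketch also shows the support concern: after a migration, the firmware the guest sees changes at the next reset without any action from the user.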

I'm honestly on the fence about the two. The first mechanism (nvram) is very appealing in that it matches hardware. Having to rely on a guest to update its firmware is concerning though. The second mechanism is appealing from an ease of use perspective but the semantics of doing a live migration and getting a different firmware after reset/shutdown is rather scary from a support perspective.

I can understand an argument for predictability wrt wanting to be
able to guarantee that after the first reboot during a live
migration, you'll get the new firmware.  I'm not sure that's less
predictable than hard shutdown/start-up and I'm not sure I can
really make an argument for one way vs. the other.

system_reset _is_ hard shutdown/start-up. If it is not, it is a bug; we
are just arguing about whether the same applies when a migration was done
between boot and reset.

It's not and it will never be completely. By this logic, we should close the file descriptors for the block devices and try to reopen them. It's certainly possible that someone does a mv of the block device and replaces it under the covers.

Likewise, if a user upgrades the firmware independently of the qemu version, it would need to reread the firmware from disk each boot.

Regards,

Anthony Liguori

