On 02/17/17 17:03, Laszlo Ersek wrote:
> On 02/17/17 16:33, Ben Warren wrote:
>>
>>> On Feb 17, 2017, at 2:43 AM, Igor Mammedov <imamm...@redhat.com> wrote:
>>>
>>> On Thu, 16 Feb 2017 15:15:36 -0800
>>> b...@skyportsystems.com wrote:
>>>
>>>> From: Ben Warren <b...@skyportsystems.com>
>>>>
>>>> This implements the VM Generation ID feature by passing a 128-bit
>>>> GUID to the guest via a fw_cfg blob.
>>>> Any time the GUID changes, an ACPI notify event is sent to the guest
>>>>
>>>> The user interface is a simple device with one parameter:
>>>> - guid (string, must be "auto" or in UUID format
>>>>   xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
>>> I've given it some testing with WS2012R2 and v4 patches for Seabios,
>>>
>>> Windows is able to read the initial GUID allocation, and writeback
>>> seems to work:
>>>
>>> (qemu) info vm-generation-id
>>> c109c09b-0e8b-42d5-9b33-8409c9dcd16c
>>>
>>> The vmgenid client in Windows reads it as the following two 64-bit
>>> integers:
>>> 42d50e8bc109c09b:6cd1dcc90984339b
>>>
>>> However, the update path (restore from snapshot) doesn't work.
>>> Here is how I've tested it:
>>>
>>> qemu-system-x86_64 -device vmgenid,id=testvgid,guid=auto -monitor stdio
>>> (qemu) info vm-generation-id
>>> c109c09b-0e8b-42d5-9b33-8409c9dcd16c
>>> (qemu) stop
>>> (qemu) migrate "exec:gzip -c > STATEFILE.gz"
>>> (qemu) quit
>>>
>>> qemu-system-x86_64 -device vmgenid,id=testvgid,guid=auto -monitor stdio
>>> -incoming "exec: gzip -c -d STATEFILE.gz"
>>> (qemu) info vm-generation-id
>>> 28b587fa-991b-4267-80d7-9cf28b746fe9
>>>
>>> The guest:
>>> 1. doesn't get the GPE notification that it should receive
>>> 2. the vmgenid client in Windows still reads the same value
>>>      42d50e8bc109c09b:6cd1dcc90984339b
>>>
>> Strange, this was working for me, but with a slightly different test method:
>>
>>   * I use virsh save/restore
> 
> Awesome, this is actually what I should try. All my guests are managed by
> libvirt (with the occasional <qemu:arg>, for development), and direct
> QEMU monitor commands such as
> 
>   virsh qemu-monitor-command ovmf.rhel7 --hmp 'info vm-generation-id'
> 
> only work for me if they are reasonably non-intrusive.
> 
>>   * While I do later testing with Windows, during development I use a
>>     Linux kernel module I wrote that keeps track of GUID and
>>     notifications.  I’m happy to share this with you if interested.
> 
> Please do. If you have a public git repo somewhere, that would be
> awesome. (Bonus points if the module builds out-of-tree, if the
> kernel-devel package is installed.)
> 
> NB: while the set-id monitor command was part of the series, I did test
> it to the extent that I checked the SCI ("ACPI interrupt") count in the
> guest, in /proc/interrupts. I did see it increase, so minimally the SCI
> injection was fine.
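
(As for the tracking module mentioned above: I imagine something roughly
like the minimal sketch below -- this is only my guess at the approach,
not Ben's actual code. The "QEMUVGID" _HID and the 4.x-era acpi_driver
callbacks are assumptions on my part, and the sketch merely logs and
counts the ACPI notifications; it does not read the GUID buffer.)

/*
 * vmgenid_notify_test.c - hypothetical minimal sketch, not Ben's module.
 * Binds to the vmgenid ACPI device and logs/counts its notifications.
 * Written against ~4.x kernel headers (kernel-devel).
 */
#include <linux/module.h>
#include <linux/acpi.h>

static unsigned long notify_count;

/* Assumption: the QEMU patches expose the device with _HID "QEMUVGID". */
static const struct acpi_device_id vmgenid_test_ids[] = {
    { "QEMUVGID", 0 },
    { "", 0 },
};
MODULE_DEVICE_TABLE(acpi, vmgenid_test_ids);

static int vmgenid_test_add(struct acpi_device *device)
{
    dev_info(&device->dev, "vmgenid device found\n");
    return 0;
}

static int vmgenid_test_remove(struct acpi_device *device)
{
    return 0;
}

/* QEMU raises a device-specific Notify (0x80, IIRC) on each GUID change. */
static void vmgenid_test_notify(struct acpi_device *device, u32 event)
{
    notify_count++;
    dev_info(&device->dev, "notify 0x%02x, count %lu\n",
             event, notify_count);
}

static struct acpi_driver vmgenid_test_driver = {
    .name  = "vmgenid_test",
    .class = "vmgenid_test",
    .ids   = vmgenid_test_ids,
    .ops   = {
        .add    = vmgenid_test_add,
        .remove = vmgenid_test_remove,
        .notify = vmgenid_test_notify,
    },
};
module_acpi_driver(vmgenid_test_driver);

MODULE_LICENSE("GPL");

An out-of-tree build of such a module needs nothing beyond the usual
one-line kbuild Makefile (obj-m += vmgenid_notify_test.o) and
"make -C /lib/modules/$(uname -r)/build M=$PWD", with kernel-devel
installed.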

So, I did some testing with a RHEL-7 guest. I passed '-device
vmgenid,guid=auto' to QEMU using the <qemu:arg> element in the domain XML.

(1) I started the guest normally, and grepped /proc/interrupts for
"acpi". Zero interrupts on either VCPU.

(2) Dumped the guest RAM to a file with "virsh dump ... --memory-only",
opened it with crash, and listed the 16 GUID bytes at the offset that
the firmware (OVMF) reported at startup.

(3) cycled through "virsh managedsave" and "virsh start"

(4) grepped /proc/interrupts again for "acpi". One interrupt had been
delivered to one of the VCPUs, all others were zero.

(5) Repeated step (2). The bytes listed this time were different.

(6) Issued "virsh qemu-monitor-command ovmf.rhel7 --hmp 'info
vm-generation-id'", and compared the output against the bytes dumped
(with crash) from guest memory in step (5). They were a match.

So, to me it seems like the SCI is injected, and the memory contents are
changed.

---*---

Windows Server 2012 R2 test:

(7) booted the guest similarly with '-device vmgenid,guid=auto' via
<qemu:arg> in the domain XML.

(8) Initial check from the host side:

$ virsh qemu-monitor-command ovmf.win2012r2.q35 \
    --hmp 'info vm-generation-id'
a3f7c334-7dc4-4694-8b8f-abf52abb072f

(9) Verified the same from within the guest, using Vadim's program
(note: I logged into the VM with ssh, using Cygwin's SSHD in the guest):

$ ./vmgenid.exe
VmCounterValue: 46947dc4a3f7c334:2f07bb2af5ab8f8b
0x34 0xc3 0xf7 0xa3 0xc4 0x7d 0x94 0x46 0x8b 0x8f 0xab 0xf5 0x2a 0xbb
0x07 0x2f

This is a match, so the initial setup works. (Look only at the raw byte
dump in the second line -- it matches the Little Endian UUID
representation as specified in the SMBIOS spec!)
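
To make the byte order explicit, here is a small throwaway C sketch of
mine (not part of the series, and not one of Vadim's tools) that
re-derives the values above from the textual UUID. It assumes the guest
buffer holds the GUID with the first three fields byte-swapped (the
little-endian "wire" layout), that the Windows side reads the buffer as
two little-endian 64-bit integers, and that the host running it is
little-endian:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void)
{
    /* "a3f7c334-7dc4-4694-8b8f-abf52abb072f", from step (8) above */
    uint32_t time_low = 0xa3f7c334;
    uint16_t time_mid = 0x7dc4, time_hi = 0x4694;
    uint8_t  rest[8]  = { 0x8b, 0x8f, 0xab, 0xf5, 0x2a, 0xbb, 0x07, 0x2f };
    uint8_t  buf[16];

    /* the first three fields are stored byte-swapped (little-endian) */
    buf[0] = time_low;       buf[1] = time_low >> 8;
    buf[2] = time_low >> 16; buf[3] = time_low >> 24;
    buf[4] = time_mid;       buf[5] = time_mid >> 8;
    buf[6] = time_hi;        buf[7] = time_hi >> 8;
    /* clock_seq and node keep the order in which they are printed */
    memcpy(buf + 8, rest, 8);

    /* raw byte dump, as vmgenid.exe shows it */
    for (int i = 0; i < 16; i++)
        printf("0x%02x ", buf[i]);
    printf("\n");

    /* the two 64-bit values, read little-endian (assumes an LE host) */
    uint64_t lo, hi;
    memcpy(&lo, buf, 8);
    memcpy(&hi, buf + 8, 8);
    printf("VmCounterValue: %llx:%llx\n",
           (unsigned long long)lo, (unsigned long long)hi);
    return 0;
}

Compiled and run, it should print the same byte dump and the same
46947dc4a3f7c334:2f07bb2af5ab8f8b pair as vmgenid.exe does above.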

(10) Logged out of the guest (with ssh), cycled through "virsh
managedsave" and "virsh start" for the domain, logged back in.

(11) in the guest:

$ ./vmgenid.exe
VmCounterValue: 4a12296b382162da:6c00d1a52699b7bd
0xda 0x62 0x21 0x38 0x6b 0x29 0x12 0x4a 0xbd 0xb7 0x99 0x26 0xa5 0xd1
0x00 0x6c

(12) on the host:

$ virsh qemu-monitor-command ovmf.win2012r2.q35 \
      --hmp 'info vm-generation-id'
382162da-296b-4a12-bdb7-9926a5d1006c

This is again a match. (Again, look only at the raw byte dump from
vmgenid.exe under (11), and consider the BE/LE conversion for the first
three segments!)

(13) Logged out of the guest with ssh, and started Vadim's other program
(vmgenid_wait.exe), this time from a normal CMD window on the GUI. The
program started, reproduced the above output (seen under (11)), and then
went to sleep (waiting).

(14) cycled through "virsh managedsave" and "virsh start" for the domain.

(15) The domain resumed, and Vadim's vmgenid_wait.exe woke up, printing
(manual transcript):

VmCounterValue changed to: 495ba7807ed37772:195d0cff681f7a7

Please refer to the following screenshot:

http://people.redhat.com/~lersek/vmgenid-dd1f68c5-89b0-4458-84fa-de9e3d23f4cb/Screenshot_ovmf.win2012r2.q35_2017-02-17_20:34:41.png

(16) on the host:

$ virsh qemu-monitor-command ovmf.win2012r2.q35 \
      --hmp 'info vm-generation-id'

7ed37772-a780-495b-a7f7-81f6cfd09501

This is again a match. It is not easy to see, because Vadim's
"vmgenid_wait.exe" does not print the raw byte dump after it wakes up;
the raw byte dump is only printed before it goes to sleep. After wakeup,
it only dumps the composed values. Somewhat more confusingly, the 64-bit
hex integers are not zero-padded, so we have to make up for that manually.

So here goes:

   [A]      [B]  [C]  [D]  [E]
   7ed37772-a780-495b-a7f7-81f6cfd09501 (from the host)

   [C]  [B]  [A]        [E']         [D']
   495b a780 7ed37772 : 0195d0cff681 f7a7 (from vmgenid_wait.exe)
                        ^
                        zero padding added manually

The parts marked with an apostrophe (') are reversed, byte-wise.
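
To avoid doing this reconstruction by hand every time, the following
little sketch (again mine, purely illustrative, under the same
little-endian assumptions as before) parses a UUID as printed by "info
vm-generation-id" and emits the two 64-bit values, zero-padded:

#include <stdio.h>
#include <stdint.h>

int main(int argc, char **argv)
{
    unsigned a, b, c, d, e[6];

    if (argc != 2 ||
        sscanf(argv[1], "%8x-%4x-%4x-%4x-%2x%2x%2x%2x%2x%2x",
               &a, &b, &c, &d,
               &e[0], &e[1], &e[2], &e[3], &e[4], &e[5]) != 10) {
        fprintf(stderr, "usage: %s xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx\n",
                argv[0]);
        return 1;
    }

    /*
     * First value: the first three UUID fields. Byte-swapped storage in
     * guest memory plus a little-endian read cancel out, so the printed
     * field values simply concatenate, highest field on top.
     */
    uint64_t lo = (uint64_t)c << 48 | (uint64_t)b << 32 | a;

    /*
     * Second value: clock_seq and node keep their printed byte order in
     * memory, so a little-endian read reverses them byte-wise.
     */
    uint64_t hi = 0;
    for (int i = 5; i >= 0; i--)
        hi = hi << 8 | e[i];
    hi = hi << 16 | (d >> 8) | ((d & 0xff) << 8);

    /* zero-padded, unlike the %llx-style output of vmgenid_wait.exe */
    printf("%016llx:%016llx\n",
           (unsigned long long)lo, (unsigned long long)hi);
    return 0;
}

Feeding it the UUID from (16) should print
495ba7807ed37772:0195d0cff681f7a7, i.e. the step (15) output once the
missing leading zero is restored.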

So, I'm going to have to declare this "working by design".

Confirming my earlier Tested-by (same patches as before):

Tested-by: Laszlo Ersek <ler...@redhat.com>

What could be the difference between Igor's setup and mine? Perhaps the
BIOS? (Again, I used OVMF.)

The "managedsave" command of virsh boils down to (see
"src/qemu/qemu_driver.c" in the libvirt source):

qemuDomainManagedSave()
  qemuDomainSaveInternal()
    qemuProcessStopCPUs()
    qemuDomainSaveMemory()
      qemuDomainSaveHeader()
      qemuMigrationToFile()
        qemuMonitorMigrateToFd()
          ...
    qemuProcessStop()

I capture all traffic between libvirt and the QEMU monitor, and between
libvirt and the QEMU guest agent, in the libvirt log file, as a rule, so
I can paste the relevant lines:

Libvirt sending the file descriptor to QEMU:

2017-02-17 19:31:54.305+0000: 16586: debug :
qemuMonitorJSONCommandWithFd:296 : Send command
'{"execute":"getfd","arguments":{"fdname":"migrate"},"id":"libvirt-30"}'
for write with FD 26

Libvirt starting the migration:

2017-02-17 19:31:54.306+0000: 16586: debug :
qemuMonitorJSONCommandWithFd:296 : Send command
'{"execute":"migrate","arguments":{"detach":true,"blk":false,"inc":false,"uri":"fd:migrate"},"id":"libvirt-31"}'
for write with FD -1

Then loading it:

2017-02-17 19:32:02.083+0000: 16585: debug :
qemuMonitorJSONCommandWithFd:296 : Send command
'{"execute":"migrate-incoming","arguments":{"uri":"fd:25"},"id":"libvirt-17"}'
for write with FD -1

I don't have the slightest idea why it failed for Igor -- I can only
suspect the SeaBIOS patches.

Note that in the SeaBIOS discussion, Ben mentioned that he wasn't
actually seeing the fw_cfg writes in QEMU on the S3 resume path, despite
SeaBIOS attempting them. So, perhaps there is a bug in the latest
SeaBIOS patches that prevents fw_cfg writes completely, even on the
normal boot path? That would be consistent with Igor's results: the
initial download works (hence the first GUID can be seen), but the
update does not work (because QEMU never received the address).

Thanks
Laszlo
