[libvirt] [PATCH go-xml] Added Storage Pool and Storage Volume XML schemes.

2017-01-08 Thread Alexey Slaykovsky
Signed-off-by: Alexey Slaykovsky 
---
 storage_encryption.go |  50 ++
 storage_pool.go   | 121 +
 storage_pool_test.go  | 240 
 storage_vol.go|  83 +
 storage_vol_test.go   | 246 ++
 5 files changed, 740 insertions(+)
 create mode 100644 storage_encryption.go
 create mode 100644 storage_pool.go
 create mode 100644 storage_pool_test.go
 create mode 100644 storage_vol.go
 create mode 100644 storage_vol_test.go

diff --git a/storage_encryption.go b/storage_encryption.go
new file mode 100644
index 000..f66314c
--- /dev/null
+++ b/storage_encryption.go
@@ -0,0 +1,50 @@
+/*
+ * This file is part of the libvirt-go-xml project
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ * Copyright (C) 2017 Red Hat, Inc.
+ *
+ */
+
+package libvirtxml
+
+type StorageEncryptionSecret struct {
+   Type string `xml:"type,attr"`
+   UUID string `xml:"uuid,attr"`
+}
+
+type StorageEncryptionCipher struct {
+   Name string `xml:"name,attr"`
+   Size uint64 `xml:"size,attr"`
+   Mode string `xml:"mode,attr"`
+   Hash string `xml:"hash,attr"`
+}
+
+type StorageEncryptionIvgen struct {
+   Name string `xml:"name,attr"`
+   Hash string `xml:"hash,attr"`
+}
+
+type StorageEncryption struct {
+   Format string   `xml:"format,attr"`
+   Secret *StorageEncryptionSecret `xml:"secret"`
+   Cipher *StorageEncryptionCipher `xml:"cipher"`
+   Ivgen  *StorageEncryptionIvgen  `xml:"ivgen"`
+}
diff --git a/storage_pool.go b/storage_pool.go
new file mode 100644
index 000..3481361
--- /dev/null
+++ b/storage_pool.go
@@ -0,0 +1,121 @@
+/*
+ * This file is part of the libvirt-go-xml project
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ *
+ * Copyright (C) 2017 Red Hat, Inc.
+ *
+ */
+
+package libvirtxml
+
+import "encoding/xml"
+
+type StoragePoolTargetPermissions struct {
+   Owner string `xml:"owner,omitempty"`
+   Group string `xml:"group,omitempty"`
+   Mode  string `xml:"mode,omitempty"`
+   Label string `xml:"label,omitempty"`
+}
+
+type StoragePoolTargetTimestamps struct {
+   Atime string `xml:"atime"`
+   Mtime string `xml:"mtime"`
+   Ctime string `xml:"ctime"`
+}
+
+type StoragePoolTarget struct {
+   Pathstring`xml:"path,omitempty"`
+   Permissions *StoragePoolTargetPermissions `xml:"permissions"`
+   Timestamps  *StoragePoolTargetTimestamps  `xml:"timestamps"`
+   Encryption  *StorageEncryption`xml:"encryption"`
+}
+
+type StoragePoolSourceFormat struct {
+   Type string `xml:"type,attr"`
+}
+type StoragePoolSourceHost struct {
+   Name string `xml:"name,attr"`
+}
+
+type StoragePoolSourceDevice struct {
+   Path  string `xml:"path,attr"`
+  

Re: [libvirt] [PATCH] conf: Parse virtio-crypto in the domain XML

2017-01-08 Thread Gonglei (Arei)
>
> Sent: Monday, January 09, 2017 10:15 AM
> To: libvir-list@redhat.com
> Cc: Wubin (H); Zhoujian (jay, Euler); Gonglei (Arei); berra...@redhat.com;
> longpeng
> Subject: [PATCH] conf: Parse virtio-crypto in the domain XML
> 
> As virtio-crypto has been supported in QEMU 2.9 and

Take a quick glance:

s/QEMU 2.9/QEMU 2.8/

> the frontend driver has been merged in linux 4.10,
> so it's necessary to support virtio-crypto in domain
> XML.
> 
> This patch parse the domain XML with virtio-crypto
> support, the virtio-crypto XML looks like this:
> 
> 
> 
> 
> 
> Signed-off-by: Longpeng(Mike) 
> ---

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] libvirt support for mediated devices

2017-01-08 Thread Laine Stump
VFIO's new mediated device interface  "is used for allowing 
software-defined devices to be exposed through VFIO while the host 
driver manages access to the interface" (quoted from 
http://www.phoronix.com/scan.php?page=news_item&px=VFIO-Linux-4.10-Mediated 
). Now that the support for mediated devices has been added to the 
upstream Linux kernel, there is a stable API that libvirt can use to 
support assigning mediated devices (e.g. virtual GPUs) to Qemu/KVM 
guests (or presumably any other hypervisor that support device 
assignment via VFIO.


We've had a few private discussions about what should be added to 
libvirt, and now have enough rough ideas to start discussing it on the list.


The major requirements we've come up with so far (in what I think is a 
reasonable order of implementation) are:


1) The ability to assign an already-created mediated device to a guest 
(think of "" mode for assigning regular PCI 
devices).


2) reporting of the capabilities of a mediated device "parent" 
(including, for example, the supported types and maximum number of child 
devices that are supported, and the names of all existing child devices) 
and of existing child devices (via the node device APIs, e.g. virsh 
nodedev-list and virsh nodedev-dumpxml)


3) The ability to create and destroy mediated devices via the NodeDevice 
API. (similar in function to the "virsh detach-device and virsh 
attach-device commands - i.e. they make a device ready to be assigned to 
a guest using , but have no persistent config and no 
"auto-start" capability).


4) Support for "managed" mediated devices - libvirt will create a new 
child device as required, and destroy it when it's no longer needed 
(similar to the way that standard PCI hostdevs are (when managed="yes") 
detached from their host driver and attached to vfio-pci as needed) (I 
think this is less useful than item (5), but is simpler and may be a 
good way to test all the preceding additions (as well as being useful in 
some simpler configurations).


5) The ability to create and manage "pools" of mediated devices, with 
persistent config and an auto-start capability so that the device pools 
are automatically created when the host is booted (this will require 
either some form of persistent config and lifecycle management to be 
added to the nodedevice driver, or a new libvirt driver type with 
functionality similar to storage pools, but used to manage pools of mdev 
child devices).


=

Going back to the beginning, with slightly more detail:

1) "Unmanaged" mediated device assignment - assigning an existing device 
to a virtual machine


This will assume that the desired child device has already been created, 
and can be found in /sys/bus/mdev/devices/$UUID. Here's a first attempt 
at what the XML would look like:


/
  


 

 

In the past, the "type" attribute of hostdev described the type on both 
the host and the guest. With mediated devices, the device isn't visible 
on the host as a PCI device, but just as a software device. So the type 
attribute in  now gives the type of the device on the guest, 
and the device type on the host is determined from the contents of .


Erik had a different suggestion for this (which I think he's already 
working on patches for) - that the type attribute in  should be 
the type of the device in the *host*, and the type in the guest would be 
that given in the . Something like this I think:







 

(Is this correct, Erik?)

(I arrived at my suggestion by the thinking that, in other places where 
there are similar attributes for the host and guest side, e.g. the IP 
addresses and routes that can be added on both the host and guest side 
of an , everything related to the host side is in the 
 subelement, while things related to the guest are directly 
under the toplevel of the device element. On the other hand, the 
"managed" attribute isn't something related to the guest, but to the 
host, and his idea has less redundancy, so maybe he's onto something...)


(NB: a mediated device could be exposed to the guest as a PCI device, a 
CCW device, or anything else supported by vfio. The type of device that 
the guest will see can be determined from the contents of 
mdev_supported_types//device_api under the parent device's 
directory in sysfs (it will be, e.g., "vfio-pci" or "vfio-ccw"). But 
libvirt assigns guest-side addresses at the time a domain is defined, 
and it's possible that the mdev child device won't be created yet at 
define time (and therefore we won't know which parent device it's 
associated with, and so we won't be able to look at device_api). In such 
situations, it will be up to management to know something about the 
device it will be creating and assume a type. Fortunately this is a 
reasonably safe thing to do - on x86 platforms we can be fairly certain 
that the device will be a PCI device. (And, because this also makes a

[libvirt] [PATCH] conf: Parse virtio-crypto in the domain XML

2017-01-08 Thread Longpeng(Mike)
As virtio-crypto has been supported in QEMU 2.9 and
the frontend driver has been merged in linux 4.10,
so it's necessary to support virtio-crypto in domain
XML.

This patch parse the domain XML with virtio-crypto
support, the virtio-crypto XML looks like this:





Signed-off-by: Longpeng(Mike) 
---
 src/conf/domain_conf.c | 212 +
 src/conf/domain_conf.h |  32 +++
 src/qemu/qemu_alias.c  |  20 
 src/qemu/qemu_alias.h  |   4 +
 src/qemu/qemu_capabilities.c   |   4 +
 src/qemu/qemu_capabilities.h   |   2 +
 src/qemu/qemu_command.c| 129 +
 src/qemu/qemu_command.h|   4 +
 src/qemu/qemu_domain_address.c |  17 
 src/qemu/qemu_driver.c |   6 ++
 src/qemu/qemu_hotplug.c|   1 +
 11 files changed, 431 insertions(+)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 9f7b906..fcfccd5 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -237,6 +237,7 @@ VIR_ENUM_IMPL(virDomainDevice, VIR_DOMAIN_DEVICE_LAST,
   "memballoon",
   "nvram",
   "rng",
+  "crypto",
   "shmem",
   "tpm",
   "panic",
@@ -797,6 +798,14 @@ VIR_ENUM_IMPL(virDomainRNGBackend,
   "random",
   "egd");
 
+VIR_ENUM_IMPL(virDomainCryptoModel,
+  VIR_DOMAIN_CRYPTO_MODEL_LAST,
+  "virtio");
+
+VIR_ENUM_IMPL(virDomainCryptoBackend,
+  VIR_DOMAIN_CRYPTO_BACKEND_LAST,
+  "builtin");
+
 VIR_ENUM_IMPL(virDomainTPMModel, VIR_DOMAIN_TPM_MODEL_LAST,
   "tpm-tis")
 
@@ -2337,6 +2346,9 @@ void virDomainDeviceDefFree(virDomainDeviceDefPtr def)
 case VIR_DOMAIN_DEVICE_RNG:
 virDomainRNGDefFree(def->data.rng);
 break;
+case VIR_DOMAIN_DEVICE_CRYPTO:
+virDomainCryptoDefFree(def->data.crypto);
+break;
 case VIR_DOMAIN_DEVICE_CHR:
 virDomainChrDefFree(def->data.chr);
 break;
@@ -2600,6 +2612,10 @@ void virDomainDefFree(virDomainDefPtr def)
 virDomainRNGDefFree(def->rngs[i]);
 VIR_FREE(def->rngs);
 
+for (i = 0; i < def->ncryptos; i++)
+virDomainCryptoDefFree(def->cryptos[i]);
+VIR_FREE(def->cryptos);
+
 for (i = 0; i < def->nmems; i++)
 virDomainMemoryDefFree(def->mems[i]);
 VIR_FREE(def->mems);
@@ -3169,6 +3185,8 @@ virDomainDeviceGetInfo(virDomainDeviceDefPtr device)
 return &device->data.shmem->info;
 case VIR_DOMAIN_DEVICE_RNG:
 return &device->data.rng->info;
+case VIR_DOMAIN_DEVICE_CRYPTO:
+return &device->data.crypto->info;
 case VIR_DOMAIN_DEVICE_TPM:
 return &device->data.tpm->info;
 case VIR_DOMAIN_DEVICE_PANIC:
@@ -3464,6 +3482,12 @@ virDomainDeviceInfoIterateInternal(virDomainDefPtr def,
 if (cb(def, &device, &def->rngs[i]->info, opaque) < 0)
 return -1;
 }
+device.type = VIR_DOMAIN_DEVICE_CRYPTO;
+for (i = 0; i < def->ncryptos; i++) {
+device.data.crypto = def->cryptos[i];
+if (cb(def, &device, &def->cryptos[i]->info, opaque) < 0)
+return -1;
+}
 if (def->nvram) {
 device.type = VIR_DOMAIN_DEVICE_NVRAM;
 device.data.nvram = def->nvram;
@@ -3541,6 +3565,7 @@ virDomainDeviceInfoIterateInternal(virDomainDefPtr def,
 case VIR_DOMAIN_DEVICE_PANIC:
 case VIR_DOMAIN_DEVICE_LAST:
 case VIR_DOMAIN_DEVICE_RNG:
+case VIR_DOMAIN_DEVICE_CRYPTO:
 case VIR_DOMAIN_DEVICE_MEMORY:
 break;
 }
@@ -4599,6 +4624,7 @@ virDomainDeviceDefValidateInternal(const 
virDomainDeviceDef *dev,
 case VIR_DOMAIN_DEVICE_MEMBALLOON:
 case VIR_DOMAIN_DEVICE_NVRAM:
 case VIR_DOMAIN_DEVICE_RNG:
+case VIR_DOMAIN_DEVICE_CRYPTO:
 case VIR_DOMAIN_DEVICE_SHMEM:
 case VIR_DOMAIN_DEVICE_TPM:
 case VIR_DOMAIN_DEVICE_PANIC:
@@ -12022,6 +12048,88 @@ virDomainRNGDefParseXML(xmlNodePtr node,
 }
 
 
+static virDomainCryptoDefPtr
+virDomainCryptoDefParseXML(xmlNodePtr node,
+   xmlXPathContextPtr ctxt,
+   unsigned int flags)
+{
+char *model = NULL;
+char *backend = NULL;
+char *queues = NULL;
+virDomainCryptoDefPtr def;
+xmlNodePtr save = ctxt->node;
+xmlNodePtr *backends = NULL;
+int nbackends;
+
+if (VIR_ALLOC(def) < 0)
+return NULL;
+
+if (!(model = virXMLPropString(node, "model"))) {
+virReportError(VIR_ERR_XML_ERROR, "%s",
+   _("missing Crypto device model"));
+goto error;
+}
+
+if ((def->model = virDomainCryptoModelTypeFromString(model)) < 0) {
+virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+   _("unknown Crypto model '%s'"), model);
+goto error;
+}
+
+ctxt->node = node;
+
+if ((nbackends = virXPathNodeSet("./backend", ctxt, &backends)) < 0)
+goto error;
+
+if (nbackends != 

Re: [libvirt] Proposal PCI/PCIe device placement on PAPR guests

2017-01-08 Thread David Gibson
On Fri, Jan 06, 2017 at 12:57:58PM +0100, Greg Kurz wrote:
> On Thu, 5 Jan 2017 16:46:18 +1100
> David Gibson  wrote:
> 
> > There was a discussion back in November on the qemu list which spilled
> > onto the libvirt list about how to add support for PCIe devices to
> > POWER VMs, specifically 'pseries' machine type PAPR guests.
> > 
> > Here's a more concrete proposal for how to handle part of this in
> > future from the libvirt side.  Strictly speaking what I'm suggesting
> > here isn't intrinsically linked to PCIe: it will make adding PCIe
> > support sanely easier, as well as having a number of advantages for
> > both PCIe and plain-PCI devices on PAPR guests.
> > 
> > Background:
> > 
> >  * Currently the pseries machine type only supports vanilla PCI
> >buses.
> > * This is a qemu limitation, not something inherent - PAPR guests
> >   running under PowerVM (the IBM hypervisor) can use passthrough
> >   PCIe devices (PowerVM doesn't emulate devices though).
> > * In fact the way PCI access is para-virtalized in PAPR makes the
> >   usual distinctions between PCI and PCIe largely disappear
> >  * Presentation of PCIe devices to PAPR guests is unusual
> > * Unlike x86 - and other "bare metal" platforms, root ports are
> >   not made visible to the guest. i.e. all devices (typically)
> >   appear as though they were integrated devices on x86
> > * In terms of topology all devices will appear in a way similar to
> >   a vanilla PCI bus, even PCIe devices
> >* However PCIe extended config space is accessible
> > * This means libvirt's usual placement of PCIe devices is not
> >   suitable for PAPR guests
> >  * PAPR has its own hotplug mechanism
> > * This is used instead of standard PCIe hotplug
> > * This mechanism works for both PCIe and vanilla-PCI devices
> > * This can hotplug/unplug devices even without a root port P2P
> >   bridge between it and the root "bus
> >  * Multiple independent host bridges are routine on PAPR
> > * Unlike PC (where all host bridges have multiplexed access to
> >   configuration space) PCI host bridges (PHBs) are truly
> >   independent for PAPR guests (disjoint MMIO regions in system
> >   address space)
> > * PowerVM typically presents a separate PHB to the guest for each
> >   host slot passed through
> > 
> > The Proposal:
> > 
> > I suggest that libvirt implement a new default algorithm for placing
> > (i.e. assigning addresses to) both PCI and PCIe devices for (only)
> > PAPR guests.
> > 
> > The short summary is that by default it should assign each device to a
> > separate vPHB, creating vPHBs as necessary.
> > 
> >   * For passthrough sometimes a group of host devices can't be safely
> > isolated from each other - this is known as a (host) Partitionable
> > Endpoint (PE).  In this case, if any device in the PE is passed
> > through to a guest, the whole PE must be passed through to the
> > same vPHB in the guest.  From the guest POV, each vPHB has exactly
> > one (guest) PE.
> >   * To allow for hotplugged devices, libvirt should also add a number
> > of additional, empty vPHBs (the PAPR spec allows for hotplug of
> > PHBs, but this is not yet implemented in qemu).  When hotplugging
> > a new device (or PE) libvirt should locate a vPHB which doesn't
> > currently contain anything.
> >   * libvirt should only (automatically) add PHBs - never root ports or
> > other PCI to PCI bridges
> > 
> > In order to handle migration, the vPHBs will need to be represented in
> > the domain XML, which will also allow the user to override this
> > topology if they want.
> > 
> > Advantages:
> > 
> > There are still some details I need to figure out w.r.t. handling PCIe
> > devices (on both the qemu and libvirt sides).  However the fact that
> 
> One such detail may be that PCIe devices should have the
> "ibm,pci-config-space-type" property set to 1 in the DT,
> for the driver to be able to access the extended config
> space.


Right.

> > PAPR guests don't typically see PCIe root ports means that the normal
> > libvirt PCIe allocation scheme won't work.  This scheme has several
> > advantages with or without support for PCIe devices:
> > 
> >  * Better performance for 32-bit devices
> > 
> > With multiple devices on a single vPHB they all must share a (fairly
> > small) 32-bit DMA/IOMMU window.  With separate PHBs they each have a
> > separate window.  PAPR guests have an always-on guest visible IOMMU.
> > 
> >  * Better EEH handling for passthrough devices
> > 
> > EEH is an IBM hardware-assisted mechanism for isolating and safely
> > resetting devices experiencing hardware faults so they don't bring
> > down other devices or the system at large.  It's roughly similar to
> > PCIe AER in concept, but has a different IBM specific interface, and
> > works on both PCI and PCIe devices.
> > 
> > Currently the kernel interfaces for handling EEH eve

Re: [libvirt] Proposal PCI/PCIe device placement on PAPR guests

2017-01-08 Thread David Gibson
On Fri, Jan 06, 2017 at 06:34:29PM +0100, Andrea Bolognani wrote:
> [Added Laine to CC, fixed qemu-devel address]
> 
> On Thu, 2017-01-05 at 16:46 +1100, David Gibson wrote:
> [...]
> >   * To allow for hotplugged devices, libvirt should also add a number
> > of additional, empty vPHBs (the PAPR spec allows for hotplug of
> > PHBs, but this is not yet implemented in qemu).
> 
> "A number" here will have to mean "one", same number of
> empty PCIe Root Ports libvirt will add to a newly-defined
> q35 guest.

Umm.. why?

> > When hotplugging
> > a new device (or PE) libvirt should locate a vPHB which doesn't
> > currently contain anything.
> 
> This will need to be a PHB-specific behavior, because at the
> moment libvirt will happily pick one of the empty slots in
> an existing PHB.

Exactly.  Well, whether it's PHB model specific or machine type
specific is up to you really.  We can only have PAPR PHBs on a PAPR
machine type, so it's kind of arbitrary.

> 
> >   * libvirt should only (automatically) add PHBs - never root ports or
> > other PCI to PCI bridges
> > 
> > In order to handle migration, the vPHBs will need to be represented in
> > the domain XML, which will also allow the user to override this
> > topology if they want.
> 
> We'll have to decide how to represent them in the XML, but
> that's basically your average bikeshedding.

Right.  Maybe we'd best get started with it, in the hopes of finishing
it in the forseeable future.

> Overall, the plan seems entirely reasonable to me.
> 
> It's pretty clear at this point that pseries guest are
> different enough in their handling of PCI that none of
> the address allocation algorithms currently implemented
> in libvirt could be quite adapted to work with it, so
> a custom one is in order.

Yes, that was my conclusion as well.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

[libvirt] Exporting kvm_max_guest_tsc_khz to userspace (was Re: [PATCH 4/4] kvm: Allow migration with invtsc)

2017-01-08 Thread Eduardo Habkost
On Fri, Jan 06, 2017 at 08:31:27AM -0200, Marcelo Tosatti wrote:
[...]
> > > 
> > > > Maybe your use case just needs a explicit "invtsc-migration=on"
> > > > command-line flag without a mechanism to abort migration on
> > > > mismatch? I can't tell.
> > > 
> > > Again, there is no special case.
> > > 
> > > > Note that even if we follow your suggestion and implement an
> > > > alternative version of patch 4/4 to cover your use case, I will
> > > > strongly recommend libvirt developers to support configuring TSC
> > > > frequency if they want to support invtsc + migration without
> > > > surprising/unpredictable restrictions on migratability.
> > > 
> > > Well, alright. If you make the TSC frequency of the host
> > > available to mgmt software as you describe, and write the steps mgmt
> > > software should take, i'm good.
> > 
> > I plan to. The problem is that the mechanism to query the host
> > frequency may be unavailable in the first version.
> 
> Well just export KVM_GET_TSC_KHZ in a QMP command right? Its pretty
> easy.
> 
> Let me know if you need any help coding or testing.

I just found out that KVM doesn't provide something that QEMU and
libvirt need: the value of kvm_max_guest_tsc_khz. Without it, we
have no way to know if a given VM is really migratable to a host.

Could we add a KVM_CAP_MAX_TSC_KHZ capability for that?

-- 
Eduardo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH 0/5] Use non-blacklisted family/model/stepping for Haswell CPU model

2017-01-08 Thread Eduardo Habkost
Oops, there are 3 actual mistakes I missed among the false
positives, below:

On Sun, Jan 08, 2017 at 11:47:21AM -0800, no-re...@patchew.org wrote:
[...]
> === OUTPUT BEGIN ===
> Checking PATCH 1/5: i386: Add explicit array size to 
> x86_cpu_vendor_words2str()...
> ERROR: line over 90 characters
> #27: FILE: target/i386/cpu.c:172:
> +static void x86_cpu_vendor_words2str(char dst[static (CPUID_VENDOR_SZ + 1)], 
> uint32_t vendor1,

False positive.

> 
> ERROR: space prohibited between function name and open parenthesis '('
> #27: FILE: target/i386/cpu.c:172:
> +static void x86_cpu_vendor_words2str(char dst[static (CPUID_VENDOR_SZ + 1)], 
> uint32_t vendor1,

False positive.

> 
> total: 2 errors, 0 warnings, 8 lines checked
> 
> Your patch has style problems, please review.  If any of these errors
> are false positives report them to the maintainer, see
> CHECKPATCH in MAINTAINERS.
> 
> Checking PATCH 2/5: i386: host_vendor_fms() helper function...
> ERROR: line over 90 characters
> #20: FILE: target/i386/cpu.c:685:
> +void host_vendor_fms(char vendor[static (CPUID_VENDOR_SZ + 1)], int *family, 
> int *model, int *stepping)

Oops, will fix it.

> 
> ERROR: space prohibited between function name and open parenthesis '('
> #20: FILE: target/i386/cpu.c:685:
> +void host_vendor_fms(char vendor[static (CPUID_VENDOR_SZ + 1)], int *family, 
> int *model, int *stepping)

False positive.

> 
> ERROR: storage class should be at the beginning of the declaration
> #20: FILE: target/i386/cpu.c:685:
> +void host_vendor_fms(char vendor[static (CPUID_VENDOR_SZ + 1)], int *family, 
> int *model, int *stepping)
> 

False positive.

> ERROR: line over 90 characters
> #57: FILE: target/i386/cpu.c:1599:
> +host_vendor_fms(host_cpudef.vendor, &host_cpudef.family, 
> &host_cpudef.model, &host_cpudef.stepping);

Will fix it.

> 
> ERROR: line over 90 characters
> #69: FILE: target/i386/cpu.h:1426:
> +void host_vendor_fms(char vendor[static (CPUID_VENDOR_SZ + 1)], int *family, 
> int *model, int *stepping);

Will fix it.

> 
> ERROR: space prohibited between function name and open parenthesis '('
> #69: FILE: target/i386/cpu.h:1426:
> +void host_vendor_fms(char vendor[static (CPUID_VENDOR_SZ + 1)], int *family, 
> int *model, int *stepping);

False positive.

> 
> ERROR: storage class should be at the beginning of the declaration
> #69: FILE: target/i386/cpu.h:1426:
> +void host_vendor_fms(char vendor[static (CPUID_VENDOR_SZ + 1)], int *family, 
> int *model, int *stepping);

False positive.

> 
> total: 7 errors, 0 warnings, 50 lines checked
> 
> Your patch has style problems, please review.  If any of these errors
> are false positives report them to the maintainer, see
> CHECKPATCH in MAINTAINERS.
> 
> Checking PATCH 3/5: i386/kvm: Blacklist TSX on known broken hosts...
> Checking PATCH 4/5: pc: Add 2.9 machine-types...
> Checking PATCH 5/5: i386: Change stepping of Haswell to non-blacklisted 
> value...
> === OUTPUT END ===
> 
> Test command exited with code: 1
> 
> 
> ---
> Email generated automatically by Patchew [http://patchew.org/].
> Please send your feedback to patchew-de...@freelists.org

-- 
Eduardo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 2/5] i386: host_vendor_fms() helper function

2017-01-08 Thread Eduardo Habkost
Helper function for code that needs to check the host CPU
vendor/family/model/stepping values.

Signed-off-by: Eduardo Habkost 
---
 target/i386/cpu.h |  1 +
 target/i386/cpu.c | 28 
 2 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index a7f2f6099d..0f4a9d412b 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1423,6 +1423,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,
 void cpu_clear_apic_feature(CPUX86State *env);
 void host_cpuid(uint32_t function, uint32_t count,
 uint32_t *eax, uint32_t *ebx, uint32_t *ecx, uint32_t *edx);
+void host_vendor_fms(char vendor[static (CPUID_VENDOR_SZ + 1)], int *family, 
int *model, int *stepping);
 
 /* helper.c */
 int x86_cpu_handle_mmu_fault(CPUState *cpu, vaddr addr,
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 25b802bb06..9dbb2d98da 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -682,6 +682,25 @@ void host_cpuid(uint32_t function, uint32_t count,
 *edx = vec[3];
 }
 
+void host_vendor_fms(char vendor[static (CPUID_VENDOR_SZ + 1)], int *family, 
int *model, int *stepping)
+{
+uint32_t eax, ebx, ecx, edx;
+
+host_cpuid(0x0, 0, &eax, &ebx, &ecx, &edx);
+x86_cpu_vendor_words2str(vendor, ebx, edx, ecx);
+
+host_cpuid(0x1, 0, &eax, &ebx, &ecx, &edx);
+if (family) {
+*family = ((eax >> 8) & 0x0F) + ((eax >> 20) & 0xFF);
+}
+if (model) {
+*model = ((eax >> 4) & 0x0F) | ((eax & 0xF) >> 12);
+}
+if (stepping) {
+*stepping = eax & 0x0F;
+}
+}
+
 /* CPU class name definitions: */
 
 #define X86_CPU_TYPE_SUFFIX "-" TYPE_X86_CPU
@@ -1574,17 +1593,10 @@ static void host_x86_cpu_class_init(ObjectClass *oc, 
void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(oc);
 X86CPUClass *xcc = X86_CPU_CLASS(oc);
-uint32_t eax = 0, ebx = 0, ecx = 0, edx = 0;
 
 xcc->kvm_required = true;
 
-host_cpuid(0x0, 0, &eax, &ebx, &ecx, &edx);
-x86_cpu_vendor_words2str(host_cpudef.vendor, ebx, edx, ecx);
-
-host_cpuid(0x1, 0, &eax, &ebx, &ecx, &edx);
-host_cpudef.family = ((eax >> 8) & 0x0F) + ((eax >> 20) & 0xFF);
-host_cpudef.model = ((eax >> 4) & 0x0F) | ((eax & 0xF) >> 12);
-host_cpudef.stepping = eax & 0x0F;
+host_vendor_fms(host_cpudef.vendor, &host_cpudef.family, 
&host_cpudef.model, &host_cpudef.stepping);
 
 cpu_x86_fill_model_id(host_cpudef.model_id);
 
-- 
2.11.0.259.g40922b1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 5/5] i386: Change stepping of Haswell to non-blacklisted value

2017-01-08 Thread Eduardo Habkost
glibc blacklists TSX on Haswell CPUs with model==60 and
stepping < 4. To make the Haswell CPU model more useful, make
those guests actually use TSX by changing CPU stepping to 4.

References:
* glibc commit 2702856bf45c82cf8e69f2064f5aa15c0ceb6359
  
https://sourceware.org/git/?p=glibc.git;a=commit;h=2702856bf45c82cf8e69f2064f5aa15c0ceb6359

Signed-off-by: Eduardo Habkost 
---
 include/hw/i386/pc.h | 6 +-
 target/i386/cpu.c| 2 +-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 230e9e70c5..fcd9b4f541 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -376,7 +376,11 @@ bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
 
 #define PC_COMPAT_2_8 \
 HW_COMPAT_2_8 \
-
+{\
+.driver   = "Haswell-" TYPE_X86_CPU,\
+.property = "stepping",\
+.value= "1",\
+},
 
 #define PC_COMPAT_2_7 \
 HW_COMPAT_2_7 \
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 9dbb2d98da..003de7e74f 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1190,7 +1190,7 @@ static X86CPUDefinition builtin_x86_defs[] = {
 .vendor = CPUID_VENDOR_INTEL,
 .family = 6,
 .model = 60,
-.stepping = 1,
+.stepping = 4,
 .features[FEAT_1_EDX] =
 CPUID_VME | CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX |
 CPUID_CLFLUSH | CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA |
-- 
2.11.0.259.g40922b1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 4/5] pc: Add 2.9 machine-types

2017-01-08 Thread Eduardo Habkost
Cc: "Michael S. Tsirkin" 
Cc: Laszlo Ersek 
Cc: Igor Mammedov 
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* Added extra backslash to PC_COMPAT_2_8 definition
  * Suggested-by: Laszlo Ersek 
---
 include/hw/i386/pc.h |  2 ++
 hw/i386/pc_piix.c| 15 ---
 hw/i386/pc_q35.c | 13 +++--
 3 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index b22e699c46..230e9e70c5 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -375,6 +375,8 @@ int e820_get_num_entries(void);
 bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
 
 #define PC_COMPAT_2_8 \
+HW_COMPAT_2_8 \
+
 
 #define PC_COMPAT_2_7 \
 HW_COMPAT_2_7 \
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 5e1adbe53c..9f102aa388 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -437,13 +437,24 @@ static void pc_i440fx_machine_options(MachineClass *m)
 m->default_display = "std";
 }
 
-static void pc_i440fx_2_8_machine_options(MachineClass *m)
+static void pc_i440fx_2_9_machine_options(MachineClass *m)
 {
 pc_i440fx_machine_options(m);
 m->alias = "pc";
 m->is_default = 1;
 }
 
+DEFINE_I440FX_MACHINE(v2_9, "pc-i440fx-2.9", NULL,
+  pc_i440fx_2_9_machine_options);
+
+static void pc_i440fx_2_8_machine_options(MachineClass *m)
+{
+pc_i440fx_2_9_machine_options(m);
+m->is_default = 0;
+m->alias = NULL;
+SET_MACHINE_COMPAT(m, PC_COMPAT_2_8);
+}
+
 DEFINE_I440FX_MACHINE(v2_8, "pc-i440fx-2.8", NULL,
   pc_i440fx_2_8_machine_options);
 
@@ -451,8 +462,6 @@ DEFINE_I440FX_MACHINE(v2_8, "pc-i440fx-2.8", NULL,
 static void pc_i440fx_2_7_machine_options(MachineClass *m)
 {
 pc_i440fx_2_8_machine_options(m);
-m->is_default = 0;
-m->alias = NULL;
 SET_MACHINE_COMPAT(m, PC_COMPAT_2_7);
 }
 
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index d042fe0843..dd792a8547 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -301,19 +301,28 @@ static void pc_q35_machine_options(MachineClass *m)
 m->max_cpus = 288;
 }
 
-static void pc_q35_2_8_machine_options(MachineClass *m)
+static void pc_q35_2_9_machine_options(MachineClass *m)
 {
 pc_q35_machine_options(m);
 m->alias = "q35";
 }
 
+DEFINE_Q35_MACHINE(v2_9, "pc-q35-2.9", NULL,
+   pc_q35_2_9_machine_options);
+
+static void pc_q35_2_8_machine_options(MachineClass *m)
+{
+pc_q35_2_9_machine_options(m);
+m->alias = NULL;
+SET_MACHINE_COMPAT(m, PC_COMPAT_2_8);
+}
+
 DEFINE_Q35_MACHINE(v2_8, "pc-q35-2.8", NULL,
pc_q35_2_8_machine_options);
 
 static void pc_q35_2_7_machine_options(MachineClass *m)
 {
 pc_q35_2_8_machine_options(m);
-m->alias = NULL;
 m->max_cpus = 255;
 SET_MACHINE_COMPAT(m, PC_COMPAT_2_7);
 }
-- 
2.11.0.259.g40922b1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 3/5] i386/kvm: Blacklist TSX on known broken hosts

2017-01-08 Thread Eduardo Habkost
Some Intel CPUs are known to have a broken TSX implementation. A
microcode update from Intel disabled TSX on those CPUs, but
GET_SUPPORTED_CPUID might be reporting it as supported if the
hosts were not updated yet.

Manually fixup the GET_SUPPORTED_CPUID data to ensure we will
never enable TSX when running on those hosts.

Reference:
* glibc commit 2702856bf45c82cf8e69f2064f5aa15c0ceb6359:
  
https://sourceware.org/git/?p=glibc.git;a=commit;h=2702856bf45c82cf8e69f2064f5aa15c0ceb6359

Signed-off-by: Eduardo Habkost 
---
 target/i386/kvm.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 10a9cd8f7f..3e99512640 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -272,6 +272,19 @@ static int get_para_features(KVMState *s)
 return features;
 }
 
+static bool host_tsx_blacklisted(void)
+{
+int family, model, stepping;\
+char vendor[CPUID_VENDOR_SZ + 1];
+
+host_vendor_fms(vendor, &family, &model, &stepping);
+
+/* Check if we are running on a Haswell host known to have broken TSX */
+return !strcmp(vendor, CPUID_VENDOR_INTEL) &&
+   (family == 6) &&
+   ((model == 63 && stepping < 4) ||
+model == 60 || model == 69 || model == 70);
+}
 
 /* Returns the value for a specific register on the cpuid entry
  */
@@ -355,6 +368,10 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, 
uint32_t function,
 }
 } else if (function == 6 && reg == R_EAX) {
 ret |= CPUID_6_EAX_ARAT; /* safe to allow because of emulated APIC */
+} else if (function == 7 && index == 0 && reg == R_EBX) {
+if (host_tsx_blacklisted()) {
+ret &= ~(CPUID_7_0_EBX_RTM | CPUID_7_0_EBX_HLE);
+}
 } else if (function == 0x8001 && reg == R_EDX) {
 /* On Intel, kvm returns cpuid according to the Intel spec,
  * so add missing bits according to the AMD spec:
-- 
2.11.0.259.g40922b1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 0/5] Use non-blacklisted family/model/stepping for Haswell CPU model

2017-01-08 Thread Eduardo Habkost
A recent glibc commit[1] added a blacklist to ensure it won't use
TSX on hosts that are known to have a broken TSX implementation.

Our existing Haswell CPU model has a blacklisted
family/model/stepping combination, so it has to be updated to
make sure guests will really use TSX. This is done by patch 5/5.

However, to do this safely we need to ensure the host CPU is not
a blacklisted one, so we won't mislead guests by exposing
known-to-be-good FMS values on a known-to-be-broken host. This is
done by patch 3/5.

[1] 
https://sourceware.org/git/?p=glibc.git;a=commit;h=2702856bf45c82cf8e69f2064f5aa15c0ceb6359

---
Cc: dgilb...@redhat.com
Cc: fwei...@redhat.com
Cc: car...@redhat.com
Cc: trie...@redhat.com
Cc: berra...@redhat.com
Cc: jdene...@redhat.com
Cc: pbonz...@redhat.com

Eduardo Habkost (5):
  i386: Add explicit array size to x86_cpu_vendor_words2str()
  i386: host_vendor_fms() helper function
  i386/kvm: Blacklist TSX on known broken hosts
  pc: Add 2.9 machine-types
  i386: Change stepping of Haswell to non-blacklisted value

 include/hw/i386/pc.h |  6 ++
 target/i386/cpu.h|  1 +
 hw/i386/pc_piix.c| 15 ---
 hw/i386/pc_q35.c | 13 +++--
 target/i386/cpu.c| 32 ++--
 target/i386/kvm.c| 17 +
 6 files changed, 69 insertions(+), 15 deletions(-)

-- 
2.11.0.259.g40922b1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 1/5] i386: Add explicit array size to x86_cpu_vendor_words2str()

2017-01-08 Thread Eduardo Habkost
Add explicit array size to x86_cpu_vendor_words2str() to let the
compiler check if we are really passing an array of the right
size.

GCC still doesn't print a warning when the array is too small[1],
but clang already does.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50584

Signed-off-by: Eduardo Habkost 
---
 target/i386/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index a149c8dc42..25b802bb06 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -169,7 +169,7 @@
 
 
 
-static void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
+static void x86_cpu_vendor_words2str(char dst[static (CPUID_VENDOR_SZ + 1)], 
uint32_t vendor1,
  uint32_t vendor2, uint32_t vendor3)
 {
 int i;
-- 
2.11.0.259.g40922b1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] Do different Virgil3D users running have the same CANVAS fingerprint?

2017-01-08 Thread bancfc

On 2016-12-27 04:05, ban...@openmailbox.org wrote:

Background:
Canvas fingerprinting is a technique to track users based on their
GPUs. https://en.wikipedia.org/wiki/Canvas_fingerprinting

Someone did some testing and the good news is the fingerprint is
common for all users of the same OS across all KVM users because it
presents the same virtual GPU/drivers for everybody.

What I need your help to find out is whether this holds for Virgil3D
users that enable graphics acceleration. (I am running on Debian and
the libvirt version in the repos is too old to have this included also
I don't have multiple machines)


To test this you need:

To run plain FireFox on a virtualized distro (with GL acceleration
enabled in libvirt) of your choice and visit
https://browserleaks.com/canvas . Then repeat the same steps with that
same distro (in a VM) on another physical machine and compare values.

A good result is if you see the same numbers for both machines. That
means the virtual device  fingerprint is uniform for all virtual
users.


Bumping. I realized I post this in holiday season when it was easily 
missed.


Since CANVAS is about graphics drivers version the question can be 
simplified: Is the virtio-gpu driver version uniform across KVM 
versions?



--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] memory-ballooning side-channel attack

2017-01-08 Thread bancfc

On 2016-12-27 03:51, ban...@openmailbox.org wrote:

Hello and Happy Holidays,

In the past few years many serious attacks against the memory
deduplication (KSM) feature of all hypervisors have been shown. [1]
Even allowing attackers to modify/steal APT keys and source lists on
the host. [2] Since its not enabled by default the fall out is
relatively low and easily mitigated.

New side-channel attacks against memory-ballon enabled VMs are
beginning to surface. Please consider documenting this and disabling
this feature for newly created VMs to have safe defaults.

[1] https://staff.aist.go.jp/c.artho/papers/EuroSec2011-suzaki.pdf
[2]
https://www.usenix.org/system/files/conference/usenixsecurity16/sec16_paper_razavi.pdf
[3] http://ieeexplore.ieee.org/document/7562068/

*Hint: If you can't see the IEEE paper use sci-hub.


Bumping. I realized I post this in holiday season when it was easily 
missed.


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH v3 0/2] Allow migration with invtsc if TSC frequency is explicitly set

2017-01-08 Thread Eduardo Habkost
This series makes QEMU accept migration if tsc-frequency is
explicitly set on configuration. As management software is
required to keep device configuration the same on migration
source or destination, explicit tsc-frequency will ensure that
either:

* The destination host has a matching TSC frequency; or
* The destination host has TSC scaling available.

Changelog
=

v2 -> v3:
* Fix build failure ((missing closing braces)

v1 -> v2:
* v1 series subject was:
  * [PATCH 0/4] Allow migration with invtsc if there's no
frequency mismatch
* Removed patches 3/4 and 4/4, that allowed migration
  if no explicit tsc-frequency was set. Implementing the check on
  post_load or post_init is not enough to make migration abort,
  so we will need a more complex solution to implement that
  feature.

Plans for future work
=

1) Querying host TSC frequency/scaling capability
-

I plan to include TSC frequency/scaling information on
query-cpu-model-expansion model="host" in a future series. Then
management software will be able to automatically configure TSC
frequency when invtsc is enabled, instead of requiring the user
to configure it explicitly. While we don't implement that, invtsc
migration will be possible only if the user configures TSC
frequency manually.

2) invtsc migration with no explicit TSC frequency
--

A future series can implement migration when TSC frequency is not
specified explicitly. It will be a bit more complex because it
requires either letting the destination abort the migration, or
sending TSC frequency/scaling information from destination to
source.

---
Cc: Marcelo Tosatti 
Cc: "Daniel P. Berrange" 
Cc: Paolo Bonzini 
Cc: k...@vger.kernel.org
Cc: Haozhong Zhang 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: libvir-list@redhat.com
Cc: Jiri Denemark 

Eduardo Habkost (2):
  kvm: Simplify invtsc check
  kvm: Allow invtsc migration if tsc-khz is set explicitly

 target/i386/kvm.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

-- 
2.11.0.259.g40922b1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] [PATCH v2 0/2] Allow migration with invtsc if TSC frequency is explicitly set

2017-01-08 Thread Eduardo Habkost
On Sun, Jan 08, 2017 at 09:19:49AM -0800, no-re...@patchew.org wrote:
> Hi,
> 
> Your series failed automatic build test. Please find the testing commands and
> their output below. If you have docker installed, you can probably reproduce 
> it
> locally.
> 
> Type: series
> Message-id: 20170108171330.11129-1-ehabk...@redhat.com
> Subject: [Qemu-devel] [PATCH v2 0/2] Allow migration with invtsc if TSC 
> frequency is explicitly set

Ouch. I simply removed patches 3/4 and 4/4 from the series, and
didn't notice there was a mistake in patch 1/2 corrected by patch
4/4 in v1. I will fix it and send v3, sorry for the noise.

-- 
Eduardo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH v2 0/2] Allow migration with invtsc if TSC frequency is explicitly set

2017-01-08 Thread Eduardo Habkost
This series makes QEMU accept migration if tsc-frequency is
explicitly set on configuration. As management software is
required to keep device configuration the same on migration
source or destination, explicit tsc-frequency will ensure that
either:

* The destination host has a matching TSC frequency; or
* The destination host has TSC scaling available.

Changes series v1 -> v2
===

* v1 series subject was:
  * [PATCH 0/4] Allow migration with invtsc if there's no
frequency mismatch
* Removed patches 3/4 and 4/4, that allowed migration
  if no explicit tsc-frequency was set. Implementing the check on
  post_load or post_init is not enough to make migration abort,
  so we will need a more complex solution to implement that
  feature.

Plans for future work
=

1) Querying host TSC frequency/scaling capability
-

I plan to include TSC frequency/scaling information on
query-cpu-model-expansion model="host" in a future series. Then
management software will be able to automatically configure TSC
frequency when invtsc is enabled, instead of requiring the user
to configure it explicitly. While we don't implement that, invtsc
migration will be possible only if the user configures TSC
frequency manually.

2) invtsc migration with no explicit TSC frequency
--

A future series can implement migration when TSC frequency is not
specified explicitly. It will be a bit more complex because it
requires either letting the destination abort the migration, or
sending TSC frequency/scaling information from destination to
source.

---
Cc: Marcelo Tosatti 
Cc: "Daniel P. Berrange" 
Cc: Paolo Bonzini 
Cc: k...@vger.kernel.org
Cc: Haozhong Zhang 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: libvir-list@redhat.com
Cc: Jiri Denemark 

Eduardo Habkost (2):
  kvm: Simplify invtsc check
  kvm: Allow invtsc migration if tsc-khz is set explicitly

 target/i386/kvm.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

-- 
2.11.0.259.g40922b1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] How to make dest host abort migration safely (was Re: [PATCH 4/4] kvm: Allow migration with invtsc)

2017-01-08 Thread Eduardo Habkost
On Fri, Jan 06, 2017 at 08:31:27AM -0200, Marcelo Tosatti wrote:
> On Thu, Jan 05, 2017 at 10:19:50AM -0200, Eduardo Habkost wrote:
> > On Thu, Jan 05, 2017 at 08:48:32AM -0200, Marcelo Tosatti wrote:
> > > On Wed, Jan 04, 2017 at 11:36:31PM -0200, Eduardo Habkost wrote:
> > > > On Wed, Jan 04, 2017 at 08:26:27PM -0200, Marcelo Tosatti wrote:
> > > > > On Wed, Jan 04, 2017 at 05:59:17PM -0200, Eduardo Habkost wrote:
> > > > > > On Wed, Jan 04, 2017 at 11:39:16AM -0200, Eduardo Habkost wrote:
> > > > > > > On Wed, Jan 04, 2017 at 09:56:56AM -0200, Marcelo Tosatti wrote:
> > > > > > > > On Tue, Dec 27, 2016 at 05:21:20PM -0200, Eduardo Habkost wrote:
[...]
> > > > > Can't this be fixed in QEMU? Just check that destination host supports
> > > > > TSC scaling before migration (or that KVM_GET_TSC_KHZ return value
> > > > > matches on source and destination).
> > > > > 
> > > > 
> > > > This solves only one use case: where you want to expose the
> > > > starting host TSC frequency to the guest. It doesn't cover any
> > > > scenario where the TSC frequency of the starting host isn't the
> > > > best one. (See the two scenarios I described above)
> > > 
> > > True.
> > > 
> > > > You seem to have a specific use case in mind. If you describe it,
> > > > we could decide if it's worth the extra migration code complexity
> > > > to deal with invtsc migration without explicit tsc-frequency. By
> > > > now, I am not convinced yet.
> > > 
> > > I don't have any specific use case in mind. I'm just trying to
> > > avoid moving complexity to libvirt which in my experience is a
> > > the best thing to do.
> > 
> > I think both our proposals make things harder for libvirt in
> > different ways. I just want to make the complexity explicit for
> > libvirt, instead of giving them the illusion of safety (making
> > migration break in ways libvirt can't detect).
> > 
> > Anyway, your points are still valid and it would be still useful
> > to do what you propose. I will give it a try on a new version of
> > patch 4/4 that can abort migration earlier. But note that I
> > expect some pushback from other maintainers because of the
> > complexity of the code. If that happens, be aware that I won't be
> > able to justify the added complexity because I would still think
> > it's not worth it.

I tried to move the destination TSC-scaling check to post_load,
and the bad news is that post_load is also too late to abort
migration (destination aborts, but source shows migration as
completed).

I'm CCing Juan, Amit and David to see if they have any
suggestion.

Summarizing the problem for them: on some cases we want to allow
migration only if the destination host has a matching TSC
frequency or if TSC scaling is supported by the destination host.
I don't know what's the best way to implement that check. We
could either:

* Send TSC frequency/scaling info from the the source host, and
  make the source host abort migration. Is there a good mechanism
  for that?
* Make the destination host abort migration. I don't know if
  there's a safe mechanism for that.



While we figure that out, I will submit v2 without patches 3/4
and 4/4, because patch 2/2 (allowing migration if tsc-freq is
explicitly set) is simple and useful. After that, we can add:

* The mechanism to query the host TSC frequency (see below)
* The mechanism you asked for, to allow migration without
  explicit TSC frequency if we know the destination host supports
  TSC scaling or has a matching TSC frequency (more complex, and
  maybe unnecessary if we implement the previous item)


> > 
> > > 
> > > > Maybe your use case just needs a explicit "invtsc-migration=on"
> > > > command-line flag without a mechanism to abort migration on
> > > > mismatch? I can't tell.
> > > 
> > > Again, there is no special case.
> > > 
> > > > Note that even if we follow your suggestion and implement an
> > > > alternative version of patch 4/4 to cover your use case, I will
> > > > strongly recommend libvirt developers to support configuring TSC
> > > > frequency if they want to support invtsc + migration without
> > > > surprising/unpredictable restrictions on migratability.
> > > 
> > > Well, alright. If you make the TSC frequency of the host
> > > available to mgmt software as you describe, and write the steps mgmt
> > > software should take, i'm good.
> > 
> > I plan to. The problem is that the mechanism to query the host
> > frequency may be unavailable in the first version.
> 
> Well just export KVM_GET_TSC_KHZ in a QMP command right? Its pretty
> easy.

I plan to expose it as part of query-cpu-model-expansion (work in
progress, right now) when querying the "host" CPU model.

> 
> Let me know if you need any help coding or testing.

Thanks!

-- 
Eduardo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] Regarding Migration Code

2017-01-08 Thread Anubhav Guleria
Greetings,

I was trying to understand the flow of Migration Code in libvirt and
have few doubts:

1)  libvirt talks to QEMU/KVM guests via QEMU API. So overall, in
order to manage QEMU/KVM guests I can either use libvirt (or tools
based on libvirt like virsh) or QEMU monitor. Is it so?

2) Since libvirt is Hypervisor neutral so actual migration
algorithm(precopy or postcopy) is present in the hypervisor ,i.e. in
case of QEMU it should be present in QEMU code base . I was going
through the code starting from libvirt-domain api and not able to
follow after virDomainMigrateVersion3Full(.)  . Kindly help.

3) I want to determine the Migration
Time, Downtime and other resource usage.
What is the correct way to determine the same? Should I use libvirt
API or directly virsh or something else? Which is fast and efficient
way for the same?


Thanks in advance. And sorry if that was too basic

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list