[Qemu-devel] [PATCH] spapr: remove the 'nr_servers' field from the machine

2017-04-04 Thread Cédric Le Goater
xics_system_init() does not need 'nr_servers' anymore as it is only
used to define the 'interrupt-controller' node in the device tree. So
let's just compute the value when calling spapr_dt_xics().

This also gives us an opportunity to simplify the xics_system_init()
routine and introduce a specific spapr_ics_create() helper to create
the sPAPR ICS object.

Signed-off-by: Cédric Le Goater 
---
 hw/ppc/spapr.c | 56 ++
 include/hw/ppc/spapr.h |  1 -
 2 files changed, 24 insertions(+), 33 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 4e87ffe776ea..df231a3fc4a1 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -96,45 +96,40 @@
 
 #define HTAB_SIZE(spapr)(1ULL << ((spapr)->htab_shift))
 
-static int try_create_xics(sPAPRMachineState *spapr, const char *type_ics,
-   const char *type_icp, int nr_servers,
-   int nr_irqs, Error **errp)
+static ICSState *spapr_ics_create(sPAPRMachineState *spapr,
+  const char *type_ics,
+  int nr_irqs, Error **errp)
 {
-XICSFabric *xi = XICS_FABRIC(spapr);
 Error *err = NULL, *local_err = NULL;
-ICSState *ics = NULL;
+Object *obj;
 
-ics = ICS_SIMPLE(object_new(type_ics));
-object_property_add_child(OBJECT(spapr), "ics", OBJECT(ics), NULL);
-object_property_set_int(OBJECT(ics), nr_irqs, "nr-irqs", &err);
-object_property_add_const_link(OBJECT(ics), "xics", OBJECT(xi), NULL);
-object_property_set_bool(OBJECT(ics), true, "realized", &local_err);
+obj = object_new(type_ics);
+object_property_add_child(OBJECT(spapr), "ics", obj, NULL);
+object_property_add_const_link(obj, "xics", OBJECT(spapr), &error_abort);
+object_property_set_int(obj, nr_irqs, "nr-irqs", &err);
+object_property_set_bool(obj, true, "realized", &local_err);
 error_propagate(&err, local_err);
 if (err) {
 error_propagate(errp, err);
-return -1;
+return NULL;
 }
 
-spapr->nr_servers = nr_servers;
-spapr->ics = ics;
-spapr->icp_type = type_icp;
-return 0;
+return ICS_SIMPLE(obj);
 }
 
-static int xics_system_init(MachineState *machine,
-int nr_servers, int nr_irqs, Error **errp)
+static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
 {
-int rc = -1;
+sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
 
 if (kvm_enabled()) {
 Error *err = NULL;
 
 if (machine_kernel_irqchip_allowed(machine) &&
-!xics_kvm_init(SPAPR_MACHINE(machine), errp)) {
-rc = try_create_xics(SPAPR_MACHINE(machine), TYPE_ICS_KVM,
- TYPE_KVM_ICP, nr_servers, nr_irqs, &err);
+!xics_kvm_init(spapr, errp)) {
+spapr->icp_type = TYPE_KVM_ICP;
+spapr->ics = spapr_ics_create(spapr, TYPE_ICS_KVM, nr_irqs, &err);
 }
-if (machine_kernel_irqchip_required(machine) && rc < 0) {
+if (machine_kernel_irqchip_required(machine) && !spapr->ics) {
 error_reportf_err(err,
   "kernel_irqchip requested but unavailable: ");
 } else {
@@ -142,13 +137,11 @@ static int xics_system_init(MachineState *machine,
 }
 }
 
-if (rc < 0) {
-xics_spapr_init(SPAPR_MACHINE(machine), errp);
-rc = try_create_xics(SPAPR_MACHINE(machine), TYPE_ICS_SIMPLE,
-   TYPE_ICP, nr_servers, nr_irqs, errp);
+if (!spapr->ics) {
+xics_spapr_init(spapr, errp);
+spapr->icp_type = TYPE_ICP;
+spapr->ics = spapr_ics_create(spapr, TYPE_ICS_SIMPLE, nr_irqs, errp);
 }
-
-return rc;
 }
 
 static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, PowerPCCPU *cpu,
@@ -912,6 +905,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
 void *fdt;
 sPAPRPHBState *phb;
 char *buf;
+int smt = kvmppc_smt_threads();
 
 fdt = g_malloc0(FDT_MAX_SIZE);
 _FDT((fdt_create_empty_tree(fdt, FDT_MAX_SIZE)));
@@ -951,7 +945,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
 _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
 
 /* /interrupt controller */
-spapr_dt_xics(spapr->nr_servers, fdt, PHANDLE_XICP);
+spapr_dt_xics(DIV_ROUND_UP(max_cpus * smt, smp_threads), fdt, 
PHANDLE_XICP);
 
 ret = spapr_populate_memory(spapr, fdt);
 if (ret < 0) {
@@ -1970,7 +1964,6 @@ static void ppc_spapr_init(MachineState *machine)
 hwaddr node0_size = spapr_node0_size();
 long load_limit, fw_size;
 char *filename;
-int smt = kvmppc_smt_threads();
 
 msi_nonbroken = true;
 
@@ -2021,8 +2014,7 @@ static void ppc_spapr_init(MachineState *machine)
 load_limit = MIN(spapr->rma_size, RTAS_MAX_ADDR) - FW_OVERHEAD;
 
 /* Set up Interrupt Controller before we create the VCPUs */
-xics_system_init(machine, DIV_ROUND_UP(max_cpus * 

Re: [Qemu-devel] [PATCH v5 0/9] ppc/pnv: interrupt controller (POWER8)

2017-04-04 Thread Cédric Le Goater
On 04/05/2017 07:51 AM, David Gibson wrote:
> On Mon, Apr 03, 2017 at 09:45:56AM +0200, Cédric Le Goater wrote:
>> Hello,
>>
>> Here is a series adding support for the interrupt controller as found
>> on a POWER8 system. POWER9 uses a different interrupt controller
>> called XIVE, still to be worked on.
>>
>> The initial patches are more cleanups of the XICS layer which move the
>> IRQ 'server' number mapping under the machine handlers.
>>
>> A new PnvICPState object based on MMIOs, which is specific to PowerNV,
>> is introduced in XICS. These ICP objects are created for each thread
>> of a core and linked to the associated PowerPCCPU object.
>>
>> Finally, to make use of the XICS layer, the PowerNV machine is
>> extended with a QOM XICSFabric interface and with a global memory
>> region acting as the Interrupt Management area.
>>
>>
>> To test, grab a kernel and a rootfs image here :
>>
>>   
>> https://openpower.xyz/job/openpower-op-build/distro=ubuntu,target=palmetto/lastSuccessfulBuild/artifact/images/zImage.epapr
>>   
>> https://openpower.xyz/job/openpower-op-build/distro=ubuntu,target=palmetto/lastSuccessfulBuild/artifact/images/rootfs.cpio.xz
>>
>> The full patchset is available here :
>>
>>https://github.com/legoater/qemu/commits/powernv-ipmi-2.9
>>
>> Thanks,
> 
> I've applied the series to my ppc-for-2.10 branch.  A couple of things
> to think about for follow up cleanups:
> 
> 1) The spapr->nr_servers field has very little remaining use; can you
>go a bit further and remove it entirely?

yes. I have a patch for it already, reshuffling the xics_system_init
routine again to make it look a little better : 


https://github.com/legoater/qemu/commit/94e90b4b660d521ad4806a5496dd89f351995810
 
> 2) At the moment you're creating (and realizing) the ICP objects 

yes. This is because the number of ICPState objects depends 
on 'nr_threads' which is only known at realize time.

We could also define an array of icps in the core:

PnvICPState icps[MAX_THREADS]

initialize them in an init routine and only realize a subset 
depending on 'nr_threads'. 

This is possible because the size of TYPE_PNV_ICP and TYPE_ICP
are the same. 

But, the current method is safer I think. 

> just  before you realize the core objects. 

yes. this is to not complete the core realization if the ICP 
creation fails. Else the core would be fully realized before 
the ICP, which feels incorrect as the ICP is an attribute 
of the core now. 

> I'm not entirely sure if
> creating new objects at realize time is QOMishly correct.  Maybe do
> some more enquiries to see if it is or not, and if not update.

Yes.

Thanks,

C. 
   



Re: [Qemu-devel] [Qemu-stable] [PATCH 00/81] Patch Round-up for stable 2.8.1, freeze on 2017-03-27

2017-04-04 Thread Gonglei (Arei)
>
> > >
> > > >
> > > >
> > > > > -Original Message-
> > > > > From: Michael Roth [mailto:mdr...@linux.vnet.ibm.com]
> > > > > Sent: Wednesday, April 05, 2017 12:09 PM
> > > > > To: Gonglei (Arei); qemu-devel@nongnu.org
> > > > > Cc: qemu-sta...@nongnu.org
> > > > > Subject: RE: [Qemu-stable] [PATCH 00/81] Patch Round-up for stable
> 2.8.1,
> > > > > freeze on 2017-03-27
> > > > >
> > > > > Quoting Gonglei (Arei) (2017-04-04 21:01:54)
> > > > > > Hi,
> > > > > >
> > > > > > I'd like to ask when QEMU 2.8.1 release? According to the previous
> > > planning,
> > > > > > It should release in Mar 30.
> > > > >
> > > > > It's already been released :)
> > > > >
> > > > >
> > > https://lists.nongnu.org/archive/html/qemu-devel/2017-03/msg06332.html
> > > > >
> > > > Thanks for your feedback!
> > > >
> > > > I missed that mail :(   But why don't I find the v2.8.1 tag in QEMU 
> > > > tree?
> > >
> > > Where are you pulling from? From a repo I haven't updated recently:
> > >
> > > mdroth@loki:~/w/qemu.git$ git remote get-url origin
> > > git://git.qemu.org/qemu.git
> > > mdroth@loki:~/w/qemu.git$ git fetch origin
> > > remote: Counting objects: 3198, done.
> > > remote: Compressing objects: 100% (1725/1725), done.
> > > remote: Total 2502 (delta 1963), reused 962 (delta 772)
> > > Receiving objects: 100% (2502/2502), 607.53 KiB | 791.00 KiB/s, done.
> > > Resolving deltas: 100% (1963/1963), completed with 424 local objects.
> > > From git://git.qemu.org/qemu
> > >b64842d..1fde6ee  master -> origin/master
> > >  * [new branch]  stable-2.8 -> origin/stable-2.8
> > >  * [new tag] v2.8.1 -> v2.8.1
> > >  * [new tag] v2.9.0-rc3 -> v2.9.0-rc3
> > >  * [new tag] v2.9.0-rc0 -> v2.9.0-rc0
> > >  * [new tag] v2.9.0-rc1 -> v2.9.0-rc1
> > >  * [new tag] v2.9.0-rc2 -> v2.9.0-rc2
> > > ...
> > >
> > > If you're pulling from the github mirror I do see that it hasn't been 
> > > updated
> > > yet with stable-2.8 branch or tag. That process was previously automated
> > > AFAIK,
> > > but maybe something has changed since the recent server switch-over. Will
> > > look
> > > into it more tomorrow.
> > >
> > Yes, I pull it from github mirror since the firewall of my firm.
> > And I notice that the latest tag is v2.8.0, no v2.8.1 and v2.9.0-rcX:
> >
> > linux-arei:/mnt/sdb/gonglei/qemu # git tag | tail
> > v2.7.0-rc3
> > v2.7.0-rc4
> > v2.7.0-rc5
> > v2.7.1
> > v2.8.0
> > v2.8.0-rc0
> > v2.8.0-rc1
> > v2.8.0-rc2
> > v2.8.0-rc3
> > v2.8.0-rc4
> 
> Thanks for confirming. I've pinged the admins about it, I'm sure it'll
> get sorted out soon.
> 
> The tags might be missing on the github mirror, but the master branch seems
> up to date at least. I've also pushed stable-2.8 branch and v2.8.1 to my
> tree in case you need to grab them in the meantime:
> 
>   https://github.com/mdroth/qemu/releases/tag/v2.8.1
> 
Thanks for your work!

-Gonglei



Re: [Qemu-devel] [Qemu-stable] [PATCH 00/81] Patch Round-up for stable 2.8.1, freeze on 2017-03-27

2017-04-04 Thread Michael Roth
Quoting Gonglei (Arei) (2017-04-05 00:52:02)
> >
> > >
> > >
> > > > -Original Message-
> > > > From: Michael Roth [mailto:mdr...@linux.vnet.ibm.com]
> > > > Sent: Wednesday, April 05, 2017 12:09 PM
> > > > To: Gonglei (Arei); qemu-devel@nongnu.org
> > > > Cc: qemu-sta...@nongnu.org
> > > > Subject: RE: [Qemu-stable] [PATCH 00/81] Patch Round-up for stable 
> > > > 2.8.1,
> > > > freeze on 2017-03-27
> > > >
> > > > Quoting Gonglei (Arei) (2017-04-04 21:01:54)
> > > > > Hi,
> > > > >
> > > > > I'd like to ask when QEMU 2.8.1 release? According to the previous
> > planning,
> > > > > It should release in Mar 30.
> > > >
> > > > It's already been released :)
> > > >
> > > >
> > https://lists.nongnu.org/archive/html/qemu-devel/2017-03/msg06332.html
> > > >
> > > Thanks for your feedback!
> > >
> > > I missed that mail :(   But why don't I find the v2.8.1 tag in QEMU tree?
> > 
> > Where are you pulling from? From a repo I haven't updated recently:
> > 
> > mdroth@loki:~/w/qemu.git$ git remote get-url origin
> > git://git.qemu.org/qemu.git
> > mdroth@loki:~/w/qemu.git$ git fetch origin
> > remote: Counting objects: 3198, done.
> > remote: Compressing objects: 100% (1725/1725), done.
> > remote: Total 2502 (delta 1963), reused 962 (delta 772)
> > Receiving objects: 100% (2502/2502), 607.53 KiB | 791.00 KiB/s, done.
> > Resolving deltas: 100% (1963/1963), completed with 424 local objects.
> > From git://git.qemu.org/qemu
> >b64842d..1fde6ee  master -> origin/master
> >  * [new branch]  stable-2.8 -> origin/stable-2.8
> >  * [new tag] v2.8.1 -> v2.8.1
> >  * [new tag] v2.9.0-rc3 -> v2.9.0-rc3
> >  * [new tag] v2.9.0-rc0 -> v2.9.0-rc0
> >  * [new tag] v2.9.0-rc1 -> v2.9.0-rc1
> >  * [new tag] v2.9.0-rc2 -> v2.9.0-rc2
> > ...
> > 
> > If you're pulling from the github mirror I do see that it hasn't been 
> > updated
> > yet with stable-2.8 branch or tag. That process was previously automated
> > AFAIK,
> > but maybe something has changed since the recent server switch-over. Will
> > look
> > into it more tomorrow.
> > 
> Yes, I pull it from github mirror since the firewall of my firm.
> And I notice that the latest tag is v2.8.0, no v2.8.1 and v2.9.0-rcX:
> 
> linux-arei:/mnt/sdb/gonglei/qemu # git tag | tail
> v2.7.0-rc3
> v2.7.0-rc4
> v2.7.0-rc5
> v2.7.1
> v2.8.0
> v2.8.0-rc0
> v2.8.0-rc1
> v2.8.0-rc2
> v2.8.0-rc3
> v2.8.0-rc4

Thanks for confirming. I've pinged the admins about it, I'm sure it'll
get sorted out soon.

The tags might be missing on the github mirror, but the master branch seems
up to date at least. I've also pushed stable-2.8 branch and v2.8.1 to my
tree in case you need to grab them in the meantime:

  https://github.com/mdroth/qemu/releases/tag/v2.8.1

> 
> Thanks,
> -Gonglei
> 




Re: [Qemu-devel] [Qemu-stable] [PATCH 00/81] Patch Round-up for stable 2.8.1, freeze on 2017-03-27

2017-04-04 Thread Gonglei (Arei)
>
> >
> >
> > > -Original Message-
> > > From: Michael Roth [mailto:mdr...@linux.vnet.ibm.com]
> > > Sent: Wednesday, April 05, 2017 12:09 PM
> > > To: Gonglei (Arei); qemu-devel@nongnu.org
> > > Cc: qemu-sta...@nongnu.org
> > > Subject: RE: [Qemu-stable] [PATCH 00/81] Patch Round-up for stable 2.8.1,
> > > freeze on 2017-03-27
> > >
> > > Quoting Gonglei (Arei) (2017-04-04 21:01:54)
> > > > Hi,
> > > >
> > > > I'd like to ask when QEMU 2.8.1 release? According to the previous
> planning,
> > > > It should release in Mar 30.
> > >
> > > It's already been released :)
> > >
> > >
> https://lists.nongnu.org/archive/html/qemu-devel/2017-03/msg06332.html
> > >
> > Thanks for your feedback!
> >
> > I missed that mail :(   But why don't I find the v2.8.1 tag in QEMU tree?
> 
> Where are you pulling from? From a repo I haven't updated recently:
> 
> mdroth@loki:~/w/qemu.git$ git remote get-url origin
> git://git.qemu.org/qemu.git
> mdroth@loki:~/w/qemu.git$ git fetch origin
> remote: Counting objects: 3198, done.
> remote: Compressing objects: 100% (1725/1725), done.
> remote: Total 2502 (delta 1963), reused 962 (delta 772)
> Receiving objects: 100% (2502/2502), 607.53 KiB | 791.00 KiB/s, done.
> Resolving deltas: 100% (1963/1963), completed with 424 local objects.
> From git://git.qemu.org/qemu
>b64842d..1fde6ee  master -> origin/master
>  * [new branch]  stable-2.8 -> origin/stable-2.8
>  * [new tag] v2.8.1 -> v2.8.1
>  * [new tag] v2.9.0-rc3 -> v2.9.0-rc3
>  * [new tag] v2.9.0-rc0 -> v2.9.0-rc0
>  * [new tag] v2.9.0-rc1 -> v2.9.0-rc1
>  * [new tag] v2.9.0-rc2 -> v2.9.0-rc2
> ...
> 
> If you're pulling from the github mirror I do see that it hasn't been updated
> yet with stable-2.8 branch or tag. That process was previously automated
> AFAIK,
> but maybe something has changed since the recent server switch-over. Will
> look
> into it more tomorrow.
> 
Yes, I pull it from github mirror since the firewall of my firm.
And I notice that the latest tag is v2.8.0, no v2.8.1 and v2.9.0-rcX:

linux-arei:/mnt/sdb/gonglei/qemu # git tag | tail
v2.7.0-rc3
v2.7.0-rc4
v2.7.0-rc5
v2.7.1
v2.8.0
v2.8.0-rc0
v2.8.0-rc1
v2.8.0-rc2
v2.8.0-rc3
v2.8.0-rc4

Thanks,
-Gonglei



Re: [Qemu-devel] [PATCH v5 0/9] ppc/pnv: interrupt controller (POWER8)

2017-04-04 Thread David Gibson
On Mon, Apr 03, 2017 at 09:45:56AM +0200, Cédric Le Goater wrote:
> Hello,
> 
> Here is a series adding support for the interrupt controller as found
> on a POWER8 system. POWER9 uses a different interrupt controller
> called XIVE, still to be worked on.
> 
> The initial patches are more cleanups of the XICS layer which move the
> IRQ 'server' number mapping under the machine handlers.
> 
> A new PnvICPState object based on MMIOs, which is specific to PowerNV,
> is introduced in XICS. These ICP objects are created for each thread
> of a core and linked to the associated PowerPCCPU object.
> 
> Finally, to make use of the XICS layer, the PowerNV machine is
> extended with a QOM XICSFabric interface and with a global memory
> region acting as the Interrupt Management area.
> 
> 
> To test, grab a kernel and a rootfs image here :
> 
>   
> https://openpower.xyz/job/openpower-op-build/distro=ubuntu,target=palmetto/lastSuccessfulBuild/artifact/images/zImage.epapr
>   
> https://openpower.xyz/job/openpower-op-build/distro=ubuntu,target=palmetto/lastSuccessfulBuild/artifact/images/rootfs.cpio.xz
> 
> The full patchset is available here :
> 
>https://github.com/legoater/qemu/commits/powernv-ipmi-2.9
> 
> Thanks,

I've applied the series to my ppc-for-2.10 branch.  A couple of things
to think about for follow up cleanups:

1) The spapr->nr_servers field has very little remaining use; can you
   go a bit further and remove it entirely?

2) At the moment you're creating (and realizing) the ICP objects just
before you realize the core objects.  I'm not entirely sure if
creating new objects at realize time is QOMishly correct.  Maybe do
some more enquiries to see if it is or not, and if not update.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [Qemu-stable] [PATCH 00/81] Patch Round-up for stable 2.8.1, freeze on 2017-03-27

2017-04-04 Thread Michael Roth
Quoting Gonglei (Arei) (2017-04-04 23:51:26)
> Hi Michael,
> 
> 
> > -Original Message-
> > From: Michael Roth [mailto:mdr...@linux.vnet.ibm.com]
> > Sent: Wednesday, April 05, 2017 12:09 PM
> > To: Gonglei (Arei); qemu-devel@nongnu.org
> > Cc: qemu-sta...@nongnu.org
> > Subject: RE: [Qemu-stable] [PATCH 00/81] Patch Round-up for stable 2.8.1,
> > freeze on 2017-03-27
> > 
> > Quoting Gonglei (Arei) (2017-04-04 21:01:54)
> > > Hi,
> > >
> > > I'd like to ask when QEMU 2.8.1 release? According to the previous 
> > > planning,
> > > It should release in Mar 30.
> > 
> > It's already been released :)
> > 
> >   https://lists.nongnu.org/archive/html/qemu-devel/2017-03/msg06332.html
> > 
> Thanks for your feedback!
> 
> I missed that mail :(   But why don't I find the v2.8.1 tag in QEMU tree?

Where are you pulling from? From a repo I haven't updated recently:

mdroth@loki:~/w/qemu.git$ git remote get-url origin
git://git.qemu.org/qemu.git
mdroth@loki:~/w/qemu.git$ git fetch origin
remote: Counting objects: 3198, done.
remote: Compressing objects: 100% (1725/1725), done.
remote: Total 2502 (delta 1963), reused 962 (delta 772)
Receiving objects: 100% (2502/2502), 607.53 KiB | 791.00 KiB/s, done.
Resolving deltas: 100% (1963/1963), completed with 424 local objects.
From git://git.qemu.org/qemu
   b64842d..1fde6ee  master -> origin/master
 * [new branch]  stable-2.8 -> origin/stable-2.8
 * [new tag] v2.8.1 -> v2.8.1
 * [new tag] v2.9.0-rc3 -> v2.9.0-rc3
 * [new tag] v2.9.0-rc0 -> v2.9.0-rc0
 * [new tag] v2.9.0-rc1 -> v2.9.0-rc1
 * [new tag] v2.9.0-rc2 -> v2.9.0-rc2
...

If you're pulling from the github mirror I do see that it hasn't been updated
yet with stable-2.8 branch or tag. That process was previously automated AFAIK,
but maybe something has changed since the recent server switch-over. Will look
into it more tomorrow.

> 
> Thanks,
> -Gonglei
> 




Re: [Qemu-devel] [Qemu-stable] [PATCH 00/81] Patch Round-up for stable 2.8.1, freeze on 2017-03-27

2017-04-04 Thread Gonglei (Arei)
Hi Michael,


> -Original Message-
> From: Michael Roth [mailto:mdr...@linux.vnet.ibm.com]
> Sent: Wednesday, April 05, 2017 12:09 PM
> To: Gonglei (Arei); qemu-devel@nongnu.org
> Cc: qemu-sta...@nongnu.org
> Subject: RE: [Qemu-stable] [PATCH 00/81] Patch Round-up for stable 2.8.1,
> freeze on 2017-03-27
> 
> Quoting Gonglei (Arei) (2017-04-04 21:01:54)
> > Hi,
> >
> > I'd like to ask when QEMU 2.8.1 release? According to the previous planning,
> > It should release in Mar 30.
> 
> It's already been released :)
> 
>   https://lists.nongnu.org/archive/html/qemu-devel/2017-03/msg06332.html
> 
Thanks for your feedback!

I missed that mail :(   But why don't I find the v2.8.1 tag in QEMU tree?

Thanks,
-Gonglei



Re: [Qemu-devel] [PATCH kernel v8 2/4] virtio-balloon: VIRTIO_BALLOON_F_CHUNK_TRANSFER

2017-04-04 Thread Wang, Wei W
On Wednesday, April 5, 2017 11:54 AM, Michael S. Tsirkin wrote:
> On Wed, Apr 05, 2017 at 03:31:36AM +, Wang, Wei W wrote:
> > On Thursday, March 16, 2017 3:09 PM Wei Wang wrote:
> > > The implementation of the current virtio-balloon is not very
> > > efficient, because the ballooned pages are transferred to the host
> > > one by one. Here is the breakdown of the time in percentage spent on
> > > each step of the balloon inflating process (inflating 7GB of an 8GB idle 
> > > guest).
> > >
> > > 1) allocating pages (6.5%)
> > > 2) sending PFNs to host (68.3%)
> > > 3) address translation (6.1%)
> > > 4) madvise (19%)
> > >
> > > It takes about 4126ms for the inflating process to complete.
> > > The above profiling shows that the bottlenecks are stage 2) and stage 4).
> > >
> > > This patch optimizes step 2) by transferring pages to the host in
> > > chunks. A chunk consists of guest physically continuous pages, and
> > > it is offered to the host via a base PFN (i.e. the start PFN of
> > > those physically continuous pages) and the size (i.e. the total number of 
> > > the
> pages). A chunk is formated as below:
> > >
> > > 
> > > | Base (52 bit)| Rsvd (12 bit) |
> > > 
> > > 
> > > | Size (52 bit)| Rsvd (12 bit) |
> > > 
> > >
> > > By doing so, step 4) can also be optimized by doing address
> > > translation and
> > > madvise() in chunks rather than page by page.
> > >
> > > This optimization requires the negotiation of a new feature bit,
> > > VIRTIO_BALLOON_F_CHUNK_TRANSFER.
> > >
> > > With this new feature, the above ballooning process takes ~590ms
> > > resulting in an improvement of ~85%.
> > >
> > > TODO: optimize stage 1) by allocating/freeing a chunk of pages
> > > instead of a single page each time.
> > >
> > > Signed-off-by: Liang Li 
> > > Signed-off-by: Wei Wang 
> > > Suggested-by: Michael S. Tsirkin 
> > > ---
> > >  drivers/virtio/virtio_balloon.c | 371
> +-
> > > --
> > >  include/uapi/linux/virtio_balloon.h |   9 +
> > >  2 files changed, 353 insertions(+), 27 deletions(-)
> > >
> > > diff --git a/drivers/virtio/virtio_balloon.c
> > > b/drivers/virtio/virtio_balloon.c index
> > > f59cb4f..3f4a161 100644
> > > --- a/drivers/virtio/virtio_balloon.c
> > > +++ b/drivers/virtio/virtio_balloon.c
> > > @@ -42,6 +42,10 @@
> > >  #define OOM_VBALLOON_DEFAULT_PAGES 256  #define
> > > VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
> > >
> > > +#define PAGE_BMAP_SIZE   (8 * PAGE_SIZE)
> > > +#define PFNS_PER_PAGE_BMAP   (PAGE_BMAP_SIZE * BITS_PER_BYTE)
> > > +#define PAGE_BMAP_COUNT_MAX  32
> > > +
> > >  static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
> > > module_param(oom_pages, int, S_IRUSR | S_IWUSR);
> > > MODULE_PARM_DESC(oom_pages, "pages to free on OOM"); @@ -50,6
> +54,14
> > > @@ MODULE_PARM_DESC(oom_pages, "pages to free on OOM");  static
> > > struct vfsmount *balloon_mnt;  #endif
> > >
> > > +#define BALLOON_CHUNK_BASE_SHIFT 12 #define
> > > +BALLOON_CHUNK_SIZE_SHIFT 12 struct balloon_page_chunk {
> > > + __le64 base;
> > > + __le64 size;
> > > +};
> > > +
> > > +typedef __le64 resp_data_t;
> > >  struct virtio_balloon {
> > >   struct virtio_device *vdev;
> > >   struct virtqueue *inflate_vq, *deflate_vq, *stats_vq; @@ -67,6
> > > +79,31 @@ struct virtio_balloon {
> > >
> > >   /* Number of balloon pages we've told the Host we're not using. */
> > >   unsigned int num_pages;
> > > + /* Pointer to the response header. */
> > > + struct virtio_balloon_resp_hdr *resp_hdr;
> > > + /* Pointer to the start address of response data. */
> > > + resp_data_t *resp_data;
> >
> > I think the implementation has an issue here - both the balloon pages and 
> > the
> unused pages use the same buffer ("resp_data" above) to store chunks. It would
> cause a race in this case: live migration starts while ballooning is also in 
> progress.
> I plan to use separate buffers for CHUNKS_OF_BALLOON_PAGES and
> CHUNKS_OF_UNUSED_PAGES. Please let me know if you have a different
> suggestion. Thanks.
> >
> > Best,
> > Wei
> 
> Is only one resp data ever in flight for each kind?
> If not you want as many buffers as vq allows.
> 

No, all the kinds were using only one resp_data. I will make it one resp_data 
for each kind.

Best,
Wei





Re: [Qemu-devel] [Qemu-stable] [PATCH 00/81] Patch Round-up for stable 2.8.1, freeze on 2017-03-27

2017-04-04 Thread Michael Roth
Quoting Gonglei (Arei) (2017-04-04 21:01:54)
> Hi,
> 
> I'd like to ask when QEMU 2.8.1 release? According to the previous planning,
> It should release in Mar 30.

It's already been released :)

  https://lists.nongnu.org/archive/html/qemu-devel/2017-03/msg06332.html

> 
> Thanks,
> -Gonglei
> 
> > -Original Message-
> > From: Qemu-stable
> > [mailto:qemu-stable-bounces+arei.gonglei=huawei@nongnu.org] On
> > Behalf Of Michael Roth
> > Sent: Tuesday, March 21, 2017 7:07 AM
> > To: qemu-devel@nongnu.org
> > Cc: qemu-sta...@nongnu.org
> > Subject: [Qemu-stable] [PATCH 00/81] Patch Round-up for stable 2.8.1, freeze
> > on 2017-03-27
> > 
> > Hi everyone,
> > 
> > The following new patches are queued for QEMU stable v2.8.1:
> > 
> >   https://github.com/mdroth/qemu/commits/stable-2.8-staging
> > 
> > The release is planned for 2017-03-30:
> > 
> >   http://wiki.qemu.org/Planning/2.8
> > 
> > Please respond here or CC qemu-sta...@nongnu.org on any patches you
> > think should be included in the release.
> > 
> > Testing/feedback is greatly appreciated.
> > 
> > Thanks!
> > 
> 




Re: [Qemu-devel] [RFC] migration/block:limit the time used for block migration

2017-04-04 Thread 858585 jemmy
On Wed, Mar 29, 2017 at 11:57 PM, Juan Quintela  wrote:
>
> 858585 jemmy  wrote:
> > On Tue, Mar 28, 2017 at 5:47 PM, Juan Quintela  wrote:
> >> Lidong Chen  wrote:
> >>> when migration with quick speed, mig_save_device_bulk invoke
> >>> bdrv_is_allocated too frequently, and cause vnc reponse slowly.
> >>> this patch limit the time used for bdrv_is_allocated.
> >>>
> >>> Signed-off-by: Lidong Chen 
> >>> ---
> >>>  migration/block.c | 39 +++
> >>>  1 file changed, 31 insertions(+), 8 deletions(-)
> >>>
> >>> diff --git a/migration/block.c b/migration/block.c
> >>> index 7734ff7..d3e81ca 100644
> >>> --- a/migration/block.c
> >>> +++ b/migration/block.c
> >>> @@ -110,6 +110,7 @@ typedef struct BlkMigState {
> >>>  int transferred;
> >>>  int prev_progress;
> >>>  int bulk_completed;
> >>> +int time_ns_used;
> >>
> >> An int that can only take values 0/1 is called a bool O:-)
> > time_ns_used is used to store how many ns used by bdrv_is_allocated.
>
> Oops, I really mean timeout_flag, sorry :-(
>
> >> Do we really want to call clock_gettime each time that
> >> bdrv_is_allocated() is called?  My understanding is that clock_gettime
> >> is expensive, but I don't know how expensive is brdrv_is_allocated()
> >
> > i write this code to measure the time used by  brdrv_is_allocated()
> >
> >  279 static int max_time = 0;
> >  280 int tmp;
> >
> >  288 clock_gettime(CLOCK_MONOTONIC_RAW, &ts1);
> >  289 ret = bdrv_is_allocated(blk_bs(bb), cur_sector,
> >  290 MAX_IS_ALLOCATED_SEARCH, 
> > &nr_sectors);
> >  291 clock_gettime(CLOCK_MONOTONIC_RAW, &ts2);
> >  292
> >  293
> >  294 tmp =  (ts2.tv_sec - ts1.tv_sec)*10L
> >  295+ (ts2.tv_nsec - ts1.tv_nsec);
> >  296 if (tmp > max_time) {
> >  297max_time=tmp;
> >  298fprintf(stderr, "max_time is %d\n", max_time);
> >  299 }
> >
> > the test result is below:
> >
> >  max_time is 37014
> >  max_time is 1075534
> >  max_time is 17180913
> >  max_time is 28586762
> >  max_time is 49563584
> >  max_time is 103085447
> >  max_time is 110836833
> >  max_time is 120331438
>
> this is around 120ms, no?  It is quite a lot, really :-(

i find the reason is mainly caused by qemu_co_mutex_lock invoked by
qcow2_co_get_block_status.
qemu_co_mutex_lock(&s->lock);
ret = qcow2_get_cluster_offset(bs, sector_num << 9, &bytes,
   &cluster_offset);
qemu_co_mutex_unlock(&s->lock);

>
>
> > so i think it's necessary to clock_gettime each time.
> > but clock_gettime only available on linux. maybe clock() is better.
>
qemu_clock_get_ms(QEMU_CLOCK_REALTIME) is a better option.

> Thanks, Juan.



Re: [Qemu-devel] [PATCH kernel v8 2/4] virtio-balloon: VIRTIO_BALLOON_F_CHUNK_TRANSFER

2017-04-04 Thread Michael S. Tsirkin
On Wed, Apr 05, 2017 at 03:31:36AM +, Wang, Wei W wrote:
> On Thursday, March 16, 2017 3:09 PM Wei Wang wrote:
> > The implementation of the current virtio-balloon is not very efficient, 
> > because
> > the ballooned pages are transferred to the host one by one. Here is the
> > breakdown of the time in percentage spent on each step of the balloon 
> > inflating
> > process (inflating 7GB of an 8GB idle guest).
> > 
> > 1) allocating pages (6.5%)
> > 2) sending PFNs to host (68.3%)
> > 3) address translation (6.1%)
> > 4) madvise (19%)
> > 
> > It takes about 4126ms for the inflating process to complete.
> > The above profiling shows that the bottlenecks are stage 2) and stage 4).
> > 
> > This patch optimizes step 2) by transferring pages to the host in chunks. A 
> > chunk
> > consists of guest physically continuous pages, and it is offered to the 
> > host via a
> > base PFN (i.e. the start PFN of those physically continuous pages) and the 
> > size
> > (i.e. the total number of the pages). A chunk is formated as below:
> > 
> > 
> > | Base (52 bit)| Rsvd (12 bit) |
> > 
> > 
> > | Size (52 bit)| Rsvd (12 bit) |
> > 
> > 
> > By doing so, step 4) can also be optimized by doing address translation and
> > madvise() in chunks rather than page by page.
> > 
> > This optimization requires the negotiation of a new feature bit,
> > VIRTIO_BALLOON_F_CHUNK_TRANSFER.
> > 
> > With this new feature, the above ballooning process takes ~590ms resulting 
> > in
> > an improvement of ~85%.
> > 
> > TODO: optimize stage 1) by allocating/freeing a chunk of pages instead of a
> > single page each time.
> > 
> > Signed-off-by: Liang Li 
> > Signed-off-by: Wei Wang 
> > Suggested-by: Michael S. Tsirkin 
> > ---
> >  drivers/virtio/virtio_balloon.c | 371 
> > +-
> > --
> >  include/uapi/linux/virtio_balloon.h |   9 +
> >  2 files changed, 353 insertions(+), 27 deletions(-)
> > 
> > diff --git a/drivers/virtio/virtio_balloon.c 
> > b/drivers/virtio/virtio_balloon.c index
> > f59cb4f..3f4a161 100644
> > --- a/drivers/virtio/virtio_balloon.c
> > +++ b/drivers/virtio/virtio_balloon.c
> > @@ -42,6 +42,10 @@
> >  #define OOM_VBALLOON_DEFAULT_PAGES 256
> >  #define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
> > 
> > +#define PAGE_BMAP_SIZE (8 * PAGE_SIZE)
> > +#define PFNS_PER_PAGE_BMAP (PAGE_BMAP_SIZE * BITS_PER_BYTE)
> > +#define PAGE_BMAP_COUNT_MAX32
> > +
> >  static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
> > module_param(oom_pages, int, S_IRUSR | S_IWUSR);
> > MODULE_PARM_DESC(oom_pages, "pages to free on OOM"); @@ -50,6 +54,14
> > @@ MODULE_PARM_DESC(oom_pages, "pages to free on OOM");  static struct
> > vfsmount *balloon_mnt;  #endif
> > 
> > +#define BALLOON_CHUNK_BASE_SHIFT 12
> > +#define BALLOON_CHUNK_SIZE_SHIFT 12
> > +struct balloon_page_chunk {
> > +   __le64 base;
> > +   __le64 size;
> > +};
> > +
> > +typedef __le64 resp_data_t;
> >  struct virtio_balloon {
> > struct virtio_device *vdev;
> > struct virtqueue *inflate_vq, *deflate_vq, *stats_vq; @@ -67,6 +79,31
> > @@ struct virtio_balloon {
> > 
> > /* Number of balloon pages we've told the Host we're not using. */
> > unsigned int num_pages;
> > +   /* Pointer to the response header. */
> > +   struct virtio_balloon_resp_hdr *resp_hdr;
> > +   /* Pointer to the start address of response data. */
> > +   resp_data_t *resp_data;
> 
> I think the implementation has an issue here - both the balloon pages and the 
> unused pages use the same buffer ("resp_data" above) to store chunks. It 
> would cause a race in this case: live migration starts while ballooning is 
> also in progress. I plan to use separate buffers for CHUNKS_OF_BALLOON_PAGES 
> and CHUNKS_OF_UNUSED_PAGES. Please let me know if you have a different 
> suggestion. Thanks.
> 
> Best,
> Wei

Is only one resp data ever in flight for each kind?
If not you want as many buffers as vq allows.

-- 
MST



Re: [Qemu-devel] [PATCH kernel v8 2/4] virtio-balloon: VIRTIO_BALLOON_F_CHUNK_TRANSFER

2017-04-04 Thread Wang, Wei W
On Thursday, March 16, 2017 3:09 PM Wei Wang wrote:
> The implementation of the current virtio-balloon is not very efficient, 
> because
> the ballooned pages are transferred to the host one by one. Here is the
> breakdown of the time in percentage spent on each step of the balloon 
> inflating
> process (inflating 7GB of an 8GB idle guest).
> 
> 1) allocating pages (6.5%)
> 2) sending PFNs to host (68.3%)
> 3) address translation (6.1%)
> 4) madvise (19%)
> 
> It takes about 4126ms for the inflating process to complete.
> The above profiling shows that the bottlenecks are stage 2) and stage 4).
> 
> This patch optimizes step 2) by transferring pages to the host in chunks. A 
> chunk
> consists of guest physically continuous pages, and it is offered to the host 
> via a
> base PFN (i.e. the start PFN of those physically continuous pages) and the 
> size
> (i.e. the total number of the pages). A chunk is formated as below:
> 
> 
> | Base (52 bit)| Rsvd (12 bit) |
> 
> 
> | Size (52 bit)| Rsvd (12 bit) |
> 
> 
> By doing so, step 4) can also be optimized by doing address translation and
> madvise() in chunks rather than page by page.
> 
> This optimization requires the negotiation of a new feature bit,
> VIRTIO_BALLOON_F_CHUNK_TRANSFER.
> 
> With this new feature, the above ballooning process takes ~590ms resulting in
> an improvement of ~85%.
> 
> TODO: optimize stage 1) by allocating/freeing a chunk of pages instead of a
> single page each time.
> 
> Signed-off-by: Liang Li 
> Signed-off-by: Wei Wang 
> Suggested-by: Michael S. Tsirkin 
> ---
>  drivers/virtio/virtio_balloon.c | 371 +-
> --
>  include/uapi/linux/virtio_balloon.h |   9 +
>  2 files changed, 353 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_balloon.c 
> b/drivers/virtio/virtio_balloon.c index
> f59cb4f..3f4a161 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -42,6 +42,10 @@
>  #define OOM_VBALLOON_DEFAULT_PAGES 256
>  #define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80
> 
> +#define PAGE_BMAP_SIZE   (8 * PAGE_SIZE)
> +#define PFNS_PER_PAGE_BMAP   (PAGE_BMAP_SIZE * BITS_PER_BYTE)
> +#define PAGE_BMAP_COUNT_MAX  32
> +
>  static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES;
> module_param(oom_pages, int, S_IRUSR | S_IWUSR);
> MODULE_PARM_DESC(oom_pages, "pages to free on OOM"); @@ -50,6 +54,14
> @@ MODULE_PARM_DESC(oom_pages, "pages to free on OOM");  static struct
> vfsmount *balloon_mnt;  #endif
> 
> +#define BALLOON_CHUNK_BASE_SHIFT 12
> +#define BALLOON_CHUNK_SIZE_SHIFT 12
> +struct balloon_page_chunk {
> + __le64 base;
> + __le64 size;
> +};
> +
> +typedef __le64 resp_data_t;
>  struct virtio_balloon {
>   struct virtio_device *vdev;
>   struct virtqueue *inflate_vq, *deflate_vq, *stats_vq; @@ -67,6 +79,31
> @@ struct virtio_balloon {
> 
>   /* Number of balloon pages we've told the Host we're not using. */
>   unsigned int num_pages;
> + /* Pointer to the response header. */
> + struct virtio_balloon_resp_hdr *resp_hdr;
> + /* Pointer to the start address of response data. */
> + resp_data_t *resp_data;

I think the implementation has an issue here - both the balloon pages and the 
unused pages use the same buffer ("resp_data" above) to store chunks. It would 
cause a race in this case: live migration starts while ballooning is also in 
progress. I plan to use separate buffers for CHUNKS_OF_BALLOON_PAGES and 
CHUNKS_OF_UNUSED_PAGES. Please let me know if you have a different suggestion. 
Thanks.

Best,
Wei



Re: [Qemu-devel] [PATCH v3 1/1] block: pass the right options for BlockDriver.bdrv_open()

2017-04-04 Thread Dong Jia Shi
* Max Reitz  [2017-03-29 23:07:22 +0200]:

> On 29.03.2017 03:16, Dong Jia Shi wrote:
> > raw_open() expects the caller always passing in the right actual
> > @options parameter. But when trying to applying snapshot on a RBD
> > image, bdrv_snapshot_goto() calls raw_open() (by calling the
> > bdrv_open callback on the BlockDriver) with a NULL @options, and
> > that will result in a Segmentation fault.
> > 
> > For the other non-raw format drivers, it also makes sense to passing
> > in the actual options, althought they don't trigger the problem so
> > far.
> > 
> > Let's prepare a @options by adding the "file" key-value pair to a
> > copy of the actual options that were given for the node (i.e.
> > bs->options), and pass it to the callback.
> > 
> > BlockDriver.bdrv_open() expects bs->file to be NULL and just
> > overwrites it with the result from bdrv_open_child(). If that
> > bdrv_open_child() fails, the field becomes NULL. While we are at
> > it, we also correct the cleanning up action for a call failure of
> > BlockDriver.bdrv_open() by replacing bdrv_unref() with
> > bdrv_unref_child().
> > 
> > Suggested-by: Max Reitz 
> > Signed-off-by: Dong Jia Shi 
> > ---
> >  block/snapshot.c | 14 --
> >  1 file changed, 12 insertions(+), 2 deletions(-)
> > 
> > diff --git a/block/snapshot.c b/block/snapshot.c
> > index bf5c2ca..281626c 100644
> > --- a/block/snapshot.c
> > +++ b/block/snapshot.c
> > @@ -27,6 +27,7 @@
> >  #include "block/block_int.h"
> >  #include "qapi/error.h"
> >  #include "qapi/qmp/qerror.h"
> > +#include "qapi/qmp/qstring.h"
> >  
> >  QemuOptsList internal_snapshot_opts = {
> >  .name = "snapshot",
> > @@ -189,11 +190,20 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
> >  }
> >  
> >  if (bs->file) {
> > +QDict *options = qdict_clone_shallow(bs->options);
> > +QDict *file_options;
> > +
> > +qdict_extract_subqdict(options, &file_options, "file.");
> > +QDECREF(file_options);
> > +qdict_put(options, "file",
> > +  qstring_from_str(bdrv_get_node_name(bs->file->bs)));
> > +
> >  drv->bdrv_close(bs);
> >  ret = bdrv_snapshot_goto(bs->file->bs, snapshot_id);
> > -open_ret = drv->bdrv_open(bs, NULL, bs->open_flags, NULL);
> > +open_ret = drv->bdrv_open(bs, options, bs->open_flags, NULL);
> > +QDECREF(options);
> >  if (open_ret < 0) {
> > -bdrv_unref(bs->file->bs);
> > +bdrv_unref_child(bs, bs->file);
> >  bs->drv = NULL;
> >  return open_ret;
> >  }
> 
> I just noticed another issue (sorry I did not before...):
Hi Max,
No need for sorry.

> 
> In drv->bdrv_open(), the block driver will generally overwrite bs->file
> without looking at it because it assumes that it's NULL. That means we
> should probably actually make sure it's NULL because otherwise we will
> the child BDS will have a reference count that is 1 too high.
> 
> (bdrv_open_inherit() (ultimately called from bdrv_open_child()) invokes
> bdrv_ref() for a child BDS specified by node-name reference.)
> 
> That means we should unconditionally invoke
> bdrv_unref_child(bs, bs->file) before calling drv->bdrv_open(). But we
> have to wrap everything in bdrv_ref()/bdrv_unref() so the BDS isn't
> deleted in the meantime.
Understood.

> 
> So I think it should look something like this:
> 
> BlockDriverState *file;
> QDict *options = ...;
> QDict *file_options;
> 
> file = bs->file->bs;
> /* Prevent it from getting deleted when detached from bs */
> bdrv_ref(file);
> 
> qdict_extract_subqdict(...);
> QDECREF(file_options);
> qdict_put(..., qstring_from_str(bdrv_get_node_name(file)));
> 
> drv->bdrv_close(bs);
> bdrv_unref_child(bs, bs->file);
> bs->file = NULL;
> 
> ret = bdrv_snapshot_goto(file, snapshot_id);
> open_ret = drv->bdrv_open(...);
Here, I think we'd still need a:
QDECREF(options);

> if (open_ret < 0) {
> bdrv_unref(file);
> bs->drv = NULL;
> return open_ret;
> }
> 
> assert(bs->file->bs == file);
> bdrv_unref(file);

It looks convoluted, but I don't see a better solution. So I will
prepare a new patch based on your code.

Thanks for the help!

> 
> 
> Max
> 

-- 
Dong Jia Shi




Re: [Qemu-devel] [Qemu-stable] [PATCH 00/81] Patch Round-up for stable 2.8.1, freeze on 2017-03-27

2017-04-04 Thread Gonglei (Arei)
Hi,

I'd like to ask when QEMU 2.8.1 release? According to the previous planning,
It should release in Mar 30.

Thanks,
-Gonglei

> -Original Message-
> From: Qemu-stable
> [mailto:qemu-stable-bounces+arei.gonglei=huawei@nongnu.org] On
> Behalf Of Michael Roth
> Sent: Tuesday, March 21, 2017 7:07 AM
> To: qemu-devel@nongnu.org
> Cc: qemu-sta...@nongnu.org
> Subject: [Qemu-stable] [PATCH 00/81] Patch Round-up for stable 2.8.1, freeze
> on 2017-03-27
> 
> Hi everyone,
> 
> The following new patches are queued for QEMU stable v2.8.1:
> 
>   https://github.com/mdroth/qemu/commits/stable-2.8-staging
> 
> The release is planned for 2017-03-30:
> 
>   http://wiki.qemu.org/Planning/2.8
> 
> Please respond here or CC qemu-sta...@nongnu.org on any patches you
> think should be included in the release.
> 
> Testing/feedback is greatly appreciated.
> 
> Thanks!
> 




Re: [Qemu-devel] how to create threads like dataplane in qemu

2017-04-04 Thread Fam Zheng
On Sat, 04/01 11:54, PERSIST wrote:
> Hello!
>   I want to create a specific thread for each process in VM so that virtio 
> block requests from each process can be sumbitted in the thread related to 
> the process.
>   I tried to implement it by referring to dataplane,but I can not understand 
> the source about dataplane.

There are multiple abstraction layers between guest process and QEMU threads, I
don't think the model you describe is practical. In dataplane, requests are
submitted to host through virtio ioeventfd, and handled by the per-virtual
device thread. What are you trying to achieve here?

Fam



[Qemu-devel] [PATCH]Enable kvmvapic only when host doesn't support VAPIC capability in KVM mode

2017-04-04 Thread Xu, Anthony
In KVM mode, enable kvmvapic only when host doesn't support 
VAPIC capability.

Save the time to set up kvmvapic in some hosts.


Signed-off -by: Anthony Xu 


diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index c3829e3..d5c53af 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -317,8 +317,11 @@ static void apic_common_realize(DeviceState *dev, Error 
**errp)
 info = APIC_COMMON_GET_CLASS(s);
 info->realize(dev, errp);
 
-/* Note: We need at least 1M to map the VAPIC option ROM */
+/* Note: We need at least 1M to map the VAPIC option ROM,
+   if it is KVM, enable kvmvapic only when KVM doesn't have 
+   VAPIC capability*/ 
 if (!vapic && s->vapic_control & VAPIC_ENABLE_MASK &&
+(!kvm_enabled() || (kvm_enabled() && !kvm_has_vapic())) &&
 !hax_enabled() && ram_size >= 1024 * 1024) {
 vapic = sysbus_create_simple("kvmvapic", -1, NULL);
 }
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 24281fc..43e0e4c 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -215,6 +215,7 @@ extern KVMState *kvm_state;
 
 bool kvm_has_free_slot(MachineState *ms);
 int kvm_has_sync_mmu(void);
+int kvm_has_vapic(void);
 int kvm_has_vcpu_events(void);
 int kvm_has_robust_singlestep(void);
 int kvm_has_debugregs(void);
diff --git a/kvm-all.c b/kvm-all.c
index 90b8573..edcb6ea 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -2232,6 +2232,10 @@ int kvm_has_sync_mmu(void)
 return kvm_check_extension(kvm_state, KVM_CAP_SYNC_MMU);
 }
 
+int kvm_has_vapic(void){
+return !kvm_check_extension(kvm_state, KVM_CAP_VAPIC);
+}
+
 int kvm_has_vcpu_events(void)
 {
 return kvm_state->vcpu_events;



Re: [Qemu-devel] [PATCH v2 0/3] Update AppleSMC for OS X Sierra 10.12.4 guests

2017-04-04 Thread Gabriel L. Somlo
On Tue, Apr 04, 2017 at 09:35:00PM +0200, Alexander Graf wrote:
> On 04/04/2017 07:01 PM, Gabriel L. Somlo wrote:
> > As of 10.12.4 (currently the latest Sierra update), OS X refuses to boot
> > unless the AppleSMC supports a third I/O port, which provides the current
> > error status when read.
> 
> Looks much nicer after this series :). Thanks a lot!
> 
> Reviewed-by: Alexander Graf 

Thanks! Thinking about it, we should probably drop the last (3/3)
patch, once again to minimize useless churn -- Doing that part (or
not, as the case may turn out to be) should be part of a future
series implementing more accurate emulation, including key write
support...

Thanks,
--Gabriel



[Qemu-devel] centos 7.2 guest doesn't boot without kvmvapic in TCG mode

2017-04-04 Thread Xu, Anthony
I disabled kvmvapic by 

diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c
index c3829e3..52be2b0 100644
--- a/hw/intc/apic_common.c
+++ b/hw/intc/apic_common.c
@@ -440,7 +440,7 @@ static const VMStateDescription vmstate_apic_common = {
 static Property apic_properties_common[] = {
 DEFINE_PROP_UINT8("version", APICCommonState, version, 0x14),
 DEFINE_PROP_BIT("vapic", APICCommonState, vapic_control, VAPIC_ENABLE_BIT,
-true),
+false),
 DEFINE_PROP_BOOL("legacy-instance-id", APICCommonState, legacy_instance_id,
  false),
 DEFINE_PROP_END_OF_LIST(),


Tried to boot centos guest with this command

./x86_64-softmmu/qemu-system-x86_64 -bios /home/root/guest/seabios.bin 
-machine q35 -m 1G -drive 
format=raw,file=/home/root/images/centos7.2.img,if=ide,index=0 
-nographic  -nodefaults  -serial pty -monitor pty 
-chardev stdio,id=seabios -device isa-debugcon,iobase=0x402,chardev=seabios


Seabios complained 

Seabios output
"
Booting from Hard Disk...
WARNING - Timeout at ahci_command:154!
Boot failed: could not read the boot disk

enter handle_18:
  NULL
Booting from Floppy...
Boot failed: could not read the boot disk

enter handle_18:
  NULL
No bootable device.
"

It can boot with kvmvapic enabled by using the same command.

Seabios output
"
Booting from Hard Disk...
Booting from :7c00
"

In the original commit,
" Therefore, the VAPI is enabled by default, even in TCG mode. Here, no
speedup can be achieved as the emulation overhead of the TPR register is
marginal compared to instruction emulation."

Seems no performance benefit for TCG mode
What's the other benefit of using kvmvapic in TCG mode?

Should we make TCG mode work without kvmvapic?

If we don't need to use kvmvapic in TCG mode, 
we can make kvmvapic code clearer by removing all TCG related code in 
kvmvapic.c.


Anthony



[Qemu-devel] [PATCH v1 1/5] cadence_gem: Read the correct queue descriptor

2017-04-04 Thread Alistair Francis
Read the correct descriptor instead of hardcoding the first (q=0).

Signed-off-by: Alistair Francis 
---

 hw/net/cadence_gem.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
index d4de8ad..17c229d 100644
--- a/hw/net/cadence_gem.c
+++ b/hw/net/cadence_gem.c
@@ -790,8 +790,8 @@ static void gem_get_rx_desc(CadenceGEMState *s, int q)
 {
 DB_PRINT("read descriptor 0x%x\n", (unsigned)s->rx_desc_addr[q]);
 /* read current descriptor */
-cpu_physical_memory_read(s->rx_desc_addr[0],
- (uint8_t *)s->rx_desc[0], sizeof(s->rx_desc[0]));
+cpu_physical_memory_read(s->rx_desc_addr[q],
+ (uint8_t *)s->rx_desc[q], sizeof(s->rx_desc[q]));
 
 /* Descriptor owned by software ? */
 if (rx_desc_get_ownership(s->rx_desc[q]) == 1) {
-- 
2.9.3




[Qemu-devel] [PATCH v1 0/5] Improve the Cadence GEM multi-queue support

2017-04-04 Thread Alistair Francis
Improve the Cadence GEM multi-queue support. This fixes a few bugs
which were hanging around from the initial implementation.

Alistair Francis (5):
  cadence_gem: Read the correct queue descriptor
  cadence_gem: Correct the multi-queue can rx logic
  cadence_gem: Only trigger interrupts if the status register is set
  cadence_gem: Make the revision a property
  xlnx-zynqmp: Set the Cadence GEM revision

 hw/arm/xlnx-zynqmp.c |  6 +-
 hw/net/cadence_gem.c | 30 +++---
 include/hw/net/cadence_gem.h |  1 +
 3 files changed, 25 insertions(+), 12 deletions(-)

-- 
2.9.3




Re: [Qemu-devel] [PATCH] Fix Event Viewer errors caused by qemu-ga

2017-04-04 Thread Michael Roth
Quoting Sameeh Jubran (2017-03-22 03:14:53)
>On Tue, Mar 21, 2017 at 6:09 PM, Michael Roth
>
>wrote:
>
>> Quoting Sameeh Jubran (2017-03-21 05:49:52)
>>> When the command "guest-fsfreeze-freeze" is executed it causes
>>> the VSS service to log the errors below in the Event Viewer.
>>>
>>> These errors are caused by two issues in the function
>>> "CommitSnapshots"
>> in
>>> provider.cpp:
>>>
>>> 1. When VSS_TIMEOUT_MSEC expires the funtion returns E_ABORT. This
>>> causes
>>> the error #12293.
>>>
>>> 2. The VSS_TIMEOUT_MSEC value is too big. According to msdn the
>>> "Flush & Hold" operation has 10 seconds timeout not configurable, The
>>> "CommitSnapshots" is a part of the "Flush & Hold" process and thus
>>> any
>>> timeout bigger than 10 seconds would cause the error #12298 and
>>> anything
>>> bigger than 40 seconds causes the error #12340. All this info can be
>> found here:
>>> https://msdn.microsoft.com/en-us/library/windows/desktop/
>> aa384589(v=vs.85).aspx
>>
>> Not sure how best to deal with this. Technically our CommitSnapshots
>> interface is driven by the backup job being run by QGA/QEMU management
>> side. If that amount of time exceeds the VSS limits then I think it's
>> appropriate for VSS to log the error accordingly. VSS_TIMEOUT_MSEC
>> here
>> doesn't actually have too much correlation with the VSS-set timeout,
>> IIRC it's specifically picked to exceed both the 10 and 40 second
>> timeouts and acts more as a fail-safe timeout.
>
>The timeout was added in #commit:
>b39297aedfabe9b2c426cd540413be991500da25
>There is no point in setting the TIMEOUT for this long as the actual
>freeze
>- Fush and Hold Writes -
>is limited to 10 seconds ( not configurable) according to msdn
>https://msdn.microsoft.com/en-us/library/windows/desktop/aa384589%28v=vs.85%29.aspx

The reason it's set that long is because the VSS service is perfectly
capable of telling us exactly when it has timed out. I think racing with
it because we don't like how it chooses to log errors is an ugly approach,
and in the process of doing that we lose an extra second of time that
management might have been able to make use of to complete the snapshot
successfully. 10s is already small as is, losing 10% of that to avoid
Event Log messages seems like a poor trade-off.

>
>>
>> Are the event logs causing issues? FWIW, on the posix side we also opt
>> for gratuitous logging to syslog and such, the idea there being that
>> cooperative guests would prefer transparency on how the agent is being
>> used.
>>
>Apparently, these error logs are annoying to some (
>https://bugzilla.redhat.com/show_bug.cgi?id=1387125),

It seems more like users were complaining about a different event log
error dealing with permissions issues. I don't really see anyone
concerned with the timeout errors once it was apparent how they factored
into the overall snapshot operation.

>moreover I don't think that our implementation to the freeze operation -
>which is a workaround in a way -
>should log errors even though we know they are false alarm.

These aren't really false alarms. I think there's a good reason the VSS
service was designed to log failures to hold writes in the event log,
and we primarily use it for the same reason those errors would be
relevant: creating reliable snapshots.

Users who don't care about the errors can use the filtering facilities
built into the event log viewer. Users who do care probably feel a bit
more strongly about keeping that information logged.

>
>>
>> That said, I do think error 12293 is unecessary, since IIUC it would
>> always be paired with the actual VSS-reported error. So avoiding the
>> E_ABORT seems reasonable either way.
>>
>>>
>>> |event id|   error
>>  |
>>> * 12293  : Volume Shadow Copy Service error: Error calling a routine
>>> on a
>>>Shadow Copy Provider
>>>{----}.
>>>Routine details CommitSnapshots [hr = 0x80004004, Operation
>>>aborted.
>>>
>>> * 12340  : Volume Shadow Copy Error: VSS waited more than 40 seconds
>>> for
>>>all volumes to be flushed.  This caused volume
>>>\\?\Volume{62a171da-32ec-11e4-80b1-806e6f6e6963}\ to timeout
>>>while waiting for the release-writes phase of shadow copy
>>>creation. Trying again when disk activity is lower may solve
>>>this problem.
>>>
>>> * 12298  : Volume Shadow Copy Service error: The I/O writes cannot be
>> held
>>>during the shadow copy creation period on volume
>>>\\?\Volume{62a171d9-32ec-11e4-80b1-806e6f6e6963}\. The
>>>volume
>>>index in the shadow copy set is 0. Error details:
>>>Open[0x, The operation completed successfully. ],
>>>Flush[0x, The operation completed successfully.],
>>>Release[0x, The operation completed successfully.],
>>>OnRun[0x80042314, The shadow copy provider timed out while
>>>holding writes to t

[Qemu-devel] [PATCH v1 2/5] cadence_gem: Correct the multi-queue can rx logic

2017-04-04 Thread Alistair Francis
Correct the buffer descriptor busy logic to work correctly when using
multiple queues.

Signed-off-by: Alistair Francis 
---

 hw/net/cadence_gem.c | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
index 17c229d..3e37665 100644
--- a/hw/net/cadence_gem.c
+++ b/hw/net/cadence_gem.c
@@ -481,14 +481,18 @@ static int gem_can_receive(NetClientState *nc)
 }
 
 for (i = 0; i < s->num_priority_queues; i++) {
-if (rx_desc_get_ownership(s->rx_desc[i]) == 1) {
-if (s->can_rx_state != 2) {
-s->can_rx_state = 2;
-DB_PRINT("can't receive - busy buffer descriptor (q%d) 0x%x\n",
- i, s->rx_desc_addr[i]);
- }
-return 0;
+if (rx_desc_get_ownership(s->rx_desc[i]) != 1) {
+break;
+}
+};
+
+if (i == s->num_priority_queues) {
+if (s->can_rx_state != 2) {
+s->can_rx_state = 2;
+DB_PRINT("can't receive - busy buffer descriptor (q%d) 0x%x\n",
+ i, s->rx_desc_addr[i]);
 }
+return 0;
 }
 
 if (s->can_rx_state != 0) {
-- 
2.9.3




[Qemu-devel] [PATCH v1 3/5] cadence_gem: Only trigger interrupts if the status register is set

2017-04-04 Thread Alistair Francis
Only trigger multi-queue GEM interrupts if the interrupt status register
is set. This logic was already used for non multi-queue interrupts but
it also applies to multi-queue interrupts.

Signed-off-by: Alistair Francis 
---

 hw/net/cadence_gem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
index 3e37665..b9eaed4 100644
--- a/hw/net/cadence_gem.c
+++ b/hw/net/cadence_gem.c
@@ -518,7 +518,7 @@ static void gem_update_int_status(CadenceGEMState *s)
 }
 
 for (i = 0; i < s->num_priority_queues; ++i) {
-if (s->regs[GEM_INT_Q1_STATUS + i]) {
+if (s->regs[GEM_INT_Q1_STATUS + i] && s->regs[GEM_ISR]) {
 DB_PRINT("asserting int. (q=%d)\n", i);
 qemu_set_irq(s->irq[i], 1);
 }
-- 
2.9.3




Re: [Qemu-devel] [PATCH qemu-ga] Fix a bug where qemu-ga service is stuck during stop operation

2017-04-04 Thread Michael Roth
Quoting Sameeh Jubran (2017-03-21 08:59:11)
> After triggering a freeze command without any following thaw command,
> qemu-ga will not respond to stop operation. This behaviour is wanted on Linux
> as there is no time limit for a freeze command and we want to prevent
> quitting in the middle of freeze, on the other hand on Windows the time
> limit for freeze is 10 seconds, so we should wait for the timeout, thaw
> the file system and quit.
> 
> Signed-off-by: Sameeh Jubran 
> ---
>  qga/main.c | 20 
>  qga/vss-win32.h|  1 +
>  qga/vss-win32/vss-common.h |  5 +
>  3 files changed, 26 insertions(+)
> 
> diff --git a/qga/main.c b/qga/main.c
> index 92658bc..de37c85 100644
> --- a/qga/main.c
> +++ b/qga/main.c
> @@ -130,9 +130,29 @@ static void quit_handler(int sig)
>   * unless all log/pid files are on unfreezable filesystems. there's
>   * also a very likely chance killing the agent before unfreezing
>   * the filesystems is a mistake (or will be viewed as one later).
> + * On Windows the freeze interval is limited to 10 seconds, so
> + * we should quit, but first we should wait for the timeout, thaw
> + * the filesystem and quit.
>   */
>  if (ga_is_frozen(ga_state)) {
> +#ifdef _WIN32
> +int i = 0;
> +Error *err = NULL;
> +HANDLE hEventTimeout;

Maybe add the following to help explain the added delay:

   g_debug("Thawing filesystems before exiting");

> +
> +hEventTimeout = OpenEvent(EVENT_ALL_ACCESS, FALSE, 
> EVENT_NAME_TIMEOUT);
> +if (hEventTimeout) {
> +WaitForSingleObject(hEventTimeout, 0);
> +CloseHandle(hEventTimeout);
> +}
> +qga_vss_fsfreeze(&i, &err, false);
> +if (err) {
> +g_debug("Error: %s", error_get_pretty(err));

Perhaps: "Error unfreezing filesystems prior to exiting: %s"

> +error_free(err);
> +}
> +#else
>  return;
> +#endif
>  }
>  g_debug("received signal num %d, quitting", sig);
> 
> diff --git a/qga/vss-win32.h b/qga/vss-win32.h
> index 4d1d150..5f4ea70 100644
> --- a/qga/vss-win32.h
> +++ b/qga/vss-win32.h
> @@ -13,6 +13,7 @@
>  #ifndef VSS_WIN32_H
>  #define VSS_WIN32_H
> 
> +#include "qga/vss-win32/vss-common.h"
> 
>  bool vss_init(bool init_requester);
>  void vss_deinit(bool deinit_requester);
> diff --git a/qga/vss-win32/vss-common.h b/qga/vss-win32/vss-common.h
> index c81a856..5756d00 100644
> --- a/qga/vss-win32/vss-common.h
> +++ b/qga/vss-win32/vss-common.h
> @@ -12,6 +12,7 @@
> 
>  #ifndef VSS_COMMON_H
>  #define VSS_COMMON_H
> +#ifndef VSS_WIN32_H
> 
>  #define __MIDL_user_allocate_free_DEFINED__
>  #include 
> @@ -57,6 +58,7 @@
>  #define L(a) _L(a)
> 
>  /* Constants for QGA VSS Provider */
> +#endif
> 
>  #define QGA_PROVIDER_NAME "QEMU Guest Agent VSS Provider"
>  #define QGA_PROVIDER_LNAME L(QGA_PROVIDER_NAME)
> @@ -66,6 +68,8 @@
>  #define EVENT_NAME_THAW"Global\\QGAVSSEvent-thaw"
>  #define EVENT_NAME_TIMEOUT "Global\\QGAVSSEvent-timeout"

This is a bit of tangle. I'd rather you just move these out
into a separate header, vss-handles.h or something, and then
include that in main.c and vss-common.h.

> 
> +#ifndef VSS_WIN32_H
> +
>  const GUID g_gProviderId = { 0x3629d4ed, 0xee09, 0x4e0e,
>  {0x9a, 0x5c, 0x6d, 0x8b, 0xa2, 0x87, 0x2a, 0xef} };
>  const GUID g_gProviderVersion = { 0x11ef8b15, 0xcac6, 0x40d6,
> @@ -126,3 +130,4 @@ public:
>  };
> 
>  #endif
> +#endif
> -- 
> 2.9.3
> 




[Qemu-devel] [PATCH v1 4/5] cadence_gem: Make the revision a property

2017-04-04 Thread Alistair Francis
Expose the Cadence GEM revision as a property.

Signed-off-by: Alistair Francis 
---

 hw/net/cadence_gem.c | 6 +-
 include/hw/net/cadence_gem.h | 1 +
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/net/cadence_gem.c b/hw/net/cadence_gem.c
index b9eaed4..7047e02 100644
--- a/hw/net/cadence_gem.c
+++ b/hw/net/cadence_gem.c
@@ -300,6 +300,8 @@
 #define DESC_1_RX_SOF 0x4000
 #define DESC_1_RX_EOF 0x8000
 
+#define GEM_MODID_VALUE 0x00020118
+
 static inline unsigned tx_desc_get_buffer(unsigned *desc)
 {
 return desc[0];
@@ -1213,7 +1215,7 @@ static void gem_reset(DeviceState *d)
 s->regs[GEM_TXPAUSE] = 0x;
 s->regs[GEM_TXPARTIALSF] = 0x03ff;
 s->regs[GEM_RXPARTIALSF] = 0x03ff;
-s->regs[GEM_MODID] = 0x00020118;
+s->regs[GEM_MODID] = s->revision;
 s->regs[GEM_DESCONF] = 0x02500111;
 s->regs[GEM_DESCONF2] = 0x2ab13fff;
 s->regs[GEM_DESCONF5] = 0x002f2145;
@@ -1512,6 +1514,8 @@ static const VMStateDescription vmstate_cadence_gem = {
 
 static Property gem_properties[] = {
 DEFINE_NIC_PROPERTIES(CadenceGEMState, conf),
+DEFINE_PROP_UINT32("revision", CadenceGEMState, revision,
+   GEM_MODID_VALUE),
 DEFINE_PROP_UINT8("num-priority-queues", CadenceGEMState,
   num_priority_queues, 1),
 DEFINE_PROP_UINT8("num-type1-screeners", CadenceGEMState,
diff --git a/include/hw/net/cadence_gem.h b/include/hw/net/cadence_gem.h
index c469ffe..35de622 100644
--- a/include/hw/net/cadence_gem.h
+++ b/include/hw/net/cadence_gem.h
@@ -50,6 +50,7 @@ typedef struct CadenceGEMState {
 uint8_t num_priority_queues;
 uint8_t num_type1_screeners;
 uint8_t num_type2_screeners;
+uint32_t revision;
 
 /* GEM registers backing store */
 uint32_t regs[CADENCE_GEM_MAXREG];
-- 
2.9.3




[Qemu-devel] [PATCH v1 5/5] xlnx-zynqmp: Set the Cadence GEM revision

2017-04-04 Thread Alistair Francis
Signed-off-by: Alistair Francis 
---

 hw/arm/xlnx-zynqmp.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/hw/arm/xlnx-zynqmp.c b/hw/arm/xlnx-zynqmp.c
index bc4e66b..e41b6fe 100644
--- a/hw/arm/xlnx-zynqmp.c
+++ b/hw/arm/xlnx-zynqmp.c
@@ -30,6 +30,8 @@
 #define ARM_PHYS_TIMER_PPI  30
 #define ARM_VIRT_TIMER_PPI  27
 
+#define GEM_REVISION0x40070106
+
 #define GIC_BASE_ADDR   0xf900
 #define GIC_DIST_ADDR   0xf901
 #define GIC_CPU_ADDR0xf902
@@ -334,8 +336,10 @@ static void xlnx_zynqmp_realize(DeviceState *dev, Error 
**errp)
 qemu_check_nic_model(nd, TYPE_CADENCE_GEM);
 qdev_set_nic_properties(DEVICE(&s->gem[i]), nd);
 }
+object_property_set_int(OBJECT(&s->gem[i]), GEM_REVISION, "revision",
+&error_abort);
 object_property_set_int(OBJECT(&s->gem[i]), 2, "num-priority-queues",
-  &error_abort);
+&error_abort);
 object_property_set_bool(OBJECT(&s->gem[i]), true, "realized", &err);
 if (err) {
 error_propagate(errp, err);
-- 
2.9.3




Re: [Qemu-devel] [PATCH qemu-ga] qemu-ga: Don't display errors to the user on thaw command

2017-04-04 Thread Michael Roth
Quoting Sameeh Jubran (2017-03-21 09:14:35)
> Errors that are related to ur inner implementation for the thaw command
> shouldn't be displayed to the user.
> 
> Signed-off-by: Sameeh Jubran 
> ---
>  qga/vss-win32/requester.cpp | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/qga/vss-win32/requester.cpp b/qga/vss-win32/requester.cpp
> index 0cd2f0e..272e71b 100644
> --- a/qga/vss-win32/requester.cpp
> +++ b/qga/vss-win32/requester.cpp
> @@ -463,7 +463,7 @@ void requester_thaw(int *num_vols, ErrorSet *errset)
>  hr = WaitForAsync(pAsync);
>  }
>  if (FAILED(hr)) {
> -err_set(errset, hr, "failed to complete backup");

We cannot do this. If the freeze operation didn't successfully maintain
the frozen state for entire duration we *must* report an error to the
user, otherwise users have no way to know that their snapshot might be
completely corrupted. Well, I suppose they can look at
guest-fsfreeze-thaw's return value and check that it matches the number
of volumes that were originally frozen, but that aspect is intended as a
sanity check to identify situations outside of qemu-ga's control, like
another user/application unfreezing the filesystems before qemu-ga. This
situation *is* within qemu-ga's control, and should be reported as an
error. Same for the other failures below.

> +fprintf(stderr, "failed to complete backup");
>  }
>  break;
> 
> @@ -480,18 +480,18 @@ void requester_thaw(int *num_vols, ErrorSet *errset)
> 
>  case VSS_E_UNEXPECTED_PROVIDER_ERROR:
>  if (WaitForSingleObject(vss_ctx.hEventTimeout, 0) != WAIT_OBJECT_0) {
> -err_set(errset, hr, "unexpected error in VSS provider");
> +fprintf(stderr, "unexpected error in VSS provider");
>  break;
>  }
>  /* fall through if hEventTimeout is signaled */
> 
>  case (HRESULT)VSS_E_HOLD_WRITES_TIMEOUT:
> -err_set(errset, hr, "couldn't hold writes: "
> +fprintf(stderr, "couldn't hold writes: "
>  "fsfreeze is limited up to 10 seconds");
>  break;
> 
>  default:
> -err_set(errset, hr, "failed to do snapshot set");
> +fprintf(stderr, "failed to do snapshot set");
>  }
> 
>  if (err_is_set(errset)) {
> -- 
> 2.9.3
> 




Re: [Qemu-devel] [PATCH qemu-ga v2] qemu-ga: Make QGA VSS provider service run only when needed

2017-04-04 Thread Michael Roth
Quoting Sameeh Jubran (2017-03-23 11:26:50)
> Currently the service runs in background on boot even though it is not
> needed and once it is running it never stops. The service needs to be
> running only during freeze operation and it should be stopped after
> executing thaw.
> 
> Signed-off-by: Sameeh Jubran 


Thanks, applied to qga tree:

  https://github.com/mdroth/qemu/commits/qga

> ---
>  qga/vss-win32/install.cpp   | 28 ++--
>  qga/vss-win32/install.h | 20 
>  qga/vss-win32/requester.cpp |  2 ++
>  3 files changed, 48 insertions(+), 2 deletions(-)
>  create mode 100644 qga/vss-win32/install.h
> 
> diff --git a/qga/vss-win32/install.cpp b/qga/vss-win32/install.cpp
> index f4160a3..f41fcdf 100644
> --- a/qga/vss-win32/install.cpp
> +++ b/qga/vss-win32/install.cpp
> @@ -14,7 +14,7 @@
> 
>  #include "vss-common.h"
>  #include 
> -#include 
> +#include "install.h"
>  #include 
>  #include 
>  #include 
> @@ -276,7 +276,7 @@ STDAPI COMRegister(void)
> 
>  chk(pCatalog->CreateServiceForApplication(
>  _bstr_t(QGA_PROVIDER_LNAME), _bstr_t(QGA_PROVIDER_LNAME),
> -_bstr_t(L"SERVICE_AUTO_START"), _bstr_t(L"SERVICE_ERROR_NORMAL"),
> +_bstr_t(L"SERVICE_DEMAND_START"), 
> _bstr_t(L"SERVICE_ERROR_NORMAL"),
>  _bstr_t(L""), _bstr_t(L".\\localsystem"), _bstr_t(L""), FALSE));
>  chk(pCatalog->InstallComponent(_bstr_t(QGA_PROVIDER_LNAME),
> _bstr_t(dllPath), _bstr_t(tlbPath),
> @@ -461,3 +461,27 @@ namespace _com_util
>  return bstr;
>  }
>  }
> +
> +/* Stop QGA VSS provider service from COM+ Application Admin Catalog */
> +
> +STDAPI StopService(void)
> +{
> +HRESULT hr;
> +COMInitializer initializer;
> +COMPointer pUnknown;
> +COMPointer pCatalog;
> +
> +int count = 0;
> +
> +chk(QGAProviderFind(QGAProviderCount, (void *)&count));
> +if (count) {
> +chk(CoCreateInstance(CLSID_COMAdminCatalog, NULL, 
> CLSCTX_INPROC_SERVER,
> +IID_IUnknown, (void **)pUnknown.replace()));
> +chk(pUnknown->QueryInterface(IID_ICOMAdminCatalog2,
> +(void **)pCatalog.replace()));
> +chk(pCatalog->ShutdownApplication(_bstr_t(QGA_PROVIDER_LNAME)));
> +}
> +
> +out:
> +return hr;
> +}
> diff --git a/qga/vss-win32/install.h b/qga/vss-win32/install.h
> new file mode 100644
> index 000..35364af
> --- /dev/null
> +++ b/qga/vss-win32/install.h
> @@ -0,0 +1,20 @@
> +/*
> + * QEMU Guest Agent VSS requester declarations
> + *
> + * Copyright Hitachi Data Systems Corp. 2013
> + *
> + * Authors:
> + *  Tomoki Sekiyama   
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#ifndef INSTALL_H
> +#define INSTALL_H
> +
> +#include 
> +
> +STDAPI StopService(void);
> +
> +#endif
> diff --git a/qga/vss-win32/requester.cpp b/qga/vss-win32/requester.cpp
> index 272e71b..27308ad 100644
> --- a/qga/vss-win32/requester.cpp
> +++ b/qga/vss-win32/requester.cpp
> @@ -13,6 +13,7 @@
>  #include "qemu/osdep.h"
>  #include "vss-common.h"
>  #include "requester.h"
> +#include "install.h"
>  #include 
>  #include 
> 
> @@ -501,4 +502,5 @@ void requester_thaw(int *num_vols, ErrorSet *errset)
>  requester_cleanup();
> 
>  CoUninitialize();
> +StopService();
>  }
> -- 
> 2.9.3
> 




Re: [Qemu-devel] [PATCH v2 2/6] qdict: Add convenience helpers for wrapped puts

2017-04-04 Thread Eric Blake
On 01/19/2017 08:38 AM, Eric Blake wrote:
> On 01/19/2017 03:25 AM, Markus Armbruster wrote:
>> Eric Blake  writes:
>>
>>> Quite a few users of qdict_put() were manually wrapping a
>>> non-QObject. We can make such call-sites shorter, by providing
>>> common macros to do the tedious work.  Also shorten nearby
>>> qdict_put_obj(,,QOBJECT()) sequences.
>>>
>>> Signed-off-by: Eric Blake 
>>> Reviewed-by: Alberto Garcia 
>>>
>>> ---
>>>
>>> v2: rebase to current master
>>>
>>> I'm okay if you want me to break this patch into smaller pieces.
>>
>> I guess I'm okay with a single piece, but I'd like to know how you did
>> the conversion.  Coccinelle?  Manually?
> 
> Manual, via grepping for put_obj.*QOBJECT. I'll see if I can do the same
> under Coccinelle (at which point, committing the script will make it
> easier to rerun cleanups if later code reintroduces poor usage
> patterns), so maybe I have a v3 coming up.

I've got a Coccinelle patch (mostly) working now - but it has one
shortfall - I found places in tests/check-qdict.c that coccinelle
didn't, and traced it to the fact that our use of g_assert_cmpint(expr,
==, expr) throws off the coccinelle parser so badly that it silently
ignores the entire function body containing the use of that macro.  v3
will be posted soon, with the best of both worlds (coccinelle caught
spots that I missed, not to mention recent code base changes; and my
manual search found the spots in tests/ that coccinelle missed).

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] How to provide build/CI environment for NetBSD host

2017-04-04 Thread Ryo ONODERA
Hi,

From: Peter Maydell , Date: Tue, 4 Apr 2017 23:30:42 
+0100

> On 4 April 2017 at 23:28, Ryo ONODERA  wrote:
>> I use qemu from master branch (with some modification) every weekend.
>> However I had never run 'make check'.
>> I will run 'make check'.
> 
> Hmm. ivshmem_server doesn't link for me, because
> it uses shm_open() but doesn't link with -lrt.
> So the build fails.

 My 'some modifications' contains fix for your problem.
I will submit it later.

Thank you.

> thanks
> -- PMM

--
Ryo ONODERA // ryo...@yk.rim.or.jp
PGP fingerprint = 82A2 DC91 76E0 A10A 8ABB  FD1B F404 27FA C7D1 15F3



Re: [Qemu-devel] How to provide build/CI environment for NetBSD host

2017-04-04 Thread Peter Maydell
On 4 April 2017 at 23:28, Ryo ONODERA  wrote:
> I use qemu from master branch (with some modification) every weekend.
> However I had never run 'make check'.
> I will run 'make check'.

Hmm. ivshmem_server doesn't link for me, because
it uses shm_open() but doesn't link with -lrt.
So the build fails.

thanks
-- PMM



Re: [Qemu-devel] How to provide build/CI environment for NetBSD host

2017-04-04 Thread Ryo ONODERA
Hi,

From: Peter Maydell , Date: Tue, 4 Apr 2017 22:43:50 
+0100

> On 4 April 2017 at 22:22, Ryo ONODERA  wrote:
>> I saw "Warning of unsupported host systems"
>> in http://wiki.qemu-project.org/ChangeLog/2.9 wiki page.
>>
>> I would like to contribute to qemu's NetBSD host support.
>>
>> I can provide NetBSD/amd64(x86_64) 7-stable build
>> environment as build/CI machine to the qemu project
>> if nobody provides it to the project yet.
>>
>> How to provide build/CI environment to the qemu project?
> 
> Thanks for the offer. Since then, I've been able to set up
> a NetBSD VM, which is probably a reasonable CI environment.

Good news for me! Thank you very much!

> However, the other reason why netbsd isn't in the supported
> list is that when I tried compiling it it didn't pass the basic
> 'make; make check' test. So obviously nobody's been
> using or testing it :-(

I use qemu from master branch (with some modification) every weekend.
However I had never run 'make check'.
I will run 'make check'.

> You could start with looking at the compile failures
> and submitting patches.

O.k.
Thank you.

> thanks
> -- PMM

--
Ryo ONODERA // ryo...@yk.rim.or.jp
PGP fingerprint = 82A2 DC91 76E0 A10A 8ABB  FD1B F404 27FA C7D1 15F3



Re: [Qemu-devel] How to provide build/CI environment for NetBSD host

2017-04-04 Thread Peter Maydell
On 4 April 2017 at 22:22, Ryo ONODERA  wrote:
> I saw "Warning of unsupported host systems"
> in http://wiki.qemu-project.org/ChangeLog/2.9 wiki page.
>
> I would like to contribute to qemu's NetBSD host support.
>
> I can provide NetBSD/amd64(x86_64) 7-stable build
> environment as build/CI machine to the qemu project
> if nobody provides it to the project yet.
>
> How to provide build/CI environment to the qemu project?

Thanks for the offer. Since then, I've been able to set up
a NetBSD VM, which is probably a reasonable CI environment.

However, the other reason why netbsd isn't in the supported
list is that when I tried compiling it it didn't pass the basic
'make; make check' test. So obviously nobody's been
using or testing it :-(

You could start with looking at the compile failures
and submitting patches.

thanks
-- PMM



[Qemu-devel] [PATCH] Put all trace.o into libqemuutil.a

2017-04-04 Thread Xu, Anthony
Put all trace.o into libqemuutil.a

Currently all trace.o are linked into qemu-system, qemu-img, 
qemu-nbd, qemu-io etc., even the corresponding components 
are not included.
Put all trace.o into libqemuutil.a that the linker would only pull in .o 
files containing symbols that are actually referenced by the 
program.


Signed-off -by: Anthony Xu 



diff --git a/Makefile b/Makefile
index 6c359b2..31d41a7 100644
--- a/Makefile
+++ b/Makefile
@@ -346,7 +346,7 @@ dtc/%:
mkdir -p $@

 $(SUBDIR_RULES): libqemuutil.a libqemustub.a $(common-obj-y) $(chardev-obj-y) \
-   $(qom-obj-y) $(crypto-aes-obj-$(CONFIG_USER_ONLY)) $(trace-obj-y)
+   $(qom-obj-y) $(crypto-aes-obj-$(CONFIG_USER_ONLY))

 ROMSUBDIR_RULES=$(patsubst %,romsubdir-%, $(ROMS))
 # Only keep -O and -g cflags
@@ -366,11 +366,11 @@ Makefile: $(version-obj-y)
 # Build libraries

 libqemustub.a: $(stub-obj-y)
-libqemuutil.a: $(util-obj-y)
+libqemuutil.a: $(util-obj-y) $(trace-obj-y)

 ##

-COMMON_LDADDS = $(trace-obj-y) libqemuutil.a libqemustub.a
+COMMON_LDADDS = libqemuutil.a libqemustub.a

 qemu-img.o: qemu-img-cmds.h

diff --git a/Makefile.target b/Makefile.target
index d5ff0c7..69239e0 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -185,8 +185,7 @@ dummy := $(call unnest-vars,.., \
qom-obj-y \
io-obj-y \
common-obj-y \
-   common-obj-m \
-   trace-obj-y)
+   common-obj-m)
 target-obj-y := $(target-obj-y-save)
 all-obj-y += $(common-obj-y)
 all-obj-y += $(target-obj-y)
@@ -198,7 +197,7 @@ all-obj-$(CONFIG_SOFTMMU) += $(io-obj-y)

 $(QEMU_PROG_BUILD): config-devices.mak

-COMMON_LDADDS = $(trace-obj-y) ../libqemuutil.a ../libqemustub.a
+COMMON_LDADDS = ../libqemuutil.a ../libqemustub.a

 # build either PROG or PROGW
 $(QEMU_PROG_BUILD): $(all-obj-y) $(COMMON_LDADDS)
diff --git a/tests/Makefile.include b/tests/Makefile.include
index f3de81f..579ec07 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -519,7 +519,7 @@ QEMU_CFLAGS += -I$(SRC_PATH)/tests


 # Deps that are common to various different sets of tests below
-test-util-obj-y = $(trace-obj-y) libqemuutil.a libqemustub.a
+test-util-obj-y = libqemuutil.a libqemustub.a
 test-qom-obj-y = $(qom-obj-y) $(test-util-obj-y)
 test-qapi-obj-y = tests/test-qapi-visit.o tests/test-qapi-types.o \
tests/test-qapi-event.o tests/test-qmp-introspect.o \



[Qemu-devel] How to provide build/CI environment for NetBSD host

2017-04-04 Thread Ryo ONODERA
Hi,

I saw "Warning of unsupported host systems"
in http://wiki.qemu-project.org/ChangeLog/2.9 wiki page.

I would like to contribute to qemu's NetBSD host support.

I can provide NetBSD/amd64(x86_64) 7-stable build
environment as build/CI machine to the qemu project
if nobody provides it to the project yet.

How to provide build/CI environment to the qemu project?

Thank you.

P.S.
I am not on the qemu-devel@nongnu.org list.
Please add me to CC.

--
Ryo ONODERA // ryo...@yk.rim.or.jp
PGP fingerprint = 82A2 DC91 76E0 A10A 8ABB  FD1B F404 27FA C7D1 15F3



[Qemu-devel] [ANNOUNCE] QEMU 2.9.0-rc3 is now available

2017-04-04 Thread Michael Roth
Hello,

On behalf of the QEMU Team, I'd like to announce the availability of the
fourth release candidate for the QEMU 2.9 release.  This release is meant
for testing purposes and should not be used in a production environment.

  http://download.qemu-project.org/qemu-2.9.0-rc3.tar.xz
  http://download.qemu-project.org/qemu-2.9.0-rc3.tar.xz.sig

A note from the maintainer:

  We anticipate there being an rc4 release next week but that
  should be the last rc before the 2.9.0 release, so we'll only
  be taking fixes for critical bugs between now and then.

You can help improve the quality of the QEMU 2.9 release by testing this
release and reporting bugs on Launchpad:

  https://bugs.launchpad.net/qemu/

The release plan, as well a documented known issues for release
candidates, are available at:

  http://wiki.qemu.org/Planning/2.9

The dates have all been pushed back a week due to delays with the
initial RC.

Please add entries to the ChangeLog for the 2.9 release below:

  http://wiki.qemu.org/ChangeLog/2.9

Changes since rc2:

1fde6ee: Update version for v2.9.0-rc3 release (Peter Maydell)
6d54af0: 9pfs: clear migration blocker at session reset (Greg Kurz)
18adde8: 9pfs: fix multiple flush for same request (Greg Kurz)
193982c: pci: Only unmap bus_master_enabled_region if was added previously 
(Alexey Kardashevskiy)
b8a6872: io: fix FD socket handling in DNS lookup (Daniel P. Berrange)
0e5d632: io: fix incoming client socket initialization (Wang guang)
d9123d0: tests/libqtest.c: Delete possible stale unix sockets (Peter Maydell)
ecbddbb: main-loop: Acquire main_context lock around os_host_main_loop_wait. 
(Richard W.M. Jones)
86d1bd7: block/parallels: Avoid overflows (Max Reitz)
f82c5b1: iotests: Improve image-clear tests on non-aligned image (Eric Blake)
0c1bd46: qcow2: Discard unaligned tail when wiping image (Eric Blake)
07ff948: iotests: fix 097 when run with qcow (Daniel P. Berrange)
6aabeb5: qemu-io-cmds: Assert that global and nofile commands don't use 
ct->perms (Peter Maydell)
d1c1368: sheepdog: Fix blockdev-add (Markus Armbruster)
9445673: nbd: Tidy up blockdev-add interface (Markus Armbruster)
216411b: sockets: New helper socket_address_crumple() (Markus Armbruster)
8bc0673: qapi-schema: SocketAddressFlat variants 'vsock' and 'fd' (Markus 
Armbruster)
fce5d53: gluster: Prepare for SocketAddressFlat extension (Markus Armbruster)
129c7d1: block: Document -drive problematic code and bugs (Markus Armbruster)
a6c7628: io vnc sockets: Clean up SocketAddressKind switches (Markus Armbruster)
d2e49aa: char: Fix socket with "type": "vsock" address (Markus Armbruster)
ca0b64e: nbd sockets vnc: Mark problematic address family tests TODO (Markus 
Armbruster)
9588c58: block: add missed aio_context_acquire into release_drive (Denis V. 
Lunev)
102a3d8: usb-host: switch to LIBUSB_API_VERSION (Gerd Hoffmann)
230f4c6: disas/cris.c: Avoid unintentional sign extension (Peter Maydell)
6499fd1: configure: Mark SPARC as supported (Peter Maydell)
5c32be5: tcg/sparc: Zero extend address argument to ld/st helpers (Peter 
Maydell)
709a340: tcg/sparc: Zero extend data argument to store helpers (Peter Maydell)
90c4fe5: exec: revert MemoryRegionCache (Paolo Bonzini)
fa03cb7: vnc: allow to connect with add_client when -vnc none (Marc-André 
Lureau)
1684907: Fix input-linux reading from device (Javier Celaya)
243afe8: xhci: flush dequeue pointer to endpoint context (Gerd Hoffmann)
8149e29: pseries: Enforce homogeneous threads-per-core (David Gibson)
cc1e139: nbd: fix memory leak on socket_connect failed (yaolujing)
cb9a05a: ipmi: Fix macro issues (Corey Minyard)
c7f15bc: target-i386: fix "info lapic" segfault on isapc (Tejaswini Poluri)
b9afaba: iscsi: drop unused IscsiAIOCB.qiov field (Stefan Hajnoczi)
34634ca: block/curl: Check protocol prefix (Max Reitz)
6b9d62d: qapi/curl: Extend and fix blockdev-add schema (Max Reitz)
e98c696: rbd: Fix regression in legacy key/values containing escaped : (Eric 
Blake)
93587e3: Revert "vfio/pci-quirks.c: Disable stolen memory for igd VFIO" (Xiong 
Zhang)
e7d5441: hw/intc/arm_gicv3_kvm: Check KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS in reset 
(Eric Auger)
fd5d23b: hmp: fix "dump-quest-memory" segfault (Iwona Kotlarska)
4eaf720: qga: Make qemu-ga compile statically for Windows (Sameeh Jubran)
b4053c6: e1000: disable debug by default (Jason Wang)
1074b87: virtio-net: avoid call tap_enable when there's only one queue (Jason 
Wang)
8251a72: qga: don't fail if mount doesn't have slave devices (Michael Roth)
0d87608: tests/acpi: don't pack a structure (Michael S. Tsirkin)
375f74f: vhost: generalize iommu memory region (Jason Wang)
fb59dab: configure: Don't claim 'unsupported host OS' when better message 
available (Peter Maydell)
fe6824d: spapr: fix memory hot-unplugging (Laurent Vivier)
24ec286: spapr: fix buffer-overflow (Marc-André Lureau)
b8adbc6: virtio: fix vring_align() on 64-bit windows (Andrew Baumann)
c53598e: pci: Add missing drop of bus master AS reference (Alexey Kardashevskiy)
aa26292: even

[Qemu-devel] [PATCH v2 16/21] generic-sdhci: Remove user_creatable flag

2017-04-04 Thread Eduardo Habkost
generic-sdhci needs to be wired by other devices' code, so it
can't be used with -device. Remove the user_creatable flag from
the device class.

Cc: Peter Maydell 
Cc: "Edgar E. Iglesias" 
Cc: David Gibson 
Cc: Alexander Graf 
Cc: "Michael S. Tsirkin" 
Cc: Marcel Apfelbaum 
Cc: Prasad J Pandit 
Cc: Alistair Francis 
Reviewed-by: Alistair Francis 
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* (none)
---
 hw/sd/sdhci.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
index dbf61fccb8..6d6a791ee9 100644
--- a/hw/sd/sdhci.c
+++ b/hw/sd/sdhci.c
@@ -1360,11 +1360,6 @@ static void sdhci_sysbus_class_init(ObjectClass *klass, 
void *data)
 dc->props = sdhci_sysbus_properties;
 dc->realize = sdhci_sysbus_realize;
 dc->reset = sdhci_poweron_reset;
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
-dc->user_creatable = true;
 }
 
 static const TypeInfo sdhci_sysbus_info = {
-- 
2.11.0.259.g40922b1




[Qemu-devel] [PATCH v2 21/21] s390-pcibus: No need to set user_creatable=false explicitly

2017-04-04 Thread Eduardo Habkost
TYPE_S390_PCI_HOST_BRIDGE is a subclass of TYPE_PCI_HOST_BRIDGE,
which is a subclass of TYPE_SYS_BUS_DEVICE. TYPE_SYS_BUS_DEVICE
already sets user_creatable=false, so we don't require an
explicit user_creatable=false assignment in
s390_pcihost_class_init().

Cc: Alexander Graf 
Cc: Christian Borntraeger 
Cc: Cornelia Huck 
Cc: Frank Blaschka 
Cc: Laszlo Ersek 
Cc: Marcel Apfelbaum 
Cc: Markus Armbruster 
Cc: Peter Maydell 
Cc: Pierre Morel 
Cc: Richard Henderson 
Cc: Thomas Huth 
Cc: Yi Min Zhao 
Signed-off-by: Eduardo Habkost 
---
Changes series v1 -> v2:
* Previous patch was:
  "s390: Add FIXME for unexplained user_creatable=false line",
  but now we know we can remove the explicit user_creatable=false
  assignment
---
 hw/s390x/s390-pci-bus.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 1ec30c45ce..973893df07 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -867,7 +867,6 @@ static void s390_pcihost_class_init(ObjectClass *klass, 
void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(klass);
 
-dc->user_creatable = false;
 dc->reset = s390_pcihost_reset;
 k->init = s390_pcihost_init;
 hc->plug = s390_pcihost_hot_plug;
-- 
2.11.0.259.g40922b1




[Qemu-devel] [PATCH v2 11/21] allwinner-ahci: Remove user_creatable flag

2017-04-04 Thread Eduardo Habkost
allwinner-ahci needs its IRQ to be connected and mmio to be
mapped (this is done by the alwinner-a10 device realize method),
and won't work with -device. Remove the user_creatable flag from
the device class.

Cc: John Snow 
Cc: qemu-bl...@nongnu.org
Cc: Beniamino Galvani 
Cc: Peter Maydell 
Cc: qemu-...@nongnu.org
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* Commit message rewrite only
---
 hw/ide/ahci.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 2ea1a282ef..f60826d6e0 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -1815,11 +1815,6 @@ static void allwinner_ahci_class_init(ObjectClass 
*klass, void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 
 dc->vmsd = &vmstate_allwinner_ahci;
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
-dc->user_creatable = true;
 }
 
 static const TypeInfo allwinner_ahci_info = {
-- 
2.11.0.259.g40922b1




[Qemu-devel] [PATCH v2 20/21] xen-sysdev: Remove user_creatable flag

2017-04-04 Thread Eduardo Habkost
TYPE_XENSYSDEV is only used internally by xen_be_init(), and is
not supposed to be plugged/unplugged dynamically. Remove the
user_creatable flag from the device class.

Cc: Juergen Gross ,
Cc: Peter Maydell ,
Cc: Thomas Huth 
Cc: sstabell...@kernel.org
Cc: Markus Armbruster ,
Cc: Marcel Apfelbaum ,
Cc: Laszlo Ersek 
Signed-off-by: Eduardo Habkost 
---
Changes series v1 -> v2:
* (New patch added to series)
---
 hw/xen/xen_backend.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c
index 56df27992f..e48c5f48fc 100644
--- a/hw/xen/xen_backend.c
+++ b/hw/xen/xen_backend.c
@@ -663,11 +663,6 @@ static void xen_sysdev_class_init(ObjectClass *klass, void 
*data)
 k->init = xen_sysdev_init;
 dc->props = xen_sysdev_properties;
 dc->bus_type = TYPE_XENSYSBUS;
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
-dc->user_creatable = true;
 }
 
 static const TypeInfo xensysdev_info = {
-- 
2.11.0.259.g40922b1




[Qemu-devel] [PATCH v2 06/21] pflash_cfi01: Remove user_creatable flag

2017-04-04 Thread Eduardo Habkost
TYPE_CFI_PFLASH01 devices need to be mapped by
pflash_cfi01_register() (or equivalent) and can't be used with
-device. Remove user_creatable from the device class.

Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: qemu-bl...@nongnu.org
Cc: Laszlo Ersek 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Laszlo Ersek 
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* Commit message rewrite only
---
 hw/block/pflash_cfi01.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/hw/block/pflash_cfi01.c b/hw/block/pflash_cfi01.c
index ef71956433..594d4cf6fe 100644
--- a/hw/block/pflash_cfi01.c
+++ b/hw/block/pflash_cfi01.c
@@ -927,11 +927,6 @@ static void pflash_cfi01_class_init(ObjectClass *klass, 
void *data)
 dc->props = pflash_cfi01_properties;
 dc->vmsd = &vmstate_pflash;
 set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
-dc->user_creatable = true;
 }
 
 
-- 
2.11.0.259.g40922b1




[Qemu-devel] [PATCH v2 19/21] virtio-mmio: Remove user_creatable flag

2017-04-04 Thread Eduardo Habkost
virtio-mmio needs to be wired and mapped by other device or board
code, and won't work with -device. Remove the user_creatable flag
from the device class.

Cc: Peter Maydell 
Cc: Shannon Zhao 
Cc: "Michael S. Tsirkin" 
Reviewed-by: Laszlo Ersek 
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* (none)
---
 hw/virtio/virtio-mmio.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
index 6491a771ff..5807aa87fe 100644
--- a/hw/virtio/virtio-mmio.c
+++ b/hw/virtio/virtio-mmio.c
@@ -450,11 +450,6 @@ static void virtio_mmio_class_init(ObjectClass *klass, 
void *data)
 dc->reset = virtio_mmio_reset;
 set_bit(DEVICE_CATEGORY_MISC, dc->categories);
 dc->props = virtio_mmio_properties;
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
-dc->user_creatable = true;
 }
 
 static const TypeInfo virtio_mmio_info = {
-- 
2.11.0.259.g40922b1




[Qemu-devel] [PATCH v2 02/21] sysbus: Set user_creatable=false by default on TYPE_SYS_BUS_DEVICE

2017-04-04 Thread Eduardo Habkost
commit 33cd52b5d7b9adfd009e95f07e6c64dd88ae2a31 unset
cannot_instantiate_with_device_add_yet in TYPE_SYSBUS, making all
sysbus devices appear on "-device help" and lack the "no-user"
flag in "info qdm".

To fix this, we can set user_creatable=false by default on
TYPE_SYS_BUS_DEVICE, but this requires setting
user_creatable=true explicitly on the sysbus devices that
actually work with -device.

Fortunately today we have just a few has_dynamic_sysbus=1
machines: virt, pc-q35-*, ppce500, and spapr.

virt, ppce500, and spapr have extra checks to ensure just a few
device types can be instantiated:

* virt supports only TYPE_VFIO_CALXEDA_XGMAC, TYPE_VFIO_AMD_XGBE.
* ppce500 supports only TYPE_ETSEC_COMMON.
* spapr supports only TYPE_SPAPR_PCI_HOST_BRIDGE.

This patch sets user_creatable=true explicitly on those 4 device
classes.

Now, the more complex cases:

pc-q35-*: q35 has no sysbus device whitelist yet (which is a
separate bug). We are in the process of fixing it and building a
sysbus whitelist on q35, but in the meantime we can fix the
"-device help" and "info qdm" bugs mentioned above. Also, despite
not being strictly necessary for fixing the q35 bug, reducing the
list of user_creatable=true devices will help us be more
confident when building the q35 whitelist.

xen: We also have a hack at xen_set_dynamic_sysbus(), that sets
has_dynamic_sysbus=true at runtime when using the Xen
accelerator. This hack is only used to allow xen-backend devices
to be dynamically plugged/unplugged.

This means today we can use -device with the following 22 device
types, that are the ones compiled into the qemu-system-x86_64 and
qemu-system-i386 binaries:

* allwinner-ahci
* amd-iommu
* cfi.pflash01
* esp
* fw_cfg_io
* fw_cfg_mem
* generic-sdhci
* hpet
* intel-iommu
* ioapic
* isabus-bridge
* kvmclock
* kvm-ioapic
* kvmvapic
* SUNW,fdtwo
* sysbus-ahci
* sysbus-fdc
* sysbus-ohci
* unimplemented-device
* virtio-mmio
* xen-backend
* xen-sysdev

This patch adds user_creatable=true explicitly to those devices,
temporarily, just to keep 100% compatibility with existing
behavior of q35. Subsequent patches will remove
user_creatable=true from the devices that are really not meant to
user-creatable on any machine, and remove the FIXME comment from
the ones that are really supposed to be user-creatable. This is
being done in separate patches because we still don't have an
obvious list of devices that will be whitelisted by q35, and I
would like to get each device reviewed individually.

Cc: Alexander Graf 
Cc: Alex Williamson 
Cc: Alistair Francis 
Cc: Beniamino Galvani 
Cc: Christian Borntraeger 
Cc: Cornelia Huck 
Cc: David Gibson 
Cc: "Edgar E. Iglesias" 
Cc: Eduardo Habkost 
Cc: Frank Blaschka 
Cc: Gabriel L. Somlo 
Cc: Gerd Hoffmann 
Cc: Igor Mammedov 
Cc: Jason Wang 
Cc: John Snow 
Cc: Juergen Gross 
Cc: Kevin Wolf 
Cc: Laszlo Ersek 
Cc: Marcel Apfelbaum 
Cc: Markus Armbruster 
Cc: Max Reitz 
Cc: "Michael S. Tsirkin" 
Cc: Paolo Bonzini 
Cc: Peter Maydell 
Cc: Pierre Morel 
Cc: Prasad J Pandit 
Cc: qemu-...@nongnu.org
Cc: qemu-bl...@nongnu.org
Cc: qemu-...@nongnu.org
Cc: Richard Henderson 
Cc: Rob Herring 
Cc: Shannon Zhao 
Cc: sstabell...@kernel.org
Cc: Thomas Huth 
Cc: Yi Min Zhao 
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* Rewrite commit message: don't pretend we are actually fixing
  the q35 issue. We're just fixing "info qdm" and "-device help".
  Making it easier to fix q35 is just a nice side-effect.
* Rewrite FIXME comments to make it clear that we just have
  user_creatable=true because we don't know yet if the device
  should be in the q35 whitelist
---
 hw/block/fdc.c   | 10 ++
 hw/block/pflash_cfi01.c  |  5 +
 hw/core/sysbus.c | 11 +++
 hw/i386/amd_iommu.c  |  5 +
 hw/i386/intel_iommu.c|  5 +
 hw/i386/kvm/clock.c  |  5 +
 hw/i386/kvm/ioapic.c |  5 +
 hw/i386/kvmvapic.c   |  5 +
 hw/ide/ahci.c| 10 ++
 hw/intc/ioapic.c |  5 +
 hw/isa/isa-bus.c |  5 +
 hw/misc/unimp.c  |  5 +
 hw/net/fsl_etsec/etsec.c |  2 ++
 hw/nvram/fw_cfg.c| 10 ++
 hw/ppc/spapr_pci.c   |  2 ++
 hw/scsi/esp.c|  5 +
 hw/sd/sdhci.c|  5 +
 hw/timer/hpet.c  |  5 +
 hw/usb/hcd-ohci.c|  5 +
 hw/vfio/amd-xgbe.c   |  2 ++
 hw/vfio/calxeda-xgmac.c  |  2 ++
 hw/virtio/virtio-mmio.c  |  5 +
 hw/xen/xen_backend.c | 10 ++
 23 files changed, 129 insertions(+)

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index a328693d15..3d05565628 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -2880,6 +2880,11 @@ static void sysbus_fdc_class_init(ObjectClass *klass, 
void *data)
 
 dc->props = sysbus_fdc_properties;
 set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
+/*
+ * FIXME: Set only because we are not sure yet if this device
+ * will be outside the q35 sysbus whitelist.
+ */
+dc->user_creatable = true;
 }

[Qemu-devel] [PATCH v2 17/21] hpet: Remove user_creatable flag

2017-04-04 Thread Eduardo Habkost
hpet needs to be mapped and wired by the board code and won't
work with -device. Remove the user_creatable flag from the device
class.

Cc: "Michael S. Tsirkin" 
Cc: Paolo Bonzini 
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* (none)
---
 hw/timer/hpet.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/hw/timer/hpet.c b/hw/timer/hpet.c
index 4dcbd5bb3d..a2c18b30c3 100644
--- a/hw/timer/hpet.c
+++ b/hw/timer/hpet.c
@@ -771,11 +771,6 @@ static void hpet_device_class_init(ObjectClass *klass, 
void *data)
 dc->reset = hpet_reset;
 dc->vmsd = &vmstate_hpet;
 dc->props = hpet_device_properties;
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
-dc->user_creatable = true;
 }
 
 static const TypeInfo hpet_device_info = {
-- 
2.11.0.259.g40922b1




[Qemu-devel] [PATCH v2 05/21] fdc: Remove user_creatable flag from sysbus-fdc & SUNW, fdtwo

2017-04-04 Thread Eduardo Habkost
sysbus-fdc and SUNW,fdtwo devices need IRQs to be wired and mmio
to be mapped, and can't be used with -device. Unset
user_creatable on their device classes.

Cc: John Snow 
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: qemu-bl...@nongnu.org
Cc: Thomas Huth 
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* Commit message rewrite only
---
 hw/block/fdc.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index 3d05565628..a328693d15 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -2880,11 +2880,6 @@ static void sysbus_fdc_class_init(ObjectClass *klass, 
void *data)
 
 dc->props = sysbus_fdc_properties;
 set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
-dc->user_creatable = true;
 }
 
 static const TypeInfo sysbus_fdc_info = {
@@ -2911,11 +2906,6 @@ static void sun4m_fdc_class_init(ObjectClass *klass, 
void *data)
 
 dc->props = sun4m_fdc_properties;
 set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
-dc->user_creatable = true;
 }
 
 static const TypeInfo sun4m_fdc_info = {
-- 
2.11.0.259.g40922b1




[Qemu-devel] [PATCH v2 13/21] unimplemented-device: Remove user_creatable flag

2017-04-04 Thread Eduardo Habkost
unimplemented-device needs to be created and mapped using
create_unimplemented_device() (or equivalent code), and won't
work with -device. Remove the user_creatable flag from the device
class.

Cc: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* Commit message rewrite only
---
 hw/misc/unimp.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/hw/misc/unimp.c b/hw/misc/unimp.c
index e446c1d652..bcbb585888 100644
--- a/hw/misc/unimp.c
+++ b/hw/misc/unimp.c
@@ -90,11 +90,6 @@ static void unimp_class_init(ObjectClass *klass, void *data)
 
 dc->realize = unimp_realize;
 dc->props = unimp_properties;
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
-dc->user_creatable = true;
 }
 
 static const TypeInfo unimp_info = {
-- 
2.11.0.259.g40922b1




[Qemu-devel] [PATCH v2 04/21] iommu: Remove FIXME comment about user_creatable=true

2017-04-04 Thread Eduardo Habkost
amd-iommu and intel-iommu are really meant to be used with
-device, so they need user_creatable=true. Remove the FIXME
comment.

Cc: Marcel Apfelbaum 
Cc: "Michael S. Tsirkin" 
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* (none)
---
 hw/i386/amd_iommu.c   | 5 +
 hw/i386/intel_iommu.c | 5 +
 2 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 7b92c8c15a..efcc93cbfd 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -1186,10 +1186,7 @@ static void amdvi_class_init(ObjectClass *klass, void* 
data)
 dc->vmsd = &vmstate_amdvi;
 dc->hotpluggable = false;
 dc_class->realize = amdvi_realize;
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
+/* Supported by the pc-q35-* machine types */
 dc->user_creatable = true;
 }
 
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index d71d52c328..b14db306ef 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -2601,10 +2601,7 @@ static void vtd_class_init(ObjectClass *klass, void 
*data)
 dc->hotpluggable = false;
 x86_class->realize = vtd_realize;
 x86_class->int_remap = vtd_int_remap;
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
+/* Supported by the pc-q35-* machine types */
 dc->user_creatable = true;
 }
 
-- 
2.11.0.259.g40922b1




[Qemu-devel] [PATCH v2 18/21] sysbus-ohci: Remove user_creatable flag

2017-04-04 Thread Eduardo Habkost
sysbus-ohci needs to be mapped and wired by device or board code,
and won't work with -device. Remove the user_creatable flag from
the device class.

Cc: Gerd Hoffmann 
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* Commit message rewrite only
---
 hw/usb/hcd-ohci.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
index 18b31022a7..3ada35e954 100644
--- a/hw/usb/hcd-ohci.c
+++ b/hw/usb/hcd-ohci.c
@@ -2159,11 +2159,6 @@ static void ohci_sysbus_class_init(ObjectClass *klass, 
void *data)
 dc->desc = "OHCI USB Controller";
 dc->props = ohci_sysbus_properties;
 dc->reset = usb_ohci_reset_sysbus;
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
-dc->user_creatable = true;
 }
 
 static const TypeInfo ohci_sysbus_info = {
-- 
2.11.0.259.g40922b1




[Qemu-devel] [PATCH v2 10/21] sysbus-ahci: Remove user_creatable flag

2017-04-04 Thread Eduardo Habkost
The sysbus-ahci devices are supposed to be created and wired by
code from other devices, like calxeda_init() and
xlnx_zynqmp_realize(), and won't work with -device. Remove the
user_creatable flag from the device class.

Cc: John Snow 
Cc: qemu-bl...@nongnu.org
Cc: Rob Herring 
Cc: Peter Maydell 
Cc: Alistair Francis 
Cc: "Edgar E. Iglesias" 
Reviewed-by: Alistair Francis 
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* (none)
---
 hw/ide/ahci.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 7f10cda354..2ea1a282ef 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -1721,11 +1721,6 @@ static void sysbus_ahci_class_init(ObjectClass *klass, 
void *data)
 dc->props = sysbus_ahci_properties;
 dc->reset = sysbus_ahci_reset;
 set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
-dc->user_creatable = true;
 }
 
 static const TypeInfo sysbus_ahci_info = {
-- 
2.11.0.259.g40922b1




[Qemu-devel] [PATCH v2 03/21] xen-backend: Remove FIXME comment about user_creatable flag

2017-04-04 Thread Eduardo Habkost
xen-backend can be plugged/unplugged dynamically when using the
Xen accelerator, so keep the user_creatable flag on the device
class and remove the FIXME comment.

Cc: Juergen Gross ,
Cc: Peter Maydell ,
Cc: Thomas Huth 
Cc: sstabell...@kernel.org
Cc: Markus Armbruster ,
Cc: Marcel Apfelbaum ,
Cc: Laszlo Ersek 
Signed-off-by: Eduardo Habkost 
---
Changes series v1 -> v2:
* (New patch added to series)
---
 hw/xen/xen_backend.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/hw/xen/xen_backend.c b/hw/xen/xen_backend.c
index b39dce4cab..56df27992f 100644
--- a/hw/xen/xen_backend.c
+++ b/hw/xen/xen_backend.c
@@ -618,10 +618,7 @@ static void xendev_class_init(ObjectClass *klass, void 
*data)
 
 dc->props = xendev_properties;
 set_bit(DEVICE_CATEGORY_MISC, dc->categories);
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
+/* xen-backend devices can be plugged/unplugged dynamically */
 dc->user_creatable = true;
 }
 
-- 
2.11.0.259.g40922b1




[Qemu-devel] [PATCH v2 15/21] esp: Remove user_creatable flag

2017-04-04 Thread Eduardo Habkost
esp devices aren't going to work with -device, as they need IRQs
to be connected and mmio to be mapped (this is done by
esp_init()). Remove the user_creatable flag from the device
class.

Cc: Paolo Bonzini 
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* Commit message rewrite only
---
 hw/scsi/esp.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index 7bdc1e1b99..eee831efeb 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -728,11 +728,6 @@ static void sysbus_esp_class_init(ObjectClass *klass, void 
*data)
 dc->reset = sysbus_esp_hard_reset;
 dc->vmsd = &vmstate_sysbus_esp_scsi;
 set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
-dc->user_creatable = true;
 }
 
 static const TypeInfo sysbus_esp_info = {
-- 
2.11.0.259.g40922b1




[Qemu-devel] [PATCH v2 14/21] fw_cfg: Remove user_creatable flag

2017-04-04 Thread Eduardo Habkost
fw_cfg won't work with -device, as:
* fw_cfg_init1() won't get called for the device;
* The device won't appear at /machine/fw_cfg, and won't work with
  the -fw_cfg command-line option.

Remove the user_creatable flag from the device class.

Cc: "Michael S. Tsirkin" 
Cc: Laszlo Ersek 
Cc: Gabriel L. Somlo 
Reviewed-by: Laszlo Ersek 
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* (none)
---
 hw/nvram/fw_cfg.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index 7993aea792..316fca9bc1 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -1101,11 +1101,6 @@ static void fw_cfg_io_class_init(ObjectClass *klass, 
void *data)
 
 dc->realize = fw_cfg_io_realize;
 dc->props = fw_cfg_io_properties;
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
-dc->user_creatable = true;
 }
 
 static const TypeInfo fw_cfg_io_info = {
@@ -1172,11 +1167,6 @@ static void fw_cfg_mem_class_init(ObjectClass *klass, 
void *data)
 
 dc->realize = fw_cfg_mem_realize;
 dc->props = fw_cfg_mem_properties;
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
-dc->user_creatable = true;
 }
 
 static const TypeInfo fw_cfg_mem_info = {
-- 
2.11.0.259.g40922b1




[Qemu-devel] [PATCH v2 12/21] isabus-bridge: Remove user_creatable flag

2017-04-04 Thread Eduardo Habkost
isabus-bridge needs to be created by isa_bus_new(), and won't
work with -device, as it won't create the TYPE_ISA_BUS bus
itself. Remove the user_creatable flag from the device class.

Cc: Marcel Apfelbaum 
Cc: "Michael S. Tsirkin" 
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* (none)
---
 hw/isa/isa-bus.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/hw/isa/isa-bus.c b/hw/isa/isa-bus.c
index ad4ac3b4f6..348e0eab9d 100644
--- a/hw/isa/isa-bus.c
+++ b/hw/isa/isa-bus.c
@@ -221,11 +221,6 @@ static void isabus_bridge_class_init(ObjectClass *klass, 
void *data)
 
 set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
 dc->fw_name = "isa";
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
-dc->user_creatable = true;
 }
 
 static const TypeInfo isabus_bridge_info = {
-- 
2.11.0.259.g40922b1




[Qemu-devel] [PATCH v2 01/21] qdev: Replace cannot_instantiate_with_device_add_yet with !user_creatable

2017-04-04 Thread Eduardo Habkost
cannot_instantiate_with_device_add_yet was introduced by commit
837d37167dc446af8a91189108b363c04609e296 to replace no_user. It
was supposed to be a temporary measure.

When it was introduced, we had 54
cannot_instantiate_with_device_add_yet=true lines in the code.
Today (3 years later) this number has not shrinked: we now have
57 cannot_instantiate_with_device_add_yet=true lines. I think it
is safe to say it is not a temporary measure, and we won't see
the flag go away soon.

Instead of a long field name that misleads people to believe it
is temporary, replace it a shorter and less misleading field:
user_creatable.

Except for code comments, changes were generated using the
following Coccinelle patch:

  @@
  expression DC;
  @@
  (
  -DC->cannot_instantiate_with_device_add_yet = false;
  +DC->user_creatable = true;
  |
  -DC->cannot_instantiate_with_device_add_yet = true;
  +DC->user_creatable = false;
  )

  @@
  typedef ObjectClass;
  expression dc;
  identifier class, data;
  @@
   static void device_class_init(ObjectClass *class, void *data)
   {
   ...
   dc->hotpluggable = true;
  +dc->user_creatable = true;
   ...
   }

  @@
  @@
   struct DeviceClass {
   ...
  -bool cannot_instantiate_with_device_add_yet;
  +bool user_creatable;
   ...
  }

  @@
  expression DC;
  @@
  (
  -!DC->cannot_instantiate_with_device_add_yet
  +DC->user_creatable
  |
  -DC->cannot_instantiate_with_device_add_yet
  +!DC->user_creatable
  )

Cc: Markus Armbruster 
Cc: Laszlo Ersek 
Cc: Marcel Apfelbaum 
Cc: Peter Maydell 
Cc: Thomas Huth 
Acked-by: Alistair Francis 
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* (none)
---
 include/hw/qdev-core.h  | 10 +-
 include/hw/qdev-properties.h|  4 ++--
 hw/acpi/piix4.c |  2 +-
 hw/arm/spitz.c  |  2 +-
 hw/audio/marvell_88w8618.c  |  2 +-
 hw/audio/pcspk.c|  2 +-
 hw/core/or-irq.c|  2 +-
 hw/core/qdev.c  |  1 +
 hw/core/register.c  |  2 +-
 hw/dma/i8257.c  |  2 +-
 hw/dma/sparc32_dma.c|  2 +-
 hw/gpio/omap_gpio.c |  4 ++--
 hw/i2c/omap_i2c.c   |  2 +-
 hw/i2c/smbus_eeprom.c   |  2 +-
 hw/i2c/smbus_ich9.c |  2 +-
 hw/i386/pc.c|  2 +-
 hw/input/vmmouse.c  |  2 +-
 hw/intc/apic_common.c   |  2 +-
 hw/intc/etraxfs_pic.c   |  2 +-
 hw/intc/grlib_irqmp.c   |  2 +-
 hw/intc/i8259_common.c  |  2 +-
 hw/intc/nios2_iic.c |  2 +-
 hw/intc/omap_intc.c |  4 ++--
 hw/isa/lpc_ich9.c   |  2 +-
 hw/isa/piix4.c  |  2 +-
 hw/isa/vt82c686.c   |  2 +-
 hw/mips/gt64xxx_pci.c   |  2 +-
 hw/misc/vmport.c|  2 +-
 hw/net/dp8393x.c|  2 +-
 hw/net/etraxfs_eth.c|  2 +-
 hw/net/lance.c  |  2 +-
 hw/pci-bridge/dec.c |  2 +-
 hw/pci-bridge/pci_expander_bridge.c |  2 +-
 hw/pci-host/apb.c   |  2 +-
 hw/pci-host/bonito.c|  2 +-
 hw/pci-host/gpex.c  |  2 +-
 hw/pci-host/grackle.c   |  2 +-
 hw/pci-host/piix.c  |  6 +++---
 hw/pci-host/ppce500.c   |  2 +-
 hw/pci-host/prep.c  |  2 +-
 hw/pci-host/q35.c   |  4 ++--
 hw/pci-host/uninorth.c  |  8 
 hw/pci-host/versatile.c |  2 +-
 hw/pci-host/xilinx-pcie.c   |  2 +-
 hw/ppc/ppc4xx_pci.c |  2 +-
 hw/ppc/spapr_drc.c  |  2 +-
 hw/s390x/s390-pci-bus.c |  2 +-
 hw/sd/milkymist-memcard.c   |  2 +-
 hw/sd/pl181.c   |  2 +-
 hw/sh4/sh_pci.c |  2 +-
 hw/timer/i8254_common.c |  2 +-
 hw/timer/mc146818rtc.c  |  2 +-
 monitor.c   |  2 +-
 qdev-monitor.c  |  6 +++---
 qom/cpu.c   |  2 +-
 target/i386/cpu.c   |  2 +-
 56 files changed, 71 insertions(+), 70 deletions(-)

diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index b44b476765..bc4e6f0486 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -103,16 +103,16 @@ typedef struct DeviceClass {
 Property *props;
 
 /*
- * Shall we hide this device model from -device / device_add?
+ * Can this device be instantiated with -device / device_add?
  * All devices should support instantiation with device_add, and
  * this flag should not exist.  But we're not there, yet.  Some
  * devices fail to instantiate with cryptic error messages.
  * Others instantiate, but don't work.  Exposing users to such
- * behavior would be cruel; this flag serves to protect them.  It
- * should never be set without a

[Qemu-devel] [PATCH v2 08/21] ioapic: Remove user_creatable flag

2017-04-04 Thread Eduardo Habkost
An ioapic device is already created by the q35 initialization
code, and using "-device ioapic" or "-device kvm-ioapic" will
always fail with "Only 1 ioapics allowed". Remove the
user_creatable flag from the ioapic device classes.

Cc: Igor Mammedov 
Cc: "Michael S. Tsirkin" 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* (none)
---
 hw/i386/kvm/ioapic.c | 5 -
 hw/intc/ioapic.c | 5 -
 2 files changed, 10 deletions(-)

diff --git a/hw/i386/kvm/ioapic.c b/hw/i386/kvm/ioapic.c
index 348c405180..98ca480792 100644
--- a/hw/i386/kvm/ioapic.c
+++ b/hw/i386/kvm/ioapic.c
@@ -167,11 +167,6 @@ static void kvm_ioapic_class_init(ObjectClass *klass, void 
*data)
 k->post_load = kvm_ioapic_put;
 dc->reset= kvm_ioapic_reset;
 dc->props= kvm_ioapic_properties;
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
-dc->user_creatable = true;
 }
 
 static const TypeInfo kvm_ioapic_info = {
diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
index f9e4f77def..37c4386ae3 100644
--- a/hw/intc/ioapic.c
+++ b/hw/intc/ioapic.c
@@ -448,11 +448,6 @@ static void ioapic_class_init(ObjectClass *klass, void 
*data)
 k->post_load = ioapic_update_kvm_routes;
 dc->reset = ioapic_reset_common;
 dc->props = ioapic_properties;
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
-dc->user_creatable = true;
 }
 
 static const TypeInfo ioapic_info = {
-- 
2.11.0.259.g40922b1




[Qemu-devel] [PATCH v2 09/21] kvmvapic: Remove user_creatable flag

2017-04-04 Thread Eduardo Habkost
The kvmvapic device is only usable when created by
apic_common_realize(), not using -device. Remove the
user_creatable flag from the device class.

Cc: Igor Mammedov 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: "Michael S. Tsirkin" 
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* Commit message rewrite only
---
 hw/i386/kvmvapic.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
index 45f6267c93..82a49556af 100644
--- a/hw/i386/kvmvapic.c
+++ b/hw/i386/kvmvapic.c
@@ -856,11 +856,6 @@ static void vapic_class_init(ObjectClass *klass, void 
*data)
 dc->reset   = vapic_reset;
 dc->vmsd= &vmstate_vapic;
 dc->realize = vapic_realize;
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
-dc->user_creatable = true;
 }
 
 static const TypeInfo vapic_type = {
-- 
2.11.0.259.g40922b1




[Qemu-devel] [PATCH v2 07/21] kvmclock: Remove user_creatable flag

2017-04-04 Thread Eduardo Habkost
kvmclock should be used by guests only when the appropriate CPUID
feature flags are set on the VCPU, and it is automatically
created by kvmclock_create() when those feature flags are set.
This means creating a kvmclock device using -device is useless.
Remove user_creatable from its device class.

Cc: "Michael S. Tsirkin" 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Signed-off-by: Eduardo Habkost 
---
Changes v1 -> v2:
* (none)
---
 hw/i386/kvm/clock.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c
index 7665bef999..13eca374cd 100644
--- a/hw/i386/kvm/clock.c
+++ b/hw/i386/kvm/clock.c
@@ -286,11 +286,6 @@ static void kvmclock_class_init(ObjectClass *klass, void 
*data)
 dc->realize = kvmclock_realize;
 dc->vmsd = &kvmclock_vmsd;
 dc->props = kvmclock_properties;
-/*
- * FIXME: Set only because we are not sure yet if this device
- * will be outside the q35 sysbus whitelist.
- */
-dc->user_creatable = true;
 }
 
 static const TypeInfo kvmclock_info = {
-- 
2.11.0.259.g40922b1




[Qemu-devel] [PATCH v2 00/21] qdev/sysbus: Set user_creatable=false by default on sysbus

2017-04-04 Thread Eduardo Habkost
Changes v1 -> v2


* Rewrote series name and cover letter completely to not pretend
  we're fixing the q35 lack-of-sysbus-whitelist bug, and explain
  the motivation for the series.
  * Previous series name was:
"sysbus: Don't allow -device/device_add by default"
  * Rewrote description of patch 02/21, too
  * (I really hope people read this cover letter before
commenting on individual patches.)
* Rewrote FIXME comments to make it clear that we just set
  user_creatable=true temporarily because we don't know yet if
  the device should be in the q35 whitelist.
* Set user_creatable=true on xen-backend also
  (I didn't notice it was missing because I was building QEMU
  without xen support)
  * New patches:
* "xen-backend: Remove FIXME comment about user_creatable flag"
* "xen-sysdev: Remove user_creatable flag"
* Patch:
"s390: Add FIXME for unexplained user_creatable=false line"
  replaced with:
"s390-pcibus: No need to set user_creatable=false explicitly"

Description
---

This series refactor the cannot_instantiate_with_device_add code
for sysbus. First, the cannot_instantiate_with_device_add field
is replaced by !user_creatable.

Then, we change TYPE_SYS_BUS_DEVICE to set user_creatable=false
by default, and we set user_creatable=true explicitly only on the
devices that are really supposed to be user-creatable on some
machines.

Motivation
--

First of all, this makes the code less fragile: setting
user_creatable=false or cannot_instantiate_with_device_add=true
on all sysbus devices is incorrect, and makes code that looks at
cannot_instantiate_with_device_add/user_creatable easy to break.

This also fixes a regression introduced by commit
33cd52b5d7b9adfd009e95f07e6c64dd88ae2a31, that makes all sysbus
devices appear on "-device help" and lack the "no-user" flag on
"info qdm"[1].

This will also make it possible for automatic test code (like the
device-crash-test.py script I sent a while ago[2]) skip devices
that are not supposed to be user-creatable on any machine.

A note about the lack of sysbus whitelist on q35


This series won't make the per-machine whitelist of sysbus
devices unnecessary, but just makes the user_creatable field
consistent on the sys-bus-device classes. This means q35 and xen
still need to be fixed to implemented a sysbus device whitelist.

However, despite not being strictly necessary for fixing the q35
bug, reducing the list of user_creatable=true devices will help
us be more confident when building the q35 whitelist.

Full list of user_creatable=true sysbus devices
---

In the end of this series, the only remaining sysbus devices with
user_creatable=true will be:

* vfio-amd-xgbe (arm)
* vfio-calxeda-xgmac (arm)
* amd-iommu (x86)
* intel-iommu (x86)
* xen-backend (x86)
* spapr-pci-host-bridge (ppc)
* spapr-pci-vfio-host-bridge (ppc)
* eTSEC (ppc)

References/Notes


[1] For example, before this series, we had 174 sysbus devices
listed on qemu-system-arm -device help:
  $ qemu-system-arm -machine none -device help 2>&1 | grep 'bus System' | 
wc -l
  174
  $
after this series, we now have:
  $ ./arm-softmmu/qemu-system-arm -machine none -device help 2>&1 | grep 
'bus System'
  name "vfio-amd-xgbe", bus System, desc "VFIO AMD XGBE"
  name "vfio-calxeda-xgmac", bus System, desc "VFIO Calxeda XGMAC"
  $

[2] Subject: [PATCH 0/3] script for crash-testing -device

---
Cc: Alexander Graf 
Cc: Laszlo Ersek 
Cc: Markus Armbruster 
Cc: Marcel Apfelbaum 
Cc: Thomas Huth 
Cc: Peter Maydell 

Eduardo Habkost (21):
  qdev: Replace cannot_instantiate_with_device_add_yet with
!user_creatable
  sysbus: Set user_creatable=false by default on TYPE_SYS_BUS_DEVICE
  xen-backend: Remove FIXME comment about user_creatable flag
  iommu: Remove FIXME comment about user_creatable=true
  fdc: Remove user_creatable flag from sysbus-fdc & SUNW,fdtwo
  pflash_cfi01: Remove user_creatable flag
  kvmclock: Remove user_creatable flag
  ioapic: Remove user_creatable flag
  kvmvapic: Remove user_creatable flag
  sysbus-ahci: Remove user_creatable flag
  allwinner-ahci: Remove user_creatable flag
  isabus-bridge: Remove user_creatable flag
  unimplemented-device: Remove user_creatable flag
  fw_cfg: Remove user_creatable flag
  esp: Remove user_creatable flag
  generic-sdhci: Remove user_creatable flag
  hpet: Remove user_creatable flag
  sysbus-ohci: Remove user_creatable flag
  virtio-mmio: Remove user_creatable flag
  xen-sysdev: Remove user_creatable flag
  s390-pcibus: No need to set user_creatable=false explicitly

 include/hw/qdev-core.h  | 10 +-
 include/hw/qdev-properties.h|  4 ++--
 hw/acpi/piix4.c |  2 +-
 hw/arm/spitz.c  |  2 +-
 hw/audio/marvell_88w8618.c  |  2 +-
 hw/audio/pcspk.c|  2 +-
 hw/core/or-irq.c

Re: [Qemu-devel] [RFC 19/19] virtio-mmio: Remove user_creatable flag

2017-04-04 Thread Eduardo Habkost
On Mon, Apr 03, 2017 at 05:50:13PM +0200, Laszlo Ersek wrote:
> On 04/01/17 02:46, Eduardo Habkost wrote:
> > virtio-mmio needs to be wired and mapped by other device or board
> > code, and won't work with -device. Remove the user_creatable flag
> > from the device class.
> > 
> > Cc: Peter Maydell 
> > Cc: Shannon Zhao 
> > Cc: "Michael S. Tsirkin" 
> > Signed-off-by: Eduardo Habkost 
> > ---
> >  hw/virtio/virtio-mmio.c | 5 -
> >  1 file changed, 5 deletions(-)
> > 
> > diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
> > index b595512a3d..5807aa87fe 100644
> > --- a/hw/virtio/virtio-mmio.c
> > +++ b/hw/virtio/virtio-mmio.c
> > @@ -450,11 +450,6 @@ static void virtio_mmio_class_init(ObjectClass *klass, 
> > void *data)
> >  dc->reset = virtio_mmio_reset;
> >  set_bit(DEVICE_CATEGORY_MISC, dc->categories);
> >  dc->props = virtio_mmio_properties;
> > -/*
> > - * FIXME: Set only for compatibility on q35 machine-type.
> > - * Probably never meant to be user-creatable
> > - */
> > -dc->user_creatable = true;
> >  }
> >  
> >  static const TypeInfo virtio_mmio_info = {
> > 
> 
> Possible addition to the commit message: a reference to commit
> 587078f0ed63 ("hw/arm/virt: explain device-to-transport mapping in
> create_virtio_devices()", 2015-02-05).

I'm confused by the comments there: it says "Unfortunately, we
can't counteract the kernel change by reversing the loop; it
would break existing command lines". But, which comment-lines it
would break, if "-device virtio-mmio" was never supported?

> 
> That's another example showing that board code has to participate in
> virtio-mmio transport placement.
> 
> Reviewed-by: Laszlo Ersek 

Thanks!

-- 
Eduardo



Re: [Qemu-devel] [PATCH v2 0/3] Update AppleSMC for OS X Sierra 10.12.4 guests

2017-04-04 Thread Alexander Graf

On 04/04/2017 07:01 PM, Gabriel L. Somlo wrote:

As of 10.12.4 (currently the latest Sierra update), OS X refuses to boot
unless the AppleSMC supports a third I/O port, which provides the current
error status when read.


Looks much nicer after this series :). Thanks a lot!

Reviewed-by: Alexander Graf 


Alex




Re: [Qemu-devel] [PATCH v3 0/1] slirp: add SOCKS5 support

2017-04-04 Thread no-reply
Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [PATCH v3 0/1] slirp: add SOCKS5 support
Type: series
Message-id: 20170403235636.5647-1-laur...@vivier.eu

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

# Useful git options
git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
50a5735 slirp: add SOCKS5 support

=== OUTPUT BEGIN ===
Checking PATCH 1/1: slirp: add SOCKS5 support...
ERROR: suspect code indent for conditional statements (4, 6)
#741: FILE: slirp/tcp_subr.c:428:
+if (so->so_proxy_state) {
+  ret = socks5_connect(s, slirp->proxy_server, slirp->proxy_port,

total: 1 errors, 0 warnings, 687 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

Re: [Qemu-devel] [RFC PATCH 0/2] Calcuate downtime for postcopy live migration

2017-04-04 Thread Dr. David Alan Gilbert
* Alexey Perevalov (a.pereva...@samsung.com) wrote:
> Hi David,
> 
> I already asked you about downtime calculation for postcopy live migration.
> As I remember you said it's worth not to calculate it per vCPU or maybe I
> understood you incorrectly. I decided to proof it could be useful.

Thanks - apologies for taking so long to look at it.
Some higher level thoughts:
   a) It needs to be switchable - the tree etc look like they could use a fair
  amount of RAM.
   b) The cpu bitmask is a problem given we can have more than 64 CPUs
   c) Tracing the pages that took the longest can be interesting - I've done
  graphs of latencies before - you get fun things like watching messes
  where you lose requests and the page eventually arrives anyway after
  a few seconds.
  
> This patch set is based on commit 272d7dee5951f926fad1911f2f072e5915cdcba0
> of QEMU master branch. It requires commit into Andreas git repository
> "userfaultfd: provide pid in userfault uffd_msg"
> 
> When I tested it I found following moments are strange:
> 1. First userfault always occurs due to access to ram in 
> vapic_map_rom_writable,
> all vCPU are sleeping in this time

That's probably not too surprising - I bet the vapic device load code does that?
I've sometimes wondered about preloading the queue on the source with some that 
we know
will need to be loaded early.

> 2. Latest half of all userfault was initiated by kworkers, that's why I had a 
> doubt
> about current in handle_userfault inside kernel as a proper task_struct for 
> pagefault
> initiator. All vCPU was sleeping at that moment.

When you say kworkers - which ones?  I wonder what they are - perhaps incoming 
network
packets using vhost?

> 3. Also there is a discrepancy, of vCPU state and real vCPU thread state.

What do you mean by that?

> This patch is just for showing and idea, if you ok with this idea none RFC 
> patch will not
> include proc access && a lot of traces.
> Also I think it worth to guard postcopy_downtime in MigrationIncomingState and
> return calculated downtime into src, where qeury-migration will be invocked.

I don't think it's worth it, we can always ask the destination and sending stuff
back to the source is probably messy - especially at the end.

Dave

> Alexey Perevalov (2):
>   userfault: add pid into uffd_msg
>   migration: calculate downtime on dst side
> 
>  include/migration/migration.h |  11 ++
>  linux-headers/linux/userfaultfd.h |   1 +
>  migration/migration.c | 238 
> +-
>  migration/postcopy-ram.c  |  61 +-
>  migration/savevm.c|   2 +
>  migration/trace-events|  10 +-
>  6 files changed, 319 insertions(+), 4 deletions(-)
> 
> -- 
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH 2/2] migration: calculate downtime on dst side

2017-04-04 Thread Dr. David Alan Gilbert
* Alexey Perevalov (a.pereva...@samsung.com) wrote:
> This patch provides downtime calculation per vCPU,
> as a summary and as a overlapped value for all vCPUs.
> 
> This approach just keeps tree with page fault addr as a key,
> and t1-t2 interval of pagefault time and page copy time, with
> affected vCPU bit mask.
> For more implementation details please see comment to
> get_postcopy_total_downtime function.
> 
> Signed-off-by: Alexey Perevalov 
> ---
>  include/migration/migration.h |  11 ++
>  migration/migration.c | 238 
> +-
>  migration/postcopy-ram.c  |  61 ++-
>  migration/savevm.c|   2 +
>  migration/trace-events|  10 +-
>  5 files changed, 318 insertions(+), 4 deletions(-)
> 
> diff --git a/include/migration/migration.h b/include/migration/migration.h
> index 5720c88..8f9af77 100644
> --- a/include/migration/migration.h
> +++ b/include/migration/migration.h
> @@ -123,10 +123,21 @@ struct MigrationIncomingState {
>  
>  /* See savevm.c */
>  LoadStateEntry_Head loadvm_handlers;
> +
> +/*
> + *  Tree for keeping postcopy downtime,
> + *  necessary to calculate correct downtime, during multiple
> + *  vm suspends, it keeps host page address as a key and
> + *  DowntimeDuration as a data
> + */
> +GTree *postcopy_downtime;
>  };
>  
>  MigrationIncomingState *migration_incoming_get_current(void);
>  void migration_incoming_state_destroy(void);
> +void mark_postcopy_downtime_begin(uint64_t addr, int cpu);
> +void mark_postcopy_downtime_end(uint64_t addr);
> +int64_t get_postcopy_total_downtime(void);
>  
>  /*
>   * An outstanding page request, on the source, having been received
> diff --git a/migration/migration.c b/migration/migration.c
> index 54060f7..57d71e1 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -77,6 +77,12 @@ static NotifierList migration_state_notifiers =
>  
>  static bool deferred_incoming;
>  
> +typedef struct {
> +int64_t begin;
> +int64_t end;
> +uint64_t cpus;
> +} DowntimeDuration;
> +
>  /*
>   * Current state of incoming postcopy; note this is not part of
>   * MigrationIncomingState since it's state is used during cleanup
> @@ -117,6 +123,21 @@ MigrationState *migrate_get_current(void)
>  return ¤t_migration;
>  }
>  
> +static gint addr_compare(gconstpointer a, gconstpointer b,
> + gpointer user_data G_GNUC_UNUSED)
> +{
> +if (a == b)
> +return 0;
> +else if (a > b)
> +return 1;
> +return -1;
> +}

Weird that glib doesn't have that builtin?

> +static void destroy_downtime_duration(gpointer data)
> +{
> +free(data);

   g_free

> +}
> +
>  MigrationIncomingState *migration_incoming_get_current(void)
>  {
>  static bool once;
> @@ -128,6 +149,9 @@ MigrationIncomingState 
> *migration_incoming_get_current(void)
>  QLIST_INIT(&mis_current.loadvm_handlers);
>  qemu_mutex_init(&mis_current.rp_mutex);
>  qemu_event_init(&mis_current.main_thread_load_event, false);
> +mis_current.postcopy_downtime = g_tree_new_full(addr_compare,
> + NULL, NULL,
> + destroy_downtime_duration);
>  once = true;
>  }
>  return &mis_current;
> @@ -138,10 +162,13 @@ void migration_incoming_state_destroy(void)
>  struct MigrationIncomingState *mis = migration_incoming_get_current();
>  
>  qemu_event_destroy(&mis->main_thread_load_event);
> +if (mis->postcopy_downtime) {
> +g_tree_destroy(mis->postcopy_downtime);
> +mis->postcopy_downtime = NULL;
> +}
>  loadvm_free_handlers(mis);
>  }
>  
> -
>  typedef struct {
>  bool optional;
>  uint32_t size;
> @@ -1119,7 +1146,6 @@ MigrationState *migrate_init(const MigrationParams 
> *params)
>  s->last_req_rb = NULL;
>  error_free(s->error);
>  s->error = NULL;
> -
>  migrate_set_state(&s->state, MIGRATION_STATUS_NONE, 
> MIGRATION_STATUS_SETUP);
>  
>  QSIMPLEQ_INIT(&s->src_page_requests);
> @@ -2109,3 +2135,211 @@ PostcopyState postcopy_state_set(PostcopyState 
> new_state)
>  return atomic_xchg(&incoming_postcopy_state, new_state);
>  }
>  
> +void mark_postcopy_downtime_begin(uint64_t addr, int cpu)
> +{
> +MigrationIncomingState *mis = migration_incoming_get_current();
> +DowntimeDuration *dd;
> +if (!mis || !mis->postcopy_downtime) {
> +error_report("Migration incoming state should exists mis %p", mis);
> +return;
> +}
> +
> +dd = g_tree_lookup(mis->postcopy_downtime, (gpointer)addr); /* !!! cast 
> */
> +if (!dd) {
> +dd = (DowntimeDuration *)g_malloc0(sizeof(DowntimeDuration));

g_new0 is quite nice to avoid the sizeof.

> +g_tree_insert(mis->postcopy_downtime, (gpointer)addr, (gpointer)dd);
> +}
> +
> +if (cpu < 0)
> +/* assume in this situation all vCPUs are 

Re: [Qemu-devel] [PATCH 46/51] ram: Remember last_page instead of last_offset

2017-04-04 Thread Juan Quintela
"Dr. David Alan Gilbert"  wrote:
> * Juan Quintela (quint...@redhat.com) wrote:
>> Signed-off-by: Juan Quintela 
>> ---
>>  migration/ram.c | 14 +++---
>>  1 file changed, 7 insertions(+), 7 deletions(-)
>> 
>> diff --git a/migration/ram.c b/migration/ram.c
>> index b1a031e..57b776b 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -171,8 +171,8 @@ struct RAMState {
>>  RAMBlock *last_seen_block;
>>  /* Last block from where we have sent data */
>>  RAMBlock *last_sent_block;
>> -/* Last offset we have sent data from */
>> -ram_addr_t last_offset;
>> +/* Last dirty page we have sent */
>
> Can you make that 'Last dirty target page we have sent' 
> just so we know which shape page we're dealing with.

Done.

>> +ram_addr_t last_page;
>>  /* last ram version we have seen */
>>  uint32_t last_version;
>>  /* We are in the first round */
>> @@ -1063,7 +1063,7 @@ static bool find_dirty_block(RAMState *rs, 
>> PageSearchStatus *pss,
>>  pss->offset = migration_bitmap_find_dirty(rs, pss->block, pss->offset,
>>page);
>>  if (pss->complete_round && pss->block == rs->last_seen_block &&
>> -pss->offset >= rs->last_offset) {
>> +pss->offset >= rs->last_page) {
>
> That's odd; isn't pss->offset still in bytes?

It is not odd, it is wrong.

Fixed.

Thanks, Juan.



Re: [Qemu-devel] [PATCH 45/51] ram: Use page number instead of an address for the bitmap operations

2017-04-04 Thread Juan Quintela
"Dr. David Alan Gilbert"  wrote:
> * Juan Quintela (quint...@redhat.com) wrote:
>> We use an unsigned long for the page number.  Notice that our bitmaps
>> already got that for the index, so we have that limit.
>> 
>> Signed-off-by: Juan Quintela 
>> ---
>>  migration/ram.c | 76 
>> ++---
>>  1 file changed, 34 insertions(+), 42 deletions(-)
>> 
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 6cd77b5..b1a031e 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -611,13 +611,12 @@ static int save_xbzrle_page(RAMState *rs, uint8_t 
>> **current_data,
>>   * @rs: current RAM state
>>   * @rb: RAMBlock where to search for dirty pages
>>   * @start: starting address (typically so we can continue from previous 
>> page)
>> - * @ram_addr_abs: pointer into which to store the address of the dirty page
>> - *within the global ram_addr space
>> + * @page: pointer into where to store the dirty page
>
> I'd prefer if you could call it 'page_abs' - it often gets tricky to know
> whether we're talking about a page offset within a RAMBlock or an
> offset within
> the whole bitmap.

I don't really care.  Changed.

> (I wish we had different index types)

This is C man!!
>> -trace_get_queued_page(block->idstr,
>> -  (uint64_t)offset,
>> -  (uint64_t)*ram_addr_abs);
>> +trace_get_queued_page(block->idstr, (uint64_t)offset,
>> + *page);
>
> I think you need to fix the trace_ definitions for get_queued_page
> and get_queued_page_not_dirty they're currently taking uint64_t's for
> ram_addr and they now need to be long's (with the format changes).

Done.

Thanks, Juan.



Re: [Qemu-devel] [PATCH v11 0/2] block/vxhs.c: Add support for a new block device type called "vxhs"

2017-04-04 Thread Jeff Cody
On Mon, Apr 03, 2017 at 08:48:07PM -0700, Ashish Mittal wrote:
> - Veritas HyperScale block driver in QEMU is designed to provide an 
> accelerated
>   IO path from KVM virtual machines to Veritas HyperScale storage service.
> 
> - A network IO transfer library that translates block IO from HyperScale block
>   driver to a network IO format to send it to Veritas HyperScale storage
>   service. This library (libvxhs) has been open sourced and is available on
>   github here:  https://github.com/VeritasHyperScale/libqnio.git
> 
> Ashish Mittal (2):
>   block/vxhs.c: Add support for a new block device type called "vxhs"
>   block/vxhs.c: Add qemu-iotests for new block device type "vxhs"
> 
>  block/Makefile.objs  |   2 +
>  block/trace-events   |  17 ++
>  block/vxhs.c | 575 
> +++
>  configure|  39 +++
>  qapi/block-core.json |  23 +-
>  tests/qemu-iotests/common|   6 +
>  tests/qemu-iotests/common.config |  13 +
>  tests/qemu-iotests/common.filter |   1 +
>  tests/qemu-iotests/common.rc |  19 ++
>  9 files changed, 693 insertions(+), 2 deletions(-)
>  create mode 100644 block/vxhs.c
> 
> -- 
> 2.5.5
>

With this additional patch (below), this passes all qemu-iotests for qcow2.


>From aeea8e23e28de325c572f8f3f67a1651b62887bd Mon Sep 17 00:00:00 2001
Message-Id: 

From: Jeff Cody 
Date: Tue, 14 Feb 2017 09:51:42 -0500
Subject: [PATCH v10 3/2 1/1] qemu-iotests: exclude vxhs from image creation
 via protocol

The protocol VXHS does not support image creation.  Some tests expect
to be able to create images through the protocol.  Exclude VXHS from
these tests.

Signed-off-by: Jeff Cody 
---
 tests/qemu-iotests/017 | 1 +
 tests/qemu-iotests/020 | 1 +
 tests/qemu-iotests/029 | 1 +
 tests/qemu-iotests/073 | 1 +
 tests/qemu-iotests/114 | 1 +
 tests/qemu-iotests/130 | 1 +
 tests/qemu-iotests/134 | 1 +
 tests/qemu-iotests/156 | 1 +
 tests/qemu-iotests/158 | 1 +
 9 files changed, 9 insertions(+)

diff --git a/tests/qemu-iotests/017 b/tests/qemu-iotests/017
index e3f9e75..4f9302d 100755
--- a/tests/qemu-iotests/017
+++ b/tests/qemu-iotests/017
@@ -41,6 +41,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 # Any format supporting backing files
 _supported_fmt qcow qcow2 vmdk qed
 _supported_proto generic
+_unsupported_proto vxhs
 _supported_os Linux
 _unsupported_imgopts "subformat=monolithicFlat" "subformat=twoGbMaxExtentFlat"
 
diff --git a/tests/qemu-iotests/020 b/tests/qemu-iotests/020
index 9c4a68c..7a0 100755
--- a/tests/qemu-iotests/020
+++ b/tests/qemu-iotests/020
@@ -43,6 +43,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 # Any format supporting backing files
 _supported_fmt qcow qcow2 vmdk qed
 _supported_proto generic
+_unsupported_proto vxhs
 _supported_os Linux
 _unsupported_imgopts "subformat=monolithicFlat" \
  "subformat=twoGbMaxExtentFlat" \
diff --git a/tests/qemu-iotests/029 b/tests/qemu-iotests/029
index e639ac0..30bab24 100755
--- a/tests/qemu-iotests/029
+++ b/tests/qemu-iotests/029
@@ -42,6 +42,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 # Any format supporting intenal snapshots
 _supported_fmt qcow2
 _supported_proto generic
+_unsupported_proto vxhs
 _supported_os Linux
 # Internal snapshots are (currently) impossible with refcount_bits=1
 _unsupported_imgopts 'refcount_bits=1[^0-9]'
diff --git a/tests/qemu-iotests/073 b/tests/qemu-iotests/073
index ad37a61..40f85b1 100755
--- a/tests/qemu-iotests/073
+++ b/tests/qemu-iotests/073
@@ -39,6 +39,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 
 _supported_fmt qcow2
 _supported_proto generic
+_unsupported_proto vxhs
 _supported_os Linux
 
 CLUSTER_SIZE=64k
diff --git a/tests/qemu-iotests/114 b/tests/qemu-iotests/114
index f110d4f..5b7dc54 100755
--- a/tests/qemu-iotests/114
+++ b/tests/qemu-iotests/114
@@ -39,6 +39,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 
 _supported_fmt qcow2
 _supported_proto generic
+_unsupported_proto vxhs
 _supported_os Linux
 
 
diff --git a/tests/qemu-iotests/130 b/tests/qemu-iotests/130
index ecc8a5b..f941fc9 100755
--- a/tests/qemu-iotests/130
+++ b/tests/qemu-iotests/130
@@ -42,6 +42,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 
 _supported_fmt qcow2
 _supported_proto generic
+_unsupported_proto vxhs
 _supported_os Linux
 
 qemu_comm_method="monitor"
diff --git a/tests/qemu-iotests/134 b/tests/qemu-iotests/134
index af618b8..acce946 100755
--- a/tests/qemu-iotests/134
+++ b/tests/qemu-iotests/134
@@ -39,6 +39,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 
 _supported_fmt qcow2
 _supported_proto generic
+_unsupported_proto vxhs
 _supported_os Linux
 
 
diff --git a/tests/qemu-iotests/156 b/tests/qemu-iotests/156
index cc95ff1..78deaff 100755
--- a/tests/qemu-iotests/156
+++ b/tests/qemu-iotests/156
@@ -48,6 +48,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 
 _supported_fmt qcow2 qed
 _supported_proto generic
+_unsupported_proto vxhs
 _supported_os Linux
 
 # 

Re: [Qemu-devel] [PATCH 1/2] userfault: add pid into uffd_msg

2017-04-04 Thread Dr. David Alan Gilbert
* Alexey Perevalov (a.pereva...@samsung.com) wrote:
> Signed-off-by: Alexey Perevalov 
> ---
>  linux-headers/linux/userfaultfd.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/linux-headers/linux/userfaultfd.h 
> b/linux-headers/linux/userfaultfd.h
> index 2ed5dc3..7b299a2 100644
> --- a/linux-headers/linux/userfaultfd.h
> +++ b/linux-headers/linux/userfaultfd.h
> @@ -77,6 +77,7 @@ struct uffd_msg {
>   struct {
>   __u64   flags;
>   __u64   address;
> + pid_t   pid;
>   } pagefault;

Yes, but we'll just rerun the sync of the headers once your change
lands in the kernel.

Dave

>   struct {
> -- 
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH 42/51] ram: Pass RAMBlock to bitmap_sync

2017-04-04 Thread Juan Quintela
"Dr. David Alan Gilbert"  wrote:
> * Juan Quintela (quint...@redhat.com) wrote:
>> "Dr. David Alan Gilbert"  wrote:
>> > * Juan Quintela (quint...@redhat.com) wrote:
>> >> We change the meaning of start to be the offset from the beggining of
>> >> the block.
>> >
>> > s/beggining/beginning/
>> >
>> > Why do this?
>> > We have:
>> >migration_bitmap_sync (all blocks)
>> >migration_bitmap_sync_range - called per block
>> >cpu_physical_memory_sync_dirty_bitmap
>> >
>> > Why keep migration_bitmap_sync_range having start/length as well
>> > as the block
>> > if you could just rename it to migration_bitmap_sync_block and
>> > just give it the rb?
>> > And since cpu_physical_memory_clear_dirty_range is lower level,
>> > why give it
>> > the rb?
>> 
>> I did it on the previous series, then I remembered that I was not going
>> to be able to sync only part of the range, as I will want in the future.
>> 
>> If you preffer as an intermediate meassure to just move to blocks, I can
>> do, but change is really small and not sure if it makes sense.
>
> OK then, but just comment it to say you want to.
> I'm still not sure if cpu_physical_memory_clear_dirty_range should
> have the RB;
> it feels that it's lower level, kvm stuff rather than things that know
> about RAMBlocks.

Bitmap is going to be there in the following patch.  Not a lot that we
can be done about that, no?

Right now we have:

- absolute address
- RAMblock
- byte offset inside block
- byte offset of ramblock
- Whole bitmaps (Migration, code and vga)
- migration bitmaps

This rseries move the migration bitmap inside the RAMBlock.  And we have
the RAMBlock in the caller.  We could search for it there, but looks
very inefficient.

I am trying to change all the code to use:

RAMblock pointer + target page offset inside ramblock

So we need to do a lot less calculations.

Later, Juan.



Re: [Qemu-devel] [PATCH 30/51] ram: Move src_page_req* to RAMState

2017-04-04 Thread Juan Quintela
"Dr. David Alan Gilbert"  wrote:
> * Juan Quintela (quint...@redhat.com) wrote:

Hi

>> @@ -1191,19 +1204,18 @@ static bool get_queued_page(RAMState *rs, 
>> MigrationState *ms,
>>   *
>>   * It should be empty at the end anyway, but in error cases there may
>>   * xbe some left.
>> - *
>> - * @ms: current migration state
>>   */
>> -void flush_page_queue(MigrationState *ms)
>> +void flush_page_queue(void)
>
> I'm not sure this is safe;  it's called from migrate_fd_cleanup right at
> the end, if you do any finalisation/cleanup of the RAMState in
> ram_save_complete
> then when is it safe to run this?

But, looking into it, I think that we should be able to move this into
ram_save_cleanup() no?

We don't need it after that?
As an added bonus, we can remove the export of it.

>> @@ -1260,16 +1272,16 @@ int ram_save_queue_pages(MigrationState *ms, const 
>> char *rbname,
>>  goto err;
>>  }
>>  
>> -struct MigrationSrcPageRequest *new_entry =
>> -g_malloc0(sizeof(struct MigrationSrcPageRequest));
>> +struct RAMSrcPageRequest *new_entry =
>> +g_malloc0(sizeof(struct RAMSrcPageRequest));
>>  new_entry->rb = ramblock;
>>  new_entry->offset = start;
>>  new_entry->len = len;
>>  
>>  memory_region_ref(ramblock->mr);
>> -qemu_mutex_lock(&ms->src_page_req_mutex);
>> -QSIMPLEQ_INSERT_TAIL(&ms->src_page_requests, new_entry, next_req);
>> -qemu_mutex_unlock(&ms->src_page_req_mutex);
>> +qemu_mutex_lock(&rs->src_page_req_mutex);
>> +QSIMPLEQ_INSERT_TAIL(&rs->src_page_requests, new_entry, next_req);
>> +qemu_mutex_unlock(&rs->src_page_req_mutex);
>
> Hmm ok where did it get it's rs from?
> Anyway, the thing I needed to convince myself of was that there was
> any guarantee that
> RAMState would exist by the time the first request came in, something
> that we now need
> to be careful of.
> I think we're mostly OK; we call qemu_savevm_state_begin() at the top
> of migration_thread so the ram_save_setup should be done and allocate
> the RAMState before we get into the main loop and thus before we ever
> look at the 'start_postcopy' flag and thus before we ever ask the destination
> to send us stuff.
>
>>  rcu_read_unlock();
>>  
>>  return 0;
>> @@ -1408,7 +1420,7 @@ static int ram_find_and_save_block(RAMState *rs, 
>> QEMUFile *f, bool last_stage)
>>  
>>  do {
>>  again = true;
>> -found = get_queued_page(rs, ms, &pss, &dirty_ram_abs);
>> +found = get_queued_page(rs, &pss, &dirty_ram_abs);
>>  
>>  if (!found) {
>>  /* priority queue empty, so just search for something dirty */
>> @@ -1968,6 +1980,8 @@ static int ram_state_init(RAMState *rs)
>>  
>>  memset(rs, 0, sizeof(*rs));
>>  qemu_mutex_init(&rs->bitmap_mutex);
>> +qemu_mutex_init(&rs->src_page_req_mutex);
>> +QSIMPLEQ_INIT(&rs->src_page_requests);
>
> Similar question to above; that mutex is going to get reinit'd
> on a new migration and it shouldn't be without it being destroyed.
> Maybe make it a once.

good catch.  I think that the easiest way is to just move it into proper
RAMState, init it here, and destroy it at cleanup?

Later, Juan.



Re: [Qemu-devel] [RFC PATCH v1 4/9] target/i386/misc_helper: wrap BQL around another IRQ generator

2017-04-04 Thread Eduardo Habkost
On Tue, Apr 04, 2017 at 09:53:15AM -0700, Richard Henderson wrote:
> On 04/03/2017 05:45 AM, Alex Bennée wrote:
> > Anything that calls into HW emulation must be protected by the BQL.
> > 
> > Signed-off-by: Alex Bennée 
> > ---
> >  target/i386/misc_helper.c | 3 +++
> >  1 file changed, 3 insertions(+)
> 
> Reviewed-by: Richard Henderson 

In case somebody is going to queue the whole series in one take:

Acked-by: Eduardo Habkost 

-- 
Eduardo



Re: [Qemu-devel] [PATCH v2 00/51] Creating RAMState for migration

2017-04-04 Thread Dr. David Alan Gilbert
* Juan Quintela (quint...@redhat.com) wrote:
> "Dr. David Alan Gilbert"  wrote:
> > * Juan Quintela (quint...@redhat.com) wrote:
> >> Hi
> >
> > Some high level points:
> >
> >> Continuation of previous series, all review comments addressed. New things:
> >> - Consolidate all function comments in the same style (yes, docs)
> >> - Be much more careful with maintaining comments correct
> >> - Move all postcopy fields to RAMState
> >
> >> - Move QEMUFile to RAMState
> >> - rename qemu_target_page_bits() to qemu_target_page_size() to reflect use
> >> - Remove MigrationState from functions that don't need it
> >> - reorganize last_sent_block to the place where it is used/needed
> >> - Move several places from offsets to pages
> >> - Rename last_ram_offset() to last_ram_page() to refect use
> >
> > An interesting question is what happens if we ever have multiple threads
> > working on RAM at once, I assume you're thinking there will be multiple
> > RAMStates?  It'll be interesting to see whether everything we have now got
> > in RAMState is stuff that wants to be replicated that way.
> 
> Working on paolo suggestion on sending everyhting through multiplefd's.
> That requires multiple FD's.

Yes, but for example do we want multiple postcopy request queues?
Do we have one reverse stream for requests or multiple?

Dave

> Thanks, Juan.
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH RFC] hw/pvrdma: Proposal of a new pvrdma device

2017-04-04 Thread Leon Romanovsky
On Tue, Apr 04, 2017 at 04:38:40PM +0300, Marcel Apfelbaum wrote:
> On 04/03/2017 09:23 AM, Leon Romanovsky wrote:
> > On Fri, Mar 31, 2017 at 06:45:43PM +0300, Marcel Apfelbaum wrote:
> > > On 03/30/2017 11:28 PM, Doug Ledford wrote:
> > > > On 3/30/17 9:13 AM, Leon Romanovsky wrote:
> > > > > On Thu, Mar 30, 2017 at 02:12:21PM +0300, Marcel Apfelbaum wrote:
> > > > > > From: Yuval Shaia 
> > > > > >
> > > > > >  Hi,
> > > > > >
> > > > > >  General description
> > > > > >  ===
> > > > > >  This is a very early RFC of a new RoCE emulated device
> > > > > >  that enables guests to use the RDMA stack without having
> > > > > >  a real hardware in the host.
> > > > > >
> > > > > >  The current implementation supports only VM to VM communication
> > > > > >  on the same host.
> > > > > >  Down the road we plan to make possible to be able to support
> > > > > >  inter-machine communication by utilizing physical RoCE devices
> > > > > >  or Soft RoCE.
> > > > > >
> > > > > >  The goals are:
> > > > > >  - Reach fast and secure loos-less Inter-VM data exchange.
> > > > > >  - Support remote VMs or bare metal machines.
> > > > > >  - Allow VMs migration.
> > > > > >  - Do not require to pin all VM memory.
> > > > > >
> > > > > >
> > > > > >  Objective
> > > > > >  =
> > > > > >  Have a QEMU implementation of the PVRDMA device. We aim to do so 
> > > > > > without
> > > > > >  any change in the PVRDMA guest driver which is already merged into 
> > > > > > the
> > > > > >  upstream kernel.
> > > > > >
> > > > > >
> > > > > >  RFC status
> > > > > >  ===
> > > > > >  The project is in early development stages and supports
> > > > > >  only basic send/receive operations.
> > > > > >
> > > > > >  We present it so we can get feedbacks on design,
> > > > > >  feature demands and to receive comments from the
> > > > > >  community pointing us to the "right" direction.
> > > > >
> > > > > If to judge by the feedback which you got from RDMA community
> > > > > for kernel proposal [1], this community failed to understand:
> > > > > 1. Why do you need new module?
> > > >
> > > > In this case, this is a qemu module to allow qemu to provide a virt 
> > > > rdma device to guests that is compatible with the device provided by 
> > > > VMWare's ESX product.  Right now, the vmware_pvrdma driver
> > > > works only when the guest is running on a VMWare ESX server product, 
> > > > this would change that.  Marcel mentioned that they are currently 
> > > > making it compatible because that's the easiest/quickest thing to
> > > > do, but in the future they might extend beyond what VMWare's virt rdma 
> > > > driver provides/uses and might then need to either modify it to work 
> > > > with their extensions or fork and create their own virt
> > > > client driver.
> > > >
> > > > > 2. Why existing solutions are not enough and can't be extended?
> > > >
> > > > This patch is against the qemu source code, not the kernel.  There is 
> > > > no other solution in the qemu source code, so there is no existing 
> > > > solution to extend.
> > > >
> > > > > 3. Why RXE (SoftRoCE) can't be extended to perform this inter-VM
> > > > >communication via virtual NIC?
> > > >
> > > > Eventually they want this to work on real hardware, and to be more or 
> > > > less transparent to the guest.  They will need to make it independent 
> > > > of the kernel hardware/driver in use.  That means their own
> > > > virt driver, then the virt driver will eventually hook into whatever 
> > > > hardware is present on the system, or failing that, fall back to soft 
> > > > RoCE or soft iWARP if that ever makes it in the kernel.
> > > >
> > > >
> > >
> > > Hi Leon and Doug,
> > > Your feedback is much appreciated!
> > >
> > > As Doug mentioned, the RFC is a QEMU implementation of a pvrdma device,
> > > so SoftRoCE can't help here (we are emulating a PCI device).
> >
> > I just responded to the latest email, but as you understood from my 
> > question,
> > it was related to your KDBR module.
> >
> > >
> > > Regarding the new KDBR module (Kernel Data Bridge), as the name suggests 
> > > is
> > > a bridge between different VMs or between a VM and a hardware/software 
> > > device
> > > and does not replace it.
> > >
> > > Leon, utilizing the Soft RoCE is definitely part of our roadmap from the 
> > > start,
> > > we find the project a must since most of our systems don't even have real
> > > RDMA hardware, and the question is how do best integrate with it.
> >
> > This is exactly the question, you chose as an implementation path to do
> > it with new module over char device. I'm not against your approach,
> > but would like to see the list with pros and cons for over possible
> > solutions if any. Does it make sense to do special ULP to share the data
> > between different drivers over shared memory?
>
> Hi Leon,
>
> Here are some thoughts regarding the Soft RoCE usage in our project.
> We thought about using it as backend for QEMU pvr

Re: [Qemu-devel] [PULL 0/2] 9pfs fixes for 2.9 2017-04-04

2017-04-04 Thread Peter Maydell
On 4 April 2017 at 17:13, Greg Kurz  wrote:
> The following changes since commit 87cc4c61020addea6a001b94b662596b1896d1b3:
>
>   Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into 
> staging (2017-04-04 11:40:55 +0100)
>
> are available in the git repository at:
>
>   https://github.com/gkurz/qemu.git tags/for-upstream
>
> for you to fetch changes up to 6d54af0ea970b4c0eb48bd2ae1d22b207dd4:
>
>   9pfs: clear migration blocker at session reset (2017-04-04 18:06:01 +0200)
>
> 
> Some 9pfs bugs fixes: potential hang at reset, migration blocker leak.
>
> 
> Greg Kurz (2):
>   9pfs: fix multiple flush for same request
>   9pfs: clear migration blocker at session reset
>
>  hw/9pfs/9p.c | 17 ++---
>  1 file changed, 10 insertions(+), 7 deletions(-)

Applied, thanks.

-- PMM



Re: [Qemu-devel] [PATCH v2 00/51] Creating RAMState for migration

2017-04-04 Thread Juan Quintela
"Dr. David Alan Gilbert"  wrote:
> * Juan Quintela (quint...@redhat.com) wrote:
>> Hi
>
> Some high level points:
>
>> Continuation of previous series, all review comments addressed. New things:
>> - Consolidate all function comments in the same style (yes, docs)
>> - Be much more careful with maintaining comments correct
>> - Move all postcopy fields to RAMState
>
>> - Move QEMUFile to RAMState
>> - rename qemu_target_page_bits() to qemu_target_page_size() to reflect use
>> - Remove MigrationState from functions that don't need it
>> - reorganize last_sent_block to the place where it is used/needed
>> - Move several places from offsets to pages
>> - Rename last_ram_offset() to last_ram_page() to refect use
>
> An interesting question is what happens if we ever have multiple threads
> working on RAM at once, I assume you're thinking there will be multiple
> RAMStates?  It'll be interesting to see whether everything we have now got
> in RAMState is stuff that wants to be replicated that way.

Working on paolo suggestion on sending everyhting through multiplefd's.
That requires multiple FD's.

Thanks, Juan.



Re: [Qemu-devel] [PATCH 01/51] ram: Update all functions comments

2017-04-04 Thread Juan Quintela
"Dr. David Alan Gilbert"  wrote:
> * Juan Quintela (quint...@redhat.com) wrote:
>> Added doc comments for existing functions comment and rewrite them in
>> a common style.
>> 
>> Signed-off-by: Juan Quintela 
>> ---
>>  migration/ram.c | 348 
>> 
>>  1 file changed, 227 insertions(+), 121 deletions(-)
>> 

>>   *
>>   * If this is the 1st block, it also writes the block identification
>>   *
>> - * Returns: Number of bytes written
>> + * Returns the number of bytes written
>
> Do the doc tools recognise that to pick up the explanation
> for the return value?

No clue.  Following qemu/include/exec/memory.h

>> @@ -459,8 +474,8 @@ static void xbzrle_cache_zero_page(ram_addr_t 
>> current_addr)
>>   *  -1 means that xbzrle would be longer than normal
>>   *
>>   * @f: QEMUFile where to send the data
>> - * @current_data:
>> - * @current_addr:
>> + * @current_data: contents of the page
>
> That's wrong.  The point of current_data is that it gets updated by this
> function to point to the cache page whenever the data ends up in the cache.
> It's important then that the caller uses that pointer to save the data to
> disk/network rather than the original pointer, since the data that's saved
> must exactly match the cache contents even if the guest is still writing to 
> it.

this is the current text:

* @current_data: pointer to the address of the page contents

This was Peter suggestion.

Rest of suggestions included. 

Thanks, Juan.



[Qemu-devel] [PATCH v2 2/3] applesmc: implement error status port

2017-04-04 Thread Gabriel L. Somlo
As of release 10.12.4, OS X (Sierra) refuses to boot unless the
AppleSMC supports an additional I/O port, expected to provide an
error status code.

Update the [cmd|data]_write() and data_read() methods to implement
the required state machine, and add I/O region & methods to handle
access to the error port.

Originally proposed by Eric Shelton 

Signed-off-by: Gabriel Somlo 
---
 hw/misc/applesmc.c | 141 +++--
 1 file changed, 115 insertions(+), 26 deletions(-)

diff --git a/hw/misc/applesmc.c b/hw/misc/applesmc.c
index 6381197..0d882e8 100644
--- a/hw/misc/applesmc.c
+++ b/hw/misc/applesmc.c
@@ -43,6 +43,7 @@
 enum {
 APPLESMC_DATA_PORT   = 0x00,
 APPLESMC_CMD_PORT= 0x04,
+APPLESMC_ERR_PORT= 0x1e,
 APPLESMC_NUM_PORTS   = 0x20,
 };
 
@@ -53,6 +54,24 @@ enum {
 APPLESMC_GET_KEY_TYPE_CMD= 0x13,
 };
 
+enum {
+APPLESMC_ST_CMD_DONE = 0x00,
+APPLESMC_ST_DATA_READY   = 0x01,
+APPLESMC_ST_BUSY = 0x02,
+APPLESMC_ST_ACK  = 0x04,
+APPLESMC_ST_NEW_CMD  = 0x08,
+};
+
+enum {
+APPLESMC_ST_1E_CMD_INTRUPTED = 0x80,
+APPLESMC_ST_1E_STILL_BAD_CMD = 0x81,
+APPLESMC_ST_1E_BAD_CMD   = 0x82,
+APPLESMC_ST_1E_NOEXIST   = 0x84,
+APPLESMC_ST_1E_WRITEONLY = 0x85,
+APPLESMC_ST_1E_READONLY  = 0x86,
+APPLESMC_ST_1E_BAD_INDEX = 0xb8,
+};
+
 #ifdef DEBUG_SMC
 #define smc_debug(...) fprintf(stderr, "AppleSMC: " __VA_ARGS__)
 #else
@@ -77,9 +96,12 @@ struct AppleSMCState {
 
 MemoryRegion io_data;
 MemoryRegion io_cmd;
+MemoryRegion io_err;
 uint32_t iobase;
 uint8_t cmd;
 uint8_t status;
+uint8_t status_1e;
+uint8_t last_ret;
 char key[4];
 uint8_t read_pos;
 uint8_t data_len;
@@ -93,89 +115,138 @@ static void applesmc_io_cmd_write(void *opaque, hwaddr 
addr, uint64_t val,
   unsigned size)
 {
 AppleSMCState *s = opaque;
+uint8_t status = s->status & 0x0f;
 
-smc_debug("CMD Write B: %#x = %#x\n", addr, val);
+smc_debug("CMD received: 0x%02x\n", (uint8_t)val);
 switch (val) {
 case APPLESMC_READ_CMD:
-s->status = 0x0c;
+/* did last command run through OK? */
+if (status == APPLESMC_ST_CMD_DONE || status == APPLESMC_ST_NEW_CMD) {
+s->cmd = val;
+s->status = APPLESMC_ST_NEW_CMD | APPLESMC_ST_ACK;
+} else {
+smc_debug("ERROR: previous command interrupted!\n");
+s->status = APPLESMC_ST_NEW_CMD;
+s->status_1e = APPLESMC_ST_1E_CMD_INTRUPTED;
+}
 break;
+default:
+smc_debug("UNEXPECTED CMD 0x%02x\n", (uint8_t)val);
+s->status = APPLESMC_ST_NEW_CMD;
+s->status_1e = APPLESMC_ST_1E_BAD_CMD;
 }
-s->cmd = val;
 s->read_pos = 0;
 s->data_pos = 0;
 }
 
-static void applesmc_fill_data(AppleSMCState *s)
+static struct AppleSMCData *applesmc_find_key(AppleSMCState *s)
 {
 struct AppleSMCData *d;
 
 QLIST_FOREACH(d, &s->data_def, node) {
 if (!memcmp(d->key, s->key, 4)) {
-smc_debug("Key matched (%s Len=%d Data=%s)\n", d->key,
-  d->len, d->data);
-memcpy(s->data, d->data, d->len);
-return;
+return d;
 }
 }
+return NULL;
 }
 
 static void applesmc_io_data_write(void *opaque, hwaddr addr, uint64_t val,
unsigned size)
 {
 AppleSMCState *s = opaque;
+struct AppleSMCData *d;
 
-smc_debug("DATA Write B: %#x = %#x\n", addr, val);
+smc_debug("DATA received: 0x%02x\n", (uint8_t)val);
 switch (s->cmd) {
 case APPLESMC_READ_CMD:
+if ((s->status & 0x0f) == APPLESMC_ST_CMD_DONE) {
+break;
+}
 if (s->read_pos < 4) {
 s->key[s->read_pos] = val;
-s->status = 0x04;
+s->status = APPLESMC_ST_ACK;
 } else if (s->read_pos == 4) {
-s->data_len = val;
-s->status = 0x05;
-s->data_pos = 0;
-smc_debug("Key = %c%c%c%c Len = %d\n", s->key[0],
-  s->key[1], s->key[2], s->key[3], val);
-applesmc_fill_data(s);
+d = applesmc_find_key(s);
+if (d != NULL) {
+memcpy(s->data, d->data, d->len);
+s->data_len = d->len;
+s->data_pos = 0;
+s->status = APPLESMC_ST_ACK | APPLESMC_ST_DATA_READY;
+s->status_1e = APPLESMC_ST_CMD_DONE;  /* clear on valid key */
+} else {
+smc_debug("READ_CMD: key '%c%c%c%c' not found!\n",
+  s->key[0], s->key[1], s->key[2], s->key[3]);
+s->status = APPLESMC_ST_CMD_DONE;
+s->status_1e = APPLESMC_ST_1E_NOEXIST;
+}
 }
 

[Qemu-devel] [PATCH v2 1/3] applesmc: cosmetic whitespace and indentation cleanup

2017-04-04 Thread Gabriel L. Somlo
Signed-off-by: Gabriel Somlo 
Reviewed-by: Alexander Graf 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/misc/applesmc.c | 98 --
 1 file changed, 50 insertions(+), 48 deletions(-)

diff --git a/hw/misc/applesmc.c b/hw/misc/applesmc.c
index 77fab5b..6381197 100644
--- a/hw/misc/applesmc.c
+++ b/hw/misc/applesmc.c
@@ -39,21 +39,24 @@
 /* #define DEBUG_SMC */
 
 #define APPLESMC_DEFAULT_IOBASE0x300
-/* data port used by Apple SMC */
-#define APPLESMC_DATA_PORT 0x0
-/* command/status port used by Apple SMC */
-#define APPLESMC_CMD_PORT  0x4
-#define APPLESMC_NR_PORTS  32
 
-#define APPLESMC_READ_CMD  0x10
-#define APPLESMC_WRITE_CMD 0x11
-#define APPLESMC_GET_KEY_BY_INDEX_CMD  0x12
-#define APPLESMC_GET_KEY_TYPE_CMD  0x13
+enum {
+APPLESMC_DATA_PORT   = 0x00,
+APPLESMC_CMD_PORT= 0x04,
+APPLESMC_NUM_PORTS   = 0x20,
+};
+
+enum {
+APPLESMC_READ_CMD= 0x10,
+APPLESMC_WRITE_CMD   = 0x11,
+APPLESMC_GET_KEY_BY_INDEX_CMD= 0x12,
+APPLESMC_GET_KEY_TYPE_CMD= 0x13,
+};
 
 #ifdef DEBUG_SMC
 #define smc_debug(...) fprintf(stderr, "AppleSMC: " __VA_ARGS__)
 #else
-#define smc_debug(...) do { } while(0)
+#define smc_debug(...) do { } while (0)
 #endif
 
 static char default_osk[64] = "This is a dummy key. Enter the real key "
@@ -77,12 +80,11 @@ struct AppleSMCState {
 uint32_t iobase;
 uint8_t cmd;
 uint8_t status;
-uint8_t key[4];
+char key[4];
 uint8_t read_pos;
 uint8_t data_len;
 uint8_t data_pos;
 uint8_t data[255];
-uint8_t charactic[4];
 char *osk;
 QLIST_HEAD(, AppleSMCData) data_def;
 };
@@ -93,10 +95,10 @@ static void applesmc_io_cmd_write(void *opaque, hwaddr 
addr, uint64_t val,
 AppleSMCState *s = opaque;
 
 smc_debug("CMD Write B: %#x = %#x\n", addr, val);
-switch(val) {
-case APPLESMC_READ_CMD:
-s->status = 0x0c;
-break;
+switch (val) {
+case APPLESMC_READ_CMD:
+s->status = 0x0c;
+break;
 }
 s->cmd = val;
 s->read_pos = 0;
@@ -123,54 +125,54 @@ static void applesmc_io_data_write(void *opaque, hwaddr 
addr, uint64_t val,
 AppleSMCState *s = opaque;
 
 smc_debug("DATA Write B: %#x = %#x\n", addr, val);
-switch(s->cmd) {
-case APPLESMC_READ_CMD:
-if(s->read_pos < 4) {
-s->key[s->read_pos] = val;
-s->status = 0x04;
-} else if(s->read_pos == 4) {
-s->data_len = val;
-s->status = 0x05;
-s->data_pos = 0;
-smc_debug("Key = %c%c%c%c Len = %d\n", s->key[0],
-  s->key[1], s->key[2], s->key[3], val);
-applesmc_fill_data(s);
-}
-s->read_pos++;
-break;
+switch (s->cmd) {
+case APPLESMC_READ_CMD:
+if (s->read_pos < 4) {
+s->key[s->read_pos] = val;
+s->status = 0x04;
+} else if (s->read_pos == 4) {
+s->data_len = val;
+s->status = 0x05;
+s->data_pos = 0;
+smc_debug("Key = %c%c%c%c Len = %d\n", s->key[0],
+  s->key[1], s->key[2], s->key[3], val);
+applesmc_fill_data(s);
+}
+s->read_pos++;
+break;
 }
 }
 
-static uint64_t applesmc_io_data_read(void *opaque, hwaddr addr1,
-  unsigned size)
+static uint64_t applesmc_io_data_read(void *opaque, hwaddr addr, unsigned size)
 {
 AppleSMCState *s = opaque;
 uint8_t retval = 0;
 
-switch(s->cmd) {
-case APPLESMC_READ_CMD:
-if(s->data_pos < s->data_len) {
-retval = s->data[s->data_pos];
-smc_debug("READ_DATA[%d] = %#hhx\n", s->data_pos,
-  retval);
-s->data_pos++;
-if(s->data_pos == s->data_len) {
-s->status = 0x00;
-smc_debug("EOF\n");
-} else
-s->status = 0x05;
+switch (s->cmd) {
+case APPLESMC_READ_CMD:
+if (s->data_pos < s->data_len) {
+retval = s->data[s->data_pos];
+smc_debug("READ_DATA[%d] = %#hhx\n", s->data_pos,
+  retval);
+s->data_pos++;
+if (s->data_pos == s->data_len) {
+s->status = 0x00;
+smc_debug("EOF\n");
+} else {
+s->status = 0x05;
 }
+}
 }
-smc_debug("DATA Read b: %#x = %#x\n", addr1, retval);
+smc_debug("DATA Read b: %#x = %#x\n", addr, retval);
 
 return retval;
 }
 
-static uint64_t applesmc_io_cmd_read(void *opaque, hwaddr addr1, unsigned size)
+static uint64_t applesmc_io_cmd_read(void *opaque, hwaddr addr, unsigned size)
 {
 AppleSMCStat

[Qemu-devel] [PATCH v2 0/3] Update AppleSMC for OS X Sierra 10.12.4 guests

2017-04-04 Thread Gabriel L. Somlo
As of 10.12.4 (currently the latest Sierra update), OS X refuses to boot
unless the AppleSMC supports a third I/O port, which provides the current
error status when read.

New since v1:

- 1/3: don't touch the default OSK string, as it's unnecessary
at this time

- 2/3: don't consolidate I/O regions, leave as-is for data
and cmd; This patch now implements the error-code state
machine, AND adds an i/o region dedicaded to the error
status port, complete with read/write access methods.

- 3/3: optional patch setting access width to 1-byte on data and
cmd i/o regions. Tested on OS X versions 10.[6..12].

> This series consists of three patches:
>
>   - 1/3: indentation/whitespace cleanup for applesmc.c to the point
>   where it now passes scripts/checkpatc.pl, and allows
>   subsequent changes to look nice in diff-patch format :)
>
>   - 2/3: consolidate Port I/O into a single region, and invoke
>   appropriate read/write methods based on the offset being
>   accessed
>
>   - 3/3: implement read-only error/status port, and update
>   data and command read/write methods to correctly
>   maintain the state machine for keeping the status_1e
>   value up to date.

Gabriel L. Somlo (3):
  applesmc: cosmetic whitespace and indentation cleanup
  applesmc: implement error status port
  applesmc: fix port i/o access width

 hw/misc/applesmc.c | 219 +
 1 file changed, 155 insertions(+), 64 deletions(-)

-- 
2.7.4




[Qemu-devel] [PATCH v2 3/3] applesmc: fix port i/o access width

2017-04-04 Thread Gabriel L. Somlo
Set width of the two i/o regions dedicated to the AppleSMC's 8-bit
data and command ports to 1 byte.

Signed-off-by: Gabriel Somlo 
---

Setting these to 1-byte width works fine on any OS X version I could find
to test on: 10.(6-12), inclusive.

On linux, the applesmc kernel module tries *hard* to avoid loading on
anything that's not a Mac, by checking DMI board vendor and product
strings. If I force it using:

   -smbios type=1,manufacturer='Apple Inc.',product='iMac2',family='iMac' \
   -smbios type=2,manufacturer='Apple Inc.',version='iMac' \
   -device isa-applesmc

the module fails both before and after this whole series, suggesting it's
not (just) the access width but rather the overall incomplete emulation
that's the issue.

If we decide to go for implementing a more complete emulation, beyond
simply the minimum necessary to satisfy OS X, we should definitely ensure
that Linux is also happily able to initialize its applesmc driver...

 hw/misc/applesmc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/misc/applesmc.c b/hw/misc/applesmc.c
index 0d882e8..7896812 100644
--- a/hw/misc/applesmc.c
+++ b/hw/misc/applesmc.c
@@ -316,12 +316,12 @@ static void applesmc_isa_realize(DeviceState *dev, Error 
**errp)
 AppleSMCState *s = APPLE_SMC(dev);
 
 memory_region_init_io(&s->io_data, OBJECT(s), &applesmc_data_io_ops, s,
-  "applesmc-data", 4);
+  "applesmc-data", 1);
 isa_register_ioport(&s->parent_obj, &s->io_data,
 s->iobase + APPLESMC_DATA_PORT);
 
 memory_region_init_io(&s->io_cmd, OBJECT(s), &applesmc_cmd_io_ops, s,
-  "applesmc-cmd", 4);
+  "applesmc-cmd", 1);
 isa_register_ioport(&s->parent_obj, &s->io_cmd,
 s->iobase + APPLESMC_CMD_PORT);
 
-- 
2.7.4




Re: [Qemu-devel] [PULL 0/1] pci: fix

2017-04-04 Thread Peter Maydell
On 4 April 2017 at 16:35, Michael S. Tsirkin  wrote:
> The following changes since commit 87cc4c61020addea6a001b94b662596b1896d1b3:
>
>   Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into 
> staging (2017-04-04 11:40:55 +0100)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream
>
> for you to fetch changes up to 193982c6f9424779b53a168fe32ebc30a776cbf1:
>
>   pci: Only unmap bus_master_enabled_region if was added previously 
> (2017-04-04 18:32:25 +0300)
>
> 
> pci: fix
>
> A single bugfix for a error handling issue in pci.
>
> Signed-off-by: Michael S. Tsirkin 
>
> 
> Alexey Kardashevskiy (1):
>   pci: Only unmap bus_master_enabled_region if was added previously

Applied, thanks.

-- PMM



Re: [Qemu-devel] [RFC PATCH v1 6/9] cpus: check cpu->running in cpu_get_icount_raw()

2017-04-04 Thread Richard Henderson

On 04/03/2017 05:45 AM, Alex Bennée wrote:

The lifetime of current_cpu is now the lifetime of the vCPU thread.
However get_icount_raw() can apply a fudge factor if called while code
is running to take into account the current executed instruction
count.

To ensure this is always the case we also check cpu->running.

Signed-off-by: Alex Bennée 
---
 cpus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


Reviewed-by: Richard Henderson 


r~



Re: [Qemu-devel] [RFC PATCH v1 5/9] cpus: remove icount handling from qemu_tcg_cpu_thread_fn

2017-04-04 Thread Richard Henderson

On 04/03/2017 05:45 AM, Alex Bennée wrote:

We should never be running in multi-threaded mode with icount enabled.
There is no point calling handle_icount_deadline here so remove it and
assert !use_icount.

Signed-off-by: Alex Bennée 
---
 cpus.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


Reviewed-by: Richard Henderson 


r~



Re: [Qemu-devel] [RFC PATCH v1 4/9] target/i386/misc_helper: wrap BQL around another IRQ generator

2017-04-04 Thread Richard Henderson

On 04/03/2017 05:45 AM, Alex Bennée wrote:

Anything that calls into HW emulation must be protected by the BQL.

Signed-off-by: Alex Bennée 
---
 target/i386/misc_helper.c | 3 +++
 1 file changed, 3 insertions(+)


Reviewed-by: Richard Henderson 


r~



Re: [Qemu-devel] [PULL v1 0/2] Merge qio 2017/04/04 v1

2017-04-04 Thread Peter Maydell
On 4 April 2017 at 16:20, Daniel P. Berrange  wrote:
> The following changes since commit 87cc4c61020addea6a001b94b662596b1896d1b3:
>
>   Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into 
> staging (2017-04-04 11:40:55 +0100)
>
> are available in the git repository at:
>
>   git://github.com/berrange/qemu tags/pull-qio-2017-04-04-1
>
> for you to fetch changes up to b8a68728b6a3fad86f15aa5efdc31ea0b3cb8a62:
>
>   io: fix FD socket handling in DNS lookup (2017-04-04 16:17:03 +0100)
>
> 
> Merge qio 2017/04/04 v1
>
> 

Applied, thanks.

-- PMM



Re: [Qemu-devel] [PATCH v3 02/21] mux: simplfy muxes_realize_done

2017-04-04 Thread Marc-André Lureau
Hi Philippe

- Original Message -
> Hi Marc-André,
> 
> On 03/16/2017 06:21 AM, Marc-André Lureau wrote:
> > mux_chr_event() already send events to all backends, rename it,
> > export it, and use it from muxes_realize_done. This should help abstract
> > away mux implementation.
> >
> > Signed-off-by: Marc-André Lureau 
> > Reviewed-by: Eric Blake 
> > ---
> >  chardev/char-mux.h |  2 +-
> >  chardev/char-mux.c | 11 ---
> >  chardev/char.c |  9 ++---
> >  3 files changed, 11 insertions(+), 11 deletions(-)
> >
> > diff --git a/chardev/char-mux.h b/chardev/char-mux.h
> > index 9a2fffce91..3f41dfcfd2 100644
> > --- a/chardev/char-mux.h
> > +++ b/chardev/char-mux.h
> > @@ -58,6 +58,6 @@ typedef struct MuxChardev {
> >
> >  void mux_chr_set_handlers(Chardev *chr, GMainContext *context);
> >  void mux_set_focus(Chardev *chr, int focus);
> > -void mux_chr_send_event(MuxChardev *d, int mux_nr, int event);
> > +void mux_chr_send_all_event(Chardev *chr, int event);
> >
> >  #endif /* CHAR_MUX_H */
> > diff --git a/chardev/char-mux.c b/chardev/char-mux.c
> > index 5547a36a0a..37d42c65c6 100644
> > --- a/chardev/char-mux.c
> > +++ b/chardev/char-mux.c
> > @@ -114,7 +114,7 @@ static void mux_print_help(Chardev *chr)
> >  }
> >  }
> >
> > -void mux_chr_send_event(MuxChardev *d, int mux_nr, int event)
> > +static void mux_chr_send_event(MuxChardev *d, int mux_nr, int event)
> >  {
> >  CharBackend *be = d->backends[mux_nr];
> >
> > @@ -222,9 +222,9 @@ static void mux_chr_read(void *opaque, const uint8_t
> > *buf, int size)
> >
> >  bool muxes_realized;
> >
> > -static void mux_chr_event(void *opaque, int event)
> > +void mux_chr_send_all_event(Chardev *chr, int event)
> >  {
> > -MuxChardev *d = MUX_CHARDEV(opaque);
> > +MuxChardev *d = MUX_CHARDEV(chr);
> >  int i;
> >
> >  if (!muxes_realized) {
> > @@ -237,6 +237,11 @@ static void mux_chr_event(void *opaque, int event)
> >  }
> >  }
> >
> > +static void mux_chr_event(void *opaque, int event)
> 
> Why use an opaque pointer and not Chardev *chr here? Not a big deal but
> it eases debugging sessions.
> Anyway it'll be correctly displayed single-stepping in :)

It's the signature of IOEventHandler, avoiding casting.

> 
> Reviewed-by: Philippe Mathieu-Daudé 
> 
> > +{
> > +mux_chr_send_all_event(CHARDEV(opaque), event);
> > +}
> > +
> >  static GSource *mux_chr_add_watch(Chardev *s, GIOCondition cond)
> >  {
> >  MuxChardev *d = MUX_CHARDEV(s);
> > diff --git a/chardev/char.c b/chardev/char.c
> > index aad639b620..674c097fbe 100644
> > --- a/chardev/char.c
> > +++ b/chardev/char.c
> > @@ -451,22 +451,17 @@ static void muxes_realize_done(Notifier *notifier,
> > void *unused)
> >  {
> >  Chardev *chr;
> >
> > +muxes_realized = true;
> >  QTAILQ_FOREACH(chr, &chardevs, next) {
> >  if (CHARDEV_IS_MUX(chr)) {
> > -MuxChardev *d = MUX_CHARDEV(chr);
> > -int i;
> > -
> >  /* send OPENED to all already-attached FEs */
> > -for (i = 0; i < d->mux_cnt; i++) {
> > -mux_chr_send_event(d, i, CHR_EVENT_OPENED);
> > -}
> > +mux_chr_send_all_event(CHARDEV(chr), CHR_EVENT_OPENED);
> >  /* mark mux as OPENED so any new FEs will immediately receive
> >   * OPENED event
> >   */
> >  qemu_chr_be_event(chr, CHR_EVENT_OPENED);
> >  }
> >  }
> > -muxes_realized = true;
> >  }
> >
> >  static Notifier muxes_realize_notify = {
> >
> 



Re: [Qemu-devel] [PATCH 0/7] Provide support for the software TPM emulator

2017-04-04 Thread Stefan Berger

On 04/04/2017 11:43 AM, Daniel P. Berrange wrote:

On Mon, Apr 03, 2017 at 05:18:37PM +, Marc-André Lureau wrote:

Hi

On Mon, Apr 3, 2017 at 7:08 PM Daniel P. Berrange 
wrote:


On Fri, Mar 31, 2017 at 04:10:09PM +0300, Amarnath Valluri wrote:

Briefly, Theses set of patches introduces:
  - new TPM backend driver to support software TPM emulators(swtpm(1)).
  - and few supported fixes/enhancements/cleanup to existing tpm backend

code.

The similar idea was initiated earliar(2) by Stefan Berger(CCed) with

slightly

different approach, using CUSE. As swtpm has excellent support for unix

domain

sockets, hence this implementation uses unix domain sockets to

communicate with

swtpm.

When Qemu is configured with 'emulator' tpm backend, it spawns 'swtpm'

and

communicates its via Unix domain sockets.

I'm not convinced that having QEMU spawning swtpm itself is a desirable
approach, as it means QEMU needs to have all the privileges that swtpm
will need, so that swtpm can inherit them. At the very least I think we
need to have a way to disable this spawning, so it can connect to a
pre-existing swtpm process that's been spawned ahead of time. This will
let us have stricter privilege separation.


Could the security contexts be given as arguments to qemu for the
subprocesses? The reason to have qemu spawn its own subprocess is that it
would leave more flexibility on how and when to do it, or even to use
multiple subprcesses if some day that makes sense. If there is no reason to
make this interface public or shared by various other processes, I would
rather see it as internal to qemu rather than managed by other tools. It
also makes command line & testing easier.

If QEMU is responsible for spawning it, then swtpm would have to run as the
same user/group as QEMU, as we do not want QEMU to have ability to invoke
programs with setuid. This means any state files / config / devices that
swtpm has to access would also be accessible by QEMU directly (assuming just
DAC controls, no MAC like SELinux). It also means that swtpm would be able
to access QEMU resources - so a bug in swtpm would allow swtpm to compromise
the guest disk image and other resources QEMU access.

If by contrast, libvirtd is responsible for spawning it, or an even a higher
level mgmt app like OpenStack/oVirt, then swtpm can be run with completely
isolated user / group privileges. The only resource that QEMU needs access
to is the UNIX domain socket, and swtpm doesn't need to access anything
belonging to QEMU. This will give us much stronger security separation
between swtpm & QEMU.  It will also allow us to write a more useful seccomp
policy that entirey blocks exec() from QEMU, further mitigating damage from
potential QEMU exploits.


Here are my patches for libvirt. They are pretty old by now. (won't 
forward port unless...)


https://github.com/stefanberger/libvirt-tpm/commits/v1.2.14-maint-vtpm

   Stefan




Re: [Qemu-devel] [PATCH 7/7] tpm: New backend driver to support TPM emulator

2017-04-04 Thread Marc-André Lureau
Hi

(just a quick review, there will be more once the design discussion is over)

On Fri, Mar 31, 2017 at 5:03 PM Amarnath Valluri 
wrote:

> This change introduces a new TPM backend driver that can communicates with
>

communicate


> swtpm(software TPM emulator) using unix domain socket interface.
>
> Swtpm uses two unix sockets, one for plain TPM commands and responses, and
> one
> for out-of-band control messages.
>
>
Is the protocol documented? what's the reason to use 2 sockets?


> The swtpm and associated tools can be found here:
> https://github.com/stefanberger/swtpm
>
> Usage:
> # setup TPM state directory
> mkdir /tmp/mytpm
> chown -R tss:root /tmp/mytpm
> /usr/bin/swtpm_setup --tpm-state /tmp/mytpm --createek
>
> # Ask qeum to use TPM emulator with given tpm state directory
>

qemu


> qemu-system-x86_64 \
> [...] \
> -tpmdev
> emulator,id=tpm0,tpmstatedir=/tmp/mytpm,logfile=/tmp/swtpm.log \
> -device tpm-tis,tpmdev=tpm0 \
> [...]
>
> Signed-off-by: Amarnath Valluri 
> ---
>  configure |   6 +
>  hmp.c |  14 +
>  hw/tpm/Makefile.objs  |   1 +
>  hw/tpm/tpm_emulator.c | 740
> ++
>  hw/tpm/tpm_ioctl.h| 243 +
>  hw/tpm/tpm_util.c |  34 +++
>  hw/tpm/tpm_util.h |   3 +
>  qapi-schema.json  |  22 +-
>  qemu-options.hx   |  25 +-
>  tpm.c |   7 +-
>  10 files changed, 1089 insertions(+), 6 deletions(-)
>  create mode 100644 hw/tpm/tpm_emulator.c
>  create mode 100644 hw/tpm/tpm_ioctl.h
>
> diff --git a/configure b/configure
> index 4901b9a..9089546 100755
> --- a/configure
> +++ b/configure
> @@ -3349,8 +3349,10 @@ fi
>
>  if test "$targetos" = Linux && test "$cpu" = i386 -o "$cpu" = x86_64; then
>tpm_passthrough=$tpm
> +  tpm_emulator=$tpm
>

What is the restriction to x86 & linux? I suppose it should work on various
unix.


>  else
>tpm_passthrough=no
> +  tpm_emulator=no
>  fi
>
>  ##
> @@ -5125,6 +5127,7 @@ echo "gcov enabled  $gcov"
>  echo "TPM support   $tpm"
>  echo "libssh2 support   $libssh2"
>  echo "TPM passthrough   $tpm_passthrough"
> +echo "TPM emulator  $tpm_emulator"
>  echo "QOM debugging $qom_cast_debug"
>  echo "lzo support   $lzo"
>  echo "snappy support$snappy"
> @@ -5704,6 +5707,9 @@ if test "$tpm" = "yes"; then
>if test "$tpm_passthrough" = "yes"; then
>  echo "CONFIG_TPM_PASSTHROUGH=y" >> $config_host_mak
>fi
> +  if test "$tpm_emulator" = "yes"; then
> +echo "CONFIG_TPM_EMULATOR=y" >> $config_host_mak
> +  fi
>  fi
>
>  echo "TRACE_BACKENDS=$trace_backends" >> $config_host_mak
> diff --git a/hmp.c b/hmp.c
> index edb8970..03a47e2 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -937,6 +937,7 @@ void hmp_info_tpm(Monitor *mon, const QDict *qdict)
>  Error *err = NULL;
>  unsigned int c = 0;
>  TPMPassthroughOptions *tpo;
> +TPMEmulatorOptions *teo;
>
>  info_list = qmp_query_tpm(&err);
>  if (err) {
> @@ -966,6 +967,19 @@ void hmp_info_tpm(Monitor *mon, const QDict *qdict)
> tpo->has_cancel_path ? ",cancel-path=" : "",
> tpo->has_cancel_path ? tpo->cancel_path : "");
>  break;
> +case TPM_TYPE_OPTIONS_KIND_EMULATOR:
> +teo = ti->options->u.emulator.data;
> +monitor_printf(mon, ",tmpstatedir=%s", teo->tpmstatedir);
> +if (teo->has_path) {
> +monitor_printf(mon, ",path=%s", teo->path);
> +}
> +if (teo->has_logfile) {
> +monitor_printf(mon, ",logfile=%s", teo->logfile);
> +}
> +if (teo->has_loglevel) {
> +monitor_printf(mon, ",loglevel=%ld", teo->loglevel);
> +}
> +break;
>  case TPM_TYPE_OPTIONS_KIND__MAX:
>  break;
>  }
> diff --git a/hw/tpm/Makefile.objs b/hw/tpm/Makefile.objs
> index 64cecc3..41f0b7a 100644
> --- a/hw/tpm/Makefile.objs
> +++ b/hw/tpm/Makefile.objs
> @@ -1,2 +1,3 @@
>  common-obj-$(CONFIG_TPM_TIS) += tpm_tis.o
>  common-obj-$(CONFIG_TPM_PASSTHROUGH) += tpm_passthrough.o tpm_util.o
> +common-obj-$(CONFIG_TPM_EMULATOR) += tpm_emulator.o tpm_util.o
> diff --git a/hw/tpm/tpm_emulator.c b/hw/tpm/tpm_emulator.c
> new file mode 100644
> index 000..5b1dcfa
> --- /dev/null
> +++ b/hw/tpm/tpm_emulator.c
> @@ -0,0 +1,740 @@
> +/*
> + *  emulator TPM driver
> + *
> + *  Copyright (c) 2017 Intel Corporation
> + *  Author: Amarnath Valluri 
> + *
> + *  Copyright (c) 2010 - 2013 IBM Corporation
> + *  Authors:
> + *Stefan Berger 
> + *
> + *  Copyright (C) 2011 IAIK, Graz University of Technology
> + *Author: Andreas Niederl
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation;

[Qemu-devel] [PULL 2/2] 9pfs: clear migration blocker at session reset

2017-04-04 Thread Greg Kurz
The migration blocker survives a device reset: if the guest mounts a 9p
share and then gets rebooted with system_reset, it will be unmigratable
until it remounts and umounts the 9p share again.

This happens because the migration blocker is supposed to be cleared when
we put the last reference on the root fid, but virtfs_reset() wrongly calls
free_fid() instead of put_fid().

This patch fixes virtfs_reset() so that it honor the way fids are supposed
to be manipulated: first get a reference and later put it back when you're
done.

Signed-off-by: Greg Kurz 
Reviewed-by: Li Qiang 
---
 hw/9pfs/9p.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index ef47a0a5ad6f..c80ba67389ce 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -539,14 +539,15 @@ static void coroutine_fn virtfs_reset(V9fsPDU *pdu)
 
 /* Free all fids */
 while (s->fid_list) {
+/* Get fid */
 fidp = s->fid_list;
+fidp->ref++;
+
+/* Clunk fid */
 s->fid_list = fidp->next;
+fidp->clunked = 1;
 
-if (fidp->ref) {
-fidp->clunked = 1;
-} else {
-free_fid(pdu, fidp);
-}
+put_fid(pdu, fidp);
 }
 }
 
-- 
2.7.4




[Qemu-devel] [PULL 1/2] 9pfs: fix multiple flush for same request

2017-04-04 Thread Greg Kurz
If a client tries to flush the same outstanding request several times, only
the first flush completes. Subsequent ones keep waiting for the request
completion in v9fs_flush() and, therefore, leak a PDU. This will cause QEMU
to hang when draining active PDUs the next time the device is reset.

Let have each flush request wake up the next one if any. The last waiter
frees the cancelled PDU.

Signed-off-by: Greg Kurz 
Reviewed-by: Eric Blake 
---
 hw/9pfs/9p.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/9pfs/9p.c b/hw/9pfs/9p.c
index 48babce836b6..ef47a0a5ad6f 100644
--- a/hw/9pfs/9p.c
+++ b/hw/9pfs/9p.c
@@ -2387,8 +2387,10 @@ static void coroutine_fn v9fs_flush(void *opaque)
  * Wait for pdu to complete.
  */
 qemu_co_queue_wait(&cancel_pdu->complete, NULL);
-cancel_pdu->cancelled = 0;
-pdu_free(cancel_pdu);
+if (!qemu_co_queue_next(&cancel_pdu->complete)) {
+cancel_pdu->cancelled = 0;
+pdu_free(cancel_pdu);
+}
 }
 pdu_complete(pdu, 7);
 }
-- 
2.7.4




[Qemu-devel] [PULL 0/2] 9pfs fixes for 2.9 2017-04-04

2017-04-04 Thread Greg Kurz
The following changes since commit 87cc4c61020addea6a001b94b662596b1896d1b3:

  Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging 
(2017-04-04 11:40:55 +0100)

are available in the git repository at:

  https://github.com/gkurz/qemu.git tags/for-upstream

for you to fetch changes up to 6d54af0ea970b4c0eb48bd2ae1d22b207dd4:

  9pfs: clear migration blocker at session reset (2017-04-04 18:06:01 +0200)


Some 9pfs bugs fixes: potential hang at reset, migration blocker leak.


Greg Kurz (2):
  9pfs: fix multiple flush for same request
  9pfs: clear migration blocker at session reset

 hw/9pfs/9p.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)
-- 
2.7.4




Re: [Qemu-devel] [PATCH RFC] hw/pvrdma: Proposal of a new pvrdma device

2017-04-04 Thread Jason Gunthorpe
On Tue, Apr 04, 2017 at 04:38:40PM +0300, Marcel Apfelbaum wrote:

> Here are some thoughts regarding the Soft RoCE usage in our project.
> We thought about using it as backend for QEMU pvrdma device
> we didn't how it will support our requirements.
> 
> 1. Does Soft RoCE support inter process (VM) fast path ? The KDBR
>removes the need for hw resources, emulated or not, concentrating
>on one copy from a VM to another.

I'd rather see someone optimize the loopback path of soft roce than
see KDBR :)

> 3. Our intention is for KDBR to be used in other contexts as well when we need
>inter VM data exchange, e.g. backend for virtio devices. We didn't see how 
> this
>kind of requirement can be implemented inside SoftRoce as we don't see any
>connection between them.

KDBR looks like weak RDMA to me, so it is reasonable question why not
use full RDMA with loopback optimization instead of creating something
unique.

IMHO, it also makes more sense for something like KDBR to live as a
RDMA transport, not as a unique char device, it is obviously very
RDMA-like.

.. and the char dev really can't be used when implementing user space
RDMA, that would just make a big mess..

> 4. We don't want all the VM memory to be pinned since it disable 
> memory-over-commit
>which in turn will make the pvrdma device useless.
>We weren't sure how nice would play Soft RoCE with memory pinning and we 
> wanted
>more control on memory management. It may be a solvable issue, but combined
>with the others lead us to our decision to come up with our kernel bridge 
> (char

soft roce certainly can be optimized to remove the page pin and always
run in an ODP-like mode.

But obviously if you connect pvrdma to real hardware then the page pin
comes back.

>device or not, we went for it since it was the easiest to
>implement for a POC)

I can see why it would be easy to implement, but not sure how this
really improves the kernel..

Jason



Re: [Qemu-devel] [PATCH v11 1/2] block/vxhs.c: Add support for a new block device type called "vxhs"

2017-04-04 Thread Daniel P. Berrange
On Mon, Apr 03, 2017 at 08:48:08PM -0700, Ashish Mittal wrote:
> Source code for the qnio library that this code loads can be downloaded from:
> https://github.com/VeritasHyperScale/libqnio.git
> 
> Sample command line using JSON syntax:
> ./x86_64-softmmu/qemu-system-x86_64 -name instance-0008 -S -vnc 0.0.0.0:0
> -k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
> -msg timestamp=on
> 'json:{"driver":"vxhs","vdisk-id":"c3e9095a-a5ee-4dce-afeb-2a59fb387410",
> "server":{"host":"172.172.17.4","port":""}}'
> 
> Sample command line using URI syntax:
> qemu-img convert -f raw -O raw -n
> /var/lib/nova/instances/_base/0c5eacd5ebea5ed914b6a3e7b18f1ce734c386ad
> vxhs://192.168.0.1:/c6718f6b-0401-441d-a8c3-1f0064d75ee0
> 
> Sample command line using TLS credentials (run in secure mode):
> ./qemu-io --object
> tls-creds-x509,id=tls0,dir=/etc/pki/qemu/vxhs,endpoint=client -c 'read
> -v 66000 2.5k' 'json:{"server.host": "127.0.0.1", "server.port": "",
> "vdisk-id": "/test.raw", "driver": "vxhs", "tls-creds":"tls0"}'
> 
> Signed-off-by: Ashish Mittal 

I've not reviewed the i/o related stuff, but from the POV of TLS cred
handling / QAPI / disk open, this all looks fine to me now.

  Reviewed-by: Daniel P. Berrange 

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|



Re: [Qemu-devel] [PATCH 5/5] cpu: throttle: fix throttle time slice

2017-04-04 Thread Dr. David Alan Gilbert
* Paolo Bonzini (pbonz...@redhat.com) wrote:
> 
> 
> > > So I'm inclined _not_ to take your patch.  One possibility could be to
> > > do the following:
> > > 
> > > - for throttling between 0% and 80%, use the current algorithm.  At 66%,
> > > the CPU will work for 10 ms and sleep for 40 ms.
> > > 
> > > - for throttling above 80% adapt your algorithm to have a variable
> > > timeslice, going from 50 ms at 66% to 100 ms at 100%.  This way, the CPU
> > > time will shrink below 10 ms and the sleep time will grow.
> 
> Oops, all the 66% should be 80%.
> 
> > It seems odd to have a threshold like that on something that's supposedly
> > a linear scale.
> 
> I futzed a bit with the threshold until the first derivative of the CPU
> time was zero at the threshold, and the result was 80%.  That is, if you
> switch before 80%, the CPU time slice can grow to more than 10 ms right
> after the threshold, and then start shrinking.
> 
> > > It looks like this: http://i.imgur.com/lyFie04.png
> > > 
> > > So at 99% the timeslice will be 97.5 ms; the CPU will work for 975 us
> > > and sleep for the rest (10x more than with just your patch).  But I'm
> > > not sure it's really worth it.
> > 
> > Can you really run a CPU for 975us ?
> 
> It's 2-3 million clock cycles, should be doable.

I wasn't really worrying about the CPU running, I was more worried
about timer resolution in stopping it.  If you're timer isn't that accurate
in starting/stopping the CPU then the errors might be so large as to
make that a bit odd.

Dave

> Paolo
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



[Qemu-devel] [PULL 0/1] pci: fix

2017-04-04 Thread Michael S. Tsirkin
The following changes since commit 87cc4c61020addea6a001b94b662596b1896d1b3:

  Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging 
(2017-04-04 11:40:55 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream

for you to fetch changes up to 193982c6f9424779b53a168fe32ebc30a776cbf1:

  pci: Only unmap bus_master_enabled_region if was added previously (2017-04-04 
18:32:25 +0300)


pci: fix

A single bugfix for a error handling issue in pci.

Signed-off-by: Michael S. Tsirkin 


Alexey Kardashevskiy (1):
  pci: Only unmap bus_master_enabled_region if was added previously

 hw/pci/pci.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)




Re: [Qemu-devel] [PATCH 0/7] Provide support for the software TPM emulator

2017-04-04 Thread Daniel P. Berrange
On Mon, Apr 03, 2017 at 05:18:37PM +, Marc-André Lureau wrote:
> Hi
> 
> On Mon, Apr 3, 2017 at 7:08 PM Daniel P. Berrange 
> wrote:
> 
> > On Fri, Mar 31, 2017 at 04:10:09PM +0300, Amarnath Valluri wrote:
> > > Briefly, Theses set of patches introduces:
> > >  - new TPM backend driver to support software TPM emulators(swtpm(1)).
> > >  - and few supported fixes/enhancements/cleanup to existing tpm backend
> > code.
> > >
> > > The similar idea was initiated earliar(2) by Stefan Berger(CCed) with
> > slightly
> > > different approach, using CUSE. As swtpm has excellent support for unix
> > domain
> > > sockets, hence this implementation uses unix domain sockets to
> > communicate with
> > > swtpm.
> > >
> > > When Qemu is configured with 'emulator' tpm backend, it spawns 'swtpm'
> > and
> > > communicates its via Unix domain sockets.
> >
> > I'm not convinced that having QEMU spawning swtpm itself is a desirable
> > approach, as it means QEMU needs to have all the privileges that swtpm
> > will need, so that swtpm can inherit them. At the very least I think we
> > need to have a way to disable this spawning, so it can connect to a
> > pre-existing swtpm process that's been spawned ahead of time. This will
> > let us have stricter privilege separation.
> >
> 
> Could the security contexts be given as arguments to qemu for the
> subprocesses? The reason to have qemu spawn its own subprocess is that it
> would leave more flexibility on how and when to do it, or even to use
> multiple subprcesses if some day that makes sense. If there is no reason to
> make this interface public or shared by various other processes, I would
> rather see it as internal to qemu rather than managed by other tools. It
> also makes command line & testing easier.

If QEMU is responsible for spawning it, then swtpm would have to run as the
same user/group as QEMU, as we do not want QEMU to have ability to invoke
programs with setuid. This means any state files / config / devices that
swtpm has to access would also be accessible by QEMU directly (assuming just
DAC controls, no MAC like SELinux). It also means that swtpm would be able
to access QEMU resources - so a bug in swtpm would allow swtpm to compromise
the guest disk image and other resources QEMU access.

If by contrast, libvirtd is responsible for spawning it, or an even a higher
level mgmt app like OpenStack/oVirt, then swtpm can be run with completely
isolated user / group privileges. The only resource that QEMU needs access
to is the UNIX domain socket, and swtpm doesn't need to access anything
belonging to QEMU. This will give us much stronger security separation
between swtpm & QEMU.  It will also allow us to write a more useful seccomp
policy that entirey blocks exec() from QEMU, further mitigating damage from
potential QEMU exploits. 

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|



[Qemu-devel] [RFC PATCH for-2.9] block: Ignore guest dev permissions during incoming migration

2017-04-04 Thread Kevin Wolf
Usually guest devices don't like other writers to the same image, so
they use blk_set_perm() to prevent this from happening. In the migration
phase before the VM is actually running, though, they don't have a
problem with writes to the image. On the other hand, storage migration
needs to be able to write to the image in this phase, so the restrictive
blk_set_perm() call of qdev devices breaks it.

This patch flags all BlockBackends with a qdev device as
blk->disable_perm during incoming migration, which means that the
requested permissions are stored in the BlockBackend, but not actually
applied to its root node yet.

Once migration has finished and the VM should be resumed, the
permissions are applied. If they cannot be applied (e.g. because the NBD
server used for block migration hasn't been shut down), resuming the VM
fails.

Signed-off-by: Kevin Wolf 
---
 block/block-backend.c | 40 +++-
 include/block/block.h |  2 ++
 migration/migration.c |  8 
 qmp.c |  6 ++
 4 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 0b63773..f817040 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -61,6 +61,7 @@ struct BlockBackend {
 
 uint64_t perm;
 uint64_t shared_perm;
+bool disable_perm;
 
 bool allow_write_beyond_eof;
 
@@ -578,7 +579,7 @@ int blk_set_perm(BlockBackend *blk, uint64_t perm, uint64_t 
shared_perm,
 {
 int ret;
 
-if (blk->root) {
+if (blk->root && !blk->disable_perm) {
 ret = bdrv_child_try_set_perm(blk->root, perm, shared_perm, errp);
 if (ret < 0) {
 return ret;
@@ -597,15 +598,52 @@ void blk_get_perm(BlockBackend *blk, uint64_t *perm, 
uint64_t *shared_perm)
 *shared_perm = blk->shared_perm;
 }
 
+/*
+ * Notifies the user of all BlockBackends that migration has completed. qdev
+ * devices can tighten their permissions in response (specifically revoke
+ * shared write permissions that we needed for storage migration).
+ *
+ * If an error is returned, the VM cannot be allowed to be resumed.
+ */
+void blk_resume_after_migration(Error **errp)
+{
+BlockBackend *blk;
+Error *local_err = NULL;
+
+for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
+if (!blk->disable_perm) {
+continue;
+}
+
+blk->disable_perm = false;
+
+blk_set_perm(blk, blk->perm, blk->shared_perm, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+blk->disable_perm = true;
+return;
+}
+}
+}
+
 static int blk_do_attach_dev(BlockBackend *blk, void *dev)
 {
 if (blk->dev) {
 return -EBUSY;
 }
+
+/* While migration is still incoming, we don't need to apply the
+ * permissions of guest device BlockBackends. We might still have a block
+ * job or NBD server writing to the image for storage migration. */
+if (runstate_check(RUN_STATE_INMIGRATE)) {
+blk->disable_perm = true;
+}
+
 blk_ref(blk);
 blk->dev = dev;
 blk->legacy_dev = false;
 blk_iostatus_reset(blk);
+
 return 0;
 }
 
diff --git a/include/block/block.h b/include/block/block.h
index 5149260..3e09222 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -366,6 +366,8 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error 
**errp);
 void bdrv_invalidate_cache_all(Error **errp);
 int bdrv_inactivate_all(void);
 
+void blk_resume_after_migration(Error **errp);
+
 /* Ensure contents are flushed to disk.  */
 int bdrv_flush(BlockDriverState *bs);
 int coroutine_fn bdrv_co_flush(BlockDriverState *bs);
diff --git a/migration/migration.c b/migration/migration.c
index 54060f7..ad4036f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -349,6 +349,14 @@ static void process_incoming_migration_bh(void *opaque)
 exit(EXIT_FAILURE);
 }
 
+/* If we get an error here, just don't restart the VM yet. */
+blk_resume_after_migration(&local_err);
+if (local_err) {
+error_free(local_err);
+local_err = NULL;
+autostart = false;
+}
+
 /*
  * This must happen after all error conditions are dealt with and
  * we're sure the VM is going to be running on this host.
diff --git a/qmp.c b/qmp.c
index fa82b59..a744e44 100644
--- a/qmp.c
+++ b/qmp.c
@@ -207,6 +207,12 @@ void qmp_cont(Error **errp)
 }
 }
 
+blk_resume_after_migration(&local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+
 if (runstate_check(RUN_STATE_INMIGRATE)) {
 autostart = 1;
 } else {
-- 
1.8.3.1




[Qemu-devel] [PULL 1/1] pci: Only unmap bus_master_enabled_region if was added previously

2017-04-04 Thread Michael S. Tsirkin
From: Alexey Kardashevskiy 

Normally pci_init_bus_master() would be called either via
bus->machine_done.notify or directly from do_pci_register_device().

However if a device's realize() failed, pci_init_bus_master() is not
called, and do_pci_unregister_device() fails on
memory_region_del_subregion() as it was not mapped.

This adds a check that subregion was mapped before unmapping it.

Fixes: c53598ed18e4 ("pci: Add missing drop of bus master AS reference")
Signed-off-by: Alexey Kardashevskiy 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Max Reitz 
Reviewed-by: Marcel Apfelbaum 
Acked-by: Paolo Bonzini 
Tested-by: John Snow 
---
 hw/pci/pci.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index bd8043c..259483b 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -869,8 +869,10 @@ static void do_pci_unregister_device(PCIDevice *pci_dev)
 pci_dev->bus->devices[pci_dev->devfn] = NULL;
 pci_config_free(pci_dev);
 
-memory_region_del_subregion(&pci_dev->bus_master_container_region,
-&pci_dev->bus_master_enable_region);
+if (memory_region_is_mapped(&pci_dev->bus_master_enable_region)) {
+memory_region_del_subregion(&pci_dev->bus_master_container_region,
+&pci_dev->bus_master_enable_region);
+}
 address_space_destroy(&pci_dev->bus_master_as);
 }
 
-- 
MST




Re: [Qemu-devel] PCI regression in 2.9

2017-04-04 Thread Michael S. Tsirkin
On Tue, Apr 04, 2017 at 11:30:13AM -0400, Jeff Cody wrote:
> 
> I ran into this while running qemu iotests, and it looks like it is a
> regression from 2.8.
> 
> Here is the reproducer:
> 
> $ ./x86_64-softmmu/qemu-system-x86_64 -drive if=virtio
> qemu-system-x86_64: qemu-kvm/memory.c:2078: memory_region_del_subregion: 
> Assertion `subregion->container == mr' failed
> Aborted (core dumped)
> 
> The proper output is (with no abort):
> qemu-system-x86_64: -drive if=virtio: Device needs media, but drive is empty
> 
> I bisected it down to this commit:
> 
> commit c53598ed18e40a9609573b21f2a361221ca0f806
> Author: Alexey Kardashevskiy 
> Date:   Mon Mar 27 15:40:30 2017 +1100
> 
> pci: Add missing drop of bus master AS reference
> 
> The recent introduction of a bus master container added
> memory_region_add_subregion() into the PCI device registering path but
> missed memory_region_del_subregion() in the unregistering path leaving
> a reference to the root memory region of the new container.
> 
> This adds missing memory_region_del_subregion().
> 
> Fixes: 3716d5902d743 ("pci: introduce a bus master container")
> Signed-off-by: Alexey Kardashevskiy 
> Reviewed-by: Michael S. Tsirkin 
> Signed-off-by: Michael S. Tsirkin 
> Reviewed-by: Paolo Bonzini 


Thanks for the report. I'm preparing a pull request with a fix,
will copy you so you can test.

Thanks!

-- 
MST



[Qemu-devel] PCI regression in 2.9

2017-04-04 Thread Jeff Cody

I ran into this while running qemu iotests, and it looks like it is a
regression from 2.8.

Here is the reproducer:

$ ./x86_64-softmmu/qemu-system-x86_64 -drive if=virtio
qemu-system-x86_64: qemu-kvm/memory.c:2078: memory_region_del_subregion: 
Assertion `subregion->container == mr' failed
Aborted (core dumped)

The proper output is (with no abort):
qemu-system-x86_64: -drive if=virtio: Device needs media, but drive is empty

I bisected it down to this commit:

commit c53598ed18e40a9609573b21f2a361221ca0f806
Author: Alexey Kardashevskiy 
Date:   Mon Mar 27 15:40:30 2017 +1100

pci: Add missing drop of bus master AS reference

The recent introduction of a bus master container added
memory_region_add_subregion() into the PCI device registering path but
missed memory_region_del_subregion() in the unregistering path leaving
a reference to the root memory region of the new container.

This adds missing memory_region_del_subregion().

Fixes: 3716d5902d743 ("pci: introduce a bus master container")
Signed-off-by: Alexey Kardashevskiy 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Paolo Bonzini 




  1   2   >