Re: [Qemu-devel] [RFC] virtio-pci: Allow PCIe virtio devices on root bus

2017-02-15 Thread Gerd Hoffmann
On Di, 2017-02-14 at 14:53 +0200, Marcel Apfelbaum wrote:
> I suppose XHCI can behave the same as virtio if Gerd has nothing
> against it

No objections.

cheers,
  Gerd




[Qemu-devel] Reply: Re: [RFC] virtio-fc: draft idea of virtual fibre channel HBA

2017-02-15 Thread Lin Ma


>>> Stefan Hajnoczi  2/15/2017 11:33 PM >>>
>On Wed, Feb 15, 2017 at 12:15:02AM -0700, Lin Ma wrote:
>> Hi all,
>>  
>> We know that libvirt can create Fibre Channel vHBAs on the host
>> based on NPIV, and present the LUNs to the guest.
>>
>> I'd like to implement a virtual Fibre Channel HBA for QEMU,
>> though I haven't investigated it deeply yet. The idea is to present
>> an FC vHBA in the guest that interacts with the remote FC switch
>> through NPIV, so that the LUNs are recognized inside the guest. I'm
>> sending this email to see whether you agree with this approach, and
>> in the hope of getting some ideas/suggestions.
>>
>> The frontend is based on virtio, say virtio-fc-pci; the backend
>> is based on NPIV of the physical FC HBA on the host.
>> The implementation of this virtual FC HBA doesn't support FC-AL,
>> only Fabric. It wraps SCSI data into FC frames, then forwards
>> them to the backend, so it is something like SCSI over FC.
>> (Maybe I can reuse some of the virtio-scsi code/ideas to deal with SCSI data.)
>>  
>> The minimum invocation may look like:
>> qemu-system-x86_64 \
>> .. \
>> -object fibrechannel-backend,id=fcdev0,host=:81:00.0 \
>> -device virtio-fc-pci,id=vfc0,fc_backend=fcdev0,wwpn=1001,wwnn=1101 \
>> ..
>>  
>> BTW, I have no idea how to make migration work:
>> How to deal with the BDF during migration?
>> How to deal with the Fabric ID during migration?
>>
>> It's a draft idea. There is a lot of related code I need to
>> investigate; these are all the thoughts I have so far.
>>
>> Hello Paolo and Stefan, you are the authors of virtio-scsi,
>> and had some in-depth discussions about virtio-scsi in 2011
>> with Hannes. May I have your ideas/thoughts?
>
>I'm not sure it's necessary for the guest to have FC access.  Fam Zheng
>and Paolo are working on virtio-scsi FC NPIV enhancements.  It should
>make NPIV work better without adding a whole new FC device.
>
>https://lkml.org/lkml/2017/1/16/439
>
>The plan is:
>
>1. libvirt listens to udev events on the host so it can add/remove
>   QEMU SCSI LUNs.
>
>2. virtio-scsi is extended to include a WWPN that the guest can see.
>
>The guest doesn't do any FC fabric level stuff, it just does virtio-scsi
>as usual.
>
>It supports live migration by swapping between a pair of WWPNs across
>migration.

OK, thanks for the information. This approach still makes QEMU act as a
SCSI target. I'm not sure which approach makes more sense, but it does
solve the live migration case.

>What are the benefits of having FC access from the guest?

Actually, I haven't dug into it much. I just thought that, from a
virtualization perspective, complete FC access from the guest is the
natural model when interacting with FC storage.

Lin


Re: [Qemu-devel] [PATCH 14/17] qmp: add x-debug-block-dirty-bitmap-sha256

2017-02-15 Thread Vladimir Sementsov-Ogievskiy

16.02.2017 03:35, John Snow wrote:


On 02/13/2017 04:54 AM, Vladimir Sementsov-Ogievskiy wrote:

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 

This is simply the same as the version in the other two series, right?


Yes. The context differs a bit... Aha, I've discovered that in the
migration series I'm adding bdrv_next_dirty_bitmap and in the persistent
series bdrv_dirty_bitmap_next. Anyway, one series should be rebased after
the other is applied.




Reviewed-by: John Snow 


---
  block/dirty-bitmap.c |  5 +
  blockdev.c   | 29 +
  include/block/dirty-bitmap.h |  2 ++
  include/qemu/hbitmap.h   |  8 
  qapi/block-core.json | 27 +++
  tests/Makefile.include   |  2 +-
  util/hbitmap.c   | 11 +++
  7 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 32aa6eb..5bec99b 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -558,3 +558,8 @@ BdrvDirtyBitmap *bdrv_next_dirty_bitmap(BlockDriverState *bs,
  
  return QLIST_NEXT(bitmap, list);

  }
+
+char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap *bitmap, Error **errp)
+{
+return hbitmap_sha256(bitmap->bitmap, errp);
+}
diff --git a/blockdev.c b/blockdev.c
index db82ac9..4d06885 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2790,6 +2790,35 @@ void qmp_block_dirty_bitmap_clear(const char *node, const char *name,
  aio_context_release(aio_context);
  }
  
+BlockDirtyBitmapSha256 *qmp_x_debug_block_dirty_bitmap_sha256(const char *node,
+  const char *name,
+  Error **errp)
+{
+AioContext *aio_context;
+BdrvDirtyBitmap *bitmap;
+BlockDriverState *bs;
+BlockDirtyBitmapSha256 *ret = NULL;
+char *sha256;
+
+bitmap = block_dirty_bitmap_lookup(node, name, &bs, &aio_context, errp);
+if (!bitmap || !bs) {
+return NULL;
+}
+
+sha256 = bdrv_dirty_bitmap_sha256(bitmap, errp);
+if (sha256 == NULL) {
+goto out;
+}
+
+ret = g_new(BlockDirtyBitmapSha256, 1);
+ret->sha256 = sha256;
+
+out:
+aio_context_release(aio_context);
+
+return ret;
+}
+
  void hmp_drive_del(Monitor *mon, const QDict *qdict)
  {
  const char *id = qdict_get_str(qdict, "id");
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 20b3ec7..ded872a 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -78,4 +78,6 @@ void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap *bitmap);
  BdrvDirtyBitmap *bdrv_next_dirty_bitmap(BlockDriverState *bs,
  BdrvDirtyBitmap *bitmap);
  
+char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap *bitmap, Error **errp);

+
  #endif
diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index 9239fe5..f353e56 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -238,6 +238,14 @@ void hbitmap_deserialize_zeroes(HBitmap *hb, uint64_t start, uint64_t count,
  void hbitmap_deserialize_finish(HBitmap *hb);
  
  /**

+ * hbitmap_sha256:
+ * @bitmap: HBitmap to operate on.
+ *
+ * Returns SHA256 hash of the last level.
+ */
+char *hbitmap_sha256(const HBitmap *bitmap, Error **errp);
+
+/**
   * hbitmap_free:
   * @hb: HBitmap to operate on.
   *
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 932f5bb..8646054 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1632,6 +1632,33 @@
'data': 'BlockDirtyBitmap' }
  
  ##

+# @BlockDirtyBitmapSha256:
+#
+# SHA256 hash of dirty bitmap data
+#
+# @sha256: ASCII representation of SHA256 bitmap hash
+#
+# Since: 2.9
+##
+  { 'struct': 'BlockDirtyBitmapSha256',
+'data': {'sha256': 'str'} }
+
+##
+# @x-debug-block-dirty-bitmap-sha256:
+#
+# Get bitmap SHA256
+#
+# Returns: BlockDirtyBitmapSha256 on success
+#  If @node is not a valid block device, DeviceNotFound
+#  If @name is not found or if hashing has failed, GenericError with an
+#  explanation
+#
+# Since: 2.9
+##
+  { 'command': 'x-debug-block-dirty-bitmap-sha256',
+'data': 'BlockDirtyBitmap', 'returns': 'BlockDirtyBitmapSha256' }
+
+##
  # @blockdev-mirror:
  #
  # Start mirroring a block device's writes to a new destination.
diff --git a/tests/Makefile.include b/tests/Makefile.include
index 634394a..7a71b4d 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -526,7 +526,7 @@ tests/test-blockjob$(EXESUF): tests/test-blockjob.o 
$(test-block-obj-y) $(test-u
  tests/test-blockjob-txn$(EXESUF): tests/test-blockjob-txn.o 
$(test-block-obj-y) $(test-util-obj-y)
  tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(test-block-obj-y)
  tests/test-iov$(EXESUF): tests/test-iov.o $(test-util-obj-y)
-tests/test-hbitmap$(EXESUF): tests/test-hbitmap.o $(test-util-obj-y)
+tests/test-hbitmap$(EXESUF):

[Qemu-devel] [PATCH v7 5/8] qmp/hmp: add query-vm-generation-id and 'info vm-generation-id' commands

2017-02-15 Thread ben
From: Igor Mammedov 

Add commands to query Virtual Machine Generation ID counter.

QMP command example:
{ "execute": "query-vm-generation-id" }

HMP command example:
info vm-generation-id

Signed-off-by: Igor Mammedov 
Reviewed-by: Eric Blake 
Signed-off-by: Ben Warren 
Reviewed-by: Laszlo Ersek 
---
 hmp-commands-info.hx | 14 ++
 hmp.c|  9 +
 hmp.h|  1 +
 hw/acpi/vmgenid.c| 16 
 qapi-schema.json | 20 
 stubs/Makefile.objs  |  1 +
 stubs/vmgenid.c  |  9 +
 7 files changed, 70 insertions(+)
 create mode 100644 stubs/vmgenid.c

diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index b0f35e6..a53f105 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -802,6 +802,20 @@ Show information about hotpluggable CPUs
 ETEXI
 
 STEXI
+@item info vm-generation-id
+@findex vm-generation-id
+Show Virtual Machine Generation ID
+ETEXI
+
+{
+.name   = "vm-generation-id",
+.args_type  = "",
+.params = "",
+.help   = "Show Virtual Machine Generation ID",
+.cmd = hmp_info_vm_generation_id,
+},
+
+STEXI
 @end table
 ETEXI
 
diff --git a/hmp.c b/hmp.c
index 2bc4f06..535613d 100644
--- a/hmp.c
+++ b/hmp.c
@@ -2565,3 +2565,12 @@ void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict)
 
 qapi_free_HotpluggableCPUList(saved);
 }
+
+void hmp_info_vm_generation_id(Monitor *mon, const QDict *qdict)
+{
+GuidInfo *info = qmp_query_vm_generation_id(NULL);
+if (info) {
+monitor_printf(mon, "%s\n", info->guid);
+}
+qapi_free_GuidInfo(info);
+}
diff --git a/hmp.h b/hmp.h
index 05daf7c..799fd37 100644
--- a/hmp.h
+++ b/hmp.h
@@ -137,5 +137,6 @@ void hmp_rocker_of_dpa_flows(Monitor *mon, const QDict *qdict);
 void hmp_rocker_of_dpa_groups(Monitor *mon, const QDict *qdict);
 void hmp_info_dump(Monitor *mon, const QDict *qdict);
 void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict);
+void hmp_info_vm_generation_id(Monitor *mon, const QDict *qdict);
 
 #endif
diff --git a/hw/acpi/vmgenid.c b/hw/acpi/vmgenid.c
index 8fba7e0..9f97b72 100644
--- a/hw/acpi/vmgenid.c
+++ b/hw/acpi/vmgenid.c
@@ -237,3 +237,19 @@ static void vmgenid_register_types(void)
 }
 
 type_init(vmgenid_register_types)
+
+GuidInfo *qmp_query_vm_generation_id(Error **errp)
+{
+GuidInfo *info;
+VmGenIdState *vms;
+Object *obj = find_vmgenid_dev();
+
+if (!obj) {
+return NULL;
+}
+vms = VMGENID(obj);
+
+info = g_malloc0(sizeof(*info));
+info->guid = qemu_uuid_unparse_strdup(&vms->guid);
+return info;
+}
diff --git a/qapi-schema.json b/qapi-schema.json
index 5edb08d..396e49c 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -6056,3 +6056,23 @@
 #
 ##
 { 'command': 'query-hotpluggable-cpus', 'returns': ['HotpluggableCPU'] }
+
+##
+# @GuidInfo:
+#
+# GUID information.
+#
+# @guid: the globally unique identifier
+#
+# Since: 2.9
+##
+{ 'struct': 'GuidInfo', 'data': {'guid': 'str'} }
+
+##
+# @query-vm-generation-id:
+#
+# Show Virtual Machine Generation ID
+#
+# Since: 2.9
+##
+{ 'command': 'query-vm-generation-id', 'returns': 'GuidInfo' }
diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
index a187295..0bffca6 100644
--- a/stubs/Makefile.objs
+++ b/stubs/Makefile.objs
@@ -35,3 +35,4 @@ stub-obj-y += qmp_pc_dimm_device_list.o
 stub-obj-y += target-monitor-defs.o
 stub-obj-y += target-get-monitor-def.o
 stub-obj-y += pc_madt_cpu_entry.o
+stub-obj-y += vmgenid.o
diff --git a/stubs/vmgenid.c b/stubs/vmgenid.c
new file mode 100644
index 000..c64eb7a
--- /dev/null
+++ b/stubs/vmgenid.c
@@ -0,0 +1,9 @@
+#include "qemu/osdep.h"
+#include "qmp-commands.h"
+#include "qapi/qmp/qerror.h"
+
+GuidInfo *qmp_query_vm_generation_id(Error **errp)
+{
+error_setg(errp, QERR_UNSUPPORTED);
+return NULL;
+}
-- 
2.7.4




Re: [Qemu-devel] [Help] Windows2012 as Guest 64+cores on KVM Halts

2017-02-15 Thread Vadim Rozenfeld
On Thu, 2017-02-16 at 01:31 +, Gonglei (Arei) wrote:
> Hi,
> 
> > 
> > 
> > On Sat, 2017-02-11 at 10:39 -0500, Paolo Bonzini wrote:
> > > 
> > > > 
> > > > 
> > > > 
> > > > > 
> > > > > 
> > > > > On 10/02/2017 10:31, Gonglei (Arei) wrote:
> > > > > > 
> > > > > > 
> > > > > > But we tested the same cases on the Xen platform and on
> > > > > > VMware, and the guest booted successfully.
> > > > > 
> > > > > Were these two also tested with enlightenments enabled?  TCG
> > > > > surely isn't.
> > > > 
> > > > About TCG, I just removed 'accel=kvm,' and 'hv_relaxed' from the
> > > > QEMU command line below; I thought Hyper-V was enabled then.
> > > > Sorry about that.
> > > > 
> > > > But for Xen, we set 'viridian=1', which is understood to enable
> > > > the Hyper-V enlightenments.
> > > > 
> > > > For VMware we also enabled the Hyper-V enlightenments.
> > If I'm not mistaken, even Hyper-V server doesn't allow specifying more
> > than 64 vCPUs for Generation 1 VMs.
> 
> Normally yes, but I found an explanation in a Microsoft document
> about it:
> 
> Maximum Supported Virtual Processors
> 
> On Windows operating systems versions through Windows Server 2008 R2,
> reporting the HV#1 hypervisor interface limits the Windows virtual
> machine to a maximum of 64 VPs, regardless of what is reported via
> CPUID.40000005.EAX.
> Starting with Windows Server 2012 and Windows 8, if CPUID.40000005.EAX
> contains a value of -1, Windows assumes that the hypervisor imposes
> no specific limit to the number of VPs. In this case, Windows Server
> 2012 guest VMs may use more than 64 VPs, up to the maximum supported
> number of processors applicable to the specific Windows version being
> used.
> 
> Link: https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs
> 
> "Requirements for Implementing the Microsoft Hypervisor Interface"
> 
> And the patch below works for me; I can support up to 255 vCPUs for
> WS2012 with Hyper-V enlightenments.
> 
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index 27fd050..efe3cbc 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -772,7 +772,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
> 
>  c = &cpuid_data.entries[cpuid_i++];
>  c->function = HYPERV_CPUID_IMPLEMENT_LIMITS;
> -c->eax = 0x40;
> +c->eax = -1;
>  c->ebx = 0x40;
> 
>  kvm_base = KVM_CPUID_SIGNATURE_NEXT;
> 

Nice.
I tried the following patch some time ago. Unfortunately it didn't work
for me for some reason:

@@ -772,8 +773,9 @@ int kvm_arch_init_vcpu(CPUState *cs)
 
 c = &cpuid_data.entries[cpuid_i++];
 c->function = HYPERV_CPUID_IMPLEMENT_LIMITS;
-c->eax = 0x40;
-c->ebx = 0x40;
+c->eax = 0x00f0;//0x40;
+c->ebx = 0x0200;//0x40;
+c->ecx = 0x0648;

I used the same numbers as provided by WS2016 for both Gen1 and Gen2
VMs.

> > 
> > > > 
> > In any case, if you are only interested in hv_relaxed, you can drop
> > it
> > off for WS2012 as long as you have cpu hypervisor flag
> > (CPUID.1:ECX [bit 31]=1) turned on.
> > 
> hv_relaxed is just an example of enabling Hyper-V enlightenments.
> 
> Thanks,
> -Gonglei
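
The CPUID rule quoted from the TLFS in this message can be sketched in C as follows. This is only an illustration of the quoted text, not Windows' or QEMU's actual logic: the helper name guest_max_vps() and the os_max parameter are hypothetical, and the fallback to 64 VPs for every EAX value other than -1 is an assumption, since the quoted passage does not say how other values are treated.

```c
#include <assert.h>
#include <stdint.h>

/* Cap that applies when the HV#1 interface is reported (quoted TLFS text). */
#define HV_LEGACY_MAX_VPS 64u
/* "-1" as seen in the 32-bit EAX register. */
#define HV_NO_VP_LIMIT UINT32_MAX

/*
 * Hypothetical helper: number of virtual processors a Windows guest
 * would be willing to use, given CPUID.40000005.EAX.
 * os_max is the processor limit of the specific Windows version and
 * must be supplied by the caller.
 */
static uint32_t guest_max_vps(uint32_t eax, int ws2012_or_later,
                              uint32_t os_max)
{
    if (ws2012_or_later && eax == HV_NO_VP_LIMIT) {
        /* The hypervisor imposes no specific limit. */
        return os_max;
    }
    /* Assumption: every other case falls back to the legacy 64-VP cap. */
    return os_max < HV_LEGACY_MAX_VPS ? os_max : HV_LEGACY_MAX_VPS;
}
```

Note that assigning -1 to a 32-bit unsigned register field, as the first patch in this message does, yields 0xFFFFFFFF, which is the value compared against here.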



[Qemu-devel] [PATCH v7 2/8] docs: VM Generation ID device description

2017-02-15 Thread ben
From: Ben Warren 

This patch is based off an earlier version by
Gal Hammer (gham...@redhat.com)

Requirements section, ASCII diagrams and overall help
provided by Laszlo Ersek (ler...@redhat.com)

Signed-off-by: Gal Hammer 
Signed-off-by: Ben Warren 
Reviewed-by: Laszlo Ersek 
Reviewed-by: Igor Mammedov 
---
 docs/specs/vmgenid.txt | 245 +
 1 file changed, 245 insertions(+)
 create mode 100644 docs/specs/vmgenid.txt

diff --git a/docs/specs/vmgenid.txt b/docs/specs/vmgenid.txt
new file mode 100644
index 000..aa9f518
--- /dev/null
+++ b/docs/specs/vmgenid.txt
@@ -0,0 +1,245 @@
+VIRTUAL MACHINE GENERATION ID
+=============================
+
+Copyright (C) 2016 Red Hat, Inc.
+Copyright (C) 2017 Skyport Systems, Inc.
+
+This work is licensed under the terms of the GNU GPL, version 2 or later.
+See the COPYING file in the top-level directory.
+
+===
+
+The VM generation ID (vmgenid) device is an emulated device which
+exposes a 128-bit, cryptographically random, integer value identifier,
+referred to as a Globally Unique Identifier, or GUID.
+
+This allows management applications (e.g. libvirt) to notify the guest
+operating system when the virtual machine is executed with a different
+configuration (e.g. snapshot execution or creation from a template).  The
+guest operating system notices the change, and is then able to react as
+appropriate by marking its copies of distributed databases as dirty,
+re-initializing its random number generator etc.
+
+
+Requirements
+------------
+
+These requirements are extracted from the "How to implement virtual machine
+generation ID support in a virtualization platform" section of the
+specification, dated August 1, 2012.
+
+
+The document may be found on the web at:
+  http://go.microsoft.com/fwlink/?LinkId=260709
+
+R1a. The generation ID shall live in an 8-byte aligned buffer.
+
+R1b. The buffer holding the generation ID shall be in guest RAM, ROM, or device
+ MMIO range.
+
+R1c. The buffer holding the generation ID shall be kept separate from areas
+ used by the operating system.
+
+R1d. The buffer shall not be covered by an AddressRangeMemory or
+ AddressRangeACPI entry in the E820 or UEFI memory map.
+
+R1e. The generation ID shall not live in a page frame that could be mapped with
+ caching disabled. (In other words, regardless of whether the generation ID
+ lives in RAM, ROM or MMIO, it shall only be mapped as cacheable.)
+
+R2 to R5. [These AML requirements are isolated well enough in the Microsoft
+  specification for us to simply refer to them here.]
+
+R6. The hypervisor shall expose a _HID (hardware identifier) object in the
+VMGenId device's scope that is unique to the hypervisor vendor.
+
+
+QEMU Implementation
+-------------------
+
+The above-mentioned specification does not dictate which ACPI descriptor table
+will contain the VM Generation ID device.  Other implementations (Hyper-V and
+Xen) put it in the main descriptor table (Differentiated System Description
+Table or DSDT).  For ease of debugging and implementation, we have decided to
+put it in its own Secondary System Description Table, or SSDT.
+
+The following is a dump of the contents from a running system:
+
+# iasl -p ./SSDT -d /sys/firmware/acpi/tables/SSDT
+
+Intel ACPI Component Architecture
+ASL+ Optimizing Compiler version 20150717-64
+Copyright (c) 2000 - 2015 Intel Corporation
+
+Reading ACPI table from file /sys/firmware/acpi/tables/SSDT - Length
+0198 (0xC6)
+ACPI: SSDT 0x C6 (v01 BOCHS  VMGENID  0001 BXPC
+0001)
+Acpi table [SSDT] successfully installed and loaded
+Pass 1 parse of [SSDT]
+Pass 2 parse of [SSDT]
+Parsing Deferred Opcodes (Methods/Buffers/Packages/Regions)
+
+Parsing completed
+Disassembly completed
+ASL Output:./SSDT.dsl - 1631 bytes
+# cat SSDT.dsl
+/*
+ * Intel ACPI Component Architecture
+ * AML/ASL+ Disassembler version 20150717-64
+ * Copyright (c) 2000 - 2015 Intel Corporation
+ *
+ * Disassembling to symbolic ASL+ operators
+ *
+ * Disassembly of /sys/firmware/acpi/tables/SSDT, Sun Feb  5 00:19:37 2017
+ *
+ * Original Table Header:
+ * Signature"SSDT"
+ * Length   0x00CA (202)
+ * Revision 0x01
+ * Checksum 0x4B
+ * OEM ID   "BOCHS "
+ * OEM Table ID "VMGENID"
+ * OEM Revision 0x0001 (1)
+ * Compiler ID  "BXPC"
+ * Compiler Version 0x0001 (1)
+ */
+DefinitionBlock ("/sys/firmware/acpi/tables/SSDT.aml", "SSDT", 1, "BOCHS ",
+"VMGENID", 0x0001)
+{
+Name (VGIA, 0x07FFF000)
+Scope (\_SB)
+{
+Device (VGEN)
+{
+Name (_HID, "QEMUVGID")  // _HID: Hardware ID
+Name (_CID, "VM_Gen_Counter")  // _CID: Compatible ID
+Name (_DDN, "VM_Gen_Counter")  // _DDN: DOS Device Name
+Method (_STA, 0, NotSerialized)  // _STA: Status
+{
+Local0 = 0x0F
+ 
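
The guest-side reaction described at the top of this document (notice that the generation ID changed, then re-initialize state such as the random number generator) might look roughly like the C sketch below. All type and function names are hypothetical, and the reseed counter only stands in for whatever work a real guest would do.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* The generation ID is a 128-bit value. */
typedef struct {
    uint8_t bytes[16];
} VmGenId;

/* Hypothetical guest bookkeeping. */
typedef struct {
    VmGenId last_seen;      /* ID observed at the previous check */
    unsigned reseed_count;  /* stands in for "reseed RNG, mark DB copies dirty" */
} GuestState;

/*
 * Compare the ID currently exposed by the device with the one seen
 * last time.  Returns true if it changed, in which case the guest
 * records the new ID and reacts.
 */
static bool vmgenid_check(GuestState *gs, const VmGenId *current)
{
    if (memcmp(gs->last_seen.bytes, current->bytes,
               sizeof(current->bytes)) == 0) {
        return false;
    }
    gs->last_seen = *current;
    gs->reseed_count++;
    return true;
}
```

A guest would typically run such a check when it receives the ACPI notification for the device, rather than polling.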

[Qemu-devel] [PATCH v7 7/8] tests: Add unit tests for the VM Generation ID feature

2017-02-15 Thread ben
From: Ben Warren 

The following tests are implemented:
* test that a GUID passed in by command line is propagated to the guest.
  Read the GUID both from guest memory and from the monitor
* test that the "auto" argument to the GUID generates a valid GUID, as
  seen by the guest.

  This patch is loosely based on a previous patch from:
  Gal Hammer   and Igor Mammedov 

Signed-off-by: Ben Warren 
---
 tests/Makefile.include |   2 +
 tests/vmgenid-test.c   | 174 +
 2 files changed, 176 insertions(+)
 create mode 100644 tests/vmgenid-test.c

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 143507e..8d36341 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -241,6 +241,7 @@ check-qtest-i386-y += tests/usb-hcd-xhci-test$(EXESUF)
 gcov-files-i386-y += hw/usb/hcd-xhci.c
 check-qtest-i386-y += tests/pc-cpu-test$(EXESUF)
 check-qtest-i386-y += tests/q35-test$(EXESUF)
+check-qtest-i386-y += tests/vmgenid-test$(EXESUF)
 gcov-files-i386-y += hw/pci-host/q35.c
 check-qtest-i386-$(CONFIG_VHOST_NET_TEST_i386) += tests/vhost-user-test$(EXESUF)
 ifeq ($(CONFIG_VHOST_NET_TEST_i386),)
@@ -726,6 +727,7 @@ tests/ivshmem-test$(EXESUF): tests/ivshmem-test.o 
contrib/ivshmem-server/ivshmem
 tests/vhost-user-bridge$(EXESUF): tests/vhost-user-bridge.o 
contrib/libvhost-user/libvhost-user.o $(test-util-obj-y)
 tests/test-uuid$(EXESUF): tests/test-uuid.o $(test-util-obj-y)
 tests/test-arm-mptimer$(EXESUF): tests/test-arm-mptimer.o
+tests/vmgenid-test$(EXESUF): tests/vmgenid-test.o tests/acpi-utils.o
 
 tests/migration/stress$(EXESUF): tests/migration/stress.o
$(call quiet-command, $(LINKPROG) -static -O3 $(PTHREAD_LIB) -o $@ $< 
,"LINK","$(TARGET_DIR)$@")
diff --git a/tests/vmgenid-test.c b/tests/vmgenid-test.c
new file mode 100644
index 000..1741455
--- /dev/null
+++ b/tests/vmgenid-test.c
@@ -0,0 +1,174 @@
+/*
+ * QTest testcase for VM Generation ID
+ *
+ * Copyright (c) 2016 Red Hat, Inc.
+ * Copyright (c) 2017 Skyport Systems
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include 
+#include 
+#include 
+#include "qemu/osdep.h"
+#include "qemu/bitmap.h"
+#include "qemu/uuid.h"
+#include "hw/acpi/acpi-defs.h"
+#include "acpi-utils.h"
+#include "libqtest.h"
+
+#define VGID_GUID "324e6eaf-d1d1-4bf6-bf41-b9bb6c91fb87"
+#define VMGENID_GUID_OFFSET  40   /* allow space for
+   * OVMF SDT Header Probe Suppressor
+   */
+
+typedef struct {
+AcpiTableHeader header;
+gchar name_op;
+gchar vgia[4];
+gchar val_op;
+uint32_t vgia_val;
+} QEMU_PACKED VgidTable;
+
+static uint32_t acpi_find_vgia(void)
+{
+uint32_t off;
+AcpiRsdpDescriptor rsdp_table;
+uint32_t rsdt;
+AcpiRsdtDescriptorRev1 rsdt_table;
+int tables_nr;
+uint32_t *tables;
+AcpiTableHeader ssdt_table;
+VgidTable vgid_table;
+int i;
+
+off = acpi_find_rsdp_address();
+g_assert_cmphex(off, <, 0x10);
+
+acpi_parse_rsdp_table(off, &rsdp_table);
+
+rsdt = rsdp_table.rsdt_physical_address;
+/* read the header */
+ACPI_READ_TABLE_HEADER(&rsdt_table, rsdt);
+ACPI_ASSERT_CMP(rsdt_table.signature, "RSDT");
+
+/* compute the table entries in rsdt */
+tables_nr = (rsdt_table.length - sizeof(AcpiRsdtDescriptorRev1)) /
+sizeof(uint32_t);
+g_assert_cmpint(tables_nr, >, 0);
+
+/* get the addresses of the tables pointed by rsdt */
+tables = g_new0(uint32_t, tables_nr);
+ACPI_READ_ARRAY_PTR(tables, tables_nr, rsdt);
+
+for (i = 0; i < tables_nr; i++) {
+ACPI_READ_TABLE_HEADER(&ssdt_table, tables[i]);
+if (!strncmp((char *)ssdt_table.oem_table_id, "VMGENID", 7)) {
+/* the first entry in the table should be VGIA
+ * That's all we need
+ */
+ACPI_READ_FIELD(vgid_table.name_op, tables[i]);
+g_assert(vgid_table.name_op == 0x08);  /* name */
+ACPI_READ_ARRAY(vgid_table.vgia, tables[i]);
+g_assert(memcmp(vgid_table.vgia, "VGIA", 4) == 0);
+ACPI_READ_FIELD(vgid_table.val_op, tables[i]);
+g_assert(vgid_table.val_op == 0x0C);  /* dword */
+ACPI_READ_FIELD(vgid_table.vgia_val, tables[i]);
+/* The GUID is written at a fixed offset into the fw_cfg file
+ * in order to implement the "OVMF SDT Header probe suppressor"
+ * see docs/specs/vmgenid.txt for more details
+ */
+return vgid_table.vgia_val + VMGENID_GUID_OFFSET;
+}
+}
+return 0;
+}
+
+static void read_guid_from_memory(QemuUUID *guid)
+{
+uint32_t vmgenid_addr;
+int i;
+
+vmgenid_addr = acpi_find_vgia();
+g_assert(vmgenid_addr);
+
+/* Read the GUID directly from guest memory */
+for (i = 0; i < 16; i++) {
+guid->data[i] = rea
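
The address arithmetic used by acpi_find_vgia() above can be checked in isolation with a small C sketch. The helper names are hypothetical. The 40-byte offset is VMGENID_GUID_OFFSET from this patch, and the 36-byte figure is the size of the standard ACPI table header; treating it as equal to sizeof(AcpiRsdtDescriptorRev1) is an assumption about that QEMU type.

```c
#include <assert.h>
#include <stdint.h>

#define ACPI_TABLE_HEADER_SIZE 36u  /* standard ACPI SDT header */
#define VMGENID_GUID_OFFSET    40u  /* from tests/vmgenid-test.c above */

/* Number of 32-bit table pointers that follow the RSDT header. */
static uint32_t rsdt_entry_count(uint32_t rsdt_length)
{
    if (rsdt_length < ACPI_TABLE_HEADER_SIZE) {
        return 0;
    }
    return (rsdt_length - ACPI_TABLE_HEADER_SIZE) / (uint32_t)sizeof(uint32_t);
}

/*
 * Guest-physical address of the GUID, given the VGIA value patched
 * into the SSDT.  A VGIA of 0 means "not linked", so 0 is returned.
 */
static uint32_t vmgenid_guid_addr(uint32_t vgia)
{
    return vgia ? vgia + VMGENID_GUID_OFFSET : 0;
}
```

With the VGIA value 0x07FFF000 shown in the SSDT dump of docs/specs/vmgenid.txt, the GUID would be read from 0x07FFF028.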

[Qemu-devel] [PATCH v7 8/8] MAINTAINERS: Add VM Generation ID entry

2017-02-15 Thread ben
From: Ben Warren 

Also add BIOS tables entry

Signed-off-by: Ben Warren 
---
 MAINTAINERS | 8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index fb57d8e..e2e4b4f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -909,6 +909,7 @@ F: hw/acpi/*
 F: hw/smbios/*
 F: hw/i386/acpi-build.[hc]
 F: hw/arm/virt-acpi-build.c
+F: tests/bios-tables-test.c
 
 ppc4xx
 M: Alexander Graf 
@@ -1123,6 +1124,13 @@ F: hw/nvram/chrp_nvram.c
 F: include/hw/nvram/chrp_nvram.h
 F: tests/prom-env-test.c
 
+VM Generation ID
+M: Ben Warren 
+S: Maintained
+F: hw/acpi/vmgenid.c
+F: include/hw/acpi/vmgenid.h
+F: tests/vmgenid-test.c
+
 Subsystems
 ----------
 Audio
-- 
2.7.4




[Qemu-devel] [PATCH v7 0/8] Add support for VM Generation ID

2017-02-15 Thread ben
From: Ben Warren 

This patch set adds support for passing a GUID to Windows guests.  It is a
re-implementation of previous patch sets written by Igor Mammedov et al, but
this time passing the GUID data as a fw_cfg blob.

This patch set has dependencies on new guest functionality, in particular the
support for a new linker-loader command and the ability to write back data
to QEMU over a DMA link.  Work is in flight in both SeaBIOS and OVMF to support 
this.
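
As a rough model of what the new "write pointer" command asks the firmware to do, the C sketch below stores the guest address of a blob, plus a source offset, into a writeable file at a destination offset. The struct layout, the names, and the little-endian byte order are assumptions made for illustration; they are not taken from the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of one "write pointer" command. */
typedef struct {
    uint32_t dst_offset;  /* where in the writeable file to store the pointer */
    uint32_t src_offset;  /* added to the blob's guest address (new in v7) */
    uint8_t  size;        /* pointer width in bytes: 1, 2, 4 or 8 */
} WritePointerCmd;

/*
 * Firmware-side sketch: write (blob_guest_addr + src_offset) into
 * dst_file at dst_offset.  Returns 0 on success, -1 if the command
 * is malformed or does not fit into the destination file.
 */
static int apply_write_pointer(const WritePointerCmd *cmd,
                               uint64_t blob_guest_addr,
                               uint8_t *dst_file, size_t dst_len)
{
    uint64_t value = blob_guest_addr + cmd->src_offset;
    unsigned i;

    if (cmd->size != 1 && cmd->size != 2 &&
        cmd->size != 4 && cmd->size != 8) {
        return -1;
    }
    if (cmd->dst_offset > dst_len || cmd->size > dst_len - cmd->dst_offset) {
        return -1;
    }
    /* Assumption: the pointer is stored in little-endian byte order. */
    for (i = 0; i < cmd->size; i++) {
        dst_file[cmd->dst_offset + i] = (uint8_t)(value >> (8 * i));
    }
    return 0;
}
```

QEMU can then read the value back from the writeable file to learn where the guest placed the blob.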

v6->v7:
- Rebased to top of tree.
- Added 'src_offset' field to "write pointer" command
- Reworked unit tests based on feedback
- various minor changes based on feedback
- Added entries to MAINTAINERS file

v5->v6:
- Rebased to top of tree.
- Changed device from sysbus to a simple device.  This removed the need for
  adding dynamic sysbus support to pc_piix boards.
- Removed patch that introduced QWORD patching of AML.
- Removed ability to set GUID via QMP/HMP.
- Improved comments/documentation in code.

v4->v5:
- Added significantly more detail to the documentation.
- Replaced the previously-implemented linker-loader command with a new one:
  "write pointer".  This allows writing the guest address of a fw_cfg blob
  back to an arbitrary offset in a writeable fw_cfg file visible to QEMU.
  This will require support in SeaBIOS and OVMF (ongoing).
- Fixed endianness issues throughout.
- Several styling cleanups.

v3->v4:
- Rebased to top of tree.
- Re-added document patch that was accidentally dropped from the last
  revision.
- Added VMState functionality so that VGIA is restored properly.
- Added Unit tests

v2->v3:
- Added second writeable fw_cfg for storing the VM Generation ID
  address.  This uses a new linker-loader command for instructing the
  guest to write back the allocated address.  A patch for SeaBIOS has been
  submitted
  (https://www.seabios.org/pipermail/seabios/2017-January/011079.html)
  and the resulting binary will need to be pulled into QEMU once accepted.
- Setting VM Generation ID by command line or qmp/hmp now accepts an "auto"
  value, whereby QEMU generates a random GUID.
- Incorporated review comments from v2 mainly around code styling and AML
  syntax
- Changed to use the E05 ACPI event instead of E00

v1->v2:
- Removed "changed" boolean parameter as it is unneeded
- Added ACPI Notify logic
- Style changes to pass checkpatch.pl
- Added support for dynamic sysbus to pc_piix boards


Ben Warren (7):
  linker-loader: Add new 'write pointer' command
  docs: VM Generation ID device description
  ACPI: Add vmgenid blob storage to the build tables
  ACPI: Add Virtual Machine Generation ID support
  tests: Move reusable ACPI code into a utility file
  tests: Add unit tests for the VM Generation ID feature
  MAINTAINERS: Add VM Generation ID entry

Igor Mammedov (1):
  qmp/hmp: add query-vm-generation-id and 'info vm-generation-id'
    commands

 MAINTAINERS  |   8 ++
 default-configs/i386-softmmu.mak |   1 +
 default-configs/x86_64-softmmu.mak   |   1 +
 docs/specs/vmgenid.txt   | 245 +
 hmp-commands-info.hx |  14 ++
 hmp.c|   9 ++
 hmp.h|   1 +
 hw/acpi/Makefile.objs|   1 +
 hw/acpi/aml-build.c  |   2 +
 hw/acpi/bios-linker-loader.c |  66 -
 hw/acpi/vmgenid.c| 255 +++
 hw/i386/acpi-build.c |  16 +++
 include/hw/acpi/acpi_dev_interface.h |   1 +
 include/hw/acpi/aml-build.h  |   1 +
 include/hw/acpi/bios-linker-loader.h |   7 +
 include/hw/acpi/vmgenid.h|  35 +
 qapi-schema.json |  20 +++
 stubs/Makefile.objs  |   1 +
 stubs/vmgenid.c  |   9 ++
 tests/Makefile.include   |   4 +-
 tests/acpi-utils.c   |  65 +
 tests/acpi-utils.h   |  94 +
 tests/bios-tables-test.c | 132 +++---
 tests/vmgenid-test.c | 174 
 24 files changed, 1041 insertions(+), 121 deletions(-)
 create mode 100644 docs/specs/vmgenid.txt
 create mode 100644 hw/acpi/vmgenid.c
 create mode 100644 include/hw/acpi/vmgenid.h
 create mode 100644 stubs/vmgenid.c
 create mode 100644 tests/acpi-utils.c
 create mode 100644 tests/acpi-utils.h
 create mode 100644 tests/vmgenid-test.c

-- 
2.7.4




[Qemu-devel] [PATCH v7 4/8] ACPI: Add Virtual Machine Generation ID support

2017-02-15 Thread ben
From: Ben Warren 

This implements the VM Generation ID feature by passing a 128-bit
GUID to the guest via a fw_cfg blob.
Any time the GUID changes, an ACPI notify event is sent to the guest.

The user interface is a simple device with one parameter:
 - guid (string, must be "auto" or in UUID format
   xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
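
The accepted forms of the guid parameter can be illustrated with a small standalone check in C. This is a sketch only: guid_param_is_valid() is a hypothetical name, and the device itself presumably relies on QEMU's own UUID parser rather than on code like this.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical validator: accepts "auto" or a textual UUID in the
 * 8-4-4-4-12 hexadecimal layout (36 characters in total).
 */
static bool guid_param_is_valid(const char *s)
{
    size_t i;

    if (strcmp(s, "auto") == 0) {
        return true;
    }
    if (strlen(s) != 36) {
        return false;
    }
    for (i = 0; i < 36; i++) {
        bool is_dash_pos = (i == 8 || i == 13 || i == 18 || i == 23);

        if (is_dash_pos) {
            if (s[i] != '-') {
                return false;
            }
        } else if (!isxdigit((unsigned char)s[i])) {
            return false;
        }
    }
    return true;
}
```

For example, the GUID used by the unit tests in this series, 324e6eaf-d1d1-4bf6-bf41-b9bb6c91fb87, passes this check, while a truncated string such as 324e6eaf does not.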

Signed-off-by: Ben Warren 
---
 default-configs/i386-softmmu.mak |   1 +
 default-configs/x86_64-softmmu.mak   |   1 +
 hw/acpi/Makefile.objs|   1 +
 hw/acpi/vmgenid.c| 239 +++
 hw/i386/acpi-build.c |  16 +++
 include/hw/acpi/acpi_dev_interface.h |   1 +
 include/hw/acpi/vmgenid.h|  35 +
 7 files changed, 294 insertions(+)
 create mode 100644 hw/acpi/vmgenid.c
 create mode 100644 include/hw/acpi/vmgenid.h

diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
index 48b07a4..029e952 100644
--- a/default-configs/i386-softmmu.mak
+++ b/default-configs/i386-softmmu.mak
@@ -59,3 +59,4 @@ CONFIG_I82801B11=y
 CONFIG_SMBIOS=y
 CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
 CONFIG_PXB=y
+CONFIG_ACPI_VMGENID=y
diff --git a/default-configs/x86_64-softmmu.mak b/default-configs/x86_64-softmmu.mak
index fd96345..d1d7432 100644
--- a/default-configs/x86_64-softmmu.mak
+++ b/default-configs/x86_64-softmmu.mak
@@ -59,3 +59,4 @@ CONFIG_I82801B11=y
 CONFIG_SMBIOS=y
 CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
 CONFIG_PXB=y
+CONFIG_ACPI_VMGENID=y
diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
index 6acf798..11c35bc 100644
--- a/hw/acpi/Makefile.objs
+++ b/hw/acpi/Makefile.objs
@@ -5,6 +5,7 @@ common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu_hotplug.o
 common-obj-$(CONFIG_ACPI_MEMORY_HOTPLUG) += memory_hotplug.o
 common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu.o
 common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
+common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
 common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
 
 common-obj-y += acpi_interface.o
diff --git a/hw/acpi/vmgenid.c b/hw/acpi/vmgenid.c
new file mode 100644
index 000..8fba7e0
--- /dev/null
+++ b/hw/acpi/vmgenid.c
@@ -0,0 +1,239 @@
+/*
+ *  Virtual Machine Generation ID Device
+ *
+ *  Copyright (C) 2017 Skyport Systems.
+ *
+ *  Author: Ben Warren 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qmp-commands.h"
+#include "hw/acpi/acpi.h"
+#include "hw/acpi/aml-build.h"
+#include "hw/acpi/vmgenid.h"
+#include "hw/nvram/fw_cfg.h"
+#include "sysemu/sysemu.h"
+
+void vmgenid_build_acpi(VmGenIdState *vms, GArray *table_data, GArray *guid,
+BIOSLinker *linker)
+{
+Aml *ssdt, *dev, *scope, *method, *addr, *if_ctx;
+uint32_t vgia_offset;
+QemuUUID guid_le;
+
+/* Fill in the GUID values.  These need to be converted to little-endian
+ * first, since that's what the guest expects
+ */
+g_array_set_size(guid, VMGENID_FW_CFG_SIZE - ARRAY_SIZE(guid_le.data));
+guid_le = vms->guid;
+qemu_uuid_bswap(&guid_le);
+/* The GUID is written at a fixed offset into the fw_cfg file
+ * in order to implement the "OVMF SDT Header probe suppressor"
+ * see docs/specs/vmgenid.txt for more details
+ */
+g_array_insert_vals(guid, VMGENID_GUID_OFFSET, guid_le.data,
+ARRAY_SIZE(guid_le.data));
+
+/* Put this in a separate SSDT table */
+ssdt = init_aml_allocator();
+
+/* Reserve space for header */
+acpi_data_push(ssdt->buf, sizeof(AcpiTableHeader));
+
+/* Storage for the GUID address */
+vgia_offset = table_data->len +
+build_append_named_dword(ssdt->buf, "VGIA");
+scope = aml_scope("\\_SB");
+dev = aml_device("VGEN");
+aml_append(dev, aml_name_decl("_HID", aml_string("QEMUVGID")));
+aml_append(dev, aml_name_decl("_CID", aml_string("VM_Gen_Counter")));
+aml_append(dev, aml_name_decl("_DDN", aml_string("VM_Gen_Counter")));
+
+/* Simple status method to check that address is linked and non-zero */
+method = aml_method("_STA", 0, AML_NOTSERIALIZED);
+addr = aml_local(0);
+aml_append(method, aml_store(aml_int(0xf), addr));
+if_ctx = aml_if(aml_equal(aml_name("VGIA"), aml_int(0)));
+aml_append(if_ctx, aml_store(aml_int(0), addr));
+aml_append(method, if_ctx);
+aml_append(method, aml_return(addr));
+aml_append(dev, method);
+
+/* The ADDR method returns two 32-bit words representing the lower and
+ * upper halves of the physical address of the fw_cfg blob
+ * (holding the GUID)
+ */
+method = aml_method("ADDR", 0, AML_NOTSERIALIZED);
+
+addr = aml_local(0);
+aml_append(method, aml_store(aml_package(2), addr));
+
+aml_append(method, aml_store(aml_add(aml_name("VGIA"),
+ aml_int(VMGENID_GUID_OFFSET), NULL),
+ aml_index(addr, a

[Qemu-devel] [PATCH v7 6/8] tests: Move reusable ACPI code into a utility file

2017-02-15 Thread ben
From: Ben Warren 

Also usable by upcoming VM Generation ID tests

Signed-off-by: Ben Warren 
---
 tests/Makefile.include   |   2 +-
 tests/acpi-utils.c   |  65 +++
 tests/acpi-utils.h   |  94 +
 tests/bios-tables-test.c | 132 ++-
 4 files changed, 175 insertions(+), 118 deletions(-)
 create mode 100644 tests/acpi-utils.c
 create mode 100644 tests/acpi-utils.h
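
For readers following along, here is a rough, standalone sketch of the two ideas behind the helpers this patch moves: the byte-sum checksum and the 16-byte-stride search for the "RSD PTR " signature. It works on an in-memory buffer instead of readb(), and the names checksum8() and find_rsdp() are hypothetical, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sum all bytes modulo 256; a valid ACPI structure sums to zero. */
uint8_t checksum8(const uint8_t *data, size_t len)
{
    uint8_t sum = 0;

    for (size_t i = 0; i < len; i++) {
        sum += data[i];
    }
    return sum;
}

/*
 * Scan [start, end) of a memory image in 16-byte steps for the 8-byte
 * "RSD PTR " signature.  Returns the offset of the match, or `end`
 * when the signature is not found.
 */
size_t find_rsdp(const uint8_t *mem, size_t start, size_t end)
{
    assert(start <= end);

    for (size_t off = start; off + 8 <= end; off += 16) {
        if (memcmp(mem + off, "RSD PTR ", 8) == 0) {
            return off;
        }
    }
    return end;
}
```

A caller would treat a return value equal to `end` as "not found", which is the same convention the moved loop relies on when the test asserts on the returned offset.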

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 634394a..143507e 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -667,7 +667,7 @@ tests/hd-geo-test$(EXESUF): tests/hd-geo-test.o
 tests/boot-order-test$(EXESUF): tests/boot-order-test.o $(libqos-obj-y)
 tests/boot-serial-test$(EXESUF): tests/boot-serial-test.o $(libqos-obj-y)
 tests/bios-tables-test$(EXESUF): tests/bios-tables-test.o \
-   tests/boot-sector.o $(libqos-obj-y)
+   tests/boot-sector.o tests/acpi-utils.o $(libqos-obj-y)
 tests/pxe-test$(EXESUF): tests/pxe-test.o tests/boot-sector.o $(libqos-obj-y)
 tests/tmp105-test$(EXESUF): tests/tmp105-test.o $(libqos-omap-obj-y)
 tests/ds1338-test$(EXESUF): tests/ds1338-test.o $(libqos-imx-obj-y)
diff --git a/tests/acpi-utils.c b/tests/acpi-utils.c
new file mode 100644
index 000..c5d1ebd
--- /dev/null
+++ b/tests/acpi-utils.c
@@ -0,0 +1,65 @@
+/*
+ * ACPI Utility Functions
+ *
+ * Copyright (c) 2013 Red Hat Inc.
+ * Copyright (c) 2017 Skyport Systems
+ *
+ * Authors:
+ *  Michael S. Tsirkin ,
+ *  Ben Warren ,
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include 
+#include "qemu-common.h"
+#include "hw/smbios/smbios.h"
+#include "qemu/bitmap.h"
+#include "acpi-utils.h"
+#include "boot-sector.h"
+
+uint8_t acpi_calc_checksum(const uint8_t *data, int len)
+{
+int i;
+uint8_t sum = 0;
+
+for (i = 0; i < len; i++) {
+sum += data[i];
+}
+
+return sum;
+}
+
+uint32_t acpi_find_rsdp_address(void)
+{
+uint32_t off;
+
+/* RSDP location can vary across a narrow range */
+for (off = 0xf0000; off < 0x100000; off += 0x10) {
+uint8_t sig[] = "RSD PTR ";
+int i;
+
+for (i = 0; i < sizeof sig - 1; ++i) {
+sig[i] = readb(off + i);
+}
+
+if (!memcmp(sig, "RSD PTR ", sizeof sig)) {
+break;
+}
+}
+return off;
+}
+
+void acpi_parse_rsdp_table(uint32_t addr, AcpiRsdpDescriptor *rsdp_table)
+{
+ACPI_READ_FIELD(rsdp_table->signature, addr);
+ACPI_ASSERT_CMP64(rsdp_table->signature, "RSD PTR ");
+
+ACPI_READ_FIELD(rsdp_table->checksum, addr);
+ACPI_READ_ARRAY(rsdp_table->oem_id, addr);
+ACPI_READ_FIELD(rsdp_table->revision, addr);
+ACPI_READ_FIELD(rsdp_table->rsdt_physical_address, addr);
+ACPI_READ_FIELD(rsdp_table->length, addr);
+}
diff --git a/tests/acpi-utils.h b/tests/acpi-utils.h
new file mode 100644
index 000..9f9a2d5
--- /dev/null
+++ b/tests/acpi-utils.h
@@ -0,0 +1,94 @@
+/*
+ * Utilities for working with ACPI tables
+ *
+ * Copyright (c) 2013 Red Hat Inc.
+ *
+ * Authors:
+ *  Michael S. Tsirkin ,
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef TEST_ACPI_UTILS_H
+#define TEST_ACPI_UTILS_H
+
+#include "hw/acpi/acpi-defs.h"
+#include "libqtest.h"
+
+/* DSDT and SSDTs format */
+typedef struct {
+AcpiTableHeader header;
+gchar *aml;/* aml bytecode from guest */
+gsize aml_len;
+gchar *aml_file;
+gchar *asl;/* asl code generated from aml */
+gsize asl_len;
+gchar *asl_file;
+bool tmp_files_retain;   /* do not delete the temp asl/aml */
+} QEMU_PACKED AcpiSdtTable;
+
+#define ACPI_READ_FIELD(field, addr)   \
+do {   \
+switch (sizeof(field)) {   \
+case 1:\
+field = readb(addr);   \
+break; \
+case 2:\
+field = readw(addr);   \
+break; \
+case 4:\
+field = readl(addr);   \
+break; \
+case 8:\
+field = readq(addr);   \
+break; \
+default:   \
+g_assert(false);   \
+}  \
+addr += sizeof(field);  \
+} while (0);
+
+#define ACPI_READ_ARRAY_PTR(arr, length, addr)  \
+do {\
+int idx;\
+for (idx = 0;

[Qemu-devel] [PATCH v7 3/8] ACPI: Add vmgenid blob storage to the build tables

2017-02-15 Thread ben
From: Ben Warren 

This allows them to be centrally initialized and destroyed

The "AcpiBuildTables.vmgenid" array will be used to construct the
"etc/vmgenid_guid" fw_cfg blob.

Its contents will be linked into fw_cfg after being built on the
pc_machine_done() -> acpi_setup() -> acpi_build() call path, and dropped
without use on the subsequent, guest triggered, acpi_build_update() ->
acpi_build() call path.

Signed-off-by: Ben Warren 
Reviewed-by: Laszlo Ersek 
Reviewed-by: Igor Mammedov 
---
 hw/acpi/aml-build.c | 2 ++
 include/hw/acpi/aml-build.h | 1 +
 2 files changed, 3 insertions(+)
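
As a rough illustration of what ends up in the "etc/vmgenid_guid" blob, the sketch below builds the layout in plain C without GLib. Several things here are assumptions: BLOB_SIZE is only a stand-in for VMGENID_FW_CFG_SIZE, GUID_OFFSET reuses the value 40 from the v6 test, and guid_to_le() reflects my understanding that the conversion byte-swaps the first three UUID fields. insert_bytes() is a hypothetical helper that mimics insert (not overwrite) semantics, which is why the buffer starts out 16 bytes short of its final size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GUID_LEN    16
#define GUID_OFFSET 40    /* same value as VMGENID_GUID_OFFSET in the v6 test */
#define BLOB_SIZE   4096  /* assumed stand-in for VMGENID_FW_CFG_SIZE */

/* Byte-swap the first three UUID fields (32, 16 and 16 bits wide). */
void guid_to_le(uint8_t g[GUID_LEN])
{
    uint8_t t;

    t = g[0]; g[0] = g[3]; g[3] = t;
    t = g[1]; g[1] = g[2]; g[2] = t;
    t = g[4]; g[4] = g[5]; g[5] = t;
    t = g[6]; g[6] = g[7]; g[7] = t;
}

/*
 * Insert `len` bytes at `pos` into a buffer that currently holds
 * `*cur_len` bytes.  The tail is shifted up and the length grows,
 * so the caller must start with (final size - len) bytes.
 */
void insert_bytes(uint8_t *buf, size_t cap, size_t *cur_len,
                  size_t pos, const uint8_t *src, size_t len)
{
    assert(pos <= *cur_len && *cur_len + len <= cap);

    memmove(buf + pos + len, buf + pos, *cur_len - pos);
    memcpy(buf + pos, src, len);
    *cur_len += len;
}
```

Starting from BLOB_SIZE - GUID_LEN zero bytes and inserting the swapped GUID at GUID_OFFSET yields a blob of exactly BLOB_SIZE bytes with 40 bytes of zero padding in front of the GUID.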

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index b2a1e40..c6f2032 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1559,6 +1559,7 @@ void acpi_build_tables_init(AcpiBuildTables *tables)
 tables->rsdp = g_array_new(false, true /* clear */, 1);
 tables->table_data = g_array_new(false, true /* clear */, 1);
 tables->tcpalog = g_array_new(false, true /* clear */, 1);
+tables->vmgenid = g_array_new(false, true /* clear */, 1);
 tables->linker = bios_linker_loader_init();
 }
 
@@ -1568,6 +1569,7 @@ void acpi_build_tables_cleanup(AcpiBuildTables *tables, bool mfre)
 g_array_free(tables->rsdp, true);
 g_array_free(tables->table_data, true);
 g_array_free(tables->tcpalog, mfre);
+g_array_free(tables->vmgenid, mfre);
 }
 
 /* Build rsdt table */
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 559326c..00c21f1 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -210,6 +210,7 @@ struct AcpiBuildTables {
 GArray *table_data;
 GArray *rsdp;
 GArray *tcpalog;
+GArray *vmgenid;
 BIOSLinker *linker;
 } AcpiBuildTables;
 
-- 
2.7.4




[Qemu-devel] [PATCH v7 1/8] linker-loader: Add new 'write pointer' command

2017-02-15 Thread ben
From: Ben Warren 

This is similar to the existing 'add pointer' functionality, but instead
of instructing the guest (BIOS or UEFI) to patch memory, it instructs
the guest to write the pointer back to QEMU via a writeable fw_cfg file.

Signed-off-by: Ben Warren 
---
 hw/acpi/bios-linker-loader.c | 66 ++--
 include/hw/acpi/bios-linker-loader.h |  7 
 2 files changed, 70 insertions(+), 3 deletions(-)
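
To make the command's intent concrete, here is a hedged sketch of what the firmware side might do when it processes a WRITE_POINTER entry: take the address where the source blob was placed, add the source offset, and store the result as a little-endian integer of the requested width in the destination file. The function name write_pointer_le() and its return convention are hypothetical; real firmware additionally pushes the updated bytes back to QEMU through fw_cfg, which is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Store (src_base + src_offset) as a little-endian integer of `size`
 * bytes at dst[dst_offset].  Returns 0 on success and -1 when the
 * size is not 1, 2, 4 or 8, when the write would overrun the
 * destination, or when the value does not fit in `size` bytes.
 */
int write_pointer_le(uint8_t *dst, size_t dst_len, uint32_t dst_offset,
                     uint8_t size, uint64_t src_base, uint32_t src_offset)
{
    uint64_t value = src_base + src_offset;

    assert(dst != NULL);

    if (size != 1 && size != 2 && size != 4 && size != 8) {
        return -1;
    }
    if (dst_offset > dst_len || size > dst_len - dst_offset) {
        return -1;
    }
    if (size < 8 && (value >> (8 * size)) != 0) {
        return -1;
    }

    for (uint8_t i = 0; i < size; i++) {
        dst[dst_offset + i] = (uint8_t)(value >> (8 * i));
    }
    return 0;
}
```

With a source blob placed at 0x7ffe0000 and a source offset of 40, a 4-byte write stores the bytes 28 00 fe 7f, which is the address QEMU would then read back.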

diff --git a/hw/acpi/bios-linker-loader.c b/hw/acpi/bios-linker-loader.c
index d963ebe..d5fb703 100644
--- a/hw/acpi/bios-linker-loader.c
+++ b/hw/acpi/bios-linker-loader.c
@@ -78,6 +78,21 @@ struct BiosLinkerLoaderEntry {
 uint32_t length;
 } cksum;
 
+/*
+ * COMMAND_WRITE_POINTER - write the fw_cfg file (originating from
+ * @dest_file) at @wr_pointer.dst_offset, by adding a pointer to
+ * @src_offset within the table originating from @src_file.
+ * 1, 2, 4 or 8 byte unsigned addition is used depending on
+ * @wr_pointer.size.
+ */
+struct {
+char dest_file[BIOS_LINKER_LOADER_FILESZ];
+char src_file[BIOS_LINKER_LOADER_FILESZ];
+uint32_t dst_offset;
+uint32_t src_offset;
+uint8_t size;
+} wr_pointer;
+
 /* padding */
 char pad[124];
 };
@@ -85,9 +100,10 @@ struct BiosLinkerLoaderEntry {
 typedef struct BiosLinkerLoaderEntry BiosLinkerLoaderEntry;
 
 enum {
-BIOS_LINKER_LOADER_COMMAND_ALLOCATE = 0x1,
-BIOS_LINKER_LOADER_COMMAND_ADD_POINTER  = 0x2,
-BIOS_LINKER_LOADER_COMMAND_ADD_CHECKSUM = 0x3,
+BIOS_LINKER_LOADER_COMMAND_ALLOCATE  = 0x1,
+BIOS_LINKER_LOADER_COMMAND_ADD_POINTER   = 0x2,
+BIOS_LINKER_LOADER_COMMAND_ADD_CHECKSUM  = 0x3,
+BIOS_LINKER_LOADER_COMMAND_WRITE_POINTER = 0x4,
 };
 
 enum {
@@ -278,3 +294,47 @@ void bios_linker_loader_add_pointer(BIOSLinker *linker,
 
 g_array_append_vals(linker->cmd_blob, &entry, sizeof entry);
 }
+
+/*
+ * bios_linker_loader_write_pointer: ask guest to write a pointer to the
+ * source file into the destination file, and write it back to QEMU via
+ * fw_cfg DMA.
+ *
+ * @linker: linker object instance
+ * @dest_file: destination file that must be written
+ * @dst_patched_offset: location within destination file blob to be patched
+ *  with the pointer to @src_file, in bytes
+ * @dst_patched_size: size of the pointer to be patched
+ *  at @dst_patched_offset in @dest_file blob, in bytes
+ * @src_file: source file whose address must be taken
+ * @src_offset: location within source file blob to which
+ *  @dest_file+@dst_patched_offset will point after the
+ *  firmware has executed the WRITE_POINTER command
+ */
+void bios_linker_loader_write_pointer(BIOSLinker *linker,
+const char *dest_file,
+uint32_t dst_patched_offset,
+uint8_t dst_patched_size,
+const char *src_file,
+uint32_t src_offset)
+{
+BiosLinkerLoaderEntry entry;
+const BiosLinkerFileEntry *source_file =
+bios_linker_find_file(linker, src_file);
+
+assert(source_file);
+assert(src_offset <= source_file->blob->len);
+memset(&entry, 0, sizeof entry);
+strncpy(entry.wr_pointer.dest_file, dest_file,
+sizeof entry.wr_pointer.dest_file - 1);
+strncpy(entry.wr_pointer.src_file, src_file,
+sizeof entry.wr_pointer.src_file - 1);
+entry.command = cpu_to_le32(BIOS_LINKER_LOADER_COMMAND_WRITE_POINTER);
+entry.wr_pointer.dst_offset = cpu_to_le32(dst_patched_offset);
+entry.wr_pointer.src_offset = cpu_to_le32(src_offset);
+entry.wr_pointer.size = dst_patched_size;
+assert(dst_patched_size == 1 || dst_patched_size == 2 ||
+   dst_patched_size == 4 || dst_patched_size == 8);
+
+g_array_append_vals(linker->cmd_blob, &entry, sizeof entry);
+}
diff --git a/include/hw/acpi/bios-linker-loader.h b/include/hw/acpi/bios-linker-loader.h
index fa1e5d1..efe17b0 100644
--- a/include/hw/acpi/bios-linker-loader.h
+++ b/include/hw/acpi/bios-linker-loader.h
@@ -26,5 +26,12 @@ void bios_linker_loader_add_pointer(BIOSLinker *linker,
 const char *src_file,
 uint32_t src_offset);
 
+void bios_linker_loader_write_pointer(BIOSLinker *linker,
+  const char *dest_file,
+  uint32_t dst_patched_offset,
+  uint8_t dst_patched_size,
+  const char *src_file,
+  uint32_t src_offset);
+
 void bios_linker_loader_cleanup(BIOSLinker *linker);
 #endif
-- 
2.7.4




Re: [Qemu-devel] [PATCH v6 7/7] tests: Add unit tests for the VM Generation ID feature

2017-02-15 Thread Ben Warren

> On Feb 15, 2017, at 5:13 AM, Igor Mammedov  wrote:
> 
> On Tue, 14 Feb 2017 22:15:49 -0800
> b...@skyportsystems.com  wrote:
> 
>> From: Ben Warren 
>> 
>> The following tests are implemented:
>> * test that a GUID passed in by command line is propagated to the guest.
>> * test that changing the GUID at runtime via the monitor is reflected in
>>  the guest.
>> * test that the "auto" argument to the GUID generates a different, and
>>  correct GUID as seen by the guest.
>> 
>>  This patch is loosely based on a previous patch from:
>>  Gal Hammer   and Igor Mammedov 
>> 
>> Signed-off-by: Ben Warren 
>> ---
>> tests/Makefile.include |   2 +
>> tests/vmgenid-test.c   | 195 
>> +
>> 2 files changed, 197 insertions(+)
>> create mode 100644 tests/vmgenid-test.c
>> 
>> diff --git a/tests/Makefile.include b/tests/Makefile.include
>> index 634394a..ca4b3f7 100644
>> --- a/tests/Makefile.include
>> +++ b/tests/Makefile.include
>> @@ -241,6 +241,7 @@ check-qtest-i386-y += tests/usb-hcd-xhci-test$(EXESUF)
>> gcov-files-i386-y += hw/usb/hcd-xhci.c
>> check-qtest-i386-y += tests/pc-cpu-test$(EXESUF)
>> check-qtest-i386-y += tests/q35-test$(EXESUF)
>> +check-qtest-i386-y += tests/vmgenid-test$(EXESUF)
>> gcov-files-i386-y += hw/pci-host/q35.c
>> check-qtest-i386-$(CONFIG_VHOST_NET_TEST_i386) += 
>> tests/vhost-user-test$(EXESUF)
>> ifeq ($(CONFIG_VHOST_NET_TEST_i386),)
>> @@ -726,6 +727,7 @@ tests/ivshmem-test$(EXESUF): tests/ivshmem-test.o 
>> contrib/ivshmem-server/ivshmem
>> tests/vhost-user-bridge$(EXESUF): tests/vhost-user-bridge.o 
>> contrib/libvhost-user/libvhost-user.o $(test-util-obj-y)
>> tests/test-uuid$(EXESUF): tests/test-uuid.o $(test-util-obj-y)
>> tests/test-arm-mptimer$(EXESUF): tests/test-arm-mptimer.o
>> +tests/vmgenid-test$(EXESUF): tests/vmgenid-test.o
>> 
>> tests/migration/stress$(EXESUF): tests/migration/stress.o
>>  $(call quiet-command, $(LINKPROG) -static -O3 $(PTHREAD_LIB) -o $@ $< 
>> ,"LINK","$(TARGET_DIR)$@")
>> diff --git a/tests/vmgenid-test.c b/tests/vmgenid-test.c
>> new file mode 100644
>> index 000..721ba05
>> --- /dev/null
>> +++ b/tests/vmgenid-test.c
>> @@ -0,0 +1,195 @@
>> +/*
>> + * QTest testcase for VM Generation ID
>> + *
>> + * Copyright (c) 2016 Red Hat, Inc.
>> + * Copyright (c) 2017 Skyport Systems
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
>> + * See the COPYING file in the top-level directory.
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include "qemu/osdep.h"
>> +#include "qemu/bitmap.h"
>> +#include "qemu/uuid.h"
>> +#include "hw/acpi/acpi-defs.h"
>> +#include "acpi-utils.h"
>> +#include "libqtest.h"
>> +
>> +#define VGID_GUID "324e6eaf-d1d1-4bf6-bf41-b9bb6c91fb87"
>> +#define VMGENID_GUID_OFFSET  40   /* allow space for
>> +   * OVMF SDT Header Probe Suppressor
>> +   */
>> +
>> +static uint32_t vgia;
>> +
>> +typedef struct {
>> +AcpiTableHeader header;
>> +gchar name_op;
>> +gchar vgia[4];
>> +gchar val_op;
>> +uint32_t vgia_val;
>> +} QEMU_PACKED VgidTable;
>> +
>> +static uint32_t find_vgia(void)
>> +{
> [...]
> 
> ===
>> +/* First, find the RSDP */
>> +for (off = 0xf0000; off < 0x100000; off += 0x10) {
>> +uint8_t sig[] = "RSD PTR ";
>> +
>> +for (i = 0; i < sizeof sig - 1; ++i) {
>> +sig[i] = readb(off + i);
>> +}
>> +
>> +if (!memcmp(sig, "RSD PTR ", sizeof sig)) {
>> +break;
>> +}
>> +}
>> +g_assert_cmphex(off, <, 0x100000);
>> +
>> +/* Parse the RSDP header so we can find the RSDT */
>> +ACPI_READ_FIELD(rsdp_table.signature, off);
>> +ACPI_ASSERT_CMP64(rsdp_table.signature, "RSD PTR ");
>> +
>> +ACPI_READ_FIELD(rsdp_table.checksum, off);
>> +ACPI_READ_ARRAY(rsdp_table.oem_id, off);
>> +ACPI_READ_FIELD(rsdp_table.revision, off);
>> +ACPI_READ_FIELD(rsdp_table.rsdt_physical_address, off);
>> +
>> +rsdt = rsdp_table.rsdt_physical_address;
>> +/* read the header */
>> +ACPI_READ_TABLE_HEADER(&rsdt_table, rsdt);
>> +ACPI_ASSERT_CMP(rsdt_table.signature, "RSDT");
>> +
>> +/* compute the table entries in rsdt */
>> +tables_nr = (rsdt_table.length - sizeof(AcpiRsdtDescriptorRev1)) /
>> +sizeof(uint32_t);
>> +g_assert_cmpint(tables_nr, >, 0);
>> +
>> +/* get the addresses of the tables pointed by rsdt */
>> +tables = g_new0(uint32_t, tables_nr);
>> +ACPI_READ_ARRAY_PTR(tables, tables_nr, rsdt);
> ===
> above is duplicated code from bios-tables-test.c
> please extract it into separate functions and use them in both tests.
> 
Done in v7
>> +for (i = 0; i < tables_nr; i++) {
>> +ACPI_READ_TABLE_HEADER(&ssdt_table, tables[i]);
>> +if (!strncmp((char *)ssdt_table.oem_table_id, "VMGENID", 7)) {
>> +/* the first entry in the table should be 

Re: [Qemu-devel] [PATCH v6 5/7] qmp/hmp: add query-vm-generation-id and 'info vm-generation-id' commands

2017-02-15 Thread Ben Warren

> On Feb 15, 2017, at 7:36 AM, Laszlo Ersek  wrote:
> 
> Two questions:
> 
> On 02/15/17 07:15, b...@skyportsystems.com wrote:
>> From: Igor Mammedov 
>> 
>> Add commands to query Virtual Machine Generation ID counter.
>> 
>> QMP command example:
>>{ "execute": "query-vm-generation-id" }
>> 
>> HMP command example:
>>info vm-generation-id
>> 
>> Signed-off-by: Igor Mammedov 
>> Reviewed-by: Eric Blake 
>> Signed-off-by: Ben Warren 
>> ---
>> hmp-commands-info.hx | 13 +
>> hmp.c|  9 +
>> hmp.h|  1 +
>> hw/acpi/vmgenid.c| 16 
>> qapi-schema.json | 20 
>> stubs/Makefile.objs  |  1 +
>> stubs/vmgenid.c  |  8 
>> 7 files changed, 68 insertions(+)
>> create mode 100644 stubs/vmgenid.c
>> 
>> diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
>> index b0f35e6..f3df793 100644
>> --- a/hmp-commands-info.hx
>> +++ b/hmp-commands-info.hx
>> @@ -802,6 +802,19 @@ Show information about hotpluggable CPUs
>> ETEXI
>> 
>> STEXI
>> +@item info vm-generation-id
> 
> (1) Don't we need some kind of @findex here, for consistency with the
> rest of the file?
> 
>> +Show Virtual Machine Generation ID
>> +ETEXI
>> +
>> +{
>> +.name   = "vm-generation-id",
>> +.args_type  = "",
>> +.params = "",
>> +.help   = "Show Virtual Machine Generation ID",
>> +.cmd = hmp_info_vm_generation_id,
>> +},
>> +
>> +STEXI
>> @end table
>> ETEXI
>> 
>> diff --git a/hmp.c b/hmp.c
>> index 2bc4f06..535613d 100644
>> --- a/hmp.c
>> +++ b/hmp.c
>> @@ -2565,3 +2565,12 @@ void hmp_hotpluggable_cpus(Monitor *mon, const QDict 
>> *qdict)
>> 
>> qapi_free_HotpluggableCPUList(saved);
>> }
>> +
>> +void hmp_info_vm_generation_id(Monitor *mon, const QDict *qdict)
>> +{
>> +GuidInfo *info = qmp_query_vm_generation_id(NULL);
>> +if (info) {
>> +monitor_printf(mon, "%s\n", info->guid);
>> +}
>> +qapi_free_GuidInfo(info);
>> +}
>> diff --git a/hmp.h b/hmp.h
>> index 05daf7c..799fd37 100644
>> --- a/hmp.h
>> +++ b/hmp.h
>> @@ -137,5 +137,6 @@ void hmp_rocker_of_dpa_flows(Monitor *mon, const QDict 
>> *qdict);
>> void hmp_rocker_of_dpa_groups(Monitor *mon, const QDict *qdict);
>> void hmp_info_dump(Monitor *mon, const QDict *qdict);
>> void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict);
>> +void hmp_info_vm_generation_id(Monitor *mon, const QDict *qdict);
>> 
>> #endif
>> diff --git a/hw/acpi/vmgenid.c b/hw/acpi/vmgenid.c
>> index b1b7b32..c159c76 100644
>> --- a/hw/acpi/vmgenid.c
>> +++ b/hw/acpi/vmgenid.c
>> @@ -235,3 +235,19 @@ static void vmgenid_register_types(void)
>> }
>> 
>> type_init(vmgenid_register_types)
>> +
>> +GuidInfo *qmp_query_vm_generation_id(Error **errp)
>> +{
>> +GuidInfo *info;
>> +VmGenIdState *vms;
>> +Object *obj = find_vmgenid_dev();
>> +
>> +if (!obj) {
>> +return NULL;
>> +}
>> +vms = VMGENID(obj);
>> +
>> +info = g_malloc0(sizeof(*info));
>> +info->guid = qemu_uuid_unparse_strdup(&vms->guid);
>> +return info;
>> +}
>> diff --git a/qapi-schema.json b/qapi-schema.json
>> index 61151f3..5e2a47f 100644
>> --- a/qapi-schema.json
>> +++ b/qapi-schema.json
>> @@ -6051,3 +6051,23 @@
>> #
>> ##
>> { 'command': 'query-hotpluggable-cpus', 'returns': ['HotpluggableCPU'] }
>> +
>> +##
>> +# @GuidInfo:
>> +#
>> +# GUID information.
>> +#
>> +# @guid: the globally unique identifier
>> +#
>> +# Since: 2.9
>> +##
>> +{ 'struct': 'GuidInfo', 'data': {'guid': 'str'} }
>> +
>> +##
>> +# @query-vm-generation-id:
>> +#
>> +# Show Virtual Machine Generation ID
>> +#
>> +# Since: 2.9
>> +##
>> +{ 'command': 'query-vm-generation-id', 'returns': 'GuidInfo' }
>> diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
>> index a187295..0bffca6 100644
>> --- a/stubs/Makefile.objs
>> +++ b/stubs/Makefile.objs
>> @@ -35,3 +35,4 @@ stub-obj-y += qmp_pc_dimm_device_list.o
>> stub-obj-y += target-monitor-defs.o
>> stub-obj-y += target-get-monitor-def.o
>> stub-obj-y += pc_madt_cpu_entry.o
>> +stub-obj-y += vmgenid.o
>> diff --git a/stubs/vmgenid.c b/stubs/vmgenid.c
>> new file mode 100644
>> index 000..8c448ac
>> --- /dev/null
>> +++ b/stubs/vmgenid.c
>> @@ -0,0 +1,8 @@
>> +#include "qemu/osdep.h"
>> +#include "qmp-commands.h"
>> +
>> +GuidInfo *qmp_query_vm_generation_id(Error **errp)
>> +{
>> +error_setg(errp, "this command is not currently supported");
>> +return NULL;
>> +}
>> 
> 
> (2) Don't we usually employ QERR_UNSUPPORTED for the format string in
> such cases?
> 
> With or without updates:
> 
> Reviewed-by: Laszlo Ersek <ler...@redhat.com>
> 
Both items changed.  Thanks!
> Thanks
> Laszlo





Re: [Qemu-devel] [PATCH v6 4/7] ACPI: Add Virtual Machine Generation ID support

2017-02-15 Thread Ben Warren

> On Feb 15, 2017, at 7:24 AM, Laszlo Ersek  wrote:
> 
> On 02/15/17 13:19, Igor Mammedov wrote:
>> On Tue, 14 Feb 2017 22:15:46 -0800
>> b...@skyportsystems.com wrote:
>> 
>>> From: Ben Warren 
>>> 
>>> This implements the VM Generation ID feature by passing a 128-bit
>>> GUID to the guest via a fw_cfg blob.
>>> Any time the GUID changes, an ACPI notify event is sent to the guest
>>> 
>>> The user interface is a simple device with one parameter:
>>> - guid (string, must be "auto" or in UUID format
>>>   ----)
>>> 
>>> Signed-off-by: Ben Warren 
>>> ---
>>> default-configs/i386-softmmu.mak |   1 +
>>> default-configs/x86_64-softmmu.mak   |   1 +
>>> hw/acpi/Makefile.objs|   1 +
>>> hw/acpi/vmgenid.c| 237 
>>> +++
>>> hw/i386/acpi-build.c |  16 +++
>>> include/hw/acpi/acpi_dev_interface.h |   1 +
>>> include/hw/acpi/vmgenid.h|  35 ++
>>> 7 files changed, 292 insertions(+)
>>> create mode 100644 hw/acpi/vmgenid.c
>>> create mode 100644 include/hw/acpi/vmgenid.h
>>> 
>>> diff --git a/default-configs/i386-softmmu.mak 
>>> b/default-configs/i386-softmmu.mak
>>> index 48b07a4..029e952 100644
>>> --- a/default-configs/i386-softmmu.mak
>>> +++ b/default-configs/i386-softmmu.mak
>>> @@ -59,3 +59,4 @@ CONFIG_I82801B11=y
>>> CONFIG_SMBIOS=y
>>> CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
>>> CONFIG_PXB=y
>>> +CONFIG_ACPI_VMGENID=y
>>> diff --git a/default-configs/x86_64-softmmu.mak 
>>> b/default-configs/x86_64-softmmu.mak
>>> index fd96345..d1d7432 100644
>>> --- a/default-configs/x86_64-softmmu.mak
>>> +++ b/default-configs/x86_64-softmmu.mak
>>> @@ -59,3 +59,4 @@ CONFIG_I82801B11=y
>>> CONFIG_SMBIOS=y
>>> CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
>>> CONFIG_PXB=y
>>> +CONFIG_ACPI_VMGENID=y
>>> diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
>>> index 6acf798..11c35bc 100644
>>> --- a/hw/acpi/Makefile.objs
>>> +++ b/hw/acpi/Makefile.objs
>>> @@ -5,6 +5,7 @@ common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu_hotplug.o
>>> common-obj-$(CONFIG_ACPI_MEMORY_HOTPLUG) += memory_hotplug.o
>>> common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu.o
>>> common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
>>> +common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
>>> common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
>>> 
>>> common-obj-y += acpi_interface.o
>>> diff --git a/hw/acpi/vmgenid.c b/hw/acpi/vmgenid.c
>>> new file mode 100644
>>> index 000..b1b7b32
>>> --- /dev/null
>>> +++ b/hw/acpi/vmgenid.c
>>> @@ -0,0 +1,237 @@
>>> +/*
>>> + *  Virtual Machine Generation ID Device
>>> + *
>>> + *  Copyright (C) 2017 Skyport Systems.
>>> + *
>>> + *  Author: Ben Warren 
>>> + *
>>> + * This work is licensed under the terms of the GNU GPL, version 2 or 
>>> later.
>>> + * See the COPYING file in the top-level directory.
>>> + *
>>> + */
>>> +
>>> +#include "qemu/osdep.h"
>>> +#include "qmp-commands.h"
>>> +#include "hw/acpi/acpi.h"
>>> +#include "hw/acpi/aml-build.h"
>>> +#include "hw/acpi/vmgenid.h"
>>> +#include "hw/nvram/fw_cfg.h"
>>> +#include "sysemu/sysemu.h"
>>> +
>>> +void vmgenid_build_acpi(VmGenIdState *vms, GArray *table_data, GArray 
>>> *guid,
>>> +BIOSLinker *linker)
>>> +{
>>> +Aml *ssdt, *dev, *scope, *method, *addr, *if_ctx;
>>> +uint32_t vgia_offset;
>>> +QemuUUID guid_le;
>>> +
>>> +/* Fill in the GUID values.  These need to be converted to 
>>> little-endian
>>> + * first, since that's what the guest expects
>>> + */
>>> +g_array_set_size(guid, VMGENID_FW_CFG_SIZE);
>>> +memcpy(&guid_le.data, &vms->guid.data, sizeof(vms->guid.data));
>>> +qemu_uuid_bswap(&guid_le);
>>> +/* The GUID is written at a fixed offset into the fw_cfg file
>>> + * in order to implement the "OVMF SDT Header probe suppressor"
>>> + * see docs/specs/vmgenid.txt for more details
>>> + */
>>> +g_array_insert_vals(guid, VMGENID_GUID_OFFSET, guid_le.data,
>>> +ARRAY_SIZE(guid_le.data));
> 
> Ben:
> 
> (1) The logic is sane here, but the initial sizing of the array is not
> correct. The initial size should be
> 
>  (VMGENID_FW_CFG_SIZE - ARRAY_SIZE(guid_le.data))
> 
> The reason for this is that g_array_insert_vals() really inserts (it
> doesn't overwrite) data, therefore it grows the array. From the GLib
> source code [glib/garray.c]:
> 
> --
> GArray*
> g_array_insert_vals (GArray*farray,
> guint  index_,
> gconstpointer  data,
> guint  len)
> {
>  GRealArray *array = (GRealArray*) farray;
> 
>  g_return_val_if_fail (array, NULL);
> 
>  g_array_maybe_expand (array, len);
> 
>  memmove (g_array_elt_pos (array, len + index_),
>   g_array_elt_pos (array, index_),
>   g_array_elt_len (array, array->len - index_));
> 
>  memcpy (g_array_elt_pos (array, index_), data, g_array_elt_len (array,
> len));
> 
>  array->len 

Re: [Qemu-devel] [PATCH v6 3/7] ACPI: Add vmgenid blob storage to the build tables

2017-02-15 Thread Ben Warren

> On Feb 15, 2017, at 6:30 AM, Laszlo Ersek  wrote:
> 
> On 02/15/17 07:15, b...@skyportsystems.com wrote:
>> From: Ben Warren 
>> 
>> This allows them to be centrally initialized and destroyed
>> 
>> The "AcpiBuildTables.vmgenid" array will be used to construct the
>> "etc/vmgenid" fw_cfg blob.
> 
> Trivial wart: the blob is now called "etc/vmgenid_guid".
> 
> If you send a v7, feel free to fix it up. Not critical.
> 
Fixed in v7
> My R-b stands.
> 
> Thanks!
> Laszlo
> 
>> Its contents will be linked into fw_cfg after being built on the
>> pc_machine_done() -> acpi_setup() -> acpi_build() call path, and dropped
>> without use on the subsequent, guest triggered, acpi_build_update() ->
>> acpi_build() call path.
>> 
>> Signed-off-by: Ben Warren 
>> Reviewed-by: Laszlo Ersek 
>> ---
>> hw/acpi/aml-build.c | 2 ++
>> include/hw/acpi/aml-build.h | 1 +
>> 2 files changed, 3 insertions(+)
>> 
>> diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
>> index b2a1e40..c6f2032 100644
>> --- a/hw/acpi/aml-build.c
>> +++ b/hw/acpi/aml-build.c
>> @@ -1559,6 +1559,7 @@ void acpi_build_tables_init(AcpiBuildTables *tables)
>> tables->rsdp = g_array_new(false, true /* clear */, 1);
>> tables->table_data = g_array_new(false, true /* clear */, 1);
>> tables->tcpalog = g_array_new(false, true /* clear */, 1);
>> +tables->vmgenid = g_array_new(false, true /* clear */, 1);
>> tables->linker = bios_linker_loader_init();
>> }
>> 
>> @@ -1568,6 +1569,7 @@ void acpi_build_tables_cleanup(AcpiBuildTables 
>> *tables, bool mfre)
>> g_array_free(tables->rsdp, true);
>> g_array_free(tables->table_data, true);
>> g_array_free(tables->tcpalog, mfre);
>> +g_array_free(tables->vmgenid, mfre);
>> }
>> 
>> /* Build rsdt table */
>> diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
>> index 559326c..00c21f1 100644
>> --- a/include/hw/acpi/aml-build.h
>> +++ b/include/hw/acpi/aml-build.h
>> @@ -210,6 +210,7 @@ struct AcpiBuildTables {
>> GArray *table_data;
>> GArray *rsdp;
>> GArray *tcpalog;
>> +GArray *vmgenid;
>> BIOSLinker *linker;
>> } AcpiBuildTables;
>> 
>> 
> 





Re: [Qemu-devel] [PATCH v6 0/7] Add support for VM Generation ID

2017-02-15 Thread Ben Warren

> On Feb 15, 2017, at 12:52 PM, Laszlo Ersek  wrote:
> 
> On 02/15/17 21:09, Michael S. Tsirkin wrote:
>> On Wed, Feb 15, 2017 at 08:47:48PM +0100, Laszlo Ersek wrote:
> 
> [snip]
> 
>>> For patches #1, #3, #4 and #5:
>>> 
>>> Tested-by: Laszlo Ersek 
>>> 
>>> I'll soon post the OVMF patches.
>>> 
>>> Thanks!
>>> Laszlo
>> 
>> 
>> How do you feel about Igor's request to change WRITE_POINTER to add
>> offset in there, so guest can pass in the address of GUID and
>> not start of table? Would that be a lot of work to add?
> 
> I think it's doable in practice: simply add a constant from the command
> itself, for passing the value back to QEMU, and also for saving the
> fw_cfg write command for S3 resume time.
> 
> But, I disagree with it from a design POV.
> 
> Igor's point is:
> 
>> Math complicates QEMU code though and not only QEMU but AML code as
>> well.
> 
> As I understand it, the goal is to push the addition to the firmware
> (which is "one place"), rather than having to implement it twice in
> QEMU, i.e., in two places ((a) native QEMU logic, (b) AML generator).
> 
> Here's my counter-argument:
> 
> (a) As I mentioned earlier, assume a complex communication structure
> between the guest OS and QEMU. Currently our shared structure consists
> of a single field (the GUID), but next time it might contain several fields.
> 
> For such a multi-field shared structure, QEMU will have to do manual
> offsetting into the guest RAM anyway, for accessing fields F1, F2, and
> F3. We will not create three separate WRITE_POINTER commands and let the
> firmware calculate and return the absolute GPAs of the fields F1, F2 and
> F3. Instead, there will be one WRITE_POINTER command, and QEMU will do
> the offsetting manually, minimally for fields F2 and F3.
> 
> "src_offset" looks tempting now only because we have a shared structure
> with only one field, the GUID at offset 40 decimal.
> 
> (b) Regarding the runtime addition in the AML code:
> 
> As discussed before, the main reason *now*, for not pointing VGIA (and
> other named integer objects) with ADD_POINTER commands directly to
> "meaningful" fields, is that OVMF probes the targets of ADD_POINTER
> commands for patterns that look like ACPI table headers. And, for the
> time being, we want to suppress any mis-recognitions by prepending some
> padding.
> 
> Igor was right to dislike this, and we agreed that *down the road* we
> should add allocation flags, or further allocation commands, to supplant
> this kind of heuristics in OVMF. But:
> 
> - we don't have time to do it now, plus
> 
> - please observe that the runtime addition in AML relates to the
>  ADD_POINTER and the ALLOCATE commands. It does not relate to
>  WRITE_POINTER at all.
> 
>  Whatever we change on WRITE_POINTER will do nothing for suppressing
>  OVMF's table header probing -- because that is tied to ADD_POINTER
>  --, therefore WRITE_POINTER tweaks cannot eliminate the "need to add"
>  in AML.
> 
> 
> In summary, I think the proposed WRITE_POINTER modification is
> implementable, but I think it will not pay off, because:
> 
> (a) for QEMU logic, it will not prove useful as soon as we have a
> multi-field shared structure (QEMU will have to add field offsets anyway),
> 
> (b) and for eliminating the AML addition (which is a consequence of the
> current ADD_POINTER handling in OVMF), it does nothing.
> 
OK Laszlo, in v7 (imminent)  I went ahead and implemented this src_offset.  If 
you are truly dead-set against it, it’s not very hard to remove.  To me it 
seems pretty harmless.

> Thanks
> Laszlo
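
To illustrate the trade-off discussed above, here is a small sketch under stated assumptions. struct shared_area is a purely hypothetical multi-field layout (only the 40-byte padding mirrors the GUID offset mentioned in the thread). With a single reported base address, the host side derives each field's guest-physical address by adding the field offset itself; the src_offset variant is modelled by gpa_from_firmware(), where the firmware performs one such addition on the host's behalf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure shared between the guest and the host. */
struct shared_area {
    uint8_t  pad[40];  /* padding in front of the first field */
    uint8_t  f1[16];
    uint32_t f2;
    uint32_t f3;
};

/* The first field sits where the thread places the GUID. */
static_assert(offsetof(struct shared_area, f1) == 40, "f1 at offset 40");

/* Host-side offsetting from the one base address that was reported. */
#define FIELD_GPA(base, field) \
    ((uint64_t)(base) + offsetof(struct shared_area, field))

/* Model of the src_offset variant: the firmware adds the offset. */
uint64_t gpa_from_firmware(uint64_t base, uint32_t src_offset)
{
    return base + src_offset;
}
```

For a single field both approaches give the same address; with several fields, the host still needs FIELD_GPA-style arithmetic for every field that was not covered by its own command.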





[Qemu-devel] [PATCH v2 3/4] char: remove the right fd been watched in qemu_chr_fe_set_handlers()

2017-02-15 Thread zhanghailiang
We can call qemu_chr_fe_set_handlers() to add/remove the fd being watched
in 'context', which can be either the default main context or another explicit
context. But the original logic is not correct: we didn't remove
the right fd because we called g_main_context_find_source_by_id(NULL, tag),
which always tries to find the GSource in the default context.

Fix it by passing the right context to g_main_context_find_source_by_id().

Cc: Paolo Bonzini 
Cc: Marc-André Lureau 
Signed-off-by: zhanghailiang 
---
 chardev/char-io.c | 13 +
 chardev/char-io.h |  2 ++
 chardev/char.c|  2 +-
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/chardev/char-io.c b/chardev/char-io.c
index 7dfc3f2..a69cc61 100644
--- a/chardev/char-io.c
+++ b/chardev/char-io.c
@@ -127,14 +127,14 @@ guint io_add_watch_poll(Chardev *chr,
 return tag;
 }
 
-static void io_remove_watch_poll(guint tag)
+static void io_remove_watch_poll(guint tag, GMainContext *context)
 {
 GSource *source;
 IOWatchPoll *iwp;
 
 g_return_if_fail(tag > 0);
 
-source = g_main_context_find_source_by_id(NULL, tag);
+source = g_main_context_find_source_by_id(context, tag);
 g_return_if_fail(source != NULL);
 
 iwp = io_watch_poll_from_source(source);
@@ -146,14 +146,19 @@ static void io_remove_watch_poll(guint tag)
 g_source_destroy(&iwp->parent);
 }
 
-void remove_fd_in_watch(Chardev *chr)
+void qemu_remove_fd_in_watch(Chardev *chr, GMainContext *context)
 {
 if (chr->fd_in_tag) {
-io_remove_watch_poll(chr->fd_in_tag);
+io_remove_watch_poll(chr->fd_in_tag, context);
 chr->fd_in_tag = 0;
 }
 }
 
+void remove_fd_in_watch(Chardev *chr)
+{
+qemu_remove_fd_in_watch(chr, NULL);
+}
+
 int io_channel_send_full(QIOChannel *ioc,
  const void *buf, size_t len,
  int *fds, size_t nfds)
diff --git a/chardev/char-io.h b/chardev/char-io.h
index d7ae5f1..117c888 100644
--- a/chardev/char-io.h
+++ b/chardev/char-io.h
@@ -38,6 +38,8 @@ guint io_add_watch_poll(Chardev *chr,
 
 void remove_fd_in_watch(Chardev *chr);
 
+void qemu_remove_fd_in_watch(Chardev *chr, GMainContext *context);
+
 int io_channel_send(QIOChannel *ioc, const void *buf, size_t len);
 
 int io_channel_send_full(QIOChannel *ioc, const void *buf, size_t len,
diff --git a/chardev/char.c b/chardev/char.c
index abd525f..5563375 100644
--- a/chardev/char.c
+++ b/chardev/char.c
@@ -560,7 +560,7 @@ void qemu_chr_fe_set_handlers(CharBackend *b,
 cc = CHARDEV_GET_CLASS(s);
 if (!opaque && !fd_can_read && !fd_read && !fd_event) {
 fe_open = 0;
-remove_fd_in_watch(s);
+qemu_remove_fd_in_watch(s, context);
 } else {
 fe_open = 1;
 }
-- 
1.8.3.1
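
The lookup problem fixed by this patch can be illustrated without GLib. The sketch below is only a simplified model, not the chardev code: `Context`, `attach_source()` and `find_source_by_id()` are hypothetical stand-ins, and the only behaviour borrowed from GLib is that a NULL context selects the default one.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SOURCES 8

/* Hypothetical stand-in for a GMainContext: each context owns its sources. */
typedef struct Context {
    unsigned ids[MAX_SOURCES];
    size_t count;
} Context;

/* Plays the role of the default main context. */
static Context default_context;

/* NULL selects the default context, as in the GLib lookup call. */
static Context *resolve(Context *ctx)
{
    return ctx ? ctx : &default_context;
}

/* Attach a source (identified by a non-zero tag) to one context. */
static void attach_source(Context *ctx, unsigned id)
{
    Context *c = resolve(ctx);

    assert(id > 0);
    assert(c->count < MAX_SOURCES);
    c->ids[c->count++] = id;
}

/* Return the stored tag, or NULL if this context does not own it. */
static const unsigned *find_source_by_id(Context *ctx, unsigned id)
{
    Context *c = resolve(ctx);

    for (size_t i = 0; i < c->count; i++) {
        if (c->ids[i] == id) {
            return &c->ids[i];
        }
    }
    return NULL;
}
```

A source attached to a worker context is invisible to a lookup that passes NULL, which is why the removal path has to be told which context the tag belongs to.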





[Qemu-devel] [PATCH v2 2/4] colo-compare: kick compare thread to exit after some cleanup in finalization

2017-02-15 Thread zhanghailiang
We should call g_main_loop_quit() to notify the colo compare thread to
exit, or it will run in g_main_loop_run() forever.

Besides, finalization can't happen in the context of the colo thread,
so it is reasonable to remove the 'if (qemu_thread_is_self(&s->thread))'
branch.

Before the compare thread exits, some cleanup work needs to be
done: all unhandled packets need to be released and connection_track_table
needs to be freed, or there will be a memory leak.

Signed-off-by: zhanghailiang 
---
 net/colo-compare.c | 39 +--
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index fdde788..37ce75c 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -83,6 +83,8 @@ typedef struct CompareState {
 GHashTable *connection_track_table;
 /* compare thread, a thread for each NIC */
 QemuThread thread;
+
+GMainLoop *compare_loop;
 } CompareState;
 
 typedef struct CompareClass {
@@ -496,7 +498,6 @@ static gboolean check_old_packet_regular(void *opaque)
 static void *colo_compare_thread(void *opaque)
 {
 GMainContext *worker_context;
-GMainLoop *compare_loop;
 CompareState *s = opaque;
 GSource *timeout_source;
 
@@ -507,7 +508,7 @@ static void *colo_compare_thread(void *opaque)
 qemu_chr_fe_set_handlers(&s->chr_sec_in, compare_chr_can_read,
  compare_sec_chr_in, NULL, s, worker_context, 
true);
 
-compare_loop = g_main_loop_new(worker_context, FALSE);
+s->compare_loop = g_main_loop_new(worker_context, FALSE);
 
 /* To kick any packets that the secondary doesn't match */
 timeout_source = g_timeout_source_new(REGULAR_PACKET_CHECK_MS);
@@ -515,10 +516,10 @@ static void *colo_compare_thread(void *opaque)
   (GSourceFunc)check_old_packet_regular, s, NULL);
 g_source_attach(timeout_source, worker_context);
 
-g_main_loop_run(compare_loop);
+g_main_loop_run(s->compare_loop);
 
 g_source_unref(timeout_source);
-g_main_loop_unref(compare_loop);
+g_main_loop_unref(s->compare_loop);
 g_main_context_unref(worker_context);
 return NULL;
 }
@@ -675,6 +676,23 @@ static void colo_compare_complete(UserCreatable *uc, Error 
**errp)
 return;
 }
 
+static void colo_flush_packets(void *opaque, void *user_data)
+{
+CompareState *s = user_data;
+Connection *conn = opaque;
+Packet *pkt = NULL;
+
+while (!g_queue_is_empty(&conn->primary_list)) {
+pkt = g_queue_pop_head(&conn->primary_list);
+compare_chr_send(&s->chr_out, pkt->data, pkt->size);
+packet_destroy(pkt, NULL);
+}
+while (!g_queue_is_empty(&conn->secondary_list)) {
+pkt = g_queue_pop_head(&conn->secondary_list);
+packet_destroy(pkt, NULL);
+}
+}
+
 static void colo_compare_class_init(ObjectClass *oc, void *data)
 {
 UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
@@ -703,14 +721,15 @@ static void colo_compare_finalize(Object *obj)
 qemu_chr_fe_deinit(&s->chr_sec_in);
 qemu_chr_fe_deinit(&s->chr_out);
 
-g_queue_free(&s->conn_list);
+g_main_loop_quit(s->compare_loop);
+qemu_thread_join(&s->thread);
 
-if (qemu_thread_is_self(&s->thread)) {
-/* compare connection */
-g_queue_foreach(&s->conn_list, colo_compare_connection, s);
-qemu_thread_join(&s->thread);
-}
+/* Release all unhandled packets after the compare thread has exited */
+g_queue_foreach(&s->conn_list, colo_flush_packets, s);
+
+g_queue_free(&s->conn_list);
 
+g_hash_table_destroy(s->connection_track_table);
 g_free(s->pri_indev);
 g_free(s->sec_indev);
 g_free(s->outdev);
-- 
1.8.3.1
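
The draining done by colo_flush_packets() can be sketched in plain C. This is only a model under stated assumptions: `Packet`, `PacketQueue`, `Conn` and `FlushStats` are hypothetical types, a counter stands in for compare_chr_send(), and free() stands in for packet_destroy().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical singly linked packet queue, standing in for a GQueue. */
typedef struct Packet {
    struct Packet *next;
    int size;
} Packet;

typedef struct PacketQueue {
    Packet *head;
    Packet *tail;
} PacketQueue;

typedef struct Conn {
    PacketQueue primary;
    PacketQueue secondary;
} Conn;

typedef struct FlushStats {
    int sent;   /* packets forwarded before being freed */
    int freed;  /* packets released in total */
} FlushStats;

static void queue_push(PacketQueue *q, int size)
{
    Packet *p = calloc(1, sizeof(*p));

    assert(p != NULL);
    p->size = size;
    if (q->tail) {
        q->tail->next = p;
    } else {
        q->head = p;
    }
    q->tail = p;
}

static Packet *queue_pop_head(PacketQueue *q)
{
    Packet *p = q->head;

    if (p) {
        q->head = p->next;
        if (!q->head) {
            q->tail = NULL;
        }
    }
    return p;
}

/* Primary packets are forwarded and then freed; secondary ones are only freed. */
static void flush_connection(Conn *c, FlushStats *st)
{
    Packet *p;

    while ((p = queue_pop_head(&c->primary)) != NULL) {
        st->sent++;          /* stands in for compare_chr_send() */
        free(p);
        st->freed++;
    }
    while ((p = queue_pop_head(&c->secondary)) != NULL) {
        free(p);
        st->freed++;
    }
}
```

Running the flush only after the compare thread has been joined means no other thread can touch the queues while they are drained, so the model needs no locking either.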





[Qemu-devel] [PATCH v2 4/4] colo-compare: Fix removing fds been watched incorrectly in finalization

2017-02-15 Thread zhanghailiang
We hit the error below when trying to delete the compare object
with a QMP command:
chardev/char-io.c:91: io_watch_poll_finalize: Assertion `iwp->src == ((void *)0)' failed.

This is caused by failing to remove the right watched fd when
calling qemu_chr_fe_set_handlers().

Fix it by passing the worker_context parameter to qemu_chr_fe_set_handlers().

Signed-off-by: zhanghailiang 
---
 net/colo-compare.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 37ce75c..a6fc2ff 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -84,6 +84,7 @@ typedef struct CompareState {
 /* compare thread, a thread for each NIC */
 QemuThread thread;
 
+GMainContext *worker_context;
 GMainLoop *compare_loop;
 } CompareState;
 
@@ -497,30 +498,29 @@ static gboolean check_old_packet_regular(void *opaque)
 
 static void *colo_compare_thread(void *opaque)
 {
-GMainContext *worker_context;
 CompareState *s = opaque;
 GSource *timeout_source;
 
-worker_context = g_main_context_new();
+s->worker_context = g_main_context_new();
 
 qemu_chr_fe_set_handlers(&s->chr_pri_in, compare_chr_can_read,
- compare_pri_chr_in, NULL, s, worker_context, 
true);
+  compare_pri_chr_in, NULL, s, s->worker_context, 
true);
 qemu_chr_fe_set_handlers(&s->chr_sec_in, compare_chr_can_read,
- compare_sec_chr_in, NULL, s, worker_context, 
true);
+  compare_sec_chr_in, NULL, s, s->worker_context, 
true);
 
-s->compare_loop = g_main_loop_new(worker_context, FALSE);
+s->compare_loop = g_main_loop_new(s->worker_context, FALSE);
 
 /* To kick any packets that the secondary doesn't match */
 timeout_source = g_timeout_source_new(REGULAR_PACKET_CHECK_MS);
 g_source_set_callback(timeout_source,
   (GSourceFunc)check_old_packet_regular, s, NULL);
-g_source_attach(timeout_source, worker_context);
+g_source_attach(timeout_source, s->worker_context);
 
 g_main_loop_run(s->compare_loop);
 
 g_source_unref(timeout_source);
 g_main_loop_unref(s->compare_loop);
-g_main_context_unref(worker_context);
+g_main_context_unref(s->worker_context);
 return NULL;
 }
 
@@ -717,8 +717,10 @@ static void colo_compare_finalize(Object *obj)
 {
 CompareState *s = COLO_COMPARE(obj);
 
-qemu_chr_fe_deinit(&s->chr_pri_in);
-qemu_chr_fe_deinit(&s->chr_sec_in);
+qemu_chr_fe_set_handlers(&s->chr_pri_in, NULL, NULL, NULL, NULL,
+ s->worker_context, true);
+qemu_chr_fe_set_handlers(&s->chr_sec_in, NULL, NULL, NULL, NULL,
+ s->worker_context, true);
 qemu_chr_fe_deinit(&s->chr_out);
 
 g_main_loop_quit(s->compare_loop);
-- 
1.8.3.1





[Qemu-devel] [PATCH v2 1/4] colo-compare: use g_timeout_source_new() to process the stale packets

2017-02-15 Thread zhanghailiang
Instead of using a QEMU timer to process the stale packets,
re-use the colo compare thread to process these packets
by creating a new timeout source.

Besides, since we process all of the same vNIC's net connections/packets
in one thread, it is safe to remove the timer_check_lock.

Signed-off-by: zhanghailiang 
---
 net/colo-compare.c | 62 +++---
 1 file changed, 22 insertions(+), 40 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 162fd6a..fdde788 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -83,9 +83,6 @@ typedef struct CompareState {
 GHashTable *connection_track_table;
 /* compare thread, a thread for each NIC */
 QemuThread thread;
-/* Timer used on the primary to find packets that are never matched */
-QEMUTimer *timer;
-QemuMutex timer_check_lock;
 } CompareState;
 
 typedef struct CompareClass {
@@ -374,9 +371,7 @@ static void colo_compare_connection(void *opaque, void 
*user_data)
 
 while (!g_queue_is_empty(&conn->primary_list) &&
!g_queue_is_empty(&conn->secondary_list)) {
-qemu_mutex_lock(&s->timer_check_lock);
 pkt = g_queue_pop_tail(&conn->primary_list);
-qemu_mutex_unlock(&s->timer_check_lock);
 switch (conn->ip_proto) {
 case IPPROTO_TCP:
 result = g_queue_find_custom(&conn->secondary_list,
@@ -411,9 +406,7 @@ static void colo_compare_connection(void *opaque, void 
*user_data)
  * until next comparison.
  */
 trace_colo_compare_main("packet different");
-qemu_mutex_lock(&s->timer_check_lock);
 g_queue_push_tail(&conn->primary_list, pkt);
-qemu_mutex_unlock(&s->timer_check_lock);
 /* TODO: colo_notify_checkpoint();*/
 break;
 }
@@ -486,11 +479,26 @@ static void compare_sec_chr_in(void *opaque, const 
uint8_t *buf, int size)
 }
 }
 
+/*
+ * Check old packet regularly so it can watch for any packets
+ * that the secondary hasn't produced equivalents of.
+ */
+static gboolean check_old_packet_regular(void *opaque)
+{
+CompareState *s = opaque;
+
+/* if have old packet we will notify checkpoint */
+colo_old_packet_check(s);
+
+return TRUE;
+}
+
 static void *colo_compare_thread(void *opaque)
 {
 GMainContext *worker_context;
 GMainLoop *compare_loop;
 CompareState *s = opaque;
+GSource *timeout_source;
 
 worker_context = g_main_context_new();
 
@@ -501,8 +509,15 @@ static void *colo_compare_thread(void *opaque)
 
 compare_loop = g_main_loop_new(worker_context, FALSE);
 
+/* To kick any packets that the secondary doesn't match */
+timeout_source = g_timeout_source_new(REGULAR_PACKET_CHECK_MS);
+g_source_set_callback(timeout_source,
+  (GSourceFunc)check_old_packet_regular, s, NULL);
+g_source_attach(timeout_source, worker_context);
+
 g_main_loop_run(compare_loop);
 
+g_source_unref(timeout_source);
 g_main_loop_unref(compare_loop);
 g_main_context_unref(worker_context);
 return NULL;
@@ -604,26 +619,6 @@ static int find_and_check_chardev(Chardev **chr,
 }
 
 /*
- * Check old packet regularly so it can watch for any packets
- * that the secondary hasn't produced equivalents of.
- */
-static void check_old_packet_regular(void *opaque)
-{
-CompareState *s = opaque;
-
-timer_mod(s->timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
-  REGULAR_PACKET_CHECK_MS);
-/* if have old packet we will notify checkpoint */
-/*
- * TODO: Make timer handler run in compare thread
- * like qemu_chr_add_handlers_full.
- */
-qemu_mutex_lock(&s->timer_check_lock);
-colo_old_packet_check(s);
-qemu_mutex_unlock(&s->timer_check_lock);
-}
-
-/*
  * Called from the main thread on the primary
  * to setup colo-compare.
  */
@@ -665,7 +660,6 @@ static void colo_compare_complete(UserCreatable *uc, Error 
**errp)
 net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize);
 
 g_queue_init(&s->conn_list);
-qemu_mutex_init(&s->timer_check_lock);
 
 s->connection_track_table = g_hash_table_new_full(connection_key_hash,
   connection_key_equal,
@@ -678,12 +672,6 @@ static void colo_compare_complete(UserCreatable *uc, Error 
**errp)
QEMU_THREAD_JOINABLE);
 compare_id++;
 
-/* A regular timer to kick any packets that the secondary doesn't match */
-s->timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, /* Only when guest runs */
-check_old_packet_regular, s);
-timer_mod(s->timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
-REGULAR_PACKET_CHECK_MS);
-
 return;
 }
 
@@ -723,12 +711,6 @@ static void colo_compare_finalize(Object *obj)
 qemu_thread_join(&s->thread);
 }
 
-if (s->timer) {
-timer_del(s->timer);
-}
-
-qemu_mutex_destroy(

[Qemu-devel] [PATCH v2 0/4] colo-compare: fix some bugs

2017-02-15 Thread zhanghailiang
This series includes two parts: code optimization and bug fixes.
Patch 1 moves the timer processing into the colo compare thread as
a new timeout source.
Patches 2 to 4 fix some bugs in colo compare.

v2:
 - Squash patch 3 of last version into patch 2. (ZhangChen's suggestion)

zhanghailiang (4):
  colo-compare: use g_timeout_source_new() to process the stale packets
  colo-compare: kick compare thread to exit after some cleanup in
finalization
  char: remove the right fd been watched in qemu_chr_fe_set_handlers()
  colo-compare: Fix removing fds been watched incorrectly in
finalization

 chardev/char-io.c  |  13 --
 chardev/char-io.h  |   2 +
 chardev/char.c |   2 +-
 net/colo-compare.c | 115 +++--
 4 files changed, 71 insertions(+), 61 deletions(-)

-- 
1.8.3.1
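
Patch 1 relies on a repeating timeout whose callback returns TRUE so that it stays armed. The sketch below models that behaviour in plain C; `TimeoutSource`, `timeout_attach()` and `loop_iterate()` are hypothetical names, the 1000 ms interval is only illustrative, and no real main loop is involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Callback type: return true to keep the timeout armed, false to drop it. */
typedef bool (*TimeoutFunc)(void *opaque);

/* Hypothetical stand-in for a timeout source attached to a worker loop. */
typedef struct TimeoutSource {
    unsigned interval_ms;
    unsigned next_due_ms;
    TimeoutFunc fn;
    void *opaque;
    bool attached;
} TimeoutSource;

static void timeout_attach(TimeoutSource *t, unsigned now_ms,
                           unsigned interval_ms, TimeoutFunc fn, void *opaque)
{
    assert(interval_ms > 0);
    assert(fn != NULL);
    t->interval_ms = interval_ms;
    t->next_due_ms = now_ms + interval_ms;
    t->fn = fn;
    t->opaque = opaque;
    t->attached = true;
}

/* One loop iteration at time now_ms: dispatch if due, re-arm on true. */
static void loop_iterate(TimeoutSource *t, unsigned now_ms)
{
    if (!t->attached || now_ms < t->next_due_ms) {
        return;
    }
    if (t->fn(t->opaque)) {
        t->next_due_ms = now_ms + t->interval_ms;
    } else {
        t->attached = false;
    }
}

/* Mirrors the shape of check_old_packet_regular(): do the check, stay armed. */
static bool check_old_packets(void *opaque)
{
    int *checks = opaque;

    (*checks)++;
    return true;
}
```

Because the callback runs in the same loop as the packet handlers, the check and the comparison never run concurrently, which is the argument the series uses for dropping timer_check_lock.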





Re: [Qemu-devel] [PULL 08/41] intel_iommu: support device iotlb descriptor

2017-02-15 Thread Jason Wang



On 2017年02月16日 13:43, Jason Wang wrote:



On 2017年02月16日 13:36, Liu, Yi L wrote:

-Original Message-
From: Qemu-devel 
[mailto:qemu-devel-bounces+yi.l.liu=intel@nongnu.org]

On Behalf Of Michael S. Tsirkin
Sent: Tuesday, January 10, 2017 1:40 PM
To: qemu-devel@nongnu.org
Cc: Peter Maydell ; Eduardo Habkost
; Jason Wang ; Peter Xu
; Paolo Bonzini ; Richard
Henderson 
Subject: [Qemu-devel] [PULL 08/41] intel_iommu: support device iotlb
descriptor

From: Jason Wang 

This patch enables device IOTLB support for intel iommu. The major 
work is to
implement QI device IOTLB descriptor processing and notify the 
device through

iommu notifier.


Hi Jason/Michael,

Recently Peter Xu's patch also touched intel-iommu emulation. His 
patch shadows
second-level page table by capturing iotlb flush from guest. It would 
result in page
table updating in host. Does this patch also use the same map/umap 
API provided

by VFIO?


Yes, it depends on the iommu notifier too.

If it is, then I think it would also update page table in host. It 
looks to be
a duplicate update. Pls refer to the following snapshot captured from 
section 6.5.2.5

of vtd spec.

"Since translation requests from a device may be serviced by hardware 
from the IOTLB, software must
always request IOTLB invalidation (iotlb_inv_dsc) before requesting 
corresponding Device-TLB

(dev_tlb_inv_dsc) invalidation."

Maybe for device-iotlb, we need a separate API which just pass down 
the invalidate

info without updating page table. Any thoughts?


cc Alex.

If we want ATS to be visible for guest (but I'm not sure if VFIO 
support this), we probably need another notifier or a new flag.


Thanks 


Or need a dedicated address_space if ATS were enabled for the device.



Re: [Qemu-devel] [PULL 08/41] intel_iommu: support device iotlb descriptor

2017-02-15 Thread Jason Wang



On 2017年02月16日 13:36, Liu, Yi L wrote:

-Original Message-
From: Qemu-devel [mailto:qemu-devel-bounces+yi.l.liu=intel@nongnu.org]
On Behalf Of Michael S. Tsirkin
Sent: Tuesday, January 10, 2017 1:40 PM
To: qemu-devel@nongnu.org
Cc: Peter Maydell ; Eduardo Habkost
; Jason Wang ; Peter Xu
; Paolo Bonzini ; Richard
Henderson 
Subject: [Qemu-devel] [PULL 08/41] intel_iommu: support device iotlb
descriptor

From: Jason Wang 

This patch enables device IOTLB support for intel iommu. The major work is to
implement QI device IOTLB descriptor processing and notify the device through
iommu notifier.


Hi Jason/Michael,

Recently Peter Xu's patch also touched intel-iommu emulation. His patch shadows
second-level page table by capturing iotlb flush from guest. It would result in 
page
table updating in host. Does this patch also use the same map/umap API provided
by VFIO?


Yes, it depends on the iommu notifier too.


If it is, then I think it would also update page table in host. It looks to be
a duplicate update. Pls refer to the following snapshot captured from section 
6.5.2.5
of vtd spec.

"Since translation requests from a device may be serviced by hardware from the 
IOTLB, software must
always request IOTLB invalidation (iotlb_inv_dsc) before requesting 
corresponding Device-TLB
(dev_tlb_inv_dsc) invalidation."

Maybe for device-iotlb, we need a separate API which just pass down the 
invalidate
info without updating page table. Any thoughts?


cc Alex.

If we want ATS to be visible for guest (but I'm not sure if VFIO support 
this), we probably need another notifier or a new flag.


Thanks



Thanks,
Yi L

Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: Michael S. Tsirkin 
Signed-off-by: Jason Wang 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Peter Xu 
---
  hw/i386/intel_iommu_internal.h | 13 ++-
  include/hw/i386/x86-iommu.h|  1 +
  hw/i386/intel_iommu.c  | 83
+++---
  hw/i386/x86-iommu.c| 17 +
  4 files changed, 107 insertions(+), 7 deletions(-)

diff --git a/hw/i386/intel_iommu_internal.h
b/hw/i386/intel_iommu_internal.h index 11abfa2..356f188 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -183,6 +183,7 @@
  /* (offset >> 4) << 8 */
  #define VTD_ECAP_IRO(DMAR_IOTLB_REG_OFFSET << 4)
  #define VTD_ECAP_QI (1ULL << 1)
+#define VTD_ECAP_DT (1ULL << 2)
  /* Interrupt Remapping support */
  #define VTD_ECAP_IR (1ULL << 3)
  #define VTD_ECAP_EIM(1ULL << 4)
@@ -326,6 +327,7 @@ typedef union VTDInvDesc VTDInvDesc;
  #define VTD_INV_DESC_TYPE   0xf
  #define VTD_INV_DESC_CC 0x1 /* Context-cache Invalidate Desc 
*/
  #define VTD_INV_DESC_IOTLB  0x2
+#define VTD_INV_DESC_DEVICE 0x3
  #define VTD_INV_DESC_IEC0x4 /* Interrupt Entry Cache
 Invalidate Descriptor */
  #define VTD_INV_DESC_WAIT   0x5 /* Invalidation Wait Descriptor */
@@ -361,6 +363,13 @@ typedef union VTDInvDesc VTDInvDesc;
  #define VTD_INV_DESC_IOTLB_RSVD_LO  0xff00ULL
  #define VTD_INV_DESC_IOTLB_RSVD_HI  0xf80ULL

+/* Mask for Device IOTLB Invalidate Descriptor */
+#define VTD_INV_DESC_DEVICE_IOTLB_ADDR(val) ((val) & 0xf000ULL)
+#define VTD_INV_DESC_DEVICE_IOTLB_SIZE(val) ((val) & 0x1)
+#define VTD_INV_DESC_DEVICE_IOTLB_SID(val) (((val) >> 32) & 0xULL)
+#define VTD_INV_DESC_DEVICE_IOTLB_RSVD_HI 0xffeULL
+#define VTD_INV_DESC_DEVICE_IOTLB_RSVD_LO 0xffe0fff8
+
  /* Information about page-selective IOTLB invalidate */
  struct VTDIOTLBPageInvInfo {
  uint16_t domain_id;
@@ -399,8 +408,8 @@ typedef struct VTDRootEntry VTDRootEntry;
  #define VTD_CONTEXT_ENTRY_FPD   (1ULL << 1) /* Fault Processing Disable
*/
  #define VTD_CONTEXT_ENTRY_TT(3ULL << 2) /* Translation Type */
  #define VTD_CONTEXT_TT_MULTI_LEVEL  0
-#define VTD_CONTEXT_TT_DEV_IOTLB1
-#define VTD_CONTEXT_TT_PASS_THROUGH 2
+#define VTD_CONTEXT_TT_DEV_IOTLB(1ULL << 2)
+#define VTD_CONTEXT_TT_PASS_THROUGH (2ULL << 2)
  /* Second Level Page Translation Pointer*/
  #define VTD_CONTEXT_ENTRY_SLPTPTR   (~0xfffULL)
  #define VTD_CONTEXT_ENTRY_RSVD_LO   (0xff0ULL | ~VTD_HAW_MASK)
diff --git a/include/hw/i386/x86-iommu.h b/include/hw/i386/x86-iommu.h
index 0c89d98..361c07c 100644
--- a/include/hw/i386/x86-iommu.h
+++ b/include/hw/i386/x86-iommu.h
@@ -73,6 +73,7 @@ typedef struct IEC_Notifier IEC_Notifier;  struct
X86IOMMUState {
  SysBusDevice busdev;
  bool intr_supported;/* Whether vIOMMU supports IR */
+bool dt_supported;  /* Whether vIOMMU supports DT */
  IommuType type; /* IOMMU type - AMD/Intel */
  QLIST_HEAD(, IEC_Notifier) iec_notifiers; /* IEC notify list */  }; diff 
--git
a/hw/i386/intel

Re: [Qemu-devel] [PULL 08/41] intel_iommu: support device iotlb descriptor

2017-02-15 Thread Liu, Yi L
> -Original Message-
> From: Qemu-devel [mailto:qemu-devel-bounces+yi.l.liu=intel@nongnu.org]
> On Behalf Of Michael S. Tsirkin
> Sent: Tuesday, January 10, 2017 1:40 PM
> To: qemu-devel@nongnu.org
> Cc: Peter Maydell ; Eduardo Habkost
> ; Jason Wang ; Peter Xu
> ; Paolo Bonzini ; Richard
> Henderson 
> Subject: [Qemu-devel] [PULL 08/41] intel_iommu: support device iotlb
> descriptor
> 
> From: Jason Wang 
> 
> This patch enables device IOTLB support for intel iommu. The major work is to
> implement QI device IOTLB descriptor processing and notify the device through
> iommu notifier.
>
Hi Jason/Michael,

Recently Peter Xu's patch also touched intel-iommu emulation. His patch shadows
second-level page table by capturing iotlb flush from guest. It would result in 
page
table updating in host. Does this patch also use the same map/umap API provided
by VFIO? If it is, then I think it would also update page table in host. It 
looks to be
a duplicate update. Pls refer to the following snapshot captured from section 
6.5.2.5
of vtd spec. 

"Since translation requests from a device may be serviced by hardware from the 
IOTLB, software must
always request IOTLB invalidation (iotlb_inv_dsc) before requesting 
corresponding Device-TLB
(dev_tlb_inv_dsc) invalidation."

Maybe for device-iotlb, we need a separate API which just pass down the 
invalidate
info without updating page table. Any thoughts?

Thanks,
Yi L
> Cc: Paolo Bonzini 
> Cc: Richard Henderson 
> Cc: Eduardo Habkost 
> Cc: Michael S. Tsirkin 
> Signed-off-by: Jason Wang 
> Reviewed-by: Michael S. Tsirkin 
> Signed-off-by: Michael S. Tsirkin 
> Reviewed-by: Peter Xu 
> ---
>  hw/i386/intel_iommu_internal.h | 13 ++-
>  include/hw/i386/x86-iommu.h|  1 +
>  hw/i386/intel_iommu.c  | 83
> +++---
>  hw/i386/x86-iommu.c| 17 +
>  4 files changed, 107 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/i386/intel_iommu_internal.h
> b/hw/i386/intel_iommu_internal.h index 11abfa2..356f188 100644
> --- a/hw/i386/intel_iommu_internal.h
> +++ b/hw/i386/intel_iommu_internal.h
> @@ -183,6 +183,7 @@
>  /* (offset >> 4) << 8 */
>  #define VTD_ECAP_IRO(DMAR_IOTLB_REG_OFFSET << 4)
>  #define VTD_ECAP_QI (1ULL << 1)
> +#define VTD_ECAP_DT (1ULL << 2)
>  /* Interrupt Remapping support */
>  #define VTD_ECAP_IR (1ULL << 3)
>  #define VTD_ECAP_EIM(1ULL << 4)
> @@ -326,6 +327,7 @@ typedef union VTDInvDesc VTDInvDesc;
>  #define VTD_INV_DESC_TYPE   0xf
>  #define VTD_INV_DESC_CC 0x1 /* Context-cache Invalidate Desc 
> */
>  #define VTD_INV_DESC_IOTLB  0x2
> +#define VTD_INV_DESC_DEVICE 0x3
>  #define VTD_INV_DESC_IEC0x4 /* Interrupt Entry Cache
> Invalidate Descriptor */
>  #define VTD_INV_DESC_WAIT   0x5 /* Invalidation Wait Descriptor 
> */
> @@ -361,6 +363,13 @@ typedef union VTDInvDesc VTDInvDesc;
>  #define VTD_INV_DESC_IOTLB_RSVD_LO  0xff00ULL
>  #define VTD_INV_DESC_IOTLB_RSVD_HI  0xf80ULL
> 
> +/* Mask for Device IOTLB Invalidate Descriptor */
> +#define VTD_INV_DESC_DEVICE_IOTLB_ADDR(val) ((val) & 0xf000ULL)
> +#define VTD_INV_DESC_DEVICE_IOTLB_SIZE(val) ((val) & 0x1)
> +#define VTD_INV_DESC_DEVICE_IOTLB_SID(val) (((val) >> 32) & 0xULL)
> +#define VTD_INV_DESC_DEVICE_IOTLB_RSVD_HI 0xffeULL
> +#define VTD_INV_DESC_DEVICE_IOTLB_RSVD_LO 0xffe0fff8
> +
>  /* Information about page-selective IOTLB invalidate */
>  struct VTDIOTLBPageInvInfo {
>  uint16_t domain_id;
> @@ -399,8 +408,8 @@ typedef struct VTDRootEntry VTDRootEntry;
>  #define VTD_CONTEXT_ENTRY_FPD   (1ULL << 1) /* Fault Processing Disable
> */
>  #define VTD_CONTEXT_ENTRY_TT(3ULL << 2) /* Translation Type */
>  #define VTD_CONTEXT_TT_MULTI_LEVEL  0
> -#define VTD_CONTEXT_TT_DEV_IOTLB1
> -#define VTD_CONTEXT_TT_PASS_THROUGH 2
> +#define VTD_CONTEXT_TT_DEV_IOTLB(1ULL << 2)
> +#define VTD_CONTEXT_TT_PASS_THROUGH (2ULL << 2)
>  /* Second Level Page Translation Pointer*/
>  #define VTD_CONTEXT_ENTRY_SLPTPTR   (~0xfffULL)
>  #define VTD_CONTEXT_ENTRY_RSVD_LO   (0xff0ULL | ~VTD_HAW_MASK)
> diff --git a/include/hw/i386/x86-iommu.h b/include/hw/i386/x86-iommu.h
> index 0c89d98..361c07c 100644
> --- a/include/hw/i386/x86-iommu.h
> +++ b/include/hw/i386/x86-iommu.h
> @@ -73,6 +73,7 @@ typedef struct IEC_Notifier IEC_Notifier;  struct
> X86IOMMUState {
>  SysBusDevice busdev;
>  bool intr_supported;/* Whether vIOMMU supports IR */
> +bool dt_supported;  /* Whether vIOMMU supports DT */
>  IommuType type; /* IOMMU type - AMD/Intel */
>  QLIST_HEAD(, IEC_Notifier) iec_notifiers; /* IEC notify list */  }; diff 
> --git
> a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index e39b764..ec62239
> 100644
> --- a/hw/i386/int

Re: [Qemu-devel] [PATCH 3/5] colo-compare: release all unhandled packets in finalize function

2017-02-15 Thread Hailiang Zhang


On 2017/2/16 10:27, Zhang Chen wrote:



On 02/15/2017 04:34 PM, zhanghailiang wrote:

We should release all unhandled packets before finalize colo compare.
Besides, we need to free connection_track_table, or there will be
a memory leak bug.

Signed-off-by: zhanghailiang
---
   net/colo-compare.c | 20 
   1 file changed, 20 insertions(+)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index a16e2d5..809bad3 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -676,6 +676,23 @@ static void colo_compare_complete(UserCreatable *uc, Error 
**errp)
   return;
   }



In my patch "colo-compare and filter-rewriter work with colo-frame",
this function is named 'colo_flush_connection'. I think using 'flush'
instead of 'release' is better.



OK, I will fix it in the next version, thanks.


Thanks
Zhang Chen



+static void colo_release_packets(void *opaque, void *user_data)
+{
+CompareState *s = user_data;
+Connection *conn = opaque;
+Packet *pkt = NULL;
+
+while (!g_queue_is_empty(&conn->primary_list)) {
+pkt = g_queue_pop_head(&conn->primary_list);
+compare_chr_send(&s->chr_out, pkt->data, pkt->size);
+packet_destroy(pkt, NULL);
+}
+while (!g_queue_is_empty(&conn->secondary_list)) {
+pkt = g_queue_pop_head(&conn->secondary_list);
+packet_destroy(pkt, NULL);
+}
+}
+
   static void colo_compare_class_init(ObjectClass *oc, void *data)
   {
   UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
@@ -707,9 +724,12 @@ static void colo_compare_finalize(Object *obj)
   g_main_loop_quit(s->compare_loop);
   qemu_thread_join(&s->thread);

+/* Release all unhandled packets after the compare thread has exited */
+g_queue_foreach(&s->conn_list, colo_release_packets, s);

   g_queue_free(&s->conn_list);

+g_hash_table_destroy(s->connection_track_table);
   g_free(s->pri_indev);
   g_free(s->sec_indev);
   g_free(s->outdev);







Re: [Qemu-devel] [PATCH V7 2/2] Add a new qmp command to do checkpoint, query xen replication status

2017-02-15 Thread Zhang Chen



On 02/16/2017 01:08 PM, Jason Wang wrote:



On 2017年02月16日 11:25, Zhang Chen wrote:

Ping...

No news for a long time.

Who can pick up this patch?



I believe you'd better cc migration maintainers (cced), have you tried 
scripts/get_maintainer ?


Thanks Jason.
Add cc Markus Armbruster ,
   Amit Shah ,
   zhanghailiang 




Thanks



Thanks

Zhang Chen


On 02/14/2017 04:28 AM, Stefano Stabellini wrote:

On Wed, 8 Feb 2017, Eric Blake wrote:

On 02/07/2017 11:24 PM, Zhang Chen wrote:

We can call this qmp command to do checkpoint outside of qemu.
Xen colo will need this function.

Signed-off-by: Zhang Chen 
Signed-off-by: Wen Congyang 
---
  migration/colo.c | 17 
  qapi-schema.json | 60 


  2 files changed, 77 insertions(+)


Reviewed-by: Eric Blake 

Given that the series is all acked, are you going to take care of the
pull request?












--
Thanks
Zhang Chen






Re: [Qemu-devel] [PATCH 5/6] target-ppc: support for 32-bit carry and overflow

2017-02-15 Thread Nikunj A Dadhania
Richard Henderson  writes:

> On 02/14/2017 02:05 PM, Nikunj A Dadhania wrote:
>> Yes, you are right. I had a discussion with Paul Mackerras yesterday, he
>> explained to me in detail about the bits. I am working on the revised
>> implementation. Will detail it in the commit message.
>
> As you're working on this, consider changing the definition of cpu_ov such 
> that 
> the MSB is OV and bit 31 is OV32.
>
> E.g.
>
>
>   static inline void gen_op_arith_compute_ov(DisasContext *ctx, TCGv arg0,
>  TCGv arg1, TCGv arg2, int sub)
>   {
>   TCGv t0 = tcg_temp_new();
>
>   tcg_gen_xor_tl(cpu_ov, arg0, arg2);
>   tcg_gen_xor_tl(t0, arg1, arg2);
>   if (sub) {
>   tcg_gen_and_tl(cpu_ov, cpu_ov, t0);
>   } else {
>   tcg_gen_andc_tl(cpu_ov, cpu_ov, t0);
>   }
>   tcg_temp_free(t0);
>   if (NARROW_MODE(ctx)) {
>   tcg_gen_ext32s_tl(cpu_ov, cpu_ov);
>   }
> -tcg_gen_shri_tl(cpu_ov, cpu_ov, TARGET_LONG_BITS - 1);
>   tcg_gen_or_tl(cpu_so, cpu_so, cpu_ov);
>   }
>
>
> is all that is required for arithmetic to compute OV and OV32 into those two 
> bits.

How about the below?

@@ -809,10 +809,11 @@ static inline void gen_op_arith_compute_ov(DisasContext 
*ctx, TCGv arg0,
 tcg_gen_andc_tl(cpu_ov, cpu_ov, t0);
 }
 tcg_temp_free(t0);
+tcg_gen_extract_tl(cpu_ov32, cpu_ov, 31, 1);
+tcg_gen_extract_tl(cpu_ov, cpu_ov, 63, 1);
 if (NARROW_MODE(ctx)) {
-tcg_gen_ext32s_tl(cpu_ov, cpu_ov);
+tcg_gen_mov_tl(cpu_ov, cpu_ov32);
 }
-tcg_gen_shri_tl(cpu_ov, cpu_ov, TARGET_LONG_BITS - 1);
 tcg_gen_or_tl(cpu_so, cpu_so, cpu_ov);
 }

Regards
Nikunj
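
For reference, the bit arithmetic discussed above can be checked on plain 64-bit integers. This is a sketch under stated assumptions, not the TCG code: `compute_ov()` and `OvFlags` are hypothetical, subtraction is assumed to compute arg2 - arg1, and narrow mode is modelled as in the proposal above by copying OV32 into OV.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct OvFlags {
    bool ov;    /* overflow of the 64-bit operation (bit 63) */
    bool ov32;  /* overflow of the low 32-bit operation (bit 31) */
} OvFlags;

/*
 * Hypothetical model of the overflow computation:
 *   add: (res ^ arg2) & ~(arg1 ^ arg2)
 *   sub: (res ^ arg2) &  (arg1 ^ arg2), assuming res = arg2 - arg1
 * Bit 63 of that value is OV and bit 31 is OV32.
 */
static OvFlags compute_ov(uint64_t arg1, uint64_t arg2, bool sub, bool narrow)
{
    uint64_t res = sub ? arg2 - arg1 : arg1 + arg2;
    uint64_t v = res ^ arg2;
    uint64_t t = arg1 ^ arg2;
    OvFlags f;

    v = sub ? (v & t) : (v & ~t);
    f.ov32 = (v >> 31) & 1;
    f.ov = (v >> 63) & 1;
    if (narrow) {
        /* In narrow (32-bit) mode OV follows the 32-bit result. */
        f.ov = f.ov32;
    }
    return f;
}
```

Adding 1 to 0x7fffffff sets only OV32, while adding 1 to 0x7fffffffffffffff sets only OV, so a single intermediate value does carry both flags in bits 31 and 63.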




Re: [Qemu-devel] [PATCH V7 2/2] Add a new qmp command to do checkpoint, query xen replication status

2017-02-15 Thread Jason Wang



On 2017年02月16日 11:25, Zhang Chen wrote:

Ping...

No news for a long time.

Who can pick up this patch?



I believe you'd better cc migration maintainers (cced), have you tried 
scripts/get_maintainer ?


Thanks



Thanks

Zhang Chen


On 02/14/2017 04:28 AM, Stefano Stabellini wrote:

On Wed, 8 Feb 2017, Eric Blake wrote:

On 02/07/2017 11:24 PM, Zhang Chen wrote:

We can call this qmp command to do checkpoint outside of qemu.
Xen colo will need this function.

Signed-off-by: Zhang Chen 
Signed-off-by: Wen Congyang 
---
  migration/colo.c | 17 
  qapi-schema.json | 60 


  2 files changed, 77 insertions(+)


Reviewed-by: Eric Blake 

Given that the series is all acked, are you going to take care of the
pull request?










[Qemu-devel] [PATCH v2] pcie: simplify pcie_add_capability()

2017-02-15 Thread Peter Xu
When we add PCIe extended capabilities, we should be following the rule
that we add the head extended cap (at offset 0x100) first, then the rest
of them. Meanwhile, we are always adding new capability bits at the end
of the list. Here the "next" looks meaningless in all cases since it
should always be zero (along with the "header").

Simplify the function a bit, and it looks more readable now.

Signed-off-by: Peter Xu 
---
v2:
- rebased to mst's patch
  "pci/pcie: don't assume cap id 0 is reserved"
- avoid having side-effect code in assertion. [Marcel]
  (I removed it directly since after mst's fix it would never return
   nonzero now)
---
 hw/pci/pcie.c | 14 +++---
 1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index f4dd177..fc54bfd 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -665,32 +665,24 @@ void pcie_add_capability(PCIDevice *dev,
  uint16_t cap_id, uint8_t cap_ver,
  uint16_t offset, uint16_t size)
 {
-uint32_t header;
-uint16_t next;
-
 assert(offset >= PCI_CONFIG_SPACE_SIZE);
 assert(offset < offset + size);
 assert(offset + size <= PCIE_CONFIG_SPACE_SIZE);
 assert(size >= 8);
 assert(pci_is_express(dev));
 
-if (offset == PCI_CONFIG_SPACE_SIZE) {
-header = pci_get_long(dev->config + offset);
-next = PCI_EXT_CAP_NEXT(header);
-} else {
+if (offset != PCI_CONFIG_SPACE_SIZE) {
 uint16_t prev;
 
 /*
  * 0x is not a valid cap id (it's a 16 bit field). use
  * internally to find the last capability in the linked list.
  */
-next = pcie_find_capability_list(dev, 0x, &prev);
-
+pcie_find_capability_list(dev, 0x, &prev);
 assert(prev >= PCI_CONFIG_SPACE_SIZE);
-assert(next == 0);
 pcie_ext_cap_set_next(dev, prev, offset);
 }
-pci_set_long(dev->config + offset, PCI_EXT_CAP(cap_id, cap_ver, next));
+pci_set_long(dev->config + offset, PCI_EXT_CAP(cap_id, cap_ver, 0));
 
 /* Make capability read-only by default */
 memset(dev->wmask + offset, 0, size);
-- 
2.7.4
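
The reasoning in the commit message can be illustrated with a small standalone sketch. It is not QEMU code: the helper names (cfg_get_long, cfg_set_long, find_last_cap, add_cap) and the flat byte array standing in for config space are assumptions made for illustration. It only shows why a capability appended at the tail can always be written with a zero "next" pointer, while the previous tail is patched to point at it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative constants, defined locally rather than taken from QEMU. */
#define CFG_SIZE     4096u  /* size of PCIe extended config space */
#define EXT_CAP_BASE 0x100u /* offset of the first extended capability */

/* Extended capability header layout: ID[15:0], version[19:16], next[31:20]. */
static uint32_t ext_cap_header(uint16_t id, uint8_t ver, uint16_t next)
{
    return (uint32_t)id | ((uint32_t)(ver & 0xf) << 16) |
           ((uint32_t)(next & 0xffc) << 20);
}

static uint16_t ext_cap_next(uint32_t header)
{
    return (header >> 20) & 0xffc;
}

/* Little-endian 32-bit accessors on the config space byte array. */
static uint32_t cfg_get_long(const uint8_t *cfg, uint16_t off)
{
    return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
           ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

static void cfg_set_long(uint8_t *cfg, uint16_t off, uint32_t val)
{
    cfg[off] = val & 0xff;
    cfg[off + 1] = (val >> 8) & 0xff;
    cfg[off + 2] = (val >> 16) & 0xff;
    cfg[off + 3] = (val >> 24) & 0xff;
}

/* Walk the chain from the head and return the offset of the last header. */
static uint16_t find_last_cap(const uint8_t *cfg)
{
    uint16_t off = EXT_CAP_BASE;
    uint16_t next;

    while ((next = ext_cap_next(cfg_get_long(cfg, off))) != 0) {
        off = next;
    }
    return off;
}

/* Append a capability: the new header's next pointer is always zero. */
static void add_cap(uint8_t *cfg, uint16_t id, uint8_t ver, uint16_t offset)
{
    assert(offset >= EXT_CAP_BASE && offset + 4u <= CFG_SIZE);
    assert((offset & 3) == 0);

    if (offset != EXT_CAP_BASE) {
        /* Not the head: link the current tail to the new capability. */
        uint16_t prev = find_last_cap(cfg);
        uint32_t hdr = cfg_get_long(cfg, prev);

        cfg_set_long(cfg, prev, (hdr & 0x000fffffu) | ((uint32_t)offset << 20));
    }
    cfg_set_long(cfg, offset, ext_cap_header(id, ver, 0));
}
```

In this sketch the head case needs no special handling beyond skipping the list walk, which mirrors the shape of the simplified function in the patch.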




Re: [Qemu-devel] [PATCH] pci/pcie: don't assume cap id 0 is reserved

2017-02-15 Thread Peter Xu
On Thu, Feb 16, 2017 at 11:04:46AM +0800, Peter Xu wrote:
> On Wed, Feb 15, 2017 at 07:52:35PM -0700, Alex Williamson wrote:
> > On Thu, 16 Feb 2017 10:35:28 +0800
> > Peter Xu  wrote:
> > 
> > > On Wed, Feb 15, 2017 at 10:49:47PM +0200, Michael S. Tsirkin wrote:
> > > > VFIO actually wants to create a capability with ID == 0.
> > > > This is done to make guest drivers skip the given capability.
> > > > pcie_add_capability then trips up on this capability
> > > > when looking for end of capability list.
> > > > 
> > > > To support this use-case, it's easy enough to switch to
> > > > e.g. 0x for these comparisons - we can be sure
> > > > it will never match a 16-bit capability ID.
> > > > 
> > > > Signed-off-by: Michael S. Tsirkin   
> > > 
> > > Reviewed-by: Peter Xu 
> > > 
> > > Two nits:
> > > 
> > > (1) maybe we can s/0x/0x/ in the whole patch since ecap_id
> > > is 16 bits
> > 
> > The former is used because it's beyond the address space of a valid
> > capability.  Using 0x just makes the situation different, not
> > better.
> 
> But isn't pcie_find_capability_list() defining cap_id parameter as
> uint16_t? In that case, 0x will be the same as 0x since
> we'll just take the lower 16 bits?

Alex helped point out that this patch also changed the type of the
parameter, which I hadn't noticed. Sorry. :(

Please take my r-b and ignore the two nits. Thanks,

-- peterx
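
For readers following the thread, the point about the parameter type can be sketched in a few lines of C. This is only an illustration, not the QEMU function: the sentinel value below is a hypothetical stand-in (the actual constant is truncated in this archive), and matches_narrow / matches_wide are made-up names. With a 16-bit parameter the out-of-range sentinel is cut down to its low 16 bits and can collide with a real ID; with a 32-bit parameter it stays out of range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sentinel: any value that does not fit in 16 bits. */
#define END_OF_LIST_SENTINEL 0x10000u

/*
 * Narrow parameter: a caller passing the sentinel has it truncated to its
 * low 16 bits, so it may compare equal to a real capability ID.
 */
static bool matches_narrow(uint16_t cap_id, uint16_t id_in_header)
{
    return cap_id == id_in_header;
}

/*
 * Wide parameter: the sentinel survives intact and can never compare
 * equal to a 16-bit ID read from a capability header.
 */
static bool matches_wide(uint32_t cap_id, uint16_t id_in_header)
{
    return cap_id == id_in_header;
}
```

With this particular stand-in value the truncated sentinel becomes 0, which is exactly the ID that VFIO wants to be able to place in the list.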



Re: [Qemu-devel] [RFC] virtio-pci: Allow PCIe virtio devices on root bus

2017-02-15 Thread David Gibson
On Thu, Feb 16, 2017 at 01:48:42PM +1100, David Gibson wrote:
> On Wed, Feb 15, 2017 at 04:59:33PM +0200, Marcel Apfelbaum wrote:
> > On 02/15/2017 03:45 AM, David Gibson wrote:
> > > On Tue, Feb 14, 2017 at 02:53:08PM +0200, Marcel Apfelbaum wrote:
> > > > On 02/14/2017 06:15 AM, David Gibson wrote:
> > > > > On Mon, Feb 13, 2017 at 12:14:23PM +0200, Marcel Apfelbaum wrote:
> > > > > > On 02/13/2017 06:33 AM, David Gibson wrote:
> > > > > > > On Sun, Feb 12, 2017 at 09:05:46PM +0200, Marcel Apfelbaum wrote:
> > > > > > > > On 02/10/2017 02:37 AM, David Gibson wrote:
> > > > > > > > > On Thu, Feb 09, 2017 at 10:04:47AM +0100, Laszlo Ersek wrote:
> > > > > > > > > > On 02/09/17 05:16, David Gibson wrote:
> > > > > > > > > > > On Wed, Feb 08, 2017 at 11:40:50AM +0100, Laszlo Ersek 
> > > > > > > > > > > wrote:
> > > > > > > > > > > > On 02/08/17 07:16, David Gibson wrote:
[snip]
> > > >   Which means that you can use it to
> > > > > drive PCIe devices just fine.  "Bus level" PCIe extensions like AER
> > > > > and PCIe standard hotplug won't work, but PAPR has its own mechanisms
> > > > > for those (common between PCI and PCIe).
> > > > > 
> > > > > I did float the idea of having the pseries PCI bus remain plain PCI
> > > > > but with a special flag to allow PCIe devices to be attached to it
> > > > > anyway.  It wasn't greeted with much enthusiasm..
> > > > > 
> > > > 
> > > > Can you point me to the discussion please? It seems similar to what
> > > > I proposed above.
> > > 
> > > Sorry, I was misleading.  I think I just raised that idea with Andrea
> > > and a few other people internally, not on one of the lists at large.
> > > 
> > > > As you properly described it, it is much closer to PCI than PCIe. Even
> > > > the only characteristic that makes it "a little" PCIe, the Extended
> > > > Configuration Space support, is done with an alternative interface.
> > > > 
> > > > I agree the PAPR bus is not PCIe.
> > > 
> > > Ok, so if we take that direction, the question becomes how do we let
> > > PCIe devices plug into this mostly-not-PCIe bus.  Maybe introduce a
> > > "pci_bus_accepts_express()" function that will replace many, but not
> > > all current uses of "pci_bus_is_express()"?
> > > 
> > 
> > Sounds good and I think Eduardo is already working on exactly this
> > idea, however he is on PTO now. It is better to synchronize with him.
> 
> Ah, right.  Do you know when he'll be back?  This is semi-urgent for
> Power.
> 
> 
> > > Such a helper could maybe simplify the logic in virtio-pci (and XHCI?)
> > > by returning false on an x86 root bus.
> > > 
> > 
> > > The rule would be more complicated. We don't want to completely remove the
> > > possibility of having PCIe devices as part of the Root Complex (it seems
> > > like I am contradicting myself, but no).
> > > This is why we have guidelines and not hard-coded policies.
> > > Also, the QEMU way is to be more permissive. We provide guidelines and sane
> > > defaults, but we let the user choose.
> > 
> > Getting back to our problem, the rule would be:
> > hybrid devices should be PCI or PCIe for a bus?
> > PAPR bus should return 'PCIe' for hybrid devices.
> > X86 bus should return 'PCIe' if not root.
> 
> Ok.

Wait, actually.. we have two possible directions to go, both of which
have been mentioned in the thread, but I don't think we've settled on
one:

1) Have pseries create a PCIe bus (as my first cut draft does).

That should allow pure PCIe devices to appear either under a port or
(more usually for PAPR) as "integrated endpoints".  In addition we'd
need as suggested above a "pcie_hybrid_type()" function that would
tell hybrid devices to also appear as PCIe rather than PCI.

2) Have pseries create a vanilla PCI bus (or a special PAPR PCI
   variant)

Appearing as vanilla PCI would in a number of ways more closely match
the way PCI buses are handled on PAPR.  However, we still need to
connect PCIe devices to it.  So we'd need some 'bus_accepts_pcie()'
hook and use that (in place of pci_bus_is_express()) to determine both
whether we can attach pure PCIe devices and that hybrid devices should
appear as PCIe rather than plain PCI.


Based on the immediately preceding discussion, I was leaning towards
(2).  Is that your feeling as well?
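
To make option (2) concrete, here is a rough standalone sketch of how such hooks might behave. None of this is existing QEMU API: the Bus and DevKind types and the three functions are hypothetical, and the x86 rule ("PCIe if not root") is simply the default Marcel described earlier in the thread, which a user could still override.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bus and device classifications, for illustration only. */
typedef enum { BUS_PLAIN_PCI, BUS_PCIE, BUS_PAPR_PCI } BusKind;
typedef enum { DEV_PCI, DEV_PCIE, DEV_HYBRID } DevKind;

typedef struct {
    BusKind kind;
    bool is_root;
} Bus;

/* Can a pure PCIe device be plugged into this bus at all? */
static bool bus_accepts_pcie(const Bus *bus)
{
    assert(bus != NULL);
    return bus->kind == BUS_PCIE || bus->kind == BUS_PAPR_PCI;
}

/* Should a hybrid device (e.g. virtio) present itself as PCIe here? */
static bool hybrid_appears_as_pcie(const Bus *bus)
{
    assert(bus != NULL);
    if (bus->kind == BUS_PAPR_PCI) {
        return true;                /* PAPR: PCIe even on the root bus */
    }
    /* Default elsewhere: PCIe only below a root or downstream port. */
    return bus->kind == BUS_PCIE && !bus->is_root;
}

/* Plain PCI and hybrid devices attach anywhere; pure PCIe needs the hook. */
static bool can_attach(const Bus *bus, DevKind dev)
{
    return dev != DEV_PCIE || bus_accepts_pcie(bus);
}
```

Splitting "accepts PCIe" from "hybrid appears as PCIe" is a choice made for this sketch; the proposal above uses a single hook for both questions.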

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH V7 2/2] Add a new qmp command to do checkpoint, query xen replication status

2017-02-15 Thread Zhang Chen

Ping...

No news for a long time.

Who can pick up this patch?


Thanks

Zhang Chen


On 02/14/2017 04:28 AM, Stefano Stabellini wrote:

On Wed, 8 Feb 2017, Eric Blake wrote:

On 02/07/2017 11:24 PM, Zhang Chen wrote:

We can call this qmp command to do checkpoint outside of qemu.
Xen colo will need this function.

Signed-off-by: Zhang Chen 
Signed-off-by: Wen Congyang 
---
  migration/colo.c | 17 
  qapi-schema.json | 60 
  2 files changed, 77 insertions(+)


Reviewed-by: Eric Blake 

Given that the series is all acked, are you going to take care of the
pull request?


.



--
Thanks
Zhang Chen






Re: [Qemu-devel] [PATCH v7 2/2] block/vxhs.c: Add qemu-iotests for new block device type "vxhs"

2017-02-15 Thread ashish mittal
Sorry, pressed the "send" button instead of "expand text" on the
previous email ...

On Mon, Feb 13, 2017 at 6:43 AM, Stefan Hajnoczi  wrote:
> On Tue, Feb 07, 2017 at 08:18:14PM -0800, Ashish Mittal wrote:
>> diff --git a/tests/qemu-iotests/common.config 
>> b/tests/qemu-iotests/common.config
>> index f6384fb..c7a80c0 100644
>> --- a/tests/qemu-iotests/common.config
>> +++ b/tests/qemu-iotests/common.config
>> @@ -105,6 +105,10 @@ if [ -z "$QEMU_NBD_PROG" ]; then
>>  export QEMU_NBD_PROG="`set_prog_path qemu-nbd`"
>>  fi
>>
>> +if [ -z "$QEMU_VXHS_PROG" ]; then
>> +export QEMU_VXHS_PROG="`set_prog_path qnio_server /usr/local/bin`"
>
> Did you test this with /usr/local/bin/qnio_server?
>
> I think it will evaluate to QEMU_VXHS_PROG=/usr/local/bin when qnio_server
> isn't found in PATH.  You probably wanted /usr/local/bin/qnio_server instead.
>
> I suggest dropping the second argument completely and letting the user set
> PATH themselves.  No existing set_prog_path caller uses the second argument.
>

You're right. Will drop the second argument.

> # $1 = prog to look for, $2* = default pathnames if not found in $PATH
> set_prog_path()
> {
> p=`command -v $1 2> /dev/null`
> if [ -n "$p" -a -x "$p" ]; then
> echo $p
> return 0
> fi
> p=$1
>
> shift
> for f; do
> if [ -x $f ]; then
> echo $f
> return 0
> fi
> done
>
> echo ""
> return 1
> }
>
>> diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
>> index 3213765..06a3164 100644
>> --- a/tests/qemu-iotests/common.rc
>> +++ b/tests/qemu-iotests/common.rc
>> @@ -89,6 +89,9 @@ else
>>  TEST_IMG=$TEST_DIR/t.$IMGFMT
>>  elif [ "$IMGPROTO" = "archipelago" ]; then
>>  TEST_IMG="archipelago:at.$IMGFMT"
>> +elif [ "$IMGPROTO" = "vxhs" ]; then
>> +TEST_IMG_FILE=$TEST_DIR/t.$IMGFMT
>> +TEST_IMG="vxhs://127.0.0.1:/t.$IMGFMT"
>>  else
>>  TEST_IMG=$IMGPROTO:$TEST_DIR/t.$IMGFMT
>>  fi
>> @@ -175,6 +178,12 @@ _make_test_img()
>>  eval "$QEMU_NBD -v -t -b 127.0.0.1 -p 10810 -f $IMGFMT  
>> $TEST_IMG_FILE &"
>>  sleep 1 # FIXME: qemu-nbd needs to be listening before we continue
>>  fi
>> +
>> +# Start QNIO server on image directory for vxhs protocol
>> +if [ $IMGPROTO = "vxhs" ]; then
>> +eval "$QEMU_VXHS -d  $TEST_DIR &"
>> +sleep 1 # Wait for server to come up.
>
> This is a pre-existing problem and you don't need to fix it now:
>
> We should replace sleep 1 with a function that probes the TCP port until the
> connection can be established or a timeout is reached.  The netcat (nc)
> utility is often used for this.
>
> sleep 1 is not reliable and may fail on a heavily loaded machine like the
> Travis-CI build machines that are used.

Will skip this one for now.

Thanks,
Ashish



Re: [Qemu-devel] [PATCH] pci/pcie: don't assume cap id 0 is reserved

2017-02-15 Thread Peter Xu
On Wed, Feb 15, 2017 at 07:52:35PM -0700, Alex Williamson wrote:
> On Thu, 16 Feb 2017 10:35:28 +0800
> Peter Xu  wrote:
> 
> > On Wed, Feb 15, 2017 at 10:49:47PM +0200, Michael S. Tsirkin wrote:
> > > VFIO actually wants to create a capability with ID == 0.
> > > This is done to make guest drivers skip the given capability.
> > > pcie_add_capability then trips up on this capability
> > > when looking for end of capability list.
> > > 
> > > To support this use-case, it's easy enough to switch to
> > > e.g. 0x for these comparisons - we can be sure
> > > it will never match a 16-bit capability ID.
> > > 
> > > Signed-off-by: Michael S. Tsirkin   
> > 
> > Reviewed-by: Peter Xu 
> > 
> > Two nits:
> > 
> > (1) maybe we can s/0x/0x/ in the whole patch since ecap_id
> > is 16 bits
> 
> The former is used because it's beyond the address space of a valid
> capability.  Using 0x just makes the situation different, not
> better.

But isn't pcie_find_capability_list() defining cap_id parameter as
uint16_t? In that case, 0x will be the same as 0x since
we'll just take the lower 16 bits?

> 
> > 
> > (2) maybe we can add one more sentence in the comment below showing
> > where the 0x thing comes from (it comes from PCIe spec 7.9.2)
> 
> The capability in hardware is 16bits, thus a value that exceeds 16 bits
> can never match a valid ID.  It has nothing to do with 7.9.2.  Thanks,
> 
> Alex
> 
> > > ---
> > >  hw/pci/pcie.c | 11 +++
> > >  1 file changed, 7 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> > > index cbd4bb4..f4dd177 100644
> > > --- a/hw/pci/pcie.c
> > > +++ b/hw/pci/pcie.c
> > > @@ -610,7 +610,8 @@ bool pcie_cap_is_arifwd_enabled(const PCIDevice *dev)
> > >   * uint16_t ext_cap_size
> > >   */
> > >  
> > > -static uint16_t pcie_find_capability_list(PCIDevice *dev, uint16_t 
> > > cap_id,
> > > +/* Passing a cap_id value > 0x will return 0 and put end of list in 
> > > prev */
> > > +static uint16_t pcie_find_capability_list(PCIDevice *dev, uint32_t 
> > > cap_id,
> > >uint16_t *prev_p)
> > >  {
> > >  uint16_t prev = 0;
> > > @@ -679,9 +680,11 @@ void pcie_add_capability(PCIDevice *dev,
> > >  } else {
> > >  uint16_t prev;
> > >  
> > > -/* 0 is reserved cap id. use internally to find the last 
> > > capability
> > > -   in the linked list */
> > > -next = pcie_find_capability_list(dev, 0, &prev);
> > > +/*
> > > + * 0x is not a valid cap id (it's a 16 bit field). use
> > > + * internally to find the last capability in the linked list.
> > > + */
> > > +next = pcie_find_capability_list(dev, 0x, &prev);
> > >  
> > >  assert(prev >= PCI_CONFIG_SPACE_SIZE);
> > >  assert(next == 0);
> > > -- 
> > > MST  
> > 
> > -- peterx
> 

-- peterx



Re: [Qemu-devel] [PATCH] hw/ppc/spapr: Check for valid page size when hot plugging memory

2017-02-15 Thread David Gibson
On Wed, Feb 15, 2017 at 10:21:44AM +0100, Thomas Huth wrote:
> On POWER, the valid page sizes that the guest can use are bound
> to the CPU and not to the memory region. QEMU already has some
> fancy logic to find out the right maximum memory size to tell
> it to the guest during boot (see getrampagesize() in the file
> target/ppc/kvm.c for more information).
> However, once we're booted and the guest is using huge pages
> already, it is currently still possible to hot-plug memory regions
> that do not support huge pages - which of course does not work
> on POWER, since the guest thinks that it is possible to use huge
> pages everywhere. The KVM_RUN ioctl will then abort with -EFAULT,
> QEMU spills out a not very helpful error message together with
> a register dump and the user is annoyed that the VM unexpectedly
> died.
> To avoid this situation, we should check the page size of hot-plugged
> DIMMs to see whether it is possible to use it in the current VM.
> If it does not fit, we can print out a better error message and
> refuse to add it, so that the VM does not die unexpectedly and the
> user has a second chance to plug a DIMM with a matching memory
> backend instead.
> 
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1419466
> Signed-off-by: Thomas Huth 

Using the global is a bit yucky, but I can't see an easy way to remove
it, and it's not like there aren't already some ugly globals in the
KVM code.  In the meantime this fixes a real bug, so I've merged this
to ppc-for-2.9.

Thanks.
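
As a reading aid, the check described in the commit message boils down to one comparison. The sketch below is an assumption inferred from that description (the quoted patch is cut off before the comparison appears), and mem_backend_page_size_ok is a made-up name: a hot-plugged backend is acceptable only if its page size is at least as large as the largest page size the guest was told it may use at boot.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if memory backed by pages of
 * backend_page_size bytes can be hot-plugged into a guest that was
 * told at boot it may use pages of up to max_cpu_page_size bytes.
 */
static bool mem_backend_page_size_ok(long backend_page_size,
                                     long max_cpu_page_size)
{
    assert(backend_page_size > 0 && max_cpu_page_size > 0);
    return backend_page_size >= max_cpu_page_size;
}
```

For example, a guest booted on 16 MiB huge pages would reject a DIMM backed by 64 KiB pages, while the reverse combination would be fine.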

> ---
>  hw/ppc/spapr.c   |  8 
>  target/ppc/kvm.c | 32 
>  target/ppc/kvm_ppc.h |  7 +++
>  3 files changed, 43 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index e465d7a..1a90aae 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2357,6 +2357,7 @@ static void spapr_memory_plug(HotplugHandler 
> *hotplug_dev, DeviceState *dev,
>  uint64_t align = memory_region_get_alignment(mr);
>  uint64_t size = memory_region_size(mr);
>  uint64_t addr;
> +char *mem_dev;
>  
>  if (size % SPAPR_MEMORY_BLOCK_SIZE) {
>  error_setg(&local_err, "Hotplugged memory size must be a multiple of 
> "
> @@ -2364,6 +2365,13 @@ static void spapr_memory_plug(HotplugHandler 
> *hotplug_dev, DeviceState *dev,
>  goto out;
>  }
>  
> +mem_dev = object_property_get_str(OBJECT(dimm), PC_DIMM_MEMDEV_PROP, 
> NULL);
> +if (mem_dev && !kvmppc_is_mem_backend_page_size_ok(mem_dev)) {
> +error_setg(&local_err, "Memory backend has bad page size. "
> +   "Use 'memory-backend-file' with correct mem-path.");
> +goto out;
> +}
> +
>  pc_dimm_memory_plug(dev, &ms->hotplug_memory, mr, align, &local_err);
>  if (local_err) {
>  goto out;
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index 663d2e7..584546b 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -438,12 +438,13 @@ static bool kvm_valid_page_size(uint32_t flags, long 
> rampgsize, uint32_t shift)
>  return (1ul << shift) <= rampgsize;
>  }
>  
> +static long max_cpu_page_size;
> +
>  static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>  {
>  static struct kvm_ppc_smmu_info smmu_info;
>  static bool has_smmu_info;
>  CPUPPCState *env = &cpu->env;
> -long rampagesize;
>  int iq, ik, jq, jk;
>  bool has_64k_pages = false;
>  
> @@ -458,7 +459,9 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>  has_smmu_info = true;
>  }
>  
> -rampagesize = getrampagesize();
> +if (!max_cpu_page_size) {
> +max_cpu_page_size = getrampagesize();
> +}
>  
>  /* Convert to QEMU form */
>  memset(&env->sps, 0, sizeof(env->sps));
> @@ -478,14 +481,14 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>  struct ppc_one_seg_page_size *qsps = &env->sps.sps[iq];
>  struct kvm_ppc_one_seg_page_size *ksps = &smmu_info.sps[ik];
>  
> -if (!kvm_valid_page_size(smmu_info.flags, rampagesize,
> +if (!kvm_valid_page_size(smmu_info.flags, max_cpu_page_size,
>   ksps->page_shift)) {
>  continue;
>  }
>  qsps->page_shift = ksps->page_shift;
>  qsps->slb_enc = ksps->slb_enc;
>  for (jk = jq = 0; jk < KVM_PPC_PAGE_SIZES_MAX_SZ; jk++) {
> -if (!kvm_valid_page_size(smmu_info.flags, rampagesize,
> +if (!kvm_valid_page_size(smmu_info.flags, max_cpu_page_size,
>   ksps->enc[jk].page_shift)) {
>  continue;
>  }
> @@ -510,12 +513,33 @@ static void kvm_fixup_page_sizes(PowerPCCPU *cpu)
>  env->mmu_model &= ~POWERPC_MMU_64K;
>  }
>  }
> +
> +bool kvmppc_is_mem_backend_page_size_ok(char *obj_path)
> +{
> +Object *mem_obj = object_resolve_path(obj_path, NULL);
> +char *mempath = object_property_get_str(mem_obj, "mem-path", NULL);
> +long pagesize;
> 

Re: [Qemu-devel] [RFC] virtio-pci: Allow PCIe virtio devices on root bus

2017-02-15 Thread David Gibson
On Wed, Feb 15, 2017 at 04:59:33PM +0200, Marcel Apfelbaum wrote:
> On 02/15/2017 03:45 AM, David Gibson wrote:
> > On Tue, Feb 14, 2017 at 02:53:08PM +0200, Marcel Apfelbaum wrote:
> > > On 02/14/2017 06:15 AM, David Gibson wrote:
> > > > On Mon, Feb 13, 2017 at 12:14:23PM +0200, Marcel Apfelbaum wrote:
> > > > > On 02/13/2017 06:33 AM, David Gibson wrote:
> > > > > > On Sun, Feb 12, 2017 at 09:05:46PM +0200, Marcel Apfelbaum wrote:
> > > > > > > On 02/10/2017 02:37 AM, David Gibson wrote:
> > > > > > > > On Thu, Feb 09, 2017 at 10:04:47AM +0100, Laszlo Ersek wrote:
> > > > > > > > > On 02/09/17 05:16, David Gibson wrote:
> > > > > > > > > > On Wed, Feb 08, 2017 at 11:40:50AM +0100, Laszlo Ersek 
> > > > > > > > > > wrote:
> > > > > > > > > > > On 02/08/17 07:16, David Gibson wrote:
> > > > > > > > > > > > Marcel,
> > > > > > > > > > > > 
> > > > > > > > > > > > Your original patch adding PCIe support to virtio-pci.c 
> > > > > > > > > > > > has the
> > > > > > > > > > > > limitation noted below that PCIe won't be enabled if 
> > > > > > > > > > > > the device is on
> > > > > > > > > > > > the root bus (rather than under a root or downstream 
> > > > > > > > > > > > port).  As
> > > > > > > > > > > > reasoned below, I think removing the check is correct, 
> > > > > > > > > > > > even for x86
> > > > > > > > > > > > (though it would rarely be useful there).  But I could 
> > > > > > > > > > > > well have
> > > > > > > > > > > > missed something.  Let me know if so...
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > Virtio devices can appear as either vanilla PCI or 
> > > > > > > > > > > > PCI-Express devices
> > > > > > > > > > > > depending on the bus they're connected to.  At the 
> > > > > > > > > > > > moment it will only
> > > > > > > > > > > > appear as vanilla PCI if connected to the root bus of a 
> > > > > > > > > > > > PCIe host bridge.
> > > > > > > > > > > > 
> > > > > > > > > > > > Presumably this is to reflect the fact that PCIe 
> > > > > > > > > > > > devices usually need to
> > > > > > > > > > > > be connected to a root (or further downstream) port 
> > > > > > > > > > > > rather than directly
> > > > > > > > > > > > on the root bus.  However, due to the odd requirements 
> > > > > > > > > > > > of the PAPR spec on the 'pseries'
> > > > > > > > > > > > machine type, it's normal for PCIe devices to appear on 
> > > > > > > > > > > > the root bus
> > > > > > > > > > > > without root ports.
> > > > > > > > > > > > 
> > > > > > > > > > > > Further, even on x86, there's no inherent reason we 
> > > > > > > > > > > > couldn't present a
> > > > > > > > > > > > virtio device as an "integrated device" (typically used 
> > > > > > > > > > > > for things built
> > > > > > > > > > > > into the PCI chipset), and those devices *do* typically 
> > > > > > > > > > > > appear on the root
> > > > > > > > > > > > bus.
> > > > > > > > > > > 
> > > > > > > > > > > I'm not personally making a counter-argument, just 
> > > > > > > > > > > > quoting some of
> > > > > > > > > > > the relevant parts of "docs/pcie.txt" ("PCI EXPRESS 
> > > > > > > > > > > GUIDELINES"):
> > > > > > > > > > 
> > > > > > > > > > So, an earlier discussion more or less concluded that the 
> > > > > > > > > > PCIe
> > > > > > > > > > guidelines don't really work with PAPR guests.  That comes 
> > > > > > > > > > because
> > > > > > > > > > PAPR was designed with PowerVM in mind which allows PCI 
> > > > > > > > > > passthrough
> > > > > > > > > > but doesn't do any emulated PCI devices.  So they wanted to 
> > > > > > > > > > present
> > > > > > > > > > passed through devices (virtual or physical) to the guest 
> > > > > > > > > > without
> > > > > > > > > > inserting virtual root ports.
> > > > > > > > > > 
> > > > > > > > > > Now, you can argue that this was a silly decision in PAPR, 
> > > > > > > > > > and you
> > > > > > > > > > could well be right, but there it is.
> > > > > > > > > 
> > > > > > > > > I can totally accept this, but then we should state it as a 
> > > > > > > > > fact near
> > > > > > > > > the top of "docs/pcie.txt".
> > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > > > Place only the following kinds of devices directly on 
> > > > > > > > > > > > the Root Complex:
> > > > > > > > > > > > (1) PCI Devices (e.g. network card, graphics card, 
> > > > > > > > > > > > IDE controller),
> > > > > > > > > > > > not controllers. Place only legacy PCI devices 
> > > > > > > > > > > > on
> > > > > > > > > > > > the Root Complex. These will be considered 
> > > > > > > > > > > > Integrated Endpoints.
> > > > > > > > > > > > Note: Integrated Endpoints are not 
> > > > > > > > > > > > hot-pluggable.
> > > > > > > > > > > > 
> > > > > > > > > > > > Although the PCI Express spec does not forbid 
> > > > > > > > > > > > PCI Express devices as
> > > > > > > > > > > > Integrated Endpoints, existing hardware mostly 
> > > > > > > > > >

Re: [Qemu-devel] [PATCH v7 2/2] block/vxhs.c: Add qemu-iotests for new block device type "vxhs"

2017-02-15 Thread ashish mittal
On Mon, Feb 13, 2017 at 6:43 AM, Stefan Hajnoczi  wrote:
> On Tue, Feb 07, 2017 at 08:18:14PM -0800, Ashish Mittal wrote:
>> diff --git a/tests/qemu-iotests/common.config 
>> b/tests/qemu-iotests/common.config
>> index f6384fb..c7a80c0 100644
>> --- a/tests/qemu-iotests/common.config
>> +++ b/tests/qemu-iotests/common.config
>> @@ -105,6 +105,10 @@ if [ -z "$QEMU_NBD_PROG" ]; then
>>  export QEMU_NBD_PROG="`set_prog_path qemu-nbd`"
>>  fi
>>
>> +if [ -z "$QEMU_VXHS_PROG" ]; then
>> +export QEMU_VXHS_PROG="`set_prog_path qnio_server /usr/local/bin`"
>
> Did you test this with /usr/local/bin/qnio_server?
>
> I think it will evaluate to QEMU_VXHS_PROG=/usr/local/bin when qnio_server
> isn't found in PATH.  You probably wanted /usr/local/bin/qnio_server instead.
>
> I suggest dropping the second argument completely and letting the user set
> PATH themselves.  No existing set_prog_path caller uses the second argument.
>
> # $1 = prog to look for, $2* = default pathnames if not found in $PATH
> set_prog_path()
> {
> p=`command -v $1 2> /dev/null`
> if [ -n "$p" -a -x "$p" ]; then
> echo $p
> return 0
> fi
> p=$1
>
> shift
> for f; do
> if [ -x $f ]; then
> echo $f
> return 0
> fi
> done
>
> echo ""
> return 1
> }
>
>> diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
>> index 3213765..06a3164 100644
>> --- a/tests/qemu-iotests/common.rc
>> +++ b/tests/qemu-iotests/common.rc
>> @@ -89,6 +89,9 @@ else
>>  TEST_IMG=$TEST_DIR/t.$IMGFMT
>>  elif [ "$IMGPROTO" = "archipelago" ]; then
>>  TEST_IMG="archipelago:at.$IMGFMT"
>> +elif [ "$IMGPROTO" = "vxhs" ]; then
>> +TEST_IMG_FILE=$TEST_DIR/t.$IMGFMT
>> +TEST_IMG="vxhs://127.0.0.1:/t.$IMGFMT"
>>  else
>>  TEST_IMG=$IMGPROTO:$TEST_DIR/t.$IMGFMT
>>  fi
>> @@ -175,6 +178,12 @@ _make_test_img()
>>  eval "$QEMU_NBD -v -t -b 127.0.0.1 -p 10810 -f $IMGFMT  
>> $TEST_IMG_FILE &"
>>  sleep 1 # FIXME: qemu-nbd needs to be listening before we continue
>>  fi
>> +
>> +# Start QNIO server on image directory for vxhs protocol
>> +if [ $IMGPROTO = "vxhs" ]; then
>> +eval "$QEMU_VXHS -d  $TEST_DIR &"
>> +sleep 1 # Wait for server to come up.
>
> This is a pre-existing problem and you don't need to fix it now:
>
> We should replace sleep 1 with a function that probes the TCP port until the
> connection can be established or a timeout is reached.  The netcat (nc)
> utility is often used for this.
>
> sleep 1 is not reliable and may fail on a heavily loaded machine like the
> Travis-CI build machines that are used.



Re: [Qemu-devel] [PATCH v7 1/2] block/vxhs.c: Add support for a new block device type called "vxhs"

2017-02-15 Thread ashish mittal
On Mon, Feb 13, 2017 at 6:57 AM, Stefan Hajnoczi  wrote:
> On Tue, Feb 07, 2017 at 08:18:13PM -0800, Ashish Mittal wrote:
>> +static int vxhs_parse_uri(const char *filename, QDict *options)
>> +{
>> +URI *uri = NULL;
>> +char *hoststr, *portstr;
>> +char *port;
>> +int ret = 0;
>> +
>> +trace_vxhs_parse_uri_filename(filename);
>> +uri = uri_parse(filename);
>> +if (!uri || !uri->server || !uri->path) {
>> +uri_free(uri);
>> +return -EINVAL;
>> +}
>> +
>> +hoststr = g_strdup(VXHS_OPT_SERVER".host");
>> +qdict_put(options, hoststr, qstring_from_str(uri->server));
>> +g_free(hoststr);
>> +
>> +portstr = g_strdup(VXHS_OPT_SERVER".port");
>> +if (uri->port) {
>> +port = g_strdup_printf("%d", uri->port);
>> +qdict_put(options, portstr, qstring_from_str(port));
>> +g_free(port);
>> +}
>> +g_free(portstr);
>> +
>> +if (strstr(uri->path, "vxhs") == NULL) {
>> +qdict_put(options, "vdisk-id", qstring_from_str(uri->path));
>> +}
>> +
>> +trace_vxhs_parse_uri_hostinfo(1, uri->server, uri->port);
>
> What is the purpose of the first argument?
>

It used to be a placeholder for the host index, which is now only 1. I
will remove it.

>> +str = g_strdup_printf(VXHS_OPT_SERVER".");
>> +qdict_extract_subqdict(options, &backing_options, str);
>> +
>> +/* Create opts info from runtime_tcp_opts list */
>> +tcp_opts = qemu_opts_create(&runtime_tcp_opts, NULL, 0, &error_abort);
>> +qemu_opts_absorb_qdict(tcp_opts, backing_options, &local_err);
>> +if (local_err) {
>> +qdict_del(backing_options, str);
>
> What is qdict_del(backing_options, VXHS_OPT_SERVER".") supposed to do?
> The same call is made further down too.
>

Per my understanding, qdict_del() is to free the 'server.' entries
within the subqdict.

qdict_extract_subqdict() allocates a subqdict and populates it with
the entries based on the pattern we pass. In this case 'server.'.

>> +qemu_opts_del(tcp_opts);
>> +ret = -EINVAL;
>> +goto out;
>> +}
>> +
>> +server_host_opt = qemu_opt_get(tcp_opts, VXHS_OPT_HOST);
>> +if (!server_host_opt) {
>> +error_setg(&local_err, QERR_MISSING_PARAMETER,
>> +   VXHS_OPT_SERVER"."VXHS_OPT_HOST);
>> +ret = -EINVAL;
>> +goto out;
>
> Missing qemu_opts_del(tcp_opts).
>

Will fix this!

>> +}
>> +
>> +if (strlen(server_host_opt) > MAXHOSTNAMELEN) {
>> +error_setg(errp, "server.host cannot be more than %d characters",
>> +   MAXHOSTNAMELEN);
>> +ret = -EINVAL;
>> +goto out;
>
> Missing qemu_opts_del(tcp_opts).
>

Will fix this!
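
One common way to avoid this class of leak is to route every exit through a single cleanup label. The sketch below is a generic illustration, not the vxhs code: Opts, opts_create, opts_del and open_sketch are hypothetical stand-ins for QemuOpts and the open function, and live_opts exists only so the example can show that nothing is leaked.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an options object that must be freed. */
typedef struct {
    char *host;
} Opts;

static int live_opts; /* number of Opts objects currently allocated */

static Opts *opts_create(const char *host)
{
    Opts *o = calloc(1, sizeof(*o));

    if (!o) {
        return NULL;
    }
    if (host) {
        size_t n = strlen(host) + 1;

        o->host = malloc(n);
        if (!o->host) {
            free(o);
            return NULL;
        }
        memcpy(o->host, host, n);
    }
    live_opts++;
    return o;
}

static void opts_del(Opts *o)
{
    if (!o) {
        return;
    }
    free(o->host);
    free(o);
    live_opts--;
}

/* Every path after a successful opts_create() leaves through "out". */
static int open_sketch(const char *host, size_t max_len)
{
    Opts *opts = opts_create(host);
    int ret = 0;

    if (!opts) {
        return -ENOMEM;
    }
    if (!opts->host) {
        ret = -EINVAL;          /* missing parameter */
        goto out;
    }
    if (strlen(opts->host) > max_len) {
        ret = -EINVAL;          /* host name too long */
        goto out;
    }
    /* ... the real work would happen here ... */
out:
    opts_del(opts);
    return ret;
}
```

Placing the single delete call under the label means a newly added error path cannot forget it, which is the property the review comments are asking for.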

>> @@ -5114,6 +5147,7 @@ echo "tcmalloc support  $tcmalloc"
>>  echo "jemalloc support  $jemalloc"
>>  echo "avx2 optimization $avx2_opt"
>>  echo "replication support $replication"
>> +echo "VxHS block device $vxhs"
>>
>>  if test "$sdl_too_old" = "yes"; then
>>  echo "-> Your SDL version is too old - please upgrade to have SDL support"
>> @@ -5729,6 +5763,12 @@ if test "$pthread_setname_np" = "yes" ; then
>>echo "CONFIG_PTHREAD_SETNAME_NP=y" >> $config_host_mak
>>  fi
>>
>> +if test "$vxhs" = "yes" ; then
>> +  echo "CONFIG_VXHS=y" >> $config_host_mak
>> +  echo "VXHS_CFLAGS=$vxhs_cflags" >> $config_host_mak
>
> Please drop this unused variable.

Will fix this!


Thanks,
Ashish



Re: [Qemu-devel] [PATCH] pci/pcie: don't assume cap id 0 is reserved

2017-02-15 Thread Alex Williamson
On Thu, 16 Feb 2017 10:35:28 +0800
Peter Xu  wrote:

> On Wed, Feb 15, 2017 at 10:49:47PM +0200, Michael S. Tsirkin wrote:
> > VFIO actually wants to create a capability with ID == 0.
> > This is done to make guest drivers skip the given capability.
> > pcie_add_capability then trips up on this capability
> > when looking for end of capability list.
> > 
> > To support this use-case, it's easy enough to switch to
> > e.g. 0x for these comparisons - we can be sure
> > it will never match a 16-bit capability ID.
> > 
> > Signed-off-by: Michael S. Tsirkin   
> 
> Reviewed-by: Peter Xu 
> 
> Two nits:
> 
> (1) maybe we can s/0x/0x/ in the whole patch since ecap_id
> is 16 bits

The former is used because it's beyond the address space of a valid
capability.  Using 0x just makes the situation different, not
better.

> 
> (2) maybe we can add one more sentence in the comment below showing
> where the 0x thing comes from (it comes from PCIe spec 7.9.2)

The capability in hardware is 16bits, thus a value that exceeds 16 bits
can never match a valid ID.  It has nothing to do with 7.9.2.  Thanks,

Alex

> > ---
> >  hw/pci/pcie.c | 11 +++
> >  1 file changed, 7 insertions(+), 4 deletions(-)
> > 
> > diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> > index cbd4bb4..f4dd177 100644
> > --- a/hw/pci/pcie.c
> > +++ b/hw/pci/pcie.c
> > @@ -610,7 +610,8 @@ bool pcie_cap_is_arifwd_enabled(const PCIDevice *dev)
> >   * uint16_t ext_cap_size
> >   */
> >  
> > -static uint16_t pcie_find_capability_list(PCIDevice *dev, uint16_t cap_id,
> > +/* Passing a cap_id value > 0x will return 0 and put end of list in 
> > prev */
> > +static uint16_t pcie_find_capability_list(PCIDevice *dev, uint32_t cap_id,
> >uint16_t *prev_p)
> >  {
> >  uint16_t prev = 0;
> > @@ -679,9 +680,11 @@ void pcie_add_capability(PCIDevice *dev,
> >  } else {
> >  uint16_t prev;
> >  
> > -/* 0 is reserved cap id. use internally to find the last capability
> > -   in the linked list */
> > -next = pcie_find_capability_list(dev, 0, &prev);
> > +/*
> > + * 0x is not a valid cap id (it's a 16 bit field). use
> > + * internally to find the last capability in the linked list.
> > + */
> > +next = pcie_find_capability_list(dev, 0x, &prev);
> >  
> >  assert(prev >= PCI_CONFIG_SPACE_SIZE);
> >  assert(next == 0);
> > -- 
> > MST  
> 
> -- peterx




Re: [Qemu-devel] iommu emulation

2017-02-15 Thread Alex Williamson
On Thu, 16 Feb 2017 10:28:39 +0800
Peter Xu  wrote:

> On Wed, Feb 15, 2017 at 11:15:52AM -0700, Alex Williamson wrote:
> 
> [...]
> 
> > > Alex, do you like something like below to fix above issue that Jintack
> > > has encountered?
> > > 
> > > (note: this code is not for compile, only trying show what I mean...)
> > > 
> > > --8<---
> > > diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> > > index 332f41d..4dca631 100644
> > > --- a/hw/vfio/pci.c
> > > +++ b/hw/vfio/pci.c
> > > @@ -1877,25 +1877,6 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
> > >   */
> > >  config = g_memdup(pdev->config, vdev->config_size);
> > > 
> > > -/*
> > > - * Extended capabilities are chained with each pointing to the next, 
> > > so we
> > > - * can drop anything other than the head of the chain simply by 
> > > modifying
> > > - * the previous next pointer.  For the head of the chain, we can 
> > > modify the
> > > - * capability ID to something that cannot match a valid capability.  
> > > ID
> > > - * 0 is reserved for this since absence of capabilities is indicated 
> > > by
> > > - * 0 for the ID, version, AND next pointer.  However, 
> > > pcie_add_capability()
> > > - * uses ID 0 as reserved for list management and will incorrectly 
> > > match and
> > > - * assert if we attempt to pre-load the head of the chain with this 
> > > ID.
> > > - * Use ID 0x temporarily since it is also seems to be reserved in
> > > - * part for identifying absence of capabilities in a root complex 
> > > register
> > > - * block.  If the ID still exists after adding capabilities, switch 
> > > back to
> > > - * zero.  We'll mark this entire first dword as emulated for this 
> > > purpose.
> > > - */
> > > -pci_set_long(pdev->config + PCI_CONFIG_SPACE_SIZE,
> > > - PCI_EXT_CAP(0x, 0, 0));
> > > -pci_set_long(pdev->wmask + PCI_CONFIG_SPACE_SIZE, 0);
> > > -pci_set_long(vdev->emulated_config_bits + PCI_CONFIG_SPACE_SIZE, ~0);
> > > -
> > >  for (next = PCI_CONFIG_SPACE_SIZE; next;
> > >   next = PCI_EXT_CAP_NEXT(pci_get_long(config + next))) {
> > >  header = pci_get_long(config + next);
> > > @@ -1917,6 +1898,8 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
> > >  switch (cap_id) {
> > >  case PCI_EXT_CAP_ID_SRIOV: /* Read-only VF BARs confuse OVMF */
> > > >  case PCI_EXT_CAP_ID_ARI: /* XXX Needs next function virtualization */
> > > > +/* keep this ecap header (4 bytes), but mask cap_id to 0xffff */
> > > > +...
> > > >  trace_vfio_add_ext_cap_dropped(vdev->vbasedev.name, cap_id, next);
> > >  break;
> > >  default:
> > > @@ -1925,11 +1908,6 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
> > > 
> > >  }
> > > 
> > > -/* Cleanup chain head ID if necessary */
> > > > -if (pci_get_word(pdev->config + PCI_CONFIG_SPACE_SIZE) == 0xFFFF) {
> > > -pci_set_word(pdev->config + PCI_CONFIG_SPACE_SIZE, 0);
> > > -}
> > > -
> > >  g_free(config);
> > >  return;
> > >  }  
> > > ->8-
> > > 
> > > > Since after all we need the assumption that 0xffff is reserved for
> > > > cap_id. Then, we can just remove the "first 0xffff then 0x0" hack,
> > > > which is imho error-prone and hacky.
> > 
> > > This doesn't fix the bug, which is that pcie_add_capability() uses a
> > > valid capability ID for its own internal tracking.  It's only doing
> > > this to find the end of the capability chain, which we could do in a
> > > spec-compliant way by looking for a zero next pointer.  Fix that and
> > > then vfio doesn't need to do this set to 0xffff then back to zero
> > > nonsense at all.  Capability ID zero is valid.  Thanks,
> 
> Yeah I see Michael's fix on the capability list stuff. However, imho
> these are two different issues? Or rather, even with that patch, we
> would still need this hack (first 0x0, then 0xffff), right? It looks
> like that patch doesn't solve the problem when the first pcie ecap
> is masked at 0x100.

I thought the problem was that QEMU in the host exposes a device with a
capability ID of 0 to the L1 guest.  QEMU in the L1 guest balks at a
capability ID of 0 because that's how it finds the end of the chain.
Therefore if we make QEMU not use capability ID 0 for internal
purposes, things work.  vfio using 0xffff and swapping back to 0x0
becomes unnecessary, but doesn't hurt anything.  Thanks,

Alex
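
To illustrate the point about finding the end of the chain by its zero next pointer instead of by a reserved capability ID, here is a minimal sketch. It is not QEMU code: the flat config-space buffer, the macro names and ext_cap_tail() are assumptions made up for this illustration. Only the header layout (ID in bits 15:0, version in bits 19:16, next pointer in bits 31:20) follows the PCIe extended capability header.

```c
#include <stdint.h>

#define CFG_SPACE_SIZE     0x100   /* first extended capability lives here */
#define EXT_CFG_SPACE_SIZE 0x1000  /* size of PCIe extended config space */

/* Build / decode an extended capability header (illustrative macros). */
#define EXT_CAP(id, ver, next) \
    ((uint32_t)(id) | ((uint32_t)(ver) << 16) | ((uint32_t)(next) << 20))
#define EXT_CAP_NEXT(header) (((header) >> 20) & 0xffc)

/* Little-endian 32-bit read from a flat config-space buffer. */
static uint32_t cfg_get_long(const uint8_t *cfg, uint16_t off)
{
    return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
           ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

/* Little-endian 32-bit write into a flat config-space buffer. */
static void cfg_set_long(uint8_t *cfg, uint16_t off, uint32_t val)
{
    cfg[off] = val & 0xff;
    cfg[off + 1] = (val >> 8) & 0xff;
    cfg[off + 2] = (val >> 16) & 0xff;
    cfg[off + 3] = (val >> 24) & 0xff;
}

/*
 * Return the offset of the last capability in the chain. Only the next
 * pointer is inspected, so a capability with ID 0 is walked over like any
 * other. Returns 0 if the chain does not terminate (malformed loop).
 */
static uint16_t ext_cap_tail(const uint8_t *cfg)
{
    uint16_t off = CFG_SPACE_SIZE;
    int hops;

    for (hops = 0; hops < (EXT_CFG_SPACE_SIZE - CFG_SPACE_SIZE) / 4; hops++) {
        uint16_t next = EXT_CAP_NEXT(cfg_get_long(cfg, off));

        /* 0 terminates the chain; other values below 0x100 are malformed
         * and are treated as the end too. */
        if (next < CFG_SPACE_SIZE) {
            return off;
        }
        off = next;
    }
    return 0;
}
```

With a head entry whose ID is 0 and whose next pointer is 0x140, followed by an entry at 0x140 with a zero next pointer, ext_cap_tail() reports 0x140, so no ID value has to be reserved for list management.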



Re: [Qemu-devel] [PATCH 3/5] colo-compare: release all unhandled packets in finalize function

2017-02-15 Thread Jason Wang



On 2017-02-16 10:43, Hailiang Zhang wrote:

On 2017/2/16 10:34, Jason Wang wrote:



On 2017-02-15 16:34, zhanghailiang wrote:

We should release all unhandled packets before finalize colo compare.
Besides, we need to free connection_track_table, or there will be
a memory leak bug.

Signed-off-by: zhanghailiang 
---
   net/colo-compare.c | 20 
   1 file changed, 20 insertions(+)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index a16e2d5..809bad3 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -676,6 +676,23 @@ static void colo_compare_complete(UserCreatable 
*uc, Error **errp)

   return;
   }

+static void colo_release_packets(void *opaque, void *user_data)
+{
+CompareState *s = user_data;
+Connection *conn = opaque;
+Packet *pkt = NULL;
+
+while (!g_queue_is_empty(&conn->primary_list)) {
+pkt = g_queue_pop_head(&conn->primary_list);
+compare_chr_send(&s->chr_out, pkt->data, pkt->size);


Any reason to send packets here?



Yes. Consider the use case where we shut down COLO for
the VM to turn it into a normal VM without FT.
We need to remove all the filter objects. In this case,
IMHO, it is necessary to release the unhandled packets.


Thanks.


Right, I see. All the other patches look good; let's squash this into 2.

Thanks




Thanks


+packet_destroy(pkt, NULL);
+}
+while (!g_queue_is_empty(&conn->secondary_list)) {
+pkt = g_queue_pop_head(&conn->secondary_list);
+packet_destroy(pkt, NULL);
+}
+}
+
   static void colo_compare_class_init(ObjectClass *oc, void *data)
   {
   UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
@@ -707,9 +724,12 @@ static void colo_compare_finalize(Object *obj)
   g_main_loop_quit(s->compare_loop);
   qemu_thread_join(&s->thread);

+/* Release all unhandled packets after compare thead exited */
+g_queue_foreach(&s->conn_list, colo_release_packets, s);

   g_queue_free(&s->conn_list);

+g_hash_table_destroy(s->connection_track_table);
   g_free(s->pri_indev);
   g_free(s->sec_indev);
   g_free(s->outdev);



.








Re: [Qemu-devel] [PATCH 2/5] colo-compare: kick compare thread to exit while finalize

2017-02-15 Thread Hailiang Zhang

On 2017/2/16 10:25, Zhang Chen wrote:



On 02/15/2017 04:34 PM, zhanghailiang wrote:

We should call g_main_loop_quit() to notify colo compare thread to
exit, Or it will run in g_main_loop_run() forever.

Besides, the finalizing process can't happen in context of colo thread,
it is reasonable to remove the 'if (qemu_thread_is_self(&s->thread))'
branch.

Signed-off-by: zhanghailiang 
---
   net/colo-compare.c | 19 +--
   1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index fdde788..a16e2d5 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -83,6 +83,8 @@ typedef struct CompareState {
   GHashTable *connection_track_table;
   /* compare thread, a thread for each NIC */
   QemuThread thread;
+
+GMainLoop *compare_loop;
   } CompareState;

   typedef struct CompareClass {
@@ -496,7 +498,6 @@ static gboolean check_old_packet_regular(void *opaque)
   static void *colo_compare_thread(void *opaque)
   {
   GMainContext *worker_context;
-GMainLoop *compare_loop;
   CompareState *s = opaque;
   GSource *timeout_source;

@@ -507,7 +508,7 @@ static void *colo_compare_thread(void *opaque)
   qemu_chr_fe_set_handlers(&s->chr_sec_in, compare_chr_can_read,
compare_sec_chr_in, NULL, s, worker_context, 
true);

-compare_loop = g_main_loop_new(worker_context, FALSE);
+s->compare_loop = g_main_loop_new(worker_context, FALSE);

   /* To kick any packets that the secondary doesn't match */
   timeout_source = g_timeout_source_new(REGULAR_PACKET_CHECK_MS);
@@ -515,10 +516,10 @@ static void *colo_compare_thread(void *opaque)
 (GSourceFunc)check_old_packet_regular, s, NULL);
   g_source_attach(timeout_source, worker_context);

-g_main_loop_run(compare_loop);
+g_main_loop_run(s->compare_loop);

   g_source_unref(timeout_source);
-g_main_loop_unref(compare_loop);
+g_main_loop_unref(s->compare_loop);
   g_main_context_unref(worker_context);
   return NULL;
   }
@@ -703,13 +704,11 @@ static void colo_compare_finalize(Object *obj)
   qemu_chr_fe_deinit(&s->chr_sec_in);
   qemu_chr_fe_deinit(&s->chr_out);

-g_queue_free(&s->conn_list);
+g_main_loop_quit(s->compare_loop);
+qemu_thread_join(&s->thread);

-if (qemu_thread_is_self(&s->thread)) {
-/* compare connection */
-g_queue_foreach(&s->conn_list, colo_compare_connection, s);
-qemu_thread_join(&s->thread);
-}


Before freeing 's->conn_list', you should flush all queued primary packets
and release all queued secondary packets here, so combining this patch
with patch 3/5 into one patch is the better choice.



Makes sense, will fix it in the next version, thanks.


Thanks
Zhang Chen


+
+g_queue_free(&s->conn_list);

   g_free(s->pri_indev);
   g_free(s->sec_indev);







Re: [Qemu-devel] [PATCH 3/5] colo-compare: release all unhandled packets in finalize function

2017-02-15 Thread Hailiang Zhang

On 2017/2/16 10:34, Jason Wang wrote:



On 2017-02-15 16:34, zhanghailiang wrote:

We should release all unhandled packets before finalize colo compare.
Besides, we need to free connection_track_table, or there will be
a memory leak bug.

Signed-off-by: zhanghailiang 
---
   net/colo-compare.c | 20 
   1 file changed, 20 insertions(+)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index a16e2d5..809bad3 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -676,6 +676,23 @@ static void colo_compare_complete(UserCreatable *uc, Error 
**errp)
   return;
   }

+static void colo_release_packets(void *opaque, void *user_data)
+{
+CompareState *s = user_data;
+Connection *conn = opaque;
+Packet *pkt = NULL;
+
+while (!g_queue_is_empty(&conn->primary_list)) {
+pkt = g_queue_pop_head(&conn->primary_list);
+compare_chr_send(&s->chr_out, pkt->data, pkt->size);


Any reason to send packets here?



Yes. Consider the use case where we shut down COLO for
the VM to turn it into a normal VM without FT.
We need to remove all the filter objects. In this case,
IMHO, it is necessary to release the unhandled packets.


Thanks.


Thanks


+packet_destroy(pkt, NULL);
+}
+while (!g_queue_is_empty(&conn->secondary_list)) {
+pkt = g_queue_pop_head(&conn->secondary_list);
+packet_destroy(pkt, NULL);
+}
+}
+
   static void colo_compare_class_init(ObjectClass *oc, void *data)
   {
   UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
@@ -707,9 +724,12 @@ static void colo_compare_finalize(Object *obj)
   g_main_loop_quit(s->compare_loop);
   qemu_thread_join(&s->thread);

+/* Release all unhandled packets after compare thead exited */
+g_queue_foreach(&s->conn_list, colo_release_packets, s);

   g_queue_free(&s->conn_list);

+g_hash_table_destroy(s->connection_track_table);
   g_free(s->pri_indev);
   g_free(s->sec_indev);
   g_free(s->outdev);



.






Re: [Qemu-devel] [PATCH] target-ppc: Add quad precision muladd instructions

2017-02-15 Thread Bharata B Rao
On Thu, Feb 16, 2017 at 09:13:31AM +1100, Richard Henderson wrote:
> On 02/15/2017 05:37 PM, Bharata B Rao wrote:
> > + *
> > + * TODO: When float128_muladd() becomes available, switch this
> > + * implementation to use that instead of separate float128_mul()
> > + * followed by float128_add().
> 
> Let's just do that, rather than add something that can't pass tests.
> 
> You should be able to copy float64_muladd and, for the most part, s/128/256/
> and s/64/128/.  Other of the magic numbers, like the implicit bit and the
> exponent bias, you get from float128_mul.

I started like that but got lost somewhere down that path...

It needs at least the following new functions to be implemented:

propagateFloat128MulAddNaN
shortShift256Left
shift256RightJamming
add256
sub256

It all looked doable, but the magic numbers used around the code that
does the eventual multiplication looked difficult to understand and I couldn't
deduce them from float128_mul. For some reason float128_mul implements
multiplication via multiplication and addition (mul128To256 & add128). There
is no equivalent to this in float64_muladd.

Let me make another attempt at this.

Regards,
Bharata.
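
As a rough idea of what one of the missing helpers listed above could look like, here is a sketch of a 256-bit addition over four 64-bit words. It is not taken from softfloat: the array-based signature and the choice of word 0 as the most significant word are assumptions made for this illustration, and the carry out of the top word is simply dropped.

```c
#include <stdint.h>

/*
 * Hypothetical add256(): z = a + b over four 64-bit words, with word 0
 * the most significant. The carry out of word 0 is discarded.
 */
static void add256(const uint64_t a[4], const uint64_t b[4], uint64_t z[4])
{
    unsigned carry = 0;
    int i;

    for (i = 3; i >= 0; i--) {
        uint64_t sum = a[i] + b[i];
        unsigned carry_from_add = sum < a[i];       /* a[i] + b[i] wrapped */
        uint64_t total = sum + carry;
        unsigned carry_from_inc = total < sum;      /* + carry wrapped */

        z[i] = total;
        carry = carry_from_add | carry_from_inc;
    }
}
```

At most one of the two wrap checks can fire per word, so OR-ing them is enough to propagate the carry to the next more significant word.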




Re: [Qemu-devel] [PATCH 2/5] colo-compare: kick compare thread to exit while finalize

2017-02-15 Thread Jason Wang



On 2017-02-16 10:25, Zhang Chen wrote:

@@ -703,13 +704,11 @@ static void colo_compare_finalize(Object *obj)
  qemu_chr_fe_deinit(&s->chr_sec_in);
  qemu_chr_fe_deinit(&s->chr_out);
  -g_queue_free(&s->conn_list);
+g_main_loop_quit(s->compare_loop);
+qemu_thread_join(&s->thread);
  -if (qemu_thread_is_self(&s->thread)) {
-/* compare connection */
-g_queue_foreach(&s->conn_list, colo_compare_connection, s);
-qemu_thread_join(&s->thread);
-}


Before freeing 's->conn_list', you should flush all queued primary packets
and release all queued secondary packets here, so combining this patch
with patch 3/5 into one patch is the better choice.

Thanks
Zhang Chen 


Yes, agree.

Thanks



Re: [Qemu-devel] [PATCH] pci/pcie: don't assume cap id 0 is reserved

2017-02-15 Thread Peter Xu
On Wed, Feb 15, 2017 at 10:49:47PM +0200, Michael S. Tsirkin wrote:
> VFIO actually wants to create a capability with ID == 0.
> This is done to make guest drivers skip the given capability.
> pcie_add_capability then trips up on this capability
> when looking for end of capability list.
> 
> To support this use-case, it's easy enough to switch to
> e.g. 0xffffffff for these comparisons - we can be sure
> it will never match a 16-bit capability ID.
> 
> Signed-off-by: Michael S. Tsirkin 

Reviewed-by: Peter Xu 

Two nits:

(1) maybe we can s/0xffffffff/0xffff/ in the whole patch since ecap_id
is 16 bits

(2) maybe we can add one more sentence in the comment below showing
where the 0xffff thing comes from (it comes from PCIe spec 7.9.2)

Thanks,

> ---
>  hw/pci/pcie.c | 11 +++
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> index cbd4bb4..f4dd177 100644
> --- a/hw/pci/pcie.c
> +++ b/hw/pci/pcie.c
> @@ -610,7 +610,8 @@ bool pcie_cap_is_arifwd_enabled(const PCIDevice *dev)
>   * uint16_t ext_cap_size
>   */
>  
> -static uint16_t pcie_find_capability_list(PCIDevice *dev, uint16_t cap_id,
> +/* Passing a cap_id value > 0xffff will return 0 and put end of list in prev */
> +static uint16_t pcie_find_capability_list(PCIDevice *dev, uint32_t cap_id,
>uint16_t *prev_p)
>  {
>  uint16_t prev = 0;
> @@ -679,9 +680,11 @@ void pcie_add_capability(PCIDevice *dev,
>  } else {
>  uint16_t prev;
>  
> -/* 0 is reserved cap id. use internally to find the last capability
> -   in the linked list */
> -next = pcie_find_capability_list(dev, 0, &prev);
> +/*
> + * 0xffffffff is not a valid cap id (it's a 16 bit field). use
> + * internally to find the last capability in the linked list.
> + */
> +next = pcie_find_capability_list(dev, 0xffffffff, &prev);
>  
>  assert(prev >= PCI_CONFIG_SPACE_SIZE);
>  assert(next == 0);
> -- 
> MST

-- peterx



Re: [Qemu-devel] [PATCH 3/5] colo-compare: release all unhandled packets in finalize function

2017-02-15 Thread Jason Wang



On 2017-02-15 16:34, zhanghailiang wrote:

We should release all unhandled packets before finalize colo compare.
Besides, we need to free connection_track_table, or there will be
a memory leak bug.

Signed-off-by: zhanghailiang 
---
  net/colo-compare.c | 20 
  1 file changed, 20 insertions(+)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index a16e2d5..809bad3 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -676,6 +676,23 @@ static void colo_compare_complete(UserCreatable *uc, Error 
**errp)
  return;
  }
  
+static void colo_release_packets(void *opaque, void *user_data)

+{
+CompareState *s = user_data;
+Connection *conn = opaque;
+Packet *pkt = NULL;
+
+while (!g_queue_is_empty(&conn->primary_list)) {
+pkt = g_queue_pop_head(&conn->primary_list);
+compare_chr_send(&s->chr_out, pkt->data, pkt->size);


Any reason to send packets here?

Thanks


+packet_destroy(pkt, NULL);
+}
+while (!g_queue_is_empty(&conn->secondary_list)) {
+pkt = g_queue_pop_head(&conn->secondary_list);
+packet_destroy(pkt, NULL);
+}
+}
+
  static void colo_compare_class_init(ObjectClass *oc, void *data)
  {
  UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
@@ -707,9 +724,12 @@ static void colo_compare_finalize(Object *obj)
  g_main_loop_quit(s->compare_loop);
  qemu_thread_join(&s->thread);
  
+/* Release all unhandled packets after compare thead exited */

+g_queue_foreach(&s->conn_list, colo_release_packets, s);
  
  g_queue_free(&s->conn_list);
  
+g_hash_table_destroy(s->connection_track_table);

  g_free(s->pri_indev);
  g_free(s->sec_indev);
  g_free(s->outdev);





Re: [Qemu-devel] [PATCH] pcie: simplify pcie_add_capability()

2017-02-15 Thread Peter Xu
On Thu, Feb 16, 2017 at 10:18:00AM +0800, Cao jin wrote:
> Hi peter
> 
> On 02/14/2017 03:51 PM, Peter Xu wrote:
> > When we add PCIe extended capabilities, we should be following the rule
> > that we add the head extended cap (at offset 0x100) first, then the rest
> > of them. Meanwhile, we are always adding new capability bits at the end
> > of the list. Here the "next" looks meaningless in all cases since it
> > should always be zero (along with the "header").
> > 
> > Simplify the function a bit, and it looks more readable now.
> > 
> 
> See if this suggestion could be incorporated into your patch:)
> http://lists.nongnu.org/archive/html/qemu-devel/2017-01/msg01418.html

Sure. But imho that's really trivial and as long as the assertions are
working correctly (no matter in which order) I can live with both. :)

Anyway, thanks for the pointer!

-- peterx



Re: [Qemu-devel] iommu emulation

2017-02-15 Thread Peter Xu
On Wed, Feb 15, 2017 at 11:15:52AM -0700, Alex Williamson wrote:

[...]

> > Alex, do you like something like below to fix above issue that Jintack
> > has encountered?
> > 
> > (note: this code is not meant to compile; it only tries to show what I mean...)
> > 
> > --8<---
> > diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> > index 332f41d..4dca631 100644
> > --- a/hw/vfio/pci.c
> > +++ b/hw/vfio/pci.c
> > @@ -1877,25 +1877,6 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
> >   */
> >  config = g_memdup(pdev->config, vdev->config_size);
> > 
> > -/*
> > - * Extended capabilities are chained with each pointing to the next, so we
> > - * can drop anything other than the head of the chain simply by modifying
> > - * the previous next pointer.  For the head of the chain, we can modify the
> > - * capability ID to something that cannot match a valid capability.  ID
> > - * 0 is reserved for this since absence of capabilities is indicated by
> > - * 0 for the ID, version, AND next pointer.  However, pcie_add_capability()
> > - * uses ID 0 as reserved for list management and will incorrectly match and
> > - * assert if we attempt to pre-load the head of the chain with this ID.
> > - * Use ID 0xFFFF temporarily since it is also seems to be reserved in
> > - * part for identifying absence of capabilities in a root complex register
> > - * block.  If the ID still exists after adding capabilities, switch back to
> > - * zero.  We'll mark this entire first dword as emulated for this purpose.
> > - */
> > -pci_set_long(pdev->config + PCI_CONFIG_SPACE_SIZE,
> > - PCI_EXT_CAP(0xFFFF, 0, 0));
> > -pci_set_long(pdev->wmask + PCI_CONFIG_SPACE_SIZE, 0);
> > -pci_set_long(vdev->emulated_config_bits + PCI_CONFIG_SPACE_SIZE, ~0);
> > -
> >  for (next = PCI_CONFIG_SPACE_SIZE; next;
> >   next = PCI_EXT_CAP_NEXT(pci_get_long(config + next))) {
> >  header = pci_get_long(config + next);
> > @@ -1917,6 +1898,8 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
> >  switch (cap_id) {
> >  case PCI_EXT_CAP_ID_SRIOV: /* Read-only VF BARs confuse OVMF */
> >  case PCI_EXT_CAP_ID_ARI: /* XXX Needs next function virtualization */
> > +/* keep this ecap header (4 bytes), but mask cap_id to 0xffff */
> > +...
> >  trace_vfio_add_ext_cap_dropped(vdev->vbasedev.name, cap_id, next);
> >  break;
> >  default:
> > @@ -1925,11 +1908,6 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
> > 
> >  }
> > 
> > -/* Cleanup chain head ID if necessary */
> > -if (pci_get_word(pdev->config + PCI_CONFIG_SPACE_SIZE) == 0xFFFF) {
> > -pci_set_word(pdev->config + PCI_CONFIG_SPACE_SIZE, 0);
> > -}
> > -
> >  g_free(config);
> >  return;
> >  }
> > ->8-  
> > 
> > Since after all we need the assumption that 0xffff is reserved for
> > cap_id. Then, we can just remove the "first 0xffff then 0x0" hack,
> > which is imho error-prone and hacky.
> 
> This doesn't fix the bug, which is that pcie_add_capability() uses a
> valid capability ID for its own internal tracking.  It's only doing
> this to find the end of the capability chain, which we could do in a
> spec-compliant way by looking for a zero next pointer.  Fix that and
> then vfio doesn't need to do this set to 0xffff then back to zero
> nonsense at all.  Capability ID zero is valid.  Thanks,

Yeah I see Michael's fix on the capability list stuff. However, imho
these are two different issues? Or rather, even with that patch, we
would still need this hack (first 0x0, then 0xffff), right? It looks
like that patch doesn't solve the problem when the first pcie ecap
is masked at 0x100.

Please correct me if I missed anything. Thanks,

-- peterx



Re: [Qemu-devel] [PATCH 3/5] colo-compare: release all unhandled packets in finalize function

2017-02-15 Thread Zhang Chen



On 02/15/2017 04:34 PM, zhanghailiang wrote:

We should release all unhandled packets before finalize colo compare.
Besides, we need to free connection_track_table, or there will be
a memory leak bug.

Signed-off-by: zhanghailiang
---
  net/colo-compare.c | 20 
  1 file changed, 20 insertions(+)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index a16e2d5..809bad3 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -676,6 +676,23 @@ static void colo_compare_complete(UserCreatable *uc, Error 
**errp)
  return;
  }
  


In my patch "colo-compare and filter-rewriter work with colo-frame", this
function is named 'colo_flush_connection'. I think 'flush' is a better name
than 'release'.


Thanks
Zhang Chen



+static void colo_release_packets(void *opaque, void *user_data)
+{
+CompareState *s = user_data;
+Connection *conn = opaque;
+Packet *pkt = NULL;
+
+while (!g_queue_is_empty(&conn->primary_list)) {
+pkt = g_queue_pop_head(&conn->primary_list);
+compare_chr_send(&s->chr_out, pkt->data, pkt->size);
+packet_destroy(pkt, NULL);
+}
+while (!g_queue_is_empty(&conn->secondary_list)) {
+pkt = g_queue_pop_head(&conn->secondary_list);
+packet_destroy(pkt, NULL);
+}
+}
+
  static void colo_compare_class_init(ObjectClass *oc, void *data)
  {
  UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
@@ -707,9 +724,12 @@ static void colo_compare_finalize(Object *obj)
  g_main_loop_quit(s->compare_loop);
  qemu_thread_join(&s->thread);
  
+/* Release all unhandled packets after compare thead exited */

+g_queue_foreach(&s->conn_list, colo_release_packets, s);
  
  g_queue_free(&s->conn_list);
  
+g_hash_table_destroy(s->connection_track_table);

  g_free(s->pri_indev);
  g_free(s->sec_indev);
  g_free(s->outdev);


--
Thanks
Zhang Chen






Re: [Qemu-devel] [PATCH 2/5] colo-compare: kick compare thread to exit while finalize

2017-02-15 Thread Zhang Chen



On 02/15/2017 04:34 PM, zhanghailiang wrote:

We should call g_main_loop_quit() to notify colo compare thread to
exit, Or it will run in g_main_loop_run() forever.

Besides, the finalizing process can't happen in context of colo thread,
it is reasonable to remove the 'if (qemu_thread_is_self(&s->thread))'
branch.

Signed-off-by: zhanghailiang 
---
  net/colo-compare.c | 19 +--
  1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index fdde788..a16e2d5 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -83,6 +83,8 @@ typedef struct CompareState {
  GHashTable *connection_track_table;
  /* compare thread, a thread for each NIC */
  QemuThread thread;
+
+GMainLoop *compare_loop;
  } CompareState;
  
  typedef struct CompareClass {

@@ -496,7 +498,6 @@ static gboolean check_old_packet_regular(void *opaque)
  static void *colo_compare_thread(void *opaque)
  {
  GMainContext *worker_context;
-GMainLoop *compare_loop;
  CompareState *s = opaque;
  GSource *timeout_source;
  
@@ -507,7 +508,7 @@ static void *colo_compare_thread(void *opaque)

  qemu_chr_fe_set_handlers(&s->chr_sec_in, compare_chr_can_read,
   compare_sec_chr_in, NULL, s, worker_context, 
true);
  
-compare_loop = g_main_loop_new(worker_context, FALSE);

+s->compare_loop = g_main_loop_new(worker_context, FALSE);
  
  /* To kick any packets that the secondary doesn't match */

  timeout_source = g_timeout_source_new(REGULAR_PACKET_CHECK_MS);
@@ -515,10 +516,10 @@ static void *colo_compare_thread(void *opaque)
(GSourceFunc)check_old_packet_regular, s, NULL);
  g_source_attach(timeout_source, worker_context);
  
-g_main_loop_run(compare_loop);

+g_main_loop_run(s->compare_loop);
  
  g_source_unref(timeout_source);

-g_main_loop_unref(compare_loop);
+g_main_loop_unref(s->compare_loop);
  g_main_context_unref(worker_context);
  return NULL;
  }
@@ -703,13 +704,11 @@ static void colo_compare_finalize(Object *obj)
  qemu_chr_fe_deinit(&s->chr_sec_in);
  qemu_chr_fe_deinit(&s->chr_out);
  
-g_queue_free(&s->conn_list);

+g_main_loop_quit(s->compare_loop);
+qemu_thread_join(&s->thread);
  
-if (qemu_thread_is_self(&s->thread)) {

-/* compare connection */
-g_queue_foreach(&s->conn_list, colo_compare_connection, s);
-qemu_thread_join(&s->thread);
-}


Before freeing 's->conn_list', you should flush all queued primary packets
and release all queued secondary packets here, so combining this patch
with patch 3/5 into one patch is the better choice.

Thanks
Zhang Chen


+
+g_queue_free(&s->conn_list);
  
  g_free(s->pri_indev);

  g_free(s->sec_indev);


--
Thanks
Zhang Chen






Re: [Qemu-devel] [PATCH] pcie: simplify pcie_add_capability()

2017-02-15 Thread Peter Xu
On Wed, Feb 15, 2017 at 04:25:05PM +0200, Marcel Apfelbaum wrote:
> On 02/14/2017 09:51 AM, Peter Xu wrote:
> >When we add PCIe extended capabilities, we should be following the rule
> >that we add the head extended cap (at offset 0x100) first, then the rest
> >of them. Meanwhile, we are always adding new capability bits at the end
> >of the list. Here the "next" looks meaningless in all cases since it
> >should always be zero (along with the "header").
> >
> >Simplify the function a bit, and it looks more readable now.
> >
> >Signed-off-by: Peter Xu 
> >---
> > hw/pci/pcie.c | 15 ---
> > 1 file changed, 4 insertions(+), 11 deletions(-)
> >
> >diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> >index cbd4bb4..e0e6f6a 100644
> >--- a/hw/pci/pcie.c
> >+++ b/hw/pci/pcie.c
> >@@ -664,30 +664,23 @@ void pcie_add_capability(PCIDevice *dev,
> >  uint16_t cap_id, uint8_t cap_ver,
> >  uint16_t offset, uint16_t size)
> > {
> >-uint32_t header;
> >-uint16_t next;
> >-
> > assert(offset >= PCI_CONFIG_SPACE_SIZE);
> > assert(offset < offset + size);
> > assert(offset + size <= PCIE_CONFIG_SPACE_SIZE);
> > assert(size >= 8);
> > assert(pci_is_express(dev));
> >
> >-if (offset == PCI_CONFIG_SPACE_SIZE) {
> >-header = pci_get_long(dev->config + offset);
> >-next = PCI_EXT_CAP_NEXT(header);
> >-} else {
> >+if (offset != PCI_CONFIG_SPACE_SIZE) {
> > uint16_t prev;
> >
> > /* 0 is reserved cap id. use internally to find the last capability
> >in the linked list */
> >-next = pcie_find_capability_list(dev, 0, &prev);
> >-
> >+assert(pcie_find_capability_list(dev, 0, &prev) == 0);
> 
> Hi Peter,
> 
> It is not recommended to use assert with an expression with side-effects.

Exactly. Thanks Marcel, I'll repost.

-- peterx
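
The concern about side effects inside assert() can be shown with a small sketch. The names are hypothetical and this is not QEMU code: RELEASE_ASSERT mimics what assert() expands to when NDEBUG is defined, and find_tail() stands in for a lookup that reports its result through an out parameter, the way pcie_find_capability_list() fills in prev.

```c
#include <stdint.h>

/* What assert(expr) expands to when NDEBUG is defined: expr is never evaluated. */
#define RELEASE_ASSERT(expr) ((void)0)

/* Hypothetical lookup: returns the "next" value and stores the tail in *prev_p. */
static uint16_t find_tail(uint16_t *prev_p)
{
    *prev_p = 0x100;
    return 0;
}

/* The call lives inside the assertion, so it vanishes with the assertion. */
static uint16_t tail_with_call_in_assert(void)
{
    uint16_t prev = 0;

    RELEASE_ASSERT(find_tail(&prev) == 0);
    return prev;    /* still the initial value */
}

/* The call is made unconditionally; only the check is compiled out. */
static uint16_t tail_with_call_hoisted(void)
{
    uint16_t prev = 0;
    uint16_t next = find_tail(&prev);

    RELEASE_ASSERT(next == 0);
    (void)next;     /* silence "unused" warnings when the check is compiled out */
    return prev;
}
```

In the first variant prev is never filled in once assertions are compiled out, which is why keeping the call in a separate statement and asserting on its result is the safer form.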



Re: [Qemu-devel] [PATCH] pcie: simplify pcie_add_capability()

2017-02-15 Thread Cao jin
Hi peter

On 02/14/2017 03:51 PM, Peter Xu wrote:
> When we add PCIe extended capabilities, we should be following the rule
> that we add the head extended cap (at offset 0x100) first, then the rest
> of them. Meanwhile, we are always adding new capability bits at the end
> of the list. Here the "next" looks meaningless in all cases since it
> should always be zero (along with the "header").
> 
> Simplify the function a bit, and it looks more readable now.
> 

See if this suggestion could be incorporated into your patch:)
http://lists.nongnu.org/archive/html/qemu-devel/2017-01/msg01418.html

-- 
Sincerely,
Cao jin

> Signed-off-by: Peter Xu 
> ---
>  hw/pci/pcie.c | 15 ---
>  1 file changed, 4 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> index cbd4bb4..e0e6f6a 100644
> --- a/hw/pci/pcie.c
> +++ b/hw/pci/pcie.c
> @@ -664,30 +664,23 @@ void pcie_add_capability(PCIDevice *dev,
>   uint16_t cap_id, uint8_t cap_ver,
>   uint16_t offset, uint16_t size)
>  {
> -uint32_t header;
> -uint16_t next;
> -
>  assert(offset >= PCI_CONFIG_SPACE_SIZE);
>  assert(offset < offset + size);
>  assert(offset + size <= PCIE_CONFIG_SPACE_SIZE);
>  assert(size >= 8);
>  assert(pci_is_express(dev));
>  
> -if (offset == PCI_CONFIG_SPACE_SIZE) {
> -header = pci_get_long(dev->config + offset);
> -next = PCI_EXT_CAP_NEXT(header);
> -} else {
> +if (offset != PCI_CONFIG_SPACE_SIZE) {
>  uint16_t prev;
>  
>  /* 0 is reserved cap id. use internally to find the last capability
> in the linked list */
> -next = pcie_find_capability_list(dev, 0, &prev);
> -
> +assert(pcie_find_capability_list(dev, 0, &prev) == 0);
>  assert(prev >= PCI_CONFIG_SPACE_SIZE);
> -assert(next == 0);
>  pcie_ext_cap_set_next(dev, prev, offset);
>  }
> -pci_set_long(dev->config + offset, PCI_EXT_CAP(cap_id, cap_ver, next));
> +
> +pci_set_long(dev->config + offset, PCI_EXT_CAP(cap_id, cap_ver, 0));
>  
>  /* Make capability read-only by default */
>  memset(dev->wmask + offset, 0, size);
> 







Re: [Qemu-devel] [PATCH v3 15/16] target-m68k: add more FPU instructions

2017-02-15 Thread Richard Henderson

On 02/07/2017 11:59 AM, Laurent Vivier wrote:

+static long double floatx80_to_ldouble(floatx80 val)
+{
+if (floatx80_is_infinity(val)) {
+if (floatx80_is_neg(val)) {
+return -__builtin_infl();
+}
+return __builtin_infl();
+}
+if (floatx80_is_any_nan(val)) {
+char low[20];
+sprintf(low, "0x%016"PRIx64, val.low);
+
+return nanl(low);
+}
+
+return *(long double *)&val;
+}


This doesn't work except for x86 host.

You ought to extract the mantissa, convert the 64-bit value to long-double, and 
use ldexpl to scale the result for the exponent.


Similarly converting the other way use frexpl and ldexpl.


r~
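
A rough sketch of the approach suggested above (take the 64-bit mantissa, convert it to long double, scale with ldexpl) might look as follows. The x80_bits struct and the function name are assumptions for this illustration and do not use QEMU's floatx80 type. Exponent 0 (zero and denormals) goes through the same formula here; how that exponent should scale is format-specific and is left out of this sketch.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical raw view of an 80-bit extended-precision value. */
typedef struct {
    uint16_t high;  /* bit 15: sign, bits 14..0: biased exponent (bias 16383) */
    uint64_t low;   /* mantissa, explicit integer bit in bit 63 */
} x80_bits;

static long double x80_to_ldouble(x80_bits v)
{
    int negative = v.high >> 15;
    int exp = v.high & 0x7fff;
    long double r;

    if (exp == 0x7fff) {
        /* All-ones exponent: any fraction bit below the integer bit means NaN. */
        r = (v.low << 1) ? (long double)NAN : (long double)INFINITY;
    } else {
        /*
         * The mantissa is an integer scaled by 2^63, so subtract both the
         * bias and 63. On hosts whose long double has fewer than 64 mantissa
         * bits, the integer conversion rounds.
         */
        r = ldexpl((long double)v.low, exp - 16383 - 63);
    }
    return negative ? -r : r;
}
```

Unlike the pointer cast in the patch, this does not depend on the host's in-memory layout of long double. The reverse direction would use frexpl to split a host value into mantissa and exponent, as noted above.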



Re: [Qemu-devel] [PATCH v3 14/16] target-m68k: add explicit single and double precision operations

2017-02-15 Thread Richard Henderson

On 02/07/2017 11:59 AM, Laurent Vivier wrote:

+case 0: /* fmove */
+break;
+case 0x40: /* fsmove */
+gen_helper_redf32_FP0(cpu_env);
+gen_helper_extf32_FP0(cpu_env);
+break;
+case 0x44: /* fdmove */
+gen_helper_redf64_FP0(cpu_env);
+gen_helper_extf64_FP0(cpu_env);
 break;


This is going to produce double-rounding errors.  Better to properly set the 
rounding precision first and convert once.



r~



Re: [Qemu-devel] [PATCH v3 13/16] target-m68k: add fsglmul and fsgldiv

2017-02-15 Thread Richard Henderson

On 02/07/2017 11:59 AM, Laurent Vivier wrote:

fsglmul and fsgldiv truncate data to single precision before computing
results.

Signed-off-by: Laurent Vivier 
---
 target/m68k/fpu_helper.c | 22 ++
 target/m68k/helper.h |  2 ++
 target/m68k/translate.c  |  8 
 3 files changed, 32 insertions(+)

diff --git a/target/m68k/fpu_helper.c b/target/m68k/fpu_helper.c
index 42f5b5c..8a3eed3 100644
--- a/target/m68k/fpu_helper.c
+++ b/target/m68k/fpu_helper.c
@@ -351,6 +351,17 @@ void HELPER(mul_FP0_FP1)(CPUM68KState *env)
 floatx80_to_FP0(env, res);
 }

+void HELPER(sglmul_FP0_FP1)(CPUM68KState *env)
+{
+float64 a, b, res;
+
+a = floatx80_to_float64(FP0_to_floatx80(env), &env->fp_status);
+b = floatx80_to_float64(FP1_to_floatx80(env), &env->fp_status);


s/float64/float32/g

Kinda sorta, probably close enough.  The manual says the resulting exponent may 
be out of range.  Which means this will produce +Inf in cases HW won't.



r~



Re: [Qemu-devel] [PATCH v3 12/16] target-m68k: add fscale, fgetman, fgetexp and fmod

2017-02-15 Thread Richard Henderson

On 02/07/2017 11:59 AM, Laurent Vivier wrote:

Signed-off-by: Laurent Vivier 
---
 target/m68k/cpu.h|  1 +
 target/m68k/fpu_helper.c | 56 
 target/m68k/helper.h |  4 
 target/m68k/translate.c  | 14 
 4 files changed, 75 insertions(+)

diff --git a/target/m68k/cpu.h b/target/m68k/cpu.h
index 7985dc3..3042ab7 100644
--- a/target/m68k/cpu.h
+++ b/target/m68k/cpu.h
@@ -253,6 +253,7 @@ typedef enum {
 /* Quotient */

 #define FPSR_QT_MASK  0x00ff
+#define FPSR_QT_SHIFT 16

 /* Floating-Point Control Register */
 /* Rounding mode */
diff --git a/target/m68k/fpu_helper.c b/target/m68k/fpu_helper.c
index d8145e0..42f5b5c 100644
--- a/target/m68k/fpu_helper.c
+++ b/target/m68k/fpu_helper.c
@@ -458,3 +458,59 @@ void HELPER(const_FP0)(CPUM68KState *env, uint32_t offset)
 env->fp0l = fpu_rom[offset].low;
 env->fp0h = fpu_rom[offset].high;
 }
+
+void HELPER(getexp_FP0)(CPUM68KState *env)
+{
+int32_t exp;
+floatx80 res;
+
+res = FP0_to_floatx80(env);
+if (floatx80_is_zero_or_denormal(res) || floatx80_is_any_nan(res) ||
+floatx80_is_infinity(res)) {
+return;
+}
+exp = (env->fp0h & 0x7fff) - 0x3fff;
+
+res = int32_to_floatx80(exp, &env->fp_status);
+
+floatx80_to_FP0(env, res);


Failure to raise OPERR for infinities?


+void HELPER(getman_FP0)(CPUM68KState *env)
+{
+floatx80 res;
+res = int64_to_floatx80(env->fp0l, &env->fp_status);
+floatx80_to_FP0(env, res);
+}


This seems completely wrong.  (1) NaN gets returned, (2) Inf raises OPERR, (3) 
Normal values return something in the range [1.0 ... 2.0).  Which means you 
should just force the exponent rather than convert the low part.
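
A sketch of "force the exponent" for the normal-value case only, using a hypothetical ext80 struct rather than the env->fp0h/fp0l fields. NaN and Inf handling (points 1 and 2) would sit in front of this and is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an 80-bit extended value. */
typedef struct {
    uint16_t high;   /* sign (bit 15) and 15-bit biased exponent */
    uint64_t low;    /* 64-bit mantissa, explicit integer bit at bit 63 */
} ext80;

/* Keep sign and mantissa, set the biased exponent to 0x3fff (unbiased 0).
 * For a normal input the result has magnitude in [1.0, 2.0). */
static ext80 ext80_getman_normal(ext80 x)
{
    ext80 r;

    assert((x.high & 0x7fff) != 0x7fff);   /* NaN/Inf: handled elsewhere */
    r.high = (uint16_t)((x.high & 0x8000) | 0x3fff);
    r.low = x.low;
    return r;
}
```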



+
+void HELPER(scale_FP0_FP1)(CPUM68KState *env)
+{
+int32_t scale;
+int32_t exp;
+
+scale = floatx80_to_int32(FP0_to_floatx80(env), &env->fp_status);
+
+exp = (env->fp1h & 0x7fff) + scale;
+
+env->fp0h = (env->fp1h & 0x8000) | (exp & 0x7fff);
+env->fp0l = env->fp1l;
+}


Missing handling for NaN, Inf, 0, denormal.
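
One possible shape for this, sketched with a hypothetical ext80 struct: take the exponent-adjust shortcut only for normal inputs whose scaled exponent stays in range, and report everything else to the caller. What the slow path should do for each special case is not spelled out in this thread, so it is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an 80-bit extended value. */
typedef struct {
    uint16_t high;   /* sign (bit 15) and 15-bit biased exponent */
    uint64_t low;    /* 64-bit mantissa, explicit integer bit at bit 63 */
} ext80;

/* Returns true and writes *out if x is normal and x * 2^scale is still a
 * normal number.  Returns false for NaN, Inf, zero, denormal inputs and
 * for results that would overflow or underflow. */
static bool ext80_scale_fast(ext80 x, int32_t scale, ext80 *out)
{
    int64_t exp = x.high & 0x7fff;

    assert(out != 0);
    if (exp == 0x7fff || exp == 0) {
        return false;            /* NaN/Inf, or zero/denormal */
    }
    exp += scale;                /* 64-bit sum cannot wrap */
    if (exp <= 0 || exp >= 0x7fff) {
        return false;            /* out of the normal exponent range */
    }
    out->high = (uint16_t)((x.high & 0x8000) | (uint16_t)exp);
    out->low = x.low;
    return true;
}
```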


r~



Re: [Qemu-devel] [Help] Windows2012 as Guest 64+cores on KVM Halts

2017-02-15 Thread Gonglei (Arei)
Hi,

> 
> On Sat, 2017-02-11 at 10:39 -0500, Paolo Bonzini wrote:
> > >
> > >
> > > >
> > > > On 10/02/2017 10:31, Gonglei (Arei) wrote:
> > > > >
> > > > > But We tested the same cases on Xen platform and VMware, and
> > > > > the guest booted successfully.
> > > >
> > > > Were these two also tested with enlightenments enabled?  TCG
> > > > surely isn't.
> > >
> > > About TCG, I just removed 'accel=kvm,' and 'hv_relaxed' from the
> > > QEMU command line below; I thought Hyper-V was still enabled then.
> > > Sorry about that.
> > >
> > > But for Xen, we set 'viridian=1', which we take to mean that Hyper-V
> > > is enabled.
> > >
> > > For VMWare we also enabled the Hyper-V enlightenments.
> If I'm not mistaken, even Hyper-V server doesn't allow specifying more
> than 64 vCPUs for Generation 1 VMs.

Normally yes, but I found an explanation in the Microsoft documentation:

Maximum Supported Virtual Processors

On Windows operating systems versions through Windows Server 2008 R2, 
reporting the HV#1 hypervisor interface limits the Windows virtual machine 
to a maximum of 64 VPs, regardless of what is reported via CPUID.40000005.EAX.
Starting with Windows Server 2012 and Windows 8, if CPUID.40000005.EAX 
contains a value of -1, Windows assumes that the hypervisor imposes no specific
limit to the number of VPs. In this case, Windows Server 2012 guest VMs may
use more than 64 VPs, up to the maximum supported number of processors 
applicable to the specific Windows version being used.

Link: 
https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs

"Requirements for Implementing the Microsoft Hypervisor Interface"

The patch below works for me: with it, WS2012 can run with up to 255 vCPUs
with Hyper-V enlightenments enabled.

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 27fd050..efe3cbc 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -772,7 +772,7 @@ int kvm_arch_init_vcpu(CPUState *cs)

 c = &cpuid_data.entries[cpuid_i++];
 c->function = HYPERV_CPUID_IMPLEMENT_LIMITS;
-c->eax = 0x40;
+c->eax = -1;
 c->ebx = 0x40;

 kvm_base = KVM_CPUID_SIGNATURE_NEXT;
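
For illustration, here is how a guest might interpret that EAX value under the rule quoted above. This is a sketch, not Windows or QEMU code: the macro and function names are invented, and it assumes that any value other than -1 is taken as the limit itself, which the excerpt does not spell out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name for "EAX contains -1". */
#define HV_VP_LIMIT_NONE UINT32_MAX

/* Number of VPs the guest would use, given the reported EAX and the
 * maximum the guest OS version supports on its own. */
static uint32_t hv_effective_vp_limit(uint32_t eax, uint32_t os_max_cpus)
{
    assert(os_max_cpus > 0);
    if (eax == HV_VP_LIMIT_NONE) {
        return os_max_cpus;      /* hypervisor imposes no specific limit */
    }
    return eax < os_max_cpus ? eax : os_max_cpus;
}
```

Note that assigning -1 to the 32-bit eax field stores 0xffffffff, which is the "no limit" pattern the sketch checks for.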

> In any case, if you are only interested in hv_relaxed, you can drop it
> off for WS2012 as long as you have cpu hypervisor flag
> (CPUID.1:ECX [bit 31]=1) turned on.
> 
hv_relaxed is just an example of enabling Hyper-V enlightenments.

Thanks,
-Gonglei


Re: [Qemu-devel] [PATCH v3 11/16] target-m68k: add fmovecr

2017-02-15 Thread Richard Henderson

On 02/07/2017 11:59 AM, Laurent Vivier wrote:

fmovecr moves a floating point constant from the
FPU ROM to a floating point register.

Signed-off-by: Laurent Vivier 
---
 target/m68k/fpu_helper.c | 31 +++
 target/m68k/helper.h |  1 +
 target/m68k/translate.c  | 12 +++-
 3 files changed, 43 insertions(+), 1 deletion(-)


Reviewed-by: Richard Henderson 


r~



Re: [Qemu-devel] [PATCH v3 10/16] target-m68k: add fscc.

2017-02-15 Thread Richard Henderson

On 02/07/2017 11:59 AM, Laurent Vivier wrote:

+addr = tcg_temp_local_new();
+tcg_gen_mov_i32(addr, taddr);
+l1 = gen_new_label();
+l2 = gen_new_label();
+gen_fjmpcc(s, ext & 0x3f, l1);
+gen_store(s, OS_BYTE, addr, tcg_const_i32(0x00));
+tcg_gen_br(l2);
+gen_set_label(l1);
+gen_store(s, OS_BYTE, addr, tcg_const_i32(0xff));
+gen_set_label(l2);
+tcg_temp_free(addr);


Use tcg_gen_setcond, like in scc.


+l1 = gen_new_label();
+tcg_gen_ori_i32(reg, reg, 0x00ff);
+gen_fjmpcc(s, ext & 0x3f, l1);
+tcg_gen_andi_i32(reg, reg, 0xff00);
+gen_set_label(l1);


Likewise.
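
The arithmetic behind the branch-free form, shown in plain C rather than TCG ops: a set-condition step yields 0 or 1, and negating that yields 0x00 or 0xff for the byte. The function names are invented for this sketch; the actual TCG sequence used by scc is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* flag is the 0/1 result of a set-condition step. */
static uint8_t scc_byte(uint32_t flag)
{
    assert(flag <= 1);
    return (uint8_t)(0u - flag);     /* 0 -> 0x00, 1 -> 0xff */
}

/* Register form: replace only the low byte of reg. */
static uint32_t scc_low_byte(uint32_t reg, uint32_t flag)
{
    return (reg & 0xffffff00u) | scc_byte(flag);
}
```

Both the memory store and the register update then need no labels or branches in the generated code.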


r~



Re: [Qemu-devel] [PATCH v3 09/16] target-m68k: add fmovem

2017-02-15 Thread Richard Henderson

On 02/07/2017 11:59 AM, Laurent Vivier wrote:

Signed-off-by: Laurent Vivier 
---
 target/m68k/fpu_helper.c |  6 +++
 target/m68k/helper.h |  1 +
 target/m68k/translate.c  | 99 +++-
 3 files changed, 80 insertions(+), 26 deletions(-)

diff --git a/target/m68k/fpu_helper.c b/target/m68k/fpu_helper.c
index 1e68c41..aadfc82 100644
--- a/target/m68k/fpu_helper.c
+++ b/target/m68k/fpu_helper.c
@@ -421,3 +421,9 @@ void HELPER(update_fpstatus)(CPUM68KState *env)

 set_float_exception_flags(flags, &env->fp_status);
 }
+
+void HELPER(fmovem)(CPUM68KState *env, uint32_t opsize,
+uint32_t mode, uint32_t mask)
+{
+fprintf(stderr, "MISSING HELPER fmovem\n");
+}


Um... no.



diff --git a/target/m68k/helper.h b/target/m68k/helper.h
index 072a6d0..58bc273 100644
--- a/target/m68k/helper.h
+++ b/target/m68k/helper.h
@@ -31,6 +31,7 @@ DEF_HELPER_1(cmp_FP0_FP1, void, env)
 DEF_HELPER_2(set_fpcr, void, env, i32)
 DEF_HELPER_1(tst_FP0, void, env)
 DEF_HELPER_1(update_fpstatus, void, env)
+DEF_HELPER_4(fmovem, void, env, i32, i32, i32)

 DEF_HELPER_3(mac_move, void, env, i32, i32)
 DEF_HELPER_3(macmulf, i64, env, i32, i32)
diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index f9c64ff..ac60f1a 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -4483,13 +4483,79 @@ static void gen_op_fmove_fcr(CPUM68KState *env, 
DisasContext *s,
 tcg_temp_free_i32(addr);
 }

+static void gen_op_fmovem(CPUM68KState *env, DisasContext *s,
+  uint32_t insn, uint32_t ext)
+{
+int opsize;
+uint16_t mask;
+int i;
+uint32_t mode;
+int32_t incr;
+TCGv addr, tmp;
+int is_load;
+
+if (m68k_feature(s->env, M68K_FEATURE_FPU)) {
+opsize = OS_EXTENDED;
+} else {
+opsize = OS_DOUBLE;  /* FIXME */
+}
+
+mode = (ext >> 11) & 0x3;
+if ((mode & 0x1) == 1) {
+gen_helper_fmovem(cpu_env, tcg_const_i32(opsize),
+  tcg_const_i32(mode), DREG(ext, 0));


... why not just raise illegal opcode here instead of fprintf.
You should also add a comment about not supporting the dynamic set.

That said... it almost seems easier to support fmovem as a helper than it does 
inline.  So perhaps just always implement it out of line?



r~



Re: [Qemu-devel] iommu emulation

2017-02-15 Thread Alex Williamson
On Wed, 15 Feb 2017 18:25:26 -0500
Jintack Lim  wrote:

> On Wed, Feb 15, 2017 at 5:50 PM, Alex Williamson wrote:
> 
> > On Wed, 15 Feb 2017 17:05:35 -0500
> > Jintack Lim  wrote:
> >  
> > > On Tue, Feb 14, 2017 at 9:52 PM, Peter Xu  wrote:
> > >  
> > > > On Tue, Feb 14, 2017 at 07:50:39AM -0500, Jintack Lim wrote:
> > > >
> > > > [...]
> > > >  
> > > > > > > >> > I misunderstood what you said?  
> > > > > > > >
> > > > > > > > > I failed to understand why a vIOMMU could help boost
> > > > > > > > > performance. :(
> > > > > > > > > Could you provide your command line here so that I can try to
> > > > > > > > > reproduce?
> > > > > > >
> > > > > > > Sure. This is the command line to launch L1 VM
> > > > > > >
> > > > > > > qemu-system-x86_64 -M q35,accel=kvm,kernel-irqchip=split \
> > > > > > > -m 12G -device intel-iommu,intremap=on,eim=off,caching-mode=on \
> > > > > > > > -drive file=/mydata/guest0.img,format=raw --nographic -cpu host \
> > > > > > > -smp 4,sockets=4,cores=1,threads=1 \
> > > > > > > -device vfio-pci,host=08:00.0,id=net0
> > > > > > >
> > > > > > > And this is for L2 VM.
> > > > > > >
> > > > > > > ./qemu-system-x86_64 -M q35,accel=kvm \
> > > > > > > -m 8G \
> > > > > > > -drive file=/vm/l2guest.img,format=raw --nographic -cpu host \
> > > > > > > -device vfio-pci,host=00:03.0,id=net0  
> > > > > >
> > > > > > ... here looks like these are command lines for L1/L2 guest, rather
> > > > > > than L1 guest with/without vIOMMU?
> > > > > >  
> > > > >
> > > > > > That's right. I thought you were asking about command lines for L1/L2
> > > > > > guest :(.
> > > > > > I think I caused the confusion, and as I said above, I didn't mean to
> > > > > > talk about the performance of L1 guest with/without vIOMMU.
> > > > > We can move on!  
> > > >
> > > > I see. Sure! :-)
> > > >
> > > > [...]
> > > >  
> > > > > >
> > > > > > > Then, I *think* the assertion you encountered above would fail only
> > > > > > > if prev == 0 here, but I'm still not quite sure why that was
> > > > > > > happening.
> > > > > > > Btw, could you paste me your "lspci -vvv -s 00:03.0" result in your
> > > > > > > L1 guest?
> > > > > >  
> > > > >
> > > > > Sure. This is from my L1 guest.  
> > > >
> > > > Hmm... I think I found the problem...
> > > >  
> > > > >
> > > > > root@guest0:~# lspci -vvv -s 00:03.0
> > > > > 00:03.0 Network controller: Mellanox Technologies MT27500 Family
> > > > > [ConnectX-3]
> > > > > Subsystem: Mellanox Technologies Device 0050
> > > > > Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> > > > > Stepping- SERR+ FastB2B- DisINTx+
> > > > > > Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
> > > > > > Latency: 0, Cache Line Size: 64 bytes
> > > > > Interrupt: pin A routed to IRQ 23
> > > > > Region 0: Memory at fe90 (64-bit, non-prefetchable) [size=1M]
> > > > > Region 2: Memory at fe00 (64-bit, prefetchable) [size=8M]
> > > > > Expansion ROM at fea0 [disabled] [size=1M]
> > > > > Capabilities: [40] Power Management version 3
> > > > > > Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
> > > > > Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> > > > > Capabilities: [48] Vital Product Data
> > > > > Product Name: CX354A - ConnectX-3 QSFP
> > > > > Read-only fields:
> > > > > [PN] Part number: MCX354A-FCBT
> > > > > [EC] Engineering changes: A4
> > > > > [SN] Serial number: MT1346X00791
> > > > > [V0] Vendor specific: PCIe Gen3 x8
> > > > > [RV] Reserved: checksum good, 0 byte(s) reserved
> > > > > Read/write fields:
> > > > > [V1] Vendor specific: N/A
> > > > > [YA] Asset tag: N/A
> > > > > [RW] Read-write area: 105 byte(s) free
> > > > > [RW] Read-write area: 253 byte(s) free
> > > > > [RW] Read-write area: 253 byte(s) free
> > > > > [RW] Read-write area: 253 byte(s) free
> > > > > [RW] Read-write area: 253 byte(s) free
> > > > > [RW] Read-write area: 253 byte(s) free
> > > > > [RW] Read-write area: 253 byte(s) free
> > > > > [RW] Read-write area: 253 byte(s) free
> > > > > [RW] Read-write area: 253 byte(s) free
> > > > > [RW] Read-write area: 253 byte(s) free
> > > > > [RW] Read-write area: 253 byte(s) free
> > > > > [RW] Read-write area: 253 byte(s) free
> > > > > [RW] Read-write area: 253 byte(s) free
> > > > > [RW] Read-write area: 253 byte(s) free
> > > > > [RW] Read-write area: 253 byte(s) free
> > > > > [RW] Read-write area: 252 byte(s) free
> > > > > End
> > > > > Capabilities: [9c] MSI-X: Enable+ Count=128 Masked-
> > > > > Vector table: BAR=0 offset=0007c000
> > > > > PBA: BAR=0 offset=0007d000
> > > > > > Capabilities: [60] Express (v2) Root Complex Integrated Endpoint, MSI 00
> > > > > DevCap: MaxPayload 256 bytes, PhantFunc 0
> > > > > ExtTag- RBE+
> > > > > DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+
> > > > > RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
> > > > > MaxPayload 256 bytes, MaxReadReq 4096 bytes
> > > > > DevS

Re: [Qemu-devel] [PATCH v3 08/16] target-m68k: define 96bit FP registers for gdb on 680x0

2017-02-15 Thread Richard Henderson

On 02/07/2017 11:59 AM, Laurent Vivier wrote:

Signed-off-by: Laurent Vivier 
---
 configure|  2 +-
 gdb-xml/m68k-fp.xml  | 21 +
 target/m68k/helper.c | 45 +
 3 files changed, 67 insertions(+), 1 deletion(-)
 create mode 100644 gdb-xml/m68k-fp.xml


Reviewed-by: Richard Henderson 


r~



Re: [Qemu-devel] [virtio-dev] Re: [virtio-dev] [PATCH v16 1/2] virtio-crypto: Add virtio crypto device specification

2017-02-15 Thread Gonglei (Arei)
Hi Halil,

> 
> On 02/09/2017 03:29 AM, Gonglei (Arei) wrote:
> [..]
> > Oh, so much work needs to be done.
> >
> > Halil, would you mind working together with me to perfect the spec?
> > And feel free to add your signed-off-by tag. :)
> >
> > TBH as a non-native English speaker, it's more difficult writing a
> > spec than coding. :(
> >
> > Look forward to your reply.
> >
> 
> 
> First, sorry for the long delay -- was busy and then ill. Thank you

I hope you feel better now.

> very much for your offer. I would prefer continuing as a reviewer,
> but I would very much appreciate if you could add me to the
> 'Acknowledgments' appendix as a part of your patch ;). 

No problem, I can do that.

> Unfortunately
> I do not have the time now to allocate significantly more time for
> this. I'm also having difficulties to think of another way of working
> efficiently together on this, than what we already do. I can try to
> provide more suggestions in terms of formulation, but it's still
> just review. Thank you very much!
> 

OK, thank you, I understand. I am also busy fixing bugs in internal projects
and with other high-priority work, so recently I have had little time for
the spec. :( But I'll get back to it once I have time again.

Thanks,
-Gonglei




Re: [Qemu-devel] [PATCH v3 07/16] target-m68k: manage FPU exceptions

2017-02-15 Thread Richard Henderson

On 02/07/2017 11:59 AM, Laurent Vivier wrote:

Signed-off-by: Laurent Vivier 
---
 target/m68k/cpu.h|  28 +
 target/m68k/fpu_helper.c | 107 ++-
 target/m68k/helper.h |   1 +
 target/m68k/translate.c  |  27 
 4 files changed, 162 insertions(+), 1 deletion(-)

diff --git a/target/m68k/cpu.h b/target/m68k/cpu.h
index 6b3cb26..7985dc3 100644
--- a/target/m68k/cpu.h
+++ b/target/m68k/cpu.h
@@ -57,6 +57,15 @@
 #define EXCP_TRAP15 47   /* User trap #15.  */
 #define EXCP_UNSUPPORTED61
 #define EXCP_ICE13
+#define EXCP_FP_BSUN48 /* Branch Set on Unordered */
+#define EXCP_FP_INEX49 /* Inexact result */
+#define EXCP_FP_DZ  50 /* Divide by Zero */
+#define EXCP_FP_UNFL51 /* Underflow */
+#define EXCP_FP_OPERR   52 /* Operand Error */
+#define EXCP_FP_OVFL53 /* Overflow */
+#define EXCP_FP_SNAN54 /* Signaling Not-A-Number */
+#define EXCP_FP_UNIMP   55 /* Unimplemented Data type */
+

 #define EXCP_RTE0x100
 #define EXCP_HALT_INSN  0x101
@@ -222,6 +231,25 @@ typedef enum {
 #define FPSR_CC_Z 0x0400 /* Zero */
 #define FPSR_CC_N 0x0800 /* Negative */

+/* Exception Status */
+#define FPSR_ES_MASK  0xff00
+#define FPSR_ES_BSUN  0x8000 /* Branch Set on Unordered */
+#define FPSR_ES_SNAN  0x4000 /* Signaling Not-A-Number */
+#define FPSR_ES_OPERR 0x2000 /* Operand Error */
+#define FPSR_ES_OVFL  0x1000 /* Overflow */
+#define FPSR_ES_UNFL  0x0800 /* Underflow */
+#define FPSR_ES_DZ0x0400 /* Divide by Zero */
+#define FPSR_ES_INEX2 0x0200 /* Inexact operation */
+#define FPSR_ES_INEX  0x0100 /* Inexact decimal input */
+
+/* Accrued Exception */
+#define FPSR_AE_MASK  0x00ff
+#define FPSR_AE_IOP   0x0080 /* Invalid Operation */
+#define FPSR_AE_OVFL  0x0040 /* Overflow */
+#define FPSR_AE_UNFL  0x0020 /* Underflow */
+#define FPSR_AE_DZ0x0010 /* Divide by Zero */
+#define FPSR_AE_INEX  0x0008 /* Inexact */
+
 /* Quotient */

 #define FPSR_QT_MASK  0x00ff
diff --git a/target/m68k/fpu_helper.c b/target/m68k/fpu_helper.c
index 9d39118..1e68c41 100644
--- a/target/m68k/fpu_helper.c
+++ b/target/m68k/fpu_helper.c
@@ -177,6 +177,70 @@ static void restore_rounding_mode(CPUM68KState *env)
 }
 }

+static void set_fpsr_exception(CPUM68KState *env)
+{
+uint32_t fpsr = 0;
+int flags;
+
+flags = get_float_exception_flags(&env->fp_status);
+if (flags == 0) {
+return;
+}
+set_float_exception_flags(0, &env->fp_status);
+
+if (flags & float_flag_invalid) {
+fpsr |= FPSR_AE_IOP;
+}
+if (flags & float_flag_divbyzero) {
+fpsr |= FPSR_AE_DZ;
+}
+if (flags & float_flag_overflow) {
+fpsr |= FPSR_AE_OVFL;
+}
+if (flags & float_flag_underflow) {
+fpsr |= FPSR_AE_UNFL;
+}
+if (flags & float_flag_inexact) {
+fpsr |= FPSR_AE_INEX;
+}
+
+env->fpsr = (env->fpsr & ~FPSR_AE_MASK) | fpsr;
+}
+
+static void fpu_exception(CPUM68KState *env, uint32_t exception)
+{
+CPUState *cs = CPU(m68k_env_get_cpu(env));
+
+env->fpsr = (env->fpsr & ~FPSR_ES_MASK) | exception;
+if (env->fpcr & exception) {


What are you trying to do here?  This test is obviously true if exception != 0.


+switch (exception) {
+case FPSR_ES_BSUN:
+cs->exception_index = EXCP_FP_BSUN;
+break;
+case FPSR_ES_SNAN:
+cs->exception_index = EXCP_FP_SNAN;
+break;
+case FPSR_ES_OPERR:
+cs->exception_index = EXCP_FP_OPERR;
+break;
+case FPSR_ES_OVFL:
+cs->exception_index = EXCP_FP_OVFL;
+break;
+case FPSR_ES_UNFL:
+cs->exception_index = EXCP_FP_UNFL;
+break;
+case FPSR_ES_DZ:
+cs->exception_index = EXCP_FP_DZ;
+break;
+case FPSR_ES_INEX:
+case FPSR_ES_INEX2:
+cs->exception_index = EXCP_FP_INEX;
+break;
+}
+cpu_loop_exit_restore(cs, GETPC());


GETPC must be invoked from the outer-most handler.  You need to pass this in 
from the callers.



+}
+}
+
 void cpu_m68k_set_fpcr(CPUM68KState *env, uint32_t val)
 {
 env->fpcr = val & 0x;
@@ -292,10 +356,16 @@ void HELPER(cmp_FP0_FP1)(CPUM68KState *env)
 {
 floatx80 fp0 = FP0_to_floatx80(env);
 floatx80 fp1 = FP1_to_floatx80(env);
-int float_compare;
+int flags, float_compare;

 float_compare = floatx80_compare(fp1, fp0, &env->fp_status);
 env->fpsr = (env->fpsr & ~FPSR_CC_MASK) | float_comp_to_cc(float_compare);
+
+flags = get_float_exception_flags(&env->fp_status);
+if (flags & float_flag_invalid) {
+fpu_exception(env, FPSR_ES_OPERR);
+   }
+   set_fpsr_exception(env);
 }

 void HELPER(tst_FP0)(CPUM68KState *env)
@@ -315,4 +385,39 @@ void HELPER(tst_FP0)(CPUM68KSt

Re: [Qemu-devel] [PATCH v3 06/16] target-m68k: add FPCR and FPSR

2017-02-15 Thread Richard Henderson

On 02/07/2017 11:59 AM, Laurent Vivier wrote:

 void HELPER(itrunc_FP0)(CPUM68KState *env)
 {
 floatx80 res;

+set_float_rounding_mode(float_round_to_zero, &env->fp_status);
 res = floatx80_round_to_int(FP0_to_floatx80(env), &env->fp_status);
+restore_rounding_mode(env);


It would be better to save/restore the current rounding mode as opposed to 
recomputing from the fpcr.



 void HELPER(cmp_FP0_FP1)(CPUM68KState *env)
 {
 floatx80 fp0 = FP0_to_floatx80(env);
 floatx80 fp1 = FP1_to_floatx80(env);
-floatx80 res;
+int float_compare;

-res = floatx80_sub(fp0, fp1, &env->fp_status);
-if (floatx80_is_quiet_nan(res, &env->fp_status)) {
-/* +/-inf compares equal against itself, but sub returns nan.  */
-if (!floatx80_is_quiet_nan(fp0, &env->fp_status)
-&& !floatx80_is_quiet_nan(fp1, &env->fp_status)) {
-res = floatx80_zero;
-if (floatx80_lt_quiet(fp0, res, &env->fp_status)) {
-res = floatx80_chs(res);
-}
-}
-}
-
-floatx80_to_FP0(env, res);
+float_compare = floatx80_compare(fp1, fp0, &env->fp_status);
+env->fpsr = (env->fpsr & ~FPSR_CC_MASK) | float_comp_to_cc(float_compare);
 }

-uint32_t HELPER(compare_FP0)(CPUM68KState *env)
+void HELPER(tst_FP0)(CPUM68KState *env)
 {
-floatx80 fp0 = FP0_to_floatx80(env);
-return floatx80_compare_quiet(fp0, floatx80_zero, &env->fp_status);
+uint32_t fpsr = 0;
+floatx80 val = FP0_to_floatx80(env);
+
+if (floatx80_is_neg(val)) {
+fpsr |= FPSR_CC_N;
+}
+
+if (floatx80_is_any_nan(val)) {
+fpsr |= FPSR_CC_A;
+} else if (floatx80_is_infinity(val)) {
+fpsr |= FPSR_CC_I;
+} else if (floatx80_is_zero(val)) {
+fpsr |= FPSR_CC_Z;
+}
+env->fpsr = (env->fpsr & ~FPSR_CC_MASK) | fpsr;
 }


It would be better to pass in the old FPSR value, returning the new FPSR value, 
so that the helper can be marked TCG_CALL_NO_RWG -- not reading or modifying 
TCG globals.



+gen_helper_tst_FP0(cpu_env);


Making this, e.g. gen_helper_tst_FP0(QREG_FPSR, cpu_env).  Which will also 
happen to leave the FPSR in a host register where it can be used if the next 
insn is a fp branch.
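
A sketch of the value-in, value-out shape being suggested, written as a free-standing C function. The ext80 struct and the function name are hypothetical, and the FPSR_CC_* values are assumptions based on the M68K FPSR layout (condition codes in bits 27..24), since the constants in the quoted patch are not fully legible in this archive.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an 80-bit extended value. */
typedef struct {
    uint16_t high;   /* sign (bit 15) and 15-bit biased exponent */
    uint64_t low;    /* 64-bit mantissa, explicit integer bit at bit 63 */
} ext80;

/* Assumed condition-code bit positions. */
#define FPSR_CC_MASK 0x0f000000u
#define FPSR_CC_A    0x01000000u   /* NaN */
#define FPSR_CC_I    0x02000000u   /* Infinity */
#define FPSR_CC_Z    0x04000000u   /* Zero */
#define FPSR_CC_N    0x08000000u   /* Negative */

/* Takes the old FPSR and returns the new one; touches no other state. */
static uint32_t tst_fpsr(uint32_t old_fpsr, ext80 val)
{
    uint32_t cc = 0;
    unsigned exp = val.high & 0x7fff;

    if (val.high & 0x8000) {
        cc |= FPSR_CC_N;
    }
    if (exp == 0x7fff) {
        /* Fraction bits (integer bit ignored) tell NaN from Inf. */
        cc |= (val.low << 1) ? FPSR_CC_A : FPSR_CC_I;
    } else if (exp == 0 && val.low == 0) {
        cc |= FPSR_CC_Z;
    }
    return (old_fpsr & ~FPSR_CC_MASK) | cc;
}

/* Small self-check: tst on +1.0 clears all condition codes. */
static void tst_fpsr_example(void)
{
    ext80 one = { 0x3fff, 0x8000000000000000ULL };
    assert(tst_fpsr(FPSR_CC_MASK, one) == 0);
}
```

Since the function depends only on its arguments, the corresponding helper would not need to read or write any global state.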



r~



[Qemu-devel] [PATCH v5 8/8] hw/mips: MIPS Boston board support

2017-02-15 Thread Yongbok Kim
From: Paul Burton 

Introduce support for emulating the MIPS Boston development board. The
Boston board is built around an FPGA & 3 PCIe controllers, one of which
is connected to an Intel EG20T Platform Controller Hub. It is used
during the development & debug of new CPUs and the software intended to
run on them, and is essentially the successor to the older MIPS Malta
board.

This patch does not implement the EG20T, instead connecting an already
supported ICH-9 AHCI controller. Whilst this isn't accurate it's enough
for typical stock Boston software (eg. Linux kernels) to work with hard
disks given that both the ICH-9 & EG20T implement the AHCI
specification.

Boston boards typically boot kernels in the FIT image format, and this
patch will treat kernels provided to QEMU as such. When loading a kernel
directly, the board code will generate minimal firmware much as the
Malta board code does. This firmware will set up the CM, CPC & GIC
register base addresses then set argument registers & jump to the kernel
entry point. Alternatively, bootloader code may be loaded using the bios
argument in which case no firmware will be generated & execution will
proceed from the start of the boot code at the default MIPS boot
exception vector (offset 0x1fc00000 into (c)kseg1).

Currently real Boston boards are always used with FPGA bitfiles that
include a Global Interrupt Controller (GIC), so the interrupt
configuration is only defined for such cases. Therefore the board will
only allow use of CPUs which implement the CPS components, including the
GIC, and will otherwise exit with a message.

Signed-off-by: Paul Burton 
Reviewed-by: Yongbok Kim 
[yongbok@imgtec.com:
  isolated boston machine support for mips64el.
  updated for recent Chardev changes.
  ignore missing bios/kernel for qtest.]
Signed-off-by: Yongbok Kim 
---
 configure|   2 +-
 default-configs/mips64el-softmmu.mak |   2 +
 hw/mips/Makefile.objs|   1 +
 hw/mips/boston.c | 576 +++
 4 files changed, 580 insertions(+), 1 deletion(-)
 create mode 100644 hw/mips/boston.c

diff --git a/configure b/configure
index 4b68861..8e8f18d 100755
--- a/configure
+++ b/configure
@@ -3378,7 +3378,7 @@ fi
 fdt_required=no
 for target in $target_list; do
   case $target in
-aarch64*-softmmu|arm*-softmmu|ppc*-softmmu|microblaze*-softmmu)
+
aarch64*-softmmu|arm*-softmmu|ppc*-softmmu|microblaze*-softmmu|mips64el-softmmu)
   fdt_required=yes
 ;;
   esac
diff --git a/default-configs/mips64el-softmmu.mak 
b/default-configs/mips64el-softmmu.mak
index 485e218..cc5f3b3 100644
--- a/default-configs/mips64el-softmmu.mak
+++ b/default-configs/mips64el-softmmu.mak
@@ -10,3 +10,5 @@ CONFIG_JAZZ=y
 CONFIG_G364FB=y
 CONFIG_JAZZ_LED=y
 CONFIG_VT82C686=y
+CONFIG_MIPS_BOSTON=y
+CONFIG_PCI_XILINX=y
diff --git a/hw/mips/Makefile.objs b/hw/mips/Makefile.objs
index 9352a1c..48cd2ef 100644
--- a/hw/mips/Makefile.objs
+++ b/hw/mips/Makefile.objs
@@ -4,3 +4,4 @@ obj-$(CONFIG_JAZZ) += mips_jazz.o
 obj-$(CONFIG_FULONG) += mips_fulong2e.o
 obj-y += gt64xxx_pci.o
 obj-$(CONFIG_MIPS_CPS) += cps.o
+obj-$(CONFIG_MIPS_BOSTON) += boston.o
diff --git a/hw/mips/boston.c b/hw/mips/boston.c
new file mode 100644
index 000..560c8b4
--- /dev/null
+++ b/hw/mips/boston.c
@@ -0,0 +1,576 @@
+/*
+ * MIPS Boston development board emulation.
+ *
+ * Copyright (c) 2016 Imagination Technologies
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+
+#include "exec/address-spaces.h"
+#include "hw/boards.h"
+#include "hw/char/serial.h"
+#include "hw/hw.h"
+#include "hw/ide/pci.h"
+#include "hw/ide/ahci.h"
+#include "hw/loader.h"
+#include "hw/loader-fit.h"
+#include "hw/mips/cps.h"
+#include "hw/mips/cpudevs.h"
+#include "hw/pci-host/xilinx-pcie.h"
+#include "qapi/error.h"
+#include "qemu/cutils.h"
+#include "qemu/error-report.h"
+#include "qemu/log.h"
+#include "sysemu/char.h"
+#include "sysemu/device_tree.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/qtest.h"
+
+#include 
+
+#define TYPE_MIPS_BOSTON "mips-boston"
+#define BOSTON(obj) OBJECT_CHECK(BostonState, (obj), TYPE_MIPS_BOSTON)
+
+typedef struct {
+SysBusDevice parent_obj;
+
+MachineState *mach;
+MIPSCPSState *cps;
+SerialState *uart;
+
+CharBackend lcd_displ

Re: [Qemu-devel] [PATCH 14/17] qmp: add x-debug-block-dirty-bitmap-sha256

2017-02-15 Thread John Snow


On 02/13/2017 04:54 AM, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Reviewed-by: Max Reitz 

This is simply the same as the version in the other two series, right?

Reviewed-by: John Snow 

> ---
>  block/dirty-bitmap.c |  5 +
>  blockdev.c   | 29 +
>  include/block/dirty-bitmap.h |  2 ++
>  include/qemu/hbitmap.h   |  8 
>  qapi/block-core.json | 27 +++
>  tests/Makefile.include   |  2 +-
>  util/hbitmap.c   | 11 +++
>  7 files changed, 83 insertions(+), 1 deletion(-)
> 
> diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
> index 32aa6eb..5bec99b 100644
> --- a/block/dirty-bitmap.c
> +++ b/block/dirty-bitmap.c
> @@ -558,3 +558,8 @@ BdrvDirtyBitmap *bdrv_next_dirty_bitmap(BlockDriverState 
> *bs,
>  
>  return QLIST_NEXT(bitmap, list);
>  }
> +
> +char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap *bitmap, Error **errp)
> +{
> +return hbitmap_sha256(bitmap->bitmap, errp);
> +}
> diff --git a/blockdev.c b/blockdev.c
> index db82ac9..4d06885 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -2790,6 +2790,35 @@ void qmp_block_dirty_bitmap_clear(const char *node, 
> const char *name,
>  aio_context_release(aio_context);
>  }
>  
> +BlockDirtyBitmapSha256 *qmp_x_debug_block_dirty_bitmap_sha256(const char 
> *node,
> +  const char 
> *name,
> +  Error **errp)
> +{
> +AioContext *aio_context;
> +BdrvDirtyBitmap *bitmap;
> +BlockDriverState *bs;
> +BlockDirtyBitmapSha256 *ret = NULL;
> +char *sha256;
> +
> +bitmap = block_dirty_bitmap_lookup(node, name, &bs, &aio_context, errp);
> +if (!bitmap || !bs) {
> +return NULL;
> +}
> +
> +sha256 = bdrv_dirty_bitmap_sha256(bitmap, errp);
> +if (sha256 == NULL) {
> +goto out;
> +}
> +
> +ret = g_new(BlockDirtyBitmapSha256, 1);
> +ret->sha256 = sha256;
> +
> +out:
> +aio_context_release(aio_context);
> +
> +return ret;
> +}
> +
>  void hmp_drive_del(Monitor *mon, const QDict *qdict)
>  {
>  const char *id = qdict_get_str(qdict, "id");
> diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
> index 20b3ec7..ded872a 100644
> --- a/include/block/dirty-bitmap.h
> +++ b/include/block/dirty-bitmap.h
> @@ -78,4 +78,6 @@ void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap 
> *bitmap);
>  BdrvDirtyBitmap *bdrv_next_dirty_bitmap(BlockDriverState *bs,
>  BdrvDirtyBitmap *bitmap);
>  
> +char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap *bitmap, Error **errp);
> +
>  #endif
> diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
> index 9239fe5..f353e56 100644
> --- a/include/qemu/hbitmap.h
> +++ b/include/qemu/hbitmap.h
> @@ -238,6 +238,14 @@ void hbitmap_deserialize_zeroes(HBitmap *hb, uint64_t 
> start, uint64_t count,
>  void hbitmap_deserialize_finish(HBitmap *hb);
>  
>  /**
> + * hbitmap_sha256:
> + * @bitmap: HBitmap to operate on.
> + *
> + * Returns SHA256 hash of the last level.
> + */
> +char *hbitmap_sha256(const HBitmap *bitmap, Error **errp);
> +
> +/**
>   * hbitmap_free:
>   * @hb: HBitmap to operate on.
>   *
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 932f5bb..8646054 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1632,6 +1632,33 @@
>'data': 'BlockDirtyBitmap' }
>  
>  ##
> +# @BlockDirtyBitmapSha256:
> +#
> +# SHA256 hash of dirty bitmap data
> +#
> +# @sha256: ASCII representation of SHA256 bitmap hash
> +#
> +# Since: 2.9
> +##
> +  { 'struct': 'BlockDirtyBitmapSha256',
> +'data': {'sha256': 'str'} }
> +
> +##
> +# @x-debug-block-dirty-bitmap-sha256:
> +#
> +# Get bitmap SHA256
> +#
> +# Returns: BlockDirtyBitmapSha256 on success
> +#  If @node is not a valid block device, DeviceNotFound
> +#  If @name is not found or if hashing has failed, GenericError with 
> an
> +#  explanation
> +#
> +# Since: 2.9
> +##
> +  { 'command': 'x-debug-block-dirty-bitmap-sha256',
> +'data': 'BlockDirtyBitmap', 'returns': 'BlockDirtyBitmapSha256' }
> +
> +##
>  # @blockdev-mirror:
>  #
>  # Start mirroring a block device's writes to a new destination.
> diff --git a/tests/Makefile.include b/tests/Makefile.include
> index 634394a..7a71b4d 100644
> --- a/tests/Makefile.include
> +++ b/tests/Makefile.include
> @@ -526,7 +526,7 @@ tests/test-blockjob$(EXESUF): tests/test-blockjob.o 
> $(test-block-obj-y) $(test-u
>  tests/test-blockjob-txn$(EXESUF): tests/test-blockjob-txn.o 
> $(test-block-obj-y) $(test-util-obj-y)
>  tests/test-thread-pool$(EXESUF): tests/test-thread-pool.o $(test-block-obj-y)
>  tests/test-iov$(EXESUF): tests/test-iov.o $(test-util-obj-y)
> -tests/test-hbitmap$(EXESUF): tests/test-hbitmap.o $(test-util-obj-y)
> +tests/test-hbitmap$(

[Qemu-devel] [PATCH v5 5/8] dtc: Update requirement to v1.4.2

2017-02-15 Thread Yongbok Kim
From: Paul Burton 

In order to obtain fdt_first_subnode & fdt_next_subnode symbols from
libfdt for use by a later patch, bump the requirement for dtc to v1.4.2
& the submodule to that same version.

Signed-off-by: Paul Burton 
Reviewed-by: Yongbok Kim 
Signed-off-by: Yongbok Kim 
---
 configure | 6 +++---
 dtc   | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/configure b/configure
index 1c9655e..4b68861 100755
--- a/configure
+++ b/configure
@@ -3396,11 +3396,11 @@ fi
 if test "$fdt" != "no" ; then
   fdt_libs="-lfdt"
   # explicitly check for libfdt_env.h as it is missing in some stable installs
-  # and test for required functions to make sure we are on a version >= 1.4.0
+  # and test for required functions to make sure we are on a version >= 1.4.2
   cat > $TMPC << EOF
 #include 
 #include 
-int main(void) { fdt_get_property_by_offset(0, 0, 0); return 0; }
+int main(void) { fdt_first_subnode(0, 0); return 0; }
 EOF
   if compile_prog "" "$fdt_libs" ; then
 # system DTC is good - use it
@@ -3418,7 +3418,7 @@ EOF
 fdt_libs="-L\$(BUILD_DIR)/dtc/libfdt $fdt_libs"
   elif test "$fdt" = "yes" ; then
 # have neither and want - prompt for system/submodule install
-error_exit "DTC (libfdt) version >= 1.4.0 not present. Your options:" \
+error_exit "DTC (libfdt) version >= 1.4.2 not present. Your options:" \
 "  (1) Preferred: Install the DTC (libfdt) devel package" \
 "  (2) Fetch the DTC submodule, using:" \
 "  git submodule update --init dtc"
diff --git a/dtc b/dtc
index 65cc4d2..ec02b34 16
--- a/dtc
+++ b/dtc
@@ -1 +1 @@
-Subproject commit 65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf
+Subproject commit ec02b34c05be04f249ffaaca4b666f5246877dea
-- 
2.7.4




[Qemu-devel] [PATCH v5 7/8] hw: xilinx-pcie: Add support for Xilinx AXI PCIe Controller

2017-02-15 Thread Yongbok Kim
From: Paul Burton 

Add support for emulating the Xilinx AXI Root Port Bridge for PCI
Express as described by Xilinx' PG055 document. This is a PCIe
controller that can be used with certain series of Xilinx FPGAs, and is
used on the MIPS Boston board which will make use of this code.
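
The interrupt path in the diff below (xilinx_pcie_queue_intr()) queues INTx
events in a ring buffer and flags an overflow when the buffer is full. A
minimal standalone sketch of that scheme follows; the demo_ names and the
depth of 16 slots are assumptions for illustration, not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_FIFO_SLOTS 16 /* hypothetical depth, not from the patch */

struct demo_fifo {
    uint32_t slot[DEMO_FIFO_SLOTS];
    unsigned int r;   /* read index */
    unsigned int w;   /* write index */
    bool overflow;    /* models an "interrupt FIFO overflow" status bit */
};

/* Queue one entry; one slot always stays free so r == w means "empty". */
static bool demo_fifo_push(struct demo_fifo *f, uint32_t v)
{
    unsigned int new_w = (f->w + 1) % DEMO_FIFO_SLOTS;

    if (new_w == f->r) {
        f->overflow = true; /* full: drop the entry and record the overflow */
        return false;
    }
    f->slot[f->w] = v;
    f->w = new_w;
    return true;
}

/* Entries are pending whenever the read and write indices differ. */
static bool demo_fifo_pending(const struct demo_fifo *f)
{
    return f->r != f->w;
}
```

With this layout the usable capacity is one less than the slot count, which
is the usual cost of telling "full" from "empty" without a separate counter.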

Signed-off-by: Paul Burton 
[yongbok@imgtec.com:
  removed returning on !level,
  updated IRQ connection with GPIO logic,
  moved xilinx_pcie_init() to boston.c
  replaced stw_le_p() with pci_set_word()
  and other cosmetic changes]
Signed-off-by: Yongbok Kim 
---
 hw/pci-host/Makefile.objs |   1 +
 hw/pci-host/xilinx-pcie.c | 328 ++
 include/hw/pci-host/xilinx-pcie.h |  68 
 3 files changed, 397 insertions(+)
 create mode 100644 hw/pci-host/xilinx-pcie.c
 create mode 100644 include/hw/pci-host/xilinx-pcie.h

diff --git a/hw/pci-host/Makefile.objs b/hw/pci-host/Makefile.objs
index 45f1f0e..9c7909c 100644
--- a/hw/pci-host/Makefile.objs
+++ b/hw/pci-host/Makefile.objs
@@ -16,3 +16,4 @@ common-obj-$(CONFIG_FULONG) += bonito.o
 common-obj-$(CONFIG_PCI_PIIX) += piix.o
 common-obj-$(CONFIG_PCI_Q35) += q35.o
 common-obj-$(CONFIG_PCI_GENERIC) += gpex.o
+common-obj-$(CONFIG_PCI_XILINX) += xilinx-pcie.o
diff --git a/hw/pci-host/xilinx-pcie.c b/hw/pci-host/xilinx-pcie.c
new file mode 100644
index 000..8b71e2d
--- /dev/null
+++ b/hw/pci-host/xilinx-pcie.c
@@ -0,0 +1,328 @@
+/*
+ * Xilinx PCIe host controller emulation.
+ *
+ * Copyright (c) 2016 Imagination Technologies
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "hw/pci/pci_bridge.h"
+#include "hw/pci-host/xilinx-pcie.h"
+
+enum root_cfg_reg {
+/* Interrupt Decode Register */
+ROOTCFG_INTDEC  = 0x138,
+
+/* Interrupt Mask Register */
+ROOTCFG_INTMASK = 0x13c,
+/* INTx Interrupt Received */
+#define ROOTCFG_INTMASK_INTX (1 << 16)
+/* MSI Interrupt Received */
+#define ROOTCFG_INTMASK_MSI (1 << 17)
+
+/* PHY Status/Control Register */
+ROOTCFG_PSCR = 0x144,
+/* Link Up */
+#define ROOTCFG_PSCR_LINK_UP (1 << 11)
+
+/* Root Port Status/Control Register */
+ROOTCFG_RPSCR   = 0x148,
+/* Bridge Enable */
+#define ROOTCFG_RPSCR_BRIDGEEN  (1 << 0)
+/* Interrupt FIFO Not Empty */
+#define ROOTCFG_RPSCR_INTNEMPTY (1 << 18)
+/* Interrupt FIFO Overflow */
+#define ROOTCFG_RPSCR_INTOVF (1 << 19)
+
+/* Root Port Interrupt FIFO Read Register 1 */
+ROOTCFG_RPIFR1  = 0x158,
+#define ROOTCFG_RPIFR1_INT_LANE_SHIFT   27
+#define ROOTCFG_RPIFR1_INT_ASSERT_SHIFT 29
+#define ROOTCFG_RPIFR1_INT_VALID_SHIFT  31
+/* Root Port Interrupt FIFO Read Register 2 */
+ROOTCFG_RPIFR2  = 0x15c,
+};
+
+static void xilinx_pcie_update_intr(XilinxPCIEHost *s,
+uint32_t set, uint32_t clear)
+{
+int level;
+
+s->intr |= set;
+s->intr &= ~clear;
+
+if (s->intr_fifo_r != s->intr_fifo_w) {
+s->intr |= ROOTCFG_INTMASK_INTX;
+}
+
+level = !!(s->intr & s->intr_mask);
+qemu_set_irq(s->irq, level);
+}
+
+static void xilinx_pcie_queue_intr(XilinxPCIEHost *s,
+   uint32_t fifo_reg1, uint32_t fifo_reg2)
+{
+XilinxPCIEInt *intr;
+unsigned int new_w;
+
+new_w = (s->intr_fifo_w + 1) % ARRAY_SIZE(s->intr_fifo);
+if (new_w == s->intr_fifo_r) {
+s->rpscr |= ROOTCFG_RPSCR_INTOVF;
+return;
+}
+
+intr = &s->intr_fifo[s->intr_fifo_w];
+s->intr_fifo_w = new_w;
+
+intr->fifo_reg1 = fifo_reg1;
+intr->fifo_reg2 = fifo_reg2;
+
+xilinx_pcie_update_intr(s, ROOTCFG_INTMASK_INTX, 0);
+}
+
+static void xilinx_pcie_set_irq(void *opaque, int irq_num, int level)
+{
+XilinxPCIEHost *s = XILINX_PCIE_HOST(opaque);
+
+xilinx_pcie_queue_intr(s,
+   (irq_num << ROOTCFG_RPIFR1_INT_LANE_SHIFT) |
+   (level << ROOTCFG_RPIFR1_INT_ASSERT_SHIFT) |
+   (1 << ROOTCFG_RPIFR1_INT_VALID_SHIFT),
+   0);
+}
+
+static void xilinx_pcie_host_realize(DeviceState *dev, Error **errp)
+{
+PCIHostState *pci = PCI_HOST_BRIDGE(dev);
+XilinxPCIEHost *s = XILINX_PCIE_HOST(dev);
+SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
+PCIExpressHost *pex = PCIE_HOST_BRIDGE(dev);
+
+snprintf(s->name, sizeof(s

[Qemu-devel] [PATCH v5 4/8] target-mips: Provide function to test if a CPU supports an ISA

2017-02-15 Thread Yongbok Kim
From: Paul Burton 

Provide a new cpu_supports_isa function which allows callers to
determine whether a CPU supports one of the ISA_ flags, by testing
whether the associated struct mips_def_t sets the ISA flags in its
insn_flags field.

An example use of this is to allow boards which generate bootloader code
to determine the properties of the CPU that will be used, for example
whether the CPU is 64 bit or which architecture revision it implements.
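
As a rough illustration of the lookup described above, the sketch below
tests a flag against a table of CPU definitions. The table, the model names
and the DEMO_ISA_ flag values are all hypothetical; the real ISA_ flags and
mips_def_t entries live in QEMU.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits standing in for QEMU's ISA_ flags. */
#define DEMO_ISA_MIPS32   (1u << 0)
#define DEMO_ISA_MIPS64   (1u << 1)
#define DEMO_ISA_MIPS64R6 (1u << 2)

struct demo_cpu_def {
    const char *name;
    unsigned int insn_flags;
};

/* Hypothetical CPU models, for illustration only. */
static const struct demo_cpu_def demo_defs[] = {
    { "demo32",   DEMO_ISA_MIPS32 },
    { "demo64r6", DEMO_ISA_MIPS32 | DEMO_ISA_MIPS64 | DEMO_ISA_MIPS64R6 },
};

/* True if the named model sets any of the requested ISA bits. */
static bool demo_cpu_supports_isa(const char *model, unsigned int isa)
{
    size_t i;

    for (i = 0; i < sizeof(demo_defs) / sizeof(demo_defs[0]); i++) {
        if (strcmp(demo_defs[i].name, model) == 0) {
            return (demo_defs[i].insn_flags & isa) != 0;
        }
    }
    return false; /* unknown model: report "not supported" */
}
```

A board could use such a check to pick between 32 bit and 64 bit bootloader
code before the CPU object exists.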

Signed-off-by: Paul Burton 
Reviewed-by: Leon Alrae 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Yongbok Kim 
---
 target/mips/cpu.h   |  1 +
 target/mips/translate.c | 10 ++
 2 files changed, 11 insertions(+)

diff --git a/target/mips/cpu.h b/target/mips/cpu.h
index e1c78f5..4a4747a 100644
--- a/target/mips/cpu.h
+++ b/target/mips/cpu.h
@@ -815,6 +815,7 @@ int cpu_mips_signal_handler(int host_signum, void *pinfo, 
void *puc);
 
 #define cpu_init(cpu_model) CPU(cpu_mips_init(cpu_model))
 bool cpu_supports_cps_smp(const char *cpu_model);
+bool cpu_supports_isa(const char *cpu_model, unsigned int isa);
 void cpu_set_exception_base(int vp_index, target_ulong address);
 
 /* TODO QOM'ify CPU reset and remove */
diff --git a/target/mips/translate.c b/target/mips/translate.c
index 7f8ecf4..8b4a072 100644
--- a/target/mips/translate.c
+++ b/target/mips/translate.c
@@ -20233,6 +20233,16 @@ bool cpu_supports_cps_smp(const char *cpu_model)
 return (def->CP0_Config3 & (1 << CP0C3_CMGCR)) != 0;
 }
 
+bool cpu_supports_isa(const char *cpu_model, unsigned int isa)
+{
+const mips_def_t *def = cpu_mips_find_by_name(cpu_model);
+if (!def) {
+return false;
+}
+
+return (def->insn_flags & isa) != 0;
+}
+
 void cpu_set_exception_base(int vp_index, target_ulong address)
 {
 MIPSCPU *vp = MIPS_CPU(qemu_get_cpu(vp_index));
-- 
2.7.4




[Qemu-devel] [PATCH v5 2/8] hw/mips_gictimer: provide API for retrieving frequency

2017-02-15 Thread Yongbok Kim
From: Paul Burton 

Provide a new function mips_gictimer_get_freq() which returns the
frequency at which a GIC timer will count. This will be useful for
boards which perform setup based upon this frequency.
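
For reference, the arithmetic behind the returned value is just the inverse
of the timer period. The sketch below restates it outside QEMU;
demo_timer_freq_hz() is a hypothetical stand-in, and the constants mirror
the 10 ns period shown in the diff.

```c
#include <stdint.h>

#define NANOSECONDS_PER_SECOND 1000000000LL
#define TIMER_PERIOD 10 /* 10 ns period, as in the patch */

/* Hypothetical stand-in for mips_gictimer_get_freq(): period in ns -> Hz. */
static uint32_t demo_timer_freq_hz(void)
{
    return (uint32_t)(NANOSECONDS_PER_SECOND / TIMER_PERIOD);
}
```

A 10 ns period therefore corresponds to 100 MHz.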

Signed-off-by: Paul Burton 
Reviewed-by: Leon Alrae 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Yongbok Kim 
---
 hw/timer/mips_gictimer.c | 5 +
 include/hw/timer/mips_gictimer.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/hw/timer/mips_gictimer.c b/hw/timer/mips_gictimer.c
index 3698889..f5c5806 100644
--- a/hw/timer/mips_gictimer.c
+++ b/hw/timer/mips_gictimer.c
@@ -14,6 +14,11 @@
 
 #define TIMER_PERIOD 10 /* 10 ns period for 100 Mhz frequency */
 
+uint32_t mips_gictimer_get_freq(MIPSGICTimerState *gic)
+{
+return NANOSECONDS_PER_SECOND / TIMER_PERIOD;
+}
+
 static void gic_vptimer_update(MIPSGICTimerState *gictimer,
uint32_t vp_index, uint64_t now)
 {
diff --git a/include/hw/timer/mips_gictimer.h b/include/hw/timer/mips_gictimer.h
index c8bc5d2..c7ca6c8 100644
--- a/include/hw/timer/mips_gictimer.h
+++ b/include/hw/timer/mips_gictimer.h
@@ -31,6 +31,7 @@ struct MIPSGICTimerState {
 MIPSGICTimerCB *cb;
 };
 
+uint32_t mips_gictimer_get_freq(MIPSGICTimerState *gic);
 uint32_t mips_gictimer_get_sh_count(MIPSGICTimerState *gic);
 void mips_gictimer_store_sh_count(MIPSGICTimerState *gic, uint64_t count);
 uint32_t mips_gictimer_get_vp_compare(MIPSGICTimerState *gictimer,
-- 
2.7.4




[Qemu-devel] [PATCH v5 6/8] loader: Support Flattened Image Trees (FIT images)

2017-02-15 Thread Yongbok Kim
From: Paul Burton 

Introduce support for loading Flattened Image Trees, as used by modern
U-Boot. FIT images are essentially flattened device tree files which
contain binary images such as kernels, FDTs or ramdisks along with one
or more configuration nodes describing boot configurations.

The MIPS Boston board typically boots kernels in the form of FIT images,
and will make use of this code.
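
Device tree property values are stored big-endian, and fit_image_addr() in
the diff below accepts address properties of either 4 or 8 bytes. A
standalone sketch of that decoding, without libfdt, might look as follows;
demo_decode_addr() is a hypothetical name used only for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode a big-endian address property of 4 or 8 bytes.
 * Returns 0 on success, -ENOENT if the property is absent,
 * -EINVAL for any other length.
 */
static int demo_decode_addr(const uint8_t *prop, int len, uint64_t *addr)
{
    uint64_t v = 0;
    int i;

    if (!prop) {
        return -ENOENT;
    }
    if (len != 4 && len != 8) {
        return -EINVAL;
    }
    for (i = 0; i < len; i++) {
        v = (v << 8) | prop[i]; /* most significant byte first */
    }
    *addr = v;
    return 0;
}
```

Building the value byte by byte avoids both host-endianness assumptions and
unaligned loads from the property buffer.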

Signed-off-by: Paul Burton 
[yongbok@imgtec.com: fixed potential memory leaks]
Signed-off-by: Yongbok Kim 
---
 hw/core/Makefile.objs   |   1 +
 hw/core/loader-fit.c| 325 
 hw/core/loader.c|   7 +-
 include/hw/loader-fit.h |  41 ++
 include/hw/loader.h |   6 +
 5 files changed, 374 insertions(+), 6 deletions(-)
 create mode 100644 hw/core/loader-fit.c
 create mode 100644 include/hw/loader-fit.h

diff --git a/hw/core/Makefile.objs b/hw/core/Makefile.objs
index 7f8c9dc..ff59512 100644
--- a/hw/core/Makefile.objs
+++ b/hw/core/Makefile.objs
@@ -13,6 +13,7 @@ common-obj-$(CONFIG_PTIMER) += ptimer.o
 common-obj-$(CONFIG_SOFTMMU) += sysbus.o
 common-obj-$(CONFIG_SOFTMMU) += machine.o
 common-obj-$(CONFIG_SOFTMMU) += loader.o
+common-obj-$(CONFIG_SOFTMMU) += loader-fit.o
 common-obj-$(CONFIG_SOFTMMU) += qdev-properties-system.o
 common-obj-$(CONFIG_SOFTMMU) += register.o
 common-obj-$(CONFIG_SOFTMMU) += or-irq.o
diff --git a/hw/core/loader-fit.c b/hw/core/loader-fit.c
new file mode 100644
index 000..4ddd35e
--- /dev/null
+++ b/hw/core/loader-fit.c
@@ -0,0 +1,325 @@
+/*
+ * Flattened Image Tree loader.
+ *
+ * Copyright (c) 2016 Imagination Technologies
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "exec/address-spaces.h"
+#include "exec/memory.h"
+#include "hw/loader.h"
+#include "hw/loader-fit.h"
+#include "qemu/cutils.h"
+#include "qemu/error-report.h"
+#include "sysemu/device_tree.h"
+#include "sysemu/sysemu.h"
+
+#include 
+#include 
+
+#define FIT_LOADER_MAX_PATH (128)
+
+static const void *fit_load_image_alloc(const void *itb, const char *name,
+int *poff, size_t *psz)
+{
+const void *data;
+const char *comp;
+void *uncomp_data;
+char path[FIT_LOADER_MAX_PATH];
+int off, sz;
+ssize_t uncomp_len;
+
+snprintf(path, sizeof(path), "/images/%s", name);
+
+off = fdt_path_offset(itb, path);
+if (off < 0) {
+return NULL;
+}
+if (poff) {
+*poff = off;
+}
+
+data = fdt_getprop(itb, off, "data", &sz);
+if (!data) {
+return NULL;
+}
+
+comp = fdt_getprop(itb, off, "compression", NULL);
+if (!comp || !strcmp(comp, "none")) {
+if (psz) {
+*psz = sz;
+}
+uncomp_data = g_malloc(sz);
+memmove(uncomp_data, data, sz);
+return uncomp_data;
+}
+
+if (!strcmp(comp, "gzip")) {
+uncomp_len = UBOOT_MAX_GUNZIP_BYTES;
+uncomp_data = g_malloc(uncomp_len);
+
+uncomp_len = gunzip(uncomp_data, uncomp_len, (void *) data, sz);
+if (uncomp_len < 0) {
+error_printf("unable to decompress %s image\n", name);
+g_free(uncomp_data);
+return NULL;
+}
+
+data = g_realloc(uncomp_data, uncomp_len);
+if (psz) {
+*psz = uncomp_len;
+}
+return data;
+}
+
+error_printf("unknown compression '%s'\n", comp);
+return NULL;
+}
+
+static int fit_image_addr(const void *itb, int img, const char *name,
+  hwaddr *addr)
+{
+const void *prop;
+int len;
+
+prop = fdt_getprop(itb, img, name, &len);
+if (!prop) {
+return -ENOENT;
+}
+
+switch (len) {
+case 4:
+*addr = fdt32_to_cpu(*(fdt32_t *)prop);
+return 0;
+case 8:
+*addr = fdt64_to_cpu(*(fdt64_t *)prop);
+return 0;
+default:
+error_printf("invalid %s address length %d\n", name, len);
+return -EINVAL;
+}
+}
+
+static int fit_load_kernel(const struct fit_loader *ldr, const void *itb,
+   int cfg, void *opaque, hwaddr *pend)
+{
+const char *name;
+const void *data;
+const void *load_data;
+hwaddr load_addr, entry_addr;
+int img_off, err;
+size_t sz;
+int ret;
+
+name = fdt_getprop(itb, cfg, "kern

[Qemu-devel] [PATCH v5 0/8] MIPS Boston board support

2017-02-15 Thread Yongbok Kim
This series introduces support for the MIPS Boston development board. It begins
by introducing support for moving MIPS Coherence Manager GCRs which Boston
software typically does to avoid conflicting with its flash memory region. An
API is then added to retrieve the emulated MIPS GIC timer frequency, which is
used to report system clock frequency to software via "platform registers"
which the Boston board provides. An issue with the MIPS GIC that current Boston
Linux kernels encounter is fixed, and an API introduced to allow the board to
determine whether the MIPS CPS hardware is supported.

The last 3 patches are more extensive, providing support for the FIT image
format used with Boston, the Xilinx PCIe controller which Boston boards include
3 of, and finally the Boston board support itself.

This can be tested with either U-Boot or Linux if desired. U-Boot support is
available in the following patchset:

  https://www.mail-archive.com/u-boot@lists.denx.de/msg221003.html

Linux kernel support can be found as part of the generic kernel patchset:

  https://www.linux-mips.org/archives/linux-mips/2016-08/msg00456.html

Hopefully this will be merged for v4.9, but it can also be found in a
downstream kernel from Imagination Technologies in the "eng" branch of:

  git://git.linux-mips.org/pub/scm/linux-mti.git

Linux may be built with:

  $ make 64r6el_defconfig
  $ make

The arch/mips/boot/vmlinux.gz.itb image may then be provided to QEMU's -kernel
argument, for example:

  $ qemu-system-mips64el -M boston -kernel vmlinux.gz.itb -serial stdio

v5:
  loader-fit
quick fix for the redefinition issue reported from Patchew.

v4:
Yongbok Kim:
  boston
ignore missing bios/kernel for qtest.

v3:
Yongbok Kim:
  loader-fit
fixed potential memory leaks.
  xilinx-pcie
added descriptions for macros. (Alistair)
removed returning on !level. (Alistair)
updated IRQ connection with GPIO logic (Alistair)
moved xilinx_pcie_init() to boston.c (Alistair)
replaced stw_le_p() with pci_set_word()
  boston
isolated boston machine support for mips64el.
updated for recent Chardev changes.
  and other cosmetic changes.

v1, v2:
Paul Burton (8):
  hw/mips_cmgcr: allow GCR base to be moved
  hw/mips_gictimer: provide API for retrieving frequency
  hw/mips_gic: Update pin state on mask changes
  target-mips: Provide function to test if a CPU supports an ISA
  dtc: Update requirement to v1.4.2
  loader: Support Flattened Image Trees (FIT images)
  hw: xilinx-pcie: Add support for Xilinx AXI PCIe Controller
  hw/mips: MIPS Boston board support

 configure|   8 +-
 default-configs/mips64el-softmmu.mak |   2 +
 dtc  |   2 +-
 hw/core/Makefile.objs|   1 +
 hw/core/loader-fit.c | 325 
 hw/core/loader.c |   7 +-
 hw/intc/mips_gic.c   |  56 ++--
 hw/mips/Makefile.objs|   1 +
 hw/mips/boston.c | 576 +++
 hw/misc/mips_cmgcr.c |  17 ++
 hw/pci-host/Makefile.objs|   1 +
 hw/pci-host/xilinx-pcie.c| 328 
 hw/timer/mips_gictimer.c |   5 +
 include/hw/loader-fit.h  |  41 +++
 include/hw/loader.h  |   6 +
 include/hw/misc/mips_cmgcr.h |   3 +
 include/hw/pci-host/xilinx-pcie.h|  68 +
 include/hw/timer/mips_gictimer.h |   1 +
 target/mips/cpu.h|   1 +
 target/mips/translate.c  |  10 +
 20 files changed, 1423 insertions(+), 36 deletions(-)
 create mode 100644 hw/core/loader-fit.c
 create mode 100644 hw/mips/boston.c
 create mode 100644 hw/pci-host/xilinx-pcie.c
 create mode 100644 include/hw/loader-fit.h
 create mode 100644 include/hw/pci-host/xilinx-pcie.h

-- 
2.7.4




[Qemu-devel] [PATCH v5 3/8] hw/mips_gic: Update pin state on mask changes

2017-02-15 Thread Yongbok Kim
From: Paul Burton 

If the GIC interrupt mask is changed by a write to the smask (set mask)
or rmask (reset mask) registers, we need to re-evaluate the state of the
pins/IRQs fed to the CPU. Without doing so we risk leaving a pin high
despite the interrupt that led to that state being masked, or losing
interrupts if an already pending interrupt is unmasked.
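
Both failure modes can be reproduced in a small model in which the pin level
is recomputed as the OR of all enabled, pending sources whenever either the
pending state or the mask changes. The demo_ types below are hypothetical
and far simpler than MIPSGICState.

```c
#include <stdbool.h>

#define DEMO_NUM_IRQ 4
#define DEMO_NUM_PIN 2

struct demo_irq {
    int pin;       /* CPU pin this source is mapped to */
    bool enabled;  /* mask bit */
    bool pending;  /* current level of the source */
};

struct demo_gic {
    struct demo_irq irq[DEMO_NUM_IRQ];
    bool pin_level[DEMO_NUM_PIN];
};

/* Recompute a pin from scratch: OR of all enabled, pending sources on it. */
static void demo_update_pin(struct demo_gic *g, int pin)
{
    bool level = false;
    int i;

    for (i = 0; i < DEMO_NUM_IRQ; i++) {
        if (g->irq[i].pin == pin && g->irq[i].enabled && g->irq[i].pending) {
            level = true;
            break; /* no need to look at the remaining sources */
        }
    }
    g->pin_level[pin] = level;
}

/* A source changes level: only enabled sources affect the pin. */
static void demo_set_irq(struct demo_gic *g, int n, bool level)
{
    g->irq[n].pending = level;
    if (g->irq[n].enabled) {
        demo_update_pin(g, g->irq[n].pin);
    }
}

/* A mask write: re-evaluate the pin so it tracks the new mask state. */
static void demo_set_mask(struct demo_gic *g, int n, bool enabled)
{
    g->irq[n].enabled = enabled;
    demo_update_pin(g, g->irq[n].pin);
}
```

Without the demo_update_pin() call in demo_set_mask(), unmasking a source
that is already pending would leave the pin low, and masking an asserted
source would leave it high.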

Signed-off-by: Paul Burton 
Reviewed-by: Leon Alrae 
Signed-off-by: Yongbok Kim 
---
 hw/intc/mips_gic.c | 56 ++
 1 file changed, 31 insertions(+), 25 deletions(-)

diff --git a/hw/intc/mips_gic.c b/hw/intc/mips_gic.c
index 6e25773..15e6e40 100644
--- a/hw/intc/mips_gic.c
+++ b/hw/intc/mips_gic.c
@@ -20,31 +20,29 @@
 #include "kvm_mips.h"
 #include "hw/intc/mips_gic.h"
 
-static void mips_gic_set_vp_irq(MIPSGICState *gic, int vp, int pin, int level)
+static void mips_gic_set_vp_irq(MIPSGICState *gic, int vp, int pin)
 {
-int ored_level = level;
+int ored_level = 0;
 int i;
 
 /* ORing pending registers sharing same pin */
-if (!ored_level) {
-for (i = 0; i < gic->num_irq; i++) {
-if ((gic->irq_state[i].map_pin & GIC_MAP_MSK) == pin &&
-gic->irq_state[i].map_vp == vp &&
-gic->irq_state[i].enabled) {
-ored_level |= gic->irq_state[i].pending;
-}
-if (ored_level) {
-/* no need to iterate all interrupts */
-break;
-}
+for (i = 0; i < gic->num_irq; i++) {
+if ((gic->irq_state[i].map_pin & GIC_MAP_MSK) == pin &&
+gic->irq_state[i].map_vp == vp &&
+gic->irq_state[i].enabled) {
+ored_level |= gic->irq_state[i].pending;
 }
-if (((gic->vps[vp].compare_map & GIC_MAP_MSK) == pin) &&
-(gic->vps[vp].mask & GIC_VP_MASK_CMP_MSK)) {
-/* ORing with local pending register (count/compare) */
-ored_level |= (gic->vps[vp].pend & GIC_VP_MASK_CMP_MSK) >>
-  GIC_VP_MASK_CMP_SHF;
+if (ored_level) {
+/* no need to iterate all interrupts */
+break;
 }
 }
+if (((gic->vps[vp].compare_map & GIC_MAP_MSK) == pin) &&
+(gic->vps[vp].mask & GIC_VP_MASK_CMP_MSK)) {
+/* ORing with local pending register (count/compare) */
+ored_level |= (gic->vps[vp].pend & GIC_VP_MASK_CMP_MSK) >>
+  GIC_VP_MASK_CMP_SHF;
+}
 if (kvm_enabled())  {
 kvm_mips_set_ipi_interrupt(mips_env_get_cpu(gic->vps[vp].env),
pin + GIC_CPU_PIN_OFFSET,
@@ -55,21 +53,27 @@ static void mips_gic_set_vp_irq(MIPSGICState *gic, int vp, 
int pin, int level)
 }
 }
 
-static void gic_set_irq(void *opaque, int n_IRQ, int level)
+static void gic_update_pin_for_irq(MIPSGICState *gic, int n_IRQ)
 {
-MIPSGICState *gic = (MIPSGICState *) opaque;
 int vp = gic->irq_state[n_IRQ].map_vp;
 int pin = gic->irq_state[n_IRQ].map_pin & GIC_MAP_MSK;
 
+if (vp < 0 || vp >= gic->num_vps) {
+return;
+}
+mips_gic_set_vp_irq(gic, vp, pin);
+}
+
+static void gic_set_irq(void *opaque, int n_IRQ, int level)
+{
+MIPSGICState *gic = (MIPSGICState *) opaque;
+
 gic->irq_state[n_IRQ].pending = (uint8_t) level;
 if (!gic->irq_state[n_IRQ].enabled) {
 /* GIC interrupt source disabled */
 return;
 }
-if (vp < 0 || vp >= gic->num_vps) {
-return;
-}
-mips_gic_set_vp_irq(gic, vp, pin, level);
+gic_update_pin_for_irq(gic, n_IRQ);
 }
 
 #define OFFSET_CHECK(c) \
@@ -209,7 +213,7 @@ static void gic_timer_store_vp_compare(MIPSGICState *gic, 
uint32_t vp_index,
 gic->vps[vp_index].pend &= ~(1 << GIC_LOCAL_INT_COMPARE);
 if (gic->vps[vp_index].compare_map & GIC_MAP_TO_PIN_MSK) {
 uint32_t pin = (gic->vps[vp_index].compare_map & GIC_MAP_MSK);
-mips_gic_set_vp_irq(gic, vp_index, pin, 0);
+mips_gic_set_vp_irq(gic, vp_index, pin);
 }
 mips_gictimer_store_vp_compare(gic->gic_timer, vp_index, compare);
 }
@@ -286,6 +290,7 @@ static void gic_write(void *opaque, hwaddr addr, uint64_t 
data, unsigned size)
 OFFSET_CHECK((base + size * 8) <= gic->num_irq);
 for (i = 0; i < size * 8; i++) {
 gic->irq_state[base + i].enabled &= !((data >> i) & 1);
+gic_update_pin_for_irq(gic, base + i);
 }
 break;
 case GIC_SH_WEDGE_OFS:
@@ -305,6 +310,7 @@ static void gic_write(void *opaque, hwaddr addr, uint64_t 
data, unsigned size)
 OFFSET_CHECK((base + size * 8) <= gic->num_irq);
 for (i = 0; i < size * 8; i++) {
 gic->irq_state[base + i].enabled |= (data >> i) & 1;
+gic_update_pin_for_irq(gic, base + i);
 }
 break;
 case GIC_SH_MAP0_PIN_OFS ... GIC_SH_MAP255_PIN_OFS:
-- 
2.7.4




[Qemu-devel] [PATCH v5 1/8] hw/mips_cmgcr: allow GCR base to be moved

2017-02-15 Thread Yongbok Kim
From: Paul Burton 

Support moving the GCR base address & updating the CPU's CP0 CMGCRBase
register appropriately. This is required if a platform needs to move its
GCRs away from other memory, as the MIPS Boston development board does
to avoid its flash memory.
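
The register update itself is a mask and a shift, as update_gcr_base() in
the diff below shows. The sketch here restates it standalone; the mask value
is an assumption (a 32 KiB-aligned base within 48 bits), and the demo_ names
are hypothetical.

```c
#include <stdint.h>

/* Assumed mask: keeps a 32 KiB-aligned base address; not from the patch. */
#define DEMO_GCRBASE_MSK 0xffffffff8000ULL

struct demo_gcr {
    uint64_t gcr_base;      /* where the GCR region is mapped */
    uint64_t cp0_cmgcrbase; /* value mirrored into the CPU's CP0 register */
};

/* Apply a guest write to the GCR base register. */
static void demo_update_gcr_base(struct demo_gcr *g, uint64_t val)
{
    g->gcr_base = val & DEMO_GCRBASE_MSK;  /* drop the low control bits */
    g->cp0_cmgcrbase = g->gcr_base >> 4;   /* CP0 holds the base shifted by 4 */
}
```

In the real device the same step also moves the memory region and updates
every CPU, which this sketch leaves out.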

Signed-off-by: Paul Burton 
Reviewed-by: Leon Alrae 
Signed-off-by: Yongbok Kim 
---
 hw/misc/mips_cmgcr.c | 17 +
 include/hw/misc/mips_cmgcr.h |  3 +++
 2 files changed, 20 insertions(+)

diff --git a/hw/misc/mips_cmgcr.c b/hw/misc/mips_cmgcr.c
index b3ba166..a1edb53 100644
--- a/hw/misc/mips_cmgcr.c
+++ b/hw/misc/mips_cmgcr.c
@@ -29,6 +29,20 @@ static inline bool is_gic_connected(MIPSGCRState *s)
 return s->gic_mr != NULL;
 }
 
+static inline void update_gcr_base(MIPSGCRState *gcr, uint64_t val)
+{
+CPUState *cpu;
+MIPSCPU *mips_cpu;
+
+gcr->gcr_base = val & GCR_BASE_GCRBASE_MSK;
+memory_region_set_address(&gcr->iomem, gcr->gcr_base);
+
+CPU_FOREACH(cpu) {
+mips_cpu = MIPS_CPU(cpu);
+mips_cpu->env.CP0_CMGCRBase = gcr->gcr_base >> 4;
+}
+}
+
 static inline void update_cpc_base(MIPSGCRState *gcr, uint64_t val)
 {
 if (is_cpc_connected(gcr)) {
@@ -117,6 +131,9 @@ static void gcr_write(void *opaque, hwaddr addr, uint64_t 
data, unsigned size)
 MIPSGCRVPState *other_vps = &gcr->vps[current_vps->other];
 
 switch (addr) {
+case GCR_BASE_OFS:
+update_gcr_base(gcr, data);
+break;
 case GCR_GIC_BASE_OFS:
 update_gic_base(gcr, data);
 break;
diff --git a/include/hw/misc/mips_cmgcr.h b/include/hw/misc/mips_cmgcr.h
index a209d91..c9dfcb4 100644
--- a/include/hw/misc/mips_cmgcr.h
+++ b/include/hw/misc/mips_cmgcr.h
@@ -41,6 +41,9 @@
 #define GCR_L2_CONFIG_BYPASS_SHF 20
 #define GCR_L2_CONFIG_BYPASS_MSK((0x1ULL) << GCR_L2_CONFIG_BYPASS_SHF)
 
+/* GCR_BASE register fields */
+#define GCR_BASE_GCRBASE_MSK 0x8000ULL
+
 /* GCR_GIC_BASE register fields */
 #define GCR_GIC_BASE_GICEN_MSK   1
 #define GCR_GIC_BASE_GICBASE_MSK 0xFFFEULL
-- 
2.7.4




Re: [Qemu-devel] [PATCH v15 25/25] qcow2-bitmap: improve check_constraints_on_bitmap

2017-02-15 Thread John Snow


On 02/15/2017 05:10 AM, Vladimir Sementsov-Ogievskiy wrote:
> Add detailed error messages.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 

Reviewed-by: John Snow 

> ---
>  block/qcow2-bitmap.c | 48 ++--
>  1 file changed, 34 insertions(+), 14 deletions(-)
> 
> diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
> index 9177c56..e25c872 100644
> --- a/block/qcow2-bitmap.c
> +++ b/block/qcow2-bitmap.c
> @@ -160,28 +160,49 @@ static int check_table_entry(uint64_t entry, int 
> cluster_size)
>  
>  static int check_constraints_on_bitmap(BlockDriverState *bs,
> const char *name,
> -   uint32_t granularity)
> +   uint32_t granularity,
> +   Error **errp)
>  {
>  BDRVQcow2State *s = bs->opaque;
>  int granularity_bits = ctz32(granularity);
>  int64_t len = bdrv_getlength(bs);
> -bool fail;
>  
>  assert(granularity > 0);
>  assert((granularity & (granularity - 1)) == 0);
>  
>  if (len < 0) {
> +error_setg_errno(errp, -len, "Failed to get size of '%s'",
> + bdrv_get_device_or_node_name(bs));
>  return len;
>  }
>  
> -fail = (granularity_bits > BME_MAX_GRANULARITY_BITS) ||
> -   (granularity_bits < BME_MIN_GRANULARITY_BITS) ||
> -   (len > (uint64_t)BME_MAX_PHYS_SIZE << granularity_bits) ||
> -   (len > (uint64_t)BME_MAX_TABLE_SIZE * s->cluster_size <<
> -  granularity_bits) ||
> -   (strlen(name) > BME_MAX_NAME_SIZE);
> +if (granularity_bits > BME_MAX_GRANULARITY_BITS) {
> +error_setg(errp, "Granularity exceeds maximum (%u bytes)",
> +   1 << BME_MAX_GRANULARITY_BITS);
> +return -EINVAL;
> +}
> +if (granularity_bits < BME_MIN_GRANULARITY_BITS) {
> +error_setg(errp, "Granularity is under minimum (%u bytes)",
> +   1 << BME_MIN_GRANULARITY_BITS);
> +return -EINVAL;
> +}
>  
> -return fail ? -EINVAL : 0;
> +if ((len > (uint64_t)BME_MAX_PHYS_SIZE << granularity_bits) ||
> +(len > (uint64_t)BME_MAX_TABLE_SIZE * s->cluster_size <<
> +   granularity_bits))
> +{
> +error_setg(errp, "Too much space will be occupied by the bitmap. "
> +   "Use larger granularity");
> +return -EINVAL;
> +}
> +
> +if (strlen(name) > BME_MAX_NAME_SIZE) {
> +error_setg(errp, "Name length exceeds maximum (%u characters)",
> +   BME_MAX_NAME_SIZE);
> +return -EINVAL;
> +}
> +
> +return 0;
>  }
>  
>  static void clear_bitmap_table(BlockDriverState *bs, uint64_t *bitmap_table,
> @@ -1142,9 +1163,9 @@ void 
> qcow2_store_persistent_dirty_bitmaps(BlockDriverState *bs, Error **errp)
>  continue;
>  }
>  
> -if (check_constraints_on_bitmap(bs, name, granularity) < 0) {
> -error_setg(errp, "Bitmap '%s' doesn't satisfy the constraints",
> -   name);
> +if (check_constraints_on_bitmap(bs, name, granularity, errp) < 0) {
> +error_prepend(errp, "Bitmap '%s' doesn't satisfy the 
> constraints: ",
> +  name);
>  goto fail;
>  }
>  
> @@ -1233,8 +1254,7 @@ bool qcow2_can_store_new_dirty_bitmap(BlockDriverState 
> *bs,
>  bool found;
>  Qcow2BitmapList *bm_list;
>  
> -if (check_constraints_on_bitmap(bs, name, granularity) != 0) {
> -error_setg(errp, "The constraints are not satisfied");
> +if (check_constraints_on_bitmap(bs, name, granularity, errp) != 0) {
>  goto fail;
>  }
>  
> 

-- 
—js



Re: [Qemu-devel] RFC: How to make seccomp reliable and useful ?

2017-02-15 Thread Eduardo Otubo
On Wed, Feb 15, 2017 at 06:27:32PM +, Daniel P. Berrange wrote:
> The current impl of seccomp in QEMU is intentionally allowing a huge range
> of system calls to be executed. The goal was that running '-sandbox on'
> should never break any feature of QEMU, so naturally any syscall that can be
> executed on any codepath QEMU takes must be allowed.
> 
> This is good for usability because users don't need to understand the
> technical details of the sandbox technology, they merely say "on" and it
> "just works".
> Conversely though, this is bad for security because QEMU has to allow a huge
> range of system calls to be used due to its broad functionality.
> 
> During initial discussions for seccomp back in 2012 it was suggested that
> there might be alternate policies developed for QEMU which deny some features,
> but improve security overall. To the best of my knowledge, this has never been
> discussed again since then.
> 
> 
> In addition, since initially merging, there has been a steady stream of
> patches to whitelist further syscalls that were missing. Some of these were
> missing due to newly added functionality in QEMU since the original seccomp
> impl, while others have been missing since day 1. It is reasonable to expect
> that there are still many syscalls missing in the whitelist. In just a couple
> of minutes of comparing the whitelist vs global syscall list it was possible
> to identify two further missing syscalls. The '-netdev bridge,br=virbr0'
> network backend fails because setuid is blocked, preventing execution of the
> qemu-bridge-helper program. If built against glibc < 2.9, or running on
> kernel < 2.6.27 it will fail to call eventfd() because we only permit
> eventfd2() syscall, not the older eventfd() syscall used on older Linux. Some
> ifup scripts used with the -netdev arg may also break due to lack of chmod,
> flock, getxattr permissions. This risk of missing syscalls is why -sandbox
> defaults to off, and we've never considered defaulting it to on.
> 
> 
> The fundamental problem is that building a whitelist of syscalls used by QEMU
> emulators is an intractable problem. QEMU on my system links to 183 different
> shared libraries and there is no way in the world that anyone can figure out
> which code paths QEMU triggers in these libraries and thus identify which
> syscalls will be genuinely needed.
> 
> Thus a whitelist based approach for QEMU is doomed to always be missing some
> syscalls, resulting in unnecessary aborts of QEMU when it tickles some edge
> case. If you are lucky the abort() happens at startup so you see it quickly
> and can address it. If you are unlucky the abort() happens after your VM has
> been running for days/weeks/months and you lose data.
> 
> IOW, seccomp integration as it currently exists today in QEMU offers minimal
> security benefits, while at the same time causing spurious crashes which may
> cause user data loss from aborting a running VM, discouraging users from using
> even the minimal protection it offers.
> 
> I think we need to rework our seccomp support so that we can have a high
> enough level of confidence in it, that it could be enabled by default. At the
> same time we need to make it do something more tangibly useful from a
> security POV.
> 
> 
> First we need to admit that whitelisting is a failed approach, and switch to
> using blacklisting. Unless we do this, we'll never have high enough confidence
> to enable it by default - something that's never turned on might as well not
> exist at all.
> 
> 
> There is a reasonably easily identifiable set of syscalls that QEMU should
> never be permitted to use, no matter what configuration it is in, what helpers
> it spawns, or what libraries it links to, e.g. reboot, swapon, swapoff, syslog,
> mount, umount, kexec_*, etc - any syscall that affects global system state,
> rather than process local state should be forbidden.
> 
> There are some syscalls that are simply hardcoded to return ENOSYS which can
> be trivially blacklisted. afs_syscall, break, fattach, ftime, etc (see the
> man page 'unimplemented(2)').
> 
> There are some syscalls which are considered obsolete - they were previously
> useful, but no modern code would call them, as they have been superseded.
> For example, readdir replaced by getdents. We could blacklist these by default
> but provide a way to allow use of obsolete syscalls if running on older
> systems, e.g. '-sandbox on,obsolete=allow'. They might be obsolete enough that
> we decide to just block them permanently with no opt in - would need to
> analyse when their replacements appeared in widespread use.
> 
> There might be a few more syscalls which we can determine are never valid to
> use in QEMU or any library or helper program it might run. I expect this list
> to be very small though, given the impossibility of auditing code paths
> through millions of lines of code QEMU links to.
> 
> Everything else should be allowed.
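
The default-allow policy proposed above can be modelled in a few lines. This
is only a sketch of the decision logic, keyed by syscall name with a
hypothetical deny list drawn from the examples in the quoted text; it does
not install a real seccomp filter.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative deny list, taken from the syscalls named in the proposal. */
static const char *const demo_denied[] = {
    "reboot", "swapon", "swapoff", "syslog", "mount",
};

/* Blacklist policy: anything not explicitly denied is allowed. */
static bool demo_blacklist_allows(const char *syscall_name)
{
    size_t i;

    for (i = 0; i < sizeof(demo_denied) / sizeof(demo_denied[0]); i++) {
        if (strcmp(demo_denied[i], syscall_name) == 0) {
            return false;
        }
    }
    return true;
}
```

The contrast with a whitelist is in the final return: a syscall nobody
thought to list, such as the older eventfd(), is permitted here instead of
aborting the process.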
> 
> At this point we ha

Re: [Qemu-devel] iommu emulation

2017-02-15 Thread Jintack Lim
On Wed, Feb 15, 2017 at 5:50 PM, Alex Williamson  wrote:

> On Wed, 15 Feb 2017 17:05:35 -0500
> Jintack Lim  wrote:
>
> > On Tue, Feb 14, 2017 at 9:52 PM, Peter Xu  wrote:
> >
> > > On Tue, Feb 14, 2017 at 07:50:39AM -0500, Jintack Lim wrote:
> > >
> > > [...]
> > >
> > > > > > >> > I misunderstood what you said?
> > > > > > >
> > > > > > > > I failed to understand why a vIOMMU could help boost performance. :(
> > > > > > > Could you provide your command line here so that I can try to
> > > > > > > reproduce?
> > > > > >
> > > > > > Sure. This is the command line to launch L1 VM
> > > > > >
> > > > > > qemu-system-x86_64 -M q35,accel=kvm,kernel-irqchip=split \
> > > > > > -m 12G -device intel-iommu,intremap=on,eim=off,caching-mode=on \
> > > > > > > -drive file=/mydata/guest0.img,format=raw --nographic -cpu host \
> > > > > > -smp 4,sockets=4,cores=1,threads=1 \
> > > > > > -device vfio-pci,host=08:00.0,id=net0
> > > > > >
> > > > > > And this is for L2 VM.
> > > > > >
> > > > > > ./qemu-system-x86_64 -M q35,accel=kvm \
> > > > > > -m 8G \
> > > > > > -drive file=/vm/l2guest.img,format=raw --nographic -cpu host \
> > > > > > -device vfio-pci,host=00:03.0,id=net0
> > > > >
> > > > > ... here looks like these are command lines for L1/L2 guest, rather
> > > > > than L1 guest with/without vIOMMU?
> > > > >
> > > >
> > > > > That's right. I thought you were asking about command lines for L1/L2
> > > > > guest :(.
> > > > > I think I caused the confusion, and as I said above, I didn't mean to
> > > > > talk about the performance of L1 guest with/without vIOMMU.
> > > > We can move on!
> > >
> > > I see. Sure! :-)
> > >
> > > [...]
> > >
> > > > >
> > > > > > Then, I *think* the assertion you encountered above would fail only if
> > > > > > prev == 0 here, but I'm still not quite sure why that was happening.
> > > > > > Btw, could you paste me your "lspci -vvv -s 00:03.0" result in your L1
> > > > > > guest?
> > > > >
> > > >
> > > > Sure. This is from my L1 guest.
> > >
> > > Hmm... I think I found the problem...
> > >
> > > >
> > > > root@guest0:~# lspci -vvv -s 00:03.0
> > > > 00:03.0 Network controller: Mellanox Technologies MT27500 Family
> > > > [ConnectX-3]
> > > > Subsystem: Mellanox Technologies Device 0050
> > > > Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> > > > Stepping- SERR+ FastB2B- DisINTx+
> > > > > Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
> > > > > Latency: 0, Cache Line Size: 64 bytes
> > > > Interrupt: pin A routed to IRQ 23
> > > > Region 0: Memory at fe90 (64-bit, non-prefetchable) [size=1M]
> > > > Region 2: Memory at fe00 (64-bit, prefetchable) [size=8M]
> > > > Expansion ROM at fea0 [disabled] [size=1M]
> > > > Capabilities: [40] Power Management version 3
> > > > > Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
> > > > Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> > > > Capabilities: [48] Vital Product Data
> > > > Product Name: CX354A - ConnectX-3 QSFP
> > > > Read-only fields:
> > > > [PN] Part number: MCX354A-FCBT
> > > > [EC] Engineering changes: A4
> > > > [SN] Serial number: MT1346X00791
> > > > [V0] Vendor specific: PCIe Gen3 x8
> > > > [RV] Reserved: checksum good, 0 byte(s) reserved
> > > > Read/write fields:
> > > > [V1] Vendor specific: N/A
> > > > [YA] Asset tag: N/A
> > > > [RW] Read-write area: 105 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 253 byte(s) free
> > > > [RW] Read-write area: 252 byte(s) free
> > > > End
> > > > Capabilities: [9c] MSI-X: Enable+ Count=128 Masked-
> > > > Vector table: BAR=0 offset=0007c000
> > > > PBA: BAR=0 offset=0007d000
> > > > Capabilities: [60] Express (v2) Root Complex Integrated Endpoint,
> MSI 00
> > > > DevCap: MaxPayload 256 bytes, PhantFunc 0
> > > > ExtTag- RBE+
> > > > DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+
> > > > RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
> > > > MaxPayload 256 bytes, MaxReadReq 4096 bytes
> > > > DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
> > > > DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not
> > > > Supported
> > > > DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-, LTR-, OBFF
> > > Disabled
> > > > Capabilities: [100 v0] #00
> > >
> > > Here we have the head of ecap capability as cap_id==0, then when we
> > > boot the

Re: [Qemu-devel] [PATCH v15 16/25] qmp: add persistent flag to block-dirty-bitmap-add

2017-02-15 Thread John Snow


On 02/15/2017 05:10 AM, Vladimir Sementsov-Ogievskiy wrote:
> Add optional 'persistent' flag to qmp command block-dirty-bitmap-add.
> Default is false.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Signed-off-by: Denis V. Lunev 
> Reviewed-by: Max Reitz 

Reviewed-by: John Snow 

> ---
>  blockdev.c   | 18 +-
>  qapi/block-core.json |  8 +++-
>  2 files changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index 245e1e1..40605fa 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -1967,6 +1967,7 @@ static void block_dirty_bitmap_add_prepare(BlkActionState *common,
>  /* AIO context taken and released within qmp_block_dirty_bitmap_add */
>  qmp_block_dirty_bitmap_add(action->node, action->name,
> action->has_granularity, action->granularity,
> +   action->has_persistent, action->persistent,
> &local_err);
>  
>  if (!local_err) {
> @@ -2696,10 +2697,12 @@ out:
>  
>  void qmp_block_dirty_bitmap_add(const char *node, const char *name,
>  bool has_granularity, uint32_t granularity,
> +bool has_persistent, bool persistent,
>  Error **errp)
>  {
>  AioContext *aio_context;
>  BlockDriverState *bs;
> +BdrvDirtyBitmap *bitmap;
>  
>  if (!name || name[0] == '\0') {
>  error_setg(errp, "Bitmap name cannot be empty");
> @@ -2725,7 +2728,20 @@ void qmp_block_dirty_bitmap_add(const char *node, const char *name,
>  granularity = bdrv_get_default_bitmap_granularity(bs);
>  }
>  
> -bdrv_create_dirty_bitmap(bs, granularity, name, errp);
> +if (!has_persistent) {
> +persistent = false;
> +}
> +
> +if (persistent &&
> +!bdrv_can_store_new_dirty_bitmap(bs, name, granularity, errp))
> +{
> +goto out;
> +}
> +
> +bitmap = bdrv_create_dirty_bitmap(bs, granularity, name, errp);
> +if (bitmap != NULL) {
> +bdrv_dirty_bitmap_set_persistance(bitmap, persistent);
> +}
>  
>   out:
>  aio_context_release(aio_context);
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 932f5bb..535df20 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1559,10 +1559,16 @@
>  # @granularity: #optional the bitmap granularity, default is 64k for
>  #   block-dirty-bitmap-add
>  #
> +# @persistent: #optional the bitmap is persistent, i.e. it will be saved to the
> +#  corresponding block device image file on its close. For now only
> +#  Qcow2 disks support persistent bitmaps. Default is false.
> +#  (Since 2.9)
> +#
>  # Since: 2.4
>  ##
>  { 'struct': 'BlockDirtyBitmapAdd',
> -  'data': { 'node': 'str', 'name': 'str', '*granularity': 'uint32' } }
> +  'data': { 'node': 'str', 'name': 'str', '*granularity': 'uint32',
> +'*persistent': 'bool' } }
>  
>  ##
>  # @block-dirty-bitmap-add:
> 




Re: [Qemu-devel] [PATCH v15 15/25] qcow2: add .bdrv_can_store_new_dirty_bitmap

2017-02-15 Thread John Snow


On 02/15/2017 05:10 AM, Vladimir Sementsov-Ogievskiy wrote:
> Realize .bdrv_can_store_new_dirty_bitmap interface.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 

Thanks,

Reviewed-by: John Snow 

> ---
>  block/qcow2-bitmap.c | 52 
> 
>  block/qcow2.c|  1 +
>  block/qcow2.h|  4 
>  3 files changed, 57 insertions(+)
> 
> diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
> index b177a95..1ee89e4 100644
> --- a/block/qcow2-bitmap.c
> +++ b/block/qcow2-bitmap.c
> @@ -1182,3 +1182,55 @@ fail:
>  
>  bitmap_list_free(bm_list);
>  }
> +
> +bool qcow2_can_store_new_dirty_bitmap(BlockDriverState *bs,
> +  const char *name,
> +  uint32_t granularity,
> +  Error **errp)
> +{
> +BDRVQcow2State *s = bs->opaque;
> +bool found;
> +Qcow2BitmapList *bm_list;
> +
> +if (check_constraints_on_bitmap(bs, name, granularity) != 0) {
> +error_setg(errp, "The constraints are not satisfied");
> +goto fail;
> +}
> +
> +if (s->nb_bitmaps == 0) {
> +return true;
> +}
> +
> +if (s->nb_bitmaps >= QCOW2_MAX_BITMAPS) {
> +error_setg(errp,
> +   "Maximum number of persistent bitmaps is already reached");
> +goto fail;
> +}
> +
> +if (s->bitmap_directory_size + calc_dir_entry_size(strlen(name), 0) >
> +QCOW2_MAX_BITMAP_DIRECTORY_SIZE)
> +{
> +error_setg(errp, "No enough space in the bitmap directory");
> +goto fail;
> +}
> +
> +bm_list = bitmap_list_load(bs, s->bitmap_directory_offset,
> +   s->bitmap_directory_size, errp);
> +if (bm_list == NULL) {
> +goto fail;
> +}
> +
> +found = find_bitmap_by_name(bm_list, name);
> +bitmap_list_free(bm_list);
> +if (found) {
> +error_setg(errp, "Bitmap with the same name is already stored");
> +goto fail;
> +}
> +
> +return true;
> +
> +fail:
> +error_prepend(errp, "Can't make bitmap '%s' persistent in '%s': ",
> +  name, bdrv_get_device_or_node_name(bs));
> +return false;
> +}
> diff --git a/block/qcow2.c b/block/qcow2.c
> index d0e41bf..6e1fe53 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -3541,6 +3541,7 @@ BlockDriver bdrv_qcow2 = {
>  
>  .bdrv_load_autoloading_dirty_bitmaps = qcow2_load_autoloading_dirty_bitmaps,
>  .bdrv_store_persistent_dirty_bitmaps = qcow2_store_persistent_dirty_bitmaps,
> +.bdrv_can_store_new_dirty_bitmap = qcow2_can_store_new_dirty_bitmap,
>  };
>  
>  static void bdrv_qcow2_init(void)
> diff --git a/block/qcow2.h b/block/qcow2.h
> index d9a7643..749710d 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -616,5 +616,9 @@ void qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table);
>  /* qcow2-bitmap.c functions */
>  void qcow2_load_autoloading_dirty_bitmaps(BlockDriverState *bs, Error **errp);
>  void qcow2_store_persistent_dirty_bitmaps(BlockDriverState *bs, Error **errp);
> +bool qcow2_can_store_new_dirty_bitmap(BlockDriverState *bs,
> +  const char *name,
> +  uint32_t granularity,
> +  Error **errp);
>  
>  #endif
> 



Re: [Qemu-devel] [PATCH v6 6/7] tests: Move reusable ACPI macros into a new header file

2017-02-15 Thread Ben Warren

> On Feb 15, 2017, at 2:56 PM, Eric Blake  wrote:
> 
> On 02/15/2017 03:58 PM, Ben Warren wrote:
> 
>>> 
>>> ---
>>> tests/acpi-utils.h   | 75 
>>> 
>>> tests/bios-tables-test.c | 72 +-
>>> 2 files changed, 76 insertions(+), 71 deletions(-)
> 
> 
>>> No copyright blurb? Also, does MAINTAINERS need an update to cover the
>>> new file?
>>> 
>> Sure, I didn’t realize the header files all have copyright headers.  As for 
>> MAINTAINERS, do you mean I should add a device entry for vmgenid?
> 
> In this patch, you're just refactoring to a new tests/acpi-utils.h, so
> I'd normally suggest adding it to the blurb that owns
> tests/bios-tables-test.c - but as a pre-existing problem, that also has
> no listed maintainer.  We're trying to ensure that all newly added files
> have something listed in MAINTAINERS, even if it is in a misc section
> that only emails the list, although it's harder to say what maintainer
> to use for existing files that you are merely touching, and failure to
> list a maintainer is not (yet) a hard failure (although there have been
> patches proposed to scripts/checkpatch.pl to tighten the rules).
> 
> A new section for vmgenid might not be a bad idea, especially if it
> covers more files than just the one addition I noticed in this patch.
> 
Thank you for clarifying.  I’ll take care of it.
> -- 
> Eric Blake   eblake redhat com+1-919-301-3266
> Libvirt virtualization library http://libvirt.org
> 
—Ben





Re: [Qemu-devel] [PATCH v3 05/16] target-m68k: use floatx80 internally

2017-02-15 Thread Richard Henderson

On 02/07/2017 11:59 AM, Laurent Vivier wrote:

+uint32_t fp0h;
+uint64_t fp0l;
+uint32_t fp1h;
+uint64_t fp1l;


I'm not especially keen on these temporaries.

Wouldn't it be better to pass pointers to FPReg to the helpers, so that e.g.

  fadd.x  fp0, fp1

puts the result in fp1 directly, without having to copy from this FP0 location?
I can see that you would need a proper FPReg temporary to deal with e.g.

  fadd.d  a0@, fp1

such that the memory source gets converted before being used as input to the 
addition.



+static float32 FP0_to_float32(CPUM68KState *env)
 {
+return *(float32 *)&env->fp0h;
 }

...

+static void float32_to_FP0(CPUM68KState *env, float32 val)
 {
+env->fp0h = *(uint32_t *)&val;
 }


I don't like this type-punning.  I also don't see what good it does to store 
these truncated values in portions of FP0, when you could simply pass or return 
them by value from the relevant helpers, e.g.



+void HELPER(reds32_FP0)(CPUM68KState *env)
 {
+int32_t res;
+
+res = floatx80_to_int32(FP0_to_floatx80(env), &env->fp_status);
+
+int32_to_FP0(env, res);
 }


could easily be

int32_t HELPER(reds32)(CPUM68KState *env, FPReg *val)
{
  return floatx80_to_int32(*val, &env->fp_status);
}


r~
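
On the type-punning point above, a general illustration (not code from the patch, and not QEMU's softfloat API): the usual way to reinterpret a bit pattern in C without casting pointers between unrelated types is to copy the representation with memcpy. The function names below are made up for the example, and it assumes that the host float is an IEEE 754 binary32 value. Compilers typically reduce these copies to a plain register move.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw 32-bit pattern of a host float, without a pointer cast. */
static uint32_t float_to_u32(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof(u));   /* copies the object representation */
    return u;
}

/* Build a host float from a raw 32-bit pattern, again via memcpy. */
static float u32_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof(f));
    return f;
}
```

Passing and returning the values directly, as suggested in the review, avoids the need for any reinterpretation at the call site.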



Re: [Qemu-devel] [PATCH v6 6/7] tests: Move reusable ACPI macros into a new header file

2017-02-15 Thread Eric Blake
On 02/15/2017 03:58 PM, Ben Warren wrote:

>> 
>> ---
>> tests/acpi-utils.h   | 75 
>> 
>> tests/bios-tables-test.c | 72 +-
>> 2 files changed, 76 insertions(+), 71 deletions(-)


>> No copyright blurb? Also, does MAINTAINERS need an update to cover the
>> new file?
>>
> Sure, I didn’t realize the header files all have copyright headers.  As for 
> MAINTAINERS, do you mean I should add a device entry for vmgenid?

In this patch, you're just refactoring to a new tests/acpi-utils.h, so
I'd normally suggest adding it to the blurb that owns
tests/bios-tables-test.c - but as a pre-existing problem, that also has
no listed maintainer.  We're trying to ensure that all newly added files
have something listed in MAINTAINERS, even if it is in a misc section
that only emails the list, although it's harder to say what maintainer
to use for existing files that you are merely touching, and failure to
list a maintainer is not (yet) a hard failure (although there have been
patches proposed to scripts/checkpatch.pl to tighten the rules).

A new section for vmgenid might not be a bad idea, especially if it
covers more files than just the one addition I noticed in this patch.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-devel] iommu emulation

2017-02-15 Thread Alex Williamson
On Wed, 15 Feb 2017 17:05:35 -0500
Jintack Lim  wrote:

> On Tue, Feb 14, 2017 at 9:52 PM, Peter Xu  wrote:
> 
> > On Tue, Feb 14, 2017 at 07:50:39AM -0500, Jintack Lim wrote:
> >
> > [...]
> >  
> > > > > >> > I misunderstood what you said?  
> > > > > >
> > > > > > > I failed to understand why a vIOMMU could help boost performance. :(
> > > > > > Could you provide your command line here so that I can try to
> > > > > > reproduce?  
> > > > >
> > > > > Sure. This is the command line to launch L1 VM
> > > > >
> > > > > qemu-system-x86_64 -M q35,accel=kvm,kernel-irqchip=split \
> > > > > -m 12G -device intel-iommu,intremap=on,eim=off,caching-mode=on \
> > > > > -drive file=/mydata/guest0.img,format=raw --nographic -cpu host \
> > > > > -smp 4,sockets=4,cores=1,threads=1 \
> > > > > -device vfio-pci,host=08:00.0,id=net0
> > > > >
> > > > > And this is for L2 VM.
> > > > >
> > > > > ./qemu-system-x86_64 -M q35,accel=kvm \
> > > > > -m 8G \
> > > > > -drive file=/vm/l2guest.img,format=raw --nographic -cpu host \
> > > > > -device vfio-pci,host=00:03.0,id=net0  
> > > >
> > > > ... here looks like these are command lines for L1/L2 guest, rather
> > > > than L1 guest with/without vIOMMU?
> > > >  
> > >
> > > > That's right. I thought you were asking about command lines for L1/L2
> > > > guest :(.
> > > > I think I caused the confusion, and as I said above, I didn't mean to talk
> > > > about the performance of L1 guest with/without vIOMMU.
> > > We can move on!  
> >
> > I see. Sure! :-)
> >
> > [...]
> >  
> > > >
> > > > > Then, I *think* the assertion you encountered above would fail only if
> > > > > prev == 0 here, but I'm still not quite sure why that was happening.
> > > > Btw, could you paste me your "lspci -vvv -s 00:03.0" result in your L1
> > > > guest?
> > > >  
> > >
> > > Sure. This is from my L1 guest.  
> >
> > Hmm... I think I found the problem...
> >  
> > >
> > > root@guest0:~# lspci -vvv -s 00:03.0
> > > 00:03.0 Network controller: Mellanox Technologies MT27500 Family
> > > [ConnectX-3]
> > > Subsystem: Mellanox Technologies Device 0050
> > > Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> > > Stepping- SERR+ FastB2B- DisINTx+
> > > > Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
> > > > Latency: 0, Cache Line Size: 64 bytes
> > > Interrupt: pin A routed to IRQ 23
> > > Region 0: Memory at fe90 (64-bit, non-prefetchable) [size=1M]
> > > Region 2: Memory at fe00 (64-bit, prefetchable) [size=8M]
> > > Expansion ROM at fea0 [disabled] [size=1M]
> > > Capabilities: [40] Power Management version 3
> > > > Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
> > > Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> > > Capabilities: [48] Vital Product Data
> > > Product Name: CX354A - ConnectX-3 QSFP
> > > Read-only fields:
> > > [PN] Part number: MCX354A-FCBT
> > > [EC] Engineering changes: A4
> > > [SN] Serial number: MT1346X00791
> > > [V0] Vendor specific: PCIe Gen3 x8
> > > [RV] Reserved: checksum good, 0 byte(s) reserved
> > > Read/write fields:
> > > [V1] Vendor specific: N/A
> > > [YA] Asset tag: N/A
> > > [RW] Read-write area: 105 byte(s) free
> > > [RW] Read-write area: 253 byte(s) free
> > > [RW] Read-write area: 253 byte(s) free
> > > [RW] Read-write area: 253 byte(s) free
> > > [RW] Read-write area: 253 byte(s) free
> > > [RW] Read-write area: 253 byte(s) free
> > > [RW] Read-write area: 253 byte(s) free
> > > [RW] Read-write area: 253 byte(s) free
> > > [RW] Read-write area: 253 byte(s) free
> > > [RW] Read-write area: 253 byte(s) free
> > > [RW] Read-write area: 253 byte(s) free
> > > [RW] Read-write area: 253 byte(s) free
> > > [RW] Read-write area: 253 byte(s) free
> > > [RW] Read-write area: 253 byte(s) free
> > > [RW] Read-write area: 253 byte(s) free
> > > [RW] Read-write area: 252 byte(s) free
> > > End
> > > Capabilities: [9c] MSI-X: Enable+ Count=128 Masked-
> > > Vector table: BAR=0 offset=0007c000
> > > PBA: BAR=0 offset=0007d000
> > > Capabilities: [60] Express (v2) Root Complex Integrated Endpoint, MSI 00
> > > DevCap: MaxPayload 256 bytes, PhantFunc 0
> > > ExtTag- RBE+
> > > DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+
> > > RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
> > > MaxPayload 256 bytes, MaxReadReq 4096 bytes
> > > DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
> > > DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not
> > > Supported
> > > > DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-, LTR-, OBFF Disabled
> > > Capabilities: [100 v0] #00  
> >
> > Here the head of the ecap list has cap_id==0. When we boot the L2
> > guest with the same device, we'll first copy this cap_id==0 cap; then,
> > when adding the 2nd ecap, we'll probably hit a problem, since
> > pcie_find_capability_list() will think there is no cap at all
> > (cap_id==0 is skipped).
> >
> > Do you want to try thi

Re: [Qemu-devel] [RFC QEMU PATCH 1/8] nvdimm: do not initialize label_data if label_size is zero

2017-02-15 Thread Konrad Rzeszutek Wilk
On Mon, Oct 10, 2016 at 08:34:16AM +0800, Haozhong Zhang wrote:
> When memory-backend-xen is used, the label_data pointer can not be got
> via memory_region_get_ram_ptr(). We will use other functions to get

Could you explain why it cannot be retrieved that way?

> label_data once we introduce NVDIMM label support to Xen.

Is there a particular patch in this series that does that?
You may want to enumerate which one it is.

> 
> Signed-off-by: Haozhong Zhang 
> ---
> Cc: Xiao Guangrong 
> Cc: "Michael S. Tsirkin" 
> Cc: Igor Mammedov 
> ---
>  hw/mem/nvdimm.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/mem/nvdimm.c b/hw/mem/nvdimm.c
> index 7895805..d25993b 100644
> --- a/hw/mem/nvdimm.c
> +++ b/hw/mem/nvdimm.c
> @@ -87,7 +87,9 @@ static void nvdimm_realize(PCDIMMDevice *dimm, Error **errp)
>  align = memory_region_get_alignment(mr);
>  
>  pmem_size = size - nvdimm->label_size;
> -nvdimm->label_data = memory_region_get_ram_ptr(mr) + pmem_size;
> +if (nvdimm->label_size) {
> +nvdimm->label_data = memory_region_get_ram_ptr(mr) + pmem_size;
> +}
>  pmem_size = QEMU_ALIGN_DOWN(pmem_size, align);
>  
>  if (size <= nvdimm->label_size || !pmem_size) {
> -- 
> 2.10.1
> 



Re: [Qemu-devel] [PATCH] target-ppc: Add quad precision muladd instructions

2017-02-15 Thread Richard Henderson

On 02/15/2017 05:37 PM, Bharata B Rao wrote:

+ *
+ * TODO: When float128_muladd() becomes available, switch this
+ * implementation to use that instead of separate float128_mul()
+ * followed by float128_add().


Let's just do that, rather than add something that can't pass tests.

You should be able to copy float64_muladd and, for the most part, s/128/256/
and s/64/128/.  The other magic numbers, like the implicit bit and the
exponent bias, you can get from float128_mul.



r~
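
To illustrate why a float128_mul() followed by a float128_add() cannot stand in for a real float128_muladd(), here is a small sketch using host floats instead of float128 (and not QEMU's softfloat). The separate version rounds twice; the second version keeps the exact product before adding. The function names are made up, and the "fused style" helper is only an emulation: it relies on the product of two 24-bit floats being exact in a double, which holds for the inputs discussed here but is not a general replacement for the C99 fmaf() function from <math.h>. IEEE 754 arithmetic with round-to-nearest-even is assumed.

```c
/* Two roundings: the product is rounded to float before the add. */
static float mul_then_add(float a, float b, float c)
{
    volatile float p = a * b;   /* volatile forces the rounded product to be stored */
    return p + c;
}

/*
 * Fused-style emulation: the product of two floats (24 + 24 significand
 * bits) is exact in a double, so only the final conversion rounds here.
 */
static float fused_style(float a, float b, float c)
{
    double exact_product = (double)a * (double)b;
    return (float)(exact_product + (double)c);
}
```

With a = 1 + 2^-13, b = 1 - 2^-13 and c = -1, the exact product is 1 - 2^-26. Rounded to float that becomes 1.0, so the separate version yields 0, while the fused-style version keeps the -2^-26 that the intermediate rounding threw away.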



Re: [Qemu-devel] iommu emulation

2017-02-15 Thread Jintack Lim
On Tue, Feb 14, 2017 at 9:52 PM, Peter Xu  wrote:

> On Tue, Feb 14, 2017 at 07:50:39AM -0500, Jintack Lim wrote:
>
> [...]
>
> > > > >> > I misunderstood what you said?
> > > > >
> > > > > > I failed to understand why a vIOMMU could help boost performance. :(
> > > > > Could you provide your command line here so that I can try to
> > > > > reproduce?
> > > >
> > > > Sure. This is the command line to launch L1 VM
> > > >
> > > > qemu-system-x86_64 -M q35,accel=kvm,kernel-irqchip=split \
> > > > -m 12G -device intel-iommu,intremap=on,eim=off,caching-mode=on \
> > > > -drive file=/mydata/guest0.img,format=raw --nographic -cpu host \
> > > > -smp 4,sockets=4,cores=1,threads=1 \
> > > > -device vfio-pci,host=08:00.0,id=net0
> > > >
> > > > And this is for L2 VM.
> > > >
> > > > ./qemu-system-x86_64 -M q35,accel=kvm \
> > > > -m 8G \
> > > > -drive file=/vm/l2guest.img,format=raw --nographic -cpu host \
> > > > -device vfio-pci,host=00:03.0,id=net0
> > >
> > > ... here looks like these are command lines for L1/L2 guest, rather
> > > than L1 guest with/without vIOMMU?
> > >
> >
> > That's right. I thought you were asking about command lines for L1/L2
> > guest :(.
> > I think I caused the confusion, and as I said above, I didn't mean to talk
> > about the performance of L1 guest with/without vIOMMU.
> > We can move on!
>
> I see. Sure! :-)
>
> [...]
>
> > >
> > > Then, I *think* the assertion you encountered above would fail only if
> > > prev == 0 here, but I'm still not quite sure why that was happening.
> > > Btw, could you paste me your "lspci -vvv -s 00:03.0" result in your L1
> > > guest?
> > >
> >
> > Sure. This is from my L1 guest.
>
> Hmm... I think I found the problem...
>
> >
> > root@guest0:~# lspci -vvv -s 00:03.0
> > 00:03.0 Network controller: Mellanox Technologies MT27500 Family
> > [ConnectX-3]
> > Subsystem: Mellanox Technologies Device 0050
> > Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> > Stepping- SERR+ FastB2B- DisINTx+
> > Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
> > Latency: 0, Cache Line Size: 64 bytes
> > Interrupt: pin A routed to IRQ 23
> > Region 0: Memory at fe90 (64-bit, non-prefetchable) [size=1M]
> > Region 2: Memory at fe00 (64-bit, prefetchable) [size=8M]
> > Expansion ROM at fea0 [disabled] [size=1M]
> > Capabilities: [40] Power Management version 3
> > Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
> > Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> > Capabilities: [48] Vital Product Data
> > Product Name: CX354A - ConnectX-3 QSFP
> > Read-only fields:
> > [PN] Part number: MCX354A-FCBT
> > [EC] Engineering changes: A4
> > [SN] Serial number: MT1346X00791
> > [V0] Vendor specific: PCIe Gen3 x8
> > [RV] Reserved: checksum good, 0 byte(s) reserved
> > Read/write fields:
> > [V1] Vendor specific: N/A
> > [YA] Asset tag: N/A
> > [RW] Read-write area: 105 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 253 byte(s) free
> > [RW] Read-write area: 252 byte(s) free
> > End
> > Capabilities: [9c] MSI-X: Enable+ Count=128 Masked-
> > Vector table: BAR=0 offset=0007c000
> > PBA: BAR=0 offset=0007d000
> > Capabilities: [60] Express (v2) Root Complex Integrated Endpoint, MSI 00
> > DevCap: MaxPayload 256 bytes, PhantFunc 0
> > ExtTag- RBE+
> > DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+
> > RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
> > MaxPayload 256 bytes, MaxReadReq 4096 bytes
> > DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
> > DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not
> > Supported
> > DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-, LTR-, OBFF Disabled
> > Capabilities: [100 v0] #00
>
> Here the head of the ecap list has cap_id==0. When we boot the L2
> guest with the same device, we'll first copy this cap_id==0 cap; then,
> when adding the 2nd ecap, we'll probably hit a problem, since
> pcie_find_capability_list() will think there is no cap at all
> (cap_id==0 is skipped).
>
> Do you want to try this "hacky patch" to see whether it works for you?
>

Thanks for following this up!

I just tried this, and I got a different message this time.

qemu-system-x86_64: vfio: Cannot reset device :00:03.0, no available
reset mechanism.
qemu-system-x86_64: vfio: Cannot reset device :00:03.0, no available
reset mechanism.


Thanks,
Jintack



Re: [Qemu-devel] [PATCH v6 6/7] tests: Move reusable ACPI macros into a new header file

2017-02-15 Thread Ben Warren

> On Feb 15, 2017, at 1:35 PM, Eric Blake  wrote:
> 
> On 02/15/2017 12:15 AM, b...@skyportsystems.com wrote:
>> From: Ben Warren 
>> 
>> Also usable by upcoming VM Generation ID tests
>> 
>> Signed-off-by: Ben Warren 
>> ---
>> tests/acpi-utils.h   | 75 
>> 
>> tests/bios-tables-test.c | 72 +-
>> 2 files changed, 76 insertions(+), 71 deletions(-)
>> create mode 100644 tests/acpi-utils.h
>> 
>> diff --git a/tests/acpi-utils.h b/tests/acpi-utils.h
>> new file mode 100644
>> index 000..d5e5eff
>> --- /dev/null
>> +++ b/tests/acpi-utils.h
>> @@ -0,0 +1,75 @@
>> +#ifndef TEST_ACPI_UTILS_H
>> +#define TEST_ACPI_UTILS_H
> 
> No copyright blurb? Also, does MAINTAINERS need an update to cover the
> new file?
> 
Sure, I didn’t realize the header files all have copyright headers.  As for 
MAINTAINERS, do you mean I should add a device entry for vmgenid?

thanks,
Ben
> -- 
> Eric Blake   eblake redhat com+1-919-301-3266
> Libvirt virtualization library http://libvirt.org 




Re: [Qemu-devel] [PATCH v12 12/24] tcg: handle EXCP_ATOMIC exception for system emulation

2017-02-15 Thread Richard Henderson

On 02/14/2017 09:50 PM, Alex Bennée wrote:


Richard Henderson  writes:


On 02/13/2017 11:10 PM, Alex Bennée wrote:

@@ -239,9 +240,16 @@ static void cpu_exec_step(CPUState *cpu)



+} else if (r == EXCP_ATOMIC) {
+qemu_mutex_unlock_iothread();
+cpu_exec_step_atomic(cpu);
+qemu_mutex_lock_iothread();

...

+case EXCP_ATOMIC:
+qemu_mutex_unlock_iothread();
+cpu_exec_step_atomic(cpu);
+qemu_mutex_lock_iothread();



I just noticed this, but if you have to do a v13, it might be best to
move these locks inside cpu_exec_step_atomic, as with tcg_cpu_exec.
Otherwise leave it for later.


Will that work given cpu_exec_step_atomic() is common between linux-user
and system emulation?


Ug.  No, you're right.


r~



Re: [Qemu-devel] [PATCH v6 6/7] tests: Move reusable ACPI macros into a new header file

2017-02-15 Thread Eric Blake
On 02/15/2017 12:15 AM, b...@skyportsystems.com wrote:
> From: Ben Warren 
> 
> Also usable by upcoming VM Generation ID tests
> 
> Signed-off-by: Ben Warren 
> ---
>  tests/acpi-utils.h   | 75 
> 
>  tests/bios-tables-test.c | 72 +-
>  2 files changed, 76 insertions(+), 71 deletions(-)
>  create mode 100644 tests/acpi-utils.h
> 
> diff --git a/tests/acpi-utils.h b/tests/acpi-utils.h
> new file mode 100644
> index 000..d5e5eff
> --- /dev/null
> +++ b/tests/acpi-utils.h
> @@ -0,0 +1,75 @@
> +#ifndef TEST_ACPI_UTILS_H
> +#define TEST_ACPI_UTILS_H

No copyright blurb? Also, does MAINTAINERS need an update to cover the
new file?

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: [Qemu-devel] [PATCH] pci/pcie: don't assume cap id 0 is reserved

2017-02-15 Thread Alex Williamson
On Wed, 15 Feb 2017 22:49:47 +0200
"Michael S. Tsirkin"  wrote:

> VFIO actually wants to create a capability with ID == 0.
> This is done to make guest drivers skip the given capability.
> pcie_add_capability then trips up on this capability
> when looking for end of capability list.
> 
> To support this use-case, it's easy enough to switch to
> e.g. 0x for these comparisons - we can be sure
> it will never match a 16-bit capability ID.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---


Reviewed-by: Alex Williamson 


>  hw/pci/pcie.c | 11 +++
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> index cbd4bb4..f4dd177 100644
> --- a/hw/pci/pcie.c
> +++ b/hw/pci/pcie.c
> @@ -610,7 +610,8 @@ bool pcie_cap_is_arifwd_enabled(const PCIDevice *dev)
>   * uint16_t ext_cap_size
>   */
>  
> -static uint16_t pcie_find_capability_list(PCIDevice *dev, uint16_t cap_id,
> +/* Passing a cap_id value > 0x will return 0 and put end of list in prev */
> +static uint16_t pcie_find_capability_list(PCIDevice *dev, uint32_t cap_id,
>uint16_t *prev_p)
>  {
>  uint16_t prev = 0;
> @@ -679,9 +680,11 @@ void pcie_add_capability(PCIDevice *dev,
>  } else {
>  uint16_t prev;
>  
> -/* 0 is reserved cap id. use internally to find the last capability
> -   in the linked list */
> -next = pcie_find_capability_list(dev, 0, &prev);
> +/*
> + * 0x is not a valid cap id (it's a 16 bit field). use
> + * internally to find the last capability in the linked list.
> + */
> +next = pcie_find_capability_list(dev, 0x, &prev);
>  
>  assert(prev >= PCI_CONFIG_SPACE_SIZE);
>  assert(next == 0);
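
For readers following along, here is a standalone sketch of the list walk discussed in this patch. It is only loosely modeled on the description above and is not the QEMU function: find_ext_cap(), the config-space helpers, and the CAP_ID_NONE sentinel are all hypothetical names. CAP_ID_NONE is simply some value that does not fit in 16 bits, chosen for the example. The header layout (ID in bits 15:0, version in 19:16, next offset in 31:20, list starting at offset 0x100) follows the PCIe extended capability format.

```c
#include <stdint.h>

#define CFG_SIZE      4096      /* PCIe extended configuration space */
#define EXT_CAP_START 0x100     /* first extended capability header */
#define CAP_ID_NONE   0x10000u  /* hypothetical sentinel: never equals a 16-bit ID */
#define MAX_HEADERS   ((CFG_SIZE - EXT_CAP_START) / 4)

/* Little-endian 32-bit accessors for a config-space byte array. */
static uint32_t cfg_read32(const uint8_t *cfg, uint16_t off)
{
    return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
           ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

static void cfg_write32(uint8_t *cfg, uint16_t off, uint32_t val)
{
    cfg[off] = (uint8_t)val;
    cfg[off + 1] = (uint8_t)(val >> 8);
    cfg[off + 2] = (uint8_t)(val >> 16);
    cfg[off + 3] = (uint8_t)(val >> 24);
}

/* Pack an extended capability header from ID, version and next offset. */
static uint32_t ext_cap_header(uint16_t id, uint8_t ver, uint16_t next)
{
    return (uint32_t)id | ((uint32_t)(ver & 0xf) << 16) |
           ((uint32_t)(next & 0xffc) << 20);
}

/*
 * Return the offset of the first capability whose ID equals cap_id, or 0
 * if there is none. *prev_p receives the offset of the capability visited
 * just before that position, i.e. the last one in the list on a miss.
 */
static uint16_t find_ext_cap(const uint8_t *cfg, uint32_t cap_id,
                             uint16_t *prev_p)
{
    uint16_t prev = 0;
    uint16_t pos = EXT_CAP_START;

    if (cfg_read32(cfg, EXT_CAP_START) == 0) {
        pos = 0;                          /* all-zero header: empty list */
    }
    /* The iteration bound keeps a malformed, looping list from hanging us. */
    for (int i = 0; i < MAX_HEADERS && pos != 0; i++) {
        uint32_t header = cfg_read32(cfg, pos);

        if ((header & 0xffffu) == cap_id) {
            *prev_p = prev;
            return pos;
        }
        prev = pos;
        pos = (uint16_t)((header >> 20) & 0xffcu);
    }
    *prev_p = prev;
    return 0;
}
```

With a list whose head has ID 0, as in the lspci output from the iommu thread, searching for ID 0 stops at the head with prev == 0, which is the situation the assert on prev trips over. Searching for the out-of-range sentinel walks to the real end of the list instead.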




[Qemu-devel] [PATCH v2] target/sparc: Restore ldstub of odd asis

2017-02-15 Thread Richard Henderson
Fixes the booting of ss20 roms.

Cc: qemu-sta...@nongnu.org
Reported-by: Michael Russo 
Tested-by: Mark Cave-Ayland 
Signed-off-by: Richard Henderson 
---
v2: Update tags.
---
 target/sparc/translate.c | 27 +--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index 655060c..aa6734d 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -2448,8 +2448,31 @@ static void gen_ldstub_asi(DisasContext *dc, TCGv dst, TCGv addr, int insn)
 gen_ldstub(dc, dst, addr, da.mem_idx);
 break;
 default:
-/* ??? Should be DAE_invalid_asi.  */
-gen_exception(dc, TT_DATA_ACCESS);
+/* ??? In theory, this should be raise DAE_invalid_asi.
+   But the SS-20 roms do ldstuba [%l0] #ASI_M_CTL, %o1.  */
+if (parallel_cpus) {
+gen_helper_exit_atomic(cpu_env);
+} else {
+TCGv_i32 r_asi = tcg_const_i32(da.asi);
+TCGv_i32 r_mop = tcg_const_i32(MO_UB);
+TCGv_i64 s64, t64;
+
+save_state(dc);
+t64 = tcg_temp_new_i64();
+gen_helper_ld_asi(t64, cpu_env, addr, r_asi, r_mop);
+
+s64 = tcg_const_i64(0xff);
+gen_helper_st_asi(cpu_env, addr, s64, r_asi, r_mop);
+tcg_temp_free_i64(s64);
+tcg_temp_free_i32(r_mop);
+tcg_temp_free_i32(r_asi);
+
+tcg_gen_trunc_i64_tl(dst, t64);
+tcg_temp_free_i64(t64);
+
+/* End the TB.  */
+dc->npc = DYNAMIC_PC;
+}
 break;
 }
 }
-- 
2.9.3
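
For readers unfamiliar with the instruction: LDSTUB loads one byte and then sets that byte to all ones. The non-parallel path in the patch does the same thing with a load helper followed by a store of 0xff. Below is a minimal, non-atomic model of that behaviour in plain C. The names ldstub_model and try_lock are hypothetical and are not part of QEMU.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Non-atomic model of SPARC LDSTUB: return the old byte and leave
 * 0xff behind. Real hardware does both steps as one atomic operation.
 */
static uint8_t ldstub_model(uint8_t *addr)
{
    uint8_t old = *addr;

    *addr = 0xff;
    return old;
}

/* Typical use: a byte lock is acquired if the old value was zero. */
static bool try_lock(uint8_t *lock)
{
    return ldstub_model(lock) == 0;
}
```

The first try_lock() on a zeroed byte succeeds and leaves 0xff in it; any later call fails until the byte is cleared again.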




Re: [Qemu-devel] [PATCH v6 1/7] linker-loader: Add new 'write pointer' command

2017-02-15 Thread Michael S. Tsirkin
On Wed, Feb 15, 2017 at 10:44:05AM -0800, Ben Warren wrote:
> 
> On Feb 15, 2017, at 10:35 AM, Igor Mammedov  wrote:
> 
> On Wed, 15 Feb 2017 10:14:55 -0800
> Ben Warren  wrote:
> 
> On Feb 15, 2017, at 10:06 AM, Michael S. Tsirkin wrote:
> 
> On Wed, Feb 15, 2017 at 09:54:08AM -0800, Ben Warren wrote:
> 
>   On Feb 15, 2017, at 9:43 AM, Igor Mammedov <imamm...@redhat.com> wrote:
> 
>   On Wed, 15 Feb 2017 18:39:06 +0200
>   "Michael S. Tsirkin"  wrote:
> 
>   On Wed, Feb 15, 2017 at 04:56:02PM +0100, Igor Mammedov wrote:
> 
>   On Wed, 15 Feb 2017 17:30:00 +0200
>   "Michael S. Tsirkin"  wrote:
> 
>   On Wed, Feb 15, 2017 at 04:22:25PM +0100, Igor Mammedov wrote:
> 
>   On Wed, 15 Feb 2017 15:13:20 +0100
>   Laszlo Ersek  wrote:
> 
>   Commenting under Igor's reply for simplicity
> 
>   On 02/15/17 11:57, Igor Mammedov wrote:
> 
>   On Tue, 14 Feb 2017 22:15:43 -0800
>   b...@skyportsystems.com wrote:
> 
>   From: Ben Warren <b...@skyportsystems.com>
> 
>   This is similar to the existing 'add pointer' functionality, but instead
>   of instructing the guest (BIOS or UEFI) to patch memory, it instructs
>   the guest to write the pointer back to QEMU via a writeable fw_cfg file.
> 
>   Signed-off-by: Ben Warren <b...@skyportsystems.com>
>   ---
>   hw/acpi/bios-linker-loader.c | 58 ++--
>   include/hw/acpi/bios-linker-loader.h |  6 
>   2 files changed, 61 insertions(+), 3 deletions(-)
> 
>   diff --git a/hw/acpi/bios-linker-loader.c b/hw/acpi/bios-linker-loader.c
>   index d963ebe..5030cf1 100644
>   --- a/hw/acpi/bios-linker-loader.c
>   +++ b/hw/acpi/bios-linker-loader.c
>   @@ -78,6 +78,19 @@ struct BiosLinkerLoaderEntry {
>   uint32_t length;
>   } cksum;
> 
>   +/*
>   + * COMMAND_WRITE_POINTER - write the fw_cfg file (originating from
>   + * @dest_file) at @wr_pointer.offset, by adding a pointer to the table
>   + * originating from @src_file. 1,2,4 or 8 byte unsigned
>   + * addition is used depending on @wr_pointer.size.
>   + */
> 
>   The words "adding" and "addition" are causing confusion here.
> 
>   In all of the previous discussion, *addition* was out of scope from
>   WRITE_POINTER. Again, the firmware is specifically not required to
>   *re

Re: [Qemu-devel] [PATCH v6 0/7] Add support for VM Generation ID

2017-02-15 Thread Laszlo Ersek
On 02/15/17 21:09, Michael S. Tsirkin wrote:
> On Wed, Feb 15, 2017 at 08:47:48PM +0100, Laszlo Ersek wrote:

[snip]

>> For patches #1, #3, #4 and #5:
>>
>> Tested-by: Laszlo Ersek 
>>
>> I'll soon post the OVMF patches.
>>
>> Thanks!
>> Laszlo
> 
> 
> How do you feel about Igor's request to change WRITE_POINTER to add
> offset in there, so guest can pass in the address of GUID and
> not start of table? Would that be a lot of work to add?

I think it's doable in practice: simply add a constant from the command
itself, for passing the value back to QEMU, and also for saving the
fw_cfg write command for S3 resume time.

But, I disagree with it from a design POV.

Igor's point is:

> Math complicates QEMU code though and not only QEMU but AML code as
> well.

As I understand it, the goal is to push the addition to the firmware
(which is "one place"), rather than having to implement it twice in
QEMU, i.e., in two places ((a) native QEMU logic, (b) AML generator).

Here's my counter-argument:

(a) As I mentioned earlier, assume a complex communication structure
between the guest OS and QEMU. Currently our shared structure consists
of a single field (the GUID), but next time it might contain several fields.

For such a multi-field shared structure, QEMU will have to do manual
offsetting into the guest RAM anyway, for accessing fields F1, F2, and
F3. We will not create three separate WRITE_POINTER commands and let the
firmware calculate and return the absolute GPAs of the fields F1, F2 and
F3. Instead, there will be one WRITE_POINTER command, and QEMU will do
the offsetting manually, minimally for fields F2 and F3.

"src_offset" looks tempting now only because we have a shared structure
with only one field, the GUID at offset 40 decimal.

(b) Regarding the runtime addition in the AML code:

As discussed before, the main reason *now*, for not pointing VGIA (and
other named integer objects) with ADD_POINTER commands directly to
"meaningful" fields, is that OVMF probes the targets of ADD_POINTER
commands for patterns that look like ACPI table headers. And, for the
time being, we want to suppress any mis-recognitions by prepending some
padding.

Igor was right to dislike this, and we agreed that *down the road* we
should add allocation flags, or further allocation commands, to supplant
this kind of heuristics in OVMF. But:

- we don't have time to do it now, plus

- please observe that the runtime addition in AML relates to the
  ADD_POINTER and the ALLOCATE commands. It does not relate to
  WRITE_POINTER at all.

  Whatever we change on WRITE_POINTER will do nothing for suppressing
  OVMF's table header probing -- because that is tied to ADD_POINTER
  --, therefore WRITE_POINTER tweaks cannot eliminate the "need to add"
  in AML.


In summary, I think the proposed WRITE_POINTER modification is
implementable, but I think it will not pay off, because:

(a) for QEMU logic, it will not prove useful as soon as we have a
multi-field shared structure (QEMU will have to add field offsets anyway),

(b) and for eliminating the AML addition (which is a consequence of the
current ADD_POINTER handling in OVMF), it does nothing.

Thanks
Laszlo
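
Point (a) can be sketched in a few lines of C. This is only an illustration under stated assumptions: the 40-byte padding and the 16-byte GUID come from the discussion above, while the struct name shared_area, the extra field "flags", and the helper field_gpa are hypothetical and exist in neither QEMU nor OVMF. The firmware reports a single base address once; every field address is then derived on the QEMU side by adding a compile-time offset.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical guest-visible layout. Only the padding and the GUID
 * reflect the thread; "flags" is invented to stand for a second field.
 */
struct shared_area {
    uint8_t  padding[40];   /* keeps OVMF's table-header probe quiet */
    uint8_t  guid[16];      /* field F1, at offset 40 decimal */
    uint32_t flags;         /* field F2, hypothetical */
};

/*
 * One WRITE_POINTER command yields base_gpa; the per-field addition
 * happens here instead of in three separate firmware commands.
 */
static uint64_t field_gpa(uint64_t base_gpa, size_t field_offset)
{
    return base_gpa + field_offset;
}
```

With a base of 0x1000, field_gpa(0x1000, offsetof(struct shared_area, guid)) gives 0x1028, and the same call with "flags" gives 0x1038, without any further firmware involvement.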



[Qemu-devel] [PATCH] pci/pcie: don't assume cap id 0 is reserved

2017-02-15 Thread Michael S. Tsirkin
VFIO actually wants to create a capability with ID == 0.
This is done to make guest drivers skip the given capability.
pcie_add_capability then trips up on this capability
when looking for end of capability list.

To support this use-case, it's easy enough to switch to
e.g. 0x for these comparisons - we can be sure
it will never match a 16-bit capability ID.

Signed-off-by: Michael S. Tsirkin 
---
 hw/pci/pcie.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index cbd4bb4..f4dd177 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -610,7 +610,8 @@ bool pcie_cap_is_arifwd_enabled(const PCIDevice *dev)
  * uint16_t ext_cap_size
  */
 
-static uint16_t pcie_find_capability_list(PCIDevice *dev, uint16_t cap_id,
+/* Passing a cap_id value > 0x will return 0 and put end of list in prev */
+static uint16_t pcie_find_capability_list(PCIDevice *dev, uint32_t cap_id,
   uint16_t *prev_p)
 {
 uint16_t prev = 0;
@@ -679,9 +680,11 @@ void pcie_add_capability(PCIDevice *dev,
 } else {
 uint16_t prev;
 
-/* 0 is reserved cap id. use internally to find the last capability
-   in the linked list */
-next = pcie_find_capability_list(dev, 0, &prev);
+/*
+ * 0x is not a valid cap id (it's a 16 bit field). use
+ * internally to find the last capability in the linked list.
+ */
+next = pcie_find_capability_list(dev, 0x, &prev);
 
 assert(prev >= PCI_CONFIG_SPACE_SIZE);
 assert(next == 0);
-- 
MST
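
The idea in the commit message can be shown with a standalone walk over an extended capability list. This is a minimal sketch, not the QEMU code: the helpers read_le32, write_ext_cap and find_ext_cap are hypothetical, and CAP_ID_NONE (0x10000) is an assumed sentinel chosen only because it cannot equal a 16-bit ID; it is not necessarily the value the patch uses. The header layout (ID in bits 15:0, next pointer in bits 31:20) follows the PCIe extended capability format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CFG_SIZE      4096u    /* PCIe extended config space size */
#define EXT_CAP_START 0x100u   /* offset of the first extended capability */
#define CAP_ID_NONE   0x10000u /* assumed sentinel: wider than any 16-bit ID */

static uint32_t read_le32(const uint8_t *cfg, uint16_t off)
{
    return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
           ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

/* Write a version 1 capability header with the given ID and next pointer. */
static void write_ext_cap(uint8_t *cfg, uint16_t off, uint16_t id, uint16_t next)
{
    uint32_t header = (uint32_t)id | (1u << 16) | ((uint32_t)next << 20);

    cfg[off] = header & 0xff;
    cfg[off + 1] = (header >> 8) & 0xff;
    cfg[off + 2] = (header >> 16) & 0xff;
    cfg[off + 3] = (header >> 24) & 0xff;
}

/*
 * Return the offset of the capability whose ID equals cap_id, or 0 if
 * there is none. *prev_p receives the offset of the entry visited just
 * before the returned one, i.e. the list tail when nothing matched.
 */
static uint16_t find_ext_cap(const uint8_t *cfg, uint32_t cap_id, uint16_t *prev_p)
{
    uint16_t prev = 0;
    uint16_t next = 0;
    uint32_t header = read_le32(cfg, EXT_CAP_START);

    if (header != 0) {
        for (next = EXT_CAP_START; next != 0;
             prev = next, next = (header >> 20) & 0xffc) {
            assert(next >= EXT_CAP_START && next <= CFG_SIZE - 8);
            header = read_le32(cfg, next);
            if ((header & 0xffff) == cap_id) {
                break;
            }
        }
    }
    if (prev_p) {
        *prev_p = prev;
    }
    return next;
}
```

With a list holding a capability of ID 0 at 0x100 followed by one of ID 1 at 0x140, searching for 0 stops at 0x100 and never reaches the tail, which is the failure described above. Searching for CAP_ID_NONE returns 0 and leaves 0x140 in *prev_p.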



Re: [Qemu-devel] [PATCH v8 1/2] block/vxhs.c: Add support for a new block device type called "vxhs"

2017-02-15 Thread ashish mittal
Thanks! Will change accordingly in the next patch.

On Tue, Feb 14, 2017 at 7:54 PM, Jeff Cody  wrote:
> On Tue, Feb 14, 2017 at 07:02:32PM -0800, ashish mittal wrote:
>> On Tue, Feb 14, 2017 at 2:34 PM, ashish mittal  wrote:
>> > On Tue, Feb 14, 2017 at 12:51 PM, Jeff Cody  wrote:
>> >> On Thu, Feb 09, 2017 at 01:24:58AM -0800, ashish mittal wrote:
>> >>> On Wed, Feb 8, 2017 at 10:29 PM, Jeff Cody  wrote:
>> >>> > On Wed, Feb 08, 2017 at 09:23:33PM -0800, Ashish Mittal wrote:
>> >>> >> From: Ashish Mittal 
>> >>> >>
>> >>> >> Source code for the qnio library that this code loads can be downloaded from:
>> >>> >> https://github.com/VeritasHyperScale/libqnio.git
>> >>> >>
>> >>> >> Sample command line using JSON syntax:
>> >>> >> ./x86_64-softmmu/qemu-system-x86_64 -name instance-0008 -S -vnc 0.0.0.0:0
>> >>> >> -k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
>> >>> >> -msg timestamp=on
>> >>> >> 'json:{"driver":"vxhs","vdisk-id":"c3e9095a-a5ee-4dce-afeb-2a59fb387410",
>> >>> >> "server":{"host":"172.172.17.4","port":""}}'
>> >>> >>
>> >>> >> Sample command line using URI syntax:
>> >>> >> qemu-img convert -f raw -O raw -n
>> >>> >> /var/lib/nova/instances/_base/0c5eacd5ebea5ed914b6a3e7b18f1ce734c386ad
>> >>> >> vxhs://192.168.0.1:/c6718f6b-0401-441d-a8c3-1f0064d75ee0
>> >>> >>
>> >>
>> >> [...]
>> >>
>> >>> >> +#define VXHS_OPT_FILENAME   "filename"
>> >>> >> +#define VXHS_OPT_VDISK_ID   "vdisk-id"
>> >>> >> +#define VXHS_OPT_SERVER "server"
>> >>> >> +#define VXHS_OPT_HOST   "host"
>> >>> >> +#define VXHS_OPT_PORT   "port"
>> >>> >> +#define VXHS_UUID_DEF "12345678-1234-1234-1234-123456789012"
>> >>> >
>> >>> > What is this?  It is not a valid UUID; is the value significant?
>> >>> >
>> >>>
>> >>> This value gets passed to libvxhs for binaries like qemu-io, qemu-img
>> >>> that do not have an Instance ID. We can use this default ID to control
>> >>> access to specific vdisks by these binaries. qemu-kvm will pass the
>> >>> actual instance ID, and therefore will not use this default value.
>> >>>
>> >>> Will reply to other queries soon.
>> >>>
>> >>
>> >> If you are going to call it a UUID, it should adhere to the RFC 4122 spec.
>> >> You can easily generate a compliant UUID with uuidgen.  However:
>> >>
>> >> Can you explain more about how you are using this to control access by
>> >> qemu-img and qemu-io?  Looking at libqnio, it looks like this is used to
>> >> determine at runtime which TLS certs to use based off of a
>> >> pathname/filename, which is then how I presume you are controlling access.
>> >> See Daniel's email regarding TLS certificates.
>> >>
>> >
>> > (1) The default behavior would be to disallow access to any vdisks by
>> > the non qemu-kvm binaries. qemu-kvm would use the actual instance ID
>> > for authentication.
>> > (2) Depending on the workflow, HyperScale controller can choose to
>> > grant *temporary* access to specific vdisks by qemu-img, qemu-io
>> > binaries (identified by the default VXHS_UUID_DEF above).
>> > (3) This information, described in #2, would be communicated by the
>> > HyperScale controller to the actual proprietary VxHS server (running
>> > on each compute) that does the authentication/SSL.
>> > (4) The HyperScale controller, in this way, can grant/revoke access
>> > for specific vdisks not just to clients with VXHS_UUID_DEF instance
>> > ID, but also the actual VM instances.
>> >
>> >> [...]
>> >>
>> >>> >> +
>> >>> >> +static void bdrv_vxhs_init(void)
>> >>> >> +{
>> >>> >> +char out[37];
>> >>
>> >> Additional point: this should be sized as UUID_FMT_LEN + 1, not 37, but I
>> >> suspect this code is changing anyways.
>> >>
>> >>> >> +
>> >>> >> +if (qemu_uuid_is_null(&qemu_uuid)) {
>> >>> >> +lib_init_failed = iio_init(QNIO_VERSION, vxhs_iio_callback, VXHS_UUID_DEF);
>> >>> >> +} else {
>> >>> >> +qemu_uuid_unparse(&qemu_uuid, out);
>> >>> >> +lib_init_failed = iio_init(QNIO_VERSION, vxhs_iio_callback, out);
>> >>> >> +}
>> >>> >> +
>> >>> >
>> >>> > [1]
>> >>> >
>> >>> > Can you explain what is going on here with the qemu_uuid check?
>> >>> >
>> >
>> > (1) qemu_uuid_is_null(&qemu_uuid) is true for qemu-io, qemu-img that
>> > do not define an instance ID. We end up using the default VXHS_UUID_DEF
>> > ID for them, and authenticating them as described above.
>> >
>> > (2) For the other case 'else', we convert the uuid to a char * using
>> > qemu_uuid_unparse(), and pass the resulting char * uuid in variable
>> > 'out' to libvxhs.
>> >
>> >>> >
>> >>> > You also can't do this here.  This init function is just to register the
>> >>> > driver (e.g. populate the BlockDriver list).  You shouldn't be doing
>> >>> > anything other than the bdrv_register() call here.
>> >>> >
>> >>> > Since you want to run this iio_init only once, I would recommend doing it in
>> >
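
On the UUID question raised in the quoted review: one plausible reason the default value is called invalid is that its variant field does not match RFC 4122. The sketch below checks the textual form, the version digit and the variant digit. It is an illustration only; the function names and UUID_STR_LEN are hypothetical and are not QEMU or libqnio APIs. The 36-character length is also why a buffer for the string form needs 36 + 1 bytes, matching the UUID_FMT_LEN + 1 sizing suggested in the review.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

#define UUID_STR_LEN 36 /* 32 hex digits plus 4 hyphens, NUL not included */

/* True if s has the 8-4-4-4-12 hexadecimal layout. */
static bool uuid_str_is_wellformed(const char *s)
{
    if (strlen(s) != UUID_STR_LEN) {
        return false;
    }
    for (size_t i = 0; i < UUID_STR_LEN; i++) {
        if (i == 8 || i == 13 || i == 18 || i == 23) {
            if (s[i] != '-') {
                return false;
            }
        } else if (!isxdigit((unsigned char)s[i])) {
            return false;
        }
    }
    return true;
}

/*
 * True if s is well-formed, its version digit is 1..5 and its variant
 * digit is 8, 9, a or b (the RFC 4122 variant).
 */
static bool uuid_str_is_rfc4122(const char *s)
{
    char version;
    char variant;

    if (!uuid_str_is_wellformed(s)) {
        return false;
    }
    version = s[14];
    variant = (char)tolower((unsigned char)s[19]);
    return version >= '1' && version <= '5' &&
           (variant == '8' || variant == '9' ||
            variant == 'a' || variant == 'b');
}
```

Applied to the strings in this thread, "12345678-1234-1234-1234-123456789012" is well-formed but fails the variant check, while the vdisk-id "c3e9095a-a5ee-4dce-afeb-2a59fb387410" passes both.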

Re: [Qemu-devel] [PATCH v3 8/8] hw: Drop superfluous special checks for orphaned -drive

2017-02-15 Thread John Snow


On 02/15/2017 05:05 AM, Markus Armbruster wrote:
> We've traditionally rejected orphans here and there, but not
> systematically.  For instance, the sun4m machines have an onboard SCSI
> HBA (bus=0), and have always rejected bus>0.  Other machines with an
> onboard SCSI HBA don't.
> 
> Commit a66c9dc made all orphans trigger a warning, and the previous
> commit turned this into an error.  The checks "here and there" are now
> redundant.  Drop them.
> 
> Note that the one in mips_jazz.c was wrong: it rejected bus > MAX_FD,
> but MAX_FD is the number of floppy drives per bus.
> 
> Error messages change from
> 
> $ qemu-system-x86_64 -drive if=ide,bus=2
> qemu-system-x86_64: Too many IDE buses defined (3 > 2)
> $ qemu-system-mips64 -M magnum,accel=qtest -drive if=floppy,bus=2,id=fd1
> qemu: too many floppy drives
> $ qemu-system-sparc -M LX -drive if=scsi,bus=1
> qemu: too many SCSI bus
> 
> to
> 
> $ qemu-system-x86_64 -drive if=ide,bus=2
> qemu-system-x86_64: -drive if=ide,bus=2: machine type does not support if=ide,bus=2,unit=0
> $ qemu-system-mips64 -M magnum,accel=qtest -drive if=floppy,bus=2,id=fd1
> qemu-system-mips64: -drive if=floppy,bus=2,id=fd1: machine type does not support if=floppy,bus=2,unit=0
> $ qemu-system-sparc -M LX -drive if=scsi,bus=1
> qemu-system-sparc: -drive if=scsi,bus=1: machine type does not support if=scsi,bus=1,unit=0
> 
> Cc: John Snow 
> Cc: "Hervé Poussineau" 
> Cc: Mark Cave-Ayland 
> Signed-off-by: Markus Armbruster 
> ---
>  hw/ide/core.c   | 17 -
>  hw/mips/mips_jazz.c |  4 
>  hw/sparc/sun4m.c|  5 -
>  3 files changed, 26 deletions(-)
> 
> diff --git a/hw/ide/core.c b/hw/ide/core.c
> index 43709e5..cfa5de6 100644
> --- a/hw/ide/core.c
> +++ b/hw/ide/core.c
> @@ -2840,23 +2840,6 @@ const VMStateDescription vmstate_ide_bus = {
>  void ide_drive_get(DriveInfo **hd, int n)
>  {
>  int i;
> -int highest_bus = drive_get_max_bus(IF_IDE) + 1;
> -int max_devs = drive_get_max_devs(IF_IDE);
> -int n_buses = max_devs ? (n / max_devs) : n;
> -
> -/*
> - * Note: The number of actual buses available is not known.
> - * We compute this based on the size of the DriveInfo* array, n.
> - * If it is less than max_devs * ,
> - * We will stop looking for drives prematurely instead of overfilling
> - * the array.
> - */
> -
> -if (highest_bus > n_buses) {
> -error_report("Too many IDE buses defined (%d > %d)",
> - highest_bus, n_buses);
> -exit(1);
> -}
>  
>  for (i = 0; i < n; i++) {
>  hd[i] = drive_get_by_index(IF_IDE, i);
> diff --git a/hw/mips/mips_jazz.c b/hw/mips/mips_jazz.c
> index 73f6c9f..1cef581 100644
> --- a/hw/mips/mips_jazz.c
> +++ b/hw/mips/mips_jazz.c
> @@ -291,10 +291,6 @@ static void mips_jazz_init(MachineState *machine,
>   qdev_get_gpio_in(rc4030, 5), &esp_reset, &dma_enable);
>  
>  /* Floppy */
> -if (drive_get_max_bus(IF_FLOPPY) >= MAX_FD) {
> -fprintf(stderr, "qemu: too many floppy drives\n");
> -exit(1);
> -}
>  for (n = 0; n < MAX_FD; n++) {
>  fds[n] = drive_get(IF_FLOPPY, 0, n);
>  }
> diff --git a/hw/sparc/sun4m.c b/hw/sparc/sun4m.c
> index f5b6efd..61416a6 100644
> --- a/hw/sparc/sun4m.c
> +++ b/hw/sparc/sun4m.c
> @@ -989,11 +989,6 @@ static void sun4m_hw_init(const struct sun4m_hwdef *hwdef,
>  slavio_misc_init(hwdef->slavio_base, hwdef->aux1_base, hwdef->aux2_base,
>   slavio_irq[30], fdc_tc);
>  
> -if (drive_get_max_bus(IF_SCSI) > 0) {
> -fprintf(stderr, "qemu: too many SCSI bus\n");
> -exit(1);
> -}
> -
>  esp_init(hwdef->esp_base, 2,
>   espdma_memory_read, espdma_memory_write,
>   espdma, espdma_irq, &esp_reset, &dma_enable);
> 

http://i.imgur.com/v1Lvzb1.jpg

Reviewed-by: John Snow 


