Re: [PATCH v4 27/33] nvdimm acpi: save arg3 for NVDIMM device _DSM method

2015-10-18 Thread Michael S. Tsirkin
On Mon, Oct 19, 2015 at 12:04:48PM +0800, Xiao Guangrong wrote:
> 
> 
> On 10/19/2015 01:16 AM, Michael S. Tsirkin wrote:
> >On Mon, Oct 19, 2015 at 08:54:13AM +0800, Xiao Guangrong wrote:
> >>Check if the input Arg3 is valid then store it into dsm_in if needed
> >>
> >>Signed-off-by: Xiao Guangrong 
> >>---
> >>  hw/acpi/nvdimm.c | 19 +++
> >>  1 file changed, 19 insertions(+)
> >>
> >>diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
> >>index 7e99889..b211b8b 100644
> >>--- a/hw/acpi/nvdimm.c
> >>+++ b/hw/acpi/nvdimm.c
> >>@@ -624,10 +624,29 @@ static void nvdimm_build_acpi_devices(NVDIMMState 
> >>*state, GSList *device_list,
> >>
> >>  method = aml_method_serialized("NCAL", 4);
> >>  {
> >>+Aml *ifctx;
> >>+
> >>  aml_append(method, aml_store(aml_arg(0), aml_name("HDLE")));
> >>  aml_append(method, aml_store(aml_arg(1), aml_name("REVS")));
> >>  aml_append(method, aml_store(aml_arg(2), aml_name("FUNC")));
> >>
> >>+/* Arg3 is passed as Package and it has one element? */
> >>+ifctx = aml_if(aml_and(aml_equal(aml_object_type(aml_arg(3)),
> >>+ aml_int(4)),
> >>+   aml_equal(aml_sizeof(aml_arg(3)),
> >>+ aml_int(1))));
> >>+{
> >>+/* Local0 = Index(Arg3, 0) */
> >>+aml_append(ifctx, aml_store(aml_index(aml_arg(3), aml_int(0)),
> >>+aml_local(0)));
> >>+/* Local3 = DeRefOf(Local0) */
> >>+aml_append(ifctx, aml_store(aml_derefof(aml_local(0)),
> >>+aml_local(3)));
> >>+/* ARG3 = Local3 */
> >>+aml_append(ifctx, aml_store(aml_local(3), aml_name("ARG3")));
> >>+}
> >>+aml_append(method, ifctx);
> >>+
> >>  aml_append(method, aml_store(aml_int(NOTIFY_VALUE), 
> >> aml_name("NOTI")));
> >>
> >>  aml_append(method, aml_store(aml_name("RLEN"), aml_local(6)));
> >
> >I commented on this patch on v3.
> >It doesn't look like this was addressed.
> >
> 
> Ah... I see no one commented on this patch ([PATCH v3 26/32] nvdimm: save arg3
> for NVDIMM device _DSM method) on v3.
> 
> Do you mean we need more and better comments to explain arg3? Or anything else?

Interesting. I have it in my sent mail file, but it doesn't seem to
be on list. I've just resent it, and a couple of other messages
that seem to have disappeared into the ether.

These are the messages I have for v3:

33587   F To Xiao Guangro Re: [PATCH v3 26/32] nvdimm: save arg3 for NVDIMM device _DSM method
33588   F To Xiao Guangro Re: [PATCH v3 22/32] nvdimm: init the address region used by NVDIMM ACPI
33589   F To Xiao Guangro Re: [PATCH v3 23/32] nvdimm: build ACPI NFIT table
33590   F To Xiao Guangro Re: [PATCH v3 00/32] implement vNVDIMM
33597   F To Xiao Guangro Re: [PATCH v3 23/32] nvdimm: build ACPI NFIT table
33598   F To Xiao Guangro Re: [PATCH v3 00/32] implement vNVDIMM
33599   F To Xiao Guangro Re: [PATCH v3 23/32] nvdimm: build ACPI NFIT table

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 00/32] implement vNVDIMM

2015-10-18 Thread Michael S. Tsirkin
On Tue, Oct 13, 2015 at 01:29:48PM +0800, Xiao Guangrong wrote:
> 
> 
> On 10/12/2015 07:55 PM, Michael S. Tsirkin wrote:
> >On Sun, Oct 11, 2015 at 11:52:32AM +0800, Xiao Guangrong wrote:
> >>Changelog in v3:
> >>There is huge change in this version, thank Igor, Stefan, Paolo, Eduardo,
> >>Michael for their valuable comments, the patchset finally gets better shape.
> >
> >Thanks!
> >This needs some changes in coding style, and more comments, to
> >make it easier to maintain going forward.
> 
> Thanks for your review, Michael. I have learned lots of thing from
> your comments.
> 
> >
> >High level comments - I didn't point out all instances,
> >please go over code and locate them yourself.
> >I focused on acpi code in this review.
> 
> Okay, will do.
> 
> >
> > - fix coding style violations, prefix everything with nvdimm_ etc
> 
> Actually I did not pay attention to naming the stuff which is only used
> internally. Thank you for pointing it out; I will fix it in the next version.
> 
> > - in acpi code, avoid manual memory management/complex pointer math
> 
> I am not very good at ACPI ASL/AML, could you please give more detail?

It's about C.

For example:
Foo *foo = acpi_data_push(table, sizeof *foo);
Bar *bar = acpi_data_push(table, sizeof *bar);
is pretty obviously safe, and it doesn't require you to do any
calculations.
char *buf = acpi_data_push(table, sizeof *foo + sizeof *bar);
is worse, now you need:
Bar *bar = (Bar *)(buf + sizeof *foo);
which will corrupt memory if you get the size wrong in push.
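
To make the contrast concrete, here is a minimal, self-contained sketch.
Table, table_push(), Foo and Bar are hypothetical stand-ins (a fixed-capacity
buffer rather than QEMU's acpi_data_push() and its growable array), so treat
it as an illustration of the two styles and not as QEMU code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an ACPI table blob (not QEMU's GArray). */
typedef struct {
    uint8_t *data;
    size_t len;
    size_t cap;
} Table;

static void table_init(Table *t, size_t cap)
{
    t->data = calloc(1, cap);   /* zero-filled backing store */
    assert(t->data != NULL);
    t->len = 0;
    t->cap = cap;
}

static void table_free(Table *t)
{
    free(t->data);
    t->data = NULL;
    t->len = t->cap = 0;
}

/* Reserve `size` bytes at the end of the table and return a pointer to them. */
static void *table_push(Table *t, size_t size)
{
    assert(size <= t->cap - t->len);
    void *p = t->data + t->len;
    t->len += size;
    return p;
}

typedef struct { uint16_t type; uint16_t length; } Foo;   /* hypothetical */
typedef struct { uint32_t handle; } Bar;                  /* hypothetical */

/* One push per structure: each size is derived from the pointer it fills. */
static size_t build_separately(Table *t)
{
    Foo *foo = table_push(t, sizeof *foo);
    Bar *bar = table_push(t, sizeof *bar);

    foo->type = 1;
    foo->length = sizeof *foo;
    bar->handle = 0x42;
    return t->len;
}

/* One combined push: only correct while the size and the offset stay in sync. */
static size_t build_combined(Table *t)
{
    char *buf = table_push(t, sizeof(Foo) + sizeof(Bar));
    Foo *foo = (Foo *)buf;
    Bar *bar = (Bar *)(buf + sizeof *foo);

    foo->type = 1;
    foo->length = sizeof *foo;
    bar->handle = 0x42;
    return t->len;
}
```

Both functions produce the same bytes; the difference is that the first one
cannot get an offset wrong because it never computes one. The sketch uses a
fixed capacity so that earlier pointers stay valid across pushes; with a
growable array that assumption would need checking.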

> > - comments are needed to document apis & explain what's going on
> > - constants need comments too, refer to text that
> >   can be looked up in acpi spec verbatim
> 
> Indeed, will document carefully.


Re: [PATCH v3 22/32] nvdimm: init the address region used by NVDIMM ACPI

2015-10-18 Thread Michael S. Tsirkin
On Sun, Oct 11, 2015 at 11:52:54AM +0800, Xiao Guangrong wrote:
> We reserve the memory region 0xFF0 ~ 0xFFF0 for NVDIMM ACPI
> which is used as:
> - the first page is mapped as MMIO, ACPI write data to this page to
>   transfer the control to QEMU
> 
> - the second page is RAM-based which used to save the input info of
>   _DSM method and QEMU reuse it store output info
> 
> - the left is mapped as RAM, it's the buffer returned by _FIT method,
>   this is needed by NVDIMM hotplug
> 

Isn't there some way to document this in code, e.g. with
macros?

Adding text under docs/specs would also be a good idea.
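
One possible shape for such macros, as a sketch only: the names below are
made up for illustration, the base address and page size are passed in
rather than hard-coded (the patch itself uses TARGET_PAGE_SIZE), and the
layout simply follows the three bullets of the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the layout is taken from the commit message. */

/* Page 0: MMIO page. Guest AML writes here to transfer control to QEMU. */
#define NVDIMM_ACPI_MMIO_PAGE 0
/* Page 1: RAM page holding the _DSM input; QEMU reuses it for the output. */
#define NVDIMM_ACPI_DSM_PAGE  1
/* Page 2 onwards: RAM buffer returned by _FIT (needed for NVDIMM hotplug). */
#define NVDIMM_ACPI_FIT_PAGE  2

/* Guest-physical address of the given page within the reserved region. */
static inline uint64_t nvdimm_acpi_page_addr(uint64_t base, uint64_t page_size,
                                             unsigned page)
{
    return base + (uint64_t)page * page_size;
}

/* Bytes left for the _FIT buffer once the two fixed pages are accounted for. */
static inline uint64_t nvdimm_acpi_fit_size(uint64_t region_size,
                                            uint64_t page_size)
{
    assert(region_size >= NVDIMM_ACPI_FIT_PAGE * page_size);
    return region_size - NVDIMM_ACPI_FIT_PAGE * page_size;
}
```

With something like this, the code that maps the region and the AML that
addresses it can share one definition of which page is which.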


> Signed-off-by: Xiao Guangrong 
> ---
>  hw/i386/pc.c|   3 ++
>  hw/mem/Makefile.objs|   2 +-
>  hw/mem/nvdimm/acpi.c| 120 
> 
>  include/hw/i386/pc.h|   2 +
>  include/hw/mem/nvdimm.h |  19 
>  5 files changed, 145 insertions(+), 1 deletion(-)
>  create mode 100644 hw/mem/nvdimm/acpi.c
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 6694b18..8fea4c3 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1360,6 +1360,9 @@ FWCfgState *pc_memory_init(PCMachineState *pcms,
>  exit(EXIT_FAILURE);
>  }
>  
> +nvdimm_init_memory_state(&pcms->nvdimm_memory, system_memory, 
> machine,
> + TARGET_PAGE_SIZE);
> +

Shouldn't this be conditional on presence of the nvdimm device?


>  pcms->hotplug_memory.base =
>  ROUND_UP(0x1ULL + pcms->above_4g_mem_size, 1ULL << 30);
>  
> diff --git a/hw/mem/Makefile.objs b/hw/mem/Makefile.objs
> index e0ff328..7310bac 100644
> --- a/hw/mem/Makefile.objs
> +++ b/hw/mem/Makefile.objs
> @@ -1,3 +1,3 @@
>  common-obj-$(CONFIG_DIMM) += dimm.o
>  common-obj-$(CONFIG_MEM_HOTPLUG) += pc-dimm.o
> -common-obj-$(CONFIG_NVDIMM) += nvdimm/nvdimm.o
> +common-obj-$(CONFIG_NVDIMM) += nvdimm/nvdimm.o nvdimm/acpi.o
> diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
> new file mode 100644
> index 000..b640874
> --- /dev/null
> +++ b/hw/mem/nvdimm/acpi.c
> @@ -0,0 +1,120 @@
> +/*
> + * NVDIMM ACPI Implementation
> + *
> + * Copyright(C) 2015 Intel Corporation.
> + *
> + * Author:
> + *  Xiao Guangrong 
> + *
> + * NFIT is defined in ACPI 6.0: 5.2.25 NVDIMM Firmware Interface Table (NFIT)
> + * and the DSM specfication can be found at:
> + *   http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
> + *
> + * Currently, it only supports PMEM Virtualization.
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, see 
> 
> + */
> +
> +#include "qemu-common.h"
> +#include "hw/acpi/acpi.h"
> +#include "hw/acpi/aml-build.h"
> +#include "hw/mem/nvdimm.h"
> +#include "internal.h"
> +
> +/* System Physical Address Range Structure */
> +struct nfit_spa {
> +uint16_t type;
> +uint16_t length;
> +uint16_t spa_index;
> +uint16_t flags;
> +uint32_t reserved;
> +uint32_t proximity_domain;
> +uint8_t type_guid[16];
> +uint64_t spa_base;
> +uint64_t spa_length;
> +uint64_t mem_attr;
> +} QEMU_PACKED;
> +typedef struct nfit_spa nfit_spa;
> +
> +/* Memory Device to System Physical Address Range Mapping Structure */
> +struct nfit_memdev {
> +uint16_t type;
> +uint16_t length;
> +uint32_t nfit_handle;
> +uint16_t phys_id;
> +uint16_t region_id;
> +uint16_t spa_index;
> +uint16_t dcr_index;
> +uint64_t region_len;
> +uint64_t region_offset;
> +uint64_t region_dpa;
> +uint16_t interleave_index;
> +uint16_t interleave_ways;
> +uint16_t flags;
> +uint16_t reserved;
> +} QEMU_PACKED;
> +typedef struct nfit_memdev nfit_memdev;
> +
> +/* NVDIMM Control Region Structure */
> +struct nfit_dcr {
> +uint16_t type;
> +uint16_t length;
> +uint16_t dcr_index;
> +uint16_t vendor_id;
> +uint16_t device_id;
> +uint16_t revision_id;
> +uint16_t sub_vendor_id;
> +uint16_t sub_device_id;
> +uint16_t sub_revision_id;
> +uint8_t reserved[6];
> +uint32_t serial_number;
> +uint16_t fic;
> +uint16_t num_bcw;
> +uint64_t bcw_size;
> +uint64_t cmd_offset;
> +uint64_t cmd_size;
> +uint64_t status_offset;
> +uint64_t status_size;
> +uint16_t flags;
> +uint8_t reserved2[6];
> +} QEMU_PACKED;
> +typedef struct nfit_dcr nfit_dcr;

Struct naming

Re: [PATCH v3 00/32] implement vNVDIMM

2015-10-18 Thread Michael S. Tsirkin
On Sun, Oct 11, 2015 at 11:52:32AM +0800, Xiao Guangrong wrote:
> Changelog in v3:
> There is huge change in this version, thank Igor, Stefan, Paolo, Eduardo,
> Michael for their valuable comments, the patchset finally gets better shape.

Thanks!
This needs some changes in coding style, and more comments, to
make it easier to maintain going forward.

High level comments - I didn't point out all instances,
please go over code and locate them yourself.
I focused on acpi code in this review.

- fix coding style violations, prefix everything with nvdimm_ etc
- in acpi code, avoid manual memory management/complex pointer math
- comments are needed to document apis & explain what's going on
- constants need comments too, refer to text that
  can be looked up in acpi spec verbatim


> - changes from Igor's comments:
>   1) abstract dimm device type from pc-dimm and create nvdimm device based on
>  dimm, then it uses memory backend device as nvdimm's memory and NUMA has
>  easily been implemented.
>   2) let file-backend device support any kind of filesystem not only for
>  hugetlbfs and let it work on file not only for directory which is
>  achieved by extending 'mem-path' - if it's a directory then it works as
>  current behavior, otherwise if it's file then directly allocates memory
>  from it.
>   3) we figure out a unused memory hole below 4G that is 0xFF0 ~ 
>  0xFFF0, this range is large enough for NVDIMM ACPI as build 64-bit
>  ACPI SSDT/DSDT table will break windows XP.
>  BTW, only make SSDT.rev = 2 can not work since the width is only depended
>  on DSDT.rev based on 19.6.28 DefinitionBlock (Declare Definition Block)
>  in ACPI spec:
> | Note: For compatibility with ACPI versions before ACPI 2.0, the bit 
> | width of Integer objects is dependent on the ComplianceRevision of the DSDT.
> | If the ComplianceRevision is less than 2, all integers are restricted to 32 
> | bits. Otherwise, full 64-bit integers are used. The version of the DSDT 
> sets 
> | the global integer width for all integers, including integers in SSDTs.
>   4) use the lowest ACPI spec version to document AML terms.
>   5) use "nvdimm" as nvdimm device name instead of "pc-nvdimm"
> 
> - changes from Stefan's comments:
>   1) do not do endian adjustment in-place since _DSM memory is visible to 
> guest
>   2) use target platform's target page size instead of fixed PAGE_SIZE
>  definition
>   3) lots of code style improvement and typo fixes.
>   4) live migration fix
> - changes from Paolo's comments:
>   1) improve the name of memory region
>   
> - other changes:
>   1) return exact buffer size for _DSM method instead of the page size.
>   2) introduce mutex in NVDIMM ACPI as the _DSM memory is shared by all nvdimm
>  devices.
>   3) NUMA support
>   4) implement _FIT method
>   5) rename "configdata" to "reserve-label-data"
>   6) simplify _DSM arg3 determination
>   7) main changelog update to let it reflect v3.
> 
> Changelog in v2:
> - Use little endian for DSM method, thanks to Stefan's suggestion
> 
> - introduce a new parameter, @configdata; if it's false, QEMU will
>   build a static, read-only namespace in memory and use it to serve
>   DSM GET_CONFIG_SIZE/GET_CONFIG_DATA requests. In this case, no
>   reserved region is needed at the end of the @file, which is good for
>   users who want to pass the whole nvdimm device and make its data
>   completely visible to the guest
> 
> - divide the source code into separated files and add maintain info
> 
> BTW, PCOMMIT virtualization on KVM side is work in progress, hopefully will
> be posted on next week
> 
> == Background ==
> NVDIMM (A Non-Volatile Dual In-line Memory Module) is going to be supported
> on Intel's platform. They are discovered via ACPI and configured by _DSM
> method of NVDIMM device in ACPI. There has some supporting documents which
> can be found at:
> ACPI 6: http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
> NVDIMM Namespace: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
> DSM Interface Example: 
> http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
> Driver Writer's Guide: 
> http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf
> 
> Currently, the NVDIMM driver has been merged into upstream Linux Kernel and
> this patchset tries to enable it in virtualization field
> 
> == Design ==
> NVDIMM supports two mode accesses, one is PMEM which maps NVDIMM into CPU's
> address space then CPU can directly access it as normal memory, another is
> BLK which is used as block device to reduce the occupying of CPU address
> space
> 
> BLK mode accesses NVDIMM via Command Register window and Data Register window.
> BLK virtualization has high overhead since each sector access will cause at
> least two VM-EXITs. So we currently only implement vPMEM in this patchset
> 
> --- vPMEM design ---
> We introduce a new device named "nvdimm", it uses memory bac

Re: [PATCH v3 26/32] nvdimm: save arg3 for NVDIMM device _DSM method

2015-10-18 Thread Michael S. Tsirkin
On Sun, Oct 11, 2015 at 11:52:58AM +0800, Xiao Guangrong wrote:
> Check if the input Arg3 is valid then store it into dsm_in if needed
> 
> We only do the save on NVDIMM device since we are not going to support any
> function on root device
> 
> Signed-off-by: Xiao Guangrong 
> ---
>  hw/mem/nvdimm/acpi.c | 21 -
>  1 file changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/mem/nvdimm/acpi.c b/hw/mem/nvdimm/acpi.c
> index d9fa0fd..3b9399c 100644
> --- a/hw/mem/nvdimm/acpi.c
> +++ b/hw/mem/nvdimm/acpi.c
> @@ -442,7 +442,7 @@ static void build_nvdimm_devices(NVDIMMState *state, 
> GSList *device_list,
>  int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
> NULL);
>  uint32_t handle = nvdimm_slot_to_handle(slot);
> -Aml *dev, *method;
> +Aml *dev, *method, *ifctx;
>  
>  dev = aml_device("NV%02X", slot);
>  aml_append(dev, aml_name_decl("_ADR", aml_int(handle)));
> @@ -452,6 +452,24 @@ static void build_nvdimm_devices(NVDIMMState *state, 
> GSList *device_list,
>  method = aml_method("_DSM", 4);
>  {
>  SAVE_ARG012_HANDLE_LOCK(method, aml_int(handle));
> +
> +/* Arg3 is passed as Package and it has one element? */
> +ifctx = aml_if(aml_and(aml_equal(aml_object_type(aml_arg(3)),
> + aml_int(4)),
> +   aml_equal(aml_sizeof(aml_arg(3)),

aml_arg(3) is used many times below.
Pls give it a name that makes sense (not arg3! what is it for?)

> + aml_int(1))));

Pls document AML constants used.
Like this:

ifctx = aml_if(aml_and(aml_equal(aml_object_type(aml_arg(3)),
 aml_int(4 /* 4 - Package */) ),
   aml_equal(aml_sizeof(aml_arg(3)),
 aml_int(1))));

> +{
> +/* Local0 = Index(Arg3, 0) */
> +aml_append(ifctx, aml_store(aml_index(aml_arg(3), 
> aml_int(0)),
> +aml_local(0)));
> +/* Local3 = DeRefOf(Local0) */
> +aml_append(ifctx, aml_store(aml_derefof(aml_local(0)),
> +aml_local(3)));
> +/* ARG3 = Local3 */
> +aml_append(ifctx, aml_store(aml_local(3), aml_name("ARG3")));

This isn't a good way to comment things: you are
just adding ASL before the equivalent C.
Pls document what's going on.




> +}
> +aml_append(method, ifctx);
> +
>  NOTIFY_AND_RETURN_UNLOCK(method);
>  }
>  aml_append(dev, method);
> @@ -534,6 +552,7 @@ static void nvdimm_build_acpi_devices(NVDIMMState *state, 
> GSList *device_list,
>  method = aml_method("_DSM", 4);
>  {
>  SAVE_ARG012_HANDLE_LOCK(method, aml_int(0));
> +/* no command we support on ROOT device has Arg3. */
>  NOTIFY_AND_RETURN_UNLOCK(method);
>  }
>  aml_append(dev, method);
> -- 
> 1.8.3.1


[PATCH] KVM: x86: MMU: Initialize force_pt_level before calling mapping_level()

2015-10-18 Thread Takuya Yoshikawa
Commit fd1369021878 ("KVM: x86: MMU: Move mapping_level_dirty_bitmap()
call in mapping_level()") forgot to initialize force_pt_level to false
in FNAME(page_fault)() before calling mapping_level() like
nonpaging_map() does.  This can sometimes result in forcing page table
level mapping unnecessarily.

Fix this and move the first *force_pt_level check in mapping_level()
before kvm_vcpu_gfn_to_memslot() call to make it a bit clearer that
the variable must be initialized before mapping_level() gets called.

This change can also avoid calling kvm_vcpu_gfn_to_memslot() when
!check_hugepage_cache_consistency() check in tdp_page_fault() forces
page table level mapping.

Signed-off-by: Takuya Yoshikawa 
---
 arch/x86/kvm/mmu.c | 7 ---
 arch/x86/kvm/paging_tmpl.h | 2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index dd2a7c6..7d85bca 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -886,10 +886,11 @@ static int mapping_level(struct kvm_vcpu *vcpu, gfn_t 
large_gfn,
int host_level, level, max_level;
struct kvm_memory_slot *slot;
 
-   slot = kvm_vcpu_gfn_to_memslot(vcpu, large_gfn);
+   if (unlikely(*force_pt_level))
+   return PT_PAGE_TABLE_LEVEL;
 
-   if (likely(!*force_pt_level))
-   *force_pt_level = !memslot_valid_for_gpte(slot, true);
+   slot = kvm_vcpu_gfn_to_memslot(vcpu, large_gfn);
+   *force_pt_level = !memslot_valid_for_gpte(slot, true);
if (unlikely(*force_pt_level))
return PT_PAGE_TABLE_LEVEL;
 
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index bf39d0f..b41faa9 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -698,7 +698,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, gva_t 
addr, u32 error_code,
int r;
pfn_t pfn;
int level = PT_PAGE_TABLE_LEVEL;
-   bool force_pt_level;
+   bool force_pt_level = false;
unsigned long mmu_seq;
bool map_writable, is_self_change_mapping;
 
-- 
2.1.0
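
For readers following along, the control flow after this patch can be
modelled in isolation. This is a simplified user-space sketch, not the
kernel code: mapping_level_model(), page_fault_model(), slot_valid and the
lookups counter are invented here, and the two enum values merely stand in
for PT_PAGE_TABLE_LEVEL and some larger mapping level.

```c
#include <assert.h>
#include <stdbool.h>

enum { MODEL_PT_LEVEL = 1, MODEL_LARGE_LEVEL = 2 };   /* hypothetical levels */

static int mapping_level_model(bool *force_pt_level, bool slot_valid,
                               int *lookups)
{
    /* The caller already forced page-table level: skip the slot lookup. */
    if (*force_pt_level)
        return MODEL_PT_LEVEL;

    (*lookups)++;                      /* models kvm_vcpu_gfn_to_memslot() */
    *force_pt_level = !slot_valid;     /* models !memslot_valid_for_gpte() */
    if (*force_pt_level)
        return MODEL_PT_LEVEL;

    return MODEL_LARGE_LEVEL;
}

static int page_fault_model(bool slot_valid, int *lookups)
{
    bool force_pt_level = false;       /* the initialization the patch adds */

    return mapping_level_model(&force_pt_level, slot_valid, lookups);
}
```

Because the first check now reads *force_pt_level before anything writes it,
a caller that leaves the variable uninitialized may take the early return
arbitrarily, which is the failure mode the commit message describes.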



Re: [PATCH v4 28/33] nvdimm acpi: support DSM_FUN_IMPLEMENTED function

2015-10-18 Thread Xiao Guangrong



On 10/19/2015 02:05 AM, Michael S. Tsirkin wrote:

On Mon, Oct 19, 2015 at 08:54:14AM +0800, Xiao Guangrong wrote:

_DSM is defined in ACPI 6.0: 9.14.1 _DSM (Device Specific Method)

Function 0 is a query function. We do not support any function on the root
device and only 3 functions are supported for NVDIMM devices:
DSM_DEV_FUN_NAMESPACE_LABEL_SIZE, DSM_DEV_FUN_GET_NAMESPACE_LABEL_DATA and
DSM_DEV_FUN_SET_NAMESPACE_LABEL_DATA, which means we currently only allow
access to the device's Label Namespace

Signed-off-by: Xiao Guangrong 
---
  hw/acpi/nvdimm.c | 184 ++-
  1 file changed, 182 insertions(+), 2 deletions(-)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index b211b8b..37fea1c 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -260,6 +260,22 @@ static uint32_t nvdimm_slot_to_dcr_index(int slot)
  return nvdimm_slot_to_spa_index(slot) + 1;
  }

+static NVDIMMDevice
+*nvdimm_get_device_by_handle(GSList *list, uint32_t handle)
+{
+for (; list; list = list->next) {
+NVDIMMDevice *nvdimm = list->data;
+int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
+   NULL);
+
+if (nvdimm_slot_to_handle(slot) == handle) {
+return nvdimm;
+}
+}
+
+return NULL;
+}
+
  /*
   * Please refer to ACPI 6.0: 5.2.25.1 System Physical Address Range
   * Structure
@@ -411,6 +427,60 @@ static void nvdimm_build_nfit(GArray *structures, GArray 
*table_offsets,
  /* detailed _DSM design please refer to docs/specs/acpi_nvdimm.txt */
  #define NOTIFY_VALUE  0x99


Again, please prefix everything consistently.


Okay, will do. Sorry, I missed it.





+enum {
+DSM_FUN_IMPLEMENTED = 0,
+
+/* NVDIMM Root Device Functions */
+DSM_ROOT_DEV_FUN_ARS_CAP = 1,
+DSM_ROOT_DEV_FUN_ARS_START = 2,
+DSM_ROOT_DEV_FUN_ARS_QUERY = 3,
+
+/* NVDIMM Device (non-root) Functions */
+DSM_DEV_FUN_SMART = 1,
+DSM_DEV_FUN_SMART_THRESHOLD = 2,
+DSM_DEV_FUN_BLOCK_NVDIMM_FLAGS = 3,
+DSM_DEV_FUN_NAMESPACE_LABEL_SIZE = 4,
+DSM_DEV_FUN_GET_NAMESPACE_LABEL_DATA = 5,
+DSM_DEV_FUN_SET_NAMESPACE_LABEL_DATA = 6,
+DSM_DEV_FUN_VENDOR_EFFECT_LOG_SIZE = 7,
+DSM_DEV_FUN_GET_VENDOR_EFFECT_LOG = 8,
+DSM_DEV_FUN_VENDOR_SPECIFIC = 9,
+};


Does FUN stand for "function"? FUNC or FN is probably better.



Yes.


Please list exact names as they appear in spec so
they can be searched for.


The spec reference was at where this _FUN_ is used, eg:

/*
 * Please refer to DSM specification 4.4.1 Get Namespace Label Size
 * (Function Index 4).
 *
 * It gets the size of Namespace Label data area and the max data size
 * that Get/Set Namespace Label Data functions can transfer.
 */
static void nvdimm_dsm_func_label_size(NVDIMMDevice *nvdimm, GArray *out)

I will follow your ‘single use’ comments below, these definitions will
be dropped, the code will be like this:

switch (function) {
case 4 /* DSM Spec 4.4.1 Get Namespace Label Size. */:
    nvdimm_dsm_func_label_size();
    break;
case ...
...
}






+
+enum {
+/* Common return status codes. */
+DSM_STATUS_SUCCESS = 0,   /* Success */
+DSM_STATUS_NOT_SUPPORTED = 1, /* Not Supported */
+
+/* NVDIMM Root Device _DSM function return status codes*/
+DSM_ROOT_DEV_STATUS_INVALID_PARAS = 2,/* Invalid Input Parameters */
+DSM_ROOT_DEV_STATUS_FUNCTION_SPECIFIC_ERROR = 3, /* Function-Specific
+Error */
+
+/* NVDIMM Device (non-root) _DSM function return status codes*/
+DSM_DEV_STATUS_NON_EXISTING_MEM_DEV = 2,  /* Non-Existing Memory Device */
+DSM_DEV_STATUS_INVALID_PARAS = 3, /* Invalid Input Parameters */
+DSM_DEV_STATUS_VENDOR_SPECIFIC_ERROR = 4, /* Vendor Specific Error */
+};
+
+/* Current revision supported by DSM specification is 1. */
+#define DSM_REVISION (1)
+
+/*
+ * please refer to ACPI 6.0: 9.14.1 _DSM (Device Specific Method): Return
+ * Value Information:


Drop "please refer to".


Okay.




+ *   if set to zero, no functions are supported (other than function zero)
+ *   for the specified UUID and Revision ID. If set to one, at least one
+ *   additional function is supported.
+ */
+
+/* do not support any function on root. */
+#define ROOT_SUPPORT_FUN (0ULL)


Needs a name that implies the comment somehow.


+#define DIMM_SUPPORT_FUN ((1 << DSM_FUN_IMPLEMENTED)   \
+   | (1 << DSM_DEV_FUN_NAMESPACE_LABEL_SIZE)  \
+   | (1 << DSM_DEV_FUN_GET_NAMESPACE_LABEL_DATA)  \
+   | (1 << DSM_DEV_FUN_SET_NAMESPACE_LABEL_DATA))
+


I think it's best to just drop these macros.
There's a single point of use - just add a comment there
explaining what does it mean.


Okay. Good to me.
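
As a sketch of what that single point of use could look like: the
function-0 reply is a bitmap in which bit N set means function N is
supported. The helper names below are hypothetical; the function indexes
4, 5 and 6 are the ones given in the enum of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bitmap returned by _DSM function 0 on an NVDIMM. */
static uint64_t nvdimm_dsm_supported_funcs(void)
{
    return (1ULL << 0)    /* function 0 itself, the query function */
         | (1ULL << 4)    /* Get Namespace Label Size */
         | (1ULL << 5)    /* Get Namespace Label Data */
         | (1ULL << 6);   /* Set Namespace Label Data */
}

/* Hypothetical helper: test whether function `func` is set in `mask`. */
static int nvdimm_dsm_func_supported(uint64_t mask, unsigned func)
{
    return func < 64 && ((mask >> func) & 1);
}
```

Spelling the bits out next to their spec names keeps the one place that
builds the mask searchable without a separate block of macros.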


You will be able to drop all _FUN_ macros too.


Yes, it's good for code reducti

Re: [PATCH v4 27/33] nvdimm acpi: save arg3 for NVDIMM device _DSM method

2015-10-18 Thread Xiao Guangrong



On 10/19/2015 01:16 AM, Michael S. Tsirkin wrote:

On Mon, Oct 19, 2015 at 08:54:13AM +0800, Xiao Guangrong wrote:

Check if the input Arg3 is valid then store it into dsm_in if needed

Signed-off-by: Xiao Guangrong 
---
  hw/acpi/nvdimm.c | 19 +++
  1 file changed, 19 insertions(+)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 7e99889..b211b8b 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -624,10 +624,29 @@ static void nvdimm_build_acpi_devices(NVDIMMState *state, 
GSList *device_list,

  method = aml_method_serialized("NCAL", 4);
  {
+Aml *ifctx;
+
  aml_append(method, aml_store(aml_arg(0), aml_name("HDLE")));
  aml_append(method, aml_store(aml_arg(1), aml_name("REVS")));
  aml_append(method, aml_store(aml_arg(2), aml_name("FUNC")));

+/* Arg3 is passed as Package and it has one element? */
+ifctx = aml_if(aml_and(aml_equal(aml_object_type(aml_arg(3)),
+ aml_int(4)),
+   aml_equal(aml_sizeof(aml_arg(3)),
+ aml_int(1))));
+{
+/* Local0 = Index(Arg3, 0) */
+aml_append(ifctx, aml_store(aml_index(aml_arg(3), aml_int(0)),
+aml_local(0)));
+/* Local3 = DeRefOf(Local0) */
+aml_append(ifctx, aml_store(aml_derefof(aml_local(0)),
+aml_local(3)));
+/* ARG3 = Local3 */
+aml_append(ifctx, aml_store(aml_local(3), aml_name("ARG3")));
+}
+aml_append(method, ifctx);
+
  aml_append(method, aml_store(aml_int(NOTIFY_VALUE), 
aml_name("NOTI")));

  aml_append(method, aml_store(aml_name("RLEN"), aml_local(6)));


I commented on this patch on v3.
It doesn't look like this was addressed.



Ah... I see no one commented on this patch ([PATCH v3 26/32] nvdimm: save arg3
for NVDIMM device _DSM method) on v3.

Do you mean we need more and better comments to explain arg3? Or anything else?




Re: [PATCH v4 23/33] nvdimm acpi: init the address region used by NVDIMM ACPI

2015-10-18 Thread Xiao Guangrong



On 10/19/2015 01:15 AM, Michael S. Tsirkin wrote:

On Mon, Oct 19, 2015 at 08:54:09AM +0800, Xiao Guangrong wrote:

+typedef struct nfit_spa nfit_spa;


There are still multiple coding style violations.  Pls fix all code to
match coding style.  I commented on this before.



Er, I thought what you disliked was just the function names... Now I know I
misunderstood, and sorry for wasting your time on it.

I will rename all structs in this patch:
struct nfit_spa -> struct nvdimm_nfit_spa
struct nfit_memdev -> struct nvdimm_nfit_memdev
struct nfit_dcr nfit_dcr -> struct nvdimm_nfit_dcr nfit_dcr;

and also copy the spec reference from where these structs are used to
where they are defined.




Re: [RFC PATCH 2/2] block: enable dax for raw block devices

2015-10-18 Thread Ross Zwisler
On Fri, Oct 16, 2015 at 08:49:41PM -0400, Dan Williams wrote:
> If an application wants exclusive access to all of the persistent memory
> provided by an NVDIMM namespace it can use this raw-block-dax facility
> to forgo establishing a filesystem.  This capability is targeted
> primarily to hypervisors wanting to provision persistent memory for
> guests.
> 
> Cc: Jeff Moyer 
> Cc: Christoph Hellwig 
> Cc: Al Viro 
> Cc: Andrew Morton 
> Cc: Ross Zwisler 
> Cc: Xiao Guangrong 
> Signed-off-by: Dan Williams 
> ---
> 
> Only lightly tested so far, but seems to work, is the shortest path to a
> DAX mapping, and makes it easier to trigger the pmd_fault path (no
> fs-block-allocator interactions).
> 
>  fs/block_dev.c |   84 
> +++-
>  1 file changed, 83 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 5277dd83d254..498b71455570 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -1687,13 +1687,95 @@ static const struct address_space_operations 
> def_blk_aops = {
>   .is_dirty_writeback = buffer_check_dirty_writeback,
>  };
>  
> +#ifdef CONFIG_FS_DAX
> +static int blkdev_dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> +{
> + struct inode *bd_inode = file_bd_inode(vma->vm_file);
> + struct block_device *bdev = I_BDEV(bd_inode);
> + int ret;
> +
> + mutex_lock(&bdev->bd_mutex);
> + ret = __dax_fault(vma, vmf, blkdev_get_block, NULL);
> + mutex_unlock(&bdev->bd_mutex);
> +
> + return ret;
> +}

This all looks very straightforward.  The one comment I have is that this
code is missing the calls to sb_[start|end]_pagefault(), and to
file_update_time() that are found in ext[24]/xfs and the generic fault code.

The previous version of this code used the generic fault implementation, and
was calling these functions via filemap_page_mkwrite().

It is possible that they were omitted for a reason - does protection from
filesystem freezing still make sense when talking with a raw block device?
For example, if that block device *has* a mounted filesystem on it that is
frozen, does sb_start_pagefault() prevent against page faults on the raw
device that try and make something writable?  

In any case, the presence of them in filemap_page_mkwrite() tells me that they
at least aren't harmful, and I wanted to make sure they weren't needed before
leaving them out.  If the omission was intentional, should we add a comment to
explain why they are missing?


Re: [RFT - PATCH v2 0/2] KVM/arm64: add fp/simd lazy switch support

2015-10-18 Thread Christoffer Dall
On Mon, Oct 12, 2015 at 09:29:23AM -0700, Mario Smarduch wrote:
> Hi Christoffer, Marc -
>   I just threw this test your way without any explanation.

I'm confused.  Did you send me something somewhere already?

> 
> The test loops, does fp arithmetic and checks the truncated result.
> It could be a little more dynamic have an initial run to
> get the sum to compare against while looping, different fp
> hardware may come up with a different sum, but truncation is
> to 5'th decimal point.
> 
> The rationale is that if there is any fp/simd corruption
> one of these runs should fail. I think most likely scenario
> for that is a world switch in midst of fp operation. I've
> instrumented (basically add some tracing to vcpu_put()) and
> validated vcpu_put gets called thousands of time (for v7,v8)
> for an over night test running two guests/host crunching
> fp operations.
> 
> Other than that, I'm not sure how to really catch any problems
> with the patches applied. Obviously this is a huge issue if this has
> any problems. If you or Marc have any other ideas I'd be happy
> to enhance the test.

I think it's important to run two VMs at the same time, each with some
floating-point work, and then run some floating point on the host at the
same time.

You can make that even more interesting by doing 32-bit guests at the
same time as well.

I believe Marc was running Paranoia
(http://www.netlib.org/paranoia/paranoia.c) to test the last lazy
series.

Thanks,
-Christoffer


Re: [PATCH] KVM/arm: kernel low level debug suport for ARM32 virtual platforms

2015-10-18 Thread Christoffer Dall
On Fri, Oct 16, 2015 at 08:19:59PM -0700, Mario Smarduch wrote:
> When booting a VM using QEMU or Kvmtool there are no clear ways to
> enable low level debugging for these virtual platforms. Some menu port
> choices are not supported by the virtual platforms at all. And there is no
> help on the location of physical and virtual addresses for the ports.
> This may lead to wrong debug port and a frozen VM with a blank screen.
> 
> This patch adds menu selections for QEMU and Kvmtool virtual platforms for
> low level kernel print debugging. Help section displays port physical and
> virtual addresses.
> 
> ARM reference models use the MIDR register to run-time select UART port
> address (for ARCH_VEXPRESS) based on A9 or A15 part numbers. Looked for a
> similar approach but couldn't find a way to differentiate between virtual
> platforms, something like a platform register.
> 
> Signed-off-by: Mario Smarduch 
> ---

I can't think of any better way to do this and I would be happy to see
this functionality in Linux, so:

Acked-by: Christoffer Dall 

>  arch/arm/Kconfig.debug | 22 ++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/arch/arm/Kconfig.debug b/arch/arm/Kconfig.debug
> index a2e16f9..d126bd4 100644
> --- a/arch/arm/Kconfig.debug
> +++ b/arch/arm/Kconfig.debug
> @@ -1155,6 +1155,28 @@ choice
> This option selects UART0 on VIA/Wondermedia System-on-a-chip
> devices, including VT8500, WM8505, WM8650 and WM8850.
>  
> + config DEBUG_VIRT_UART_QEMU
> + bool "Kernel low-level debugging on QEMU Virtual Platform"
> + depends on ARCH_VIRT
> + select DEBUG_UART_PL01X
> + help
> +   Say Y here if you want the debug print routines to direct
> +   their output to PL011 UART port on QEMU Virtual Platform.
> +   Appropriate address values are:
> + PHYS    VIRT
> + 0x900   0xf809
> +
> + config DEBUG_VIRT_UART_KVMTOOL
> + bool "Kernel low-level debugging on Kvmtool Virtual Platform"
> + depends on ARCH_VIRT
> + select DEBUG_UART_8250
> + help
> +   Say Y here if you want the debug print routines to direct
> +   their output to 8250 UART port on Kvmtool Virtual
> +   Platform. Appropriate address values are:
> + PHYS    VIRT
> + 0x3f8   0xf80903f8
> +
>   config DEBUG_ICEDCC
>   bool "Kernel low-level debugging via EmbeddedICE DCC channel"
>   help
> -- 
> 1.9.1
> 


Re: [Qemu-devel] [kvm-unit-tests PATCHv4] ARM PMU tests

2015-10-18 Thread Andrew Jones
On Mon, Oct 12, 2015 at 11:07:47AM -0400, Christopher Covington wrote:
> Changes from v3 in response to Drew's suggestions:
> 
> * Improved pmu_data / PMCR fields and usage
> * Straightened out awkward conditionals
> * Added 32-bit support
> * Styling enhancements
> * Deferred -icount testing to later patch
> 
>

Sorry I was slow to review this version. Also, just FYI, I'll be on
vacation for a week, so I'll probably be slow to review the next
version too :-) Anyway, thanks for the patches, and thanks for your
patience.

drew 


Re: [Qemu-devel] [kvm-unit-tests PATCHv4 3/3] arm: pmu: Add CPI checking

2015-10-18 Thread Andrew Jones
On Mon, Oct 12, 2015 at 11:07:50AM -0400, Christopher Covington wrote:
> Calculate the numbers of cycles per instruction (CPI) implied by ARM
> PMU cycle counter values. The code includes a strict checking facility
> intended for the -icount option in TCG mode but it is not yet enabled
> in the configuration file. Enabling it must wait on infrastructure
> improvements which allow for different tests to be run on TCG versus
> KVM.
> 
> Signed-off-by: Christopher Covington 
> ---
>  arm/pmu.c | 91 ++-
>  1 file changed, 90 insertions(+), 1 deletion(-)
> 
> diff --git a/arm/pmu.c b/arm/pmu.c
> index ae81970..169c36c 100644
> --- a/arm/pmu.c
> +++ b/arm/pmu.c
> @@ -37,6 +37,18 @@ static inline unsigned long get_pmccntr(void)
>   asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r" (cycles));
>   return cycles;
>  }
> +
> +static inline void loop(int i, uint32_t pmcr)
> +{
> + uint32_t z = 0;
> +
> + asm volatile(
> + "   mcr p15, 0, %[pmcr], c9, c12, 0\n"
> + "   1: subs %[i], %[i], #1\n"
> + "   bgt 1b\n"
> + "   mcr p15, 0, %[z], c9, c12, 0\n"
> + : [i] "+r" (i) : [pmcr] "r" (pmcr), [z] "r" (z) : "cc");

Assembly is always ugly, but we can do a bit better formatting with tabs

	asm volatile(
	"	mcr	p15, 0, %[pmcr], c9, c12, 0\n"
	"1:	subs	%[i], %[i], #1\n"
	"	bgt	1b\n"
	"	mcr	p15, 0, %[z], c9, c12, 0\n"
	: [i] "+r" (i)
	: [pmcr] "r" (pmcr), [z] "r" (z)
	: "cc");

Actually it can be even cleaner because you already created set_pmcr()

	set_pmcr(pmcr);

	asm volatile(
	"1:	subs	%0, %0, #1\n"
	"	bgt	1b\n"
	: "+r" (i) : : "cc");

	set_pmcr(0);


> +}
>  #elif defined(__aarch64__)
>  static inline uint32_t get_pmcr(void)
>  {
> @@ -58,6 +70,16 @@ static inline unsigned long get_pmccntr(void)
>   asm volatile("mrs %0, pmccntr_el0" : "=r" (cycles));
>   return cycles;
>  }
> +
> +static inline void loop(int i, uint32_t pmcr)
> +{
> + asm volatile(
> + "   msr pmcr_el0, %[pmcr]\n"
> + "   1: subs %[i], %[i], #1\n"
> + "   b.gt 1b\n"
> + "   msr pmcr_el0, xzr\n"
> + : [i] "+r" (i) : [pmcr] "r" (pmcr) : "cc");

same comment as above

> +}
>  #endif
>  
>  struct pmu_data {
> @@ -125,12 +147,79 @@ static bool check_cycles_increase(void)
>   return true;
>  }
>  
> -int main(void)
> +/*
> + * Execute a known number of guest instructions. Only odd instruction counts
> + * greater than or equal to 3 are supported by the in-line assembly code. The

Not all odd counts, right? But rather all multiples of 3? IIUC this is because
the loop is two instructions (sub + branch), and then the clearing of the pmcr
register counts as the 3rd?
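
Working through the arithmetic under the reading given here (two instructions per loop iteration, plus the final PMCR write), every odd count from 3 up appears to be reachable, not only multiples of 3. A small sketch of that calculation; the helper names are hypothetical, and the assert mirrors the one in measure_instrs():

```c
#include <assert.h>

/* Iteration count that measure_instrs() passes to loop() for a target count. */
static int loop_iterations(int num)
{
    assert(num >= 3 && (num - 1) % 2 == 0);
    return (num - 1) / 2;
}

/*
 * Instructions counted while the PMU is enabled, assuming each iteration is
 * subs + branch and the final PMCR write is the last counted instruction.
 */
static int instrs_counted(int iterations)
{
    return 2 * iterations + 1;
}
```

For example, a request for 35 instructions gives 17 iterations, and 2 * 17 + 1 is 35 again.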

> + * control register (PMCR_EL0) is initialized with the provided value (allowing
> + * for example for the cycle counter or event counters to be reset). At the end
> + * of the exact instruction loop, zero is written to PMCR_EL0 to disable
> + * counting, allowing the cycle counter or event counters to be read at the
> + * leisure of the calling code.
> + */
> +static void measure_instrs(int num, uint32_t pmcr)
> +{
> + int i = (num - 1) / 2;
> +
> + assert(num >= 3 && ((num - 1) % 2 == 0));
> + loop(i, pmcr);
> +}
> +
> +/*
> + * Measure cycle counts for various known instruction counts. Ensure that the
> + * cycle counter progresses (similar to check_cycles_increase() but with more
> + * instructions and using reset and stop controls). If supplied a positive,
> + * nonzero CPI parameter, also strictly check that every measurement matches
> + * it. Strict CPI checking is used to test -icount mode.
> + */
> +static bool check_cpi(int cpi)
> +{
> + struct pmu_data pmu;

memset(&pmu, 0, sizeof(pmu));

> +
> + pmu.cycle_counter_reset = 1;
> + pmu.enable = 1;
> +
> + if (cpi > 0)
> + printf("Checking for CPI=%d.\n", cpi);
> + printf("instrs : cycles0 cycles1 ...\n");
> +
> + for (int i = 3; i < 300; i += 32) {
> + int avg, sum = 0;
> +
> + printf("%d :", i);
> + for (int j = 0; j < NR_SAMPLES; j++) {
> + int cycles;
> +
> + measure_instrs(i, pmu.pmcr_el0);
> + cycles = get_pmccntr();
> + printf(" %d", cycles);
> +
> + if (!cycles || (cpi > 0 && cycles != i * cpi)) {
> + printf("\n");
> + return false;
> + }
> +
> + sum += cycles;
> + }
> + avg = sum / NR_SAMPLES;
> + printf(" sum=%d avg=%d avg_ipc=%d avg_cpi=%d\n",
> + sum, avg, i / avg, avg / i);
> + }
> +
> + return true;
> +}
> +

Re: [Qemu-devel] [kvm-unit-tests PATCHv4 2/3] arm: pmu: Check cycle count increases

2015-10-18 Thread Andrew Jones
On Mon, Oct 12, 2015 at 11:07:49AM -0400, Christopher Covington wrote:
> Ensure that reads of the PMCCNTR_EL0 are monotonically increasing,
> even for the smallest delta of two subsequent reads.
> 
> Signed-off-by: Christopher Covington 
> ---
>  arm/pmu.c | 54 ++
>  1 file changed, 54 insertions(+)
> 
> diff --git a/arm/pmu.c b/arm/pmu.c
> index 42d0ee1..ae81970 100644
> --- a/arm/pmu.c
> +++ b/arm/pmu.c
> @@ -14,6 +14,8 @@
>   */
>  #include "libcflat.h"
>  
> +#define NR_SAMPLES 10
> +
>  #if defined(__arm__)
>  static inline uint32_t get_pmcr(void)
>  {
> @@ -22,6 +24,19 @@ static inline uint32_t get_pmcr(void)
>   asm volatile("mrc p15, 0, %0, c9, c12, 0" : "=r" (ret));
>   return ret;
>  }
> +
> +static inline void set_pmcr(uint32_t pmcr)
> +{
> + asm volatile("mcr p15, 0, %0, c9, c12, 0" : : "r" (pmcr));
> +}
> +
> +static inline unsigned long get_pmccntr(void)
> +{
> + unsigned long cycles;
> +
> + asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r" (cycles));
> + return cycles;

This is a 64-bit register, even for arm, but I guess there's no
need to access more than 32-bits using mrrc?

> +}
>  #elif defined(__aarch64__)
>  static inline uint32_t get_pmcr(void)
>  {
> @@ -30,6 +45,19 @@ static inline uint32_t get_pmcr(void)
>   asm volatile("mrs %0, pmcr_el0" : "=r" (ret));
>   return ret;
>  }
> +
> +static inline void set_pmcr(uint32_t pmcr)
> +{
> + asm volatile("msr pmcr_el0, %0" : : "r" (pmcr));
> +}
> +
> +static inline unsigned long get_pmccntr(void)
> +{
> + unsigned long cycles;
> +
> + asm volatile("mrs %0, pmccntr_el0" : "=r" (cycles));
> + return cycles;
> +}
>  #endif
>  
>  struct pmu_data {
> @@ -72,11 +100,37 @@ static bool check_pmcr(void)
>   return pmu.implementer != 0;
>  }
>  
> +/*
> + * Ensure that the cycle counter progresses between back-to-back reads.
> + */
> +static bool check_cycles_increase(void)
> +{
> + struct pmu_data pmu;
> +
> + pmu.enable = 1;
> + set_pmcr(pmu.pmcr_el0);

You need to zero pmu out first, it's just random stack junk except
for 'enable' definitely being 1 at this point. 
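
A minimal sketch of the zeroing, using the pmu_data layout from patch 1/3. The build_pmcr() helper is hypothetical. Bit-field ordering is implementation-defined, so this assumes the usual GCC/Clang little-endian layout in which the first declared bit-field is the least significant bit.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct pmu_data {
    union {
        uint32_t pmcr_el0;
        struct {
            uint32_t enable:1;
            uint32_t event_counter_reset:1;
            uint32_t cycle_counter_reset:1;
            uint32_t cycle_counter_clock_divider:1;
            uint32_t event_counter_export:1;
            uint32_t cycle_counter_disable_when_prohibited:1;
            uint32_t cycle_counter_long:1;
            uint32_t reserved:4;
            uint32_t counters:5;
            uint32_t identification_code:8;
            uint32_t implementer:8;
        };
    };
};

/* Build a PMCR value from a zeroed struct so no stack junk leaks into it. */
static uint32_t build_pmcr(bool enable, bool cycle_counter_reset)
{
    struct pmu_data pmu;

    memset(&pmu, 0, sizeof(pmu));
    pmu.enable = enable;
    pmu.cycle_counter_reset = cycle_counter_reset;
    return pmu.pmcr_el0;
}
```

Without the memset, every field other than the ones assigned would carry whatever happened to be on the stack.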

> +
> + for (int i = 0; i < NR_SAMPLES; i++) {
> + unsigned long a, b;
> +
> + a = get_pmccntr();
> + b = get_pmccntr();
> +
> + if (a >= b) {
> + printf("Read %ld then %ld.\n", a, b);
> + return false;
> + }
> + }
> +
> + return true;
> +}
> +
>  int main(void)
>  {
>   report_prefix_push("pmu");
>  
>   report("Control register", check_pmcr());
> + report("Monotonically increasing cycle count", check_cycles_increase());
>  
>   return report_summary();
>  }
> -- 
> Qualcomm Innovation Center, Inc.
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
> 
> 


Re: [PATCH v4 28/33] nvdimm acpi: support DSM_FUN_IMPLEMENTED function

2015-10-18 Thread Michael S. Tsirkin
On Mon, Oct 19, 2015 at 08:54:14AM +0800, Xiao Guangrong wrote:
> _DSM is defined in ACPI 6.0: 9.14.1 _DSM (Device Specific Method)
> 
> Function 0 is a query function. We do not support any function on the root
> device, and only 3 functions are supported for the NVDIMM device:
> DSM_DEV_FUN_NAMESPACE_LABEL_SIZE, DSM_DEV_FUN_GET_NAMESPACE_LABEL_DATA and
> DSM_DEV_FUN_SET_NAMESPACE_LABEL_DATA. That means we currently only allow
> access to the device's Label Namespace.
> 
> Signed-off-by: Xiao Guangrong 
> ---
>  hw/acpi/nvdimm.c | 184 ++-
>  1 file changed, 182 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
> index b211b8b..37fea1c 100644
> --- a/hw/acpi/nvdimm.c
> +++ b/hw/acpi/nvdimm.c
> @@ -260,6 +260,22 @@ static uint32_t nvdimm_slot_to_dcr_index(int slot)
>  return nvdimm_slot_to_spa_index(slot) + 1;
>  }
>  
> +static NVDIMMDevice
> +*nvdimm_get_device_by_handle(GSList *list, uint32_t handle)
> +{
> +for (; list; list = list->next) {
> +NVDIMMDevice *nvdimm = list->data;
> +int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
> +   NULL);
> +
> +if (nvdimm_slot_to_handle(slot) == handle) {
> +return nvdimm;
> +}
> +}
> +
> +return NULL;
> +}
> +
>  /*
>   * Please refer to ACPI 6.0: 5.2.25.1 System Physical Address Range
>   * Structure
> @@ -411,6 +427,60 @@ static void nvdimm_build_nfit(GArray *structures, GArray *table_offsets,
>  /* detailed _DSM design please refer to docs/specs/acpi_nvdimm.txt */
>  #define NOTIFY_VALUE  0x99

Again, please prefix everything consistently.

>  
> +enum {
> +DSM_FUN_IMPLEMENTED = 0,
> +
> +/* NVDIMM Root Device Functions */
> +DSM_ROOT_DEV_FUN_ARS_CAP = 1,
> +DSM_ROOT_DEV_FUN_ARS_START = 2,
> +DSM_ROOT_DEV_FUN_ARS_QUERY = 3,
> +
> +/* NVDIMM Device (non-root) Functions */
> +DSM_DEV_FUN_SMART = 1,
> +DSM_DEV_FUN_SMART_THRESHOLD = 2,
> +DSM_DEV_FUN_BLOCK_NVDIMM_FLAGS = 3,
> +DSM_DEV_FUN_NAMESPACE_LABEL_SIZE = 4,
> +DSM_DEV_FUN_GET_NAMESPACE_LABEL_DATA = 5,
> +DSM_DEV_FUN_SET_NAMESPACE_LABEL_DATA = 6,
> +DSM_DEV_FUN_VENDOR_EFFECT_LOG_SIZE = 7,
> +DSM_DEV_FUN_GET_VENDOR_EFFECT_LOG = 8,
> +DSM_DEV_FUN_VENDOR_SPECIFIC = 9,
> +};

Does FUN stand for "function"? FUNC or FN is probably better.

Please list exact names as they appear in spec so
they can be searched for.



> +
> +enum {
> +/* Common return status codes. */
> +DSM_STATUS_SUCCESS = 0,   /* Success */
> +DSM_STATUS_NOT_SUPPORTED = 1, /* Not Supported */
> +
> +/* NVDIMM Root Device _DSM function return status codes*/
> +DSM_ROOT_DEV_STATUS_INVALID_PARAS = 2,/* Invalid Input Parameters */
> +DSM_ROOT_DEV_STATUS_FUNCTION_SPECIFIC_ERROR = 3, /* Function-Specific
> +Error */
> +
> +/* NVDIMM Device (non-root) _DSM function return status codes*/
> +DSM_DEV_STATUS_NON_EXISTING_MEM_DEV = 2,  /* Non-Existing Memory Device */
> +DSM_DEV_STATUS_INVALID_PARAS = 3, /* Invalid Input Parameters */
> +DSM_DEV_STATUS_VENDOR_SPECIFIC_ERROR = 4, /* Vendor Specific Error */
> +};
> +
> +/* Current revision supported by DSM specification is 1. */
> +#define DSM_REVISION (1)
> +
> +/*
> + * please refer to ACPI 6.0: 9.14.1 _DSM (Device Specific Method): Return
> + * Value Information:

Drop "please refer to".

> + *   if set to zero, no functions are supported (other than function zero)
> + *   for the specified UUID and Revision ID. If set to one, at least one
> + *   additional function is supported.
> + */
> +
> +/* do not support any function on root. */
> +#define ROOT_SUPPORT_FUN (0ULL)

Needs a name that implies the comment somehow.

> +#define DIMM_SUPPORT_FUN ((1 << DSM_FUN_IMPLEMENTED)   \
> +   | (1 << DSM_DEV_FUN_NAMESPACE_LABEL_SIZE)  \
> +   | (1 << DSM_DEV_FUN_GET_NAMESPACE_LABEL_DATA)  \
> +   | (1 << DSM_DEV_FUN_SET_NAMESPACE_LABEL_DATA))
> +

I think it's best to just drop these macros.
There's a single point of use - just add a comment there
explaining what it means.
You will be able to drop all _FUN_ macros too.
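
For reference, a sketch of what that single use site computes. _DSM function 0 returns a bitmask in which bit N set means function index N is supported. The names below are hypothetical; the indices (0, 4, 5, 6) are the ones used in the quoted enum.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Supported-function bitmask for an NVDIMM device:
 * bit 0: function 0 (query) itself,
 * bit 4: Get Namespace Label Size,
 * bit 5: Get Namespace Label Data,
 * bit 6: Set Namespace Label Data.
 */
static uint64_t nvdimm_dsm_supported_mask(void)
{
    return (1ULL << 0) | (1ULL << 4) | (1ULL << 5) | (1ULL << 6);
}

/* True if the given function index is advertised in the mask. */
static bool dsm_function_supported(uint64_t mask, unsigned int index)
{
    return index < 64 && ((mask >> index) & 1);
}
```

The resulting mask is 0x71, so a guest querying function 0 would see only the label functions advertised.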


>  struct dsm_in {
>  uint32_t handle;
>  uint32_t revision;
> @@ -420,6 +490,11 @@ struct dsm_in {
>  } QEMU_PACKED;
>  typedef struct dsm_in dsm_in;
>  
> +struct cmd_out_implemented {
> +uint64_t cmd_list;
> +};
> +typedef struct cmd_out_implemented cmd_out_implemented;
> +
>  struct dsm_out {
>  /* the size of buffer filled by QEMU. */
>  uint32_t len;
> @@ -434,12 +509,115 @@ nvdimm_dsm_read(void *opaque, hwaddr addr, unsigned size)
>  return 0;
>  }
>  
> +static void nvdimm_dsm_write_status(GArray *out, uint32_t status)
> +{
> +/* status locates in the first 4 byt

Re: [Qemu-devel] [kvm-unit-tests PATCHv4 1/3] arm: Add PMU test

2015-10-18 Thread Andrew Jones
On Mon, Oct 12, 2015 at 11:07:48AM -0400, Christopher Covington wrote:
> Beginning with a simple sanity check of the control register, add
> a unit test for the ARM Performance Monitors Unit (PMU).
> 
> Signed-off-by: Christopher Covington 
> ---
>  arm/pmu.c| 82 
> 
>  arm/unittests.cfg|  5 +++
>  config/config-arm-common.mak |  4 ++-
>  3 files changed, 90 insertions(+), 1 deletion(-)
>  create mode 100644 arm/pmu.c
> 
> diff --git a/arm/pmu.c b/arm/pmu.c
> new file mode 100644
> index 000..42d0ee1
> --- /dev/null
> +++ b/arm/pmu.c
> @@ -0,0 +1,82 @@
> +/*
> + * Test the ARM Performance Monitors Unit (PMU).
> + *
> + * Copyright 2015 The Linux Foundation. All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU Lesser General Public License version 2.1 and
> + * only version 2.1 as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License
> + * for more details.
> + */
> +#include "libcflat.h"
> +
> +#if defined(__arm__)
> +static inline uint32_t get_pmcr(void)
> +{
> + uint32_t ret;
> +
> + asm volatile("mrc p15, 0, %0, c9, c12, 0" : "=r" (ret));
> + return ret;
> +}
> +#elif defined(__aarch64__)
> +static inline uint32_t get_pmcr(void)
> +{
> + uint32_t ret;
> +
> + asm volatile("mrs %0, pmcr_el0" : "=r" (ret));
> + return ret;
> +}
> +#endif
> +
> +struct pmu_data {
> + union {
> + uint32_t pmcr_el0;
> + struct {
> + uint32_t enable:1;
> + uint32_t event_counter_reset:1;
> + uint32_t cycle_counter_reset:1;
> + uint32_t cycle_counter_clock_divider:1;
> + uint32_t event_counter_export:1;
> + uint32_t cycle_counter_disable_when_prohibited:1;
> + uint32_t cycle_counter_long:1;
> + uint32_t reserved:4;
> + uint32_t counters:5;
> + uint32_t identification_code:8;
> + uint32_t implementer:8;
> + };
> + };
> +};
> +
> +/*
> + * As a simple sanity check on the PMCR_EL0, ensure the implementer field isn't
> + * null. Also print out a couple other interesting fields for diagnostic
> + * purposes. For example, as of fall 2015, QEMU TCG mode doesn't implement
> + * event counters and therefore reports zero event counters, but hopefully
> + * support for at least the instructions event will be added in the future and
> + * the reported number of event counters will become nonzero.
> + */
> +static bool check_pmcr(void)
> +{
> + struct pmu_data pmu;
> +
> + pmu.pmcr_el0 = get_pmcr();
> +
> + printf("PMU implementer: %c\n", pmu.implementer);
> + printf("Identification code: 0x%x\n", pmu.identification_code);
> + printf("Event counters:  %d\n", pmu.counters);
> +
> + return pmu.implementer != 0;
> +}
> +
> +int main(void)
> +{
> + report_prefix_push("pmu");
> +
> + report("Control register", check_pmcr());
> +
> + return report_summary();
> +}
> diff --git a/arm/unittests.cfg b/arm/unittests.cfg
> index e068a0c..fd94adb 100644
> --- a/arm/unittests.cfg
> +++ b/arm/unittests.cfg
> @@ -35,3 +35,8 @@ file = selftest.flat
>  smp = `getconf _NPROCESSORS_CONF`
>  extra_params = -append 'smp'
>  groups = selftest
> +
> +# Test PMU support without -icount
> +[pmu]
> +file = pmu.flat
> +groups = pmu
> diff --git a/config/config-arm-common.mak b/config/config-arm-common.mak
> index 698555d..b34d04c 100644
> --- a/config/config-arm-common.mak
> +++ b/config/config-arm-common.mak
> @@ -11,7 +11,8 @@ endif
>  
>  tests-common = \
>   $(TEST_DIR)/selftest.flat \
> - $(TEST_DIR)/spinlock-test.flat
> + $(TEST_DIR)/spinlock-test.flat \
> + $(TEST_DIR)/pmu.flat
>  
>  all: test_cases
>  
> @@ -70,3 +71,4 @@ test_cases: $(generated_files) $(tests-common) $(tests)
>  
>  $(TEST_DIR)/selftest.elf: $(cstart.o) $(TEST_DIR)/selftest.o
>  $(TEST_DIR)/spinlock-test.elf: $(cstart.o) $(TEST_DIR)/spinlock-test.o
> +$(TEST_DIR)/pmu.elf: $(cstart.o) $(TEST_DIR)/pmu.o
> -- 
> Qualcomm Innovation Center, Inc.
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
> 
>

Reviewed-by: Andrew Jones  


Re: [PATCH v4 27/33] nvdimm acpi: save arg3 for NVDIMM device _DSM method

2015-10-18 Thread Michael S. Tsirkin
On Mon, Oct 19, 2015 at 08:54:13AM +0800, Xiao Guangrong wrote:
> Check if the input Arg3 is valid then store it into dsm_in if needed
> 
> Signed-off-by: Xiao Guangrong 
> ---
>  hw/acpi/nvdimm.c | 19 +++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
> index 7e99889..b211b8b 100644
> --- a/hw/acpi/nvdimm.c
> +++ b/hw/acpi/nvdimm.c
> @@ -624,10 +624,29 @@ static void nvdimm_build_acpi_devices(NVDIMMState *state, GSList *device_list,
>  
>  method = aml_method_serialized("NCAL", 4);
>  {
> +Aml *ifctx;
> +
>  aml_append(method, aml_store(aml_arg(0), aml_name("HDLE")));
>  aml_append(method, aml_store(aml_arg(1), aml_name("REVS")));
>  aml_append(method, aml_store(aml_arg(2), aml_name("FUNC")));
>  
> +/* Arg3 is passed as Package and it has one element? */
> +ifctx = aml_if(aml_and(aml_equal(aml_object_type(aml_arg(3)),
> + aml_int(4)),
> +   aml_equal(aml_sizeof(aml_arg(3)),
> + aml_int(1))));
> +{
> +/* Local0 = Index(Arg3, 0) */
> +aml_append(ifctx, aml_store(aml_index(aml_arg(3), aml_int(0)),
> +aml_local(0)));
> +/* Local3 = DeRefOf(Local0) */
> +aml_append(ifctx, aml_store(aml_derefof(aml_local(0)),
> +aml_local(3)));
> +/* ARG3 = Local3 */
> +aml_append(ifctx, aml_store(aml_local(3), aml_name("ARG3")));
> +}
> +aml_append(method, ifctx);
> +
>  aml_append(method, aml_store(aml_int(NOTIFY_VALUE), aml_name("NOTI")));
>  
>  aml_append(method, aml_store(aml_name("RLEN"), aml_local(6)));

I commented on this patch on v3.
It doesn't look like this was addressed.

> -- 
> 1.8.3.1


Re: [PATCH v4 23/33] nvdimm acpi: init the address region used by NVDIMM ACPI

2015-10-18 Thread Michael S. Tsirkin
On Mon, Oct 19, 2015 at 08:54:09AM +0800, Xiao Guangrong wrote:
> +typedef struct nfit_spa nfit_spa;

There are still multiple coding style violations.  Pls fix all code to
match coding style.  I commented on this before.


[PATCH v4 05/33] acpi: add aml_object_type

2015-10-18 Thread Xiao Guangrong
Implement ObjectType, which is used by the NVDIMM _DSM method in a
later patch

Signed-off-by: Xiao Guangrong 
---
 hw/acpi/aml-build.c | 8 
 include/hw/acpi/aml-build.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index efc06ab..9f792ab 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1178,6 +1178,14 @@ Aml *aml_concatenate(Aml *source1, Aml *source2, Aml *target)
 return var;
 }
 
+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefObjectType */
+Aml *aml_object_type(Aml *object)
+{
+Aml *var = aml_opcode(0x8E /* ObjectTypeOp */);
+aml_append(var, object);
+return var;
+}
+
 void
 build_header(GArray *linker, GArray *table_data,
  AcpiTableHeader *h, const char *sig, int len, uint8_t rev)
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 325782d..5b8a118 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -278,6 +278,7 @@ Aml *aml_derefof(Aml *arg);
 Aml *aml_sizeof(Aml *arg);
 Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name);
 Aml *aml_concatenate(Aml *source1, Aml *source2, Aml *target);
+Aml *aml_object_type(Aml *object);
 
 void
 build_header(GArray *linker, GArray *table_data,
-- 
1.8.3.1
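
A rough sketch of the byte stream such a helper is expected to produce when applied to a method argument. The opcodes 0x8E (ObjectTypeOp) and 0x87 (SizeOfOp) are the ones used in this series; the ArgN encoding (Arg0Op is 0x68, so Arg3 is 0x6B) comes from the ACPI AML encoding tables. The function name and the flat buffer are hypothetical and do not reflect how aml-build.c stores terms internally.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    AML_ARG0_OP        = 0x68, /* Arg0Op; ArgN is encoded as 0x68 + N */
    AML_SIZEOF_OP      = 0x87, /* SizeOfOp */
    AML_OBJECT_TYPE_OP = 0x8E, /* ObjectTypeOp */
};

/*
 * Encode "<opcode> ArgN" into buf, which must hold at least 2 bytes.
 * Returns the number of bytes written.
 */
static size_t aml_encode_unary_on_arg(uint8_t *buf, uint8_t opcode,
                                      unsigned int arg)
{
    buf[0] = opcode;
    buf[1] = (uint8_t)(AML_ARG0_OP + arg);
    return 2;
}
```

Under these assumptions ObjectType(Arg3) encodes as the two bytes 0x8E 0x6B, and SizeOf(Arg3) as 0x87 0x6B.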



[PATCH v4 32/33] nvdimm: allow using whole backend memory as pmem

2015-10-18 Thread Xiao Guangrong
Introduce a parameter named "reserve-label-data". If it is
false, QEMU does not reserve any region on the backend memory
to support label data. This is a 'label-less' NVDIMM device
mode in which Linux uses the whole memory on the device as a
single namespace.

This is useful for users who want to pass through a whole nvdimm
device and make its data completely visible to the guest.

The parameter is false by default.

Signed-off-by: Xiao Guangrong 
---
 hw/acpi/nvdimm.c| 21 +
 hw/mem/nvdimm.c | 37 -
 include/hw/mem/nvdimm.h |  6 ++
 3 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 838a57e..f69bb39 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -647,6 +647,13 @@ nvdimm_dsm_cmd_get_label_data(NVDIMMDevice *nvdimm, dsm_in *in, GArray *out)
 nvdimm_debug("Read Label Data: offset %#x length %#x.\n",
  cmd_in->offset, cmd_in->length);
 
+if (!nvdimm->reserve_label_data) {
+nvdimm_debug("read label request on the device without "
+ "label data reserved.\n");
+status = DSM_STATUS_NOT_SUPPORTED;
+goto exit;
+}
+
 if (nvdimm->label_size < cmd_in->offset + cmd_in->length) {
 nvdimm_debug("position %#x is beyond label data (len = %#lx).\n",
  cmd_in->offset + cmd_in->length, nvdimm->label_size);
@@ -687,6 +694,14 @@ nvdimm_dsm_cmd_set_label_data(NVDIMMDevice *nvdimm, dsm_in *in, GArray *out)
 
 nvdimm_debug("Write Label Data: offset %#x length %#x.\n",
  cmd_in->offset, cmd_in->length);
+
+if (!nvdimm->reserve_label_data) {
+nvdimm_debug("write label request on the device without "
+ "label data reserved.\n");
+status = DSM_STATUS_NOT_SUPPORTED;
+goto exit;
+}
+
 if (nvdimm->label_size < cmd_in->offset + cmd_in->length) {
 nvdimm_debug("position %#x is beyond label data (len = %#lx).\n",
  cmd_in->offset + cmd_in->length, nvdimm->label_size);
@@ -724,6 +739,12 @@ static void nvdimm_dsm_write_nvdimm(dsm_in *in, GArray *out)
 /* please refer to ACPI 6.0: 9.14.1 _DSM (Device Specific Method) */
 case DSM_FUN_IMPLEMENTED:
 cmd_list = cpu_to_le64(DIMM_SUPPORT_FUN);
+
+/* no function support if the device does not have label data. */
+if (!nvdimm->reserve_label_data) {
+cmd_list = cpu_to_le64(0UL);
+}
+
 g_array_append_vals(out, &cmd_list, sizeof(cmd_list));
 goto free;
 case DSM_DEV_FUN_NAMESPACE_LABEL_SIZE:
diff --git a/hw/mem/nvdimm.c b/hw/mem/nvdimm.c
index 2d121f6..cc69a3e 100644
--- a/hw/mem/nvdimm.c
+++ b/hw/mem/nvdimm.c
@@ -60,14 +60,15 @@ static void nvdimm_realize(DIMMDevice *dimm, Error **errp)
 {
 MemoryRegion *mr;
 NVDIMMDevice *nvdimm = NVDIMM(dimm);
-uint64_t size;
+uint64_t reserved_label_size, size;
 
 nvdimm->label_size = MIN_NAMESPACE_LABEL_SIZE;
+reserved_label_size = nvdimm->reserve_label_data ? nvdimm->label_size : 0;
 
 mr = host_memory_backend_get_memory(dimm->hostmem, errp);
 size = memory_region_size(mr);
 
-if (size <= nvdimm->label_size) {
+if (size <= reserved_label_size) {
 char *path = object_get_canonical_path_component(OBJECT(dimm->hostmem));
 error_setg(errp, "the size of memdev %s (0x%" PRIx64 ") is too small"
" to contain nvdimm namespace label (0x%" PRIx64 ")", path,
@@ -76,9 +77,12 @@ static void nvdimm_realize(DIMMDevice *dimm, Error **errp)
 }
 
 memory_region_init_alias(&nvdimm->nvdimm_mr, OBJECT(dimm), "nvdimm-memory",
- mr, 0, size - nvdimm->label_size);
-nvdimm->label_data = memory_region_get_ram_ptr(mr) +
- memory_region_size(&nvdimm->nvdimm_mr);
+ mr, 0, size - reserved_label_size);
+
+if (reserved_label_size) {
+nvdimm->label_data = memory_region_get_ram_ptr(mr) +
+ memory_region_size(&nvdimm->nvdimm_mr);
+}
 }
 
 static void nvdimm_class_init(ObjectClass *oc, void *data)
@@ -93,10 +97,33 @@ static void nvdimm_class_init(ObjectClass *oc, void *data)
 ddc->get_memory_region = nvdimm_get_memory_region;
 }
 
+static bool nvdimm_get_reserve_label_data(Object *obj, Error **errp)
+{
+NVDIMMDevice *nvdimm = NVDIMM(obj);
+
+return nvdimm->reserve_label_data;
+}
+
+static void
+nvdimm_set_reserve_label_data(Object *obj, bool value, Error **errp)
+{
+NVDIMMDevice *nvdimm = NVDIMM(obj);
+
+nvdimm->reserve_label_data = value;
+}
+
+static void nvdimm_init(Object *obj)
+{
+object_property_add_bool(obj, "reserve-label-data",
+ nvdimm_get_reserve_label_data,
+ nvdimm_set_reserve_label_data, NULL);
+}
+
 static TypeInfo nvdimm_info = {
 .name  = TYPE_NVDIMM,
 .parent   
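
The size handling in nvdimm_realize() above can be sketched as a standalone function. This is only an illustration of the split between pmem and label data: the struct, the function name and the label size passed in by the caller are hypothetical (the value of MIN_NAMESPACE_LABEL_SIZE is not shown in this patch).

```c
#include <stdbool.h>
#include <stdint.h>

struct nvdimm_layout {
    uint64_t pmem_size;    /* bytes exposed to the guest as pmem */
    uint64_t label_offset; /* start of the label area; valid only if has_label */
    bool has_label;        /* false in 'label-less' mode */
};

/*
 * Split a backend of backend_size bytes into pmem and an optional label
 * area at the end. Fails if the backend cannot hold more than the
 * reserved label area.
 */
static bool nvdimm_split_backend(uint64_t backend_size, uint64_t label_size,
                                 bool reserve_label_data,
                                 struct nvdimm_layout *out)
{
    uint64_t reserved = reserve_label_data ? label_size : 0;

    if (backend_size <= reserved)
        return false;

    out->pmem_size = backend_size - reserved;
    out->has_label = reserved != 0;
    out->label_offset = out->has_label ? out->pmem_size : 0;
    return true;
}
```

With reserve_label_data false, the whole backend ends up as pmem and no label area is set up, which matches the intent described in the commit message.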

[PATCH v4 29/33] nvdimm acpi: support DSM_DEV_FUN_NAMESPACE_LABEL_SIZE function

2015-10-18 Thread Xiao Guangrong
Function 4 is used to get the Namespace label size

Signed-off-by: Xiao Guangrong 
---
 hw/acpi/nvdimm.c | 97 ++--
 1 file changed, 95 insertions(+), 2 deletions(-)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 37fea1c..1274d95 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -481,12 +481,28 @@ enum {
| (1 << DSM_DEV_FUN_GET_NAMESPACE_LABEL_DATA)  \
| (1 << DSM_DEV_FUN_SET_NAMESPACE_LABEL_DATA))
 
+struct cmd_in_get_label_data {
+uint32_t offset; /* the offset in the namespace label data area. */
+uint32_t length; /* the size of data is to be read via the function. */
+} QEMU_PACKED;
+typedef struct cmd_in_get_label_data cmd_in_get_label_data;
+
+struct cmd_in_set_label_data {
+uint32_t offset; /* the offset in the namespace label data area. */
+uint32_t length; /* the size of data is to be written via the function. */
+uint8_t in_buf[0]; /* the data written to label data area. */
+} QEMU_PACKED;
+typedef struct cmd_in_set_label_data cmd_in_set_label_data;
+
 struct dsm_in {
 uint32_t handle;
 uint32_t revision;
 uint32_t function;
/* the remaining size in the page is used by arg3. */
-uint8_t arg3[0];
+union {
+uint8_t arg3[0];
+cmd_in_set_label_data cmd_set_label_data;
+};
 } QEMU_PACKED;
 typedef struct dsm_in dsm_in;
 
@@ -495,10 +511,32 @@ struct cmd_out_implemented {
 };
 typedef struct cmd_out_implemented cmd_out_implemented;
 
+struct cmd_out_label_size {
+uint32_t status; /* return status code. */
+uint32_t label_size; /* the size of label data area. */
+/*
+ * Maximum size of the namespace label data length supported by
+ * the platform in Get/Set Namespace Label Data functions.
+ */
+uint32_t max_xfer;
+} QEMU_PACKED;
+typedef struct cmd_out_label_size cmd_out_label_size;
+
+struct cmd_out_get_label_data {
+uint32_t status; /* return status code. */
+uint8_t out_buf[0]; /* the data got via Get Namespace Label function. */
+} QEMU_PACKED;
+typedef struct cmd_out_get_label_data cmd_out_get_label_data;
+
 struct dsm_out {
 /* the size of buffer filled by QEMU. */
 uint32_t len;
-uint8_t data[0];
+union {
+uint8_t data[0];
+cmd_out_implemented cmd_implemented;
+cmd_out_label_size cmd_label_size;
+cmd_out_get_label_data cmd_get_label_data;
+};
 } QEMU_PACKED;
 typedef struct dsm_out dsm_out;
 
@@ -534,6 +572,58 @@ static void nvdimm_dsm_write_root(dsm_in *in, GArray *out)
 nvdimm_dsm_write_status(out, status);
 }
 
+/*
+ * the max transfer size is the max size transferred by both a
+ * DSM_DEV_FUN_GET_NAMESPACE_LABEL_DATA and a
+ * DSM_DEV_FUN_SET_NAMESPACE_LABEL_DATA command.
+ */
+static uint32_t nvdimm_get_max_xfer_label_size(void)
+{
+dsm_in *in;
+dsm_out *out;
+uint32_t max_get_size, max_set_size, dsm_memory_size = getpagesize();
+
+/*
+ * the max data ACPI can read one time which is transferred by
+ * the response of DSM_DEV_FUN_GET_NAMESPACE_LABEL_DATA.
+ */
+max_get_size = dsm_memory_size - offsetof(dsm_out, data) -
+   sizeof(out->cmd_get_label_data);
+
+/*
+ * the max data ACPI can write one time which is transferred by
+ * DSM_DEV_FUN_SET_NAMESPACE_LABEL_DATA
+ */
+max_set_size = dsm_memory_size - offsetof(dsm_in, arg3) -
+   sizeof(in->cmd_set_label_data);
+
+return MIN(max_get_size, max_set_size);
+}
+
+/*
+ * Please refer to DSM specification 4.4.1 Get Namespace Label Size
+ * (Function Index 4).
+ *
+ * It gets the size of Namespace Label data area and the max data size
+ * that Get/Set Namespace Label Data functions can transfer.
+ */
+static void nvdimm_dsm_func_label_size(NVDIMMDevice *nvdimm, GArray *out)
+{
+cmd_out_label_size cmd_label_size;
+uint32_t label_size, mxfer;
+
+label_size = nvdimm->label_size;
+mxfer = nvdimm_get_max_xfer_label_size();
+
+nvdimm_debug("label_size %#x, max_xfer %#x.\n", label_size, mxfer);
+
+cmd_label_size.status = cpu_to_le32(DSM_STATUS_SUCCESS);
+cmd_label_size.label_size = cpu_to_le32(label_size);
+cmd_label_size.max_xfer = cpu_to_le32(mxfer);
+
+g_array_append_vals(out, &cmd_label_size, sizeof(cmd_label_size));
+}
+
 static void nvdimm_dsm_write_nvdimm(dsm_in *in, GArray *out)
 {
 GSList *list = nvdimm_get_plugged_device_list();
@@ -551,6 +641,9 @@ static void nvdimm_dsm_write_nvdimm(dsm_in *in, GArray *out)
 cmd_list = cpu_to_le64(DIMM_SUPPORT_FUN);
 g_array_append_vals(out, &cmd_list, sizeof(cmd_list));
 goto free;
+case DSM_DEV_FUN_NAMESPACE_LABEL_SIZE:
+nvdimm_dsm_func_label_size(nvdimm, out);
+goto free;
 default:
 status = DSM_STATUS_NOT_SUPPORTED;
 };
-- 
1.8.3.1
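
As a rough illustration of the arithmetic in nvdimm_get_max_xfer_label_size(), the sketch below redoes the calculation with stand-in structures. The sketch_* layouts, including the 8-byte offset/length header assumed for Set Namespace Label Data, are assumptions made for this example and not the definitions from hw/acpi/nvdimm.c. The page size is passed in as a parameter instead of being read from getpagesize().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the dsm_in/dsm_out layouts; assumptions only. */
struct sketch_set_label_hdr {
    uint32_t offset;
    uint32_t length;
} __attribute__((packed));

struct sketch_dsm_in {
    uint32_t handle;
    uint32_t revision;
    uint32_t function;
    struct sketch_set_label_hdr set_label; /* label payload follows */
} __attribute__((packed));

struct sketch_dsm_out {
    uint32_t len;
    uint32_t status; /* label data follows the status word */
} __attribute__((packed));

static uint32_t sketch_max_xfer(uint32_t dsm_memory_size)
{
    uint32_t max_get, max_set;

    assert(dsm_memory_size >= sizeof(struct sketch_dsm_in));

    /* Space left for label data in a Get response. */
    max_get = dsm_memory_size - (uint32_t)sizeof(struct sketch_dsm_out);
    /* Space left for label data in a Set request. */
    max_set = dsm_memory_size - (uint32_t)sizeof(struct sketch_dsm_in);

    return max_get < max_set ? max_get : max_set;
}
```

Under these assumed layouts the Set request has the larger header (20 bytes against 8), so it is the Set direction that bounds the transfer size for a given page size.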

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.

[PATCH v4 02/33] acpi: add aml_sizeof

2015-10-18 Thread Xiao Guangrong
Implement the SizeOf term, which is used by the NVDIMM _DSM method in a later patch.

Reviewed-by: Igor Mammedov 
Signed-off-by: Xiao Guangrong 
---
 hw/acpi/aml-build.c | 8 
 include/hw/acpi/aml-build.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index cbd53f4..a72214d 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1143,6 +1143,14 @@ Aml *aml_derefof(Aml *arg)
 return var;
 }
 
+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefSizeOf */
+Aml *aml_sizeof(Aml *arg)
+{
+Aml *var = aml_opcode(0x87 /* SizeOfOp */);
+aml_append(var, arg);
+return var;
+}
+
 void
 build_header(GArray *linker, GArray *table_data,
  AcpiTableHeader *h, const char *sig, int len, uint8_t rev)
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 5a03d33..7296efb 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -275,6 +275,7 @@ Aml *aml_varpackage(uint32_t num_elements);
 Aml *aml_touuid(const char *uuid);
 Aml *aml_unicode(const char *str);
 Aml *aml_derefof(Aml *arg);
+Aml *aml_sizeof(Aml *arg);
 
 void
 build_header(GArray *linker, GArray *table_data,
-- 
1.8.3.1
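
For readers less familiar with the byte-level encoding, here is a minimal sketch of what SizeOf(ArgN) looks like once serialized. It writes into a plain byte buffer instead of using the Aml type, and the sketch_* names are made up for this example. The opcode value 0x87 is the one used in the patch; 0x68 for Arg0Op (with Arg1 to Arg6 following consecutively) is my reading of the AML opcode table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    SKETCH_ARG0_OP   = 0x68, /* Arg0Op; Arg1..Arg6 follow consecutively */
    SKETCH_SIZEOF_OP = 0x87, /* SizeOfOp, as in the patch */
};

/* Writes "SizeOf(ArgN)" into buf and returns the number of bytes emitted. */
static size_t sketch_emit_sizeof_arg(uint8_t *buf, size_t cap, unsigned arg)
{
    assert(arg <= 6 && cap >= 2);
    buf[0] = SKETCH_SIZEOF_OP;
    buf[1] = (uint8_t)(SKETCH_ARG0_OP + arg);
    return 2;
}
```

For Arg3, which is the operand the NVDIMM _DSM code checks, this emits the two bytes 0x87 0x6B.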

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 25/33] nvdimm acpi: init the address region used by DSM

2015-10-18 Thread Xiao Guangrong
Map the NVDIMM ACPI memory region into the guest address space.

For the detailed DSM design, please refer to docs/specs/acpi_nvdimm.txt.

Signed-off-by: Xiao Guangrong 
---
 hw/acpi/nvdimm.c| 87 ++---
 include/hw/mem/nvdimm.h |  8 +
 2 files changed, 91 insertions(+), 4 deletions(-)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 8d8376c..bc28828 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -391,10 +391,9 @@ static GArray *nvdimm_build_device_structure(GSList *device_list)
 return structures;
 }
 
-static void nvdimm_build_nfit(GSList *device_list, GArray *table_offsets,
+static void nvdimm_build_nfit(GArray *structures, GArray *table_offsets,
   GArray *table_data, GArray *linker)
 {
-GArray *structures = nvdimm_build_device_structure(device_list);
 void *header;
 
 acpi_add_table(table_offsets, table_data);
@@ -407,12 +406,80 @@ static void nvdimm_build_nfit(GSList *device_list, GArray *table_offsets,
 
 build_header(linker, table_data, header, "NFIT",
  sizeof(nfit) + structures->len, 1);
-g_array_free(structures, true);
+}
+
+static uint64_t
+nvdimm_dsm_read(void *opaque, hwaddr addr, unsigned size)
+{
+return 0;
+}
+
+static void
+nvdimm_dsm_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
+{
+}
+
+static const MemoryRegionOps nvdimm_dsm_ops = {
+.read = nvdimm_dsm_read,
+.write = nvdimm_dsm_write,
+.endianness = DEVICE_LITTLE_ENDIAN,
+};
+
+static MemoryRegion *nvdimm_build_dsm_memory(NVDIMMState *state)
+{
+MemoryRegion *dsm_ram_mr, *dsm_mmio_mr, *dsm_fit_mr;
+uint64_t page_size = getpagesize();
+uint64_t fit_size = memory_region_size(&state->mr) - page_size * 2;
+
+/* DSM memory has already been built. */
+dsm_fit_mr = memory_region_find(&state->mr, page_size * 2,
+fit_size).mr;
+if (dsm_fit_mr) {
+nvdimm_debug("DSM FIT has already been built by %s.\n",
+ dsm_fit_mr->name);
+return dsm_fit_mr;
+}
+
+/*
+ * the first page is MMIO-based used to transfer control from guest
+ * ACPI to QEMU.
+ */
+dsm_mmio_mr = g_new(MemoryRegion, 1);
+memory_region_init_io(dsm_mmio_mr, NULL, &nvdimm_dsm_ops, state,
+  "nvdimm.dsm_mmio", page_size);
+
+/*
+ * the second page is RAM-based used to transfer data between guest
+ * ACPI and QEMU.
+ */
+dsm_ram_mr = g_new(MemoryRegion, 1);
+memory_region_init_ram(dsm_ram_mr, NULL, "nvdimm.dsm_ram",
+   page_size, &error_abort);
+vmstate_register_ram_global(dsm_ram_mr);
+
+/*
+ * the rest is RAM-based and holds the _FIT buffer returned by the _FIT
+ * method.
+ */
+dsm_fit_mr = g_new(MemoryRegion, 1);
+memory_region_init_ram(dsm_fit_mr, NULL, "nvdimm.fit", fit_size,
+   &error_abort);
+vmstate_register_ram_global(dsm_fit_mr);
+
+memory_region_add_subregion(&state->mr, 0, dsm_mmio_mr);
+memory_region_add_subregion(&state->mr, page_size, dsm_ram_mr);
+memory_region_add_subregion(&state->mr, page_size * 2, dsm_fit_mr);
+
+/* the caller will unref it. */
+memory_region_ref(dsm_fit_mr);
+return dsm_fit_mr;
 }
 
 void nvdimm_build_acpi(NVDIMMState *state, GArray *table_offsets,
GArray *table_data, GArray *linker)
 {
+MemoryRegion *fit_mr;
+GArray *structures;
 GSList *device_list = nvdimm_get_plugged_device_list();
 
 if (!memory_region_size(&state->mr)) {
@@ -424,6 +491,18 @@ void nvdimm_build_acpi(NVDIMMState *state, GArray *table_offsets,
 return;
 }
 
-nvdimm_build_nfit(device_list, table_offsets, table_data, linker);
+fit_mr = nvdimm_build_dsm_memory(state);
+
+structures = nvdimm_build_device_structure(device_list);
+
+/* Build fit memory which is presented to guest via _FIT method. */
+assert(memory_region_size(fit_mr) >= structures->len);
+memcpy(memory_region_get_ram_ptr(fit_mr), structures->data,
+   structures->len);
+
+nvdimm_build_nfit(structures, table_offsets, table_data, linker);
+
+memory_region_unref(fit_mr);
 g_slist_free(device_list);
+g_array_free(structures, true);
 }
diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
index dc77a1f..c2dc635 100644
--- a/include/hw/mem/nvdimm.h
+++ b/include/hw/mem/nvdimm.h
@@ -25,6 +25,14 @@
 
 #include "hw/mem/dimm.h"
 
+#define NVDIMM_DEBUG 0
+#define nvdimm_debug(fmt, ...)\
+do {  \
+if (NVDIMM_DEBUG) {   \
+fprintf(stderr, "nvdimm: " fmt, ## __VA_ARGS__);  \
+} \
+} while (0)
+
 /*
  * The minimum label data size is required by NVDIMM Namespace
  * specification, please refer to chapter 2 Namesp

[PATCH v4 04/33] acpi: add aml_concatenate

2015-10-18 Thread Xiao Guangrong
Implement the Concatenate term, which is used by the NVDIMM _DSM method
in a later patch.

Signed-off-by: Xiao Guangrong 
---
 hw/acpi/aml-build.c | 14 ++
 include/hw/acpi/aml-build.h |  1 +
 2 files changed, 15 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 9fe5e7b..efc06ab 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1164,6 +1164,20 @@ Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name)
 return var;
 }
 
+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefConcat */
+Aml *aml_concatenate(Aml *source1, Aml *source2, Aml *target)
+{
+Aml *var = aml_opcode(0x73 /* ConcatOp */);
+aml_append(var, source1);
+aml_append(var, source2);
+
+if (target) {
+aml_append(var, target);
+}
+
+return var;
+}
+
 void
 build_header(GArray *linker, GArray *table_data,
  AcpiTableHeader *h, const char *sig, int len, uint8_t rev)
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 7e1c43b..325782d 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -277,6 +277,7 @@ Aml *aml_unicode(const char *str);
 Aml *aml_derefof(Aml *arg);
 Aml *aml_sizeof(Aml *arg);
 Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name);
+Aml *aml_concatenate(Aml *source1, Aml *source2, Aml *target);
 
 void
 build_header(GArray *linker, GArray *table_data,
-- 
1.8.3.1
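
One point that may be worth double-checking: as I read the AML grammar, DefConcat carries a mandatory Target operand, and "no target" is encoded as a NullName byte (0x00) instead of by leaving the operand out. The sketch below shows that encoding for Concatenate(LocalA, LocalB, target). It uses a plain byte buffer in place of the Aml type, the sketch_* names are hypothetical, and the Local0Op value 0x60 is my reading of the opcode table, so treat it as an illustration and not as the required fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    SKETCH_NULL_NAME = 0x00, /* NullName: "no target" */
    SKETCH_LOCAL0_OP = 0x60, /* Local0Op; Local1..Local7 follow */
    SKETCH_CONCAT_OP = 0x73, /* ConcatOp, as in the patch */
};

/*
 * Emits Concatenate(LocalA, LocalB, target) into buf.
 * target_local < 0 means "no target" and is encoded as NullName.
 * Returns the number of bytes written.
 */
static size_t sketch_emit_concat(uint8_t *buf, size_t cap,
                                 unsigned a, unsigned b, int target_local)
{
    assert(cap >= 4 && a <= 7 && b <= 7 && target_local <= 7);
    buf[0] = SKETCH_CONCAT_OP;
    buf[1] = (uint8_t)(SKETCH_LOCAL0_OP + a);
    buf[2] = (uint8_t)(SKETCH_LOCAL0_OP + b);
    buf[3] = (uint8_t)(target_local < 0 ? SKETCH_NULL_NAME
                                        : SKETCH_LOCAL0_OP + target_local);
    return 4;
}
```

The encoded length is the same with or without a real target, which keeps the operand count fixed for whatever parses the byte stream.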



[PATCH v4 03/33] acpi: add aml_create_field

2015-10-18 Thread Xiao Guangrong
Implement the CreateField term, which is used by the NVDIMM _DSM method in a later patch.

Signed-off-by: Xiao Guangrong 
---
 hw/acpi/aml-build.c | 13 +
 include/hw/acpi/aml-build.h |  1 +
 2 files changed, 14 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index a72214d..9fe5e7b 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1151,6 +1151,19 @@ Aml *aml_sizeof(Aml *arg)
 return var;
 }
 
+/* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: DefCreateField */
+Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name)
+{
+Aml *var = aml_alloc();
+build_append_byte(var->buf, 0x5B); /* ExtOpPrefix */
+build_append_byte(var->buf, 0x13); /* CreateFieldOp */
+aml_append(var, srcbuf);
+aml_append(var, index);
+aml_append(var, len);
+build_append_namestring(var->buf, "%s", name);
+return var;
+}
+
 void
 build_header(GArray *linker, GArray *table_data,
  AcpiTableHeader *h, const char *sig, int len, uint8_t rev)
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 7296efb..7e1c43b 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -276,6 +276,7 @@ Aml *aml_touuid(const char *uuid);
 Aml *aml_unicode(const char *str);
 Aml *aml_derefof(Aml *arg);
 Aml *aml_sizeof(Aml *arg);
+Aml *aml_create_field(Aml *srcbuf, Aml *index, Aml *len, const char *name);
 
 void
 build_header(GArray *linker, GArray *table_data,
-- 
1.8.3.1



[PATCH v4 17/33] dimm: abstract dimm device from pc-dimm

2015-10-18 Thread Xiao Guangrong
A base device, dimm, is abstracted from pc-dimm so that the nvdimm device
can be built on top of dimm in a later patch.

Signed-off-by: Xiao Guangrong 
---
 default-configs/i386-softmmu.mak   |  1 +
 default-configs/x86_64-softmmu.mak |  1 +
 hw/mem/Makefile.objs   |  3 ++-
 hw/mem/dimm.c  | 11 ++---
 hw/mem/pc-dimm.c   | 46 ++
 include/hw/mem/dimm.h  |  4 ++--
 include/hw/mem/pc-dimm.h   |  7 ++
 7 files changed, 61 insertions(+), 12 deletions(-)
 create mode 100644 hw/mem/pc-dimm.c
 create mode 100644 include/hw/mem/pc-dimm.h

diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
index 43c96d1..3ece8bb 100644
--- a/default-configs/i386-softmmu.mak
+++ b/default-configs/i386-softmmu.mak
@@ -18,6 +18,7 @@ CONFIG_FDC=y
 CONFIG_ACPI=y
 CONFIG_ACPI_X86=y
 CONFIG_ACPI_X86_ICH=y
+CONFIG_DIMM=y
 CONFIG_ACPI_MEMORY_HOTPLUG=y
 CONFIG_ACPI_CPU_HOTPLUG=y
 CONFIG_APM=y
diff --git a/default-configs/x86_64-softmmu.mak b/default-configs/x86_64-softmmu.mak
index dfb8095..92ea7c1 100644
--- a/default-configs/x86_64-softmmu.mak
+++ b/default-configs/x86_64-softmmu.mak
@@ -18,6 +18,7 @@ CONFIG_FDC=y
 CONFIG_ACPI=y
 CONFIG_ACPI_X86=y
 CONFIG_ACPI_X86_ICH=y
+CONFIG_DIMM=y
 CONFIG_ACPI_MEMORY_HOTPLUG=y
 CONFIG_ACPI_CPU_HOTPLUG=y
 CONFIG_APM=y
diff --git a/hw/mem/Makefile.objs b/hw/mem/Makefile.objs
index 7563ef5..cebb4b1 100644
--- a/hw/mem/Makefile.objs
+++ b/hw/mem/Makefile.objs
@@ -1 +1,2 @@
-common-obj-$(CONFIG_MEM_HOTPLUG) += dimm.o
+common-obj-$(CONFIG_DIMM) += dimm.o
+common-obj-$(CONFIG_MEM_HOTPLUG) += pc-dimm.o
diff --git a/hw/mem/dimm.c b/hw/mem/dimm.c
index 6c1ea98..23d5daa 100644
--- a/hw/mem/dimm.c
+++ b/hw/mem/dimm.c
@@ -1,5 +1,5 @@
 /*
- * Dimm device for Memory Hotplug
+ * Dimm device abstraction
  *
  * Copyright ProfitBricks GmbH 2012
  * Copyright (C) 2014 Red Hat Inc
@@ -432,21 +432,13 @@ static void dimm_realize(DeviceState *dev, Error **errp)
 }
 }
 
-static MemoryRegion *dimm_get_memory_region(DIMMDevice *dimm)
-{
-return host_memory_backend_get_memory(dimm->hostmem, &error_abort);
-}
-
 static void dimm_class_init(ObjectClass *oc, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(oc);
-DIMMDeviceClass *ddc = DIMM_CLASS(oc);
 
 dc->realize = dimm_realize;
 dc->props = dimm_properties;
 dc->desc = "DIMM memory module";
-
-ddc->get_memory_region = dimm_get_memory_region;
 }
 
 static TypeInfo dimm_info = {
@@ -456,6 +448,7 @@ static TypeInfo dimm_info = {
 .instance_init = dimm_init,
 .class_init= dimm_class_init,
 .class_size= sizeof(DIMMDeviceClass),
+.abstract  = true,
 };
 
 static void dimm_register_types(void)
diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
new file mode 100644
index 000..38323e9
--- /dev/null
+++ b/hw/mem/pc-dimm.c
@@ -0,0 +1,46 @@
+/*
+ * Dimm device for Memory Hotplug
+ *
+ * Copyright ProfitBricks GmbH 2012
+ * Copyright (C) 2014 Red Hat Inc
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see 
+ */
+
+#include "hw/mem/pc-dimm.h"
+
+static MemoryRegion *pc_dimm_get_memory_region(DIMMDevice *dimm)
+{
+return host_memory_backend_get_memory(dimm->hostmem, &error_abort);
+}
+
+static void pc_dimm_class_init(ObjectClass *oc, void *data)
+{
+DIMMDeviceClass *ddc = DIMM_CLASS(oc);
+
+ddc->get_memory_region = pc_dimm_get_memory_region;
+}
+
+static TypeInfo pc_dimm_info = {
+.name  = TYPE_PC_DIMM,
+.parent= TYPE_DIMM,
+.class_init= pc_dimm_class_init,
+};
+
+static void pc_dimm_register_types(void)
+{
+type_register_static(&pc_dimm_info);
+}
+
+type_init(pc_dimm_register_types)
diff --git a/include/hw/mem/dimm.h b/include/hw/mem/dimm.h
index 5ddbf08..84a62ed 100644
--- a/include/hw/mem/dimm.h
+++ b/include/hw/mem/dimm.h
@@ -1,5 +1,5 @@
 /*
- * PC DIMM device
+ * Dimm device abstraction
  *
  * Copyright ProfitBricks GmbH 2012
  * Copyright (C) 2013-2014 Red Hat Inc
@@ -20,7 +20,7 @@
 #include "sysemu/hostmem.h"
 #include "hw/qdev.h"
 
-#define TYPE_DIMM "pc-dimm"
+#define TYPE_DIMM "dimm"
 #define DIMM(obj) \
 OBJECT_CHECK(DIMMDevice, (obj), TYPE_DIMM)
 #define DIMM_CLASS(oc) \
diff --git a/include/hw/mem/pc-dimm.h b/include/hw/mem/pc-dimm.h
new file mode 100644
index 000..50818c2
--- /dev/null
+++ b/include/hw/mem/pc-dimm

[PATCH v4 30/33] nvdimm acpi: support DSM_DEV_FUN_GET_NAMESPACE_LABEL_DATA

2015-10-18 Thread Xiao Guangrong
Function 5 is used to get Namespace Label Data

Signed-off-by: Xiao Guangrong 
---
 hw/acpi/nvdimm.c | 45 +
 1 file changed, 45 insertions(+)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 1274d95..1683a82 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -501,6 +501,7 @@ struct dsm_in {
/* the remaining size in the page is used by arg3. */
 union {
 uint8_t arg3[0];
+cmd_in_get_label_data cmd_get_label_data;
 cmd_in_set_label_data cmd_set_label_data;
 };
 } QEMU_PACKED;
@@ -624,6 +625,47 @@ static void nvdimm_dsm_func_label_size(NVDIMMDevice *nvdimm, GArray *out)
 g_array_append_vals(out, &cmd_label_size, sizeof(cmd_label_size));
 }
 
+/*
+ * please refer to DSM specification 4.5 Get Namespace Label Data (Function
+ * Index 5).
+ */
+static void
+nvdimm_dsm_cmd_get_label_data(NVDIMMDevice *nvdimm, dsm_in *in, GArray *out)
+{
+cmd_in_get_label_data *cmd_in = &in->cmd_get_label_data;
+uint32_t status = DSM_STATUS_SUCCESS;
+
+le32_to_cpus(&cmd_in->offset);
+le32_to_cpus(&cmd_in->length);
+
+nvdimm_debug("Read Label Data: offset %#x length %#x.\n",
+ cmd_in->offset, cmd_in->length);
+
+if (nvdimm->label_size < cmd_in->offset + cmd_in->length) {
+nvdimm_debug("position %#x is beyond label data (len = %#lx).\n",
+ cmd_in->offset + cmd_in->length, nvdimm->label_size);
+status = DSM_DEV_STATUS_INVALID_PARAS;
+goto exit;
+}
+
+if (cmd_in->length > nvdimm_get_max_xfer_label_size()) {
+nvdimm_debug("get length (%#x) is larger than max_xfer (%#x).\n",
+ cmd_in->length, nvdimm_get_max_xfer_label_size());
+status = DSM_DEV_STATUS_INVALID_PARAS;
+goto exit;
+}
+
+/* write cmd_out_get_label_data.status. */
+nvdimm_dsm_write_status(out, status);
+/* write cmd_out_get_label_data.out_buf. */
+g_array_append_vals(out, nvdimm->label_data + cmd_in->offset,
+cmd_in->length);
+return;
+
+exit:
+nvdimm_dsm_write_status(out, status);
+}
+
 static void nvdimm_dsm_write_nvdimm(dsm_in *in, GArray *out)
 {
 GSList *list = nvdimm_get_plugged_device_list();
@@ -644,6 +686,9 @@ static void nvdimm_dsm_write_nvdimm(dsm_in *in, GArray *out)
 case DSM_DEV_FUN_NAMESPACE_LABEL_SIZE:
 nvdimm_dsm_func_label_size(nvdimm, out);
 goto free;
+case DSM_DEV_FUN_GET_NAMESPACE_LABEL_DATA:
+nvdimm_dsm_cmd_get_label_data(nvdimm, in, out);
+goto free;
 default:
 status = DSM_STATUS_NOT_SUPPORTED;
 };
-- 
1.8.3.1
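
A note on the range check in nvdimm_dsm_cmd_get_label_data(): if offset and length are both 32-bit values, as the le32_to_cpus() calls suggest, their sum can wrap before it is compared against label_size. The sketch below shows one way to write the check without forming the sum. The function name and the 64-bit type chosen for label_size are assumptions for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if [offset, offset + length) lies inside the label area.
 * The subtraction cannot underflow because offset <= label_size is
 * checked first.
 */
static bool sketch_label_range_ok(uint64_t label_size,
                                  uint32_t offset, uint32_t length)
{
    return offset <= label_size && length <= label_size - offset;
}
```

With a 128KB label area, an offset of 0xFFFFFFF0 and a length of 0x20 add up to 0x10 in 32-bit arithmetic, which would pass a "sum <= label_size" test; the form above rejects it.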



[PATCH v4 28/33] nvdimm acpi: support DSM_FUN_IMPLEMENTED function

2015-10-18 Thread Xiao Guangrong
_DSM is defined in ACPI 6.0: 9.14.1 _DSM (Device Specific Method).

Function 0 is a query function. We do not support any function on the root
device, and only 3 functions are supported for the NVDIMM device:
DSM_DEV_FUN_NAMESPACE_LABEL_SIZE, DSM_DEV_FUN_GET_NAMESPACE_LABEL_DATA and
DSM_DEV_FUN_SET_NAMESPACE_LABEL_DATA. That means we currently only allow
access to the device's Label Namespace.

Signed-off-by: Xiao Guangrong 
---
 hw/acpi/nvdimm.c | 184 ++-
 1 file changed, 182 insertions(+), 2 deletions(-)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index b211b8b..37fea1c 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -260,6 +260,22 @@ static uint32_t nvdimm_slot_to_dcr_index(int slot)
 return nvdimm_slot_to_spa_index(slot) + 1;
 }
 
+static NVDIMMDevice
+*nvdimm_get_device_by_handle(GSList *list, uint32_t handle)
+{
+for (; list; list = list->next) {
+NVDIMMDevice *nvdimm = list->data;
+int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
+   NULL);
+
+if (nvdimm_slot_to_handle(slot) == handle) {
+return nvdimm;
+}
+}
+
+return NULL;
+}
+
 /*
  * Please refer to ACPI 6.0: 5.2.25.1 System Physical Address Range
  * Structure
@@ -411,6 +427,60 @@ static void nvdimm_build_nfit(GArray *structures, GArray *table_offsets,
 /* detailed _DSM design please refer to docs/specs/acpi_nvdimm.txt */
 #define NOTIFY_VALUE  0x99
 
+enum {
+DSM_FUN_IMPLEMENTED = 0,
+
+/* NVDIMM Root Device Functions */
+DSM_ROOT_DEV_FUN_ARS_CAP = 1,
+DSM_ROOT_DEV_FUN_ARS_START = 2,
+DSM_ROOT_DEV_FUN_ARS_QUERY = 3,
+
+/* NVDIMM Device (non-root) Functions */
+DSM_DEV_FUN_SMART = 1,
+DSM_DEV_FUN_SMART_THRESHOLD = 2,
+DSM_DEV_FUN_BLOCK_NVDIMM_FLAGS = 3,
+DSM_DEV_FUN_NAMESPACE_LABEL_SIZE = 4,
+DSM_DEV_FUN_GET_NAMESPACE_LABEL_DATA = 5,
+DSM_DEV_FUN_SET_NAMESPACE_LABEL_DATA = 6,
+DSM_DEV_FUN_VENDOR_EFFECT_LOG_SIZE = 7,
+DSM_DEV_FUN_GET_VENDOR_EFFECT_LOG = 8,
+DSM_DEV_FUN_VENDOR_SPECIFIC = 9,
+};
+
+enum {
+/* Common return status codes. */
+DSM_STATUS_SUCCESS = 0,   /* Success */
+DSM_STATUS_NOT_SUPPORTED = 1, /* Not Supported */
+
+/* NVDIMM Root Device _DSM function return status codes*/
+DSM_ROOT_DEV_STATUS_INVALID_PARAS = 2,/* Invalid Input Parameters */
+DSM_ROOT_DEV_STATUS_FUNCTION_SPECIFIC_ERROR = 3, /* Function-Specific
+Error */
+
+/* NVDIMM Device (non-root) _DSM function return status codes*/
+DSM_DEV_STATUS_NON_EXISTING_MEM_DEV = 2,  /* Non-Existing Memory Device */
+DSM_DEV_STATUS_INVALID_PARAS = 3, /* Invalid Input Parameters */
+DSM_DEV_STATUS_VENDOR_SPECIFIC_ERROR = 4, /* Vendor Specific Error */
+};
+
+/* Current revision supported by DSM specification is 1. */
+#define DSM_REVISION (1)
+
+/*
+ * please refer to ACPI 6.0: 9.14.1 _DSM (Device Specific Method): Return
+ * Value Information:
+ *   if set to zero, no functions are supported (other than function zero)
+ *   for the specified UUID and Revision ID. If set to one, at least one
+ *   additional function is supported.
+ */
+
+/* do not support any function on root. */
+#define ROOT_SUPPORT_FUN (0ULL)
+#define DIMM_SUPPORT_FUN ((1 << DSM_FUN_IMPLEMENTED)   \
+   | (1 << DSM_DEV_FUN_NAMESPACE_LABEL_SIZE)  \
+   | (1 << DSM_DEV_FUN_GET_NAMESPACE_LABEL_DATA)  \
+   | (1 << DSM_DEV_FUN_SET_NAMESPACE_LABEL_DATA))
+
 struct dsm_in {
 uint32_t handle;
 uint32_t revision;
@@ -420,6 +490,11 @@ struct dsm_in {
 } QEMU_PACKED;
 typedef struct dsm_in dsm_in;
 
+struct cmd_out_implemented {
+uint64_t cmd_list;
+};
+typedef struct cmd_out_implemented cmd_out_implemented;
+
 struct dsm_out {
 /* the size of buffer filled by QEMU. */
 uint32_t len;
@@ -434,12 +509,115 @@ nvdimm_dsm_read(void *opaque, hwaddr addr, unsigned size)
 return 0;
 }
 
+static void nvdimm_dsm_write_status(GArray *out, uint32_t status)
+{
+/* status locates in the first 4 bytes in the dsm memory. */
+assert(!out->len);
+
+status = cpu_to_le32(status);
+g_array_append_vals(out, &status, sizeof(status));
+}
+
+static void nvdimm_dsm_write_root(dsm_in *in, GArray *out)
+{
+uint32_t status = DSM_STATUS_NOT_SUPPORTED;
+
+/* please refer to ACPI 6.0: 9.14.1 _DSM (Device Specific Method) */
+if (in->function == DSM_FUN_IMPLEMENTED) {
+uint64_t cmd_list = cpu_to_le64(ROOT_SUPPORT_FUN);
+
+g_array_append_vals(out, &cmd_list, sizeof(cmd_list));
+return;
+}
+
+nvdimm_debug("Return status %#x.\n", status);
+nvdimm_dsm_write_status(out, status);
+}
+
+static void nvdimm_dsm_write_nvdimm(dsm_in *in, GArray *out)
+{
+GSList *list = nvdimm_get_plugged_device_list();
+NVDIM

[PATCH v4 16/33] pc-dimm: rename pc-dimm.c and pc-dimm.h

2015-10-18 Thread Xiao Guangrong
Rename:
   pc-dimm.c => dimm.c
   pc-dimm.h => dimm.h

This prepares for abstracting a dimm device type shared by both pc-dimm and
nvdimm.

Signed-off-by: Xiao Guangrong 
---
 hw/Makefile.objs | 2 +-
 hw/acpi/ich9.c   | 2 +-
 hw/acpi/memory_hotplug.c | 4 ++--
 hw/acpi/piix4.c  | 2 +-
 hw/i386/pc.c | 2 +-
 hw/mem/Makefile.objs | 2 +-
 hw/mem/{pc-dimm.c => dimm.c} | 2 +-
 hw/ppc/spapr.c   | 2 +-
 include/hw/i386/pc.h | 2 +-
 include/hw/mem/{pc-dimm.h => dimm.h} | 0
 include/hw/ppc/spapr.h   | 2 +-
 numa.c   | 2 +-
 qmp.c| 2 +-
 stubs/qmp_dimm_device_list.c | 2 +-
 14 files changed, 14 insertions(+), 14 deletions(-)
 rename hw/mem/{pc-dimm.c => dimm.c} (99%)
 rename include/hw/mem/{pc-dimm.h => dimm.h} (100%)

diff --git a/hw/Makefile.objs b/hw/Makefile.objs
index 7e7c241..12ecda9 100644
--- a/hw/Makefile.objs
+++ b/hw/Makefile.objs
@@ -30,8 +30,8 @@ devices-dirs-$(CONFIG_SOFTMMU) += vfio/
 devices-dirs-$(CONFIG_VIRTIO) += virtio/
 devices-dirs-$(CONFIG_SOFTMMU) += watchdog/
 devices-dirs-$(CONFIG_SOFTMMU) += xen/
-devices-dirs-$(CONFIG_MEM_HOTPLUG) += mem/
 devices-dirs-$(CONFIG_SMBIOS) += smbios/
+devices-dirs-y += mem/
 devices-dirs-y += core/
 common-obj-y += $(devices-dirs-y)
 obj-y += $(devices-dirs-y)
diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
index b0d6a67..1e9ae20 100644
--- a/hw/acpi/ich9.c
+++ b/hw/acpi/ich9.c
@@ -35,7 +35,7 @@
 #include "exec/address-spaces.h"
 
 #include "hw/i386/ich9.h"
-#include "hw/mem/pc-dimm.h"
+#include "hw/mem/dimm.h"
 
 //#define DEBUG
 
diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c
index 1f6..e232641 100644
--- a/hw/acpi/memory_hotplug.c
+++ b/hw/acpi/memory_hotplug.c
@@ -1,6 +1,6 @@
 #include "hw/acpi/memory_hotplug.h"
 #include "hw/acpi/pc-hotplug.h"
-#include "hw/mem/pc-dimm.h"
+#include "hw/mem/dimm.h"
 #include "hw/boards.h"
 #include "hw/qdev-core.h"
 #include "trace.h"
@@ -148,7 +148,7 @@ static void acpi_memory_hotplug_write(void *opaque, hwaddr addr, uint64_t data,
 
 dev = DEVICE(mdev->dimm);
 hotplug_ctrl = qdev_get_hotplug_handler(dev);
-/* call pc-dimm unplug cb */
+/* call dimm unplug cb */
 hotplug_handler_unplug(hotplug_ctrl, dev, &local_err);
 if (local_err) {
 trace_mhp_acpi_dimm_delete_failed(mem_st->selector);
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 0b2cb6e..b2f5b2c 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -33,7 +33,7 @@
 #include "hw/acpi/pcihp.h"
 #include "hw/acpi/cpu_hotplug.h"
 #include "hw/hotplug.h"
-#include "hw/mem/pc-dimm.h"
+#include "hw/mem/dimm.h"
 #include "hw/acpi/memory_hotplug.h"
 #include "hw/acpi/acpi_dev_interface.h"
 #include "hw/xen/xen.h"
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index d6b9fa7..6694b18 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -62,7 +62,7 @@
 #include "hw/boards.h"
 #include "hw/pci/pci_host.h"
 #include "acpi-build.h"
-#include "hw/mem/pc-dimm.h"
+#include "hw/mem/dimm.h"
 #include "qapi/visitor.h"
 #include "qapi-visit.h"
 
diff --git a/hw/mem/Makefile.objs b/hw/mem/Makefile.objs
index b000fb4..7563ef5 100644
--- a/hw/mem/Makefile.objs
+++ b/hw/mem/Makefile.objs
@@ -1 +1 @@
-common-obj-$(CONFIG_MEM_HOTPLUG) += pc-dimm.o
+common-obj-$(CONFIG_MEM_HOTPLUG) += dimm.o
diff --git a/hw/mem/pc-dimm.c b/hw/mem/dimm.c
similarity index 99%
rename from hw/mem/pc-dimm.c
rename to hw/mem/dimm.c
index 51f737f..6c1ea98 100644
--- a/hw/mem/pc-dimm.c
+++ b/hw/mem/dimm.c
@@ -18,7 +18,7 @@
  * License along with this library; if not, see 
  */
 
-#include "hw/mem/pc-dimm.h"
+#include "hw/mem/dimm.h"
 #include "qemu/config-file.h"
 #include "qapi/visitor.h"
 #include "qemu/range.h"
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 4fb91a5..171fa77 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2138,7 +2138,7 @@ static void spapr_machine_device_plug(HotplugHandler *hotplug_dev,
  *
  * - Memory gets hotplugged to a different node than what the user
  *   specified.
- * - Since pc-dimm subsystem in QEMU still thinks that memory belongs
+ * - Since dimm subsystem in QEMU still thinks that memory belongs
  *   to memory-less node, a reboot will set things accordingly
  *   and the previously hotplugged memory now ends in the right node.
  *   This appears as if some memory moved from one node to another.
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 0503485..693b6c5 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -16,7 +16,7 @@
 #include "hw/pci/pci.h"
 #include "hw/boards.h"
 #include "hw/compat.h"
-#include "hw/mem/pc-dimm.h"
+#include "hw/mem/dimm.h"
 
 #define HPET_INTCAP "hpet-intcap"
 
diff --git a/include/hw/mem/pc-dimm.h b

[PATCH v4 06/33] acpi: add aml_method_serialized

2015-10-18 Thread Xiao Guangrong
This avoids an explicit Mutex and will be used by the NVDIMM ACPI code.

Signed-off-by: Xiao Guangrong 
---
 hw/acpi/aml-build.c | 26 --
 include/hw/acpi/aml-build.h |  1 +
 2 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 9f792ab..8bee8b2 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -696,14 +696,36 @@ Aml *aml_while(Aml *predicate)
 }
 
 /* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: DefMethod */
-Aml *aml_method(const char *name, int arg_count)
+static Aml *__aml_method(const char *name, int arg_count, bool serialized)
 {
 Aml *var = aml_bundle(0x14 /* MethodOp */, AML_PACKAGE);
+int methodflags;
+
+/*
+ * MethodFlags:
+ *   bit 0-2: ArgCount (0-7)
+ *   bit 3: SerializeFlag
+ * 0: NotSerialized
+ * 1: Serialized
+ *   bit 4-7: reserved (must be 0)
+ */
+assert(!(arg_count & ~7));
+methodflags = arg_count | (serialized << 3);
 build_append_namestring(var->buf, "%s", name);
-build_append_byte(var->buf, arg_count); /* MethodFlags: ArgCount */
+build_append_byte(var->buf, methodflags);
 return var;
 }
 
+Aml *aml_method(const char *name, int arg_count)
+{
+return __aml_method(name, arg_count, false);
+}
+
+Aml *aml_method_serialized(const char *name, int arg_count)
+{
+return __aml_method(name, arg_count, true);
+}
+
 /* ACPI 1.0b: 16.2.5.2 Named Objects Encoding: DefDevice */
 Aml *aml_device(const char *name_format, ...)
 {
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 5b8a118..00cf40e 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -263,6 +263,7 @@ Aml *aml_qword_memory(AmlDecode dec, AmlMinFixed min_fixed,
 Aml *aml_scope(const char *name_format, ...) GCC_FMT_ATTR(1, 2);
 Aml *aml_device(const char *name_format, ...) GCC_FMT_ATTR(1, 2);
 Aml *aml_method(const char *name, int arg_count);
+Aml *aml_method_serialized(const char *name, int arg_count);
 Aml *aml_if(Aml *predicate);
 Aml *aml_else(void);
 Aml *aml_while(Aml *predicate);
-- 
1.8.3.1
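
The MethodFlags packing described in the comment above can be sketched in isolation as follows. The function name is hypothetical; the bit layout (ArgCount in bits 0-2, SerializeFlag in bit 3) is taken from the comment in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Packs ArgCount (bits 0-2) and SerializeFlag (bit 3) into MethodFlags. */
static uint8_t sketch_method_flags(int arg_count, bool serialized)
{
    assert(arg_count >= 0 && arg_count <= 7);
    return (uint8_t)(arg_count | (serialized ? 1 << 3 : 0));
}
```

For the four-argument serialized NCAL method used by the NVDIMM code this gives 0x0C, against 0x04 for the same method built with aml_method().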



[PATCH v4 09/33] exec: allow file_ram_alloc to work on file

2015-10-18 Thread Xiao Guangrong
Currently, file_ram_alloc() only works on a directory: it creates a file
under @path and mmaps it.

This patch allows it to work on a file directly. If @path is a directory,
it works as before; otherwise it treats @path as the target file and
allocates memory directly from it.

Signed-off-by: Xiao Guangrong 
---
 exec.c | 80 ++
 1 file changed, 51 insertions(+), 29 deletions(-)

diff --git a/exec.c b/exec.c
index d2a3357..09e9938 100644
--- a/exec.c
+++ b/exec.c
@@ -1157,14 +1157,60 @@ void qemu_mutex_unlock_ramlist(void)
 }
 
 #ifdef __linux__
+static bool path_is_dir(const char *path)
+{
+struct stat fs;
+
+return stat(path, &fs) == 0 && S_ISDIR(fs.st_mode);
+}
+
+static int open_file_path(RAMBlock *block, const char *path, size_t size)
+{
+char *filename;
+char *sanitized_name;
+char *c;
+int fd;
+
+if (!path_is_dir(path)) {
+int flags = (block->flags & RAM_SHARED) ? O_RDWR : O_RDONLY;
+
+flags |= O_EXCL;
+return open(path, flags);
+}
+
+/* Make name safe to use with mkstemp by replacing '/' with '_'. */
+sanitized_name = g_strdup(memory_region_name(block->mr));
+for (c = sanitized_name; *c != '\0'; c++) {
+if (*c == '/') {
+*c = '_';
+}
+}
+filename = g_strdup_printf("%s/qemu_back_mem.%s.XXXXXX", path,
+   sanitized_name);
+g_free(sanitized_name);
+fd = mkstemp(filename);
+if (fd >= 0) {
+unlink(filename);
+/*
+ * ftruncate is not supported by hugetlbfs in older
+ * hosts, so don't bother bailing out on errors.
+ * If anything goes wrong with it under other filesystems,
+ * mmap will fail.
+ */
+if (ftruncate(fd, size)) {
+perror("ftruncate");
+}
+}
+g_free(filename);
+
+return fd;
+}
+
 static void *file_ram_alloc(RAMBlock *block,
 ram_addr_t memory,
 const char *path,
 Error **errp)
 {
-char *filename;
-char *sanitized_name;
-char *c;
 void *area;
 int fd;
 uint64_t pagesize;
@@ -1194,38 +1240,14 @@ static void *file_ram_alloc(RAMBlock *block,
 goto error;
 }
 
-/* Make name safe to use with mkstemp by replacing '/' with '_'. */
-sanitized_name = g_strdup(memory_region_name(block->mr));
-for (c = sanitized_name; *c != '\0'; c++) {
-if (*c == '/')
-*c = '_';
-}
-
-filename = g_strdup_printf("%s/qemu_back_mem.%s.XXXXXX", path,
-   sanitized_name);
-g_free(sanitized_name);
+memory = ROUND_UP(memory, pagesize);
 
-fd = mkstemp(filename);
+fd = open_file_path(block, path, memory);
 if (fd < 0) {
 error_setg_errno(errp, errno,
  "unable to create backing store for path %s", path);
-g_free(filename);
 goto error;
 }
-unlink(filename);
-g_free(filename);
-
-memory = ROUND_UP(memory, pagesize);
-
-/*
- * ftruncate is not supported by hugetlbfs in older
- * hosts, so don't bother bailing out on errors.
- * If anything goes wrong with it under other filesystems,
- * mmap will fail.
- */
-if (ftruncate(fd, memory)) {
-perror("ftruncate");
-}
 
 area = qemu_ram_mmap(fd, memory, pagesize, block->flags & RAM_SHARED);
 if (area == MAP_FAILED) {
-- 
1.8.3.1
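
The directory test that decides between the two code paths can be tried on its own. The sketch below is a standalone copy of the same stat()-based check under a hypothetical name. Note that, written this way, a path that does not exist is reported as "not a directory", so the caller would go on to open() it and fail there.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>

/* True only if path exists and is a directory; stat() errors give false. */
static bool sketch_path_is_dir(const char *path)
{
    struct stat st;

    return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}
```

On a typical Linux host this returns true for "/" and false for "/dev/null" or for a path that is missing.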



[PATCH v4 12/33] pc-dimm: remove DEFAULT_PC_DIMMSIZE

2015-10-18 Thread Xiao Guangrong
It's not used any more

Signed-off-by: Xiao Guangrong 
---
 include/hw/mem/pc-dimm.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/hw/mem/pc-dimm.h b/include/hw/mem/pc-dimm.h
index c1ee7b0..15590f1 100644
--- a/include/hw/mem/pc-dimm.h
+++ b/include/hw/mem/pc-dimm.h
@@ -20,8 +20,6 @@
 #include "sysemu/hostmem.h"
 #include "hw/qdev.h"
 
-#define DEFAULT_PC_DIMMSIZE (1024*1024*1024)
-
 #define TYPE_PC_DIMM "pc-dimm"
 #define PC_DIMM(obj) \
 OBJECT_CHECK(PCDIMMDevice, (obj), TYPE_PC_DIMM)
-- 
1.8.3.1



[PATCH v4 26/33] nvdimm acpi: build ACPI nvdimm devices

2015-10-18 Thread Xiao Guangrong
NVDIMM devices are defined in ACPI 6.0, section 9.20 NVDIMM Devices

There is a root device under \_SB, and the specified NVDIMM devices are under
the root device. Each NVDIMM device has an _ADR which returns its handle, used
to associate it with the MEMDEV structure in NFIT

We reserve handle 0 for the root device. In this patch, we save handle, arg0,
arg1 and arg2. Arg3 is conditionally saved in a later patch
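
The handle reservation described above can be sketched in isolation. This is
only an illustration: the slot + 1 mapping and the "NV%02X" name format are
taken from the series, while the function names slot_to_handle() and
slot_to_acpi_name() are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Handle 0 is reserved for the root device, so slot N maps to handle N + 1. */
uint32_t slot_to_handle(int slot)
{
    assert(slot >= 0);
    return (uint32_t)slot + 1;
}

/* Build the ACPI device name: "NV" followed by the slot as two hex digits. */
void slot_to_acpi_name(int slot, char name[5])
{
    assert(slot >= 0 && slot <= 0xFF);
    snprintf(name, 5, "NV%02X", (unsigned)(slot & 0xFF));
}
```

With this mapping no NVDIMM device can collide with the root device, because
the smallest handle handed out to a slot is 1.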

Signed-off-by: Xiao Guangrong 
---
 hw/acpi/nvdimm.c | 216 +++
 1 file changed, 216 insertions(+)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index bc28828..7e99889 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -408,15 +408,38 @@ static void nvdimm_build_nfit(GArray *structures, GArray *table_offsets,
  sizeof(nfit) + structures->len, 1);
 }
 
+/* detailed _DSM design please refer to docs/specs/acpi_nvdimm.txt */
+#define NOTIFY_VALUE  0x99
+
+struct dsm_in {
+uint32_t handle;
+uint32_t revision;
+uint32_t function;
+   /* the remaining size in the page is used by arg3. */
+uint8_t arg3[0];
+} QEMU_PACKED;
+typedef struct dsm_in dsm_in;
+
+struct dsm_out {
+/* the size of buffer filled by QEMU. */
+uint32_t len;
+uint8_t data[0];
+} QEMU_PACKED;
+typedef struct dsm_out dsm_out;
+
 static uint64_t
 nvdimm_dsm_read(void *opaque, hwaddr addr, unsigned size)
 {
+fprintf(stderr, "BUG: we never read DSM notification MMIO.\n");
 return 0;
 }
 
 static void
 nvdimm_dsm_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
 {
+if (val != NOTIFY_VALUE) {
+fprintf(stderr, "BUG: unexpected notify value 0x%" PRIx64 "\n", val);
+}
 }
 
 static const MemoryRegionOps nvdimm_dsm_ops = {
@@ -475,6 +498,196 @@ static MemoryRegion *nvdimm_build_dsm_memory(NVDIMMState *state)
 return dsm_fit_mr;
 }
 
+#define BUILD_STA_METHOD(_dev_, _method_)  \
+do {   \
+_method_ = aml_method("_STA", 0);  \
+aml_append(_method_, aml_return(aml_int(0x0f)));   \
+aml_append(_dev_, _method_);   \
+} while (0)
+
+#define BUILD_DSM_METHOD(_dev_, _method_, _handle_, _errcode_, _uuid_) \
+do {   \
+Aml *ifctx, *uuid; \
+_method_ = aml_method("_DSM", 4);  \
+/* Check that the UUID is the one we expect; return the error code if not. */ \
+uuid = aml_touuid(_uuid_); \
+ifctx = aml_if(aml_lnot(aml_equal(aml_arg(0), uuid))); \
+aml_append(ifctx, aml_return(aml_int(_errcode_))); \
+aml_append(_method_, ifctx); \
+aml_append(_method_, aml_return(aml_call4("NCAL", aml_int(_handle_), \
+   aml_arg(1), aml_arg(2), aml_arg(3)))); \
+aml_append(_dev_, _method_);   \
+} while (0)
+
+#define BUILD_FIELD_UNIT_STRUCT(_field_, _s_, _f_, _name_) \
+aml_append(_field_, aml_named_field(_name_,\
+   sizeof(typeof_field(_s_, _f_)) * BITS_PER_BYTE))
+
+#define BUILD_FIELD_UNIT_SIZE(_field_, _byte_, _name_) \
+aml_append(_field_, aml_named_field(_name_, (_byte_) * BITS_PER_BYTE))
+
+static void build_nvdimm_devices(NVDIMMState *state, GSList *device_list,
+ Aml *root_dev)
+{
+for (; device_list; device_list = device_list->next) {
+NVDIMMDevice *nvdimm = device_list->data;
+int slot = object_property_get_int(OBJECT(nvdimm), DIMM_SLOT_PROP,
+   NULL);
+uint32_t handle = nvdimm_slot_to_handle(slot);
+Aml *dev, *method;
+
+dev = aml_device("NV%02X", slot);
+aml_append(dev, aml_name_decl("_ADR", aml_int(handle)));
+
+BUILD_STA_METHOD(dev, method);
+
+/*
+ * Please refer to DSM specification Chapter 4 _DSM Interface
+ * for NVDIMM Device (non-root) - Example
+ */
+BUILD_DSM_METHOD(dev, method,
+ handle /* NVDIMM Device Handle */,
+ 3 /* Invalid Input Parameters */,
+ "4309AC30-0D11-11E4-9191-0800200C9A66"
+ /* UUID for NVDIMM Devices. */);
+
+aml_append(root_dev, dev);
+}
+}
+
+static void nvdimm_build_acpi_devices(NVDIMMState *state, GSList *device_list,
+  Aml *sb_scope)
+{
+Aml *dev, *method, *field;
+uint64_t page_size = getpagesize();
+int fit_size = nvdimm_device_structure_size(g_slist_length(device_list));
+
+dev = aml_device("NVDR");
+aml_append(dev
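
BUILD_STA_METHOD above makes _STA return 0x0f. A minimal sketch of how that
value decodes, using the bit meanings listed in docs/specs/acpi_nvdimm.txt
from this series. The macro names and sta_is_usable() are hypothetical and
only serve the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* _STA bit positions as listed in docs/specs/acpi_nvdimm.txt. */
#define STA_PRESENT      (1u << 0)
#define STA_ENABLED      (1u << 1)
#define STA_SHOWN_IN_UI  (1u << 2)
#define STA_FUNCTIONING  (1u << 3)
#define STA_BATTERY      (1u << 4)
#define STA_RESERVED     (~0x1fu)   /* bits [31:5] must be cleared */

/* True if the device is present, enabled and functioning properly. */
bool sta_is_usable(uint32_t sta)
{
    uint32_t needed = STA_PRESENT | STA_ENABLED | STA_FUNCTIONING;

    assert((sta & STA_RESERVED) == 0);
    return (sta & needed) == needed;
}
```

0x0f sets bits 0 to 3 (present, enabled, shown in the UI, functioning) and
leaves the battery bit clear.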

[PATCH v4 27/33] nvdimm acpi: save arg3 for NVDIMM device _DSM method

2015-10-18 Thread Xiao Guangrong
Check if the input Arg3 is valid then store it into dsm_in if needed
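
The validity check can be modelled outside of AML as follows. This is only a
sketch under stated assumptions: struct model_object is a hypothetical
stand-in for an ACPI object, and 4 is the ObjectType result for a Package,
which is the value the AML in this patch compares against.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* ObjectType() result for a Package, as tested by the AML in this patch. */
enum { ACPI_OBJECT_TYPE_PACKAGE = 4 };

/* Hypothetical stand-in for an ACPI object: its type and element count. */
struct model_object {
    int type;
    size_t count;
};

/* Arg3 is saved only if it is a Package holding exactly one element. */
bool arg3_should_be_saved(const struct model_object *arg3)
{
    assert(arg3 != NULL);
    return arg3->type == ACPI_OBJECT_TYPE_PACKAGE && arg3->count == 1;
}
```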

Signed-off-by: Xiao Guangrong 
---
 hw/acpi/nvdimm.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 7e99889..b211b8b 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -624,10 +624,29 @@ static void nvdimm_build_acpi_devices(NVDIMMState *state, GSList *device_list,
 
 method = aml_method_serialized("NCAL", 4);
 {
+Aml *ifctx;
+
 aml_append(method, aml_store(aml_arg(0), aml_name("HDLE")));
 aml_append(method, aml_store(aml_arg(1), aml_name("REVS")));
 aml_append(method, aml_store(aml_arg(2), aml_name("FUNC")));
 
+/* Arg3 is passed as Package and it has one element? */
+ifctx = aml_if(aml_and(aml_equal(aml_object_type(aml_arg(3)),
+ aml_int(4)),
+   aml_equal(aml_sizeof(aml_arg(3)),
+ aml_int(1))));
+{
+/* Local0 = Index(Arg3, 0) */
+aml_append(ifctx, aml_store(aml_index(aml_arg(3), aml_int(0)),
+aml_local(0)));
+/* Local3 = DeRefOf(Local0) */
+aml_append(ifctx, aml_store(aml_derefof(aml_local(0)),
+aml_local(3)));
+/* ARG3 = Local3 */
+aml_append(ifctx, aml_store(aml_local(3), aml_name("ARG3")));
+}
+aml_append(method, ifctx);
+
 aml_append(method, aml_store(aml_int(NOTIFY_VALUE), aml_name("NOTI")));
 
 aml_append(method, aml_store(aml_name("RLEN"), aml_local(6)));
-- 
1.8.3.1



[PATCH v4 31/33] nvdimm acpi: support DSM_DEV_FUN_SET_NAMESPACE_LABEL_DATA

2015-10-18 Thread Xiao Guangrong
Function 6 is used to set Namespace Label Data

Signed-off-by: Xiao Guangrong 
---
 hw/acpi/nvdimm.c | 46 ++
 1 file changed, 46 insertions(+)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 1683a82..838a57e 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -529,6 +529,11 @@ struct cmd_out_get_label_data {
 } QEMU_PACKED;
 typedef struct cmd_out_get_label_data cmd_out_get_label_data;
 
+struct cmd_out_set_label_data {
+uint32_t status;
+};
+typedef struct cmd_out_set_label_data cmd_out_set_label_data;
+
 struct dsm_out {
 /* the size of buffer filled by QEMU. */
 uint32_t len;
@@ -537,6 +542,7 @@ struct dsm_out {
 cmd_out_implemented cmd_implemented;
 cmd_out_label_size cmd_label_size;
 cmd_out_get_label_data cmd_get_label_data;
+cmd_out_set_label_data cmd_set_label_data;
 };
 } QEMU_PACKED;
 typedef struct dsm_out dsm_out;
@@ -666,6 +672,43 @@ exit:
 nvdimm_dsm_write_status(out, status);
 }
 
+/*
+ * please refer to DSM specification 4.6 Set Namespace Label Data
+ * (Function Index 6).
+ */
+static void
+nvdimm_dsm_cmd_set_label_data(NVDIMMDevice *nvdimm, dsm_in *in, GArray *out)
+{
+cmd_in_set_label_data *cmd_in = &in->cmd_set_label_data;
+uint32_t status;
+
+le32_to_cpus(&cmd_in->offset);
+le32_to_cpus(&cmd_in->length);
+
+nvdimm_debug("Write Label Data: offset %#x length %#x.\n",
+ cmd_in->offset, cmd_in->length);
+if (nvdimm->label_size < cmd_in->offset + cmd_in->length) {
+nvdimm_debug("position %#x is beyond label data (len = %#lx).\n",
+ cmd_in->offset + cmd_in->length, nvdimm->label_size);
+status = DSM_DEV_STATUS_INVALID_PARAS;
+goto exit;
+}
+
+if (cmd_in->length > nvdimm_get_max_xfer_label_size()) {
+nvdimm_debug("set length (%#x) is larger than max_xfer (%#x).\n",
+ cmd_in->length, nvdimm_get_max_xfer_label_size());
+status = DSM_DEV_STATUS_INVALID_PARAS;
+goto exit;
+}
+
+status = DSM_STATUS_SUCCESS;
+memcpy(nvdimm->label_data + cmd_in->offset, cmd_in->in_buf,
+   cmd_in->length);
+
+exit:
+nvdimm_dsm_write_status(out, status);
+}
+
 static void nvdimm_dsm_write_nvdimm(dsm_in *in, GArray *out)
 {
 GSList *list = nvdimm_get_plugged_device_list();
@@ -689,6 +732,9 @@ static void nvdimm_dsm_write_nvdimm(dsm_in *in, GArray *out)
 case DSM_DEV_FUN_GET_NAMESPACE_LABEL_DATA:
 nvdimm_dsm_cmd_get_label_data(nvdimm, in, out);
 goto free;
+case DSM_DEV_FUN_SET_NAMESPACE_LABEL_DATA:
+nvdimm_dsm_cmd_set_label_data(nvdimm, in, out);
+goto free;
 default:
 status = DSM_STATUS_NOT_SUPPORTED;
 };
-- 
1.8.3.1



[PATCH v4 21/33] nvdimm: implement NVDIMM device abstract

2015-10-18 Thread Xiao Guangrong
Introduce the "nvdimm" device, which is based on the dimm device type

A 128K memory region, the minimum namespace label size required by the
NVDIMM Namespace Spec, is reserved for label data at the end of the
backend memory device

We can use "-m 1G,maxmem=100G,slots=10 -object memory-backend-file,
id=mem1,size=1G,mem-path=/dev/pmem0 -device nvdimm,memdev=mem1" to
create an NVDIMM device for the guest

Signed-off-by: Xiao Guangrong 
---
 default-configs/i386-softmmu.mak   |  1 +
 default-configs/x86_64-softmmu.mak |  1 +
 hw/acpi/memory_hotplug.c   |  6 +++
 hw/mem/Makefile.objs   |  1 +
 hw/mem/nvdimm.c| 84 ++
 include/hw/mem/nvdimm.h| 66 ++
 6 files changed, 159 insertions(+)
 create mode 100644 hw/mem/nvdimm.c
 create mode 100644 include/hw/mem/nvdimm.h

diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
index 3ece8bb..a1b24e5 100644
--- a/default-configs/i386-softmmu.mak
+++ b/default-configs/i386-softmmu.mak
@@ -47,6 +47,7 @@ CONFIG_APIC=y
 CONFIG_IOAPIC=y
 CONFIG_PVPANIC=y
 CONFIG_MEM_HOTPLUG=y
+CONFIG_NVDIMM=y
 CONFIG_XIO3130=y
 CONFIG_IOH3420=y
 CONFIG_I82801B11=y
diff --git a/default-configs/x86_64-softmmu.mak b/default-configs/x86_64-softmmu.mak
index 92ea7c1..e3f5a0b 100644
--- a/default-configs/x86_64-softmmu.mak
+++ b/default-configs/x86_64-softmmu.mak
@@ -47,6 +47,7 @@ CONFIG_APIC=y
 CONFIG_IOAPIC=y
 CONFIG_PVPANIC=y
 CONFIG_MEM_HOTPLUG=y
+CONFIG_NVDIMM=y
 CONFIG_XIO3130=y
 CONFIG_IOH3420=y
 CONFIG_I82801B11=y
diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c
index e232641..92cd973 100644
--- a/hw/acpi/memory_hotplug.c
+++ b/hw/acpi/memory_hotplug.c
@@ -1,6 +1,7 @@
 #include "hw/acpi/memory_hotplug.h"
 #include "hw/acpi/pc-hotplug.h"
 #include "hw/mem/dimm.h"
+#include "hw/mem/nvdimm.h"
 #include "hw/boards.h"
 #include "hw/qdev-core.h"
 #include "trace.h"
@@ -231,6 +232,11 @@ void acpi_memory_plug_cb(ACPIREGS *ar, qemu_irq irq, MemHotplugState *mem_st,
 {
 MemStatus *mdev;
 
+/* NVDIMM hotplug is not supported yet. */
+if (object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM)) {
+return;
+}
+
 mdev = acpi_memory_slot_status(mem_st, dev, errp);
 if (!mdev) {
 return;
diff --git a/hw/mem/Makefile.objs b/hw/mem/Makefile.objs
index cebb4b1..12d9b72 100644
--- a/hw/mem/Makefile.objs
+++ b/hw/mem/Makefile.objs
@@ -1,2 +1,3 @@
 common-obj-$(CONFIG_DIMM) += dimm.o
 common-obj-$(CONFIG_MEM_HOTPLUG) += pc-dimm.o
+common-obj-$(CONFIG_NVDIMM) += nvdimm.o
diff --git a/hw/mem/nvdimm.c b/hw/mem/nvdimm.c
new file mode 100644
index 0000000..51494b6
--- /dev/null
+++ b/hw/mem/nvdimm.c
@@ -0,0 +1,84 @@
+/*
+ * Non-Volatile Dual In-line Memory Module Virtualization Implementation
+ *
+ * Copyright(C) 2015 Intel Corporation.
+ *
+ * Author:
+ *  Xiao Guangrong 
+ *
+ * Currently, it only supports PMEM Virtualization.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#include "qapi/visitor.h"
+#include "hw/mem/nvdimm.h"
+
+static MemoryRegion *nvdimm_get_memory_region(DIMMDevice *dimm)
+{
+NVDIMMDevice *nvdimm = NVDIMM(dimm);
+
+return memory_region_size(&nvdimm->nvdimm_mr) ? &nvdimm->nvdimm_mr : NULL;
+}
+
+static void nvdimm_realize(DIMMDevice *dimm, Error **errp)
+{
+MemoryRegion *mr;
+NVDIMMDevice *nvdimm = NVDIMM(dimm);
+uint64_t size;
+
+nvdimm->label_size = MIN_NAMESPACE_LABEL_SIZE;
+
+mr = host_memory_backend_get_memory(dimm->hostmem, errp);
+size = memory_region_size(mr);
+
+if (size <= nvdimm->label_size) {
+char *path = object_get_canonical_path_component(OBJECT(dimm->hostmem));
+error_setg(errp, "the size of memdev %s (0x%" PRIx64 ") is too small"
+   " to contain nvdimm namespace label (0x%" PRIx64 ")", path,
+   memory_region_size(mr), nvdimm->label_size);
+return;
+}
+
+memory_region_init_alias(&nvdimm->nvdimm_mr, OBJECT(dimm), "nvdimm-memory",
+ mr, 0, size - nvdimm->label_size);
+nvdimm->label_data = memory_region_get_ram_ptr(mr) +
+ memory_region_size(&nvdimm->nvdimm_mr);
+}
+
+static void nvdimm_class_init(ObjectClass *oc, void *data)
+{
+DeviceClass *dc = DEVICE_CLASS(oc);
+DIMMDeviceClass *ddc = DIMM_C

[PATCH v4 23/33] nvdimm acpi: init the address region used by NVDIMM ACPI

2015-10-18 Thread Xiao Guangrong
We reserve the memory region 0xFF0 ~ 0xFFF0 for NVDIMM ACPI,
which is used as follows:
- the first page is mapped as MMIO; ACPI writes data to this page to
  transfer control to QEMU

- the second page is RAM-based; it is used to save the input info of the
  _DSM method, and QEMU reuses it to store the output info

- the rest is mapped as RAM; it is the buffer returned by the _FIT method,
  which is needed by NVDIMM hotplug
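
The three areas listed above can be sketched as a classification of offsets
relative to the start of the reserved region. This is a sketch under stated
assumptions: a 4 KiB page size is assumed, and the enum and function names
are hypothetical, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define ACPI_PAGE_SIZE 4096u   /* assumption: 4 KiB pages */

enum nvdimm_acpi_area {
    AREA_NOTIFY_MMIO,  /* first page: a write transfers control to QEMU */
    AREA_DSM_RAM,      /* second page: _DSM input, then QEMU's output */
    AREA_FIT_BUFFER,   /* remainder: buffer returned by _FIT */
};

/* Classify an offset measured from the base of the reserved region. */
enum nvdimm_acpi_area nvdimm_acpi_area_of(uint64_t offset,
                                          uint64_t region_size)
{
    assert(offset < region_size);
    if (offset < ACPI_PAGE_SIZE) {
        return AREA_NOTIFY_MMIO;
    }
    if (offset < 2 * ACPI_PAGE_SIZE) {
        return AREA_DSM_RAM;
    }
    return AREA_FIT_BUFFER;
}
```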

Signed-off-by: Xiao Guangrong 
---
 hw/acpi/Makefile.objs   |   1 +
 hw/acpi/nvdimm.c| 143 
 hw/i386/pc.c|   2 +
 include/hw/i386/pc.h|   2 +
 include/hw/mem/nvdimm.h |  18 ++
 5 files changed, 166 insertions(+)
 create mode 100644 hw/acpi/nvdimm.c

diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
index 7d3230c..80426b4 100644
--- a/hw/acpi/Makefile.objs
+++ b/hw/acpi/Makefile.objs
@@ -2,6 +2,7 @@ common-obj-$(CONFIG_ACPI_X86) += core.o piix4.o pcihp.o
 common-obj-$(CONFIG_ACPI_X86_ICH) += ich9.o tco.o
 common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu_hotplug.o
 common-obj-$(CONFIG_ACPI_MEMORY_HOTPLUG) += memory_hotplug.o
+common-obj-$(CONFIG_NVDIMM) += nvdimm.o
 common-obj-$(CONFIG_ACPI) += acpi_interface.o
 common-obj-$(CONFIG_ACPI) += bios-linker-loader.o
 common-obj-$(CONFIG_ACPI) += aml-build.o
diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
new file mode 100644
index 0000000..fd70de2
--- /dev/null
+++ b/hw/acpi/nvdimm.c
@@ -0,0 +1,143 @@
+/*
+ * NVDIMM ACPI Implementation
+ *
+ * Copyright(C) 2015 Intel Corporation.
+ *
+ * Author:
+ *  Xiao Guangrong 
+ *
+ * NFIT is defined in ACPI 6.0: 5.2.25 NVDIMM Firmware Interface Table (NFIT)
+ * and the DSM specification can be found at:
+ *   http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
+ *
+ * Currently, it only supports PMEM Virtualization.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#include "qemu-common.h"
+#include "hw/acpi/acpi.h"
+#include "hw/acpi/aml-build.h"
+#include "hw/mem/nvdimm.h"
+
+/*
+ * System Physical Address Range Structure
+ *
+ * It describes the system physical address ranges occupied by NVDIMMs and
+ * the types of the regions.
+ */
+struct nfit_spa {
+uint16_t type;
+uint16_t length;
+uint16_t spa_index;
+uint16_t flags;
+uint32_t reserved;
+uint32_t proximity_domain;
+uint8_t type_guid[16];
+uint64_t spa_base;
+uint64_t spa_length;
+uint64_t mem_attr;
+} QEMU_PACKED;
+typedef struct nfit_spa nfit_spa;
+
+/*
+ * Memory Device to System Physical Address Range Mapping Structure
+ *
+ * It enables identifying each NVDIMM region and the corresponding SPA
+ * describing the memory interleave
+ */
+struct nfit_memdev {
+uint16_t type;
+uint16_t length;
+uint32_t nfit_handle;
+uint16_t phys_id;
+uint16_t region_id;
+uint16_t spa_index;
+uint16_t dcr_index;
+uint64_t region_len;
+uint64_t region_offset;
+uint64_t region_dpa;
+uint16_t interleave_index;
+uint16_t interleave_ways;
+uint16_t flags;
+uint16_t reserved;
+} QEMU_PACKED;
+typedef struct nfit_memdev nfit_memdev;
+
+/*
+ * NVDIMM Control Region Structure
+ *
+ * It describes the NVDIMM and if applicable, Block Control Window.
+ */
+struct nfit_dcr {
+uint16_t type;
+uint16_t length;
+uint16_t dcr_index;
+uint16_t vendor_id;
+uint16_t device_id;
+uint16_t revision_id;
+uint16_t sub_vendor_id;
+uint16_t sub_device_id;
+uint16_t sub_revision_id;
+uint8_t reserved[6];
+uint32_t serial_number;
+uint16_t fic;
+uint16_t num_bcw;
+uint64_t bcw_size;
+uint64_t cmd_offset;
+uint64_t cmd_size;
+uint64_t status_offset;
+uint64_t status_size;
+uint16_t flags;
+uint8_t reserved2[6];
+} QEMU_PACKED;
+typedef struct nfit_dcr nfit_dcr;
+
+/*
+ * calculate the size of structures which describe all NVDIMM devices.
+ * Currently each device has three structures as only PMEM is supported
+ * now.
+ */
+static uint64_t nvdimm_device_structure_size(uint64_t slots)
+{
+return slots * (sizeof(nfit_spa) + sizeof(nfit_memdev) + sizeof(nfit_dcr));
+}
+
+/*
+ * calculate the size of the memory used to implement NVDIMM ACPI operations
+ * which include:
+ * - _DSM method: it needs two pages to transfer control and data between
+ *   Guest ACPI and QEMU.
+ *
+ * - _FIT method: it returns a buffer to G

[PATCH v4 24/33] nvdimm acpi: build ACPI NFIT table

2015-10-18 Thread Xiao Guangrong
NFIT is defined in ACPI 6.0: 5.2.25 NVDIMM Firmware Interface Table (NFIT)

Currently, we only support PMEM mode. Each device has 3 structures:
- SPA structure, which defines the PMEM region info

- MEM DEV structure, which has the @handle used to associate it with the
  specified ACPI NVDIMM device we will introduce in a later patch.
  We can also safely ignore the memory device's interleave, since the real
  nvdimm hardware access is hidden behind the host

- DCR structure, which defines the vendor ID used to associate it with the
  specified vendor nvdimm driver. Since we only implement PMEM mode this
  time, the Command window and Data window are not needed

Signed-off-by: Xiao Guangrong 
---
 hw/acpi/nvdimm.c| 286 
 hw/i386/acpi-build.c|  10 ++
 hw/mem/nvdimm.c |  24 
 include/hw/mem/nvdimm.h |  13 +++
 4 files changed, 333 insertions(+)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index fd70de2..8d8376c 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -31,6 +31,72 @@
 #include "hw/acpi/aml-build.h"
 #include "hw/mem/nvdimm.h"
 
+#define NVDIMM_UUID_LE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7) \
+   { (a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff, ((a) >> 24) & 0xff, \
+ (b) & 0xff, ((b) >> 8) & 0xff, (c) & 0xff, ((c) >> 8) & 0xff,  \
+ (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) }
+
+/*
+ * This GUID defines a Byte Addressable Persistent Memory (PM) Region.
+ * Please refer to ACPI 6.0: 5.2.25.1 System Physical Address Range
+ * Structure.
+ */
+static const uint8_t nfit_spa_uuid_pm[] = NVDIMM_UUID_LE(0x66f0d379, 0xb4f3,
+0x4074, 0xac, 0x43, 0x0d, 0x33, 0x18, 0xb7, 0x8c, 0xdb);
+
+/* NFIT Structure Types. */
+enum {
+NFIT_STRUCTURE_SPA = 0,
+NFIT_STRUCTURE_MEMDEV = 1,
+NFIT_STRUCTURE_IDT = 2,
+NFIT_STRUCTURE_SMBIOS = 3,
+NFIT_STRUCTURE_DCR = 4,
+NFIT_STRUCTURE_BDW = 5,
+NFIT_STRUCTURE_FLUSH = 6,
+};
+
+/*
+ * NVDIMM Firmware Interface Table
+ * @signature: "NFIT"
+ *
+ * It provides information that allows OSPM to enumerate NVDIMM present in
+ * the platform and associate system physical address ranges created by the
+ * NVDIMMs.
+ *
+ * Detailed info please refer to ACPI 6.0: 5.2.25 NVDIMM Firmware Interface
+ * Table (NFIT)
+ */
+struct nfit {
+ACPI_TABLE_HEADER_DEF
+uint32_t reserved;
+} QEMU_PACKED;
+typedef struct nfit nfit;
+
+/*
+ * Memory mapping attributes for the address range described in system
+ * physical address range structure.
+ */
+enum {
+EFI_MEMORY_UC = 0x1ULL,
+EFI_MEMORY_WC = 0x2ULL,
+EFI_MEMORY_WT = 0x4ULL,
+EFI_MEMORY_WB = 0x8ULL,
+EFI_MEMORY_UCE = 0x10ULL,
+EFI_MEMORY_WP = 0x1000ULL,
+EFI_MEMORY_RP = 0x2000ULL,
+EFI_MEMORY_XP = 0x4000ULL,
+EFI_MEMORY_NV = 0x8000ULL,
+EFI_MEMORY_MORE_RELIABLE = 0x10000ULL,
+};
+
+/*
+ * Control region is strictly for management during hot add/online
+ * operation.
+ */
+#define SPA_FLAGS_ADD_ONLINE_ONLY (1)
+/* Data in Proximity Domain field is valid. */
+#define SPA_FLAGS_PROXIMITY_VALID (1 << 1)
+
 /*
  * System Physical Address Range Structure
  *
@@ -76,6 +142,14 @@ struct nfit_memdev {
 typedef struct nfit_memdev nfit_memdev;
 
 /*
+ * please refer to DSM specification, Chapter 2 NVDIMM Device Specific
+ * Method (DSM).
+ */
+#define REVSISON_ID1
+/* the format interface code supported by DSM specification. */
+#define NFIT_FIC1  0x201
+
+/*
  * NVDIMM Control Region Structure
  *
  * It describes the NVDIMM and if applicable, Block Control Window.
@@ -141,3 +215,215 @@ void nvdimm_init_memory_state(NVDIMMState *state, MemoryRegion *system_memory,
NVDIMM_ACPI_MEM_SIZE);
 memory_region_add_subregion(system_memory, state->base, &state->mr);
 }
+
+/*
+ * Module serial number is a unique number for each device. We use the
+ * slot id of NVDIMM device to generate this number so that each device
+ * associates with a different number.
+ *
+ * 0x123456 is a magic number we arbitrarily chose.
+ */
+static uint32_t nvdimm_slot_to_sn(int slot)
+{
+return 0x123456 + slot;
+}
+
+/*
+ * handle is used to uniquely associate nfit_memdev structure with NVDIMM
+ * ACPI device - nfit_memdev.nfit_handle matches with the value returned
+ * by ACPI device _ADR method.
+ *
+ * We generate the handle with the slot id of NVDIMM device and reserve
+ * 0 for NVDIMM root device.
+ */
+static uint32_t nvdimm_slot_to_handle(int slot)
+{
+return slot + 1;
+}
+
+/*
+ * index uniquely identifies the structure, 0 is reserved which indicates
+ * that the structure is not valid or the associated structure is not
+ * present.
+ *
+ * Each NVDIMM device needs two indexes, one for nfit_spa and another for
+ * nfit_dcr which are generated by the slot id of NVDIMM device.
+ */
+static uint16_t nvdimm_slot_to_spa_index(int slot)
+{
+return (slot + 1) << 1;
+}
+
+/* See the comment of nvdimm_slot_to_spa_index(). */
+static uint32_t nvdimm_slot_to_dcr_index(int s
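
The byte order produced by NVDIMM_UUID_LE in the patch above can be checked
with a small standalone sketch. The macro and the GUID values are copied from
the patch; pm_region_guid and guid_first_field() are hypothetical names added
for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same macro as in the patch: the first three fields are little-endian. */
#define NVDIMM_UUID_LE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)             \
   { (a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff, ((a) >> 24) & 0xff, \
     (b) & 0xff, ((b) >> 8) & 0xff, (c) & 0xff, ((c) >> 8) & 0xff,          \
     (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) }

/* GUID of a byte addressable persistent memory region, as in the patch. */
const uint8_t pm_region_guid[16] = NVDIMM_UUID_LE(0x66f0d379, 0xb4f3,
        0x4074, 0xac, 0x43, 0x0d, 0x33, 0x18, 0xb7, 0x8c, 0xdb);

/* Read the first GUID field back as a little-endian 32-bit value. */
uint32_t guid_first_field(const uint8_t guid[16])
{
    assert(guid != NULL);
    return (uint32_t)guid[0] | (uint32_t)guid[1] << 8 |
           (uint32_t)guid[2] << 16 | (uint32_t)guid[3] << 24;
}
```

The first three fields are stored least significant byte first, while the
trailing eight bytes are stored in the order written.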

[PATCH v4 33/33] nvdimm: add maintain info

2015-10-18 Thread Xiao Guangrong
Add NVDIMM maintainer

Signed-off-by: Xiao Guangrong 
---
 MAINTAINERS | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 9bde832..cf259f9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -868,6 +868,13 @@ M: Jiri Pirko 
 S: Maintained
 F: hw/net/rocker/
 
+NVDIMM
+M: Xiao Guangrong 
+S: Maintained
+F: hw/acpi/nvdimm.c
+F: hw/mem/nvdimm.c
+F: include/hw/mem/nvdimm.h
+
 Subsystems
 --
 Audio
-- 
1.8.3.1



[PATCH v4 22/33] docs: add NVDIMM ACPI documentation

2015-10-18 Thread Xiao Guangrong
It describes the basic concepts of NVDIMM ACPI and the interface
between QEMU and the ACPI BIOS

Signed-off-by: Xiao Guangrong 
---
 docs/specs/acpi_nvdimm.txt | 154 +
 1 file changed, 154 insertions(+)
 create mode 100644 docs/specs/acpi_nvdimm.txt

diff --git a/docs/specs/acpi_nvdimm.txt b/docs/specs/acpi_nvdimm.txt
new file mode 100644
index 0000000..feafa93
--- /dev/null
+++ b/docs/specs/acpi_nvdimm.txt
@@ -0,0 +1,154 @@
+QEMU<->ACPI BIOS NVDIMM interface
+---------------------------------
+
+QEMU supports NVDIMM via ACPI. This document describes the basic concepts of
+NVDIMM ACPI and the interface between QEMU and the ACPI BIOS.
+
+NVDIMM ACPI Background
+----------------------
+NVDIMM is introduced in ACPI 6.0 which defines an NVDIMM root device under
+_SB scope with a _HID of “ACPI0012”. For each NVDIMM present or intended
+to be supported by platform, platform firmware also exposes an ACPI
+Namespace Device under the root device.
+
+The NVDIMM child devices under the NVDIMM root device are defined with _ADR
+corresponding to the NFIT device handle. The NVDIMM root device and the
+NVDIMM devices can have device specific methods (_DSM) to provide additional
+functions specific to a particular NVDIMM implementation.
+
+This is an example from ACPI 6.0, a platform contains one NVDIMM:
+
+Scope (\_SB){
+   Device (NVDR) // Root device
+   {
+  Name (_HID, "ACPI0012")
+  Method (_STA) {...}
+  Method (_FIT) {...}
+  Method (_DSM, ...) {...}
+  Device (NVD)
+  {
+ Name(_ADR, h) //where h is NFIT Device Handle for this NVDIMM
+ Method (_DSM, ...) {...}
+  }
+   }
+}
+
+Methods supported on both the NVDIMM root device and NVDIMM devices are:
+1) _STA(Status)
+   It returns the current status of a device, which can be one of the
+   following: enabled, disabled, or removed.
+
+   Arguments: None
+
+   Return Value:
+   It returns an Integer which is defined as follows:
+   Bit [0] – Set if the device is present.
+   Bit [1] – Set if the device is enabled and decoding its resources.
+   Bit [2] – Set if the device should be shown in the UI.
+   Bit [3] – Set if the device is functioning properly (cleared if device
+ failed its diagnostics).
+   Bit [4] – Set if the battery is present.
+   Bits [31:5] – Reserved (must be cleared).
+
+2) _DSM (Device Specific Method)
+   It is a control method that enables devices to provide device specific
+   control functions that are consumed by the device driver.
+   The NVDIMM DSM specification can be found at:
+http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
+
+   Arguments:
+   Arg0 – A Buffer containing a UUID (16 Bytes)
+   Arg1 – An Integer containing the Revision ID (4 Bytes)
+   Arg2 – An Integer containing the Function Index (4 Bytes)
+   Arg3 – A package containing parameters for the function specified by the
+  UUID, Revision ID, and Function Index
+
+   Return Value:
+   If Function Index = 0, a Buffer containing a function index bitfield.
+   Otherwise, the return value and type depends on the UUID, revision ID
+   and function index which are described in the DSM specification.
+
+Methods on NVDIMM ROOT Device
+_FIT(Firmware Interface Table)
+   It evaluates to a buffer returning data in the format of a series of NFIT
+   Type Structure.
+
+   Arguments: None
+
+   Return Value:
+   A Buffer containing a list of NFIT Type structure entries.
+
+   The detailed definition of the structure can be found at ACPI 6.0: 5.2.25
+   NVDIMM Firmware Interface Table (NFIT).
+
+QEMU NVDIMM Implementation
+--------------------------
+QEMU reserves the memory region, 0xFF0 ~ 0xFFF0, for NVDIMM ACPI.
+
+0xFF0 - 0xFF00FFF:
+   The first page of the region is MMIO-based, which means any access to this
+   region will be emulated by QEMU. ACPI uses it to transfer control from
+   guest to QEMU.
+
+   Write Access:
+   [0xFF0 - 0xFF3]: 4 bytes, ACPI writes 0x99 to it to transfer
+control to QEMU.
+
+0xFF01000 - 0xFF01FFF:
+   The second page of the region is RAM-based and is used to transfer
+   data between the _DSM method and QEMU. If ACPI has control, this page is
+   owned by ACPI, which writes _DSM input data to it; otherwise, it is owned
+   by QEMU, which emulates _DSM access and writes the output data to it.
+
+   ACPI Writes _DSM Input Data:
+   [0xFF01000 - 0xFF01003]: 4 bytes, NVDIMM Device Handle, 0 is reserved
+for NVDIMM Root device.
+   [0xFF01004 - 0xFF01007]: 4 bytes, Revision ID, that is the Arg1 of _DSM
+method.
+   [0xFF01008 - 0xFF0100B]: 4 bytes, Function Index, that is the Arg2 of
+_DSM method.
+   [0xFF0100C - 0xFF01FFF]: 4084 bytes, the Arg3 of _DSM method
+
+   QEMU Writes Output Data
+   [0xFF01000 - 0xFF01003]: 4 bytes, @buffer-size, see below.
+   [0xFF01004 - @buffer-size]: the size is depends on
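
The _DSM input layout described above (three 4-byte fields followed by 4084
bytes for Arg3) can be written down as a C structure. This is only a sketch,
assuming a 4096-byte page; struct dsm_in_page and dsm_arg3_capacity() are
hypothetical names, loosely modelled on the dsm_in structure in the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DSM_PAGE_SIZE 4096u   /* assumption: the second page is 4 KiB */

/* Input side of the shared page, as written by ACPI. */
struct dsm_in_page {
    uint32_t handle;     /* NVDIMM device handle, 0 for the root device */
    uint32_t revision;   /* Arg1 of _DSM */
    uint32_t function;   /* Arg2 of _DSM */
    uint8_t arg3[DSM_PAGE_SIZE - 3 * sizeof(uint32_t)];  /* Arg3 payload */
};

/* Number of bytes left in the page for Arg3. */
size_t dsm_arg3_capacity(void)
{
    /* The structure is expected to fill the page exactly. */
    assert(sizeof(struct dsm_in_page) == DSM_PAGE_SIZE);
    return sizeof(((struct dsm_in_page *)0)->arg3);
}
```

The 12 bytes of header leave 4084 bytes for Arg3, matching the size given in
the table above.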

[PATCH v4 19/33] dimm: keep the state of the whole backend memory

2015-10-18 Thread Xiao Guangrong
QEMU keeps the state of the memory of a dimm device during live migration.
However, this is not enough for an nvdimm device, as its memory does not
contain its label data, so we should protect the whole backend memory
instead

Signed-off-by: Xiao Guangrong 
---
 hw/mem/dimm.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/hw/mem/dimm.c b/hw/mem/dimm.c
index 9e0403a..478cacd 100644
--- a/hw/mem/dimm.c
+++ b/hw/mem/dimm.c
@@ -135,9 +135,16 @@ void dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
 }
 
 memory_region_add_subregion(&hpms->mr, addr - hpms->base, mr);
-vmstate_register_ram(mr, dev);
 numa_set_mem_node_id(addr, memory_region_size(mr), dimm->node);
 
+/*
+ * Saving the state of @mr alone is not enough, as it does not contain
+ * the label data of the NVDIMM device, so we keep the state of the
+ * whole hostmem instead.
+ */
+vmstate_register_ram(host_memory_backend_get_memory(dimm->hostmem, errp),
+ dev);
+
 out:
 error_propagate(errp, local_err);
 }
@@ -146,10 +153,13 @@ void dimm_memory_unplug(DeviceState *dev, MemoryHotplugState *hpms,
MemoryRegion *mr)
 {
 DIMMDevice *dimm = DIMM(dev);
+MemoryRegion *backend_mr;
+
+backend_mr = host_memory_backend_get_memory(dimm->hostmem, &error_abort);
 
 numa_unset_mem_node_id(dimm->addr, memory_region_size(mr), dimm->node);
 memory_region_del_subregion(&hpms->mr, mr);
-vmstate_unregister_ram(mr, dev);
+vmstate_unregister_ram(backend_mr, dev);
 }
 
 int qmp_dimm_device_list(Object *obj, void *opaque)
-- 
1.8.3.1



[PATCH v4 20/33] dimm: introduce realize callback

2015-10-18 Thread Xiao Guangrong
nvdimm needs to check that the backend memory is large enough to contain
label data and to init its memory region when the device is realized, so
introduce a realize callback which is called after the common dimm has been
realized
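
The optional-callback pattern described above can be sketched without QOM.
This is an illustration only: the types and functions below are hypothetical
simplifications, with a plain bool standing in for QEMU's Error reporting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct dimm_device;

struct dimm_class {
    /* Optional hook; NULL means the subclass needs no extra setup. */
    void (*realize)(struct dimm_device *dimm, bool *failed);
};

struct dimm_device {
    const struct dimm_class *klass;
    bool has_backend;      /* stands in for the memdev property check */
    bool subclass_ready;   /* set by the subclass hook */
};

/* Example subclass hook, standing in for an nvdimm-style realize. */
void nvdimm_like_realize(struct dimm_device *dimm, bool *failed)
{
    (void)failed;
    dimm->subclass_ready = true;
}

/* Common realize: run the shared checks first, then the subclass hook. */
void dimm_realize_common(struct dimm_device *dimm, bool *failed)
{
    assert(dimm != NULL && dimm->klass != NULL && failed != NULL);
    *failed = false;
    if (!dimm->has_backend) {
        *failed = true;    /* common validation failed; skip the hook */
        return;
    }
    if (dimm->klass->realize) {
        dimm->klass->realize(dimm, failed);
    }
}
```

Running the hook only after the common checks means a subclass can rely on
the backend being present when its callback runs.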

Signed-off-by: Xiao Guangrong 
---
 hw/mem/dimm.c | 5 +
 include/hw/mem/dimm.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/hw/mem/dimm.c b/hw/mem/dimm.c
index 478cacd..3d06cb9 100644
--- a/hw/mem/dimm.c
+++ b/hw/mem/dimm.c
@@ -429,6 +429,7 @@ static void dimm_init(Object *obj)
 static void dimm_realize(DeviceState *dev, Error **errp)
 {
 DIMMDevice *dimm = DIMM(dev);
+DIMMDeviceClass *ddc = DIMM_GET_CLASS(dimm);
 
 if (!dimm->hostmem) {
 error_setg(errp, "'" DIMM_MEMDEV_PROP "' property is not set");
@@ -441,6 +442,10 @@ static void dimm_realize(DeviceState *dev, Error **errp)
dimm->node, nb_numa_nodes ? nb_numa_nodes : 1);
 return;
 }
+
+if (ddc->realize) {
+ddc->realize(dimm, errp);
+}
 }
 
 static void dimm_class_init(ObjectClass *oc, void *data)
diff --git a/include/hw/mem/dimm.h b/include/hw/mem/dimm.h
index 84a62ed..663288d 100644
--- a/include/hw/mem/dimm.h
+++ b/include/hw/mem/dimm.h
@@ -65,6 +65,7 @@ typedef struct DIMMDeviceClass {
 DeviceClass parent_class;
 
 /* public */
+void (*realize)(DIMMDevice *dimm, Error **errp);
 MemoryRegion *(*get_memory_region)(DIMMDevice *dimm);
 } DIMMDeviceClass;
 
-- 
1.8.3.1
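
For readers following along, the hook pattern this patch adds can be sketched
outside of QEMU roughly as below. This is only an illustration under stated
assumptions: Dev, DevClass, dev_realize() and nv_realize() are hypothetical
stand-ins and not QEMU types, and the "backend must be larger than the label
area" rule is an assumed example of a subclass check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for DIMMDevice / DIMMDeviceClass (not QEMU types). */
typedef struct Dev Dev;

typedef struct DevClass {
    /* Optional subclass hook; NULL means "nothing extra to do". */
    void (*realize)(Dev *dev, const char **err);
} DevClass;

struct Dev {
    const DevClass *cls;
    size_t backend_size;   /* size of the backing memory, in bytes */
    size_t label_size;     /* bytes reserved for label data (assumed field) */
};

/* Common realize: run the shared checks first, then the subclass hook. */
static void dev_realize(Dev *dev, const char **err)
{
    assert(dev != NULL && dev->cls != NULL && err != NULL);

    *err = NULL;
    if (dev->backend_size == 0) {
        *err = "backend memory is not set";
        return;
    }
    if (dev->cls->realize) {
        dev->cls->realize(dev, err);
    }
}

/* Subclass hook modeled on the nvdimm case: the backend must hold the labels. */
static void nv_realize(Dev *dev, const char **err)
{
    if (dev->backend_size <= dev->label_size) {
        *err = "backend memory is too small to contain label data";
    }
}

/* A plain dimm has no hook; the nvdimm-like class installs one. */
static const DevClass plain_class = { .realize = NULL };
static const DevClass nv_class = { .realize = nv_realize };
```

Calling dev_realize() on a device of plain_class only runs the common checks,
while a device of nv_class additionally gets its size validated.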



[PATCH v4 13/33] pc-dimm: make pc_existing_dimms_capacity static and rename it

2015-10-18 Thread Xiao Guangrong
pc_existing_dimms_capacity() can be static since it is not used outside of
pc-dimm.c; also drop the pc_ prefix to prepare for the work which abstracts
the dimm device type from pc-dimm

Signed-off-by: Xiao Guangrong 
---
 hw/mem/pc-dimm.c | 73 
 include/hw/mem/pc-dimm.h |  1 -
 2 files changed, 36 insertions(+), 38 deletions(-)

diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
index 2bae994..425f627 100644
--- a/hw/mem/pc-dimm.c
+++ b/hw/mem/pc-dimm.c
@@ -32,6 +32,38 @@ typedef struct pc_dimms_capacity {
  Error**errp;
 } pc_dimms_capacity;
 
+static int existing_dimms_capacity_internal(Object *obj, void *opaque)
+{
+pc_dimms_capacity *cap = opaque;
+uint64_t *size = &cap->size;
+
+if (object_dynamic_cast(obj, TYPE_PC_DIMM)) {
+DeviceState *dev = DEVICE(obj);
+
+if (dev->realized) {
+(*size) += object_property_get_int(obj, PC_DIMM_SIZE_PROP,
+cap->errp);
+}
+
+if (cap->errp && *cap->errp) {
+return 1;
+}
+}
+object_child_foreach(obj, existing_dimms_capacity_internal, opaque);
+return 0;
+}
+
+static uint64_t existing_dimms_capacity(Error **errp)
+{
+pc_dimms_capacity cap;
+
+cap.size = 0;
+cap.errp = errp;
+
+existing_dimms_capacity_internal(qdev_get_machine(), &cap);
+return cap.size;
+}
+
 void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
  MemoryRegion *mr, uint64_t align, bool gap,
  Error **errp)
@@ -40,7 +72,7 @@ void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState 
*hpms,
 MachineState *machine = MACHINE(qdev_get_machine());
 PCDIMMDevice *dimm = PC_DIMM(dev);
 Error *local_err = NULL;
-uint64_t existing_dimms_capacity = 0;
+uint64_t dimms_capacity = 0;
 uint64_t addr;
 
 addr = object_property_get_int(OBJECT(dimm), PC_DIMM_ADDR_PROP, 
&local_err);
@@ -56,17 +88,16 @@ void pc_dimm_memory_plug(DeviceState *dev, 
MemoryHotplugState *hpms,
 goto out;
 }
 
-existing_dimms_capacity = pc_existing_dimms_capacity(&local_err);
+dimms_capacity = existing_dimms_capacity(&local_err);
 if (local_err) {
 goto out;
 }
 
-if (existing_dimms_capacity + memory_region_size(mr) >
+if (dimms_capacity + memory_region_size(mr) >
 machine->maxram_size - machine->ram_size) {
 error_setg(&local_err, "not enough space, currently 0x%" PRIx64
" in use of total hot pluggable 0x" RAM_ADDR_FMT,
-   existing_dimms_capacity,
-   machine->maxram_size - machine->ram_size);
+   dimms_capacity, machine->maxram_size - machine->ram_size);
 goto out;
 }
 
@@ -121,38 +152,6 @@ void pc_dimm_memory_unplug(DeviceState *dev, 
MemoryHotplugState *hpms,
 vmstate_unregister_ram(mr, dev);
 }
 
-static int pc_existing_dimms_capacity_internal(Object *obj, void *opaque)
-{
-pc_dimms_capacity *cap = opaque;
-uint64_t *size = &cap->size;
-
-if (object_dynamic_cast(obj, TYPE_PC_DIMM)) {
-DeviceState *dev = DEVICE(obj);
-
-if (dev->realized) {
-(*size) += object_property_get_int(obj, PC_DIMM_SIZE_PROP,
-cap->errp);
-}
-
-if (cap->errp && *cap->errp) {
-return 1;
-}
-}
-object_child_foreach(obj, pc_existing_dimms_capacity_internal, opaque);
-return 0;
-}
-
-uint64_t pc_existing_dimms_capacity(Error **errp)
-{
-pc_dimms_capacity cap;
-
-cap.size = 0;
-cap.errp = errp;
-
-pc_existing_dimms_capacity_internal(qdev_get_machine(), &cap);
-return cap.size;
-}
-
 int qmp_pc_dimm_device_list(Object *obj, void *opaque)
 {
 MemoryDeviceInfoList ***prev = opaque;
diff --git a/include/hw/mem/pc-dimm.h b/include/hw/mem/pc-dimm.h
index 15590f1..c1e5774 100644
--- a/include/hw/mem/pc-dimm.h
+++ b/include/hw/mem/pc-dimm.h
@@ -87,7 +87,6 @@ uint64_t pc_dimm_get_free_addr(uint64_t address_space_start,
 int pc_dimm_get_free_slot(const int *hint, int max_slots, Error **errp);
 
 int qmp_pc_dimm_device_list(Object *obj, void *opaque);
-uint64_t pc_existing_dimms_capacity(Error **errp);
 void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
  MemoryRegion *mr, uint64_t align, bool gap,
  Error **errp);
-- 
1.8.3.1
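
As a rough illustration of what the moved helper does (walk the object tree,
add up the size of every realized dimm, then compare against the hot-pluggable
area), here is a minimal standalone sketch. Node, Capacity and all function
names are hypothetical; QEMU's Object model and error handling are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical object tree node; QEMU's real Object type is far richer. */
typedef struct Node {
    bool is_dimm;
    bool realized;
    uint64_t size;
    struct Node **children;
    size_t nr_children;
} Node;

/* Accumulator passed through the walk as an opaque pointer. */
typedef struct Capacity {
    uint64_t size;
} Capacity;

/* Visit one node, then recurse into its children. */
static int sum_capacity(Node *node, void *opaque)
{
    Capacity *cap = opaque;

    assert(node != NULL && cap != NULL);
    if (node->is_dimm && node->realized) {
        cap->size += node->size;
    }
    for (size_t i = 0; i < node->nr_children; i++) {
        sum_capacity(node->children[i], opaque);
    }
    return 0;
}

/* Total size of all realized dimms below root. */
static uint64_t existing_capacity(Node *root)
{
    Capacity cap = { .size = 0 };

    sum_capacity(root, &cap);
    return cap.size;
}

/* Same comparison as in the plug path: used + new must fit in maxram - ram. */
static bool fits_hotplug_area(uint64_t in_use, uint64_t new_size,
                              uint64_t maxram, uint64_t ram)
{
    return in_use + new_size <= maxram - ram;
}
```

Unrealized dimms are skipped on purpose, which matches the dev->realized check
in the patch.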



[PATCH v4 15/33] stubs: rename qmp_pc_dimm_device_list.c

2015-10-18 Thread Xiao Guangrong
Rename qmp_pc_dimm_device_list.c to qmp_dimm_device_list.c

Signed-off-by: Xiao Guangrong 
---
 stubs/Makefile.objs | 2 +-
 stubs/{qmp_pc_dimm_device_list.c => qmp_dimm_device_list.c} | 0
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename stubs/{qmp_pc_dimm_device_list.c => qmp_dimm_device_list.c} (100%)

diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
index ce6ce11..e28af50 100644
--- a/stubs/Makefile.objs
+++ b/stubs/Makefile.objs
@@ -37,6 +37,6 @@ stub-obj-y += vmstate.o
 stub-obj-$(CONFIG_WIN32) += fd-register.o
 stub-obj-y += cpus.o
 stub-obj-y += kvm.o
-stub-obj-y += qmp_pc_dimm_device_list.o
+stub-obj-y += qmp_dimm_device_list.o
 stub-obj-y += target-monitor-defs.o
 stub-obj-y += vhost.o
diff --git a/stubs/qmp_pc_dimm_device_list.c b/stubs/qmp_dimm_device_list.c
similarity index 100%
rename from stubs/qmp_pc_dimm_device_list.c
rename to stubs/qmp_dimm_device_list.c
-- 
1.8.3.1



[PATCH v4 14/33] pc-dimm: drop the prefix of pc-dimm

2015-10-18 Thread Xiao Guangrong
This patch is generated by this script:

find ./ -name "*.[ch]" -o -name "*.json" -o -name "trace-events" -type f \
| xargs sed -i "s/PC_DIMM/DIMM/g"

find ./ -name "*.[ch]" -o -name "*.json" -o -name "trace-events" -type f \
| xargs sed -i "s/PCDIMM/DIMM/g"

find ./ -name "*.[ch]" -o -name "*.json" -o -name "trace-events" -type f \
| xargs sed -i "s/pc_dimm/dimm/g"

find ./ -name "trace-events" -type f | xargs sed -i "s/pc-dimm/dimm/g"

It prepares for the work which abstracts the dimm device type for both pc-dimm
and nvdimm

Signed-off-by: Xiao Guangrong 
---
 hmp.c   |   2 +-
 hw/acpi/ich9.c  |   6 +-
 hw/acpi/memory_hotplug.c|  16 ++---
 hw/acpi/piix4.c |   6 +-
 hw/i386/pc.c|  32 -
 hw/mem/pc-dimm.c| 148 
 hw/ppc/spapr.c  |  18 ++---
 include/hw/mem/pc-dimm.h|  62 -
 numa.c  |   2 +-
 qapi-schema.json|   8 +--
 qmp.c   |   2 +-
 stubs/qmp_pc_dimm_device_list.c |   2 +-
 trace-events|   8 +--
 13 files changed, 156 insertions(+), 156 deletions(-)

diff --git a/hmp.c b/hmp.c
index 5048eee..5c617d2 100644
--- a/hmp.c
+++ b/hmp.c
@@ -1952,7 +1952,7 @@ void hmp_info_memory_devices(Monitor *mon, const QDict 
*qdict)
 MemoryDeviceInfoList *info_list = qmp_query_memory_devices(&err);
 MemoryDeviceInfoList *info;
 MemoryDeviceInfo *value;
-PCDIMMDeviceInfo *di;
+DIMMDeviceInfo *di;
 
 for (info = info_list; info; info = info->next) {
 value = info->value;
diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
index 1c7fcfa..b0d6a67 100644
--- a/hw/acpi/ich9.c
+++ b/hw/acpi/ich9.c
@@ -440,7 +440,7 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm, 
Error **errp)
 void ich9_pm_device_plug_cb(ICH9LPCPMRegs *pm, DeviceState *dev, Error **errp)
 {
 if (pm->acpi_memory_hotplug.is_enabled &&
-object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
 acpi_memory_plug_cb(&pm->acpi_regs, pm->irq, &pm->acpi_memory_hotplug,
 dev, errp);
 } else if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
@@ -455,7 +455,7 @@ void ich9_pm_device_unplug_request_cb(ICH9LPCPMRegs *pm, 
DeviceState *dev,
   Error **errp)
 {
 if (pm->acpi_memory_hotplug.is_enabled &&
-object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
 acpi_memory_unplug_request_cb(&pm->acpi_regs, pm->irq,
   &pm->acpi_memory_hotplug, dev, errp);
 } else {
@@ -468,7 +468,7 @@ void ich9_pm_device_unplug_cb(ICH9LPCPMRegs *pm, 
DeviceState *dev,
   Error **errp)
 {
 if (pm->acpi_memory_hotplug.is_enabled &&
-object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
+object_dynamic_cast(OBJECT(dev), TYPE_DIMM)) {
 acpi_memory_unplug_cb(&pm->acpi_memory_hotplug, dev, errp);
 } else {
 error_setg(errp, "acpi: device unplug for not supported device"
diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c
index 2ff0d5c..1f6 100644
--- a/hw/acpi/memory_hotplug.c
+++ b/hw/acpi/memory_hotplug.c
@@ -54,23 +54,23 @@ static uint64_t acpi_memory_hotplug_read(void *opaque, 
hwaddr addr,
 o = OBJECT(mdev->dimm);
 switch (addr) {
 case 0x0: /* Lo part of phys address where DIMM is mapped */
-val = o ? object_property_get_int(o, PC_DIMM_ADDR_PROP, NULL) : 0;
+val = o ? object_property_get_int(o, DIMM_ADDR_PROP, NULL) : 0;
 trace_mhp_acpi_read_addr_lo(mem_st->selector, val);
 break;
 case 0x4: /* Hi part of phys address where DIMM is mapped */
-val = o ? object_property_get_int(o, PC_DIMM_ADDR_PROP, NULL) >> 32 : 
0;
+val = o ? object_property_get_int(o, DIMM_ADDR_PROP, NULL) >> 32 : 0;
 trace_mhp_acpi_read_addr_hi(mem_st->selector, val);
 break;
 case 0x8: /* Lo part of DIMM size */
-val = o ? object_property_get_int(o, PC_DIMM_SIZE_PROP, NULL) : 0;
+val = o ? object_property_get_int(o, DIMM_SIZE_PROP, NULL) : 0;
 trace_mhp_acpi_read_size_lo(mem_st->selector, val);
 break;
 case 0xc: /* Hi part of DIMM size */
-val = o ? object_property_get_int(o, PC_DIMM_SIZE_PROP, NULL) >> 32 : 
0;
+val = o ? object_property_get_int(o, DIMM_SIZE_PROP, NULL) >> 32 : 0;
 trace_mhp_acpi_read_size_hi(mem_st->selector, val);
 break;
 case 0x10: /* node proximity for _PXM method */
-val = o ? object_property_get_int(o, PC_DIMM_NODE_PROP, NULL) : 0;
+val = o ? object_property_get_int(o, DIMM_NODE_PROP, NULL) : 0;
 trace_mhp_acpi_read_pxm(mem_st->selector, val);
 break;
 case 0x14: /* pack and return is_* fields */

[PATCH v4 18/33] dimm: get mapped memory region from DIMMDeviceClass->get_memory_region

2015-10-18 Thread Xiao Guangrong
Currently, the memory region of the backend memory is directly mapped into
the guest's address space; however, this is not true for the nvdimm device

This patch makes the dimm device aware of this fact and uses the
DIMMDeviceClass->get_memory_region method to get the mapped memory
region

Signed-off-by: Xiao Guangrong 
---
 hw/mem/dimm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/mem/dimm.c b/hw/mem/dimm.c
index 23d5daa..9e0403a 100644
--- a/hw/mem/dimm.c
+++ b/hw/mem/dimm.c
@@ -380,8 +380,9 @@ static void dimm_get_size(Object *obj, Visitor *v, void 
*opaque,
 int64_t value;
 MemoryRegion *mr;
 DIMMDevice *dimm = DIMM(obj);
+DIMMDeviceClass *ddc = DIMM_GET_CLASS(obj);
 
-mr = host_memory_backend_get_memory(dimm->hostmem, errp);
+mr = ddc->get_memory_region(dimm);
 value = memory_region_size(mr);
 
 visit_type_int(v, &value, name, errp);
-- 
1.8.3.1



[PATCH v4 10/33] hostmem-file: clean up memory allocation

2015-10-18 Thread Xiao Guangrong
- hostmem-file.c is compiled only if CONFIG_LINUX is enabled, so it is
  unnecessary to do the same check in the source file

- the interface, HostMemoryBackendClass->alloc(), is not called repeatedly,
  so there is no need to check if the memory-region is initialized

Signed-off-by: Xiao Guangrong 
---
 backends/hostmem-file.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index e9b6d21..9097a57 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -46,17 +46,12 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error 
**errp)
 error_setg(errp, "mem-path property not set");
 return;
 }
-#ifndef CONFIG_LINUX
-error_setg(errp, "-mem-path not supported on this host");
-#else
-if (!memory_region_size(&backend->mr)) {
-backend->force_prealloc = mem_prealloc;
-memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
+
+backend->force_prealloc = mem_prealloc;
+memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
  object_get_canonical_path(OBJECT(backend)),
  backend->size, fb->share,
  fb->mem_path, errp);
-}
-#endif
 }
 
 static void
-- 
1.8.3.1



[PATCH v4 08/33] exec: allow memory to be allocated from any kind of path

2015-10-18 Thread Xiao Guangrong
Currently file_ram_alloc() is designed for hugetlbfs; however, the memory
of nvdimm can come from either a raw pmem device, e.g. /dev/pmem, or a file
located on a DAX-enabled filesystem

So this patch lets it work on any kind of path

Signed-off-by: Xiao Guangrong 
---
 exec.c | 56 +---
 1 file changed, 17 insertions(+), 39 deletions(-)

diff --git a/exec.c b/exec.c
index 4505dc7..d2a3357 100644
--- a/exec.c
+++ b/exec.c
@@ -1157,32 +1157,6 @@ void qemu_mutex_unlock_ramlist(void)
 }
 
 #ifdef __linux__
-
-#include 
-
-#define HUGETLBFS_MAGIC   0x958458f6
-
-static long gethugepagesize(const char *path, Error **errp)
-{
-struct statfs fs;
-int ret;
-
-do {
-ret = statfs(path, &fs);
-} while (ret != 0 && errno == EINTR);
-
-if (ret != 0) {
-error_setg_errno(errp, errno, "failed to get page size of file %s",
- path);
-return 0;
-}
-
-if (fs.f_type != HUGETLBFS_MAGIC)
-fprintf(stderr, "Warning: path not on HugeTLBFS: %s\n", path);
-
-return fs.f_bsize;
-}
-
 static void *file_ram_alloc(RAMBlock *block,
 ram_addr_t memory,
 const char *path,
@@ -1193,20 +1167,24 @@ static void *file_ram_alloc(RAMBlock *block,
 char *c;
 void *area;
 int fd;
-uint64_t hpagesize;
-Error *local_err = NULL;
+uint64_t pagesize;
 
-hpagesize = gethugepagesize(path, &local_err);
-if (local_err) {
-error_propagate(errp, local_err);
+pagesize = qemu_file_get_page_size(path);
+if (!pagesize) {
+error_setg(errp, "can't get page size for %s", path);
 goto error;
 }
-block->mr->align = hpagesize;
 
-if (memory < hpagesize) {
+if (pagesize == getpagesize()) {
+fprintf(stderr, "Memory is not allocated from HugeTlbfs.\n");
+}
+
+block->mr->align = pagesize;
+
+if (memory < pagesize) {
 error_setg(errp, "memory size 0x" RAM_ADDR_FMT " must be equal to "
-   "or larger than huge page size 0x%" PRIx64,
-   memory, hpagesize);
+   "or larger than page size 0x%" PRIx64,
+   memory, pagesize);
 goto error;
 }
 
@@ -1230,14 +1208,14 @@ static void *file_ram_alloc(RAMBlock *block,
 fd = mkstemp(filename);
 if (fd < 0) {
 error_setg_errno(errp, errno,
- "unable to create backing store for hugepages");
+ "unable to create backing store for path %s", path);
 g_free(filename);
 goto error;
 }
 unlink(filename);
 g_free(filename);
 
-memory = ROUND_UP(memory, hpagesize);
+memory = ROUND_UP(memory, pagesize);
 
 /*
  * ftruncate is not supported by hugetlbfs in older
@@ -1249,10 +1227,10 @@ static void *file_ram_alloc(RAMBlock *block,
 perror("ftruncate");
 }
 
-area = qemu_ram_mmap(fd, memory, hpagesize, block->flags & RAM_SHARED);
+area = qemu_ram_mmap(fd, memory, pagesize, block->flags & RAM_SHARED);
 if (area == MAP_FAILED) {
 error_setg_errno(errp, errno,
- "unable to map backing store for hugepages");
+ "unable to map backing store for path %s", path);
 close(fd);
 goto error;
 }
-- 
1.8.3.1
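
The size handling in file_ram_alloc() after this change boils down to two
steps: refuse requests smaller than one page of the backing path, then round
the request up to a multiple of that page size. A minimal sketch, with
round_up() and mapping_length() being hypothetical helper names rather than
QEMU functions:

```c
#include <assert.h>
#include <stdint.h>

/* Round n up to the next multiple of d (d > 0), like a ROUND_UP macro. */
static uint64_t round_up(uint64_t n, uint64_t d)
{
    assert(d > 0);
    return (n + d - 1) / d * d;
}

/*
 * Length to map for a request of `memory` bytes on a path whose page size
 * is `pagesize`. Returns 0 if the page size is unknown or the request is
 * smaller than one page, which the caller would report as an error.
 */
static uint64_t mapping_length(uint64_t memory, uint64_t pagesize)
{
    if (pagesize == 0 || memory < pagesize) {
        return 0;
    }
    return round_up(memory, pagesize);
}
```

With a 2 MiB huge page size, a 3 MiB request would therefore be mapped as
4 MiB, while on an ordinary 4 KiB path only 4 KiB granularity applies.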



[PATCH v4 07/33] util: introduce qemu_file_get_page_size()

2015-10-18 Thread Xiao Guangrong
There are three places that use the same logic to get the page size from
a file path or file fd

This patch introduces qemu_file_get_page_size() to unify the code

Signed-off-by: Xiao Guangrong 
---
 include/qemu/osdep.h |  1 +
 target-ppc/kvm.c | 21 +++--
 util/oslib-posix.c   | 16 
 util/oslib-win32.c   |  5 +
 4 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index ef21efb..9c8c0c4 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -286,4 +286,5 @@ void os_mem_prealloc(int fd, char *area, size_t sz);
 
 int qemu_read_password(char *buf, int buf_size);
 
+size_t qemu_file_get_page_size(const char *mem_path);
 #endif
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index f8ea783..ed3424e 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -306,28 +306,13 @@ static void kvm_get_smmu_info(PowerPCCPU *cpu, struct 
kvm_ppc_smmu_info *info)
 
 static long gethugepagesize(const char *mem_path)
 {
-struct statfs fs;
-int ret;
-
-do {
-ret = statfs(mem_path, &fs);
-} while (ret != 0 && errno == EINTR);
+long size = qemu_file_get_page_size(mem_path);
 
-if (ret != 0) {
-fprintf(stderr, "Couldn't statfs() memory path: %s\n",
-strerror(errno));
+if (!size) {
 exit(1);
 }
 
-#define HUGETLBFS_MAGIC   0x958458f6
-
-if (fs.f_type != HUGETLBFS_MAGIC) {
-/* Explicit mempath, but it's ordinary pages */
-return getpagesize();
-}
-
-/* It's hugepage, return the huge page size */
-return fs.f_bsize;
+return size;
 }
 
 static int find_max_supported_pagesize(Object *obj, void *opaque)
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index 892d2d8..32b4d1f 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -360,6 +360,22 @@ static size_t fd_getpagesize(int fd)
 return getpagesize();
 }
 
+size_t qemu_file_get_page_size(const char *path)
+{
+size_t size = 0;
+int fd = qemu_open(path, O_RDONLY);
+
+if (fd < 0) {
+fprintf(stderr, "Could not open %s.\n", path);
+goto exit;
+}
+
+size = fd_getpagesize(fd);
+qemu_close(fd);
+exit:
+return size;
+}
+
 void os_mem_prealloc(int fd, char *area, size_t memory)
 {
 int ret;
diff --git a/util/oslib-win32.c b/util/oslib-win32.c
index 08f5a9c..1ff1fae 100644
--- a/util/oslib-win32.c
+++ b/util/oslib-win32.c
@@ -462,6 +462,11 @@ size_t getpagesize(void)
 return system_info.dwPageSize;
 }
 
+size_t qemu_file_get_page_size(const char *path)
+{
+return getpagesize();
+}
+
 void os_mem_prealloc(int fd, char *area, size_t memory)
 {
 int i;
-- 
1.8.3.1
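
For context, the logic being unified is the statfs-based check visible in the
removed target-ppc code: a path on hugetlbfs reports its huge page size as the
filesystem block size, anything else uses the normal page size. A standalone,
Linux-only sketch of that idea follows; file_page_size() is a hypothetical
name, and this is not the body of QEMU's fd_getpagesize().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/vfs.h>
#include <unistd.h>

#define HUGETLBFS_MAGIC 0x958458f6

/*
 * Page size backing `path`: the huge page size if the path lives on
 * hugetlbfs, the normal page size otherwise, and 0 if it cannot be opened.
 */
static size_t file_page_size(const char *path)
{
    struct statfs fs;
    size_t size;
    int ret;
    int fd;

    assert(path != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0) {
        return 0;
    }

    /* Retry if the call is interrupted by a signal. */
    do {
        ret = fstatfs(fd, &fs);
    } while (ret != 0 && errno == EINTR);

    if (ret == 0 && fs.f_type == HUGETLBFS_MAGIC) {
        size = (size_t)fs.f_bsize;
    } else {
        size = (size_t)sysconf(_SC_PAGESIZE);
    }

    close(fd);
    return size;
}
```

Returning 0 for an unopenable path mirrors how the callers in this series
treat a zero result as failure.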



[PATCH v4 01/33] acpi: add aml_derefof

2015-10-18 Thread Xiao Guangrong
Implement the DeRefOf term, which is used by the NVDIMM _DSM method in a later
patch

Reviewed-by: Igor Mammedov 
Signed-off-by: Xiao Guangrong 
---
 hw/acpi/aml-build.c | 8 
 include/hw/acpi/aml-build.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 0d4b324..cbd53f4 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1135,6 +1135,14 @@ Aml *aml_unicode(const char *str)
 return var;
 }
 
+/* ACPI 1.0b: 16.2.5.4 Type 2 Opcodes Encoding: DefDerefOf */
+Aml *aml_derefof(Aml *arg)
+{
+Aml *var = aml_opcode(0x83 /* DerefOfOp */);
+aml_append(var, arg);
+return var;
+}
+
 void
 build_header(GArray *linker, GArray *table_data,
  AcpiTableHeader *h, const char *sig, int len, uint8_t rev)
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 1b632dc..5a03d33 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -274,6 +274,7 @@ Aml *aml_create_dword_field(Aml *srcbuf, Aml *index, const 
char *name);
 Aml *aml_varpackage(uint32_t num_elements);
 Aml *aml_touuid(const char *uuid);
 Aml *aml_unicode(const char *str);
+Aml *aml_derefof(Aml *arg);
 
 void
 build_header(GArray *linker, GArray *table_data,
-- 
1.8.3.1
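
To make the encoding concrete: DerefOf is a one-operand term, so the emitted
AML is the DerefOfOp byte (0x83) followed by the encoded operand. The sketch
below shows the bytes for DerefOf(LocalN), where Local0..Local7 encode as
0x60..0x67. Buf, buf_put() and emit_derefof_local() are hypothetical helpers
for illustration and are not part of QEMU's aml-build API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    AML_LOCAL0_OP  = 0x60,  /* Local0..Local7 are 0x60..0x67 */
    AML_DEREFOF_OP = 0x83,  /* DerefOfOp */
};

/* Tiny fixed-size byte buffer, enough for this illustration. */
typedef struct Buf {
    uint8_t data[16];
    size_t len;
} Buf;

static void buf_put(Buf *b, uint8_t byte)
{
    assert(b->len < sizeof(b->data));
    b->data[b->len++] = byte;
}

/* Emit DerefOf(LocalN): the opcode followed by its single operand. */
static void emit_derefof_local(Buf *b, unsigned n)
{
    assert(n < 8);
    buf_put(b, AML_DEREFOF_OP);
    buf_put(b, (uint8_t)(AML_LOCAL0_OP + n));
}
```

So DerefOf(Local0), as used later in the series for the _DSM Arg3 handling,
comes out as the two bytes 0x83 0x60.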



[PATCH v4 11/33] hostmem-file: use whole file size if possible

2015-10-18 Thread Xiao Guangrong
Use the whole file size if @size is not specified, which is useful
if we want to pass a file directly to the guest

Signed-off-by: Xiao Guangrong 
---
 backends/hostmem-file.c | 48 
 1 file changed, 44 insertions(+), 4 deletions(-)

diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index 9097a57..e1bc9ff 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -9,6 +9,9 @@
  * This work is licensed under the terms of the GNU GPL, version 2 or later.
  * See the COPYING file in the top-level directory.
  */
+#include 
+#include 
+
 #include "qemu-common.h"
 #include "sysemu/hostmem.h"
 #include "sysemu/sysemu.h"
@@ -33,20 +36,57 @@ struct HostMemoryBackendFile {
 char *mem_path;
 };
 
+static uint64_t get_file_size(const char *file)
+{
+struct stat stat_buf;
+uint64_t size = 0;
+int fd;
+
+fd = open(file, O_RDONLY);
+if (fd < 0) {
+return 0;
+}
+
+if (stat(file, &stat_buf) < 0) {
+goto exit;
+}
+
+if ((S_ISBLK(stat_buf.st_mode)) && !ioctl(fd, BLKGETSIZE64, &size)) {
+goto exit;
+}
+
+size = lseek(fd, 0, SEEK_END);
+if (size == -1) {
+size = 0;
+}
+exit:
+close(fd);
+return size;
+}
+
 static void
 file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
 {
 HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(backend);
 
-if (!backend->size) {
-error_setg(errp, "can't create backend with size 0");
-return;
-}
 if (!fb->mem_path) {
 error_setg(errp, "mem-path property not set");
 return;
 }
 
+if (!backend->size) {
+/*
+ * use the whole file size if @size is not specified.
+ */
+backend->size = get_file_size(fb->mem_path);
+}
+
+if (!backend->size) {
+error_setg(errp, "failed to get file size for %s, can't create "
+ "backend on it", fb->mem_path);
+return;
+}
+
 backend->force_prealloc = mem_prealloc;
 memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
  object_get_canonical_path(OBJECT(backend)),
-- 
1.8.3.1
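
The fallback path of get_file_size() (seek to the end and take the offset) can
be tried in isolation with the sketch below. It deliberately leaves out the
BLKGETSIZE64 branch for block devices, and file_size() is a hypothetical name
used only for this illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Size of `path` in bytes, obtained by seeking to the end of the file.
 * Returns 0 if the file cannot be opened or the seek fails, so a caller
 * cannot tell an empty file from an error; the patch treats both as
 * "can't create backend".
 */
static uint64_t file_size(const char *path)
{
    uint64_t size = 0;
    off_t end;
    int fd;

    assert(path != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0) {
        return 0;
    }

    end = lseek(fd, 0, SEEK_END);
    if (end > 0) {
        size = (uint64_t)end;
    }

    close(fd);
    return size;
}
```

Checking the signed off_t before converting avoids relying on the comparison
of an unsigned size against -1.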



[PATCH v4 00/33] implement vNVDIMM

2015-10-18 Thread Xiao Guangrong
This patchset can be found at:
  https://github.com/xiaogr/qemu.git nvdimm-v4

It is based on pci branch on Michael's tree and the top commit is:
commit e20cff85470be (piix: fix resource leak reported by Coverity).


Changelog in v4:
- changes from Michael's comments:
  1) show the message, "Memory is not allocated from HugeTlbfs", if file
 based memory is not allocated from hugetlbfs.
  2) introduce function, acpi_get_nvdimm_state(), to get NVDIMMState
 from Machine.
  3) statically define UUID and make its operation more clear
  4) use GArray to build device structures to avoid potential buffer
 overflow
  5) improve comments in the code
  6) improve code style

- changes from Igor's comments:
  1) add NVDIMM ACPI spec document
  2) use serialized method to avoid Mutex
  3) move NVDIMM ACPI's code to hw/acpi/nvdimm.c
  4) introduce a common ASL method used by _DSM for all devices to reduce
 ACPI size
  5) handle UUID in ACPI AML code. BTW, I'd keep handling the revision in
     QEMU, since that is better for upgrading QEMU to support Rev2 in the
     future

- changes from Stefan's comments:
  1) copy input data from DSM memory to local buffer to avoid potential
 issues as DSM memory is visible to guest. Output data is handled
 in a similar way

- changes from Dan's comments:
  1) drop static namespace as Linux already supports label-less
     nvdimm devices

- changes from Vladimir's comments:
  1) print a better message, "failed to get file size for %s, can't create
     backend on it", if any file operation fails to obtain the file size

- others:
  create a git repo on github.com for better review/test

Also, thanks to Eric Blake for his review on the QAPI side.

Thank you all for reviewing this patchset.

Changelog in v3:
There are huge changes in this version; thanks to Igor, Stefan, Paolo, Eduardo
and Michael for their valuable comments, the patchset is finally in better shape.
- changes from Igor's comments:
  1) abstract the dimm device type from pc-dimm and create the nvdimm device
     based on dimm; it then uses a memory backend device as nvdimm's memory,
     and NUMA support is easily implemented.
  2) let the file-backend device support any kind of filesystem, not only
     hugetlbfs, and let it work on a file, not only on a directory. This is
     achieved by extending 'mem-path': if it's a directory then it keeps the
     current behavior, otherwise if it's a file then memory is allocated
     directly from it.
  3) we found an unused memory hole below 4G, that is 0xFF0 ~
     0xFFF0; this range is large enough for NVDIMM ACPI, and it is needed
     because building a 64-bit ACPI SSDT/DSDT table would break Windows XP.
     BTW, only setting SSDT.rev = 2 does not work, since the integer width
     depends only on DSDT.rev, based on 19.6.28 DefinitionBlock (Declare
     Definition Block) in the ACPI spec:
| Note: For compatibility with ACPI versions before ACPI 2.0, the bit 
| width of Integer objects is dependent on the ComplianceRevision of the DSDT.
| If the ComplianceRevision is less than 2, all integers are restricted to 32 
| bits. Otherwise, full 64-bit integers are used. The version of the DSDT sets 
| the global integer width for all integers, including integers in SSDTs.
  4) use the lowest ACPI spec version to document AML terms.
  5) use "nvdimm" as nvdimm device name instead of "pc-nvdimm"

- changes from Stefan's comments:
  1) do not do endian adjustment in-place since _DSM memory is visible to guest
  2) use target platform's target page size instead of fixed PAGE_SIZE
 definition
  3) lots of code style improvement and typo fixes.
  4) live migration fix
- changes from Paolo's comments:
  1) improve the name of memory region
  
- other changes:
  1) return exact buffer size for _DSM method instead of the page size.
  2) introduce mutex in NVDIMM ACPI as the _DSM memory is shared by all nvdimm
 devices.
  3) NUMA support
  4) implement _FIT method
  5) rename "configdata" to "reserve-label-data"
  6) simplify _DSM arg3 determination
  7) main changelog update to let it reflect v3.

Changelog in v2:
- Use little endian for the DSM method, thanks to Stefan for the suggestion

- introduce a new parameter, @configdata; if it's false, QEMU will
  build a static and read-only namespace in memory and use it to serve
  DSM GET_CONFIG_SIZE/GET_CONFIG_DATA requests. In this case, no
  reserved region is needed at the end of the @file, which is good for
  users who want to pass a whole nvdimm device and make its data
  completely visible to the guest

- split the source code into separate files and add maintainer info

BTW, PCOMMIT virtualization on the KVM side is a work in progress and will
hopefully be posted next week

== Background ==
NVDIMM (A Non-Volatile Dual In-line Memory Module) is going to be supported
on Intel's platform. NVDIMMs are discovered via ACPI and configured by the _DSM
method of the NVDIMM device in ACPI. There are some supporting documents which
can be found at:
ACPI 6: http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
NVDI

[PATCH 3/4] vfio: platform: add compat in vfio_platform_device

2015-10-18 Thread Eric Auger
Let's retrieve the compatibility string on probe and store it
in the vfio_platform_device struct

Signed-off-by: Eric Auger 
---
 drivers/vfio/platform/vfio_platform_common.c  | 15 ---
 drivers/vfio/platform/vfio_platform_private.h |  1 +
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/vfio/platform/vfio_platform_common.c 
b/drivers/vfio/platform/vfio_platform_common.c
index d36afc9..31a6a8c 100644
--- a/drivers/vfio/platform/vfio_platform_common.c
+++ b/drivers/vfio/platform/vfio_platform_common.c
@@ -38,16 +38,11 @@ static const struct vfio_platform_reset_combo 
reset_lookup_table[] = {
 static void vfio_platform_get_reset(struct vfio_platform_device *vdev,
struct device *dev)
 {
-   const char *compat;
int (*reset)(struct vfio_platform_device *);
-   int ret, i;
-
-   ret = device_property_read_string(dev, "compatible", &compat);
-   if (ret)
-   return;
+   int i;
 
for (i = 0 ; i < ARRAY_SIZE(reset_lookup_table); i++) {
-   if (!strcmp(reset_lookup_table[i].compat, compat)) {
+   if (!strcmp(reset_lookup_table[i].compat, vdev->compat)) {
request_module(reset_lookup_table[i].module_name);
reset = __symbol_get(
reset_lookup_table[i].reset_function_name);
@@ -538,6 +533,12 @@ int vfio_platform_probe_common(struct vfio_platform_device 
*vdev,
struct iommu_group *group;
int ret;
 
+   ret = device_property_read_string(dev, "compatible", &vdev->compat);
+   if (ret) {
+   pr_err("VFIO: cannot retrieve compat for %s\n", vdev->name);
+   return -EINVAL;
+   }
+
if (!vdev)
return -EINVAL;
 
diff --git a/drivers/vfio/platform/vfio_platform_private.h 
b/drivers/vfio/platform/vfio_platform_private.h
index 17323f0..b274646 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -56,6 +56,7 @@ struct vfio_platform_device {
u32 num_irqs;
int refcnt;
    struct mutex    igate;
+   const char  *compat;
 
/*
 * These fields should be filled by the bus specific binder
-- 
1.9.1



[PATCH 4/4] vfio: platform: use list of registered reset function

2015-10-18 Thread Eric Auger
Remove the static lookup table and use the dynamic list of registered
reset functions instead. Also load the reset module through its alias.
The reset struct module pointer is stored in vfio_platform_device.

This patch fixes the issue related to the usage of __symbol_get, which,
besides being moot, prevented compilation with CONFIG_MODULES
disabled.

Also, using MODULE_ALIAS makes it possible to add a new reset module
without needing to update the framework. This was suggested by Arnd.

Signed-off-by: Eric Auger 
Reported-by: Arnd Bergmann 
---
 drivers/vfio/platform/vfio_platform_common.c  | 46 +++
 drivers/vfio/platform/vfio_platform_private.h |  1 +
 2 files changed, 26 insertions(+), 21 deletions(-)

diff --git a/drivers/vfio/platform/vfio_platform_common.c 
b/drivers/vfio/platform/vfio_platform_common.c
index 31a6a8c..f3b6299 100644
--- a/drivers/vfio/platform/vfio_platform_common.c
+++ b/drivers/vfio/platform/vfio_platform_common.c
@@ -27,37 +27,41 @@ struct list_head reset_list;
 LIST_HEAD(reset_list);
 static DEFINE_MUTEX(driver_lock);
 
-static const struct vfio_platform_reset_combo reset_lookup_table[] = {
-   {
-   .compat = "calxeda,hb-xgmac",
-   .reset_function_name = "vfio_platform_calxedaxgmac_reset",
-   .module_name = "vfio-platform-calxedaxgmac",
-   },
-};
+static vfio_platform_reset_fn_t vfio_platform_lookup_reset(const char *compat,
+   struct module **module)
+{
+   struct vfio_platform_reset_node *iter;
+
+   list_for_each_entry(iter, &reset_list, link) {
+   if (!strcmp(iter->compat, compat) &&
+   try_module_get(iter->owner)) {
+   *module = iter->owner;
+   return iter->reset;
+   }
+   }
+
+   return NULL;
+}
 
 static void vfio_platform_get_reset(struct vfio_platform_device *vdev,
struct device *dev)
 {
-   int (*reset)(struct vfio_platform_device *);
-   int i;
-
-   for (i = 0 ; i < ARRAY_SIZE(reset_lookup_table); i++) {
-   if (!strcmp(reset_lookup_table[i].compat, vdev->compat)) {
-   request_module(reset_lookup_table[i].module_name);
-   reset = __symbol_get(
-   reset_lookup_table[i].reset_function_name);
-   if (reset) {
-   vdev->reset = reset;
-   return;
-   }
-   }
+   char modname[256];
+
+   vdev->reset = vfio_platform_lookup_reset(vdev->compat,
+   &vdev->reset_module);
+   if (!vdev->reset) {
+   snprintf(modname, 256, "vfio-reset:%s", vdev->compat);
+   request_module(modname);
+   vdev->reset = vfio_platform_lookup_reset(vdev->compat,
+&vdev->reset_module);
}
 }
 
 static void vfio_platform_put_reset(struct vfio_platform_device *vdev)
 {
if (vdev->reset)
-   symbol_put_addr(vdev->reset);
+   module_put(vdev->reset_module);
 }
 
 static int vfio_platform_regions_init(struct vfio_platform_device *vdev)
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index b274646..2070dcc 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -57,6 +57,7 @@ struct vfio_platform_device {
int refcnt;
struct mutex    igate;
const char  *compat;
+   struct module   *reset_module;
 
/*
 * These fields should be filled by the bus specific binder
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/4] vfio: platform: reset: calxedaxgmac: add reset function registration

2015-10-18 Thread Eric Auger
This patch adds the reset function registration/unregistration.
Also a MODULE_ALIAS is added.
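
As a rough userspace illustration of how the alias and the run-time
module name line up: the alias is built by compile-time concatenation of
adjacent string literals, while the framework side formats the same
string with snprintf. RESET_ALIAS and build_reset_modname() are
hypothetical names used only for this sketch; they are not part of the
patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define COMPAT "calxeda,hb-xgmac"
/* Adjacent string literals are concatenated at compile time. */
#define RESET_ALIAS "vfio-reset:" COMPAT

/*
 * Hypothetical helper: build the module name that would be requested
 * at run time for a given compat string.
 * Returns 0 on success, -1 if the name does not fit in the buffer.
 */
static int build_reset_modname(char *buf, size_t len, const char *compat)
{
    int n;

    assert(buf != NULL && compat != NULL);
    n = snprintf(buf, len, "vfio-reset:%s", compat);
    /* snprintf returns the untruncated length; treat truncation as failure. */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For the same compat string, the buffer filled by build_reset_modname()
compares equal to RESET_ALIAS, which is what lets a module request by
name resolve to the module carrying the alias.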

Signed-off-by: Eric Auger 
---
 .../platform/reset/vfio_platform_calxedaxgmac.c| 40 --
 1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/platform/reset/vfio_platform_calxedaxgmac.c b/drivers/vfio/platform/reset/vfio_platform_calxedaxgmac.c
index 619dc7d..4f76b17 100644
--- a/drivers/vfio/platform/reset/vfio_platform_calxedaxgmac.c
+++ b/drivers/vfio/platform/reset/vfio_platform_calxedaxgmac.c
@@ -29,8 +29,7 @@
 #define DRIVER_VERSION  "0.1"
 #define DRIVER_AUTHOR   "Eric Auger "
 #define DRIVER_DESC "Reset support for Calxeda xgmac vfio platform device"
-
-#define CALXEDAXGMAC_COMPAT "calxeda,hb-xgmac"
+#define COMPAT "calxeda,hb-xgmac"
 
 /* XGMAC Register definitions */
 #define XGMAC_CONTROL   0x  /* MAC Configuration */
@@ -80,6 +79,43 @@ int vfio_platform_calxedaxgmac_reset(struct vfio_platform_device *vdev)
 }
 EXPORT_SYMBOL_GPL(vfio_platform_calxedaxgmac_reset);
 
+static int __init vfio_platform_calxedaxgmac_init(void)
+{
+   int (*register_reset)(struct module *, char*,
+   vfio_platform_reset_fn_t);
+   int ret;
+
+   register_reset = symbol_get(vfio_platform_register_reset);
+   if (!register_reset)
+   return -EINVAL;
+
+   ret = register_reset(THIS_MODULE, COMPAT,
+   vfio_platform_calxedaxgmac_reset);
+
+   symbol_put(vfio_platform_register_reset);
+
+   return ret;
+}
+
+static void __exit vfio_platform_calxedaxgmac_exit(void)
+{
+   int (*unregister_reset)(char *);
+   int ret;
+
+   unregister_reset = symbol_get(vfio_platform_unregister_reset);
+   if (!unregister_reset)
+   return;
+
+   ret = unregister_reset(COMPAT);
+
+   symbol_put(vfio_platform_unregister_reset);
+}
+
+module_init(vfio_platform_calxedaxgmac_init);
+module_exit(vfio_platform_calxedaxgmac_exit);
+
+MODULE_ALIAS("vfio-reset:"  COMPAT);
+
 MODULE_VERSION(DRIVER_VERSION);
 MODULE_LICENSE("GPL v2");
 MODULE_AUTHOR(DRIVER_AUTHOR);
-- 
1.9.1



[PATCH 1/4] vfio: platform: add capability to register a reset function

2015-10-18 Thread Eric Auger
In preparation for subsequent changes in reset function lookup,
let's introduce a dynamic list of reset combos (compat string,
reset module, reset function). The list can be populated/voided with
two new functions, vfio_platform_register/unregister_reset. Those are
not yet used in this patch.
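
A minimal userspace sketch of such a register/unregister pair is shown
here for illustration only. It assumes a plain singly linked list and
malloc/free in place of the kernel list helpers and kmalloc/kstrdup,
takes no lock, and leaves out the owning module. All names
(demo_reset_fn_t, demo_register, demo_unregister, demo_find,
demo_noop_reset) are hypothetical. In this sketch the node is freed when
the string copy fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

typedef int (*demo_reset_fn_t)(void *dev);

struct demo_reset_node {
    struct demo_reset_node *next;
    char *compat;               /* private copy of the compat string */
    demo_reset_fn_t reset;
};

static struct demo_reset_node *demo_list;

/* Placeholder reset function, used only to have something to register. */
static int demo_noop_reset(void *dev)
{
    (void)dev;
    return 0;
}

static struct demo_reset_node *demo_find(const char *compat)
{
    struct demo_reset_node *iter;

    for (iter = demo_list; iter; iter = iter->next)
        if (!strcmp(iter->compat, compat))
            return iter;
    return NULL;
}

/* Add an entry; a compat string may be registered only once. */
static int demo_register(const char *compat, demo_reset_fn_t reset)
{
    struct demo_reset_node *node;
    size_t len;

    assert(compat != NULL && reset != NULL);
    if (demo_find(compat))
        return -EINVAL;

    node = malloc(sizeof(*node));
    if (!node)
        return -ENOMEM;

    len = strlen(compat) + 1;
    node->compat = malloc(len);
    if (!node->compat) {
        free(node);             /* do not leak the node on failure */
        return -ENOMEM;
    }
    memcpy(node->compat, compat, len);

    node->reset = reset;
    node->next = demo_list;
    demo_list = node;
    return 0;
}

/* Remove the entry matching compat, or return -EINVAL if there is none. */
static int demo_unregister(const char *compat)
{
    struct demo_reset_node **link, *node;

    assert(compat != NULL);
    for (link = &demo_list; (node = *link) != NULL; link = &node->next) {
        if (!strcmp(node->compat, compat)) {
            *link = node->next;
            free(node->compat);
            free(node);
            return 0;
        }
    }
    return -EINVAL;
}
```

Registering the same compat string twice returns -EINVAL on the second
call, and unregistering an unknown string does the same.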

Signed-off-by: Eric Auger 
---
 drivers/vfio/platform/vfio_platform_common.c  | 55 +++
 drivers/vfio/platform/vfio_platform_private.h | 14 +++
 2 files changed, 69 insertions(+)

diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c
index e43efb5..d36afc9 100644
--- a/drivers/vfio/platform/vfio_platform_common.c
+++ b/drivers/vfio/platform/vfio_platform_common.c
@@ -23,6 +23,8 @@
 
 #include "vfio_platform_private.h"
 
+struct list_head reset_list;
+LIST_HEAD(reset_list);
 static DEFINE_MUTEX(driver_lock);
 
 static const struct vfio_platform_reset_combo reset_lookup_table[] = {
@@ -573,3 +575,56 @@ struct vfio_platform_device *vfio_platform_remove_common(struct device *dev)
return vdev;
 }
 EXPORT_SYMBOL_GPL(vfio_platform_remove_common);
+
+int vfio_platform_register_reset(struct module *reset_owner, char *compat,
+vfio_platform_reset_fn_t reset)
+{
+   struct vfio_platform_reset_node *node, *iter;
+   bool found = false;
+
+   list_for_each_entry(iter, &reset_list, link) {
+   if (!strcmp(iter->compat, compat)) {
+   found = true;
+   break;
+   }
+   }
+   if (found)
+   return -EINVAL;
+
+   node = kmalloc(sizeof(*node), GFP_KERNEL);
+   if (!node)
+   return -ENOMEM;
+
+   node->compat = kstrdup(compat, GFP_KERNEL);
+   if (!node->compat)
+   return -ENOMEM;
+
+   node->owner = reset_owner;
+   node->reset = reset;
+
+   list_add(&node->link, &reset_list);
+   return 0;
+}
+EXPORT_SYMBOL_GPL(vfio_platform_register_reset);
+
+int vfio_platform_unregister_reset(char *compat)
+{
+   struct vfio_platform_reset_node *iter;
+   bool found = false;
+
+   list_for_each_entry(iter, &reset_list, link) {
+   if (!strcmp(iter->compat, compat)) {
+   found = true;
+   break;
+   }
+   }
+   if (!found)
+   return -EINVAL;
+
+   list_del(&iter->link);
+   kfree(iter->compat);
+   kfree(iter);
+   return 0;
+}
+EXPORT_SYMBOL_GPL(vfio_platform_unregister_reset);
+
diff --git a/drivers/vfio/platform/vfio_platform_private.h b/drivers/vfio/platform/vfio_platform_private.h
index 1c9b3d5..17323f0 100644
--- a/drivers/vfio/platform/vfio_platform_private.h
+++ b/drivers/vfio/platform/vfio_platform_private.h
@@ -76,6 +76,15 @@ struct vfio_platform_reset_combo {
const char *module_name;
 };
 
+typedef int (*vfio_platform_reset_fn_t)(struct vfio_platform_device *vdev);
+
+struct vfio_platform_reset_node {
+   struct list_head link;
+   char *compat;
+   struct module *owner;
+   vfio_platform_reset_fn_t reset;
+};
+
 extern int vfio_platform_probe_common(struct vfio_platform_device *vdev,
  struct device *dev);
 extern struct vfio_platform_device *vfio_platform_remove_common
@@ -89,4 +98,9 @@ extern int vfio_platform_set_irqs_ioctl(struct vfio_platform_device *vdev,
unsigned start, unsigned count,
void *data);
 
+extern int vfio_platform_register_reset(struct module *owner,
+   char *compat,
+   vfio_platform_reset_fn_t reset);
+extern int vfio_platform_unregister_reset(char *compat);
+
 #endif /* VFIO_PLATFORM_PRIVATE_H */
-- 
1.9.1



[PATCH 0/4] VFIO platform reset module rework

2015-10-18 Thread Eric Auger
This series fixes the current implementation by getting rid of the
usage of __symbol_get, which caused a compilation issue with
CONFIG_MODULES disabled. On top of this, the usage of MODULE_ALIAS makes
it possible to add a new reset module without having to update the
framework. The new implementation relies on the reset module registering
its reset function with the vfio-platform driver.
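
The resulting flow (look up the registered function, request the module
by alias if nothing is found, then look up again) can be sketched in
userspace roughly as follows. This is only an illustration under stated
assumptions: a single-slot registry stands in for the list, and
flow_fake_request_module() is a hypothetical stand-in for
request_module() whose "module init" registers the reset function. None
of the flow_* names exist in the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef int (*flow_reset_fn_t)(void *dev);

/* Single-slot registry standing in for the list of registered resets. */
static const char *flow_registered_compat;
static flow_reset_fn_t flow_registered_reset;
static int flow_load_requests;  /* counts simulated module requests */

/* Placeholder reset function provided by the simulated module. */
static int flow_xgmac_reset(void *dev)
{
    (void)dev;
    return 0;
}

static flow_reset_fn_t flow_lookup(const char *compat)
{
    if (flow_registered_compat && !strcmp(flow_registered_compat, compat))
        return flow_registered_reset;
    return NULL;
}

/*
 * Stand-in for request_module(): if the alias matches, the simulated
 * module "loads" and its init registers the reset function.
 */
static void flow_fake_request_module(const char *modname)
{
    flow_load_requests++;
    if (!strcmp(modname, "vfio-reset:calxeda,hb-xgmac")) {
        flow_registered_compat = "calxeda,hb-xgmac";
        flow_registered_reset = flow_xgmac_reset;
    }
}

/* Look up, request the module on a miss, then look up once more. */
static flow_reset_fn_t flow_get_reset(const char *compat)
{
    char modname[256];
    flow_reset_fn_t reset;
    int n;

    assert(compat != NULL);
    reset = flow_lookup(compat);
    if (reset)
        return reset;

    n = snprintf(modname, sizeof(modname), "vfio-reset:%s", compat);
    if (n < 0 || (size_t)n >= sizeof(modname))
        return NULL;            /* name would be truncated */

    flow_fake_request_module(modname);
    return flow_lookup(compat);
}
```

With this shape the framework only knows the "vfio-reset:" prefix; a new
reset module needs a matching alias and a registration call, not a new
table entry in the framework.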

The series is available at

https://git.linaro.org/people/eric.auger/linux.git/shortlog/refs/heads/v4.3-rc5-rework-xgbe-v2

Best Regards

Eric


Eric Auger (4):
  vfio: platform: add capability to register a reset function
  vfio: platform: reset: calxedaxgmac: add reset function registration
  vfio: platform: add compat in vfio_platform_device
  vfio: platform: use list of registered reset function

 .../platform/reset/vfio_platform_calxedaxgmac.c|  40 +++-
 drivers/vfio/platform/vfio_platform_common.c   | 112 -
 drivers/vfio/platform/vfio_platform_private.h  |  16 +++
 3 files changed, 140 insertions(+), 28 deletions(-)

-- 
1.9.1
