On Mon, Mar 20, 2017 at 08:09:34AM +0800, Haozhong Zhang wrote:
> This is v2 RFC patch series to add vNVDIMM support to HVM domains.
> v1 can be found at 
> https://lists.xenproject.org/archives/html/xen-devel/2016-10/msg00424.html.
> 
> No label and no _DSM except function 0 "query implemented functions"
> is supported by this version, but they will be added by future patches.
> 
> The corresponding Qemu patch series is sent in another thread
> "[RFC QEMU PATCH v2 00/10] Implement vNVDIMM for Xen HVM guest".
> 
> All patch series can be found at
>   Xen:  https://github.com/hzzhan9/xen.git nvdimm-rfc-v2
>   Qemu: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v2
> 

Hey!

Thank you for posting this. A quick question.
> Changes in v2
> ==============
> 
> - One of the primary changes in v2 is dropping the linux kernel
>   patches, which were used to reserve on host pmem for placing its
>   frametable and M2P table. In v2, we add a management tool xen-ndctl
>   which is used in Dom0 to notify Xen hypervisor of which storage can
>   be used to manage the host pmem.
> 
>   For example,
>   1.   xen-ndctl setup 0x240000 0x380000 0x380000 0x3c0000
>     tells Xen hypervisor to use host pmem pages at MFN 0x380000 ~
>     0x3c0000 to manage host pmem pages at MFN 0x240000 ~ 0x380000.
>     I.e. the former is used to place the frame table and M2P table of
>     both ranges of pmem pages.
> 
>   2.   xen-ndctl setup 0x240000 0x380000
>     tells Xen hypervisor to use the regular RAM to manage the host
>     pmem pages at MFN 0x240000 ~ 0x380000. I.e. the regular RAM is used
>     to place the frame table and M2P table.
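
To get a feel for how large the managing range in example 1 has to be, here is a rough sizing sketch. The per-page costs (a 32-byte frame table entry and an 8-byte M2P entry for each 4 KiB page) are assumptions that are not stated in this thread, and the function name is made up for illustration. Alignment and superpage rounding are ignored, so treat the result as a lower bound at best.

```python
PAGE_SIZE = 4096          # assumption: 4 KiB pages
FRAME_ENTRY_BYTES = 32    # assumption: size of one frame table entry
M2P_ENTRY_BYTES = 8       # assumption: size of one M2P entry


def mgmt_pages_needed(start_mfn, end_mfn):
    """Estimate pages needed to hold the frame table and M2P table
    for the MFN range [start_mfn, end_mfn)."""
    nr_pages = end_mfn - start_mfn
    nr_bytes = nr_pages * (FRAME_ENTRY_BYTES + M2P_ENTRY_BYTES)
    return -(-nr_bytes // PAGE_SIZE)  # round up to whole pages


# Example 1 covers both ranges, i.e. MFN 0x240000 ~ 0x3c0000.
needed = mgmt_pages_needed(0x240000, 0x3c0000)
available = 0x3c0000 - 0x380000
print("needed pages: %d, available pages: %d" % (needed, available))
```

Under these assumptions the 1G managing range is far larger than required, so a smaller namespace could probably be used for it.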

How were you thinking of 'gluing' this to the libvirt (xl) way of setting
up NVDIMM? Could you explain (even in broad terms) how that would be
done? I see the 'vnvdimms' option, but I would have thought that
libxl would parse /proc/iomem (or perhaps call ndctl) to obtain
this information.

> 
> - Another primary change in v2 is dropping the support to map files on
>   the host pmem to HVM domains as virtual NVDIMMs, as I cannot find a
>   stable way to fix the fiemap of host files. Instead, we can rely on the
>   ability added in Linux kernel v4.9 that enables creating multiple
>   pmem namespaces on a single nvdimm interleave set.

Could you expand on this a bit, please? This is quite an important feature,
and I thought the mix of mlock + fiemap would have solved this?

> 
> - Other changes are logged in each patch separately.
> 
> How to Test
> ==============

Thank you for the detailed way this is explained!

> 
> 0. This patch series can be tested either on the real hardware with
>    NVDIMM, or in the nested virtualization environment on KVM. The

Real hardware, eh? Nice!

>    latter requires QEMU 2.9 or newer with, for example, the following
>    commands and options,
>      # dd if=/dev/zero of=nvm-8G.img bs=1G count=8
>      # rmmod kvm-intel; modprobe kvm-intel nested=1
>      # qemu-system-x86_64 -enable-kvm -smp 4 -cpu host,+vmx \
>                           -hda DISK_IMG_OF_XEN \
>                           -machine pc,nvdimm \
>                           -m 8G,slots=4,maxmem=128G \
>                           -object memory-backend-file,id=mem1,mem-path=nvm-8G.img,size=8G \
>                           -device nvdimm,id=nv1,memdev=mem1,label-size=2M \
>                           ...
>    The above will create a nested virtualization environment with an 8G
>    pmem mode NVDIMM device (whose last 2MB is used as the label
>    storage area).
> 
> 1. Check out Xen and QEMU from above repositories and branches. Build
>    and install Xen with qemu-xen replaced by above QEMU.
> 
> 2. Build and install Linux kernel 4.9 or later as Dom0 kernel with the
>    following configs selected:
>        CONFIG_ACPI_NFIT
>        CONFIG_LIBNVDIMM
>        CONFIG_BLK_DEV_PMEM
>        CONFIG_NVDIMM_PFN
>        CONFIG_FS_DAX
> 
> 3. Check out ndctl from https://github.com/pmem/ndctl.git. Build and
>    install ndctl in Dom0.
> 
> 4. Boot to Xen Dom0.
> 
> 5. Create pmem namespaces on the host pmem region.
>      # ndctl disable-region region0
>      # ndctl zero-labels nmem0                        // clear existing labels
>      # ndctl init-labels nmem0                        // initialize the label area
>      # ndctl enable-region region0     
>      # ndctl create-namespace -r region0 -s 4G -m raw // create one 4G pmem namespace
>      # ndctl create-namespace -r region0 -s 1G -m raw // create one 1G pmem namespace
>      # ndctl list --namespaces
>      [
>        {
>            "dev":"namespace0.0",
>            "mode":"raw",
>            "size":4294967296,
>            "uuid":"bbfbedbd-3ada-4f55-9484-01f2722c651b",
>            "blockdev":"pmem0"
>        },
>        {
>            "dev":"namespace0.1",
>            "mode":"raw",
>            "size":1073741824,
>            "uuid":"dd4d3949-6887-417b-b819-89a7854fcdbd",
>            "blockdev":"pmem0.1"
>        }
>      ]
> 
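
To pick the device path for the xl config out of that listing, something like the sketch below could be used. The helper name is hypothetical, and it assumes the JSON has the shape shown above. It also accepts a single object instead of a list, since the guest-side output later in this mail is printed that way.

```python
import json


def namespaces_by_dev(ndctl_json):
    """Map each namespace name to its block device path and size in GiB."""
    data = json.loads(ndctl_json)
    if isinstance(data, dict):  # a single namespace may come as one object
        data = [data]
    return {
        ns["dev"]: {
            "path": "/dev/" + ns["blockdev"],
            "size_gib": ns["size"] / 2**30,
        }
        for ns in data
    }


# Sample copied from the "ndctl list --namespaces" output above.
sample = """
[
  {"dev":"namespace0.0","mode":"raw","size":4294967296,
   "uuid":"bbfbedbd-3ada-4f55-9484-01f2722c651b","blockdev":"pmem0"},
  {"dev":"namespace0.1","mode":"raw","size":1073741824,
   "uuid":"dd4d3949-6887-417b-b819-89a7854fcdbd","blockdev":"pmem0.1"}
]
"""
info = namespaces_by_dev(sample)
print(info["namespace0.0"]["path"])
```
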
> 6. Ask Xen hypervisor to use namespace0.1 to manage namespace0.0.
>      # grep namespace /proc/iomem
>          240000000-33fffffff : namespace0.0
>          340000000-37fffffff : namespace0.1
>      # xen-ndctl setup 0x240000 0x340000 0x340000 0x380000
> 
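
The conversion in step 6 from /proc/iomem byte addresses to the MFN arguments of xen-ndctl can be sketched as follows. This is only an illustration, assuming 4 KiB pages and an exclusive end MFN, which matches the numbers in the example. The function name is hypothetical, and the sample text is copied from step 6 rather than read from a live system (on a real Dom0, reading the addresses from /proc/iomem typically needs root).

```python
import re

PAGE_SHIFT = 12  # assumption: 4 KiB pages

# Matches lines such as "  240000000-33fffffff : namespace0.0"
IOMEM_LINE = re.compile(r"^\s*([0-9a-f]+)-([0-9a-f]+)\s*:\s*(.*\S)\s*$", re.I)


def iomem_to_mfn_ranges(iomem_text, prefix="namespace"):
    """Return {name: (start_mfn, end_mfn)} with end_mfn exclusive."""
    ranges = {}
    for line in iomem_text.splitlines():
        m = IOMEM_LINE.match(line)
        if not m or not m.group(3).startswith(prefix):
            continue
        start = int(m.group(1), 16)
        last = int(m.group(2), 16)  # inclusive last byte of the range
        ranges[m.group(3)] = (start >> PAGE_SHIFT, (last + 1) >> PAGE_SHIFT)
    return ranges


sample = """\
    240000000-33fffffff : namespace0.0
    340000000-37fffffff : namespace0.1
"""
r = iomem_to_mfn_ranges(sample)
data, mgmt = r["namespace0.0"], r["namespace0.1"]
cmd = "xen-ndctl setup 0x%x 0x%x 0x%x 0x%x" % (data + mgmt)
print(cmd)
```

For the sample input this prints the same command line as in step 6.
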
> 7. Start an HVM domain with "vnvdimms=[ '/dev/pmem0' ]" in its xl config.
> 
>    If ndctl is installed in HVM domain, "ndctl list" should be able to
>    list a 4G pmem namespace, e.g.
>    {
>      "dev":"namespace0.0",
>      "mode":"raw",
>      "size":4294967296,
>      "blockdev":"pmem0"
>    }
>    
> 
> Haozhong Zhang (15):
>   xen/common: add Kconfig item for pmem support
>   xen: probe pmem regions via ACPI NFIT
>   xen/x86: allow customizing locations of extended frametable & M2P
>   xen/x86: add XEN_SYSCTL_nvdimm_pmem_setup to setup host pmem
>   xen/x86: add XENMEM_populate_pmemmap to map host pmem pages to HVM domain
>   tools: reserve guest memory for ACPI from device model
>   tools/libacpi: expose the minimum alignment used by mem_ops.alloc
>   tools/libacpi: add callback acpi_ctxt.p2v to get a pointer from physical address
>   tools/libacpi: add callbacks to access XenStore
>   tools/libacpi: add a simple AML builder
>   tools/libacpi: load ACPI built by the device model
>   tools/libxl: build qemu options from xl vNVDIMM configs
>   tools/libxl: add support to map host pmem device to guests
>   tools/libxl: initiate pmem mapping via qmp callback
>   tools/misc: add xen-ndctl
> 
>  .gitignore                              |   1 +
>  docs/man/xl.cfg.pod.5.in                |   6 +
>  tools/firmware/hvmloader/Makefile       |   3 +-
>  tools/firmware/hvmloader/util.c         |  75 ++++++
>  tools/firmware/hvmloader/util.h         |  10 +
>  tools/firmware/hvmloader/xenbus.c       |  44 +++-
>  tools/flask/policy/modules/dom0.te      |   2 +-
>  tools/flask/policy/modules/xen.if       |   2 +-
>  tools/libacpi/acpi2_0.h                 |   2 +
>  tools/libacpi/aml_build.c               | 326 +++++++++++++++++++++++
>  tools/libacpi/aml_build.h               | 116 +++++++++
>  tools/libacpi/build.c                   | 311 ++++++++++++++++++++++
>  tools/libacpi/libacpi.h                 |  21 ++
>  tools/libxc/include/xc_dom.h            |   1 +
>  tools/libxc/include/xenctrl.h           |  36 +++
>  tools/libxc/xc_dom_x86.c                |   7 +
>  tools/libxc/xc_domain.c                 |  15 ++
>  tools/libxc/xc_misc.c                   |  17 ++
>  tools/libxl/Makefile                    |   7 +-
>  tools/libxl/libxl_create.c              |   4 +-
>  tools/libxl/libxl_dm.c                  | 109 +++++++-
>  tools/libxl/libxl_dom.c                 |  22 ++
>  tools/libxl/libxl_nvdimm.c              | 182 +++++++++++++
>  tools/libxl/libxl_nvdimm.h              |  42 +++
>  tools/libxl/libxl_qmp.c                 | 116 ++++++++-
>  tools/libxl/libxl_types.idl             |   8 +
>  tools/libxl/libxl_x86_acpi.c            |  41 +++
>  tools/misc/Makefile                     |   4 +
>  tools/misc/xen-ndctl.c                  | 227 ++++++++++++++++
>  tools/xl/xl_parse.c                     |  16 ++
>  xen/arch/x86/acpi/boot.c                |   4 +
>  xen/arch/x86/domain.c                   |   7 +
>  xen/arch/x86/sysctl.c                   |  22 ++
>  xen/arch/x86/x86_64/mm.c                | 191 ++++++++++++--
>  xen/common/Kconfig                      |   9 +
>  xen/common/Makefile                     |   1 +
>  xen/common/compat/memory.c              |   1 +
>  xen/common/domain.c                     |   3 +
>  xen/common/memory.c                     |  43 +++
>  xen/common/pmem.c                       | 448 ++++++++++++++++++++++++++++++++
>  xen/drivers/acpi/Makefile               |   2 +
>  xen/drivers/acpi/nfit.c                 | 116 +++++++++
>  xen/include/acpi/actbl.h                |   1 +
>  xen/include/acpi/actbl1.h               |  42 +++
>  xen/include/public/hvm/hvm_xs_strings.h |  11 +
>  xen/include/public/memory.h             |  14 +-
>  xen/include/public/sysctl.h             |  29 ++-
>  xen/include/xen/acpi.h                  |   4 +
>  xen/include/xen/pmem.h                  |  66 +++++
>  xen/include/xen/sched.h                 |   3 +
>  xen/include/xsm/dummy.h                 |  11 +
>  xen/include/xsm/xsm.h                   |  12 +
>  xen/xsm/dummy.c                         |   4 +
>  xen/xsm/flask/hooks.c                   |  17 ++
>  xen/xsm/flask/policy/access_vectors     |   4 +
>  55 files changed, 2795 insertions(+), 43 deletions(-)
>  create mode 100644 tools/libacpi/aml_build.c
>  create mode 100644 tools/libacpi/aml_build.h
>  create mode 100644 tools/libxl/libxl_nvdimm.c
>  create mode 100644 tools/libxl/libxl_nvdimm.h
>  create mode 100644 tools/misc/xen-ndctl.c
>  create mode 100644 xen/common/pmem.c
>  create mode 100644 xen/drivers/acpi/nfit.c
>  create mode 100644 xen/include/xen/pmem.h
> 
> -- 
> 2.12.0
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel