Re: [Xen-devel] [RFC XEN PATCH v3 00/39] Add vNVDIMM support to HVM domains
On 10/27/17 11:26 +0800, Chao Peng wrote:
> On Mon, 2017-09-11 at 12:37 +0800, Haozhong Zhang wrote:
> > Overview
> > ==
> >
> > (RFC v2 can be found at
> > https://lists.xen.org/archives/html/xen-devel/2017-03/msg02401.html)
> >
> > Well, this RFC v3 changes and inflates a lot from previous versions.
> > The primary changes are listed below, most of which are to simplify
> > the first implementation and avoid additional inflation.
> >
> > 1. Drop the support to maintain the frametable and M2P table of PMEM
> >    in RAM. In the future, we may add this support back.
>
> I don't find any discussion in v2 about this, but I'm thinking putting
> those Xen data structures in RAM sometimes is useful (e.g. when
> performance is important). It's better not to make a hard restriction
> on this.

Well, this is to reduce the complexity; as you can see, the current
patch size is already too big. In addition, the size of NVDIMM can be
very large, e.g. several terabytes or even more, which would require a
large RAM space to store its frametable and M2P table (~10 MB per 1 GB
of PMEM) and leave less RAM for guest usage.

> > 2. Hide host NFIT and deny access to host PMEM from Dom0. In other
> >    words, the kernel NVDIMM driver is not loaded in Dom0 and existing
> >    management utilities (e.g. ndctl) do not work in Dom0 anymore.
> >    This is to work around the interference of PMEM access between
> >    Dom0 and the Xen hypervisor. In the future, we may add a stub
> >    driver in Dom0 which will hold the PMEM pages being used by the
> >    Xen hypervisor and/or other domains.
> >
> > 3. As there is no NVDIMM driver and no management utilities in Dom0
> >    now, we cannot easily specify an area of host NVDIMM (e.g., by
> >    /dev/pmem0) and manage NVDIMM in Dom0 (e.g., creating labels).
> >    Instead, we have to specify the exact MFNs of host PMEM pages in
> >    xl domain configuration files and the newly added Xen NVDIMM
> >    management utility xen-ndctl.
> > If there are indeed some tasks that have to be handled by existing
> > driver and management utilities, such as recovery from hardware
> > failures, they have to be accomplished outside the Xen environment.
>
> What kind of recovery can happen, and can the recovery happen at
> runtime? For example, can we recover a portion of NVDIMM assigned to a
> certain VM while other VMs keep using NVDIMM?

For example, evaluating ACPI _DSM (maybe vendor specific) for error
recovery and/or scrubbing bad blocks, etc.

> > After 2. is solved in the future, we would be able to make existing
> > driver and management utilities work in Dom0 again.
>
> Is there any reason why we can't do it now? If existing ndctl (with
> additional patches) can work, then we don't need to introduce
> xen-ndctl anymore? I think that keeps the user interface clearer.

The simple reason is that I want to reduce the number of components
(Xen/kernel/QEMU) touched by the first patchset (whose primary target
is to implement the basic functionality, i.e. mapping host NVDIMM to
guest as a virtual NVDIMM). As you said, leaving a driver (the NVDIMM
driver and/or a stub driver) in Dom0 would make the user interface
clearer. Let's see what I can get in the next version.

Thanks,
Haozhong

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
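An editor's aside: the ~10 MB per 1 GB figure above can be reproduced from the per-page metadata sizes quoted later in this thread (32 bytes of frametable and 8 bytes of M2P entry per page). A minimal sketch, assuming 4 KiB pages and those entry sizes:

```python
# Sketch: RAM needed for the frametable + M2P table of a PMEM region,
# assuming 4 KiB pages, 32 bytes of frametable entry and 8 bytes of
# M2P entry per page (the sizes quoted in the RFC cover letter).
PAGE_SIZE = 4096
FRAMETABLE_BYTES_PER_PAGE = 32
M2P_BYTES_PER_PAGE = 8

def metadata_overhead(pmem_bytes: int) -> int:
    """RAM (in bytes) needed to describe a PMEM region of pmem_bytes."""
    pages = pmem_bytes // PAGE_SIZE
    return pages * (FRAMETABLE_BYTES_PER_PAGE + M2P_BYTES_PER_PAGE)

GiB = 1 << 30
TiB = 1 << 40
print(metadata_overhead(1 * GiB) // (1 << 20))  # 10 (MiB per 1 GiB)
print(metadata_overhead(4 * TiB) // (1 << 30))  # 40 (GiB for 4 TiB)
```

At that ratio, a 4 TiB NVDIMM would need roughly 40 GiB of RAM just for its metadata, which is the stated motivation for keeping these tables on PMEM itself rather than in RAM.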
Re: [Xen-devel] [RFC XEN PATCH v3 00/39] Add vNVDIMM support to HVM domains
On Mon, 2017-09-11 at 12:37 +0800, Haozhong Zhang wrote:
> Overview
> ==
>
> (RFC v2 can be found at
> https://lists.xen.org/archives/html/xen-devel/2017-03/msg02401.html)
>
> Well, this RFC v3 changes and inflates a lot from previous versions.
> The primary changes are listed below, most of which are to simplify
> the first implementation and avoid additional inflation.
>
> 1. Drop the support to maintain the frametable and M2P table of PMEM
>    in RAM. In the future, we may add this support back.

I don't find any discussion in v2 about this, but I'm thinking putting
those Xen data structures in RAM sometimes is useful (e.g. when
performance is important). It's better not to make a hard restriction
on this.

> 2. Hide host NFIT and deny access to host PMEM from Dom0. In other
>    words, the kernel NVDIMM driver is not loaded in Dom0 and existing
>    management utilities (e.g. ndctl) do not work in Dom0 anymore.
>    This is to work around the interference of PMEM access between
>    Dom0 and the Xen hypervisor. In the future, we may add a stub
>    driver in Dom0 which will hold the PMEM pages being used by the
>    Xen hypervisor and/or other domains.
>
> 3. As there is no NVDIMM driver and no management utilities in Dom0
>    now, we cannot easily specify an area of host NVDIMM (e.g., by
>    /dev/pmem0) and manage NVDIMM in Dom0 (e.g., creating labels).
>    Instead, we have to specify the exact MFNs of host PMEM pages in
>    xl domain configuration files and the newly added Xen NVDIMM
>    management utility xen-ndctl.
>
> If there are indeed some tasks that have to be handled by existing
> driver and management utilities, such as recovery from hardware
> failures, they have to be accomplished outside the Xen environment.

What kind of recovery can happen, and can the recovery happen at
runtime? For example, can we recover a portion of NVDIMM assigned to a
certain VM while other VMs keep using NVDIMM?

> After 2. is solved in the future, we would be able to make existing
> driver and management utilities work in Dom0 again.

Is there any reason why we can't do it now? If existing ndctl (with
additional patches) can work, then we don't need to introduce
xen-ndctl anymore? I think that keeps the user interface clearer.

Chao
[Xen-devel] [RFC XEN PATCH v3 00/39] Add vNVDIMM support to HVM domains
Overview
==

(RFC v2 can be found at
https://lists.xen.org/archives/html/xen-devel/2017-03/msg02401.html)

Well, this RFC v3 changes and inflates a lot from previous versions.
The primary changes are listed below, most of which are to simplify
the first implementation and avoid additional inflation.

1. Drop the support to maintain the frametable and M2P table of PMEM
   in RAM. In the future, we may add this support back.

2. Hide host NFIT and deny access to host PMEM from Dom0. In other
   words, the kernel NVDIMM driver is not loaded in Dom0 and existing
   management utilities (e.g. ndctl) do not work in Dom0 anymore. This
   is to work around the interference of PMEM access between Dom0 and
   the Xen hypervisor. In the future, we may add a stub driver in Dom0
   which will hold the PMEM pages being used by the Xen hypervisor
   and/or other domains.

3. As there is no NVDIMM driver and no management utilities in Dom0
   now, we cannot easily specify an area of host NVDIMM (e.g., by
   /dev/pmem0) and manage NVDIMM in Dom0 (e.g., creating labels).
   Instead, we have to specify the exact MFNs of host PMEM pages in xl
   domain configuration files and the newly added Xen NVDIMM
   management utility xen-ndctl.

If there are indeed some tasks that have to be handled by existing
driver and management utilities, such as recovery from hardware
failures, they have to be accomplished outside the Xen environment.

After 2. is solved in the future, we would be able to make existing
driver and management utilities work in Dom0 again.

All patches can be found at
  Xen:  https://github.com/hzzhan9/xen.git nvdimm-rfc-v3
  QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v3

How to Test
==

1. Build and install this patchset with the associated QEMU patches.

2. Use xen-ndctl to get a list of PMEM regions detected by the Xen
   hypervisor, e.g.

     # xen-ndctl list --raw
     Raw PMEM regions:
      0: MFN 0x48 - 0x88, PXM 3

   which indicates a PMEM region is present at MFN 0x48 - 0x88.

3. Setup a management area to manage the guest data areas.
     # xen-ndctl setup-mgmt 0x48 0x4c
     # xen-ndctl list --mgmt
     Management PMEM regions:
      0: MFN 0x48 - 0x4c, used 0xc00

   The first command sets up the PMEM area in MFN 0x48 - 0x4c (1 GB) as
   a management area, which is also used to manage itself. The second
   command lists all management areas; the 'used' field shows the
   number of pages that have been used from the beginning of that area.

   The size ratio between a management area and the areas that it
   manages (including itself) should be at least 1 : 100 (i.e., 32
   bytes for the frametable and 8 bytes for the M2P table per page).
   The size of a management area, as well as of a data area below, is
   currently restricted to 256 MBytes or multiples thereof. The
   alignment is restricted to 2 MBytes or multiples thereof.

4. Setup a data area that can be used by guests.

     # xen-ndctl setup-data 0x4c 0x88 0x480c00 0x4c
     # xen-ndctl list --data
     Data PMEM regions:
      0: MFN 0x4c - 0x88, MGMT MFN 0x480c00 - 0x48b000

   The first command sets up the remaining PMEM pages from MFN 0x4c to
   0x88 as a data area. The management area from MFN 0x480c00 to 0x4c
   is specified to manage this data area. The management pages actually
   used can be found with the second command.

5. Assign data pages to an HVM domain by adding the following line to
   the domain configuration.

     vnvdimms = [ 'type=mfn, backend=0x4c, nr_pages=0x10' ]

   which assigns 4 GBytes of PMEM starting from MFN 0x4c to that
   domain. A 4 GBytes PMEM device should be present in the guest
   (e.g., as /dev/pmem0) after the above setup steps.

   There can be one or multiple entries in vnvdimms, which must not
   overlap with each other. Sharing PMEM pages between domains is not
   supported, so the PMEM pages assigned to each domain must not
   overlap with each other.

Patch Organization
==

This RFC v3 is composed of the following 6 parts per the tasks they
are going to solve. The tool stack patches are collected and separated
into each part.

- Part 0.
  Bug fix and code cleanup

  [01/39] x86_64/mm: fix the PDX group check in mem_hotadd_check()
  [02/39] x86_64/mm: drop redundant MFN to page conventions in cleanup_frame_table()
  [03/39] x86_64/mm: avoid cleaning the unmapped frame table

- Part 1. Detect host PMEM

  Detect host PMEM via NFIT. No frametable and M2P table for them are
  created in this part.

  [04/39] xen/common: add Kconfig item for pmem support
  [05/39] x86/mm: exclude PMEM regions from initial frametable
  [06/39] acpi: probe valid PMEM regions via NFIT
  [07/39] xen/pmem: register valid PMEM regions to Xen hypervisor
  [08/39] xen/pmem: hide NFIT and deny access to PMEM from Dom0
  [09/39] xen/pmem: add framework for hypercall XEN_SYSCTL_nvdimm_op
  [10/39] xen/pmem: add
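An editor's aside on the sizing rules in step 3 of "How to Test": the 1 : 100 management-to-managed ratio, the 256 MByte size granularity, and the 2 MByte alignment can be checked mechanically. The function below is an illustrative sketch; its name and interface are hypothetical and not part of xen-ndctl:

```python
# Illustrative check of the constraints stated in step 3 of "How to
# Test": a management area must be at least 1/100 the size of all the
# areas it manages (itself included), sizes must be multiples of
# 256 MiB, and starting offsets must be 2 MiB aligned.
# Function and parameter names here are hypothetical.
MiB = 1 << 20

def check_mgmt_area(mgmt_bytes: int, managed_bytes: int,
                    start_offset: int) -> bool:
    if start_offset % (2 * MiB) != 0:
        return False                          # 2 MiB alignment
    if mgmt_bytes % (256 * MiB) != 0:
        return False                          # 256 MiB granularity
    return mgmt_bytes * 100 >= managed_bytes  # at least 1 : 100 ratio

GiB = 1 << 30
# A 1 GiB management area can manage up to 100 GiB (incl. itself):
print(check_mgmt_area(1 * GiB, 100 * GiB, 0))  # True
print(check_mgmt_area(1 * GiB, 128 * GiB, 0))  # False: ratio too small
```

The 1 : 100 figure is slightly conservative: 32 + 8 = 40 metadata bytes per 4096-byte page is an exact ratio of 1 : 102.4, so requiring 1 : 100 leaves a small margin.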