On Tue, 26 Feb 2019 14:11:58 +0100 Auger Eric <eric.au...@redhat.com> wrote:
> Hi Igor,
> 
> On 2/26/19 9:40 AM, Auger Eric wrote:
> > Hi Igor,
> > 
> > On 2/25/19 10:42 AM, Igor Mammedov wrote:
> >> On Fri, 22 Feb 2019 18:35:26 +0100
> >> Auger Eric <eric.au...@redhat.com> wrote:
> >> 
> >>> Hi Igor,
> >>> 
> >>> On 2/22/19 5:27 PM, Igor Mammedov wrote:
> >>>> On Wed, 20 Feb 2019 23:39:46 +0100
> >>>> Eric Auger <eric.au...@redhat.com> wrote:
> >>>> 
> >>>>> This series aims to bump the 255GB RAM limit in machvirt and to
> >>>>> support device memory in general, especially PCDIMM/NVDIMM.
> >>>>> 
> >>>>> In machvirt versions < 4.0, the initial RAM starts at 1GB and can
> >>>>> grow up to 255GB. From 256GB onwards we find IO regions such as
> >>>>> the additional GICv3 RDIST region, the high PCIe ECAM region and
> >>>>> the high PCIe MMIO region. The address map was 1TB large, which
> >>>>> corresponded to the max IPA capacity KVM was able to manage.
> >>>>> 
> >>>>> Since Linux 4.20, the host kernel supports a larger, dynamic IPA
> >>>>> range, so the guest physical address space can go beyond 1TB.
> >>>>> The max GPA size depends on the host kernel configuration and on
> >>>>> the physical CPUs.
> >>>>> 
> >>>>> This series uses that feature and allows the RAM to grow with no
> >>>>> limit other than the one imposed by the host kernel.
> >>>>> 
> >>>>> The RAM still starts at 1GB. First comes the initial RAM (-m) of
> >>>>> size ram_size, followed by the device memory (,maxmem) of size
> >>>>> maxram_size - ram_size. The device memory is potentially
> >>>>> hotpluggable, depending on the instantiated memory objects.
> >>>>> 
> >>>>> IO regions previously located between 256GB and 1TB are moved
> >>>>> after the RAM. Their offsets are computed dynamically from
> >>>>> ram_size and maxram_size, and size alignment is enforced.
> >>>>> 
> >>>>> If the maxmem value is below 255GB, the legacy memory map is
> >>>>> still used. The memory map change becomes effective from 4.0
> >>>>> onwards.
> >>>>> 
> >>>>> As we keep the initial RAM at the 1GB base address, we do not
> >>>>> need to make invasive changes in the EDK2 FW. It seems nobody is
> >>>>> eager to do that job at the moment.
> >>>>> 
> >>>>> Since the device memory is placed just after the initial RAM, it
> >>>>> is possible to use this feature while keeping a 1TB address map.
> >>>>> 
> >>>>> This series reuses/rebases patches initially submitted by
> >>>>> Shameer in [1] and Kwangwoo in [2] for the PC-DIMM and NV-DIMM
> >>>>> parts.
> >>>>> 
> >>>>> Functionally, the series is split into 3 parts:
> >>>>> 1) bump of the initial RAM limit [1 - 9] and change in
> >>>>> the memory map
> >>>> 
> >>>>> 2) Support of PC-DIMM [10 - 13]
> >>>> Is this part complete ACPI-wise (for coldplug)? I haven't noticed
> >>>> any DSDT AML here nor E820 changes, so ACPI-wise pc-dimm
> >>>> shouldn't be visible to the guest. It might be that DT is masking
> >>>> the problem, but that won't work on ACPI-only guests.
> >>> 
> >>> The guest's /proc/meminfo or "lshw -class memory" reflects the
> >>> amount of memory added with the DIMM slots.
> >> Question is how does it get there? Does it come from DT or from
> >> firmware via UEFI interfaces?
> >> 
> >>> So it looks fine to me. Isn't E820 a pure x86 matter?
> >> Sorry for the confusion, I meant UEFI GetMemoryMap().
> >> On x86, I'm wary of adding PC-DIMMs to E820, which then gets
> >> exposed via UEFI GetMemoryMap(), as the guest kernel might start
> >> using that memory as normal memory early at boot and later put it
> >> into zone normal, hence making it non-hot-un-pluggable. The same
> >> concerns apply to DT based means of discovery.
> >> (That's a guest issue, but it is easy to work around by not putting
> >> hotpluggable memory into UEFI GetMemoryMap() or DT, and letting the
> >> DSDT describe it properly.)
> >> That way the memory doesn't get (ab)used by firmware or early boot
> >> kernel stages, and doesn't get locked up.
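
For illustration, the dynamic placement the cover letter above
describes amounts to roughly the following C sketch; the region sizes
and alignments are hypothetical stand-ins, not the series' actual
values:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    #define GiB (1024ULL * 1024 * 1024)

    /* Align addr up to the next multiple of align (a power of two). */
    static uint64_t align_up(uint64_t addr, uint64_t align)
    {
        return (addr + align - 1) & ~(align - 1);
    }

    int main(void)
    {
        uint64_t ram_size    = 8 * GiB;    /* -m */
        uint64_t maxram_size = 512 * GiB;  /* ,maxmem= */
        uint64_t ram_base    = 1 * GiB;    /* initial RAM still at 1GB */

        /* Device memory of size maxram_size - ram_size follows the
         * initial RAM (1GiB alignment of its base is an assumption). */
        uint64_t device_mem_base = align_up(ram_base + ram_size, GiB);
        uint64_t cursor = device_mem_base + (maxram_size - ram_size);

        /* Hypothetical sizes for the relocated IO regions; each one
         * is aligned on its own size. */
        uint64_t redist2_size = 1 * GiB;
        uint64_t ecam_size    = 256 * 1024 * 1024ULL;
        uint64_t mmio_size    = 512 * GiB;

        uint64_t high_redist2 = align_up(cursor, redist2_size);
        cursor = high_redist2 + redist2_size;
        uint64_t high_ecam = align_up(cursor, ecam_size);
        cursor = high_ecam + ecam_size;
        uint64_t high_mmio = align_up(cursor, mmio_size);
        cursor = high_mmio + mmio_size; /* top of guest address map */

        printf("device memory    @ 0x%" PRIx64 "\n", device_mem_base);
        printf("high GICv3 RDIST @ 0x%" PRIx64 "\n", high_redist2);
        printf("high PCIe ECAM   @ 0x%" PRIx64 "\n", high_ecam);
        printf("high PCIe MMIO   @ 0x%" PRIx64 "\n", high_mmio);
        printf("top of map       = 0x%" PRIx64 "\n", cursor);
        return 0;
    }

With maxmem below 255GB nothing moves and the legacy 1TB map is kept,
as the cover letter states.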
> >> 
> >>> What else would you expect in the dsdt?
> >> Memory device descriptions; look for the code that adds PNP0C80
> >> devices with a _CRS describing the memory ranges.
> > 
> > OK, thank you for the explanations. I will work on the PNP0C80
> > addition then. Does it mean that in ACPI mode we must not output
> > the DT hotplug memory nodes, or can we assume that, as long as
> > PNP0C80 is properly described, it will "override" the DT
> > description?
> 
> After further investigations, I think the pieces you pointed out are
> added by Shameer's series, i.e. through the build_memory_hotplug_aml()
> call. So I suggest we separate the concerns: this series brings
> support for DIMM coldplug; hotplug, including all the relevant ACPI
> structures, will be added later on by Shameer. Maybe we should not
> put pc-dimms in the DT for this series until it is clear whether that
> conflicts with ACPI in some way.
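
For reference, a minimal sketch of what a static (coldplug-only)
PNP0C80 description could look like with QEMU's aml-build helpers.
The "MP%02X" device naming and the fixed _CRS are assumptions made
for the example, not what build_memory_hotplug_aml() actually emits:

    #include "qemu/osdep.h"
    #include "hw/acpi/aml-build.h"

    /* Describe one coldplugged DIMM as a PNP0C80 memory device with a
     * static _CRS covering its GPA range. A real implementation would
     * iterate over the present pc-dimms and also provide _STA. */
    static void build_static_dimm_device(Aml *scope, unsigned slot,
                                         uint64_t addr, uint64_t size)
    {
        Aml *dev = aml_device("MP%02X", slot);
        Aml *crs = aml_resource_template();

        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0C80")));
        aml_append(dev, aml_name_decl("_UID", aml_int(slot)));

        aml_append(crs,
            aml_qword_memory(AML_POS_DECODE, AML_MIN_FIXED,
                             AML_MAX_FIXED, AML_CACHEABLE,
                             AML_READ_WRITE,
                             0x0,              /* granularity */
                             addr,             /* range minimum */
                             addr + size - 1,  /* range maximum */
                             0x0,              /* translation offset */
                             size));           /* length */
        aml_append(dev, aml_name_decl("_CRS", crs));

        aml_append(scope, dev);
    }

A guest booted in ACPI mode would then enumerate the DIMMs from these
devices; whether the corresponding DT memory nodes should still be
emitted at the same time is exactly the open question above.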