> -----Original Message-----
> From: Auger Eric [mailto:eric.au...@redhat.com]
> Sent: 05 July 2018 13:18
> To: David Hildenbrand <da...@redhat.com>; eric.auger....@gmail.com;
> qemu-devel@nongnu.org; qemu-...@nongnu.org; peter.mayd...@linaro.org;
> Shameerali Kolothum Thodi <shameerali.kolothum.th...@huawei.com>;
> imamm...@redhat.com
> Cc: w...@redhat.com; drjo...@redhat.com; da...@gibson.dropbear.id.au;
> dgilb...@redhat.com; ag...@suse.de
> Subject: Re: [Qemu-devel] [RFC v3 06/15] hw/arm/virt: Allocate
> device_memory
>
> Hi David,
>
> On 07/05/2018 02:09 PM, David Hildenbrand wrote:
> > On 05.07.2018 14:00, Auger Eric wrote:
> >> Hi David,
> >>
> >> On 07/05/2018 01:54 PM, David Hildenbrand wrote:
> >>> On 05.07.2018 13:42, Auger Eric wrote:
> >>>> Hi David,
> >>>>
> >>>> On 07/04/2018 02:05 PM, David Hildenbrand wrote:
> >>>>> On 03.07.2018 21:27, Auger Eric wrote:
> >>>>>> Hi David,
> >>>>>> On 07/03/2018 08:25 PM, David Hildenbrand wrote:
> >>>>>>> On 03.07.2018 09:19, Eric Auger wrote:
> >>>>>>>> We define a new hotpluggable RAM region (aka. device memory).
> >>>>>>>> Its base is at 2TB GPA. This obviously requires 42b IPA support
> >>>>>>>> in KVM/ARM, FW and guest kernel. At the moment the device
> >>>>>>>> memory region is max 2TB.
> >>>>>>>
> >>>>>>> Maybe a stupid question, but why exactly does it have to start at 2TB
> >>>>>>> (and not e.g. at 1TB)?
> >>>>>> not a stupid question. See tentative answer below.
> >>>>>>>
> >>>>>>>> This is largely inspired by device memory initialization in
> >>>>>>>> pc machine code.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Eric Auger <eric.au...@redhat.com>
> >>>>>>>> Signed-off-by: Kwangwoo Lee <kwangwoo....@sk.com>
> >>>>>>>> ---
> >>>>>>>>  hw/arm/virt.c         | 104 ++++++++++++++++++++++++++++++++++++--------------
> >>>>>>>>  include/hw/arm/arm.h  |   2 +
> >>>>>>>>  include/hw/arm/virt.h |   1 +
> >>>>>>>>  3 files changed, 79 insertions(+), 28 deletions(-)
> >>>>>>>>
> >>>>>>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> >>>>>>>> index 5a4d0bf..6fefb78 100644
> >>>>>>>> --- a/hw/arm/virt.c
> >>>>>>>> +++ b/hw/arm/virt.c
> >>>>>>>> @@ -59,6 +59,7 @@
> >>>>>>>>  #include "qapi/visitor.h"
> >>>>>>>>  #include "standard-headers/linux/input.h"
> >>>>>>>>  #include "hw/arm/smmuv3.h"
> >>>>>>>> +#include "hw/acpi/acpi.h"
> >>>>>>>>
> >>>>>>>>  #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
> >>>>>>>>      static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
> >>>>>>>> @@ -94,34 +95,25 @@
> >>>>>>>>
> >>>>>>>>  #define PLATFORM_BUS_NUM_IRQS 64
> >>>>>>>>
> >>>>>>>> -/* RAM limit in GB. Since VIRT_MEM starts at the 1GB mark, this means
> >>>>>>>> - * RAM can go up to the 256GB mark, leaving 256GB of the physical
> >>>>>>>> - * address space unallocated and free for future use between 256G and 512G.
> >>>>>>>> - * If we need to provide more RAM to VMs in the future then we need to:
> >>>>>>>> - *  * allocate a second bank of RAM starting at 2TB and working up
> >>>>>> I acknowledge this comment was the main justification. Now if you look at
> >>>>>>
> >>>>>> Principles of ARM Memory Maps
> >>>>>> http://infocenter.arm.com/help/topic/com.arm.doc.den0001c/DEN0001C_principles_of_arm_memory_maps.pdf
> >>>>>> chapter 2.3, you will find that when adding PA bits, you always leave
> >>>>>> space for reserved regions and mapped IO.
> >>>>>
> >>>>> Thanks for the pointer!
> >>>>>
> >>>>> So we can fit
> >>>>>
> >>>>> a) 2GB at 2GB
> >>>>> b) 32GB at 32GB
> >>>>> c) 512GB at 512GB
> >>>>> d) 8TB at 8TB
> >>>>> e) 128TB at 128TB
> >>>>>
> >>>>> (this is a nice rule of thumb if I understand it correctly :) )
> >>>>>
> >>>>> We should strive for device memory (maxram_size - ram_size) to fit
> >>>>> exactly into one of these slots (otherwise things get nasty).
> >>>>>
> >>>>> Depending on the ram_size, we might have simpler setups and can support
> >>>>> more configurations, no?
> >>>>>
> >>>>> E.g. ram_size <= 34GB, device_memory <= 512GB
> >>>>> -> move ram into a) and b)
> >>>>> -> move device memory into c)
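(As an aside, to make that rule of thumb concrete: each slot is 16 times the
previous one and naturally aligned, i.e. a slot of size S sits at base S. A
rough sketch in C; the helper name is made up and this is just an
illustration of the convention, not anything from the patch:

    #include <stdint.h>

    #define GiB (1ULL << 30)

    /*
     * Hypothetical helper: return the base of the smallest "size at size"
     * slot (2GB@2GB, 32GB@32GB, 512GB@512GB, 8TB@8TB, 128TB@128TB) that
     * can hold @size bytes, or 0 if even the 128TB slot is too small.
     */
    static uint64_t slot_base_for(uint64_t size)
    {
        uint64_t slot = 2 * GiB;              /* first slot: 2GB at 2GB */

        while (slot <= 128ULL * 1024 * GiB) { /* up to 128TB at 128TB */
            if (size <= slot) {
                return slot;                  /* base equals slot size */
            }
            slot *= 16;                       /* next slot in the series */
        }
        return 0;
    }

With that, a device_memory region of, say, 100GB would land in the
512GB@512GB slot.)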
> >>>> The issue is that machvirt doesn't comply with that document.
> >>>> At the moment we have:
> >>>> 0 -> 1GB MMIO
> >>>> 1GB -> 256GB RAM
> >>>> 256GB -> 512GB theoretically reserved for IO, but most of it is free.
> >>>> 512GB -> 1TB reserved for the ECAM MMIO range. This is the top of our
> >>>> existing 40b GPA space.
> >>>>
> >>>> We don't want to change this address map due to legacy reasons.
> >>>
> >>> Thanks, good to know!
> >>>
> >>>> Another question: do you know if it would be possible to have the
> >>>> device_memory region split into several discontinuous segments?
> >>>
> >>> It can be implemented for sure, but I would try to avoid that, as it
> >>> makes certain configurations impossible (and is very unfriendly to the
> >>> end user).
> >>>
> >>> E.g. (numbers completely made up, but they should show what I mean)
> >>>
> >>> -m 20G,maxmem=120G:
> >>> -> Try to add a DIMM with 100G -> error.
> >>> -> But we can add e.g. two DIMMs with 40G and 60G.
> >>>
> >>> This exposes internal details to the end user, and the end user has no
> >>> idea what is going on.
> >>>
> >>> So I think we should try our best to keep that area contiguous.
> >>
> >> Actually I didn't sufficiently detail the context. I would like
> >> 1) one segment to be exposed to the end user through the slots|maxmem
> >> options (what this series targets), and
> >> 2) another segment used to instantiate PC-DIMMs for internal needs, as a
> >> replacement for part of the 1GB -> 256GB static RAM. This was the
> >> purpose of Shameer's original series [1].
> >
> > I am not sure if PC-DIMMs are exactly what you want for internal purposes.
> >
> >> [1] [RFC v2 0/6] hw/arm: Add support for non-contiguous iova regions
> >> http://patchwork.ozlabs.org/cover/914694/
> >> This approach is not yet validated though.
> >>
> >> The rationale is that sometimes you must have "holes" in RAM because
> >> some GPAs match reserved IOVAs for assigned devices.
> >
> > So if I understand it correctly, all you want is some memory region that
> > a) contains only initially defined memory
> > b) can have some holes in it
> >
> > This is exactly what x86 already does (pc_memory_init): simply construct
> > your own memory region, leaving holes in it.
> >
> > memory_region_init_alias(ram_below_4g, NULL, "ram-below-4g", ram,
> >                          0, pcms->below_4g_mem_size);
> > memory_region_add_subregion(system_memory, 0, ram_below_4g);
> > ...
> > if (pcms->above_4g_mem_size > 0)
> >     memory_region_init_alias(ram_above_4g, NULL, "ram-above-4g", ram,
> > ...
> >     memory_region_add_subregion(system_memory, 0x100000000ULL,
> > ...
> >
> > These different GPA areas are then "indicated" to the guest via the e820
> > map.
> >
> > Would that also work for you?
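(For reference, a rough sketch of what that alias trick might look like on
virt. The function name, the single backing block, and the 64GB..65GB hole
are all made up for illustration; it also assumes ram_size is large enough
to reach past the hole:

    #include "qemu/osdep.h"
    #include "qemu/units.h"
    #include "exec/memory.h"
    #include "hw/boards.h"

    /*
     * Illustration only: carve one reserved-IOVA hole out of the
     * 1GB -> 256GB RAM area using aliases, in the spirit of
     * pc_memory_init(). The hole boundaries are invented values.
     */
    static void virt_memory_init_with_hole(MachineState *machine,
                                           MemoryRegion *sysmem)
    {
        MemoryRegion *ram = g_new(MemoryRegion, 1);
        MemoryRegion *below = g_new(MemoryRegion, 1);
        MemoryRegion *above = g_new(MemoryRegion, 1);
        const hwaddr ram_base = GiB;           /* VIRT_MEM base, 1GB */
        const hwaddr hole_start = 64 * GiB;    /* assumed reserved IOVA */
        const hwaddr hole_end = 65 * GiB;
        uint64_t below_size = hole_start - ram_base;

        /* one backing block for all of the initial RAM */
        memory_region_allocate_system_memory(ram, NULL, "mach-virt.ram",
                                             machine->ram_size);

        /* the part below the hole stays at its usual GPA */
        memory_region_init_alias(below, NULL, "ram-below-hole", ram,
                                 0, below_size);
        memory_region_add_subregion(sysmem, ram_base, below);

        /* the remainder is remapped above the hole */
        memory_region_init_alias(above, NULL, "ram-above-hole", ram,
                                 below_size,
                                 machine->ram_size - below_size);
        memory_region_add_subregion(sysmem, hole_end, above);
    }

The guest would then have to learn about the two banks via the DT memory
nodes or ACPI, which would be the arm64 counterpart of the e820 trick
above.)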
> I would tentatively say yes. Effectively, I am not sure that, if we were
> to actually put holes in the 1GB -> 256GB RAM segment, PC-DIMM would be
> the natural choice. Also, the reserved IOVA issue impacts the
> device_memory region area, I think. I am skeptical about whether we can
> put holes in the static RAM and device_memory regions like that.
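(To make the overlap concern concrete: whichever region the holes end up in,
something has to check candidate GPA ranges against the host's reserved
windows before placing RAM or device_memory there. A minimal sketch; the
ReservedRegion struct is invented here, however the windows are actually
obtained from the host:

    #include <stdbool.h>
    #include <stdint.h>

    /* hypothetical reserved-IOVA window of an assigned device */
    typedef struct {
        uint64_t start;
        uint64_t end;      /* inclusive */
    } ReservedRegion;

    /* true if [base, base + size) collides with any reserved window */
    static bool range_hits_reserved(uint64_t base, uint64_t size,
                                    const ReservedRegion *res, int nr_res)
    {
        for (int i = 0; i < nr_res; i++) {
            if (base <= res[i].end && res[i].start < base + size) {
                return true;   /* cannot back this GPA range with RAM */
            }
        }
        return false;
    }
)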
The first approach [1] we had for addressing the holes in memory was the
memory alias way mentioned above. Based on Drew's review, the PC-DIMM way
of handling it was introduced; I think the main argument was that it would
be useful when we eventually support hotplug. But since hotplug support is
added anyway as part of this series, I am not sure we gain any other
benefit from modeling it as a PC-DIMM. Maybe I am missing something here.

Thanks,
Shameer

[1] https://lists.gnu.org/archive/html/qemu-arm/2018-04/msg00243.html

> Thanks!
>
> Eric
>
> >>
> >> Thanks
> >>
> >> Eric
> >>
> >>>
> >>>> Thanks
> >>>>
> >>>> Eric
> >>>
> >>>
> >
> >