On 8/8/23 12:05 AM, David Hildenbrand wrote:
> On 07.08.23 14:41, David Hildenbrand wrote:
>> On 07.08.23 14:27, Michal Hocko wrote:
>>> On Sat 05-08-23 19:54:23, Aneesh Kumar K V wrote:
>>> [...]
>>>> Do you see a need for firmware-managed memory to be hotplugged in with
>>>> different memory block sizes?
>>>
>>> In short. Yes. Slightly longer, a fixed size memory block semantic is
>>> just standing in the way and I would even argue it is actively harmful.
>>> Just have a look at the ridiculously small memory blocks on ppc. I do
>>> understand that it makes some sense to be aligned to the memory model
>>> (so sparsemem section aligned). In an ideal world, memory hotplug v2
>>> interface (if we ever go that path) should be physical memory range based.
>>
>> Yes, we discussed that a couple of times already (and so far nobody
>> cared to implement any of that).
>>
>> Small memory block sizes are very beneficial for use cases like PPC
>> DLPAR, virtio-mem, hyperv-balloon, ... essentially in most virtual
>> environments where you might want to add/remove memory in very small
>> granularity. I don't see that changing any time soon. Rather the opposite.
>>
>> Small memory block sizes are suboptimal for large machines where you
>> might never end up removing such memory (boot memory), or when dealing
>> with devices that can only be removed in one piece (DIMM/kmem). We
>> already have memory groups in place to model that.
>>
>> For the latter it might be beneficial to have memory blocks of larger
>> size that correspond to the physical memory ranges. That might also make
>> a memmap (re-)configuration easier.
>>
>> Not sure if that is standing in any way or is harmful, though.
>>
> 
> Just because I thought of something right now, I'll share it, maybe it makes 
> sense.
> 
> Assume when we get add_memory*(MHP_MEMMAP_ON_MEMORY) and it is enabled by the 
> admin:
> 
> 1) We create a single altmap at the beginning of the memory
> 
> 2) We create the existing fixed-size memory block devices, but flag them
>    to be part of a single "altmap" unit.
> 
> 3) Whenever we trigger offlining of a single such memory block, we
>    offline *all* memory blocks belonging to that altmap, essentially
>    using a single offline_pages() call and updating all memory block
>    states accordingly.
> 
> 4) Whenever we trigger onlining of a single such memory block, we
>    online *all* memory blocks belonging to that altmap, using a single
>    online_pages() call.
> 
> 5) We fail remove_memory() if it doesn't cover the same (altmap) range.
> 
> So we can avoid having a memory block v2 (and all that comes with that ...) 
> for now and still get that altmap stuff sorted out. As that altmap behavior 
> can be controlled by the admin, we should be fine for now.
> 
> I think all memory notifiers should already be able to handle bigger 
> granularity, but it would be easy to check. Some internal things might 
> require a bit of tweaking.
> 
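
For illustration only, here is a minimal user-space sketch of the five steps
quoted above. It is a toy model in Python, not kernel code: AltmapUnit,
MemoryBlock, range_calls and remove_covers_unit are hypothetical names, and
the strings "online_pages"/"offline_pages" only record that one ranged call
would be issued for the whole unit. Sizes are arbitrary numbers, not real
memory block sizes.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    OFFLINE = "offline"
    ONLINE = "online"


@dataclass
class MemoryBlock:
    """One fixed-size memory block device (toy model)."""
    start: int
    size: int
    state: State = State.OFFLINE


@dataclass
class AltmapUnit:
    """Hypothetical group of memory blocks that share a single altmap."""
    start: int
    size: int
    block_size: int
    blocks: list = field(default_factory=list)
    range_calls: list = field(default_factory=list)  # log of ranged operations

    def __post_init__(self):
        if self.size % self.block_size:
            raise ValueError("range must be a multiple of the block size")
        # Step 1: the single altmap is assumed to sit at the start of the range.
        self.altmap_start = self.start
        # Step 2: create the fixed-size blocks, all flagged as part of this unit.
        self.blocks = [
            MemoryBlock(addr, self.block_size)
            for addr in range(self.start, self.start + self.size, self.block_size)
        ]

    def _set_all(self, op, state):
        # One ranged call for the whole unit, then update every block state.
        self.range_calls.append((op, self.start, self.size))
        for block in self.blocks:
            block.state = state

    def offline(self, block_index):
        # Step 3: offlining any one block offlines all blocks of the unit.
        self.blocks[block_index]  # raises IndexError for an unknown block
        self._set_all("offline_pages", State.OFFLINE)

    def online(self, block_index):
        # Step 4: onlining any one block onlines all blocks of the unit.
        self.blocks[block_index]  # raises IndexError for an unknown block
        self._set_all("online_pages", State.ONLINE)

    def remove_covers_unit(self, start, size):
        # Step 5: removal is only acceptable for exactly the altmap range.
        return (start, size) == (self.start, self.size)
```

With a unit of four blocks, unit.online(2) flips all four blocks to ONLINE
and logs a single ranged call, while remove_covers_unit() rejects any range
that is smaller than the full unit.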

We can look at the possibility of using the altmap space reserved for a 
namespace (via option -M dev) for allocating struct page memory even with 
dax/kmem. 

-aneesh
