Michal Hocko <mho...@kernel.org> writes:

> From: Michal Hocko <mho...@suse.com>
>
> This knob has been added by 31bc3858ea3e ("memory-hotplug: add automatic
> onlining policy for the newly added memory") mainly to cover memory
> hotplug based balooning solutions currently implemented for HyperV
> and Xen. Both of them want to online the memory as soon after
> registering as possible otherwise they can register too much memory
> which cannot be used and trigger the oom killer (we need ~1.5% of the
> registered memory so a large increase can consume all the available
> memory). hv_mem_hot_add even waits for the userspace to online the
> memory if the auto onlining is disabled to mitigate that problem.
>
> Adding yet another knob and a config option just doesn't make much sense
> IMHO. How is a random user supposed to know when to enable this option?
> Ballooning drivers know much better that they want to do an immediate
> online rather than waiting for the userspace to do that. If the memory
> is onlined for a different purpose then we already have a notification
> for the userspace and udev can handle the onlining. So the knob as well
> as the config option for the default behavior just doesn't make any
> sense. Let's remove them and allow user of add_memory to request the
> online status explicitly. Not only it makes more sense it also removes a
> lot of clutter.
>
> Signed-off-by: Michal Hocko <mho...@suse.com>
> ---
>
> Hi,
> I am sending this as an RFC because this is a user visible change. Maybe
> we won't be able to remove the sysfs knob which would be sad, especially
> when it has been added without a wider discussion and IMHO it is just
> wrong. Is there any reason why a kernel command line parameter wouldn't
> work just fine?
>
> Even in that case I believe that we should remove
> CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE knob. It just adds to an already
> messy config space. Does anybody depend on the policy during the early
> boot before the userspace can set the sysfs knob? Or why those users cannot
> simply use the kernel command line parameter.
>
> I also believe that the wait-for-userspace in hyperV should just die. It
> should do the unconditional onlining. Same as Xen. I do not see any
> reason why those should depend on the userspace. This should be just
> fixed regardless of the sysfs/config part. I can separate this out of course.
>
> Thoughts/Concerns?

I don't have anything new to add to the discussion that happened last
week, but I'd like to summarize my arguments against this change:

1) This patch doesn't solve any issue. A configuration option is not an
   issue by itself; it lets distros decide what they want to ship: a
   udev rule with known issues (legacy mode) or the new option enabled.
   Distro makers and users building their own kernels should be able to
   answer the simple question "do you want to automatically online all
   newly added memory or not". There are already distros which ship
   kernels with CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE enabled (Fedora 24
   and 25 as far as I remember, maybe others).

2) This patch creates an imbalance between Xen/Hyper-V on one side and
   KVM/VMware on the other. KVM/VMware use pure ACPI memory hotplug and
   this memory won't get onlined. I don't understand how distros are
   supposed to solve this problem. They'll *have to* continue shipping
   a udev rule which has, and always will have, issues.

3) The kernel command line is not a viable choice; it is rather a debug
   method. Having all newly added memory online as soon as possible is
   a major use case, not something only a couple of users want (as
   shown by major distros shipping the unconditional 'offline->online'
   rule with udev).

A couple of other thoughts:

1) Having all newly added memory online ASAP is probably what people
   want for all virtual machines. Unfortunately, we have additional
   complexity with memory zones (ZONE_NORMAL, ZONE_MOVABLE), and in
   some cases manual intervention is required, especially when a later
   unplug is expected.

2) Adding new memory can (in some extreme cases) still fail, as we need
   some *other* memory before we're able to online the newly added
   block. This is an issue to be solved, and it is doable (IMO) with
   some pre-allocation.
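
To put rough numbers on that last point, here is a back-of-the-envelope
sketch. It assumes the overhead is dominated by per-page descriptors, with
a 64-byte descriptor per 4 KiB page (typical for x86_64, but an assumption
here, not something stated in this thread); that ratio matches the "~1.5%"
figure quoted above. The function name is made up for illustration.

```python
GIB = 1 << 30
MIB = 1 << 20


def memmap_overhead_bytes(added_bytes, page_size=4096, struct_page_size=64):
    """Estimate the already-usable memory needed to describe a hot-added range.

    Assumption: one fixed-size descriptor per page; the default sizes are
    illustrative values for x86_64, not taken from the patch under discussion.
    """
    pages = added_bytes // page_size
    return pages * struct_page_size


if __name__ == "__main__":
    for added_gib in (1, 16, 128):
        need = memmap_overhead_bytes(added_gib * GIB)
        print(f"hot-add {added_gib} GiB: about {need // MIB} MiB needed up front")
```

Under these assumptions a guest that hot-adds a large amount of memory
without onlining any of it has to find the metadata in what it already has,
which is why delayed onlining can end in the OOM killer.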

I'd also like to note that this patch doesn't re-introduce the issue I
was fixing with in-kernel memory onlining, as all memory added through
the Hyper-V driver will be auto-onlined unconditionally. What I disagree
with here is taking away choice without fixing any real-world issues.

[snip]

--
Vitaly