On 26.04.2016 23:03, Michael Roth wrote:
> Quoting Igor Mammedov (2016-04-26 02:52:36)
>> On Tue, 26 Apr 2016 10:39:23 +0530
>> Bharata B Rao <bhar...@linux.vnet.ibm.com> wrote:
>>
>>> On Mon, Apr 25, 2016 at 11:20:50AM +0200, Igor Mammedov wrote:
>>>> On Wed, 16 Mar 2016 10:11:54 +0530
>>>> Bharata B Rao <bhar...@linux.vnet.ibm.com> wrote:
>>>>
>>>>> On Wed, Mar 16, 2016 at 12:36:05PM +1100, David Gibson wrote:
>>>>>> On Tue, Mar 15, 2016 at 10:08:56AM +0530, Bharata B Rao wrote:
>>>>>>> Add support to hot remove pc-dimm memory devices.
>>>>>>>
>>>>>>> Signed-off-by: Bharata B Rao <bhar...@linux.vnet.ibm.com>
>>>>>>
>>>>>> Reviewed-by: David Gibson <da...@gibson.dropbear.id.au>
>>>>>>
>>>>>> Looks correct, but again, needs to wait on the PAPR change.
>>>> [...]
>>>>>
>>>>> While we are here, I would also like to get some opinion on the real
>>>>> need for memory unplug. Is there anything that memory unplug gives us
>>>>> which memory ballooning (shrinking mem via ballooning) can't give ?
>>>> Sure ballooning can complement memory hotplug but turning it on would
>>>> effectively reduce hotplug to ballooning as it would enable overcommit
>>>> capability instead of hard partitioning pc-dimms provides. So one
>>>> could just use ballooning only and not bother with hotplug at all.
>>>>
>>>> On the other hand memory hotplug/unplug (at least on x86) tries
>>>> to model real hardware, thus removing need in paravirt ballooning
>>>> solution in favor of native guest support.
>>>
>>> Thanks for your views.
>>>
>>>>
>>>> PS:
>>>> Guest wise, currently hot-unplug is not well supported in linux,
>>>> i.e. it's not guaranteed that guest will honor unplug request
>>>> as it may pin dimm by using it as a non migratable memory. So
>>>> there is something to work on guest side to make unplug more
>>>> reliable/guaranteed.
>>>
>>> In the above scenario where the guest doesn't allow removal of certain
>>> parts of DIMM memory, what is the expected behaviour as far as QEMU
>>> DIMM device is concerned ? I seem to be running into this situation
>>> very often with PowerPC mem unplug where I am left with a DIMM device
>>> that has only some memory blocks released. In this situation, I would like
>>> to block further unplug requests on the same device, but QEMU seems
>>> to allow more such unplug requests to come in via the monitor. So
>>> qdev won't help me here ? Should I detect such condition from the
>>> machine unplug() handler and take required action ?
>> I think offlining is a guest's task along with recovering from
>> inability to offline (i.e. offline all + eject or restore original state).
>> QEMU does its job by notifying guest what dimm it wants to remove
>> and removes it when guest asks it (at least in x86 world).
>
> In the case of pseries, the DIMM abstraction isn't really exposed to
> the guest, but rather the memory blocks we use to make the backing
> memdev memory available to the guest. During unplug, the guest
> completely releases these blocks back to QEMU, and if it can only
> release a subset of what's requested it does not attempt to recover.
> We can potentially change that behavior on the guest side, since
> partially-freed DIMMs aren't currently useful on the host-side...
>
> But, in the case of pseries, I wonder if it makes sense to maybe go
> ahead and MADV_DONTNEED the ranges backing these released blocks so the
> host can at least partially reclaim the memory from a partially
> unplugged DIMM?

Sounds like this could be a good compromise.

 Thomas
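
For illustration, the MADV_DONTNEED idea quoted above can be sketched in plain C on Linux. This is not QEMU code and not the pseries implementation: release_block() and demo_partial_release() are hypothetical names, the "DIMM" is just an anonymous private mmap() region, and the block size is an arbitrary choice for the demo. Other kinds of backing memory (file-backed, shared, hugetlbfs) may behave differently and are not covered here.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: tell the host kernel that the pages backing
 * [base + offset, base + offset + len) are no longer needed.
 * offset and len must be multiples of the page size.
 * Returns 0 on success, -1 on error (errno set by madvise()).
 */
static int release_block(void *base, size_t offset, size_t len)
{
    return madvise((char *)base + offset, len, MADV_DONTNEED);
}

/*
 * Demo: model a "DIMM" made of two blocks, release only the second one,
 * and check that the first block keeps its data while the released block
 * reads back as zero-filled pages.
 * Returns 1 if that holds, 0 if not, -1 on a setup error.
 */
static int demo_partial_release(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    size_t block = (size_t)page * 4;   /* arbitrary block size for the demo */
    size_t total = block * 2;          /* the whole pretend DIMM */

    unsigned char *mem = mmap(NULL, total, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        return -1;
    }

    /* Touch every page so the host really has memory committed. */
    memset(mem, 0xAB, total);

    /* The guest released only the second block: hand it back to the host. */
    if (release_block(mem, block, block) != 0) {
        munmap(mem, total);
        return -1;
    }

    int ok = 1;
    for (size_t i = 0; i < block; i++) {
        if (mem[i] != 0xAB) {          /* still-plugged block is untouched */
            ok = 0;
        }
        if (mem[block + i] != 0) {     /* released block is zero-filled */
            ok = 0;
        }
    }

    munmap(mem, total);
    return ok;
}
```

Calling demo_partial_release() from a main() should return 1 on Linux. The mapping stays in place after the madvise() call, so the address range of the partially unplugged DIMM remains valid while the host gets the released pages back.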