On Thu, May 16, 2013 at 2:50 PM, Tang Chen <[email protected]> wrote:
> The following patch-set allocated pagetables to local node.
> https://lkml.org/lkml/2013/4/11/829
>
> Doing this will break memory hot-remove.
>
> Before removing memory, the kernel offlines memory. If offlining
> memory fails, the memory cannot be removed. The pagetables are
> used by the kernel, so they cannot be offlined. Furthermore, they
> cannot be removed.
>
> Of course, we can free pagetable pages because the pagetables of
> the to be removed memory are useless. But offlining memory doesn't
> mean removing memory. If users only want to offline memory, the
> pagetables should not be freed.
>
> The minimum unit of memory online/offline is block. And by default,
> one block contains one section, which by default is 128MB. There is
> possiblity that half of the block is pagetable, and the other half
> is movable memory.
>
> When we offline this kind of block, the status of the block is
> uncertain. We cannot simply free the pagetables in this block because
> they may be used by other online blocks. But when doing memory
> hot-remove, the failure of offlining blocks will break the memory
> hot-remove logic.
>
>
> In order to fix it, we have three solutions:
>
> 1. Reserve the whole block (128MB), making no user can use the rest
> parts of the block. And skip them when offlining memory.
> When all the other blocks are offlined, free the pagetable, and remove
> all the memory.
>
> But we may lose some memory for this purpose. 128MB is a little big
> to waste.
>
>
> 2. Keep this block online. Although the offline operation fails, it is
> OK to remove memory.
>
> But the offline operation will always fail. And generally speaking,
> there are a lot of reasons of offline failing, it is difficult to
> detect if it is OK to remove memory. So we don't suggest this way.
>
>
> 3. Migrate user pages and make this block offline. Offlining memory won't
> stop the kernel using the pagetables stored in them, so it will be OK.
>
> But this will change the semantics of "offline". I'm not sure if we
> can do it in this way.
>
>
> So before we fix this problem, I think we should not allocate pagetables
> to local node when CONFIG_MEMORY_HOTREMOVE is enabled. And recover it when
> we confirm the direction and fix the problem.
>
> This patch is based on
> git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git
> for-x86-mm
>
> Any other solution for this problem is welcome.
>
>
> Signed-off-by: Tang Chen <[email protected]>
Ugh. Special-casing for CONFIG_MEMORY_HOTPLUG is just begging for trouble.

Were you able to determine which commit broke memory hot-remove?

