Re: [PATCH part2 v2 0/8] Arrange hotpluggable memory as ZONE_MOVABLE

2013-10-16 Thread Ingo Molnar
* H. Peter Anvin wrote: > On 10/14/2013 11:50 PM, Ingo Molnar wrote: > > > > So if anyone can implement it using huge pages, with a really fast > > __va() and __pa() implementation, then it might be possible. But > > that's a pretty major surgery on x86. > > Well, we already *have* a way to

Re: [PATCH part2 v2 0/8] Arrange hotpluggable memory as ZONE_MOVABLE

2013-10-15 Thread H. Peter Anvin
On 10/14/2013 11:50 PM, Ingo Molnar wrote: > > So if anyone can implement it using huge pages, with a really fast __va() > and __pa() implementation, then it might be possible. But that's a pretty > major surgery on x86. > Well, we already *have* a way to deal with that for Xen (by inserting a

Re: [PATCH part2 v2 0/8] Arrange hotpluggable memory as ZONE_MOVABLE

2013-10-15 Thread Tejun Heo
Hello, Yinghai. On Mon, Oct 14, 2013 at 07:25:55PM -0700, Yinghai Lu wrote: > > Wouldn't that amount be fairly static and restricted? If you wanna > > chunk memory init anyway, there's no reason to init more than > > necessary until smp stage is reached. The more you do early, the more > > seria

Re: [PATCH part2 v2 0/8] Arrange hotpluggable memory as ZONE_MOVABLE

2013-10-14 Thread Ingo Molnar
* H. Peter Anvin wrote: > On 10/14/2013 01:37 PM, Yinghai Lu wrote: > >> > >> Optimizing NUMA boot just requires moving the heavy lifting to > >> appropriate NUMA nodes. It doesn't require that early boot phase > >> should strictly follow NUMA node boundaries. > > > > At end of day, I like to

Re: [PATCH part2 v2 0/8] Arrange hotpluggable memory as ZONE_MOVABLE

2013-10-14 Thread Yinghai Lu
On Mon, Oct 14, 2013 at 1:55 PM, Tejun Heo wrote: > Hello, > > On Mon, Oct 14, 2013 at 01:37:20PM -0700, Yinghai Lu wrote: >> The problem is how to define "amount necessary". If we can parse srat early, >> then we could just map RAM for all boot nodes one time, instead of try some >> small and the

Re: [PATCH part2 v2 0/8] Arrange hotpluggable memory as ZONE_MOVABLE

2013-10-14 Thread Zhang Yanfei
Hello tejun, peter and yinghai On 10/15/2013 04:55 AM, Tejun Heo wrote: > Hello, > > On Mon, Oct 14, 2013 at 01:37:20PM -0700, Yinghai Lu wrote: >> The problem is how to define "amount necessary". If we can parse srat early, >> then we could just map RAM for all boot nodes one time, instead of tr

Re: [PATCH part2 v2 0/8] Arrange hotpluggable memory as ZONE_MOVABLE

2013-10-14 Thread Tejun Heo
Hello, On Mon, Oct 14, 2013 at 01:37:20PM -0700, Yinghai Lu wrote: > The problem is how to define "amount necessary". If we can parse srat early, > then we could just map RAM for all boot nodes one time, instead of try some > small and then after SRAT table, expand it cover non-boot nodes. Wouldn

Re: [PATCH part2 v2 0/8] Arrange hotpluggable memory as ZONE_MOVABLE

2013-10-14 Thread H. Peter Anvin
On 10/14/2013 01:42 PM, Yinghai Lu wrote: >> >> However, I don't understand how we can avoid #2, given that it is >> fundamentally a sysadmin-driven tradeoff between performance and >> reliability. > > If we make all numa systems support nodes hot-remove logically. > like we boot system with node0

Re: [PATCH part2 v2 0/8] Arrange hotpluggable memory as ZONE_MOVABLE

2013-10-14 Thread Yinghai Lu
On Mon, Oct 14, 2013 at 1:35 PM, H. Peter Anvin wrote: ... >> 2. also we should avoid adding "movable_nodes" command line. ... >> 6. in the long run, We should rework our NUMA booting: >> a. boot system with boot numa nodes early only. >> b. in later init stage or user space, init other nodes >> R

Re: [PATCH part2 v2 0/8] Arrange hotpluggable memory as ZONE_MOVABLE

2013-10-14 Thread H. Peter Anvin
On 10/14/2013 01:37 PM, Yinghai Lu wrote: >> >> Optimizing NUMA boot just requires moving the heavy lifting to >> appropriate NUMA nodes. It doesn't require that early boot phase >> should strictly follow NUMA node boundaries. > > At end of day, I like to see all numa system (ram/cpu/pci) could h

Re: [PATCH part2 v2 0/8] Arrange hotpluggable memory as ZONE_MOVABLE

2013-10-14 Thread H. Peter Anvin
On 10/14/2013 12:34 PM, Yinghai Lu wrote: > > The points for parsing SRAT early instead of Yanfei/Tang v7: > 1. We just reached one unified path to setup page tables for 32bit, > 64bit and xen or non xen after several years. We should not have add > another path for system > that support hotplug.

Re: [PATCH part2 v2 0/8] Arrange hotpluggable memory as ZONE_MOVABLE

2013-10-14 Thread Yinghai Lu
On Mon, Oct 14, 2013 at 1:04 PM, Tejun Heo wrote: >> 6. in the long run, We should rework our NUMA booting: >> a. boot system with boot numa nodes early only. >> b. in later init stage or user space, init other nodes >> RAM/CPU/PCI...in parallel. >> that will reduce boot time for 8 sockets/32 sock

Re: [PATCH part2 v2 0/8] Arrange hotpluggable memory as ZONE_MOVABLE

2013-10-14 Thread Tejun Heo
Hello, Yinghai. On Mon, Oct 14, 2013 at 12:34:49PM -0700, Yinghai Lu wrote: > The points for parsing SRAT early instead of Yanfei/Tang v7: > > 1. We just reached one unified path to setup page tables for 32bit, > 64bit and xen or non xen after several years. We should not have add > another path f

Re: [PATCH part2 v2 0/8] Arrange hotpluggable memory as ZONE_MOVABLE

2013-10-14 Thread Yinghai Lu
On Mon, Oct 14, 2013 at 8:34 AM, Zhang Yanfei wrote: > Hello tejun, > > On 10/14/2013 11:19 PM, Tejun Heo wrote: >> Hey, >> >> On Mon, Oct 14, 2013 at 11:06:14PM +0800, Zhang Yanfei wrote: >>> a little difference here, consider a 16-GB node. If we parse SRAT earlier, >>> and still use the top-down

Re: [PATCH part2 v2 0/8] Arrange hotpluggable memory as ZONE_MOVABLE

2013-10-14 Thread Zhang Yanfei
Hello tejun, On 10/14/2013 11:19 PM, Tejun Heo wrote: > Hey, > > On Mon, Oct 14, 2013 at 11:06:14PM +0800, Zhang Yanfei wrote: >> a little difference here, consider a 16-GB node. If we parse SRAT earlier, >> and still use the top-down allocation, and kernel image is loaded at 16MB, >> we reserve

[PATCH part2 v2 0/8] Arrange hotpluggable memory as ZONE_MOVABLE

2013-10-11 Thread Zhang Yanfei
Hello guys, this is part2 of our memory hotplug work. This part is based on part1: "x86, memblock: Allocate memory near kernel image before SRAT parsed", which is based on 3.12-rc4. You can find part1 at: https://lkml.org/lkml/2013/10/10/644 Any comments are welcome! Thanks! [Prob