Re: [PATCH 14/14] x86, mm: Put pagetable on local node ram

2013-03-08 Thread Yinghai Lu
On Fri, Mar 8, 2013 at 12:20 AM, Tang Chen  wrote:
> Hi Yinghai,
>
> On 03/08/2013 12:58 PM, Yinghai Lu wrote:
> ..
>
>> /* xen has big range in reserved near end of ram, skip it at
>> first.*/
>> -   addr = memblock_find_in_range(ISA_END_ADDRESS, end, PMD_SIZE,
>> PMD_SIZE);
>> +   addr = memblock_find_in_range(begin, end, PMD_SIZE, PMD_SIZE);
>
>
> Found that the latest code here is:
>
> addr = memblock_find_in_range(ISA_END_ADDRESS, end, PMD_SIZE,
>                               PAGE_SIZE);
>  
>
> The "align" is PAGE_SIZE, not PMD_SIZE. Not sure if it is a problem. :)
>

Yes, it is PMD_SIZE.

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=98e7a989979b185f49e86ddaed2ad6890299d9f0

Thanks

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 14/14] x86, mm: Put pagetable on local node ram

2013-03-08 Thread Tang Chen

Hi Yinghai,

On 03/08/2013 12:58 PM, Yinghai Lu wrote:
..

/* xen has big range in reserved near end of ram, skip it at first.*/
-   addr = memblock_find_in_range(ISA_END_ADDRESS, end, PMD_SIZE, PMD_SIZE);
+   addr = memblock_find_in_range(begin, end, PMD_SIZE, PMD_SIZE);


Found that the latest code here is:

addr = memblock_find_in_range(ISA_END_ADDRESS, end, PMD_SIZE,
                              PAGE_SIZE);
 

The "align" is PAGE_SIZE, not PMD_SIZE. Not sure if it is a problem. :)
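
To make the difference concrete, here is a minimal sketch of what the "align" argument changes. It assumes x86-64 with 4 KiB pages and 2 MiB PMD mappings. All `sketch_` names are hypothetical, and `sketch_find_lowest()` is only a stand-in that picks the lowest aligned address in a range; it does not model memblock's free-region search or its search direction.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants (assumed x86-64 values), not taken from the patch. */
#define SKETCH_PAGE_SIZE (1ULL << 12)   /* 4 KiB */
#define SKETCH_PMD_SIZE  (1ULL << 21)   /* 2 MiB */

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t sketch_align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Hypothetical stand-in for a range search: return the lowest address in
 * [start, end) that is aligned to 'align' and leaves room for 'size' bytes,
 * or 0 if there is none. It ignores reserved regions entirely.
 */
static uint64_t sketch_find_lowest(uint64_t start, uint64_t end,
                                   uint64_t size, uint64_t align)
{
    uint64_t addr = sketch_align_up(start, align);

    /* addr < start catches wrap-around in the rounding above. */
    if (addr < start || addr > end || end - addr < size)
        return 0;
    return addr;
}
```

With a start of 0x100000 (1 MiB) and a PMD_SIZE request, PAGE_SIZE alignment can hand back 0x100000 itself, while PMD_SIZE alignment moves the result up to 0x200000, so only the latter guarantees a 2 MiB-aligned buffer.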

Thanks. :)




Re: [PATCH 14/14] x86, mm: Put pagetable on local node ram

2013-03-07 Thread Yinghai Lu
On Thu, Mar 7, 2013 at 11:01 PM, Tejun Heo  wrote:
> On Thu, Mar 07, 2013 at 08:58:40PM -0800, Yinghai Lu wrote:
>> If node with ram is hotpluggable, local node mem for page table and vmemmap
>> should be on that node ram.
>>
>> This patch is some kind of refreshment of
>> | commit 1411e0ec3123ae4c4ead6bfc9fe3ee5a3ae5c327
>> | Date:   Mon Dec 27 16:48:17 2010 -0800
>> |
>> |x86-64, numa: Put pgtable to local node memory
>> That was reverted before.
>>
>> We have reason to reintroduce it to make memory hotplug work.
>>
>> Split calling of init_mem_mapping into early_initmem_info
>> for nodes after we get numa info there.
>>
>> First node will be low range.
>> Need to rework alloc_low_pages to alloc page table in following order:
>>   BRK, local node, low range
>>
>> Still only load_cr3 one time, otherwise we would break xen 64bit again.
>
> Hmmm... can you please split this patch further?  init_mem_mapping()
> change can be separated, no?

Will try to split it out.

> Also, comments are disturbingly missing.
> How are other people reading the code supposed to know what it's
> trying to achieve why and how?  Hmmm... we're also likely to end up
> with smaller mapping for misaligned NUMA configurations (I think my
> test machine is like that).  Is it guaranteed that the top level ends
> up in the first node?  It really needs documentation.

Yes. To really make memory hotplug work, we will need to trim the node
alignment to 1G in memblock and numa_meminfo.

We also need to put the pgd page in the low range (first node) if a 512G
block crosses a node boundary.
For example: if node2 is [256g, 1024g), the pgd for 256g-512g must stay on
node0, and the one for 512g-1024g could stay on node2.
Or just put all PGD pages in the low range (first node).
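
The arithmetic behind that example can be sketched as follows. This is only an illustration, assuming 4-level x86-64 paging where one top-level entry spans 512 GiB; the `sketch_` names are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_GIB        (1ULL << 30)
/* Assumed span of one top-level (PGD) entry with 4-level paging: 512 GiB. */
#define SKETCH_PGDIR_SIZE (1ULL << 39)

/* Index of the 512 GiB block that contains a physical address. */
static uint64_t sketch_pgd_index(uint64_t addr)
{
    return addr / SKETCH_PGDIR_SIZE;
}

/*
 * Return 1 if 512 GiB block 'idx' lies entirely inside the node range
 * [node_start, node_end), so its page-table page could live on that node.
 * Return 0 if the block is shared with memory outside the node.
 */
static int sketch_block_inside_node(uint64_t idx,
                                    uint64_t node_start, uint64_t node_end)
{
    uint64_t block_start = idx * SKETCH_PGDIR_SIZE;
    uint64_t block_end = block_start + SKETCH_PGDIR_SIZE;

    assert(node_start < node_end);
    return block_start >= node_start && block_end <= node_end;
}
```

For node2 = [256g, 1024g), block 0 ([0, 512g)) is not fully inside the node, so its page stays in the low range, while block 1 ([512g, 1024g)) is fully inside and could be placed on node2.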

Thanks

Yinghai


Re: [PATCH 14/14] x86, mm: Put pagetable on local node ram

2013-03-07 Thread Tejun Heo
On Thu, Mar 07, 2013 at 08:58:40PM -0800, Yinghai Lu wrote:
> If node with ram is hotpluggable, local node mem for page table and vmemmap
> should be on that node ram.
> 
> This patch is some kind of refreshment of
> | commit 1411e0ec3123ae4c4ead6bfc9fe3ee5a3ae5c327
> | Date:   Mon Dec 27 16:48:17 2010 -0800
> |
> |x86-64, numa: Put pgtable to local node memory
> That was reverted before.
> 
> We have reason to reintroduce it to make memory hotplug work.
> 
> Split calling of init_mem_mapping into early_initmem_info
> for nodes after we get numa info there.
> 
> First node will be low range.
> Need to rework alloc_low_pages to alloc page table in following order:
>   BRK, local node, low range
> 
> Still only load_cr3 one time, otherwise we would break xen 64bit again.

Hmmm... can you please split this patch further?  init_mem_mapping()
change can be separated, no?  Also, comments are disturbingly missing.
How are other people reading the code supposed to know what it's
trying to achieve why and how?  Hmmm... we're also likely to end up
with smaller mapping for misaligned NUMA configurations (I think my
test machine is like that).  Is it guaranteed that the top level ends
up in the first node?  It really needs documentation.

Thanks.

-- 
tejun

