On 06/14/2017 03:12 PM, Mike Kravetz wrote:
> On 06/13/2017 02:00 AM, Michal Hocko wrote:
>> From: Michal Hocko <[email protected]>
>>
>> alloc_huge_page_nodemask tries to allocate from any numa node in the
>> allowed node mask starting from lower numa nodes. This might lead to
>> filling up those low NUMA nodes while others are not used. We can reduce
>> this risk by introducing a concept of the preferred node similar to what
>> we have in the regular page allocator. We will start allocating from the
>> preferred nid and then iterate over all allowed nodes in the zonelist
>> order until we try them all.
>>
>> This is mimicking the page allocator logic except it operates on
>> per-node mempools. dequeue_huge_page_vma already does this so distill
>> the zonelist logic into a more generic dequeue_huge_page_nodemask
>> and use it in alloc_huge_page_nodemask.
>>
>> Signed-off-by: Michal Hocko <[email protected]>
>> ---
> 
> 
> I built attempts/hugetlb-zonelists, threw it on a test machine, ran the
> libhugetlbfs test suite and saw failures.  The failures started with this
> patch: commit 7e8b09f14495 in your tree.  I have not yet started to look
> into the failures.  It is even possible that the tests are making bad
> assumptions, but there certainly appear to be changes in behavior visible
> to the application(s).

nm.  The failures were the result of dequeue_huge_page_nodemask() always
returning NULL.  Vlastimil already noticed this issue and provided a
solution.
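
For readers following along, here is a rough userspace sketch of the
preferred-node fallback the quoted changelog describes: try the preferred
nid first, then walk the remaining allowed nodes until all have been tried.
It is not the kernel code. The names struct pool, MAX_NODES and
dequeue_from_nodemask() are made up for illustration, and the wrap-around
walk is an assumed simplification of real zonelist order.

```c
#include <assert.h>

#define MAX_NODES 4	/* hypothetical node count for this sketch */

/* Hypothetical stand-in for the per-node free huge page pools. */
struct pool {
	int free_pages[MAX_NODES];
};

/*
 * Take one page, starting at the preferred node and then trying every
 * other allowed node. Bit n set in allowed_mask means node n may be used.
 * Returns the node the page came from, or -1 if every allowed pool is
 * empty.
 */
static int dequeue_from_nodemask(struct pool *p, int preferred,
				 unsigned int allowed_mask)
{
	assert(preferred >= 0 && preferred < MAX_NODES);

	for (int i = 0; i < MAX_NODES; i++) {
		/* Assumed order: preferred, then ascending with wrap-around. */
		int nid = (preferred + i) % MAX_NODES;

		if (!(allowed_mask & (1u << nid)))
			continue;	/* node not in the allowed mask */

		if (p->free_pages[nid] > 0) {
			p->free_pages[nid]--;
			return nid;
		}
	}
	return -1;	/* nothing free on any allowed node */
}
```

With pools of {0, 2, 0, 1} free pages and all four nodes allowed, a request
preferring node 1 is served from node 1, while a request preferring node 2
falls through to node 3. A helper that returned -1 on every call would
behave like the always-NULL case described above.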

-- 
Mike Kravetz

> 
> FYI - My 'test machine' is an x86 KVM instance with 8GB memory simulating
> 2 nodes.  Huge page allocations before running tests:
> node0
> 512   free_hugepages
> 512   nr_hugepages
> 0     surplus_hugepages
> node1
> 512   free_hugepages
> 512   nr_hugepages
> 0     surplus_hugepages
> 
> I can take a closer look at the failures tomorrow.
> 
