On 20/12/2016 5:18 PM, Michal Hocko wrote:
On Mon 12-12-16 13:59:07, Jia He wrote:
In commit b9f00e147f27 ("mm, page_alloc: reduce branches in
zone_statistics"), the code was restructured to reduce the branch miss rate.
Compared with the original logic, it assumed that if !(flags & __GFP_OTHER_NODE)
then z->node would not be equal to preferred_zone->node. That seems to be
incorrect.
I am sorry but I have a hard time following the changelog. It is clear
that you are trying to fix a missed NUMA_{HIT,OTHER} accounting,
but it is not really clear when such a thing happens. You are adding a
preferred_zone->node check. preferred_zone is the first zone in the
requested zonelist, so for most allocations it is a zone on the
local node. But if something requests an explicit numa node (without
__GFP_OTHER_NODE, which would be the majority I suspect) then we could
indeed end up accounting that as NUMA_MISS, NUMA_FOREIGN, so the
referenced patch indeed caused an unintended change of accounting AFAIU.
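
To make that scenario concrete, here is a small user-space model of the
accounting as it looks after b9f00e147f27 (the node numbers and the
account() helper are made up for illustration only; this is not kernel
code):

#include <stdbool.h>
#include <stdio.h>

/* Model of zone_statistics() after b9f00e147f27: the zones are reduced
 * to their node ids and __inc_zone_state() to a printf. */
static void account(int running_nid, int preferred_nid, int alloc_nid,
                    bool gfp_other_node)
{
        int local_nid = running_nid;            /* numa_node_id() */
        const char *local_stat = "NUMA_LOCAL";

        if (gfp_other_node) {                   /* flags & __GFP_OTHER_NODE */
                local_stat = "NUMA_OTHER";
                local_nid = preferred_nid;
        }

        if (alloc_nid == local_nid)
                printf("NUMA_HIT + %s\n", local_stat);
        else
                printf("NUMA_MISS + NUMA_FOREIGN\n");
}

int main(void)
{
        /* A task on node 0 explicitly asks for node 1 and the allocation
         * succeeds on node 1. Without __GFP_OTHER_NODE this is counted as
         * a miss although the request was satisfied exactly where asked: */
        account(0, 1, 1, false);        /* prints: NUMA_MISS + NUMA_FOREIGN */

        /* The same request with __GFP_OTHER_NODE is counted as a hit: */
        account(0, 1, 1, true);         /* prints: NUMA_HIT + NUMA_OTHER */
        return 0;
}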

If this is correct then it should be a part of the changelog. I also
cannot say I would like the fix. First of all I am not sure
__GFP_OTHER_NODE is a good idea at all. How is an explicit usage of the
flag any different from an explicit __alloc_pages_node(non_local_nid)?
In both cases we ask for an allocation on a remote node and successful
allocation is a NUMA_HIT and NUMA_OTHER.
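
(For illustration, the two call shapes being compared look like this; nid,
order and the GFP_KERNEL base are placeholders rather than a particular
call site:

        page = __alloc_pages_node(nid, GFP_KERNEL, order);
        page = __alloc_pages_node(nid, GFP_KERNEL | __GFP_OTHER_NODE, order);

Both explicitly ask for memory on nid; as far as I can see the flag only
changes how the result is accounted.)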

That being said, why can't we simply do the following? As a bonus, we
can get rid of the barely used __GFP_OTHER_NODE. Also, the number of
branches will stay the same.
Yes, I agree that maybe we can get rid of __GFP_OTHER_NODE if there are no
objections. It seems it is currently only used for hugepages and statistics.
---
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 429855be6ec9..f035d5c8b864 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2583,25 +2583,17 @@ int __isolate_free_page(struct page *page, unsigned int order)
  * Update NUMA hit/miss statistics
  *
  * Must be called with interrupts disabled.
- *
- * When __GFP_OTHER_NODE is set assume the node of the preferred
- * zone is the local node. This is useful for daemons who allocate
- * memory on behalf of other processes.
  */
 static inline void zone_statistics(struct zone *preferred_zone, struct zone *z,
                                                                gfp_t flags)
 {
 #ifdef CONFIG_NUMA
-       int local_nid = numa_node_id();
-       enum zone_stat_item local_stat = NUMA_LOCAL;
-
-       if (unlikely(flags & __GFP_OTHER_NODE)) {
-               local_stat = NUMA_OTHER;
-               local_nid = preferred_zone->node;
-       }
+       if (z->node == preferred_zone->node) {
+               enum zone_stat_item local_stat = NUMA_LOCAL;
 
-       if (z->node == local_nid) {
                __inc_zone_state(z, NUMA_HIT);
+               if (z->node != numa_node_id())
+                       local_stat = NUMA_OTHER;
                __inc_zone_state(z, local_stat);
        } else {
                __inc_zone_state(z, NUMA_MISS);
I thought the logic here was different. Here is zone_statistics() before
__GFP_OTHER_NODE was introduced:

        if (z->zone_pgdat == preferred_zone->zone_pgdat) {
                __inc_zone_state(z, NUMA_HIT);
        } else {
                __inc_zone_state(z, NUMA_MISS);
                __inc_zone_state(preferred_zone, NUMA_FOREIGN);
        }
        if (z->node == numa_node_id())
                __inc_zone_state(z, NUMA_LOCAL);
        else
                __inc_zone_state(z, NUMA_OTHER);
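
If I am reading both versions right, a concrete case where they differ
(node numbers made up for illustration) is a task running on node 0 that
prefers node 0 but whose allocation falls back to a zone on node 1:

        old code above:  NUMA_MISS + NUMA_FOREIGN, plus NUMA_OTHER because
                         z->node != numa_node_id() is checked unconditionally
        your proposal:   NUMA_MISS + NUMA_FOREIGN only, since
                         NUMA_LOCAL/NUMA_OTHER is bumped only in the hit branch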

B.R.
Jia
