The number of dirtyable pages should not include the full number of
free pages: some of them are reserved pages that the page allocator
and kswapd always try to keep free.

The closer (reclaimable pages - dirty pages) is to the number of
reserved pages, the more likely it becomes for reclaim to run into
dirty pages:

       +----------+ ---
       |   anon   |  |
       +----------+  |
       |          |  |
       |          |  -- dirty limit new    -- flusher new
       |   file   |  |                     |
       |          |  |                     |
       |          |  -- dirty limit old    -- flusher old
       |          |                        |
       +----------+                       --- reclaim
       | reserved |
       +----------+
       |  kernel  |
       +----------+

Not treating reserved pages as dirtyable on a global level is only a
conceptual fix.  In reality, dirty pages are not distributed equally
across zones and reclaim runs into dirty pages on a regular basis.

But it is important to get this right before tackling the problem on a
per-zone level, where the distance between reclaim and the dirty pages
is usually much smaller in absolute numbers.

Signed-off-by: Johannes Weiner <jwei...@redhat.com>
---
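Note for reviewers: the accounting change can be illustrated with a
small userspace model.  This is only a sketch under stated
assumptions; struct zone_model and both helper functions are
hypothetical names made up for illustration, not kernel code.  The
fields stand in for NR_FREE_PAGES, zone_reclaimable_pages() and the
new per-zone totalreserve_pages.

```c
#include <stddef.h>

/* Hypothetical userspace stand-in for the per-zone counters. */
struct zone_model {
	unsigned long free_pages;        /* models NR_FREE_PAGES */
	unsigned long reclaimable_pages; /* models zone_reclaimable_pages() */
	unsigned long reserve_pages;     /* models zone->totalreserve_pages */
};

/* Old accounting: every free page counts as dirtyable. */
static unsigned long dirtyable_old(const struct zone_model *zones, size_t n)
{
	unsigned long x = 0;

	for (size_t i = 0; i < n; i++)
		x += zones[i].free_pages + zones[i].reclaimable_pages;
	return x;
}

/*
 * New accounting: the reserve is subtracted from the free pages.
 * The intermediate difference may wrap when free < reserve; unsigned
 * arithmetic is modular, so adding the reclaimable pages brings the
 * sum back as long as free + reclaimable >= reserve overall.
 */
static unsigned long dirtyable_new(const struct zone_model *zones, size_t n)
{
	unsigned long x = 0;

	for (size_t i = 0; i < n; i++) {
		x += zones[i].free_pages - zones[i].reserve_pages;
		x += zones[i].reclaimable_pages;
	}
	return x;
}
```

With two model zones of (free 1000, reclaimable 4000, reserve 200)
and (free 500, reclaimable 1500, reserve 100), the old accounting
yields 7000 dirtyable pages and the new one 6700, so a limit derived
as a percentage of dirtyable memory moves away from the reserve.
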
 include/linux/mmzone.h |    1 +
 mm/page-writeback.c    |    8 +++++---
 mm/page_alloc.c        |    1 +
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 1ed4116..e28f8e0 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -316,6 +316,7 @@ struct zone {
         * sysctl_lowmem_reserve_ratio sysctl changes.
         */
        unsigned long           lowmem_reserve[MAX_NR_ZONES];
+       unsigned long           totalreserve_pages;
 
 #ifdef CONFIG_NUMA
        int node;
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index da6d263..9f896db 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -169,8 +169,9 @@ static unsigned long highmem_dirtyable_memory(unsigned long total)
                struct zone *z =
                        &NODE_DATA(node)->node_zones[ZONE_HIGHMEM];
 
-               x += zone_page_state(z, NR_FREE_PAGES) +
-                    zone_reclaimable_pages(z);
+               x += zone_page_state(z, NR_FREE_PAGES) -
+                       z->totalreserve_pages;
+               x += zone_reclaimable_pages(z);
        }
        /*
         * Make sure that the number of highmem pages is never larger
@@ -194,7 +195,8 @@ static unsigned long determine_dirtyable_memory(void)
 {
        unsigned long x;
 
-       x = global_page_state(NR_FREE_PAGES) + global_reclaimable_pages();
+       x = global_page_state(NR_FREE_PAGES) - totalreserve_pages;
+       x += global_reclaimable_pages();
 
        if (!vm_highmem_is_dirtyable)
                x -= highmem_dirtyable_memory(x);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1dba05e..7e8e2ee 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5075,6 +5075,7 @@ static void calculate_totalreserve_pages(void)
 
                        if (max > zone->present_pages)
                                max = zone->present_pages;
+                       zone->totalreserve_pages = max;
                        reserve_pages += max;
                }
        }
-- 
1.7.6
