3.16.36-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Hugh Dickins <hu...@google.com>

commit 42cb14b110a5698ccf26ce59c4441722605a3743 upstream.

clear_page_dirty_for_io() has accumulated writeback and memcg subtleties
since v2.6.16 first introduced page migration; and the set_page_dirty()
which completed its migration of PageDirty later had to be moderated to
__set_page_dirty_nobuffers(); then PageSwapBacked had to skip that too.

No actual problems have been seen with this procedure recently, but if
you look into what the clear_page_dirty_for_io(page)+set_page_dirty(newpage)
sequence is actually achieving, it turns out to be nothing more than moving
the PageDirty flag, and its NR_FILE_DIRTY stat, from one zone to another.
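
To see that equivalence concretely, here is a side-by-side sketch,
condensed from the code this patch removes and adds below (illustrative
only, not a literal quote of either):

	/* old path: full writeback machinery, per migrated page */
	if (PageDirty(page)) {
		clear_page_dirty_for_io(page);	/* clears flag, drops stats */
		if (PageSwapBacked(page))
			SetPageDirty(newpage);
		else
			__set_page_dirty_nobuffers(newpage);	/* re-takes tree_lock */
	}

	/* net effect: just move the flag (and the per-zone dirty count) */
	if (PageDirty(page)) {
		ClearPageDirty(page);
		SetPageDirty(newpage);
	}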

It would be good to avoid a pile of irrelevant decrements and increments,
improper event counting, and an unnecessary descent of the radix_tree
under tree_lock (to set the PAGECACHE_TAG_DIRTY which
radix_tree_replace_slot() left in place anyway).

Do the NR_FILE_DIRTY movement, like the other stats movements, while
interrupts are still disabled in migrate_page_move_mapping(); and don't
even bother if the zone is the same.  Do the PageDirty movement there
under tree_lock too, where the old page's refcount is frozen and newpage
is not yet visible: bearing in mind that as soon as newpage becomes
visible in the radix_tree, a set_page_dirty() called without the page
lock might interfere (or perhaps that's just not possible: anything doing
so should already hold an additional reference to the old page,
preventing its migration; but play safe).
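
For reference, once the hunks below are applied, the flow in
migrate_page_move_mapping() reads roughly as follows (a condensed sketch
of the patched function, with the unchanged parts elided):

	oldzone = page_zone(page);
	newzone = page_zone(newpage);

	spin_lock_irq(&mapping->tree_lock);
	/* ... expected_count check, page_freeze_refs() ... */

	/* Move dirty while page refs frozen and newpage not yet exposed */
	dirty = PageDirty(page);
	if (dirty) {
		ClearPageDirty(page);
		SetPageDirty(newpage);
	}

	radix_tree_replace_slot(pslot, newpage);
	page_unfreeze_refs(page, expected_count - 1);

	spin_unlock(&mapping->tree_lock);
	/* Leave irq disabled to prevent preemption while updating stats */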

But we do still need to transfer PageDirty in migrate_page_copy(), for
those migration paths that don't go through migrate_page_move_mapping().

Signed-off-by: Hugh Dickins <hu...@google.com>
Cc: Christoph Lameter <c...@linux.com>
Cc: "Kirill A. Shutemov" <kirill.shute...@linux.intel.com>
Cc: Rik van Riel <r...@redhat.com>
Cc: Vlastimil Babka <vba...@suse.cz>
Cc: Davidlohr Bueso <d...@stgolabs.net>
Cc: Oleg Nesterov <o...@redhat.com>
Cc: Sasha Levin <sasha.le...@oracle.com>
Cc: Dmitry Vyukov <dvyu...@google.com>
Cc: KOSAKI Motohiro <kosaki.motoh...@jp.fujitsu.com>
Signed-off-by: Andrew Morton <a...@linux-foundation.org>
Signed-off-by: Linus Torvalds <torva...@linux-foundation.org>
[bwh: Backported to 3.16: adjust context.  This is not just an optimisation,
 but turned out to fix a possible oops (CVE-2016-3070).]
Signed-off-by: Ben Hutchings <b...@decadent.org.uk>
---
 mm/migrate.c | 51 +++++++++++++++++++++++++++++++--------------------
 1 file changed, 31 insertions(+), 20 deletions(-)

--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -30,6 +30,7 @@
 #include <linux/mempolicy.h>
 #include <linux/vmalloc.h>
 #include <linux/security.h>
+#include <linux/backing-dev.h>
 #include <linux/memcontrol.h>
 #include <linux/syscalls.h>
 #include <linux/hugetlb.h>
@@ -342,6 +343,8 @@ int migrate_page_move_mapping(struct add
                struct buffer_head *head, enum migrate_mode mode,
                int extra_count)
 {
+       struct zone *oldzone, *newzone;
+       int dirty;
        int expected_count = 1 + extra_count;
        void **pslot;
 
@@ -352,6 +355,9 @@ int migrate_page_move_mapping(struct add
                return MIGRATEPAGE_SUCCESS;
        }
 
+       oldzone = page_zone(page);
+       newzone = page_zone(newpage);
+
        spin_lock_irq(&mapping->tree_lock);
 
        pslot = radix_tree_lookup_slot(&mapping->page_tree,
@@ -392,6 +398,13 @@ int migrate_page_move_mapping(struct add
                set_page_private(newpage, page_private(page));
        }
 
+       /* Move dirty while page refs frozen and newpage not yet exposed */
+       dirty = PageDirty(page);
+       if (dirty) {
+               ClearPageDirty(page);
+               SetPageDirty(newpage);
+       }
+
        radix_tree_replace_slot(pslot, newpage);
 
        /*
@@ -401,6 +414,9 @@ int migrate_page_move_mapping(struct add
         */
        page_unfreeze_refs(page, expected_count - 1);
 
+       spin_unlock(&mapping->tree_lock);
+       /* Leave irq disabled to prevent preemption while updating stats */
+
        /*
         * If moved to a different zone then also account
         * the page for that zone. Other VM counters will be
@@ -411,13 +427,19 @@ int migrate_page_move_mapping(struct add
         * via NR_FILE_PAGES and NR_ANON_PAGES if they
         * are mapped to swap space.
         */
-       __dec_zone_page_state(page, NR_FILE_PAGES);
-       __inc_zone_page_state(newpage, NR_FILE_PAGES);
-       if (!PageSwapCache(page) && PageSwapBacked(page)) {
-               __dec_zone_page_state(page, NR_SHMEM);
-               __inc_zone_page_state(newpage, NR_SHMEM);
+       if (newzone != oldzone) {
+               __dec_zone_state(oldzone, NR_FILE_PAGES);
+               __inc_zone_state(newzone, NR_FILE_PAGES);
+               if (PageSwapBacked(page) && !PageSwapCache(page)) {
+                       __dec_zone_state(oldzone, NR_SHMEM);
+                       __inc_zone_state(newzone, NR_SHMEM);
+               }
+               if (dirty && mapping_cap_account_dirty(mapping)) {
+                       __dec_zone_state(oldzone, NR_FILE_DIRTY);
+                       __inc_zone_state(newzone, NR_FILE_DIRTY);
+               }
        }
-       spin_unlock_irq(&mapping->tree_lock);
+       local_irq_enable();
 
        return MIGRATEPAGE_SUCCESS;
 }
@@ -541,20 +563,9 @@ void migrate_page_copy(struct page *newp
        if (PageMappedToDisk(page))
                SetPageMappedToDisk(newpage);
 
-       if (PageDirty(page)) {
-               clear_page_dirty_for_io(page);
-               /*
-                * Want to mark the page and the radix tree as dirty, and
-                * redo the accounting that clear_page_dirty_for_io undid,
-                * but we can't use set_page_dirty because that function
-                * is actually a signal that all of the page has become dirty.
-                * Whereas only part of our page may be dirty.
-                */
-               if (PageSwapBacked(page))
-                       SetPageDirty(newpage);
-               else
-                       __set_page_dirty_nobuffers(newpage);
-       }
+       /* Move dirty on pages not done by migrate_page_move_mapping() */
+       if (PageDirty(page))
+               SetPageDirty(newpage);
 
        /*
         * Copy NUMA information to the new page, to prevent over-eager
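
A note on the unlock sequence in the first hunk above:
__dec_zone_state() and __inc_zone_state() are the non-atomic per-zone
counter updates, which are only safe while preemption (here, interrupts)
is excluded.  That is why the patch splits the old spin_unlock_irq() into
spin_unlock() plus a later local_irq_enable(), keeping interrupts off
until the counters have been moved:

	spin_unlock(&mapping->tree_lock);
	/* Leave irq disabled to prevent preemption while updating stats */
	if (newzone != oldzone) {
		__dec_zone_state(oldzone, NR_FILE_PAGES);
		__inc_zone_state(newzone, NR_FILE_PAGES);
		...
	}
	local_irq_enable();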
