From: Andrey Ryabinin <[email protected]>

The LTP test mm.mtest01w fails. It tries to allocate and use 80% of
available RAM+swap, which is more than the RAM size, so swap is expected
to be used.

Instead, the test process is killed by the OOM killer:
get_scan_count() decides that the cgroup has enough inactive file pages to
reclaim from that list alone, and does not touch active anon.
(inactive_file_low == false because the test generates no pagecache at
all, and the inactive pagecache holds only a few entries (1-10) that we
can't reclaim; they are activated instead.)

Thus active lists are not reclaimed => active anon (plenty of it) is not
reclaimed (swapped out), reclaim is unsuccessful => OOM.
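
To make the failure mode concrete, here is a minimal standalone model of
the check that this patch replaces. It is only a sketch: the helper names
are hypothetical and do not exist in mm/vmscan.c, and the real
get_scan_count() takes more state into account. DEF_PRIORITY is 12 in the
kernel, and sc->priority counts down from it as reclaim gets more urgent.

```c
#include <assert.h>
#include <stdbool.h>

#define DEF_PRIORITY 12 /* initial value of sc->priority in the kernel */

/*
 * Hypothetical model of the old condition: true means get_scan_count()
 * would set scan_balance = SCAN_FILE and leave anon alone.
 */
static bool old_scan_file_only(bool inactive_file_low,
                               unsigned long inactive_file_pages,
                               int priority)
{
        return !inactive_file_low &&
               (inactive_file_pages >> priority) != 0;
}

/* Smallest inactive file list (in pages) that passes the old check. */
static unsigned long old_threshold_pages(int priority)
{
        return 1UL << priority;
}

/* Walk through the scenario from the description above. */
static void old_check_demo(void)
{
        /* 10 stray pages are ignored while reclaim is relaxed... */
        assert(!old_scan_file_only(false, 10, DEF_PRIORITY));
        /* ...but are "enough" once priority has dropped to 3. */
        assert(old_scan_file_only(false, 10, 3));
}
```

In this model the threshold shrinks from 4096 pages at DEF_PRIORITY to a
single page at priority 0, so the harder reclaim tries, the more likely it
is to restrict itself to a handful of unreclaimable file pages.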

How we fix this:
1) we honor the absolute size of the inactive file cache:
   scan_balance is set to SCAN_FILE only if the inactive file cache
   is at least 4 MB (1024 pages, assuming 4 KB pages).
   We also no longer lower the required size of the inactive file cache
   as sc->priority drops when deciding to scan the file cache only.

2) as a result, the file cache sets "has_inactive" only if
   the inactive file cache is at least 4 MB.

3) if we did not reclaim anything while processing inactive file,
   shrink active lists as well.
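
The three points above can be sketched as a standalone model. This is an
illustration under assumptions, not the kernel code: the function names and
the scan_file_only flag (standing in for scan_balance == SCAN_FILE) are
hypothetical, and the 4 MB figure assumes 4 KB pages.

```c
#include <assert.h>
#include <stdbool.h>

#define DEF_PRIORITY 12          /* initial value of sc->priority */
#define PAGE_SIZE_BYTES 4096UL   /* assumption: 4 KB pages */

/* Point 1: fixed threshold, independent of the current priority. */
static bool new_scan_file_only(bool inactive_file_low,
                               unsigned long inactive_file_pages)
{
        return !inactive_file_low &&
               (inactive_file_pages >> (DEF_PRIORITY - 2)) != 0;
}

/* Point 2: the file side counts only when we really scan file only. */
static bool new_has_inactive(bool inactive_file_low,
                             bool inactive_anon_low,
                             bool scan_file_only)
{
        return (!inactive_file_low && scan_file_only) ||
               (!scan_file_only && !inactive_anon_low);
}

/* Point 3: also retry with active lists if nothing was reclaimed. */
static bool retry_with_active(bool has_inactive,
                              unsigned long nr_reclaimed,
                              bool may_shrink_active)
{
        return (!has_inactive || nr_reclaimed == 0) && !may_shrink_active;
}

/* Threshold in bytes implied by the shift in point 1. */
static unsigned long new_threshold_bytes(void)
{
        return (1UL << (DEF_PRIORITY - 2)) * PAGE_SIZE_BYTES;
}
```

With these definitions, the 1-10 stray pages from the failing test never
select file-only scanning, and a pass that reclaims nothing from the
inactive lists is retried with active lists allowed.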

https://jira.sw.ru/browse/PSBM-92480

Signed-off-by: Andrey Ryabinin <[email protected]>
Acked-by: Konstantin Khorenko <[email protected]>
---
 mm/vmscan.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index a122e4cfa1a4..d84ba1a5c4f8 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2208,9 +2208,11 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
        /*
         * There is enough inactive page cache, do not reclaim
         * anything from the anonymous working set right now.
+        * We don't decrease the required level of inactive page cache
+        * with sc->priority decrease.
         */
        if (!inactive_file_low &&
-           lruvec_lru_size(lruvec, LRU_INACTIVE_FILE) >> sc->priority) {
+           lruvec_lru_size(lruvec, LRU_INACTIVE_FILE) >> (DEF_PRIORITY - 2)) {
                scan_balance = SCAN_FILE;
                goto out;
        }
@@ -2262,7 +2264,7 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
        fraction[1] = fp;
        denominator = ap + fp + 1;
 out:
-       sc->has_inactive = !inactive_file_low ||
+       sc->has_inactive = (!inactive_file_low && (scan_balance == SCAN_FILE)) ||
                ((scan_balance != SCAN_FILE) && !inactive_anon_low);
        *lru_pages = 0;
        for_each_evictable_lru(lru) {
@@ -2597,7 +2599,8 @@ static void shrink_zone(struct zone *zone, struct scan_control *sc,
                        }
                } while ((memcg = mem_cgroup_iter(root, memcg, &reclaim)));
 
-               if (!sc->has_inactive && !sc->may_shrink_active) {
+               if ((!sc->has_inactive || !sc->nr_reclaimed)
+                   && !sc->may_shrink_active) {
                        sc->may_shrink_active = 1;
                        retry = true;
                        continue;
-- 
2.15.1
