Re: [PATCH RFC] mm: don't miss the last page because of round-off error

2018-08-22 Thread Roman Gushchin
On Wed, Aug 22, 2018 at 09:01:19AM +0300, Konstantin Khlebnikov wrote:
> On Tue, Aug 21, 2018 at 8:15 PM, Johannes Weiner  wrote:
> > On Tue, Aug 21, 2018 at 08:11:44AM +0300, Konstantin Khlebnikov wrote:
> >> On Sat, Aug 18, 2018 at 4:22 AM, Matthew Wilcox  
> >> wrote:
> >> > On Fri, Aug 17, 2018 at 04:18:34PM -0700, Roman Gushchin wrote:
> >> >> - scan = div64_u64(scan * fraction[file],
> >> >> -  denominator);
> >> >> + if (scan > 1)
> >> >> + scan = div64_u64(scan * fraction[file],
> >> >> +  denominator);
> >> >
> >> > Wouldn't we be better off doing a div_round_up?  ie:
> >> >
> >> > scan = div64_u64(scan * fraction[file] + denominator - 1, 
> >> > denominator);
> >> >
> >> > although i'd rather hide that in a new macro in math64.h than opencode it
> >> > here.
> >>
> >> All numbers here should be up to nr_pages * 200 and fit into unsigned long.
> >> I see no reason for u64. If they overflow then u64 wouldn't help either.
> >
> > It is nr_pages * 200 * recent_scanned, where recent_scanned can be up
> > to four times what's on the LRUs. That can overflow a u32 even with
> > small amounts of memory.
> 
> Ah, this thing is inverted because it aims for a proportional reactivation
> rate rather than proportional pressure on reclaimable pages.
> That's not obvious. I suppose this should be explained in a comment above it.
> 
> Well, at least the denominator should fit into an unsigned long, so a full
> 64/64 division is redundant.

In any case it's not related to the original issue and should be
treated separately. I'd like to keep the patch simple to make
backporting to stable easy.

All refactorings can be done separately, if necessary.

Thanks!


Re: [PATCH RFC] mm: don't miss the last page because of round-off error

2018-08-21 Thread Konstantin Khlebnikov
On Tue, Aug 21, 2018 at 8:15 PM, Johannes Weiner  wrote:
> On Tue, Aug 21, 2018 at 08:11:44AM +0300, Konstantin Khlebnikov wrote:
>> On Sat, Aug 18, 2018 at 4:22 AM, Matthew Wilcox  wrote:
>> > On Fri, Aug 17, 2018 at 04:18:34PM -0700, Roman Gushchin wrote:
>> >> - scan = div64_u64(scan * fraction[file],
>> >> -  denominator);
>> >> + if (scan > 1)
>> >> + scan = div64_u64(scan * fraction[file],
>> >> +  denominator);
>> >
>> > Wouldn't we be better off doing a div_round_up?  ie:
>> >
>> > scan = div64_u64(scan * fraction[file] + denominator - 1, 
>> > denominator);
>> >
>> > although i'd rather hide that in a new macro in math64.h than opencode it
>> > here.
>>
>> All numbers here should be up to nr_pages * 200 and fit into unsigned long.
>> I see no reason for u64. If they overflow then u64 wouldn't help either.
>
> It is nr_pages * 200 * recent_scanned, where recent_scanned can be up
> to four times what's on the LRUs. That can overflow a u32 even with
> small amounts of memory.

Ah, this thing is inverted because it aims for a proportional reactivation
rate rather than proportional pressure on reclaimable pages.
That's not obvious. I suppose this should be explained in a comment above it.

Well, at least the denominator should fit into an unsigned long, so a full
64/64 division is redundant.


Re: [PATCH RFC] mm: don't miss the last page because of round-off error

2018-08-21 Thread Johannes Weiner
On Tue, Aug 21, 2018 at 08:11:44AM +0300, Konstantin Khlebnikov wrote:
> On Sat, Aug 18, 2018 at 4:22 AM, Matthew Wilcox  wrote:
> > On Fri, Aug 17, 2018 at 04:18:34PM -0700, Roman Gushchin wrote:
> >> - scan = div64_u64(scan * fraction[file],
> >> -  denominator);
> >> + if (scan > 1)
> >> + scan = div64_u64(scan * fraction[file],
> >> +  denominator);
> >
> > Wouldn't we be better off doing a div_round_up?  ie:
> >
> > scan = div64_u64(scan * fraction[file] + denominator - 1, 
> > denominator);
> >
> > although i'd rather hide that in a new macro in math64.h than opencode it
> > here.
> 
> All numbers here should be up to nr_pages * 200 and fit into unsigned long.
> I see no reason for u64. If they overflow then u64 wouldn't help either.

It is nr_pages * 200 * recent_scanned, where recent_scanned can be up
to four times what's on the LRUs. That can overflow a u32 even with
small amounts of memory.


Re: [PATCH RFC] mm: don't miss the last page because of round-off error

2018-08-21 Thread Matthew Wilcox
On Tue, Aug 21, 2018 at 08:11:44AM +0300, Konstantin Khlebnikov wrote:
> On Sat, Aug 18, 2018 at 4:22 AM, Matthew Wilcox  wrote:
> > On Fri, Aug 17, 2018 at 04:18:34PM -0700, Roman Gushchin wrote:
> >> - scan = div64_u64(scan * fraction[file],
> >> -  denominator);
> >> + if (scan > 1)
> >> + scan = div64_u64(scan * fraction[file],
> >> +  denominator);
> >
> > Wouldn't we be better off doing a div_round_up?  ie:
> >
> > scan = div64_u64(scan * fraction[file] + denominator - 1, 
> > denominator);
> >
> > although i'd rather hide that in a new macro in math64.h than opencode it
> > here.
> 
> All numbers here should be up to nr_pages * 200 and fit into unsigned long.
> I see no reason for u64. If they overflow then u64 wouldn't help either.

Shaohua added the div64 usage initially, adding him to the cc.

> There is a DIV_ROUND_UP macro in kernel.h.

Indeed there is.


Re: [PATCH RFC] mm: don't miss the last page because of round-off error

2018-08-20 Thread Konstantin Khlebnikov
On Sat, Aug 18, 2018 at 4:22 AM, Matthew Wilcox  wrote:
> On Fri, Aug 17, 2018 at 04:18:34PM -0700, Roman Gushchin wrote:
>> - scan = div64_u64(scan * fraction[file],
>> -  denominator);
>> + if (scan > 1)
>> + scan = div64_u64(scan * fraction[file],
>> +  denominator);
>
> Wouldn't we be better off doing a div_round_up?  ie:
>
> scan = div64_u64(scan * fraction[file] + denominator - 1, 
> denominator);
>
> although i'd rather hide that in a new macro in math64.h than opencode it
> here.

All numbers here should be up to nr_pages * 200 and fit into unsigned long.
I see no reason for u64. If they overflow then u64 wouldn't help either.

There is a DIV_ROUND_UP macro in kernel.h.


Re: [PATCH RFC] mm: don't miss the last page because of round-off error

2018-08-20 Thread Roman Gushchin
On Fri, Aug 17, 2018 at 06:22:13PM -0700, Matthew Wilcox wrote:
> On Fri, Aug 17, 2018 at 04:18:34PM -0700, Roman Gushchin wrote:
> > -   scan = div64_u64(scan * fraction[file],
> > -denominator);
> > +   if (scan > 1)
> > +   scan = div64_u64(scan * fraction[file],
> > +denominator);
> 
> Wouldn't we be better off doing a div_round_up?  ie:
> 
>   scan = div64_u64(scan * fraction[file] + denominator - 1, denominator);
> 
> although i'd rather hide that in a new macro in math64.h than opencode it
> here.

Good idea! Will do in v2.

Thanks!


Re: [PATCH RFC] mm: don't miss the last page because of round-off error

2018-08-17 Thread Matthew Wilcox
On Fri, Aug 17, 2018 at 04:18:34PM -0700, Roman Gushchin wrote:
> - scan = div64_u64(scan * fraction[file],
> -  denominator);
> + if (scan > 1)
> + scan = div64_u64(scan * fraction[file],
> +  denominator);

Wouldn't we be better off doing a div_round_up?  ie:

scan = div64_u64(scan * fraction[file] + denominator - 1, denominator);

although i'd rather hide that in a new macro in math64.h than opencode it
here.


[PATCH RFC] mm: don't miss the last page because of round-off error

2018-08-17 Thread Roman Gushchin
I've noticed that dying memory cgroups are often pinned
in memory by a single pagecache page. Even under moderate
memory pressure, they sometimes stay in that state
for a long time. That looked strange.

My investigation showed that the problem is caused by
applying the LRU pressure balancing math:

  scan = div64_u64(scan * fraction[lru], denominator),

where

  denominator = fraction[anon] + fraction[file] + 1.

Because fraction[lru] is always less than denominator,
if the initial scan size is 1, the result is always 0.

This means the last page is not scanned and has
no chance of being reclaimed.

Fix this by skipping the balancing logic if the initial
scan count is 1.

In practice this change significantly improves the speed
of dying cgroups reclaim.

Signed-off-by: Roman Gushchin 
Cc: Andrew Morton 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Tejun Heo 
Cc: Rik van Riel 
Cc: Konstantin Khlebnikov 
---
 mm/vmscan.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 03822f86f288..f85c5ec01886 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2287,9 +2287,12 @@ static void get_scan_count(struct lruvec *lruvec, struct 
mem_cgroup *memcg,
/*
 * Scan types proportional to swappiness and
 * their relative recent reclaim efficiency.
+* Make sure we don't miss the last page
+* because of a round-off error.
 */
-   scan = div64_u64(scan * fraction[file],
-denominator);
+   if (scan > 1)
+   scan = div64_u64(scan * fraction[file],
+denominator);
break;
case SCAN_FILE:
case SCAN_ANON:
-- 
2.17.1