Re: [PATCH 0/3] Shrinkers and proportional reclaim

2014-05-27 Thread Dave Chinner
On Tue, May 27, 2014 at 04:19:12PM -0700, Hugh Dickins wrote:
> On Wed, 28 May 2014, Konstantin Khlebnikov wrote:
> > On Wed, May 28, 2014 at 1:17 AM, Hugh Dickins  wrote:
> > > On Tue, 27 May 2014, Dave Chinner wrote:
> > >> On Mon, May 26, 2014 at 02:44:29PM -0700, Hugh Dickins wrote:
> > >> >
> > >> > [PATCH 4/3] fs/superblock: Avoid counting without __GFP_FS
> > >> >
> > >> > Don't waste time counting objects in super_cache_count() if no __GFP_FS:
> > >> > super_cache_scan() would only back out with SHRINK_STOP in that case.
> > >> >
> > >> > Signed-off-by: Hugh Dickins 
> > >>
> > >> While you might think that's a good thing, it's not.  The act of
> > >> shrinking is kept separate from the accounting of how much shrinking
> > >> needs to take place.  The amount of work the shrinker can't do due
> > >> to the reclaim context is deferred until the shrinker is called in a
> > >> context where it can do work (e.g. kswapd).
> > >>
> > >> Hence not accounting for work that can't be done immediately will
> > >> adversely impact the balance of the system under memory intensive
> > >> filesystem workloads. In these workloads, almost all allocations are
> > >> done in the GFP_NOFS or GFP_NOIO contexts, so not deferring the work
> > >> will effectively stop superblock cache reclaim entirely.
> > >
> > > Thanks for filling me in on that.  At first I misunderstood you,
> > > and went off looking in the wrong direction.  Now I see what you're
> > > referring to: the quantity that shrink_slab_node() accumulates in
> > > and withdraws from shrinker->nr_deferred[nid].
> > 
> > Maybe the shrinker could accumulate the fraction nr_pages_scanned / lru_pages
> > instead of the exact amount of required work? The count of shrinkable objects
> > could be calculated later, when the shrinker is called from a suitable context
> > and can actually do something.
> 
> Good idea, probably a worthwhile optimization to think through further.
> (Though experience says that Dave will explain how that can never work.)

Heh. :)

Two things, neither of which is a show-stopper, but both would need
to be handled in some way.

First: it would remove a lot of the policy flexibility from the
shrinker implementations that we currently have. i.e. the "work to
do" policy is currently set by the shrinker, not by the shrinker
infrastructure. The shrinker infrastructure only determines whether
the work can be done immediately or whether it should be deferred.

e.g. there are shrinkers that don't do work unless they are
over certain thresholds. These shrinkers need the work calculated
by the callout, as they may decide that nothing can/should/needs
to be done, and that decision may have nothing to do with the
current reclaim context. You can't really do this without a
callout to determine the cache size.
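
As an illustration only -- a minimal sketch of such a threshold-style
->count_objects callback, with entirely hypothetical cache and field
names, not taken from any in-tree shrinker:

	/*
	 * Hypothetical threshold-based shrinker callout: report no
	 * reclaimable work at all until the cache grows beyond a floor
	 * that the shrinker itself chooses.  The policy decision lives
	 * in the shrinker, not in the shrinker infrastructure.
	 */
	static unsigned long
	mycache_count_objects(struct shrinker *shrink, struct shrink_control *sc)
	{
		struct mycache *cache = container_of(shrink, struct mycache,
						     shrinker);
		unsigned long nr = atomic_long_read(&cache->nr_cached);

		/* Below the working-set floor: nothing should be done. */
		if (nr <= cache->min_cached)
			return 0;

		/* Only the excess over the floor is offered for reclaim. */
		return nr - cache->min_cached;
	}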

The other thing I see is that deferring the ratio of work rather
than the actual work doesn't take into account the fact
that the cache sizes might be changing in a different way to memory
pressure. i.e. a sudden increase in cache size just before deferred
reclaim occurred would cause much more reclaim than the current
code does, even though the cache wasn't contributing to the original
deferred memory pressure.

This will lead to bursty/peaky reclaim behaviour, because we then
can't distinguish a large instantaneous change in memory pressure
from "wind up" caused by lots of small increments of deferred work.
We specifically damp the second case:

/*
 * We need to avoid excessive windup on filesystem shrinkers
 * due to large numbers of GFP_NOFS allocations causing the
 * shrinkers to return -1 all the time. This results in a large
 * nr being built up so when a shrink that can do some work
 * comes along it empties the entire cache due to nr >>>
 * freeable. This is bad for sustaining a working set in
 * memory.
 *
 * Hence only allow the shrinker to scan the entire cache when
 * a large delta change is calculated directly.
 */
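
For context, the code around that comment in shrink_slab_node() looked
roughly like the sketch below (reconstructed from the 3.15-era
mm/vmscan.c, so treat the exact details as approximate):

	/* work this reclaim context asks of the shrinker, scaled by
	 * the fraction of the LRU scanned and the seek penalty */
	delta = (4 * nr_pages_scanned) / shrinker->seeks;
	delta *= freeable;
	do_div(delta, lru_pages + 1);
	total_scan += delta;		/* plus previously deferred work */

	/* the damping described in the comment: unless this context
	 * directly generated a large delta, never scan more than half
	 * the cache, no matter how much deferred work has wound up */
	if (delta < freeable / 4)
		total_scan = min(total_scan, freeable / 2);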

Hence we'd need a different mechanism to prevent such deferred work
wind up from occurring. We can probably do better than the current
SWAG if we design a new algorithm that has this damping built in.
The current algorithm is all based around the "seek penalty" that
reinstantiating a reclaimed object incurs, and that simply does not
match many shrinker users now, as they aren't spinning-disk
based. Hence I think we really need to look at improving the entire
shrinker "work" algorithm rather than just tinkering around the
edges...
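
For reference, the knob in question is the seeks field of struct
shrinker -- approximately as it looked in include/linux/shrinker.h at
the time (reconstructed from memory, so treat the details as approximate):

	struct shrinker {
		unsigned long (*count_objects)(struct shrinker *,
					       struct shrink_control *sc);
		unsigned long (*scan_objects)(struct shrinker *,
					      struct shrink_control *sc);

		int seeks;	/* seeks to recreate an obj */
		long batch;	/* reclaim batch size, 0 = default */
		unsigned long flags;

		/* These are for internal use */
		struct list_head list;
		/* objs pending delete, per node */
		atomic_long_t *nr_deferred;
	};
	#define DEFAULT_SEEKS 2 /* A good number if you don't know better. */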

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [PATCH 0/3] Shrinkers and proportional reclaim

2014-05-27 Thread Hugh Dickins
On Wed, 28 May 2014, Konstantin Khlebnikov wrote:
> On Wed, May 28, 2014 at 1:17 AM, Hugh Dickins  wrote:
> > On Tue, 27 May 2014, Dave Chinner wrote:
> >> On Mon, May 26, 2014 at 02:44:29PM -0700, Hugh Dickins wrote:
> >> >
> >> > [PATCH 4/3] fs/superblock: Avoid counting without __GFP_FS
> >> >
> >> > Don't waste time counting objects in super_cache_count() if no __GFP_FS:
> >> > super_cache_scan() would only back out with SHRINK_STOP in that case.
> >> >
> >> > Signed-off-by: Hugh Dickins 
> >>
> >> While you might think that's a good thing, it's not.  The act of
> >> shrinking is kept separate from the accounting of how much shrinking
> >> needs to take place.  The amount of work the shrinker can't do due
> >> to the reclaim context is deferred until the shrinker is called in a
> >> context where it can do work (e.g. kswapd).
> >>
> >> Hence not accounting for work that can't be done immediately will
> >> adversely impact the balance of the system under memory intensive
> >> filesystem workloads. In these workloads, almost all allocations are
> >> done in the GFP_NOFS or GFP_NOIO contexts, so not deferring the work
> >> will effectively stop superblock cache reclaim entirely.
> >
> > Thanks for filling me in on that.  At first I misunderstood you,
> > and went off looking in the wrong direction.  Now I see what you're
> > referring to: the quantity that shrink_slab_node() accumulates in
> > and withdraws from shrinker->nr_deferred[nid].
> 
> Maybe the shrinker could accumulate the fraction nr_pages_scanned / lru_pages
> instead of the exact amount of required work? The count of shrinkable objects
> could be calculated later, when the shrinker is called from a suitable context
> and can actually do something.

Good idea, probably a worthwhile optimization to think through further.
(Though experience says that Dave will explain how that can never work.)

Hugh


Re: [PATCH 0/3] Shrinkers and proportional reclaim

2014-05-27 Thread Konstantin Khlebnikov
On Wed, May 28, 2014 at 1:17 AM, Hugh Dickins  wrote:
> On Tue, 27 May 2014, Dave Chinner wrote:
>> On Mon, May 26, 2014 at 02:44:29PM -0700, Hugh Dickins wrote:
>> >
>> > [PATCH 4/3] fs/superblock: Avoid counting without __GFP_FS
>> >
>> > Don't waste time counting objects in super_cache_count() if no __GFP_FS:
>> > super_cache_scan() would only back out with SHRINK_STOP in that case.
>> >
>> > Signed-off-by: Hugh Dickins 
>>
>> While you might think that's a good thing, it's not.  The act of
>> shrinking is kept separate from the accounting of how much shrinking
>> needs to take place.  The amount of work the shrinker can't do due
>> to the reclaim context is deferred until the shrinker is called in a
>> context where it can do work (e.g. kswapd).
>>
>> Hence not accounting for work that can't be done immediately will
>> adversely impact the balance of the system under memory intensive
>> filesystem workloads. In these workloads, almost all allocations are
>> done in the GFP_NOFS or GFP_NOIO contexts, so not deferring the work
>> will effectively stop superblock cache reclaim entirely.
>
> Thanks for filling me in on that.  At first I misunderstood you,
> and went off looking in the wrong direction.  Now I see what you're
> referring to: the quantity that shrink_slab_node() accumulates in
> and withdraws from shrinker->nr_deferred[nid].

Maybe the shrinker could accumulate the fraction nr_pages_scanned / lru_pages
instead of the exact amount of required work? The count of shrinkable objects
could be calculated later, when the shrinker is called from a suitable context
and can actually do something.
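
A minimal sketch of the idea (hypothetical names throughout --
"ratio_deferred" and RATIO_SCALE are invented for illustration, and this
is not how shrink_slab_node() actually works):

	#define RATIO_SCALE	1024	/* fixed-point scale for the ratio */

	if (!(shrinkctl->gfp_mask & __GFP_FS)) {
		/* remember how much pressure was seen, not how many
		 * objects it translated to -- no count callout needed */
		atomic_long_add(nr_pages_scanned * RATIO_SCALE / (lru_pages + 1),
				&shrinker->ratio_deferred[nid]);
		return 0;
	}

	/* in a capable context: convert the accumulated pressure into
	 * object work, using a count taken right now */
	freeable = shrinker->count_objects(shrinker, shrinkctl);
	ratio = atomic_long_xchg(&shrinker->ratio_deferred[nid], 0);
	total_scan = freeable * ratio / RATIO_SCALE;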


Re: [PATCH 0/3] Shrinkers and proportional reclaim

2014-05-27 Thread Hugh Dickins
On Tue, 27 May 2014, Dave Chinner wrote:
> On Mon, May 26, 2014 at 02:44:29PM -0700, Hugh Dickins wrote:
> > 
> > [PATCH 4/3] fs/superblock: Avoid counting without __GFP_FS
> > 
> > Don't waste time counting objects in super_cache_count() if no __GFP_FS:
> > super_cache_scan() would only back out with SHRINK_STOP in that case.
> > 
> > Signed-off-by: Hugh Dickins 
> 
> While you might think that's a good thing, it's not.  The act of
> shrinking is kept separate from the accounting of how much shrinking
> needs to take place.  The amount of work the shrinker can't do due
> to the reclaim context is deferred until the shrinker is called in a
> context where it can do work (e.g. kswapd).
> 
> Hence not accounting for work that can't be done immediately will
> adversely impact the balance of the system under memory intensive
> filesystem workloads. In these workloads, almost all allocations are
> done in the GFP_NOFS or GFP_NOIO contexts, so not deferring the work
> will effectively stop superblock cache reclaim entirely.

Thanks for filling me in on that.  At first I misunderstood you,
and went off looking in the wrong direction.  Now I see what you're
referring to: the quantity that shrink_slab_node() accumulates in
and withdraws from shrinker->nr_deferred[nid].

Right: forget my super_cache_count() __GFP_FS patch!

Hugh


Re: [PATCH 0/3] Shrinkers and proportional reclaim

2014-05-26 Thread Dave Chinner
On Mon, May 26, 2014 at 02:44:29PM -0700, Hugh Dickins wrote:
> On Thu, 22 May 2014, Mel Gorman wrote:
> 
> > This series is aimed at regressions noticed during reclaim activity. The
> > first two patches are shrinker patches that were posted ages ago but never
> > merged for reasons that are unclear to me. I'm posting them again to see if
> > there was a reason they were dropped or if they just got lost. Dave?  Tim?
> > The last patch adjusts proportional reclaim. Yuanhan Liu, can you retest
> > the vm scalability test cases on a larger machine? Hugh, does this work
> > for you on the memcg test cases?
> 
> Yes it does, thank you.
> 
> Though the situation is muddy, since on our current internal tree, I'm
> surprised to find that the memcg test case no longer fails reliably
> without our workaround and without your fix.
> 
> "Something must have changed"; but it would take a long time to work
> out what.  If I travel back in time with git, to where we first applied
> the "vindictive" patch, then yes that test case convincingly fails
> without either (my or your) patch, and passes with either patch.
> 
> And you have something that satisfies Yuanhan too, that's great.
> 
> I'm also pleased to see Dave and Tim reduce the contention in
> grab_super_passive(): that's a familiar symbol from livelock dumps.
> 
> You might want to add this little 4/3, that we've had in for a
> while; but with grab_super_passive() out of super_cache_count(),
> it will have much less importance.
> 
> 
> [PATCH 4/3] fs/superblock: Avoid counting without __GFP_FS
> 
> Don't waste time counting objects in super_cache_count() if no __GFP_FS:
> super_cache_scan() would only back out with SHRINK_STOP in that case.
> 
> Signed-off-by: Hugh Dickins 

While you might think that's a good thing, it's not.  The act of
shrinking is kept separate from the accounting of how much shrinking
needs to take place.  The amount of work the shrinker can't do due
to the reclaim context is deferred until the shrinker is called in a
context where it can do work (e.g. kswapd).

Hence not accounting for work that can't be done immediately will
adversely impact the balance of the system under memory intensive
filesystem workloads. In these workloads, almost all allocations are
done in the GFP_NOFS or GFP_NOIO contexts, so not deferring the work
will effectively stop superblock cache reclaim entirely.
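
The deferral being described is the nr_deferred bookkeeping in
shrink_slab_node(). In rough outline -- a reconstructed sketch of the
3.15-era mm/vmscan.c logic, not a verbatim excerpt -- it works like this:

	/* pull previously deferred work into this invocation, zeroing
	 * it so that concurrent shrinker calls don't repeat the work */
	nr = atomic_long_xchg(&shrinker->nr_deferred[nid], 0);
	total_scan = nr + delta;	/* delta: work owed by this context */

	/* ... the scan loop runs here: it calls ->scan_objects(), which
	 * bails out with SHRINK_STOP in contexts that can't reclaim
	 * (e.g. GFP_NOFS for filesystem shrinkers), decrementing
	 * total_scan only for work actually done ... */

	/* whatever could not be done now is pushed back, to be picked
	 * up later by a context that can do it, e.g. kswapd */
	if (total_scan > 0)
		atomic_long_add(total_scan, &shrinker->nr_deferred[nid]);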

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [PATCH 0/3] Shrinkers and proportional reclaim

2014-05-26 Thread Hugh Dickins
On Thu, 22 May 2014, Mel Gorman wrote:

> This series is aimed at regressions noticed during reclaim activity. The
> first two patches are shrinker patches that were posted ages ago but never
> merged for reasons that are unclear to me. I'm posting them again to see if
> there was a reason they were dropped or if they just got lost. Dave?  Tim?
> The last patch adjusts proportional reclaim. Yuanhan Liu, can you retest
> the vm scalability test cases on a larger machine? Hugh, does this work
> for you on the memcg test cases?

Yes it does, thank you.

Though the situation is muddy, since on our current internal tree, I'm
surprised to find that the memcg test case no longer fails reliably
without our workaround and without your fix.

"Something must have changed"; but it would take a long time to work
out what.  If I travel back in time with git, to where we first applied
the "vindictive" patch, then yes that test case convincingly fails
without either (my or your) patch, and passes with either patch.

And you have something that satisfies Yuanhan too, that's great.

I'm also pleased to see Dave and Tim reduce the contention in
grab_super_passive(): that's a familiar symbol from livelock dumps.

You might want to add this little 4/3, that we've had in for a
while; but with grab_super_passive() out of super_cache_count(),
it will have much less importance.


[PATCH 4/3] fs/superblock: Avoid counting without __GFP_FS

Don't waste time counting objects in super_cache_count() if no __GFP_FS:
super_cache_scan() would only back out with SHRINK_STOP in that case.

Signed-off-by: Hugh Dickins 
---

 fs/super.c |    6 ++++++
 1 file changed, 6 insertions(+)

--- melgo/fs/super.c	2014-05-26 13:39:33.000131904 -0700
+++ linux/fs/super.c	2014-05-26 13:56:19.012155813 -0700
@@ -110,6 +110,12 @@ static unsigned long super_cache_count(s
 	struct super_block *sb;
 	long	total_objects = 0;
 
+	/*
+	 * None can be freed without __GFP_FS, so don't waste time counting.
+	 */
+	if (!(sc->gfp_mask & __GFP_FS))
+		return 0;
+
 	sb = container_of(shrink, struct super_block, s_shrink);
 
 	/*
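
For reference, the SHRINK_STOP bail-out that the description refers to
sits at the top of super_cache_scan(); approximately -- reconstructed
from the fs/super.c of that era, with unrelated details elided:

	static unsigned long super_cache_scan(struct shrinker *shrink,
					      struct shrink_control *sc)
	{
		struct super_block *sb;
		...
		sb = container_of(shrink, struct super_block, s_shrink);

		/*
		 * Deadlock avoidance: we may hold various FS locks, so we
		 * don't want to recurse into the FS that called us.  Without
		 * __GFP_FS nothing can be freed, so back out immediately --
		 * which is what makes counting first a waste of time.
		 */
		if (!(sc->gfp_mask & __GFP_FS))
			return SHRINK_STOP;
		...
	}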


Re: [PATCH 0/3] Shrinkers and proportional reclaim

2014-05-22 Thread Yuanhan Liu
On Thu, May 22, 2014 at 05:30:51PM +0100, Mel Gorman wrote:
> On Fri, May 23, 2014 at 12:14:16AM +0800, Yuanhan Liu wrote:
> > On Thu, May 22, 2014 at 10:09:36AM +0100, Mel Gorman wrote:
> > > This series is aimed at regressions noticed during reclaim activity. The
> > > first two patches are shrinker patches that were posted ages ago but never
> > > merged for reasons that are unclear to me. I'm posting them again to see if
> > > there was a reason they were dropped or if they just got lost. Dave?  Tim?
> > > The last patch adjusts proportional reclaim. Yuanhan Liu, can you retest
> > > the vm scalability test cases on a larger machine? Hugh, does this work
> > > for you on the memcg test cases?
> > 
> > Sure, and here is the result. I applied these 3 patches on v3.15-rc6,
> > and the head commit is 60c10afd. e82e0561 is the old commit that introduced
> > the regression.  The test server has 512G memory and 120 CPUs.
> > 
> > It's a simple result; if you need more data, I can gather them and send
> > it to you tomorrow:
> > 
> > e82e0561       v3.15-rc6      60c10afd
> > 
> > 18560785       12232122       38868453
> >                -34%           +109%
> > 
> > As you can see, the performance is back, and it is way better ;)
> > 
> 
> Thanks a lot for that and the quick response. It is much appreciated.

Welcome! And sorry that I made a silly mistake. Those numbers are right,
though; I just set up the wrong comparison base. I should compare them
with e82e0561's parent, which is 75485363ce85526 in the table below.

Here are the detailed results to make up for that mistake ;)

Legend:
~XX%    - stddev percent  (3 runs for each kernel)
[+-]XX% - change percent


75485363ce85526             e82e0561dae9f3ae5a21fc2d3   v3.15-rc6                   60c10afd233f3344479d229dc
---------------             -------------------------   ---------                   -------------------------
  35979244 ~ 0%    -48.4%    18560785 ~ 0%    -66.0%    12235090 ~ 0%     +8.0%    38868453 ~ 0%   vm-scalability.throughput
     28138 ~ 0%  +7448.2%     2123943 ~ 0%  +2724.5%      794777 ~ 0%     +1.6%       28598 ~ 0%   proc-vmstat.allocstall
       544 ~ 6%    -95.2%          26 ~ 0%    -96.5%          19 ~21%     -6.9%         506 ~ 6%   numa-vmstat.node2.nr_isolated_file
  12009832 ~11%   +368.1%    56215319 ~ 0%   +312.9%    49589361 ~ 1%     +0.7%    12091235 ~ 5%   numa-numastat.node3.numa_foreign
       560 ~ 5%    -95.7%          24 ~12%    -96.9%          17 ~10%     -8.7%         511 ~ 2%   numa-vmstat.node1.nr_isolated_file
   8740137 ~12%   +574.0%    58910256 ~ 0%   +321.0%    36798827 ~ 0%    +21.0%    10578905 ~13%   numa-vmstat.node0.numa_other
   8734988 ~12%   +574.4%    58904944 ~ 0%   +321.2%    36794158 ~ 0%    +21.0%    10572718 ~13%   numa-vmstat.node0.numa_miss
      1308 ~12%   -100.0%           0 ~ 0%   -100.0%           0         +23.3%        1612 ~18%   proc-vmstat.pgscan_direct_throttle
  12294788 ~11%   +401.2%    61622745 ~ 0%   +332.6%    53190547 ~ 0%    -13.2%    10667387 ~ 5%   numa-numastat.node1.numa_foreign
       576 ~ 6%    -91.2%          50 ~22%    -94.3%          33 ~20%    -18.1%         472 ~ 1%   numa-vmstat.node0.nr_isolated_file
        12 ~24%  +2400.0%         316 ~ 4%  +13543.7%        1728 ~ 5%   +155.3%          32 ~29%   proc-vmstat.compact_stall
       572 ~ 2%    -96.4%          20 ~18%    -97.6%          13 ~11%    -17.5%         472 ~ 2%   numa-vmstat.node3.nr_isolated_file
      3012 ~12%  +2388.4%       74959 ~ 0%   +254.7%       10685 ~ 1%    -45.4%        1646 ~ 1%   proc-vmstat.pageoutrun
      2312 ~ 3%    -94.2%         133 ~ 4%    -95.8%          97 ~ 8%    -12.6%        2021 ~ 2%   proc-vmstat.nr_isolated_file
   2575163 ~ 0%  +2779.1%    74141888 ~ 0%   +958.0%    27244229 ~ 0%     -1.3%     2542941 ~ 0%   proc-vmstat.pgscan_direct_dma32
  21916603 ~13%  +2519.8%   5.742e+08 ~ 0%  +2868.9%   6.507e+08 ~ 0%    -16.1%    18397644 ~ 5%   proc-vmstat.pgscan_kswapd_normal
     53306 ~24%  +1077.9%      627895 ~ 0%  +2066.2%     1154741 ~ 0%    +23.5%       65815 ~24%   proc-vmstat.pgscan_kswapd_dma32
   2575163 ~ 0%  +2778.6%    74129497 ~ 0%   +957.8%    27239606 ~ 0%     -1.3%     2542353 ~ 0%   proc-vmstat.pgsteal_direct_dma32
  21907744 ~14%  +2520.8%   5.742e+08 ~ 0%  +2870.0%   6.507e+08 ~ 0%    -16.1%    18386641 ~ 5%   proc-vmstat.pgsteal_kswapd_normal
     53306 ~24%  +1077.7%      627796 ~ 0%  +2065.7%     1154436 ~ 0%    +23.3%       65731 ~24%   proc-vmstat.pgsteal_kswapd_dma32
   2967449 ~ 0%  +2432.7%    75156011 ~ 0%   +869.9%    28781337 ~ 0%     -0.7%     2945933 ~ 0%   proc-vmstat.pgalloc_dma32
  13081172 ~11%   +599.4%    91495653 ~ 0%   +337.1%    57180622 ~ 0%    +12.1%    14668141 ~13%   numa-numastat.node0.other_node
  13073426 ~11%   +599.8%    91489575 ~ 0%   +337.4%    57177129 ~ 0%    +12.1%    14660341 ~13%   numa-numastat.node0.numa_miss
       281 ~23%  +1969.4%        5822 ~ 1%  +3321.4%        9625 ~ 2%    -26.9%

Re: [PATCH 0/3] Shrinkers and proportional reclaim

2014-05-22 Thread Tim Chen
On Thu, 2014-05-22 at 10:09 +0100, Mel Gorman wrote:
> This series is aimed at regressions noticed during reclaim activity. The
> first two patches are shrinker patches that were posted ages ago but never
> merged for reasons that are unclear to me. I'm posting them again to see if
> there was a reason they were dropped or if they just got lost. Dave?  Tim?

As far as I remember, I think Dave was planning to merge these as part
of his VFS scalability patch series.  Otherwise there weren't any other
issues.

Thanks to Mel for looking at these patches and Yuanhan for testing them.

Tim



Re: [PATCH 0/3] Shrinkers and proportional reclaim

2014-05-22 Thread Mel Gorman
On Fri, May 23, 2014 at 12:14:16AM +0800, Yuanhan Liu wrote:
> On Thu, May 22, 2014 at 10:09:36AM +0100, Mel Gorman wrote:
> > This series is aimed at regressions noticed during reclaim activity. The
> > first two patches are shrinker patches that were posted ages ago but never
> > merged for reasons that are unclear to me. I'm posting them again to see if
> > there was a reason they were dropped or if they just got lost. Dave?  Tim?
> > The last patch adjusts proportional reclaim. Yuanhan Liu, can you retest
> > the vm scalability test cases on a larger machine? Hugh, does this work
> > for you on the memcg test cases?
> 
> Sure, and here is the result. I applied these 3 patches on v3.15-rc6,
> and the head commit is 60c10afd. e82e0561 is the old commit that introduced
> the regression.  The test server has 512G memory and 120 CPUs.
> 
> It's a simple result; if you need more data, I can gather them and send
> it to you tomorrow:
> 
> e82e0561       v3.15-rc6      60c10afd
> 
> 18560785       12232122       38868453
>                -34%           +109%
> 
> As you can see, the performance is back, and it is way better ;)
> 

Thanks a lot for that and the quick response. It is much appreciated.

-- 
Mel Gorman
SUSE Labs


Re: [PATCH 0/3] Shrinkers and proportional reclaim

2014-05-22 Thread Yuanhan Liu
On Thu, May 22, 2014 at 10:09:36AM +0100, Mel Gorman wrote:
> This series is aimed at regressions noticed during reclaim activity. The
> first two patches are shrinker patches that were posted ages ago but never
> merged for reasons that are unclear to me. I'm posting them again to see if
> there was a reason they were dropped or if they just got lost. Dave?  Tim?
> The last patch adjusts proportional reclaim. Yuanhan Liu, can you retest
> the vm scalability test cases on a larger machine? Hugh, does this work
> for you on the memcg test cases?

Sure, and here is the result. I applied these 3 patches on v3.15-rc6,
and the head commit is 60c10afd. e82e0561 is the old commit that introduced
the regression.  The test server has 512G memory and 120 CPUs.

It's a simple result; if you need more data, I can gather them and send
it to you tomorrow:

e82e0561       v3.15-rc6      60c10afd

18560785       12232122       38868453
               -34%           +109%

As you can see, the performance is back, and it is way better ;)

--yliu
> 
> Based on ext4, I get the following results but unfortunately my larger test
> machines are all unavailable so this is based on a relatively small machine.
> 
> postmark
>                                  3.15.0-rc5         3.15.0-rc5
>                                     vanilla    proportion-v1r4
> Ops/sec Transactions       21.00 (  0.00%)     25.00 ( 19.05%)
> Ops/sec FilesCreate        39.00 (  0.00%)     45.00 ( 15.38%)
> Ops/sec CreateTransact     10.00 (  0.00%)     12.00 ( 20.00%)
> Ops/sec FilesDeleted     6202.00 (  0.00%)   6202.00 (  0.00%)
> Ops/sec DeleteTransact     11.00 (  0.00%)     12.00 (  9.09%)
> Ops/sec DataRead/MB        25.97 (  0.00%)     30.02 ( 15.59%)
> Ops/sec DataWrite/MB       49.99 (  0.00%)     57.78 ( 15.58%)
> 
> ffsb (mail server simulator)
>                                  3.15.0-rc5         3.15.0-rc5
>                                     vanilla    proportion-v1r4
> Ops/sec readall          9402.63 (  0.00%)   9805.74 (  4.29%)
> Ops/sec create           4695.45 (  0.00%)   4781.39 (  1.83%)
> Ops/sec delete            173.72 (  0.00%)    177.23 (  2.02%)
> Ops/sec Transactions    14271.80 (  0.00%)  14764.37 (  3.45%)
> Ops/sec Read               37.00 (  0.00%)     38.50 (  4.05%)
> Ops/sec Write              18.20 (  0.00%)     18.50 (  1.65%)
> 
> dd of a large file
>                                  3.15.0-rc5         3.15.0-rc5
>                                     vanilla    proportion-v1r4
> WallTime DownloadTar       75.00 (  0.00%)     61.00 ( 18.67%)
> WallTime DD               423.00 (  0.00%)    401.00 (  5.20%)
> WallTime Delete             2.00 (  0.00%)      5.00 (-150.00%)
> 
> stutter (times mmap latency during large amounts of IO)
> 
>                                  3.15.0-rc5         3.15.0-rc5
>                                     vanilla    proportion-v1r4
> Unit >5ms Delays  80252. (  0.00%)  81523. ( -1.58%)
> Unit Mmap min 8.2118 (  0.00%)  8.3206 ( -1.33%)
> Unit Mmap mean   17.4614 (  0.00%) 17.2868 (  1.00%)
> Unit Mmap stddev 24.9059 (  0.00%) 34.6771 (-39.23%)
> Unit Mmap max  2811.6433 (  0.00%)   2645.1398 (  5.92%)
> Unit Mmap 90%20.5098 (  0.00%) 18.3105 ( 10.72%)
> Unit Mmap 93%22.9180 (  0.00%) 20.1751 ( 11.97%)
> Unit Mmap 95%25.2114 (  0.00%) 22.4988 ( 10.76%)
> Unit Mmap 99%46.1430 (  0.00%) 43.5952 (  5.52%)
> Unit Ideal  Tput 85.2623 (  0.00%) 78.8906 (  7.47%)
> Unit Tput min44.0666 (  0.00%) 43.9609 (  0.24%)
> Unit Tput mean   45.5646 (  0.00%) 45.2009 (  0.80%)
> Unit Tput stddev  0.9318 (  0.00%)  1.1084 (-18.95%)
> Unit Tput max46.7375 (  0.00%) 46.7539 ( -0.04%)
> 
>  fs/super.c  | 16 +---
>  mm/vmscan.c | 36 +---
>  2 files changed, 34 insertions(+), 18 deletions(-)
> 
> -- 
> 1.8.4.5