Re: [PATCH 0/9] psi: pressure stall information for CPU, memory, and IO v4

2018-10-23 Thread Peter Zijlstra
On Tue, Oct 23, 2018 at 01:29:37PM -0400, Johannes Weiner wrote:
> On Thu, Oct 18, 2018 at 07:07:10PM -0700, Andrew Morton wrote:
> > On Tue, 28 Aug 2018 13:22:49 -0400 Johannes Weiner wrote:
> > 
> > > This version 4 of the PSI series incorporates feedback from Peter and
> > > fixes two races in the lockless aggregator that Suren found in his
> > > testing and which caused the sample calculation to sometimes underflow
> > > and record bogusly large samples; details at the bottom of this email.
> > 
> > We've had very little in the way of review activity for the PSI
> > patchset.  According to the changelog tags, anyway.
> 
> Peter reviewed it quite extensively over all revisions, and acked the
> final version. Peter, can we add your acked-by or reviewed-by tag(s)?

I don't really do reviewed by; but yes, I thought I already did; lemme
find.

> The scheduler part accounts for 99% of the complexity in those
> patches. The mm bits, while somewhat sprawling, are mostly mechanical.

Ah, I now see my mistake;

  
https://lkml.kernel.org/r/20180907110407.gq24...@hirez.programming.kicks-ass.net

I forgot to include an actual tag therein. My bad.

Acked-by: Peter Zijlstra (Intel) 


Re: [PATCH 0/9] psi: pressure stall information for CPU, memory, and IO v4

2018-10-23 Thread Johannes Weiner
On Thu, Oct 18, 2018 at 07:07:10PM -0700, Andrew Morton wrote:
> On Tue, 28 Aug 2018 13:22:49 -0400 Johannes Weiner  wrote:
> 
> > This version 4 of the PSI series incorporates feedback from Peter and
> > fixes two races in the lockless aggregator that Suren found in his
> > testing and which caused the sample calculation to sometimes underflow
> > and record bogusly large samples; details at the bottom of this email.
> 
> We've had very little in the way of review activity for the PSI
> patchset.  According to the changelog tags, anyway.

Peter reviewed it quite extensively over all revisions, and acked the
final version. Peter, can we add your acked-by or reviewed-by tag(s)?

The scheduler part accounts for 99% of the complexity in those
patches. The mm bits, while somewhat sprawling, are mostly mechanical.


Re: [PATCH 0/9] psi: pressure stall information for CPU, memory, and IO v4

2018-10-18 Thread Andrew Morton
On Tue, 28 Aug 2018 13:22:49 -0400 Johannes Weiner  wrote:

> This version 4 of the PSI series incorporates feedback from Peter and
> fixes two races in the lockless aggregator that Suren found in his
> testing and which caused the sample calculation to sometimes underflow
> and record bogusly large samples; details at the bottom of this email.

We've had very little in the way of review activity for the PSI
patchset.  According to the changelog tags, anyway.



Re: [PATCH 0/9] psi: pressure stall information for CPU, memory, and IO v4

2018-09-25 Thread Suren Baghdasaryan
I emailed Daniel 4.9 backport patches. Unfortunately that seems to be
the easiest way to share them. If anyone else is interested in them
please email me directly.
Thanks,
Suren.


On Tue, Sep 18, 2018 at 8:53 AM, Suren Baghdasaryan  wrote:
> Hi Daniel,
>
> On Sun, Sep 16, 2018 at 10:22 PM, Daniel Drake  wrote:
>> Hi Suren
>>
>> On Fri, Sep 7, 2018 at 11:58 PM, Suren Baghdasaryan wrote:
>>> Thanks for the new patchset! Backported to 4.9 and retested on an ARMv8
>>> 8-core system running Android. Signals behave as expected reacting to
>>> memory pressure, no jumps in "total" counters that would indicate
>>> overflow/underflow issues. Nicely done!
>>
>> Can you share your Linux v4.9 psi backport somewhere?
>>
>
> Absolutely. Let me figure out what's the best way to share that and
> make sure they apply cleanly on official 4.9 (I was using vendor's
> tree for testing). Will need a day or so to get this done.
> In case you need them sooner, there were several "prerequisite"
> patches that I had to backport to make PSI backporting
> easier/possible. Following is the list as shown by "git log
> --oneline":
>
> PSI patches:
>
> ef94c067f360 psi: cgroup support
> 60081a7aeb0b psi: pressure stall information for CPU, memory, and IO
> acd2a16497e9 sched: introduce this_rq_lock_irq()
> f30268c29309 sched: sched.h: make rq locking and clock functions
> available in stats.h
> a2fd1c94b743 sched: loadavg: make calc_load_n() public
> 32a74dec4967 sched: loadavg: consolidate LOAD_INT, LOAD_FRAC, CALC_LOAD
> 8e3991dd1a73 delayacct: track delays from thrashing cache pages
> 4ae940e7e6ff mm: workingset: tell cache transitions from workingset thrashing
> e9ccd63399e0 mm: workingset: don't drop refault information prematurely
>
> Prerequisites:
>
> b5a58c778c54 workqueue: make workqueue available early during boot
> ae5f39ee13b5 sched/core: Add wrappers for lockdep_(un)pin_lock()
> 7276f98a72c1 sched/headers, delayacct: Move the 'struct
> task_delay_info' definition from  to
> 
> 287318d13688 mm: add PageWaiters indicating tasks are waiting for a page bit
> edfa64560aaa sched/headers: Remove  from 
> 
> f6b6ba853959 sched/headers: Move loadavg related definitions from
>  to 
> 395b0a9f7aae sched/headers: Prepare for new header dependencies before
> moving code to 
>
> PSI patches needed some adjustments but nothing really major.
>
>> Thanks
>> Daniel
>
> Thanks,
> Suren.


Re: [PATCH 0/9] psi: pressure stall information for CPU, memory, and IO v4

2018-09-18 Thread Suren Baghdasaryan
On Mon, Sep 17, 2018 at 6:29 AM, peter enderborg wrote:
> Will it be part of the backport to 4.9 google android or is it for test only?

Currently I'm testing these patches in tandem with PSI monitor that
I'm developing and test results look good. If things go well and we
start using PSI for Android I will try to upstream the backport. If
upstream rejects it we will have to merge it into Android common
kernel repo as a last resort. Hope this answers your question.

> I guess that this patch is too big for the LTS tree.
>
> On 09/07/2018 05:58 PM, Suren Baghdasaryan wrote:
>> Thanks for the new patchset! Backported to 4.9 and retested on an ARMv8
>> 8-core system running Android. Signals behave as expected reacting to
>> memory pressure, no jumps in "total" counters that would indicate
>> overflow/underflow issues. Nicely done!
>>
>> Tested-by: Suren Baghdasaryan 
>>
>> On Fri, Sep 7, 2018 at 8:09 AM, Johannes Weiner  wrote:
>>> On Fri, Sep 07, 2018 at 01:04:07PM +0200, Peter Zijlstra wrote:
>>>> So yeah, grudgingly acked. Did you want me to pick this up through the
>>>> scheduler tree since most of this lives there?
>>> Thanks for the ack.
>>>
>>> As for routing it, I'll leave that decision to you and Andrew. It
>>> touches stuff all over, so it could result in quite a few conflicts
>>> between trees (although I don't expect any of them to be non-trivial).
>
>

Thanks,
Suren.


Re: [PATCH 0/9] psi: pressure stall information for CPU, memory, and IO v4

2018-09-18 Thread Suren Baghdasaryan
Hi Daniel,

On Sun, Sep 16, 2018 at 10:22 PM, Daniel Drake  wrote:
> Hi Suren
>
> On Fri, Sep 7, 2018 at 11:58 PM, Suren Baghdasaryan  wrote:
>> Thanks for the new patchset! Backported to 4.9 and retested on an ARMv8
>> 8-core system running Android. Signals behave as expected reacting to
>> memory pressure, no jumps in "total" counters that would indicate
>> overflow/underflow issues. Nicely done!
>
> Can you share your Linux v4.9 psi backport somewhere?
>

Absolutely. Let me figure out what's the best way to share that and
make sure they apply cleanly on official 4.9 (I was using vendor's
tree for testing). Will need a day or so to get this done.
In case you need them sooner, there were several "prerequisite"
patches that I had to backport to make PSI backporting
easier/possible. Following is the list as shown by "git log
--oneline":

PSI patches:

ef94c067f360 psi: cgroup support
60081a7aeb0b psi: pressure stall information for CPU, memory, and IO
acd2a16497e9 sched: introduce this_rq_lock_irq()
f30268c29309 sched: sched.h: make rq locking and clock functions
available in stats.h
a2fd1c94b743 sched: loadavg: make calc_load_n() public
32a74dec4967 sched: loadavg: consolidate LOAD_INT, LOAD_FRAC, CALC_LOAD
8e3991dd1a73 delayacct: track delays from thrashing cache pages
4ae940e7e6ff mm: workingset: tell cache transitions from workingset thrashing
e9ccd63399e0 mm: workingset: don't drop refault information prematurely

Prerequisites:

b5a58c778c54 workqueue: make workqueue available early during boot
ae5f39ee13b5 sched/core: Add wrappers for lockdep_(un)pin_lock()
7276f98a72c1 sched/headers, delayacct: Move the 'struct
task_delay_info' definition from  to

287318d13688 mm: add PageWaiters indicating tasks are waiting for a page bit
edfa64560aaa sched/headers: Remove  from 
f6b6ba853959 sched/headers: Move loadavg related definitions from
 to 
395b0a9f7aae sched/headers: Prepare for new header dependencies before
moving code to 

PSI patches needed some adjustments but nothing really major.

> Thanks
> Daniel

Thanks,
Suren.


Re: [PATCH 0/9] psi: pressure stall information for CPU, memory, and IO v4

2018-09-17 Thread Peter Zijlstra


A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?


Re: [PATCH 0/9] psi: pressure stall information for CPU, memory, and IO v4

2018-09-17 Thread peter enderborg
Will it be part of the backport to 4.9 google android or is it for test only?
I guess that this patch is too big for the LTS tree.

On 09/07/2018 05:58 PM, Suren Baghdasaryan wrote:
> Thanks for the new patchset! Backported to 4.9 and retested on an ARMv8
> 8-core system running Android. Signals behave as expected reacting to
> memory pressure, no jumps in "total" counters that would indicate
> overflow/underflow issues. Nicely done!
>
> Tested-by: Suren Baghdasaryan 
>
> On Fri, Sep 7, 2018 at 8:09 AM, Johannes Weiner  wrote:
>> On Fri, Sep 07, 2018 at 01:04:07PM +0200, Peter Zijlstra wrote:
>>> So yeah, grudgingly acked. Did you want me to pick this up through the
>>> scheduler tree since most of this lives there?
>> Thanks for the ack.
>>
>> As for routing it, I'll leave that decision to you and Andrew. It
>> touches stuff all over, so it could result in quite a few conflicts
>> between trees (although I don't expect any of them to be non-trivial).




Re: [PATCH 0/9] psi: pressure stall information for CPU, memory, and IO v4

2018-09-16 Thread Daniel Drake
Hi Suren

On Fri, Sep 7, 2018 at 11:58 PM, Suren Baghdasaryan  wrote:
> Thanks for the new patchset! Backported to 4.9 and retested on an ARMv8
> 8-core system running Android. Signals behave as expected reacting to
> memory pressure, no jumps in "total" counters that would indicate
> overflow/underflow issues. Nicely done!

Can you share your Linux v4.9 psi backport somewhere?

Thanks
Daniel


Re: [PATCH 0/9] psi: pressure stall information for CPU, memory, and IO v4

2018-09-07 Thread Suren Baghdasaryan
Thanks for the new patchset! Backported to 4.9 and retested on an ARMv8
8-core system running Android. Signals behave as expected reacting to
memory pressure, no jumps in "total" counters that would indicate
overflow/underflow issues. Nicely done!

Tested-by: Suren Baghdasaryan 

On Fri, Sep 7, 2018 at 8:09 AM, Johannes Weiner  wrote:
> On Fri, Sep 07, 2018 at 01:04:07PM +0200, Peter Zijlstra wrote:
>> So yeah, grudgingly acked. Did you want me to pick this up through the
>> scheduler tree since most of this lives there?
>
> Thanks for the ack.
>
> As for routing it, I'll leave that decision to you and Andrew. It
> touches stuff all over, so it could result in quite a few conflicts
> between trees (although I don't expect any of them to be non-trivial).


Re: [PATCH 0/9] psi: pressure stall information for CPU, memory, and IO v4

2018-09-07 Thread Johannes Weiner
On Fri, Sep 07, 2018 at 01:04:07PM +0200, Peter Zijlstra wrote:
> So yeah, grudgingly acked. Did you want me to pick this up through the
> scheduler tree since most of this lives there?

Thanks for the ack.

As for routing it, I'll leave that decision to you and Andrew. It
touches stuff all over, so it could result in quite a few conflicts
between trees (although I don't expect any of them to be non-trivial).


Re: [PATCH 0/9] psi: pressure stall information for CPU, memory, and IO v4

2018-09-07 Thread Peter Zijlstra
On Wed, Sep 05, 2018 at 05:43:03PM -0400, Johannes Weiner wrote:
> On Tue, Aug 28, 2018 at 01:22:49PM -0400, Johannes Weiner wrote:
> > This version 4 of the PSI series incorporates feedback from Peter and
> > fixes two races in the lockless aggregator that Suren found in his
> > testing and which caused the sample calculation to sometimes underflow
> > and record bogusly large samples; details at the bottom of this email.
> 
> Peter, do the changes from v3 look sane to you?
> 
> If there aren't any further objections, I was hoping we could get this
> lined up for 4.20.

I suppose it looks ok, there's a few small nits, but nothing big.

I still hate psi_ttwu_dequeue(), but I don't really know what to do about
that.

So yeah, grudgingly acked. Did you want me to pick this up through the
scheduler tree since most of this lives there?


Re: [PATCH 0/9] psi: pressure stall information for CPU, memory, and IO v4

2018-09-07 Thread Peter Zijlstra
On Wed, Sep 05, 2018 at 05:43:03PM -0400, Johannes Weiner wrote:
> On Tue, Aug 28, 2018 at 01:22:49PM -0400, Johannes Weiner wrote:
> > This version 4 of the PSI series incorporates feedback from Peter and
> > fixes two races in the lockless aggregator that Suren found in his
> > testing and which caused the sample calculation to sometimes underflow
> > and record bogusly large samples; details at the bottom of this email.
> 
> Peter, do the changes from v3 look sane to you?

I'll go have a look.


Re: [PATCH 0/9] psi: pressure stall information for CPU, memory, and IO v4

2018-09-07 Thread Daniel Drake
On Thu, Sep 6, 2018 at 5:43 AM, Johannes Weiner  wrote:
> Peter, do the changes from v3 look sane to you?
>
> If there aren't any further objections, I was hoping we could get this
> lined up for 4.20.

That would be excellent. I just retested the latest version at
http://git.cmpxchg.org/cgit.cgi/linux-psi.git (Linux 4.18) and the
results are great.

Test setup:
Endless OS
GeminiLake N4200 low end laptop
2GB RAM
swap (and zram swap) disabled

Baseline test: open a handful of large-ish apps and several website
tabs in Google Chrome.
Results: after a couple of minutes, system is excessively thrashing,
mouse cursor can barely be moved, UI is not responding to mouse
clicks, so it's impractical to recover from this situation as an
ordinary user

Add my simple killer:
https://gist.github.com/dsd/a8988bf0b81a6163475988120fe8d9cd
Results: when the thrashing causes the UI to become sluggish, the
killer steps in and kills something (usually a chrome tab), and the
system remains usable. I repeatedly opened more apps and more websites
over a 15 minute period but I wasn't able to get the system to a point
of UI unresponsiveness.

Thanks,
Daniel


Re: [PATCH 0/9] psi: pressure stall information for CPU, memory, and IO v4

2018-09-05 Thread Johannes Weiner
On Tue, Aug 28, 2018 at 01:22:49PM -0400, Johannes Weiner wrote:
> This version 4 of the PSI series incorporates feedback from Peter and
> fixes two races in the lockless aggregator that Suren found in his
> testing and which caused the sample calculation to sometimes underflow
> and record bogusly large samples; details at the bottom of this email.

Peter, do the changes from v3 look sane to you?

If there aren't any further objections, I was hoping we could get this
lined up for 4.20.


[PATCH 0/9] psi: pressure stall information for CPU, memory, and IO v4

2018-08-28 Thread Johannes Weiner
This version 4 of the PSI series incorporates feedback from Peter and
fixes two races in the lockless aggregator that Suren found in his
testing and which caused the sample calculation to sometimes underflow
and record bogusly large samples; details at the bottom of this email.
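
As a generic illustration of why an underflow shows up as a bogusly
large sample (this is not the actual aggregator code, and the 32-bit
counter width is an assumption made for the example), consider a delta
computed from two snapshots that a racing reader observed in the wrong
order:

```python
U32_MASK = 0xFFFFFFFF  # emulate a 32-bit unsigned counter (assumed width)


def u32_delta(now: int, prev: int) -> int:
    """Return now - prev as a 32-bit unsigned value, wrapping on underflow."""
    return (now - prev) & U32_MASK


# Consistent snapshot: "prev" really is older than "now".
ok = u32_delta(1_500, 1_000)

# Racy snapshot: a stale "now" is paired with a newer "prev", so the
# subtraction goes negative and wraps around to a huge value.
bogus = u32_delta(1_000, 1_500)

print(ok)     # 500
print(bogus)  # 4294966796
```

A single wrapped delta like this, once added to a running total, is
enough to produce the kind of sudden jump that the testing looked for.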

Overview

PSI reports the overall wallclock time in which the tasks in a system
(or cgroup) wait for (contended) hardware resources.

This helps users understand the resource pressure their workloads are
under, which allows them to rootcause and fix throughput and latency
problems caused by overcommitting, underprovisioning, or suboptimal job
placement in a grid, as well as anticipate major disruptions like OOM.

Real-world applications

We're using the data collected by PSI (and its previous incarnation,
memdelay) quite extensively at Facebook, and with several success
stories.

One usecase is avoiding OOM hangs/livelocks. These happen because the
OOM killer is triggered by reclaim not being able to free pages, but
with fast flash devices there is *always* some clean
and uptodate cache to reclaim; the OOM killer never kicks in, even as
tasks spend 90% of the time thrashing the cache pages of their own
executables. There is no situation where this ever makes sense in
practice. We wrote a <100 line POC python script to monitor memory
pressure and kill stuff way before such pathological thrashing leads
to full system losses that would require forcible hard resets.

We've since extended and deployed this code into other places to
guarantee latency and throughput SLAs, since they're usually violated
way before the kernel OOM killer would ever kick in.

It is available here: https://github.com/facebookincubator/oomd

Eventually we probably want to trigger the in-kernel OOM killer based
on extreme sustained pressure as well, so that Linux can avoid memory
livelocks - which technically aren't deadlocks, but are to the user
indistinguishable from them - out of the box. We'd continue using OOMD
as the first line of defense to ensure workload health and implement
complex kill policies that are beyond the scope of the kernel.

We also use PSI memory pressure for loadshedding. Our batch job
infrastructure used to use heuristics based on various VM stats to
anticipate OOM situations, with lackluster success. We switched it to
PSI and managed to anticipate and avoid OOM kills and lockups fairly
reliably. The reduction of OOM outages in the worker pool raised the
pool's aggregate productivity, and we were able to switch that service
to smaller machines.

Lastly, we use cgroups to isolate a machine's main workload from
maintenance crap like package upgrades, logging, and configuration, as
well as to prevent multiple workloads on a machine from stepping on
each other's toes. We were not able to configure this properly without
the pressure metrics; we would see latency or bandwidth drops, but it
would often be hard to impossible to rootcause them post-mortem.

We now log and graph pressure for the containers in our fleet and can
trivially link latency spikes and throughput drops to shortages of
specific resources after the fact, and fix the job config/scheduling.

PSI has also received testing, feedback, and feature requests from
Android and EndlessOS for the purpose of low-latency OOM killing, to
intervene in pressure situations before the UI starts hanging.

How do you use this feature?

A kernel with CONFIG_PSI=y will create a /proc/pressure directory with
3 files: cpu, memory, and io. If using cgroup2, cgroups will also have
cpu.pressure, memory.pressure and io.pressure files, which simply
aggregate task stalls at the cgroup level instead of system-wide.

The cpu file contains one line:

some avg10=2.04 avg60=0.75 avg300=0.40 total=157656722

The averages give the percentage of walltime in which one or more
tasks are delayed on the runqueue while another task has the
CPU. They're recent averages over 10s, 1m, 5m windows, so you can tell
short term trends from long term ones, similarly to the load average.
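
For illustration only, here is a minimal Python sketch of how a
userspace tool might split such a line into its fields. It is not part
of the series; the helper name parse_psi_line is made up for this
example, and it assumes every field after the leading keyword is a
key=value pair, as in the sample line above.

```python
def parse_psi_line(line):
    """Parse one pressure line into (kind, fields).

    Hypothetical helper for illustration; assumes the layout shown
    above: a leading keyword followed by key=value pairs.
    """
    kind, *pairs = line.split()
    fields = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        # total= is an integer counter; the avg* fields are percentages.
        fields[key] = int(value) if key == "total" else float(value)
    return kind, fields


if __name__ == "__main__":
    sample = "some avg10=2.04 avg60=0.75 avg300=0.40 total=157656722"
    kind, fields = parse_psi_line(sample)
    print(kind, fields["avg10"], fields["total"])
```

In practice the input would be the contents of /proc/pressure/cpu (or
a cgroup's cpu.pressure file), read one line at a time.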

The total= value gives the absolute stall time in microseconds. This
allows detecting latency spikes that might be too short to sway the
running averages. It also allows custom time averaging in case the
10s/1m/5m windows aren't adequate for the usecase (or are too coarse
with future hardware).
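
As a sketch of what such custom averaging could look like: take two
readings of total= some known interval apart and divide the growth by
the elapsed time. The function name stall_percent and its parameters
are made up for this example; the only thing taken from the text above
is that total= counts stall time in microseconds.

```python
def stall_percent(total_before_us, total_after_us, interval_s):
    """Percentage of the interval spent stalled.

    Hypothetical helper: the two totals are total= readings in
    microseconds, taken interval_s seconds apart.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta_us = total_after_us - total_before_us
    # Convert the interval to microseconds so the units match.
    return 100.0 * delta_us / (interval_s * 1_000_000)


if __name__ == "__main__":
    # total= grew by 50000us over a 2s window of the caller's choosing.
    print(stall_percent(157656722, 157706722, 2))
```

With readings taken 2 seconds apart and total= growing by 50000us, this
gives 2.5, i.e. some task was stalled for 2.5% of that window. The
window length is entirely up to the caller, which is the point of
exporting the raw counter.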

What to make of this "some" metric? If CPU utilization is at 100% and
CPU pressure is 0, it means the system is perfectly utilized, with one
runnable thread per CPU and nobody waiting. At two or more runnable
tasks per CPU, the system is 100% overcommitted and the pressure
average will indicate as much. From a utilization perspective this is
a great state of course: no CPU cycles are being wasted, even if 50%
of the threads were to go idle (as most workloads do vary). From the
perspective of the individual job it's not great, however, and they
would do better with more resources. Depending on 
