Re: [PATCH v2] perf script python: integrate page reclaim analyze script

2019-10-01 Thread Yafang Shao
On Tue, Oct 1, 2019 at 10:45 PM Mel Gorman  wrote:
>
> On Mon, Sep 30, 2019 at 11:19:44PM -0400, Yafang Shao wrote:
> > A new perf script page-reclaim is introduced in this patch. This new script
> > is used to report the page reclaim details. The possible usage of this
> > script is as bellow,
> > - identify latency spike caused by direct reclaim
> > - whehter the latency spike is relevant with pageout
> > - why is page reclaim requested, i.e. whether it is because of memory
> >   fragmentation
> > - page reclaim efficiency
> > etc
> > In the future we may also enhance it to analyze the memcg reclaim.
> >
>
> Hi,
>
> I ended up not reviewing this patch in detail simply because I would
> approach the same class of problem in an entirely different way today.
> There is value in accumulating the stats in a report like this;
>
> > $ perf script report page-reclaim
> > Direct reclaims: 4924
> > Direct latency (ms)total max avg min
> > 177823.2116378.977  36.114   0.051
> > Direct file reclaimed 22920
> > Direct file scanned 28306
> > Direct file sync write I/O 0
> > Direct file async write I/O 0
> > Direct anon reclaimed 212567
> > Direct anon scanned 1446854
> > Direct anon sync write I/O 0
> > Direct anon async write I/O 278325
> > Direct order  0 1 3
> >  48702331
> > Wake kswapd requests 716
> > Wake order  0 1
> > 715 1
> >
> > Kswapd reclaims: 9
>
> However, the basic option I would prefer is having the raw latency
> information for Direct latency that can be externally parsed by R or any
> other statistical method. The reason why is because knowing the max latency
> is not enough, I'd want to know the spread of latencies and whether they
> were clustered at a point of time or spread out over long periods of
> time. I would then build the higher-level reports on top if necessary.
>
> Today, I would also have considered getting the latency figures using eBPF
> or systemtap instead although having perf do it may be useful too. That's
> not universally popular though so at minimum I would have;
>

eBPF requires newer kernel, while there're still lots of servers
running with old kernels.
The systemtap is not convenient as it requires many debug packages,
and it is still not stable enough to run on the product environment,
for example, the systemtap deamon may exit without uninstalling the
systemtap kernel module.

> perf script record page-reclaim -- capture all page-reclaim tracepoints
> perf script report page-reclaim -- For reclaim entry/exit, merge the two
> tracepoints into one that reports latency. Dump the rest out
> verbatim
>
> For latencies, I would externally post-process them until such time as I
> found a common class of bug that needed a high-level report and then
> build the perf script support for it.
>

This seem like a good suggestion.
I will try to think about it.

> Please note that I did not spot anything wrong with your script, it's
> just that I would not use it myself in its current format for debugging
> a reclaim-related problem.
>
> --
> Mel Gorman
> SUSE Labs


Re: [PATCH v2] perf script python: integrate page reclaim analyze script

2019-10-01 Thread Mel Gorman
On Mon, Sep 30, 2019 at 11:19:44PM -0400, Yafang Shao wrote:
> A new perf script page-reclaim is introduced in this patch. This new script
> is used to report the page reclaim details. The possible usage of this
> script is as bellow,
> - identify latency spike caused by direct reclaim
> - whehter the latency spike is relevant with pageout
> - why is page reclaim requested, i.e. whether it is because of memory
>   fragmentation
> - page reclaim efficiency
> etc
> In the future we may also enhance it to analyze the memcg reclaim.
> 

Hi,

I ended up not reviewing this patch in detail simply because I would
approach the same class of problem in an entirely different way today.
There is value in accumulating the stats in a report like this;

> $ perf script report page-reclaim
> Direct reclaims: 4924
> Direct latency (ms)total max avg min
> 177823.2116378.977  36.114   0.051
> Direct file reclaimed 22920
> Direct file scanned 28306
> Direct file sync write I/O 0
> Direct file async write I/O 0
> Direct anon reclaimed 212567
> Direct anon scanned 1446854
> Direct anon sync write I/O 0
> Direct anon async write I/O 278325
> Direct order  0 1 3
>  48702331
> Wake kswapd requests 716
> Wake order  0 1
> 715 1
> 
> Kswapd reclaims: 9

However, the basic option I would prefer is having the raw latency
information for Direct latency that can be externally parsed by R or any
other statistical method. The reason why is because knowing the max latency
is not enough, I'd want to know the spread of latencies and whether they
were clustered at a point of time or spread out over long periods of
time. I would then build the higher-level reports on top if necessary.

Today, I would also have considered getting the latency figures using eBPF
or systemtap instead although having perf do it may be useful too. That's
not universally popular though so at minimum I would have;

perf script record page-reclaim -- capture all page-reclaim tracepoints
perf script report page-reclaim -- For reclaim entry/exit, merge the two
tracepoints into one that reports latency. Dump the rest out
verbatim

For latencies, I would externally post-process them until such time as I
found a common class of bug that needed a high-level report and then
build the perf script support for it.

Please note that I did not spot anything wrong with your script, it's
just that I would not use it myself in its current format for debugging
a reclaim-related problem.

-- 
Mel Gorman
SUSE Labs


[PATCH v2] perf script python: integrate page reclaim analyze script

2019-09-30 Thread Yafang Shao
A new perf script page-reclaim is introduced in this patch. This new script
is used to report the page reclaim details. The possible usage of this
script is as bellow,
- identify latency spike caused by direct reclaim
- whehter the latency spike is relevant with pageout
- why is page reclaim requested, i.e. whether it is because of memory
  fragmentation
- page reclaim efficiency
etc
In the future we may also enhance it to analyze the memcg reclaim.

Bellow is how to use this script,
# Record, one of the following
$ perf record -e 'vmscan:mm_vmscan_*' ./workload
$ perf script record page-reclaim

# Report
$ perf script report page-reclaim

# Report per process latency
$ perf script report page-reclaim -- -p

# Report per process latency details. At what time and how long it
# stalls at each time.
$ perf script report page-reclaim -- -v

An example of doing mmtests,
$ perf script report page-reclaim
Direct reclaims: 4924
Direct latency (ms)total max avg min
  177823.2116378.977  36.114   0.051
Direct file reclaimed 22920
Direct file scanned 28306
Direct file sync write I/O 0
Direct file async write I/O 0
Direct anon reclaimed 212567
Direct anon scanned 1446854
Direct anon sync write I/O 0
Direct anon async write I/O 278325
Direct order  0 1 3
   48702331
Wake kswapd requests 716
Wake order  0 1
  715 1

Kswapd reclaims: 9
Kswapd latency (ms)total max avg min
   86353.046   42128.8169594.783 120.736
Kswapd file reclaimed 366461
Kswapd file scanned 369554
Kswapd file sync write I/O 0
Kswapd file async write I/O 0
Kswapd anon reclaimed 362594
Kswapd anon scanned 693938
Kswapd anon sync write I/O 0
Kswapd anon async write I/O 330663
Kswapd order  0 1 3
  3 1 5
Kswapd re-wakes 705

$ perf script report page-reclaim -- -p
# besides the above basic output, it will also summary per task
# latency
Per process latency (ms):
 pid[comm] total max avg min
   1[systemd]276.764 248.933   21.29   0.293
 163[kswapd0]  86353.046   42128.8169594.783 120.736
7241[bash] 12787.749 859.091  94.028   0.163
1592[master]  81.604  70.811  27.201   2.906
1595[pickup] 496.162 374.168 165.387  14.478
1098[auditd]   19.32   19.32   19.32   19.32
1120[irqbalance]5232.3311386.352 158.555   0.169
7236[usemem]79649.041763.281  24.921   0.051
1605[sshd]   1344.41 645.125  34.4720.16
7238[bash]   1158.921023.307 231.784   0.067
7239[bash] 15100.776 993.447  82.069   0.145
...

$ per script report page-reclaim -- -v
# Besides the basic output, it will asl show per task latency details
Per process latency (ms):
 pid[comm] total max avg min
   timestamp  latency(ns)
   1[systemd]276.764 248.933   21.29   0.293
   3406860552338: 16819800
   3406877381650: 5532855
   3407458799399: 929517
   3407459796042: 916682
   3407460763220: 418989
   3407461250236: 332355
   3407461637534: 401731
   3407462092234: 449219
   3407462605855: 292857
   3407462952343: 372700
   3407463364947: 414880
   3407463829547: 949162
   3407464813883: 248933444
 163[kswapd0]  86353.046   42128.8169594.783 120.736
   3357637025977: 1026962745
   3358915619888: 41268642175
   3400239664127: 42128816204
   3443784780373: 679641989
   3444847948969: 120735792
   3445001978784: 342713657
   3445835850664: 316851589
   3446865035476: 247457873
   3449355401352: 221223878
  ...

This script must be in sync with bellow vmscan tracepoints,
mm_vmscan_direct_reclaim_begin
mm_vmscan_direct_reclaim_end
mm_vmscan_kswapd_wake
mm_vmscan_kswapd_sleep
mm_vmscan_wakeup_kswapd
mm_vmscan_lru_shrink_inactive
mm_vmscan_writepage

Signed-off-by: Yafang Shao 
Cc: Tony Jones 
Cc: Mel Gorman 
---
v1->v2: verified it with python2.7 and python3.6.
add vmscan tracepoints comments in this script

---
 tools/perf/scripts/python/bin/page-reclaim-record |   2 +
 tools/perf/scripts/python/bin/page-reclaim-report |   4 +
 tools/perf/scripts/python/page-reclaim.py | 393 ++
 3 files changed,