[ 
https://issues.apache.org/jira/browse/HBASE-21926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16791062#comment-16791062
 ] 

Andrew Purtell commented on HBASE-21926:
----------------------------------------

Text to be copied into release notes:

This change introduces a new servlet that runs async-profiler.

Go to [https://github.com/jvm-profiling-tools/async-profiler], download a 
release appropriate for your platform, and install it on every cluster host. 
Set ASYNC_PROFILER_HOME in the environment (for example, in hbase-env.sh) to 
the async-profiler install location, or pass the location on the HBase 
daemon's command line as {{-Dasync.profiler.home=/path/to/async-profiler}}.
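
The two configuration options above might look like the following in hbase-env.sh. This is only a sketch: /opt/async-profiler is a placeholder path, and the commented-out alternative assumes the usual HBASE_OPTS variable is how extra JVM options reach the daemon in your deployment.

```shell
# Sketch of additions to conf/hbase-env.sh.
# /opt/async-profiler is a placeholder (assumption); use your real install location.

# Option 1: point the environment variable at the async-profiler install.
export ASYNC_PROFILER_HOME=/opt/async-profiler

# Option 2 (alternative): pass the location as a JVM system property instead.
# export HBASE_OPTS="${HBASE_OPTS:-} -Dasync.profiler.home=/opt/async-profiler"
```

Only one of the two options should be needed; the daemon has to be restarted to pick up either.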

Once this is done, async-profiler is available via the HBase UI or by direct 
interaction with the infoserver. Examples:
 - To collect a 30-second CPU profile of the current process (returns a 
FlameGraph SVG):
 {{curl "http://localhost:16030/prof"}}
 - To collect a 1-minute CPU profile of the current process and output it in 
tree format (HTML):
 {{curl "http://localhost:16030/prof?output=tree&duration=60"}}
 - To collect a 30-second heap allocation profile of the current process 
(returns a FlameGraph SVG):
 {{curl "http://localhost:16030/prof?event=alloc"}}
 - To collect a lock contention profile of the current process (returns a 
FlameGraph SVG):
 {{curl "http://localhost:16030/prof?event=lock"}}
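
For scripted use, the examples above can be wrapped in a small helper. The sketch below is illustrative only: the function names prof_url and fetch_profile are hypothetical, it uses only the query parameters shown above (event, duration, output), and it assumes those parameters can be freely combined.

```shell
# Build a /prof URL from optional query parameters.
# Usage: prof_url HOST_PORT [event] [duration_seconds] [output]
# Empty or missing arguments are left out so the servlet defaults apply.
prof_url() {
  base="http://$1/prof"
  query=""
  if [ -n "${2:-}" ]; then query="${query}&event=$2"; fi
  if [ -n "${3:-}" ]; then query="${query}&duration=$3"; fi
  if [ -n "${4:-}" ]; then query="${query}&output=$4"; fi
  if [ -n "$query" ]; then
    # Drop the leading '&' and attach the query string.
    echo "${base}?${query#&}"
  else
    echo "$base"
  fi
}

# Save a profile to a file. Requires a reachable infoserver.
# Usage: fetch_profile OUT_FILE HOST_PORT [event] [duration_seconds] [output]
fetch_profile() {
  out_file="$1"
  shift
  curl -sS -o "$out_file" "$(prof_url "$@")"
}

# Example (not run here): 60-second allocation profile saved as alloc.svg
# fetch_profile alloc.svg localhost:16030 alloc 60
```

Writing the response to a file with {{-o}} keeps the SVG or HTML out of the terminal so it can be opened in a browser afterwards.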

The following event types are supported by async-profiler. Default is 'cpu'. 
Not all operating systems will support all types.

Perf events:
 * cpu
 * page-faults
 * context-switches
 * cycles
 * instructions
 * cache-references
 * cache-misses
 * branches
 * branch-misses
 * bus-cycles
 * L1-dcache-load-misses
 * LLC-load-misses
 * dTLB-load-misses

Java events:
 * alloc
 * lock
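
A script that takes the event name from a user could check it against the lists above before issuing a request. This is a minimal sketch; is_supported_event is a hypothetical helper, and it only checks the name, not whether the host operating system actually supports the event.

```shell
# Return 0 if the argument is one of the event names listed above, 1 otherwise.
is_supported_event() {
  case "$1" in
    cpu|page-faults|context-switches|cycles|instructions|cache-references|cache-misses|branches|branch-misses|bus-cycles|L1-dcache-load-misses|LLC-load-misses|dTLB-load-misses)
      return 0 ;;   # perf events
    alloc|lock)
      return 0 ;;   # Java events
    *)
      return 1 ;;   # anything else is rejected
  esac
}

# Example (needs a running infoserver):
# event=cache-misses
# is_supported_event "$event" && curl -o profile.svg "http://localhost:16030/prof?event=$event"
```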

NOTE: The additional query parameter 'pid' can be used to specify the PID of 
the process to be profiled. If this parameter is missing, the local process 
in which the infoserver is embedded is profiled. Profile targets that are not 
JVMs might work but are not specifically supported. There are security 
implications, so access to the infoserver should be appropriately restricted.
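
As an illustration of the 'pid' parameter, the sketch below builds a request URL for another process on the same host. The helper name prof_url_for_pid and the PID 12345 are made up for the example.

```shell
# Build a /prof URL that targets a specific process by PID.
# Usage: prof_url_for_pid HOST_PORT PID
prof_url_for_pid() {
  case "$2" in
    ''|*[!0-9]*)
      # Reject empty or non-numeric input before it reaches the URL.
      echo "PID must be numeric: '$2'" >&2
      return 1 ;;
  esac
  echo "http://$1/prof?pid=$2"
}

# Example (needs a running infoserver; 12345 is a made-up PID):
# curl -o other-process.svg "$(prof_url_for_pid localhost:16030 12345)"
```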

> Profiler servlet
> ----------------
>
>                 Key: HBASE-21926
>                 URL: https://issues.apache.org/jira/browse/HBASE-21926
>             Project: HBase
>          Issue Type: New Feature
>          Components: master, Operability, regionserver
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>            Priority: Major
>             Fix For: 3.0.0, 1.6.0, 2.3.0
>
>
> HIVE-20202 describes how Hive added a web endpoint for online, in-production 
> profiling based on async-profiler. The endpoint was added as a servlet to 
> httpserver and supports retrieval of flamegraphs compiled from the profiler 
> trace. Async profiler 
> ([https://github.com/jvm-profiling-tools/async-profiler] ) can also profile 
> heap allocations, lock contention, and HW performance counters in addition to 
> CPU.
> The profiling overhead is low, and the profiler is safe to run in 
> production. The async-profiler project measured the CPU and memory overheads 
> and describes them in these issues: 
> [https://github.com/jvm-profiling-tools/async-profiler/issues/14] and 
> [https://github.com/jvm-profiling-tools/async-profiler/issues/131] 
> We have an httpserver-based servlet stack, so we can use HIVE-20202 as an 
> implementation template for a similar feature for HBase daemons. Ideally we 
> achieve these requirements:
>  * Retrieve flamegraph SVG generated from latest profile trace.
>  * Online enable and disable of profiling activity. (async-profiler does not 
> do instrumentation-based profiling, so this should not cause the code 
> generation related performance problems of that approach, and profiling can 
> be safely toggled on and off while under production load.)
>  * CPU profiling.
>  * ALLOCATION profiling.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
