Thanks Yun and Yu for driving this proposal!

It's very useful for troubleshooting why the CPU usage is high.
+1

Best,
Rui

On Mon, Oct 9, 2023 at 7:21 PM Zhanghao Chen <zhanghao.c...@outlook.com>
wrote:

> Hi Yun and Yu,
>
> Thanks for driving this. This would definitely help users identify
> performance bottlenecks, especially for the cases where the bottleneck lies
> in the system stack (e.g. GC), and big +1 for the downloadable flamegraph
> to ease sharing. I'm wondering if we could add this for the job manager as
> well. In the OLAP scenario, and sometimes in the streaming scenario (when
> there are heavy operations during execution plan generation or in
> operator coordinators), the JM can become a bottleneck as well.
>
> Best,
> Zhanghao Chen
> ________________________________
> From: Yu Chen <yuchen.e...@gmail.com>
> Sent: Monday, October 9, 2023 17:24
> To: dev@flink.apache.org <dev@flink.apache.org>
> Subject: [DISCUSS] FLIP-375: Built-in cross-platform powerful java
> profiler on taskmanagers
>
> Hi all,
>
> Yun Tang and I are opening this thread to discuss our proposal to integrate
> async-profiler's capabilities for profiling TaskManagers (e.g., generating
> flame graphs) into the Flink Web UI [1].
>
>
> Currently, Flink provides ThreadDump and Operator-Level Flame Graphs by
> sampling task threads. Results generated this way are missing the
> relevant stacks of Java threads and system calls. async-profiler [2] is a
> low-overhead sampling profiler for Java, but the steps to use it in a
> production environment are cumbersome and raise permission and
> security concerns.
>
> Therefore, we propose adding REST APIs that invoke async-profiler on
> multiple platforms through JNI and can be easily operated from the
> Web UI. This enhancement will improve the efficiency and
> experience of Flink users in identifying performance bottlenecks.
>
>
>
> Please refer to the FLIP document for more details about the proposed
> design and implementation. We welcome any feedback and opinions on this
> proposal.
>
>
>
> [1] FLIP-375: Built-in cross-platform powerful java profiler on
> taskmanagers
> <https://cwiki.apache.org/confluence/display/FLINK/FLIP-375%3A+Built-in+cross-platform+powerful+java+profiler+on+taskmanagers>
>
> [2] GitHub - async-profiler/async-profiler: Sampling CPU and HEAP profiler
> for Java featuring AsyncGetCallTrace + perf_events
> <https://github.com/async-profiler/async-profiler>
>
>
>
> Best regards,
>
> Yun Tang and Yu Chen
>

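For context on the limitation described in the proposal above: sampling
threads from inside the JVM only yields Java frames, so native and kernel
stacks never show up in the result. The sketch below illustrates that kind
of in-JVM sampling with the standard Thread.getAllStackTraces() call and
prints the samples in the folded "stack count" form that flame graph tools
consume. It is not Flink's actual implementation; the class name
JavaStackSampler and its methods are hypothetical names chosen for
illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class JavaStackSampler {

    /** Converts one stack trace (leaf frame first) into folded form "root;...;leaf". */
    static String fold(StackTraceElement[] stack) {
        StringBuilder sb = new StringBuilder();
        for (int i = stack.length - 1; i >= 0; i--) {
            if (sb.length() > 0) {
                sb.append(';');
            }
            sb.append(stack[i].getClassName()).append('.').append(stack[i].getMethodName());
        }
        return sb.toString();
    }

    /** Samples all live threads up to `rounds` times and counts identical folded stacks. */
    static Map<String, Integer> sample(int rounds, long intervalMillis) {
        Map<String, Integer> counts = new HashMap<>();
        for (int r = 0; r < rounds; r++) {
            // Only Java frames are visible here; native and kernel frames are not reported.
            for (StackTraceElement[] stack : Thread.getAllStackTraces().values()) {
                if (stack.length > 0) {
                    counts.merge(fold(stack), 1, Integer::sum);
                }
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop sampling
                break;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Print one "folded-stack count" line per distinct stack.
        sample(20, 50).forEach((stack, n) -> System.out.println(stack + " " + n));
    }
}
```

A profiler such as async-profiler collects native and kernel frames as
well, which is the gap the proposal aims to close.
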