Re: [DISCUSS] FLIP-375: Built-in cross-platform powerful java profiler on taskmanagers

Rui Fan Mon, 16 Oct 2023 22:49:39 -0700

Thanks Yun for the update, LGTM.

Best,
Rui


On Mon, Oct 16, 2023 at 1:34 PM Yu Chen <yuchen.e...@gmail.com> wrote:

> Hi David.
>
> Thanks for your detailed comments.
> The async-profiler has a lot of features, but there are some requirements
> to use it (some cases encountered in production are listed below):
> 1. Prior to JDK 11, alloc mode requires HotSpot debug symbols, and for
> OpenJDK, you need to install them additionally.
> 2. CPU mode requires perf_events support.
> 3. etc (refer to async-profiler's troubleshooting section[1])
>
> However, the addition of modifying system kernel parameters may involve the
> action of invoking shell commands.
> For security considerations, as a workaround, we can display async-profiler
> error messages in the web UI when the task's runtime environment does not
> satisfy the corresponding mode, and the user can manually modify the
> runtime environment (e.g., modify kernel parameters) based on the error
> messages as a bypass solution.
>
> As you recommend, we will revise the FLIP as follows:
> 1. We expose the -e parameter as a dropdown box in the Flink WEB and make
> `itimer` mode the default option. This makes it easy for users to use the
> different profiling modes (wall clock/alloc/cpu/itemer).
> 2. We added a message column on Flink Web to show the call exception so
> that users can adjust the environment according to the exception
> information.
>
> [1] async-profiler/async-profiler: Sampling CPU and HEAP profiler for Java
> featuring AsyncGetCallTrace + perf_events (github.com)
> <https://github.com/async-profiler/async-profiler#troubleshooting>
>
> Best,
> Yu Chen
>
> David Christle <david.chris...@discordapp.com.invalid> 于2023年10月14日周六
> 04:12写道：
>
> > In the Wiki, this FLIP is motivated by:
> >
> > - That the current flamegraph functionality can only see operator-level
> > stack traces, while async-profiler provides CPU/allocation/locks
> > information, along with deeper Java & system call stack information.
> > - Low configurability (e.g. cannot set the sampling interval) + the
> > usability is limited to visual inspection of the flamegraph whenever it
> > happens to read out.
> >
> > The current built-in flamegraph is extremely valuable and easy to use.
> But
> > as a regular user of async-profiler for Flink applications, I agree these
> > deficiencies are worth improving. It's exciting that async-profiler might
> > be built-in, since it will make using it much easier.
> >
> > I'm a little confused, though, about the scope of functionality we'll
> have
> > with this FLIP, in particular the use of itimer only & perf_events
> support.
> > One of the questions near the end of the Wiki is whether sampling with
> > perf_events will be supported. The current answer seems to say that only
> > itimer mode will be supported, as this mode does not rely on perf_events
> > being enabled.
> >
> > However, given that an aim of the FLIP is to support configurability
> (e.g.
> > sampling intervals), is it that much more work to support configurability
> > of the event & the other common options, too? If the '-e' event flag is
> > fixed to itimer only, we can't use wall clock/alloc/cpu profiling modes.
> > The Wiki mentions async-profiler's JNI interface will be used, which has
> > 'event_str' as an input. So, it seems like supporting different event
> types
> > (or even multiple event types in one profile) is possible.
> >
> > Regarding perf_events, it's true that it's disabled in many
> > environments. But it is possible to enable it for debugging purposes. In
> > our Kubernetes workloads, this means adding SYS_PTRACE and SYS_ADMIN to
> the
> > securityContext, deploying the job, and then running:
> >
> > sysctl kernel.kptr_restrict=0
> > sysctl kernel.perf_event_paranoid=1
> >
> > before starting async-profiler.
> >
> > It would be nice if dynamically changing the kernel parameters was
> built-in
> > to this FLIP, somehow, as well, to set these parameters correctly before
> > profiling. If the environment restricts changing these, that's fine. We
> can
> > simply report to the user via the UI that setting them failed, and that
> the
> > choice of profiling configurations is limited without them. I also think
> > it's OK if `itimer` is the default in the UI, as it works under the
> > broadest conditions. But given the motivation in the Wiki is that
> > async-profiler can see detailed system call stack info, allocation, etc.,
> > and that the async-profiler docs describe itimer mode as a "fallback"
> > rather than the way the profiler is best used, it feels like this FLIP
> > should support async-profiler's regular modes of operation & the other
> > most-common configuration options. From my own experience, `cpu`
> (requiring
> > perf_events) is a bit more accurate than `itimer`, and if I recall, and
> > samples once per thread. `wall` is very useful to debug blocks on I/O or
> > locks. Getting per-thread information is nice to drill down into specific
> > parts of the Flink application, e.g. the flame graph lets me ignore the
> > many other tasks running on TM & drill down into just the Source threads,
> > when debugging a Source issue.
> >
> > Kind regards,
> > David
> >
> > On Fri, Oct 13, 2023 at 1:45 AM Rui Fan <1996fan...@gmail.com> wrote:
> >
> > > One minor comment:
> > >
> > > In general, the generic java profiler includes memory analysis,
> > > cpu, thread, deadlock, etc. The FLIP title is java profiler, but
> > > the FLIP just supports flamegraph at process level.
> > > So the `powerful java profiler` title may not be suitable.
> > > Would you mind updating the FLIP title?
> > >
> > > Best,
> > > Rui
> > >
> > > On Fri, Oct 13, 2023 at 4:34 PM Yu Chen <yuchen.e...@gmail.com> wrote:
> > >
> > > > Hi all.
> > > > If there are no further questions, we will start a vote on FLIP-375
> > next
> > > > week.
> > > >
> > > > Best regards,
> > > > Yu Chen
> > > >
> > > >
> > > > Yu Chen <yuchen.e...@gmail.com> 于2023年10月9日周一 17:24写道：
> > > >
> > > > > Hi all,
> > > > >
> > > > > Yun Tang and I are opening this thread to discuss our proposal to
> > > > > integrate async-profiler's capabilities for profiling taskmananger
> > > (e.g.,
> > > > > generating flame graphs) in the Flink Web [1].
> > > > >
> > > > >
> > > > > Currently, Flink provides ThreadDump and Operator-Level Flame
> Graphs
> > by
> > > > > sampling task threads. The results generated in such way missing
> the
> > > > > relevant stack of java threads and system calls. The
> > async-profiler[2]
> > > > is a
> > > > > low-overhead sampling profiler for Java, but the steps to use it in
> > the
> > > > > production environment are cumbersome and suffer from permissions
> and
> > > > > security risks.
> > > > >
> > > > > Therefore, we propose adding rest APIs to provide the capability to
> > > > invoke
> > > > > async-profiler on multiple platforms through JNI, which can be
> easily
> > > > > operated on Web UI. This enhancement will improve the efficiency
> and
> > > > > experience of Flink users in identifying performance bottlenecks.
> > > > >
> > > > >
> > > > >
> > > > > Please refer to the FLIP document for more details about the
> proposed
> > > > design
> > > > > and implementation. We welcome any feedback and opinions on this
> > > > proposal.
> > > > >
> > > > >
> > > > >
> > > > > [1] FLIP-375: Built-in cross-platform powerful java profiler on
> > > > > taskmanagers - Apache Flink - Apache Software Foundation
> > > > > <
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-375%3A+Built-in+cross-platform+powerful+java+profiler+on+taskmanagers
> > > > >
> > > > >
> > > > > [2] GitHub - async-profiler/async-profiler: Sampling CPU and HEAP
> > > > > profiler for Java featuring AsyncGetCallTrace + perf_events
> > > > > <https://github.com/async-profiler/async-profiler>
> > > > >
> > > > >
> > > > >
> > > > > Best regards,
> > > > >
> > > > > Yun Tang and Yu Chen
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] FLIP-375: Built-in cross-platform powerful java profiler on taskmanagers

Reply via email to