Vladimir Matveev created FLINK-32137:
----------------------------------------
Summary: Flame graph is hard to use with many task managers
Key: FLINK-32137
URL: https://issues.apache.org/jira/browse/FLINK-32137
Project: Flink
Issue Type: Bug
Components: Runtime / Web Frontend
Affects Versions: 1.16.1
Reporter: Vladimir Matveev
Attachments: image (1).png
When many task managers execute the same operator, the flame graph becomes
very hard to use. As the attached picture shows, it treats instances of the
same lambda function as different classes, and the number of such classes
appears to equal the number of task managers (i.e. each JVM generates its own
"class" name for the lambda, which I guess is expected). This lambda function
sits deep within Flink's own call stack, so the graph looks like this
regardless of the job's own logic, and there is nothing we can do at the job
level to fix it.
This behavior makes the flame graph very hard to evaluate: all of the useful
information gets "compressed" inside each narrow "column" of the graph, while
the split into columns itself carries no useful information, since it is just
an artifact of how the JVM generates lambda class names.
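
To illustrate the effect, here is a minimal Java sketch of how such frames
could be merged by stripping the JVM-generated part of the lambda class name
before counting samples. This is only an assumption about one possible
approach, not how Flink builds its flame graph. The class LambdaFrameNames,
its methods, and the org.example.Runner frame names are hypothetical, and the
pattern assumes HotSpot-style names such as "$$Lambda$101/0x0000000800c03000",
"$$Lambda$101/1234567" or "$$Lambda/0x0000000800c03000".

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Flink.
final class LambdaFrameNames {

    // Assumed HotSpot naming: "$$Lambda", an optional "$<counter>",
    // then "/" followed by a hex address or a decimal hash.
    private static final Pattern LAMBDA_SUFFIX =
            Pattern.compile("\\$\\$Lambda(?:\\$\\d+)?/(?:0x[0-9a-fA-F]+|\\d+)");

    // Replaces the per-JVM generated suffix with a stable "$$Lambda" marker.
    static String normalize(String frame) {
        return LAMBDA_SUFFIX.matcher(frame)
                .replaceAll(Matcher.quoteReplacement("$$Lambda"));
    }

    // Counts samples per normalized frame name, keeping first-seen order.
    static Map<String, Integer> countByFrame(List<String> frames) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String frame : frames) {
            counts.merge(normalize(frame), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up frames, as three different JVMs might report the same lambda.
        List<String> frames = List.of(
                "org.example.Runner$$Lambda$101/0x0000000800c03000.run",
                "org.example.Runner$$Lambda$87/0x0000000800b41c40.run",
                "org.example.Runner$$Lambda$93/1234567.run");
        System.out.println(countByFrame(frames));
    }
}
```

With names normalized this way, the three frames above fall into a single
entry instead of one "column" per task manager.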
--
This message was sent by Atlassian Jira
(v8.20.10#820010)