Dear devs,

Currently, Flink's log output does not explicitly distinguish between
framework logs and user logs. In the Task Manager, framework logs are
intermixed with the user's business logs. In some deployment models, such
as Standalone or YARN session, task instances of different jobs are
deployed in the same Task Manager. This makes the log event flow even more
confusing unless users explicitly tag their log messages, and it makes
locating problems more difficult and inefficient. For the YARN job cluster
deployment model this problem is less serious, but we still need to
manually distinguish between framework and business logs. Overall, we
found that Flink's existing log model has the following problems:


   - Framework logs and business logs are mixed in the same log file. There
     is no way to clearly distinguish them, which hinders locating and
     analyzing problems;
   - It is not conducive to the independent collection of business logs.


Therefore, we propose a mechanism to separate framework logs from business
logs. It can split the existing Task Manager log files.

Currently, it is associated with two JIRA issues:

   - FLINK-11202[1]: Split log file per job
   - FLINK-11782[2]: Enhance TaskManager log visualization by listing all log
     files for Flink web UI


We have implemented and validated it in Standalone and Flink on YARN (job
cluster) modes.

Sketch 1:

[image: flink-web-ui-taskmanager-log-files.png]

Sketch 2:

[image: flink-web-ui-taskmanager-log-files-2.png]

Design documentation:
https://docs.google.com/document/d/1TTYAtFoTWaGCveKDZH394FYdRyNyQFnVoW5AYFvnr5I/edit?usp=sharing

Best,
Vino

[1]: https://issues.apache.org/jira/browse/FLINK-11202
[2]: https://issues.apache.org/jira/browse/FLINK-11782
