[ 
https://issues.apache.org/jira/browse/FLINK-14814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976364#comment-16976364
 ] 

Piotr Nowojski commented on FLINK-14814:
----------------------------------------

Having multiple output edges is, I think, not that common, and even then one can 
deduce the state from the combined output usage, because buffers are rarely in 
states other than "mostly empty" and "mostly full". A value of {{outputUsage}} 
hovering around 50% means one output is full and the other is empty. Because of 
that, I wouldn't worry about it too much, at least not in the first version.
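
To illustrate the reasoning above, here is a minimal sketch. It assumes the 
combined {{outputUsage}} is the plain average of the per-output usages and that 
every output is either close to 0% or close to 100%. The class and method names 
are hypothetical and not part of Flink.

```java
// Hypothetical helper, not part of Flink. Assumes the combined usage is the
// average of per-output usages and each output is "mostly empty" or "mostly full".
final class OutputUsageEstimate {

    /** Estimates how many outputs are full, given a combined usage in [0, 1]. */
    static int estimateFullOutputs(double combinedUsage, int numOutputs) {
        if (numOutputs <= 0) {
            throw new IllegalArgumentException("numOutputs must be positive");
        }
        // The negated form also rejects NaN.
        if (!(combinedUsage >= 0.0 && combinedUsage <= 1.0)) {
            throw new IllegalArgumentException("combinedUsage must be in [0, 1]");
        }
        // With each output near 0 or 1, the average times the output count
        // is close to the number of full outputs.
        return (int) Math.round(combinedUsage * numOutputs);
    }

    public static void main(String[] args) {
        // Two outputs, combined usage around 50%: one output is full.
        System.out.println(estimateFullOutputs(0.5, 2));
    }
}
```

The estimate only says how many outputs are full, not which ones, which is 
the information lost by looking at the combined value.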

I think the bigger problem is that your screenshot displays the tasks, not 
individual subtasks/parallel instances. This raises a few questions:
# do we want to present non-aggregated metrics for subtasks?
# do we want to present aggregated metrics for the tasks? ...
# ... if so, how do we aggregate the metrics (and who should be doing that)?

1. would be easier to do and significantly more detailed and fine-grained, but 
less user-friendly and more difficult to use.
2. would lose some information in exchange for simpler usage.

(we might want to do both, or one first and the other later)

3. we would have to decide how to aggregate the individual values. For example, 
if a single subtask is back-pressured, do we report that the whole task is 
back-pressured? For pool usage, should we average the values? Take the max? As 
for who should be doing that: it shouldn't be the UI, so we would need one more 
metric-related ticket to actually come up with an idea of how to aggregate the 
metrics.
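
The candidate aggregations mentioned above can be sketched as follows. This is 
only an illustration of the options (any-subtask for back-pressure, average and 
max for pool usage), not a proposal for where the logic should live. The class 
and method names are hypothetical and not part of Flink.

```java
import java.util.Arrays;

// Hypothetical sketch of per-task aggregation over subtask values; not Flink API.
final class SubtaskMetricAggregation {

    /** Reports the task as back-pressured if at least one subtask is. */
    static boolean anyBackPressured(boolean[] subtaskBackPressured) {
        for (boolean backPressured : subtaskBackPressured) {
            if (backPressured) {
                return true;
            }
        }
        return false;
    }

    /** Average pool usage across subtasks; 0.0 if there are no subtasks. */
    static double averageUsage(double[] subtaskPoolUsage) {
        return Arrays.stream(subtaskPoolUsage).average().orElse(0.0);
    }

    /** Highest pool usage across subtasks; 0.0 if there are no subtasks. */
    static double maxUsage(double[] subtaskPoolUsage) {
        return Arrays.stream(subtaskPoolUsage).max().orElse(0.0);
    }

    public static void main(String[] args) {
        // Three idle subtasks and one with a nearly full pool.
        double[] usage = {0.1, 0.1, 0.1, 0.9};
        System.out.println("average = " + averageUsage(usage));
        System.out.println("max     = " + maxUsage(usage));
    }
}
```

With one nearly full subtask out of four, the average stays low while the max 
is high, so the choice of aggregation decides whether a single back-pressured 
subtask is visible at the task level.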

> Show the vertex that produces the backpressure source in the job
> ----------------------------------------------------------------
>
>                 Key: FLINK-14814
>                 URL: https://issues.apache.org/jira/browse/FLINK-14814
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Metrics, Runtime / Network, Runtime / REST, 
> Runtime / Web Frontend
>            Reporter: lining
>            Assignee: lining
>            Priority: Major
>         Attachments: 2B0E910D-6D95-401F-B450-1F6B1AFB9BEA.png
>
>
> By checking the status of output and input buffer pools exposed via 
> FLINK-14815 (output buffer empty, input buffer full) it is possible to 
> display which node is a source of the back pressure. This information could 
> be displayed/accessible in the Web Frontend.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
