[
https://issues.apache.org/jira/browse/FLINK-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384291#comment-15384291
]
ASF GitHub Bot commented on FLINK-4152:
---------------------------------------
Github user tillrohrmann commented on a diff in the pull request:
https://github.com/apache/flink/pull/2257#discussion_r71355929
--- Diff: flink-yarn/src/main/java/org/apache/flink/yarn/YarnFlinkResourceManager.java ---
@@ -78,6 +79,9 @@
/** The containers where a TaskManager is starting and we are waiting for it to register */
private final Map<ResourceID, YarnContainerInLaunch> containersInLaunch;
+ /** The container where a TaskManager has been started and is running in */
+ private final Map<ResourceID, Container> containersLaunched;
--- End diff --
I never said that it's yours, but you've been much more involved in the
design of this component than I was. Thus, I assumed that you know better than
I do what the semantics of the `registeredWorkers` field are. For example, does
this field represent the TMs currently registered at a JM, or does it
represent the TMs which were registered at the RM by the JM?
I didn't know and couldn't deduce it from the code. Consequently, I chose
the safest approach: leave the field as it is and introduce a new container
state in the `YarnFlinkResourceManager` implementation.
But if you think that the clearing is just a bug and that the corresponding
JavaDoc comment is outdated, then changing it might be a good option.
> TaskManager registration exponential backoff doesn't work
> ---------------------------------------------------------
>
> Key: FLINK-4152
> URL: https://issues.apache.org/jira/browse/FLINK-4152
> Project: Flink
> Issue Type: Bug
> Components: Distributed Coordination, TaskManager, YARN Client
> Reporter: Robert Metzger
> Assignee: Till Rohrmann
> Attachments: logs.tgz
>
>
> While testing Flink 1.1 I've found that the TaskManagers are logging many
> messages when registering at the JobManager.
> This is the log file:
> https://gist.github.com/rmetzger/0cebe0419cdef4507b1e8a42e33ef294
> It's logging more than 3000 messages in less than a minute. I don't think that
> this is the expected behavior.