[
https://issues.apache.org/jira/browse/FLINK-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384026#comment-15384026
]
ASF GitHub Bot commented on FLINK-4152:
---------------------------------------
Github user mxm commented on a diff in the pull request:
https://github.com/apache/flink/pull/2257#discussion_r71325133
--- Diff:
flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala
---
@@ -405,36 +374,13 @@ class JobManager(
currentResourceManager match {
case Some(rm) =>
- val future = (rm ? decorateMessage(new
RegisterResource(taskManager, msg)))(timeout)
- future.onComplete {
- case scala.util.Success(response) =>
- // the resource manager is available and answered
- self ! response
- case scala.util.Failure(t) =>
- t match {
- case _: TimeoutException =>
- log.info("Attempt to register resource at
ResourceManager timed out. Retrying")
- case _ =>
- log.warn("Failure while asking ResourceManager for
RegisterResource. Retrying", t)
- }
- // slow or unreachable resource manager, register anyway and
let the rm reconnect
- self ! decorateMessage(new
RegisterResourceSuccessful(taskManager, msg))
- self ! decorateMessage(new ReconnectResourceManager(rm))
- }(context.dispatcher)
-
+ log.info(s"Register task manager $resourceId at the resource
manager.")
+ rm ! decorateMessage(new RegisterResource(msg))
--- End diff --
I just don't understand why you remove this functionality. It was not
broken in any way. Of course, we can always add features later (that is true
for any component) but it changes the original RM design. If we want to add
monitoring of the pool size later on, we will have to re-add the proper
registration at the RM.
> TaskManager registration exponential backoff doesn't work
> ---------------------------------------------------------
>
> Key: FLINK-4152
> URL: https://issues.apache.org/jira/browse/FLINK-4152
> Project: Flink
> Issue Type: Bug
> Components: Distributed Coordination, TaskManager, YARN Client
> Reporter: Robert Metzger
> Assignee: Till Rohrmann
> Attachments: logs.tgz
>
>
> While testing Flink 1.1 I've found that the TaskManagers are logging many
> messages when registering at the JobManager.
> This is the log file:
> https://gist.github.com/rmetzger/0cebe0419cdef4507b1e8a42e33ef294
> Its logging more than 3000 messages in less than a minute. I don't think that
> this is the expected behavior.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)