Github user mxm commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2257#discussion_r71174483
  
    --- Diff: flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala ---
    @@ -405,36 +374,13 @@ class JobManager(
     
           currentResourceManager match {
             case Some(rm) =>
    -          val future = (rm ? decorateMessage(new RegisterResource(taskManager, msg)))(timeout)
    -          future.onComplete {
    -            case scala.util.Success(response) =>
    -              // the resource manager is available and answered
    -              self ! response
    -            case scala.util.Failure(t) =>
    -              t match {
    -                case _: TimeoutException =>
    -                  log.info("Attempt to register resource at ResourceManager timed out. Retrying")
    -                case _ =>
    -                  log.warn("Failure while asking ResourceManager for RegisterResource. Retrying", t)
    -              }
    -              // slow or unreachable resource manager, register anyway and let the rm reconnect
    -              self ! decorateMessage(new RegisterResourceSuccessful(taskManager, msg))
    -              self ! decorateMessage(new ReconnectResourceManager(rm))
    -          }(context.dispatcher)
    -
    +          log.info(s"Register task manager $resourceId at the resource manager.")
    +          rm ! decorateMessage(new RegisterResource(msg))
    --- End diff --
    
    
    
    If containers die, the ResourceManager will always be notified by YARN and can pass this information on to the JobManager. The advantage of ensuring that this message is delivered upon TaskManager registration is that the ResourceManager can actually guarantee resources. If, on the other hand, messages can be lost, the ResourceManager is just a tool for saying "give me more" or "give me less", with no actual guarantee about how much you will get.
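
To make the trade-off concrete, here is a minimal sketch of what "ensuring that this message gets delivered" could look like: the registration is resent until it is acknowledged. It is plain Scala without Akka, and every name in it (`RegistrationRetrySketch`, `Reply`, `Acknowledged`, `Channel`, `registerWithRetry`, `lossyChannel`) is hypothetical and not part of Flink or of this pull request.

```scala
// Hypothetical sketch only: none of these names exist in Flink.
object RegistrationRetrySketch {
  sealed trait Reply
  case object Acknowledged extends Reply

  // A channel that may lose messages: None models a lost message or missing ack.
  type Channel = String => Option[Reply]

  // Resend the registration until it is acknowledged.
  // Returns the attempt number that succeeded, or None if all attempts were lost.
  def registerWithRetry(resourceId: String, send: Channel, maxAttempts: Int): Option[Int] =
    (1 to maxAttempts).find(_ => send(resourceId).contains(Acknowledged))

  // Test helper: a channel that drops the first `losses` messages, then acknowledges.
  def lossyChannel(losses: Int): Channel = {
    var remaining = losses
    (_: String) =>
      if (remaining > 0) { remaining -= 1; None } else Some(Acknowledged)
  }
}
```

With a channel that drops the first two messages, `registerWithRetry("tm-1", lossyChannel(2), 5)` succeeds on the third attempt. A fire-and-forget send gives the sender no such signal, which is the "no actual guarantees" case described above.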


