[jira] [Commented] (FLINK-4152) TaskManager registration exponential backoff doesn't work

ASF GitHub Bot (JIRA) Mon, 18 Jul 2016 03:12:34 -0700

    [ 
https://issues.apache.org/jira/browse/FLINK-4152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382036#comment-15382036
 ]


ASF GitHub Bot commented on FLINK-4152:
---------------------------------------

Github user mxm commented on the issue:

    https://github.com/apache/flink/pull/2257
  
    Thank you for the pull request! Looking at the changes, I think it could 
have been split into two pull requests and JIRA issues: 1) avoiding duplicate 
RegisterTaskManager messages, and 2) changing the core behavior of the 
ResourceManager.
    
    Concerning 2, I would like to understand why it was necessary to change so 
much code. It seems like it would have sufficed to change one line of code (not 
clearing the bookkeeping on leadership change). I'm not saying your changes 
don't make sense but I don't think they are backed by the original JIRA issue.
    
    I'm not sure about the role change of the RM in this PR. The RM should be 
the authority for allocating new resources. If those resources are not properly 
reported back to the RM (e.g. message loss), the resource allocation won't work 
properly. 



> TaskManager registration exponential backoff doesn't work
> ---------------------------------------------------------
>
>                 Key: FLINK-4152
>                 URL: https://issues.apache.org/jira/browse/FLINK-4152
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination, TaskManager, YARN Client
>            Reporter: Robert Metzger
>            Assignee: Till Rohrmann
>         Attachments: logs.tgz
>
>
> While testing Flink 1.1 I've found that the TaskManagers are logging many 
> messages when registering at the JobManager.
> This is the log file: 
> https://gist.github.com/rmetzger/0cebe0419cdef4507b1e8a42e33ef294
> It's logging more than 3000 messages in less than a minute. I don't think that 
> this is the expected behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
