[jira] [Commented] (FLINK-3927) TaskManager registration may fail if Yarn versions don't match
[ https://issues.apache.org/jira/browse/FLINK-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15296212#comment-15296212 ]

ASF GitHub Bot commented on FLINK-3927:
---

Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/2013

> TaskManager registration may fail if Yarn versions don't match
> --
>
> Key: FLINK-3927
> URL: https://issues.apache.org/jira/browse/FLINK-3927
> Project: Flink
> Issue Type: Bug
> Components: ResourceManager
> Affects Versions: 1.1.0
> Reporter: Maximilian Michels
> Assignee: Maximilian Michels
> Fix For: 1.1.0
>
>
> Flink's ResourceManager uses the Yarn container ids to identify connecting
> task managers. Yarn's stringified container id may not be consistent across
> different Hadoop versions, e.g. Hadoop 2.3.0 and Hadoop 2.7.1. The
> ResourceManager gets it from the Yarn reports while the TaskManager infers it
> from the Yarn environment variables. The ResourceManager may use Hadoop 2.3.0
> version while the cluster runs Hadoop 2.7.1.
> The solution is to pass the ID through a custom environment variable which is
> set by the ResourceManager before launching the TaskManager in the container.
> That way we will always use the Hadoop client's id generation method.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (FLINK-3927) TaskManager registration may fail if Yarn versions don't match
[ https://issues.apache.org/jira/browse/FLINK-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15296207#comment-15296207 ]

ASF GitHub Bot commented on FLINK-3927:
---

Github user rmetzger commented on the pull request:

    https://github.com/apache/flink/pull/2013#issuecomment-220944377

    +1 to merge
[jira] [Commented] (FLINK-3927) TaskManager registration may fail if Yarn versions don't match
[ https://issues.apache.org/jira/browse/FLINK-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291484#comment-15291484 ]

ASF GitHub Bot commented on FLINK-3927:
---

GitHub user mxm opened a pull request:

    https://github.com/apache/flink/pull/2013

    [FLINK-3927][yarn] make container id consistent across Hadoop versions

    Fixes a bug where the container id generation would vary across Hadoop
    versions of the client/cluster. The ResourceManager assumes a persistent
    resource id.

    Based on #2012, to re-enable the Yarn tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mxm/flink FLINK-3927

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2013.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2013

commit 422e078c93b558dba3d0c6a53643824198e2c545
Author: Maximilian Michels
Date: 2016-05-19T12:29:12Z

    [FLINK-3927][yarn] make container id consistent across Hadoop versions

    - introduce a unique container id independent of the Hadoop version
    - improve printing of exceptions during registration
    - minor improvements to the Yarn ResourceManager code

commit c27fc8553f4dc0fbcee09c52848477cff2de0b11
Author: Maximilian Michels
Date: 2016-05-19T15:59:23Z

    [FLINK-3938] re-enable Yarn tests

    As of 70978f560fa5cab6d84ec27d58faa2627babd362, the Yarn tests were not
    executed anymore. They were moved to the test directory but there was
    still a Maven configuration in place to change the test directory.
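The approach in the issue description (the ResourceManager hands its own stringified container id to the TaskManager through a custom environment variable) can be illustrated with a minimal sketch. This is not the code from the pull request: the class name, the method names, the variable name `_EXAMPLE_CONTAINER_ID` and the sample id string are all hypothetical, and a plain `Map` stands in for the container launch environment so that the example runs without Hadoop on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; all names here are assumptions, not Flink's code.
class ContainerIdHandoff {

    // Hypothetical variable name chosen for this sketch.
    static final String ENV_CONTAINER_ID = "_EXAMPLE_CONTAINER_ID";

    /**
     * ResourceManager side: put the id, as stringified by the Hadoop client
     * the ResourceManager runs with, into the environment of the container
     * that will host the TaskManager.
     */
    static Map<String, String> buildTaskManagerEnv(String containerIdFromClient) {
        Map<String, String> env = new HashMap<>();
        env.put(ENV_CONTAINER_ID, containerIdFromClient);
        return env;
    }

    /**
     * TaskManager side: read the id back verbatim instead of deriving it
     * from Yarn's own environment variables, so both sides use one string.
     */
    static String resolveResourceId(Map<String, String> env) {
        String id = env.get(ENV_CONTAINER_ID);
        if (id == null || id.isEmpty()) {
            throw new IllegalStateException(
                "Environment variable " + ENV_CONTAINER_ID + " is not set");
        }
        return id;
    }

    public static void main(String[] args) {
        // Sample id for illustration; the real format depends on the Hadoop version.
        String idKnownToResourceManager = "container_1463660000000_0001_01_000002";

        Map<String, String> env = buildTaskManagerEnv(idKnownToResourceManager);
        String idReportedByTaskManager = resolveResourceId(env);

        // Both sides now agree because the string was never regenerated.
        System.out.println(idKnownToResourceManager.equals(idReportedByTaskManager));
    }
}
```

Because the TaskManager never regenerates the string, it does not matter how the cluster's Hadoop version would have formatted the id; the ResourceManager only ever compares against a value it produced itself.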