[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709498#comment-15709498 ] Junping Du commented on MAPREDUCE-6565: --- My bad... I forgot branch-2 and branch-2.8 has big gap already and I only compile on branch-2... Let me revert it from 2.8 first and I will check HADOOP-12954 if it need backport to branch-2.8. > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > Fix For: 2.8.0 > > Attachments: MAPREDUCE-6565-trunk.001.patch > > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709462#comment-15709462 ] Li Lu commented on MAPREDUCE-6565: -- Yes we need HADOOP-12954 prior to the fix here. Shall we back port that issue as well? > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > Fix For: 2.8.0 > > Attachments: MAPREDUCE-6565-trunk.001.patch > > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709437#comment-15709437 ] Eric Badger commented on MAPREDUCE-6565: [~djp], this seems to have broken the 2.8 build. Can you revert it from 2.8 until we can fix the patch? > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > Fix For: 2.8.0 > > Attachments: MAPREDUCE-6565-trunk.001.patch > > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707054#comment-15707054 ] Hudson commented on MAPREDUCE-6565: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10910 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10910/]) MAPREDUCE-6565. Configuration to use host name in delegation token (junping_du: rev 8f6e14399a3e77e1bdcc5034f7601e9f62163dea) * (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java * (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/YarnChild.java > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > Attachments: MAPREDUCE-6565-trunk.001.patch > > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702867#comment-15702867 ] Junping Du commented on MAPREDUCE-6565: --- Patch LGTM too. > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > Attachments: MAPREDUCE-6565-trunk.001.patch > > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702812#comment-15702812 ] Jason Lowe commented on MAPREDUCE-6565: --- Patch looks good to me, +1. It will address the specific issue for MapReduce. Other frameworks, such as Spark, will need to make similar modifications. > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > Attachments: MAPREDUCE-6565-trunk.001.patch > > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691724#comment-15691724 ] Li Lu commented on MAPREDUCE-6565: -- That's not exactly the same issue as we observed here. Can you try latest trunk to verify this issue? I would not be surprised if problems occur on all YARN apps after a fix in YarnClient, but I believe the issue discussed here is solely about MR. > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > Attachments: MAPREDUCE-6565-trunk.001.patch > > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691714#comment-15691714 ] Yuren Wu commented on MAPREDUCE-6565: - this.happens across board as long as job is submitted remotely > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > Attachments: MAPREDUCE-6565-trunk.001.patch > > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691700#comment-15691700 ] Hadoop QA commented on MAPREDUCE-6565: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 38s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 15s {color} | {color:red} hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app: The patch generated 1 new + 90 unchanged - 1 fixed = 91 total (was 91) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 56s {color} | {color:green} hadoop-mapreduce-client-app in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 22m 20s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12840334/MAPREDUCE-6565-trunk.001.patch | | JIRA Issue | MAPREDUCE-6565 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 4ebd81471565 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0de0c32 | | Default Java | 1.8.0_111 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6820/artifact/patchprocess/diff-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt | | Test Results | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6820/testReport/ | | modules | C: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app U: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app | | Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6820/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Configuration to use host name in delegation token service is
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691643#comment-15691643 ] Li Lu commented on MAPREDUCE-6565: -- bq. Maybe you guys have more insight into the code and can figure out how to solve this problem for all YARN managed applications. I believe so far we've only seen this problems occurred on MR apps with specific ways of distributions (tarballs through distributed cache). No? > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > Attachments: MAPREDUCE-6565-trunk.001.patch > > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15691638#comment-15691638 ] Yuren Wu commented on MAPREDUCE-6565: - Thanks for all the attention you guys have paid to this issue. My initial hack works well with MR job, however when I did a test on Spark job submitted remotely to yarn cluster, the hack did not work. i did not open another jira for spark job because i did not quite dig into the spark job settings to see who it was handled. Maybe you guys have more insight into the code and can figure out how to solve this problem for all YARN managed applications. > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > Attachments: MAPREDUCE-6565-trunk.001.patch > > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15677831#comment-15677831 ] Jason Lowe commented on MAPREDUCE-6565: --- I'm +1 for making all client-side settings override anything in any site file that isn't marked final for 3.x. I'm a bit hesitant for 2.x given the long-standing semantics for some of these properties, and in any case there needs to be a clear release note explaining to users what to expect with the change. Hmm, there might be a problem with adding job.xml as a default resource, and that has to do with relative ordering of when job.xml is added and other default resources like the *-site.xml files are added. The various site.xml files are only added as defaults when the related classes are loaded (e.g.: HdfsConfiguration, YarnConfiguration, JobConf, etc.) If we add job.xml as a default resource _before_ some of these classes are touched then some site files will override the job.xml files because they'll be loaded later. We can probably get the ordering right for all the site files provided by core Hadoop, but I'm worried about downstream projects that may have their own site files (e.g.: hive-site.xml). Client-side settings could be smashed by site settings if the ordering is not correct. job.xml would need to be the last default resource added, and we may not be able to guarantee that with arbitrary downstream code. Unfortunately without using the default resource feature of Configuration, I don't know of a straightforward way to get classes using plain ol' Configuration instances to see values set in job.xml. Any ideas here? We can fix individual instances like hadoop.security.token.service.use_ip in case-specific ways (i.e.: calling the SecurityUtil.setTokenServiceUseIp method for this property), but not all cases will have a straightforward fix. And we'd have to track them all down individually. > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15677579#comment-15677579 ] Junping Du commented on MAPREDUCE-6565: --- bq. I think the tarball is muddying the waters here. *-site.xml files should be treated the same whether they are in a tarball or not. It'd be tricky and messy to do otherwise. Essentially what we're asking is whether clients should be able to override any non-final setting in the *-site.xml files with their job.xml setting, even if that setting is a "server side" property Agree. Whether clients are allowed to override any non-final setting is something I originally asking for. But later, I realized that if we disallow client to override "server-setting" - like case here, tar-ball configuration will become the final setting for "server-side" configuration. It also means MR tarball will become various according to cluster settings but not according to releases. Standing at support-ability prospective, we don't want our users to touch tarball settings because it is hard to control/monitor these settings by Ambari or other tools. Otherwise, we have to dig into tarball settings when weird things happens. In this sense, I think it could be simpler if we don't differentiate client-side and server-side setting in configuration loading. bq. The risk of this change is when the client's *-site.xml files do not match what should be there. For the "server side" settings this has been working because we've been ignoring job.xml for those. Once we start using job.xml for even those properties, jobs that were working in the past because we ignored bad values will break. I don't know offhand how many other properties besides this one could suddenly change because we start using the client's version of the property in job.xml when we didn't before. I think even before, our configuration loading doesn't guarantee we only loading client-side configuration rather than server-side configuration. Even for mapred-site.xml, there are history server related settings that belongs to server-side but you can override a different value in client side which just doesn't work though. So we still need to rely on code logic to bypass what settings belongs to server-setting in client side. Also, better documentation could be helpful for user to differentiate what's client-side and what's server-side. Our current tricky loading mechanism won't help on both side. > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15675073#comment-15675073 ] Jason Lowe commented on MAPREDUCE-6565: --- bq. Can we make a consensus that everything in tarball's config is not final (unless we explicitly mark it as final)? I think the tarball is muddying the waters here. *-site.xml files should be treated the same whether they are in a tarball or not. It'd be tricky and messy to do otherwise. Essentially what we're asking is whether clients should be able to override _any_ non-final setting in the *-site.xml files with their job.xml setting, even if that setting is a "server side" property. bq. may be we can simply go ahead to make job.xml as the highest priority without differentiating client/server settings. Any risk I am missing here? Making job.xml a default resource accomplishes that proposal if we want to go that route. Note: Usually job.xml contains the contents of the client-side *-site.xml files, so as long as those match the cluster we should be good there. The risk of this change is when the client's *-site.xml files do _not_ match what should be there. For the "server side" settings this has been working because we've been ignoring job.xml for those. Once we start using job.xml for even those properties, jobs that were working in the past because we ignored bad values will break. I don't know offhand how many other properties besides this one could suddenly change because we start using the client's version of the property in job.xml when we didn't before. I agree with [~gtCarrera9] that it's more consistent and less surprising to users if job.xml settings override any other settings in the job. However there are going to be some cases that break when we "fix" it. That's why I'm a bit hesitant, especially if this is going into 2.x. > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15674969#comment-15674969 ] Junping Du commented on MAPREDUCE-6565: --- bq. We just have the main tarball for the cluster and users override various settings for their specific job via job.xml, just like a regular non-tarball job does Our case is similar but slightly different. For every release, we generally only provide one tarball for all users' clusters. So every user can specify their cluster's own various settings without touch tarball that we handle over. Otherwise, taking care of multiple versions of tarball for the same release will be supportability disaster that we cannot afford. Can we make a consensus that everything in tarball's config is not final (unless we explicitly mark it as final)? If so, then may be we can simply go ahead to make job.xml as the highest priority without differentiating client/server settings. Any risk I am missing here? > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15674732#comment-15674732 ] Jason Lowe commented on MAPREDUCE-6565: --- The more I think about this, the more I believe the original problem is describing an invalid setup. I can accomplish the same thing without a tarball by shipping my own custom core-site.xml that appears _before_ any core-site.xml provided by the admins of the cluster. I think we can agree that any such setup that fails due to a bad setting in the eclipsing core-site.xml is a fault of the user's setup and not a fault of the Hadoop framework itself. So I would argue the fault lies with construction of the job in question that failed. Either it used a faulty tarball that had a bad core-site.xml in it, or the core-site.xml was missing completely from the classpath (which may have been the case here). > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15674699#comment-15674699 ] Jason Lowe commented on MAPREDUCE-6565: --- bq. it won't have any benefit to give client flexibility to be different with what server settings. Isn't it? Agreed, I think the best approach here long-term is to not have this be a config setting at all but rather derived from the communication session that retrieved the token in the first place. I also agree that it is going to be more dangerous than not for job.xml to override this particular setting (whether or not we're using an MR tarball). bq. if we think everything inside of MR tarball should be per job only This is definitely not the case, at least for our clusters. As I mentioned above, the MR tarballs are created by the cluster admins and therefore have the appropriate configs for that cluster. We do _not_ want the jobs to pick up configs from the nodes per the issues I described earlier. bq. I am not fully agree that it should be cluster admin' job to create tarball and keep consistent for all configurations with cluster settings. The tarball consist of the Hadoop jars and the *-site.xml files. Both of these are things admins of the cluster are expected to maintain and provide. Therefore I think it's completely appropriate that typically these tarballs are created and provided by admins. We're not running a bunch of different versions of the tarball for different types of jobs. We just have the main tarball for the cluster and users override various settings for their specific job via job.xml, just like a regular non-tarball job does. To be clear, I'm not saying everyone has to run it the way we do. However if you don't then there needs to be solutions for the types of rolling upgrade problems I pointed out above. Also I don't think the way we're using the tarball is an invalid setup, so therefore any proposed change needs to ensure this type of setup keeps working. > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15674604#comment-15674604 ] Junping Du commented on MAPREDUCE-6565: --- I probably won't agree it could be safe to let job.xml settings to override all configurations, especially serve-side configuration. Forget about MR tarball case, do we think this setting in job.xml should override what we have in local configuration? I think it is probably not - as I mentioned above, it won't have any benefit to give client flexibility to be different with what server settings. Isn't it? Base on that, back to MR tar ball case, if we think everything inside of MR tarball should be per job only, then only client configuration should work, but serve configuration (like case here) shouldn't get chance to override cluster setting. I am not fully agree that it should be cluster admin' job to create tarball and keep consistent for all configurations with cluster settings. In supportability prospective, making some server-side configurations to be transparent from client setting (job.xml or mr tar ball config) should be our job. Given that, I think the real problem now is: we should bypass server-side configurations in mr tar ball. Thoughts? > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15673983#comment-15673983 ] Jason Lowe commented on MAPREDUCE-6565: --- bq. To me this pretty much reveals the nature of a bug. IMHO if we really want job.xml settings to override configs consistently then an easier, safer way to do that is making job.xml a default resource. Otherwise we're going to find more individual cases of this in the future where someone forgot to use the right Configuration type or pass the correct configs around. Good luck with code that's not MapReduce-aware, simply uses Configuration, and has no way of passing in a conf object. Either way we tackle this, we need to realize that the change will break scenarios that are working today where client-side configs do not have the proper value for hadoop.security.token.service.use_ip. > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15672159#comment-15672159 ] Li Lu commented on MAPREDUCE-6565: -- bq. job.xml does override the tarball configs in almost every way except this security setting because of the way that setting is loaded To me this pretty much reveals the nature of a bug. Normally users would expect to have per-job configs override everything else, but this does not hold with the use ip setting. So one possible to fix this might be passing in the map reduce job configuration in security util, instead of using its own? In SecurityUtil, this limits the default behavior of use ip to be a newly created Configuration object, which may not be consistent with MR's job specific setting: {code} static { setConfigurationInternal(new Configuration()); } {code} And there is one API for this class to set the configuration used in SecurityUtil: {code} @InterfaceAudience.Public @InterfaceStability.Evolving public static void setConfiguration(Configuration conf) { LOG.info("Updating Configuration"); setConfigurationInternal(conf); } {code} So what we can do is to use the MR app's config to set this configuration? > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15670553#comment-15670553 ] Jason Lowe commented on MAPREDUCE-6565: --- bq. In addition, from usability prospective, I think client setting in job.xml has the highest priority to overwrite whatever conf in tar ball or not - given a bit overhead to change config setting in tarball. Isn't it? job.xml does override the tarball configs in almost every way _except_ this security setting because of the way that setting is loaded (i.e.: via Configuration instead of JobConf and job.xml is not a default resource). bq. If so, this actually belongs to a server-side setting, even it take effective in our current client side code. In practice that's essentially how it works in our setup. Since this property is currently not read from job.xml, it ends up using whatever is in the core-site.xml that's in the MR tarball. The MR tarball is created by the cluster admins and so the confs are consistent with the server settings for that cluster. A user could try overriding this property in their job.xml, but it wouldn't be used because job.xml is ignored for this specific config. (Yes, a user could provide their own, custom tarball with custom site files to override this setting, but they need to know what they're doing if they choose not to use the default MR tarball provided by the cluster admins.) Given this is a client-side setting that must be in sync with the server-side setting it'd be nice if this wasn't possible to get out-of-sync. For example, the server that granted the token could also communicate which setting must be used by the client later when the token is used, but that's a significant change with backward compatibility concerns. Of course there's also backwards compatibility concerns with adding job.xml as a default resource so that this property could be fetched from there instead of core-site.xml or core-default.xml. > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669320#comment-15669320 ] Junping Du commented on MAPREDUCE-6565: --- [~jlowe], thanks for your comments! For client settings, I agree that it would be convenient for loading configuration from tarball as we can honor different settings for different jobs (in kind of "batch" mode), especially from rolling upgrade prospectives. In addition, from usability prospective, I think client setting in job.xml has the highest priority to overwrite whatever conf in tar ball or not - given a bit overhead to change config setting in tarball. Isn't it? However, my next question is: do we think CommonConfigurationKeys.HADOOP_SECURITY_TOKEN_SERVICE_USE_IP is a client side setting? I assume we should keep this consistent in cluster level as mismatch between client and server setting will cause job get failed. If so, this actually belongs to a server-side setting, even it take effective in our current client side code. > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668458#comment-15668458 ] Jason Lowe commented on MAPREDUCE-6565: --- bq. I think MR over distributed cache shouldn't load configurations that included in MR tar ball. We explicitly have MapReduce load configurations that are in the tarball. We used to run without that and had the jobs pick up the configurations from the node, but we often ran into cases where rolling upgrades broke jobs on the cluster (e.g.: config changes to reference a plugin class that only exists in the new release, or old config setting breaks job running new release, etc.). Things got a lot simpler for us when we put the configs in the MR tarball and made sure those configs reflect the entire cluster. Added bonus: we don't need to push new confs to every node when the conf changes are just client-side (i.e.: not for the datanode/nodemanager). We just update our tarball in HDFS and every new job picks up the change. Back to the original reported issue: one reason not all security settings in job.xml aren't honored is because, unlike the *-site.xml files, job.xml is not added as a default Configuration resource. That means anything that simply instantiates a Configuration instance (e.g.: SecurityUtil, etc.) will not "see" job.xml settings. I'm not sure why job.xml wasn't added as a default resource. That may have been an intentional omission to avoid some problem, but part of me wonders if it should have been setup so that all Configuration objects start using job.xml settings if job.xml applies (i.e: we're using JobConf). > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668024#comment-15668024 ] Junping Du commented on MAPREDUCE-6565: --- I think MR over distributed cache shouldn't load configurations that included in MR tar ball. CC [~jlowe]. > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15290467#comment-15290467 ] Yuren Wu commented on MAPREDUCE-6565: - further thoughts on this fix.. the multi-home network setup for hadoop is getting some attention in industry. The security token design using toke + service name does not have updates to accommodate the complex network setup. HA This quick fix just get by with executing mapreduce jobs. However, I would suggest to create a new request to address the multi-home network and token handling in a more organized effort. Security package has very succinct log and it took me quite while to track down the issue. Properties under hadoop.security should be handled in a single code base to interact with various components. Credential token services such as retrieve/clone are handled by different packages in multiple components. Current code is really difficult to understand and manage. > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15290449#comment-15290449 ] Yuren Wu commented on MAPREDUCE-6565: - a quick place to fix the map reduce issue is to add the following line into the YarnClient.java LOG.info("YARN CHILD CHECK SECURITY SETTING USE_IP:"+job.get(CommonConfigurationKeys.HADOOP_SECURITY_TOKEN_SERVICE_USE_IP)); // get useIp flag for KMS boolean useIp = job.getBoolean( CommonConfigurationKeys.HADOOP_SECURITY_TOKEN_SERVICE_USE_IP, CommonConfigurationKeys.HADOOP_SECURITY_TOKEN_SERVICE_USE_IP_DEFAULT); LOG.debug("set securityutil token service use ip value from config." +useIp); SecurityUtil.setTokenServiceUseIp(useIp); > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15290047#comment-15290047 ] Yuren Wu commented on MAPREDUCE-6565: - need more thought around this.. just realized that setting this flag caused hdfs delegation tokens cannot be looked up by ipc client.. the basic issue is that delegation tokens are populated by each individual client and they do not have a conform protocol to follow to use ip or host name. i will post more findings and suggestions later. > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289614#comment-15289614 ] Yuren Wu commented on MAPREDUCE-6565: - Looks like no patch is available here. I propose to handle it in the DFSUtil.createKeyProvider method. // get useIp flag for KMS boolean useIp = conf.getBoolean( CommonConfigurationKeys.HADOOP_SECURITY_TOKEN_SERVICE_USE_IP, CommonConfigurationKeys.HADOOP_SECURITY_TOKEN_SERVICE_USE_IP_DEFAULT); LOG.debug("set "+CommonConfigurationKeys.HADOOP_SECURITY_TOKEN_SERVICE_USE_IP +" value from config." +useIp); SecurityUtil.setTokenServiceUseIp(useIp); > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org