[ https://issues.apache.org/jira/browse/MAPREDUCE-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425770#comment-16425770 ]
Jason Lowe commented on MAPREDUCE-7069: --------------------------------------- Thanks for the patch! It's a bit odd and inefficient that setVMEnv calls MRApps.setEnvFromInputProperty twice. I think it would be clearer and more efficient to call it once, place the results in a temporary map (like it already does in the second call), then only set HADOOP_ROOT_LOGGER and HADOOP_CLIENT_OPTS in the environment if they are not set in the temporary map. Then at the end we can simply call addAll to dump the contents of the temporary map into the environment map. The example documentation in JobConf is confusing. It uses "MAPRED_MAP_TASK_ENV" and "MAPRED_REDUCE_TASK_ENV" but those literal strings should not be used in the property name. It would be clearer if this used "mapreduce.map.env" and "mapreduce.reduce.env" in the examples. Either that or give the example in the Java realm with something like set(MAPRED_MAP_TASK_ENV + ".varName", varValue) so it's clearly not a literal string in the property name. My pereference is the former. The relevant property descriptions in mapred-default.xml should be updated to reflect the new functionality. It would be good to update MapReduceTutorial.md to document the options for passing environment variables to tasks. There are a number of comments in setEnvFromString that should be fixed up. I realize this is mostly cut-n-paste from the old setEnvFromInputString, but since we're refactoring it would be nice to clean it up a bit in the process. There's not such thing as a tt (tasktracker) in YARN, and the comments imply this is only called to setup the env by a nodemanager for a child process. That's not always the case. "note" s/b "not", etc. For javadoc comments it's not necessary to state the type of the variable after the variable name. Javadoc can automatically extract this from the method signature. Nit: setEnvFromInputStringMap does not need to be public. Would it be easier to call tmpEnv.addAll(inputMap) and pass tmpEnv instead of inputMap? Then we don't need to explicitly iterate the map. The unit test should add a new properies with commas and or equal signs in the value and verify the values come through in the environment map. Does it make sense to split some of the unit test up into separate tests? For example the null input test can easily stand by itself. Separate tests make it easier to identify what's working and what's broken rather than a stacktrace with a line number in the middle of a large unit test that is testing many different aspects. > Add ability to specify user environment variables individually > -------------------------------------------------------------- > > Key: MAPREDUCE-7069 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7069 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Reporter: Jim Brennan > Assignee: Jim Brennan > Priority: Major > Attachments: MAPREDUCE-7069.001.patch, MAPREDUCE-7069.002.patch > > > As reported in YARN-6830, it is currently not possible to specify an > environment variable that contains commas via {{mapreduce.map.env}}, > mapreduce.reduce.env, or {{mapreduce.admin.user.env}}. > To address this, [~aw] proposed in [YARN-6830] that we add the ability to > specify environment variables individually: > {quote}e.g, mapreduce.map.env.[foo]=bar gets turned into foo=bar > {quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org